Merql
Advanced

Row Identity

Row identity determines how merql matches rows across snapshots. When comparing base to ours, it needs to know that row X in base is the same logical row as row Y in ours, even if some column values changed. The identity rule controls which columns form the row key.

use Merql\Identity\PrimaryKeyIdentity;

$identity = new PrimaryKeyIdentity(['id']);
$key = $identity->key(['id' => 42, 'title' => 'Hello']);
// "42"

The RowIdentity interface

All identity strategies implement two methods:

interface RowIdentity
{
    public function key(array $row): string;
    public function columns(): array;
}

key() builds a unique string identifier from a row's data. columns() returns the column names used for identity.

PrimaryKeyIdentity

The default strategy. Matches rows by their primary key columns.

use Merql\Identity\PrimaryKeyIdentity;

// Single-column PK.
$identity = new PrimaryKeyIdentity(['id']);
$key = $identity->key(['id' => 42, 'title' => 'Hello']);
// "42"

// Composite PK.
$identity = new PrimaryKeyIdentity(['tenant_id', 'user_id']);
$key = $identity->key(['tenant_id' => 1, 'user_id' => 99, 'name' => 'Alice']);
// "1\x1F99" (unit separator between parts)

This is what merql uses by default when the table has a primary key defined in its schema.

NaturalKeyIdentity

For tables without an auto-increment primary key, matches rows by unique constraint columns. Useful when the PK is a UUID or when the table uses a composite unique constraint as its logical identity.

use Merql\Identity\NaturalKeyIdentity;

$identity = new NaturalKeyIdentity(['email']);
$key = $identity->key(['id' => 1, 'email' => 'alice@example.com', 'name' => 'Alice']);
// "alice@example.com"

$identity = new NaturalKeyIdentity(['country', 'postal_code']);
$key = $identity->key(['country' => 'US', 'postal_code' => '90210', 'city' => 'Beverly Hills']);
// "US\x1F90210"

ContentHashIdentity

Fallback for tables with no primary key and no unique constraints. Computes a SHA-256 hash of all column values. Two rows are the same if every column matches.

use Merql\Identity\ContentHashIdentity;

$identity = new ContentHashIdentity(['name', 'value']);
$key = $identity->key(['name' => 'color', 'value' => 'blue']);
// "a3f2..." (SHA-256 hash)

Content hash identity has a limitation: if a row's content changes, its hash changes, so the row appears as a delete of the old hash and an insert of the new hash rather than an update. This is an inherent tradeoff when there are no stable identity columns.

IdentityRule

IdentityRule is the serializable identity configuration used by snapshot capture and merge plans.

use Merql\Identity\IdentityRule;

$rule = IdentityRule::natural(['option_name']);
$key = $rule->key(['option_name' => 'siteurl']);
// "siteurl"

Rule types:

TypeMeaning
primaryDatabase primary key columns
naturalStable unique business columns
compositeComposite logical identity columns
contentContent hash fallback

IdentityRule::forSchema($schema) resolves the default rule:

  1. If the table has a primary key, use the primary key columns.
  2. If the table has unique keys, use the first unique key's columns.
  3. If neither exists, use all columns (content hash fallback).

IdentityRuleSet

Use IdentityRuleSet when a host application knows a better identity rule than the schema default.

use Merql\Identity\IdentityRule;
use Merql\Identity\IdentityRuleSet;
use Merql\Snapshot\Snapshotter;

$rules = new IdentityRuleSet([
    'wp_options' => IdentityRule::natural(['option_name']),
    'wp_term_relationships' => IdentityRule::composite(['object_id', 'term_taxonomy_id']),
]);

$snapshotter = new Snapshotter($connection, identityRules: $rules);
$snapshot = $snapshotter->capture('live', ['wp_options', 'wp_term_relationships']);

The Snapshotter resolves an IdentityRule for every captured table. The chosen columns are stored in TableSnapshot::identityColumns and used for all subsequent diff, merge, plan, SQL, and rollback operations.

Merql::init() can receive the same rule set when using the static facade:

use Merql\Merql;

Merql::init($connection, identityRules: $rules);
Merql::snapshot('live', ['wp_options']);

Rule sets have stable array serialization:

$payload = $rules->toArray();
$rules = IdentityRuleSet::fromArray($payload);

Ambiguous Identity

If two captured rows produce the same identity key, capture fails instead of overwriting one row with another.

use Merql\Exceptions\SnapshotException;

try {
    $snapshotter->capture('live', ['wp_postmeta']);
} catch (SnapshotException $e) {
    // The configured identity rule matched multiple rows.
}

Host applications can preflight raw rows with IdentityRuleSet::conflictsFor() when they want to report ambiguity before capture:

$conflicts = $rules->conflictsFor(
    'wp_postmeta',
    IdentityRule::natural(['post_id', 'meta_key']),
    $rows,
);

For WordPress meta tables, post_id + meta_key is usually useful for grouping and display, but not safe as row identity because duplicate meta keys are valid. Use the table's primary key for row identity unless the host application can prove the natural key is unique.

Row key encoding

Row keys are encoded strings that combine multiple column values with a unit separator (\x1F). Values containing the separator or the percent sign are escaped to prevent collisions:

use Merql\Snapshot\Snapshotter;

// Encode.
$key = Snapshotter::buildRowKey(
    ['tenant_id' => '1', 'user_id' => '99'],
    ['tenant_id', 'user_id'],
);
// "1\x1F99"

// Decode.
$parts = Snapshotter::decodeRowKey($key);
// ['1', '99']

Escaping rules:

OriginalEscaped
%%25
\x1F (unit separator)%1F

This ensures that a value containing \x1F does not split into multiple key parts. The encoding is deterministic and reversible.

Auto-increment divergence

When both sides insert new rows, auto-increment IDs will differ between branches. Row #100 in ours might be a completely different record than row #100 in theirs.

merql handles this by matching on primary key. If both sides inserted a row with the same auto-increment ID, it is flagged as an insert_insert conflict. To avoid this, use natural keys (unique columns that are stable across branches) for tables where concurrent inserts are expected.

// A natural key on (slug) avoids auto-increment collisions.
$identity = new NaturalKeyIdentity(['slug']);

Identity in snapshots

The identity columns chosen during capture are stored with each TableSnapshot:

$tableSnapshot = $snapshot->getTable('posts');
$tableSnapshot->identityColumns;
// ['id']

These columns are used by SqlGenerator to build WHERE clauses for UPDATE and DELETE statements, and by merge plans to expose decoded identity values for review, selection, guarded apply, and rollback.

On this page