Re: Proposal: Conflict log history table for Logical Replication

Peter Smith Tue, 13 Jan 2026 18:10:35 -0800

On Tue, Jan 13, 2026 at 9:28 PM Dilip Kumar <[email protected]> wrote:
>
...
> I am not sure does it makes sense to check the tuple from clt for all
> the test cases in 035_conflict.pl or just checking in a few of them is
> sufficient?  Thoughts?
>


I felt just having some was OK, not everything.

//////////

Some review comments for patch v21-0002.

======
Commit message

1.
Handling Multi-row Conflicts: A single remote tuple may conflict with multiple
local tuples (e.g., in the case of multiple_unique_conflicts). To handle this,
the infrastructure creates a single row in the conflict log table for each
remote tuple. The details of all conflicting local rows are aggregated into a
single JSON array in the local_conflicts column.

The JSON array uses the following structured format:
[ { "xid": "1001", "commit_ts": "2025-12-25 10:00:00+05:30", "origin": "node_1",
"key": {"id": 1}, "tuple": {"id": 1, "val": "old_data"} }, ... ]

~

As mentioned in the patch 0001 review comments, there is some
ambiguity regarding JSON versus JSONB. Here, the commit message is
calling it a "JSON array"; later in the patch, the same field is
called a "JSONB array".

Also, the code refers to type JSONOID in places where I thought maybe
it should be JSONBOID (assuming the JSONB array comment was right).
So, which one is it meant to be? All the code/comments need to be
consistent.

Meanwhile, according to my AI... "Only use JSON if you absolutely need
to preserve exact formatting or if your workload is heavily
write-biased with almost no reads."
Since the CLT is basically write-only, and human-readable would be
nice too, maybe it's JSON that we want here.

======
src/backend/replication/logical/conflict.c

2.
+ /*
+ * Get both the conflict log destination and the opened conflict log
+ * relation for insertion.
+ */
+ conflictlogrel = GetConflictLogDestAndTable(&dest);

It' not obvious from that comment that the conflictlogrel could be NULL here.

SUGGESTION
Get the conflict log destination. Also, (if there is one) return the
CLT relation already opened and ready for insertion.

~~~

3.
+ errdetail("Conflict details logged to internal table with OID %u.",
+ MySubscription->conflictlogrelid));

Instead of "internal table", it should be called the "conflict log table".

Also, why not give the real name of the CLT instead of making the user
have to figure it out from the relid?

SUGGESTION
"Conflict details are logged to the conflict log table: %s"

(It might be nicer if you have a common util function for returning
the CLT name "pg_conflict_%u" given the subid)

~~~

4.
+/*
+ * GetConflictLogDestAndTable
+ *
+ * Fetches conflict logging metadata from the cached MySubscription pointer.
+ * Sets the destination enum in *log_dest and, if applicable, opens and
+ * returns the relation handle for the internal log table.
+ */
+Relation
+GetConflictLogDestAndTable(ConflictLogDest *log_dest)

/for the internal log table./for the conflict log table./

======
src/test/subscription/t/035_conflicts.pl

5.
+my $json_query = qq[
+    SELECT string_agg((unnested.j::json)->'key'->>'a', ',')
+    FROM (
+        SELECT unnest(local_conflicts) AS j
+        FROM $clt
+    ) AS unnested;
+];
+
+my $all_keys = $node_subscriber->safe_psql('postgres', $json_query);
+
+# Verify that '2' is present in the resulting string
+like($all_keys, qr/\b2\b/, 'Verified that key 2 exists in the
local_conflicts log');

I noticed you are jumping through some hoops to extract fields from
the JSON before checking their content. If the 'local_conflicts'
really is stored as human-readable JSON (not JSONB), then maybe that
extraction work is not necessary. e.g. consider just matching against
a portion of the JSON string, instead of having to unnest/extract
anything at all.

======
Kind Regards,
Peter Smith.
Fujitsu Australia

Re: Proposal: Conflict log history table for Logical Replication

Reply via email to