On Thu, Aug 1, 2024 at 5:23 PM Amit Kapila <[email protected]> wrote:
>
> On Thu, Aug 1, 2024 at 2:26 PM Hayato Kuroda (Fujitsu)
> <[email protected]> wrote:
> >
> > 04. general
> >
> > According to the documentation [1], there is another constraint "exclude",
> > which
> > can cause another type of conflict. But this pattern cannot be logged in
> > detail.
> >
>
> As per docs, "exclusion constraints can specify constraints that are
> more general than simple equality", so I don't think it satisfies the
> kind of conflicts we are trying to LOG and then in the future patch
> allows automatic resolution for the same. For example, when we have
> last_update_wins strategy, we will replace the rows with remote rows
> when the key column values match which shouldn't be true in general
> for exclusion constraints. Similarly, we don't want to consider other
> constraint violations like CHECK to consider as conflicts. We can
> always extend the basic functionality for more conflicts if required
> but let's go with reporting straight-forward stuff first.
>
It is better to document that exclusion constraints won't be
supported. We can even write a comment in the code and or commit
message that we can extend it in the future.
*
+ * Return true if the commit timestamp data was found, false otherwise.
+ */
+bool
+GetTupleCommitTs(TupleTableSlot *localslot, TransactionId *xmin,
+ RepOriginId *localorigin, TimestampTz *localts)
This API returns both xmin and commit timestamp, so wouldn't it be
better to name it as GetTupleTransactionInfo or something like that?
I have made several changes in the attached top-up patch. These
include changes in the comments, docs, function names, etc.
--
With Regards,
Amit Kapila.
diff --git a/doc/src/sgml/logical-replication.sgml
b/doc/src/sgml/logical-replication.sgml
index c1f6f1aaa8..dfbff3de02 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -1585,17 +1585,17 @@ test_sub=# SELECT * FROM t1 ORDER BY id;
<para>
Additional logging is triggered in the following
<firstterm>conflict</firstterm>
- scenarios:
+ cases:
<variablelist>
<varlistentry>
<term><literal>insert_exists</literal></term>
<listitem>
<para>
Inserting a row that violates a <literal>NOT DEFERRABLE</literal>
- unique constraint. Note that to obtain the origin and commit
- timestamp details of the conflicting key in the log, ensure that
+ unique constraint. Note that to log the origin and commit
+ timestamp details of the conflicting key,
<link
linkend="guc-track-commit-timestamp"><varname>track_commit_timestamp</varname></link>
- is enabled. In this scenario, an error will be raised until the
+ should be enabled. In this case, an error will be raised until the
conflict is resolved manually.
</para>
</listitem>
@@ -1605,10 +1605,10 @@ test_sub=# SELECT * FROM t1 ORDER BY id;
<listitem>
<para>
The updated value of a row violates a <literal>NOT DEFERRABLE</literal>
- unique constraint. Note that to obtain the origin and commit
- timestamp details of the conflicting key in the log, ensure that
+ unique constraint. Note that to log the origin and commit
+ timestamp details of the conflicting key,
<link
linkend="guc-track-commit-timestamp"><varname>track_commit_timestamp</varname></link>
- is enabled. In this scenario, an error will be raised until the
+ is enabled. In this case, an error will be raised until the
conflict is resolved manually. Note that when updating a
partitioned table, if the updated row value satisfies another
partition constraint resulting in the row being inserted into a
@@ -1720,10 +1720,10 @@ CONTEXT: processing remote data for replication origin
"pg_16395" during "INSER
commit timestamp can be seen in the <literal>DETAIL</literal> line of the
log. But note that this information is only available when
<link
linkend="guc-track-commit-timestamp"><varname>track_commit_timestamp</varname></link>
- is enabled. Users can use these information to make decisions on whether to
- retain the local change or adopt the remote alteration. For instance, the
- <literal>DETAIL</literal> line in above log indicates that the existing row
was
- modified by a local change, users can manually perform a remote-change-win
+ is enabled. Users can use this information to decide whether to retain the
+ local change or adopt the remote alteration. For instance, the
+ <literal>DETAIL</literal> line in the above log indicates that the existing
+ row was modified locally. Users can manually perform a remote-change-win
resolution by deleting the local row.
</para>
diff --git a/src/backend/executor/execReplication.c
b/src/backend/executor/execReplication.c
index ad3eda1459..34050a3bae 100644
--- a/src/backend/executor/execReplication.c
+++ b/src/backend/executor/execReplication.c
@@ -475,13 +475,13 @@ retry:
}
/*
- * Find the tuple that violates the passed in unique index constraint
- * (conflictindex).
+ * Find the tuple that violates the passed unique index (conflictindex).
*
- * If no conflict is found, return false and set *conflictslot to NULL.
- * Otherwise return true, and the conflicting tuple is locked and returned in
- * *conflictslot. The lock is essential to allow retrying to find the
- * conflicting tuple in case the tuple is concurrently deleted.
+ * If the conflicting tuple is found return true, otherwise false.
+ *
+ * We lock the tuple to avoid getting it deleted before the caller can fetch
+ * the required information. Note that if the tuple is deleted before a lock
+ * is acquired, we will retry to find the conflicting tuple again.
*/
static bool
FindConflictTuple(ResultRelInfo *resultRelInfo, EState *estate,
@@ -528,25 +528,15 @@ retry:
}
/*
- * Re-check all the unique indexes in 'recheckIndexes' to see if there are
- * potential conflicts with the tuple in 'slot'.
- *
- * This function is invoked after inserting or updating a tuple that detected
- * potential conflict tuples. It attempts to find the tuple that conflicts with
- * the provided tuple. This operation may seem redundant with the unique
- * violation check of indexam, but since we call this function only when we are
- * detecting conflict in logical replication and encountering potential
- * conflicts with any unique index constraints (which should not be frequent),
- * so it's ok. Moreover, upon detecting a conflict, we will report an ERROR and
- * restart the logical replication, so the additional cost of finding the tuple
- * should be acceptable.
+ * Check all the unique indexes in 'recheckIndexes' for conflict with the
+ * tuple in 'slot' and report if found.
*/
static void
-ReCheckConflictIndexes(ResultRelInfo *resultRelInfo, EState *estate,
+CheckAndReportConflict(ResultRelInfo *resultRelInfo, EState *estate,
ConflictType type, List
*recheckIndexes,
TupleTableSlot *slot)
{
- /* Re-check all the unique indexes for potential conflicts */
+ /* Check all the unique indexes for a conflict */
foreach_oid(uniqueidx, resultRelInfo->ri_onConflictArbiterIndexes)
{
TupleTableSlot *conflictslot;
@@ -622,17 +612,22 @@ ExecSimpleRelationInsert(ResultRelInfo *resultRelInfo,
conflictindexes, false);
/*
- * Rechecks the conflict indexes to fetch the conflicting local
tuple
+ * Checks the conflict indexes to fetch the conflicting local
tuple
* and reports the conflict. We perform this check here,
instead of
- * perform an additional index scan before the actual insertion
and
+ * performing an additional index scan before the actual
insertion and
* reporting the conflict if any conflicting tuples are found.
This is
* to avoid the overhead of executing the extra scan for each
INSERT
* operation, even when no conflict arises, which could
introduce
* significant overhead to replication, particularly in cases
where
* conflicts are rare.
+ *
+ * XXX OTOH, this could lead to clean-up effort for dead tuples
added
+ * in heap and index in case of conflicts. But as conflicts
shouldn't
+ * be a frequent thing so we preferred to save the performance
overhead
+ * of extra scan before each insertion.
*/
if (conflict)
- ReCheckConflictIndexes(resultRelInfo, estate,
CT_INSERT_EXISTS,
+ CheckAndReportConflict(resultRelInfo, estate,
CT_INSERT_EXISTS,
recheckIndexes, slot);
/* AFTER ROW INSERT Triggers */
@@ -710,12 +705,12 @@ ExecSimpleRelationUpdate(ResultRelInfo *resultRelInfo,
(update_indexes == TU_Summarizing));
/*
- * Refer to the comments above the call to
ReCheckConflictIndexes() in
+ * Refer to the comments above the call to
CheckAndReportConflict() in
* ExecSimpleRelationInsert to understand why this check is
done at
* this point.
*/
if (conflict)
- ReCheckConflictIndexes(resultRelInfo, estate,
CT_UPDATE_EXISTS,
+ CheckAndReportConflict(resultRelInfo, estate,
CT_UPDATE_EXISTS,
recheckIndexes, slot);
/* AFTER ROW UPDATE Triggers */
diff --git a/src/backend/replication/logical/worker.c
b/src/backend/replication/logical/worker.c
index b1dc04ec7e..bff98a236d 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -2668,8 +2668,7 @@ apply_handle_update_internal(ApplyExecutionData *edata,
TimestampTz localts;
/*
- * Check whether the local tuple was modified by a different
origin.
- * If detected, report the conflict.
+ * Report the conflict if the tuple was modified by a different
origin.
*/
if (GetTupleCommitTs(localslot, &localxmin, &localorigin,
&localts) &&
localorigin != replorigin_session_origin)
@@ -2825,8 +2824,7 @@ apply_handle_delete_internal(ApplyExecutionData *edata,
TimestampTz localts;
/*
- * Check whether the local tuple was modified by a different
origin.
- * If detected, report the conflict.
+ * Report the conflict if the tuple was modified by a different
origin.
*/
if (GetTupleCommitTs(localslot, &localxmin, &localorigin,
&localts) &&
localorigin != replorigin_session_origin)
@@ -3038,8 +3036,7 @@ apply_handle_tuple_routing(ApplyExecutionData *edata,
}
/*
- * Check whether the local tuple was modified
by a different
- * origin. If detected, report the conflict.
+ * Report the conflict if the tuple was
modified by a different origin.
*/
if (GetTupleCommitTs(localslot, &localxmin,
&localorigin, &localts) &&
localorigin !=
replorigin_session_origin)