I've been working on fixing the bugs that Jeff Janes found [1] with
approach #2 to value locking [2]. Approach #1 was unaffected.

I'm starting this new thread, to discuss these issues. Let's try and
confine discussion of semantics and syntax to the main thread, since
that has been what has mostly been discussed there.

Jeff produced a stress testing tool which tests concurrent deletions
(and not just concurrent updates); his test case deletes tuples when a
counter value goes under zero, which happens occasionally. It also
independently does book keeping for all values in a range of values
that are randomly incremented or decremented, and verifies that
everything is as it should be. This revealed problems with #2 around
concurrent "super deletion" of "unkept" promise heap tuples. As Jeff
pointed out, all-NULL tuples were appearing as visible, or spurious
duplicates appeared. For example:

postgres=# select xmin, xmax, ctid, * from upsert_race_test where index = 1287;
  xmin  |  xmax  |   ctid    | index  | count
--------+--------+-----------+--------+--------
      0 | 205438 | (260,27)  | [null] | [null]
 176066 |      0 | (258,158) |   1287 |      1
(2 rows)

I should point out that one of the problems I found here was a garden
variety bug, which has been fixed (the conflict flag in ExecInsert()
was not being reset correctly, I believe). However, the other
remaining problems could only be fixed with further changes to the
routines in tqual.c, which I'm not happy about, and require further
discussion here. I attach a differential patch which applies on top of
the most recent revision of #2, which is v1.8.vallock2.tar.gz [3].

Note that I've been extending Jeff's stress testing tool to make it
highlight problems earlier, and to make it provide more and better
instrumentation (e.g. printing of XIDs to correlate with pg_xlogdump
output). A good trick I added was to have an INSERT...ON CONFLICT
UPDATE predicate of "WHERE TARGET.index = EXCLUDED.index", which
serves as a kind of assertion when used within the Perl script (the
Perl scripts checks RETURNING-projected tuples, and expects to always
insert or update - that WHERE clause should always pass, obviously).
Jeff's tool is available here (with my revisions):
https://github.com/petergeoghegan/jjanes_upsert

I've been running with fsync off, which Jeff felt increased the
likelihood of revealing the bug. It isn't necessary, though, and FWIW
it was easy for me to recreate this problem once I understood it,
using only my 4 core laptop. I found it important to build without
assertions and at optimization level -O2, though.

The attached patch fixes Jeff's test case, as far as it goes. But, as
I mentioned, it's not intended as a real fix. For one thing, I have
made multiple changes to HeapTupleSatisfies*() routines, but haven't
fully considered the consequences for, say, ri_triggers.c.
HeapTupleSatisfiesSelf() and HeapTupleSatisfiesHistoricMVCC() were not
modified. I've modified HeapTupleSatisfiesVacuum() to see
InvalidTransactionID (not invalid xid infomask bits) as visible for
garbage collection (alongside HeapTupleSatisfiesDirty() and
HeapTupleSatisfiesMVCC(), which I changed first), and I wouldn't be
surprised if that created new problems in other areas. In short, this
patch is just for illustrative purposes.

To see the difference this patch makes, I suggest commenting out the
new code, and running the following test after a fresh initdb:

$ perl count_upsert.pl 4 100000

I think that the actual nature of the problem I fixed was that
concurrent upserters felt entitled to update a promise tuple that they
could at least initially see, but was then "super deleted", but it
turned out that once it was super deleted it was okay to update (the
tuple wasn't actually duplicated insofar as HeapTupleSatisfiesDirty()
was then able to ascertain due to other concurrent activity). It's
pretty complicated, in any case. This is just a band-aid.

Thoughts? Does anyone have any thoughts on my diagnosis of the problem?

[1] 
http://www.postgresql.org/message-id/CAMkU=1w8e9Q6ZX3U85RtsyCMpdYWFrvVAO3=unevtqiruzn...@mail.gmail.com
[2] 
https://wiki.postgresql.org/wiki/Value_locking#.232._.22Promise.22_heap_tuples_.28Heikki_Linnakangas.29
[3] 
http://www.postgresql.org/message-id/cam3swzrg_htrol-6_wfe6_d_ucuyc28jfapsfh_tra76gkk...@mail.gmail.com
-- 
Peter Geoghegan
diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c
index 4e5be1a..9bb13df 100644
--- a/src/backend/executor/nodeModifyTable.c
+++ b/src/backend/executor/nodeModifyTable.c
@@ -313,8 +313,8 @@ ExecInsert(TupleTableSlot *slot,
 		 * deleting the already-inserted tuple and retrying, but that's fairly
 		 * expensive, so we try to avoid it.
 		 */
-		conflict = false;
 vlock:
+		conflict = false;
 		ItemPointerSetInvalid(&conflictTid);
 
 		/*
@@ -354,7 +354,8 @@ vlock:
 			 * which is appropriate only for non-promise tuples) to wait for us
 			 * to decide if we're going to go ahead with the insertion.
 			 */
-			SpeculativeInsertionLockAcquire(GetCurrentTransactionId());
+			if (spec != SPEC_NONE)
+				SpeculativeInsertionLockAcquire(GetCurrentTransactionId());
 
 			/*
 			 * insert the tuple
@@ -363,7 +364,8 @@ vlock:
 			 * the t_self field.
 			 */
 			newId = heap_insert(resultRelationDesc, tuple,
-								estate->es_output_cid, HEAP_INSERT_SPECULATIVE,
+								estate->es_output_cid,
+								spec != SPEC_NONE? HEAP_INSERT_SPECULATIVE:0,
 								NULL);
 
 			/*
@@ -404,8 +406,11 @@ vlock:
 							estate->es_output_cid, NULL, false, &hufd, true);
 			}
 
-			SpeculativeInsertionLockRelease(GetCurrentTransactionId());
-			ClearSpeculativeInsertionState();
+			if (spec != SPEC_NONE)
+			{
+				SpeculativeInsertionLockRelease(GetCurrentTransactionId());
+				ClearSpeculativeInsertionState();
+			}
 		}
 
 		if (conflict)
@@ -1014,6 +1019,17 @@ ExecLockUpdateTuple(EState *estate,
 						   LockTupleExclusive, false, /* wait */
 						   false, &buffer, &hufd);
 
+	/* Was tuple concurrently super-deleted? */
+	if (TransactionIdEquals(HeapTupleHeaderGetRawXmin(tuple.t_data),
+							 InvalidTransactionId))
+	{
+		test = HeapTupleUpdated;
+		/* XXX Should be caught within tqual.c */
+		elog(WARNING, "concurrent super delete of tuple (%u,%u)",
+			 ItemPointerGetBlockNumber(&tuple.t_data->t_ctid),
+			 ItemPointerGetOffsetNumber(&tuple.t_data->t_ctid));
+	}
+
 	if (test == HeapTupleMayBeUpdated)
 		copyTuple = heap_copytuple(&tuple);
 
@@ -1145,6 +1161,15 @@ ExecLockUpdateTuple(EState *estate,
 				*returning = ExecUpdate(&tuple.t_data->t_ctid, NULL, slot,
 										planSlot, &onConflict->mt_epqstate,
 										onConflict->ps.state, canSetTag);
+			else
+				/*
+				 * XXX Debug instrumentation.  This is useful for stress test
+				 * queries with "WHERE target.index = excluded.index", in order
+				 * to assert that the UPDATE affects the right row.
+				 */
+				elog(WARNING, "did not update tuple (%u,%u)",
+					 ItemPointerGetBlockNumber(&tuple.t_data->t_ctid),
+					 ItemPointerGetOffsetNumber(&tuple.t_data->t_ctid));
 
 			ReleaseBuffer(buffer);
 
diff --git a/src/backend/utils/time/tqual.c b/src/backend/utils/time/tqual.c
index 9216779..94099f1 100644
--- a/src/backend/utils/time/tqual.c
+++ b/src/backend/utils/time/tqual.c
@@ -728,6 +728,16 @@ HeapTupleSatisfiesDirty(HeapTuple htup, Snapshot snapshot,
 	snapshot->xmin = snapshot->xmax = InvalidTransactionId;
 	snapshot->speculativeToken = 0;
 
+	/*
+	 * Never return "super-deleted" tuples
+	 *
+	 * XXX:  Comment this code out and you'll get conflicts within
+	 * ExecLockUpdateTuple(), which result in an infinite loop.
+	 */
+	if (TransactionIdEquals(HeapTupleHeaderGetRawXmin(tuple),
+							InvalidTransactionId))
+		return false;
+
 	if (!HeapTupleHeaderXminCommitted(tuple))
 	{
 		if (HeapTupleHeaderXminInvalid(tuple))
@@ -943,6 +953,13 @@ HeapTupleSatisfiesMVCC(HeapTuple htup, Snapshot snapshot,
 	Assert(ItemPointerIsValid(&htup->t_self));
 	Assert(htup->t_tableOid != InvalidOid);
 
+	/*
+	 * Never return "super-deleted" tuples
+	 */
+	if (TransactionIdEquals(HeapTupleHeaderGetRawXmin(tuple),
+							InvalidTransactionId))
+		return false;
+
 	if (!HeapTupleHeaderXminCommitted(tuple))
 	{
 		if (HeapTupleHeaderXminInvalid(tuple))
@@ -1147,6 +1164,13 @@ HeapTupleSatisfiesVacuum(HeapTuple htup, TransactionId OldestXmin,
 	Assert(htup->t_tableOid != InvalidOid);
 
 	/*
+	 * Immediately VACUUM "super-deleted" tuples
+	 */
+	if (TransactionIdEquals(HeapTupleHeaderGetRawXmin(tuple),
+							InvalidTransactionId))
+		return HEAPTUPLE_DEAD;
+
+	/*
 	 * Has inserting transaction committed?
 	 *
 	 * If the inserting transaction aborted, then the tuple was never visible
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to