I've been working on fixing the bugs that Jeff Janes found [1] with approach #2 to value locking [2]. Approach #1 was unaffected.
I'm starting this new thread, to discuss these issues. Let's try and confine discussion of semantics and syntax to the main thread, since that has been what has mostly been discussed there. Jeff produced a stress testing tool which tests concurrent deletions (and not just concurrent updates); his test case deletes tuples when a counter value goes under zero, which happens occasionally. It also independently does book keeping for all values in a range of values that are randomly incremented or decremented, and verifies that everything is as it should be. This revealed problems with #2 around concurrent "super deletion" of "unkept" promise heap tuples. As Jeff pointed out, all-NULL tuples were appearing as visible, or spurious duplicates appeared. For example: postgres=# select xmin, xmax, ctid, * from upsert_race_test where index = 1287; xmin | xmax | ctid | index | count --------+--------+-----------+--------+-------- 0 | 205438 | (260,27) | [null] | [null] 176066 | 0 | (258,158) | 1287 | 1 (2 rows) I should point out that one of the problems I found here was a garden variety bug, which has been fixed (the conflict flag in ExecInsert() was not being reset correctly, I believe). However, the other remaining problems could only be fixed with further changes to the routines in tqual.c, which I'm not happy about, and require further discussion here. I attach a differential patch which applies on top of the most recent revision of #2, which is v1.8.vallock2.tar.gz [3]. Note that I've been extending Jeff's stress testing tool to make it highlight problems earlier, and to make it provide more and better instrumentation (e.g. printing of XIDs to correlate with pg_xlogdump output). A good trick I added was to have an INSERT...ON CONFLICT UPDATE predicate of "WHERE TARGET.index = EXCLUDED.index", which serves as a kind of assertion when used within the Perl script (the Perl scripts checks RETURNING-projected tuples, and expects to always insert or update - that WHERE clause should always pass, obviously). Jeff's tool is available here (with my revisions): https://github.com/petergeoghegan/jjanes_upsert I've been running with fsync off, which Jeff felt increased the likelihood of revealing the bug. It isn't necessary, though, and FWIW it was easy for me to recreate this problem once I understood it, using only my 4 core laptop. I found it important to build without assertions and at optimization level -O2, though. The attached patch fixes Jeff's test case, as far as it goes. But, as I mentioned, it's not intended as a real fix. For one thing, I have made multiple changes to HeapTupleSatisfies*() routines, but haven't fully considered the consequences for, say, ri_triggers.c. HeapTupleSatisfiesSelf() and HeapTupleSatisfiesHistoricMVCC() were not modified. I've modified HeapTupleSatisfiesVacuum() to see InvalidTransactionID (not invalid xid infomask bits) as visible for garbage collection (alongside HeapTupleSatisfiesDirty() and HeapTupleSatisfiesMVCC(), which I changed first), and I wouldn't be surprised if that created new problems in other areas. In short, this patch is just for illustrative purposes. To see the difference this patch makes, I suggest commenting out the new code, and running the following test after a fresh initdb: $ perl count_upsert.pl 4 100000 I think that the actual nature of the problem I fixed was that concurrent upserters felt entitled to update a promise tuple that they could at least initially see, but was then "super deleted", but it turned out that once it was super deleted it was okay to update (the tuple wasn't actually duplicated insofar as HeapTupleSatisfiesDirty() was then able to ascertain due to other concurrent activity). It's pretty complicated, in any case. This is just a band-aid. Thoughts? Does anyone have any thoughts on my diagnosis of the problem? [1] http://www.postgresql.org/message-id/CAMkU=1w8e9Q6ZX3U85RtsyCMpdYWFrvVAO3=unevtqiruzn...@mail.gmail.com [2] https://wiki.postgresql.org/wiki/Value_locking#.232._.22Promise.22_heap_tuples_.28Heikki_Linnakangas.29 [3] http://www.postgresql.org/message-id/cam3swzrg_htrol-6_wfe6_d_ucuyc28jfapsfh_tra76gkk...@mail.gmail.com -- Peter Geoghegan
diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c index 4e5be1a..9bb13df 100644 --- a/src/backend/executor/nodeModifyTable.c +++ b/src/backend/executor/nodeModifyTable.c @@ -313,8 +313,8 @@ ExecInsert(TupleTableSlot *slot, * deleting the already-inserted tuple and retrying, but that's fairly * expensive, so we try to avoid it. */ - conflict = false; vlock: + conflict = false; ItemPointerSetInvalid(&conflictTid); /* @@ -354,7 +354,8 @@ vlock: * which is appropriate only for non-promise tuples) to wait for us * to decide if we're going to go ahead with the insertion. */ - SpeculativeInsertionLockAcquire(GetCurrentTransactionId()); + if (spec != SPEC_NONE) + SpeculativeInsertionLockAcquire(GetCurrentTransactionId()); /* * insert the tuple @@ -363,7 +364,8 @@ vlock: * the t_self field. */ newId = heap_insert(resultRelationDesc, tuple, - estate->es_output_cid, HEAP_INSERT_SPECULATIVE, + estate->es_output_cid, + spec != SPEC_NONE? HEAP_INSERT_SPECULATIVE:0, NULL); /* @@ -404,8 +406,11 @@ vlock: estate->es_output_cid, NULL, false, &hufd, true); } - SpeculativeInsertionLockRelease(GetCurrentTransactionId()); - ClearSpeculativeInsertionState(); + if (spec != SPEC_NONE) + { + SpeculativeInsertionLockRelease(GetCurrentTransactionId()); + ClearSpeculativeInsertionState(); + } } if (conflict) @@ -1014,6 +1019,17 @@ ExecLockUpdateTuple(EState *estate, LockTupleExclusive, false, /* wait */ false, &buffer, &hufd); + /* Was tuple concurrently super-deleted? */ + if (TransactionIdEquals(HeapTupleHeaderGetRawXmin(tuple.t_data), + InvalidTransactionId)) + { + test = HeapTupleUpdated; + /* XXX Should be caught within tqual.c */ + elog(WARNING, "concurrent super delete of tuple (%u,%u)", + ItemPointerGetBlockNumber(&tuple.t_data->t_ctid), + ItemPointerGetOffsetNumber(&tuple.t_data->t_ctid)); + } + if (test == HeapTupleMayBeUpdated) copyTuple = heap_copytuple(&tuple); @@ -1145,6 +1161,15 @@ ExecLockUpdateTuple(EState *estate, *returning = ExecUpdate(&tuple.t_data->t_ctid, NULL, slot, planSlot, &onConflict->mt_epqstate, onConflict->ps.state, canSetTag); + else + /* + * XXX Debug instrumentation. This is useful for stress test + * queries with "WHERE target.index = excluded.index", in order + * to assert that the UPDATE affects the right row. + */ + elog(WARNING, "did not update tuple (%u,%u)", + ItemPointerGetBlockNumber(&tuple.t_data->t_ctid), + ItemPointerGetOffsetNumber(&tuple.t_data->t_ctid)); ReleaseBuffer(buffer); diff --git a/src/backend/utils/time/tqual.c b/src/backend/utils/time/tqual.c index 9216779..94099f1 100644 --- a/src/backend/utils/time/tqual.c +++ b/src/backend/utils/time/tqual.c @@ -728,6 +728,16 @@ HeapTupleSatisfiesDirty(HeapTuple htup, Snapshot snapshot, snapshot->xmin = snapshot->xmax = InvalidTransactionId; snapshot->speculativeToken = 0; + /* + * Never return "super-deleted" tuples + * + * XXX: Comment this code out and you'll get conflicts within + * ExecLockUpdateTuple(), which result in an infinite loop. + */ + if (TransactionIdEquals(HeapTupleHeaderGetRawXmin(tuple), + InvalidTransactionId)) + return false; + if (!HeapTupleHeaderXminCommitted(tuple)) { if (HeapTupleHeaderXminInvalid(tuple)) @@ -943,6 +953,13 @@ HeapTupleSatisfiesMVCC(HeapTuple htup, Snapshot snapshot, Assert(ItemPointerIsValid(&htup->t_self)); Assert(htup->t_tableOid != InvalidOid); + /* + * Never return "super-deleted" tuples + */ + if (TransactionIdEquals(HeapTupleHeaderGetRawXmin(tuple), + InvalidTransactionId)) + return false; + if (!HeapTupleHeaderXminCommitted(tuple)) { if (HeapTupleHeaderXminInvalid(tuple)) @@ -1147,6 +1164,13 @@ HeapTupleSatisfiesVacuum(HeapTuple htup, TransactionId OldestXmin, Assert(htup->t_tableOid != InvalidOid); /* + * Immediately VACUUM "super-deleted" tuples + */ + if (TransactionIdEquals(HeapTupleHeaderGetRawXmin(tuple), + InvalidTransactionId)) + return HEAPTUPLE_DEAD; + + /* * Has inserting transaction committed? * * If the inserting transaction aborted, then the tuple was never visible
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers