On Sat, Jan 3, 2015 at 12:42 PM, Peter Geoghegan <p...@heroku.com> wrote:
> There are probably additional factors that make the situation more
> complicated than it is for the ON CONFLICT patch, but it's clear to me
> that the mutual dependency that can be involved with two ordinary
> exclusion constraint inserters is reason enough for those to deadlock.
I looked at the code in more detail, and realized that there were old bugs in the exclusion constraint related modifications. I attach a delta patch that fixes them. This is a combined patch, containing everything that needs to be applied on top of v1.8.vallock2.tar.gz [1] to have all available bugfixes. It unbreaks my previous exclusion constraint test case, as far as it goes.

I am still concerned about the risk of lock starvation/livelocks here. But I must admit, having stressed the patch with this latest fix, that it's possible that the real risks have been overestimated. Maybe this is in the same category as the way L&Y's algorithm could theoretically starve a session whose page split needs to move right indefinitely. It's a theoretical risk that turns out not to matter in practice, for reasons that lack a real theoretical justification beyond the fact that sustained very bad luck - a sustained "perfect storm" - is required for starvation to occur. OTOH, maybe my OS scheduler simply doesn't have the characteristics that provoke the suspected bug. I really don't know, but I suppose that one or the other concurrent inserter will tend to win the race often enough - a draw may be a very rare thing.

[1] http://www.postgresql.org/message-id/cam3swzrg_htrol-6_wfe6_d_ucuyc28jfapsfh_tra76gkk...@mail.gmail.com
--
Peter Geoghegan
diff --git a/src/backend/executor/execUtils.c b/src/backend/executor/execUtils.c
index ea8c57b..f8a4017 100644
--- a/src/backend/executor/execUtils.c
+++ b/src/backend/executor/execUtils.c
@@ -1133,8 +1133,11 @@ ExecInsertIndexTuples(TupleTableSlot *slot,
 		 * If the index has an associated exclusion constraint, check that.
 		 * This is simpler than the process for uniqueness checks since we
 		 * always insert first and then check.  If the constraint is deferred,
-		 * we check now anyway, but don't throw error on violation; instead
-		 * we'll queue a recheck event.
+		 * we check now anyway, but don't throw error on violation or wait for
+		 * a conclusive outcome from a concurrent insertion; instead we'll
+		 * queue a recheck event.  Similarly, noDupErr callers (speculative
+		 * inserters) will recheck later, and wait for a conclusive outcome
+		 * then.
 		 *
 		 * An index for an exclusion constraint can't also be UNIQUE (not an
 		 * essential property, we just don't allow it in the grammar), so no
@@ -1142,15 +1145,15 @@ ExecInsertIndexTuples(TupleTableSlot *slot,
 		 */
 		if (indexInfo->ii_ExclusionOps != NULL)
 		{
-			bool		errorOK = (!indexRelation->rd_index->indimmediate &&
-								   !noDupErr);
+			bool		violationOK = (!indexRelation->rd_index->indimmediate ||
+									   noDupErr);
 
 			satisfiesConstraint =
 				check_exclusion_or_unique_constraint(heapRelation, indexRelation,
 													 indexInfo, tupleid,
 													 values, isnull,
-													 estate, false, errorOK,
-													 false, NULL);
+													 estate, false,
+													 violationOK, false, NULL);
 		}
 
 		if ((checkUnique == UNIQUE_CHECK_PARTIAL ||
@@ -1158,7 +1161,7 @@ ExecInsertIndexTuples(TupleTableSlot *slot,
 			!satisfiesConstraint)
 		{
 			/*
-			 * The tuple potentially violates the uniqueness or exclusion
+			 * The tuple potentially violates the unique index or exclusion
 			 * constraint, so make a note of the index so that we can re-check
 			 * it later.
 			 */
@@ -1322,7 +1325,7 @@ ExecCheckIndexConstraints(TupleTableSlot *slot,
  * is convenient for deferred exclusion checks; we need not bother queuing
  * a deferred event if there is definitely no conflict at insertion time.
  *
- * When errorOK is false, we'll throw error on violation, so a false result
+ * When violationOK is false, we'll throw error on violation, so a false result
  * is impossible.
  *
  * Note: The indexam is normally responsible for checking unique constraints,
@@ -1335,7 +1338,7 @@ check_exclusion_or_unique_constraint(Relation heap, Relation index,
 									 IndexInfo *indexInfo,
 									 ItemPointer tupleid,
 									 Datum *values, bool *isnull,
 									 EState *estate, bool newIndex,
-									 bool errorOK, bool wait,
+									 bool violationOK, bool wait,
 									 ItemPointer conflictTid)
 {
 	Oid		   *constr_procs;
@@ -1462,7 +1465,7 @@ retry:
 		 * we're not supposed to raise error, just return the fact of the
 		 * potential conflict without waiting to see if it's real.
 		 */
-		if (errorOK && !wait)
+		if (violationOK && !wait)
 		{
 			conflict = true;
 			if (conflictTid)
@@ -1487,7 +1490,7 @@ retry:
 			if (DirtySnapshot.speculativeToken)
 				SpeculativeInsertionWait(DirtySnapshot.xmin,
 										 DirtySnapshot.speculativeToken);
-			else if (errorOK)
+			else if (violationOK)
 				XactLockTableWait(xwait, heap, &tup->t_self,
 								  XLTW_RecheckExclusionConstr);
 			else
@@ -1499,7 +1502,7 @@ retry:
 		/*
 		 * We have a definite conflict.  Return it to caller, or report it.
 		 */
-		if (errorOK)
+		if (violationOK)
 		{
 			conflict = true;
 			if (conflictTid)
diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c
index 4e5be1a..9bb13df 100644
--- a/src/backend/executor/nodeModifyTable.c
+++ b/src/backend/executor/nodeModifyTable.c
@@ -313,8 +313,8 @@ ExecInsert(TupleTableSlot *slot,
 	 * deleting the already-inserted tuple and retrying, but that's fairly
 	 * expensive, so we try to avoid it.
 	 */
-	conflict = false;
 vlock:
+	conflict = false;
 	ItemPointerSetInvalid(&conflictTid);
 
 	/*
@@ -354,7 +354,8 @@ vlock:
 		 * which is appropriate only for non-promise tuples) to wait for us
 		 * to decide if we're going to go ahead with the insertion.
 		 */
-		SpeculativeInsertionLockAcquire(GetCurrentTransactionId());
+		if (spec != SPEC_NONE)
+			SpeculativeInsertionLockAcquire(GetCurrentTransactionId());
 
 		/*
 		 * insert the tuple
@@ -363,7 +364,8 @@ vlock:
 		 * the t_self field.
 		 */
 		newId = heap_insert(resultRelationDesc, tuple,
-							estate->es_output_cid, HEAP_INSERT_SPECULATIVE,
+							estate->es_output_cid,
+							spec != SPEC_NONE? HEAP_INSERT_SPECULATIVE:0,
 							NULL);
 
 		/*
@@ -404,8 +406,11 @@ vlock:
 								estate->es_output_cid,
 								NULL, false, &hufd, true);
 		}
-		SpeculativeInsertionLockRelease(GetCurrentTransactionId());
-		ClearSpeculativeInsertionState();
+		if (spec != SPEC_NONE)
+		{
+			SpeculativeInsertionLockRelease(GetCurrentTransactionId());
+			ClearSpeculativeInsertionState();
+		}
 	}
 
 	if (conflict)
@@ -1014,6 +1019,17 @@ ExecLockUpdateTuple(EState *estate,
 						 LockTupleExclusive, false, /* wait */
 						 false, &buffer, &hufd);
 
+	/* Was tuple concurrently super-deleted? */
+	if (TransactionIdEquals(HeapTupleHeaderGetRawXmin(tuple.t_data),
+							InvalidTransactionId))
+	{
+		test = HeapTupleUpdated;
+		/* XXX Should be caught within tqual.c */
+		elog(WARNING, "concurrent super delete of tuple (%u,%u)",
+			 ItemPointerGetBlockNumber(&tuple.t_data->t_ctid),
+			 ItemPointerGetOffsetNumber(&tuple.t_data->t_ctid));
+	}
+
 	if (test == HeapTupleMayBeUpdated)
 		copyTuple = heap_copytuple(&tuple);
@@ -1145,6 +1161,15 @@ ExecLockUpdateTuple(EState *estate,
 		*returning = ExecUpdate(&tuple.t_data->t_ctid, NULL, slot, planSlot,
								 &onConflict->mt_epqstate, onConflict->ps.state,
								 canSetTag);
+	else
+		/*
+		 * XXX Debug instrumentation.  This is useful for stress test
+		 * queries with "WHERE target.index = excluded.index", in order
+		 * to assert that the UPDATE affects the right row.
+		 */
+		elog(WARNING, "did not update tuple (%u,%u)",
+			 ItemPointerGetBlockNumber(&tuple.t_data->t_ctid),
+			 ItemPointerGetOffsetNumber(&tuple.t_data->t_ctid));
 
 	ReleaseBuffer(buffer);
diff --git a/src/backend/utils/time/tqual.c b/src/backend/utils/time/tqual.c
index 9216779..94099f1 100644
--- a/src/backend/utils/time/tqual.c
+++ b/src/backend/utils/time/tqual.c
@@ -728,6 +728,16 @@ HeapTupleSatisfiesDirty(HeapTuple htup, Snapshot snapshot,
 	snapshot->xmin = snapshot->xmax = InvalidTransactionId;
 	snapshot->speculativeToken = 0;
 
+	/*
+	 * Never return "super-deleted" tuples
+	 *
+	 * XXX: Comment this code out and you'll get conflicts within
+	 * ExecLockUpdateTuple(), which result in an infinite loop.
+	 */
+	if (TransactionIdEquals(HeapTupleHeaderGetRawXmin(tuple),
+							InvalidTransactionId))
+		return false;
+
 	if (!HeapTupleHeaderXminCommitted(tuple))
 	{
 		if (HeapTupleHeaderXminInvalid(tuple))
@@ -943,6 +953,13 @@ HeapTupleSatisfiesMVCC(HeapTuple htup, Snapshot snapshot,
 	Assert(ItemPointerIsValid(&htup->t_self));
 	Assert(htup->t_tableOid != InvalidOid);
 
+	/*
+	 * Never return "super-deleted" tuples
+	 */
+	if (TransactionIdEquals(HeapTupleHeaderGetRawXmin(tuple),
+							InvalidTransactionId))
+		return false;
+
 	if (!HeapTupleHeaderXminCommitted(tuple))
 	{
 		if (HeapTupleHeaderXminInvalid(tuple))
@@ -1147,6 +1164,13 @@ HeapTupleSatisfiesVacuum(HeapTuple htup, TransactionId OldestXmin,
 	Assert(htup->t_tableOid != InvalidOid);
 
 	/*
+	 * Immediately VACUUM "super-deleted" tuples
+	 */
+	if (TransactionIdEquals(HeapTupleHeaderGetRawXmin(tuple),
+							InvalidTransactionId))
+		return HEAPTUPLE_DEAD;
+
+	/*
 	 * Has inserting transaction committed?
 	 *
 	 * If the inserting transaction aborted, then the tuple was never visible
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers