There's a corner case in the SSI cleanup code that isn't handled correctly. It can arise when running workloads that are comprised mostly (but not 100%) of READ ONLY transactions, and can corrupt the finished SERIALIZABLEXACT list, potentially causing a segfault. The attached patch fixes it.
Specifically, when the only remaining active transactions are READ ONLY, we do a "partial cleanup" of committed transactions because certain types of conflicts aren't possible anymore. For committed r/w transactions, we release the SIREAD locks but keep the SERIALIZABLEXACT. However, for committed r/o transactions, we can go further and release the SERIALIZABLEXACT too. The problem was with the latter case: we were returning the SERIALIZABLEXACT to the free list without removing it from the finished list. The only real change in the patch is the SHMQueueDelete line, but I also reworked some of the surrounding code to make it obvious that r/o and r/w transactions are handled differently -- the existing code felt a bit too clever. Dan -- Dan R. K. Ports MIT CSAIL http://drkp.net/
>From 39f8462332f998d7363058adabac412c7654befe Mon Sep 17 00:00:00 2001 From: "Dan R. K. Ports" <d...@csail.mit.edu> Date: Thu, 29 Dec 2011 23:11:35 -0500 Subject: [PATCH] Read-only SERIALIZABLEXACTs are completely released when doing partial cleanup, so remove them from the finished list. This prevents the finished list from being corrupted. Also make it more clear that read-only transactions are treated differently here. --- src/backend/storage/lmgr/predicate.c | 26 +++++++++++++++++++++++--- 1 files changed, 23 insertions(+), 3 deletions(-) diff --git a/src/backend/storage/lmgr/predicate.c b/src/backend/storage/lmgr/predicate.c index aefa698..c0b3cb4 100644 --- a/src/backend/storage/lmgr/predicate.c +++ b/src/backend/storage/lmgr/predicate.c @@ -3531,10 +3531,29 @@ ClearOldPredicateLocks(void) else if (finishedSxact->commitSeqNo > PredXact->HavePartialClearedThrough && finishedSxact->commitSeqNo <= PredXact->CanPartialClearThrough) { + /* + * Any active transactions that took their snapshot before + * this transaction committed are read-only, so we can + * clear part of its state. + */ LWLockRelease(SerializableXactHashLock); - ReleaseOneSerializableXact(finishedSxact, - !SxactIsReadOnly(finishedSxact), - false); + + if (SxactIsReadOnly(finishedSxact)) + { + /* A read-only transaction can be removed entirely */ + SHMQueueDelete(&(finishedSxact->finishedLink)); + ReleaseOneSerializableXact(finishedSxact, false, false); + } + else + { + /* + * A read-write transaction can only be partially + * cleared. We need to keep the SERIALIZABLEXACT but + * can release the SIREAD locks and conflicts in. + */ + ReleaseOneSerializableXact(finishedSxact, true, false); + } + PredXact->HavePartialClearedThrough = finishedSxact->commitSeqNo; LWLockAcquire(SerializableXactHashLock, LW_SHARED); } @@ -3640,6 +3659,7 @@ ReleaseOneSerializableXact(SERIALIZABLEXACT *sxact, bool partial, Assert(sxact != NULL); Assert(SxactIsRolledBack(sxact) || SxactIsCommitted(sxact)); + Assert(partial || !SxactIsOnFinishedList(sxact)); Assert(LWLockHeldByMe(SerializableFinishedListLock)); /* -- 1.7.8.2
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers