On Sun, Jun 19, 2011 at 09:10:02PM -0400, Robert Haas wrote:
> Is this an open item for 9.1beta3?

Yes. I've put it on the list.

The SxactGlobalXmin and its refcount were getting out of sync with the
actual transaction state. This is timing-dependent but I can reproduce it
fairly reliably under concurrent workloads.

It looks like the problem comes from the change a couple days ago that
removed the SXACT_FLAG_ROLLED_BACK flag and changed the
SxactIsRolledBack checks to SxactIsDoomed. That's the correct thing to
do everywhere else, but gets us in trouble here. We shouldn't be
ignoring doomed transactions in SetNewSxactGlobalXmin, because they're
not eligible for cleanup until the associated backend aborts the
transaction and calls ReleasePredicateLocks.

However, it isn't as simple as just removing the SxactIsDoomed check
from SetNewSxactGlobalXmin. ReleasePredicateLocks calls
SetNewSxactGlobalXmin *before* it releases the aborted transaction's
SerializableXact (it pretty much has to, because we must drop and
reacquire SerializableXactHashLock in between). But we don't want the
aborted transaction included in the SxactGlobalXmin computation.

It seems like we do need that SXACT_FLAG_ROLLED_BACK after all, even
though it's only set for this brief interval. We need to be able to
distinguish a transaction that's just been marked for death (doomed)
from one that's already called ReleasePredicateLocks.
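
To make the distinction concrete, here is a rough standalone sketch of
the idea, not the actual predicate.c code. Everything prefixed Toy/toy_
is a hypothetical stand-in I made up for illustration: the struct
layout, the flag values, and the plain "<" comparison (the real code
has to handle xid wraparound and has other exclusions that I'm leaving
out). The only point is that the xmin scan skips rolled-back entries
but still counts doomed ones.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model only: names echo predicate.c, but types and layout are hypothetical. */
typedef uint32_t ToyXid;
#define TOY_INVALID_XID ((ToyXid) 0)

/* Marked for death, but the owning backend has not aborted yet. */
#define TOY_FLAG_DOOMED      0x1u
/* Owning backend has aborted and entered ReleasePredicateLocks. */
#define TOY_FLAG_ROLLED_BACK 0x2u

typedef struct ToySxact
{
    ToyXid   xmin;
    unsigned flags;
} ToySxact;

typedef struct ToyGlobalXmin
{
    ToyXid xmin;      /* oldest xmin among the entries that still count */
    int    refcount;  /* number of entries sharing that xmin */
} ToyGlobalXmin;

/*
 * Recompute the global xmin and its refcount from scratch.
 * Rolled-back entries are skipped; doomed entries are NOT skipped,
 * because they still hold resources until their backend aborts.
 * Simplification: "<" ignores xid wraparound.
 */
static ToyGlobalXmin
toy_set_new_global_xmin(const ToySxact *sxacts, size_t n)
{
    ToyGlobalXmin g = { TOY_INVALID_XID, 0 };

    for (size_t i = 0; i < n; i++)
    {
        if (sxacts[i].flags & TOY_FLAG_ROLLED_BACK)
            continue;

        if (g.xmin == TOY_INVALID_XID || sxacts[i].xmin < g.xmin)
        {
            g.xmin = sxacts[i].xmin;
            g.refcount = 1;
        }
        else if (sxacts[i].xmin == g.xmin)
            g.refcount++;
    }
    return g;
}
```

With a single DOOMED flag there is no way to write the "continue" test
above so that it is right both before and after the backend reaches
ReleasePredicateLocks, which is why the second flag seems necessary.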

Dan

-- 
Dan R. K. Ports              MIT CSAIL                http://drkp.net/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
