Re: [HACKERS] Make HeapTupleSatisfiesMVCC more concurrent
Simon Riggs si...@2ndquadrant.com writes: On 25 August 2015 at 21:51, Tom Lane t...@sss.pgh.pa.us wrote: I don't mean to dismiss the potential for further optimization inside XidInMVCCSnapshot (for instance, the one-XID-cache idea sounds promising); but I think that's material for further research and a separate patch. Patch attached. Doesn't seem worth a separate thread. This doesn't seem right to me: it only caches the case where the XID is found to be present in the snapshot, whereas I think probably we should remember the result in both cases (present or not). Moreover, in the subxip-overflowed case where we replace an originally-inquired-of subxact xid with its parent, what you're caching is the parent xid, which will fail to match subsequent queries. Shouldn't we cache the originally inquired-of xid? The question of whether to cache false results perhaps requires some performance testing. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Make HeapTupleSatisfiesMVCC more concurrent
On 25 August 2015 at 21:51, Tom Lane t...@sss.pgh.pa.us wrote: I don't mean to dismiss the potential for further optimization inside XidInMVCCSnapshot (for instance, the one-XID-cache idea sounds promising); but I think that's material for further research and a separate patch. Patch attached. Doesn't seem worth a separate thread. It's not clear to me if anyone wanted to do further review/testing of this patch, or if I should go ahead and push it (after fixing whatever comments need to be fixed). Push, please. -- Simon Riggshttp://www.2ndQuadrant.com/ http://www.2ndquadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services xidinmvccsnapshot_cache.v1.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Make HeapTupleSatisfiesMVCC more concurrent
On Wed, Aug 26, 2015 at 2:21 AM, Tom Lane t...@sss.pgh.pa.us wrote: Amit Kapila amit.kapil...@gmail.com writes: I am wondering that is there any harm in calling TransactionIdDidAbort() in slow path before calling SubTransGetTopmostTransaction(), that can also maintain consistency of checks in both the functions? I think this is probably a bad idea. It adds a pg_clog lookup that we would otherwise not do at all, in hopes of avoiding a pg_subtrans lookup. It's not exactly clear that that's a win even if we successfully avoid the subtrans lookup (which we often would not). And even if it does win, that would only happen if the other transaction has aborted, which isn't generally the case we prefer to optimize for. I think Alvaro has mentioned the case where it could win, however it can still add some penality where most (or all) transactions are committed. I agree with you that we might not want to optimize or spend our energy figuring out if this is win for not-a-common use case. OTOH, I feel having this and other point related to optimisation (one-XID-cache) could be added as part of function level comments to help, if some body encounters any such case in future or is puzzled why there are some differences in TransactionIdIsInProgress() and XidInMVCCSnapshot(). I don't mean to dismiss the potential for further optimization inside XidInMVCCSnapshot (for instance, the one-XID-cache idea sounds promising); but I think that's material for further research and a separate patch. It's not clear to me if anyone wanted to do further review/testing of this patch, or if I should go ahead and push it (after fixing whatever comments need to be fixed). I think jeff's test results upthread already ensured that this patch is of value and fair enough number of people have already looked into it and provided there feedback, so +1 for pushing it. With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] Make HeapTupleSatisfiesMVCC more concurrent
Amit Kapila amit.kapil...@gmail.com writes: On Fri, Aug 21, 2015 at 8:22 PM, Robert Haas robertmh...@gmail.com wrote: On Wed, Aug 19, 2015 at 2:53 AM, Amit Kapila amit.kapil...@gmail.com Another minor point is, I think we should modify function level comment for XidInMVCCSnapshot() where it says that this applies to known- committed XIDs which will no longer be true after this patch. Yeah, that comment would certainly be out of date, and I think it may not be the only one. I'll check around some more. I am wondering that is there any harm in calling TransactionIdDidAbort() in slow path before calling SubTransGetTopmostTransaction(), that can also maintain consistency of checks in both the functions? I think this is probably a bad idea. It adds a pg_clog lookup that we would otherwise not do at all, in hopes of avoiding a pg_subtrans lookup. It's not exactly clear that that's a win even if we successfully avoid the subtrans lookup (which we often would not). And even if it does win, that would only happen if the other transaction has aborted, which isn't generally the case we prefer to optimize for. I don't mean to dismiss the potential for further optimization inside XidInMVCCSnapshot (for instance, the one-XID-cache idea sounds promising); but I think that's material for further research and a separate patch. It's not clear to me if anyone wanted to do further review/testing of this patch, or if I should go ahead and push it (after fixing whatever comments need to be fixed). regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Make HeapTupleSatisfiesMVCC more concurrent
Tom Lane wrote: Amit Kapila amit.kapil...@gmail.com writes: I am wondering that is there any harm in calling TransactionIdDidAbort() in slow path before calling SubTransGetTopmostTransaction(), that can also maintain consistency of checks in both the functions? I think this is probably a bad idea. It adds a pg_clog lookup that we would otherwise not do at all, in hopes of avoiding a pg_subtrans lookup. It's not exactly clear that that's a win even if we successfully avoid the subtrans lookup (which we often would not). And even if it does win, that would only happen if the other transaction has aborted, which isn't generally the case we prefer to optimize for. It's probably key to this idea that TransactionIdDidAbort returns in a single slru lookup, whereas SubTransGetTopmostTransaction needs to iterate possibly many layers of subxacts. But the point about this being a win only for aborted xacts makes it probably pointless, I agree. -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Make HeapTupleSatisfiesMVCC more concurrent
On Fri, Aug 21, 2015 at 8:22 PM, Robert Haas robertmh...@gmail.com wrote: On Wed, Aug 19, 2015 at 2:53 AM, Amit Kapila amit.kapil...@gmail.com wrote: I think one case where the patch can impact is for aborted transactions. In TransactionIdIsInProgress(), we check for aborted transactions before consulting pg_subtrans whereas with patch it will consult pg_subtrans without aborted transaction check. Now it could be better to first check pg_subtrans if we found that the corresponding top transaction is in-progress as that will save extra pg_clog lookup, but I just mentioned because that is one difference that I could see with this patch. Another minor point is, I think we should modify function level comment for XidInMVCCSnapshot() where it says that this applies to known- committed XIDs which will no longer be true after this patch. But only if the snapshot has overflowed, right? Yes. That should affect only a small minority of cases. Agreed, but the effect would be non-negligible for such cases. One thing I am wondering that is there any harm in calling TransactionIdDidAbort() in slow path before calling SubTransGetTopmostTransaction(), that can also maintain consistency of checks in both the functions? With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] Make HeapTupleSatisfiesMVCC more concurrent
On Wed, Aug 19, 2015 at 2:53 AM, Amit Kapila amit.kapil...@gmail.com wrote: I think one case where the patch can impact is for aborted transactions. In TransactionIdIsInProgress(), we check for aborted transactions before consulting pg_subtrans whereas with patch it will consult pg_subtrans without aborted transaction check. Now it could be better to first check pg_subtrans if we found that the corresponding top transaction is in-progress as that will save extra pg_clog lookup, but I just mentioned because that is one difference that I could see with this patch. Another minor point is, I think we should modify function level comment for XidInMVCCSnapshot() where it says that this applies to known- committed XIDs which will no longer be true after this patch. But only if the snapshot has overflowed, right? That should affect only a small minority of cases. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Make HeapTupleSatisfiesMVCC more concurrent
On Tue, Aug 18, 2015 at 8:36 PM, Tom Lane t...@sss.pgh.pa.us wrote: I wrote: Just thinking about this ... I wonder why we need to call TransactionIdIsInProgress() at all rather than believing the answer from the snapshot? Under what circumstances could TransactionIdIsInProgress() return true where XidInMVCCSnapshot() had not? I experimented with the attached patch, which replaces HeapTupleSatisfiesMVCC's calls of TransactionIdIsInProgress with XidInMVCCSnapshot, and then as a cross-check has all the return false exits from XidInMVCCSnapshot assert !TransactionIdIsInProgress(). The asserts did not fire in the standard regression tests nor in a pgbench run, which is surely not proof of anything but it suggests that I'm not totally nuts. I wouldn't commit the changes in XidInMVCCSnapshot for real, but otherwise this is a possibly committable patch. Like Jeff's approach, this will set hint bits less aggressively. Suppose there are 10 processes concurrently scanning a whole bunch of dead tuples. Furthermore, suppose those dead tuples were created by a transaction that began after those processes took their snapshot and later aborted. Without the patch, the first one should notice that the tuples are in fact dead and hint them HEAP_XMIN_INVALID. With the patch, the backends will see those transactions as in-progress rather than aborted, so they won't be hinted. Now that probably isn't a reason not to do this. Anything that cuts down on ProcArrayLock acquisition and cache-line bouncing in shared memory is likely to be a win even if it burns a few more cycles elsewhere. And I think the scenario I just mentioned probably doesn't happen that often, and probably wouldn't cost that much if it did. But it's worth thinking about, and at the least we should document in the commit message or the comments that this change has this effect. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Make HeapTupleSatisfiesMVCC more concurrent
Robert Haas robertmh...@gmail.com writes: On Tue, Aug 18, 2015 at 8:36 PM, Tom Lane t...@sss.pgh.pa.us wrote: I experimented with the attached patch, which replaces HeapTupleSatisfiesMVCC's calls of TransactionIdIsInProgress with XidInMVCCSnapshot, and then as a cross-check has all the return false exits from XidInMVCCSnapshot assert !TransactionIdIsInProgress(). Like Jeff's approach, this will set hint bits less aggressively. Only very marginally so, though: we will not set a hint bit in the case where the updating transaction has committed/aborted but it is still shown as running by our snapshot. As soon as a reader comes along with a snapshot taken after the updater ended, that transaction will set the hint bit. Note that in the case where the updating transaction committed, setting the hint bit early wouldn't save anything anyway, since we'd still need to run XidInMVCCSnapshot(). It is true that if the updater aborted, we could save cycles by hinting earlier; but I believe the general project policy about such choices is not to optimize for the abort case. Also note that the cycles wasted will now be in XidInMVCCSnapshot(), not in ProcArray access, so they won't be as much of a scalability problem. But it's worth thinking about, and at the least we should document in the commit message or the comments that this change has this effect. Agreed, there is a tradeoff being made. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Make HeapTupleSatisfiesMVCC more concurrent
On 2015-08-18 20:36:13 -0400, Tom Lane wrote: I wrote: Just thinking about this ... I wonder why we need to call TransactionIdIsInProgress() at all rather than believing the answer from the snapshot? Under what circumstances could TransactionIdIsInProgress() return true where XidInMVCCSnapshot() had not? I experimented with the attached patch, which replaces HeapTupleSatisfiesMVCC's calls of TransactionIdIsInProgress with XidInMVCCSnapshot, and then as a cross-check has all the return false exits from XidInMVCCSnapshot assert !TransactionIdIsInProgress(). I'm not sure about it, but it might be worthwhile to add a TransactionIdIsKnownCompleted() check before the more expensive parts of XidInMVCCSnapshot(). Neither the array search nor, much more so, the subtrans lookups are free. - Andres -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Make HeapTupleSatisfiesMVCC more concurrent
Andres Freund and...@anarazel.de writes: On 2015-08-18 20:36:13 -0400, Tom Lane wrote: I experimented with the attached patch, which replaces HeapTupleSatisfiesMVCC's calls of TransactionIdIsInProgress with XidInMVCCSnapshot, and then as a cross-check has all the return false exits from XidInMVCCSnapshot assert !TransactionIdIsInProgress(). I'm not sure about it, but it might be worthwhile to add a TransactionIdIsKnownCompleted() check before the more expensive parts of XidInMVCCSnapshot(). Neither the array search nor, much more so, the subtrans lookups are free. Hmmm... the comment for TransactionIdIsKnownCompleted says it's to short-circuit calls of TransactionIdIsInProgress, which we wouldn't be doing anymore. Maybe it's useful anyway but I'm not convinced. In any case, the big picture here is that XidInMVCCSnapshot should on average be looking at just about the same number of xids and subtrans ids as TransactionIdIsInProgress does --- only the latter is looking at them in the PGPROC array, so it needs a lock, and iterating over that data structure is more complex than scanning an array too. My own thought about reducing the cost of XidInMVCCSnapshot, if that proves necessary, is that maybe it would be worth the trouble to sort the arrays so we could use binary search. That would increase the cost of snapshot acquisition noticeably though. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Make HeapTupleSatisfiesMVCC more concurrent
I wrote: Andres Freund and...@anarazel.de writes: I'm not sure about it, but it might be worthwhile to add a TransactionIdIsKnownCompleted() check before the more expensive parts of XidInMVCCSnapshot(). Neither the array search nor, much more so, the subtrans lookups are free. Hmmm... the comment for TransactionIdIsKnownCompleted says it's to short-circuit calls of TransactionIdIsInProgress, which we wouldn't be doing anymore. Maybe it's useful anyway but I'm not convinced. After further thought, the right way to implement the equivalent optimization would be to add a couple of fields to struct Snapshot that would cache the xid last checked against that snapshot and the outcome of that check; this would be independent of TransactionIdIsKnownCompleted. There would be no need to use that cache for xids below xmin or above xmax, which would improve its chance of being applicable to in-doubt xids. Not entirely sure it's worth doing, but if someone wants to do the legwork, this would be an independent optimization possibility. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Make HeapTupleSatisfiesMVCC more concurrent
On 19 August 2015 at 16:21, Tom Lane t...@sss.pgh.pa.us wrote: I wrote: Andres Freund and...@anarazel.de writes: I'm not sure about it, but it might be worthwhile to add a TransactionIdIsKnownCompleted() check before the more expensive parts of XidInMVCCSnapshot(). Neither the array search nor, much more so, the subtrans lookups are free. Hmmm... the comment for TransactionIdIsKnownCompleted says it's to short-circuit calls of TransactionIdIsInProgress, which we wouldn't be doing anymore. Maybe it's useful anyway but I'm not convinced. After further thought, the right way to implement the equivalent optimization would be to add a couple of fields to struct Snapshot that would cache the xid last checked against that snapshot and the outcome of that check; this would be independent of TransactionIdIsKnownCompleted. There would be no need to use that cache for xids below xmin or above xmax, which would improve its chance of being applicable to in-doubt xids. Not entirely sure it's worth doing, but if someone wants to do the legwork, this would be an independent optimization possibility. This is what I suggested upthread. It's not a massive gain, but its cheap and a straightforward patch. -- Simon Riggshttp://www.2ndQuadrant.com/ http://www.2ndquadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services
Re: [HACKERS] Make HeapTupleSatisfiesMVCC more concurrent
On 2015-08-19 10:55:00 -0400, Tom Lane wrote: My own thought about reducing the cost of XidInMVCCSnapshot, if that proves necessary, is that maybe it would be worth the trouble to sort the arrays so we could use binary search. That would increase the cost of snapshot acquisition noticeably though. It's also considerably more expensive for search in many cases because it's very hard to predict (branching and prefetching). I doubt it'd be worthwhile. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Make HeapTupleSatisfiesMVCC more concurrent
On Tue, Aug 18, 2015 at 5:36 PM, Tom Lane t...@sss.pgh.pa.us wrote: I wrote: Just thinking about this ... I wonder why we need to call TransactionIdIsInProgress() at all rather than believing the answer from the snapshot? Under what circumstances could TransactionIdIsInProgress() return true where XidInMVCCSnapshot() had not? I experimented with the attached patch, which replaces HeapTupleSatisfiesMVCC's calls of TransactionIdIsInProgress with XidInMVCCSnapshot, and then as a cross-check has all the return false exits from XidInMVCCSnapshot assert !TransactionIdIsInProgress(). The asserts did not fire in the standard regression tests nor in a pgbench run, which is surely not proof of anything but it suggests that I'm not totally nuts. I wouldn't commit the changes in XidInMVCCSnapshot for real, but otherwise this is a possibly committable patch. This gives the same performance improvement as the original patch (when compiled without cassert). It has survived testing in a hostile environment (rapid fire stream of contentious updates intermixed with selects, with or without frequent crashes) without signs of missed or duplicated rows. So, it looks good to me. Cheers, Jeff
Re: [HACKERS] Make HeapTupleSatisfiesMVCC more concurrent
On Wed, Aug 19, 2015 at 6:06 AM, Tom Lane t...@sss.pgh.pa.us wrote: I wrote: Just thinking about this ... I wonder why we need to call TransactionIdIsInProgress() at all rather than believing the answer from the snapshot? Under what circumstances could TransactionIdIsInProgress() return true where XidInMVCCSnapshot() had not? I experimented with the attached patch, which replaces HeapTupleSatisfiesMVCC's calls of TransactionIdIsInProgress with XidInMVCCSnapshot, and then as a cross-check has all the return false exits from XidInMVCCSnapshot assert !TransactionIdIsInProgress(). The asserts did not fire in the standard regression tests nor in a pgbench run, which is surely not proof of anything but it suggests that I'm not totally nuts. I wouldn't commit the changes in XidInMVCCSnapshot for real, but otherwise this is a possibly committable patch. I think one case where the patch can impact is for aborted transactions. In TransactionIdIsInProgress(), we check for aborted transactions before consulting pg_subtrans whereas with patch it will consult pg_subtrans without aborted transaction check. Now it could be better to first check pg_subtrans if we found that the corresponding top transaction is in-progress as that will save extra pg_clog lookup, but I just mentioned because that is one difference that I could see with this patch. Another minor point is, I think we should modify function level comment for XidInMVCCSnapshot() where it says that this applies to known- committed XIDs which will no longer be true after this patch. Apart from above, the patch looks good. With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] Make HeapTupleSatisfiesMVCC more concurrent
On 19 August 2015 at 00:49, Tom Lane t...@sss.pgh.pa.us wrote: Jeff Janes jeff.ja...@gmail.com writes: When we check a tuple for MVCC, it has to pass checks that the inserting transaction has committed, and that it committed before our snapshot began. And similarly that the deleting transaction hasn't committed, or did so after our snapshot. XidInMVCCSnapshot is (or can be) very much cheaper than TransactionIdIsInProgress, because the former touches only local memory while the latter takes a highly contended lock and inspects shared memory. We do the slow one first, but we could do the fast one first and sometimes short-circuit the slow one. If the transaction is in our snapshot, it doesn't matter if it is still in progress or not. This was discussed back in 2013 ( http://www.postgresql.org/message-id/CAMkU=1yy-YEQVvqj2xJitT1EFkyuFk7uTV_hrOMGyGMxpU=n...@mail.gmail.com ), and I wanted to revive it. The recent lwlock atomic changes haven't made the problem irrelevant. This patch swaps the order of the checks under some conditions. Just thinking about this ... I wonder why we need to call TransactionIdIsInProgress() at all rather than believing the answer from the snapshot? Under what circumstances could TransactionIdIsInProgress() return true where XidInMVCCSnapshot() had not? I'm thinking maybe TransactionIdIsInProgress is only needed for non-MVCC snapshot types. +1 -- Simon Riggshttp://www.2ndQuadrant.com/ http://www.2ndquadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services
Re: [HACKERS] Make HeapTupleSatisfiesMVCC more concurrent
Jeff Janes jeff.ja...@gmail.com writes: When we check a tuple for MVCC, it has to pass checks that the inserting transaction has committed, and that it committed before our snapshot began. And similarly that the deleting transaction hasn't committed, or did so after our snapshot. XidInMVCCSnapshot is (or can be) very much cheaper than TransactionIdIsInProgress, because the former touches only local memory while the latter takes a highly contended lock and inspects shared memory. We do the slow one first, but we could do the fast one first and sometimes short-circuit the slow one. If the transaction is in our snapshot, it doesn't matter if it is still in progress or not. This was discussed back in 2013 ( http://www.postgresql.org/message-id/CAMkU=1yy-YEQVvqj2xJitT1EFkyuFk7uTV_hrOMGyGMxpU=n...@mail.gmail.com), and I wanted to revive it. The recent lwlock atomic changes haven't made the problem irrelevant. This patch swaps the order of the checks under some conditions. Just thinking about this ... I wonder why we need to call TransactionIdIsInProgress() at all rather than believing the answer from the snapshot? Under what circumstances could TransactionIdIsInProgress() return true where XidInMVCCSnapshot() had not? I'm thinking maybe TransactionIdIsInProgress is only needed for non-MVCC snapshot types. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Make HeapTupleSatisfiesMVCC more concurrent
I wrote: Just thinking about this ... I wonder why we need to call TransactionIdIsInProgress() at all rather than believing the answer from the snapshot? Under what circumstances could TransactionIdIsInProgress() return true where XidInMVCCSnapshot() had not? I experimented with the attached patch, which replaces HeapTupleSatisfiesMVCC's calls of TransactionIdIsInProgress with XidInMVCCSnapshot, and then as a cross-check has all the return false exits from XidInMVCCSnapshot assert !TransactionIdIsInProgress(). The asserts did not fire in the standard regression tests nor in a pgbench run, which is surely not proof of anything but it suggests that I'm not totally nuts. I wouldn't commit the changes in XidInMVCCSnapshot for real, but otherwise this is a possibly committable patch. regards, tom lane diff --git a/src/backend/utils/time/tqual.c b/src/backend/utils/time/tqual.c index de7b3fc..c8f59f7 100644 *** a/src/backend/utils/time/tqual.c --- b/src/backend/utils/time/tqual.c *** *** 10,21 * the passed-in buffer. The caller must hold not only a pin, but at least * shared buffer content lock on the buffer containing the tuple. * ! * NOTE: must check TransactionIdIsInProgress (which looks in PGXACT array) * before TransactionIdDidCommit/TransactionIdDidAbort (which look in * pg_clog). Otherwise we have a race condition: we might decide that a * just-committed transaction crashed, because none of the tests succeed. * xact.c is careful to record commit/abort in pg_clog before it unsets ! * MyPgXact-xid in PGXACT array. That fixes that problem, but it also * means there is a window where TransactionIdIsInProgress and * TransactionIdDidCommit will both return true. If we check only * TransactionIdDidCommit, we could consider a tuple committed when a --- 10,22 * the passed-in buffer. The caller must hold not only a pin, but at least * shared buffer content lock on the buffer containing the tuple. * ! * NOTE: When using a non-MVCC snapshot, we must check ! * TransactionIdIsInProgress (which looks in the PGXACT array) * before TransactionIdDidCommit/TransactionIdDidAbort (which look in * pg_clog). Otherwise we have a race condition: we might decide that a * just-committed transaction crashed, because none of the tests succeed. * xact.c is careful to record commit/abort in pg_clog before it unsets ! * MyPgXact-xid in the PGXACT array. That fixes that problem, but it also * means there is a window where TransactionIdIsInProgress and * TransactionIdDidCommit will both return true. If we check only * TransactionIdDidCommit, we could consider a tuple committed when a *** *** 26,31 --- 27,37 * subtransactions of our own main transaction and so there can't be any * race condition. * + * When using an MVCC snapshot, we rely on XidInMVCCSnapshot rather than + * TransactionIdIsInProgress, but the logic is otherwise the same: do not + * check pg_clog until after deciding that the xact is no longer in progress. + * + * * Summary of visibility functions: * * HeapTupleSatisfiesMVCC() *** HeapTupleSatisfiesMVCC(HeapTuple htup, S *** 961,967 if (TransactionIdIsCurrentTransactionId(xvac)) return false; ! if (!TransactionIdIsInProgress(xvac)) { if (TransactionIdDidCommit(xvac)) { --- 967,973 if (TransactionIdIsCurrentTransactionId(xvac)) return false; ! if (!XidInMVCCSnapshot(xvac, snapshot)) { if (TransactionIdDidCommit(xvac)) { *** HeapTupleSatisfiesMVCC(HeapTuple htup, S *** 980,986 if (!TransactionIdIsCurrentTransactionId(xvac)) { ! if (TransactionIdIsInProgress(xvac)) return false; if (TransactionIdDidCommit(xvac)) SetHintBits(tuple, buffer, HEAP_XMIN_COMMITTED, --- 986,992 if (!TransactionIdIsCurrentTransactionId(xvac)) { ! if (XidInMVCCSnapshot(xvac, snapshot)) return false; if (TransactionIdDidCommit(xvac)) SetHintBits(tuple, buffer, HEAP_XMIN_COMMITTED, *** HeapTupleSatisfiesMVCC(HeapTuple htup, S *** 1035,1041 else return false; /* deleted before scan started */ } ! else if (TransactionIdIsInProgress(HeapTupleHeaderGetRawXmin(tuple))) return false; else if (TransactionIdDidCommit(HeapTupleHeaderGetRawXmin(tuple))) SetHintBits(tuple, buffer, HEAP_XMIN_COMMITTED, --- 1041,1047 else return false; /* deleted before scan started */ } ! else if (XidInMVCCSnapshot(HeapTupleHeaderGetRawXmin(tuple), snapshot)) return false; else if (TransactionIdDidCommit(HeapTupleHeaderGetRawXmin(tuple))) SetHintBits(tuple, buffer, HEAP_XMIN_COMMITTED, *** HeapTupleSatisfiesMVCC(HeapTuple htup, S *** 1048,1061 return false; } } ! /* ! * By here, the inserting transaction has