On Wed, Nov 13, 2013 at 11:04 AM, Noah Misch <n...@leadboat.com> wrote:
>> So, in short, ERROR + ERROR*10 = PANIC, but FATAL + ERROR*10 = FATAL.
>> That's bizarre.
>
> Quite so.
>
>> Given that that's where we are, promoting an ERROR during FATAL
>> processing to PANIC doesn't seem like it's losing much; we're
>> essentially already doing that in the (probably more likely) case of a
>> persistent ERROR during ERROR processing. But since PANIC sucks, I'd
>> rather go the other direction: let's make an ERROR during ERROR
>> processing promote to FATAL. And then let's do what you write above:
>> make sure that there's a separate on-shmem-exit callback for each
>> critical shared memory resource and that we call of those during FATAL
>> processing.
>
> Many of the factors that can cause AbortTransaction() to fail can also cause
> CommitTransaction() to fail, and those would still PANIC if the transaction
> had an xid. How practical might it be to also escape from an error during
> CommitTransaction() with a FATAL instead of PANIC? There's more to fix up in
> that case (sinval, NOTIFY), but it may be within reach. If such a technique
> can only reasonably fix abort, though, I have doubts it buys us enough.

The critical stuff that's got to happen after RecordTransactionCommit()
appears to be ProcArrayEndTransaction() and AtEOXact_Inval().
Unfortunately, the latter is well after the point when we're supposed
to only be doing "non-critical resource cleanup", notwithstanding
which it appears to be critical.

So here's a sketch. Hoist the preparatory logic in
RecordTransactionCommit() (smgrGetPendingDeletes,
xactGetCommittedChildren, and xactGetCommittedInvalidationMessages) up
into the caller and do it before setting TRANS_COMMIT. If any of that
stuff fails, we'll land in AbortTransaction(), which must cope. As soon
as we exit the commit critical section, set a flag somewhere (where?)
indicating that we have written our commit record; when that flag is
set, (a) promote any ERROR after that point through the end of commit
cleanup to FATAL and (b) if we enter AbortTransaction(), don't try to
RecordTransactionAbort().

I can't see that the notification stuff requires fixup in this case;
AFAICS, it is just adjusting backend-local state, and it's OK to
disregard any problems there during a FATAL exit. Do you see
something to the contrary?

But invalidation messages are a problem: if we commit and exit without
sending our queued-up invalidation messages, Bad Things Will Happen.
Perhaps we could arrange things so that in that case only, we just
PANIC. That would allow most write transactions to get by with FATAL,
promoting to PANIC only in the case of transactions that have modified
system catalogs and only until the invalidations have actually been
sent. Avoiding the PANIC in that case seems to require some
additional wizardry which is not entirely clear to me at this time.

I think we'll have to approach the various problems in this area
stepwise, or we'll never make any progress.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company