On 08/30/2017 01:34 AM, Andres Freund wrote: > Hi, > > On 2017-08-30 01:27:34 +0200, Tomas Vondra wrote: >> I've been investigating some failures in test_decoding regression tests, >> and it seems to me the error-handling in ReorderBufferCommit() is >> somewhat broken, leading to segfault crashes. >> >> The problematic part is this: >> >> PG_CATCH() >> { >> /* >> * Force cache invalidation to happen outside of a valid transaction >> * to prevent catalog access as we just caught an error. >> */ >> AbortCurrentTransaction(); >> >> /* make sure there's no cache pollution */ >> ReorderBufferExecuteInvalidations(rb, txn); >> >> ... >> >> } >> >> Triggering it trivial - just add elog(ERROR,...) at the beginning of the >> PG_TRY() block. > > That's not really a valid thing to do, you should put it after the > BeginInternalSubTransaction()/StartTransactionCommand(). It's basically > assumed that those won't fail - arguably they should be outside of a > PG_TRY then, but that's a different matter. If you start decoding > outside of SQL failing before those will lead to rolling back the parent > tx... > > >> I suppose this is not quite intentional, but rather an example that >> error-handling code is an order of magnitude more complicated to write >> and test. I've only noticed as I'm investigating some regression >> failures on Postgres-XL 10, which does not support subtransactions and >> so the BeginInternalSubTransaction() call in the try branch always >> fails, triggering the issue. > > So, IIUC, there's no live problem in postgres core, besides some ugly & > undocumented assumptions? >
I'm not really following your reasoning. You may very well be right that the BeginInternalSubTransaction() example is not quite valid on postgres core, but I don't see how that implies there can be no other errors in the PG_TRY block. It was merely an explanation how I noticed this issue. To make it absolutely clear, I claim that the PG_CATCH() block is demonstrably broken as it calls AbortCurrentTransaction() and then accesses already freed memory. The switch in the PG_TRY() blocks includes multiple elog(ERROR) calls in the switch, and AFAICS hitting any of them will have exactly the same effect as failure in BeginInternalSubTransaction(). And I suppose there many other ways to trigger an error and get into the catch block. In other words, why would we have the block at all? regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers