Looking at the issue raised in https://www.postgresql.org/message-id/flat/57EE93C8.8080504%40postgrespro.ru prompted me to wonder whether there were any other places in which we were calling tqual.c routines without holding buffer content lock. I tried adding
Assert(BufferIsLocal(buffer) || LWLockHeldByMe(BufferDescriptorGetContentLock(GetBufferDescriptor(buffer - 1)))); to all the tqual.c tuple-visibility-test routines, and sure enough a bunch of regression tests fell over, as a result of this bit in RI_FKey_check: /* * We should not even consider checking the row if it is no longer valid, * since it was either deleted (so the deferred check should be skipped) * or updated (in which case only the latest version of the row should be * checked). Test its liveness according to SnapshotSelf. * * NOTE: The normal coding rule is that one must acquire the buffer * content lock to call HeapTupleSatisfiesVisibility. We can skip that * here because we know that AfterTriggerExecute just fetched the tuple * successfully, so there cannot be a VACUUM compaction in progress on the * page (either heap_fetch would have waited for the VACUUM, or the * VACUUM's LockBufferForCleanup would be waiting for us to drop pin). And * since this is a row inserted by our open transaction, no one else can * be entitled to change its xmin/xmax. */ Assert(new_row_buf != InvalidBuffer); if (!HeapTupleSatisfiesVisibility(new_row, SnapshotSelf, new_row_buf)) return PointerGetDatum(NULL); Now, you'll notice that that bit of argumentation fails to cover the question of hint bits. It might still be okay, if HeapTupleSatisfiesSelf would never attempt to set a hint bit on a row inserted by our own open transaction ... but a quick perusal of that function shows that it will attempt to do so, if xmax points to an aborted subtransaction. That led me to try this test case: create table pp(f1 int primary key); insert into pp values(1); create table cc(f1 int references pp deferrable initially deferred); begin; insert into cc values(1); savepoint x; delete from cc; rollback to x; commit; and sure enough, that dies with an assertion failure in MarkBufferDirtyHint, or predecessor code, in every release back to 8.2 or thereabouts. We might be able to band-aid around that by not attempting to set the hint bit in the "deleting subtransaction must have aborted" path in HeapTupleSatisfiesSelf --- but it's easy to think of cases where not doing so would be a net loss performance-wise. And TBH that code in RI_FKey_check seems to me to be cantilevered way too far over the canyon's edge anyway, especially when one compares the cost of one buffer lock cycle to everything else the function is going to do. I think we should just make it take the buffer lock and be done. BTW, I was also able to reproduce the problem complained of in the aforementioned thread, in an existing test in postgres_fdw: #2 0x0000000000808cdd in ExceptionalCondition ( conditionName=<value optimized out>, errorType=<value optimized out>, fileName=<value optimized out>, lineNumber=<value optimized out>) at assert.c:54 #3 0x000000000083e3a4 in HeapTupleSatisfiesMVCC (htup=<value optimized out>, snapshot=0x2198b70, buffer=335) at tqual.c:981 #4 0x000000000060635c in ExecCheckHeapTupleVisible ( estate=<value optimized out>, tuple=<value optimized out>, buffer=<value optimized out>) at nodeModifyTable.c:197 #5 0x000000000060845f in ExecCheckTIDVisible (node=0x21bc6e0) at nodeModifyTable.c:222 #6 ExecInsert (node=0x21bc6e0) at nodeModifyTable.c:413 #7 ExecModifyTable (node=0x21bc6e0) at nodeModifyTable.c:1496 #8 0x00000000005ebf98 in ExecProcNode (node=0x21bc6e0) at execProcnode.c:396 #9 0x00000000005e99ba in ExecutePlan (queryDesc=0x21b3908, direction=<value optimized out>, count=0) at execMain.c:1567 #10 standard_ExecutorRun (queryDesc=0x21b3908, direction=<value optimized out>, count=0) at execMain.c:338 (gdb) p debug_query_string $1 = 0x21b36f8 "INSERT INTO \"S 1\".\"T 1\"(\"C 1\", c2, c3, c4, c5, c6, c7, c8) VALUES ($1, $2, $3, $4, $5, $6, $7, $8) ON CONFLICT DO NOTHING" So we do have coverage of this case, sort of. I don't think I want to propose putting in these Asserts as a permanent test, because it seems pretty ugly to have such code outside bufmgr.c (it required adding #include "storage/buf_internals.h" to tqual.c). On the other hand, if we'd had them we would never have shipped either of these bugs. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers