Hello Rahila,
16.06.2026 07:42, Rahila Syed wrote:
> Hi Alexander,
>
> Thank you for the report. This is an interesting case of incomplete or
> incorrect error handling.
>
> Regarding the code path in LocalExecuteInvalidationMessage:
>
>> (This can seem dubious, but I guess there could be other (perhaps more
>> sophisticated) ways to trigger an error somewhere inside
>> LocalExecuteInvalidationMessage() -> RelationCacheInvalidateEntry() ->
>> RelationFlushRelation() -> RelationRebuildRelation() ->
>> RelationBuildDesc() -> RelationBuildTupleDesc() -> systable_getnext()...)
>
> I wonder if we should prevent adding CHECK_FOR_INTERRUPTS (CFI) calls
> in this path. A quick search did not reveal any existing CFI calls
> here. In your example, the CFI is triggered by the elog(LOG, "") added
> to the code as part of your testing.

Thank you for the reply!
I've found a way to reproduce this without any code modifications:
for i in {1..100}; do
  echo "ITERATION $i"
  (for n in {1..10}; do
    psql -qAt -c "SELECT pg_cancel_backend(pid) FROM pg_stat_activity WHERE query LIKE 'ALTER TABLE%'";
  done;) &
  cat << EOF | psql >>psql.log
CREATE TABLE pt (a int, $(seq -s, -f 'c%g int' 100)) PARTITION BY LIST (a);
CREATE INDEX ON pt(a);
CREATE INDEX ON pt(a);
CREATE INDEX ON pt(a);
CREATE INDEX ON pt(a);
CREATE INDEX ON pt(a);
CREATE TABLE tp1 (LIKE pt);
INSERT INTO tp1 (a) VALUES (1);
ALTER TABLE pt ATTACH PARTITION tp1 FOR VALUES IN (1);
DELETE FROM tp1;
EOF
  wait
  psql -v ON_ERROR_STOP=1 -c "DROP TABLE pt, tp1" || break;
done
It fails for me as below:
ITERATION 73
t
ERROR: canceling statement due to user request
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
connection to server was lost
psql: error: connection to server on socket "/tmp/.s.PGSQL.15432" failed:
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
2026-06-21 08:35:59.128 EEST [1134495:8] psql ERROR: canceling statement due to user request
2026-06-21 08:35:59.128 EEST [1134495:9] psql BACKTRACE:
ProcessInterrupts at postgres.c:3548:4
heap_multi_insert at heapam.c:2367:6
CatalogTuplesMultiInsertWithInfo at indexing.c:287:11
recordMultipleDependencies at pg_depend.c:159:22
recordDependencyOn at pg_depend.c:56:1
StoreCatalogInheritance1 at tablecmds.c:3650:2
CreateInheritance at tablecmds.c:17688:2
attachPartitionTable at tablecmds.c:20516:2
ATExecAttachPartition at tablecmds.c:20777:24
ATExecCmd at tablecmds.c:5727:15
ATRewriteCatalogs at tablecmds.c:5401:4
ATController at tablecmds.c:4954:2
AlterTable at tablecmds.c:4602:1
ProcessUtilitySlow at utility.c:1327:7
standard_ProcessUtility at utility.c:1072:4
ProcessUtility at utility.c:528:3
PortalRunUtility at pquery.c:1149:2
PortalRunMulti at pquery.c:1307:5
PortalRun at pquery.c:788:5
exec_simple_query at postgres.c:1297:11
PostgresMain at postgres.c:4869:27
BackendInitialize at backend_startup.c:142:1
postmaster_child_launch at launch_backend.c:269:3
BackendStartup at postmaster.c:3627:8
ServerLoop at postmaster.c:1731:10
PostmasterMain at postmaster.c:1415:11
main at main.c:236:2
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x8b) [0x75a5c9c2a28b]
postgres: law regression [local] ALTER TABLE(_start+0x25) [0x653adc53c595]
2026-06-21 08:35:59.128 EEST [1134495:10] psql STATEMENT: ALTER TABLE pt ATTACH PARTITION tp1 FOR VALUES IN (1);
2026-06-21 19:49:42.527 EEST [1783003:16] psql LOG: statement: DELETE FROM tp1;
TRAP: failed Assert("list != NIL"), File: "../../../../src/include/nodes/pg_list.h", Line: 322, PID: 1783003
ExceptionalCondition at assert.c:51:13
list_last_cell at pg_list.h:323:14
RelationBuildPublicationDesc at relcache.c:5847:23
CheckCmdReplicaIdentity at execReplication.c:1068:5
CheckValidResultRel at execMain.c:1094:7
ExecInitModifyTable at nodeModifyTable.c:5299:16
ExecInitNode at execProcnode.c:177:27
InitPlan at execMain.c:1002:14
standard_ExecutorStart at execMain.c:274:2
ExecutorStart at execMain.c:140:1
ProcessQuery at pquery.c:162:2
PortalRunMulti at pquery.c:1269:5
PortalRun at pquery.c:788:5
exec_simple_query at postgres.c:1297:11
PostgresMain at postgres.c:4860:27
BackendInitialize at backend_startup.c:142:1
postmaster_child_launch at launch_backend.c:269:3
BackendStartup at postmaster.c:3627:8
ServerLoop at postmaster.c:1731:10
PostmasterMain at postmaster.c:1415:11
main at main.c:236:2
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x8b)[0x7f6cd522a28b]
postgres: law regression [local] DELETE(_start+0x25)[0x5a801c1ff595]
> To prevent incomplete cache invalidation during an abort, we probably
> need to avoid processing interrupts and ensure the process does not
> error out. Otherwise, as you demonstrated, we risk leaving the
> relcache in an inconsistent state where a stale entry remains even
> after a transaction is rolled back.

Yes, if there is no guarantee that other errors can't occur down that path,
then just preventing CHECK_FOR_INTERRUPTS probably won't be sufficient.
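
To illustrate the point, here is a toy model in C. It is not PostgreSQL source: every toy_* name is made up for this sketch, and setjmp/longjmp merely stands in for the backend's error handling. It models a hold-off counter in the spirit of HOLD_INTERRUPTS()/RESUME_INTERRUPTS(). Holding off interrupts lets the rebuild survive a pending cancel, but an error raised for any other reason still leaves the entry half-rebuilt.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdbool.h>

/* Toy stand-ins for backend state; all names here are hypothetical. */
static volatile bool toy_interrupt_pending = false; /* a cancel has arrived */
static int toy_holdoff_count = 0;                   /* > 0: interrupts held off */
static jmp_buf toy_error_context;                   /* where "errors" unwind to */

typedef struct ToyCacheEntry
{
    bool is_valid;              /* false while the entry is being rebuilt */
} ToyCacheEntry;

/* Stand-in for raising an ERROR: unwind to the handler in toy_run_rebuild. */
static void
toy_raise_error(void)
{
    longjmp(toy_error_context, 1);
}

/* Service a pending cancel only when interrupts are not held off. */
static void
toy_check_for_interrupts(void)
{
    if (toy_interrupt_pending && toy_holdoff_count == 0)
    {
        toy_interrupt_pending = false;
        toy_raise_error();
    }
}

/* A two-step rebuild that can be interrupted or can fail in the middle. */
static void
toy_rebuild_entry(ToyCacheEntry *entry, bool fail_in_catalog_scan)
{
    entry->is_valid = false;        /* step 1: tear down the old state */
    toy_check_for_interrupts();     /* a cancel may be serviced here */
    if (fail_in_catalog_scan)
        toy_raise_error();          /* any error unrelated to interrupts */
    entry->is_valid = true;         /* step 2: install the rebuilt state */
}

/*
 * Run one rebuild attempt. Returns true if it completed, false if it
 * errored out midway (leaving entry->is_valid == false).
 */
static bool
toy_run_rebuild(ToyCacheEntry *entry, bool hold_interrupts,
                bool cancel_requested, bool fail_in_catalog_scan)
{
    toy_interrupt_pending = cancel_requested;
    toy_holdoff_count = 0;

    if (setjmp(toy_error_context) != 0)
    {
        toy_holdoff_count = 0;      /* toy error recovery: drop any hold-off */
        return false;
    }

    if (hold_interrupts)
        toy_holdoff_count++;
    toy_rebuild_entry(entry, fail_in_catalog_scan);
    if (hold_interrupts)
        toy_holdoff_count--;
    assert(toy_holdoff_count == 0);
    return true;
}
```

For example, toy_run_rebuild(&entry, true, true, false) completes and leaves the cancel pending for later, while toy_run_rebuild(&entry, true, false, true) still returns false with entry.is_valid unset.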
Best regards,
Alexander