Hi, While working on [1], I noticed $SUBJECT: postgres_fdw resets the per-connection states of connections, which store async requests sent to remote servers in async_capable mode, during post-abort (pgfdw_xact_callback()), but it fails to do so during post-subabort (pgfdw_subxact_callback()). This causes a crash when re-executing a query that was aborted in a subtransaction:
postgres=# create table t (a text, b text); postgres=# create or replace function slow_data (name text, duration float) returns setof t as $$ begin perform pg_sleep(duration); return query select name, generate_series(1, 1000)::text; end; $$ language plpgsql; postgres=# create view v1 as select * from slow_data('foo', 2.5); postgres=# create view v2 as select * from slow_data('bar', 5.0); postgres=# create extension postgres_fdw; postgres=# create server loopback1 foreign data wrapper postgres_fdw options (dbname 'postgres', async_capable 'true'); postgres=# create server loopback2 foreign data wrapper postgres_fdw options (dbname 'postgres', async_capable 'true'); postgres=# create user mapping for current_user server loopback1; postgres=# create user mapping for current_user server loopback2; postgres=# create foreign table list_p1 (a text, b text) server loopback1 options (table_name 'v1'); postgres=# create foreign table list_p2 (a text, b text) server loopback2 options (table_name 'v2'); postgres=# create table list_pt (a text, b text) partition by list (a); postgres=# alter table list_pt attach partition list_p1 for values in ('foo'); postgres=# alter table list_pt attach partition list_p2 for values in ('bar'); postgres=# begin; BEGIN postgres=*# savepoint s1; SAVEPOINT postgres=*# select count(*) from list_pt; ^CCancel request sent ERROR: canceling statement due to user request postgres=!# rollback to savepoint s1; ROLLBACK postgres=*# select count(*) from list_pt; server closed the connection unexpectedly This probably means the server terminated abnormally before or while processing the request. The connection to the server was lost. Attempting reset: Failed. When canceling the SELECT, an async request sent to a remote server using a connection is canceled, but it’s stored in the per-connection state of the connection even after the failed subtransaction for the reason above, so when re-executing the SELECT, postgres_fdw processes the invalid async request to re-use the connection in GetConnection(), causing a segmentation fault. This would be my oversight in commit 27e1f1456. :-( To fix, I modified pgfdw_abort_cleanup() to reset the per-connection state in the post-subabort case as well. Also, I modified the initialization so that it’s done only if necessary, to save cycles, and improved a comment on the initialization a bit. Attached is a patch for that. Best regards, Etsuro Fujita
fix-postgres-fdw-subabort-cleanup.patch
Description: Binary data