On 2020/09/30 0:50, Bharath Rupireddy wrote:
Thanks for the comments.

On Tue, Sep 29, 2020 at 7:30 PM Fujii Masao <masao.fu...@oss.nttdata.com> wrote:

+1 for adding a debug3 message there. But this message doesn't seem to
match the error that actually happened. What about something like
"could not start remote transaction on connection %p" instead?


Looks better. Changed.


Also maybe it's better to append PQerrorMessage(entry->conn)?


Added. Now the log looks like [1].


+-- Generate a connection to remote. Local backend will cache it.
+SELECT * FROM ft1 LIMIT 1;

The result of this query would not be stable. You probably need to,
for example, add ORDER BY or replace * with 1.


Changed to SELECT 1 FROM ft1 LIMIT 1;


+-- Retrieve pid of remote backend with application name fdw_retry_check
+-- and kill it intentionally here. Note that, local backend still has
+-- the remote connection/backend info in its cache.
+SELECT pg_terminate_backend(pid) FROM pg_stat_activity WHERE
+backend_type = 'client backend' AND application_name = 'fdw_retry_check';

Isn't this test fragile because there is no guarantee that the target backend
has exited just after calling pg_terminate_backend()?


I think this is okay, because pg_terminate_backend() sends SIGTERM to
the backend. Upon receiving SIGTERM, the signal handler die() is
called, and since no query is being executed on the backend at that
point, it will exit immediately. Thoughts?

Yeah, basically you're right. But that backend *can* still be running
when the subsequent test query starts. I'm wondering if wait_pid()
(please see regress.c and sql/dblink.sql) should be used to ensure
that the target backend has disappeared.
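
To illustrate the concern, here is a minimal standalone sketch of the
"signal, then poll until the process is gone" idea. It is not the
regress.c code and does not touch PostgreSQL at all; wait_for_exit()
and demo_terminate_then_wait() are hypothetical names made up for this
example, and it assumes a POSIX system where setting SIGCHLD to SIG_IGN
lets the kernel reap the child (as on Linux). The point is only that
kill() returning success means the signal was delivered, not that the
target has already exited, so the caller has to wait explicitly.

```c
#define _POSIX_C_SOURCE 200809L

#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper: poll until no process with this pid exists.
 * Returns 0 once kill(pid, 0) fails with ESRCH, and -1 on any other
 * error or if the process is still alive after max_polls attempts.
 */
int
wait_for_exit(pid_t pid, int max_polls)
{
    struct timespec delay = {0, 50 * 1000 * 1000};  /* 50 ms per poll */

    for (int i = 0; i < max_polls; i++)
    {
        /* Signal 0 performs only the existence/permission check. */
        if (kill(pid, 0) != 0)
            return (errno == ESRCH) ? 0 : -1;
        nanosleep(&delay, NULL);
    }
    return -1;
}

/*
 * Hypothetical demo: start an idle child, send it SIGTERM, and then
 * wait until it has really disappeared before continuing.
 */
int
demo_terminate_then_wait(void)
{
    pid_t child;

    /* Assumption: with SIGCHLD ignored, exited children leave no zombie. */
    signal(SIGCHLD, SIG_IGN);

    child = fork();
    if (child < 0)
        return -1;
    if (child == 0)
    {
        for (;;)
            pause();        /* idle until SIGTERM terminates the child */
    }

    /* Success here only means the signal was sent, not that it was handled. */
    if (kill(child, SIGTERM) != 0)
        return -1;

    /* Without this step, the child may still be running at this point. */
    return wait_for_exit(child, 200);
}
```

In the regression test, the analogous step would be to call wait_pid()
on the terminated backend's pid before running the next query on ft1.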

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION

