On Sat, May 6, 2017 at 4:41 AM, Amit Kapila <amit.kapil...@gmail.com> wrote: > On Fri, May 5, 2017 at 9:32 PM, Robert Haas <robertmh...@gmail.com> wrote: >> On Fri, May 5, 2017 at 11:15 AM, Amit Kapila <amit.kapil...@gmail.com> wrote: >>> I am not saying to rip those changes. Your most of the mail stresses >>> about the 30-second timeout which you have added in the patch to >>> detect timeouts during failures (rollback processing). I have no >>> objection to that part of the patch except that still there is a race >>> condition around PQsendQuery and you indicate that it is better to >>> deal it as a separate patch to which I agree. >> >> OK. >> >>> The point of which I am >>> not in total agreement is about the failure other than the time out. >>> >>> +pgfdw_exec_cleanup_query(PGconn *conn, const char *query) >>> { >>> .. >>> + /* Get the result of the query. */ >>> + if (pgfdw_get_cleanup_result(conn, endtime, &result)) >>> + return false; >>> + >>> + /* Issue a warning if not successful. */ >>> + if (PQresultStatus(result) != PGRES_COMMAND_OK) >>> + { >>> + pgfdw_report_error(WARNING, result, conn, true, query); >>> + return false; >>> + } >>> .. >>> } >>> >>> In the above code, if there is a failure due to timeout then it will >>> return from first statement (pgfdw_get_cleanup_result), however, if >>> there is any other failure, then it will issue a warning and return >>> false. I am not sure if there is a need to return false in the second >>> case, because even without that it will work fine (and there is a >>> chance that it won't drop the connection), but maybe your point is >>> better to be consistent for all kind of error handling in this path. >> >> There are three possible queries that can be executed by >> pgfdw_exec_cleanup_query; let's consider them individually. >> >> 1. ABORT TRANSACTION. If ABORT TRANSACTION fails, I think that we >> have no alternative but to close the connection. If we don't, then >> the next local connection that tries to use this connection will >> probably also fail, because it will find the connection inside an >> aborted transaction rather than not in a transaction. If we do close >> the connection, the next transaction will try to reconnect and >> everything will be fine. >> >> 2. DEALLOCATE ALL. If DEALLOCATE ALL fails, we do not have to close >> the connection, but the reason why we're running that statement in the >> first place is because we've potentially lost track of which >> statements are prepared on the remote side. If we don't drop the >> connection, we'll still be confused about that. Reconnecting will fix >> it. >> > > There is a comment in the code which explains why currently we ignore > errors from DEALLOCATE ALL. > > * DEALLOCATE ALL only exists in 8.3 and later, so this > * constrains how old a server postgres_fdw can > * communicate with. We intentionally ignore errors in > * the DEALLOCATE, so that we can hobble along to some > * extent with older servers (leaking prepared statements > * as we go; but we don't really support update operations > * pre-8.3 anyway). > > It is not entirely clear to me whether it is only for Commit case or > in general. From the code, it appears that the comment holds true for > both commit and abort. If it is for both, then after this patch, the > above comment will not be valid and you might want to update it in > case we want to proceed with your proposed changes for DEALLOCATE ALL > statement. >
Apart from this, I don't see any other problem with the patch. Additionally, I have run the regression tests with the patch and everything passed. As a side note, why we want 30-second timeout only for Rollback related commands and why not for Commit and the Deallocate All as well? -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers