Re: Shouldn't postgres_fdw report warning when it gives up getting result from foreign server?
On 2021/12/06 20:50, Bharath Rupireddy wrote: On Mon, Dec 6, 2021 at 1:47 PM Fujii Masao wrote: Yeah, I agree that's not elegant.. So I'd like to propose new patch with different design from what I proposed before. Patch attached. This patch changes pgfdw_exec_cleanup_query() so that it tells its callers the information about whether the timeout expired or not. Then the callers (pgfdw_exec_cleanup_query and pgfdw_cancel_query) report the warning messages based on the results from pgfdw_exec_cleanup_query(). +1 for adding a new timed_out param to pgfdw_get_cleanup_result. pgfdw_get_cleanup_result_v2 patch looks good to me. Thanks for the review! I pushed the patch. Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
Re: Shouldn't postgres_fdw report warning when it gives up getting result from foreign server?
On Mon, Dec 6, 2021 at 1:47 PM Fujii Masao wrote: > Yeah, I agree that's not elegant.. > > So I'd like to propose new patch with different design from > what I proposed before. Patch attached. > > This patch changes pgfdw_exec_cleanup_query() so that it tells > its callers the information about whether the timeout expired > or not. Then the callers (pgfdw_exec_cleanup_query and > pgfdw_cancel_query) report the warning messages based on > the results from pgfdw_exec_cleanup_query(). +1 for adding a new timed_out param to pgfdw_get_cleanup_result. pgfdw_get_cleanup_result_v2 patch looks good to me. Regards, Bharath Rupireddy.
Re: Shouldn't postgres_fdw report warning when it gives up getting result from foreign server?
On 2021/12/03 23:04, Bharath Rupireddy wrote: Let's not use the boolean just for the cancel request which isn't scalable IMO. Maybe a macro/enum? Otherwise, we could just do, although it doesn't look elegant: if (pgfdw_get_cleanup_result(conn, endtime, , "(cancel request)")) if (strcmp(query, "(cancel request)") == 0) WARNING without "remote SQL command: else WARNING with "remote SQL command: Yeah, I agree that's not elegant.. So I'd like to propose new patch with different design from what I proposed before. Patch attached. This patch changes pgfdw_exec_cleanup_query() so that it tells its callers the information about whether the timeout expired or not. Then the callers (pgfdw_exec_cleanup_query and pgfdw_cancel_query) report the warning messages based on the results from pgfdw_exec_cleanup_query(). Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATIONdiff --git a/contrib/postgres_fdw/connection.c b/contrib/postgres_fdw/connection.c index 5c0137220a..6bac4ad23e 100644 --- a/contrib/postgres_fdw/connection.c +++ b/contrib/postgres_fdw/connection.c @@ -104,7 +104,7 @@ static bool pgfdw_cancel_query(PGconn *conn); static bool pgfdw_exec_cleanup_query(PGconn *conn, const char *query, bool ignore_errors); static bool pgfdw_get_cleanup_result(PGconn *conn, TimestampTz endtime, - PGresult **result); + PGresult **result, bool *timed_out); static void pgfdw_abort_cleanup(ConnCacheEntry *entry, const char *sql, bool toplevel); static bool UserMappingPasswordRequired(UserMapping *user); @@ -1154,6 +1154,7 @@ pgfdw_cancel_query(PGconn *conn) charerrbuf[256]; PGresult *result = NULL; TimestampTz endtime; + booltimed_out; /* * If it takes too long to cancel the query and discard the result, assume @@ -1180,8 +1181,19 @@ pgfdw_cancel_query(PGconn *conn) } /* Get and discard the result of the query. */ - if (pgfdw_get_cleanup_result(conn, endtime, )) + if (pgfdw_get_cleanup_result(conn, endtime, , _out)) + { + if (timed_out) + ereport(WARNING, + (errmsg("could not get result of cancel request due to timeout"))); + else + ereport(WARNING, + (errcode(ERRCODE_CONNECTION_FAILURE), +errmsg("could not get result of cancel request: %s", + pchomp(PQerrorMessage(conn); + return false; + } PQclear(result); return true; @@ -1204,6 +1216,7 @@ pgfdw_exec_cleanup_query(PGconn *conn, const char *query, bool ignore_errors) { PGresult *result = NULL; TimestampTz endtime; + booltimed_out; /* * If it takes too long to execute a cleanup query, assume the connection @@ -1224,8 +1237,17 @@ pgfdw_exec_cleanup_query(PGconn *conn, const char *query, bool ignore_errors) } /* Get the result of the query. */ - if (pgfdw_get_cleanup_result(conn, endtime, )) + if (pgfdw_get_cleanup_result(conn, endtime, , _out)) + { + if (timed_out) + ereport(WARNING, + (errmsg("could not get query result due to timeout"), +query ? errcontext("remote SQL command: %s", query) : 0)); + else + pgfdw_report_error(WARNING, NULL, conn, false, query); + return false; + } /* Issue a warning if not successful. */ if (PQresultStatus(result) != PGRES_COMMAND_OK) @@ -1245,15 +1267,19 @@ pgfdw_exec_cleanup_query(PGconn *conn, const char *query, bool ignore_errors) * side back to the appropriate state. * * endtime is the time at which we should give up and assume the remote - * side is dead. Returns true if the timeout expired, otherwise false. - * Sets *result except in case of a timeout. + * side is dead. Returns true if the timeout expired or connection trouble + * occurred, false otherwise. Sets *result except in case of a timeout. + * Sets timed_out to true only when the timeout expired. */ static bool -pgfdw_get_cleanup_result(PGconn *conn, TimestampTz endtime, PGresult **result) +pgfdw_get_cleanup_result(PGconn *conn, TimestampTz endtime, PGresult **result, +bool *timed_out) { - volatile bool timed_out = false; + volatile bool failed = false;
Re: Shouldn't postgres_fdw report warning when it gives up getting result from foreign server?
On Fri, Dec 3, 2021 at 2:26 PM Fujii Masao wrote: > > For a cancel request maybe we can just say without te errcontext: > > ereport(WARNING, > > (errmsg("could not get result of cancel > > request due to timeout"))); > > > > See the below existing message using "cancel request": > > errmsg("could not send cancel request: %s", > > > > For SQL command we can say: > > ereport(WARNING, > > (errmsg("could not get query result due to > > timeout"), > > query ? errcontext("remote SQL command: > > %s", query) : 0)); > > I wonder how pgfdw_get_cleanup_result() can determine which > log message to report. Probably we can add new boolean argument > to pgfdw_get_cleanup_result() so that it should be set to true > for cancel request case, but false for query case. Then > pgfdw_get_cleanup_result() can decide wihch message to log > based on that argument. But it seems not good design to me. > Thought? Let's not use the boolean just for the cancel request which isn't scalable IMO. Maybe a macro/enum? Otherwise, we could just do, although it doesn't look elegant: if (pgfdw_get_cleanup_result(conn, endtime, , "(cancel request)")) if (strcmp(query, "(cancel request)") == 0) WARNING without "remote SQL command: else WARNING with "remote SQL command: Regards, Bharath Rupireddy.
Re: Shouldn't postgres_fdw report warning when it gives up getting result from foreign server?
On 2021/11/22 14:16, Bharath Rupireddy wrote: BTW, we can hide the message "remote SQL command: .." in cancel request case, but which would make the debug and troubleshooting harder. Yeah, let's not hide the message. Yes! For a cancel request maybe we can just say without te errcontext: ereport(WARNING, (errmsg("could not get result of cancel request due to timeout"))); See the below existing message using "cancel request": errmsg("could not send cancel request: %s", For SQL command we can say: ereport(WARNING, (errmsg("could not get query result due to timeout"), query ? errcontext("remote SQL command: %s", query) : 0)); I wonder how pgfdw_get_cleanup_result() can determine which log message to report. Probably we can add new boolean argument to pgfdw_get_cleanup_result() so that it should be set to true for cancel request case, but false for query case. Then pgfdw_get_cleanup_result() can decide wihch message to log based on that argument. But it seems not good design to me. Thought? Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
Re: Shouldn't postgres_fdw report warning when it gives up getting result from foreign server?
On Mon, Nov 22, 2021 at 8:25 AM Fujii Masao wrote: > > On 2021/11/20 1:38, Bharath Rupireddy wrote: > > It reports "remote SQL command: (cancel request)" which isn't a sql > > query, but it looks okay to me as we report (cancel request). The > > pgfdw_get_cleanup_result_v1 patch LGTM. > > BTW, we can hide the message "remote SQL command: .." in cancel request case, > but which would make the debug and troubleshooting harder. Yeah, let's not hide the message. > So I decided to > use the string "(cancel request)" as SQL command string. Probably what string > should be used as SQL command might be debatable. For a cancel request maybe we can just say without te errcontext: ereport(WARNING, (errmsg("could not get result of cancel request due to timeout"))); See the below existing message using "cancel request": errmsg("could not send cancel request: %s", For SQL command we can say: ereport(WARNING, (errmsg("could not get query result due to timeout"), query ? errcontext("remote SQL command: %s", query) : 0)); Regards, Bharath Rupireddy.
Re: Shouldn't postgres_fdw report warning when it gives up getting result from foreign server?
On 2021/11/20 1:38, Bharath Rupireddy wrote: It reports "remote SQL command: (cancel request)" which isn't a sql query, but it looks okay to me as we report (cancel request). The pgfdw_get_cleanup_result_v1 patch LGTM. BTW, we can hide the message "remote SQL command: .." in cancel request case, but which would make the debug and troubleshooting harder. So I decided to use the string "(cancel request)" as SQL command string. Probably what string should be used as SQL command might be debatable. I think we can discuss this refactoring patch separately. Thoughts? Yes! I will consider again if it's worth doing the refactoring, if yes, I will start new thread for the discussion for that. Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
Re: Shouldn't postgres_fdw report warning when it gives up getting result from foreign server?
On Fri, Nov 19, 2021 at 9:14 PM Fujii Masao wrote: > On 2021/11/19 22:13, Bharath Rupireddy wrote: > > How about adding the warning message in pgfdw_abort_cleanup instead of > > pgfdw_get_cleanup_result? > > > > Just before this in pgfdw_abort_cleanup seems better to me. > > I was thinking pgfdw_get_cleanup_result() is better because it can > easily report different warning messages based on cases of a timeout > or connection failure, respectively. Since pgfdw_get_cleanup_result() > returns true in both those cases, ISTM that it's not easy to > distinguish them in pgfdw_abort_cleanup(). > > Anyway, attached is the patch (pgfdw_get_cleanup_result_v1.patch) > that makes pgfdw_get_cleanup_result() report a warning message. It reports "remote SQL command: (cancel request)" which isn't a sql query, but it looks okay to me as we report (cancel request). The pgfdw_get_cleanup_result_v1 patch LGTM. > > Yeah, this seems to be an opportunity. But, the function should deal > > with the timeout separately, I'm concerned that the function will > > eventually be having if (timeout_param_specified) { } else { } sort > > of code. We can see how much duplicate code we save here vs the > > readability or complexity that comes with the single function. > > Please see the attached patch (refactor_pgfdw_get_result_v1.patch). > This is still WIP, but you can check how much the refactoring can > simplify the code. I think we can discuss this refactoring patch separately. Thoughts? Regards, Bharath Rupireddy.
Re: Shouldn't postgres_fdw report warning when it gives up getting result from foreign server?
On 2021/11/19 22:13, Bharath Rupireddy wrote: How about adding the warning message in pgfdw_abort_cleanup instead of pgfdw_get_cleanup_result? Just before this in pgfdw_abort_cleanup seems better to me. I was thinking pgfdw_get_cleanup_result() is better because it can easily report different warning messages based on cases of a timeout or connection failure, respectively. Since pgfdw_get_cleanup_result() returns true in both those cases, ISTM that it's not easy to distinguish them in pgfdw_abort_cleanup(). Anyway, attached is the patch (pgfdw_get_cleanup_result_v1.patch) that makes pgfdw_get_cleanup_result() report a warning message. Yeah, this seems to be an opportunity. But, the function should deal with the timeout separately, I'm concerned that the function will eventually be having if (timeout_param_specified) { } else { } sort of code. We can see how much duplicate code we save here vs the readability or complexity that comes with the single function. Please see the attached patch (refactor_pgfdw_get_result_v1.patch). This is still WIP, but you can check how much the refactoring can simplify the code. Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATIONdiff --git a/contrib/postgres_fdw/connection.c b/contrib/postgres_fdw/connection.c index 4aff315b7c..203d4c411e 100644 --- a/contrib/postgres_fdw/connection.c +++ b/contrib/postgres_fdw/connection.c @@ -104,7 +104,7 @@ static bool pgfdw_cancel_query(PGconn *conn); static bool pgfdw_exec_cleanup_query(PGconn *conn, const char *query, bool ignore_errors); static bool pgfdw_get_cleanup_result(PGconn *conn, TimestampTz endtime, - PGresult **result); + PGresult **result, const char *query); static void pgfdw_abort_cleanup(ConnCacheEntry *entry, const char *sql, bool toplevel); static bool UserMappingPasswordRequired(UserMapping *user); @@ -1179,7 +1179,7 @@ pgfdw_cancel_query(PGconn *conn) } /* Get and discard the result of the query. */ - if (pgfdw_get_cleanup_result(conn, endtime, )) + if (pgfdw_get_cleanup_result(conn, endtime, , "(cancel request)")) return false; PQclear(result); @@ -1223,7 +1223,7 @@ pgfdw_exec_cleanup_query(PGconn *conn, const char *query, bool ignore_errors) } /* Get the result of the query. */ - if (pgfdw_get_cleanup_result(conn, endtime, )) + if (pgfdw_get_cleanup_result(conn, endtime, , query)) return false; /* Issue a warning if not successful. */ @@ -1248,7 +1248,8 @@ pgfdw_exec_cleanup_query(PGconn *conn, const char *query, bool ignore_errors) * Sets *result except in case of a timeout. */ static bool -pgfdw_get_cleanup_result(PGconn *conn, TimestampTz endtime, PGresult **result) +pgfdw_get_cleanup_result(PGconn *conn, TimestampTz endtime, PGresult **result, +const char *query) { volatile bool timed_out = false; PGresult *volatile last_res = NULL; @@ -1270,6 +1271,10 @@ pgfdw_get_cleanup_result(PGconn *conn, TimestampTz endtime, PGresult **result) cur_timeout = TimestampDifferenceMilliseconds(now, endtime); if (cur_timeout <= 0) { + ereport(WARNING, + (errmsg("could not get query result due to timeout"), +query ? errcontext("remote SQL command: %s", query) : 0)); + timed_out = true; goto exit; } @@ -1289,6 +1294,8 @@ pgfdw_get_cleanup_result(PGconn *conn, TimestampTz endtime, PGresult **result) { if (!PQconsumeInput(conn)) { + pgfdw_report_error(WARNING, NULL, conn, false, query); + /* connection trouble; treat the same as a timeout */ timed_out = true; goto exit; diff --git a/contrib/postgres_fdw/connection.c b/contrib/postgres_fdw/connection.c index 4aff315b7c..3fd1abb8f5 100644 --- a/contrib/postgres_fdw/connection.c +++ b/contrib/postgres_fdw/connection.c @@ -103,8 +103,8 @@ static void pgfdw_reject_incomplete_xact_state_change(ConnCacheEntry *entry); static bool pgfdw_cancel_query(PGconn *conn); static bool
Re: Shouldn't postgres_fdw report warning when it gives up getting result from foreign server?
On Fri, Nov 19, 2021 at 3:31 PM Fujii Masao wrote: > > Hi, > > postgres_fdw reports no log message when it sends "ABORT TRANSACTION" etc > and gives up getting a reply from a foreign server because of timeout or > connection trouble. This makes the troubleshooting a bit harder when > using postgres_fdw. > > So how about making postgres_fdw report a warning in that case? > Specifically I'm thinking to change pgfdw_get_cleanup_result() > in postgres_fdw/connection.c so that it reports a warning in case of > a timeout or connection failure (error of PQconsumeInput()). How about adding the warning message in pgfdw_abort_cleanup instead of pgfdw_get_cleanup_result? Just before this in pgfdw_abort_cleanup seems better to me. /* * If a command has been submitted to the remote server by using an * asynchronous execution function, the command might not have yet * completed. Check to see if a command is still being processed by the * remote server, and if so, request cancellation of the command. */ if (PQtransactionStatus(entry->conn) == PQTRANS_ACTIVE && !pgfdw_cancel_query(entry->conn)) return; /* Unable to cancel running query */ if (!pgfdw_exec_cleanup_query(entry->conn, sql, false)) return; > BTW, pgfdw_get_cleanup_result() does almost the same things as > what pgfdw_get_result() does. So it might be good idea to refactor > those function to reduce the code duplication. Yeah, this seems to be an opportunity. But, the function should deal with the timeout separately, I'm concerned that the function will eventually be having if (timeout_param_specified) { } else { } sort of code. We can see how much duplicate code we save here vs the readability or complexity that comes with the single function. Regards, Bharath Rupireddy.