Re: psql not responding to SIGINT upon db reconnection
Thanks Jelte and Robert for the extra effort to correct my mistake. I apologize. Bad copy-paste from a previous revision of the patch at some point. -- Tristan Partin Neon (https://neon.tech)
Re: psql not responding to SIGINT upon db reconnection
On Thu, Apr 4, 2024 at 6:43 AM Jelte Fennema-Nio wrote: > But I don't agree with this. Having pg_unreachable in places where it > brings no perf benefit has two main downsides: > > 1. It's extra lines of code > 2. If a programmer/reviewer is not careful about maintaining this > unreachable invariant (now or in the future), undefined behavior can > be introduced quite easily. > > Also, I'd expect any optimizing compiler to know that the code after > the loop is already unreachable if there are no break/goto statements > in its body. I'm not super-well-informed about the effects of pg_unreachable(), but I feel like it's a little strange to complain that it's an extra line of code. Presumably, it translates into no executable instructions, and might even reduce the size of the generated code. Now, it could still be a cognitive burden on human readers, but hopefully it should do the exact opposite: it should make the code easier to understand by documenting that the author thought that a certain point in the code could not be reached. In that sense, it's like a code comment. Likewise, it certainly is true that one must be careful to put pg_unreachable() only in places that aren't reachable, and to add or remove it as appropriate if the set of unreachable places changes. But it's also true that one must be careful to put any other thing that looks like a function call only in places where that thing is appropriate. > > I agree with Tristan's analysis of 0002. > > Updated 0002, to only change the comment. But honestly I don't think > using "default" really introduces many chances for future bugs in this > case, since it seems very unlikely we'll ever add new variants to the > PostgresPollingStatusType enum. That might be true, but we've avoided using default: for switches over lots of other enum types in lots of other places in PostgreSQL, and overall it's saved us from lots of bugs. I'm not a stickler about this; if there's some reason not to do it in some particular case, that's fine with me. But where it's possible to do it without any real disadvantage, I think it's a healthy practice. I have committed these versions of the patches. -- Robert Haas EDB: http://www.enterprisedb.com
Re: psql not responding to SIGINT upon db reconnection
On Wed, 3 Apr 2024 at 21:49, Robert Haas wrote: > It seems to me that 0001 should either remove the pg_unreachable() > call or change the break to a return, but not both. Updated the patch to only remove the pg_unreachable call (and keep the breaks). > but there's not much value in omitting > pg_unreachable() from unreachable places, either, so I'm not > convinced. But I don't agree with this. Having pg_unreachable in places where it brings no perf benefit has two main downsides: 1. It's extra lines of code 2. If a programmer/reviewer is not careful about maintaining this unreachable invariant (now or in the future), undefined behavior can be introduced quite easily. Also, I'd expect any optimizing compiler to know that the code after the loop is already unreachable if there are no break/goto statements in its body. > I agree with Tristan's analysis of 0002. Updated 0002, to only change the comment. But honestly I don't think using "default" really introduces many chances for future bugs in this case, since it seems very unlikely we'll ever add new variants to the PostgresPollingStatusType enum. From b0ccfb6fca3e4ae9e1e62ac3b6a4c5f428b56e04 Mon Sep 17 00:00:00 2001 From: Jelte Fennema-Nio Date: Wed, 3 Apr 2024 15:19:04 +0200 Subject: [PATCH v14 1/2] Fix actually reachable pg_unreachable call In cafe1056558fe07cdc52b95205588fcd80870362 a call to pg_unreachable was introduced that was actually reachable, because there were break statements in the loop. This addresses that by removing the call to pg_unreachable, which seemed of dubious necessity anyway. --- src/bin/psql/command.c | 2 -- 1 file changed, 2 deletions(-) diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c index c005624e9c3..479f9f2be59 100644 --- a/src/bin/psql/command.c +++ b/src/bin/psql/command.c @@ -3815,8 +3815,6 @@ wait_until_connected(PGconn *conn) pg_unreachable(); } } - - pg_unreachable(); } void base-commit: 936e3fa3787a51397280c1081587586e83c20399 -- 2.34.1 From 799eb6989d481ab617c473b8acbbfc1a405c3c7d Mon Sep 17 00:00:00 2001 From: Jelte Fennema-Nio Date: Wed, 3 Apr 2024 15:21:52 +0200 Subject: [PATCH v14 2/2] Change comment on PGRES_POLLING_ACTIVE Updates the comment on PGRES_POLLING_ACTIVE, since we're stuck with it forever due to ABI compatibility guarantees. --- src/interfaces/libpq/libpq-fe.h | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/src/interfaces/libpq/libpq-fe.h b/src/interfaces/libpq/libpq-fe.h index 8d3c5c6f662..c184e853889 100644 --- a/src/interfaces/libpq/libpq-fe.h +++ b/src/interfaces/libpq/libpq-fe.h @@ -91,8 +91,7 @@ typedef enum PGRES_POLLING_READING, /* These two indicate that one may */ PGRES_POLLING_WRITING, /* use select before polling again. */ PGRES_POLLING_OK, - PGRES_POLLING_ACTIVE /* unused; keep for awhile for backwards - * compatibility */ + PGRES_POLLING_ACTIVE /* unused; keep for backwards compatibility */ } PostgresPollingStatusType; typedef enum -- 2.34.1
Re: psql not responding to SIGINT upon db reconnection
On Wed, Apr 3, 2024 at 11:17 AM Tristan Partin wrote: > I think patch 2 makes it worse. The value in -Wswitch is that when new > enum variants are added, the developer knows the locations to update. > Adding a default case makes -Wswitch pointless. > > Patch 1 is still good. The comment change in patch 2 is good too! It seems to me that 0001 should either remove the pg_unreachable() call or change the break to a return, but not both. The commit message tries to justify doing both by saying that the pg_unreachable() call doesn't have much value, but there's not much value in omitting pg_unreachable() from unreachable places, either, so I'm not convinced. I agree with Tristan's analysis of 0002. -- Robert Haas EDB: http://www.enterprisedb.com
Re: psql not responding to SIGINT upon db reconnection
On Wed Apr 3, 2024 at 10:05 AM CDT, Jelte Fennema-Nio wrote: On Wed, 3 Apr 2024 at 16:55, Tristan Partin wrote: > Removing from the switch statement causes a warning: > > > [1/2] Compiling C object src/bin/psql/psql.p/command.c.o > > ../src/bin/psql/command.c: In function ‘wait_until_connected’: > > ../src/bin/psql/command.c:3803:17: warning: enumeration value ‘PGRES_POLLING_ACTIVE’ not handled in switch [-Wswitch] > > 3803 | switch (PQconnectPoll(conn)) > > | ^~ Ofcourse... fixed now I think patch 2 makes it worse. The value in -Wswitch is that when new enum variants are added, the developer knows the locations to update. Adding a default case makes -Wswitch pointless. Patch 1 is still good. The comment change in patch 2 is good too! -- Tristan Partin Neon (https://neon.tech)
Re: psql not responding to SIGINT upon db reconnection
On Wed, 3 Apr 2024 at 16:55, Tristan Partin wrote: > Removing from the switch statement causes a warning: > > > [1/2] Compiling C object src/bin/psql/psql.p/command.c.o > > ../src/bin/psql/command.c: In function ‘wait_until_connected’: > > ../src/bin/psql/command.c:3803:17: warning: enumeration value > > ‘PGRES_POLLING_ACTIVE’ not handled in switch [-Wswitch] > > 3803 | switch (PQconnectPoll(conn)) > > | ^~ Ofcourse... fixed now From 29100578b5f0b4cec68ee1b8c572c4f280335da3 Mon Sep 17 00:00:00 2001 From: Jelte Fennema-Nio Date: Wed, 3 Apr 2024 15:19:04 +0200 Subject: [PATCH v13 1/2] Fix actually reachable pg_unreachable call In cafe1056558fe07cdc52b95205588fcd80870362 a call to pg_unreachable was introduced that was actually reachable, because there were break statements in the loop. This addresses that by both changing the break statements to return and removing the call to pg_unreachable, which seemed of dubious necessity anyway. --- src/bin/psql/command.c | 6 ++ 1 file changed, 2 insertions(+), 4 deletions(-) diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c index c005624e9c3..dc3cc7c10b7 100644 --- a/src/bin/psql/command.c +++ b/src/bin/psql/command.c @@ -3771,7 +3771,7 @@ wait_until_connected(PGconn *conn) * user has requested a cancellation. */ if (cancel_pressed) - break; + return; /* * Do not assume that the socket remains the same across @@ -3779,7 +3779,7 @@ wait_until_connected(PGconn *conn) */ sock = PQsocket(conn); if (sock == -1) - break; + return; /* * If the user sends SIGINT between the cancel_pressed check, and @@ -3815,8 +3815,6 @@ wait_until_connected(PGconn *conn) pg_unreachable(); } } - - pg_unreachable(); } void base-commit: 936e3fa3787a51397280c1081587586e83c20399 -- 2.34.1 From 76cef25162b44adc20172afee47836ca765d3b5c Mon Sep 17 00:00:00 2001 From: Jelte Fennema-Nio Date: Wed, 3 Apr 2024 15:21:52 +0200 Subject: [PATCH v13 2/2] Remove PGRES_POLLING_ACTIVE case This enum variant has been unused for at least 21 years 44aba280207740d0956160c0288e61f28f024a71. It's been there only for backwards compatibility of the ABI, so no need to check for it. Also update the comment on PGRES_POLLING_ACTIVE, since we're stuck with it forever due to ABI compatibility guarantees. --- src/bin/psql/command.c | 7 ++- src/interfaces/libpq/libpq-fe.h | 3 +-- 2 files changed, 3 insertions(+), 7 deletions(-) diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c index dc3cc7c10b7..5b31d08bf70 100644 --- a/src/bin/psql/command.c +++ b/src/bin/psql/command.c @@ -3802,17 +3802,14 @@ wait_until_connected(PGconn *conn) switch (PQconnectPoll(conn)) { - case PGRES_POLLING_OK: - case PGRES_POLLING_FAILED: -return; case PGRES_POLLING_READING: forRead = true; continue; case PGRES_POLLING_WRITING: forRead = false; continue; - case PGRES_POLLING_ACTIVE: -pg_unreachable(); + default: +return; } } } diff --git a/src/interfaces/libpq/libpq-fe.h b/src/interfaces/libpq/libpq-fe.h index 8d3c5c6f662..c184e853889 100644 --- a/src/interfaces/libpq/libpq-fe.h +++ b/src/interfaces/libpq/libpq-fe.h @@ -91,8 +91,7 @@ typedef enum PGRES_POLLING_READING, /* These two indicate that one may */ PGRES_POLLING_WRITING, /* use select before polling again. */ PGRES_POLLING_OK, - PGRES_POLLING_ACTIVE /* unused; keep for awhile for backwards - * compatibility */ + PGRES_POLLING_ACTIVE /* unused; keep for backwards compatibility */ } PostgresPollingStatusType; typedef enum -- 2.34.1
Re: psql not responding to SIGINT upon db reconnection
On Wed Apr 3, 2024 at 9:50 AM CDT, Jelte Fennema-Nio wrote: On Wed, 3 Apr 2024 at 16:31, Tom Lane wrote: > If we do the latter, we will almost certainly get pushback from > distros who check for library ABI breaks. I fear the comment > suggesting that we could remove it someday is too optimistic. Alright, changed patch 0002 to keep the variant. But remove it from the recently added switch/case. And also updated the comment to remove the "for awhile". Removing from the switch statement causes a warning: [1/2] Compiling C object src/bin/psql/psql.p/command.c.o ../src/bin/psql/command.c: In function ‘wait_until_connected’: ../src/bin/psql/command.c:3803:17: warning: enumeration value ‘PGRES_POLLING_ACTIVE’ not handled in switch [-Wswitch] 3803 | switch (PQconnectPoll(conn)) | ^~ -- Tristan Partin Neon (https://neon.tech)
Re: psql not responding to SIGINT upon db reconnection
On Wed, 3 Apr 2024 at 16:31, Tom Lane wrote: > If we do the latter, we will almost certainly get pushback from > distros who check for library ABI breaks. I fear the comment > suggesting that we could remove it someday is too optimistic. Alright, changed patch 0002 to keep the variant. But remove it from the recently added switch/case. And also updated the comment to remove the "for awhile". From 29100578b5f0b4cec68ee1b8c572c4f280335da3 Mon Sep 17 00:00:00 2001 From: Jelte Fennema-Nio Date: Wed, 3 Apr 2024 15:19:04 +0200 Subject: [PATCH v12 1/2] Fix actually reachable pg_unreachable call In cafe1056558fe07cdc52b95205588fcd80870362 a call to pg_unreachable was introduced that was actually reachable, because there were break statements in the loop. This addresses that by both changing the break statements to return and removing the call to pg_unreachable, which seemed of dubious necessity anyway. --- src/bin/psql/command.c | 6 ++ 1 file changed, 2 insertions(+), 4 deletions(-) diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c index c005624e9c3..dc3cc7c10b7 100644 --- a/src/bin/psql/command.c +++ b/src/bin/psql/command.c @@ -3771,7 +3771,7 @@ wait_until_connected(PGconn *conn) * user has requested a cancellation. */ if (cancel_pressed) - break; + return; /* * Do not assume that the socket remains the same across @@ -3779,7 +3779,7 @@ wait_until_connected(PGconn *conn) */ sock = PQsocket(conn); if (sock == -1) - break; + return; /* * If the user sends SIGINT between the cancel_pressed check, and @@ -3815,8 +3815,6 @@ wait_until_connected(PGconn *conn) pg_unreachable(); } } - - pg_unreachable(); } void base-commit: 936e3fa3787a51397280c1081587586e83c20399 -- 2.34.1 From f71634361a0c9ba3416c75fc3ff6d55536f134c4 Mon Sep 17 00:00:00 2001 From: Jelte Fennema-Nio Date: Wed, 3 Apr 2024 15:21:52 +0200 Subject: [PATCH v12 2/2] Remove PGRES_POLLING_ACTIVE case This enum variant has been unused for at least 21 years 44aba280207740d0956160c0288e61f28f024a71. It's been there only for backwards compatibility of the ABI, so no need to check for it. Also update the comment on PGRES_POLLING_ACTIVE, since we're stuck with it forever due to ABI compatibility guarantees. --- src/bin/psql/command.c | 2 -- src/interfaces/libpq/libpq-fe.h | 3 +-- 2 files changed, 1 insertion(+), 4 deletions(-) diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c index dc3cc7c10b7..9a163cf22cd 100644 --- a/src/bin/psql/command.c +++ b/src/bin/psql/command.c @@ -3811,8 +3811,6 @@ wait_until_connected(PGconn *conn) case PGRES_POLLING_WRITING: forRead = false; continue; - case PGRES_POLLING_ACTIVE: -pg_unreachable(); } } } diff --git a/src/interfaces/libpq/libpq-fe.h b/src/interfaces/libpq/libpq-fe.h index 8d3c5c6f662..c184e853889 100644 --- a/src/interfaces/libpq/libpq-fe.h +++ b/src/interfaces/libpq/libpq-fe.h @@ -91,8 +91,7 @@ typedef enum PGRES_POLLING_READING, /* These two indicate that one may */ PGRES_POLLING_WRITING, /* use select before polling again. */ PGRES_POLLING_OK, - PGRES_POLLING_ACTIVE /* unused; keep for awhile for backwards - * compatibility */ + PGRES_POLLING_ACTIVE /* unused; keep for backwards compatibility */ } PostgresPollingStatusType; typedef enum -- 2.34.1
Re: psql not responding to SIGINT upon db reconnection
Jelte Fennema-Nio writes: > Looking at the committed version of this patch, the pg_unreachable > calls seemed weird to me. 1 is actually incorrect, thus possibly > resulting in undefined behaviour. And for the other call an imho > better fix would be to remove the now 21 year unused enum variant, > instead of introducing its only reference in the whole codebase. If we do the latter, we will almost certainly get pushback from distros who check for library ABI breaks. I fear the comment suggesting that we could remove it someday is too optimistic. regards, tom lane
Re: psql not responding to SIGINT upon db reconnection
On Wed Apr 3, 2024 at 8:32 AM CDT, Jelte Fennema-Nio wrote: On Tue, 2 Apr 2024 at 16:33, Robert Haas wrote: > Committed it, I did. My thanks for working on this issue, I extend. Looking at the committed version of this patch, the pg_unreachable calls seemed weird to me. 1 is actually incorrect, thus possibly resulting in undefined behaviour. And for the other call an imho better fix would be to remove the now 21 year unused enum variant, instead of introducing its only reference in the whole codebase. Attached are two trivial patches, feel free to remove both of the pg_unreachable calls. Patches look good. Sorry about causing you to do some work. -- Tristan Partin Neon (https://neon.tech)
Re: psql not responding to SIGINT upon db reconnection
On Tue, 2 Apr 2024 at 16:33, Robert Haas wrote: > Committed it, I did. My thanks for working on this issue, I extend. Looking at the committed version of this patch, the pg_unreachable calls seemed weird to me. 1 is actually incorrect, thus possibly resulting in undefined behaviour. And for the other call an imho better fix would be to remove the now 21 year unused enum variant, instead of introducing its only reference in the whole codebase. Attached are two trivial patches, feel free to remove both of the pg_unreachable calls. From 7f4279bb939e2d2707fdd0f471893d734959ab91 Mon Sep 17 00:00:00 2001 From: Jelte Fennema-Nio Date: Wed, 3 Apr 2024 15:21:52 +0200 Subject: [PATCH v11 2/2] Remove PGRES_POLLING_ACTIVE This enum variant has been unused for at least 21 years 44aba280207740d0956160c0288e61f28f024a71. It was left for backwards compatibility for "awhile". I think that time has come. --- src/bin/psql/command.c | 2 -- src/interfaces/libpq/libpq-fe.h | 2 -- 2 files changed, 4 deletions(-) diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c index dc3cc7c10b7..9a163cf22cd 100644 --- a/src/bin/psql/command.c +++ b/src/bin/psql/command.c @@ -3811,8 +3811,6 @@ wait_until_connected(PGconn *conn) case PGRES_POLLING_WRITING: forRead = false; continue; - case PGRES_POLLING_ACTIVE: -pg_unreachable(); } } } diff --git a/src/interfaces/libpq/libpq-fe.h b/src/interfaces/libpq/libpq-fe.h index 8d3c5c6f662..2459b0cf5e1 100644 --- a/src/interfaces/libpq/libpq-fe.h +++ b/src/interfaces/libpq/libpq-fe.h @@ -91,8 +91,6 @@ typedef enum PGRES_POLLING_READING, /* These two indicate that one may */ PGRES_POLLING_WRITING, /* use select before polling again. */ PGRES_POLLING_OK, - PGRES_POLLING_ACTIVE /* unused; keep for awhile for backwards - * compatibility */ } PostgresPollingStatusType; typedef enum -- 2.34.1 From 29100578b5f0b4cec68ee1b8c572c4f280335da3 Mon Sep 17 00:00:00 2001 From: Jelte Fennema-Nio Date: Wed, 3 Apr 2024 15:19:04 +0200 Subject: [PATCH v11 1/2] Fix actually reachable pg_unreachable call In cafe1056558fe07cdc52b95205588fcd80870362 a call to pg_unreachable was introduced that was actually reachable, because there were break statements in the loop. This addresses that by both changing the break statements to return and removing the call to pg_unreachable, which seemed of dubious necessity anyway. --- src/bin/psql/command.c | 6 ++ 1 file changed, 2 insertions(+), 4 deletions(-) diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c index c005624e9c3..dc3cc7c10b7 100644 --- a/src/bin/psql/command.c +++ b/src/bin/psql/command.c @@ -3771,7 +3771,7 @@ wait_until_connected(PGconn *conn) * user has requested a cancellation. */ if (cancel_pressed) - break; + return; /* * Do not assume that the socket remains the same across @@ -3779,7 +3779,7 @@ wait_until_connected(PGconn *conn) */ sock = PQsocket(conn); if (sock == -1) - break; + return; /* * If the user sends SIGINT between the cancel_pressed check, and @@ -3815,8 +3815,6 @@ wait_until_connected(PGconn *conn) pg_unreachable(); } } - - pg_unreachable(); } void base-commit: 936e3fa3787a51397280c1081587586e83c20399 -- 2.34.1
Re: psql not responding to SIGINT upon db reconnection
On Tue Apr 2, 2024 at 9:32 AM CDT, Robert Haas wrote: On Mon, Apr 1, 2024 at 12:04 PM Tristan Partin wrote: > > This sentence seems a bit contorted to me, like maybe Yoda wrote it. > > Incorporated feedback, I have :). Committed it, I did. My thanks for working on this issue, I extend. Thanks to everyone for their reviews! Patch is in a much better place than when it started. -- Tristan Partin Neon (https://neon.tech)
Re: psql not responding to SIGINT upon db reconnection
On Mon, Apr 1, 2024 at 12:04 PM Tristan Partin wrote: > > This sentence seems a bit contorted to me, like maybe Yoda wrote it. > > Incorporated feedback, I have :). Committed it, I did. My thanks for working on this issue, I extend. -- Robert Haas EDB: http://www.enterprisedb.com
Re: psql not responding to SIGINT upon db reconnection
On Mon Mar 25, 2024 at 1:44 PM CDT, Robert Haas wrote: On Fri, Mar 22, 2024 at 4:58 PM Tristan Partin wrote: > I had a question about parameter naming. Right now I have a mix of > camel-case and snake-case in the function signature since that is what > I inherited. Should I change that to be consistent? If so, which case > would you like? Uh... PostgreSQL is kind of the wild west in that regard. The thing to do is look for nearby precedents, but that doesn't help much here because in the very same file, libpq-fe.h, we have: extern int PQsetResultAttrs(PGresult *res, int numAttributes, PGresAttDesc *attDescs); extern int PQsetvalue(PGresult *res, int tup_num, int field_num, char *value, int len); Since the existing naming is consistent with one of those two styles, I'd probably just leave it be. + The function returns a value greater than 0 if the specified condition + is met, 0 if a timeout occurred, or -1 if an error + or interrupt occurred. In the event forRead and We either need to tell people how to find out which error it was, or if that's not possible and we can't reasonably make it possible, we need to tell them why they shouldn't care. Because there's nothing more delightful than someone who shows up and says "hey, I tried to do XYZ, and I got an error," as if that were sufficient information for me to do something useful. + end_time is the time in the future in seconds starting from the UNIX + epoch in which you would like the function to return if the condition is not met. This sentence seems a bit contorted to me, like maybe Yoda wrote it. I was about to try to rephrase it and maybe split it in two when I wondered why we need to document how time_t works at all. Can't we just say something like "If end_time is not -1, it specifies the time at which this function should stop waiting for the condition to be met" -- and maybe move it to the end of the first paragraph, so it's before where we list the meanings of the return values? Incorporated feedback, I have :). -- Tristan Partin Neon (https://neon.tech) From 14c794d9699f7b9f27d1a4ec026f748c6b7d8853 Mon Sep 17 00:00:00 2001 From: Tristan Partin Date: Tue, 30 Jan 2024 15:40:57 -0600 Subject: [PATCH v11 1/2] Expose PQsocketPoll for frontends PQsocketPoll should help with state machine processing when connecting to a database asynchronously via PQconnectStart(). --- doc/src/sgml/libpq.sgml | 40 +++- src/interfaces/libpq/exports.txt | 1 + src/interfaces/libpq/fe-misc.c | 7 +++--- src/interfaces/libpq/libpq-fe.h | 4 4 files changed, 47 insertions(+), 5 deletions(-) diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml index d3e87056f2c..e69feacfe6a 100644 --- a/doc/src/sgml/libpq.sgml +++ b/doc/src/sgml/libpq.sgml @@ -262,6 +262,41 @@ PGconn *PQsetdb(char *pghost, + + PQsocketPollPQsocketPoll + + + nonblocking connection + Poll a connection's underlying socket descriptor retrieved with . + +int PQsocketPoll(int sock, int forRead, int forWrite, time_t end_time); + + + + + This function sets up polling of a file descriptor. The underlying function is either + poll(2) or select(2), depending on platform + support. The primary use of this function is iterating through the connection sequence + described in the documentation of . If + forRead is specified, the function waits for the socket to be ready + for reading. If forWrite is specified, the function waits for the + socket to be ready for write. See POLLIN and POLLOUT + from poll(2), or readfds and + writefds from select(2) for more information. If + end_time is not -1, it specifies the time at which + this function should stop waiting for the condition to be met. + + + + The function returns a value greater than 0 if the specified condition + is met, 0 if a timeout occurred, or -1 if an error + occurred. The error can be retrieved by checking the errno(3) value. In + the event forRead and forWrite are not set, the + function immediately returns a timeout condition. + + + + PQconnectStartParamsPQconnectStartParams PQconnectStartPQconnectStart @@ -358,7 +393,10 @@ PostgresPollingStatusType PQconnectPoll(PGconn *conn); Loop thus: If PQconnectPoll(conn) last returned PGRES_POLLING_READING, wait until the socket is ready to read (as indicated by select(), poll(), or - similar system function). + similar system function). Note that PQsocketPoll + can help reduce boilerplate by abstracting the setup of + select(2) or poll(2) if it is + available on your system. Then call PQconnectPoll(conn) again. Conversely, if PQconnectPoll(conn) last returned PGRES_POLLING_WRITING, wait until the socket is ready diff --git a/src/inte
Re: psql not responding to SIGINT upon db reconnection
On Fri, Mar 22, 2024 at 4:58 PM Tristan Partin wrote: > I had a question about parameter naming. Right now I have a mix of > camel-case and snake-case in the function signature since that is what > I inherited. Should I change that to be consistent? If so, which case > would you like? Uh... PostgreSQL is kind of the wild west in that regard. The thing to do is look for nearby precedents, but that doesn't help much here because in the very same file, libpq-fe.h, we have: extern int PQsetResultAttrs(PGresult *res, int numAttributes, PGresAttDesc *attDescs); extern int PQsetvalue(PGresult *res, int tup_num, int field_num, char *value, int len); Since the existing naming is consistent with one of those two styles, I'd probably just leave it be. + The function returns a value greater than 0 if the specified condition + is met, 0 if a timeout occurred, or -1 if an error + or interrupt occurred. In the event forRead and We either need to tell people how to find out which error it was, or if that's not possible and we can't reasonably make it possible, we need to tell them why they shouldn't care. Because there's nothing more delightful than someone who shows up and says "hey, I tried to do XYZ, and I got an error," as if that were sufficient information for me to do something useful. + end_time is the time in the future in seconds starting from the UNIX + epoch in which you would like the function to return if the condition is not met. This sentence seems a bit contorted to me, like maybe Yoda wrote it. I was about to try to rephrase it and maybe split it in two when I wondered why we need to document how time_t works at all. Can't we just say something like "If end_time is not -1, it specifies the time at which this function should stop waiting for the condition to be met" -- and maybe move it to the end of the first paragraph, so it's before where we list the meanings of the return values? -- Robert Haas EDB: http://www.enterprisedb.com
Re: psql not responding to SIGINT upon db reconnection
On Fri Mar 22, 2024 at 12:17 PM CDT, Robert Haas wrote: On Fri, Mar 22, 2024 at 1:05 PM Tristan Partin wrote: > Sorry for taking a while to get back to y'all. I have taken your > feedback into consideration for v9. This is my first time writing > Postgres docs, so I'm ready for another round of editing :). Yeah, that looks like it needs some wordsmithing yet. I can take a crack at that at some point, but here are a few notes: - "takes care of" and "working through the state machine" seem quite vague to me. - the meanings of forRead and forWrite don't seem to be documented. - "retuns" is a typo. > Robert, could you point out some places where comments would be useful > in 0002? I did rename the function and moved it as suggested, thanks! In > turn, I also realized I forgot a prototype, so also added it. Well, for starters, I'd give the function a header comment. Telling me that a 1 second timeout is a 1 second timeout is less useful than telling me why we've picked a 1 second timeout. Presumably the answer is "so we can respond to cancel interrupts in a timely fashion", but I'd make that explicit. It might be worth a comment saying that PQsocket() shouldn't be hoisted out of the loop because it could be a multi-host connection string and the socket might change under us. Unless that's not true, in which case we should hoist the PQsocket() call out of the loop. I think it would also be worth a comment saying that we don't need to report errors here, as the caller will do that; we just need to wait until the connection is known to have either succeeded or failed, or until the user presses cancel. This is good feedback, thanks. I have added comments where you suggested. I reworded the PQsocketPoll docs to hopefully meet your expectations. I am using the term "connection sequence" which is from the PQconnectStartParams docs, so hopefully people will be able to make that connection. I wrote documentation for "forRead" and "forWrite" as well. I had a question about parameter naming. Right now I have a mix of camel-case and snake-case in the function signature since that is what I inherited. Should I change that to be consistent? If so, which case would you like? Thanks for your continued reviews. -- Tristan Partin Neon (https://neon.tech) From dc45e95ab443d973845dd840600d6719dcd4cae2 Mon Sep 17 00:00:00 2001 From: Tristan Partin Date: Tue, 30 Jan 2024 15:40:57 -0600 Subject: [PATCH v10 1/2] Expose PQsocketPoll for frontends PQsocketPoll should help with state machine processing when connecting to a database asynchronously via PQconnectStart(). --- doc/src/sgml/libpq.sgml | 43 +++- src/interfaces/libpq/exports.txt | 1 + src/interfaces/libpq/fe-misc.c | 7 +++--- src/interfaces/libpq/libpq-fe.h | 4 +++ 4 files changed, 50 insertions(+), 5 deletions(-) diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml index d3e87056f2c..774b57ba70b 100644 --- a/doc/src/sgml/libpq.sgml +++ b/doc/src/sgml/libpq.sgml @@ -262,6 +262,44 @@ PGconn *PQsetdb(char *pghost, + + PQsocketPollPQsocketPoll + + + nonblocking connection + Poll a connection's underlying socket descriptor retrieved with . + +int PQsocketPoll(int sock, int forRead, int forWrite, time_t end_time); + + + + + This function sets up polling of a file descriptor. The underlying function is either + poll(2) or select(2), depending on platform + support. The primary use of this function is iterating through the connection sequence + described in the documentation of . If + forRead is specified, the function waits for the socket to be ready + for reading. If forWrite is specified, the function waits for the + socket to be ready for write. See POLLIN and POLLOUT + from poll(2), or readfds and + writefds from select(2) for more information. + + + + The function returns a value greater than 0 if the specified condition + is met, 0 if a timeout occurred, or -1 if an error + or interrupt occurred. In the event forRead and + forWrite are not set, the function immediately returns a timeout + condition. + + + + end_time is the time in the future in seconds starting from the UNIX + epoch in which you would like the function to return if the condition is not met. + + + + PQconnectStartParamsPQconnectStartParams PQconnectStartPQconnectStart @@ -358,7 +396,10 @@ PostgresPollingStatusType PQconnectPoll(PGconn *conn); Loop thus: If PQconnectPoll(conn) last returned PGRES_POLLING_READING, wait until the socket is ready to read (as indicated by select(), poll(), or - similar system function). + similar system function). Note that PQsocketPoll + can help reduce boilerplate by abstracting the setup of + select(2) or poll(2) if it is + available on
Re: psql not responding to SIGINT upon db reconnection
On Fri, Mar 22, 2024 at 1:05 PM Tristan Partin wrote: > Sorry for taking a while to get back to y'all. I have taken your > feedback into consideration for v9. This is my first time writing > Postgres docs, so I'm ready for another round of editing :). Yeah, that looks like it needs some wordsmithing yet. I can take a crack at that at some point, but here are a few notes: - "takes care of" and "working through the state machine" seem quite vague to me. - the meanings of forRead and forWrite don't seem to be documented. - "retuns" is a typo. > Robert, could you point out some places where comments would be useful > in 0002? I did rename the function and moved it as suggested, thanks! In > turn, I also realized I forgot a prototype, so also added it. Well, for starters, I'd give the function a header comment. Telling me that a 1 second timeout is a 1 second timeout is less useful than telling me why we've picked a 1 second timeout. Presumably the answer is "so we can respond to cancel interrupts in a timely fashion", but I'd make that explicit. It might be worth a comment saying that PQsocket() shouldn't be hoisted out of the loop because it could be a multi-host connection string and the socket might change under us. Unless that's not true, in which case we should hoist the PQsocket() call out of the loop. I think it would also be worth a comment saying that we don't need to report errors here, as the caller will do that; we just need to wait until the connection is known to have either succeeded or failed, or until the user presses cancel. -- Robert Haas EDB: http://www.enterprisedb.com
Re: psql not responding to SIGINT upon db reconnection
On Fri Mar 22, 2024 at 9:59 AM CDT, Robert Haas wrote: On Wed, Jan 31, 2024 at 1:07 PM Tristan Partin wrote: > I was looking for documentation of PQsocket(), but didn't find any > standalone (unless I completely missed it). So I just copied how > PQsocket() is documented in PQconnectPoll(). I am happy to document it > separately if you think it would be useful. As Jelte said back at the end of January, I think you just completely missed it. The relevant part of libpq.sgml starts like this: PQsocketPQsocket As far as I know, we document all of the exported libpq functions in that SGML file. I think there's no real reason why we couldn't get at least 0001 and maybe also 0002 into this release, but only if you move quickly on this. Feature freeze is approaching rapidly. Modulo the documentation changes, I think 0001 is pretty much ready to go. 0002 needs comments. I'm also not so sure about the name process_connection_state_machine(); it seems a little verbose. How about something like wait_until_connected()? And maybe put it below do_connect() instead of above. Robert, Jelte: Sorry for taking a while to get back to y'all. I have taken your feedback into consideration for v9. This is my first time writing Postgres docs, so I'm ready for another round of editing :). Robert, could you point out some places where comments would be useful in 0002? I did rename the function and moved it as suggested, thanks! In turn, I also realized I forgot a prototype, so also added it. This patchset has also been rebased on master, so the libpq export number for PQsocketPoll was bumped. -- Tristan Partin Neon (https://neon.tech) From 7f8bf7250ecf79f65c129ccc42643c36bc53f882 Mon Sep 17 00:00:00 2001 From: Tristan Partin Date: Tue, 30 Jan 2024 15:40:57 -0600 Subject: [PATCH v9 1/2] Expose PQsocketPoll for frontends PQsocketPoll should help with state machine processing when connecting to a database asynchronously via PQconnectStart(). --- doc/src/sgml/libpq.sgml | 38 +++- src/interfaces/libpq/exports.txt | 1 + src/interfaces/libpq/fe-misc.c | 7 +++--- src/interfaces/libpq/libpq-fe.h | 4 4 files changed, 45 insertions(+), 5 deletions(-) diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml index d3e87056f2c..10f191f6b88 100644 --- a/doc/src/sgml/libpq.sgml +++ b/doc/src/sgml/libpq.sgml @@ -262,6 +262,39 @@ PGconn *PQsetdb(char *pghost, + + PQsocketPollPQsocketPoll + + + nonblocking connection + Poll a connection's underlying socket descriptor retrieved with . + +int PQsocketPoll(int sock, int forRead, int forWrite, time_t end_time); + + + + + This function takes care of the setup for monitoring a file descriptor. The underlying + function is either poll(2) or select(2), + depending on platform support. The primary use of this function is working through the state + machine instantiated by . + + + + The function returns a value greater than 0 if the specified condition + is met, 0 if a timeout occurred, or -1 if an error + or interrupt occurred. In the event forRead and + forWrite are not set, the function immediately retuns a timeout + condition. + + + + end_time is the time in the future in seconds starting from the UNIX + epoch in which you would like the function to return if the condition is not met. + + + + PQconnectStartParamsPQconnectStartParams PQconnectStartPQconnectStart @@ -358,7 +391,10 @@ PostgresPollingStatusType PQconnectPoll(PGconn *conn); Loop thus: If PQconnectPoll(conn) last returned PGRES_POLLING_READING, wait until the socket is ready to read (as indicated by select(), poll(), or - similar system function). + similar system function). Note that PQsocketPoll + can help reduce boilerplate by abstracting the setup of + select(2) or poll(2) if it is + available on your system. Then call PQconnectPoll(conn) again. Conversely, if PQconnectPoll(conn) last returned PGRES_POLLING_WRITING, wait until the socket is ready diff --git a/src/interfaces/libpq/exports.txt b/src/interfaces/libpq/exports.txt index 9fbd3d34074..1e48d37677d 100644 --- a/src/interfaces/libpq/exports.txt +++ b/src/interfaces/libpq/exports.txt @@ -202,3 +202,4 @@ PQcancelSocket199 PQcancelErrorMessage 200 PQcancelReset 201 PQcancelFinish202 +PQsocketPoll 203 diff --git a/src/interfaces/libpq/fe-misc.c b/src/interfaces/libpq/fe-misc.c index f2fc78a481c..f562cd8d344 100644 --- a/src/interfaces/libpq/fe-misc.c +++ b/src/interfaces/libpq/fe-misc.c @@ -55,7 +55,6 @@ static int pqPutMsgBytes(const void *buf, size_t len, PGconn *conn); static int pqSendSome(PGconn *conn, int len); static int pqSocketCheck(PGconn *conn, int forRead, int forWrite,
Re: psql not responding to SIGINT upon db reconnection
On Wed, Jan 31, 2024 at 1:07 PM Tristan Partin wrote: > I was looking for documentation of PQsocket(), but didn't find any > standalone (unless I completely missed it). So I just copied how > PQsocket() is documented in PQconnectPoll(). I am happy to document it > separately if you think it would be useful. As Jelte said back at the end of January, I think you just completely missed it. The relevant part of libpq.sgml starts like this: PQsocketPQsocket As far as I know, we document all of the exported libpq functions in that SGML file. I think there's no real reason why we couldn't get at least 0001 and maybe also 0002 into this release, but only if you move quickly on this. Feature freeze is approaching rapidly. Modulo the documentation changes, I think 0001 is pretty much ready to go. 0002 needs comments. I'm also not so sure about the name process_connection_state_machine(); it seems a little verbose. How about something like wait_until_connected()? And maybe put it below do_connect() instead of above. -- Robert Haas EDB: http://www.enterprisedb.com
Re: psql not responding to SIGINT upon db reconnection
On Wed, 31 Jan 2024 at 19:07, Tristan Partin wrote: > I was looking for documentation of PQsocket(), but didn't find any > standalone (unless I completely missed it). So I just copied how > PQsocket() is documented in PQconnectPoll(). I am happy to document it > separately if you think it would be useful. PQsocket its documentation is here: https://www.postgresql.org/docs/16/libpq-status.html#LIBPQ-PQSOCKET I think PQsocketPoll should probably have its API documentation (describing arguments and return value at minimum) in this section of the docs: https://www.postgresql.org/docs/16/libpq-async.html And I think the example in the PQconnnectPoll API docs could benefit from having PQsocketPoll used in that example.
Re: psql not responding to SIGINT upon db reconnection
On Tue Jan 30, 2024 at 4:42 PM CST, Jelte Fennema-Nio wrote: On Tue, 30 Jan 2024 at 23:20, Tristan Partin wrote: > Not next week, but here is a respin. I've exposed pqSocketPoll as > PQsocketPoll and am just using that. You can see the diff is so much > smaller, which is great! The exports.txt change should be made part of patch 0001, also docs are missing for the newly exposed function. PQsocketPoll does look like quite a nice API to expose from libpq. I was looking for documentation of PQsocket(), but didn't find any standalone (unless I completely missed it). So I just copied how PQsocket() is documented in PQconnectPoll(). I am happy to document it separately if you think it would be useful. My bad on the exports.txt change being in the wrong commit. Git things... I will fix it on the next re-spin after resolving the previous paragraph. -- Tristan Partin Neon (https://neon.tech)
Re: psql not responding to SIGINT upon db reconnection
On Tue, 30 Jan 2024 at 23:20, Tristan Partin wrote: > Not next week, but here is a respin. I've exposed pqSocketPoll as > PQsocketPoll and am just using that. You can see the diff is so much > smaller, which is great! The exports.txt change should be made part of patch 0001, also docs are missing for the newly exposed function. PQsocketPoll does look like quite a nice API to expose from libpq.
Re: psql not responding to SIGINT upon db reconnection
On Fri Jan 12, 2024 at 11:13 AM CST, Tristan Partin wrote: On Fri Jan 12, 2024 at 10:45 AM CST, Robert Haas wrote: > On Mon, Jan 8, 2024 at 1:03 AM Tristan Partin wrote: > > I think the way to go is to expose some variation of libpq's > > pqSocketPoll(), which I would be happy to put together a patch for. > > Making frontends, psql in this case, have to reimplement the polling > > logic doesn't strike me as fruitful, which is essentially what I have > > done. > > I encourage further exploration of this line of attack. I fear that if > I were to commit something like what you've posted up until now, > people would complain that that code was too ugly to live, and I'd > have a hard time telling them that they're wrong. Completely agree. Let me look into this. Perhaps I can get something up next week or the week after. Not next week, but here is a respin. I've exposed pqSocketPoll as PQsocketPoll and am just using that. You can see the diff is so much smaller, which is great! In order to fight the race condition, I am just using a 1 second timeout instead of trying to integrate pselect or ppoll. We could add a PQsocketPPoll() to support those use cases, but I am not sure how available pselect and ppoll are. I guess on Windows we don't have pselect. I don't think using the pipe trick that Heikki mentioned earlier is suitable to expose via an API in libpq, but someone else might have a different opinion. Maybe this is good enough until someone complains? Most people would probably just chalk any latency between keypress and cancellation as network latency and not a hardcoded 1 second. Thanks for your feedback Robert! -- Tristan Partin Neon (https://neon.tech) From 4fa6db1900a355a171d3e16019d02f3a415764d0 Mon Sep 17 00:00:00 2001 From: Tristan Partin Date: Tue, 30 Jan 2024 15:40:57 -0600 Subject: [PATCH v8 1/2] Expose PQsocketPoll for frontends PQsocketPoll should help with state machine processing when connecting to a database asynchronously via PQconnectStart(). --- doc/src/sgml/libpq.sgml | 5 - src/interfaces/libpq/fe-misc.c | 7 +++ src/interfaces/libpq/libpq-fe.h | 4 3 files changed, 11 insertions(+), 5 deletions(-) diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml index d0d5aefadc..aa26c2cc8d 100644 --- a/doc/src/sgml/libpq.sgml +++ b/doc/src/sgml/libpq.sgml @@ -358,7 +358,10 @@ PostgresPollingStatusType PQconnectPoll(PGconn *conn); Loop thus: If PQconnectPoll(conn) last returned PGRES_POLLING_READING, wait until the socket is ready to read (as indicated by select(), poll(), or - similar system function). + similar system function). Note that PQsocketPoll + can help reduce boilerplate by abstracting the setup of + select(), or poll() if it is + available on your system. Then call PQconnectPoll(conn) again. Conversely, if PQconnectPoll(conn) last returned PGRES_POLLING_WRITING, wait until the socket is ready diff --git a/src/interfaces/libpq/fe-misc.c b/src/interfaces/libpq/fe-misc.c index 47a28b0a3a..ee917d375d 100644 --- a/src/interfaces/libpq/fe-misc.c +++ b/src/interfaces/libpq/fe-misc.c @@ -55,7 +55,6 @@ static int pqPutMsgBytes(const void *buf, size_t len, PGconn *conn); static int pqSendSome(PGconn *conn, int len); static int pqSocketCheck(PGconn *conn, int forRead, int forWrite, time_t end_time); -static int pqSocketPoll(int sock, int forRead, int forWrite, time_t end_time); /* * PQlibVersion: return the libpq version number @@ -1059,7 +1058,7 @@ pqSocketCheck(PGconn *conn, int forRead, int forWrite, time_t end_time) /* We will retry as long as we get EINTR */ do - result = pqSocketPoll(conn->sock, forRead, forWrite, end_time); + result = PQsocketPoll(conn->sock, forRead, forWrite, end_time); while (result < 0 && SOCK_ERRNO == EINTR); if (result < 0) @@ -1083,8 +1082,8 @@ pqSocketCheck(PGconn *conn, int forRead, int forWrite, time_t end_time) * Timeout is infinite if end_time is -1. Timeout is immediate (no blocking) * if end_time is 0 (or indeed, any time before now). */ -static int -pqSocketPoll(int sock, int forRead, int forWrite, time_t end_time) +int +PQsocketPoll(int sock, int forRead, int forWrite, time_t end_time) { /* We use poll(2) if available, otherwise select(2) */ #ifdef HAVE_POLL diff --git a/src/interfaces/libpq/libpq-fe.h b/src/interfaces/libpq/libpq-fe.h index defc415fa3..11a7fd32b6 100644 --- a/src/interfaces/libpq/libpq-fe.h +++ b/src/interfaces/libpq/libpq-fe.h @@ -21,6 +21,7 @@ extern "C" #endif #include +#include /* * postgres_ext.h defines the backend's externally visible types, @@ -644,6 +645,9 @@ extern int lo_export(PGconn *conn, Oid lobjId, const char *filename); /* Get the version of the libpq library in use */ extern int PQlibVersion(void); +/* Poll a file descriptor for reading and/or writing with a timeout */ +extern int PQsocketPoll(int sock, int forRead, int forWrite, t
Re: psql not responding to SIGINT upon db reconnection
On Fri Jan 12, 2024 at 10:45 AM CST, Robert Haas wrote: On Mon, Jan 8, 2024 at 1:03 AM Tristan Partin wrote: > I think the way to go is to expose some variation of libpq's > pqSocketPoll(), which I would be happy to put together a patch for. > Making frontends, psql in this case, have to reimplement the polling > logic doesn't strike me as fruitful, which is essentially what I have > done. I encourage further exploration of this line of attack. I fear that if I were to commit something like what you've posted up until now, people would complain that that code was too ugly to live, and I'd have a hard time telling them that they're wrong. Completely agree. Let me look into this. Perhaps I can get something up next week or the week after. -- Tristan Partin Neon (https://neon.tech)
Re: psql not responding to SIGINT upon db reconnection
On Mon, Jan 8, 2024 at 1:03 AM Tristan Partin wrote: > I think the way to go is to expose some variation of libpq's > pqSocketPoll(), which I would be happy to put together a patch for. > Making frontends, psql in this case, have to reimplement the polling > logic doesn't strike me as fruitful, which is essentially what I have > done. I encourage further exploration of this line of attack. I fear that if I were to commit something like what you've posted up until now, people would complain that that code was too ugly to live, and I'd have a hard time telling them that they're wrong. -- Robert Haas EDB: http://www.enterprisedb.com
Re: psql not responding to SIGINT upon db reconnection
On Fri Jan 5, 2024 at 12:24 PM CST, Robert Haas wrote: On Tue, Dec 5, 2023 at 1:35 PM Tristan Partin wrote: > On Wed Nov 29, 2023 at 11:48 AM CST, Tristan Partin wrote: > > I am not completely in love with the code I have written. Lots of > > conditional compilation which makes it hard to read. Looking forward to > > another round of review to see what y'all think. > > Ok. Here is a patch which just uses select(2) with a timeout of 1s or > pselect(2) if it is available. I also moved the state machine processing > into its own function. Hmm, this adds a function called pqSocketPoll to psql/command.c. But there already is such a function in libpq/fe-misc.c. It's not quite the same, though. Having the same function in two different modules with subtly different definitions seems like it's probably not the right idea. Yep, not tied to the function name. Happy to rename as anyone suggests. Also, this seems awfully complicated for something that's supposed to (checks notes) wait for a file descriptor to become ready for I/O for up to 1 second. It's 160 lines of code in pqSocketPoll and another 50 in the caller. If this is really the simplest way to do this, we really need to rethink our libpq APIs or, uh, something. I wonder if we could make this simpler by, say: - always use select - always use a 1-second timeout - if it returns faster because the race condition doesn't happen, cool - if it doesn't, well then you get to wait for a second, oh well I don't feel strongly that that's the right way to go and if Heikki or some other committer wants to go with this more complex conditional approach, that's fine. But to me it seems like a lot. I think the way to go is to expose some variation of libpq's pqSocketPoll(), which I would be happy to put together a patch for. Making frontends, psql in this case, have to reimplement the polling logic doesn't strike me as fruitful, which is essentially what I have done. Thanks for your input! But also wait a second. In my last email, I said: Ok. Here is a patch which just uses select(2) with a timeout of 1s or pselect(2) if it is available. I also moved the state machine processing into its own function. This is definitely not the patch I meant to send. What the? Here is what I meant to send, but I stand by my comment that we should just expose a variation of pqSocketPoll(). -- Tristan Partin Neon (https://neon.tech) From 5e19b26111708b654c7b49c3b9fd7dc18b94c0e7 Mon Sep 17 00:00:00 2001 From: Tristan Partin Date: Mon, 24 Jul 2023 11:12:59 -0500 Subject: [PATCH v7 1/2] Allow SIGINT to cancel psql database reconnections After installing the SIGINT handler in psql, SIGINT can no longer cancel database reconnections. For instance, if the user starts a reconnection and then needs to do some form of interaction (ie psql is polling), there is no way to cancel the reconnection process currently. Use PQconnectStartParams() in order to insert a CancelRequested check into the polling loop. --- meson.build| 1 + src/bin/psql/command.c | 156 - src/include/pg_config.h.in | 3 + src/tools/msvc/Solution.pm | 1 + 4 files changed, 160 insertions(+), 1 deletion(-) diff --git a/meson.build b/meson.build index ee58ee7a06..2d63485c53 100644 --- a/meson.build +++ b/meson.build @@ -2440,6 +2440,7 @@ func_checks = [ ['posix_fadvise'], ['posix_fallocate'], ['ppoll'], + ['pselect'], ['pstat'], ['pthread_barrier_wait', {'dependencies': [thread_dep]}], ['pthread_is_threaded_np', {'dependencies': [thread_dep]}], diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c index 82cc091568..3a76623b05 100644 --- a/src/bin/psql/command.c +++ b/src/bin/psql/command.c @@ -11,6 +11,7 @@ #include #include #include +#include #ifndef WIN32 #include /* for stat() */ #include /* for setitimer() */ @@ -40,6 +41,7 @@ #include "large_obj.h" #include "libpq-fe.h" #include "libpq/pqcomm.h" +#include "libpq/pqsignal.h" #include "mainloop.h" #include "portability/instr_time.h" #include "pqexpbuffer.h" @@ -3298,6 +3300,157 @@ param_is_newly_set(const char *old_val, const char *new_val) return false; } +/* + * Check a file descriptor for read and/or write data, possibly waiting. + * If neither forRead nor forWrite are set, immediately return a timeout + * condition (without waiting). Return >0 if condition is met, 0 + * if a timeout occurred, -1 if an error or interrupt occurred. + * + * Timeout is infinite if end_time is -1. Timeout is immediate (no blocking) + * if end_time is 0 (or indeed, any time before now). + * + * Uses the select(2) family of functions because it is available on every + * platform. It is unlikely that psql would be holding enough file descriptors + * that would necessitate using poll(2) or ppoll(2) for example. + */ +static int +pqSocketPoll(int sock, int forRead, int forWrite, time_t end_time) +{ + /* + * We use functions in the following order if available: pselec
Re: psql not responding to SIGINT upon db reconnection
On Tue, Dec 5, 2023 at 1:35 PM Tristan Partin wrote: > On Wed Nov 29, 2023 at 11:48 AM CST, Tristan Partin wrote: > > I am not completely in love with the code I have written. Lots of > > conditional compilation which makes it hard to read. Looking forward to > > another round of review to see what y'all think. > > Ok. Here is a patch which just uses select(2) with a timeout of 1s or > pselect(2) if it is available. I also moved the state machine processing > into its own function. Hmm, this adds a function called pqSocketPoll to psql/command.c. But there already is such a function in libpq/fe-misc.c. It's not quite the same, though. Having the same function in two different modules with subtly different definitions seems like it's probably not the right idea. Also, this seems awfully complicated for something that's supposed to (checks notes) wait for a file descriptor to become ready for I/O for up to 1 second. It's 160 lines of code in pqSocketPoll and another 50 in the caller. If this is really the simplest way to do this, we really need to rethink our libpq APIs or, uh, something. I wonder if we could make this simpler by, say: - always use select - always use a 1-second timeout - if it returns faster because the race condition doesn't happen, cool - if it doesn't, well then you get to wait for a second, oh well I don't feel strongly that that's the right way to go and if Heikki or some other committer wants to go with this more complex conditional approach, that's fine. But to me it seems like a lot. -- Robert Haas EDB: http://www.enterprisedb.com
Re: psql not responding to SIGINT upon db reconnection
On Wed Nov 29, 2023 at 11:48 AM CST, Tristan Partin wrote: I am not completely in love with the code I have written. Lots of conditional compilation which makes it hard to read. Looking forward to another round of review to see what y'all think. Ok. Here is a patch which just uses select(2) with a timeout of 1s or pselect(2) if it is available. I also moved the state machine processing into its own function. Thanks for your comments thus far. -- Tristan Partin Neon (https://neon.tech) From b22f286d3733d6d5ec2a924682679f6884b3600c Mon Sep 17 00:00:00 2001 From: Tristan Partin Date: Mon, 24 Jul 2023 11:12:59 -0500 Subject: [PATCH v6] Allow SIGINT to cancel psql database reconnections After installing the SIGINT handler in psql, SIGINT can no longer cancel database reconnections. For instance, if the user starts a reconnection and then needs to do some form of interaction (ie psql is polling), there is no way to cancel the reconnection process currently. Use PQconnectStartParams() in order to insert a CancelRequested check into the polling loop. --- meson.build| 1 + src/bin/psql/command.c | 224 - src/include/pg_config.h.in | 3 + src/tools/msvc/Solution.pm | 1 + 4 files changed, 228 insertions(+), 1 deletion(-) diff --git a/meson.build b/meson.build index ee58ee7a06..2d63485c53 100644 --- a/meson.build +++ b/meson.build @@ -2440,6 +2440,7 @@ func_checks = [ ['posix_fadvise'], ['posix_fallocate'], ['ppoll'], + ['pselect'], ['pstat'], ['pthread_barrier_wait', {'dependencies': [thread_dep]}], ['pthread_is_threaded_np', {'dependencies': [thread_dep]}], diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c index 82cc091568..e87401b790 100644 --- a/src/bin/psql/command.c +++ b/src/bin/psql/command.c @@ -11,6 +11,11 @@ #include #include #include +#if HAVE_POLL +#include +#else +#include +#endif #ifndef WIN32 #include /* for stat() */ #include /* for setitimer() */ @@ -40,6 +45,7 @@ #include "large_obj.h" #include "libpq-fe.h" #include "libpq/pqcomm.h" +#include "libpq/pqsignal.h" #include "mainloop.h" #include "portability/instr_time.h" #include "pqexpbuffer.h" @@ -3251,6 +3257,169 @@ copy_previous_query(PQExpBuffer query_buf, PQExpBuffer previous_buf) return false; } +/* + * Check a file descriptor for read and/or write data, possibly waiting. + * If neither forRead nor forWrite are set, immediately return a timeout + * condition (without waiting). Return >0 if condition is met, 0 + * if a timeout occurred, -1 if an error or interrupt occurred. + * + * Timeout is infinite if end_time is -1. Timeout is immediate (no blocking) + * if end_time is 0 (or indeed, any time before now). + */ +static int +pqSocketPoll(int sock, int forRead, int forWrite, time_t end_time) +{ + /* + * We use functions in the following order if available: + * - ppoll(2) OR pselect(2) + * - poll(2) OR select(2) + */ +#ifdef HAVE_POLL + struct pollfd input_fd; +#ifdef HAVE_PPOLL + int rc; + sigset_t emptyset; + sigset_t blockset; + sigset_t origset; + struct timespec timeout; + struct timespec *ptr_timeout; +#else + int timeout_ms; +#endif + + if (!forRead && !forWrite) + return 0; + + input_fd.fd = sock; + input_fd.events = POLLERR; + input_fd.revents = 0; + + if (forRead) + input_fd.events |= POLLIN; + if (forWrite) + input_fd.events |= POLLOUT; + + /* Compute appropriate timeout interval */ +#ifdef HAVE_PPOLL + sigemptyset(&blockset); + sigaddset(&blockset, SIGINT); + sigprocmask(SIG_BLOCK, &blockset, &origset); + + if (end_time == ((time_t) -1)) + ptr_timeout = NULL; + else + { + timeout.tv_sec = end_time; + timeout.tv_nsec = 0; + ptr_timeout = &timeout; + } +#else + if (end_time == ((time_t) -1)) + timeout_ms = -1; + else + { + time_t now = time(NULL); + + if (end_time > now) + timeout_ms = (end_time - now) * 1000; + else + timeout_ms = 0; + } +#endif + +#ifdef HAVE_PPOLL + sigemptyset(&emptyset); + if (cancel_pressed) + { + sigprocmask(SIG_SETMASK, &origset, NULL); + return 1; + } + + rc = ppoll(&input_fd, 1, ptr_timeout, &emptyset); + sigprocmask(SIG_SETMASK, &origset, NULL); + return rc; +#else + return poll(&input_fd, 1, timeout_ms); +#endif +#else /* !HAVE_POLL */ + + fd_set input_mask; + fd_set output_mask; + fd_set except_mask; +#ifdef HAVE_PSELECT + int rc; + sigset_t emptyset; + sigset_t blockset; + sigset_t origset; + struct timespec timeout; + struct timespec *ptr_timeout; +#else + struct timeval timeout; + struct timeval *ptr_timeout; +#endif + + if (!forRead && !forWrite) + return 0; + + FD_ZERO(&input_mask); + FD_ZERO(&output_mask); + FD_ZERO(&except_mask); + if (forRead) + FD_SET(sock, &input_mask); + + if (forWrite) + FD_SET(sock, &output_mask); + FD_SET(sock, &except_mask); + + /* Compute appropriate timeout interval */ +#ifdef HAVE_PSELECT + sigemptyset(&blockset); + sigaddset(&blockset, SIGINT); + sigprocmask(SIG_BLOCK, &blockset, &origset); + +
Re: psql not responding to SIGINT upon db reconnection
On Thu Nov 23, 2023 at 3:19 AM CST, Heikki Linnakangas wrote: On 22/11/2023 23:29, Tristan Partin wrote: > Ha, you're right. I had this working yesterday, but convinced myself it > didn't. I had a do while loop wrapping the blocking call. Here is a v4, > which seems to pass the tests that were pointed out to be failing > earlier. Thanks! This suffers from a classic race condition: > + if (cancel_pressed) > + break; > + > + sock = PQsocket(n_conn); > + if (sock == -1) > + break; > + > + rc = pqSocketPoll(sock, for_read, !for_read, (time_t)-1); > + Assert(rc != 0); /* Timeout is impossible. */ > + if (rc == -1) > + { > + success = false; > + break; > + } If a signal arrives between the "if (cancel_pressed)" check and pqSocketPoll(), pgSocketPoll will "miss" the signal and block indefinitely. There are three solutions to this: 1. Use a timeout, so that you wake up periodically to check for any missed signals. Easy solution, the downsides are that you will not react as quickly if the signal is missed, and you waste some cycles by waking up unnecessarily. 2. The Self-pipe trick: https://cr.yp.to/docs/selfpipe.html. We also use that in src/backend/storage/ipc/latch.c. It's portable, but somewhat complicated. 3. Use pselect() or ppoll(). They were created specifically to address this problem. Not fully portable, I think it's missing on Windows at least. Maybe use pselect() if it's available, and select() with a timeout if it's not. First I have ever heard of this race condition, and now I will commit it to durable storage :). I went ahead and did option #3 that you proposed. On Windows, I have a 1 second timeout for the select/poll. That seemed like a good balance of user experience and spinning the CPU. But I am open to other values. I don't have a Windows VM, but maybe I should set one up... I am not completely in love with the code I have written. Lots of conditional compilation which makes it hard to read. Looking forward to another round of review to see what y'all think. For what it's worth, I did try #2, but since psql installs its own SIGINT handler, the code became really hairy. -- Tristan Partin Neon (https://neon.tech) From 4fdccf9211ac78bbd7430488d515646f9e9cce9b Mon Sep 17 00:00:00 2001 From: Tristan Partin Date: Mon, 24 Jul 2023 11:12:59 -0500 Subject: [PATCH v5] Allow SIGINT to cancel psql database reconnections After installing the SIGINT handler in psql, SIGINT can no longer cancel database reconnections. For instance, if the user starts a reconnection and then needs to do some form of interaction (ie psql is polling), there is no way to cancel the reconnection process currently. Use PQconnectStartParams() in order to insert a CancelRequested check into the polling loop. --- meson.build| 1 + src/bin/psql/command.c | 222 - src/include/pg_config.h.in | 3 + src/tools/msvc/Solution.pm | 1 + 4 files changed, 226 insertions(+), 1 deletion(-) diff --git a/meson.build b/meson.build index ee58ee7a06..2d63485c53 100644 --- a/meson.build +++ b/meson.build @@ -2440,6 +2440,7 @@ func_checks = [ ['posix_fadvise'], ['posix_fallocate'], ['ppoll'], + ['pselect'], ['pstat'], ['pthread_barrier_wait', {'dependencies': [thread_dep]}], ['pthread_is_threaded_np', {'dependencies': [thread_dep]}], diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c index 82cc091568..c325fedb51 100644 --- a/src/bin/psql/command.c +++ b/src/bin/psql/command.c @@ -11,6 +11,11 @@ #include #include #include +#if HAVE_POLL +#include +#else +#include +#endif #ifndef WIN32 #include /* for stat() */ #include /* for setitimer() */ @@ -40,6 +45,7 @@ #include "large_obj.h" #include "libpq-fe.h" #include "libpq/pqcomm.h" +#include "libpq/pqsignal.h" #include "mainloop.h" #include "portability/instr_time.h" #include "pqexpbuffer.h" @@ -3251,6 +3257,169 @@ copy_previous_query(PQExpBuffer query_buf, PQExpBuffer previous_buf) return false; } +/* + * Check a file descriptor for read and/or write data, possibly waiting. + * If neither forRead nor forWrite are set, immediately return a timeout + * condition (without waiting). Return >0 if condition is met, 0 + * if a timeout occurred, -1 if an error or interrupt occurred. + * + * Timeout is infinite if end_time is -1. Timeout is immediate (no blocking) + * if end_time is 0 (or indeed, any time before now). + */ +static int +pqSocketPoll(int sock, int forRead, int forWrite, time_t end_time) +{ + /* + * We use functions in the following order if available: + * - ppoll(2) OR pselect(2) + * - poll(2) OR select(2) + */ +#ifdef HAVE_POLL + struct pollfd input_fd; +#ifdef HAVE_PPOLL + intrc; + sigset_t emp
Re: psql not responding to SIGINT upon db reconnection
On 22/11/2023 23:29, Tristan Partin wrote: Ha, you're right. I had this working yesterday, but convinced myself it didn't. I had a do while loop wrapping the blocking call. Here is a v4, which seems to pass the tests that were pointed out to be failing earlier. Thanks! This suffers from a classic race condition: + if (cancel_pressed) + break; + + sock = PQsocket(n_conn); + if (sock == -1) + break; + + rc = pqSocketPoll(sock, for_read, !for_read, (time_t)-1); + Assert(rc != 0); /* Timeout is impossible. */ + if (rc == -1) + { + success = false; + break; + } If a signal arrives between the "if (cancel_pressed)" check and pqSocketPoll(), pgSocketPoll will "miss" the signal and block indefinitely. There are three solutions to this: 1. Use a timeout, so that you wake up periodically to check for any missed signals. Easy solution, the downsides are that you will not react as quickly if the signal is missed, and you waste some cycles by waking up unnecessarily. 2. The Self-pipe trick: https://cr.yp.to/docs/selfpipe.html. We also use that in src/backend/storage/ipc/latch.c. It's portable, but somewhat complicated. 3. Use pselect() or ppoll(). They were created specifically to address this problem. Not fully portable, I think it's missing on Windows at least. Maybe use pselect() if it's available, and select() with a timeout if it's not. -- Heikki Linnakangas Neon (https://neon.tech)
Re: psql not responding to SIGINT upon db reconnection
On Wed Nov 22, 2023 at 3:00 PM CST, Heikki Linnakangas wrote: On 22/11/2023 19:29, Tristan Partin wrote: > On Thu Nov 16, 2023 at 8:33 AM CST, Heikki Linnakangas wrote: >> On 06/11/2023 19:16, Tristan Partin wrote: > That sounds like a much better solution. Attached you will find a v4 > that implements your suggestion. Please let me know if there is > something that I missed. I can confirm that the patch works. >> >> This patch is missing a select(). It will busy loop until the connection >> is established or cancelled. > > If I add a wait (select, poll, etc.), then I can't control-C during the > blocking call, so it doesn't really solve the problem. Hmm, they should return with EINTR on signal. At least on Linux; I'm not sure how portable that is. See signal(7) man page, section "Interruption of system calls and library functions by signal handlers". You could also use a timeout like 5 s to ensure that you wake up and notice that the signal was received eventually, even if it doesn't interrupt the blocking call. Ha, you're right. I had this working yesterday, but convinced myself it didn't. I had a do while loop wrapping the blocking call. Here is a v4, which seems to pass the tests that were pointed out to be failing earlier. Noticed that I copy-pasted pqSocketPoll() into the psql code. I think this may be controversial. Not sure what the best solution to the issue is. I will pay attention to the buildfarm animals when they pick this up. -- Tristan Partin Neon (https://neon.tech) From 1b4303df1200c561cf478f9c7037ff940f6cd741 Mon Sep 17 00:00:00 2001 From: Tristan Partin Date: Mon, 24 Jul 2023 11:12:59 -0500 Subject: [PATCH v4] Allow SIGINT to cancel psql database reconnections After installing the SIGINT handler in psql, SIGINT can no longer cancel database reconnections. For instance, if the user starts a reconnection and then needs to do some form of interaction (ie psql is polling), there is no way to cancel the reconnection process currently. Use PQconnectStartParams() in order to insert a CancelRequested check into the polling loop. --- src/bin/psql/command.c | 129 - 1 file changed, 128 insertions(+), 1 deletion(-) diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c index 82cc091568..588eed9351 100644 --- a/src/bin/psql/command.c +++ b/src/bin/psql/command.c @@ -11,6 +11,11 @@ #include #include #include +#if HAVE_POLL +#include +#else +#include +#endif #ifndef WIN32 #include /* for stat() */ #include /* for setitimer() */ @@ -40,6 +45,7 @@ #include "large_obj.h" #include "libpq-fe.h" #include "libpq/pqcomm.h" +#include "libpq/pqsignal.h" #include "mainloop.h" #include "portability/instr_time.h" #include "pqexpbuffer.h" @@ -3251,6 +3257,90 @@ copy_previous_query(PQExpBuffer query_buf, PQExpBuffer previous_buf) return false; } +/* + * Check a file descriptor for read and/or write data, possibly waiting. + * If neither forRead nor forWrite are set, immediately return a timeout + * condition (without waiting). Return >0 if condition is met, 0 + * if a timeout occurred, -1 if an error or interrupt occurred. + * + * Timeout is infinite if end_time is -1. Timeout is immediate (no blocking) + * if end_time is 0 (or indeed, any time before now). + */ +static int +pqSocketPoll(int sock, int forRead, int forWrite, time_t end_time) +{ + /* We use poll(2) if available, otherwise select(2) */ +#ifdef HAVE_POLL + struct pollfd input_fd; + int timeout_ms; + + if (!forRead && !forWrite) + return 0; + + input_fd.fd = sock; + input_fd.events = POLLERR; + input_fd.revents = 0; + + if (forRead) + input_fd.events |= POLLIN; + if (forWrite) + input_fd.events |= POLLOUT; + + /* Compute appropriate timeout interval */ + if (end_time == ((time_t) -1)) + timeout_ms = -1; + else + { + time_t now = time(NULL); + + if (end_time > now) + timeout_ms = (end_time - now) * 1000; + else + timeout_ms = 0; + } + + return poll(&input_fd, 1, timeout_ms); +#else /* !HAVE_POLL */ + + fd_set input_mask; + fd_set output_mask; + fd_set except_mask; + struct timeval timeout; + struct timeval *ptr_timeout; + + if (!forRead && !forWrite) + return 0; + + FD_ZERO(&input_mask); + FD_ZERO(&output_mask); + FD_ZERO(&except_mask); + if (forRead) + FD_SET(sock, &input_mask); + + if (forWrite) + FD_SET(sock, &output_mask); + FD_SET(sock, &except_mask); + + /* Compute appropriate timeout interval */ + if (end_time == ((time_t) -1)) + ptr_timeout = NULL; + else + { + time_t now = time(NULL); + + if (end_time > now) + timeout.tv_sec = end_time - now; + else + timeout.tv_sec = 0; + timeout.tv_usec = 0; + ptr_timeout = &timeout; + } + + return select(sock + 1, &input_mask, &output_mask, + &except_mask, ptr_timeout); +#endif /* HAVE_POLL */ +} + /* * Ask the user for a password; 'username' is the username the * password is for, if one has been explicitly specified. @@ -3324,6 +3414,7 @@ do_co
Re: psql not responding to SIGINT upon db reconnection
On 22/11/2023 19:29, Tristan Partin wrote: On Thu Nov 16, 2023 at 8:33 AM CST, Heikki Linnakangas wrote: On 06/11/2023 19:16, Tristan Partin wrote: That sounds like a much better solution. Attached you will find a v4 that implements your suggestion. Please let me know if there is something that I missed. I can confirm that the patch works. This patch is missing a select(). It will busy loop until the connection is established or cancelled. If I add a wait (select, poll, etc.), then I can't control-C during the blocking call, so it doesn't really solve the problem. Hmm, they should return with EINTR on signal. At least on Linux; I'm not sure how portable that is. See signal(7) man page, section "Interruption of system calls and library functions by signal handlers". You could also use a timeout like 5 s to ensure that you wake up and notice that the signal was received eventually, even if it doesn't interrupt the blocking call. -- Heikki Linnakangas Neon (https://neon.tech)
Re: psql not responding to SIGINT upon db reconnection
On Thu Nov 16, 2023 at 8:33 AM CST, Heikki Linnakangas wrote: On 06/11/2023 19:16, Tristan Partin wrote: >>> That sounds like a much better solution. Attached you will find a v4 >>> that implements your suggestion. Please let me know if there is >>> something that I missed. I can confirm that the patch works. This patch is missing a select(). It will busy loop until the connection is established or cancelled. If I add a wait (select, poll, etc.), then I can't control-C during the blocking call, so it doesn't really solve the problem. On Linux, we have signalfds which seem like the perfect solution to this problem, "wait on the pq socket or SIGINT." But that doesn't translate well to other operating systems :(. tristan957=> \c NOTICE: Welcome to Neon! Authenticate by visiting: https://console.neon.tech/psql_session/ ^CTerminated You can see here that I can't terminate the command. Where you see "Terminated" is me running `kill $(pgrep psql)` in another terminal. Shouldn't we also clear CancelRequested after we have cancelled the attempt? Otherwise, any subsequent attempts will immediately fail too. After switching to cancel_pressed, I don't think so. See this bit of code in the psql main loop: /* main loop to get queries and execute them */ while (successResult == EXIT_SUCCESS) { /* * Clean up after a previous Control-C */ if (cancel_pressed) { if (!pset.cur_cmd_interactive) { /* * You get here if you stopped a script with Ctrl-C. */ successResult = EXIT_USER; break; } cancel_pressed = false; } Should we use 'cancel_pressed' here rather than CancelRequested? To be honest, I don't understand the difference, so that's a genuine question. There was an attempt at unifying them in the past but it was reverted in commit 5d43c3c54d. The more I look at this, the more I don't understand... I need to look at this harder, but wanted to get this email out. Switched to cancel_pressed though. Thanks for pointing it out. Going to wait to send a v2 while more discussion occurs. -- Tristan Partin Neon (https://neon.tech)
Re: psql not responding to SIGINT upon db reconnection
On 06/11/2023 19:16, Tristan Partin wrote: That sounds like a much better solution. Attached you will find a v4 that implements your suggestion. Please let me know if there is something that I missed. I can confirm that the patch works. This patch is missing a select(). It will busy loop until the connection is established or cancelled. Shouldn't we also clear CancelRequested after we have cancelled the attempt? Otherwise, any subsequent attempts will immediately fail too. Should we use 'cancel_pressed' here rather than CancelRequested? To be honest, I don't understand the difference, so that's a genuine question. There was an attempt at unifying them in the past but it was reverted in commit 5d43c3c54d. -- Heikki Linnakangas Neon (https://neon.tech)
Re: psql not responding to SIGINT upon db reconnection
On Thu Nov 2, 2023 at 4:03 AM CDT, Shlok Kyal wrote: Hi, > That sounds like a much better solution. Attached you will find a v4 > that implements your suggestion. Please let me know if there is > something that I missed. I can confirm that the patch works. > > $ ./build/src/bin/psql/psql -h pg.neon.tech > NOTICE: Welcome to Neon! > Authenticate by visiting: > https://console.neon.tech/psql_session/xxx > > > NOTICE: Connecting to database. > psql (17devel, server 15.3) > SSL connection (protocol: TLSv1.3, cipher: TLS_AES_256_GCM_SHA384, compression: off) > Type "help" for help. > > tristan957=> \c > NOTICE: Welcome to Neon! > Authenticate by visiting: > https://console.neon.tech/psql_session/yyy > > > ^Cconnection to server at "pg.neon.tech" (3.18.6.96), port 5432 failed: > Previous connection kept > tristan957=> I went through CFbot and found out that some tests are timing out. Links: https://cirrus-ci.com/task/6735437444677632 https://cirrus-ci.com/task/4536414189125632 https://cirrus-ci.com/task/5046587584413696 https://cirrus-ci.com/task/6172487491256320 https://cirrus-ci.com/task/5609537537835008 Some tests which are timing out are as follows: [00:48:49.243] Summary of Failures: [00:48:49.243] [00:48:49.243] 4/270 postgresql:regress / regress/regress TIMEOUT 1000.01s [00:48:49.243] 6/270 postgresql:pg_upgrade / pg_upgrade/002_pg_upgrade TIMEOUT 1000.02s [00:48:49.243] 34/270 postgresql:recovery / recovery/027_stream_regress TIMEOUT 1000.02s [00:48:49.243] 48/270 postgresql:plpgsql / plpgsql/regress TIMEOUT 1000.02s [00:48:49.243] 49/270 postgresql:plperl / plperl/regress TIMEOUT 1000.01s [00:48:49.243] 61/270 postgresql:dblink / dblink/regress TIMEOUT 1000.03s [00:48:49.243] 89/270 postgresql:postgres_fdw / postgres_fdw/regress TIMEOUT 1000.01s [00:48:49.243] 93/270 postgresql:test_decoding / test_decoding/regress TIMEOUT 1000.02s [00:48:49.243] 110/270 postgresql:test_extensions / test_extensions/regress TIMEOUT 1000.03s [00:48:49.243] 123/270 postgresql:unsafe_tests / unsafe_tests/regress TIMEOUT 1000.02s [00:48:49.243] 152/270 postgresql:pg_dump / pg_dump/010_dump_connstr TIMEOUT 1000.03s Just want to make sure you are aware of the issue. Hi Shlok! Thanks for taking a look. I will check these failures out to see if I can reproduce. -- Tristan Partin Neon (https://neon.tech)
Re: psql not responding to SIGINT upon db reconnection
Hi, > That sounds like a much better solution. Attached you will find a v4 > that implements your suggestion. Please let me know if there is > something that I missed. I can confirm that the patch works. > > $ ./build/src/bin/psql/psql -h pg.neon.tech > NOTICE: Welcome to Neon! > Authenticate by visiting: > https://console.neon.tech/psql_session/xxx > > > NOTICE: Connecting to database. > psql (17devel, server 15.3) > SSL connection (protocol: TLSv1.3, cipher: TLS_AES_256_GCM_SHA384, > compression: off) > Type "help" for help. > > tristan957=> \c > NOTICE: Welcome to Neon! > Authenticate by visiting: > https://console.neon.tech/psql_session/yyy > > > ^Cconnection to server at "pg.neon.tech" (3.18.6.96), port 5432 > failed: > Previous connection kept > tristan957=> I went through CFbot and found out that some tests are timing out. Links: https://cirrus-ci.com/task/6735437444677632 https://cirrus-ci.com/task/4536414189125632 https://cirrus-ci.com/task/5046587584413696 https://cirrus-ci.com/task/6172487491256320 https://cirrus-ci.com/task/5609537537835008 Some tests which are timing out are as follows: [00:48:49.243] Summary of Failures: [00:48:49.243] [00:48:49.243] 4/270 postgresql:regress / regress/regress TIMEOUT 1000.01s [00:48:49.243] 6/270 postgresql:pg_upgrade / pg_upgrade/002_pg_upgrade TIMEOUT 1000.02s [00:48:49.243] 34/270 postgresql:recovery / recovery/027_stream_regress TIMEOUT 1000.02s [00:48:49.243] 48/270 postgresql:plpgsql / plpgsql/regress TIMEOUT 1000.02s [00:48:49.243] 49/270 postgresql:plperl / plperl/regress TIMEOUT 1000.01s [00:48:49.243] 61/270 postgresql:dblink / dblink/regress TIMEOUT 1000.03s [00:48:49.243] 89/270 postgresql:postgres_fdw / postgres_fdw/regress TIMEOUT 1000.01s [00:48:49.243] 93/270 postgresql:test_decoding / test_decoding/regress TIMEOUT 1000.02s [00:48:49.243] 110/270 postgresql:test_extensions / test_extensions/regress TIMEOUT 1000.03s [00:48:49.243] 123/270 postgresql:unsafe_tests / unsafe_tests/regress TIMEOUT 1000.02s [00:48:49.243] 152/270 postgresql:pg_dump / pg_dump/010_dump_connstr TIMEOUT 1000.03s Just want to make sure you are aware of the issue. Thanks Shlok Kumar Kyal
Re: psql not responding to SIGINT upon db reconnection
On Mon Jul 24, 2023 at 12:43 PM CDT, Tom Lane wrote: "Tristan Partin" writes: > v3 is attached which fixes up some code comments I added which I hadn't > attached to the commit already, sigh. I don't care for this patch at all. You're bypassing the pqsignal abstraction layer that the rest of psql goes through, and the behavior you're implementing isn't very nice. People do not expect ^C to kill psql - it should just stop the \c attempt and leave you as you were. Admittedly, getting PQconnectdbParams to return control on SIGINT isn't too practical. But you could probably replace that with a loop around PQconnectPoll and test for CancelRequested in the loop. That sounds like a much better solution. Attached you will find a v4 that implements your suggestion. Please let me know if there is something that I missed. I can confirm that the patch works. $ ./build/src/bin/psql/psql -h pg.neon.tech NOTICE: Welcome to Neon! Authenticate by visiting: https://console.neon.tech/psql_session/xxx NOTICE: Connecting to database. psql (17devel, server 15.3) SSL connection (protocol: TLSv1.3, cipher: TLS_AES_256_GCM_SHA384, compression: off) Type "help" for help. tristan957=> \c NOTICE: Welcome to Neon! Authenticate by visiting: https://console.neon.tech/psql_session/yyy ^Cconnection to server at "pg.neon.tech" (3.18.6.96), port 5432 failed: Previous connection kept tristan957=> -- Tristan Partin Neon (https://neon.tech) From 73a95bc537f205cdac567f3f1860b08cf8323e17 Mon Sep 17 00:00:00 2001 From: Tristan Partin Date: Mon, 24 Jul 2023 11:12:59 -0500 Subject: [PATCH v4] Allow SIGINT to cancel psql database reconnections After installing the SIGINT handler in psql, SIGINT can no longer cancel database reconnections. For instance, if the user starts a reconnection and then needs to do some form of interaction (ie psql is polling), there is no way to cancel the reconnection process currently. Use PQconnectStartParams() in order to insert a CancelRequested check into the polling loop. --- src/bin/psql/command.c | 25 - 1 file changed, 24 insertions(+), 1 deletion(-) diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c index 6733f008fd..bfecc25801 100644 --- a/src/bin/psql/command.c +++ b/src/bin/psql/command.c @@ -40,6 +40,7 @@ #include "large_obj.h" #include "libpq-fe.h" #include "libpq/pqcomm.h" +#include "libpq/pqsignal.h" #include "mainloop.h" #include "portability/instr_time.h" #include "pqexpbuffer.h" @@ -3577,11 +3578,33 @@ do_connect(enum trivalue reuse_previous_specification, values[paramnum] = NULL; /* Note we do not want libpq to re-expand the dbname parameter */ - n_conn = PQconnectdbParams(keywords, values, false); + n_conn = PQconnectStartParams(keywords, values, false); pg_free(keywords); pg_free(values); + while (true) { + int socket; + + if (CancelRequested) +break; + + socket = PQsocket(n_conn); + if (socket == -1) +break; + + switch (PQconnectPoll(n_conn)) { + case PGRES_POLLING_OK: + case PGRES_POLLING_FAILED: +break; + case PGRES_POLLING_READING: + case PGRES_POLLING_WRITING: +continue; + case PGRES_POLLING_ACTIVE: +pg_unreachable(); + } + } + if (PQstatus(n_conn) == CONNECTION_OK) break; -- Tristan Partin Neon (https://neon.tech)
Re: psql not responding to SIGINT upon db reconnection
"Tristan Partin" writes: > v3 is attached which fixes up some code comments I added which I hadn't > attached to the commit already, sigh. I don't care for this patch at all. You're bypassing the pqsignal abstraction layer that the rest of psql goes through, and the behavior you're implementing isn't very nice. People do not expect ^C to kill psql - it should just stop the \c attempt and leave you as you were. Admittedly, getting PQconnectdbParams to return control on SIGINT isn't too practical. But you could probably replace that with a loop around PQconnectPoll and test for CancelRequested in the loop. regards, tom lane
Re: psql not responding to SIGINT upon db reconnection
v3 is attached which fixes up some code comments I added which I hadn't attached to the commit already, sigh. -- Tristan Partin Neon (https://neon.tech) From 7f9554944911c77aa1a1900537a91e1e7bd75d93 Mon Sep 17 00:00:00 2001 From: Tristan Partin Date: Mon, 24 Jul 2023 11:12:59 -0500 Subject: [PATCH v3] Allow SIGINT to cancel psql database reconnections After installing the SIGINT handler in psql, SIGINT can no longer cancel database reconnections. For instance, if the user starts a reconnection and then needs to do some form of interaction (ie psql is polling), there is no way to cancel the reconnection process currently. Restore the default SIGINT handler while polling in PQconnectdbParams(), and put the custom handler back afterward. --- src/bin/psql/command.c | 19 +++ 1 file changed, 19 insertions(+) diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c index 6733f008fd..11dedbb4c6 100644 --- a/src/bin/psql/command.c +++ b/src/bin/psql/command.c @@ -40,6 +40,7 @@ #include "large_obj.h" #include "libpq-fe.h" #include "libpq/pqcomm.h" +#include "libpq/pqsignal.h" #include "mainloop.h" #include "portability/instr_time.h" #include "pqexpbuffer.h" @@ -3530,8 +3531,11 @@ do_connect(enum trivalue reuse_previous_specification, { const char **keywords = pg_malloc((nconnopts + 1) * sizeof(*keywords)); const char **values = pg_malloc((nconnopts + 1) * sizeof(*values)); + int rc; int paramnum = 0; PQconninfoOption *ci; + struct sigaction oldact; + struct sigaction newact = { 0 }; /* * Copy non-default settings into the PQconnectdbParams parameter @@ -3576,9 +3580,24 @@ do_connect(enum trivalue reuse_previous_specification, keywords[paramnum] = NULL; values[paramnum] = NULL; + /* + * Restore the default SIGINT behavior while within libpq. Otherwise, + * SIGINT can never exit from polling for the database connection. + * Failure to restore the default is non-fatal. + */ + newact.sa_handler = SIG_DFL; + rc = sigaction(SIGINT, &newact, &oldact); + /* Note we do not want libpq to re-expand the dbname parameter */ n_conn = PQconnectdbParams(keywords, values, false); + /* + * Only attempt to restore the old handler if we successfully overrode + * it initially. + */ + if (rc == 0) + sigaction(SIGINT, &oldact, NULL); + pg_free(keywords); pg_free(values); -- Tristan Partin Neon (https://neon.tech)
Re: psql not responding to SIGINT upon db reconnection
v2 is attached which fixes a grammatical issue in a comment. -- Tristan Partin Neon (https://neon.tech) From b9ccfc3c84a25b8616fd40495954bb6f77788e28 Mon Sep 17 00:00:00 2001 From: Tristan Partin Date: Mon, 24 Jul 2023 11:12:59 -0500 Subject: [PATCH v2] Allow SIGINT to cancel psql database reconnections After installing the SIGINT handler in psql, SIGINT can no longer cancel database reconnections. For instance, if the user starts a reconnection and then needs to do some form of interaction (ie psql is polling), there is no way to cancel the reconnection process currently. Restore the default SIGINT handler while polling in PQconnectdbParams(), and put the custom handler back afterward. --- src/bin/psql/command.c | 19 +++ 1 file changed, 19 insertions(+) diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c index 6733f008fd..e40413a229 100644 --- a/src/bin/psql/command.c +++ b/src/bin/psql/command.c @@ -40,6 +40,7 @@ #include "large_obj.h" #include "libpq-fe.h" #include "libpq/pqcomm.h" +#include "libpq/pqsignal.h" #include "mainloop.h" #include "portability/instr_time.h" #include "pqexpbuffer.h" @@ -3530,8 +3531,11 @@ do_connect(enum trivalue reuse_previous_specification, { const char **keywords = pg_malloc((nconnopts + 1) * sizeof(*keywords)); const char **values = pg_malloc((nconnopts + 1) * sizeof(*values)); + int rc; int paramnum = 0; PQconninfoOption *ci; + struct sigaction oldact; + struct sigaction newact = { 0 }; /* * Copy non-default settings into the PQconnectdbParams parameter @@ -3576,9 +3580,24 @@ do_connect(enum trivalue reuse_previous_specification, keywords[paramnum] = NULL; values[paramnum] = NULL; + /* + * Restore the default SIGINT behavior while within libpq. Otherwise, we + * can never exit from polling for the database connection. Failure to + * restore is non-fatal. + */ + newact.sa_handler = SIG_DFL; + rc = sigaction(SIGINT, &newact, &oldact); + /* Note we do not want libpq to re-expand the dbname parameter */ n_conn = PQconnectdbParams(keywords, values, false); + /* + * Only attempt to restore the old handler if we successfully overwrote + * it initially. + */ + if (rc == 0) + sigaction(SIGINT, &oldact, NULL); + pg_free(keywords); pg_free(values); -- Tristan Partin Neon (https://neon.tech)
Re: psql not responding to SIGINT upon db reconnection
On Mon Jul 24, 2023 at 12:00 PM CDT, Gurjeet Singh wrote: On Mon, Jul 24, 2023 at 9:26 AM Tristan Partin wrote: > attached patch +/* + * Restore the default SIGINT behavior while within libpq. Otherwise, we + * can never exit from polling for the database connection. Failure to + * restore is non-fatal. + */ +newact.sa_handler = SIG_DFL; +rc = sigaction(SIGINT, &newact, &oldact); There's no action taken if rc != 0. It doesn't seem right to continue as if everything's fine when the handler registration fails. At least a warning is warranted, so that the user reports such failures to the community. If we fail to go back to the default handler, then we just get the behavior we currently have. I am not sure logging a message like "Failed to restore default SIGINT handler" is that useful to the user. It isn't actionable, and at the end of the day isn't going to affect them for the most part. They also aren't even aware that default handler was ever overridden in the first place. I'm more than happy to add a debug log if it is the blocker to getting this patch accepted however. -- Tristan Partin Neon (https://neon.tech)
Re: psql not responding to SIGINT upon db reconnection
On Mon, Jul 24, 2023 at 9:26 AM Tristan Partin wrote: > attached patch +/* + * Restore the default SIGINT behavior while within libpq. Otherwise, we + * can never exit from polling for the database connection. Failure to + * restore is non-fatal. + */ +newact.sa_handler = SIG_DFL; +rc = sigaction(SIGINT, &newact, &oldact); There's no action taken if rc != 0. It doesn't seem right to continue as if everything's fine when the handler registration fails. At least a warning is warranted, so that the user reports such failures to the community. Best regards, Gurjeet http://Gurje.et