Re: [HACKERS] Escaping from blocked send() reprised.
On 2015-01-12 00:40:50 +0100, Andres Freund wrote: Fixed in what I've since pushed (as Heikki basically was ok with the patch sent a couple months back, modulo some fixes)... I'd not actually pushed that patch... I had pushed some patches (barriers, atomics), but had decided to hold off on this. I've now done so. I've mentioned the portability concerns over select() bugs in the commit message a comment. ATM I'm not inclined to add a relatively elaborate test for the bug on pretty fringe platforms. Thanks for looking at this! I plan to continue with committing 1) Commonalize process startup code 2) Add a default local latch for use in signal handlers 3) Use a nonblocking socket for FE/BE communication and block using latches pretty soon. As we already seem to assume that WaitLatch() is signal safe/reentrant (c.f. walsender.c), I'm fine with committing 3) in isolation, without 4). I need a test that properly exercises catchup interrupts before committing that. Once I have that test I plan to commit 4) Introduce and use infrastructure for interrupt processing during client reads. I'd like some input from others what they think about the problem that 5) Process 'die' interrupts while reading/writing from a socket. can reduce the likelihood of clients getting the error message. I personally think that's more than outweighed by not having backends stuck (save quickdie) for a long time when the client is gone/stuck. I think the middleground in the patch to only process die events when actually blocked in writes reduces the likelihood of this sufficiently. I have hacks ontop this to get rid of ImmediateInterrupt alltogether, although I'm not sure how well this will work for some parts of auth/crypt.c. Everything else, including the deadlock checker, seems quite doable. Andres -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Escaping from blocked send() reprised.
On Mon, Jan 12, 2015 at 01:45:41AM +0100, Andres Freund wrote: On 2015-01-11 19:37:53 -0500, Noah Misch wrote: I recommend either (a) taking no action or (b) adding a regression test verifying WaitLatchOrSocket() conformance in this scenario. Do you have a good idea how to test b) save a C function in regress.c that does what your test does using latches? No, that's what I had in mind. You could probably achieve it with a libpq program that lets input accumulate, but that's trickier. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Escaping from blocked send() reprised.
On 2015-01-11 19:37:53 -0500, Noah Misch wrote: I recommend either (a) taking no action or (b) adding a regression test verifying WaitLatchOrSocket() conformance in this scenario. Do you have a good idea how to test b) save a C function in regress.c that does what your test does using latches? Then we cane decide what more to do if failure evidence emerges. Seems fine to me. Hm. I can think of two stopgap measures we could add: 1) If we're using select() and WL_SOCKET_WRITEABLE is set without _READABLE, add a timeout of Min(1s, Max(passed_timeout, 1s)). As the time spent waiting only for writable normally shouldn't be very long, that shouldn't be noticeably bad for power usage. 2) Add a SIGPIPE handler that just does a SetLatch(MyLach). I'm having trouble visualizing those proposed measures in detail, but I trust that a decent workaround would emerge. For 1) I'm thinking of just regularly causing a spurious WL_SOCKET_WRITEABLE event via timeouts if it's the only parameter. Latch users have to deal with spurious wakeups anyway, so that should be mostly unproblematic. For 2) I was thinking that for now the problem only arises for the main FE/BE socket. So we can install a sigpipe handler that does a SetLatch() - that will trigger WaitLatch() to return and, after checking for interrupts, retry the actual send() - which then'd return ECONNRESET. What happens if you write/send() in that state, btw? write() reports EAGAIN. Grand. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Escaping from blocked send() reprised.
On Sat, Jan 10, 2015 at 11:35 AM, Andres Freund and...@2ndquadrant.com wrote: Interesting. I dimly remembered you mentioning this, that's how I rediscovered this message. Do you remember any details? No, not really. My guess that's not so much the overhead of the latch itself, but the lack of the directed wakeup stuff the OS provides for semaphores. That's possible. If we could replace all usages of semaphores that set immediate interrupts to ok, we could quite easily make the deadlock detector et. al. run outside of signal handlers. That would imo make it more robust, and easier to understand - right now the correctness of locking done in the deadlock detector isn't obvious. With the infrastructure in place it'd also allow your new parallelism code to run outside of signal handlers. Yes, I would be very happy to see ImmediateInterruptOK die in a fire. Unfortunately currently sempahores can't be unlocked in a signal handler (as sysv semaphores aren't signal safe)... It'd also be not so nice to set both a latch and semaphores in every signal handler. Agreed. AIUI, the only reason why we need the self-pipe thing is because on some platforms signals don't interrupt system calls. But my impression was that those platforms were somewhat obscure. To the contrary, I think it's only very obscure platforms where signals still interrupt syscalls - we set SA_RESTART for pretty much everything. There's a couple of system calls that ignore SA_RESTART. For some that's defined in posix, for others it's operating system specific. E.g. on linux semop(), poll(), select() are defined to always return EINTR when interrupted. The recent problems with test_shm_mq test failing on anole were caused by the fact that a signal doesn't abort select() on that platform, but does reset the timer. So a steady stream of signals results in never reaching the timeout. Anyway, the discussion since cleared up that we need the self byte to handle a race, anyway. Eh? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Escaping from blocked send() reprised.
On Mon, Jan 12, 2015 at 12:40:50AM +0100, Andres Freund wrote: On 2015-01-11 16:36:07 -0500, Noah Misch wrote: On Sat, Jan 10, 2015 at 03:25:42AM +0100, Andres Freund wrote: 0001-Allow-latches-to-wait-for-socket-writability-without.patch Imo pretty close to commit and can be committed independently. The key open question is whether all platforms of interest can reliably detect end-of-file when poll()ing or select()ing for write only. Older GNU/Linux select() cannot; see attached test program. Yuck. By my reading that's a violation of posix. Agreed. I did test it a bit, and I didn't see problems, but that obviously doesn't say much about old versions. Afaics we interestingly don't have any poll-less buildfarm animals that use unix_latch.c... More likely is that some system will have a poll() exhibiting the same bug, possibly via poll() being a wrapper around select(). Systems without poll() are hard to find; the gnulib manual lists only Windows, BeOS and HP NonStop OS. HP NonStop OS is the one possibly of interest here. I have never personally seen a machine running it. I recommend either (a) taking no action or (b) adding a regression test verifying WaitLatchOrSocket() conformance in this scenario. Then we can decide what more to do if failure evidence emerges. We use poll() there anyway, so the bug in that configuration does not affect PostgreSQL. Is it a bellwether of similar bugs in other implementations, bugs that will affect PostgreSQL? Hm. I can think of two stopgap measures we could add: 1) If we're using select() and WL_SOCKET_WRITEABLE is set without _READABLE, add a timeout of Min(1s, Max(passed_timeout, 1s)). As the time spent waiting only for writable normally shouldn't be very long, that shouldn't be noticeably bad for power usage. 2) Add a SIGPIPE handler that just does a SetLatch(MyLach). I'm having trouble visualizing those proposed measures in detail, but I trust that a decent workaround would emerge. + if (pfds[0].revents (POLLHUP | POLLERR | POLLNVAL)) + { + /* EOF/error condition */ + if (wakeEvents WL_SOCKET_READABLE) + result |= WL_SOCKET_READABLE; + if (wakeEvents WL_SOCKET_WRITEABLE) + result |= WL_SOCKET_WRITEABLE; + } With some poll() implementations (e.g. OS X), this can wrongly report WL_SOCKET_WRITEABLE if the peer used shutdown(SHUT_WR). I tentatively think that's acceptable. libpq does not use shutdown(), and other client interfaces would do so at their own risk. Should we worry about hostile clients creating a denial-of-service by causing a server send() to block unexpectedly? Probably not; a user able to send arbitrary TCP traffic to the postmaster port can already achieve that. Yea, this doesn't seem particularly concerning. a) They can just stop consuming writes and use noticeable amounts of memory by doing output intensive queries. That uses significant os resources and is much harder to detect - today. If there's anything to worry about here (unlikely), it would be with respect to not-yet-authenticated connections only. b) does accepting WL_SOCKET_WRITEABLE without _READABLE change anything here? We already allow _WRITABLE... Today's code translates POLLHUP to WL_SOCKET_READABLE; it must see POLLOUT to set WL_SOCKET_WRITEABLE. Your patch changes that. What happens if you write/send() in that state, btw? write() reports EAGAIN. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Escaping from blocked send() reprised.
On Sat, Jan 10, 2015 at 03:25:42AM +0100, Andres Freund wrote: 0001-Allow-latches-to-wait-for-socket-writability-without.patch Imo pretty close to commit and can be committed independently. The key open question is whether all platforms of interest can reliably detect end-of-file when poll()ing or select()ing for write only. Older GNU/Linux select() cannot; see attached test program. We use poll() there anyway, so the bug in that configuration does not affect PostgreSQL. Is it a bellwether of similar bugs in other implementations, bugs that will affect PostgreSQL? This previously had explicitly been forbidden in e42a21b9e6c9, as there was no use case at that point. We now are looking into making FE/BE communication use latches, so it Truncated sentence. + if (pfds[0].revents (POLLHUP | POLLERR | POLLNVAL)) + { + /* EOF/error condition */ + if (wakeEvents WL_SOCKET_READABLE) + result |= WL_SOCKET_READABLE; + if (wakeEvents WL_SOCKET_WRITEABLE) + result |= WL_SOCKET_WRITEABLE; + } With some poll() implementations (e.g. OS X), this can wrongly report WL_SOCKET_WRITEABLE if the peer used shutdown(SHUT_WR). I tentatively think that's acceptable. libpq does not use shutdown(), and other client interfaces would do so at their own risk. Should we worry about hostile clients creating a denial-of-service by causing a server send() to block unexpectedly? Probably not; a user able to send arbitrary TCP traffic to the postmaster port can already achieve that. + if (resEvents.lNetworkEvents FD_CLOSE) + { + if (wakeEvents WL_SOCKET_READABLE) + result |= WL_SOCKET_READABLE; + if (wakeEvents WL_SOCKET_WRITEABLE) + result |= WL_SOCKET_WRITEABLE; + } + } Extra blank line. /* * Test whether select() can report write-ready on a peer-closed TCP socket when * the send buffer is full. Though write() won't block, some GNU/Linux systems * fail to report write-ready. RHEL 6.6 (kernel-2.6.32-431.23.3.el6.x86_64, * glibc-2.12-1.149.el6.x86_64) has the bug. RHEL 7.0 * (kernel-3.10.0-123.8.1.el7.x86_64, glibc-2.17-55.el7_0.3.x86_64) does not. */ #include errno.h #include fcntl.h #include limits.h #include netinet/in.h #include signal.h #include stdio.h #include stdlib.h #include sys/socket.h #include sys/time.h #include sys/types.h #include unistd.h /* Check a syscall return value. */ void CSYS(int res) { if (res 0) { perror(some syscall); exit(EXIT_FAILURE); } } /* Like socketpair(), but use a loopback TCP connection. */ static void tcppair(int fd[2]) { struct sockaddr_in addr; int srv; int one = 1; int flags; addr.sin_family = AF_INET; addr.sin_port = htons(17531); addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK); CSYS(srv = socket(AF_INET, SOCK_STREAM, 0)); CSYS(setsockopt(srv, SOL_SOCKET, SO_REUSEADDR, (char *) one, sizeof(one))); CSYS(bind(srv, (struct sockaddr *) addr, sizeof(addr))); CSYS(listen(srv, 8)); CSYS(fd[1] = socket(AF_INET, SOCK_STREAM, 0)); CSYS(flags = fcntl(fd[1], F_GETFL)); CSYS(fcntl(fd[1], F_SETFL, flags | O_NONBLOCK)); if (connect(fd[1], (struct sockaddr *) addr, sizeof(addr)) = 0 || errno != EINPROGRESS) { perror(connect); exit(EXIT_FAILURE); } CSYS(fcntl(fd[1], F_SETFL, flags)); CSYS(fd[0] = accept(srv, NULL, NULL)); CSYS(close(srv)); } /* Does select() consider the fd ready for writing? */ int select_write(int fd) { fd_set writes; struct timeval timeo; FD_ZERO(writes); FD_SET(fd, writes); timeo.tv_sec = 1; timeo.tv_usec = 0; CSYS(select(fd + 1, NULL, writes, NULL, timeo)); return FD_ISSET(fd, writes); } /* Write to a socket until writes would block. */ void fill(int fd) { int flags; int total; CSYS(flags = fcntl(fd, F_GETFL)); CSYS(fcntl(fd, F_SETFL, flags | O_NONBLOCK)); /* * On both Linux and OS X, select() can report the socket as write-ready * immediately after a write() reported EAGAIN. Loop until both sources * agree that the socket is out of storage. */ do { char buf[32 * PIPE_BUF]; int n; total = 0; while ((n = write(fd, buf, sizeof(buf))) 0) total += n; if (errno != EAGAIN) perror(write); printf(wrote %d bytes\n, total); } while (total 0 select_write(fd)); CSYS(fcntl(fd, F_SETFL, flags)); } int main(int argc, char **argv) { int fd[2]; signal(SIGPIPE, SIG_IGN); tcppair(fd); printf(before close:%s
Re: [HACKERS] Escaping from blocked send() reprised.
On 2015-01-11 16:47:53 -0500, Robert Haas wrote: My guess that's not so much the overhead of the latch itself, but the lack of the directed wakeup stuff the OS provides for semaphores. That's possible. On Sat, Jan 10, 2015 at 11:35 AM, Andres Freund and...@2ndquadrant.com wrote: Interesting. I dimly remembered you mentioning this, that's how I rediscovered this message. Do you remember any details? No, not really. I've done a hackish conversion of proc.c to use semaphores and in a sleeping heavy workload (pgbench -j 390 against a scale 1 db) there originally was about a %5 regression (fluctuating a fair bit). I've played with hacking unix_latch.c to a) use eventfd() for the local latch and (no performance difference anymore) b) Adding a eventfd to struct Latch for latches that are created from postmaster. That fd can directly be written to by other processes (combined slightly faster than semaphores). Not sure how much b) is acceptable, due to using MaxBackend fds. And both only help linux. If we could replace all usages of semaphores that set immediate interrupts to ok, we could quite easily make the deadlock detector et. al. run outside of signal handlers. That would imo make it more robust, and easier to understand - right now the correctness of locking done in the deadlock detector isn't obvious. With the infrastructure in place it'd also allow your new parallelism code to run outside of signal handlers. Yes, I would be very happy to see ImmediateInterruptOK die in a fire. Yea, I think it's a absolutely horrible idea. Unfortunately currently sempahores can't be unlocked in a signal handler (as sysv semaphores aren't signal safe)... It'd also be not so nice to set both a latch and semaphores in every signal handler. Agreed. I've since (re-)realised that we've actually relied on semaphore being signal safe for years. The PGSemaphoreLock() in proc.c allows interrupts. And the deadlock detector then uses lwlocks, which in turn use semaphores. That sucks. Anyway, the discussion since cleared up that we need the self byte to handle a race, anyway. Eh? I'm referring to the point Heikki has made in his second paragraph in 5408638c.1080...@vmware.com . Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Escaping from blocked send() reprised.
On 2015-01-11 16:36:07 -0500, Noah Misch wrote: On Sat, Jan 10, 2015 at 03:25:42AM +0100, Andres Freund wrote: 0001-Allow-latches-to-wait-for-socket-writability-without.patch Imo pretty close to commit and can be committed independently. The key open question is whether all platforms of interest can reliably detect end-of-file when poll()ing or select()ing for write only. Older GNU/Linux select() cannot; see attached test program. Yuck. By my reading that's a violation of posix. I did test it a bit, and I didn't see problems, but that obviously doesn't say much about old versions. Afaics we interestingly don't have any poll-less buildfarm animals that use unix_latch.c... We use poll() there anyway, so the bug in that configuration does not affect PostgreSQL. Is it a bellwether of similar bugs in other implementations, bugs that will affect PostgreSQL? Hm. I can think of two stopgap measures we could add: 1) If we're using select() and WL_SOCKET_WRITEABLE is set without _READABLE, add a timeout of Min(1s, Max(passed_timeout, 1s)). As the time spent waiting only for writable normally shouldn't be very long, that shouldn't be noticeably bad for power usage. 2) Add a SIGPIPE handler that just does a SetLatch(MyLach). This previously had explicitly been forbidden in e42a21b9e6c9, as there was no use case at that point. We now are looking into making FE/BE communication use latches, so it Truncated sentence. Fixed in what I've since pushed (as Heikki basically was ok with the patch sent a couple months back, modulo some fixes)... + if (pfds[0].revents (POLLHUP | POLLERR | POLLNVAL)) + { + /* EOF/error condition */ + if (wakeEvents WL_SOCKET_READABLE) + result |= WL_SOCKET_READABLE; + if (wakeEvents WL_SOCKET_WRITEABLE) + result |= WL_SOCKET_WRITEABLE; + } With some poll() implementations (e.g. OS X), this can wrongly report WL_SOCKET_WRITEABLE if the peer used shutdown(SHUT_WR). I tentatively think that's acceptable. libpq does not use shutdown(), and other client interfaces would do so at their own risk. Should we worry about hostile clients creating a denial-of-service by causing a server send() to block unexpectedly? Probably not; a user able to send arbitrary TCP traffic to the postmaster port can already achieve that. Yea, this doesn't seem particularly concerning. a) They can just stop consuming writes and use noticeable amounts of memory by doing output intensive queries. That uses significant os resources and is much harder to detect - today. b) does accepting WL_SOCKET_WRITEABLE without _READABLE change anything here? We already allow _WRITABLE... What happens if you write/send() in that state, btw? Greetings, Andres Freund -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Escaping from blocked send() reprised.
On 2014-09-04 08:49:22 -0400, Robert Haas wrote: On Tue, Sep 2, 2014 at 3:01 PM, Andres Freund and...@2ndquadrant.com wrote: I'm slightly worried about the added overhead due to the latch code. In my implementation I only use latches after a nonblocking read, but still. Every WaitLatchOrSocket() does a drainSelfPipe(). I wonder if that can be made problematic. I think that's not the word you're looking for. There's a less missing... At some point I hacked up a very crude prototype that made LWLocks use latches to sleep instead of semaphores. It was slow. Interesting. I dimly remembered you mentioning this, that's how I rediscovered this message. Do you remember any details? My guess that's not so much the overhead of the latch itself, but the lack of the directed wakeup stuff the OS provides for semaphores. If we could replace all usages of semaphores that set immediate interrupts to ok, we could quite easily make the deadlock detector et. al. run outside of signal handlers. That would imo make it more robust, and easier to understand - right now the correctness of locking done in the deadlock detector isn't obvious. With the infrastructure in place it'd also allow your new parallelism code to run outside of signal handlers. Unfortunately currently sempahores can't be unlocked in a signal handler (as sysv semaphores aren't signal safe)... It'd also be not so nice to set both a latch and semaphores in every signal handler. AIUI, the only reason why we need the self-pipe thing is because on some platforms signals don't interrupt system calls. But my impression was that those platforms were somewhat obscure. To the contrary, I think it's only very obscure platforms where signals still interrupt syscalls - we set SA_RESTART for pretty much everything. There's a couple of system calls that ignore SA_RESTART. For some that's defined in posix, for others it's operating system specific. E.g. on linux semop(), poll(), select() are defined to always return EINTR when interrupted. Anyway, the discussion since cleared up that we need the self byte to handle a race, anyway. Basically, it doesn't feel like a good thing that we've got two sets of primitives for making a backend wait that (1) don't really know about each other and (2) use different operating system primitives. Presumably one of the two systems is better; let's figure out which one it is, use that one all the time, and get rid of the other one. I think the latch interface is clearly better for what we use sema/latches for as it allows to wait for signals (latch sets), a socket and timeouts. So let's try to figure out how to make it perform comparably or better than semaphores. There's imo only one semaphore user that can't trivially be replaced by latches: the semaphore spinlock emulation. Both proc.c and and lwlock.c can be converted quite easily - in the latter case, it might actually end up saving some code. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Escaping from blocked send() reprised.
On 2014-11-17 18:22:54 +0300, Alex Shulgin wrote: Andres Freund and...@2ndquadrant.com writes: I've invested some more time in this: 0002 now makes sense on its own and doesn't change anything around the interrupt handling. Oh, and it compiles without 0003. In this patch, the endif appears to be misplaced in PostgresMain: + if (MyProcPort != NULL) + { +#ifdef WIN32 + pgwin32_noblock = true; +#else + if (!pg_set_noblock(MyProcPort-sock)) + ereport(COMMERROR, + (errmsg(could not set socket to nonblocking mode: %m))); + } +#endif + Uh. Odd. Anyway, that bit of code is now somewhere else anyway... One thing I did try is sending a NOTICE to the client when in ProcessInterrupts() and DoingCommandRead is true. I think[1] it was expected to be delivered instantly, but actually the client (psql) only displays it after sending the next statement. Yea, that should be psql specific though. I hope ;) While I'm reading on FE/BE protocol someone might want to share his wisdom on this subject. My guess: psql blocks on readline/libedit call and can't effectively poll the server socket before complete input from user. I'm not sure if it's actually a can't. It doesn't at least ;) Greetings, Andres Freund -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Escaping from blocked send() reprised.
On 2014-10-03 16:26:35 +0200, Andres Freund wrote: On 2014-10-03 17:12:18 +0300, Heikki Linnakangas wrote: 0002 now makes sense on its own and doesn't change anything around the interrupt handling. Oh, and it compiles without 0003. WaitLatchOrSocket() can throw an error, so it's not totally safe to call that underneath OpenSSL. Hm. Fair point. I think we should fix this by simply prohibiting WaitLatch/WaitLatchOrSocket from ERRORing out. The easiest, and imo acceptable, thing is to simply convert the relevant ERRORs to FATAL. I think that'd be perfectly fine as it seems very unlikely that we continue sanely afterwards. It would really be nice if we had a simple way to raise a FATAL that won't go to the client for situations like this. I'd proposed elog(FATAL | COMERROR, ...) in the past... Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Escaping from blocked send() reprised.
Hello, On 2014-12-15 18:19:26 +0900, Kyotaro HORIGUCHI wrote: Since I don't have clear idea how to promote this, I will remake and be back with new patch based on Andres' for patches. Do my patches miss any functionality you want? The patch satisfies what I want, as I said upthread. What I don't know is how I can go on with it in this CF topic, in other word what should I do in order to put it to ready for committer? regards, -- Kyotaro Horiguchi NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Escaping from blocked send() reprised.
Since I don't have clear idea how to promote this, I will remake and be back with new patch based on Andres' for patches. Hmm.. Sorry for my stupidity. Why is that necessary? It seems really rather wrong to make BIO_set_retry_write() dependant on ProcDiePending? Especially as, at least in my testing, it's not even required because the be_tls_write() can just check the error properly? I mistook the previous conversation as it doesn't work as expected. I confirmed that it works fine. After all, it works as I expected. The parameter for ProcessClientWriteInterrupt() looks somewhat uneasy but the patch 4 looks fine as a whole. Do you have anything to worry about in the patch? regards, -- Kyotaro Horiguchi NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Escaping from blocked send() reprised.
On 2014-12-15 18:19:26 +0900, Kyotaro HORIGUCHI wrote: Since I don't have clear idea how to promote this, I will remake and be back with new patch based on Andres' for patches. Do my patches miss any functionality you want? Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Escaping from blocked send() reprised.
On Thu, Oct 30, 2014 at 10:27 PM, Heikki Linnakangas hlinnakan...@vmware.com wrote: On 10/03/2014 06:29 PM, Heikki Linnakangas wrote: [blah] About patch 3: Looking closer, this design still looks OK to me. As you said yourself, the comments need some work (e.g. the step 5. in the top comment in async.c needs updating). And then there are a couple of FIXME and XXX comments that need to be addressed. Those patches have not been updated in a while, and I am seeing some feedback from several people, hence returning as returned with feedback. Horiguchi-san, feel free to add new entries on the CF app in 2014-12 or move this entry if you feel overwise. Regards, -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Escaping from blocked send() reprised.
Andres Freund and...@2ndquadrant.com writes: I've invested some more time in this: 0002 now makes sense on its own and doesn't change anything around the interrupt handling. Oh, and it compiles without 0003. In this patch, the endif appears to be misplaced in PostgresMain: + if (MyProcPort != NULL) + { +#ifdef WIN32 + pgwin32_noblock = true; +#else + if (!pg_set_noblock(MyProcPort-sock)) + ereport(COMMERROR, + (errmsg(could not set socket to nonblocking mode: %m))); + } +#endif + pqinitmask(); 0003 Sinval/notify processing got simplified further. There really isn't any need for DisableNotifyInterrupt/DisableCatchupInterrupt anymore. Also begin_client_read/client_read_ended don't make much sense anymore. Instead introduce ProcessClientReadInterrupt (which wants a better name). There's also a very WIP 0004 Allows secure_read/write be interrupted when ProcDiePending is set. All of that happens via the latch mechanism, nothing happens inside signal handlers. So I do think it's quite an improvement over what's been discussed in this thread. But it (and the other approaches) do noticeably increase the likelihood of clients not getting the error message if the client isn't actually dead. The likelihood of write() being blocked *temporarily* due to normal bandwidth constraints is quite high when you consider COPY FROM and similar. Right now we'll wait till we can put the error message into the socket afaics. 1-3 need some serious comment work, but I think the approach is basically sound. I'm much, much less sure about allowing send() to be interrupted. After re-reading these I don't see the rest of items I wanted to inqury about anymore, so it just makes more sense now. One thing I did try is sending a NOTICE to the client when in ProcessInterrupts() and DoingCommandRead is true. I think[1] it was expected to be delivered instantly, but actually the client (psql) only displays it after sending the next statement. While I'm reading on FE/BE protocol someone might want to share his wisdom on this subject. My guess: psql blocks on readline/libedit call and can't effectively poll the server socket before complete input from user. -- Alex [1] http://www.postgresql.org/message-id/1262173040.19367.5015.camel@ebony ``AFAIK, NOTICE was suggested because it can be sent at any time, whereas ERRORs are only associated with statements.'' -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Escaping from blocked send() reprised.
On 10/03/2014 06:29 PM, Heikki Linnakangas wrote: On 10/03/2014 05:26 PM, Andres Freund wrote: On 2014-10-03 17:12:18 +0300, Heikki Linnakangas wrote: On 09/28/2014 01:54 AM, Andres Freund wrote: 0003 Sinval/notify processing got simplified further. There really isn't any need for DisableNotifyInterrupt/DisableCatchupInterrupt anymore. Also begin_client_read/client_read_ended don't make much sense anymore. Instead introduce ProcessClientReadInterrupt (which wants a better name). There's also a very WIP 0004 Allows secure_read/write be interrupted when ProcDiePending is set. All of that happens via the latch mechanism, nothing happens inside signal handlers. So I do think it's quite an improvement over what's been discussed in this thread. But it (and the other approaches) do noticeably increase the likelihood of clients not getting the error message if the client isn't actually dead. The likelihood of write() being blocked *temporarily* due to normal bandwidth constraints is quite high when you consider COPY FROM and similar. Right now we'll wait till we can put the error message into the socket afaics. 1-3 need some serious comment work, but I think the approach is basically sound. I'm much, much less sure about allowing send() to be interrupted. Yeah, 1-3 seem sane. I think 3 also needs a careful look. Have you looked through it? While imo much less complex than before, there's some complex interactions in the touched code. And we have terrible coverage of both catchup interrupts and notify stuff... I only looked at the .patch, I didn't apply it, so I didn't look at the context much. But I don't see any fundamental problem with it. I would like to have a closer look before it's committed, though. About patch 3: Looking closer, this design still looks OK to me. As you said yourself, the comments need some work (e.g. the step 5. in the top comment in async.c needs updating). And then there are a couple of FIXME and XXX comments that need to be addressed. The comment on PGPROC.procLatch in storage/proc.h says just this: Latch procLatch; /* generic latch for process */ This needs a lot more explaining. It's now used by signal handlers to interrupt a read or write to the socket; that should be mentioned. What else is it used for? (for waking up a backend in synchronous replication, at least) What are the rules on when to arm it and when to reset it? Would it be more clear to use a separate, backend-private, latch, for the signals? I guess that won't work, though, because sometimes we need need to wait for a wakeup from a different process or from a signal at the same time (SyncRepWaitForLSN() in particular). Not without adding a variant of WaitLatch that can wait on two latches simultaneously, anyway. The assumption in secure_raw_read that MyProc exists is pretty surprising. I understand why it's that way, and there's a comment in PostgresMain explaining why the socket cannot be put into non-blocking mode earlier, but it's still a bit whacky. Not sure what to do about that. - Heikki -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Escaping from blocked send() reprised.
On 2014-10-30 15:27:13 +0200, Heikki Linnakangas wrote: The comment on PGPROC.procLatch in storage/proc.h says just this: Latch procLatch; /* generic latch for process */ This needs a lot more explaining. It's now used by signal handlers to interrupt a read or write to the socket; that should be mentioned. What else is it used for? (for waking up a backend in synchronous replication, at least) What are the rules on when to arm it and when to reset it? Hm. I agree it use expaned commentary, but I'm unsure if that much detail is really warranted. Any such documentation seems to be almost guaranteed to be out of date. As evidenced that there's none to date... Would it be more clear to use a separate, backend-private, latch, for the signals? I guess that won't work, though, because sometimes we need need to wait for a wakeup from a different process or from a signal at the same time (SyncRepWaitForLSN() in particular). Not without adding a variant of WaitLatch that can wait on two latches simultaneously, anyway. I wondered the same, but I don't really see what it'd buy us during normal running. It seems like it'd just make code more complex without leading to relevantly fewer wakeups. The assumption in secure_raw_read that MyProc exists is pretty surprising. I understand why it's that way, and there's a comment in PostgresMain explaining why the socket cannot be put into non-blocking mode earlier, but it's still a bit whacky. Not sure what to do about that. It makes me quite unhappy too. I looked what it'd take to make proclatch available earlier, but it wasn't pretty. I wondered whether we could use a 'early proc latch' in MyProcLatch that used until we're fully attached to shared memory. Then, when attaching to shared memory we'd set the old latch once, and reset MyProcLatch to the shared memory one. But that's pretty ugly. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Escaping from blocked send() reprised.
Hmm.. Sorry for my stupidity. Why is that necessary? It seems really rather wrong to make BIO_set_retry_write() dependant on ProcDiePending? Especially as, at least in my testing, it's not even required because the be_tls_write() can just check the error properly? I mistook the previous conversation as it doesn't work as expected. I confirmed that it works fine. After all, it works as I expected. The parameter for ProcessClientWriteInterrupt() looks somewhat uneasy but the patch 4 looks fine as a whole. Do you have anything to worry about in the patch? regards, -- Kyotaro Horiguchi NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Escaping from blocked send() reprised.
On 2014-10-09 14:06:35 +0900, Kyotaro HORIGUCHI wrote: Hello, simplly inhibit set retry flag when ProcDiePending in my_sock_write seems enough. But it returns with SSL_ERROR_SYSCALL not SSL_ERROR_WANT_WRITE so I modified the patch 4 as the attached patch. Why is that necessary? It seems really rather wrong to make BIO_set_retry_write() dependant on ProcDiePending? Especially as, at least in my testing, it's not even required because the be_tls_write() can just check the error properly? diff --git a/src/backend/libpq/be-secure-openssl.c b/src/backend/libpq/be-secure-openssl.c index 6fc6903..2288fe2 100644 --- a/src/backend/libpq/be-secure-openssl.c +++ b/src/backend/libpq/be-secure-openssl.c @@ -750,7 +750,8 @@ my_sock_write(BIO *h, const char *buf, int size) BIO_clear_retry_flags(h); if (res = 0) { - if (errno == EINTR || errno == EWOULDBLOCK || errno == EAGAIN) + if (!ProcDiePending + (errno == EINTR || errno == EWOULDBLOCK || errno == EAGAIN)) { BIO_set_retry_write(h); } Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Escaping from blocked send() reprised.
Hello, simplly inhibit set retry flag when ProcDiePending in my_sock_write seems enough. But it returns with SSL_ERROR_SYSCALL not SSL_ERROR_WANT_WRITE so I modified the patch 4 as the attached patch. Finally, the attached patch works as expected with Andres's patch 1-3. regards, -- Kyotaro Horiguchi NTT Open Source Software Center diff --git a/src/backend/libpq/be-secure-openssl.c b/src/backend/libpq/be-secure-openssl.c index 6fc6903..2288fe2 100644 --- a/src/backend/libpq/be-secure-openssl.c +++ b/src/backend/libpq/be-secure-openssl.c @@ -750,7 +750,8 @@ my_sock_write(BIO *h, const char *buf, int size) BIO_clear_retry_flags(h); if (res = 0) { - if (errno == EINTR || errno == EWOULDBLOCK || errno == EAGAIN) + if (!ProcDiePending + (errno == EINTR || errno == EWOULDBLOCK || errno == EAGAIN)) { BIO_set_retry_write(h); } diff --git a/src/backend/libpq/be-secure.c b/src/backend/libpq/be-secure.c index 7b5b30f..ab9e122 100644 --- a/src/backend/libpq/be-secure.c +++ b/src/backend/libpq/be-secure.c @@ -199,6 +199,7 @@ secure_write(Port *port, void *ptr, size_t len) { ssize_t n; +retry: #ifdef USE_SSL if (port-ssl_in_use) { @@ -210,7 +211,26 @@ secure_write(Port *port, void *ptr, size_t len) n = secure_raw_write(port, ptr, len); } - /* XXX: We likely will want to process some interrupts here */ + /* + * We only want to process the interrupt here if socket writes are + * blocking to increase the chance to get an error message to the + * client. If we're not blocked there'll soon be a + * CHECK_FOR_INTERRUPTS(). But if we're blocked we'll never get out of + * that situation if the client has died. + */ + if (ProcDiePending !port-noblock n 0) + { + /* + * We're dying. It's safe (and sane) to handle that now. + */ + CHECK_FOR_INTERRUPTS(); + } + + /* retry after processing interrupts */ + if (n 0 errno == EINTR) + { + goto retry; + } return n; } @@ -236,10 +256,19 @@ wloop: * don't do anything while (possibly) inside a ssl library. */ w = WaitLatchOrSocket(MyProc-procLatch, - WL_SOCKET_WRITEABLE, + WL_LATCH_SET | WL_SOCKET_WRITEABLE, port-sock, 0); - if (w WL_SOCKET_WRITEABLE) + if (w WL_LATCH_SET) + { + ResetLatch(MyProc-procLatch); + /* + * Force a return, so interrupts can be processed when not + * (possibly) underneath a ssl library. + */ + errno = EINTR; + } + else if (w WL_SOCKET_WRITEABLE) { goto wloop; } diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c index 3a6aa1c..61390aa 100644 --- a/src/backend/tcop/postgres.c +++ b/src/backend/tcop/postgres.c @@ -520,6 +520,13 @@ ProcessClientReadInterrupt(void) errno = save_errno; } + else if (ProcDiePending) + { + /* + * We're dying. It's safe (and sane) to handle that now. + */ + CHECK_FOR_INTERRUPTS(); + } } -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Escaping from blocked send() reprised.
On 2014-10-03 18:29:23 +0300, Heikki Linnakangas wrote: On 10/03/2014 05:26 PM, Andres Freund wrote: On 2014-10-03 17:12:18 +0300, Heikki Linnakangas wrote: On 09/28/2014 01:54 AM, Andres Freund wrote: 0003 Sinval/notify processing got simplified further. There really isn't any need for DisableNotifyInterrupt/DisableCatchupInterrupt anymore. Also begin_client_read/client_read_ended don't make much sense anymore. Instead introduce ProcessClientReadInterrupt (which wants a better name). There's also a very WIP 0004 Allows secure_read/write be interrupted when ProcDiePending is set. All of that happens via the latch mechanism, nothing happens inside signal handlers. So I do think it's quite an improvement over what's been discussed in this thread. But it (and the other approaches) do noticeably increase the likelihood of clients not getting the error message if the client isn't actually dead. The likelihood of write() being blocked *temporarily* due to normal bandwidth constraints is quite high when you consider COPY FROM and similar. Right now we'll wait till we can put the error message into the socket afaics. 1-3 need some serious comment work, but I think the approach is basically sound. I'm much, much less sure about allowing send() to be interrupted. Yeah, 1-3 seem sane. I think 3 also needs a careful look. Have you looked through it? While imo much less complex than before, there's some complex interactions in the touched code. And we have terrible coverage of both catchup interrupts and notify stuff... I only looked at the .patch, I didn't apply it, so I didn't look at the context much. But I don't see any fundamental problem with it. I would like to have a closer look before it's committed, though. I'd appreciate that. I don't want to commit it without a careful review of another committer. There's also the concern that using a latch for client communication increases the number of syscalls for the same work. We should at least try to quantify that... I'm not too concerned about that, since we only do extra syscalls when the socket isn't immediately available for reading/writing, i.e. when we have to sleep anyway. Well, kernels actually do some nice optimizations for blocking reads - at least for local sockets. Like switching to the other process immediately and such. I'm not super concerned either, but I think we should try to measure it. And if we're failing, we probably should try to address these problems - if possible in the latch code. 4 also looks OK to me at a quick glance. It basically enables handling the die interrupt immediately, if we're blocked in a read or write. It won't be handled in the signal handler, but within the secure_read/write call anyway. What are you thinking about the concern that it'll reduce the likelihood of transferring the error message to the client? I tried to reduce that by only allowing errors when write() blocks, but that's not an infrequent event. I'm not too concerned about that either. I mean, it's probably true that it reduces the likelihood, but I don't particularly care myself. But if we care, we could use a timeout there, so that if we receive a SIGTERM while blocked on a send(), we wait for a few seconds to see if we can send whatever we were sending, before terminating the backend. What should we do with this patch in the commitfest? I think feature should be ddeclare as 'returned with feedback' for this commitfest. I've done so. I don't see much of a reason waiting with patch 1 till the next commitfest. It imo looks uncontroversial and doesn't have any far reaching consequences. Are you planning to clean up and commit these patches? I plan to do so one by one, yes. If you'd like to pick up any of them, feel free to do (after telling me, to avoid duplicated efforts). I don't feel proprietary about any of them. But I guess you have other stuff you'd like to work on too ;) I'll try to send out a version with the stuff you mentioned earlier in the next couple days. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Escaping from blocked send() reprised.
On 09/28/2014 01:54 AM, Andres Freund wrote: I've invested some more time in this: Thanks! In 0001, the select() codepath will not return (WL_SOCKET_READABLE | WL_SOCKET_WRITEABLE) on EOF or error, like the comment says and like the poll() path does. It only sets WL_SOCKET_READABLE if WL_SOCKET_READABLE was requested as a wake-event, and likewise for writeable, while the poll() codepath returns (WL_SOCKET_READABLE | WL_SOCKET_WRITEABLE) regardless of the requested wake-events. I'm not sure which is actually better - a separate WL_SOCKET_ERROR code might be best - but it's inconsistent as it is. 0002 now makes sense on its own and doesn't change anything around the interrupt handling. Oh, and it compiles without 0003. WaitLatchOrSocket() can throw an error, so it's not totally safe to call that underneath OpenSSL. Admittedly the cases where it throws an error are shouldn't happen cases like poll() failed or read() on self-pipe failed, but still. Perhaps those errors should be reclassified as FATAL; it's not clear you can just roll back and expect to continue running if any of them happens. In secure_raw_write(), you need to save and restore errno, as WaitLatchOrSocket will not preserve it. If secure_raw_write() calls WaitLatchOrSocket(), and it returns because the latch was set, and we fall out of secure_raw_write, we will return -1 but the errno might not be set to anything sensible anymore. 0003 Sinval/notify processing got simplified further. There really isn't any need for DisableNotifyInterrupt/DisableCatchupInterrupt anymore. Also begin_client_read/client_read_ended don't make much sense anymore. Instead introduce ProcessClientReadInterrupt (which wants a better name). There's also a very WIP 0004 Allows secure_read/write be interrupted when ProcDiePending is set. All of that happens via the latch mechanism, nothing happens inside signal handlers. So I do think it's quite an improvement over what's been discussed in this thread. But it (and the other approaches) do noticeably increase the likelihood of clients not getting the error message if the client isn't actually dead. The likelihood of write() being blocked *temporarily* due to normal bandwidth constraints is quite high when you consider COPY FROM and similar. Right now we'll wait till we can put the error message into the socket afaics. 1-3 need some serious comment work, but I think the approach is basically sound. I'm much, much less sure about allowing send() to be interrupted. Yeah, 1-3 seem sane. 4 also looks OK to me at a quick glance. It basically enables handling the die interrupt immediately, if we're blocked in a read or write. It won't be handled in the signal handler, but within the secure_read/write call anyway. - Heikki -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Escaping from blocked send() reprised.
Hi, On 2014-10-03 17:12:18 +0300, Heikki Linnakangas wrote: On 09/28/2014 01:54 AM, Andres Freund wrote: I've invested some more time in this: Thanks! In 0001, the select() codepath will not return (WL_SOCKET_READABLE | WL_SOCKET_WRITEABLE) on EOF or error, like the comment says and like the poll() path does. It only sets WL_SOCKET_READABLE if WL_SOCKET_READABLE was requested as a wake-event, and likewise for writeable, while the poll() codepath returns (WL_SOCKET_READABLE | WL_SOCKET_WRITEABLE) regardless of the requested wake-events. I'm not sure which is actually better - a separate WL_SOCKET_ERROR code might be best - but it's inconsistent as it is. Hm. Right. I think we should only report the requested state. We can't really discern wether it's a hangup, error or actually readable/writable with select() - it just returns the socket as readable/writable as soon as it doesn't block anymore. Where not blocking includes the connection having gone bad. It took me a while to figure out whether that's actually guaranteed by the spec, but I'm pretty sure it is... 0002 now makes sense on its own and doesn't change anything around the interrupt handling. Oh, and it compiles without 0003. WaitLatchOrSocket() can throw an error, so it's not totally safe to call that underneath OpenSSL. Hm. Fair point. Admittedly the cases where it throws an error are shouldn't happen cases like poll() failed or read() on self-pipe failed, but still. Perhaps those errors should be reclassified as FATAL; it's not clear you can just roll back and expect to continue running if any of them happens. Fine with me. In secure_raw_write(), you need to save and restore errno, as WaitLatchOrSocket will not preserve it. If secure_raw_write() calls WaitLatchOrSocket(), and it returns because the latch was set, and we fall out of secure_raw_write, we will return -1 but the errno might not be set to anything sensible anymore. Oops. 0003 Sinval/notify processing got simplified further. There really isn't any need for DisableNotifyInterrupt/DisableCatchupInterrupt anymore. Also begin_client_read/client_read_ended don't make much sense anymore. Instead introduce ProcessClientReadInterrupt (which wants a better name). There's also a very WIP 0004 Allows secure_read/write be interrupted when ProcDiePending is set. All of that happens via the latch mechanism, nothing happens inside signal handlers. So I do think it's quite an improvement over what's been discussed in this thread. But it (and the other approaches) do noticeably increase the likelihood of clients not getting the error message if the client isn't actually dead. The likelihood of write() being blocked *temporarily* due to normal bandwidth constraints is quite high when you consider COPY FROM and similar. Right now we'll wait till we can put the error message into the socket afaics. 1-3 need some serious comment work, but I think the approach is basically sound. I'm much, much less sure about allowing send() to be interrupted. Yeah, 1-3 seem sane. I think 3 also needs a careful look. Have you looked through it? While imo much less complex than before, there's some complex interactions in the touched code. And we have terrible coverage of both catchup interrupts and notify stuff... Tom, do you happen to have time to look at that bit? There's also the concern that using a latch for client communication increases the number of syscalls for the same work. We should at least try to quantify that... 4 also looks OK to me at a quick glance. It basically enables handling the die interrupt immediately, if we're blocked in a read or write. It won't be handled in the signal handler, but within the secure_read/write call anyway. What are you thinking about the concern that it'll reduce the likelihood of transferring the error message to the client? I tried to reduce that by only allowing errors when write() blocks, but that's not an infrequent event. Thanks for looking. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Escaping from blocked send() reprised.
On 10/03/2014 05:26 PM, Andres Freund wrote: On 2014-10-03 17:12:18 +0300, Heikki Linnakangas wrote: On 09/28/2014 01:54 AM, Andres Freund wrote: 0003 Sinval/notify processing got simplified further. There really isn't any need for DisableNotifyInterrupt/DisableCatchupInterrupt anymore. Also begin_client_read/client_read_ended don't make much sense anymore. Instead introduce ProcessClientReadInterrupt (which wants a better name). There's also a very WIP 0004 Allows secure_read/write be interrupted when ProcDiePending is set. All of that happens via the latch mechanism, nothing happens inside signal handlers. So I do think it's quite an improvement over what's been discussed in this thread. But it (and the other approaches) do noticeably increase the likelihood of clients not getting the error message if the client isn't actually dead. The likelihood of write() being blocked *temporarily* due to normal bandwidth constraints is quite high when you consider COPY FROM and similar. Right now we'll wait till we can put the error message into the socket afaics. 1-3 need some serious comment work, but I think the approach is basically sound. I'm much, much less sure about allowing send() to be interrupted. Yeah, 1-3 seem sane. I think 3 also needs a careful look. Have you looked through it? While imo much less complex than before, there's some complex interactions in the touched code. And we have terrible coverage of both catchup interrupts and notify stuff... I only looked at the .patch, I didn't apply it, so I didn't look at the context much. But I don't see any fundamental problem with it. I would like to have a closer look before it's committed, though. There's also the concern that using a latch for client communication increases the number of syscalls for the same work. We should at least try to quantify that... I'm not too concerned about that, since we only do extra syscalls when the socket isn't immediately available for reading/writing, i.e. when we have to sleep anyway. 4 also looks OK to me at a quick glance. It basically enables handling the die interrupt immediately, if we're blocked in a read or write. It won't be handled in the signal handler, but within the secure_read/write call anyway. What are you thinking about the concern that it'll reduce the likelihood of transferring the error message to the client? I tried to reduce that by only allowing errors when write() blocks, but that's not an infrequent event. I'm not too concerned about that either. I mean, it's probably true that it reduces the likelihood, but I don't particularly care myself. But if we care, we could use a timeout there, so that if we receive a SIGTERM while blocked on a send(), we wait for a few seconds to see if we can send whatever we were sending, before terminating the backend. What should we do with this patch in the commitfest? Are you planning to clean up and commit these patches? - Heikki -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Escaping from blocked send() reprised.
Sorry, I missed this message and only cought up when reading your CF status mail. I've attached three patches: Could let me know how to get the CF status mail? I think he meant this email I sent last weekend: http://www.postgresql.org/message-id/542672d2.3060...@vmware.com I see, that's what I also received. Thank you. regards, -- Kyotaro Horiguchi NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Escaping from blocked send() reprised.
Hello, I propose the attached patch. It adds a new flag ImmediateDieOK, which is a weaker form of ImmediateInterruptOK that only allows handling a pending die-signal in the signal handler. Robert, others, do you see a problem with this? Per se I don't have a problem with it. There does exist the problem that the user doesn't get a error message in more cases though. On the other hand it's bad if any user can prevent the database from restarting. Over IM, Robert pointed out that it's not safe to jump out of a signal handler with siglongjmp, when we're inside library calls, like in a callback called by OpenSSL. But even with current master branch, that's exactly what we do. In secure_raw_read(), we set ImmediateInterruptOK = true, which means that any incoming signal will be handled directly in the signal handler, which can mean elog(ERROR). Should we be worried? OpenSSL might get confused if control never returns to the SSL_read() or SSL_write() function that called secure_raw_read(). But this is imo prohibitive. Yes, we're doing it for a long while. But no, that's not ok. It actually prompoted me into prototyping the latch thing (in some other thread). I don't think existing practice justifies expanding it further. I see, in that case, this approach seems basically applicable. But if I understand correctly, this patch seems not to return out of the openssl code even when latch was found to be set in secure_raw_write/read. I tried setting errno = ECONNRESET and it went well but seems a bad deed. secure_raw_write(Port *port, const void *ptr, size_t len) { n = send(port-sock, ptr, len, 0); if (!port-noblock n 0 (errno == EWOULDBLOCK || errno == EAGAIN)) { w = WaitLatchOrSocket(MyProc-procLatch, ... if (w WL_LATCH_SET) { ResetLatch(MyProc-procLatch); /* * Force a return, so interrupts can be processed when not * (possibly) underneath a ssl library. */ errno = EINTR; (return n; // n is negative) my_sock_write(BIO *h, const char *buf, int size) { res = secure_raw_write(((Port *) h-ptr), buf, size); BIO_clear_retry_flags(h); if (res = 0) { if (errno == EINTR || errno == EWOULDBLOCK || errno == EAGAIN) { BIO_set_retry_write(h); -- Kyotaro Horiguchi NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Escaping from blocked send() reprised.
On 2014-10-02 17:47:39 +0900, Kyotaro HORIGUCHI wrote: Hello, I propose the attached patch. It adds a new flag ImmediateDieOK, which is a weaker form of ImmediateInterruptOK that only allows handling a pending die-signal in the signal handler. Robert, others, do you see a problem with this? Per se I don't have a problem with it. There does exist the problem that the user doesn't get a error message in more cases though. On the other hand it's bad if any user can prevent the database from restarting. Over IM, Robert pointed out that it's not safe to jump out of a signal handler with siglongjmp, when we're inside library calls, like in a callback called by OpenSSL. But even with current master branch, that's exactly what we do. In secure_raw_read(), we set ImmediateInterruptOK = true, which means that any incoming signal will be handled directly in the signal handler, which can mean elog(ERROR). Should we be worried? OpenSSL might get confused if control never returns to the SSL_read() or SSL_write() function that called secure_raw_read(). But this is imo prohibitive. Yes, we're doing it for a long while. But no, that's not ok. It actually prompoted me into prototyping the latch thing (in some other thread). I don't think existing practice justifies expanding it further. I see, in that case, this approach seems basically applicable. But if I understand correctly, this patch seems not to return out of the openssl code even when latch was found to be set in secure_raw_write/read. Correct. That's why I think it's the way forward. There's several problems now where the inability to do real things while reading/writing is a problem. I tried setting errno = ECONNRESET and it went well but seems a bad deed. Where and why did you do that? secure_raw_write(Port *port, const void *ptr, size_t len) { n = send(port-sock, ptr, len, 0); if (!port-noblock n 0 (errno == EWOULDBLOCK || errno == EAGAIN)) { w = WaitLatchOrSocket(MyProc-procLatch, ... if (w WL_LATCH_SET) { ResetLatch(MyProc-procLatch); /* * Force a return, so interrupts can be processed when not * (possibly) underneath a ssl library. */ errno = EINTR; (return n; // n is negative) my_sock_write(BIO *h, const char *buf, int size) { res = secure_raw_write(((Port *) h-ptr), buf, size); BIO_clear_retry_flags(h); if (res = 0) { if (errno == EINTR || errno == EWOULDBLOCK || errno == EAGAIN) { BIO_set_retry_write(h); Hm, this seems, besides one comment, the code from the last patch in my series. Do you have a particular question about it? Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Escaping from blocked send() reprised.
Hi, But this is imo prohibitive. Yes, we're doing it for a long while. But no, that's not ok. It actually prompoted me into prototyping the latch thing (in some other thread). I don't think existing practice justifies expanding it further. I see, in that case, this approach seems basically applicable. But if I understand correctly, this patch seems not to return out of the openssl code even when latch was found to be set in secure_raw_write/read. Correct. That's why I think it's the way forward. There's several problems now where the inability to do real things while reading/writing is a problem. I tried setting errno = ECONNRESET and it went well but seems a bad deed. Where and why did you do that? The patch of this message. http://www.postgresql.org/message-id/20140828.214704.93968088.horiguchi.kyot...@lab.ntt.co.jp The reason for setting errno (instead of a variable for it) is to trick openssl (or my_socck_write? I've forgot it..) into recognizing as if the underneath send(2) have returned with any uncontinueable error so it cannot be any of continueable errnos (EINTR/EWOULDBLOCK/EAGAIN). Iy my faint memory, only avoiding BIO_set_retry_write() in my_sock_write() dosn't work as expected but it might be enough that my_sock_write returns -1 and doesn't set BIO_set_retry_write(). The reason why ECONNNRESET is any of other errnos possible for send(2)(*1) doesn't seem to fit the real situation, and the blocked situation seems similar to resetted connection from the view that it cannot continue to work due to external condition, and it is used in be_tls_write() in a similary way. Come to think of it, setting ECONNRESET is not so evil? secure_raw_write(Port *port, const void *ptr, size_t len) { n = send(port-sock, ptr, len, 0); if (!port-noblock n 0 (errno == EWOULDBLOCK || errno == EAGAIN)) { w = WaitLatchOrSocket(MyProc-procLatch, ... if (w WL_LATCH_SET) { ResetLatch(MyProc-procLatch); /* * Force a return, so interrupts can be processed when not * (possibly) underneath a ssl library. */ errno = EINTR; (return n; // n is negative) my_sock_write(BIO *h, const char *buf, int size) { res = secure_raw_write(((Port *) h-ptr), buf, size); BIO_clear_retry_flags(h); if (res = 0) { if (errno == EINTR || errno == EWOULDBLOCK || errno == EAGAIN) { BIO_set_retry_write(h); Hm, this seems, besides one comment, the code from the last patch in my series. Do you have a particular question about it? I didn't have a particluar qustion about it. This is cited only in order to show the route to retrying. regards, -- Kyotaro Horiguchi NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Escaping from blocked send() reprised.
Thank you for reviewing. I'll look close to the patch tomorrow. I must say this scares the heck out of me. The current code goes through some trouble to not throw an error while in a recv() send(). For example, you removed the DoingCommandRead check from prepare_for_client_read(). There's an comment in postgres.c that says this: /* * (2) Allow asynchronous signals to be executed immediately * if they * come in while we are waiting for client input. (This must * be * conditional since we don't want, say, reads on behalf of * COPY FROM * STDIN doing the same thing.) */ DoingCommandRead = true; Hmm. Sorry. That's my fault that I skipped over the issues about COPY FROM STDIN. With the patch, we do allow asynchronous signals to be processed while blocked in COPY FROM STDIN. Maybe that's OK, but I don't feel comfortable just changing it. (the comment is now wrong, of course) I don't see actual problem but I agree that the behavior should not be chenged. This patch also enables processing query cancel signals while blocked, not just SIGTERM. That's not good; we might be in the middle of sending a message, and we cannot just error out of that or we might violate the fe/be protocol. That's OK with a SIGTERM as you're terminating the connection anyway, and we have the PqCommBusy safeguard in place that prevents us from sending broken messages to the client, but that's not good enough if we wanted to keep the backend alive, as we won't be able to send anything to the client anymore. Ok, since what I want is escaping from blocked send() only by SIGTERM, it needs another mechanism from current prepare_for_client_read(). BTW, we've been talking about blocking in send(), but this patch also let's a recv() in e.g. COPY FROM STDIN to be interrupted. That's probably a good thing; surely you have exactly the same issues with that as with send(). But I didn't realize we had a problem with that too. I see. (But it is mere a side-effect of my carelessness, as you know:) In summary, this patch is not ready as it is, but I think we can fix it. The key question is: is it safe to handle SIGTERM in the signal handler, calling the exit-handlers and exiting the backend, when blocked in a recv() or send()? It's a change in the pqcomm.c API; most pqcomm.c functions have not thrown errors or processed interrupts before. But looking at the callers, I think it's safe, and there isn't actually any comments explicitly saying that pqcomm.c will never throw errors. I propose the attached patch. It adds a new flag ImmediateDieOK, which is a weaker form of ImmediateInterruptOK that only allows handling a pending die-signal in the signal handler. Robert, others, do you see a problem with this? The patch seems excluding all problems menthioned in the message, I have no objection to it. Over IM, Robert pointed out that it's not safe to jump out of a signal handler with siglongjmp, when we're inside library calls, like in a callback called by OpenSSL. But even with current master branch, that's exactly what we do. In secure_raw_read(), we set ImmediateInterruptOK = true, which means that any incoming signal will be handled directly in the signal handler, which can mean elog(ERROR). Should we be worried? OpenSSL might get confused if control never returns to the SSL_read() or SSL_write() function that called secure_raw_read(). IMHO, it will soon die even if OpenSSL is confused. It seems a bit brute that sudden cutoff occurs even when the socket is *not* blocked, but the backend will soon die and frontend will immediately get ECONNRESET (..hmm it is not seen in manpages of recv/read(2)) and should safely exit from OpenSSL. I cannot run this patch right now, but it seems to be no problem. regards, -- Kyotaro Horiguchi NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Escaping from blocked send() reprised.
Wow, thank you for the patch. 0001: Allows WaitLatchOrSocket(WL_WRITABLE) without WL_READABLE. I've tested the poll() and select() implementations on linux and blindly patched up windows. 0002: Put the socket the backend uses to communicate with the client into nonblocking mode as soon as latches are ready and use latches to wait. This probably doesn't work correctly without 0003, but seems easier to review separately. 0003: Don't do sinval catchup and notify processing in signal handlers. It's quite cool that it worked that well so far, but it requires some complicated code and is rather fragile. 0002 allows to move that out of signal handlers and just use a latch there. This seems remarkably simpler: 4 files changed, 69 insertions(+), 229 deletions(-) These aren't ready for commit, especially not 0003, but I think they are quite a good foundation for getting rid of the blocking in send(). I haven't added any interrupt processing after interrupted writes, but marked the relevant places with XXXs. With regard to 0002, I dislike the current need to do interrupt processing both in be-secure.c and be-secure-openssl.c. I guess we could solve that by returning something like EINTR from the ssl routines when they need further reads/writes and do all the processing in one place in be-secure.c. There's also some cleanup in 0002/0003 needed: prepare_for_client_read()/client_read_ended() aren't needed in that form anymore and should probably rather be something like CHECK_FOR_READ_INTERRUPT() or similar. Similarly the EnableCatchupInterrupt()/DisableCatchupInterrupt() in autovacuum.c is pretty ugly. Btw, be-secure.c is really not a good name anymore... What do you think? I've invested some more time in this: 0002 now makes sense on its own and doesn't change anything around the interrupt handling. Oh, and it compiles without 0003. 0003 Sinval/notify processing got simplified further. There really isn't any need for DisableNotifyInterrupt/DisableCatchupInterrupt anymore. Also begin_client_read/client_read_ended don't make much sense anymore. Instead introduce ProcessClientReadInterrupt (which wants a better name). There's also a very WIP 0004 Allows secure_read/write be interrupted when ProcDiePending is set. All of that happens via the latch mechanism, nothing happens inside signal handlers. So I do think it's quite an improvement over what's been discussed in this thread. But it (and the other approaches) do noticeably increase the likelihood of clients not getting the error message if the client isn't actually dead. The likelihood of write() being blocked *temporarily* due to normal bandwidth constraints is quite high when you consider COPY FROM and similar. Right now we'll wait till we can put the error message into the socket afaics. 1-3 need some serious comment work, but I think the approach is basically sound. I'm much, much less sure about allowing send() to be interrupted. Kyatoro, could you check whether you can achieve what you want using 0004? It's imo pretty clear that a fair amount of base work needs to be done and there's been a fair amount of progress made this fest. I think this can now be marked returned with feedback. Myself is satisfied by Heikki's solution, and it seems ready for commit. But I agree with the temporarily blocked state is seen often and it breaks even non-blocked socket. If we want to/should avoid breaking *temporarily or not* blocked socket even for SIGTERM, this mechanism should be used. Which way should we take? regards, -- Kyotaro Horiguchi NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Escaping from blocked send() reprised.
By the way, Sorry, I missed this message and only cought up when reading your CF status mail. I've attached three patches: Could let me know how to get the CF status mail? regards, -- Kyotaro Horiguchi NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Escaping from blocked send() reprised.
On 2014-09-26 21:02:16 +0300, Heikki Linnakangas wrote: I propose the attached patch. It adds a new flag ImmediateDieOK, which is a weaker form of ImmediateInterruptOK that only allows handling a pending die-signal in the signal handler. Robert, others, do you see a problem with this? Per se I don't have a problem with it. There does exist the problem that the user doesn't get a error message in more cases though. On the other hand it's bad if any user can prevent the database from restarting. Over IM, Robert pointed out that it's not safe to jump out of a signal handler with siglongjmp, when we're inside library calls, like in a callback called by OpenSSL. But even with current master branch, that's exactly what we do. In secure_raw_read(), we set ImmediateInterruptOK = true, which means that any incoming signal will be handled directly in the signal handler, which can mean elog(ERROR). Should we be worried? OpenSSL might get confused if control never returns to the SSL_read() or SSL_write() function that called secure_raw_read(). But this is imo prohibitive. Yes, we're doing it for a long while. But no, that's not ok. It actually prompoted me into prototyping the latch thing (in some other thread). I don't think existing practice justifies expanding it further. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Escaping from blocked send() reprised.
On 2014-09-28 00:54:21 +0200, Andres Freund wrote: On 2014-09-27 21:12:43 +0200, Andres Freund wrote: On 2014-09-03 15:09:54 +0300, Heikki Linnakangas wrote: Sorry, I missed this message and only cought up when reading your CF status mail. I've attached three patches: 0001: Allows WaitLatchOrSocket(WL_WRITABLE) without WL_READABLE. I've tested the poll() and select() implementations on linux and blindly patched up windows. 0002: Put the socket the backend uses to communicate with the client into nonblocking mode as soon as latches are ready and use latches to wait. This probably doesn't work correctly without 0003, but seems easier to review separately. 0003: Don't do sinval catchup and notify processing in signal handlers. It's quite cool that it worked that well so far, but it requires some complicated code and is rather fragile. 0002 allows to move that out of signal handlers and just use a latch there. This seems remarkably simpler: 4 files changed, 69 insertions(+), 229 deletions(-) These aren't ready for commit, especially not 0003, but I think they are quite a good foundation for getting rid of the blocking in send(). I haven't added any interrupt processing after interrupted writes, but marked the relevant places with XXXs. With regard to 0002, I dislike the current need to do interrupt processing both in be-secure.c and be-secure-openssl.c. I guess we could solve that by returning something like EINTR from the ssl routines when they need further reads/writes and do all the processing in one place in be-secure.c. There's also some cleanup in 0002/0003 needed: prepare_for_client_read()/client_read_ended() aren't needed in that form anymore and should probably rather be something like CHECK_FOR_READ_INTERRUPT() or similar. Similarly the EnableCatchupInterrupt()/DisableCatchupInterrupt() in autovacuum.c is pretty ugly. Btw, be-secure.c is really not a good name anymore... What do you think? I've invested some more time in this: 0002 now makes sense on its own and doesn't change anything around the interrupt handling. Oh, and it compiles without 0003. 0003 Sinval/notify processing got simplified further. There really isn't any need for DisableNotifyInterrupt/DisableCatchupInterrupt anymore. Also begin_client_read/client_read_ended don't make much sense anymore. Instead introduce ProcessClientReadInterrupt (which wants a better name). There's also a very WIP 0004 Allows secure_read/write be interrupted when ProcDiePending is set. All of that happens via the latch mechanism, nothing happens inside signal handlers. So I do think it's quite an improvement over what's been discussed in this thread. But it (and the other approaches) do noticeably increase the likelihood of clients not getting the error message if the client isn't actually dead. The likelihood of write() being blocked *temporarily* due to normal bandwidth constraints is quite high when you consider COPY FROM and similar. Right now we'll wait till we can put the error message into the socket afaics. 1-3 need some serious comment work, but I think the approach is basically sound. I'm much, much less sure about allowing send() to be interrupted. Kyatoro, could you check whether you can achieve what you want using 0004? It's imo pretty clear that a fair amount of base work needs to be done and there's been a fair amount of progress made this fest. I think this can now be marked returned with feedback. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Escaping from blocked send() reprised.
On 2014-09-03 15:09:54 +0300, Heikki Linnakangas wrote: On 09/03/2014 12:23 AM, Andres Freund wrote: On 2014-09-02 17:21:03 -0400, Tom Lane wrote: Heikki Linnakangas hlinnakan...@vmware.com writes: I was going to suggest using WaitLatchOrSocket instead of sleeping in 1 second increment, but I see that WaitLatchOrSocket() doesn't currently support waiting for a socket to become writeable, without also waiting for it to become readable. I wonder how difficult it would be to lift that restriction. My recollection is that there was a reason for that, but I don't recall details any more. http://git.postgresql.org/pg/commitdiff/e42a21b9e6c9b9e6346a34b62628d48ff2fc6ddf In my prototype I've changed the API that errors set both READABLE/WRITABLE. Seems to work Andres, would you mind posting the WIP patch you have? That could be a better foundation for this patch. Sorry, I missed this message and only cought up when reading your CF status mail. I've attached three patches: 0001: Allows WaitLatchOrSocket(WL_WRITABLE) without WL_READABLE. I've tested the poll() and select() implementations on linux and blindly patched up windows. 0002: Put the socket the backend uses to communicate with the client into nonblocking mode as soon as latches are ready and use latches to wait. This probably doesn't work correctly without 0003, but seems easier to review separately. 0003: Don't do sinval catchup and notify processing in signal handlers. It's quite cool that it worked that well so far, but it requires some complicated code and is rather fragile. 0002 allows to move that out of signal handlers and just use a latch there. This seems remarkably simpler: 4 files changed, 69 insertions(+), 229 deletions(-) These aren't ready for commit, especially not 0003, but I think they are quite a good foundation for getting rid of the blocking in send(). I haven't added any interrupt processing after interrupted writes, but marked the relevant places with XXXs. With regard to 0002, I dislike the current need to do interrupt processing both in be-secure.c and be-secure-openssl.c. I guess we could solve that by returning something like EINTR from the ssl routines when they need further reads/writes and do all the processing in one place in be-secure.c. There's also some cleanup in 0002/0003 needed: prepare_for_client_read()/client_read_ended() aren't needed in that form anymore and should probably rather be something like CHECK_FOR_READ_INTERRUPT() or similar. Similarly the EnableCatchupInterrupt()/DisableCatchupInterrupt() in autovacuum.c is pretty ugly. Btw, be-secure.c is really not a good name anymore... What do you think? Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Escaping from blocked send() reprised.
On 09/09/2014 01:31 PM, Kyotaro HORIGUCHI wrote: Hi, I added and edited some comments. Sorry, It tha patch contains a silly bug. Please find the attatched one. I must say this scares the heck out of me. The current code goes through some trouble to not throw an error while in a recv() send(). For example, you removed the DoingCommandRead check from prepare_for_client_read(). There's an comment in postgres.c that says this: /* * (2) Allow asynchronous signals to be executed immediately if they * come in while we are waiting for client input. (This must be * conditional since we don't want, say, reads on behalf of COPY FROM * STDIN doing the same thing.) */ DoingCommandRead = true; With the patch, we do allow asynchronous signals to be processed while blocked in COPY FROM STDIN. Maybe that's OK, but I don't feel comfortable just changing it. (the comment is now wrong, of course) This patch also enables processing query cancel signals while blocked, not just SIGTERM. That's not good; we might be in the middle of sending a message, and we cannot just error out of that or we might violate the fe/be protocol. That's OK with a SIGTERM as you're terminating the connection anyway, and we have the PqCommBusy safeguard in place that prevents us from sending broken messages to the client, but that's not good enough if we wanted to keep the backend alive, as we won't be able to send anything to the client anymore. BTW, we've been talking about blocking in send(), but this patch also let's a recv() in e.g. COPY FROM STDIN to be interrupted. That's probably a good thing; surely you have exactly the same issues with that as with send(). But I didn't realize we had a problem with that too. In summary, this patch is not ready as it is, but I think we can fix it. The key question is: is it safe to handle SIGTERM in the signal handler, calling the exit-handlers and exiting the backend, when blocked in a recv() or send()? It's a change in the pqcomm.c API; most pqcomm.c functions have not thrown errors or processed interrupts before. But looking at the callers, I think it's safe, and there isn't actually any comments explicitly saying that pqcomm.c will never throw errors. I propose the attached patch. It adds a new flag ImmediateDieOK, which is a weaker form of ImmediateInterruptOK that only allows handling a pending die-signal in the signal handler. Robert, others, do you see a problem with this? Over IM, Robert pointed out that it's not safe to jump out of a signal handler with siglongjmp, when we're inside library calls, like in a callback called by OpenSSL. But even with current master branch, that's exactly what we do. In secure_raw_read(), we set ImmediateInterruptOK = true, which means that any incoming signal will be handled directly in the signal handler, which can mean elog(ERROR). Should we be worried? OpenSSL might get confused if control never returns to the SSL_read() or SSL_write() function that called secure_raw_read(). - Heikki diff --git a/src/backend/libpq/be-secure.c b/src/backend/libpq/be-secure.c index 41ec1ad..049e5b1 100644 --- a/src/backend/libpq/be-secure.c +++ b/src/backend/libpq/be-secure.c @@ -178,5 +178,13 @@ secure_write(Port *port, void *ptr, size_t len) ssize_t secure_raw_write(Port *port, const void *ptr, size_t len) { - return send(port-sock, ptr, len, 0); + ssize_t result; + + prepare_for_client_write(); + + result = send(port-sock, ptr, len, 0); + + client_write_ended(); + + return result; } diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c index 61f17bf..138060b 100644 --- a/src/backend/tcop/postgres.c +++ b/src/backend/tcop/postgres.c @@ -497,6 +497,17 @@ ReadCommand(StringInfo inBuf) * non-reentrant libc functions. This restriction makes it safe for us * to allow interrupt service routines to execute nontrivial code while * we are waiting for input. + * + * When waiting in the main loop, we can process any interrupt immediately + * in the signal handler. In any other read from the client, like in a COPY + * FROM STDIN, we can't safely process a query cancel signal, because we might + * be in the middle of sending a message to the client, and jumping out would + * violate the protocol. Or rather, pqcomm.c would detect it and refuse to + * send any more messages to the client. But handling a SIGTERM is OK, because + * we're terminating the backend and don't need to send any more messages + * anyway. That means that we might not be able to send an error message to + * the client, but that seems better than waiting indefinitely, in case the + * client is not responding. */ void prepare_for_client_read(void) @@ -513,6 +524,15 @@ prepare_for_client_read(void) /* And don't forget to detect one that already arrived */ CHECK_FOR_INTERRUPTS(); } + else + { + /* Allow die
Re: [HACKERS] Escaping from blocked send() reprised.
Sorry for the mistake... At Wed, 10 Sep 2014 18:53:03 +0300, Heikki Linnakangas hlinnakan...@vmware.com wrote in 541073df.70...@vmware.com Wrong thread... On 09/10/2014 03:04 AM, Kyotaro HORIGUCHI wrote: Hmm. Sorry, I misunderstood the specification. You approach that coloring tokens seems right, but you have broken the parse logic by adding your code. Other than the mistakes others pointed, I found that - non-SQL-ident like tokens are ignored by their token style, quoted or not, so the following line works. | local All aLL trust -- Kyotaro Horiguchi NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Escaping from blocked send() reprised.
Wrong thread... On 09/10/2014 03:04 AM, Kyotaro HORIGUCHI wrote: Hmm. Sorry, I misunderstood the specification. You approach that coloring tokens seems right, but you have broken the parse logic by adding your code. Other than the mistakes others pointed, I found that - non-SQL-ident like tokens are ignored by their token style, quoted or not, so the following line works. | local All aLL trust I suppose this is not what you intended. This is because you have igonred the attribute of a token when comparing it as non-SQL-ident tokens. - '+' at the head of the sequence '+' is treated as the first character of the *quoted* string. e.g. +hoge is tokenized as +hoge:special_quoted. I found this is what intended. This should be documented as comments. |2) users and user-groups only requires special handling and behavior as follows |Normal user : | A. unquoted ( USER ) will be treated as user ( downcase ). | B. quoted ( USeR ) will be treated as USeR (case-sensitive). | C. quoted ( +USER ) will be treated as normal user +USER (i.e. will not be considered as user-group) and case-sensitive as string is quoted. This seems confising with the B below. This seems should be rearranged. | User Group : | A. unquoted ( +USERGROUP ) will be treated as +usergruop ( downcase ). | B. plus quoted ( +UserGROUP ) will be treated as +UserGROUP (case-sensitive). This is why you simply continued processing for '+' without discarding and skipping the '+', and not setting in_quote so the following parser code works as it is not intended. You should understand what the original code does and insert or modify logics not braeking the assumptions. regards, -- - Heikki -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Escaping from blocked send() reprised.
Hi, I added and edited some comments. Sorry, It tha patch contains a silly bug. Please find the attatched one. regards, -- Kyotaro Horiguchi NTT Open Source Software Center From eb91a7c91e1fd3b24bf5bff0eb885f1c3d274637 Mon Sep 17 00:00:00 2001 From: Kyotaro Horiguchi horiguchi.kyot...@lab.ntt.co.jp Date: Fri, 5 Sep 2014 17:21:48 +0900 Subject: [PATCH] Simplly cutting off the socket if signalled during sending to client. --- src/backend/libpq/be-secure.c | 21 ++-- src/backend/libpq/pqcomm.c| 13 +++ src/backend/tcop/postgres.c | 71 ++--- src/include/libpq/libpq.h |1 + 4 files changed, 70 insertions(+), 36 deletions(-) diff --git a/src/backend/libpq/be-secure.c b/src/backend/libpq/be-secure.c index 41ec1ad..3006697 100644 --- a/src/backend/libpq/be-secure.c +++ b/src/backend/libpq/be-secure.c @@ -145,11 +145,11 @@ secure_raw_read(Port *port, void *ptr, size_t len) { ssize_t n; - prepare_for_client_read(); + prepare_for_client_comm(); n = recv(port-sock, ptr, len, 0); - client_read_ended(); + client_comm_ended(); return n; } @@ -178,5 +178,20 @@ secure_write(Port *port, void *ptr, size_t len) ssize_t secure_raw_write(Port *port, const void *ptr, size_t len) { - return send(port-sock, ptr, len, 0); + ssize_t n; + + /* + * If we get interrupted during send under execution without blocking, + * processing interrupt immediately actually throws away the chance to + * complete sending the bytes handed, but the chance which we could send + * one more tuple or maybe the final bytes has less not significance than + * the risk that we might can't bail out forever due to blocking send. + */ + prepare_for_client_comm(); + + n = send(port-sock, ptr, len, 0); + + client_comm_ended(); + + return n; } diff --git a/src/backend/libpq/pqcomm.c b/src/backend/libpq/pqcomm.c index 605d891..9b08529 100644 --- a/src/backend/libpq/pqcomm.c +++ b/src/backend/libpq/pqcomm.c @@ -1343,6 +1343,19 @@ pq_is_send_pending(void) } /* + * pq_is_busy - is there any I/O command running? + * + * This function is intended for use within signal handlers to check if + * any pqcomm I/O operation is under execution. + * + */ +bool +pq_is_busy(void) +{ + return PqCommBusy; +} + +/* * Message-level I/O routines begin here. * * These routines understand about the old-style COPY OUT protocol. diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c index 7b5480f..b29b200 100644 --- a/src/backend/tcop/postgres.c +++ b/src/backend/tcop/postgres.c @@ -303,16 +303,16 @@ InteractiveBackend(StringInfo inBuf) * * Even though we are not reading from a client process, we still want to * respond to signals, particularly SIGTERM/SIGQUIT. Hence we must use - * prepare_for_client_read and client_read_ended. + * prepare_for_client_comm and client_comm_ended. */ static int interactive_getc(void) { int c; - prepare_for_client_read(); + prepare_for_client_comm(); c = getc(stdin); - client_read_ended(); + client_comm_ended(); return c; } @@ -487,53 +487,47 @@ ReadCommand(StringInfo inBuf) } /* - * prepare_for_client_read -- set up to possibly block on client input + * prepare_for_client_comm -- set up to possibly block on client communication * - * This must be called immediately before any low-level read from the - * client connection. It is necessary to do it at a sufficiently low level - * that there won't be any other operations except the read kernel call - * itself between this call and the subsequent client_read_ended() call. + * This must be called immediately before any low-level read from or write to + * the client connection. It is necessary to do it at a sufficiently low + * level that there won't be any other operations except the read/write kernel + * call itself between this call and the subsequent client_comm_ended() call. * In particular there mustn't be use of malloc() or other potentially - * non-reentrant libc functions. This restriction makes it safe for us - * to allow interrupt service routines to execute nontrivial code while - * we are waiting for input. + * non-reentrant libc functions. This restriction makes it safe for us to + * allow interrupt service routines to execute nontrivial code while we are + * waiting for input or blocking of output. */ void -prepare_for_client_read(void) +prepare_for_client_comm(void) { - if (DoingCommandRead) - { - /* Enable immediate processing of asynchronous signals */ - EnableNotifyInterrupt(); - EnableCatchupInterrupt(); + /* Enable immediate processing of asynchronous signals */ + EnableNotifyInterrupt(); + EnableCatchupInterrupt(); - /* Allow cancel/die interrupts to be processed while waiting */ - ImmediateInterruptOK = true; + /* Allow cancel/die interrupts to be processed while waiting */ + ImmediateInterruptOK = true; - /*
Re: [HACKERS] Escaping from blocked send() reprised.
Hmm. Sorry, I misunderstood the specification. You approach that coloring tokens seems right, but you have broken the parse logic by adding your code. Other than the mistakes others pointed, I found that - non-SQL-ident like tokens are ignored by their token style, quoted or not, so the following line works. | local All aLL trust I suppose this is not what you intended. This is because you have igonred the attribute of a token when comparing it as non-SQL-ident tokens. - '+' at the head of the sequence '+' is treated as the first character of the *quoted* string. e.g. +hoge is tokenized as +hoge:special_quoted. I found this is what intended. This should be documented as comments. |2) users and user-groups only requires special handling and behavior as follows |Normal user : | A. unquoted ( USER ) will be treated as user ( downcase ). | B. quoted ( USeR ) will be treated as USeR (case-sensitive). | C. quoted ( +USER ) will be treated as normal user +USER (i.e. will not be considered as user-group) and case-sensitive as string is quoted. This seems confising with the B below. This seems should be rearranged. | User Group : | A. unquoted ( +USERGROUP ) will be treated as +usergruop ( downcase ). | B. plus quoted ( +UserGROUP ) will be treated as +UserGROUP (case-sensitive). This is why you simply continued processing for '+' without discarding and skipping the '+', and not setting in_quote so the following parser code works as it is not intended. You should understand what the original code does and insert or modify logics not braeking the assumptions. regards, -- Kyotaro Horiguchi NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Escaping from blocked send() reprised.
Hello, - This patch introduced redundant socket emulation for win32 backend but win32 bare socket for Port is already nonblocking as described so it donsn't seem to be a serious problem on performance. Addition to it, since I don't know the reason why win32/socket.c provides the blocking-mode socket emulation, I decided to preserve win32/socket.c to have blocking socket emulation. Possibly it can be removed. On Windows, the backend has an emulation layer for POSIX signals, which uses threads and Windows events. The reason win32/socket.c always uses non-blocking mode internally is that it needs to wait for the socket to become readable/writeable, and for the signal-emulation event, at the same time. So no, we can't remove it. I see, thank you. The approach taken in the first patch seems sensible. I changed it to not use FD_SET, though. A custom array seems better, that way we don't need the pgwin32_nonblockset_init() call, we can just use initialize the variable. It's a little bit more code, but it's well-contained in win32/socket.c. Please take a look, to double-check that I didn't screw up. Thank you. I felt a bit qualm to abusing fd_set. A bit more code is not a problem. I had close look on your patch. Both 'nonblocking' and 'noblock' are appears in function names, pgwin32_set_socket_block/noblock/is_nonblocking(). I prefer nonblocking/blocking pair but I'm satisfied they are in uniform style anyway. (Though I also didn't so ;p) pgwin32_set_socket_block() leaves garbage in nonblocking_sockets[] but it's no problem practically. You also removed blocking'ize(?) code but I agree that it is correct because fds of nonclosed socket won't be reused anyway. pg_set_block/noblock() made me laugh. Yes you're correct. Sorry for the bronken (but workable) code. After all, the patch looks pretty good. I'll continue to fit the another patch onto this. regards, -- Kyotaro Horiguchi NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Escaping from blocked send() reprised.
Hello, attached is the new-and-far-simple version of this patch. It no longer depends on win32 nonblocking patch since the socket is in blocking mode again. On 08/28/2014 03:47 PM, Kyotaro HORIGUCHI wrote: - Preventing protocol violation. To prevent protocol violation, secure_write sets ClientConnectionLost when SIGTERM detected, then internal_flush() and ProcessInterrupts() follow the instruction. Oh, hang on. Now that I look at pqcomm.c more closely, it already has a mechanism to avoid writing a message in the middle of another message. See pq_putmessage and PqCommBusy. Can we rely on that? Hmm, it gracefully returns up to ExecProcNode() and PqCommBusy is turned off on the way at pq_putmessage() under current implement. So PqCommBusy is already false before it runs into ProcessInterrupts(). Allowing ImmediateInterruptOK on signalled during send(), setting whereToSendOutput to DestNone if PqCommBusy is true will do. We can also distinguish read and write by looking DoingCommandRead. The ImmediateInterruptOK section can be defined enclosing by prepare_for_client_read/client_read_end. - Single pg_terminate_backend surely kills the backend. secure_raw_write() uses non-blocking socket and a loop of select() with timeout to surely detects received signal(SIGTERM). I was going to suggest using WaitLatchOrSocket instead of sleeping in 1 second increment, but I see that WaitLatchOrSocket() doesn't currently support waiting for a socket to become writeable, without also waiting for it to become readable. I wonder how difficult it would be to lift that restriction. It seems quite difficult hearing the following discussion. I also wonder if it would be simpler to keep the socket in blocking mode after all, and just close() in the signal handler if PqCommBusy == true. If the signal arrives while sleeping in send(), the effect would be the same as with your patch. If the signal arrives while sending, but not sleeping, you would not get the chance to send the already-buffered data to the client. But maybe that's OK, whether or not you're blocked is not very deterministic anyway. Hmm. We're back round to the my first patch, with immediately close the socket, and became irrelevant to win32 layer patch. Anyway, it sounds reasonable. Attached patch is a quick hack patch, but it seems working as expected at a glance. regards, -- Kyotaro Horiguchi NTT Open Source Software Center From 7fcb6ef2e66231605e49bd51cd09d275b40cfd57 Mon Sep 17 00:00:00 2001 From: Kyotaro Horiguchi horiguchi.kyot...@lab.ntt.co.jp Date: Fri, 5 Sep 2014 17:21:48 +0900 Subject: [PATCH] Simplly cutting off the socket if signalled during sending to client. --- src/backend/libpq/be-secure.c | 14 +++--- src/backend/libpq/pqcomm.c|6 ++ src/backend/tcop/postgres.c | 40 +--- src/include/libpq/libpq.h |1 + 4 files changed, 39 insertions(+), 22 deletions(-) diff --git a/src/backend/libpq/be-secure.c b/src/backend/libpq/be-secure.c index 41ec1ad..329812b 100644 --- a/src/backend/libpq/be-secure.c +++ b/src/backend/libpq/be-secure.c @@ -145,11 +145,11 @@ secure_raw_read(Port *port, void *ptr, size_t len) { ssize_t n; - prepare_for_client_read(); + prepare_for_client_comm(); n = recv(port-sock, ptr, len, 0); - client_read_ended(); + client_comm_ended(); return n; } @@ -178,5 +178,13 @@ secure_write(Port *port, void *ptr, size_t len) ssize_t secure_raw_write(Port *port, const void *ptr, size_t len) { - return send(port-sock, ptr, len, 0); + ssize_t n; + + prepare_for_client_comm(); + + n = send(port-sock, ptr, len, 0); + + client_comm_ended(); + + return n; } diff --git a/src/backend/libpq/pqcomm.c b/src/backend/libpq/pqcomm.c index 605d891..8f84f67 100644 --- a/src/backend/libpq/pqcomm.c +++ b/src/backend/libpq/pqcomm.c @@ -1342,6 +1342,12 @@ pq_is_send_pending(void) return (PqSendStart PqSendPointer); } +bool +pq_is_busy(void) +{ + return PqCommBusy; +} + /* * Message-level I/O routines begin here. * diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c index 7b5480f..7a4c483 100644 --- a/src/backend/tcop/postgres.c +++ b/src/backend/tcop/postgres.c @@ -303,16 +303,16 @@ InteractiveBackend(StringInfo inBuf) * * Even though we are not reading from a client process, we still want to * respond to signals, particularly SIGTERM/SIGQUIT. Hence we must use - * prepare_for_client_read and client_read_ended. + * prepare_for_client_comm and client_comm_ended. */ static int interactive_getc(void) { int c; - prepare_for_client_read(); + prepare_for_client_comm(); c = getc(stdin); - client_read_ended(); + client_comm_ended(); return c; } @@ -487,7 +487,7 @@ ReadCommand(StringInfo inBuf) } /* - * prepare_for_client_read -- set up to possibly block on client input + * prepare_for_client_comm -- set up to possibly block
Re: [HACKERS] Escaping from blocked send() reprised.
Sorry, It tha patch contains a silly bug. Please find the attatched one. Hello, attached is the new-and-far-simple version of this patch. It no longer depends on win32 nonblocking patch since the socket is in blocking mode again. On 08/28/2014 03:47 PM, Kyotaro HORIGUCHI wrote: - Preventing protocol violation. To prevent protocol violation, secure_write sets ClientConnectionLost when SIGTERM detected, then internal_flush() and ProcessInterrupts() follow the instruction. Oh, hang on. Now that I look at pqcomm.c more closely, it already has a mechanism to avoid writing a message in the middle of another message. See pq_putmessage and PqCommBusy. Can we rely on that? Hmm, it gracefully returns up to ExecProcNode() and PqCommBusy is turned off on the way at pq_putmessage() under current implement. So PqCommBusy is already false before it runs into ProcessInterrupts(). Allowing ImmediateInterruptOK on signalled during send(), setting whereToSendOutput to DestNone if PqCommBusy is true will do. We can also distinguish read and write by looking DoingCommandRead. The ImmediateInterruptOK section can be defined enclosing by prepare_for_client_read/client_read_end. - Single pg_terminate_backend surely kills the backend. secure_raw_write() uses non-blocking socket and a loop of select() with timeout to surely detects received signal(SIGTERM). I was going to suggest using WaitLatchOrSocket instead of sleeping in 1 second increment, but I see that WaitLatchOrSocket() doesn't currently support waiting for a socket to become writeable, without also waiting for it to become readable. I wonder how difficult it would be to lift that restriction. It seems quite difficult hearing the following discussion. I also wonder if it would be simpler to keep the socket in blocking mode after all, and just close() in the signal handler if PqCommBusy == true. If the signal arrives while sleeping in send(), the effect would be the same as with your patch. If the signal arrives while sending, but not sleeping, you would not get the chance to send the already-buffered data to the client. But maybe that's OK, whether or not you're blocked is not very deterministic anyway. Hmm. We're back round to the my first patch, with immediately close the socket, and became irrelevant to win32 layer patch. Anyway, it sounds reasonable. Attached patch is a quick hack patch, but it seems working as expected at a glance. From 11da4bc3c214490671d27379910a667f06cc35af Mon Sep 17 00:00:00 2001 From: Kyotaro Horiguchi horiguchi.kyot...@lab.ntt.co.jp Date: Fri, 5 Sep 2014 17:21:48 +0900 Subject: [PATCH] Simplly cutting off the socket if signalled during sending to client. --- src/backend/libpq/be-secure.c | 14 -- src/backend/libpq/pqcomm.c|6 src/backend/tcop/postgres.c | 53 - src/include/libpq/libpq.h |1 + 4 files changed, 44 insertions(+), 30 deletions(-) diff --git a/src/backend/libpq/be-secure.c b/src/backend/libpq/be-secure.c index 41ec1ad..329812b 100644 --- a/src/backend/libpq/be-secure.c +++ b/src/backend/libpq/be-secure.c @@ -145,11 +145,11 @@ secure_raw_read(Port *port, void *ptr, size_t len) { ssize_t n; - prepare_for_client_read(); + prepare_for_client_comm(); n = recv(port-sock, ptr, len, 0); - client_read_ended(); + client_comm_ended(); return n; } @@ -178,5 +178,13 @@ secure_write(Port *port, void *ptr, size_t len) ssize_t secure_raw_write(Port *port, const void *ptr, size_t len) { - return send(port-sock, ptr, len, 0); + ssize_t n; + + prepare_for_client_comm(); + + n = send(port-sock, ptr, len, 0); + + client_comm_ended(); + + return n; } diff --git a/src/backend/libpq/pqcomm.c b/src/backend/libpq/pqcomm.c index 605d891..8f84f67 100644 --- a/src/backend/libpq/pqcomm.c +++ b/src/backend/libpq/pqcomm.c @@ -1342,6 +1342,12 @@ pq_is_send_pending(void) return (PqSendStart PqSendPointer); } +bool +pq_is_busy(void) +{ + return PqCommBusy; +} + /* * Message-level I/O routines begin here. * diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c index 7b5480f..15627c3 100644 --- a/src/backend/tcop/postgres.c +++ b/src/backend/tcop/postgres.c @@ -303,16 +303,16 @@ InteractiveBackend(StringInfo inBuf) * * Even though we are not reading from a client process, we still want to * respond to signals, particularly SIGTERM/SIGQUIT. Hence we must use - * prepare_for_client_read and client_read_ended. + * prepare_for_client_comm and client_comm_ended. */ static int interactive_getc(void) { int c; - prepare_for_client_read(); + prepare_for_client_comm(); c = getc(stdin); - client_read_ended(); + client_comm_ended(); return c; } @@ -487,7 +487,7 @@ ReadCommand(StringInfo inBuf) } /* - * prepare_for_client_read -- set up to possibly block on client
Re: [HACKERS] Escaping from blocked send() reprised.
On Tue, Sep 2, 2014 at 3:01 PM, Andres Freund and...@2ndquadrant.com wrote: I'm slightly worried about the added overhead due to the latch code. In my implementation I only use latches after a nonblocking read, but still. Every WaitLatchOrSocket() does a drainSelfPipe(). I wonder if that can be made problematic. I think that's not the word you're looking for. Or if it is, then - it's already problematic. At some point I hacked up a very crude prototype that made LWLocks use latches to sleep instead of semaphores. It was slow. AIUI, the only reason why we need the self-pipe thing is because on some platforms signals don't interrupt system calls. But my impression was that those platforms were somewhat obscure. Could we have a separate latch implementation for platforms where we know that system calls will get interrupted by signals? Alternatively, should we consider reimplementing latches using semaphores? I assume having the signal handler up the semaphore would allow the attempt to down the semaphore to succeed on return from the handler, so it would accomplish the same thing as the self-pipe trick. Basically, it doesn't feel like a good thing that we've got two sets of primitives for making a backend wait that (1) don't really know about each other and (2) use different operating system primitives. Presumably one of the two systems is better; let's figure out which one it is, use that one all the time, and get rid of the other one. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Escaping from blocked send() reprised.
On 09/04/2014 03:49 PM, Robert Haas wrote: On Tue, Sep 2, 2014 at 3:01 PM, Andres Freund and...@2ndquadrant.com wrote: I'm slightly worried about the added overhead due to the latch code. In my implementation I only use latches after a nonblocking read, but still. Every WaitLatchOrSocket() does a drainSelfPipe(). I wonder if that can be made problematic. I think that's not the word you're looking for. Or if it is, then - it's already problematic. At some point I hacked up a very crude prototype that made LWLocks use latches to sleep instead of semaphores. It was slow. Hmm. Perhaps we should call drainSelfPipe() only after poll/select returns saying that there is something in the self-pipe. That would be a win assuming it's more common for the self-pipe to be empty. AIUI, the only reason why we need the self-pipe thing is because on some platforms signals don't interrupt system calls. That's not the only reason. It also eliminates the race condition that someone might set the latch after we've checked that it's not set, but before calling poll/select. The same reason that ppoll and pselect exist. But my impression was that those platforms were somewhat obscure. Could we have a separate latch implementation for platforms where we know that system calls will get interrupted by signals? ... and have ppoll or pselect. Yeah, seems reasonable, assuming that ppoll/pselect is faster. Alternatively, should we consider reimplementing latches using semaphores? I assume having the signal handler up the semaphore would allow the attempt to down the semaphore to succeed on return from the handler, so it would accomplish the same thing as the self-pipe trick. I don't think there's a function to wait for a file descriptor or semaphore at the same time. - Heikki -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Escaping from blocked send() reprised.
On Thu, Sep 4, 2014 at 9:05 AM, Heikki Linnakangas hlinnakan...@vmware.com wrote: Hmm. Perhaps we should call drainSelfPipe() only after poll/select returns saying that there is something in the self-pipe. That would be a win assuming it's more common for the self-pipe to be empty. Couldn't hurt. But my impression was that those platforms were somewhat obscure. Could we have a separate latch implementation for platforms where we know that system calls will get interrupted by signals? ... and have ppoll or pselect. Yeah, seems reasonable, assuming that ppoll/pselect is faster. Hrm. So we'd have to block SIGUSR1, check the flag, then use pselect() to temporarily unblock SIGUSR1 and wait, then on return again unblock SIGUSR1? Doesn't seem very appealing. I think changing the signal mask is fast on Linux, but quite slow on at least some other UNIX-like platforms. And I've heard that pselect() isn't always truly atomic, so we might run into platform-specific bugs, too. I wonder if there's a better way e.g. using memory barriers. WaitLatch: check is_set. if yes then done. otherwise, set signal_me. memory barrier. recheck is_set. if not set then wait using poll/select. memory barrier. clear signal_me. SetLatch: check is_set. if yes then done. otherwise, set is_set. memory barrier. check signal_me. if set, then send SIGUSR1. Alternatively, should we consider reimplementing latches using semaphores? I assume having the signal handler up the semaphore would allow the attempt to down the semaphore to succeed on return from the handler, so it would accomplish the same thing as the self-pipe trick. I don't think there's a function to wait for a file descriptor or semaphore at the same time. Oh, good point. So that's out, then. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Escaping from blocked send() reprised.
On 09/04/2014 04:37 PM, Robert Haas wrote: Hrm. So we'd have to block SIGUSR1, check the flag, then use pselect() to temporarily unblock SIGUSR1 and wait, then on return again unblock SIGUSR1? Doesn't seem very appealing. I think changing the signal mask is fast on Linux, but quite slow on at least some other UNIX-like platforms. And I've heard that pselect() isn't always truly atomic, so we might run into platform-specific bugs, too. I wonder if there's a better way e.g. using memory barriers. WaitLatch: check is_set. if yes then done. otherwise, set signal_me. memory barrier. recheck is_set. if not set then wait using poll/select. memory barrier. clear signal_me. SetLatch: check is_set. if yes then done. otherwise, set is_set. memory barrier. check signal_me. if set, then send SIGUSR1. Doesn't work. No matter what you do, the process running WaitLatch might receive the signal immediately before it calls poll/select. The signal handler will run, and the poll/select call will then go to sleep. There is no way to do this without support from the kernel, that is why ppoll/pselect exist. - Heikki -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Escaping from blocked send() reprised.
On Thu, Sep 4, 2014 at 9:53 AM, Heikki Linnakangas hlinnakan...@vmware.com wrote: On 09/04/2014 04:37 PM, Robert Haas wrote: Hrm. So we'd have to block SIGUSR1, check the flag, then use pselect() to temporarily unblock SIGUSR1 and wait, then on return again unblock SIGUSR1? Doesn't seem very appealing. I think changing the signal mask is fast on Linux, but quite slow on at least some other UNIX-like platforms. And I've heard that pselect() isn't always truly atomic, so we might run into platform-specific bugs, too. I wonder if there's a better way e.g. using memory barriers. WaitLatch: check is_set. if yes then done. otherwise, set signal_me. memory barrier. recheck is_set. if not set then wait using poll/select. memory barrier. clear signal_me. SetLatch: check is_set. if yes then done. otherwise, set is_set. memory barrier. check signal_me. if set, then send SIGUSR1. Doesn't work. No matter what you do, the process running WaitLatch might receive the signal immediately before it calls poll/select. The signal handler will run, and the poll/select call will then go to sleep. There is no way to do this without support from the kernel, that is why ppoll/pselect exist. Eesh, I was confused there: ignore me. I was trying to optimize away the signal handling but assuming we still had the self-pipe byte. But of course in that case we don't need to change anything at all. I'm going to go get some more caffeine. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Escaping from blocked send() reprised.
On 09/03/2014 12:23 AM, Andres Freund wrote: On 2014-09-02 17:21:03 -0400, Tom Lane wrote: Heikki Linnakangas hlinnakan...@vmware.com writes: I was going to suggest using WaitLatchOrSocket instead of sleeping in 1 second increment, but I see that WaitLatchOrSocket() doesn't currently support waiting for a socket to become writeable, without also waiting for it to become readable. I wonder how difficult it would be to lift that restriction. My recollection is that there was a reason for that, but I don't recall details any more. http://git.postgresql.org/pg/commitdiff/e42a21b9e6c9b9e6346a34b62628d48ff2fc6ddf In my prototype I've changed the API that errors set both READABLE/WRITABLE. Seems to work Andres, would you mind posting the WIP patch you have? That could be a better foundation for this patch. - Heikki -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Escaping from blocked send() reprised.
On 08/28/2014 03:47 PM, Kyotaro HORIGUCHI wrote: To make the code mentioned above (Patch 0002) tidy, rewrite the socket emulation code for win32 backends so that each socket can have its own non-blocking state. (patch 0001) The first patch that makes non-blocking sockets behave more sanely on Windows seems like a good idea, independently of the second patch. I'm looking at the first patch now, I'll make a separate post about the second patch. Some concern about this patch, - This patch allows the number of non-blocking socket to be below 64 (FD_SETSIZE) on win32 backend but it seems to be sufficient. Yeah, that's plenty. - This patch introduced redundant socket emulation for win32 backend but win32 bare socket for Port is already nonblocking as described so it donsn't seem to be a serious problem on performance. Addition to it, since I don't know the reason why win32/socket.c provides the blocking-mode socket emulation, I decided to preserve win32/socket.c to have blocking socket emulation. Possibly it can be removed. On Windows, the backend has an emulation layer for POSIX signals, which uses threads and Windows events. The reason win32/socket.c always uses non-blocking mode internally is that it needs to wait for the socket to become readable/writeable, and for the signal-emulation event, at the same time. So no, we can't remove it. The approach taken in the first patch seems sensible. I changed it to not use FD_SET, though. A custom array seems better, that way we don't need the pgwin32_nonblockset_init() call, we can just use initialize the variable. It's a little bit more code, but it's well-contained in win32/socket.c. Please take a look, to double-check that I didn't screw up. - Heikki commit aaaec23b08677baaed900f72db3f9628c0070922 Author: Heikki Linnakangas heikki.linnakan...@iki.fi Date: Tue Sep 2 20:05:47 2014 +0300 Improve non-blocking sockets emulation on Windows. diff --git a/src/backend/libpq/pqcomm.c b/src/backend/libpq/pqcomm.c index 605d891..cba79a7 100644 --- a/src/backend/libpq/pqcomm.c +++ b/src/backend/libpq/pqcomm.c @@ -795,10 +795,6 @@ pq_set_nonblocking(bool nonblocking) if (MyProcPort-noblock == nonblocking) return; -#ifdef WIN32 - pgwin32_noblock = nonblocking ? 1 : 0; -#else - /* * Use COMMERROR on failure, because ERROR would try to send the error to * the client, which might require changing the mode again, leading to @@ -816,7 +812,6 @@ pq_set_nonblocking(bool nonblocking) ereport(COMMERROR, (errmsg(could not set socket to blocking mode: %m))); } -#endif MyProcPort-noblock = nonblocking; } diff --git a/src/backend/port/win32/socket.c b/src/backend/port/win32/socket.c index c981169..51982f0 100644 --- a/src/backend/port/win32/socket.c +++ b/src/backend/port/win32/socket.c @@ -3,6 +3,9 @@ * socket.c * Microsoft Windows Win32 Socket Functions * + * Blocking socket functions implemented so they listen on both the socket + * and the signal event, required for signal handling. + * * Portions Copyright (c) 1996-2014, PostgreSQL Global Development Group * * IDENTIFICATION @@ -14,18 +17,19 @@ #include postgres.h /* - * Indicate if pgwin32_recv() and pgwin32_send() should operate - * in non-blocking mode. - * - * Since the socket emulation layer always sets the actual socket to - * non-blocking mode in order to be able to deliver signals, we must - * specify this in a separate flag if we actually need non-blocking - * operation. - * - * This flag changes the behaviour *globally* for all socket operations, - * so it should only be set for very short periods of time. + * Keep track which sockets are in non-blocking mode. Since the socket + * emulation layer always sets the actual socket to non-blocking mode in + * order to be able to deliver signals, we must track ourselves whether to + * present blocking or non-blocking behavior to the callers of pgwin32_recv() + * and pgwin32_send(). We expect there to be only a few non-blocking sockets, + * so a small array will do. */ -int pgwin32_noblock = 0; +#define MAX_NONBLOCKING_SOCKETS 10 + +static SOCKET nonblocking_sockets[MAX_NONBLOCKING_SOCKETS]; +static int num_nonblocking_sockets = 0; + +static bool pgwin32_socket_is_nonblocking(SOCKET s); #undef socket #undef accept @@ -33,11 +37,57 @@ int pgwin32_noblock = 0; #undef select #undef recv #undef send +#undef closesocket /* - * Blocking socket functions implemented so they listen on both - * the socket and the signal event, required for signal handling. + * Add a socket to the list of non-blocking sockets. */ +void +pgwin32_set_socket_noblock(SOCKET s) +{ + if (pgwin32_socket_is_nonblocking(s)) + return; /* already non-nlocking */ + + if (num_nonblocking_sockets = MAX_NONBLOCKING_SOCKETS) + elog(ERROR, too many non-blocking sockets); + + nonblocking_sockets[num_nonblocking_sockets++] = s; +} + +/* + * Remove a socket from the list of non-blocking
Re: [HACKERS] Escaping from blocked send() reprised.
On 08/28/2014 03:47 PM, Kyotaro HORIGUCHI wrote: - Preventing protocol violation. To prevent protocol violation, secure_write sets ClientConnectionLost when SIGTERM detected, then internal_flush() and ProcessInterrupts() follow the instruction. Oh, hang on. Now that I look at pqcomm.c more closely, it already has a mechanism to avoid writing a message in the middle of another message. See pq_putmessage and PqCommBusy. Can we rely on that? - Single pg_terminate_backend surely kills the backend. secure_raw_write() uses non-blocking socket and a loop of select() with timeout to surely detects received signal(SIGTERM). I was going to suggest using WaitLatchOrSocket instead of sleeping in 1 second increment, but I see that WaitLatchOrSocket() doesn't currently support waiting for a socket to become writeable, without also waiting for it to become readable. I wonder how difficult it would be to lift that restriction. I also wonder if it would be simpler to keep the socket in blocking mode after all, and just close() in the signal handler if PqCommBusy == true. If the signal arrives while sleeping in send(), the effect would be the same as with your patch. If the signal arrives while sending, but not sleeping, you would not get the chance to send the already-buffered data to the client. But maybe that's OK, whether or not you're blocked is not very deterministic anyway. - Heikki -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Escaping from blocked send() reprised.
On 2014-09-02 21:46:29 +0300, Heikki Linnakangas wrote: I was going to suggest using WaitLatchOrSocket instead of sleeping in 1 second increment, but I see that WaitLatchOrSocket() doesn't currently support waiting for a socket to become writeable, without also waiting for it to become readable. I wonder how difficult it would be to lift that restriction. It's imo not that difficult. I've a prototype patch for that somewhere. I tested my poll() implementation and it worked, but didn't yet get to select(). I also wonder if it would be simpler to keep the socket in blocking mode after all, and just close() in the signal handler if PqCommBusy == true. If the signal arrives while sleeping in send(), the effect would be the same as with your patch. If the signal arrives while sending, but not sleeping, you would not get the chance to send the already-buffered data to the client. But maybe that's OK, whether or not you're blocked is not very deterministic anyway. I've actually been working on a patch to make the whole interaction with the client using sockets. The reason I started so is that we lots of far to complex stuff in signal handlers, and using a latch would allow us to instead interrupt send/recv. While still heavily WIP the reduction in odd stuff (check e.g. HandleCatchupInterrupt()) made me rather happy. I'm slightly worried about the added overhead due to the latch code. In my implementation I only use latches after a nonblocking read, but still. Every WaitLatchOrSocket() does a drainSelfPipe(). I wonder if that can be made problematic. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Escaping from blocked send() reprised.
On 2014-09-02 21:01:44 +0200, Andres Freund wrote: I've actually been working on a patch to make the whole interaction with the client using sockets. The reason I started so is that we lots of far to complex stuff in signal handlers, and using a latch would allow us to instead interrupt send/recv. While still heavily WIP the reduction in odd stuff (check e.g. HandleCatchupInterrupt()) made me rather happy. Actually, the even more important reason is that that would allow us to throw errors/fatals sanely in interrupts because we wouldn't possibly jump through openssl anymore... Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Escaping from blocked send() reprised.
Heikki Linnakangas hlinnakan...@vmware.com writes: I was going to suggest using WaitLatchOrSocket instead of sleeping in 1 second increment, but I see that WaitLatchOrSocket() doesn't currently support waiting for a socket to become writeable, without also waiting for it to become readable. I wonder how difficult it would be to lift that restriction. My recollection is that there was a reason for that, but I don't recall details any more. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Escaping from blocked send() reprised.
On 2014-09-02 17:21:03 -0400, Tom Lane wrote: Heikki Linnakangas hlinnakan...@vmware.com writes: I was going to suggest using WaitLatchOrSocket instead of sleeping in 1 second increment, but I see that WaitLatchOrSocket() doesn't currently support waiting for a socket to become writeable, without also waiting for it to become readable. I wonder how difficult it would be to lift that restriction. My recollection is that there was a reason for that, but I don't recall details any more. http://git.postgresql.org/pg/commitdiff/e42a21b9e6c9b9e6346a34b62628d48ff2fc6ddf In my prototype I've changed the API that errors set both READABLE/WRITABLE. Seems to work Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Escaping from blocked send() reprised.
Hello, sorry for the dazed reply in the previous mail. I made revised patch for this issue. Attached patches are following, - 0001_Revise_socket_emulation_for_win32_backend.patch Revises socket emulation on win32 backend so that each socket can have its own blocking mode state. - 0002_Allow_backend_termination_during_write_blocking.patch The patch to solve the issue. This patch depends on the 0001_ patch. == I'm marking this as Waiting on Author in the commitfest app, because: 1. the protocol violation needs to be avoided one way or another, and 2. the behavior needs to be consistent so that a single pg_terminate_backend() is enough to always kill the connection. - Preventing protocol violation. To prevent protocol violation, secure_write sets ClientConnectionLost when SIGTERM detected, then internal_flush() and ProcessInterrupts() follow the instruction. - Single pg_terminate_backend surely kills the backend. secure_raw_write() uses non-blocking socket and a loop of select() with timeout to surely detects received signal(SIGTERM). To avoid frequent switching of blocking mode, the bare socket for Port is put to non-blocking mode from the first in StreamConnection() and blocking mode is controlled only by Port-noblock in secure_raw_read/write(). To make the code mentioned above (Patch 0002) tidy, rewrite the socket emulation code for win32 backends so that each socket can have its own non-blocking state. (patch 0001) Some concern about this patch, - This patch allows the number of non-blocking socket to be below 64 (FD_SETSIZE) on win32 backend but it seems to be sufficient. - This patch introduced redundant socket emulation for win32 backend but win32 bare socket for Port is already nonblocking as described so it donsn't seem to be a serious problem on performance. Addition to it, since I don't know the reason why win32/socket.c provides the blocking-mode socket emulation, I decided to preserve win32/socket.c to have blocking socket emulation. Possibly it can be removed. Any suggestions? regards, -- Kyotaro Horiguchi NTT Open Source Software Center diff --git a/src/backend/libpq/pqcomm.c b/src/backend/libpq/pqcomm.c index 605d891..c92851e 100644 --- a/src/backend/libpq/pqcomm.c +++ b/src/backend/libpq/pqcomm.c @@ -795,10 +795,6 @@ pq_set_nonblocking(bool nonblocking) if (MyProcPort-noblock == nonblocking) return; -#ifdef WIN32 - pgwin32_noblock = nonblocking ? 1 : 0; -#else - /* * Use COMMERROR on failure, because ERROR would try to send the error to * the client, which might require changing the mode again, leading to @@ -816,7 +812,7 @@ pq_set_nonblocking(bool nonblocking) ereport(COMMERROR, (errmsg(could not set socket to blocking mode: %m))); } -#endif + MyProcPort-noblock = nonblocking; } diff --git a/src/backend/port/win32/socket.c b/src/backend/port/win32/socket.c index c981169..f0ff3e7 100644 --- a/src/backend/port/win32/socket.c +++ b/src/backend/port/win32/socket.c @@ -21,11 +21,8 @@ * non-blocking mode in order to be able to deliver signals, we must * specify this in a separate flag if we actually need non-blocking * operation. - * - * This flag changes the behaviour *globally* for all socket operations, - * so it should only be set for very short periods of time. */ -int pgwin32_noblock = 0; +static fd_set nonblockset; #undef socket #undef accept @@ -33,6 +30,7 @@ int pgwin32_noblock = 0; #undef select #undef recv #undef send +#undef closesocket /* * Blocking socket functions implemented so they listen on both @@ -40,6 +38,34 @@ int pgwin32_noblock = 0; */ /* + * Set blocking mode for each socket + */ +void +pgwin32_set_socket_nonblock(SOCKET s, int nonblock) +{ + if (nonblock) + FD_SET(s, nonblockset); + else + FD_CLR(s, nonblockset); + + /* + * fd_set cannot have more than FD_SETSIZE entries. It's not likey to come + * close to this limit but if it goes above the limit, non blocking state + * of some existing sockets will be discarded. + */ + if (nonblockset.fd_count = FD_SETSIZE) + elog(FATAL, Too many sockets requested to be nonblocking mode.); +} + +void +pgwin32_nonblockset_init() +{ + FD_ZERO(nonblockset); +} + +#define socket_is_nonblocking(s) FD_ISSET((s), nonblockset) + +/* * Convert the last socket error code into errno */ static void @@ -256,6 +282,10 @@ pgwin32_socket(int af, int type, int protocol) TranslateSocketError(); return INVALID_SOCKET; } + + /* newly cerated socket should be in blocking mode */ + pgwin32_set_socket_nonblock(s, false); + errno = 0; return s; @@ -334,7 +364,7 @@ pgwin32_recv(SOCKET s, char *buf, int len, int f) return -1; } - if (pgwin32_noblock) + if (socket_is_nonblocking(s)) { /* * No data received, and we are in emulated non-blocking mode, so @@ -420,7 +450,7 @@ pgwin32_send(SOCKET s, const void *buf, int len, int flags) return -1; } - if
Re: [HACKERS] Escaping from blocked send() reprised.
Sorry, I was absorbed by other tasks.. Thank you for reviewing thiis. On 07/01/2014 06:26 AM, Kyotaro HORIGUCHI wrote: At Mon, 30 Jun 2014 11:27:47 -0400, Robert Haas robertmh...@gmail.com wrote in CA+TgmoZfcGzAEmtbyoCe6VdHnq085x+ox752zuJ2AKN=wc8...@mail.gmail.com 1. I think it's the case that there are platforms around where a signal won't cause send() to return EINTR and I'd be entirely unsurprised if SSL_write() doesn't necessarily return EINTR in that case. I'm not sure what, if anything, we can do about that. We use a custom write routine with SSL_write, where we call send() ourselves, so that's not a problem as long as we put the check in the right place (in secure_raw_write(), after my recent SSL refactoring - the patch needs to be rebased). man 2 send on FreeBSD has not description about EINTR.. And even on linux, send won't return EINTR for most cases, at least I haven't seen that. So send()=-1,EINTR seems to me as only an equivalent of send() = 0. I have no idea about what the implementer thought the difference is. As the patch stands, there's a race condition: if the SIGTERM arrives *before* the send() call, the send() won't return EINTR anyway. So there's a chance that you still block. Calling pq_terminate_backend() again will dislodge it (assuming send() returns with EINTR on signal), Yes, that window would'nt be extinguished without introducing something more. EINTR is set only when nothing sent by the call. So AFAIS the chance of getting EINTR is far small than expectation. but I don't think we want to define the behavior as usually, pq_terminate_backend() will kill a backend that's blocked on sending to the client, but sometimes you have to call it twice (or more!) to really kill it. I agree that it is desirable behavior, if any measure to avoid that. But I think it's better than doing kill -9 engulfing all innocent backends. A more robust way is to set ImmediateInterruptOK before calling send(). That wouldn't let you send data that can be sent without blocking though. For that, you could put the socket to non-blocking mode, and sleep with select(), also waiting for the process' latch at the same time (die() sets the latch, so that will wake up the select() if a termination request arrives). I condiered it but select() frequently (rather in most cases when send() blocks by send buffer exhaustion) fails to predict that following send() will be blocked. (If my memory is correct.) So the final problem would be blocked send()... Is it actually safe to process the die-interrupt where send() is called? ProcessInterrupts() does ereport(FATAL, ...), which will attempt to send a message to the client. If that happens in the middle of constructing some other message, that will violate the protocol. So I strongly agree to you if select() works as the impression when reading the man document. 2. I think it would be reasonable to try to kill off the connection without notifying the client if we're unable to send the data to the client in a reasonable period of time. But I'm unsure what a reasonable period of time means. This patch would basically do it after no delay at all, which seems like it might be too aggressive. However, I'm not sure. I think there's no such a reasonable time. I agree it's pretty hard to define any reasonable timeout here. I think it would be fine to just cut the connection; even if you don't block while sending, you'll probably reach a CHECK_FOR_INTERRUPT() somewhere higher in the stack and kill the connection almost as abruptly anyway. (you can't violate the protocol, however) Yes, closing the blocked connection seems one of the most smarter way, checking the occurred interrupt could avoid protocol violation. But the problem for that is that there seems no means to close sockets elsewhere the blocking handle. dup(2)'ed handle cannot release the resource by only itself. regards, -- Kyotaro Horiguchi NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Escaping from blocked send() reprised.
On 08/26/2014 09:17 AM, Kyotaro HORIGUCHI wrote: but I don't think we want to define the behavior as usually, pq_terminate_backend() will kill a backend that's blocked on sending to the client, but sometimes you have to call it twice (or more!) to really kill it. I agree that it is desirable behavior, if any measure to avoid that. But I think it's better than doing kill -9 engulfing all innocent backends. A more robust way is to set ImmediateInterruptOK before calling send(). That wouldn't let you send data that can be sent without blocking though. For that, you could put the socket to non-blocking mode, and sleep with select(), also waiting for the process' latch at the same time (die() sets the latch, so that will wake up the select() if a termination request arrives). I condiered it but select() frequently (rather in most cases when send() blocks by send buffer exhaustion) fails to predict that following send() will be blocked. (If my memory is correct.) So the final problem would be blocked send()... My point was to put the socket in non-blocking mode, so that send() will return immediately with EAGAIN instead of blocking, if the send buffer is full. See WalSndWriteData for how that would work, it does something similar. Is it actually safe to process the die-interrupt where send() is called? ProcessInterrupts() does ereport(FATAL, ...), which will attempt to send a message to the client. If that happens in the middle of constructing some other message, that will violate the protocol. So I strongly agree to you if select() works as the impression when reading the man document. Not sure what you mean, but the above is a fatal problem with the patch right now, regardless of how you do the sleeping. 2. I think it would be reasonable to try to kill off the connection without notifying the client if we're unable to send the data to the client in a reasonable period of time. But I'm unsure what a reasonable period of time means. This patch would basically do it after no delay at all, which seems like it might be too aggressive. However, I'm not sure. I think there's no such a reasonable time. I agree it's pretty hard to define any reasonable timeout here. I think it would be fine to just cut the connection; even if you don't block while sending, you'll probably reach a CHECK_FOR_INTERRUPT() somewhere higher in the stack and kill the connection almost as abruptly anyway. (you can't violate the protocol, however) Yes, closing the blocked connection seems one of the most smarter way, checking the occurred interrupt could avoid protocol violation. But the problem for that is that there seems no means to close sockets elsewhere the blocking handle. dup(2)'ed handle cannot release the resource by only itself. I didn't understand that, surely you can just close() the socket? There is no dup(2) involved. And we don't necessarily need to close the socket, we just need to avoid writing to it when we're already in the middle of sending a message. I'm marking this as Waiting on Author in the commitfest app, because: 1. the protocol violation needs to be avoided one way or another, and 2. the behavior needs to be consistent so that a single pg_terminate_backend() is enough to always kill the connection. - Heikki -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Escaping from blocked send() reprised.
Hello, I condiered it but select() frequently (rather in most cases when send() blocks by send buffer exhaustion) fails to predict that following send() will be blocked. (If my memory is correct.) So the final problem would be blocked send()... My point was to put the socket in non-blocking mode, so that send() will return immediately with EAGAIN instead of blocking, if the send buffer is full. See WalSndWriteData for how that would work, it does something similar. I confused it with what I did during writing this patch. select() - blocking send(). Sorry for confusing the discussion. I understand correctly what you mean and It sounds reasonable. I agree it's pretty hard to define any reasonable timeout here. I think it would be fine to just cut the connection; even if you don't block while sending, you'll probably reach a CHECK_FOR_INTERRUPT() somewhere higher in the stack and kill the connection almost as abruptly anyway. (you can't violate the protocol, however) Yes, closing the blocked connection seems one of the most smarter way, checking the occurred interrupt could avoid protocol violation. But the problem for that is that there seems no means to close sockets elsewhere the blocking handle. dup(2)'ed handle cannot release the resource by only itself. I didn't understand that, surely you can just close() the socket? There is no dup(2) involved. And we don't necessarily need to close the socket, we just need to avoid writing to it when we're already in the middle of sending a message. My assumption there was select() and *blocking* send(). So the sentence cited is out of point from the first. I'm marking this as Waiting on Author in the commitfest app, because: 1. the protocol violation needs to be avoided one way or another, and 2. the behavior needs to be consistent so that a single pg_terminate_backend() is enough to always kill the connection. Thank you for the suggestion. I think I can go forward with that and will come up with new patch. regards, -- Kyotaro Horiguchi NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Escaping from blocked send() reprised.
On 07/01/2014 06:26 AM, Kyotaro HORIGUCHI wrote: At Mon, 30 Jun 2014 11:27:47 -0400, Robert Haas robertmh...@gmail.com wrote in CA+TgmoZfcGzAEmtbyoCe6VdHnq085x+ox752zuJ2AKN=wc8...@mail.gmail.com 1. I think it's the case that there are platforms around where a signal won't cause send() to return EINTR and I'd be entirely unsurprised if SSL_write() doesn't necessarily return EINTR in that case. I'm not sure what, if anything, we can do about that. We use a custom write routine with SSL_write, where we call send() ourselves, so that's not a problem as long as we put the check in the right place (in secure_raw_write(), after my recent SSL refactoring - the patch needs to be rebased). man 2 send on FreeBSD has not description about EINTR.. And even on linux, send won't return EINTR for most cases, at least I haven't seen that. So send()=-1,EINTR seems to me as only an equivalent of send() = 0. I have no idea about what the implementer thought the difference is. As the patch stands, there's a race condition: if the SIGTERM arrives *before* the send() call, the send() won't return EINTR anyway. So there's a chance that you still block. Calling pq_terminate_backend() again will dislodge it (assuming send() returns with EINTR on signal), but I don't think we want to define the behavior as usually, pq_terminate_backend() will kill a backend that's blocked on sending to the client, but sometimes you have to call it twice (or more!) to really kill it. A more robust way is to set ImmediateInterruptOK before calling send(). That wouldn't let you send data that can be sent without blocking though. For that, you could put the socket to non-blocking mode, and sleep with select(), also waiting for the process' latch at the same time (die() sets the latch, so that will wake up the select() if a termination request arrives). Is it actually safe to process the die-interrupt where send() is called? ProcessInterrupts() does ereport(FATAL, ...), which will attempt to send a message to the client. If that happens in the middle of constructing some other message, that will violate the protocol. 2. I think it would be reasonable to try to kill off the connection without notifying the client if we're unable to send the data to the client in a reasonable period of time. But I'm unsure what a reasonable period of time means. This patch would basically do it after no delay at all, which seems like it might be too aggressive. However, I'm not sure. I think there's no such a reasonable time. I agree it's pretty hard to define any reasonable timeout here. I think it would be fine to just cut the connection; even if you don't block while sending, you'll probably reach a CHECK_FOR_INTERRUPT() somewhere higher in the stack and kill the connection almost as abruptly anyway. (you can't violate the protocol, however) - Heikki -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Escaping from blocked send() reprised.
Hello, At Tue, 1 Jul 2014 21:21:38 +0200, Martijn van Oosterhout klep...@svana.org wrote in 20140701192138.gb20...@svana.org On Tue, Jul 01, 2014 at 12:26:43PM +0900, Kyotaro HORIGUCHI wrote: 1. I think it's the case that there are platforms around where a signal won't cause send() to return EINTR and I'd be entirely unsurprised if SSL_write() doesn't necessarily return EINTR in that case. I'm not sure what, if anything, we can do about that. man 2 send on FreeBSD has not description about EINTR.. And even on linux, send won't return EINTR for most cases, at least I haven't seen that. So send()=-1,EINTR seems to me as only an equivalent of send() = 0. I have no idea about what the implementer thought the difference is. Whether send() returns EINTR or not depends on whether the signal has been marked restartable or not. This is configurable per signal, see sigaction(). If the signal is marked to restart, the kernel returns ERESTARTHAND (IIRC) and the libc will redo the call internally. Wow, thank you for detailed information. I'll study that and take it into future discussion. Default BSD does not return EINTR normally, but supports sigaction(). I guess it is for easiness-to-keep-compatibility, seems reasonable. have a nice day, -- Kyotaro Horiguchi NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Escaping from blocked send() reprised.
Hello, thank you for keeping this discussion moving. I think there's no such a reasonable time. The behavior might should be determined from another point.. On alternative would be let pg_terminate_backend() have a parameter instructing force shutodwn (how to propagate it?), or make a forced shutdown on duplicate invocation of pg_terminate_backend(). Well, I think that when people call pg_terminate_backend() just once, they expect it to kill the target backend. I think people will tolerate a short delay, like a few seconds; after all, there's no guarantee, even today, that the backend will hit a CHECK_FOR_INTERRUPTS() in less than a few hundred milliseconds. Sure. But they are not going to want to have to take a second action to kill the backend - killing it once should be sufficient. Hmm, it sounds persuasive. Well, do you think they tolerate -force option? (Even though its technical practicality is not clear) regards, -- Kyotaro Horiguchi NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Escaping from blocked send() reprised.
On Mon, Jun 30, 2014 at 11:26 PM, Kyotaro HORIGUCHI horiguchi.kyot...@lab.ntt.co.jp wrote: 2. I think it would be reasonable to try to kill off the connection without notifying the client if we're unable to send the data to the client in a reasonable period of time. But I'm unsure what a reasonable period of time means. This patch would basically do it after no delay at all, which seems like it might be too aggressive. However, I'm not sure. I think there's no such a reasonable time. The behavior might should be determined from another point.. On alternative would be let pg_terminate_backend() have a parameter instructing force shutodwn (how to propagate it?), or make a forced shutdown on duplicate invocation of pg_terminate_backend(). Well, I think that when people call pg_terminate_backend() just once, they expect it to kill the target backend. I think people will tolerate a short delay, like a few seconds; after all, there's no guarantee, even today, that the backend will hit a CHECK_FOR_INTERRUPTS() in less than a few hundred milliseconds. But they are not going to want to have to take a second action to kill the backend - killing it once should be sufficient. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Escaping from blocked send() reprised.
On Tue, Jul 01, 2014 at 12:26:43PM +0900, Kyotaro HORIGUCHI wrote: 1. I think it's the case that there are platforms around where a signal won't cause send() to return EINTR and I'd be entirely unsurprised if SSL_write() doesn't necessarily return EINTR in that case. I'm not sure what, if anything, we can do about that. man 2 send on FreeBSD has not description about EINTR.. And even on linux, send won't return EINTR for most cases, at least I haven't seen that. So send()=-1,EINTR seems to me as only an equivalent of send() = 0. I have no idea about what the implementer thought the difference is. Whether send() returns EINTR or not depends on whether the signal has been marked restartable or not. This is configurable per signal, see sigaction(). If the signal is marked to restart, the kernel returns ERESTARTHAND (IIRC) and the libc will redo the call internally. Default BSD does not return EINTR normally, but supports sigaction(). Have a nice day, -- Martijn van Oosterhout klep...@svana.org http://svana.org/kleptog/ He who writes carelessly confesses thereby at the very outset that he does not attach much importance to his own thoughts. -- Arthur Schopenhauer signature.asc Description: Digital signature
[HACKERS] Escaping from blocked send() reprised.
Hello, I have received inquiries related to blocked communication several times for these weeks with different symptoms. Then I found this message from archive, http://postgresql.1045698.n5.nabble.com/Escaping-a-blocked-sendto-syscall-without-causing-a-restart-td5740855.html Subject: Escaping a blocked sendto() syscall without causing a restart Mr. Tom Lane gave a comment replying it, Offhand it looks to me like most signals would kick the backend off the send() call ... but it would loop right back and try again. See internal_flush() in pqcomm.c. (If you're using SSL, this diagnosis may or may not apply.) We can't do anything except repeat the send attempt if the client connection is to be kept in a sane state. (snipped) And I'm not at all sure if we could get it to work in SSL mode... That's true for timeouts that should continue the connection, say, statement_timeout, but focusing on intentional backend termination, I think it does no harm to break it up abruptly, even if it was on SSL. On the other hand it seems still preferable to keep a connection when not blocked. The following expression would detects such a blocking state at just before next send(2) after the previous try exited by signals. (ProcDiePending select(1, NULL, fd, NULL, '1 sec') == 0) Finally, pg_terminate_backend() works even when send is blocked for both SSL and non-SSL connections after 1 second delay with this patch (break_socket_blocking_on_termination_v1.patch). Nevetheless, of course statement_timeout cannot become effective by this method since it breaks the consistency in the client protocol. It needs change in client protocol to have out of band mechanism or something, maybe. Any suggestions? Attached patches are: - break_socket_blocking_on_termination_v1.patch : The patch to break blocked state of send(2) for pg_terminate_backend(). - socket_block_test.patch : debug printing and changing send buffer of libpq for reproducing the blocked situation. Some point of discussion follows, Discussion about the appropriateness of looking into ProcDiePending there and calling CHECK_FOR_INTERRUPTS() seeing it. I have somewhat uneasiness of these things, but what we can at most seems to be replacing ProcDiePending here with some another variable, say ImmediatelyExitFromBlockedState, and somehow go upstairs through normal return path. Additional Try-Catch seems can do that but it looks no benefit for the added complexity.. Discussion on breaking up connetion especially for SSL Breaking an SSL connection up in my_sock_write() cause the following message on client side if it still lives and resumes to receive from the connection, this seems to show that the client handles the event properly. | SSL SYSCALL error: EOF detected | The connection to the server was lost. Attempting reset: Succeeded. Discussion on reliability of select(2) This method is not a perfect solution, since the select(2) sometimes gives a wrong oracle about wheather the follwing send(2) will be blocked. Even so, as far as I see, select(2) just after exiting from blocked send(2) by signal seems always says 'write will be blocked', so what this patch does seems to save most cases except when the any amount of socket buffer is vacated just before the following select. The second chance to exit from blocked send(2) won't come after this(, before one more pg_terminate_backend() ?). Removing the select(2) from the condition (that is, CHECK_FOR_INTERRUPTS() is called always ProcDiePending is true) prevents such a possibility, in exchange for killing 'really live' connection but IMHO it's no problem on intentional server termination. More reliable measure for this would be non-blocking IO but it changes more of the code. Reproducing the situation. Another possible question would be about the possibility of such blocking, or how to reproduce the situation. I found that send(2) on CentOS6.5 somehow sends successfully, for most cases, the remaining data at the retry after exiting by signal during being blocked with buffer full, in spite of no change in environment. So reproducing the stucked situation is rather difficult on the server as is. But such situation would be reproduced quite easily with some cheat, that is, enlarging PQ send buffer, say by ten times. Applying the attached testing patch (socket_block_test.patch), the following steps will make the stucked situation. 1. Do a select which returns big result and enter Ctrl-Z just after invoking. cl $ psql -h localhost postgres cl postgres=# select 1 from generate_series(0, 999); cl ^Z cl [4]+ Stopped psql -h localhost postgres 2. Watch the server to stuck. The server starts to print lines like following to console after a while, then stops. The number enclosed by the square brackets is PID of the server. sv [8809] (bare) rest = 0 / 81920 bytes, ProcDiePending = 0 3. Do
Re: [HACKERS] Escaping from blocked send() reprised.
On Mon, Jun 30, 2014 at 4:13 AM, Kyotaro HORIGUCHI horiguchi.kyot...@lab.ntt.co.jp wrote: Hello, I have received inquiries related to blocked communication several times for these weeks with different symptoms. Then I found this message from archive, http://postgresql.1045698.n5.nabble.com/Escaping-a-blocked-sendto-syscall-without-causing-a-restart-td5740855.html Subject: Escaping a blocked sendto() syscall without causing a restart Mr. Tom Lane gave a comment replying it, Offhand it looks to me like most signals would kick the backend off the send() call ... but it would loop right back and try again. See internal_flush() in pqcomm.c. (If you're using SSL, this diagnosis may or may not apply.) We can't do anything except repeat the send attempt if the client connection is to be kept in a sane state. (snipped) And I'm not at all sure if we could get it to work in SSL mode... That's true for timeouts that should continue the connection, say, statement_timeout, but focusing on intentional backend termination, I think it does no harm to break it up abruptly, even if it was on SSL. On the other hand it seems still preferable to keep a connection when not blocked. The following expression would detects such a blocking state at just before next send(2) after the previous try exited by signals. (ProcDiePending select(1, NULL, fd, NULL, '1 sec') == 0) Finally, pg_terminate_backend() works even when send is blocked for both SSL and non-SSL connections after 1 second delay with this patch (break_socket_blocking_on_termination_v1.patch). Nevetheless, of course statement_timeout cannot become effective by this method since it breaks the consistency in the client protocol. It needs change in client protocol to have out of band mechanism or something, maybe. Any suggestions? You should probably add your patch here, so it doesn't get forgotten about: https://commitfest.postgresql.org/action/commitfest_view/open We're focused on reviewing patches for the current CommitFest, so your patch might not get attention right away. A couple of general thoughts on this topic: 1. I think it's the case that there are platforms around where a signal won't cause send() to return EINTR and I'd be entirely unsurprised if SSL_write() doesn't necessarily return EINTR in that case. I'm not sure what, if anything, we can do about that. 2. I think it would be reasonable to try to kill off the connection without notifying the client if we're unable to send the data to the client in a reasonable period of time. But I'm unsure what a reasonable period of time means. This patch would basically do it after no delay at all, which seems like it might be too aggressive. However, I'm not sure. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Escaping from blocked send() reprised.
Hello, The replies follow are mainly as a memo for myself so please don't be bothered to answer until the time comes. At Mon, 30 Jun 2014 11:27:47 -0400, Robert Haas robertmh...@gmail.com wrote in CA+TgmoZfcGzAEmtbyoCe6VdHnq085x+ox752zuJ2AKN=wc8...@mail.gmail.com You should probably add your patch here, so it doesn't get forgotten about: https://commitfest.postgresql.org/action/commitfest_view/open Ok, I'll put this there. We're focused on reviewing patches for the current CommitFest, so your patch might not get attention right away. A couple of general thoughts on this topic: Thank you for suggestions. I'll consider on them. 1. I think it's the case that there are platforms around where a signal won't cause send() to return EINTR and I'd be entirely unsurprised if SSL_write() doesn't necessarily return EINTR in that case. I'm not sure what, if anything, we can do about that. man 2 send on FreeBSD has not description about EINTR.. And even on linux, send won't return EINTR for most cases, at least I haven't seen that. So send()=-1,EINTR seems to me as only an equivalent of send() = 0. I have no idea about what the implementer thought the difference is. 2. I think it would be reasonable to try to kill off the connection without notifying the client if we're unable to send the data to the client in a reasonable period of time. But I'm unsure what a reasonable period of time means. This patch would basically do it after no delay at all, which seems like it might be too aggressive. However, I'm not sure. I think there's no such a reasonable time. The behavior might should be determined from another point.. On alternative would be let pg_terminate_backend() have a parameter instructing force shutodwn (how to propagate it?), or make a forced shutdown on duplicate invocation of pg_terminate_backend(). regards, -- Kyotaro Horiguchi NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers