Re: [GENERAL] postmaster deadlock while logging after syslogger exited
Andres Freund writes:
> It doesn't seem impossible to get into a situation where syslogger is
> the source of the OOM. Just enabling a lot of logging in a workload with
> many large query strings might do it. So making it less likely to be
> killed might make the problem worse...

Hm, so that's another angle David didn't report on: is it possible that
his workload could have resulted in a very large volume of incomplete
in-progress log messages?

			regards, tom lane
Re: [GENERAL] postmaster deadlock while logging after syslogger exited
On November 16, 2017 7:06:23 PM PST, Tom Lane wrote:
>Andres Freund writes:
>> On 2017-11-16 21:39:49 -0500, Tom Lane wrote:
>>> What might be worth thinking about is allowing the syslogger process to
>>> inherit the postmaster's OOM-kill-proofness setting, instead of dropping
>>> down to the same vulnerability as the postmaster's other child processes.
>
>> Hm. I'm a bit scared about that - it doesn't seem that inconceivable
>> that various backends log humongous multi-line messages, leading to
>> syslogger *actually* taking up a fair amount of memory. Note that we're
>> using plain stringinfos that ereport(ERROR) in out-of-memory situations,
>> rather than failing more gracefully.
>
>True, but there are no hard limits on the postmaster's memory consumption
>either ...

Is there a credible scenario where it'd allocate many gigabytes of memory?

>and if the syslogger does get killed on such a basis, we
>have at the least lost a bunch of log output. On the whole I think we'd be
>better off trying to prevent OOM kills on the syslogger. (That doesn't
>preclude other mitigation measures.)

It doesn't seem impossible to get into a situation where syslogger is
the source of the OOM. Just enabling a lot of logging in a workload with
many large query strings might do it. So making it less likely to be
killed might make the problem worse...

Andres
Re: [GENERAL] postmaster deadlock while logging after syslogger exited
Andres Freund writes:
> On 2017-11-16 21:39:49 -0500, Tom Lane wrote:
>> What might be worth thinking about is allowing the syslogger process to
>> inherit the postmaster's OOM-kill-proofness setting, instead of dropping
>> down to the same vulnerability as the postmaster's other child processes.

> Hm. I'm a bit scared about that - it doesn't seem that inconceivable
> that various backends log humongous multi-line messages, leading to
> syslogger *actually* taking up a fair amount of memory. Note that we're
> using plain stringinfos that ereport(ERROR) in out-of-memory situations,
> rather than failing more gracefully.

True, but there are no hard limits on the postmaster's memory consumption
either ... and if the syslogger does get killed on such a basis, we have
at the least lost a bunch of log output. On the whole I think we'd be
better off trying to prevent OOM kills on the syslogger. (That doesn't
preclude other mitigation measures.)

			regards, tom lane
Re: [GENERAL] postmaster deadlock while logging after syslogger exited
On 2017-11-16 21:39:49 -0500, Tom Lane wrote:
> > We could work around a situation like that if we made postmaster use a
> > *different* pipe as stderr than the one we're handing to normal
> > backends. If postmaster created a new pipe and closed the read end
> > whenever forking a syslogger, we should get EPIPEs when writing after
> > syslogger died and could fall back to proper stderr or such.
>
> I think that's nonsense, unfortunately.

Nice phrasing.

> If the postmaster had its own pipe, that would reduce the risk of this
> deadlock because only the postmaster would be filling that pipe, not
> the postmaster and all its other children --- but it wouldn't
> eliminate the risk.

The deadlock happens because postmaster is waiting for syslogger to
accept a message, and syslogger waits for postmaster to restart it. To
resolve the deadlock, postmaster needs to not wait for a dead syslogger,
even if it hasn't yet received & processed the SIGCLD - what other
postmaster children do or don't do doesn't matter for resolving that
cycle.

The reason postmaster currently blocks on writing to the pipe, instead
of getting EPIPE, is that both ends of the pipe still exist. Which in
turn is the case because we need to be able to restart syslogger without
passing a new file descriptor to all subprocesses. If postmaster instead
uses a different pipe to write to, it'll not block anymore, instead
getting EPIPE, and can continue towards starting a new syslogger.

So I don't think the described deadlock exists if we were to apply my
proposed fix.

What this obviously would *not* guarantee is being able to start a new
syslogger, but it seems fairly impossible to guarantee that. So sure,
other processes would still block until syslogger has successfully
restarted - but it's a resolvable situation rather than a hard deadlock,
which the described situation appears to be.

Note that there are plenty of cases where you could run into this even
without being unable to fork new processes. You could, e.g., also run
into this while logging the exit of some other subprocess or such;
there are enough ereports in postmaster.

> I doubt the increase in reliability would be enough to justify the
> extra complexity and cost.

I'm doubtful about that too.

> What might be worth thinking about is allowing the syslogger process to
> inherit the postmaster's OOM-kill-proofness setting, instead of dropping
> down to the same vulnerability as the postmaster's other child processes.
> That presumes that this was an otherwise-unjustified OOM kill, which
> I'm not quite sure of ... but it does seem like a situation that could
> arise from time to time.

Hm. I'm a bit scared about that - it doesn't seem that inconceivable
that various backends log humongous multi-line messages, leading to
syslogger *actually* taking up a fair amount of memory. Note that we're
using plain stringinfos that ereport(ERROR) in out-of-memory situations,
rather than failing more gracefully.

- Andres
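For illustration, here is a minimal standalone sketch of the
postmaster-private pipe idea -- not actual PostgreSQL code, all names
are made up. It relies on SIGPIPE being ignored, as the postmaster
already does, so that a write to a reader-less pipe fails with EPIPE
instead of terminating the process:

#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

static int pm_log_pipe[2];      /* hypothetical postmaster-only pipe */

/* Called for every (re)start of the syslogger: the child takes the
 * read end, and the postmaster closes its own copy of it.  Once the
 * syslogger exits there is no reader left, so postmaster writes fail
 * with EPIPE instead of blocking.  (In this toy, the child exits
 * immediately.) */
static pid_t fork_syslogger(void)
{
    pid_t pid;

    if (pipe(pm_log_pipe) != 0)
        return -1;

    pid = fork();
    if (pid == 0)
    {
        close(pm_log_pipe[1]);
        /* ... a real syslogger would loop reading pm_log_pipe[0] ... */
        _exit(0);
    }

    close(pm_log_pipe[0]);      /* postmaster keeps only the write end */
    return pid;
}

static void pm_log(const char *msg)
{
    if (write(pm_log_pipe[1], msg, strlen(msg)) < 0 && errno == EPIPE)
        fprintf(stderr, "%s", msg);     /* fall back to real stderr */
}

int main(void)
{
    signal(SIGPIPE, SIG_IGN);   /* postmaster already ignores SIGPIPE */
    fork_syslogger();
    sleep(1);                   /* let the toy syslogger die */
    pm_log("could not fork autovacuum worker\n");
    return 0;
}

The key property is that only the postmaster and the syslogger ever hold
ends of this pipe, so once the syslogger dies there is no surviving read
end: the write fails immediately and the postmaster can fall back to
stderr rather than blocking.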
Re: [GENERAL] postmaster deadlock while logging after syslogger exited
Andres Freund writes:
> On 2017-11-06 15:35:03 -0500, Tom Lane wrote:
>> David Pacheco writes:
>>> I ran into what appears to be a deadlock in the logging subsystem. It
>>> looks like what happened was that the syslogger process exited because it
>>> ran out of memory. But before the postmaster got a chance to handle the
>>> SIGCLD to restart it, it handled a SIGUSR1 to start an autovacuum worker.
>>> That also failed, and the postmaster went to log a message about it, but
>>> it's blocked on the pipe that's normally connected to the syslogger,
>>> presumably because the pipe is full because the syslogger is gone and
>>> hasn't read from it.

>> Ugh.

> I'm somewhat inclined to say that one has to live with this if the
> system is so resource constrained that processes barely using memory
> get killed.

David's report isn't too clear: did the syslogger process actually run
out of memory and exit of its own volition after an ENOMEM, or did it
get killed by the dreaded OOM killer? In either case, it's unclear
whether it was really using an excessive amount of memory. We have not
heard reports suggesting a memory leak in the syslogger, but maybe there
is one under unusual circumstances?

I think you're probably right that the real cause here is the OOM killer
just randomly seizing on the syslogger as a victim process; although
since the syslogger disconnects from shared memory, it's not very clear
why it would score high on the OOM killer's metrics. The whole thing is
definitely odd.

> We could work around a situation like that if we made postmaster use a
> *different* pipe as stderr than the one we're handing to normal
> backends. If postmaster created a new pipe and closed the read end
> whenever forking a syslogger, we should get EPIPEs when writing after
> syslogger died and could fall back to proper stderr or such.

I think that's nonsense, unfortunately. If the postmaster had its own
pipe, that would reduce the risk of this deadlock because only the
postmaster would be filling that pipe, not the postmaster and all its
other children --- but it wouldn't eliminate the risk. I doubt the
increase in reliability would be enough to justify the extra complexity
and cost.

What might be worth thinking about is allowing the syslogger process to
inherit the postmaster's OOM-kill-proofness setting, instead of dropping
down to the same vulnerability as the postmaster's other child processes.
That presumes that this was an otherwise-unjustified OOM kill, which
I'm not quite sure of ... but it does seem like a situation that could
arise from time to time.

			regards, tom lane
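For background: on Linux, the "OOM-kill-proofness" in question is
conventionally managed through /proc/self/oom_score_adj. The
administrator starts the postmaster with a very low score, and
PostgreSQL's child-fork path can reset the score upward again so that
ordinary backends remain ordinary OOM-killer candidates (the exact
mechanism varies by version and build options). Tom's suggestion amounts
to skipping that reset for the syslogger. A minimal Linux-only sketch,
with illustrative names -- not the actual fork_process() code:

#include <stdio.h>

/* After fork(), a child raises its own OOM score so that it becomes
 * an ordinary OOM-killer candidate again, even though the postmaster
 * runs protected (e.g. at -1000).  Under the suggestion above, a newly
 * forked syslogger would simply skip this reset and keep the
 * postmaster's protected score.  Linux-specific. */
static void reset_oom_score(const char *value)
{
    FILE *f = fopen("/proc/self/oom_score_adj", "w");

    if (f != NULL)
    {
        fputs(value, f);
        fclose(f);
    }
}

int main(void)
{
    reset_oom_score("0");   /* ordinary child; a syslogger would skip this */
    return 0;
}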
Re: [GENERAL] postmaster deadlock while logging after syslogger exited
On Fri, Nov 17, 2017 at 11:14 AM, Andres Freund wrote:
> On 2017-11-17 11:09:56 +0900, Michael Paquier wrote:
>> when redirection_done is switched to true because the first process
>> generating a message to the syslogger pipe needs to open it first if
>> not done yet?
>
> I can't follow. The syslogger pipe is created when the first syslogger
> is started (before it's forked!). Which happens before other processes
> are created, because they all need to inherit that file descriptor.

Ah, OK. I didn't recall this dependency. Sorry for the confusion.
--
Michael
Re: [GENERAL] postmaster deadlock while logging after syslogger exited
On 2017-11-17 11:09:56 +0900, Michael Paquier wrote:
> On Fri, Nov 17, 2017 at 10:50 AM, Andres Freund wrote:
> > On 2017-11-06 15:35:03 -0500, Tom Lane wrote:
> >> David Pacheco writes:
> >> > I ran into what appears to be a deadlock in the logging subsystem. It
> >> > looks like what happened was that the syslogger process exited because it
> >> > ran out of memory. But before the postmaster got a chance to handle the
> >> > SIGCLD to restart it, it handled a SIGUSR1 to start an autovacuum worker.
> >> > That also failed, and the postmaster went to log a message about it, but
> >> > it's blocked on the pipe that's normally connected to the syslogger,
> >> > presumably because the pipe is full because the syslogger is gone and
> >> > hasn't read from it.
> >>
> >> Ugh.
> >
> > I'm somewhat inclined to say that one has to live with this if the
> > system is so resource constrained that processes barely using memory
> > get killed.
> >
> > We could work around a situation like that if we made postmaster use a
> > *different* pipe as stderr than the one we're handing to normal
> > backends. If postmaster created a new pipe and closed the read end
> > whenever forking a syslogger, we should get EPIPEs when writing after
> > syslogger died and could fall back to proper stderr or such.
>
> I don't have the code on top of my mind, but isn't a custom fd causing
> a small penalty

Sure, there's some minor overhead because every process would need to
close another fd after forking.

> when redirection_done is switched to true because the first process
> generating a message to the syslogger pipe needs to open it first if
> not done yet?

I can't follow. The syslogger pipe is created when the first syslogger
is started (before it's forked!). Which happens before other processes
are created, because they all need to inherit that file descriptor.

> So you'd need proper locking to guard against race conditions.

I completely fail to see why this'd be the case. All I'm talking about
is using another pipe between syslogger and postmaster than between
other-processes and syslogger.

Greetings,

Andres Freund
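A condensed sketch of that ordering -- simplified and with illustrative
names; the real logic lives in SysLogger_Start() in syslogger.c:

#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

static int syslog_pipe[2] = {-1, -1};

/* Runs in the postmaster.  The pipe is created exactly once, on the
 * first syslogger start, before any other child exists -- so every
 * later fork (backends, autovacuum workers, and restarted sysloggers
 * alike) simply inherits the same descriptors, and no process ever
 * has to open anything later. */
static pid_t start_syslogger(void)
{
    pid_t pid;

    if (syslog_pipe[0] < 0 && pipe(syslog_pipe) != 0)
    {
        perror("pipe");
        exit(1);
    }

    pid = fork();
    if (pid == 0)
    {
        /* syslogger child: consume log traffic from the read end */
        /* ... main loop reading syslog_pipe[0] would go here ... */
        _exit(0);
    }

    /* postmaster: route its own stderr into the pipe, so children
     * forked afterwards inherit the redirection automatically */
    dup2(syslog_pipe[1], STDERR_FILENO);
    return pid;
}

Because a restarted syslogger reuses the same pipe, no locking or
late opening ever comes into play -- which is also why the postmaster
must keep the read end open, the detail at the heart of this deadlock.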
Re: [GENERAL] postmaster deadlock while logging after syslogger exited
On Fri, Nov 17, 2017 at 10:50 AM, Andres Freund wrote:
> On 2017-11-06 15:35:03 -0500, Tom Lane wrote:
>> David Pacheco writes:
>> > I ran into what appears to be a deadlock in the logging subsystem. It
>> > looks like what happened was that the syslogger process exited because it
>> > ran out of memory. But before the postmaster got a chance to handle the
>> > SIGCLD to restart it, it handled a SIGUSR1 to start an autovacuum worker.
>> > That also failed, and the postmaster went to log a message about it, but
>> > it's blocked on the pipe that's normally connected to the syslogger,
>> > presumably because the pipe is full because the syslogger is gone and
>> > hasn't read from it.
>>
>> Ugh.
>
> I'm somewhat inclined to say that one has to live with this if the
> system is so resource constrained that processes barely using memory
> get killed.
>
> We could work around a situation like that if we made postmaster use a
> *different* pipe as stderr than the one we're handing to normal
> backends. If postmaster created a new pipe and closed the read end
> whenever forking a syslogger, we should get EPIPEs when writing after
> syslogger died and could fall back to proper stderr or such.

I don't have the code on top of my mind, but isn't a custom fd causing
a small penalty when redirection_done is switched to true because the
first process generating a message to the syslogger pipe needs to open
it first if not done yet? So you'd need proper locking to guard against
race conditions. Or is the first redirected message always generated by
the postmaster or the syslogger? I don't recall that this is actually
true...
--
Michael
Re: [GENERAL] postmaster deadlock while logging after syslogger exited
On 2017-11-06 15:35:03 -0500, Tom Lane wrote:
> David Pacheco writes:
> > I ran into what appears to be a deadlock in the logging subsystem. It
> > looks like what happened was that the syslogger process exited because it
> > ran out of memory. But before the postmaster got a chance to handle the
> > SIGCLD to restart it, it handled a SIGUSR1 to start an autovacuum worker.
> > That also failed, and the postmaster went to log a message about it, but
> > it's blocked on the pipe that's normally connected to the syslogger,
> > presumably because the pipe is full because the syslogger is gone and
> > hasn't read from it.
>
> Ugh.

I'm somewhat inclined to say that one has to live with this if the
system is so resource constrained that processes barely using memory
get killed.

We could work around a situation like that if we made postmaster use a
*different* pipe as stderr than the one we're handing to normal
backends. If postmaster created a new pipe and closed the read end
whenever forking a syslogger, we should get EPIPEs when writing after
syslogger died and could fall back to proper stderr or such.

Greetings,

Andres Freund
Re: [GENERAL] postmaster deadlock while logging after syslogger exited
On Mon, Nov 6, 2017 at 12:35 PM, Tom Lane wrote:
> David Pacheco writes:
> > I ran into what appears to be a deadlock in the logging subsystem. It
> > looks like what happened was that the syslogger process exited because it
> > ran out of memory. But before the postmaster got a chance to handle the
> > SIGCLD to restart it, it handled a SIGUSR1 to start an autovacuum worker.
> > That also failed, and the postmaster went to log a message about it, but
> > it's blocked on the pipe that's normally connected to the syslogger,
> > presumably because the pipe is full because the syslogger is gone and
> > hasn't read from it.
>
> Ugh.

Should I file a bug on this issue?

Thanks,
Dave
Re: [GENERAL] postmaster deadlock while logging after syslogger exited
On Mon, Nov 6, 2017 at 12:35 PM, Tom Lane wrote:
> David Pacheco writes:
> > ... that process appears to have exited due to a fatal error
> > (out of memory). (I know it exited because the process still exists in the
> > kernel -- it hasn't been reaped yet -- and I think it ran out of memory
> > based on a log message I found from around the time when the process
> > exited.)
>
> Could we see the exact log message(s) involved? It's pretty hard to
> believe that the logger would have consumed much memory.

Thanks for the quick reply!

Based on kernel state about the dead but unreaped syslogger process, I
believe the process exited at 2017-10-27T23:46:21.258Z. Here are all of
the entries in the PostgreSQL log from 23:19:12 until the top of the
next hour:

https://gist.githubusercontent.com/davepacheco/c5541bb464532075f2da761dd990a457/raw/2ba242055aca2fb374e9118045a830d08c590e0a/gistfile1.txt

There's no log entry at exactly 23:46:21 or even immediately before
that, but there are a lot of "out of memory" errors and a FATAL one at
23:47:28. Unfortunately, we haven't configured logging to include the
pid, so I can't be sure which messages came from the syslogger.

There are also many log entries for some very long SQL queries. I'm
sure that contributed to this problem by filling up the pipe. I was
able to extract the contents of the pipe while the system was hung, and
it was more of these giant query strings.

I think it's likely that this database instance was running in a
container with way too small a memory cap for the number of processes
configured. (This was a zone (a lightweight container) allocated with
2GB of memory and configured with 512MB of shared_buffers and up to 200
connections.) I expect that the system got to a persistent state of
having basically no memory available, at which point nearly any attempt
to allocate memory could fail. The syslogger itself may not have been
using much memory.

So I'm not so much worried about the memory usage itself, but it would
be nice if this condition were handled better. Handling out-of-memory
is obviously hard, especially when it means being unable to fork, but
even crashing would have been better for our use case. And of course,
there are other reasons that the syslogger could exit prematurely
besides being low on memory, and those might be more recoverable.

Thanks,
Dave
Re: [GENERAL] postmaster deadlock while logging after syslogger exited
David Pacheco writes:
> I ran into what appears to be a deadlock in the logging subsystem. It
> looks like what happened was that the syslogger process exited because it
> ran out of memory. But before the postmaster got a chance to handle the
> SIGCLD to restart it, it handled a SIGUSR1 to start an autovacuum worker.
> That also failed, and the postmaster went to log a message about it, but
> it's blocked on the pipe that's normally connected to the syslogger,
> presumably because the pipe is full because the syslogger is gone and
> hasn't read from it.

Ugh.

> ... that process appears to have exited due to a fatal error
> (out of memory). (I know it exited because the process still exists in the
> kernel -- it hasn't been reaped yet -- and I think it ran out of memory
> based on a log message I found from around the time when the process
> exited.)

Could we see the exact log message(s) involved? It's pretty hard to
believe that the logger would have consumed much memory.

			regards, tom lane
[GENERAL] postmaster deadlock while logging after syslogger exited
Hello,

I ran into what appears to be a deadlock in the logging subsystem. It
looks like what happened was that the syslogger process exited because it
ran out of memory. But before the postmaster got a chance to handle the
SIGCLD to restart it, it handled a SIGUSR1 to start an autovacuum worker.
That also failed, and the postmaster went to log a message about it, but
it's blocked on the pipe that's normally connected to the syslogger,
presumably because the pipe is full because the syslogger is gone and
hasn't read from it. The net result is that new connections to PostgreSQL
hang, and any query that causes PostgreSQL to log appears to hang, though
queries appear to work on existing connections as long as they do not
cause the backend to log anything. I don't believe the system would ever
get out of this state.

I ran into this with version 9.2.4 on SmartOS (illumos), but it doesn't
look to me like any of the relevant code has changed in 9.6 or 10, nor is
it platform-specific. Version 10.0 still:

- restarts the syslogger in response to a SIGCHLD signal (in "reaper")
- starts autovacuum workers in response to SIGUSR1 (in "sigusr1_handler")
- masks SIGCHLD at the beginning of the SIGUSR1 handler
- potentially logs from StartAutoVacWorker (and, of course, many other
  places in the postmaster)

which I think means the problem is still possible. Am I missing
something? I couldn't find anything about this issue. Is this just a
rarely seen bug?

Here's the raw data: my initial observation was that when I ran "psql"
to connect to the database, it would hang (for hours, it turned out). My
postmaster process was blocked here, writing to its own stderr pipe:

$ mdb core.41349
Loading modules: [ libumem.so.1 libc.so.1 ld.so.1 ]
> $C
fd7fffdf9df0 libc.so.1`__write+0xa()
fd7fffdfb220 write_pipe_chunks+0x142()
fd7fffdfb280 send_message_to_server_log+0x55b()
fd7fffdfb2a0 EmitErrorReport+0xe6()
fd7fffdfb390 errfinish+0x1ac()
fd7fffdfb3b0 StartAutoVacWorker+0x5c()
fd7fffdfb3e0 StartAutovacuumWorker+0x6f()
fd7fffdfb410 sigusr1_handler+0x185()
fd7fffdfb420 libc.so.1`__sighndlr+6()
fd7fffdfb4b0 libc.so.1`call_user_handler+0x1db(10, 0, fd7fffdfb520)
fd7fffdfb500 libc.so.1`sigacthandler+0x116(10, 0, fd7fffdfb520)
fd7fffdfb970 libc.so.1`__pollsys+0xa()
fd7fffdfba90 libc.so.1`pselect+0x1cb(5, fd7fffdfbaf0, 0, 0, fd7fffdfbaa0, 0)
fd7fffdfbae0 libc.so.1`select+0x5a(5, fd7fffdfbaf0, 0, 0, fd7fffdfdaf0)
fd7fffdffb40 ServerLoop+0xb5()
fd7fffdffbe0 PostmasterMain+0xec2()
fd7fffdffc00 main+0x23a()
fd7fffdffc10 _start+0x6c()

From inspecting kernel state, I found that the only other file descriptor
on the system that corresponds to the other side of this pipe is in the
postmaster itself, which it keeps open in case it needs to restart the
syslogger. Based on the code, I expected the syslogger subprocess to have
the pipe open, but that process appears to have exited due to a fatal
error (out of memory). (I know it exited because the process still exists
in the kernel -- it hasn't been reaped yet -- and I think it ran out of
memory based on a log message I found from around the time when the
process exited.)
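The essence of the hang is easy to reproduce outside PostgreSQL. Below
is a minimal standalone sketch (not PostgreSQL code): one process keeps
both ends of a pipe open -- as the postmaster does, so a restarted
syslogger can take over the read end -- fills the pipe, masks SIGCHLD
the way sigusr1_handler does, and then issues one more blocking write,
which never returns:

#include <fcntl.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
    int log_pipe[2];
    char buf[4096];
    sigset_t block;

    /* Keep BOTH ends open in this one process; nothing ever reads. */
    if (pipe(log_pipe) != 0)
    {
        perror("pipe");
        return 1;
    }

    /* Fill the pipe's buffer completely with non-blocking writes... */
    memset(buf, 'x', sizeof(buf));
    fcntl(log_pipe[1], F_SETFL, O_NONBLOCK);
    while (write(log_pipe[1], buf, sizeof(buf)) > 0)
        ;
    while (write(log_pipe[1], buf, 1) > 0)
        ;
    /* ...then restore blocking mode. */
    fcntl(log_pipe[1], F_SETFL, 0);

    /* Mask SIGCHLD, as sigusr1_handler does on entry. */
    sigemptyset(&block);
    sigaddset(&block, SIGCHLD);
    sigprocmask(SIG_BLOCK, &block, NULL);

    /* The analogue of the postmaster's ereport(): a blocking write
     * to the full pipe.  It never returns -- the only process that
     * could drain the pipe is this one, and it is stuck here. */
    fprintf(stderr, "wedging now...\n");
    write(log_pipe[1], "one more log message\n", 21);
    fprintf(stderr, "never reached\n");
    return 0;
}

Note that the write can never fail with EPIPE here, because the
process's own copy of the read end keeps the pipe alive -- which is
exactly the postmaster's situation.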
Then I expected the postmaster would have restarted it, but I found that
it currently has SIGCHLD both pending and masked:

$ pflags 41349
41349:  /opt/postgresql/9.2.4/bin/postgres -D /manatee/pg/data
        data model = _LP64  flags = ORPHAN|MSACCT|MSFORK
        sigpend = 0x00028000,0x,0x
 /1:    flags = ASLEEP  write(0x2,0xfd7fffdf9e10,0x65)
        sigmask = 0xfebff047,0xfff7,0x03ff

The signal mask may be system-specific, but I've decoded it and
confirmed that SIGCHLD is in "sigpend" as well as "sigmask". And that
makes sense because the stack above indicates we're in
"sigusr1_handler", which masks SIGCHLD when it starts.

If it wasn't clear from the above, the logging configuration looks like
this:

log_destination = 'stderr'
logging_collector = on
log_directory = '/var/pg/'
log_filename = 'postgresql.log'

Thanks,
Dave
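(Decoding note: under the usual convention that bit n-1 of each mask
word represents signal n, the first sigpend word 0x00028000 has bits 15
and 17 set -- signals 16 and 18, which on illumos are SIGUSR1 and
SIGCHLD, consistent with the analysis above. A trivial sketch of the
decoding, with the illumos signal-number mapping as an assumption:)

#include <stdio.h>

int main(void)
{
    /* First sigpend word reported by pflags above. */
    unsigned long word = 0x00028000UL;

    /* Print each signal number whose bit is set, assuming bit (n-1)
     * of the word represents signal n. */
    for (int sig = 1; sig <= 32; sig++)
        if (word & (1UL << (sig - 1)))
            printf("signal %d is pending\n", sig);
    /* Prints 16 and 18: SIGUSR1 and SIGCHLD on illumos. */
    return 0;
}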