On Thu, Oct 6, 2011 at 14:34, Florian Pflug <f...@phlo.org> wrote:
> On Oct 5, 2011, at 15:30, Magnus Hagander wrote:
>> When walsender calls out to do_pg_stop_backup() (during base backups),
>> it is not possible to terminate the process with a SIGTERM - it
>> requires a SIGKILL. This can leave unkillable backends for example if
>> archive_mode is on and archive_command is failing (or not set). A
>> similar thing would happen in other cases if walsender calls out to
>> something that would block (do_pg_start_backup() for example), but the
>> stop one is easy to provoke.
>
> Hm, this seems to be related to another buglet I noticed a while ago,
> but then forgot about again. If one terminates pg_basebackup while it's
> waiting for all required WAL to be archived, the backend process only
> exits once that waiting phase is over. If, like in your failure case,
> archive_command fails indefinitely (or isn't set), the backend process
> stays around forever.

Yes.

> Your patch would improve that only insofar as it'd at least allow an
> immediate shutdown request to succeed - as it stands, that doesn't work
> because, as you mentioned, the blocked walsender doesn't handle SIGTERM.

Exactly.

> The question is, should we do more? To me, it'd make sense to terminate
> a backend once its connection is gone. We could, for example, make
> pq_flush() set a global flag, and make CHECK_FOR_INTERRUPTS handle a
> broken connection the same way as a SIGINT or SIGTERM.

The problem here is that we're hanging at a place where we don't touch
the socket, so we won't notice that the socket is gone. We'd have to do
a select() or something like that at regular intervals to make sure it's
still there, no?

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/
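
To make the "check the socket at regular intervals" idea concrete, here is a
rough standalone sketch. It is not backend code and uses no PostgreSQL APIs:
terminate_requested, client_connection_lost(), wait_with_checks() and
archive_never_finishes() are all hypothetical names made up for illustration.
It assumes a POSIX system where MSG_DONTWAIT is available, and it uses poll()
rather than select(). The wait loop checks three things on every iteration:
whether the awaited condition holds, whether SIGTERM arrived, and whether the
client has closed its end.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <signal.h>
#include <stdbool.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical stand-in for the backend's "interrupt pending" state. */
static volatile sig_atomic_t terminate_requested = 0;

static void handle_sigterm(int signo)
{
    (void) signo;
    terminate_requested = 1;    /* only set a flag; act on it in the loop */
}

/*
 * Return true if the peer appears to have closed the connection.
 * Never blocks and never consumes data queued on the socket.
 */
static bool client_connection_lost(int fd)
{
    struct pollfd pfd;
    char          c;
    ssize_t       n;
    int           rc;

    assert(fd >= 0);
    pfd.fd = fd;
    pfd.events = POLLIN;
    pfd.revents = 0;

    rc = poll(&pfd, 1, 0);      /* zero timeout: just sample the state */
    if (rc < 0)
        return errno != EINTR;  /* interrupted: no verdict this round */
    if (rc == 0)
        return false;           /* nothing readable, no hangup reported */
    if (pfd.revents & (POLLERR | POLLNVAL))
        return true;

    /* Readable or hung up: peek to tell pending data apart from EOF. */
    n = recv(fd, &c, 1, MSG_PEEK | MSG_DONTWAIT);
    if (n == 0)
        return true;            /* orderly shutdown by the peer */
    if (n < 0)
        return !(errno == EAGAIN || errno == EWOULDBLOCK || errno == EINTR);
    return false;               /* unread data is still queued */
}

typedef enum { WAIT_DONE, WAIT_TERMINATED, WAIT_CLIENT_GONE } wait_result;
typedef bool (*wait_done_fn)(void *arg);

/* Poll for done(arg), but give up on SIGTERM or a vanished client. */
static wait_result wait_with_checks(int client_fd, wait_done_fn done,
                                    void *arg, int interval_ms)
{
    for (;;)
    {
        if (done(arg))
            return WAIT_DONE;
        if (terminate_requested)
            return WAIT_TERMINATED;
        if (client_connection_lost(client_fd))
            return WAIT_CLIENT_GONE;
        poll(NULL, 0, interval_ms);   /* sleep; a signal wakes us early */
    }
}

/* Hypothetical condition that never holds, like a failing archive_command. */
static bool archive_never_finishes(void *arg)
{
    (void) arg;
    return false;
}
```

A caller would install handle_sigterm for SIGTERM and then call something like
wait_with_checks(client_fd, archive_never_finishes, NULL, 1000) instead of
sleeping unconditionally. One limitation of the peek approach: if the client
sent data before disconnecting, the loss is only reported once that data has
been read, since the peek sees the queued byte rather than EOF.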