On Jan 13, 2011, at 21:42, Tom Lane wrote:
> Aidan Van Dyk <ai...@highrise.ca> writes:
>> If postmaster has a few fds to spare, what about having it open a pipe
>> to every child it spawns. It never has to read/write to it, but
>> postmaster closing will signal the client's fd. The client just has
>> to pop the fd into whatever normal poll/select event handling it uses
>> to notice when the "parent's pipe" is closed.
>
> Hmm. Or more generally: there's one FIFO. The postmaster holds both
> sides open. Backends hold the write side open. (They can close the
> read side, but that would just be to free up a FD.) Background children
> close the write side. Now a background process can use EOF on the read
> side of the FIFO to tell it that postmaster and all backends have
> exited. You still don't get a signal, but at least the condition you're
> testing for is the one we actually want and not an approximation.

I was thinking along similar lines, and put together a small test case to
prove that this actually works.

The attached test program simulates the interactions of a parent process
(think postmaster), some utility processes (think walwriter, bgwriter, ...)
and some backends. It uses two pairs of fds created with pipe(), called
LifeSignParent and LifeSignParentBackends. The writing end of the former is
held open only in the parent process, while the writing end of the latter is
held open in the parent process and in all regular backend processes.

Backend processes use select() to monitor the reading end of the
LifeSignParent fd pair. Since nothing is ever written to the writing end,
the fd becomes readable only when the parent exits, because that is how
select() signals EOF. Once that happens, the backend exits.

The utility processes do the same, but monitor the reading end of
LifeSignParentBackends, and thus exit only after the parent and all regular
backends have died.

Since the life sign check uses select(), any place that already uses
select() can easily check for a vanishing life sign as well.
CHECK_FOR_INTERRUPTS could simply check the life sign once every few seconds.

If we want an absolutely reliable signal instead of checking in
CHECK_FOR_INTERRUPTS, every backend would need to launch a monitor
subprocess which monitors the life sign and exits once it vanishes. The
backend would then get a SIGCHLD once the postmaster dies. Seems like
overkill, though.

The whole thing won't work on Windows, since even if it's got a pipe() or
socketpair() call, with EXEC_BACKEND there's no way of transferring these
fds to the child processes. AFAIK, however, Windows has other means with
which such life signs can be implemented. For example, I seem to remember
that WaitForMultipleObjects() can be used to wait for process-related
events. But Windows really isn't my area of expertise...
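
To make the select()-based check described above concrete, here is a minimal
sketch of the single-pipe (LifeSignParent) case. It is not the attached
liveness.c: the names life_sign_present() and lifesign_demo() are made up
for illustration, and the timeouts are arbitrary. The one detail that is
easy to get wrong is that the child has to close its inherited copy of the
writing end, otherwise it will never see EOF.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper (not taken from liveness.c).
 * Waits up to timeout_ms for the read end of a life sign pipe to become
 * readable. Returns 1 if the life sign is still present, 0 if EOF was
 * seen (every holder of the write end is gone), -1 on error.
 */
static int life_sign_present(int read_fd, int timeout_ms)
{
    for (;;) {
        fd_set rfds;
        struct timeval tv;
        char c;
        ssize_t n;
        int rc;

        /* Re-initialize each time: select() may modify both arguments. */
        FD_ZERO(&rfds);
        FD_SET(read_fd, &rfds);
        tv.tv_sec = timeout_ms / 1000;
        tv.tv_usec = (timeout_ms % 1000) * 1000;

        rc = select(read_fd + 1, &rfds, NULL, NULL, &tv);
        if (rc < 0 && errno == EINTR)
            continue;           /* interrupted by a signal, retry */
        if (rc < 0)
            return -1;
        if (rc == 0)
            return 1;           /* timeout: a writer still exists */

        /* Nothing is ever written, so readable should mean EOF. */
        n = read(read_fd, &c, 1);
        if (n == 0)
            return 0;
        return n < 0 ? -1 : 1;
    }
}

/*
 * Hypothetical demo: forks a stand-in "postmaster", which forks a stand-in
 * "backend". Returns 0 if the backend noticed the postmaster's exit.
 */
static int lifesign_demo(void)
{
    int report[2];              /* backend -> caller: one result byte */
    char result = 'F';
    pid_t postmaster;
    ssize_t n;

    if (pipe(report) < 0)
        return -1;
    postmaster = fork();
    if (postmaster < 0)
        return -1;

    if (postmaster == 0) {
        int lifesign[2];
        pid_t backend;
        struct timeval nap = { 0, 300 * 1000 };

        close(report[0]);
        if (pipe(lifesign) < 0)
            _exit(1);
        backend = fork();
        if (backend < 0)
            _exit(1);
        if (backend == 0) {
            char seen = 'F';
            int i;

            close(lifesign[1]); /* drop the inherited write end */
            for (i = 0; i < 50; i++) {      /* give up after about 5 s */
                int s = life_sign_present(lifesign[0], 100);
                if (s == 0) { seen = 'K'; break; }
                if (s < 0) break;
            }
            if (write(report[1], &seen, 1) != 1)
                _exit(1);
            _exit(0);
        }
        /* Stay alive briefly; exiting closes the only write end. */
        select(0, NULL, NULL, NULL, &nap);
        _exit(0);
    }

    close(report[1]);
    n = read(report[0], &result, 1);
    close(report[0]);
    waitpid(postmaster, NULL, 0);
    return (n == 1 && result == 'K') ? 0 : -1;
}
```

Calling lifesign_demo() from a main() should return 0 once the stand-in
backend has seen EOF. For the LifeSignParentBackends variant, the backends
would keep their copy of the writing end open and only the utility
processes would close it.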

I have tested this on the latest Ubuntu LTS release (10.04.1) as well as
Mac OS X 10.6.6, and it seems to work correctly on both systems. I'd be
happy to hear from anyone who has access to other systems on whether this
works or not. The expected output is (PIDs will differ):

  Launched utility 5095
  Launched backend 5097
  Launched utility 5096
  Launched backend 5099
  Launched backend 5098
  Utility 5095 detected live parent or backend
  Backend 5097 detected live parent
  Utility 5096 detected live parent or backend
  Backend 5099 detected live parent
  Backend 5098 detected live parent
  Parent exiting
  Backend 5097 exiting after parent died
  Backend 5098 exiting after parent died
  Backend 5099 exiting after parent died
  Utility 5096 exiting after parent and backends died
  Utility 5095 exiting after parent and backends died

Everything after "Parent exiting" might be interleaved with a shell prompt,
of course.

best regards,
Florian Pflug

liveness.c