On 01/19/2014 08:22 PM, Robert Haas wrote:
Hmm, that looks an awful lot like the SIGUSR1 signal handler is getting called after we've already completed shmem_exit. And indeed that seems like the sort of thing that would result in dying horribly in just this way. The obvious fix seems to be to check proc_exit_inprogress before doing anything that might touch shared memory, but there are a lot of other SIGUSR1 handlers that don't do that either. However, in those cases, the likely cause of a SIGUSR1 would be a sinval catchup interrupt or a recovery conflict, which aren't likely to be so far delayed that they arrive after we've already disconnected from shared memory. But the dynamic background workers stuff adds a new possible cause of SIGUSR1: the postmaster letting us know that a child has started or died. And that could happen even after we've detached shared memory.
Is anything happening about this? We're still getting quite a few of these: <http://www.pgbuildfarm.org/cgi-bin/show_failures.pl?max_days=3&member=crake>
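For anyone following along, the guard Robert describes might look roughly like the sketch below. It is a standalone illustration under stated assumptions, not PostgreSQL source: only the name proc_exit_inprogress comes from the discussion above (and it is modelled here as a sig_atomic_t), while shared_counter, simulated_shmem_exit() and the other helpers are hypothetical stand-ins for a shared memory segment and for shmem_exit.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>

/* Modelled on the proc_exit_inprogress flag mentioned above (assumption:
 * represented as sig_atomic_t so the handler can read it safely). */
volatile sig_atomic_t proc_exit_inprogress = 0;

/* Hypothetical stand-in for shared memory: NULL once we have detached. */
static volatile int backing_store = 0;
static volatile int *shared_counter = NULL;

/* Counters so the effect of the guard can be observed. */
volatile sig_atomic_t handled = 0;
volatile sig_atomic_t skipped = 0;

/* SIGUSR1 handler that refuses to touch shared state during exit. */
void sigusr1_handler(int signo)
{
    (void) signo;
    if (proc_exit_inprogress)
    {
        skipped++;              /* late signal: do nothing with shmem */
        return;
    }
    (*shared_counter)++;        /* safe: we are still attached */
    handled++;
}

/* Hypothetical attach step: point at the stand-in segment. */
void attach_shmem(void)
{
    shared_counter = &backing_store;
}

/* Hypothetical stand-in for shmem_exit: set the flag first, then detach. */
void simulated_shmem_exit(void)
{
    proc_exit_inprogress = 1;
    shared_counter = NULL;
}

/* Install the handler with sigaction so it stays installed across signals. */
int install_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = sigusr1_handler;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGUSR1, &sa, NULL);
}

/* Deliver one signal before and one after the simulated exit. */
void demo(void)
{
    assert(install_handler() == 0);
    attach_shmem();
    raise(SIGUSR1);             /* handled normally */
    simulated_shmem_exit();
    raise(SIGUSR1);             /* arrives "too late", must be skipped */
}
```

Calling demo() sends one SIGUSR1 while attached and one after the simulated exit; without the flag check, the second delivery would dereference a NULL pointer, which is the same shape of failure as a handler running after shmem_exit.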
cheers

andrew