On 28 August 2014 23:45, Tom Lane <t...@sss.pgh.pa.us> wrote:
> I don't claim to be an expert on this stuff, but I had the idea that
> multithreaded environments were supposed to track signal state per-thread
> not just per-process, precisely because of issues like this.

After some googling, I found reply #3 in
https://community.oracle.com/thread/1950900?start=0&tstart=0 and various
other sources which say that Solaris 10 changed SIGPIPE delivery from
per-process (as specified by UNIX98) to per-thread (as specified by
POSIX:2001). But we are on version 11, so my theory doesn't look great.
(Though 9 is probably still in use out there somewhere...)

I also found this article:

http://krokisplace.blogspot.co.uk/2010/02/suppressing-sigpipe-in-library.html

The author recommends an approach nearly identical to the PostgreSQL
approach, except s/he says: "to do this we use sigtimedwait() with zero
timeout; this is to avoid blocking in a scenario where malicious user sent
SIGPIPE manually to a whole process: in this case we will see it pending,
but other thread may handle it before we had a [chance] to wait for it".

Maybe we have malicious users sending signals to processes. It does seem
more likely that the crashing database triggered this somehow, perhaps in
combination with something else the client app was doing, though I can't
think what could eat another thread's SIGPIPE in between the sigpending
and sigwait syscalls.

Best regards,
Thomas Munro

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
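
For reference, here is a minimal sketch of the pattern the article
describes: block SIGPIPE in the calling thread, note whether one was
already pending, do the write, and then drain a newly generated SIGPIPE
with sigtimedwait() and a zero timeout so that the call cannot block if
another thread has already consumed the signal. This is not libpq's
actual code. The function name write_nosigpipe is hypothetical, and the
sketch assumes a platform that provides sigtimedwait(), which not every
platform does.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <signal.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of libpq): write() to fd without letting
 * a SIGPIPE caused by this write reach the rest of the process.
 * Returns what write() returned, with errno preserved.
 */
static ssize_t
write_nosigpipe(int fd, const void *buf, size_t len)
{
    sigset_t    pipe_set;
    sigset_t    old_mask;
    sigset_t    pending;
    int         was_pending;
    int         saved_errno;
    ssize_t     n;

    assert(buf != NULL || len == 0);

    sigemptyset(&pipe_set);
    sigaddset(&pipe_set, SIGPIPE);

    /* Block SIGPIPE in this thread only, remembering the previous mask. */
    if (pthread_sigmask(SIG_BLOCK, &pipe_set, &old_mask) != 0)
        return -1;

    /* A SIGPIPE that is already pending is not ours to consume. */
    sigemptyset(&pending);
    was_pending = (sigpending(&pending) == 0 &&
                   sigismember(&pending, SIGPIPE) == 1);

    n = write(fd, buf, len);
    saved_errno = errno;

    if (n < 0 && saved_errno == EPIPE && !was_pending)
    {
        const struct timespec zero = {0, 0};

        /*
         * Zero timeout: if the signal has vanished in the meantime,
         * sigtimedwait() fails with EAGAIN instead of blocking forever,
         * which is the difference from a plain sigwait().
         */
        while (sigtimedwait(&pipe_set, NULL, &zero) < 0 && errno == EINTR)
            continue;
    }

    /* Restore the caller's signal mask and the errno from write(). */
    pthread_sigmask(SIG_SETMASK, &old_mask, NULL);
    errno = saved_errno;
    return n;
}
```

A caller would use it in place of write() on a socket or pipe and treat
a return of -1 with errno set to EPIPE as an ordinary error. With a
blocking sigwait() in place of sigtimedwait(), the same code could hang
in exactly the window discussed above, between the pending check and the
wait.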