On 2017-04-21 23:50:41 -0400, Tom Lane wrote:
> I wrote:
> > Attached is a lightly-tested draft patch that converts the postmaster to
> > use a WaitEventSet for waiting in ServerLoop.  I've got mixed emotions
> > about whether this is the direction to proceed, though.
>
> Attached are a couple of patches that represent a plausible Plan B.
> The first one changes the postmaster to run its signal handlers without
> specifying SA_RESTART.  I've confirmed that that seems to fix the
> select_parallel-test-takes-a-long-time problem on gaur/pademelon.
> The second one uses pselect, if available, to replace the unblock-signals/
> select()/block-signals dance in ServerLoop.  On platforms where pselect
> exists and works properly, that should fix the race condition I described
> previously.  On platforms where it doesn't, we're no worse off than
> before.
>
> As mentioned in the comments for the second patch, even if we don't
> have working pselect(), the only problem is that ServerLoop's response
> to an interrupt might be delayed by as much as the up-to-1-minute timeout.
> The only existing case where that's really bad is launching multiple
> bgworkers.  I would therefore advocate also changing maybe_start_bgworker
> to start up to N bgworkers per call, where N is large enough to pretty
> much always satisfy simultaneously-arriving requests.  I'd pick 100 or
> so, but am willing to negotiate.
>
> I think that these patches represent something we could back-patch
> without a lot of trepidation, unlike the WaitEventSet-based approach.
> Therefore, my proposal is to apply and backpatch these changes, and
> call it good for v10.  For v11, we could work on changing the postmaster
> to not do work in signal handlers, as discussed upthread.  That would
> supersede these two patches completely, though I'd still advocate for
> keeping the change in maybe_start_bgworker.
>
> Note: for testing purposes, these patches are quite independent; just
> ignore the hunk in the second patch that changes a comment added by
> the first one.
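
For anyone following along, the pselect() idea quoted above can be sketched
roughly as below. This is a standalone illustration, not code from either
patch: SIGUSR1 is just a stand-in signal, and handle_sigusr1,
setup_blocked_handler and wait_with_pselect are made-up names. It also waits
on no file descriptors at all, whereas the real ServerLoop waits on its
listen sockets.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/select.h>
#include <time.h>

/* Hypothetical flag set by the handler; not a PostgreSQL variable. */
static volatile sig_atomic_t got_signal = 0;

static void
handle_sigusr1(int signo)
{
    (void) signo;
    got_signal = 1;
}

/*
 * Install the handler without SA_RESTART, then block SIGUSR1 for normal
 * execution.  On success, *unblocked_mask holds the mask to use while
 * waiting (the previous mask, with SIGUSR1 guaranteed unblocked).
 */
static int
setup_blocked_handler(sigset_t *unblocked_mask)
{
    struct sigaction sa;
    sigset_t    block;

    assert(unblocked_mask != NULL);

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = handle_sigusr1;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;            /* deliberately no SA_RESTART */
    if (sigaction(SIGUSR1, &sa, NULL) != 0)
        return -1;

    sigemptyset(&block);
    sigaddset(&block, SIGUSR1);
    if (sigprocmask(SIG_BLOCK, &block, unblocked_mask) != 0)
        return -1;
    sigdelset(unblocked_mask, SIGUSR1);
    return 0;
}

/*
 * Wait for up to timeout_sec seconds.  pselect() swaps in unblocked_mask
 * and starts waiting as one atomic step, so a signal that arrived while
 * it was blocked interrupts the wait instead of being noticed only after
 * the timeout.  With separate unblock/select()/block calls, a signal
 * delivered between the unblock and the select() would be handled too
 * early to interrupt the wait.
 */
static int
wait_with_pselect(const sigset_t *unblocked_mask, long timeout_sec)
{
    struct timespec timeout;

    timeout.tv_sec = timeout_sec;
    timeout.tv_nsec = 0;
    return pselect(0, NULL, NULL, NULL, &timeout, unblocked_mask);
}
```

A caller would run setup_blocked_handler() once at startup and then call
wait_with_pselect() in its loop, treating a return of -1 with errno == EINTR
as "go check the flags the handlers set". This assumes a pselect() that is
actually atomic, which is exactly the "exists and works properly" caveat in
the quoted mail.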

Unclear whether it's related, but
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=gharial&dt=2017-04-24%2019%3A30%3A42
failed in a weird way, with suspicious timing.

- Andres