On Tue, Jun 6, 2017 at 2:21 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
>> One thought is that the only places where shm_mq_set_sender() should
>> be getting invoked during the main regression tests are
>> ParallelWorkerMain() and ExecParallelGetReceiver, and both of those
>> places using ParallelWorkerNumber to figure out what address to pass.
>> So if ParallelWorkerNumber were getting set to the same value in two
>> different parallel workers - e.g. because the postmaster went nuts and
>> launched two processes instead of only one - or if
>> ParallelWorkerNumber were not getting initialized at all or were
>> getting initialized to some completely bogus value, it could cause
>> this symptom.
>
> Hmm.  With some generous assumptions it'd be possible to think that
> aa1351f1eec4adae39be59ce9a21410f9dd42118 triggered this.  That commit was
> present in 20 successful lorikeet runs before the first of these failures,
> which is a bit more than the MTBF after that, but not a huge amount more.
>
> That commit in itself looks innocent enough, but could it have exposed
> some latent bug in bgworker launching?

Hmm, that's a really interesting idea, but I can't quite put together a
plausible theory around it.  I mean, it seems like that commit could
make launching bgworkers faster, which could conceivably tickle some
heretofore-latent timing-related bug.  But it wouldn't, IIUC, make the
first worker start any faster than before - it would just make them
more closely-spaced thereafter, and it's not very obvious how that
would cause a problem.

Another idea is that the commit in question is managing to corrupt
BackgroundWorkerList somehow.  maybe_start_bgworkers() is using
slist_foreach_modify(), but previously it always returned after
calling do_start_bgworker(), and now it doesn't.  So if
do_start_bgworker() did something that could modify the list
structure, then perhaps maybe_start_bgworkers() would get confused.  I
don't really think that this theory has any legs, though.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
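
To make the list-corruption theory above a bit more concrete, here is a minimal sketch of the general hazard. It does not use PostgreSQL's ilist.h or the real postmaster code: Node, start_worker(), unlink_next(), and walk() are all hypothetical stand-ins, and the assumption that starting a worker could unlink a neighbouring entry is invented purely for illustration. The point is only that a "modify-safe" loop which saves the next pointer before running its body is safe against removal of the current element, but not against a callee that changes the list elsewhere. Returning right after the first start hides that; continuing the loop does not.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a registered worker entry. */
typedef struct Node
{
    int          id;
    struct Node *next;
} Node;

/* Unlink (but do not free) the node that follows n. */
static void
unlink_next(Node *n)
{
    if (n->next != NULL)
        n->next = n->next->next;
}

/*
 * Hypothetical "start" step with an assumed side effect: starting
 * worker 1 removes its successor from the list.
 */
static void
start_worker(Node *n)
{
    if (n->id == 1)
        unlink_next(n);
}

/* Count the nodes currently reachable from head. */
static int
list_length(const Node *head)
{
    int len = 0;

    for (; head != NULL; head = head->next)
        len++;
    return len;
}

/*
 * Walk the list in "modify-safe" style: the next pointer is saved
 * before the body runs.  If continue_after_start is zero, stop after
 * the first start; otherwise keep going with the saved pointer, which
 * may refer to a node that is no longer in the list.
 * Returns the number of nodes visited.
 */
static int
walk(Node *head, int continue_after_start)
{
    int   visited = 0;
    Node *cur = head;

    while (cur != NULL)
    {
        Node *saved_next = cur->next;   /* cached before the body */

        start_worker(cur);
        visited++;
        if (!continue_after_start)
            break;
        cur = saved_next;               /* possibly stale */
    }
    return visited;
}

/* Small driver; call it from main() to check the behaviour. */
static void
run_demo(void)
{
    Node c = {3, NULL};
    Node b = {2, &c};
    Node a = {1, &b};

    /* The loop still visits node 2 although it was unlinked. */
    assert(walk(&a, 1) == 3);
    assert(list_length(&a) == 2);
}
```

In this sketch the continuing loop visits three nodes while only two remain in the list. Whether do_start_bgworker() can actually have that kind of effect on BackgroundWorkerList is exactly the open question, so this shows the shape of the problem rather than evidence that it occurs.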