Re: [HACKERS] Instability in select_parallel regression test

2017-03-02 Thread Robert Haas
On Mon, Feb 27, 2017 at 8:07 AM, Amit Kapila wrote:
>> My guess is that if we apply the fix I suggested above, it'll be good
>> enough. If that turns out not to be true, then I guess we'll have to
>> deal with that, but why not do the easy thing first?
>
> Okay, that is also a sensible approach.

Re: [HACKERS] Instability in select_parallel regression test

2017-02-26 Thread Amit Kapila
On Sun, Feb 26, 2017 at 11:15 PM, Robert Haas wrote:
> On Mon, Feb 20, 2017 at 7:52 AM, Amit Kapila wrote:
>
>> The main point is if we
>> keep any loose end in this area, then there is a chance that the
>> regression test select_parallel can still fail, if not now, then in
>> future. Another wa

Re: [HACKERS] Instability in select_parallel regression test

2017-02-26 Thread Robert Haas
On Mon, Feb 20, 2017 at 7:52 AM, Amit Kapila wrote:
> On Sun, Feb 19, 2017 at 8:32 PM, Robert Haas wrote:
>> On Sun, Feb 19, 2017 at 6:50 PM, Amit Kapila wrote:
>>> To close the remaining gap, don't you think we can check slot->in_use
>>> flag when generation number for handle and slot are same.

Re: [HACKERS] Instability in select_parallel regression test

2017-02-19 Thread Amit Kapila
On Sun, Feb 19, 2017 at 8:32 PM, Robert Haas wrote:
> On Sun, Feb 19, 2017 at 6:50 PM, Amit Kapila wrote:
>> To close the remaining gap, don't you think we can check slot->in_use
>> flag when generation number for handle and slot are same.
>
> That doesn't completely fix it either, because
> Forg

Re: [HACKERS] Instability in select_parallel regression test

2017-02-19 Thread Robert Haas
On Sun, Feb 19, 2017 at 6:50 PM, Amit Kapila wrote:
> To close the remaining gap, don't you think we can check slot->in_use
> flag when generation number for handle and slot are same.
That doesn't completely fix it either, because ForgetBackgroundWorker() also does BackgroundWorkerData->parallel_
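[Editor's illustration] The check proposed in the quoted text can be pictured with the toy model below. This is only a sketch: the types ToySlot and ToyHandle and the function toy_worker_still_registered() are hypothetical names invented for illustration and are not PostgreSQL's actual structures or code. It shows the idea of consulting the in_use flag only when the handle and slot generations match; as the reply above notes, that check alone does not completely close the gap.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a background worker slot. */
typedef struct ToySlot
{
    bool     in_use;      /* slot currently holds a registered worker */
    uint64_t generation;  /* bumped each time the slot is reused */
} ToySlot;

/* Hypothetical handle a launching backend keeps for a worker. */
typedef struct ToyHandle
{
    int      slot;        /* index into the slot array */
    uint64_t generation;  /* generation observed at registration */
} ToyHandle;

/*
 * Return true only if the handle still refers to the same registration
 * (generations match) AND the slot is still marked in use.
 */
static bool
toy_worker_still_registered(const ToySlot *slots, int nslots, ToyHandle h)
{
    assert(h.slot >= 0 && h.slot < nslots);

    if (slots[h.slot].generation != h.generation)
        return false;     /* slot was reused for a different worker */

    return slots[h.slot].in_use;
}
```

In this model a caller would treat a false result as "the worker is gone". Any counter that is updated at a different moment than in_use would still be observable out of step with this check, which appears to be the concern raised in the reply.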

Re: [HACKERS] Instability in select_parallel regression test

2017-02-19 Thread Amit Kapila
On Sun, Feb 19, 2017 at 5:54 PM, Robert Haas wrote:
> On Sun, Feb 19, 2017 at 2:17 PM, Robert Haas wrote:
>> Such a change can be made, but as I pointed out in the part you didn't
>> quote, there are reasons to wonder whether that will be a constructive
>> change in real life even if it's better

Re: [HACKERS] Instability in select_parallel regression test

2017-02-19 Thread Robert Haas
On Sun, Feb 19, 2017 at 5:54 PM, Robert Haas wrote:
> However, it looks like there's a race condition here, because the slot
> doesn't get freed up at the same time that the PID gets set to 0.
> That actually happens later, when the postmaster calls
> maybe_start_bgworker() or DetermineSleepTime()

Re: [HACKERS] Instability in select_parallel regression test

2017-02-19 Thread Robert Haas
On Sun, Feb 19, 2017 at 2:17 PM, Robert Haas wrote:
> Such a change can be made, but as I pointed out in the part you didn't
> quote, there are reasons to wonder whether that will be a constructive
> change in real life even if it's better for the regression tests.
> Optimizing PostgreSQL for the

Re: [HACKERS] Instability in select_parallel regression test

2017-02-19 Thread Robert Haas
On Sat, Feb 18, 2017 at 11:53 PM, Tom Lane wrote:
>> It's a little surprising to me that the delay we're seeing here is
>> significant, because the death of a child should cause an immediate
>> SIGCHLD, resulting in a call to reaper(), resulting in a call to
>> waitpid(). Why's that not working?

Re: [HACKERS] Instability in select_parallel regression test

2017-02-18 Thread Tom Lane
Robert Haas writes:
> On Fri, Feb 17, 2017 at 9:57 PM, Amit Kapila wrote:
>>> That seems like a seriously broken design to me, first because it can make
>>> for a significant delay in the slots becoming available (which is what's
>>> evidently causing these regression failures), and second becaus

Re: [HACKERS] Instability in select_parallel regression test

2017-02-18 Thread Robert Haas
On Fri, Feb 17, 2017 at 9:57 PM, Amit Kapila wrote:
>> That seems like a seriously broken design to me, first because it can make
>> for a significant delay in the slots becoming available (which is what's
>> evidently causing these regression failures), and second because it's
>> simply bad desig

Re: [HACKERS] Instability in select_parallel regression test

2017-02-17 Thread Amit Kapila
On Fri, Feb 17, 2017 at 9:15 PM, Tom Lane wrote:
> Amit Kapila writes:
>> On Fri, Feb 17, 2017 at 11:22 AM, Tom Lane wrote:
>>> In short, it looks to me like ExecShutdownGatherWorkers doesn't actually
>>> wait for parallel workers to finish (as its comment suggests is
>>> necessary), so that on

Re: [HACKERS] Instability in select_parallel regression test

2017-02-17 Thread Tom Lane
Amit Kapila writes:
> On Fri, Feb 17, 2017 at 11:22 AM, Tom Lane wrote:
>> In short, it looks to me like ExecShutdownGatherWorkers doesn't actually
>> wait for parallel workers to finish (as its comment suggests is
>> necessary), so that on not-too-speedy machines the worker slots may all
>> stil

Re: [HACKERS] Instability in select_parallel regression test

2017-02-17 Thread Amit Kapila
On Fri, Feb 17, 2017 at 11:22 AM, Tom Lane wrote:
> Buildfarm members gaur, pademelon, and gharial have all recently shown
> failures like this:
>
> ***
> /home/bfarm/bf-data/HEAD/pgsql.build/src/test/regress/expected/select_parallel.out
> Thu Feb 16 20:35:14 2017
> ---
> /home/bfarm/bf-data/H

[HACKERS] Instability in select_parallel regression test

2017-02-16 Thread Tom Lane
Buildfarm members gaur, pademelon, and gharial have all recently shown failures like this:
*** /home/bfarm/bf-data/HEAD/pgsql.build/src/test/regress/expected/select_parallel.out Thu Feb 16 20:35:14 2017
--- /home/bfarm/bf-data/HEAD/pgsql.build/src/test/regress/results/select_parallel.out Th