On Thu, Jan 25, 2018 at 1:24 AM, Peter Geoghegan <p...@bowt.ie> wrote:
> On Tue, Jan 23, 2018 at 9:43 PM, Amit Kapila <amit.kapil...@gmail.com> wrote:
>> Right, but what if the worker dies due to something proc_exit(1) or
>> something like that before calling BarrierArriveAndWait.  I think this
>> is part of the problem we have solved in
>> WaitForParallelWorkersToFinish such that if the worker exits abruptly
>> at any point due to some reason, the system should not hang.
>
> I have used Thomas' chaos-monkey-fork-process.patch to verify:
>
> 1. The problem of fork failure causing nbtsort.c to wait forever is a
> real problem.  Sure enough, the coding pattern within
> _bt_leader_heapscan() can cause us to wait forever even with commit
> 2badb5afb89cd569500ef7c3b23c7a9d11718f2f, more or less as a
> consequence of the patch not using tuple queues (it uses the new
> tuplesort sharing thing instead).
>
> 2. Simply adding a single call to WaitForParallelWorkersToFinish()
> within _bt_leader_heapscan() before waiting on our condition variable
> fixes the problem -- errors are reliably propagated, and we never end
> up waiting forever.
>
> 3. This short term fix works just as well with
> parallel_leader_participation=off.
>
> At this point, my preferred solution is for someone to go implement
> Amit's WaitForParallelWorkersToAttach() idea [1] (Amit himself seems
> like the logical person for the job).
>
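To make the hazard in point #1 concrete, here is a small standalone
pthread simulation of the pattern (all names are made up; this is not
PostgreSQL code): a leader that waits only for every launched worker to
"arrive" hangs forever when one worker exits early, so the wait loop
must also be able to observe worker exits, which is the role the call
to WaitForParallelWorkersToFinish() plays in point #2:

```c
#include <pthread.h>
#include <stdio.h>

/*
 * Hypothetical simulation of the nbtsort.c hazard.  Worker 0 "dies"
 * before arriving; the leader would wait forever if it watched only
 * nworkers_arrived.
 */
#define NWORKERS_LAUNCHED 3

static pthread_mutex_t mu = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t cv = PTHREAD_COND_INITIALIZER;
static int nworkers_arrived = 0; /* workers that reached the "arrive" step */
static int nworkers_exited = 0;  /* workers that exited, cleanly or not */

static void *
worker_main(void *arg)
{
    int dies_early = *(int *) arg;

    pthread_mutex_lock(&mu);
    if (!dies_early)
        nworkers_arrived++;  /* the normal "I am done scanning" step */
    nworkers_exited++;       /* an abrupt exit is still detectable */
    pthread_cond_broadcast(&cv);
    pthread_mutex_unlock(&mu);
    return NULL;
}

/*
 * Launch the workers (worker 0 simulates dying before arriving) and run
 * the leader's wait loop.  Returns how many workers actually arrived.
 */
static int
leader_wait_demo(void)
{
    pthread_t tid[NWORKERS_LAUNCHED];
    int dies_early[NWORKERS_LAUNCHED];
    int arrived;

    for (int i = 0; i < NWORKERS_LAUNCHED; i++)
    {
        dies_early[i] = (i == 0);
        pthread_create(&tid[i], NULL, worker_main, &dies_early[i]);
    }

    /*
     * Waiting only on "nworkers_arrived < NWORKERS_LAUNCHED" would hang
     * forever.  Also watching nworkers_exited stands in for the error
     * propagation that a WaitForParallelWorkersToFinish() call (or a
     * future WaitForParallelWorkersToAttach()) provides.
     */
    pthread_mutex_lock(&mu);
    while (nworkers_arrived < NWORKERS_LAUNCHED &&
           nworkers_exited < NWORKERS_LAUNCHED)
        pthread_cond_wait(&cv, &mu);
    arrived = nworkers_arrived;
    pthread_mutex_unlock(&mu);

    for (int i = 0; i < NWORKERS_LAUNCHED; i++)
        pthread_join(tid[i], NULL);

    if (arrived < NWORKERS_LAUNCHED)
        printf("worker failure detected: %d of %d arrived\n",
               arrived, NWORKERS_LAUNCHED);
    return arrived;
}
```

Calling leader_wait_demo() reliably returns 2 rather than blocking,
because the dead worker's exit wakes the leader even though its
arrival never happens.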
I can implement it and share a prototype patch with you, which you can
use to test the parallel sort stuff.  I would like to highlight the
difference you will see with WaitForParallelWorkersToAttach() as
compared to WaitForParallelWorkersToFinish(): the former will tell you
how many of the nworkers_launched workers were actually launched,
whereas the latter gives an error if any of the expected workers is
not launched.  I feel the former is better, and your proposed way of
calling it after the leader is done with its own work alleviates the
minor disadvantage of this API, namely that we need to wait for
workers to start up.  However, now I see that you and Thomas are
trying to overcome this problem in a different way, so I am not sure
whether I should go ahead or not.  I have seen that you said you
wanted to look at Thomas's proposed stuff carefully tomorrow, so I
will wait for you guys to decide which way is appropriate.

> Once that's committed, I can
> post a new version of the patch that uses that new infrastructure --
> I'll add a call to the new function, without changing anything else.
> Failing that, we could actually just use
> WaitForParallelWorkersToFinish(). I still don't want to use a barrier,
> mostly because it complicates parallel_leader_participation=off,
> something that Amit is in agreement with [2][3].
>

I think we could use the Barrier APIs to solve this problem if we
wanted to, but I have a feeling that it is not the most appropriate
API for the job, especially because an existing API like
WaitForParallelWorkersToFinish() can serve the need in a similar way.

Just to conclude, the following are the proposed ways to solve this
problem:

1. Implement a new API WaitForParallelWorkersToAttach() and use that
to solve this problem.  Peter G. and Amit think this is a good way to
solve it.
2. Use the existing API WaitForParallelWorkersToFinish() to solve this
problem.  Peter G.
feels that if the API mentioned in #1 is not available, we can use
this one to solve the problem, and I agree with that position.
Thomas is not against it.
3. Use Thomas's new way to detect such failures.  It is not clear to
me at this stage whether any of us has accepted it as the way to
proceed, but Thomas and Peter G. want to investigate it further.
4. Use the Barrier APIs to solve this problem.  Robert appears to be
strongly in favor of this approach.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com