On Mon, Nov 27, 2017 at 10:21 PM, amul sul <sula...@gmail.com> wrote: > Thanks a lot Rajkumar for this test. I am able to reproduce this crash by > enabling partition wise join. > > The reason for this crash is the same as > the > previous[1] i.e node->as_whichplan > value. This time append->first_partial_plan value looks suspicious. With > the > following change to the v21 patch, I am able to reproduce this crash as > assert > failure when enable_partition_wise_join = ON otherwise working fine. > > diff --git a/src/backend/executor/nodeAppend.c > b/src/backend/executor/nodeAppend.c > index e3b17cf0e2..4b337ac633 100644 > --- a/src/backend/executor/nodeAppend.c > +++ b/src/backend/executor/nodeAppend.c > @@ -458,6 +458,7 @@ choose_next_subplan_for_worker(AppendState *node) > > /* Backward scan is not supported by parallel-aware plans */ > Assert(ScanDirectionIsForward(node->ps.state->es_direction)); > + Assert(append->first_partial_plan < node->as_nplans); > > LWLockAcquire(&pstate->pa_lock, LW_EXCLUSIVE); > > > Will look into this more, tomorrow. > I haven't reached the actual reason why there wasn't any partial plan (i.e. value of append->first_partial_plan & node->as_nplans are same) when the partition-wise join is enabled. I think in this case we could simply return false from choose_next_subplan_for_worker() when there aren't any partial plan and we done with all non-partition plan, although I may be wrong because I am yet to understand this patch.
Here are the changes I did on v21 patch to handle crash reported by Rajkumar[1]: diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c index e3b17cf0e2..e0ee918808 100644 --- a/src/backend/executor/nodeAppend.c +++ b/src/backend/executor/nodeAppend.c @@ -479,9 +479,12 @@ choose_next_subplan_for_worker(AppendState *node) pstate->pa_next_plan = append->first_partial_plan; else pstate->pa_next_plan++; - if (pstate->pa_next_plan == node->as_whichplan) + + if (pstate->pa_next_plan == node->as_whichplan || + (pstate->pa_next_plan == append->first_partial_plan && + append->first_partial_plan >= node->as_nplans)) { - /* We've tried everything! */ + /* We've tried everything or there were no partial plans */ pstate->pa_next_plan = INVALID_SUBPLAN_INDEX; LWLockRelease(&pstate->pa_lock); return false; Apart from this I have added few assert to keep eye on node->as_whichplan value in the attached patch, thanks. 1] http://postgr.es/m/CAKcux6nyDxOyE4PA8O%3DQgF-ugZp_y1G2U%2Burmf76-%3Df2knDsWA%40mail.gmail.com Regards, Amul
ParallelAppend_v22.patch
Description: Binary data