On Mon, Aug 28, 2017 at 1:59 AM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> I wrote:
>> I think that the correct fix probably involves marking each parallel scan
>> plan node as dependent on a pseudo executor parameter, which the parent
>> Gather or GatherMerge node would flag as being changed on each rescan.
>> This would cue the plan layers in between that they cannot optimize on the
>> assumption that the leader's instance of the parallel scan will produce
>> exactly the same rows as it did last time, even when "nothing else
>> changed".  The "wtParam" pseudo parameter that's used for communication
>> between RecursiveUnion and its descendant WorkTableScan node is a good
>> model for what needs to happen.
>
> Here is a draft patch for this.
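
To make the mechanism described above concrete, here is a rough toy model of
it. This is only a sketch under stated assumptions, not PostgreSQL code:
ToyNode, ParamSet and the toy_* functions are hypothetical names invented for
illustration, and a 32-bit mask stands in for the executor's Bitmapset. It
shows a caching node between the Gather and the parallel scan keeping its
cache only when no changed parameter reaches its child.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a Bitmapset of parameter IDs (IDs 0..31 only). */
typedef uint32_t ParamSet;

/* Hypothetical, heavily simplified plan-state node. */
typedef struct ToyNode
{
    struct ToyNode *child;      /* outer subplan, or NULL for a leaf scan */
    ParamSet    depends_on;     /* params this subtree's output depends on */
    ParamSet    chgParam;       /* params flagged as changed since last scan */
    bool        has_cache;      /* true if rows from the last scan are kept */
    int         cache_discards; /* how often the cache had to be thrown away */
} ToyNode;

/* Flag one parameter as changed on a node. */
static void
toy_flag_param(ToyNode *node, int param_id)
{
    assert(param_id >= 0 && param_id < 32);
    node->chgParam |= (ParamSet) 1 << param_id;
}

/*
 * Rescan a caching node (think of something Material-like sitting between
 * the Gather and the parallel scan).  Changed params are passed down to the
 * child only if the child depends on them; the cache survives only when the
 * child ends up with nothing flagged.
 */
static void
toy_rescan(ToyNode *node)
{
    if (node->child != NULL)
    {
        node->child->chgParam |= node->chgParam & node->child->depends_on;
        if (node->child->chgParam != 0 && node->has_cache)
        {
            node->has_cache = false;
            node->cache_discards++;
        }
    }
    node->chgParam = 0;
}

/*
 * What the parent Gather-like node does on rescan.  Returns true if the
 * child was rescanned immediately, false if that was left to the first fetch.
 */
static bool
toy_gather_rescan(ToyNode *outer, int rescan_param)
{
    if (rescan_param >= 0)
        toy_flag_param(outer, rescan_param);
    if (outer->chgParam == 0)
    {
        toy_rescan(outer);
        return true;
    }
    return false;
}

/* First fetch after a rescan: do the deferred rescan if anything is flagged. */
static void
toy_first_fetch(ToyNode *node)
{
    if (node->chgParam != 0)
        toy_rescan(node);
}
```

In this model, calling toy_gather_rescan() with a non-negative rescan_param
defers the child's rescan to toy_first_fetch(), at which point the caching
node drops its cache because its child depends on the flagged parameter. With
rescan_param set to -1 nothing is flagged, the child is rescanned at once, and
the cache is kept.
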

!   /*
!    * Set child node's chgParam to tell it that the next scan might deliver a
!    * different set of rows within the leader process.  (The overall rowset
!    * shouldn't change, but the leader process's subset might; hence nodes
!    * between here and the parallel table scan node mustn't optimize on the
!    * assumption of an unchanging rowset.)
!    */
!   if (gm->rescan_param >= 0)
!       outerPlan->chgParam = bms_add_member(outerPlan->chgParam,
!                                            gm->rescan_param);
!
!
!   /*
!    * if chgParam of subnode is not null then plan will be re-scanned by
!    * first ExecProcNode.
!    */
!   if (outerPlan->chgParam == NULL)
!       ExecReScan(outerPlan);

With this change, it is quite possible that workers will not do any work
during rescans.  I think it allows workers to be launched before the rescan
(for a sequential scan) has reset the scan descriptor in the leader.  In that
case the workers will still see the old value, assume that the scan is
finished, and exit without doing any work.  This won't produce wrong results,
because the leader will then scan the whole relation by itself, but it might
be inefficient.

> It's a bit different from wtParam in
> that the special parameter isn't allocated until createplan.c time,
> so that we don't eat a parameter slot if we end up choosing a non-parallel
> plan; but otherwise things are comparable.
>
> I could use some feedback on whether this is marking dependent child nodes
> sanely.  As written, any plan node that's marked parallel_aware is assumed
> to need a dependency on the parent Gather or GatherMerge's rescan param
> --- and the planner will now bitch if a parallel_aware plan node is not
> under any such Gather.  Is this reasonable?

I think so.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com