On Fri, Aug 4, 2023 at 6:08 AM Richard Guo <guofengli...@gmail.com> wrote:
> > On Thu, Aug 3, 2023 at 7:20 PM Ashutosh Bapat < > ashutosh.bapat....@gmail.com> wrote: > >> However, if reparameterization can *not* happen at the planning time, we >> have chosen a plan which can not be realised meeting a dead end. So as long >> as we can ensure that the reparameterization is possible we can delay >> actual translations. I think it should be possible to decide whether >> reparameterization is possible based on the type of path alone. So seems >> doable. >> > > That has been done in v2 patch. See the new added function > path_is_reparameterizable_by_child(). > Thanks. The patch looks good overall. I like it because of its potential to reduce memory consumption in reparameterization. That's cake on top of cream :) Some comments here. + { + FLAT_COPY_PATH(new_path, path, Path); + + if (path->pathtype == T_SampleScan) + { + Index scan_relid = path->parent->relid; + RangeTblEntry *rte; + + /* it should be a base rel with a tablesample clause... */ + Assert(scan_relid > 0); + rte = planner_rt_fetch(scan_relid, root); + Assert(rte->rtekind == RTE_RELATION); + Assert(rte->tablesample != NULL); + + ADJUST_CHILD_EXPRS(rte->tablesample, TableSampleClause *); + } + } This change makes this function usable only after the final plan has been selected. If some code accidently uses it earlier, it would corrupt rte->tablesample without getting detected easily. I think we should mention this in the function prologue and move it somewhere else. Other option is to a. leave rte->tablesample unchanged here with a comment as to why we aren't changing it b. reparameterize tablesample expression in create_samplescan_plan() if the path is parameterized by the parent. We call reparameterize_path_by_child() in create_nestloop_plan() as this patch does right now. I like that we are delaying reparameterization. It saves a bunch of memory. I haven't measured it but from my recent experiments I know it will be a lot. * If the inner path is parameterized, it is parameterized by the - * topmost parent of the outer rel, not the outer rel itself. Fix - * that. + * topmost parent of the outer rel, not the outer rel itself. We will There's something wrong with the original sentence (which was probably written by me back then :)). I think it should have read "If the inner path is parameterized by the topmost parent of the outer rel instead of the outer rel itself, fix that.". If you agree, the new comment should change it accordingly. @@ -2430,6 +2430,16 @@ create_nestloop_path(PlannerInfo *root, { NestPath *pathnode = makeNode(NestPath); Relids inner_req_outer = PATH_REQ_OUTER(inner_path); + Relids outerrelids; + + /* + * Paths are parameterized by top-level parents, so run parameterization + * tests on the parent relids. + */ + if (outer_path->parent->top_parent_relids) + outerrelids = outer_path->parent->top_parent_relids; + else + outerrelids = outer_path->parent->relids; This looks like an existing bug. Did partitionwise join paths ended up with extra restrict clauses without this fix? I am not able to see why this change is needed by rest of the changes in the patch? Anyway, let's rename outerrelids to something else e.g. outer_param_relids to reflect its true meaning. +bool +path_is_reparameterizable_by_child(Path *path) +{ + switch (nodeTag(path)) + { + case T_Path: ... snip ... + case T_GatherPath: + return true; + default: + + /* We don't know how to reparameterize this path. */ + return false; + } + + return false; +} How do we make sure that any changes to reparameterize_path_by_child() are reflected here? One way is to call this function from reparameterize_path_by_child() the first thing and return if the path can not be reparameterized. I haven't gone through each path's translation to make sure that it's possible to do that during create_nestloop_plan(). I will do that in my next round of review if time permits. But AFAIR, each of the translations are possible during create_nestloop_plan() when it happens in this patch. -#define ADJUST_CHILD_ATTRS(node) \ +#define ADJUST_CHILD_EXPRS(node, fieldtype) \ ((node) = \ - (List *) adjust_appendrel_attrs_multilevel(root, (Node *) (node), \ - child_rel, \ - child_rel->top_parent)) + (fieldtype) adjust_appendrel_attrs_multilevel(root, (Node *) (node), \ + child_rel, \ + child_rel->top_parent)) + +#define ADJUST_CHILD_ATTRS(node) ADJUST_CHILD_EXPRS(node, List *) IIRC, ATTRS here is taken from the function being called. I think we should just change the macro definition, not its name. If you would like to have a separate macro for List nodes, create ADJUST_CHILD_ATTRS_IN_LIST or some such. -- Best Wishes, Ashutosh Bapat