On Tue, Mar 6, 2018 at 4:59 PM, Ashutosh Bapat < ashutosh.ba...@enterprisedb.com> wrote:
> Hi Jeevan, > I am back reviewing this. Here are some comments. > > @@ -1415,7 +1413,8 @@ add_paths_to_append_rel(PlannerInfo *root, > RelOptInfo *rel, > * the unparameterized Append path we are constructing for the > parent. > * If not, there's no workable unparameterized path. > */ > - if (childrel->cheapest_total_path->param_info == NULL) > + if (childrel->pathlist != NIL && > + childrel->cheapest_total_path->param_info == NULL) > accumulate_append_subpath(childrel->cheapest_total_path, > &subpaths, NULL); > else > @@ -1683,6 +1682,13 @@ add_paths_to_append_rel(PlannerInfo *root, > RelOptInfo *rel, > RelOptInfo *childrel = (RelOptInfo *) lfirst(lcr); > Path *subpath; > > + if (childrel->pathlist == NIL) > + { > + /* failed to make a suitable path for this child */ > + subpaths_valid = false; > + break; > + } > + > When can childrel->pathlist be NIL? > > diff --git a/src/backend/optimizer/plan/createplan.c > b/src/backend/optimizer/plan/createplan.c > index 9ae1bf3..f90626c 100644 > --- a/src/backend/optimizer/plan/createplan.c > +++ b/src/backend/optimizer/plan/createplan.c > @@ -1670,7 +1670,15 @@ create_sort_plan(PlannerInfo *root, SortPath > *best_path, int flags) > subplan = create_plan_recurse(root, best_path->subpath, > flags | CP_SMALL_TLIST); > > - plan = make_sort_from_pathkeys(subplan, best_path->path.pathkeys, > NULL); > + /* > + * In find_ec_member_for_tle(), child EC members are ignored if they > don't > + * belong to the given relids. Thus, if this sort path is based on a > child > + * relation, we must pass the relids of it. Otherwise, we will end-up > into > + * an error requiring pathkey item. > + */ > + plan = make_sort_from_pathkeys(subplan, best_path->path.pathkeys, > + IS_OTHER_REL(best_path->subpath->parent) > ? > + best_path->path.parent->relids : NULL); > > copy_generic_path_info(&plan->plan, (Path *) best_path); > > Please separate this small adjustment in a patch of its own, with some > explanation of why we need it i.e. now this function can see SortPaths from > child (other) relations. > > + if (child_data) > + { > + /* Must be other rel as all child relations are marked OTHER_RELs > */ > + Assert(IS_OTHER_REL(input_rel)); > > I think we should check IS_OTHER_REL() and Assert(child_data). That way we > know > that the code in the if block is executed for OTHER relation. > > - if ((root->hasHavingQual || parse->groupingSets) && > + if (!child_data && (root->hasHavingQual || parse->groupingSets) && > > Degenerate grouping will never see child relations, so instead of checking > for > child_data, Assert (!IS_OTHER_REL()) inside this block. Add a comment there > explaining the assertion. > > + * > + * If we are performing grouping for a child relation, fetch can_sort > from > + * the child_data to avoid re-calculating same. > */ > - can_sort = (gd && gd->rollups != NIL) > - || grouping_is_sortable(parse->groupClause); > + can_sort = child_data ? child_data->can_sort : ((gd && > gd->rollups != NIL) || > + > grouping_is_sortable(parse->groupClause)); > > Instead of adding a conditional here, we can compute these values before > create_grouping_paths() is called from grouping_planner() and then pass > them > down to try_partitionwise_grouping(). I have attached a patch which > refactors > this code. Please see if this refactoring is useful. In the attached > patch, I > have handled can_sort, can_hash and partial aggregation costs. More on the > last > component below. > Changes look good to me and refactoring will be useful for partitionwise patches. However, will it be good if we add agg_costs into the GroupPathExtraData too? Also can we pass this to the add_partial_paths_to_grouping_rel() and add_paths_to_grouping_rel() to avoid passing can_sort, can_hash and costs related details individually to them? > > > /* > * Figure out whether a PartialAggregate/Finalize Aggregate execution > @@ -3740,10 +3789,8 @@ create_grouping_paths(PlannerInfo *root, > * partial paths for partially_grouped_rel; that way, later code can > * easily consider both parallel and non-parallel approaches to > grouping. > */ > - if (try_parallel_aggregation) > + if (!child_data && !(agg_costs->hasNonPartial || > agg_costs->hasNonSerial)) > { > - PathTarget *partial_grouping_target; > - > [... clipped ...] > + get_agg_clause_costs(root, havingQual, > AGGSPLIT_FINAL_DESERIAL, > - &agg_final_costs); > + agg_final_costs); > } > + } > > With this change, we are computing partial aggregation costs even in > the cases when > those will not be used e.g. when there are no children and parallel paths > can > not be created. In the attached patch, I have refactored the code such that > they are computed when they are needed the first time and re-used later. > > + if (child_data) > + { > + partial_grouping_target = child_data->partialRelTarget; > + partially_grouped_rel->reltarget = partial_grouping_target; > + agg_partial_costs = child_data->agg_partial_costs; > + agg_final_costs = child_data->agg_final_costs; > + } > > I think, with the refactoring, we can get rid of the last two lines here. I > think we can get rid of this block entirely, but I have not reviewed the > entire > code to confirm that. > > static PathTarget * > -make_partial_grouping_target(PlannerInfo *root, PathTarget > *grouping_target) > +make_partial_grouping_target(PlannerInfo *root, > + PathTarget *grouping_target, > + Node *havingQual) > This looks like a refactoring change. Should go to one of the refactoring > patches or in a patch of its own. > > This isn't full review. I will continue reviewing this further. > > -- > Best Wishes, > Ashutosh Bapat > EnterpriseDB Corporation > The Postgres Database Company > -- Jeevan Chalke Technical Architect, Product Development EnterpriseDB Corporation The Enterprise PostgreSQL Company