Re: [HACKERS] Partition-wise aggregation/grouping

Jeevan Chalke Tue, 06 Mar 2018 06:23:20 -0800

On Tue, Mar 6, 2018 at 4:59 PM, Ashutosh Bapat <
ashutosh.ba...@enterprisedb.com> wrote:


> Hi Jeevan,
> I am back reviewing this. Here are some comments.
>
> @@ -1415,7 +1413,8 @@ add_paths_to_append_rel(PlannerInfo *root,
> RelOptInfo *rel,
>           * the unparameterized Append path we are constructing for the
> parent.
>           * If not, there's no workable unparameterized path.
>           */
> -        if (childrel->cheapest_total_path->param_info == NULL)
> +        if (childrel->pathlist != NIL &&
> +            childrel->cheapest_total_path->param_info == NULL)
>              accumulate_append_subpath(childrel->cheapest_total_path,
>                                        &subpaths, NULL);
>          else
> @@ -1683,6 +1682,13 @@ add_paths_to_append_rel(PlannerInfo *root,
> RelOptInfo *rel,
>              RelOptInfo *childrel = (RelOptInfo *) lfirst(lcr);
>              Path       *subpath;
>
> +            if (childrel->pathlist == NIL)
> +            {
> +                /* failed to make a suitable path for this child */
> +                subpaths_valid = false;
> +                break;
> +            }
> +
> When can childrel->pathlist be NIL?
>
> diff --git a/src/backend/optimizer/plan/createplan.c
> b/src/backend/optimizer/plan/createplan.c
> index 9ae1bf3..f90626c 100644
> --- a/src/backend/optimizer/plan/createplan.c
> +++ b/src/backend/optimizer/plan/createplan.c
> @@ -1670,7 +1670,15 @@ create_sort_plan(PlannerInfo *root, SortPath
> *best_path, int flags)
>      subplan = create_plan_recurse(root, best_path->subpath,
>                                    flags | CP_SMALL_TLIST);
>
> -    plan = make_sort_from_pathkeys(subplan, best_path->path.pathkeys,
> NULL);
> +    /*
> +     * In find_ec_member_for_tle(), child EC members are ignored if they
> don't
> +     * belong to the given relids. Thus, if this sort path is based on a
> child
> +     * relation, we must pass the relids of it. Otherwise, we will end-up
> into
> +     * an error requiring pathkey item.
> +     */
> +    plan = make_sort_from_pathkeys(subplan, best_path->path.pathkeys,
> +                                   IS_OTHER_REL(best_path->subpath->parent)
> ?
> +                                   best_path->path.parent->relids : NULL);
>
>      copy_generic_path_info(&plan->plan, (Path *) best_path);
>
> Please separate this small adjustment in a patch of its own, with some
> explanation of why we need it i.e. now this function can see SortPaths from
> child (other) relations.
>
> +    if (child_data)
> +    {
> +        /* Must be other rel as all child relations are marked OTHER_RELs
> */
> +        Assert(IS_OTHER_REL(input_rel));
>
> I think we should check IS_OTHER_REL() and Assert(child_data). That way we
> know
> that the code in the if block is executed for OTHER relation.
>
> -    if ((root->hasHavingQual || parse->groupingSets) &&
> +    if (!child_data && (root->hasHavingQual || parse->groupingSets) &&
>
> Degenerate grouping will never see child relations, so instead of checking
> for
> child_data, Assert (!IS_OTHER_REL()) inside this block. Add a comment there
> explaining the assertion.
>
> +     *
> +     * If we are performing grouping for a child relation, fetch can_sort
> from
> +     * the child_data to avoid re-calculating same.
>       */
> -    can_sort = (gd && gd->rollups != NIL)
> -        || grouping_is_sortable(parse->groupClause);
> +    can_sort = child_data ? child_data->can_sort : ((gd &&
> gd->rollups != NIL) ||
> +
> grouping_is_sortable(parse->groupClause));
>
> Instead of adding a conditional here, we can compute these values before
> create_grouping_paths() is called from grouping_planner() and then pass
> them
> down to try_partitionwise_grouping(). I have attached a patch which
> refactors
> this code. Please see if this refactoring is useful. In the attached
> patch, I
> have handled can_sort, can_hash and partial aggregation costs. More on the
> last
> component below.
>

Changes look good to me and refactoring will be useful for partitionwise
patches.

However, will it be good if we add agg_costs into the GroupPathExtraData
too?
Also can we pass this to the add_partial_paths_to_grouping_rel() and
add_paths_to_grouping_rel() to avoid passing can_sort, can_hash and costs
related details individually to them?


>
>
>      /*
>       * Figure out whether a PartialAggregate/Finalize Aggregate execution
> @@ -3740,10 +3789,8 @@ create_grouping_paths(PlannerInfo *root,
>       * partial paths for partially_grouped_rel; that way, later code can
>       * easily consider both parallel and non-parallel approaches to
> grouping.
>       */
> -    if (try_parallel_aggregation)
> +    if (!child_data && !(agg_costs->hasNonPartial ||
> agg_costs->hasNonSerial))
>      {
> -        PathTarget *partial_grouping_target;
> -
> [... clipped ...]
> +            get_agg_clause_costs(root, havingQual,
>                                   AGGSPLIT_FINAL_DESERIAL,
> -                                 &agg_final_costs);
> +                                 agg_final_costs);
>          }
> +    }
>
> With this change, we are computing partial aggregation costs even in
> the cases when
> those will not be used e.g. when there are no children and parallel paths
> can
> not be created. In the attached patch, I have refactored the code such that
> they are computed when they are needed the first time and re-used later.
>
> +    if (child_data)
> +    {
> +        partial_grouping_target = child_data->partialRelTarget;
> +        partially_grouped_rel->reltarget = partial_grouping_target;
> +        agg_partial_costs = child_data->agg_partial_costs;
> +        agg_final_costs = child_data->agg_final_costs;
> +    }
>
> I think, with the refactoring, we can get rid of the last two lines here. I
> think we can get rid of this block entirely, but I have not reviewed the
> entire
> code to confirm that.
>
>  static PathTarget *
> -make_partial_grouping_target(PlannerInfo *root, PathTarget
> *grouping_target)
> +make_partial_grouping_target(PlannerInfo *root,
> +                             PathTarget *grouping_target,
> +                             Node *havingQual)
> This looks like a refactoring change. Should go to one of the refactoring
> patches or in a patch of its own.
>
> This isn't full review. I will continue reviewing this further.
>
> --
> Best Wishes,
> Ashutosh Bapat
> EnterpriseDB Corporation
> The Postgres Database Company
>



-- 
Jeevan Chalke
Technical Architect, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company

Re: [HACKERS] Partition-wise aggregation/grouping

Reply via email to