On Thu, Feb 1, 2018 at 1:11 AM, Robert Haas <[email protected]> wrote:
> On Mon, Jan 29, 2018 at 3:42 AM, Jeevan Chalke
> <[email protected]> wrote:
> > Attached new patch set and rebased it on latest HEAD.
>
> I strongly dislike add_single_path_to_append_rel. It adds branches
> and complexity to code that is already very complex. Most
> importantly, why are we adding paths to fields in
> OtherUpperPathExtraData *extra instead of adding them to the path list
> of some RelOptInfo? If we had an appropriate RelOptInfo to which we
> could add the paths, then we could make this simpler.
>
> If I understand correctly, the reason you're doing it this way is
> because we have no place to put partially-aggregated, non-partial
> paths. If we only needed to worry about the parallel query case, we
> could just build an append of partially-aggregated paths for each
> child and stick it into the grouped rel's partial pathlist, just as we
> already do for regular parallel aggregation. There's no reason why
> add_paths_to_grouping_rel() needs to care about the difference a
> Partial Aggregate on top of whatever and an Append each branch of
> which is a Partial Aggregate on top of whatever. However, this won't
> work for non-partial paths, because add_paths_to_grouping_rel() needs
> to put paths into the grouped rel's pathlist -- and we can't mix
> together partially-aggregated paths and fully-aggregated paths in the
> same path list.
>
Yes.
>
> But, really, the way we're using grouped_rel->partial_pathlist right
> now is an awful hack. What I'm thinking we could do is introduce a
> new UpperRelationKind called UPPERREL_PARTIAL_GROUP_AGG, coming just
> before UPPERREL_GROUP_AGG. Without partition-wise aggregate,
> partially_grouped_rel's pathlist would always be NIL, and its partial
> pathlist would be constructed using the logic in
> add_partial_paths_to_grouping_rel, which would need renaming. Then,
> add_paths_to_grouping_rel would use paths from input_rel when doing
> non-parallel aggregation and paths from partially_grouped_rel when
> doing parallel aggregation. This would eliminate the ugly
> grouped_rel->partial_pathlist = NIL assignment at the bottom of
> create_grouping_paths(), because the grouped_rel's partial_pathlist
> would never have been (bogusly) populated in the first place, and
> hence would not need to be reset. All of these changes could be made
> via a preparatory patch.
>
I wrote a patch for this (on current HEAD) and attached separately here.
Please have a look.
I still not yet fully understand how we are going to pass those to the
add_paths_to_append_rel(). I need to look it more deeply though.
> Then the main patch needs to worry about four cases:
>
> 1. Parallel partition-wise aggregate, grouping key doesn't contain
> partition key. This should just be a matter of adding additional
> Append paths to partially_grouped_rel->partial_pathlist. The existing
> code already knows how to stick a Gather and FinalizeAggregate step on
> top of that, and I don't see why that logic would need any
> modification or addition. An Append of child partial-grouping paths
> should be producing the same output as a partial grouping of an
> Append, except that the former case might produce more separate groups
> that need to be merged; but that should be OK: we can just throw all
> the paths into the same path list and let the cheapest one win.
>
For any partial aggregation we need to add finalization step after we are
done with the APPEND i.e. post add_paths_to_append_rel(). Given that we
need to replicate the logic of sticking Gather and FinalizeAggregate step
at later stage. This is what exactly done in create_partition_agg_paths().
Am I missing something here?
> 2. Parallel partition-wise aggregate, grouping key contains partition
> key. For the most part, this is no different from case #1. We won't
> have groups spanning different partitions in this case, but we might
> have groups spanning different workers, so we still need a
> FinalizeAggregate step. As an exception, Gather -> Parallel Append ->
> [non-partial Aggregate path] would give us a way of doing aggregation
> in parallel without a separate Finalize step. I'm not sure if we want
> to consider that to be in scope for this patch. If we do, then we'd
> add the Parallel Append path to grouped_rel->partial_pathlist. Then,
> we could stick Gather (Merge) on top if it to produce a path for
> grouped_rel->pathlist using generate_gather_paths(); alternatively, it
> can be used by upper planning steps -- something we currently can't
> ever make work with parallel aggregation.
>
> 3. Non-parallel partition-wise aggregate, grouping key contains
> partition key. Build Append paths from the children of grouped_rel
> and add them to grouped_rel->pathlist.
>
Yes.
>
> 3. Non-parallel partition-wise aggregate, grouping key doesn't contain
> partition key. Build Append paths from the children of
> partially_grouped_rel and add them to partially_grouped_rel->pathlist.
> Also add code to generate paths for grouped_rel->pathlist by sticking
> a FinalizeAggregate step on top of each path from
> partially_grouped_rel->pathlist.
>
Yes, this is done in create_partition_agg_paths().
create_partition_agg_paths() basically adds gather path, if required and
then finalizes it again if required. These steps are similar to that of
add_paths_to_grouping_rel() counterpart which does gather + finalization.
> Overall, what I'm trying to argue for here is making this feature look
> less like its own separate thing and more like part of the general
> mechanism we've already got: partial paths would turn into regular
> paths via generate_gather_paths(), and partially aggregated paths
> would turn into fully aggregated paths by adding FinalizeAggregate.
> The existing special case that allows us to build a non-partial, fully
> aggregated path from a partial, partially-aggregated path would be
> preserved.
>
> I think this would probably eliminate some other problems I see in the
> existing design as well. For example, create_partition_agg_paths
> doesn't consider using Gather Merge, but that might be a win.
Append path is always non-sorted and has no pathkeys. Thus Gather Merge
over an Append path seems infeasible, isn't it?
> With
> the design above, I think you never need to call create_gather_path()
> anywhere. In case #1, the existing code takes care of it. In the
> special case mentioned under #2, if we chose to support that,
> generate_gather_paths() would take care of it. Both of those places
> already know about Gather Merge.
>
I don't understand how exactly, will have more careful look over this.
> On another note, I found preferFullAgg to be wicked confusing. To
> "prefer" something is to like it better, but be willing to accept
> other options if the preference can't be accommodated. Here, it seems
> like preferFullAgg = false prevents consideration of full aggregation.
> So it's really more like allowFullAgg, or, maybe better,
> try_full_aggregation. Also, try_partition_wise_grouping has a
> variable isPartialAgg which is always ends up getting set to
> !preferFullAgg. Having two Boolean variables which are always set to
> the opposite of each other isn't good. To add to the confusion, the
> code following the place where isPartialAgg is set sometimes refers to
> isPartialAgg and sometimes refers to preferFullAgg.
>
I will have a look over this and commenting part.
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>
Thanks
--
Jeevan Chalke
Technical Architect, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 2a4e22b..8b8a567 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -186,15 +186,18 @@ static PathTarget *make_sort_input_target(PlannerInfo *root,
static void adjust_paths_for_srfs(PlannerInfo *root, RelOptInfo *rel,
List *targets, List *targets_contain_srfs);
static void add_paths_to_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel,
- RelOptInfo *grouped_rel, PathTarget *target,
+ RelOptInfo *grouped_rel,
+ RelOptInfo *partially_grouped_rel,
+ PathTarget *target,
PathTarget *partial_grouping_target,
const AggClauseCosts *agg_costs,
const AggClauseCosts *agg_final_costs,
grouping_sets_data *gd, bool can_sort, bool can_hash,
double dNumGroups, List *havingQual);
-static void add_partial_paths_to_grouping_rel(PlannerInfo *root,
+static void add_paths_to_partial_grouping_rel(PlannerInfo *root,
RelOptInfo *input_rel,
RelOptInfo *grouped_rel,
+ RelOptInfo *partial_grouped_rel,
PathTarget *target,
PathTarget *partial_grouping_target,
AggClauseCosts *agg_partial_costs,
@@ -3627,6 +3630,7 @@ create_grouping_paths(PlannerInfo *root,
Query *parse = root->parse;
Path *cheapest_path = input_rel->cheapest_total_path;
RelOptInfo *grouped_rel;
+ RelOptInfo *partially_grouped_rel;
PathTarget *partial_grouping_target = NULL;
AggClauseCosts agg_partial_costs; /* parallel only */
AggClauseCosts agg_final_costs; /* parallel only */
@@ -3637,6 +3641,8 @@ create_grouping_paths(PlannerInfo *root,
/* For now, do all work in the (GROUP_AGG, NULL) upperrel */
grouped_rel = fetch_upper_rel(root, UPPERREL_GROUP_AGG, NULL);
+ partially_grouped_rel = fetch_upper_rel(root, UPPERREL_PARTIAL_GROUP_AGG,
+ NULL);
/*
* If the input relation is not parallel-safe, then the grouped relation
@@ -3817,7 +3823,8 @@ create_grouping_paths(PlannerInfo *root,
&agg_final_costs);
}
- add_partial_paths_to_grouping_rel(root, input_rel, grouped_rel, target,
+ add_paths_to_partial_grouping_rel(root, input_rel, grouped_rel,
+ partially_grouped_rel, target,
partial_grouping_target,
&agg_partial_costs, &agg_final_costs,
gd, can_sort, can_hash,
@@ -3825,7 +3832,8 @@ create_grouping_paths(PlannerInfo *root,
}
/* Build final grouping paths */
- add_paths_to_grouping_rel(root, input_rel, grouped_rel, target,
+ add_paths_to_grouping_rel(root, input_rel, grouped_rel,
+ partially_grouped_rel, target,
partial_grouping_target, agg_costs,
&agg_final_costs, gd, can_sort, can_hash,
dNumGroups, (List *) parse->havingQual);
@@ -5859,7 +5867,9 @@ get_partitioned_child_rels_for_join(PlannerInfo *root, Relids join_relids)
*/
static void
add_paths_to_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel,
- RelOptInfo *grouped_rel, PathTarget *target,
+ RelOptInfo *grouped_rel,
+ RelOptInfo *partially_grouped_rel,
+ PathTarget *target,
PathTarget *partial_grouping_target,
const AggClauseCosts *agg_costs,
const AggClauseCosts *agg_final_costs,
@@ -5870,6 +5880,12 @@ add_paths_to_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel,
Path *cheapest_path = input_rel->cheapest_total_path;
ListCell *lc;
+ /*
+ * Parallel aggregation's partial paths must be stored in a
+ * partially_grouped_rel and not in a grouped_rel.
+ */
+ Assert(grouped_rel->partial_pathlist == NIL);
+
if (can_sort)
{
/*
@@ -5945,10 +5961,13 @@ add_paths_to_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel,
* Now generate a complete GroupAgg Path atop of the cheapest partial
* path. We can do this using either Gather or Gather Merge.
*/
- if (grouped_rel->partial_pathlist)
+ if (partially_grouped_rel->partial_pathlist)
{
- Path *path = (Path *) linitial(grouped_rel->partial_pathlist);
- double total_groups = path->rows * path->parallel_workers;
+ Path *path;
+ double total_groups;
+
+ path = (Path *) linitial(partially_grouped_rel->partial_pathlist);
+ total_groups = path->rows * path->parallel_workers;
path = (Path *) create_gather_path(root,
grouped_rel,
@@ -5999,7 +6018,7 @@ add_paths_to_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel,
*/
if (parse->groupClause != NIL && root->group_pathkeys != NIL)
{
- foreach(lc, grouped_rel->partial_pathlist)
+ foreach(lc, partially_grouped_rel->partial_pathlist)
{
Path *subpath = (Path *) lfirst(lc);
Path *gmpath;
@@ -6107,9 +6126,11 @@ add_paths_to_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel,
* again, we'll only do this if it looks as though the hash table
* won't exceed work_mem.
*/
- if (grouped_rel->partial_pathlist)
+ if (partially_grouped_rel->partial_pathlist)
{
- Path *path = (Path *) linitial(grouped_rel->partial_pathlist);
+ Path *path;
+
+ path = (Path *) linitial(partially_grouped_rel->partial_pathlist);
hashaggtablesize = estimate_hashagg_tablesize(path,
agg_final_costs,
@@ -6143,15 +6164,16 @@ add_paths_to_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel,
}
/*
- * add_partial_paths_to_grouping_rel
+ * add_paths_to_partial_grouping_rel
*
* Add partial paths to grouping relation. These paths are not fully
* aggregated; a FinalizeAggregate step is still required.
*/
static void
-add_partial_paths_to_grouping_rel(PlannerInfo *root,
+add_paths_to_partial_grouping_rel(PlannerInfo *root,
RelOptInfo *input_rel,
RelOptInfo *grouped_rel,
+ RelOptInfo *partially_grouped_rel,
PathTarget *target,
PathTarget *partial_grouping_target,
AggClauseCosts *agg_partial_costs,
@@ -6199,7 +6221,7 @@ add_partial_paths_to_grouping_rel(PlannerInfo *root,
-1.0);
if (parse->hasAggs)
- add_partial_path(grouped_rel, (Path *)
+ add_partial_path(partially_grouped_rel, (Path *)
create_agg_path(root,
grouped_rel,
path,
@@ -6211,7 +6233,7 @@ add_partial_paths_to_grouping_rel(PlannerInfo *root,
agg_partial_costs,
dNumPartialGroups));
else
- add_partial_path(grouped_rel, (Path *)
+ add_partial_path(partially_grouped_rel, (Path *)
create_group_path(root,
grouped_rel,
path,
@@ -6239,7 +6261,7 @@ add_partial_paths_to_grouping_rel(PlannerInfo *root,
*/
if (hashaggtablesize < work_mem * 1024L)
{
- add_partial_path(grouped_rel, (Path *)
+ add_partial_path(partially_grouped_rel, (Path *)
create_agg_path(root,
grouped_rel,
cheapest_partial_path,
diff --git a/src/include/nodes/relation.h b/src/include/nodes/relation.h
index 6bf68f3..a10b150 100644
--- a/src/include/nodes/relation.h
+++ b/src/include/nodes/relation.h
@@ -71,6 +71,8 @@ typedef struct AggClauseCosts
typedef enum UpperRelationKind
{
UPPERREL_SETOP, /* result of UNION/INTERSECT/EXCEPT, if any */
+ UPPERREL_PARTIAL_GROUP_AGG, /* result of partial grouping/aggregation, if
+ * any */
UPPERREL_GROUP_AGG, /* result of grouping/aggregation, if any */
UPPERREL_WINDOW, /* result of window functions, if any */
UPPERREL_DISTINCT, /* result of "SELECT DISTINCT", if any */