On 18 March 2016 at 01:22, Amit Kapila <[email protected]> wrote: > On Thu, Mar 17, 2016 at 10:35 AM, David Rowley > <[email protected]> wrote: >> >> On 17 March 2016 at 01:19, Amit Kapila <[email protected]> wrote: >> > Few assorted comments: >> > >> > 2. >> > AggPath * >> > create_agg_path(PlannerInfo *root, >> > @@ -2397,9 +2399,11 @@ create_agg_path(PlannerInfo *root, >> > >> > List *groupClause, >> > List *qual, >> > const AggClauseCosts >> > *aggcosts, >> > - double numGroups) >> > + double numGroups, >> > + >> > bool combineStates, >> > + bool finalizeAggs) >> > >> > Don't you need to set parallel_aware flag in this function as we do for >> > create_seqscan_path()? >> >> I don't really know the answer to that... I mean there's nothing >> special done in nodeAgg.c if the node is running in a worker or in the >> main process. >> > > On again thinking about it, I think it is okay to set parallel_aware flag as > false. This flag means whether that particular node has any parallelism > behaviour which is true for seqscan, but I think not for partial aggregate > node. > > Few other comments on latest patch: > > 1. > > + /* > + * XXX does this need estimated for each partial path, or are they > + * all > going to be the same anyway? > + */ > + dNumPartialGroups = get_number_of_groups(root, > + > clamp_row_est(partial_aggregate_path->rows), > + > rollup_lists, > + > rollup_groupclauses); > > For considering partial groups, do we need to rollup related lists?
No it doesn't you're right. I did mean to remove these, but they're
NIL anyway. Seems better to remove them to prevent confusion.
> 2.
> + hashaggtablesize = estimate_hashagg_tablesize(partial_aggregate_path,
> +
> &agg_costs,
> +
> dNumPartialGroups);
> +
> + /*
> + * Generate a
> hashagg Path, if we can, but we'll skip this if the hash
> + * table looks like it'll exceed work_mem.
> +
> */
> + if (can_hash && hashaggtablesize < work_mem * 1024L)
>
>
> hash table size should be estimated only if can_hash is true.
Good point. Changed.
>
> 3.
> + foreach(lc, grouped_rel->partial_pathlist)
> + {
> + Path *path =
> (Path *) lfirst(lc);
> + double total_groups;
> +
> + total_groups = path-
>>parallel_degree * path->rows;
> +
> + path = (Path *) create_gather_path(root, grouped_rel, path,
> NULL,
> + &total_groups);
>
> Do you need to perform it foreach partial path or just do it for
> firstpartial path?
That's true. The order here does not matter since we're passing
directly into a Gather node, so it's wasteful to consider anything
apart from the cheapest path. -- Fixed.
There was also a missing hash table size check on the Finalize
HashAggregate Path consideration. I've added that now.
Updated patch is attached. Thanks for the re-review.
--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
0001-Allow-aggregation-to-happen-in-parallel_2016-03-18a.patch
Description: Binary data
-- Sent via pgsql-hackers mailing list ([email protected]) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
