> -----Original Message-----
> From: David Rowley [mailto:david.row...@2ndquadrant.com]
> Sent: Tuesday, June 23, 2015 2:06 PM
> To: Kaigai Kouhei(海外 浩平)
> Cc: Robert Haas; pgsql-hackers@postgresql.org; Tom Lane
> Subject: Re: [HACKERS] upper planner path-ification
> 
> 
> On 23 June 2015 at 13:55, Kouhei Kaigai <kai...@ak.jp.nec.com> wrote:
> 
> 
>       Once we support adding aggregation paths during path consideration,
>       we need to pay attention to how the final target-list morphs according
>       to the intermediate path combination that is tentatively chosen.
>       For example, if partial aggregation makes sense from a cost perspective,
>       such as SUM(NRows) over partial COUNT(*) AS NRows instead of COUNT(*) on
>       a billion rows, the planner also has to adjust the final target-list
>       according to the interim paths. In this case, the final output shall be
>       SUM() instead of COUNT().
> 
> 
> 
> 
> This sounds very much like what's been discussed here:
> 
> http://www.postgresql.org/message-id/CA+U5nMJ92azm0Yt8TT=hNxFP=VjFhDqFpaWfmj
> +66-4zvcg...@mail.gmail.com
> 
> 
> The basic concept is that we add another function set to aggregates that allows
> the combination of 2 states. For the case of MIN() and MAX() this will just be
> the same as the transfn. SUM() is similar for many types, more complex for
> others.
> I've quite likely just borrowed SUM(BIGINT)'s transition functions to allow
> COUNT()'s to be combined.
>
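To make the combining idea concrete, here is a minimal Python sketch (not the
patch itself). The names count_trans, count_combine and minmax_combine are
hypothetical and only model what is described above: COUNT() states are
combined by addition, as SUM(BIGINT) would do, and MIN() can reuse its
transition logic as the combine step.

```python
from functools import reduce

# Hypothetical model of a transition function: one input row at a time.
def count_trans(state, _row):
    return state + 1

# Hypothetical model of a combine function: merges two partial states.
# For COUNT() this is plain addition, i.e. what SUM(BIGINT) does.
def count_combine(a, b):
    return a + b

# For MIN() the combine step is the same operation as the transition step.
def minmax_combine(a, b):
    return min(a, b)

def partial_count(rows):
    return reduce(count_trans, rows, 0)

# Three "workers", each counting its own chunk of rows.
chunks = [range(0, 1000), range(1000, 2500), range(2500, 4000)]
partials = [partial_count(c) for c in chunks]

# Final phase: the output is a SUM() over partial COUNT() results.
total = reduce(count_combine, partials, 0)
smallest = reduce(minmax_combine, [min(c) for c in chunks])
```

The final phase never sees the original rows, only the partial states, which
is why the final target-list has to become SUM() instead of COUNT().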
STDDEV, VARIANCE and related aggregates can be constructed from nrows, sum(X)
and sum(X^2).
REGR_*, COVAR_* and related aggregates can be constructed from nrows, sum(X),
sum(Y), sum(X^2), sum(Y^2) and sum(X*Y).
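
As a rough illustration of the two statements above, the following Python
sketch keeps exactly those six running values as a partial state. RegrState,
trans, combine, var_samp_x and covar_pop are hypothetical names chosen for
this example; they are not PostgreSQL functions.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class RegrState:
    """Partial state: nrows, sum(X), sum(Y), sum(X^2), sum(Y^2), sum(X*Y)."""
    n: int = 0
    sx: float = 0.0
    sy: float = 0.0
    sxx: float = 0.0
    syy: float = 0.0
    sxy: float = 0.0

def trans(s, x, y):
    # Transition step: fold one (x, y) row into the state.
    return RegrState(s.n + 1, s.sx + x, s.sy + y,
                     s.sxx + x * x, s.syy + y * y, s.sxy + x * y)

def combine(a, b):
    # Combine step: every component is a plain sum, so states just add up.
    return RegrState(a.n + b.n, a.sx + b.sx, a.sy + b.sy,
                     a.sxx + b.sxx, a.syy + b.syy, a.sxy + b.sxy)

def var_samp_x(s):
    # Sample variance of X from nrows, sum(X) and sum(X^2).
    return (s.sxx - s.sx * s.sx / s.n) / (s.n - 1) if s.n > 1 else None

def covar_pop(s):
    # Population covariance from nrows, sum(X), sum(Y) and sum(X*Y).
    return (s.sxy - s.sx * s.sy / s.n) / s.n if s.n > 0 else None

def accumulate(pairs):
    s = RegrState()
    for x, y in pairs:
        s = trans(s, x, y)
    return s

pairs = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
whole = accumulate(pairs)
merged = combine(accumulate(pairs[:2]), accumulate(pairs[2:]))
```

Here merged and whole hold the same six values, so any final function built
on them gives the same answer either way. Note that the sum-of-squares form
can lose precision to cancellation when the variance is tiny compared to the
mean; the sketch ignores that.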

Let me point out a positive side effect of this approach.
Because the final aggregate function processes values that are already
partially aggregated, the difference in magnitude between the state value and
each incoming value becomes relatively small.
That reduces accumulated rounding error in floating-point calculation. :-)
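
A small Python experiment can illustrate the rounding effect mentioned above.
It is only a sketch with made-up input (one million copies of 0.1, split into
chunks of 1000), and the improvement it shows holds for this input, not as a
general guarantee. math.fsum serves as the correctly rounded reference.

```python
import math

values = [0.1] * 1_000_000
exact = math.fsum(values)  # correctly rounded reference sum

# One phase: a single accumulator grows to about 1e5 while every
# incoming value stays at 0.1, so the magnitudes drift far apart.
naive = 0.0
for v in values:
    naive += v

# Two phases: 1000 partial sums of 1000 values each, then combined.
# In both phases the accumulator and the incoming value stay closer
# in magnitude than in the one-phase loop.
chunk = 1000
partials = []
for i in range(0, len(values), chunk):
    s = 0.0
    for v in values[i:i + chunk]:
        s += v
    partials.append(s)

two_phase = 0.0
for p in partials:
    two_phase += p

naive_err = abs(naive - exact)
two_phase_err = abs(two_phase - exact)
print(naive_err, two_phase_err)
```

On this input the two-phase error comes out smaller than the one-phase error,
which matches the intuition above.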

> More time does need to be spent inventing the new combining functions that
> don't currently exist, but that shouldn't matter as it can be done later.
> 
> Commitfest link to patch here https://commitfest.postgresql.org/5/131/
> 
> I see you've signed up to review it!
>
Yes, all of us are looking in the same direction.

The problem is that we have to cross the mountain of planner enhancement to
reach all the valuable features:
 - parallel aggregation
 - aggregation before join
 - remote aggregation via FDW

So, unless we find a solution on the planner side, 2-phase aggregation is
like a curry without rice....

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei <kai...@ak.jp.nec.com>

