Hi,

On 03/14/2016 05:45 AM, David Rowley wrote:
On 14 March 2016 at 15:20, David Rowley <david.row...@2ndquadrant.com> wrote:
Current patch:
I've now updated the patch to base it on top of the parallel aggregate
patch in [2]. To apply the attached, you must apply [2] first!

...
[2] 
http://www.postgresql.org/message-id/cakjs1f9tr-+9akwz1xuhunejz7gkkfo45b+4fcnj8dkrdzy...@mail.gmail.com

The attached fixes a small thinko in apply_partialaggref_nodes().

After looking at the parallel aggregate patch, I also looked at this one, as it's naturally related. Sadly I haven't found any issue that I could nag about ;-) The patch seems well baked, as it was in the oven for quite a long time already.

The one concern I do have is that it only adds (de)serialize functions for SUM(numeric) and AVG(numeric). I think it's reasonable not to include that into the patch, but it will look a bit silly if that's all that gets into 9.6.

It's true plenty of aggregates is supported out of the box, as they use fixed-length data types for the aggregate state. But imagine users happy that their aggregate queries get parallelized, only to find out that sum(int8) disables that :-/

So I think we should try to support as many of the aggregates using internal as possible - it seems quite straight-forward to do that at least for the aggregates using the same internal representation (avg, sum, var_samp, var_pop, variance, ...). That's 26 aggregates out of the 45 with aggtranstype=INTERNAL.

For the remaining aggregates (jsonb_*. json_*, string_agg, array_agg, percentile_) it's probably more complicated to serialize the state, and maybe even impossible to do that correctly (e.g. string_agg accumulates the strings in the order as received, but partial paths would build private lists - not sure if this can actually happen in practice).

I think it would be good to actually document which aggregates support parallel execution, and which don't. Currently the docs don't mention this at all, and it's tricky for regular users to deduce that as it's related to the type of the internal state and not to the input types. An example of that is the SUM(bigint) example mentioned above.

That's actually the one thing I think the patch is missing.

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to