Hi,

On 2017-12-18 03:30:55 +1300, David Rowley wrote:
> While working on partial aggregation a few years ago, I didn't really
> think it was worthwhile allowing partial aggregation of string_agg and
> array_agg. I soon realised that I was wrong about that and allowing
> parallelisation of these aggregates still could be very useful when
> many rows are filtered out during the scan.

+1


> Some benchmarks that I've done locally show that parallel string_agg
> and array_agg do actually perform better, despite the fact that the
> aggregate state grows linearly with each aggregated item.

It also allows *other* aggregates with different space consumption to be
computed in parallel, that can be fairly huge.


> Just a handful of aggregates now don't support partial aggregation;
> 
> postgres=# select aggfnoid from pg_aggregate where aggcombinefn=0 and
> aggkind='n';
>      aggfnoid
> ------------------
>  xmlagg
>  json_agg
>  json_object_agg
>  jsonb_agg
>  jsonb_object_agg
> (5 rows)

FWIW, I've heard numerous people yearn for json*agg



> I'm going to add this to PG11's final commitfest rather than the
> January 'fest as it seems more like a final commitfest type of patch.

Not sure I understand?


> +/*
> + * array_agg_serialize
> + *           Serialize ArrayBuildState into bytea.
> + */
> +Datum
> +array_agg_serialize(PG_FUNCTION_ARGS)
> +{

> +     /*
> +      * dvalues -- this is very simple when the value type is byval, we can
> +      * simply just send the Datums over, however, for non-byval types we 
> must
> +      * work a little harder.
> +      */
> +     if (state->typbyval)
> +             pq_sendbytes(&buf, (char *) state->dvalues, sizeof(Datum) * 
> state->nelems);
> +     else
> +     {
> +             Oid                     typsend;
> +             bool            typisvarlena;
> +             bytea      *outputbytes;
> +             int                     i;
> +
> +             /* XXX is there anywhere to cache this to save calling each 
> time? */
> +             getTypeBinaryOutputInfo(state->element_type, &typsend, 
> &typisvarlena);

Yes, you should be able to store this in fcinfo->flinfo->fn_extra. Or do
you see a problem doing so?

> +
> +Datum
> +array_agg_deserialize(PG_FUNCTION_ARGS)
> +{

> +     /*
> +      * dvalues -- this is very simple when the value type is byval, we can
> +      * simply just get all the Datums at once, however, for non-byval types 
> we
> +      * must work a little harder.
> +      */
> +     if (result->typbyval)
> +     {
> +             temp = pq_getmsgbytes(&buf, sizeof(Datum) * nelems);
> +             memcpy(result->dvalues, temp, sizeof(Datum) * nelems);
> +     }
> +     else
> +     {
> +             Oid             typreceive;
> +             Oid             typioparam;
> +             int             i;
> +
> +             getTypeBinaryInputInfo(element_type, &typreceive, &typioparam);
> +
> +             for (i = 0; i < nelems; i++)
> +             {
> +                     /* XXX? Surely this cannot be the way to do this? */
> +                     StringInfoData tmp;
> +                     tmp.cursor = 0;
> +                     tmp.maxlen = tmp.len = pq_getmsgint(&buf, 4);
> +                     tmp.data = (char *) pq_getmsgbytes(&buf, tmp.len);
> +
> +                     result->dvalues[i] = OidReceiveFunctionCall(typreceive, 
> &tmp,
> +                                                                             
>                                 typioparam, -1);

Hm, doesn't look too bad to me? What's your concern here?


This isn't a real review, I was basically just doing a quick
scroll-through.

Greetings,

Andres Freund

Reply via email to