Hi, On 2017-12-18 03:30:55 +1300, David Rowley wrote: > While working on partial aggregation a few years ago, I didn't really > think it was worthwhile allowing partial aggregation of string_agg and > array_agg. I soon realised that I was wrong about that and allowing > parallelisation of these aggregates still could be very useful when > many rows are filtered out during the scan.
+1 > Some benchmarks that I've done locally show that parallel string_agg > and array_agg do actually perform better, despite the fact that the > aggregate state grows linearly with each aggregated item. It also allows *other* aggregates with different space consumption to be computed in parallel, that can be fairly huge. > Just a handful of aggregates now don't support partial aggregation; > > postgres=# select aggfnoid from pg_aggregate where aggcombinefn=0 and > aggkind='n'; > aggfnoid > ------------------ > xmlagg > json_agg > json_object_agg > jsonb_agg > jsonb_object_agg > (5 rows) FWIW, I've heard numerous people yearn for json*agg > I'm going to add this to PG11's final commitfest rather than the > January 'fest as it seems more like a final commitfest type of patch. Not sure I understand? > +/* > + * array_agg_serialize > + * Serialize ArrayBuildState into bytea. > + */ > +Datum > +array_agg_serialize(PG_FUNCTION_ARGS) > +{ > + /* > + * dvalues -- this is very simple when the value type is byval, we can > + * simply just send the Datums over, however, for non-byval types we > must > + * work a little harder. > + */ > + if (state->typbyval) > + pq_sendbytes(&buf, (char *) state->dvalues, sizeof(Datum) * > state->nelems); > + else > + { > + Oid typsend; > + bool typisvarlena; > + bytea *outputbytes; > + int i; > + > + /* XXX is there anywhere to cache this to save calling each > time? */ > + getTypeBinaryOutputInfo(state->element_type, &typsend, > &typisvarlena); Yes, you should be able to store this in fcinfo->flinfo->fn_extra. Or do you see a problem doing so? > + > +Datum > +array_agg_deserialize(PG_FUNCTION_ARGS) > +{ > + /* > + * dvalues -- this is very simple when the value type is byval, we can > + * simply just get all the Datums at once, however, for non-byval types > we > + * must work a little harder. > + */ > + if (result->typbyval) > + { > + temp = pq_getmsgbytes(&buf, sizeof(Datum) * nelems); > + memcpy(result->dvalues, temp, sizeof(Datum) * nelems); > + } > + else > + { > + Oid typreceive; > + Oid typioparam; > + int i; > + > + getTypeBinaryInputInfo(element_type, &typreceive, &typioparam); > + > + for (i = 0; i < nelems; i++) > + { > + /* XXX? Surely this cannot be the way to do this? */ > + StringInfoData tmp; > + tmp.cursor = 0; > + tmp.maxlen = tmp.len = pq_getmsgint(&buf, 4); > + tmp.data = (char *) pq_getmsgbytes(&buf, tmp.len); > + > + result->dvalues[i] = OidReceiveFunctionCall(typreceive, > &tmp, > + > typioparam, -1); Hm, doesn't look too bad to me? What's your concern here? This isn't a real review, I was basically just doing a quick scroll-through. Greetings, Andres Freund