Re: Making aggregate deserialization (and WAL receive) functions slightly faster

Tom Lane Sat, 11 Feb 2023 22:39:29 -0800

David Rowley <[email protected]> writes:
> While working on 16fd03e95, I noticed that in each aggregate
> deserialization function, in order to "receive" the bytea value that
> is the serialized aggregate state, appendBinaryStringInfo is used to
> append the bytes of the bytea value onto a temporary StringInfoData.
> Using  appendBinaryStringInfo seems a bit wasteful here. We could
> really just fake up a StringInfoData and point directly to the bytes
> of the bytea value.


Perhaps, but ...

> The best way I could think of to do this was to invent
> initStringInfoFromString() which initialises a StringInfoData and has
> the ->data field point directly at the specified buffer.  This will
> mean that it would be unsafe to do any appendStringInfo* operations on
> the resulting StringInfoData as enlargeStringInfo would try to
> repalloc the data buffer, which might not even point to a palloc'd
> string.

I find this patch horribly dangerous.

It could maybe be okay if we added the capability for StringInfoData
to understand (and enforce) that its "data" buffer is read-only.
However, that'd add overhead to every existing use-case.

> I've attached the benchmark results I got after testing how the
> modification changed the performance of string_agg_deserialize().
> I was hoping this would have a slightly more impressive performance
> impact, especially for string_agg() and array_agg() as the aggregate
> states of those can be large.  However, in the test I ran, there's
> only a very slight performance gain. I may just not have found the
> best case, however.

I do not think we should even consider this without solid evidence
for *major* performance improvements.  As it stands, it's a
quintessential example of a loaded foot-gun, and it seems clear
that making it safe enough to use would add more overhead than
it saves.

                        regards, tom lane

Re: Making aggregate deserialization (and WAL receive) functions slightly faster

Reply via email to