On 01/19/2016 05:14 AM, Robert Haas wrote:
On Mon, Jan 18, 2016 at 12:34 PM, Robert Haas <robertmh...@gmail.com> wrote:
On Mon, Jan 18, 2016 at 12:09 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
Robert Haas <robertmh...@gmail.com> writes:
Yeah.  The API contract for an expanded object took me quite a while
to puzzle out, but it seems to be this: if somebody hands you an R/W
pointer to an expanded object, you're entitled to assume that you can
"take over" that object and mutate it however you like.  But the
object might be in some other memory context, so you have to move it
into your own memory context.

Only if you intend to keep it --- for example, a function that is
mutating and returning an object isn't required to move it
somewhere else, if the input is R/W, and I think it generally
shouldn't.

In the context here, I'd think it is the responsibility of
nodeAgg.c not individual datatype functions to make sure that
expanded objects live where it wants them to.

That's how I did it in my prototype, but the problem with that is that
spinning up a memory context for every group sucks when there are many
groups with only a small number of elements each - hence the 50%
regression that David Rowley observed.  If we're going to use expanded
objects here, which seems like a good idea in principle, that's going
to have to be fixed somehow.  We're flogging the heck out of malloc by
repeatedly creating a context, doing one or two allocations in it, and
then destroying the context.

I think that, in general, the memory context machinery wasn't really
designed to manage lots of small contexts.  The overhead of spinning
up a new context for just a few allocations is substantial.  That
matters in some other situations too, I think - I've commonly seen
AllocSetContextCreate taking several percent  of runtime in profiles.
But here it's much exacerbated.  I'm not sure whether it's better to
attack that problem at the root and try to make AllocSetContextCreate
cheaper, or whether we should try to figure out some change to the
expanded-object machinery to address the issue.

Here is a patch that helps a good deal.  I changed things so that when
we create a context, we always allocate at least 1kB.  If that's more
than we need for the node itself and the name, then we use the rest of
the space as a sort of keeper block.  I think there's more that can be
done to improve this, but I'm wimping out for now because it's late
here.

I suspect something like this is a good idea even if we don't end up
using it for aggregate transition functions.

I dare to claim that if expanded objects require a separate memory context per aggregate state (I don't see why would be the case), it's a dead end. Not so long ago we've fixed exactly this issue in array_agg(), which addressed a bunch of reported OOM issues and improved array_agg() performance quite a bit. It'd be rather crazy introducing the same problem to all aggregate functions.

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to