Re: [HACKERS] 9.5: Better memory accounting, towards memory-bounded HashAgg

Jeff Davis Sun, 22 Feb 2015 00:15:08 -0800

On Wed, 2015-01-07 at 20:07 +0100, Tomas Vondra wrote:
> So I started digging in the code and I noticed this:
> 
>     hash_mem = MemoryContextMemAllocated(aggstate->hashcontext, true);
> 
> which is IMHO obviously wrong, because that accounts only for the
> hashtable itself. It might be correct for aggregates with state passed
> by value, but it's incorrect for state passed by reference (e.g.
> Numeric, arrays etc.), because initialize_aggregates does this:
> 
>     oldContext = MemoryContextSwitchTo(aggstate->aggcontext);
>     pergroupstate->transValue = datumCopy(peraggstate->initValue,
>                                         peraggstate->transtypeByVal,
>                                         peraggstate->transtypeLen);
>     MemoryContextSwitchTo(oldContext);
> 
> and it's also wrong for all the user-defined aggretates that have no
> access to hashcontext at all, and only get reference to aggcontext using
> AggCheckCallContext(). array_agg() is a prime example.


Actually, I believe the first one is right, and the second one is wrong.
If we allocate pass-by-ref transition states in aggcontext, that defeats
the purpose of the patch, because we can't release the memory when we
start a new batch (because we can't reset aggcontext).

Instead, the transition states should be copied into hashcontext; we
should count the memory usage of hashcontext; AggCheckCallContext should
return hashcontext; and after each batch we should reset hashcontext.

It might be worth refactoring a bit to call it the "groupcontext" or
something instead, and use it for both HashAgg and GroupAgg. That would
reduce the need for conditionals.



[ ... problems with array_agg subcontexts ... ]

> and it runs indefinitely (I gave up after a few minutes). I believe this
> renders the proposed memory accounting approach dead.

...

> The array_agg() patch I submitted to this CF would fix this particular
> query, as it removes the child contexts (so there's no need for
> recursion in MemoryContextMemAllocated), but it does nothing to the
> user-defined aggregates out there. And it's not committed yet.

Now that your patch is committed, I don't think I'd call the memory
accounting patch I proposed "dead" yet. We decided that creating one
context per group is essentially a bug, so we don't need to optimize for
that case.

Your approach may be better, though.

Thank you for reviewing. I'll update my patches and resubmit for the
next CF.

Regards,
        Jeff Davis




-- 
Sent via pgsql-hackers mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] 9.5: Better memory accounting, towards memory-bounded HashAgg

Reply via email to