Re: [HACKERS] Memory Accounting v11

Tomas Vondra Tue, 14 Jul 2015 18:15:12 -0700

Hi,

On 07/14/2015 10:19 PM, Robert Haas wrote:

On Sat, Jul 11, 2015 at 2:28 AM, Jeff Davis <[email protected]> wrote:

After talking with a few people at PGCon, small noisy differences
in CPU timings can appear for almost any tweak to the code, and
aren't necessarily cause for major concern.


I agree with that in general, but the concern is a lot bigger when the
function is something that is called everywhere and accounts for a
measurable percentage of our total CPU usage on almost any workload.
If memory allocation got slower because, say, you added some code to
regexp.c and it caused AllocSetAlloc to split a cache line where it
hadn't previously, I wouldn't be worried about that; the next patch,
like as not, will buy the cost back again.  But here you really are
adding code to a hot path.

I don't think Jeff was suggesting that we should ignore changes
measurably affecting performance - I'm one of those he discussed this
patch with at PGCon, and I can assure you impact on performance was one
of the main topics of the discussion.

Firstly, do we really have good benchmarks and measurements? I really
doubt that. We do have some numbers for REINDEX, where we observed a
0.5-1% regression on noisy results from a Power machine (and we've been
unable to reproduce that on x86). I don't think that's a representative
benchmark, and I'd like to see more thorough measurements. And I agreed
to do this, once Jeff comes up with a new version of the patch.

Secondly, the question is whether the performance is impacted more by
the additional instructions, or by other things - say, random padding,
as was explained by Andrew Gierth in [1].

I don't know whether that's happening in this patch, but if it is, it
seems rather strange to use this against this patch and not the others
(because there surely will be other patches causing similar issues).

[1] http://www.postgresql.org/message-id/[email protected]

tuplesort.c does its own accounting, and TBH that seems like the right
thing to do here, too.  The difficulty is, I think, that some
transition functions use an internal data type for the transition
state, which might not be a single palloc'd chunk.  But since we can't
spill those aggregates to disk *anyway*, that doesn't really matter.
If the transition is a varlena or a fixed-length type, we can know how
much space it's consuming without hooking into the memory context
framework.

I respectfully disagree. Our current inability to dump/load the state
has little to do with how we measure consumed memory, IMHO.

It's true that we do have two kinds of aggregates, depending on the
nature of the aggregate state:


(a) fixed-size state (data types passed by value, variable length types
    that do not grow once allocated, ...)

(b) continuously growing state (as in string_agg/array_agg)

Jeff's HashAgg patch already fixes (a) and can fix (b) once we get a
solution for dump/load of the aggregate state - which we need to
implement anyway for parallel aggregate.

I know there was a proposal to force all aggregates to use regular data
types as aggregate states, but I can't see how that could work without a
significant performance penalty. For example array_agg() is using
internal to pass ArrayBuildState - how do you turn that into a regular
data type without effectively serializing/deserializing the whole array
on every transition?

And even if we come up with a solution for array_agg, do we really
believe it's possible to do for all custom aggregates? Maybe I'm missing
something but I doubt that. ISTM designing an ephemeral data structure
allows tweaks that are impossible otherwise.

What might be possible is extending the aggregate API with another
custom function returning the size of the aggregate state. So when
defining an aggregate using 'internal' for aggregate state, you'd
specify transfunc, finalfunc and sizefunc. That seems doable, I guess.
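To make the sizefunc idea a bit more concrete, here is a rough standalone
sketch in plain C. It is not PostgreSQL code: SketchArrayState and the
sketch_* functions are hypothetical names invented for illustration, and
malloc/realloc stand in for palloc/repalloc. The point is only that an
'internal' state made of several chunks can still report its own size if
the aggregate author provides a function for it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for an 'internal' transition state such as
 * ArrayBuildState: a header plus a separately allocated, growing array. */
typedef struct SketchArrayState
{
    int    *elems;      /* separately allocated element storage */
    size_t  nelems;     /* elements in use */
    size_t  capacity;   /* elements allocated */
} SketchArrayState;

/* transfunc: append one value, doubling the storage when it is full. */
static SketchArrayState *
sketch_transfunc(SketchArrayState *state, int value)
{
    if (state == NULL)
    {
        state = malloc(sizeof(SketchArrayState));
        if (state == NULL)
            abort();
        state->nelems = 0;
        state->capacity = 4;
        state->elems = malloc(state->capacity * sizeof(int));
        if (state->elems == NULL)
            abort();
    }
    else if (state->nelems == state->capacity)
    {
        int *grown = realloc(state->elems,
                             state->capacity * 2 * sizeof(int));

        if (grown == NULL)
            abort();
        state->elems = grown;
        state->capacity *= 2;
    }
    state->elems[state->nelems++] = value;
    return state;
}

/* sizefunc: bytes currently held by the state, counting every chunk. */
static size_t
sketch_sizefunc(const SketchArrayState *state)
{
    if (state == NULL)
        return 0;
    return sizeof(SketchArrayState) + state->capacity * sizeof(int);
}

/* finalfunc: produce the result (here just a sum) and release the state. */
static long
sketch_finalfunc(SketchArrayState *state)
{
    long    sum = 0;

    assert(state != NULL);
    for (size_t i = 0; i < state->nelems; i++)
        sum += state->elems[i];
    free(state->elems);
    free(state);
    return sum;
}
```

The executor would call sketch_sizefunc() after each transition and
compare the result against its memory budget. The downside is visible
here too: every aggregate author has to write and maintain such a
function, and get it right.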


I find memory accounting a far more elegant solution, though.
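For comparison, a toy illustration of accounting done at the allocator
level. This is not the patch's code and not how AllocSet works
internally; SketchContext, SketchChunk, sketch_alloc and sketch_free are
made-up names, and I'm assuming a C11 compiler for max_align_t. It only
shows why the approach needs no cooperation from the aggregate: whatever
the state looks like, every chunk goes through the context, so the
context knows the total.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, heavily simplified memory context: just a byte counter. */
typedef struct SketchContext
{
    size_t  total_allocated;    /* bytes currently allocated through it */
} SketchContext;

/* Per-chunk header so that freeing can adjust the owning context.
 * The union keeps the payload that follows suitably aligned. */
typedef union SketchChunk
{
    struct
    {
        SketchContext *owner;
        size_t         size;
    }           hdr;
    max_align_t align;
} SketchChunk;

/* Allocate 'size' bytes in 'cxt' and add them to the running total. */
static void *
sketch_alloc(SketchContext *cxt, size_t size)
{
    SketchChunk *chunk;

    assert(cxt != NULL);
    chunk = malloc(sizeof(SketchChunk) + size);
    if (chunk == NULL)
        abort();
    chunk->hdr.owner = cxt;
    chunk->hdr.size = size;
    cxt->total_allocated += size;
    return chunk + 1;           /* payload starts right after the header */
}

/* Free a chunk and subtract its size from the owning context's total. */
static void
sketch_free(void *ptr)
{
    SketchChunk *chunk = (SketchChunk *) ptr - 1;

    chunk->hdr.owner->total_allocated -= chunk->hdr.size;
    free(chunk);
}
```

A transition function allocating through sketch_alloc() needs no
sizefunc at all: the caller just reads cxt->total_allocated. The cost is
the extra add/subtract on every allocation and free, which is exactly
the hot-path overhead being debated in this thread.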

kind regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


