On 20.8.2014 08:11, Jeff Davis wrote: > On Tue, 2014-08-19 at 12:54 +0200, Tomas Vondra wrote: > > It would be easier to resolve the performance concern if I could > reliably get the results Robert is getting. I think I was able to > reproduce the regression with the old patch, but the results were still > noisy.
I created a small extension for this purpose, it's available here: https://github.com/tvondra/palloc_bench In short, it creates a chain of memory contexts, and then repeats palloc/pfree a given number of times (with a chosen request size). It either calls AllocSetContextCreate or AllocSetContextCreateTracked, depending on whether there's a #define TRACKING_FLAG so either leave it there or comment it out. The time is printed as a warhing. I did a number of tests to get an idea of the overhead, using this call select palloc_bench(10,100000000,32768); which means 10 memory contexts, 100000000 palloc/free cycles with 32kB requests. And I got these results: master ------- 3246.03 ms 3125.24 ms 3247.86 ms 3251.85 ms 3113.03 ms 3140.35 ms v3 patch (AllocSetContextCreate => no tracing) 3303.64 ms 3278.57 ms 3295.11 ms 3325.63 ms 3329.84 ms 3439.27 ms -- v3 patch (AllocSetContextCreateTracked => tracing) 6296.43 ms 5228.83 ms 5271.43 ms 5158.60 ms 5404.57 ms 5240.40 ms -- v4 (tracing all the time) 6728.84 ms 6478.88 ms 6478.82 ms 6473.57 ms 6554.96 ms 6528.66 ms I think this makes the overhead clearly visible. I also worth mentioning that this does nothing else except for palloc/free, which is not really what a real workload does. And 100000000 palloc/free of 32kB blocks means ~3TB of RAM, unless my math is broken. So looking at the numbers and saying "7 seconds >> 3 seconds", all is lost is not really appropriate IMHO. Anyway, ISTM that v4 is actually a bitm ore expensive than v3 for some reason. I'm not entirely sure why, but I suspect it's because of updating the few additional memory contexts up to TopMemoryContext. That's something v3 didn't do. I tried to hack a bit on the idea of using a single byte for the flags (isReset and track_mem) - patch attached. That got me pretty much to v3 performance (or maybe slightly better): -- v4 + flag (tracing all the time) 5222.38 ms 4958.37 ms 5072.21 ms 5100.43 ms 5059.65 ms 4995.52 ms But nothing that'd magically save the day ... and with disabled tracing we get pretty much to v3 numbers (with trace_mem=false). So this gymnastics gave us pretty much nothing ... But I have realized that maybe the problem is that we're handling memory contexts and accounting as 1:1. But that's not really the case - most of the context is not really interested in this. They don't use accounting now, and it won't change. So only small fraction of memory contexts will ask for acounting. Yet all the contexts are forced to pass accounting info from their children to their parent, if there happens to be a context with track_mem=true somewhere above them. And that's exactly the problem, because most of the time is spent in update_accounting, in the loop over parents. So my proposal is to separate those two things into two hierarchies, that are somehow parallel, but not exactly. That means: (1) creating a structure with the accouting info typedef struct MemoryAccountingData { Size total_allocated; Size self_allocated; struct MemoryAccountingData * parent; } MemoryAccountingData; (2) adding a pointer to MemoryAccountingData to MemoryContextData, and a 'track_mem' flag for contexts that requested tracking typedef struct MemoryContextData { ... MemoryAccounting accounting; ... } MemoryContextData; (3) when a context does not request accounting, it just uses the accounting pointer from the parent context, and track_mem=false (4) when the context requests accounting, it allocates it's own accounting structure, sets accounting->parent to the accounting from parent, and sets track_mem=true Now all the contexts have a direct pointer to the accounting of the nearest parent context that explicitly requested accounting, and don't need to walk through all the parents. Contexts that did not request tracking have track_mem=false, and their accounting points to the parent with explicit accounting, or is NULL if there's no such parent. For these contexts, GetAllocated always returns 0, but that's OK because they haven't requested accounting anyway. Contexts that requested tracking have have track_mem=true, and have their own specific accounting instance. The accounting->parent plays pretty much the same role as 'accounting' with 'track_mem=false' (see previous paragraph). These contexts return GetAllocated properly. Now, I did a quick with the palloc_bench - 1 context with tracking enabled, and a chain of 10 contexts without tracking (but updating the accounting for the first context). And I see this - with tracking enabled: 3235.57 ms 3240.09 ms 3225.47 ms 3306.95 ms 3249.14 ms 3225.56 ms and with tracking disabled: 3193.43 ms 3169.57 ms 3156.48 ms 3147.12 ms 3142.25 ms 3161.91 ms 3149.97 ms Which is quite good, IMHO. Disabled is pretty much exactly as master (i.e. no accounting at all), enabled is about equal to v3 with disabled tracing. But maybe I did something stupid in those patches, it's 3AM here ... Patch attached, consider it a an early alpha version. regards Tomas
diff --git a/src/backend/utils/mmgr/aset.c b/src/backend/utils/mmgr/aset.c index 743455e..6303d76 100644 --- a/src/backend/utils/mmgr/aset.c +++ b/src/backend/utils/mmgr/aset.c @@ -242,6 +242,8 @@ typedef struct AllocChunkData #define AllocChunkGetPointer(chk) \ ((AllocPointer)(((char *)(chk)) + ALLOC_CHUNKHDRSZ)) +static void update_allocation(MemoryContext context, int64 size); + /* * These functions implement the MemoryContext API for AllocSet contexts. */ @@ -250,7 +252,7 @@ static void AllocSetFree(MemoryContext context, void *pointer); static void *AllocSetRealloc(MemoryContext context, void *pointer, Size size); static void AllocSetInit(MemoryContext context); static void AllocSetReset(MemoryContext context); -static void AllocSetDelete(MemoryContext context); +static void AllocSetDelete(MemoryContext context, MemoryContext parent); static Size AllocSetGetChunkSpace(MemoryContext context, void *pointer); static bool AllocSetIsEmpty(MemoryContext context); static void AllocSetStats(MemoryContext context, int level); @@ -430,6 +432,9 @@ randomize_mem(char *ptr, size_t size) * minContextSize: minimum context size * initBlockSize: initial allocation block size * maxBlockSize: maximum allocation block size + * + * The flag determining whether this context tracks memory usage is inherited + * from the parent context. */ MemoryContext AllocSetContextCreate(MemoryContext parent, @@ -438,6 +443,26 @@ AllocSetContextCreate(MemoryContext parent, Size initBlockSize, Size maxBlockSize) { + return AllocSetContextCreateTracked( + parent, name, minContextSize, initBlockSize, maxBlockSize, + false); +} + +/* + * AllocSetContextCreateTracked + * Create a new AllocSet context. + * + * Implementation for AllocSetContextCreate, but also allows the caller to + * specify whether memory usage should be tracked or not. + */ +MemoryContext +AllocSetContextCreateTracked(MemoryContext parent, + const char *name, + Size minContextSize, + Size initBlockSize, + Size maxBlockSize, + bool track_mem) +{ AllocSet context; /* Do the type-independent part of context creation */ @@ -445,7 +470,8 @@ AllocSetContextCreate(MemoryContext parent, sizeof(AllocSetContext), &AllocSetMethods, parent, - name); + name, + track_mem); /* * Make sure alloc parameters are reasonable, and save them. @@ -500,6 +526,9 @@ AllocSetContextCreate(MemoryContext parent, errdetail("Failed while creating memory context \"%s\".", name))); } + + update_allocation((MemoryContext) context, blksize); + block->aset = context; block->freeptr = ((char *) block) + ALLOC_BLOCKHDRSZ; block->endptr = ((char *) block) + blksize; @@ -590,6 +619,7 @@ AllocSetReset(MemoryContext context) else { /* Normal case, release the block */ + update_allocation(context, -(block->endptr - ((char*) block))); #ifdef CLOBBER_FREED_MEMORY wipe_mem(block, block->freeptr - ((char *) block)); #endif @@ -611,11 +641,13 @@ AllocSetReset(MemoryContext context) * But note we are not responsible for deleting the context node itself. */ static void -AllocSetDelete(MemoryContext context) +AllocSetDelete(MemoryContext context, MemoryContext parent) { AllocSet set = (AllocSet) context; AllocBlock block = set->blocks; + MemoryAccounting accounting; + AssertArg(AllocSetIsValid(set)); #ifdef MEMORY_CONTEXT_CHECKING @@ -623,6 +655,17 @@ AllocSetDelete(MemoryContext context) AllocSetCheck(context); #endif + if (context->accounting != NULL) { + + accounting = context->accounting->parent; + + while (accounting != NULL) + { + accounting->total_allocated -= context->accounting->total_allocated; + accounting = accounting->parent; + } + } + /* Make it look empty, just in case... */ MemSetAligned(set->freelist, 0, sizeof(set->freelist)); set->blocks = NULL; @@ -678,6 +721,9 @@ AllocSetAlloc(MemoryContext context, Size size) errmsg("out of memory"), errdetail("Failed on request of size %zu.", size))); } + + update_allocation(context, blksize); + block->aset = set; block->freeptr = block->endptr = ((char *) block) + blksize; @@ -873,6 +919,8 @@ AllocSetAlloc(MemoryContext context, Size size) errdetail("Failed on request of size %zu.", size))); } + update_allocation(context, blksize); + block->aset = set; block->freeptr = ((char *) block) + ALLOC_BLOCKHDRSZ; block->endptr = ((char *) block) + blksize; @@ -976,6 +1024,7 @@ AllocSetFree(MemoryContext context, void *pointer) set->blocks = block->next; else prevblock->next = block->next; + update_allocation(context, -(block->endptr - ((char*) block))); #ifdef CLOBBER_FREED_MEMORY wipe_mem(block, block->freeptr - ((char *) block)); #endif @@ -1088,6 +1137,7 @@ AllocSetRealloc(MemoryContext context, void *pointer, Size size) AllocBlock prevblock = NULL; Size chksize; Size blksize; + Size oldblksize; while (block != NULL) { @@ -1105,6 +1155,8 @@ AllocSetRealloc(MemoryContext context, void *pointer, Size size) /* Do the realloc */ chksize = MAXALIGN(size); blksize = chksize + ALLOC_BLOCKHDRSZ + ALLOC_CHUNKHDRSZ; + oldblksize = block->endptr - ((char *)block); + block = (AllocBlock) realloc(block, blksize); if (block == NULL) { @@ -1114,6 +1166,7 @@ AllocSetRealloc(MemoryContext context, void *pointer, Size size) errmsg("out of memory"), errdetail("Failed on request of size %zu.", size))); } + update_allocation(context, blksize - oldblksize); block->freeptr = block->endptr = ((char *) block) + blksize; /* Update pointers since block has likely been moved */ @@ -1277,6 +1330,36 @@ AllocSetStats(MemoryContext context, int level) } +/* + * update_allocation + * + * Track newly-allocated or newly-freed memory (freed memory should be + * negative). + */ +static void +update_allocation(MemoryContext context, int64 size) +{ + MemoryContext parent; + MemoryAccounting accounting = context->accounting; + + if (accounting == NULL) + return; + + accounting->self_allocated += size; + + while (accounting != NULL) { + + accounting->total_allocated += size; + + Assert(accounting->self_allocated >= 0); + Assert(accounting->total_allocated >= accounting->self_allocated); + + accounting = accounting->parent; + + } + +} + #ifdef MEMORY_CONTEXT_CHECKING /* diff --git a/src/backend/utils/mmgr/mcxt.c b/src/backend/utils/mmgr/mcxt.c index 4185a03..2e2697f 100644 --- a/src/backend/utils/mmgr/mcxt.c +++ b/src/backend/utils/mmgr/mcxt.c @@ -187,6 +187,8 @@ MemoryContextResetChildren(MemoryContext context) void MemoryContextDelete(MemoryContext context) { + MemoryContext parent = context->parent; + AssertArg(MemoryContextIsValid(context)); /* We had better not be deleting TopMemoryContext ... */ Assert(context != TopMemoryContext); @@ -202,7 +204,12 @@ MemoryContextDelete(MemoryContext context) */ MemoryContextSetParent(context, NULL); - (*context->methods->delete_context) (context); + /* pass the parent in case it's needed, however */ + (*context->methods->delete_context) (context, parent); + + if (context->track_mem) + pfree(context->accounting); + VALGRIND_DESTROY_MEMPOOL(context); pfree(context); } @@ -324,6 +331,26 @@ MemoryContextAllowInCriticalSection(MemoryContext context, bool allow) } /* + * MemoryContextGetAllocated + * + * Return memory allocated by the system to this context. If total is true, + * include child contexts. Context must have track_mem set. + */ +Size +MemoryContextGetAllocated(MemoryContext context, bool total) +{ + Assert(context->track_mem); + + if (! context->track_mem) + return 0; + + if (total) + return context->accounting->total_allocated; + else + return context->accounting->self_allocated; +} + +/* * GetMemoryChunkSpace * Given a currently-allocated chunk, determine the total space * it occupies (including all memory-allocation overhead). @@ -546,7 +573,8 @@ MemoryContext MemoryContextCreate(NodeTag tag, Size size, MemoryContextMethods *methods, MemoryContext parent, - const char *name) + const char *name, + bool track_mem) { MemoryContext node; Size needed = size + strlen(name) + 1; @@ -576,6 +604,8 @@ MemoryContextCreate(NodeTag tag, Size size, node->firstchild = NULL; node->nextchild = NULL; node->isReset = true; + node->track_mem = track_mem; + node->accounting = NULL; node->name = ((char *) node) + size; strcpy(node->name, name); @@ -596,6 +626,18 @@ MemoryContextCreate(NodeTag tag, Size size, #endif } + /* if we want to do tracking, just allocate MemoryAccountingData */ + if (track_mem) + { + node->accounting = (MemoryAccounting)MemoryContextAlloc(TopMemoryContext, + sizeof(MemoryAccountingData)); + if (parent) + node->accounting->parent = parent->accounting; + } else if (parent) { + + node->accounting = parent->accounting; + } + VALGRIND_CREATE_MEMPOOL(node, 0, false); /* Return to type-specific creation routine to finish up */ diff --git a/src/include/nodes/memnodes.h b/src/include/nodes/memnodes.h index ad77509..7ba59f6 100644 --- a/src/include/nodes/memnodes.h +++ b/src/include/nodes/memnodes.h @@ -41,7 +41,8 @@ typedef struct MemoryContextMethods void *(*realloc) (MemoryContext context, void *pointer, Size size); void (*init) (MemoryContext context); void (*reset) (MemoryContext context); - void (*delete_context) (MemoryContext context); + void (*delete_context) (MemoryContext context, + MemoryContext parent); Size (*get_chunk_space) (MemoryContext context, void *pointer); bool (*is_empty) (MemoryContext context); void (*stats) (MemoryContext context, int level); @@ -50,6 +51,18 @@ typedef struct MemoryContextMethods #endif } MemoryContextMethods; +typedef struct MemoryAccountingData { + + Size total_allocated; /* including child contexts */ + Size self_allocated; /* not including child contexts */ + + /* parent accounting (not parent context) */ + struct MemoryAccountingData * parent; + +} MemoryAccountingData; + +typedef MemoryAccountingData * MemoryAccounting; + typedef struct MemoryContextData { @@ -60,6 +73,8 @@ typedef struct MemoryContextData MemoryContext nextchild; /* next child of same parent */ char *name; /* context name (just for debugging) */ bool isReset; /* T = no space alloced since last reset */ + bool track_mem; /* whether to track memory usage */ + MemoryAccounting accounting; #ifdef USE_ASSERT_CHECKING bool allowInCritSection; /* allow palloc in critical section */ #endif diff --git a/src/include/utils/memutils.h b/src/include/utils/memutils.h index 2fede86..b82f923 100644 --- a/src/include/utils/memutils.h +++ b/src/include/utils/memutils.h @@ -96,6 +96,7 @@ extern void MemoryContextDeleteChildren(MemoryContext context); extern void MemoryContextResetAndDeleteChildren(MemoryContext context); extern void MemoryContextSetParent(MemoryContext context, MemoryContext new_parent); +extern Size MemoryContextGetAllocated(MemoryContext context, bool total); extern Size GetMemoryChunkSpace(void *pointer); extern MemoryContext GetMemoryChunkContext(void *pointer); extern MemoryContext MemoryContextGetParent(MemoryContext context); @@ -117,7 +118,8 @@ extern bool MemoryContextContains(MemoryContext context, void *pointer); extern MemoryContext MemoryContextCreate(NodeTag tag, Size size, MemoryContextMethods *methods, MemoryContext parent, - const char *name); + const char *name, + bool track_mem); /* @@ -130,6 +132,12 @@ extern MemoryContext AllocSetContextCreate(MemoryContext parent, Size minContextSize, Size initBlockSize, Size maxBlockSize); +extern MemoryContext AllocSetContextCreateTracked(MemoryContext parent, + const char *name, + Size minContextSize, + Size initBlockSize, + Size maxBlockSize, + bool track_mem); /* * Recommended default alloc parameters, suitable for "ordinary" contexts
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers