Re: [HACKERS] (full) Memory context dump considered harmful

2015-08-24 Thread Stefan Kaltenbrunner

On 08/22/2015 06:25 AM, Tomas Vondra wrote:

On 08/21/2015 08:37 PM, Tom Lane wrote:

Tomas Vondra tomas.von...@2ndquadrant.com writes:


I also don't think logging just a subset of the stats is a lost cause.
Sure, we can't know which of the lines are important, but for example
logging just the top-level contexts with a summary of the child contexts
would be OK.


I thought a bit more about this.  Generally, what you want to know about
a given situation is which contexts have a whole lot of allocations
and/or a whole lot of child contexts.  What you suggest above won't work
very well if the problem is buried more than about two levels down in
the context tree.  But suppose we add a parameter to memory context stats
collection that is the maximum number of child contexts to print *per
parent context*.  If there are more than that, summarize the rest as per
your suggestion.  So any given recursion level might look like

  FooContext: m total in n blocks ...
ChildContext1: m total in n blocks ...
  possible grandchildren...
ChildContext2: m total in n blocks ...
  possible grandchildren...
ChildContext3: m total in n blocks ...
  possible grandchildren...
k more child contexts containing m total in n blocks ...

This would require a fixed amount of extra state per recursion level,
so it could be done with a few more parameters/local variables in
MemoryContextStats and no need to risk a malloc().

The case where you would lose important data is where the serious
bloat is in some specific child context that is after the first N
children of its direct parent. I don't believe I've ever seen a case
where that was critical information as long as N isn't too tiny.


Couldn't we make it a bit smarter to handle even cases like this? For
example we might first count/sum the child contexts, and then print
either all child contexts (if there are only a few of them) or just
those that are 5% of the total, 2x the average or something like that.


While having that kind of logic would be nice, I don't think it is 
required. For the case I had, the proposed patch from Tom seems perfectly 
fine to me - not sure we would want a GUC. From a DBA perspective I don't 
think anybody needs millions of lines of almost-duplicated memory 
context dumps, and I'm not sure we need them from a developer perspective 
either (other than the information that there were more contexts than those printed).




regards


Stefan


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] (full) Memory context dump considered harmful

2015-08-22 Thread Tomas Vondra


On 08/22/2015 06:06 PM, Tom Lane wrote:

Tomas Vondra tomas.von...@2ndquadrant.com writes:


Couldn't we make it a bit smarter to handle even cases like this? For
example we might first count/sum the child contexts, and then print
either all child contexts (if there are only a few of them) or just
those that are 5% of the total, 2x the average or something like that.


That seems overly complicated. In the first place, you couldn't
possibly implement that without two passes over the context set,
which would be mighty expensive. (In the situations where this is
getting run, most likely portions of the address space have been
swapped out, and we'll have to swap them back in again. Bad enough to
iterate over the whole process address space once, but twice?) In the
second place, it would seem to require a much wider API between
MemoryContextStats and the per-context-type stats methods, including
for example a way to decide which numbers about a context were the
most important ones. I'm already concerned about having had to expose
a set of accumulatable numbers at all ... don't want the API to get
into their semantics in that kind of detail.


Sure, the two passes seem quite annoying, although I'm not convinced 
it'd be a problem in practice - most modern systems I've been dealing 
with recently were configured with the 'no swapping' goal, and using 
only a tiny swap for extreme cases (e.g. 256GB RAM with 4GB of swap).


If only we had some memory accounting in place ;-) Then we wouldn't have 
to do the two passes ...



As I said, based on my experience in looking at memory context dumps
(and I've seen a few), a show-the-first-N-children heuristic
probably will work fine. If we find out differently we can work on
refining it, but I don't think a complex design is warranted at this
stage.


OK, let's go with the simple approach.

I'm still not quite convinced having no GUC for turning this off is a 
good idea, though. From time to time we're dealing with OOM issues on 
customer systems with very limited access (e.g. no gdb). In those cases 
the memory context dump is often the only information we have initially, 
and it's usually quite difficult to get additional info.


I agree that the 'first-N-children' heuristic is going to work in most 
cases, but if it doesn't, it'd be nice to have a way to force full 
stats in an ad-hoc manner (i.e. disable the limit before a query, etc.).


OTOH now that I'm thinking about it, most OOM errors I've seen (at least 
on Linux) were caused either by exceptionally large palloc() requests 
(e.g. because of varlena header corruption, overflow bugs, ...) or by 
the OOM killer. In the former case the memory context stats are 
utterly useless because the issue has nothing to do with other memory 
contexts; in the latter case you don't get any stats at all because the 
process is simply killed.


One question regarding the proposed patch though - if I get it right 
(just looking at the diff), it simply limits the output to the first 100 
child contexts at each level independently. So if each of those 100 
child contexts has 100 child contexts of its own, we get 100x100 
lines. And so on, if the hierarchy is deeper. This probably is not 
addressable without introducing some global counter of printed contexts, 
and it may not be an issue at all (all the cases I can remember had a 
single huge context or many sibling contexts).



One thing we could consider doing to improve the odds that it's fine
 would be to rearrange things so that child contexts of the same
parent are more likely to be similar --- for example, break out
all relcache entries to be children of a RelCacheContext instead of
the generic CacheMemoryContext, likewise for cached plans, etc. But
I'm not yet convinced that'd be worth the trouble.


That'd be nice but I see that as an independent improvement - it might 
improve the odds for internal contexts, but what about contexts coming 
from user code (e.g. custom aggregates)?


kind regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] (full) Memory context dump considered harmful

2015-08-22 Thread Tom Lane
Tomas Vondra tomas.von...@2ndquadrant.com writes:
 One question regarding the proposed patch though - if I get it right 
 (just looking at the diff), it simply limits the output to the first 100 
 child contexts at each level independently. So if each of those 100 
 child contexts has 100 child contexts of its own, we get 100x100 
 lines. And so on, if the hierarchy is deeper. This probably is not 
 addressable without introducing some global counter of printed contexts, 
 and it may not be an issue at all (all the cases I can remember had a 
 single huge context or many sibling contexts).

Right.  The situation Stefan was complaining of, almost certainly,
involved a huge number of children of the same context.  This patch
would successfully abbreviate that case, no matter where it happened
in the context tree exactly.  In principle, if you were getting that
sort of expansion at multiple levels of the context tree concurrently,
you could still get a mighty long context dump ... but I really doubt
that would happen in practice.  (And if it did happen, an overall limit
on the number of contexts printed would hide the fact that it was
happening, which wouldn't be desirable.)

 One thing we could consider doing to improve the odds that it's fine
 would be to rearrange things so that child contexts of the same
 parent are more likely to be similar --- for example, break out
 all relcache entries to be children of a RelCacheContext instead of
 the generic CacheMemoryContext, likewise for cached plans, etc. But
 I'm not yet convinced that'd be worth the trouble.

 That'd be nice but I see that as an independent improvement - it might 
 improve the odds for internal contexts, but what about contexts coming 
 from user code (e.g. custom aggregates)?

Yeah, cases like custom aggregates would be hard to classify.

regards, tom lane




Re: [HACKERS] (full) Memory context dump considered harmful

2015-08-22 Thread Tom Lane
Tomas Vondra tomas.von...@2ndquadrant.com writes:
 On 08/21/2015 08:37 PM, Tom Lane wrote:
 ...  But suppose we add a parameter to memory context stats
 collection that is the maximum number of child contexts to print *per
 parent context*.  If there are more than that, summarize the rest as per
 your suggestion.
 
 The case where you would lose important data is where the serious
 bloat is in some specific child context that is after the first N
 children of its direct parent. I don't believe I've ever seen a case
 where that was critical information as long as N isn't too tiny.

 Couldn't we make it a bit smarter to handle even cases like this? For 
 example we might first count/sum the child contexts, and then print 
 either all child contexts (if there are only a few of them) or just 
 those that are 5% of the total, 2x the average or something like that.

That seems overly complicated.  In the first place, you couldn't possibly
implement that without two passes over the context set, which would be
mighty expensive.  (In the situations where this is getting run, most
likely portions of the address space have been swapped out, and we'll have
to swap them back in again.  Bad enough to iterate over the whole process
address space once, but twice?)  In the second place, it would seem to
require a much wider API between MemoryContextStats and the
per-context-type stats methods, including for example a way to decide
which numbers about a context were the most important ones.  I'm already
concerned about having had to expose a set of accumulatable numbers at all
... don't want the API to get into their semantics in that kind of detail.

As I said, based on my experience in looking at memory context dumps
(and I've seen a few), a show-the-first-N-children heuristic probably
will work fine.  If we find out differently we can work on refining it,
but I don't think a complex design is warranted at this stage.

One thing we could consider doing to improve the odds that it's fine
would be to rearrange things so that child contexts of the same parent
are more likely to be similar --- for example, break out all relcache
entries to be children of a RelCacheContext instead of the generic
CacheMemoryContext, likewise for cached plans, etc.  But I'm not yet
convinced that'd be worth the trouble.

regards, tom lane




Re: [HACKERS] (full) Memory context dump considered harmful

2015-08-21 Thread Tomas Vondra

On 08/21/2015 08:37 PM, Tom Lane wrote:

Tomas Vondra tomas.von...@2ndquadrant.com writes:


I also don't think logging just a subset of the stats is a lost cause.
Sure, we can't know which of the lines are important, but for example
logging just the top-level contexts with a summary of the child contexts
would be OK.


I thought a bit more about this.  Generally, what you want to know about
a given situation is which contexts have a whole lot of allocations
and/or a whole lot of child contexts.  What you suggest above won't work
very well if the problem is buried more than about two levels down in
the context tree.  But suppose we add a parameter to memory context stats
collection that is the maximum number of child contexts to print *per
parent context*.  If there are more than that, summarize the rest as per
your suggestion.  So any given recursion level might look like

  FooContext: m total in n blocks ...
ChildContext1: m total in n blocks ...
  possible grandchildren...
ChildContext2: m total in n blocks ...
  possible grandchildren...
ChildContext3: m total in n blocks ...
  possible grandchildren...
k more child contexts containing m total in n blocks ...

This would require a fixed amount of extra state per recursion level,
so it could be done with a few more parameters/local variables in
MemoryContextStats and no need to risk a malloc().

The case where you would lose important data is where the serious
bloat is in some specific child context that is after the first N
children of its direct parent. I don't believe I've ever seen a case
where that was critical information as long as N isn't too tiny.


Couldn't we make it a bit smarter to handle even cases like this? For 
example we might first count/sum the child contexts, and then print 
either all child contexts (if there are only a few of them) or just 
those that are 5% of the total, 2x the average or something like that.


regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] (full) Memory context dump considered harmful

2015-08-21 Thread Tom Lane
Tomas Vondra tomas.von...@2ndquadrant.com writes:
 On 08/20/2015 11:04 PM, Stefan Kaltenbrunner wrote:
 On 08/20/2015 06:09 PM, Tom Lane wrote:
 (The reason I say lobotomize is that there's no particularly
 good reason to assume that the first N lines will tell you what you
 need to know. And the filter rule would have to be *very* stupid,
 because we can't risk trying to allocate any additional memory to
 track what we're doing here.)

 IMHO the first thing we might do is provide a log_memory_stats GUC 
 controlling that. I'm not a big fan of adding a GUC for everything, but in 
 this case it seems appropriate, just like the other log_ options.

I don't think this is advisable.  You would have to either keep it turned
off and risk not being able to debug OOM situations, or keep it turned on
and risk log-volume problems; neither is very satisfactory.

 I also don't think logging just a subset of the stats is a lost cause. 
 Sure, we can't know which of the lines are important, but for example 
 logging just the top-level contexts with a summary of the child contexts 
 would be OK.

I thought a bit more about this.  Generally, what you want to know about
a given situation is which contexts have a whole lot of allocations
and/or a whole lot of child contexts.  What you suggest above won't work
very well if the problem is buried more than about two levels down in
the context tree.  But suppose we add a parameter to memory context stats
collection that is the maximum number of child contexts to print *per
parent context*.  If there are more than that, summarize the rest as per
your suggestion.  So any given recursion level might look like

 FooContext: m total in n blocks ...
   ChildContext1: m total in n blocks ...
 possible grandchildren...
   ChildContext2: m total in n blocks ...
 possible grandchildren...
   ChildContext3: m total in n blocks ...
 possible grandchildren...
   k more child contexts containing m total in n blocks ...

This would require a fixed amount of extra state per recursion level,
so it could be done with a few more parameters/local variables in
MemoryContextStats and no need to risk a malloc().

The case where you would lose important data is where the serious bloat
is in some specific child context that is after the first N children of
its direct parent.  I don't believe I've ever seen a case where that was
critical information as long as N isn't too tiny.

I think we could hard-wire N at 100 or something like that and pretty
much fix Stefan's complaint, while losing little if any detail in typical
cases.  Manual invocation of MemoryContextStats could pass larger values
if you really needed it during debugging.

Thoughts?

regards, tom lane




Re: [HACKERS] (full) Memory context dump considered harmful

2015-08-21 Thread Tom Lane
I wrote:
 I thought a bit more about this.  Generally, what you want to know about
 a given situation is which contexts have a whole lot of allocations
 and/or a whole lot of child contexts.  What you suggest above won't work
 very well if the problem is buried more than about two levels down in
 the context tree.  But suppose we add a parameter to memory context stats
 collection that is the maximum number of child contexts to print *per
 parent context*.  If there are more than that, summarize the rest as per
 your suggestion.

Here's a draft patch along this line.  After playing with it a bit,
I think that a wired-in limit of 100 children per level would be fine.
We could imagine adding a GUC for that, but I think it's overkill.

A couple of notes:

Since the summarization mechanism requires passing totals back up anyway,
I took the opportunity to add a grand total line to the end of the
printout.  I'm half tempted to modify that to convert the numbers in the
grand total line to kilobytes or even MB, but it could be argued either way.

This assumes that the total memory allocation couldn't exceed the range of
size_t, a thing which is not promised by the C standard --- but AFAIK it
would be true on any modern machine (and really we were already making
such an assumption in AllocSetStats).  Also, if we ever add a memory
context type that works fundamentally differently from AllocSet, we might
need to rethink what we compute/print as summary stats.  I figure we can
cross that bridge when we come to it.

Comments?

regards, tom lane
diff --git a/src/backend/utils/mmgr/aset.c b/src/backend/utils/mmgr/aset.c
index febeb6e..e86c7cb 100644
*** a/src/backend/utils/mmgr/aset.c
--- b/src/backend/utils/mmgr/aset.c
*** static void AllocSetReset(MemoryContext 
*** 253,259 
  static void AllocSetDelete(MemoryContext context);
  static Size AllocSetGetChunkSpace(MemoryContext context, void *pointer);
  static bool AllocSetIsEmpty(MemoryContext context);
! static void AllocSetStats(MemoryContext context, int level);
  
  #ifdef MEMORY_CONTEXT_CHECKING
  static void AllocSetCheck(MemoryContext context);
--- 253,260 
  static void AllocSetDelete(MemoryContext context);
  static Size AllocSetGetChunkSpace(MemoryContext context, void *pointer);
  static bool AllocSetIsEmpty(MemoryContext context);
! static void AllocSetStats(MemoryContext context, int level, bool print,
! 			  MemoryContextCounters *totals);
  
  #ifdef MEMORY_CONTEXT_CHECKING
  static void AllocSetCheck(MemoryContext context);
*** AllocSetIsEmpty(MemoryContext context)
*** 1228,1237 
  
  /*
   * AllocSetStats
!  *		Displays stats about memory consumption of an allocset.
   */
  static void
! AllocSetStats(MemoryContext context, int level)
  {
  	AllocSet	set = (AllocSet) context;
  	Size		nblocks = 0;
--- 1229,1243 
  
  /*
   * AllocSetStats
!  *		Compute stats about memory consumption of an allocset.
!  *
!  * level: recursion level (0 at top level); used for print indentation.
!  * print: true to print stats to stderr.
!  * totals: if not NULL, add stats about this allocset into *totals.
   */
  static void
! AllocSetStats(MemoryContext context, int level, bool print,
! 			  MemoryContextCounters *totals)
  {
  	AllocSet	set = (AllocSet) context;
  	Size		nblocks = 0;
*** AllocSetStats(MemoryContext context, int
*** 1239,1247 
  	Size		totalspace = 0;
  	Size		freespace = 0;
  	AllocBlock	block;
- 	AllocChunk	chunk;
  	int			fidx;
- 	int			i;
  
  	for (block = set->blocks; block != NULL; block = block->next)
  	{
--- 1245,1251 
*** AllocSetStats(MemoryContext context, int
*** 1251,1256 
--- 1255,1262 
  	}
  	for (fidx = 0; fidx < ALLOCSET_NUM_FREELISTS; fidx++)
  	{
+ 		AllocChunk	chunk;
+ 
  		for (chunk = set->freelist[fidx]; chunk != NULL;
  			 chunk = (AllocChunk) chunk->aset)
  		{
*** AllocSetStats(MemoryContext context, int
*** 1259,1271 
  		}
  	}
  
! 	for (i = 0; i < level; i++)
! 		fprintf(stderr, "  ");
  
! 	fprintf(stderr,
  			"%s: %zu total in %zd blocks; %zu free (%zd chunks); %zu used\n",
! 			set->header.name, totalspace, nblocks, freespace, nchunks,
! 			totalspace - freespace);
  }
  
  
--- 1265,1289 
  		}
  	}
  
! 	if (print)
! 	{
! 		int			i;
  
! 		for (i = 0; i < level; i++)
! 			fprintf(stderr, "  ");
! 		fprintf(stderr,
  			"%s: %zu total in %zd blocks; %zu free (%zd chunks); %zu used\n",
! 				set->header.name, totalspace, nblocks, freespace, nchunks,
! 				totalspace - freespace);
! 	}
! 
! 	if (totals)
! 	{
! 		totals->nblocks += nblocks;
! 		totals->nchunks += nchunks;
! 		totals->totalspace += totalspace;
! 		totals->freespace += freespace;
! 	}
  }
  
  
diff --git a/src/backend/utils/mmgr/mcxt.c b/src/backend/utils/mmgr/mcxt.c
index 12d29f7..ad1421c 100644
*** a/src/backend/utils/mmgr/mcxt.c
--- b/src/backend/utils/mmgr/mcxt.c
*** MemoryContext CurTransactionContext = NU
*** 52,58 
  

Re: [HACKERS] (full) Memory context dump considered harmful

2015-08-20 Thread Stefan Kaltenbrunner
On 08/20/2015 06:09 PM, Tom Lane wrote:
 Stefan Kaltenbrunner ste...@kaltenbrunner.cc writes:
 I wonder if we should have a default of capping the dump to say 1k lines 
 or such and only optionally do a full one.
 
 -1.  It's worked like this for the last fifteen years or thereabouts,
 and you're the first one to complain.  I suspect some weirdness in
 your logging setup, rather than any systemic problem that we
 need to lobotomize our diagnostic output in order to prevent.

not sure what you consider weird in the logging setup here - the context
dump is IMHO borderline internal diagnostic output at a debug level
(rather than something that makes sense to an average sysadmin) already
(and there is no way to control it). But having (like in our case) the
backend dump 2 million basically identical lines into a general logfile
per event seems excessive and rather abusive towards the rest of the
system (just from an I/O perspective, for example, or from a log file
post-processing tool perspective)

 
 (The reason I say lobotomize is that there's no particularly good
 reason to assume that the first N lines will tell you what you need
 to know.  And the filter rule would have to be *very* stupid, because
 we can't risk trying to allocate any additional memory to track what
 we're doing here.)

I do understand that there might be challenges there, but in the last 15
years machines got way faster and pg got way more capable, and some of
those capabilities might need to get revisited in that regard - and while
it is very nice that pg survives multiple OOM cases pretty nicely, I don't
think it is entitled to put an IMHO unreasonable burden on the rest of
the system by writing insane amounts of data.

Just from a sysadmin perspective it also means that it is trivial for a
misbehaving app to fill up the logfile on a system, because unlike almost
all other actual logging output there seems to be no way to
control/disable it on a per role/database level.


Stefan




[HACKERS] (full) Memory context dump considered harmful

2015-08-20 Thread Stefan Kaltenbrunner

Hi all!


We just had a case of a very long-running process of ours that does a 
lot of prepared statements through Perl's DBD::Pg, running into:


https://rt.cpan.org/Public/Bug/Display.html?id=88827

This resulted in millions of prepared statements created, but not 
removed, in the affected backends over the course of 1-2 hours, until the 
backends in question ran out of memory.
The out-of-memory condition resulted in one memory context dump 
generated per occurrence, each consisting of 2M lines(!) (basically a 
line of CachedPlan/CachedPlanSource per statement/function).
In the 20 minutes or so it took monitoring to alert and operations to 
react, this caused a follow-up incident: repeated out-of-memory 
conditions caused over 400M(!!) log lines amounting to some 15GB of data, 
running the log partition dangerously close to full.


an example memory context dump looks like this:


TopMemoryContext: 582728880 total in 71126 blocks; 6168 free (52 
chunks); 582722712 used
  TopTransactionContext: 8192 total in 1 blocks; 6096 free (1 chunks); 
2096 used

ExecutorState: 8192 total in 1 blocks; 5392 free (0 chunks); 2800 used
  ExprContext: 8192 total in 1 blocks; 8160 free (0 chunks); 32 used
SPI Exec: 0 total in 0 blocks; 0 free (0 chunks); 0 used
SPI Proc: 8192 total in 1 blocks; 5416 free (0 chunks); 2776 used
  PL/pgSQL function context: 8192 total in 1 blocks; 1152 free (1 
chunks); 7040 used
  PL/pgSQL function context: 24576 total in 2 blocks; 11400 free (1 
chunks); 13176 used
  Type information cache: 24576 total in 2 blocks; 11888 free (5 
chunks); 12688 used
  PL/pgSQL function context: 8192 total in 1 blocks; 1120 free (1 
chunks); 7072 used
  PL/pgSQL function context: 24576 total in 2 blocks; 10984 free (1 
chunks); 13592 used
  PL/pgSQL function context: 57344 total in 3 blocks; 29928 free (2 
chunks); 27416 used
  PL/pgSQL function context: 57344 total in 3 blocks; 28808 free (2 
chunks); 28536 used
  PL/pgSQL function context: 24576 total in 2 blocks; 5944 free (3 
chunks); 18632 used
  RI compare cache: 24576 total in 2 blocks; 15984 free (5 chunks); 
8592 used
  RI query cache: 24576 total in 2 blocks; 11888 free (5 chunks); 12688 
used
  PL/pgSQL function context: 57344 total in 3 blocks; 31832 free (2 
chunks); 25512 used
  PL/pgSQL function context: 57344 total in 3 blocks; 29600 free (2 
chunks); 27744 used
  PL/pgSQL function context: 57344 total in 3 blocks; 39688 free (5 
chunks); 17656 used

  CFuncHash: 8192 total in 1 blocks; 1680 free (0 chunks); 6512 used
  Rendezvous variable hash: 8192 total in 1 blocks; 1680 free (0 
chunks); 6512 used
  PLpgSQL function cache: 24520 total in 2 blocks; 3744 free (0 
chunks); 20776 used
  Prepared Queries: 125886512 total in 25 blocks; 4764208 free (91 
chunks); 121122304 used

  TableSpace cache: 8192 total in 1 blocks; 3216 free (0 chunks); 4976 used
  Operator lookup cache: 24576 total in 2 blocks; 11888 free (5 
chunks); 12688 used

  MessageContext: 8192 total in 1 blocks; 6976 free (0 chunks); 1216 used
  Operator class cache: 8192 total in 1 blocks; 1680 free (0 chunks); 
6512 used
  smgr relation table: 24576 total in 2 blocks; 5696 free (4 chunks); 
18880 used
  TransactionAbortContext: 32768 total in 1 blocks; 32736 free (0 
chunks); 32 used

  Portal hash: 8192 total in 1 blocks; 1680 free (0 chunks); 6512 used
  PortalMemory: 8192 total in 1 blocks; 7888 free (0 chunks); 304 used
PortalHeapMemory: 1024 total in 1 blocks; 64 free (0 chunks); 960 used
  ExecutorState: 57344 total in 3 blocks; 21856 free (2 chunks); 
35488 used

ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
ExprContext: 8192 total in 1 blocks; 8160 free (1 chunks); 32 used
  Relcache by OID: 24576 total in 2 blocks; 12832 free (3 chunks); 
11744 used
  CacheMemoryContext: 42236592 total in 28 blocks; 7160904 free (298 
chunks); 35075688 used

CachedPlan: 7168 total in 3 blocks; 1544 free (0 chunks); 5624 used
CachedPlanSource: 7168 total in 3 blocks; 3904 free (1 chunks); 
3264 used

CachedPlan: 15360 total in 4 blocks; 6440 free (1 chunks); 8920 used
CachedPlanSource: 7168 total in 3 blocks; 1352 free (0 chunks); 
5816 used

CachedPlan: 15360 total in 4 blocks; 6440 free (1 chunks); 8920 used
CachedPlanSource: 7168 total in 3 blocks; 1352 free (0 chunks); 
5816 used

CachedPlan: 7168 total in 3 blocks; 1544 free (0 chunks); 5624 used
CachedPlanSource: 7168 total in 3 blocks; 3904 free (1 chunks); 
3264 used

CachedPlan: 15360 total in 4 blocks; 6440 free (1 chunks); 8920 used
CachedPlanSource: 7168 total in 3 blocks; 1352 free (0 chunks); 
5816 used

CachedPlan: 15360 total in 4 blocks; 6440 free (1 chunks); 8920 used
CachedPlanSource: 7168 total in 3 blocks; 1352 free (0 chunks); 
5816 used

CachedPlan: 7168 total in 3 blocks; 1544 free (0 chunks); 5624 used
CachedPlanSource: 7168 total in 3 blocks; 3904 free (1 chunks); 
3264 used

CachedPlan: 15360 

Re: [HACKERS] (full) Memory context dump considered harmful

2015-08-20 Thread Tomas Vondra

Hi,

On 08/20/2015 11:04 PM, Stefan Kaltenbrunner wrote:

On 08/20/2015 06:09 PM, Tom Lane wrote:

Stefan Kaltenbrunner ste...@kaltenbrunner.cc writes:

I wonder if we should have a default of capping the dump to say 1k lines
or such and only optionally do a full one.


-1. It's worked like this for the last fifteen years or
thereabouts, and you're the first one to complain. I suspect some
weirdness in your logging setup, rather than any systemic problem
that we need to lobotomize our diagnostic output in order to
prevent.


not sure what you consider weird in the logging setup here - the
context dump is IMHO borderline internal diagnostic output at a debug
level (rather than something that makes sense to an average sysadmin)
already (and there is no way to control it). But having (like in our
case) the backend dump 2 million basically identical lines into a
general logfile per event seems excessive and rather abusive towards
the rest of the system (just from an I/O perspective, for example, or
from a log file post-processing tool perspective)


I've seen similar issues too, on systems with perfectly common logging 
setups. So even if your logging setup were strange, it's not the 
main factor here.



(The reason I say lobotomize is that there's no particularly
good reason to assume that the first N lines will tell you what you
need to know. And the filter rule would have to be *very* stupid,
because we can't risk trying to allocate any additional memory to
track what we're doing here.)


I do understand that there might be challenges there, but in the last
15 years machines got way faster and pg got way more capable, and some
of those capabilities might need to get revisited in that regard - and
while it is very nice that pg survives multiple OOM cases pretty
nicely, I don't think it is entitled to put an IMHO unreasonable burden
on the rest of the system by writing insane amounts of data.

Just from a sysadmin perspective it also means that it is trivial for
a misbehaving app to fill up the logfile on a system, because unlike
almost all other actual logging output there seems to be no way
to control/disable it on a per role/database level.


IMHO the first thing we might do is provide a log_memory_stats GUC 
controlling that. I'm not a big fan of adding a GUC for everything, but in 
this case it seems appropriate, just like the other log_ options.


I also don't think logging just a subset of the stats is a lost cause. 
Sure, we can't know which of the lines are important, but for example 
logging just the top-level contexts with a summary of the child contexts 
would be OK. So something like this:


TopMemoryContext: 582728880 total in 71126 blocks; 6168 free  ...
  TopTransactionContext: 8192 total in 1 blocks; 6096 free (1 ...
3 child contexts, 10 blocks, 34873 free
  PL/pgSQL function context: 8192 total in 1 blocks; 1152 fre ...
  PL/pgSQL function context: 24576 total in 2 blocks; 11400 f ...
  Type information cache: 24576 total in 2 blocks; 11888 free ...
  PL/pgSQL function context: 8192 total in 1 blocks; 1120 fre ...
  PL/pgSQL function context: 24576 total in 2 blocks; 10984 f ...
  PL/pgSQL function context: 57344 total in 3 blocks; 29928 f ...
  PL/pgSQL function context: 57344 total in 3 blocks; 28808 f ...
  PL/pgSQL function context: 24576 total in 2 blocks; 5944 fr ...
  RI compare cache: 24576 total in 2 blocks; 15984 free (5 ch ...
  RI query cache: 24576 total in 2 blocks; 11888 free (5 chun ...
  PL/pgSQL function context: 57344 total in 3 blocks; 31832 f ...
  PL/pgSQL function context: 57344 total in 3 blocks; 29600 f ...
  PL/pgSQL function context: 57344 total in 3 blocks; 39688 f ...
  CFuncHash: 8192 total in 1 blocks; 1680 free (0 chunks); 65 ...
  Rendezvous variable hash: 8192 total in 1 blocks; 1680 free ...
  PLpgSQL function cache: 24520 total in 2 blocks; 3744 free  ...
  Prepared Queries: 125886512 total in 25 blocks; 4764208 fre ...
  TableSpace cache: 8192 total in 1 blocks; 3216 free (0 chun ...
  Operator lookup cache: 24576 total in 2 blocks; 11888 free  ...
  MessageContext: 8192 total in 1 blocks; 6976 free (0 chunks ...
  Operator class cache: 8192 total in 1 blocks; 1680 free (0  ...
  smgr relation table: 24576 total in 2 blocks; 5696 free (4  ...
  TransactionAbortContext: 32768 total in 1 blocks; 32736 fre ...
  Portal hash: 8192 total in 1 blocks; 1680 free (0 chunks);  ...
  PortalMemory: 8192 total in 1 blocks; 7888 free (0 chunks); ...
    4 child contexts, 34 blocks, 87283 free
  Relcache by OID: 24576 total in 2 blocks; 12832 free (3 chu ...
  CacheMemoryContext: 42236592 total in 28 blocks; 7160904 fr ...
    1487438 child contexts, 14874382 blocks, 3498349 free

instead of the millions of lines with all the details.
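The two-level dump above amounts to a depth-limited walk that aggregates everything below the cutoff. The sketch below is a self-contained simulation of that idea, not actual backend code: the MemCtx struct and both functions are simplified, hypothetical stand-ins for PostgreSQL's memory-context machinery (the real recursion lives in MemoryContextStats).

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for PostgreSQL's MemoryContextData;
 * the real struct differs in detail. */
typedef struct MemCtx
{
    const char    *name;
    size_t         totalspace;  /* bytes allocated in this context */
    size_t         nblocks;
    size_t         freespace;
    struct MemCtx *firstchild;
    struct MemCtx *nextsibling;
} MemCtx;

/* Total up the entire subtree below ctx: number of contexts,
 * blocks, and free bytes (excluding ctx itself). */
static void
summarize_children(const MemCtx *ctx, size_t *nctx, size_t *nblocks,
                   size_t *nfree)
{
    const MemCtx *child;

    for (child = ctx->firstchild; child != NULL; child = child->nextsibling)
    {
        (*nctx)++;
        *nblocks += child->nblocks;
        *nfree += child->freespace;
        summarize_children(child, nctx, nblocks, nfree);
    }
}

/* Print full detail down to max_depth; below that, emit a single
 * summary line per subtree instead of recursing further. */
static void
print_stats(const MemCtx *ctx, int depth, int max_depth)
{
    printf("%*s%s: %zu total in %zu blocks; %zu free\n",
           depth * 2, "", ctx->name, ctx->totalspace, ctx->nblocks,
           ctx->freespace);

    if (depth < max_depth)
    {
        const MemCtx *child;

        for (child = ctx->firstchild; child != NULL; child = child->nextsibling)
            print_stats(child, depth + 1, max_depth);
    }
    else if (ctx->firstchild != NULL)
    {
        size_t nctx = 0, nblocks = 0, nfree = 0;

        summarize_children(ctx, &nctx, &nblocks, &nfree);
        printf("%*s%zu child contexts, %zu blocks, %zu free\n",
               (depth + 1) * 2, "", nctx, nblocks, nfree);
    }
}
```

With max_depth = 1 this reproduces the shape of the dump above: top-level contexts in full, everything deeper folded into "N child contexts" lines.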

So either

log_memory_stats = (on|off)

or

log_memory_stats = (full|short|off)

might work.
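If the enum form were adopted, it would be set like any other logging option. A hypothetical postgresql.conf fragment (log_memory_stats is proposed in this thread, not an existing GUC):

```
# Hypothetical setting proposed in this thread -- not an actual GUC.
log_memory_stats = short        # full | short | off
```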

regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] (full) Memory context dump considered harmful

2015-08-20 Thread Joshua D. Drake


On 08/20/2015 08:51 AM, Stefan Kaltenbrunner wrote:


This is 9.1.14 on Debian Wheezy/amd64 fwiw - but I don't think we have
made relevant changes in more recent versions.


It seems we may also want to consider a way to drop those prepared 
queries after a period of disuse.


JD





regards


Stefan





--
Command Prompt, Inc. - http://www.commandprompt.com/  503-667-4564
PostgreSQL Centered full stack support, consulting and development.
Announcing I'm offended is basically telling the world you can't
control your own emotions, so everyone else should do it for you.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] (full) Memory context dump considered harmful

2015-08-20 Thread Tom Lane
Stefan Kaltenbrunner ste...@kaltenbrunner.cc writes:
 I wonder if we should have a default of capping the dump to say 1k lines 
 or such and only optionally do a full one.

-1.  It's worked like this for the last fifteen years or thereabouts,
and you're the first one to complain.  I suspect some weirdness in
your logging setup, rather than any systemic problem that we
need to lobotomize our diagnostic output in order to prevent.

(The reason I say lobotomize is that there's no particularly good
reason to assume that the first N lines will tell you what you need
to know.  And the filter rule would have to be *very* stupid, because
we can't risk trying to allocate any additional memory to track what
we're doing here.)
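The constraint Tom describes - cap the output per parent without allocating anything - can be sketched as a recursion whose only extra state is a few stack-local counters per level. Again a self-contained simulation with a hypothetical MemCtx type, not the actual backend code; skipped children are summarized by their direct totals only, which keeps the bookkeeping constant per level. The function returns the number of lines printed so the behaviour is easy to check.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for PostgreSQL's MemoryContextData;
 * the real struct differs in detail. */
typedef struct MemCtx
{
    const char    *name;
    size_t         totalspace;
    size_t         nblocks;
    size_t         freespace;
    struct MemCtx *firstchild;
    struct MemCtx *nextsibling;
} MemCtx;

/*
 * Print at most max_children children per parent; fold the rest into a
 * single summary line.  All bookkeeping lives in local variables, so no
 * allocation is risked -- important when stats are being dumped precisely
 * because memory is exhausted.  Returns the number of lines printed.
 */
static size_t
stats_capped(const MemCtx *ctx, int depth, int max_children)
{
    const MemCtx *child;
    int           printed = 0;
    size_t        lines = 1;
    size_t        skipped = 0, skipped_blocks = 0, skipped_free = 0;

    printf("%*s%s: %zu total in %zu blocks; %zu free\n",
           depth * 2, "", ctx->name, ctx->totalspace, ctx->nblocks,
           ctx->freespace);

    for (child = ctx->firstchild; child != NULL; child = child->nextsibling)
    {
        if (printed < max_children)
        {
            lines += stats_capped(child, depth + 1, max_children);
            printed++;
        }
        else
        {
            /* past the cap: accumulate into the summary line */
            skipped++;
            skipped_blocks += child->nblocks;
            skipped_free += child->freespace;
        }
    }

    if (skipped > 0)
    {
        printf("%*s%zu more child contexts containing %zu blocks; %zu free\n",
               (depth + 1) * 2, "", skipped, skipped_blocks, skipped_free);
        lines++;
    }
    return lines;
}
```

As Tom notes, this loses detail only when the interesting bloat sits in a child past the first max_children of its parent, which a reasonably large cap makes unlikely.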

regards, tom lane

