Re: [HACKERS] auto_explain WAS: RFC: Timing Events

Jim Nasby Sun, 24 Feb 2013 17:55:45 -0800

Sorry for the late reply, but I think I can add some ideas here...


On 11/21/12 5:33 PM, Gavin Flower wrote:

On 22/11/12 12:15, Greg Smith wrote:

On 11/8/12 2:16 PM, Josh Berkus wrote:

Also, logging only the long-running queries is less useful than people
on this list seem to think.  When I'm doing real performance analysis, I
need to see *everything* which was run, not just the slow stuff.  Often
the real problem is a query which used to take 1.1ms, now takes 1.8ms,
and gets run 400 times/second. Looking just at the slow queries won't
tell you that.


No argument here.  I've tried to be clear that some of these high-speed but 
lossy things I'm hacking on are not going to be useful for everyone.  A 
solution that found most of these 1.8ms queries, but dropped some percentage 
under the highest peak load conditions, would still be very useful to me.

An anecdote on this topic seems relevant.  I have a troublesome production 
server that has moved log_min_duration_statement from 100ms to 500ms to 2000ms 
as the system grew.  Even the original setting wasn't short enough to catch 
everything we would like to watch *now*, but seeing sub-second data is a dream 
at this point. The increases have been forced by logging contention becoming 
unmanagable when every client has to fight for the log to write to disk.  I can 
see the buggers stack up as waiting for locks if I try to log shorter 
statements, stalling enough that it drags the whole server down under peak load.

If I could just turn off logging just during those periods--basically, throwing 
them away only when some output rate throttled component hit its limit--I could 
still find them in the data collected the rest of the time.  There are some 
types of problems that also only occur under peak load that this idea would 
miss, but you'd still be likely to get *some* of them, statistically.

There's a really hard trade-off here:

-Sometimes you must save data on every query to capture fine details
-Making every query wait for disk I/O is impractical

The sort of ideas you threw out for making things like auto-explain logging 
per-session can work.  The main limitation there though is that it presumes you 
even know what area the problem is in the first place. I am not optimistic that 
covers a whole lot of ground either.

Upthread I talked about a background process that persists shared memory queues 
as a future consumer of timing events--one that might consume slow query data 
too.  That is effectively acting as the ideal component I described above, one 
that only loses things when it exceeds the system's write capacity for saving 
them.  I wouldn't want to try and retrofit the existing PostgreSQL logging 
facility for such a thing though.  Log parsing as the way to collect data is 
filled with headaches anyway.

I don't see any other good way to resolve this trade-off.  To help with the 
real-world situation you describe, ideally you dump all the query data 
somewhere, fast, and have the client move on.  You can't make queries wait for 
storage, something else (with a buffer!) needs to take over that job.

I can't find the URL right now, but at PG.EU someone was showing me a module 
that grabbed the new 9.2 logging hook and shipped the result to another server. 
 Clients burn a little CPU and network time and they move on, and magically 
disk I/O goes out of their concerns.  How much overhead persisting the data 
takes isn't the database server's job at all then.  That's the sort of 
flexibility people need to have with logging eventually.  Something so 
efficient that every client can afford to do it; it is capable of saving all 
events under ideal conditions; but under adverse ones, you have to keep going 
and accept the loss.


I definitely think we need a means to "log" stuff really, really fast and then 
deal with it later. And the log ideally would not be text either, because it's a PITA to 
deal with and often far more data to handle (which makes all our problems worse).

I like that Greg is proposing a method of logging information in a more useful 
format than text. I suggest that it would be best if the non-text structure be 
implemented outside of any particular logging methodology if possible. Perhaps 
as something *equivalent to* a set of data types in PG. Of course I doubt real 
PG types could be used for the actual logging due to performance reasons, but 
the end interface should be able to use PG capabilities (inheritance is another 
interesting possibility).

Josh also brings up an interesting use case, one where I really like Tom Kyte's (the Oracle 
"Ask Tom" guy) methodology. When asked about how to do logging in PL/SQL he proposed a 
system where you always associate some kind of "context" with log data so that you can 
enable logging *only for that context*. In this case, (and in plpgsql syntax), you'd do something 
like:

PERFORM debug( 'my function name', 'blah blah just happened' );

That way you're not stuck trying to mess around with verbosity levels to stem a 
flood of data, instead you can only enable getting data for a specific context.

IIRC, his system also had a method for turning on debugging on a *different 
backend*. That allowed you to debug a specific application thread. If we had 
that ability, it would eliminate the issue of requiring SU to change the 
autoexplain GUC.

Please note that I'm not suggesting these ideas at the exclusion of others; they can all 
be valuable. IE: Retain the concept of "log level", add a user specified 
context (or maybe it'd be enough to just use function names), and provide some ability to 
control debug on a separate backend. All three features are useful and can stand on their 
own or in combination.

Would it be useful to be able to specify a fixed sampling rate, say 1 in 100.  
That way you could get a well defined statistical sample, yet not cause 
excessive I/O?


We actually do that in our application and have discovered that random sampling 
can end up significantly skewing your data.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] auto_explain WAS: RFC: Timing Events

Reply via email to