Re: [HACKERS] RFC: Timing Events
On Wed, Nov 7, 2012 at 11:45 AM, Josh Berkus wrote: > >> You're being obtuse, Josh. These parameters can be set the same way any >> other parameters can, including ALTER ROLE SET or ALTER DATABASE SET. >> The "superuser only" restriction is that only a superuser can execute >> the ALTER, not that the target role has to be superuser. > > Oh? Ok, that's helpful. > > It still eliminates the main potential use of auto-explain on production > sites, though, which is to turn it on only for specific operations. If you have sufficient control over the application to do that, why can't you just explain those queries the old fashioned way? > I've never been able to make use of auto-explain for any real diagnostic > purpose on a production site, and I don't personally know anyone who has. I have, although it is a rather low volume production site. What prevents you from doing so? The performance impact? The volume of logs generated? Cheers, Jeff -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] RFC: Timing Events
Josh Berkus writes: > It still eliminates the main potential use of auto-explain on production > sites, though, which is to turn it on only for specific operations. > I've never been able to make use of auto-explain for any real diagnostic > purpose on a production site, and I don't personally know anyone who has. This objection seems completely off the mark. If you have a specific query you want to test, you can just use EXPLAIN. The point of auto-explain is to get info about queries that are unexpectedly slow, and for that it seems to me that you'll generally want it active over a fairly wide scope. I'm perfectly prepared to grant that you personally may never have had a use for auto-explain, but it does not follow that it's useless. I'm also pretty unclear on what use-case you think there is that's intermediate between plain EXPLAIN and cases for which auto-explain is reasonably suitable as-is. > To get back to the original thread, though: I'm saying that "you can do > that with auto-explain" is zero justification to limit any of Greg's > work, because auto-explain isn't generally useful. I don't recall having said anything against Greg's proposal, but in any case I don't grant your premise. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] RFC: Timing Events
> You're being obtuse, Josh. These parameters can be set the same way any > other parameters can, including ALTER ROLE SET or ALTER DATABASE SET. > The "superuser only" restriction is that only a superuser can execute > the ALTER, not that the target role has to be superuser. Oh? Ok, that's helpful. It still eliminates the main potential use of auto-explain on production sites, though, which is to turn it on only for specific operations. I've never been able to make use of auto-explain for any real diagnostic purpose on a production site, and I don't personally know anyone who has. I recall bringing this up when auto-explain was first committed as a fatal limitation in the module. And as far as I'm concerned, it has been. Fixing this would clearly be complex; we don't currently have a system which lets the superuser control which other users can change specific parameters, even though that would be useful in a lot of contexts. Presumably, making auto-explain would also require such a system of grants, so it's not something we're fixing for 9.3, I'd imagine. To get back to the original thread, though: I'm saying that "you can do that with auto-explain" is zero justification to limit any of Greg's work, because auto-explain isn't generally useful. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] RFC: Timing Events
Josh Berkus writes: >> You can reduce the logging volume on busy servers with >> auto_explain.log_min_duration. You can also activate it for a single >> database user only by setting log_min_duration to -1 globally and >> change the setting for one user with ALTER ROLE SET, right? > Not according to the docs, you can't. You're being obtuse, Josh. These parameters can be set the same way any other parameters can, including ALTER ROLE SET or ALTER DATABASE SET. The "superuser only" restriction is that only a superuser can execute the ALTER, not that the target role has to be superuser. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] RFC: Timing Events
> You can reduce the logging volume on busy servers with > auto_explain.log_min_duration. You can also activate it for a single > database user only by setting log_min_duration to -1 globally and > change the setting for one user with ALTER ROLE SET, right? Not according to the docs, you can't. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] RFC: Timing Events
Josh Berkus wrote: >> Huh? The typical use-case is to enable it for all sessions by >> including it in shared_preload_libraries. That doesn't require any >> particular session to be superuser. (If you're superuser you can then >> turn it *off* in your session, should you wish.) > > It's not practical to have auto-explain on for all queries on a server > which is processing 10K queries/minute. And non-superusers can't alter > the settings in their session, even the min_duration. > > A practical use of auto-explain would involve during it on for a single > user session, or for a specific database user, neither of which are > possible. I think that Tom is right. You can reduce the logging volume on busy servers with auto_explain.log_min_duration. You can also activate it for a single database user only by setting log_min_duration to -1 globally and change the setting for one user with ALTER ROLE SET, right? And I am under the impression that parameters that influence what is logged tend to be superuser only for a good reason (the only exception I see are the debug_print_* parameters). Yours, Laurenz Albe -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] RFC: Timing Events
> Huh? The typical use-case is to enable it for all sessions by > including it in shared_preload_libraries. That doesn't require any > particular session to be superuser. (If you're superuser you can then > turn it *off* in your session, should you wish.) It's not practical to have auto-explain on for all queries on a server which is processing 10K queries/minute. And non-superusers can't alter the settings in their session, even the min_duration. A practical use of auto-explain would involve during it on for a single user session, or for a specific database user, neither of which are possible. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] RFC: Timing Events
On Sun, Nov 4, 2012 at 1:35 AM, Pavel Stehule wrote: > Hello > > 2012/11/4 Satoshi Nagayasu : >>> >> >> Do we have something to add to auto_explain? > > Now I am working on expanding slow query record and auto_explain with > some locking times (lock on objects, lock on enhancing pages, other > locks). But this would only work if you used 'auto_explain.log_analyze=1', which is has nasty performance implications. Or you planning on changing the way log_analyze works to get around this? > > Just statement time produces too less information in our complex and > "unpredictable" cloud environment with thousand databases and hundreds > servers. I think it would be easier to implement but still a big step forward over what we currently have if "explain analyze" and "\timing" would show the 'rusage' values in addition to the wall-clock time. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] RFC: Timing Events
Josh Berkus writes: >> Do we have something to add to auto_explain? > Well, to be frank, I've never found auto-explain to be useful because of > its restriction to superuser sessions. It's an interesting > proof-of-concept, but completely useless at any production site. Huh? The typical use-case is to enable it for all sessions by including it in shared_preload_libraries. That doesn't require any particular session to be superuser. (If you're superuser you can then turn it *off* in your session, should you wish.) regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] RFC: Timing Events
> I think auto_explain would help you solve such rare incidents > if it could dump several statistics into server log, including lock > waits and block reads/writes statistic per-session, for example. > > Do we have something to add to auto_explain? Well, to be frank, I've never found auto-explain to be useful because of its restriction to superuser sessions. It's an interesting proof-of-concept, but completely useless at any production site. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] RFC: Timing Events
Hello 2012/11/4 Satoshi Nagayasu : > (2012/11/03 10:44), Josh Berkus wrote: >> >> >>> I don't see all that going into core without a much bigger push than I >>> think people will buy. What people really want for all these is a >>> proper trending system, and that means graphs and dashboards and >>> bling--not a history table. >> >> >> Well, I'm particularly thinking for autoconfiguration. For example, to >> set vacuum_freeze_min_age properly, you have to know the XID "burn rate" >> of the server, which is only available via history. I really don't want >> to be depending on a graphical monitoring utility to find these things >> out. >> >>> This whole approach has the assumption that things are going to fall off >>> sometimes. To expand on that theme for a second, right now I'm more >>> worried about the "99%" class of problems. Neither pg_stat_statements >>> nor this idea are very good for tracking the rare rogue problem down. >>> They're both aimed to make things that happen a lot more statistically >>> likely to be seen, by giving an easier UI to glare at them frequently. >>> That's not ideal, but I suspect really fleshing the whole queue consumer >>> -> table idea needs to happen to do much better. >> >> >> I'm just concerned that for some types of incidents, it would be much >> more than 1% *of what you want to look at* which fall off. For example, >> consider a server which does 95% reads at a very high rate, but has 2% >> of its writes cronically having lock waits. That's something you want >> to solve, but it seems fairly probably that these relatively infrequent >> queries would have fallen off the bottom of pg_stat_statements. Same >> thing with the relative handful of queries which do large on-disk sorts. >> >> The problem I'm worried about is that pg_stat_statements is designed to >> keep the most frequent queries, but sometimes the thing you really need >> to look at is not in the list of most frequent queries. > > > I think auto_explain would help you solve such rare incidents > if it could dump several statistics into server log, including lock > waits and block reads/writes statistic per-session, for example. > > Do we have something to add to auto_explain? Now I am working on expanding slow query record and auto_explain with some locking times (lock on objects, lock on enhancing pages, other locks). Just statement time produces too less information in our complex and "unpredictable" cloud environment with thousand databases and hundreds servers. Regards Pavel Stehule > > Regards, > -- > Satoshi Nagayasu > Uptime Technologies, LLC. http://www.uptime.jp > > > -- > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-hackers -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] RFC: Timing Events
(2012/11/03 10:44), Josh Berkus wrote: I don't see all that going into core without a much bigger push than I think people will buy. What people really want for all these is a proper trending system, and that means graphs and dashboards and bling--not a history table. Well, I'm particularly thinking for autoconfiguration. For example, to set vacuum_freeze_min_age properly, you have to know the XID "burn rate" of the server, which is only available via history. I really don't want to be depending on a graphical monitoring utility to find these things out. This whole approach has the assumption that things are going to fall off sometimes. To expand on that theme for a second, right now I'm more worried about the "99%" class of problems. Neither pg_stat_statements nor this idea are very good for tracking the rare rogue problem down. They're both aimed to make things that happen a lot more statistically likely to be seen, by giving an easier UI to glare at them frequently. That's not ideal, but I suspect really fleshing the whole queue consumer -> table idea needs to happen to do much better. I'm just concerned that for some types of incidents, it would be much more than 1% *of what you want to look at* which fall off. For example, consider a server which does 95% reads at a very high rate, but has 2% of its writes cronically having lock waits. That's something you want to solve, but it seems fairly probably that these relatively infrequent queries would have fallen off the bottom of pg_stat_statements. Same thing with the relative handful of queries which do large on-disk sorts. The problem I'm worried about is that pg_stat_statements is designed to keep the most frequent queries, but sometimes the thing you really need to look at is not in the list of most frequent queries. I think auto_explain would help you solve such rare incidents if it could dump several statistics into server log, including lock waits and block reads/writes statistic per-session, for example. Do we have something to add to auto_explain? Regards, -- Satoshi Nagayasu Uptime Technologies, LLC. http://www.uptime.jp -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] RFC: Timing Events
> I don't see all that going into core without a much bigger push than I > think people will buy. What people really want for all these is a > proper trending system, and that means graphs and dashboards and > bling--not a history table. Well, I'm particularly thinking for autoconfiguration. For example, to set vacuum_freeze_min_age properly, you have to know the XID "burn rate" of the server, which is only available via history. I really don't want to be depending on a graphical monitoring utility to find these things out. > This whole approach has the assumption that things are going to fall off > sometimes. To expand on that theme for a second, right now I'm more > worried about the "99%" class of problems. Neither pg_stat_statements > nor this idea are very good for tracking the rare rogue problem down. > They're both aimed to make things that happen a lot more statistically > likely to be seen, by giving an easier UI to glare at them frequently. > That's not ideal, but I suspect really fleshing the whole queue consumer > -> table idea needs to happen to do much better. I'm just concerned that for some types of incidents, it would be much more than 1% *of what you want to look at* which fall off. For example, consider a server which does 95% reads at a very high rate, but has 2% of its writes cronically having lock waits. That's something you want to solve, but it seems fairly probably that these relatively infrequent queries would have fallen off the bottom of pg_stat_statements. Same thing with the relative handful of queries which do large on-disk sorts. The problem I'm worried about is that pg_stat_statements is designed to keep the most frequent queries, but sometimes the thing you really need to look at is not in the list of most frequent queries. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] RFC: Timing Events
On 11/1/12 11:54 PM, Josh Berkus wrote: For example, it would be really useful to be able to see, for example, pg_stat_user_tables from 2 days ago to estimate table growth and activity, or pg_stat_replication from 10 minutes ago to average replication lag. I don't see all that going into core without a much bigger push than I think people will buy. What people really want for all these is a proper trending system, and that means graphs and dashboards and bling--not a history table. I have almost all of my customers using Munin or Cacti or Zabbix or something, and none using pg_statsinfo. Shoot, static graphs are barely good enough anymore--people really want dynamic ones driven by client-side Javascript. "Why can't I zoom in on this Munin graph, this is lame" they tell me. I blame Google Maps for being the first thing that made all the users so demanding in this area. But the main weakness of these tools isn't display, is that it's seemed impractical to get them to collect per-table data, either for configuration, speed, or display reasons. I'm trying to find a good web application toolchain to recommend that does that and dynamic graphs, too. I would never take up the fight to try and build in that direction in core though. I think most people aren't even consuming the pg_stat_user_tables data already provided fully yet in userland. [I fear this topic will turn into a more appropriate one for pgsql-advocacy in a hurry if it keeps going] So, the problem with joining against pg_stat_statements is that a special-purpose incident you're looking at (like a lock_wait) might have been pushed "off the bottom" of pg_stat_statements even though it is still visible in pg_stat_lock_waits. No? This whole approach has the assumption that things are going to fall off sometimes. To expand on that theme for a second, right now I'm more worried about the "99%" class of problems. Neither pg_stat_statements nor this idea are very good for tracking the rare rogue problem down. They're both aimed to make things that happen a lot more statistically likely to be seen, by giving an easier UI to glare at them frequently. That's not ideal, but I suspect really fleshing the whole queue consumer -> table idea needs to happen to do much better. Thanks for the quick feedback, there's a lot of ideas I should incorporate there I need to chew on. -- Greg Smith 2ndQuadrant USg...@2ndquadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] RFC: Timing Events
On Fri, Nov 2, 2012 at 8:54 AM, Josh Berkus wrote: > Greg, > > First off, let me again praise the great work you and Peter are doing in > this area. > > Modeling this on pg_stat_statements includes the hope of packaging it as > > an extension with minor core hooks, and the idea that there would be a > > fixed size list of timing events available at any time. Consider it a > > FIFO queue built using a circular buffer. Dump events into there and > > provide a way to view them. If some fall out before they're > > analyzed/saved, that doesn't seem a problem for now. > > Agreed. > > > If you want 100% > > durable, the log is still available. Eventually I'd like a consumer for > > these that wrote them to a history table as an option, but that seems a > > second priority after getting them into memory in the first place. > > Yes, and easy for users/tools to implement once the basic data is out > there. > > > I'd > > like that consumer thing for pg_stat_statements too, but so far we're > > getting by without it. It seems like something that might benefit from > > the in-core queuing work one day too. > > Well, we could actually use such a thing in general, and not just for > the timing events. For example, it would be really useful to be able to > see, for example, pg_stat_user_tables from 2 days ago to estimate table > growth and activity, or pg_stat_replication from 10 minutes ago to > average replication lag. There was a plug-in tool for this, I think > Itagaki developed it. Anyone remember what/where it is? > pg_statsinfo perhaps? It is used for stat info management: http://pgfoundry.org/projects/pgstatsinfo/ -- Michael Paquier http://michael.otacoo.com
Re: [HACKERS] RFC: Timing Events
Greg, First off, let me again praise the great work you and Peter are doing in this area. > Modeling this on pg_stat_statements includes the hope of packaging it as > an extension with minor core hooks, and the idea that there would be a > fixed size list of timing events available at any time. Consider it a > FIFO queue built using a circular buffer. Dump events into there and > provide a way to view them. If some fall out before they're > analyzed/saved, that doesn't seem a problem for now. Agreed. > If you want 100% > durable, the log is still available. Eventually I'd like a consumer for > these that wrote them to a history table as an option, but that seems a > second priority after getting them into memory in the first place. Yes, and easy for users/tools to implement once the basic data is out there. > I'd > like that consumer thing for pg_stat_statements too, but so far we're > getting by without it. It seems like something that might benefit from > the in-core queuing work one day too. Well, we could actually use such a thing in general, and not just for the timing events. For example, it would be really useful to be able to see, for example, pg_stat_user_tables from 2 days ago to estimate table growth and activity, or pg_stat_replication from 10 minutes ago to average replication lag. There was a plug-in tool for this, I think Itagaki developed it. Anyone remember what/where it is? So there's actually two things for a consumer: Timing Events and Counters. > = Initial event list = > > There are three particular events I have been thinking about, hoping > that will be enough to make sure the structure is general enough to > handle more later: > > -log_lock_waits output. This is my #1 goal. Lock waits are incredibly > common issues in larger systems, and the process for tracking them down > right now is very painful. Probably deadlocks too. > > -log_temp_files output. This everyone wants to monitor, but it won't be > as useful unless query tieing (below) works. > > -log_checkpoints. The fight to make this data available via > pg_stat_bgwriter has been useful. You really need a trending system to > do anything useful with the data that way though. Being able to write a > simple timing events query that looked for checkpoint spikes would be > great for alerting, which is often done with something other than a > trending package. To give specific examples here, the graphs Munin > makes for pg_stat_bgwriter data work, but they're not much of a help if > what you really want is to trigger a Nagios alert when checkpoints are > slow. A Timing Event entry of a slow checkpoint would be easy to > trigger alerts from. I agree that these are the events to start with. Let me add the additional events I think we want to have, in the order of how useful the information is for troubleshooting: 4. log_connections / log_disconnections 5. deadlock reporting 6. autovacuum logging I also think that we should eventually expose a buffer of the last N error messages, but that's a different sort of data. > > = Data to save = > > There's a few obvious fields that could be explicitly saved as regular > columns: > > -timestamp > -user > -database + application_name > -event type code. Presumably an integer that's decoded via a master > list of these, in some header file. > -running process id + running session id > -running process details (below) > -referenced process id. The one causing the problem in situations like > a lock wait. - referenced session id > -referenced process details (below) > -serial number. People who will want to poll & store this data might > want a better way to do that than just a timestamp. Would let you know > when you lost some of it, too, so you'd know to consider increasing the > buffer size or decrease the polling interval. By the time the user looks at this information, details on the process from pg_stat_activity will probably no longer be available. This makes data like process_id not that useful, and means we need to replicate all relevant data from pg_stat_activity. > Another way this data might be stored is to have a key/value composite > type, and then make each timing event entry an array of those. If only PostgreSQL had some sort of key-value composite storage type available! I have no idea where we would get such a thing. ;-) > I'm open to other ideas for how to store this. If the events go into > one place, the different data each saves turns into the classic > "flexible schema" problem. I think it's an excellent approach, not in the least because it lets us add new data later, and deals with the variety of fields we need to lock_waits. > = Query text = > > For the often multi-line query text that's dumped by log_lock_waits, > I've thought of two possibilities so far: > > -Mash them into a single text line using "\n" between them. Now you're > even back to where, with the right escaping, the entire timing event > detail par
[HACKERS] RFC: Timing Events
Parsing log files for commonly needed performance data is no fun. As a next step toward eliminating that, I've been working on how to approach this similarly to how pg_stat_statements can cut down on query log processing. I thought it would be novel to outline this for design review before coding any more of it. I keep switching between calling the sort of things people want to know about Timing Events and Performance Events. No preference here, I'll use the former because it's shorter. Modeling this on pg_stat_statements includes the hope of packaging it as an extension with minor core hooks, and the idea that there would be a fixed size list of timing events available at any time. Consider it a FIFO queue built using a circular buffer. Dump events into there and provide a way to view them. If some fall out before they're analyzed/saved, that doesn't seem a problem for now. If you want 100% durable, the log is still available. Eventually I'd like a consumer for these that wrote them to a history table as an option, but that seems a second priority after getting them into memory in the first place. I'd like that consumer thing for pg_stat_statements too, but so far we're getting by without it. It seems like something that might benefit from the in-core queuing work one day too. = Initial event list = There are three particular events I have been thinking about, hoping that will be enough to make sure the structure is general enough to handle more later: -log_lock_waits output. This is my #1 goal. Lock waits are incredibly common issues in larger systems, and the process for tracking them down right now is very painful. Probably deadlocks too. -log_temp_files output. This everyone wants to monitor, but it won't be as useful unless query tieing (below) works. -log_checkpoints. The fight to make this data available via pg_stat_bgwriter has been useful. You really need a trending system to do anything useful with the data that way though. Being able to write a simple timing events query that looked for checkpoint spikes would be great for alerting, which is often done with something other than a trending package. To give specific examples here, the graphs Munin makes for pg_stat_bgwriter data work, but they're not much of a help if what you really want is to trigger a Nagios alert when checkpoints are slow. A Timing Event entry of a slow checkpoint would be easy to trigger alerts from. = Data to save = There's a few obvious fields that could be explicitly saved as regular columns: -timestamp -user -database -event type code. Presumably an integer that's decoded via a master list of these, in some header file. -running process id -running process details (below) -referenced process id. The one causing the problem in situations like a lock wait. -referenced process details (below) -serial number. People who will want to poll & store this data might want a better way to do that than just a timestamp. Would let you know when you lost some of it, too, so you'd know to consider increasing the buffer size or decrease the polling interval. = Storing event details = When I think about how people are going to want to use this data, it should be easy to tear apart without returning to the need to write a complicated parser. My first idea with that in mind was to consider this like a series of key/value pairs. If you look at the log_checkpoints output, you can see it approaches that form already at the end, with "=" signs as the delimiters. Taking an example from a pre-9.1 server (so no sync times, but they obviously fit the style already): checkpoint complete: wrote 128770 buffers (12.3%); 0 transaction log file(s) added, 979 removed, 129 recycled; write=431.784 s, sync=6048.384 s, total=6484.669 s You can imagine that instead as: wrote=128770 wrote_pct=12.3 log_added=0 log_removed=979 log_recycled=129 recycled write=431.784 sync=6048.384 s total=6484.669 Even if the timing event data was stored exactly like that--simple text string, perhaps with some escaping of "=" signs--I think it would be completely usable to people. log_temp_files output saves like this easily enough too. Another way this data might be stored is to have a key/value composite type, and then make each timing event entry an array of those. I'm open to other ideas for how to store this. If the events go into one place, the different data each saves turns into the classic "flexible schema" problem. = Query text = For the often multi-line query text that's dumped by log_lock_waits, I've thought of two possibilities so far: -Mash them into a single text line using "\n" between them. Now you're even back to where, with the right escaping, the entire timing event detail part could still collapse into a single text string. That would kill the idea of simple parsing of key/value data stuffed there though. -Create an array of lines to