I'm stuck with a web app that periodically has truly awful query response times. The cause may be an interaction between system load, vacuum-analyze scheduling, the occasional wild variation in join selectivity, and stats collection. Logging the queries and re-running them later doesn't reveal an obvious pattern, because it's hard to catch the exact circumstances under which a query takes forever, or returns zero rows.
It would be difficult to change the app to run an EXPLAIN ANALYZE and record the results. I do this when I can, because it's the fastest way to figure out what went wrong in a 10-way join.

I would like to add a GUC variant of debug_print_plan that logs the output of ExplainOneQuery(), rather than dumping the plan node-traversal output --- perhaps it's just me, but I have trouble relating that output to the tables and columns named neatly in EXPLAIN ANALYZE output. Note that this is different from just running EXPLAIN ANALYZE: this asks the backend to log the EXPLAIN ANALYZE output, but ALSO return the normal result set.

Some points I ponder:

(1) I could not find this on the TODO list, nor any mention of it in the mail archives. Has this truly never come up? Logging detailed query execution to the profiler is something one gets accustomed to with MSSQL.

(2) src/backend/commands/explain.c:ExplainQuery has a cautionary comment about scribbling (recording stats) in the execution tree, particularly when that tree will be executed repeatedly from a PL/pgSQL proc (not a problem in my case, but ...).

(3) Any suggestions on what to avoid?
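For concreteness, here is the rough shape I have in mind. This is an untested sketch, not a patch: the GUC name (log_explain_analyze) and the exact call site are my own invention, loosely modeled on how debug_print_plan is wired up.

```
/* SKETCH ONLY -- hypothetical names, not checked against the tree.
 * A new bool GUC, registered in guc.c next to debug_print_plan:
 */
static bool log_explain_analyze = false;   /* hypothetical GUC */

/* Then, after normal execution of a query completes, roughly: */
if (log_explain_analyze)
{
    /*
     * Run with the same instrumentation EXPLAIN ANALYZE enables,
     * capture the text that ExplainOneQuery() would have produced
     * (how exactly to capture it is the open question), and send it
     * to the server log rather than to the client:
     */
    elog(LOG, "EXPLAIN ANALYZE: %s", explain_text);

    /* ... while the normal result set still goes to the client. */
}
```

The session-level usage would then be something like `SET log_explain_analyze = on;` before the suspect queries, with the plans showing up in the server log alongside the usual statement logging.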