I'm stuck with a web app that periodically has truly awful query
response times. The cause may be an interaction between system load,
vacuum-analyze scheduling, the occasional wild variation in join
selectivity, and stats collection. Logging the queries and running them
later doesn't create an obvious pattern, because it's hard to catch the
exact circumstances where the query takes forever, or returns zero rows.

It would be difficult to change the app to run an EXPLAIN ANALYZE and
record the results. I do this when I can, because it's the fastest way
to figure out what went wrong in a 10-way join. 

I would like to add a guc variant on debug_print_plan that logs the
output of ExplainOneQuery(), rather than dumping the plan node-traversal
output --- perhaps it's just me, but I have trouble relating that output
to the tables and columns named neatly in EXPLAIN ANALYZE output.

This is different from just running EXPLAIN ANALYZE: this is asking the
backend to log the EXPLAIN ANALYZE output, but ALSO return the normal
result set. 

Some points I ponder:

(1) Could not find this on the TODO list, nor mentioned in the mail
archive. Has this truly never come up? Logging detail query execution to
to the profiler is something one gets accustomed to, with MSSQL.

(2) src/backend/commands/explain.c:ExplainQuery gives a cautionary
comment on scribbling (recording stats) in the execution tree,
particularly when that tree will be executed repeatedly from a PLPGSQL
proc (not a problem in my case, but ...)

(3) any suggestions on what to avoid ...



---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend

Reply via email to