I'm attempting to measure database query performance across a large number of high-volume clusters. I'd like to share a couple of ideas and hear how others are approaching this problem. I assume solutions already exist; I just haven't found them.
The idea is to systematically capture the execution times of predictable SQL queries that exceed certain thresholds. For example, suppose my application routinely issues queries of the form "SELECT ... FROM table1, ... WHERE id = NNN AND ...", and from experience we know this query takes 5ms when fully cached and maybe 50ms when not. We'd like to capture the occasions when it exceeds, say, 100ms.

My latest thought is to store regexes matching the "interesting queries", along with their thresholds, in a central database:

    create table interesting_query (
        regex varchar,
        min   float
    );

Then, with each cluster logging queries and their durations, I'd tail the log into a perl script that:

    1) connects to the central DB and downloads the interesting queries,
    2) parses the log output,
    3) keeps running max/min/avg/stddev per query, and
    4) periodically inserts the results into the central monitoring database.

So, supposing a given query runs 10 times per second, we might report on it every minute, and each report would carry the aggregate count/min/max/avg/stddev for the roughly 600 instances seen in the preceding minute. Once those numbers are in a central database, I could easily identify performance trouble spots.

How are others handling this problem? Other ideas?

Thanks,
Ed
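To make steps 2 and 3 concrete, here is a minimal sketch of the parse-and-aggregate piece. I'd actually write it in perl; Python is used here only for illustration. The log-line format, the threshold semantics (only durations at or above the threshold are counted), and names like QueryStats and process_line are all placeholders, and the real script would download its patterns from the central DB and flush summaries there rather than keeping them in memory.

```python
import math
import re

class QueryStats:
    """Running aggregate for one interesting-query pattern."""

    def __init__(self, regex, threshold_ms):
        self.pattern = re.compile(regex)
        self.threshold_ms = threshold_ms
        self.reset()

    def reset(self):
        """Clear the accumulators, e.g. after each periodic report."""
        self.count = 0
        self.total = 0.0
        self.total_sq = 0.0   # sum of squares, for stddev
        self.min = None
        self.max = None

    def observe(self, duration_ms):
        """Fold one matching slow query into the running stats."""
        self.count += 1
        self.total += duration_ms
        self.total_sq += duration_ms * duration_ms
        self.min = duration_ms if self.min is None else min(self.min, duration_ms)
        self.max = duration_ms if self.max is None else max(self.max, duration_ms)

    def summary(self):
        """Return the aggregate row to insert into the central DB."""
        avg = self.total / self.count
        variance = max(self.total_sq / self.count - avg * avg, 0.0)
        return {"count": self.count, "min": self.min, "max": self.max,
                "avg": avg, "stddev": math.sqrt(variance)}

# Hypothetical log-line shape: "... duration: 123.456 ms  statement: SELECT ..."
LOG_RE = re.compile(r"duration: ([\d.]+) ms\s+statement: (.*)")

def process_line(line, watchers):
    """Parse one log line and feed it to every matching watcher."""
    m = LOG_RE.search(line)
    if not m:
        return
    duration, statement = float(m.group(1)), m.group(2)
    for w in watchers:
        if w.pattern.search(statement) and duration >= w.threshold_ms:
            w.observe(duration)
```

The sum-of-squares variance is fine for a one-minute window of a few hundred samples; a long-running accumulator would want a more numerically stable method.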