I'm attempting to measure database query performance across a 
large number of high-volume clusters.  I'd like to share a 
couple of ideas and hear what approaches others are using for 
this problem.  I suppose there are existing solutions; I just 
haven't found them.

The idea here is to systematically capture the execution times 
of predictable SQL queries that take longer than certain 
thresholds.  
For example, suppose my application routinely launches queries 
of the form "SELECT ... FROM table1, ... WHERE id = NNN and 
...", and from experience we know this query takes 5ms when 
fully cached, and maybe 50ms when not cached.  So we'd like to 
capture when this query exceeds, say, 100ms.

My latest thought is to store regexes of "interesting queries" 
along with their thresholds in a central database:

        create table interesting_query (
                regex   varchar,
                min     float
        )
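
For concreteness, a row for the example query above might be 
registered like so.  This is just a sketch: the DSN and 
credentials are placeholders, and I'm assuming the "min" 
threshold is in milliseconds.

        #!/usr/bin/perl
        # Sketch: register the "SELECT ... WHERE id = NNN" query from
        # above as an interesting query with a 100ms threshold
        # (units are an assumption).
        use strict;
        use warnings;
        use DBI;

        my $dbh = DBI->connect('dbi:Pg:dbname=monitor;host=central-db',
                               'monitor', 'secret', { RaiseError => 1 });

        $dbh->do('INSERT INTO interesting_query (regex, min) VALUES (?, ?)',
                 undef,
                 'SELECT .* FROM table1,.* WHERE id = \d+',  # matched against logged SQL
                 100.0);

        $dbh->disconnect;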

Then, with each cluster logging queries and their durations, I'd 
tail the log into a perl script that 1) connects to the central 
DB and downloads the interesting queries, 2) parses the log 
output, 3) keeps track of max/min/avg/stddev, and 4) 
periodically inserts the results into the central monitoring 
database.
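
Roughly, I'm picturing the script looking something like this.  
It's an untested sketch: the duration/statement match assumes 
log_min_duration_statement-style log lines, and the connection 
details are placeholders.

        #!/usr/bin/perl
        # Untested sketch of the log-tailing aggregator described above.
        # Assumes log lines like:
        #   LOG:  duration: 123.456 ms  statement: SELECT ... FROM table1 ...
        use strict;
        use warnings;
        use DBI;

        # 1) connect to the central DB and download the interesting queries
        my $dbh = DBI->connect('dbi:Pg:dbname=monitor;host=central-db',
                               'monitor', 'secret', { RaiseError => 1 });
        my $rows = $dbh->selectall_arrayref(
            'SELECT regex, min FROM interesting_query');
        my @interesting = map {
            { text => $_->[0], re => qr/$_->[0]/, threshold => $_->[1] }
        } @$rows;

        my %stats;    # regex text => { count, sum, sumsq, min, max }

        # 2) parse the log output, fed in via something like:
        #      tail -F postgresql.log | ./slow_query_agg.pl
        while (my $line = <STDIN>) {
            next unless $line =~ /duration:\s*([\d.]+)\s*ms\s+statement:\s*(.*)/;
            my ($ms, $sql) = ($1, $2);

            for my $iq (@interesting) {
                next unless $sql =~ $iq->{re};
                # (could also skip instances below $iq->{threshold} here if
                #  only the slow ones should count toward the aggregates)

                # 3) keep running count/sum/sum-of-squares/min/max per query
                my $s = $stats{ $iq->{text} } ||=
                    { count => 0, sum => 0, sumsq => 0, min => $ms, max => $ms };
                $s->{count}++;
                $s->{sum}   += $ms;
                $s->{sumsq} += $ms * $ms;
                $s->{min} = $ms if $ms < $s->{min};
                $s->{max} = $ms if $ms > $s->{max};
            }

            # 4) once a minute, a flush step (sketched further down) would
            #    push %stats to the central monitoring DB and reset it
        }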

So, supposing a given query runs 10 times per second, we might 
report on it once a minute, and each report would carry the 
aggregate max/min/stddev/count/avg for the roughly 600 instances 
seen in the preceding minute.  Once those numbers are in a 
central database, I could easily identify performance trouble 
spots.
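
The once-a-minute flush I have in mind would derive avg and 
stddev from the running sums and push one row per interesting 
query into the central database.  Again, only a sketch: 
"query_stats" is a hypothetical results table, not something 
that exists yet.

        # Sketch of the periodic flush (step 4 above).  "query_stats" is
        # a hypothetical central table, e.g.
        #
        #   create table query_stats (
        #           regex           varchar,
        #           reported        timestamp,
        #           count           integer,
        #           min             float,
        #           max             float,
        #           avg             float,
        #           stddev          float
        #   )
        use strict;
        use warnings;

        sub flush_stats {
            my ($dbh, $stats) = @_;        # $stats: the %stats hashref above
            my $sth = $dbh->prepare(
                'INSERT INTO query_stats
                   (regex, reported, count, min, max, avg, stddev)
                 VALUES (?, now(), ?, ?, ?, ?, ?)');

            for my $regex (keys %$stats) {
                my $s   = $stats->{$regex};
                my $n   = $s->{count};
                my $avg = $s->{sum} / $n;

                # sample variance from count/sum/sum-of-squares
                my $var = $n > 1 ? ($s->{sumsq} - $n * $avg * $avg) / ($n - 1) : 0;
                $var = 0 if $var < 0;      # guard against rounding noise
                my $stddev = sqrt($var);

                $sth->execute($regex, $n, $s->{min}, $s->{max}, $avg, $stddev);
            }
            %$stats = ();                  # start the next reporting window fresh
        }

Calling that every 60 seconds from the tail loop would yield one 
row per interesting query per minute, which is then easy to 
graph or query for the slow spots.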

How are others handling this problem?  Other ideas?

Thanks.

Ed

