Hi all. Im looking for some help to speed up some queries I have to run. Ill try to 
briefly describe the setup, but mostly try let my code explain itself.

There are three tables described below from a database that describes the execution of 
a Java program. class_loads records details of each class. methods records details of 
each method. Each method is associated with a record from class_loads via the class_id 
field. Finally method_executions contains info about method executions by mapping 
method_ids (which can be used to join with methods table) to event_numbers which 
represent time.

mysql> describe class_loads;
+--------------+---------------------+------+-----+---------+-------+
| Field        | Type                | Null | Key | Default | Extra |
+--------------+---------------------+------+-----+---------+-------+
| event_number | bigint(20) unsigned |      | PRI | 0       |       |
| class_id     | int(10) unsigned    |      | PRI | 0       |       |
| class_name   | varchar(255)        |      | MUL |         |       |
| source_name  | varchar(255)        |      |     |         |       |
+--------------+---------------------+------+-----+---------+-------+
5 rows in set (0.08 sec)

mysql> describe methods;
+------------------+------------------+------+-----+---------+-------+
| Field            | Type             | Null | Key | Default | Extra |
+------------------+------------------+------+-----+---------+-------+
| method_id        | int(10) unsigned |      | PRI | 0       |       |
| class_id         | int(10) unsigned |      | MUL | 0       |       |
| method_name      | varchar(255)     |      |     |         |       |
| method_signature | varchar(255)     |      |     |         |       |
+------------------+------------------+------+-----+---------+-------+
4 rows in set (0.00 sec)

mysql> describe method_executions;
+--------------+---------------------+------+-----+---------+-------+
| Field        | Type                | Null | Key | Default | Extra |
+--------------+---------------------+------+-----+---------+-------+
| event_number | bigint(20) unsigned |      | PRI | 0       |       |
| method_id    | int(10) unsigned    |      |     | 0       |       |
+--------------+---------------------+------+-----+---------+-------+
3 rows in set (0.00 sec)


The objective is to compute what I would call an execution frequency matrix (time 
intervals as rows, methods as columns), i.e. for some given event_number/time interval 
[x, y), I need to determine how often each method with a name conforming to a 
specified regex is executed within that interval.

Heres what I have done. Being a novice Im sure its painfully ineffecient. The 
bottleneck is indicated below. To give an idea of the table sizes, in a fairly small 
example we have 199, 2423 and 2434194 rows in class_loads, methods and 
method_executions respectively.


CREATE TEMPORARY TABLE temp_class_loads (
 class_id INT NOT NULL,
 class_name VARCHAR(255) NOT NULL,
 source_name VARCHAR(255) NOT NULL)
 
CREATE TEMPORARY TABLE temp_methods (
 method_id INT NOT NULL,
 source_name VARCHAR(255) NOT NULL,
 class_name VARCHAR(255) NOT NULL,
 method_name VARCHAR(255) NOT NULL,
 method_signature VARCHAR(255) NOT NULL)

LOCK TABLES class_loads READ, methods READ, method_entries READ

INSERT INTO temp_class_loads (
 SELECT class_id, class_name, source_name
 FROM class_loads
 WHERE class_name LIKE "com.%")
 
INSERT INTO temp_methods (
 SELECT method_id, source_name, class_name, method_name, method_signature
  FROM temp_class_loads, methods
  WHERE temp_class_loads.class_id = methods.class_id
  GROUP BY method_id)

{do the following statements for various [x, y) event_number ranges - The third one is 
the bottleneck!}

 CREATE TEMPORARY TABLE interval_method_freqs (
  method_id INT NOT NULL,
  count INT NOT NULL)
  
 INSERT INTO interval_method_freqs (
  SELECT method_id, COUNT(*)
   FROM method_executions
   WHERE method_executions.event_number BETWEEN x AND y
   GROUP BY method_id)
 
 SELECT SUM(count)
  FROM temp_methods LEFT JOIN interval_method_freqs USING(method_id)
  GROUP BY source_name, class_name, method_name, method_signature
  
 DROP TABLE interval_method_freqs
 
{loop}

UNLOCK TABLES

DROP TABLE temp_methods

DROP TABLE temp_class_loads



If you got this far thanks for making the effort! I look forward to reading suggested 
improvements.

Regards,
A.

Reply via email to