Hi all. Im looking for some help to speed up some queries I have to run. Ill try to briefly describe the setup, but mostly try let my code explain itself.
There are three tables described below from a database that describes the execution of a Java program. class_loads records details of each class. methods records details of each method. Each method is associated with a record from class_loads via the class_id field. Finally method_executions contains info about method executions by mapping method_ids (which can be used to join with methods table) to event_numbers which represent time. mysql> describe class_loads; +--------------+---------------------+------+-----+---------+-------+ | Field | Type | Null | Key | Default | Extra | +--------------+---------------------+------+-----+---------+-------+ | event_number | bigint(20) unsigned | | PRI | 0 | | | class_id | int(10) unsigned | | PRI | 0 | | | class_name | varchar(255) | | MUL | | | | source_name | varchar(255) | | | | | +--------------+---------------------+------+-----+---------+-------+ 5 rows in set (0.08 sec) mysql> describe methods; +------------------+------------------+------+-----+---------+-------+ | Field | Type | Null | Key | Default | Extra | +------------------+------------------+------+-----+---------+-------+ | method_id | int(10) unsigned | | PRI | 0 | | | class_id | int(10) unsigned | | MUL | 0 | | | method_name | varchar(255) | | | | | | method_signature | varchar(255) | | | | | +------------------+------------------+------+-----+---------+-------+ 4 rows in set (0.00 sec) mysql> describe method_executions; +--------------+---------------------+------+-----+---------+-------+ | Field | Type | Null | Key | Default | Extra | +--------------+---------------------+------+-----+---------+-------+ | event_number | bigint(20) unsigned | | PRI | 0 | | | method_id | int(10) unsigned | | | 0 | | +--------------+---------------------+------+-----+---------+-------+ 3 rows in set (0.00 sec) The objective is to compute what I would call an execution frequency matrix (time intervals as rows, methods as columns), i.e. for some given event_number/time interval [x, y), I need to determine how often each method with a name conforming to a specified regex is executed within that interval. Heres what I have done. Being a novice Im sure its painfully ineffecient. The bottleneck is indicated below. To give an idea of the table sizes, in a fairly small example we have 199, 2423 and 2434194 rows in class_loads, methods and method_executions respectively. CREATE TEMPORARY TABLE temp_class_loads ( class_id INT NOT NULL, class_name VARCHAR(255) NOT NULL, source_name VARCHAR(255) NOT NULL) CREATE TEMPORARY TABLE temp_methods ( method_id INT NOT NULL, source_name VARCHAR(255) NOT NULL, class_name VARCHAR(255) NOT NULL, method_name VARCHAR(255) NOT NULL, method_signature VARCHAR(255) NOT NULL) LOCK TABLES class_loads READ, methods READ, method_entries READ INSERT INTO temp_class_loads ( SELECT class_id, class_name, source_name FROM class_loads WHERE class_name LIKE "com.%") INSERT INTO temp_methods ( SELECT method_id, source_name, class_name, method_name, method_signature FROM temp_class_loads, methods WHERE temp_class_loads.class_id = methods.class_id GROUP BY method_id) {do the following statements for various [x, y) event_number ranges - The third one is the bottleneck!} CREATE TEMPORARY TABLE interval_method_freqs ( method_id INT NOT NULL, count INT NOT NULL) INSERT INTO interval_method_freqs ( SELECT method_id, COUNT(*) FROM method_executions WHERE method_executions.event_number BETWEEN x AND y GROUP BY method_id) SELECT SUM(count) FROM temp_methods LEFT JOIN interval_method_freqs USING(method_id) GROUP BY source_name, class_name, method_name, method_signature DROP TABLE interval_method_freqs {loop} UNLOCK TABLES DROP TABLE temp_methods DROP TABLE temp_class_loads If you got this far thanks for making the effort! I look forward to reading suggested improvements. Regards, A.