Hello, I'm experimenting with a large table (currently 9.7 M rows / 500 MB of disk space) that stores measurement results, and I sometimes (!) get slow queries for no apparent reason. The table looks like this:
CREATE TABLE values (
  sourceid int(11) NOT NULL default '0',
  dummy timestamp(14) NOT NULL,
  start timestamp(14) NOT NULL,
  newstart int(11) NOT NULL default '0',
  duration mediumint(9) NOT NULL default '0',
  value float NOT NULL default '0',
  valuetype tinyint(4) NOT NULL default '0',
  PRIMARY KEY (valuetype, sourceid, newstart),
  KEY sourceid (sourceid),
  KEY start (start)
) TYPE=MyISAM;

The fields 'dummy' and 'start' are deprecated and will be removed as soon as all my scripts have been altered to match the new structure. I think they're not relevant to my problem (but who knows?).

The index I use is the primary key; it consists of three fields: valuetype (either 0 or 8 so far), sourceid (approx. 500 different values) and newstart (UNIX timestamps, hardly any duplicates at all), in that order.

I'm using temporary, disk-based MyISAM tables for calculations. The first step of that process is to load the desired part of the data table into the temporary table with a statement like:

INSERT INTO result_table ...
SELECT result_1, result_2, ... result_n
FROM values
WHERE sourceid IN (1,2,3)
  AND (newstart + duration) > some_UNIX_timestamp
  AND newstart < some_other_UNIX_timestamp
  AND valuetype = 8;
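To see what the optimizer decides, I run EXPLAIN on the SELECT part. A minimal sketch (the timestamps are placeholders as above, and I've put real column names in place of result_1 ... result_n):

EXPLAIN
SELECT newstart, duration, value     -- actual columns instead of result_1 ... result_n
FROM values
WHERE sourceid IN (1,2,3)
  AND (newstart + duration) > some_UNIX_timestamp   -- placeholder
  AND newstart < some_other_UNIX_timestamp          -- placeholder
  AND valuetype = 8;

When things behave, I see modest 'rows' estimates (presumably with 'key' = PRIMARY); I assume a 'type' of ALL with 'key' = NULL would correspond to the full table scan described below.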
Despite several steps of optimization of the whole process (involving three more queries and another temporary table), an average total execution time of around 3-5 seconds is perfectly normal, as are peak times of up to 30 seconds. For some reason, though, I occasionally get execution times of several minutes for the above query ALONE (the other queries are much faster), without any obvious correlation to the number of rows retrieved.

To get a little more statistics, I set long_query_time down to 30 seconds, which made all the interesting queries show up in the slow query log. Below are some of the logged results for 'sourceid IN (429,430,431) ... AND valuetype=0':

+-------+--------------------------+---------------------------+------------+---------------+---------------------+
| query | FROM_UNIXTIME(left_edge) | FROM_UNIXTIME(right_edge) | exec. time | rows examined | execution date      |
+-------+--------------------------+---------------------------+------------+---------------+---------------------+
|     1 | 2003-10-01 11:51:07      | 2003-10-08 11:51:07       | >214 sec.  | 5'946'933     | 2003-10-08 11:55:43 |
|     2 | 2003-10-01 11:51:07      | 2003-10-08 11:51:07       | >153 sec.  | 4'898'654     | 2003-10-08 11:55:55 |
|     3 | 2003-07-07 10:01:28      | 2003-10-08 12:13:19       | 52 sec.    | 97'062        | 2003-10-08 12:17:46 |
+-------+--------------------------+---------------------------+------------+---------------+---------------------+

And some information on the table's content (valuetype=0, as of 2003-10-08 16:10:17):

+----------+------------------------------+------------------------------+-----------------+
| sourceid | FROM_UNIXTIME(MIN(newstart)) | FROM_UNIXTIME(MAX(newstart)) | COUNT(newstart) |
+----------+------------------------------+------------------------------+-----------------+
|      429 | 2003-07-07 10:04:04          | 2003-10-08 16:05:18          |           30059 |
|      430 | 2003-07-07 10:02:55          | 2003-10-08 16:07:28          |           33155 |
|      431 | 2003-07-07 10:01:28          | 2003-10-08 16:07:34          |           28209 |
+----------+------------------------------+------------------------------+-----------------+

Obviously, in the two identical queries above (1 and 2), MySQL was attempting a full table scan, whereas in query 3 only the rows with a matching sourceid were scanned. (Queries 1 and 2 were stopped by a quick-and-dirty safety script of mine that wakes up every minute and kills every process older than 180 seconds. I know this isn't good practice, but I can't have massive queries blocking incoming updates. The second table shows fewer than 97'062 records because rows with valuetype<>0 weren't counted.)

As the second table suggests, query 3 retrieved all rows with valuetype=0, whereas queries 1 and 2 attempted to get only the values of the last 7 days. One might think this could be a hint, but, as mentioned above, most of the time the queries run smoothly, and querying some particular portion of data from the middle of the table usually works fine. Also, EXPLAIN shows very modest row numbers for all queries. CHECK TABLE and REPAIR TABLE just give 'status: OK' and are not of much use, though it seems to me that checking / repairing the table helps for a while.
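If checking / repairing helps because it refreshes the index statistics as a side effect, then rebuilding just the key distribution ought to be a much cheaper test. If I read the manual correctly, that would be (a sketch; I haven't tried this systematically):

ANALYZE TABLE values;
SHOW INDEX FROM values;   -- shows the cardinality figures the optimizer works with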
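As a stopgap, I suppose I could also take the decision away from the optimizer entirely. According to the manual, FORCE INDEX is available as of 4.0.9 (I'm on 4.0.15), so something like the following should at least rule out the full table scan (again only a sketch, with the placeholders from above):

SELECT result_1, result_2, ... result_n
FROM values FORCE INDEX (PRIMARY)
WHERE valuetype = 0
  AND sourceid IN (429,430,431)
  AND (newstart + duration) > some_UNIX_timestamp
  AND newstart < some_other_UNIX_timestamp;

But that would only paper over whatever is going wrong, so I'd still like to understand it.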
I was able to reproduce the phenomenon repeatedly with the very same query from the slow query log. Then I tried altering the WHERE clause to match the field order in the index and was glad to see the query go through as fast as usual. Then I did the cross-check and, voilà, the original version of the query now magically went through just as fast as well. Meanwhile, some other queries had found their way into the slow query log... I can't really imagine that this is a bug, but I have no idea what my mistake is. Does the field order in the WHERE clause affect the speed of the query in any way? Can anybody explain the inconsistency in reproducing the phenomenon?

Now, I'm aware that I have a rather large table and that the calculations I'm doing require some processing time. Additionally, there are between 10 and 100 INSERTs into the table every minute, which I know doesn't make things any better. But turning off the data feed did not solve the problem, and most queries seem to have no problem at all. There's hardly anything else running on the system, and mysqld and InnoDB together are granted about 50% of all memory. I'm running a 4.0.15 binary distribution on SuSE Linux. My settings are as follows (InnoDB included just in case; I only use MyISAM in the above example):

key_buffer=64M
max_allowed_packet=1M
table_cache=64
sort_buffer=512K
net_buffer_length=8K
myisam_sort_buffer_size=8M
innodb_mirrored_log_groups=1
innodb_log_files_in_group=3
innodb_log_file_size=10M
innodb_log_buffer_size=8M
innodb_flush_log_at_trx_commit=0
innodb_log_archive=0
innodb_buffer_pool_size=30M
innodb_additional_mem_pool_size=5M
innodb_file_io_threads=4
innodb_lock_wait_timeout=50

Any suggestions how to monitor the queries? I haven't installed --with-debug, unfortunately. Is there a way to add the debugging kit now? Could it be of any use?

I am in the unfortunate situation of trying to improve the database architecture in a live production environment to match our grown needs, while the current version is already close to its limits and needs care on a daily basis. I could use some tips on that as well...

Thanks for any help. 'My Latin has come to an end', as we Germans would say.

Hanno

------------------------------------------------------------
Hanno Fietz
dezem GmbH
Lohmeyerstr. 9
10587 Berlin
Tel.: 030 / 34 70 50 22
Fax: 030 / 34 70 50 21
www.dezem.de
------------------------------------------------------------