Is there any way to optimize a range query that includes an ORDER BY with keys from two different tables? I'm running MySQL 4.1.18 on FreeBSD.
I've been struggling with some queries that are _incredibly_ slow--from 1-5 minutes on slowish but decent hardware. When I try versions without the ORDER BY they're fast, and whatever tweaks I do to the indexing do speed things up even faster, but have no effect on the situation with the ORDER BY. The docs suggest that indexes can't help here, but I find it hard to believe that sorting on keys in different tables is that rare a requirement; is there any way to restructure the query to speed things up? To take a few simple examples (most actual queries are more complicated, but the slowdown isn't a result of the complication), I have three tables (edited to remove fields not used in these examples), "part" has_many "quotation" has_many "cwGroup": mysql> desc part; +-----------+------------------+------+-----+---------+----------------+ | Field | Type | Null | Key | Default | Extra | +-----------+------------------+------+-----+---------+----------------+ | id | int(10) unsigned | | PRI | NULL | auto_increment | | cit | text | YES | | NULL | | | d | int(11) | YES | MUL | NULL | | +-----------+------------------+------+-----+---------+----------------+ mysql> desc quotation; +---------+------------------+------+-----+---------+----------------+ | Field | Type | Null | Key | Default | Extra | +---------+------------------+------+-----+---------+----------------+ | id | int(10) unsigned | | PRI | NULL | auto_increment | | part_id | int(10) unsigned | | MUL | 0 | | | qt | text | YES | | NULL | | +---------+------------------+------+-----+---------+----------------+ mysql> desc cwGroup; +--------------+------------------+------+-----+---------+----------------+ | Field | Type | Null | Key | Default | Extra | +--------------+------------------+------+-----+---------+----------------+ | id | int(10) unsigned | | PRI | NULL | auto_increment | | quotation_id | int(10) unsigned | | MUL | 0 | | | cw | varchar(100) | YES | | NULL | | | stripped_cw | varchar(100) | YES | MUL | NULL | | +--------------+------------------+------+-----+---------+----------------+ The rough numbers are 100K rows in "part", 2.7M in "quotation", and 3.3M in "cwGroup". For example, the following query, which would return 460 rows without the LIMIT, takes about 51s: SELECT part.d, quotation.qt, cwGroup.cw FROM cwGroup JOIN quotation ON (quotation.id = cwGroup.quotation_id ) JOIN part ON ( part.id = quotation.part_id ) WHERE ( part.d BETWEEN 1950 AND 1970 AND cwGroup.stripped_cw LIKE 'man%' ) ORDER BY part.d, cwGroup.stripped_cw LIMIT 25 and the EXPLAIN for it looks like: *************************** 1. row *************************** id: 1 select_type: SIMPLE table: cwGroup type: range possible_keys: quotation_id,stripped_cw key: stripped_cw key_len: 101 ref: NULL rows: 8489 Extra: Using where; Using temporary; Using filesort *************************** 2. row *************************** id: 1 select_type: SIMPLE table: quotation type: eq_ref possible_keys: PRIMARY,part_id key: PRIMARY key_len: 4 ref: rqs_incs.cwGroup.quotation_id rows: 1 Extra: *************************** 3. row *************************** id: 1 select_type: SIMPLE table: part type: eq_ref possible_keys: PRIMARY,d key: PRIMARY key_len: 4 ref: rqs_incs.quotation.part_id rows: 1 Extra: Using where Without the ORDER BY it drops to about 1.5s and EXPLAIN no longer shows the use of "temporary" and "filesort". An even worse example, but unfortunately a common need in this app, is a query that returns a lot of rows (but which I'm paging through, of course), such as: SELECT part.d, quotation.qt, cwGroup.cw FROM cwGroup JOIN quotation ON (quotation.id = cwGroup.quotation_id ) JOIN part ON ( part.id = quotation.part_id ) WHERE ( cwGroup.stripped_cw BETWEEN 'ant' AND 'asx' ) ORDER BY cwGroup.stripped_cw, part.d LIMIT 25 This takes 2m31s to execute, obviously due to the large number of rows (the total result is about 47K rows), but a similar query without the ORDER BY took only .08s (though a COUNT(*) took a similar 2-3m): *************************** 1. row *************************** id: 1 select_type: SIMPLE table: cwGroup type: range possible_keys: quotation_id,stripped_cw key: stripped_cw key_len: 101 ref: NULL rows: 54745 Extra: Using where; Using temporary; Using filesort *************************** 2. row *************************** id: 1 select_type: SIMPLE table: quotation type: eq_ref possible_keys: PRIMARY,part_id key: PRIMARY key_len: 4 ref: rqs_incs.cwGroup.quotation_id rows: 1 Extra: *************************** 3. row *************************** id: 1 select_type: SIMPLE table: part type: eq_ref possible_keys: PRIMARY key: PRIMARY key_len: 4 ref: rqs_incs.quotation.part_id rows: 1 Extra: Other queries, as said, are more complicated, adding additional columns in the searches or joining in other tables (sometimes with range searches here as well), but these don't seem to affect the underlying problem. Adding multiple-column indexes also doesn't affect things in any significant way. Any thoughts? I clearly need a significant speed improvement, not just a tweak like making a bigger sort_buffer_size or getting faster disks. Thanks for reading this far. Jesse Sheidlower -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED]