Is there any way to optimize a range query that includes
an ORDER BY with keys from two different tables? I'm running
MySQL 4.1.18 on FreeBSD.
I've been struggling with some queries that are _incredibly_
slow--from 1-5 minutes on slowish but decent hardware. When I
try versions without the ORDER BY they're fast, and whatever
tweaks I do to the indexing do speed things up even faster,
but have no effect on the situation with the ORDER BY. The
docs suggest that indexes can't help here, but I find it hard
to believe that sorting on keys in different tables is that
rare a requirement; is there any way to restructure the
query to speed things up?
To take a few simple examples (most actual queries are more
complicated, but the slowdown isn't a result of the
complication), I have three tables (edited to remove fields
not used in these examples), part has_many quotation
has_many cwGroup:
mysql desc part;
+---+--+--+-+-++
| Field | Type | Null | Key | Default | Extra |
+---+--+--+-+-++
| id| int(10) unsigned | | PRI | NULL| auto_increment |
| cit | text | YES | | NULL||
| d | int(11) | YES | MUL | NULL||
+---+--+--+-+-++
mysql desc quotation;
+-+--+--+-+-++
| Field | Type | Null | Key | Default | Extra |
+-+--+--+-+-++
| id | int(10) unsigned | | PRI | NULL| auto_increment |
| part_id | int(10) unsigned | | MUL | 0 ||
| qt | text | YES | | NULL||
+-+--+--+-+-++
mysql desc cwGroup;
+--+--+--+-+-++
| Field| Type | Null | Key | Default | Extra |
+--+--+--+-+-++
| id | int(10) unsigned | | PRI | NULL| auto_increment |
| quotation_id | int(10) unsigned | | MUL | 0 ||
| cw | varchar(100) | YES | | NULL||
| stripped_cw | varchar(100) | YES | MUL | NULL||
+--+--+--+-+-++
The rough numbers are 100K rows in part, 2.7M in quotation, and
3.3M in cwGroup.
For example, the following query, which would return 460 rows
without the LIMIT, takes about 51s:
SELECT part.d, quotation.qt, cwGroup.cw
FROM cwGroup
JOIN quotation ON (quotation.id = cwGroup.quotation_id )
JOIN part ON ( part.id = quotation.part_id )
WHERE ( part.d BETWEEN 1950 AND 1970 AND cwGroup.stripped_cw LIKE 'man%' )
ORDER BY part.d, cwGroup.stripped_cw
LIMIT 25
and the EXPLAIN for it looks like:
*** 1. row ***
id: 1
select_type: SIMPLE
table: cwGroup
type: range
possible_keys: quotation_id,stripped_cw
key: stripped_cw
key_len: 101
ref: NULL
rows: 8489
Extra: Using where; Using temporary; Using filesort
*** 2. row ***
id: 1
select_type: SIMPLE
table: quotation
type: eq_ref
possible_keys: PRIMARY,part_id
key: PRIMARY
key_len: 4
ref: rqs_incs.cwGroup.quotation_id
rows: 1
Extra:
*** 3. row ***
id: 1
select_type: SIMPLE
table: part
type: eq_ref
possible_keys: PRIMARY,d
key: PRIMARY
key_len: 4
ref: rqs_incs.quotation.part_id
rows: 1
Extra: Using where
Without the ORDER BY it drops to about 1.5s and EXPLAIN no longer
shows the use of temporary and filesort.
An even worse example, but unfortunately a common need in this
app, is a query that returns a lot of rows (but which I'm paging
through, of course), such as:
SELECT part.d, quotation.qt, cwGroup.cw
FROM cwGroup
JOIN quotation ON (quotation.id = cwGroup.quotation_id )
JOIN part ON ( part.id = quotation.part_id )
WHERE ( cwGroup.stripped_cw BETWEEN 'ant' AND 'asx' )
ORDER BY cwGroup.stripped_cw, part.d
LIMIT 25
This takes 2m31s to execute, obviously due to the large number
of rows (the total result is about 47K rows), but a similar
query without the ORDER BY took only .08s (though a COUNT(*)
took a similar 2-3m):
*** 1. row ***
id: 1
select_type: SIMPLE
table: cwGroup
type: range
possible_keys: quotation_id,stripped_cw
key: stripped_cw
key_len: 101
ref: NULL