I haven't tried it in drizzle yet, these results are from mysql. I point it out
to you guys because it could be I assume they'll come out similarly. I'll try
to get a full dump into drizzle in the next few days.
Slow:
mysql> EXPLAIN SELECT * FROM user_table WHERE uid < 1000000 ORDER BY id DESC
LIMIT 0,12\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: user_table
type: range
possible_keys: uid
key: uid
key_len: 8
ref: NULL
rows: 48988
Extra: Using where; Using filesort
1 row in set (0.00 sec)
Fast:
mysql> EXPLAIN SELECT * FROM user_table INNER JOIN ( SELECT id FROM
user_table WHERE uid < 1000000 ORDER BY id DESC LIMIT 0,12 ) as SUBSELECT_TABLE
ON user_table.id = SUBSELECT_TABLE.id\G
*************************** 1. row ***************************
id: 1
select_type: PRIMARY
table: <derived2>
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 12
Extra:
*************************** 2. row ***************************
id: 1
select_type: PRIMARY
table: user_table
type: eq_ref
possible_keys: PRIMARY
key: PRIMARY
key_len: 8
ref: SUBSELECT_TABLE.id
rows: 1
Extra:
*************************** 3. row ***************************
id: 2
select_type: DERIVED
table: user_table
type: index
possible_keys: uid
key: PRIMARY
key_len: 8
ref: NULL
rows: 48985
Extra:
3 rows in set (0.04 sec)
Aggregation functions are especially ripe for optimization. Here's another
(non-limit) ugly query a developer had written which did very well when
changing SELECT * to SELECT id + inner join:
Slow:
mysql> EXPLAIN SELECT `tbl`.* FROM (SELECT `user_table`.* FROM `user_table`
WHERE (uid IN
(1022220528,516975293,610868834,1369254286,678430330,760683910,1101602962,840005,1539880648,536801853,533871879,1505692116,724332436,31803759,506549838,1507890014,593486534,1052379946,503592215,194600927,167800172,22406819,9433878,2716675,1120200072,517782135,729065959,620450950,32800651,672604205,588236068,33606106,1404468345,12633743,200800506,1413614990,585172327,1162999061,1066765095,1028653783,544385233,662588027,1137254943,1267449968,1161740972,760405590,1545259569,691569675,615948798,1189330419,595498800,546053510,1563824045,1160480409,162801352,1232341358,829698629,1486335885,1058605840,1039020025,701572979,513147932,614621086,1292596475,731515619,1058231753,1120200213,646377270,71701242,502763017,1173077531,710880495,85600035,193302738,11603373,1094790282,501483798,733680612,1841388912,1111793151,572545213,645570291,100000437158365,596083147,587246958,724405544,641236986,730519392,1168848663,146900336,1062266947,518095800,596610029,502844349,501817153,100000257743929,100000558823049,100000379524868,25833337,611837485,621991585,9318548,1472361327,1787625275,788507322,100000177520280,146900181,113500364,1379993602,766300262))
AND (`affiliations` IS NOT NULL) AND (`affiliations` <> "[]" ) ORDER BY `id`
DESC) AS `tbl` GROUP BY `uid`\G
*************************** 1. row ***************************
id: 1
select_type: PRIMARY
table: <derived2>
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 1635
Extra: Using temporary; Using filesort
*************************** 2. row ***************************
id: 2
select_type: DERIVED
table: user_table
type: ALL
possible_keys: uid
key: NULL
key_len: NULL
ref: NULL
rows: 2900
Extra: Using filesort
2 rows in set (0.23 sec)
This query takes about 23 seconds. Compare it to this query, which takes _0.23
seconds_:
Fast:
mysql> EXPLAIN SELECT * FROM user_table INNER JOIN ( SELECT MAX(id) as
id FROM `user_table` WHERE (uid IN
(1022220528,516975293,610868834,1369254286,678430330,760683910,1101602962,840005,1539880648,536801853,533871879,1505692116,724332436,31803759,506549838,1507890014,593486534,1052379946,503592215,194600927,
167800172,22406819, 9433878,2716675,1120200072,
517782135,729065959,620450950,32800651,672604205,588236068,
33606106,1404468345,12633743,200800506,1413614990,585172327,1162999061,1066765095,1028653783,544385233,662588027,1137254943,1267449968,1161740972,
760405590,1545259569, 691569675,615948798,
1189330419,595498800,546053510,1563824045,1160480409,
162801352,1232341358,829698629,1486335885,1058605840,1039020025,
701572979,513147932,614621086,1292596475,731515619,1058231753,
1120200213,646377270, 71701242,502763017,1173077531,710880495,
85600035,193302738,11603373,1094790282, 501483798,733680612,
1841388912,1111793151,572545213,
645570291,100000437158365,596083147,587246958,724405544,
641236986,730519392,1168848663,146900336,1062266947,518095800,
596610029,502844349,501817153, 100000257743929,100000558823049,
100000379524868,25833337,611837485,621991585, 9318548,
1472361327,1787625275,788507322,100000177520280,146900181,113500364,1379993602,766300262))
AND (`affiliations` IS NOT NULL) AND (`affiliations` != "[]" ) GROUP BY uid )
as SUBSELECT_TABLE ON user_table.id = SUBSELECT_TABLE.id\G
*************************** 1. row ***************************
id: 1
select_type: PRIMARY
table: <derived2>
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 87
Extra:
*************************** 2. row ***************************
id: 1
select_type: PRIMARY
table: user_table
type: eq_ref
possible_keys: PRIMARY
key: PRIMARY
key_len: 8
ref: SUBSELECT_TABLE.id
rows: 1
Extra:
*************************** 3. row ***************************
id: 2
select_type: DERIVED
table: user_table
type: range
possible_keys: uid
key: uid
key_len: 8
ref: NULL
rows: 2900
Extra: Using where
3 rows in set (0.02 sec)
I've even seen "optimized away" on a few of the explains before. I'm having a
hard time finding the query that benefited that much from it. If I do, I'll
post it up here.
-Wilfried
On Apr 17, 2010, at 13:51 , Jay Pipes wrote:
> Hi! Interesting. Can you post the EXPLAIN SELECT for each of the
> statements? Also, what version of Drizzle are you using?
>
> Thanks!
>
> jay
>
> On Sat, Apr 17, 2010 at 1:50 PM, Wilfried Schobeiri <[email protected]>
> wrote:
>> Hi all! Good hanging out with those of you at MySQLConf & Drizzle Day this
>> past week. Hope you guys got home safe.
>>
>> I give you a sample query. id is primary key, uid is indexed. The table has
>> about 20 non-indexed columns. on top of taht.
>>
>> SELECT * FROM user_facebook WHERE uid < 1000000 ORDER BY id DESC LIMIT 0,12
>>
>> (2 min 24.54 sec)
>>
>> vs
>>
>> SELECT * FROM user_facebook INNER JOIN
>> (
>> SELECT id FROM user_facebook WHERE uid < 1000000 ORDER BY id DESC
>> LIMIT 0,12
>> )
>> as SUBSELECT_TABLE
>> ON user_facebook.id = SUBSELECT_TABLE.id
>>
>> (1.95 sec)
>>
>>
>> Same results, substantial performance boost. Worth noting is that the more
>> complicated or large the where set is (especially if there's an IN() set),
>> or if there's any aggregation going on, the delayed join version gets almost
>> exponentially faster. One could see 10-1000x increases in query time.
>>
>> I'm not sure what's going on with the first query (or if we can do anything
>> about it), but it seems like it's trying to do the order and limiting on the
>> entire column set instead of using only what's applicable to do the ordering
>> and limiting. If we can optimize around the primary key, perhaps there can
>> be some massive performance gains out of it.
>>
>> -Wilfried (nphase)
>>
>>
>> _______________________________________________
>> Mailing list: https://launchpad.net/~drizzle-discuss
>> Post to : [email protected]
>> Unsubscribe : https://launchpad.net/~drizzle-discuss
>> More help : https://help.launchpad.net/ListHelp
>>
_______________________________________________
Mailing list: https://launchpad.net/~drizzle-discuss
Post to : [email protected]
Unsubscribe : https://launchpad.net/~drizzle-discuss
More help : https://help.launchpad.net/ListHelp