Conceptually, what I'd like to do is join two tables on a hostid for a result like this non-join version where hostid in the 'IN' come from the other table.
mysql> explain select avg(load5min) from kstatHostData_20050513 where hostid IN (250, 400) and localdate > '2005-06-06 13:00:00'\G *************************** 1. row *************************** id: 1 select_type: SIMPLE table: kstatHostData_20050513 type: range possible_keys: hostid_gmtdate_idx,hostid_localdate_idx key: hostid_localdate_idx key_len: 11 ref: NULL rows: 385 Extra: Using where 1 row in set (0.01 sec) This works great. It correctly picks the hostid_localdate_idx index and the query is fast. If I convert this to JOIN a table which contains exactly the same values ( two rows: 250, 400) as in the IN clause above, mysql picks either key and only using hostid portion to complete the join, e.g., mysql> explain select avg(load5min) from kstatHostData_20050513 ks, hostinfo hi WHERE ks.hostid = hi.hostid and localdate > '2005-06-06 13:00:00'\G *************************** 1. row *************************** id: 1 select_type: SIMPLE table: hi type: index possible_keys: bar_idx key: bar_idx key_len: 4 ref: NULL rows: 2 Extra: Using index *************************** 2. row *************************** id: 1 select_type: SIMPLE table: ks type: ref possible_keys: hostid_gmtdate_idx,hostid_localdate_idx key: hostid_localdate_idx key_len: 3 ref: postinistats.hi.hostid rows: 32338 Extra: Using where 2 rows in set (0.00 sec) mysql> This query, needless to say, is abysmally slow. The actual number of applicable rows needed, as indicated in the first example where the full index is used, is approximately 400. It makes sense that either index will do as they both start with the hostid and you can only use one index, I'm just not sure how to work around it when the WHERE date condition signifincantly limits the result. I tried a subquery which yield results that actually appear worst than the join. (See below). Short of populating my IN block in the application with a pre-query, or re-organizing data into summaries to limit rows, I'm out of ideas on how to improve this. Maybe I'm missing some obvious solution. If anyone has any thoughts or explanations, I'd really appeciate them. Mysql is 4.1.12-standard. Thanks, John mysql> explain select avg(load5min) from kstatHostData_20050513 WHERE hostid IN ( SELECT hostid FROM hostinfo) AND localdate > '2005-06-06 13:00:00'\G *************************** 1. row *************************** id: 1 select_type: PRIMARY table: kstatHostData_20050513 type: ALL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: 26797461 Extra: Using where *************************** 2. row *************************** id: 2 select_type: DEPENDENT SUBQUERY table: hostinfo type: index_subquery possible_keys: bar_idx key: bar_idx key_len: 4 ref: func rows: 2 Extra: Using index 2 rows in set (0.00 sec) mysql> -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED]