As others have mentioned, mysql doesn't handle IN queries efficiently. You can try changing it to using derived tables/subqueries. I did some quick tests and the explain shows a different analysis.
select comment, gid, date_posted from tbl JOIN (select max(id) as mid from tbl where gid in ( 1234,2345,3456 .. 7890 ) group by gid) as maxids ON tpl.id=maxids.mid; You're really just changing one of the IN statements to a join. My quick tests showed that a full table is still being done, but it's now on the derived table (maxids), which would only be as large as how many gids you are searching on (50?). I tested this on two related tables, 170K in one and 90K in the other. Your tables are currently much smaller, so mysql may come up with a different execution path. For me, the join syntax was far faster. On 9/21/07, Erik Giberti <[EMAIL PROTECTED]> wrote: > Hello everyone, > > The app server in this case is PHP, and the database is MySQL 5.0.22 > on RedHat linux > > I've got a database with about 7.5K records in it that I expect to > start growing very quickly ~10-12K records per day. The storage > engine is InnoDB. This table is growing quickly and will continue to > grow for a long time. This table stores comments (as you can see from > the structure) and is being used to display a list of comments based > on a users affiliations. > > The structure is approximately this - I'm leaving out unrelated columns: > > id int - primary key - auto increment > gid bigint - indexed > comment varchar > date_posted timestamp > > I run a query with the following form > > select comment, gid, date_posted from tbl where id in (select max(id) > from tbl where gid in ( 1234,2345,3456 .. 7890 ) group by gid); > > I have an index on gid and id is the primary key > > When I describe the query with about 50 gid values inserted (where > indicated above) I get the following: > > +----+--------------------+-------+-------+---------------+--------- > +---------+------+------+--------------------------+ > | id | select_type | table | type | possible_keys | key | > key_len | ref | rows | Extra | > +----+--------------------+-------+-------+---------------+--------- > +---------+------+------+--------------------------+ > | 1 | PRIMARY | tbl | ALL | NULL | NULL | > NULL | NULL | 7533 | Using where | > | 2 | DEPENDENT SUBQUERY | tbl | range | idx_gid | idx_gid | > 9 | NULL | 58 | Using where; Using index | > +----+--------------------+-------+-------+---------------+--------- > +---------+------+------+--------------------------+ > > Running the query on a production machine with sufficient memory and > horsepower (box is only 20% utilized) it still takes 3 seconds to run > - obviously not quick enough for web use. > > What I really need is the most recent comment from each group based > on a variable set of gid's that change from user to user. > > Any thoughts on how to tweak this to avoid the full table scan? Thank > you in advance for your assistance. > > Erik Giberti > > > -- > MySQL General Mailing List > For list archives: http://lists.mysql.com/mysql > To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED] > > -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED]