Erik, Even is you eliminate the subquery (which I have used efficiently in the past, but they are always something to be careful of), the IN clause is going to kill you.. above a certain number of elements in that clause, the optimizer will go straight to full table scan. This drawback is well-known.
On 9/21/07, Dan Buettner <[EMAIL PROTECTED]> wrote: > Erik, I think the main reason your query is running slowly is the use of a > subselect. MySQL does not generally perform well with subselects, though > work continues in that area. > > There is also a problem/situation in MySQL in that you can't use MAX/GROUP > BY functions quite the way you can in other databases; you'll get an > accurate MAX value for one column, but the value in another won't > necessarily be from the same row. Someone posted on the list about this > recently, calling it a bug, and I tend to agree. > > To solve your problem: > > I would take one of two approaches. > > First approach: split it into two queries in PHP, and use the results of the > first in the second, like so: > > query1 = select max(id) from tbl where gid in ( 1234,2345,3456 .. 7890 ) > group by gid > > in PHP: id_string = join the results with commas. implode function? > > query2 = select comment, gid, date_posted from tbl where id in (id_string) > > Generally speaking, fewer queries = higher performance, and databases are > optimized to join tables, they do it well - but in your case I think you'll > find one of these works better. > > Second approach: > > Insert values from first query into a temporary table, then join on that > temp table in your second query. > > I don't think either approach will have a speed advantage, and the first is > probably easier to code. > > HTH, > Dan > > > > On 9/21/07, Erik Giberti <[EMAIL PROTECTED]> wrote: > > > > Hello everyone, > > > > The app server in this case is PHP, and the database is MySQL 5.0.22 > > on RedHat linux > > > > I've got a database with about 7.5K records in it that I expect to > > start growing very quickly ~10-12K records per day. The storage > > engine is InnoDB. This table is growing quickly and will continue to > > grow for a long time. This table stores comments (as you can see from > > the structure) and is being used to display a list of comments based > > on a users affiliations. > > > > The structure is approximately this - I'm leaving out unrelated columns: > > > > id int - primary key - auto increment > > gid bigint - indexed > > comment varchar > > date_posted timestamp > > > > I run a query with the following form > > > > select comment, gid, date_posted from tbl where id in (select max(id) > > from tbl where gid in ( 1234,2345,3456 .. 7890 ) group by gid); > > > > I have an index on gid and id is the primary key > > > > When I describe the query with about 50 gid values inserted (where > > indicated above) I get the following: > > > > +----+--------------------+-------+-------+---------------+--------- > > +---------+------+------+--------------------------+ > > | id | select_type | table | type | possible_keys | key | > > key_len | ref | rows | Extra | > > +----+--------------------+-------+-------+---------------+--------- > > +---------+------+------+--------------------------+ > > | 1 | PRIMARY | tbl | ALL | NULL | NULL | > > NULL | NULL | 7533 | Using where | > > | 2 | DEPENDENT SUBQUERY | tbl | range | idx_gid | idx_gid | > > 9 | NULL | 58 | Using where; Using index | > > +----+--------------------+-------+-------+---------------+--------- > > +---------+------+------+--------------------------+ > > > > Running the query on a production machine with sufficient memory and > > horsepower (box is only 20% utilized) it still takes 3 seconds to run > > - obviously not quick enough for web use. > > > > What I really need is the most recent comment from each group based > > on a variable set of gid's that change from user to user. > > > > Any thoughts on how to tweak this to avoid the full table scan? Thank > > you in advance for your assistance. > > > > Erik Giberti > > > > > > -- > > MySQL General Mailing List > > For list archives: http://lists.mysql.com/mysql > > To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED] > > > > > -- - michael dykman - [EMAIL PROTECTED] - All models are wrong. Some models are useful. -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED]