Erik,

Even is you eliminate the subquery (which I have used efficiently in
the past, but they are always something to be careful of), the IN
clause is going to kill you..  above a certain number of elements in
that clause, the optimizer will go straight to full table scan.  This
drawback is well-known.

On 9/21/07, Dan Buettner <[EMAIL PROTECTED]> wrote:
> Erik, I think the main reason your query is running slowly is the use of a
> subselect.  MySQL does not generally perform well with subselects, though
> work continues in that area.
>
> There is also a problem/situation in MySQL in that you can't use MAX/GROUP
> BY functions quite the way you can in other databases; you'll get an
> accurate MAX value for one column, but the value in another won't
> necessarily be from the same row.  Someone posted on the list about this
> recently, calling it a bug, and I tend to agree.
>
> To solve your problem:
>
> I would take one of two approaches.
>
> First approach: split it into two queries in PHP, and use the results of the
> first in the second, like so:
>
> query1 = select max(id) from tbl where gid in ( 1234,2345,3456 .. 7890 )
> group by gid
>
> in PHP: id_string = join the results with commas.  implode function?
>
> query2 = select comment, gid, date_posted from tbl where id in (id_string)
>
> Generally speaking, fewer queries = higher performance, and databases are
> optimized to join tables, they do it well - but in your case I think you'll
> find one of these works better.
>
> Second approach:
>
> Insert values from first query into a temporary table, then join on that
> temp table in your second query.
>
> I don't think either approach will have a speed advantage, and the first is
> probably easier to code.
>
> HTH,
> Dan
>
>
>
> On 9/21/07, Erik Giberti <[EMAIL PROTECTED]> wrote:
> >
> > Hello everyone,
> >
> > The app server in this case is PHP, and the database is MySQL 5.0.22
> > on RedHat linux
> >
> > I've got a database with about 7.5K records in it that I expect to
> > start growing very quickly ~10-12K records per day. The storage
> > engine is InnoDB. This table is growing quickly and will continue to
> > grow for a long time. This table stores comments (as you can see from
> > the structure) and is being used to display a list of comments based
> > on a users affiliations.
> >
> > The structure is approximately this - I'm leaving out unrelated columns:
> >
> > id int - primary key - auto increment
> > gid bigint - indexed
> > comment varchar
> > date_posted timestamp
> >
> > I run a query with the following form
> >
> > select comment, gid, date_posted from tbl where id in (select max(id)
> > from tbl where gid in ( 1234,2345,3456 .. 7890 ) group by gid);
> >
> > I have an index on gid and id is the primary key
> >
> > When I describe the query with about 50 gid values inserted (where
> > indicated above) I get the following:
> >
> > +----+--------------------+-------+-------+---------------+---------
> > +---------+------+------+--------------------------+
> > | id | select_type        | table | type  | possible_keys | key     |
> > key_len | ref  | rows | Extra                    |
> > +----+--------------------+-------+-------+---------------+---------
> > +---------+------+------+--------------------------+
> > |  1 | PRIMARY            | tbl   | ALL   | NULL          | NULL    |
> > NULL    | NULL | 7533 | Using where              |
> > |  2 | DEPENDENT SUBQUERY | tbl   | range | idx_gid       | idx_gid |
> > 9       | NULL |   58 | Using where; Using index |
> > +----+--------------------+-------+-------+---------------+---------
> > +---------+------+------+--------------------------+
> >
> > Running the query on a production machine with sufficient memory and
> > horsepower (box is only 20% utilized) it still takes 3 seconds to run
> > - obviously not quick enough for web use.
> >
> > What I really need is the most recent comment from each group based
> > on a variable set of gid's that change from user to user.
> >
> > Any thoughts on how to tweak this to avoid the full table scan? Thank
> > you in advance for your assistance.
> >
> > Erik Giberti
> >
> >
> > --
> > MySQL General Mailing List
> > For list archives: http://lists.mysql.com/mysql
> > To unsubscribe:    http://lists.mysql.com/[EMAIL PROTECTED]
> >
> >
>


-- 
 - michael dykman
 - [EMAIL PROTECTED]

 - All models are wrong.  Some models are useful.

-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:    http://lists.mysql.com/[EMAIL PROTECTED]

Reply via email to