Re: Query Optimization Question

SGreen Mon, 13 Mar 2006 06:52:28 -0800

"Robert DiFalco" <[EMAIL PROTECTED]> wrote on 03/11/2006 12:43:43 PM:


> In a previous database engine I was using an IN was more optimal than a
> <>. So, for example:
> 
>     SELECT * FROM table WHERE table.type IN (1,2,3);
> 
> Where the possible values of type are 0-3, was appreciably faster than:
> 
>     SELECT * FROM table WHERE table.type <> 0;
> 
> I've been playing with the Query Browser and checking out the
> optimization documents and haven't been able to make a clear call on
> whether or not this is also the case with MySQL/InnoDB.
> 
> TIA,
> 
> R.
> 
> 

YES, YES, YES! This is definitely an optimization. 

When you say IN or =, you are asking for "matching values". Matches can 
come from indexes. When you say <> or NOT IN, you are asking for 
everything BUT matches.  In order to evaluate a negative, the database 
engine (and this is usually true regardless of database server) almost 
always performs a full table scan to test every row to make sure it is 
either <> or NOT IN. At the very best, they have to perform a full index 
scan which is still less efficient than  ranged or values-based lookups.

It's when you get into the situation where you are matching against dozens 
of IN-clause items that you may run into slowdowns again. Until you reach 
2 or 3 dozen terms (depending on your hardware) you should be faster with 
an IN comparison than a <> or a NOT IN comparison. An optimization to 
search for BUT a term or two is to create a temporary table of all of your 
terms and delete the exact ones you want to exclude. Put an index on yoru 
temp table then JOIN that back into your query again (replacing the huge 
IN clause).  The database will match index to index and things will get 
fast again. This technique can scale up to some really big queries.

Always try to code for the affirmative tests. Your users will thank you.

Shawn Green
Database Administrator
Unimin Corporation - Spruce Pine

Re: Query Optimization Question

Reply via email to