RE: Need confirmation: Subselects are broken with regards to index usage?

Robert DiFalco Fri, 25 May 2007 16:53:46 -0700

Lets hope something is done about it soon, not all sub queries SHOULD be
re-written as joins. Generally joins that result in the need to use
DISTINCT are better off re-written as IN OR EXISTS sub queries.


In any event, I have a lot of very complex queries that work great for
our other database backends but will need to be rewritten for MySQL.
Does anyone know of a tool that will just take a standard SQL select
query string and rewrite it (with the option of rewriting subqueries to
joins). I know such a tool can't be perfect but it could help out with
the heavy lifting leaving only some tuning and debugging.

-----Original Message-----
From: Baron Schwartz [mailto:[EMAIL PROTECTED] 
Sent: Thursday, May 24, 2007 2:35 PM
To: Robert DiFalco
Cc: [email protected]
Subject: Re: Need confirmation: Subselects are broken with regards to
index usage?

Hi Robert,

The way non-correlated subqueries are sometimes "optimized" into
correlated ones and then executed for each row in the outer table is a
well-known MySQL deficiency, yes.  I would not really look for it to be
fixed soon, though it's been in progress for a while.  The version in
which it gets fixed is still likely a long way from GA.  But maybe a
MySQL employee can give better info on that.

In the meantime, we are all forced to find alternative ways to write
such queries :-)

Baron

Robert DiFalco wrote:
> Ok, so I guess it is more complicated than that.
> 
> This query which has 5M records that match its criteria returns
> instantly:
> 
>  SELECT ELEMS.id
>  FROM ELEMS
>  WHERE ((
>       ELEMS.nodeID IN (
>               SELECT link.childID 
>               FROM link 
>               JOIN path ON link.parentID=path.decendantId 
>               WHERE (path.ancestorId = 1))))
>  LIMIT 0,100;
> 
> Now if I change the ancestorId criteria to a node group that does not 
> exist the query takes a very long time. Btw, it also looks like this 
> is being optimized into a less efficient EXISTS query. Anyway, the 
> join version doesn't have the same problem, it is fast if it is 
> searching for a conditions that has results or one that has none. Note

> that the JOIN version requires a SELECT in case a node has multiple
ancestors.
> 
>  SELECT DISTINCT ELEMS.id
>  FROM ELEMS
>  JOIN link ON ELEMS.nodeID = link.childID  JOIN path ON 
> link.parentID=path.decendantId  WHERE (path.ancestorId = 1)  LIMIT 
> 0,100;
> 
> Anyone have any ideas why this is the case? 
> 
> 
> 
> -----Original Message-----
> From: Robert DiFalco [mailto:[EMAIL PROTECTED]
> Sent: Thursday, May 24, 2007 1:11 PM
> To: [email protected]
> Subject: Need confirmation: Subselects are broken with regards to 
> index usage?
> 
> I think I'm discovering that sub-selects in MySQL are broken. Is that 
> true? It seems like you cannot have a sub-select without doing a table

> scan -- even for a constant IN expression -- this because it gets 
> re-written as an EXISTS that executes for each row.
>  
> Is that true? Forcing an index doesn't even seem to help. 
>  
> R.
> 
> 



--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:    http://lists.mysql.com/[EMAIL PROTECTED]

RE: Need confirmation: Subselects are broken with regards to index usage?

Reply via email to