Re: OR isn't optimised

2006-06-10 Thread Taras D

No luck Dan. The reason why I had the straight join was because the
optimiser was joining the tables in a non-optimal way - it wasn't
using both parts of the key (the same problem I'm having now) so I
changed it to straight join to see if the optimiser would use both
parts of the key.


mysql> explain select * from l,d where (l.sId = d.dId and d.sId = 1)
or (l.sId = d.dId and d.sId = 2) \G
*** 1. row ***
  id: 1
 select_type: SIMPLE
   table: d
type: range
possible_keys: PRIMARY
 key: PRIMARY
 key_len: 2
 ref: NULL
rows: 399
   Extra: Using where; Using index
*** 2. row ***
  id: 1
 select_type: SIMPLE
   table: l
type: ref
possible_keys: sId
 key: sId
 key_len: 2
 ref: radius_searching_test_case.d.dId
rows: 1
   Extra: Using where; Using index
2 rows in set (0.00 sec)


If you don't specify the straight join, the optimiser doesn't figure
out that it can use both parts of the key to reduce the d table if you
join the l table first, so it first does the d table, then the l
table. I think this join order is optimal if there *was* a good reason
why it could only use the first part of the primary key on the d
table, but I can't think of a good reason.

I had to increase the number of entries in the d table from 100^2 to
200^2 to illustrate that the join order the optimiser is choosing
without the straight join is not optimal. In the actual application
the d table is 15000^2 entries, while the l table is about 100 000, so
this change is a closer match to reality.  You could theoretically do
the search in question using 100*2=200 rows if it could use both parts
of the keys and join the d table to the l table (which I tried to get
it to do by using straight join), but instead its doing it in 399 rows
(I'm guessing the join order the optimiser is choosing is optimal if
it couldn't, for some reason, use both parts of the primary key on the
d table) if you don't specify the join order (which, as  I mentioned,
would be the optimal order if there was a reason why it can't use both
parts of the primary key on the d table)

I really have no idea what to do - maybe its a bug?

Taras

On 6/9/06, Dan Buettner <[EMAIL PROTECTED]> wrote:

Hi again Taras -

Clearly, I was off the mark - sorry!

What if you remove the straight_join specification and let MySQL's JOIN
optimizer do its own thing?

Dan


Taras D wrote:
> Sorry everyone, forgot to put in my setup detail!
>
> SQL version: 5.0.22-community-nt
> Windows 2000 Professional 5.0, SP 4
>
> Dan, I tried your suggestion, but no luck :(.
>
> explain select straight_join * from l,d where (l.sId = d.dId) and
> (d.sId in (1,2)) \G
> *** 1. row ***
>   id: 1
>  select_type: SIMPLE
>table: l
> type: index
> possible_keys: sId
>  key: sId
>  key_len: 5
>  ref: NULL
> rows: 100
>Extra: Using index
> *** 2. row ***
>   id: 1
>  select_type: SIMPLE
>table: d
> type: range
> possible_keys: PRIMARY
>  key: PRIMARY
>  key_len: 2
>  ref: NULL
> rows: 200
>Extra: Using where; Using index
> 2 rows in set (0.00 sec)
>
> The only problem with switching to UNION's is that I would be looking
> at 18 (I incorrectly wrote 16) times more rows than I should be
> (seeing there could be upto 18 OR conditions)
>
> Taras
>
> On 6/8/06, Dan Buettner <[EMAIL PROTECTED]> wrote:
>> Taras, I'm guessing you're using version 4.1 or earlier.  This is a
>> known issue with MySQL prior to 5.  4.1 and earlier would only use one
>> index per instance of a table per query, leading to poor performance
>> with OR statements in many cases.  In other words, it uses an index to
>> optimize one of your OR cases, but not the other(s).
>>
>> One way around this is the use of UNIONs as you found, or to use
>> multiple instances of a table per query - though that often leads to
>> same hairy SQL that in my opinion is difficult to maintain.  UNIONs are
>> more straightforward, as is (in many cases) an upgrade to 5.x.
>>
>> MySQL 5 addresses this issue:
>> http://dev.mysql.com/doc/refman/5.0/en/index-merge-optimization.html
>>
>> (Now watch, it will turn out you're running 5.0!)
>>
>> I wonder if you'd have better success with the version you're on if you
>> re-wrote your query more like so:
>>
>> select straight_join *
>> from l,d
>> where (l.sId = d.dId)
>> and (d.sId in (1, 2));
>>
>> Dan
>>
>>
>>
>> Taras D wrote:
>> > Hi everyone,
>> >
>> > I have the following schema:
>> >
>> > create table l
>> > (
>> >  aId int unsigned not null auto_increment primary key,
>> >  sId smallint unsigned not null,
>> >  dId smallint unsigned,
>> >  index(sId,dId)
>> > );
>> >
>> > create table d
>> > (
>> >  sId smallint unsigned 

Re: OR isn't optimised

2006-06-09 Thread Dan Buettner

Hi again Taras -

Clearly, I was off the mark - sorry!

What if you remove the straight_join specification and let MySQL's JOIN 
optimizer do its own thing?


Dan


Taras D wrote:

Sorry everyone, forgot to put in my setup detail!

SQL version: 5.0.22-community-nt
Windows 2000 Professional 5.0, SP 4

Dan, I tried your suggestion, but no luck :(.

explain select straight_join * from l,d where (l.sId = d.dId) and
(d.sId in (1,2)) \G
*** 1. row ***
  id: 1
 select_type: SIMPLE
   table: l
type: index
possible_keys: sId
 key: sId
 key_len: 5
 ref: NULL
rows: 100
   Extra: Using index
*** 2. row ***
  id: 1
 select_type: SIMPLE
   table: d
type: range
possible_keys: PRIMARY
 key: PRIMARY
 key_len: 2
 ref: NULL
rows: 200
   Extra: Using where; Using index
2 rows in set (0.00 sec)

The only problem with switching to UNION's is that I would be looking
at 18 (I incorrectly wrote 16) times more rows than I should be
(seeing there could be upto 18 OR conditions)

Taras

On 6/8/06, Dan Buettner <[EMAIL PROTECTED]> wrote:

Taras, I'm guessing you're using version 4.1 or earlier.  This is a
known issue with MySQL prior to 5.  4.1 and earlier would only use one
index per instance of a table per query, leading to poor performance
with OR statements in many cases.  In other words, it uses an index to
optimize one of your OR cases, but not the other(s).

One way around this is the use of UNIONs as you found, or to use
multiple instances of a table per query - though that often leads to
same hairy SQL that in my opinion is difficult to maintain.  UNIONs are
more straightforward, as is (in many cases) an upgrade to 5.x.

MySQL 5 addresses this issue:
http://dev.mysql.com/doc/refman/5.0/en/index-merge-optimization.html

(Now watch, it will turn out you're running 5.0!)

I wonder if you'd have better success with the version you're on if you
re-wrote your query more like so:

select straight_join *
from l,d
where (l.sId = d.dId)
and (d.sId in (1, 2));

Dan



Taras D wrote:
> Hi everyone,
>
> I have the following schema:
>
> create table l
> (
>  aId int unsigned not null auto_increment primary key,
>  sId smallint unsigned not null,
>  dId smallint unsigned,
>  index(sId,dId)
> );
>
> create table d
> (
>  sId smallint unsigned not null,
>  dId smallint unsigned not null,
>  primary key(sId, dId)
> );
>
> The d table has 10 000 entries, the l table has 100 entries.
>
> The following query, which consist of an OR. Each part of the OR
> specifies exactly one row in the d table.
>
> explain select straight_join * from l,d where (l.sId = d.dId and d.sId
> = 1) or (l.sId = d.dId and d.sId = 2);
>
> 
++-+---+---+---+-+-+--+--+--+ 


>
> | id | select_type | table | type  | possible_keys | key | key_len
> | ref  | rows | Extra|
> 
++-+---+---+---+-+-+--+--+--+ 


>
> |  1 | SIMPLE  | l | index | sId   | sId | 5
> | NULL |  100 | Using index  |
> |  1 | SIMPLE  | d | range | PRIMARY   | PRIMARY | 2
> | NULL |  200 | Using where; Using index |
> 
++-+---+---+---+-+-+--+--+--+ 


>
>
> So why is only half of the index being used to reduce the d table? I
> would think if it used both indices, at most two matching rows would
> be found, reducing the  number of rows that have to be inspected from
> 20 000 to 100. In fact, UNIONing the two queries shows just this:
>
> select straight_join * from l,d where (l.sId = d.dId and d.sId = 1)
> union select straight_join * from l,d where (l.sId = d.dId and d.sId =
> 2);
> 
++--+++---+-+-++--+-+ 


>
> | id | select_type  | table  | type   | possible_keys | key |
> key_len | ref| rows | Extra
> |
> 
++--+++---+-+-++--+-+ 


>
> |  1 | PRIMARY  | l  | index  | sId   | sId |
> 5   | NULL   |  100 | Using index
> |
> |  1 | PRIMARY  | d  | eq_ref | PRIMARY   | PRIMARY |
> 4   | const,radius_searching_test_case.l.sId |1 | Using index
> |
> |  2 | UNION| l  | index  | sId   | sId |
> 5   | NULL   |  100 | Using index
> |
> |  2 | UNION| d  | eq_ref | PRIMARY   | PRIMARY |
> 4   | const,radius_searching_test_case.l.sId |1 | Using index
> |
> |

Re: OR isn't optimised

2006-06-08 Thread Taras D

Sorry everyone, forgot to put in my setup detail!

SQL version: 5.0.22-community-nt
Windows 2000 Professional 5.0, SP 4

Dan, I tried your suggestion, but no luck :(.

explain select straight_join * from l,d where (l.sId = d.dId) and
(d.sId in (1,2)) \G
*** 1. row ***
  id: 1
 select_type: SIMPLE
   table: l
type: index
possible_keys: sId
 key: sId
 key_len: 5
 ref: NULL
rows: 100
   Extra: Using index
*** 2. row ***
  id: 1
 select_type: SIMPLE
   table: d
type: range
possible_keys: PRIMARY
 key: PRIMARY
 key_len: 2
 ref: NULL
rows: 200
   Extra: Using where; Using index
2 rows in set (0.00 sec)

The only problem with switching to UNION's is that I would be looking
at 18 (I incorrectly wrote 16) times more rows than I should be
(seeing there could be upto 18 OR conditions)

Taras

On 6/8/06, Dan Buettner <[EMAIL PROTECTED]> wrote:

Taras, I'm guessing you're using version 4.1 or earlier.  This is a
known issue with MySQL prior to 5.  4.1 and earlier would only use one
index per instance of a table per query, leading to poor performance
with OR statements in many cases.  In other words, it uses an index to
optimize one of your OR cases, but not the other(s).

One way around this is the use of UNIONs as you found, or to use
multiple instances of a table per query - though that often leads to
same hairy SQL that in my opinion is difficult to maintain.  UNIONs are
more straightforward, as is (in many cases) an upgrade to 5.x.

MySQL 5 addresses this issue:
http://dev.mysql.com/doc/refman/5.0/en/index-merge-optimization.html

(Now watch, it will turn out you're running 5.0!)

I wonder if you'd have better success with the version you're on if you
re-wrote your query more like so:

select straight_join *
from l,d
where (l.sId = d.dId)
and (d.sId in (1, 2));

Dan



Taras D wrote:
> Hi everyone,
>
> I have the following schema:
>
> create table l
> (
>  aId int unsigned not null auto_increment primary key,
>  sId smallint unsigned not null,
>  dId smallint unsigned,
>  index(sId,dId)
> );
>
> create table d
> (
>  sId smallint unsigned not null,
>  dId smallint unsigned not null,
>  primary key(sId, dId)
> );
>
> The d table has 10 000 entries, the l table has 100 entries.
>
> The following query, which consist of an OR. Each part of the OR
> specifies exactly one row in the d table.
>
> explain select straight_join * from l,d where (l.sId = d.dId and d.sId
> = 1) or (l.sId = d.dId and d.sId = 2);
>
> 
++-+---+---+---+-+-+--+--+--+
>
> | id | select_type | table | type  | possible_keys | key | key_len
> | ref  | rows | Extra|
> 
++-+---+---+---+-+-+--+--+--+
>
> |  1 | SIMPLE  | l | index | sId   | sId | 5
> | NULL |  100 | Using index  |
> |  1 | SIMPLE  | d | range | PRIMARY   | PRIMARY | 2
> | NULL |  200 | Using where; Using index |
> 
++-+---+---+---+-+-+--+--+--+
>
>
> So why is only half of the index being used to reduce the d table? I
> would think if it used both indices, at most two matching rows would
> be found, reducing the  number of rows that have to be inspected from
> 20 000 to 100. In fact, UNIONing the two queries shows just this:
>
> select straight_join * from l,d where (l.sId = d.dId and d.sId = 1)
> union select straight_join * from l,d where (l.sId = d.dId and d.sId =
> 2);
> 
++--+++---+-+-++--+-+
>
> | id | select_type  | table  | type   | possible_keys | key |
> key_len | ref| rows | Extra
> |
> 
++--+++---+-+-++--+-+
>
> |  1 | PRIMARY  | l  | index  | sId   | sId |
> 5   | NULL   |  100 | Using index
> |
> |  1 | PRIMARY  | d  | eq_ref | PRIMARY   | PRIMARY |
> 4   | const,radius_searching_test_case.l.sId |1 | Using index
> |
> |  2 | UNION| l  | index  | sId   | sId |
> 5   | NULL   |  100 | Using index
> |
> |  2 | UNION| d  | eq_ref | PRIMARY   | PRIMARY |
> 4   | const,radius_searching_test_case.l.sId |1 | Using index
> |
> || UNION RESULT |  | ALL| NULL  | NULL|
> NULL| NULL   | NULL |
> |
> 
++--+++---+-+-

Re: OR isn't optimised

2006-06-08 Thread Dan Buettner
Taras, I'm guessing you're using version 4.1 or earlier.  This is a 
known issue with MySQL prior to 5.  4.1 and earlier would only use one 
index per instance of a table per query, leading to poor performance 
with OR statements in many cases.  In other words, it uses an index to 
optimize one of your OR cases, but not the other(s).


One way around this is the use of UNIONs as you found, or to use 
multiple instances of a table per query - though that often leads to 
same hairy SQL that in my opinion is difficult to maintain.  UNIONs are 
more straightforward, as is (in many cases) an upgrade to 5.x.


MySQL 5 addresses this issue:
http://dev.mysql.com/doc/refman/5.0/en/index-merge-optimization.html

(Now watch, it will turn out you're running 5.0!)

I wonder if you'd have better success with the version you're on if you 
re-wrote your query more like so:


select straight_join *
from l,d
where (l.sId = d.dId)
and (d.sId in (1, 2));

Dan



Taras D wrote:

Hi everyone,

I have the following schema:

create table l
(
 aId int unsigned not null auto_increment primary key,
 sId smallint unsigned not null,
 dId smallint unsigned,
 index(sId,dId)
);

create table d
(
 sId smallint unsigned not null,
 dId smallint unsigned not null,
 primary key(sId, dId)
);

The d table has 10 000 entries, the l table has 100 entries.

The following query, which consist of an OR. Each part of the OR
specifies exactly one row in the d table.

explain select straight_join * from l,d where (l.sId = d.dId and d.sId
= 1) or (l.sId = d.dId and d.sId = 2);

++-+---+---+---+-+-+--+--+--+ 


| id | select_type | table | type  | possible_keys | key | key_len
| ref  | rows | Extra|
++-+---+---+---+-+-+--+--+--+ 


|  1 | SIMPLE  | l | index | sId   | sId | 5
| NULL |  100 | Using index  |
|  1 | SIMPLE  | d | range | PRIMARY   | PRIMARY | 2
| NULL |  200 | Using where; Using index |
++-+---+---+---+-+-+--+--+--+ 



So why is only half of the index being used to reduce the d table? I
would think if it used both indices, at most two matching rows would
be found, reducing the  number of rows that have to be inspected from
20 000 to 100. In fact, UNIONing the two queries shows just this:

select straight_join * from l,d where (l.sId = d.dId and d.sId = 1)
union select straight_join * from l,d where (l.sId = d.dId and d.sId =
2);
++--+++---+-+-++--+-+ 


| id | select_type  | table  | type   | possible_keys | key |
key_len | ref| rows | Extra
|
++--+++---+-+-++--+-+ 


|  1 | PRIMARY  | l  | index  | sId   | sId |
5   | NULL   |  100 | Using index
|
|  1 | PRIMARY  | d  | eq_ref | PRIMARY   | PRIMARY |
4   | const,radius_searching_test_case.l.sId |1 | Using index
|
|  2 | UNION| l  | index  | sId   | sId |
5   | NULL   |  100 | Using index
|
|  2 | UNION| d  | eq_ref | PRIMARY   | PRIMARY |
4   | const,radius_searching_test_case.l.sId |1 | Using index
|
|| UNION RESULT |  | ALL| NULL  | NULL|
NULL| NULL   | NULL |
|
++--+++---+-+-++--+-+ 



So why isn't the optimiser picking up that it can use both parts of
the keys and significantly reduce the number of rows it must inspect?
I have tried inserting 'force index(primary) after 'from l, d', but
that doesn't help. A work around *could* be to just union the results,
but in the real case the 'l' table can have upto 100 000 entries, and
there can be upto 16 independant ORs, which means that I would be
looking at 100 000 * 16 = 1.6 million rows (using UNIONs), instead of
100 000 rows (if I could get this OR stuff to work). (PHP code to
insert data into the tables is appended)

Does anyone know what's happening?

Thanks,
Taras

php code
- 


= $INSERT_AT_A_TIME)
   {
 mysql_query($query) or die(mysql_error().''.$query);
 echo "Done another 1000\n\r";
 $query = 'INSERT INTO d VALUES';
 $done = 0;
   }
   else
   {
 $query .= ', ';
   }
 }
}

// now do the l table
for($j = 1; $j <= 100; $j++)
{
 $s = mt_rand(1,100);
 $d = mt

OR isn't optimised

2006-06-07 Thread Taras D

Hi everyone,

I have the following schema:

create table l
(
 aId int unsigned not null auto_increment primary key,
 sId smallint unsigned not null,
 dId smallint unsigned,
 index(sId,dId)
);

create table d
(
 sId smallint unsigned not null,
 dId smallint unsigned not null,
 primary key(sId, dId)
);

The d table has 10 000 entries, the l table has 100 entries.

The following query, which consist of an OR. Each part of the OR
specifies exactly one row in the d table.

explain select straight_join * from l,d where (l.sId = d.dId and d.sId
= 1) or (l.sId = d.dId and d.sId = 2);

++-+---+---+---+-+-+--+--+--+
| id | select_type | table | type  | possible_keys | key | key_len
| ref  | rows | Extra|
++-+---+---+---+-+-+--+--+--+
|  1 | SIMPLE  | l | index | sId   | sId | 5
| NULL |  100 | Using index  |
|  1 | SIMPLE  | d | range | PRIMARY   | PRIMARY | 2
| NULL |  200 | Using where; Using index |
++-+---+---+---+-+-+--+--+--+

So why is only half of the index being used to reduce the d table? I
would think if it used both indices, at most two matching rows would
be found, reducing the  number of rows that have to be inspected from
20 000 to 100. In fact, UNIONing the two queries shows just this:

select straight_join * from l,d where (l.sId = d.dId and d.sId = 1)
union select straight_join * from l,d where (l.sId = d.dId and d.sId =
2);
++--+++---+-+-++--+-+
| id | select_type  | table  | type   | possible_keys | key |
key_len | ref| rows | Extra
|
++--+++---+-+-++--+-+
|  1 | PRIMARY  | l  | index  | sId   | sId |
5   | NULL   |  100 | Using index
|
|  1 | PRIMARY  | d  | eq_ref | PRIMARY   | PRIMARY |
4   | const,radius_searching_test_case.l.sId |1 | Using index
|
|  2 | UNION| l  | index  | sId   | sId |
5   | NULL   |  100 | Using index
|
|  2 | UNION| d  | eq_ref | PRIMARY   | PRIMARY |
4   | const,radius_searching_test_case.l.sId |1 | Using index
|
|| UNION RESULT |  | ALL| NULL  | NULL|
NULL| NULL   | NULL |
|
++--+++---+-+-++--+-+

So why isn't the optimiser picking up that it can use both parts of
the keys and significantly reduce the number of rows it must inspect?
I have tried inserting 'force index(primary) after 'from l, d', but
that doesn't help. A work around *could* be to just union the results,
but in the real case the 'l' table can have upto 100 000 entries, and
there can be upto 16 independant ORs, which means that I would be
looking at 100 000 * 16 = 1.6 million rows (using UNIONs), instead of
100 000 rows (if I could get this OR stuff to work). (PHP code to
insert data into the tables is appended)

Does anyone know what's happening?

Thanks,
Taras

php code
-
= $INSERT_AT_A_TIME)
   {
 mysql_query($query) or die(mysql_error().''.$query);
 echo "Done another 1000\n\r";
 $query = 'INSERT INTO d VALUES';
 $done = 0;
   }
   else
   {
 $query .= ', ';
   }
 }
}

// now do the l table
for($j = 1; $j <= 100; $j++)
{
 $s = mt_rand(1,100);
 $d = mt_rand(1,100);
 $query = "INSERT INTO l VALUES(NULL,$s,$d)";
 mysql_query($query);
}

echo 'FINISHED';

?>

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]