Re: Inefficient use of index (was: Big difference in MyISAM and InnoDB SELECT speed)

2003-11-27 Thread Janusz Krzysztofik
Janusz Krzysztofik wrote:
> ...
> I am trying to optimize MySQL (3.23.49 from Debian stable) setup for
> ASPseek application. I decided to try InnoDB in order to be able
> to update tables while performing time consuming selects.
> After converting all tables to InnoDB I noticed a big difference
> in processing speed of one of the SELECT queries performed by the application.
> 
> Query:
> select url_id from urlword where deleted=0 and status=200 and origin=1
> 
> MyISAM:
> Query OK, 14274315 rows affected (4 min 54.88 sec)
> 
> InnoDB:
> Query OK, 14274315 rows affected (1 day 8 hours 46 min 46.70 sec)

Mechain Marc wrote:
> ...
> Have you done an EXPLAIN on your query ?
> 
> May be an index on (origin,status,deleted) could help.

Janusz Krzysztofik wrote:
> ...
> mysql> explain select url_id from urlword where deleted=0 and status=200 and 
> origin=1;
> 
> MyISAM (fast):
> +-+--+---+--+-+--+--++
> | table   | type | possible_keys | key  | key_len | ref  | rows | Extra  |
> +-+--+---+--+-+--+--++
> | urlword | ALL  | crc   | NULL |NULL | NULL | 46648925 | where used |
> +-+--+---+--+-+--+--++
> 
> InnoDB (very slow):
> +-+--+---+--+-+---+--++
> | table   | type | possible_keys | key  | key_len | ref   | rows | Extra  |
> +-+--+---+--+-+---+--++
> | urlword | ref  | crc   | crc  |   5 | const | 16951116 | where used |
> +-+--+---+--+-+---+--++
> 
> So I retried InnoDB with "ignore index (crc)" and got:
> Query OK, 14274315 rows affected (5 min 43.23 sec)
> 
> Next I found that this issue is not related to InnoDB. On my second server
> with almost the same data in MyISAM tables the query also uses the index
> and lasts forever.
> 
> Now the question is: how should I set up (and maintain?) my MySQL server
> to prevent it from using indexes inefficiently?

Mechain Marc wrote:
> ...
> Quite a good question, no idea.
> 
> But if the query
> select url_id from urlword where deleted=?? and status=??? and origin=?;
> is a query that you will use very often, I still think that creating an index on 
> (origin,status,deleted)
> is the good answer.

gerald_clark wrote:
> ...
> Try adding an index on (status,deleted)

Martijn Tonies wrote:
> ...
> > > I wonder: how many possible different values would such an index
> > > return?
> >
> > mysql> select distinct status, deleted from urlword;
> > ...
> > 13 rows in set (6 min 55.94 sec)
> >
> > mysql> select distinct status, deleted, origin from urlword;
> > ...
> > 23 rows in set (7 min 9.90 sec)
> >
> > > If this is a (very) low value, won't the index make things
> > > slower (if it's being used) compared to a full table scan?
> >
> > I guess these values are very low for a table with 46 milion records,
> > so I understand it is better not to use such indicies, right?
> 
> Well, that's what I have understood from many db engines :-)
> ...
> The idea is, that fetching rows in index order takes make time then fetching
> rows in storage order.

Let's try Marc's idea first:

   mysql> create index x1 on urlword (deleted,status,origin);
   Query OK, 46648925 rows affected (3 hours 56 min 36.17 sec)

Oops, Marc said "(origin,status,deleted)", not
"(deleted,status,origin)", but we'll see.

   mysql> explain select url_id from urlword where deleted=0 and
status=200 and origin=1;
  
+-+--+---+--+-+---+--++
   | table   | type | possible_keys | key  | key_len | ref   | rows
| Extra  |
  
+-+--+---+--+-+---+--++
   | urlword | ref  | crc,x1| crc  |   5 | const | 16934432
| where used |
  
+-+--+---+--+-+---+--++
   1 row in set (0.27 sec) ^
   |
Why crc again, not x1? +

   mysql> show index from urlword;
  
+-++-+--+-+---+-+--++-+
   | Table   | Non_unique | Key_name| Seq_in_index |
Column_name | Collation | Cardinality | Sub_part | Packed | Comment
|
  
+-++-+--+-+---+-+--++-+
   | urlword |  0 | PRIMARY |1 |
url_id  | A |46411809 | NULL | NULL   |
|
   | urlword |  0 | url |1 |
url | A |46411809 | NULL | NULL   |
|
   | urlword |  1 | next_index_time |1 |
next

Re: Inefficient use of index (was: Big difference in MyISAM and InnoDB SELECT speed)

2003-11-24 Thread Martijn Tonies
Hi,

> > > You are not using any indicies, because there aren't any that could be
> > > used in this query.
> > > Try adding an index on (status,deleted)
> >
> > I wonder: how many possible different values would such an index
> > return?
>
> mysql> select distinct status, deleted from urlword;
> ++-+
> | status | deleted |
> ++-+
> |200 |   0 |
> |503 |   0 |
> |302 |   0 |
> |  0 |   1 |
> |200 |   1 |
> |504 |   0 |
> |301 |   0 |
> |  0 |   0 |
> |204 |   0 |
> |420 |   0 |
> |  1 |   0 |
> |303 |   0 |
> |470 |   0 |
> ++-+
> 13 rows in set (6 min 55.94 sec)
>
> mysql> select distinct status, deleted, origin from urlword;
> ...
> 23 rows in set (7 min 9.90 sec)
>
> > If this is a (very) low value, won't the index make things
> > slower (if it's being used) compared to a full table scan?
>
> I guess these values are very low for a table with 46 milion records,
> so I understand it is better not to use such indicies, right?

Well, that's what I have understood from many db engines :-)

Of course, the engine itself should be smart enough to avoid using the index
because of low selectivity.

The idea is, that fetching rows in index order takes make time then fetching
rows in storage order.

With regards,

Martijn Tonies
Database Workbench - developer tool for InterBase, Firebird, MySQL & MS SQL
Server.
Upscene Productions
http://www.upscene.com


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: Inefficient use of index (was: Big difference in MyISAM and InnoDB SELECT speed)

2003-11-24 Thread Janusz Krzysztofik
Martijn Tonies wrote:
> 
> Hi,
> 
> > You are not using any indicies, because there aren't any that could be
> > used in this query.
> > Try adding an index on (status,deleted)
> 
> I wonder: how many possible different values would such an index
> return?

mysql> select distinct status, deleted from urlword;
++-+
| status | deleted |
++-+
|200 |   0 |
|503 |   0 |
|302 |   0 |
|  0 |   1 |
|200 |   1 |
|504 |   0 |
|301 |   0 |
|  0 |   0 |
|204 |   0 |
|420 |   0 |
|  1 |   0 |
|303 |   0 |
|470 |   0 |
++-+
13 rows in set (6 min 55.94 sec)

mysql> select distinct status, deleted, origin from urlword;
...
23 rows in set (7 min 9.90 sec)

> If this is a (very) low value, won't the index make things
> slower (if it's being used) compared to a full table scan?

I guess these values are very low for a table with 46 milion records,
so I understand it is better not to use such indicies, right?

>
> With regards,
> 
> Martijn Tonies
> Database Workbench - developer tool for InterBase, Firebird, MySQL & MS SQL
> Server.
> Upscene Productions
> http://www.upscene.com
> 

Thanks, Martijn,

Janusz

-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: Inefficient use of index (was: Big difference in MyISAM and InnoDB SELECT speed)

2003-11-24 Thread Martijn Tonies
Hi,


> You are not using any indicies, because there aren't any that could be
> used in this query.
> Try adding an index on (status,deleted)

I wonder: how many possible different values would such an index
return? If this is a (very) low value, won't the index make things
slower (if it's being used) compared to a full table scan?

With regards,

Martijn Tonies
Database Workbench - developer tool for InterBase, Firebird, MySQL & MS SQL
Server.
Upscene Productions
http://www.upscene.com


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



RE: Inefficient use of index (was: Big difference in MyISAM and InnoDB SELECT speed)

2003-11-24 Thread Mechain Marc
Janusz,

> Now the question is: how should I set up (and maintain?) my MySQL server
> to prevent it
> from using indexes inefficiently?

Quite a good question, no idea.

But if the query
select url_id from urlword where deleted=?? and status=??? and origin=?;
is a query that you will use very often, I still think that creating an index on 
(origin,status,deleted)
is the good answer.

Marc.

-Message d'origine-
De : Janusz Krzysztofik [mailto:[EMAIL PROTECTED]
Envoyé : lundi 24 novembre 2003 16:48
A : Mechain Marc
Cc : [EMAIL PROTECTED]
Objet : Inefficient use of index (was: Big difference in MyISAM and InnoDB SELECT 
speed)


Marc,

Thank you for your prompt answer.

I run EXPLAIN in both cases and got:

MyISAM (fast):
mysql> explain select url_id from urlword where deleted=0 and status=200
and origin=1;
+-+--+---+--+-+--+--++
| table   | type | possible_keys | key  | key_len | ref  | rows |
Extra  |
+-+--+---+--+-+--+--++
| urlword | ALL  | crc   | NULL |NULL | NULL | 46648925 |
where used |
+-+--+---+--+-+--+--++
1 row in set (0.18 sec)

InnoDB (very slow):
mysql> explain select url_id from urlword where deleted=0 and status=200
and origin=1;
+-+--+---+--+-+---+--++
| table   | type | possible_keys | key  | key_len | ref   | rows |
Extra  |
+-+--+---+--+-+---+--++
| urlword | ref  | crc   | crc  |   5 | const | 16951116 |
where used |
+-+--+---+--+-+---+--++
1 row in set (0.22 sec)


So I retried InnoDB with "ignore index (crc)" and got:

Sample disk usage (iostat):
Device:rrqm/s wrqm/s   r/s   w/s  rsec/s  wsec/srkB/swkB/s
avgrq-sz avgqu-sz   await  svctm  %util
/dev/scsi/host1/bus4/target0/lun0/disc
   32586.70   0.00 530.10  0.00 66233.600.00 33116.80
0.00   124.9515.35   28.99  15.11  80.10

Processing time:
Query OK, 14274315 rows affected (5 min 43.23 sec)

That's it, 5 minutes, not 32 hours.

Next I found that this issue is not related to InnoDB. On my second
server with almost
the same data in MyISAM tables the query also uses the index and lasts
forever.

Now the question is: how should I set up (and maintain?) my MySQL server
to prevent it
from using indexes inefficiently?

Janusz


Mechain Marc wrote:
> 
> Innodb may not use an index (full scan) where MyIsam does (index crc which is the 
> only relevant one).
> 
> Have you done an EXPLAIN on your query ?
> 
> May be an index on (origin,status,deleted) could help.
> 
> Marc.
> 
> -Message d'origine-
> De : Janusz Krzysztofik [mailto:[EMAIL PROTECTED]
> Envoyé : lundi 24 novembre 2003 13:58
> A : [EMAIL PROTECTED]
> Objet : Big difference in MyISAM and InnoDB SELECT speed
> 
> Hello,
> 
> I am trying to optimize MySQL (3.23.49 from Debian stable) setup for
> ASPseek application. I decided to try InnoDB in order to be able
> to update tables while performing time consuming selects.
> After converting all tables to InnoDB I noticed a big difference
> in processing speed of one of the SELECT queries performed by the application.
> 
> My configuration file /etc/my.cnf is based on my-huge.cnf example.
> 
> Table structure:
> create table urlword(url_id integer auto_increment primary key,
> site_id integer not null,
> tree_id integer not null,
> deleted tinyint DEFAULT 0 NOT NULL,
> url varchar(128) not null,
> next_index_time INT NOT NULL,
> status int(11) DEFAULT '0' NOT NULL,
> crc char(32) DEFAULT '' NOT NULL,
> last_modified varchar(32) DEFAULT '' NOT NULL,
> etag varchar(48) DEFAULT '' NOT NULL,
> last_index_time INT NOT NULL,
> referrer int(11) DEFAULT '0' NOT NULL,
> tag int(11) DEFAULT '0' NOT NULL,
> hops int(11) DEFAULT '0' NOT NULL,
> redir integer,
> origin integer,
> unique index(url),
> index(next_index_time),
> index(hops,next_index_time),
> index crc (origin, crc(8)));
> 
> Query:
> select url_id from urlword where deleted=0 and status=200 and origin=1
> 
> MyISAM:
> Table status:
> | Name| Type   | Row_format | Rows | Avg_row_length | Data_length | 
> Max_data_length | Index_length | Data_free | Auto_increment | Create_time | 
> Update_time | Check_time | Create_options | Comment |
> +-+++--++-+-+--+---++-+-+++-+
> | urlword | MyISAM | Dynamic| 46648925 |143 |  6714978360 |   
> 10995116277

Re: Inefficient use of index (was: Big difference in MyISAM and InnoDB SELECT speed)

2003-11-24 Thread gerald_clark
You are not using any indicies, because there aren't any that could be 
used in this query.
Try adding an index on (status,deleted)

Janusz Krzysztofik wrote:

Marc,

Thank you for your prompt answer.

I run EXPLAIN in both cases and got:

MyISAM (fast):
mysql> explain select url_id from urlword where deleted=0 and status=200
and origin=1;
+-+--+---+--+-+--+--++
| table   | type | possible_keys | key  | key_len | ref  | rows |
Extra  |
+-+--+---+--+-+--+--++
| urlword | ALL  | crc   | NULL |NULL | NULL | 46648925 |
where used |
+-+--+---+--+-+--+--++
1 row in set (0.18 sec)
InnoDB (very slow):
mysql> explain select url_id from urlword where deleted=0 and status=200
and origin=1;
+-+--+---+--+-+---+--++
| table   | type | possible_keys | key  | key_len | ref   | rows |
Extra  |
+-+--+---+--+-+---+--++
| urlword | ref  | crc   | crc  |   5 | const | 16951116 |
where used |
+-+--+---+--+-+---+--++
1 row in set (0.22 sec)
So I retried InnoDB with "ignore index (crc)" and got:

Sample disk usage (iostat):
Device:rrqm/s wrqm/s   r/s   w/s  rsec/s  wsec/srkB/swkB/s
avgrq-sz avgqu-sz   await  svctm  %util
/dev/scsi/host1/bus4/target0/lun0/disc
  32586.70   0.00 530.10  0.00 66233.600.00 33116.80
0.00   124.9515.35   28.99  15.11  80.10

Processing time:
Query OK, 14274315 rows affected (5 min 43.23 sec)
That's it, 5 minutes, not 32 hours.

Next I found that this issue is not related to InnoDB. On my second
server with almost
the same data in MyISAM tables the query also uses the index and lasts
forever.
Now the question is: how should I set up (and maintain?) my MySQL server
to prevent it
from using indexes inefficiently?
Janusz


Mechain Marc wrote:
 

Innodb may not use an index (full scan) where MyIsam does (index crc which is the only relevant one).

Have you done an EXPLAIN on your query ?

May be an index on (origin,status,deleted) could help.

Marc.

-Message d'origine-
De : Janusz Krzysztofik [mailto:[EMAIL PROTECTED]
Envoyé : lundi 24 novembre 2003 13:58
A : [EMAIL PROTECTED]
Objet : Big difference in MyISAM and InnoDB SELECT speed
Hello,

I am trying to optimize MySQL (3.23.49 from Debian stable) setup for
ASPseek application. I decided to try InnoDB in order to be able
to update tables while performing time consuming selects.
After converting all tables to InnoDB I noticed a big difference
in processing speed of one of the SELECT queries performed by the application.
My configuration file /etc/my.cnf is based on my-huge.cnf example.

Table structure:
create table urlword(url_id integer auto_increment primary key,
   site_id integer not null,
   tree_id integer not null,
   deleted tinyint DEFAULT 0 NOT NULL,
   url varchar(128) not null,
   next_index_time INT NOT NULL,
   status int(11) DEFAULT '0' NOT NULL,
   crc char(32) DEFAULT '' NOT NULL,
   last_modified varchar(32) DEFAULT '' NOT NULL,
   etag varchar(48) DEFAULT '' NOT NULL,
   last_index_time INT NOT NULL,
   referrer int(11) DEFAULT '0' NOT NULL,
   tag int(11) DEFAULT '0' NOT NULL,
   hops int(11) DEFAULT '0' NOT NULL,
   redir integer,
   origin integer,
   unique index(url),
   index(next_index_time),
   index(hops,next_index_time),
   index crc (origin, crc(8)));
Query:
select url_id from urlword where deleted=0 and status=200 and origin=1
MyISAM:
Table status:
| Name| Type   | Row_format | Rows | Avg_row_length | Data_length | 
Max_data_length | Index_length | Data_free | Auto_increment | Create_time | 
Update_time | Check_time | Create_options | Comment |
+-+++--++-+-+--+---++-+-+++-+
| urlword | MyISAM | Dynamic| 46648925 |143 |  6714978360 |   
1099511627775 |   4052629504 | 0 |   46648929 | 2003-11-21 14:13:28 | 
2003-11-22 04:01:16 | NULL   | max_rows=1 | |
Sample disk usage (iostat -x -d /dev/scsi/... 10):
Device:rrqm/s wrqm/s   r/s   w/s  rsec/s  wsec/srkB/swkB/s avgrq-sz 
avgqu-sz   await  svctm  %util
/dev/scsi/host1/bus4/target0/lun0/disc
  5510.20   0.20 361.10  0.40 46970.404.80 23485.20 2.40   129.95
16.41   45.39  27.11  98.00
Processing time:
Query OK, 14274315 rows affected (4 min 54.88 sec)
InnoDB:
Table status:
| Name| Type   | Row_format | Rows | Avg_row_length | Data_length | 
Max_data