RE: InnoDb Table Performance problem

2004-02-12 Thread Michael McTernan
Hi,

 SELECT COUNT(*) for InnoDB tables is a known problem...  The table
 handler (for InnoDB) has to do a table scan to count all rows...  This
 particular case is optimized with MyISAM ...

Sure.  But why is the table scan ~100 times faster for the table without the
BLOB column?

This goes back to my original assumption: I thought the BLOB contents would
be stored outside of the normal column data, meaning that you have to do an
additional seek to retrieve a BLOB value, but that a BLOB column doesn't
inflate the data over which a table scan has to iterate, and so doesn't hurt
table scan performance.  Is this wrong?  Or is this different for InnoDB vs
MyISAM?

Of course, I'm aiming not to use table scans at all though :)
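
The way I've been checking for that is EXPLAIN - the values below are just
made-up examples against the fpeindex index:

EXPLAIN SELECT id, revision
  FROM files
 WHERE filenameid = 10 AND pathid = 3;

If the type column comes back as ALL it's a full table scan; ref or range
means the (filenameid, pathid, extensionid, revision) index is being used.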

Thanks,

Mike

 -Original Message-
 From: Benoit St-Jean [mailto:[EMAIL PROTECTED]
 Sent: 11 February 2004 22:47
 To: Michael McTernan
 Cc: Mysql
 Subject: Re: InnoDb Table Performance problem



Re: InnoDb Table Performance problem

2004-02-12 Thread Chris Nolan
Michael McTernan wrote:

 Hi,

  SELECT COUNT(*) for InnoDB tables is a known problem...  The table
  handler (for InnoDB) has to do a table scan to count all rows...  This
  particular case is optimized with MyISAM ...

 Sure.  But why is the table scan ~100 times faster for the table without
 the BLOB column?

I would try to sum it up like this:

1. Discs work best when used as purely sequential devices.
2. Inserting those BLOBs takes up space somewhere. Strictly speaking, on 
the disc they may very well be in between certain rows.
3. As you may need to read over them, the disc has to seek much further 
and depending on the way BLOBs are placed inside the tablespace, seeking 
may be all over the place.

Have you tried the following to see the results:

ALTER TABLE files TYPE = MyISAM;
(Remove all indexes)
SELECT SUM(revision) FROM files;
You may want to execute the above select on your InnoDB version of the 
table as well.
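
Spelled out, assuming "remove all indexes" means the secondary indexes from
your CREATE TABLE (once the table is MyISAM the foreign keys are gone, so
the drops should go through):

ALTER TABLE files TYPE = MyISAM;
ALTER TABLE files DROP INDEX `revision`,
                  DROP INDEX `fpeindex`,
                  DROP INDEX `filepathname`;
SELECT SUM(revision) FROM files;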

Generally speaking, the execution time of COUNT(*) doesn't tell you very 
much unless you are a bit more specific:

SELECT COUNT(revision) FROM files WHERE revision BETWEEN '1' AND '2600';

I am prepared to be shot down for being totally incorrect!

Regards,

Chris

Re: InnoDb Table Performance problem

2004-02-12 Thread Benoit St-Jean
Michael McTernan wrote:

 Hi,

  SELECT COUNT(*) for InnoDB tables is a known problem...  The table
  handler (for InnoDB) has to do a table scan to count all rows...  This
  particular case is optimized with MyISAM ...

 Sure.  But why is the table scan ~100 times faster for the table without
 the BLOB column?
 

Well, two possible explanations...  BLOBs are either stored in a
separate area, thus forcing MySQL to perform an extra seek to retrieve
the full row, or the table scan takes a lot more time to execute because
of the BLOB data you have to retrieve...  Reading 10 rows of 1K is
way faster than retrieving 10 rows of 800K!  By your result, I am
assuming that your table with the BLOB is 100 times bigger than the one
without it...



RE: InnoDb Table Performance problem

2004-02-12 Thread Michael McTernan
Hi,

 Well, two possible explanations...  BLOBs are either stored in a
 separate area, thus forcing MySQL to perform an extra seek to retrieve
 the full row, or the table scan takes a lot more time to execute because
 of the BLOB data you have to retrieve...

I'm not often retrieving the BLOB value in my queries, so I wouldn't expect to
take the second-seek performance hit, and hence would expect good performance.

 Reading 10 rows of 1K is
 way faster than retrieving 10 rows of 800K!  By your result, I am
 assuming that your table with the BLOB is 100 times bigger than the one
 without it...

I'm not sure how to get the size of a table out of InnoDB, so I could guess
that this is true, but I'm not certain.
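
Would SHOW TABLE STATUS be the way to do it?  As far as I know it reports a
Data_length per table (approximate for InnoDB), so something like this
should at least show roughly how much bigger the BLOB table is:

mysql> SHOW TABLE STATUS LIKE 'files';
mysql> SHOW TABLE STATUS LIKE 'filemetadata';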

I'm guessing that with InnoDb, the BLOB data is being stored inline with the
other column data :(

Thanks,

Mike


 -Original Message-
 From: Benoit St-Jean [mailto:[EMAIL PROTECTED]
 Sent: 12 February 2004 14:04
 To: Michael McTernan
 Cc: Mysql
 Subject: Re: InnoDb Table Performance problem





Re: InnoDb Table Performance problem

2004-02-12 Thread Chris Nolan
| SUM(revision) |
+---------------+
|        561401 |
+---------------+
1 row in set (0.18 sec)

[Restart database]

mysql> SELECT SUM(revision) FROM inofiles;
+---------------+
| SUM(revision) |
+---------------+
|        561401 |
+---------------+
1 row in set (0.29 sec)

mysql> SELECT SUM(revision) FROM inofiles;
+---------------+
| SUM(revision) |
+---------------+
|        561401 |
+---------------+
1 row in set (0.10 sec)
So I guess the conclusion is that BLOBs are *really* bad for table scans in
MyISAM _and_ InnoDB, although perhaps slightly less so for InnoDB.  I guess
the BLOB data must get interleaved with the non-BLOB data on disk, and that
putting the BLOBs into a different table may cause some sort of partitioning
between the tables on disk.
 

That may help. One thing you may want to do is re-run the above with an 
index on the revision column for both table types with and without 
BLOBs. I'm betting that you'll see little deviation between the 
BLOB-equipped tables and the BLOB-less ones. One thing you might want to 
investigate is storing your BLOBs in a MyISAM table and your other data 
in an InnoDB table.
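
For the index test, something as simple as this should do (index name made
up here):

ALTER TABLE myfiles  ADD INDEX rev_idx (revision);
ALTER TABLE inofiles ADD INDEX rev_idx (revision);
SELECT SUM(revision) FROM myfiles;
SELECT SUM(revision) FROM inofiles;

And a rough sketch of the MyISAM-for-BLOBs split, with names picked just for
the example - metadata stays in the InnoDB `filemetadata` table, contents
move to a MyISAM table keyed by the same id:

CREATE TABLE filecontents (
  fileid     int(10) unsigned NOT NULL,
  contentsGz longblob NOT NULL,
  PRIMARY KEY (fileid)
) TYPE=MyISAM;

INSERT INTO filecontents (fileid, contentsGz)
SELECT id, contentsGz FROM files;

-- Metadata queries never touch the BLOBs; pull a file's contents only when
-- it is actually needed:
SELECT c.contentsGz
  FROM filemetadata m, filecontents c
 WHERE m.id = c.fileid AND m.revision = 42 AND m.filepathname = 'some/file';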

You have to be impressed with InnoDB though - it's a full-on
multiversioned storage engine and it manages to keep pace with MyISAM
(and outperform it on occasion). Considering MyISAM's design brief was
to be sleek, fast and space efficient (which gives it further speed
advantages), InnoDB's comparative overhead is hidden extremely well.
Unfortunately, I really started using MySQL well after NuSphere buggered
up everything and retired the Gemini table type - it would have been
interesting to see all four table types compared with all the nice speed
improvements 4.0.x has brought along.

I'm guessing that the real problem here is that some of my queries are
secretly doing table scans when they shouldn't, and that is causing a huge
slowdown, although I might split the BLOB column into a different table such
that table scans don't give me such a bad hit if they do happen.
 

There is a table I query often, with two ENUM columns, a TIMESTAMP
column and a TEXT column. I often need to throw queries at it that look
for substrings in the TEXT column, for rows between two dates, with
certain values for the ENUM columns. There are indexes on the ENUM,
TIMESTAMP and TEXT columns (prefix indexes, of course). Unless I specify
FORCE INDEX(date), MySQL does a full table scan, resulting in queries
that take 2 minutes to complete. With the FORCE INDEX(date) hint, the
query time drops to about 2 seconds.
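
Something along these lines, with made-up table and column names:

SELECT *
  FROM log_entries FORCE INDEX (date_idx)
 WHERE logged BETWEEN '2004-01-01 00:00:00' AND '2004-02-01 00:00:00'
   AND severity = 'error'
   AND message LIKE '%timeout%';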

Thanks,

Mike

 

You're welcome! I hope that this helps you out!

Regards,

Chris


RE: InnoDb Table Performance problem

2004-02-12 Thread Michael McTernan
Hi Chris,

 A HEX editor? :-) Finding out what the developers at MySQL AB like to
 drink and sending them a few megalitres of it? :-)

Heh heh :-)

 In all seriousness, you haven't got a great many options (although I
 could be completely wrong, in which case some kind person on this list
 will enlighten / flame me), but using OPTIMIZE TABLE on MyISAM tables
 and ALTER TABLE TYPE = InnoDB on InnoDB tables will defrag your table
 for you, possibly helping matters.

I'm pretty sure that the new tables I created for the test would have been
defragged since I ALTER'd them a couple of times to drop indices.
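
For the record, the two rebuilds in question would be something like:

OPTIMIZE TABLE myfiles;              -- MyISAM copy: rewrites the data file
ALTER TABLE inofiles TYPE = InnoDB;  -- InnoDB copy: same-type ALTER rebuilds it

which is effectively what my repeated ALTERs already did.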

 InnoDB looks at tablespace files and raw partitions in the same way and
 manages space within them in a method related to that used by the
 Berkeley Fast File System.

Sounds like it is already very efficient then.

 It *might* have some more
 options to play with if you wrap a few INSERTs in a transaction block.

That's a good idea - I could certainly try that.  A future version of InnoDb
may also take advantage of this I guess :)
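
i.e. when loading the files in, batching the inserts up rather than letting
each one auto-commit - roughly this, with the column list and values cut
down for the example:

BEGIN;
INSERT INTO files (revision, filepathname, contentsGz)
       VALUES (1, 'src/main.c', '...gzipped bytes...');
INSERT INTO files (revision, filepathname, contentsGz)
       VALUES (1, 'src/other.c', '...gzipped bytes...');
COMMIT;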

 That may help. One thing you may want to do is re-run the above with an
 index on the revision column for both table types with and without
 BLOBs.

I did try adding indices, and you are correct, it brings the times right
down, so it is only table scan performance that is at fault here.

 You have to be impressed with InnoDB though - it's a full-on
 multiversioned storage engine and it manages to keep pace with MyISAM
 (and outperform it on occasion).

Completely.  I'm impressed with the whole lot of MySQL - it's brilliant.

 With the FORCE INDEX(date) hint, the query time drops to about 2
 seconds.

I've had to do this in the past too.  I'll be looking out for slow queries
in the logs and then EXPLAINing them to see where I can make this
improvement.
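
In my case that means turning on the slow query log, something like this in
my.cnf (path and threshold are just examples), and then EXPLAINing whatever
turns up there:

[mysqld]
log-slow-queries = /var/log/mysql/slow-queries.log
long_query_time  = 2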

Thank you for all your help.  While there is no simple solution, I'm a lot
more sure of my options now :)

Cheers,

Mike

 -Original Message-
 From: Chris Nolan [mailto:[EMAIL PROTECTED]
 Sent: 12 February 2004 17:28
 To: Michael McTernan
 Cc: Mysql; Benoit St-Jean
 Subject: Re: InnoDb Table Performance problem


 Michael McTernan wrote:

 Hi there,
 
 
 
 I would try to sum it up like this:
 
 1. Discs work best when used as purely sequential devices.
 2. Inserting those BLOBs takes up space somewhere. Strictly speaking, on
 the disc they may very well be in between certain rows.
 
 
 
 Good points - I guess there is no way for the database to control where on
 the file system the data ends up.  I wonder if using a raw partition for
 the database under InnoDB would fix this?  I guess that would give the
 storage engine the opportunity to put the BLOB data in a heap at one end of
 a disk while keeping the table data together at the other, although this
 sounds like a lot of work for the storage engine.  What would be good would
 be a tool to show how 'fragmented' a database table is on disk?
 
 
 
 A HEX editor? :-) Finding out what the developers at MySQL AB like to
 drink and sending them a few megalitres of it? :-)

 In all seriousness, you haven't got a great many options (although I
 could be completely wrong, in which case some kind person on this list
 will enlighten / flame me), but using OPTIMIZE TABLE on MyISAM tables
 and ALTER TABLE TYPE = InnoDB on InnoDB tables will defrag your table
 for you, possibly helping matters. In each case, the table gets a SHARE
 MODE lock (read only), all existing rows are copied to a new table so that
 they are positioned optimally, the old table is dropped, and the new table
 is renamed to the original name.

 InnoDB looks at tablespace files and raw partitions in the same way and
 manages space within them in a method related to that used by the
 Berkeley Fast File System. The only advantage of raw partition usage is
 bypassing filesystem logic which is very beneficial on some OSes (not
 Linux or any of the BSDs to my knowledge though - their FS layers are
 damned quick!).

 3. As you may need to read over them, the disc has to seek much further
 and depending on the way BLOBs are placed inside the tablespace, seeking
 may be all over the place.
 
 
 
 Another good point.  It may be that by creating the 'filemetadata' table in
 one go I've just ensured that most of the columns are linear on the disc,
 in which case re-creating the 'files' table and then adding the BLOB data
 back as a second step might fix this.  My experiments show this is not the
 case though...
 
 
 If you're doing INSERTs in AUTO_COMMIT mode, then InnoDB will have no
 choice but to put the BLOBs where it can. It *might* have some more
 options to play with if you wrap a few INSERTs in a transaction block.

 
 
 Have you tried the following to see the results:
 
 ALTER TABLE files TYPE = MyISAM;
 (Remove all indexes)
 SELECT SUM(revision) FROM files;
 
 
 
 Okay.  I've made two new tables, one called 'myfiles' and one called
 'inofiles' which are both duplicates of 'files' without any
 indices except

Re: InnoDb Table Performance problem

2004-02-11 Thread Benoit St-Jean
Michael McTernan wrote:

Hi there,

I'm making something similar to a file revision control system, and using
MySQL on Linux as the database to drive it.  Almost all my tables are
InnoDB, and generally it is going very well, with the exception of one table
that is always very slow.
This table holds the files within the database.  It is defined as follows:

CREATE TABLE `files` (
 `id` int(10) unsigned NOT NULL auto_increment,
 `revision` int(10) unsigned NOT NULL default '0',
 `filenameid` int(10) unsigned NOT NULL default '0',
 `pathid` int(10) unsigned NOT NULL default '0',
 `extensionid` int(10) unsigned NOT NULL default '0',
 `isDeleted` enum('0','1') NOT NULL default '0',
 `filepathname` tinytext NOT NULL,
 `contentsGz` longblob NOT NULL,
 PRIMARY KEY  (`id`),
 UNIQUE KEY `revision` (`revision`,`filepathname`(255)),
 KEY `fpeindex` (`filenameid`,`pathid`,`extensionid`,`revision`),
 KEY `filepathname` (`filepathname`(255)),
 CONSTRAINT `0_3570` FOREIGN KEY (`filenameid`) REFERENCES `filenames`
(`id`),
 CONSTRAINT `0_3571` FOREIGN KEY (`extensionid`) REFERENCES
`fileextensions` (`id`),
 CONSTRAINT `0_3572` FOREIGN KEY (`pathid`) REFERENCES `filepaths` (`id`)
) TYPE=InnoDB;
The 'contentsGz' column will have the contents of the file and will
typically be a couple of hundred kilobytes, but in some rare cases as large
as 20 Megabytes.
Selects on this table always go very slowly.  I've used EXPLAIN to look at
what is going on, and carefully added a couple of multi-column indexes that
have improved SELECT performance (this table is updated rarely, so I'm not
too worried about INSERT performance).  However, the performance is still
really bad.
I tried creating an identical table with the exception that it doesn't have
the 'contentsGz' column:
CREATE TABLE `filemetadata` (
 `id` int(10) unsigned NOT NULL auto_increment,
 `revision` int(10) unsigned NOT NULL default '0',
 `filenameid` int(10) unsigned NOT NULL default '0',
 `pathid` int(10) unsigned NOT NULL default '0',
 `extensionid` int(10) unsigned NOT NULL default '0',
 `isDeleted` enum('0','1') NOT NULL default '0',
 `filepathname` tinytext NOT NULL,
 PRIMARY KEY  (`id`),
 UNIQUE KEY `revision` (`revision`,`filepathname`(255)),
 KEY `fpeindex` (`filenameid`,`pathid`,`extensionid`,`revision`),
 KEY `filepathname` (`filepathname`(255)),
 CONSTRAINT `0_3651` FOREIGN KEY (`filenameid`) REFERENCES `filenames`
(`id`),
 CONSTRAINT `0_3652` FOREIGN KEY (`extensionid`) REFERENCES
`fileextensions` (`id`),
 CONSTRAINT `0_3653` FOREIGN KEY (`pathid`) REFERENCES `filepaths` (`id`)
) TYPE=InnoDB;
I used INSERT ... SELECT to copy all data from the 'files' table to
'filemetadata'.
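
For reference, the copy was a straight column-for-column insert, something
like:

INSERT INTO filemetadata
       (id, revision, filenameid, pathid, extensionid, isDeleted, filepathname)
SELECT id, revision, filenameid, pathid, extensionid, isDeleted, filepathname
  FROM files;
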
Here is something I found surprising:

mysql> SELECT COUNT(1) FROM files;
+----------+
| COUNT(1) |
+----------+
|   101013 |
+----------+
1 row in set (32.42 sec)

mysql> SELECT COUNT(1) FROM filemetadata;
+----------+
| COUNT(1) |
+----------+
|   101013 |
+----------+
1 row in set (0.29 sec)
 

SELECT COUNT(*) for InnoDB tables is a known problem...  The table
handler (for InnoDB) has to do a table scan to count all rows...  This
particular case is optimized with MyISAM ...



 





--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]