Michael McTernan wrote:

Hi,



SELECT COUNT(*) for InnoDB tables is a know problem... The table
handler (for InnoDB) has to do a table scan to count all rows... This
particular case is optimized with MyISAM ...



Sure. But why is the tablescan ~100 times faster for the table without the
BLOB column?


I would try to sum it up like this:

1. Discs work best when used as purely sequential devices.
2. Inserting those BLOBs takes up space somewhere. Strictly speaking, on the disc they may very well be in between certain rows.
3. As you may need to read over them, the disc has to seek much further and depending on the way BLOBs are placed inside the tablespace, seeking may be all over the place.


Have you tried the following to see the results:

ALTER TABLE files TYPE = MyISAM;
(Remove all indexes)
SELECT SUM(revision) FROM files;

You may want to execute the above select on your InnoDB version of the table as well.

Generally speaking, the execution time of COUNT(*) doesn't tell you very much unless you are a bit more specific:

SELECT COUNT(revision) FROM files WHERE revision BETWEEN '1' AND '2600';

I am prepared to be shot down for being totally incorrect!

Regards,

Chris

This goes back to my original assumption that I thought the BLOB contents
would be stored outside of the normal column data, meaning that you have to
do an additional seek to retrieve a BLOB value, but that a BLOB column
doesn't inflate the data over which a table scan has to iterate and so
doesn't impact tablescan performance.  Is this wrong?  Or is this different
for InnoDB vs MyISAM?

Of course, I'm aiming not to use table scans at all though :)

Thanks,

Mike



-----Original Message-----
From: Benoit St-Jean [mailto:[EMAIL PROTECTED]
Sent: 11 February 2004 22:47
To: Michael McTernan
Cc: Mysql
Subject: Re: InnoDb Table Performance problem


Michael McTernan wrote:




Hi there,

I'm making something similar to a file revision control system, and using
MySQL on Linux as the database to drive it. Almost all my tables are
InnoDB, and generally it is going very well, with the exception


of one table


that is always very slow.

This table holds the files within the database. It is defined


as follows:


CREATE TABLE `files` (
`id` int(10) unsigned NOT NULL auto_increment,
`revision` int(10) unsigned NOT NULL default '0',
`filenameid` int(10) unsigned NOT NULL default '0',
`pathid` int(10) unsigned NOT NULL default '0',
`extensionid` int(10) unsigned NOT NULL default '0',
`isDeleted` enum('0','1') NOT NULL default '0',
`filepathname` tinytext NOT NULL,
`contentsGz` longblob NOT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `revision` (`revision`,`filepathname`(255)),
KEY `fpeindex` (`filenameid`,`pathid`,`extensionid`,`revision`),
KEY `filepathname` (`filepathname`(255)),
CONSTRAINT `0_3570` FOREIGN KEY (`filenameid`) REFERENCES `filenames`
(`id`),
CONSTRAINT `0_3571` FOREIGN KEY (`extensionid`) REFERENCES
`fileextensions` (`id`),
CONSTRAINT `0_3572` FOREIGN KEY (`pathid`) REFERENCES


`filepaths` (`id`)


) TYPE=InnoDB;

The 'contentsGz' column will have the contents of the file and will
typically be a couple of hundred kilobytes, but in some rare


cases as large


as 20 Megabytes.

Selects on this table always go very slowly. I've used EXPLAIN


to look at


what is going on, and carefully added a couple of multi-column


indexes that


have improved SELECT performance (this table is updated rarely,


so I'm not


too worried about INSERT performance).  However, the performance is still
really bad.

I tried creating an identical table with the exception that it


doesn't have


the 'contentsGz' column:

CREATE TABLE `filemetadata` (
`id` int(10) unsigned NOT NULL auto_increment,
`revision` int(10) unsigned NOT NULL default '0',
`filenameid` int(10) unsigned NOT NULL default '0',
`pathid` int(10) unsigned NOT NULL default '0',
`extensionid` int(10) unsigned NOT NULL default '0',
`isDeleted` enum('0','1') NOT NULL default '0',
`filepathname` tinytext NOT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `revision` (`revision`,`filepathname`(255)),
KEY `fpeindex` (`filenameid`,`pathid`,`extensionid`,`revision`),
KEY `filepathname` (`filepathname`(255)),
CONSTRAINT `0_3651` FOREIGN KEY (`filenameid`) REFERENCES `filenames`
(`id`),
CONSTRAINT `0_3652` FOREIGN KEY (`extensionid`) REFERENCES
`fileextensions` (`id`),
CONSTRAINT `0_3653` FOREIGN KEY (`pathid`) REFERENCES


`filepaths` (`id`)


) TYPE=InnoDB;

I used UPDATE ... SELECT to copy all data from the 'files' table to
'filemetadata'.

Here is something I found suprising:

mysql> SELECT COUNT(1) FROM files;
+----------+
| COUNT(1) |
+----------+
|   101013 |
+----------+
1 row in set (32.42 sec)

mysql> SELECT COUNT(1) FROM filemetadata;
+----------+
| COUNT(1) |
+----------+
|   101013 |
+----------+
1 row in set (0.29 sec)




SELECT COUNT(*) for InnoDB tables is a know problem...  The table
handler (for InnoDB) has to do a table scan to count all rows...  This
particular case is optimized with MyISAM ...










--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:    http://lists.mysql.com/[EMAIL PROTECTED]














-- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED]



Reply via email to