RE: InnoDb Table Performance problem
Hi, SELECT COUNT(*) for InnoDB tables is a know problem... The table handler (for InnoDB) has to do a table scan to count all rows... This particular case is optimized with MyISAM ... Sure. But why is the tablescan ~100 times faster for the table without the BLOB column? This goes back to my original assumption that I thought the BLOB contents would be stored outside of the normal column data, meaning that you have to do an additional seek to retrieve a BLOB value, but that a BLOB column doesn't inflate the data over which a table scan has to iterate and so doesn't impact tablescan performance. Is this wrong? Or is this different for InnoDB vs MyISAM? Of course, I'm aiming not to use table scans at all though :) Thanks, Mike -Original Message- From: Benoit St-Jean [mailto:[EMAIL PROTECTED] Sent: 11 February 2004 22:47 To: Michael McTernan Cc: Mysql Subject: Re: InnoDb Table Performance problem Michael McTernan wrote: Hi there, I'm making something similar to a file revision control system, and using MySQL on Linux as the database to drive it. Almost all my tables are InnoDB, and generally it is going very well, with the exception of one table that is always very slow. This table holds the files within the database. It is defined as follows: CREATE TABLE `files` ( `id` int(10) unsigned NOT NULL auto_increment, `revision` int(10) unsigned NOT NULL default '0', `filenameid` int(10) unsigned NOT NULL default '0', `pathid` int(10) unsigned NOT NULL default '0', `extensionid` int(10) unsigned NOT NULL default '0', `isDeleted` enum('0','1') NOT NULL default '0', `filepathname` tinytext NOT NULL, `contentsGz` longblob NOT NULL, PRIMARY KEY (`id`), UNIQUE KEY `revision` (`revision`,`filepathname`(255)), KEY `fpeindex` (`filenameid`,`pathid`,`extensionid`,`revision`), KEY `filepathname` (`filepathname`(255)), CONSTRAINT `0_3570` FOREIGN KEY (`filenameid`) REFERENCES `filenames` (`id`), CONSTRAINT `0_3571` FOREIGN KEY (`extensionid`) REFERENCES `fileextensions` (`id`), CONSTRAINT `0_3572` FOREIGN KEY (`pathid`) REFERENCES `filepaths` (`id`) ) TYPE=InnoDB; The 'contentsGz' column will have the contents of the file and will typically be a couple of hundred kilobytes, but in some rare cases as large as 20 Megabytes. Selects on this table always go very slowly. I've used EXPLAIN to look at what is going on, and carefully added a couple of multi-column indexes that have improved SELECT performance (this table is updated rarely, so I'm not too worried about INSERT performance). However, the performance is still really bad. I tried creating an identical table with the exception that it doesn't have the 'contentsGz' column: CREATE TABLE `filemetadata` ( `id` int(10) unsigned NOT NULL auto_increment, `revision` int(10) unsigned NOT NULL default '0', `filenameid` int(10) unsigned NOT NULL default '0', `pathid` int(10) unsigned NOT NULL default '0', `extensionid` int(10) unsigned NOT NULL default '0', `isDeleted` enum('0','1') NOT NULL default '0', `filepathname` tinytext NOT NULL, PRIMARY KEY (`id`), UNIQUE KEY `revision` (`revision`,`filepathname`(255)), KEY `fpeindex` (`filenameid`,`pathid`,`extensionid`,`revision`), KEY `filepathname` (`filepathname`(255)), CONSTRAINT `0_3651` FOREIGN KEY (`filenameid`) REFERENCES `filenames` (`id`), CONSTRAINT `0_3652` FOREIGN KEY (`extensionid`) REFERENCES `fileextensions` (`id`), CONSTRAINT `0_3653` FOREIGN KEY (`pathid`) REFERENCES `filepaths` (`id`) ) TYPE=InnoDB; I used UPDATE ... SELECT to copy all data from the 'files' table to 'filemetadata'. Here is something I found suprising: mysql SELECT COUNT(1) FROM files; +--+ | COUNT(1) | +--+ | 101013 | +--+ 1 row in set (32.42 sec) mysql SELECT COUNT(1) FROM filemetadata; +--+ | COUNT(1) | +--+ | 101013 | +--+ 1 row in set (0.29 sec) SELECT COUNT(*) for InnoDB tables is a know problem... The table handler (for InnoDB) has to do a table scan to count all rows... This particular case is optimized with MyISAM ... -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED] -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
Re: InnoDb Table Performance problem
Michael McTernan wrote: Hi, SELECT COUNT(*) for InnoDB tables is a know problem... The table handler (for InnoDB) has to do a table scan to count all rows... This particular case is optimized with MyISAM ... Sure. But why is the tablescan ~100 times faster for the table without the BLOB column? I would try to sum it up like this: 1. Discs work best when used as purely sequential devices. 2. Inserting those BLOBs takes up space somewhere. Strictly speaking, on the disc they may very well be in between certain rows. 3. As you may need to read over them, the disc has to seek much further and depending on the way BLOBs are placed inside the tablespace, seeking may be all over the place. Have you tried the following to see the results: ALTER TABLE files TYPE = MyISAM; (Remove all indexes) SELECT SUM(revision) FROM files; You may want to execute the above select on your InnoDB version of the table as well. Generally speaking, the execution time of COUNT(*) doesn't tell you very much unless you are a bit more specific: SELECT COUNT(revision) FROM files WHERE revision BETWEEN '1' AND '2600'; I am prepared to be shot down for being totally incorrect! Regards, Chris This goes back to my original assumption that I thought the BLOB contents would be stored outside of the normal column data, meaning that you have to do an additional seek to retrieve a BLOB value, but that a BLOB column doesn't inflate the data over which a table scan has to iterate and so doesn't impact tablescan performance. Is this wrong? Or is this different for InnoDB vs MyISAM? Of course, I'm aiming not to use table scans at all though :) Thanks, Mike -Original Message- From: Benoit St-Jean [mailto:[EMAIL PROTECTED] Sent: 11 February 2004 22:47 To: Michael McTernan Cc: Mysql Subject: Re: InnoDb Table Performance problem Michael McTernan wrote: Hi there, I'm making something similar to a file revision control system, and using MySQL on Linux as the database to drive it. Almost all my tables are InnoDB, and generally it is going very well, with the exception of one table that is always very slow. This table holds the files within the database. It is defined as follows: CREATE TABLE `files` ( `id` int(10) unsigned NOT NULL auto_increment, `revision` int(10) unsigned NOT NULL default '0', `filenameid` int(10) unsigned NOT NULL default '0', `pathid` int(10) unsigned NOT NULL default '0', `extensionid` int(10) unsigned NOT NULL default '0', `isDeleted` enum('0','1') NOT NULL default '0', `filepathname` tinytext NOT NULL, `contentsGz` longblob NOT NULL, PRIMARY KEY (`id`), UNIQUE KEY `revision` (`revision`,`filepathname`(255)), KEY `fpeindex` (`filenameid`,`pathid`,`extensionid`,`revision`), KEY `filepathname` (`filepathname`(255)), CONSTRAINT `0_3570` FOREIGN KEY (`filenameid`) REFERENCES `filenames` (`id`), CONSTRAINT `0_3571` FOREIGN KEY (`extensionid`) REFERENCES `fileextensions` (`id`), CONSTRAINT `0_3572` FOREIGN KEY (`pathid`) REFERENCES `filepaths` (`id`) ) TYPE=InnoDB; The 'contentsGz' column will have the contents of the file and will typically be a couple of hundred kilobytes, but in some rare cases as large as 20 Megabytes. Selects on this table always go very slowly. I've used EXPLAIN to look at what is going on, and carefully added a couple of multi-column indexes that have improved SELECT performance (this table is updated rarely, so I'm not too worried about INSERT performance). However, the performance is still really bad. I tried creating an identical table with the exception that it doesn't have the 'contentsGz' column: CREATE TABLE `filemetadata` ( `id` int(10) unsigned NOT NULL auto_increment, `revision` int(10) unsigned NOT NULL default '0', `filenameid` int(10) unsigned NOT NULL default '0', `pathid` int(10) unsigned NOT NULL default '0', `extensionid` int(10) unsigned NOT NULL default '0', `isDeleted` enum('0','1') NOT NULL default '0', `filepathname` tinytext NOT NULL, PRIMARY KEY (`id`), UNIQUE KEY `revision` (`revision`,`filepathname`(255)), KEY `fpeindex` (`filenameid`,`pathid`,`extensionid`,`revision`), KEY `filepathname` (`filepathname`(255)), CONSTRAINT `0_3651` FOREIGN KEY (`filenameid`) REFERENCES `filenames` (`id`), CONSTRAINT `0_3652` FOREIGN KEY (`extensionid`) REFERENCES `fileextensions` (`id`), CONSTRAINT `0_3653` FOREIGN KEY (`pathid`) REFERENCES `filepaths` (`id`) ) TYPE=InnoDB; I used UPDATE ... SELECT to copy all data from the 'files' table to 'filemetadata'. Here is something I found suprising: mysql SELECT COUNT(1) FROM files; +--+ | COUNT(1) | +--+ | 101013 | +--+ 1 row in set (32.42 sec) mysql SELECT COUNT(1) FROM filemetadata; +--+ | COUNT(1) | +--+ | 101013 | +--+ 1 row in set (0.29 sec) SELECT COUNT(*) for InnoDB tables is a know problem... The table handler (for InnoDB) has
Re: InnoDb Table Performance problem
Michael McTernan wrote: Hi, SELECT COUNT(*) for InnoDB tables is a know problem... The table handler (for InnoDB) has to do a table scan to count all rows... This particular case is optimized with MyISAM ... Sure. But why is the tablescan ~100 times faster for the table without the BLOB column? Well, two possible explanations... BLOBS are either stored in a separate area thus forcing MySQL to perform an extra seek to retrieve the full row or the table scan takes a lot more time to execute because of the BLOB data you have to retrieve... Reading 10 rows of 1K is way faster than retrieving 10 rows of 800K! By your result, I am assuming that your table with the BLOB is 100 times bigger than the one without it... -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
RE: InnoDb Table Performance problem
Hi, Well, two possible explanations... BLOBS are either stored in a separate area thus forcing MySQL to perform an extra seek to retrieve the full row or the table scan takes a lot more time to execute because of the BLOB data you have to retrieve... I'm not often retrieving the BLOB value in my queries, so wouldn't expect to get the second seek performance it, and hence would expect good peformance. Reading 10 rows of 1K is way faster than retrieving 10 rows of 800K! By your result, I am assuming that your table with the BLOB is 100 times bigger than the one without it... I'm not sure how to get the size of a table from InnoDb, so could guess that this is true, but I'm not certain. I'm guessing that with InnoDb, the BLOB data is being stored inline with the other column data :( Thanks, Mike -Original Message- From: Benoit St-Jean [mailto:[EMAIL PROTECTED] Sent: 12 February 2004 14:04 To: Michael McTernan Cc: Mysql Subject: Re: InnoDb Table Performance problem Michael McTernan wrote: Hi, SELECT COUNT(*) for InnoDB tables is a know problem... The table handler (for InnoDB) has to do a table scan to count all rows... This particular case is optimized with MyISAM ... Sure. But why is the tablescan ~100 times faster for the table without the BLOB column? Well, two possible explanations... BLOBS are either stored in a separate area thus forcing MySQL to perform an extra seek to retrieve the full row or the table scan takes a lot more time to execute because of the BLOB data you have to retrieve... Reading 10 rows of 1K is way faster than retrieving 10 rows of 800K! By your result, I am assuming that your table with the BLOB is 100 times bigger than the one without it... -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
RE: InnoDb Table Performance problem
Performance problem Michael McTernan wrote: Hi, SELECT COUNT(*) for InnoDB tables is a know problem... The table handler (for InnoDB) has to do a table scan to count all rows... This particular case is optimized with MyISAM ... Sure. But why is the tablescan ~100 times faster for the table without the BLOB column? I would try to sum it up like this: 1. Discs work best when used as purely sequential devices. 2. Inserting those BLOBs takes up space somewhere. Strictly speaking, on the disc they may very well be in between certain rows. 3. As you may need to read over them, the disc has to seek much further and depending on the way BLOBs are placed inside the tablespace, seeking may be all over the place. Have you tried the following to see the results: ALTER TABLE files TYPE = MyISAM; (Remove all indexes) SELECT SUM(revision) FROM files; You may want to execute the above select on your InnoDB version of the table as well. Generally speaking, the execution time of COUNT(*) doesn't tell you very much unless you are a bit more specific: SELECT COUNT(revision) FROM files WHERE revision BETWEEN '1' AND '2600'; I am prepared to be shot down for being totally incorrect! Regards, Chris This goes back to my original assumption that I thought the BLOB contents would be stored outside of the normal column data, meaning that you have to do an additional seek to retrieve a BLOB value, but that a BLOB column doesn't inflate the data over which a table scan has to iterate and so doesn't impact tablescan performance. Is this wrong? Or is this different for InnoDB vs MyISAM? Of course, I'm aiming not to use table scans at all though :) Thanks, Mike -Original Message- From: Benoit St-Jean [mailto:[EMAIL PROTECTED] Sent: 11 February 2004 22:47 To: Michael McTernan Cc: Mysql Subject: Re: InnoDb Table Performance problem Michael McTernan wrote: Hi there, I'm making something similar to a file revision control system, and using MySQL on Linux as the database to drive it. Almost all my tables are InnoDB, and generally it is going very well, with the exception of one table that is always very slow. This table holds the files within the database. It is defined as follows: CREATE TABLE `files` ( `id` int(10) unsigned NOT NULL auto_increment, `revision` int(10) unsigned NOT NULL default '0', `filenameid` int(10) unsigned NOT NULL default '0', `pathid` int(10) unsigned NOT NULL default '0', `extensionid` int(10) unsigned NOT NULL default '0', `isDeleted` enum('0','1') NOT NULL default '0', `filepathname` tinytext NOT NULL, `contentsGz` longblob NOT NULL, PRIMARY KEY (`id`), UNIQUE KEY `revision` (`revision`,`filepathname`(255)), KEY `fpeindex` (`filenameid`,`pathid`,`extensionid`,`revision`), KEY `filepathname` (`filepathname`(255)), CONSTRAINT `0_3570` FOREIGN KEY (`filenameid`) REFERENCES `filenames` (`id`), CONSTRAINT `0_3571` FOREIGN KEY (`extensionid`) REFERENCES `fileextensions` (`id`), CONSTRAINT `0_3572` FOREIGN KEY (`pathid`) REFERENCES `filepaths` (`id`) ) TYPE=InnoDB; The 'contentsGz' column will have the contents of the file and will typically be a couple of hundred kilobytes, but in some rare cases as large as 20 Megabytes. Selects on this table always go very slowly. I've used EXPLAIN to look at what is going on, and carefully added a couple of multi-column indexes that have improved SELECT performance (this table is updated rarely, so I'm not too worried about INSERT performance). However, the performance is still really bad. I tried creating an identical table with the exception that it doesn't have the 'contentsGz' column: CREATE TABLE `filemetadata` ( `id` int(10) unsigned NOT NULL auto_increment, `revision` int(10) unsigned NOT NULL default '0', `filenameid` int(10) unsigned NOT NULL default '0', `pathid` int(10) unsigned NOT NULL default '0', `extensionid` int(10) unsigned NOT NULL default '0', `isDeleted` enum('0','1') NOT NULL default '0', `filepathname` tinytext NOT NULL, PRIMARY KEY (`id`), UNIQUE KEY `revision` (`revision`,`filepathname`(255)), KEY `fpeindex` (`filenameid`,`pathid`,`extensionid`,`revision`), KEY `filepathname` (`filepathname`(255)), CONSTRAINT `0_3651` FOREIGN KEY (`filenameid`) REFERENCES `filenames` (`id`), CONSTRAINT `0_3652` FOREIGN KEY (`extensionid`) REFERENCES `fileextensions` (`id`), CONSTRAINT `0_3653` FOREIGN KEY (`pathid`) REFERENCES `filepaths` (`id`) ) TYPE=InnoDB; I used UPDATE ... SELECT to copy all data from the 'files' table to 'filemetadata'. Here is something I found suprising: mysql SELECT COUNT(1) FROM files; +--+ | COUNT(1) | +--+ | 101013 | +--+ 1 row in set (32.42 sec) mysql SELECT COUNT(1) FROM filemetadata; +--+ | COUNT(1
Re: InnoDb Table Performance problem
(revision) | +---+ |561401 | +---+ 1 row in set (0.18 sec) [Restart database] mysql SELECT SUM(revision) FROM inofiles; +---+ | SUM(revision) | +---+ |561401 | +---+ 1 row in set (0.29 sec) mysql SELECT SUM(revision) from inofiles; +---+ | SUM(revision) | +---+ |561401 | +---+ 1 row in set (0.10 sec) So I guess the conclusion is that BLOBs are *really* bad for table scans in MyISAM _and_ InnoDB, although perhaps slightly less so for InnoDB. I guess the BLOB data must get interleaved with the non-BLOB data on disk, and that putting the BLOBs into a different table may cause some sort of partitioning between the tables on disk. That may help. One thing you may want to do is re-run the above with an index on the revision column for both table types with and without BLOBs. I'm betting that you'll see little deviation between the BLOB-equipped tables and the BLOB-less ones. One thing you might want to investigate is storing your BLOBs in a MyISAM table and your other data in an InnoDB table. You have to be impressed with InnoDB though - it's a full-on multiversioned storage engine and it manages to keep pace with MyISAM (and outperform it on occassion). Considering MyISAM's design brief was to be sleek, fast and space efficient (which gives it further speed advantages), InnoDB's comparitive overhead is hidden extremely well. Unfortunately, I really started using MySQL well after NuSphere buggered up everything and retired the Gemini table type - it would have been interesting to see all 4 table types be compared with all the nice speed improvements 4.0.x has brough along. I'm guessing that the real problem here is that some of my queries are secretly doing table scans when they shouldn't, and that is causing a huge slowdown, although I might split the BLOB column into a different table such that table scans don't give me such a bad hit if they do happen. There is a table I query often, with two ENUM columns, a TIMESTAMP column and a TEXT column. I often need to throw queries at it that look for substrings in the TEXT column for rows between two dates with certain values for the ENUM columns. There are indexes on the ENUM, TIMESTAMP and TEXT columns (prefix indexes of course). Unless I specify FORCE INDEX(date), MySQL does a full table scan, resulting in queries that take 2 minutes to complete. Using the FORCE INDEX(date) predicate, the query time drops to about 2 seconds. Thanks, Mike You're welcome! I hope that this helps you out! Regards, Chris -Original Message- From: Chris Nolan [mailto:[EMAIL PROTECTED] Sent: 12 February 2004 12:23 To: Michael McTernan Cc: Benoit St-Jean; Mysql Subject: Re: InnoDb Table Performance problem Michael McTernan wrote: Hi, SELECT COUNT(*) for InnoDB tables is a know problem... The table handler (for InnoDB) has to do a table scan to count all rows... This particular case is optimized with MyISAM ... Sure. But why is the tablescan ~100 times faster for the table without the BLOB column? I would try to sum it up like this: 1. Discs work best when used as purely sequential devices. 2. Inserting those BLOBs takes up space somewhere. Strictly speaking, on the disc they may very well be in between certain rows. 3. As you may need to read over them, the disc has to seek much further and depending on the way BLOBs are placed inside the tablespace, seeking may be all over the place. Have you tried the following to see the results: ALTER TABLE files TYPE = MyISAM; (Remove all indexes) SELECT SUM(revision) FROM files; You may want to execute the above select on your InnoDB version of the table as well. Generally speaking, the execution time of COUNT(*) doesn't tell you very much unless you are a bit more specific: SELECT COUNT(revision) FROM files WHERE revision BETWEEN '1' AND '2600'; I am prepared to be shot down for being totally incorrect! Regards, Chris This goes back to my original assumption that I thought the BLOB contents would be stored outside of the normal column data, meaning that you have to do an additional seek to retrieve a BLOB value, but that a BLOB column doesn't inflate the data over which a table scan has to iterate and so doesn't impact tablescan performance. Is this wrong? Or is this different for InnoDB vs MyISAM? Of course, I'm aiming not to use table scans at all though :) Thanks, Mike -Original Message- From: Benoit St-Jean [mailto:[EMAIL PROTECTED] Sent: 11 February 2004 22:47 To: Michael McTernan Cc: Mysql Subject: Re: InnoDb Table Performance problem Michael McTernan wrote: Hi there, I'm making something similar to a file revision control system, and using MySQL on Linux as the database to drive it. Almost all my tables are InnoDB, and generally it is going very well
RE: InnoDb Table Performance problem
Hi Chris, A HEX editor? :-) Finding out what the developers at MySQL AB like to drink and sending them a few megalitres of it? :-) Heh heh :-) In all seriousness, you haven't got a great many options (although I could be completely wrong, in which case some kind person on this list will enlighten / flame me), but using OPTIMIZE TABLE on MyISAM tables and ALTER TABLE TYPE = InnoDB on InnoDB tables will defrag your table for you, possibly helping matters. I'm pretty sure that the new tables I created for the test would have been defraged since I ALTER'd them a couple of times to drop indices. InnoDB looks at tablespace files and raw partitions in the same way and manages space within them in a method related to that used by the Berkeley Fast File System. Sounds like it is already very efficient then. It *might* have some more options to play with if you wrap a few INSERTs in a transaction block. That's a good idea - I could certainly try that. A future version of InnoDb may also take advantage of this I guess :) That may help. One thing you may want to do is re-run the above with an index on the revision column for both table types with and without BLOBs. I did try adding indices, and you are correct, it brings the times right down, so it is only table scan performance that is at fault here. You have to be impressed with InnoDB though - it's a full-on multiversioned storage engine and it manages to keep pace with MyISAM (and outperform it on occassion). Completely. I'm impressed with the whole lot of MySQL - it's brilliant. Using the FORCE INDEX(date) predicate, the query time drops to about 2 seconds. I've had to do this in the past too. I'll be looking out for slow queries in the logs and then EXPLAINing them to see where I can make this improvement. Thankyou for all your help. While there is no simple solution, I'm a lot more sure of my options now :) Cheers, Mike -Original Message- From: Chris Nolan [mailto:[EMAIL PROTECTED] Sent: 12 February 2004 17:28 To: Michael McTernan Cc: Mysql; Benoit St-Jean Subject: Re: InnoDb Table Performance problem Michael McTernan wrote: Hi there, I would try to sum it up like this: 1. Discs work best when used as purely sequential devices. 2. Inserting those BLOBs takes up space somewhere. Strictly speaking, on the disc they may very well be in between certain rows. Good points - I guess there is no way for the database to control where on the file system the data ends up. I wonder if using a raw parition for the database under InnoDB would fix this? I guess that would give the storage engine the opportunity to put the BLOB data in a heap at one end of a disk while keeping the table data together at the other, although this sounds like a lot of work for the storage engine. What would be good would be a tool to show how 'fragmented' a database table is on disk? A HEX editor? :-) Finding out what the developers at MySQL AB like to drink and sending them a few megalitres of it? :-) In all seriousness, you haven't got a great many options (although I could be completely wrong, in which case some kind person on this list will enlighten / flame me), but using OPTIMIZE TABLE on MyISAM tables and ALTER TABLE TYPE = InnoDB on InnoDB tables will defrag your table for you, possibly helping matters. In each case, the table has a SHARE MODE lock set (read only), copies all existing rows to a new table such that they are positioned optimally, drops the old table and renames the new table to the original name. InnoDB looks at tablespace files and raw partitions in the same way and manages space within them in a method related to that used by the Berkeley Fast File System. The only advantage of raw partition usage is bypassing filesystem logic which is very beneficial on some OSes (not Linux or any of the BSDs to my knowledge though - their FS layers are damned quick!). 3. As you may need to read over them, the disc has to seek much further and depending on the way BLOBs are placed inside the tablespace, seeking may be all over the place. Another good point. It maybe that by creating the 'filemetadata' table in one go I've just ensured that most of the columns are linear on the disc. In which case re-creating the 'files' table and then adding the BLOB data back as a second step might fix this. My experiments show this is not the case though... If you're doing INSERTs in AUTO_COMMIT mode, then InnoDB will have no choice but to put the BLOBs where it can. It *might* have some more options to play with if you wrap a few INSERTs in a transaction block. Have you tried the following to see the results: ALTER TABLE files TYPE = MyISAM; (Remove all indexes) SELECT SUM(revision) FROM files; Okay. I've made two new tables, one called 'myfiles' and one called 'inofiles' which are both duplicates of 'files' without any indices except
Re: InnoDb Table Performance problem
Michael McTernan wrote: Hi there, I'm making something similar to a file revision control system, and using MySQL on Linux as the database to drive it. Almost all my tables are InnoDB, and generally it is going very well, with the exception of one table that is always very slow. This table holds the files within the database. It is defined as follows: CREATE TABLE `files` ( `id` int(10) unsigned NOT NULL auto_increment, `revision` int(10) unsigned NOT NULL default '0', `filenameid` int(10) unsigned NOT NULL default '0', `pathid` int(10) unsigned NOT NULL default '0', `extensionid` int(10) unsigned NOT NULL default '0', `isDeleted` enum('0','1') NOT NULL default '0', `filepathname` tinytext NOT NULL, `contentsGz` longblob NOT NULL, PRIMARY KEY (`id`), UNIQUE KEY `revision` (`revision`,`filepathname`(255)), KEY `fpeindex` (`filenameid`,`pathid`,`extensionid`,`revision`), KEY `filepathname` (`filepathname`(255)), CONSTRAINT `0_3570` FOREIGN KEY (`filenameid`) REFERENCES `filenames` (`id`), CONSTRAINT `0_3571` FOREIGN KEY (`extensionid`) REFERENCES `fileextensions` (`id`), CONSTRAINT `0_3572` FOREIGN KEY (`pathid`) REFERENCES `filepaths` (`id`) ) TYPE=InnoDB; The 'contentsGz' column will have the contents of the file and will typically be a couple of hundred kilobytes, but in some rare cases as large as 20 Megabytes. Selects on this table always go very slowly. I've used EXPLAIN to look at what is going on, and carefully added a couple of multi-column indexes that have improved SELECT performance (this table is updated rarely, so I'm not too worried about INSERT performance). However, the performance is still really bad. I tried creating an identical table with the exception that it doesn't have the 'contentsGz' column: CREATE TABLE `filemetadata` ( `id` int(10) unsigned NOT NULL auto_increment, `revision` int(10) unsigned NOT NULL default '0', `filenameid` int(10) unsigned NOT NULL default '0', `pathid` int(10) unsigned NOT NULL default '0', `extensionid` int(10) unsigned NOT NULL default '0', `isDeleted` enum('0','1') NOT NULL default '0', `filepathname` tinytext NOT NULL, PRIMARY KEY (`id`), UNIQUE KEY `revision` (`revision`,`filepathname`(255)), KEY `fpeindex` (`filenameid`,`pathid`,`extensionid`,`revision`), KEY `filepathname` (`filepathname`(255)), CONSTRAINT `0_3651` FOREIGN KEY (`filenameid`) REFERENCES `filenames` (`id`), CONSTRAINT `0_3652` FOREIGN KEY (`extensionid`) REFERENCES `fileextensions` (`id`), CONSTRAINT `0_3653` FOREIGN KEY (`pathid`) REFERENCES `filepaths` (`id`) ) TYPE=InnoDB; I used UPDATE ... SELECT to copy all data from the 'files' table to 'filemetadata'. Here is something I found suprising: mysql SELECT COUNT(1) FROM files; +--+ | COUNT(1) | +--+ | 101013 | +--+ 1 row in set (32.42 sec) mysql SELECT COUNT(1) FROM filemetadata; +--+ | COUNT(1) | +--+ | 101013 | +--+ 1 row in set (0.29 sec) SELECT COUNT(*) for InnoDB tables is a know problem... The table handler (for InnoDB) has to do a table scan to count all rows... This particular case is optimized with MyISAM ... -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]