Kyle, Sounds like you've thought of everything.
How much disk space does your largest table take up? Probably too much for what I'm about to suggest. There are commercial external RAM disks that support multi-gigabyte volumes. You could create a temporary database on the RAM disk, file-copy the table there, and run the import against it. When you're done, file-copy the table back to your database directory. Keep in mind these hardware RAM disks are very, very expensive.
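The copy-out/copy-back step could look something like the sketch below. The paths and table name are made up, and a temp directory stands in for the RAM disk mount (the real mysqlimport run needs a live server, so it's only indicated in comments):

```shell
# Sketch, assuming a MyISAM table (three files: .frm, .MYD, .MYI).
# Paths are hypothetical; a mktemp dir stands in for the RAM disk mount.
DATADIR=$(mktemp -d)    # stand-in for /var/lib/mysql/mydb
RAMDISK=$(mktemp -d)    # stand-in for the RAM disk volume

# Fake table files so the copy steps can be shown end to end.
touch "$DATADIR/big_table.frm" "$DATADIR/big_table.MYD" "$DATADIR/big_table.MYI"

# 1. Quiesce the table in the server (e.g. FLUSH TABLES WITH READ LOCK),
#    then copy the three MyISAM files out to the RAM disk.
cp "$DATADIR"/big_table.* "$RAMDISK"/

# 2. Point a temporary database at the RAM disk and run the import there,
#    e.g.: mysqlimport ramdisk_db /path/to/new_rows.txt  (needs a server)

# 3. Copy the rebuilt files back and FLUSH TABLES again.
cp "$RAMDISK"/big_table.* "$DATADIR"/
```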
I assume you're using CHAR and not VARCHAR for your text fields.
Have you tweaked the bulk_insert_buffer_size variable?
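For reference, bumping that variable is a one-liner; the 256 MB figure below is just an assumption to tune against available RAM:

```sql
-- Sketch: enlarge the MyISAM bulk-insert cache before a big import.
-- 268435456 bytes = 256 MB -- an assumed value, not a recommendation.
SET GLOBAL bulk_insert_buffer_size = 268435456;

-- Or per connection, right before the LOAD DATA / INSERT run:
SET SESSION bulk_insert_buffer_size = 268435456;

SHOW VARIABLES LIKE 'bulk_insert_buffer_size';
```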
This is similar to what you're doing: http://www.mysql.com/doc/en/Insert_speed.html
And of course if you have a few bucks, there is paid support from MySQL AB. They've probably encountered this problem quite a few times.
Mike
At 11:22 PM 1/30/2004, you wrote:
mos wrote:
At 01:20 PM 1/30/2004, you wrote:
I routinely see people requesting information on how to efficiently
handle *large* tables in mysql. Hopefully, the following will be helpful to those individuals.
I have several tables that are 500M to 4B rows in size and require me to drop and re-create indices at regular intervals. Usually, I'm removing about 10% of the existing records and then adding about 15% new records back in. Doing this with indices turned on would take days if not weeks. With the proper sequence of operations, however, I can accomplish this in about 24 hours.
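A minimal sketch of that sequence, with made-up table, index, and column names (the real schema isn't shown in the thread):

```sql
-- Drop the index, modify the data unindexed, then rebuild once.
ALTER TABLE big_table DROP INDEX idx_lookup;

-- Remove the ~10% stale rows (predicate is illustrative only):
DELETE FROM big_table WHERE load_date < '2003-07-01';

-- Bulk-load the ~15% new rows while the table is unindexed:
LOAD DATA INFILE '/tmp/new_rows.txt' INTO TABLE big_table;

-- One sequential index rebuild instead of millions of incremental updates:
ALTER TABLE big_table ADD INDEX idx_lookup (lookup_col);
```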
I'm assuming you're using MyISAM tables and not InnoDB.
Have you tried using "Alter Table ... Disable Keys" to disable the non-unique indexes before you do your deletes and inserts?
Yes, we are using MyISAM. InnoDB tables have been too slow in our tests. Disabling the keys does work, but the process I outlined turns out to be more efficient for very large data sets.
When possible, we load data using mysqlimport. This is much faster than connecting to the server and executing a series of INSERT statements.
You can get similar speeds using "Load Data"
mysqlimport translates directly to "LOAD DATA". Scripting with mysqlimport
is much simpler than trying to write a series of sql statements and pipe them into a client.
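To make the equivalence concrete: mysqlimport derives the table name from the file name and issues a LOAD DATA statement for you. The database, path, and delimiter below are illustrative:

```sql
-- The command line (file big_table.txt maps to table big_table):
--   mysqlimport --local --fields-terminated-by='\t' mydb /data/big_table.txt
-- is roughly equivalent to running this statement yourself:
LOAD DATA LOCAL INFILE '/data/big_table.txt'
INTO TABLE big_table
FIELDS TERMINATED BY '\t';
```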
I have to ask, why are you deleting 10% of your rows and adding another 15% back in? Are you removing old data, data that is, say, 6 months old, and then importing this month's data? If so, can't you just store one month's data in separate tables and use a Merge Table for your queries? At the end of each month, just drop the oldest table and create a new table for the current month's data.
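The month-per-table rotation could be sketched like this, in MySQL 4.x syntax with illustrative table and column names:

```sql
-- One identical MyISAM table per month:
CREATE TABLE log_2004_01 (ts DATETIME, msg CHAR(200)) TYPE=MyISAM;
CREATE TABLE log_2004_02 (ts DATETIME, msg CHAR(200)) TYPE=MyISAM;

-- Queries go against a merge table spanning the live months:
CREATE TABLE log_all (ts DATETIME, msg CHAR(200))
    TYPE=MERGE UNION=(log_2004_01, log_2004_02) INSERT_METHOD=LAST;

-- Month-end rotation: repoint the union, then drop the oldest table.
ALTER TABLE log_all UNION=(log_2004_02, log_2004_03);
DROP TABLE log_2004_01;
```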
Wrong kind of data storage architecture for my problem...
I've actually done some testing of merge tables at their upper limits. The N^2 table actually contains more than 2^32 rows, so we were forced to split it across 2 tables. (FYI: Until last spring a bug existed in merge tables where the row count was returned as an int instead of a long.)
Kyle
***************************************************************************
Kyle J. Munn          Email: [EMAIL PROTECTED]
EraGen Biosciences    Phone: 608.662.9000 x351
http://www.EraGen.com Fax:   608.662.9003
--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED]