Re: Load Data Infile ... Repair with KeyCache --WAY SLOW!
At 05:18 PM 3/6/2005, Harrison Fisk wrote: Hi, On Mar 6, 2005, at 12:51 PM, mos wrote: At 10:07 PM 3/2/2005, you wrote: Hello. You may use ALTER TABLE .. DISABLE KEYS to speed up the loading process on the MyISAM table. I gave that a try but I had to cancel Alter Table ... Enable Keys after 49 hours. I find it amazing that it takes only 6.25 hours to load 450 million rows into the table using Load Data Infile with the keys disabled, but over 49 hours (maybe a lot more) to rebuild 3 indexes. Why do you find that amazing? The import is pretty much a raw file copy by MySQL. That 6 hours is just copying data from one place to another. When you create indexes MySQL has to go through and sort the data. While doing so it has to create a special structure and maintain it and write it to disk. Creating an index is *much* more effort than a simple file copy. Even with 100g of disk space dedicated to the sort file. It started using Repair with Filesort for the first 7 hours, then it switched to Repair with KeyCache. It should only use one of the Repair methods. If it switched then something was wrong with Repair by filesort (maybe out of disk space?) Look in your error log and see if there is any message there about it. There was a [Warning] Warning: Enabling keys got errno 136, retrying and it kept running for almost 2 more days. (Yes, I guess I should have checked earlier.) So apparently it ran out of index space because I didn't build the table with Max_Rows=10 (1 billion?). I'm using NTFS so there is no 4gb limit. The data file is about 84g. There should have been plenty of disk space available for the index file. I think what bothers me the most about all this is I'm kept in the dark about the progress of rebuilding the index (or table). Wouldn't it be nice if the Show Process List could say 10% complete or 99% complete? If it sat at 5% complete for a few hours (days?), then I would know something was wrong. But if the progress steadily increased every few hours and when it got to 99% complete I would know it is almost complete. I could let my client know it is 99% complete rather than saying I don't know every few hours. I would like to see this % complete as a feature so people aren't kept in the dark. At least then they would know whether the process is hung or not. Just my 2 cents. Mike What did you have your myisam_* variables set to? I don't know anyone who can afford to take their database down for a few days (weeks?) while building an index. How often do you need to load in 450 million rows? Generally this only occurs once, or in a batch process, such as once per month. I don't know what your expectations are, but this is never going to be a very fast process under any DBMS. Importing 450M rows will take some decent amount of time. Mike mos [EMAIL PROTECTED] wrote: I have a 50g CSV file that I am trying to import into an empty MyISAM table. It appears to go fine except after 10 hours it hasn't completed. A Show Process List shows Info=load data infile ... and State=Repair with keycache. The table has a few hundred million rows of data. I assume it is using Repair with keycache to rebuild the indexes after the data has been loaded. From what I've read Repair with keycache is going to take a huge amount of time to complete. How do I disable Repair with keycache and use Repair with sort instead (which should be faster, right?)? I'm using MySQL 4.1.10 on Win XP and 1gb ram. TIA Mike -- For technical support contracts, goto https://order.mysql.com/?ref=ensita This email is sponsored by Ensita.NET http://www.ensita.net/ __ ___ ___ __ / |/ /_ __/ __/ __ \/ /Gleb Paharenko / /|_/ / // /\ \/ /_/ / /__ [EMAIL PROTECTED] /_/ /_/\_, /___/\___\_\___/ MySQL AB / Ensita.NET ___/ www.mysql.com -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED] -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED] Regards, Harrison -- Harrison C. Fisk, Trainer and Consultant MySQL AB, www.mysql.com Get a jumpstart on MySQL Cluster -- http://www.mysql.com/consulting/packaged/cluster.html -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
Re: Load Data Infile ... Repair with KeyCache --WAY SLOW!
At 10:07 PM 3/2/2005, you wrote: Hello. You may use ALTER TABLE .. DISABLE KEYS to speed up the loading process on the MyISAM table. I gave that a try but I had to cancel Alter Table ... Enable Keys after 49 hours. I find it amazing that it takes only 6.25 hours to load 450 million rows into the table using Load Data Infile with the keys disabled, but over 49 hours (maybe a lot more) to rebuild 3 indexes. Even with 100g of disk space dedicated to the sort file. It started using Repair with Filesort for the first 7 hours, then it switched to Repair with KeyCache. I don't know anyone who can afford to take their database down for a few days (weeks?) while building an index. Mike mos [EMAIL PROTECTED] wrote: I have a 50g CSV file that I am trying to import into an empty MyISAM table. It appears to go fine except after 10 hours it hasn't completed. A Show Process List shows Info=load data infile ... and State=Repair with keycache. The table has a few hundred million rows of data. I assume it is using Repair with keycache to rebuild the indexes after the data has been loaded. From what I've read Repair with keycache is going to take a huge amount of time to complete. How do I disable Repair with keycache and use Repair with sort instead (which should be faster, right?)? I'm using MySQL 4.1.10 on Win XP and 1gb ram. TIA Mike -- For technical support contracts, goto https://order.mysql.com/?ref=ensita This email is sponsored by Ensita.NET http://www.ensita.net/ __ ___ ___ __ / |/ /_ __/ __/ __ \/ /Gleb Paharenko / /|_/ / // /\ \/ /_/ / /__ [EMAIL PROTECTED] /_/ /_/\_, /___/\___\_\___/ MySQL AB / Ensita.NET ___/ www.mysql.com -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED] -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
Re: Load Data Infile ... Repair with KeyCache --WAY SLOW!
Hi, On Mar 6, 2005, at 12:51 PM, mos wrote: At 10:07 PM 3/2/2005, you wrote: Hello. You may use ALTER TABLE .. DISABLE KEYS to speed up the loading process on the MyISAM table. I gave that a try but I had to cancel Alter Table ... Enable Keys after 49 hours. I find it amazing that it takes only 6.25 hours to load 450 million rows into the table using Load Data Infile with the keys disabled, but over 49 hours (maybe a lot more) to rebuild 3 indexes. Why do you find that amazing? The import is pretty much a raw file copy by MySQL. That 6 hours is just copying data from one place to another. When you create indexes MySQL has to go through and sort the data. While doing so it has to create a special structure and maintain it and write it to disk. Creating an index is *much* more effort than a simple file copy. Even with 100g of disk space dedicated to the sort file. It started using Repair with Filesort for the first 7 hours, then it switched to Repair with KeyCache. It should only use one of the Repair methods. If it switched then something was wrong with Repair by filesort (maybe out of disk space?) Look in your error log and see if there is any message there about it. What did you have your myisam_* variables set to? I don't know anyone who can afford to take their database down for a few days (weeks?) while building an index. How often do you need to load in 450 million rows? Generally this only occurs once, or in a batch process, such as once per month. I don't know what your expectations are, but this is never going to be a very fast process under any DBMS. Importing 450M rows will take some decent amount of time. Mike mos [EMAIL PROTECTED] wrote: I have a 50g CSV file that I am trying to import into an empty MyISAM table. It appears to go fine except after 10 hours it hasn't completed. A Show Process List shows Info=load data infile ... and State=Repair with keycache. The table has a few hundred million rows of data. I assume it is using Repair with keycache to rebuild the indexes after the data has been loaded. From what I've read Repair with keycache is going to take a huge amount of time to complete. How do I disable Repair with keycache and use Repair with sort instead (which should be faster, right?)? I'm using MySQL 4.1.10 on Win XP and 1gb ram. TIA Mike -- For technical support contracts, goto https://order.mysql.com/?ref=ensita This email is sponsored by Ensita.NET http://www.ensita.net/ __ ___ ___ __ / |/ /_ __/ __/ __ \/ /Gleb Paharenko / /|_/ / // /\ \/ /_/ / /__ [EMAIL PROTECTED] /_/ /_/\_, /___/\___\_\___/ MySQL AB / Ensita.NET ___/ www.mysql.com -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED] -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED] Regards, Harrison -- Harrison C. Fisk, Trainer and Consultant MySQL AB, www.mysql.com Get a jumpstart on MySQL Cluster -- http://www.mysql.com/consulting/packaged/cluster.html -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
Re: Load Data Infile ... Repair with KeyCache --WAY SLOW!
Hello. You may use ALTER TABLE .. DISABLE KEYS to speed up the loading process on the MyISAM table. mos [EMAIL PROTECTED] wrote: I have a 50g CSV file that I am trying to import into an empty MyISAM table. It appears to go fine except after 10 hours it hasn't completed. A Show Process List shows Info=load data infile ... and State=Repair with keycache. The table has a few hundred million rows of data. I assume it is using Repair with keycache to rebuild the indexes after the data has been loaded. From what I've read Repair with keycache is going to take a huge amount of time to complete. How do I disable Repair with keycache and use Repair with sort instead (which should be faster, right?)? I'm using MySQL 4.1.10 on Win XP and 1gb ram. TIA Mike -- For technical support contracts, goto https://order.mysql.com/?ref=ensita This email is sponsored by Ensita.NET http://www.ensita.net/ __ ___ ___ __ / |/ /_ __/ __/ __ \/ /Gleb Paharenko / /|_/ / // /\ \/ /_/ / /__ [EMAIL PROTECTED] /_/ /_/\_, /___/\___\_\___/ MySQL AB / Ensita.NET ___/ www.mysql.com -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
Re: Load Data Infile ... Repair with KeyCache --WAY SLOW!
At 10:07 PM 3/2/2005, you wrote: Hello. You may use ALTER TABLE .. DISABLE KEYS to speed up the loading process on the MyISAM table. That may work provided I can get the keys rebuilt later using FileSort and not KeyCache. You see the problem isn't in loading the data into the table which occurs reasonably fast that because running Load Data Infile on an empty table will disable the keys until all the data is loaded, so explicitly disabling before hand them won't help. After the data is loaded, Load Data Infile will then rebuild the keys and will choose to use Repair with keycache whereas Repair with filesort would be 100x-1000x faster. There doesn't seem to be any way to get it to use Repair with filesort. So I could use disable keys as you had suggested and then rebuild them manually with myisamchk repair with sort provided it doesn't reload all the data into a temporary table (which Alter Table usually does). If it does create a temporary table it will physically reloads the 500 million rows a second time and I will need another 100g of free disk space. So is there a solution to this paradox? I seem to be left with two options here: a) Do I shoot myself in the left foot right away, b) or Do I wait and shoot myself in the right foot? bg Is there a 3rd option? TIA Mike mos [EMAIL PROTECTED] wrote: I have a 50g CSV file that I am trying to import into an empty MyISAM table. It appears to go fine except after 10 hours it hasn't completed. A Show Process List shows Info=load data infile ... and State=Repair with keycache. The table has a few hundred million rows of data. I assume it is using Repair with keycache to rebuild the indexes after the data has been loaded. From what I've read Repair with keycache is going to take a huge amount of time to complete. How do I disable Repair with keycache and use Repair with sort instead (which should be faster, right?)? I'm using MySQL 4.1.10 on Win XP and 1gb ram. TIA Mike -- For technical support contracts, goto https://order.mysql.com/?ref=ensita This email is sponsored by Ensita.NET http://www.ensita.net/ __ ___ ___ __ / |/ /_ __/ __/ __ \/ /Gleb Paharenko / /|_/ / // /\ \/ /_/ / /__ [EMAIL PROTECTED] /_/ /_/\_, /___/\___\_\___/ MySQL AB / Ensita.NET ___/ www.mysql.com -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED] -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
Re: Load Data Infile ... Repair with KeyCache --WAY SLOW!
Hi, On Mar 3, 2005, at 11:32 AM, mos wrote: At 10:07 PM 3/2/2005, you wrote: Hello. You may use ALTER TABLE .. DISABLE KEYS to speed up the loading process on the MyISAM table. That may work provided I can get the keys rebuilt later using FileSort and not KeyCache. You see the problem isn't in loading the data into the table which occurs reasonably fast that because running Load Data Infile on an empty table will disable the keys until all the data is loaded, so explicitly disabling before hand them won't help. After the data is loaded, Load Data Infile will then rebuild the keys and will choose to use Repair with keycache whereas Repair with filesort would be 100x-1000x faster. There doesn't seem to be any way to get it to use Repair with filesort. So I could use disable keys as you had suggested and then rebuild them manually with myisamchk repair with sort provided it doesn't reload all the data into a temporary table (which Alter Table usually does). If it does create a temporary table it will physically reloads the 500 million rows a second time and I will need another 100g of free disk space. So is there a solution to this paradox? It chooses which method to use based on your myisam_sort_buffer_size, myisam_max_sort_file_size and myisam_max_extra_sort_file_size. Have you increased the size of these? Keep in mind these are SESSION variables, so they can be set on the connection right before you LOAD DATA INFILE. You always have to tell MySQL how much extra memory/disk it can use, it can't assume that you want it to use it. That is why it will use Repair by keycache, since it doesn't use any extra resources. If you increase the amount of diskspace or memory, then it should use a Repair by sort instead. Also keep in mind that Repair by sort doesn't work for UNIQUE or PRIMARY KEYs. Regards, Harrison -- Harrison C. Fisk, Trainer and Consultant MySQL AB, www.mysql.com Get a jumpstart on MySQL Cluster -- http://www.mysql.com/consulting/packaged/cluster.html -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
Re: Load Data Infile ... Repair with KeyCache --WAY SLOW!
At 12:39 PM 3/3/2005, Harrison Fisk wrote: Hi, On Mar 3, 2005, at 11:32 AM, mos wrote: At 10:07 PM 3/2/2005, you wrote: Hello. You may use ALTER TABLE .. DISABLE KEYS to speed up the loading process on the MyISAM table. That may work provided I can get the keys rebuilt later using FileSort and not KeyCache. You see the problem isn't in loading the data into the table which occurs reasonably fast that because running Load Data Infile on an empty table will disable the keys until all the data is loaded, so explicitly disabling before hand them won't help. After the data is loaded, Load Data Infile will then rebuild the keys and will choose to use Repair with keycache whereas Repair with filesort would be 100x-1000x faster. There doesn't seem to be any way to get it to use Repair with filesort. So I could use disable keys as you had suggested and then rebuild them manually with myisamchk repair with sort provided it doesn't reload all the data into a temporary table (which Alter Table usually does). If it does create a temporary table it will physically reloads the 500 million rows a second time and I will need another 100g of free disk space. So is there a solution to this paradox? It chooses which method to use based on your myisam_sort_buffer_size, myisam_max_sort_file_size and myisam_max_extra_sort_file_size. Have you increased the size of these? Keep in mind these are SESSION variables, so they can be set on the connection right before you LOAD DATA INFILE. You always have to tell MySQL how much extra memory/disk it can use, it can't assume that you want it to use it. That is why it will use Repair by keycache, since it doesn't use any extra resources. If you increase the amount of diskspace or memory, then it should use a Repair by sort instead. That's what I thought. I have myisam_max_sort_file_size set to 15.7GB and myisam_max_extra_sort_file_size set to 2GB. I confirmed this with Show Variables. I thought this would be enough to sort the indexes. I have confirmed there is enough free space on the hard drive to handle this both in the database directory and the mysql temp directory. Also keep in mind that Repair by sort doesn't work for UNIQUE or PRIMARY KEYs. So the solution is to keep using Load Data Infile with even larger max_sort variables and more disk space? There are only 4 indexes on the table and the largest index is around 50 bytes. This is running on an AMD3500+ with 1gb RAM. My problem is it takes about 6 hours to load in the rows using Load Data Infile to the empty table (which is fine) then by doing a Show ProcessList I'll know if it is using FileSort or KeyCache. That is a lot of time to waste if it ends up using KeyCache because then I know it could be a few days to index the table. So how do people force Load Data Infile to use FileSort and not KeyCache? In other words how do they know ahead of time what to set myisam_max_extra_sort_file_size set and myisam_max_sort_file_size to without guessing? To me it seems a lot like hit and miss guesswork. I would rather see a parameter BySort or ByKeyCache added to Load File command that forces it to use either FileSort or KeyCache rather than having the developer cross his fingers and hope the system variables are set correctly. Mike Regards, Harrison -- Harrison C. Fisk, Trainer and Consultant MySQL AB, www.mysql.com Get a jumpstart on MySQL Cluster -- http://www.mysql.com/consulting/packaged/cluster.html -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
Re: Load Data Infile ... Repair with KeyCache --WAY SLOW!
Hi, On Mar 3, 2005, at 3:13 PM, mos wrote: At 12:39 PM 3/3/2005, Harrison Fisk wrote: Hi, On Mar 3, 2005, at 11:32 AM, mos wrote: At 10:07 PM 3/2/2005, you wrote: Hello. You may use ALTER TABLE .. DISABLE KEYS to speed up the loading process on the MyISAM table. That may work provided I can get the keys rebuilt later using FileSort and not KeyCache. You see the problem isn't in loading the data into the table which occurs reasonably fast that because running Load Data Infile on an empty table will disable the keys until all the data is loaded, so explicitly disabling before hand them won't help. After the data is loaded, Load Data Infile will then rebuild the keys and will choose to use Repair with keycache whereas Repair with filesort would be 100x-1000x faster. There doesn't seem to be any way to get it to use Repair with filesort. So I could use disable keys as you had suggested and then rebuild them manually with myisamchk repair with sort provided it doesn't reload all the data into a temporary table (which Alter Table usually does). If it does create a temporary table it will physically reloads the 500 million rows a second time and I will need another 100g of free disk space. So is there a solution to this paradox? It chooses which method to use based on your myisam_sort_buffer_size, myisam_max_sort_file_size and myisam_max_extra_sort_file_size. Have you increased the size of these? Keep in mind these are SESSION variables, so they can be set on the connection right before you LOAD DATA INFILE. You always have to tell MySQL how much extra memory/disk it can use, it can't assume that you want it to use it. That is why it will use Repair by keycache, since it doesn't use any extra resources. If you increase the amount of diskspace or memory, then it should use a Repair by sort instead. That's what I thought. I have myisam_max_sort_file_size set to 15.7GB and myisam_max_extra_sort_file_size set to 2GB. I confirmed this with Show Variables. I thought this would be enough to sort the indexes. I have confirmed there is enough free space on the hard drive to handle this both in the database directory and the mysql temp directory. Also keep in mind that Repair by sort doesn't work for UNIQUE or PRIMARY KEYs. So the solution is to keep using Load Data Infile with even larger max_sort variables and more disk space? There are only 4 indexes on the table and the largest index is around 50 bytes. This is running on an AMD3500+ with 1gb RAM. My problem is it takes about 6 hours to load in the rows using Load Data Infile to the empty table (which is fine) then by doing a Show ProcessList I'll know if it is using FileSort or KeyCache. That is a lot of time to waste if it ends up using KeyCache because then I know it could be a few days to index the table. If you want to test it out to see how much is required (though it shouldn't matter, just set them as large as you possibly could (see below)), then take your table and load it once with indexes disabled. Then enable them. It will immediately do one or the other. If you are unhappy, then you can kill it, change variables and repeat. So how do people force Load Data Infile to use FileSort and not KeyCache? In other words how do they know ahead of time what to set myisam_max_extra_sort_file_size set and myisam_max_sort_file_size to without guessing? To me it seems a lot like hit and miss guesswork. I would rather see a parameter BySort or ByKeyCache added to Load File command that forces it to use either FileSort or KeyCache rather than having the developer cross his fingers and hope the system variables are set correctly. You don't play around with the variables. Set them to the largest possible amount that you would let MySQL use (ie. how much ram and how much disk space). Then it will have the information to make the decision properly. It only uses what it needs, up to your limit. If it sees that your limits are too small, then it won't be able to do a By Sort, even if you want it to. Why aren't you setting them as large as you can in the first go? Regards, Harrison -- Harrison C. Fisk, Trainer and Consultant MySQL AB, www.mysql.com Get a jumpstart on MySQL Cluster -- http://www.mysql.com/consulting/packaged/cluster.html -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]