Re: Load Data Infile ... Repair with KeyCache --WAY SLOW!

2005-03-07 Thread mos
At 05:18 PM 3/6/2005, Harrison Fisk wrote:
Hi,
On Mar 6, 2005, at 12:51 PM, mos wrote:
At 10:07 PM 3/2/2005, you wrote:
Hello.
You may use ALTER TABLE .. DISABLE KEYS to speed up the loading process
on the MyISAM table.
I gave that a try but I had to cancel Alter Table ... Enable Keys after 
49 hours. I find it amazing that it takes only 6.25 hours to load 450 
million rows into the table using Load Data Infile with the keys 
disabled, but over 49 hours (maybe a lot more) to rebuild 3 indexes.
Why do you find that amazing?  The import is pretty much a raw file copy 
by MySQL.  That 6 hours is just copying data from one place to 
another.  When you create indexes MySQL has to go through and sort the 
data.  While doing so it has to create a special structure and maintain it 
and write it to disk.  Creating an index is *much* more effort than a 
simple file copy.

 Even with 100g of disk space dedicated to the sort file. It started 
using Repair with Filesort for the first 7 hours, then it switched to 
Repair with KeyCache.
It should only use one of the Repair methods.  If it switched then 
something was wrong with Repair by filesort (maybe out of disk space?)
Look in your error log and see if there is any message there about it.
There was a [Warning] Warning: Enabling keys got errno 136, retrying and 
it kept running for almost 2 more days. (Yes, I guess I should have checked 
earlier.)

So apparently it ran out of index space because I didn't build the table 
with Max_Rows=10 (1 billion?). I'm using NTFS so there is no 4gb 
limit. The data file is about 84g. There should have been plenty of disk 
space available for the index file.

I think what bothers me the most about all this is I'm kept in the dark 
about the progress of rebuilding the index (or table). Wouldn't it be nice 
if the Show Process List could say 10% complete or 99% complete? If 
it sat at 5% complete for a few hours (days?), then I would know something 
was wrong. But if the progress steadily increased every few hours and when 
it got to 99% complete I would know it is almost complete. I could let my 
client know it is 99% complete rather than saying I don't know every few 
hours. I would like to see this % complete as a feature so people aren't 
kept in the dark. At least then they would know whether the process is hung 
or not.

Just my 2 cents.
Mike

What did you have your myisam_* variables set to?
I don't know anyone who can afford to take their database down for a few 
days (weeks?) while building an index.
How often do you need to load in 450 million rows?  Generally this only 
occurs once, or in a batch process, such as once per month.  I don't know 
what your expectations are, but this is never going to be a very fast 
process under any DBMS.  Importing 450M rows will take some decent amount 
of time.

Mike

mos [EMAIL PROTECTED] wrote:
 I have a 50g CSV file that I am trying to import into an empty MyISAM
 table. It appears to go fine except after 10 hours it hasn't completed. A
 Show Process List shows Info=load data infile ... and State=Repair
 with keycache. The table has a few hundred million rows of data. I 
assume
 it is using Repair with keycache to rebuild the indexes after the data
 has been loaded.

 From what I've read Repair with keycache is going to take a huge amount
 of time to complete. How do I disable Repair with keycache and use
 Repair with sort instead (which should be faster, right?)? I'm using
 MySQL 4.1.10 on Win XP and 1gb ram.

 TIA

 Mike



--
For technical support contracts, goto https://order.mysql.com/?ref=ensita
This email is sponsored by Ensita.NET http://www.ensita.net/
   __  ___ ___   __
  /  |/  /_ __/ __/ __ \/ /Gleb Paharenko
 / /|_/ / // /\ \/ /_/ / /__   [EMAIL PROTECTED]
/_/  /_/\_, /___/\___\_\___/   MySQL AB / Ensita.NET
   ___/   www.mysql.com

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:
http://lists.mysql.com/[EMAIL PROTECTED]

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:
http://lists.mysql.com/[EMAIL PROTECTED]
Regards,
Harrison
--
Harrison C. Fisk, Trainer and Consultant
MySQL AB, www.mysql.com
Get a jumpstart on MySQL Cluster -- 
http://www.mysql.com/consulting/packaged/cluster.html

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]


Re: Load Data Infile ... Repair with KeyCache --WAY SLOW!

2005-03-06 Thread mos
At 10:07 PM 3/2/2005, you wrote:
Hello.
You may use ALTER TABLE .. DISABLE KEYS to speed up the loading process
on the MyISAM table.
I gave that a try but I had to cancel Alter Table ... Enable Keys after 
49 hours. I find it amazing that it takes only 6.25 hours to load 450 
million rows into the table using Load Data Infile with the keys disabled, 
but over 49 hours (maybe a lot more) to rebuild 3 indexes. Even with 100g 
of disk space dedicated to the sort file. It started using Repair with 
Filesort for the first 7 hours, then it switched to Repair with 
KeyCache. I don't know anyone who can afford to take their database down 
for a few days (weeks?) while building an index.

Mike

mos [EMAIL PROTECTED] wrote:
 I have a 50g CSV file that I am trying to import into an empty MyISAM
 table. It appears to go fine except after 10 hours it hasn't completed. A
 Show Process List shows Info=load data infile ... and State=Repair
 with keycache. The table has a few hundred million rows of data. I assume
 it is using Repair with keycache to rebuild the indexes after the data
 has been loaded.

 From what I've read Repair with keycache is going to take a huge amount
 of time to complete. How do I disable Repair with keycache and use
 Repair with sort instead (which should be faster, right?)? I'm using
 MySQL 4.1.10 on Win XP and 1gb ram.

 TIA

 Mike


--
For technical support contracts, goto https://order.mysql.com/?ref=ensita
This email is sponsored by Ensita.NET http://www.ensita.net/
   __  ___ ___   __
  /  |/  /_ __/ __/ __ \/ /Gleb Paharenko
 / /|_/ / // /\ \/ /_/ / /__   [EMAIL PROTECTED]
/_/  /_/\_, /___/\___\_\___/   MySQL AB / Ensita.NET
   ___/   www.mysql.com

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]


Re: Load Data Infile ... Repair with KeyCache --WAY SLOW!

2005-03-06 Thread Harrison Fisk
Hi,
On Mar 6, 2005, at 12:51 PM, mos wrote:
At 10:07 PM 3/2/2005, you wrote:
Hello.
You may use ALTER TABLE .. DISABLE KEYS to speed up the loading 
process
on the MyISAM table.
I gave that a try but I had to cancel Alter Table ... Enable Keys 
after 49 hours. I find it amazing that it takes only 6.25 hours to 
load 450 million rows into the table using Load Data Infile with the 
keys disabled, but over 49 hours (maybe a lot more) to rebuild 3 
indexes.
Why do you find that amazing?  The import is pretty much a raw file 
copy by MySQL.  That 6 hours is just copying data from one place to 
another.  When you create indexes MySQL has to go through and sort the 
data.  While doing so it has to create a special structure and maintain 
it and write it to disk.  Creating an index is *much* more effort than 
a simple file copy.

 Even with 100g of disk space dedicated to the sort file. It started 
using Repair with Filesort for the first 7 hours, then it switched 
to Repair with KeyCache.
It should only use one of the Repair methods.  If it switched then 
something was wrong with Repair by filesort (maybe out of disk space?)  
Look in your error log and see if there is any message there about it.  
What did you have your myisam_* variables set to?

I don't know anyone who can afford to take their database down for a 
few days (weeks?) while building an index.
How often do you need to load in 450 million rows?  Generally this only 
occurs once, or in a batch process, such as once per month.  I don't 
know what your expectations are, but this is never going to be a very 
fast process under any DBMS.  Importing 450M rows will take some decent 
amount of time.

Mike

mos [EMAIL PROTECTED] wrote:
 I have a 50g CSV file that I am trying to import into an empty 
MyISAM
 table. It appears to go fine except after 10 hours it hasn't 
completed. A
 Show Process List shows Info=load data infile ... and 
State=Repair
 with keycache. The table has a few hundred million rows of data. I 
assume
 it is using Repair with keycache to rebuild the indexes after the 
data
 has been loaded.

 From what I've read Repair with keycache is going to take a huge 
amount
 of time to complete. How do I disable Repair with keycache and use
 Repair with sort instead (which should be faster, right?)? I'm 
using
 MySQL 4.1.10 on Win XP and 1gb ram.

 TIA

 Mike



--
For technical support contracts, goto 
https://order.mysql.com/?ref=ensita
This email is sponsored by Ensita.NET http://www.ensita.net/
   __  ___ ___   __
  /  |/  /_ __/ __/ __ \/ /Gleb Paharenko
 / /|_/ / // /\ \/ /_/ / /__   [EMAIL PROTECTED]
/_/  /_/\_, /___/\___\_\___/   MySQL AB / Ensita.NET
   ___/   www.mysql.com


--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:
http://lists.mysql.com/[EMAIL PROTECTED]

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:
http://lists.mysql.com/[EMAIL PROTECTED]

Regards,
Harrison
--
Harrison C. Fisk, Trainer and Consultant
MySQL AB, www.mysql.com
Get a jumpstart on MySQL Cluster -- 
http://www.mysql.com/consulting/packaged/cluster.html

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]


Re: Load Data Infile ... Repair with KeyCache --WAY SLOW!

2005-03-03 Thread Gleb Paharenko
Hello.



You may use ALTER TABLE .. DISABLE KEYS to speed up the loading process

on the MyISAM table.





mos [EMAIL PROTECTED] wrote:

 I have a 50g CSV file that I am trying to import into an empty MyISAM 

 table. It appears to go fine except after 10 hours it hasn't completed. A 

 Show Process List shows Info=load data infile ... and State=Repair 

 with keycache. The table has a few hundred million rows of data. I assume 

 it is using Repair with keycache to rebuild the indexes after the data 

 has been loaded.

 

 From what I've read Repair with keycache is going to take a huge amount 

 of time to complete. How do I disable Repair with keycache and use 

 Repair with sort instead (which should be faster, right?)? I'm using 

 MySQL 4.1.10 on Win XP and 1gb ram.

 

 TIA

 

 Mike

 

 



-- 
For technical support contracts, goto https://order.mysql.com/?ref=ensita
This email is sponsored by Ensita.NET http://www.ensita.net/
   __  ___ ___   __
  /  |/  /_ __/ __/ __ \/ /Gleb Paharenko
 / /|_/ / // /\ \/ /_/ / /__   [EMAIL PROTECTED]
/_/  /_/\_, /___/\___\_\___/   MySQL AB / Ensita.NET
   ___/   www.mysql.com




-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: Load Data Infile ... Repair with KeyCache --WAY SLOW!

2005-03-03 Thread mos
At 10:07 PM 3/2/2005, you wrote:
Hello.
You may use ALTER TABLE .. DISABLE KEYS to speed up the loading process
on the MyISAM table.
That may work provided I can get the keys rebuilt later using FileSort and 
not KeyCache.

You see the problem isn't in loading the data into the table which occurs 
reasonably fast that because running Load Data Infile on an empty table 
will disable the keys until all the data is loaded, so explicitly disabling 
before hand them won't help.  After the data is loaded, Load Data Infile 
will then rebuild the keys and will choose to use Repair with keycache 
whereas Repair with filesort would be 100x-1000x faster. There doesn't 
seem to be any way to get it to use Repair with filesort.

So I could use disable keys as you had suggested and then rebuild them 
manually with myisamchk repair with sort provided it doesn't reload all 
the data into a temporary table (which Alter Table usually does). If it 
does create a temporary table it will physically reloads the 500 million 
rows a second time and I will need another 100g of free disk space.

So is there a solution to this paradox?
I seem to be left with two options here:
a) Do I shoot myself in the left foot right away,
b) or Do I wait and shoot myself in the right foot? bg
Is there a 3rd option?
TIA
Mike

mos [EMAIL PROTECTED] wrote:
 I have a 50g CSV file that I am trying to import into an empty MyISAM
 table. It appears to go fine except after 10 hours it hasn't completed. A
 Show Process List shows Info=load data infile ... and State=Repair
 with keycache. The table has a few hundred million rows of data. I assume
 it is using Repair with keycache to rebuild the indexes after the data
 has been loaded.

 From what I've read Repair with keycache is going to take a huge amount
 of time to complete. How do I disable Repair with keycache and use
 Repair with sort instead (which should be faster, right?)? I'm using
 MySQL 4.1.10 on Win XP and 1gb ram.

 TIA

 Mike


--
For technical support contracts, goto https://order.mysql.com/?ref=ensita
This email is sponsored by Ensita.NET http://www.ensita.net/
   __  ___ ___   __
  /  |/  /_ __/ __/ __ \/ /Gleb Paharenko
 / /|_/ / // /\ \/ /_/ / /__   [EMAIL PROTECTED]
/_/  /_/\_, /___/\___\_\___/   MySQL AB / Ensita.NET
   ___/   www.mysql.com

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]


Re: Load Data Infile ... Repair with KeyCache --WAY SLOW!

2005-03-03 Thread Harrison Fisk
Hi,
On Mar 3, 2005, at 11:32 AM, mos wrote:
At 10:07 PM 3/2/2005, you wrote:
Hello.
You may use ALTER TABLE .. DISABLE KEYS to speed up the loading 
process
on the MyISAM table.
That may work provided I can get the keys rebuilt later using FileSort 
and not KeyCache.

You see the problem isn't in loading the data into the table which 
occurs reasonably fast that because running Load Data Infile on an 
empty table will disable the keys until all the data is loaded, so 
explicitly disabling before hand them won't help.  After the data is 
loaded, Load Data Infile will then rebuild the keys and will choose to 
use Repair with keycache whereas Repair with filesort would be 
100x-1000x faster. There doesn't seem to be any way to get it to use 
Repair with filesort.

So I could use disable keys as you had suggested and then rebuild them 
manually with myisamchk repair with sort provided it doesn't reload 
all the data into a temporary table (which Alter Table usually does). 
If it does create a temporary table it will physically reloads the 500 
million rows a second time and I will need another 100g of free disk 
space.

So is there a solution to this paradox?
It chooses which method to use based on your myisam_sort_buffer_size, 
myisam_max_sort_file_size and myisam_max_extra_sort_file_size.  Have 
you increased the size of these?  Keep in mind these are SESSION 
variables, so they can be set on the connection right before you LOAD 
DATA INFILE.

You always have to tell MySQL how much extra memory/disk it can use, it 
can't assume that you want it to use it.  That is why it will use 
Repair by keycache, since it doesn't use any extra resources.  If you 
increase the amount of diskspace or memory, then it should use a Repair 
by sort instead.

Also keep in mind that Repair by sort doesn't work for UNIQUE or 
PRIMARY KEYs.

Regards,
Harrison
--
Harrison C. Fisk, Trainer and Consultant
MySQL AB, www.mysql.com
Get a jumpstart on MySQL Cluster -- 
http://www.mysql.com/consulting/packaged/cluster.html

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]


Re: Load Data Infile ... Repair with KeyCache --WAY SLOW!

2005-03-03 Thread mos
At 12:39 PM 3/3/2005, Harrison Fisk wrote:
Hi,
On Mar 3, 2005, at 11:32 AM, mos wrote:
At 10:07 PM 3/2/2005, you wrote:
Hello.
You may use ALTER TABLE .. DISABLE KEYS to speed up the loading process
on the MyISAM table.
That may work provided I can get the keys rebuilt later using FileSort 
and not KeyCache.

You see the problem isn't in loading the data into the table which occurs 
reasonably fast that because running Load Data Infile on an empty table 
will disable the keys until all the data is loaded, so explicitly 
disabling before hand them won't help.  After the data is loaded, Load 
Data Infile will then rebuild the keys and will choose to use Repair 
with keycache whereas Repair with filesort would be 100x-1000x faster. 
There doesn't seem to be any way to get it to use Repair with filesort.

So I could use disable keys as you had suggested and then rebuild them 
manually with myisamchk repair with sort provided it doesn't reload all 
the data into a temporary table (which Alter Table usually does). If it 
does create a temporary table it will physically reloads the 500 million 
rows a second time and I will need another 100g of free disk space.

So is there a solution to this paradox?
It chooses which method to use based on your myisam_sort_buffer_size, 
myisam_max_sort_file_size and myisam_max_extra_sort_file_size.  Have you 
increased the size of these?  Keep in mind these are SESSION variables, so 
they can be set on the connection right before you LOAD DATA INFILE.

You always have to tell MySQL how much extra memory/disk it can use, it 
can't assume that you want it to use it.  That is why it will use Repair 
by keycache, since it doesn't use any extra resources.  If you increase 
the amount of diskspace or memory, then it should use a Repair by sort instead.
That's what I thought. I have myisam_max_sort_file_size set to 15.7GB and 
myisam_max_extra_sort_file_size set to 2GB. I confirmed this with Show 
Variables. I thought this would be enough to sort the indexes. I have 
confirmed there is enough free space on the hard drive to handle this both 
in the database directory and the mysql temp directory.


Also keep in mind that Repair by sort doesn't work for UNIQUE or PRIMARY KEYs.
So the solution is to keep using Load Data Infile with even larger max_sort 
variables and more disk space? There are only 4 indexes on the table and 
the largest index is around 50 bytes. This is running on an AMD3500+ with 
1gb RAM.

My problem is it takes about 6 hours to load in the rows using Load Data 
Infile to the empty table (which is fine) then by doing a Show 
ProcessList I'll know if it is using FileSort or KeyCache. That is a lot 
of time to waste if it ends up using KeyCache because then I know it could 
be a few days to index the table.

So how do people force Load Data Infile to use FileSort and not KeyCache? 
In other words how do they know ahead of time what to set 
myisam_max_extra_sort_file_size set and myisam_max_sort_file_size to 
without guessing? To me it seems a lot like hit and miss guesswork. I would 
rather see a parameter BySort or ByKeyCache added to Load File command 
that forces it to use either FileSort or KeyCache rather than having the 
developer cross his fingers and hope the system variables are set correctly.

Mike

Regards,
Harrison
--
Harrison C. Fisk, Trainer and Consultant
MySQL AB, www.mysql.com
Get a jumpstart on MySQL Cluster -- 
http://www.mysql.com/consulting/packaged/cluster.html

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]


Re: Load Data Infile ... Repair with KeyCache --WAY SLOW!

2005-03-03 Thread Harrison Fisk
Hi,
On Mar 3, 2005, at 3:13 PM, mos wrote:
At 12:39 PM 3/3/2005, Harrison Fisk wrote:
Hi,
On Mar 3, 2005, at 11:32 AM, mos wrote:
At 10:07 PM 3/2/2005, you wrote:
Hello.
You may use ALTER TABLE .. DISABLE KEYS to speed up the loading 
process
on the MyISAM table.
That may work provided I can get the keys rebuilt later using 
FileSort and not KeyCache.

You see the problem isn't in loading the data into the table which 
occurs reasonably fast that because running Load Data Infile on an 
empty table will disable the keys until all the data is loaded, so 
explicitly disabling before hand them won't help.  After the data is 
loaded, Load Data Infile will then rebuild the keys and will choose 
to use Repair with keycache whereas Repair with filesort would 
be 100x-1000x faster. There doesn't seem to be any way to get it to 
use Repair with filesort.

So I could use disable keys as you had suggested and then rebuild 
them manually with myisamchk repair with sort provided it doesn't 
reload all the data into a temporary table (which Alter Table 
usually does). If it does create a temporary table it will 
physically reloads the 500 million rows a second time and I will 
need another 100g of free disk space.

So is there a solution to this paradox?
It chooses which method to use based on your myisam_sort_buffer_size, 
myisam_max_sort_file_size and myisam_max_extra_sort_file_size.  Have 
you increased the size of these?  Keep in mind these are SESSION 
variables, so they can be set on the connection right before you LOAD 
DATA INFILE.

You always have to tell MySQL how much extra memory/disk it can use, 
it can't assume that you want it to use it.  That is why it will use 
Repair by keycache, since it doesn't use any extra resources.  If you 
increase the amount of diskspace or memory, then it should use a 
Repair by sort instead.
That's what I thought. I have myisam_max_sort_file_size set to 15.7GB 
and myisam_max_extra_sort_file_size set to 2GB. I confirmed this with 
Show Variables. I thought this would be enough to sort the indexes. I 
have confirmed there is enough free space on the hard drive to handle 
this both in the database directory and the mysql temp directory.


Also keep in mind that Repair by sort doesn't work for UNIQUE or 
PRIMARY KEYs.
So the solution is to keep using Load Data Infile with even larger 
max_sort variables and more disk space? There are only 4 indexes on 
the table and the largest index is around 50 bytes. This is running on 
an AMD3500+ with 1gb RAM.

My problem is it takes about 6 hours to load in the rows using Load 
Data Infile to the empty table (which is fine) then by doing a Show 
ProcessList I'll know if it is using FileSort or KeyCache. That is a 
lot of time to waste if it ends up using KeyCache because then I know 
it could be a few days to index the table.
If you want to test it out to see how much is required (though it 
shouldn't matter, just set them as large as you possibly could (see 
below)), then take your table and load it once with indexes disabled.  
Then enable them.  It will immediately do one or the other.  If you are 
unhappy, then you can kill it, change variables and repeat.

So how do people force Load Data Infile to use FileSort and not 
KeyCache? In other words how do they know ahead of time what to set 
myisam_max_extra_sort_file_size set and myisam_max_sort_file_size to 
without guessing? To me it seems a lot like hit and miss guesswork. I 
would rather see a parameter BySort or ByKeyCache added to Load 
File command that forces it to use either FileSort or KeyCache rather 
than having the developer cross his fingers and hope the system 
variables are set correctly.
You don't play around with the variables.  Set them to the largest 
possible amount that you would let MySQL use (ie. how much ram and how 
much disk space).  Then it will have the information to make the 
decision properly.  It only uses what it needs, up to your limit.  If 
it sees that your limits are too small, then it won't be able to do a 
By Sort, even if you want it to.  Why aren't you setting them as large 
as you can in the first go?

Regards,
Harrison
--
Harrison C. Fisk, Trainer and Consultant
MySQL AB, www.mysql.com
Get a jumpstart on MySQL Cluster -- 
http://www.mysql.com/consulting/packaged/cluster.html

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]