I'm trying to understand the most efficient load sequence
possible.
I've got 2 large tables. Each is about 50 million rows,
taking about 20GB of disk space. My disk is accessed
via fibre channel, and can read/write about 40 MB per
second.
My system is a dual PIII 850, running Linux 2.4.2-2smp #1 SMP.
Memory: 1GB.
I've "downgraded" my compiler to GCC 2.95-3, and can compile
MySQL.
The actual load of a single table takes about 1/2 hour, but
indexing takes DAYS. I've recompiled '--with-raid' which allows
me to create a table big enough.
The queries are very fast, which is why I am attempting this.
I currently use Oracle, but would like to offload to MySQL as
much as possible. Once I prove this out, I figure the support
contract is WAY cheaper than my Oracle licenses for new projects.
Total Oracle load, index, analyze for this data is about 1/2 a day.
Here's my start MySQL command:
nohup nice --10 /usr/local/libexec/mysqld \
--basedir=/usr/local \
--datadir=/var/lib/mysql \
--user=mysql \
--pid-file=/var/lib/mysql/ramius.xxx.com.pid \
--skip-locking \
-O key_buffer=512M \
-O max_allowed_packet=1M \
-O table_cache=256 \
-O sort_buffer=64M \
-O record_buffer=1M \
-O myisam_sort_buffer_size=128M \
-O thread_cache=8 \
-O thread_concurrency=4 \
--port=3306 \
--skip-networking \
--tmpdir=/xy/local/tmp/ \
--warnings \
--log=/xy/local/tmp/mysql.log \
&
It eats about 512 MB.
Here's the critical part of the table create script:
create table ind_1101 (
FINDER_NUMBER_IND VARCHAR(10),
FINDER_NUMBER_HH VARCHAR(10),
DATE_ON_BASE DATE ,
...
100 fields later
)
TYPE=MyISAM
RAID_TYPE=STRIPED
RAID_CHUNKS=30
RAID_CHUNKSIZE=10000
MIN_ROWS=60000000
MAX_ROWS=500000000
I do the load of the data and then index via a single
alter statement rather than a series of creates, to avoid
many table copies. Note: I get my best load performance
by allowing 2 source files to be loaded at the same time.
If I do 1, then I don't fill the pipeline, and if I do 3,
the disk seems to thrash and wait time goes up. I never
use more than a single CPU during this process.
I've tried InnoDB, but it loads too slow.
According to the docs, it seems that the following
sequence should be faster:
Create WITH indexes defined.
Alter table disable indexes.
Load data.
Alter table enable indexes.
The 'enable' and 'disable' seem to be strictly a MySQL 4.0
command, which I didn't initially DL, so I'm setting that
up now.
Is there anything else I should be looking at?
Is there a large table FAQ?
Will I cut the time in 1/2 buy putting it on a
Dual 2GHZ Xeon (waiting for a test box right now)?
Thanks for reading this far.
Barry
---------------------------------------------------------------------
Before posting, please check:
http://www.mysql.com/manual.php (the manual)
http://lists.mysql.com/ (the list archive)
To request this thread, e-mail <[EMAIL PROTECTED]>
To unsubscribe, e-mail <[EMAIL PROTECTED]>
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php