George, I've not heard of that technique - "sorting so that mysql doesn't have to jump around as much".
What I am aware of, and what can make a difference, is disabling indexes prior to starting the load. Your load goes quite a bit faster, in general, but then there is some time at the spent updating indexes. Overall it can be faster. From http://dev.mysql.com/doc/refman/5.0/en/load-data.html : If you use LOAD DATA INFILE on an empty MyISAM table, all non-unique indexes are created in a separate batch (as for REPAIR TABLE). Normally, this makes LOAD DATA INFILE much faster when you have many indexes. In some extreme cases, you can create the indexes even faster by turning them off with ALTER TABLE ... DISABLE KEYS before loading the file into the table and using ALTER TABLE ... ENABLE KEYS to re-create the indexes after loading the file. See Section 7.2.16, "Speed of INSERT Statements". Dan On 8/18/06, George Law <[EMAIL PROTECTED]> wrote:
Hello All, An ex-coworker of my recently sent over a link to a mysql article about speeding up mass imports. unfortunately I deleted the email before I had time to actually digest it and now I am wondering if it could help with my imports. The article suggested sorting the data ahead of time to prevent mysql from having to jump around as much. What I have is a raw data file that I pre-process to create 3 separate tab delimited files, which are then loaded into mysql using "load data infile"... I am working with a table with a primary key consisting of 2 fields. | cdr_seq_no | int(9) unsigned | NO | PRI | 0 | | | callid | char(33) | NO | PRI | | | show index from comp_cdr; +----------+------------+------------------+--------------+------------- -----+-----------+-------------+----------+--------+------+------------+ ---------+ | Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | +----------+------------+------------------+--------------+------------- -----+-----------+-------------+----------+--------+------+------------+ ---------+ | comp_cdr | 0 | PRIMARY | 1 | callid | A | 3454606 | NULL | NULL | | BTREE | | | comp_cdr | 0 | PRIMARY | 2 | cdr_seq_no | A | 3454606 | NULL | NULL | | BTREE | | ... some sample data is : | callid | cdr_seq_no | +----------------------------------+------------+ | 0000002644e5f21f4abaa204120fad41 | 89630624 | | 0000004644e5f3db5af88248020fad41 | 89627398 | | 0000008344e5ef975e6eafe0020fad41 | 89630801 | | 0000009a44e5f2694aabb6cc12115a45 | 89614941 | | 000000e044e5f2e94dd45074020fad41 | 89619029 | | 0007bb54c8691110800100201c0060ff | 89616484 | | 002a401ae58711d382f98183346766e7 | 89621314 | | 007f2ad6df2cdb118dd2c879d6db9673 | 89631061 | | 008724bcc7691110800100201c0144ff | 89628873 | | 008bbf9dc9691110800100201c00edff | 89627287 | Any suggestions on exactly how to sort those fields before Thanks!!! -- George Law [EMAIL PROTECTED] -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED]
-- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED]