George, I've not heard of that technique - "sorting so that mysql
doesn't have to jump around as much".

What I am aware of, and what can make a difference, is disabling
indexes prior to starting the load.  Your load goes quite a bit
faster, in general, but then there is some time at the spent updating
indexes.   Overall it can be faster.  From
http://dev.mysql.com/doc/refman/5.0/en/load-data.html :


If you use LOAD DATA INFILE on an empty MyISAM table, all non-unique
indexes are created in a separate batch (as for REPAIR TABLE).
Normally, this makes LOAD DATA INFILE much faster when you have many
indexes. In some extreme cases, you can create the indexes even faster
by turning them off with ALTER TABLE ... DISABLE KEYS before loading
the file into the table and using ALTER TABLE ... ENABLE KEYS to
re-create the indexes after loading the file. See Section 7.2.16,
"Speed of INSERT Statements".


Dan


On 8/18/06, George Law <[EMAIL PROTECTED]> wrote:
Hello All,

An ex-coworker of my recently sent over a link to a mysql article about
speeding up mass imports.
unfortunately I deleted the email before I had time to actually digest
it and now I am wondering if it could help with my imports.

The article suggested sorting the data ahead of time to prevent mysql
from having to jump around as much.
What I have is a raw data file that I pre-process to create 3 separate
tab delimited files, which are then loaded into mysql using "load data
infile"...


I am working with a table with a primary key consisting of 2 fields.
| cdr_seq_no                          | int(9) unsigned  | NO   | PRI |
0       |       |
| callid                              | char(33)         | NO   | PRI |
|       |


show index from comp_cdr;

+----------+------------+------------------+--------------+-------------
-----+-----------+-------------+----------+--------+------+------------+
---------+
| Table    | Non_unique | Key_name         | Seq_in_index | Column_name
| Collation | Cardinality | Sub_part | Packed | Null | Index_type |
Comment |
+----------+------------+------------------+--------------+-------------
-----+-----------+-------------+----------+--------+------+------------+
---------+
| comp_cdr |          0 | PRIMARY          |            1 | callid
| A         |     3454606 |     NULL | NULL   |      | BTREE      |
|
| comp_cdr |          0 | PRIMARY          |            2 | cdr_seq_no
| A         |     3454606 |     NULL | NULL   |      | BTREE      |
|
...

some sample data is :
| callid                           | cdr_seq_no |
+----------------------------------+------------+
| 0000002644e5f21f4abaa204120fad41 |   89630624 |
| 0000004644e5f3db5af88248020fad41 |   89627398 |
| 0000008344e5ef975e6eafe0020fad41 |   89630801 |
| 0000009a44e5f2694aabb6cc12115a45 |   89614941 |
| 000000e044e5f2e94dd45074020fad41 |   89619029 |
| 0007bb54c8691110800100201c0060ff |   89616484 |
| 002a401ae58711d382f98183346766e7 |   89621314 |
| 007f2ad6df2cdb118dd2c879d6db9673 |   89631061 |
| 008724bcc7691110800100201c0144ff |   89628873 |
| 008bbf9dc9691110800100201c00edff |   89627287 |

Any suggestions on exactly how to sort those fields before


Thanks!!!

--
George Law
[EMAIL PROTECTED]


--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:    http://lists.mysql.com/[EMAIL PROTECTED]



--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:    http://lists.mysql.com/[EMAIL PROTECTED]

Reply via email to