Re: MySQL 5 is 25% slower then 4.1
Are you doing single insert statements, multiple insert statements or LOAD DATA INFILE statements? On 12/4/05, Chenzhou Cui [EMAIL PROTECTED] wrote: Dear MySQL fans, I have a 260 GB huge file with 1045175762 rows. Two weeks ago, I wrote a Java program to read the huge plain text file into MySQL 4.1.12. 300 million of rows could be loaded in one day. Last Thursday, I updated the MySQL to 5.0.16 and then run the same program. Only 225 million of rows can be loaded in one day. In additional to the version difference, the MySQL 4.1.12 was compiled from SRPM packages from RedHat Updates, while the MySQL 5.0.16 is installed directly using the MySQL binary RPM packages for Redhat AS 4. Any information and suggestion are welcome. Regards, Chenzhou CUI -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED] -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
Re: MySQL 5 is 25% slower then 4.1
After 4 days and a half, my program finished last night. The 1045175762 rows costed the Java program 109 hours to load. The speed is about 9.6 million per hour. I didn't use multiple insert statements and LOAD DATA INFILE, but only insert into statement. The Java program reads one line from the source file, and then execute an insert statement. The configuration of my server is as follows: Two Intel Xeon 2.8GHz CPUs 3GB memory Two RAID 5 of SATA disks The huge file locates on a RAID, the MySQL server datadir locates on another RAID. The two RAIDs link to the server by Ultra SCSI 320 interface. Gary Richardson wrote: Are you doing single insert statements, multiple insert statements or LOAD DATA INFILE statements? On 12/4/05, Chenzhou Cui [EMAIL PROTECTED] wrote: Dear MySQL fans, I have a 260 GB huge file with 1045175762 rows. Two weeks ago, I wrote a Java program to read the huge plain text file into MySQL 4.1.12. 300 million of rows could be loaded in one day. Last Thursday, I updated the MySQL to 5.0.16 and then run the same program. Only 225 million of rows can be loaded in one day. In additional to the version difference, the MySQL 4.1.12 was compiled from SRPM packages from RedHat Updates, while the MySQL 5.0.16 is installed directly using the MySQL binary RPM packages for Redhat AS 4. Any information and suggestion are welcome. Regards, Chenzhou CUI -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED] -- Chenzhou Cui (China-VO Project) National Astronomical Observatory | Tel: (8610)64841695 Chinese Academy of Sciences | FAX: (8610)64878240 Datun Road 20A, Chaoyang District | Email: [EMAIL PROTECTED] Beijing 100012, China | WWW: www.lamost.org/~cb -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
Re: MySQL 5 is 25% slower then 4.1
Chenzhou Cui wrote: I didn't use multiple insert statements and LOAD DATA INFILE, but only insert into statement. The Java program reads one line from the source file, and then execute an insert statement. I can't comment on the speed of 5.0.x vs 4.1.x, but I can suggest that you optimize this 'insert' process. Instead of issuing a new insert command per record, how about using placeholders and binding values? I don't know any Java, but from Perl you can do this. Actually, I'm not certain that MySQL supports this yet or not - you'd have to check your server version and drivers. Or you could do something like collect a series of records - say 100 records at a time, and issue an insert query that includes all of them, eg: insert into SomeTable ( field_1, field_2, field_3 ) values ( 3, 56, 45 ), ( 45, 3456, 345 ), ( 345, 76, 345 ), ( 345, 45, 546 ) This is a lot faster than issuing separate inserts per record. You don't want to include *too* many records at once - there's a maximum packet size or something like that that you can't exceed. -- Daniel Kasak IT Developer NUS Consulting Group Level 5, 77 Pacific Highway North Sydney, NSW, Australia 2060 T: (+61) 2 9922-7676 / F: (+61) 2 9922 7989 email: [EMAIL PROTECTED] website: http://www.nusconsulting.com.au -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
Re: MySQL 5 is 25% slower then 4.1
Also, if the file looks anything like a CSV file, I recommend using LOAD DATA INFILE http://dev.mysql.com/doc/refman/4.1/en/load-data.html You'll probably load that data in half to a quarter of the time. On 11/30/05, Daniel Kasak [EMAIL PROTECTED] wrote: Chenzhou Cui wrote: I didn't use multiple insert statements and LOAD DATA INFILE, but only insert into statement. The Java program reads one line from the source file, and then execute an insert statement. I can't comment on the speed of 5.0.x vs 4.1.x, but I can suggest that you optimize this 'insert' process. Instead of issuing a new insert command per record, how about using placeholders and binding values? I don't know any Java, but from Perl you can do this. Actually, I'm not certain that MySQL supports this yet or not - you'd have to check your server version and drivers. Or you could do something like collect a series of records - say 100 records at a time, and issue an insert query that includes all of them, eg: insert into SomeTable ( field_1, field_2, field_3 ) values ( 3, 56, 45 ), ( 45, 3456, 345 ), ( 345, 76, 345 ), ( 345, 45, 546 ) This is a lot faster than issuing separate inserts per record. You don't want to include *too* many records at once - there's a maximum packet size or something like that that you can't exceed. -- Daniel Kasak IT Developer NUS Consulting Group Level 5, 77 Pacific Highway North Sydney, NSW, Australia 2060 T: (+61) 2 9922-7676 / F: (+61) 2 9922 7989 email: [EMAIL PROTECTED] website: http://www.nusconsulting.com.au -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED] -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
Re: MySQL 5 is 25% slower then 4.1
The file is in fix column width format. If it is in CSV format, I don't need a program to read it. The first few lines are as following: --- usnob_read: no zone specified, read from South Pole #USNOB (whole) #USNO-B1.0 Tycho-2RA (J2000) DecsRA sDE Epoch pmRA pmDE P spA spD Fit N MsY| Bmag1 C Surv. cl -xi--eta| Rmag1 C Surv. cl -xi--eta| Bmag2 C Surv. cl -xi--eta| Rmag2 C Surv. cl -xi--eta| Imag C Surv. cl -xi--eta| ; r() -001 001.087156-89.918600 65 0 1986.9 -2 -6 8 4 0 0 0 3 ..Y| --- - ---- | 14.00 1 5-001 9 +00.04-00.01| 14.43 1 4-001 3 -00.04+00.00| 13.68 1 6-001 0 -00.02+00.00| --- - ---- | -002 001.330223-89.933234 5 192 1986.9 +2 +0 6 0 12 0 1 3 ...| --- - ---- | 17.26 1 5-001 10 +00.00+00.14| 17.43 1 4-001 8 -00.01-00.10| 17.20 1 6-001 0 -00.01-00.06| --- - ---- | -003 003.255817-89.964103 2 80 1986.9 +8 +18 8 0 5 0 0 3 ...| --- - ---- | 19.22 1 5-001 8 +00.00+00.06| 21.20 1 4-001 2 -00.01-00.04| 19.66 1 6-001 0 -00.01-00.03| --- - ---- | -004 003.646575-89.906014 204 224 1986.9 +0 +0 0 0 0 3 3 3 ...| --- - ---- | 19.50 1 5-001 3 +00.17-00.29| 19.86 1 4-001 8 +00.10+00.01| 18.60 1 6-001 0 -00.29+00.26| --- - ---- | -005 006.311025-89.952395 114 45 1986.9 +0 +0 0 0 0 1 0 3 ...| --- - ---- | 18.10 1 5-001 9 -00.11-00.05| 18.61 1 4-001 8 -00.06-00.01| 18.23 1 6-001 0 +00.16+00.05| --- - ---- | Gary Richardson wrote: Also, if the file looks anything like a CSV file, I recommend using LOAD DATA INFILE http://dev.mysql.com/doc/refman/4.1/en/load-data.html You'll probably load that data in half to a quarter of the time. On 11/30/05, Daniel Kasak [EMAIL PROTECTED] wrote: Chenzhou Cui wrote: I didn't use multiple insert statements and LOAD DATA INFILE, but only insert into statement. The Java program reads one line from the source file, and then execute an insert statement. I can't comment on the speed of 5.0.x vs 4.1.x, but I can suggest that you optimize this 'insert' process. Instead of issuing a new insert command per record, how about using placeholders and binding values? I don't know any Java, but from Perl you can do this. Actually, I'm not certain that MySQL supports this yet or not - you'd have to check your server version and drivers. Or you could do something like collect a series of records - say 100 records at a time, and issue an insert query that includes all of them, eg: insert into SomeTable ( field_1, field_2, field_3 ) values ( 3, 56, 45 ), ( 45, 3456, 345 ), ( 345, 76, 345 ), ( 345, 45, 546 ) This is a lot faster than issuing separate inserts per record. You don't want to include *too* many records at once - there's a maximum packet size or something like that that you can't exceed. -- Daniel Kasak IT Developer NUS Consulting Group Level 5, 77 Pacific Highway North Sydney, NSW, Australia 2060 T: (+61) 2 9922-7676 / F: (+61) 2 9922 7989 email: [EMAIL PROTECTED] website: http://www.nusconsulting.com.au -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED] -- Chenzhou Cui (China-VO Project) National Astronomical Observatory | Tel: (8610)64841695 Chinese Academy of Sciences | FAX: (8610)64878240 Datun Road 20A, Chaoyang District | Email: [EMAIL PROTECTED] Beijing 100012, China | WWW: www.lamost.org/~cb -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
Re: MySQL 5 is 25% slower then 4.1
Chenzhou Cui wrote: The file is in fix column width format. If it is in CSV format, I don't need a program to read it. 'load data infile' can import from a fixed-width format. Check the documentation. I've set up a number of these imports, and they're a little messy to set up ( you have to put details of the widths of columns IN THE TABLE DEFINITION, but it does work. What I do is create a temporary table with the column widths in the table definition, 'load data infile' into it, and then use a 'insert into ___ select * from ___' query to move the data into the real table. If you've got a lot of rows, then maybe this isn't such a good idea, but you can *probably* get your column widths into your table without causing too much trouble. Importing from fixed-width columns really is the weakest part of 'load data infile'. Perhaps a feature-request bug for a better solution is in order? -- Daniel Kasak IT Developer NUS Consulting Group Level 5, 77 Pacific Highway North Sydney, NSW, Australia 2060 T: (+61) 2 9922-7676 / F: (+61) 2 9922 7989 email: [EMAIL PROTECTED] website: http://www.nusconsulting.com.au -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]