Hi!
>>>>> "Peter" == Peter Zaitsev <[EMAIL PROTECTED]> writes:
Peter> Hello mysql,
Peter> I'm repeating my tests now with more information and on a more
Peter> powerful machine with no load; this should give more accurate
Peter> results.
Peter> The machine is Dual PIII-750/1GB RAM
<cut>
Peter> With mysqldump, which uses the following query to do this:
Peter> SELECT * INTO OUTFILE '/spylog/test/g00hits.txt' FROM g00hits
Peter> It takes:
Peter> real 1m23.177s (first time)
Peter> real 1m20.934s (second time)
Peter> The backup table for the same table takes:
Peter> first time:
mysql> backup table g00hits to '/spylog/test';
Peter> +--------------+--------+----------+----------+
Peter> | Table | Op | Msg_type | Msg_text |
Peter> +--------------+--------+----------+----------+
Peter> | la00.g00hits | backup | status | OK |
Peter> +--------------+--------+----------+----------+
Peter> 1 row in set (22.60 sec)
Peter> second time:
Peter> Well. We have a 4x difference here, which is a bit strange: I
Peter> would expect disk speed to be the limit here, not converting the
Peter> rows to text format, which in this case is done at a speed of
Peter> 45000 rows/second, which is not really much, I think, for this
Peter> type of operation. I hope mysql does not use strcat to form the strings?
No, we are not using strcat for this.
I would guess that the speed difference comes from this:
- When you do BACKUP TABLE, the time you measure is just the time it
takes to copy the old file into the Linux in-memory disk buffer; as Linux
delays the actual flush to disk until later, you do not see this cost.
- When doing SELECT * INTO OUTFILE, you are mostly measuring how long
it takes to convert a LOT of numerical data to strings before
writing it to disk.
The speed ratio between copying 4 bytes and converting an int to a
string more than explains the time difference.
To get exact measurements of what is going on, you need a test where
the file is bigger than your memory!
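For illustration, here is a rough sketch of the conversion-cost argument above — a hypothetical Python microbenchmark, not MySQL's actual code; absolute numbers are machine-dependent, but the shape of the comparison holds:

```python
# Hypothetical microbenchmark: concatenating already-packed binary rows
# (the "file copy" path of BACKUP TABLE) vs. converting every integer
# column to decimal text (the SELECT ... INTO OUTFILE path).
import struct
import time

ROWS = 500_000
rows = [(i, i * 2, i * 3) for i in range(ROWS)]

# "Copy" path: rows are already packed binary; output is a plain byte copy.
packed = [struct.pack("<iii", *r) for r in rows]
t0 = time.perf_counter()
copied = b"".join(packed)
t_copy = time.perf_counter() - t0

# "Export" path: every integer must be converted to decimal text first.
t0 = time.perf_counter()
text = "\n".join("\t".join(str(c) for c in r) for r in rows)
t_text = time.perf_counter() - t0

print(f"binary copy: {t_copy:.3f}s, text conversion: {t_text:.3f}s")
```

On any machine the text-conversion pass dominates, which is the point Monty makes: the export time is CPU-bound formatting, not disk speed.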
Peter> Well. Let's try to do the restore:
mysql> restore table g00hits from '/spylog/test';
Peter> +--------------+---------+----------+----------+
Peter> | Table | Op | Msg_type | Msg_text |
Peter> +--------------+---------+----------+----------+
Peter> | test.g00hits | restore | status | OK |
Peter> +--------------+---------+----------+----------+
Peter> 1 row in set (1 min 31.16 sec)
Peter> and again:
mysql> restore table g00hits from '/spylog/test';
Peter> +--------------+---------+----------+----------+
Peter> | Table | Op | Msg_type | Msg_text |
Peter> +--------------+---------+----------+----------+
Peter> | test.g00hits | restore | status | OK |
Peter> +--------------+---------+----------+----------+
Peter> 1 row in set (1 min 30.41 sec)
mysql> load data infile '/spylog/test/g00hits.txt' ignore into table g00hits;
Peter> Query OK, 3722867 rows affected (8 min 35.73 sec)
mysql> load data infile '/spylog/test/g00hits.txt' ignore into table g00hits;
Peter> Query OK, 3722867 rows affected (3 min 35.26 sec)
Peter> So it seems the cache really matters here. This would indicate a
Peter> bad disk access pattern, as the difference is too huge and is not
Peter> seen with backup/restore table.
It could also be that Linux did a sync during the test.
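To take the page cache out of the equation when repeating such timings, one option (a Linux-specific sketch, not something from the original thread) is to ask the kernel to evict the file's cached pages before the second run:

```python
# Sketch: evict a file from the Linux page cache before timing a read,
# so a repeated run measures disk speed rather than cached pages.
import os
import tempfile

SIZE = 8 * 1024 * 1024  # 8 MB test file

with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"x" * SIZE)
    f.flush()
    os.fsync(f.fileno())  # force the data to disk first
    path = f.name

fd = os.open(path, os.O_RDONLY)
# Hint the kernel to drop cached pages for this file (it is only a hint).
os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
data = os.read(fd, SIZE)  # this read should now hit the disk
os.close(fd)
os.remove(path)
print(len(data))
```

POSIX_FADV_DONTNEED is advisory, so for strict cold-cache tests the file still needs to be larger than RAM, as Monty suggests above.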
Peter> Now without ignore:
mysql> load data infile '/spylog/test/g00hits.txt' into table g00hits;
Peter> Query OK, 3722867 rows affected (3 min 34.75 sec)
Peter> So it seems like it does not really matter.
IGNORE runs at the same speed as a plain insert when there is no
duplicate key.
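The reason IGNORE is free here can be sketched with an in-memory stand-in for a keyed table (illustrative only, not MySQL internals): the duplicate check is the same key lookup a plain insert must perform anyway, so skipping instead of erroring costs nothing extra.

```python
# Sketch of INSERT IGNORE semantics: a row whose primary key already
# exists is silently skipped instead of raising a duplicate-key error.
table = {}

def insert_ignore(pk, row):
    if pk in table:      # the same key lookup a plain INSERT must do
        return 0         # duplicate: silently skipped
    table[pk] = row
    return 1             # row inserted

inserted = sum(insert_ignore(pk, {"v": pk * 10}) for pk in [1, 2, 2, 3])
print(inserted)  # 3 rows inserted, one duplicate skipped
```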
Peter> Well, let's drop all indexes and try to load the data the fastest way:
mysql> load data infile '/spylog/test/g00hits.txt' ignore into table g00hits;
Peter> Query OK, 3722867 rows affected (1 min 56.16 sec)
Peter> Records: 3722867 Deleted: 0 Skipped: 0 Warnings: 0
mysql> load data infile '/spylog/test/g00hits.txt' into table g00hits;
Peter> Query OK, 3722867 rows affected (1 min 56.38 sec)
Peter> Records: 3722867 Deleted: 0 Skipped: 0 Warnings: 0
mysql> alter table g00hits add primary key(counter_id,ts,id),add key(visitor_id);
Peter> Query OK, 3722867 rows affected (1 min 54.38 sec)
Peter> Records: 3722867 Duplicates: 0 Warnings: 0
Peter> So it does not give any improvement to add keys after loading the
Peter> data; it seems to be already optimal.
When you do a LOAD DATA into an empty table, MySQL will first insert the
data and the unique keys, and then create the other keys in a batch.
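That batched key build can be sketched as follows (a toy model in Python, not MySQL's index code): unique keys must be checked row by row during the load, but a non-unique secondary index can be built afterwards in a single sorted pass over all rows, which is why manually dropping and re-adding keys gains nothing.

```python
# Toy model of LOAD DATA into an empty table: rows and the unique key
# go in row by row; the non-unique index is built in one batch at the end.
rows = [(10, "b"), (20, "a"), (30, "a")]  # (pk, category)

data = {}
for pk, cat in rows:
    assert pk not in data   # unique key must be checked per row
    data[pk] = cat

# Batch-build the non-unique index on `cat`: one sort over all rows,
# instead of one index insert per row during the load.
secondary = sorted(data.items(), key=lambda kv: kv[1])
print(secondary)
```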
Peter> So the difference in restore speed is about 3x, which is not
Peter> really expected, but a bit better than I reported before. Anyway,
Peter> I can't understand such a huge difference for LOAD DATA depending
Peter> on whether the text file fits in the cache or not. Maybe it was
Peter> caused by some other random events - I was not able to reproduce
Peter> such slow speeds anymore.
Peter> Also look at the difference in vmstat output between restore and LOAD DATA:
Peter> restore:
Peter> 0 1 1 181624 42456 196488 206936 0 0 2823 4208 1126 1151 0 25 74
Peter> 0 1 0 181624 24472 222220 199332 0 0 3411 3903 1144 1273 0 25 74
Peter> 1 0 1 181624 30640 104856 268544 0 0 7030 371 682 792 29 24 47
Peter> 1 0 0 181624 30636 104856 268544 0 0 0 0 104 115 50 0 50
Peter> 1 0 0 181624 21108 104848 269308 0 0 0 0 104 114 50 0 49
Peter> So the file is copied quickly and then the indexes are generated,
Peter> so one CPU is completely maxed out:
Peter> LOAD DATA:
Peter> 2 1 0 181624 75816 166460 180356 0 0 425 223 411 8798 29 6 65
Peter> 0 1 0 181624 68256 168944 185312 0 0 435 407 608 2827 10 5 85
Peter> 1 1 0 181624 63676 171668 189108 0 0 388 363 656 7937 19 7 74
Peter> 1 0 0 181624 23976 104856 245228 0 0 0 0 106 116 50 0 50
Peter> 1 0 0 181624 23976 104856 245228 0 0 0 0 104 113 50 0 49
Peter> 1 0 0 181624 29980 131516 212568 0 0 0 0 104 114 48 3 50
Peter> As you see, the CPU is not really used, and the disk traffic is
Peter> not that high either. This may indicate the disk is being accessed
Peter> in random order, though I think it should not be. It may also
Peter> indicate that too small a buffer is used (?); my record_buffer is
Peter> set to 512K, by the way.
This is strange; when doing LOAD DATA, MySQL should buffer rows up to
'record_buffer' and then write them to disk in a single block.
I will try to find time to test this in the near future with your
table definition!
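The buffered-write pattern Monty describes can be sketched like this (an illustrative model, with BUFFER_SIZE standing in for record_buffer at its 512K setting — not MySQL's actual I/O layer):

```python
# Sketch: accumulate rows in a record buffer and flush it as one large
# sequential write, instead of issuing one small write per row.
import io

BUFFER_SIZE = 512 * 1024      # stands in for record_buffer = 512K
out = io.BytesIO()            # stands in for the table's data file
buf = bytearray()
flushes = 0

def write_row(row: bytes):
    global flushes
    buf.extend(row)
    if len(buf) >= BUFFER_SIZE:   # buffer full: one big sequential write
        out.write(buf)
        buf.clear()
        flushes += 1

for _ in range(100_000):
    write_row(b"x" * 100)         # 100-byte rows

out.write(buf)                    # flush the remaining tail
print(flushes)
```

With 100-byte rows, 10 MB of data reaches the file in only a couple of dozen writes; if vmstat shows many small scattered writes instead, the buffering is not working as intended, which is what Peter's numbers hint at.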
Regards,
Monty
---------------------------------------------------------------------
Before posting, please check:
http://www.mysql.com/manual.php (the manual)
http://lists.mysql.com/ (the list archive)
To request this thread, e-mail <[EMAIL PROTECTED]>
To unsubscribe, e-mail <[EMAIL PROTECTED]>
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php