Hi!

>>>>> "Peter" == Peter Zaitsev <[EMAIL PROTECTED]> writes:

Peter> Hello mysql,
Peter>   I'm repeating my test's now with more information and on more
Peter>   powerfull maching with no load now, this should give more accurate
Peter>   results.
Peter>   The machine is Dual PIII-750/1GB RAM

<cut>

Peter> With mysqldump which uses the following query to do this:

Peter> SELECT * INTO OUTFILE '/spylog/test/g00hits.txt' FROM g00hits

Peter> It takes:
Peter> real    1m23.177s   (first time)
Peter> real    1m20.934s   (second time)

Peter> The backup table for the same table takes:

Peter> first time:
mysql> backup table g00hits to '/spylog/test';
Peter> +--------------+--------+----------+----------+
Peter> | Table        | Op     | Msg_type | Msg_text |
Peter> +--------------+--------+----------+----------+
Peter> | la00.g00hits | backup | status   | OK       |
Peter> +--------------+--------+----------+----------+
Peter> 1 row in set (22.60 sec)
Peter> second time:

Peter> Well. We have 4 times difference here, which is a bit strange as I
Peter> would expect the disk speed should be the limit here, but not
Peter> converting the rows in text format, which is in this case done at
Peter> speed of 45000rows/second, which is not really much I think for this
Peter> type of operation. Hope mysql does not use strcat to form the strings?

No, we are not using buffering for this.

I would guess that the speed difference would come from this:

- When you do BACKUP TABLE, the time you measure is just the time it
takes to copy the old file to the Linux internal disk buffer;  As Linux is
delaying the actual flush to disk until later, you will not see this.

- When doing SELECT * INTO OUTFILE, you are mostly measuring how long
  it takes to convert a LOT of numerical data to strings before
  writing this to disk.

  The speed ratio between copying 4 bytes and converting an int to
  string more than explains the time difference.

To get exact measurements of what is going on, you need a test where
the file is bigger than your memory!


Peter> Well. Let's try to do the restore:
mysql> restore table g00hits from '/spylog/test';
Peter> +--------------+---------+----------+----------+
Peter> | Table        | Op      | Msg_type | Msg_text |
Peter> +--------------+---------+----------+----------+
Peter> | test.g00hits | restore | status   | OK       |
Peter> +--------------+---------+----------+----------+
Peter> 1 row in set (1 min 31.16 sec)

Peter> and again:

mysql> restore table g00hits from '/spylog/test';
Peter> +--------------+---------+----------+----------+
Peter> | Table        | Op      | Msg_type | Msg_text |
Peter> +--------------+---------+----------+----------+
Peter> | test.g00hits | restore | status   | OK       |
Peter> +--------------+---------+----------+----------+
Peter> 1 row in set (1 min 30.41 sec)

mysql> load data infile '/spylog/test/g00hits.txt' ignore into table g00hits;
Peter> Query OK, 3722867 rows affected (8 min 35.73 sec)

mysql> load data infile '/spylog/test/g00hits.txt'  ignore into table g00hits;
Peter> Query OK, 3722867 rows affected (3 min 35.26 sec)

Peter> So it seems like cache really matters here, this would indicate bad
Peter> disk access way as this difference is to huge and is not found with
Peter> backup/restore table.

It could also be that Linux did a sync during the test.

Peter> Now without ignore:

mysql> load data infile '/spylog/test/g00hits.txt'  into table g00hits;
Peter> Query OK, 3722867 rows affected (3 min 34.75 sec)

Peter> So it seems like it does not really matter.

Ignore has identical speed as an insert, if there isn't a duplicate
key.

Peter> Well let's drop all indexes and try to load data fastest way ?

mysql> load data infile '/spylog/test/g00hits.txt'  ignore into table g00hits;
Peter> Query OK, 3722867 rows affected (1 min 56.16 sec)
Peter> Records: 3722867  Deleted: 0  Skipped: 0  Warnings: 0

mysql> load data infile '/spylog/test/g00hits.txt'   into table g00hits;
Peter> Query OK, 3722867 rows affected (1 min 56.38 sec)
Peter> Records: 3722867  Deleted: 0  Skipped: 0  Warnings: 0

mysql> alter table g00hits add primary key(counter_id,ts,id),add key(visitor_id);
Peter> Query OK, 3722867 rows affected (1 min 54.38 sec)
Peter> Records: 3722867  Duplicates: 0  Warnings: 0

Peter> So it does not give any improvement do add keys after loading date, it
Peter> seems to be allready optimal.

When you do a LOAD DATA to an empty file, MySQL will first insert the
data and the unique keys and then create the other keys in a batch..

Peter> So the diference in restore speed is about 3 times, which is not
Peter> really expected, but a bit better then I reported before, but anyway I
Peter> can't understand so huge difference for load data then text file fits
Peter> to cache or does not. Well may be it was because of some other random
Peter> events - I was not able to reproduce so slow speed anymore.

Peter> also look at the difference of vmstat between restore and load data:
Peter> restore:

Peter>  0  1  1 181624  42456 196488 206936   0   0  2823  4208 1126  1151   0  25  74
Peter>  0  1  0 181624  24472 222220 199332   0   0  3411  3903 1144  1273   0  25  74
Peter>  1  0  1 181624  30640 104856 268544   0   0  7030   371  682   792  29  24  47
Peter>  1  0  0 181624  30636 104856 268544   0   0     0     0  104   115  50   0  50
Peter>  1  0  0 181624  21108 104848 269308   0   0     0     0  104   114  50   0  49

Peter>  So the file is copied quickly and then indexes are generated so one
Peter>  CPU is completely out:

Peter>  LOAD DATA:

Peter>  2  1  0 181624  75816 166460 180356   0   0   425   223  411  8798  29   6  65
Peter>  0  1  0 181624  68256 168944 185312   0   0   435   407  608  2827  10   5  85
Peter>  1  1  0 181624  63676 171668 189108   0   0   388   363  656  7937  19   7  74
Peter>  1  0  0 181624  23976 104856 245228   0   0     0     0  106   116  50   0  50
Peter>  1  0  0 181624  23976 104856 245228   0   0     0     0  104   113  50   0  49
Peter>  1  0  0 181624  29980 131516 212568   0   0     0     0  104   114  48   3  50
  
Peter>  as you see the CPU is not really used, therefore the disk access is
Peter>  not so huge, this may indicate the disk is bein accessed in random
Peter>  order, therefore it should not i think. This may indicate to small
Peter>  buffer used (?) my record_buffer is set to 512K by the way.

This is strange;  When doing load data, MySQL should buffer rows up to
'record_buffer' and then write this in a single block.

I will try to find time to test this in the near future with your
table definition!

Regards,
Monty


---------------------------------------------------------------------
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/           (the list archive)

To request this thread, e-mail <[EMAIL PROTECTED]>
To unsubscribe, e-mail <[EMAIL PROTECTED]>
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php

Reply via email to