RE: mysqlimport --use-threads / mysqladmin processlist

2012-07-25 Thread Rick James
I'm skeptical that --use-threads can ever be very effective.

What order are the rows in?  They are probably in PRIMARY KEY order, which 
means that the INSERTing threads will be fighting over similar spots in the 
table.

Is it I/O bound when it is single-threaded?  If so, then there can't be any 
improvement with --use-threads.

etc.

Suggest you file a bug with bugs.mysql.com.  If nothing else, the documentation 
should say more than it does.

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/mysql



Re: mysqlimport --use-threads / mysqladmin processlist

2012-07-25 Thread Róbert Kohányi
Yes, the rows are in primary key order, however each row contains
specific integer primary keys; I'm not inserting nulls into a table
where the primary key is auto increment, so I don't see why concurrent
inserts would fight for similar spots (although, I'm admittedly not a
MySQL hotshot, so the basis of my assumption is a *hunch* only).

I'm not sure (yet) whether a single-threaded operation would run into
an I/O bottleneck. I haven't run mysqlimport with --use-threads=1 yet
(I will if I have the time), but when I ran it with --use-threads=4,
the import (of a ~500 MB dump) took more time than running four
different processes (I split my tab-delimited dumps with split into
four even pieces and imported those in four different sessions).
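
The "four different sessions" approach could be scripted roughly as
follows. This is only a sketch: the function name import_pieces and the
MYSQLIMPORT override variable are made up for illustration, and it
assumes each piece lives in its own directory under the table's file
name, because mysqlimport takes the table name from the file name.

```shell
# Start one mysqlimport process per data file and wait for all of them.
# import_pieces and MYSQLIMPORT are illustrative names, not part of MySQL.
import_pieces() {
  db=$1; shift
  pids=""
  for f in "$@"; do
    # Each background process opens its own connection to the server.
    "${MYSQLIMPORT:-mysqlimport}" --local "$db" "$f" &
    pids="$pids $!"
  done

  # Report failure if any of the imports failed.
  status=0
  for pid in $pids; do
    wait "$pid" || status=1
  done
  return "$status"
}

# Hypothetical usage, with pieces laid out as pieces/0/mytable.txt etc.:
#   import_pieces db pieces/0/mytable.txt pieces/1/mytable.txt
```

Each process should then show up as a separate entry in the
processlist, which matches what was observed with four manual sessions.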

Anyway, it seems that a simple import (from a dump that isn't
tab-delimited, but contains complete or extended inserts) takes the
same amount of time as a mysqlimport with --use-threads=4. As it
turns out, splitting my tab-delimited dump is too complex to handle
gracefully, because my data contains newline characters all over the
place, so I've dropped the idea of this whole mysqlimport thing for
now. (I'll try the method of migrating an InnoDB database to an
NDBCluster described here[1] instead.)
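
For reference, the newline problem can be worked around if the dump
uses the default escaping, where a newline inside a value is written as
a backslash followed by a real newline. Under that assumption a row
continues whenever a physical line ends in an odd number of
backslashes. A minimal sketch (split_tab_dump and the directory layout
are hypothetical, not an existing tool):

```shell
# Split a mysqldump --tab data file into N pieces without cutting a row
# in two. Assumes default escaping (backslash before an embedded newline).
split_tab_dump() {
  src=$1; pieces=$2; outdir=$3
  name=$(basename "$src")

  # One directory per piece, so every piece keeps the original file name.
  i=0
  while [ "$i" -lt "$pieces" ]; do
    mkdir -p "$outdir/$i"
    : > "$outdir/$i/$name"
    i=$((i + 1))
  done

  awk -v n="$pieces" -v dir="$outdir" -v name="$name" '
    BEGIN { piece = 0 }
    {
      out = dir "/" piece "/" name
      print > out

      # Count the backslashes at the end of this physical line.
      k = 0
      while (k < length($0) && substr($0, length($0) - k, 1) == "\\") k++

      # Even count: the row ends here, so move on to the next piece.
      # Odd count: the newline is escaped and the row continues.
      if (k % 2 == 0) piece = (piece + 1) % n
    }
  ' "$src"
}

# Hypothetical usage:
#   split_tab_dump dir/mytable.txt 4 pieces
```

Rows are dealt out round-robin so the file is read only once; the
pieces therefore interleave primary key ranges rather than holding
contiguous ones.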

If I have the time I'll write up a bug report or a documentation
enhancement request for this.

Thanks for the input!

Regards,
Kohányi Róbert

[1]: http://johanandersson.blogspot.se/2012/04/mysql-cluster-how-to-load-it-with-data.html






mysqlimport --use-threads / mysqladmin processlist

2012-07-24 Thread Róbert Kohányi
I'm in the middle of migrating an InnoDB database to an NDBCluster. I
use mysqldump to first create two dumps, the first one contains only
the database schema, the second one contains only tab delimited data
(via mysqldump --tab). I edit my InnoDB schema here and there
(ENGINE=InnoDB to ENGINE=NDB, etc.) import it and after this I import
the InnoDB data *as is* using mysqlimport.

I use it like this:

mysqlimport --local --use-threads=4 db dir/*.txt

(dir of course contains the tab delimited data I dumped before.)

The import starts, and I check its progress via mysqladmin, like this:

mysqladmin --sleep=1 processlist

this is what I see: http://pastebin.com/raw.php?i=M23fWVjc

Only a single process seems to be loading my data. Is this what I
*should* see, or, in my case using 4 threads, should I see four
processes? I'm not asking which one will be faster, I'm just simply
confused because I don't know what to expect. If I start four
different mysqlimport processes, each one importing different files,
then I can see four different processes in the mysql processlist.

If it matters, here is my server version (I use the official binaries).
Server version: 5.5.25a-ndb-7.2.7-gpl MySQL Cluster Community Server (GPL)
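
On what to expect: the mysqlimport documentation describes
--use-threads as loading files in parallel, which suggests the number
of concurrent loads cannot exceed the number of data files given. One
way to check what the server is actually doing is to count the
connections running a LOAD DATA statement. A small sketch
(count_load_data is a made-up helper name, and it assumes the statement
text is visible in the Info column of the processlist output):

```shell
# Count server connections currently running a LOAD DATA statement.
# Reads "mysqladmin processlist" output on standard input.
count_load_data() {
  # grep -c exits non-zero when the count is 0, so mask that status.
  grep -c 'LOAD DATA' || true
}

# Hypothetical usage while an import is running:
#   mysqladmin processlist | count_load_data
```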

Regards,
Kohányi Róbert
