At 11:23 +0700 7/9/03, Dominicus Donny wrote:
Try analyze your table(s).

What information will this yield to make indexing faster?



"Me fail English? That's unpossible" ###___Archon___###

----- Original Message -----
From: "electroteque" <[EMAIL PROTECTED]>
To: "Paul DuBois" <[EMAIL PROTECTED]>; "Florian Weimer" <[EMAIL PROTECTED]>;
<[EMAIL PROTECTED]>
Sent: Wednesday, July 09, 2003 10:23 AM
Subject: RE: Faster reindexing


 when reimporting or reinserting or whatever from a huge db i usually drop
 all the indexes reimport then create them again much quicker

 -----Original Message-----
 From: Paul DuBois [mailto:[EMAIL PROTECTED]
 Sent: Wednesday, July 09, 2003 1:09 PM
 To: Florian Weimer; [EMAIL PROTECTED]
 Subject: Re: Faster reindexing


At 9:39 +0200 7/7/03, Florian Weimer wrote: >I've got a table with 100 million rows and need some indexes on it >(one row is 126 bytes). > >I'm currently using MyISAM and the indexing proceeds at an >astonishingly low rate: about 200 MB per hour. This is rate is far >too low; if we had to recover the database for some reason, we'd have >to wait for days. > >The table looks like this: > >CREATE TABLE flows ( > version CHAR NOT NULL, > router CHAR(15) NOT NULL, > src_ip CHAR(15) NOT NULL, > dst_ip CHAR(15) NOT NULL, > protocol TINYINT UNSIGNED NOT NULL, > src_port MEDIUMINT UNSIGNED NOT NULL, > dst_port MEDIUMINT UNSIGNED NOT NULL, > packets INTEGER UNSIGNED NOT NULL, > bytes INTEGER UNSIGNED NOT NULL, > src_if MEDIUMINT UNSIGNED NOT NULL, > dst_if MEDIUMINT UNSIGNED NOT NULL, > src_as MEDIUMINT UNSIGNED NOT NULL, > dst_as MEDIUMINT UNSIGNED NOT NULL, > src_net CHAR(1) NOT NULL, > dst_net CHAR(1) NOT NULL, > direction CHAR(1) NOT NULL, > class CHAR(1) NOT NULL, > start_time CHAR(24), > end_time CHAR(24) >); > >Indexes are created using this statement: > >mysql> ALTER TABLE flows > -> ADD INDEX dst_ip (dst_ip, src_ip), > -> ADD INDEX dst_port (dst_port, start_time), > -> ADD INDEX src_ip (src_ip, start_time), > -> ADD INDEX time (start_time); > >In theory, we could represent the columns router, src_ip, dst_ip, >start_time, end_time using integers of the appropriate size, but this >would make ad-hoc queries harder to type (and porting our applications >would be even more difficult).

 Perhaps, but as a test, you might add a couple of extra columns to
 the table, then populate them like this after loading the table:

 UPDATE flows SET int_src_ip = INET_ATON(src_ip), int_dst_ip =
 INET_ATON(dst_ip);

 Then try creating the indexes using int_src_ip and int_dst_ip rather
 than src_ip and dst_ip.

 If it's significantly faster, you may want to reconsider whether it might
 not be worth using INET_ATON(X) in your queries rather than X.

 >
 >Should I switch to another table type?

 It's easy enough to convert the table to, e.g., InnoDB and then
 create the indexes, so an empirical test should not be difficult.

 --
 Paul DuBois, Senior Technical Writer
 Madison, Wisconsin, USA
 MySQL AB, www.mysql.com

> Are you MySQL certified? http://www.mysql.com/certification/



-- Paul DuBois, Senior Technical Writer Madison, Wisconsin, USA MySQL AB, www.mysql.com

Are you MySQL certified? http://www.mysql.com/certification/


-- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED]



Reply via email to