Try analyze your table(s). "Me fail English? That's unpossible" ###___Archon___###
----- Original Message ----- From: "electroteque" <[EMAIL PROTECTED]> To: "Paul DuBois" <[EMAIL PROTECTED]>; "Florian Weimer" <[EMAIL PROTECTED]>; <[EMAIL PROTECTED]> Sent: Wednesday, July 09, 2003 10:23 AM Subject: RE: Faster reindexing > when reimporting or reinserting or whatever from a huge db i usually drop > all the indexes reimport then create them again much quicker > > -----Original Message----- > From: Paul DuBois [mailto:[EMAIL PROTECTED] > Sent: Wednesday, July 09, 2003 1:09 PM > To: Florian Weimer; [EMAIL PROTECTED] > Subject: Re: Faster reindexing > > > At 9:39 +0200 7/7/03, Florian Weimer wrote: > >I've got a table with 100 million rows and need some indexes on it > >(one row is 126 bytes). > > > >I'm currently using MyISAM and the indexing proceeds at an > >astonishingly low rate: about 200 MB per hour. This is rate is far > >too low; if we had to recover the database for some reason, we'd have > >to wait for days. > > > >The table looks like this: > > > >CREATE TABLE flows ( > > version CHAR NOT NULL, > > router CHAR(15) NOT NULL, > > src_ip CHAR(15) NOT NULL, > > dst_ip CHAR(15) NOT NULL, > > protocol TINYINT UNSIGNED NOT NULL, > > src_port MEDIUMINT UNSIGNED NOT NULL, > > dst_port MEDIUMINT UNSIGNED NOT NULL, > > packets INTEGER UNSIGNED NOT NULL, > > bytes INTEGER UNSIGNED NOT NULL, > > src_if MEDIUMINT UNSIGNED NOT NULL, > > dst_if MEDIUMINT UNSIGNED NOT NULL, > > src_as MEDIUMINT UNSIGNED NOT NULL, > > dst_as MEDIUMINT UNSIGNED NOT NULL, > > src_net CHAR(1) NOT NULL, > > dst_net CHAR(1) NOT NULL, > > direction CHAR(1) NOT NULL, > > class CHAR(1) NOT NULL, > > start_time CHAR(24), > > end_time CHAR(24) > >); > > > >Indexes are created using this statement: > > > >mysql> ALTER TABLE flows > > -> ADD INDEX dst_ip (dst_ip, src_ip), > > -> ADD INDEX dst_port (dst_port, start_time), > > -> ADD INDEX src_ip (src_ip, start_time), > > -> ADD INDEX time (start_time); > > > >In theory, we could represent the columns router, src_ip, dst_ip, > >start_time, end_time using integers of the appropriate size, but this > >would make ad-hoc queries harder to type (and porting our applications > >would be even more difficult). > > Perhaps, but as a test, you might add a couple of extra columns to > the table, then populate them like this after loading the table: > > UPDATE flows SET int_src_ip = INET_ATON(src_ip), int_dst_ip = > INET_ATON(dst_ip); > > Then try creating the indexes using int_src_ip and int_dst_ip rather > than src_ip and dst_ip. > > If it's significantly faster, you may want to reconsider whether it might > not be worth using INET_ATON(X) in your queries rather than X. > > > > >Should I switch to another table type? > > It's easy enough to convert the table to, e.g., InnoDB and then > create the indexes, so an empirical test should not be difficult. > > -- > Paul DuBois, Senior Technical Writer > Madison, Wisconsin, USA > MySQL AB, www.mysql.com > > Are you MySQL certified? http://www.mysql.com/certification/ > > > -- > MySQL General Mailing List > For list archives: http://lists.mysql.com/mysql > To unsubscribe: > http://lists.mysql.com/[EMAIL PROTECTED] > > > -- > MySQL General Mailing List > For list archives: http://lists.mysql.com/mysql > To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED] -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED]