Try analyze your table(s).

"Me fail English? That's unpossible"
###___Archon___###

----- Original Message -----
From: "electroteque" <[EMAIL PROTECTED]>
To: "Paul DuBois" <[EMAIL PROTECTED]>; "Florian Weimer" <[EMAIL PROTECTED]>;
<[EMAIL PROTECTED]>
Sent: Wednesday, July 09, 2003 10:23 AM
Subject: RE: Faster reindexing


> when reimporting or reinserting or whatever from a huge db i usually drop
> all the indexes reimport then create them again much quicker
>
> -----Original Message-----
> From: Paul DuBois [mailto:[EMAIL PROTECTED]
> Sent: Wednesday, July 09, 2003 1:09 PM
> To: Florian Weimer; [EMAIL PROTECTED]
> Subject: Re: Faster reindexing
>
>
> At 9:39 +0200 7/7/03, Florian Weimer wrote:
> >I've got a table with 100 million rows and need some indexes on it
> >(one row is 126 bytes).
> >
> >I'm currently using MyISAM and the indexing proceeds at an
> >astonishingly low rate: about 200 MB per hour.  This is rate is far
> >too low; if we had to recover the database for some reason, we'd have
> >to wait for days.
> >
> >The table looks like this:
> >
> >CREATE TABLE flows (
> > version    CHAR NOT NULL,
> > router     CHAR(15) NOT NULL,
> > src_ip     CHAR(15) NOT NULL,
> > dst_ip     CHAR(15) NOT NULL,
> > protocol   TINYINT UNSIGNED NOT NULL,
> > src_port   MEDIUMINT UNSIGNED NOT NULL,
> > dst_port   MEDIUMINT UNSIGNED NOT NULL,
> > packets    INTEGER UNSIGNED NOT NULL,
> > bytes      INTEGER UNSIGNED NOT NULL,
> > src_if     MEDIUMINT UNSIGNED NOT NULL,
> > dst_if     MEDIUMINT UNSIGNED NOT NULL,
> > src_as     MEDIUMINT UNSIGNED NOT NULL,
> > dst_as     MEDIUMINT UNSIGNED NOT NULL,
> > src_net    CHAR(1) NOT NULL,
> > dst_net    CHAR(1) NOT NULL,
> > direction  CHAR(1) NOT NULL,
> > class      CHAR(1) NOT NULL,
> > start_time CHAR(24),
> > end_time   CHAR(24)
> >);
> >
> >Indexes are created using this statement:
> >
> >mysql> ALTER TABLE flows
> >     -> ADD INDEX dst_ip (dst_ip, src_ip),
> >     -> ADD INDEX dst_port (dst_port, start_time),
> >     -> ADD INDEX src_ip (src_ip, start_time),
> >     -> ADD INDEX time (start_time);
> >
> >In theory, we could represent the columns router, src_ip, dst_ip,
> >start_time, end_time using integers of the appropriate size, but this
> >would make ad-hoc queries harder to type (and porting our applications
> >would be even more difficult).
>
> Perhaps, but as a test, you might add a couple of extra columns to
> the table, then populate them like this after loading the table:
>
> UPDATE flows SET int_src_ip = INET_ATON(src_ip), int_dst_ip =
> INET_ATON(dst_ip);
>
> Then try creating the indexes using int_src_ip and int_dst_ip rather
> than src_ip and dst_ip.
>
> If it's significantly faster, you may want to reconsider whether it might
> not be worth using INET_ATON(X) in your queries rather than X.
>
> >
> >Should I switch to another table type?
>
> It's easy enough to convert the table to, e.g., InnoDB and then
> create the indexes, so an empirical test should not be difficult.
>
> --
> Paul DuBois, Senior Technical Writer
> Madison, Wisconsin, USA
> MySQL AB, www.mysql.com
>
> Are you MySQL certified?  http://www.mysql.com/certification/
>
>
> --
> MySQL General Mailing List
> For list archives: http://lists.mysql.com/mysql
> To unsubscribe:
> http://lists.mysql.com/[EMAIL PROTECTED]
>
>
> --
> MySQL General Mailing List
> For list archives: http://lists.mysql.com/mysql
> To unsubscribe:
http://lists.mysql.com/[EMAIL PROTECTED]


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:    http://lists.mysql.com/[EMAIL PROTECTED]

Reply via email to