Re: Performance while Building Indices -- how does it scale?

Joseph Bueno Mon, 23 Sep 2002 02:03:21 -0700

Chris Stoughton wrote:
> I have a table with a few million rows, wth 633 columns.  I want to 
> create 10 inidices on this table, six single-column and four two-column 
> inidices.  The database will be loaded once and remain static, and 
> queried many times.  Please note that this is a "small" prototype for 
> the actual database, which will have 40 times more rows.
> 
> I used two strategies to create these tables:
> 
> 1.  Create the indices in the "create table" statement
> 2.  Load the table, and then use the "create index" statement for each 
> index.
> 
> With Strategy 1, the loading started nicely  using the "load data 
> infile" command to load approx. 2000 records at a time.  It took 3 
> seconds for each "load data infile" command.  After a few hundred "load 
> data infile" commands, however, the time increased to two minutes per 
> command.
> 
> With Strategy 2, the loading went from start to finish at 3 seconds for 
> each "load data infile" command.  But now, creating each index is taking 
> longer.  Here is a log:
> 
> Fri Sep 20 15:17:43 CDT 2002 create index objId on bestTsObj (objId)
> Fri Sep 20 15:35:51 CDT 2002 create index fieldId on bestTsObj (fieldId)
> Fri Sep 20 15:56:02 CDT 2002 create index targetObjId on bestTsObj 
> (targetObjId)
> Fri Sep 20 16:20:02 CDT 2002 create index ra on bestTsObj (ra)
> Fri Sep 20 16:49:16 CDT 2002 create index decl on bestTsObj (decl)
> Fri Sep 20 17:53:38 CDT 2002 create index u on bestTsObj (u)
> Fri Sep 20 18:42:52 CDT 2002 create index g on bestTsObj (g)
> Fri Sep 20 22:04:25 CDT 2002 create index r on bestTsObj (r)
> Sat Sep 21 10:06:44 CDT 2002 create index i on bestTsObj (i)
> Sat Sep 21 19:31:52 CDT 2002 create index z on bestTsObj (z)
> Sun Sep 22 15:45:06 CDT 2002 create index ug on bestTsObj (u,g)
> 
> 
> The first index was created in a reasonable amount of time (18 minutes), 
> but it is taking longer to create more indices.
> 
> 
> With both strategies, no process is swapping heavily, and mysqld is not 
> consuming available CPU cycles efficiently.
> 
> Can you suggest how I can build multiple indices on large tables 
> efficiently?  Will it help to create a "narrow" table with only the 
> quantities I want to use in indices?
> 
> We have installed mysql via rpm:
> bash-2.04$ mysqladmin
> mysqladmin  Ver 8.18 Distrib 3.23.36, for redhat-linux-gnu on i386
> 
> bash-2.04$ uname -r
> 2.4.18-xfs-1.1
> bash-2.04$
> What diagnostics do you suggest I run?
> 
> Thanks
> 
Hi,


You should use strategy #1 since each time you create a new index, all
index are rebuilt (this is why each new index is taking longer in
strategy#2).

Since mysqld is not consuming CPU nor swapping, it is probably doing a
lot of disc I/Os. You can verify that with vmstat utility.

You didn't specify how much RAM you have and what mysql configuration
you are using (check /etc/my.cnf). You should pay special attention to
'key_buffer_size' parameter value since it has a major impact on index
performance.

Regards,
-- 
Joseph Bueno


---------------------------------------------------------------------
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/           (the list archive)

To request this thread, e-mail <[EMAIL PROTECTED]>
To unsubscribe, e-mail <[EMAIL PROTECTED]>
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php

Re: Performance while Building Indices -- how does it scale?

Reply via email to