Adding indexes on large tables

2003-10-07 Thread Brendan J Sherar
Greetings to all, and thanks for the excellent resource!

I have a question regarding indexing large tables (150M+ rows, 2.6G).

The tables in question have a format like this:

word_id mediumint unsigned
doc_id mediumint unsigned

Our indexes are as follows:

PRIMARY KEY (word_id, doc_id)
INDEX (doc_id)

The heart of the question is this:

When calling ALTER IGNORE TABLE doc_word ADD PRIMARY KEY(doc_id, word_id),
ADD INDEX(doc_id), MySQL proceeds to create a working copy of the table. This
process takes over an hour to perform. During this time, disk I/O for the
rest of the database (live) reaches a bottleneck, and slows to an
unacceptable crawl. Once the copy has been created, MySQL is able to do
the actual index build very quickly and efficiently. This process must
occur three times daily.

A) MySQL creates these temporary tables in the same directory as the
original datafile. Is there a way to cause it to use an alternate
directory (i.e., on a separate mounted disk)?

B) Is there a way to nice this process in such a way that the amount of
I/O it consumes in performing the copy is restricted to a manageable level
so that other requests to the disks can be served in a timely fashion?

C) Would abandoning ext3 in favor of ext2 create a substantial difference?

D) We're reluctant to upgrade to 4.0 at this point, but were we do so, are
there any significant gains in this situation?

E) The ALTER TABLE query is performed using perl DBI. Is there a lower
level call available which would improve performance?

F) Any other ideas or suggestions?

The system in question has the following setup:

Dual Xeon 2.8, 4G RAM, 2 x 146GB U160 SCSI (10,000 RPM) on RAID 1
(hardware). Redhat 8.0, 2.4.18 kernel, using ext3 fs. MySQL 3.23.56, with
myisam tables.

Relevant variables:

myisam_sort_buffer_size=512M
tmp_table_size=128M
This is a master, so bin_log is on

Thanks in advance for your help, and please keep up the excellent work!

Best,
Brendan



-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



RE: Adding indexes on large tables

2003-10-07 Thread Dan Greene
an option for A-

no idea if this will work, but what if you moved your actual data file to new drive, 
and soft linked it from the other drive?

 -Original Message-
 From: Brendan J Sherar [mailto:[EMAIL PROTECTED]
 Sent: Tuesday, October 07, 2003 7:27 AM
 To: [EMAIL PROTECTED]
 Subject: Adding indexes on large tables
 
 
 Greetings to all, and thanks for the excellent resource!
 
 I have a question regarding indexing large tables (150M+ rows, 2.6G).
 
 The tables in question have a format like this:
 
 word_id mediumint unsigned
 doc_id mediumint unsigned
 
 Our indexes are as follows:
 
 PRIMARY KEY (word_id, doc_id)
 INDEX (doc_id)
 
 The heart of the question is this:
 
 When calling ALTER IGNORE TABLE doc_word ADD PRIMARY 
 KEY(doc_id, word_id),
 ADD INDEX(doc_id), MySQL proceeds to create a working copy of 
 the table. This
 process takes over an hour to perform. During this time, disk 
 I/O for the
 rest of the database (live) reaches a bottleneck, and slows to an
 unacceptable crawl. Once the copy has been created, MySQL is 
 able to do
 the actual index build very quickly and efficiently. This process must
 occur three times daily.
 
 A) MySQL creates these temporary tables in the same directory as the
 original datafile. Is there a way to cause it to use an alternate
 directory (i.e., on a separate mounted disk)?
 
 B) Is there a way to nice this process in such a way that 
 the amount of
 I/O it consumes in performing the copy is restricted to a 
 manageable level
 so that other requests to the disks can be served in a timely fashion?
 
 C) Would abandoning ext3 in favor of ext2 create a 
 substantial difference?
 
 D) We're reluctant to upgrade to 4.0 at this point, but were 
 we do so, are
 there any significant gains in this situation?
 
 E) The ALTER TABLE query is performed using perl DBI. Is there a lower
 level call available which would improve performance?
 
 F) Any other ideas or suggestions?
 
 The system in question has the following setup:
 
 Dual Xeon 2.8, 4G RAM, 2 x 146GB U160 SCSI (10,000 RPM) on RAID 1
 (hardware). Redhat 8.0, 2.4.18 kernel, using ext3 fs. MySQL 
 3.23.56, with
 myisam tables.
 
 Relevant variables:
 
 myisam_sort_buffer_size=512M
 tmp_table_size=128M
 This is a master, so bin_log is on
 
 Thanks in advance for your help, and please keep up the 
 excellent work!
 
 Best,
 Brendan
 
 
 
 -- 
 MySQL General Mailing List
 For list archives: http://lists.mysql.com/mysql
 To unsubscribe:
 http://lists.mysql.com/[EMAIL PROTECTED]
 
 

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



RE: Adding indexes on large tables

2003-10-07 Thread Brad Teale
Brendan,
  We have used ext2, ext3, and reiser for testing purposes, and we have
found ext3 to be terribly slow on file read/write operations.  If you need 
a journaling file system, I would go with reiser, otherwise ext2 will be 
blazingly fast.

The other thing I would do is move your DB to another drive like Dan, said.

Brad

-Original Message-
From: Brendan J Sherar [mailto:[EMAIL PROTECTED]
Sent: Tuesday, October 07, 2003 6:27 AM
To: [EMAIL PROTECTED]
Subject: Adding indexes on large tables


Greetings to all, and thanks for the excellent resource!

I have a question regarding indexing large tables (150M+ rows, 2.6G).

The tables in question have a format like this:

word_id mediumint unsigned
doc_id mediumint unsigned

Our indexes are as follows:

PRIMARY KEY (word_id, doc_id)
INDEX (doc_id)

The heart of the question is this:

When calling ALTER IGNORE TABLE doc_word ADD PRIMARY KEY(doc_id, word_id),
ADD INDEX(doc_id), MySQL proceeds to create a working copy of the table.
This
process takes over an hour to perform. During this time, disk I/O for the
rest of the database (live) reaches a bottleneck, and slows to an
unacceptable crawl. Once the copy has been created, MySQL is able to do
the actual index build very quickly and efficiently. This process must
occur three times daily.

A) MySQL creates these temporary tables in the same directory as the
original datafile. Is there a way to cause it to use an alternate
directory (i.e., on a separate mounted disk)?

B) Is there a way to nice this process in such a way that the amount of
I/O it consumes in performing the copy is restricted to a manageable level
so that other requests to the disks can be served in a timely fashion?

C) Would abandoning ext3 in favor of ext2 create a substantial difference?

D) We're reluctant to upgrade to 4.0 at this point, but were we do so, are
there any significant gains in this situation?

E) The ALTER TABLE query is performed using perl DBI. Is there a lower
level call available which would improve performance?

F) Any other ideas or suggestions?

The system in question has the following setup:

Dual Xeon 2.8, 4G RAM, 2 x 146GB U160 SCSI (10,000 RPM) on RAID 1
(hardware). Redhat 8.0, 2.4.18 kernel, using ext3 fs. MySQL 3.23.56, with
myisam tables.

Relevant variables:

myisam_sort_buffer_size=512M
tmp_table_size=128M
This is a master, so bin_log is on

Thanks in advance for your help, and please keep up the excellent work!

Best,
Brendan



-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]

-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]