Hello All,
I have a legacy application which was written using a compound primary key
of an item number (non unique) along with a category ID. The combination of
the item number and category ID make the records unique.
I am in the process of replacing the compound (VARCHAR) keys with an
unique integer key in these tables.
So I have created an item_seq table and assigned a unique sequence number to
each compound key -- it looks like this (all tables are myisam tables, and
mysql version 5.0)
desc item_seq;
+-----------+------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-----------+------------------+------+-----+---------+----------------+
| seq | int(10) unsigned | NO | PRI | NULL | auto_increment |
| itemid | char(11) | NO | MUL | | |
| category | char(4) | NO | | | |
+-----------+------------------+------+-----+---------+----------------+
I also have my main transactional table with about 180,000,000 rows -- it
looks like this:
desc item_trans;
+-------------+-----------------------+------+-----+---------------------+-------+
| Field | Type | Null | Key | Default |
Extra |
+-------------+-----------------------+------+-----+---------------------+-------+
| seq | int(10) unsigned | NO | MUL | |
|
| itemid | char(11) | NO | PRI | |
|
| category | char(4) | NO | PRI | |
|
| transid | int(10) | NO | PRI | |
|
Currently the "seq" field is null for the entire table. So of course, I
want to update the main transaction table with the new sequence number.
So I've disabled all the keys on the "item_trans" table -- since I am
updating every row, it wouldn't (shouldn't) be using the index anyway.
Here is my correlated update query:
update item_trans i, item_seq is
set i.seq=is.seq
where is.itemid=i.itemid and is.category=i.category;
If I run an explain on the select version of the update, this is what I
get:
+----+-------------+----------+--------+---------------+--------+---------+------------------------------------------------+-----------+-------+
| id | select_type | table | type | possible_keys | key | key_len |
ref | rows | Extra |
+----+-------------+----------+--------+---------------+--------+---------+------------------------------------------------+-----------+-------+
| 1 | SIMPLE | item_trans | ALL | PRIMARY | NULL | NULL
| NULL | 178948797 | |
| 1 | SIMPLE | item_seq | eq_ref | itemid | itemid | 20 |
g.item_trans.itemid,g.item_trans.category | 1 |
|
+----+-------------+----------+--------+---------------+--------+---------+------------------------------------------------+-----------+-------+
... which is exactly what I would expect it to do. Update every record of
the item_trans table, and do a full index lookup on the items_seq table.
SO... I've been running this query to update item_trans, and it's been
running for 5 days now.
I've also tried running this with the primary key index on the item_trans
table (but not the seq index), and that ran slower in my initial tests.
Are there any faster ways to update 180 million records with a correlated
update query? And I'm fairly certain that trying to do this in PHP
one-record at a time would take much longer than a SQL solution.
Thanks,
-Hank