Hi,

On Jan 29, 2008 4:08 PM, David Schneider-Joseph <[EMAIL PROTECTED]> wrote:
> Hi all,
>
> I am attempting to convert a very large table (~23 million rows) from
> MyISAM to InnoDB.  If I do it in chunks of one million at a time, the
> first million are very fast (approx. 3 minutes or so), and then it
> gets progressively worse, until by the time I get even to the fourth
> chunk, it's taking 15-20 minutes, and continuing to worsen.  This is
> much worse degradation than the O(log N) that you would expect.
>
> I have determined that this is in part due to the indexes on the
> table.  I have an index on two columns, call them A and B.  There is
> no relationship between the ordering in A and the ordering in B.  If I
> create the InnoDB table with only index A, and insert the data into it
> in the order of A, then almost no gradual degradation in performance
> can be observed between chunks.  Similarly, if I create the InnoDB
> table with only index B, and insert the data ordered by B, it also
> maintains its performance.
>
> However, I have not been able to find a way to insert the data with
> *both* indexes, A and B, without suffering this gradual degradation in
> performance.  I have tried all sorts of methods to convert the table,
> such as dumping to a file and importing back into an InnoDB table,
> using "ALTER TABLE big_table ENGINE=InnoDB;", importing the data
> without the second index and adding it after the fact, and in all
> cases it is just a game of whack-a-mole, with the overhead being moved
> elsewhere.  (Note: my primary key is an integer column, so that is not
> the issue here.)
>
> This problem can even be reproduced in a very simple test case, where
> I repeatedly insert approximately 1 million rows into a table, with
> random data.  `big_table` can be any table with approximately one
> million rows in id range 1 through 1000000 (we're not actually using
> any data from it):
>
> > mysql> create table test (id int NOT NULL auto_increment, x int NOT
> > NULL, primary key (id), key (x)) ENGINE=InnoDB;
> > Query OK, 0 rows affected (0.08 sec)
> >
> > mysql> insert into test (x) select cast(rand()*10000000 as unsigned)
> > from big_table where id between 1 and 1000000;
> > Query OK, 981734 rows affected (22.23 sec)
> > Records: 981734  Duplicates: 0  Warnings: 0
> >
> > mysql> insert into test (x) select cast(rand()*10000000 as unsigned)
> > from big_table where id between 1 and 1000000;
> > Query OK, 981734 rows affected (37.03 sec)
> > Records: 981734  Duplicates: 0  Warnings: 0
> >
> > mysql> insert into test (x) select cast(rand()*10000000 as unsigned)
> > from big_table where id between 1 and 1000000;
> > Query OK, 981734 rows affected (56.41 sec)
> > Records: 981734  Duplicates: 0  Warnings: 0
> >
> > mysql> insert into test (x) select cast(rand()*10000000 as unsigned)
> > from big_table where id between 1 and 1000000;
> > Query OK, 981734 rows affected (1 min 8.47 sec)
> > Records: 981734  Duplicates: 0  Warnings: 0
> >
> > mysql> insert into test (x) select cast(rand()*10000000 as unsigned)
> > from big_table where id between 1 and 1000000;
> > Query OK, 981734 rows affected (1 min 27.67 sec)
> > Records: 981734  Duplicates: 0  Warnings: 0
> >
> > mysql> insert into test (x) select cast(rand()*10000000 as unsigned)
> > from big_table where id between 1 and 1000000;
> > Query OK, 981734 rows affected (1 min 57.93 sec)
> > Records: 981734  Duplicates: 0  Warnings: 0
>
> Any ideas, anyone?

It's because your index is bigger than your memory (or at least bigger
than your InnoDB buffer pool).  InnoDB can't build a secondary index by
sorting the rows; it inserts the values into the index B-tree one at a
time, so once the index on the randomly-ordered column outgrows the
buffer pool, each insert has to read index pages from disk, and the
inserts get progressively slower.
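
A quick way to check is to compare the pool size against the index size
(just a sketch -- the 1024M below is a placeholder, not a
recommendation; size it to your machine):

    mysql> SHOW VARIABLES LIKE 'innodb_buffer_pool_size';
    mysql> SHOW TABLE STATUS LIKE 'test'\G
    -- Index_length in that output is roughly the bytes used by the
    -- secondary index(es)

    # my.cnf, [mysqld] section; changing this requires a server restart
    innodb_buffer_pool_size = 1024M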

Perhaps you can split the table into many smaller tables, or keep older
data in an archive table that's MyISAM and keep only the newest rows
in InnoDB.  Or if you're using MySQL 5.1, this might be a candidate
for partitioning.
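
If you go the 5.1 route, I mean something like this -- just a sketch,
the table name and range boundaries are made up for a ~23M-row table
keyed by an integer id:

    CREATE TABLE big_table_part (
      id INT NOT NULL AUTO_INCREMENT,
      x  INT NOT NULL,
      PRIMARY KEY (id),
      KEY (x)
    ) ENGINE=InnoDB
    PARTITION BY RANGE (id) (
      PARTITION p0 VALUES LESS THAN (6000000),
      PARTITION p1 VALUES LESS THAN (12000000),
      PARTITION p2 VALUES LESS THAN (18000000),
      PARTITION p3 VALUES LESS THAN MAXVALUE
    );

Each partition keeps its own, much smaller index on x, so the B-tree
you're inserting into has a better chance of staying inside the buffer
pool.  (In 5.1 the partitioning column has to be part of every unique
key on the table, which works out fine here since you're partitioning
on the primary key.)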

