Re: Non-linear degradation in bulk loads?
Jon, using a 4 GB buffer pool is safe on a 64-bit computer, and allocating 50-80 % of memory to the buffer pool is recommended for an InnoDB-only server.

Regards, Heikki
Oracle/Innobase

----- Original Message -----
From: Jon Frisby [EMAIL PROTECTED]
To: Heikki Tuuri [EMAIL PROTECTED]; mysql@lists.mysql.com
Sent: Tuesday, October 18, 2005 4:39 AM
Subject: RE: Non-linear degradation in bulk loads?

Side question: If I use a 64-bit MySQL build on a 64-bit kernel, is it safe and sane to allocate, say, 6GB to the InnoDB buffer pool? On an 8GB box with a 64-bit software stack, what is the optimum memory allocation for a pure-InnoDB (MyISAM used only for grant tables) mysql server running as the sole application on the machine?

-JF

-----Original Message-----
From: Heikki Tuuri [mailto:[EMAIL PROTECTED]]
Sent: Monday, October 17, 2005 10:55 AM
To: mysql@lists.mysql.com
Subject: Re: Non-linear degradation in bulk loads?

Jon, I am not 100 % sure that the problem we saw was on 64-bit Linux; it might have been 32-bit. Anyway, since CentOS is a clone of RHEL, this might be the same file cache phenomenon. I do not know whether one can force the file cache to stay smaller than 4 GB. You can try running some dummy programs that occupy a few GB of memory.

Regards, Heikki
Oracle/Innobase

----- Original Message -----
From: Jon Frisby [EMAIL PROTECTED]
To: Heikki Tuuri [EMAIL PROTECTED]; mysql@lists.mysql.com
Sent: Monday, October 17, 2005 8:49 PM
Subject: RE: Non-linear degradation in bulk loads?

Sorry to spam the group, but I just noticed that I asserted we were on a 2.4 kernel -- we're on 2.6.9-11 as part of CentOS 4.1...
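Heikki's sizing guideline (50-80 % of RAM to the buffer pool on a dedicated InnoDB-only server) might look like the following my.cnf fragment for the 8 GB box discussed in this thread. This is only a sketch: the specific numbers are illustrative assumptions, not settings posted by anyone here.

```ini
# Hypothetical my.cnf fragment for a dedicated 8 GB, 64-bit, InnoDB-only server.
# Values are illustrative; tune for your own workload.
[mysqld]
innodb_buffer_pool_size = 6144M   # ~75% of RAM, within the 50-80% guideline
innodb_log_file_size = 1024M      # the thread mentions 1GB * 3 log files
innodb_log_files_in_group = 3
innodb_file_io_threads = 8        # matches the "8 I/O threads" mentioned in the thread
key_buffer = 16M                  # MyISAM used only for the grant tables
```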
-JF

-----Original Message-----
From: Heikki Tuuri [mailto:[EMAIL PROTECTED]]
Sent: Sunday, October 16, 2005 2:23 AM
To: mysql@lists.mysql.com
Subject: Re: Non-linear degradation in bulk loads?

Jon, your hardware/OS combination is quite new and unusual. It might indeed be an OS problem. We observed on 64-bit RHEL 4 that when the OS file cache grew bigger than 4 GB, file I/O performance dropped to one tenth of normal. You would not expect that kind of behavior from a 64-bit OS. When you see the slowdown, what does 'top' say about the OS file cache size?

The database is our main site's database, but we've dramatically reduced the load on that machine over the past couple of months through careful optimization of our code. The box is a dual, dual-core Opteron with 8GB of RAM, running a 32-bit Linux 2.4 kernel and MySQL 4.0.20 (32-bit, of course). We have 1GB allocated to the buffer pool, and our usual 1GB * 3 log files. 8 I/O threads. Load on the box sits at around 6-7, with a large (50%) amount of time spent in wait state, but actual disk throughput to our software RAID array (no longer on a SAN...) is quite low -- 6-9k blocks/s out, 1-6k blocks/s in.

Regards, Heikki

----- Original Message -----
From: Jon Frisby [EMAIL PROTECTED]
Newsgroups: mailing.database.myodbc
Sent: Friday, October 14, 2005 10:39 PM
Subject: RE: Non-linear degradation in bulk loads?

I've tried tweaking the structure of the schema to have, for example, a PRIMARY KEY index on email and no other indexes, and then inserting in sorted order -- it made no improvement whatsoever. Another clue that leads me to believe this may be an OS issue: starting a large cp on the same box (from a local filesystem other than the one the InnoDB data pool was on, to NFS) caused MySQL to become COMPLETELY backlogged (we went from ~15-20 connections at any given instant to 750, our max_connections setting).
-JF

-----Original Message-----
From: Heikki Tuuri [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, October 12, 2005 8:15 AM
To: mysql@lists.mysql.com
Subject: Re: Non-linear degradation in bulk loads?

Jon, hmm... maybe one of the indexes inevitably is in a random order. Please post a typical

SHOW INNODB STATUS\G

when the inserts happen slowly. What is your my.cnf like?

Regards, Heikki
Innobase/Oracle
RE: Non-linear degradation in bulk loads?
About 6GB... Is there any way to forcibly limit this?

-JF

-----Original Message-----
From: Heikki Tuuri [mailto:[EMAIL PROTECTED]]
Sent: Sunday, October 16, 2005 2:23 AM
To: mysql@lists.mysql.com
Subject: Re: Non-linear degradation in bulk loads?

When you see the slowdown, what does 'top' say about the OS file cache size?
----- Original Message -----
From: Jon Frisby [EMAIL PROTECTED]
To: Heikki Tuuri [EMAIL PROTECTED]; mysql@lists.mysql.com
Sent: Wednesday, October 12, 2005 3:08 AM
Subject: RE: Non-linear degradation in bulk loads?

Two solutions: 1) sort the rows to be inserted on the key 'email' before inserting. 2) Or: http://dev.mysql.com/doc/mysql/en/innodb-tuning.html

If you have UNIQUE constraints on secondary keys, starting from MySQL 3.23.52 and 4.0.3, you can speed up table imports by temporarily turning off the uniqueness checks during the import session:

SET UNIQUE_CHECKS=0;

For big tables, this saves a lot of disk I/O because InnoDB can use its insert buffer to write secondary index records in a batch. But make sure you do not have any duplicates in the rows!

After sending my mail, I discovered SET UNIQUE_CHECKS=0, and subsequently it also occurred to me to try putting the data in in sorted order. Unfortunately, UNIQUE_CHECKS=0 did not work, and even the combination of both did not work. The first chunk (3.4m rows) took ~1.5 minutes, the second ~5 minutes...

At this point I'm inclined to believe that there is something very wrong with the disk subsystem, because of this and other problems (doing a large cp from the data pool filesystem to another filesystem brought the database to a near-halt, among other things).
As a stop-gap solution, I created the table with no indexes and loaded all the data (it loaded in linear time), and I plan on doing a CREATE UNIQUE INDEX on the table. Will this happen in linear time, or near-linear time?

*sigh*

-JF

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED]
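The two workarounds discussed in this exchange -- inserting rows in key order so the B-tree is filled sequentially, and disabling uniqueness checks for the import session -- can be sketched together as a bulk-load script generator. This is a hypothetical illustration, not code from the thread; the table name, column name, and chunk size are made-up, and a real version would need proper SQL escaping.

```python
def bulk_load_statements(rows, table="subscribers", key_col="email", chunk_size=3):
    """Emit SQL for a bulk load: rows sorted on the indexed key so the
    B-tree is filled in order, with uniqueness checks disabled for the
    session (the caller must guarantee the rows are duplicate-free).
    NOTE: values are interpolated naively for illustration only."""
    stmts = ["SET UNIQUE_CHECKS=0;"]
    ordered = sorted(rows, key=lambda r: r[0])  # r[0] is the 'email' key
    for i in range(0, len(ordered), chunk_size):
        chunk = ordered[i:i + chunk_size]
        values = ", ".join("('%s')" % r[0] for r in chunk)
        stmts.append("INSERT INTO %s (%s) VALUES %s;" % (table, key_col, values))
    stmts.append("SET UNIQUE_CHECKS=1;")  # re-enable checks after the import
    return stmts

if __name__ == "__main__":
    for s in bulk_load_statements([("c@x.com",), ("a@x.com",), ("b@x.com",)]):
        print(s)
```

Multi-row INSERTs (or LOAD DATA INFILE) cut per-statement overhead, and sorted input keeps the primary-key pages hot in the buffer pool; as the thread shows, though, neither helps if the underlying I/O subsystem is the bottleneck.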
RE: Non-linear degradation in bulk loads?
Actually, I believe we're running 32-bit, with bigmem... Does similar behavior occur in such a scenario?

-JF

-----Original Message-----
From: Heikki Tuuri [mailto:[EMAIL PROTECTED]]
Sent: Sunday, October 16, 2005 2:23 AM
To: mysql@lists.mysql.com
Subject: Re: Non-linear degradation in bulk loads?
Re: Non-linear degradation in bulk loads?
Jon, I do not know. Why not install a 64-bit Linux on your computer?

Regards, Heikki
Oracle/Innobase

----- Original Message -----
From: Jon Frisby [EMAIL PROTECTED]
To: Heikki Tuuri [EMAIL PROTECTED]; mysql@lists.mysql.com
Sent: Monday, October 17, 2005 8:46 PM
Subject: RE: Non-linear degradation in bulk loads?

Actually, I believe we're running 32-bit, with bigmem... Does similar behavior occur in such a scenario?

-JF
RE: Non-linear degradation in bulk loads?
Sorry to spam the group, but I just noticed that I asserted we were on a 2.4 kernel -- we're on 2.6.9-11 as part of CentOS 4.1...

-JF

-----Original Message-----
From: Heikki Tuuri [mailto:[EMAIL PROTECTED]]
Sent: Sunday, October 16, 2005 2:23 AM
To: mysql@lists.mysql.com
Subject: Re: Non-linear degradation in bulk loads?
Re: Non-linear degradation in bulk loads?
Jon, I am not 100 % sure that the problem we saw was on 64-bit Linux; it might have been 32-bit. Anyway, since CentOS is a clone of RHEL, this might be the same file cache phenomenon. I do not know whether one can force the file cache to stay smaller than 4 GB. You can try running some dummy programs that occupy a few GB of memory.

Regards, Heikki
Oracle/Innobase

----- Original Message -----
From: Jon Frisby [EMAIL PROTECTED]
To: Heikki Tuuri [EMAIL PROTECTED]; mysql@lists.mysql.com
Sent: Monday, October 17, 2005 8:49 PM
Subject: RE: Non-linear degradation in bulk loads?

Sorry to spam the group, but I just noticed that I asserted we were on a 2.4 kernel -- we're on 2.6.9-11 as part of CentOS 4.1...

-JF
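Heikki's "dummy program" suggestion -- pinning a few GB of anonymous memory so the kernel has less room for its file cache -- might be sketched as follows. This is an assumption-laden illustration, not something from the thread: the size argument is hypothetical, and whether this actually keeps the RHEL 4 file cache below 4 GB is untested.

```python
import sys

def occupy_memory(num_mb):
    """Allocate num_mb megabytes and touch every page so the kernel must
    back the allocation with real RAM, squeezing the OS file cache."""
    page = 4096
    buf = bytearray(num_mb * 1024 * 1024)
    for off in range(0, len(buf), page):
        buf[off] = 1  # touch one byte per page to force it resident
    return buf

if __name__ == "__main__":
    # Hypothetical usage: `python hog.py 3072` to pin ~3 GB.
    # In practice the process must keep running (e.g. sleep in a loop),
    # since the memory is released as soon as it exits.
    mb = int(sys.argv[1]) if len(sys.argv) > 1 else 16
    hog = occupy_memory(mb)
    print("holding %d MB of resident memory" % mb)
```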
RE: Non-linear degradation in bulk loads?
We only upgraded to CentOS 4.1 due to an emergency data center migration... We weren't prepared to undergo the risk of a 64-bit upgrade at the same time. I believe we're experimenting with a 64-bit kernel now as part of our efforts to diagnose and resolve the I/O issue.

-JF

-----Original Message-----
From: Heikki Tuuri [mailto:[EMAIL PROTECTED]]
Sent: Monday, October 17, 2005 10:52 AM
To: mysql@lists.mysql.com
Subject: Re: Non-linear degradation in bulk loads?

Jon, I do not know. Why not install a 64-bit Linux on your computer?

Regards, Heikki
Oracle/Innobase
RE: Non-linear degradation in bulk loads?
Side question: If I use a 64-bit MySQL build on a 64-bit kernel, is it safe and sane to allocate, say, 6GB to the InnoDB buffer pool? On an 8GB box with a 64-bit software stack, what is the optimum memory allocation for a pure-InnoDB (MyISAM used only for the grant tables) mysql server running as the sole application on the machine? -JF

-Original Message- From: Heikki Tuuri [mailto:[EMAIL PROTECTED] Sent: Monday, October 17, 2005 10:55 AM To: mysql@lists.mysql.com Subject: Re: Non-linear degradation in bulk loads?

Jon, I am not 100% sure that the problem we saw was on a 64-bit Linux. It might have been 32-bit. Anyway, since CentOS is a clone of RHEL, this might be the same file cache phenomenon. I do not know if one can force the file cache to stay smaller than 4 GB. You can try running some dummy programs that occupy a few GB of memory. Regards, Heikki Oracle/Innobase

- Original Message - From: Jon Frisby [EMAIL PROTECTED] To: Heikki Tuuri [EMAIL PROTECTED]; mysql@lists.mysql.com Sent: Monday, October 17, 2005 8:49 PM Subject: RE: Non-linear degradation in bulk loads?

Sorry to spam the group, but I just noticed that I asserted we were on a 2.4 kernel -- we're on 2.6.9-11 as part of CentOS 4.1... -JF

-Original Message- From: Heikki Tuuri [mailto:[EMAIL PROTECTED] Sent: Sunday, October 16, 2005 2:23 AM To: mysql@lists.mysql.com Subject: Re: Non-linear degradation in bulk loads?

Jon, your hardware/OS combination is quite new and unusual. It might indeed be an OS problem. We observed on 64-bit RHEL 4 that when the OS file cache grew bigger than 4 GB, file I/O performance dropped to one tenth of normal. You would not expect that kind of behavior from a 64-bit OS. When you see the slowdown, what does 'top' say about the OS file cache size?

The database is our main site's database, but we've dramatically reduced the load on that machine over the past couple of months through careful optimization of our code.
The box is a dual, dual-core Opteron with 8GB of RAM running a 32-bit Linux 2.4 kernel and MySQL 4.0.20 (32-bit, of course). We have 1GB allocated to the buffer pool, and our usual 1GB * 3 log files. 8 I/O threads. Load on the box sits at around 6-7, with a large (50%) amount of time spent in wait state, but actual disk throughput to our software RAID array (no longer on a SAN) is quite low: 6-9k blocks/s out, 1-6k blocks/s in. Regards, Heikki

- Original Message - From: Jon Frisby [EMAIL PROTECTED] Newsgroups: mailing.database.myodbc Sent: Friday, October 14, 2005 10:39 PM Subject: RE: Non-linear degradation in bulk loads?

I've tried tweaking the structure of the schema to have, for example, a PRIMARY KEY index on email, no other indexes, and then insert in sorted order -- it made no improvement whatsoever. Another clue that leads me to believe this may be an OS issue: starting a large cp on the same box (from a local filesystem other than the one the InnoDB data pool was on, to NFS) caused MySQL to become COMPLETELY backlogged (we went from ~15-20 connections at any given instant to 750, our max_connections setting). -JF

-Original Message- From: Heikki Tuuri [mailto:[EMAIL PROTECTED] Sent: Wednesday, October 12, 2005 8:15 AM To: mysql@lists.mysql.com Subject: Re: Non-linear degradation in bulk loads?

Jon, hmm... maybe one of the indexes inevitably is in a random order. Please post a typical SHOW INNODB STATUS\G when the inserts happen slowly. What is your my.cnf like? Regards, Heikki Innobase/Oracle

- Original Message - From: Jon Frisby [EMAIL PROTECTED] To: Heikki Tuuri [EMAIL PROTECTED]; mysql@lists.mysql.com Sent: Wednesday, October 12, 2005 3:08 AM Subject: RE: Non-linear degradation in bulk loads?

Two solutions: 1) sort the rows to be inserted on the key 'email' before inserting.
2) Or: http://dev.mysql.com/doc/mysql/en/innodb-tuning.html If you have UNIQUE constraints on secondary keys, starting from MySQL 3.23.52 and 4.0.3, you can speed up table imports by temporarily turning off the uniqueness checks during the import session: SET UNIQUE_CHECKS=0; For big tables, this saves a lot of disk I/O because InnoDB can use its insert buffer to write secondary index records in a batch. But make sure you do not have any duplicates in the rows!

After sending my mail, I discovered SET UNIQUE_CHECKS=0, and subsequent to that it also occurred to me to try inserting the data in sorted order
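Heikki's sizing guidance from earlier in the thread (50-80% of memory to the buffer pool on an InnoDB-only 64-bit server) would translate into a my.cnf along these lines. This is an illustrative sketch for the 8GB box under discussion; the specific numbers are assumptions, not values from Jon's actual configuration:

```ini
# Illustrative my.cnf fragment for an 8GB, InnoDB-only, 64-bit server.
# Values are assumptions based on the thread's 50-80% guidance,
# not Jon's real settings.
[mysqld]
innodb_buffer_pool_size   = 5G     # ~60% of 8GB, inside the 50-80% range
innodb_log_file_size      = 1G     # matches the "1GB * 3 log files" mentioned
innodb_log_files_in_group = 3
innodb_file_io_threads    = 8      # the "8 I/O threads" mentioned in the thread
```

On a 32-bit build, by contrast, the ~2-3GB per-process address space limit is what makes the 1GB buffer pool in Jon's setup a practical ceiling.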
Re: Non-linear degradation in bulk loads?
Jon, your hardware/OS combination is quite new and unusual. It might indeed be an OS problem. We observed on 64-bit RHEL 4 that when the OS file cache grew bigger than 4 GB, file I/O performance dropped to one tenth of normal. You would not expect that kind of behavior from a 64-bit OS. When you see the slowdown, what does 'top' say about the OS file cache size?

The database is our main site's database, but we've dramatically reduced the load on that machine over the past couple of months through careful optimization of our code. The box is a dual, dual-core Opteron with 8GB of RAM running a 32-bit Linux 2.4 kernel and MySQL 4.0.20 (32-bit of course). We have 1GB allocated to the buffer pool, and our usual 1GB * 3 log files. 8 I/O threads. Load on the box sits at around 6-7, with a large (50%) amount of time spent in wait state, but actual disk throughput to our software RAID array (no longer on a SAN) is quite low: 6-9k blocks/s out, 1-6k blocks/s in. Regards, Heikki

- Original Message - From: Jon Frisby [EMAIL PROTECTED] Newsgroups: mailing.database.myodbc Sent: Friday, October 14, 2005 10:39 PM Subject: RE: Non-linear degradation in bulk loads?

I've tried tweaking the structure of the schema to have, for example, a PRIMARY KEY index on email, no other indexes, and then insert in sorted order -- it made no improvement whatsoever. Another clue that leads me to believe this may be an OS issue: starting a large cp on the same box (from a local filesystem other than the one the InnoDB data pool was on, to NFS) caused MySQL to become COMPLETELY backlogged (we went from ~15-20 connections at any given instant to 750, our max_connections setting). -JF

-Original Message- From: Heikki Tuuri [mailto:[EMAIL PROTECTED] Sent: Wednesday, October 12, 2005 8:15 AM To: mysql@lists.mysql.com Subject: Re: Non-linear degradation in bulk loads?

Jon, hmm... maybe one of the indexes inevitably is in a random order.
Please post a typical SHOW INNODB STATUS\G when the inserts happen slowly. What is your my.cnf like? Regards, Heikki Innobase/Oracle

- Original Message - From: Jon Frisby [EMAIL PROTECTED] To: Heikki Tuuri [EMAIL PROTECTED]; mysql@lists.mysql.com Sent: Wednesday, October 12, 2005 3:08 AM Subject: RE: Non-linear degradation in bulk loads?

Two solutions: 1) sort the rows to be inserted on the key 'email' before inserting. 2) Or: http://dev.mysql.com/doc/mysql/en/innodb-tuning.html If you have UNIQUE constraints on secondary keys, starting from MySQL 3.23.52 and 4.0.3, you can speed up table imports by temporarily turning off the uniqueness checks during the import session: SET UNIQUE_CHECKS=0; For big tables, this saves a lot of disk I/O because InnoDB can use its insert buffer to write secondary index records in a batch. But make sure you do not have any duplicates in the rows!

After sending my mail, I discovered SET UNIQUE_CHECKS=0, and subsequent to that it also occurred to me to try inserting the data in sorted order. Unfortunately, UNIQUE_CHECKS=0 did not work, and even the combination of both did not work. The first chunk (3.4m rows) took ~1.5 minutes, the second ~5 minutes...

At this point I'm inclined to believe that there is something very wrong with the disk subsystem, because of this and other problems (doing a large cp from the data pool filesystem to another filesystem brought the database to a near-halt, among other things).

As a stop-gap solution, I created the table with no indexes, and loaded all the data (it loaded in linear time), and I plan on doing a CREATE UNIQUE INDEX on the table. Will this happen in linear time, or near-linear time?
*sigh* -JF

-- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED]
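The import-session recipe Heikki quotes from the manual can be sketched as a single session. The file path and chunk name below are hypothetical; the statements follow the manual's description for MySQL >= 3.23.52 / 4.0.3:

```sql
-- Sketch of the manual's UNIQUE_CHECKS recipe; file name is made up.
SET UNIQUE_CHECKS = 0;   -- skip uniqueness checks so InnoDB can batch
                         -- secondary-index writes through the insert
                         -- buffer; the input must be duplicate-free
LOAD DATA INFILE '/data/chunk01.txt' INTO TABLE test;
SET UNIQUE_CHECKS = 1;   -- restore checking for normal operation
```

The caveat in the quoted text is load-bearing: with checks off, duplicate keys in the input corrupt the unique index rather than being rejected.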
RE: Non-linear degradation in bulk loads?
I've tried tweaking the structure of the schema to have, for example, a PRIMARY KEY index on email, no other indexes, and then insert in sorted order -- made no improvement whatsoever. Another clue that leads me to believe that this may be an OS issue: Starting a large cp on the same box (from a local filesystem other than the one the InnoDB data pool was on, to NFS) caused MySQL to become COMPLETELY backlogged (we went from ~15-20 connections at any given instant to 750 (our max_connections setting)). -JF -Original Message- From: Heikki Tuuri [mailto:[EMAIL PROTECTED] Sent: Wednesday, October 12, 2005 8:15 AM To: mysql@lists.mysql.com Subject: Re: Non-linear degradation in bulk loads? Jon, hmm... maybe one of the indexes inevitably is in a random order. Please post a typical SHOW INNODB STATUS\G when the inserts happen slowly. What is your my.cnf like? Regards, Heikki Innobase/Oracle - Alkuperäinen viesti - Lähettäjä: Jon Frisby [EMAIL PROTECTED] Vastaanottaja: Heikki Tuuri [EMAIL PROTECTED]; mysql@lists.mysql.com Lähetetty: Wednesday, October 12, 2005 3:08 AM Aihe: RE: Non-linear degradation in bulk loads? Two solutions: 1) sort the rows to be inserted on the key 'email' before inserting. 2) Or: http://dev.mysql.com/doc/mysql/en/innodb-tuning.html If you have UNIQUE constraints on secondary keys, starting from MySQL 3.23.52 and 4.0.3, you can speed up table imports by temporarily turning off the uniqueness checks during the import session: SET UNIQUE_CHECKS=0; For big tables, this saves a lot of disk I/O because InnoDB can use its insert buffer to write secondary index records in a batch. But make sure you do not have any duplicates in the rows! After sending my mail, I discovered SET UNIQUE_CHECKS=0, and subsequent to that it also occurred to me to try putting the data in in sorted order. Unfortunately, doing UNIQUE_CHECKS=0 did not work, and even the combination of both did not work. First chunk (3.4m rows) was ~1.5 minutes, second was ~5 minutes... 
At this point I'm inclined to believe that there is something very wrong with the disk subsystem, because of this and other problems (doing a large cp from the data pool filesystem to another filesystem brought the database to a near-halt, among other things). As a stop-gap solution, I created the table with no indexes, and loaded all the data (it loaded in linear time), and I plan on doing a CREATE UNIQUE INDEX on the table. Will this happen in linear time, or near-linear time? *sigh* -JF
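Jon's stop-gap (load into an index-free table, build the indexes afterwards) might look like the following against the thread's `test` schema; the file name and the decision to build both keys afterwards are assumptions:

```sql
-- Load first, index later: the table starts with no keys at all.
CREATE TABLE test (
  email varchar(255) NOT NULL default '',
  when_happened datetime NOT NULL default '0000-00-00 00:00:00'
) TYPE=InnoDB;

LOAD DATA INFILE '/data/chunk01.txt' INTO TABLE test;  -- repeat per chunk

-- Build the keys only once all rows are in place.
CREATE UNIQUE INDEX email ON test (email);
CREATE INDEX when_happened ON test (when_happened);
```

On MySQL 4.0, CREATE INDEX is mapped to an ALTER TABLE that rebuilds the table, so the index build is a sort-and-rewrite pass rather than millions of random B-tree insertions, which is why it tends to behave close to linearly.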
Re: Non-linear degradation in bulk loads?
Jon, hmm... maybe one of the indexes inevitably is in a random order. Please post a typical SHOW INNODB STATUS\G when the inserts happen slowly. What is your my.cnf like? Regards, Heikki Innobase/Oracle

- Original Message - From: Jon Frisby [EMAIL PROTECTED] To: Heikki Tuuri [EMAIL PROTECTED]; mysql@lists.mysql.com Sent: Wednesday, October 12, 2005 3:08 AM Subject: RE: Non-linear degradation in bulk loads?

Two solutions: 1) sort the rows to be inserted on the key 'email' before inserting. 2) Or: http://dev.mysql.com/doc/mysql/en/innodb-tuning.html If you have UNIQUE constraints on secondary keys, starting from MySQL 3.23.52 and 4.0.3, you can speed up table imports by temporarily turning off the uniqueness checks during the import session: SET UNIQUE_CHECKS=0; For big tables, this saves a lot of disk I/O because InnoDB can use its insert buffer to write secondary index records in a batch. But make sure you do not have any duplicates in the rows!

After sending my mail, I discovered SET UNIQUE_CHECKS=0, and subsequent to that it also occurred to me to try inserting the data in sorted order. Unfortunately, UNIQUE_CHECKS=0 did not work, and even the combination of both did not work. The first chunk (3.4m rows) took ~1.5 minutes, the second ~5 minutes...

At this point I'm inclined to believe that there is something very wrong with the disk subsystem, because of this and other problems (doing a large cp from the data pool filesystem to another filesystem brought the database to a near-halt, among other things). As a stop-gap solution, I created the table with no indexes, and loaded all the data (it loaded in linear time), and I plan on doing a CREATE UNIQUE INDEX on the table. Will this happen in linear time, or near-linear time? *sigh* -JF
Re: Non-linear degradation in bulk loads?
Jon Frisby wrote: Everyone, we're trying to do some bulk data loads on several different tables (on several different machines, using several different techniques) and seeing dramatically worse-than-linear performance. We've tried the bulk-INSERT syntax and the LOAD DATA INFILE syntax. We've done ALTER TABLE ... DISABLE KEYS, SET FOREIGN_KEY_CHECKS=0 (where appropriate), and so forth. The one of most immediate concern is a table of the form:

CREATE TABLE `test` (
  `email` varchar(255) NOT NULL default '',
  `when_happened` datetime NOT NULL default '0000-00-00 00:00:00',
  UNIQUE KEY `email` (`email`),
  KEY `when_happened` (`when_happened`)
) TYPE=InnoDB;

I'm loading data using LOAD DATA INFILE with chunks containing 3.4m rows each (~135MB files). The first chunk was very quick (about 1.5 minutes), but the tenth chunk has taken 22.6 hours and is still going. (It's been getting progressively slower with each chunk...)

The database is our main site's database, but we've dramatically reduced the load on that machine over the past couple of months through careful optimization of our code. The box is a dual, dual-core Opteron with 8GB of RAM running a 32-bit Linux 2.4 kernel and MySQL 4.0.20 (32-bit of course). We have 1GB allocated to the buffer pool, and our usual 1GB * 3 log files. 8 I/O threads. Load on the box sits at around 6-7, with a large (50%) amount of time spent in wait state, but actual disk throughput to our software RAID array (no longer on a SAN) is quite low: 6-9k blocks/s out, 1-6k blocks/s in.

Something *has* to be wrong here, but we're not sure what we've missed. We've restored larger data sets from a mysqldump in the past in dramatically less time on far inferior hardware.
(A superset of this same data, to a schema which is also a superset, PLUS a bunch of other rather large tables -- all in ~8 hours on a 3Ware RAID array on a dual Xeon w/ 4GB of RAM.) We're inclined to believe that this is a configuration problem, as opposed to a driver or hardware problem, given the non-linear nature of the performance degradation. This implies we're doing something truly stupid with our loads. What could cause this kind of strangeness? -JF

Hi Jon, I experienced this same non-linear degradation during large imports, exactly like you are describing, about 18 months ago. I don't remember if I found a specific cause, but I am fairly certain that it was related to a few issues, and that we did resolve it. I have not seen this happen with MyISAM tables, and we were able to import our full data by breaking it into chunks and waiting between each chunk, so I believe it to be related to InnoDB's logs in some way. Since you are already importing your data in chunks, try making each chunk a separate transaction, or waiting until disk activity slows before loading the next chunk. Also, there have been major improvements to InnoDB in the later 4.1 releases, so if possible, I would suggest upgrading. Hope that helps! Devananda vdv
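Devananda's suggestion of making each chunk its own transaction could be sketched as below. The chunk file names are hypothetical, and the explicit autocommit toggle is an assumption about how the session would be set up:

```sql
-- One explicit transaction per chunk; file names are made up.
SET AUTOCOMMIT = 0;

BEGIN;
LOAD DATA INFILE '/data/chunk01.txt' INTO TABLE test;
COMMIT;   -- pause here until disk activity settles, per the advice

BEGIN;
LOAD DATA INFILE '/data/chunk02.txt' INTO TABLE test;
COMMIT;
```

The point of the pause between COMMITs is to let InnoDB flush dirty pages and recycle log space before the next burst of inserts arrives.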
Re: Non-linear degradation in bulk loads?
Jon, my guess is that the inserts into the UNIQUE secondary index cause the workload to be seriously disk-bound. Two solutions: 1) sort the rows to be inserted on the key 'email' before inserting. 2) Or: http://dev.mysql.com/doc/mysql/en/innodb-tuning.html If you have UNIQUE constraints on secondary keys, starting from MySQL 3.23.52 and 4.0.3, you can speed up table imports by temporarily turning off the uniqueness checks during the import session: SET UNIQUE_CHECKS=0; For big tables, this saves a lot of disk I/O because InnoDB can use its insert buffer to write secondary index records in a batch. But make sure you do not have any duplicates in the rows!

Note that now you can get support on this MySQL mailing list from a Vice President of Oracle. I hope that the level of support improves. Best regards, Heikki Vice President, server technology Oracle/Innobase Oy

... Everyone, we're trying to do some bulk data loads on several different tables (on several different machines, using several different techniques) and seeing dramatically worse-than-linear performance. We've tried the bulk-INSERT syntax and the LOAD DATA INFILE syntax. We've done ALTER TABLE ... DISABLE KEYS, SET FOREIGN_KEY_CHECKS=0 (where appropriate), and so forth. The one of most immediate concern is a table of the form:

CREATE TABLE `test` (
  `email` varchar(255) NOT NULL default '',
  `when_happened` datetime NOT NULL default '0000-00-00 00:00:00',
  UNIQUE KEY `email` (`email`),
  KEY `when_happened` (`when_happened`)
) TYPE=InnoDB;

I'm loading data using LOAD DATA INFILE with chunks containing 3.4m rows each (~135MB files). The first chunk was very quick (about 1.5 minutes), but the tenth chunk has taken 22.6 hours and is still going. (It's been getting progressively slower with each chunk...) The database is our main site's database, but we've dramatically reduced the load on that machine over the past couple of months through careful optimization of our code.
The box is a dual, dual-core Opteron with 8GB of RAM running a 32-bit Linux 2.4 kernel and MySQL 4.0.20 (32-bit of course). We have 1GB allocated to the buffer pool, and our usual 1GB * 3 log files. 8 I/O threads. Load on the box sits at around 6-7, with a large (50%) amount of time spent in wait state, but actual disk throughput to our software RAID array (no longer on a SAN) is quite low: 6-9k blocks/s out, 1-6k blocks/s in. Something *has* to be wrong here, but we're not sure what we've missed. We've restored larger data sets from a mysqldump in the past in dramatically less time on far inferior hardware. (A superset of this same data, to a schema which is also a superset, PLUS a bunch of other rather large tables -- all in ~8 hours on a 3Ware RAID array on a dual Xeon w/ 4GB of RAM.) We're inclined to believe that this is a configuration problem, as opposed to a driver or hardware problem, given the non-linear nature of the performance degradation. This implies we're doing something truly stupid with our loads. What could cause this kind of strangeness? -JF
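Heikki's solution 1 (insert in key order) can be approximated entirely inside MySQL with an index-free staging table, so rows reach `test` already sorted on `email`. The staging table name and file path below are made up for illustration:

```sql
-- Staging table with no indexes: loading it is cheap and order-free.
CREATE TABLE test_staging (
  email varchar(255) NOT NULL default '',
  when_happened datetime NOT NULL default '0000-00-00 00:00:00'
) TYPE=InnoDB;

LOAD DATA INFILE '/data/chunk01.txt' INTO TABLE test_staging;

-- Feed the real table in key order, so the UNIQUE index on email
-- grows by appending to the rightmost leaf instead of by random
-- insertions scattered across the B-tree.
INSERT INTO test SELECT * FROM test_staging ORDER BY email;

DROP TABLE test_staging;
```

Sorting the input files externally (e.g. with the Unix sort utility) before LOAD DATA would achieve the same effect without the intermediate table.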
Re: Non-linear degradation in bulk loads?
At 11:41 PM 10/10/2005, you wrote: Hi Jon, Well, maybe the next suggestions might help you. DISABLE KEYS applies to non-unique keys only, so I suggest you focus on your unique Email key. You could do some tests with: a. drop the unique key on Email; b. load the various bulks; c. after loading, define Email, e.g., as a (unique) Primary Key. After b. you can also do ALTER TABLE ... ORDER BY Email, to bring the Email index in step with the physical order of the data. Hope it helps... Best wishes, Cor

- Original Message - From: Jon Frisby [EMAIL PROTECTED] To: mysql@lists.mysql.com Sent: Tuesday, October 11, 2005 4:12 AM Subject: Non-linear degradation in bulk loads?

Everyone, we're trying to do some bulk data loads on several different tables (on several different machines, using several different techniques) and seeing dramatically worse-than-linear performance. We've tried the bulk-INSERT syntax and the LOAD DATA INFILE syntax. We've done ALTER TABLE ... DISABLE KEYS, SET FOREIGN_KEY_CHECKS=0 (where appropriate), and so forth. The one of most immediate concern is a table of the form:

CREATE TABLE `test` (
  `email` varchar(255) NOT NULL default '',
  `when_happened` datetime NOT NULL default '0000-00-00 00:00:00',
  UNIQUE KEY `email` (`email`),
  KEY `when_happened` (`when_happened`)
) TYPE=InnoDB;

I'm loading data using LOAD DATA INFILE with chunks containing 3.4m rows each (~135MB files). The first chunk was very quick (about 1.5 minutes), but the tenth chunk has taken 22.6 hours and is still going. (It's been getting progressively slower with each chunk...) The database is our main site's database, but we've dramatically reduced the load on that machine over the past couple of months through careful optimization of our code. The box is a dual, dual-core Opteron with 8GB of RAM running a 32-bit Linux 2.4 kernel and MySQL 4.0.20 (32-bit of course). We have 1GB allocated to the buffer pool, and our usual 1GB * 3 log files. 8 I/O threads.
Load on the box sits at around 6-7, with a large (50%) amount of time spent in wait state, but actual disk throughput to our software RAID array (no longer on a SAN) is quite low: 6-9k blocks/s out, 1-6k blocks/s in. Something *has* to be wrong here, but we're not sure what we've missed. We've restored larger data sets from a mysqldump in the past in dramatically less time on far inferior hardware. (A superset of this same data, to a schema which is also a superset, PLUS a bunch of other rather large tables -- all in ~8 hours on a 3Ware RAID array on a dual Xeon w/ 4GB of RAM.) We're inclined to believe that this is a configuration problem, as opposed to a driver or hardware problem, given the non-linear nature of the performance degradation. This implies we're doing something truly stupid with our loads. What could cause this kind of strangeness? -JF

JF, it's likely the indexes that are causing the slowdown. If you remove all indexes from the table definition and start with an empty table, you should see a dramatic speed increase. When all the data has been loaded, use one ALTER TABLE command to rebuild all the indexes. You could try to optimize the table after each load to see if that speeds things up (this will rebalance the index distribution). I was able to load 100 million rows relatively fast, but failed miserably at 500 million rows because of insufficient memory. Mike

P.S. I suppose you already know that when you use LOAD DATA to load data into an empty table, it won't update the indexes until the load has completed. This is why loading data into an empty table is much faster than loading data into a table that has rows in it (even if it is only 1 row).
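Mike's advice, rebuilding every index with a single ALTER TABLE after the load finishes, would look roughly like this against the thread's schema (index names assumed to match the original CREATE TABLE):

```sql
-- After loading into an index-free table, add all keys in one
-- ALTER TABLE so the table is rebuilt only once, not once per index.
ALTER TABLE test
  ADD UNIQUE KEY email (email),
  ADD KEY when_happened (when_happened);

-- Optionally, between chunk loads, as Mike suggests:
OPTIMIZE TABLE test;
```

Combining both additions in one statement matters because each ALTER TABLE on this MySQL version copies the whole table; two separate statements would pay that cost twice.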
RE: Non-linear degradation in bulk loads?
Two solutions: 1) sort the rows to be inserted on the key 'email' before inserting. 2) Or: http://dev.mysql.com/doc/mysql/en/innodb-tuning.html If you have UNIQUE constraints on secondary keys, starting from MySQL 3.23.52 and 4.0.3, you can speed up table imports by temporarily turning off the uniqueness checks during the import session: SET UNIQUE_CHECKS=0; For big tables, this saves a lot of disk I/O because InnoDB can use its insert buffer to write secondary index records in a batch. But make sure you do not have any duplicates in the rows!

After sending my mail, I discovered SET UNIQUE_CHECKS=0, and subsequent to that it also occurred to me to try inserting the data in sorted order. Unfortunately, UNIQUE_CHECKS=0 did not work, and even the combination of both did not work. The first chunk (3.4m rows) took ~1.5 minutes, the second ~5 minutes... At this point I'm inclined to believe that there is something very wrong with the disk subsystem, because of this and other problems (doing a large cp from the data pool filesystem to another filesystem brought the database to a near-halt, among other things). As a stop-gap solution, I created the table with no indexes, and loaded all the data (it loaded in linear time), and I plan on doing a CREATE UNIQUE INDEX on the table. Will this happen in linear time, or near-linear time? *sigh* -JF
Re: Non-linear degradation in bulk loads?
Not sure, but given that you suffer from non-linear degradation in performance, my guess is you might be extending your ibdata file too frequently during the batch load process. Check the innodb_data_file_path variable in my.cnf for more details. Cheers, Manoj

On 10/11/05, Jon Frisby [EMAIL PROTECTED] wrote: Everyone, we're trying to do some bulk data loads on several different tables (on several different machines, using several different techniques) and seeing dramatically worse-than-linear performance. We've tried the bulk-INSERT syntax and the LOAD DATA INFILE syntax. We've done ALTER TABLE ... DISABLE KEYS, SET FOREIGN_KEY_CHECKS=0 (where appropriate), and so forth. The one of most immediate concern is a table of the form:

CREATE TABLE `test` (
  `email` varchar(255) NOT NULL default '',
  `when_happened` datetime NOT NULL default '0000-00-00 00:00:00',
  UNIQUE KEY `email` (`email`),
  KEY `when_happened` (`when_happened`)
) TYPE=InnoDB;

I'm loading data using LOAD DATA INFILE with chunks containing 3.4m rows each (~135MB files). The first chunk was very quick (about 1.5 minutes), but the tenth chunk has taken 22.6 hours and is still going. (It's been getting progressively slower with each chunk...) The database is our main site's database, but we've dramatically reduced the load on that machine over the past couple of months through careful optimization of our code. The box is a dual, dual-core Opteron with 8GB of RAM running a 32-bit Linux 2.4 kernel and MySQL 4.0.20 (32-bit of course). We have 1GB allocated to the buffer pool, and our usual 1GB * 3 log files. 8 I/O threads. Load on the box sits at around 6-7, with a large (50%) amount of time spent in wait state, but actual disk throughput to our software RAID array (no longer on a SAN) is quite low: 6-9k blocks/s out, 1-6k blocks/s in. Something *has* to be wrong here, but we're not sure what we've missed.
We've restored larger data sets from a mysqldump in the past in dramatically less time on far inferior hardware. (A superset of this same data, to a schema which is also a superset, PLUS a bunch of other rather large tables -- all in ~8 hours on a 3Ware RAID array on a dual Xeon w/ 4GB of RAM.) We're inclined to believe that this is a configuration problem, as opposed to a driver or hardware problem, given the non-linear nature of the performance degradation. This implies we're doing something truly stupid with our loads. What could cause this kind of strangeness? -JF
RE: Non-linear degradation in bulk loads?
Manoj, thanks for the reply! Unfortunately, I failed to note that we don't have an auto-extending data pool. Our data pool is about 212GB, in 4GB chunks, with about 4.5GB free right now. We extend it explicitly, and monitor to make sure it doesn't fill up. -JF

-Original Message- From: Manoj [mailto:[EMAIL PROTECTED] Sent: Monday, October 10, 2005 8:51 PM To: Jon Frisby Cc: mysql@lists.mysql.com Subject: Re: Non-linear degradation in bulk loads?

Not sure, but given that you suffer from non-linear degradation in performance, my guess is you might be extending your ibdata file too frequently during the batch load process. Check the innodb_data_file_path variable in my.cnf for more details. Cheers, Manoj

On 10/11/05, Jon Frisby [EMAIL PROTECTED] wrote: Everyone, we're trying to do some bulk data loads on several different tables (on several different machines, using several different techniques) and seeing dramatically worse-than-linear performance. We've tried the bulk-INSERT syntax and the LOAD DATA INFILE syntax. We've done ALTER TABLE ... DISABLE KEYS, SET FOREIGN_KEY_CHECKS=0 (where appropriate), and so forth. The one of most immediate concern is a table of the form:

CREATE TABLE `test` (
  `email` varchar(255) NOT NULL default '',
  `when_happened` datetime NOT NULL default '0000-00-00 00:00:00',
  UNIQUE KEY `email` (`email`),
  KEY `when_happened` (`when_happened`)
) TYPE=InnoDB;

I'm loading data using LOAD DATA INFILE with chunks containing 3.4m rows each (~135MB files). The first chunk was very quick (about 1.5 minutes), but the tenth chunk has taken 22.6 hours and is still going. (It's been getting progressively slower with each chunk...) The database is our main site's database, but we've dramatically reduced the load on that machine over the past couple of months through careful optimization of our code. The box is a dual, dual-core Opteron with 8GB of RAM running a 32-bit Linux 2.4 kernel and MySQL 4.0.20 (32-bit of course).
We have 1GB allocated to the buffer pool, and our usual 1GB * 3 log files. 8 I/O threads. Load on the box sits at around 6-7, with a large (50%) amount of time spent in wait state, but actual disk throughput to our software RAID array (no longer on a SAN) is quite low: 6-9k blocks/s out, 1-6k blocks/s in. Something *has* to be wrong here, but we're not sure what we've missed. We've restored larger data sets from a mysqldump in the past in dramatically less time on far inferior hardware. (A superset of this same data, to a schema which is also a superset, PLUS a bunch of other rather large tables -- all in ~8 hours on a 3Ware RAID array on a dual Xeon w/ 4GB of RAM.) We're inclined to believe that this is a configuration problem, as opposed to a driver or hardware problem, given the non-linear nature of the performance degradation. This implies we're doing something truly stupid with our loads. What could cause this kind of strangeness? -JF
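An explicitly extended (non-autoextending) data pool like the one Jon describes is declared through innodb_data_file_path. The sketch below shows only the shape of such a setting, with made-up file names and just three files, not Jon's actual 212GB layout:

```ini
# Fixed-size InnoDB data files, extended by hand when space runs low.
# File names and count are illustrative; Jon's pool is ~53 such 4GB files.
[mysqld]
innodb_data_file_path = ibdata1:4096M;ibdata2:4096M;ibdata3:4096M
```

Appending `:autoextend` to the last file is what would make the pool grow automatically; its absence here is the point, since Manoj's theory was that repeated auto-extension during the load was causing the stalls.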
Re: Non-linear degradation in bulk loads?
Hi Jon, Well, maybe the next suggestions might help you. DISABLE KEYS applies to non-unique keys only, so I suggest you focus on your unique Email key. You could do some tests with: a. drop the unique key on Email; b. load the various bulks; c. after loading, define Email, e.g., as a (unique) Primary Key. After b. you can also do ALTER TABLE ... ORDER BY Email, to bring the Email index in step with the physical order of the data. Hope it helps... Best wishes, Cor

- Original Message - From: Jon Frisby [EMAIL PROTECTED] To: mysql@lists.mysql.com Sent: Tuesday, October 11, 2005 4:12 AM Subject: Non-linear degradation in bulk loads?

Everyone, we're trying to do some bulk data loads on several different tables (on several different machines, using several different techniques) and seeing dramatically worse-than-linear performance. We've tried the bulk-INSERT syntax and the LOAD DATA INFILE syntax. We've done ALTER TABLE ... DISABLE KEYS, SET FOREIGN_KEY_CHECKS=0 (where appropriate), and so forth. The one of most immediate concern is a table of the form:

CREATE TABLE `test` (
  `email` varchar(255) NOT NULL default '',
  `when_happened` datetime NOT NULL default '0000-00-00 00:00:00',
  UNIQUE KEY `email` (`email`),
  KEY `when_happened` (`when_happened`)
) TYPE=InnoDB;

I'm loading data using LOAD DATA INFILE with chunks containing 3.4m rows each (~135MB files). The first chunk was very quick (about 1.5 minutes), but the tenth chunk has taken 22.6 hours and is still going. (It's been getting progressively slower with each chunk...) The database is our main site's database, but we've dramatically reduced the load on that machine over the past couple of months through careful optimization of our code. The box is a dual, dual-core Opteron with 8GB of RAM running a 32-bit Linux 2.4 kernel and MySQL 4.0.20 (32-bit of course). We have 1GB allocated to the buffer pool, and our usual 1GB * 3 log files. 8 I/O threads.
Load on the box sits at around 6-7, with a large (50%) amount of time spent in wait state, but actual disk throughput to our software RAID array (no longer on a SAN) is quite low: 6-9k blocks/s out, 1-6k blocks/s in. Something *has* to be wrong here, but we're not sure what we've missed. We've restored larger data sets from a mysqldump in the past in dramatically less time on far inferior hardware. (A superset of this same data, to a schema which is also a superset, PLUS a bunch of other rather large tables -- all in ~8 hours on a 3Ware RAID array on a dual Xeon w/ 4GB of RAM.) We're inclined to believe that this is a configuration problem, as opposed to a driver or hardware problem, given the non-linear nature of the performance degradation. This implies we're doing something truly stupid with our loads. What could cause this kind of strangeness? -JF
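Cor's test sequence (drop the unique key, load the bulks, reorder, then re-add the key) could be sketched as below against the thread's schema. Note that for InnoDB the ORDER BY step is of doubtful lasting value, since InnoDB stores rows in clustered-index order regardless; the sketch simply transcribes the suggestion as given:

```sql
-- a. drop the unique key before the bulk loads
ALTER TABLE test DROP INDEX email;

-- b. load the various bulks (file name hypothetical; repeat per chunk)
LOAD DATA INFILE '/data/chunk01.txt' INTO TABLE test;

-- optional: rewrite the table with rows physically ordered by email
ALTER TABLE test ORDER BY email;

-- c. re-add the unique key over the now-loaded data
ALTER TABLE test ADD UNIQUE KEY email (email);
```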