Re: Frequently corrupt tables

Bill Adams Thu, 18 Oct 2001 12:12:11 -0700

Kyle Hayes wrote:

> > I found yesterday (at the advice of this list) that adding an occasional
> > call to "FLUSH TABLES" fixed my corruption problems.  I would do that right
> > before the disconnect or program exit.
>
> What kernel are you using?  Some of the 2.4 series have... odd... behavior
> with regards to caching.


Linux host 2.2.19 #6 SMP Wed Jul 11 10:55:03 PDT 2001 i686 unknown
2GB Memory, 4 CPUs.
(It happened on other systems with different kernel versions too.)

> Are you using SCSI or IDE?  We've run many tests with both and not had any
> corruption problems unless we did something whacked like pull the power for
> the machine while it was running the test.

SCSI.  (Had problem with different controllers on different systems)

Three dual channel controllers, all the same:

[bill@host ~/dev]$ cat /proc/scsi/aic7xxx/0
Adaptec AIC7xxx driver version: 5.1.33/3.2.4
Compile Options:
  TCQ Enabled By Default : Disabled
  AIC7XXX_PROC_STATS     : Disabled
  AIC7XXX_RESET_DELAY    : 5

Adapter Configuration:
           SCSI Adapter: Adaptec AIC-7899 Ultra 160/m SCSI host adapter
                           Ultra-160/m LVD/SE Wide Controller Channel A at PCI 2/6/0
    PCI MMAPed I/O Base: 0xf9dfa000
 Adapter SEEPROM Config: SEEPROM found and used.
      Adaptec SCSI BIOS: Disabled
                    IRQ: 21
                   SCBs: Active 0, Max Active 1,
                         Allocated 15, HW 32, Page 255
             Interrupts: 36738969
      BIOS Control Word: 0xb8f8
   Adapter Control Word: 0x7c5d
   Extended Translation: Enabled
Disconnect Enable Flags: 0xffff
     Ultra Enable Flags: 0x0000
 Tag Queue Enable Flags: 0x0000
Ordered Queue Tag Flags: 0x0000
Default Tag Queue Depth: 8
    Tagged Queue By Device array for aic7xxx host instance 0:
      {255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255}
    Actual queue depth per device for aic7xxx host instance 0:
      {1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1}

Statistics:

(scsi0:0:0:0)
  Device using Wide/Sync transfers at 80.0 MByte/sec, offset 31
  Transinfo settings: current(10/31/1/0), goal(10/127/1/0), user(9/127/1/2)
  Total transfers 36738885 (18761976 reads and 17976909 writes)


> What filesystem are you running?

ext2.  At least that is what Linux sees.  The disks are actually hardware RAID 0
Winchester flashdisks.


> Just running FLUSH TABLES sounds like it is only going to make the problem
> less common, not fix it.  Something is corrupting your indexes/data.

I loaded three big tables last night with no problems (after adding the
occasional $dbh->do( "FLUSH TABLES" )).  Before that, corruption would happen at
least once during any large (re)load of data.
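
A minimal sketch of the periodic-flush pattern described above, written in
Python rather than the Perl DBI the loader actually uses.  The function name
load_rows, the flush_every interval, and the DB-API style cursor are all
assumptions for illustration; the post does not say how often the real loader
calls FLUSH TABLES.

```python
def load_rows(cursor, insert_sql, rows, flush_every=10000):
    """Insert rows one by one, issuing FLUSH TABLES every `flush_every` rows.

    `cursor` is any object with a DB-API style execute(sql, params) method.
    `flush_every` is a hypothetical interval, not a value from the post.
    Returns the number of FLUSH TABLES statements issued.
    """
    flushes = 0
    for count, row in enumerate(rows, start=1):
        cursor.execute(insert_sql, row)
        if count % flush_every == 0:
            # The "occasional" flush in the middle of a large load.
            cursor.execute("FLUSH TABLES")
            flushes += 1
    # One more flush right before the disconnect or program exit.
    cursor.execute("FLUSH TABLES")
    flushes += 1
    return flushes
```

This only shows where the flushes go.  As the quoted reply points out, it may
make the corruption less frequent without removing the underlying cause.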



> Is the data getting mangled or the index?  If myisamchk can fix the problem,

That is the funny thing: I had to do a mysqldump > file; mysql < file to fix the
table.  myisamchk would report the table was bad, and I would try to repair it
with -o (and just about every other level).  Then myisamchk would report it was
good (even with -e).  When I continued to load the data, it would quickly become
corrupted again.  Even rebuilding all of the indexes would not fix it.  Dumping
with mysqldump and reloading with mysql fixed it much better.
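
The dump-and-reload round trip above can be sketched as follows, again in
Python for illustration.  The function name dump_and_reload and its parameters
are hypothetical, and the post does not give the exact mysqldump options used,
so none are added here.  Whether the reload replaces the existing table depends
on those options.

```python
import subprocess


def dump_and_reload(database, table, dump_path, run=subprocess.run):
    """Dump one table to a file with mysqldump, then feed the file to mysql.

    Mirrors `mysqldump > file; mysql < file` from the post.  `run` is
    injectable so the command lines can be inspected without a server.
    """
    # Step 1: mysqldump <database> <table> > dump_path
    with open(dump_path, "wb") as out:
        run(["mysqldump", database, table], stdout=out, check=True)
    # Step 2: mysql <database> < dump_path
    with open(dump_path, "rb") as src:
        run(["mysql", database], stdin=src, check=True)
```

With check=True either step raises if the command exits non-zero, so a failed
dump is never reloaded.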


>
> it is likely that the index is the problem.  MySQL will cache the index in
> memory, but not the data.  Thus, if you see data mangling problems and
> possibly index problems, I would look at the kernel, disk etc.  If you are
> only seeing index problems, but the data looks OK, then the version of MySQL
> might be a problem or maybe you have a bad build.  MySQL builds more cleanly

It happened with 3.23.41.


> than most OSS projects, but it is a big complex beastie and can build
> incorrectly without obvious errors sometimes in our experience.  Bad library
> versions can also be a factor.

I did build/run this on a RH6.2 system.


> We've run tests with 1000 hits per second on a database on a cheesy IDE drive
> without a problem.  We've run those tests for hours at a time with no
> problems.  SCSI definitely works better than IDE, but the newer IDE drives
> aren't that bad anymore.  They still use a lot of CPU.

It is not the selects that cause the problems; it is lots of inserts.  Again, it
only seems to happen on large loads.  I have three main tables, and a large load
means:
mysql> select count(*) from pcm_test_header_200109;
+----------+
| count(*) |
+----------+
|     5844 |
+----------+
1 row in set (0.07 sec)

mysql> select count(*) from pcm_test_summary_200109;
+----------+
| count(*) |
+----------+
|   840413 |
+----------+
1 row in set (0.04 sec)

mysql> select count(*) from pcm_test_site_200109;
+----------+
| count(*) |
+----------+
|  7248366 |
+----------+
1 row in set (0.02 sec)

mysql>

Any of the three tables can have problems but it is usually the site table.



> If your drives do write caching, that can be a problem if you have a power
> drop.  Most IDE drives (all?) will cache writes to allow the disk firmware to

This is not a power or crash problem.  It happens WHILE the loader is running.

It could be a DBI/DBD bug.  I [try to] insert all of the above records with a
single database handle (connection).

b.




