Re: InnoDB corrupt after power failure

2012-10-04 Thread Andrew Miklas
Hi Manuel,

Thanks for the fast reply.

On Oct 4, 2012, at 12:05 AM, Manuel Arostegui wrote:
snip
 it shouldn't be a biggie if you have a BBU. Do you guys use HW RAID + BBU?

We've checked with our hosting provider, and the database was indeed stored on 
a BBU RAID.

 What's your innodb_flush_log_at_trx_commit value?

mysql show variables like 'innodb_flush_log_at_trx_commit'\G
*** 1. row ***
Variable_name: innodb_flush_log_at_trx_commit
Value: 1
1 row in set (0.00 sec)


 Have you tried playing with innodb_force_recovery option to try to get the 
 server started at least? That way you might be able to identify which 
 table(s) is/are the corrupted one and the one(s) preventing the whole server 
 from booting up. 

As the affected machine was just a read only slave, it was easier for me to get 
things back into service by just reloading off the master.  Unfortunately, I 
didn't think to keep the corrupted ibd files for later debugging.

At this point, I'm more trying to figure out if there's something wrong with 
the DB or host config.  There was effectively no data loss, but I'm worried we 
might have data loss or availability issues if this error crops up on our 
master server.


Thanks,


Andrew


--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/mysql



Re: InnoDB corrupt after power failure

2012-10-04 Thread Andrew Miklas
Hi Rick,

On Oct 4, 2012, at 2:40 PM, Rick James wrote:

 I hope you turned OFF caching on the drives, themselves.  The BBU should be 
 the single place that caches and is trusted to survive a power outage.

The DB server in question is running in a virtualized environment, so the array 
shows up as a SCSI device inside our VM.  I can't use hdparm to directly check 
whether the disks are doing write caching, but our hosting provider assures us 
that once data is sent to the virtual SCSI device from inside the VM, it will 
be persisted to disk even if there's a power failure.

I'm a bit suspicious of a recent change we did to switch our ext3 journals from 
data=ordered to data=writeback.  The ext3 docs say a crash+recovery can cause 
incorrect data to appear in files which were written shortly before the crash 
[1].  As a result, if a tablespace were extended just before the power failure, 
it might be possible that when MySQL restarts, it will see random data at the 
end of the tablespace.  It seems like this could happen even if the disks are 
BBU / not write caching, because the increase of the ibd's file size in the 
inode and the zeroing out of the new blocks assigned to the file are not atomic 
with respect to one another.

Is the InnoDB recovery process OK with this scenario?  Has anyone else seen 
corruption problems with data=writeback?


-- Andrew


[1] http://lxr.linux.no/linux+v3.5.2/Documentation/filesystems/ext3.txt
--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/mysql



InnoDB corrupt after power failure

2012-10-03 Thread Andrew Miklas
Hi guys,

I recently had a data corruption issue with InnoDB.  MySQL was shut down 
improperly (power failure), and when the system came back up, MySQL refused to 
start.  On inspection of the logs (see below), it looks like the tablespace 
became seriously corrupted.  In the end, I had to rebuild the slave using 
mysqldump.

I'm curious what happened here, since I thought InnoDB wasn't supposed to 
become corrupted on an improper shutdown.  One possibility that we were 
exploring was that the filesystem journal setting was incorrect.  We were using 
ext3 with the journal set to writeback mode.  Is this a known bad config with 
InnoDB?


Thanks for any help,


Andrew


---

MySQL server version: Server version: 5.5.27-1~ppa1~lucid-log (Ubuntu)
(Running on Ubuntu 10.04.2 LTS)

120831 20:56:01 InnoDB: The InnoDB memory heap is disabled
120831 20:56:01 InnoDB: Mutexes and rw_locks use GCC atomic builtins
120831 20:56:01 InnoDB: Compressed tables use zlib 1.2.3.3
120831 20:56:02 InnoDB: Initializing buffer pool, size = 5.0G
120831 20:56:03 InnoDB: Completed initialization of buffer pool
120831 20:56:06 InnoDB: highest supported file format is Barracuda.
InnoDB: Log scan progressed past the checkpoint lsn 62096393185
120831 20:56:06  InnoDB: Database was not shut down normally!
InnoDB: Starting crash recovery.
InnoDB: Reading tablespace information from the .ibd files...
InnoDB: Restoring possible half-written data pages from the doublewrite
InnoDB: buffer...
InnoDB: Warning: database page corruption or a failed
InnoDB: file read of space 0 page 230.
InnoDB: Trying to recover it from the doublewrite buffer.
InnoDB: Recovered the page from the doublewrite buffer.
InnoDB: Warning: database page corruption or a failed
InnoDB: file read of space 0 page 373.
InnoDB: Trying to recover it from the doublewrite buffer.
InnoDB: Recovered the page from the doublewrite buffer.
InnoDB: Warning: database page corruption or a failed
InnoDB: file read of space 0 page 214.
InnoDB: Trying to recover it from the doublewrite buffer.
InnoDB: Recovered the page from the doublewrite buffer.
InnoDB: Warning: database page corruption or a failed
InnoDB: file read of space 0 page 222.
InnoDB: Trying to recover it from the doublewrite buffer.
InnoDB: Recovered the page from the doublewrite buffer.
InnoDB: Warning: database page corruption or a failed
InnoDB: file read of space 0 page 2673.
InnoDB: Trying to recover it from the doublewrite buffer.
InnoDB: Recovered the page from the doublewrite buffer.
InnoDB: Warning: database page corruption or a failed
InnoDB: file read of space 0 page 2681.
InnoDB: Trying to recover it from the doublewrite buffer.
InnoDB: Recovered the page from the doublewrite buffer.
InnoDB: Warning: database page corruption or a failed
InnoDB: file read of space 0 page 46.
InnoDB: Trying to recover it from the doublewrite buffer.
InnoDB: Recovered the page from the doublewrite buffer.
InnoDB: Warning: database page corruption or a failed
InnoDB: file read of space 0 page 62.
InnoDB: Trying to recover it from the doublewrite buffer.
InnoDB: Recovered the page from the doublewrite buffer.
InnoDB: Doing recovery: scanned up to log sequence number 62096881152
InnoDB: Database page corruption on disk or a failed
InnoDB: file read of page 198.
InnoDB: You may have to recover from a backup.
120831 20:56:33  InnoDB: Page dump in ascii and hex (16384 bytes):
 len 16384; hex
{Big dump here -- I can supply if needed}

InnoDB: End of page dump
120831 20:56:33  InnoDB: Page checksum 3859504003, prior-to-4.0.14-form 
checksum 1080681687
InnoDB: stored checksum 3859504003, prior-to-4.0.14-form stored checksum 
3870577874
InnoDB: Page lsn 14 1966349405, low 4 bytes of lsn at page end 1966973261
InnoDB: Page number (if stored to page already) 198,
InnoDB: space id (if created with = MySQL-4.1.1 and stored already) 0
InnoDB: Page may be a system page
InnoDB: Database page corruption on disk or a failed
InnoDB: file read of page 198.
InnoDB: You may have to recover from a backup.
InnoDB: It is also possible that your operating
InnoDB: system has corrupted its own file cache
InnoDB: and rebooting your computer removes the
InnoDB: error.
InnoDB: If the corrupt page is an index page
InnoDB: you can also try to fix the corruption
InnoDB: by dumping, dropping, and reimporting
InnoDB: the corrupt table. You can use CHECK
InnoDB: TABLE to scan your table for corruption.
InnoDB: See also 
http://dev.mysql.com/doc/refman/5.5/en/forcing-innodb-recovery.html
InnoDB: about forcing recovery.
InnoDB: Ending processing because of a corrupt database page.
120831 20:56:33  InnoDB: Assertion failure in thread 140548948399904 in file 
buf0buf.c line 3609
InnoDB: We intentionally generate a memory trap.
InnoDB: Submit a detailed bug report to http://bugs.mysql.com.
InnoDB: If you get repeated assertion failures or crashes, even
InnoDB: immediately after the mysqld startup, there may be
InnoDB: corruption in the InnoDB tablespace. Please refer to