Re: InnoDB corrupt after power failure
2012/10/4 Andrew Miklas and...@pagerduty.com Hi guys, I recently had a data corruption issue with InnoDB. MySQL was shut down improperly (power failure), and when the system came back up, MySQL refused to start. On inspection of the logs (see below), it looks like the tablespace became seriously corrupted. In the end, I had to rebuild the slave using mysqldump. I'm curious what happened here, since I thought InnoDB wasn't supposed to become corrupted on an improper shutdown. One possibility that we were exploring was that the filesystem journal setting was incorrect. We were using ext3 with the journal set to writeback mode. Is this a known bad config with InnoDB? Hey Andrew, it shouldn't be a biggie if you have a BBU. Do you guys use HW RAID + BBU? What's your innodb_flush_log_at_trx_commit value? Have you tried playing with innodb_force_recovery option to try to get the server started at least? That way you might be able to identify which table(s) is/are the corrupted one and the one(s) preventing the whole server from booting up. Manuel
Re: InnoDB corrupt after power failure
Hi Manuel, Thanks for the fast reply. On Oct 4, 2012, at 12:05 AM, Manuel Arostegui wrote: snip it shouldn't be a biggie if you have a BBU. Do you guys use HW RAID + BBU? We've checked with our hosting provider, and the database was indeed stored on a BBU RAID. What's your innodb_flush_log_at_trx_commit value? mysql show variables like 'innodb_flush_log_at_trx_commit'\G *** 1. row *** Variable_name: innodb_flush_log_at_trx_commit Value: 1 1 row in set (0.00 sec) Have you tried playing with innodb_force_recovery option to try to get the server started at least? That way you might be able to identify which table(s) is/are the corrupted one and the one(s) preventing the whole server from booting up. As the affected machine was just a read only slave, it was easier for me to get things back into service by just reloading off the master. Unfortunately, I didn't think to keep the corrupted ibd files for later debugging. At this point, I'm more trying to figure out if there's something wrong with the DB or host config. There was effectively no data loss, but I'm worried we might have data loss or availability issues if this error crops up on our master server. Thanks, Andrew -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/mysql
RE: InnoDB corrupt after power failure
I hope you turned OFF caching on the drives, themselves. The BBU should be the single place that caches and is trusted to survive a power outage. -Original Message- From: Andrew Miklas [mailto:and...@pagerduty.com] Sent: Thursday, October 04, 2012 2:16 PM To: Manuel Arostegui Cc: mysql@lists.mysql.com Subject: Re: InnoDB corrupt after power failure Hi Manuel, Thanks for the fast reply. On Oct 4, 2012, at 12:05 AM, Manuel Arostegui wrote: snip it shouldn't be a biggie if you have a BBU. Do you guys use HW RAID + BBU? We've checked with our hosting provider, and the database was indeed stored on a BBU RAID. What's your innodb_flush_log_at_trx_commit value? mysql show variables like 'innodb_flush_log_at_trx_commit'\G *** 1. row *** Variable_name: innodb_flush_log_at_trx_commit Value: 1 1 row in set (0.00 sec) Have you tried playing with innodb_force_recovery option to try to get the server started at least? That way you might be able to identify which table(s) is/are the corrupted one and the one(s) preventing the whole server from booting up. As the affected machine was just a read only slave, it was easier for me to get things back into service by just reloading off the master. Unfortunately, I didn't think to keep the corrupted ibd files for later debugging. At this point, I'm more trying to figure out if there's something wrong with the DB or host config. There was effectively no data loss, but I'm worried we might have data loss or availability issues if this error crops up on our master server. Thanks, Andrew -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/mysql -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/mysql
Re: InnoDB corrupt after power failure
Hi Rick, On Oct 4, 2012, at 2:40 PM, Rick James wrote: I hope you turned OFF caching on the drives, themselves. The BBU should be the single place that caches and is trusted to survive a power outage. The DB server in question is running in a virtualized environment, so the array shows up as a SCSI device inside our VM. I can't use hdparm to directly check whether the disks are doing write caching, but our hosting provider assures us that once data is sent to the virtual SCSI device from inside the VM, it will be persisted to disk even if there's a power failure. I'm a bit suspicious of a recent change we did to switch our ext3 journals from data=ordered to data=writeback. The ext3 docs say a crash+recovery can cause incorrect data to appear in files which were written shortly before the crash [1]. As a result, if a tablespace were extended just before the power failure, it might be possible that when MySQL restarts, it will see random data at the end of the tablespace. It seems like this could happen even if the disks are BBU / not write caching, because the increase of the ibd's file size in the inode and the zeroing out of the new blocks assigned to the file are not atomic with respect to one another. Is the InnoDB recovery process OK with this scenario? Has anyone else seen corruption problems with data=writeback? -- Andrew [1] http://lxr.linux.no/linux+v3.5.2/Documentation/filesystems/ext3.txt -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/mysql
RE: InnoDB corrupt after power failure
On Linux, XFS is preferred. Noop or Deadline, not CFQ is preferred. I don't know if any of this has any impact on the crash you describe. I am quite suspicious of VMs. -Original Message- From: Andrew Miklas [mailto:and...@pagerduty.com] Sent: Thursday, October 04, 2012 3:21 PM To: Rick James Cc: Manuel Arostegui; mysql@lists.mysql.com Subject: Re: InnoDB corrupt after power failure Hi Rick, On Oct 4, 2012, at 2:40 PM, Rick James wrote: I hope you turned OFF caching on the drives, themselves. The BBU should be the single place that caches and is trusted to survive a power outage. The DB server in question is running in a virtualized environment, so the array shows up as a SCSI device inside our VM. I can't use hdparm to directly check whether the disks are doing write caching, but our hosting provider assures us that once data is sent to the virtual SCSI device from inside the VM, it will be persisted to disk even if there's a power failure. I'm a bit suspicious of a recent change we did to switch our ext3 journals from data=ordered to data=writeback. The ext3 docs say a crash+recovery can cause incorrect data to appear in files which were written shortly before the crash [1]. As a result, if a tablespace were extended just before the power failure, it might be possible that when MySQL restarts, it will see random data at the end of the tablespace. It seems like this could happen even if the disks are BBU / not write caching, because the increase of the ibd's file size in the inode and the zeroing out of the new blocks assigned to the file are not atomic with respect to one another. Is the InnoDB recovery process OK with this scenario? Has anyone else seen corruption problems with data=writeback? -- Andrew [1] http://lxr.linux.no/linux+v3.5.2/Documentation/filesystems/ext3.txt -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/mysql