RE: InnoDB corrupt after power failure
On Linux, XFS is preferred. Noop or Deadline, not CFQ is preferred. I don't know if any of this has any impact on the crash you describe. I am quite suspicious of VMs. > -Original Message- > From: Andrew Miklas [mailto:and...@pagerduty.com] > Sent: Thursday, October 04, 2012 3:21 PM > To: Rick James > Cc: Manuel Arostegui; mysql@lists.mysql.com > Subject: Re: InnoDB corrupt after power failure > > Hi Rick, > > On Oct 4, 2012, at 2:40 PM, Rick James wrote: > > > I hope you turned OFF caching on the drives, themselves. The BBU > should be the single place that caches and is trusted to survive a > power outage. > > The DB server in question is running in a virtualized environment, so > the array shows up as a SCSI device inside our VM. I can't use hdparm > to directly check whether the disks are doing write caching, but our > hosting provider assures us that once data is sent to the virtual SCSI > device from inside the VM, it will be persisted to disk even if there's > a power failure. > > I'm a bit suspicious of a recent change we did to switch our ext3 > journals from data=ordered to data=writeback. The ext3 docs say "a > crash+recovery can cause incorrect data to appear in files which were > written shortly before the crash" [1]. As a result, if a tablespace > were extended just before the power failure, it might be possible that > when MySQL restarts, it will see random data at the end of the > tablespace. It seems like this could happen even if the disks are BBU > / not write caching, because the increase of the ibd's file size in the > inode and the zeroing out of the new blocks assigned to the file are > not atomic with respect to one another. > > Is the InnoDB recovery process OK with this scenario? Has anyone else > seen corruption problems with data=writeback? > > > -- Andrew > > > [1] http://lxr.linux.no/linux+v3.5.2/Documentation/filesystems/ext3.txt -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/mysql
Re: InnoDB corrupt after power failure
Hi Rick, On Oct 4, 2012, at 2:40 PM, Rick James wrote: > I hope you turned OFF caching on the drives, themselves. The BBU should be > the single place that caches and is trusted to survive a power outage. The DB server in question is running in a virtualized environment, so the array shows up as a SCSI device inside our VM. I can't use hdparm to directly check whether the disks are doing write caching, but our hosting provider assures us that once data is sent to the virtual SCSI device from inside the VM, it will be persisted to disk even if there's a power failure. I'm a bit suspicious of a recent change we did to switch our ext3 journals from data=ordered to data=writeback. The ext3 docs say "a crash+recovery can cause incorrect data to appear in files which were written shortly before the crash" [1]. As a result, if a tablespace were extended just before the power failure, it might be possible that when MySQL restarts, it will see random data at the end of the tablespace. It seems like this could happen even if the disks are BBU / not write caching, because the increase of the ibd's file size in the inode and the zeroing out of the new blocks assigned to the file are not atomic with respect to one another. Is the InnoDB recovery process OK with this scenario? Has anyone else seen corruption problems with data=writeback? -- Andrew [1] http://lxr.linux.no/linux+v3.5.2/Documentation/filesystems/ext3.txt -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/mysql
RE: InnoDB corrupt after power failure
I hope you turned OFF caching on the drives, themselves. The BBU should be the single place that caches and is trusted to survive a power outage. > -Original Message- > From: Andrew Miklas [mailto:and...@pagerduty.com] > Sent: Thursday, October 04, 2012 2:16 PM > To: Manuel Arostegui > Cc: mysql@lists.mysql.com > Subject: Re: InnoDB corrupt after power failure > > Hi Manuel, > > Thanks for the fast reply. > > On Oct 4, 2012, at 12:05 AM, Manuel Arostegui wrote: > > > it shouldn't be a biggie if you have a BBU. Do you guys use HW RAID + > BBU? > > We've checked with our hosting provider, and the database was indeed > stored on a BBU RAID. > > > What's your innodb_flush_log_at_trx_commit value? > > mysql> show variables like 'innodb_flush_log_at_trx_commit'\G > *** 1. row *** > Variable_name: innodb_flush_log_at_trx_commit > Value: 1 > 1 row in set (0.00 sec) > > > > Have you tried playing with innodb_force_recovery option to try to > get the server started at least? That way you might be able to identify > which table(s) is/are the corrupted one and the one(s) preventing the > whole server from booting up. > > As the affected machine was just a read only slave, it was easier for > me to get things back into service by just reloading off the master. > Unfortunately, I didn't think to keep the corrupted ibd files for later > debugging. > > At this point, I'm more trying to figure out if there's something wrong > with the DB or host config. There was effectively no data loss, but > I'm worried we might have data loss or availability issues if this > error crops up on our master server. > > > Thanks, > > > Andrew > > > -- > MySQL General Mailing List > For list archives: http://lists.mysql.com/mysql > To unsubscribe:http://lists.mysql.com/mysql -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/mysql
Re: InnoDB corrupt after power failure
Hi Manuel, Thanks for the fast reply. On Oct 4, 2012, at 12:05 AM, Manuel Arostegui wrote: > it shouldn't be a biggie if you have a BBU. Do you guys use HW RAID + BBU? We've checked with our hosting provider, and the database was indeed stored on a BBU RAID. > What's your innodb_flush_log_at_trx_commit value? mysql> show variables like 'innodb_flush_log_at_trx_commit'\G *** 1. row *** Variable_name: innodb_flush_log_at_trx_commit Value: 1 1 row in set (0.00 sec) > Have you tried playing with innodb_force_recovery option to try to get the > server started at least? That way you might be able to identify which > table(s) is/are the corrupted one and the one(s) preventing the whole server > from booting up. As the affected machine was just a read only slave, it was easier for me to get things back into service by just reloading off the master. Unfortunately, I didn't think to keep the corrupted ibd files for later debugging. At this point, I'm more trying to figure out if there's something wrong with the DB or host config. There was effectively no data loss, but I'm worried we might have data loss or availability issues if this error crops up on our master server. Thanks, Andrew -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/mysql
Re: InnoDB corrupt after power failure
2012/10/4 Andrew Miklas > Hi guys, > > I recently had a data corruption issue with InnoDB. MySQL was shut down > improperly (power failure), and when the system came back up, MySQL refused > to start. On inspection of the logs (see below), it looks like the > tablespace became seriously corrupted. In the end, I had to rebuild the > slave using mysqldump. > > I'm curious what happened here, since I thought InnoDB wasn't supposed to > become corrupted on an improper shutdown. One possibility that we were > exploring was that the filesystem journal setting was incorrect. We were > using ext3 with the journal set to writeback mode. Is this a known bad > config with InnoDB? > Hey Andrew, it shouldn't be a biggie if you have a BBU. Do you guys use HW RAID + BBU? What's your innodb_flush_log_at_trx_commit value? Have you tried playing with innodb_force_recovery option to try to get the server started at least? That way you might be able to identify which table(s) is/are the corrupted one and the one(s) preventing the whole server from booting up. Manuel
InnoDB corrupt after power failure
Hi guys, I recently had a data corruption issue with InnoDB. MySQL was shut down improperly (power failure), and when the system came back up, MySQL refused to start. On inspection of the logs (see below), it looks like the tablespace became seriously corrupted. In the end, I had to rebuild the slave using mysqldump. I'm curious what happened here, since I thought InnoDB wasn't supposed to become corrupted on an improper shutdown. One possibility that we were exploring was that the filesystem journal setting was incorrect. We were using ext3 with the journal set to writeback mode. Is this a known bad config with InnoDB? Thanks for any help, Andrew --- MySQL server version: Server version: 5.5.27-1~ppa1~lucid-log (Ubuntu) (Running on Ubuntu 10.04.2 LTS) 120831 20:56:01 InnoDB: The InnoDB memory heap is disabled 120831 20:56:01 InnoDB: Mutexes and rw_locks use GCC atomic builtins 120831 20:56:01 InnoDB: Compressed tables use zlib 1.2.3.3 120831 20:56:02 InnoDB: Initializing buffer pool, size = 5.0G 120831 20:56:03 InnoDB: Completed initialization of buffer pool 120831 20:56:06 InnoDB: highest supported file format is Barracuda. InnoDB: Log scan progressed past the checkpoint lsn 62096393185 120831 20:56:06 InnoDB: Database was not shut down normally! InnoDB: Starting crash recovery. InnoDB: Reading tablespace information from the .ibd files... InnoDB: Restoring possible half-written data pages from the doublewrite InnoDB: buffer... InnoDB: Warning: database page corruption or a failed InnoDB: file read of space 0 page 230. InnoDB: Trying to recover it from the doublewrite buffer. InnoDB: Recovered the page from the doublewrite buffer. InnoDB: Warning: database page corruption or a failed InnoDB: file read of space 0 page 373. InnoDB: Trying to recover it from the doublewrite buffer. InnoDB: Recovered the page from the doublewrite buffer. InnoDB: Warning: database page corruption or a failed InnoDB: file read of space 0 page 214. InnoDB: Trying to recover it from the doublewrite buffer. InnoDB: Recovered the page from the doublewrite buffer. InnoDB: Warning: database page corruption or a failed InnoDB: file read of space 0 page 222. InnoDB: Trying to recover it from the doublewrite buffer. InnoDB: Recovered the page from the doublewrite buffer. InnoDB: Warning: database page corruption or a failed InnoDB: file read of space 0 page 2673. InnoDB: Trying to recover it from the doublewrite buffer. InnoDB: Recovered the page from the doublewrite buffer. InnoDB: Warning: database page corruption or a failed InnoDB: file read of space 0 page 2681. InnoDB: Trying to recover it from the doublewrite buffer. InnoDB: Recovered the page from the doublewrite buffer. InnoDB: Warning: database page corruption or a failed InnoDB: file read of space 0 page 46. InnoDB: Trying to recover it from the doublewrite buffer. InnoDB: Recovered the page from the doublewrite buffer. InnoDB: Warning: database page corruption or a failed InnoDB: file read of space 0 page 62. InnoDB: Trying to recover it from the doublewrite buffer. InnoDB: Recovered the page from the doublewrite buffer. InnoDB: Doing recovery: scanned up to log sequence number 62096881152 InnoDB: Database page corruption on disk or a failed InnoDB: file read of page 198. InnoDB: You may have to recover from a backup. 120831 20:56:33 InnoDB: Page dump in ascii and hex (16384 bytes): len 16384; hex {Big dump here -- I can supply if needed} InnoDB: End of page dump 120831 20:56:33 InnoDB: Page checksum 3859504003, prior-to-4.0.14-form checksum 1080681687 InnoDB: stored checksum 3859504003, prior-to-4.0.14-form stored checksum 3870577874 InnoDB: Page lsn 14 1966349405, low 4 bytes of lsn at page end 1966973261 InnoDB: Page number (if stored to page already) 198, InnoDB: space id (if created with >= MySQL-4.1.1 and stored already) 0 InnoDB: Page may be a system page InnoDB: Database page corruption on disk or a failed InnoDB: file read of page 198. InnoDB: You may have to recover from a backup. InnoDB: It is also possible that your operating InnoDB: system has corrupted its own file cache InnoDB: and rebooting your computer removes the InnoDB: error. InnoDB: If the corrupt page is an index page InnoDB: you can also try to fix the corruption InnoDB: by dumping, dropping, and reimporting InnoDB: the corrupt table. You can use CHECK InnoDB: TABLE to scan your table for corruption. InnoDB: See also http://dev.mysql.com/doc/refman/5.5/en/forcing-innodb-recovery.html InnoDB: about forcing recovery. InnoDB: Ending processing because of a corrupt database page. 120831 20:56:33 InnoDB: Assertion failure in thread 140548948399904 in file buf0buf.c line 3609 InnoDB: We intentionally generate a memory trap. InnoDB: Submit a detailed bug report to http://bugs.mysql.com. InnoDB: If you get repeated assertion failures or crashes, even InnoDB: immediately after the mysqld startup, there may be InnoDB: corruption in the InnoDB tablespace. Please refer to In