On Wed, 21 Oct 2015 12:00:59 AM Austin S Hemmelgarn wrote:
> > https://www.gnu.org/software/ddrescue/
> > 
> > At this stage I would use ddrescue or something similar to copy data from
> > the failing disk to a fresh disk, then do a BTRFS scrub to regenerate
> > the missing data.
> > 
> > I wouldn't remove the disk entirely because then you lose badly if you
> > get another failure.  I wouldn't use a BTRFS replace because you already
> > have the system apart and I expect ddrescue could copy the data faster. 
> > Also as the drive has been causing system failures (I'm guessing a
> > problem with the power connector) you REALLY don't want BTRFS to corrupt
> > data on the other disks.  If you have a system with the failing disk and
> > a new disk attached then there's no risk of further contamination.
> 
> BIG DISCLAIMER: For the filesystem to be safely mountable it is
> ABSOLUTELY NECESSARY to remove the old disk after doing a block level

You are correct, my message wasn't clear.

What I meant to say is that doing a "btrfs device remove" or "btrfs replace" 
is generally a bad idea in such a situation.  "btrfs replace" is pretty good 
if you are replacing a disk with a larger one or replacing a disk that has 
only minor errors (a disk that just gets a few bad sectors is unlikely to get 
many more in a hurry).

> copy of it.  By all means, keep the disk around, but do not keep it
> visible to the kernel after doing a block level copy of it.  Also, you
> will probably have to run 'btrfs device scan' after copying the disk and
> removing it for the filesystem to work right.  This is an inherent
> result of how BTRFS's multi-device functionality works, and also applies
> to doing stuff like LVM snapshots of BTRFS filesystems.

Good advice.  I recommend just rebooting the system.  Anyone who has the 
background knowledge to do such things without rebooting will probably just 
do it without needing to ask us for advice.
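
For anyone who wants the exact sequence, it is something like this (a sketch 
only, device names and paths are placeholders, and the ddrescue man page has 
the options that suit your situation):

  # With the filesystem unmounted and both the failing disk and the new
  # disk attached, copy at the block level.  -f allows writing to a block
  # device, -d uses direct I/O, -r3 retries bad sectors a few times.
  ddrescue -f -d -r3 /dev/sdb /dev/sdc /root/rescue.map

  # Power down and physically remove the old disk, then (after rebooting)
  # make sure the kernel's view of the member devices is current before
  # mounting.
  btrfs device scan
  mount /dev/sdc /mnt/data

  # Rewrite whatever ddrescue couldn't read, using the good copies on the
  # other disks.
  btrfs scrub start -B /mnt/data

The map file is what lets you go back and retry the bad areas later if the 
first pass misses anything.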

> >> Question 2 - Before having ran the scrub, booting off the raid with
> >> bad sectors, would btrfs "on the fly" recognize it was getting bad
> >> sector data with the checksum being off, and checking the other
> >> drives?  Or, is it expected that I could get a bad sector read in a
> >> critical piece of operating system and/or kernel, which could be
> >> causing my lockup issues?
> > 
> > Unless you have disabled CoW then BTRFS will not return bad data.
> 
> It is worth clarifying also that:
> a. While BTRFS will not return bad data in this case, it also won't
> automatically repair the corruption.

Really?  If so, I think that's a bug in BTRFS.  When mounted rw, every time 
corruption is discovered it should be automatically fixed.
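
In the meantime the way to force a repair is a scrub, e.g. (a sketch, the 
mount point is a placeholder):

  # Any block whose checksum fails is rewritten from a good copy, assuming
  # a redundant profile such as RAID-1 or DUP.
  btrfs scrub start /mnt/data

  # The corrected/uncorrectable error counters show up here.
  btrfs scrub status /mnt/data

Checksum failures the kernel hits during normal reads are also logged, so 
grepping dmesg or syslog for "csum" will show what it has tripped over.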

> b. In the unlikely event that both copies are bad, trying to read the
> data will return an IO error.
> c. It is theoretically possible (although statistically impossible) that
> the block could become corrupted, but the checksum could still be
> correct (CRC32c is good at detecting small errors, but it's not hard to
> generate a hash collision for any arbitrary value, so if a large portion
> of the block goes bad, then it can theoretically still have a valid
> checksum).

It would be interesting to see some research into how well CRC32 covers the 
more common disk errors.  For a disk to return bad data and claim it is good, 
the data must either be a misplaced write or read (which is almost certain to 
be caught by BTRFS as the metadata won't match), or a random sector that 
matches the disk's own CRC.  Is generating a collision for BTRFS's CRC32 much 
more difficult when the block also has to pass the disk's CRC?

> >> Question 3 - Probably doesn't matter, but how can I see which files
> >> (or metadata to files) the 40 current bad sectors are in?  (On extX,
> >> I'd use tune2fs and debugfs to be able to see this information.)
> > 
> > Read all the files in the system and syslog will report it.  But really
> > don't do that until after you have copied the disk.
> 
> It may also be possible to use some of the debug tools from BTRFS to do
> this without hitting the disks so hard, but it will likely take a lot
> more effort.

I don't think that you can do that without hitting the disks hard.
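
If you do go the brute-force route after copying the disk, something like 
this would do it (a sketch only, adjust the mount point):

  # Read every file once and discard the data; each checksum failure the
  # kernel hits gets logged.
  find /mnt/data -xdev -type f -print0 | xargs -0 cat > /dev/null

  # Then look for the inode and offset information in the kernel log.
  dmesg | grep -i csum

Mapping the logged inode numbers back to file names can be done with "btrfs 
inspect-internal inode-resolve" if the paths aren't already in the log.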

That said, the last time I checked (the last time an executive of a hard 
drive manufacturer was willing to talk to me) drives were apparently designed 
to perform any sequence of operations for their warranty period.  So for a disk 
that is believed to be good this shouldn't be a problem.  For a disk that is 
known to be dying it would be a really bad idea to do anything other than copy 
the data off at maximum speed.

-- 
My Main Blog         http://etbe.coker.com.au/
My Documents Blog    http://doc.coker.com.au/
