On 2017-02-02 09:25, Adam Borowski wrote:
> On Thu, Feb 02, 2017 at 07:49:50AM -0500, Austin S. Hemmelgarn wrote:
>> This is a severe bug that makes a not all that uncommon (albeit bad) use
>> case fail completely.  The fix had no dependencies itself and

> I don't see what's bad in mounting a RAID degraded.  Yeah, it provides no
> redundancy, but that's no worse than using a single disk from the start.
> And most people who aren't running a storage/server farm don't have a stack
> of spare disks at hand, so getting a replacement might take a while.
Running degraded is bad. Period. If you don't have a disk on hand to replace the failed one (and if you care about redundancy, you should have at least one spare on hand), you should be converting to a single disk, not continuing to run in degraded mode until you get a new disk. The moment you start talking about running degraded long enough that you will be _booting_ the system with the array degraded, you need to be converting to a single disk.

This is of course impractical for something like a hardware array or an LVM volume, but it's _trivial_ with BTRFS, and it protects you from all kinds of bad situations that can't happen with a single disk but can completely destroy the filesystem if it's a degraded array. Running a single disk is not exactly the same as running a degraded array; it's actually marginally safer (even if you aren't using the dup profile for metadata) because there are fewer moving parts to go wrong. It's also considerably more efficient.
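To put that in concrete terms, the conversion is just a degraded mount followed by a balance. A minimal sketch (the device names and mount point are examples, and it assumes the degraded filesystem will still let you mount it writable, which is exactly what the bug under discussion can prevent):

    # Mount the surviving member writable in degraded mode.
    mount -o degraded /dev/sda /mnt

    # Convert data chunks to single and metadata to dup.  The 'soft'
    # filter skips chunks that are already in the target profile.
    btrfs balance start -dconvert=single,soft -mconvert=dup,soft /mnt

    # Once the balance finishes, drop the dead device.
    btrfs device remove missing /mnt

After that it's an ordinary single-disk BTRFS volume, and none of the degraded-array failure modes apply to it anymore.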

> Being able to continue to run when a disk fails is the whole point of RAID
> -- despite what some folks think, RAIDs are not for backups but for uptime.
> And if your uptime goes to hell because the moment a disk fails you need to
> drop everything and replace the disk immediately, why would you use RAID?
Because just replacing a disk and rebuilding the array is almost always much cheaper in terms of time than rebuilding the system from a backup. IOW, even if you have to drop everything and replace the disk immediately, it's still less time-consuming than restoring from a backup, and it has the added advantage that you don't lose any data written since the last backup.
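And with BTRFS that rebuild is a single command. As a sketch (assuming the dead disk was devid 2, which 'btrfs filesystem show' will tell you, and /dev/sdc is the new disk):

    # Rebuild directly onto the new disk.  -r only reads from the
    # device being replaced when no other good copy exists, which is
    # what you want if that disk is failing.
    btrfs replace start -r 2 /dev/sdc /mnt

    # The rebuild runs in the background; check on it with:
    btrfs replace status /mnt

The filesystem stays online and writable the whole time, which is rather the point of the uptime argument.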

> I /thought/ the immediate benefit was obvious enough that it
> would be mainline-merged by now, not hoovered-up into some long-term
> project with no real hint as to /when/ it might be merged.  Oh, well...
I think (although I'm not sure about it) that this:
http://www.spinics.net/lists/linux-btrfs/msg47283.html
is the first posting of the patch series.

> Is there a more recent version somewhere?  Mechanically rebasing+resolving
> conflicts doesn't work, I'd need to do a more involved refresh, which would
> be a waste of time if it's already done by someone with an actual clue about
> this code.
There may be, but I haven't looked very far. Qu would probably be the person to ask.
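If you do end up having to refresh it by hand, a reasonable starting point (just a sketch; the mbox file, tag, and branch name are placeholders, and it assumes you can pull the series as an mbox from a list archive) would be a three-way apply so git can follow context that has merely moved:

    # Apply the old series on top of the kernel you care about.
    git checkout -b degraded-chunks v4.9
    git am -3 qu-degraded-chunks.mbox

    # Where the 3-way merge still conflicts, fix things up by hand,
    # then continue.
    git am --continue

though for anything beyond trivial conflicts you're right that it really needs someone who knows this code.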