Re: Raid0 rescue

Duncan Thu, 27 Jul 2017 13:26:24 -0700

Hugo Mills posted on Thu, 27 Jul 2017 15:10:38 +0000 as excerpted:

> On Thu, Jul 27, 2017 at 10:49:37AM -0400, Alan Brand wrote:
>> I know I am screwed but hope someone here can point at a possible
>> solution.
>> 
>> I had a pair of btrfs drives in a raid0 configuration.  One of the
>> drives was pulled by mistake, put in a windows box, and a quick NTFS
>> format was done.  Then much screaming occurred.
>> 
>> I know the data is still there. [...]


>> I can't run a normal recovery as only half of each file is there.
> 
>    Welcome to RAID-0...

Hugo, Chris Murphy, or one of the devs should they take an interest, are 
your best bets for current recovery.  This reply only tries to fill in 
some recommendations for an eventual rebuild.

As Hugo implies, RAID-0 mode, not just for btrfs but in general, is well 
known among admins for being "garbage data not worth trying to recover" 
mode.  Not only is there no redundancy, but with raid0 you're 
deliberately increasing the chances of loss because now loss of any one 
device pretty well makes garbage of the entire array, and losing at least 
one device out of a group is more likely than losing any one particular 
device by itself.
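
To put a rough number on that last point, here's a minimal sketch.  It 
assumes device failures are independent and that every device has the 
same failure probability over whatever period you care about; the 5% 
figure and the function name are made up purely for illustration.

```python
def array_loss_probability(per_device_failure: float, devices: int) -> float:
    """Chance that at least one of `devices` devices fails.

    Assumes independent failures with identical probability, which is a
    simplification (real devices from one batch often fail together).
    """
    if not 0.0 <= per_device_failure <= 1.0:
        raise ValueError("per_device_failure must be between 0 and 1")
    if devices < 1:
        raise ValueError("devices must be at least 1")
    # The array survives only if every single device survives.
    return 1.0 - (1.0 - per_device_failure) ** devices


if __name__ == "__main__":
    # Hypothetical 5% per-device failure chance, for 1, 2 and 4 devices.
    for n in (1, 2, 4):
        print(n, round(array_loss_probability(0.05, n), 4))
```

With raid0 any one failure takes out the whole array, so the figure for 
two devices is the one that applies, and it is close to double the 
single-device figure.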

So first rule of raid0, don't use it unless the data you're putting on it 
is indeed not worth trying to rebuild, either because you keep the 
backups updated and it's easier to just go back to them than to even try 
recovery of the raid0, or because the data really is garbage data, 
internet cache, temp files, etc, that it's really just better to scrap 
and let the cache rebuild, etc, than try to recover.

That's in general.  For btrfs in particular, there's some additional 
considerations altho they don't change the above.  If the data isn't 
quite down to the raid0-garbage, just-give-up-and-start-over, level, with 
btrfs, what you likely want is metadata raid1, data single, mode, which 
is the btrfs multi-device default.

The raid1 metadata mode will mean there's two copies of metadata, one on 
each of two different devices, so it'll tolerate loss of a single device 
and still let you at least know where the files are located and give you 
a chance at recovery.  But since metadata is typically a small fraction 
of the total, you'll not be sacrificing /too/ much space for that 
additional safety.

The single data mode will normally put files (under a gig filesize 
anyway, tho as the size increases toward a gig the chances of it all 
being on a single device go down) all on one device, so with a loss of a 
device, you'll either still have the file or you won't.  The contrast 
with raid0 mode is that its threshold is 64k instead of a gig: above 
that, the file is striped across multiple devices.  So indeed, with a 
two-device raid0, half of each file, in alternating 64k pieces, is what 
you have left if one of the devices goes bad, while with single, your 
chances of whole-file recovery, assuming the file wasn't /entirely/ on 
the bad device, are pretty good up to a gig or so.
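
As an illustration of the "alternating 64k pieces" point, here's a 
small sketch of which byte ranges of a file would still be readable 
after losing one device.  It is a simplified model, not btrfs's actual 
allocator: it assumes the file starts on device 0 at a stripe boundary 
and that stripes go round-robin across the devices.  The function name 
is hypothetical.

```python
STRIPE = 64 * 1024  # 64 KiB stripe element, the figure quoted above


def surviving_ranges(file_size, devices, lost_device, stripe=STRIPE):
    """Return (start, end) byte ranges of a file NOT stored on lost_device.

    Simplified model: stripe i of the file lives on device i % devices.
    """
    ranges = []
    offset = 0
    index = 0
    while offset < file_size:
        end = min(offset + stripe, file_size)
        if index % devices != lost_device:
            if ranges and ranges[-1][1] == offset:
                # Extend the previous range when the pieces are adjacent.
                ranges[-1] = (ranges[-1][0], end)
            else:
                ranges.append((offset, end))
        offset = end
        index += 1
    return ranges


if __name__ == "__main__":
    # A 200 KiB file on a two-device raid0 with device 1 lost.
    for start, end in surviving_ranges(200 * 1024, 2, 1):
        print(start, end)
```

Under those assumptions, anything over 64k comes back with holes in it, 
which is why ordinary file recovery tools have so little to work with.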

And because btrfs is still in the stabilizing, "get-the-code-correct-
before-you-worry-about-optimizing-it" mode, unlike more mature raid 
implementations such as the kernel's mdraid, btrfs still normally 
accesses only one device at a time, so btrfs raid0 only gets you the 
space advantage, not the usual raid0 speed advantage.  So btrfs single 
mode isn't really much if any slower than raid0, while being much safer 
and offering the same (or even better in the case of differing device 
sizes) size advantage as raid0.
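
For the differing-device-sizes case, a rough sketch of the arithmetic 
for a two-device filesystem.  It assumes that raid0 data needs free 
space on both devices at once, so allocation stops when the smaller 
device is full, and it ignores metadata overhead entirely.  The sizes 
and the function name are illustrative only.

```python
def usable_data_space(size_a, size_b):
    """Approximate usable data space on two devices, per data profile.

    Assumptions: raid0 stripes across both devices and cannot allocate
    once the smaller one is full; single can fill each device on its
    own; metadata space is ignored.
    """
    return {
        "single": size_a + size_b,
        "raid0": 2 * min(size_a, size_b),
    }


if __name__ == "__main__":
    # Hypothetical 1000 GB and 500 GB devices.
    print(usable_data_space(1000, 500))
```

With equal-size devices the two come out the same, which is the "same 
size advantage" case; the gap only opens up when the sizes differ.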

Put differently, there's really very little case for choosing btrfs raid0 
mode at this time.  Maybe some years in the future when raid0 mode is 
speed-optimized that will change, but for now, single mode is safer and 
in the case of unequal device sizes makes better use of space, while 
being generally as fast as raid0 mode, so single mode is almost 
certainly a better choice.

Meanwhile, back to the general case again:  Admin's first rule of 
backups:  The *true* value you place on your data is defined not by 
arbitrary claims, but by the number of backups of that data you have.  No 
backups, much like putting the data on raid0, defines that data as of 
garbage value, not worth the trouble to try to recover in the case of 
raid0, not worth the trouble of making the backup in the first place in 
the case of no backup.  Of course really valuable data will have multiple 
backups, generally some of which are off-site in case the entire site is 
lost (flood, fire, earthquake, bomb, etc), while others are on-site in 
order to facilitate easy recovery from a less major disaster, should it 
be necessary.

Which means, regardless of whether files are lost or not, what was of 
most value as defined by an admin's actions (or lack of them in the case 
of not having a backup) is always saved, either the time/resources/
trouble to make the backup in the first place if the data wasn't worth 
it, or the data, if the backup was made and is thus available for 
recovery purposes.

So if you lost the data and didn't have a backup, particularly if it was 
on a raid0 which declares at least that instance of the data to be not 
really worth the trouble of an attempt at recovery anyway, at least you 
can be glad you saved what your actions defined as most important, the 
time/resources/trouble to make that backup that you didn't have, because 
the value of the data wasn't worth having a backup. =:^)

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman
