Re: Scrubbing with BTRFS Raid 5
Graham Fleming posted on Sun, 19 Jan 2014 16:53:13 -0800 as excerpted:

> From the wiki, I see that scrubbing is not supported on a RAID 5 volume.
>
> Can I still run the scrub routine (maybe read-only?) to check for any
> issues? I understand that at this point, running a 3.12 kernel, there
> are no routines to fix parity issues with RAID 5 while scrubbing, but I
> just want to know that a) I'm not causing any harm by running the scrub
> on a RAID 5 volume, and b) it's actually going to provide me with
> useful feedback (i.e. "file X is damaged").

This isn't a direct answer to your question, but it answers a somewhat
more basic one: btrfs raid5/6 isn't ready for use in a live environment
yet, period. It is only for testing, where the reliability of the data
beyond the test doesn't matter.

It works as long as everything works normally, writing out the parity
blocks as well as the data, but besides scrub not yet being implemented,
neither is recovery from loss of a device, nor from an out-of-sync-state
power-off. Since the whole /point/ of raid5/6 is recovery from device
loss, without that it's simply a less efficient raid0, which accepts the
risk of full data loss if a device is lost in order to gain the higher
throughput of N-way data striping.

So in practice, at this point, if you're willing to accept loss of all
data and want the higher throughput, you'd use raid0 or perhaps single
mode instead; if not, you'd use raid1 or raid10 mode.

-- 
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master -- and if you use the
program, he is your master." Richard Stallman

-- 
To unsubscribe from this list: send the line "unsubscribe linux-btrfs"
in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: Scrubbing with BTRFS Raid 5
Thanks for all the info, guys. I ran some tests on the latest 3.12.8
kernel. I set up three 1 GB files, attached them to /dev/loop{1..3}, and
created a btrfs RAID 5 volume with them.

I copied some data (from /dev/urandom) into two test files, got their
MD5 sums, and saved them to a text file.

I then unmounted the volume, trashed Disk3, and created a new Disk4
file, attached to /dev/loop4.

I mounted the btrfs RAID 5 volume degraded and the MD5 sums were fine. I
added /dev/loop4 to the volume and then deleted the missing device, and
it rebalanced. I had data spread out on all three devices now. MD5 sums
were unchanged on the test files.

This, to me, implies btrfs RAID 5 is working quite well and I can, in
fact, replace a dead drive. Am I missing something?
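For anyone who wants to repeat the test, here is a sketch along the lines Graham describes. The workdir layout, the RUN_BTRFS_STEPS guard, and the /mnt mountpoint are assumptions of the sketch, not from the original post; the privileged btrfs steps only run when explicitly enabled as root.

```shell
#!/bin/sh
# Sketch of the loop-device RAID 5 replacement test described above.
# The checksum bookkeeping runs anywhere; the btrfs steps need root and
# btrfs-progs, so this sketch gates them behind RUN_BTRFS_STEPS=1 (the
# guard variable and /mnt mountpoint are assumptions, not from the post).
workdir=$(mktemp -d)
for i in 1 2 3; do truncate -s 1G "$workdir/disk$i"; done

# Test data and its saved checksum, for comparison after the rebuild.
dd if=/dev/urandom of="$workdir/testfile" bs=1M count=4 2>/dev/null
(cd "$workdir" && md5sum testfile > sums.txt)

if [ "${RUN_BTRFS_STEPS:-0}" = 1 ]; then
    for i in 1 2 3; do losetup "/dev/loop$i" "$workdir/disk$i"; done
    mkfs.btrfs -d raid5 -m raid5 /dev/loop1 /dev/loop2 /dev/loop3
    mount /dev/loop1 /mnt
    cp "$workdir/testfile" /mnt/
    umount /mnt

    # Simulate a dead Disk3, attach a fresh Disk4, and rebuild.
    losetup -d /dev/loop3
    truncate -s 1G "$workdir/disk4"
    losetup /dev/loop4 "$workdir/disk4"
    mount -o degraded /dev/loop1 /mnt
    btrfs device add /dev/loop4 /mnt
    btrfs device delete missing /mnt    # triggers the rebalance

    (cd /mnt && md5sum -c "$workdir/sums.txt")  # sums should be unchanged
fi
```

Note that this is the clean-unmount best case; as the follow-ups below the original test point out, it says nothing about what happens when a device disappears mid-write.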
Re: Scrubbing with BTRFS Raid 5
Graham Fleming posted on Tue, 21 Jan 2014 01:06:37 -0800 as excerpted:

> Thanks for all the info, guys.
>
> I ran some tests on the latest 3.12.8 kernel. [snip test description]
>
> This, to me, implies btrfs RAID 5 is working quite well and I can, in
> fact, replace a dead drive.
>
> Am I missing something?

What you're missing is that device death and replacement rarely happen
as neatly as in your test (clean unmounts and all, no middle-of-process
power loss, etc). You tested the best case, not real life or the worst
case.

Try that again: set up the raid5, start a big write to it, disconnect
one device in the middle of that write (I'm not sure whether just
dropping the loop works, or whether the kernel gracefully shuts down the
loop device), then unplug the system without unmounting... and /then/
see what sense btrfs can make of the resulting mess.

In theory, with an atomic-write btree filesystem such as btrfs, even
that should work fine, minus perhaps the last few seconds of file-write
activity; the filesystem should remain consistent through a degraded
remount and device add, device remove, and rebalance, even if another
power-pull happens in the middle of /that/.

But given btrfs' raid5 incompleteness, I don't expect that will work.

-- 
Duncan - List replies preferred. No HTML msgs.
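The mid-write failure test Duncan proposes could be sketched like this. The function name, mountpoint, and loop-device numbers are assumptions, and detaching the loop plus SIGKILL-ing the writer only approximates a real power pull (a true test would cut power to the machine).

```shell
# Sketch of the mid-write failure-injection test described above.
# Assumes a raid5 filesystem on /dev/loop1..4 mounted at /mnt; run as
# root. These names are assumptions of the sketch, not from the thread.
inject_midwrite_failure() {
    dd if=/dev/zero of=/mnt/bigfile bs=1M count=2048 &
    writer=$!
    sleep 2
    losetup -d /dev/loop3       # yank one member mid-write
    kill -9 "$writer"           # crude stand-in for the power pull
    umount -l /mnt
    # ...and /then/ see what btrfs makes of the mess:
    mount -o degraded /dev/loop1 /mnt
    btrfs device add /dev/loop4 /mnt
    btrfs device delete missing /mnt
}
```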
Re: Scrubbing with BTRFS Raid 5
Would it be reasonably accurate to say "btrfs' RAID5 implementation is
likely working well enough and safe enough if you are backing up
regularly, and are willing and able to restore from backup if a device
failure goes horribly wrong", then?

This is a reasonably serious question. My typical scenario runs along
the lines of two identical machines with regular filesystem replication
between them; in the event of something going horribly, horribly wrong
with the production machine, I just spin up services on the replicated
machine, making it "production", and then deal with the broken one at
relative leisure.

If the worst thing wrong with RAID5/6 in current btrfs is "might not
deal as well as you'd like with a really nasty example of single-drive
failure", that would likely be livable for me.

On 01/21/2014 12:08 PM, Duncan wrote:
> What you're missing is that device death and replacement rarely happens
> as neatly as your test (clean unmounts and all, no middle-of-process
> power-loss, etc). You tested best-case, not real-life or worst-case.
> [snip]
> But given btrfs' raid5 incompleteness, I don't expect that will work.
Re: Scrubbing with BTRFS Raid 5
On Jan 21, 2014, at 10:18 AM, Jim Salter wrote:

> Would it be reasonably accurate to say "btrfs' RAID5 implementation is
> likely working well enough and safe enough if you are backing up
> regularly and are willing and able to restore from backup if necessary
> if a device failure goes horribly wrong", then?

It's for testing purposes. If you really want to commit a production
machine to testing a file system, and you're prepared to lose 100% of
the changes since the last backup, OK, do that.

> If the worst thing wrong with RAID5/6 in current btrfs is "might not
> deal as well as you'd like with a really nasty example of single-drive
> failure", that would likely be livable for me.

That was just one hypothetical scenario; it's not the only one. If it's
really, truly, seriously being tested, eventually you'll break it.

Chris Murphy
RE: Scrubbing with BTRFS Raid 5
Thanks again for the added info; very helpful. I want to keep playing
around with btrfs RAID 5 and testing with it. Assuming I have a drive
with bad blocks, or let's say some inconsistent parity, am I right in
assuming that a) a btrfs scrub operation will not fix the stripes with
bad parity, and b) a balance operation will not be successful? Or would
a balance operation work to rewrite the parity?
Re: Scrubbing with BTRFS Raid 5
There are different values of "testing" and of "production" - in my
world, at least, they're not atomically defined categories. =)

On 01/21/2014 12:38 PM, Chris Murphy wrote:
> It's for testing purposes. If you really want to commit a production
> machine to testing a file system, and you're prepared to lose 100% of
> the changes since the last backup, OK, do that.
Re: Scrubbing with BTRFS Raid 5
Graham Fleming posted on Tue, 21 Jan 2014 10:03:26 -0800 as excerpted:

> I want to keep playing around with btrfs RAID 5 and testing with it...
> assuming I have a drive with bad blocks, or let's say some inconsistent
> parity, am I right in assuming that a) a btrfs scrub operation will not
> fix the stripes with bad parity

What I know is that btrfs scrub is said not to work with btrfs raid5/6
yet. I don't know how it actually fails (though I'd hope it simply
returns an error to the effect that it doesn't work with raid5/6 yet),
as I've not actually tried that mode here.

> and b) a balance operation will not be successful? Or would a balance
> operation work to rewrite the parity?

Balance actually rewrites everything (everything matching its filters if
a filtered balance is used; everything, period, if not), so it should
rewrite parity correctly.

AFAIK, all the writing works, and routine reading works. It's the error
recovery that's still only partially implemented. Since reading just
reads the data, not the parity (unless there's a dropped device or the
like to recover from), as long as all devices are active and there's a
good copy of the data (based on btrfs checksumming) to read, the
rebalance should just use and rewrite that, ignoring the bad parity.

-- 
Duncan - List replies preferred. No HTML msgs.
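A minimal sketch of the balance-based parity rewrite described above. The helper name and the /mnt example are assumptions; `btrfs balance start` with no filters is the unfiltered, rewrite-everything balance Duncan refers to.

```shell
# Rewrite every chunk (and thus recompute parity) with an unfiltered
# balance. The function name and the /mnt example are assumptions of
# this sketch; the commands themselves are standard btrfs-progs.
rewrite_parity_via_balance() {
    mnt=$1
    btrfs balance start "$mnt"      # unfiltered: rewrites all chunks
    btrfs balance status "$mnt"     # show progress / completion
}
# Example (run as root): rewrite_parity_via_balance /mnt
```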
Re: Scrubbing with BTRFS Raid 5
Jim Salter posted on Tue, 21 Jan 2014 12:18:01 -0500 as excerpted:

> Would it be reasonably accurate to say "btrfs' RAID5 implementation is
> likely working well enough and safe enough if you are backing up
> regularly and are willing and able to restore from backup if necessary
> if a device failure goes horribly wrong", then?

I'd say (and IIRC I did say somewhere, though I don't remember if it was
in this thread) that in reliability terms, btrfs raid5 should be treated
like btrfs raid0 at this point. Raid0 is well known to have absolutely
no failover: if a device fails, the raid is toast. It's possible
so-called "extreme measures" may recover data from the surviving bits
(think the $expen$ive$ $ervice$ of data recovery firms), but the idea is
that either no data that isn't easily replaced is stored on a raid0 in
the first place, or, if it is, there's a (tested, recoverable) backup,
to the level that you're fully comfortable with losing EVERYTHING not
backed up.

Examples of good data for raid0 are the kernel sources (as a user, not a
dev, so you're not hacking on them), your distro's local package cache,
browser cache, etc. By definition, all those examples have the net as
their backup, so loss of a local copy means, at worst, a bit more to
download.

That's what btrfs raid5/6 are at the moment: effectively raid0 from a
recovery perspective. Now, the parity /is/ being written; it simply
can't be treated as available for recovery. So supposing you do /not/
lose a device (or suffer a bad checksum) on the raid5 until after the
recovery code is complete and available, you've effectively gotten a
"free" upgrade from raid0 reliability to raid5 reliability as soon as
recovery is possible, which will be nice, and meanwhile you can test the
operational functionality. So there /are/ reasons you might want to run
btrfs raid5 mode now.

As long as you remember that it's currently, effectively, raid0 should
something go wrong, and you either don't use it for valuable data in the
first place, or you're willing to do without any updates to that data
since the last tested backup, should it come to that.

-- 
Duncan - List replies preferred. No HTML msgs.
Re: Scrubbing with BTRFS Raid 5
On Tue, 2014-01-21 at 17:08 +0000, Duncan wrote:
> Graham Fleming posted on Tue, 21 Jan 2014 01:06:37 -0800 as excerpted:
> [snip test description]
>
> What you're missing is that device death and replacement rarely happens
> as neatly as your test (clean unmounts and all, no middle-of-process
> power-loss, etc). You tested best-case, not real-life or worst-case.
> [snip]
> But given btrfs' raid5 incompleteness, I don't expect that will work.

raid5/6 deals with IO errors from one or two drives, and it is able to
reconstruct the parity from the remaining drives and give you good data.

If we hit a crc error, the raid5/6 code will try a parity
reconstruction to make good data, and if we find good data from the
other copy, it'll return that up to userland.

In other words, for those cases it works just like raid1/10. What it
won't do (yet) is write that good data back to the storage. It'll stay
bad until you remove the device or run balance to rewrite everything.

Balance will reconstruct parity to get good data as it balances. This
isn't as useful as scrub, but that work is coming.

-chris
Re: Scrubbing with BTRFS Raid 5
On Wed, Jan 22, 2014 at 12:45 PM, Chris Mason wrote:
> On Tue, 2014-01-21 at 17:08 +0000, Duncan wrote:
> [snip]
>
> raid5/6 deals with IO errors from one or two drives, and it is able to
> reconstruct the parity from the remaining drives and give you good data.
>
> If we hit a crc error, the raid5/6 code will try a parity
> reconstruction to make good data, and if we find good data from the
> other copy, it'll return that up to userland.
>
> In other words, for those cases it works just like raid1/10. What it
> won't do (yet) is write that good data back to the storage. It'll stay
> bad until you remove the device or run balance to rewrite everything.
>
> Balance will reconstruct parity to get good data as it balances. This
> isn't as useful as scrub, but that work is coming.

That is awesome!

What about online conversion from not-raid5/6 to raid5/6? What is the
status of that code? For example, what happens if there is a failure
during the conversion, or a reboot?
Re: Scrubbing with BTRFS Raid 5
On Wed, 2014-01-22 at 13:06 -0800, ronnie sahlberg wrote:
> On Wed, Jan 22, 2014 at 12:45 PM, Chris Mason wrote:
> [snip]
> > Balance will reconstruct parity to get good data as it balances. This
> > isn't as useful as scrub, but that work is coming.
>
> That is awesome!
>
> What about online conversion from not-raid5/6 to raid5/6? What is the
> status of that code? For example, what happens if there is a failure
> during the conversion, or a reboot?

The conversion code uses balance, so that works normally. If there is a
failure during the conversion, you'll end up with some things raid5/6
and some things at whatever other level you used.

The data will still be there, but you are more prone to enospc
problems ;)

-chris
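The conversion-by-balance path Chris describes can be sketched with the standard balance convert filters. The helper name and the /mnt example are assumptions of the sketch; `-dconvert`/`-mconvert` are the btrfs-progs balance filters for the data and metadata chunk profiles.

```shell
# Convert an existing filesystem's data and metadata chunks to raid5
# via a balance with convert filters. The helper name and /mnt example
# are assumptions of this sketch.
convert_to_raid5() {
    mnt=$1
    btrfs balance start -dconvert=raid5 -mconvert=raid5 "$mnt"
}
# If the balance is interrupted (crash, reboot), chunks already balanced
# stay raid5 and the rest keep the old profile; re-running the same
# command (given enough free space) finishes the conversion.
# Example (run as root): convert_to_raid5 /mnt
```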
Re: Scrubbing with BTRFS Raid 5
On Wed, Jan 22, 2014 at 1:16 PM, Chris Mason wrote:
> On Wed, 2014-01-22 at 13:06 -0800, ronnie sahlberg wrote:
> [snip]
> > What about online conversion from not-raid5/6 to raid5/6? What is the
> > status of that code? For example, what happens if there is a failure
> > during the conversion, or a reboot?
>
> The conversion code uses balance, so that works normally. If there is a
> failure during the conversion, you'll end up with some things raid5/6
> and some things at whatever other level you used.
>
> The data will still be there, but you are more prone to enospc
> problems ;)

Ok, but if there is enough space, you could just restart the balance and
it will eventually finish, and all should, with some luck, be ok?

Awesome. This sounds like things are a lot closer to raid5/6 being
fully operational than I realized.