On Thu, Sep 20, 2018 at 11:23 AM, Adrian Bastholm <adr...@javaguru.org> wrote:
> On Mon, Sep 17, 2018 at 2:44 PM Qu Wenruo <quwenruo.bt...@gmx.com> wrote:
>
>>
>> Then I strongly recommend using the latest upstream kernel and progs
>> for btrfs (thus using Debian Testing).
>>
>> And if anything goes wrong, please report it ASAP to the mailing list.
>>
>> Especially for fs corruption, that's the ghost I'm always chasing.
>> So if any corruption happens again (though I hope it won't), I may
>> have a chance to catch it.
>
> You got it
>
>> Anyway, enjoy your stable fs even if it's not btrfs
>
> My new stable fs is too rigid. Can't grow it, can't shrink it, can't
> remove vdevs from it, so I'm planning a comeback to BTRFS. I guess
> after the dust settled I realized I like the flexibility of BTRFS.
>
> I'm back to btrfs.
>
>> From the code aspect, the biggest difference is the chunk layout.
>> Due to how ext* uses block groups, each block group header (except
>> for some sparse bgs) is always in use, so btrfs can't use that space.
>>
>> This leads to a highly fragmented chunk layout.
>
> The only thing I really understood is "highly fragmented" == not good.
> I might need to google these "chunk" thingies.

Chunks are synonymous with block groups. They're like a super extent,
or an extent of extents.

The block group is how Btrfs maps between the logical addresses used
almost everywhere in Btrfs land and the device + physical location of
extents. It's how a file can be referenced by logical address alone,
without needing to know where the extent is physically located or how
many copies there are. The block group allocation profile is what
determines whether a chunk has a single copy, duplicate copies, or
raid1, raid10, raid5, or raid6 copies, and where those copies are
located. It's also fundamental to how device add, remove, replace,
file system resize, and balance all interrelate.
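
You can see the profile per chunk type on any mounted file system,
e.g. (illustrative output from a raid1 example):

    $ btrfs filesystem df /mnt
    Data, RAID1: total=10.00GiB, used=8.21GiB
    System, RAID1: total=32.00MiB, used=16.00KiB
    Metadata, RAID1: total=1.00GiB, used=512.12MiB

Each of those lines is a set of chunks (block groups) with that
allocation profile.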


>> If your primary concern is to make the fs as stable as possible, then
>> keep snapshots to a minimal number, and avoid any functionality you
>> won't use, like qgroups, routine balance, or RAID5/6.
>
> So, is RAID5 stable enough? Reading the wiki, there's a big fat
> warning about some parity issues; I read an article about silent
> corruption (written a while back), and Chris says he can't recommend
> raid56 to mere mortals.

Depends on how you define stable. In recent kernels it's stable on
stable hardware, i.e. no lying hardware (it actually flushes when it
claims it has), no power failures, and no failed devices. Of course
it's designed to help protect against the clear loss of a device, but
there's a ton of stuff that's just not finished, including ejecting
bad devices from the array the way md and lvm raids will. Btrfs will
just keep retrying through all the failures. There are some patches to
moderate this, but I don't think they're merged yet.
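
In practice that means watching the error counters yourself, since the
array won't kick a misbehaving device on its own, e.g.:

    $ btrfs device stats /mnt

Nonzero write/read/flush/corruption/generation counts against one
device are your cue to investigate before depending on the array.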

You'd also want to be really familiar with how to handle degraded
operation, if you're going to depend on it, and how to replace a bad
device. Last I refreshed my memory on it, the advice was to use "btrfs
device add" followed by "btrfs device remove" for raid56, whereas
"btrfs replace" is preferred for all other profiles. I'm not sure if
the "btrfs replace" issues with parity raid were fixed.

Metadata as raid56 shows up in a lot more problem reports than
metadata raid1, so there's something goofy going on in those cases;
I'm not sure how well understood they are. Other people, though,
report no problems with it.
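
If you do try parity raid, the usual suggestion following from the
above is raid5 for data with raid1 for metadata, e.g. (illustrative,
assumes three blank devices):

    mkfs.btrfs -f -d raid5 -m raid1 /dev/sdb /dev/sdc /dev/sdd

or, on an existing file system, convert just the metadata with a
filtered balance:

    btrfs balance start -mconvert=raid1 /mnt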

It's worth looking through the archives on some of this. Btrfs raid56
isn't perfectly COW: there is read-modify-write code, which means
there can be overwrites. I vaguely recall that it's COW at the logical
layer, but the physical writes can end up being RMW rather than
strictly COW.
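
A simplified sketch of why that matters (my understanding, for a
three-device raid5 stripe holding D1, D2, and parity P):

    full-stripe write:    write D1, D2 and P = D1 xor D2 (COW-friendly)
    partial-stripe write: only D2 changes, so parity is recomputed and
                          rewritten in place:
                          P' = P xor D2_old xor D2_new   (RMW, an overwrite)

If that in-place parity update is interrupted, parity can end up
inconsistent with the data, which is the classic write hole.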



-- 
Chris Murphy
