On 2016-11-28 01:40, Qu Wenruo wrote:
>
> At 11/27/2016 07:16 AM, Goffredo Baroncelli wrote:
>> On 2016-11-26 19:54, Zygo Blaxell wrote:
>>> On Sat, Nov 26, 2016 at 02:12:56PM +0100, Goffredo Baroncelli wrote:
>>>> On 2016-11-25 05:31, Zygo Blaxell wrote:
>> [...]
>>>>
>>>> BTW, btrfs in RAID1 mode corrects the data even in the read case. So
>>>
>>> Have you tested this? I think you'll find that it doesn't.
>>
>> Yes, I tested it, and it does the rebuild automatically.
>> I corrupted one disk of the mirror, then read the related file. The log says:
>>
>> [ 59.287748] BTRFS warning (device vdb): csum failed ino 257 off 0 csum 12813760 expected csum 3114703128
>> [ 59.291542] BTRFS warning (device vdb): csum failed ino 257 off 0 csum 12813760 expected csum 3114703128
>> [ 59.294950] BTRFS info (device vdb): read error corrected: ino 257 off 0 (dev /dev/vdb sector 2154496)
>>                                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>>
>> IIRC, in the RAID5/6 case the last line is missing. In both cases the
>> data returned is good, but in RAID1 the data is also corrected on disk.
>>
>> Where did you read that the data is not rebuilt automatically?
>>
>> In fact I was surprised that RAID5/6 behaves differently....
>
> Yes, I also tried that and realized that RAID1 recovers corrupted data
> at *READ* time.
>
> The main difference between RAID1 and RAID56 seems to be the complexity.
>
> RAID56 has different read/write behavior: for reads, we use the flag
> BTRFS_RBIO_READ_REBUILD, which only rebuilds the data but does not write
> it to disk.
> And I'm a little concerned about a race between the read-time fix and a
> write.
>
> I assume it's possible to change the behavior to follow RAID1, but I'd
> like to do it in the following steps:
> 1) Fix known RAID56 bugs.
>    With the v3 patch and the previous 2 patches, it seems OK now.
> 2) Full fstests test cases, with all possible corruption combinations.
>    (WIP)
> 3) Rework the current RAID56 code into a cleaner and more readable state.
>    (long term)
> 4) Add support to fix things at read time.
>
> So the behavior change is not something we will see in the short term.
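To make the difference described above concrete, here is a minimal
user-space sketch of the two repair paths. It is NOT the actual btrfs
kernel code: the helpers read_copy(), csum_ok() and write_copy() are
hypothetical stand-ins for the real bio and checksum machinery, and
RAID56 reconstruction from parity is reduced to reading a second "copy".
On a checksum failure both profiles return the rebuilt data, but only
the RAID1 path writes the good copy back to disk.

/*
 * Sketch of read-time repair, not kernel code: read_copy(), csum_ok()
 * and write_copy() are invented stand-ins for the real machinery.
 */
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

enum profile { RAID1, RAID56 };

static char disk[2][16] = { "corrupted!!!", "good data ok" };

static bool csum_ok(const char *buf)            /* fake checksum check */
{
    return strcmp(buf, "good data ok") == 0;
}

static void read_copy(int copy, char *buf)      /* fake device read */
{
    memcpy(buf, disk[copy], sizeof disk[copy]);
}

static void write_copy(int copy, const char *buf)  /* fake device write */
{
    memcpy(disk[copy], buf, sizeof disk[copy]);
}

/* Fill buf with good data; repair the bad copy on disk only for RAID1. */
static int read_with_repair(enum profile prof, char *buf)
{
    read_copy(0, buf);
    if (csum_ok(buf))
        return 0;

    /* csum failed: rebuild from the other copy (parity, for RAID56) */
    read_copy(1, buf);
    if (!csum_ok(buf))
        return -1;                              /* both copies bad */

    if (prof == RAID1)
        write_copy(0, buf);          /* the "read error corrected" case */
    /* RAID56 (BTRFS_RBIO_READ_REBUILD): return data, leave disk as-is */
    return 0;
}

int main(void)
{
    char buf[16];

    read_with_repair(RAID56, buf);
    printf("RAID56: returned \"%s\", copy 0 on disk: \"%s\"\n", buf, disk[0]);

    read_with_repair(RAID1, buf);
    printf("RAID1:  returned \"%s\", copy 0 on disk: \"%s\"\n", buf, disk[0]);
    return 0;
}

Running it shows RAID56 leaving the corrupted copy on disk while RAID1
rewrites it, which matches the kernel logs quoted above.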
+1. My understanding is that the RAID5/6 code is in such bad shape that
we first need to fix the most critical bugs and then expand the tests to
prevent regressions.

On point 3, I don't know the code well enough to comment; it is very
complex.

I see point 4 as the least urgent.

Let me make a request: I would like to know your opinion of my email
"RFC: raid with a variable stripe size", which started a small thread.
I am asking because you now have your hands in this code: would my
suggestion (use different block groups with different stripe sizes, to
avoid RMW cycles) or Zygo's one (don't fill a stripe if you don't need
to, to avoid RMW cycles) be difficult to implement? (A toy sketch of
both ideas follows at the end of this mail.)

BR
G.Baroncelli

> Thanks,
> Qu

-- 
gpg @keyserver.linux.it: Goffredo Baroncelli <kreijackATinwind.it>
Key fingerprint BBF5 1610 0B64 DAC6 5F7D 17B2 0EDA 9B37 8B82 E0B5
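As a footnote to the question above, here is a toy model (plain
user-space C, not btrfs code) of the two RMW-avoidance ideas as I read
them; the layout numbers and the helper needs_rmw() are invented for
illustration. A write that covers whole stripes needs no read-modify-
write, because parity can be computed from the new data alone; routing a
small write to a block group with a narrower stripe (idea 1) or always
writing into a fresh stripe and leaving its tail unused (idea 2) both
turn partial-stripe writes into full-stripe ones.

/*
 * Toy model (not btrfs code) of the two RMW-avoidance ideas above.
 * Layout numbers and helper names are invented for illustration.
 */
#include <stdio.h>

#define STRIPE_LEN 65536L          /* per-disk stripe element: 64 KiB */

/*
 * A write needs read-modify-write unless it covers whole stripes:
 * only then can parity be computed from the new data alone.
 */
static int needs_rmw(long write_bytes, int data_disks)
{
    return write_bytes % (data_disks * STRIPE_LEN) != 0;
}

int main(void)
{
    long write_bytes = 2 * STRIPE_LEN;        /* a 128 KiB write */

    /* Default 4+1 RAID5 layout: full stripe is 256 KiB -> RMW. */
    printf("4 data disks: RMW = %d\n", needs_rmw(write_bytes, 4));

    /*
     * Idea 1 (variable stripe size): route the write to a block group
     * whose full stripe matches the write, e.g. a 2+1 group -> no RMW.
     */
    printf("2 data disks: RMW = %d\n", needs_rmw(write_bytes, 2));

    /*
     * Idea 2 (don't fill partial stripes): always allocate a fresh
     * stripe and never extend a partially filled one; the tail of the
     * stripe stays unused, trading some space for the RMW cycle.
     */
    long full = 4 * STRIPE_LEN;
    long unused = (full - write_bytes % full) % full;
    printf("fresh-stripe write leaves %ld bytes unused\n", unused);

    return 0;
}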