On 08/14/2017 09:08 AM, Qu Wenruo wrote:
> 
>>
>> Suppose BTRFS logs, for each transaction, which "data NOCOW blocks" will 
>> be updated and their checksums. If a transaction is interrupted, you know 
>> which blocks have to be checked, and you are able to verify whether the 
>> checksum matches and correct the mismatch. Logging the checksum as well 
>> could help to identify if:
>> - the data is old
>> - the data is updated
>> - the updated data is correct
>>
>> The same approach could also be used to solve the issue related to the 
>> infamous RAID5/6 write hole: by logging which blocks are updated, in case 
>> of an aborted transaction you can check which parity has to be rebuilt.
> Indeed Liu is using journal to solve RAID5/6 write hole.
> 
> But to address the lack-of-journal nature of btrfs, he introduced a journal 
> device to handle it. Since btrfs metadata is either written or trashed, we 
> can't rely on existing btrfs metadata to handle the journal.

Liu's solution is a lot heavier: with it, you need to write both the data and 
the parity twice. I am only suggesting to track the blocks to be updated, and 
this would be needed only for the stripes involved in a RMW cycle. This is a 
lot less data to write (8 bytes vs 4 Kbytes).
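
To make the size argument concrete, here is a toy sketch of such an intent 
log. It is not btrfs code: the ToyRaid5 class, its in-memory "disks" and the 
8-byte entry layout are all assumptions made up for illustration. Each RMW 
appends only the stripe number to the log; after a crash, only the logged 
stripes have their parity rechecked and, if stale, rebuilt from the data.

```python
import struct
from functools import reduce

# Hypothetical log entry: one 8-byte little-endian stripe number.
ENTRY = struct.Struct("<Q")


def xor_parity(blocks):
    """RAID5-style parity: byte-wise XOR of all data blocks of a stripe."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


class ToyRaid5:
    """In-memory stand-in for an array; nothing here touches a real disk."""

    def __init__(self, n_stripes, n_data, block_size=4096):
        self.data = [[bytes(block_size) for _ in range(n_data)]
                     for _ in range(n_stripes)]
        self.parity = [bytes(block_size) for _ in range(n_stripes)]
        self.intent_log = bytearray()

    def rmw(self, stripe, idx, new_block, crash_before_parity=False):
        # 1. Log the intent first: 8 bytes, not a copy of data and parity.
        self.intent_log += ENTRY.pack(stripe)
        # 2. Update the data block in place.
        self.data[stripe][idx] = new_block
        if crash_before_parity:
            return  # simulated crash: parity is now stale, log entry survives
        # 3. Update the parity.
        self.parity[stripe] = xor_parity(self.data[stripe])

    def commit(self):
        # Once the transaction is committed the entries are no longer needed.
        self.intent_log.clear()

    def recover(self):
        """Recheck parity only for the logged stripes; return those rebuilt."""
        rebuilt = []
        for off in range(0, len(self.intent_log), ENTRY.size):
            (stripe,) = ENTRY.unpack_from(self.intent_log, off)
            expected = xor_parity(self.data[stripe])
            if self.parity[stripe] != expected:
                self.parity[stripe] = expected
                rebuilt.append(stripe)
        self.intent_log.clear()
        return rebuilt
```

Note that recover() assumes the data blocks themselves are readable, so it 
only closes the window where data and parity disagree; it does not tell you 
whether the data block holds the old or the new content.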

> 
> PS: This reminds me why ZFS still uses a journal (called the ZFS intent log) 
> rather than the mandatory metadata CoW of btrfs.

From a theoretical point of view, if you have a "PURE" COW file system, you 
don't need a journal. Unfortunately a RAID5/6 stripe update is a RMW cycle, so 
you need a journal to keep it in sync. The same is true for NOCOW files (and 
their checksums).
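
For the NOCOW case, a minimal sketch of how a logged checksum could separate 
the three situations (old data, updated data, damaged data) after an 
interrupted transaction. The function names are hypothetical, and zlib.crc32 
is only a stand-in here; btrfs uses crc32c by default, which is a different 
polynomial.

```python
import zlib


def csum(block):
    """Stand-in checksum (plain CRC-32, not the crc32c that btrfs uses)."""
    return zlib.crc32(block)


def classify_block(on_disk, committed_csum, logged_csum):
    """Compare the block found on disk with both known checksums.

    committed_csum: checksum from the last committed transaction.
    logged_csum:    checksum logged before the in-place (NOCOW) overwrite.
    """
    actual = csum(on_disk)
    if actual == logged_csum:
        return "updated"   # the overwrite completed; the checksum can be updated
    if actual == committed_csum:
        return "old"       # the overwrite never reached the disk
    return "corrupt"       # torn or damaged write; matches neither checksum
```

For example, a block that was only half overwritten when the crash happened 
matches neither checksum and is reported as "corrupt".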


> 
> Thanks,
> Qu


-- 
gpg @keyserver.linux.it: Goffredo Baroncelli <kreijackATinwind.it>
Key fingerprint BBF5 1610 0B64 DAC6 5F7D  17B2 0EDA 9B37 8B82 E0B5