On 08/14/2017 09:08 AM, Qu Wenruo wrote:
>
>>
>> Supposing to log for each transaction BTRFS which "data NOCOW blocks" will
>> be updated and their checksum, in case a transaction is interrupted you know
>> which blocks have to be checked and are able to verify if the checksum
>> matches and correct the mismatch. Logging also the checksum could help to
>> identify if:
>> - the data is old
>> - the data is updated
>> - the updated data is correct
>>
>> The same approach could be used also to solving also the issue related to
>> the infamous RAID5/6 hole: logging which block are updated, in case of
>> transaction aborted you can check the parity which have to be rebuild.
> Indeed Liu is using journal to solve RAID5/6 write hole.
>
> But to address the lack-of-journal nature of btrfs, he introduced a journal
> device to handle it, since btrfs metadata is either written or trashed, we
> can't rely existing btrfs metadata to handle journal.

Liu's solution is a lot heavier. With Liu's solution, you need to write both
the data and the parity twice. I am only suggesting to track which blocks are
going to be updated, and this would only be needed for the stripes involved in
a RMW cycle. This is a lot less data to write (8 bytes vs 4 KB).

>
> PS: This reminds me why ZFS is still using journal (called ZFS intent log)
> but not mandatory metadata CoW of btrfs.

From a theoretical point of view, if you have a "PURE" COW file-system, you
don't need a journal. Unfortunately a RAID5/6 stripe update is a RMW cycle, so
you need a journal to keep it in sync. The same is true for NOCOW files (and
their checksums).

>
> Thanks,
> Qu

-- 
gpg @keyserver.linux.it: Goffredo Baroncelli <kreijackATinwind.it>
Key fingerprint BBF5 1610 0B64 DAC6 5F7D  17B2 0EDA 9B37 8B82 E0B5
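
To make the "log the blocks to be updated and their checksum" idea more
concrete, here is a minimal sketch in Python. It is not btrfs code: IntentLog,
record(), commit() and check_after_crash() are hypothetical names, zlib.crc32
stands in for the real checksum, and it assumes the checksum of the old
content is also available (here it is kept in the log entry for simplicity).

```python
import zlib
from enum import Enum

BLOCK_SIZE = 4096  # assumed block size, for illustration only


class State(Enum):
    OLD = "old"          # the in-place write never reached the disk
    UPDATED = "updated"  # the write completed, data matches the new checksum
    CORRUPT = "corrupt"  # torn write: data matches neither checksum


def csum(data: bytes) -> int:
    # Stand-in checksum; any checksum function would do for the sketch.
    return zlib.crc32(data)


class IntentLog:
    """Hypothetical per-transaction log of blocks that will be overwritten."""

    def __init__(self):
        # block number -> (checksum of old data, checksum of new data)
        self.entries = {}

    def record(self, blockno: int, old_data: bytes, new_data: bytes) -> None:
        # Must be made durable BEFORE the block is overwritten in place.
        self.entries[blockno] = (csum(old_data), csum(new_data))

    def commit(self) -> None:
        # Transaction completed: the logged blocks no longer need checking.
        self.entries.clear()


def check_after_crash(log: IntentLog, disk: dict) -> dict:
    """Classify only the blocks named in the log of the interrupted transaction."""
    result = {}
    for blockno, (old_csum, new_csum) in log.entries.items():
        actual = csum(disk[blockno])
        if actual == new_csum:
            result[blockno] = State.UPDATED
        elif actual == old_csum:
            result[blockno] = State.OLD
        else:
            result[blockno] = State.CORRUPT
    return result
```

Only the blocks listed in the log have to be read back after an interrupted
transaction, and each log entry is a block number plus checksums rather than a
full copy of the data. For RAID5/6 the same list would tell which stripes need
their parity checked.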