Goffredo Baroncelli wrote:
On 05/02/2018 06:55 PM, waxhead wrote:

So again, which problem would having the parity checksummed solve? To the best
of my knowledge, none. In any case the data is checksummed, so it is
impossible to return corrupted data (modulo bugs :-) ).

I am not a BTRFS dev, but this should be quite easy to answer. Unless you
checksum the parity, there is no way to verify that the data (parity) you
use to reconstruct other data is correct.

In any case you could catch that the computed data is wrong, because the data is
always checksummed. And in any case you must check the data against its
checksum.
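
As a rough illustration of the point above (a sketch, not btrfs code): with
single XOR parity, a block rebuilt from bad parity fails the data checksum,
so the corruption is caught even though the parity itself carries no
checksum. The block contents and helper names here are made up, and
zlib.crc32 is only a stand-in for whatever checksum the filesystem uses.

```python
import zlib


def xor_blocks(*blocks):
    """XOR equally sized blocks together (RAID5-style single parity)."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)


def verify(block, csum):
    """Check a block against its stored data checksum (CRC32 as a stand-in)."""
    return zlib.crc32(block) == csum


# Hypothetical two-data-block stripe plus one parity block.
d0 = b"AAAA"
d1 = b"BBBB"
parity = xor_blocks(d0, d1)
csum_d1 = zlib.crc32(d1)  # data checksum, stored apart from the stripe

# Case 1: d1 is lost and the parity is good, so the rebuild verifies.
rebuilt_ok = xor_blocks(d0, parity)

# Case 2: d1 is lost and the parity is silently corrupted (one byte flipped).
bad_parity = bytes([parity[0] ^ 0xFF]) + parity[1:]
rebuilt_bad = xor_blocks(d0, bad_parity)

print(verify(rebuilt_ok, csum_d1))   # rebuild from good parity
print(verify(rebuilt_bad, csum_d1))  # rebuild from corrupted parity
```

In the second case the data checksum only says that the rebuild failed. It
cannot say whether the parity or the surviving data block was the bad input.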

What if you lost an entire disk, or had corruption in both data AND checksum? How do you plan to safely reconstruct that without checksummed parity?

My point is that storing the checksum is a cost that you pay *every time*. 
Every time you update a part of a stripe you need to update the parity, and 
then in turn the parity checksum. It is not a problem of space occupied nor a 
computational problem. It is a problem of write amplification...
How much of a problem is this? No benchmarks have been run, since the feature is not yet there, I suppose.
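
For what it is worth, the write-amplification argument can be put into a toy
count. This is only a sketch under loud assumptions: it counts one logical
write per item touched when a single data block of a stripe is updated, and
it ignores COW metadata tree updates, batching and full-stripe writes. The
function name and the model are hypothetical, and it is not a benchmark.

```python
def writes_per_partial_update(n_parity=1, parity_checksummed=False):
    """Toy count of logical writes when ONE data block of a stripe changes.

    Assumed model (not measured): the data block, its checksum and every
    parity block are rewritten; each parity block gets one extra checksum
    write only if parity is checksummed.
    """
    writes = {
        "data": 1,
        "data_csum": 1,
        "parity": n_parity,
        "parity_csum": n_parity if parity_checksummed else 0,
    }
    return sum(writes.values()), writes


for label, n_parity in (("RAID5", 1), ("RAID6", 2)):
    plain, _ = writes_per_partial_update(n_parity, parity_checksummed=False)
    extra, _ = writes_per_partial_update(n_parity, parity_checksummed=True)
    print(f"{label}: {plain} writes without parity checksums, {extra} with")
```

Under this model every partial stripe update pays one extra write per parity
block. How much that matters in practice is exactly what a benchmark would
have to show.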


The only gain is to avoid trying to use the parity when
a) you need it (i.e. when the data is missing and/or corrupted)
I'm not sure I can make out your argument here, but with RAID5/6 you don't have another copy to restore from. You *have* to use the parity to reconstruct data, and it is a good thing if that parity can be trusted.

and b) it is corrupted.
But the likelihood of this case is very low. And you can catch it during the 
data checksum check (which has to be performed in any case!).

So on one side you have a *cost every time* (the write amplification); on the
other side you have a gain (CPU time) *only in the case* where the parity is
corrupted and you need it (e.g. scrub or corrupted data).

IMHO the cost is much higher than the gain, and the likelihood of the gain is
much lower than the likelihood (=100%, or always) of the cost.

Then run benchmarks and consider making parity checksums optional (but pretty please, dipped in syrup with sugar on top, keep it on by default).
