On Mon, 2017-08-14 at 10:23 -0400, Austin S. Hemmelgarn wrote:
> Assume you have higher level verification.  Would you rather not be
> able 
> to read the data regardless of if it's correct or not, or be able to 
> read it and determine yourself if it's correct or not?

What would the difference be here, then, from the CoW + checksumming +
some-data-corruption case?!
btrfs would also give EIO, and all these applications you mention would
fail then.

As I've said previously, one could provide end users with the means to
still access the faulty data. Or they could simply mount with
nodatasum.
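
(For illustration, a minimal sketch in C of what any application sees
today when it hits a csum mismatch on a CoW file; the file name is made
up, and from userspace the mismatch is indistinguishable from a real
media error:)

    #include <errno.h>
    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>

    int main(void)
    {
            char buf[4096];
            ssize_t n;
            /* hypothetical file whose csum no longer verifies */
            int fd = open("somefile", O_RDONLY);

            if (fd < 0) {
                    perror("open");
                    return 1;
            }
            while ((n = read(fd, buf, sizeof(buf))) > 0)
                    ;  /* consume the file */
            if (n < 0 && errno == EIO)
                    /* csum mismatch or genuine media error -- the
                     * application cannot tell which, and cannot get
                     * at the data either way */
                    fprintf(stderr, "read: %s\n", strerror(errno));
            close(fd);
            return 0;
    }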




> For almost 
> anybody, the answer is going to be the second case, because the 
> application knows better than the OS if the data is correct (and 
> 'correct' may be a threshold, not some binary determination).
You've already made that claim once, with VMs and DBs, and it proved
simply wrong.

Most applications don't do this kind of verification.

And those that do probably just check whether the data is valid and,
if not, give an error or at best fall back to some automatic backup
(e.g. what package managers do).
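
(A minimal sketch of that verify-then-fall-back pattern in C, using
zlib's crc32(); the record layout and the second copy are made up for
the example:)

    #include <stdint.h>
    #include <stdio.h>
    #include <zlib.h>            /* link with -lz */

    /* Hypothetical record layout: the CRC32 of the payload is
     * computed and stored at write time. */
    struct record {
            unsigned char payload[512];
            uint32_t stored_crc;
    };

    int record_ok(const struct record *r)
    {
            return (uint32_t)crc32(0L, r->payload,
                                   sizeof(r->payload)) == r->stored_crc;
    }

    /* Verify the primary copy; if it's bogus, report it and fall back
     * to a backup copy -- roughly what a package manager does when a
     * downloaded file fails its checksum. */
    const struct record *pick_copy(const struct record *primary,
                                   const struct record *backup)
    {
            if (record_ok(primary))
                    return primary;
            fprintf(stderr, "primary copy failed verification\n");
            if (backup && record_ok(backup))
                    return backup;
            return NULL;  /* nothing valid left: error out */
    }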

I know of only a few programs that would really be capable of using
data they know is bogus and recovering from that automagically... the
only examples I know of are some archive formats which include
error-correcting codes. And I really mean using for recovery the very
blocks whose csums wouldn't verify (i.e. the ones that give an EIO)...
without ECCs, how would a program know what to do with such data?
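
(The simplest instance of that idea is plain XOR parity, the
degenerate erasure code: one extra parity block per group lets a
program rebuild exactly one block that came back as EIO. A toy sketch,
nothing like the stronger codes (e.g. Reed-Solomon) that real recovery
formats use:)

    #include <stddef.h>
    #include <string.h>

    #define BLKSZ 4096

    /* Rebuild one missing block from the surviving blocks of its
     * group plus the group's XOR parity:
     *     missing = parity ^ blk[0] ^ ... ^ blk[n-1]
     * Only a single failure per group is recoverable this way, which
     * is why real formats use stronger codes. */
    void xor_recover(unsigned char *out, const unsigned char *parity,
                     const unsigned char *const *surviving, size_t n)
    {
            memcpy(out, parity, BLKSZ);
            for (size_t i = 0; i < n; i++)
                    for (size_t j = 0; j < BLKSZ; j++)
                            out[j] ^= surviving[i][j];
    }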


I cannot imagine that many people would choose the second option, to
be honest.
Working with bogus data?! What would be the benefit of that?



>   At that 
> point, you need to make the checksum error a warning instead of 
> returning -EIO.  How do you intend to communicate that warning back
> to 
> the application?  The kernel log won't work, because on any
> reasonably 
> secure system it's not visible to anyone but root.

Still the same problem as with CoW + any data corruption...

>   There's also no side 
> channel for the read() system calls that you can utilize.  That then 
> means that the checksums end up just being a means for the
> administrator 
> to know some data wasn't written correctly, but they should know
> that 
> anyway because the system crashed.

No, they'd have no idea whether any data (or which data) was written
during the crash.



> Looking at this from a different angle: Without background, what
> would 
> you assume the behavior to be for this?  For most people, the
> assumption 
> would be that this provides the same degree of data safety that the 
> checksums do when the data is CoW.

I don't think the average user would have any such assumption. Most
people likely don't even know that there is implicitly no checksumming
if nodatacow is enabled.


What people may have heard, however, is that btrfs does checksumming,
and they'd assume that their filesystem always gives them valid data
(or an error)... and IMO that's actually what every modern fs should
do by default.
Relying on higher levels to provide such means is simply not realistic.



Cheers,
Chris.
