On 2018-01-23 19:44, Chris Murphy wrote:
On Tue, Jan 23, 2018 at 5:51 AM, Austin S. Hemmelgarn
<ahferro...@gmail.com> wrote:

This is extremely important to understand.  BTRFS and ZFS are essentially
the only filesystems available on Linux that actually validate things enough
to notice this reliably (ReFS on Windows probably does, and I think whatever
Apple is calling their new FS does too).

ReFS always checksums metadata, optionally can checksum data.
Good to know, I've not actually dealt with ReFS myself yet (we're mostly a Linux shop where I work, and the two Windows servers we do have aren't using ReFS simply because it wasn't beyond the technology preview level when we installed them and we don't want to screw anything up).

APFS is really vague on this front, it may be checksumming metadata,
it's not checksumming data and with no option to. Apple proposes their
branded storage devices do not return bogus data. OK so then why
checksum the metadata?
Even aside from the fact that it might be checksumming data, Apple's storage engineers are still smoking something pretty damn strong if they think that they can claim their storage devices _never_ return bogus data. Either they're running some kind of checksumming _and_ replication below the block layer in the storage device itself (which actually might explain the insane cost of at least one piece of their hardware), or they think they've come up with some fail-safe way to detect corruption and return errors reliably, and in either case things can still fail. I smell a potential future lawsuit in the works...
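
To make the "checksumming _and_ replication" idea concrete, here is a minimal Python sketch of a toy two-copy block store that verifies a CRC32 on every read and repairs a bad copy from a good one. The class name `ChecksummedMirror` and its layout are hypothetical and purely illustrative; this is not how BTRFS, ZFS, or any Apple device actually implements it.

```python
import zlib


class ChecksummedMirror:
    """Toy two-copy block store with one CRC32 per block (illustration only)."""

    def __init__(self):
        self.copies = [{}, {}]   # two independent "devices": block number -> bytes
        self.sums = {}           # block number -> CRC32 recorded at write time

    def write(self, blockno, data):
        # Record the checksum separately from the data, then write both copies.
        self.sums[blockno] = zlib.crc32(data)
        for copy in self.copies:
            copy[blockno] = data

    def read(self, blockno):
        expected = self.sums[blockno]
        for copy in self.copies:
            data = copy[blockno]
            if zlib.crc32(data) == expected:
                # Found a copy that matches its checksum: rewrite every copy
                # from it so that a bad copy is repaired, then return the data.
                for other in self.copies:
                    other[blockno] = data
                return data
        # No copy matches its checksum: report an error instead of bogus data.
        raise OSError(f"block {blockno}: all copies failed checksum")
```

Without the checksum, a read of the first copy would hand back whatever the device returned. Without the second copy, the corruption could be detected but not repaired. Even with both, the read can still fail when every copy is bad, which is the "things can still fail" case.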

Even if ext4 did notice it, it
would just mark the filesystem for a check and then keep going without doing
anything else about it (seriously, the default behavior for internal errors
on ext4 is to just continue like nothing happened and mark the FS for fsck).

I haven't used ext4 with metadata checksumming enabled, and have no
idea how it behaves when it starts encountering checksum errors during
normal use. For sure XFS will complain a lot and will go read only
when it gets confused. I'd expect any file system going to the trouble
of checksumming would have to have some means of bailing out, rather
than just continuing on.
Actually, I forgot about the (newer) metadata checksumming feature in ext4, and was just basing my statement on its behavior the last time I used it for anything serious. Having just checked mkfs.ext4, it appears that the field in the SB that tells the kernel what to do when it runs into an error on the FS still defaults to continuing on as if nothing happened, even if you enable metadata checksumming (which still seems to be disabled by default). Whether or not that is actually honored by modern kernels, I don't know, but I've seen no evidence to suggest that it isn't.
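
For anyone who wants to check what a given filesystem's superblock says, here is a rough Python sketch that decodes the error-behavior field and the metadata checksum feature bit from a raw ext4 superblock. The field offsets and constants follow the documented ext4 on-disk layout as I understand it; the function names are my own, and the `errors=` mount option can override the on-disk setting, so treat the result as the stored default only.

```python
import struct

SUPERBLOCK_OFFSET = 1024          # primary superblock starts 1024 bytes into the device
SUPERBLOCK_SIZE = 1024
EXT4_MAGIC = 0xEF53               # s_magic, little-endian 16-bit at offset 0x38
RO_COMPAT_METADATA_CSUM = 0x400   # bit in s_feature_ro_compat (32-bit at offset 0x64)
ERROR_POLICIES = {1: "continue", 2: "remount-ro", 3: "panic"}  # s_errors at 0x3C


def parse_error_policy(sb):
    """Decode the error policy and metadata_csum flag from raw superblock bytes."""
    (magic,) = struct.unpack_from("<H", sb, 0x38)
    if magic != EXT4_MAGIC:
        raise ValueError("not an ext2/3/4 superblock")
    (errors,) = struct.unpack_from("<H", sb, 0x3C)
    (ro_compat,) = struct.unpack_from("<I", sb, 0x64)
    return {
        "errors": ERROR_POLICIES.get(errors, "unknown"),
        "metadata_csum": bool(ro_compat & RO_COMPAT_METADATA_CSUM),
    }


def read_superblock(path):
    """Read the primary superblock from a device or image file (hypothetical helper)."""
    with open(path, "rb") as f:
        f.seek(SUPERBLOCK_OFFSET)
        return f.read(SUPERBLOCK_SIZE)
```

Something like `parse_error_policy(read_superblock("/path/to/image"))` on a filesystem image (or on a block device, as root) should then show whether the stored policy is "continue" and whether metadata checksumming is enabled.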

Btrfs (and maybe ZFS) COW everything except supers. So ostensibly a
future feature might let them continue on with a kind of
integrated/single volume variation on seed/sprout device. I'd like to
see something like this just for undoable and testable offline
repairs, rather than offline repair only being predicated on
overwriting metadata.
Agreed.
