On Thu, Jan 23, 2014 at 2:21 PM, Schlacta, Christ <aarc...@aarcane.org> wrote:
> What guarantees does Ceph place on data integrity? ZFS uses a Merkle tree to
> guarantee the integrity of all data and metadata on disk and will ultimately
> refuse to return "duff" data to an end user consumer.
>
> I know ceph provides some integrity mechanisms and has a scrub feature. Does
> it provide full data integrity guarantees in a manner similar to zfs? What
> will happen if duff data is encountered at runtime? What happens if all
> copies of some data are present, but damaged?
>
> Can Ceph provide these guarantees, whatever they are, even when the underlying
> storage provides no such guarantees (xfs, extN, reiser)?

Sadly, Ceph cannot do as well as ZFS can here. The relevant interfaces
to pass data integrity information through the whole stack simply
don't exist. :( What Ceph does do:
1) All data sent over the wire is checksummed (crc32) and validated so
we know that what we got is what was sent (there's a toy sketch of that
pattern after this list).
2) All data in the osd journals is checksummed (crc32) so if we crash
and restart we can validate that independently of what the disk/fs is
telling us.
3) All data is regularly deep-scrubbed (based on your configuration)
to compare checksums across replicas.
And if you're on btrfs or zfs, there's also 4) the local fs won't
return duff data.
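
For anyone who wants to see the checksum-then-validate pattern from
points 1 and 2 in the abstract, here's a tiny Python sketch. To be
clear, this is not Ceph's code (the real implementation is C++, and the
framing here is invented purely for illustration); it just shows the
idea of refusing to hand back data whose checksum doesn't match:

    import zlib

    def pack(payload: bytes) -> bytes:
        # Prepend a crc32 of the payload so the receiver (or a journal
        # replay) can validate it later.
        crc = zlib.crc32(payload)
        return crc.to_bytes(4, "big") + payload

    def unpack(frame: bytes) -> bytes:
        # Recompute the crc on the way back out and refuse to return
        # corrupted data.
        crc = int.from_bytes(frame[:4], "big")
        payload = frame[4:]
        if zlib.crc32(payload) != crc:
            raise IOError("crc mismatch: refusing to return duff data")
        return payload

    frame = pack(b"hello ceph")
    assert unpack(frame) == b"hello ceph"

The important bit is that the checksum travels with the data, so
corruption introduced anywhere in between gets caught at the point of
use rather than silently passed along.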

We'd love to do more, and we have experimented with it some: there is
a "sloppy crc" mechanism in the OSDs you can enable that tries to
track and validate what we've written to disk on our own, but it's
simply not suitable for production due to the extra disk traffic it
necessarily incurs. Until there's a way to get integrity information
straight from the client application all the way into the OSD data
store, though, there's not much point in pushing further.
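
(If you do want to experiment with the sloppy crc mechanism on a test
cluster, the knobs are FileStore options along these lines in
ceph.conf. I'm going from memory on the exact option names and the
block size default, so treat them as assumptions and confirm against
"ceph daemon osd.<id> config show" on your release before enabling
anything:

    [osd]
        # Experimental: track and verify crcs of FileStore writes.
        # Not production-safe; it adds the extra disk traffic noted above.
        filestore sloppy crc = true
        filestore sloppy crc block size = 65536
)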
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com