On Sat, Feb 06, 2010 at 09:22:57AM -0800, Richard Elling wrote:
> I'm interested in anecdotal evidence which suggests there is a
> problem as it is currently designed.

I like to look at it differently: I'm not sure if there is a problem. I'd like to have a simple way to discover a problem, using the work zfs is already doing for me.

So, I'd like two things from the "system" as a whole:

 - confidence that a send|recv which completes "successfully" has really delivered an exact copy.
 - verification that two datasets are the same, from a simple, quick, ideally cheap test.

I can get some way to the former from understanding of the mechanisms used and analysis of their protective coverage and reasoning about the possible failure modes. Having the latter gets me the rest of the way there, and even most of the way there by itself.

confidence < verification < assurance.

So, for example, in early tests with send|recv, I'm sure many of us have run "rsync -nc .." comparison runs over the results. That's easy, relatively quick, but not entirely as cheap as could be.

"It would be very nice" if there was a simple dataset fingerprint that depended, merkle-style, on the entire contents of the dataset (snapshot) below, and that could be easily compared on sender and receiver. This (together with scrub) would provide the desired assurance that the two are indeed the same.

Back to analysis and reasoning for a moment; I would have more confidence in send|recv if I knew the end-to-end protections extended to cover the on-disk checksums (since the on-disk copies are the important endpoints for this operation). I suspect this was a large part of the intent behind the OP's question.

As it stands from the current description, there are windows where errors might be introduced and not detected - in particular, if I have a protection gap via non-ECC RAM at either send or recv. I can cover many of the other gaps with pipeline tools, as discussed. This is a hard gap to cover, even for detection, without help from the actual zfs endpoints.
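
To make the merkle-style fingerprint idea above concrete, here is a rough userland sketch in Python. It is not anything zfs provides: the names file_digest and tree_fingerprint are made up for illustration, and it works on a mounted file tree rather than on zfs block pointers. It also has to re-read every file, so it costs about as much as the "rsync -nc" run, and it ignores permissions, ACLs and extended attributes. It only shows how one root digest can depend on everything below it.

```python
import hashlib
import os
import stat


def file_digest(path, chunk_size=1 << 20):
    """SHA-256 of a regular file's contents, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.digest()


def tree_fingerprint(path):
    """Merkle-style digest of a file, symlink, or directory tree.

    A directory's digest covers the sorted (name, child digest) pairs,
    so any change below propagates up to the root digest.
    Hypothetical helper for illustration; not part of zfs.
    """
    st = os.lstat(path)
    h = hashlib.sha256()
    if stat.S_ISDIR(st.st_mode):
        h.update(b"dir\0")
        for name in sorted(os.listdir(path)):
            child = tree_fingerprint(os.path.join(path, name))
            # Names cannot contain NUL and digests are fixed length,
            # so this encoding is unambiguous.
            h.update(os.fsencode(name) + b"\0" + child)
    elif stat.S_ISLNK(st.st_mode):
        h.update(b"link\0" + os.fsencode(os.readlink(path)))
    elif stat.S_ISREG(st.st_mode):
        h.update(b"file\0" + file_digest(path))
    else:
        # Devices, sockets, FIFOs: record only the type bits in this sketch.
        h.update(b"other\0" + str(stat.S_IFMT(st.st_mode)).encode())
    return h.digest()
```

One would call tree_fingerprint() on the same snapshot's directory on sender and receiver (for instance under the dataset's .zfs/snapshot directory) and compare the .hex() of the two results. A fingerprint computed by zfs itself from the checksums it already stores would presumably avoid the full re-read, which is the "cheap" part this sketch cannot deliver.
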

Of course there are conflicting requirements, since we also want send|recv to facilitate recompression, reblocking, changing checksum method, etc etc.

So let's turn the question around: what is the best way to verify that send|recv really has produced an identical copy?

--
Dan.
_______________________________________________ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss