Orvar's post over in opensol-discuss has me thinking:
After reading the paper and looking at design docs, I'm wondering if there is some facility to allow for comparing data in the ARC to its corresponding checksum. That is, if I've got the data I want in the ARC, how can I be sure it's correct (and free of hardware memory errors)? I'd assume the way is to also store the checksums for all blocks and metadata being read/written in the ARC (which, of course, means that only so much RAM corruption can be compensated for), and do a validation every time that block is used/written from the ARC. You'd likely have to do constant metadata consistency checking, and likely have to hold multiple copies of metadata in-ARC to compensate for possible corruption. I'm assuming that this has at least been explored, right?
(The researchers used non-ECC RAM, so honestly, I think it's a bit unrealistic to expect that your car will win the Indy 500 if you put a Yugo engine in it. Normally, this problem is exactly what hardware ECC and memory scrubbing are for.)
I'm not saying that ZFS should consider doing this - doing a validation for in-memory data is non-trivially expensive in performance terms, and there's only so much you can do and still expect your machine to survive. I mean, I've used the old NonStop stuff, and yes, you can shoot them with a .45 and it likely will still run, but whacking them with a bazooka is still guaranteed to make them, well, Non-NonStop.
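To make the idea of validating cached data against a stored checksum concrete, here is a minimal Python sketch. It is purely an illustration under my own assumptions: the names ChecksummedCache, CacheCorruptionError and flip_bit are hypothetical, SHA-256 is just a convenient stand-in checksum, and none of this reflects how the ARC is actually implemented.

```python
import hashlib


class CacheCorruptionError(Exception):
    """Raised when a cached block no longer matches its stored checksum."""


class ChecksummedCache:
    """Toy in-memory block cache that keeps a checksum next to each block.

    Hypothetical illustration only; not the ZFS ARC.
    """

    def __init__(self):
        # block id -> (mutable data buffer, checksum taken at insert time)
        self._blocks = {}

    def put(self, block_id, data):
        buf = bytearray(data)
        self._blocks[block_id] = (buf, hashlib.sha256(buf).digest())

    def get(self, block_id):
        buf, stored = self._blocks[block_id]
        # Re-hash on every read: this is the per-access cost of validation.
        if hashlib.sha256(buf).digest() != stored:
            raise CacheCorruptionError(f"block {block_id!r} failed validation")
        return bytes(buf)

    def flip_bit(self, block_id, byte_index=0, bit=0):
        """Simulate a memory error by flipping one bit of the cached data."""
        buf, _ = self._blocks[block_id]
        buf[byte_index] ^= 1 << bit


if __name__ == "__main__":
    cache = ChecksummedCache()
    cache.put("blk0", b"important data")
    print(cache.get("blk0"))       # validates cleanly
    cache.flip_bit("blk0")         # pretend RAM corrupted the block
    try:
        cache.get("blk0")
    except CacheCorruptionError as err:
        print("detected:", err)
```

Note that the stored checksum sits in the same unprotected RAM as the data, so a flipped bit in the checksum is reported the same way as a flipped bit in the block, and the re-hash on every read is the performance cost mentioned above.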
-Erik

-------- Original Message --------
Subject: Re: [osol-discuss] Any news about 2010.3?
Date: Wed, 31 Mar 2010 01:06:45 PDT
From: Orvar Korvar <knatte_fnatte_tja...@yahoo.com>
To: opensolaris-disc...@opensolaris.org

If you value your data, you should reconsider. But if your data is not important, then skip ZFS.

File system data corruption test by researcher:
http://blogs.zdnet.com/storage/?p=169

ZFS data corruption test by researchers:
http://www.cs.wisc.edu/wind/Publications/zfs-corruption-fast10.pdf
--
This message posted from opensolaris.org
_______________________________________________
opensolaris-discuss mailing list
opensolaris-disc...@opensolaris.org

--
Erik Trimble
Java System Support
Mailstop: usca22-123
Phone: x17195
Santa Clara, CA
Timezone: US/Pacific (GMT-0800)

_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss