Budy, here are some links. Remember, the reason you see corrupted files is that ZFS detects the corruption. You probably had corruption earlier as well, but your hardware did not notice it. This is called silent corruption. ZFS is designed to detect and correct silent corruption, which no normal hardware is designed to do.
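To make the idea of detecting and correcting silent corruption concrete, here is a minimal sketch in Python. It is not how ZFS is implemented: the `FlakyDisk` class, the `verified_read` helper and the use of SHA-256 are all assumptions made up for illustration. The only point it tries to show is that a checksum computed while the data is still in RAM, and kept apart from the data block, lets a reader notice garbage that the disk returned as a "successful" read, and repair it from a second copy.

```python
import hashlib


def checksum(data):
    """Checksum computed while the data is still in RAM (SHA-256 as a stand-in)."""
    return hashlib.sha256(data).hexdigest()


class FlakyDisk:
    """Hypothetical disk that can return wrong bytes without reporting an error."""

    def __init__(self):
        self.blocks = {}

    def write(self, addr, data):
        self.blocks[addr] = data

    def read(self, addr):
        return self.blocks[addr]

    def corrupt(self, addr):
        """Flip one bit in place, simulating silent corruption."""
        block = bytearray(self.blocks[addr])
        block[0] ^= 0x01
        self.blocks[addr] = bytes(block)


def verified_read(mirrors, addr, expected):
    """Return the first copy whose checksum matches and rewrite the bad copies."""
    good = None
    for disk in mirrors:
        data = disk.read(addr)
        if checksum(data) == expected:
            good = data
            break
    if good is None:
        # Detection without redundancy: we know the data is bad but cannot fix it.
        raise IOError("all copies fail the checksum")
    for disk in mirrors:
        if disk.read(addr) != good:
            disk.write(addr, good)  # repair the corrupt copy from the good one
    return good


disk_a, disk_b = FlakyDisk(), FlakyDisk()
payload = b"important data"
expected = checksum(payload)  # kept apart from the data block itself
for d in (disk_a, disk_b):
    d.write(0, payload)

disk_a.corrupt(0)             # the disk still reports a successful read
plain = disk_a.read(0)        # an ordinary read silently returns garbage
healed = verified_read([disk_a, disk_b], 0, expected)
```

With a single disk the same check would still detect the bad block, it just could not repair it, which is why detection and correction are separate steps.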
The thing is, ZFS does end-to-end checksumming. Is the data in RAM identical to the data on disk? Data travels from RAM down to the controller and on to the disk, and errors can occur when it passes between these realms. Normally there are checksums within each realm (for example, checksums on the disk), but no checksum that spans the chain from beginning to end, that is, no end-to-end checksum:
http://jforonda.blogspot.com/2007/01/faulty-fc-port-meets-zfs.html

Here are some links. CERN did a data integrity survey on 3000 hardware RAIDs and saw silent corruption:
http://storagemojo.com/2007/09/19/cerns-data-corruption-research/

In another CERN paper, they say "such data corruption is found in all solutions, no matter price (even very expensive Enterprise solutions)"!!! From that paper (can not find the link now):

"Conclusions
-silent corruptions are a fact of life
-first step towards a solution is detection
-elimination seems impossible
-existing datasets are at the mercy of Murphy
-correction will cost time AND money
-effort has to start now (if not started already)
-multiple cost-schemes exist
--trade time and storage space (à la Google)
--trade time and CPU power (correction codes)"

CERN writes: "checksumming - not necessarily enough", you need to use "end-to-end checksumming (ZFS has a point)".

Look at the specifications of a new enterprise SAS disk; typically they say "one irrecoverable error in 10^15 bits". 10^15 bits is only about 125 TB, so with today's large and fast RAIDs you reach 10^15 bits in a short time. Greenplum's database solution faces one such bit every 15 minutes:
http://queue.acm.org/detail.cfm?id=1317400

Ordinary filesystems such as XFS, ReiserFS, JFS, etc. do not protect your data, nor do they detect all errors (here is a PhD thesis link):
http://www.zdnet.com/blog/storage/how-microsoft-puts-your-data-at-risk/169

ZFS data integrity tested by researchers:
http://www.zdnet.com/blog/storage/zfs-data-integrity-tested/811?tag=rbxccnbzd1
(if they had run a ZFS raid, ZFS would have corrected all artificially injected errors.
Now, ZFS only detected all errors - which is very difficult to do. The first step is detection, then repairing the errors.)

Companies try to hide silent corruption:
http://www.enterprisestorageforum.com/sans/features/article.php/3704666

http://www.miracleas.com/BAARF/RAID5_versus_RAID10.txt
"When a drive returns garbage, since RAID5 does not EVER check parity on read (RAID3 & RAID4 do BTW and both perform better for databases than RAID5 to boot) if you write a garbage sector back garbage parity will be calculated and your RAID5 integrity is lost! Similarly if a drive fails and one of the remaining drives is flaky the replacement will be rebuilt with garbage also propagating the problem to two blocks instead of just one."

http://kernel.org/pub/linux/kernel/people/hpa/raid6.pdf
"The paper explains that the best RAID-6 can do is use probabilistic methods to distinguish between single and dual-disk corruption, eg. "there are 95% chances it is single-disk corruption so I am going to fix it assuming that, but there are 5% chances I am going to actually corrupt more data, I just can't tell". I wouldn't want to rely on a RAID controller that takes gambles :-)"

Researchers write regarding hardware RAID:
http://www.cs.wisc.edu/adsl/Publications/parity-fast08.html
"We use the model checker to evaluate a number of different approaches found in real RAID systems, focusing on parity-based protection and single errors. We find holes in all of the schemes examined, where systems potentially exposes data to loss or returns corrupt data to the user. In data loss scenarios, the error is detected, but the data cannot be recovered, while in the rest, the error is not detected and therefore corrupt data is returned to the user.
For example, we examine a combination of two techniques – block-level checksums (where checksums of the data block are stored within the same disk block as data and verified on every read) and write-verify (where data is read back immediately after it is written to disk and verified for correctness), and show that the scheme could still fail to detect certain error conditions, thus returning corrupt data to the user. We discover one particularly interesting and general problem that we call parity pollution. In this situation, corrupt data in one block of a stripe spreads to other blocks through various parity calculations. We find a number of cases where parity pollution occurs, and show how pollution can lead to data loss. Specifically, we find that data scrubbing (which is used to reduce the chances of double disk failures) tends to be one of the main causes of parity pollution."

http://www.cs.wisc.edu/adsl/Publications/corruption-fast08.pdf
"Detecting and recovering from data corruption requires protection techniques beyond those provided by the disk drive. In fact, basic protection schemes such as RAID [13] may also be unable to detect these problems. ... as we discuss later, checksums do not protect against all forms of corruption"

http://www.cs.wisc.edu/adsl/Publications/corrupt-mysql-icde10.pdf
"More reliable SCSI drives encounter fewer problems, but even within this expensive and carefully-engineered drive class, corruption still takes place." ... "Recent work has shown that even with sophisticated RAID protection strategies, the “right” combination of a single fault and certain repair activities (e.g., a parity scrub) can still lead to data loss [19]. Thus, while these schemes reduce the chances of corruption, the possibility still exists; any higher-level client of storage that is serious about managing data reliably must consider the possibility that a disk will return data in a corrupted form."
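The parity problems quoted above can be illustrated with a few lines of Python. This is only a toy sketch under stated assumptions: a single stripe of three data blocks plus one XOR parity block, a made-up helper `xor_blocks`, and a naive scrub that simply recomputes parity from whatever the data disks return. Real RAID implementations differ, and the papers analyse them with far more care; the sketch just shows why parity alone cannot tell which block is wrong.

```python
from functools import reduce


def xor_blocks(*blocks):
    """Byte-wise XOR of equally sized blocks (single-parity arithmetic)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


d0, d1, d2 = b"AAAA", b"BBBB", b"CCCC"
parity = xor_blocks(d0, d1, d2)

# Healthy stripe: any one lost block can be rebuilt from the others.
assert xor_blocks(d0, d2, parity) == d1

# Silent corruption: the disk returns garbage for d1 without reporting an error.
d1_bad = b"BBXB"

# Case 1 (assumed naive scrub): the scrub sees a parity mismatch and
# recomputes parity from the data blocks, trusting the corrupt d1.
polluted_parity = xor_blocks(d0, d1_bad, d2)
rebuilt_d1 = xor_blocks(d0, d2, polluted_parity)  # now yields the garbage

# Case 2: no scrub, but the disk holding d0 fails and is rebuilt from the
# remaining blocks, one of which is the corrupt d1.
rebuilt_d0 = xor_blocks(d1_bad, d2, parity)       # garbage spreads to d0
```

In case 1 the original d1 can no longer be recovered from the stripe at all; in case 2 the corruption has spread from one block to two, as the BAARF text describes. A checksum kept outside the stripe would have identified d1 as the bad block before either step.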
--
This message posted from opensolaris.org
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss