On 21 June, 2011 - Todd Urie sent me these 5,9K bytes:

> I have a zpool that shows the following from a zpool status -v <zpool name>
>
> brsnnfs0104 [/var/spool/cron/scripts]# zpool status -v ABC0101
>   pool: ABC0101
>  state: ONLINE
> status: One or more devices has experienced an error resulting in data
>         corruption. Applications may be affected.
> action: Restore the file in question if possible. Otherwise restore the
>         entire pool from backup.
>    see: http://www.sun.com/msg/ZFS-8000-8A
>  scrub: none requested
> config:
>
>         NAME                              STATE     READ WRITE CKSUM
>         ABC0101                           ONLINE       0     0    10
>           /dev/vx/dsk/ABC01dg/ABC0101_01  ONLINE       0     0     2
>           /dev/vx/dsk/ABC01dg/ABC0101_02  ONLINE       0     0     8
>           /dev/vx/dsk/ABC01dg/ABC0101_03  ONLINE       0     0    10
>
> errors: Permanent errors have been detected in the following files:
>
>         /clients/ABC0101/rep/local/bfm/web/htdocs/tmp/rscache/717b52282ea059452621587173561360
>         /clients/ABC0101/rep/local/bfm/web/htdocs/tmp/rscache/6e6a9f37c4d13fdb3dcb8649272a2a49
>         /clients/ABC0101/rep/d0/prod1/reports/ReutersCMOLoad/ReutersCMOLoad.ABCntss001.20110620.141330.26496.ROLLBACK_FOR_UPDATE_COUPONS.html
>         /clients/ABC0101/rep/local/bfm/web/htdocs/tmp/G2_0.related_detail_loader.1308593666.54643.n5cpoli3355.data
>         /clients/ABC0101/rep/d0/prod1/reports/gp_reports/ALLMNG/20110429/F_OLPO82_A.gp.ABCIM_GA.nlaf.xml.gz
>         /clients/ABC0101/rep/d0/prod1/reports/gp_reports/ALLMNG/20110429/UNVLXCIAFI.gp.ABCIM_GA.nlaf.xml.gz
>         /clients/ABC0101/rep/d0/prod1/reports/gp_reports/ALLMNG/20110429/UNIVLEXCIA.gp.BARCRATING_ABC.nlaf.xml.gz
>
> I think that a scrub at least has the possibility to clear this up. A quick
> search suggests that others have had some good experience with using scrub
> in similar circumstances. I was wondering if anyone could share some of
> their experiences, good and bad, so that I can assess the risk and
> probability of success with this approach. Also, any other ideas would
> certainly be appreciated.
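For reference, the scrub under discussion would be started and monitored like this (standard zpool commands, using the pool name from the status output above):

  # zpool scrub ABC0101
  # zpool status -v ABC0101

The "scrub:" line in the status output shows progress while the scrub runs and, once it completes, how many errors were found.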
As you have no ZFS-level redundancy, ZFS can only detect that some of the blocks delivered from the devices (SAN, I guess?) were corrupt according to their checksums. If you had raidz or a mirror in ZFS, it would have corrected the problem and written the repaired data back to the malfunctioning device. As it is, it cannot. A scrub only reads the data and verifies that it matches the checksums; on a pool like this it will find errors but not fix them.

/Tomas
-- 
Tomas Ögren, st...@acc.umu.se, http://www.acc.umu.se/~stric/
|- Student at Computing Science, University of Umeå
`- Sysadmin at {cs,acc}.umu.se
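To illustrate the difference, a minimal sketch of a self-healing configuration, with hypothetical device names (the same idea applies if you hand ZFS two VxVM volumes to mirror):

  # zpool create tank mirror c0t0d0 c0t1d0

With a mirror or raidz vdev, a scrub that hits a bad checksum rewrites the block from a good copy. On an existing single-copy pool like ABC0101, the closest approximation is "zfs set copies=2 ABC0101", but that only protects blocks written after the property is set and cannot repair corruption already on disk. For this pool the practical path is the one the status output itself suggests: restore the listed files from backup, then run "zpool clear ABC0101" to reset the error counters and a fresh "zpool scrub ABC0101" to verify the rest of the pool.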