Re: [zfs-discuss] zfs data corruption
On Apr 27, 2008, at 4:39 PM, Carson Gaspar wrote:
> Ian Collins wrote:
>> Carson Gaspar wrote:
>>> If this is possible, it's entirely undocumented... Actually, fmd's
>>> documentation is generally terrible. The sum total of configuration
>>> information is:
>>>
>>> FILES
>>>      /etc/fm/fmd      Fault manager configuration directory
>>>
>>> Which is empty... It does look like I could write code to copy the
>>> output of "fmdump -f" somewhere useful if I had to.
>>
>> Have you tried man fmadm?
>>
>> http://onesearch.sun.com/search/docs/index.jsp?col=docs_en&locale=en&qt=fmadm&cs=false&st=11
>>
>> Brings up some useful information.
>
> "man fmadm" has:
>
> - nothing to do with configuration (the topic) (OK, it "prints the
>   config", whatever that means, but you can't _change_ anything)
> - no examples of usage
>
> I stand by my statement that the fault management docs need a lot of
> help.

I found the fmadm manpage very unhelpful as well. This CR is going to
be fixed soon:

6679902 fmadm(1M) needs examples
http://bugs.opensolaris.org/view_bug.do?bug_id=6679902

If you have specifics, feel free to add to the CR.

eric

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] zfs data corruption
http://www.sun.com/bigadmin/features/articles/selfheal.jsp

-- mark

Carson Gaspar wrote:
> Ian Collins wrote:
>> Carson Gaspar wrote:
>>> If this is possible, it's entirely undocumented... Actually, fmd's
>>> documentation is generally terrible. The sum total of configuration
>>> information is:
>>>
>>> FILES
>>>      /etc/fm/fmd      Fault manager configuration directory
>>>
>>> Which is empty... It does look like I could write code to copy the
>>> output of "fmdump -f" somewhere useful if I had to.
>>
>> Have you tried man fmadm?
>>
>> http://onesearch.sun.com/search/docs/index.jsp?col=docs_en&locale=en&qt=fmadm&cs=false&st=11
>>
>> Brings up some useful information.
>
> "man fmadm" has:
>
> - nothing to do with configuration (the topic) (OK, it "prints the
>   config", whatever that means, but you can't _change_ anything)
> - no examples of usage
>
> I stand by my statement that the fault management docs need a lot of
> help.
Re: [zfs-discuss] zfs data corruption
Ian Collins wrote:
> Carson Gaspar wrote:
>> If this is possible, it's entirely undocumented... Actually, fmd's
>> documentation is generally terrible. The sum total of configuration
>> information is:
>>
>> FILES
>>      /etc/fm/fmd      Fault manager configuration directory
>>
>> Which is empty... It does look like I could write code to copy the
>> output of "fmdump -f" somewhere useful if I had to.
>
> Have you tried man fmadm?
>
> http://onesearch.sun.com/search/docs/index.jsp?col=docs_en&locale=en&qt=fmadm&cs=false&st=11
>
> Brings up some useful information.

"man fmadm" has:

- nothing to do with configuration (the topic) (OK, it "prints the
  config", whatever that means, but you can't _change_ anything)
- no examples of usage

I stand by my statement that the fault management docs need a lot of
help.

-- Carson
Re: [zfs-discuss] zfs data corruption
Carson Gaspar wrote:
> Nathan Kroenert - Server ESG wrote:
>> I also *believe* (though am not certain - Perhaps someone else on the
>> list might be?) it would be possible to have each *event* (so - the
>> individual events that lead to a Fault Diagnosis) generate a message
>> if it was required, though I have never taken the time to do that
>> one...
>
> If this is possible, it's entirely undocumented... Actually, fmd's
> documentation is generally terrible. The sum total of configuration
> information is:
>
> FILES
>      /etc/fm/fmd      Fault manager configuration directory
>
> Which is empty... It does look like I could write code to copy the
> output of "fmdump -f" somewhere useful if I had to.

Have you tried man fmadm?

http://onesearch.sun.com/search/docs/index.jsp?col=docs_en&locale=en&qt=fmadm&cs=false&st=11

Brings up some useful information.

Ian
Re: [zfs-discuss] zfs data corruption
Nathan Kroenert - Server ESG wrote:
> I also *believe* (though am not certain - Perhaps someone else on the
> list might be?) it would be possible to have each *event* (so - the
> individual events that lead to a Fault Diagnosis) generate a message
> if it was required, though I have never taken the time to do that
> one...

If this is possible, it's entirely undocumented... Actually, fmd's
documentation is generally terrible. The sum total of configuration
information is:

FILES
     /etc/fm/fmd     Fault manager configuration directory

Which is empty... It does look like I could write code to copy the
output of "fmdump -f" somewhere useful if I had to.

> All of this said, I understand if you feel things are being 'hidden'
> from you until it's *actually* busted that you are having some of
> your forward vision obscured 'in the name of a quiet logfile'. I felt
> much the same way for a period of time. (Though, I live more in the
> CPU / Memory camp...)
>
> But - Once I realised what I could do with fmstat and fmdump, I was
> not the slightest bit unhappy (Actually, that's not quite true...
> Even once I knew what they could do, it still took me a while to work
> out the options I cared about for fmdump / fmstat), but I now trust
> FMA to look after my CPU / Memory issues better than I would in real
> life. I can still get what I need when I want to, and the data is
> actually more accessible and interesting. I just needed to know where
> to go looking.
>
> All this being said, I was not actually aware that many of our disk /
> target drivers were actually FMA'd up yet. heh - Shows what I know.
>
> Does any of this make you feel any better (or worse)?

Hiding the raw data isn't helping. Log it at debug if you want, but log
it off-box. The local logs won't be available when your server is dead
and you want to figure out why.

A real world example is that sometimes the only host-side sign of FC
storage issues is a retryable error (as everything is redundant). Now
I'm sure the storage folks can get other errors out of their side, but
sadly I can't. That retryable error is our canary in the coal mine,
warning us that we may have just lost redundancy. We don't want fmd to
take any action, but we do want to know...

-- Carson
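Carson's "copy the output of 'fmdump -f' somewhere useful" idea is easy
to sketch. Below is a minimal, hypothetical forwarder in Python - the
collector name `loghost.example.com` is made up, and this is plain
RFC 3164 syslog framing rather than anything Solaris- or fmd-specific:

```python
import socket
import time

# RFC 3164 numbers: facility local0 = 16, severity info = 6.
LOCAL0, INFO = 16, 6

def syslog_packet(msg, facility=LOCAL0, severity=INFO, tag="fmdump"):
    """Wrap one line of fmdump output in an RFC 3164-style syslog packet."""
    pri = facility * 8 + severity
    timestamp = time.strftime("%b %d %H:%M:%S")
    return "<%d>%s %s %s: %s" % (pri, timestamp, socket.gethostname(), tag, msg)

def forward(lines, collector=("loghost.example.com", 514)):
    """Ship each line to a remote collector over UDP, one packet per line."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        for line in lines:
            sock.sendto(syslog_packet(line).encode("utf-8"), collector)
    finally:
        sock.close()
```

Piping the output of something like `fmdump -f` through a forwarder of
this shape would keep a copy of every raw event off-box, so the
telemetry survives even when the server itself is dead.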
Re: [zfs-discuss] zfs data corruption
Note: IANATZD (I Am Not A Team-ZFS Dude)

Speaking as a Hardware Guy, knowing that something is happening, has
happened or is indicated to happen is a Good Thing (tm).

Begin unlikely, but possible scenario: If, for instance, I'm getting a
cluster of read errors (or, perhaps bad blocks), I could:

 - See it as it's happening
 - See the block number for each error
 - Already know the rate at which the errors are happening
 - Be able to determine that it's not good, and it's time to replace
   the disk.
 - You get the picture...

And based on this information, I could feel confident that I have the
right information at hand to be able to determine that it is or is not
time to replace this disk. Of course, that assumes:

 - I know anything about disks
 - I know anything about the error messages
 - I have some sort of logging tool that recognises the errors (and
   does not just throw out the 'retryable' ones, as most I have seen
   are configured to do)
 - I care
 - The folks watching the logs in the enterprise management tool care
 - My storage even bothers to report the errors

Certainly, for some organisations, all of the above are exactly how it
works, and it works well for them.

Looking at the ZFS/FMA approach, it certainly is somewhat different.
The (very) rough concept is that FMA gets pretty much all errors
reported to it. It logs them in a persistent store, which is always
available to view. It also makes diagnoses on the errors, based on the
rules that exist for that particular style of error. Once enough (or
the right type of) errors happen, it'll then make a Fault Diagnosis for
that component, and log a message, loud and proud, into the syslog. It
may also take other actions, like retire a page of memory, offline a
CPU, panic the box, etc.

So - that's the rough overview. It's worth noting up front that we can
*observe* every event that has happened. Using fmdump and fmstat we can
immediately see if anything interesting has been happening, or we can
wait for a Fault Diagnosis, in which case we can just watch
/var/adm/messages.

I also *believe* (though am not certain - perhaps someone else on the
list might be?) it would be possible to have each *event* (so - the
individual events that lead to a Fault Diagnosis) generate a message if
it was required, though I have never taken the time to do that one...

There are many advantages to this approach - it does not rely on
logfiles, offsets into logfiles, counters of previously processed
messages and all of the other doom and gloom that comes with scraping
logfiles. It's something you can simply ask: Any issues, chief? The
answer is there in a flash. You will also be less likely to have the
messages rolled out of the logs before you get to them (another
classic...). And - you get some great details from fmdump showing you
what's really going on, and it's something that's really easy to parse
to look for patterns.

All of this said, I understand if you feel things are being 'hidden'
from you until it's *actually* busted that you are having some of your
forward vision obscured 'in the name of a quiet logfile'. I felt much
the same way for a period of time. (Though, I live more in the CPU /
Memory camp...)

But - once I realised what I could do with fmstat and fmdump, I was not
the slightest bit unhappy (actually, that's not quite true... Even once
I knew what they could do, it still took me a while to work out the
options I cared about for fmdump / fmstat), but I now trust FMA to look
after my CPU / Memory issues better than I would in real life. I can
still get what I need when I want to, and the data is actually more
accessible and interesting. I just needed to know where to go looking.

All this being said, I was not actually aware that many of our disk /
target drivers were actually FMA'd up yet. heh - Shows what I know.

Does any of this make you feel any better (or worse)?

Nathan.

Mark A. Carlson wrote:
> fmd(1M) can log faults to syslogd that are already diagnosed. Why
> would you want the random spew as well?
>
> -- mark
>
> Carson Gaspar wrote:
>> [EMAIL PROTECTED] wrote:
>>> It's not safe to jump to this conclusion. Disk drivers that support
>>> FMA won't log error messages to /var/adm/messages. As more support
>>> for I/O FMA shows up, you won't see random spew in the messages
>>> file any more.
>>
>> That is a Very Bad Idea. Please convey this to whoever thinks that
>> they're "helping" by not sysloging I/O errors. If this shows up in
>> Solaris 11, we will Not Be Amused. Lack of off-box error logging
>> will directly cause loss of revenue.
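The "once enough (or the right type of) errors happen" step described
above is the job of fmd's SERD engines: declare a fault when more than
N events land inside a sliding time window. A toy model of the idea
(only an illustration, not the real fmd implementation):

```python
from collections import deque

class SerdEngine:
    """Toy soft-error-rate discriminator: diagnose a fault once n
    events fall within any t-second sliding window."""

    def __init__(self, n, t):
        self.n, self.t = n, t
        self.events = deque()

    def record(self, timestamp):
        """Record one error event; return True when a fault should fire."""
        self.events.append(timestamp)
        # Slide the window: forget events older than t seconds.
        while self.events and timestamp - self.events[0] > self.t:
            self.events.popleft()
        return len(self.events) >= self.n

engine = SerdEngine(n=3, t=60)
results = [engine.record(ts) for ts in (0, 10, 20, 300)]
# Two events stay below threshold, the third lands within 60s and
# triggers a diagnosis, and the fourth arrives after the window empties.
```

This is also why a single retryable error stays quiet in this model: it
is logged as an event, but no Fault Diagnosis (and hence no syslog
message) appears until the rate crosses the threshold.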
Re: [zfs-discuss] zfs data corruption
On Sat, 26 Apr 2008, Carson Gaspar wrote:
>> It's not safe to jump to this conclusion. Disk drivers that support
>> FMA won't log error messages to /var/adm/messages. As more support
>> for I/O FMA shows up, you won't see random spew in the messages file
>> any more.
>
> That is a Very Bad Idea. Please convey this to whoever thinks that
> they're "helping" by not sysloging I/O errors. If this shows up in
> Solaris 11, we will Not Be Amused. Lack of off-box error logging will
> directly cause loss of revenue.

I am glad to hear that your large financial institution (Bear Stearns?)
is contributing to the OpenSolaris project. :-)

Today's systems are very complex and may contain many tens of disks.
Syslog is a bottleneck and often logs to local files, which grow very
large and hinder system performance while many log messages are being
reported. If syslog is to a remote host, then the network is also
impacted. If a device (or several inter-related devices) is/are
experiencing problems, it seems best to isolate and diagnose it, with
one intelligent notification rather than spewing hundreds of thousands
of low-level error messages to a system logger.

Bob

==
Bob Friesenhahn
[EMAIL PROTECTED], http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
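Bob's "one intelligent notification" argument is essentially message
aggregation. A toy sketch of the difference it makes (an illustration
of the principle, not how fmd actually batches telemetry):

```python
def coalesce(messages):
    """Collapse runs of identical log messages into single summary
    lines, in the spirit of syslog's 'last message repeated N times'."""
    runs = []
    for msg in messages:
        if runs and runs[-1][0] == msg:
            runs[-1][1] += 1      # extend the current run
        else:
            runs.append([msg, 1])
    return ["%s (x%d)" % (m, n) if n > 1 else m for m, n in runs]

# A flood of identical retryable-error lines plus one diagnosis:
flood = ["read error on c4t...d0"] * 100000 + ["device faulted"]
summary = coalesce(flood)   # two lines instead of 100,001
```

The trade-off the thread is arguing about is visible here: the summary
is kind to syslog and the network, but the raw per-event detail only
survives if something (fmd's persistent store, or an off-box copy)
keeps it.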
Re: [zfs-discuss] zfs data corruption
fmd(1M) can log faults to syslogd that are already diagnosed. Why would
you want the random spew as well?

-- mark

Carson Gaspar wrote:
> [EMAIL PROTECTED] wrote:
>> It's not safe to jump to this conclusion. Disk drivers that support
>> FMA won't log error messages to /var/adm/messages. As more support
>> for I/O FMA shows up, you won't see random spew in the messages file
>> any more.
>
> That is a Very Bad Idea. Please convey this to whoever thinks that
> they're "helping" by not sysloging I/O errors. If this shows up in
> Solaris 11, we will Not Be Amused. Lack of off-box error logging will
> directly cause loss of revenue.
Re: [zfs-discuss] zfs data corruption
[EMAIL PROTECTED] wrote:
> It's not safe to jump to this conclusion. Disk drivers that support
> FMA won't log error messages to /var/adm/messages. As more support for
> I/O FMA shows up, you won't see random spew in the messages file any
> more.

That is a Very Bad Idea. Please convey this to whoever thinks that
they're "helping" by not sysloging I/O errors. If this shows up in
Solaris 11, we will Not Be Amused. Lack of off-box error logging will
directly cause loss of revenue.

-- Carson
Re: [zfs-discuss] zfs data corruption
> I'm just interested in understanding how zfs determined there was
> data corruption when I have checksums disabled and there were no
> non-retryable read errors reported in the messages file.

If the metadata is corrupt, how is ZFS going to find the data blocks on
disk?

> I don't believe it was a real disk read error because of the
> absence of evidence in /var/adm/messages.

It's not safe to jump to this conclusion. Disk drivers that support FMA
won't log error messages to /var/adm/messages. As more support for I/O
FMA shows up, you won't see random spew in the messages file any more.

-j
Re: [zfs-discuss] zfs data corruption
Just to clarify this post. This isn't data I care about recovering. I'm
just interested in understanding how zfs determined there was data
corruption when I have checksums disabled and there were no
non-retryable read errors reported in the messages file.

On Wed, Apr 23, 2008 at 9:52 PM, Victor Engle <[EMAIL PROTECTED]> wrote:
> Thanks! That would explain things. I don't believe it was a real disk
> read error because of the absence of evidence in /var/adm/messages.
>
> I'll review the man page and documentation to confirm that metadata
> is checksummed.
>
> Regards,
> Vic
>
> On Wed, Apr 23, 2008 at 6:30 PM, Nathan Kroenert
> <[EMAIL PROTECTED]> wrote:
>> I'm just taking a stab here, so could be completely wrong, but IIRC,
>> even if you disable checksum, it still checksums the metadata...
>>
>> So, it could be metadata checksum errors.
>>
>> Others on the list might have some funky zdb thingies you could use
>> to see what it actually is...
>>
>> Note: typed pre caffeine... :)
>>
>> Nathan
>>
>> Vic Engle wrote:
>>> I'm hoping someone can help me understand a zfs data corruption
>>> symptom. We have a zpool with checksums turned off. Zpool status
>>> shows that data corruption occurred. The application using the pool
>>> at the time reported a "read" error and zpool status (see below)
>>> shows 2 read errors on a device. The thing that is confusing to me
>>> is how ZFS determines that data corruption exists when reading data
>>> from a pool with checksums turned off.
>>>
>>> Also, I'm wondering about the persistent errors in the output
>>> below. Since no specific file or directory is mentioned does this
>>> indicate pool metadata is corrupt?
>>>
>>> Thanks for any help interpreting the output...
>>>
>>> # zpool status -xv
>>>   pool: zpool1
>>>  state: ONLINE
>>> status: One or more devices has experienced an error resulting in
>>>         data corruption. Applications may be affected.
>>> action: Restore the file in question if possible. Otherwise restore
>>>         the entire pool from backup.
>>>    see: http://www.sun.com/msg/ZFS-8000-8A
>>>  scrub: none requested
>>> config:
>>>
>>>         NAME                                     STATE  READ WRITE CKSUM
>>>         zpool1                                   ONLINE    2     0     0
>>>           [...]
>>>           c4t60A9800043346859444A476B2D487346d0  ONLINE    2     0     0
>>>           [...]
Re: [zfs-discuss] zfs data corruption
Thanks! That would explain things. I don't believe it was a real disk
read error because of the absence of evidence in /var/adm/messages.

I'll review the man page and documentation to confirm that metadata is
checksummed.

Regards,
Vic

On Wed, Apr 23, 2008 at 6:30 PM, Nathan Kroenert
<[EMAIL PROTECTED]> wrote:
> I'm just taking a stab here, so could be completely wrong, but IIRC,
> even if you disable checksum, it still checksums the metadata...
>
> So, it could be metadata checksum errors.
>
> Others on the list might have some funky zdb thingies you could use
> to see what it actually is...
>
> Note: typed pre caffeine... :)
>
> Nathan
>
> Vic Engle wrote:
>> I'm hoping someone can help me understand a zfs data corruption
>> symptom. We have a zpool with checksums turned off. Zpool status
>> shows that data corruption occurred. The application using the pool
>> at the time reported a "read" error and zpool status (see below)
>> shows 2 read errors on a device. The thing that is confusing to me
>> is how ZFS determines that data corruption exists when reading data
>> from a pool with checksums turned off.
>>
>> Also, I'm wondering about the persistent errors in the output below.
>> Since no specific file or directory is mentioned does this indicate
>> pool metadata is corrupt?
>>
>> Thanks for any help interpreting the output...
>>
>> # zpool status -xv
>>   pool: zpool1
>>  state: ONLINE
>> status: One or more devices has experienced an error resulting in
>>         data corruption. Applications may be affected.
>> action: Restore the file in question if possible. Otherwise restore
>>         the entire pool from backup.
>>    see: http://www.sun.com/msg/ZFS-8000-8A
>>  scrub: none requested
>> config:
>>
>>         NAME                                     STATE  READ WRITE CKSUM
>>         zpool1                                   ONLINE    2     0     0
>>           [...]
>>           c4t60A9800043346859444A476B2D487346d0  ONLINE    2     0     0
>>           [...]
Re: [zfs-discuss] zfs data corruption
> Since no specific file or directory is mentioned

install newer bits and get better info automatically, but for now type:

  zdb -vvv zpool1 17
  zdb -vvv zpool1 18
  zdb -vvv zpool1 19
  echo remove those objects
  zpool clear zpool1
  zpool scrub zpool1
Re: [zfs-discuss] zfs data corruption
I'm just taking a stab here, so could be completely wrong, but IIRC,
even if you disable checksum, it still checksums the metadata...

So, it could be metadata checksum errors.

Others on the list might have some funky zdb thingies you could use to
see what it actually is...

Note: typed pre caffeine... :)

Nathan

Vic Engle wrote:
> I'm hoping someone can help me understand a zfs data corruption
> symptom. We have a zpool with checksums turned off. Zpool status
> shows that data corruption occurred. The application using the pool
> at the time reported a "read" error and zpool status (see below)
> shows 2 read errors on a device. The thing that is confusing to me is
> how ZFS determines that data corruption exists when reading data from
> a pool with checksums turned off.
>
> Also, I'm wondering about the persistent errors in the output below.
> Since no specific file or directory is mentioned does this indicate
> pool metadata is corrupt?
>
> Thanks for any help interpreting the output...
>
> # zpool status -xv
>   pool: zpool1
>  state: ONLINE
> status: One or more devices has experienced an error resulting in
>         data corruption. Applications may be affected.
> action: Restore the file in question if possible. Otherwise restore
>         the entire pool from backup.
>    see: http://www.sun.com/msg/ZFS-8000-8A
>  scrub: none requested
> config:
>
>         NAME                                     STATE  READ WRITE CKSUM
>         zpool1                                   ONLINE    2     0     0
>           [...]
>           c4t60A9800043346859444A476B2D487346d0  ONLINE    2     0     0
>           [...]
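Nathan's guess is right: the checksum property only governs user data,
while metadata blocks are always checksummed (fletcher4 by default),
which is how a checksum=off pool can still report corruption. A
simplified, illustrative version of the algorithm - the in-kernel
implementation differs in detail:

```python
import struct

def fletcher4(data):
    """Fletcher4-style checksum: four running 64-bit sums over the
    buffer taken as little-endian 32-bit words (length must be a
    multiple of 4 bytes, as ZFS blocks always are)."""
    a = b = c = d = 0
    mask = 0xFFFFFFFFFFFFFFFF          # sums wrap at 64 bits
    for (word,) in struct.iter_unpack("<I", data):
        a = (a + word) & mask
        b = (b + a) & mask
        c = (c + b) & mask
        d = (d + c) & mask
    return (a, b, c, d)

block = b"metadata" * 64                   # stand-in for a metadata block
good = fletcher4(block)
bad = fletcher4(b"m3tadata" + block[8:])   # same block with one byte flipped
# good != bad: any bit flip changes the computed checksum, so ZFS can
# flag the block as corrupt even though data checksums are disabled.
```

The position-weighted sums (b, c, d) are what make the check sensitive
to reordered as well as altered words, which a simple additive checksum
would miss.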
[zfs-discuss] zfs data corruption
I'm hoping someone can help me understand a zfs data corruption
symptom. We have a zpool with checksums turned off. Zpool status shows
that data corruption occurred. The application using the pool at the
time reported a "read" error and zpool status (see below) shows 2 read
errors on a device. The thing that is confusing to me is how ZFS
determines that data corruption exists when reading data from a pool
with checksums turned off.

Also, I'm wondering about the persistent errors in the output below.
Since no specific file or directory is mentioned, does this indicate
pool metadata is corrupt?

Thanks for any help interpreting the output...

# zpool status -xv
  pool: zpool1
 state: ONLINE
status: One or more devices has experienced an error resulting in data
        corruption. Applications may be affected.
action: Restore the file in question if possible. Otherwise restore the
        entire pool from backup.
   see: http://www.sun.com/msg/ZFS-8000-8A
 scrub: none requested
config:

        NAME                                     STATE  READ WRITE CKSUM
        zpool1                                   ONLINE     2     0     0
          c4t60A9800043346859444A476B2D48446Fd0  ONLINE     0     0     0
          c4t60A9800043346859444A476B2D484352d0  ONLINE     0     0     0
          c4t60A9800043346859444A476B2D484236d0  ONLINE     0     0     0
          c4t60A9800043346859444A476B2D482D6Cd0  ONLINE     0     0     0
          c4t60A9800043346859444A476B2D483951d0  ONLINE     0     0     0
          c4t60A9800043346859444A476B2D483836d0  ONLINE     0     0     0
          c4t60A9800043346859444A476B2D48366Bd0  ONLINE     0     0     0
          c4t60A9800043346859444A476B2D483551d0  ONLINE     0     0     0
          c4t60A9800043346859444A476B2D483435d0  ONLINE     0     0     0
          c4t60A9800043346859444A476B2D48326Bd0  ONLINE     0     0     0
          c4t60A9800043346859444A476B2D483150d0  ONLINE     0     0     0
          c4t60A9800043346859444A476B2D483035d0  ONLINE     0     0     0
          c4t60A9800043346859444A476B2D47796Ad0  ONLINE     0     0     0
          c4t60A9800043346859444A476B2D477850d0  ONLINE     0     0     0
          c4t60A9800043346859444A476B2D477734d0  ONLINE     0     0     0
          c4t60A9800043346859444A476B2D47756Ad0  ONLINE     0     0     0
          c4t60A9800043346859444A476B2D47744Fd0  ONLINE     0     0     0
          c4t60A9800043346859444A476B2D477333d0  ONLINE     0     0     0
          c4t60A9800043346859444A476B2D477169d0  ONLINE     0     0     0
          c4t60A9800043346859444A476B2D47704Ed0  ONLINE     0     0     0
          c4t60A9800043346859444A476B2D476F33d0  ONLINE     0     0     0
          c4t60A9800043346859444A476B2D476D68d0  ONLINE     0     0     0
          c4t60A9800043346859444A476B2D476C4Ed0  ONLINE     0     0     0
          c4t60A9800043346859444A476B2D476B32d0  ONLINE     0     0     0
          c4t60A9800043346859444A476B2D476968d0  ONLINE     0     0     0
          c4t60A98000433468656834476B2D453974d0  ONLINE     0     0     0
          c4t60A98000433468656834476B2D454142d0  ONLINE     0     0     0
          c4t60A98000433468656834476B2D454255d0  ONLINE     0     0     0
          c4t60A98000433468656834476B2D45436Dd0  ONLINE     0     0     0
          c4t60A9800043346859444A476B2D487346d0  ONLINE     2     0     0
          c4t60A9800043346859444A476B2D487175d0  ONLINE     0     0     0
          c4t60A9800043346859444A476B2D48705Ad0  ONLINE     0     0     0
          c4t60A9800043346859444A476B2D486F45d0  ONLINE     0     0     0
          c4t60A9800043346859444A476B2D486D74d0  ONLINE     0     0     0
          c4t60A9800043346859444A476B2D486C5Ad0  ONLINE     0     0     0
          c4t60A9800043346859444A476B2D486B44d0  ONLINE     0     0     0
          c4t60A9800043346859444A476B2D486974d0  ONLINE     0     0     0
          c4t60A9800043346859444A476B2D486859d0  ONLINE     0     0     0
          c4t60A9800043346859444A476B2D486744d0  ONLINE     0     0     0
          c4t60A9800043346859444A476B2D486573d0  ONLINE     0     0     0
          c4t60A9800043346859444A476B2D486459d0  ONLINE     0     0     0
          c4t60A9800043346859444A476B2D486343d0  ONLINE     0     0     0
          c4t60A9800043346859444A476B2D486173d0  ONLINE     0     0     0
          c4t60A9800043346859444A476B2D482F58d0  ONLINE     0     0     0
          c4t60A9800043346859444A476B2D485A43d0  ONLINE     0     0     0
          c4t60A9800043346859444A476B2D485872d0  ONLINE     0     0     0
          c4t60A9800043346859444A476B2D485758d0  ONLINE     0     0     0
          c4t60A9800043346859444A476B2D485642d0  ONLINE     0     0     0
          c4t60A9800043346859444A476B2D485471d0  ONLINE     0     0     0
          c4t60A9800043346859444A476B2D485357d0  ONLINE     0     0     0
          c4t60A9800043346859444A476B2D48