Re[3]: [zfs-discuss] zpool status and CKSUM errors
Hello Robert,

Thursday, July 6, 2006, 1:49:34 AM, you wrote:

RM> Hello Eric,
RM>
RM> Monday, June 12, 2006, 11:21:24 PM, you wrote:
RM>
ES> I reproduced this pretty easily on a lab machine. I've filed:
ES>
ES>   6437568 ditto block repair is incorrectly propagated to root vdev
ES>
ES> to track this issue. Keep in mind that you do have a flaky
ES> controller/LUN/something. If this had been a user data block, your
ES> data would be gone.
RM>
RM> I believe that something else is also happening here.
RM>
RM> I can see CKSUM errors on two different servers (a V240 and a T2000),
RM> all on non-redundant zpools, and every time it looks like the ditto
RM> block helped. Hey, that's just improbable.
RM>
RM> And while on the T2000 from 'fmdump -ev' I get:
RM>
RM>   Jul 05 19:59:43.8786 ereport.io.fire.pec.btp 0x14e4b8015f612002
RM>   Jul 05 20:05:28.9165 ereport.io.fire.pec.re  0x14e5f951ce12b002
RM>   Jul 05 20:05:58.5381 ereport.io.fire.pec.re  0x14e614e78f4c9002
RM>   Jul 05 20:05:58.5389 ereport.io.fire.pec.btp 0x14e614e7b6ddf002
RM>   Jul 05 23:34:11.1960 ereport.io.fire.pec.re  0x1513869a6f7a6002
RM>   Jul 05 23:34:11.1967 ereport.io.fire.pec.btp 0x1513869a95196002
RM>   Jul 06 00:09:17.1845 ereport.io.fire.pec.re  0x151b2fca4c988002
RM>   Jul 06 00:09:17.1852 ereport.io.fire.pec.btp 0x151b2fca72e6b002
RM>
RM> on the V240 fmdump shows nothing for over a month, and I'm sure I did
RM> 'zpool clear' on that server later.
RM>
RM> V240:
RM>
RM> bash-3.00# zpool status nfs-s5-s7
RM>   pool: nfs-s5-s7
RM>  state: ONLINE
RM> status: One or more devices has experienced an unrecoverable error.  An
RM>         attempt was made to correct the error.  Applications are
RM>         unaffected.
RM> action: Determine if the device needs to be replaced, and clear the
RM>         errors using 'zpool clear' or replace the device with 'zpool
RM>         replace'.
RM>    see: http://www.sun.com/msg/ZFS-8000-9P
RM>  scrub: none requested
RM> config:
RM>
RM>         NAME                            STATE   READ WRITE CKSUM
RM>         nfs-s5-s7                       ONLINE     0     0   167
RM>           c4t600C0FF009258F28706F5201d0 ONLINE     0     0   167
RM>
RM> errors: No known data errors
RM> bash-3.00#
RM>
RM> bash-3.00# zpool clear nfs-s5-s7
RM> bash-3.00# zpool status nfs-s5-s7
RM>   pool: nfs-s5-s7
RM>  state: ONLINE
RM>  scrub: none requested
RM> config:
RM>
RM>         NAME                            STATE   READ WRITE CKSUM
RM>         nfs-s5-s7                       ONLINE     0     0     0
RM>           c4t600C0FF009258F28706F5201d0 ONLINE     0     0     0
RM>
RM> errors: No known data errors
RM> bash-3.00#
RM>
RM> bash-3.00# zpool scrub nfs-s5-s7
RM> bash-3.00# zpool status nfs-s5-s7
RM>   pool: nfs-s5-s7
RM>  state: ONLINE
RM>  scrub: scrub in progress, 0.01% done, 269h24m to go
RM> config:
RM>
RM>         NAME                            STATE   READ WRITE CKSUM
RM>         nfs-s5-s7                       ONLINE     0     0     0
RM>           c4t600C0FF009258F28706F5201d0 ONLINE     0     0     0
RM>
RM> errors: No known data errors
RM> bash-3.00#
RM>
RM> We'll see the result. I hope I won't have to stop it in the morning.
RM> Anyway, I have a feeling that nothing will be reported.
RM>
RM> ps. I've got several similar pools on those two servers and I see
RM> CKSUM errors on all of them, with the same result. It's almost
RM> impossible.

OK, it actually took several days for the scrub to complete. During the
scrub I already saw some CKSUM errors, and now again there are many of
them; however, the scrub itself reported no errors at all.

bash-3.00# zpool status nfs-s5-s7
  pool: nfs-s5-s7
 state: ONLINE
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are
        unaffected.
action: Determine if the device needs to be replaced, and clear the
        errors using 'zpool clear' or replace the device with 'zpool
        replace'.
   see: http://www.sun.com/msg/ZFS-8000-9P
 scrub: scrub completed with 0 errors on Sun Jul  9 02:56:19 2006
config:

        NAME                            STATE   READ WRITE CKSUM
        nfs-s5-s7                       ONLINE     0     0    18
          c4t600C0FF009258F28706F5201d0 ONLINE     0     0    18

errors: No known data errors
bash-3.00#

-- 
Best regards,
 Robert                          mailto:[EMAIL PROTECTED]
                                 http://milek.blogspot.com

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
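Since the CKSUM counters keep reappearing across several pools, it can help to pull the per-vdev CKSUM column out of `zpool status` output automatically instead of eyeballing each report. A minimal sketch, assuming the column layout shown above (NAME STATE READ WRITE CKSUM, no spare/log vdevs); the sample text is the output captured in this thread, canned in a function so the script runs without a live pool:

```shell
#!/bin/sh
# Flag any vdev with a non-zero CKSUM counter in `zpool status` output.
# In real use you would pipe `zpool status` in; here a captured sample
# from the thread stands in for it.
zpool_status_sample() {
cat <<'EOF'
  pool: nfs-s5-s7
 state: ONLINE
config:
        NAME                            STATE   READ WRITE CKSUM
        nfs-s5-s7                       ONLINE     0     0    18
          c4t600C0FF009258F28706F5201d0 ONLINE     0     0    18
errors: No known data errors
EOF
}

# vdev lines have exactly 5 fields with STATE in column 2;
# column 5 is the CKSUM counter.
zpool_status_sample | awk '
  $2 == "ONLINE" && NF == 5 {
      if ($5 + 0 > 0)
          printf "%s CKSUM=%s\n", $1, $5
  }'
```

Against the sample above this prints the pool and the LUN, each with CKSUM=18. The 5-field heuristic is an assumption tied to this output shape; pools with spares, logs, or DEGRADED vdevs would need a smarter parser.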
Re[2]: [zfs-discuss] zpool status and CKSUM errors
Hello Eric,

Friday, June 9, 2006, 5:16:29 PM, you wrote:

ES> On Fri, Jun 09, 2006 at 06:16:53AM -0700, Robert Milkowski wrote:
>> bash-3.00# zpool status -v nfs-s5-p1
>>   pool: nfs-s5-p1
>>  state: ONLINE
>> status: One or more devices has experienced an unrecoverable error.  An
>>         attempt was made to correct the error.  Applications are
>>         unaffected.
>> action: Determine if the device needs to be replaced, and clear the
>>         errors using 'zpool clear' or replace the device with 'zpool
>>         replace'.
>>    see: http://www.sun.com/msg/ZFS-8000-9P
>>  scrub: none requested
>> config:
>>
>>         NAME                            STATE   READ WRITE CKSUM
>>         nfs-s5-p1                       ONLINE     0     0     2
>>           c4t600C0FF009258F7A4F1BC601d0 ONLINE     0     0     2
>>
>> errors: No known data errors
>> bash-3.00#
>>
>> As you can see, there's no protection with ZFS. Does it mean that
>> those two checksum errors were related to metadata and, thanks to
>> ditto blocks, were corrected? (I assume the application did receive
>> proper data and the fs is ok.)
ES>
ES> Hmm, I'm not sure. There are no persistent data errors (as shown by
ES> the 'errors:' line), so you should be fine. If you want to send your
ES> /var/fm/fmd/errlog, or 'fmdump -eV' output, we can take a look at the
ES> details of the error. If this is the case, then it's a bug that the
ES> checksum error is reported for the pool for a recovered ditto block.
ES>
ES> You may want to try 'zpool clear nfs-s5-p1; zpool scrub nfs-s5-p1'
ES> and see if it turns up anything.

Well, I just did 'fmdump -eV' and the last entry is from May 31st and is
related to pools which have already been destroyed. I can see another
checksum error in that pool (I did 'zpool clear' last time) and it's NOT
reported by fmdump. This one occurred after May 31st.

I hope these are ditto blocks and nothing else (read: bad).

System is b39 SPARC.

-- 
Best regards,
 Robert                          mailto:[EMAIL PROTECTED]
                                 http://milek.blogspot.com
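When cross-checking `zpool status` counters against FMA, a quick tally of ereports by class makes it obvious whether fmdump has anything matching the timeframe. A minimal sketch; the field positions assume the default one-line `fmdump -e` format (month, day, time, class, ena), and the sample lines are a subset of the T2000 ereports quoted elsewhere in this thread:

```shell
#!/bin/sh
# Count FMA error reports by class from `fmdump -e` style output.
# In real use: fmdump -e | awk ... ; here a captured sample is used.
fmdump_sample() {
cat <<'EOF'
Jul 05 19:59:43.8786 ereport.io.fire.pec.btp 0x14e4b8015f612002
Jul 05 20:05:28.9165 ereport.io.fire.pec.re 0x14e5f951ce12b002
Jul 05 20:05:58.5389 ereport.io.fire.pec.btp 0x14e614e7b6ddf002
Jul 05 23:34:11.1967 ereport.io.fire.pec.btp 0x1513869a95196002
EOF
}

# Field 4 is the ereport class; tally and sort by count, descending.
fmdump_sample | awk '{count[$4]++}
                     END {for (c in count) printf "%6d %s\n", count[c], c}' |
    sort -rn
```

On this sample it reports 3 `ereport.io.fire.pec.btp` and 1 `ereport.io.fire.pec.re`. A run of `io.fire.pec.*` ereports like these points at the PCI-E fabric rather than the disks, which is worth knowing before blaming the array.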
Re[2]: [zfs-discuss] zpool status and CKSUM errors
Hello Jeff,

Saturday, June 10, 2006, 2:32:49 AM, you wrote:

>> btw: I'm really surprised at how unreliable SATA disks are. I put a
>> dozen TBs of data on ZFS recently, and just after a few days I got a
>> few hundred checksum errors (RAID-Z was used there). And these disks
>> are 500GB drives in a 3511 array. Well, that would explain some
>> fscks, etc. we saw before.

JB> I suspect you've got a bad disk or controller. A normal SATA drive
JB> just won't behave this badly. Cool that RAID-Z survives it, though.

It's not that bad right now. It was then, but the array (3511) reported
'Drive NOTIFY: Media Error Encountered - 163A981 (311)' several times,
and then I got all of these CKSUM errors. Once it stabilized (the drive
finally failed and was replaced by a hot spare) I've seen no CKSUM
errors for a few days. Looks like the drive was failing, etc. But I'm
still surprised that the array returned bad data (RAID-5 on the array).
We see such messages once in a while on several 3511s.

-- 
Best regards,
 Robert                          mailto:[EMAIL PROTECTED]
                                 http://milek.blogspot.com
Re: [zfs-discuss] zpool status and CKSUM errors
On Mon, Jun 12, 2006 at 10:49:49AM +0200, Robert Milkowski wrote:
> Well, I just did 'fmdump -eV' and the last entry is from May 31st and
> is related to pools which have already been destroyed. I can see
> another checksum error in that pool (I did 'zpool clear' last time)
> and it's NOT reported by fmdump. This one occurred after May 31st.
>
> I hope these are ditto blocks and nothing else (read: bad).
>
> System is b39 SPARC.

Yes, that does sound like ditto blocks. I'll poke around with Bill and
figure out why the checksum errors would be percolating up to the pool
level. They should be reported only for the leaf device.

- Eric

--
Eric Schrock, Solaris Kernel Development       http://blogs.sun.com/eschrock
Re: [zfs-discuss] zpool status and CKSUM errors
Richard Elling wrote:
> Robert Milkowski wrote:
>> btw: I'm really surprised at how unreliable SATA disks are. I put a
>> dozen TBs of data on ZFS recently, and just after a few days I got a
>> few hundred checksum errors (RAID-Z was used there). And these disks
>> are 500GB drives in a 3511 array. Well, that would explain some
>> fscks, etc. we saw before.
>
> It is more likely due to the density than the interface. In general,
> high-density disks will suffer from superparamagnetic effects more
> than lower-density disks. There are several ways to combat this, but
> the consumer market values space over reliability.

I'm not actually convinced the consumer market wants the space; it's
more that we don't have a choice, because bigger and bigger drives are
all we can buy. Personally, I have very little need for a 500G disk at
home (mainly because I don't do video and my photos are JPEG, not
RAW ;-)).

> And since there is no checksumming to detect problems, they don't
> think they have problems: the insidious effects of cancer.

Or most of the data is stored in file formats that don't get impacted
too much by the odd bit flip here and there (e.g. MPEG streams).

--
Darren J Moffat
Re: [zfs-discuss] zpool status and CKSUM errors
> btw: I'm really surprised at how unreliable SATA disks are. I put a
> dozen TBs of data on ZFS recently, and just after a few days I got a
> few hundred checksum errors (RAID-Z was used there). And these disks
> are 500GB drives in a 3511 array. Well, that would explain some
> fscks, etc. we saw before.

I suspect you've got a bad disk or controller. A normal SATA drive just
won't behave this badly. Cool that RAID-Z survives it, though.

Jeff
Re: [zfs-discuss] zpool status and CKSUM errors
Jeff Bonwick wrote:
>> btw: I'm really surprised at how unreliable SATA disks are. I put a
>> dozen TBs of data on ZFS recently, and just after a few days I got a
>> few hundred checksum errors (RAID-Z was used there). And these disks
>> are 500GB drives in a 3511 array. Well, that would explain some
>> fscks, etc. we saw before.
>
> I suspect you've got a bad disk or controller. A normal SATA drive
> just won't behave this badly. Cool that RAID-Z survives it, though.

I had a power supply go bad a few months ago (a cheap PC-junk power
supply) and it trashed a bunch of my SATA and IDE disks [*] (though,
happily, not the IDE disk I scavenged from a Sun V100 :-). The symptoms
were thousands of non-recoverable reads, which were remapped until the
disks ran out of spare blocks. Since I didn't believe this, I got a
new, more expensive, and presumably more reliable power supply. The IDE
disks fared better, but I had to do a low-level format on the SATA
drive. All is well now, and ZFS hasn't shown any errors since. But
thunderstorm season is approaching next month...

I am also trying to collect field data which shows such failure modes,
specifically looking for clusters of errors. However, I can't promise
anything, and may not get much time to do an in-depth study anytime
soon.

[*] My theory is that disks are about the only devices still using
12VDC power. Some disk vendors specify the quality of the 12VDC supply
(e.g. ripple) for specific drives. In my case, the 12VDC was the only
common-mode failure in the system which would have trashed most of the
drives in this manner.

-- richard