Re: [zfs-discuss] CR 6880994 and pkg fix

2010-03-28 Thread Frank Middleton

Thanks to everyone who made suggestions! This machine has run
memtest for a week and VTS for several days with no errors. It
does seem that the problem is probably in the CPU cache.

On 03/24/10 10:07 AM, Damon Atkins wrote:

You could try copying the file to /tmp (i.e. swap/RAM) and do a
continuous loop of checksums


As a variation on your suggestion, I implemented a bash script
that applies sha1sum 10,000 times with a pause of 0.1 s between
each attempt, and tests each result against what seemed to be the
correct result.
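
For reference, a minimal sketch of that script (file path and iteration
count as described above; treating the first result as the "correct" one
is an assumption):

GOOD=$(sha1sum /lib/libdlpi.so.1 | awk '{print $1}')   # assumed-good reference
BAD=0
for i in $(seq 1 10000); do
    SUM=$(sha1sum /lib/libdlpi.so.1 | awk '{print $1}')
    [ "$SUM" != "$GOOD" ] && BAD=$((BAD + 1))
    sleep 0.1                                          # brief pause between attempts
done
echo "$BAD bad checksums out of 10000"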

sha1sum on /lib/libdlpi.so.1: 11% incorrect results
sha1sum on /tmp/libdlpi.so.1: 5 failures out of 10,000
sha1sum on /lib/libpam.so.1: zero errors in 10,000
sha1sum on /tmp/libpam.so.1: ditto

So what we have is a pattern-sensitive failure that is also sensitive
to how busy the CPU is (and which doesn't fail while running VTS). md5sum
and sha256sum produced similar results, and presumably so would
fletcher2. To get really meaningful results, the machine should be
otherwise idle (but then, maybe it wouldn't fail).

Is anyone willing to speculate (or does anyone have suggestions for further
experiments) about what failure mode could cause a checksum
calculation to be pattern-sensitive and also hundreds of times
more likely to fail when read from disk vs. tmpfs? FWIW the failures
are pretty consistent, mostly but not always producing the
same bad checksum.

So at boot, the CPU is busy, increasing the probability of this
pattern-sensitive failure, and this one time it failed on every
read of /lib/libdlpi.so.1. With copies=1 this was twice as likely
to happen, and when it did, ZFS returned an error on any
attempt to read the file. With copies=2, in this case, it doesn't
return an error when attempting to read. Also, there were no
set-bit errors this time, but then I have no idea what a set-bit
error is.

On 03/24/10 12:32 PM, Richard Elling wrote:


Clearly, fletcher2 identified the problem.


Ironically, on this hardware it seems it created the problem :-).
However, you have been vindicated: it was a pattern-sensitive
problem, as you have long suggested it might be.

So: that the file is still readable is a mystery, but how it came
to be flagged as bad by ZFS isn't, any more.

Cheers -- Frank




Re: [zfs-discuss] CR 6880994 and pkg fix

2010-03-24 Thread Daniel Carosone
On Tue, Mar 23, 2010 at 07:22:59PM -0400, Frank Middleton wrote:
 On 03/22/10 11:50 PM, Richard Elling wrote:
  
 Look again, the checksums are different.

 Whoops, you are correct, as usual. Just 6 bits out of 256 different...

 Look which bits are different -  digits 24, 53-56 in both cases.

This is very likely an error introduced during the calculation of
the hash, rather than an error in the input data. I don't know how
that helps narrow down the source of the problem, though.

It suggests an experiment: try switching to another hash algorithm.
It may move the problem around, or even make it worse, of course.

I'm also reminded of a thread about the implementation of fletcher2
being flawed; perhaps you're better off switching regardless.
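
For what it's worth, switching is a one-liner; the new algorithm only applies
to blocks written after the property is set, so existing files keep their
fletcher2 checksums until rewritten (the dataset name below is just an example):

zfs set checksum=sha256 rpool/ROOT/opensolaris
zfs get checksum rpool/ROOT/opensolaris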

 o Why is the file flagged by ZFS as fatally corrupted still accessible?

 This is the part I was hoping to get answers for since AFAIK this
 should be impossible. Since none of this is having any operational
 impact, all of these issues are of interest only, but this is a bit scary!

It's only the blocks with bad checksums that should return errors.
Maybe you're not reading those, or the transient error doesn't happen
next time when you actually try to read it / from the other side of
the mirror.

Repeated errors in the same file could also be a symptom of an error
calculating the hash when the file was written.  If there's a
bit-flipping issue at the root of it, with some given probability,
that would invert the probabilities of correct and error results.

--
Dan.




Re: [zfs-discuss] CR 6880994 and pkg fix

2010-03-24 Thread Damon Atkins
You could try copying the file to /tmp (i.e. swap/RAM) and do a continuous
loop of checksums, e.g.

while [ ! -f libdlpi.so.1.x ] ; do sleep 1; cp libdlpi.so.1 libdlpi.so.1.x ;
A=`sha512sum -b libdlpi.so.1.x` ; [ "$A" == "<what it should be> *libdlpi.so.1.x" ] &&
rm libdlpi.so.1.x ; done ; date

Assuming the file never goes to swap, this would tell you if something on the
motherboard is playing up.

I have seen a CPU randomly set a byte to 0 that should not have been 0; I think
it was an L1 or L2 cache problem.


Re: [zfs-discuss] CR 6880994 and pkg fix

2010-03-24 Thread Saso Kiselkov

How about running memtest86+ (http://www.memtest.org/) on the machine
for a while? It doesn't test the arithmetic on the CPU very much, but
it stresses the data paths quite a lot. Just a quick suggestion...

--
Saso

Damon Atkins wrote:
 You could try copying the file to /tmp (i.e. swap/RAM) and do a continuous
 loop of checksums, e.g.

 while [ ! -f libdlpi.so.1.x ] ; do sleep 1; cp libdlpi.so.1 libdlpi.so.1.x ;
 A=`sha512sum -b libdlpi.so.1.x` ; [ "$A" == "<what it should be> *libdlpi.so.1.x" ] &&
 rm libdlpi.so.1.x ; done ; date

 Assuming the file never goes to swap, this would tell you if something on the
 motherboard is playing up.

 I have seen a CPU randomly set a byte to 0 that should not have been 0; I think
 it was an L1 or L2 cache problem.



Re: [zfs-discuss] CR 6880994 and pkg fix

2010-03-24 Thread Damon Atkins
You could also use psradm to take a CPU off-line.

At boot I would assume the system boots the same way every time unless
something changes, so you could be hitting the same CPU core every time, or
the same bit of RAM, until it has booted fully.
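
Something along these lines (processor IDs below are only illustrative;
check psrinfo first):

psrinfo            # list processors and their current state
psradm -f 1        # take processor 1 offline
psradm -n 1        # bring it back online afterwards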

Or even run the SunVTS Validation Test Suite, which I believe has a test
similar to the cp-to-/tmp one, along with all its other tests.


Re: [zfs-discuss] CR 6880994 and pkg fix

2010-03-24 Thread Richard Elling
On Mar 23, 2010, at 11:21 PM, Daniel Carosone wrote:

 On Tue, Mar 23, 2010 at 07:22:59PM -0400, Frank Middleton wrote:
 On 03/22/10 11:50 PM, Richard Elling wrote:
 
 Look again, the checksums are different.
 
 Whoops, you are correct, as usual. Just 6 bits out of 256 different...
 
 Look which bits are different -  digits 24, 53-56 in both cases.
 
 This is very likely an error introduced during the calculation of
 the hash, rather than an error in the input data.  I don't know how
 that helps narrow down the source of the problem, though..

The exact same code is used to calculate the checksum when writing
or reading. However, we assume the processor works and Frank's tests
do not indicate otherwise.

 
 It suggests an experiment: try switching to another hash algorithm.
 It may move the problem around, or even make it worse, of course.
 
 I'm also reminded of a thread about the implementation of fletcher2
 being flawed, perhaps you're better switching regardless.

Clearly, fletcher2 identified the problem.
 -- richard

ZFS storage and performance consulting at http://www.RichardElling.com
ZFS training on deduplication, NexentaStor, and NAS performance
Las Vegas, April 29-30, 2010 http://nexenta-vegas.eventbrite.com 



Re: [zfs-discuss] CR 6880994 and pkg fix

2010-03-23 Thread Frank Middleton

On 03/22/10 11:50 PM, Richard Elling wrote:
 

Look again, the checksums are different.


Whoops, you are correct, as usual. Just 6 bits out of 256 different...
Last year
expected 4a027c11b3ba4cec bf274565d5615b7b 3ef5fe61b2ed672e ec8692f7fd33094a
actual  4a027c11b3ba4cec bf274567d5615b7b 3ef5fe61b2ed672e ec86a5b3fd33094a
Last Month (obviously a different file)
expected 4b454eec8aebddb5 3b74c5235e1963ee c4489bdb2b475e76 fda3474dd1b6b63f
actual  4b454eec8aebddb5 3b74c5255e1963ee c4489bdb2b475e76 fda354c1d1b6b63f

Look at which bits are different: hex digits 24 and 53-56 in both cases. But
comparing the bits, there's no discernible pattern. Is this an artifact of the
algorithm, caused by one erring bit always being at the same offset?


don't forget the -V flag :-)


I didn't. As mentioned, there are subsequent set-bit errors (14 minutes
later), but none for this particular incident. I'll send you the results
separately since they are so puzzling. These 16 checksum failures
on libdlpi.so.1 were the only fmdump -eV entries for the entire boot
sequence, except that it started out with one ereport.fs.zfs.data,
whatever that is, for a total of exactly 17 records: 9 within 1 µs, then
8 more 40 ms later, also within 1 µs. Then nothing for 4 minutes, one
more checksum failure (bad_range_sets =), then 10 minutes later,
two with the set-bits error, one for each disk. That's it.


o Why is the file flagged by ZFS as fatally corrupted still accessible?


This is the part I was hoping to get answers for since AFAIK this
should be impossible. Since none of this is having any operational
impact, all of these issues are of interest only, but this is a bit scary!


Broken CPU, HBA, bus, memory, or power supply.


No argument there. Doesn't leave much, does it :-). Since the file itself
appears to be uncorrupted, and the metadata is consistent for all 16
entries, it would seem that the checksum calculation itself is failing,
because it would appear in this case that everything else is OK. Is there
a way to apply the fletcher2 algorithm interactively, as with sum(1)
or cksum(1) (i.e., outside the scope of ZFS), to see if it is in some way
pattern-sensitive with this CPU? Since only a small subset of files is
affected, this should be easy to verify. Start a scrub to heat things
up and then in parallel do checksums in a tight loop...
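
A rough sketch of that experiment, with sha256sum standing in for fletcher2
(which has no userland front end that I know of); pool and file names are
just examples:

zpool scrub rpool                 # scrub runs in the background
REF=$(sha256sum /lib/libdlpi.so.1 | awk '{print $1}')
while :; do                       # tight loop; interrupt with Ctrl-C
    SUM=$(sha256sum /lib/libdlpi.so.1 | awk '{print $1}')
    [ "$SUM" != "$REF" ] && echo "mismatch at $(date): $SUM"
done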


Transient failures are some of the most difficult to track down. Not all
transient failures are random.


Indeed, although this doesn't seem to be random. The hits to libdlpi.so.1
seem to be quite reproducible, as you've seen from the fmdump log,
although I doubt this particular scenario will happen again. Can you
think of any tools to investigate this? I suppose I could extract the
checksum code from ZFS itself to build one, but that would take quite
a lot of time. Is there any documentation that explains the output of
fmdump -eV? What are set-bits, for example?

I guess not... from man fmdump(1m):

   The error log file contains Private telemetry information
   used by Sun's automated diagnosis software.
   ...
   Each problem recorded in the fault log is identified by:

     o  The time of its diagnosis
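
The raw telemetry itself is easy enough to pull, either as a one-line summary
per report or with the full payload, as I've been doing:

fmdump -e       # one line per error report (time and class)
fmdump -eV      # full payload, including cksum_expected/actual and bad_set_bits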

So did ZFS really read 8 copies of libdlpi.so.1 within 1 µs, wait
40 ms, and then read another 8 copies in 1 µs again? I doubt it :-).
I bet it took > 1 µs just to (mis)calculate the checksum (1.6 GHz
16-bit CPU).

Thanks -- Frank


Re: [zfs-discuss] CR 6880994 and pkg fix

2010-03-22 Thread Frank Middleton

On 03/21/10 03:24 PM, Richard Elling wrote:
 

I feel confident we are not seeing a b0rken drive here.  But something is
clearly amiss and we cannot rule out the processor, memory, or controller.


Absolutely no question of that, otherwise this list would be flooded :-).
However, the purpose of the post wasn't really to diagnose the hardware
but to ask about the behavior of ZFS under certain error conditions.


Frank reports that he sees this on the same file, /lib/libdlpi.so.1, so I'll go
out on a limb and speculate that there is something in the bit pattern for that
file that intermittently triggers a bit flip on this system. I'll also speculate
that this error will not be reproducible on another system.


Hopefully not, but you never know :-). However, this instance is different.
The example you quote shows both expected and actual checksums to be
the same. This time the expected and actual checksums are different and
fmdump isn't flagging any bad_ranges or set-bits (the behavior you observed
is still happening, but orthogonal to this instance at different times and not
always on this file).

Since the file itself is OK, and the expected checksums are always the same,
neither the file nor the metadata appear to be corrupted, so it appears
that both are making it into memory without error.

It would seem, therefore, that it is the actual checksum calculation that is
failing. But only at boot time, and the calculated (bad) checksums differ (out
of 16, 10, 3, and 3 are the same [1]), so it's not consistent. At this point it
would seem to be CPU or memory, but why only at boot? IMO it's an
old and feeble power supply under strain pushing the CPU or memory to a
margin not seen during normal operation, which could be why diagnostics
never see anything amiss (and hence the importance of a good power supply).

FWIW the machine passed everything VTS could throw at it for a couple
of days. Anyone got any suggestions for more targeted diagnostics?

There were several questions embedded in the original post, and I'm not
sure any of them have really been answered:

o Why is the file flagged by ZFS as fatally corrupted still accessible?
   [Is this new behavior from b111b vs b125?]

o What possible mechanism could there be for the /calculated/ checksums
   of /four/ copies of just one specific file to be bad, and no others?

o Why did this only happen at boot, to just this one file, which also is
   peculiarly subject to the bit flips you observed, also mostly at boot
   (sometimes at scrub)? I like the feeble power supply answer, but why
   just this one file? Bizarre...

# zpool get  failmode rpool
NAME   PROPERTY  VALUE SOURCE
rpool  failmode  wait  default

This machine is extremely memory limited, so I suspect that libdlpi.so.1 is
not in a cache. Certainly, a brand new copy wouldn't be, and there's no
problem writing and (much later) reading the new copy (or the old one,
for that matter). It remains to be seen if the brand new copy gets clobbered
at boot (the machine, for all its faults, remains busily up and operational
for months at a time). Maybe I should schedule a reboot out of curiosity :-).


This sort of specific error analysis is possible after b125. See CR6867188
for more details.


Wasn't this in b125? IIRC we upgraded to b125 for this very reason. There
certainly seems to be an overwhelming amount of data in the various logs!

Cheers -- Frank

[1] This could be (3+1) * 4, where in one instance all 3+1 happen to be the
same. Does ZFS really read all 4 copies 4 times (by fmdump timestamp, 8
within 1 µs, then 40 ms later another 8, again within 1 µs)? Not sure what the
fmdump timestamps mean, so it's hard to find any pattern.



Re: [zfs-discuss] CR 6880994 and pkg fix

2010-03-22 Thread Richard Elling
On Mar 22, 2010, at 4:21 PM, Frank Middleton wrote:

 On 03/21/10 03:24 PM, Richard Elling wrote:
 
 I feel confident we are not seeing a b0rken drive here.  But something is
 clearly amiss and we cannot rule out the processor, memory, or controller.
 
 Absolutely no question of that, otherwise this list would be flooded :-).
 However, the purpose of the post wasn't really to diagnose the hardware
 but to ask about the behavior of ZFS under certain error conditions.
 
 Frank reports that he sees this on the same file, /lib/libdlpi.so.1, so I'll go
 out on a limb and speculate that there is something in the bit pattern for that
 file that intermittently triggers a bit flip on this system. I'll also speculate
 that this error will not be reproducible on another system.
 
 Hopefully not, but you never know :-). However, this instance is different.
 The example you quote shows both expected and actual checksums to be
 the same.

Look again, the checksums are different.

 This time the expected and actual checksums are different and
 fmdump isn't flagging any bad_ranges or set-bits (the behavior you observed
 is still happening, but orthogonal to this instance at different times and not
 always on this file).

don't forget the -V flag :-)

 Since file itself is OK, and the expected checksums are always the same,
 neither the file nor the metatdata appear to be corrupted, so it appears
 that both are making it into memory without error.
 
 It would seem therefore that it is the actual checksum calculation that is
 failing. But, only at boot time, the calculated (bad) checksums differ (out
 of 16, 10, 3, and 3 are the same [1]) so it's not consistent. At this point it
 would seem to be cpu or memory, but why only at boot? IMO it's an
 old and feeble power supply under strain pushing cpu or memory to a
 margin not seen during normal operation, which could be why diagnostics
 never see anything amiss (and the importance of a good power supply).
 
 FWIW the machine passed everything vts could throw at it for a couple
 of days. Anyone got any suggestions for more targeted diagnostics?
 
 There were several questions embedded in the original post, and I'm not
 sure any of them have really been answered:
 
 o Why is the file flagged by ZFS as fatally corrupted still accessible?
   [is this new behavior from b111b vs b125?].
 
 o What possible mechanism could there be for the /calculated/ checksums
   of /four/ copies of just one specific file to be bad and no others?

Broken CPU, HBA, bus, or memory.

 o Why did this only happen at boot to just this one file which also is
   peculiarly subject to the bitflips you observed, also mostly at boot
  (sometimes at scrub)? I like the feeble power supply answer, but why
  just this one file? Bizarre...

Broken CPU, HBA, bus, memory, or power supply.

 # zpool get  failmode rpool
 NAME   PROPERTY  VALUE SOURCE
 rpool  failmode  wait  default
 
 This machine is extremely memory limited, so I suspect that libdlpi.so.1 is
 not in a cache. Certainly, a brand new copy wouldn't be, and there's no
 problem writing and (much later) reading the new copy (or the old one,
 for that matter). It remains to be seen if the brand new copy gets clobbered
 at boot (the machine, for all its faults, remains busily up and operational
 for months at a time). Maybe I should schedule a reboot out of curiosity :-).
 
 This sort of specific error analysis is possible after b125. See CR6867188
 for more details.
 
 Wasn't this in b125? IIRC we upgraded to b125 for this very reason. There
 certainly seems to be an overwhelming amount of data in the various logs!
 
 Cheers -- Frank
 
 [1]  This could be (3+1) * 4 where in one instance all 3+1 happen to be the
 same. Does ZFS really read all 4 copies 4 times (by fmdump timestamp, 8
 within 1uS, 40mS later, another 8,  again within 1uS)? Not sure what the
 fmdump timestamps mean, so it's hard to find any pattern.

Transient failures are some of the most difficult to track down. Not all 
transient failures are random.
  -- richard

ZFS storage and performance consulting at http://www.RichardElling.com
ZFS training on deduplication, NexentaStor, and NAS performance
Las Vegas, April 29-30, 2010 http://nexenta-vegas.eventbrite.com 







Re: [zfs-discuss] CR 6880994 and pkg fix

2010-03-21 Thread Frank Middleton

On 03/15/10 01:01 PM, David Dyer-Bennet wrote:


This sounds really bizarre.


Yes, it is. But CR 6880994 is bizarre too.
 

One detail suggestion on checking what's going on (since I don't have a
clue towards a real root-cause determination): Get an md5sum on a clean
copy of the file, say from a new install or something, and check the
allegedly-corrupted copy against that.  This can fairly easily give you a
pretty reliable indication if the file is truly corrupted or not.


With many thanks to Danek Duvall, I got a new copy of libdlpi.so.1

# md5sum /lib/libdlpi.so.1
2468392ff87b5810571572eb572d0a41  /lib/libdlpi.so.1
# md5sum /lib/libdlpi.so.1.orig
2468392ff87b5810571572eb572d0a41  /lib/libdlpi.so.1.orig
# zpool status -v

errors: Permanent errors have been detected in the following files:

//lib/libdlpi.so.1.orig

So here we seem to have an example of a ZFS false positive, the first
I've seen or heard of. The good news is that it is still possible to read the
file, so this augurs well for the ability to boot under this circumstance.
FWIW fmdump does seem to show actual checksum errors on
all four copies in 16 attempts to read them. There were 3 groups of
different bad checksums; within each group the checksum was the
same but differed from the expected one.

Perhaps someone who has access could add this to CR 6880994, in the hope
that it might help lead to a better understanding.

For the casual reader, CR 6880994 is about a pathological PC that
gets checksum errors on the same set of files at boot, even though the
root pool is mirrored. With copies=2, ZFS can usually repair them. But
after a recent power cycle, all 4 copies reported bad checksums, yet in
reality the file seems to be uncorrupted. The machine has no ECC
and flaky bus parity, so there are plenty of ways for the data to get
messed up. It's a mystery why this only happens at boot, though.






Re: [zfs-discuss] CR 6880994 and pkg fix

2010-03-21 Thread Richard Elling
On Mar 21, 2010, at 11:03 AM, Frank Middleton wrote:
 On 03/15/10 01:01 PM, David Dyer-Bennet wrote:
 
 This sounds really bizarre.
 
 Yes, it is. But CR 6880994 is bizarre too.

Rolling back to a conversation with Frank last fall, here is the output
of fmdump which shows the single bit flip. Extra lines elided.

TIME   CLASS
Oct 23 2009 14:53:01.525657508 ereport.fs.zfs.checksum
class = ereport.fs.zfs.checksum
pool = rpool
vdev_guid = 0x509094f6dc795c97
vdev_type = disk
vdev_path = /dev/dsk/c3d0s0
vdev_devid = id1,c...@amaxtor_6y080l0=y32he6xe/a
parent_guid = 0x323cf9d672c3b05a
parent_type = mirror
zio_err = 50
zio_offset = 0x50384800
zio_size = 0x9800
zio_objset = 0x29
zio_object = 0x1a209
zio_level = 0
zio_blkid = 0x0
cksum_expected = 0x4a027c11b3ba4cec 0xbf274565d5615b7b 0x3ef5fe61b2ed672e 0xec8692f7fd33094a
cksum_actual = 0x4a027c11b3ba4cec 0xbf274567d5615b7b 0x3ef5fe61b2ed672e 0xec86a5b3fd33094a
cksum_algorithm = fletcher2
bad_ranges = 0x228 0x230
bad_ranges_min_gap = 0x8
bad_range_sets = 0x1
bad_range_clears = 0x0
bad_set_bits = 0x0 0x0 0x0 0x0 0x2 0x0 0x0 0x0
bad_cleared_bits = 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0

Here we see that one bit was set, 0x2 when we expected 0x0.
Later that same day...

Oct 23 2009 14:53:01.525657152 ereport.fs.zfs.checksum
class = ereport.fs.zfs.checksum
pool = rpool
pool_guid = 0x5062a7a7247652b1
vdev_guid = 0x1181c8516c0dc9b0
vdev_type = disk
vdev_path = /dev/dsk/c3d1s0
vdev_devid = id1,c...@awdc_wd800bb-00bsa0=wd-wma6s1025599/a
parent_guid = 0x323cf9d672c3b05a
parent_type = mirror
zio_err = 50
zio_offset = 0x50384800
zio_size = 0x9800
zio_objset = 0x29
zio_object = 0x1a209
zio_level = 0
zio_blkid = 0x0
cksum_expected = 0x4a027c11b3ba4cec 0xbf274565d5615b7b 0x3ef5fe61b2ed672e 0xec8692f7fd33094a
cksum_actual = 0x4a027c11b3ba4cec 0xbf274567d5615b7b 0x3ef5fe61b2ed672e 0xec86a5b3fd33094a
cksum_algorithm = fletcher2
bad_ranges = 0x228 0x230
bad_ranges_min_gap = 0x8
bad_range_sets = 0x1
bad_range_clears = 0x0
bad_set_bits = 0x0 0x0 0x0 0x0 0x2 0x0 0x0 0x0
bad_cleared_bits = 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0

So we see the exact same bit flipped (0x2 expecting 0x0) on two different
disks, /dev/dsk/c3d0s0 (Maxtor) and /dev/dsk/c3d1s0 (Western Digital), at 
the same zio offset and size.

I feel confident we are not seeing a b0rken drive here.  But something is
clearly amiss and we cannot rule out the processor, memory, or controller.
Frank reports that he sees this on the same file, /lib/libdlpi.so.1, so I'll go
out on a limb and speculate that there is something in the bit pattern for that
file that intermittently triggers a bit flip on this system. I'll also speculate
that this error will not be reproducible on another system.

This sort of specific error analysis is possible after b125. See CR6867188
for more details.
 -- richard

ZFS storage and performance consulting at http://www.RichardElling.com
ZFS training on deduplication, NexentaStor, and NAS performance
Las Vegas, April 29-30, 2010 http://nexenta-vegas.eventbrite.com 



Re: [zfs-discuss] CR 6880994 and pkg fix

2010-03-15 Thread David Dyer-Bennet

On Sun, March 14, 2010 13:54, Frank Middleton wrote:


 How can it even be remotely possible to get a checksum failure on mirrored
 drives with copies=2? That means all four copies were corrupted? Admittedly
 this is on a grotty PC with no ECC and flaky bus parity, but how come the
 same file always gets flagged as being clobbered (even though apparently it
 isn't)?

 The oddest part is that libdlpi.so.1 doesn't actually seem to be corrupted.
 nm lists it with no problem and you can copy it to /tmp, rename it, and then
 copy it back. objdump and readelf can both process this library with no
 problem. But pkg fix flags an error in its own inscrutable way. CCing
 pkg-discuss in case a pkg guru can shed any light on what the output of
 pkg fix (below) means. Presumably libc is OK, or it wouldn't boot :-).

This sounds really bizarre.

One detail suggestion on checking what's going on (since I don't have a
clue towards a real root-cause determination): Get an md5sum on a clean
copy of the file, say from a new install or something, and check the
allegedly-corrupted copy against that.  This can fairly easily give you a
pretty reliable indication if the file is truly corrupted or not.
-- 
David Dyer-Bennet, d...@dd-b.net; http://dd-b.net/
Snapshots: http://dd-b.net/dd-b/SnapshotAlbum/data/
Photos: http://dd-b.net/photography/gallery/
Dragaera: http://dragaera.info



[zfs-discuss] CR 6880994 and pkg fix

2010-03-14 Thread Frank Middleton

Can anyone say what the status of CR 6880994 (kernel/zfs Checksum failures on
mirrored drives) might be?

Setting copies=2 has mitigated the problem, which manifests itself
consistently at boot by flagging libdlpi.so.1, but two recent power cycles
in a row with no normal shutdown have resulted in a permanent error even
with copies=2 on all of the root pool (and specifically having duplicated
/lib to make sure there are two copies).
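
For anyone following along, that mitigation amounts to something like this
(dataset name is just an example; copies only affects blocks written after
the property is set, hence the duplication of /lib):

zfs set copies=2 rpool/ROOT/opensolaris
zfs get copies rpool/ROOT/opensolaris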

How can it even be remotely possible to get a checksum failure on mirrored
drives with copies=2? That means all four copies were corrupted? Admittedly
this is on a grotty PC with no ECC and flaky bus parity, but how come the
same file always gets flagged as being clobbered (even though apparently it
isn't)?

The oddest part is that libdlpi.so.1 doesn't actually seem to be corrupted.
nm lists it with no problem and you can copy it to /tmp, rename it, and then
copy it back. objdump and readelf can both process this library with no
problem. But pkg fix flags an error in its own inscrutable way. CCing
pkg-discuss in case a pkg guru can shed any light on what the output of
pkg fix (below) means. Presumably libc is OK, or it wouldn't boot :-).

This is with b125 on x86.

# zpool status -v
  pool: rpool
 state: ONLINE
status: One or more devices has experienced an error resulting in data
corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
entire pool from backup.
   see: http://www.sun.com/msg/ZFS-8000-8A
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        rpool       ONLINE       0     0     0
          mirror-0  ONLINE       0     0     2
            c3d1s0  ONLINE       0     0     2
            c3d0s0  ONLINE       0     0     2

errors: Permanent errors have been detected in the following files:

//lib/libdlpi.so.1

# pkg fix SUNWcsl
Verifying: pkg://opensolarisdev/SUNWcsl                         ERROR
        file: lib/libc.so.1
        Elfhash: cbb55a2ea24db9e03d9cd08c25b20406896c2fef should be
                 0e73a56d6ea0753f3721988ccbd716e370e57c4e
Created ZFS snapshot: 2010-03-13-23:39:17
Repairing: pkg://opensolarisdev/SUNWcsl
pkg: Requested fix operation would affect files that cannot be modified in
live image.  Please retry this operation on an alternate boot environment.

# nm /lib/libdlpi.so.1
00015562 b Bbss.bss
00015562 b Bbss.bss
00015240 d Ddata.data
00015240 d Ddata.data
000152f8 d Dpicdata.picdata
3ca8 r Drodata.rodata
3ca0 r Drodata.rodata
 A SUNW_1.1
 A SUNWprivate
000150ac D _DYNAMIC
00015562 b _END_
00015000 D _GLOBAL_OFFSET_TABLE_
16c0 T _PROCEDURE_LINKAGE_TABLE_
 r _START_
 U ___errno
 U __ctype
 U __div64
00015562 D _edata
00015562 B _end
43d7 R _etext
3c84 t _fini
 U _fxstat
3c68 t _init
3ca0 r _lib_version
 U _lxstat
 U _xmknod
 U _xstat
 U abs
 U calloc
 U close
 U closedir
 U dgettext
 U dladm_close
 U dladm_dev2linkid
 U dladm_open
 U dladm_parselink
 U dladm_phys_info
 U dladm_walk
2d5c T dlpi_arptype
222c T dlpi_bind
1d6c T dlpi_close
24d0 T dlpi_disabmulti
2c78 T dlpi_disabnotify
24b0 T dlpi_enabmulti
2af4 T dlpi_enabnotify
00015288 d dlpi_errlist
2ce4 T dlpi_fd
25b8 T dlpi_get_physaddr
2e50 T dlpi_iftype
1dc8 T dlpi_info
2d2c T dlpi_linkname
39a8 T dlpi_mactype
000152f8 d dlpi_mactypes
21a0 T dlpi_makelink
1b00 T dlpi_open
2158 T dlpi_parselink
3ca8 r dlpi_primsizes
2598 T dlpi_promiscoff
2578 T dlpi_promiscon
28fc T dlpi_recv
27a4 T dlpi_send
26d4 T dlpi_set_physaddr
2d04 T dlpi_set_timeout
3908 T dlpi_strerror
2d48 T dlpi_style
2384 T dlpi_unbind
1a20 T dlpi_walk
 U free
1998 t fstat
 U getenv
 U gethrtime
 U getmsg
32fc t i_dlpi_attach
3a28 t i_dlpi_buildsap
32ac t i_dlpi_checkstyle
3bfc t i_dlpi_deletenotifyid
39e8 t i_dlpi_getprimsize
3868 t i_dlpi_msg_common
23f4 t i_dlpi_multi
3bd4 t i_dlpi_notifyidexists
3ac8 t i_dlpi_notifyind_process
2f28 t i_dlpi_open
3384 t i_dlpi_passive
24e8 t i_dlpi_promisc
3460 t i_dlpi_strgetmsg
33e4 t i_dlpi_strputmsg
316c t i_dlpi_style1_open
31f0 t i_dlpi_style2_open
19f0 t i_dlpi_walk_link
3a9c t i_dlpi_writesap
 U ifparse_ifspec
 U ioctl
00015240 d libdlpi_errlist
196c t lstat
 U memcpy
 U memset
19c4 t mknod
 U open
 U opendir
 U poll
 U putmsg
 U readdir
 U snprintf
1940 t stat
 U strchr
 U strerror
 U strlcpy
 U strlen