Hi Mike,

I can't really speak for how virtualization products are using
files for pools, but we don't recommend creating pools on files,
much less on NFS-mounted files, and then building zones on top.

File-based pool configurations might be used for limited internal
testing of some features, but our product testing does not cover
storage pools on files or NFS-mounted files.

Unless Ed's project gets re-funded, I'm not sure how much further
you can go with this approach.

Thanks,

Cindy

On 01/07/10 15:05, Mike Gerdts wrote:
[removed zones-discuss after sending heads-up that the conversation
will continue at zfs-discuss]

On Mon, Jan 4, 2010 at 5:16 PM, Cindy Swearingen
<cindy.swearin...@sun.com> wrote:
Hi Mike,

It is difficult to comment on the root cause of this failure since
the interactions among these features are unknown. You might
consider seeing how Ed's proposal plays out and letting him do some
more testing...

Unfortunately, Ed's proposal was not funded, last I heard.  Ops Center
uses many of the same mechanisms for putting zones on ZFS, which is
where I first saw the problem.

If you are interested in testing this with NFSv4 and it still fails
the same way, then also consider testing this with a local file
instead of an NFS-mounted file and let us know the results. I'm also
unsure about using the same path for both the pool and the zone root,
rather than one path for the pool and a pool/dataset path for the
zone root. I will test this myself if I get some time.
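For example, something along these lines (names illustrative) would
keep the pool and the zone root on separate datasets:

   zpool create -m none nfszone /nfszone/root
   zfs create -o mountpoint=/zones/nfszone nfszone/zoneroot
   (and then set zonepath=/zones/nfszone in zonecfg)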

I have been unable to reproduce the problem with a local file.  I have
been able to reproduce it with NFSv4 on build 130.  Rather
surprisingly, the actual checksums found in the ereports are sometimes
"0x0 0x0 0x0 0x0" or "0xbaddcafe00 ...".

Here's what I did:

- Install OpenSolaris build 130 (ldom on T5220)
- Mount some NFS space at /nfszone:
   mount -F nfs -o vers=4 $server:/path /nfszone
- Create a 10gig sparse file
   cd /nfszone
   mkfile -n 10g root
- Create a zpool
   zpool create -m /zones/nfszone nfszone /nfszone/root
- Configure and install a zone
   zonecfg -z nfszone
    create
    set zonepath=/zones/nfszone
    set autoboot=false
    verify
    commit
    exit
   chmod 700 /zones/nfszone
   zoneadm -z nfszone install
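- Sanity-check the pool and the freshly installed zone, for example:
   zpool list nfszone
   zoneadm list -cv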

- Verify that the nfszone pool is clean.  First, pkg history in the
zone shows the timestamp of the last package operation:

  2010-01-07T20:27:07 install                   pkg             Succeeded

At 20:31 I ran:

# zpool status nfszone
  pool: nfszone
 state: ONLINE
 scrub: none requested
config:

        NAME             STATE     READ WRITE CKSUM
        nfszone          ONLINE       0     0     0
          /nfszone/root  ONLINE       0     0     0

errors: No known data errors

I booted the zone.  By 20:32 it had accumulated 132 checksum errors:

 # zpool status nfszone
  pool: nfszone
 state: DEGRADED
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-9P
 scrub: none requested
config:

        NAME             STATE     READ WRITE CKSUM
        nfszone          DEGRADED     0     0     0
          /nfszone/root  DEGRADED     0     0   132  too many errors

errors: No known data errors

fmdump has some very interesting things to say about the actual
checksums.  The 0x0 and 0xbaddcafe00 values seem to shout that these
checksum errors are not due to a couple of flipped bits: 0xbaddcafe is
the fill pattern the Solaris kmem allocator uses for uninitialized
memory, so entire blocks of bogus data appear to be getting
checksummed.

# fmdump -eV | grep cksum_actual | sort | uniq -c | sort -n | tail
   2    cksum_actual = 0x14c538b06b6 0x2bb571a06ddb0 0x3e05a7c4ac90c62 0x290cbce13fc59dce
   3    cksum_actual = 0x175bb95fc00 0x1767673c6fe00 0xfa9df17c835400 0x7e0aef335f0c7f00
   3    cksum_actual = 0x2eb772bf800 0x5d8641385fc00 0x7cf15b214fea800 0xd4f1025a8e66fe00
   4    cksum_actual = 0x0 0x0 0x0 0x0
   4    cksum_actual = 0x1d32a7b7b00 0x248deaf977d80 0x1e8ea26c8a2e900 0x330107da7c4bcec0
   5    cksum_actual = 0x14b8f7afe6 0x915db8d7f87 0x205dc7979ad73 0x4e0b3a8747b8a8
   6    cksum_actual = 0x1184cb07d00 0xd2c5aab5fe80 0x69ef5922233f00 0x280934efa6d20f40
   6    cksum_actual = 0x348e6117700 0x765aa1a547b80 0xb1d6d98e59c3d00 0x89715e34fbf9cdc0
  16    cksum_actual = 0xbaddcafe00 0x5dcc54647f00 0x1f82a459c2aa00 0x7f84b11b3fc7f80
  48    cksum_actual = 0x5d6ee57f00 0x178a70d27f80 0x3fc19c3a19500 0x82804bc6ebcfc0
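
ZFS checksum ereports carry a cksum_expected member alongside
cksum_actual, so the same pipeline can be widened to compare the
expected values with the bogus actual ones:

# fmdump -eV | egrep 'cksum_(expected|actual)' | sort | uniq -c | sort -n | tail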

I halted the zone, exported the pool, imported the pool, then did a
scrub.  Everything seemed to be OK:

# zpool export nfszone
# zpool import -d /nfszone nfszone
# zpool status nfszone
  pool: nfszone
 state: ONLINE
 scrub: none requested
config:

        NAME             STATE     READ WRITE CKSUM
        nfszone          ONLINE       0     0     0
          /nfszone/root  ONLINE       0     0     0

errors: No known data errors
# zpool scrub nfszone
# zpool status nfszone
  pool: nfszone
 state: ONLINE
 scrub: scrub completed after 0h0m with 0 errors on Thu Jan  7 21:56:47 2010
config:

        NAME             STATE     READ WRITE CKSUM
        nfszone          ONLINE       0     0     0
          /nfszone/root  ONLINE       0     0     0

errors: No known data errors

But then I booted the zone...

# zoneadm -z nfszone boot
# zpool status nfszone
  pool: nfszone
 state: ONLINE
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-9P
 scrub: scrub completed after 0h0m with 0 errors on Thu Jan  7 21:56:47 2010
config:

        NAME             STATE     READ WRITE CKSUM
        nfszone          ONLINE       0     0     0
          /nfszone/root  ONLINE       0     0   109

errors: No known data errors

I'm confused as to why this pool seems to be quite usable even with so
many checksum errors.
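
Following the suggested action, the counters can be cleared and the
cycle repeated to confirm that the errors reappear only when the zone
boots, for example:

# zpool clear nfszone
# zoneadm -z nfszone halt
# zoneadm -z nfszone boot
# zpool status nfszone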
