Re: [zfs-discuss] ZFS still crashing after patch

2008-05-05 Thread Rustam
Hello Robert,
 Which would happen if you have a problem with HW and you're getting
 wrong checksums on both sides of your mirrors. Maybe PS?

 Try memtest anyway or sunvts
Unfortunately, SunVTS doesn't run on non-Sun/OEM hardware. And memtest requires 
too much downtime which I cannot afford right now.

However, I have some interesting observations and I can now reproduce the crash. It seems 
that I have one or more bad checksums, and ZFS crashes each time it tries to read them. 
Below are two cases:



Case 1: I got a checksum error that was not striped over the mirrors; this time it was a 
checksum for a file and not 0x0. I tried to read the file twice. The first attempt 
returned an I/O error, the second caused a panic. Here's the log:




core# zpool status -xv
  pool: box5
 state: ONLINE
status: One or more devices has experienced an error resulting in data
corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
entire pool from backup.
   see: http://www.sun.com/msg/ZFS-8000-8A
 scrub: none requested
config:
 
        NAME        STATE     READ WRITE CKSUM
        box5        ONLINE       0     0     2
          mirror    ONLINE       0     0     0
            c1d0    ONLINE       0     0     0
            c2d0    ONLINE       0     0     0
          mirror    ONLINE       0     0     2
            c2d1    ONLINE       0     0     4
            c1d1    ONLINE       0     0     4
 
errors: Permanent errors have been detected in the following files:
 
box5:0x0
/u02/domains/somedomain/0/1/5/data/sub1/sub2/1145543794.file

core# ll /u02/domains/somedomain/0/1/5/data/sub1/sub2/1145543794.file
-rw-------   1 user     group        489 Apr 20  2006 
/u02/domains/somedomain/0/1/5/data/sub1/sub2/1145543794.file

core# cat /u02/domains/somedomain/0/1/5/data/sub1/sub2/1145543794.file
cat: input error on 
/u02/domains/somedomain/0/1/5/data/sub1/sub2/1145543794.file: I/O error

core# zpool status -xv
  pool: box5
 state: ONLINE
status: One or more devices has experienced an error resulting in data
corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
entire pool from backup.
   see: http://www.sun.com/msg/ZFS-8000-8A
 scrub: none requested
config:
 
        NAME        STATE     READ WRITE CKSUM
        box5        ONLINE       0     0     4
          mirror    ONLINE       0     0     0
            c1d0    ONLINE       0     0     0
            c2d0    ONLINE       0     0     0
          mirror    ONLINE       0     0     4
            c2d1    ONLINE       0     0     8
            c1d1    ONLINE       0     0     8
 
errors: Permanent errors have been detected in the following files:
 
box5:0x0
/u02/domains/somedomain/0/1/5/data/sub1/sub2/1145543794.file

core# cat /u02/domains/somedomain/0/1/5/data/sub1/sub2/1145543794.file
(Kernel Panic: BAD TRAP: type=e (#pf Page fault) rp=fe8001112490 
addr=fe80882b7000)
...
(after system boot up)
core# rm /u02/domains/somedomain/0/1/5/data/sub1/sub2/1145543794.file
core# zpool status -xv
  pool: box5
 state: ONLINE
status: One or more devices has experienced an error resulting in data
corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
entire pool from backup.
   see: http://www.sun.com/msg/ZFS-8000-8A
 scrub: none requested
config:
 
        NAME        STATE     READ WRITE CKSUM
        box5        ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            c1d0    ONLINE       0     0     0
            c2d0    ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            c2d1    ONLINE       0     0     0
            c1d1    ONLINE       0     0     0
 
errors: Permanent errors have been detected in the following files:
 
box5:0x0
box5:0x4a049a

core# mdb unix.17 vmcore.17
Loading modules: [ unix krtld genunix specfs dtrace cpu.generic uppc pcplusmp 
ufs ip hook neti sctp arp usba uhci fctl nca lofs zfs random nfs ipc sppp 
crypto ptm ]
> ::status
debugging crash dump vmcore.17 (64-bit) from core
operating system: 5.10 Generic_127128-11 (i86pc)
panic message: BAD TRAP: type=e (#pf Page fault) rp=fe8001112490 
addr=fe80882b7000
dump content: kernel pages only
> ::stack
fletcher_2_native+0x13()
zio_checksum_verify+0x27()
zio_next_stage+0x65()
zio_wait_for_children+0x49()
zio_wait_children_done+0x15()
zio_next_stage+0x65()
zio_vdev_io_assess+0x84()
zio_next_stage+0x65()
vdev_cache_read+0x14c()
vdev_disk_io_start+0x135()
vdev_io_start+0x12()
zio_vdev_io_start+0x7b()
zio_next_stage_async+0xae()
zio_nowait+9()
vdev_mirror_io_start+0xa9()
vdev_io_start+0x12()
zio_vdev_io_start+0x7b()
zio_next_stage_async+0xae()
zio_nowait+9()
vdev_mirror_io_start+0xa9()
zio_vdev_io_start+0x116()
zio_next_stage+0x65()
zio_ready+0xec()
zio_next_stage+0x65()
zio_wait_for_children+0x49()
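
For completeness, a couple of additional read-only commands that are often useful when
digging into a dump like this (illustrative only; output will vary). Within the same mdb
session, '::msgbuf' dumps the console messages leading up to the panic and '::panicinfo'
summarizes the trap; on the running system, 'fmdump -eV' shows the FMA ereports (including
ZFS checksum ereports) logged before the crash:

> ::msgbuf
> ::panicinfo

core# fmdump -eV | less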

Re: [zfs-discuss] ZFS still crashing after patch

2008-05-05 Thread Richard Elling
Rustam wrote:
 Hello Robert,
   
 Which would happen if you have a problem with HW and you're getting
 wrong checksums on both sides of your mirrors. Maybe PS?

 Try memtest anyway or sunvts
 
 Unfortunately, SunVTS doesn't run on non-Sun/OEM hardware. And memtest 
 requires too much downtime which I cannot afford right now.
   

Sometimes if you read the docs, you can get confused by people who
intend to confuse you.  SunVTS does work on a wide variety of
hardware, though it may not be supported. To put it in perspective,
SunVTS is used by Sun in the manufacturing process: it is the set of
tests run on hardware before it ships to customers. It is not intended
to be a generic test product for whatever hardware you find lying around.
 -- richard



Re: [zfs-discuss] Issue with simultaneous IO to lots of ZFS pools

2008-05-05 Thread Chris Siebenmann
[Jeff Bonwick:]
| That said, I suspect I know the reason for the particular problem
| you're seeing: we currently do a bit too much vdev-level caching.
| Each vdev can have up to 10MB of cache.  With 132 pools, even if
| each pool is just a single iSCSI device, that's 1.32GB of cache.
| 
| We need to fix this, obviously.  In the interim, you might try
| setting zfs_vdev_cache_size to some smaller value, like 1MB.

 I wanted to update the mailing list with a success story: I added
another 2GB of memory to the server (bringing it to 4GB total),
tried my 132-pool tests again, and things worked fine. So this seems
to have been the issue and I'm calling it fixed now.

(I decided that adding some more memory to the server was simpler
in the long run than setting system parameters.)
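
For reference, a minimal sketch of the interim tuning Jeff describes, assuming the
zfs_vdev_cache_size tunable on this vintage of Solaris; the 1MB value is only the example
given above, and a reboot (or an mdb poke) is needed for it to take effect:

# /etc/system -- applied at next boot
set zfs:zfs_vdev_cache_size = 1048576

# or change the running kernel (not persistent across reboots)
echo 'zfs_vdev_cache_size/W 0t1048576' | mdb -kw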

 I can still make the Solaris system lock up solidly if I do extreme
things, like running 'zpool scrub' against all 132 pools at once, but I'm not
too surprised by that; you can always kill a system if you try hard
enough. The important thing for me is that routine things no longer kill the
system just because it has so many pools.
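
For clarity, the all-at-once scrub meant above is the sort of thing this simple loop
produces, which starts a scrub on every imported pool simultaneously (shown only to
illustrate the load, not as a recommendation):

for p in $(zpool list -H -o name); do
        zpool scrub "$p"
done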

 So: thank you, everyone.

- cks


Re: [zfs-discuss] ZFS still crashing after patch

2008-05-05 Thread Marcelo Leal
Hello,
 If you believe the problem may be related to the ZIL code, you can try
disabling it to isolate the problem. If it is not an NFS file server,
disabling the ZIL should not impact consistency.
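
For anyone who wants to run such a test, a sketch of how the ZIL was typically disabled on
that vintage of (Open)Solaris; zil_disable is a private tunable, the setting only applies
to datasets mounted after it is changed, and it should be reverted once debugging is done:

# /etc/system -- applied at next boot
set zfs:zil_disable = 1

# or on the live kernel, then remount the filesystems being tested
echo 'zil_disable/W 0t1' | mdb -kw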

 Leal.
 
 


[zfs-discuss] read errors observed after scrub

2008-05-05 Thread Jeremy Kister
I have a Solaris 10u3 x86 box patched up with the important kernel/zfs/fs 
patches (now running kernel patch 120012-14).

After executing a 'zpool scrub' on one of my pools, I see I/O read errors:

# zpool status  | grep ONLINE | grep -v '0 0 0'
  state: ONLINE
 c2t1d0  ONLINE   9 0 0
 c2t4d0  ONLINE  32 0 0
 c2t5d0  ONLINE   7 0 0


Are these errors important enough to replace the disks?

If not, I've read that when these numbers cross some magic threshold, ZFS 
takes the disk offline and calls it dead.

If I use 'zpool clear', will only these administrative statistics be 
cleared, or will the important internal numbers that keep track of the 
errors be cleared as well?
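
For what it's worth, a sketch of the two forms of 'zpool clear' in question, with the pool
name left as a placeholder since it isn't shown in the grep output above:

zpool clear <pool>            # clear error counters on every device in the pool
zpool clear <pool> c2t1d0     # clear error counters on a single device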

I do see bad blocks on the offending disks -- but why would ZFS see them 
(assuming the disk re-mapped the bad blocks)?
# smartctl -a /dev/rdsk/c2t1d0 | grep defect
Elements in grown defect list: 3
# smartctl -a /dev/rdsk/c2t4d0 | grep defect
Elements in grown defect list: 3
# smartctl -a /dev/rdsk/c2t5d0 | grep defect
Elements in grown defect list: 2



-- 

Jeremy Kister
http://jeremy.kister.net./







Re: [zfs-discuss] ZFS still crashing after patch

2008-05-05 Thread Rustam
Hello Leal,

I've already been warned 
(http://www.opensolaris.org/jive/message.jspa?messageID=231349) that the ZIL could 
be a cause, and I ran tests with zil_disable set. I ran a scrub and the system crashed 
after exactly the same period with exactly the same error. The ZIL is known to cause some 
problems on writes, while all my problems are with zio_read and checksum_verify.

This is an NFS file server, but it crashed even with NFS unshared and nfs/server 
disabled. So this is not an NFS problem.

I reduced the number of panics by setting zfs_prefetch_disable. This lets me 
avoid unnecessary reads and reduces the chance of hitting a bad checksum. So far 
I've had 24 hours without a crash, which is much better than a few times a day. However, 
I know the bad checksums are still there and I need to fix them somehow.
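
For anyone wanting to try the same workaround, a sketch of how zfs_prefetch_disable can be
set; note that this only suppresses prefetch reads, it does nothing to repair the bad
checksums themselves:

# /etc/system -- applied at next boot
set zfs:zfs_prefetch_disable = 1

# or on the running kernel
echo 'zfs_prefetch_disable/W 0t1' | mdb -kw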

--
Rustam
 
 


Re: [zfs-discuss] ZFS still crashing after patch

2008-05-05 Thread Bob Friesenhahn
On Mon, 5 May 2008, Marcelo Leal wrote:

 Hello, If you believe that the problem can be related to ZIL code, 
 you can try to disable it to debug (isolate) the problem. If it is 
 not a fileserver (NFS), disabling the zil should not impact 
 consistency.

In what way is NFS special when it comes to ZFS consistency?  If NFS 
consistency is lost by disabling the zil then local consistency is 
also lost.

Bob
==
Bob Friesenhahn
[EMAIL PROTECTED], http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,http://www.GraphicsMagick.org/



Re: [zfs-discuss] Inconcistancies with scrub and zdb

2008-05-05 Thread Jonathan Loran

Since no one has responded to my thread, I have a question: is zdb 
suitable to run on a live pool, or should it only be run on an exported 
or destroyed pool? I see that this has been asked before on this 
forum, but is there a user's guide to zdb?
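
In the absence of a user's guide, a couple of commonly used read-only invocations, treated
here as illustrative since zdb is an undocumented and unstable interface (the pool name is
hypothetical); note that zdb reads the on-disk state without locking, so on a busy live
pool it can report transient inconsistencies:

zdb               # dump the cached configuration of all imported pools
zdb -c tank       # traverse the pool and verify block checksums (can be slow)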

Thanks,

Jon

-- 

Jonathan Loran
IT Manager
Space Sciences Laboratory, UC Berkeley
(510) 643-5146  [EMAIL PROTECTED]
AST:7731^29u18e3




[zfs-discuss] ZFS and disk usage management?

2008-05-05 Thread John.Stewart
 
After struggling for some time to try and wedge a ZFS file server into
our environment, I have come to the conclusion that I'm simply going to
have to live without quotas. They have been immensely useful over the past
five years or so in letting us keep track of which groups are hogging
disk space, and even in finding a bug in one manufacturing/engineering tool
that occasionally crashed in a way that generated 4GB files.
 
The problem is that, as implemented with ZFS and Solaris 10, NFS mounts
cannot cross filesystem boundaries. For example, we have client machines
mounting /groups/accounting... but we also have clients mounting /groups
directly.
 
I know the zfs answer/dogma is automounts, but it's not that simple. I
have no good way to know what is being mounted in which manner (blame
the NAS 5320's boatload of bugs there... I could go on and on there in a
curse-filled tirade), so the only real way to know is to migrate and
find out what breaks. Not good. Furthermore, there is no reasonable
fix, anyway, other than some serious automount voodoo.
 
So, this means making the zfs filesystems at the /groups level instead
of the /groups/accounting level as I had expected to do... meaning we
can't implement quotas in any reasonable manner that I know of.
 
That given, do I have any good options for monitoring the usage of
subdirectories within my ZFS filesystems, short of running a 'du -sh
/groups/*' every night? That sure seems like a kludge.
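
A minimal sketch of the nightly kludge described above, assuming a root crontab entry and
a hypothetical log file; it just records a timestamped, size-sorted du of the group
directories:

# crontab entry: run at 02:00 every night
0 2 * * * ( date; /usr/bin/du -sk /groups/* | sort -rn ) >> /var/log/group-usage.log 2>&1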
 
thanks
 
johnS


Re: [zfs-discuss] ZFS still crashing after patch

2008-05-05 Thread Bob Friesenhahn
On Mon, 5 May 2008, eric kustarz wrote:

 That's not true:
 http://blogs.sun.com/erickustarz/entry/zil_disable

 Perhaps people are using consistency to mean different things here...

Consistency means that fsync() assures that the data will be written 
to disk so no data is lost.  It is not the same thing as no 
corruption.  ZFS will happily lose some data in order to avoid some 
corruption if the system loses power.

Bob
==
Bob Friesenhahn
[EMAIL PROTECTED], http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,http://www.GraphicsMagick.org/



Re: [zfs-discuss] ZFS still crashing after patch

2008-05-05 Thread Bob Friesenhahn
On Mon, 5 May 2008, Marcelo Leal wrote:
 I'm calling consistency a coherent local view...
 I think that was one option to debug (if not an NFS server) without
 generating a corrupted filesystem.

In other words your flight reservation will not be lost if the system 
crashes.

Bob
==
Bob Friesenhahn
[EMAIL PROTECTED], http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,http://www.GraphicsMagick.org/



Re: [zfs-discuss] ZFS and disk usage management?

2008-05-05 Thread Bob Friesenhahn
On Mon, 5 May 2008, [EMAIL PROTECTED] wrote:
 The problem is the fact that NFS mounts cannot be done across
 filesystems as implemented with ZFS and Solaris 10. For example, we have
 client machines mounting to /groups/accounting... but we also have
 clients mounting to /groups directly.

On my system I have a /home filesystem, and then I have additional 
logical per-user filesystems underneath.  I know that I can mount 
/home directly, but I currently automount the per-user filesystems 
since otherwise user permissions and filesystem quotas are not visible 
to the client on anything other than Solaris 10.
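
For context, a sketch of the kind of automount setup being described, assuming a
hypothetical server named 'fileserver' that exports the per-user filesystems under
/export/home:

# /etc/auto_master
/home   auto_home

# /etc/auto_home -- one wildcard entry covers every per-user filesystem
*       fileserver:/export/home/&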

I assume that ZFS quotas are enforced even if the current size and 
free space are not reflected in the user-visible 'df'.  Is that not 
true?

Presumably applications get some unexpected error when the quota limit 
is hit since the client OS does not know the real amount of space 
free.

Bob
==
Bob Friesenhahn
[EMAIL PROTECTED], http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,http://www.GraphicsMagick.org/



Re: [zfs-discuss] ZFS still crashing after patch

2008-05-05 Thread eric kustarz

On May 5, 2008, at 4:43 PM, Bob Friesenhahn wrote:

 On Mon, 5 May 2008, eric kustarz wrote:

 That's not true:
 http://blogs.sun.com/erickustarz/entry/zil_disable

 Perhaps people are using consistency to mean different things  
 here...

 Consistency means that fsync() assures that the data will be written  
 to disk so no data is lost.  It is not the same thing as no  
 corruption.  ZFS will happily lose some data in order to avoid some  
 corruption if the system loses power.

Ok, that makes more sense.  You're talking from the application  
perspective, whereas my blog entry is from the file system's  
perspective (disabling the ZIL does not compromise on-disk consistency).

eric


[zfs-discuss] sharesmb settings not working with some filesystems

2008-05-05 Thread Benjamin Staffin
This is really strange.  Check out this error:

[EMAIL PROTECTED] ~]# zfs get sharesmb tank/software tank/music
NAME           PROPERTY  VALUE  SOURCE
tank/music     sharesmb  off    default
tank/software  sharesmb  off    default

(same begin state for both filesystems)

[EMAIL PROTECTED] ~]# zfs set sharesmb=name=music tank/music
(works fine, and clients can mount it)

[EMAIL PROTECTED] ~]# zfs set sharesmb=name=software tank/software
cannot share 'tank/software': smb add share failed

 fails.  WTF?

NFS sharing tank/software also fails, but I don't see any errors on
the server side.  You have to try mounting it before you see any
failures there.

My zpool and the various zfs filesystems were created with sxce b76.
The problem seemed to crop up while running b82.  At that point I did
'zpool upgrade -a' and 'zfs upgrade -a' to see if that helped (... since
apparently cifs sharing isn't actually supposed to work with zfs v2!).
It did not.  I've now upgraded to b87, and I'm seeing the exact same
problem.

I'm not manually configuring things with sharemgr.  I did at one time,
but I decided to remove those settings and try to use zfs sharesmb and
sharenfs directly.  Maybe this is where the problem started?
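
Given the earlier sharemgr experiments, one thing worth checking is whether any stale
share groups or legacy shares are still hanging around; a sketch, assuming the stock
sharemgr and zfs tools:

sharemgr show -vp                       # list all share groups and their shares
zfs get -r sharesmb,sharenfs tank       # what ZFS itself thinks is shared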

What's going on here?  It smells to me like some unpleasant voodoo. :(

- Ben


Re: [zfs-discuss] sharesmb settings not working with some filesystems

2008-05-05 Thread Rob
  cannot share 'tank/software': smb add share failed

You meant to post this in storage-discuss,
but try:

chmod 777 /tank/software
zfs set sharesmb=name=software tank/software




Re: [zfs-discuss] ZFS and Linux

2008-05-05 Thread Bill McGonigle
Is it also true that ZFS can't be re-implemented in GPLv2 code because then the 
CDDL-based patent protections don't apply?
 
 