[zfs-discuss] strange performance drop of solaris 10/zfs

2009-01-29 Thread Kevin Maguire
Hi

We have been using a Solaris 10 system (Sun-Fire-V245) for a while as
our primary file server. This is based on Solaris 10 06/06, plus
patches up to approximately May 2007. It is a production machine and,
until about a week ago, had had few problems.

Attached to the V245 is a SCSI RAID array, which presents one LUN to
the OS.  On this LUN is a zpool (tank), and within it 300+ ZFS file
systems (one per user, for automounted home directories). The system is
connected to our LAN via gigabit Ethernet; most of our NFS clients
have just a 100FD network connection.

In recent days the performance of the file server seems to have gone off a
cliff, and I don't know how to troubleshoot what might be wrong. Typical
"zpool iostat 120" output is shown below. If I run "truss -D df" I see
each call to statvfs64("/tank/bla") take 2-3 seconds. The RAID itself
is healthy, and all disks are reporting as OK.
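
Is watching the underlying LUN with something like the following a sensible
next step?  My understanding is that asvc_t is the average service time per
I/O on the device, so a consistently large value would point at the array
rather than at ZFS or NFS:

#  iostat -xnz 10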

I have tried to establish via nfslogd whether some client or clients are
thrashing the server, but without seeing anything obvious.  Is there
some kind of per-zfs-filesystem iostat?
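
The closest thing I could come up with is the DTrace sketch below, which
aggregates ZFS reads/writes by file path (the path prefix identifies the
per-user filesystem here).  I am assuming the fbt entry probes for zfs_read
and zfs_write are usable on this kernel and that v_path is populated, so
treat it as an untested sketch rather than a recipe:

#  dtrace -qn '
     fbt:zfs:zfs_read:entry, fbt:zfs:zfs_write:entry
     /args[0]->v_path != NULL/
     { @[probefunc, stringof(args[0]->v_path)] = count(); }
     tick-30s { printa(@); trunc(@); }'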

End users are reporting that just saving small files can take 5-30 seconds.
prstat/top show no process using significant CPU.  The system
has 8GB of RAM, and vmstat shows nothing interesting.
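
Would looking at the ARC be of any use?  I assume the relevant kstats are
something like the following, though I am not sure what "healthy" numbers
would look like on a box with 8GB of RAM:

#  kstat -p zfs:0:arcstats:size zfs:0:arcstats:c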

I have another V245 with the same SCSI/RAID/ZFS setup and a similar
(though somewhat lighter) load of data and users, and this problem is NOT
apparent there.

Suggestions?
Kevin

Thu Jan 29 11:32:29 CET 2009
               capacity     operations    bandwidth
pool         used  avail   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
tank        2.09T   640G     10     66   825K  1.89M
tank        2.09T   640G     39      5  4.80M   126K
tank        2.09T   640G     38      8  4.73M   191K
tank        2.09T   640G     40      5  4.79M   126K
tank        2.09T   640G     39      5  4.73M   170K
tank        2.09T   640G     40      3  4.88M  43.8K
tank        2.09T   640G     40      3  4.87M  54.7K
tank        2.09T   640G     39      4  4.81M   111K
tank        2.09T   640G     39      9  4.78M   134K
tank        2.09T   640G     37      5  4.61M   313K
tank        2.09T   640G     39      3  4.89M  32.8K
tank        2.09T   640G     35      7  4.31M   629K
tank        2.09T   640G     28     13  3.47M  1.43M
tank        2.09T   640G      5     51   433K  4.27M
tank        2.09T   640G      6     51   450K  4.23M
tank        2.09T   639G      5     52   543K  4.23M
tank        2.09T   640G     26     57  3.00M  1.15M
tank        2.09T   640G     39      6  4.82M   107K
tank        2.09T   640G     39      3  4.80M   119K
tank        2.09T   640G     38      8  4.64M   295K
tank        2.09T   640G     40      7  4.82M   102K
tank        2.09T   640G     43      5  4.79M   103K
tank        2.09T   640G     39      4  4.73M   193K
tank        2.09T   640G     39      5  4.87M  62.1K
tank        2.09T   640G     40      3  4.88M  49.3K
tank        2.09T   640G     40      3  4.80M   122K
tank        2.09T   640G     42      4  4.83M  82.0K
tank        2.09T   640G     40      3  4.89M  42.0K
...


[zfs-discuss] zpool lost

2007-02-13 Thread Kevin Maguire
Hi

I'm using fairly stock S10, but this is really just a zfs/zpool question.

#  uname -a
SunOS peach 5.10 Generic_118833-36 sun4u sparc SUNW,Sun-Blade-100


Misremembering the file(1) option for handling special files (-s), I executed
the following:

#  file -m /dev/dsk/c*s2

My shell would have expanded this as follows:

#  ls /dev/dsk/c*s2
/dev/dsk/c0t0d0s2  /dev/dsk/c0t1d0s2  /dev/dsk/c0t2d0s2

This is an SB100 with 2 disks: c0t1d0 has the OS and other partitions, and c0t2d0
is the CD-ROM.

The other disk was one large, single-disk zpool.  It's not mountable now.

#  zpool status -v
  pool: tank
 state: FAULTED
status: One or more devices could not be opened.  There are insufficient
replicas for the pool to continue functioning.
action: Attach the missing device and online it using 'zpool online'.
   see: http://www.sun.com/msg/ZFS-8000-D3
 scrub: none requested
config:

NAME        STATE     READ WRITE CKSUM
tank        UNAVAIL      0     0     0  insufficient replicas
  c0t0d0    UNAVAIL      0     0     0  cannot open


Not knowing what the "file -m ..." command might have done, I wonder if this
situation is recoverable.  Looking at the fault manager (FMA) logs I see that my
pool (tank) is at least recognized to some degree:

# fmdump -eV 

Feb 09 2007 16:41:46.082264521 ereport.fs.zfs.vdev.open_failed
nvlist version: 0
        class = ereport.fs.zfs.vdev.open_failed
        ena = 0x10163875741
        detector = (embedded nvlist)
        nvlist version: 0
                version = 0x0
                scheme = zfs
                pool = 0x347b6d721b99340d
                vdev = 0xa7f43f56535e14a8
        (end detector)

        pool = tank
        pool_guid = 0x347b6d721b99340d
        pool_context = 1
        vdev_guid = 0xa7f43f56535e14a8
        vdev_type = disk
        vdev_path = /dev/dsk/c0t0d0s0
        vdev_devid = id1,[EMAIL PROTECTED]/a
        parent_guid = 0x347b6d721b99340d
        parent_type = root
        prev_state = 0x1
        __ttl = 0x1
        __tod = 0x45cc963a 0x4e741c9

Feb 09 2007 16:41:46.082265280 ereport.fs.zfs.zpool
nvlist version: 0
        class = ereport.fs.zfs.zpool
        ena = 0x10163875741
        detector = (embedded nvlist)
        nvlist version: 0
                version = 0x0
                scheme = zfs
                pool = 0x347b6d721b99340d
        (end detector)

        pool = tank
        pool_guid = 0x347b6d721b99340d
        pool_context = 1
        __ttl = 0x1
        __tod = 0x45cc963a 0x4e744c0
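
To see how badly the disk may have been touched, I was planning to check
whether the VTOC and the on-disk ZFS labels still look sane, roughly as
follows (my understanding is that "zdb -l" dumps the four vdev labels ZFS
keeps on the device, but I have not used it much, so corrections welcome):

#  prtvtoc /dev/rdsk/c0t0d0s2
#  zdb -l /dev/dsk/c0t0d0s0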

Any ideas/suggestions?

Kevin
 
 


[zfs-discuss] commercial backup software and zfs

2006-08-15 Thread Kevin Maguire
Hi

Is the following an accurate statement of the current status of (for me) the
three main commercial backup software solutions out there?

(1) Legato NetWorker - support coming in a 7.3.2 patch due soon.  Until then
backups of ZFS filesystems won't work at all, as the acl(2) calls will fail. But
they will work if you use "legacy" mounting of the ZFS filesystems (see the sketch
below), though even then they won't include any ACLs.  Or back up via NFS mounts.

(2) IBM's TSM - no current or official support from IBM.  Will/won't it work?

(3) Veritas NetBackup - client v5.1+ works out of the box, but without full (or
perhaps any) ACL support.

?
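
For what it's worth, by "legacy" mounting in (1) I mean roughly the
following, i.e. taking the dataset out of ZFS's automatic mount management
and listing it in /etc/vfstab instead (the dataset and mount point below are
just made-up examples):

#  zfs set mountpoint=legacy tank/home/someuser
#  grep someuser /etc/vfstab
tank/home/someuser  -  /export/home/someuser  zfs  -  yes  -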

Kevin

PS: I'm particularly interested in (2)
 
 