Hi All,

Yesterday we ran some tests with ZFS on a new server and a new JBOD that are 
going into production this week.

Here is what we found:


1) Solaris seems unable to recognize as a "disk" any FC disk that was already 
labeled by a storage processor. cfgadm reports them as "unknown".
We had to boot Linux and wipe the partition table before Solaris would 
recognize the disks ... :(


2) Our test server was connected to the JBOD through a dual FC adapter and 
dual FC switches, with MPxIO enabled.
We got MANY PANICS when doing any of the following while the pool was under 
load from a dd:

-disconnecting and reconnecting one of the FC links a few times
-enabling/disabling an FC link port on one FC switch
-powering off one of the two FC switches


Sometimes we get a panic and nothing in the logs!
Just a few examples:

Mar  3 18:38:54 TESTSVR     offlining lun=0 (trace=0), target=cd (trace=2800004)
Mar  3 18:38:55 TESTSVR unix: [ID 836849 kern.notice] 
Mar  3 18:38:55 TESTSVR ^Mpanic[cpu0]/thread=fffffe8000d1cc80: 
Mar  3 18:38:55 TESTSVR genunix: [ID 809409 kern.notice] ZFS: I/O failure 
(write on <unknown> off 0: zio fffffe8322055280 [L0 unallocated] 20000L/20000P 
DVA[0]=<1:575a0000:20000> fletcher2 uncompressed LE contiguous birth=9 fill=0 
cksum=0:0:0:0): error 14
Mar  3 18:38:55 TESTSVR unix: [ID 100000 kern.notice] 
Mar  3 18:38:55 TESTSVR genunix: [ID 655072 kern.notice] fffffe8000d1cac0 
zfs:zfsctl_ops_root+2f9c8b42 ()
Mar  3 18:38:55 TESTSVR genunix: [ID 655072 kern.notice] fffffe8000d1cad0 
zfs:zio_next_stage+72 ()
Mar  3 18:38:55 TESTSVR genunix: [ID 655072 kern.notice] fffffe8000d1cb00 
zfs:zio_wait_for_children+49 ()
Mar  3 18:38:55 TESTSVR genunix: [ID 655072 kern.notice] fffffe8000d1cb10 
zfs:zio_wait_children_done+15 ()
Mar  3 18:38:55 TESTSVR genunix: [ID 655072 kern.notice] fffffe8000d1cb20 
zfs:zio_next_stage+72 ()
Mar  3 18:38:55 TESTSVR genunix: [ID 655072 kern.notice] fffffe8000d1cb60 
zfs:zio_vdev_io_assess+82 ()
Mar  3 18:38:55 TESTSVR genunix: [ID 655072 kern.notice] fffffe8000d1cb70 
zfs:zio_next_stage+72 ()
Mar  3 18:38:55 TESTSVR genunix: [ID 655072 kern.notice] fffffe8000d1cbd0 
zfs:vdev_mirror_io_done+c1 ()
Mar  3 18:38:55 TESTSVR genunix: [ID 655072 kern.notice] fffffe8000d1cbe0 
zfs:zio_vdev_io_done+14 ()
Mar  3 18:38:55 TESTSVR genunix: [ID 655072 kern.notice] fffffe8000d1cc60 
genunix:taskq_thread+bc ()
Mar  3 18:38:55 TESTSVR genunix: [ID 655072 kern.notice] fffffe8000d1cc70 
unix:thread_start+8 ()
Mar  3 18:38:55 TESTSVR unix: [ID 100000 kern.notice] 
Mar  3 18:38:55 TESTSVR genunix: [ID 672855 kern.notice] syncing file systems...

Mar  3 18:51:52 TESTSVR savecore: [ID 570001 auth.error] reboot after panic: 
ZFS: I/O failure (write on <unknown> off 0: zio fffffe8322055280 [L0 
unallocated] 20000L/20000P DVA[0]=<1:575a0000:20000> fletcher2 uncompressed LE 
contiguous birth=9 fill=0 cksum=0:0:0:0): error 14


PANIC
Nothing in the log!
Mar  4 19:08:21 TESTSVR savecore: [ID 570001 auth.error] reboot after panic: 
ZFS: I/O failure (write on <unknown> off 0: zio fffffe8322055280 [L0 
unallocated] 20000L/20000P DVA[0]=<1:575a0000:20000> fletcher2 uncompressed LE 
contiguous birth=9 fill=0 cksum=0:0:0:0): error 14


PANIC
Nothing in the log!
Mar  4 19:11:20 TESTSVR savecore: [ID 570001 auth.error] reboot after panic: 
ZFS: I/O failure (write on <unknown> off 0: zio fffffe8322055280 [L0 
unallocated] 20000L/20000P DVA[0]=<1:575a0000:20000> fletcher2 uncompressed LE 
contiguous birth=9 fill=0 cksum=0:0:0:0): error 14




Mar  4 19:25:37 TESTSVR genunix: [ID 834635 kern.info] /scsi_vhci/[EMAIL 
PROTECTED] (sd13) multipath status: degraded, path /[EMAIL 
PROTECTED],0/pci1022,[EMAIL PROTECTED]/pci1011,[EMAIL 
PROTECTED]/pci1077,[EMAIL PROTECTED]/[EMAIL PROTECTED],0 (fp2) to target 
address: w22000004cfd87b7b,0 is offline Load balancing: round-robin
Mar  4 19:25:37 TESTSVR unix: [ID 836849 kern.notice] 
Mar  4 19:25:37 TESTSVR ^Mpanic[cpu3]/thread=fffffe80002e1c80: 
Mar  4 19:25:37 TESTSVR genunix: [ID 809409 kern.notice] ZFS: I/O failure 
(write on <unknown> off 0: zio fffffe811bdb7800 [L0 unallocated] 20000L/20000P 
DVA[0]=<3:56260000:20000> fletcher2 uncompressed LE contiguous birth=22 fill=0 
cksum=0:0:0:0): error 14
Mar  4 19:25:37 TESTSVR unix: [ID 100000 kern.notice] 
Mar  4 19:25:37 TESTSVR genunix: [ID 655072 kern.notice] fffffe80002e1ac0 
zfs:zfsctl_ops_root+2f9c8b42 ()
Mar  4 19:25:37 TESTSVR genunix: [ID 655072 kern.notice] fffffe80002e1ad0 
zfs:zio_next_stage+72 ()
Mar  4 19:25:37 TESTSVR genunix: [ID 655072 kern.notice] fffffe80002e1b00 
zfs:zio_wait_for_children+49 ()
Mar  4 19:25:37 TESTSVR genunix: [ID 655072 kern.notice] fffffe80002e1b10 
zfs:zio_wait_children_done+15 ()
Mar  4 19:25:37 TESTSVR genunix: [ID 655072 kern.notice] fffffe80002e1b20 
zfs:zio_next_stage+72 ()
Mar  4 19:25:37 TESTSVR genunix: [ID 655072 kern.notice] fffffe80002e1b60 
zfs:zio_vdev_io_assess+82 ()
Mar  4 19:25:37 TESTSVR genunix: [ID 655072 kern.notice] fffffe80002e1b70 
zfs:zio_next_stage+72 ()
Mar  4 19:25:37 TESTSVR genunix: [ID 655072 kern.notice] fffffe80002e1bd0 
zfs:vdev_mirror_io_done+c1 ()
Mar  4 19:25:37 TESTSVR genunix: [ID 655072 kern.notice] fffffe80002e1be0 
zfs:zio_vdev_io_done+14 ()
Mar  4 19:25:37 TESTSVR genunix: [ID 655072 kern.notice] fffffe80002e1c60 
genunix:taskq_thread+bc ()
Mar  4 19:25:37 TESTSVR genunix: [ID 655072 kern.notice] fffffe80002e1c70 
unix:thread_start+8 ()
Mar  4 19:25:37 TESTSVR unix: [ID 100000 kern.notice] 
Mar  4 19:25:37 TESTSVR genunix: [ID 672855 kern.notice] syncing file systems...
Mar  4 19:25:37 TESTSVR genunix: [ID 904073 kern.notice]  done






Mar  4 19:34:24 TESTSVR genunix: [ID 403854 kern.notice] assertion failed: 
vdev_config_sync(rvd, txg) == 0, file: ../../common/fs/zfs/spa.c, line: 2801
Mar  4 19:34:24 TESTSVR unix: [ID 100000 kern.notice] 
Mar  4 19:34:24 TESTSVR genunix: [ID 802836 kern.notice] fffffe80007e0b60 
fffffffffb9b49f3 ()
Mar  4 19:34:24 TESTSVR genunix: [ID 655072 kern.notice] fffffe80007e0bd0 
zfs:spa_sync+39c ()
Mar  4 19:34:24 TESTSVR genunix: [ID 655072 kern.notice] fffffe80007e0c60 
zfs:txg_sync_thread+115 ()
Mar  4 19:34:24 TESTSVR genunix: [ID 655072 kern.notice] fffffe80007e0c70 
unix:thread_start+8 ()
Mar  4 19:34:24 TESTSVR unix: [ID 100000 kern.notice] 
Mar  4 19:34:24 TESTSVR genunix: [ID 672855 kern.notice] syncing file systems...
Mar  4 19:34:24 TESTSVR genunix: [ID 904073 kern.notice]  done





Mar  4 20:33:35 TESTSVR genunix: [ID 809409 kern.notice] ZFS: I/O failure 
(write on <unknown> off 0: zio fffffe83170da300 [L0 unallocated] 20000L/20000P 
DVA[0]=<6:70660000:20000> fletcher2 uncompressed LE contiguous birth=462 fill=0 
cksum=0:0:0:0): error 14
Mar  4 20:33:35 TESTSVR unix: [ID 100000 kern.notice] 
Mar  4 20:33:35 TESTSVR genunix: [ID 655072 kern.notice] fffffe80007b9ac0 
zfs:zfsctl_ops_root+2f9c8b42 ()
Mar  4 20:33:35 TESTSVR genunix: [ID 655072 kern.notice] fffffe80007b9ad0 
zfs:zio_next_stage+72 ()
Mar  4 20:33:35 TESTSVR genunix: [ID 655072 kern.notice] fffffe80007b9b00 
zfs:zio_wait_for_children+49 ()
Mar  4 20:33:35 TESTSVR genunix: [ID 655072 kern.notice] fffffe80007b9b10 
zfs:zio_wait_children_done+15 ()
Mar  4 20:33:35 TESTSVR genunix: [ID 655072 kern.notice] fffffe80007b9b20 
zfs:zio_next_stage+72 ()
Mar  4 20:33:35 TESTSVR genunix: [ID 655072 kern.notice] fffffe80007b9b60 
zfs:zio_vdev_io_assess+82 ()
Mar  4 20:33:35 TESTSVR genunix: [ID 655072 kern.notice] fffffe80007b9b70 
zfs:zio_next_stage+72 ()
Mar  4 20:33:35 TESTSVR genunix: [ID 655072 kern.notice] fffffe80007b9bd0 
zfs:vdev_mirror_io_done+c1 ()
Mar  4 20:33:35 TESTSVR genunix: [ID 655072 kern.notice] fffffe80007b9be0 
zfs:zio_vdev_io_done+14 ()
Mar  4 20:33:35 TESTSVR genunix: [ID 655072 kern.notice] fffffe80007b9c60 
genunix:taskq_thread+bc ()
Mar  4 20:33:35 TESTSVR genunix: [ID 655072 kern.notice] fffffe80007b9c70 
unix:thread_start+8 ()
Mar  4 20:33:35 TESTSVR unix: [ID 100000 kern.notice] 
Mar  4 20:33:35 TESTSVR genunix: [ID 672855 kern.notice] syncing file systems...
Mar  4 20:33:36 TESTSVR genunix: [ID 904073 kern.notice]  done
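
For anyone who wants to compare the panic strings side by side, here is a rough 
Python sketch that pulls the main fields out of a pasted log excerpt. It is only 
an illustration: the function name parse_panics and the SAMPLE text are made up 
for this example, and the field layout (vdev in decimal, offset and size in hex) 
is an assumption based on the messages in this mail, not on the ZFS source.

```python
import re

# Fields of a "ZFS: I/O failure" panic string, matched after line wraps are removed.
# Assumption: DVA[0]=<vdev:offset:size> with vdev decimal, offset/size hexadecimal.
PANIC_RE = re.compile(
    r"ZFS: I/O failure\s*\((?P<op>\w+) on .*?"
    r"zio (?P<zio>[0-9a-f]+).*?"
    r"DVA\[0\]=<(?P<vdev>\d+):(?P<offset>[0-9a-f]+):(?P<size>[0-9a-f]+)>.*?"
    r"birth=(?P<birth>\d+).*?"
    r"error\s*(?P<errno>\d+)"
)


def parse_panics(text):
    """Return one dict per panic string found in a hard-wrapped log excerpt."""
    # Mail wrapping can split a message in the middle of a token, so drop the
    # newlines before matching. This relies on the following syslog line
    # starting with a month name rather than a digit.
    joined = text.replace("\n", "")
    panics = []
    for m in PANIC_RE.finditer(joined):
        panics.append({
            "op": m.group("op"),
            "zio": m.group("zio"),
            "vdev": int(m.group("vdev")),
            "offset": int(m.group("offset"), 16),
            "size": int(m.group("size"), 16),
            "birth": int(m.group("birth")),
            "errno": int(m.group("errno")),
        })
    return panics


# Two of the panic messages from this mail, wrapped the way they were pasted.
SAMPLE = (
    "Mar  3 18:38:55 TESTSVR genunix: [ID 809409 kern.notice] ZFS: I/O failure \n"
    "(write on <unknown> off 0: zio fffffe8322055280 [L0 unallocated] 20000L/20000P \n"
    "DVA[0]=<1:575a0\n"
    "000:20000> fletcher2 uncompressed LE contiguous birth=9 fill=0 cksum=0:0:0:0): \n"
    "error 14\n"
    "Mar  4 19:25:37 TESTSVR genunix: [ID 809409 kern.notice] ZFS: I/O failure \n"
    "(write on <unknown> off 0: zio fffffe811bdb7800 [L0 unallocated] 20000L/20000P \n"
    "DVA[0]=<3:56260\n"
    "000:20000> fletcher2 uncompressed LE contiguous birth=22 fill=0 cksum=0:0:0:0): \n"
    "error 14\n"
)

for panic in parse_panics(SAMPLE):
    print(panic)
```

Running the same function over a whole messages file makes it easy to see 
whether repeated panics report the same zio and DVA or different ones.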








3) Zpool throughput isn't stable. Sometimes we get 150+ MB/s write performance 
with 1 link, sometimes only around 40 MB/s.
Using only one FC link always gives us stable performance at 160 MB/s.
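
To put numbers on the variation, something like the following Python sketch 
could be run several times against a file on the pool. It is not what we used 
for the figures above (those came from dd); the function names, the 128 KiB 
block size and any path you pass in are assumptions for illustration only.

```python
import os
import statistics
import time


def measure_write_mbps(path, total_bytes, block_size=128 * 1024):
    """Write total_bytes sequentially to path and return throughput in MB/s."""
    # Zero-filled blocks, similar to a dd from /dev/zero. With compression
    # enabled on the dataset this would overstate the real throughput.
    block = b"\0" * block_size
    written = 0
    start = time.perf_counter()
    with open(path, "wb") as f:
        while written < total_bytes:
            chunk = block[: min(block_size, total_bytes - written)]
            f.write(chunk)
            written += len(chunk)
        f.flush()
        os.fsync(f.fileno())  # count the time needed to reach stable storage
    elapsed = max(time.perf_counter() - start, 1e-9)
    return written / elapsed / 1e6


def summarize(samples):
    """Summarize a list of MB/s samples so the spread between runs is visible."""
    return {
        "min": min(samples),
        "max": max(samples),
        "mean": statistics.mean(samples),
        "spread": max(samples) - min(samples),
    }
```

For example, calling measure_write_mbps on a hypothetical file such as 
/testpool/throughput.bin ten times with a few GB each, once with both links 
active and once with a single link, and passing the results to summarize would 
show whether the spread really differs between the two setups.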


Conclusion:
After a day of tests we are starting to think that ZFS doesn't work well with 
MPxIO.


Awaiting your comments,
Gino
 
 
This message posted from opensolaris.org
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
