I've recently upgraded my x4500 to Nevada build 97, and am having problems with 
the iSCSI target.

Background: this box serves NFS (zfs filesystem-type datasets) underneath a 
VMware ESX environment, and presents iSCSI targets (zfs zvol datasets) to a 
Windows host and to Solaris 10 hosts, which use them as zoneroots.  For optimal 
random-read performance, I've configured a single zfs pool of mirrored vdevs 
across all 44 data disks (+2 boot disks, +2 spares = 48).

Before the upgrade, the box was flaky under load: all I/Os to the ZFS pool 
would stop occasionally.

Since the upgrade, that hasn't happened, and the NFS clients are quite happy.  
The iSCSI initiators are not.

The Windows host is running the Microsoft iSCSI initiator v2.0.6 on 
Windows 2003 SP2 x64 Enterprise Edition.  When that system reboots, it is not 
able to connect to its iSCSI targets.  No devices are found until I restart the 
iscsitgtd process on the x4500, at which point the initiator reconnects and 
finds everything.  I notice that the x4500 keeps an active TCP connection to 
the Windows box (according to netstat -an | grep 3260) through the reboot and 
for a long time afterwards.  The initiator starts a second connection, but it 
seems that the target doesn't let go of the old one.  Or something.  At this 
point, every time I reboot the Windows system I have to 
`pkill iscsitgtd`.
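
For anyone who wants to check for the same symptom, here is a rough sketch of 
how the netstat output could be summarized.  It assumes Solaris-style 
`netstat -an` lines, where address and port are joined by a dot 
(e.g. 192.168.10.5.3260) and the TCP state is the last column.  The function 
names and the sample addresses are made up for illustration; they are not from 
my actual capture.

```python
import re
from collections import Counter

ISCSI_PORT = "3260"

# Hypothetical sample in the assumed `netstat -an` layout (not real output).
SAMPLE = """\
192.168.10.5.3260    192.168.10.20.1046   64240 0 49640 0 ESTABLISHED
192.168.10.5.3260    192.168.10.20.1093   64240 0 49640 0 ESTABLISHED
192.168.10.5.3260    192.168.10.30.32801  49640 0 49640 0 ESTABLISHED
      *.3260               *.*                0 0 49152 0 LISTEN
"""


def iscsi_connections(netstat_text, port=ISCSI_PORT):
    """Return (local, remote, state) for lines whose local address ends in the port."""
    rows = []
    for line in netstat_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # blank line or header fragment
        local, remote, state = fields[0], fields[1], fields[-1]
        # Accept both "addr.port" and "addr:port" separators.
        if local.endswith("." + port) or local.endswith(":" + port):
            rows.append((local, remote, state))
    return rows


def peers_with_multiple_established(rows):
    """Map each remote host with more than one ESTABLISHED connection to its count."""
    counts = Counter()
    for _local, remote, state in rows:
        if state == "ESTABLISHED":
            host = re.sub(r"[.:]\d+$", "", remote)  # strip the trailing port
            counts[host] += 1
    return {host: n for host, n in counts.items() if n > 1}


if __name__ == "__main__":
    print(peers_with_multiple_established(iscsi_connections(SAMPLE)))
```

More than one established connection from a peer is not necessarily wrong (an 
initiator logged into several targets opens several), so this only narrows 
down where to look for a connection that survived the reboot.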

The Solaris system is running S10 Update 4.  Every once in a while (twice 
today, and not correlated with the pkills above) the system reports that all 
of the iSCSI disks are unavailable.  Nothing I've tried short of a reboot of 
the whole host brings them back.  All of the zones on the system remount their 
zoneroots read-only (and give I/O errors when read or zlogin'd to).

There is a set of TCP connections from the zonehost to the x4500 that remains 
even after disabling the iscsi_initiator service.  There's no process holding 
them as far as pfiles can tell.

Does this sound familiar to anyone?  Any suggestions on what I can do to 
troubleshoot further?  I have a kernel dump from the zonehost and a snoop 
capture of the wire for the Windows host (but it's big).

I'll be opening a bug too.

Thanks,
--Joe
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss