What version of open-iscsi and kernel are you using? And are you using 
the kernel modules with open-iscsi or the ones that come with the kernel?

Nicholas A. Bellinger wrote:
>>
>> The problem is that the failure of the outstanding I/Os does not seem to
>> be occurring in all cases.  In particular, an iscsiadm --logout I believe
>> is getting issued, and said logout request failing/timing out because
>> DRBD_TARGET has been released.  It is at this point where umount for the
>> ext3 mount and/or sync hangs indefinitely.  When the problem occurs, it
>> looks like this from the kernel ring buffer:
>>
>> iscsi_deallocate_extra_thread_sets:285: ***OPS*** Stopped 1 thread set(s) (2 
>> total threads).
>> iscsi_deallocate_extra_thread_sets:285: ***OPS*** Stopped 2 thread set(s) (4 
>> total threads).
>> session10: iscsi: session recovery timed out after 120 secs
>> sd 51:0:0:0: scsi: Device offlined - not ready after error recovery

If you see this, then all IO that was already sent to the device, and any
new IO, should be failed to the FS and block layer, like below. There is
a bug in some kernels, though, where running an iscsiadm logout command
can hang and lead to weird problems, because the scsi layer is broken.
If you use open-iscsi 869.2's kernel modules or the iscsi modules in
2.6.25.6 or newer, then this is fixed. I am not sure that is what you are
seeing, because we see IO failed upwards here. Also, once we see
"Device offlined", the scsi layer is going to fail the IO when it hits
the scsi prep functions, so it never even reaches us. If there is IO
stuck in the driver you could do
cat /sys/class/scsi_host/hostX/host_busy

to check. That file prints the number of commands the scsi layer has
sent to the driver that the driver has not yet returned, i.e. how many
commands are outstanding.
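
If you do not know which hostX belongs to the session, a rough Python
sketch like the one below could read host_busy for every SCSI host in
one go. The function name outstanding_commands and the sysfs_root
parameter are made up for illustration; the only thing it relies on is
the host_busy file mentioned above.

```python
from pathlib import Path


def outstanding_commands(sysfs_root="/sys/class/scsi_host"):
    """Return {host_name: host_busy} for each SCSI host under sysfs_root.

    sysfs_root is a hypothetical parameter so the function can be pointed
    at a test directory; on a real system the default path is used.
    """
    counts = {}
    for host_dir in sorted(Path(sysfs_root).glob("host*")):
        busy_file = host_dir / "host_busy"
        try:
            # host_busy holds a single decimal number followed by a newline.
            counts[host_dir.name] = int(busy_file.read_text().strip())
        except (OSError, ValueError):
            # The host went away or the file is missing/unreadable: skip it.
            continue
    return counts
```

Calling outstanding_commands() and printing the result before and after
the logout should show whether any host still has a non-zero count,
which would suggest commands are stuck in the driver.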


>> sd 51:0:0:0: [sdg] Result: hostbyte=DID_BUS_BUSY 
>> driverbyte=DRIVER_OK,SUGGEST_OK
>> end_request: I/O error, dev sdg, sector 0
>> Buffer I/O error on device sdg, logical block 0
>> lost page write due to I/O error on sdg




>>
>> I should mention that we are not doing any I/O to said iSCSI LUN via
>> Open/iSCSI other than the filesystem metadata for ext3 umount and
>> SYNCHRONIZE_CACHE CDB during struct scsi_device deregistration.  From
>> experience with Core-iSCSI, I know the logout path is tricky wrt
>> exceptions (I spent months on it to handle all cases with Immediate and
>> Non Immediate Logout, as well as doing logouts on the fly from the same
>> connection in MC/S and different connections in MC/S :-)
>>
>> So the question is:
>>
>> I) When an ISCSI_INIT_LOGOUT_REQ is not returned with an
>> ISCSI_TARGET_LOGOUT_RSP and replacement_timeout fires, are all
>> outstanding I/Os for that particular session being completed with a
>> non-recoverable exception..?  Has anyone ever run into this case and/or
>> tested it..?

If the connection is down when you run iscsiadm logout, we will not send
a logout and the replacement_timeout does not come into play. We just
fast fail the connection, clean up the commands and kernel resources,
and iscsiadm returns (yeah pretty bad I know - it is on the TODO).

If the connection is up when you run iscsiadm logout, and the connection
drops while the logout is in flight, we are again lazy and just fail,
clean up and return right away. The replacement_timeout does not come
into play for this either.


If you run 869.2 from open-iscsi.org and build with

make DEBUG_SCSI=1 DEBUG_TCP=1
make DEBUG_SCSI=1 DEBUG_TCP=1 install

and send all the log output, I can tell you better what is going on.
