On Wed, 2010-08-04 at 21:51 -0500, Mike Christie wrote:
> conn error 1011 is generic. If this is occurring when the eql box is 
> rebalancing luns, it is a little different than above. With the above 
> problem we did not know why we got the error. With your situation we 
> sort of expect this. We should not be getting disk IO errors though.
> 
> When we get the logout request from the target, we send the logout 
> request, then basically handle the cleanup like if we got a connection 
> error. That is why you would see the conn error msg in this path. This 
> also means if this happened to the same IO 5 times, then you would see 
> the disk IO errors (scsi layer only lets us retry disk IO 5 times). But 
> if it just happened once, then the IO should be retried when we log into 
> the new portal and execute like normal.
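
To make the retry budget described above concrete, here is a toy model
in Python. It is only a sketch, not the kernel's logic: the DiskIO
class and on_conn_error() are hypothetical names, the limit of 5 comes
from the remark above, and the exact boundary (whether the fifth or the
sixth hit fails the IO) is an assumption that follows the wording above.

```python
from dataclasses import dataclass

# Assumption: limit taken from the "retry disk IO 5 times" remark above.
MAX_RETRIES = 5


@dataclass
class DiskIO:
    """Hypothetical stand-in for one outstanding disk IO (not a kernel type)."""
    retries: int = 0
    allowed: int = MAX_RETRIES

    def on_conn_error(self) -> str:
        """Account for one connection error (or target logout) hitting this IO.

        Returns "retry" while the budget lasts, "io_error" once it is used up.
        Assumption: the hit that reaches the limit is the one that fails the IO.
        """
        self.retries += 1
        if self.retries >= self.allowed:
            return "io_error"
        return "retry"


if __name__ == "__main__":
    io = DiskIO()
    for hit in range(1, MAX_RETRIES + 1):
        print(hit, io.on_conn_error())
```

A single rebalance that hits an IO once stays in the "retry" state, so
under this model a disk IO error implies the same IO was hit repeatedly.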

What would be the best way to identify how many retries have elapsed?

> Or are you using dm-multipath over iscsi? In that case you do not get 
> any retries, so we would expect to see that end_request: I/O error 
> message, but dm-multipath should just be retrying a new path or 
> internally queueing for whatever timeout value you had it use in 
> multipath.conf.

Multipath is not enabled at all. The EqualLogic array is active/passive
and we only have a view into one controller at any time, so we don't
make use of multipath at present.

> Could you send me the libiscsi.c file you patched?
> 
> Could you also send more of the log for either case? I want to see the 
> iscsid log info and any more of the kernel iscsi log info that you have. 
> I am looking for session recovery timed out messages and/or target 
> requested logout messages.
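
For anyone wanting to pull those messages out of a large log, a rough
Python sketch follows. The regular expressions are loose guesses based
on the wording in this thread, not exact kernel or iscsid log strings,
and summarize() / summarize_file() are hypothetical helper names.

```python
import re
from collections import Counter

# Assumed patterns: loose matches derived from the wording in this thread.
# Adjust them to the exact strings that appear in your own logs.
PATTERNS = {
    "conn_error": re.compile(r"conn error \((\d+)\)"),
    "recovery_timeout": re.compile(r"session recovery timed out", re.I),
    "logout_request": re.compile(r"target.*request\w*.*logout", re.I),
    "io_error": re.compile(r"end_request: I/O error"),
}


def summarize(lines):
    """Count how many lines match each pattern of interest."""
    counts = Counter()
    for line in lines:
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


def summarize_file(path):
    """Same as summarize(), reading from a log file such as a messages file."""
    with open(path, errors="replace") as fh:
        return summarize(fh)


if __name__ == "__main__":
    import sys
    for name, count in sorted(summarize_file(sys.argv[1]).items()):
        print(f"{name}: {count}")
```

Run it with the path to the messages file as the only argument; many
conn errors with no recovery timeout would point at retries being
exhausted rather than at the session failing to come back.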

I've copied both the messages file from the host goncalog140 and the
patched libiscsi.c. FWIW, I've also included the iscsid.conf. The files
are at the link below:

http://promisc.org/iscsi/

N.B.: the messages file contains spew from other instrumentation tests
(e.g. a dump_stack() call in scsi_transport_iscsi.c::iscsi_conn_error()).
The last set of tests, which I made available yesterday, instruments
only libiscsi.c and, IIRC, iscsi_tcp.c; its output can be found around
the 17:50 timeframe.

If required I can spin a new set of tests with different instrumentation
and/or collect different information, logs or tcpdumps, if that helps in
any way.

Thanks,
 -Goncalo.
