On Mon, 2 Nov 2009, Santi Saez wrote:

> Randomly we get Open-iSCSI "conn errors" when connecting to an
> Infortrend A16E-G2130-4 storage array. We had discussed about this
> earlier in the list, see:


> Nov  2 18:34:02 vz-17 kernel: ping timeout of 5 secs expired, last rx 
> 408250499, last ping 408249467, now 408254467
> Nov  2 18:34:02 vz-17 kernel:  connection1:0: iscsi: detected conn error 
> (1011)
> Nov  2 18:34:03 vz-17 iscsid: Kernel reported iSCSI connection 1:0 error 
> (1011) state (3)
> Nov  2 18:34:07 vz-17 iscsid: connection1:0 is operational after recovery (1 
> attempts)
> Nov  2 18:34:52 vz-17 kernel: ping timeout of 5 secs expired, last rx


That looks vaguely familiar, although I think mine was reported as a 
nop-out timeout (possibly in another log file). Does it mostly happen when 
you do long sequential reads from the Infortrend unit? In my case it turned 
out to be a very low level of packet drops caused by a Cisco 2960G when 
'mls qos' was enabled (and, due to an IOS bug, the drop counter was not 
incremented). I'm not sure whether the loss with 'mls qos' enabled is by 
design as part of WRED, or a consequence of the port buffers being divided 
up into chunks smaller than optimal.

Having TCP window scaling enabled made the problem an order of magnitude 
worse. Try disabling it and see whether you still have the same problem. 
(I suggest something like dd if=/dev/sdc of=/dev/null bs=1048576 count=10 
to see if that triggers it, assuming it is the same problem I was 
suffering from.)
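
For reference, a minimal sketch of that test on a Linux initiator. It 
assumes the standard net.ipv4.tcp_window_scaling sysctl and that /dev/sdc 
is the iSCSI disk, as in the dd example above. The function names are made 
up for illustration and are not part of open-iscsi, and the log file 
location is an assumption that varies by distribution. Window scaling is 
negotiated when a TCP connection is opened, so the iSCSI session has to be 
re-established after changing the setting.

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative only.

# Print the current window scaling setting (1 = enabled, 0 = disabled).
# The optional argument overrides the /proc path, mainly for testing.
window_scaling_state() {
    cat "${1:-/proc/sys/net/ipv4/tcp_window_scaling}"
}

# Count "detected conn error" lines in a kernel log file, so the number
# can be compared before and after the read test.
count_conn_errors() {
    grep -c 'detected conn error' "$1"
}

# Disable window scaling, then run the sequential read. Needs root.
# The device defaults to /dev/sdc, the example device from this thread.
run_read_test() {
    dev="${1:-/dev/sdc}"
    sysctl -w net.ipv4.tcp_window_scaling=0 || return 1
    # Log out of and back in to the target here, so that the new TCP
    # connection is opened without window scaling.
    dd if="$dev" of=/dev/null bs=1048576 count=10
}
```

Calling count_conn_errors on the kernel log before and after run_read_test 
gives a rough idea of whether the read triggers the errors. Setting the 
sysctl back to 1 restores the default.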

Every other iSCSI target I've tried recovered pretty gracefully from this, 
but not the Infortrend. I suspect their TCP retransmit algorithm needs a 
lot of love, and that it's pathologically broken when window scaling is 
enabled.

Sadly, when I opened a ticket with Infortrend, enclosing tcpdumps and 
analysis, the only response was that they don't support Debian (despite 
having instructions for Debian iSCSI on their website), and that they 
don't support Western Digital drives in redundant controller 
configurations (it would have been nice of them to state this somewhere 
public).

Hope that might be somewhat useful to you. Please let me / the list know 
how you get on. There was sadly little information on this when I was 
tearing my hair out about it.

Best wishes
James

-- 
James Rice
Jump Networks Ltd
