Hi Hannes,

Would you be able to send me a unified patch containing the changes included in 
the test kernels so I can rebuild the drivers with them and update you today?

For completeness, we are not running SLES, but rather the Citrix XenServer 5.6 
release, which is based on the Linux 2.6.27 tree of SLES. Also, for this 
specific controller we don't enable MPIO, but for most other arrays we do.

Thanks,
 -Goncalo.

-----Original Message-----
From: Hannes Reinecke [mailto:h...@suse.de] 
Sent: 06 August 2010 15:58
To: Mike Christie
Cc: open-iscsi@googlegroups.com; Goncalo Gomes
Subject: Re: detected conn error (1011)

Mike Christie wrote:
> ccing Hannes from suse, because this looks like a SLES only bug.
> 
> Hey Hannes,
> 
> The user is using Linux 2.6.27 x86 based on SLES + Xen 3.4 (as dom0)
> running a couple of RHEL 5.5 VMs. The underlying storage for these VMs
> is iSCSI based via open-iscsi 2.0.870-26.6.1 and a DELL equallogic array.
> 
> 
> On 08/05/2010 02:21 PM, Goncalo Gomes wrote:
>> I've copied both the messages file from the host goncalog140 and the
>> patched libiscsi.c. FWIW, I've also included the iscsid.conf. Find these
>> files in the link below:
>>
>> http://promisc.org/iscsi/
>>
> 
> It looks like this chunk from libiscsi.c:iscsi_queuecommand:
> 
>         case ISCSI_STATE_FAILED:
>             reason = FAILURE_SESSION_FAILED;
>             sc->result = DID_TRANSPORT_DISRUPTED << 16;
>             break;
> 
> is causing IO errors.
> 
> You want to use something like DID_IMM_RETRY because it can be a long
> time between the time the kernel marks the state as ISCSI_STATE_FAILED
> until we start recovery and properly get all the device queues blocked,
> so we can exhaust all the retries if we use DID_TRANSPORT_DISRUPTED.
Yeah, I noticed.
But the problem is that multipathing will stall during this time,
i.e. no failover will occur and I/O will stall. Using DID_TRANSPORT_DISRUPTED
circumvents this, so we can fail over immediately.
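
To make the trade-off concrete, here is a toy model, not kernel code. All
names (toy_host_byte, toy_cmd_survives) and the numbers in the usage note are
made up for illustration. It assumes a result like DID_IMM_RETRY requeues the
command without charging its retry budget, while a result like
DID_TRANSPORT_DISRUPTED charges one retry per attempt, and it counts how many
attempts land in the window between the session being marked
ISCSI_STATE_FAILED and the queues actually being blocked.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical result codes for this toy model; not the kernel's values. */
enum toy_host_byte { TOY_IMM_RETRY, TOY_TRANSPORT_DISRUPTED };

/*
 * Returns true if a command survives until recovery blocks the queue.
 *
 * failed_window: number of times the command is resubmitted while the
 *                session is already FAILED but the queue is not yet blocked.
 * allowed:       retry budget of the command.
 */
bool toy_cmd_survives(enum toy_host_byte code, int failed_window, int allowed)
{
    int retries = 0;

    assert(failed_window >= 0 && allowed >= 0);

    for (int i = 0; i < failed_window; i++) {
        if (code == TOY_IMM_RETRY)
            continue;           /* requeued, retry budget untouched */
        if (++retries > allowed)
            return false;       /* budget exhausted: I/O error goes up */
    }
    return true;
}
```

With a budget of 5 and a window of 100 attempts, the TOY_IMM_RETRY case
survives and the TOY_TRANSPORT_DISRUPTED case does not. The flip side is what
I described above: in the second case the error surfaces quickly, which is
exactly what multipath needs in order to fail over.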

Sadly, I got additional bug reports about this, so I think I'll have
to revert it.

I have put some test kernels at

http://beta.suse.com/private/hare/sles11/iscsi

Can you test with them and check if this issue is solved?

Thanks.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                   zSeries & Storage
h...@suse.de                          +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Markus Rex, HRB 16746 (AG Nürnberg)
