Thanks for the patch, Mike. Below is the output from a failure when
running with the patch applied. Any thoughts?

[<f8bf0876>] iscsi_conn_failure+0x10/0x69 [libiscsi]

[<f9bf202d>] iscsi_eh_abort+0x2f1/0x406 [libiscsi]

[<f885d378>] __scsi_try_to_abort_cmd+0x19/0x1a [scsi_mod]

[<f885e85d>] scsi_error_handler+0x24d/0x422 [scsi_mod]

[<c041f7ea>] complete+0x2b/0x3d

[<f885e610>] scsi_error_handler+0x0/0x422 [scsi_mod]

[<c0435f65>] kthread+0xc0/0xeb

[<c0435ea5>] kthread+0x0/0xeb

[<c0405c3b>] kernel_thread_helper+0x7/0x10

=======================

connection1:0 detected conn error (1011)

[<f8bf0876>] iscsi_conn_failure+0x10/0x69 [libiscsi]

[<f8bf22fc>] iscsi_eh_target_reset+0xbb/0x218 [libiscsi]

[<c0605967>] _spin_lock_bh+0x8/0x18

[<f8bf0f78>] iscsi_eh_device_reset+0x1c5/0x1cf [libiscsi]

[<c054a6dd>] get_device+0xe/0x14

[<f885d764>] scsi_try_host_reset+0x3a/0x99 [scsi_mod]

[<f885e0e3>] scsi_eh_ready_devs+0x302/0x3e2 [scsi_mod]

[<f885e8dd>] scsi_error_handler+0x2cd/0x422 [scsi_mod]

[<c041f7ea>] complete+0x2b/0x3d

[<f885e610>] scsi_error_handler+0x0/0x422 [scsi_mod]

[<c0435f65>] kthread+0xc0/0xeb

[<c0435ea5>] kthread+0x0/0xeb

[<c0405c3b>] kernel_thread_helper+0x7/0x10

=======================

session1: session recovery timed out after 400 secs

sd 0:0:0:0: scsi: Device offlined - no ready after error recovery

sd 0:0:0:0: scsi: Device offlined - no ready after error recovery

sd 0:0:0:0: scsi: Device offlined - no ready after error recovery

sd 0:0:0:0: scsi: Device offlined - no ready after error recovery

sd 0:0:0:0: scsi: Device offlined - no ready after error recovery

sd 0:0:0:0: scsi: Device offlined - no ready after error recovery

sd 0:0:0:0: scsi: Device offlined - no ready after error recovery

sd 0:0:0:0: scsi: Device offlined - no ready after error recovery

sd 0:0:0:0: scsi: Device offlined - no ready after error recovery

sd 0:0:0:0: scsi: Device offlined - no ready after error recovery

sd 0:0:0:0: scsi: Device offlined - no ready after error recovery

sd 0:0:0:0: SCSI error: return code = 0x00020000

end_request: I/O error, dev sda, sector 14283149
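
To make it easier to see which code path raised the error in each trace, here is a minimal sketch that pulls out the frame printed right after iscsi_conn_failure. It is not part of the patch; the helper name callers_of and the trimmed SAMPLE text are made up for illustration, and it assumes the frame after a symbol is its caller, which is not guaranteed because these traces can contain stale stack entries.

```python
import re

# Matches dmesg stack-trace entries such as
# "[<f9bf202d>] iscsi_eh_abort+0x2f1/0x406 [libiscsi]"
FRAME_RE = re.compile(r"\[<[0-9a-f]+>\]\s+(\w+)\+0x[0-9a-f]+/0x[0-9a-f]+")


def callers_of(symbol, log_text):
    """Return the function named in the frame following each `symbol` frame.

    Assumption: the next frame is the caller. Stale entries on the stack
    can break this, so treat the result as a hint only.
    """
    frames = [m.group(1) for m in FRAME_RE.finditer(log_text)]
    return [frames[i + 1] for i, name in enumerate(frames[:-1]) if name == symbol]


# Trimmed copy of the first few frames of each trace above.
SAMPLE = """\
[<f8bf0876>] iscsi_conn_failure+0x10/0x69 [libiscsi]
[<f9bf202d>] iscsi_eh_abort+0x2f1/0x406 [libiscsi]
[<f885d378>] __scsi_try_to_abort_cmd+0x19/0x1a [scsi_mod]
=======================
connection1:0 detected conn error (1011)
[<f8bf0876>] iscsi_conn_failure+0x10/0x69 [libiscsi]
[<f8bf22fc>] iscsi_eh_target_reset+0xbb/0x218 [libiscsi]
[<c0605967>] _spin_lock_bh+0x8/0x18
"""

if __name__ == "__main__":
    for caller in callers_of("iscsi_conn_failure", SAMPLE):
        print(caller)
```

On the output above, this points at iscsi_eh_abort for the first trace and iscsi_eh_target_reset for the second, so both errors appear to come from the SCSI error handler rather than from the network layer.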

On Jul 13, 10:34 pm, Mike Christie <micha...@cs.wisc.edu> wrote:
> Could you run with the attached patch? It just prints out a little more
> info. When we get the conn error, it will print out a message if it is
> due to the target dropping the connection and it will print out stack
> trace so we can see exactly what piece of code is throwing the error.
>
> On 07/13/2010 09:33 PM, Sean S wrote:
>
> > Nothing else in the log from iscsid. No mention of a failed reconnect,
> > although the only log I'm really able to access post failure is dmesg.
> > Since I'm running a root iscsi, I couldn't get to /var/log/messages
> > which maybe was a little more verbose? What sort of network problems
>
> Yeah, by default the iscsid messages go there. iscsid should be spitting
> out a cannot connect $some_error_value_or_string that would help tell us
> why we cannot reach the target anymore.
>
> > might cause this? The "network" in this situation is a simple gigE
> > switch with about 3 or 4 systems on it. The target and initiator are
> > on the same subnet, nothing fancy. Is there some additional debug
> > you'd recommend turning on? Any tips or tricks when running with a
> > root iscsi drive?
>
> Not that I can think of at the iscsi layer.
>
>
>
> > Curiously, if I physically disconnect the ethernet from the initiator
> > while running, all I/O access is correctly paused without returning I/O
> > errors. If I then reconnect before the 400s is up, things go back to
> > normal. However, I don't see the "detected conn error (1011)" message
> > in this situation. Not sure if that really means anything.
>
> You should see the conn error 1011 message if
>
> 1. you have nops on and they time out, which causes us to log that error.
>
> 2. the network layer figures out there is a problem and notifies us. It
> is possible that you pull a cable and plug it back in before the network
> throws an error.
>
> 3. iscsi driver or protocol error. In this case we should relogin quickly.
>
>  trace-conn-error.patch
