I haven't dug into this yet, so this is more of a "have you seen this too?" call.

When I reboot the machine without logging out of the iSCSI targets, the
reboot sequence hangs. This is with 869-rc4 userspace on a SLES 10 SP2 Beta
kernel, with the 869-rc4 kernel modules compiled out of tree. (With the stock
SLES 10 SP2 Beta kernel, which has a back-port of 868-rc1, I get the same bug.)

I enabled debugging in the kernel (DEBUG_SCSI) and added a dump_stack() to
iscsi_check_transport_timeouts, and this is what I get:

iscsi: Sending nopout as ping on conn ffff88007a0b8a50
iscsi: Setting next tmo 4294974247
iscsi: mtask deq [cid 0 itt 0xa06]
iscsi: mgmtpdu [op 0x0 hdr->itt 0xa06 datalen 0]
Sending SIGKILL to all processes.
Please stand by while rebooting the system.
md: stopping all md devices.
Synchronizing SCSI cache for disk sdl: 
iscsi: ctask deq [cid 0 itt 0x2e imm 0 unsol 0]
iscsi: iscsi prep [read cid 0 sc ffff88007be9f500 cdb 0x35 itt 0x2e len 0 cmdsn 47 win 723]
iscsi: Sending nopout as ping on conn ffff88007b192250
iscsi: Setting next tmo 4294976496
iscsi: mtask deq [cid 0 itt 0xa07]
iscsi: mgmtpdu [op 0x0 hdr->itt 0xa07 datalen 0]
 connection2:0: ping timeout of 15 secs expired, last rx 4294967997, last ping 4294970497, now 4294974247
iscsi: can not broadcast skb (-3)
 connection2:0: detected conn error (1011)

Call Trace: <IRQ> <ffffffff8824e629>{:libiscsi:iscsi_check_transport_timeouts+0}
       <ffffffff8824e6e7>{:libiscsi:iscsi_check_transport_timeouts+190}
       <ffffffff88007192>{:usbcore:rh_timer_func+0} <ffffffff80137d81>{run_timer_softirq+370}
       <ffffffff80133262>{__do_softirq+124} <ffffffff8010af42>{call_softirq+30}
       <ffffffff8010beb5>{do_softirq+81} <ffffffff801333d7>{irq_exit+72}
       <ffffffff8010c089>{do_IRQ+107} <ffffffff80247347>{evtchn_do_upcall+327}
       <ffffffff8010aa76>{do_hypervisor_callback+30} <EOI>
       <ffffffff801063aa>{hypercall_page+938} <ffffffff801063aa>{hypercall_page+938}
       <ffffffff8010d8af>{safe_halt+176} <ffffffff801092ab>{xen_idle+112}
       <ffffffff80108fc4>{cpu_idle+173} <ffffffff8024ab5f>{cpu_bringup_and_idle+14}
 connection1:0: ping timeout of 15 secs expired, last rx 4294970246, last ping 4294973996, now 4294976496
iscsi: can not broadcast skb (-3)
 connection1:0: detected conn error (1011)

Call Trace: <IRQ> <ffffffff8824e629>{:libiscsi:iscsi_check_transport_timeouts+0}
       <ffffffff8824e6e7>{:libiscsi:iscsi_check_transport_timeouts+190}
       <ffffffff80137d81>{run_timer_softirq+370} <ffffffff80133262>{__do_softirq+124}
       <ffffffff8010af42>{call_softirq+30} <ffffffff8010beb5>{do_softirq+81}
       <ffffffff801333d7>{irq_exit+72} <ffffffff8010c089>{do_IRQ+107}
       <ffffffff80247347>{evtchn_do_upcall+327} <ffffffff8010aa76>{do_hypervisor_callback+30} <EOI>
       <ffffffff801063aa>{hypercall_page+938} <ffffffff801063aa>{hypercall_page+938}
       <ffffffff8010d8af>{safe_halt+176} <ffffffff801092ab>{xen_idle+112}
       <ffffffff80108fc4>{cpu_idle+173} <ffffffff8024ab5f>{cpu_bringup_and_idle+14}
iscsi: LU Reset [sc ffff88007be9f500 lun 14]
iscsi: iscsi_eh_device_reset FAILED
iscsi: iscsi_eh_host_reset wait for relogin
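
As a rough sanity check on the timestamps in the two "ping timeout" lines, the sketch below computes how many jiffies passed between "last rx" and "now" on each connection. The jiffies values are copied from the log; HZ = 250 is an assumption (the tick rate is not shown in the log), and the helper name since_last_rx is made up for this example.

```python
# Jiffies values copied from the two "ping timeout" lines in the log.
events = {
    "connection2:0": {"last_rx": 4294967997, "last_ping": 4294970497, "now": 4294974247},
    "connection1:0": {"last_rx": 4294970246, "last_ping": 4294973996, "now": 4294976496},
}

HZ = 250  # ASSUMPTION: kernel tick rate; the log does not state it


def since_last_rx(ev):
    """Jiffies elapsed between the last received PDU and the timeout firing."""
    return ev["now"] - ev["last_rx"]


for name, ev in events.items():
    delta = since_last_rx(ev)
    print(f"{name}: {delta} jiffies since last rx = {delta / HZ:.1f} s at HZ={HZ}")
```

Both connections come out at 6250 jiffies, which would be 25 seconds if HZ really is 250: the 15 second ping timeout from the message plus another 10 seconds, possibly the nop-out interval, though that is only an inference. The "now" values also match the "Setting next tmo" values logged earlier (4294974247 and 4294976496), so the timer seems to fire exactly when it was scheduled to.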


I haven't dug into this yet, but I was curious whether anybody else has seen
it. I vaguely remember seeing this on RHEL5.1 Beta, but I think it was fixed
there.

It looks as if the 'iscsi_check_transport_timeouts' timer function never gets
removed from the timer list during shutdown, and so keeps on firing.
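
To illustrate that suspicion, here is a toy model of a self-re-arming timer. It is not the libiscsi code: ToyTimerList, make_check, simulate and the 25 second interval are all hypothetical names and values chosen for the sketch. It only shows the general pattern, where a callback that re-arms itself keeps running until something explicitly deletes it.

```python
import heapq

INTERVAL = 25  # ASSUMPTION: seconds between checks in this toy model


class ToyTimerList:
    """Minimal stand-in for a kernel timer list (hypothetical, not a real API)."""

    def __init__(self):
        self.now = 0
        self._pending = []  # heap of (expires, seq, name)
        self._armed = {}    # name -> (seq, callback) for timers still armed
        self._seq = 0

    def arm(self, name, expires, callback):
        """Arm (or re-arm) the named timer to fire at 'expires'."""
        self._seq += 1
        self._armed[name] = (self._seq, callback)
        heapq.heappush(self._pending, (expires, self._seq, name))

    def delete(self, name):
        """Remove the named timer so it never fires again."""
        self._armed.pop(name, None)

    def run_until(self, deadline):
        """Fire every armed timer that expires at or before 'deadline'."""
        fired = 0
        while self._pending and self._pending[0][0] <= deadline:
            expires, seq, name = heapq.heappop(self._pending)
            armed = self._armed.get(name)
            if armed is None or armed[0] != seq:
                continue  # deleted or superseded before it expired
            del self._armed[name]
            self.now = expires
            armed[1](self)
            fired += 1
        self.now = max(self.now, deadline)
        return fired


def make_check(name, log):
    """Build a timeout check that logs its run and then re-arms itself."""
    def check(timers):
        log.append((timers.now, name))
        timers.arm(name, timers.now + INTERVAL, check)
    return check


def simulate(delete_on_shutdown):
    """Return how many times the check fires after 'shutdown' begins at t=60."""
    timers = ToyTimerList()
    log = []
    timers.arm("conn1", INTERVAL, make_check("conn1", log))
    timers.run_until(60)            # normal operation
    if delete_on_shutdown:
        timers.delete("conn1")      # the teardown step suspected to be missing
    before = len(log)
    timers.run_until(160)           # shutdown phase
    return len(log) - before


print("fires during shutdown, timer left armed:", simulate(False))
print("fires during shutdown, timer deleted:   ", simulate(True))
```

In this model the check keeps firing throughout the shutdown phase when the timer is left armed, and stops as soon as it is deleted, which is the behaviour the log above seems to show for the real timer.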
