Re: INFO: HARDIRQ-safe -> HARDIRQ-unsafe lock order detected

2012-08-10 Thread Fubo Chen
On Fri, Aug 10, 2012 at 2:58 AM, Michael Christie <micha...@cs.wisc.edu> wrote:

 On Aug 8, 2012, at 4:42 AM, Fubo Chen <fubo.c...@gmail.com> wrote:

 Anyone seen this before? It also occurs with 3.4.1.


 ======================================================
 [ INFO: HARDIRQ-safe -> HARDIRQ-unsafe lock order detected ]
 3.6.0-rc1-debug+ #1 Not tainted
 ------------------------------------------------------
 swapper/1/0 [HC0[0]:SC1[1]:HE0:SE0] is trying to acquire:
  (&(&session->lock)->rlock){+.-...}, at: [a025dc08]
 iscsi_eh_cmd_timed_out+0x58/0x2e0 [libiscsi]

 and this task is already holding:
  (&(&q->__queue_lock)->rlock){-.-...}, at: [811f6965]
 blk_rq_timed_out_timer+0x25/0x140
 which would create a new lock dependency:
  (&(&q->__queue_lock)->rlock){-.-...} -> (&(&session->lock)->rlock){+.-...}

 but this new dependency connects a HARDIRQ-irq-safe lock:
  (&(&q->__queue_lock)->rlock){-.-...}
 ... which became HARDIRQ-irq-safe at:
  [8109b5ca] __lock_acquire+0x7ea/0x1ba0
  [8109cfc2] lock_acquire+0x92/0x140
  [814b41c5] _raw_spin_lock_irqsave+0x65/0xb0
  [812e2974] blk_done+0x34/0x110
  [81295889] vring_interrupt+0x49/0xc0
  [810c68f5] handle_irq_event_percpu+0x75/0x270
  [810c6b38] handle_irq_event+0x48/0x70
  [810c9477] handle_edge_irq+0x77/0x110
  [81004042] handle_irq+0x22/0x40
  [814bda2a] do_IRQ+0x5a/0xe0
  [814b436f] ret_from_intr+0x0/0x1a
  [8100a7da] default_idle+0x4a/0x170
  [8100b609] cpu_idle+0xe9/0x130
  [814a4c6e] start_secondary+0x26a/0x26c


 Does this error only occur when using some sort of virt setup?

 I do not think we will hit this with iscsi, because we never grab the
 queue lock for an iscsi device from hard irq context. It is always done from
 softirq or thread context. The snippet above appears to come from the
 virtio_blk.c code.

Yes. This happened inside a KVM virtual machine.
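
For anyone else reading this in the archive: here is a minimal sketch of
the pattern lockdep is complaining about (illustrative stand-ins, not the
actual libiscsi or virtio_blk sources). The session lock is only ever
taken with spin_lock_bh(), which leaves hard interrupts enabled and so
makes it HARDIRQ-unsafe; the queue lock is taken from the virtio
completion interrupt and is therefore HARDIRQ-safe; the block timeout
path then takes the unsafe lock while holding the safe one:

    /* Illustrative sketch only -- stand-ins for session->lock and
     * q->__queue_lock; not the real libiscsi/virtio_blk code. */
    #include <linux/spinlock.h>
    #include <linux/interrupt.h>

    static DEFINE_SPINLOCK(session_lock);  /* HARDIRQ-unsafe */
    static DEFINE_SPINLOCK(queue_lock);    /* HARDIRQ-safe   */

    /* libiscsi style: softirq/thread context, hardirqs stay enabled */
    static void session_path(void)
    {
            spin_lock_bh(&session_lock);
            /* ... touch session state ... */
            spin_unlock_bh(&session_lock);
    }

    /* virtio_blk style: completion handled in hard-irq context */
    static irqreturn_t blk_done_irq(int irq, void *dev_id)
    {
            unsigned long flags;

            spin_lock_irqsave(&queue_lock, flags);
            /* ... complete requests ... */
            spin_unlock_irqrestore(&queue_lock, flags);
            return IRQ_HANDLED;
    }

    /* blk_rq_timed_out_timer -> iscsi_eh_cmd_timed_out style: takes
     * the HARDIRQ-unsafe lock while holding the HARDIRQ-safe one,
     * creating the queue_lock -> session_lock dependency reported. */
    static void timeout_path(void)
    {
            unsigned long flags;

            spin_lock_irqsave(&queue_lock, flags);
            spin_lock(&session_lock);
            /* ... */
            spin_unlock(&session_lock);
            spin_unlock_irqrestore(&queue_lock, flags);
    }

Once that dependency exists, an interrupt arriving on a CPU that holds
session_lock can spin on queue_lock forever, which is the deadlock
scenario lockdep prints.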

Fubo.




INFO: HARDIRQ-safe -> HARDIRQ-unsafe lock order detected

2012-08-08 Thread Fubo Chen
Anyone seen this before? It also occurs with 3.4.1.


==========================================================
[ INFO: HARDIRQ-safe -> HARDIRQ-unsafe lock order detected ]
3.6.0-rc1-debug+ #1 Not tainted
----------------------------------------------------------
swapper/1/0 [HC0[0]:SC1[1]:HE0:SE0] is trying to acquire:
 (&(&session->lock)->rlock){+.-...}, at: [a025dc08]
iscsi_eh_cmd_timed_out+0x58/0x2e0 [libiscsi]

and this task is already holding:
 (&(&q->__queue_lock)->rlock){-.-...}, at: [811f6965]
blk_rq_timed_out_timer+0x25/0x140
which would create a new lock dependency:
 (&(&q->__queue_lock)->rlock){-.-...} -> (&(&session->lock)->rlock){+.-...}

but this new dependency connects a HARDIRQ-irq-safe lock:
 (&(&q->__queue_lock)->rlock){-.-...}
... which became HARDIRQ-irq-safe at:
  [8109b5ca] __lock_acquire+0x7ea/0x1ba0
  [8109cfc2] lock_acquire+0x92/0x140
  [814b41c5] _raw_spin_lock_irqsave+0x65/0xb0
  [812e2974] blk_done+0x34/0x110
  [81295889] vring_interrupt+0x49/0xc0
  [810c68f5] handle_irq_event_percpu+0x75/0x270
  [810c6b38] handle_irq_event+0x48/0x70
  [810c9477] handle_edge_irq+0x77/0x110
  [81004042] handle_irq+0x22/0x40
  [814bda2a] do_IRQ+0x5a/0xe0
  [814b436f] ret_from_intr+0x0/0x1a
  [8100a7da] default_idle+0x4a/0x170
  [8100b609] cpu_idle+0xe9/0x130
  [814a4c6e] start_secondary+0x26a/0x26c

to a HARDIRQ-irq-unsafe lock:
 (&(&session->lock)->rlock){+.-...}
... which became HARDIRQ-irq-unsafe at:
...  [8109b3e5] __lock_acquire+0x605/0x1ba0
  [8109cfc2] lock_acquire+0x92/0x140
  [814b361b] _raw_spin_lock_bh+0x4b/0x80
  [a025b404] iscsi_conn_setup+0x154/0x210 [libiscsi]
  [a02322b4] iscsi_tcp_conn_setup+0x14/0x40 [libiscsi_tcp]
  [a026a0e9] iscsi_sw_tcp_conn_create+0x29/0x100 [iscsi_tcp]
  [a024de58] iscsi_if_rx+0xa48/0xf60 [scsi_transport_iscsi]
  [813e68ed] netlink_unicast+0x1ad/0x230
  [813e6c0b] netlink_sendmsg+0x29b/0x2f0
  [813ad04f] sock_sendmsg+0x9f/0xe0
  [813ae7ff] __sys_sendmsg+0x2df/0x2f0
  [813af979] sys_sendmsg+0x49/0x90
  [814bc6e9] system_call_fastpath+0x16/0x1b

other info that might help us debug this:

 Possible interrupt unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&(&session->lock)->rlock);
                               local_irq_disable();
                               lock(&(&q->__queue_lock)->rlock);
                               lock(&(&session->lock)->rlock);
  <Interrupt>
    lock(&(&q->__queue_lock)->rlock);

 *** DEADLOCK ***

2 locks held by swapper/1/0:
 #0:  (&q->timeout){+.-...}, at: [8104dc7f]
run_timer_softirq+0x12f/0x4c0
 #1:  (&(&q->__queue_lock)->rlock){-.-...}, at: [811f6965]
blk_rq_timed_out_timer+0x25/0x140

the dependencies between HARDIRQ-irq-safe lock and the holding lock:
-> (&(&q->__queue_lock)->rlock){-.-...} ops: 539229 {
   IN-HARDIRQ-W at:
[8109b5ca] __lock_acquire+0x7ea/0x1ba0
[8109cfc2] lock_acquire+0x92/0x140
[814b41c5] _raw_spin_lock_irqsave+0x65/0xb0
[812e2974] blk_done+0x34/0x110
[81295889] vring_interrupt+0x49/0xc0
[810c68f5] handle_irq_event_percpu+0x75/0x270
[810c6b38] handle_irq_event+0x48/0x70
[810c9477] handle_edge_irq+0x77/0x110
[81004042] handle_irq+0x22/0x40
[814bda2a] do_IRQ+0x5a/0xe0
[814b436f] ret_from_intr+0x0/0x1a
[8100a7da] default_idle+0x4a/0x170
[8100b609] cpu_idle+0xe9/0x130
[814a4c6e] start_secondary+0x26a/0x26c
   IN-SOFTIRQ-W at:
[8109b3b8] __lock_acquire+0x5d8/0x1ba0
[8109cfc2] lock_acquire+0x92/0x140
[814b41c5] _raw_spin_lock_irqsave+0x65/0xb0
[8120695e] cfq_idle_slice_timer+0x2e/0x110
[8104dcf8] run_timer_softirq+0x1a8/0x4c0
[81045f38] __do_softirq+0xd8/0x290
[814bd9bc] call_softirq+0x1c/0x26
[81004105] do_softirq+0xa5/0xe0
[8104642e] irq_exit+0xae/0xe0
[814bdb1e] smp_apic_timer_interrupt+0x6e/0x99
[814bd22f] apic_timer_interrupt+0x6f/0x80
[81296d34] vp_try_to_find_vqs+0x6e4/0x7b0
[81296fb2] vp_find_vqs+0x42/0xc0
[81325c73] init_vqs+0x83/0x110
[81326112] virtnet_probe+0x362/0x510
[812951a3] virtio_dev_probe+0xe3/0x160

Re: [PATCH 1/4] BNX2I: Added the use of kthreads to handle SCSI cmd completion

2011-06-22 Thread Fubo Chen
On Tue, Jun 21, 2011 at 6:49 PM, Eddie Wai <eddie@broadcom.com> wrote:
 +/**
 + * bnx2i_percpu_io_thread - thread per cpu for ios
 + *
 + * @arg:       ptr to bnx2i_percpu_info structure
 + */
 +int bnx2i_percpu_io_thread(void *arg)
 +{
 +       struct bnx2i_percpu_s *p = arg;
 +       struct bnx2i_work *work, *tmp;
 +       LIST_HEAD(work_list);
 +
 +       set_user_nice(current, -20);
 +
 +       set_current_state(TASK_INTERRUPTIBLE);
 +       while (!kthread_should_stop()) {
 +               schedule();
 +               spin_lock_bh(&p->p_work_lock);
 +               while (!list_empty(&p->work_list)) {
 +                       list_splice_init(&p->work_list, &work_list);
 +                       spin_unlock_bh(&p->p_work_lock);
 +
 +                       list_for_each_entry_safe(work, tmp, &work_list, list) {
 +                               list_del_init(&work->list);
 +                               /* work allocated in the bh, freed here */
 +                               bnx2i_process_scsi_cmd_resp(work->session,
 +                                                           work->bnx2i_conn,
 +                                                           work->cqe);
 +                               atomic_dec(&work->bnx2i_conn->work_cnt);
 +                               kfree(work);
 +                       }
 +                       spin_lock_bh(&p->p_work_lock);
 +               }
 +               set_current_state(TASK_INTERRUPTIBLE);
 +               spin_unlock_bh(&p->p_work_lock);
 +       }
 +       __set_current_state(TASK_RUNNING);
 +
 +       return 0;
 +}

This loop looks a little strange to me. If the schedule() call were
moved from the top of the outermost while loop to the bottom, the first
set_current_state(TASK_INTERRUPTIBLE) statement could be eliminated.
That would also fix the (theoretical?) race that occurs if
wake_up_process() is invoked after kthread_create() but before the
first set_current_state(TASK_INTERRUPTIBLE) statement has executed.
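
Concretely, an untested sketch of the reordering I mean (same bnx2i
identifiers as in the patch; assume the same locking rules hold):

    /* Untested sketch: TASK_INTERRUPTIBLE is set under p_work_lock
     * after the list is seen empty, and schedule() sits at the bottom
     * of the loop, so a wake_up_process() issued any time after
     * kthread_create() cannot be lost. */
    int bnx2i_percpu_io_thread(void *arg)
    {
            struct bnx2i_percpu_s *p = arg;
            struct bnx2i_work *work, *tmp;
            LIST_HEAD(work_list);

            set_user_nice(current, -20);

            while (!kthread_should_stop()) {
                    spin_lock_bh(&p->p_work_lock);
                    while (!list_empty(&p->work_list)) {
                            list_splice_init(&p->work_list, &work_list);
                            spin_unlock_bh(&p->p_work_lock);

                            list_for_each_entry_safe(work, tmp, &work_list, list) {
                                    list_del_init(&work->list);
                                    /* work allocated in the bh, freed here */
                                    bnx2i_process_scsi_cmd_resp(work->session,
                                                                work->bnx2i_conn,
                                                                work->cqe);
                                    atomic_dec(&work->bnx2i_conn->work_cnt);
                                    kfree(work);
                            }
                            spin_lock_bh(&p->p_work_lock);
                    }
                    set_current_state(TASK_INTERRUPTIBLE);
                    spin_unlock_bh(&p->p_work_lock);
                    schedule();     /* moved to the bottom of the loop */
            }
            __set_current_state(TASK_RUNNING);

            return 0;
    }

Because the task state is set before p_work_lock is dropped, a producer
that queues work and calls wake_up_process() between the unlock and the
schedule() call simply makes schedule() return immediately.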

Fubo.




Re: MD-RAID1 and iSCSI with multipathd: some experience

2011-01-03 Thread Fubo Chen
On Oct 14 2010, 1:45 pm, Ulrich Windl <ulrich.wi...@rz.uni-regensburg.de> wrote:
 I was investigating the status of building a RAID1 over iSCSI-connected 
 devices managed by multipathd (SLES10 SP3 Release Notes said it won't work). 
 Here are some of my findings:

 1) The multipath-devices cannot be opened exclusively by mdadm:
 # mdadm --verbose --create /dev/md0 --raid-devices=2 --level=raid1 
 --bitmap=internal /dev/disk/by-id/scsi-3600508b4001085dd00011226 
 /dev/disk/by-id/scsi-3600508b4001085dd00011229
 mdadm: Cannot open /dev/disk/by-id/scsi-3600508b4001085dd00011226: 
 Device or resource busy
 mdadm: Cannot open /dev/disk/by-id/scsi-3600508b4001085dd00011229: 
 Device or resource busy
 mdadm: create aborted

 open("/dev/disk/by-id/scsi-3600508b4001085dd00011226",
 O_RDONLY|O_EXCL) = -1 EBUSY (Device or resource busy)

 2) The device-mapper files do not seem to be SCSI devices:
 # mdadm --verbose --create /dev/md0 --raid-devices=2 --level=raid1 
 --bitmap=internal /dev/dm-18 /dev/dm-19
 mdadm: /dev/dm-18 is too small: 0K
 mdadm: create aborted
 rkdvmso1:~ # sdparm -a /dev/dm-18
 unable to access /dev/dm-18, ATA disk?

 3) The iSCSI devices are SCSI-devices, but are busy:
 # sdparm -a /dev/sdax
     /dev/sdax: HP        HSV200            5000
 Read write error recovery mode page:
   AWRE        1  [cha: n, def:  1]
   ARRE        1  [cha: n, def:  1]
   TB          1  [cha: n, def:  1]
   RC          0  [cha: n, def:  0]
 [...]
 # mdadm --verbose --create /dev/md0 --raid-devices=2 --level=raid1 
 --bitmap=internal /dev/sdax /dev/sdbo
 mdadm: Cannot open /dev/sdax: Device or resource busy
 mdadm: Cannot open /dev/sdbo: Device or resource busy
 mdadm: create aborted

 I'm not a specialist on mdadm, so if I did something wrong, please
 tell me.

Hi,

I have been looking at a related but not identical question: replicating
a local disk to another server via iSCSI and md mirroring (RAID1, no
multipath). While setting that up I noticed that open-iscsi times out
SCSI commands if the network goes away for long enough. Why does the
open-iscsi initiator make SCSI commands fail instead of reporting a
disk-removal event?

$ sg_inq /dev/disk/by-path/ip-192.168.3.114\:3260-iscsi-...:tgt-lun-0
| grep RMB
  PQual=0  Device_type=0  RMB=0  version=0x05  [SPC-3]

Fubo.




Re: MD-RAID1 and iSCSI with multipathd: some experience

2011-01-03 Thread Fubo Chen
On October 14, Ulrich Windl wrote:
 I was investigating the status of building a RAID1 over iSCSI-
 connected devices managed by multipathd (SLES10 SP3 Release Notes said
 it won't work). Here are some of my findings:

 1) The multipath-devices cannot be opened exclusively by mdadm:
 # mdadm --verbose --create /dev/md0 --raid-devices=2 --level=raid1 --
 bitmap=internal /dev/disk/by-id/
 scsi-3600508b4001085dd00011226 /dev/disk/by-id/
 scsi-3600508b4001085dd00011229
 mdadm: Cannot open /dev/disk/by-id/
 scsi-3600508b4001085dd00011226: Device or resource busy
 mdadm: Cannot open /dev/disk/by-id/
 scsi-3600508b4001085dd00011229: Device or resource busy
 mdadm: create aborted

 open("/dev/disk/by-id/scsi-3600508b4001085dd00011226",
 O_RDONLY|O_EXCL) = -1 EBUSY (Device or resource busy)

 2) The device-mapper files do not seem to be SCSI devices:
 # mdadm --verbose --create /dev/md0 --raid-devices=2 --level=raid1 --
 bitmap=internal /dev/dm-18 /dev/dm-19
 mdadm: /dev/dm-18 is too small: 0K
 mdadm: create aborted
 rkdvmso1:~ # sdparm -a /dev/dm-18
 unable to access /dev/dm-18, ATA disk?

 3) The iSCSI devices are SCSI-devices, but are busy:
 # sdparm -a /dev/sdax
 /dev/sdax: HP        HSV200            5000
 Read write error recovery mode page:
   AWRE        1  [cha: n, def:  1]
   ARRE        1  [cha: n, def:  1]
   TB          1  [cha: n, def:  1]
   RC          0  [cha: n, def:  0]
 [...]
 # mdadm --verbose --create /dev/md0 --raid-devices=2 --level=raid1 --
 bitmap=internal /dev/sdax /dev/sdbo
 mdadm: Cannot open /dev/sdax: Device or resource busy
 mdadm: Cannot open /dev/sdbo: Device or resource busy
 mdadm: create aborted

 I'm not a specialist on mdadm, so if I did something wrong, please
 tell me.

Hi,

I have been looking at a related but not identical problem: using md to
replicate a local disk to a remote server via iSCSI mirroring (RAID1).
I noticed that iSCSI commands fail if a network outage lasts longer
than the iSCSI command timeout. I also noticed that the block device
created by open-iscsi is marked as non-removable (RMB=0). Why does
open-iscsi behave this way, and why does it not report a disk-removal
event when the network connection fails?

# mdadm --query --detail /dev/md4 | tail -n 3
    Number   Major   Minor   RaidDevice State
       0       8       32        0      active sync   /dev/sdc
       1       8       64        1      active sync   /dev/sde

# sg_inq /dev/disk/by-path/ip-192.168.3.114\:3260-iscsi-iqn\:tgt-lun-0
| grep RMB
  PQual=0  Device_type=0  RMB=0  version=0x05  [SPC-3]
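
As a side note, the RMB flag that sg_inq prints is just bit 7 of byte 1
of the standard INQUIRY data (SPC-3). A hypothetical helper (not part
of sg3_utils) for checking it from a raw INQUIRY buffer would be:

    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>

    /* Per SPC-3, the RMB (removable medium) bit is bit 7 of byte 1 of
     * the standard INQUIRY data; the open-iscsi LUN above reports 0. */
    static bool inquiry_is_removable(const uint8_t *inq, size_t len)
    {
            return len >= 2 && (inq[1] & 0x80) != 0;
    }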

Fubo.
