Commit 659743b02c411075b26601725947b21df0bb29c8 [SCSI] libiscsi: Reduce locking contention in fast path introduced a locking regression.
According to the comment for iscsi_complete_task back_lock must be held while calling it: /* * iscsi_complete_task - finish a task * @task: iscsi cmd task * @state: state to complete task with * * Must be called with session back_lock. * / However, at three locations in libiscsi.c iscsi_complete_task is called without holding back_lock. This causes a race condition when processing certain iSCSI responses leading to list corruption. This causes iSCSI traffic to stall. A production box here at Kapsi with 44 active iSCSI paths was crashing because of this about five times a day at worst. Fixed by aquiring back_lock before calling iscsi_complete_task at the remaining spots. Please check if spin_lock_bh or spin_lock is more appropriate. Fixes: 659743b02c41 ("[SCSI] libiscsi: Reduce locking contention in fast path") Signed-off-by: Ilkka Sovanto <il...@kapsi.fi> Tested-by: Ilkka Sovanto <il...@kapsi.fi> Cc: sta...@vger.kernel.org --- ------------[ cut here ]------------ WARNING: CPU: 8 PID: 155 at lib/list_debug.c:62 __list_del_entry+0xc3/0xd0() list_del corruption. next->prev should be ffff880feaff3080, but was ffff880ff7eddac8 Modules linked in: CPU: 8 PID: 155 Comm: kworker/u56:4 Not tainted 4.4.48-kapsi #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5.1-0-g8936dbb-20141113_115728-nilsson.home.kraxel.org 04/01/2014 Workqueue: iscsi_q_2 iscsi_xmitworker 0000000000000286 00000000764ebb7a ffff880ff8383d20 ffffffff81589063 ffff880ff8383d68 ffffffff820ff7b7 ffff880ff8383d58 ffffffff81090b96 ffff880ff7eddad8 ffff880ff7edda10 ffff880ff7eddab8 ffff880ff7eddaa8 Call Trace: [<ffffffff81589063>] dump_stack+0x63/0x90 [<ffffffff81090b96>] warn_slowpath_common+0x86/0xc0 [<ffffffff81090c2c>] warn_slowpath_fmt+0x5c/0x80 [<ffffffff815a4983>] __list_del_entry+0xc3/0xd0 [<ffffffff817128a6>] iscsi_xmitworker+0x186/0x2d0 [<ffffffff810a7e29>] process_one_work+0x149/0x3e0 [<ffffffff810a8129>] worker_thread+0x69/0x470 [<ffffffff810a80c0>] ? process_one_work+0x3e0/0x3e0 [<ffffffff810ad72a>] kthread+0xea/0x100 [<ffffffff810ad640>] ? kthread_create_on_node+0x1a0/0x1a0 [<ffffffff81c3c90f>] ret_from_fork+0x3f/0x70 [<ffffffff810ad640>] ? kthread_create_on_node+0x1a0/0x1a0 ---[ end trace 3b6d957784e71ffe ]--- ------------[ cut here ]------------ WARNING: CPU: 12 PID: 13505 at lib/list_debug.c:33 __list_add+0x8e/0xc0() list_add corruption. prev->next should be next (ffff880ff7eddab8), but was ffff880feaff3080. (prev=ffff880feaff3080). Modules linked in: CPU: 12 PID: 13505 Comm: nfsd Tainted: G W 4.4.48-kapsi #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5.1-0-g8936dbb-20141113_115728-nilsson.home.kraxel.org 04/01/2014 0000000000000286 000000005c16edac ffff880fb6167000 ffffffff81589063 ffff880fb6167048 ffffffff820ff7b7 ffff880fb6167038 ffffffff81090b96 ffff880feaff3080 ffff880ff7eddab8 ffff880feaff3080 ffff880ff7ede800 Call Trace: [<ffffffff81589063>] dump_stack+0x63/0x90 [<ffffffff81090b96>] warn_slowpath_common+0x86/0xc0 [<ffffffff81090c2c>] warn_slowpath_fmt+0x5c/0x80 [<ffffffff8159dc59>] ? kfifo_copy_out.isra.5+0x59/0x70 [<ffffffff815a488e>] __list_add+0x8e/0xc0 [<ffffffff81712658>] iscsi_queuecommand+0x3a8/0x470 [<ffffffff816e62e6>] scsi_dispatch_cmd+0xb6/0x240 [<ffffffff816e7a35>] scsi_queue_rq+0x5c5/0x6c0 [<ffffffff8156ad37>] __blk_mq_run_hw_queue+0x237/0x380 [<ffffffff8156aaf7>] blk_mq_run_hw_queue+0x77/0x80 [<ffffffff8156c5d3>] blk_mq_insert_request+0xa3/0xc0 ... ---[ end trace 3b6d957784e71fff ]--- ------------[ cut here ]------------ WARNING: CPU: 12 PID: 13505 at lib/list_debug.c:36 __list_add+0xb3/0xc0() list_add double add: new=ffff880feaff3080, prev=ffff880feaff3080, next=ffff880ff7eddab8. Modules linked in: CPU: 12 PID: 13505 Comm: nfsd Tainted: G W 4.4.48-kapsi #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5.1-0-g8936dbb-20141113_115728-nilsson.home.kraxel.org 04/01/2014 0000000000000286 000000005c16edac ffff880fb6167000 ffffffff81589063 ffff880fb6167048 ffffffff820ff7b7 ffff880fb6167038 ffffffff81090b96 ffff880feaff3080 ffff880ff7eddab8 ffff880feaff3080 ffff880ff7ede800 Call Trace: [<ffffffff81589063>] dump_stack+0x63/0x90 [<ffffffff81090b96>] warn_slowpath_common+0x86/0xc0 [<ffffffff81090c2c>] warn_slowpath_fmt+0x5c/0x80 [<ffffffff8159dc59>] ? kfifo_copy_out.isra.5+0x59/0x70 [<ffffffff815a48b3>] __list_add+0xb3/0xc0 [<ffffffff81712658>] iscsi_queuecommand+0x3a8/0x470 [<ffffffff816e62e6>] scsi_dispatch_cmd+0xb6/0x240 [<ffffffff816e7a35>] scsi_queue_rq+0x5c5/0x6c0 [<ffffffff8156ad37>] __blk_mq_run_hw_queue+0x237/0x380 [<ffffffff8156aaf7>] blk_mq_run_hw_queue+0x77/0x80 [<ffffffff8156c5d3>] blk_mq_insert_request+0xa3/0xc0 ... [<ffffffff810ad640>] ? kthread_create_on_node+0x1a0/0x1a0 ---[ end trace 3b6d957784e72000 ]--- ... diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c --- a/drivers/scsi/libiscsi.c +++ b/drivers/scsi/libiscsi.c @@ -1747,7 +1747,9 @@ int iscsi_queuecommand(struct Scsi_Host *host, struct scsi_cmnd *sc) return 0; prepd_reject: + spin_lock_bh(&session->back_lock); iscsi_complete_task(task, ISCSI_TASK_REQUEUE_SCSIQ); + spin_unlock_bh(&session->back_lock); reject: spin_unlock_bh(&session->frwd_lock); ISCSI_DBG_SESSION(session, "cmd 0x%x rejected (%d)\n", @@ -1755,7 +1757,9 @@ reject: return SCSI_MLQUEUE_TARGET_BUSY; prepd_fault: + spin_lock_bh(&session->back_lock); iscsi_complete_task(task, ISCSI_TASK_REQUEUE_SCSIQ); + spin_lock_bh(&session->back_lock); fault: spin_unlock_bh(&session->frwd_lock); ISCSI_DBG_SESSION(session, "iscsi: cmd 0x%x is not queued (%d)\n", @@ -3068,7 +3072,9 @@ fail_mgmt_tasks(struct iscsi_session *session, struct iscsi_conn *conn) state = ISCSI_TASK_ABRT_SESS_RECOV; if (task->state == ISCSI_TASK_PENDING) state = ISCSI_TASK_COMPLETED; + spin_lock_bh(&session->back_lock); iscsi_complete_task(task, state); + spin_unlock_bh(&session->back_lock); } } -- You received this message because you are subscribed to the Google Groups "open-iscsi" group. To unsubscribe from this group and stop receiving emails from it, send an email to open-iscsi+unsubscr...@googlegroups.com. To post to this group, send email to open-iscsi@googlegroups.com. Visit this group at https://groups.google.com/group/open-iscsi. For more options, visit https://groups.google.com/d/optout.