Re: [PATCH] scsi_sysfs: fix list entry double-deletion

2018-01-16 Thread Bart Van Assche
On Tue, 2018-01-16 at 14:10 +0800, zhenwei.pi wrote:
> CPU: 15 PID: 23529 Comm: lvm Tainted: G  D W   E 4.14.11 #1

Please retest with kernel v4.15-rc6 or later. That kernel includes commit
81b6c9998979 ("scsi: core: check for device state in __scsi_remove_target()").

Thanks,

Bart.


[PATCH] scsi_sysfs: fix list entry double-deletion

2018-01-15 Thread zhenwei.pi
While testing iSCSI performance over an unstable network, the kernel crashed.
Here are two different call traces, but they share the same RIP and RCX.
1>
CPU: 15 PID: 23529 Comm: lvm Tainted: G  D W   E 4.14.11 #1
task: 91a382b72e00 task.stack: b3928ade8000
RIP: 0010:scsi_device_dev_release_usercontext+0x58/0x200
RSP: 0018:b3928adebb48 EFLAGS: 00010046
RAX: 0246 RBX: 91ad99f95738 RCX: dead0100
RDX: dead0200 RSI: dead0100 RDI: 91a3a2e5a030
RBP: 91ad99f95138 R08: 0101 R09: 000181fe
R10: b3928adebb10 R11:  R12: 91ad99f95000
R13: 91ac2614b028 R14: 91328320 R15: 91acef5e3e98
FS: 7f1ca63c9840() GS:91ad9f3c() knlGS:
CS: 0010 DS:  ES:  CR0: 80050033
CR2: 00c42027d000 CR3: 000cb5e06002 CR4: 003626e0
DR0: DR1: DR2:
DR3: DR6: fffe0ff0 DR7: 0400
Call Trace:
table_load+0x360/0x360
execute_in_process_context+0x58/0x60
device_release+0x2d/0x80
kobject_put+0x7f/0x1a0
scsi_disk_put+0x2b/0x40
__blkdev_put+0x19e/0x1f0
table_load+0x360/0x360
disk_flush_events+0x24/0x60
table_load+0x360/0x360
dm_put_table_device+0x51/0xb0
dm_put_device+0x75/0xb0
table_load+0x360/0x360
linear_dtr+0x12/0x20
dm_table_destroy+0x66/0x110
table_load+0x360/0x360
dev_suspend+0xde/0x250
ctl_ioctl+0x1c0/0x480
dm_ctl_ioctl+0xa/0x10
do_vfs_ioctl+0x9f/0x5f0
SyS_ioctl+0x74/0x80

2>
CPU: 25 PID: 2084 Comm: kworker/u64:6 Tainted: G  W   E 4.14.11 #1
Workqueue: scsi_wq_18 __iscsi_unbind_session [scsi_transport_iscsi]
task: 91255ed74500 task.stack: ad74c97cc000
RIP: 0010:scsi_device_dev_release_usercontext+0x58/0x200
RSP: 0018:ffffad74c97cfd80 EFLAGS: 00010046
RAX: 0246  RBX: 912695682f38 RCX: dead0100
RDX: dead0200  RSI: dead0100 RDI: 9125e9529030
RBP: 912695682938  R08:  R09: 9126967b5220
R10: 029f  R11:  R12: 912695682800
R13: 911f4faef028  R14: 912695682800 R15: 9125e9529010
FS: () GS:91269f44() knlGS:
CS: 0010 DS:  ES:  CR0: 80050033
CR2: 7f90027ef020 CR3: 0002e640a006 CR4: 003626e0
Call Trace:
execute_in_process_context+0x58/0x60
device_release+0x2d/0x80
kobject_put+0x7f/0x1a0
scsi_remove_target+0x171/0x1b0
__iscsi_unbind_session+0x63/0x160 [scsi_transport_iscsi]
process_one_work+0x151/0x3f0
worker_thread+0x4a/0x440
kthread+0xfc/0x130
process_one_work+0x3f0/0x3f0
kthread_create_on_node+0x70/0x70
do_group_exit+0x3a/0xa0
ret_from_fork+0x1f/0x30

Both call traces die at the same instruction (the call to
list_del(&sdev->siblings)), and RCX: dead0100 is the list poison value,
which means the list entry has already been deleted.
So, before calling list_del(), check whether the list entry is empty.

Signed-off-by: zhenwei.pi 
---
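For readers unfamiliar with the poison values seen in RCX/RDX above, here is a
minimal userspace sketch of the difference between list_del() and
list_del_init(). It is not the kernel source: struct list_head and the helpers
are re-implemented here for illustration, the poison constants are assumed to
be the usual x86_64 values, and demo_list_removal() is a hypothetical name. In
this sketch, an entry removed with list_del() does not look empty afterwards,
so a list_empty() test only recognises entries that were removed with
list_del_init().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's doubly linked list node. */
struct list_head {
	struct list_head *next, *prev;
};

/* Assumed x86_64 poison values; illustrative only, never dereferenced. */
#define LIST_POISON1 ((struct list_head *)(uintptr_t)0xdead000000000100ULL)
#define LIST_POISON2 ((struct list_head *)(uintptr_t)0xdead000000000200ULL)

static inline void INIT_LIST_HEAD(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

/* Insert entry right after head. */
static inline void list_add(struct list_head *entry, struct list_head *head)
{
	entry->next = head->next;
	entry->prev = head;
	head->next->prev = entry;
	head->next = entry;
}

/* An entry (or head) is "empty" when it points back at itself. */
static inline bool list_empty(const struct list_head *h)
{
	return h->next == h;
}

/* Unlink the entry from its neighbours without touching its own pointers. */
static inline void list_unlink(struct list_head *entry)
{
	entry->prev->next = entry->next;
	entry->next->prev = entry->prev;
}

/* Unlink and poison: a second list_del() would dereference the poison. */
static inline void list_del(struct list_head *entry)
{
	list_unlink(entry);
	entry->next = LIST_POISON1;
	entry->prev = LIST_POISON2;
}

/* Unlink and re-initialise: the entry now reports list_empty(). */
static inline void list_del_init(struct list_head *entry)
{
	list_unlink(entry);
	INIT_LIST_HEAD(entry);
}

/* Each assert states the outcome expected in this sketch. */
void demo_list_removal(void)
{
	struct list_head head, a, b;

	INIT_LIST_HEAD(&head);
	list_add(&a, &head);
	list_add(&b, &head);

	list_del(&a);
	assert(a.next == LIST_POISON1);	/* poisoned, as in the RCX value */
	assert(!list_empty(&a));	/* a list_empty() guard would not fire */

	list_del_init(&b);
	assert(list_empty(&b));		/* safe to test before removing again */
	assert(list_empty(&head));
}
```

Calling demo_list_removal() runs through both removal styles; none of the
asserts should fire.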
 drivers/scsi/scsi_sysfs.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
index 26ce1717..92463bb 100644
--- a/drivers/scsi/scsi_sysfs.c
+++ b/drivers/scsi/scsi_sysfs.c
@@ -439,7 +439,11 @@ static void scsi_device_dev_release_usercontext(struct work_struct *work)
 	parent = sdev->sdev_gendev.parent;
 
 	spin_lock_irqsave(sdev->host->host_lock, flags);
-	list_del(&sdev->siblings);
+	if (list_empty(&sdev->siblings)) {
+		spin_unlock_irqrestore(sdev->host->host_lock, flags);
+		return;
+	}
+	list_del_init(&sdev->siblings);
 	list_del(&sdev->same_target_siblings);
 	list_del(&sdev->starved_entry);
 	spin_unlock_irqrestore(sdev->host->host_lock, flags);
-- 
2.7.4