Re: [PATCH] aacraid: Insure command thread is not recursively stopped

2018-04-09 Thread Martin K. Petersen

Dave,

> If a recursive IOP_RESET is invoked, usually due to the eh_thread
> handling errors after the first reset, be sure we flag that the
> command thread has been stopped to avoid an Oops of the form;

Applied to 4.17/scsi-fixes. Thanks!

-- 
Martin K. Petersen  Oracle Linux Engineering


RE: [PATCH] aacraid: Insure command thread is not recursively stopped

2018-04-04 Thread Raghava Aditya Renukunta


> -----Original Message-----
> From: Dave Carroll [mailto:david.carr...@microsemi.com]
> Sent: Wednesday, April 4, 2018 3:21 AM
> To: Martin K . Petersen <martin.peter...@oracle.com>; James Bottomley
> <j...@linux.vnet.ibm.com>
> Cc: Dave Carroll <david.carr...@microsemi.com>; linux-scsi <linux-s...@vger.kernel.org>; dl-esc-Aacraid Linux Driver
> <aacr...@microsemi.com>; Scott Benesh <scott.ben...@microsemi.com>
> Subject: [PATCH] aacraid: Insure command thread is not recursively stopped
> 
> If a recursive IOP_RESET is invoked, usually due to the eh_thread handling
> errors after the first reset, be sure we flag that the command thread has
> been stopped to avoid an Oops of the form;
> 
>  [ 336.620256] CPU: 28 PID: 1193 Comm: scsi_eh_0 Kdump: loaded Not
> tainted 4.14.0-49.el7a.ppc64le #1
>  [ 336.620297] task: c03fd630b800 task.stack: c03fd61a4000
>  [ 336.620326] NIP: c0176794 LR: c013038c CTR:
> c024bc10
>  [ 336.620361] REGS: c03fd61a7720 TRAP: 0300 Not tainted (4.14.0-
> 49.el7a.ppc64le)
>  [ 336.620395] MSR: 90009033 <SF,HV,EE,ME,IR,DR,RI,LE> CR:
> 22084022 XER: 2004
>  [ 336.620435] CFAR: c0130388 DAR:  DSISR:
> 4000 SOFTE: 1
>  [ 336.620435] GPR00: c013038c c03fd61a79a0 c14c7e00
> 
>  [ 336.620435] GPR04: 000c 000c
> 90009033 0477
>  [ 336.620435] GPR08: 0477 
>  c00810f7d940
>  [ 336.620435] GPR12: c024bc10 c7a33400
> c01708a8 c03fe3b881d8
>  [ 336.620435] GPR16: c03fe3b88060 c03fd61a7d10 f000
> 001e
>  [ 336.620435] GPR20: 0001 c0ebf1a0
> 0001 c03fe3b88000
>  [ 336.620435] GPR24: 0003 0002
> c03fe3b88840 c03fe3b887e8
>  [ 336.620435] GPR28: c03fe3b88000 c03fc8181788 
> c03fc8181700
>  [ 336.620750] NIP [c0176794] exit_creds+0x34/0x160
>  [ 336.620775] LR [c013038c] __put_task_struct+0x8c/0x1f0
>  [ 336.620804] Call Trace:
>  [ 336.620817] [c03fd61a79a0] [c03fe3b88000] 0xc03fe3b88000
> (unreliable)
>  [ 336.620853] [c03fd61a79d0] [c013038c]
> __put_task_struct+0x8c/0x1f0
>  [ 336.620889] [c03fd61a7a00] [c0171418]
> kthread_stop+0x1e8/0x1f0
>  [ 336.620922] [c03fd61a7a40] [c00810f7448c]
> aac_reset_adapter+0x14c/0x8d0 [aacraid]
>  [ 336.620959] [c03fd61a7b00] [c00810f60174]
> aac_eh_host_reset+0x84/0x100 [aacraid]
>  [ 336.621010] [c03fd61a7b30] [c0864f24]
> scsi_try_host_reset+0x74/0x180
>  [ 336.621046] [c03fd61a7bb0] [c0867ac0]
> scsi_eh_ready_devs+0xc00/0x14d0
>  [ 336.625165] [c03fd61a7ca0] [c08699e0]
> scsi_error_handler+0x550/0x730
>  [ 336.632101] [c03fd61a7dc0] [c0170a08] kthread+0x168/0x1b0
>  [ 336.639031] [c03fd61a7e30] [c000b528]
> ret_from_kernel_thread+0x5c/0xb4
>  [ 336.645971] Instruction dump:
>  [ 336.648743] 384216a0 7c0802a6 fbe1fff8 f8010010 f821ffd1 7c7f1b78
> 6000 6000
>  [ 336.657056] 3940 e87f0838 f95f0838 7c0004ac <7d401828> 314a
> 7d40192d 40c2fff4
>  [ 336.663997] ---[ end trace 4640cf8d4945ad95 ]---
> 
> So flag when the thread is stopped by setting the thread pointer to NULL.
> 
> Signed-off-by: Dave Carroll <david.carr...@microsemi.com>
Reviewed-by: Raghava Aditya Renukunta <raghavaaditya.renuku...@microsemi.com>


[PATCH] aacraid: Insure command thread is not recursively stopped

2018-04-03 Thread Dave Carroll

If a recursive IOP_RESET is invoked, usually due to the eh_thread handling
errors after the first reset, be sure we flag that the command thread has 
been stopped to avoid an Oops of the form;

 [ 336.620256] CPU: 28 PID: 1193 Comm: scsi_eh_0 Kdump: loaded Not tainted 4.14.0-49.el7a.ppc64le #1
 [ 336.620297] task: c03fd630b800 task.stack: c03fd61a4000
 [ 336.620326] NIP: c0176794 LR: c013038c CTR: c024bc10
 [ 336.620361] REGS: c03fd61a7720 TRAP: 0300 Not tainted (4.14.0-49.el7a.ppc64le)
 [ 336.620395] MSR: 90009033 <SF,HV,EE,ME,IR,DR,RI,LE> CR: 22084022 XER: 2004
 [ 336.620435] CFAR: c0130388 DAR:  DSISR: 4000 SOFTE: 1
 [ 336.620435] GPR00: c013038c c03fd61a79a0 c14c7e00
 [ 336.620435] GPR04: 000c 000c 90009033 0477
 [ 336.620435] GPR08: 0477   c00810f7d940
 [ 336.620435] GPR12: c024bc10 c7a33400 c01708a8 c03fe3b881d8
 [ 336.620435] GPR16: c03fe3b88060 c03fd61a7d10 f000 001e
 [ 336.620435] GPR20: 0001 c0ebf1a0 0001 c03fe3b88000
 [ 336.620435] GPR24: 0003 0002 c03fe3b88840 c03fe3b887e8
 [ 336.620435] GPR28: c03fe3b88000 c03fc8181788  c03fc8181700
 [ 336.620750] NIP [c0176794] exit_creds+0x34/0x160
 [ 336.620775] LR [c013038c] __put_task_struct+0x8c/0x1f0
 [ 336.620804] Call Trace:
 [ 336.620817] [c03fd61a79a0] [c03fe3b88000] 0xc03fe3b88000 (unreliable)
 [ 336.620853] [c03fd61a79d0] [c013038c] __put_task_struct+0x8c/0x1f0
 [ 336.620889] [c03fd61a7a00] [c0171418] kthread_stop+0x1e8/0x1f0
 [ 336.620922] [c03fd61a7a40] [c00810f7448c] aac_reset_adapter+0x14c/0x8d0 [aacraid]
 [ 336.620959] [c03fd61a7b00] [c00810f60174] aac_eh_host_reset+0x84/0x100 [aacraid]
 [ 336.621010] [c03fd61a7b30] [c0864f24] scsi_try_host_reset+0x74/0x180
 [ 336.621046] [c03fd61a7bb0] [c0867ac0] scsi_eh_ready_devs+0xc00/0x14d0
 [ 336.625165] [c03fd61a7ca0] [c08699e0] scsi_error_handler+0x550/0x730
 [ 336.632101] [c03fd61a7dc0] [c0170a08] kthread+0x168/0x1b0
 [ 336.639031] [c03fd61a7e30] [c000b528] ret_from_kernel_thread+0x5c/0xb4
 [ 336.645971] Instruction dump:
 [ 336.648743] 384216a0 7c0802a6 fbe1fff8 f8010010 f821ffd1 7c7f1b78 6000 6000
 [ 336.657056] 3940 e87f0838 f95f0838 7c0004ac <7d401828> 314a 7d40192d 40c2fff4
 [ 336.663997] ---[ end trace 4640cf8d4945ad95 ]---

So flag when the thread is stopped by setting the thread pointer to NULL.

Signed-off-by: Dave Carroll <david.carr...@microsemi.com>
---
 drivers/scsi/aacraid/commsup.c | 4 +++-
 drivers/scsi/aacraid/linit.c   | 1 +
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/aacraid/commsup.c b/drivers/scsi/aacraid/commsup.c
index 84858d5..0156c96 100644
--- a/drivers/scsi/aacraid/commsup.c
+++ b/drivers/scsi/aacraid/commsup.c
@@ -1502,9 +1502,10 @@ static int _aac_reset_adapter(struct aac_dev *aac, int forced, u8 reset_type)
 	host = aac->scsi_host_ptr;
 	scsi_block_requests(host);
 	aac_adapter_disable_int(aac);
-	if (aac->thread->pid != current->pid) {
+	if (aac->thread && aac->thread->pid != current->pid) {
 		spin_unlock_irq(host->host_lock);
 		kthread_stop(aac->thread);
+		aac->thread = NULL;
 		jafo = 1;
 	}
 
@@ -1591,6 +1592,7 @@ static int _aac_reset_adapter(struct aac_dev *aac, int forced, u8 reset_type)
 					  aac->name);
 		if (IS_ERR(aac->thread)) {
 			retval = PTR_ERR(aac->thread);
+			aac->thread = NULL;
 			goto out;
 		}
 	}
diff --git a/drivers/scsi/aacraid/linit.c b/drivers/scsi/aacraid/linit.c
index 2664ea0..f24fb94 100644
--- a/drivers/scsi/aacraid/linit.c
+++ b/drivers/scsi/aacraid/linit.c
@@ -1562,6 +1562,7 @@ static void __aac_shutdown(struct aac_dev * aac)
 				up(&fib->event_wait);
 		}
 		kthread_stop(aac->thread);
+		aac->thread = NULL;
 	}
 
 	aac_send_shutdown(aac);
-- 
2.8.4
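
For readers who want to see the guard in isolation, here is a minimal user-space sketch of the pattern the patch applies: clear the thread pointer once the thread has been stopped, and test it for NULL before stopping again. It is not driver code. `struct worker`, `struct adapter`, `worker_stop()` and `adapter_stop_worker()` are hypothetical stand-ins invented for this illustration; `worker_stop()` only counts calls, whereas the real `kthread_stop()` operates on a task that may already be gone when called a second time, as the `kthread_stop` and `__put_task_struct` frames in the trace suggest.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a kernel thread handle (not struct task_struct). */
struct worker {
	int pid;
	int stop_calls;		/* how many times this worker was stopped */
};

/* Hypothetical stand-in for the adapter state (not struct aac_dev). */
struct adapter {
	struct worker *thread;	/* NULL once the worker has been stopped */
};

/* Stand-in for kthread_stop(); here it only records that it was called. */
static void worker_stop(struct worker *w)
{
	assert(w != NULL);
	w->stop_calls++;
}

/*
 * Stop the worker unless it is already stopped or we are running on it.
 * Returns 1 if this call stopped the worker, 0 if there was nothing to do.
 */
static int adapter_stop_worker(struct adapter *a, int current_pid)
{
	if (a->thread && a->thread->pid != current_pid) {
		worker_stop(a->thread);
		a->thread = NULL;	/* flag the stop so a recursive reset skips it */
		return 1;
	}
	return 0;
}
```

With this shape, a second call to `adapter_stop_worker()` on the same adapter returns 0 and leaves the worker untouched, which corresponds to the recursive reset case described in the commit message.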