RE: [PATCH 10/16] aacraid: Terminate kthread on controller fw assert

2017-02-16 Thread Raghava Aditya Renukunta


> -Original Message-
> From: Johannes Thumshirn [mailto:jthumsh...@suse.de]
> Sent: Thursday, February 16, 2017 1:31 AM
> To: Raghava Aditya Renukunta
> <raghavaaditya.renuku...@microsemi.com>; j...@linux.vnet.ibm.com;
> martin.peter...@oracle.com; linux-scsi@vger.kernel.org
> Cc: Dave Carroll <david.carr...@microsemi.com>; Gana Sridaran
> <gana.srida...@microsemi.com>; Scott Benesh
> <scott.ben...@microsemi.com>; dan.carpen...@oracle.com
> Subject: Re: [PATCH 10/16] aacraid: Terminate kthread on controller fw
> assert
> 
> EXTERNAL EMAIL
> 
> 
> On 02/15/2017 11:22 PM, Raghava Aditya Renukunta wrote:
> >>
> >> This look a bit scary. Can't the kthread be converted to a workqueue so
> >> we could call cancel_work_sync()?
> >
> > Could you please elaborate on the reasons why this fix is scary?
> > I understand that killing a thread is not standard (for any reason),
> > and if there are other nuanced issues I would like to understand them.
> 
> I'm actually concerned that this could have all kinds of side effects.
> But this is just a gut feeling. I see some drm drivers are doing the
> same, so it might be possible, but IMHO this is not a good design.
> 
> And IIRC kthreads do have more downsides (i.e. CPU hotplugging and
> issues with kernel live patching).
> 
> I think most kthreads (haven't looked too close to the aacraid kthread I
> must admit, but I'll be doing so) can be converted to either workqueues
> or timers (or a combination of both).
> 
> Thanks,
> Johannes

Makes sense,  and I agree. With that being said I will withdraw this patch
and resend it out in different patch series once we  rework aac_command_thread
into a work queue/timers.

Regards,
Raghava Aditya


> --
> Johannes Thumshirn  Storage
> jthumsh...@suse.de+49 911 74053 689
> SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
> GF: Felix Imendörffer, Jane Smithard, Graham Norton
> HRB 21284 (AG Nürnberg)
> Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850


Re: [PATCH 10/16] aacraid: Terminate kthread on controller fw assert

2017-02-16 Thread Johannes Thumshirn
On 02/15/2017 11:22 PM, Raghava Aditya Renukunta wrote:
>>
>> This look a bit scary. Can't the kthread be converted to a workqueue so
>> we could call cancel_work_sync()?
> 
> Could you please elaborate on the reasons why this fix is scary?
> I understand that killing a thread is not standard (for any reason), 
> and if there are other nuanced issues I would like to understand them.

I'm actually concerned that this could have all kinds of side effects.
But this is just a gut feeling. I see some drm drivers are doing the
same, so it might be possible, but IMHO this is not a good design.

And IIRC kthreads do have more downsides (i.e. CPU hotplugging and
issues with kernel live patching).

I think most kthreads (haven't looked too close to the aacraid kthread I
must admit, but I'll be doing so) can be converted to either workqueues
or timers (or a combination of both).

Thanks,
Johannes

-- 
Johannes Thumshirn  Storage
jthumsh...@suse.de+49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850


RE: [PATCH 10/16] aacraid: Terminate kthread on controller fw assert

2017-02-15 Thread Raghava Aditya Renukunta


> -Original Message-
> From: Johannes Thumshirn [mailto:jthumsh...@suse.de]
> Sent: Wednesday, February 15, 2017 12:44 AM
> To: Raghava Aditya Renukunta
> <raghavaaditya.renuku...@microsemi.com>; j...@linux.vnet.ibm.com;
> martin.peter...@oracle.com; linux-scsi@vger.kernel.org
> Cc: Dave Carroll <david.carr...@microsemi.com>; Gana Sridaran
> <gana.srida...@microsemi.com>; Scott Benesh
> <scott.ben...@microsemi.com>; dan.carpen...@oracle.com
> Subject: Re: [PATCH 10/16] aacraid: Terminate kthread on controller fw
> assert
> 
> EXTERNAL EMAIL
> 
> 
> On 02/14/2017 09:44 PM, Raghava Aditya Renukunta wrote:
> > When the command thread performs a periodic time sync and the
> firmware is
> > going through an assert during that time, the command thread waits for
> the
> > response that would never arrive. The SCSI Mid layer's error handler would
> > eventually reset the controller, but the eh_handler just issues a
> > "thread stop" to the command thread which is stuck on a semaphore and
> the
> > eh_thread would in turn goes to sleep waiting for the command_thread to
> > respond to the stop which never happens.
> >
> > Fixed by allowing SIGTERM for the command thread, and during the
> eh_reset
> > call, sends termination signal to the command thread. As a follow-up, the
> > eh_reset handler would take care of the controller reset.
> >
> > Signed-off-by: Raghava Aditya Renukunta
> <raghavaaditya.renuku...@microsemi.com>
> > Reviewed-by: David Carroll <david.carr...@microsemi.com>
> > ---
> 
> This look a bit scary. Can't the kthread be converted to a workqueue so
> we could call cancel_work_sync()?

Could you please elaborate on the reasons why this fix is scary?
I understand that killing a thread is not standard (for any reason), 
and if there are other nuanced issues I would like to understand them.

Thank you,
Raghava Aditya 

> 
> --
> Johannes Thumshirn  Storage
> jthumsh...@suse.de+49 911 74053 689
> SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
> GF: Felix Imendörffer, Jane Smithard, Graham Norton
> HRB 21284 (AG Nürnberg)
> Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850


Re: [PATCH 10/16] aacraid: Terminate kthread on controller fw assert

2017-02-15 Thread Johannes Thumshirn
On 02/14/2017 09:44 PM, Raghava Aditya Renukunta wrote:
> When the command thread performs a periodic time sync and the firmware is
> going through an assert during that time, the command thread waits for the
> response that would never arrive. The SCSI Mid layer's error handler would
> eventually reset the controller, but the eh_handler just issues a
> "thread stop" to the command thread which is stuck on a semaphore and the
> eh_thread would in turn goes to sleep waiting for the command_thread to
> respond to the stop which never happens.
> 
> Fixed by allowing SIGTERM for the command thread, and during the eh_reset
> call, sends termination signal to the command thread. As a follow-up, the
> eh_reset handler would take care of the controller reset.
> 
> Signed-off-by: Raghava Aditya Renukunta 
> 
> Reviewed-by: David Carroll 
> ---

This look a bit scary. Can't the kthread be converted to a workqueue so
we could call cancel_work_sync()?


-- 
Johannes Thumshirn  Storage
jthumsh...@suse.de+49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850


[PATCH 10/16] aacraid: Terminate kthread on controller fw assert

2017-02-14 Thread Raghava Aditya Renukunta
When the command thread performs a periodic time sync and the firmware is
going through an assert during that time, the command thread waits for the
response that would never arrive. The SCSI Mid layer's error handler would
eventually reset the controller, but the eh_handler just issues a
"thread stop" to the command thread which is stuck on a semaphore and the
eh_thread would in turn goes to sleep waiting for the command_thread to
respond to the stop which never happens.

Fixed by allowing SIGTERM for the command thread, and during the eh_reset
call, sends termination signal to the command thread. As a follow-up, the
eh_reset handler would take care of the controller reset.

Signed-off-by: Raghava Aditya Renukunta 
Reviewed-by: David Carroll 
---
 drivers/scsi/aacraid/commsup.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/drivers/scsi/aacraid/commsup.c b/drivers/scsi/aacraid/commsup.c
index 78588e4..0ee91d0 100644
--- a/drivers/scsi/aacraid/commsup.c
+++ b/drivers/scsi/aacraid/commsup.c
@@ -1519,8 +1519,15 @@ static int _aac_reset_adapter(struct aac_dev *aac, int 
forced, u8 reset_type)
scsi_block_requests(host);
aac_adapter_disable_int(aac);
if (aac->thread->pid != current->pid) {
+   struct task_struct *tsk;
+
spin_unlock_irq(host->host_lock);
+   tsk = pid_task(find_vpid(aac->thread->pid), PIDTYPE_PID);
+   if (tsk)
+   send_sig(SIGTERM, tsk, 1);
kthread_stop(aac->thread);
+
+   dev_info(>pdev->dev, "Command Thread Terminated\n");
jafo = 1;
}
 
-- 
2.7.4