RE: [PATCH 10/16] aacraid: Terminate kthread on controller fw assert

2017-02-16 Thread Raghava Aditya Renukunta


> -Original Message-
> From: Johannes Thumshirn [mailto:jthumsh...@suse.de]
> Sent: Thursday, February 16, 2017 1:31 AM
> To: Raghava Aditya Renukunta
> <raghavaaditya.renuku...@microsemi.com>; j...@linux.vnet.ibm.com;
> martin.peter...@oracle.com; linux-scsi@vger.kernel.org
> Cc: Dave Carroll <david.carr...@microsemi.com>; Gana Sridaran
> <gana.srida...@microsemi.com>; Scott Benesh
> <scott.ben...@microsemi.com>; dan.carpen...@oracle.com
> Subject: Re: [PATCH 10/16] aacraid: Terminate kthread on controller fw
> assert
> 
> EXTERNAL EMAIL
> 
> 
> On 02/15/2017 11:22 PM, Raghava Aditya Renukunta wrote:
> >>
> >> This look a bit scary. Can't the kthread be converted to a workqueue so
> >> we could call cancel_work_sync()?
> >
> > Could you please elaborate on the reasons why this fix is scary?
> > I understand that killing a thread is not standard (for any reason),
> > and if there are other nuanced issues I would like to understand them.
> 
> I'm actually concerned that this could have all kinds of side effects.
> But this is just a gut feeling. I see some drm drivers are doing the
> same, so it might be possible, but IMHO this is not a good design.
> 
> And IIRC kthreads do have more downsides (i.e. CPU hotplugging and
> issues with kernel live patching).
> 
> I think most kthreads (haven't looked too close to the aacraid kthread I
> must admit, but I'll be doing so) can be converted to either workqueues
> or timers (or a combination of both).
> 
> Thanks,
> Johannes

Makes sense,  and I agree. With that being said I will withdraw this patch
and resend it out in different patch series once we  rework aac_command_thread
into a work queue/timers.

Regards,
Raghava Aditya


> --
> Johannes Thumshirn  Storage
> jthumsh...@suse.de+49 911 74053 689
> SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
> GF: Felix Imendörffer, Jane Smithard, Graham Norton
> HRB 21284 (AG Nürnberg)
> Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850


Re: [PATCH 10/16] aacraid: Terminate kthread on controller fw assert

2017-02-16 Thread Johannes Thumshirn
On 02/15/2017 11:22 PM, Raghava Aditya Renukunta wrote:
>>
>> This look a bit scary. Can't the kthread be converted to a workqueue so
>> we could call cancel_work_sync()?
> 
> Could you please elaborate on the reasons why this fix is scary?
> I understand that killing a thread is not standard (for any reason), 
> and if there are other nuanced issues I would like to understand them.

I'm actually concerned that this could have all kinds of side effects.
But this is just a gut feeling. I see some drm drivers are doing the
same, so it might be possible, but IMHO this is not a good design.

And IIRC kthreads do have more downsides (i.e. CPU hotplugging and
issues with kernel live patching).

I think most kthreads (haven't looked too close to the aacraid kthread I
must admit, but I'll be doing so) can be converted to either workqueues
or timers (or a combination of both).

Thanks,
Johannes

-- 
Johannes Thumshirn  Storage
jthumsh...@suse.de+49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850


RE: [PATCH 10/16] aacraid: Terminate kthread on controller fw assert

2017-02-15 Thread Raghava Aditya Renukunta


> -Original Message-
> From: Johannes Thumshirn [mailto:jthumsh...@suse.de]
> Sent: Wednesday, February 15, 2017 12:44 AM
> To: Raghava Aditya Renukunta
> <raghavaaditya.renuku...@microsemi.com>; j...@linux.vnet.ibm.com;
> martin.peter...@oracle.com; linux-scsi@vger.kernel.org
> Cc: Dave Carroll <david.carr...@microsemi.com>; Gana Sridaran
> <gana.srida...@microsemi.com>; Scott Benesh
> <scott.ben...@microsemi.com>; dan.carpen...@oracle.com
> Subject: Re: [PATCH 10/16] aacraid: Terminate kthread on controller fw
> assert
> 
> EXTERNAL EMAIL
> 
> 
> On 02/14/2017 09:44 PM, Raghava Aditya Renukunta wrote:
> > When the command thread performs a periodic time sync and the
> firmware is
> > going through an assert during that time, the command thread waits for
> the
> > response that would never arrive. The SCSI Mid layer's error handler would
> > eventually reset the controller, but the eh_handler just issues a
> > "thread stop" to the command thread which is stuck on a semaphore and
> the
> > eh_thread would in turn goes to sleep waiting for the command_thread to
> > respond to the stop which never happens.
> >
> > Fixed by allowing SIGTERM for the command thread, and during the
> eh_reset
> > call, sends termination signal to the command thread. As a follow-up, the
> > eh_reset handler would take care of the controller reset.
> >
> > Signed-off-by: Raghava Aditya Renukunta
> <raghavaaditya.renuku...@microsemi.com>
> > Reviewed-by: David Carroll <david.carr...@microsemi.com>
> > ---
> 
> This look a bit scary. Can't the kthread be converted to a workqueue so
> we could call cancel_work_sync()?

Could you please elaborate on the reasons why this fix is scary?
I understand that killing a thread is not standard (for any reason), 
and if there are other nuanced issues I would like to understand them.

Thank you,
Raghava Aditya 

> 
> --
> Johannes Thumshirn  Storage
> jthumsh...@suse.de+49 911 74053 689
> SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
> GF: Felix Imendörffer, Jane Smithard, Graham Norton
> HRB 21284 (AG Nürnberg)
> Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850


Re: [PATCH 10/16] aacraid: Terminate kthread on controller fw assert

2017-02-15 Thread Johannes Thumshirn
On 02/14/2017 09:44 PM, Raghava Aditya Renukunta wrote:
> When the command thread performs a periodic time sync and the firmware is
> going through an assert during that time, the command thread waits for the
> response that would never arrive. The SCSI Mid layer's error handler would
> eventually reset the controller, but the eh_handler just issues a
> "thread stop" to the command thread which is stuck on a semaphore and the
> eh_thread would in turn goes to sleep waiting for the command_thread to
> respond to the stop which never happens.
> 
> Fixed by allowing SIGTERM for the command thread, and during the eh_reset
> call, sends termination signal to the command thread. As a follow-up, the
> eh_reset handler would take care of the controller reset.
> 
> Signed-off-by: Raghava Aditya Renukunta 
> 
> Reviewed-by: David Carroll 
> ---

This look a bit scary. Can't the kthread be converted to a workqueue so
we could call cancel_work_sync()?


-- 
Johannes Thumshirn  Storage
jthumsh...@suse.de+49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850