RE: [PATCH 03/29] scsi: aacraid: Fix hang in kdump

2017-12-26 Thread Raghava Aditya Renukunta


> -Original Message-
> From: Guilherme G. Piccoli [mailto:gpicc...@linux.vnet.ibm.com]
> Sent: Friday, December 22, 2017 7:14 AM
> To: Raghava Aditya Renukunta
> <raghavaaditya.renuku...@microsemi.com>; j...@linux.vnet.ibm.com;
> martin.peter...@oracle.com; linux-scsi@vger.kernel.org
> Cc: Scott Benesh <scott.ben...@microsemi.com>; dl-esc-Aacraid Linux
> Driver <aacr...@microsemi.com>; Tom White
> <tom.wh...@microsemi.com>; dougm...@linux.vnet.ibm.com
> Subject: Re: [PATCH 03/29] scsi: aacraid: Fix hang in kdump
> 
> EXTERNAL EMAIL
> 
> 
> On 12/21/2017 03:33 PM, Raghava Aditya Renukunta wrote:
> > Driver attempts to perform a device scan and device add after coming out
> > of reset. At times when the kdump kernel loads and it tries to perform
> > eh recovery, the device scan hangs since its commands are blocked
> because
> > of the eh recovery. This should have shown up in normal eh recovery path
> > (Should have been obvious)
> >
> > Remove the code that performs scanning.I can live without the rescanning
> > support in the stable kernels but a hanging kdump/eh recovery needs to
> be
> > fixed.
> >
> > Fixes: a2d0321dd532901e (scsi: aacraid: Reload offlined drives after
> controller reset)
> > Cc: <sta...@vger.kernel.org>
> > Reported-by: Guilherme G. Piccoli <gpicc...@linux.vnet.ibm.com>
> 
> (Sorry in advance for flooding the thread heheh)
> I guess it'd be more appropriate to:
> 
> Reported-by: Douglas Miller <dougm...@linux.vnet.ibm.com>
> 
> Although I've tested it, Doug isolated the race condition based on code
> analysis...

Thank you pointing that out, I will fix it in the next iteration.

Regards,
Raghava Aditya

> Thanks,
> 
> 
> Guilherme
> 
> > Tested-by: Guilherme G. Piccoli <gpicc...@linux.vnet.ibm.com>
> > Fixes: a2d0321dd532901e (scsi: aacraid: Reload offlined drives after
> controller reset)
> > Signed-off-by: Raghava Aditya Renukunta
> <raghavaaditya.renuku...@microsemi.com>
> > ---
> >  drivers/scsi/aacraid/commsup.c | 9 +
> >  1 file changed, 1 insertion(+), 8 deletions(-)
> >
> > diff --git a/drivers/scsi/aacraid/commsup.c
> b/drivers/scsi/aacraid/commsup.c
> > index 525a652..ffbfd04 100644
> > --- a/drivers/scsi/aacraid/commsup.c
> > +++ b/drivers/scsi/aacraid/commsup.c
> > @@ -1672,14 +1672,7 @@ static int _aac_reset_adapter(struct aac_dev
> *aac, int forced, u8 reset_type)
> >  out:
> >   aac->in_reset = 0;
> >   scsi_unblock_requests(host);
> > - /*
> > -  * Issue bus rescan to catch any configuration that might have
> > -  * occurred
> > -  */
> > - if (!retval) {
> > - dev_info(>pdev->dev, "Issuing bus rescan\n");
> > - scsi_scan_host(host);
> > - }
> > +
> >   if (jafo) {
> >   spin_lock_irq(host->host_lock);
> >   }
> >



Re: [PATCH 03/29] scsi: aacraid: Fix hang in kdump

2017-12-22 Thread Guilherme G. Piccoli
On 12/21/2017 03:33 PM, Raghava Aditya Renukunta wrote:
> Driver attempts to perform a device scan and device add after coming out
> of reset. At times when the kdump kernel loads and it tries to perform
> eh recovery, the device scan hangs since its commands are blocked because
> of the eh recovery. This should have shown up in normal eh recovery path
> (Should have been obvious)
> 
> Remove the code that performs scanning.I can live without the rescanning
> support in the stable kernels but a hanging kdump/eh recovery needs to be
> fixed.
> 
> Fixes: a2d0321dd532901e (scsi: aacraid: Reload offlined drives after 
> controller reset)
> Cc: 
> Reported-by: Guilherme G. Piccoli 

(Sorry in advance for flooding the thread heheh)
I guess it'd be more appropriate to:

Reported-by: Douglas Miller 

Although I've tested it, Doug isolated the race condition based on code
analysis...

Thanks,


Guilherme

> Tested-by: Guilherme G. Piccoli 
> Fixes: a2d0321dd532901e (scsi: aacraid: Reload offlined drives after 
> controller reset)
> Signed-off-by: Raghava Aditya Renukunta 
> 
> ---
>  drivers/scsi/aacraid/commsup.c | 9 +
>  1 file changed, 1 insertion(+), 8 deletions(-)
> 
> diff --git a/drivers/scsi/aacraid/commsup.c b/drivers/scsi/aacraid/commsup.c
> index 525a652..ffbfd04 100644
> --- a/drivers/scsi/aacraid/commsup.c
> +++ b/drivers/scsi/aacraid/commsup.c
> @@ -1672,14 +1672,7 @@ static int _aac_reset_adapter(struct aac_dev *aac, int 
> forced, u8 reset_type)
>  out:
>   aac->in_reset = 0;
>   scsi_unblock_requests(host);
> - /*
> -  * Issue bus rescan to catch any configuration that might have
> -  * occurred
> -  */
> - if (!retval) {
> - dev_info(>pdev->dev, "Issuing bus rescan\n");
> - scsi_scan_host(host);
> - }
> +
>   if (jafo) {
>   spin_lock_irq(host->host_lock);
>   }
> 



Re: [PATCH 03/29] scsi: aacraid: Fix hang in kdump

2017-12-21 Thread Guilherme G. Piccoli
On 12/21/2017 03:33 PM, Raghava Aditya Renukunta wrote:
> Driver attempts to perform a device scan and device add after coming out
> of reset. At times when the kdump kernel loads and it tries to perform
> eh recovery, the device scan hangs since its commands are blocked because
> of the eh recovery. This should have shown up in normal eh recovery path
> (Should have been obvious)
> 
> Remove the code that performs scanning.I can live without the rescanning
> support in the stable kernels but a hanging kdump/eh recovery needs to be
> fixed.
> 
> Fixes: a2d0321dd532901e (scsi: aacraid: Reload offlined drives after 
> controller reset)
> Cc: 
> Reported-by: Guilherme G. Piccoli 
> Tested-by: Guilherme G. Piccoli 
> Fixes: a2d0321dd532901e (scsi: aacraid: Reload offlined drives after 
> controller reset)
> Signed-off-by: Raghava Aditya Renukunta 
> 

Thanks a lot Raghava =)

> ---
>  drivers/scsi/aacraid/commsup.c | 9 +
>  1 file changed, 1 insertion(+), 8 deletions(-)
> 
> diff --git a/drivers/scsi/aacraid/commsup.c b/drivers/scsi/aacraid/commsup.c
> index 525a652..ffbfd04 100644
> --- a/drivers/scsi/aacraid/commsup.c
> +++ b/drivers/scsi/aacraid/commsup.c
> @@ -1672,14 +1672,7 @@ static int _aac_reset_adapter(struct aac_dev *aac, int 
> forced, u8 reset_type)
>  out:
>   aac->in_reset = 0;
>   scsi_unblock_requests(host);
> - /*
> -  * Issue bus rescan to catch any configuration that might have
> -  * occurred
> -  */
> - if (!retval) {
> - dev_info(>pdev->dev, "Issuing bus rescan\n");
> - scsi_scan_host(host);
> - }
> +
>   if (jafo) {
>   spin_lock_irq(host->host_lock);
>   }
> 



[PATCH 03/29] scsi: aacraid: Fix hang in kdump

2017-12-21 Thread Raghava Aditya Renukunta
Driver attempts to perform a device scan and device add after coming out
of reset. At times when the kdump kernel loads and it tries to perform
eh recovery, the device scan hangs since its commands are blocked because
of the eh recovery. This should have shown up in normal eh recovery path
(Should have been obvious)

Remove the code that performs scanning.I can live without the rescanning
support in the stable kernels but a hanging kdump/eh recovery needs to be
fixed.

Fixes: a2d0321dd532901e (scsi: aacraid: Reload offlined drives after controller 
reset)
Cc: 
Reported-by: Guilherme G. Piccoli 
Tested-by: Guilherme G. Piccoli 
Fixes: a2d0321dd532901e (scsi: aacraid: Reload offlined drives after controller 
reset)
Signed-off-by: Raghava Aditya Renukunta 
---
 drivers/scsi/aacraid/commsup.c | 9 +
 1 file changed, 1 insertion(+), 8 deletions(-)

diff --git a/drivers/scsi/aacraid/commsup.c b/drivers/scsi/aacraid/commsup.c
index 525a652..ffbfd04 100644
--- a/drivers/scsi/aacraid/commsup.c
+++ b/drivers/scsi/aacraid/commsup.c
@@ -1672,14 +1672,7 @@ static int _aac_reset_adapter(struct aac_dev *aac, int 
forced, u8 reset_type)
 out:
aac->in_reset = 0;
scsi_unblock_requests(host);
-   /*
-* Issue bus rescan to catch any configuration that might have
-* occurred
-*/
-   if (!retval) {
-   dev_info(>pdev->dev, "Issuing bus rescan\n");
-   scsi_scan_host(host);
-   }
+
if (jafo) {
spin_lock_irq(host->host_lock);
}
-- 
2.9.4