RE: [PATCH 03/29] scsi: aacraid: Fix hang in kdump
> -----Original Message-----
> From: Guilherme G. Piccoli [mailto:gpicc...@linux.vnet.ibm.com]
> Sent: Friday, December 22, 2017 7:14 AM
> To: Raghava Aditya Renukunta <raghavaaditya.renuku...@microsemi.com>;
> j...@linux.vnet.ibm.com; martin.peter...@oracle.com;
> linux-scsi@vger.kernel.org
> Cc: Scott Benesh <scott.ben...@microsemi.com>; dl-esc-Aacraid Linux
> Driver <aacr...@microsemi.com>; Tom White <tom.wh...@microsemi.com>;
> dougm...@linux.vnet.ibm.com
> Subject: Re: [PATCH 03/29] scsi: aacraid: Fix hang in kdump
>
> On 12/21/2017 03:33 PM, Raghava Aditya Renukunta wrote:
> > Driver attempts to perform a device scan and device add after coming out
> > of reset. At times when the kdump kernel loads and it tries to perform
> > eh recovery, the device scan hangs since its commands are blocked because
> > of the eh recovery. This should have shown up in normal eh recovery path
> > (Should have been obvious)
> >
> > Remove the code that performs scanning. I can live without the rescanning
> > support in the stable kernels but a hanging kdump/eh recovery needs to be
> > fixed.
> >
> > Fixes: a2d0321dd532901e (scsi: aacraid: Reload offlined drives after
> > controller reset)
> > Cc: <sta...@vger.kernel.org>
> > Reported-by: Guilherme G. Piccoli <gpicc...@linux.vnet.ibm.com>
>
> (Sorry in advance for flooding the thread heheh)
> I guess it'd be more appropriate to:
>
> Reported-by: Douglas Miller <dougm...@linux.vnet.ibm.com>
>
> Although I've tested it, Doug isolated the race condition based on code
> analysis...

Thank you for pointing that out, I will fix it in the next iteration.

Regards,
Raghava Aditya

> Thanks,
>
>
> Guilherme
>
> > Tested-by: Guilherme G. Piccoli <gpicc...@linux.vnet.ibm.com>
> > Fixes: a2d0321dd532901e (scsi: aacraid: Reload offlined drives after
> > controller reset)
> > Signed-off-by: Raghava Aditya Renukunta
> > <raghavaaditya.renuku...@microsemi.com>
> > ---
> >  drivers/scsi/aacraid/commsup.c | 9 +--------
> >  1 file changed, 1 insertion(+), 8 deletions(-)
> >
> > diff --git a/drivers/scsi/aacraid/commsup.c b/drivers/scsi/aacraid/commsup.c
> > index 525a652..ffbfd04 100644
> > --- a/drivers/scsi/aacraid/commsup.c
> > +++ b/drivers/scsi/aacraid/commsup.c
> > @@ -1672,14 +1672,7 @@ static int _aac_reset_adapter(struct aac_dev *aac, int forced, u8 reset_type)
> >  out:
> >  	aac->in_reset = 0;
> >  	scsi_unblock_requests(host);
> > -	/*
> > -	 * Issue bus rescan to catch any configuration that might have
> > -	 * occurred
> > -	 */
> > -	if (!retval) {
> > -		dev_info(&aac->pdev->dev, "Issuing bus rescan\n");
> > -		scsi_scan_host(host);
> > -	}
> > +
> >  	if (jafo) {
> >  		spin_lock_irq(host->host_lock);
> >  	}
> >
Re: [PATCH 03/29] scsi: aacraid: Fix hang in kdump
On 12/21/2017 03:33 PM, Raghava Aditya Renukunta wrote:
> Driver attempts to perform a device scan and device add after coming out
> of reset. At times when the kdump kernel loads and it tries to perform
> eh recovery, the device scan hangs since its commands are blocked because
> of the eh recovery. This should have shown up in normal eh recovery path
> (Should have been obvious)
>
> Remove the code that performs scanning. I can live without the rescanning
> support in the stable kernels but a hanging kdump/eh recovery needs to be
> fixed.
>
> Fixes: a2d0321dd532901e (scsi: aacraid: Reload offlined drives after
> controller reset)
> Cc: <sta...@vger.kernel.org>
> Reported-by: Guilherme G. Piccoli <gpicc...@linux.vnet.ibm.com>

(Sorry in advance for flooding the thread heheh)
I guess it'd be more appropriate to:

Reported-by: Douglas Miller <dougm...@linux.vnet.ibm.com>

Although I've tested it, Doug isolated the race condition based on code
analysis...

Thanks,


Guilherme

> Tested-by: Guilherme G. Piccoli <gpicc...@linux.vnet.ibm.com>
> Fixes: a2d0321dd532901e (scsi: aacraid: Reload offlined drives after
> controller reset)
> Signed-off-by: Raghava Aditya Renukunta <raghavaaditya.renuku...@microsemi.com>
> ---
>  drivers/scsi/aacraid/commsup.c | 9 +--------
>  1 file changed, 1 insertion(+), 8 deletions(-)
>
> diff --git a/drivers/scsi/aacraid/commsup.c b/drivers/scsi/aacraid/commsup.c
> index 525a652..ffbfd04 100644
> --- a/drivers/scsi/aacraid/commsup.c
> +++ b/drivers/scsi/aacraid/commsup.c
> @@ -1672,14 +1672,7 @@ static int _aac_reset_adapter(struct aac_dev *aac, int forced, u8 reset_type)
>  out:
>  	aac->in_reset = 0;
>  	scsi_unblock_requests(host);
> -	/*
> -	 * Issue bus rescan to catch any configuration that might have
> -	 * occurred
> -	 */
> -	if (!retval) {
> -		dev_info(&aac->pdev->dev, "Issuing bus rescan\n");
> -		scsi_scan_host(host);
> -	}
> +
>  	if (jafo) {
>  		spin_lock_irq(host->host_lock);
>  	}
>
Re: [PATCH 03/29] scsi: aacraid: Fix hang in kdump
On 12/21/2017 03:33 PM, Raghava Aditya Renukunta wrote:
> Driver attempts to perform a device scan and device add after coming out
> of reset. At times when the kdump kernel loads and it tries to perform
> eh recovery, the device scan hangs since its commands are blocked because
> of the eh recovery. This should have shown up in normal eh recovery path
> (Should have been obvious)
>
> Remove the code that performs scanning. I can live without the rescanning
> support in the stable kernels but a hanging kdump/eh recovery needs to be
> fixed.
>
> Fixes: a2d0321dd532901e (scsi: aacraid: Reload offlined drives after
> controller reset)
> Cc: <sta...@vger.kernel.org>
> Reported-by: Guilherme G. Piccoli <gpicc...@linux.vnet.ibm.com>
> Tested-by: Guilherme G. Piccoli <gpicc...@linux.vnet.ibm.com>
> Fixes: a2d0321dd532901e (scsi: aacraid: Reload offlined drives after
> controller reset)
> Signed-off-by: Raghava Aditya Renukunta <raghavaaditya.renuku...@microsemi.com>
>

Thanks a lot Raghava =)

> ---
>  drivers/scsi/aacraid/commsup.c | 9 +--------
>  1 file changed, 1 insertion(+), 8 deletions(-)
>
> diff --git a/drivers/scsi/aacraid/commsup.c b/drivers/scsi/aacraid/commsup.c
> index 525a652..ffbfd04 100644
> --- a/drivers/scsi/aacraid/commsup.c
> +++ b/drivers/scsi/aacraid/commsup.c
> @@ -1672,14 +1672,7 @@ static int _aac_reset_adapter(struct aac_dev *aac, int forced, u8 reset_type)
>  out:
>  	aac->in_reset = 0;
>  	scsi_unblock_requests(host);
> -	/*
> -	 * Issue bus rescan to catch any configuration that might have
> -	 * occurred
> -	 */
> -	if (!retval) {
> -		dev_info(&aac->pdev->dev, "Issuing bus rescan\n");
> -		scsi_scan_host(host);
> -	}
> +
>  	if (jafo) {
>  		spin_lock_irq(host->host_lock);
>  	}
>
[PATCH 03/29] scsi: aacraid: Fix hang in kdump
Driver attempts to perform a device scan and device add after coming out
of reset. At times when the kdump kernel loads and it tries to perform
eh recovery, the device scan hangs since its commands are blocked because
of the eh recovery. This should have shown up in normal eh recovery path
(Should have been obvious)

Remove the code that performs scanning. I can live without the rescanning
support in the stable kernels but a hanging kdump/eh recovery needs to be
fixed.

Fixes: a2d0321dd532901e (scsi: aacraid: Reload offlined drives after controller reset)
Cc: <sta...@vger.kernel.org>
Reported-by: Guilherme G. Piccoli <gpicc...@linux.vnet.ibm.com>
Tested-by: Guilherme G. Piccoli <gpicc...@linux.vnet.ibm.com>
Fixes: a2d0321dd532901e (scsi: aacraid: Reload offlined drives after controller reset)
Signed-off-by: Raghava Aditya Renukunta <raghavaaditya.renuku...@microsemi.com>
---
 drivers/scsi/aacraid/commsup.c | 9 +--------
 1 file changed, 1 insertion(+), 8 deletions(-)

diff --git a/drivers/scsi/aacraid/commsup.c b/drivers/scsi/aacraid/commsup.c
index 525a652..ffbfd04 100644
--- a/drivers/scsi/aacraid/commsup.c
+++ b/drivers/scsi/aacraid/commsup.c
@@ -1672,14 +1672,7 @@ static int _aac_reset_adapter(struct aac_dev *aac, int forced, u8 reset_type)
 out:
 	aac->in_reset = 0;
 	scsi_unblock_requests(host);
-	/*
-	 * Issue bus rescan to catch any configuration that might have
-	 * occurred
-	 */
-	if (!retval) {
-		dev_info(&aac->pdev->dev, "Issuing bus rescan\n");
-		scsi_scan_host(host);
-	}
+
 	if (jafo) {
 		spin_lock_irq(host->host_lock);
 	}
-- 
2.9.4
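
For readers following the reasoning in the commit message, here is a
minimal userspace sketch of why a synchronous rescan issued from the
reset path can hang. It is only a model under stated assumptions, not
driver or midlayer code: struct model_host and every model_* function
are hypothetical names made up for this illustration, and the "hang" is
reported as a boolean instead of actually blocking. The one behavior it
takes from the description above is that commands issued while eh
recovery is in progress are held back rather than dispatched.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct model_host {
	bool in_recovery;	/* true while eh recovery is still running */
	int pending;		/* commands queued but not dispatched */
};

/*
 * Queue one command. Returns true if it is dispatched right away,
 * false if it is held back because the host is in recovery.
 */
static bool model_queue_command(struct model_host *h)
{
	assert(h != NULL);
	if (h->in_recovery) {
		h->pending++;
		return false;
	}
	return true;
}

/*
 * A synchronous scan waits for its command to complete. Instead of
 * really waiting, report whether that wait could never finish.
 */
static bool model_scan_would_hang(struct model_host *h)
{
	return !model_queue_command(h);
}

/*
 * Model of the tail of the reset path, which runs as part of recovery.
 * With rescan == true (the behavior the patch removes) the scan is
 * issued while recovery is still in progress.
 */
static bool model_reset_tail(struct model_host *h, bool rescan)
{
	if (rescan)
		return model_scan_would_hang(h);
	return false;
}
```

In this model, model_reset_tail() with rescan set reports a hang for as
long as in_recovery is true, because the scan's command sits in the
pending count and the code that would clear in_recovery is the same code
that is waiting. With rescan cleared, which corresponds to the patched
code, nothing is queued and the reset path returns.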