Re: [PATCH 2/2] scsi: ufs: add Exynos-specific driver

2017-11-28 Thread Jaehoon Chung
Hi,

On 11/28/2017 02:36 PM, 김기웅 wrote:
> This driver is to use UFS devices on Exynos SoC and
> has been already used for many years for commercial products.

Well, I'm not sure, but I remember there were similar patches before:
Seungwon's and Alim's patches.
Is this related to them?

Anyway, I don't think I can test with only these patches.
How did you test them?

> 
> Signed-off-by: Kiwoong Kim 
> ---
>  drivers/scsi/ufs/Kconfig  |  10 +
>  drivers/scsi/ufs/Makefile |   1 +
>  drivers/scsi/ufs/ufs-exynos.c | 962 ++
>  drivers/scsi/ufs/ufs-exynos.h | 351 +++
>  drivers/scsi/ufs/ufshcd.h |   1 +
>  5 files changed, 1325 insertions(+)
>  create mode 100644 drivers/scsi/ufs/ufs-exynos.c
>  create mode 100644 drivers/scsi/ufs/ufs-exynos.h

There is no binding file.

> 
> diff --git a/drivers/scsi/ufs/Kconfig b/drivers/scsi/ufs/Kconfig
> index e27b4d4e6ae2..7d71ad8768c3 100644
> --- a/drivers/scsi/ufs/Kconfig
> +++ b/drivers/scsi/ufs/Kconfig
> @@ -100,3 +100,13 @@ config SCSI_UFS_QCOM
>  
> Select this if you have UFS controller on QCOM chipset.
> If unsure, say N.
> +
> +config SCSI_UFS_EXYNOS
> + tristate "EXYNOS UFS Host Controller Driver"
> + depends on SCSI_UFSHCD && SCSI_UFSHCD_PLATFORM
> + ---help---
> +   This selects the EXYNOS UFS host controller driver.
> +
> +   If you have a controller with this interface, say Y or M here.
> +
> +   If unsure, say N.
> diff --git a/drivers/scsi/ufs/Makefile b/drivers/scsi/ufs/Makefile
> index 9310c6c83041..3312b052dcff 100644
> --- a/drivers/scsi/ufs/Makefile
> +++ b/drivers/scsi/ufs/Makefile
> @@ -3,6 +3,7 @@
>  obj-$(CONFIG_SCSI_UFS_DWC_TC_PCI) += tc-dwc-g210-pci.o ufshcd-dwc.o tc-dwc-g210.o
>  obj-$(CONFIG_SCSI_UFS_DWC_TC_PLATFORM) += tc-dwc-g210-pltfrm.o ufshcd-dwc.o tc-dwc-g210.o
>  obj-$(CONFIG_SCSI_UFS_QCOM) += ufs-qcom.o
> +obj-$(CONFIG_SCSI_UFS_EXYNOS) += ufs-exynos.o
>  obj-$(CONFIG_SCSI_UFSHCD) += ufshcd.o
>  obj-$(CONFIG_SCSI_UFSHCD_PCI) += ufshcd-pci.o
>  obj-$(CONFIG_SCSI_UFSHCD_PLATFORM) += ufshcd-pltfrm.o
> diff --git a/drivers/scsi/ufs/ufs-exynos.c b/drivers/scsi/ufs/ufs-exynos.c
> new file mode 100644
> index ..98e5aeb80b06
> --- /dev/null
> +++ b/drivers/scsi/ufs/ufs-exynos.c
> @@ -0,0 +1,962 @@
> +/*
> + * UFS Host Controller driver for Exynos specific extensions
> + *
> + * Copyright (C) 2013-2014 Samsung Electronics Co., Ltd.

2013-2014? Is that right?

> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.
> + */
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include "ufshcd.h"
> +#include "ufshcd-pltfrm.h"
> +#include "ufs-exynos.h"
> +#include 
> +#include 
> +#include 

Please fix the ordering of the header files.

> +
> +/*
> + * Debugging information, SFR/attributes/misc
> + */
> +static struct exynos_ufs *ufs_host_backup[1];
> +static int ufs_host_index = 0;

Does this have to be global? Is there any other solution?
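
A minimal sketch of one alternative, in plain userspace C rather than kernel
code: keep the debug state in the per-controller private structure and reach it
through the host pointer, so that no file-scope array or index is needed. The
names hba_model, exynos_ufs_model, set_variant(), get_variant() and
note_sfr_dump() are hypothetical stand-ins; the analogous kernel helpers are
assumed here to be ufshcd_set_variant() and ufshcd_get_variant().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct ufs_hba: only the private pointer. */
struct hba_model {
	void *priv;
};

/* Hypothetical stand-in for struct exynos_ufs. */
struct exynos_ufs_model {
	struct hba_model *hba;
	unsigned int sfr_dumps;	/* per-instance debug state, not a global */
};

/* Attach the vendor-specific context to one controller instance. */
static void set_variant(struct hba_model *hba, struct exynos_ufs_model *ufs)
{
	assert(hba != NULL && ufs != NULL);
	ufs->hba = hba;
	hba->priv = ufs;
}

/* Look the context up again from the controller instance. */
static struct exynos_ufs_model *get_variant(struct hba_model *hba)
{
	assert(hba != NULL);
	return hba->priv;
}

/* A debug path reaches its state through hba, so instances stay separate. */
static void note_sfr_dump(struct hba_model *hba)
{
	struct exynos_ufs_model *ufs = get_variant(hba);

	ufs->sfr_dumps++;
}
```

With this layout a second controller instance simply gets its own context, and
nothing has to be indexed through a fixed-size global array.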

> +
> +static struct exynos_ufs_sfr_log ufs_log_std_sfr[] = {
> + {"CAPABILITIES" ,   REG_CONTROLLER_CAPABILITIES,   0},
> + {"UFS VERSION"  ,   REG_UFS_VERSION,   0},
> + {"PRODUCT ID"   ,   REG_CONTROLLER_DEV_ID,   0},
> + {"MANUFACTURE ID"   ,   REG_CONTROLLER_PROD_ID,   0},
> + {"INTERRUPT STATUS" ,   REG_INTERRUPT_STATUS,   0},
> + {"INTERRUPT ENABLE" ,   REG_INTERRUPT_ENABLE,   0},
> + {"CONTROLLER STATUS",   REG_CONTROLLER_STATUS,   0},
> + {"CONTROLLER ENABLE",   REG_CONTROLLER_ENABLE,   0},
> + {"UIC ERR PHY ADAPTER LAYER",   REG_UIC_ERROR_CODE_PHY_ADAPTER_LAYER,   0},
> + {"UIC ERR DATA LINK LAYER"  ,   REG_UIC_ERROR_CODE_DATA_LINK_LAYER,   0},
> + {"UIC ERR NETWORK LATER",   REG_UIC_ERROR_CODE_NETWORK_LAYER,   0},
> + {"UIC ERR TRANSPORT LAYER"  ,   REG_UIC_ERROR_CODE_TRANSPORT_LAYER,   0},
> + {"UIC ERR DME"  ,   REG_UIC_ERROR_CODE_DME,   0},
> + {"UTP TRANSF REQ INT AGG CNTRL" ,   REG_UTP_TRANSFER_REQ_INT_AGG_CONTROL,   0},
> + {"UTP TRANSF REQ LIST BASE L"   ,   REG_UTP_TRANSFER_REQ_LIST_BASE_L,   0},
> + {"UTP TRANSF REQ LIST BASE H"   ,   REG_UTP_TRANSFER_REQ_LIST_BASE_H,   0},
> + {"UTP TRANSF REQ DOOR BELL" ,   REG_UTP_TRANSFER_REQ_DOOR_BELL,   0},
> + {"UTP TRANSF REQ LIST CLEAR",   REG_UTP_TRANSFER_REQ_LIST_CLEAR,   0},
> + {"UTP TRANSF REQ LIST RUN STOP" , 

Re: [PATCH] libsas: flush pending destruct work in sas_unregister_domain_devices()

2017-11-28 Thread Johannes Thumshirn
On Mon, Nov 27, 2017 at 04:24:45PM -0800, Cong Wang wrote:
> We saw dozens of the following kernel warnings:
> 
>  WARNING: CPU: 0 PID: 705 at fs/sysfs/group.c:224 sysfs_remove_group+0x54/0x88()
>  sysfs group 81ab7670 not found for kobject '6:0:3:0'
>  Modules linked in: cpufreq_ondemand x86_pkg_temp_thermal coretemp kvm_intel 
> kvm microcode raid0 iTCO_wdt iTCO_vendor_support sb_edac edac_core lpc_ich 
> mfd_core ioatdma i2c_i801 shpchp wmi hed acpi_cpufreq lp parport tcp_diag 
> inet_diag ipmi_si ipmi_devintf ipmi_msghandler sch_fq_codel igb ptp pps_core 
> i2c_algo_bit i2c_core crc32c_intel isci libsas scsi_transport_sas dca ipv6
>  CPU: 0 PID: 705 Comm: kworker/u240:0 Not tainted 4.1.35.el7.x86_64 #1

This should be fixed by now with commit fbce4d97fd43 ("scsi: fixup kernel
warning during rmmod()"), which went into v4.14-rc6.

-- 
Johannes Thumshirn  Storage
jthumsh...@suse.de  +49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850


RE: [PATCH 2/2] scsi: ufs: add Exynos-specific driver

2017-11-28 Thread 김기웅
Hi.
This is modified from Seungwon's initial patch, and it has been used in
several commercial products.
I think it looks odd to you because the bulk of the UniPro and M-PHY settings
is not there.
Those settings have changed with every new product, so I didn't keep them in
this driver.


> -----Original Message-----
> From: linux-scsi-ow...@vger.kernel.org [mailto:linux-scsi-
> ow...@vger.kernel.org] On Behalf Of Jaehoon Chung
> Sent: Tuesday, November 28, 2017 5:20 PM
> To: 김기웅; linux-scsi@vger.kernel.org; Martin K. Petersen
> Cc: c...@samsung.com; HeonGwang Chu; 김부진; YOUNGEUN PARK
> Subject: Re: [PATCH 2/2] scsi: ufs: add Exynos-specific driver
> 
> Hi,
> 
> On 11/28/2017 02:36 PM, 김기웅 wrote:
> > This driver is to use UFS devices on Exynos SoC and has been already
> > used for many years for commercial products.
> 
> Well, i'm not sure but i remembered there are the similar patches
> before..Seungwon and Alim's patches.
> Is it relevant to them?
> 
> Anyway.. i think i can't test with only these patches..
> how did you test this patches?
> 
> >
> > Signed-off-by: Kiwoong Kim 
> > ---
> >  drivers/scsi/ufs/Kconfig  |  10 +
> >  drivers/scsi/ufs/Makefile |   1 +
> >  drivers/scsi/ufs/ufs-exynos.c | 962
> > ++
> >  drivers/scsi/ufs/ufs-exynos.h | 351 +++
> >  drivers/scsi/ufs/ufshcd.h |   1 +
> >  5 files changed, 1325 insertions(+)
> >  create mode 100644 drivers/scsi/ufs/ufs-exynos.c  create mode 100644
> > drivers/scsi/ufs/ufs-exynos.h
> 
> There is no binding file.
> 
> >
> > diff --git a/drivers/scsi/ufs/Kconfig b/drivers/scsi/ufs/Kconfig index
> > e27b4d4e6ae2..7d71ad8768c3 100644
> > --- a/drivers/scsi/ufs/Kconfig
> > +++ b/drivers/scsi/ufs/Kconfig
> > @@ -100,3 +100,13 @@ config SCSI_UFS_QCOM
> >
> >   Select this if you have UFS controller on QCOM chipset.
> >   If unsure, say N.
> > +
> > +config SCSI_UFS_EXYNOS
> > +   tristate "EXYNOS UFS Host Controller Driver"
> > +   depends on SCSI_UFSHCD && SCSI_UFSHCD_PLATFORM
> > +   ---help---
> > + This selects the EXYNOS UFS host controller driver.
> > +
> > + If you have a controller with this interface, say Y or M here.
> > +
> > + If unsure, say N.
> > diff --git a/drivers/scsi/ufs/Makefile b/drivers/scsi/ufs/Makefile
> > index 9310c6c83041..3312b052dcff 100644
> > --- a/drivers/scsi/ufs/Makefile
> > +++ b/drivers/scsi/ufs/Makefile
> > @@ -3,6 +3,7 @@
> >  obj-$(CONFIG_SCSI_UFS_DWC_TC_PCI) += tc-dwc-g210-pci.o ufshcd-dwc.o
> > tc-dwc-g210.o
> >  obj-$(CONFIG_SCSI_UFS_DWC_TC_PLATFORM) += tc-dwc-g210-pltfrm.o
> > ufshcd-dwc.o tc-dwc-g210.o
> >  obj-$(CONFIG_SCSI_UFS_QCOM) += ufs-qcom.o
> > +obj-$(CONFIG_SCSI_UFS_EXYNOS) += ufs-exynos.o
> >  obj-$(CONFIG_SCSI_UFSHCD) += ufshcd.o
> >  obj-$(CONFIG_SCSI_UFSHCD_PCI) += ufshcd-pci.o
> >  obj-$(CONFIG_SCSI_UFSHCD_PLATFORM) += ufshcd-pltfrm.o diff --git
> > a/drivers/scsi/ufs/ufs-exynos.c b/drivers/scsi/ufs/ufs-exynos.c new
> > file mode 100644 index ..98e5aeb80b06
> > --- /dev/null
> > +++ b/drivers/scsi/ufs/ufs-exynos.c
> > @@ -0,0 +1,962 @@
> > +/*
> > + * UFS Host Controller driver for Exynos specific extensions
> > + *
> > + * Copyright (C) 2013-2014 Samsung Electronics Co., Ltd.
> 
> 2013-2014? is it right?
> 
> > + *
> > + * This program is free software; you can redistribute it and/or
> > +modify
> > + * it under the terms of the GNU General Public License as published
> > +by
> > + * the Free Software Foundation; either version 2 of the License, or
> > + * (at your option) any later version.
> > + */
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include "ufshcd.h"
> > +#include "ufshcd-pltfrm.h"
> > +#include "ufs-exynos.h"
> > +#include 
> > +#include 
> > +#include 
> 
> ordering about header file.
> 
> > +
> > +/*
> > + * Debugging information, SFR/attributes/misc
> > + */
> > +static struct exynos_ufs *ufs_host_backup[1];
> > +static int ufs_host_index = 0;
> 
> It has to use the global? Is there any other solution?
> 
> > +
> > +static struct exynos_ufs_sfr_log ufs_log_std_sfr[] = {
> > +   {"CAPABILITIES" ,   REG_CONTROLLER_CAPABILITIES,
>   0},
> > +   {"UFS VERSION"  ,   REG_UFS_VERSION,
>   0},
> > +   {"PRODUCT ID"   ,   REG_CONTROLLER_DEV_ID,
>   0},
> > +   {"MANUFACTURE ID"   ,   REG_CONTROLLER_PROD_ID,
>   0},
> > +   {"INTERRUPT STATUS" ,   REG_INTERRUPT_STATUS,
>   0},
> > +   {"INTERRUPT ENABLE" ,   REG_INTERRUPT_ENABLE,
>   0},
> > +   {"CONTROLLER STATUS",   REG_CONTROLLER_STATUS,
>   0},
> > +   {"CONTROLLER ENABLE",   REG_CONTROLLER_ENABLE,
>   0},
> > +   {"UIC ERR PHY ADAPTER LAYER",
>   REG_UIC_ERROR_CODE_PHY_ADAPTER_LAYER,   0},
> > +   {"UIC ERR DATA LINK LAYER"  ,   
> > REG_UIC_ERROR_CODE_DATA_LINK_LAYER,
>   0},
> > +   {"UIC ERR NETWORK LATER"

Re: [PATCH] Ensure that the SCSI error handler gets woken up

2017-11-28 Thread Pavel Tikhomirov
First, Stuart, thanks for adding me to CC. Second, Bart, no idea why you
didn't? =)



> If scsi_eh_scmd_add() is called concurrently with scsi_host_queue_ready()
> while shost->host_blocked > 0 then it can happen that neither function
> wakes up the SCSI error handler. Fix this by making every function that
> decreases the host_busy counter wake up the error handler if necessary.


Bart, you added a comment about performance to my initial patch; let me
quote it here:

 > An important achievement of the scsi-mq code was removal of all
 > spin_lock_irq(shost->host_lock) statements from the hot path. The above
 > changes will have a significant negative performance impact, especially if
 > multiple LUNs associated with the same SCSI host are involved. Can the
 > reported race be fixed without slowing down the hot path significantly? I
 > think that both adding spin lock or smp_mb() calls in the hot path will
 > have a significant negative performance impact.

This was a tricky question, so I had no immediate answer. Here is one:

a) We need to check whether scsi_eh_wakeup needs to wake up the error
handler thread in every place where we change host_busy. Otherwise there is
a chance that such a change will break the wake-up check in other existing
places and lead to a deadlock.

b) If we have several variables, each changed in a different thread, and
afterwards we want to check their joint state, we surely need some kind of
memory barrier to get a consistent state at some point.

c) We already have spinlocks in scsi_schedule_eh, scsi_eh_scmd_add and
scsi_device_unbusy, so it seems a good idea to reuse them for the new
checks too.

I don't see another way to fix this problem.

Your patch takes the spinlock under a check which should itself be under the
spinlock, and breaks the initial fix (see Stuart's comment that the
problem still reproduces). And you do _not_ answer your own question.
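
The argument in (b) and (c) can be sketched in userspace C. This is a model,
not kernel code: struct host_model and its helpers are hypothetical, a pthread
mutex stands in for shost->host_lock, and "host_failed > 0" stands in for the
scsi_host_in_recovery() condition. The decrement of host_busy and the wake-up
check happen under the same lock, so the check cannot act on a stale
host_failed.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the Scsi_Host fields discussed above. */
struct host_model {
	pthread_mutex_t lock;	/* plays the role of shost->host_lock */
	bool lock_held;		/* lets eh_wakeup() mimic lockdep_assert_held() */
	atomic_int host_busy;	/* commands currently in flight */
	int host_failed;	/* failed commands, protected by lock */
	int wakeups;		/* how often the "error handler" was woken */
};

static void host_model_init(struct host_model *h, int busy)
{
	pthread_mutex_init(&h->lock, NULL);
	h->lock_held = false;
	atomic_init(&h->host_busy, busy);
	h->host_failed = 0;
	h->wakeups = 0;
}

static void host_lock(struct host_model *h)
{
	pthread_mutex_lock(&h->lock);
	h->lock_held = true;
}

static void host_unlock(struct host_model *h)
{
	h->lock_held = false;
	pthread_mutex_unlock(&h->lock);
}

/* Models scsi_eh_wakeup(): wake the handler once every busy command failed. */
static void eh_wakeup(struct host_model *h)
{
	assert(h->lock_held);
	if (atomic_load(&h->host_busy) == h->host_failed)
		h->wakeups++;
}

/* Models scsi_eh_scmd_add(): one command is handed to the error handler. */
static void mark_failed(struct host_model *h)
{
	host_lock(h);
	h->host_failed++;
	eh_wakeup(h);
	host_unlock(h);
}

/* Decrement and check under one lock; this is the variant argued for above. */
static void dec_host_busy(struct host_model *h)
{
	host_lock(h);
	atomic_fetch_sub(&h->host_busy, 1);
	if (h->host_failed > 0)
		eh_wakeup(h);
	host_unlock(h);
}
```

The cost is exactly the one raised in the quoted comment: the lock is taken on
every completion, not only on the rare recovery path.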




> Reported-by: Pavel Tikhomirov
> Fixes: commit 746650160866 ("scsi: convert host_busy to atomic_t")
> Signed-off-by: Bart Van Assche

As your patch is based on my initial patch
(https://patchwork.kernel.org/patch/9938919/), once all problems with it
are resolved, can you please add here:

Signed-off-by: Pavel Tikhomirov


> Cc: Konstantin Khorenko
> Cc: Stuart Hayes
> Cc: Christoph Hellwig
> Cc: Hannes Reinecke
> Cc: Johannes Thumshirn
> Cc:
> ---
>  drivers/scsi/scsi_error.c |  3 ++-
>  drivers/scsi/scsi_lib.c   | 22 ++
>  2 files changed, 16 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
> index 5e89049e9b4e..f7f014c755d7 100644
> --- a/drivers/scsi/scsi_error.c
> +++ b/drivers/scsi/scsi_error.c
> @@ -61,9 +61,10 @@ static int scsi_eh_try_stu(struct scsi_cmnd *scmd);
>  static int scsi_try_to_abort_cmd(struct scsi_host_template *,
>  				 struct scsi_cmnd *);
>  
> -/* called with shost->host_lock held */
>  void scsi_eh_wakeup(struct Scsi_Host *shost)
>  {
> +	lockdep_assert_held(shost->host_lock);
> +
>  	if (atomic_read(&shost->host_busy) == shost->host_failed) {
>  		trace_scsi_eh_wakeup(shost);
>  		wake_up_process(shost->ehandler);

I also think this hunk is just an additional precaution and should be in a
separate patch.



> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> index 1e05e1885ac8..abd37d77af2d 100644
> --- a/drivers/scsi/scsi_lib.c
> +++ b/drivers/scsi/scsi_lib.c
> @@ -318,22 +318,28 @@ static void scsi_init_cmd_errh(struct scsi_cmnd *cmd)
>  	cmd->cmd_len = scsi_command_size(cmd->cmnd);
>  }
>  
> -void scsi_device_unbusy(struct scsi_device *sdev)
> +static void scsi_dec_host_busy(struct Scsi_Host *shost)
>  {
> -	struct Scsi_Host *shost = sdev->host;
> -	struct scsi_target *starget = scsi_target(sdev);
>  	unsigned long flags;
>  
>  	atomic_dec(&shost->host_busy);
> -	if (starget->can_queue > 0)
> -		atomic_dec(&starget->target_busy);
> -
>  	if (unlikely(scsi_host_in_recovery(shost) &&
>  		     (shost->host_failed || shost->host_eh_scheduled))) {

As I wrote above, the locking here in scsi_dec_host_busy is wrong.
Note that the above check reads host_failed, and the load of host_failed
can happen before host_busy is decremented due to reordering.

>  		spin_lock_irqsave(shost->host_lock, flags);
>  		scsi_eh_wakeup(shost);
>  		spin_unlock_irqrestore(shost->host_lock, flags);
>  	}
> +}
> +
> +void scsi_device_unbusy(struct scsi_device *sdev)
> +{
> +	struct Scsi_Host *shost = sdev->host;
> +	struct scsi_target *starget = scsi_target(sdev);
> +
> +	scsi_dec_host_busy(shost);
> +
> +	if (starget->can_queue > 0)
> +		atomic_dec(&starget->target_busy);
>  
>  	atomic_dec(&sdev->device_busy);
>  }
> @@ -1532,7 +1538,7 @@ static inline int scsi_host_queue_ready(struct request_queue *q,
>  		list_add_tail(&sdev->starved_entry, &shost->starved_list);
sp

Re: [PATCH] Ensure that the SCSI error handler gets woken up

2017-11-28 Thread Pavel Tikhomirov

Sorry, missed the thread, resending.


Re: [PATCH] Ensure that the SCSI error handler gets woken up

2017-11-28 Thread Pavel Tikhomirov
Resending again, finally with a proper In-Reply-To. Sorry for my mailing
skills.




[PATCH v2 0/3] scsi: arcmsr: add driver module parameter - msi_enable, msix_enable

2017-11-28 Thread Ching Huang
From: Ching Huang 

The following patches apply to James' kernel/git/jejb/scsi.git/tree/?h=misc
and Martin's kernel/git/mkp/scsi.git/tree/?h=4.16/scsi-queue.

Patch 1: Add module parameter msi_enable so that the MSI interrupt can be
disabled if the controller has an MSI compatibility issue.

Patch 2: Add module parameter msix_enable so that the MSI-X interrupt can be
disabled if the controller has an MSI-X compatibility issue.

Patch 3: Update driver version to v1.40.00.03-20171124
---



Re: [PATCH 2/2] scsi: ufs: add Exynos-specific driver

2017-11-28 Thread Jaehoon Chung
On 11/28/2017 05:27 PM, 김기웅 wrote:
> Hi.
> This is modified from Seungwon's initial patch.
> And this has been used for several commercial products.
> I think you feel weird because you can't see a bunch of unipro and mphy.
> But those stuff has been changed whenever new product comes.
> So I didn't keep those in this driver.

The UniPro and M-PHY setting values can be obtained from the device tree for
each board variant, so they don't need to be kept in this driver. But this
patch contains no code that uses them.
Without that, this driver may end up as a dead driver.

Also, nobody would have any interest in this driver.

I also added some comments below.

Best Regards,
Jaehoon Chung

> 
> 
>> -----Original Message-----
>> From: linux-scsi-ow...@vger.kernel.org [mailto:linux-scsi-
>> ow...@vger.kernel.org] On Behalf Of Jaehoon Chung
>> Sent: Tuesday, November 28, 2017 5:20 PM
>> To: 김기웅; linux-scsi@vger.kernel.org; Martin K. Petersen
>> Cc: c...@samsung.com; HeonGwang Chu; 김부진; YOUNGEUN PARK
>> Subject: Re: [PATCH 2/2] scsi: ufs: add Exynos-specific driver
>>
>> Hi,
>>
>> On 11/28/2017 02:36 PM, 김기웅 wrote:
>>> This driver is to use UFS devices on Exynos SoC and has been already
>>> used for many years for commercial products.
>>
>> Well, i'm not sure but i remembered there are the similar patches
>> before..Seungwon and Alim's patches.
>> Is it relevant to them?
>>
>> Anyway.. i think i can't test with only these patches..
>> how did you test this patches?
>>
>>>
>>> Signed-off-by: Kiwoong Kim 
>>> ---
>>>  drivers/scsi/ufs/Kconfig  |  10 +
>>>  drivers/scsi/ufs/Makefile |   1 +
>>>  drivers/scsi/ufs/ufs-exynos.c | 962
>>> ++
>>>  drivers/scsi/ufs/ufs-exynos.h | 351 +++
>>>  drivers/scsi/ufs/ufshcd.h |   1 +
>>>  5 files changed, 1325 insertions(+)
>>>  create mode 100644 drivers/scsi/ufs/ufs-exynos.c  create mode 100644
>>> drivers/scsi/ufs/ufs-exynos.h
>>
>> There is no binding file.
>>
>>>
>>> diff --git a/drivers/scsi/ufs/Kconfig b/drivers/scsi/ufs/Kconfig index
>>> e27b4d4e6ae2..7d71ad8768c3 100644
>>> --- a/drivers/scsi/ufs/Kconfig
>>> +++ b/drivers/scsi/ufs/Kconfig
>>> @@ -100,3 +100,13 @@ config SCSI_UFS_QCOM
>>>
>>>   Select this if you have UFS controller on QCOM chipset.
>>>   If unsure, say N.
>>> +
>>> +config SCSI_UFS_EXYNOS
>>> +   tristate "EXYNOS UFS Host Controller Driver"
>>> +   depends on SCSI_UFSHCD && SCSI_UFSHCD_PLATFORM
>>> +   ---help---
>>> + This selects the EXYNOS UFS host controller driver.
>>> +
>>> + If you have a controller with this interface, say Y or M here.
>>> +
>>> + If unsure, say N.
>>> diff --git a/drivers/scsi/ufs/Makefile b/drivers/scsi/ufs/Makefile
>>> index 9310c6c83041..3312b052dcff 100644
>>> --- a/drivers/scsi/ufs/Makefile
>>> +++ b/drivers/scsi/ufs/Makefile
>>> @@ -3,6 +3,7 @@
>>>  obj-$(CONFIG_SCSI_UFS_DWC_TC_PCI) += tc-dwc-g210-pci.o ufshcd-dwc.o
>>> tc-dwc-g210.o
>>>  obj-$(CONFIG_SCSI_UFS_DWC_TC_PLATFORM) += tc-dwc-g210-pltfrm.o
>>> ufshcd-dwc.o tc-dwc-g210.o
>>>  obj-$(CONFIG_SCSI_UFS_QCOM) += ufs-qcom.o
>>> +obj-$(CONFIG_SCSI_UFS_EXYNOS) += ufs-exynos.o
>>>  obj-$(CONFIG_SCSI_UFSHCD) += ufshcd.o
>>>  obj-$(CONFIG_SCSI_UFSHCD_PCI) += ufshcd-pci.o
>>>  obj-$(CONFIG_SCSI_UFSHCD_PLATFORM) += ufshcd-pltfrm.o diff --git
>>> a/drivers/scsi/ufs/ufs-exynos.c b/drivers/scsi/ufs/ufs-exynos.c new
>>> file mode 100644 index ..98e5aeb80b06
>>> --- /dev/null
>>> +++ b/drivers/scsi/ufs/ufs-exynos.c
>>> @@ -0,0 +1,962 @@
>>> +/*
>>> + * UFS Host Controller driver for Exynos specific extensions
>>> + *
>>> + * Copyright (C) 2013-2014 Samsung Electronics Co., Ltd.
>>
>> 2013-2014? is it right?
>>
>>> + *
>>> + * This program is free software; you can redistribute it and/or
>>> +modify
>>> + * it under the terms of the GNU General Public License as published
>>> +by
>>> + * the Free Software Foundation; either version 2 of the License, or
>>> + * (at your option) any later version.
>>> + */
>>> +#include 
>>> +#include 
>>> +#include 
>>> +#include 
>>> +#include 
>>> +#include "ufshcd.h"
>>> +#include "ufshcd-pltfrm.h"
>>> +#include "ufs-exynos.h"
>>> +#include 
>>> +#include 
>>> +#include 
>>
>> ordering about header file.
>>
>>> +
>>> +/*
>>> + * Debugging information, SFR/attributes/misc
>>> + */
>>> +static struct exynos_ufs *ufs_host_backup[1];
>>> +static int ufs_host_index = 0;
>>
>> It has to use the global? Is there any other solution?
>>
>>> +
>>> +static struct exynos_ufs_sfr_log ufs_log_std_sfr[] = {
>>> +   {"CAPABILITIES" ,   REG_CONTROLLER_CAPABILITIES,
>>  0},
>>> +   {"UFS VERSION"  ,   REG_UFS_VERSION,
>>  0},
>>> +   {"PRODUCT ID"   ,   REG_CONTROLLER_DEV_ID,
>>  0},
>>> +   {"MANUFACTURE ID"   ,   REG_CONTROLLER_PROD_ID,
>>  0},
>>> +   {"INTERRUPT STATUS" ,   REG_INTERRUPT_STATUS,
>>  0},
>>> +   {"INTERRUPT ENABLE" ,   REG_INTERRUPT

[PATCH v2 1/3] scsi: arcmsr: Add driver module parameter msi_enable

2017-11-28 Thread Ching Huang
From: Ching Huang 

Add module parameter msi_enable so that the MSI interrupt can be disabled if
there is an MSI compatibility issue between the controller and the system.

Signed-off-by: Ching Huang 
---
diff -uprN a/drivers/scsi/arcmsr/arcmsr_hba.c b/drivers/scsi/arcmsr/arcmsr_hba.c
--- a/drivers/scsi/arcmsr/arcmsr_hba.c  2017-11-23 14:29:26.0 +0800
+++ b/drivers/scsi/arcmsr/arcmsr_hba.c  2017-11-24 15:16:20.0 +0800
@@ -75,6 +75,10 @@ MODULE_DESCRIPTION("Areca ARC11xx/12xx/1
 MODULE_LICENSE("Dual BSD/GPL");
 MODULE_VERSION(ARCMSR_DRIVER_VERSION);
 
+static int msi_enable = 1;
+module_param(msi_enable, int, S_IRUGO);
+MODULE_PARM_DESC(msi_enable, "Enable MSI interrupt(0 ~ 1), msi_enable=1(enable), =0(disable)");
+
 static int host_can_queue = ARCMSR_DEFAULT_OUTSTANDING_CMD;
 module_param(host_can_queue, int, S_IRUGO);
 MODULE_PARM_DESC(host_can_queue, " adapter queue depth(32 ~ 1024), default is 128");
@@ -831,11 +835,17 @@ arcmsr_request_irq(struct pci_dev *pdev,
pr_info("arcmsr%d: msi-x enabled\n", acb->host->host_no);
flags = 0;
} else {
-   nvec = pci_alloc_irq_vectors(pdev, 1, 1,
-   PCI_IRQ_MSI | PCI_IRQ_LEGACY);
+   if (msi_enable == 1) {
+   nvec = pci_alloc_irq_vectors(pdev, 1, 1, PCI_IRQ_MSI);
+   if (nvec == 1) {
+   dev_info(&pdev->dev, "msi enabled\n");
+   goto msi_int1;
+   }
+   }
+   nvec = pci_alloc_irq_vectors(pdev, 1, 1, PCI_IRQ_LEGACY);
if (nvec < 1)
return FAILED;
-
+msi_int1:
flags = IRQF_SHARED;
}
 




[PATCH v2 2/3] scsi: arcmsr: Add driver module parameter msix_enable

2017-11-28 Thread Ching Huang
From: Ching Huang 

Add module parameter msix_enable so that the MSI-X interrupt can be disabled
if there is an MSI-X compatibility issue between the controller and the
system.

Signed-off-by: Ching Huang 
---
diff -uprN a/drivers/scsi/arcmsr/arcmsr_hba.c b/drivers/scsi/arcmsr/arcmsr_hba.c
--- a/drivers/scsi/arcmsr/arcmsr_hba.c  2017-11-24 15:16:20.0 +0800
+++ b/drivers/scsi/arcmsr/arcmsr_hba.c  2017-11-24 15:17:46.0 +0800
@@ -75,6 +75,10 @@ MODULE_DESCRIPTION("Areca ARC11xx/12xx/1
 MODULE_LICENSE("Dual BSD/GPL");
 MODULE_VERSION(ARCMSR_DRIVER_VERSION);
 
+static int msix_enable = 1;
+module_param(msix_enable, int, S_IRUGO);
+MODULE_PARM_DESC(msix_enable, "Enable MSI-X interrupt(0 ~ 1), msix_enable=1(enable), =0(disable)");
+
 static int msi_enable = 1;
 module_param(msi_enable, int, S_IRUGO);
 MODULE_PARM_DESC(msi_enable, "Enable MSI interrupt(0 ~ 1), msi_enable=1(enable), =0(disable)");
@@ -829,12 +833,15 @@ arcmsr_request_irq(struct pci_dev *pdev,
unsigned long flags;
int nvec, i;
 
+   if (msix_enable == 0)
+   goto msi_int0;
nvec = pci_alloc_irq_vectors(pdev, 1, ARCMST_NUM_MSIX_VECTORS,
PCI_IRQ_MSIX);
if (nvec > 0) {
pr_info("arcmsr%d: msi-x enabled\n", acb->host->host_no);
flags = 0;
} else {
+msi_int0:
if (msi_enable == 1) {
nvec = pci_alloc_irq_vectors(pdev, 1, 1, PCI_IRQ_MSI);
if (nvec == 1) {




[PATCH v2 3/3] scsi: arcmsr: Update driver version to v1.40.00.03-20171124

2017-11-28 Thread Ching Huang
From: Ching Huang 

Update driver version to v1.40.00.03-20171124

Signed-off-by: Ching Huang 
---
diff -uprN a/drivers/scsi/arcmsr/arcmsr.h b/drivers/scsi/arcmsr/arcmsr.h
--- a/drivers/scsi/arcmsr/arcmsr.h  2017-11-23 14:29:46.0 +0800
+++ b/drivers/scsi/arcmsr/arcmsr.h  2017-11-24 12:07:00.0 +0800
@@ -54,7 +54,7 @@ struct device_attribute;
 #endif
 #define ARCMSR_DEFAULT_OUTSTANDING_CMD 128
 #define ARCMSR_MIN_OUTSTANDING_CMD 32
-#define ARCMSR_DRIVER_VERSION  "v1.40.00.02-20171011"
+#define ARCMSR_DRIVER_VERSION  "v1.40.00.03-20171124"
 #define ARCMSR_SCSI_INITIATOR_ID   255
 #define ARCMSR_MAX_XFER_SECTORS    512
 #define ARCMSR_MAX_XFER_SECTORS_B  4096




Re: [PATCH] libsas: flush pending destruct work in sas_unregister_domain_devices()

2017-11-28 Thread John Garry

On 28/11/2017 08:20, Johannes Thumshirn wrote:

On Mon, Nov 27, 2017 at 04:24:45PM -0800, Cong Wang wrote:

We saw dozens of the following kernel warning:

 WARNING: CPU: 0 PID: 705 at fs/sysfs/group.c:224 sysfs_remove_group+0x54/0x88()
 sysfs group 81ab7670 not found for kobject '6:0:3:0'
 Modules linked in: cpufreq_ondemand x86_pkg_temp_thermal coretemp kvm_intel 
kvm microcode raid0 iTCO_wdt iTCO_vendor_support sb_edac edac_core lpc_ich 
mfd_core ioatdma i2c_i801 shpchp wmi hed acpi_cpufreq lp parport tcp_diag 
inet_diag ipmi_si ipmi_devintf ipmi_msghandler sch_fq_codel igb ptp pps_core 
i2c_algo_bit i2c_core crc32c_intel isci libsas scsi_transport_sas dca ipv6
 CPU: 0 PID: 705 Comm: kworker/u240:0 Not tainted 4.1.35.el7.x86_64 #1


This should by now be fixed with commit fbce4d97fd43 ("scsi: fixup kernel
warning during rmmod()"), which went into v4.14-rc6.



Is that the same issue? I think Cong Wang is just trying to deal with 
the longstanding libsas hotplug WARN.


We at Huawei are still working to fix it. Our patchset is under internal 
test at the moment.


As for this patch:
>  drivers/scsi/libsas/sas_discover.c | 7 ++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/scsi/libsas/sas_discover.c 
b/drivers/scsi/libsas/sas_discover.c

> index 60de66252fa2..27c11fc7aa2b 100644
> --- a/drivers/scsi/libsas/sas_discover.c
> +++ b/drivers/scsi/libsas/sas_discover.c
> @@ -388,6 +388,11 @@ void sas_unregister_dev(struct asd_sas_port 
*port, struct domain_device *dev)

>}
>  }
>
> +static void sas_flush_work(struct asd_sas_port *port)
> +{
> +  scsi_flush_work(port->ha->core.shost);
> +}
> +
>  void sas_unregister_domain_devices(struct asd_sas_port *port, int gone)
>  {
>struct domain_device *dev, *n;
> @@ -401,8 +406,8 @@ void sas_unregister_domain_devices(struct 
asd_sas_port *port, int gone)

>list_for_each_entry_safe(dev, n, &port->disco_list, disco_list_node)
>sas_unregister_dev(port, dev);
>
> +  sas_flush_work(port);

How can this work as sas_unregister_domain_devices() may be called from 
the same workqueue which you're trying to flush?


>port->port->rphy = NULL;
> -
>  }
>
>  void sas_device_set_phy(struct domain_device *dev, struct sas_port 
*port)

>

Thanks,
John





Re: News UBSAN warnings in aacraid

2017-11-28 Thread Meelis Roos
> I think this chunk would solve the problem and result in the
> same behavior as before:
> 
> --- a/drivers/scsi/aacraid/commsup.c
> +++ b/drivers/scsi/aacraid/commsup.c
> @@ -2511,8 +2511,8 @@ int aac_command_thread(void *data)
> /* Synchronize our watches */
> if (((NSEC_PER_SEC - (NSEC_PER_SEC / HZ)) > 
> now.tv_nsec)
>  && (now.tv_nsec > (NSEC_PER_SEC / HZ)))
> -   difference = (((NSEC_PER_SEC -
> now.tv_nsec) * HZ)
> - + NSEC_PER_SEC / 2) / NSEC_PER_SEC;
> +   difference = HZ + HZ / 2 -
> +now.tv_nsec / (NSEC_PER_SEC / 
> HZ);
> else {
> if (now.tv_nsec > NSEC_PER_SEC / 2)
> ++now.tv_sec;
> 
> but I don't see why we add in half a second here. Any ideas?

I did not try to understand the details but I can confirm that this 
patch makes the warnings go away.

-- 
Meelis Roos (mr...@linux.ee)


[PATCH] scsi: aacraid: address UBSAN warning regression

2017-11-28 Thread Arnd Bergmann
As reported by Meelis Roos, my previous patch causes an incorrect
calculation of the timeout, through an undefined signed integer
overflow:

[   12.228155] UBSAN: Undefined behaviour in drivers/scsi/aacraid/commsup.c:2514:49
[   12.228229] signed integer overflow:
[   12.228283] 964297611 * 250 cannot be represented in type 'long int'

The problem is that doing a multiplication with HZ first and then
dividing by USEC_PER_SEC worked correctly for 32-bit microseconds,
but not for 32-bit nanoseconds, which would require up to 41 bits.

This reworks the calculation to first convert the nanoseconds into
jiffies, which should give us the same result as before and not overflow.

Unfortunately I did not understand the exact intention of the algorithm,
in particular the part where we add half a second, so it's possible that
there is still a preexisting problem in this function. I added a comment
that this would be handled more nicely using usleep_range(), which
generally works better for waking up at a particular time than the
current schedule_timeout() based implementation. I did not feel
comfortable trying to implement that without being sure what the
intent is here though.

Fixes: 820f18865912 ("scsi: aacraid: use timespec64 instead of timeval")
Tested-by: Meelis Roos 
Signed-off-by: Arnd Bergmann 
---
 drivers/scsi/aacraid/commsup.c | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/aacraid/commsup.c b/drivers/scsi/aacraid/commsup.c
index 525a652dab48..cc8fdefaebb6 100644
--- a/drivers/scsi/aacraid/commsup.c
+++ b/drivers/scsi/aacraid/commsup.c
@@ -2511,8 +2511,8 @@ int aac_command_thread(void *data)
/* Synchronize our watches */
if (((NSEC_PER_SEC - (NSEC_PER_SEC / HZ)) > now.tv_nsec)
 && (now.tv_nsec > (NSEC_PER_SEC / HZ)))
-   difference = (((NSEC_PER_SEC - now.tv_nsec) * HZ)
- + NSEC_PER_SEC / 2) / NSEC_PER_SEC;
+   difference = HZ + HZ / 2 -
+now.tv_nsec / (NSEC_PER_SEC / HZ);
else {
if (now.tv_nsec > NSEC_PER_SEC / 2)
++now.tv_sec;
@@ -2536,6 +2536,10 @@ int aac_command_thread(void *data)
if (kthread_should_stop())
break;
 
+   /*
+* we probably want usleep_range() here instead of the
+* jiffies computation
+*/
schedule_timeout(difference);
 
if (kthread_should_stop())
-- 
2.9.0
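
For readers following the arithmetic in the commit message, here is a
small user-space sketch of the overflow, using the values from the UBSAN
report (964297611 * 250, i.e. tv_nsec = 35702389 and HZ = 250). The
helper names old_product(), fits_in_s32() and new_difference() are made
up for this illustration and do not exist in the driver. The sketch only
shows where the old expression leaves the 32-bit range and that the
reworked expression does not; it makes no claim about whether the two
expressions produce the same timeout.

```c
#include <stdint.h>

#define NSEC_PER_SEC INT64_C(1000000000)

/*
 * Intermediate product of the old expression, computed in 64 bits so
 * that the sketch itself cannot overflow.
 */
static int64_t old_product(int64_t tv_nsec, int64_t hz)
{
	return (NSEC_PER_SEC - tv_nsec) * hz;
}

/* Would this value survive in a 32-bit signed 'long'? */
static int fits_in_s32(int64_t v)
{
	return v >= INT32_MIN && v <= INT32_MAX;
}

/*
 * Reworked expression from the patch: nanoseconds are converted to
 * jiffies by a division first, so no intermediate value is larger
 * than NSEC_PER_SEC.
 */
static int64_t new_difference(int64_t tv_nsec, int64_t hz)
{
	return hz + hz / 2 - tv_nsec / (NSEC_PER_SEC / hz);
}
```

With the reported values, old_product(35702389, 250) is 241074402750,
which does not fit in 32 bits, while new_difference(35702389, 250) stays
well inside that range.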



Re: News UBSAN warnings in aacraid

2017-11-28 Thread Arnd Bergmann
On Tue, Nov 28, 2017 at 2:05 PM, Meelis Roos  wrote:
>> I think this chunk would solve the problem and result in the
>> same behavior as before:
>>
>> --- a/drivers/scsi/aacraid/commsup.c
>> +++ b/drivers/scsi/aacraid/commsup.c
>> @@ -2511,8 +2511,8 @@ int aac_command_thread(void *data)
>> /* Synchronize our watches */
>> if (((NSEC_PER_SEC - (NSEC_PER_SEC / HZ)) > 
>> now.tv_nsec)
>>  && (now.tv_nsec > (NSEC_PER_SEC / HZ)))
>> -   difference = (((NSEC_PER_SEC -
>> now.tv_nsec) * HZ)
>> - + NSEC_PER_SEC / 2) / NSEC_PER_SEC;
>> +   difference = HZ + HZ / 2 -
>> +now.tv_nsec / (NSEC_PER_SEC / 
>> HZ);
>> else {
>> if (now.tv_nsec > NSEC_PER_SEC / 2)
>> ++now.tv_sec;
>>
>> but I don't see why we add in half a second here. Any ideas?
>
> I did not try to understand the details but I can confirm that this
> patch makes the warnings go away.

Thanks for testing! I've written it up as a proper patch now, and tried
to capture what I understand about this code and how I got to
the new change.

   Arnd


Re: [PATCH v2 1/3] scsi: arcmsr: Add driver module parameter msi_enable

2017-11-28 Thread Christoph Hellwig
On Tue, Nov 28, 2017 at 09:28:44AM +0800, Ching Huang wrote:
> From: Ching Huang 
> 
> Add module parameter msi_enable so that the MSI interrupt can be disabled
> if there is an MSI compatibility issue between the controller and the
> system.

If there is a system issue the system will need a quirk, and not every
driver.


Re: [PATCH 2/2] scsi: ufs: add Exynos-specific driver

2017-11-28 Thread Christoph Hellwig
On Tue, Nov 28, 2017 at 02:36:31PM +0900, 김기웅 wrote:
> This driver is to use UFS devices on Exynos SoC and
> has been already used for many years for commercial products.

So why are you only submitting it now?

> +/* Helper for UFS CAL interface */
> +static inline int ufs_init_cal(struct exynos_ufs *ufs, int idx,
> + struct platform_device *pdev)
> +{
> + return 0;
> +}
> +
> +static inline int ufs_pre_link(struct exynos_ufs *ufs)
> +{
> + return 0;
> +}
> +
> +static inline int ufs_post_link(struct exynos_ufs *ufs)
> +{
> + return 0;
> +}
> +
> +static inline int ufs_pre_gear_change(struct exynos_ufs *ufs,
> + struct uic_pwr_mode *pmd)
> +{
> + return 0;
> +}
> +
> +static inline int ufs_post_gear_change(struct exynos_ufs *ufs)
> +{
> + return 0;
> +}
> +
> +static inline int ufs_post_h8_enter(struct exynos_ufs *ufs)
> +{
> + return 0;
> +}
> +
> +static inline int ufs_pre_h8_exit(struct exynos_ufs *ufs)
> +{
> + return 0;
> +}

These are all dummies, please remove them.

> +#ifndef __EXYNOS_UFS_VS_DEBUG__

Please don't have ifdef code that isn't Kconfig selectable.

> +#ifndef __EXYNOS_UFS_MMIO_FUNC__
> +#define __EXYNOS_UFS_MMIO_FUNC__
> +#define EXYNOS_UFS_MMIO_FUNC(name)                                          \
> +static inline void name##_writel(struct exynos_ufs *ufs, u32 val, u32 reg)  \
> +{                                                                           \
> + writel(val, ufs->reg_##name + reg);                                        \
> +}                                                                           \
> +                                                                            \
> +static inline u32 name##_readl(struct exynos_ufs *ufs, u32 reg)             \
> +{                                                                           \
> + return readl(ufs->reg_##name + reg);                                       \
> +}
> +
> +EXYNOS_UFS_MMIO_FUNC(hci);
> +EXYNOS_UFS_MMIO_FUNC(unipro);

Please remove this macro magic.

> diff --git a/drivers/scsi/ufs/ufshcd.h b/drivers/scsi/ufs/ufshcd.h
> index 1332e544da92..1afd5ac9707c 100644
> --- a/drivers/scsi/ufs/ufshcd.h
> +++ b/drivers/scsi/ufs/ufshcd.h
> @@ -308,6 +308,7 @@ struct ufs_hba_variant_ops {
>   int (*setup_clocks)(struct ufs_hba *, bool,
>   enum ufs_notify_change_status);
>   int (*setup_regulators)(struct ufs_hba *, bool);
> + void(*host_reset)(struct ufs_hba *);
>   int (*hce_enable_notify)(struct ufs_hba *,
>enum ufs_notify_change_status);
>   int (*link_startup_notify)(struct ufs_hba *,

New ufs core methods should be added in a separate patch with 
a good description, and also with actual callers using them.


[PATCH] bfa: fix access to bfad_im_port_s

2017-11-28 Thread Johannes Thumshirn
Commit 'cd21c605b2cf ("scsi: fc: provide fc_bsg_to_shost() helper")'
changed access to bfa's 'struct bfad_im_port_s' by using shost_priv()
instead of shost->hostdata[0].

This led to crashes like in the following back-trace:

task: 880046375300 ti: 8800a2ef8000 task.ti: 8800a2ef8000
RIP: e030:[]  [] 
bfa_fcport_get_attr+0x82/0x260 [bfa]
RSP: e02b:8800a2efba10  EFLAGS: 00010046
RAX: 575f415441536432 RBX: 8800a2efba28 RCX: 
RDX:  RSI: 8800a2efba28 RDI: 880004dc31d8
RBP: 880004dc31d8 R08:  R09: 0001
R10: 88011fadc468 R11: 0001 R12: 880004dc31f0
R13: 0200 R14: 880004dc61d0 R15: 880004947a10
FS:  7feb1e489700() GS:88011fac() knlGS:
CS:  e033 DS:  ES:  CR0: 8005003b
CR2: 7ffe14e46c10 CR3: 957b8000 CR4: 0660
Stack:
 88001d4da000 880004dc31c0 a048a9df 81e56380
    
[] bfad_iocmd_ioc_get_info+0x4f/0x220 [bfa]
[] bfad_iocmd_handler+0xa00/0xd40 [bfa]
[] bfad_im_bsg_request+0xee/0x1b0 [bfa]
[] fc_bsg_dispatch+0x10b/0x1b0 [scsi_transport_fc]
[] bsg_request_fn+0x11d/0x1c0
[] __blk_run_queue+0x2f/0x40
[] blk_execute_rq_nowait+0xa8/0x160
[] blk_execute_rq+0x77/0x120
[] bsg_ioctl+0x1b6/0x200
[] do_vfs_ioctl+0x2cd/0x4a0
[] SyS_ioctl+0x74/0x80
[] entry_SYSCALL_64_fastpath+0x12/0x6d

Fixes: cd21c605b2cf ("scsi: fc: provide fc_bsg_to_shost() helper")
Signed-off-by: Johannes Thumshirn 
Cc: Michal Koutný 
---
 drivers/scsi/bfa/bfad_bsg.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/bfa/bfad_bsg.c b/drivers/scsi/bfa/bfad_bsg.c
index 72ca2a2e08e2..09ef68c8225f 100644
--- a/drivers/scsi/bfa/bfad_bsg.c
+++ b/drivers/scsi/bfa/bfad_bsg.c
@@ -3135,7 +3135,8 @@ bfad_im_bsg_vendor_request(struct bsg_job *job)
struct fc_bsg_request *bsg_request = job->request;
struct fc_bsg_reply *bsg_reply = job->reply;
uint32_t vendor_cmd = bsg_request->rqst_data.h_vendor.vendor_cmd[0];
-   struct bfad_im_port_s *im_port = shost_priv(fc_bsg_to_shost(job));
+   struct Scsi_Host *shost = fc_bsg_to_shost(job);
+   struct bfad_im_port_s *im_port = shost->hostdata[0];
struct bfad_s *bfad = im_port->bfad;
void *payload_kbuf;
int rc = -EINVAL;
@@ -3350,7 +3351,8 @@ int
 bfad_im_bsg_els_ct_request(struct bsg_job *job)
 {
struct bfa_bsg_data *bsg_data;
-   struct bfad_im_port_s *im_port = shost_priv(fc_bsg_to_shost(job));
+   struct Scsi_Host *shost = fc_bsg_to_shost(job);
+   struct bfad_im_port_s *im_port = shost->hostdata[0];
struct bfad_s *bfad = im_port->bfad;
bfa_bsg_fcpt_t *bsg_fcpt;
struct bfad_fcxp*drv_fcxp;
-- 
2.13.6
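
The difference between the two accessors can be shown with a user-space
mock. This is only a sketch: struct mock_host, struct mock_im_port,
mock_shost_priv() and mock_host_alloc() are hypothetical stand-ins, and it
assumes that the driver stores a pointer in the first hostdata word, as
the use of shost->hostdata[0] in the patch suggests. The mock uses
uintptr_t where the kernel uses unsigned long.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the driver's per-port structure. */
struct mock_im_port {
	int id;
};

/* Hypothetical stand-in for a host with a trailing private area. */
struct mock_host {
	int host_no;
	uintptr_t hostdata[];	/* private area follows the host itself */
};

/*
 * Same idea as shost_priv(): hand back the address of the private
 * area, not whatever happens to be stored in it.
 */
static void *mock_shost_priv(struct mock_host *h)
{
	return (void *)h->hostdata;
}

/* Allocate a host whose private area is one word holding a pointer. */
static struct mock_host *mock_host_alloc(struct mock_im_port *port)
{
	struct mock_host *h = malloc(sizeof(*h) + sizeof(uintptr_t));

	if (!h)
		return NULL;
	h->host_no = 0;
	h->hostdata[0] = (uintptr_t)port;	/* pointer stored by value */
	return h;
}
```

In this mock, reading hostdata[0] yields the stored port pointer, whereas
mock_shost_priv() yields the address of the slot that holds it. Treating
the latter as the port structure reads unrelated memory, which would be
consistent with the kind of crash shown in the back-trace.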



Re: [PATCH] Ensure that the SCSI error handler gets woken up

2017-11-28 Thread Bart Van Assche
On Tue, 2017-11-28 at 12:04 +0300, Pavel Tikhomirov wrote:
> 1-st, Stuart - thanks for adding me to CC, 2-nd Bart - no idea why you 
> didn't? =)

That must have been an oversight. Sorry that I had not added you to the
Cc-list. I will add you to the Cc-list when I post the next version of this
patch.

> I don't see another way to fix this problem.

That doesn't mean that there is no other way :-) I implemented an alternative
approach yesterday and I have started testing it. I will post that patch as
soon as my tests have finished.

Bart.

Re: [PATCH] bfa: fix access to bfad_im_port_s

2017-11-28 Thread Hannes Reinecke
On 11/28/2017 04:26 PM, Johannes Thumshirn wrote:
> Commit 'cd21c605b2cf ("scsi: fc: provide fc_bsg_to_shost() helper")'
> changed access to bfa's 'struct bfad_im_port_s' by using shost_priv()
> instead of shost->hostdata[0].
> 
> This led to crashes like in the following back-trace:
> 
> task: 880046375300 ti: 8800a2ef8000 task.ti: 8800a2ef8000
> RIP: e030:[]  [] 
> bfa_fcport_get_attr+0x82/0x260 [bfa]
> RSP: e02b:8800a2efba10  EFLAGS: 00010046
> RAX: 575f415441536432 RBX: 8800a2efba28 RCX: 
> RDX:  RSI: 8800a2efba28 RDI: 880004dc31d8
> RBP: 880004dc31d8 R08:  R09: 0001
> R10: 88011fadc468 R11: 0001 R12: 880004dc31f0
> R13: 0200 R14: 880004dc61d0 R15: 880004947a10
> FS:  7feb1e489700() GS:88011fac() knlGS:
> CS:  e033 DS:  ES:  CR0: 8005003b
> CR2: 7ffe14e46c10 CR3: 957b8000 CR4: 0660
> Stack:
>  88001d4da000 880004dc31c0 a048a9df 81e56380
>     
> [] bfad_iocmd_ioc_get_info+0x4f/0x220 [bfa]
> [] bfad_iocmd_handler+0xa00/0xd40 [bfa]
> [] bfad_im_bsg_request+0xee/0x1b0 [bfa]
> [] fc_bsg_dispatch+0x10b/0x1b0 [scsi_transport_fc]
> [] bsg_request_fn+0x11d/0x1c0
> [] __blk_run_queue+0x2f/0x40
> [] blk_execute_rq_nowait+0xa8/0x160
> [] blk_execute_rq+0x77/0x120
> [] bsg_ioctl+0x1b6/0x200
> [] do_vfs_ioctl+0x2cd/0x4a0
> [] SyS_ioctl+0x74/0x80
> [] entry_SYSCALL_64_fastpath+0x12/0x6d
> 
> Fixes: cd21c605b2cf ("scsi: fc: provide fc_bsg_to_shost() helper")
> Signed-off-by: Johannes Thumshirn 
> Cc: Michal Koutný 
> ---
>  drivers/scsi/bfa/bfad_bsg.c | 6 --
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/scsi/bfa/bfad_bsg.c b/drivers/scsi/bfa/bfad_bsg.c
> index 72ca2a2e08e2..09ef68c8225f 100644
> --- a/drivers/scsi/bfa/bfad_bsg.c
> +++ b/drivers/scsi/bfa/bfad_bsg.c
> @@ -3135,7 +3135,8 @@ bfad_im_bsg_vendor_request(struct bsg_job *job)
>   struct fc_bsg_request *bsg_request = job->request;
>   struct fc_bsg_reply *bsg_reply = job->reply;
>   uint32_t vendor_cmd = bsg_request->rqst_data.h_vendor.vendor_cmd[0];
> - struct bfad_im_port_s *im_port = shost_priv(fc_bsg_to_shost(job));
> + struct Scsi_Host *shost = fc_bsg_to_shost(job);
> + struct bfad_im_port_s *im_port = shost->hostdata[0];
>   struct bfad_s *bfad = im_port->bfad;
>   void *payload_kbuf;
>   int rc = -EINVAL;
> @@ -3350,7 +3351,8 @@ int
>  bfad_im_bsg_els_ct_request(struct bsg_job *job)
>  {
>   struct bfa_bsg_data *bsg_data;
> - struct bfad_im_port_s *im_port = shost_priv(fc_bsg_to_shost(job));
> + struct Scsi_Host *shost = fc_bsg_to_shost(job);
> + struct bfad_im_port_s *im_port = shost->hostdata[0];
>   struct bfad_s *bfad = im_port->bfad;
>   bfa_bsg_fcpt_t *bsg_fcpt;
>   struct bfad_fcxp*drv_fcxp;
> 
Reviewed-by: Hannes Reinecke 

Cheers,

Hannes
-- 
Dr. Hannes Reinecke            Teamlead Storage & Networking
h...@suse.de   +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)


Re: [PATCH] scsi_devinfo: cleanly zero-pad devinfo strings

2017-11-28 Thread Bart Van Assche
On Mon, 2017-11-27 at 23:47 +0100, Martin Wilck wrote:
> + /* This zero-pads the destination */
> + strncpy(to, from, to_length);
> + if (from_length < to_length && !compatible)
> + /*
> +  * space pad the string if it is short.
> +  */
> + memset(&to[from_length], ' ', to_length - from_length);

Since the code block controlled by the if-statement consists of multiple
lines, shouldn't that block be surrounded by braces ({})? Anyway:

Reviewed-by: Bart Van Assche 
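
The padding behaviour in the quoted hunk can be tried out in user space
with the sketch below. copy_padded() is a hypothetical helper modelled on
the hunk, not the kernel function. It takes the source length from
strlen() for simplicity and puts braces around the multi-line block, as
suggested above.

```c
#include <string.h>

/*
 * Hypothetical helper modelled on the quoted hunk. The result is a
 * fixed-width field and is not necessarily NUL-terminated.
 */
static void copy_padded(char *to, size_t to_length, const char *from,
			int compatible)
{
	size_t from_length = strlen(from);

	/* strncpy() fills the rest of 'to' with NUL bytes if 'from' is short */
	strncpy(to, from, to_length);
	if (from_length < to_length && !compatible) {
		/* space pad the string if it is short */
		memset(&to[from_length], ' ', to_length - from_length);
	}
}
```

With an 8-byte destination and the source "abc", the compatible case
leaves "abc" followed by five NUL bytes, and the non-compatible case
leaves "abc" followed by five spaces.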

Re: [PATCH] scsi: scsi_devinfo: handle non-terminated strings

2017-11-28 Thread Bart Van Assche
On Mon, 2017-11-27 at 23:47 +0100, Martin Wilck wrote:
> devinfo->vendor and devinfo->model aren't necessarily
> zero-terminated.

Reviewed-by: Bart Van Assche 

Re: [PATCH] libsas: flush pending destruct work in sas_unregister_domain_devices()

2017-11-28 Thread Cong Wang
On Tue, Nov 28, 2017 at 12:20 AM, Johannes Thumshirn  wrote:
> On Mon, Nov 27, 2017 at 04:24:45PM -0800, Cong Wang wrote:
>> We saw dozens of the following kernel warning:
>>
>>  WARNING: CPU: 0 PID: 705 at fs/sysfs/group.c:224 
>> sysfs_remove_group+0x54/0x88()
>>  sysfs group 81ab7670 not found for kobject '6:0:3:0'
>>  Modules linked in: cpufreq_ondemand x86_pkg_temp_thermal coretemp kvm_intel 
>> kvm microcode raid0 iTCO_wdt iTCO_vendor_support sb_edac edac_core lpc_ich 
>> mfd_core ioatdma i2c_i801 shpchp wmi hed acpi_cpufreq lp parport tcp_diag 
>> inet_diag ipmi_si ipmi_devintf ipmi_msghandler sch_fq_codel igb ptp pps_core 
>> i2c_algo_bit i2c_core crc32c_intel isci libsas scsi_transport_sas dca ipv6
>>  CPU: 0 PID: 705 Comm: kworker/u240:0 Not tainted 4.1.35.el7.x86_64 #1
>
> This should by now be fixed with commit fbce4d97fd43 ("scsi: fixup kernel
> warning during rmmod()"), which went into v4.14-rc6.

I don't see the full backtrace in commit fbce4d97fd43, but it is probably
not rmmod path in our case.


Re: [PATCH] libsas: flush pending destruct work in sas_unregister_domain_devices()

2017-11-28 Thread Cong Wang
On Tue, Nov 28, 2017 at 3:18 AM, John Garry  wrote:
> On 28/11/2017 08:20, Johannes Thumshirn wrote:
>>
>> On Mon, Nov 27, 2017 at 04:24:45PM -0800, Cong Wang wrote:
>>>
>>> We saw dozens of the following kernel warning:
>>>
>>>  WARNING: CPU: 0 PID: 705 at fs/sysfs/group.c:224
>>> sysfs_remove_group+0x54/0x88()
>>>  sysfs group 81ab7670 not found for kobject '6:0:3:0'
>>>  Modules linked in: cpufreq_ondemand x86_pkg_temp_thermal coretemp
>>> kvm_intel kvm microcode raid0 iTCO_wdt iTCO_vendor_support sb_edac edac_core
>>> lpc_ich mfd_core ioatdma i2c_i801 shpchp wmi hed acpi_cpufreq lp parport
>>> tcp_diag inet_diag ipmi_si ipmi_devintf ipmi_msghandler sch_fq_codel igb ptp
>>> pps_core i2c_algo_bit i2c_core crc32c_intel isci libsas scsi_transport_sas
>>> dca ipv6
>>>  CPU: 0 PID: 705 Comm: kworker/u240:0 Not tainted 4.1.35.el7.x86_64 #1
>>
>>
>> This should by now be fixed with commit fbce4d97fd43 ("scsi: fixup kernel
>> warning during rmmod()"), which went into v4.14-rc6.
>>
>
> Is that the same issue? I think Cong Wang is just trying to deal with the
> longstanding libsas hotplug WARN.

Right, we saw it on both 4.1 and 3.14, clearly an old bug.


>
> We at Huawei are still working to fix it. Our patchset is under internal
> test at the moment.
>
> As for this patch:
>>  drivers/scsi/libsas/sas_discover.c | 7 ++-
>>  1 file changed, 6 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/scsi/libsas/sas_discover.c
>> b/drivers/scsi/libsas/sas_discover.c
>> index 60de66252fa2..27c11fc7aa2b 100644
>> --- a/drivers/scsi/libsas/sas_discover.c
>> +++ b/drivers/scsi/libsas/sas_discover.c
>> @@ -388,6 +388,11 @@ void sas_unregister_dev(struct asd_sas_port *port,
>> struct domain_device *dev)
>>   }
>>  }
>>
>> +static void sas_flush_work(struct asd_sas_port *port)
>> +{
>> + scsi_flush_work(port->ha->core.shost);
>> +}
>> +
>>  void sas_unregister_domain_devices(struct asd_sas_port *port, int gone)
>>  {
>>   struct domain_device *dev, *n;
>> @@ -401,8 +406,8 @@ void sas_unregister_domain_devices(struct asd_sas_port
>> *port, int gone)
>>   list_for_each_entry_safe(dev, n, &port->disco_list, disco_list_node)
>>   sas_unregister_dev(port, dev);
>>
>> + sas_flush_work(port);
>
> How can this work as sas_unregister_domain_devices() may be called from the
> same workqueue which you're trying to flush?


I don't understand, the only caller of sas_unregister_domain_devices()
is sas_deform_port().


[PATCH] scsi: wd719x: make card_types static const, shrinks object size

2017-11-28 Thread Colin King
From: Colin Ian King 

Don't populate the read-only array card_types on the stack but instead
make it static and constify it. Makes the object code smaller by over
110 bytes:

Before:
   text    data     bss     dec     hex filename
  25625    5752       0   31377    7a91 drivers/scsi/wd719x.o

After:
   text    data     bss     dec     hex filename
  25447    5816       0   31263    7a1f drivers/scsi/wd719x.o

(gcc version 7.2.0 x86_64)

Signed-off-by: Colin Ian King 
---
 drivers/scsi/wd719x.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/wd719x.c b/drivers/scsi/wd719x.c
index 2a9da2e0ea6b..2ba2b7b47f41 100644
--- a/drivers/scsi/wd719x.c
+++ b/drivers/scsi/wd719x.c
@@ -803,7 +803,9 @@ static enum wd719x_card_type wd719x_detect_type(struct 
wd719x *wd)
 static int wd719x_board_found(struct Scsi_Host *sh)
 {
struct wd719x *wd = shost_priv(sh);
-   char *card_types[] = { "Unknown card", "WD7193", "WD7197", "WD7296" };
+   static const char * const card_types[] = {
+   "Unknown card", "WD7193", "WD7197", "WD7296"
+   };
int ret;
 
INIT_LIST_HEAD(&wd->active_scbs);
-- 
2.14.1
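
The same pattern as a stand-alone sketch, for anyone who wants to compare
the generated code outside the driver. card_type_name() is a hypothetical
helper written for this illustration; in the driver the array sits
directly in wd719x_board_found(). The range check is also an addition of
the sketch, not something the patch does.

```c
#include <stddef.h>

/* Hypothetical lookup helper built around the array from the patch. */
static const char *card_type_name(size_t type)
{
	/*
	 * static const: a single read-only copy of the table, instead
	 * of an array that is populated on the stack on every call.
	 */
	static const char * const card_types[] = {
		"Unknown card", "WD7193", "WD7197", "WD7296"
	};

	/* sketch-only choice: out-of-range indexes map to entry 0 */
	if (type >= sizeof(card_types) / sizeof(card_types[0]))
		type = 0;
	return card_types[type];
}
```

Dropping "static" from the declaration and comparing the output of
size(1) for the two builds should show a difference of the kind reported
above, although the exact numbers depend on compiler and architecture.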



[PATCH 01/22] qla2xxx: Fix system crash for Notify ack timeout handling

2017-11-28 Thread Himanshu Madhani
From: Quinn Tran 

Fix NULL pointer crash due to missing timeout handling callback
for Notify Ack IOCB.

Fixes: 726b85487067d ("qla2xxx: Add framework for async fabric discovery")
Cc:  # 4.10+
Signed-off-by: Quinn Tran 
Signed-off-by: Himanshu Madhani 
---
 drivers/scsi/qla2xxx/qla_target.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/qla2xxx/qla_target.c 
b/drivers/scsi/qla2xxx/qla_target.c
index 18069edd4773..1259ec85ec0a 100644
--- a/drivers/scsi/qla2xxx/qla_target.c
+++ b/drivers/scsi/qla2xxx/qla_target.c
@@ -665,7 +665,7 @@ int qla24xx_async_notify_ack(scsi_qla_host_t *vha, 
fc_port_t *fcport,
qla2x00_init_timer(sp, qla2x00_get_async_timeout(vha)+2);
 
sp->u.iocb_cmd.u.nack.ntfy = ntfy;
-
+   sp->u.iocb_cmd.timeout = qla2x00_async_iocb_timeout;
sp->done = qla2x00_async_nack_sp_done;
 
rval = qla2x00_start_sp(sp);
-- 
2.12.0



[PATCH 00/22] qla2xxx: Bug fixes for 4.15-rc2

2017-11-28 Thread Himanshu Madhani
Hi Martin,

This series contains bug fixes discovered during error handling test
cases for large fabric.

Please apply this series to 4.15-rc2 at your earliest convenience.

Thanks,
Himanshu

Giridhar Malavali (2):
  qla2xxx: Defer processing of GS IOCB calls
  qla2xxx: Remove aborting ELS IOCB call issued as part of timeout.

Himanshu Madhani (2):
  qla2xxx: Fix memory leak in dual/target mode
  qla2xxx: Update driver version to 10.00.00.03-k

Quinn Tran (17):
  qla2xxx: Fix system crash for Notify ack timeout handling
  qla2xxx: Fix gpnid error processing
  qla2xxx: Move session delete to driver work queue
  qla2xxx: Skip IRQ affinity for Target QPairs
  qla2xxx: Fix re-login for Nport Handle in use
  qla2xxx: Retry switch command on time out
  qla2xxx: Serialize GPNID for multiple RSCN
  qla2xxx: Fix login state machine stuck at GPDB
  qla2xxx: Relogin to target port on a cable swap
  qla2xxx: Fix Relogin being triggered too fast
  qla2xxx: Clear send ELS LOGO flag after target re-login
  qla2xxx: Fix PRLI state check
  qla2xxx: Fix nested spinlock
  qla2xxx: Replace fcport alloc with qla2x00_alloc_fcport
  qla2xxx: Fix scan state field for fcport
  qla2xxx: Clear loop id after delete
  qla2xxx: Fix system crash in qlt_plogi_ack_unref

Sawan Chandak (1):
  qla2xxx: Fix NPIV host cleanup in target mode

 drivers/scsi/qla2xxx/qla_def.h |  49 
 drivers/scsi/qla2xxx/qla_gs.c  | 230 ++---
 drivers/scsi/qla2xxx/qla_init.c|  69 +--
 drivers/scsi/qla2xxx/qla_iocb.c|  13 ---
 drivers/scsi/qla2xxx/qla_isr.c |   7 +-
 drivers/scsi/qla2xxx/qla_mbx.c |   3 +-
 drivers/scsi/qla2xxx/qla_mid.c |  42 ---
 drivers/scsi/qla2xxx/qla_os.c  |  78 ++---
 drivers/scsi/qla2xxx/qla_target.c  |  60 +++---
 drivers/scsi/qla2xxx/qla_version.h |   2 +-
 10 files changed, 405 insertions(+), 148 deletions(-)

-- 
2.12.0



[PATCH 02/22] qla2xxx: Fix gpnid error processing

2017-11-28 Thread Himanshu Madhani
From: Quinn Tran 

Stop GPNID command from advancing if command has failed.

Fixes: 726b85487067d ("qla2xxx: Add framework for async fabric discovery")
Cc:  # 4.10+
Signed-off-by: Quinn Tran 
Signed-off-by: Himanshu Madhani 
---
 drivers/scsi/qla2xxx/qla_gs.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/drivers/scsi/qla2xxx/qla_gs.c b/drivers/scsi/qla2xxx/qla_gs.c
index bc3db6abc9a0..ddc69d36877e 100644
--- a/drivers/scsi/qla2xxx/qla_gs.c
+++ b/drivers/scsi/qla2xxx/qla_gs.c
@@ -3211,6 +3211,11 @@ static void qla2x00_async_gpnid_sp_done(void *s, int res)
sp->name, res, ct_req->req.port_id.port_id,
ct_rsp->rsp.gpn_id.port_name);
 
+   if (res) {
+   sp->free(sp);
+   return;
+   }
+
memset(&ea, 0, sizeof(ea));
memcpy(ea.port_name, ct_rsp->rsp.gpn_id.port_name, WWN_SIZE);
ea.sp = sp;
-- 
2.12.0



[PATCH 06/22] qla2xxx: Retry switch command on time out

2017-11-28 Thread Himanshu Madhani
From: Quinn Tran 

Retry GID_PN & GPN_ID switch commands for time out case.

Fixes: 726b85487067d ("qla2xxx: Add framework for async fabric discovery")
Cc:  # 4.10+
Signed-off-by: Quinn Tran 
Signed-off-by: Himanshu Madhani 
---
 drivers/scsi/qla2xxx/qla_gs.c | 34 ++
 1 file changed, 26 insertions(+), 8 deletions(-)

diff --git a/drivers/scsi/qla2xxx/qla_gs.c b/drivers/scsi/qla2xxx/qla_gs.c
index 8984f857bb34..ea1b562ebc8a 100644
--- a/drivers/scsi/qla2xxx/qla_gs.c
+++ b/drivers/scsi/qla2xxx/qla_gs.c
@@ -175,6 +175,9 @@ qla2x00_chk_ms_status(scsi_qla_host_t *vha, ms_iocb_entry_t 
*ms_pkt,
set_bit(LOCAL_LOOP_UPDATE, &vha->dpc_flags);
}
break;
+   case CS_TIMEOUT:
+   rval = QLA_FUNCTION_TIMEOUT;
+   /* drop through */
default:
ql_dbg(ql_dbg_disc, vha, 0x2033,
"%s failed, completion status (%x) on port_id: "
@@ -2889,9 +2892,22 @@ static void qla2x00_async_gidpn_sp_done(void *s, int res)
ea.rc = res;
ea.event = FCME_GIDPN_DONE;
 
-   ql_dbg(ql_dbg_disc, vha, 0x204f,
-   "Async done-%s res %x, WWPN %8phC ID %3phC \n",
-   sp->name, res, fcport->port_name, id);
+   if (res == QLA_FUNCTION_TIMEOUT) {
+   ql_dbg(ql_dbg_disc, sp->vha, 0x,
+   "Async done-%s WWPN %8phC timed out.\n",
+   sp->name, fcport->port_name);
+   qla24xx_post_gidpn_work(sp->vha, fcport);
+   sp->free(sp);
+   return;
+   } else if (res) {
+   ql_dbg(ql_dbg_disc, sp->vha, 0x,
+   "Async done-%s fail res %x, WWPN %8phC\n",
+   sp->name, res, fcport->port_name);
+   } else {
+   ql_dbg(ql_dbg_disc, vha, 0x204f,
+   "Async done-%s good WWPN %8phC ID %3phC\n",
+   sp->name, fcport->port_name, id);
+   }
 
qla2x00_fcport_event_handler(vha, &ea);
 
@@ -3217,11 +3233,6 @@ static void qla2x00_async_gpnid_sp_done(void *s, int res)
sp->name, ct_req->req.port_id.port_id,
ct_rsp->rsp.gpn_id.port_name);
 
-   if (res) {
-   sp->free(sp);
-   return;
-   }
-
memset(&ea, 0, sizeof(ea));
memcpy(ea.port_name, ct_rsp->rsp.gpn_id.port_name, WWN_SIZE);
ea.sp = sp;
@@ -3231,6 +3242,13 @@ static void qla2x00_async_gpnid_sp_done(void *s, int res)
ea.rc = res;
ea.event = FCME_GPNID_DONE;
 
+   if (res) {
+   if (res == QLA_FUNCTION_TIMEOUT)
+   qla24xx_post_gpnid_work(sp->vha, &ea.id);
+   sp->free(sp);
+   return;
+   }
+
qla2x00_fcport_event_handler(vha, &ea);
 
e = qla2x00_alloc_work(vha, QLA_EVT_GPNID_DONE);
-- 
2.12.0



[PATCH 03/22] qla2xxx: Move session delete to driver work queue

2017-11-28 Thread Himanshu Madhani
From: Quinn Tran 

Move session delete from system work queue to driver's
work queue for in time processing.

Fixes: 726b85487067d ("qla2xxx: Add framework for async fabric discovery")
Cc:  # 4.10+
Signed-off-by: Quinn Tran 
Signed-off-by: Himanshu Madhani 
---
 drivers/scsi/qla2xxx/qla_os.c | 3 ++-
 drivers/scsi/qla2xxx/qla_target.c | 3 ++-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/qla2xxx/qla_os.c b/drivers/scsi/qla2xxx/qla_os.c
index 46f2d0cf7c0d..dfbf82e716b0 100644
--- a/drivers/scsi/qla2xxx/qla_os.c
+++ b/drivers/scsi/qla2xxx/qla_os.c
@@ -3193,10 +3193,11 @@ qla2x00_probe_one(struct pci_dev *pdev, const struct 
pci_device_id *id)
host->can_queue, base_vha->req,
base_vha->mgmt_svr_loop_id, host->sg_tablesize);
 
+   ha->wq = alloc_workqueue("qla2xxx_wq", WQ_MEM_RECLAIM, 0);
+
if (ha->mqenable) {
bool mq = false;
bool startit = false;
-   ha->wq = alloc_workqueue("qla2xxx_wq", WQ_MEM_RECLAIM, 0);
 
if (QLA_TGT_MODE_ENABLED()) {
mq = true;
diff --git a/drivers/scsi/qla2xxx/qla_target.c 
b/drivers/scsi/qla2xxx/qla_target.c
index 1259ec85ec0a..924d58f5408f 100644
--- a/drivers/scsi/qla2xxx/qla_target.c
+++ b/drivers/scsi/qla2xxx/qla_target.c
@@ -1205,7 +1205,8 @@ void qlt_schedule_sess_for_deletion(struct fc_port *sess,
ql_dbg(ql_dbg_tgt, sess->vha, 0xe001,
"Scheduling sess %p for deletion\n", sess);
 
-   schedule_work(&sess->del_work);
+   INIT_WORK(&sess->del_work, qla24xx_delete_sess_fn);
+   queue_work(sess->vha->hw->wq, &sess->del_work);
 }
 
 void qlt_schedule_sess_for_deletion_lock(struct fc_port *sess)
-- 
2.12.0



[PATCH 12/22] qla2xxx: Clear send ELS LOGO flag after target re-login

2017-11-28 Thread Himanshu Madhani
From: Quinn Tran 

This patch fixes clearing out els_send_logo flag at the
time of session deletion.

Fixes: 3515832cc614 ("scsi: qla2xxx: Reset the logo flag, after target re-login.")
Cc:  # 4.10+
Signed-off-by: Quinn Tran 
Signed-off-by: Himanshu Madhani 
---
 drivers/scsi/qla2xxx/qla_target.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/scsi/qla2xxx/qla_target.c 
b/drivers/scsi/qla2xxx/qla_target.c
index 283ff316e4b2..e824cdc77139 100644
--- a/drivers/scsi/qla2xxx/qla_target.c
+++ b/drivers/scsi/qla2xxx/qla_target.c
@@ -983,6 +983,7 @@ static void qlt_free_session_done(struct work_struct *work)
logo.id = sess->d_id;
logo.cmd_count = 0;
qlt_send_first_logo(vha, &logo);
+   sess->send_els_logo = 0;
}
 
if (sess->logout_on_delete) {
-- 
2.12.0



[PATCH 14/22] qla2xxx: Fix nested spinlock

2017-11-28 Thread Himanshu Madhani
From: Quinn Tran 

Fixes: 6eb54715b54bb ("qla2xxx: Added interface to send explicit LOGO.")
Cc:  # 4.10+
Signed-off-by: Quinn Tran 
Signed-off-by: Himanshu Madhani 
---
 drivers/scsi/qla2xxx/qla_iocb.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/scsi/qla2xxx/qla_iocb.c b/drivers/scsi/qla2xxx/qla_iocb.c
index d810a447cb4a..106f4ac4f733 100644
--- a/drivers/scsi/qla2xxx/qla_iocb.c
+++ b/drivers/scsi/qla2xxx/qla_iocb.c
@@ -2394,7 +2394,6 @@ qla2x00_els_dcmd_iocb_timeout(void *data)
struct scsi_qla_host *vha = sp->vha;
struct qla_hw_data *ha = vha->hw;
struct srb_iocb *lio = &sp->u.iocb_cmd;
-   unsigned long flags = 0;
 
ql_dbg(ql_dbg_io, vha, 0x3069,
"%s Timeout, hdl=%x, portid=%02x%02x%02x\n",
@@ -2402,7 +2401,6 @@ qla2x00_els_dcmd_iocb_timeout(void *data)
fcport->d_id.b.al_pa);
 
/* Abort the exchange */
-   spin_lock_irqsave(&ha->hardware_lock, flags);
if (ha->isp_ops->abort_command(sp)) {
ql_dbg(ql_dbg_io, vha, 0x3070,
"mbx abort_command failed.\n");
@@ -2410,7 +2408,6 @@ qla2x00_els_dcmd_iocb_timeout(void *data)
ql_dbg(ql_dbg_io, vha, 0x3071,
"mbx abort_command success.\n");
}
-   spin_unlock_irqrestore(&ha->hardware_lock, flags);
 
complete(&lio->u.els_logo.comp);
 }
-- 
2.12.0



[PATCH 13/22] qla2xxx: Fix PRLI state check

2017-11-28 Thread Himanshu Madhani
From: Quinn Tran 

The Get Port Database MBX command validates the current login
state upon PRLI completion. The current code also looks at the last
login state for re-validation, which is incorrect. This patch
removes the incorrect state check.

Fixes: 15f30a5752287 ("qla2xxx: Use IOCB interface to submit non-critical MBX.")
Cc:  # 4.10+
Signed-off-by: Quinn Tran 
Signed-off-by: Himanshu Madhani 
---
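Note, not part of the commit message: a small userspace sketch of why
the old condition is too permissive. The ST_* codes and both helper
names are hypothetical and only stand in for the driver's PDS_* values.

```c
#include <stdbool.h>

/* Hypothetical state codes for illustration only. */
enum login_state_model {
    ST_PLOGI_COMPLETE = 1,
    ST_PRLI_PENDING   = 2,
    ST_PRLI_COMPLETE  = 3
};

/* Old check: rejects only when neither state is PRLI complete. */
static bool old_rejects(int cur, int last)
{
    return cur != ST_PRLI_COMPLETE && last != ST_PRLI_COMPLETE;
}

/* New check: rejects whenever the current state is not PRLI complete. */
static bool new_rejects(int cur, int last)
{
    (void)last; /* the last login state no longer takes part */
    return cur != ST_PRLI_COMPLETE;
}
```

With cur = ST_PLOGI_COMPLETE and last = ST_PRLI_COMPLETE the old check
lets the port through, while the new check rejects it.
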
 drivers/scsi/qla2xxx/qla_mbx.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/scsi/qla2xxx/qla_mbx.c b/drivers/scsi/qla2xxx/qla_mbx.c
index cb717d47339f..e2b5fa47bb57 100644
--- a/drivers/scsi/qla2xxx/qla_mbx.c
+++ b/drivers/scsi/qla2xxx/qla_mbx.c
@@ -6160,8 +6160,7 @@ int __qla24xx_parse_gpdb(struct scsi_qla_host *vha, 
fc_port_t *fcport,
}
 
/* Check for logged in state. */
-   if (current_login_state != PDS_PRLI_COMPLETE &&
-   last_login_state != PDS_PRLI_COMPLETE) {
+   if (current_login_state != PDS_PRLI_COMPLETE) {
ql_dbg(ql_dbg_mbx, vha, 0x119a,
"Unable to verify login-state (%x/%x) for loop_id %x.\n",
current_login_state, last_login_state, fcport->loop_id);
-- 
2.12.0



[PATCH 16/22] qla2xxx: Fix scan state field for fcport

2017-11-28 Thread Himanshu Madhani
From: Quinn Tran 

Set the scan_state field to the correct value, indicating the
state of the FC port.

Fixes: 726b85487067d ("qla2xxx: Add framework for async fabric discovery")
Cc:  # 4.10+
Signed-off-by: Quinn Tran 
Signed-off-by: Himanshu Madhani 
---
 drivers/scsi/qla2xxx/qla_target.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/scsi/qla2xxx/qla_target.c 
b/drivers/scsi/qla2xxx/qla_target.c
index 2a6242d97a7e..1c219998ab60 100644
--- a/drivers/scsi/qla2xxx/qla_target.c
+++ b/drivers/scsi/qla2xxx/qla_target.c
@@ -5812,6 +5812,7 @@ static fc_port_t *qlt_get_port_database(struct 
scsi_qla_host *vha,
tfcp->port_type = fcport->port_type;
tfcp->supported_classes = fcport->supported_classes;
tfcp->flags |= fcport->flags;
+   tfcp->scan_state = QLA_FCPORT_FOUND;
 
del = fcport;
fcport = tfcp;
-- 
2.12.0



[PATCH 10/22] qla2xxx: Relogin to target port on a cable swap

2017-11-28 Thread Himanshu Madhani
From: Quinn Tran 

If the user swaps one target port for another on the same switch
port, the driver does not recognize the new target port. The current
code assumes that the old target port has recovered from link down.
The fix asks the switch for the WWPN of a specific Nport ID (GPNID)
rather than assuming that the same target port has come back.

Fixes: 726b85487067d ("qla2xxx: Add framework for async fabric discovery")
Cc:  # 4.10+
Signed-off-by: Quinn Tran 
Signed-off-by: Himanshu Madhani 
---
 drivers/scsi/qla2xxx/qla_gs.c | 164 ++
 drivers/scsi/qla2xxx/qla_init.c   |   6 +-
 drivers/scsi/qla2xxx/qla_os.c |  35 +++-
 drivers/scsi/qla2xxx/qla_target.c |  35 ++--
 4 files changed, 194 insertions(+), 46 deletions(-)

diff --git a/drivers/scsi/qla2xxx/qla_gs.c b/drivers/scsi/qla2xxx/qla_gs.c
index 59ecc4eda6cd..4486c9cc72e6 100644
--- a/drivers/scsi/qla2xxx/qla_gs.c
+++ b/drivers/scsi/qla2xxx/qla_gs.c
@@ -3171,43 +3171,136 @@ void qla24xx_async_gpnid_done(scsi_qla_host_t *vha, 
srb_t *sp)
 
 void qla24xx_handle_gpnid_event(scsi_qla_host_t *vha, struct event_arg *ea)
 {
-   fc_port_t *fcport;
-   unsigned long flags;
+   fc_port_t *fcport, *conflict, *t;
 
-   spin_lock_irqsave(&vha->hw->tgt.sess_lock, flags);
-   fcport = qla2x00_find_fcport_by_wwpn(vha, ea->port_name, 1);
-   spin_unlock_irqrestore(&vha->hw->tgt.sess_lock, flags);
+   ql_dbg(ql_dbg_disc, vha, 0x,
+   "%s %d port_id: %06x\n",
+   __func__, __LINE__, ea->id.b24);
 
-   if (fcport) {
-   /* cable moved. just plugged in */
-   fcport->rscn_gen++;
-   fcport->d_id = ea->id;
-   fcport->scan_state = QLA_FCPORT_FOUND;
-   fcport->flags |= FCF_FABRIC_DEVICE;
-
-   switch (fcport->disc_state) {
-   case DSC_DELETED:
-   ql_dbg(ql_dbg_disc, vha, 0x210d,
-   "%s %d %8phC login\n", __func__, __LINE__,
-   fcport->port_name);
-   qla24xx_fcport_handle_login(vha, fcport);
-   break;
-   case DSC_DELETE_PEND:
-   break;
-   default:
-   ql_dbg(ql_dbg_disc, vha, 0x2064,
-   "%s %d %8phC post del sess\n",
-   __func__, __LINE__, fcport->port_name);
-   qlt_schedule_sess_for_deletion_lock(fcport);
-   break;
+   if (ea->rc) {
+   /* cable is disconnected */
+   list_for_each_entry_safe(fcport, t, &vha->vp_fcports, list) {
+   if (fcport->d_id.b24 == ea->id.b24) {
+   ql_dbg(ql_dbg_disc, vha, 0x,
+   "%s %d %8phC DS %d\n",
+   __func__, __LINE__,
+   fcport->port_name,
+   fcport->disc_state);
+   fcport->scan_state = QLA_FCPORT_SCAN;
+   switch (fcport->disc_state) {
+   case DSC_DELETED:
+   case DSC_DELETE_PEND:
+   break;
+   default:
+   ql_dbg(ql_dbg_disc, vha, 0x,
+   "%s %d %8phC post del sess\n",
+   __func__, __LINE__,
+   fcport->port_name);
+   qlt_schedule_sess_for_deletion_lock
+   (fcport);
+   break;
+   }
+   }
}
} else {
-   /* create new fcport */
-   ql_dbg(ql_dbg_disc, vha, 0x2065,
-   "%s %d %8phC post new sess\n",
-   __func__, __LINE__, ea->port_name);
+   /* cable is connected */
+   fcport = qla2x00_find_fcport_by_wwpn(vha, ea->port_name, 1);
+   if (fcport) {
+   list_for_each_entry_safe(conflict, t, &vha->vp_fcports,
+   list) {
+   if ((conflict->d_id.b24 == ea->id.b24) &&
+   (fcport != conflict)) {
+   /* 2 fcports with conflict Nport ID or
+* an existing fcport is having nport ID
+* conflict with new fcport.
+*/
+
+   ql_dbg(ql_dbg_disc, vha, 0x,
+   "%s %d %8phC DS %d\n",
+   __func__, __L

[PATCH 07/22] qla2xxx: Serialize GPNID for multiple RSCN

2017-11-28 Thread Himanshu Madhani
From: Quinn Tran 

GPNID is triggered by RSCN. For multiple RSCNs of the same
affected NPORT ID, serialize the GPNID to prevent confusion.

Fixes: 726b85487067d ("qla2xxx: Add framework for async fabric discovery")
Cc:  # 4.10+
Signed-off-by: Quinn Tran 
Signed-off-by: Himanshu Madhani 
---
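Note, not part of the commit message: a minimal userspace sketch of the
serialization idea. The submit side of the diff is cut off in this
archive, so the sketch assumes that a second RSCN for a port ID with a
query already in flight only bumps a counter (gen1), and that the
completion re-issues the query when that counter is non-zero. All type
and function names here are hypothetical, and a fixed array stands in
for the gpnid_list.

```c
#include <stdbool.h>
#include <stdint.h>

#define MAX_PENDING 8

struct gpnid_req {
    bool in_use;
    uint32_t port_id;   /* 24-bit N_Port ID being queried */
    unsigned int gen1;  /* RSCNs that arrived while the query was in flight */
};

struct gpnid_table {
    struct gpnid_req slot[MAX_PENDING];
    unsigned int issued; /* queries actually sent to the switch */
};

/* Handle an RSCN for port_id. Returns true if a new query was issued. */
static bool gpnid_submit(struct gpnid_table *t, uint32_t port_id)
{
    int free_slot = -1;

    for (int i = 0; i < MAX_PENDING; i++) {
        if (t->slot[i].in_use && t->slot[i].port_id == port_id) {
            t->slot[i].gen1++;  /* coalesce into the pending query */
            return false;
        }
        if (!t->slot[i].in_use && free_slot < 0)
            free_slot = i;
    }
    if (free_slot < 0)
        return false;           /* table full: this sketch just drops it */

    t->slot[free_slot] = (struct gpnid_req){ .in_use = true, .port_id = port_id };
    t->issued++;
    return true;
}

/*
 * Complete the pending query for port_id. Returns true if the query was
 * re-issued because another RSCN arrived while it was in flight.
 */
static bool gpnid_complete(struct gpnid_table *t, uint32_t port_id)
{
    for (int i = 0; i < MAX_PENDING; i++) {
        if (t->slot[i].in_use && t->slot[i].port_id == port_id) {
            bool stale = t->slot[i].gen1 != 0;

            t->slot[i].in_use = false;    /* plays the role of list_del() */
            if (stale)
                gpnid_submit(t, port_id); /* ask the switch again */
            return stale;
        }
    }
    return false;
}
```

Two back-to-back RSCNs for the same port ID result in one query in
flight at a time, followed by one re-query after the first completes.
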
 drivers/scsi/qla2xxx/qla_def.h | 48 +++---
 drivers/scsi/qla2xxx/qla_gs.c  | 35 +-
 drivers/scsi/qla2xxx/qla_isr.c |  2 +-
 drivers/scsi/qla2xxx/qla_os.c  |  1 +
 4 files changed, 58 insertions(+), 28 deletions(-)

diff --git a/drivers/scsi/qla2xxx/qla_def.h b/drivers/scsi/qla2xxx/qla_def.h
index 01a9b8971e88..d9b4a0651a0f 100644
--- a/drivers/scsi/qla2xxx/qla_def.h
+++ b/drivers/scsi/qla2xxx/qla_def.h
@@ -315,6 +315,29 @@ struct srb_cmd {
 /* To identify if a srb is of T10-CRC type. @sp => srb_t pointer */
 #define IS_PROT_IO(sp) (sp->flags & SRB_CRC_CTX_DSD_VALID)
 
+/*
+ * 24 bit port ID type definition.
+ */
+typedef union {
+   uint32_t b24 : 24;
+
+   struct {
+#ifdef __BIG_ENDIAN
+   uint8_t domain;
+   uint8_t area;
+   uint8_t al_pa;
+#elif defined(__LITTLE_ENDIAN)
+   uint8_t al_pa;
+   uint8_t area;
+   uint8_t domain;
+#else
+#error "__BIG_ENDIAN or __LITTLE_ENDIAN must be defined!"
+#endif
+   uint8_t rsvd_1;
+   } b;
+} port_id_t;
+#define INVALID_PORT_ID 0xFF
+
 struct els_logo_payload {
uint8_t opcode;
uint8_t rsvd[3];
@@ -338,6 +361,7 @@ struct ct_arg {
u32 rsp_size;
void *req;
void *rsp;
+   port_id_t   id;
 };
 
 /*
@@ -499,6 +523,7 @@ typedef struct srb {
const char *name;
int iocbs;
struct qla_qpair *qpair;
+   struct list_head elem;
u32 gen1;   /* scratch */
u32 gen2;   /* scratch */
union {
@@ -2164,28 +2189,6 @@ struct imm_ntfy_from_isp {
 #define REQUEST_ENTRY_SIZE (sizeof(request_t))
 
 
-/*
- * 24 bit port ID type definition.
- */
-typedef union {
-   uint32_t b24 : 24;
-
-   struct {
-#ifdef __BIG_ENDIAN
-   uint8_t domain;
-   uint8_t area;
-   uint8_t al_pa;
-#elif defined(__LITTLE_ENDIAN)
-   uint8_t al_pa;
-   uint8_t area;
-   uint8_t domain;
-#else
-#error "__BIG_ENDIAN or __LITTLE_ENDIAN must be defined!"
-#endif
-   uint8_t rsvd_1;
-   } b;
-} port_id_t;
-#define INVALID_PORT_ID 0xFF
 
 /*
  * Switch info gathering structure.
@@ -4252,6 +4255,7 @@ typedef struct scsi_qla_host {
uint8_t n2n_node_name[WWN_SIZE];
uint8_t n2n_port_name[WWN_SIZE];
uint16_t n2n_id;
+   struct list_head gpnid_list;
 } scsi_qla_host_t;
 
 struct qla27xx_image_status {
diff --git a/drivers/scsi/qla2xxx/qla_gs.c b/drivers/scsi/qla2xxx/qla_gs.c
index ea1b562ebc8a..59ecc4eda6cd 100644
--- a/drivers/scsi/qla2xxx/qla_gs.c
+++ b/drivers/scsi/qla2xxx/qla_gs.c
@@ -3221,16 +3221,17 @@ static void qla2x00_async_gpnid_sp_done(void *s, int 
res)
(struct ct_sns_rsp *)sp->u.iocb_cmd.u.ctarg.rsp;
struct event_arg ea;
struct qla_work_evt *e;
+   unsigned long flags;
 
if (res)
ql_dbg(ql_dbg_disc, vha, 0x2066,
-   "Async done-%s fail res %x ID %3phC. %8phC\n",
-   sp->name, res, ct_req->req.port_id.port_id,
+   "Async done-%s fail res %x rscn gen %d ID %3phC. %8phC\n",
+   sp->name, res, sp->gen1, ct_req->req.port_id.port_id,
ct_rsp->rsp.gpn_id.port_name);
else
ql_dbg(ql_dbg_disc, vha, 0x2066,
-   "Async done-%s good ID %3phC. %8phC\n",
-   sp->name, ct_req->req.port_id.port_id,
+   "Async done-%s good rscn gen %d ID %3phC. %8phC\n",
+   sp->name, sp->gen1, ct_req->req.port_id.port_id,
ct_rsp->rsp.gpn_id.port_name);
 
memset(&ea, 0, sizeof(ea));
@@ -3242,11 +3243,20 @@ static void qla2x00_async_gpnid_sp_done(void *s, int 
res)
ea.rc = res;
ea.event = FCME_GPNID_DONE;
 
+   spin_lock_irqsave(&vha->hw->tgt.sess_lock, flags);
+   list_del(&sp->elem);
+   spin_unlock_irqrestore(&vha->hw->tgt.sess_lock, flags);
+
if (res) {
if (res == QLA_FUNCTION_TIMEOUT)
qla24xx_post_gpnid_work(sp->vha, &ea.id);
sp->free(sp);
return;
+   } else if (sp->gen1) {
+   /* There was another RSCN for this Nport ID */
+   qla24xx_post_gpnid_work(sp->vha, &ea.id);
+   sp->free(sp);
+   return;
}
 
qla2x00_fcport_event_handler(vha, &ea);
@@ -3282,8 +3292,9 @@ int qla24xx_async_gpnid(scsi_qla_host_t *vha, port_id_t 
*id)
 {
int rval = QLA_FUNCTION_FAILED;
struct ct_sn

[PATCH 04/22] qla2xxx: Skip IRQ affinity for Target QPairs

2017-11-28 Thread Himanshu Madhani
From: Quinn Tran 

Fix the co-existence of Block MQ and Target Mode. Block MQ
and initiator mode require the midlayer queue mapping to check
that IRQs are affinitized. This is not the case for target mode.

Fixes: 09620eeb62c41 ("scsi: qla2xxx: Add debug knob for user control workload")
Cc:  # 4.12+
Signed-off-by: Quinn Tran 
Signed-off-by: Himanshu Madhani 
---
 drivers/scsi/qla2xxx/qla_os.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/qla2xxx/qla_os.c b/drivers/scsi/qla2xxx/qla_os.c
index dfbf82e716b0..428e1bfaa83b 100644
--- a/drivers/scsi/qla2xxx/qla_os.c
+++ b/drivers/scsi/qla2xxx/qla_os.c
@@ -6609,9 +6609,14 @@ qla83xx_disable_laser(scsi_qla_host_t *vha)
 
 static int qla2xxx_map_queues(struct Scsi_Host *shost)
 {
+   int rc;
scsi_qla_host_t *vha = (scsi_qla_host_t *)shost->hostdata;
 
-   return blk_mq_pci_map_queues(&shost->tag_set, vha->hw->pdev);
+   if (USER_CTRL_IRQ(vha->hw))
+   rc = blk_mq_map_queues(&shost->tag_set);
+   else
+   rc = blk_mq_pci_map_queues(&shost->tag_set, vha->hw->pdev);
+   return rc;
 }
 
 static const struct pci_error_handlers qla2xxx_err_handler = {
-- 
2.12.0



[PATCH 22/22] qla2xxx: Update driver version to 10.00.00.03-k

2017-11-28 Thread Himanshu Madhani
Signed-off-by: Himanshu Madhani 
---
 drivers/scsi/qla2xxx/qla_version.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/qla2xxx/qla_version.h 
b/drivers/scsi/qla2xxx/qla_version.h
index b6ec02b96d3d..911b82226d13 100644
--- a/drivers/scsi/qla2xxx/qla_version.h
+++ b/drivers/scsi/qla2xxx/qla_version.h
@@ -7,7 +7,7 @@
 /*
  * Driver version
  */
-#define QLA2XXX_VERSION  "10.00.00.02-k"
+#define QLA2XXX_VERSION  "10.00.00.03-k"
 
 #define QLA_DRIVER_MAJOR_VER   10
 #define QLA_DRIVER_MINOR_VER   0
-- 
2.12.0



[PATCH 08/22] qla2xxx: Fix login state machine stuck at GPDB

2017-11-28 Thread Himanshu Madhani
From: Quinn Tran 

This patch returns the discovery state machine to
Login Complete.

Fixes: 726b85487067d ("qla2xxx: Add framework for async fabric discovery")
Cc:  # 4.10+
Signed-off-by: Quinn Tran 
Signed-off-by: Himanshu Madhani 
---
 drivers/scsi/qla2xxx/qla_init.c | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/qla2xxx/qla_init.c b/drivers/scsi/qla2xxx/qla_init.c
index be4c67b465b8..2f246996d3e2 100644
--- a/drivers/scsi/qla2xxx/qla_init.c
+++ b/drivers/scsi/qla2xxx/qla_init.c
@@ -863,6 +863,7 @@ void qla24xx_handle_gpdb_event(scsi_qla_host_t *vha, struct 
event_arg *ea)
int rval = ea->rc;
fc_port_t *fcport = ea->fcport;
unsigned long flags;
+   u16 opt = ea->sp->u.iocb_cmd.u.mbx.out_mb[10];
 
fcport->flags &= ~FCF_ASYNC_SENT;
 
@@ -893,7 +894,8 @@ void qla24xx_handle_gpdb_event(scsi_qla_host_t *vha, struct 
event_arg *ea)
}
 
spin_lock_irqsave(&vha->hw->tgt.sess_lock, flags);
-   ea->fcport->login_gen++;
+   if (opt != PDO_FORCE_ADISC)
+   ea->fcport->login_gen++;
ea->fcport->deleted = 0;
ea->fcport->logout_on_delete = 1;
 
@@ -917,6 +919,13 @@ void qla24xx_handle_gpdb_event(scsi_qla_host_t *vha, 
struct event_arg *ea)
 
qla24xx_post_gpsc_work(vha, fcport);
}
+   } else if (ea->fcport->login_succ) {
+   /*
+* We have an existing session. A late RSCN delivery
+* must have triggered the session to be re-validated.
+* The session is still valid.
+*/
+   fcport->disc_state = DSC_LOGIN_COMPLETE;
}
spin_unlock_irqrestore(&vha->hw->tgt.sess_lock, flags);
 } /* gpdb event */
-- 
2.12.0



[PATCH 05/22] qla2xxx: Fix re-login for Nport Handle in use

2017-11-28 Thread Himanshu Madhani
From: Quinn Tran 

When an NPort Handle is in use, the driver needs to mark the handle
as used and pick another. Instead, the code clears the handle
and re-picks the same handle.

Fixes: 726b85487067d ("qla2xxx: Add framework for async fabric discovery")
Cc:  # 4.10+
Signed-off-by: Quinn Tran 
Signed-off-by: Himanshu Madhani 
---
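Note, not part of the commit message: a minimal userspace sketch of the
difference between clearing a rejected handle and marking it as used
before picking again. The 64-entry bitmap and every name below are
hypothetical; the driver keeps its own loop_id_map.

```c
#include <stdint.h>

#define NUM_HANDLES 64
#define NO_HANDLE   0xFFFFu

static uint64_t handle_map; /* bit n set means handle n is taken */

/* Pick the lowest free handle and mark it taken. */
static unsigned int pick_handle(void)
{
    for (unsigned int i = 0; i < NUM_HANDLES; i++) {
        if (!(handle_map & (UINT64_C(1) << i))) {
            handle_map |= UINT64_C(1) << i;
            return i;
        }
    }
    return NO_HANDLE;
}

static void clear_handle(unsigned int h)
{
    handle_map &= ~(UINT64_C(1) << h);
}

/* Old behaviour: release the rejected handle, then pick again. */
static unsigned int retry_old(unsigned int rejected)
{
    clear_handle(rejected);
    return pick_handle();   /* finds the same handle again */
}

/* New behaviour: keep the rejected handle marked as taken, then pick again. */
static unsigned int retry_new(unsigned int rejected)
{
    handle_map |= UINT64_C(1) << rejected;
    return pick_handle();   /* moves on to the next free handle */
}
```

Starting from an empty map, retry_old() hands back the handle that was
just rejected, while retry_new() returns the next one.
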
 drivers/scsi/qla2xxx/qla_gs.c   | 16 ++-
 drivers/scsi/qla2xxx/qla_init.c | 44 +
 drivers/scsi/qla2xxx/qla_isr.c  |  5 -
 3 files changed, 51 insertions(+), 14 deletions(-)

diff --git a/drivers/scsi/qla2xxx/qla_gs.c b/drivers/scsi/qla2xxx/qla_gs.c
index ddc69d36877e..8984f857bb34 100644
--- a/drivers/scsi/qla2xxx/qla_gs.c
+++ b/drivers/scsi/qla2xxx/qla_gs.c
@@ -2833,7 +2833,7 @@ void qla24xx_handle_gidpn_event(scsi_qla_host_t *vha, 
struct event_arg *ea)
}
} else { /* fcport->d_id.b24 != ea->id.b24 */
fcport->d_id.b24 = ea->id.b24;
-   if (fcport->deleted == QLA_SESS_DELETED) {
+   if (fcport->deleted != QLA_SESS_DELETED) {
ql_dbg(ql_dbg_disc, vha, 0x2021,
"%s %d %8phC post del sess\n",
__func__, __LINE__, 
fcport->port_name);
@@ -3206,10 +3206,16 @@ static void qla2x00_async_gpnid_sp_done(void *s, int 
res)
struct event_arg ea;
struct qla_work_evt *e;
 
-   ql_dbg(ql_dbg_disc, vha, 0x2066,
-   "Async done-%s res %x ID %3phC. %8phC\n",
-   sp->name, res, ct_req->req.port_id.port_id,
-   ct_rsp->rsp.gpn_id.port_name);
+   if (res)
+   ql_dbg(ql_dbg_disc, vha, 0x2066,
+   "Async done-%s fail res %x ID %3phC. %8phC\n",
+   sp->name, res, ct_req->req.port_id.port_id,
+   ct_rsp->rsp.gpn_id.port_name);
+   else
+   ql_dbg(ql_dbg_disc, vha, 0x2066,
+   "Async done-%s good ID %3phC. %8phC\n",
+   sp->name, ct_req->req.port_id.port_id,
+   ct_rsp->rsp.gpn_id.port_name);
 
if (res) {
sp->free(sp);
diff --git a/drivers/scsi/qla2xxx/qla_init.c b/drivers/scsi/qla2xxx/qla_init.c
index 1bafa043f9f1..be4c67b465b8 100644
--- a/drivers/scsi/qla2xxx/qla_init.c
+++ b/drivers/scsi/qla2xxx/qla_init.c
@@ -1452,6 +1452,8 @@ static void
 qla24xx_handle_plogi_done_event(struct scsi_qla_host *vha, struct event_arg 
*ea)
 {
port_id_t cid;  /* conflict Nport id */
+   u16 lid;
+   struct fc_port *conflict_fcport;
 
switch (ea->data[0]) {
case MBS_COMMAND_COMPLETE:
@@ -1467,8 +1469,12 @@ qla24xx_handle_plogi_done_event(struct scsi_qla_host 
*vha, struct event_arg *ea)
qla24xx_post_prli_work(vha, ea->fcport);
} else {
ql_dbg(ql_dbg_disc, vha, 0x20ea,
-   "%s %d %8phC post gpdb\n",
-   __func__, __LINE__, ea->fcport->port_name);
+   "%s %d %8phC LoopID 0x%x in use with %06x. post 
gnl\n",
+   __func__, __LINE__, ea->fcport->port_name,
+   ea->fcport->loop_id, ea->fcport->d_id.b24);
+
+   set_bit(ea->fcport->loop_id, vha->hw->loop_id_map);
+   ea->fcport->loop_id = FC_NO_LOOP_ID;
ea->fcport->chip_reset = 
vha->hw->base_qpair->chip_reset;
ea->fcport->logout_on_delete = 1;
ea->fcport->send_els_logo = 0;
@@ -1513,8 +1519,38 @@ qla24xx_handle_plogi_done_event(struct scsi_qla_host 
*vha, struct event_arg *ea)
ea->fcport->d_id.b.domain, ea->fcport->d_id.b.area,
ea->fcport->d_id.b.al_pa);
 
-   qla2x00_clear_loop_id(ea->fcport);
-   qla24xx_post_gidpn_work(vha, ea->fcport);
+   lid = ea->iop[1] & 0x;
+   qlt_find_sess_invalidate_other(vha,
+   wwn_to_u64(ea->fcport->port_name),
+   ea->fcport->d_id, lid, &conflict_fcport);
+
+   if (conflict_fcport) {
+   /*
+* Another fcport shares the same loop_id/nport id.
+* Conflict fcport needs to finish cleanup before this
+* fcport can proceed to login.
+*/
+   conflict_fcport->conflict = ea->fcport;
+   ea->fcport->login_pause = 1;
+
+   ql_dbg(ql_dbg_disc, vha, 0x20ed,
+   "%s %d %8phC NPortId %06x inuse with loopid 0x%x. 
post gidpn\n",
+   __func__, __LINE__, ea->fcport->port_name,
+   ea->fcport->d_id.b24, lid);
+   qla2x00_clear_loop_id(ea

[PATCH 19/22] qla2xxx: Remove aborting ELS IOCB call issued as part of timeout.

2017-11-28 Thread Himanshu Madhani
From: Giridhar Malavali 

This fixes the spinlock recursion issue seen while unloading the driver.

14 [9f2e21e03db8] native_queued_spin_lock_slowpath at ad0d8802
15 [9f2e21e03dc0] do_raw_spin_lock at ad0d99e4
16 [9f2e21e03dd8] _raw_spin_lock_irqsave at ad652471
17 [9f2e21e03e00] qla2x00_els_dcmd_iocb_timeout at c070cd63
18 [9f2e21e03e40] qla2x00_sp_timeout at c06f06d3 [qla2xxx]
19 [9f2e21e03e68] call_timer_fn at ad0f97d8
20 [9f2e21e03ed8] run_timer_softirq at ad0faf47
21 [9f2e21e03f68] __softirqentry_text_start at ad655f32

Fixes: 6eb54715b54bb ("qla2xxx: Added interface to send explicit LOGO.")
Cc:  # 4.10+
Signed-off-by: Giridhar Malavali 
Signed-off-by: Himanshu Madhani 
---
 drivers/scsi/qla2xxx/qla_iocb.c | 10 --
 1 file changed, 10 deletions(-)

diff --git a/drivers/scsi/qla2xxx/qla_iocb.c b/drivers/scsi/qla2xxx/qla_iocb.c
index 106f4ac4f733..8ea59586f4f1 100644
--- a/drivers/scsi/qla2xxx/qla_iocb.c
+++ b/drivers/scsi/qla2xxx/qla_iocb.c
@@ -2392,7 +2392,6 @@ qla2x00_els_dcmd_iocb_timeout(void *data)
srb_t *sp = data;
fc_port_t *fcport = sp->fcport;
struct scsi_qla_host *vha = sp->vha;
-   struct qla_hw_data *ha = vha->hw;
struct srb_iocb *lio = &sp->u.iocb_cmd;
 
ql_dbg(ql_dbg_io, vha, 0x3069,
@@ -2400,15 +2399,6 @@ qla2x00_els_dcmd_iocb_timeout(void *data)
sp->name, sp->handle, fcport->d_id.b.domain, fcport->d_id.b.area,
fcport->d_id.b.al_pa);
 
-   /* Abort the exchange */
-   if (ha->isp_ops->abort_command(sp)) {
-   ql_dbg(ql_dbg_io, vha, 0x3070,
-   "mbx abort_command failed.\n");
-   } else {
-   ql_dbg(ql_dbg_io, vha, 0x3071,
-   "mbx abort_command success.\n");
-   }
-
complete(&lio->u.els_logo.comp);
 }
 
-- 
2.12.0



[PATCH 20/22] qla2xxx: Fix system crash in qlt_plogi_ack_unref

2017-11-28 Thread Himanshu Madhani
From: Quinn Tran 

Fix system crash due to NULL pointer access.

qlt_plogi_ack_t and fc_port structures were not properly
bound before calling qlt_plogi_ack_unref().

RIP: 0010:qlt_plogi_ack_unref+0xa1/0x150 [qla2xxx]
Call Trace:
qla24xx_create_new_sess+0xb1/0x320 [qla2xxx]
qla2x00_do_work+0x123/0x260 [qla2xxx]
qla2x00_iocb_work_fn+0x30/0x40 [qla2xxx]
process_one_work+0x1f3/0x530
worker_thread+0x4e/0x480
kthread+0x10c/0x140

Fixes: 726b85487067d ("qla2xxx: Add framework for async fabric discovery")
Cc:  # 4.10+
Signed-off-by: Quinn Tran 
Signed-off-by: Giridhar Malavali 
Signed-off-by: Himanshu Madhani 
---
 drivers/scsi/qla2xxx/qla_os.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/scsi/qla2xxx/qla_os.c b/drivers/scsi/qla2xxx/qla_os.c
index 2ec77b9f78b8..789030c9dd26 100644
--- a/drivers/scsi/qla2xxx/qla_os.c
+++ b/drivers/scsi/qla2xxx/qla_os.c
@@ -4750,11 +4750,11 @@ void qla24xx_create_new_sess(struct scsi_qla_host *vha, 
struct qla_work_evt *e)
} else {
list_add_tail(&fcport->list, &vha->vp_fcports);
 
-   if (pla) {
-   qlt_plogi_ack_link(vha, pla, fcport,
-   QLT_PLOGI_LINK_SAME_WWN);
-   pla->ref_count--;
-   }
+   }
+   if (pla) {
+   qlt_plogi_ack_link(vha, pla, fcport,
+   QLT_PLOGI_LINK_SAME_WWN);
+   pla->ref_count--;
}
}
spin_unlock_irqrestore(&vha->hw->tgt.sess_lock, flags);
-- 
2.12.0



[PATCH 17/22] qla2xxx: Clear loop id after delete

2017-11-28 Thread Himanshu Madhani
From: Quinn Tran 

Clear the loop ID after delete to prevent session invalidation
of a stale session.

Fixes: 726b85487067d ("qla2xxx: Add framework for async fabric discovery")
Cc:  # 4.10+
Signed-off-by: Quinn Tran 
Signed-off-by: Himanshu Madhani 
---
 drivers/scsi/qla2xxx/qla_target.c | 9 -
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/drivers/scsi/qla2xxx/qla_target.c 
b/drivers/scsi/qla2xxx/qla_target.c
index 1c219998ab60..0c0453f2ca9e 100644
--- a/drivers/scsi/qla2xxx/qla_target.c
+++ b/drivers/scsi/qla2xxx/qla_target.c
@@ -986,7 +986,7 @@ static void qlt_free_session_done(struct work_struct *work)
sess->send_els_logo = 0;
}
 
-   if (sess->logout_on_delete) {
+   if (sess->logout_on_delete && sess->loop_id != FC_NO_LOOP_ID) {
int rc;
 
rc = qla2x00_post_async_logout_work(vha, sess, NULL);
@@ -1045,8 +1045,7 @@ static void qlt_free_session_done(struct work_struct 
*work)
sess->login_succ = 0;
}
 
-   if (sess->chip_reset != ha->base_qpair->chip_reset)
-   qla2x00_clear_loop_id(sess);
+   qla2x00_clear_loop_id(sess);
 
if (sess->conflict) {
sess->conflict->login_pause = 0;
@@ -4600,9 +4599,9 @@ qlt_find_sess_invalidate_other(scsi_qla_host_t *vha, 
uint64_t wwn,
"Invalidating sess %p loop_id %d wwn 
%llx.\n",
other_sess, other_sess->loop_id, other_wwn);
 
-
other_sess->keep_nport_handle = 1;
-   *conflict_sess = other_sess;
+   if (other_sess->disc_state != DSC_DELETED)
+   *conflict_sess = other_sess;
qlt_schedule_sess_for_deletion(other_sess,
true);
}
-- 
2.12.0



[PATCH 09/22] qla2xxx: Fix NPIV host cleanup in target mode

2017-11-28 Thread Himanshu Madhani
From: Sawan Chandak 

Add a check to make sure we are cleaning up the global target host
list only for NPIV hosts.

Fixes: bdbe24de281e2 ("scsi: qla2xxx: Cleanup NPIV host in target mode during 
config teardown")
Cc:  # 4.10+
Signed-off-by: Sawan Chandak 
Signed-off-by: Himanshu Madhani 
---
 drivers/scsi/qla2xxx/qla_target.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/qla2xxx/qla_target.c 
b/drivers/scsi/qla2xxx/qla_target.c
index 924d58f5408f..1bec8aebb7b6 100644
--- a/drivers/scsi/qla2xxx/qla_target.c
+++ b/drivers/scsi/qla2xxx/qla_target.c
@@ -1561,8 +1561,11 @@ static void qlt_release(struct qla_tgt *tgt)
 
btree_destroy64(&tgt->lun_qpair_map);
 
-   if (ha->tgt.tgt_ops && ha->tgt.tgt_ops->remove_target)
-   ha->tgt.tgt_ops->remove_target(vha);
+   if (vha->vp_idx)
+   if (ha->tgt.tgt_ops &&
+   ha->tgt.tgt_ops->remove_target &&
+   vha->vha_tgt.target_lport_ptr)
+   ha->tgt.tgt_ops->remove_target(vha);
 
vha->vha_tgt.qla_tgt = NULL;
 
-- 
2.12.0



[PATCH 18/22] qla2xxx: Defer processing of GS IOCB calls

2017-11-28 Thread Himanshu Madhani
From: Giridhar Malavali 

This patch defers processing of GS IOCB calls from interrupt
context to avoid hardware spinlock recursion.

The following stack trace is seen:

? mod_timer+0x193/0x330
? ql_dbg+0xa7/0xf0 [qla2xxx]
_raw_spin_lock_irqsave+0x31/0x40
qla2x00_start_sp+0x3b/0x250 [qla2xxx]
qla24xx_async_gnl+0x1d3/0x240 [qla2xxx]
qla24xx_fcport_handle_login+0x285/0x290 [qla2xxx]
? vprintk_func+0x20/0x50

Fixes: 726b85487067d ("qla2xxx: Add framework for async fabric discovery")
Cc:  # 4.10+
Signed-off-by: Giridhar Malavali 
Signed-off-by: Himanshu Madhani 
---
 drivers/scsi/qla2xxx/qla_init.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/qla2xxx/qla_init.c b/drivers/scsi/qla2xxx/qla_init.c
index 7dd19785f820..57b8f43c5980 100644
--- a/drivers/scsi/qla2xxx/qla_init.c
+++ b/drivers/scsi/qla2xxx/qla_init.c
@@ -975,7 +975,7 @@ int qla24xx_fcport_handle_login(struct scsi_qla_host *vha, 
fc_port_t *fcport)
ql_dbg(ql_dbg_disc, vha, 0x20bd,
"%s %d %8phC post gnl\n",
__func__, __LINE__, fcport->port_name);
-   qla24xx_async_gnl(vha, fcport);
+   qla24xx_post_gnl_work(vha, fcport);
} else {
ql_dbg(ql_dbg_disc, vha, 0x20bf,
"%s %d %8phC post login\n",
@@ -1143,7 +1143,7 @@ void qla24xx_handle_relogin_event(scsi_qla_host_t *vha,
ql_dbg(ql_dbg_disc, vha, 0x20e9, "%s %d %8phC post gidpn\n",
__func__, __LINE__, fcport->port_name);
 
-   qla24xx_async_gidpn(vha, fcport);
+   qla24xx_post_gidpn_work(vha, fcport);
return;
}
 
-- 
2.12.0



[PATCH 11/22] qla2xxx: Fix Relogin being triggered too fast

2017-11-28 Thread Himanshu Madhani
From: Quinn Tran 

The current driver design schedules the relogin process via the DPC
thread every 1 second. In a large fabric, the DPC thread tries to
schedule too many jobs and might get overloaded. As a result, the
DPC thread can schedule a relogin sooner than 1 second after the
previous one.

Fixes: 726b85487067d ("qla2xxx: Add framework for async fabric discovery")
Cc:  # 4.10+
Signed-off-by: Quinn Tran 
Signed-off-by: Himanshu Madhani 
---
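Note, not part of the commit message: a minimal userspace sketch of the
rate limit added by the diff below. It assumes a hypothetical struct
host_model in place of scsi_qla_host, a caller-supplied tick counter in
place of jiffies, and HZ_MODEL ticks per second; time_after_eq_model()
mirrors the wraparound-safe comparison done by the kernel's
time_after_eq().

```c
#include <stdbool.h>

#define HZ_MODEL 1000UL /* hypothetical ticks per second */

struct host_model {
    unsigned long relogin_jif;  /* earliest tick for the next relogin, 0 = never ran */
    bool relogin_needed;        /* stands in for the RELOGIN_NEEDED dpc flag */
    unsigned int relogin_runs;  /* how many relogin passes were started */
};

/* Wraparound-safe "a is at or after b". */
static bool time_after_eq_model(unsigned long a, unsigned long b)
{
    return (long)(a - b) >= 0;
}

/* One DPC pass at tick `now`. Returns true when a relogin pass was started. */
static bool dpc_relogin_pass(struct host_model *h, unsigned long now)
{
    if (!h->relogin_needed)
        return false;
    if (h->relogin_jif && !time_after_eq_model(now, h->relogin_jif))
        return false;           /* too early: leave the flag set for a later pass */

    h->relogin_jif = now + HZ_MODEL;
    h->relogin_needed = false;
    h->relogin_runs++;
    return true;
}
```

The flag is only cleared once the pass actually runs, so a request that
arrives inside the 1 second window is delayed rather than lost.
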
 drivers/scsi/qla2xxx/qla_def.h |  1 +
 drivers/scsi/qla2xxx/qla_mid.c | 24 +++-
 drivers/scsi/qla2xxx/qla_os.c  | 22 ++
 3 files changed, 30 insertions(+), 17 deletions(-)

diff --git a/drivers/scsi/qla2xxx/qla_def.h b/drivers/scsi/qla2xxx/qla_def.h
index d9b4a0651a0f..93ff92e2363f 100644
--- a/drivers/scsi/qla2xxx/qla_def.h
+++ b/drivers/scsi/qla2xxx/qla_def.h
@@ -4110,6 +4110,7 @@ typedef struct scsi_qla_host {
 #define LOOP_READY 5
 #define LOOP_DEAD  6
 
+   unsigned long   relogin_jif;
unsigned long   dpc_flags;
 #define RESET_MARKER_NEEDED 0   /* Send marker to ISP. */
 #define RESET_ACTIVE   1
diff --git a/drivers/scsi/qla2xxx/qla_mid.c b/drivers/scsi/qla2xxx/qla_mid.c
index bd9f14bf7ac2..618ca272d01a 100644
--- a/drivers/scsi/qla2xxx/qla_mid.c
+++ b/drivers/scsi/qla2xxx/qla_mid.c
@@ -343,15 +343,21 @@ qla2x00_do_dpc_vp(scsi_qla_host_t *vha)
"FCPort update end.\n");
}
 
-   if ((test_and_clear_bit(RELOGIN_NEEDED, &vha->dpc_flags)) &&
-   !test_bit(LOOP_RESYNC_NEEDED, &vha->dpc_flags) &&
-   atomic_read(&vha->loop_state) != LOOP_DOWN) {
-
-   ql_dbg(ql_dbg_dpc, vha, 0x4018,
-   "Relogin needed scheduled.\n");
-   qla2x00_relogin(vha);
-   ql_dbg(ql_dbg_dpc, vha, 0x4019,
-   "Relogin needed end.\n");
+   if (test_bit(RELOGIN_NEEDED, &vha->dpc_flags) &&
+   !test_bit(LOOP_RESYNC_NEEDED, &vha->dpc_flags) &&
+   atomic_read(&vha->loop_state) != LOOP_DOWN) {
+
+   if (!vha->relogin_jif ||
+   time_after_eq(jiffies, vha->relogin_jif)) {
+   vha->relogin_jif = jiffies + HZ;
+   clear_bit(RELOGIN_NEEDED, &vha->dpc_flags);
+
+   ql_dbg(ql_dbg_dpc, vha, 0x4018,
+   "Relogin needed scheduled.\n");
+   qla2x00_relogin(vha);
+   ql_dbg(ql_dbg_dpc, vha, 0x4019,
+   "Relogin needed end.\n");
+   }
}
 
if (test_and_clear_bit(RESET_MARKER_NEEDED, &vha->dpc_flags) &&
diff --git a/drivers/scsi/qla2xxx/qla_os.c b/drivers/scsi/qla2xxx/qla_os.c
index 820d1c185beb..2ec77b9f78b8 100644
--- a/drivers/scsi/qla2xxx/qla_os.c
+++ b/drivers/scsi/qla2xxx/qla_os.c
@@ -4905,7 +4905,7 @@ void qla2x00_relogin(struct scsi_qla_host *vha)
 */
if (atomic_read(&fcport->state) != FCS_ONLINE &&
fcport->login_retry && !(fcport->flags & FCF_ASYNC_SENT)) {
-   fcport->login_retry--;
+
if (fcport->flags & FCF_FABRIC_DEVICE) {
ql_dbg(ql_dbg_disc, fcport->vha, 0x2108,
"%s %8phC DS %d LS %d\n", __func__,
@@ -4916,6 +4916,7 @@ void qla2x00_relogin(struct scsi_qla_host *vha)
ea.fcport = fcport;
qla2x00_fcport_event_handler(vha, &ea);
} else {
+   fcport->login_retry--;
status = qla2x00_local_device_login(vha,
fcport);
if (status == QLA_SUCCESS) {
@@ -5898,16 +5899,21 @@ qla2x00_do_dpc(void *data)
}
 
/* Retry each device up to login retry count */
-   if ((test_and_clear_bit(RELOGIN_NEEDED,
-   &base_vha->dpc_flags)) &&
+   if (test_bit(RELOGIN_NEEDED, &base_vha->dpc_flags) &&
!test_bit(LOOP_RESYNC_NEEDED, &base_vha->dpc_flags) &&
atomic_read(&base_vha->loop_state) != LOOP_DOWN) {
 
-   ql_dbg(ql_dbg_dpc, base_vha, 0x400d,
-   "Relogin scheduled.\n");
-   qla2x00_relogin(base_vha);
-   ql_dbg(ql_dbg_dpc, base_vha, 0x400e,
-   "Relogin end.\n");
+   if (!base_vha->relogin_jif ||
+   time_after_eq(jiffies, base_vha->relogin_jif)) {
+   base_vha->relogin_jif = jiffies + HZ;
+   clear_bit(RELOGIN_NEEDED, &base_vha->dpc_flags);
+
+   ql_dbg(ql_dbg_dpc, base_vha, 0x400d,
+   "Relogin scheduled.\n");
+   

[PATCH 15/22] qla2xxx: Replace fcport alloc with qla2x00_alloc_fcport

2017-11-28 Thread Himanshu Madhani
From: Quinn Tran 

The current code manually allocates an fcport structure that
is not properly initialized. Replace kzalloc with
qla2x00_alloc_fcport, so that all fields are initialized.
Also set the scan flag to port found.

Cc: 
Signed-off-by: Quinn Tran 
Signed-off-by: Himanshu Madhani 
---
 drivers/scsi/qla2xxx/qla_target.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/qla2xxx/qla_target.c 
b/drivers/scsi/qla2xxx/qla_target.c
index e824cdc77139..2a6242d97a7e 100644
--- a/drivers/scsi/qla2xxx/qla_target.c
+++ b/drivers/scsi/qla2xxx/qla_target.c
@@ -5783,7 +5783,7 @@ static fc_port_t *qlt_get_port_database(struct 
scsi_qla_host *vha,
unsigned long flags;
u8 newfcport = 0;
 
-   fcport = kzalloc(sizeof(*fcport), GFP_KERNEL);
+   fcport = qla2x00_alloc_fcport(vha, GFP_KERNEL);
if (!fcport) {
ql_dbg(ql_dbg_tgt_mgt, vha, 0xf06f,
"qla_target(%d): Allocation of tmp FC port failed",
-- 
2.12.0



[PATCH 21/22] qla2xxx: Fix memory leak in dual/target mode

2017-11-28 Thread Himanshu Madhani
When the driver is loaded in Target/Dual mode, it creates QPairs
to support MQ and allocates resources for each QPair. QPair
initialization is delayed until the FW personality is changed to
Dual/Target mode by issuing a chip reset. At the time of chip reset,
the firmware is re-initialized in the correct personality and all the
QPairs are initialized by sending MBC_INITIALIZE_MULTIQ (001Fh).

This patch fixes the memory leak by issuing the
MBC_INITIALIZE_MULTIQ command while deleting a rsp/req queue only
when the flag is set for initiator mode, and by cleaning up QPair
resources correctly during driver unload. This MBX does not need to
be issued for Target/Dual mode because chip reset will reset the ISP.

Fixes: d65237c7f0860 ("scsi: qla2xxx: Fix mailbox failure while deleting Queue 
pairs")
Cc:  # 4.10+
Signed-off-by: Himanshu Madhani 
---
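Note, not part of the commit message: a minimal userspace sketch of the
reworked delete path, where the mailbox command is sent only if the
queue was created in firmware, and the host memory is freed in both
cases. struct queue_model, send_delete_mbx() and the MODEL_* codes are
hypothetical stand-ins; the simulated mailbox always succeeds.

```c
#include <stdbool.h>
#include <stdlib.h>

#define MODEL_SUCCESS 0
#define MODEL_FAILED  1

struct queue_model {
    int *ring; /* host memory backing the queue */
};

static int mbx_calls; /* number of simulated mailbox commands sent */

/* Simulated mailbox command that tells the firmware to drop the queue. */
static int send_delete_mbx(struct queue_model *q)
{
    (void)q;
    mbx_calls++;
    return MODEL_SUCCESS;
}

static void free_queue(struct queue_model *q)
{
    if (q) {
        free(q->ring);
        q->ring = NULL;
    }
}

/* Delete a queue; created_in_fw mirrors the qpairs_*_created flags. */
static int delete_queue(struct queue_model *q, bool created_in_fw)
{
    int ret = MODEL_SUCCESS;

    if (q && created_in_fw) {
        ret = send_delete_mbx(q);
        if (ret != MODEL_SUCCESS)
            return MODEL_FAILED; /* as in the diff, nothing is freed on failure */
    }
    free_queue(q);               /* previously skipped when no mailbox was sent */

    return ret;
}
```

In the Target/Dual case (created_in_fw false) no mailbox command is
sent, yet the ring is still released.
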
 drivers/scsi/qla2xxx/qla_init.c |  4 +---
 drivers/scsi/qla2xxx/qla_mid.c  | 18 ++
 2 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/drivers/scsi/qla2xxx/qla_init.c b/drivers/scsi/qla2xxx/qla_init.c
index 57b8f43c5980..58663df38627 100644
--- a/drivers/scsi/qla2xxx/qla_init.c
+++ b/drivers/scsi/qla2xxx/qla_init.c
@@ -8220,9 +8220,6 @@ int qla2xxx_delete_qpair(struct scsi_qla_host *vha, struct qla_qpair *qpair)
 	int ret = QLA_FUNCTION_FAILED;
 	struct qla_hw_data *ha = qpair->hw;
 
-	if (!vha->flags.qpairs_req_created && !vha->flags.qpairs_rsp_created)
-		goto fail;
-
 	qpair->delete_in_progress = 1;
 	while (atomic_read(&qpair->ref_count))
 		msleep(500);
@@ -8230,6 +8227,7 @@ int qla2xxx_delete_qpair(struct scsi_qla_host *vha, struct qla_qpair *qpair)
 	ret = qla25xx_delete_req_que(vha, qpair->req);
 	if (ret != QLA_SUCCESS)
 		goto fail;
+
 	ret = qla25xx_delete_rsp_que(vha, qpair->rsp);
 	if (ret != QLA_SUCCESS)
 		goto fail;
diff --git a/drivers/scsi/qla2xxx/qla_mid.c b/drivers/scsi/qla2xxx/qla_mid.c
index 618ca272d01a..e538e6308885 100644
--- a/drivers/scsi/qla2xxx/qla_mid.c
+++ b/drivers/scsi/qla2xxx/qla_mid.c
@@ -575,14 +575,15 @@ qla25xx_free_rsp_que(struct scsi_qla_host *vha, struct rsp_que *rsp)
 int
 qla25xx_delete_req_que(struct scsi_qla_host *vha, struct req_que *req)
 {
-	int ret = -1;
+	int ret = QLA_SUCCESS;
 
-	if (req) {
+	if (req && vha->flags.qpairs_req_created) {
 		req->options |= BIT_0;
 		ret = qla25xx_init_req_que(vha, req);
+		if (ret != QLA_SUCCESS)
+			return QLA_FUNCTION_FAILED;
 	}
-	if (ret == QLA_SUCCESS)
-		qla25xx_free_req_que(vha, req);
+	qla25xx_free_req_que(vha, req);
 
 	return ret;
 }
@@ -590,14 +591,15 @@ qla25xx_delete_req_que(struct scsi_qla_host *vha, struct req_que *req)
 int
 qla25xx_delete_rsp_que(struct scsi_qla_host *vha, struct rsp_que *rsp)
 {
-	int ret = -1;
+	int ret = QLA_SUCCESS;
 
-	if (rsp) {
+	if (rsp && vha->flags.qpairs_rsp_created) {
 		rsp->options |= BIT_0;
 		ret = qla25xx_init_rsp_que(vha, rsp);
+		if (ret != QLA_SUCCESS)
+			return QLA_FUNCTION_FAILED;
 	}
-	if (ret == QLA_SUCCESS)
-		qla25xx_free_rsp_que(vha, rsp);
+	qla25xx_free_rsp_que(vha, rsp);
 
 	return ret;
 }
-- 
2.12.0



Re: [PATCH 14/22] qla2xxx: Fix nested spinlock

2017-11-28 Thread Bart Van Assche
On Tue, 2017-11-28 at 11:34 -0800, Himanshu Madhani wrote:
> From: Quinn Tran 

Nesting spinlocks is allowed so I think a more detailed description
is required for this patch. It would help e.g. to explain why this
patch is a bug fix and also what it fixes.

Thanks,

Bart.

Re: [PATCH 14/22] qla2xxx: Fix nested spinlock

2017-11-28 Thread Madhani, Himanshu
Hi Bart, 

> On Nov 28, 2017, at 11:45 AM, Bart Van Assche  wrote:
> 
> On Tue, 2017-11-28 at 11:34 -0800, Himanshu Madhani wrote:
>> From: Quinn Tran 
> 
> Nesting spinlocks is allowed so I think a more detailed description
> is required for this patch. It would help e.g. to explain why this
> patch is a bug fix and also what it fixes.
> 
> Thanks,
> 
> Bart.

Apologies for using the wrong subject for this patch.

We ran into this deadlock while unloading the driver, and the stack trace
shows the following:

#14 [9f2e21e03db8] native_queued_spin_lock_slowpath at ad0d8802
#15 [9f2e21e03dc0] do_raw_spin_lock at ad0d99e4
#16 [9f2e21e03dd8] _raw_spin_lock_irqsave at ad652471
#17 [9f2e21e03e00] qla2x00_els_dcmd_iocb_timeout at c070cd63 [qla2xxx]
#18 [9f2e21e03e40] qla2x00_sp_timeout at c06f06d3 [qla2xxx]
#19 [9f2e21e03e68] call_timer_fn at ad0f97d8
#20 [9f2e21e03ed8] run_timer_softirq at ad0faf47
#21 [9f2e21e03f68] __softirqentry_text_start at ad655f32

There is patch #19 (https://patchwork.kernel.org/patch/10080967/) later in
this series which ultimately removes this block of code protected by these
spin locks. If you prefer, I can merge this patch with the other patch and
add more description.

Thanks,
- Himanshu

Re: [PATCH 14/22] qla2xxx: Fix nested spinlock

2017-11-28 Thread Bart Van Assche
On Tue, 2017-11-28 at 21:38 +, Madhani, Himanshu wrote:
> Hi Bart, 
> 
> > On Nov 28, 2017, at 11:45 AM, Bart Van Assche  wrote:
> > 
> > On Tue, 2017-11-28 at 11:34 -0800, Himanshu Madhani wrote:
> > > From: Quinn Tran 
> > 
> > Nesting spinlocks is allowed so I think a more detailed description
> > is required for this patch. It would help e.g. to explain why this
> > patch is a bug fix and also what it fixes.
> > 
> Apologies for using the wrong subject for this patch.
> 
> We ran into this deadlock while unloading the driver, and the stack trace
> shows the following:
> 
> #14 [9f2e21e03db8] native_queued_spin_lock_slowpath at ad0d8802
> #15 [9f2e21e03dc0] do_raw_spin_lock at ad0d99e4
> #16 [9f2e21e03dd8] _raw_spin_lock_irqsave at ad652471
> #17 [9f2e21e03e00] qla2x00_els_dcmd_iocb_timeout at c070cd63 [qla2xxx]
> #18 [9f2e21e03e40] qla2x00_sp_timeout at c06f06d3 [qla2xxx]
> #19 [9f2e21e03e68] call_timer_fn at ad0f97d8
> #20 [9f2e21e03ed8] run_timer_softirq at ad0faf47
> #21 [9f2e21e03f68] __softirqentry_text_start at ad655f32
> 
> There is patch #19 (https://patchwork.kernel.org/patch/10080967/) later in
> this series which ultimately removes this block of code protected by these
> spin locks. If you prefer, I can merge this patch with the other patch and
> add more description.

Hello Himanshu,

Do you enable CONFIG_PROVE_LOCKING in any of your tests? If this bug could
be reproduced with CONFIG_PROVE_LOCKING then that would provide very valuable
information about the root cause of the deadlock.

Thanks,

Bart.

Re: [PATCH 14/22] qla2xxx: Fix nested spinlock

2017-11-28 Thread Madhani, Himanshu
Hi Bart, 

> On Nov 28, 2017, at 2:09 PM, Bart Van Assche  wrote:
> 
> Hello Himanshu,
> 
> Do you enable CONFIG_PROVE_LOCKING in any of your tests? If this bug could
> be reproduced with CONFIG_PROVE_LOCKING then that would provide very valuable
> information about the root cause of the deadlock.
> 
> Thanks,
> 
> Bart.

This was reported by a customer, so I am not very confident they would have
CONFIG_PROVE_LOCKING enabled in their setup. I’ll enable this on our setup
and we can try to reproduce this issue.

Let me know if you want me to drop this patch until we get more details.

Thanks,
- Himanshu



Re: [PATCH 14/22] qla2xxx: Fix nested spinlock

2017-11-28 Thread Bart Van Assche
On Tue, 2017-11-28 at 22:22 +, Madhani, Himanshu wrote:
> This was reported by a customer, so I am not very confident they would have
> CONFIG_PROVE_LOCKING enabled in their setup. I’ll enable this on our setup
> and we can try to reproduce this issue.
> 
> Let me know if you want me to drop this patch until we get more details.

Hello Himanshu,

What I think is that at least an explanation should be provided about why it
is safe to leave out the spin_lock() and spin_unlock() calls. What does the
hardware lock protect and why is it safe to leave it out from
qla2x00_els_dcmd_iocb_timeout()?

Thanks,

Bart.

Re: [PATCH 14/22] qla2xxx: Fix nested spinlock

2017-11-28 Thread Madhani, Himanshu
Hi Bart, 

On Nov 28, 2017, at 2:27 PM, Bart Van Assche  wrote:
> 
> On Tue, 2017-11-28 at 22:22 +, Madhani, Himanshu wrote:
>> This was reported by a customer, so I am not very confident they would have
>> CONFIG_PROVE_LOCKING enabled in their setup. I’ll enable this on our setup
>> and we can try to reproduce this issue.
>> 
>> Let me know if you want me to drop this patch until we get more details.
> 
> Hello Himanshu,
> 
> What I think is that at least an explanation should be provided about why it
> is safe to leave out the spin_lock() and spin_unlock() calls. What does the
> hardware lock protect and why is it safe to leave it out from
> qla2x00_els_dcmd_iocb_timeout()?
> 

Okay, I will update the patch with the description and submit v2.

> Thanks,
> 
> Bart.

Thanks,
- Himanshu



[PATCH] scsi: fix race condition when removing target

2017-11-28 Thread Jason Yan
In commit fbce4d97fd43 ("scsi: fixup kernel warning during rmmod()"), we
removed scsi_device_get() and directly called get_device() to increase
the refcount of the device. But actually scsi_device_get() will fail in
three cases:
1. the scsi device is in SDEV_DEL or SDEV_CANCEL state
2. get_device() fails
3. the module is not alive

The intended purpose was to remove the check that the module is alive.
Unfortunately the check of the device state was dropped too. And this
introduced a race condition like this:

      CPU0                                             CPU1
__scsi_remove_target()
  ->iterate shost->__devices
  ->scsi_remove_device()
  ->put_device()
      someone still hold a refcount
                                                   sd_release()
                                                     ->scsi_disk_put()
                                                     ->put_device() last put
                                                       and trigger the device release

  ->goto restart
  ->iterate shost->__devices and got the same device
  ->get_device() while refcount is 0
  ->scsi_remove_device()
  ->put_device() refcount decreased to 0 again
  ->scsi_device_dev_release()
  ->scsi_device_dev_release_usercontext()

                                                     ->scsi_device_dev_release()
                                                     ->scsi_device_dev_release_usercontext()

The same scsi device will be found again because it stays in the
shost->__devices list until scsi_device_dev_release_usercontext() is called,
although the device state was set to SDEV_DEL after the first
scsi_remove_device().

Finally we get an oops in scsi_device_dev_release_usercontext() when it is
called the second time.

Call trace:
[] scsi_device_dev_release_usercontext+0x7c/0x1c0
[] execute_in_process_context+0x70/0x80
[] scsi_device_dev_release+0x28/0x38
[] device_release+0x3c/0xa0
[] kobject_put+0x80/0xf0
[] put_device+0x24/0x30
[] scsi_device_put+0x30/0x40
[] scsi_disk_put+0x44/0x60
[] sd_release+0x50/0x80
[] __blkdev_put+0x21c/0x230
[] blkdev_put+0x54/0x118
[] blkdev_close+0x2c/0x40
[] __fput+0x94/0x1d8
[] fput+0x20/0x30
[] task_work_run+0x9c/0xb8
[] do_exit+0x2b4/0x9f8
[] do_group_exit+0x3c/0xa0
[] __wake_up_parent+0x0/0x40

And sometimes __scsi_remove_target() will loop for a long time
removing the same device if someone else is holding a refcount, until
the last refcount is released.

Notice that if CONFIG_REFCOUNT_FULL is enabled this race won't be triggered
because the full refcount implementation will prevent the refcount from
increasing when it is 0.
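
To illustrate the difference between the two increment styles, here is
a small userspace sketch using C11 atomics. It is not the kernel's
refcount_t code. The names model_get_unchecked, model_get_not_zero and
model_put are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Plain increment: will take a counter from 0 back to 1 without complaint. */
void model_get_unchecked(atomic_int *ref)
{
	atomic_fetch_add(ref, 1);
}

/*
 * Increment only if the counter is not already 0. Returns false when the
 * object has no references left, so the caller must not touch it.
 */
bool model_get_not_zero(atomic_int *ref)
{
	int old = atomic_load(ref);

	while (old != 0) {
		/* On failure, old is reloaded with the current value. */
		if (atomic_compare_exchange_weak(ref, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop one reference. Returns true when the caller dropped the last one. */
bool model_put(atomic_int *ref)
{
	return atomic_fetch_sub(ref, 1) == 1;
}
```

With the unchecked variant, a get after the last put is followed by a
second "last put", which in the scenario above means the release
function runs twice. The not-zero variant refuses the get instead.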

Fix this by checking the sdev_state again like we did before in
scsi_device_get(). Then when iterating shost again we will skip the
deleted device because scsi_remove_device() will set the device state to
SDEV_CANCEL or SDEV_DEL.

Fixes: fbce4d97fd43 ("scsi: fixup kernel warning during rmmod()")
Signed-off-by: Jason Yan 
CC: Hannes Reinecke 
CC: Christoph Hellwig 
CC: Johannes Thumshirn 
CC: Zhaohongjiang 
CC: Miao Xie 
---
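Note for readers: the effect of the state check on the restart loop
can be modelled in userspace roughly as follows. This is a
single-threaded sketch with no locking. All names (model_dev,
model_get_not_deleted, model_remove_all) are hypothetical, and a fixed
array stands in for the shost->__devices list.

```c
#include <assert.h>
#include <errno.h>

enum model_state { MODEL_RUNNING, MODEL_CANCEL, MODEL_DEL };

/* Hypothetical stand-in for a device that stays listed after removal. */
struct model_dev {
	enum model_state state;
	int refcount;
};

/* Take a reference unless the device is already being torn down. */
int model_get_not_deleted(struct model_dev *dev)
{
	if (dev->state == MODEL_DEL || dev->state == MODEL_CANCEL)
		return -ENXIO;
	dev->refcount++;
	return 0;
}

/*
 * Remove every device that is not already deleted, rescanning from the
 * start after each removal. Returns how many devices were removed.
 */
int model_remove_all(struct model_dev *devs, int n)
{
	int removed = 0;
	int i;

restart:
	for (i = 0; i < n; i++) {
		if (model_get_not_deleted(&devs[i]))
			continue;          /* already deleted: skip it */
		devs[i].state = MODEL_DEL; /* removal marks the device */
		devs[i].refcount--;        /* drop the reference taken above */
		removed++;
		goto restart;              /* rescan, as the real loop does */
	}
	return removed;
}
```

Because a removed device is skipped on the rescan even though it is
still in the array, each device is handled once and the loop ends
without waiting for other holders to drop their references.
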
 drivers/scsi/scsi_sysfs.c | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
index 50e7d7e..d398894 100644
--- a/drivers/scsi/scsi_sysfs.c
+++ b/drivers/scsi/scsi_sysfs.c
@@ -1398,6 +1398,15 @@ void scsi_remove_device(struct scsi_device *sdev)
 }
 EXPORT_SYMBOL(scsi_remove_device);
 
+static int scsi_device_get_not_deleted(struct scsi_device *sdev)
+{
+	if (sdev->sdev_state == SDEV_DEL || sdev->sdev_state == SDEV_CANCEL)
+		return -ENXIO;
+	if (!get_device(&sdev->sdev_gendev))
+		return -ENXIO;
+	return 0;
+}
+
 static void __scsi_remove_target(struct scsi_target *starget)
 {
 	struct Scsi_Host *shost = dev_to_shost(starget->dev.parent);
@@ -1415,7 +1424,7 @@ static void __scsi_remove_target(struct scsi_target *starget)
 		 */
 		if (sdev->channel != starget->channel ||
 		    sdev->id != starget->id ||
-		    !get_device(&sdev->sdev_gendev))
+		    scsi_device_get_not_deleted(sdev))
 			continue;
 		spin_unlock_irqrestore(shost->host_lock, flags);
 		scsi_remove_device(sdev);
-- 
2.9.5



Re: [PATCH] scsi: lpfc: Use after free in lpfc_rq_buf_free()

2017-11-28 Thread Martin K. Petersen

Dan,

> The error message dereferences "rqb_entry" so we need to print it first
> and then free the buffer.

Applied to 4.15/scsi-fixes. Thanks!

-- 
Martin K. Petersen  Oracle Linux Engineering


Re: [PATCH] scsi: libfc: fix ELS request handling

2017-11-28 Thread Martin K. Petersen

Martin,

> The modification of fc_lport_recv_els_req() in commit fcabb09e59a7
> (merged in 4.12-rc1) caused certain requests not to be handled at all.
> Fix that.

Applied to 4.15/scsi-fixes. Thank you!

-- 
Martin K. Petersen  Oracle Linux Engineering


Re: [PATCH] scsi: debug: remove jiffies_to_timespec

2017-11-28 Thread Martin K. Petersen

Arnd,

> There is no need to go through an intermediate timespec to convert to
> ktime_t when we just want a simple multiplication. This gets rid of
> one of the few users of jiffies_to_timespec, which I hope to remove as
> part of the y2038 cleanup.

Applied to 4.16/scsi-queue. Thanks!

-- 
Martin K. Petersen  Oracle Linux Engineering


Re: [PATCH] scsi: aacraid: address UBSAN warning regression

2017-11-28 Thread Martin K. Petersen

Arnd,

> As reported by Meelis Roos, my previous patch causes an incorrect
> calculation of the timeout, through an undefined signed integer
> overflow:

Applied to 4.15/scsi-fixes, thank you!

-- 
Martin K. Petersen  Oracle Linux Engineering


Re: [PATCH] bfa: fix access to bfad_im_port_s

2017-11-28 Thread Martin K. Petersen

Johannes,

> Commit 'cd21c605b2cf ("scsi: fc: provide fc_bsg_to_shost() helper")'
> changed access to bfa's 'struct bfad_im_port_s' by using shost_priv()
> instead of shost->hostdata[0].

Applied to 4.15/scsi-fixes. Thanks!

-- 
Martin K. Petersen  Oracle Linux Engineering


Re: [PATCH] scsi: wd719x: make card_types static const, shrinks object size

2017-11-28 Thread Martin K. Petersen

Colin,

> Don't populate the read-only array card_types on the stack but instead
> make it static and constify it. Makes the object code smaller by over
> 110 bytes:

Applied to 4.16/scsi-queue. Thanks!

-- 
Martin K. Petersen  Oracle Linux Engineering


Re: [PATCH] scsi: fix race condition when removing target

2017-11-28 Thread Hannes Reinecke
On 11/29/2017 04:05 AM, Jason Yan wrote:
> In commit fbce4d97fd43 ("scsi: fixup kernel warning during rmmod()"), we
> removed scsi_device_get() and directly called get_device() to increase
> the refcount of the device. But actually scsi_device_get() will fail in
> three cases:
> 1. the scsi device is in SDEV_DEL or SDEV_CANCEL state
> 2. get_device() fails
> 3. the module is not alive
> 
> The intended purpose was to remove the check that the module is alive.
> Unfortunately the check of the device state was dropped too. And this
> introduced a race condition like this:
> 
>       CPU0                                             CPU1
> __scsi_remove_target()
>   ->iterate shost->__devices
>   ->scsi_remove_device()
>   ->put_device()
>       someone still hold a refcount
>                                                    sd_release()
>                                                      ->scsi_disk_put()
>                                                      ->put_device() last put
>                                                        and trigger the device release
> 
>   ->goto restart
>   ->iterate shost->__devices and got the same device
>   ->get_device() while refcount is 0
>   ->scsi_remove_device()
>   ->put_device() refcount decreased to 0 again
>   ->scsi_device_dev_release()
>   ->scsi_device_dev_release_usercontext()
> 
>                                                      ->scsi_device_dev_release()
>                                                      ->scsi_device_dev_release_usercontext()
> 
> The same scsi device will be found again because it stays in the
> shost->__devices list until scsi_device_dev_release_usercontext() is called,
> although the device state was set to SDEV_DEL after the first
> scsi_remove_device().
> 
> Finally we get an oops in scsi_device_dev_release_usercontext() when it is
> called the second time.
> 
> Call trace:
> [] scsi_device_dev_release_usercontext+0x7c/0x1c0
> [] execute_in_process_context+0x70/0x80
> [] scsi_device_dev_release+0x28/0x38
> [] device_release+0x3c/0xa0
> [] kobject_put+0x80/0xf0
> [] put_device+0x24/0x30
> [] scsi_device_put+0x30/0x40
> [] scsi_disk_put+0x44/0x60
> [] sd_release+0x50/0x80
> [] __blkdev_put+0x21c/0x230
> [] blkdev_put+0x54/0x118
> [] blkdev_close+0x2c/0x40
> [] __fput+0x94/0x1d8
> [] fput+0x20/0x30
> [] task_work_run+0x9c/0xb8
> [] do_exit+0x2b4/0x9f8
> [] do_group_exit+0x3c/0xa0
> [] __wake_up_parent+0x0/0x40
> 
> And sometimes __scsi_remove_target() will loop for a long time
> removing the same device if someone else is holding a refcount, until
> the last refcount is released.
> 
> Notice that if CONFIG_REFCOUNT_FULL is enabled this race won't be triggered
> because the full refcount implementation will prevent the refcount from
> increasing when it is 0.
> 
> Fix this by checking the sdev_state again like we did before in
> scsi_device_get(). Then when iterating shost again we will skip the
> deleted device because scsi_remove_device() will set the device state to
> SDEV_CANCEL or SDEV_DEL.
> 
> Fixes: fbce4d97fd43 ("scsi: fixup kernel warning during rmmod()")
> Signed-off-by: Jason Yan 
> CC: Hannes Reinecke 
> CC: Christoph Hellwig 
> CC: Johannes Thumshirn 
> CC: Zhaohongjiang 
> CC: Miao Xie 
> ---
>  drivers/scsi/scsi_sysfs.c | 11 ++-
>  1 file changed, 10 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
> index 50e7d7e..d398894 100644
> --- a/drivers/scsi/scsi_sysfs.c
> +++ b/drivers/scsi/scsi_sysfs.c
> @@ -1398,6 +1398,15 @@ void scsi_remove_device(struct scsi_device *sdev)
>  }
>  EXPORT_SYMBOL(scsi_remove_device);
>  
> +static int scsi_device_get_not_deleted(struct scsi_device *sdev)
> +{
> +	if (sdev->sdev_state == SDEV_DEL || sdev->sdev_state == SDEV_CANCEL)
> +		return -ENXIO;
> +	if (!get_device(&sdev->sdev_gendev))
> +		return -ENXIO;
> +	return 0;
> +}
> +
>  static void __scsi_remove_target(struct scsi_target *starget)
>  {
>  	struct Scsi_Host *shost = dev_to_shost(starget->dev.parent);
> @@ -1415,7 +1424,7 @@ static void __scsi_remove_target(struct scsi_target *starget)
>  		 */
>  		if (sdev->channel != starget->channel ||
>  		    sdev->id != starget->id ||
> -		    !get_device(&sdev->sdev_gendev))
> +		    scsi_device_get_not_deleted(sdev))
>  			continue;
>  		spin_unlock_irqrestore(shost->host_lock, flags);
>  		scsi_remove_device(sdev);
> 
Reviewed-by: Hannes Reinecke 

Cheers,

Hannes
-- 
Dr. Hannes Reinecke            Teamlead Storage & Networking
h...@suse.de                   +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)