Re: [PATCH 24/29] drivers: convert iblock_req.pending from atomic_t to refcount_t

2017-03-07 Thread Nicholas A. Bellinger
Hi Elena,

On Mon, 2017-03-06 at 16:21 +0200, Elena Reshetova wrote:
> refcount_t type and corresponding API should be
> used instead of atomic_t when the variable is used as
> a reference counter. This allows to avoid accidental
> refcounter overflows that might lead to use-after-free
> situations.
> 
> Signed-off-by: Elena Reshetova 
> Signed-off-by: Hans Liljestrand 
> Signed-off-by: Kees Cook 
> Signed-off-by: David Windsor 
> ---
>  drivers/target/target_core_iblock.c | 12 ++--
>  drivers/target/target_core_iblock.h |  3 ++-
>  2 files changed, 8 insertions(+), 7 deletions(-)

For the target_core_iblock part:

Acked-by: Nicholas Bellinger 



Re: [PATCH v3 00/14] qla2xxx: Bug Fixes and updates for target.

2017-03-07 Thread Nicholas A. Bellinger
On Mon, 2017-03-06 at 00:43 +, Bart Van Assche wrote:
> On Fri, 2017-02-24 at 13:37 -0800, Himanshu Madhani wrote:
> > Please consider this series for inclusion in target-pending.
> 
> Hello Himanshu,
> 
> I applied this patch series on top of kernel v4.11-rc1 and installed it on
> a test system. Unfortunately the regression I reported one month ago against
> v2 of this patch series still occurs with v3 of this patch series:

Btw, the regression reported here in v2:

http://www.spinics.net/lists/target-devel/msg14348.html

is completely different from what you've reported here.

> 
> INFO: task kworker/2:1:71 blocked for more than 120 seconds.
>   Not tainted 4.11.0-rc1-dbg+ #1
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> kworker/2:1 D071  2 0x
> Workqueue: events qlt_free_session_done [qla2xxx]
> Call Trace:
>  __schedule+0x1ef/0xa70
>  schedule+0x40/0x90
>  schedule_timeout+0x258/0x4a0
>  wait_for_completion+0x108/0x170
>  target_wait_for_sess_cmds+0x56/0x1b0 [target_core_mod]
>  tcm_qla2xxx_free_session+0x45/0x90 [tcm_qla2xxx]
>  qlt_free_session_done+0x44a/0x550 [qla2xxx]
>  process_one_work+0x20d/0x6c0
>  worker_thread+0x4e/0x4a0
>  kthread+0x10c/0x140
>  ret_from_fork+0x31/0x40
> 

It would be useful to explain how you reproduced this, instead of just
posting backtrace with zero context..?

Can we at least identify which patch in this series is causing this..?

Also, I assume you are running this on stock v4.11-rc1 with only this
qla2xxx series applied, and not all of your other stuff, right..?



Re: [PATCH v3 01/16] lpfc: Correct WQ creation for pagesize

2017-03-07 Thread Martin K. Petersen
> "Greg" == Greg KH  writes:

Greg,

>> Can this commit be included on stable v4.4+ , please? (in Linus tree)
>> 
>> 8ea73db486cda442f0671f4bc9c03a76be398a28 "scsi: lpfc: Correct WQ
>> creation for pagesize"

Greg> I need an ack from a scsi maintainer that this is an ok thing to
Greg> do before I can do so.

It's fine with me. This was merged as part of a bigger driver update.
And as a result, the patch in question wasn't tagged as a fix/stable
candidate.

-- 
Martin K. Petersen  Oracle Linux Engineering


Re: [PATCH] storvsc: workaround for virtual DVD SCSI version

2017-03-07 Thread Martin K. Petersen
> "Stephen" == Stephen Hemminger  writes:

Stephen,

Stephen> Hyper-V host emulation of SCSI for virtual DVD device reports
Stephen> SCSI version 0 (UNKNOWN) but is still capable of supporting
Stephen> REPORTLUN.

Stephen> Without this patch, a GEN2 Linux guest on Hyper-V will not boot
Stephen> 4.11 successfully with virtual DVD ROM device. What happens is
Stephen> that the SCSI scan process falls back to doing sequential
Stephen> probing by INQUIRY.  But the storvsc driver has a previous
Stephen> workaround that masks/blocks all errors reports from INQUIRY
Stephen> (or MODE_SENSE) commands.  This workaround causes the scan to
Stephen> then populate a full set of bogus LUN's on the target and then
Stephen> sends kernel spinning off into a death spiral doing block reads
Stephen> on the non-existent LUNs.

Stephen> By setting the correct blacklist flags, the target with the DVD
Stephen> device is scanned with REPORTLUN and that works correctly.

Applied to 4.11/scsi-fixes. Thank you!

-- 
Martin K. Petersen  Oracle Linux Engineering


Re: [PATCH] scsi: lpfc: Add shutdown method for kexec

2017-03-07 Thread Martin K. Petersen
> "Mauricio" == Mauricio Faria de Oliveira  
> writes:

Mauricio> I think I should have included this in the tested-by tag
Mauricio> email, for documentation/evidence: no regression observed in
Mauricio> system shutdown path.

Applied to 4.11/scsi-fixes.

-- 
Martin K. Petersen  Oracle Linux Engineering


Re: [PATCH] storvsc: workaround for virtual DVD SCSI version

2017-03-07 Thread Christoph Hellwig
Thanks Stephen,

this looks good to me:

Reviewed-by: Christoph Hellwig 


Re: [PATCH] lpfc: Finalize Kconfig options for nvme

2017-03-07 Thread James Smart



On 3/7/2017 1:23 PM, Christoph Hellwig wrote:

+#if defined(CONFIG_NVME_FC)
+#define CONFIG_LPFC_NVME_INITIATOR
+#endif
+
+#if defined(CONFIG_NVME_TARGET_FC)
+#define CONFIG_LPFC_NVME_TARGET
+#endif

The CONFIG_* namespace is reserved for Kconfig defined symbols.

Also I think the above is wrong if the nvme fc core code is built
modular.  I think you want to replace all occurances of

#ifdef CONFIG_LPFC_NVME_INITIATOR

with

if (IS_ENABLED(CONFIG_NVME_FC))

and vice versa for the taget.


make sense

-- james



RE: [PATCHv2] hpsa: expose enclosures

2017-03-07 Thread Don Brace

> -Original Message-
> From: Hannes Reinecke [mailto:h...@suse.com]
> Sent: Tuesday, March 07, 2017 9:25 AM
> To: Don Brace ; Hannes Reinecke
> ; Martin K. Petersen 
> Cc: Christoph Hellwig ; James Bottomley
> ; linux-scsi@vger.kernel.org
> Subject: Re: [PATCHv2] hpsa: expose enclosures
> 
> EXTERNAL EMAIL
> 
> 
> On 03/07/2017 04:05 PM, Don Brace wrote:
> >> -Original Message-
> >> From: Hannes Reinecke [mailto:h...@suse.de]
> >> Sent: Thursday, February 23, 2017 4:55 AM
> >> To: Martin K. Petersen 
> >> Cc: Christoph Hellwig ; James Bottomley
> >> ; linux-scsi@vger.kernel.org;
> >> Hannes Reinecke ; Don Brace
> >> ; Hannes Reinecke 
> >> Subject: [PATCHv2] hpsa: expose enclosures
> >>
> >> EXTERNAL EMAIL
> >>
> >>
> >> Some servers have a built-in enclosure which will show up on the
> >> same bus as the internal physical devices. This patch fixes the
> >> driver to expose them.
> >>
> >> Cc: Don Brace 
> >> Signed-off-by: Hannes Reinecke 
> >
> > Masking various SES targets is a conscious choice, not a design oversight.
> > So, I have to decline this patch.,
> >
> Would you be okay with a module parameter for this?
> It really feels stupid to present the user with an enclosure device
> during boot (as hpsa presents all physical LUNs during booting), but not
> exporting it to the user.
> 
> Cheers,
> 
> Hannes

We would not be able to support this as it can cause issues with 
LED management. A customer could pull the wrong drive and lose data because the 
LED state was incorrect.
That results in support issues.

We like to show the enclosures even if masked as it helps with diagnosing 
problems.

Thanks,
Don Brace
ESC - Smart Storage
Microsemi Corporation





> --
> Dr. Hannes ReineckezSeries & Storage
> h...@suse.com  +49 911 74053 688
> SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
> GF: F. Imendörffer, J. Smithard, D. Upmanyu, G. Norton
> HRB 21284 (AG Nürnberg)


Re: Help needed for a SCSI hang (SATA drives)

2017-03-07 Thread V
On Tue, Mar 7, 2017 at 3:58 AM, Jack Wang  wrote:
> 2017-03-07 1:54 GMT+01:00 V :
>> Hi,
>>
>> I am reaching out regarding a SCSI error handler issue, which we see
>> in our lab systems.
>>
>> Environment: Ubuntu 4.4.0-31-generic
>>
>> Issue: Write IOs are getting stuck in our system randomly. All drives
>> seen with the issue are all SATA drives.
>> Root cause: SCSI error handler is not woken up (or) it is up, but not
>> able to flush commands and hence the IOs get stuck. (as the requests
>> are not flushed out and host is not restarted)
>>
>> We have 6 instances seen till now.
>>
>> This is what we see.
>> 1) As part of testing our drives, we start lot of read/writes, along
>> with some user space utilities running to check the drive health.
>> 1) In our recent testing, we see lot of commands going through the
>> "abort scheduled" path. And we see that in between, for one of the
>> commands, the error handler is not woken up, and majority of
>> processes, which were writing, gets stalled.
>> 2) We are still trying to figure out what is causing this much number
>> of abort? Is it usual? Are there some other debugs, which I could
>> enable to get more information? We know these are user space commands
>> which are being aborted, but not sure which exact command it is for
>> now.
>> 3) I see the "abort scheduled" messages in almost all drives in all
>> systems, hence i dont believe it is a drive issue.
>> 4) I checked the host_failed and host_busy variables, both are set to
>> 1 in system 1. 2nd one is still alive, havent taken a crash dump yet.
>> 5) All drives seen with the issue are SATA drives.
>> 6) I have attached udevadm info of a drive which failed in system 2.
>>
>> Thanks in advance,
>> Viswesh
>>
>>
>> Logs
>> ---
>> System 1
>>
>> [371546.605736] sd 9:0:0:0: scsi_block_when_processing_errors: rtn: 1
>> [371546.606727] sd 9:0:0:0: scsi_block_when_processing_errors: rtn: 1
>> [371546.607721] sd 9:0:0:0: scsi_block_when_processing_errors: rtn: 1
>> [371546.618113] sd 9:0:0:0: [sg19] tag#20 abort scheduled
>> [371546.624039] sd 9:0:0:0: [sg19] tag#20 aborting command
>> [371546.624045] sd 9:0:0:0: [sg19] tag#20 cmd abort failed
>> [371546.624051] scsi host9: Waking error handler thread
>> [371546.624060] scsi host9: scsi_eh_9: waking up 0/1/1
>> [371546.624081] sd 9:0:0:0: [sg19] tag#20 scsi_eh_9: flush finish cmd
>> [371546.624095] scsi host9: waking up host to restart
>> [371546.624098] scsi host9: scsi_eh_9: sleeping
>>
>>
>> [371546.635314] sd 8:0:0:0: [sg17] tag#13 abort scheduled
>> [371546.640031] sd 8:0:0:0: [sg17] tag#13 aborting command
>> [371546.640037] sd 8:0:0:0: [sg17] tag#13 cmd abort failed
>> [371546.640043] scsi host8: Waking error handler thread
>> [371546.640078] scsi host8: scsi_eh_8: waking up 0/1/1
>> [371546.640098] sd 8:0:0:0: [sg17] tag#13 scsi_eh_8: flush finish cmd
>> [371546.640113] scsi host8: waking up host to restart
>> [371546.640117] scsi host8: scsi_eh_8: sleeping
>>
>> [371546.657269] sd 6:0:0:0: [sg12] tag#1 abort scheduled
>> [371546.664034] sd 6:0:0:0: [sg12] tag#1 aborting command
>> [371546.664040] sd 6:0:0:0: [sg12] tag#1 cmd abort failed
>>
>> Here we can see that, error handler is not woken up.  And the entire
>> IO subsystem goes for a toss.
>>
>> [371777.594510] INFO: task md2_raid1:508 blocked for more than 120 seconds.
>> [371777.594571]  Not tainted 4.4.0-31-generic #50~14.04.1-Ubuntu
>> [371777.594613] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
>> disables this message.
>> [371777.594665] md2_raid1  D 883fed2afc780  508  2 0x
>> [371777.594673]  883fed2afc78 881fef0e8000 883fed21c4c0
>> 883fed2b
>> [371777.594678]  883ff236e298 883ff236e000 883ff236e018
>> 883ff236e018
>>
>> By enabling more scsi logs(in a different system), including LL, we
>> get some more info ( I have enabled the full scsi levels for all
>> categories and it is still running)
>>
>>
>> System 2
>>
>> [405197.171574] sd 5:0:0:0: [sg3] sg_write: count=3D88
>> [405197.171577] sd 5:0:0:0: scsi_block_when_processing_errors: rtn: 1
>> [405197.171581] sd 5:0:0:0: [sg3] sg_common_write:  scsi
>> opcode=3D0x85, cmd_size=3D16
>> [405197.171583] sd 5:0:0:0: [sg3] sg_start_req: dxfer_len=3D512
>> [405197.171605] sd 5:0:0:0: [sg3] sg_link_reserve: size=3D512
>> [405197.171623] sd 5:0:0:0: [sg3] sg_poll: res=3D0x104
>> [405197.172648] sd 5:0:0:0: [sg3] sg_cmd_done: pack_id=3D0, res=3D0x0
>> [405197.172701] sd 5:0:0:0: [sg3] sg_poll: res=3D0x145
>> [405197.172710] sd 5:0:0:0: [sg3] sg_read: count=3D88
>> [405197.172716] sd 5:0:0:0: [sg3] sg_finish_rem_req: res_used=3D1
>> [405197.172721] sd 5:0:0:0: [sg3] sg_unlink_reserve: req->k_use_sg=3D1
>> [405197.172756] sd 5:0:0:0: [sg3] sg_write: count=3D88
>> [405197.172760] sd 5:0:0:0: scsi_block_when_processing_errors: rtn: 1
>> [405197.172764] sd 5:0:0:0: [sg3] sg_common_write:  scsi
>> opcode=3D0x85, cmd_size=3D16
>> [405197.172767] sd 5:0:0:0: [sg3] sg_start_req: dxfer_len

Re: [PATCH] lpfc: Finalize Kconfig options for nvme

2017-03-07 Thread Christoph Hellwig
> +#if defined(CONFIG_NVME_FC)
> +#define CONFIG_LPFC_NVME_INITIATOR
> +#endif
> +
> +#if defined(CONFIG_NVME_TARGET_FC)
> +#define CONFIG_LPFC_NVME_TARGET
> +#endif

The CONFIG_* namespace is reserved for Kconfig defined symbols.

Also I think the above is wrong if the nvme fc core code is built
modular.  I think you want to replace all occurances of

#ifdef CONFIG_LPFC_NVME_INITIATOR

with

if (IS_ENABLED(CONFIG_NVME_FC))

and vice versa for the taget.


RE: [PATCH] storvsc: workaround for virtual DVD SCSI version

2017-03-07 Thread KY Srinivasan


> -Original Message-
> From: Stephen Hemminger [mailto:step...@networkplumber.org]
> Sent: Tuesday, March 7, 2017 9:16 AM
> To: KY Srinivasan ; Haiyang Zhang
> ; Long Li ;
> martin.peter...@oracle.com; h...@lst.de; h...@suse.de
> Cc: linux-scsi@vger.kernel.org; linux-ker...@vger.kernel.org;
> de...@linuxdriverproject.org; Stephen Hemminger
> 
> Subject: [PATCH] storvsc: workaround for virtual DVD SCSI version
> 
> Hyper-V host emulation of SCSI for virtual DVD device reports SCSI
> version 0 (UNKNOWN) but is still capable of supporting REPORTLUN.
> 
> Without this patch, a GEN2 Linux guest on Hyper-V will not boot 4.11
> successfully with virtual DVD ROM device. What happens is that the
> SCSI scan process falls back to doing sequential probing by INQUIRY.
> But the storvsc driver has a previous workaround that masks/blocks all
> errors reports from INQUIRY (or MODE_SENSE) commands.  This workaround
> causes the scan to then populate a full set of bogus LUN's on the
> target and then sends kernel spinning off into a death spiral doing
> block reads on the non-existent LUNs.
> 
> By setting the correct blacklist flags, the target with the
> DVD device is scanned with REPORTLUN and that works correctly.
> 
> Patch needs to go in current 4.11, it is safe but not necessary
> in older kernels.
> 
> Signed-off-by: Stephen Hemminger 

Reviewed-by: K. Y. Srinivasan 
> ---
>  drivers/scsi/storvsc_drv.c | 27 +--
>  1 file changed, 17 insertions(+), 10 deletions(-)
> 
> PS: The error handling does need to be fixed (have patches pending)
> but that is interrelated with hotplug and can wait.
> 
> diff --git a/drivers/scsi/storvsc_drv.c b/drivers/scsi/storvsc_drv.c
> index 638e5f427c90..19973e874830 100644
> --- a/drivers/scsi/storvsc_drv.c
> +++ b/drivers/scsi/storvsc_drv.c
> @@ -400,8 +400,6 @@
> MODULE_PARM_DESC(storvsc_vcpus_per_sub_channel, "Ratio of VCPUs to
> subchannels")
>   */
>  static int storvsc_timeout = 180;
> 
> -static int msft_blist_flags = BLIST_TRY_VPD_PAGES;
> -
>  #if IS_ENABLED(CONFIG_SCSI_FC_ATTRS)
>  static struct scsi_transport_template *fc_transport_template;
>  #endif
> @@ -1383,6 +1381,22 @@ static int storvsc_do_io(struct hv_device *device,
>   return ret;
>  }
> 
> +static int storvsc_device_alloc(struct scsi_device *sdevice)
> +{
> + /*
> +  * Set blist flag to permit the reading of the VPD pages even when
> +  * the target may claim SPC-2 compliance. MSFT targets currently
> +  * claim SPC-2 compliance while they implement post SPC-2 features.
> +  * With this flag we can correctly handle WRITE_SAME_16 issues.
> +  *
> +  * Hypervisor reports SCSI_UNKNOWN type for DVD ROM device but
> +  * still supports REPORT LUN.
> +  */
> + sdevice->sdev_bflags = BLIST_REPORTLUN2 |
> BLIST_TRY_VPD_PAGES;
> +
> + return 0;
> +}
> +
>  static int storvsc_device_configure(struct scsi_device *sdevice)
>  {
> 
> @@ -1396,14 +1410,6 @@ static int storvsc_device_configure(struct
> scsi_device *sdevice)
>   sdevice->no_write_same = 1;
> 
>   /*
> -  * Add blist flags to permit the reading of the VPD pages even when
> -  * the target may claim SPC-2 compliance. MSFT targets currently
> -  * claim SPC-2 compliance while they implement post SPC-2 features.
> -  * With this patch we can correctly handle WRITE_SAME_16 issues.
> -  */
> - sdevice->sdev_bflags |= msft_blist_flags;
> -
> - /*
>* If the host is WIN8 or WIN8 R2, claim conformance to SPC-3
>* if the device is a MSFT virtual device.  If the host is
>* WIN10 or newer, allow write_same.
> @@ -1661,6 +1667,7 @@ static struct scsi_host_template scsi_driver = {
>   .eh_host_reset_handler =storvsc_host_reset_handler,
>   .proc_name ="storvsc_host",
>   .eh_timed_out = storvsc_eh_timed_out,
> + .slave_alloc =  storvsc_device_alloc,
>   .slave_configure =  storvsc_device_configure,
>   .cmd_per_lun =  255,
>   .this_id =  -1,
> --
> 2.11.0



Re: [PATCH v3 01/16] lpfc: Correct WQ creation for pagesize

2017-03-07 Thread Greg KH
On Tue, Mar 07, 2017 at 10:44:51AM -0300, Mauricio Faria de Oliveira wrote:
> Hello stable kernel maintainers,
> 
> On 03/07/2017 12:35 AM, Martin K. Petersen wrote:
> > Mauricio> Please flag this patch for stable.
> > 
> > Mauricio> This patch resolves a serious problem on IBM Power systems at
> > Mauricio> least.
> > 
> > Both patches are already upstream so I can't tag them for stable. Either
> > you or James should mail sta...@vger.kernel.org and request for the
> > patches to be queued up.
> 
> Can this commit be included on stable v4.4+ , please? (in Linus tree)
> 
> 8ea73db486cda442f0671f4bc9c03a76be398a28 "scsi: lpfc: Correct WQ
> creation for pagesize"

I need an ack from a scsi maintainer that this is an ok thing to do
before I can do so.

thanks,

greg k-h


[PATCH] storvsc: workaround for virtual DVD SCSI version

2017-03-07 Thread Stephen Hemminger
Hyper-V host emulation of SCSI for virtual DVD device reports SCSI
version 0 (UNKNOWN) but is still capable of supporting REPORTLUN. 

Without this patch, a GEN2 Linux guest on Hyper-V will not boot 4.11
successfully with virtual DVD ROM device. What happens is that the
SCSI scan process falls back to doing sequential probing by INQUIRY.
But the storvsc driver has a previous workaround that masks/blocks all
errors reports from INQUIRY (or MODE_SENSE) commands.  This workaround
causes the scan to then populate a full set of bogus LUN's on the
target and then sends kernel spinning off into a death spiral doing
block reads on the non-existent LUNs.

By setting the correct blacklist flags, the target with the
DVD device is scanned with REPORTLUN and that works correctly.

Patch needs to go in current 4.11, it is safe but not necessary
in older kernels.

Signed-off-by: Stephen Hemminger 
---
 drivers/scsi/storvsc_drv.c | 27 +--
 1 file changed, 17 insertions(+), 10 deletions(-)

PS: The error handling does need to be fixed (have patches pending)
but that is interrelated with hotplug and can wait.

diff --git a/drivers/scsi/storvsc_drv.c b/drivers/scsi/storvsc_drv.c
index 638e5f427c90..19973e874830 100644
--- a/drivers/scsi/storvsc_drv.c
+++ b/drivers/scsi/storvsc_drv.c
@@ -400,8 +400,6 @@ MODULE_PARM_DESC(storvsc_vcpus_per_sub_channel, "Ratio of 
VCPUs to subchannels")
  */
 static int storvsc_timeout = 180;
 
-static int msft_blist_flags = BLIST_TRY_VPD_PAGES;
-
 #if IS_ENABLED(CONFIG_SCSI_FC_ATTRS)
 static struct scsi_transport_template *fc_transport_template;
 #endif
@@ -1383,6 +1381,22 @@ static int storvsc_do_io(struct hv_device *device,
return ret;
 }
 
+static int storvsc_device_alloc(struct scsi_device *sdevice)
+{
+   /*
+* Set blist flag to permit the reading of the VPD pages even when
+* the target may claim SPC-2 compliance. MSFT targets currently
+* claim SPC-2 compliance while they implement post SPC-2 features.
+* With this flag we can correctly handle WRITE_SAME_16 issues.
+*
+* Hypervisor reports SCSI_UNKNOWN type for DVD ROM device but
+* still supports REPORT LUN.
+*/
+   sdevice->sdev_bflags = BLIST_REPORTLUN2 | BLIST_TRY_VPD_PAGES;
+
+   return 0;
+}
+
 static int storvsc_device_configure(struct scsi_device *sdevice)
 {
 
@@ -1396,14 +1410,6 @@ static int storvsc_device_configure(struct scsi_device 
*sdevice)
sdevice->no_write_same = 1;
 
/*
-* Add blist flags to permit the reading of the VPD pages even when
-* the target may claim SPC-2 compliance. MSFT targets currently
-* claim SPC-2 compliance while they implement post SPC-2 features.
-* With this patch we can correctly handle WRITE_SAME_16 issues.
-*/
-   sdevice->sdev_bflags |= msft_blist_flags;
-
-   /*
 * If the host is WIN8 or WIN8 R2, claim conformance to SPC-3
 * if the device is a MSFT virtual device.  If the host is
 * WIN10 or newer, allow write_same.
@@ -1661,6 +1667,7 @@ static struct scsi_host_template scsi_driver = {
.eh_host_reset_handler =storvsc_host_reset_handler,
.proc_name ="storvsc_host",
.eh_timed_out = storvsc_eh_timed_out,
+   .slave_alloc =  storvsc_device_alloc,
.slave_configure =  storvsc_device_configure,
.cmd_per_lun =  255,
.this_id =  -1,
-- 
2.11.0



[PATCH] lpfc: Finalize Kconfig options for nvme

2017-03-07 Thread jsmart2021
From: James Smart 

Reviewing the result of what was just added for Kconfig, we made
a poor choice. It worked well for full kernel builds, but not so
much for how it would be deployed on a distro.

Here's the final result:
- lpfc will compile in NVME initiator and/or NVME target support
  based on whether the kernel has the corresponding subsystem support.
  Kconfig is not used to drive this specifically for lpfc.
- There is a module parameter, lpfc_enable_fc4_type, that indicates
  whether the ports will do FCP-only or FCP & NVME (NVME-only not yet
  possible due to dependency on fc transport). As FCP & NVME divvys
  up exchange resources, and given NVME will not be often initially,
  the default is changed to FCP only.

-- james

Signed-off-by: Dick Kennedy 
Signed-off-by: James Smart 
---
 drivers/scsi/Kconfig  | 14 --
 drivers/scsi/lpfc/lpfc.h  |  8 
 drivers/scsi/lpfc/lpfc_attr.c |  4 ++--
 3 files changed, 10 insertions(+), 16 deletions(-)

diff --git a/drivers/scsi/Kconfig b/drivers/scsi/Kconfig
index 4bf55b5..3c52867 100644
--- a/drivers/scsi/Kconfig
+++ b/drivers/scsi/Kconfig
@@ -1253,20 +1253,6 @@ config SCSI_LPFC_DEBUG_FS
  This makes debugging information from the lpfc driver
  available via the debugfs filesystem.
 
-config LPFC_NVME_INITIATOR
-   bool "Emulex LightPulse Fibre Channel NVME Initiator Support"
-   depends on SCSI_LPFC && NVME_FC
-   ---help---
- This enables NVME Initiator support in the Emulex lpfc driver.
-
-config LPFC_NVME_TARGET
-   bool "Emulex LightPulse Fibre Channel NVME Initiator Support"
-   depends on SCSI_LPFC && NVME_TARGET_FC
-   ---help---
- This enables NVME Target support in the Emulex lpfc driver.
- Target enablement must still be enabled on a per adapter
- basis by module parameters.
-
 config SCSI_SIM710
tristate "Simple 53c710 SCSI support (Compaq, NCR machines)"
depends on (EISA || MCA) && SCSI
diff --git a/drivers/scsi/lpfc/lpfc.h b/drivers/scsi/lpfc/lpfc.h
index 257bbdd..ce571ca 100644
--- a/drivers/scsi/lpfc/lpfc.h
+++ b/drivers/scsi/lpfc/lpfc.h
@@ -28,6 +28,14 @@
 #define CONFIG_SCSI_LPFC_DEBUG_FS
 #endif
 
+#if defined(CONFIG_NVME_FC)
+#define CONFIG_LPFC_NVME_INITIATOR
+#endif
+
+#if defined(CONFIG_NVME_TARGET_FC)
+#define CONFIG_LPFC_NVME_TARGET
+#endif
+
 struct lpfc_sli2_slim;
 
 #define ELX_MODEL_NAME_SIZE80
diff --git a/drivers/scsi/lpfc/lpfc_attr.c b/drivers/scsi/lpfc/lpfc_attr.c
index fbd3a56..84aa62f 100644
--- a/drivers/scsi/lpfc/lpfc_attr.c
+++ b/drivers/scsi/lpfc/lpfc_attr.c
@@ -3315,9 +3315,9 @@ LPFC_ATTR_R(nvmet_mrq_post, LPFC_DEF_MRQ_POST,
  * lpfc_enable_fc4_type: Defines what FC4 types are supported.
  * Supported Values:  1 - register just FCP
  *3 - register both FCP and NVME
- * Supported values are [1,3]. Default value is 3
+ * Supported values are [1,3]. Default value is 1
  */
-LPFC_ATTR_R(enable_fc4_type, LPFC_ENABLE_BOTH,
+LPFC_ATTR_R(enable_fc4_type, LPFC_ENABLE_FCP,
LPFC_ENABLE_FCP, LPFC_ENABLE_BOTH,
"Define fc4 type to register with fabric.");
 
-- 
2.5.0



Re: [PATCH 10/29] drivers, md: convert stripe_head.count from atomic_t to refcount_t

2017-03-07 Thread Shaohua Li
On Mon, Mar 06, 2017 at 04:20:57PM +0200, Elena Reshetova wrote:
> refcount_t type and corresponding API should be
> used instead of atomic_t when the variable is used as
> a reference counter. This allows to avoid accidental
> refcounter overflows that might lead to use-after-free
> situations.
> 
> Signed-off-by: Elena Reshetova 
> Signed-off-by: Hans Liljestrand 
> Signed-off-by: Kees Cook 
> Signed-off-by: David Windsor 
> ---
>  drivers/md/raid5-cache.c |  8 +++---
>  drivers/md/raid5.c   | 66 
> 
>  drivers/md/raid5.h   |  3 ++-
>  3 files changed, 39 insertions(+), 38 deletions(-)
> 
> diff --git a/drivers/md/raid5-cache.c b/drivers/md/raid5-cache.c
> index 3f307be..6c05e12 100644
> --- a/drivers/md/raid5-cache.c
> +++ b/drivers/md/raid5-cache.c

snip
>  sh->check_state, sh->reconstruct_state);
>  
>   analyse_stripe(sh, &s);
> @@ -4924,7 +4924,7 @@ static void activate_bit_delay(struct r5conf *conf,
>   struct stripe_head *sh = list_entry(head.next, struct 
> stripe_head, lru);
>   int hash;
>   list_del_init(&sh->lru);
> - atomic_inc(&sh->count);
> + refcount_inc(&sh->count);
>   hash = sh->hash_lock_index;
>   __release_stripe(conf, sh, &temp_inactive_list[hash]);
>   }
> @@ -5240,7 +5240,7 @@ static struct stripe_head *__get_priority_stripe(struct 
> r5conf *conf, int group)
>   sh->group = NULL;
>   }
>   list_del_init(&sh->lru);
> - BUG_ON(atomic_inc_return(&sh->count) != 1);
> + BUG_ON(refcount_inc_not_zero(&sh->count));

This changes the behavior. refcount_inc_not_zero doesn't inc if original value 
is 0

Thanks,
Shaohua


Re: [PATCH 08/29] drivers, md: convert mddev.active from atomic_t to refcount_t

2017-03-07 Thread Shaohua Li
On Mon, Mar 06, 2017 at 04:20:55PM +0200, Elena Reshetova wrote:
> refcount_t type and corresponding API should be
> used instead of atomic_t when the variable is used as
> a reference counter. This allows to avoid accidental
> refcounter overflows that might lead to use-after-free
> situations.

Looks good. Let me know how do you want to route the patch to upstream.
 
> Signed-off-by: Elena Reshetova 
> Signed-off-by: Hans Liljestrand 
> Signed-off-by: Kees Cook 
> Signed-off-by: David Windsor 
> ---
>  drivers/md/md.c | 6 +++---
>  drivers/md/md.h | 3 ++-
>  2 files changed, 5 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/md/md.c b/drivers/md/md.c
> index 985374f..94c8ebf 100644
> --- a/drivers/md/md.c
> +++ b/drivers/md/md.c
> @@ -449,7 +449,7 @@ EXPORT_SYMBOL(md_unplug);
>  
>  static inline struct mddev *mddev_get(struct mddev *mddev)
>  {
> - atomic_inc(&mddev->active);
> + refcount_inc(&mddev->active);
>   return mddev;
>  }
>  
> @@ -459,7 +459,7 @@ static void mddev_put(struct mddev *mddev)
>  {
>   struct bio_set *bs = NULL;
>  
> - if (!atomic_dec_and_lock(&mddev->active, &all_mddevs_lock))
> + if (!refcount_dec_and_lock(&mddev->active, &all_mddevs_lock))
>   return;
>   if (!mddev->raid_disks && list_empty(&mddev->disks) &&
>   mddev->ctime == 0 && !mddev->hold_active) {
> @@ -495,7 +495,7 @@ void mddev_init(struct mddev *mddev)
>   INIT_LIST_HEAD(&mddev->all_mddevs);
>   setup_timer(&mddev->safemode_timer, md_safemode_timeout,
>   (unsigned long) mddev);
> - atomic_set(&mddev->active, 1);
> + refcount_set(&mddev->active, 1);
>   atomic_set(&mddev->openers, 0);
>   atomic_set(&mddev->active_io, 0);
>   spin_lock_init(&mddev->lock);
> diff --git a/drivers/md/md.h b/drivers/md/md.h
> index b8859cb..4811663 100644
> --- a/drivers/md/md.h
> +++ b/drivers/md/md.h
> @@ -22,6 +22,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -360,7 +361,7 @@ struct mddev {
>*/
>   struct mutexopen_mutex;
>   struct mutexreconfig_mutex;
> - atomic_tactive; /* general refcount */
> + refcount_t  active; /* general refcount */
>   atomic_topeners;/* number of active 
> opens */
>  
>   int changed;/* True if we might 
> need to
> -- 
> 2.7.4
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [bdi_unregister] 165a5e22fa INFO: task swapper:1 blocked for more than 120 seconds.

2017-03-07 Thread James Bottomley
On Tue, 2017-03-07 at 15:41 +0100, Jan Kara wrote:
> On Mon 06-03-17 09:25:42, James Bottomley wrote:
> > On Mon, 2017-03-06 at 17:13 +0100, Jan Kara wrote:
> > > On Mon 06-03-17 07:44:55, James Bottomley wrote:
> ...
> > > > > Sure. The call trace is:
> > > > > 
> > > > > [   41.919244] [ cut here ]
> > > > > [   41.919263] WARNING: CPU: 4 PID: 2335 at
> > > > > drivers/scsi/sd.c:3332
> > > > > sd_shutdown+0x2f/0x100
> > > > > [   41.919268] Modules linked in: scsi_debug(+) netconsole 
> > > > > loop btrfs raid6_pq zlib_deflate lzo_compress xor
> > > > > [   41.919319] CPU: 4 PID: 2335 Comm: modprobe Not tainted 
> > > > > 4.11.0-rc1-xen+ #49
> > > > > [   41.919325] Hardware name: Bochs Bochs, BIOS Bochs
> > > > > 01/01/2011
> > > > > [   41.919331] Call Trace:
> > > > > [   41.919343]  dump_stack+0x8e/0xf0
> > > > > [   41.919354]  __warn+0x116/0x120
> > > > > [   41.919361]  warn_slowpath_null+0x1d/0x20
> > > > > [   41.919368]  sd_shutdown+0x2f/0x100
> > > > > [   41.919374]  sd_remove+0x70/0xd0
> > > > > 
> > > > > *** Here is the unexpected step I guess...
> > > > > 
> > > > > [   41.919383]  driver_probe_device+0xe0/0x4c0
> > > > 
> > > > Exactly.  It's this, I think
> > > > 
> > > > bool test_remove =
> > > > IS_ENABLED(CONFIG_DEBUG_TEST_DRIVER_REMOVE)
> > > > &&
> > > >!drv->suppress_bind_attrs;
> > > > 
> > > > You have that config option set.
> > > 
> > > Yes - or better said 0-day testing has it set. Maybe that is not 
> > > a sane default for 0-day tests? The option is explicitely marked 
> > > as "unstable"... Fengguang?
> > > 
> > > > So the drivers base layer is calling ->remove after probe and
> > > > triggering the destruction of the queue.
> > > > 
> > > > What to do about this (apart from nuke such a stupid option) is
> > > > somewhat more problematic.
> > > 
> > > I guess this is between you and Greg :).
> > 
> > Nice try, sport ... you qualify just behind Dan in the "not my 
> > problem" olympic hurdles event.  I'm annoyed because there's no 
> > indication in the log that the driver add behaviour is radically 
> > altered, so I've been staring at the wrong code for weeks. 
> >  However, the unbind/rebind should work, so the real issue is that 
> > your bdi changes (or perhaps something else in block) have induced 
> > a regression in the unbinding of upper layer drivers.  If you're 
> > going to release the bdi in del_gendisk, you have to have some 
> > mechanism for re-acquiring it on re-probe (likely with the same 
> > name) otherwise rebind is broken for every block driver.
> 
> So my patch does not release bdi in del_gendisk(). Bdi has two
> initialization / destruction phases (similarly to request queue). You
> allocate and initialize bdi through bdi_init(), then you call
> bdi_register() to register it (which happens in device_add_disk()). 
> On shutdown you have to first call bdi_unregister() (used to be 
> called from blk_cleanup_queue(), my patch moved it to del_gendisk()). 
> After that the last reference to bdi may be dropped which does final 
> bdi destruction.
> 
> So do I understand correctly that SCSI may call device_add_disk(),
> del_gendisk() repeatedly for the same request queue?

Yes.  The upper drivers (sd, sr, st and sg) follow a device model. 
 They can thus be bound and unbound many times during the lifetime of a
SCSI device.

>  If yes, then indeed I have a bug to fix... But gendisk seems to get 
> allocated from scratch on each probe so we don't call 
> device_add_disk(), del_gendisk() more times on the same disk, right?

Right, gendisk, being a generic representation of a disk is a property
of the upper layer drivers.  We actually cheat and use it in all of
them (including the apparent character ones, they use alloc_disk,
put_disk but not add_disk).  So it has to be freed when the driver is
unbound and reallocated when it is bound.  It's the fundamental entity
which embeds the SCSI upper layer driver lifetime.

> > The fact that the second rebind tried with a different name 
> > indicates that sd_devt_release wasn't called, so some vestige of 
> > the devt remains on the subsequent rebind.
> 
> Yep, I guess that's caused by Dan's patch (commit 0dba1314d4f8 now) 
> which calls put_disk_devt() only in blk_cleanup_queue() which, if I
> understood you correctly, does not get called during unbind-bind 
> cycle? In fact Dan's patch would end up leaking devt's because of 
> repeated device_add_disk() calls for the same request queue...

That's a bit unfortunate.

> > Here's the problem: the queue belongs to SCSI (the lower layer), so 
> > it's not going to change because it doesn't see the release.  The
> > gendisk and its allied stuff belongs to sd so it gets freed and re
> > -created for the same queue.  Something in block is very confused
> > when this happens.
> 
> Yep, I think the binding of request queue to different gendisks is
> something I or Dan did not expect.

OK, so I think we now understand 

Re: [bdi_unregister] 165a5e22fa INFO: task swapper:1 blocked for more than 120 seconds.

2017-03-07 Thread Jan Kara
On Tue 07-03-17 08:10:29, James Bottomley wrote:
> On Tue, 2017-03-07 at 15:41 +0100, Jan Kara wrote:
> > On Mon 06-03-17 09:25:42, James Bottomley wrote:
> > > On Mon, 2017-03-06 at 17:13 +0100, Jan Kara wrote:
> > > > On Mon 06-03-17 07:44:55, James Bottomley wrote:
> > ...
> > > > > > Sure. The call trace is:
> > > > > > 
> > > > > > [   41.919244] [ cut here ]
> > > > > > [   41.919263] WARNING: CPU: 4 PID: 2335 at
> > > > > > drivers/scsi/sd.c:3332
> > > > > > sd_shutdown+0x2f/0x100
> > > > > > [   41.919268] Modules linked in: scsi_debug(+) netconsole 
> > > > > > loop btrfs raid6_pq zlib_deflate lzo_compress xor
> > > > > > [   41.919319] CPU: 4 PID: 2335 Comm: modprobe Not tainted 
> > > > > > 4.11.0-rc1-xen+ #49
> > > > > > [   41.919325] Hardware name: Bochs Bochs, BIOS Bochs
> > > > > > 01/01/2011
> > > > > > [   41.919331] Call Trace:
> > > > > > [   41.919343]  dump_stack+0x8e/0xf0
> > > > > > [   41.919354]  __warn+0x116/0x120
> > > > > > [   41.919361]  warn_slowpath_null+0x1d/0x20
> > > > > > [   41.919368]  sd_shutdown+0x2f/0x100
> > > > > > [   41.919374]  sd_remove+0x70/0xd0
> > > > > > 
> > > > > > *** Here is the unexpected step I guess...
> > > > > > 
> > > > > > [   41.919383]  driver_probe_device+0xe0/0x4c0
> > > > > 
> > > > > Exactly.  It's this, I think
> > > > > 
> > > > >   bool test_remove =
> > > > > IS_ENABLED(CONFIG_DEBUG_TEST_DRIVER_REMOVE)
> > > > > &&
> > > > >  !drv->suppress_bind_attrs;
> > > > > 
> > > > > You have that config option set.
> > > > 
> > > > Yes - or better said 0-day testing has it set. Maybe that is not 
> > > > a sane default for 0-day tests? The option is explicitely marked 
> > > > as "unstable"... Fengguang?
> > > > 
> > > > > So the drivers base layer is calling ->remove after probe and
> > > > > triggering the destruction of the queue.
> > > > > 
> > > > > What to do about this (apart from nuke such a stupid option) is
> > > > > somewhat more problematic.
> > > > 
> > > > I guess this is between you and Greg :).
> > > 
> > > Nice try, sport ... you qualify just behind Dan in the "not my 
> > > problem" olympic hurdles event.  I'm annoyed because there's no 
> > > indication in the log that the driver add behaviour is radically 
> > > altered, so I've been staring at the wrong code for weeks. 
> > >  However, the unbind/rebind should work, so the real issue is that 
> > > your bdi changes (or perhaps something else in block) have induced 
> > > a regression in the unbinding of upper layer drivers.  If you're 
> > > going to release the bdi in del_gendisk, you have to have some 
> > > mechanism for re-acquiring it on re-probe (likely with the same 
> > > name) otherwise rebind is broken for every block driver.
> > 
> > So my patch does not release bdi in del_gendisk(). Bdi has two
> > initialization / destruction phases (similarly to request queue). You
> > allocate and initialize bdi through bdi_init(), then you call
> > bdi_register() to register it (which happens in device_add_disk()). 
> > On shutdown you have to first call bdi_unregister() (used to be 
> > called from blk_cleanup_queue(), my patch moved it to del_gendisk()). 
> > After that the last reference to bdi may be dropped which does final 
> > bdi destruction.
> > 
> > So do I understand correctly that SCSI may call device_add_disk(),
> > del_gendisk() repeatedly for the same request queue?
> 
> Yes.  The upper drivers (sd, sr, st and sg) follow a device model. 
>  They can thus be bound and unbound many times during the lifetime of a
> SCSI device.
> 
> >  If yes, then indeed I have a bug to fix... But gendisk seems to get 
> > allocated from scratch on each probe so we don't call 
> > device_add_disk(), del_gendisk() more times on the same disk, right?
> 
> Right, gendisk, being a generic representation of a disk is a property
> of the upper layer drivers.  We actually cheat and use it in all of
> them (including the apparent character ones, they use alloc_disk,
> put_disk but not add_disk).  So it has to be freed when the driver is
> unbound and reallocated when it is bound.  It's the fundamental entity
> which embeds the SCSI upper layer driver lifetime.
> 
> > > The fact that the second rebind tried with a different name 
> > > indicates that sd_devt_release wasn't called, so some vestige of 
> > > the devt remains on the subsequent rebind.
> > 
> > Yep, I guess that's caused by Dan's patch (commit 0dba1314d4f8 now) 
> > which calls put_disk_devt() only in blk_cleanup_queue() which, if I
> > understood you correctly, does not get called during unbind-bind 
> > cycle? In fact Dan's patch would end up leaking devt's because of 
> > repeated device_add_disk() calls for the same request queue...
> 
> That's a bit unfortunate.
> 
> > > Here's the problem: the queue belongs to SCSI (the lower layer), so 
> > > it's not going to change because it doesn't see the release.  The
> > > gendisk and its allied stuff belongs to sd so it gets 

With due respect

2017-03-07 Thread U B A
Base on the last email i sent to you, I wish to bring to your notice
that i am still waiting patiently for your response. Mrs. Anna
Johnson.


Re: [PATCHv2] hpsa: expose enclosures

2017-03-07 Thread Hannes Reinecke
On 03/07/2017 04:05 PM, Don Brace wrote:
>> -Original Message-
>> From: Hannes Reinecke [mailto:h...@suse.de]
>> Sent: Thursday, February 23, 2017 4:55 AM
>> To: Martin K. Petersen 
>> Cc: Christoph Hellwig ; James Bottomley
>> ; linux-scsi@vger.kernel.org;
>> Hannes Reinecke ; Don Brace
>> ; Hannes Reinecke 
>> Subject: [PATCHv2] hpsa: expose enclosures
>>
>> EXTERNAL EMAIL
>>
>>
>> Some servers have a built-in enclosure which will show up on the
>> same bus as the internal physical devices. This patch fixes the
>> driver to expose them.
>>
>> Cc: Don Brace 
>> Signed-off-by: Hannes Reinecke 
> 
> Masking various SES targets is a conscious choice, not a design oversight.
> So, I have to decline this patch.,
> 
Would you be okay with a module parameter for this?
It really feels stupid to present the user with an enclosure device
during boot (as hpsa presents all physical LUNs during booting), but not
exporting it to the user.

Cheers,

Hannes
-- 
Dr. Hannes ReineckezSeries & Storage
h...@suse.com  +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)


Re: [PATCH v3 01/16] lpfc: Correct WQ creation for pagesize

2017-03-07 Thread Mauricio Faria de Oliveira

Hello stable kernel maintainers,

On 03/07/2017 12:35 AM, Martin K. Petersen wrote:

Mauricio> Please flag this patch for stable.

Mauricio> This patch resolves a serious problem on IBM Power systems at
Mauricio> least.

Both patches are already upstream so I can't tag them for stable. Either
you or James should mail sta...@vger.kernel.org and request for the
patches to be queued up.


Can this commit be included on stable v4.4+ , please? (in Linus tree)

8ea73db486cda442f0671f4bc9c03a76be398a28 "scsi: lpfc: Correct WQ 
creation for pagesize"


Thanks!


For documentation purposes, the resolved issue is described on [1].

Following is the rationale for the stable kernel version asked for, in
case Broadcom/others would like to discuss/contribute.

These are the 3 commits which introduced the lines changed by the 
desired fix commit


0c651878ba3018bb4bbfa2ccd0a876bebb618768 "[SCSI] lpfc 8.3.41: Fixed 
support for 128 byte WQEs"
5a6f133eea2d0b4f8f75367b803fef0f03acf268 "[SCSI] lpfc 8.3.22: Add 
new mailbox command and new BSG fix"
c31098cef5e091e22a02ff255f911e0ad71cc393 "[SCSI] lpfc 8.3.23: Fixes 
related to new hardware"


So, checking the oldest release git tag which contains the most recent
commit (so the fix applies cleanly)

$ git tag --contains 0c651878ba3018bb4bbfa2ccd0a876bebb618768 | 
grep -v rc | sort -V

v3.12
<...>
v4.10

And it is cherry-picked/applies cleanly on v3.12:

$ git checkout -b lpfc-v3.12 v3.12
Switched to a new branch 'lpfc-v3.12'

$ git cherry-pick 8ea73db486cda442f0671f4bc9c03a76be398a28
[lpfc-v3.12 c2b0912e6135] scsi: lpfc: Correct WQ creation for pagesize
Author: James Smart 
2 files changed, 7 insertions(+), 4 deletions(-)

However, v3.12 seems too old to go back, and it's unclear whether there
was any problem actually reported on these older kernel releases, or
whether the hardware/configuration that reproduces this problem is
supported on v3.12 at all.

On PowerPC (which is directly affected by the fix, as at least IBM
Power servers uses a 64k page size by default on most distros), this
fix is probably only required on v4.4+, so to cover a distro for which
Broadcom doesn't ship inbox drivers to (w/ fix included).

So, I'll ask it for stable v4.4+, and Broadcom/others can expand the
range if they see fit/required.

[1] http://www.spinics.net/lists/linux-scsi/msg105886.html

cheers,

--
Mauricio Faria de Oliveira
IBM Linux Technology Center



RE: [PATCHv2] hpsa: expose enclosures

2017-03-07 Thread Don Brace
> -Original Message-
> From: Hannes Reinecke [mailto:h...@suse.de]
> Sent: Thursday, February 23, 2017 4:55 AM
> To: Martin K. Petersen 
> Cc: Christoph Hellwig ; James Bottomley
> ; linux-scsi@vger.kernel.org;
> Hannes Reinecke ; Don Brace
> ; Hannes Reinecke 
> Subject: [PATCHv2] hpsa: expose enclosures
> 
> EXTERNAL EMAIL
> 
> 
> Some servers have a built-in enclosure which will show up on the
> same bus as the internal physical devices. This patch fixes the
> driver to expose them.
> 
> Cc: Don Brace 
> Signed-off-by: Hannes Reinecke 

Masking various SES targets is a conscious choice, not a design oversight.
So, I have to decline this patch.,

Thanks,
Don Brace
ESC - Smart Storage
Microsemi Corporation

> ---
>  drivers/scsi/hpsa.c | 5 -
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c
> index 524a0c7..a77ed5a 100644
> --- a/drivers/scsi/hpsa.c
> +++ b/drivers/scsi/hpsa.c
> @@ -4407,7 +4407,10 @@ static void hpsa_update_scsi_devices(struct
> ctlr_info *h)
>  * Expose all devices except for physical devices that
>  * are masked.
>  */
> -   if (MASKED_DEVICE(lunaddrbytes) && this_device-
> >physical_device)
> +   if (this_device->devtype == TYPE_ENCLOSURE)
> +   this_device->expose_device = 1;
> +   else if (MASKED_DEVICE(lunaddrbytes) &&
> +this_device->physical_device)
> this_device->expose_device = 0;
> else
> this_device->expose_device = 1;
> --
> 1.8.5.6



RE: [PATCH 13/29] drivers, media: convert vb2_vmarea_handler.refcount from atomic_t to refcount_t

2017-03-07 Thread Reshetova, Elena
> Hi Elena,
> 
> On Mon, Mar 06, 2017 at 04:21:00PM +0200, Elena Reshetova wrote:
> > refcount_t type and corresponding API should be
> > used instead of atomic_t when the variable is used as
> > a reference counter. This allows to avoid accidental
> > refcounter overflows that might lead to use-after-free
> > situations.
> >
> > Signed-off-by: Elena Reshetova 
> > Signed-off-by: Hans Liljestrand 
> > Signed-off-by: Kees Cook 
> > Signed-off-by: David Windsor 
> > ---
> >  drivers/media/v4l2-core/videobuf2-memops.c | 6 +++---
> >  include/media/videobuf2-memops.h   | 3 ++-
> >  2 files changed, 5 insertions(+), 4 deletions(-)
> >
> > diff --git a/drivers/media/v4l2-core/videobuf2-memops.c 
> > b/drivers/media/v4l2-
> core/videobuf2-memops.c
> > index 1cd322e..4bb8424 100644
> > --- a/drivers/media/v4l2-core/videobuf2-memops.c
> > +++ b/drivers/media/v4l2-core/videobuf2-memops.c
> > @@ -96,10 +96,10 @@ static void vb2_common_vm_open(struct
> vm_area_struct *vma)
> > struct vb2_vmarea_handler *h = vma->vm_private_data;
> >
> > pr_debug("%s: %p, refcount: %d, vma: %08lx-%08lx\n",
> > -  __func__, h, atomic_read(h->refcount), vma->vm_start,
> > +  __func__, h, refcount_read(h->refcount), vma->vm_start,
> >vma->vm_end);
> >
> > -   atomic_inc(h->refcount);
> > +   refcount_inc(h->refcount);
> >  }
> >
> >  /**
> > @@ -114,7 +114,7 @@ static void vb2_common_vm_close(struct
> vm_area_struct *vma)
> > struct vb2_vmarea_handler *h = vma->vm_private_data;
> >
> > pr_debug("%s: %p, refcount: %d, vma: %08lx-%08lx\n",
> > -  __func__, h, atomic_read(h->refcount), vma->vm_start,
> > +  __func__, h, refcount_read(h->refcount), vma->vm_start,
> >vma->vm_end);
> >
> > h->put(h->arg);
> > diff --git a/include/media/videobuf2-memops.h b/include/media/videobuf2-
> memops.h
> > index 36565c7a..a6ed091 100644
> > --- a/include/media/videobuf2-memops.h
> > +++ b/include/media/videobuf2-memops.h
> > @@ -16,6 +16,7 @@
> >
> >  #include 
> >  #include 
> > +#include 
> >
> >  /**
> >   * struct vb2_vmarea_handler - common vma refcount tracking handler
> > @@ -25,7 +26,7 @@
> >   * @arg:   argument for @put callback
> >   */
> >  struct vb2_vmarea_handler {
> > -   atomic_t*refcount;
> > +   refcount_t  *refcount;
> 
> This is a pointer to refcount, not refcount itself. The refcount is part of
> a memory type specific struct, the types that you change in the following
> three patches. I guess it would still compile and work as separate patches
> but you'd sure get warnings at least.

Actually it doesn't compile without this patch if I remember it correctly back 
in past when I was initially splitting the patches per variable. 

> 
> How about merging this and the three following patches that change the memop
> refcount types?

Sounds good!

Best Regards,
Elena.
> 
> > void(*put)(void *arg);
> > void*arg;
> >  };
> 
> --
> Kind regards,
> 
> Sakari Ailus
> e-mail: sakari.ai...@iki.fi   XMPP: sai...@retiisi.org.uk


RE: [PATCH 12/29] drivers, media: convert s2255_dev.num_channels from atomic_t to refcount_t

2017-03-07 Thread Reshetova, Elena

> Hi Elena,
> 
> On Mon, Mar 06, 2017 at 04:20:59PM +0200, Elena Reshetova wrote:
> > refcount_t type and corresponding API should be
> > used instead of atomic_t when the variable is used as
> > a reference counter. This allows to avoid accidental
> > refcounter overflows that might lead to use-after-free
> > situations.
> >
> > Signed-off-by: Elena Reshetova 
> > Signed-off-by: Hans Liljestrand 
> > Signed-off-by: Kees Cook 
> > Signed-off-by: David Windsor 
> > ---
> ...
> > @@ -1688,7 +1689,7 @@ static int s2255_probe_v4l(struct s2255_dev *dev)
> > "failed to register
> video device!\n");
> > break;
> > }
> > -   atomic_inc(&dev->num_channels);
> > +   refcount_set(&dev->num_channels, 1);
> 
> That's not right. The loop runs four iterations and the value after the
> loop should be indeed the number of channels.

Oh, yes, I was blind here, sorry. The problem why it cannot be left as inc is 
because it would do increment from zero here, which is not allowed by 
refcount_t interface. 

> atomic_t isn't really used for reference counting here, but storing out how
> many "channels" there are per hardware device, with maximum number of four
> (4). I'd leave this driver using atomic_t.
Yes, sounds like the best thing to do. I will drop this patch. 

Thank you for reviews!

Best Regards,
Elena.
> 
> > v4l2_info(&dev->v4l2_dev, "V4L2 device registered
> as %s\n",
> >   video_device_node_name(&vc-
> >vdev));
> >
> 
> --
> Kind regards,
> 
> Sakari Ailus
> e-mail: sakari.ai...@iki.fi   XMPP: sai...@retiisi.org.uk


Re: [bdi_unregister] 165a5e22fa INFO: task swapper:1 blocked for more than 120 seconds.

2017-03-07 Thread Jan Kara
On Mon 06-03-17 09:25:42, James Bottomley wrote:
> On Mon, 2017-03-06 at 17:13 +0100, Jan Kara wrote:
> > On Mon 06-03-17 07:44:55, James Bottomley wrote:
...
> > > > Sure. The call trace is:
> > > > 
> > > > [   41.919244] [ cut here ]
> > > > [   41.919263] WARNING: CPU: 4 PID: 2335 at
> > > > drivers/scsi/sd.c:3332
> > > > sd_shutdown+0x2f/0x100
> > > > [   41.919268] Modules linked in: scsi_debug(+) netconsole loop
> > > > btrfs
> > > > raid6_pq zlib_deflate lzo_compress xor
> > > > [   41.919319] CPU: 4 PID: 2335 Comm: modprobe Not tainted 4.11.0
> > > > -rc1
> > > > -xen+ #49
> > > > [   41.919325] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
> > > > [   41.919331] Call Trace:
> > > > [   41.919343]  dump_stack+0x8e/0xf0
> > > > [   41.919354]  __warn+0x116/0x120
> > > > [   41.919361]  warn_slowpath_null+0x1d/0x20
> > > > [   41.919368]  sd_shutdown+0x2f/0x100
> > > > [   41.919374]  sd_remove+0x70/0xd0
> > > > 
> > > > *** Here is the unexpected step I guess...
> > > > 
> > > > [   41.919383]  driver_probe_device+0xe0/0x4c0
> > > 
> > > Exactly.  It's this, I think
> > > 
> > >   bool test_remove = IS_ENABLED(CONFIG_DEBUG_TEST_DRIVER_REMOVE)
> > > &&
> > >  !drv->suppress_bind_attrs;
> > > 
> > > You have that config option set.
> > 
> > Yes - or better said 0-day testing has it set. Maybe that is not a 
> > sane default for 0-day tests? The option is explicitely marked as
> > "unstable"... Fengguang?
> > 
> > > So the drivers base layer is calling ->remove after probe and
> > > triggering the destruction of the queue.
> > > 
> > > What to do about this (apart from nuke such a stupid option) is
> > > somewhat more problematic.
> > 
> > I guess this is between you and Greg :).
> 
> Nice try, sport ... you qualify just behind Dan in the "not my problem"
> olympic hurdles event.  I'm annoyed because there's no indication in
> the log that the driver add behaviour is radically altered, so I've
> been staring at the wrong code for weeks.  However, the unbind/rebind
> should work, so the real issue is that your bdi changes (or perhaps
> something else in block) have induced a regression in the unbinding of
> upper layer drivers.  If you're going to release the bdi in
> del_gendisk, you have to have some mechanism for re-acquiring it on re
> -probe (likely with the same name) otherwise rebind is broken for every
> block driver.

So my patch does not release bdi in del_gendisk(). Bdi has two
initialization / destruction phases (similarly to request queue). You
allocate and initialize bdi through bdi_init(), then you call
bdi_register() to register it (which happens in device_add_disk()). On
shutdown you have to first call bdi_unregister() (used to be called from
blk_cleanup_queue(), my patch moved it to del_gendisk()). After that the
last reference to bdi may be dropped which does final bdi destruction.

So do I understand correctly that SCSI may call device_add_disk(),
del_gendisk() repeatedly for the same request queue? If yes, then indeed I
have a bug to fix... But gendisk seems to get allocated from scratch on
each probe so we don't call device_add_disk(), del_gendisk() more times on
the same disk, right?

> The fact that the second rebind tried with a different name indicates
> that sd_devt_release wasn't called, so some vestige of the devt remains
> on the subsequent rebind.

Yep, I guess that's caused by Dan's patch (commit 0dba1314d4f8 now) which
calls put_disk_devt() only in blk_cleanup_queue() which, if I understood
you correctly, does not get called during unbind-bind cycle? In fact Dan's
patch would end up leaking devt's because of repeated device_add_disk()
calls for the same request queue...

> Here's the problem: the queue belongs to SCSI (the lower layer), so it's
> not going to change because it doesn't see the release.  The gendisk and
> its allied stuff belongs to sd so it gets freed and re-created for the
> same queue.  Something in block is very confused when this happens.

Yep, I think the binding of request queue to different gendisks is
something I or Dan did not expect.

Honza
-- 
Jan Kara 
SUSE Labs, CR


RE: [RFC] hv_storvsc: error handling.

2017-03-07 Thread KY Srinivasan


> -Original Message-
> From: Christoph Hellwig [mailto:h...@lst.de]
> Sent: Monday, March 6, 2017 9:06 PM
> To: KY Srinivasan 
> Cc: Stephen Hemminger ; James
> Bottomley ; Hannes Reinecke
> ; Christoph Hellwig ; James Bottomley
> ; Jens Axboe ; Linus Torvalds
> ; Martin K. Petersen
> ; Dexuan Cui ; Long
> Li ; Josh Poulson ; Adrian
> Suhov (Cloudbase Solutions SRL) ; linux-
> s...@vger.kernel.org; Haiyang Zhang 
> Subject: Re: [RFC] hv_storvsc: error handling.
> 
> > Was the invalid LUN in the LUN0 position. Inquiry of LUN0 support (when
> LUN0 is not populated)
> > was added only recently to address host side issue.
> 
> How does HyperV expect device scanning to happen for a not populated
> LUN?
> 
> REPORT SUPPORTED LUNS but nothing else on LUN 0?  Maybe a TEST UNIT
> READY
> thrown?  Or does it actually support the REPORT LUNS well known LU?

LUN0 on every target is supposed at least return enough data to support
report luns  scan for the target. It looks like if LUN0 on a target is empty DVD
device, the INQUIRY data that is returned from the host is bogus. Stephen can 
give
additional information on this.

K. Y  


Re: [PATCH] scsi: lpfc: Add shutdown method for kexec

2017-03-07 Thread Mauricio Faria de Oliveira

Martin,

On 03/07/2017 02:24 AM, Benjamin Herrenschmidt wrote:


On Mon, 2017-03-06 at 22:46 -0500, Martin K. Petersen wrote:

I don't recall a consensus being reached on this patch.



What would be the opposition ? Without it kexec breaks. With it, it
works ...


That is the argument I'd present/ask for consideration too.

I think I should have included this in the tested-by tag email, for
documentation/evidence: no regression observed in system shutdown path.

Thanks,

--
Mauricio Faria de Oliveira
IBM Linux Technology Center



Re: [PATCH v3 01/16] lpfc: Correct WQ creation for pagesize

2017-03-07 Thread Mauricio Faria de Oliveira

Hi Martin,

On 03/07/2017 12:35 AM, Martin K. Petersen wrote:

Mauricio> Please flag this patch for stable.



Both patches are already upstream so I can't tag them for stable. Either
you or James should mail sta...@vger.kernel.org and request for the
patches to be queued up.


Right, sorry; I missed checking the right tree. Thanks for the pointers.


--
Mauricio Faria de Oliveira
IBM Linux Technology Center



Re: Help needed for a SCSI hang (SATA drives)

2017-03-07 Thread Jack Wang
2017-03-07 1:54 GMT+01:00 V :
> Hi,
>
> I am reaching out regarding a SCSI error handler issue, which we see
> in our lab systems.
>
> Environment: Ubuntu 4.4.0-31-generic
>
> Issue: Write IOs are getting stuck in our system randomly. All drives
> seen with the issue are all SATA drives.
> Root cause: SCSI error handler is not woken up (or) it is up, but not
> able to flush commands and hence the IOs get stuck. (as the requests
> are not flushed out and host is not restarted)
>
> We have 6 instances seen till now.
>
> This is what we see.
> 1) As part of testing our drives, we start lot of read/writes, along
> with some user space utilities running to check the drive health.
> 1) In our recent testing, we see lot of commands going through the
> "abort scheduled" path. And we see that in between, for one of the
> commands, the error handler is not woken up, and majority of
> processes, which were writing, gets stalled.
> 2) We are still trying to figure out what is causing this much number
> of abort? Is it usual? Are there some other debugs, which I could
> enable to get more information? We know these are user space commands
> which are being aborted, but not sure which exact command it is for
> now.
> 3) I see the "abort scheduled" messages in almost all drives in all
> systems, hence i dont believe it is a drive issue.
> 4) I checked the host_failed and host_busy variables, both are set to
> 1 in system 1. 2nd one is still alive, havent taken a crash dump yet.
> 5) All drives seen with the issue are SATA drives.
> 6) I have attached udevadm info of a drive which failed in system 2.
>
> Thanks in advance,
> Viswesh
>
>
> Logs
> ---
> System 1
>
> [371546.605736] sd 9:0:0:0: scsi_block_when_processing_errors: rtn: 1
> [371546.606727] sd 9:0:0:0: scsi_block_when_processing_errors: rtn: 1
> [371546.607721] sd 9:0:0:0: scsi_block_when_processing_errors: rtn: 1
> [371546.618113] sd 9:0:0:0: [sg19] tag#20 abort scheduled
> [371546.624039] sd 9:0:0:0: [sg19] tag#20 aborting command
> [371546.624045] sd 9:0:0:0: [sg19] tag#20 cmd abort failed
> [371546.624051] scsi host9: Waking error handler thread
> [371546.624060] scsi host9: scsi_eh_9: waking up 0/1/1
> [371546.624081] sd 9:0:0:0: [sg19] tag#20 scsi_eh_9: flush finish cmd
> [371546.624095] scsi host9: waking up host to restart
> [371546.624098] scsi host9: scsi_eh_9: sleeping
>
>
> [371546.635314] sd 8:0:0:0: [sg17] tag#13 abort scheduled
> [371546.640031] sd 8:0:0:0: [sg17] tag#13 aborting command
> [371546.640037] sd 8:0:0:0: [sg17] tag#13 cmd abort failed
> [371546.640043] scsi host8: Waking error handler thread
> [371546.640078] scsi host8: scsi_eh_8: waking up 0/1/1
> [371546.640098] sd 8:0:0:0: [sg17] tag#13 scsi_eh_8: flush finish cmd
> [371546.640113] scsi host8: waking up host to restart
> [371546.640117] scsi host8: scsi_eh_8: sleeping
>
> [371546.657269] sd 6:0:0:0: [sg12] tag#1 abort scheduled
> [371546.664034] sd 6:0:0:0: [sg12] tag#1 aborting command
> [371546.664040] sd 6:0:0:0: [sg12] tag#1 cmd abort failed
>
> Here we can see that, error handler is not woken up.  And the entire
> IO subsystem goes for a toss.
>
> [371777.594510] INFO: task md2_raid1:508 blocked for more than 120 seconds.
> [371777.594571]  Not tainted 4.4.0-31-generic #50~14.04.1-Ubuntu
> [371777.594613] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> [371777.594665] md2_raid1  D 883fed2afc780  508  2 0x
> [371777.594673]  883fed2afc78 881fef0e8000 883fed21c4c0
> 883fed2b
> [371777.594678]  883ff236e298 883ff236e000 883ff236e018
> 883ff236e018
>
> By enabling more scsi logs(in a different system), including LL, we
> get some more info ( I have enabled the full scsi levels for all
> categories and it is still running)
>
>
> System 2
>
> [405197.171574] sd 5:0:0:0: [sg3] sg_write: count=3D88
> [405197.171577] sd 5:0:0:0: scsi_block_when_processing_errors: rtn: 1
> [405197.171581] sd 5:0:0:0: [sg3] sg_common_write:  scsi
> opcode=3D0x85, cmd_size=3D16
> [405197.171583] sd 5:0:0:0: [sg3] sg_start_req: dxfer_len=3D512
> [405197.171605] sd 5:0:0:0: [sg3] sg_link_reserve: size=3D512
> [405197.171623] sd 5:0:0:0: [sg3] sg_poll: res=3D0x104
> [405197.172648] sd 5:0:0:0: [sg3] sg_cmd_done: pack_id=3D0, res=3D0x0
> [405197.172701] sd 5:0:0:0: [sg3] sg_poll: res=3D0x145
> [405197.172710] sd 5:0:0:0: [sg3] sg_read: count=3D88
> [405197.172716] sd 5:0:0:0: [sg3] sg_finish_rem_req: res_used=3D1
> [405197.172721] sd 5:0:0:0: [sg3] sg_unlink_reserve: req->k_use_sg=3D1
> [405197.172756] sd 5:0:0:0: [sg3] sg_write: count=3D88
> [405197.172760] sd 5:0:0:0: scsi_block_when_processing_errors: rtn: 1
> [405197.172764] sd 5:0:0:0: [sg3] sg_common_write:  scsi
> opcode=3D0x85, cmd_size=3D16
> [405197.172767] sd 5:0:0:0: [sg3] sg_start_req: dxfer_len=3D512
> [405197.172774] sd 5:0:0:0: [sg3] sg_link_reserve: size=3D512
> [405197.172793] sd 5:0:0:0: [sg3] sg_poll: res=3D0x104
> [405197.173806] sd 5:0:0:0:

Re: [PATCH 11/29] drivers, media: convert cx88_core.refcount from atomic_t to refcount_t

2017-03-07 Thread Sergei Shtylyov

On 3/7/2017 10:52 AM, Reshetova, Elena wrote:


refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This allows to avoid accidental
refcounter overflows that might lead to use-after-free
situations.

Signed-off-by: Elena Reshetova 
Signed-off-by: Hans Liljestrand 
Signed-off-by: Kees Cook 
Signed-off-by: David Windsor 

[...]

diff --git a/drivers/media/pci/cx88/cx88.h b/drivers/media/pci/cx88/cx88.h
index 115414c..16c1313 100644
--- a/drivers/media/pci/cx88/cx88.h
+++ b/drivers/media/pci/cx88/cx88.h

[...]

@@ -339,7 +340,7 @@ struct cx8802_dev;

 struct cx88_core {
struct list_head   devlist;
-   atomic_t   refcount;
+   refcount_t   refcount;


Could you please keep the name aligned with above and below?


You mean "not aligned" to devlist, but with a shift like it was before?


   I mean aligned, like it was before. :-)


Sure, will fix. Is the patch ok otherwise?


   I haven't noticed anything else...


Best Regards,
Elena.

[...]

MBR, Sergei


Re: [RFC] hv_storvsc: error handling.

2017-03-07 Thread Christoph Hellwig
> Was the invalid LUN in the LUN0 position. Inquiry of LUN0 support (when LUN0 
> is not populated)
> was added only recently to address host side issue.

How does HyperV expect device scanning to happen for a not populated LUN?

REPORT SUPPORTED LUNS but nothing else on LUN 0?  Maybe a TEST UNIT READY
thrown?  Or does it actually support the REPORT LUNS well known LU?


Re: [PATCH 13/29] drivers, media: convert vb2_vmarea_handler.refcount from atomic_t to refcount_t

2017-03-07 Thread Sakari Ailus
Hi Elena,

On Mon, Mar 06, 2017 at 04:21:00PM +0200, Elena Reshetova wrote:
> refcount_t type and corresponding API should be
> used instead of atomic_t when the variable is used as
> a reference counter. This allows to avoid accidental
> refcounter overflows that might lead to use-after-free
> situations.
> 
> Signed-off-by: Elena Reshetova 
> Signed-off-by: Hans Liljestrand 
> Signed-off-by: Kees Cook 
> Signed-off-by: David Windsor 
> ---
>  drivers/media/v4l2-core/videobuf2-memops.c | 6 +++---
>  include/media/videobuf2-memops.h   | 3 ++-
>  2 files changed, 5 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/media/v4l2-core/videobuf2-memops.c 
> b/drivers/media/v4l2-core/videobuf2-memops.c
> index 1cd322e..4bb8424 100644
> --- a/drivers/media/v4l2-core/videobuf2-memops.c
> +++ b/drivers/media/v4l2-core/videobuf2-memops.c
> @@ -96,10 +96,10 @@ static void vb2_common_vm_open(struct vm_area_struct *vma)
>   struct vb2_vmarea_handler *h = vma->vm_private_data;
>  
>   pr_debug("%s: %p, refcount: %d, vma: %08lx-%08lx\n",
> -__func__, h, atomic_read(h->refcount), vma->vm_start,
> +__func__, h, refcount_read(h->refcount), vma->vm_start,
>  vma->vm_end);
>  
> - atomic_inc(h->refcount);
> + refcount_inc(h->refcount);
>  }
>  
>  /**
> @@ -114,7 +114,7 @@ static void vb2_common_vm_close(struct vm_area_struct 
> *vma)
>   struct vb2_vmarea_handler *h = vma->vm_private_data;
>  
>   pr_debug("%s: %p, refcount: %d, vma: %08lx-%08lx\n",
> -__func__, h, atomic_read(h->refcount), vma->vm_start,
> +__func__, h, refcount_read(h->refcount), vma->vm_start,
>  vma->vm_end);
>  
>   h->put(h->arg);
> diff --git a/include/media/videobuf2-memops.h 
> b/include/media/videobuf2-memops.h
> index 36565c7a..a6ed091 100644
> --- a/include/media/videobuf2-memops.h
> +++ b/include/media/videobuf2-memops.h
> @@ -16,6 +16,7 @@
>  
>  #include 
>  #include 
> +#include 
>  
>  /**
>   * struct vb2_vmarea_handler - common vma refcount tracking handler
> @@ -25,7 +26,7 @@
>   * @arg: argument for @put callback
>   */
>  struct vb2_vmarea_handler {
> - atomic_t*refcount;
> + refcount_t  *refcount;

This is a pointer to refcount, not refcount itself. The refcount is part of
a memory type specific struct, the types that you change in the following
three patches. I guess it would still compile and work as separate patches
but you'd sure get warnings at least.

How about merging this and the three following patches that change the memop
refcount types?

>   void(*put)(void *arg);
>   void*arg;
>  };

-- 
Kind regards,

Sakari Ailus
e-mail: sakari.ai...@iki.fi XMPP: sai...@retiisi.org.uk


Re: [PATCH 12/29] drivers, media: convert s2255_dev.num_channels from atomic_t to refcount_t

2017-03-07 Thread Sakari Ailus
Hi Elena,

On Mon, Mar 06, 2017 at 04:20:59PM +0200, Elena Reshetova wrote:
> refcount_t type and corresponding API should be
> used instead of atomic_t when the variable is used as
> a reference counter. This allows to avoid accidental
> refcounter overflows that might lead to use-after-free
> situations.
> 
> Signed-off-by: Elena Reshetova 
> Signed-off-by: Hans Liljestrand 
> Signed-off-by: Kees Cook 
> Signed-off-by: David Windsor 
> ---
...
> @@ -1688,7 +1689,7 @@ static int s2255_probe_v4l(struct s2255_dev *dev)
>   "failed to register video device!\n");
>   break;
>   }
> - atomic_inc(&dev->num_channels);
> + refcount_set(&dev->num_channels, 1);

That's not right. The loop runs four iterations and the value after the
loop should be indeed the number of channels.

atomic_t isn't really used for reference counting here, but storing out how
many "channels" there are per hardware device, with maximum number of four
(4). I'd leave this driver using atomic_t.

>   v4l2_info(&dev->v4l2_dev, "V4L2 device registered as %s\n",
> video_device_node_name(&vc->vdev));
>  

-- 
Kind regards,

Sakari Ailus
e-mail: sakari.ai...@iki.fi XMPP: sai...@retiisi.org.uk


Re: [PATCH 11/29] drivers, media: convert cx88_core.refcount from atomic_t to refcount_t

2017-03-07 Thread Sakari Ailus
On Mon, Mar 06, 2017 at 04:20:58PM +0200, Elena Reshetova wrote:
> refcount_t type and corresponding API should be
> used instead of atomic_t when the variable is used as
> a reference counter. This allows to avoid accidental
> refcounter overflows that might lead to use-after-free
> situations.
> 
> Signed-off-by: Elena Reshetova 
> Signed-off-by: Hans Liljestrand 
> Signed-off-by: Kees Cook 
> Signed-off-by: David Windsor 

Acked-by: Sakari Ailus 

-- 
Sakari Ailus
e-mail: sakari.ai...@iki.fi XMPP: sai...@retiisi.org.uk


Re: [PATCH v3 01/16] lpfc: Correct WQ creation for pagesize

2017-03-07 Thread Martin K. Petersen
> "Mauricio" == Mauricio Faria de Oliveira  
> writes:

Mauricio,

Mauricio> Please flag this patch for stable.

Mauricio> This patch resolves a serious problem on IBM Power systems at
Mauricio> least.

Both patches are already upstream so I can't tag them for stable. Either
you or James should mail sta...@vger.kernel.org and request for the
patches to be queued up.

-- 
Martin K. Petersen  Oracle Linux Engineering


RE: [PATCH 21/29] drivers, s390: convert fc_fcp_pkt.ref_cnt from atomic_t to refcount_t

2017-03-07 Thread Reshetova, Elena
> On 03/06/2017 03:21 PM, Elena Reshetova wrote:
> > refcount_t type and corresponding API should be
> > used instead of atomic_t when the variable is used as
> > a reference counter. This allows to avoid accidental
> > refcounter overflows that might lead to use-after-free
> > situations.
> 
> The subject is wrong, should be something like "scsi: libfc convert
> fc_fcp_pkt.ref_cnt from atomic_t to refcount_t" but not s390.

Very sorry about this: the error in the subject got from the time when I was 
trying to break the bigger drivers patch set into per-variable part and trying 
to automate the process too much :(
I will fix it in the end version!

Best Regards,
Elena

> 
> Other than that
> Acked-by: Johannes Thumshirn 
> 
> --
> Johannes Thumshirn  Storage
> jthumsh...@suse.de+49 911 74053 689
> SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
> GF: Felix Imendörffer, Jane Smithard, Graham Norton
> HRB 21284 (AG Nürnberg)
> Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850


RE: [PATCH 11/29] drivers, media: convert cx88_core.refcount from atomic_t to refcount_t

2017-03-07 Thread Reshetova, Elena
> Hello.
> 
> On 03/06/2017 05:20 PM, Elena Reshetova wrote:
> 
> > refcount_t type and corresponding API should be
> > used instead of atomic_t when the variable is used as
> > a reference counter. This allows to avoid accidental
> > refcounter overflows that might lead to use-after-free
> > situations.
> >
> > Signed-off-by: Elena Reshetova 
> > Signed-off-by: Hans Liljestrand 
> > Signed-off-by: Kees Cook 
> > Signed-off-by: David Windsor 
> [...]
> > diff --git a/drivers/media/pci/cx88/cx88.h b/drivers/media/pci/cx88/cx88.h
> > index 115414c..16c1313 100644
> > --- a/drivers/media/pci/cx88/cx88.h
> > +++ b/drivers/media/pci/cx88/cx88.h
> > @@ -24,6 +24,7 @@
> >  #include 
> >  #include 
> >  #include 
> > +#include 
> >
> >  #include 
> >  #include 
> > @@ -339,7 +340,7 @@ struct cx8802_dev;
> >
> >  struct cx88_core {
> > struct list_head   devlist;
> > -   atomic_t   refcount;
> > +   refcount_t   refcount;
> 
> Could you please keep the name aligned with above and below?

You mean "not aligned" to devlist, but with a shift like it was before? 
Sure, will fix. Is the patch ok otherwise?

Best Regards,
Elena.

> 
> >
> > /* board name */
> > intnr;
> >
> 
> MBR, Sergei