Re: [PATCH] qla2xxx: Get mutex lock before checking optrom_state

2016-12-24 Thread Laurence Oberman


- Original Message -
> From: "Milan P. Gandhi" 
> To: linux-scsi@vger.kernel.org
> Cc: "Laurence Oberman" , "chad dupuis" 
> 
> Sent: Saturday, December 24, 2016 11:32:46 AM
> Subject: [PATCH] qla2xxx: Get mutex lock before checking optrom_state
> 
> Hello,
> 
> There is a race condition in the qla2xxx optrom functions where
> one thread might modify the optrom buffer and optrom_state while
> another thread is still reading from them.
> 
> In a couple of crashes, it was found that we had successfully
> passed the following 'if' check, where we confirm optrom_state
> to be QLA_SREADING. But by the time we acquired the mutex lock
> to proceed with the memory_read_from_buffer function, some other
> thread/process had already modified the option ROM buffer
> and changed optrom_state from QLA_SREADING to QLA_SWAITING. Then
> we got ha->optrom_buffer 0x0 and crashed the system:
> 
>     if (ha->optrom_state != QLA_SREADING)
>         return 0;
> 
>     mutex_lock(&ha->optrom_mutex);
>     rval = memory_read_from_buffer(buf, count, &off, ha->optrom_buffer,
>         ha->optrom_region_size);
>     mutex_unlock(&ha->optrom_mutex);
> 
> 
> With the current optrom functions we get the following crash due to
> a race condition:
> 
> [ 1479.466679] BUG: unable to handle kernel NULL pointer dereference at (null)
> [ 1479.466707] IP: [] memcpy+0x6/0x110
> [...]
> [ 1479.473673] Call Trace:
> [ 1479.474296]  [] ? memory_read_from_buffer+0x3c/0x60
> [ 1479.474941]  [] qla2x00_sysfs_read_optrom+0x9c/0xc0 [qla2xxx]
> [ 1479.475571]  [] read+0xdb/0x1f0
> [ 1479.476206]  [] vfs_read+0x9e/0x170
> [ 1479.476839]  [] SyS_read+0x7f/0xe0
> [ 1479.477466]  [] system_call_fastpath+0x16/0x1b
> 
> 
> The patch below modifies the qla2x00_sysfs_read_optrom and
> qla2x00_sysfs_write_optrom functions to take optrom_mutex
> before checking ha->optrom_state, to avoid similar crashes.
> 
> The patch was applied and tested, and the same crashes were no
> longer observed.
> 
> 
> Tested-by: Milan P. Gandhi 
> Signed-off-by: Milan P. Gandhi 
> ---
>  drivers/scsi/qla2xxx/qla_attr.c | 18 +++++++++++++-----
>  1 file changed, 13 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/scsi/qla2xxx/qla_attr.c
> b/drivers/scsi/qla2xxx/qla_attr.c
> index da5ae11..47ea164 100644
> --- a/drivers/scsi/qla2xxx/qla_attr.c
> +++ b/drivers/scsi/qla2xxx/qla_attr.c
> @@ -329,12 +329,15 @@ qla2x00_sysfs_read_optrom(struct file *filp, struct kobject *kobj,
>  	struct qla_hw_data *ha = vha->hw;
>  	ssize_t rval = 0;
>  
> +	mutex_lock(&ha->optrom_mutex);
> +
>  	if (ha->optrom_state != QLA_SREADING)
> -		return 0;
> +		goto out;
>  
> -	mutex_lock(&ha->optrom_mutex);
>  	rval = memory_read_from_buffer(buf, count, &off, ha->optrom_buffer,
>  	    ha->optrom_region_size);
> +
> +out:
>  	mutex_unlock(&ha->optrom_mutex);
>  
>  	return rval;
> @@ -349,14 +352,19 @@ qla2x00_sysfs_write_optrom(struct file *filp, struct kobject *kobj,
>  	    struct device, kobj)));
>  	struct qla_hw_data *ha = vha->hw;
>  
> -	if (ha->optrom_state != QLA_SWRITING)
> +	mutex_lock(&ha->optrom_mutex);
> +
> +	if (ha->optrom_state != QLA_SWRITING) {
> +		mutex_unlock(&ha->optrom_mutex);
>  		return -EINVAL;
> -	if (off > ha->optrom_region_size)
> +	}
> +	if (off > ha->optrom_region_size) {
> +		mutex_unlock(&ha->optrom_mutex);
>  		return -ERANGE;
> +	}
>  	if (off + count > ha->optrom_region_size)
>  		count = ha->optrom_region_size - off;
>  
> -	mutex_lock(&ha->optrom_mutex);
>  	memcpy(&ha->optrom_buffer[off], buf, count);
>  	mutex_unlock(&ha->optrom_mutex);
>  
> 
Looks good, and I know it fixed the issue.
Milan, thank you for this work.

Reviewed-by: Laurence Oberman  
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] qla2xxx: Get mutex lock before checking optrom_state

2016-12-24 Thread Milan P. Gandhi
Hello,

There is a race condition in the qla2xxx optrom functions where
one thread might modify the optrom buffer and optrom_state while
another thread is still reading from them.

In a couple of crashes, it was found that we had successfully
passed the following 'if' check, where we confirm optrom_state
to be QLA_SREADING. But by the time we acquired the mutex lock
to proceed with the memory_read_from_buffer function, some other
thread/process had already modified the option ROM buffer
and changed optrom_state from QLA_SREADING to QLA_SWAITING. Then
we got ha->optrom_buffer 0x0 and crashed the system:

    if (ha->optrom_state != QLA_SREADING)
        return 0;

    mutex_lock(&ha->optrom_mutex);
    rval = memory_read_from_buffer(buf, count, &off, ha->optrom_buffer,
        ha->optrom_region_size);
    mutex_unlock(&ha->optrom_mutex);


With the current optrom functions we get the following crash due to
a race condition:

[ 1479.466679] BUG: unable to handle kernel NULL pointer dereference at (null)
[ 1479.466707] IP: [] memcpy+0x6/0x110
[...]
[ 1479.473673] Call Trace:
[ 1479.474296]  [] ? memory_read_from_buffer+0x3c/0x60
[ 1479.474941]  [] qla2x00_sysfs_read_optrom+0x9c/0xc0 [qla2xxx]
[ 1479.475571]  [] read+0xdb/0x1f0
[ 1479.476206]  [] vfs_read+0x9e/0x170
[ 1479.476839]  [] SyS_read+0x7f/0xe0
[ 1479.477466]  [] system_call_fastpath+0x16/0x1b


The patch below modifies the qla2x00_sysfs_read_optrom and
qla2x00_sysfs_write_optrom functions to take optrom_mutex
before checking ha->optrom_state, to avoid similar crashes.

The patch was applied and tested, and the same crashes were no
longer observed.


Tested-by: Milan P. Gandhi 
Signed-off-by: Milan P. Gandhi 
---
 drivers/scsi/qla2xxx/qla_attr.c | 18 +++++++++++++-----
 1 file changed, 13 insertions(+), 5 deletions(-)

diff --git a/drivers/scsi/qla2xxx/qla_attr.c b/drivers/scsi/qla2xxx/qla_attr.c
index da5ae11..47ea164 100644
--- a/drivers/scsi/qla2xxx/qla_attr.c
+++ b/drivers/scsi/qla2xxx/qla_attr.c
@@ -329,12 +329,15 @@ qla2x00_sysfs_read_optrom(struct file *filp, struct kobject *kobj,
 	struct qla_hw_data *ha = vha->hw;
 	ssize_t rval = 0;
 
+	mutex_lock(&ha->optrom_mutex);
+
 	if (ha->optrom_state != QLA_SREADING)
-		return 0;
+		goto out;
 
-	mutex_lock(&ha->optrom_mutex);
 	rval = memory_read_from_buffer(buf, count, &off, ha->optrom_buffer,
 	    ha->optrom_region_size);
+
+out:
 	mutex_unlock(&ha->optrom_mutex);
 
 	return rval;
@@ -349,14 +352,19 @@ qla2x00_sysfs_write_optrom(struct file *filp, struct kobject *kobj,
 	    struct device, kobj)));
 	struct qla_hw_data *ha = vha->hw;
 
-	if (ha->optrom_state != QLA_SWRITING)
+	mutex_lock(&ha->optrom_mutex);
+
+	if (ha->optrom_state != QLA_SWRITING) {
+		mutex_unlock(&ha->optrom_mutex);
 		return -EINVAL;
-	if (off > ha->optrom_region_size)
+	}
+	if (off > ha->optrom_region_size) {
+		mutex_unlock(&ha->optrom_mutex);
 		return -ERANGE;
+	}
 	if (off + count > ha->optrom_region_size)
 		count = ha->optrom_region_size - off;
 
-	mutex_lock(&ha->optrom_mutex);
 	memcpy(&ha->optrom_buffer[off], buf, count);
 	mutex_unlock(&ha->optrom_mutex);
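
For readers following the thread, the locking order the patch moves to can be sketched in userspace C. This is only an illustration under assumptions: struct fake_hw, fake_read(), fake_stop() and fake_demo() are hypothetical stand-ins built on pthreads, not the qla2xxx code, and the bounds handling is simplified. The point is that the state check and the buffer access sit in one critical section, so a concurrent teardown cannot slip in between them.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical stand-ins for the driver state; not the real qla2xxx types. */
enum fake_state { FAKE_SWAITING, FAKE_SREADING };

struct fake_hw {
	pthread_mutex_t lock;	/* plays the role of ha->optrom_mutex */
	enum fake_state state;	/* plays the role of ha->optrom_state */
	char *buffer;		/* plays the role of ha->optrom_buffer */
	size_t region_size;
};

/* Reader: the state check and the copy sit in one critical section. */
static ssize_t fake_read(struct fake_hw *hw, char *dst, size_t count, size_t off)
{
	ssize_t rval = 0;

	pthread_mutex_lock(&hw->lock);
	if (hw->state != FAKE_SREADING || off >= hw->region_size)
		goto out;
	if (count > hw->region_size - off)
		count = hw->region_size - off;
	memcpy(dst, hw->buffer + off, count);
	rval = (ssize_t)count;
out:
	pthread_mutex_unlock(&hw->lock);
	return rval;
}

/* Teardown: frees the buffer and leaves the reading state under the lock. */
static void fake_stop(struct fake_hw *hw)
{
	pthread_mutex_lock(&hw->lock);
	free(hw->buffer);
	hw->buffer = NULL;
	hw->state = FAKE_SWAITING;
	pthread_mutex_unlock(&hw->lock);
}

/* Single-threaded walk-through of the two outcomes a reader can see. */
static void fake_demo(void)
{
	struct fake_hw hw = { .state = FAKE_SREADING, .region_size = 8 };
	char dst[8];

	pthread_mutex_init(&hw.lock, NULL);
	hw.buffer = malloc(hw.region_size);
	assert(hw.buffer != NULL);
	memcpy(hw.buffer, "ABCDEFGH", 8);

	assert(fake_read(&hw, dst, 4, 2) == 4);	/* state is READING: data copied */
	fake_stop(&hw);
	assert(fake_read(&hw, dst, 4, 0) == 0);	/* state changed: buffer untouched */
	pthread_mutex_destroy(&hw.lock);
}
```

With the check outside the lock, fake_stop() could run between the check and the memcpy(), which is the window described in the report above.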
 


Re: [4.10, panic, regression] iscsi: null pointer deref at iscsi_tcp_segment_done+0x20d/0x2e0

2016-12-24 Thread Christoph Hellwig
On Sat, Dec 24, 2016 at 02:17:26PM +0100, Hannes Reinecke wrote:
> Christoph, do you have a pointer to your patchset?
> Not that I'll be able to do any meaningful work until next year, but having 
> a look would be nice. Just to get a feeling where you want to head to; I 
> might be able to work on this start of January.

I'll push out a branch once it's reviewable and not my current unbisectable
mess, should be soon.


Re: [4.10, panic, regression] iscsi: null pointer deref at iscsi_tcp_segment_done+0x20d/0x2e0

2016-12-24 Thread Hannes Reinecke

On 12/24/2016 11:07 AM, Christoph Hellwig wrote:
> On Fri, Dec 23, 2016 at 11:42:45AM -0800, Linus Torvalds wrote:
>> Ugh. This patch is nasty.
>
> It's the same thing SCSI has done for ages - except that it uses a separate
> kmalloc for the sense buffer.
>
>> I think we should just fix blk_execute_rq() instead.
>
> As you found out below it's not just blk_execute_rq, it's the whole
> architecture of the BLOCK_PC code, which expects a caller provided
> sense buffer.  But with the way blk-mq allocates request structures
> we can actually fix it, but I first need to extend the way it allows
> drivers to allocate private data to the old request code.  I've
> actually already implemented that for SCSI a long time ago, and have
> started to lift it to the block layer.
>

Would be cool to have a generic sense buffer.
I always found it slightly odd, pretending that 'struct request' is 
protocol-agnostic and refusing to add a sense data pointer, but at the 
same time having a field 'sense_len' (which gives the length of what 
exactly?).


Christoph, do you have a pointer to your patchset?
Not that I'll be able to do any meaningful work until next year, but 
having a look would be nice. Just to get a feeling where you want to 
head to; I might be able to work on this start of January.


Cheers,

Hannes
--
Dr. Hannes Reinecke   zSeries & Storage
h...@suse.de  +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)


[PATCH] scsi: lpfc: Replace BUG() with BUG_ON()

2016-12-24 Thread Shyam Saini
Replace BUG() with BUG_ON() using coccinelle

Signed-off-by: Shyam Saini 
---
 drivers/scsi/lpfc/lpfc_els.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/scsi/lpfc/lpfc_els.c b/drivers/scsi/lpfc/lpfc_els.c
index 27f0cbb..ede14f1 100644
--- a/drivers/scsi/lpfc/lpfc_els.c
+++ b/drivers/scsi/lpfc/lpfc_els.c
@@ -8855,8 +8855,7 @@ lpfc_cmpl_fabric_iocb(struct lpfc_hba *phba, struct lpfc_iocbq *cmdiocb,
 {
 	struct ls_rjt stat;
 
-	if ((cmdiocb->iocb_flag & LPFC_IO_FABRIC) != LPFC_IO_FABRIC)
-		BUG();
+	BUG_ON((cmdiocb->iocb_flag & LPFC_IO_FABRIC) != LPFC_IO_FABRIC);
 
 	switch (rspiocb->iocb.ulpStatus) {
 	case IOSTAT_NPORT_RJT:
-- 
2.7.4
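
As a rough illustration of why the two spellings in this patch behave the same, here is a userspace sketch. The macros are stand-ins written for this example (the kernel's BUG() is architecture-specific, and unlikely() here relies on a GCC/Clang builtin); FAKE_IO_FABRIC and the check_fabric_*() helpers are hypothetical names, not lpfc code.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Userspace stand-ins (assumption): the kernel's BUG() is arch-specific. */
#define unlikely(x) __builtin_expect(!!(x), 0)	/* GCC/Clang builtin */
#define BUG() do { \
	fprintf(stderr, "BUG at %s:%d\n", __FILE__, __LINE__); \
	abort(); \
} while (0)
#define BUG_ON(cond) do { if (unlikely(cond)) BUG(); } while (0)

#define FAKE_IO_FABRIC 0x10u	/* hypothetical flag value */

/* Old form: open-coded test followed by BUG(). */
static int check_fabric_old(unsigned int iocb_flag)
{
	if ((iocb_flag & FAKE_IO_FABRIC) != FAKE_IO_FABRIC)
		BUG();
	return 1;
}

/* New form: the same condition folded into BUG_ON(). */
static int check_fabric_new(unsigned int iocb_flag)
{
	BUG_ON((iocb_flag & FAKE_IO_FABRIC) != FAKE_IO_FABRIC);
	return 1;
}

/* Both forms let a command through when the flag is set. */
static void bug_on_demo(void)
{
	assert(check_fabric_old(FAKE_IO_FABRIC | 0x1u) == 1);
	assert(check_fabric_new(FAKE_IO_FABRIC | 0x1u) == 1);
}
```

In this sketch the only difference is the branch hint carried by unlikely(); both helpers abort on the same inputs.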



Re: [4.10, panic, regression] iscsi: null pointer deref at iscsi_tcp_segment_done+0x20d/0x2e0

2016-12-24 Thread Christoph Hellwig
On Fri, Dec 23, 2016 at 11:42:45AM -0800, Linus Torvalds wrote:
> Ugh. This patch is nasty.

It's the same thing SCSI has done for ages - except that it uses a separate
kmalloc for the sense buffer.

> I think we should just fix blk_execute_rq() instead.

As you found out below it's not just blk_execute_rq, it's the whole
architecture of the BLOCK_PC code, which expects a caller provided
sense buffer.  But with the way blk-mq allocates request structures
we can actually fix it, but I first need to extend the way it allows
drivers to allocate private data to the old request code.  I've
actually already implemented that for SCSI a long time ago, and have
started to lift it to the block layer.

Once that is done the callers won't need a sense buffer at all, and
can just look at the driver provided one.  Which currently is missing
in virtio-blk, so we'd need something similar to the above patch
anyway.
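
The idea of a driver-provided sense buffer that lives in per-request private data can be sketched in userspace C. This is a loose model only, with hypothetical names throughout (fake_request, fake_scsi_pdu, fake_alloc_request(), fake_rq_to_pdu(), fake_complete()); it is not the block layer code or the planned patchset. It assumes the request and the driver's payload come from a single allocation, with the payload placed directly behind the request.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define FAKE_SENSE_LEN 96	/* hypothetical sense buffer size */

/* Generic request: carries no sense pointer of its own. */
struct fake_request {
	unsigned long tag;
};

/* Driver-private payload that lives directly behind the request. */
struct fake_scsi_pdu {
	unsigned char sense[FAKE_SENSE_LEN];
	unsigned int sense_len;
};

/* One allocation covers the request plus the driver's payload. */
static struct fake_request *fake_alloc_request(size_t pdu_size)
{
	return calloc(1, sizeof(struct fake_request) + pdu_size);
}

/* The payload starts right after the request structure. */
static void *fake_rq_to_pdu(struct fake_request *rq)
{
	return rq + 1;
}

/* "Driver" side: record sense data in its own per-request payload. */
static void fake_complete(struct fake_request *rq,
			  const unsigned char *sense, unsigned int len)
{
	struct fake_scsi_pdu *pdu = fake_rq_to_pdu(rq);

	if (len > FAKE_SENSE_LEN)
		len = FAKE_SENSE_LEN;
	memcpy(pdu->sense, sense, len);
	pdu->sense_len = len;
}

/* Caller side: no caller-provided buffer, just look at the payload. */
static void fake_pdu_demo(void)
{
	const unsigned char sense[] = { 0x70, 0x00, 0x05 };
	struct fake_request *rq = fake_alloc_request(sizeof(struct fake_scsi_pdu));
	struct fake_scsi_pdu *pdu;

	assert(rq != NULL);
	fake_complete(rq, sense, sizeof(sense));
	pdu = fake_rq_to_pdu(rq);
	assert(pdu->sense_len == 3 && pdu->sense[2] == 0x05);
	free(rq);
}
```

Because the sense bytes share the request's lifetime, the caller in this sketch never has to allocate or free a separate buffer.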


Re: [4.10, panic, regression] iscsi: null pointer deref at iscsi_tcp_segment_done+0x20d/0x2e0

2016-12-24 Thread Christoph Hellwig
On Fri, Dec 23, 2016 at 07:45:45PM -0700, Jens Axboe wrote:
> It's not that it's technically hard to fix up, it's more that it's a
> pain in the ass to have to do it. For instance, for blk_execute_rq(), we
> either should enforce that the caller allocates it dynamically and then
> free it, or we need a nasty hack where the caller needs to know he has to
> free it. Pretty obvious what I would prefer there.
> 
> And yes, there would be a good chunk of other places where this would
> need to be fixed up...

My planned rework for the BLOCK_PC code (split all fields for them out
of struct request and move them into a separate, driver-allocated structure)
would fix this up as a side-effect.  I really wanted to get it into 4.10,
but I didn't manage to fix it up.  I'll try to get it into 4.11 early.