Re: iSCSI request keep rejected by microsoft iSCSI target because of write_same check

2014-12-01 Thread vaughan
On 12/01/2014 03:25 PM, j...@deti74.ru wrote:
> I has some sproblem and only one ISCSI target - Microsoft SWT
> I disable write_same for iscsi by patching kernel (3.17.0)
This method is not good.  Below is the answer from Mike when I asked if
we can disable write_same in the template before.

"If you wanted to disable commands you would want to do it per device or
target at the scsi layer, because there are or will be iscsi targets
that support it."

Also Martin suggests the target should return a check condition of
ILLEGAL REQUEST if it doesn't support a specific command.
It's a bug that only happens with Microsoft target, not others like tgtd.


>
> Patch is simple add .no_write_same = 1 param, into scsi_host_template
> struct.
>
> file drivers/scsi/iscsi_tcp.c :
>
>  static struct scsi_host_template iscsi_sw_tcp_sht = {
>   .module= THIS_MODULE,
>   .name= "iSCSI Initiator over TCP/IP",
>   .queuecommand   = iscsi_queuecommand,
>   .change_queue_depth= iscsi_change_queue_depth,
>   .can_queue= ISCSI_DEF_XMIT_CMDS_MAX - 1,
>   .sg_tablesize= 4096,
>   .max_sectors= 0x,
>   .cmd_per_lun= ISCSI_DEF_CMD_PER_LUN,
>   .eh_abort_handler   = iscsi_eh_abort,
>   .eh_device_reset_handler= iscsi_eh_device_reset,
>   .eh_target_reset_handler = iscsi_eh_recover_target,
>   .use_clustering = DISABLE_CLUSTERING,
> + .no_write_same= 1,
>   .slave_alloc= iscsi_sw_tcp_slave_alloc,
>   .slave_configure= iscsi_sw_tcp_slave_configure,
>   .target_alloc= iscsi_target_alloc,
>   .proc_name= "iscsi_tcp",
>   .this_id= -1,
>   };
>
>
> суббота, 11 января 2014 г., 6:22:46 UTC+5 пользователь Vaughan Cao
> написал:
>
>
> On 2014年01月11日 04:24, Mike Christie wrote:
> > On 01/10/2014 02:09 AM, vaughan wrote:
> >> On 01/10/2014 03:41 PM, Mike Christie wrote:
> >>> On 1/10/14 12:11 AM, vaughan wrote:
> >>>> I haven't figure out why it's rejected with "bookmark
> invalid"(9)
> >>>> reason, rather than "command not supported". IMO "bookmark
> invalid" is
> >>>> used when minor protocol conflict such as final flag not set
> with
> >>>> non-write command. However, I haven't find error of this kind in
> >>>> report_opcode, so I guess it's not supported on the target.
> >>>>
> >>> Is it possible to get a wireshark/tcpdump trace? It does not
> have to
> >>> be during boot. We just need to see what commands are sent and
> the
> >>> response the target is returning.
> >>>
> >>> I forgot we know some microsoft iscsi target people. We can
> just email
> >>> them with the trace to confirm what is going on with the
> target. The
> >>> trace seems to be easier for them than them interpreting linux
> kernel
> >>> logs.
> >> I enabled debug_iscsi_tcp, here is a more detailed log in
> normal connection.
> >> Does "conn error (1020)" mean it's target peer who disconnect the
> >> connection at the same time of reject report_opcode?
> > Yes.
> >
> >> If it is, I think iSCSI boot failure can't be avoided without
> disable
> >> write_same check on OEL.
> > Yes, you are right. Due to how more distros do boot, iscsid will
> not be
> > up and you will hang. If are talking about disablement though I
> think it
> > should not be done at the iscsi layer. It should be some sort of
> > white/black list at the scsi device layer or something like that.
> >
> > However, I will ping Microsoft and cc you and we can see what is
> up for
> > sure. Maybe we will get lucky and they will have a release with
> a fix on
> > their side.
>
> Target:
>Windows Server 2008 R2 DataCenter
>Microsoft iSCSI Software Target: 3.3.16563.
> Initiator:
>kernel with write_same check in sd, such as Kernel
> 3.11.10-200.fc19.x86_64  
>

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2] notify block layer when using temporary change to cache_type

2014-06-03 Thread Vaughan Cao
This is a fix for commit:
  39c60a0948cc06139e2fbfe084f83cb7e7deae3b sd: fix array cache flushing bug 
causing performance problems
We must notify the block layer via q->flush_flags after temporary change the 
cache_type to write through.
If not, SYNCHRONIZE CACHE command will still be generated.
This patch factors out a helper that can be called from sd_revalidate_disk and 
cache_type_store.

Signed-off-by: Vaughan Cao 
---
 drivers/scsi/sd.c | 23 +++
 1 file changed, 15 insertions(+), 8 deletions(-)

diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index efcbcd1..e4774aa 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -132,6 +132,19 @@ static const char *sd_cache_types[] = {
"write back, no read (daft)"
 };
 
+static void sd_set_flush_flag(struct scsi_disk *sdkp)
+{
+   unsigned flush = 0;
+
+   if (sdkp->WCE) {
+   flush |= REQ_FLUSH;
+   if (sdkp->DPOFUA)
+   flush |= REQ_FUA;
+   }
+
+   blk_queue_flush(sdkp->disk->queue, flush);
+}
+
 static ssize_t
 cache_type_store(struct device *dev, struct device_attribute *attr,
 const char *buf, size_t count)
@@ -175,6 +188,7 @@ cache_type_store(struct device *dev, struct 
device_attribute *attr,
if (sdkp->cache_override) {
sdkp->WCE = wce;
sdkp->RCD = rcd;
+   sd_set_flush_flag(sdkp);
return count;
}
 
@@ -2712,7 +2726,6 @@ static int sd_revalidate_disk(struct gendisk *disk)
struct scsi_disk *sdkp = scsi_disk(disk);
struct scsi_device *sdp = sdkp->device;
unsigned char *buffer;
-   unsigned flush = 0;
 
SCSI_LOG_HLQUEUE(3, sd_printk(KERN_INFO, sdkp,
  "sd_revalidate_disk\n"));
@@ -2758,13 +2771,7 @@ static int sd_revalidate_disk(struct gendisk *disk)
 * We now have all cache related info, determine how we deal
 * with flush requests.
 */
-   if (sdkp->WCE) {
-   flush |= REQ_FLUSH;
-   if (sdkp->DPOFUA)
-   flush |= REQ_FUA;
-   }
-
-   blk_queue_flush(sdkp->disk->queue, flush);
+   sd_set_flush_flag(sdkp);
 
set_capacity(disk, sdkp->capacity);
sd_config_write_same(sdkp);
-- 
1.9.0

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] notify block layer when using temporary change to cache_type

2014-05-27 Thread Vaughan Cao
On 05/28/2014 12:18 AM, James Bottomley wrote:
> On Tue, 2014-05-27 at 19:39 +0800, Vaughan Cao wrote:
>> This is a fix for commit:
>>   39c60a0948cc06139e2fbfe084f83cb7e7deae3b sd: fix array cache flushing bug 
>> causing performance problems
>> We must notify the block layer via q->flush_flags after temporary change the 
>> cache_type to write through.
>> If not, SYNCHRONIZE CACHE command will still be generated.
>>
>> Signed-off-by: Vaughan Cao 
>> ---
>>  drivers/scsi/sd.c | 12 
>>  1 file changed, 12 insertions(+)
>>
>> diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
>> index 6146b9d..366e48b 100644
>> --- a/drivers/scsi/sd.c
>> +++ b/drivers/scsi/sd.c
>> @@ -144,6 +144,7 @@ sd_store_cache_type(struct device *dev, struct 
>> device_attribute *attr,
> This is a weird function name.  The name in the vanilla kernel is
> cache_type_store().
Sorry, this patch is created on oracle uek kernel, I just checked it can
apply to vanilla kernel cleanly and didn't notice they are using
different device attribute function name.

>>  struct scsi_sense_hdr sshdr;
>>  static const char temp[] = "temporary ";
>>  int len;
>> +unsigned flush;
>>  
>>  if (sdp->type != TYPE_DISK)
>>  /* no cache control on RBC devices; theoretically they
>> @@ -174,6 +175,17 @@ sd_store_cache_type(struct device *dev, struct 
>> device_attribute *attr,
>>  if (sdkp->cache_override) {
>>  sdkp->WCE = wce;
>>  sdkp->RCD = rcd;
>> +
>> +/* set flush_flags to notify the block layer */
>> +flush = 0;
>> +if (sdkp->WCE) {
>> +flush |= REQ_FLUSH;
>> +if (sdkp->DPOFUA)
>> +flush |= REQ_FUA;
>> +}
>> +
>> +blk_queue_flush(sdkp->disk->queue, flush);
>> +
> Is there a reason you cut and paste from sd_revalidate_disk() instead of
> just calling it directly?
No, I just want to keep the modification as small as possible. Checked
the actions of sd_revalidate_disk() again, it seems no harm to call it
from here. Also actual actions are skipped in sd_read_cache_type() if
cache_override!=0, so I suppose your original plan is to jump to
sd_revalidate_disk() from here.

However, that way may not be acceptable after further code review.
Changing the sdkp->WCE,RCD will cause these real parameters of the
underlying device *lost*. In sd_shutdown() and sd_suspend_common(), we
need them to call sd_sync_cache() if necessary. I don't think this
action is avoidable, just for the performance issue to solve. And we
can't change the mode just by setting q->flush_flags while leave
sdkp->WCE,RCD untouched either, because cache_type_show() needs
sdkp->WCE,RCD to present the temporary config to userspace.

static void sd_shutdown(struct device *dev)
{
...
if (sdkp->WCE && sdkp->media_present) {
sd_printk(KERN_NOTICE, sdkp, "Synchronizing SCSI cache\n");
sd_sync_cache(sdkp);
}

static int sd_suspend_common(struct device *dev, bool ignore_stop_errors)
{
...
if (sdkp->WCE && sdkp->media_present) {
sd_printk(KERN_NOTICE, sdkp, "Synchronizing SCSI cache\n");
ret = sd_sync_cache(sdkp);

So, It seems new fields like realWCE and readRCD in scsi_disk are needed
to save those configuration. What's your opinion?

Vaughan
>
> James
>
>
>
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] notify block layer when using temporary change to cache_type

2014-05-27 Thread Vaughan Cao
This is a fix for commit:
  39c60a0948cc06139e2fbfe084f83cb7e7deae3b sd: fix array cache flushing bug 
causing performance problems
We must notify the block layer via q->flush_flags after temporary change the 
cache_type to write through.
If not, SYNCHRONIZE CACHE command will still be generated.

Signed-off-by: Vaughan Cao 
---
 drivers/scsi/sd.c | 12 
 1 file changed, 12 insertions(+)

diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index 6146b9d..366e48b 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -144,6 +144,7 @@ sd_store_cache_type(struct device *dev, struct 
device_attribute *attr,
struct scsi_sense_hdr sshdr;
static const char temp[] = "temporary ";
int len;
+   unsigned flush;
 
if (sdp->type != TYPE_DISK)
/* no cache control on RBC devices; theoretically they
@@ -174,6 +175,17 @@ sd_store_cache_type(struct device *dev, struct 
device_attribute *attr,
if (sdkp->cache_override) {
sdkp->WCE = wce;
sdkp->RCD = rcd;
+
+   /* set flush_flags to notify the block layer */
+   flush = 0;
+   if (sdkp->WCE) {
+   flush |= REQ_FLUSH;
+   if (sdkp->DPOFUA)
+   flush |= REQ_FUA;
+   }
+
+   blk_queue_flush(sdkp->disk->queue, flush);
+
return count;
}
 
-- 
1.9.0

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: PROBLEM: special sense code asc,ascq=04h,0Ch abort scsi scan in the middle

2014-02-19 Thread vaughan
Hi Hannes,

Sorry to bother you.
Months ago, you made a patch to fix this scsi_scan abort error found on
zfssa storage. Though it's only a specific storage, the logic -- not
abort scsi scan process because of an inquiry failure of a LU in the
middle, is helpful as a way to make our scanning more resilient. I'd
prefer to keep our device scanning behavior in sync with other OS like
Solaris and Windows.
Will you merge your patch in mainline? You can find the patch here
http://www.mail-archive.com/linux-kernel@vger.kernel.org/msg521753.html

Regards,
Vaughan

On 10/16/2013 02:52 PM, Hannes Reinecke wrote:
> On 10/14/2013 05:24 PM, Steffen Maier wrote:
>> On 10/14/2013 03:32 PM, Hannes Reinecke wrote:
>>> On 10/14/2013 03:18 PM, Hannes Reinecke wrote:
>>>> On 10/14/2013 02:51 PM, Steffen Maier wrote:
>>>>> On 10/14/2013 01:13 PM, Hannes Reinecke wrote:
>>>>>> On 10/13/2013 07:23 PM, Vaughan Cao wrote:
>>>>>>> [1.] One line summary of the problem:
>>>>>>> special sense code asc,ascq=04h,0Ch abort scsi scan in the middle
>>>>>>>
>>>>>>> [2.] Full description of the problem/report:
>>>>>>> For instance, storage represents 8 iscsi LUNs, however the LUN
>>>>>>> No.7
>>>>>>> is not well configured or has something wrong.
>>>>>>> Then messages received:
>>>>>>> kernel: scsi 5:0:0:0: Unexpected response from lun 7 while
>>>>>>> scanning, scan aborted
>>>>>>> Which will make LUN No.8 unavailable.
>>>>>>> It's confirmed that Windows and Solaris systems will continue the
>>>>>>> scan and make LUN No.1,2,3,4,5,6 and 8 available.
>>>>>>>
>>>>>>> Log snippet is as below:
>>>>>>> Aug 24 00:32:49 vmhodtest019 kernel: scsi 5:0:0:7: scsi scan:
>>>>>>> INQUIRY pass 1 length 36
>>>>>>> Aug 24 00:32:49 vmhodtest019 kernel: scsi 5:0:0:7: Send:
>>>>>>> 0x8801e9bd4280
>>>>>>> Aug 24 00:32:49 vmhodtest019 kernel: scsi 5:0:0:7: CDB:
>>>>>>> Inquiry: 12 00 00 00 24 00
>>>>>>> Aug 24 00:32:49 vmhodtest019 kernel: buffer =
>>>>>>> 0x8801f71fc180, bufflen = 36, queuecommand 0xa00b99e7
>>>>>>> Aug 24 00:32:49 vmhodtest019 kernel: leaving scsi_dispatch_cmnd()
>>>>>>> Aug 24 00:32:49 vmhodtest019 kernel: scsi 5:0:0:7: Done:
>>>>>>> 0x8801e9bd4280 SUCCESS
>>>>>>> Aug 24 00:32:49 vmhodtest019 kernel: scsi 5:0:0:7: Result:
>>>>>>> hostbyte=DID_OK driverbyte=DRIVER_OK
>>>>>>> Aug 24 00:32:49 vmhodtest019 kernel: scsi 5:0:0:7: CDB:
>>>>>>> Inquiry: 12 00 00 00 24 00
>>>>>>> Aug 24 00:32:49 vmhodtest019 kernel: scsi 5:0:0:7: Sense Key :
>>>>>>> Not Ready [current]
>>>>>>> Aug 24 00:32:49 vmhodtest019 kernel: scsi 5:0:0:7: Add. Sense:
>>>>>>> Logical unit not accessible, target port in unavailable state
>>>>>>> Aug 24 00:32:49 vmhodtest019 kernel: scsi 5:0:0:7: scsi host
>>>>>>> busy 1 failed 0
>>>>>>> Aug 24 00:32:49 vmhodtest019 kernel: 0 sectors total, 36 bytes
>>>>>>> done.
>>>>>>> Aug 24 00:32:49 vmhodtest019 kernel: scsi scan: INQUIRY failed
>>>>>>> with code 0x802
>>>>>>> Aug 24 00:32:49 vmhodtest019 kernel: scsi 5:0:0:0: Unexpected
>>>>>>> response from lun 7 while scanning, scan aborted
>>>>>>>
>>>>>>> According to scsi_report_lun_scan(), I found:
>>>>>>> Linux use an inquiry command to probe a lun according to the
>>>>>>> result
>>>>>>> of report_lun command.
>>>>>>> It assumes every probe cmd will get a legal result. Otherwise, it
>>>>>>> regards the whole peripheral not exist or dead.
>>>>>>> If the return of inquiry passes its legal checking and indicates
>>>>>>> 'LUN not present', it won't break but also continue with the scan
>>>>>>> process.
>>>>>>> In the log, inquiry to LUN7 return a sense - asc,ascq=04h,0Ch
>>>>>>> (Logical unit not accessible, target port in unavailable state).
>>>>>>> And this is ignored, so scsi_probe_lun() returns -EIO and the
>>>>>>> sc

Re: [PATCH 16/16] scsi_dh_alua: Use workqueue for RTPG

2014-01-21 Thread Vaughan Cao

On Fri, Dec 20, 2013 at 8:13 PM, Hannes Reinecke  wrote:
> 
> +static void alua_rtpg_work(struct work_struct *work)
> +{
> ..
> +   if (pg->flags & ALUA_PG_RUN_STPG) {
> +   spin_unlock_irqrestore(&pg->rtpg_lock, flags);
> +   err = alua_stpg(sdev, pg);
> +   spin_lock_irqsave(&pg->rtpg_lock, flags);
> +   pg->flags &= ~ALUA_PG_RUN_STPG;
> +   pg->flags |= ALUA_PG_STPG_DONE;
> +   if (err == SCSI_DH_RETRY) {
> +   pg->flags |= ALUA_PG_RUN_RTPG;
> +   pg->interval = ALUA_RTPG_DELAY_MSECS * 1000;
>

Is the line above a typo?

pg->interval is measured in second unit. You won't want to set it as
milliseconds*1000. And ALUA_RTPG_DELAY_MSECS is used for queueing delayed rtpg
work, not for delay in retry cases such as state of transitioning. What you
want may be set a new delay time to start, similar to "pg->interval += 2" in
alua_rtpg when state of transitioning is returned.

Thanks,
Vaughan
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: iSCSI request keep rejected by microsoft iSCSI target because of write_same check

2014-01-10 Thread Vaughan Cao


On 2014年01月11日 04:24, Mike Christie wrote:

On 01/10/2014 02:09 AM, vaughan wrote:

On 01/10/2014 03:41 PM, Mike Christie wrote:

On 1/10/14 12:11 AM, vaughan wrote:

I haven't figure out why it's rejected with "bookmark invalid"(9)
reason, rather than "command not supported". IMO "bookmark invalid" is
used when minor protocol conflict such as final flag not set with
non-write command. However, I haven't find error of this kind in
report_opcode, so I guess it's not supported on the target.


Is it possible to get a wireshark/tcpdump trace? It does not have to
be during boot. We just need to see what commands are sent and the
response the target is returning.

I forgot we know some microsoft iscsi target people. We can just email
them with the trace to confirm what is going on with the target. The
trace seems to be easier for them than them interpreting linux kernel
logs.

I enabled debug_iscsi_tcp, here is a more detailed log in normal connection.
Does "conn error (1020)" mean it's target peer who disconnect the
connection at the same time of reject report_opcode?

Yes.


If it is, I think iSCSI boot failure can't be avoided without disable
write_same check on OEL.

Yes, you are right. Due to how more distros do boot, iscsid will not be
up and you will hang. If are talking about disablement though I think it
should not be done at the iscsi layer. It should be some sort of
white/black list at the scsi device layer or something like that.

However, I will ping Microsoft and cc you and we can see what is up for
sure. Maybe we will get lucky and they will have a release with a fix on
their side.


Target:
  Windows Server 2008 R2 DataCenter
  Microsoft iSCSI Software Target: 3.3.16563.
Initiator:
  kernel with write_same check in sd, such as Kernel 3.11.10-200.fc19.x86_64

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: iSCSI request keep rejected by microsoft iSCSI target because of write_same check

2014-01-10 Thread vaughan

On 01/10/2014 03:41 PM, Mike Christie wrote:
> On 1/10/14 12:11 AM, vaughan wrote:
>> I haven't figure out why it's rejected with "bookmark invalid"(9)
>> reason, rather than "command not supported". IMO "bookmark invalid" is
>> used when minor protocol conflict such as final flag not set with
>> non-write command. However, I haven't find error of this kind in
>> report_opcode, so I guess it's not supported on the target.
>>
> 
> Is it possible to get a wireshark/tcpdump trace? It does not have to be
> during boot. We just need to see what commands are sent and the response
> the target is returning.
> 
> I forgot we know some microsoft iscsi target people. We can just email
> them with the trace to confirm what is going on with the target. The
> trace seems to be easier for them than them interpreting linux kernel logs.

tcp dump attached.

Vaughan


iscsi.dump
Description: Binary data


Re: iSCSI request keep rejected by microsoft iSCSI target because of write_same check

2014-01-10 Thread vaughan
On 01/10/2014 03:41 PM, Mike Christie wrote:
> On 1/10/14 12:11 AM, vaughan wrote:
>> I haven't figure out why it's rejected with "bookmark invalid"(9)
>> reason, rather than "command not supported". IMO "bookmark invalid" is
>> used when minor protocol conflict such as final flag not set with
>> non-write command. However, I haven't find error of this kind in
>> report_opcode, so I guess it's not supported on the target.
>>
>
> Is it possible to get a wireshark/tcpdump trace? It does not have to
> be during boot. We just need to see what commands are sent and the
> response the target is returning.
>
> I forgot we know some microsoft iscsi target people. We can just email
> them with the trace to confirm what is going on with the target. The
> trace seems to be easier for them than them interpreting linux kernel
> logs.
I enabled debug_iscsi_tcp, here is a more detailed log in normal connection.
Does "conn error (1020)" mean it's target peer who disconnect the
connection at the same time of reject report_opcode?
If it is, I think iSCSI boot failure can't be avoided without disable
write_same check on OEL.

Dec 17 00:20:17 ol6u4gx64 kernel: connection4:0: iscsi_sw_tcp_recv in 52
bytes
Dec 17 00:20:17 ol6u4gx64 kernel: connection4:0: iscsi_sw_tcp_recv read
0 bytes status 1
Dec 17 00:20:17 ol6u4gx64 kernel: sd 7:0:0:0: [sdc] Write Protect is off
Dec 17 00:20:17 ol6u4gx64 kernel: sd 7:0:0:0: [sdc] Mode Sense: 03 00 00 00
Dec 17 00:20:17 ol6u4gx64 kernel: connection4:0:
iscsi_sw_tcp_send_hdr_prep digest disabled
Dec 17 00:20:17 ol6u4gx64 kernel: connection4:0:
iscsi_sw_tcp_send_hdr_done Header done. Next segment size 0 total_size 0
Dec 17 00:20:17 ol6u4gx64 kernel: connection4:0: iscsi_sw_tcp_xmit xmit
48 bytes
Dec 17 00:20:17 ol6u4gx64 kernel: connection4:0: iscsi_sw_tcp_recv in 52
bytes
Dec 17 00:20:17 ol6u4gx64 kernel: connection4:0: iscsi_sw_tcp_recv read
0 bytes status 1
Dec 17 00:20:17 ol6u4gx64 kernel: sd 7:0:0:0: [sdc] Got wrong page
Dec 17 00:20:17 ol6u4gx64 kernel: sd 7:0:0:0: [sdc] Assuming drive
cache: write through
Dec 17 00:20:17 ol6u4gx64 kernel: connection4:0:
iscsi_sw_tcp_send_hdr_prep digest disabled
Dec 17 00:20:17 ol6u4gx64 kernel: connection4:0:
iscsi_sw_tcp_send_hdr_done Header done. Next segment size 0 total_size 0
Dec 17 00:20:17 ol6u4gx64 kernel: connection4:0: iscsi_sw_tcp_xmit xmit
48 bytes
Dec 17 00:20:17 ol6u4gx64 iscsid: Connection4:0 to [target:
iqn.1991-05.com.microsoft:test1, portal: 10.182.92.118,3260] through
[iface: default] is operational now
Dec 17 00:20:17 ol6u4gx64 kernel: connection4:0: iscsi_sw_tcp_recv in 96
bytes
Dec 17 00:20:17 ol6u4gx64 kernel: connection4:0: pdu (op 0x1 itt 0xa)
rejected. Reason code 0x9
Dec 17 00:20:17 ol6u4gx64 kernel: rejected iscsi cmd hdr:op 0x1, flags
0xc1, itt 0xa, data_length 512 CDB: a3 c 1 12 0 0 0 0 2 0 0 0 0 0 0 0
Dec 17 00:20:17 ol6u4gx64 kernel: iscsi_handle_reject: return 0
Dec 17 00:20:17 ol6u4gx64 kernel: __iscsi_complete_pdu:1165:
iscsi_handle_reject return 0
Dec 17 00:20:17 ol6u4gx64 kernel: connection4:0: iscsi_sw_tcp_recv read
0 bytes status 1
Dec 17 00:20:17 ol6u4gx64 kernel: connection4:0: iscsi_sw_sk_state_check
TCP_CLOSE|TCP_CLOSE_WAIT
Dec 17 00:20:17 ol6u4gx64 kernel: connection4:0: detected conn error (1020)
Dec 17 00:20:18 ol6u4gx64 iscsid: Kernel reported iSCSI connection 4:0
error (1020 - ISCSI_ERR_TCP_CONN_CLOSE: TCP connection closed) state (3)

"rejected iscsi cmd hdr" line is added for debug as below.

diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c

index d4b4b36..831786a 100644
--- a/drivers/scsi/libiscsi.c
+++ b/drivers/scsi/libiscsi.c
@@ -993,7 +993,9 @@ static int iscsi_handle_reject(struct iscsi_conn *conn, 
struct iscsi_hdr *hdr,
 {
struct iscsi_reject *reject = (struct iscsi_reject *)hdr;
struct iscsi_hdr rejected_pdu;
+   struct iscsi_scsi_req rejected_scsi_req;
int opcode, rc = 0;
+   int i;
 
conn->exp_statsn = be32_to_cpu(reject->statsn) + 1;
 
@@ -1061,6 +1063,22 @@ static int iscsi_handle_reject(struct iscsi_conn *conn, 
struct iscsi_hdr *hdr,
  "pdu (op 0x%x itt 0x%x) rejected. Reason "
  "code 0x%x\n", rejected_pdu.opcode,
  rejected_pdu.itt, reject->reason);
+   if (reject->reason == ISCSI_REASON_BOOKMARK_INVALID &&
+   rejected_pdu.opcode == ISCSI_OP_SCSI_CMD) {
+   memcpy(&rejected_scsi_req, data, sizeof(struct 
iscsi_hdr));
+   printk("rejected iscsi cmd hdr:"
+   "op 0x%x, flags 0x%x, itt 0x%x, data_length %u ",
+   rejected_scsi_req.opcode,
+   rejected_scsi_req.flags,
+  

Re: iSCSI request keep rejected by microsoft iSCSI target because of write_same check

2014-01-09 Thread vaughan
On 01/10/2014 02:38 AM, Mike Christie wrote:
> On 01/09/2014 04:20 AM, vaughan wrote:
>> We are testing linux iscsi boot under UEFI mode with Microsoft iSCSI
>> Software Target. But failed for rejected by target.
>> When do a normal iscsi connection, iscsi connection is keeping rejected
>> and recover. See below:
>>
>> Dec 15 20:09:07 ol6u4gx64 kernel: scsi11 : iSCSI Initiator over TCP/IP
>> Dec 15 20:09:08 ol6u4gx64 kernel: scsi 11:0:0:0: Direct-Access
>> MSFT Virtual HD   3.3  PQ: 0 ANSI: 5
>> Dec 15 20:09:08 ol6u4gx64 kernel: sd 11:0:0:0: Attached scsi generic sg3
>> type 0
>> Dec 15 20:09:08 ol6u4gx64 kernel: sd 11:0:0:0: [sdc] 307200 512-byte
>> logical blocks: (157 MB/150 MiB)
>> Dec 15 20:09:08 ol6u4gx64 kernel: sd 11:0:0:0: [sdc] Write Protect is off
>> Dec 15 20:09:08 ol6u4gx64 kernel: sd 11:0:0:0: [sdc] Got wrong page
>> Dec 15 20:09:08 ol6u4gx64 kernel: sd 11:0:0:0: [sdc] Assuming drive
>> cache: write through
>> Dec 15 20:09:08 ol6u4gx64 kernel: connection1:0: pdu (op 0x1 itt 0xa)
>> rejected. Reason code 0x9
>> Dec 15 20:09:08 ol6u4gx64 iscsid: Connection1:0 to [target:
>> iqn.1991-05.com.microsoft:test1, portal: 10.182.92.118,3260] through
>> [iface: default] is operational now
>> Dec 15 20:09:08 ol6u4gx64 kernel: rejected iscsi cmd hdr:op 0x1, flags
>> 0xc1, itt 0xa, data_length 512CDB: a3 c 1 12 0 0 0 0 2 0 0 0 0 0 0 0
>> connection1:0: detected conn error (1020)
>> Dec 15 20:09:09 ol6u4gx64 iscsid: Kernel reported iSCSI connection 1:0
>> error (1020 - ISCSI_ERR_TCP_CONN_CLOSE: TCP connection closed) state (3)
>> Dec 15 20:09:11 ol6u4gx64 kernel: connection1:0: pdu (op 0x1 itt
>> 0x100b) rejected. Reason code 0x9
>> Dec 15 20:09:11 ol6u4gx64 kernel: rejected iscsi cmd hdr:op 0x1, flags
>> 0xc1, itt 0x100b, data_length 512CDB: a3 c 1 12 0 0 0 0 2 0 0 0 0 0
>> 0 0 connection1:0: detected conn error (1020)
>> Dec 15 20:09:11 ol6u4gx64 iscsid: connection1:0 is operational after
>> recovery (1 attempts)
>> Dec 15 20:09:12 ol6u4gx64 iscsid: Kernel reported iSCSI connection 1:0
>> error (1020 - ISCSI_ERR_TCP_CONN_CLOSE: TCP connection closed) state (3)
>> Dec 15 20:09:14 ol6u4gx64 kernel: connection1:0: pdu (op 0x1 itt
>> 0x200c) rejected. Reason code 0x9
>> Dec 15 20:09:14 ol6u4gx64 kernel: rejected iscsi cmd hdr:op 0x1, flags
>> 0xc1, itt 0x200c, data_length 512CDB: a3 c 1 12 0 0 0 0 2 0 0 0 0 0
>> 0 0 connection1:0: detected conn error (1020)
>> ...
>> Dec 15 20:09:30 ol6u4gx64 kernel: connection1:0: pdu (op 0x1 itt
>> 0x7011) rejected. Reason code 0x9
>> Dec 15 20:09:30 ol6u4gx64 kernel: rejected iscsi cmd hdr:op 0x1, flags
>> 0xc1, itt 0x7011, data_length 512CDB: a3 c 1 93 0 0 0 0 2 0 0 0 0 0
>> 0 0 connection1:0: detected conn error (1020)
>> Dec 15 20:09:31 ol6u4gx64 iscsid: connection1:0 is operational after
>> recovery (1 attempts)
>> Dec 15 20:09:31 ol6u4gx64 iscsid: Kernel reported iSCSI connection 1:0
>> error (1020 - ISCSI_ERR_TCP_CONN_CLOSE: TCP connection closed) state (3)
>> Dec 15 20:09:33 ol6u4gx64 kernel: connection1:0: pdu (op 0x1 itt
>> 0x8012) rejected. Reason code 0x9
>> Dec 15 20:09:34 ol6u4gx64 kernel: rejected iscsi cmd hdr:op 0x1, flags
>> 0xc1, itt 0x8012, data_length 512CDB: a3 c 1 41 0 0 0 0 2 0 0 0 0 0
>> 0 0 connection1:0: detected conn error (1020)
>> Dec 15 20:09:34 ol6u4gx64 iscsid: connection1:0 is operational after
>> recovery (1 attempts)
>> Dec 15 20:09:34 ol6u4gx64 iscsid: Kernel reported iSCSI connection 1:0
>> error (1020 - ISCSI_ERR_TCP_CONN_CLOSE: TCP connection closed) state (3)
>>
>> Digging the code, I found it's because sd_read_write_same() use
>> scsi_report_opcode() to check if write_same is supported, but rejected
>> by target.
>> dd to the iSCSI disk is ok.  This issue doesn't occur with Linux stgt
>> target.
>>
> 
> Was the log above taken with your iscsi logging fixing patch applied?

Yes, it is. And "rejected iscsi cmd hdr" line is for debug.

> 
>> There are some questions:
>> 1. Why stgt target don't has this issue? Does it support report_opcode
>> because it has a embeded controller lun0? Or it just returns INVALID in
>> response?
> 
> Don't know. Will let someone else that works on stgt respond.
> 
>> 2. report_opcode is optional in SPC-3, so it's possible that not all
>> iscsi target support it. But is it correct for microsoft iSCSI target to
>> reject the command rather than return it with normal SCSI response?
> 
> The iscsi spec has the reason code "command not supported" so I guess it

iSCSI request keep rejected by microsoft iSCSI target because of write_same check

2014-01-09 Thread vaughan
We are testing linux iscsi boot under UEFI mode with Microsoft iSCSI
Software Target. But failed for rejected by target.
When do a normal iscsi connection, iscsi connection is keeping rejected
and recover. See below:

Dec 15 20:09:07 ol6u4gx64 kernel: scsi11 : iSCSI Initiator over TCP/IP
Dec 15 20:09:08 ol6u4gx64 kernel: scsi 11:0:0:0: Direct-Access
MSFT Virtual HD   3.3  PQ: 0 ANSI: 5
Dec 15 20:09:08 ol6u4gx64 kernel: sd 11:0:0:0: Attached scsi generic sg3
type 0
Dec 15 20:09:08 ol6u4gx64 kernel: sd 11:0:0:0: [sdc] 307200 512-byte
logical blocks: (157 MB/150 MiB)
Dec 15 20:09:08 ol6u4gx64 kernel: sd 11:0:0:0: [sdc] Write Protect is off
Dec 15 20:09:08 ol6u4gx64 kernel: sd 11:0:0:0: [sdc] Got wrong page
Dec 15 20:09:08 ol6u4gx64 kernel: sd 11:0:0:0: [sdc] Assuming drive
cache: write through
Dec 15 20:09:08 ol6u4gx64 kernel: connection1:0: pdu (op 0x1 itt 0xa)
rejected. Reason code 0x9
Dec 15 20:09:08 ol6u4gx64 iscsid: Connection1:0 to [target:
iqn.1991-05.com.microsoft:test1, portal: 10.182.92.118,3260] through
[iface: default] is operational now
Dec 15 20:09:08 ol6u4gx64 kernel: rejected iscsi cmd hdr:op 0x1, flags
0xc1, itt 0xa, data_length 512CDB: a3 c 1 12 0 0 0 0 2 0 0 0 0 0 0 0
connection1:0: detected conn error (1020)
Dec 15 20:09:09 ol6u4gx64 iscsid: Kernel reported iSCSI connection 1:0
error (1020 - ISCSI_ERR_TCP_CONN_CLOSE: TCP connection closed) state (3)
Dec 15 20:09:11 ol6u4gx64 kernel: connection1:0: pdu (op 0x1 itt
0x100b) rejected. Reason code 0x9
Dec 15 20:09:11 ol6u4gx64 kernel: rejected iscsi cmd hdr:op 0x1, flags
0xc1, itt 0x100b, data_length 512CDB: a3 c 1 12 0 0 0 0 2 0 0 0 0 0
0 0 connection1:0: detected conn error (1020)
Dec 15 20:09:11 ol6u4gx64 iscsid: connection1:0 is operational after
recovery (1 attempts)
Dec 15 20:09:12 ol6u4gx64 iscsid: Kernel reported iSCSI connection 1:0
error (1020 - ISCSI_ERR_TCP_CONN_CLOSE: TCP connection closed) state (3)
Dec 15 20:09:14 ol6u4gx64 kernel: connection1:0: pdu (op 0x1 itt
0x200c) rejected. Reason code 0x9
Dec 15 20:09:14 ol6u4gx64 kernel: rejected iscsi cmd hdr:op 0x1, flags
0xc1, itt 0x200c, data_length 512CDB: a3 c 1 12 0 0 0 0 2 0 0 0 0 0
0 0 connection1:0: detected conn error (1020)
...
Dec 15 20:09:30 ol6u4gx64 kernel: connection1:0: pdu (op 0x1 itt
0x7011) rejected. Reason code 0x9
Dec 15 20:09:30 ol6u4gx64 kernel: rejected iscsi cmd hdr:op 0x1, flags
0xc1, itt 0x7011, data_length 512CDB: a3 c 1 93 0 0 0 0 2 0 0 0 0 0
0 0 connection1:0: detected conn error (1020)
Dec 15 20:09:31 ol6u4gx64 iscsid: connection1:0 is operational after
recovery (1 attempts)
Dec 15 20:09:31 ol6u4gx64 iscsid: Kernel reported iSCSI connection 1:0
error (1020 - ISCSI_ERR_TCP_CONN_CLOSE: TCP connection closed) state (3)
Dec 15 20:09:33 ol6u4gx64 kernel: connection1:0: pdu (op 0x1 itt
0x8012) rejected. Reason code 0x9
Dec 15 20:09:34 ol6u4gx64 kernel: rejected iscsi cmd hdr:op 0x1, flags
0xc1, itt 0x8012, data_length 512CDB: a3 c 1 41 0 0 0 0 2 0 0 0 0 0
0 0 connection1:0: detected conn error (1020)
Dec 15 20:09:34 ol6u4gx64 iscsid: connection1:0 is operational after
recovery (1 attempts)
Dec 15 20:09:34 ol6u4gx64 iscsid: Kernel reported iSCSI connection 1:0
error (1020 - ISCSI_ERR_TCP_CONN_CLOSE: TCP connection closed) state (3)

Digging the code, I found it's because sd_read_write_same() use
scsi_report_opcode() to check if write_same is supported, but rejected
by target.
dd to the iSCSI disk is ok.  This issue doesn't occur with Linux stgt
target.

There are some questions:
1. Why stgt target don't has this issue? Does it support report_opcode
because it has a embeded controller lun0? Or it just returns INVALID in
response?
2. report_opcode is optional in SPC-3, so it's possible that not all
iscsi target support it. But is it correct for microsoft iSCSI target to
reject the command rather than return it with normal SCSI response?
3. Why it didn't do the same recovery in iSCSI boot case as did in
normal connection?
4. if report_opcode is not supported, shall write_same be disabled for
iSCSI?
5. Shall write_same be disabled for iSCSI by default?

Details of my test components:
  Target is setup with Microsoft iSCSI Software Target: 3.3.16563.
  stgt: scsi-target-utils-1.0.38-1.fc19.x86_64
  Kernel: 3.11.10-200.fc19.x86_64
 

-- 
Regards,
Vaughan

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] iscsi: fix wrong order of opcode and itt in iscsi_handle_reject prompt

2014-01-08 Thread Vaughan Cao
This patch makes reject messages show right value for opcode and itt, which
is converse previously.

Signed-off-by: Vaughan Cao 
---
 drivers/scsi/libiscsi.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c
index e399561..27547ff 100644
--- a/drivers/scsi/libiscsi.c
+++ b/drivers/scsi/libiscsi.c
@@ -1013,13 +1013,13 @@ static int iscsi_handle_reject(struct iscsi_conn *conn, 
struct iscsi_hdr *hdr,
iscsi_conn_printk(KERN_ERR, conn,
  "pdu (op 0x%x itt 0x%x) rejected "
  "due to DataDigest error.\n",
- rejected_pdu.itt, opcode);
+ opcode, rejected_pdu.itt);
break;
case ISCSI_REASON_IMM_CMD_REJECT:
iscsi_conn_printk(KERN_ERR, conn,
  "pdu (op 0x%x itt 0x%x) rejected. Too many "
  "immediate commands.\n",
- rejected_pdu.itt, opcode);
+ opcode, rejected_pdu.itt);
/*
 * We only send one TMF at a time so if the target could not
 * handle it, then it should get fixed (RFC mandates that
@@ -1059,8 +1059,8 @@ static int iscsi_handle_reject(struct iscsi_conn *conn, 
struct iscsi_hdr *hdr,
default:
iscsi_conn_printk(KERN_ERR, conn,
  "pdu (op 0x%x itt 0x%x) rejected. Reason "
- "code 0x%x\n", rejected_pdu.itt,
- rejected_pdu.opcode, reject->reason);
+ "code 0x%x\n", rejected_pdu.opcode,
+ rejected_pdu.itt, reject->reason);
break;
}
return rc;
-- 
1.8.3.1

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: PROBLEM: special sense code asc,ascq=04h,0Ch abort scsi scan in the middle

2013-12-18 Thread Vaughan Cao


On 2013年10月21日 14:07, vaughan wrote:

On 10/16/2013 02:52 PM, Hannes Reinecke wrote:

But seeing that this approach raises quite some issues I've attached a
different patch. Vaughan, could you test with that, too? Should be
functionally equivalent to the previous one. Cheers, Hannes

Hi Hannes,

We only tested the later patch which returns _TARGET_PRESENT after
parsing sense, it works as expected.


Hi Hannes,

Will your second patch be included in mainline?

Thanks,
Vaughan
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] iscsi: conn error (1020) each time iscsi session logout

2013-12-17 Thread Vaughan Cao
We do a normal login/logout process to iscsi server. iscsiadm report success,
but we always see the following error just before conn shutdown in dmesg.

Oct 15 05:30:09 vmhodtest019 iscsid: Connection1:0 to [target:
iqn.1986-03.com.sun:02:7b863a18-045a-cb04-c686-841f17df2f9c, portal:
10.182.32.162,3260] through [iface: default] is operational now
Oct 15 05:30:42 vmhodtest019 kernel:  connection1:0: detected conn error
(1020)
Oct 15 05:30:42 vmhodtest019 iscsid: Connection1:0 to [target:
iqn.1986-03.com.sun:02:7b863a18-045a-cb04-c686-841f17df2f9c, portal:
10.182.32.162,3260] through [iface: default] is shutdown.

It's because iscsi_tcp module evaluates socket state in data_ready() callback,
 and that detect the socket close. However, this socket close on target peer 
is in response to the logout request from initiator. So this is not an error 
that should be reported out. I quiesce it by checking session state and err 
value accordingly.

Signed-off-by: Vaughan Cao 
---
 drivers/scsi/libiscsi.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c
index 415f2c0..84171ef 100644
--- a/drivers/scsi/libiscsi.c
+++ b/drivers/scsi/libiscsi.c
@@ -1360,6 +1360,12 @@ void iscsi_conn_failure(struct iscsi_conn *conn, enum 
iscsi_err err)
spin_unlock_bh(&session->lock);
return;
}
+   /* Target closed the connection in response to logout */
+   if (session->state == ISCSI_STATE_LOGGING_OUT &&
+   err == ISCSI_ERR_TCP_CONN_CLOSE) {
+   spin_unlock_bh(&session->lock);
+   return;
+   }
 
if (conn->stop_stage == 0)
session->state = ISCSI_STATE_FAILED;
-- 
1.8.3.1

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] [SCSI] scsi_dh_alua: restrain retries during AAS transition period

2013-11-22 Thread Vaughan Cao
The SCSI ALUA handler currently retries the submit_rtpg continuously in a
loop (in the alua_rtpg routine) during the entire ALUA transitioning period
i.e.
during the time when the target replies with a NOT READY status - ASYMMETRIC
ACCESS STATE TRANSITION as shown below:

err = alua_check_sense(sdev, &sense_hdr);
if (err == ADD_TO_MLQUEUE && time_before(jiffies, expiry))
goto retry;

This causes the host to flood the target with RTPG commands during this
transitioning window (till ALUA_FAILOVER_TIMEOUT fires). This problem is also
applicable to normal block/fs read or write IO that hits the above NOT READY
status, which causes the SCSI ml to retry almost right away (through
scsi_decide_disposition->scsi_check_sense->alua_check_sense which returns
ADD_TO_MLQUEUE).

This may cause targets to be overwhelmed during this ALUA transitioning period,
leading to performance degradation. Ideally, one would want the host to perform
restrained retries during this transitioning period in all relevant code paths
including the alua_rtpg routine.

Restrain retries in the same way already implemented when state =
TPGS_STATE_TRANSITIONING is returned by extended RTPG.

Signed-off-by: Vaughan Cao 
---
 drivers/scsi/device_handler/scsi_dh_alua.c | 9 -
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/device_handler/scsi_dh_alua.c 
b/drivers/scsi/device_handler/scsi_dh_alua.c
index 68adb89..d79e57dd 100644
--- a/drivers/scsi/device_handler/scsi_dh_alua.c
+++ b/drivers/scsi/device_handler/scsi_dh_alua.c
@@ -563,8 +563,15 @@ static int alua_rtpg(struct scsi_device *sdev, struct 
alua_dh_data *h)
}
 
err = alua_check_sense(sdev, &sense_hdr);
-   if (err == ADD_TO_MLQUEUE && time_before(jiffies, expiry))
+   if (err == ADD_TO_MLQUEUE && time_before(jiffies, expiry)) {
+   if (sense_hdr.sense_key == NOT_READY &&
+   sense_hdr.asc == 0x04 && sense_hdr.ascq == 0x0a) {
+   /* State transition, restrained retry */
+   interval += 2000;
+   msleep(interval);
+   }
goto retry;
+   }
sdev_printk(KERN_INFO, sdev,
"%s: rtpg sense code %02x/%02x/%02x\n",
ALUA_DH_NAME, sense_hdr.sense_key,
-- 
1.8.3.1

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] iscsi: quiesce connection error event during logout

2013-11-21 Thread Vaughan Cao
'connectionX:X: detected conn error(1020)' message appear during normal
logout phase because of connection close on target peer.
Quiesce it to avoid confusion.

Signed-off-by: Vaughan Cao 
---
 drivers/scsi/libiscsi.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c
index e399561..0e127d1 100644
--- a/drivers/scsi/libiscsi.c
+++ b/drivers/scsi/libiscsi.c
@@ -1356,6 +1356,12 @@ void iscsi_conn_failure(struct iscsi_conn *conn, enum 
iscsi_err err)
spin_unlock_bh(&session->lock);
return;
}
+   /* Target closed the connection in response to logout */
+   if (session->state == ISCSI_STATE_LOGGING_OUT &&
+   err == ISCSI_ERR_TCP_CONN_CLOSE) {
+   spin_unlock_bh(&session->lock);
+   return;
+   }
 
if (conn->stop_stage == 0)
session->state = ISCSI_STATE_FAILED;
-- 
1.8.3.1

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2] sg: O_EXCL and other lock handling

2013-11-02 Thread Vaughan Cao


On 2013年11月03日 02:22, Douglas Gilbert wrote:

On 13-11-01 01:16 AM, vaughan wrote:

I do not follow the last point but that is not important.


For reasons that I listed in a private post I think
that my patch presented in this thread is closer to
our goals than your patch (2013/6/17/319). Timing is
important as well since we are approaching the lk 3.13
merge window. Regressions are what will set us back.

Yes, I agree with you. My v2 patch lacks much consideration on 
release/detach cases, which is a very significant issue we should 
address along with o_excl bug.


Thanks,
Vaughan
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2] sg: O_EXCL and other lock handling

2013-10-31 Thread vaughan
On 11/01/2013 03:20 AM, Douglas Gilbert wrote:
> On 13-10-31 11:56 AM, Christoph Hellwig wrote:
>>> +struct semaphore or_sem;  /* protect co-incident opens and
>>> releases */
>>
>> Seems like this should be a mutex.
>
> Yes, it is being used as a mutex. However looking at
> their semantics (mutex.h versus semaphore.h), a mutex
> takes into account the task owner. If the user space
> wants to pass around a sg file descriptor in a Unix
> domain socket (see TLPI, Kerrisk) I don't see why the
> sg driver should object (and pay the small performance
> hit for each check).
>
> A nasty side effect of a mutex taking into account the
> task owner is that it can't be used in interrupt
> context. My patch does not try to do that yet (see next
> section) but why bother. Give me a simple mutex and
> I'll use it.
>
>>> sfds_list_empty(Sg_device *sdp)
>>>   {
>>>   unsigned long flags;
>>>   int ret;
>>>
>>> +spin_lock_irqsave(&sdp->sfd_lock, flags);
>>> +ret = list_empty(&sdp->sfds);
>>> +spin_unlock_irqrestore(&sdp->sfd_lock, flags);
>>>   return ret;
>>
>> Protecting just a list_empty check with a local will give you racy
>> results.  Seems like you should take the look over the check and the
>> resulting action that modifies the list.  That'd also mean replacing the
>> wait_event* calls with open coded prepare_wait / finish_wait loops.
>
> Not (usually) in this case. The sdp->sfds list can only
> be expanded by another sg_open(same_dev) but this has
> been excluded by taking down(&sdp->or_sem) prior to that
> call. The sdp->sfds list is only normally decreased by
> sg_release() which is also excluded by down(&sdp->or_sem).
>
> The abnormal case is device removal (detaching). Now an
> open(same_dev, O_EXCL) may start waiting just after a
> detach but miss the wake up on open_wait. That suggests
> the wake_up(open_wait) in sg_remove() should also
> take the sdp->or_sem semaphore.
> Ah, and if sg_remove() can be called from an interrupt
> context then that takes out using mutexes :-)
>
> The two level of locks in sg_remove() is already making me
> uncomfortable, adding the sdp->or_sem semaphore to the
> mix calls for more analysis.
CCed to Jörn Engel.

I feel the same regarding the many locks. After reviewing patches I
wrote these days, I suppose the version 2
(https://lkml.org/lkml/2013/6/17/319) may worth a re-consideration somehow.
The purpose of or_sem is to mutex co-incident open and allow only one
thread to be processing since it completes checking condition to really
linking a new sfd into sfd_siblings list. My v2 re-implement a sem using
sfd_lock and toopen. But as Jörn pointed out in the following mail, what
I implemented is a rw_sem and that is weird and made him "having trouble
wrapping my head around the semantics of this variable"...
With wait_event involved, it's indeed no sense to use a rw_sem here.
However, if we restrict toopen in [0, 1], not [0, INT_MAX] as I did in
v2, it will act in the same way as or_sem does. Beside the same affect
as or_sem, this implement avoids the problem using in interrupt context
and more readable for me, rather than twisting my head among many
spinlocks and semaphores.
Although v2 seems attractive to me, things would be more complicated
when detached comes in...

The following is a thought I presented previously:
Is it possible to split the sg_add_sfp() into two
functions, sg_init_sfp() and sg_add_sfp2(). We can do all initialize
work in sg_init_sfp()
without any lock and let sg_add_sfp2() only serve lock-check-add in one
place. It is less flaky.

>
> Also note that sdp->detached is only protected by a
> volatile type modifier and checks on it are sprinkled
> around the code in a rather unscientific way.
>
> As you are no doubt aware, handling the "surprise" device
> removal case safely is hard, very hard. And I have tried
> to test this, loading up 4 processes with 100 tasks each,
> some with asynchronous queueing requests but most hanging
> on an open_wait then remove the device. So far I have not
> seen a hang. Again, somebody with a big machine and
> patience might like to try a scaled up device removal test
> to see what breaks.
>
>>> +down(&sdp->or_sem);
>>> +alone = sfds_list_empty(sdp);
>>> +if ((flags & O_EXCL) && (O_RDONLY == (flags & O_ACCMODE))) {
>>> +retval = -EPERM; /* Don't allow O_EXCL with read only
>>> access */
>>> +goto error_out;
>>> +}
>>
>> Seems like the pure flags check should move to the beginning of the
>

Re: [PATCH v2] sg: O_EXCL and other lock handling

2013-10-31 Thread vaughan
On 11/01/2013 03:20 AM, Douglas Gilbert wrote:
> On 13-10-31 11:56 AM, Christoph Hellwig wrote:
>>> +struct semaphore or_sem;  /* protect co-incident opens and
>>> releases */
>>
>> Seems like this should be a mutex.
>
> Yes, it is being used as a mutex. However looking at
> their semantics (mutex.h versus semaphore.h), a mutex
> takes into account the task owner. If the user space
> wants to pass around a sg file descriptor in a Unix
> domain socket (see TLPI, Kerrisk) I don't see why the
> sg driver should object (and pay the small performance
> hit for each check).
>
> A nasty side effect of a mutex taking into account the
> task owner is that it can't be used in interrupt
> context. My patch does not try to do that yet (see next
> section) but why bother. Give me a simple mutex and
> I'll use it.
>
>>> sfds_list_empty(Sg_device *sdp)
>>>   {
>>>   unsigned long flags;
>>>   int ret;
>>>
>>> +spin_lock_irqsave(&sdp->sfd_lock, flags);
>>> +ret = list_empty(&sdp->sfds);
>>> +spin_unlock_irqrestore(&sdp->sfd_lock, flags);
>>>   return ret;
>>
>> Protecting just a list_empty check with a local will give you racy
>> results.  Seems like you should take the look over the check and the
>> resulting action that modifies the list.  That'd also mean replacing the
>> wait_event* calls with open coded prepare_wait / finish_wait loops.
>
> Not (usually) in this case. The sdp->sfds list can only
> be expanded by another sg_open(same_dev) but this has
> been excluded by taking down(&sdp->or_sem) prior to that
> call. The sdp->sfds list is only normally decreased by
> sg_release() which is also excluded by down(&sdp->or_sem).
>
> The abnormal case is device removal (detaching). Now an
> open(same_dev, O_EXCL) may start waiting just after a
> detach but miss the wake up on open_wait. That suggests
> the wake_up(open_wait) in sg_remove() should also
> take the sdp->or_sem semaphore.
> Ah, and if sg_remove() can be called from an interrupt
> context then that takes out using mutexes :-)
>
> The two level of locks in sg_remove() is already making me
> uncomfortable, adding the sdp->or_sem semaphore to the
> mix calls for more analysis.
CCed to Jörn Engel.

I feel the same regarding the many locks. After reviewing patches I
wrote these days, I suppose the version 2
(https://lkml.org/lkml/2013/6/17/319) may worth a re-consideration somehow.
The purpose of or_sem is to mutex co-incident open and allow only one
thread to be processing since it completes checking condition to really
linking a new sfd into sfd_siblings list. My v2 re-implement a sem using
sfd_lock and toopen. But as Jörn pointed out in the following mail, what
I implemented is a rw_sem and that is weird and made him "having trouble
wrapping my head around the semantics of this variable"...
With wait_event involved, it's indeed no sense to use a rw_sem here.
However, if we restrict toopen in [0, 1], not [0, INT_MAX] as I did in
v2, it will act in the same way as or_sem does. Beside the same affect
as or_sem, this implement avoids the problem using in interrupt context
and more readable for me, rather than twisting my head among many
spinlocks and semaphores.
Although v2 seems attractive to me, things would be more complicated
when detached comes in...

The following is a thought I presented previously:
Is it possible to split the sg_add_sfp() into two
functions, sg_init_sfp() and sg_add_sfp2(). We can do all initialize
work in sg_init_sfp()
without any lock and let sg_add_sfp2() only serve lock-check-add in one
place. It is less flaky.

>
> Also note that sdp->detached is only protected by a
> volatile type modifier and checks on it are sprinkled
> around the code in a rather unscientific way.
>
> As you are no doubt aware, handling the "surprise" device
> removal case safely is hard, very hard. And I have tried
> to test this, loading up 4 processes with 100 tasks each,
> some with asynchronous queueing requests but most hanging
> on an open_wait then remove the device. So far I have not
> seen a hang. Again, somebody with a big machine and
> patience might like to try a scaled up device removal test
> to see what breaks.
>
>>> +down(&sdp->or_sem);
>>> +alone = sfds_list_empty(sdp);
>>> +if ((flags & O_EXCL) && (O_RDONLY == (flags & O_ACCMODE))) {
>>> +retval = -EPERM; /* Don't allow O_EXCL with read only
>>> access */
>>> +goto error_out;
>>> +}
>>
>> Seems like the pure flags check should move to the beginning of the
>

Re: [PATCH] [SCSI] sg: late O_EXCL fix for lk 3.12-rc

2013-10-21 Thread Vaughan Cao


On 2013年10月21日 07:00, Douglas Gilbert wrote:

On 13-10-20 01:31 PM, Bart Van Assche wrote:

On 10/20/13 18:09, Douglas Gilbert wrote:

Given that lk 3.12.0 release is not far away, the safest path
may still be to revert Vaughan Cao's patch. I'll leave that
decision to the maintainers.


Hello Doug,

Thanks for looking into this. But I would appreciate it if you could 
address the

whitespace errors reported by checkpatch:

ERROR: space prohibited after that '!' (ctx:BxW)
#24: FILE: drivers/scsi/sg.c:241:
+ (excl_case ? (! sdp->exclude) : sfds_list_empty(sdp;
^

ERROR: space prohibited after that '!' (ctx:BxW)
#55: FILE: drivers/scsi/sg.c:289:
+ if (! alone) {
^

ERROR: code indent should use tabs where possible
#59: FILE: drivers/scsi/sg.c:292:
+ }$

WARNING: please, no spaces at the start of a line
#59: FILE: drivers/scsi/sg.c:292:
+ }$

ERROR: space prohibited after that '!' (ctx:BxW)
#73: FILE: drivers/scsi/sg.c:301:
+ while (! alone) {
^

WARNING: suspect code indent for conditional statements (8, 12)
#144: FILE: drivers/scsi/sg.c:375:
+ if (excl || sfds_list_empty(sdp))
+ wake_up_interruptible(&sdp->open_wait);



I'd prefer people to test the patch or find logical flaws.

Doug Gilbert


Hi Doug,

Will the lines below conflict with the meaning of NONBLOCK?
>+ down(&sdp->or_sem);
>+ alone = sfds_list_empty(sdp);
> if (!((flags & O_NONBLOCK) ||
> scsi_block_when_processing_errors(sdp->device))) {

Assume one thread holds the or_sem and waiting in 
scsi_block_when_processing_errors for the underlying scsi device to 
complete error recovery,

another thread with O_NONBLOCK call sg_open().

I'm also curious why we can skip checking _processing_errors() when 
called with O_NONBLOCK?...Though it has been there from the very beginning.
In other words, since scsi device may go into a error recovery state at 
random time, why we only check here?


Thanks,
Vaughan

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: PROBLEM: special sense code asc,ascq=04h,0Ch abort scsi scan in the middle

2013-10-20 Thread vaughan
On 10/16/2013 02:52 PM, Hannes Reinecke wrote:
> But seeing that this approach raises quite some issues I've attached a
> different patch. Vaughan, could you test with that, too? Should be
> functionally equivalent to the previous one. Cheers, Hannes 
Hi Hannes,

We only tested the later patch which returns _TARGET_PRESENT after
parsing sense, it works as expected.

About the cause of this issue, admin said he is configuring a
active-active cluster mode storage. Each node has it own LUN pool and a
set of rule to control which node can access the pool.
LUN7 is owned and can only be able to manipulated by the other node, but
can be seen by this node for a misconfig. So it presents itself in
REPORT_LUN but return NOT_READY when accessed through this node.
Do you still regard this as a misbehave in response to INQUIRY?

Thanks,
Vaughan
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Bug] 12.864681 BUG: lock held when returning to user space!

2013-10-16 Thread vaughan
On 10/17/2013 06:41 AM, Douglas Gilbert wrote:
> That seems to be the case. Vaughan acknowledged the
> problem and forwarded it to me 8 days ago. Yes, it
> seems to be a "no-no" to hold a any kernel semaphore
> when returning to the user space; in this case from
> sg_open(). I was hoping a revised patch might
> appear from Vaughan but to date that has not been
> the case. So with only a few weeks to go before
> lk 3.12 is released, reverting the whole 4 patches
> in that series seems to be the safest course.
>
> Also without a new patch from Vaughan in the next few
> weeks he may also miss the opportunity of getting
> his improved O_EXCL logic into the lk 3.13 series.
>
>
> Thinking about how to solve this problem: a field could
> be added to 'struct sg_device' with one of three states:
> no_opens, non_excl_opens and excl_open. It could be
> manipulated by sg_open() and sg_release() like a
> read-write semaphore. And the faulty 'struct
> rw_semaphore o_sem' in sg_device could be replaced by a
> normal semaphore to protect the manipulation of the new
> three-state field.
> And the new three-state field would replace (or expand)
the 'char exclude' field in struct sg_device .
>
> Doug Gilbert
Hi Doug,

Thanks for providing advice on how to fix this.
However, it seems be still awkward somehow. We have to
1. hold a lock (maybe sg_index_lock or a new one)
2. check
   a) the new three-state field;
   b) if sfp list is empty;
   c) sdp->detached field;
if either condition fails, we may link the open process into o_excl_wait
queue and need wakeup.
if satisfied, we go on.
3. then we release at least sg_index_lock to malloc a new sfp and
initialize.
4. try to acquire sg_index_lock again and add this sfp into sfd_siblings
list if possible.  <== We still have to check at least sdp->detached field
5. update three-state field to reflect the result of Step 4, and wake up
processes waiting in o_excl_wait.

This uncomfortable is introduced by releasing the sg_index_lock in the
middle of check->malloc->add the new sfp struct.

I wanna ask if it is possible to split the sg_add_sfp() into two
functions, sg_init_sfp() and sg_add_sfp2(). We can do all initialize
work in sg_init_sfp()
without any lock and let sg_add_sfp2() only serve lock-check-add in one
way. It seems more convenient for me to understand.
But there is still some questions on this approach:
1. memory consume can be very large if lots of sg_init_sfp in the same time;
2. some field are initialized according to the fields of scsi device sdp
points to, such as low_dma, sg_tablesize, max_sector, phys_segs.
I know scsi_device_get() would keep the underlying scsi_device
alive, however would these fields change in the gap of our initialize
and add?
The relationship of sg_device and scsi_device like above said
confuse me somehow...

Thanks,
Vaughan
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: PROBLEM: special sense code asc,ascq=04h,0Ch abort scsi scan in the middle

2013-10-16 Thread vaughan
On 10/16/2013 02:52 PM, Hannes Reinecke wrote:
> But seeing that this approach raises quite some issues I've attached a
> different patch. Vaughan, could you test with that, too? Should be
> functionally equivalent to the previous one. Cheers, Hannes 
Of course. This one is more clear to express our intention than setting
PQ 3 to break out.

Vaughan
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: PROBLEM: special sense code asc,ascq=04h,0Ch abort scsi scan in the middle

2013-10-15 Thread Vaughan Cao


On 2013年10月15日 13:51, Hannes Reinecke wrote:

But that notwithstanding, did you get a chance to test my patch?

Cheers,

Hannes

Hi Hannes,

Kernel patched and waiting feedback from lab guy.

Thanks,
Vaughan
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: PROBLEM: special sense code asc,ascq=04h,0Ch abort scsi scan in the middle

2013-10-14 Thread vaughan
On 10/14/2013 07:13 PM, Hannes Reinecke wrote:
> In the log, inquiry to LUN7 return a sense - asc,ascq=04h,0Ch
> (Logical unit not accessible, target port in unavailable state).
> And this is ignored, so scsi_probe_lun() returns -EIO and the scan
> process is aborted.
>
> I have two questions:
> 1. Is it correct for hardware to return a sense 04h,0Ch to inquiry
> again, even after presenting this lun in responce to REPORT_LUN
> command?
> Yes, this is correct. 'REPORT LUNS' is supported in 'Unavailable' state.
>
>> 2. Since windows and solaris can continue scan, is it reasonable for
>> linux to do the same, even for a fault-tolerance purpose?
>>
> Hmm. Yes, and no.
>
> _Actually_ this is an issue with the target, as it looks as if it
> will return the above sense code while sending an 'INQUIRY' to the
> device.
> SPC explicitely states that the INQUIRY command should _not_ fail
> for unavailable devices.
Hi all,

I found this below in spc4.
>>>
5.15.2.4.4 Target port group asymmetric access states - Standby state
While in the unavailable primary target port asymmetric access state,
the device server shall support those of
the following commands that it supports while in the active/optimized state:
a) INQUIRY (the peripheral qualifier (see 6.6.2) shall be set to 001b);

For those commands that are not supported, the device server shall
terminate the command with CHECK
CONDITION status, with the sense key set to NOT READY, and the
additional sense code set to LOGICAL
UNIT NOT ACCESSIBLE, TARGET PORT IN UNAVAILABLE STATE.
<<<
>From the above, I suppose the hardware may works very compliant with
spc. The case could be:
Storage is a alua supported target. Initiator sent REPORT_LUN to target,
target return all pqual=000b to it.
Then Initiator INQUIRY lun 7 which is in standby state where pqual=000b
not 001b. So this INQUIRY is regarded as
'not supported', and get terminated with CHECK_CONDITION,  sense key=NOT
READY, asc,ascq=04h,0ch.

Could you confirm if my understanding is right or wrong?

Thanks,
Vaughan
> But yeah, we probably should work around this issues.
> Nevertheless, please raise this issue with your array vendor.
>
> Please try the attached patch.
>
> Cheers,
>
> Hannes

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: PROBLEM: special sense code asc,ascq=04h,0Ch abort scsi scan in the middle

2013-10-14 Thread Vaughan Cao


On 2013年10月14日 21:18, Hannes Reinecke wrote:

On 10/14/2013 02:51 PM, Steffen Maier wrote:

Hi Hannes,

On 10/14/2013 01:13 PM, Hannes Reinecke wrote:

On 10/13/2013 07:23 PM, Vaughan Cao wrote:

Hi James,

[1.] One line summary of the problem:
special sense code asc,ascq=04h,0Ch abort scsi scan in the middle

[2.] Full description of the problem/report:
For instance, storage represents 8 iscsi LUNs, however the LUN No.7
is not well configured or has something wrong.
Then messages received:
kernel: scsi 5:0:0:0: Unexpected response from lun 7 while scanning, scan 
aborted
Which will make LUN No.8 unavailable.
It's confirmed that Windows and Solaris systems will continue the
scan and make LUN No.1,2,3,4,5,6 and 8 available.

Log snippet is as below:
Aug 24 00:32:49 vmhodtest019 kernel: scsi 5:0:0:7: scsi scan: INQUIRY pass 1 
length 36
Aug 24 00:32:49 vmhodtest019 kernel: scsi 5:0:0:7: Send: 0x8801e9bd4280
Aug 24 00:32:49 vmhodtest019 kernel: scsi 5:0:0:7: CDB: Inquiry: 12 00 00 00 24 
00
Aug 24 00:32:49 vmhodtest019 kernel: buffer = 0x8801f71fc180, bufflen = 36, 
queuecommand 0xa00b99e7
Aug 24 00:32:49 vmhodtest019 kernel: leaving scsi_dispatch_cmnd()
Aug 24 00:32:49 vmhodtest019 kernel: scsi 5:0:0:7: Done: 0x8801e9bd4280 
SUCCESS
Aug 24 00:32:49 vmhodtest019 kernel: scsi 5:0:0:7: Result: hostbyte=DID_OK 
driverbyte=DRIVER_OK
Aug 24 00:32:49 vmhodtest019 kernel: scsi 5:0:0:7: CDB: Inquiry: 12 00 00 00 24 
00
Aug 24 00:32:49 vmhodtest019 kernel: scsi 5:0:0:7: Sense Key : Not Ready 
[current]
Aug 24 00:32:49 vmhodtest019 kernel: scsi 5:0:0:7: Add. Sense: Logical unit not 
accessible, target port in unavailable state
Aug 24 00:32:49 vmhodtest019 kernel: scsi 5:0:0:7: scsi host busy 1 failed 0
Aug 24 00:32:49 vmhodtest019 kernel: 0 sectors total, 36 bytes done.
Aug 24 00:32:49 vmhodtest019 kernel: scsi scan: INQUIRY failed with code 
0x802
Aug 24 00:32:49 vmhodtest019 kernel: scsi 5:0:0:0: Unexpected response from lun 
7 while scanning, scan aborted

According to scsi_report_lun_scan(), I found:
Linux use an inquiry command to probe a lun according to the result
of report_lun command.
It assumes every probe cmd will get a legal result. Otherwise, it
regards the whole peripheral not exist or dead.
If the return of inquiry passes its legal checking and indicates
'LUN not present', it won't break but also continue with the scan
process.
In the log, inquiry to LUN7 return a sense - asc,ascq=04h,0Ch
(Logical unit not accessible, target port in unavailable state).
And this is ignored, so scsi_probe_lun() returns -EIO and the scan
process is aborted.

I have two questions:
1. Is it correct for hardware to return a sense 04h,0Ch to inquiry
again, even after presenting this lun in responce to REPORT_LUN
command?

Yes, this is correct. 'REPORT LUNS' is supported in 'Unavailable' state.


2. Since windows and solaris can continue scan, is it reasonable for
linux to do the same, even for a fault-tolerance purpose?


Hmm. Yes, and no.

_Actually_ this is an issue with the target, as it looks as if it
will return the above sense code while sending an 'INQUIRY' to the
device.
SPC explicitely states that the INQUIRY command should _not_ fail
for unavailable devices.
But yeah, we probably should work around this issues.
Nevertheless, please raise this issue with your array vendor.

Please try the attached patch.

Cheers,

Hannes


In LLDDs that do their own initiator based LUN masking (because the midlayer 
does not have this
functionality to enable hardware virtualization without NPIV, or

to work around suboptimal LUN

masking on the target), they are likely to return -ENXIO from

slave_alloc(), making scsi_alloc_sdev()

return NULL, being converted to SCSI_SCAN_NO_RESPONSE by

scsi_probe_and_add_lun() and thus going

through the same code path above.


Ah. Hmm. Yes, they would.

However, I personally would question this approach, as SPC states that


The REPORT LUNS command (see table 284) requests the device
server to return the peripheral device logical unit inventory
accessible to the I_T nexus.

So by plain reading this would meant that you either should modify
'REPORT LUNS' to not show the masked LUNs,
I have the same question. If you don't want us use them, why still you 
present them in response to REPORT_LUN?
Since you report it in REPORT_LUN, I suppose the target server at least 
hold some information of this lun, so it shouldn't give an error when I 
check it? It should give me something to suggest that lun does exist, 
though it's not allowed to deal more with it at this time.
Or 'accessible' doesn't mean accessible at this time, but we have rights 
to address this LUN in this session? Whether it's online or not depends 
on the result of INQUIRY and TEST_UNIT_READY?



  or set the pqual field to
'0x10' or '0x11' for those LUNs.

Do you mean 001b?
After read the spc4r36g a

PROBLEM: special sense code asc,ascq=04h,0Ch abort scsi scan in the middle

2013-10-13 Thread Vaughan Cao

Hi James,

[1.] One line summary of the problem:
special sense code asc,ascq=04h,0Ch abort scsi scan in the middle

[2.] Full description of the problem/report:
For instance, storage represents 8 iscsi LUNs, however the LUN No.7 is 
not well configured or has something wrong.

Then messages received:
kernel: scsi 5:0:0:0: Unexpected response from lun 7 while scanning, 
scan aborted

Which will make LUN No.8 unavailable.
It's confirmed that Windows and Solaris systems will continue the scan 
and make LUN No.1,2,3,4,5,6 and 8 available.


Log snippet is as below:
Aug 24 00:32:49 vmhodtest019 kernel: scsi 5:0:0:7: scsi scan: INQUIRY 
pass 1 length 36

Aug 24 00:32:49 vmhodtest019 kernel: scsi 5:0:0:7: Send: 0x8801e9bd4280
Aug 24 00:32:49 vmhodtest019 kernel: scsi 5:0:0:7: CDB: Inquiry: 12 00 
00 00 24 00
Aug 24 00:32:49 vmhodtest019 kernel: buffer = 0x8801f71fc180, 
bufflen = 36, queuecommand 0xa00b99e7

Aug 24 00:32:49 vmhodtest019 kernel: leaving scsi_dispatch_cmnd()
Aug 24 00:32:49 vmhodtest019 kernel: scsi 5:0:0:7: Done: 
0x8801e9bd4280 SUCCESS
Aug 24 00:32:49 vmhodtest019 kernel: scsi 5:0:0:7: Result: 
hostbyte=DID_OK driverbyte=DRIVER_OK
Aug 24 00:32:49 vmhodtest019 kernel: scsi 5:0:0:7: CDB: Inquiry: 12 00 
00 00 24 00
Aug 24 00:32:49 vmhodtest019 kernel: scsi 5:0:0:7: Sense Key : Not Ready 
[current]
Aug 24 00:32:49 vmhodtest019 kernel: scsi 5:0:0:7: Add. Sense: Logical 
unit not accessible, target port in unavailable state

Aug 24 00:32:49 vmhodtest019 kernel: scsi 5:0:0:7: scsi host busy 1 failed 0
Aug 24 00:32:49 vmhodtest019 kernel: 0 sectors total, 36 bytes done.
Aug 24 00:32:49 vmhodtest019 kernel: scsi scan: INQUIRY failed with code 
0x802
Aug 24 00:32:49 vmhodtest019 kernel: scsi 5:0:0:0: Unexpected response 
from lun 7 while scanning, scan aborted


According to scsi_report_lun_scan(), I found:
Linux use an inquiry command to probe a lun according to the result of 
report_lun command.
It assumes every probe cmd will get a legal result. Otherwise, it 
regards the whole peripheral not exist or dead.
If the return of inquiry passes its legal checking and indicates 'LUN 
not present', it won't break but also continue with the scan process.
In the log, inquiry to LUN7 return a sense - asc,ascq=04h,0Ch (Logical 
unit not accessible, target port in unavailable state).
And this is ignored, so scsi_probe_lun() returns -EIO and the scan 
process is aborted.


I have two questions:
1. Is it correct for hardware to return a sense 04h,0Ch to inquiry 
again, even after presenting this lun in responce to REPORT_LUN command?
2. Since windows and solaris can continue scan, is it reasonable for 
linux to do the same, even for a fault-tolerance purpose?


Below is information of our storage setting:
Storage array is configured as a cluster mode, and there is a "default" 
target group and "default" initiator group exist on
the storage that includes the target nodename of both the nodes in the 
cluster and all initiator names respectively.
In the partner node, there was lun mapped to the default target 
group/initiator group and having the ID 7.
Since that lun is owner by the partner node, the SCSI inquiry was 
failing on it and as a result the initiator aborts the scan.


Thanks,
Vaughan
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v7 0/4][SCSI] sg: fix race condition in sg_open

2013-08-28 Thread Vaughan Cao
There is a race when open sg with O_EXCL flag. Also a race may happen between
sg_open and sg_remove.

Changes from v6:
 * [1/4] remove double if.
 * [3/4] fix via IS_ERR
Changes from v5:
 Patches based on v3.11-rc7 and passed sg_tst_excl test.
 * [1/4] * remove unused variables - res,sdp.
 * fix insane code dealing with sg_add_sfp.
 * [2/4] resolve conflict with v3.11-rc7. 
 * [3/4] remove sem_out label.
 * [4/4] add sdp definition.

Changes from v4:
 * [3/4] use ERR_PTR series instead of adding another parameter in sg_add_sfp
 * [4/4] fix conflict for cherry-pick from v3.

Changes from v3:
 * release o_sem in sg_release(), not in sg_remove_sfp().
 * not set exclude with sfd_lock held.

Vaughan Cao (4):
  sg: use rwsem to solve race during exclusive open
  sg: no need sg_open_exclusive_lock
  sg: checking sdp->detached isn't protected when open
  sg: push file descriptor list locking down to per-device locking

 drivers/scsi/sg.c | 176 +-
 1 file changed, 81 insertions(+), 95 deletions(-)

-- 
1.8.3.1

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v7 2/4] sg: no need sg_open_exclusive_lock

2013-08-28 Thread Vaughan Cao
Open exclusive check is protected by o_sem, no need sg_open_exclusive_lock.
@exclude is used to record which type of rwsem we are holding.

Signed-off-by: Vaughan Cao 
---
 drivers/scsi/sg.c | 34 +-
 1 file changed, 5 insertions(+), 29 deletions(-)

diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
index 4efa9b5..d4af132 100644
--- a/drivers/scsi/sg.c
+++ b/drivers/scsi/sg.c
@@ -105,8 +105,6 @@ static int scatter_elem_sz_prev = SG_SCATTER_SZ;
 static int sg_add(struct device *, struct class_interface *);
 static void sg_remove(struct device *, struct class_interface *);
 
-static DEFINE_SPINLOCK(sg_open_exclusive_lock);
-
 static DEFINE_IDR(sg_index_idr);
 static DEFINE_RWLOCK(sg_index_lock);   /* Also used to lock
   file descriptor list 
for device */
@@ -176,7 +174,6 @@ typedef struct sg_device { /* holds the state of each scsi 
generic device */
struct list_head sfds;
struct rw_semaphore o_sem;  /* exclude open should hold this rwsem 
*/
volatile char detached; /* 0->attached, 1->detached pending removal */
-   /* exclude protected by sg_open_exclusive_lock */
char exclude;   /* opened for exclusive access */
char sgdebug;   /* 0->off, 1->sense, 9->dump dev, 10-> all devs 
*/
struct gendisk *disk;
@@ -225,27 +222,6 @@ static int sg_allow_access(struct file *filp, unsigned 
char *cmd)
return blk_verify_command(cmd, filp->f_mode & FMODE_WRITE);
 }
 
-static int get_exclude(Sg_device *sdp)
-{
-   unsigned long flags;
-   int ret;
-
-   spin_lock_irqsave(&sg_open_exclusive_lock, flags);
-   ret = sdp->exclude;
-   spin_unlock_irqrestore(&sg_open_exclusive_lock, flags);
-   return ret;
-}
-
-static int set_exclude(Sg_device *sdp, char val)
-{
-   unsigned long flags;
-
-   spin_lock_irqsave(&sg_open_exclusive_lock, flags);
-   sdp->exclude = val;
-   spin_unlock_irqrestore(&sg_open_exclusive_lock, flags);
-   return val;
-}
-
 static int sfds_list_empty(Sg_device *sdp)
 {
unsigned long flags;
@@ -317,7 +293,7 @@ sg_open(struct inode *inode, struct file *filp)
}
/* Since write lock is held, no need to check sfd_list */
if (flags & O_EXCL)
-   set_exclude(sdp, 1);
+   sdp->exclude = 1;   /* used by release lock */
 
if (sdp->detached) {
retval = -ENODEV;
@@ -337,7 +313,7 @@ sg_open(struct inode *inode, struct file *filp)
retval = -ENOMEM;
 sem_out:
if (flags & O_EXCL) {
-   set_exclude(sdp, 0);/* undo if error */
+   sdp->exclude = 0;   /* undo if error */
up_write(&sdp->o_sem);
} else
up_read(&sdp->o_sem);
@@ -364,8 +340,8 @@ sg_release(struct inode *inode, struct file *filp)
return -ENXIO;
SCSI_LOG_TIMEOUT(3, printk("sg_release: %s\n", sdp->disk->disk_name));
 
-   excl = get_exclude(sdp);
-   set_exclude(sdp, 0);
+   excl = sdp->exclude;
+   sdp->exclude = 0;
if (excl)
up_write(&sdp->o_sem);
else
@@ -2622,7 +2598,7 @@ static int sg_proc_seq_show_debug(struct seq_file *s, 
void *v)
 scsidp->lun,
 scsidp->host->hostt->emulated);
seq_printf(s, " sg_tablesize=%d excl=%d\n",
-  sdp->sg_tablesize, get_exclude(sdp));
+  sdp->sg_tablesize, sdp->exclude);
sg_proc_debug_helper(s, sdp);
}
read_unlock_irqrestore(&sg_index_lock, iflags);
-- 
1.8.3.1

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v7 4/4] sg: push file descriptor list locking down to per-device locking

2013-08-28 Thread Vaughan Cao
Push file descriptor list locking down to per-device locking. Let sg_index_lock
only protect device lookup.
sdp->detached is also set and checked with this lock held.

Changes from v4:
 * Since I use ERR_PTR and friends in sg_add_sfp, this patch should also be
updated to resolve conflict in cherrry-pick.

Signed-off-by: Vaughan Cao 
---
 drivers/scsi/sg.c | 62 ++-
 1 file changed, 34 insertions(+), 28 deletions(-)

diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
index 64df1ab..5cbc4bb 100644
--- a/drivers/scsi/sg.c
+++ b/drivers/scsi/sg.c
@@ -106,8 +106,7 @@ static int sg_add(struct device *, struct class_interface 
*);
 static void sg_remove(struct device *, struct class_interface *);
 
 static DEFINE_IDR(sg_index_idr);
-static DEFINE_RWLOCK(sg_index_lock);   /* Also used to lock
-  file descriptor list 
for device */
+static DEFINE_RWLOCK(sg_index_lock);
 
 static struct class_interface sg_interface = {
.add_dev= sg_add,
@@ -144,8 +143,7 @@ typedef struct sg_request { /* SG_MAX_QUEUE requests 
outstanding per file */
 } Sg_request;
 
 typedef struct sg_fd { /* holds the state of a file descriptor */
-   /* sfd_siblings is protected by sg_index_lock */
-   struct list_head sfd_siblings;
+   struct list_head sfd_siblings; /* protected by sfd_lock of device */
struct sg_device *parentdp; /* owning device */
wait_queue_head_t read_wait;/* queue read until command done */
rwlock_t rq_list_lock;  /* protect access to list in req_arr */
@@ -170,7 +168,7 @@ typedef struct sg_device { /* holds the state of each scsi 
generic device */
struct scsi_device *device;
int sg_tablesize;   /* adapter's max scatter-gather table size */
u32 index;  /* device index number */
-   /* sfds is protected by sg_index_lock */
+   spinlock_t sfd_lock;/* protect file descriptor list for device */
struct list_head sfds;
struct rw_semaphore o_sem;  /* exclude open should hold this rwsem 
*/
volatile char detached; /* 0->attached, 1->detached pending removal */
@@ -227,9 +225,9 @@ static int sfds_list_empty(Sg_device *sdp)
unsigned long flags;
int ret;
 
-   read_lock_irqsave(&sg_index_lock, flags);
+   spin_lock_irqsave(&sdp->sfd_lock, flags);
ret = list_empty(&sdp->sfds);
-   read_unlock_irqrestore(&sg_index_lock, flags);
+   spin_unlock_irqrestore(&sdp->sfd_lock, flags);
return ret;
 }
 
@@ -1393,6 +1391,7 @@ static Sg_device *sg_alloc(struct gendisk *disk, struct 
scsi_device *scsidp)
disk->first_minor = k;
sdp->disk = disk;
sdp->device = scsidp;
+   spin_lock_init(&sdp->sfd_lock);
INIT_LIST_HEAD(&sdp->sfds);
init_rwsem(&sdp->o_sem);
sdp->sg_tablesize = queue_max_segments(q);
@@ -1527,11 +1526,13 @@ static void sg_remove(struct device *cl_dev, struct 
class_interface *cl_intf)
 
/* Need a write lock to set sdp->detached. */
write_lock_irqsave(&sg_index_lock, iflags);
+   spin_lock(&sdp->sfd_lock);
sdp->detached = 1;
list_for_each_entry(sfp, &sdp->sfds, sfd_siblings) {
wake_up_interruptible(&sfp->read_wait);
kill_fasync(&sfp->async_qp, SIGPOLL, POLL_HUP);
}
+   spin_unlock(&sdp->sfd_lock);
write_unlock_irqrestore(&sg_index_lock, iflags);
 
sysfs_remove_link(&scsidp->sdev_gendev.kobj, "generic");
@@ -2056,13 +2057,13 @@ sg_add_sfp(Sg_device * sdp, int dev)
sfp->cmd_q = SG_DEF_COMMAND_Q;
sfp->keep_orphan = SG_DEF_KEEP_ORPHAN;
sfp->parentdp = sdp;
-   write_lock_irqsave(&sg_index_lock, iflags);
+   spin_lock_irqsave(&sdp->sfd_lock, iflags);
if (sdp->detached) {
-   write_unlock_irqrestore(&sg_index_lock, iflags);
+   spin_unlock_irqrestore(&sdp->sfd_lock, iflags);
return ERR_PTR(-ENODEV);
}
list_add_tail(&sfp->sfd_siblings, &sdp->sfds);
-   write_unlock_irqrestore(&sg_index_lock, iflags);
+   spin_unlock_irqrestore(&sdp->sfd_lock, iflags);
SCSI_LOG_TIMEOUT(3, printk("sg_add_sfp: sfp=0x%p\n", sfp));
if (unlikely(sg_big_buff != def_reserved_size))
sg_big_buff = def_reserved_size;
@@ -2109,11 +2110,12 @@ static void sg_remove_sfp_usercontext(struct 
work_struct *work)
 static void sg_remove_sfp(struct kref *kref)
 {
struct sg_fd *sfp = container_of(kref, struct sg_fd, f_ref);
+   struct sg_device *sdp = sfp->parentdp;
unsigned long iflags;
 
-   write_lock_irqsave(&sg_index_lock, iflags);
+   spin_lock

[PATCH v7 1/4] sg: use rwsem to solve race during exclusive open

2013-08-28 Thread Vaughan Cao
A race condition may happen if two threads are both trying to open the same sg
with O_EXCL simultaneously. It's possible that they both find fsds list is
empty and get_exclude(sdp) returns 0, then they both call set_exclude() and
break out from wait_event_interruptible and resume open.

Now use rwsem to protect this process. Exclusive open gets write lock and
others get read lock. The lock will be held until file descriptor is closed.
This also leads 'exclude' only a status rather than a check mark.

Changes from v6:
 * remove double if.
Changes from v5:
 * remove unused variables
 * fix insane code dealing with sg_add_sfp.

Signed-off-by: Vaughan Cao 
---
 drivers/scsi/sg.c | 79 +--
 1 file changed, 41 insertions(+), 38 deletions(-)

diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
index df5e961..4efa9b5 100644
--- a/drivers/scsi/sg.c
+++ b/drivers/scsi/sg.c
@@ -170,11 +170,11 @@ typedef struct sg_fd {/* holds the state of a 
file descriptor */
 
 typedef struct sg_device { /* holds the state of each scsi generic device */
struct scsi_device *device;
-   wait_queue_head_t o_excl_wait;  /* queue open() when O_EXCL in use */
int sg_tablesize;   /* adapter's max scatter-gather table size */
u32 index;  /* device index number */
/* sfds is protected by sg_index_lock */
struct list_head sfds;
+   struct rw_semaphore o_sem;  /* exclude open should hold this rwsem 
*/
volatile char detached; /* 0->attached, 1->detached pending removal */
/* exclude protected by sg_open_exclusive_lock */
char exclude;   /* opened for exclusive access */
@@ -265,7 +265,6 @@ sg_open(struct inode *inode, struct file *filp)
struct request_queue *q;
Sg_device *sdp;
Sg_fd *sfp;
-   int res;
int retval;
 
nonseekable_open(inode, filp);
@@ -294,35 +293,35 @@ sg_open(struct inode *inode, struct file *filp)
goto error_out;
}
 
-   if (flags & O_EXCL) {
-   if (O_RDONLY == (flags & O_ACCMODE)) {
-   retval = -EPERM; /* Can't lock it with read only access 
*/
-   goto error_out;
-   }
-   if (!sfds_list_empty(sdp) && (flags & O_NONBLOCK)) {
-   retval = -EBUSY;
-   goto error_out;
-   }
-   res = wait_event_interruptible(sdp->o_excl_wait,
-  ((!sfds_list_empty(sdp) || 
get_exclude(sdp)) ? 0 : set_exclude(sdp, 1)));
-   if (res) {
-   retval = res;   /* -ERESTARTSYS because signal hit 
process */
-   goto error_out;
-   }
-   } else if (get_exclude(sdp)) {  /* some other fd has an exclusive lock 
on dev */
-   if (flags & O_NONBLOCK) {
-   retval = -EBUSY;
-   goto error_out;
-   }
-   res = wait_event_interruptible(sdp->o_excl_wait, 
!get_exclude(sdp));
-   if (res) {
-   retval = res;   /* -ERESTARTSYS because signal hit 
process */
-   goto error_out;
+   if ((flags & O_EXCL) && (O_RDONLY == (flags & O_ACCMODE))) {
+   retval = -EPERM; /* Can't lock it with read only access */
+   goto error_out;
+   }
+   if (flags & O_NONBLOCK) {
+   if (flags & O_EXCL) {
+   if (!down_write_trylock(&sdp->o_sem)) {
+   retval = -EBUSY;
+   goto error_out;
+   }
+   } else {
+   if (!down_read_trylock(&sdp->o_sem)) {
+   retval = -EBUSY;
+   goto error_out;
+   }
}
+   } else {
+   if (flags & O_EXCL)
+   down_write(&sdp->o_sem);
+   else
+   down_read(&sdp->o_sem);
}
+   /* Since write lock is held, no need to check sfd_list */
+   if (flags & O_EXCL)
+   set_exclude(sdp, 1);
+
if (sdp->detached) {
retval = -ENODEV;
-   goto error_out;
+   goto sem_out;
}
if (sfds_list_empty(sdp)) { /* no existing opens on this device */
sdp->sgdebug = 0;
@@ -331,17 +330,18 @@ sg_open(struct inode *inode, struct file *filp)
}
if ((sfp = sg_add_sfp(sdp, dev)))
filp->private_data = sfp;
+   /* retval is already provably zero at this point because of the
+* check after retval = scsi_autopm_get_device(sdp->device))
+*/
 

[PATCH v7 3/4] sg: checking sdp->detached isn't protected when open

2013-08-28 Thread Vaughan Cao
@detached is set under the protection of sg_index_lock. Without getting the
lock, new sfp will be added during sg removal and there is no chance for it
to be picked out. So check with sg_index_lock held in sg_add_sfp().

Changes from v6:
 * fix via IS_ERR
Changes from v5:
 * remove sem_out label.
Changes from v4:
 * use ERR_PTR series instead of adding another parameter in sg_add_sfp

Signed-off-by: Vaughan Cao 
---
 drivers/scsi/sg.c | 17 +
 1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
index d4af132..64df1ab 100644
--- a/drivers/scsi/sg.c
+++ b/drivers/scsi/sg.c
@@ -295,23 +295,20 @@ sg_open(struct inode *inode, struct file *filp)
if (flags & O_EXCL)
sdp->exclude = 1;   /* used by release lock */
 
-   if (sdp->detached) {
-   retval = -ENODEV;
-   goto sem_out;
-   }
if (sfds_list_empty(sdp)) { /* no existing opens on this device */
sdp->sgdebug = 0;
q = sdp->device->request_queue;
sdp->sg_tablesize = queue_max_segments(q);
}
-   if ((sfp = sg_add_sfp(sdp, dev)))
+   sfp = sg_add_sfp(sdp, dev);
+   if (!IS_ERR(sfp))
filp->private_data = sfp;
/* retval is already provably zero at this point because of the
 * check after retval = scsi_autopm_get_device(sdp->device))
 */
else {
-   retval = -ENOMEM;
-sem_out:
+   retval = PTR_ERR(sfp);
+
if (flags & O_EXCL) {
sdp->exclude = 0;   /* undo if error */
up_write(&sdp->o_sem);
@@ -2045,7 +2042,7 @@ sg_add_sfp(Sg_device * sdp, int dev)
 
sfp = kzalloc(sizeof(*sfp), GFP_ATOMIC | __GFP_NOWARN);
if (!sfp)
-   return NULL;
+   return ERR_PTR(-ENOMEM);
 
init_waitqueue_head(&sfp->read_wait);
rwlock_init(&sfp->rq_list_lock);
@@ -2060,6 +2057,10 @@ sg_add_sfp(Sg_device * sdp, int dev)
sfp->keep_orphan = SG_DEF_KEEP_ORPHAN;
sfp->parentdp = sdp;
write_lock_irqsave(&sg_index_lock, iflags);
+   if (sdp->detached) {
+   write_unlock_irqrestore(&sg_index_lock, iflags);
+   return ERR_PTR(-ENODEV);
+   }
list_add_tail(&sfp->sfd_siblings, &sdp->sfds);
write_unlock_irqrestore(&sg_index_lock, iflags);
SCSI_LOG_TIMEOUT(3, printk("sg_add_sfp: sfp=0x%p\n", sfp));
-- 
1.8.3.1

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v6 0/4][SCSI] sg: fix race condition in sg_open

2013-08-28 Thread Vaughan Cao
There is a race when open sg with O_EXCL flag. Also a race may happen between
sg_open and sg_remove.

Changes from v5:
 Patches based on v3.11-rc7 and passed sg_tst_excl test.
 * [1/4] * remove unused variables - res,sdp.
 * fix insane code dealing with sg_add_sfp.
 * [2/4] resolve conflict with v3.11-rc7. 
 * [3/4] remove sem_out label.
 * [4/4] add sdp definition.

Changes from v4:
 * [3/4] use ERR_PTR series instead of adding another parameter in sg_add_sfp
 * [4/4] fix conflict for cherry-pick from v3.

Changes from v3:
 * release o_sem in sg_release(), not in sg_remove_sfp().
 * not set exclude with sfd_lock held.

Vaughan Cao (4):
  sg: use rwsem to solve race during exclusive open
  sg: no need sg_open_exclusive_lock
  sg: checking sdp->detached isn't protected when open
  sg: push file descriptor list locking down to per-device locking

 drivers/scsi/sg.c | 173 +-
 1 file changed, 80 insertions(+), 93 deletions(-)

-- 
1.8.3.1

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v6 3/4] sg: checking sdp->detached isn't protected when open

2013-08-28 Thread Vaughan Cao
@detached is set under the protection of sg_index_lock. Without getting the
lock, new sfp will be added during sg removal and there is no chance for it
to be picked out. So check with sg_index_lock held in sg_add_sfp().

Changes from v5:
 * remove sem_out label.
Changes from v4:
 * use ERR_PTR series instead of adding another parameter in sg_add_sfp

Signed-off-by: Vaughan Cao 
---
 drivers/scsi/sg.c | 20 ++--
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
index dcbd95f..6bffe52 100644
--- a/drivers/scsi/sg.c
+++ b/drivers/scsi/sg.c
@@ -295,10 +295,6 @@ sg_open(struct inode *inode, struct file *filp)
if (flags & O_EXCL)
sdp->exclude = 1;   /* used by release lock */
 
-   if (sdp->detached) {
-   retval = -ENODEV;
-   goto sem_out;
-   }
if (sfds_list_empty(sdp)) { /* no existing opens on this device */
sdp->sgdebug = 0;
q = sdp->device->request_queue;
@@ -309,16 +305,16 @@ sg_open(struct inode *inode, struct file *filp)
/* retval is already provably zero at this point because of the
 * check after retval = scsi_autopm_get_device(sdp->device))
 */
-   else
-   retval = -ENOMEM;
-
-   if (retval) {
-sem_out:
+   else {
+   retval = PTR_ERR(sfp);
if (flags & O_EXCL) {
sdp->exclude = 0;   /* undo if error */
up_write(&sdp->o_sem);
} else
up_read(&sdp->o_sem);
+   }
+
+   if (retval) {
 error_out:
scsi_autopm_put_device(sdp->device);
 sdp_put:
@@ -2047,7 +2043,7 @@ sg_add_sfp(Sg_device * sdp, int dev)
 
sfp = kzalloc(sizeof(*sfp), GFP_ATOMIC | __GFP_NOWARN);
if (!sfp)
-   return NULL;
+   return ERR_PTR(-ENOMEM);
 
init_waitqueue_head(&sfp->read_wait);
rwlock_init(&sfp->rq_list_lock);
@@ -2062,6 +2058,10 @@ sg_add_sfp(Sg_device * sdp, int dev)
sfp->keep_orphan = SG_DEF_KEEP_ORPHAN;
sfp->parentdp = sdp;
write_lock_irqsave(&sg_index_lock, iflags);
+   if (sdp->detached) {
+   write_unlock_irqrestore(&sg_index_lock, iflags);
+   return ERR_PTR(-ENODEV);
+   }
list_add_tail(&sfp->sfd_siblings, &sdp->sfds);
write_unlock_irqrestore(&sg_index_lock, iflags);
SCSI_LOG_TIMEOUT(3, printk("sg_add_sfp: sfp=0x%p\n", sfp));
-- 
1.8.3.1

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v6 4/4] sg: push file descriptor list locking down to per-device locking

2013-08-28 Thread Vaughan Cao
Push file descriptor list locking down to per-device locking. Let sg_index_lock
only protect device lookup.
sdp->detached is also set and checked with this lock held.

Changes from v4:
 * Since I use ERR_PTR and friends in sg_add_sfp, this patch should also be
updated to resolve conflict in cherrry-pick.

Signed-off-by: Vaughan Cao 
---
 drivers/scsi/sg.c | 62 ++-
 1 file changed, 34 insertions(+), 28 deletions(-)

diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
index 6bffe52..10d6943 100644
--- a/drivers/scsi/sg.c
+++ b/drivers/scsi/sg.c
@@ -106,8 +106,7 @@ static int sg_add(struct device *, struct class_interface 
*);
 static void sg_remove(struct device *, struct class_interface *);
 
 static DEFINE_IDR(sg_index_idr);
-static DEFINE_RWLOCK(sg_index_lock);   /* Also used to lock
-  file descriptor list 
for device */
+static DEFINE_RWLOCK(sg_index_lock);
 
 static struct class_interface sg_interface = {
.add_dev= sg_add,
@@ -144,8 +143,7 @@ typedef struct sg_request { /* SG_MAX_QUEUE requests 
outstanding per file */
 } Sg_request;
 
 typedef struct sg_fd { /* holds the state of a file descriptor */
-   /* sfd_siblings is protected by sg_index_lock */
-   struct list_head sfd_siblings;
+   struct list_head sfd_siblings; /* protected by sfd_lock of device */
struct sg_device *parentdp; /* owning device */
wait_queue_head_t read_wait;/* queue read until command done */
rwlock_t rq_list_lock;  /* protect access to list in req_arr */
@@ -170,7 +168,7 @@ typedef struct sg_device { /* holds the state of each scsi 
generic device */
struct scsi_device *device;
int sg_tablesize;   /* adapter's max scatter-gather table size */
u32 index;  /* device index number */
-   /* sfds is protected by sg_index_lock */
+   spinlock_t sfd_lock;/* protect file descriptor list for device */
struct list_head sfds;
struct rw_semaphore o_sem;  /* exclude open should hold this rwsem 
*/
volatile char detached; /* 0->attached, 1->detached pending removal */
@@ -227,9 +225,9 @@ static int sfds_list_empty(Sg_device *sdp)
unsigned long flags;
int ret;
 
-   read_lock_irqsave(&sg_index_lock, flags);
+   spin_lock_irqsave(&sdp->sfd_lock, flags);
ret = list_empty(&sdp->sfds);
-   read_unlock_irqrestore(&sg_index_lock, flags);
+   spin_unlock_irqrestore(&sdp->sfd_lock, flags);
return ret;
 }
 
@@ -1394,6 +1392,7 @@ static Sg_device *sg_alloc(struct gendisk *disk, struct 
scsi_device *scsidp)
disk->first_minor = k;
sdp->disk = disk;
sdp->device = scsidp;
+   spin_lock_init(&sdp->sfd_lock);
INIT_LIST_HEAD(&sdp->sfds);
init_rwsem(&sdp->o_sem);
sdp->sg_tablesize = queue_max_segments(q);
@@ -1528,11 +1527,13 @@ static void sg_remove(struct device *cl_dev, struct 
class_interface *cl_intf)
 
/* Need a write lock to set sdp->detached. */
write_lock_irqsave(&sg_index_lock, iflags);
+   spin_lock(&sdp->sfd_lock);
sdp->detached = 1;
list_for_each_entry(sfp, &sdp->sfds, sfd_siblings) {
wake_up_interruptible(&sfp->read_wait);
kill_fasync(&sfp->async_qp, SIGPOLL, POLL_HUP);
}
+   spin_unlock(&sdp->sfd_lock);
write_unlock_irqrestore(&sg_index_lock, iflags);
 
sysfs_remove_link(&scsidp->sdev_gendev.kobj, "generic");
@@ -2057,13 +2058,13 @@ sg_add_sfp(Sg_device * sdp, int dev)
sfp->cmd_q = SG_DEF_COMMAND_Q;
sfp->keep_orphan = SG_DEF_KEEP_ORPHAN;
sfp->parentdp = sdp;
-   write_lock_irqsave(&sg_index_lock, iflags);
+   spin_lock_irqsave(&sdp->sfd_lock, iflags);
if (sdp->detached) {
-   write_unlock_irqrestore(&sg_index_lock, iflags);
+   spin_unlock_irqrestore(&sdp->sfd_lock, iflags);
return ERR_PTR(-ENODEV);
}
list_add_tail(&sfp->sfd_siblings, &sdp->sfds);
-   write_unlock_irqrestore(&sg_index_lock, iflags);
+   spin_unlock_irqrestore(&sdp->sfd_lock, iflags);
SCSI_LOG_TIMEOUT(3, printk("sg_add_sfp: sfp=0x%p\n", sfp));
if (unlikely(sg_big_buff != def_reserved_size))
sg_big_buff = def_reserved_size;
@@ -2110,11 +2111,12 @@ static void sg_remove_sfp_usercontext(struct 
work_struct *work)
 static void sg_remove_sfp(struct kref *kref)
 {
struct sg_fd *sfp = container_of(kref, struct sg_fd, f_ref);
+   struct sg_device *sdp = sfp->parentdp;
unsigned long iflags;
 
-   write_lock_irqsave(&sg_index_lock, iflags);
+   spin_lock

[PATCH v6 2/4] sg: no need sg_open_exclusive_lock

2013-08-28 Thread Vaughan Cao
Open exclusive check is protected by o_sem, no need sg_open_exclusive_lock.
@exclude is used to record which type of rwsem we are holding.

Signed-off-by: Vaughan Cao 
---
 drivers/scsi/sg.c | 34 +-
 1 file changed, 5 insertions(+), 29 deletions(-)

diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
index 7a54c92..dcbd95f 100644
--- a/drivers/scsi/sg.c
+++ b/drivers/scsi/sg.c
@@ -105,8 +105,6 @@ static int scatter_elem_sz_prev = SG_SCATTER_SZ;
 static int sg_add(struct device *, struct class_interface *);
 static void sg_remove(struct device *, struct class_interface *);
 
-static DEFINE_SPINLOCK(sg_open_exclusive_lock);
-
 static DEFINE_IDR(sg_index_idr);
 static DEFINE_RWLOCK(sg_index_lock);   /* Also used to lock
   file descriptor list 
for device */
@@ -176,7 +174,6 @@ typedef struct sg_device { /* holds the state of each scsi 
generic device */
struct list_head sfds;
struct rw_semaphore o_sem;  /* exclude open should hold this rwsem 
*/
volatile char detached; /* 0->attached, 1->detached pending removal */
-   /* exclude protected by sg_open_exclusive_lock */
char exclude;   /* opened for exclusive access */
char sgdebug;   /* 0->off, 1->sense, 9->dump dev, 10-> all devs 
*/
struct gendisk *disk;
@@ -225,27 +222,6 @@ static int sg_allow_access(struct file *filp, unsigned 
char *cmd)
return blk_verify_command(cmd, filp->f_mode & FMODE_WRITE);
 }
 
-static int get_exclude(Sg_device *sdp)
-{
-   unsigned long flags;
-   int ret;
-
-   spin_lock_irqsave(&sg_open_exclusive_lock, flags);
-   ret = sdp->exclude;
-   spin_unlock_irqrestore(&sg_open_exclusive_lock, flags);
-   return ret;
-}
-
-static int set_exclude(Sg_device *sdp, char val)
-{
-   unsigned long flags;
-
-   spin_lock_irqsave(&sg_open_exclusive_lock, flags);
-   sdp->exclude = val;
-   spin_unlock_irqrestore(&sg_open_exclusive_lock, flags);
-   return val;
-}
-
 static int sfds_list_empty(Sg_device *sdp)
 {
unsigned long flags;
@@ -317,7 +293,7 @@ sg_open(struct inode *inode, struct file *filp)
}
/* Since write lock is held, no need to check sfd_list */
if (flags & O_EXCL)
-   set_exclude(sdp, 1);
+   sdp->exclude = 1;   /* used by release lock */
 
if (sdp->detached) {
retval = -ENODEV;
@@ -339,7 +315,7 @@ sg_open(struct inode *inode, struct file *filp)
if (retval) {
 sem_out:
if (flags & O_EXCL) {
-   set_exclude(sdp, 0);/* undo if error */
+   sdp->exclude = 0;   /* undo if error */
up_write(&sdp->o_sem);
} else
up_read(&sdp->o_sem);
@@ -366,8 +342,8 @@ sg_release(struct inode *inode, struct file *filp)
return -ENXIO;
SCSI_LOG_TIMEOUT(3, printk("sg_release: %s\n", sdp->disk->disk_name));
 
-   excl = get_exclude(sdp);
-   set_exclude(sdp, 0);
+   excl = sdp->exclude;
+   sdp->exclude = 0;
if (excl)
up_write(&sdp->o_sem);
else
@@ -2624,7 +2600,7 @@ static int sg_proc_seq_show_debug(struct seq_file *s, 
void *v)
 scsidp->lun,
 scsidp->host->hostt->emulated);
seq_printf(s, " sg_tablesize=%d excl=%d\n",
-  sdp->sg_tablesize, get_exclude(sdp));
+  sdp->sg_tablesize, sdp->exclude);
sg_proc_debug_helper(s, sdp);
}
read_unlock_irqrestore(&sg_index_lock, iflags);
-- 
1.8.3.1

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v6 1/4] sg: use rwsem to solve race during exclusive open

2013-08-28 Thread Vaughan Cao
A race condition may happen if two threads are both trying to open the same sg
with O_EXCL simultaneously. It's possible that they both find fsds list is
empty and get_exclude(sdp) returns 0, then they both call set_exclude() and
break out from wait_event_interruptible and resume open.

Now use rwsem to protect this process. Exclusive open gets write lock and
others get read lock. The lock will be held until file descriptor is closed.
This also leads 'exclude' only a status rather than a check mark.

Changes from v5:
 * remove unused variables
 * fix insane code dealing with sg_add_sfp.

Signed-off-by: Vaughan Cao 
---
 drivers/scsi/sg.c | 83 +--
 1 file changed, 44 insertions(+), 39 deletions(-)

diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
index df5e961..7a54c92 100644
--- a/drivers/scsi/sg.c
+++ b/drivers/scsi/sg.c
@@ -170,11 +170,11 @@ typedef struct sg_fd {/* holds the state of a 
file descriptor */
 
 typedef struct sg_device { /* holds the state of each scsi generic device */
struct scsi_device *device;
-   wait_queue_head_t o_excl_wait;  /* queue open() when O_EXCL in use */
int sg_tablesize;   /* adapter's max scatter-gather table size */
u32 index;  /* device index number */
/* sfds is protected by sg_index_lock */
struct list_head sfds;
+   struct rw_semaphore o_sem;  /* exclude open should hold this rwsem 
*/
volatile char detached; /* 0->attached, 1->detached pending removal */
/* exclude protected by sg_open_exclusive_lock */
char exclude;   /* opened for exclusive access */
@@ -265,7 +265,6 @@ sg_open(struct inode *inode, struct file *filp)
struct request_queue *q;
Sg_device *sdp;
Sg_fd *sfp;
-   int res;
int retval;
 
nonseekable_open(inode, filp);
@@ -294,35 +293,35 @@ sg_open(struct inode *inode, struct file *filp)
goto error_out;
}
 
-   if (flags & O_EXCL) {
-   if (O_RDONLY == (flags & O_ACCMODE)) {
-   retval = -EPERM; /* Can't lock it with read only access 
*/
-   goto error_out;
-   }
-   if (!sfds_list_empty(sdp) && (flags & O_NONBLOCK)) {
-   retval = -EBUSY;
-   goto error_out;
-   }
-   res = wait_event_interruptible(sdp->o_excl_wait,
-  ((!sfds_list_empty(sdp) || 
get_exclude(sdp)) ? 0 : set_exclude(sdp, 1)));
-   if (res) {
-   retval = res;   /* -ERESTARTSYS because signal hit 
process */
-   goto error_out;
-   }
-   } else if (get_exclude(sdp)) {  /* some other fd has an exclusive lock 
on dev */
-   if (flags & O_NONBLOCK) {
-   retval = -EBUSY;
-   goto error_out;
-   }
-   res = wait_event_interruptible(sdp->o_excl_wait, 
!get_exclude(sdp));
-   if (res) {
-   retval = res;   /* -ERESTARTSYS because signal hit 
process */
-   goto error_out;
+   if ((flags & O_EXCL) && (O_RDONLY == (flags & O_ACCMODE))) {
+   retval = -EPERM; /* Can't lock it with read only access */
+   goto error_out;
+   }
+   if (flags & O_NONBLOCK) {
+   if (flags & O_EXCL) {
+   if (!down_write_trylock(&sdp->o_sem)) {
+   retval = -EBUSY;
+   goto error_out;
+   }
+   } else {
+   if (!down_read_trylock(&sdp->o_sem)) {
+   retval = -EBUSY;
+   goto error_out;
+   }
}
+   } else {
+   if (flags & O_EXCL)
+   down_write(&sdp->o_sem);
+   else
+   down_read(&sdp->o_sem);
}
+   /* Since write lock is held, no need to check sfd_list */
+   if (flags & O_EXCL)
+   set_exclude(sdp, 1);
+
if (sdp->detached) {
retval = -ENODEV;
-   goto error_out;
+   goto sem_out;
}
if (sfds_list_empty(sdp)) { /* no existing opens on this device */
sdp->sgdebug = 0;
@@ -331,17 +330,20 @@ sg_open(struct inode *inode, struct file *filp)
}
if ((sfp = sg_add_sfp(sdp, dev)))
filp->private_data = sfp;
-   else {
+   /* retval is already provably zero at this point because of the
+* check after retval = scsi_autopm_get_device(sdp->device))
+*/
+   else
+ 

Re: [PATCH v5 0/4] [SCSI] sg: fix race condition in sg_open

2013-08-27 Thread vaughan
On 08/27/2013 09:13 PM, Douglas Gilbert wrote:
> On 13-08-27 10:16 AM, vaughan wrote:
>> On 08/13/2013 11:16 AM, Douglas Gilbert wrote:
>>> On 13-08-12 10:46 PM, vaughan wrote:
>>>> On 08/06/2013 04:52 AM, Douglas Gilbert wrote:
>>>>> On 13-08-04 10:19 PM, vaughan wrote:
>>>>>> On 08/03/2013 01:25 PM, Douglas Gilbert wrote:
>>>>>>> On 13-08-01 01:01 AM, Douglas Gilbert wrote:
>>>>>>>> On 13-07-22 01:03 PM, Jörn Engel wrote:
>>>>>>>>> On Mon, 22 July 2013 12:40:29 +0800, Vaughan Cao wrote:
>>>>>>>>>>
>>>>>>>>>> There is a race when open sg with O_EXCL flag. Also a race may
>>>>>>>>>> happen between
>>>>>>>>>> sg_open and sg_remove.
>>>>>>>>>>
>>>>>>>>>> Changes from v4:
>>>>>>>>>>  * [3/4] use ERR_PTR series instead of adding another
>>>>>>>>>> parameter in
>>>>>>>>>> sg_add_sfp
>>>>>>>>>>  * [4/4] fix conflict for cherry-pick from v3.
>>>>>>>>>>
>>>>>>>>>> Changes from v3:
>>>>>>>>>>  * release o_sem in sg_release(), not in sg_remove_sfp().
>>>>>>>>>>  * not set exclude with sfd_lock held.
>>>>>>>>>>
>>>>>>>>>> Vaughan Cao (4):
>>>>>>>>>>   [SCSI] sg: use rwsem to solve race during exclusive open
>>>>>>>>>>   [SCSI] sg: no need sg_open_exclusive_lock
>>>>>>>>>>   [SCSI] sg: checking sdp->detached isn't protected when
>>>>>>>>>> open
>>>>>>>>>>   [SCSI] sg: push file descriptor list locking down to
>>>>>>>>>> per-device
>>>>>>>>>> locking
>>>>>>>>>>
>>>>>>>>>>  drivers/scsi/sg.c | 178
>>>>>>>>>> +-
>>>>>>>>>>  1 file changed, 83 insertions(+), 95 deletions(-)
>>>>>>>>>
>>>>>>>>> Patchset looks good to me, although I didn't test it on hardware
>>>>>>>>> yet.
>>>>>>>>> Signed-off-by: Joern Engel 
>>>>>>>>>
>>>>>>>>> James, care to pick this up?
>>>>>>>>
>>>>>>>> Acked-by: Douglas Gilbert 
>>>>>>>>
>>>>>>>> Tested O_EXCL with multiple processes and threads; passed.
>>>>>>>> sg driver prior to this patch had "leaky" O_EXCL logic
>>>>>>>> according to the same test. Block device passed.
>>>>>>>>
>>>>>>>> James, could you clean this up:
>>>>>>>>   drivers/scsi/sg.c:242:6: warning: unused variable ‘res’
>>>>>>>> [-Wunused-variable]
>>>>>>>
>>>>>>> Further testing suggests this patch on the sg driver is
>>>>>>> broken, so I'll rescind my ack.
>>>>>>>
>>>>>>> The case it is broken for is when a device is opened
>>>>>>> without O_EXCL. Now if, while it is open, a second
>>>>>>> thread/process tries to open the same device O_EXCL
>>>>>>> then IMO the second open should fail with EBUSY.
>>>>>>>
>>>>>>> My testing shows that O_EXCL opens properly deflect
>>>>>>> other O_EXCL opens.
>>>>>> Hi  Doug,
>>>>>>
>>>>>> My test don't have this issue. The routine is something as below:
>>>>>>
>>>>>> I start three opens without O_EXCL, wait 30s each, and open with
>>>>>> O_EXCL|O_NONBLOCK, it failed with EBUSY.
>>>>>> And I also call myopen with/without O_EXCL many times in
>>>>>> background at
>>>>>> the same time, and the test is passed. I don't know why it failed in
>>>>>> your test.
>>>>>>
>>>>>> Usage: myopen [-e][-n][-d delay] -f file
>>>>>>  -e: exclude
>>>>>>  -n: nonbl

Re: [PATCH v5 0/4] [SCSI] sg: fix race condition in sg_open

2013-08-27 Thread vaughan
On 08/13/2013 11:16 AM, Douglas Gilbert wrote:
> On 13-08-12 10:46 PM, vaughan wrote:
>> On 08/06/2013 04:52 AM, Douglas Gilbert wrote:
>>> On 13-08-04 10:19 PM, vaughan wrote:
>>>> On 08/03/2013 01:25 PM, Douglas Gilbert wrote:
>>>>> On 13-08-01 01:01 AM, Douglas Gilbert wrote:
>>>>>> On 13-07-22 01:03 PM, Jörn Engel wrote:
>>>>>>> On Mon, 22 July 2013 12:40:29 +0800, Vaughan Cao wrote:
>>>>>>>>
>>>>>>>> There is a race when open sg with O_EXCL flag. Also a race may
>>>>>>>> happen between
>>>>>>>> sg_open and sg_remove.
>>>>>>>>
>>>>>>>> Changes from v4:
>>>>>>>> * [3/4] use ERR_PTR series instead of adding another
>>>>>>>> parameter in
>>>>>>>> sg_add_sfp
>>>>>>>> * [4/4] fix conflict for cherry-pick from v3.
>>>>>>>>
>>>>>>>> Changes from v3:
>>>>>>>> * release o_sem in sg_release(), not in sg_remove_sfp().
>>>>>>>> * not set exclude with sfd_lock held.
>>>>>>>>
>>>>>>>> Vaughan Cao (4):
>>>>>>>>  [SCSI] sg: use rwsem to solve race during exclusive open
>>>>>>>>  [SCSI] sg: no need sg_open_exclusive_lock
>>>>>>>>  [SCSI] sg: checking sdp->detached isn't protected when open
>>>>>>>>  [SCSI] sg: push file descriptor list locking down to
>>>>>>>> per-device
>>>>>>>>locking
>>>>>>>>
>>>>>>>> drivers/scsi/sg.c | 178
>>>>>>>> +-
>>>>>>>> 1 file changed, 83 insertions(+), 95 deletions(-)
>>>>>>>
>>>>>>> Patchset looks good to me, although I didn't test it on hardware
>>>>>>> yet.
>>>>>>> Signed-off-by: Joern Engel 
>>>>>>>
>>>>>>> James, care to pick this up?
>>>>>>
>>>>>> Acked-by: Douglas Gilbert 
>>>>>>
>>>>>> Tested O_EXCL with multiple processes and threads; passed.
>>>>>> sg driver prior to this patch had "leaky" O_EXCL logic
>>>>>> according to the same test. Block device passed.
>>>>>>
>>>>>> James, could you clean this up:
>>>>>>  drivers/scsi/sg.c:242:6: warning: unused variable ‘res’
>>>>>> [-Wunused-variable]
>>>>>
>>>>> Further testing suggests this patch on the sg driver is
>>>>> broken, so I'll rescind my ack.
>>>>>
>>>>> The case it is broken for is when a device is opened
>>>>> without O_EXCL. Now if, while it is open, a second
>>>>> thread/process tries to open the same device O_EXCL
>>>>> then IMO the second open should fail with EBUSY.
>>>>>
>>>>> My testing shows that O_EXCL opens properly deflect
>>>>> other O_EXCL opens.
>>>> Hi  Doug,
>>>>
>>>> My test don't have this issue. The routine is something as below:
>>>>
>>>> I start three opens without O_EXCL, wait 30s each, and open with
>>>> O_EXCL|O_NONBLOCK, it failed with EBUSY.
>>>> And I also call myopen with/without O_EXCL many times in background at
>>>> the same time, and the test is passed. I don't know why it failed in
>>>> your test.
>>>>
>>>> Usage: myopen [-e][-n][-d delay] -f file
>>>> -e: exclude
>>>> -n: nonblock
>>>> -d: delay N seconds and then close.
>>>>
>>>> [root@vacaowol5 16835013]# ./myopen  -f /dev/sg5 -d 30 &
>>>> [1] 3417
>>>> [root@vacaowol5 16835013]# ./myopen  -f /dev/sg5 -d 30 &
>>>> [2] 3418
>>>> [root@vacaowol5 16835013]# ./myopen  -f /dev/sg5 -d 30 &
>>>> [3] 3419
>>>> [root@vacaowol5 16835013]# cat /proc/scsi/sg/debug
>>>> max_active_device=6(origin 1)
>>>>def_reserved_size=32768
>>>>>>> device=sg5 scsi5 chan=0 id=1 lun=0   em=0 sg_tablesize=55
>>>> excl=0
>>>>  FD(1): timeout=6ms bufflen=32768 (res)sgat=1 low_dma=0
>>>

Re: [PATCH v5 0/4] [SCSI] sg: fix race condition in sg_open

2013-08-12 Thread vaughan
On 08/06/2013 04:52 AM, Douglas Gilbert wrote:
> On 13-08-04 10:19 PM, vaughan wrote:
>> On 08/03/2013 01:25 PM, Douglas Gilbert wrote:
>>> On 13-08-01 01:01 AM, Douglas Gilbert wrote:
>>>> On 13-07-22 01:03 PM, Jörn Engel wrote:
>>>>> On Mon, 22 July 2013 12:40:29 +0800, Vaughan Cao wrote:
>>>>>>
>>>>>> There is a race when open sg with O_EXCL flag. Also a race may
>>>>>> happen between
>>>>>> sg_open and sg_remove.
>>>>>>
>>>>>> Changes from v4:
>>>>>>* [3/4] use ERR_PTR series instead of adding another parameter in
>>>>>> sg_add_sfp
>>>>>>    * [4/4] fix conflict for cherry-pick from v3.
>>>>>>
>>>>>> Changes from v3:
>>>>>>* release o_sem in sg_release(), not in sg_remove_sfp().
>>>>>>* not set exclude with sfd_lock held.
>>>>>>
>>>>>> Vaughan Cao (4):
>>>>>> [SCSI] sg: use rwsem to solve race during exclusive open
>>>>>> [SCSI] sg: no need sg_open_exclusive_lock
>>>>>> [SCSI] sg: checking sdp->detached isn't protected when open
>>>>>> [SCSI] sg: push file descriptor list locking down to per-device
>>>>>>   locking
>>>>>>
>>>>>>drivers/scsi/sg.c | 178
>>>>>> +-
>>>>>>1 file changed, 83 insertions(+), 95 deletions(-)
>>>>>
>>>>> Patchset looks good to me, although I didn't test it on hardware yet.
>>>>> Signed-off-by: Joern Engel 
>>>>>
>>>>> James, care to pick this up?
>>>>
>>>> Acked-by: Douglas Gilbert 
>>>>
>>>> Tested O_EXCL with multiple processes and threads; passed.
>>>> sg driver prior to this patch had "leaky" O_EXCL logic
>>>> according to the same test. Block device passed.
>>>>
>>>> James, could you clean this up:
>>>> drivers/scsi/sg.c:242:6: warning: unused variable ‘res’
>>>> [-Wunused-variable]
>>>
>>> Further testing suggests this patch on the sg driver is
>>> broken, so I'll rescind my ack.
>>>
>>> The case it is broken for is when a device is opened
>>> without O_EXCL. Now if, while it is open, a second
>>> thread/process tries to open the same device O_EXCL
>>> then IMO the second open should fail with EBUSY.
>>>
>>> My testing shows that O_EXCL opens properly deflect
>>> other O_EXCL opens.
>> Hi  Doug,
>>
>> My test don't have this issue. The routine is something as below:
>>
>> I start three opens without O_EXCL, wait 30s each, and open with
>> O_EXCL|O_NONBLOCK, it failed with EBUSY.
>> And I also call myopen with/without O_EXCL many times in background at
>> the same time, and the test is passed. I don't know why it failed in
>> your test.
>>
>> Usage: myopen [-e][-n][-d delay] -f file
>>-e: exclude
>>-n: nonblock
>>-d: delay N seconds and then close.
>>
>> [root@vacaowol5 16835013]# ./myopen  -f /dev/sg5 -d 30 &
>> [1] 3417
>> [root@vacaowol5 16835013]# ./myopen  -f /dev/sg5 -d 30 &
>> [2] 3418
>> [root@vacaowol5 16835013]# ./myopen  -f /dev/sg5 -d 30 &
>> [3] 3419
>> [root@vacaowol5 16835013]# cat /proc/scsi/sg/debug
>> max_active_device=6(origin 1)
>>   def_reserved_size=32768
>>   >>> device=sg5 scsi5 chan=0 id=1 lun=0   em=0 sg_tablesize=55 excl=0
>> FD(1): timeout=6ms bufflen=32768 (res)sgat=1 low_dma=0
>> cmd_q=0 f_packid=0 k_orphan=0 closed=0
>>   No requests active
>> FD(2): timeout=6ms bufflen=32768 (res)sgat=1 low_dma=0
>> cmd_q=0 f_packid=0 k_orphan=0 closed=0
>>   No requests active
>> FD(3): timeout=6ms bufflen=32768 (res)sgat=1 low_dma=0
>> cmd_q=0 f_packid=0 k_orphan=0 closed=0
>>   No requests active
>>
>> [root@vacaowol5 16835013]# ./myopen -e -n  -f /dev/sg5 -d 30 &
>> [4] 3422
>> [3422:3351] /dev/sg5:exclude: Device or resource busy
>>
>> [4]+  Exit 1  ./myopen -e -n -f /dev/sg5 -d 30
>>
>> [root@vacaowol5 16835013]# cat /proc/scsi/sg/debug
>> max_active_device=6(origin 1)
>>   def_reserved_size=32768
>>   >>> device=sg5 scsi5 chan=0 id=1 lun=0   em=0 sg_tablesize=55 excl=0
>> 

Re: [PATCH v5 0/4] [SCSI] sg: fix race condition in sg_open

2013-08-04 Thread vaughan
On 08/03/2013 01:25 PM, Douglas Gilbert wrote:
> On 13-08-01 01:01 AM, Douglas Gilbert wrote:
>> On 13-07-22 01:03 PM, Jörn Engel wrote:
>>> On Mon, 22 July 2013 12:40:29 +0800, Vaughan Cao wrote:
>>>>
>>>> There is a race when open sg with O_EXCL flag. Also a race may
>>>> happen between
>>>> sg_open and sg_remove.
>>>>
>>>> Changes from v4:
>>>>   * [3/4] use ERR_PTR series instead of adding another parameter in
>>>> sg_add_sfp
>>>>   * [4/4] fix conflict for cherry-pick from v3.
>>>>
>>>> Changes from v3:
>>>>   * release o_sem in sg_release(), not in sg_remove_sfp().
>>>>   * not set exclude with sfd_lock held.
>>>>
>>>> Vaughan Cao (4):
>>>>[SCSI] sg: use rwsem to solve race during exclusive open
>>>>[SCSI] sg: no need sg_open_exclusive_lock
>>>>[SCSI] sg: checking sdp->detached isn't protected when open
>>>>[SCSI] sg: push file descriptor list locking down to per-device
>>>>  locking
>>>>
>>>>   drivers/scsi/sg.c | 178
>>>> +-
>>>>   1 file changed, 83 insertions(+), 95 deletions(-)
>>>
>>> Patchset looks good to me, although I didn't test it on hardware yet.
>>> Signed-off-by: Joern Engel 
>>>
>>> James, care to pick this up?
>>
>> Acked-by: Douglas Gilbert 
>>
>> Tested O_EXCL with multiple processes and threads; passed.
>> sg driver prior to this patch had "leaky" O_EXCL logic
>> according to the same test. Block device passed.
>>
>> James, could you clean this up:
>>drivers/scsi/sg.c:242:6: warning: unused variable ‘res’
>> [-Wunused-variable]
>
> Further testing suggests this patch on the sg driver is
> broken, so I'll rescind my ack.
>
> The case it is broken for is when a device is opened
> without O_EXCL. Now if, while it is open, a second
> thread/process tries to open the same device O_EXCL
> then IMO the second open should fail with EBUSY.
>
> My testing shows that O_EXCL opens properly deflect
> other O_EXCL opens.
Hi  Doug,

My test don't have this issue. The routine is something as below:

I start three opens without O_EXCL, wait 30s each, and open with
O_EXCL|O_NONBLOCK, it failed with EBUSY.
And I also call myopen with/without O_EXCL many times in background at
the same time, and the test is passed. I don't know why it failed in
your test.

Usage: myopen [-e][-n][-d delay] -f file
  -e: exclude
  -n: nonblock
  -d: delay N seconds and then close.

[root@vacaowol5 16835013]# ./myopen  -f /dev/sg5 -d 30 &
[1] 3417
[root@vacaowol5 16835013]# ./myopen  -f /dev/sg5 -d 30 &
[2] 3418
[root@vacaowol5 16835013]# ./myopen  -f /dev/sg5 -d 30 &
[3] 3419
[root@vacaowol5 16835013]# cat /proc/scsi/sg/debug
max_active_device=6(origin 1)
 def_reserved_size=32768
 >>> device=sg5 scsi5 chan=0 id=1 lun=0   em=0 sg_tablesize=55 excl=0
   FD(1): timeout=6ms bufflen=32768 (res)sgat=1 low_dma=0
   cmd_q=0 f_packid=0 k_orphan=0 closed=0
 No requests active
   FD(2): timeout=6ms bufflen=32768 (res)sgat=1 low_dma=0
   cmd_q=0 f_packid=0 k_orphan=0 closed=0
 No requests active
   FD(3): timeout=6ms bufflen=32768 (res)sgat=1 low_dma=0
   cmd_q=0 f_packid=0 k_orphan=0 closed=0
 No requests active

[root@vacaowol5 16835013]# ./myopen -e -n  -f /dev/sg5 -d 30 &
[4] 3422
[3422:3351] /dev/sg5:exclude: Device or resource busy

[4]+  Exit 1  ./myopen -e -n -f /dev/sg5 -d 30

[root@vacaowol5 16835013]# cat /proc/scsi/sg/debug
max_active_device=6(origin 1)
 def_reserved_size=32768
 >>> device=sg5 scsi5 chan=0 id=1 lun=0   em=0 sg_tablesize=55 excl=0
   FD(1): timeout=6ms bufflen=32768 (res)sgat=1 low_dma=0
   cmd_q=0 f_packid=0 k_orphan=0 closed=0
 No requests active
   FD(2): timeout=6ms bufflen=32768 (res)sgat=1 low_dma=0
   cmd_q=0 f_packid=0 k_orphan=0 closed=0
 No requests active
   FD(3): timeout=6ms bufflen=32768 (res)sgat=1 low_dma=0
   cmd_q=0 f_packid=0 k_orphan=0 closed=0
 No requests active
[root@vacaowol5 16835013]# cat /proc/scsi/sg/debug
[1]   Done./myopen -f /dev/sg5 -d 30
[2]-  Done./myopen -f /dev/sg5 -d 30
[3]+  Done./myopen -f /dev/sg5 -d 30


>
> BTW the standard block driver (e.g. /dev/sdc) is broken
> in exactly the same way, according to my tests.
>
> Doug Gilbert
>
>

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v5 0/4] [SCSI] sg: fix race condition in sg_open

2013-07-30 Thread vaughan
On 07/26/2013 04:33 AM, Douglas Gilbert wrote:
> On 13-07-25 11:32 AM, vaughan wrote:
>> On 07/23/2013 01:03 AM, Jörn Engel wrote:
>>> On Mon, 22 July 2013 12:40:29 +0800, Vaughan Cao wrote:
>>>> There is a race when open sg with O_EXCL flag. Also a race may
>>>> happen between
>>>> sg_open and sg_remove.
>>>>
>>>> Changes from v4:
>>>>   * [3/4] use ERR_PTR series instead of adding another parameter in
>>>> sg_add_sfp
>>>>   * [4/4] fix conflict for cherry-pick from v3.
>>>>
>>>> Changes from v3:
>>>>   * release o_sem in sg_release(), not in sg_remove_sfp().
>>>>   * not set exclude with sfd_lock held.
>>>>
>>>> Vaughan Cao (4):
>>>>[SCSI] sg: use rwsem to solve race during exclusive open
>>>>[SCSI] sg: no need sg_open_exclusive_lock
>>>>[SCSI] sg: checking sdp->detached isn't protected when open
>>>>[SCSI] sg: push file descriptor list locking down to per-device
>>>>  locking
>>>>
>>>>   drivers/scsi/sg.c | 178
>>>> +-
>>>>   1 file changed, 83 insertions(+), 95 deletions(-)
>>> Patchset looks good to me, although I didn't test it on hardware yet.
>>> Signed-off-by: Joern Engel 
>>>
>>> James, care to pick this up?
>>>
>>> Jörn
>> Hi James,
>>
>> sg driver has two races happen in
>>   a) exclusive open and non-exclusive open
>>   b) sg removal and sg open
>> I explained the scenario detail in the separate patches. I did test
>> those patches and
>> Jörn has reviewed them.  I got no response from Doug Gilbert for a long
>> time.
>> Would you care to pick these up?
>
> Hi,
> Your patches applied with a little noise to lk 3.10.2 and
> gave this warning from the build.
>
>   CC [M]  drivers/scsi/sg.o
> drivers/scsi/sg.c: In function ‘sg_open’:
> drivers/scsi/sg.c:242:6: warning: unused variable ‘res’
> [-Wunused-variable]
>
> I'll keep testing ...
Hi Doug,

Can I ask how about the test result?

Thanks,
Vaughan
>
> Doug Gilbert
>

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v5 0/4] [SCSI] sg: fix race condition in sg_open

2013-07-25 Thread vaughan
On 07/23/2013 01:03 AM, Jörn Engel wrote:
> On Mon, 22 July 2013 12:40:29 +0800, Vaughan Cao wrote:
>> There is a race when open sg with O_EXCL flag. Also a race may happen between
>> sg_open and sg_remove.
>>
>> Changes from v4:
>>  * [3/4] use ERR_PTR series instead of adding another parameter in sg_add_sfp
>>  * [4/4] fix conflict for cherry-pick from v3.
>>
>> Changes from v3:
>>  * release o_sem in sg_release(), not in sg_remove_sfp().
>>  * not set exclude with sfd_lock held.
>>
>> Vaughan Cao (4):
>>   [SCSI] sg: use rwsem to solve race during exclusive open
>>   [SCSI] sg: no need sg_open_exclusive_lock
>>   [SCSI] sg: checking sdp->detached isn't protected when open
>>   [SCSI] sg: push file descriptor list locking down to per-device
>> locking
>>
>>  drivers/scsi/sg.c | 178 
>> +-
>>  1 file changed, 83 insertions(+), 95 deletions(-)
> Patchset looks good to me, although I didn't test it on hardware yet.
> Signed-off-by: Joern Engel 
>
> James, care to pick this up?
>
> Jörn
Hi James,

sg driver has two races happen in
 a) exclusive open and non-exclusive open
 b) sg removal and sg open
I explained the scenario detail in the separate patches. I did test
those patches and
Jörn has reviewed them.  I got no response from Doug Gilbert for a long
time.
Would you care to pick these up?

Thanks,
Vaughan

>
> --
> Good warriors cause others to come to them and do not go to others.
> -- Sun Tzu

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v5 1/4] [SCSI] sg: use rwsem to solve race during exclusive open

2013-07-21 Thread Vaughan Cao
A race condition may happen if two threads are both trying to open the same sg
with O_EXCL simultaneously. It's possible that they both find fsds list is
empty and get_exclude(sdp) returns 0, then they both call set_exclude() and
break out from wait_event_interruptible and resume open.

Now use rwsem to protect this process. Exclusive open gets write lock and
others get read lock. The lock will be held until file descriptor is closed.
This also leads 'exclude' only a status rather than a check mark.

Signed-off-by: Vaughan Cao 
---
 drivers/scsi/sg.c | 77 ++-
 1 file changed, 42 insertions(+), 35 deletions(-)

diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
index 25b5455..edc395a 100644
--- a/drivers/scsi/sg.c
+++ b/drivers/scsi/sg.c
@@ -168,11 +168,11 @@ typedef struct sg_fd {/* holds the state of a 
file descriptor */
 
 typedef struct sg_device { /* holds the state of each scsi generic device */
struct scsi_device *device;
-   wait_queue_head_t o_excl_wait;  /* queue open() when O_EXCL in use */
int sg_tablesize;   /* adapter's max scatter-gather table size */
u32 index;  /* device index number */
/* sfds is protected by sg_index_lock */
struct list_head sfds;
+   struct rw_semaphore o_sem;  /* exclude open should hold this rwsem 
*/
volatile char detached; /* 0->attached, 1->detached pending removal */
/* exclude protected by sg_open_exclusive_lock */
char exclude;   /* opened for exclusive access */
@@ -293,35 +293,35 @@ sg_open(struct inode *inode, struct file *filp)
goto error_out;
}
 
-   if (flags & O_EXCL) {
-   if (O_RDONLY == (flags & O_ACCMODE)) {
-   retval = -EPERM; /* Can't lock it with read only access 
*/
-   goto error_out;
-   }
-   if (!sfds_list_empty(sdp) && (flags & O_NONBLOCK)) {
-   retval = -EBUSY;
-   goto error_out;
-   }
-   res = wait_event_interruptible(sdp->o_excl_wait,
-  ((!sfds_list_empty(sdp) || 
get_exclude(sdp)) ? 0 : set_exclude(sdp, 1)));
-   if (res) {
-   retval = res;   /* -ERESTARTSYS because signal hit 
process */
-   goto error_out;
-   }
-   } else if (get_exclude(sdp)) {  /* some other fd has an exclusive lock 
on dev */
-   if (flags & O_NONBLOCK) {
-   retval = -EBUSY;
-   goto error_out;
-   }
-   res = wait_event_interruptible(sdp->o_excl_wait, 
!get_exclude(sdp));
-   if (res) {
-   retval = res;   /* -ERESTARTSYS because signal hit 
process */
-   goto error_out;
+   if ((flags & O_EXCL) && (O_RDONLY == (flags & O_ACCMODE))) {
+   retval = -EPERM; /* Can't lock it with read only access */
+   goto error_out;
+   }
+   if (flags & O_NONBLOCK) {
+   if (flags & O_EXCL) {
+   if (!down_write_trylock(&sdp->o_sem)) {
+   retval = -EBUSY;
+   goto error_out;
+   }
+   } else {
+   if (!down_read_trylock(&sdp->o_sem)) {
+   retval = -EBUSY;
+   goto error_out;
+   }
}
+   } else {
+   if (flags & O_EXCL)
+   down_write(&sdp->o_sem);
+   else
+   down_read(&sdp->o_sem);
}
+   /* Since write lock is held, no need to check sfd_list */
+   if (flags & O_EXCL)
+   set_exclude(sdp, 1);
+
if (sdp->detached) {
retval = -ENODEV;
-   goto error_out;
+   goto sem_out;
}
if (sfds_list_empty(sdp)) { /* no existing opens on this device */
sdp->sgdebug = 0;
@@ -331,16 +331,19 @@ sg_open(struct inode *inode, struct file *filp)
if ((sfp = sg_add_sfp(sdp, dev)))
filp->private_data = sfp;
else {
-   if (flags & O_EXCL) {
-   set_exclude(sdp, 0);/* undo if error */
-   wake_up_interruptible(&sdp->o_excl_wait);
-   }
retval = -ENOMEM;
-   goto error_out;
+   goto sem_out;
}
retval = 0;
-error_out:
+
if (retval) {
+sem_out:
+   if (flags & O_EXCL) {
+   set_exclude(sdp, 0);/* undo if error */
+   up_write(&s

[PATCH v5 2/4] [SCSI] sg: no need sg_open_exclusive_lock

2013-07-21 Thread Vaughan Cao
Open exclusive check is protected by o_sem, no need sg_open_exclusive_lock.
@exclude is used to record which type of rwsem we are holding.

Signed-off-by: Vaughan Cao 
---
 drivers/scsi/sg.c | 34 +-
 1 file changed, 5 insertions(+), 29 deletions(-)

diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
index edc395a..671b760 100644
--- a/drivers/scsi/sg.c
+++ b/drivers/scsi/sg.c
@@ -103,8 +103,6 @@ static int scatter_elem_sz_prev = SG_SCATTER_SZ;
 static int sg_add(struct device *, struct class_interface *);
 static void sg_remove(struct device *, struct class_interface *);
 
-static DEFINE_SPINLOCK(sg_open_exclusive_lock);
-
 static DEFINE_IDR(sg_index_idr);
 static DEFINE_RWLOCK(sg_index_lock);   /* Also used to lock
   file descriptor list 
for device */
@@ -174,7 +172,6 @@ typedef struct sg_device { /* holds the state of each scsi 
generic device */
struct list_head sfds;
struct rw_semaphore o_sem;  /* exclude open should hold this rwsem 
*/
volatile char detached; /* 0->attached, 1->detached pending removal */
-   /* exclude protected by sg_open_exclusive_lock */
char exclude;   /* opened for exclusive access */
char sgdebug;   /* 0->off, 1->sense, 9->dump dev, 10-> all devs 
*/
struct gendisk *disk;
@@ -224,27 +221,6 @@ static int sg_allow_access(struct file *filp, unsigned 
char *cmd)
return blk_verify_command(q, cmd, filp->f_mode & FMODE_WRITE);
 }
 
-static int get_exclude(Sg_device *sdp)
-{
-   unsigned long flags;
-   int ret;
-
-   spin_lock_irqsave(&sg_open_exclusive_lock, flags);
-   ret = sdp->exclude;
-   spin_unlock_irqrestore(&sg_open_exclusive_lock, flags);
-   return ret;
-}
-
-static int set_exclude(Sg_device *sdp, char val)
-{
-   unsigned long flags;
-
-   spin_lock_irqsave(&sg_open_exclusive_lock, flags);
-   sdp->exclude = val;
-   spin_unlock_irqrestore(&sg_open_exclusive_lock, flags);
-   return val;
-}
-
 static int sfds_list_empty(Sg_device *sdp)
 {
unsigned long flags;
@@ -317,7 +293,7 @@ sg_open(struct inode *inode, struct file *filp)
}
/* Since write lock is held, no need to check sfd_list */
if (flags & O_EXCL)
-   set_exclude(sdp, 1);
+   sdp->exclude = 1;   /* used by release lock */
 
if (sdp->detached) {
retval = -ENODEV;
@@ -339,7 +315,7 @@ sg_open(struct inode *inode, struct file *filp)
if (retval) {
 sem_out:
if (flags & O_EXCL) {
-   set_exclude(sdp, 0);/* undo if error */
+   sdp->exclude = 0;   /* undo if error */
up_write(&sdp->o_sem);
} else
up_read(&sdp->o_sem);
@@ -366,8 +342,8 @@ sg_release(struct inode *inode, struct file *filp)
return -ENXIO;
SCSI_LOG_TIMEOUT(3, printk("sg_release: %s\n", sdp->disk->disk_name));
 
-   excl = get_exclude(sdp);
-   set_exclude(sdp, 0);
+   excl = sdp->exclude;
+   sdp->exclude = 0;
if (excl)
up_write(&sdp->o_sem);
else
@@ -2637,7 +2613,7 @@ static int sg_proc_seq_show_debug(struct seq_file *s, 
void *v)
 scsidp->lun,
 scsidp->host->hostt->emulated);
seq_printf(s, " sg_tablesize=%d excl=%d\n",
-  sdp->sg_tablesize, get_exclude(sdp));
+  sdp->sg_tablesize, sdp->exclude);
sg_proc_debug_helper(s, sdp);
}
read_unlock_irqrestore(&sg_index_lock, iflags);
-- 
1.7.11.7

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v5 4/4] [SCSI] sg: push file descriptor list locking down to per-device locking

2013-07-21 Thread Vaughan Cao
Push file descriptor list locking down to per-device locking. Let sg_index_lock
only protect device lookup.
sdp->detached is also set and checked with this lock held.

Changes from v4:
 * Since I use ERR_PTR and friends in sg_add_sfp, this patch should also be
updated to resolve conflict in cherrry-pick.

Signed-off-by: Vaughan Cao 
---
 drivers/scsi/sg.c | 61 ++-
 1 file changed, 33 insertions(+), 28 deletions(-)

diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
index f0e4785..3431d12 100644
--- a/drivers/scsi/sg.c
+++ b/drivers/scsi/sg.c
@@ -104,8 +104,7 @@ static int sg_add(struct device *, struct class_interface 
*);
 static void sg_remove(struct device *, struct class_interface *);
 
 static DEFINE_IDR(sg_index_idr);
-static DEFINE_RWLOCK(sg_index_lock);   /* Also used to lock
-  file descriptor list 
for device */
+static DEFINE_RWLOCK(sg_index_lock);
 
 static struct class_interface sg_interface = {
.add_dev= sg_add,
@@ -142,8 +141,7 @@ typedef struct sg_request { /* SG_MAX_QUEUE requests 
outstanding per file */
 } Sg_request;
 
 typedef struct sg_fd { /* holds the state of a file descriptor */
-   /* sfd_siblings is protected by sg_index_lock */
-   struct list_head sfd_siblings;
+   struct list_head sfd_siblings; /* protected by sfd_lock of device */
struct sg_device *parentdp; /* owning device */
wait_queue_head_t read_wait;/* queue read until command done */
rwlock_t rq_list_lock;  /* protect access to list in req_arr */
@@ -168,7 +166,7 @@ typedef struct sg_device { /* holds the state of each scsi 
generic device */
struct scsi_device *device;
int sg_tablesize;   /* adapter's max scatter-gather table size */
u32 index;  /* device index number */
-   /* sfds is protected by sg_index_lock */
+   spinlock_t sfd_lock;/* protect file descriptor list for device */
struct list_head sfds;
struct rw_semaphore o_sem;  /* exclude open should hold this rwsem 
*/
volatile char detached; /* 0->attached, 1->detached pending removal */
@@ -226,9 +224,9 @@ static int sfds_list_empty(Sg_device *sdp)
unsigned long flags;
int ret;
 
-   read_lock_irqsave(&sg_index_lock, flags);
+   spin_lock_irqsave(&sdp->sfd_lock, flags);
ret = list_empty(&sdp->sfds);
-   read_unlock_irqrestore(&sg_index_lock, flags);
+   spin_unlock_irqrestore(&sdp->sfd_lock, flags);
return ret;
 }
 
@@ -1394,6 +1392,7 @@ static Sg_device *sg_alloc(struct gendisk *disk, struct 
scsi_device *scsidp)
disk->first_minor = k;
sdp->disk = disk;
sdp->device = scsidp;
+   spin_lock_init(&sdp->sfd_lock);
INIT_LIST_HEAD(&sdp->sfds);
init_rwsem(&sdp->o_sem);
sdp->sg_tablesize = queue_max_segments(q);
@@ -1536,11 +1535,13 @@ static void sg_remove(struct device *cl_dev, struct 
class_interface *cl_intf)
 
/* Need a write lock to set sdp->detached. */
write_lock_irqsave(&sg_index_lock, iflags);
+   spin_lock(&sdp->sfd_lock);
sdp->detached = 1;
list_for_each_entry(sfp, &sdp->sfds, sfd_siblings) {
wake_up_interruptible(&sfp->read_wait);
kill_fasync(&sfp->async_qp, SIGPOLL, POLL_HUP);
}
+   spin_unlock(&sdp->sfd_lock);
write_unlock_irqrestore(&sg_index_lock, iflags);
 
sysfs_remove_link(&scsidp->sdev_gendev.kobj, "generic");
@@ -2065,13 +2066,13 @@ sg_add_sfp(Sg_device * sdp, int dev)
sfp->cmd_q = SG_DEF_COMMAND_Q;
sfp->keep_orphan = SG_DEF_KEEP_ORPHAN;
sfp->parentdp = sdp;
-   write_lock_irqsave(&sg_index_lock, iflags);
+   spin_lock_irqsave(&sdp->sfd_lock, iflags);
if (sdp->detached) {
-   write_unlock_irqrestore(&sg_index_lock, iflags);
+   spin_unlock_irqrestore(&sdp->sfd_lock, iflags);
return ERR_PTR(-ENODEV);
}
list_add_tail(&sfp->sfd_siblings, &sdp->sfds);
-   write_unlock_irqrestore(&sg_index_lock, iflags);
+   spin_unlock_irqrestore(&sdp->sfd_lock, iflags);
SCSI_LOG_TIMEOUT(3, printk("sg_add_sfp: sfp=0x%p\n", sfp));
if (unlikely(sg_big_buff != def_reserved_size))
sg_big_buff = def_reserved_size;
@@ -2121,9 +2122,9 @@ static void sg_remove_sfp(struct kref *kref)
struct sg_device *sdp = sfp->parentdp;
unsigned long iflags;
 
-   write_lock_irqsave(&sg_index_lock, iflags);
+   spin_lock_irqsave(&sdp->sfd_lock, iflags);
list_del(&sfp->sfd_siblings);
-   write_unlock_irqrestore(&sg_index_lock, iflags);
+  

[PATCH v5 3/4] [SCSI] sg: checking sdp->detached isn't protected when open

2013-07-21 Thread Vaughan Cao
@detached is set under the protection of sg_index_lock. Without getting the
lock, new sfp will be added during sg removal and there is no chance for it
to be picked out. So check with sg_index_lock held in sg_add_sfp().

Changes from v4:
 * use ERR_PTR series instead of adding another parameter in sg_add_sfp

Signed-off-by: Vaughan Cao 
---
 drivers/scsi/sg.c | 18 +-
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
index 671b760..f0e4785 100644
--- a/drivers/scsi/sg.c
+++ b/drivers/scsi/sg.c
@@ -295,21 +295,17 @@ sg_open(struct inode *inode, struct file *filp)
if (flags & O_EXCL)
sdp->exclude = 1;   /* used by release lock */
 
-   if (sdp->detached) {
-   retval = -ENODEV;
-   goto sem_out;
-   }
if (sfds_list_empty(sdp)) { /* no existing opens on this device */
sdp->sgdebug = 0;
q = sdp->device->request_queue;
sdp->sg_tablesize = queue_max_segments(q);
}
-   if ((sfp = sg_add_sfp(sdp, dev)))
-   filp->private_data = sfp;
-   else {
-   retval = -ENOMEM;
+   sfp = sg_add_sfp(sdp, dev);
+   if (IS_ERR(sfp)) {
+   retval = PTR_ERR(sfp);
goto sem_out;
}
+   filp->private_data = sfp;
retval = 0;
 
if (retval) {
@@ -2055,7 +2051,7 @@ sg_add_sfp(Sg_device * sdp, int dev)
 
sfp = kzalloc(sizeof(*sfp), GFP_ATOMIC | __GFP_NOWARN);
if (!sfp)
-   return NULL;
+   return ERR_PTR(-ENOMEM);
 
init_waitqueue_head(&sfp->read_wait);
rwlock_init(&sfp->rq_list_lock);
@@ -2070,6 +2066,10 @@ sg_add_sfp(Sg_device * sdp, int dev)
sfp->keep_orphan = SG_DEF_KEEP_ORPHAN;
sfp->parentdp = sdp;
write_lock_irqsave(&sg_index_lock, iflags);
+   if (sdp->detached) {
+   write_unlock_irqrestore(&sg_index_lock, iflags);
+   return ERR_PTR(-ENODEV);
+   }
list_add_tail(&sfp->sfd_siblings, &sdp->sfds);
write_unlock_irqrestore(&sg_index_lock, iflags);
SCSI_LOG_TIMEOUT(3, printk("sg_add_sfp: sfp=0x%p\n", sfp));
-- 
1.7.11.7

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v5 0/4] [SCSI] sg: fix race condition in sg_open

2013-07-21 Thread Vaughan Cao
There is a race when open sg with O_EXCL flag. Also a race may happen between
sg_open and sg_remove.

Changes from v4:
 * [3/4] use ERR_PTR series instead of adding another parameter in sg_add_sfp
 * [4/4] fix conflict for cherry-pick from v3.

Changes from v3:
 * release o_sem in sg_release(), not in sg_remove_sfp().
 * not set exclude with sfd_lock held.

Vaughan Cao (4):
  [SCSI] sg: use rwsem to solve race during exclusive open
  [SCSI] sg: no need sg_open_exclusive_lock
  [SCSI] sg: checking sdp->detached isn't protected when open
  [SCSI] sg: push file descriptor list locking down to per-device
locking

 drivers/scsi/sg.c | 178 +-
 1 file changed, 83 insertions(+), 95 deletions(-)

-- 
1.7.11.7

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v5 4/4] [SCSI] sg: push file descriptor list locking down to per-device locking

2013-07-21 Thread Vaughan Cao
Push file descriptor list locking down to per-device locking. Let sg_index_lock
only protect device lookup.
sdp->detached is also set and checked with this lock held.

Changes from v4:
 * Since I use ERR_PTR and friends in sg_add_sfp, this patch should also be
updated to resolve conflict in cherrry-pick.

Signed-off-by: Vaughan Cao 
---
 drivers/scsi/sg.c | 61 ++-
 1 file changed, 33 insertions(+), 28 deletions(-)

diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
index f0e4785..3431d12 100644
--- a/drivers/scsi/sg.c
+++ b/drivers/scsi/sg.c
@@ -104,8 +104,7 @@ static int sg_add(struct device *, struct class_interface 
*);
 static void sg_remove(struct device *, struct class_interface *);
 
 static DEFINE_IDR(sg_index_idr);
-static DEFINE_RWLOCK(sg_index_lock);   /* Also used to lock
-  file descriptor list 
for device */
+static DEFINE_RWLOCK(sg_index_lock);
 
 static struct class_interface sg_interface = {
.add_dev= sg_add,
@@ -142,8 +141,7 @@ typedef struct sg_request { /* SG_MAX_QUEUE requests 
outstanding per file */
 } Sg_request;
 
 typedef struct sg_fd { /* holds the state of a file descriptor */
-   /* sfd_siblings is protected by sg_index_lock */
-   struct list_head sfd_siblings;
+   struct list_head sfd_siblings; /* protected by sfd_lock of device */
struct sg_device *parentdp; /* owning device */
wait_queue_head_t read_wait;/* queue read until command done */
rwlock_t rq_list_lock;  /* protect access to list in req_arr */
@@ -168,7 +166,7 @@ typedef struct sg_device { /* holds the state of each scsi 
generic device */
struct scsi_device *device;
int sg_tablesize;   /* adapter's max scatter-gather table size */
u32 index;  /* device index number */
-   /* sfds is protected by sg_index_lock */
+   spinlock_t sfd_lock;/* protect file descriptor list for device */
struct list_head sfds;
struct rw_semaphore o_sem;  /* exclude open should hold this rwsem 
*/
volatile char detached; /* 0->attached, 1->detached pending removal */
@@ -226,9 +224,9 @@ static int sfds_list_empty(Sg_device *sdp)
unsigned long flags;
int ret;
 
-   read_lock_irqsave(&sg_index_lock, flags);
+   spin_lock_irqsave(&sdp->sfd_lock, flags);
ret = list_empty(&sdp->sfds);
-   read_unlock_irqrestore(&sg_index_lock, flags);
+   spin_unlock_irqrestore(&sdp->sfd_lock, flags);
return ret;
 }
 
@@ -1394,6 +1392,7 @@ static Sg_device *sg_alloc(struct gendisk *disk, struct 
scsi_device *scsidp)
disk->first_minor = k;
sdp->disk = disk;
sdp->device = scsidp;
+   spin_lock_init(&sdp->sfd_lock);
INIT_LIST_HEAD(&sdp->sfds);
init_rwsem(&sdp->o_sem);
sdp->sg_tablesize = queue_max_segments(q);
@@ -1536,11 +1535,13 @@ static void sg_remove(struct device *cl_dev, struct 
class_interface *cl_intf)
 
/* Need a write lock to set sdp->detached. */
write_lock_irqsave(&sg_index_lock, iflags);
+   spin_lock(&sdp->sfd_lock);
sdp->detached = 1;
list_for_each_entry(sfp, &sdp->sfds, sfd_siblings) {
wake_up_interruptible(&sfp->read_wait);
kill_fasync(&sfp->async_qp, SIGPOLL, POLL_HUP);
}
+   spin_unlock(&sdp->sfd_lock);
write_unlock_irqrestore(&sg_index_lock, iflags);
 
sysfs_remove_link(&scsidp->sdev_gendev.kobj, "generic");
@@ -2065,13 +2066,13 @@ sg_add_sfp(Sg_device * sdp, int dev)
sfp->cmd_q = SG_DEF_COMMAND_Q;
sfp->keep_orphan = SG_DEF_KEEP_ORPHAN;
sfp->parentdp = sdp;
-   write_lock_irqsave(&sg_index_lock, iflags);
+   spin_lock_irqsave(&sdp->sfd_lock, iflags);
if (sdp->detached) {
-   write_unlock_irqrestore(&sg_index_lock, iflags);
+   spin_unlock_irqrestore(&sdp->sfd_lock, iflags);
return ERR_PTR(-ENODEV);
}
list_add_tail(&sfp->sfd_siblings, &sdp->sfds);
-   write_unlock_irqrestore(&sg_index_lock, iflags);
+   spin_unlock_irqrestore(&sdp->sfd_lock, iflags);
SCSI_LOG_TIMEOUT(3, printk("sg_add_sfp: sfp=0x%p\n", sfp));
if (unlikely(sg_big_buff != def_reserved_size))
sg_big_buff = def_reserved_size;
@@ -2121,9 +2122,9 @@ static void sg_remove_sfp(struct kref *kref)
struct sg_device *sdp = sfp->parentdp;
unsigned long iflags;
 
-   write_lock_irqsave(&sg_index_lock, iflags);
+   spin_lock_irqsave(&sdp->sfd_lock, iflags);
list_del(&sfp->sfd_siblings);
-   write_unlock_irqrestore(&sg_index_lock, iflags);
+  

[PATCH v5 3/4] [SCSI] sg: checking sdp->detached isn't protected when open

2013-07-21 Thread Vaughan Cao
@detached is set under the protection of sg_index_lock. Without getting the
lock, new sfp will be added during sg removal and there is no chance for it
to be picked out. So check with sg_index_lock held in sg_add_sfp().

Changes from v4:
 * use ERR_PTR series instead of adding another parameter in sg_add_sfp

Signed-off-by: Vaughan Cao 
---
 drivers/scsi/sg.c | 18 +-
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
index 671b760..f0e4785 100644
--- a/drivers/scsi/sg.c
+++ b/drivers/scsi/sg.c
@@ -295,21 +295,17 @@ sg_open(struct inode *inode, struct file *filp)
if (flags & O_EXCL)
sdp->exclude = 1;   /* used by release lock */
 
-   if (sdp->detached) {
-   retval = -ENODEV;
-   goto sem_out;
-   }
if (sfds_list_empty(sdp)) { /* no existing opens on this device */
sdp->sgdebug = 0;
q = sdp->device->request_queue;
sdp->sg_tablesize = queue_max_segments(q);
}
-   if ((sfp = sg_add_sfp(sdp, dev)))
-   filp->private_data = sfp;
-   else {
-   retval = -ENOMEM;
+   sfp = sg_add_sfp(sdp, dev);
+   if (IS_ERR(sfp)) {
+   retval = PTR_ERR(sfp);
goto sem_out;
}
+   filp->private_data = sfp;
retval = 0;
 
if (retval) {
@@ -2055,7 +2051,7 @@ sg_add_sfp(Sg_device * sdp, int dev)
 
sfp = kzalloc(sizeof(*sfp), GFP_ATOMIC | __GFP_NOWARN);
if (!sfp)
-   return NULL;
+   return ERR_PTR(-ENOMEM);
 
init_waitqueue_head(&sfp->read_wait);
rwlock_init(&sfp->rq_list_lock);
@@ -2070,6 +2066,10 @@ sg_add_sfp(Sg_device * sdp, int dev)
sfp->keep_orphan = SG_DEF_KEEP_ORPHAN;
sfp->parentdp = sdp;
write_lock_irqsave(&sg_index_lock, iflags);
+   if (sdp->detached) {
+   write_unlock_irqrestore(&sg_index_lock, iflags);
+   return ERR_PTR(-ENODEV);
+   }
list_add_tail(&sfp->sfd_siblings, &sdp->sfds);
write_unlock_irqrestore(&sg_index_lock, iflags);
SCSI_LOG_TIMEOUT(3, printk("sg_add_sfp: sfp=0x%p\n", sfp));
-- 
1.7.11.7

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v4 4/4] [SCSI] sg: push file descriptor list locking down to per-device locking

2013-07-17 Thread Vaughan Cao
Push file descriptor list locking down to per-device locking. Let sg_index_lock
only protect device lookup.
sdp->detached is also set and checked with this lock held.

Signed-off-by: Vaughan Cao 
---
 drivers/scsi/sg.c | 61 ++-
 1 file changed, 33 insertions(+), 28 deletions(-)

diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
index 4d2a19f..3ea7c09 100644
--- a/drivers/scsi/sg.c
+++ b/drivers/scsi/sg.c
@@ -104,8 +104,7 @@ static int sg_add(struct device *, struct class_interface 
*);
 static void sg_remove(struct device *, struct class_interface *);
 
 static DEFINE_IDR(sg_index_idr);
-static DEFINE_RWLOCK(sg_index_lock);   /* Also used to lock
-  file descriptor list 
for device */
+static DEFINE_RWLOCK(sg_index_lock);
 
 static struct class_interface sg_interface = {
.add_dev= sg_add,
@@ -142,8 +141,7 @@ typedef struct sg_request { /* SG_MAX_QUEUE requests 
outstanding per file */
 } Sg_request;
 
 typedef struct sg_fd { /* holds the state of a file descriptor */
-   /* sfd_siblings is protected by sg_index_lock */
-   struct list_head sfd_siblings;
+   struct list_head sfd_siblings; /* protected by sfd_lock of device */
struct sg_device *parentdp; /* owning device */
wait_queue_head_t read_wait;/* queue read until command done */
rwlock_t rq_list_lock;  /* protect access to list in req_arr */
@@ -168,7 +166,7 @@ typedef struct sg_device { /* holds the state of each scsi 
generic device */
struct scsi_device *device;
int sg_tablesize;   /* adapter's max scatter-gather table size */
u32 index;  /* device index number */
-   /* sfds is protected by sg_index_lock */
+   spinlock_t sfd_lock;/* protect file descriptor list for device */
struct list_head sfds;
struct rw_semaphore o_sem;  /* exclude open should hold this rwsem 
*/
volatile char detached; /* 0->attached, 1->detached pending removal */
@@ -226,9 +224,9 @@ static int sfds_list_empty(Sg_device *sdp)
unsigned long flags;
int ret;
 
-   read_lock_irqsave(&sg_index_lock, flags);
+   spin_lock_irqsave(&sdp->sfd_lock, flags);
ret = list_empty(&sdp->sfds);
-   read_unlock_irqrestore(&sg_index_lock, flags);
+   spin_unlock_irqrestore(&sdp->sfd_lock, flags);
return ret;
 }
 
@@ -1391,6 +1389,7 @@ static Sg_device *sg_alloc(struct gendisk *disk, struct 
scsi_device *scsidp)
disk->first_minor = k;
sdp->disk = disk;
sdp->device = scsidp;
+   spin_lock_init(&sdp->sfd_lock);
INIT_LIST_HEAD(&sdp->sfds);
init_rwsem(&sdp->o_sem);
sdp->sg_tablesize = queue_max_segments(q);
@@ -1533,11 +1532,13 @@ static void sg_remove(struct device *cl_dev, struct 
class_interface *cl_intf)
 
/* Need a write lock to set sdp->detached. */
write_lock_irqsave(&sg_index_lock, iflags);
+   spin_lock(&sdp->sfd_lock);
sdp->detached = 1;
list_for_each_entry(sfp, &sdp->sfds, sfd_siblings) {
wake_up_interruptible(&sfp->read_wait);
kill_fasync(&sfp->async_qp, SIGPOLL, POLL_HUP);
}
+   spin_unlock(&sdp->sfd_lock);
write_unlock_irqrestore(&sg_index_lock, iflags);
 
sysfs_remove_link(&scsidp->sdev_gendev.kobj, "generic");
@@ -2065,15 +2066,15 @@ sg_add_sfp(Sg_device * sdp, int dev, int * reason)
sfp->cmd_q = SG_DEF_COMMAND_Q;
sfp->keep_orphan = SG_DEF_KEEP_ORPHAN;
sfp->parentdp = sdp;
-   write_lock_irqsave(&sg_index_lock, iflags);
+   spin_lock_irqsave(&sdp->sfd_lock, iflags);
if (sdp->detached) {
-   write_unlock_irqrestore(&sg_index_lock, iflags);
+   spin_unlock_irqrestore(&sdp->sfd_lock, iflags);
if (reason)
*reason = -ENODEV;
return NULL;
}
list_add_tail(&sfp->sfd_siblings, &sdp->sfds);
-   write_unlock_irqrestore(&sg_index_lock, iflags);
+   spin_unlock_irqrestore(&sdp->sfd_lock, iflags);
SCSI_LOG_TIMEOUT(3, printk("sg_add_sfp: sfp=0x%p\n", sfp));
if (unlikely(sg_big_buff != def_reserved_size))
sg_big_buff = def_reserved_size;
@@ -2123,9 +2124,9 @@ static void sg_remove_sfp(struct kref *kref)
struct sg_device *sdp = sfp->parentdp;
unsigned long iflags;
 
-   write_lock_irqsave(&sg_index_lock, iflags);
+   spin_lock_irqsave(&sdp->sfd_lock, iflags);
list_del(&sfp->sfd_siblings);
-   write_unlock_irqrestore(&sg_index_lock, iflags);
+   spin_unlock_irqrestore(&sdp->sfd_lock, iflags);
 

[PATCH v4 1/4] [SCSI] sg: use rwsem to solve race during exclusive open

2013-07-17 Thread Vaughan Cao
A race condition may happen if two threads are both trying to open the same sg
with O_EXCL simultaneously. It's possible that they both find fsds list is
empty and get_exclude(sdp) returns 0, then they both call set_exclude() and
break out from wait_event_interruptible and resume open.

Now use rwsem to protect this process. Exclusive open gets write lock and
others get read lock. The lock will be held until file descriptor is closed.
This also leads 'exclude' only a status rather than a check mark.

Signed-off-by: Vaughan Cao 
---
 drivers/scsi/sg.c | 77 ++-
 1 file changed, 42 insertions(+), 35 deletions(-)

diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
index 25b5455..edc395a 100644
--- a/drivers/scsi/sg.c
+++ b/drivers/scsi/sg.c
@@ -168,11 +168,11 @@ typedef struct sg_fd {/* holds the state of a 
file descriptor */
 
 typedef struct sg_device { /* holds the state of each scsi generic device */
struct scsi_device *device;
-   wait_queue_head_t o_excl_wait;  /* queue open() when O_EXCL in use */
int sg_tablesize;   /* adapter's max scatter-gather table size */
u32 index;  /* device index number */
/* sfds is protected by sg_index_lock */
struct list_head sfds;
+   struct rw_semaphore o_sem;  /* exclude open should hold this rwsem 
*/
volatile char detached; /* 0->attached, 1->detached pending removal */
/* exclude protected by sg_open_exclusive_lock */
char exclude;   /* opened for exclusive access */
@@ -293,35 +293,35 @@ sg_open(struct inode *inode, struct file *filp)
goto error_out;
}
 
-   if (flags & O_EXCL) {
-   if (O_RDONLY == (flags & O_ACCMODE)) {
-   retval = -EPERM; /* Can't lock it with read only access 
*/
-   goto error_out;
-   }
-   if (!sfds_list_empty(sdp) && (flags & O_NONBLOCK)) {
-   retval = -EBUSY;
-   goto error_out;
-   }
-   res = wait_event_interruptible(sdp->o_excl_wait,
-  ((!sfds_list_empty(sdp) || 
get_exclude(sdp)) ? 0 : set_exclude(sdp, 1)));
-   if (res) {
-   retval = res;   /* -ERESTARTSYS because signal hit 
process */
-   goto error_out;
-   }
-   } else if (get_exclude(sdp)) {  /* some other fd has an exclusive lock 
on dev */
-   if (flags & O_NONBLOCK) {
-   retval = -EBUSY;
-   goto error_out;
-   }
-   res = wait_event_interruptible(sdp->o_excl_wait, 
!get_exclude(sdp));
-   if (res) {
-   retval = res;   /* -ERESTARTSYS because signal hit 
process */
-   goto error_out;
+   if ((flags & O_EXCL) && (O_RDONLY == (flags & O_ACCMODE))) {
+   retval = -EPERM; /* Can't lock it with read only access */
+   goto error_out;
+   }
+   if (flags & O_NONBLOCK) {
+   if (flags & O_EXCL) {
+   if (!down_write_trylock(&sdp->o_sem)) {
+   retval = -EBUSY;
+   goto error_out;
+   }
+   } else {
+   if (!down_read_trylock(&sdp->o_sem)) {
+   retval = -EBUSY;
+   goto error_out;
+   }
}
+   } else {
+   if (flags & O_EXCL)
+   down_write(&sdp->o_sem);
+   else
+   down_read(&sdp->o_sem);
}
+   /* Since write lock is held, no need to check sfd_list */
+   if (flags & O_EXCL)
+   set_exclude(sdp, 1);
+
if (sdp->detached) {
retval = -ENODEV;
-   goto error_out;
+   goto sem_out;
}
if (sfds_list_empty(sdp)) { /* no existing opens on this device */
sdp->sgdebug = 0;
@@ -331,16 +331,19 @@ sg_open(struct inode *inode, struct file *filp)
if ((sfp = sg_add_sfp(sdp, dev)))
filp->private_data = sfp;
else {
-   if (flags & O_EXCL) {
-   set_exclude(sdp, 0);/* undo if error */
-   wake_up_interruptible(&sdp->o_excl_wait);
-   }
retval = -ENOMEM;
-   goto error_out;
+   goto sem_out;
}
retval = 0;
-error_out:
+
if (retval) {
+sem_out:
+   if (flags & O_EXCL) {
+   set_exclude(sdp, 0);/* undo if error */
+   up_write(&s

[PATCH v4 3/4] [SCSI] sg: checking sdp->detached isn't protected when open

2013-07-17 Thread Vaughan Cao
@detached is set under the protection of sg_index_lock. Without getting the
lock, new sfp will be added during sg removal and there is no chance for it
to be picked out. So check with sg_index_lock held in sg_add_sfp().

Signed-off-by: Vaughan Cao 
---
 drivers/scsi/sg.c | 26 ++
 1 file changed, 14 insertions(+), 12 deletions(-)

diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
index 671b760..4d2a19f 100644
--- a/drivers/scsi/sg.c
+++ b/drivers/scsi/sg.c
@@ -196,7 +196,7 @@ static void sg_remove_scat(Sg_scatter_hold * schp);
 static void sg_build_reserve(Sg_fd * sfp, int req_size);
 static void sg_link_reserve(Sg_fd * sfp, Sg_request * srp, int size);
 static void sg_unlink_reserve(Sg_fd * sfp, Sg_request * srp);
-static Sg_fd *sg_add_sfp(Sg_device * sdp, int dev);
+static Sg_fd *sg_add_sfp(Sg_device * sdp, int dev, int * reason);
 static void sg_remove_sfp(struct kref *);
 static Sg_request *sg_get_rq_mark(Sg_fd * sfp, int pack_id);
 static Sg_request *sg_add_request(Sg_fd * sfp);
@@ -295,21 +295,14 @@ sg_open(struct inode *inode, struct file *filp)
if (flags & O_EXCL)
sdp->exclude = 1;   /* used by release lock */
 
-   if (sdp->detached) {
-   retval = -ENODEV;
-   goto sem_out;
-   }
if (sfds_list_empty(sdp)) { /* no existing opens on this device */
sdp->sgdebug = 0;
q = sdp->device->request_queue;
sdp->sg_tablesize = queue_max_segments(q);
}
-   if ((sfp = sg_add_sfp(sdp, dev)))
-   filp->private_data = sfp;
-   else {
-   retval = -ENOMEM;
+   if (!(sfp = sg_add_sfp(sdp, dev, &retval)))
goto sem_out;
-   }
+   filp->private_data = sfp;
retval = 0;
 
if (retval) {
@@ -2047,15 +2040,18 @@ sg_remove_request(Sg_fd * sfp, Sg_request * srp)
 }
 
 static Sg_fd *
-sg_add_sfp(Sg_device * sdp, int dev)
+sg_add_sfp(Sg_device * sdp, int dev, int * reason)
 {
Sg_fd *sfp;
unsigned long iflags;
int bufflen;
 
sfp = kzalloc(sizeof(*sfp), GFP_ATOMIC | __GFP_NOWARN);
-   if (!sfp)
+   if (!sfp) {
+   if (reason)
+   *reason = -ENOMEM;
return NULL;
+   }
 
init_waitqueue_head(&sfp->read_wait);
rwlock_init(&sfp->rq_list_lock);
@@ -2070,6 +2066,12 @@ sg_add_sfp(Sg_device * sdp, int dev)
sfp->keep_orphan = SG_DEF_KEEP_ORPHAN;
sfp->parentdp = sdp;
write_lock_irqsave(&sg_index_lock, iflags);
+   if (sdp->detached) {
+   write_unlock_irqrestore(&sg_index_lock, iflags);
+   if (reason)
+   *reason = -ENODEV;
+   return NULL;
+   }
list_add_tail(&sfp->sfd_siblings, &sdp->sfds);
write_unlock_irqrestore(&sg_index_lock, iflags);
SCSI_LOG_TIMEOUT(3, printk("sg_add_sfp: sfp=0x%p\n", sfp));
-- 
1.7.11.7

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v4 2/4] [SCSI] sg: no need sg_open_exclusive_lock

2013-07-17 Thread Vaughan Cao
Open exclusive check is protected by o_sem, no need sg_open_exclusive_lock.
@exclude is used to record which type of rwsem we are holding.

Signed-off-by: Vaughan Cao 
---
 drivers/scsi/sg.c | 34 +-
 1 file changed, 5 insertions(+), 29 deletions(-)

diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
index edc395a..671b760 100644
--- a/drivers/scsi/sg.c
+++ b/drivers/scsi/sg.c
@@ -103,8 +103,6 @@ static int scatter_elem_sz_prev = SG_SCATTER_SZ;
 static int sg_add(struct device *, struct class_interface *);
 static void sg_remove(struct device *, struct class_interface *);
 
-static DEFINE_SPINLOCK(sg_open_exclusive_lock);
-
 static DEFINE_IDR(sg_index_idr);
 static DEFINE_RWLOCK(sg_index_lock);   /* Also used to lock
   file descriptor list 
for device */
@@ -174,7 +172,6 @@ typedef struct sg_device { /* holds the state of each scsi 
generic device */
struct list_head sfds;
struct rw_semaphore o_sem;  /* exclude open should hold this rwsem 
*/
volatile char detached; /* 0->attached, 1->detached pending removal */
-   /* exclude protected by sg_open_exclusive_lock */
char exclude;   /* opened for exclusive access */
char sgdebug;   /* 0->off, 1->sense, 9->dump dev, 10-> all devs 
*/
struct gendisk *disk;
@@ -224,27 +221,6 @@ static int sg_allow_access(struct file *filp, unsigned 
char *cmd)
return blk_verify_command(q, cmd, filp->f_mode & FMODE_WRITE);
 }
 
-static int get_exclude(Sg_device *sdp)
-{
-   unsigned long flags;
-   int ret;
-
-   spin_lock_irqsave(&sg_open_exclusive_lock, flags);
-   ret = sdp->exclude;
-   spin_unlock_irqrestore(&sg_open_exclusive_lock, flags);
-   return ret;
-}
-
-static int set_exclude(Sg_device *sdp, char val)
-{
-   unsigned long flags;
-
-   spin_lock_irqsave(&sg_open_exclusive_lock, flags);
-   sdp->exclude = val;
-   spin_unlock_irqrestore(&sg_open_exclusive_lock, flags);
-   return val;
-}
-
 static int sfds_list_empty(Sg_device *sdp)
 {
unsigned long flags;
@@ -317,7 +293,7 @@ sg_open(struct inode *inode, struct file *filp)
}
/* Since write lock is held, no need to check sfd_list */
if (flags & O_EXCL)
-   set_exclude(sdp, 1);
+   sdp->exclude = 1;   /* used by release lock */
 
if (sdp->detached) {
retval = -ENODEV;
@@ -339,7 +315,7 @@ sg_open(struct inode *inode, struct file *filp)
if (retval) {
 sem_out:
if (flags & O_EXCL) {
-   set_exclude(sdp, 0);/* undo if error */
+   sdp->exclude = 0;   /* undo if error */
up_write(&sdp->o_sem);
} else
up_read(&sdp->o_sem);
@@ -366,8 +342,8 @@ sg_release(struct inode *inode, struct file *filp)
return -ENXIO;
SCSI_LOG_TIMEOUT(3, printk("sg_release: %s\n", sdp->disk->disk_name));
 
-   excl = get_exclude(sdp);
-   set_exclude(sdp, 0);
+   excl = sdp->exclude;
+   sdp->exclude = 0;
if (excl)
up_write(&sdp->o_sem);
else
@@ -2637,7 +2613,7 @@ static int sg_proc_seq_show_debug(struct seq_file *s, 
void *v)
 scsidp->lun,
 scsidp->host->hostt->emulated);
seq_printf(s, " sg_tablesize=%d excl=%d\n",
-  sdp->sg_tablesize, get_exclude(sdp));
+  sdp->sg_tablesize, sdp->exclude);
sg_proc_debug_helper(s, sdp);
}
read_unlock_irqrestore(&sg_index_lock, iflags);
-- 
1.7.11.7

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v4 0/4] [SCSI] sg: fix race condition in sg_open

2013-07-17 Thread Vaughan Cao
There is a race when open sg with O_EXCL flag. Also a race may happen between
sg_open and sg_remove.

Changes from v3:
 * release o_sem in sg_release(), not in sg_remove_sfp().
 * not set exclude with sfd_lock held.

Vaughan Cao (4):
  [SCSI] sg: use rwsem to solve race during exclusive open
  [SCSI] sg: no need sg_open_exclusive_lock
  [SCSI] sg: checking sdp->detached isn't protected when open
  [SCSI] sg: push file descriptor list locking down to per-device
locking

 drivers/scsi/sg.c | 186 ++
 1 file changed, 88 insertions(+), 98 deletions(-)

-- 
1.7.11.7

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v3 1/1] [SCSI] sg: fix race condition when do exclusive open

2013-07-07 Thread vaughan

Use rwsem to aid opens. Exclusive open has to get write lock and non-exclusive 
open should get read lock.
Replace global sg_open_exclusive_lock with a per device lock - sfd_lock. Since 
sfds list is now protected by the lock owned by the same sg_device, 
sg_index_lock becomes a real global lock to only protect sg devices lookup.

Please pay attention to sdp->detached. Previously sg_open may also race with 
sg_remove. Now there are two points for sg_open to detect detached and finish 
itself. One is at sg_device lookup and the other is when trying to link new sfp 
into the sfds list in sg_add_sfp.
I don't think it's necessary to do extra set_exclude and wake_up o_excl_wait in 
sg_release before, so I remove them and only do the cleanup in sg_remove_sfp.
Although simple testcases are passed, I'm not certain if I've fixed it well, 
please give me any comments as you wish.

Signed-off-by: Vaughan Cao 
---
 drivers/scsi/sg.c | 187 ++
 1 file changed, 89 insertions(+), 98 deletions(-)

diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
index 25b5455..3fc8c19 100644
--- a/drivers/scsi/sg.c
+++ b/drivers/scsi/sg.c
@@ -103,11 +103,8 @@ static int scatter_elem_sz_prev = SG_SCATTER_SZ;
 static int sg_add(struct device *, struct class_interface *);
 static void sg_remove(struct device *, struct class_interface *);
 
-static DEFINE_SPINLOCK(sg_open_exclusive_lock);

-
 static DEFINE_IDR(sg_index_idr);
-static DEFINE_RWLOCK(sg_index_lock);   /* Also used to lock
-  file descriptor list 
for device */
+static DEFINE_RWLOCK(sg_index_lock);   /* protect sg index */
 
 static struct class_interface sg_interface = {

.add_dev= sg_add,
@@ -144,7 +141,7 @@ typedef struct sg_request { /* SG_MAX_QUEUE requests 
outstanding per file */
 } Sg_request;
 
 typedef struct sg_fd {		/* holds the state of a file descriptor */

-   /* sfd_siblings is protected by sg_index_lock */
+   /* sfd_siblings is protected by sfd_lock of sg_device */
struct list_head sfd_siblings;
struct sg_device *parentdp; /* owning device */
wait_queue_head_t read_wait;/* queue read until command done */
@@ -168,13 +165,12 @@ typedef struct sg_fd {/* holds the state of a 
file descriptor */
 
 typedef struct sg_device { /* holds the state of each scsi generic device */

struct scsi_device *device;
-   wait_queue_head_t o_excl_wait;  /* queue open() when O_EXCL in use */
int sg_tablesize;   /* adapter's max scatter-gather table size */
u32 index;  /* device index number */
-   /* sfds is protected by sg_index_lock */
+   spinlock_t sfd_lock;/* protect sfds, exclude */
struct list_head sfds;
+   struct rw_semaphore o_sem;  /* exclude open should hold this rwsem 
*/
volatile char detached; /* 0->attached, 1->detached pending removal */
-   /* exclude protected by sg_open_exclusive_lock */
char exclude;   /* opened for exclusive access */
char sgdebug;   /* 0->off, 1->sense, 9->dump dev, 10-> all devs 
*/
struct gendisk *disk;
@@ -199,7 +195,7 @@ static void sg_remove_scat(Sg_scatter_hold * schp);
 static void sg_build_reserve(Sg_fd * sfp, int req_size);
 static void sg_link_reserve(Sg_fd * sfp, Sg_request * srp, int size);
 static void sg_unlink_reserve(Sg_fd * sfp, Sg_request * srp);
-static Sg_fd *sg_add_sfp(Sg_device * sdp, int dev);
+static Sg_fd *sg_add_sfp(Sg_device * sdp, int dev, int excl, int * reason);
 static void sg_remove_sfp(struct kref *);
 static Sg_request *sg_get_rq_mark(Sg_fd * sfp, int pack_id);
 static Sg_request *sg_add_request(Sg_fd * sfp);
@@ -224,35 +220,14 @@ static int sg_allow_access(struct file *filp, unsigned 
char *cmd)
return blk_verify_command(q, cmd, filp->f_mode & FMODE_WRITE);
 }
 
-static int get_exclude(Sg_device *sdp)

-{
-   unsigned long flags;
-   int ret;
-
-   spin_lock_irqsave(&sg_open_exclusive_lock, flags);
-   ret = sdp->exclude;
-   spin_unlock_irqrestore(&sg_open_exclusive_lock, flags);
-   return ret;
-}
-
-static int set_exclude(Sg_device *sdp, char val)
-{
-   unsigned long flags;
-
-   spin_lock_irqsave(&sg_open_exclusive_lock, flags);
-   sdp->exclude = val;
-   spin_unlock_irqrestore(&sg_open_exclusive_lock, flags);
-   return val;
-}
-
 static int sfds_list_empty(Sg_device *sdp)
 {
unsigned long flags;
int ret;
 
-	read_lock_irqsave(&sg_index_lock, flags);

+   spin_lock_irqsave(&sdp->sfd_lock, flags);
ret = list_empty(&sdp->sfds);
-   read_unlock_irqrestore(&sg_index_lock, flags);
+   spin_unlock_irqrestore(&sdp->sfd_lock, flags);
return ret;
 }
 
@@ -264,8 +239,8 @@ sg_open(struct inode *inode, struct f

Re: [PATCH v2 1/1] [SCSI] sg: fix race condition when do exclusive open

2013-07-06 Thread vaughan

On 07/06/2013 01:39 AM, Jörn Engel wrote:

Sorry about replying so late.

On Mon, 17 June 2013 21:10:53 +0800, vaughan wrote:

Rewrite the last patch.
Add a new field 'toopen' in sg_device to count ongoing sg_open's. By checking 
both 'toopen' and 'exclude' marks when do exclusive open, old race conditions 
can be avoided.
Replace global sg_open_exclusive_lock with a per device lock - sfd_lock. Since 
sfds list is now protected by the lock owned by the same sg_device, 
sg_index_lock becomes a real global lock to only protect sg devices lookup.
Also did some cleanup, such as remove get_exclude() and rename set_exclude() to 
clear_exclude().


...

@@ -171,10 +168,10 @@ typedef struct sg_device { /* holds the state of each 
scsi generic device */
wait_queue_head_t o_excl_wait;  /* queue open() when O_EXCL in use */
int sg_tablesize;   /* adapter's max scatter-gather table size */
u32 index;  /* device index number */
-   /* sfds is protected by sg_index_lock */
+   spinlock_t sfd_lock;/* protect sfds, exclude, toopen */
struct list_head sfds;
+   int toopen; /* number of who are ready to open sg */

 ^
I think the 'toopen' is a bad choice.  I'm having trouble wrapping my
head around the semantics of this variable, your description feels a
bit handwavy, the main noun is missing in the command above, I think I
found one more overflow bug,...

What you ended up doing is reimplement a rw_semaphone.  Why not use
one instead?  down_write() for exclusive access, down_read() for
non-exclusive, _trylock variants for nonblocking opens, etc.
The critical part of open is to add a new sfd to the list and its 
protected by the
spin_lock(sg_index_lock previously) well. So I added an counter as a 
sign rather than
introducing another spinlock or mutex which means I should deal with 
potential deadlock.
The code may be simpler with a rwsem implementation as you suggest, I'll 
modify it in

this way.

There is no overflow bug, I eliminated it with the following line :)
 if (!sdp->exclude && sdp->toopen != INT_MAX) { ...

Do you agree that I use a per device spin_lock 'sfd_lock' to protect 
sfds list and leave sg_index_lock
only protect the global sg device lookup? I think it's reasonable for 
concurrency.



Thanks,
Vaughan



Would this work?




Jörn

--
I've never met a human being who would want to read 17,000 pages of
documentation, and if there was, I'd kill him to get him out of the
gene pool.
-- Joseph Costello


--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 1/1] [SCSI] sg: fix race condition when do exclusive open

2013-07-04 Thread vaughan

On 06/26/2013 09:37 AM, vaughan wrote:

Hi Jörn Engel,

Ping.
How about this one? I found my lat patch hasn't fix the issue, so I
modified it a little more. Last thread is:
Subject: Re: [PATCH] sg: atomize check and set sdp->exclude in sg_open
Message-ID: <20130605154106.ga2...@logfs.org>

Regards,
Vaughan

于 2013年06月17日 21:10, vaughan 写道:

Rewrite the last patch.
Add a new field 'toopen' in sg_device to count ongoing sg_open's. By checking 
both 'toopen' and 'exclude' marks when do exclusive open, old race conditions 
can be avoided.
Replace global sg_open_exclusive_lock with a per device lock - sfd_lock. Since 
sfds list is now protected by the lock owned by the same sg_device, 
sg_index_lock becomes a real global lock to only protect sg devices lookup.
Also did some cleanup, such as remove get_exclude() and rename set_exclude() to 
clear_exclude().

Signed-off-by: Vaughan Cao 
---
  drivers/scsi/sg.c | 152 --
  1 file changed, 90 insertions(+), 62 deletions(-)

diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
index 25b5455..b0ea73f 100644
--- a/drivers/scsi/sg.c
+++ b/drivers/scsi/sg.c
@@ -103,11 +103,8 @@ static int scatter_elem_sz_prev = SG_SCATTER_SZ;
  static int sg_add(struct device *, struct class_interface *);
  static void sg_remove(struct device *, struct class_interface *);

-static DEFINE_SPINLOCK(sg_open_exclusive_lock);
-
  static DEFINE_IDR(sg_index_idr);
-static DEFINE_RWLOCK(sg_index_lock);   /* Also used to lock
-  file descriptor list 
for device */
+static DEFINE_RWLOCK(sg_index_lock);   /* protect sg index */

  static struct class_interface sg_interface = {
.add_dev= sg_add,
@@ -144,7 +141,7 @@ typedef struct sg_request { /* SG_MAX_QUEUE requests 
outstanding per file */
  } Sg_request;

  typedef struct sg_fd {/* holds the state of a file descriptor 
*/
-   /* sfd_siblings is protected by sg_index_lock */
+   /* sfd_siblings is protected by sfd_lock of sg_device */
struct list_head sfd_siblings;
struct sg_device *parentdp; /* owning device */
wait_queue_head_t read_wait;/* queue read until command done */
@@ -171,10 +168,10 @@ typedef struct sg_device { /* holds the state of each 
scsi generic device */
wait_queue_head_t o_excl_wait;  /* queue open() when O_EXCL in use */
int sg_tablesize;   /* adapter's max scatter-gather table size */
u32 index;  /* device index number */
-   /* sfds is protected by sg_index_lock */
+   spinlock_t sfd_lock;/* protect sfds, exclude, toopen */
struct list_head sfds;
+   int toopen; /* number of who are ready to open sg */
volatile char detached; /* 0->attached, 1->detached pending removal */
-   /* exclude protected by sg_open_exclusive_lock */
char exclude;   /* opened for exclusive access */
char sgdebug;   /* 0->off, 1->sense, 9->dump dev, 10-> all devs 
*/
struct gendisk *disk;
@@ -224,25 +221,44 @@ static int sg_allow_access(struct file *filp, unsigned 
char *cmd)
return blk_verify_command(q, cmd, filp->f_mode & FMODE_WRITE);
  }

-static int get_exclude(Sg_device *sdp)
+static void clear_exclude(Sg_device *sdp)
  {
unsigned long flags;
-   int ret;

-   spin_lock_irqsave(&sg_open_exclusive_lock, flags);
-   ret = sdp->exclude;
-   spin_unlock_irqrestore(&sg_open_exclusive_lock, flags);
+   spin_lock_irqsave(&sdp->sfd_lock, flags);
+   sdp->exclude = 0;
+   spin_unlock_irqrestore(&sdp->sfd_lock, flags);
+   return;
+}
+
+/* we can add exclusively only when no other addition is going on */
+static int try_add_exclude(Sg_device *sdp)
+{
+   unsigned long flags;
+   int ret = 0;
+
+   spin_lock_irqsave(&sdp->sfd_lock, flags);
+   if (list_empty(&sdp->sfds) && !sdp->toopen) {
+   sdp->exclude = 1;
+   sdp->toopen++;
+   ret = 1;
+   }
+   spin_unlock_irqrestore(&sdp->sfd_lock, flags);
return ret;
  }

-static int set_exclude(Sg_device *sdp, char val)
+static int try_add_shareable(Sg_device *sdp)
  {
unsigned long flags;
+   int ret = 0;

-   spin_lock_irqsave(&sg_open_exclusive_lock, flags);
-   sdp->exclude = val;
-   spin_unlock_irqrestore(&sg_open_exclusive_lock, flags);
-   return val;
+   spin_lock_irqsave(&sdp->sfd_lock, flags);
+   if (!sdp->exclude && sdp->toopen != INT_MAX) {
+   sdp->toopen++;
+   ret = 1;
+   }
+   spin_unlock_irqrestore(&sdp->sfd_lock, flags);
+   return ret;
  }

  static int sfds_list_empty(Sg_device *sdp)
@@ -250,9 +266,9 @@ static int sfd

Re: [PATCH v2 1/1] [SCSI] sg: fix race condition when do exclusive open

2013-06-25 Thread vaughan
Hi Jörn Engel,

Ping.
How about this one? I found my lat patch hasn't fix the issue, so I
modified it a little more. Last thread is:
Subject: Re: [PATCH] sg: atomize check and set sdp->exclude in sg_open
Message-ID: <20130605154106.ga2...@logfs.org>

Regards,
Vaughan

于 2013年06月17日 21:10, vaughan 写道:
> Rewrite the last patch.
> Add a new field 'toopen' in sg_device to count ongoing sg_open's. By checking 
> both 'toopen' and 'exclude' marks when do exclusive open, old race conditions 
> can be avoided.
> Replace global sg_open_exclusive_lock with a per device lock - sfd_lock. 
> Since sfds list is now protected by the lock owned by the same sg_device, 
> sg_index_lock becomes a real global lock to only protect sg devices lookup.
> Also did some cleanup, such as remove get_exclude() and rename set_exclude() 
> to clear_exclude().
> 
> Signed-off-by: Vaughan Cao 
> ---
>  drivers/scsi/sg.c | 152 
> --
>  1 file changed, 90 insertions(+), 62 deletions(-)
> 
> diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
> index 25b5455..b0ea73f 100644
> --- a/drivers/scsi/sg.c
> +++ b/drivers/scsi/sg.c
> @@ -103,11 +103,8 @@ static int scatter_elem_sz_prev = SG_SCATTER_SZ;
>  static int sg_add(struct device *, struct class_interface *);
>  static void sg_remove(struct device *, struct class_interface *);
>  
> -static DEFINE_SPINLOCK(sg_open_exclusive_lock);
> -
>  static DEFINE_IDR(sg_index_idr);
> -static DEFINE_RWLOCK(sg_index_lock); /* Also used to lock
> -file descriptor list 
> for device */
> +static DEFINE_RWLOCK(sg_index_lock); /* protect sg index */
>  
>  static struct class_interface sg_interface = {
>   .add_dev= sg_add,
> @@ -144,7 +141,7 @@ typedef struct sg_request {   /* SG_MAX_QUEUE 
> requests outstanding per file */
>  } Sg_request;
>  
>  typedef struct sg_fd {   /* holds the state of a file descriptor 
> */
> - /* sfd_siblings is protected by sg_index_lock */
> + /* sfd_siblings is protected by sfd_lock of sg_device */
>   struct list_head sfd_siblings;
>   struct sg_device *parentdp; /* owning device */
>   wait_queue_head_t read_wait;/* queue read until command done */
> @@ -171,10 +168,10 @@ typedef struct sg_device { /* holds the state of each 
> scsi generic device */
>   wait_queue_head_t o_excl_wait;  /* queue open() when O_EXCL in use */
>   int sg_tablesize;   /* adapter's max scatter-gather table size */
>   u32 index;  /* device index number */
> - /* sfds is protected by sg_index_lock */
> + spinlock_t sfd_lock;/* protect sfds, exclude, toopen */
>   struct list_head sfds;
> + int toopen; /* number of who are ready to open sg */
>   volatile char detached; /* 0->attached, 1->detached pending removal */
> - /* exclude protected by sg_open_exclusive_lock */
>   char exclude;   /* opened for exclusive access */
>   char sgdebug;   /* 0->off, 1->sense, 9->dump dev, 10-> all devs 
> */
>   struct gendisk *disk;
> @@ -224,25 +221,44 @@ static int sg_allow_access(struct file *filp, unsigned 
> char *cmd)
>   return blk_verify_command(q, cmd, filp->f_mode & FMODE_WRITE);
>  }
>  
> -static int get_exclude(Sg_device *sdp)
> +static void clear_exclude(Sg_device *sdp)
>  {
>   unsigned long flags;
> - int ret;
>  
> - spin_lock_irqsave(&sg_open_exclusive_lock, flags);
> - ret = sdp->exclude;
> - spin_unlock_irqrestore(&sg_open_exclusive_lock, flags);
> + spin_lock_irqsave(&sdp->sfd_lock, flags);
> + sdp->exclude = 0;
> + spin_unlock_irqrestore(&sdp->sfd_lock, flags);
> + return;
> +}
> +
> +/* we can add exclusively only when no other addition is going on */
> +static int try_add_exclude(Sg_device *sdp)
> +{
> + unsigned long flags;
> + int ret = 0;
> +
> + spin_lock_irqsave(&sdp->sfd_lock, flags);
> + if (list_empty(&sdp->sfds) && !sdp->toopen) {
> + sdp->exclude = 1;
> + sdp->toopen++;
> + ret = 1;
> + }
> + spin_unlock_irqrestore(&sdp->sfd_lock, flags);
>   return ret;
>  }
>  
> -static int set_exclude(Sg_device *sdp, char val)
> +static int try_add_shareable(Sg_device *sdp)
>  {
>   unsigned long flags;
> + int ret = 0;
>  
> - spin_lock_irqsave(&sg_open_exclusive_lock, flags);
> - sdp->exclude = val;
> - spin_unlock_irqrestore(&sg_open_exclusive_lock

[PATCH v2 1/1] [SCSI] sg: fix race condition when do exclusive open

2013-06-17 Thread vaughan
Rewrite the last patch.
Add a new field 'toopen' in sg_device to count ongoing sg_open's. By checking 
both 'toopen' and 'exclude' marks when do exclusive open, old race conditions 
can be avoided.
Replace global sg_open_exclusive_lock with a per device lock - sfd_lock. Since 
sfds list is now protected by the lock owned by the same sg_device, 
sg_index_lock becomes a real global lock to only protect sg devices lookup.
Also did some cleanup, such as remove get_exclude() and rename set_exclude() to 
clear_exclude().

Signed-off-by: Vaughan Cao 
---
 drivers/scsi/sg.c | 152 --
 1 file changed, 90 insertions(+), 62 deletions(-)

diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
index 25b5455..b0ea73f 100644
--- a/drivers/scsi/sg.c
+++ b/drivers/scsi/sg.c
@@ -103,11 +103,8 @@ static int scatter_elem_sz_prev = SG_SCATTER_SZ;
 static int sg_add(struct device *, struct class_interface *);
 static void sg_remove(struct device *, struct class_interface *);
 
-static DEFINE_SPINLOCK(sg_open_exclusive_lock);
-
 static DEFINE_IDR(sg_index_idr);
-static DEFINE_RWLOCK(sg_index_lock);   /* Also used to lock
-  file descriptor list 
for device */
+static DEFINE_RWLOCK(sg_index_lock);   /* protect sg index */
 
 static struct class_interface sg_interface = {
.add_dev= sg_add,
@@ -144,7 +141,7 @@ typedef struct sg_request { /* SG_MAX_QUEUE requests 
outstanding per file */
 } Sg_request;
 
 typedef struct sg_fd { /* holds the state of a file descriptor */
-   /* sfd_siblings is protected by sg_index_lock */
+   /* sfd_siblings is protected by sfd_lock of sg_device */
struct list_head sfd_siblings;
struct sg_device *parentdp; /* owning device */
wait_queue_head_t read_wait;/* queue read until command done */
@@ -171,10 +168,10 @@ typedef struct sg_device { /* holds the state of each 
scsi generic device */
wait_queue_head_t o_excl_wait;  /* queue open() when O_EXCL in use */
int sg_tablesize;   /* adapter's max scatter-gather table size */
u32 index;  /* device index number */
-   /* sfds is protected by sg_index_lock */
+   spinlock_t sfd_lock;/* protect sfds, exclude, toopen */
struct list_head sfds;
+   int toopen; /* number of who are ready to open sg */
volatile char detached; /* 0->attached, 1->detached pending removal */
-   /* exclude protected by sg_open_exclusive_lock */
char exclude;   /* opened for exclusive access */
char sgdebug;   /* 0->off, 1->sense, 9->dump dev, 10-> all devs 
*/
struct gendisk *disk;
@@ -224,25 +221,44 @@ static int sg_allow_access(struct file *filp, unsigned 
char *cmd)
return blk_verify_command(q, cmd, filp->f_mode & FMODE_WRITE);
 }
 
-static int get_exclude(Sg_device *sdp)
+static void clear_exclude(Sg_device *sdp)
 {
unsigned long flags;
-   int ret;
 
-   spin_lock_irqsave(&sg_open_exclusive_lock, flags);
-   ret = sdp->exclude;
-   spin_unlock_irqrestore(&sg_open_exclusive_lock, flags);
+   spin_lock_irqsave(&sdp->sfd_lock, flags);
+   sdp->exclude = 0;
+   spin_unlock_irqrestore(&sdp->sfd_lock, flags);
+   return;
+}
+
+/* we can add exclusively only when no other addition is going on */
+static int try_add_exclude(Sg_device *sdp)
+{
+   unsigned long flags;
+   int ret = 0;
+
+   spin_lock_irqsave(&sdp->sfd_lock, flags);
+   if (list_empty(&sdp->sfds) && !sdp->toopen) {
+   sdp->exclude = 1;
+   sdp->toopen++;
+   ret = 1;
+   }
+   spin_unlock_irqrestore(&sdp->sfd_lock, flags);
return ret;
 }
 
-static int set_exclude(Sg_device *sdp, char val)
+static int try_add_shareable(Sg_device *sdp)
 {
unsigned long flags;
+   int ret = 0;
 
-   spin_lock_irqsave(&sg_open_exclusive_lock, flags);
-   sdp->exclude = val;
-   spin_unlock_irqrestore(&sg_open_exclusive_lock, flags);
-   return val;
+   spin_lock_irqsave(&sdp->sfd_lock, flags);
+   if (!sdp->exclude && sdp->toopen != INT_MAX) {
+   sdp->toopen++;
+   ret = 1;
+   }
+   spin_unlock_irqrestore(&sdp->sfd_lock, flags);
+   return ret;
 }
 
 static int sfds_list_empty(Sg_device *sdp)
@@ -250,9 +266,9 @@ static int sfds_list_empty(Sg_device *sdp)
unsigned long flags;
int ret;
 
-   read_lock_irqsave(&sg_index_lock, flags);
+   spin_lock_irqsave(&sdp->sfd_lock, flags);
ret = list_empty(&sdp->sfds);
-   read_unlock_irqrestore(&sg_index_lock, flags);
+   spin_unlock_irqrestore(&sdp->sfd_lock, flags);
return r

Re: [PATCH] sg: atomize check and set sdp->exclude in sg_open

2013-06-06 Thread vaughan

于 2013年06月06日 15:19, vaughan 写道:

于 2013年06月05日 23:41, Jörn Engel 写道:

On Thu, 6 June 2013 00:16:45 +0800, vaughan wrote:

于 2013年06月05日 21:27, Jörn Engel 写道:

On Wed, 5 June 2013 17:18:33 +0800, vaughan wrote:


Check and set sdp->exclude should be atomic when set in sg_open().


The patch is line-wrapped.  More importantly, it doesn't seem to do

It's shorter than the original line, so I just leave it like this...


Sure.  What I meant by line-wrapped is that your mailer mangled the
patch.  Those two lines should have been one:

-   ((!sfds_list_empty(sdp) || get_exclude(sdp))
? 0 : set_exclude(sdp, 1)));



what your description indicates it should do.  And lastly, does this
fix a bug, possibly even one you have a testcase for, or was it found
by code inspection?

I found it by code inspection. A race condition may happen with the
old code if two threads are both trying to open the same sg with
O_EXCL simultaneously. It's possible that they both find fsds list
is empty and get_exclude(sdp) returns 0, then they both call
set_exclude() and break out from wait_event_interruptible and resume
open. So it's necessary to check again with sg_open_exclusive_lock
held to ensure only one can set sdp->exclude and return >0 to break
out from wait_event loop.


Makes sense.  And reading the code again, I have to wonder what monkey
came up with the get_exclude/set_exclude functions.

Can I sucker you into a slightly larger cleanup?  I think the entire
"get_exclude(sdp)) ? 0 : set_exclude(sdp, 1)" should be simplified.
And once you add the try_set_exclude(), set_exclude will only ever do
clear_exclude, so you might as well rename and simplify that as well.

I find my patch is not enough to avoid this race condition said above.
Since sg_add_sfp() just do an add_to_list without check and wait_event
check don't set a sign to announce a future add_to_list is on going, the
time window between wait_event and sg_add_sfp gives others to open sg
before the prechecked sg_add_sfp() called.

The same case also happens when one shared and one exclude open occur
simultaneously. If the shared open pass the precheck stage and ready to
sg_add_sfp(). At this time another exclude open will also pass the check:
   ((!sfds_list_empty(sdp) || get_exclude(sdp)) ? 0 :
try_set_exclude(sdp)));
Then, both open can succeed.

I think the point is we separate the check&add routine and haven't set
an sign to let others wait until the whole actions complete. I suppose
we may change the steps a bit to avoid trouble like this. If we can
malloc&initialize sfp at first, and then check&add sfp under the
protection of sg_index_lock, everything seems to be quite simple.
We also should prevent sg_device change those parameters which are 
needed to copy to sfp during sfp initialization.


Regards,
Vaughan




Regards,
Vaughan



Let no good deed go unpunished.

Jörn

--
It's just what we asked for, but not what we want!
-- anonymous


--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] sg: atomize check and set sdp->exclude in sg_open

2013-06-06 Thread vaughan

于 2013年06月05日 23:41, Jörn Engel 写道:

On Thu, 6 June 2013 00:16:45 +0800, vaughan wrote:

于 2013年06月05日 21:27, Jörn Engel 写道:

On Wed, 5 June 2013 17:18:33 +0800, vaughan wrote:


Check and set sdp->exclude should be atomic when set in sg_open().


The patch is line-wrapped.  More importantly, it doesn't seem to do

It's shorter than the original line, so I just leave it like this...


Sure.  What I meant by line-wrapped is that your mailer mangled the
patch.  Those two lines should have been one:

-   ((!sfds_list_empty(sdp) || get_exclude(sdp))
? 0 : set_exclude(sdp, 1)));



what your description indicates it should do.  And lastly, does this
fix a bug, possibly even one you have a testcase for, or was it found
by code inspection?

I found it by code inspection. A race condition may happen with the
old code if two threads are both trying to open the same sg with
O_EXCL simultaneously. It's possible that they both find fsds list
is empty and get_exclude(sdp) returns 0, then they both call
set_exclude() and break out from wait_event_interruptible and resume
open. So it's necessary to check again with sg_open_exclusive_lock
held to ensure only one can set sdp->exclude and return >0 to break
out from wait_event loop.


Makes sense.  And reading the code again, I have to wonder what monkey
came up with the get_exclude/set_exclude functions.

Can I sucker you into a slightly larger cleanup?  I think the entire
"get_exclude(sdp)) ? 0 : set_exclude(sdp, 1)" should be simplified.
And once you add the try_set_exclude(), set_exclude will only ever do
clear_exclude, so you might as well rename and simplify that as well.
I find my patch is not enough to avoid this race condition said above. 
Since sg_add_sfp() just do an add_to_list without check and wait_event 
check don't set a sign to announce a future add_to_list is on going, the 
time window between wait_event and sg_add_sfp gives others to open sg 
before the prechecked sg_add_sfp() called.


The same case also happens when one shared and one exclude open occur 
simultaneously. If the shared open pass the precheck stage and ready to 
sg_add_sfp(). At this time another exclude open will also pass the check:
  ((!sfds_list_empty(sdp) || get_exclude(sdp)) ? 0 : 
try_set_exclude(sdp)));

Then, both open can succeed.

I think the point is we separate the check&add routine and haven't set 
an sign to let others wait until the whole actions complete. I suppose 
we may change the steps a bit to avoid trouble like this. If we can 
malloc&initialize sfp at first, and then check&add sfp under the 
protection of sg_index_lock, everything seems to be quite simple.



Regards,
Vaughan



Let no good deed go unpunished.

Jörn

--
It's just what we asked for, but not what we want!
-- anonymous


--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] sg: atomize check and set sdp->exclude in sg_open

2013-06-05 Thread vaughan

于 2013年06月05日 21:27, Jörn Engel 写道:

On Wed, 5 June 2013 17:18:33 +0800, vaughan wrote:


Check and set sdp->exclude should be atomic when set in sg_open().


The patch is line-wrapped.  More importantly, it doesn't seem to do

It's shorter than the original line, so I just leave it like this...


what your description indicates it should do.  And lastly, does this
fix a bug, possibly even one you have a testcase for, or was it found
by code inspection?
I found it by code inspection. A race condition may happen with the old 
code if two threads are both trying to open the same sg with O_EXCL 
simultaneously. It's possible that they both find fsds list is empty and 
get_exclude(sdp) returns 0, then they both call set_exclude() and break 
out from wait_event_interruptible and resume open. So it's necessary to 
check again with sg_open_exclusive_lock held to ensure only one can set 
sdp->exclude and return >0 to break out from wait_event loop.





Signed-off-by: Vaughan Cao 
---
  drivers/scsi/sg.c | 17 -
  1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
index 25b5455..0ede08f 100644
--- a/drivers/scsi/sg.c
+++ b/drivers/scsi/sg.c
@@ -245,6 +245,21 @@ static int set_exclude(Sg_device *sdp, char val)
  return val;
  }

+/* Check if we can set exclude and then set, return 1 if success */
+static int try_set_exclude(Sg_device *sdp)
+{
+unsigned long flags;
+
+spin_lock_irqsave(&sg_open_exclusive_lock, flags);
+if (sdp->exclude) {
+spin_unlock_irqrestore(&sg_open_exclusive_lock, flags);
+return 0;
+}
+sdp->exclude = 1;
+spin_unlock_irqrestore(&sg_open_exclusive_lock, flags);
+return 1;
+}
+
  static int sfds_list_empty(Sg_device *sdp)
  {
  unsigned long flags;
@@ -303,7 +318,7 @@ sg_open(struct inode *inode, struct file *filp)
  goto error_out;
  }
  res = wait_event_interruptible(sdp->o_excl_wait,
-   ((!sfds_list_empty(sdp) || get_exclude(sdp))
? 0 : set_exclude(sdp, 1)));
+((!sfds_list_empty(sdp) || get_exclude(sdp)) ? 0 :
try_set_exclude(sdp)));
  if (res) {
  retval = res;/* -ERESTARTSYS because signal hit process */
  goto error_out;
--
1.7.11.7



Jörn

--
Fantasy is more important than knowledge. Knowledge is limited,
while fantasy embraces the whole world.
-- Albert Einstein


--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] sg: atomize check and set sdp->exclude in sg_open

2013-06-05 Thread vaughan

Check and set sdp->exclude should be atomic when set in sg_open().

Signed-off-by: Vaughan Cao 
---
 drivers/scsi/sg.c | 17 -
 1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
index 25b5455..0ede08f 100644
--- a/drivers/scsi/sg.c
+++ b/drivers/scsi/sg.c
@@ -245,6 +245,21 @@ static int set_exclude(Sg_device *sdp, char val)
 return val;
 }

+/* Check if we can set exclude and then set, return 1 if success */
+static int try_set_exclude(Sg_device *sdp)
+{
+unsigned long flags;
+
+spin_lock_irqsave(&sg_open_exclusive_lock, flags);
+if (sdp->exclude) {
+spin_unlock_irqrestore(&sg_open_exclusive_lock, flags);
+return 0;
+}
+sdp->exclude = 1;
+spin_unlock_irqrestore(&sg_open_exclusive_lock, flags);
+return 1;
+}
+
 static int sfds_list_empty(Sg_device *sdp)
 {
 unsigned long flags;
@@ -303,7 +318,7 @@ sg_open(struct inode *inode, struct file *filp)
 goto error_out;
 }
 res = wait_event_interruptible(sdp->o_excl_wait,
-   ((!sfds_list_empty(sdp) || get_exclude(sdp)) ? 0 
: set_exclude(sdp, 1)));
+((!sfds_list_empty(sdp) || get_exclude(sdp)) ? 0 : 
try_set_exclude(sdp)));

 if (res) {
 retval = res;/* -ERESTARTSYS because signal hit process */
 goto error_out;
--
1.7.11.7

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html