re: [SCSI] fnic: FIP VLAN Discovery Feature Support

2014-01-08 Thread Dan Carpenter
Hello Hiral Patel,

The patch d3c995f1dcf9: "[SCSI] fnic: FIP VLAN Discovery Feature
Support" from Feb 25, 2013, leads to the following
static checker warning:

drivers/scsi/fnic/fnic_fcs.c:278 is_fnic_fip_flogi_reject()
warn: is it ok to set 'els_len' to negative?

This is some unpushable debug code on my Smatch system.  It is sort of
a nonsense warning because Smatch has run into an "impossible" condition
because of a bug in Smatch but also because of bug in
fnic_fcoe_send_vlan_req().

drivers/scsi/fnic/fnic_fcs.c
   251  fiph = (struct fip_header *)skb->data;
^
fiph is untrusted data.  We don't trust the network.  This is a rule of
Linux.

   252  op = ntohs(fiph->fip_op);
   253  sub = fiph->fip_subcode;
   254  
   255  if (op != FIP_OP_LS)
   256  return 0;
   257  
   258  if (sub != FIP_SC_REP)
   259  return 0;
   260  
   261  rlen = ntohs(fiph->fip_dl_len) * 4;
   262  if (rlen + sizeof(*fiph) > skb->len)
   263  return 0;
   264  
   265  desc = (struct fip_desc *)(fiph + 1);
   266  dlen = desc->fip_dlen * FIP_BPW;

desc->fip_dlen is u8 type so it could be a number between 0-255.  That's
fine.
FIP_BPW is 4.
Here is the bug in Smatch.  It looks at the sender code and says that
desc->fip_dlen is set to either 2 or 4.  So now dlen is in 8-16 range.

   267  
   268  if (desc->fip_dtype == FIP_DT_FLOGI) {
   269  
   270  shost_printk(KERN_DEBUG, lport->host,
   271" FIP TYPE FLOGI: fab name:%llx "
   272"vfid:%d map:%x\n",
   273fip->sel_fcf->fabric_name, fip->sel_fcf->vfid,
   274fip->sel_fcf->fc_map);
   275  if (dlen < sizeof(*els) + sizeof(*fh) + 1)
^
8-16 is always less than 29 so this condition is always true.

   276  return 0;
   277  
   278  els_len = dlen - sizeof(*els);
^
We are in an impossible condition and Smatch doesn't know the possible
values of "dlen" any more so it generates the warning.

   279  els = (struct fip_encaps *)desc;
   280  fh = (struct fc_frame_header *)(els + 1);
   281  els_dtype = desc->fip_dtype;
   282  
   283  if (!fh)
   284  return 0;
   285  
   286  /*
   287   * ELS command code, reason and explanation should be = 
Reject,
   288   * unsupported command and insufficient resource
   289   */
   290  els_op = *(u8 *)(fh + 1);
   291  if (els_op == ELS_LS_RJT) {
   292  shost_printk(KERN_INFO, lport->host,
   293"Flogi Request Rejected by Switch\n");
   294  return 1;
   295  }
   296  shost_printk(KERN_INFO, lport->host,
   297  "Flogi Request Accepted by Switch\n");
   298  }
   299  return 0;
   300  }
   301  
   302  static void fnic_fcoe_send_vlan_req(struct fnic *fnic)
   303  {
   304  struct fcoe_ctlr *fip = &fnic->ctlr;
   305  struct fnic_stats *fnic_stats = &fnic->fnic_stats;
   306  struct sk_buff *skb;
   307  char *eth_fr;
   308  int fr_len;
   309  struct fip_vlan *vlan;
   310  u64 vlan_tov;
   311  
   312  fnic_fcoe_reset_vlans(fnic);
   313  fnic->set_vlan(fnic, 0);
   314  FNIC_FCS_DBG(KERN_INFO, fnic->lport->host,
   315"Sending VLAN request...\n");
   316  skb = dev_alloc_skb(sizeof(struct fip_vlan));
   317  if (!skb)
   318  return;
   319  
   320  fr_len = sizeof(*vlan);
   321  eth_fr = (char *)skb->data;
   322  vlan = (struct fip_vlan *)eth_fr;
   323  
   324  memset(vlan, 0, sizeof(*vlan));
   325  memcpy(vlan->eth.h_source, fip->ctl_src_addr, ETH_ALEN);
   326  memcpy(vlan->eth.h_dest, fcoe_all_fcfs, ETH_ALEN);
   327  vlan->eth.h_proto = htons(ETH_P_FIP);
   328  
   329  vlan->fip.fip_ver = FIP_VER_ENCAPS(FIP_VER);
   330  vlan->fip.fip_op = htons(FIP_OP_VLAN);
   331  vlan->fip.fip_subcode = FIP_SC_VL_REQ;
   332  vlan->fip.fip_dl_len = htons(sizeof(vlan->desc) / FIP_BPW);
   333  
   334  vlan->desc.mac.fd_desc.fip_dtype = FIP_DT_MAC;
   335  vlan->desc.mac.fd_desc.fip_dlen = sizeof(vlan->desc.mac) / 
FIP_BPW;

8 / 4 = 2.  We are setting ->fib_dlen to 2.

   336  memcpy(&vlan->desc.mac.fd_mac, fip->ctl_

RE: [patch] [SCSI] megaraid: missing bounds check in mimd_to_kioc()

2014-01-08 Thread Saxena, Sumit


>-Original Message-
>From: Dan Carpenter [mailto:dan.carpen...@oracle.com]
>Sent: Wednesday, October 30, 2013 10:44 PM
>To: DL-MegaRAID Linux
>Cc: James E.J. Bottomley; linux-scsi@vger.kernel.org; secur...@kernel.org;
>Nico Golde; Fabian Yamaguchi
>Subject: [patch] [SCSI] megaraid: missing bounds check in mimd_to_kioc()
>
>pthru32->dataxferlen comes from the user so we need to check that it's
>not too large so we don't overflow the buffer.
>
>Reported-by: Nico Golde 
>Reported-by: Fabian Yamaguchi 
>Signed-off-by: Dan Carpenter 
>---
>Please review this carefully because I have not tested it.
>
>diff --git a/drivers/scsi/megaraid/megaraid_mm.c
>b/drivers/scsi/megaraid/megaraid_mm.c
>index dfffd0f..a706927 100644
>--- a/drivers/scsi/megaraid/megaraid_mm.c
>+++ b/drivers/scsi/megaraid/megaraid_mm.c
>@@ -486,6 +486,8 @@ mimd_to_kioc(mimd_t __user *umimd,
>mraid_mmadp_t *adp, uioc_t *kioc)
>
>   pthru32->dataxferaddr   = kioc->buf_paddr;
>   if (kioc->data_dir & UIOC_WR) {
>+  if (pthru32->dataxferlen > kioc->xferlen)
>+  return -EINVAL;
>   if (copy_from_user(kioc->buf_vaddr, kioc->user_data,
>   pthru32->dataxferlen)) {
>   return (-EFAULT);

Acked-by: Sumit Saxena 

Sumit
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH] megaraid_sas: make the log message about rescanning more informative.

2014-01-08 Thread Saxena, Sumit


>-Original Message-
>From: Jasper Spaans [mailto:spa...@fox-it.com]
>Sent: Friday, October 25, 2013 5:53 PM
>To: DL-MegaRAID Linux
>Cc: linux-scsi@vger.kernel.org
>Subject: [PATCH] megaraid_sas: make the log message about rescanning
>more informative.
>
>Hi,
>
>I was working on one of our servers, and saw the following interesting
>message in dmesg:
>
>[20961024.720972] scanning ...
>
>This scared me somewhat. Please find a patch to make it easier to see where
>this message is coming from.
>
>Cheers,
>Jasper
>
>---
> drivers/scsi/megaraid/megaraid_sas_base.c |2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
>diff --git a/drivers/scsi/megaraid/megaraid_sas_base.c
>b/drivers/scsi/megaraid/megaraid_sas_base.c
>index 1f0ca68..3457289 100644
>--- a/drivers/scsi/megaraid/megaraid_sas_base.c
>+++ b/drivers/scsi/megaraid/megaraid_sas_base.c
>@@ -5461,7 +5461,7 @@ megasas_aen_polling(struct work_struct *work)
>   }
>
>   if (doscan) {
>-  printk(KERN_INFO "scanning ...\n");
>+  printk(KERN_INFO "Rescanning MegaRAID devices ...\n");
>   megasas_get_pd_list(instance);
>   for (i = 0; i < MEGASAS_MAX_PD_CHANNELS; i++) {
>   for (j = 0; j < MEGASAS_MAX_DEV_PER_CHANNEL;
>j++) {
>--
>1.7.9.5
Acked-by: Sumit Saxena 

Sumit
>
>
>--
> /\/\   ir. Jasper Spaans // Lead Developer www.FoxDetACT.com
> \   (_)/   Fox-IT - For a more secure society!
>  \XT: +31-15-2847999
>   \  / \   M: +31-6-41588725
>\/  KvK Haaglanden 27301624

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH] [SCSI] megaraid: use GFP_ATOMIC under spin lock

2014-01-08 Thread Saxena, Sumit


>-Original Message-
>From: Wei Yongjun [mailto:weiyj...@gmail.com]
>Sent: Friday, December 20, 2013 8:38 AM
>To: DL-MegaRAID Linux; jbottom...@parallels.com; simon.pu...@gmail.com;
>jkos...@suse.cz
>Cc: yongjun_...@trendmicro.com.cn; linux-scsi@vger.kernel.org
>Subject: [PATCH] [SCSI] megaraid: use GFP_ATOMIC under spin lock
>
>From: Wei Yongjun 
>
>A spin lock is taken here so we should use GFP_ATOMIC.
>
>Signed-off-by: Wei Yongjun 
>---
> drivers/scsi/megaraid/megaraid_mm.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
>diff --git a/drivers/scsi/megaraid/megaraid_mm.c
>b/drivers/scsi/megaraid/megaraid_mm.c
>index a706927..99fa5d3 100644
>--- a/drivers/scsi/megaraid/megaraid_mm.c
>+++ b/drivers/scsi/megaraid/megaraid_mm.c
>@@ -570,7 +570,7 @@ mraid_mm_attach_buf(mraid_mmadp_t *adp, uioc_t
>*kioc, int xferlen)
>
>   kioc->pool_index= right_pool;
>   kioc->free_buf  = 1;
>-  kioc->buf_vaddr = pci_pool_alloc(pool->handle, GFP_KERNEL,
>+  kioc->buf_vaddr = pci_pool_alloc(pool->handle,
>GFP_ATOMIC,
>   &kioc->buf_paddr);
>   spin_unlock_irqrestore(&pool->lock, flags);
>
>
Acked-by: Sumit Saxena 

Sumit

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [SCSI] fnic: FIP VLAN Discovery Feature Support

2014-01-08 Thread Dan Carpenter
Hiral isn't at Cisco any more.

regards,
dan carpenter

On Wed, Jan 08, 2014 at 12:57:41PM +0300, Dan Carpenter wrote:
> Hello Hiral Patel,
> 
> The patch d3c995f1dcf9: "[SCSI] fnic: FIP VLAN Discovery Feature
> Support" from Feb 25, 2013, leads to the following
> static checker warning:
> 
>   drivers/scsi/fnic/fnic_fcs.c:278 is_fnic_fip_flogi_reject()
>   warn: is it ok to set 'els_len' to negative?
> 
> This is some unpushable debug code on my Smatch system.  It is sort of
> a nonsense warning because Smatch has run into an "impossible" condition
> because of a bug in Smatch but also because of bug in
> fnic_fcoe_send_vlan_req().
> 
> drivers/scsi/fnic/fnic_fcs.c
>251  fiph = (struct fip_header *)skb->data;
> ^
> fiph is untrusted data.  We don't trust the network.  This is a rule of
> Linux.
> 
>252  op = ntohs(fiph->fip_op);
>253  sub = fiph->fip_subcode;
>254  
>255  if (op != FIP_OP_LS)
>256  return 0;
>257  
>258  if (sub != FIP_SC_REP)
>259  return 0;
>260  
>261  rlen = ntohs(fiph->fip_dl_len) * 4;
>262  if (rlen + sizeof(*fiph) > skb->len)
>263  return 0;
>264  
>265  desc = (struct fip_desc *)(fiph + 1);
>266  dlen = desc->fip_dlen * FIP_BPW;
> 
> desc->fip_dlen is u8 type so it could be a number between 0-255.  That's
> fine.
> FIP_BPW is 4.
> Here is the bug in Smatch.  It looks at the sender code and says that
> desc->fip_dlen is set to either 2 or 4.  So now dlen is in 8-16 range.
> 
>267  
>268  if (desc->fip_dtype == FIP_DT_FLOGI) {
>269  
>270  shost_printk(KERN_DEBUG, lport->host,
>271" FIP TYPE FLOGI: fab name:%llx "
>272"vfid:%d map:%x\n",
>273fip->sel_fcf->fabric_name, 
> fip->sel_fcf->vfid,
>274fip->sel_fcf->fc_map);
>275  if (dlen < sizeof(*els) + sizeof(*fh) + 1)
> ^
> 8-16 is always less than 29 so this condition is always true.
> 
>276  return 0;
>277  
>278  els_len = dlen - sizeof(*els);
> ^
> We are in an impossible condition and Smatch doesn't know the possible
> values of "dlen" any more so it generates the warning.
> 
>279  els = (struct fip_encaps *)desc;
>280  fh = (struct fc_frame_header *)(els + 1);
>281  els_dtype = desc->fip_dtype;
>282  
>283  if (!fh)
>284  return 0;
>285  
>286  /*
>287   * ELS command code, reason and explanation should be 
> = Reject,
>288   * unsupported command and insufficient resource
>289   */
>290  els_op = *(u8 *)(fh + 1);
>291  if (els_op == ELS_LS_RJT) {
>292  shost_printk(KERN_INFO, lport->host,
>293"Flogi Request Rejected by 
> Switch\n");
>294  return 1;
>295  }
>296  shost_printk(KERN_INFO, lport->host,
>297  "Flogi Request Accepted by Switch\n");
>298  }
>299  return 0;
>300  }
>301  
>302  static void fnic_fcoe_send_vlan_req(struct fnic *fnic)
>303  {
>304  struct fcoe_ctlr *fip = &fnic->ctlr;
>305  struct fnic_stats *fnic_stats = &fnic->fnic_stats;
>306  struct sk_buff *skb;
>307  char *eth_fr;
>308  int fr_len;
>309  struct fip_vlan *vlan;
>310  u64 vlan_tov;
>311  
>312  fnic_fcoe_reset_vlans(fnic);
>313  fnic->set_vlan(fnic, 0);
>314  FNIC_FCS_DBG(KERN_INFO, fnic->lport->host,
>315"Sending VLAN request...\n");
>316  skb = dev_alloc_skb(sizeof(struct fip_vlan));
>317  if (!skb)
>318  return;
>319  
>320  fr_len = sizeof(*vlan);
>321  eth_fr = (char *)skb->data;
>322  vlan = (struct fip_vlan *)eth_fr;
>323  
>324  memset(vlan, 0, sizeof(*vlan));
>325  memcpy(vlan->eth.h_source, fip->ctl_src_addr, ETH_ALEN);
>326  memcpy(vlan->eth.h_dest, fcoe_all_fcfs, ETH_ALEN);
>327  vlan->eth.h_proto = htons(ETH_P_FIP);
>328  
>329  vlan->fip.fip_ver = FIP_VER_ENCAPS(FIP_VER);
>330  vlan->fip.fip_op = htons(FIP_OP_VLAN);
>331  vlan->fip.fip_subcode =

Re: Terrible performance of sequential O_DIRECT 4k writes in SAN environment. ~3 times slower then Solars 10 with the same HBA/Storage.

2014-01-08 Thread Christoph Hellwig
On Wed, Jan 08, 2014 at 02:17:13AM +0100, Jan Kara wrote:
>   Well, I was specifically worried about i_mutex locking. In particular:
> Before we report appending IO completion we need to update i_size.
> To update i_size we need to grab i_mutex.
> 
> Now this is unpleasant because inode_dio_wait() happens under i_mutex so
> the above would create lock inversion. And we cannot really do
> inode_dio_done() before grabbing i_mutex as that would open interesting
> races between truncate decreasing i_size and DIO increasing it.

Yeah, XFS splits this between the ilock and iolock, which just makes
life in this area a whole lot easier.

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Terrible performance of sequential O_DIRECT 4k writes in SAN environment. ~3 times slower then Solars 10 with the same HBA/Storage.

2014-01-08 Thread Christoph Hellwig
On Tue, Jan 07, 2014 at 08:37:23PM +0200, Sergey Meirovich wrote:
> Actually my initial report (14.67Mb/sec  3755.41 Requests/sec) was about ext4
> However I have tried XFS as well. It was a bit slower than ext4 on all
> occasions.

I wasn't trying to say XFS fixes your problem, but that we could
implement appending AIO writes in XFS fairly easily.

To verify Jan's theory, can you try to preallocate the file to the full
size and then run the benchmark by doing a:

# fallocate -l  

and then run it?  If that's indeed the issue I'd be happy to implement
the "real aio" append support for you as well.

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Terrible performance of sequential O_DIRECT 4k writes in SAN environment. ~3 times slower then Solars 10 with the same HBA/Storage.

2014-01-08 Thread Sergey Meirovich
Hi James,

On 7 January 2014 22:57, James Smart  wrote:
> Sergey,
>
> The Thor chipset is a bit old - a 4Gig adapter.  Most of our performance
> improvements, including parallelization, have gone into the 8G and 16G
> adapters. But you still should have seen significantly beyond what you
> reported.

First of all - thanks a lot!

I took Thor because we have exactly the same Thors in some of our
Solaris servers. I've also tried 6 different qlogics (mostly 8G) and
fnic (10G) as well. Surprisingly enough Thor was the fastest one for
seqwr 4k. Though in most of the cases machines were from our different
DCs and hence each one connected to yet another storage.

>
> We did a sanity check some hardware we already had set up with a Thor
> adapter.  We saw 23555 iop/s and 92.1 MB/s without needing to do much, well
> beyond what you've reported, and still not up to what we know the card can
> do.  There are some inefficiencies from the linux kernel and some locking
> deltas between our solaris and linux drivers - but not enough to account for
> what you are seeing.
>
> I expect the Direct IO filesystem behavior is the root issue.

The strangest thing to me that this is the problem with sequential
write. For example the fnic one machine is zoned to EMC XtremIO and
had results: 14.43Mb/sec 3693.65 Requests/sec for sequential 4k. The
same fnic machine perfrormed rather impressive for random 4k
451.11Mb/sec 115485.02 Requests/sec
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] drivers: target: target_core_mod: use div64_u64_rem() instead of operator '%' for u64

2014-01-08 Thread Chen Gang
On 01/08/2014 03:32 PM, Hannes Reinecke wrote:
> On 12/24/2013 04:35 AM, Chen Gang wrote:
>> On 12/23/2013 02:51 PM, Nicholas A. Bellinger wrote:
>>> On Sun, 2013-12-22 at 17:17 +0800, Chen Gang wrote:
 On 12/22/2013 10:56 AM, Nicholas A. Bellinger wrote:
> Hi Chen,
>
> On Sat, 2013-12-21 at 10:08 +0800, Chen Gang wrote:
>> In kernel, need use div64_u64_rem() instead of operator '%' for u64, or
>> can not pass compiling (with allmodconfig under metag):
>>
>> MODPOST 2909 modules
>>   ERROR: "__umoddi3" [drivers/target/target_core_mod.ko] undefined!
>>
>> Also need u64 type cast for u32 variable multiply u32 variable, or will
>> cause type overflow issue.
>>
>>
>> Signed-off-by: Chen Gang 
>> ---
>>  drivers/target/target_core_alua.c | 3 ++-
>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>
>
> FYI, this unsigned long long division in core_alua_state_lba_dependent()
> was fixed for 32-bit in linux-next >= 12192013 code.
>

 OK, thanks.

 The related fix patch changed "start_lba = lba % ..." to "start_lba =
 lba / ...", and also assumed "segment_size * segment_mult" is still
 within u32 (can not cause type over flow).

 I guess the original author already knew about them, and intended to do
 like that (if not, please let me know, thanks).

>>>
>>> Sorry, your correct that the original code is using modulo division to
>>> calculate start_lba.
>>>
>>
>> Oh, that's all right, (in fact, don't need sorry), I am not quite
>> familiar with the details, so need related member help check it.  :-)
>>
>>> Hannes, can you please double check this below..?
>>>
>>
>> Please help check when have time, thanks.
>>
> I would even convert segment_size and segment_mult to u64,
> to ensure no overflows occur:
> 
> diff --git a/drivers/target/target_core_alua.c
> b/drivers/target/target_core_alua
> .c
> index 9b1856d..54b1e52 100644
> --- a/drivers/target/target_core_alua.c
> +++ b/drivers/target/target_core_alua.c
> @@ -477,8 +477,7 @@ static inline int core_alua_state_lba_dependent(
> u8 *alua_ascq)
>  {
> struct se_device *dev = cmd->se_dev;
> -   u32 segment_size, segment_mult, sectors;
> -   u64 lba;
> +   u64 segment_size, segment_mult, sectors, lba;
> 
> /* Only need to check for cdb actually containing LBAs */
> if (!(cmd->se_cmd_flags & SCF_SCSI_DATA_CDB))
> 
> 

OK, thanks.

> Other than that the sector_div() patch is correct.
> 

So we really need use '/' instead of original '%'?


Thanks
-- 
Chen Gang

Open, share and attitude like air, water and life which God blessed
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Terrible performance of sequential O_DIRECT 4k writes in SAN environment. ~3 times slower then Solars 10 with the same HBA/Storage.

2014-01-08 Thread Sergey Meirovich
Hi Christoph,

On 8 January 2014 16:03, Christoph Hellwig  wrote:
> On Tue, Jan 07, 2014 at 08:37:23PM +0200, Sergey Meirovich wrote:
>> Actually my initial report (14.67Mb/sec  3755.41 Requests/sec) was about ext4
>> However I have tried XFS as well. It was a bit slower than ext4 on all
>> occasions.
>
> I wasn't trying to say XFS fixes your problem, but that we could
> implement appending AIO writes in XFS fairly easily.
>
> To verify Jan's theory, can you try to preallocate the file to the full
> size and then run the benchmark by doing a:
>
> # fallocate -l  
>
> and then run it?  If that's indeed the issue I'd be happy to implement
> the "real aio" append support for you as well.
>

After fallocate:
[root@illin01 ext4]# du -k test_file.* | awk '{print $1}' |sort |uniq
81920
[root@illin01 ext4]# fallocate -l 81920k test_file.*

Results are almost the same:
14.68Mb/sec  3758.02 Requests/sec
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: status of block-integrity

2014-01-08 Thread Martin K. Petersen
> "Hannes" == Hannes Reinecke  writes:

Hannes,

Hannes> As there is no user (apart from oracleasm) no-one can attach
Hannes> protection information to any data, so even the most dedicated
Hannes> admin cannot exercise this path, let alone find issues here.

That's not how it works!

If the filesystem has not attached protection information to a bio the
block layer will do it for you. The block layer generates protection
information for writes and verifies it for reads. That's how it's worked
since day one. The code is there, it is used by everyone with a
DIX-capable HBA. See Documentation/block/data-integrity.txt.

Normal applications do not want to have to deal with generating
protection information, using an async I/O model, keeping completion
state around for extended periods of time to figure out whether the I/O
actually completed or not and so on. So the kernel-to-platter protection
scheme we have in place now is good enough.

That doesn't mean that I'm not interested in augmenting libaio. I
am. Very. And I know of several applications that are keen to use
it. But getting page cache passthrough and filesystem interaction
working is non-trivial. That's what has inhibited progress, not
extending the libaio API.

Hannes> Doug Gilbert and I are currently discussing LID4 / ROD Token
Hannes> copy for sg3_utils and the block layer, so any patches would be
Hannes> very helpful here.

I'm only doing LID1 right now. Any particular reason you are exploring
LID4 and ROD?

I resumed my efforts before Christmas but I keep running into issues.
I'm guessing I'm a week or two from having something that is suitable
for consumption.

-- 
Martin K. Petersen  Oracle Linux Engineering
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Terrible performance of sequential O_DIRECT 4k writes in SAN environment. ~3 times slower then Solars 10 with the same HBA/Storage.

2014-01-08 Thread Christoph Hellwig
On Wed, Jan 08, 2014 at 04:43:07PM +0200, Sergey Meirovich wrote:
> Results are almost the same:
> 14.68Mb/sec  3758.02 Requests/sec
>

On my laptop SSD I get the following results (sometimes up to 200MB/s,
sometimes down to 100MB/s, always in the 40k to 50k IOps range):

time elapsed (sec.):5
bandwidth (MiB/s):  160.00
IOps:   40960.00

The IOps are more than the hardware is physically capable of, but given
that you didn't specify O_SYNC this seems sensible given that we never
have to flush the disk cache.

Could it be that your array has WCE=0?  In Linux we'll never enable the
cache automatically, but Solaris does at least when using ZFS.  Try
running:

   sdparm --set=WCE /dev/sdX

and try again.
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: status of block-integrity

2014-01-08 Thread Martin K. Petersen
> "James" == James Bottomley  writes:

James> No, I think you're confusing algorithms with protocols.  DIF and
James> DIX are two names for protection envelopes.  DIF verifies
James> integrity from the HBA to the device surface.  DIX verifies
James> integrity from an application to the HBA. 

Actually, DIX is a data integrity-aware HBA programming interface. We
have an implementation of that interface in the SCSI layer and in some
of the initiator drivers (lpfc, qla2xxx, mptNsas).

There is no single name for stuff above DIX. Other than "block layer
data integrity goo", "page cache black magic" and "let's add a few
fields to struct iocb".

James> So, the question is do we need to bother with DIX at all?  No
James> filesystem uses it 

...explicitly. Every filesystem uses it implicitly. There are only two
reasons for filesystems to want to be explicitly "block layer data
integrity goo"-aware:

 1. To be able to use the application tag space for back pointers or
other metadata without requiring disk format changes.

 2. To facilitate passthrough of protection information submitted
via the $TBD application programming interface.

I was hoping the extN folks would be interested in (1) but there were no
takers. (2) is hard but not forgotten. In any case the status quo is
that there is no point in filesystems manually generating protection
information when the block layer is going to do it for them when the bio
is submitted.

-- 
Martin K. Petersen  Oracle Linux Engineering
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC 2/2] dual scan thread bug fix

2014-01-08 Thread Alan Stern
On Wed, 8 Jan 2014, James Bottomley wrote:

> OK, Agreed, but that means modifying the 1/2 patch with the below.  This
> should make the proposed diff to 2/2 correct.
> 
> James
> 
> ---
> diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
> index ef3f958..5fad646 100644
> --- a/drivers/scsi/scsi_scan.c
> +++ b/drivers/scsi/scsi_scan.c
> @@ -417,7 +417,7 @@ static struct scsi_target *scsi_alloc_target(struct 
> device *parent,
>   + shost->transportt->target_size;
>   struct scsi_target *starget;
>   struct scsi_target *found_target;
> - int error;
> + int error, ref_got;
>  
>   starget = kzalloc(size, GFP_KERNEL);
>   if (!starget) {
> @@ -466,15 +466,15 @@ static struct scsi_target *scsi_alloc_target(struct 
> device *parent,
>   return starget;
>  
>   found:
> - if (!kref_get_unless_zero(&found_target->reap_ref))
> - /*
> -  * release routine already fired.  Target is dead, but
> -  * STARGET_DEL may not yet be set (set in the release
> -  * routine), so set here as well, just in case
> -  */
> - found_target->state = STARGET_DEL;
> + /*
> +  * release routine already fired if kref is zero, so if we can still
> +  * take the reference, the target must be alive.  If we can't, it must
> +  * be dying and we need to wait for a new target
> +  */
> + ref_got = kref_get_unless_zero(&found_target->reap_ref);
> +
>   spin_unlock_irqrestore(shost->host_lock, flags);
> - if (found_target->state != STARGET_DEL) {
> + if (ref_got) {
>   put_device(dev);
>   return found_target;
>   }
> @@ -482,8 +482,8 @@ static struct scsi_target *scsi_alloc_target(struct 
> device *parent,
>* Unfortunately, we found a dying target; need to wait until it's
>* dead before we can get a new one.  There is an anomaly here.  We
>* *should* call scsi_target_reap() to balance the kref_get() of the
> -  * reap_ref above.  However, since the target is in state STARGET_DEL,
> -  * it's already invisible and the reap_ref is irrelevant.  If we call
> +  * reap_ref above.  However, since the target being released, it's
> +  * already invisible and the reap_ref is irrelevant.  If we call
>* scsi_target_reap() we might spuriously do another device_del() on
>* an already invisible target.
>*/

In fact, most of this comment (everything after the first sentence) is
no longer needed.  If the target is dying then kref_get_unless_zero
must have failed, so there is no need to worry about unbalanced
refcounts.

Otherwise this looks good.

Alan Stern

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH 3.12 033/118] usb: xhci: Link TRB must not occur within a USB payload burst

2014-01-08 Thread David Laight
> From: Sarah Sharp
> On Tue, Jan 07, 2014 at 03:57:00PM -0800, walt wrote:
> > On 01/07/2014 01:21 PM, Sarah Sharp wrote:
> >
> > > Can you please try the attached patch, on top of the previous three
> > > patches, and send me dmesg?
> >
> > Hi Sarah, I just now finished running 0001-More-debugging.patch for the
> > first time.  The previous dmesg didn't include that patch, but this one
> > does.
> >
> > I read through this dmesg but I nodded off somewhere around line 500.
> > I hope you can stay awake :)
> 
> Well, it has all the info I need, but the results don't make me too
> happy.  Everything I've checked seems consistent, and I don't know why
> the host stopped.  The link TRBs are intact, the dequeue pointer for the
> endpoint was pointing to the transfer that timed out and it had the
> cycle bit set correctly, etc.  Perhaps the no-op TRBs are really the
> issue.
> 
> I'll have to take a look at the log again tomorrow.  I posted the dmesg
> on pastebin if David wants to check it out as well:
> http://pastebin.com/a4AUpsL1

I can't see anything obvious either.
However there is no response to the 'stop endpoint' command.
Section 4.6.9 (page 107 of rev1.0) states that the controller will complete
any USB IN or OUT transaction before raising the command completion event.
Possibly it is too 'stuck' to complete the transaction?

The endpoint status is also still '1' (running).
This also means that the 'TR dequeue pointer' is undefined - so the
controller could easily be processing a later TRB.
This field might even still contain the ring base address written by
the driver much earlier.

This might mean that something 'catastrophic' has happened earlier.
Maybe the controller isn't actually seeing any doorbell writes at all.
Maybe the base addresses it has for the rings have all got corrupted.
At least this looks like amd64 - so there aren't memory coherency issues.

Some hacks that might help isolate the problem:
1) Request an interrupt from the last nop data TRB.
2) Put a command nop (decimal 23) TRB into the command ring before
   the 'stop endpoint'.
3) Comment out the code that adds the nop data TRBs.
The first two might need code adding to handle the responses.

Do we know the actual xhci device?
I think it reports version 0x96.
(Sarah - it might be useful if that version were in one of the trace
messages that is output by default.)

David


--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 3.12 033/118] usb: xhci: Link TRB must not occur within a USB payload burst

2014-01-08 Thread Alan Stern
On Tue, 7 Jan 2014, Sarah Sharp wrote:

> On Tue, Jan 07, 2014 at 03:57:00PM -0800, walt wrote:
> > On 01/07/2014 01:21 PM, Sarah Sharp wrote:
> > 
> > > Can you please try the attached patch, on top of the previous three
> > > patches, and send me dmesg?
> > 
> > Hi Sarah, I just now finished running 0001-More-debugging.patch for the
> > first time.  The previous dmesg didn't include that patch, but this one
> > does.
> > 
> > I read through this dmesg but I nodded off somewhere around line 500.
> > I hope you can stay awake :)
> 
> Well, it has all the info I need, but the results don't make me too
> happy.  Everything I've checked seems consistent, and I don't know why
> the host stopped.  The link TRBs are intact, the dequeue pointer for the
> endpoint was pointing to the transfer that timed out and it had the
> cycle bit set correctly, etc.  Perhaps the no-op TRBs are really the
> issue.

This may be a foolish question, but why is xhci-hcd using no-op TRBs in 
the first place?

It makes sense that they would be needed if you have to unlink an URB 
that isn't the first one on the endpoint ring.  But usb-storage never 
does that; whenever it unlinks an URB, it always unlinks the earliest 
entry in the endpoint's queue.

After unlinking the first URB on the ring, you don't need to fill in
its TRBs with no-ops.  Instead, when you are ready to start the ring
againk, just tell the host controller to move the dequeue pointer up to
the start of the next surviving URB.  (You'll also have to adjust the
CYCLE bits of the TRBs that get skipped over, but that's trivial.)

Alan Stern

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH 3.12 033/118] usb: xhci: Link TRB must not occur within a USB payload burst

2014-01-08 Thread David Laight
> From: Alan Stern
> 
> This may be a foolish question, but why is xhci-hcd using no-op TRBs in
> the first place?

Because it can't write in a link TRB because other parts of the
code use link TRBs to detect the end of the ring.

The problem is that it can't put a link TRB in the middle of
a chain of data fragments unless it is at a 'suitable' offset
from the start of the data TD. Given arbitrary input fragmentation
this means that you can't put a link TRB in the middle of a TD.
(The documented alignment might be as high as 16kB.)

If the rest of the code used a 'ring end pointer' then a link TRB
could be used instead.

David



--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH 3.12 033/118] usb: xhci: Link TRB must not occur within a USB payload burst

2014-01-08 Thread Alan Stern
On Wed, 8 Jan 2014, David Laight wrote:

> > From: Alan Stern
> > 
> > This may be a foolish question, but why is xhci-hcd using no-op TRBs in
> > the first place?
> 
> Because it can't write in a link TRB because other parts of the
> code use link TRBs to detect the end of the ring.
> 
> The problem is that it can't put a link TRB in the middle of
> a chain of data fragments unless it is at a 'suitable' offset
> from the start of the data TD. Given arbitrary input fragmentation
> this means that you can't put a link TRB in the middle of a TD.
> (The documented alignment might be as high as 16kB.)
> 
> If the rest of the code used a 'ring end pointer' then a link TRB
> could be used instead.

I see.  Sounds like a poor design decision in hindsight.  Can it be
changed?

Alan Stern

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Terrible performance of sequential O_DIRECT 4k writes in SAN environment. ~3 times slower then Solars 10 with the same HBA/Storage.

2014-01-08 Thread Sergey Meirovich
On 8 January 2014 17:26, Christoph Hellwig  wrote:
>
> On my laptop SSD I get the following results (sometimes up to 200MB/s,
> sometimes down to 100MB/s, always in the 40k to 50k IOps range):
>
> time elapsed (sec.):5
> bandwidth (MiB/s):  160.00
> IOps:   40960.00

Any direct attached storage I've tried was faster for me as well,
indeed. I have already posted IIRC
"06:00.0 RAID bus controller: LSI Logic / Symbios Logic MegaRAID SAS
2208 [Thunderbolt] (rev 05)"   - 1Gb BBU RAM
sysbench seqwr aio 4k: 326.24Mb/sec 20879.56 Requests/sec

That is good that you mentioned SSD. I've tried fnic HBA zoned to EMC
XtremIO (SSD only based storage)
 14.43Mb/sec 3693.65 Requests/sec for sequential 4k.

So far I've seen so massive degradation only in SAN environment. I
started my investigation with RHEL6.5 kernel so below table is from it
but the trend is the same as for mainline it seems.

Chunk size Bandwidth MiB/s

64M512
32M510
16M492
8M  451
4M  436
2M  350
1M  256
512K   191
256K   165
128K   142
64K 101
32K 65
16K 39
8K   20
4K   11


>
> The IOps are more than the hardware is physically capable of, but given
> that you didn't specify O_SYNC this seems sensible given that we never
> have to flush the disk cache.
>
> Could it be that your array has WCE=0?  In Linux we'll never enable the
> cache automatically, but Solaris does at least when using ZFS.  Try
> running:
>
>sdparm --set=WCE /dev/sdX
>
> and try again.

ZFS is not supporting direct IO so that was UFS. I tried to do sdparm
--set=WCE /dev/sdX on the same fnic/XtremIO however this is multipath
and for second half of 4 LUNs that failed (probably this is normal)
Results have not changed much: 13.317Mb/sec 3409.26 Requests/sec


[root@dca-poc-gtsxdb3 mnt]# multipath -ll
mpathb (3514f0c5c11a0002d) dm-0 XtremIO,XtremApp
size=50G features='0' hwhandler='0' wp=rw
`-+- policy='round-robin 0' prio=1 status=active
  |- 0:0:4:1 sdg 8:96  active ready running
  |- 0:0:5:1 sdh 8:112 active ready running
  |- 1:0:4:1 sdo 8:224 active ready running
  `- 1:0:5:1 sdp 8:240 active ready running

[root@dca-poc-gtsxdb3 mnt]# sdparm --set=WCE /dev/sdg
/dev/sdg: XtremIO   XtremApp  1.05
[root@dca-poc-gtsxdb3 mnt]# sdparm --set=WCE /dev/sdh
/dev/sdh: XtremIO   XtremApp  1.05
[root@dca-poc-gtsxdb3 mnt]# sdparm --set=WCE /dev/sdo
/dev/sdo: XtremIO   XtremApp  1.05
mode sense command failed, unit attention
change_mode_page: failed fetching page: Caching (SBC)
# ^
[root@dca-poc-gtsxdb3 mnt]# sdparm --set=WCE /dev/sdp
/dev/sdp: XtremIO   XtremApp  1.05
mode sense command failed, unit attention
change_mode_page: failed fetching page: Caching (SBC)
# ^
[root@dca-poc-gtsxdb3 mnt]#
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH 3.12 033/118] usb: xhci: Link TRB must not occur within a USB payload burst

2014-01-08 Thread David Laight
> From: Alan Stern
> On Wed, 8 Jan 2014, David Laight wrote:
> 
> > > From: Alan Stern
> > >
> > > This may be a foolish question, but why is xhci-hcd using no-op TRBs in
> > > the first place?
> >
> > Because it can't write in a link TRB because other parts of the
> > code use link TRBs to detect the end of the ring.
> >
> > The problem is that it can't put a link TRB in the middle of
> > a chain of data fragments unless it is at a 'suitable' offset
> > from the start of the data TD. Given arbitrary input fragmentation
> > this means that you can't put a link TRB in the middle of a TD.
> > (The documented alignment might be as high as 16kB.)
> >
> > If the rest of the code used a 'ring end pointer' then a link TRB
> > could be used instead.
> 
> I see.  Sounds like a poor design decision in hindsight.  Can it be
> changed?

Anything can be changed :-)
But it is a bit pervasive.
Padding out with nops isolated the change.

David



--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: REQ_PM vs REQ_TYPE_PM_RESUME

2014-01-08 Thread Alan Stern
On Tue, 7 Jan 2014, Phillip Susi wrote:

> -BEGIN PGP SIGNED MESSAGE-
> Hash: SHA512
> 
> On 01/07/2014 02:18 PM, Alan Stern wrote:
> > I don't know how you manually spin down your disk, but one
> > reasonable way to do it is to set the autosuspend timeout to 0.  If
> > you use this approach and apply my patch, the disk should remain
> > spun down when the system resumes.
> 
> The traditional method before the recently added block pm feature was
> with hdparm -y.

A real problem here is that hdparm does not interact with the runtime 
PM part of the kernel.  They work independently and can interfere with 
each other.

> > Right.  If you hadn't put the whole system to sleep, the disk
> > would have gone into runtime suspend 2 minutes later (assuming you
> > didn't use it in the meantime).  Whereas since you did put the
> > whole system to sleep, the disk will go into runtime suspend 5
> > minutes after the system wakes up -- not 2 minutes later.  I.e.,
> > the timer has been reset from 2 minutes back to 5.
> 
> What resets the timer?  It should only be reset when IO is done.  And

I/O _was_ done.  To spin up the disk requires sending it a START 
command.  That's I/O.

Okay, you may not think that is a valid argument.  Still, the timer has
to be set to _something_, and during the resume we don't know how much
time was remaining as of the suspend.  (I suppose we could store this 
information when the suspend occurs, but currently we don't.)

> yes, it is exactly that wakeup and turn around and suspend again that
> I'm trying to avoid; it puts unnecessary wear and tear on the disk.

In essence, you want your disk's state to move directly from unpowered
(system asleep) to runtime-suspended with no intermediate active -- but
only if the disk doesn't spin up all by itself.

One problem: How can the kernel tell whether the disk has spun up by
itself?  I don't think it can be done at the SCSI level.  Can it be
done at the ATA level (and without using the SCSI stack, as that would
require the disk to go out of runtime suspend and spin up anyway)?


> > How is the kernel supposed to know that you finished using the
> > disk before suspending the system?  After all, by setting the
> > autosuspend timeout to 5 minutes, you have already told the kernel
> > to keep the disk spun up if it has been used any time in the last 5
> > minutes, and you used it only 3 minutes ago.  As far as the kernel
> > can tell, you still want the disk to be spun up and ready for use.
> > That remains true at the time of the system resume.
> 
> That's the whole point: it doesn't know.  What it does know is that
> the disk is spun down on resume, and it has no reason to believe that
> you will need it again soon.

Actually, it does have a reason.  Namely, you told it earlier to assume 
the disk will be needed again if it was used within the last 5 
minutes.  At the time of the system resume, the disk was last used 3 
minutes ago (not counting the time the system was asleep).

>  This is especially true given that your
> typical single ATA disk system will spin up the drive on its own
> anyhow, so on systems with multiple scsi or ATA disks that explicitly
> have been set to power up in standby, it is at least an even bet that
> you won't be touching them all soon.

Are you assuming these disks were all runtime-active before the system 
sleep?  If so, the kernel is justified in assuming that you will be 
touching them all soon.

> > You're forgetting that the same sd driver manages other types of 
> > devices than disk drives.  Card readers and USB flash drives have
> > no heads to retract and no spinning platters.  They don't need
> > START STOP commands (in fact, some of them probably would crash if
> > they received such a command).
> 
> They are irrelevant since they ignore the command anyhow.

Not the ones that crash when they receive it.  But this point is moot
since manage_start_stop doesn't do what you want anyway.

> > Now I'm really getting confused.  Here, you are saying that ATA
> > disks will always spin up, all by themselves, whenever the system
> > resumes, and there's nothing the kernel can do to prevent it.  But
> > earlier you wrote:
> 
> No, I'm saying some ( most in fact ) do, and some do not.  Both cases
> need to be handled.  Runtime pm must not indicate the disk is
> suspended when it spins up on its own after power on, which ATA disks
> do by default.  Oddly, the SCSI standard says that SCSI disks can be
> configured behave this way as well rather than requiring an explicit
> start command, though it doesn't say how and I've not heard of a way
> to do this.  I suppose this also applies to the USB flash drives and
> SD card readers you mentioned that are managed by sd.

No, it doesn't.  They don't spin, so they can't be configured either to 
spin up or not to spin up after power-on.

> > Isn't this a contradiction?  Or are you making a distinction
> > between ATA and non-ATA disks?
> 
> What?  I have some disk drive

Re: REQ_PM vs REQ_TYPE_PM_RESUME

2014-01-08 Thread Alan Stern
On Wed, 8 Jan 2014, Alan Stern wrote:

> In essence, you want your disk's state to move directly from unpowered
> (system asleep) to runtime-suspended with no intermediate active -- but
> only if the disk doesn't spin up all by itself.

> In the end, it sounds like what you want can be done fairly easily.  

It turns out there is a snag.  The PM core is set up such that during
system sleep transitions, there's no way to tell whether or not
userspace has permitted runtime PM for a device.  (This is because the
core wants to prevent untimely runtime suspends from occurring in the
middle of a system sleep transition, and to do so it uses the same
mechanism as it does when userspace disallows runtime suspend.)

As a result, we must not go directly from unpowered to
runtime-suspended.  Not without some sort of change to the PM core.

Alan Stern

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: REQ_PM vs REQ_TYPE_PM_RESUME

2014-01-08 Thread Phillip Susi
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 1/8/2014 2:00 AM, Aaron Lu wrote:
> Then I would say we just keep the pm_request_resume call in their
> system resume callback. For drives that have that cool feature
> turned on, it will be in runtime active state while it's actually
> in standby mode, not

No, it won't since the runtime resume starts the drive.


-BEGIN PGP SIGNATURE-
Version: GnuPG v2.0.17 (MingW32)
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iQEcBAEBAgAGBQJSzadoAAoJEI5FoCIzSKrwQWsIAKt/F+eiYYrxxjx58hFCwLxV
3E+CDiuf+pTyUUqPFgcGjyNrfoBO2nWXiCyg5XZ//pjPwbD4GenwSajgUIvKRw/6
0mDw4bb4CyU9/vqVKwHUHfM2jgsAuK9VFn+1fSomODy6B9ZYX6pDPr3jw5S7kDko
nDaYhBX9L2/AE4oPo2qPLpAPMxiYFD9qIi8LcfW/Ha2Av7AneIyTXEvFZzyeDv2K
WoH/KF59snFz9t4wwwfdqu1l1w93b+Tcn2zCpFBaA63TyrJKq4qpx9/W+INgBGol
HKEyv1vh5UkaVSvffiFE/2b1lR4n0hL7LVMe1egazAlNPcsjXdBZv1amYJ7NltI=
=N8WO
-END PGP SIGNATURE-
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] [SCSI] advansys: change buildtime warning into runtime error

2014-01-08 Thread Paul Bolle
On Tue, 2013-01-29 at 11:20 +0100, Paul Bolle wrote:
> On Mon, 2012-11-05 at 11:58 +0100, Paul Bolle wrote:
> > Building advansys.o triggers this warning:
> > drivers/scsi/advansys.c:71:2: warning: #warning this driver is still 
> > not properly converted to the DMA API [-Wcpp]
> > 
> > This warning can be traced back to a patch called "advansys: add warning
> > and convert #includes" which was included in v2.6.10. That patch also
> > marked the driver as BROKEN.
> > 
> > Commit 4661e3eace2c7b8433476b5bf0ee437ab3c7dfd4 ("[SCSI] advansys
> > driver: limp along on x86") enabled this driver for x86-32. And commit
> > 9d511a4b29de6764931343d03e493f2e04df0271 ("[SCSI] advansys: Changes to
> > work on parisc") enabled this driver for all architectures. But the
> > commit explanation stated:
> > I haven't removed the #warning yet because virt_to_bus/bus_to_virt are
> > only eliminated for narrow boards.  Wide boards need more work.
> > 
> > Five years have passed and, apparently, those wide boards still need
> > more work. So let's change the buildtime warning into a runtime error,
> > only printed for those wide boards. Perhaps that might push the people
> > using those wide boards to convert this driver. And for all others
> > there's now one less buildtime warning to ignore.
> > 
> > Signed-off-by: Paul Bolle 
> > ---
> > Compile tested only. I don't have any AdvanSys SCSI boards (neither
> > narrow nor wide).
> 
> The date of this message suggests I submitted this patch for a warning
> seen while building v3.7-rc4. An identical warning can be seen while
> building v3.8-rc5. What's the status of my patch? Did anyone find some
> time to have a look at it?

I've been carrying this patch locally for over a year. Is there any
chance of someone trying to remove this buildtime warning?

I'm inclined to submit a path to Fedora - my local builds use
their .config as a base - to just disable this driver. It seems that
would increase my chances of finally shutting down this warning.


Paul Bolle

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] [SCSI] advansys: change buildtime warning into runtime error

2014-01-08 Thread Ondrej Zary
On Wednesday 08 January 2014 21:05:14 Paul Bolle wrote:
> On Tue, 2013-01-29 at 11:20 +0100, Paul Bolle wrote:
> > On Mon, 2012-11-05 at 11:58 +0100, Paul Bolle wrote:
> > > Building advansys.o triggers this warning:
> > > drivers/scsi/advansys.c:71:2: warning: #warning this driver is
> > > still not properly converted to the DMA API [-Wcpp]
> > >
> > > This warning can be traced back to a patch called "advansys: add
> > > warning and convert #includes" which was included in v2.6.10. That
> > > patch also marked the driver as BROKEN.
> > >
> > > Commit 4661e3eace2c7b8433476b5bf0ee437ab3c7dfd4 ("[SCSI] advansys
> > > driver: limp along on x86") enabled this driver for x86-32. And commit
> > > 9d511a4b29de6764931343d03e493f2e04df0271 ("[SCSI] advansys: Changes to
> > > work on parisc") enabled this driver for all architectures. But the
> > > commit explanation stated:
> > > I haven't removed the #warning yet because virt_to_bus/bus_to_virt
> > > are only eliminated for narrow boards.  Wide boards need more work.
> > >
> > > Five years have passed and, apparently, those wide boards still need
> > > more work. So let's change the buildtime warning into a runtime error,
> > > only printed for those wide boards. Perhaps that might push the people
> > > using those wide boards to convert this driver. And for all others
> > > there's now one less buildtime warning to ignore.
> > >
> > > Signed-off-by: Paul Bolle 
> > > ---
> > > Compile tested only. I don't have any AdvanSys SCSI boards (neither
> > > narrow nor wide).
> >
> > The date of this message suggests I submitted this patch for a warning
> > seen while building v3.7-rc4. An identical warning can be seen while
> > building v3.8-rc5. What's the status of my patch? Did anyone find some
> > time to have a look at it?
>
> I've been carrying this patch locally for over a year. Is there any
> chance of someone trying to remove this buildtime warning?
>
> I'm inclined to submit a path to Fedora - my local builds use
> their .config as a base - to just disable this driver. It seems that
> would increase my chances of finally shutting down this warning.
>
>
> Paul Bolle

The driver works fine with narrow boards. Please don't break working drivers 
because of some stupid warning.

-- 
Ondrej Zary
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: REQ_PM vs REQ_TYPE_PM_RESUME

2014-01-08 Thread Phillip Susi
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 1/8/2014 12:46 PM, Alan Stern wrote:
> I/O _was_ done.  To spin up the disk requires sending it a START 
> command.  That's I/O.

Yes, currently, but that's exactly what I'm trying to get rid of.

> In essence, you want your disk's state to move directly from
> unpowered (system asleep) to runtime-suspended with no intermediate
> active -- but only if the disk doesn't spin up all by itself.
> 
> One problem: How can the kernel tell whether the disk has spun up
> by itself?  I don't think it can be done at the SCSI level.  Can it
> be done at the ATA level (and without using the SCSI stack, as that
> would require the disk to go out of runtime suspend and spin up
> anyway)?

You issue a REQUEST SENSE command and that returns status indicating
whether the drive is stopped, or in standby.  See my patches.  One of
them fixes the scsi-ata translation to issue the ATA CHECK power
command to see if the drive is suspended, as SAT-3 calls for.

> Actually, it does have a reason.  Namely, you told it earlier to
> assume the disk will be needed again if it was used within the last
> 5 minutes.  At the time of the system resume, the disk was last
> used 3 minutes ago (not counting the time the system was asleep).

You're looking at it backwards.  You aren't telling it to assume the
disk will be needed again if it was used within the last 5 minutes;
you are telling it that it can assume it *won't* be needed again soon
if not used in the last 5 minutes, and therefore, it is ok to shut it
down.

Also the act of suspending the system is a pretty clear breaking point
in general activity.  You normally stop your work and quiesce the
system before you suspend it.  If I finish my work and copy it to my
backup drive, then suspend the system, and my wife comes by an hour
later to browse the web, she's not going to be touching my backup disk.

> Are you assuming these disks were all runtime-active before the
> system sleep?  If so, the kernel is justified in assuming that you
> will be touching them all soon.

No it isn't.  Absence of evidence is not evidence of absence.  I
listed several use cases where the drive won't be runtime suspended
before the system suspend, yet will not be accessed any time soon
after resume.

> Not the ones that crash when they receive it.  But this point is
> moot since manage_start_stop doesn't do what you want anyway.

Any that crash when they get it either are already unusable or have a
quirk registered to inhibit the command, which would still apply.

> No, it doesn't.  They don't spin, so they can't be configured
> either to spin up or not to spin up after power-on.

Which means we shouldn't be setting the runtime status to suspended
after system resume... that's the second case.

> What is PuiS?

Power up in Standby.

> In the end, it sounds like what you want can be done fairly easily.
>  But only if the kernel has a reliable way to tell whether or not a
> disk is spinning (or is spinning up).

That's why my patches are attempting to do: use REQUEST SENSE to ask
the disk if it is spinning up or not.


-BEGIN PGP SIGNATURE-
Version: GnuPG v2.0.17 (MingW32)
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iQEcBAEBAgAGBQJSzbMqAAoJEI5FoCIzSKrwCYwIAKMwlRovIdfO6rSgZL7pufFG
LOV/9apHzqldPPLJ5BPk0A69Ok/TimrQax06HTPyGJ1/MmOrEcjJzb482mJ4ixpx
b7DP3De+RCEUifrfIlI/U+fWLHpw4DOYToSy5ZjNZbF/cDz43k85arAwRK/Bkb++
C2F+YfeQyLcR0KjToj8MK9evlGY6oKfEYx9QCjBEgiaXRjIHfu0AuLn7y7AATK6n
pFBzgq4cNSalob9I4WdwejzJz/RzqQFujtknB+nAaKhESqLCoclJLSzwNQyuxcRz
qT3DckTyTBF7vGmYayaeERnukqWnDEBAQOc8qtOxx2rbYpivNHiqIc8sfrRoaJc=
=2kXd
-END PGP SIGNATURE-
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] [SCSI] advansys: change buildtime warning into runtime error

2014-01-08 Thread Paul Bolle
On Wed, 2014-01-08 at 21:15 +0100, Ondrej Zary wrote:
> The driver works fine with narrow boards. Please don't break working drivers 
> because of some stupid warning.

My patch changed a buildtime warning for both narrow and wide boards in
a runtime error on wide boards only. That wouldn't break the narrow
boards. (I happen to own neither a narrow nor a wide board.)

In seven years that buildtime warning has not resulted in a fix for wide
boards, has it? So at this point that warning is rather pointless.

Is anybody even using these wide boards?


Paul Bolle

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 00/14] target: Initial support for DIF Type1+Type3 emulation

2014-01-08 Thread Nicholas A. Bellinger
From: Nicholas Bellinger 

Hi MKP & SCSI folks,

This series contains initial support for target mode DIF Type1+Type3
emulation within target core, RAMDISK_MCP device backend, and tcm_loop
fabric driver.

DIF emulation is enabled via a new 'pi_prot_type' device attribute
within configfs, which is set after initial device configuration and
before target fabric LUN export occurs.

The DIF read/write verify emulation has been made generic enough so
it can be used by other backend drivers (eg: FILEIO), as well as
DIF v2 in the near future.  Also note that the majority of the logic
has been groked from existing scsi_debug.c code.

The current plan is to enable basic support for emulated backends with
tcm_loop for v3.14 code, and then move onto IBLOCK backend support
(that requires BLOCK layer changes), and then onto other more complex
fabric driver support + hardware offloads (iser-target that Sagi + Or
are currently working on) and DIF support into KVM guest using
virtio-scsi + vhost-scsi.

Here is a quick snippet of the code in action with v3.13-rc3:

[  846.774085] scsi8 : TCM_Loopback
[  846.778044] scsi 8:0:1:0: Direct-Access LIO-ORG  RAMDISK-MCP  4.0 
PQ: 0 ANSI: 5
[  846.780160] init_tag_map: adjusted depth to 256
[  846.781309] init_tag_map: adjusted depth to 256
[  846.782424] init_tag_map: adjusted depth to 256
[  846.783685] sd 8:0:1:0: Attached scsi generic sg1 type 0
[  846.783923] sd 8:0:1:0: [sda] Enabling DIF Type 1 protection
[  846.783929] sd 8:0:1:0: [sda] 2097152 512-byte logical blocks: (1.07 GB/1.00 
GiB)
[  846.784111] sd 8:0:1:0: [sda] Write Protect is off
[  846.784114] sd 8:0:1:0: [sda] Mode Sense: 43 00 00 08
[  846.784157] sd 8:0:1:0: [sda] Write cache: disabled, read cache: enabled, 
doesn't support DPO or FUA
[  846.785357]  sda: unknown partition table
[  846.795464] sd 8:0:1:0: [sda] Enabling DIX T10-DIF-TYPE1-CRC protection
[  846.797278] sd 8:0:1:0: [sda] DIF application tag size 2
[  846.798951] sd 8:0:1:0: [sda] Attached SCSI disk



[  901.477786] Entering sd_dif_type1_generate >>>
[  901.477809] SD TYPE1 Generate: sector: 0 guard_tag: 0x5fd1 app_tag: 0x 
ref_tag: 0
[  901.477823] SD TYPE1 Generate: sector: 1 guard_tag: 0x41de app_tag: 0x 
ref_tag: 1
[  901.477835] SD TYPE1 Generate: sector: 2 guard_tag: 0x27b1 app_tag: 0x 
ref_tag: 2
[  901.477848] SD TYPE1 Generate: sector: 3 guard_tag: 0xeb4c app_tag: 0x 
ref_tag: 3
[  901.477860] SD TYPE1 Generate: sector: 4 guard_tag: 0x5d81 app_tag: 0x 
ref_tag: 4
[  901.477872] SD TYPE1 Generate: sector: 5 guard_tag: 0xda61 app_tag: 0x 
ref_tag: 5
[  901.477884] SD TYPE1 Generate: sector: 6 guard_tag: 0x80df app_tag: 0x 
ref_tag: 6
[  901.477896] SD TYPE1 Generate: sector: 7 guard_tag: 0x5ede app_tag: 0x 
ref_tag: 7
[  901.477940] sd_prep_fn: CDB: 0x2a blk_integrity_rq(rq): 1
[  901.481576] DIF WRITE sector: 0 guard_tag: 0x5fd1 app_tag: 0x ref_tag: 0
[  901.481591] DIF WRITE sector: 1 guard_tag: 0x41de app_tag: 0x ref_tag: 1
[  901.481615] DIF WRITE sector: 2 guard_tag: 0x27b1 app_tag: 0x ref_tag: 2
[  901.481637] DIF WRITE sector: 3 guard_tag: 0xeb4c app_tag: 0x ref_tag: 3
[  901.481649] DIF WRITE sector: 4 guard_tag: 0x5d81 app_tag: 0x ref_tag: 4
[  901.481661] DIF WRITE sector: 5 guard_tag: 0xda61 app_tag: 0x ref_tag: 5
[  901.481673] DIF WRITE sector: 6 guard_tag: 0x80df app_tag: 0x ref_tag: 6
[  901.481685] DIF WRITE sector: 7 guard_tag: 0x5ede app_tag: 0x ref_tag: 7



[  901.522883] sd_prep_fn: CDB: 0x28 blk_integrity_rq(rq): 1
[  901.528637] DIF READ sector: 0 guard_tag: 0x5fd1 app_tag: 0x ref_tag: 0
[  901.528655] DIF READ sector: 1 guard_tag: 0x41de app_tag: 0x ref_tag: 1
[  901.528667] DIF READ sector: 2 guard_tag: 0x27b1 app_tag: 0x ref_tag: 2
[  901.528679] DIF READ sector: 3 guard_tag: 0xeb4c app_tag: 0x ref_tag: 3
[  901.528691] DIF READ sector: 4 guard_tag: 0x5d81 app_tag: 0x ref_tag: 4
[  901.528703] DIF READ sector: 5 guard_tag: 0xda61 app_tag: 0x ref_tag: 5
[  901.528715] DIF READ sector: 6 guard_tag: 0x80df app_tag: 0x ref_tag: 6
[  901.528727] DIF READ sector: 7 guard_tag: 0x5ede app_tag: 0x ref_tag: 7
[  901.528788] Entering bio_integrity_verify sector: 0 >>>
[  901.535807] bio_integrity_verify: bio->bi_idx: 1 bio->bi_vcnt: 1
[  901.538222] SD TYPE1 Verify sector: 0 guard_tag: 0x5fd1 app_tag: 0x 
ref_tag: 0
[  901.538226] SD TYPE1 Verify sector: 1 guard_tag: 0x41de app_tag: 0x 
ref_tag: 1
[  901.538230] SD TYPE1 Verify sector: 2 guard_tag: 0x27b1 app_tag: 0x 
ref_tag: 2
[  901.538234] SD TYPE1 Verify sector: 3 guard_tag: 0xeb4c app_tag: 0x 
ref_tag: 3
[  901.538238] SD TYPE1 Verify sector: 4 guard_tag: 0x5d81 app_tag: 0x 
ref_tag: 4
[  901.538241] SD TYPE1 Verify sector: 5 guard_tag: 0xda61 app_tag: 0x 
ref_tag: 5
[  901.538245] SD TYPE1 Verify sector: 6 guard_tag: 0x80df app_tag: 0x 
ref_tag: 6
[  901.538248] SD TYPE1 Verify sector: 7 guard_tag: 0x5ed

[PATCH 06/14] target/spc: Add protection related bits to INQUIRY EVPD=0x86

2014-01-08 Thread Nicholas A. Bellinger
From: Nicholas Bellinger 

This patch updates spc_emulate_evpd_86() (extended INQUIRY) to
report GRD_CHK (Guard Check) and REF_CHK (Reference Check) bits
when DIF emulation is enabled by the backend device.

Cc: Martin K. Petersen 
Cc: Christoph Hellwig 
Cc: Hannes Reinecke 
Cc: Sagi Grimberg 
Cc: Or Gerlitz 
Signed-off-by: Nicholas Bellinger 
---
 drivers/target/target_core_spc.c |9 +
 1 file changed, 9 insertions(+)

diff --git a/drivers/target/target_core_spc.c b/drivers/target/target_core_spc.c
index 6b06f01..d99eb95 100644
--- a/drivers/target/target_core_spc.c
+++ b/drivers/target/target_core_spc.c
@@ -441,6 +441,15 @@ spc_emulate_evpd_86(struct se_cmd *cmd, unsigned char *buf)
struct se_device *dev = cmd->se_dev;
 
buf[3] = 0x3c;
+   /*
+* Set GRD_CHK + REF_CHK for TYPE1 protection, or GRD_CHK
+* only for TYPE3 protection.
+*/
+   if (dev->dev_attrib.pi_prot_type == TARGET_DIF_TYPE1_PROT)
+   buf[4] = 0x5;
+   else if (dev->dev_attrib.pi_prot_type == TARGET_DIF_TYPE3_PROT)
+   buf[4] = 0x4;
+
/* Set HEADSUP, ORDSUP, SIMPSUP */
buf[5] = 0x07;
 
-- 
1.7.10.4

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 10/14] target: Add protection SGLs to target_submit_cmd_map_sgls

2014-01-08 Thread Nicholas A. Bellinger
From: Nicholas Bellinger 

This patch adds support to target_submit_cmd_map_sgls() for
accepting 'sgl_prot' + 'sgl_prot_count' parameters for
DIF protection information.

Note the passed parameters are stored at se_cmd->t_prot_sg
and se_cmd->t_prot_nents respectively.

Also, update tcm_loop and vhost-scsi fabrics usage of
target_submit_cmd_map_sgls() to take into account the
new parameters.

Cc: Martin K. Petersen 
Cc: Christoph Hellwig 
Cc: Hannes Reinecke 
Cc: Sagi Grimberg 
Cc: Or Gerlitz 
Signed-off-by: Nicholas Bellinger 
---
 drivers/target/loopback/tcm_loop.c |2 +-
 drivers/target/target_core_transport.c |   16 ++--
 drivers/vhost/scsi.c   |2 +-
 include/target/target_core_fabric.h|3 ++-
 4 files changed, 18 insertions(+), 5 deletions(-)

diff --git a/drivers/target/loopback/tcm_loop.c 
b/drivers/target/loopback/tcm_loop.c
index 1b41e67..c6f1427 100644
--- a/drivers/target/loopback/tcm_loop.c
+++ b/drivers/target/loopback/tcm_loop.c
@@ -217,7 +217,7 @@ static void tcm_loop_submission_work(struct work_struct 
*work)
scsi_bufflen(sc), tcm_loop_sam_attr(sc),
sc->sc_data_direction, 0,
scsi_sglist(sc), scsi_sg_count(sc),
-   sgl_bidi, sgl_bidi_count);
+   sgl_bidi, sgl_bidi_count, NULL, 0);
if (rc < 0) {
set_host_byte(sc, DID_NO_CONNECT);
goto out_done;
diff --git a/drivers/target/target_core_transport.c 
b/drivers/target/target_core_transport.c
index 707ee17..65b42b4 100644
--- a/drivers/target/target_core_transport.c
+++ b/drivers/target/target_core_transport.c
@@ -1284,6 +1284,8 @@ transport_generic_map_mem_to_cmd(struct se_cmd *cmd, 
struct scatterlist *sgl,
  * @sgl_count: scatterlist count for unidirectional mapping
  * @sgl_bidi: struct scatterlist memory for bidirectional READ mapping
  * @sgl_bidi_count: scatterlist count for bidirectional READ mapping
+ * @sgl_prot: struct scatterlist memory protection information
+ * @sgl_prot_count: scatterlist count for protection information
  *
  * Returns non zero to signal active I/O shutdown failure.  All other
  * setup exceptions will be returned as a SCSI CHECK_CONDITION response,
@@ -1296,7 +1298,8 @@ int target_submit_cmd_map_sgls(struct se_cmd *se_cmd, 
struct se_session *se_sess
unsigned char *cdb, unsigned char *sense, u32 unpacked_lun,
u32 data_length, int task_attr, int data_dir, int flags,
struct scatterlist *sgl, u32 sgl_count,
-   struct scatterlist *sgl_bidi, u32 sgl_bidi_count)
+   struct scatterlist *sgl_bidi, u32 sgl_bidi_count,
+   struct scatterlist *sgl_prot, u32 sgl_prot_count)
 {
struct se_portal_group *se_tpg;
sense_reason_t rc;
@@ -1338,6 +1341,14 @@ int target_submit_cmd_map_sgls(struct se_cmd *se_cmd, 
struct se_session *se_sess
target_put_sess_cmd(se_sess, se_cmd);
return 0;
}
+   /*
+* Save pointers for SGLs containing protection information,
+* if present.
+*/
+   if (sgl_prot_count) {
+   se_cmd->t_prot_sg = sgl_prot;
+   se_cmd->t_prot_nents = sgl_prot_count;
+   }
 
rc = target_setup_cmd_from_cdb(se_cmd, cdb);
if (rc != 0) {
@@ -1380,6 +1391,7 @@ int target_submit_cmd_map_sgls(struct se_cmd *se_cmd, 
struct se_session *se_sess
return 0;
}
}
+
/*
 * Check if we need to delay processing because of ALUA
 * Active/NonOptimized primary access state..
@@ -1419,7 +1431,7 @@ int target_submit_cmd(struct se_cmd *se_cmd, struct 
se_session *se_sess,
 {
return target_submit_cmd_map_sgls(se_cmd, se_sess, cdb, sense,
unpacked_lun, data_length, task_attr, data_dir,
-   flags, NULL, 0, NULL, 0);
+   flags, NULL, 0, NULL, 0, NULL, 0);
 }
 EXPORT_SYMBOL(target_submit_cmd);
 
diff --git a/drivers/vhost/scsi.c b/drivers/vhost/scsi.c
index f175629..84488a8 100644
--- a/drivers/vhost/scsi.c
+++ b/drivers/vhost/scsi.c
@@ -889,7 +889,7 @@ static void tcm_vhost_submission_work(struct work_struct 
*work)
cmd->tvc_lun, cmd->tvc_exp_data_len,
cmd->tvc_task_attr, cmd->tvc_data_direction,
TARGET_SCF_ACK_KREF, sg_ptr, cmd->tvc_sgl_count,
-   sg_bidi_ptr, sg_no_bidi);
+   sg_bidi_ptr, sg_no_bidi, NULL, 0);
if (rc < 0) {
transport_send_check_condition_and_sense(se_cmd,
TCM_LOGICAL_UNIT_COMMUNICATION_FAILURE, 0);
diff --git a/include/target/target_core_fabric.h 
b/include/target/target_core_fabric.h
index 4cf4fda..0218d68 100644
--- a/include/target/target_core_fabric.h
+++ b/include/target/target_core_fabric.h
@@ -105,7 +105,8 @@ sense_rea

[PATCH 12/14] target/rd: Add support for protection SGL setup + release

2014-01-08 Thread Nicholas A. Bellinger
From: Nicholas Bellinger 

This patch adds rd_build_prot_space() + rd_release_prot_space() logic
to setup + release protection information scatterlists.

It also adds rd_init_prot() + rd_free_prot() se_subsystem_api
callbacks used by target core code for setup + release of
protection information.

Cc: Martin K. Petersen 
Cc: Christoph Hellwig 
Cc: Hannes Reinecke 
Cc: Sagi Grimberg 
Cc: Or Gerlitz 
Signed-off-by: Nicholas Bellinger 
---
 drivers/target/target_core_rd.c |   81 +++
 drivers/target/target_core_rd.h |4 ++
 2 files changed, 85 insertions(+)

diff --git a/drivers/target/target_core_rd.c b/drivers/target/target_core_rd.c
index fdc5022..dd99844 100644
--- a/drivers/target/target_core_rd.c
+++ b/drivers/target/target_core_rd.c
@@ -226,6 +226,64 @@ static int rd_build_device_space(struct rd_dev *rd_dev)
return 0;
 }
 
+static void rd_release_prot_space(struct rd_dev *rd_dev)
+{
+   struct rd_dev_sg_table *sg_table;
+   u32 page_count;
+
+   if (!rd_dev->sg_prot_array || !rd_dev->sg_prot_count)
+   return;
+
+   sg_table = rd_dev->sg_prot_array;
+
+   page_count = rd_release_sgl_table(rd_dev, rd_dev->sg_prot_array,
+ rd_dev->sg_prot_count);
+
+   pr_debug("CORE_RD[%u] - Released protection space for Ramdisk"
+" Device ID: %u, pages %u in %u tables total bytes %lu\n",
+rd_dev->rd_host->rd_host_id, rd_dev->rd_dev_id, page_count,
+rd_dev->sg_table_count, (unsigned long)page_count * PAGE_SIZE);
+
+   rd_dev->sg_prot_array = NULL;
+   rd_dev->sg_prot_count = 0;
+}
+
+static int rd_build_prot_space(struct rd_dev *rd_dev, int prot_length)
+{
+   struct rd_dev_sg_table *sg_table;
+   u32 total_sg_needed, sg_tables;
+   u32 max_sg_per_table = (RD_MAX_ALLOCATION_SIZE /
+   sizeof(struct scatterlist));
+   int rc;
+
+   if (rd_dev->rd_flags & RDF_NULLIO)
+   return 0;
+
+   total_sg_needed = rd_dev->rd_page_count / prot_length;
+
+   sg_tables = (total_sg_needed / max_sg_per_table) + 1;
+
+   sg_table = kzalloc(sg_tables * sizeof(struct rd_dev_sg_table), 
GFP_KERNEL);
+   if (!sg_table) {
+   pr_err("Unable to allocate memory for Ramdisk protection"
+  " scatterlist tables\n");
+   return -ENOMEM;
+   }
+
+   rd_dev->sg_prot_array = sg_table;
+   rd_dev->sg_prot_count = sg_tables;
+
+   rc = rd_allocate_sgl_table(rd_dev, sg_table, total_sg_needed, 0xff);
+   if (rc)
+   return rc;
+
+   pr_debug("CORE_RD[%u] - Built Ramdisk Device ID: %u prot space of"
+" %u pages in %u tables\n", rd_dev->rd_host->rd_host_id,
+rd_dev->rd_dev_id, total_sg_needed, rd_dev->sg_prot_count);
+
+   return 0;
+}
+
 static struct se_device *rd_alloc_device(struct se_hba *hba, const char *name)
 {
struct rd_dev *rd_dev;
@@ -280,6 +338,7 @@ static void rd_free_device(struct se_device *dev)
 {
struct rd_dev *rd_dev = RD_DEV(dev);
 
+   rd_release_prot_space(rd_dev);
rd_release_device_space(rd_dev);
kfree(rd_dev);
 }
@@ -482,6 +541,26 @@ static sector_t rd_get_blocks(struct se_device *dev)
return blocks_long;
 }
 
+static int rd_init_prot(struct se_device *dev)
+{
+   struct rd_dev *rd_dev = RD_DEV(dev);
+
+if (!dev->dev_attrib.pi_prot_type)
+   return 0;
+
+   return rd_build_prot_space(rd_dev, dev->prot_length);
+}
+
+static void rd_free_prot(struct se_device *dev)
+{
+   struct rd_dev *rd_dev = RD_DEV(dev);
+
+   if (dev->dev_attrib.pi_prot_type)
+   return;
+
+   rd_release_prot_space(rd_dev);
+}
+
 static struct sbc_ops rd_sbc_ops = {
.execute_rw = rd_execute_rw,
 };
@@ -507,6 +586,8 @@ static struct se_subsystem_api rd_mcp_template = {
.show_configfs_dev_params = rd_show_configfs_dev_params,
.get_device_type= sbc_get_device_type,
.get_blocks = rd_get_blocks,
+   .init_prot  = rd_init_prot,
+   .free_prot  = rd_free_prot,
 };
 
 int __init rd_module_init(void)
diff --git a/drivers/target/target_core_rd.h b/drivers/target/target_core_rd.h
index 1789d1e..cc46a6a 100644
--- a/drivers/target/target_core_rd.h
+++ b/drivers/target/target_core_rd.h
@@ -33,8 +33,12 @@ struct rd_dev {
u32 rd_page_count;
/* Number of SG tables in sg_table_array */
u32 sg_table_count;
+   /* Number of SG tables in sg_prot_array */
+   u32 sg_prot_count;
/* Array of rd_dev_sg_table_t containing scatterlists */
struct rd_dev_sg_table *sg_table_array;
+   /* Array of rd_dev_sg_table containing protection scatterlists */
+   struct rd_dev_sg_table *sg_prot_array;
/* Ramdisk HBA device is connected to */
stru

[PATCH 14/14] tcm_loop: Enable DIF/DIX modes in SCSI host LLD

2014-01-08 Thread Nicholas A. Bellinger
From: Nicholas Bellinger 

This patch updates tcm_loop_driver_probe() to set protection
information using scsi_host_set_prot() and scsi_host_set_guard(),
which currently enabled all modes of DIF/DIX protection, minus
DIF TYPE0.

Also, update tcm_loop_submission_work() to pass struct scsi_cmnd
related protection into target_submit_cmd_map_sgls() during CDB
dispatch.

Cc: Martin K. Petersen 
Cc: Christoph Hellwig 
Cc: Hannes Reinecke 
Cc: Sagi Grimberg 
Cc: Or Gerlitz 
Signed-off-by: Nicholas Bellinger 
---
 drivers/target/loopback/tcm_loop.c |   12 ++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/drivers/target/loopback/tcm_loop.c 
b/drivers/target/loopback/tcm_loop.c
index c6f1427..8d3355f 100644
--- a/drivers/target/loopback/tcm_loop.c
+++ b/drivers/target/loopback/tcm_loop.c
@@ -217,7 +217,8 @@ static void tcm_loop_submission_work(struct work_struct 
*work)
scsi_bufflen(sc), tcm_loop_sam_attr(sc),
sc->sc_data_direction, 0,
scsi_sglist(sc), scsi_sg_count(sc),
-   sgl_bidi, sgl_bidi_count, NULL, 0);
+   sgl_bidi, sgl_bidi_count,
+   scsi_prot_sglist(sc), scsi_prot_sg_count(sc));
if (rc < 0) {
set_host_byte(sc, DID_NO_CONNECT);
goto out_done;
@@ -462,7 +463,7 @@ static int tcm_loop_driver_probe(struct device *dev)
 {
struct tcm_loop_hba *tl_hba;
struct Scsi_Host *sh;
-   int error;
+   int error, host_prot;
 
tl_hba = to_tcm_loop_hba(dev);
 
@@ -486,6 +487,13 @@ static int tcm_loop_driver_probe(struct device *dev)
sh->max_channel = 0;
sh->max_cmd_len = TL_SCSI_MAX_CMD_LEN;
 
+   host_prot = SHOST_DIF_TYPE1_PROTECTION | SHOST_DIF_TYPE2_PROTECTION |
+   SHOST_DIF_TYPE3_PROTECTION | SHOST_DIX_TYPE1_PROTECTION |
+   SHOST_DIX_TYPE2_PROTECTION | SHOST_DIX_TYPE3_PROTECTION;
+
+   scsi_host_set_prot(sh, host_prot);
+   scsi_host_set_guard(sh, SHOST_DIX_GUARD_CRC);
+
error = scsi_add_host(sh, &tl_hba->dev);
if (error) {
pr_err("%s: scsi_add_host failed\n", __func__);
-- 
1.7.10.4

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 11/14] target/rd: Refactor rd_build_device_space + rd_release_device_space

2014-01-08 Thread Nicholas A. Bellinger
From: Nicholas Bellinger 

This patch refactors rd_build_device_space() + rd_release_device_space()
into rd_allocate_sgl_table() + rd_release_device_space() so that they
may be used seperatly for setup + release of protection information
scatterlists.

Also add explicit memset of pages within rd_allocate_sgl_table() based
upon passed 'init_payload' value.

Cc: Martin K. Petersen 
Cc: Christoph Hellwig 
Cc: Hannes Reinecke 
Cc: Sagi Grimberg 
Cc: Or Gerlitz 
Signed-off-by: Nicholas Bellinger 
---
 drivers/target/target_core_rd.c |  116 ---
 1 file changed, 71 insertions(+), 45 deletions(-)

diff --git a/drivers/target/target_core_rd.c b/drivers/target/target_core_rd.c
index 4ffe5f2..fdc5022 100644
--- a/drivers/target/target_core_rd.c
+++ b/drivers/target/target_core_rd.c
@@ -78,23 +78,14 @@ static void rd_detach_hba(struct se_hba *hba)
hba->hba_ptr = NULL;
 }
 
-/* rd_release_device_space():
- *
- *
- */
-static void rd_release_device_space(struct rd_dev *rd_dev)
+static u32 rd_release_sgl_table(struct rd_dev *rd_dev, struct rd_dev_sg_table 
*sg_table,
+u32 sg_table_count)
 {
-   u32 i, j, page_count = 0, sg_per_table;
-   struct rd_dev_sg_table *sg_table;
struct page *pg;
struct scatterlist *sg;
+   u32 i, j, page_count = 0, sg_per_table;
 
-   if (!rd_dev->sg_table_array || !rd_dev->sg_table_count)
-   return;
-
-   sg_table = rd_dev->sg_table_array;
-
-   for (i = 0; i < rd_dev->sg_table_count; i++) {
+   for (i = 0; i < sg_table_count; i++) {
sg = sg_table[i].sg_table;
sg_per_table = sg_table[i].rd_sg_count;
 
@@ -105,16 +96,31 @@ static void rd_release_device_space(struct rd_dev *rd_dev)
page_count++;
}
}
-
kfree(sg);
}
 
+   kfree(sg_table);
+   return page_count;
+}
+
+static void rd_release_device_space(struct rd_dev *rd_dev)
+{
+   struct rd_dev_sg_table *sg_table;
+   u32 page_count;
+
+   if (!rd_dev->sg_table_array || !rd_dev->sg_table_count)
+   return;
+
+   sg_table = rd_dev->sg_table_array;
+
+   page_count = rd_release_sgl_table(rd_dev, rd_dev->sg_table_array,
+ rd_dev->sg_table_count);
+
pr_debug("CORE_RD[%u] - Released device space for Ramdisk"
" Device ID: %u, pages %u in %u tables total bytes %lu\n",
rd_dev->rd_host->rd_host_id, rd_dev->rd_dev_id, page_count,
rd_dev->sg_table_count, (unsigned long)page_count * PAGE_SIZE);
 
-   kfree(sg_table);
rd_dev->sg_table_array = NULL;
rd_dev->sg_table_count = 0;
 }
@@ -124,38 +130,15 @@ static void rd_release_device_space(struct rd_dev *rd_dev)
  *
  *
  */
-static int rd_build_device_space(struct rd_dev *rd_dev)
+static int rd_allocate_sgl_table(struct rd_dev *rd_dev, struct rd_dev_sg_table 
*sg_table,
+u32 total_sg_needed, unsigned char 
init_payload)
 {
-   u32 i = 0, j, page_offset = 0, sg_per_table, sg_tables, total_sg_needed;
+   u32 i = 0, j, page_offset = 0, sg_per_table;
u32 max_sg_per_table = (RD_MAX_ALLOCATION_SIZE /
sizeof(struct scatterlist));
-   struct rd_dev_sg_table *sg_table;
struct page *pg;
struct scatterlist *sg;
-
-   if (rd_dev->rd_page_count <= 0) {
-   pr_err("Illegal page count: %u for Ramdisk device\n",
-   rd_dev->rd_page_count);
-   return -EINVAL;
-   }
-
-   /* Don't need backing pages for NULLIO */
-   if (rd_dev->rd_flags & RDF_NULLIO)
-   return 0;
-
-   total_sg_needed = rd_dev->rd_page_count;
-
-   sg_tables = (total_sg_needed / max_sg_per_table) + 1;
-
-   sg_table = kzalloc(sg_tables * sizeof(struct rd_dev_sg_table), 
GFP_KERNEL);
-   if (!sg_table) {
-   pr_err("Unable to allocate memory for Ramdisk"
-   " scatterlist tables\n");
-   return -ENOMEM;
-   }
-
-   rd_dev->sg_table_array = sg_table;
-   rd_dev->sg_table_count = sg_tables;
+   unsigned char *p;
 
while (total_sg_needed) {
sg_per_table = (total_sg_needed > max_sg_per_table) ?
@@ -186,16 +169,59 @@ static int rd_build_device_space(struct rd_dev *rd_dev)
}
sg_assign_page(&sg[j], pg);
sg[j].length = PAGE_SIZE;
+
+   p = kmap(pg);
+   memset(p, init_payload, PAGE_SIZE);
+   kunmap(pg);
}
 
page_offset += sg_per_table;
total_sg_needed -= sg_per_table;
}
 
+   return 0;
+}
+
+static int rd_build_device_space(struct rd_dev *rd_dev)
+{
+   struct rd_dev_sg_table *sg_table;
+   u3

[PATCH 09/14] target/configfs: Expose protection device attributes

2014-01-08 Thread Nicholas A. Bellinger
From: Nicholas Bellinger 

This patch adds support for exposing DIF protection device
attributes via configfs.  This includes:

   pi_prot_type: Protection Type (0, 1, 3 currently support)
   pi_prot_version: Protection Version (DIF v1 currently supported)
   pi_guard_type: Guard Type (1=DIF CRC, 2=IP CRC)

Within se_dev_set_pi_prot_type() it also adds the se_subsystem_api
device callbacks to setup per device protection information.

Cc: Martin K. Petersen 
Cc: Christoph Hellwig 
Cc: Hannes Reinecke 
Cc: Sagi Grimberg 
Cc: Or Gerlitz 
Signed-off-by: Nicholas Bellinger 
---
 drivers/target/target_core_configfs.c |   12 ++
 drivers/target/target_core_device.c   |   65 +
 drivers/target/target_core_internal.h |2 +
 3 files changed, 79 insertions(+)

diff --git a/drivers/target/target_core_configfs.c 
b/drivers/target/target_core_configfs.c
index 272755d..0f1101c 100644
--- a/drivers/target/target_core_configfs.c
+++ b/drivers/target/target_core_configfs.c
@@ -643,6 +643,15 @@ SE_DEV_ATTR(emulate_caw, S_IRUGO | S_IWUSR);
 DEF_DEV_ATTRIB(emulate_3pc);
 SE_DEV_ATTR(emulate_3pc, S_IRUGO | S_IWUSR);
 
+DEF_DEV_ATTRIB(pi_prot_type);
+SE_DEV_ATTR(pi_prot_type, S_IRUGO | S_IWUSR);
+
+DEF_DEV_ATTRIB_RO(pi_prot_version);
+SE_DEV_ATTR_RO(pi_prot_version);
+
+DEF_DEV_ATTRIB(pi_guard_type);
+SE_DEV_ATTR(pi_guard_type, S_IRUGO | S_IWUSR);
+
 DEF_DEV_ATTRIB(enforce_pr_isids);
 SE_DEV_ATTR(enforce_pr_isids, S_IRUGO | S_IWUSR);
 
@@ -702,6 +711,9 @@ static struct configfs_attribute 
*target_core_dev_attrib_attrs[] = {
&target_core_dev_attrib_emulate_tpws.attr,
&target_core_dev_attrib_emulate_caw.attr,
&target_core_dev_attrib_emulate_3pc.attr,
+   &target_core_dev_attrib_pi_prot_type.attr,
+   &target_core_dev_attrib_pi_prot_version.attr,
+   &target_core_dev_attrib_pi_guard_type.attr,
&target_core_dev_attrib_enforce_pr_isids.attr,
&target_core_dev_attrib_is_nonrot.attr,
&target_core_dev_attrib_emulate_rest_reord.attr,
diff --git a/drivers/target/target_core_device.c 
b/drivers/target/target_core_device.c
index 207b340..2b59beb 100644
--- a/drivers/target/target_core_device.c
+++ b/drivers/target/target_core_device.c
@@ -918,6 +918,67 @@ int se_dev_set_emulate_3pc(struct se_device *dev, int flag)
return 0;
 }
 
+int se_dev_set_pi_prot_type(struct se_device *dev, int flag)
+{
+   int rc, old_prot = dev->dev_attrib.pi_prot_type;
+
+   if (flag != 0 && flag != 1 && flag != 2 && flag != 3) {
+   pr_err("Illegal value %d for pi_prot_type\n", flag);
+   return -EINVAL;
+   }
+   if (flag == 2) {
+   pr_err("DIF TYPE2 protection currently not supported\n");
+   return -ENOSYS;
+   }
+   if (!dev->transport->init_prot || !dev->transport->free_prot) {
+   pr_err("DIF protection not supported by backend: %s\n",
+  dev->transport->name);
+   return -ENOSYS;
+   }
+   if (!(dev->dev_flags & DF_CONFIGURED)) {
+   pr_err("DIF protection requires device to be configured\n");
+   return -ENODEV;
+   }
+   if (dev->export_count) {
+   pr_err("dev[%p]: Unable to change SE Device PROT type while"
+  " export_count is %d\n", dev, dev->export_count);
+   return -EINVAL;
+   }
+
+   dev->dev_attrib.pi_prot_type = flag;
+
+   if (flag && !old_prot) {
+   rc = dev->transport->init_prot(dev);
+   if (rc) {
+   dev->dev_attrib.pi_prot_type = old_prot;
+   return rc;
+   }
+   } else if (!flag && old_prot) {
+   dev->transport->free_prot(dev);
+   }
+   pr_debug("dev[%p]: SE Device Protection Type: %d\n", dev, flag);
+
+   return 0;
+}
+
+int se_dev_set_pi_guard_type(struct se_device *dev, int flag)
+{
+   if (flag != 1 && flag != 2) {
+   pr_err("Illegal value %d for pi_guard_type\n", flag);
+   return -EINVAL;
+   }
+   if (dev->export_count) {
+   pr_err("dev[%p]: Unable to change SE Device GUARD type while"
+  " export_count is %d\n", dev, dev->export_count);
+   return -EINVAL;
+   }
+
+   dev->dev_attrib.pi_guard_type = flag;
+   pr_debug("dev[%p]: SE Device Guard Type: %d\n", dev, flag);
+
+   return 0;
+}
+
 int se_dev_set_enforce_pr_isids(struct se_device *dev, int flag)
 {
if ((flag != 0) && (flag != 1)) {
@@ -1415,6 +1476,7 @@ struct se_device *target_alloc_device(struct se_hba *hba, 
const char *name)
dev->dev_link_magic = SE_DEV_LINK_MAGIC;
dev->se_hba = hba;
dev->transport = hba->transport;
+   dev->prot_length = sizeof(struct se_dif_v1_tuple);
 
INIT_LIST_HEAD(&dev->dev_list);
INIT_LIST_HEAD(&dev->dev_sep_list);
@@ -1455,6 +1517,9 @@ struct se_device *target_alloc_device(struct

[PATCH 13/14] target/rd: Add DIF protection into rd_execute_rw

2014-01-08 Thread Nicholas A. Bellinger
From: Nicholas Bellinger 

This patch adds support for DIF protection into rd_execute_rw() code
for WRITE/READ I/O using sbc_dif_verify_[write,read]() logic.

It also adds rd_get_prot_table() for locating protection SGLs
assoicated with the ramdisk backend device.

Cc: Martin K. Petersen 
Cc: Christoph Hellwig 
Cc: Hannes Reinecke 
Cc: Sagi Grimberg 
Cc: Or Gerlitz 
Signed-off-by: Nicholas Bellinger 
---
 drivers/target/target_core_rd.c |   67 +++
 1 file changed, 67 insertions(+)

diff --git a/drivers/target/target_core_rd.c b/drivers/target/target_core_rd.c
index dd99844..3fd51eb 100644
--- a/drivers/target/target_core_rd.c
+++ b/drivers/target/target_core_rd.c
@@ -363,6 +363,26 @@ static struct rd_dev_sg_table *rd_get_sg_table(struct 
rd_dev *rd_dev, u32 page)
return NULL;
 }
 
+static struct rd_dev_sg_table *rd_get_prot_table(struct rd_dev *rd_dev, u32 
page)
+{
+   struct rd_dev_sg_table *sg_table;
+   u32 i, sg_per_table = (RD_MAX_ALLOCATION_SIZE /
+   sizeof(struct scatterlist));
+
+   i = page / sg_per_table;
+   if (i < rd_dev->sg_prot_count) {
+   sg_table = &rd_dev->sg_prot_array[i];
+   if ((sg_table->page_start_offset <= page) &&
+(sg_table->page_end_offset >= page))
+   return sg_table;
+   }
+
+   pr_err("Unable to locate struct prot rd_dev_sg_table for page: %u\n",
+   page);
+
+   return NULL;
+}
+
 static sense_reason_t
 rd_execute_rw(struct se_cmd *cmd, struct scatterlist *sgl, u32 sgl_nents,
  enum dma_data_direction data_direction)
@@ -377,6 +397,7 @@ rd_execute_rw(struct se_cmd *cmd, struct scatterlist *sgl, 
u32 sgl_nents,
u32 rd_page;
u32 src_len;
u64 tmp;
+   sense_reason_t rc;
 
if (dev->rd_flags & RDF_NULLIO) {
target_complete_cmd(cmd, SAM_STAT_GOOD);
@@ -399,6 +420,29 @@ rd_execute_rw(struct se_cmd *cmd, struct scatterlist *sgl, 
u32 sgl_nents,
data_direction == DMA_FROM_DEVICE ? "Read" : "Write",
cmd->t_task_lba, rd_size, rd_page, rd_offset);
 
+   if ((cmd->se_cmd_flags & SCF_PROT) && se_dev->dev_attrib.pi_prot_type &&
+   data_direction == DMA_TO_DEVICE) {
+   sector_t sector = cmd->data_length / 
se_dev->dev_attrib.block_size;
+   struct rd_dev_sg_table *prot_table;
+   struct scatterlist *prot_sg;
+   u32 prot_offset, prot_page;
+
+   tmp = cmd->t_task_lba * se_dev->prot_length;
+   prot_offset = do_div(tmp, PAGE_SIZE);
+   prot_page = tmp;
+
+   prot_table = rd_get_prot_table(dev, prot_page);
+   if (!prot_table)
+   return TCM_LOGICAL_UNIT_COMMUNICATION_FAILURE;
+
+   prot_sg = &prot_table->sg_table[prot_page - 
prot_table->page_start_offset];
+
+   rc = sbc_dif_verify_write(cmd, cmd->t_task_lba, sector, 0,
+ prot_sg, prot_offset);
+   if (rc)
+   return rc;
+   }
+
src_len = PAGE_SIZE - rd_offset;
sg_miter_start(&m, sgl, sgl_nents,
data_direction == DMA_FROM_DEVICE ?
@@ -460,6 +504,29 @@ rd_execute_rw(struct se_cmd *cmd, struct scatterlist *sgl, 
u32 sgl_nents,
}
sg_miter_stop(&m);
 
+   if ((cmd->se_cmd_flags & SCF_PROT) && se_dev->dev_attrib.pi_prot_type &&
+   data_direction == DMA_FROM_DEVICE) {
+   sector_t sector = cmd->data_length / 
se_dev->dev_attrib.block_size;
+   struct rd_dev_sg_table *prot_table;
+   struct scatterlist *prot_sg;
+   u32 prot_offset, prot_page;
+
+   tmp = cmd->t_task_lba * se_dev->prot_length;
+   prot_offset = do_div(tmp, PAGE_SIZE);
+   prot_page = tmp;
+
+   prot_table = rd_get_prot_table(dev, prot_page);
+   if (!prot_table)
+   return TCM_LOGICAL_UNIT_COMMUNICATION_FAILURE;
+
+   prot_sg = &prot_table->sg_table[prot_page - 
prot_table->page_start_offset];
+
+   rc = sbc_dif_verify_read(cmd, cmd->t_task_lba, sector, 0,
+prot_sg, prot_offset);
+   if (rc)
+   return rc;
+   }
+
target_complete_cmd(cmd, SAM_STAT_GOOD);
return 0;
 }
-- 
1.7.10.4

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 07/14] target/sbc: Add P_TYPE + PROT_EN bits to READ_CAPACITY_16

2014-01-08 Thread Nicholas A. Bellinger
From: Nicholas Bellinger 

This patch updates sbc_emulate_readcapacity_16() to set
P_TYPE and PROT_EN bits when DIF emulation is enabled by
the backend device.

Cc: Martin K. Petersen 
Cc: Christoph Hellwig 
Cc: Hannes Reinecke 
Cc: Sagi Grimberg 
Cc: Or Gerlitz 
Signed-off-by: Nicholas Bellinger 
---
 drivers/target/target_core_sbc.c |5 +
 1 file changed, 5 insertions(+)

diff --git a/drivers/target/target_core_sbc.c b/drivers/target/target_core_sbc.c
index 366b9bb..22599e8 100644
--- a/drivers/target/target_core_sbc.c
+++ b/drivers/target/target_core_sbc.c
@@ -106,6 +106,11 @@ sbc_emulate_readcapacity_16(struct se_cmd *cmd)
buf[9] = (dev->dev_attrib.block_size >> 16) & 0xff;
buf[10] = (dev->dev_attrib.block_size >> 8) & 0xff;
buf[11] = dev->dev_attrib.block_size & 0xff;
+   /*
+* Set P_TYPE and PROT_EN bits for DIF support
+*/
+   if (dev->dev_attrib.pi_prot_type)
+   buf[12] = (dev->dev_attrib.pi_prot_type - 1) << 1 | 0x1;
 
if (dev->transport->get_lbppbe)
buf[13] = dev->transport->get_lbppbe(dev) & 0x0f;
-- 
1.7.10.4

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 08/14] target/spc: Expose ATO bit in control mode page

2014-01-08 Thread Nicholas A. Bellinger
From: Nicholas Bellinger 

This patch updates spc_modesense_control() to set the Application
Tag Owner (ATO) bit when when DIF emulation is enabled by the
backend device.

Cc: Martin K. Petersen 
Cc: Christoph Hellwig 
Cc: Hannes Reinecke 
Cc: Sagi Grimberg 
Cc: Or Gerlitz 
Signed-off-by: Nicholas Bellinger 
---
 drivers/target/target_core_spc.c |   13 +
 1 file changed, 13 insertions(+)

diff --git a/drivers/target/target_core_spc.c b/drivers/target/target_core_spc.c
index d99eb95..4360c80 100644
--- a/drivers/target/target_core_spc.c
+++ b/drivers/target/target_core_spc.c
@@ -799,6 +799,19 @@ static int spc_modesense_control(struct se_device *dev, u8 
pc, u8 *p)
 * status (see SAM-4).
 */
p[5] = (dev->dev_attrib.emulate_tas) ? 0x40 : 0x00;
+   /*
+* From spc4r30, section 7.5.7 Control mode page
+*
+* Application Tag Owner (ATO) bit set to one.
+*
+* If the ATO bit is set to one the device server shall not modify the
+* LOGICAL BLOCK APPLICATION TAG field and, depending on the protection
+* type, shall not modify the contents of the LOGICAL BLOCK REFERENCE
+* TAG field.
+*/
+   if (dev->dev_attrib.pi_prot_type)
+   p[5] |= 0x80;
+
p[8] = 0xff;
p[9] = 0xff;
p[11] = 30;
-- 
1.7.10.4

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 05/14] target/spc: Add protection bit to standard INQUIRY output

2014-01-08 Thread Nicholas A. Bellinger
From: Nicholas Bellinger 

This patch updates spc_emulate_inquiry_std() to set the
PROTECT bit when DIF emulation is enabled by the backend
device.

Cc: Martin K. Petersen 
Cc: Christoph Hellwig 
Cc: Hannes Reinecke 
Cc: Sagi Grimberg 
Cc: Or Gerlitz 
Signed-off-by: Nicholas Bellinger 
---
 drivers/target/target_core_spc.c |5 +
 1 file changed, 5 insertions(+)

diff --git a/drivers/target/target_core_spc.c b/drivers/target/target_core_spc.c
index 021c3f4..6b06f01 100644
--- a/drivers/target/target_core_spc.c
+++ b/drivers/target/target_core_spc.c
@@ -100,6 +100,11 @@ spc_emulate_inquiry_std(struct se_cmd *cmd, unsigned char 
*buf)
 */
if (dev->dev_attrib.emulate_3pc)
buf[5] |= 0x8;
+   /*
+* Set Protection (PROTECT) bit when DIF has been enabled.
+*/
+   if (dev->dev_attrib.pi_prot_type)
+   buf[5] |= 0x1;
 
buf[7] = 0x2; /* CmdQue=1 */
 
-- 
1.7.10.4

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 02/14] target: Add DIF CHECK_CONDITION ASC/ASCQ exception cases

2014-01-08 Thread Nicholas A. Bellinger
From: Nicholas Bellinger 

This patch adds support for DIF related CHECK_CONDITION ASC/ASCQ
exception cases into transport_send_check_condition_and_sense().

This includes:

  LOGICAL BLOCK GUARD CHECK FAILED
  LOGICAL BLOCK APPLICATION TAG CHECK FAILED
  LOGICAL BLOCK REFERENCE TAG CHECK FAILED

that used by DIF TYPE1 and TYPE3 failure cases.

Cc: Martin K. Petersen 
Cc: Christoph Hellwig 
Cc: Hannes Reinecke 
Cc: Sagi Grimberg 
Cc: Or Gerlitz 
Signed-off-by: Nicholas Bellinger 
---
 drivers/target/target_core_transport.c |   30 ++
 include/target/target_core_base.h  |3 +++
 2 files changed, 33 insertions(+)

diff --git a/drivers/target/target_core_transport.c 
b/drivers/target/target_core_transport.c
index 91953da..707ee17 100644
--- a/drivers/target/target_core_transport.c
+++ b/drivers/target/target_core_transport.c
@@ -2648,6 +2648,36 @@ transport_send_check_condition_and_sense(struct se_cmd 
*cmd,
buffer[SPC_ASC_KEY_OFFSET] = 0x1d;
buffer[SPC_ASCQ_KEY_OFFSET] = 0x00;
break;
+   case TCM_LOGICAL_BLOCK_GUARD_CHECK_FAILED:
+   /* CURRENT ERROR */
+   buffer[0] = 0x70;
+   buffer[SPC_ADD_SENSE_LEN_OFFSET] = 10;
+   /* ILLEGAL REQUEST */
+   buffer[SPC_SENSE_KEY_OFFSET] = ILLEGAL_REQUEST;
+   /* LOGICAL BLOCK GUARD CHECK FAILED */
+   buffer[SPC_ASC_KEY_OFFSET] = 0x10;
+   buffer[SPC_ASCQ_KEY_OFFSET] = 0x01;
+   break;
+   case TCM_LOGICAL_BLOCK_APP_TAG_CHECK_FAILED:
+   /* CURRENT ERROR */
+   buffer[0] = 0x70;
+   buffer[SPC_ADD_SENSE_LEN_OFFSET] = 10;
+   /* ILLEGAL REQUEST */
+   buffer[SPC_SENSE_KEY_OFFSET] = ILLEGAL_REQUEST;
+   /* LOGICAL BLOCK APPLICATION TAG CHECK FAILED */
+   buffer[SPC_ASC_KEY_OFFSET] = 0x10;
+   buffer[SPC_ASCQ_KEY_OFFSET] = 0x02;
+   break;
+   case TCM_LOGICAL_BLOCK_REF_TAG_CHECK_FAILED:
+   /* CURRENT ERROR */
+   buffer[0] = 0x70;
+   buffer[SPC_ADD_SENSE_LEN_OFFSET] = 10;
+   /* ILLEGAL REQUEST */
+   buffer[SPC_SENSE_KEY_OFFSET] = ILLEGAL_REQUEST;
+   /* LOGICAL BLOCK REFERENCE TAG CHECK FAILED */
+   buffer[SPC_ASC_KEY_OFFSET] = 0x10;
+   buffer[SPC_ASCQ_KEY_OFFSET] = 0x03;
+   break;
case TCM_LOGICAL_UNIT_COMMUNICATION_FAILURE:
default:
/* CURRENT ERROR */
diff --git a/include/target/target_core_base.h 
b/include/target/target_core_base.h
index 15f402c..9a6e091 100644
--- a/include/target/target_core_base.h
+++ b/include/target/target_core_base.h
@@ -206,6 +206,9 @@ enum tcm_sense_reason_table {
TCM_OUT_OF_RESOURCES= R(0x12),
TCM_PARAMETER_LIST_LENGTH_ERROR = R(0x13),
TCM_MISCOMPARE_VERIFY   = R(0x14),
+   TCM_LOGICAL_BLOCK_GUARD_CHECK_FAILED= R(0x15),
+   TCM_LOGICAL_BLOCK_APP_TAG_CHECK_FAILED  = R(0x16),
+   TCM_LOGICAL_BLOCK_REF_TAG_CHECK_FAILED  = R(0x17),
 #undef R
 };
 
-- 
1.7.10.4

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 04/14] target/sbc: Add DIF TYPE1+TYPE3 read/write verify emulation

2014-01-08 Thread Nicholas A. Bellinger
From: Nicholas Bellinger 

This patch adds support for DIF read/write verify emulation
for TARGET_DIF_TYPE1_PROT + TARGET_DIF_TYPE3_PROT operation.

This includes sbc_dif_verify_write() + sbc_dif_verify_read()
calls accessable by backend drivers to perform DIF verify
for SGL based data and protection information.

Also included is sbc_dif_copy_prot() logic to copy protection
information to/from backend provided protection SGLs.

Based on scsi_debug.c DIF TYPE1+TYPE3 emulation.

Cc: Martin K. Petersen 
Cc: Christoph Hellwig 
Cc: Hannes Reinecke 
Cc: Sagi Grimberg 
Cc: Or Gerlitz 
Signed-off-by: Nicholas Bellinger 
---
 drivers/target/target_core_sbc.c |  180 ++
 include/target/target_core_backend.h |4 +
 2 files changed, 184 insertions(+)

diff --git a/drivers/target/target_core_sbc.c b/drivers/target/target_core_sbc.c
index 600ffcb..366b9bb 100644
--- a/drivers/target/target_core_sbc.c
+++ b/drivers/target/target_core_sbc.c
@@ -23,6 +23,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -998,3 +999,182 @@ err:
return ret;
 }
 EXPORT_SYMBOL(sbc_execute_unmap);
+
+static sense_reason_t
+sbc_dif_v1_verify(struct se_device *dev, struct se_dif_v1_tuple *sdt,
+ const void *p, sector_t sector, unsigned int ei_lba)
+{
+   int block_size = dev->dev_attrib.block_size;
+   __be16 csum;
+
+   if (dev->dev_attrib.pi_guard_type == TARGET_DIX_GUARD_CRC)
+   csum = cpu_to_be16(crc_t10dif(p, block_size));
+   else
+   csum = (__force __be16)ip_compute_csum(p, block_size);
+
+   if (sdt->guard_tag != csum) {
+   pr_err("DIFv1 checksum failed on sector %llu guard tag 0x%04x"
+   " csum 0x%04x\n", (unsigned long long)sector,
+   be16_to_cpu(sdt->guard_tag), be16_to_cpu(csum));
+   return TCM_LOGICAL_BLOCK_GUARD_CHECK_FAILED;
+   }
+
+   if (dev->dev_attrib.pi_prot_type == TARGET_DIF_TYPE1_PROT &&
+   be32_to_cpu(sdt->ref_tag) != (sector & 0x)) {
+   pr_err("DIFv1 Type 1 reference failed on sector: %llu tag: 
0x%08x"
+  " sector MSB: 0x%08x\n", (unsigned long long)sector,
+  be32_to_cpu(sdt->ref_tag), (u32)(sector & 0x));
+   return TCM_LOGICAL_BLOCK_REF_TAG_CHECK_FAILED;
+   }
+
+   if (dev->dev_attrib.pi_prot_type == TARGET_DIF_TYPE2_PROT &&
+   be32_to_cpu(sdt->ref_tag) != ei_lba) {
+   pr_err("DIFv1 Type 2 reference failed on sector: %llu tag: 
0x%08x"
+  " ei_lba: 0x%08x\n", (unsigned long long)sector,
+   be32_to_cpu(sdt->ref_tag), ei_lba);
+   return TCM_LOGICAL_BLOCK_REF_TAG_CHECK_FAILED;
+   }
+
+   return 0;
+}
+
+static void
+sbc_dif_copy_prot(struct se_cmd *cmd, unsigned int sectors, bool read,
+ struct scatterlist *sg, int sg_off)
+{
+   struct se_device *dev = cmd->se_dev;
+   struct scatterlist *psg;
+   void *paddr, *addr;
+   unsigned int i, len, left;
+
+   left = sectors * dev->prot_length;
+
+   for_each_sg(cmd->t_prot_sg, psg, cmd->t_prot_nents, i) {
+
+   len = min(psg->length, left);
+   paddr = kmap_atomic(sg_page(psg)) + psg->offset;
+   addr = kmap_atomic(sg_page(sg)) + sg_off;
+
+   if (read)
+   memcpy(paddr, addr, len);
+   else
+   memcpy(addr, paddr, len);
+
+   left -= len;
+   kunmap_atomic(paddr);
+   kunmap_atomic(addr);
+   }
+}
+
+sense_reason_t
+sbc_dif_verify_write(struct se_cmd *cmd, sector_t start, unsigned int sectors,
+unsigned int ei_lba, struct scatterlist *sg, int sg_off)
+{
+   struct se_device *dev = cmd->se_dev;
+   struct se_dif_v1_tuple *sdt;
+   struct scatterlist *dsg, *psg = cmd->t_prot_sg;
+   sector_t sector = start;
+   void *daddr, *paddr;
+   int i, j, offset = 0;
+   sense_reason_t rc;
+
+   for_each_sg(cmd->t_data_sg, dsg, cmd->t_data_nents, i) {
+   daddr = kmap_atomic(sg_page(dsg)) + dsg->offset;
+   paddr = kmap_atomic(sg_page(psg)) + psg->offset;
+
+   for (j = 0; j < dsg->length; j += dev->dev_attrib.block_size) {
+
+   if (offset >= psg->length) {
+   kunmap_atomic(paddr);
+   psg = sg_next(psg);
+   paddr = kmap_atomic(sg_page(psg)) + psg->offset;
+   offset = 0;
+   }
+
+   sdt = paddr + offset;
+
+   pr_debug("DIF WRITE sector: %llu guard_tag: 0x%04x"
+" app_tag: 0x%04x ref_tag: %u\n",
+(unsigned long long)sector, sdt->guard_tag,
+   

[PATCH 01/14] target: Add DIF related base definitions

2014-01-08 Thread Nicholas A. Bellinger
From: Nicholas Bellinger 

This patch adds DIF related definitions to target_core_base.h
that includes enums for target_prot_op + target_prot_type +
target_prot_version + target_guard_type + target_pi_error.

Also included is struct se_dif_v1_tuple, along with changes
to struct se_cmd, struct se_dev_attrib, and struct se_device.

Also, add new se_subsystem_api->[init,free]_prot() callers
used by target core code to setup backend specific protection
information after the device has been configured.

Enums taken from Sagi Grimberg's original patch.

Cc: Martin K. Petersen 
Cc: Christoph Hellwig 
Cc: Hannes Reinecke 
Cc: Sagi Grimberg 
Cc: Or Gerlitz 
Signed-off-by: Nicholas Bellinger 

target: more defs

Signed-off-by: Nicholas Bellinger 
---
 include/target/target_core_backend.h |2 ++
 include/target/target_core_base.h|   59 ++
 2 files changed, 61 insertions(+)

diff --git a/include/target/target_core_backend.h 
b/include/target/target_core_backend.h
index 39e0114..930f30d 100644
--- a/include/target/target_core_backend.h
+++ b/include/target/target_core_backend.h
@@ -41,6 +41,8 @@ struct se_subsystem_api {
unsigned int (*get_io_opt)(struct se_device *);
unsigned char *(*get_sense_buffer)(struct se_cmd *);
bool (*get_write_cache)(struct se_device *);
+   int (*init_prot)(struct se_device *);
+   void (*free_prot)(struct se_device *);
 };
 
 struct sbc_ops {
diff --git a/include/target/target_core_base.h 
b/include/target/target_core_base.h
index 45412a6..15f402c 100644
--- a/include/target/target_core_base.h
+++ b/include/target/target_core_base.h
@@ -166,6 +166,7 @@ enum se_cmd_flags_table {
SCF_COMPARE_AND_WRITE   = 0x0008,
SCF_COMPARE_AND_WRITE_POST  = 0x0010,
SCF_CMD_XCOPY_PASSTHROUGH   = 0x0020,
+   SCF_PROT= 0x0040,
 };
 
 /* struct se_dev_entry->lun_flags and struct se_lun->lun_access */
@@ -414,6 +415,45 @@ struct se_tmr_req {
struct list_headtmr_list;
 };
 
+enum target_prot_op {
+   TARGET_PROT_NORMAL,
+   TARGET_PROT_READ_INSERT,
+   TARGET_PROT_WRITE_INSERT,
+   TARGET_PROT_READ_STRIP,
+   TARGET_PROT_WRITE_STRIP,
+   TARGET_PROT_READ_PASS,
+   TARGET_PROT_WRITE_PASS,
+};
+
+enum target_prot_type {
+   TARGET_DIF_TYPE0_PROT,
+   TARGET_DIF_TYPE1_PROT,
+   TARGET_DIF_TYPE2_PROT,
+   TARGET_DIF_TYPE3_PROT,
+};
+
+enum target_prot_version {
+   TARGET_DIF_V1 = 1,
+   TARGET_DIF_V2 = 2,
+};
+
+enum target_guard_type {
+   TARGET_DIX_GUARD_CRC = 1,
+   TARGET_DIX_GUARD_IP = 2,
+};
+
+enum target_pi_error {
+   TARGET_GUARD_CHECK_FAILED = 0x1,
+   TARGET_APPTAG_CHECK_FAILED = 0x2,
+   TARGET_REFTAG_CHECK_FAILED = 0x3,
+};
+
+struct se_dif_v1_tuple {
+   __be16  guard_tag;
+   __be16  app_tag;
+   __be32  ref_tag;
+};
+
 struct se_cmd {
/* SAM response code being sent to initiator */
u8  scsi_status;
@@ -498,6 +538,20 @@ struct se_cmd {
 
/* Used for lun->lun_ref counting */
boollun_ref_active;
+
+   /* DIF related members */
+   enum target_prot_op prot_op;
+   enum target_prot_type   prot_type;
+   enum target_guard_type  bg_type;
+   u16 bg_seed;
+   u16 reftag_seed;
+   u32 apptag_seed;
+   u32 prot_length;
+   struct scatterlist  *t_prot_sg;
+   unsigned intt_prot_nents;
+   boolprot_interleaved;
+   enum target_pi_errorpi_err;
+   u32 block_num;
 };
 
 struct se_ua {
@@ -609,6 +663,9 @@ struct se_dev_attrib {
int emulate_tpws;
int emulate_caw;
int emulate_3pc;
+   enum target_prot_type pi_prot_type;
+   enum target_prot_version pi_prot_version;
+   enum target_guard_type pi_guard_type;
int enforce_pr_isids;
int is_nonrot;
int emulate_rest_reord;
@@ -739,6 +796,8 @@ struct se_device {
/* Linked list for struct se_hba struct se_device list */
struct list_headdev_list;
struct se_lun   xcopy_lun;
+   /* Protection Information */
+   int prot_length;
 };
 
 struct se_hba {
-- 
1.7.10.4

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 03/14] target/sbc: Add sbc_check_prot + update sbc_parse_cdb for DIF

2014-01-08 Thread Nicholas A. Bellinger
From: Nicholas Bellinger 

This patch adds sbc_check_prot() for performing various DIF
related CDB sanity checks, along with setting SCF_PROT once
sanity checks have passed.

Also, add calls in sbc_parse_cdb() for READ_[10,12,16] +
WRITE_[10,12,16] to perform DIF sanity checking.

Cc: Martin K. Petersen 
Cc: Christoph Hellwig 
Cc: Hannes Reinecke 
Cc: Sagi Grimberg 
Cc: Or Gerlitz 
Signed-off-by: Nicholas Bellinger 
---
 drivers/target/target_core_sbc.c |   39 ++
 1 file changed, 39 insertions(+)

diff --git a/drivers/target/target_core_sbc.c b/drivers/target/target_core_sbc.c
index 52ae54e..600ffcb 100644
--- a/drivers/target/target_core_sbc.c
+++ b/drivers/target/target_core_sbc.c
@@ -563,6 +563,27 @@ sbc_compare_and_write(struct se_cmd *cmd)
return TCM_NO_SENSE;
 }
 
+bool
+sbc_check_prot(struct se_device *dev, struct se_cmd *cmd, unsigned char *cdb)
+{
+   if (!dev->dev_attrib.pi_prot_type)
+   return true;
+
+   if (dev->dev_attrib.pi_prot_type == TARGET_DIF_TYPE2_PROT &&
+  (cdb[1] & 0xe0))
+   return false;
+
+   if (!(cdb[1] & 0xe0)) {
+   pr_warn("Target: Unprotected READ/WRITE to DIF device\n");
+   return true;
+   }
+   if (!cmd->t_prot_sg || !cmd->t_prot_nents)
+   return true;
+
+   cmd->se_cmd_flags |= SCF_PROT;
+   return true;
+}
+
 sense_reason_t
 sbc_parse_cdb(struct se_cmd *cmd, struct sbc_ops *ops)
 {
@@ -581,6 +602,9 @@ sbc_parse_cdb(struct se_cmd *cmd, struct sbc_ops *ops)
cmd->execute_cmd = sbc_execute_rw;
break;
case READ_10:
+   if (!sbc_check_prot(dev, cmd, cdb))
+   return TCM_UNSUPPORTED_SCSI_OPCODE;
+
sectors = transport_get_sectors_10(cdb);
cmd->t_task_lba = transport_lba_32(cdb);
cmd->se_cmd_flags |= SCF_SCSI_DATA_CDB;
@@ -588,6 +612,9 @@ sbc_parse_cdb(struct se_cmd *cmd, struct sbc_ops *ops)
cmd->execute_cmd = sbc_execute_rw;
break;
case READ_12:
+   if (!sbc_check_prot(dev, cmd, cdb))
+   return TCM_UNSUPPORTED_SCSI_OPCODE;
+
sectors = transport_get_sectors_12(cdb);
cmd->t_task_lba = transport_lba_32(cdb);
cmd->se_cmd_flags |= SCF_SCSI_DATA_CDB;
@@ -595,6 +622,9 @@ sbc_parse_cdb(struct se_cmd *cmd, struct sbc_ops *ops)
cmd->execute_cmd = sbc_execute_rw;
break;
case READ_16:
+   if (!sbc_check_prot(dev, cmd, cdb))
+   return TCM_UNSUPPORTED_SCSI_OPCODE;
+
sectors = transport_get_sectors_16(cdb);
cmd->t_task_lba = transport_lba_64(cdb);
cmd->se_cmd_flags |= SCF_SCSI_DATA_CDB;
@@ -610,6 +640,9 @@ sbc_parse_cdb(struct se_cmd *cmd, struct sbc_ops *ops)
break;
case WRITE_10:
case WRITE_VERIFY:
+   if (!sbc_check_prot(dev, cmd, cdb))
+   return TCM_UNSUPPORTED_SCSI_OPCODE;
+
sectors = transport_get_sectors_10(cdb);
cmd->t_task_lba = transport_lba_32(cdb);
if (cdb[1] & 0x8)
@@ -619,6 +652,9 @@ sbc_parse_cdb(struct se_cmd *cmd, struct sbc_ops *ops)
cmd->execute_cmd = sbc_execute_rw;
break;
case WRITE_12:
+   if (!sbc_check_prot(dev, cmd, cdb))
+   return TCM_UNSUPPORTED_SCSI_OPCODE;
+
sectors = transport_get_sectors_12(cdb);
cmd->t_task_lba = transport_lba_32(cdb);
if (cdb[1] & 0x8)
@@ -628,6 +664,9 @@ sbc_parse_cdb(struct se_cmd *cmd, struct sbc_ops *ops)
cmd->execute_cmd = sbc_execute_rw;
break;
case WRITE_16:
+   if (!sbc_check_prot(dev, cmd, cdb))
+   return TCM_UNSUPPORTED_SCSI_OPCODE;
+
sectors = transport_get_sectors_16(cdb);
cmd->t_task_lba = transport_lba_64(cdb);
if (cdb[1] & 0x8)
-- 
1.7.10.4

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Terrible performance of sequential O_DIRECT 4k writes in SAN environment. ~3 times slower then Solars 10 with the same HBA/Storage.

2014-01-08 Thread Jan Kara
On Wed 08-01-14 19:30:38, Sergey Meirovich wrote:
> On 8 January 2014 17:26, Christoph Hellwig  wrote:
> >
> > On my laptop SSD I get the following results (sometimes up to 200MB/s,
> > sometimes down to 100MB/s, always in the 40k to 50k IOps range):
> >
> > time elapsed (sec.):5
> > bandwidth (MiB/s):  160.00
> > IOps:   40960.00
> 
> Any direct attached storage I've tried was faster for me as well,
> indeed. I have already posted IIRC
> "06:00.0 RAID bus controller: LSI Logic / Symbios Logic MegaRAID SAS
> 2208 [Thunderbolt] (rev 05)"   - 1Gb BBU RAM
> sysbench seqwr aio 4k: 326.24Mb/sec 20879.56 Requests/sec
> 
> That is good that you mentioned SSD. I've tried fnic HBA zoned to EMC
> XtremIO (SSD only based storage)
>  14.43Mb/sec 3693.65 Requests/sec for sequential 4k.
  You see big degradation only in SAN environments because they have
generally higher latency to complete a single request. And given appending
direct IO is completely synchronous, the latency is the only thing that
really matters for performance. I've also seen my desktop-grade SATA drive
perform better than some enterprise grade SAN for this particular
workload...

> So far I've seen so massive degradation only in SAN environment. I
> started my investigation with RHEL6.5 kernel so below table is from it
> but the trend is the same as for mainline it seems.
> 
> Chunk size Bandwidth MiB/s
> 
> 64M512
> 32M510
> 16M492
> 8M  451
> 4M  436
> 2M  350
> 1M  256
> 512K   191
> 256K   165
> 128K   142
> 64K 101
> 32K 65
> 16K 39
> 8K   20
> 4K   11
  Yes, that's expected. The latency to complete a request consists of some
fixed overhead + time to write data. So for small request sizes the latency
is constant (corresponding to bandwidth growing linearly with the request
size) and for larger request sizes latency somewhat grows so bandwidth grows
slower and slower (as the time to write the data forms larger and larger
part of the total latency)...

Honza
-- 
Jan Kara 
SUSE Labs, CR
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


I have a very lucrative Business

2014-01-08 Thread Dr. Musa Hamed
ATTN: Dear,

I am Dr. Musa Hamed Executive Director of the Bank of Africa
Plc B.O.A,Cotonou Benin Rep. I have a very lucrative Business proposal
worth of 21,500,000.00 USD.

An Iraqi named Hamadi Hashem a business man made a fixed deposit of
21,500,000.00 USD . Upon maturity several notices was sent to him,
even during the war,eleven years ago (2003) up till this date. Again
after the war another notification was sent and still no response came
from him. it has been found out that Hamadi Hashem and his family had
been killed during the war in bomb blast that hit their home at
Mukaradeeb where his personal oil well was. I am prepared to split the
funds with you 50%:50%.By placing you as the NEXT OF KIN to his
deposit.

Should you be interested in doing this deal with me, please send me
the following information:

(1) Your Name
(2) Resident Address
(3) Your Occupation
(4) Your Phone Number
(5) Date of Birth
(6) Resident Country

And I will prefer you to reach me privately and securely on my personal
email address: dr.musahame...@yahoo.fr
Tel: +229-988-400-97

I will provide you with further information as soon as I hear
positively from you.

Regards,
Dr. Musa Hamed
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: REQ_PM vs REQ_TYPE_PM_RESUME

2014-01-08 Thread Alan Stern
On Wed, 8 Jan 2014, Phillip Susi wrote:

> > In essence, you want your disk's state to move directly from
> > unpowered (system asleep) to runtime-suspended with no intermediate
> > active -- but only if the disk doesn't spin up all by itself.
> > 
> > One problem: How can the kernel tell whether the disk has spun up
> > by itself?  I don't think it can be done at the SCSI level.  Can it
> > be done at the ATA level (and without using the SCSI stack, as that
> > would require the disk to go out of runtime suspend and spin up
> > anyway)?
> 
> You issue a REQUEST SENSE command and that returns status indicating
> whether the drive is stopped, or in standby.  See my patches.  One of

I never saw your patches.  Where were they posted?

If you issue the REQUEST SENSE command in the usual way (going through
the SCSI and block layers), and the disk is already in runtime suspend,
it won't do what you want.  The block layer won't deliver requests
until the device leaves the RPM_SUSPENDED state.  In addition, when it
receives the command the block layer will submit a deferred
runtime-resume request, which rather defeats the whole purpose.

(I guess you actually saw some of this happen, and that's what led to
this email thread in the first place.)

It's a knotty situation.  The only way to find out whether the disk is
spinning is to ask it, which requires doing I/O, which requires
spinning up the disk.  Maybe we need to add a mechanism to the block
layer's runtime PM implementation for addressing just this situation.

> them fixes the scsi-ata translation to issue the ATA CHECK power
> command to see if the drive is suspended, as SAT-3 calls for.

What happens with non-ATA disks?


> > Actually, it does have a reason.  Namely, you told it earlier to
> > assume the disk will be needed again if it was used within the last
> > 5 minutes.  At the time of the system resume, the disk was last
> > used 3 minutes ago (not counting the time the system was asleep).
> 
> You're looking at it backwards.  You aren't telling it to assume the
> disk will be needed again if it was used within the last 5 minutes;
> you are telling it that it can assume it *won't* be needed again soon
> if not used in the last 5 minutes, and therefore, it is ok to shut it
> down.

Okay, yes, the autosuspend timeout affects when devices should be
powered down, not when they should be powered back up.

> Also the act of suspending the system is a pretty clear breaking point
> in general activity.  You normally stop your work and quiesce the
> system before you suspend it.  If I finish my work and copy it to my
> backup drive, then suspend the system, and my wife comes by an hour
> later to browse the web, she's not going to be touching my backup disk.

While that's true, it's also true that any userspace processes running 
before the system suspend will continue running, with no perceptible 
break, after the system wakes up.  If they were accessing the disk 
before, they will most likely continue to access it afterward.

But never mind all this; it's a pointless discussion.  The real 
question is how to implement what you want.

Alan Stern

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: REQ_PM vs REQ_TYPE_PM_RESUME

2014-01-08 Thread Phillip Susi
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 1/8/2014 4:21 PM, Alan Stern wrote:
> I never saw your patches.  Where were they posted?

Higher up in this thread on linux-ide and linux-scsi.  Subject was Let
sleeping disks lie.

> If you issue the REQUEST SENSE command in the usual way (going
> through the SCSI and block layers), and the disk is already in
> runtime suspend, it won't do what you want.  The block layer won't
> deliver requests until the device leaves the RPM_SUSPENDED state.
> In addition, when it receives the command the block layer will
> submit a deferred runtime-resume request, which rather defeats the
> whole purpose.

Right, which is why I left the device in the active state and used the
pre/post_suspend functions to change the queue into the RPM_SUSPENDING
state, then either complete the transition to RPM_SUSPENDED, or bail
out and go back to RPM_ACTIVE.

> What happens with non-ATA disks?

Same thing: they return a sense status that either says they are
suspended, stopped, or normal.


-BEGIN PGP SIGNATURE-
Version: GnuPG v2.0.17 (MingW32)
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iQEcBAEBAgAGBQJSzcg2AAoJEI5FoCIzSKrw7wcIAJLiliMYnoHLm/dGWHhhfcnb
ELJLLAhhWsOIQmHiV1svdy0F7EuKi3KF9qJ9JwCaAspO0w8q24UGhvQW20uSH9Md
JliKhn6eBY/NFTctaVP2tnOcc4vevCndJfEScBlWxI82MWMC1TV3lA7xPtcJ5ocX
LMBLPXpr0qypisgmu/tAnwTPQVyU4WUkgMmG9us4w3BgCkvI/oXf4oDKFkCNgUye
FX16NTZPR6iaIK+YJMG3uKCSD4CQjTnKJNXGR89XKJG+Z9v04jyQgWs7kgvqCov9
F5j+yueyMVlveGPHmNwhElHEPT2UWSoyQbonuOrYQZ+Db+xJBOxzoY8Jgq9lWVI=
=+vwf
-END PGP SIGNATURE-
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC 2/2] dual scan thread bug fix

2014-01-08 Thread James Bottomley
On Wed, 2014-01-08 at 10:57 -0500, Alan Stern wrote:
> On Wed, 8 Jan 2014, James Bottomley wrote:
> 
> > OK, Agreed, but that means modifying the 1/2 patch with the below.  This
> > should make the proposed diff to 2/2 correct.
> > 
> > James
> > 
> > ---
> > diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
> > index ef3f958..5fad646 100644
> > --- a/drivers/scsi/scsi_scan.c
> > +++ b/drivers/scsi/scsi_scan.c
> > @@ -417,7 +417,7 @@ static struct scsi_target *scsi_alloc_target(struct 
> > device *parent,
> > + shost->transportt->target_size;
> > struct scsi_target *starget;
> > struct scsi_target *found_target;
> > -   int error;
> > +   int error, ref_got;
> >  
> > starget = kzalloc(size, GFP_KERNEL);
> > if (!starget) {
> > @@ -466,15 +466,15 @@ static struct scsi_target *scsi_alloc_target(struct 
> > device *parent,
> > return starget;
> >  
> >   found:
> > -   if (!kref_get_unless_zero(&found_target->reap_ref))
> > -   /*
> > -* release routine already fired.  Target is dead, but
> > -* STARGET_DEL may not yet be set (set in the release
> > -* routine), so set here as well, just in case
> > -*/
> > -   found_target->state = STARGET_DEL;
> > +   /*
> > +* release routine already fired if kref is zero, so if we can still
> > +* take the reference, the target must be alive.  If we can't, it must
> > +* be dying and we need to wait for a new target
> > +*/
> > +   ref_got = kref_get_unless_zero(&found_target->reap_ref);
> > +
> > spin_unlock_irqrestore(shost->host_lock, flags);
> > -   if (found_target->state != STARGET_DEL) {
> > +   if (ref_got) {
> > put_device(dev);
> > return found_target;
> > }
> > @@ -482,8 +482,8 @@ static struct scsi_target *scsi_alloc_target(struct 
> > device *parent,
> >  * Unfortunately, we found a dying target; need to wait until it's
> >  * dead before we can get a new one.  There is an anomaly here.  We
> >  * *should* call scsi_target_reap() to balance the kref_get() of the
> > -* reap_ref above.  However, since the target is in state STARGET_DEL,
> > -* it's already invisible and the reap_ref is irrelevant.  If we call
> > +* reap_ref above.  However, since the target being released, it's
> > +* already invisible and the reap_ref is irrelevant.  If we call
> >  * scsi_target_reap() we might spuriously do another device_del() on
> >  * an already invisible target.
> >  */
> 
> In fact, most of this comment (everything after the first sentence) is
> no longer needed.  If the target is dying then kref_get_unless_zero
> must have failed, so there is no need to worry about unbalanced
> refcounts.

I'd like to keep the comment: I get a lot of email from people who write
static checkers for this.  In principle, I agree, they should treat
kref_get_unless_zero() as a spin_trylock(), but it's nice to have
something concrete in the code to point to when the email arrives.

> Otherwise this looks good.

Great, thanks, I'll re-roll and repost.

James



--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] drivers: target: target_core_mod: use div64_u64_rem() instead of operator '%' for u64

2014-01-08 Thread Nicholas A. Bellinger
On Wed, 2014-01-08 at 08:32 +0100, Hannes Reinecke wrote:
> On 12/24/2013 04:35 AM, Chen Gang wrote:
> > On 12/23/2013 02:51 PM, Nicholas A. Bellinger wrote:
> >> On Sun, 2013-12-22 at 17:17 +0800, Chen Gang wrote:



> >>> The related fix patch changed "start_lba = lba % ..." to "start_lba =
> >>> lba / ...", and also assumed "segment_size * segment_mult" is still
> >>> within u32 (can not cause type over flow).
> >>>
> >>> I guess the original author already knew about them, and intended to do
> >>> like that (if not, please let me know, thanks).
> >>>
> >>
> >> Sorry, your correct that the original code is using modulo division to
> >> calculate start_lba.
> >>
> > 
> > Oh, that's all right, (in fact, don't need sorry), I am not quite
> > familiar with the details, so need related member help check it.  :-)
> > 
> >> Hannes, can you please double check this below..?
> >>
> > 
> > Please help check when have time, thanks.
> > 
> I would even convert segment_size and segment_mult to u64,
> to ensure no overflows occur:
> 
> diff --git a/drivers/target/target_core_alua.c
> b/drivers/target/target_core_alua
> .c
> index 9b1856d..54b1e52 100644
> --- a/drivers/target/target_core_alua.c
> +++ b/drivers/target/target_core_alua.c
> @@ -477,8 +477,7 @@ static inline int core_alua_state_lba_dependent(
> u8 *alua_ascq)
>  {
> struct se_device *dev = cmd->se_dev;
> -   u32 segment_size, segment_mult, sectors;
> -   u64 lba;
> +   u64 segment_size, segment_mult, sectors, lba;
> 
> /* Only need to check for cdb actually containing LBAs */
> if (!(cmd->se_cmd_flags & SCF_SCSI_DATA_CDB))
> 
> 

Will squash the above into the original patch shortly in for-next..

> Other than that the sector_div() patch is correct.
> 

 Thanks for confirming that sector_div() is correct here vs. the
original code using modulo that Chen had pointed out.

Thanks Hannes!

--nab

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 3.12 033/118] usb: xhci: Link TRB must not occur within a USB payload burst

2014-01-08 Thread walt
On 01/03/2014 03:29 PM, Sarah Sharp wrote:

> I'll let you know when I have some diagnostic patches ready.

Hi Sarah.  I see today gregkh committed the patches you've already sent
me, so I assume someone (other than me) has tested those patches and
discovered some benefit from them?

I'm still wondering if I'm suffering from hardware quirks.  From the
first day I installed my usb3 adapter card and the usb3 disk docking
station I've noticed some quirky behavior.

e.g. I boot the machine with the docking station powered-off, and then
later I power it on, the usb3 disk is not detected at all -- until I
reboot the machine with the docking station still powered on.

Minor stuff, yes, but maybe relevant?  I dunno.
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: REQ_PM vs REQ_TYPE_PM_RESUME

2014-01-08 Thread Aaron Lu
On 01/09/2014 05:21 AM, Alan Stern wrote:
> On Wed, 8 Jan 2014, Phillip Susi wrote:
>> You issue a REQUEST SENSE command and that returns status indicating
>> whether the drive is stopped, or in standby.  See my patches.  One of
> 
> I never saw your patches.  Where were they posted?
> 
> If you issue the REQUEST SENSE command in the usual way (going through
> the SCSI and block layers), and the disk is already in runtime suspend,
> it won't do what you want.  The block layer won't deliver requests
> until the device leaves the RPM_SUSPENDED state.  In addition, when it
> receives the command the block layer will submit a deferred
> runtime-resume request, which rather defeats the whole purpose.
> 
> (I guess you actually saw some of this happen, and that's what led to
> this email thread in the first place.)
> 
> It's a knotty situation.  The only way to find out whether the disk is
> spinning is to ask it, which requires doing I/O, which requires
> spinning up the disk.  Maybe we need to add a mechanism to the block
> layer's runtime PM implementation for addressing just this situation.

I think it's knotty because the runtime PM status is a view from the
kernel/host side, i.e. it is runtime suspended if it is not being used,
no matter which power state it is in. The trigger for the PM state
transition are all based on this, if we want to do it the other way
around(update device's runtime PM status based on device's actual power
state), we are in a knotty situation.
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] iscsi: fix wrong order of opcode and itt in iscsi_handle_reject prompt

2014-01-08 Thread Vaughan Cao
This patch makes reject messages show right value for opcode and itt, which
is converse previously.

Signed-off-by: Vaughan Cao 
---
 drivers/scsi/libiscsi.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c
index e399561..27547ff 100644
--- a/drivers/scsi/libiscsi.c
+++ b/drivers/scsi/libiscsi.c
@@ -1013,13 +1013,13 @@ static int iscsi_handle_reject(struct iscsi_conn *conn, 
struct iscsi_hdr *hdr,
iscsi_conn_printk(KERN_ERR, conn,
  "pdu (op 0x%x itt 0x%x) rejected "
  "due to DataDigest error.\n",
- rejected_pdu.itt, opcode);
+ opcode, rejected_pdu.itt);
break;
case ISCSI_REASON_IMM_CMD_REJECT:
iscsi_conn_printk(KERN_ERR, conn,
  "pdu (op 0x%x itt 0x%x) rejected. Too many "
  "immediate commands.\n",
- rejected_pdu.itt, opcode);
+ opcode, rejected_pdu.itt);
/*
 * We only send one TMF at a time so if the target could not
 * handle it, then it should get fixed (RFC mandates that
@@ -1059,8 +1059,8 @@ static int iscsi_handle_reject(struct iscsi_conn *conn, 
struct iscsi_hdr *hdr,
default:
iscsi_conn_printk(KERN_ERR, conn,
  "pdu (op 0x%x itt 0x%x) rejected. Reason "
- "code 0x%x\n", rejected_pdu.itt,
- rejected_pdu.opcode, reject->reason);
+ "code 0x%x\n", rejected_pdu.opcode,
+ rejected_pdu.itt, reject->reason);
break;
}
return rc;
-- 
1.8.3.1

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] iscsi: fix wrong order of opcode and itt in iscsi_handle_reject prompt

2014-01-08 Thread Mike Christie
On 01/08/2014 08:21 PM, Vaughan Cao wrote:
> This patch makes reject messages show right value for opcode and itt, which
> is converse previously.
> 
> Signed-off-by: Vaughan Cao 
> ---
>  drivers/scsi/libiscsi.c | 8 
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c
> index e399561..27547ff 100644
> --- a/drivers/scsi/libiscsi.c
> +++ b/drivers/scsi/libiscsi.c
> @@ -1013,13 +1013,13 @@ static int iscsi_handle_reject(struct iscsi_conn 
> *conn, struct iscsi_hdr *hdr,
>   iscsi_conn_printk(KERN_ERR, conn,
> "pdu (op 0x%x itt 0x%x) rejected "
> "due to DataDigest error.\n",
> -   rejected_pdu.itt, opcode);
> +   opcode, rejected_pdu.itt);
>   break;
>   case ISCSI_REASON_IMM_CMD_REJECT:
>   iscsi_conn_printk(KERN_ERR, conn,
> "pdu (op 0x%x itt 0x%x) rejected. Too many "
> "immediate commands.\n",
> -   rejected_pdu.itt, opcode);
> +   opcode, rejected_pdu.itt);
>   /*
>* We only send one TMF at a time so if the target could not
>* handle it, then it should get fixed (RFC mandates that
> @@ -1059,8 +1059,8 @@ static int iscsi_handle_reject(struct iscsi_conn *conn, 
> struct iscsi_hdr *hdr,
>   default:
>   iscsi_conn_printk(KERN_ERR, conn,
> "pdu (op 0x%x itt 0x%x) rejected. Reason "
> -   "code 0x%x\n", rejected_pdu.itt,
> -   rejected_pdu.opcode, reject->reason);
> +   "code 0x%x\n", rejected_pdu.opcode,
> +   rejected_pdu.itt, reject->reason);
>   break;
>   }
>   return rc;
> 

Thanks.

Acked-by: Mike Christie 
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html