Re: [PATCH] scsi: sd: Remember that READ CAPACITY(16) succeeded

2018-03-22 Thread Bart Van Assche
On Wed, 2018-03-14 at 12:15 -0400, Martin K. Petersen wrote:
> The USB storage glue sets the try_rc_10_first flag in an attempt to
> avoid wedging poorly implemented legacy USB devices.
> 
> If the device capacity is too large to be expressed in the provided
> response buffer field of READ CAPACITY(10), a well-behaved device will
> set the reported capacity to 0x. We will then attempt to issue a
> READ CAPACITY(16) to obtain the real capacity.
> 
> Since this part of the discovery logic is not covered by the first_scan
> flag, a warning will be printed a couple of times times per revalidate
> attempt if we upgrade from READ CAPACITY(10) to READ CAPACITY(16).
> 
> Remember that we have successfully issued READ CAPACITY(16) so we can
> take the fast path on subsequent revalidate attempts.

Reviewed-by: Bart Van Assche 




答复: [PATCH v8 2/5] dt-bindings: scsi: ufs: add document for hisi-ufs

2018-03-22 Thread liwei (CM)
Hi, Arnd
Sorry to bother you again, please take the time to review the patch. Are there 
any other suggestions?
Looking forward to your reply.

-邮件原件-
发件人: arndbergm...@gmail.com [mailto:arndbergm...@gmail.com] 代表 Arnd Bergmann
发送时间: 2018年2月19日 17:58
收件人: liwei (CM)
抄送: Rob Herring; Mark Rutland; xuwei (O); Catalin Marinas; Will Deacon; Vinayak 
Holikatti; James E.J. Bottomley; Martin K. Petersen; Kevin Hilman; Gregory 
CLEMENT; Thomas Petazzoni; Masahiro Yamada; Riku Voipio; Thierry Reding; 
Krzysztof Kozlowski; Eric Anholt; DTML; Linux Kernel Mailing List; Linux ARM; 
linux-scsi; zangleigang; Gengjianfeng; Guodong Xu; Zhangfei Gao; Fengbaopeng 
(kevin, Kirin Solution Dept)
主题: Re: [PATCH v8 2/5] dt-bindings: scsi: ufs: add document for hisi-ufs

On Tue, Feb 13, 2018 at 11:14 AM, Li Wei  wrote:
> add ufs node document for Hisilicon.
>
> Signed-off-by: Li Wei 
> ---
>  Documentation/devicetree/bindings/ufs/ufs-hisi.txt | 37 
> ++
>  1 file changed, 37 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/ufs/ufs-hisi.txt


I'm pretty sure we've discussed it before, but can you make this so that the 
generic properties are part of the ufshcd binding, and you refer to it from 
here, only describing in what ways the hisi ufs binding differs from the 
standard?

> diff --git a/Documentation/devicetree/bindings/ufs/ufs-hisi.txt 
> b/Documentation/devicetree/bindings/ufs/ufs-hisi.txt
> new file mode 100644
> index ..0d21b57496cf
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/ufs/ufs-hisi.txt
> @@ -0,0 +1,37 @@
> +* Hisilicon Universal Flash Storage (UFS) Host Controller
> +
> +UFS nodes are defined to describe on-chip UFS hardware macro.
> +Each UFS Host Controller should have its own node.
> +
> +Required properties:
> +- compatible: compatible list, contains one of the following -
> +   "hisilicon,hi3660-ufs", 
> "jedec,ufs-1.1" for hisi ufs
> +   host controller present on Hi36xx 
> chipset.
> +- reg   : should contain UFS register address space & UFS SYS 
> CTRL register address,
> +- interrupt-parent  : interrupt device
> +- interrupts: interrupt number
> +- clocks   : List of phandle and clock specifier pairs
> +- clock-names   : List of clock input name strings sorted in the same
> +   order as the clocks property. 
> +"ref_clk", "phy_clk" is optional

The clock names sound generic enough, should we have both in the generic 
binding?

Do you mean that add a "phy_clk" to ufshcd-pltfrm 's bindings? 
At present, it seems that in the implementation of generic code, apart from 
"ref_clk" may have special processing, other clk will not have special 
processing and simply parse and enable;
Referring to ufs-qcom binding, I think "phy_clk" can be named "iface_clk", this 
"iface_clk" exists in ufshcd-pltfrm bindings;If so, "ref_clk", "iface_clk" are 
both in the generic binding,we will remove them here. Is that okay?

> +- resets: reset node register, one reset the clk and the other 
> reset the controller
> +- reset-names   : describe reset node register

This looks incomplete. What is the name of the reset line supposed to be?
I'd also suggest you document it in the ufshcd binding instead.

The "rst" corresponds to reset the whole UFS IP, and " arst " only reset the 
APB/AXI bus. Discussed with our soc colleagues that "arst" is assert by default 
and needs to deassert .
But I think it may be difficult to add this to common code, or it may not be 
necessary; 
Other manufacturers may not need to do this soc init because they probably 
already done in the bootloader phase. Even if they need to do it, it's probably 
different from us.
We need to make sure that our ufs works even if not do soc init during the 
bootloader phase.




  Arnd


[PATCH 2/3] Netvsc: Use the vmbus functiton to calculate ring buffer percentage

2018-03-22 Thread Long Li
From: Long Li 

In Vmbus, we have defined a function to calculate available ring buffer
percentage to write.

Use that function and remove duplicate netvsc code.

Signed-off-by: Long Li 
---
 drivers/net/hyperv/netvsc.c | 17 +++--
 drivers/net/hyperv/netvsc_drv.c |  3 ---
 2 files changed, 3 insertions(+), 17 deletions(-)

diff --git a/drivers/net/hyperv/netvsc.c b/drivers/net/hyperv/netvsc.c
index 0265d703eb03..8af0069e4d8c 100644
--- a/drivers/net/hyperv/netvsc.c
+++ b/drivers/net/hyperv/netvsc.c
@@ -31,7 +31,6 @@
 #include 
 #include 
 #include 
-#include 
 
 #include 
 
@@ -590,17 +589,6 @@ void netvsc_device_remove(struct hv_device *device)
 #define RING_AVAIL_PERCENT_HIWATER 20
 #define RING_AVAIL_PERCENT_LOWATER 10
 
-/*
- * Get the percentage of available bytes to write in the ring.
- * The return value is in range from 0 to 100.
- */
-static u32 hv_ringbuf_avail_percent(const struct hv_ring_buffer_info 
*ring_info)
-{
-   u32 avail_write = hv_get_bytes_to_write(ring_info);
-
-   return reciprocal_divide(avail_write  * 100, netvsc_ring_reciprocal);
-}
-
 static inline void netvsc_free_send_slot(struct netvsc_device *net_device,
 u32 index)
 {
@@ -649,7 +637,8 @@ static void netvsc_send_tx_complete(struct netvsc_device 
*net_device,
wake_up(_device->wait_drain);
 
if (netif_tx_queue_stopped(netdev_get_tx_queue(ndev, q_idx)) &&
-   (hv_ringbuf_avail_percent(>outbound) > 
RING_AVAIL_PERCENT_HIWATER ||
+   (hv_get_avail_to_write_percent(>outbound) >
+RING_AVAIL_PERCENT_HIWATER ||
 queue_sends < 1)) {
netif_tx_wake_queue(netdev_get_tx_queue(ndev, q_idx));
ndev_ctx->eth_stats.wake_queue++;
@@ -757,7 +746,7 @@ static inline int netvsc_send_pkt(
struct netdev_queue *txq = netdev_get_tx_queue(ndev, packet->q_idx);
u64 req_id;
int ret;
-   u32 ring_avail = hv_ringbuf_avail_percent(_channel->outbound);
+   u32 ring_avail = hv_get_avail_to_write_percent(_channel->outbound);
 
nvmsg.hdr.msg_type = NVSP_MSG1_TYPE_SEND_RNDIS_PKT;
if (skb)
diff --git a/drivers/net/hyperv/netvsc_drv.c b/drivers/net/hyperv/netvsc_drv.c
index faea0be18924..b0b1c2fd2b7b 100644
--- a/drivers/net/hyperv/netvsc_drv.c
+++ b/drivers/net/hyperv/netvsc_drv.c
@@ -35,7 +35,6 @@
 #include 
 #include 
 #include 
-#include 
 
 #include 
 #include 
@@ -55,7 +54,6 @@ static unsigned int ring_size __ro_after_init = 128;
 module_param(ring_size, uint, S_IRUGO);
 MODULE_PARM_DESC(ring_size, "Ring buffer size (# of pages)");
 unsigned int netvsc_ring_bytes __ro_after_init;
-struct reciprocal_value netvsc_ring_reciprocal __ro_after_init;
 
 static const u32 default_msg = NETIF_MSG_DRV | NETIF_MSG_PROBE |
NETIF_MSG_LINK | NETIF_MSG_IFUP |
@@ -2186,7 +2184,6 @@ static int __init netvsc_drv_init(void)
ring_size);
}
netvsc_ring_bytes = ring_size * PAGE_SIZE;
-   netvsc_ring_reciprocal = reciprocal_value(netvsc_ring_bytes);
 
ret = vmbus_driver_register(_drv);
if (ret)
-- 
2.14.1



[PATCH 3/3] Storvsc: Select channel based on available percentage of ring buffer to write

2018-03-22 Thread Long Li
From: Long Li 

This is a best effort for estimating on how busy the ring buffer is for
that channel, based on available buffer to write in percentage. It is still
possible that at the time of actual ring buffer write, the space may not be
available due to other processes may be writing at the time.

Selecting a channel based on how full it is can reduce the possibility that
a ring buffer write will fail, and avoid the situation a channel is over
busy.

Now it's possible that storvsc can use a smaller ring buffer size
(e.g. 40k bytes) to take advantage of cache locality.

Signed-off-by: Long Li 
---
 drivers/scsi/storvsc_drv.c | 62 +-
 1 file changed, 50 insertions(+), 12 deletions(-)

diff --git a/drivers/scsi/storvsc_drv.c b/drivers/scsi/storvsc_drv.c
index a2ec0bc9e9fa..96681c4f75cb 100644
--- a/drivers/scsi/storvsc_drv.c
+++ b/drivers/scsi/storvsc_drv.c
@@ -395,6 +395,12 @@ MODULE_PARM_DESC(storvsc_ringbuffer_size, "Ring buffer 
size (bytes)");
 
 module_param(storvsc_vcpus_per_sub_channel, int, S_IRUGO);
 MODULE_PARM_DESC(storvsc_vcpus_per_sub_channel, "Ratio of VCPUs to 
subchannels");
+
+static int ring_avail_percent_lowater = 10;
+module_param(ring_avail_percent_lowater, int, S_IRUGO);
+MODULE_PARM_DESC(ring_avail_percent_lowater,
+   "Select a channel if available ring size > this in percent");
+
 /*
  * Timeout in seconds for all devices managed by this driver.
  */
@@ -1285,9 +1291,9 @@ static int storvsc_do_io(struct hv_device *device,
 {
struct storvsc_device *stor_device;
struct vstor_packet *vstor_packet;
-   struct vmbus_channel *outgoing_channel;
+   struct vmbus_channel *outgoing_channel, *channel;
int ret = 0;
-   struct cpumask alloced_mask;
+   struct cpumask alloced_mask, other_numa_mask;
int tgt_cpu;
 
vstor_packet = >vstor_packet;
@@ -1301,22 +1307,53 @@ static int storvsc_do_io(struct hv_device *device,
/*
 * Select an an appropriate channel to send the request out.
 */
-
if (stor_device->stor_chns[q_num] != NULL) {
outgoing_channel = stor_device->stor_chns[q_num];
-   if (outgoing_channel->target_cpu == smp_processor_id()) {
+   if (outgoing_channel->target_cpu == q_num) {
/*
 * Ideally, we want to pick a different channel if
 * available on the same NUMA node.
 */
cpumask_and(_mask, _device->alloced_cpus,
cpumask_of_node(cpu_to_node(q_num)));
-   for_each_cpu_wrap(tgt_cpu, _mask,
-   outgoing_channel->target_cpu + 1) {
-   if (tgt_cpu != outgoing_channel->target_cpu) {
-   outgoing_channel =
-   stor_device->stor_chns[tgt_cpu];
-   break;
+
+   for_each_cpu_wrap(tgt_cpu, _mask, q_num + 1) {
+   if (tgt_cpu == q_num)
+   continue;
+   channel = stor_device->stor_chns[tgt_cpu];
+   if (hv_get_avail_to_write_percent(
+   >outbound)
+   > ring_avail_percent_lowater) {
+   outgoing_channel = channel;
+   goto found_channel;
+   }
+   }
+
+   /*
+* All the othe channels on the same NUMA node are
+* busy. Try to use the channel with the current CPU
+*/
+   if (hv_get_avail_to_write_percent(
+   _channel->outbound)
+   > ring_avail_percent_lowater)
+   goto found_channel;
+
+   /*
+* If we reach here, all the channels on the current
+* NUMA node are busy. Try to find a channel in
+* other NUMA nodes
+*/
+   cpumask_andnot(_numa_mask,
+   _device->alloced_cpus,
+   cpumask_of_node(cpu_to_node(q_num)));
+
+   for_each_cpu(tgt_cpu, _numa_mask) {
+   channel = stor_device->stor_chns[tgt_cpu];
+   if (hv_get_avail_to_write_percent(
+   >outbound)
+   > ring_avail_percent_lowater) {
+   

[PATCH 1/3] Vmbus: Add function to report available ring buffer to write in total ring size percentage

2018-03-22 Thread Long Li
From: Long Li 

Netvsc has a similar function to calculate how much ring buffer in
percentage is available to write. This function is useful for storvsc and
other vmbus devices.

Define a similar function in vmbus to be used by storvsc.

Signed-off-by: Long Li 
---
 drivers/hv/ring_buffer.c |  2 ++
 include/linux/hyperv.h   | 12 
 2 files changed, 14 insertions(+)

diff --git a/drivers/hv/ring_buffer.c b/drivers/hv/ring_buffer.c
index 8699bb969e7e..3c836c099a8f 100644
--- a/drivers/hv/ring_buffer.c
+++ b/drivers/hv/ring_buffer.c
@@ -227,6 +227,8 @@ int hv_ringbuffer_init(struct hv_ring_buffer_info 
*ring_info,
ring_info->ring_buffer->feature_bits.value = 1;
 
ring_info->ring_size = page_cnt << PAGE_SHIFT;
+   ring_info->ring_size_div10_reciprocal =
+   reciprocal_value(ring_info->ring_size / 10);
ring_info->ring_datasize = ring_info->ring_size -
sizeof(struct hv_ring_buffer);
 
diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h
index 2048f3c3b68a..eb7204851089 100644
--- a/include/linux/hyperv.h
+++ b/include/linux/hyperv.h
@@ -36,6 +36,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #define MAX_PAGE_BUFFER_COUNT  32
 #define MAX_MULTIPAGE_BUFFER_COUNT 32 /* 128K */
@@ -121,6 +122,7 @@ struct hv_ring_buffer {
 struct hv_ring_buffer_info {
struct hv_ring_buffer *ring_buffer;
u32 ring_size;  /* Include the shared header */
+   struct reciprocal_value ring_size_div10_reciprocal;
spinlock_t ring_lock;
 
u32 ring_datasize;  /* < ring_size */
@@ -155,6 +157,16 @@ static inline u32 hv_get_bytes_to_write(const struct 
hv_ring_buffer_info *rbi)
return write;
 }
 
+static inline u32 hv_get_avail_to_write_percent(
+   const struct hv_ring_buffer_info *rbi)
+{
+   u32 avail_write = hv_get_bytes_to_write(rbi);
+
+   return reciprocal_divide(
+   (avail_write  << 3) + (avail_write << 1),
+   rbi->ring_size_div10_reciprocal);
+}
+
 /*
  * VMBUS version is 32 bit entity broken up into
  * two 16 bit quantities: major_number. minor_number.
-- 
2.14.1



Re: [PATCH v2 12/38] cxlflash: Use IDR to manage adapter contexts

2018-03-22 Thread Uma Krishnan


> On Mar 22, 2018, at 11:40 AM, Frederic Barrat  
> wrote:
> 
> 
> 
> Le 26/02/2018 à 23:21, Uma Krishnan a écrit :
>> A range of PASIDs are used as identifiers for the adapter contexts. These
>> contexts may be destroyed and created randomly. Use an IDR to keep track
>> of contexts that are in use and assign a unique identifier to new ones.
>> Signed-off-by: Uma Krishnan 
>> Acked-by: Matthew R. Ochs 
>> ---
>>  drivers/scsi/cxlflash/ocxl_hw.c | 20 ++--
>>  drivers/scsi/cxlflash/ocxl_hw.h |  2 ++
>>  2 files changed, 20 insertions(+), 2 deletions(-)
>> diff --git a/drivers/scsi/cxlflash/ocxl_hw.c 
>> b/drivers/scsi/cxlflash/ocxl_hw.c
>> index d75b873..6472210 100644
>> --- a/drivers/scsi/cxlflash/ocxl_hw.c
>> +++ b/drivers/scsi/cxlflash/ocxl_hw.c
>> @@ -12,6 +12,8 @@
>>   * 2 of the License, or (at your option) any later version.
>>   */
>> +#include 
>> +
>>  #include 
>>  #include "backend.h"
>> @@ -60,14 +62,25 @@ static void *ocxlflash_dev_context_init(struct pci_dev 
>> *pdev, void *afu_cookie)
>>  if (unlikely(!ctx)) {
>>  dev_err(dev, "%s: Context allocation failed\n", __func__);
>>  rc = -ENOMEM;
>> -goto err;
>> +goto err1;
>> +}
>> +
>> +idr_preload(GFP_KERNEL);
>> +rc = idr_alloc(>idr, ctx, 0, afu->max_pasid, GFP_NOWAIT);
>> +idr_preload_end();
> 
> 
> I believe we can call idr_alloc(... GFP_KERNEL) directly in that
> context now.
> 
> 
>> +if (unlikely(rc < 0)) {
>> +dev_err(dev, "%s: idr_alloc failed rc=%d\n", __func__, rc);
>> +goto err2;
>>  }
>> +ctx->pe = rc;
>>  ctx->master = false;
>>  ctx->hw_afu = afu;
>>  out:
>>  return ctx;
>> -err:
>> +err2:
>> +kfree(ctx);
>> +err1:
>>  ctx = ERR_PTR(rc);
>>  goto out;
>>  }
>> @@ -86,6 +99,7 @@ static int ocxlflash_release_context(void *ctx_cookie)
>>  if (!ctx)
>>  goto out;
>> +idr_remove(>hw_afu->idr, ctx->pe);
>>  kfree(ctx);
>>  out:
>>  return rc;
>> @@ -103,6 +117,7 @@ static void ocxlflash_destroy_afu(void *afu_cookie)
>>  return;
>>  ocxlflash_release_context(afu->ocxl_ctx);
>> +idr_destroy(>idr);
>>  kfree(afu);
>>  }
>> @@ -237,6 +252,7 @@ static void *ocxlflash_create_afu(struct pci_dev *pdev)
>>  goto err1;
>>  }
>> +idr_init(>idr);
> 
> You initialize the IDR too late. ocxlflash_dev_context_init() was called just 
> above and allocated a PE.
> 
>  Fred

Good Catch. I will fix this bug and send out a V3 soon. Thanks for the review !!

> 
>>  afu->ocxl_ctx = ctx;
>>  out:
>>  return afu;
>> diff --git a/drivers/scsi/cxlflash/ocxl_hw.h 
>> b/drivers/scsi/cxlflash/ocxl_hw.h
>> index de43c04..0381682 100644
>> --- a/drivers/scsi/cxlflash/ocxl_hw.h
>> +++ b/drivers/scsi/cxlflash/ocxl_hw.h
>> @@ -26,10 +26,12 @@ struct ocxl_hw_afu {
>>  int afu_actag_base; /* AFU acTag base */
>>  int afu_actag_enabled;  /* AFU acTag number enabled */
>> +struct idr idr; /* IDR to manage contexts */
>>  int max_pasid;  /* Maximum number of contexts */
>>  };
>>  struct ocxlflash_context {
>>  struct ocxl_hw_afu *hw_afu; /* HW AFU back pointer */
>>  bool master;/* Whether this is a master context */
>> +int pe; /* Process element */
>>  };
> 



[PATCH v2] storvsc: Set up correct queue depth values for IDE devices

2018-03-22 Thread Long Li
From: Long Li 

Unlike SCSI and FC, we don't use multiple channels for IDE.
Also fix the calculation for sub-channels.

Change log:
v2: Addressed comment on incorrect number of sub-channels.
(Michael Kelley )

Signed-off-by: Long Li 
---
 drivers/scsi/storvsc_drv.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/storvsc_drv.c b/drivers/scsi/storvsc_drv.c
index 8c51d628b52e..a2ec0bc9e9fa 100644
--- a/drivers/scsi/storvsc_drv.c
+++ b/drivers/scsi/storvsc_drv.c
@@ -1722,11 +1722,14 @@ static int storvsc_probe(struct hv_device *device,
max_targets = STORVSC_MAX_TARGETS;
max_channels = STORVSC_MAX_CHANNELS;
/*
-* On Windows8 and above, we support sub-channels for storage.
+* On Windows8 and above, we support sub-channels for storage
+* on SCSI and FC controllers.
 * The number of sub-channels offerred is based on the number of
 * VCPUs in the guest.
 */
-   max_sub_channels = (num_cpus / storvsc_vcpus_per_sub_channel);
+   if (!dev_is_ide)
+   max_sub_channels =
+   (num_cpus - 1) / storvsc_vcpus_per_sub_channel;
}
 
scsi_driver.can_queue = (max_outstanding_req_per_channel *
-- 
2.14.1



Re: [PATCH v2] target/file: add support of direct and async I/O

2018-03-22 Thread Andrei Vagin
On Thu, Mar 22, 2018 at 10:34:34AM -0700, Christoph Hellwig wrote:
> > 
> > DIF (PI) emulation doesn't work when a target uses async I/O, because
> > DIF metadata is saved in a separate file, and it is another non-trivial
> > task how to synchronize writing in two files, so that a following read
> > operation always returns a consisten metadata for a specified block.
> 
> As said, this isn't really in any way aio specific.
> 
> > +   int is_write = !(data_direction == DMA_FROM_DEVICE);
> 
>   bool is_write = data_direction != DMA_FROM_DEVICE;
> 
> > +   if (is_write && (cmd->se_cmd_flags & SCF_FUA))
> > +   aio_cmd->iocb.ki_flags |= IOCB_DSYNC;
> > +
> > +   if (is_write)
> > +   ret = call_write_iter(file, _cmd->iocb, );
> > +   else
> > +   ret = call_read_iter(file, _cmd->iocb, );
> > +
> > +   kfree(bvec);
> 
> While this looks good to me this is the same pattern just said doesn't
> work in loop.  So it works here just fine?

Yes, it works here, because a bvec array is always allocated from our
code, so we fully controll its livetime. This way doesn't work in loop,
because sometimes we get a bvec array from bio, so its lifetime depeends
on bio, so we have to gurantee that bio will not be released while we are
using the bvec array.

Thank you for the help with this patch.

> 
> Otherwise this looks fine to me:
> 
> Signed-off-by: Christoph Hellwig 


[Bug 199155] /sys/block/dev/device/model truncates to 16 characters

2018-03-22 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=199155

Bryan Seitz (seit...@gmail.com) changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |INVALID

--- Comment #4 from Bryan Seitz (seit...@gmail.com) ---
Thanks Doug!

-- 
You are receiving this mail because:
You are the assignee for the bug.


Re: [PATCH v2] target/file: add support of direct and async I/O

2018-03-22 Thread Bryant G. Ly
On 3/22/18 12:34 PM, Christoph Hellwig wrote:

>> DIF (PI) emulation doesn't work when a target uses async I/O, because
>> DIF metadata is saved in a separate file, and it is another non-trivial
>> task how to synchronize writing in two files, so that a following read
>> operation always returns a consisten metadata for a specified block.
> As said, this isn't really in any way aio specific.
>
>> +int is_write = !(data_direction == DMA_FROM_DEVICE);
>   bool is_write = data_direction != DMA_FROM_DEVICE;
>
>> +if (is_write && (cmd->se_cmd_flags & SCF_FUA))
>> +aio_cmd->iocb.ki_flags |= IOCB_DSYNC;
>> +
>> +if (is_write)
>> +ret = call_write_iter(file, _cmd->iocb, );
>> +else
>> +ret = call_read_iter(file, _cmd->iocb, );
>> +
>> +kfree(bvec);
> While this looks good to me this is the same pattern just said doesn't
> work in loop.  So it works here just fine?
>
> Otherwise this looks fine to me:
>
> Signed-off-by: Christoph Hellwig 
>
This patch created the helpers for f_op->read/write:
https://patchwork.kernel.org/patch/9580365/

That patch was queued up for 4.11 so I guess if anyone tried to port this patch
back they would hit issues if they didn't use f_op.

Reviewed-by: Bryant G. Ly 

-Bryant



smp affinity and kworker io submission

2018-03-22 Thread Kashyap Desai
Hi,

I am running FIO script on Linux 4.15. This is generic behavior even on
3.x kernels as well. I wanted to know if my observation is correct or not.

Here is FIO command -

numactl -C 0-2  fio single --bs=4k  --iodepth=64 --rw=randread
--ioscheduler=none --group_report --numjobs=2

If driver is provides affinity_hint, kernel choose only kworker (0,1,2)
(it looks like kworker binding is smartly handled by kernel because I am
running FIO from cpu0,1,2)  for IO submission from delayed context.

14140 root  15  -5  519296   1560612 R  87.7  0.0   0:20.91 fio

 14138 root  15  -5  519292   1556608 R  76.1  0.0   0:21.79 fio

 14142 root  15  -5  519308   1560612 R  66.8  0.0   0:19.69 fio

 14141 root  15  -5  519304   1564616 R  54.5  0.0   0:20.51 fio

   923 root   0 -20   0  0  0 S   6.3  0.0   0:09.73
kworker/1:1H

  1075 root   0 -20   0  0  0 S   5.3  0.0   0:08.69
kworker/0:1H

   924 root   0 -20   0  0  0 S   3.3  0.0   0:12.82
kworker/2:1H

If driver is not providing affinity_hint, kernel choose *any* kworker from
local numa node for IO submission from delayed context. In below snippet,
you can see kworke4, kworke5 and kworke3 was participating in IO
submission.

14281 root  15  -5  519308   1556612 R  87.0  0.0   0:16.16 fio

 14280 root  15  -5  519304   1560616 R  74.1  0.0   0:14.62 fio

 14279 root  15  -5  519296   1556612 R  71.8  0.0   0:15.02 fio

 14277 root  15  -5  519292   1552608 R  66.8  0.0   0:15.06 fio

  1887 root   0 -20   0  0  0 R  15.3  0.0   0:40.91
kworker/4:1H

  3856 root   0 -20   0  0  0 S  13.6  0.0   0:38.90
kworker/5:1H

  3646 root   0 -20   0  0  0 S  13.0  0.0   0:40.17
kworker/3:1H

Which kernel component is making this decision ? Is this behavior tied to
block layer/irq subsystem  ?

I am trying to see which behavior is most suitable for my test. I am
seeing performance is not improving because it is CPU bound and If I
choose not to do smp affinity hint in driver, it is helping as explained
above.


Thanks, Kashyap


Re: [PATCH v2] target/file: add support of direct and async I/O

2018-03-22 Thread Christoph Hellwig
> 
> DIF (PI) emulation doesn't work when a target uses async I/O, because
> DIF metadata is saved in a separate file, and it is another non-trivial
> task how to synchronize writing in two files, so that a following read
> operation always returns a consisten metadata for a specified block.

As said, this isn't really in any way aio specific.

> + int is_write = !(data_direction == DMA_FROM_DEVICE);

bool is_write = data_direction != DMA_FROM_DEVICE;

> + if (is_write && (cmd->se_cmd_flags & SCF_FUA))
> + aio_cmd->iocb.ki_flags |= IOCB_DSYNC;
> +
> + if (is_write)
> + ret = call_write_iter(file, _cmd->iocb, );
> + else
> + ret = call_read_iter(file, _cmd->iocb, );
> +
> + kfree(bvec);

While this looks good to me this is the same pattern just said doesn't
work in loop.  So it works here just fine?

Otherwise this looks fine to me:

Signed-off-by: Christoph Hellwig 


Re: [PATCH v2 21/38] cxlflash: Setup function OCXL link

2018-03-22 Thread Frederic Barrat



Le 26/02/2018 à 23:22, Uma Krishnan a écrit :

After reading and modifying the function configuration, setup the OCXL
link using the OCXL provider services. The link is released when the
adapter is unconfigured.

Signed-off-by: Uma Krishnan 
Acked-by: Matthew R. Ochs 
---



Reviewed-by: Frederic Barrat 



  drivers/scsi/cxlflash/ocxl_hw.c | 25 ++---
  drivers/scsi/cxlflash/ocxl_hw.h |  1 +
  2 files changed, 23 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/cxlflash/ocxl_hw.c b/drivers/scsi/cxlflash/ocxl_hw.c
index a1d58fc..ea2906d 100644
--- a/drivers/scsi/cxlflash/ocxl_hw.c
+++ b/drivers/scsi/cxlflash/ocxl_hw.c
@@ -412,11 +412,28 @@ static int ocxlflash_config_fn(struct pci_dev *pdev, 
struct ocxl_hw_afu *afu)
ocxl_config_set_actag(pdev, fcfg->dvsec_function_pos, base, enabled);
dev_dbg(dev, "%s: Function acTag range base=%u enabled=%u\n",
__func__, base, enabled);
+
+   rc = ocxl_link_setup(pdev, 0, >link_token);
+   if (unlikely(rc)) {
+   dev_err(dev, "%s: ocxl_link_setup failed rc=%d\n",
+   __func__, rc);
+   goto out;
+   }
  out:
return rc;
  }

  /**
+ * ocxlflash_unconfig_fn() - unconfigure the host function
+ * @pdev:  PCI device associated with the host.
+ * @afu:   AFU associated with the host.
+ */
+static void ocxlflash_unconfig_fn(struct pci_dev *pdev, struct ocxl_hw_afu 
*afu)
+{
+   ocxl_link_release(pdev, afu->link_token);
+}
+
+/**
   * ocxlflash_map_mmio() - map the AFU MMIO space
   * @afu: AFU associated with the host.
   *
@@ -552,7 +569,7 @@ static void *ocxlflash_create_afu(struct pci_dev *pdev)
if (unlikely(rc)) {
dev_err(dev, "%s: AFU configuration failed rc=%d\n",
__func__, rc);
-   goto err1;
+   goto err2;
}

ctx = ocxlflash_dev_context_init(pdev, afu);
@@ -560,15 +577,17 @@ static void *ocxlflash_create_afu(struct pci_dev *pdev)
rc = PTR_ERR(ctx);
dev_err(dev, "%s: ocxlflash_dev_context_init failed rc=%d\n",
__func__, rc);
-   goto err2;
+   goto err3;
}

idr_init(>idr);
afu->ocxl_ctx = ctx;
  out:
return afu;
-err2:
+err3:
ocxlflash_unconfig_afu(afu);
+err2:
+   ocxlflash_unconfig_fn(pdev, afu);
  err1:
kfree(afu);
afu = NULL;
diff --git a/drivers/scsi/cxlflash/ocxl_hw.h b/drivers/scsi/cxlflash/ocxl_hw.h
index 18f402a..4d43650 100644
--- a/drivers/scsi/cxlflash/ocxl_hw.h
+++ b/drivers/scsi/cxlflash/ocxl_hw.h
@@ -31,6 +31,7 @@ struct ocxl_hw_afu {
phys_addr_t gmmio_phys; /* Global AFU MMIO space */
void __iomem *gmmio_virt;   /* Global MMIO map */

+   void *link_token;   /* Link token for the SPA */
struct idr idr; /* IDR to manage contexts */
int max_pasid;  /* Maximum number of contexts */
  };





Re: [PATCH v2 19/38] cxlflash: Support AFU state toggling

2018-03-22 Thread Frederic Barrat



Le 26/02/2018 à 23:22, Uma Krishnan a écrit :

The AFU should be enabled following a successful configuration and
disabled near the end of the cleanup path.

Signed-off-by: Uma Krishnan 
Acked-by: Matthew R. Ochs 
---


Reviewed-by: Frederic Barrat 



  drivers/scsi/cxlflash/ocxl_hw.c | 9 +
  1 file changed, 9 insertions(+)

diff --git a/drivers/scsi/cxlflash/ocxl_hw.c b/drivers/scsi/cxlflash/ocxl_hw.c
index 67f0252..364e7a5 100644
--- a/drivers/scsi/cxlflash/ocxl_hw.c
+++ b/drivers/scsi/cxlflash/ocxl_hw.c
@@ -342,12 +342,18 @@ static void ocxlflash_unconfig_afu(struct ocxl_hw_afu 
*afu)
  static void ocxlflash_destroy_afu(void *afu_cookie)
  {
struct ocxl_hw_afu *afu = afu_cookie;
+   int pos;

if (!afu)
return;

ocxlflash_release_context(afu->ocxl_ctx);
idr_destroy(>idr);
+
+   /* Disable the AFU */
+   pos = afu->acfg.dvsec_afu_control_pos;
+   ocxl_config_set_afu_state(afu->pdev, pos, 0);
+
ocxlflash_unconfig_afu(afu);
kfree(afu);
  }
@@ -492,6 +498,9 @@ static int ocxlflash_config_afu(struct pci_dev *pdev, 
struct ocxl_hw_afu *afu)
__func__, rc);
goto out;
}
+
+   /* Enable the AFU */
+   ocxl_config_set_afu_state(pdev, acfg->dvsec_afu_control_pos, 1);
  out:
return rc;
  }





Re: [PATCH v2 16/38] cxlflash: MMIO map the AFU

2018-03-22 Thread Frederic Barrat



Le 26/02/2018 à 23:22, Uma Krishnan a écrit :

When the AFU is configured, the global and per process MMIO regions
are presented by the configuration space. Save these regions and
map the global MMIO region that is used to access all of the control
and provisioning data in the AFU.

Signed-off-by: Uma Krishnan 
Acked-by: Matthew R. Ochs 
---



Reviewed-by: Frederic Barrat 



  drivers/scsi/cxlflash/ocxl_hw.c | 74 -
  drivers/scsi/cxlflash/ocxl_hw.h |  4 +++
  2 files changed, 77 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/cxlflash/ocxl_hw.c b/drivers/scsi/cxlflash/ocxl_hw.c
index 7717a63..3917aa1 100644
--- a/drivers/scsi/cxlflash/ocxl_hw.c
+++ b/drivers/scsi/cxlflash/ocxl_hw.c
@@ -264,6 +264,18 @@ static void ocxlflash_perst_reloads_same_image(void 
*afu_cookie, bool image)
  }

  /**
+ * ocxlflash_unconfig_afu() - unconfigure the AFU
+ * @afu: AFU associated with the host.
+ */
+static void ocxlflash_unconfig_afu(struct ocxl_hw_afu *afu)
+{
+   if (afu->gmmio_virt) {
+   iounmap(afu->gmmio_virt);
+   afu->gmmio_virt = NULL;
+   }
+}
+
+/**
   * ocxlflash_destroy_afu() - destroy the AFU structure
   * @afu_cookie:   AFU to be freed.
   */
@@ -276,6 +288,7 @@ static void ocxlflash_destroy_afu(void *afu_cookie)

ocxlflash_release_context(afu->ocxl_ctx);
idr_destroy(>idr);
+   ocxlflash_unconfig_afu(afu);
kfree(afu);
  }

@@ -324,6 +337,56 @@ static int ocxlflash_config_fn(struct pci_dev *pdev, 
struct ocxl_hw_afu *afu)
  }

  /**
+ * ocxlflash_map_mmio() - map the AFU MMIO space
+ * @afu: AFU associated with the host.
+ *
+ * Return: 0 on success, -errno on failure
+ */
+static int ocxlflash_map_mmio(struct ocxl_hw_afu *afu)
+{
+   struct ocxl_afu_config *acfg = >acfg;
+   struct pci_dev *pdev = afu->pdev;
+   struct device *dev = afu->dev;
+   phys_addr_t gmmio, ppmmio;
+   int rc = 0;
+
+   rc = pci_request_region(pdev, acfg->global_mmio_bar, "ocxlflash");
+   if (unlikely(rc)) {
+   dev_err(dev, "%s: pci_request_region for global failed rc=%d\n",
+   __func__, rc);
+   goto out;
+   }
+   gmmio = pci_resource_start(pdev, acfg->global_mmio_bar);
+   gmmio += acfg->global_mmio_offset;
+
+   rc = pci_request_region(pdev, acfg->pp_mmio_bar, "ocxlflash");
+   if (unlikely(rc)) {
+   dev_err(dev, "%s: pci_request_region for pp bar failed rc=%d\n",
+   __func__, rc);
+   goto err1;
+   }
+   ppmmio = pci_resource_start(pdev, acfg->pp_mmio_bar);
+   ppmmio += acfg->pp_mmio_offset;
+
+   afu->gmmio_virt = ioremap(gmmio, acfg->global_mmio_size);
+   if (unlikely(!afu->gmmio_virt)) {
+   dev_err(dev, "%s: MMIO mapping failed\n", __func__);
+   rc = -ENOMEM;
+   goto err2;
+   }
+
+   afu->gmmio_phys = gmmio;
+   afu->ppmmio_phys = ppmmio;
+out:
+   return rc;
+err2:
+   pci_release_region(pdev, acfg->pp_mmio_bar);
+err1:
+   pci_release_region(pdev, acfg->global_mmio_bar);
+   goto out;
+}
+
+/**
   * ocxlflash_config_afu() - configure the host AFU
   * @pdev: PCI device associated with the host.
   * @afu:  AFU associated with the host.
@@ -362,6 +425,13 @@ static int ocxlflash_config_afu(struct pci_dev *pdev, 
struct ocxl_hw_afu *afu)
afu->max_pasid = 1 << acfg->pasid_supported_log;

ocxl_config_set_afu_pasid(pdev, pos, 0, acfg->pasid_supported_log);
+
+   rc = ocxlflash_map_mmio(afu);
+   if (unlikely(rc)) {
+   dev_err(dev, "%s: ocxlflash_map_mmio failed rc=%d\n",
+   __func__, rc);
+   goto out;
+   }
  out:
return rc;
  }
@@ -407,13 +477,15 @@ static void *ocxlflash_create_afu(struct pci_dev *pdev)
rc = PTR_ERR(ctx);
dev_err(dev, "%s: ocxlflash_dev_context_init failed rc=%d\n",
__func__, rc);
-   goto err1;
+   goto err2;
}

idr_init(>idr);
afu->ocxl_ctx = ctx;
  out:
return afu;
+err2:
+   ocxlflash_unconfig_afu(afu);
  err1:
kfree(afu);
afu = NULL;
diff --git a/drivers/scsi/cxlflash/ocxl_hw.h b/drivers/scsi/cxlflash/ocxl_hw.h
index fd1a8df..29c75c4 100644
--- a/drivers/scsi/cxlflash/ocxl_hw.h
+++ b/drivers/scsi/cxlflash/ocxl_hw.h
@@ -27,6 +27,10 @@ struct ocxl_hw_afu {
int afu_actag_base; /* AFU acTag base */
int afu_actag_enabled;  /* AFU acTag number enabled */

+   phys_addr_t ppmmio_phys;/* Per process MMIO space */
+   phys_addr_t gmmio_phys; /* Global AFU MMIO space */
+   void __iomem *gmmio_virt;   /* Global MMIO map */
+
struct idr idr; /* IDR to manage contexts */

Re: [PATCH v2 13/38] cxlflash: Support adapter file descriptors for OCXL

2018-03-22 Thread Frederic Barrat



Le 26/02/2018 à 23:21, Uma Krishnan a écrit :

Allocate a file descriptor for an adapter context when requested. In order
to allocate inodes for the file descriptors, a pseudo filesystem is created
and used.

Signed-off-by: Uma Krishnan 
Acked-by: Matthew R. Ochs 
---



We've touched the subject before, and I don't have a magic solution, but 
it feels like something could be shared here with cxl, or maybe even 
other drivers?


I only took a quick read of the inode allocator.

  Fred





  drivers/scsi/cxlflash/ocxl_hw.c | 200 
  drivers/scsi/cxlflash/ocxl_hw.h |   1 +
  2 files changed, 201 insertions(+)

diff --git a/drivers/scsi/cxlflash/ocxl_hw.c b/drivers/scsi/cxlflash/ocxl_hw.c
index 6472210..59e9003 100644
--- a/drivers/scsi/cxlflash/ocxl_hw.c
+++ b/drivers/scsi/cxlflash/ocxl_hw.c
@@ -12,13 +12,144 @@
   * 2 of the License, or (at your option) any later version.
   */

+#include 
  #include 
+#include 
+#include 

  #include 

  #include "backend.h"
  #include "ocxl_hw.h"

+/*
+ * Pseudo-filesystem to allocate inodes.
+ */
+
+#define OCXLFLASH_FS_MAGIC  0x1697698f
+
+static int ocxlflash_fs_cnt;
+static struct vfsmount *ocxlflash_vfs_mount;
+
+static const struct dentry_operations ocxlflash_fs_dops = {
+   .d_dname= simple_dname,
+};
+
+/*
+ * ocxlflash_fs_mount() - mount the pseudo-filesystem
+ * @fs_type:   File system type.
+ * @flags: Flags for the filesystem.
+ * @dev_name:  Device name associated with the filesystem.
+ * @data:  Data pointer.
+ *
+ * Return: pointer to the directory entry structure
+ */
+static struct dentry *ocxlflash_fs_mount(struct file_system_type *fs_type,
+int flags, const char *dev_name,
+void *data)
+{
+   return mount_pseudo(fs_type, "ocxlflash:", NULL, _fs_dops,
+   OCXLFLASH_FS_MAGIC);
+}
+
+static struct file_system_type ocxlflash_fs_type = {
+   .name   = "ocxlflash",
+   .owner  = THIS_MODULE,
+   .mount  = ocxlflash_fs_mount,
+   .kill_sb= kill_anon_super,
+};
+
+/*
+ * ocxlflash_release_mapping() - release the memory mapping
+ * @ctx:   Context whose mapping is to be released.
+ */
+static void ocxlflash_release_mapping(struct ocxlflash_context *ctx)
+{
+   if (ctx->mapping)
+   simple_release_fs(_vfs_mount, _fs_cnt);
+   ctx->mapping = NULL;
+}
+
+/*
+ * ocxlflash_getfile() - allocate pseudo filesystem, inode, and the file
+ * @dev:   Generic device of the host.
+ * @name:  Name of the pseudo filesystem.
+ * @fops:  File operations.
+ * @priv:  Private data.
+ * @flags: Flags for the file.
+ *
+ * Return: pointer to the file on success, ERR_PTR on failure
+ */
+static struct file *ocxlflash_getfile(struct device *dev, const char *name,
+ const struct file_operations *fops,
+ void *priv, int flags)
+{
+   struct qstr this;
+   struct path path;
+   struct file *file;
+   struct inode *inode = NULL;
+   int rc;
+
+   if (fops->owner && !try_module_get(fops->owner)) {
+   dev_err(dev, "%s: Owner does not exist\n", __func__);
+   rc = -ENOENT;
+   goto err1;
+   }
+
+   rc = simple_pin_fs(_fs_type, _vfs_mount,
+  _fs_cnt);
+   if (unlikely(rc < 0)) {
+   dev_err(dev, "%s: Cannot mount ocxlflash pseudofs rc=%d\n",
+   __func__, rc);
+   goto err2;
+   }
+
+   inode = alloc_anon_inode(ocxlflash_vfs_mount->mnt_sb);
+   if (IS_ERR(inode)) {
+   rc = PTR_ERR(inode);
+   dev_err(dev, "%s: alloc_anon_inode failed rc=%d\n",
+   __func__, rc);
+   goto err3;
+   }
+
+   this.name = name;
+   this.len = strlen(name);
+   this.hash = 0;
+   path.dentry = d_alloc_pseudo(ocxlflash_vfs_mount->mnt_sb, );
+   if (!path.dentry) {
+   dev_err(dev, "%s: d_alloc_pseudo failed\n", __func__);
+   rc = -ENOMEM;
+   goto err4;
+   }
+
+   path.mnt = mntget(ocxlflash_vfs_mount);
+   d_instantiate(path.dentry, inode);
+
+   file = alloc_file(, OPEN_FMODE(flags), fops);
+   if (IS_ERR(file)) {
+   rc = PTR_ERR(file);
+   dev_err(dev, "%s: alloc_file failed rc=%d\n",
+   __func__, rc);
+   goto err5;
+   }
+
+   file->f_flags = flags & (O_ACCMODE | O_NONBLOCK);
+   file->private_data = priv;
+out:
+   return file;
+err5:
+   path_put();
+err4:
+   iput(inode);
+err3:
+   simple_release_fs(_vfs_mount, _fs_cnt);
+err2:
+   module_put(fops->owner);
+err1:
+   file = ERR_PTR(rc);
+   goto out;
+}
+
  /**
   * 

aacraid issues in 4.16-rc

2018-03-22 Thread Justin Forbes
The Fedora QA folks have run into a regression with aacraid in the
4.16 rc kernels. This is not an issue with 4.15.x kernels.
Essentially things just do not work unless 'aac_sync_mode=1' is passed
on the command line. The failures show logs of

aacraid: Host adapter abort request
aacraid: Outstanding commands on (adapter)

The reset request shows outstanding command : firmware-2

More details and logs are available in the Fedora bug:
https://bugzilla.redhat.com/show_bug.cgi?id=1557659

Thanks,
Justin


Re: [PATCH v2 12/38] cxlflash: Use IDR to manage adapter contexts

2018-03-22 Thread Frederic Barrat



Le 26/02/2018 à 23:21, Uma Krishnan a écrit :

A range of PASIDs are used as identifiers for the adapter contexts. These
contexts may be destroyed and created randomly. Use an IDR to keep track
of contexts that are in use and assign a unique identifier to new ones.

Signed-off-by: Uma Krishnan 
Acked-by: Matthew R. Ochs 
---
  drivers/scsi/cxlflash/ocxl_hw.c | 20 ++--
  drivers/scsi/cxlflash/ocxl_hw.h |  2 ++
  2 files changed, 20 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/cxlflash/ocxl_hw.c b/drivers/scsi/cxlflash/ocxl_hw.c
index d75b873..6472210 100644
--- a/drivers/scsi/cxlflash/ocxl_hw.c
+++ b/drivers/scsi/cxlflash/ocxl_hw.c
@@ -12,6 +12,8 @@
   * 2 of the License, or (at your option) any later version.
   */

+#include 
+
  #include 

  #include "backend.h"
@@ -60,14 +62,25 @@ static void *ocxlflash_dev_context_init(struct pci_dev 
*pdev, void *afu_cookie)
if (unlikely(!ctx)) {
dev_err(dev, "%s: Context allocation failed\n", __func__);
rc = -ENOMEM;
-   goto err;
+   goto err1;
+   }
+
+   idr_preload(GFP_KERNEL);
+   rc = idr_alloc(>idr, ctx, 0, afu->max_pasid, GFP_NOWAIT);
+   idr_preload_end();



I believe we can call idr_alloc(... GFP_KERNEL) directly in that
context now.



+   if (unlikely(rc < 0)) {
+   dev_err(dev, "%s: idr_alloc failed rc=%d\n", __func__, rc);
+   goto err2;
}

+   ctx->pe = rc;
ctx->master = false;
ctx->hw_afu = afu;
  out:
return ctx;
-err:
+err2:
+   kfree(ctx);
+err1:
ctx = ERR_PTR(rc);
goto out;
  }
@@ -86,6 +99,7 @@ static int ocxlflash_release_context(void *ctx_cookie)
if (!ctx)
goto out;

+   idr_remove(>hw_afu->idr, ctx->pe);
kfree(ctx);
  out:
return rc;
@@ -103,6 +117,7 @@ static void ocxlflash_destroy_afu(void *afu_cookie)
return;

ocxlflash_release_context(afu->ocxl_ctx);
+   idr_destroy(>idr);
kfree(afu);
  }

@@ -237,6 +252,7 @@ static void *ocxlflash_create_afu(struct pci_dev *pdev)
goto err1;
}

+   idr_init(>idr);


You initialize the IDR too late. ocxlflash_dev_context_init() was called 
just above and allocated a PE.


  Fred



afu->ocxl_ctx = ctx;
  out:
return afu;
diff --git a/drivers/scsi/cxlflash/ocxl_hw.h b/drivers/scsi/cxlflash/ocxl_hw.h
index de43c04..0381682 100644
--- a/drivers/scsi/cxlflash/ocxl_hw.h
+++ b/drivers/scsi/cxlflash/ocxl_hw.h
@@ -26,10 +26,12 @@ struct ocxl_hw_afu {
int afu_actag_base; /* AFU acTag base */
int afu_actag_enabled;  /* AFU acTag number enabled */

+   struct idr idr; /* IDR to manage contexts */
int max_pasid;  /* Maximum number of contexts */
  };

  struct ocxlflash_context {
struct ocxl_hw_afu *hw_afu; /* HW AFU back pointer */
bool master;/* Whether this is a master context */
+   int pe; /* Process element */
  };





Re: [PATCH v2 11/38] cxlflash: Adapter context support for OCXL

2018-03-22 Thread Frederic Barrat



Le 26/02/2018 à 23:21, Uma Krishnan a écrit :

Add support to create and release the adapter contexts for OCXL and
provide means to specify certain contexts as a master.

The existing cxlflash core has a design requirement that each host will
have a single host context available by default. To satisfy this
requirement, one host adapter context is created when the hardware AFU is
initialized. This is returned by the get_context() fop.

Signed-off-by: Uma Krishnan 
Acked-by: Matthew R. Ochs 
---


Reviewed-by: Frederic Barrat 


  drivers/scsi/cxlflash/ocxl_hw.c | 90 +
  drivers/scsi/cxlflash/ocxl_hw.h |  6 +++
  2 files changed, 96 insertions(+)

diff --git a/drivers/scsi/cxlflash/ocxl_hw.c b/drivers/scsi/cxlflash/ocxl_hw.c
index bd86eef..d75b873 100644
--- a/drivers/scsi/cxlflash/ocxl_hw.c
+++ b/drivers/scsi/cxlflash/ocxl_hw.c
@@ -18,6 +18,80 @@
  #include "ocxl_hw.h"

  /**
+ * ocxlflash_set_master() - sets the context as master
+ * @ctx_cookie:Adapter context to set as master.
+ */
+static void ocxlflash_set_master(void *ctx_cookie)
+{
+   struct ocxlflash_context *ctx = ctx_cookie;
+
+   ctx->master = true;
+}
+
+/**
+ * ocxlflash_get_context() - obtains the context associated with the host
+ * @pdev:  PCI device associated with the host.
+ * @afu_cookie:Hardware AFU associated with the host.
+ *
+ * Return: returns the pointer to host adapter context
+ */
+static void *ocxlflash_get_context(struct pci_dev *pdev, void *afu_cookie)
+{
+   struct ocxl_hw_afu *afu = afu_cookie;
+
+   return afu->ocxl_ctx;
+}
+
+/**
+ * ocxlflash_dev_context_init() - allocate and initialize an adapter context
+ * @pdev:  PCI device associated with the host.
+ * @afu_cookie:Hardware AFU associated with the host.
+ *
+ * Return: returns the adapter context on success, ERR_PTR on failure
+ */
+static void *ocxlflash_dev_context_init(struct pci_dev *pdev, void *afu_cookie)
+{
+   struct ocxl_hw_afu *afu = afu_cookie;
+   struct device *dev = afu->dev;
+   struct ocxlflash_context *ctx;
+   int rc;
+
+   ctx = kzalloc(sizeof(*ctx), GFP_KERNEL);
+   if (unlikely(!ctx)) {
+   dev_err(dev, "%s: Context allocation failed\n", __func__);
+   rc = -ENOMEM;
+   goto err;
+   }
+
+   ctx->master = false;
+   ctx->hw_afu = afu;
+out:
+   return ctx;
+err:
+   ctx = ERR_PTR(rc);
+   goto out;
+}
+
+/**
+ * ocxlflash_release_context() - releases an adapter context
+ * @ctx_cookie:Adapter context to be released.
+ *
+ * Return: 0 on success, -errno on failure
+ */
+static int ocxlflash_release_context(void *ctx_cookie)
+{
+   struct ocxlflash_context *ctx = ctx_cookie;
+   int rc = 0;
+
+   if (!ctx)
+   goto out;
+
+   kfree(ctx);
+out:
+   return rc;
+}
+
+/**
   * ocxlflash_destroy_afu() - destroy the AFU structure
   * @afu_cookie:   AFU to be freed.
   */
@@ -28,6 +102,7 @@ static void ocxlflash_destroy_afu(void *afu_cookie)
if (!afu)
return;

+   ocxlflash_release_context(afu->ocxl_ctx);
kfree(afu);
  }

@@ -127,6 +202,7 @@ static int ocxlflash_config_afu(struct pci_dev *pdev, 
struct ocxl_hw_afu *afu)
  static void *ocxlflash_create_afu(struct pci_dev *pdev)
  {
struct device *dev = >dev;
+   struct ocxlflash_context *ctx;
struct ocxl_hw_afu *afu;
int rc;

@@ -152,6 +228,16 @@ static void *ocxlflash_create_afu(struct pci_dev *pdev)
__func__, rc);
goto err1;
}
+
+   ctx = ocxlflash_dev_context_init(pdev, afu);
+   if (IS_ERR(ctx)) {
+   rc = PTR_ERR(ctx);
+   dev_err(dev, "%s: ocxlflash_dev_context_init failed rc=%d\n",
+   __func__, rc);
+   goto err1;
+   }
+
+   afu->ocxl_ctx = ctx;
  out:
return afu;
  err1:
@@ -163,6 +249,10 @@ static void *ocxlflash_create_afu(struct pci_dev *pdev)
  /* Backend ops to ocxlflash services */
  const struct cxlflash_backend_ops cxlflash_ocxl_ops = {
.module = THIS_MODULE,
+   .set_master = ocxlflash_set_master,
+   .get_context= ocxlflash_get_context,
+   .dev_context_init   = ocxlflash_dev_context_init,
+   .release_context= ocxlflash_release_context,
.create_afu = ocxlflash_create_afu,
.destroy_afu= ocxlflash_destroy_afu,
  };
diff --git a/drivers/scsi/cxlflash/ocxl_hw.h b/drivers/scsi/cxlflash/ocxl_hw.h
index a6f7796..de43c04 100644
--- a/drivers/scsi/cxlflash/ocxl_hw.h
+++ b/drivers/scsi/cxlflash/ocxl_hw.h
@@ -14,6 +14,7 @@

  /* OCXL hardware AFU associated with the host */
  struct ocxl_hw_afu {
+   struct ocxlflash_context *ocxl_ctx; /* Host context */
struct pci_dev *pdev; 

Re: [PATCH v2 10/38] cxlflash: Setup AFU PASID

2018-03-22 Thread Frederic Barrat



Le 26/02/2018 à 23:21, Uma Krishnan a écrit :

Per the OCXL specification, the maximum PASID supported by the AFU is
indicated by a field within the configuration space. Similar to acTags,
implementations can choose to use any sub-range of PASID within their
assigned range. For cxlflash, the entire range is used.

Signed-off-by: Uma Krishnan 
Acked-by: Matthew R. Ochs 
---


It sure helps to know you've got only one function/AFU :-)

Reviewed-by: Frederic Barrat 


  drivers/scsi/cxlflash/ocxl_hw.c | 3 +++
  drivers/scsi/cxlflash/ocxl_hw.h | 2 ++
  2 files changed, 5 insertions(+)

diff --git a/drivers/scsi/cxlflash/ocxl_hw.c b/drivers/scsi/cxlflash/ocxl_hw.c
index d01847d9..bd86eef 100644
--- a/drivers/scsi/cxlflash/ocxl_hw.c
+++ b/drivers/scsi/cxlflash/ocxl_hw.c
@@ -111,6 +111,9 @@ static int ocxlflash_config_afu(struct pci_dev *pdev, 
struct ocxl_hw_afu *afu)
dev_dbg(dev, "%s: acTag base=%d enabled=%d\n", __func__, base, count);
afu->afu_actag_base = base;
afu->afu_actag_enabled = count;
+   afu->max_pasid = 1 << acfg->pasid_supported_log;
+
+   ocxl_config_set_afu_pasid(pdev, pos, 0, acfg->pasid_supported_log);
  out:
return rc;
  }
diff --git a/drivers/scsi/cxlflash/ocxl_hw.h b/drivers/scsi/cxlflash/ocxl_hw.h
index 9c675fa..a6f7796 100644
--- a/drivers/scsi/cxlflash/ocxl_hw.h
+++ b/drivers/scsi/cxlflash/ocxl_hw.h
@@ -24,4 +24,6 @@ struct ocxl_hw_afu {
int fn_actag_enabled;   /* Function acTag number enabled */
int afu_actag_base; /* AFU acTag base */
int afu_actag_enabled;  /* AFU acTag number enabled */
+
+   int max_pasid;  /* Maximum number of contexts */
  };





Re: [PATCH v2 09/38] cxlflash: Setup AFU acTag range

2018-03-22 Thread Frederic Barrat



Le 26/02/2018 à 23:21, Uma Krishnan a écrit :

The OCXL specification supports distributing acTags amongst different
AFUs and functions on the link. As cxlflash devices are expected to only
support a single AFU and function, the entire range that was assigned to
the function is also assigned to the AFU.

Signed-off-by: Uma Krishnan 
Acked-by: Matthew R. Ochs 
---


Reviewed-by: Frederic Barrat 


  drivers/scsi/cxlflash/ocxl_hw.c | 13 +
  drivers/scsi/cxlflash/ocxl_hw.h |  2 ++
  2 files changed, 15 insertions(+)

diff --git a/drivers/scsi/cxlflash/ocxl_hw.c b/drivers/scsi/cxlflash/ocxl_hw.c
index 2325030..d01847d9 100644
--- a/drivers/scsi/cxlflash/ocxl_hw.c
+++ b/drivers/scsi/cxlflash/ocxl_hw.c
@@ -89,6 +89,9 @@ static int ocxlflash_config_afu(struct pci_dev *pdev, struct 
ocxl_hw_afu *afu)
struct ocxl_afu_config *acfg = >acfg;
struct ocxl_fn_config *fcfg = >fcfg;
struct device *dev = >dev;
+   int count;
+   int base;
+   int pos;
int rc = 0;

/* Read AFU config at index 0 */
@@ -98,6 +101,16 @@ static int ocxlflash_config_afu(struct pci_dev *pdev, 
struct ocxl_hw_afu *afu)
__func__, rc);
goto out;
}
+
+   /* Only one AFU per function is supported, so actag_base is same */
+   base = afu->fn_actag_base;
+   count = min_t(int, acfg->actag_supported, afu->fn_actag_enabled);
+   pos = acfg->dvsec_afu_control_pos;
+
+   ocxl_config_set_afu_actag(pdev, pos, base, count);
+   dev_dbg(dev, "%s: acTag base=%d enabled=%d\n", __func__, base, count);
+   afu->afu_actag_base = base;
+   afu->afu_actag_enabled = count;
  out:
return rc;
  }
diff --git a/drivers/scsi/cxlflash/ocxl_hw.h b/drivers/scsi/cxlflash/ocxl_hw.h
index f6af247..9c675fa 100644
--- a/drivers/scsi/cxlflash/ocxl_hw.h
+++ b/drivers/scsi/cxlflash/ocxl_hw.h
@@ -22,4 +22,6 @@ struct ocxl_hw_afu {

int fn_actag_base;  /* Function acTag base */
int fn_actag_enabled;   /* Function acTag number enabled */
+   int afu_actag_base; /* AFU acTag base */
+   int afu_actag_enabled;  /* AFU acTag number enabled */
  };





Re: [PATCH v2 08/38] cxlflash: Read host AFU configuration

2018-03-22 Thread Frederic Barrat



Le 26/02/2018 à 23:21, Uma Krishnan a écrit :

The host AFU configuration is read on the initialization path to identify
the features and configuration of the AFU. This data is cached for use in
later configuration steps.

Signed-off-by: Uma Krishnan 
Acked-by: Matthew R. Ochs 
---
  drivers/scsi/cxlflash/ocxl_hw.c | 34 ++
  drivers/scsi/cxlflash/ocxl_hw.h |  1 +
  2 files changed, 35 insertions(+)

diff --git a/drivers/scsi/cxlflash/ocxl_hw.c b/drivers/scsi/cxlflash/ocxl_hw.c
index 39cccb7..2325030 100644
--- a/drivers/scsi/cxlflash/ocxl_hw.c
+++ b/drivers/scsi/cxlflash/ocxl_hw.c
@@ -76,6 +76,33 @@ static int ocxlflash_config_fn(struct pci_dev *pdev, struct 
ocxl_hw_afu *afu)
  }

  /**
+ * ocxlflash_config_afu() - configure the host AFU
+ * @pdev:  PCI device associated with the host.
+ * @afu:   AFU associated with the host.
+ *
+ * Must be called _after_ host function configuration.
+ *
+ * Return: 0 on success, -errno on failure
+ */
+static int ocxlflash_config_afu(struct pci_dev *pdev, struct ocxl_hw_afu *afu)
+{
+   struct ocxl_afu_config *acfg = >acfg;
+   struct ocxl_fn_config *fcfg = >fcfg;
+   struct device *dev = >dev;
+   int rc = 0;
+
+   /* Read AFU config at index 0 */
+   rc = ocxl_config_read_afu(pdev, fcfg, acfg, 0);



Looking at other patches around this one, there's really the assumption 
that we have only one AFU (which is fine). So similar to another 
comment, we could harden that the AFU image really looks like what we 
expect. Here, we read the config at index 0. Just a suggestion that 
ocxl_fn_config gives you the maximum index of AFUs for the function

ocxl_fn_config->max_afu_index

So it should be 0 for cxlflash.

  Fred



+   if (unlikely(rc)) {
+   dev_err(dev, "%s: ocxl_config_read_afu failed rc=%d\n",
+   __func__, rc);
+   goto out;
+   }
+out:
+   return rc;
+}
+
+/**
   * ocxlflash_create_afu() - create the AFU for OCXL
   * @pdev: PCI device associated with the host.
   *
@@ -102,6 +129,13 @@ static void *ocxlflash_create_afu(struct pci_dev *pdev)
__func__, rc);
goto err1;
}
+
+   rc = ocxlflash_config_afu(pdev, afu);
+   if (unlikely(rc)) {
+   dev_err(dev, "%s: AFU configuration failed rc=%d\n",
+   __func__, rc);
+   goto err1;
+   }
  out:
return afu;
  err1:
diff --git a/drivers/scsi/cxlflash/ocxl_hw.h b/drivers/scsi/cxlflash/ocxl_hw.h
index 190d71a..f6af247 100644
--- a/drivers/scsi/cxlflash/ocxl_hw.h
+++ b/drivers/scsi/cxlflash/ocxl_hw.h
@@ -18,6 +18,7 @@ struct ocxl_hw_afu {
struct device *dev; /* Generic device */

struct ocxl_fn_config fcfg; /* DVSEC config of the function */
+   struct ocxl_afu_config acfg;/* AFU configuration data */

int fn_actag_base;  /* Function acTag base */
int fn_actag_enabled;   /* Function acTag number enabled */





Re: [PATCH v2 08/38] cxlflash: Read host AFU configuration

2018-03-22 Thread Frederic Barrat



Le 26/02/2018 à 23:21, Uma Krishnan a écrit :

The host AFU configuration is read on the initialization path to identify
the features and configuration of the AFU. This data is cached for use in
later configuration steps.

Signed-off-by: Uma Krishnan 
Acked-by: Matthew R. Ochs 
---


Reviewed-by: Frederic Barrat 


  drivers/scsi/cxlflash/ocxl_hw.c | 34 ++
  drivers/scsi/cxlflash/ocxl_hw.h |  1 +
  2 files changed, 35 insertions(+)

diff --git a/drivers/scsi/cxlflash/ocxl_hw.c b/drivers/scsi/cxlflash/ocxl_hw.c
index 39cccb7..2325030 100644
--- a/drivers/scsi/cxlflash/ocxl_hw.c
+++ b/drivers/scsi/cxlflash/ocxl_hw.c
@@ -76,6 +76,33 @@ static int ocxlflash_config_fn(struct pci_dev *pdev, struct 
ocxl_hw_afu *afu)
  }

  /**
+ * ocxlflash_config_afu() - configure the host AFU
+ * @pdev:  PCI device associated with the host.
+ * @afu:   AFU associated with the host.
+ *
+ * Must be called _after_ host function configuration.
+ *
+ * Return: 0 on success, -errno on failure
+ */
+static int ocxlflash_config_afu(struct pci_dev *pdev, struct ocxl_hw_afu *afu)
+{
+   struct ocxl_afu_config *acfg = >acfg;
+   struct ocxl_fn_config *fcfg = >fcfg;
+   struct device *dev = >dev;
+   int rc = 0;
+
+   /* Read AFU config at index 0 */
+   rc = ocxl_config_read_afu(pdev, fcfg, acfg, 0);
+   if (unlikely(rc)) {
+   dev_err(dev, "%s: ocxl_config_read_afu failed rc=%d\n",
+   __func__, rc);
+   goto out;
+   }
+out:
+   return rc;
+}
+
+/**
   * ocxlflash_create_afu() - create the AFU for OCXL
   * @pdev: PCI device associated with the host.
   *
@@ -102,6 +129,13 @@ static void *ocxlflash_create_afu(struct pci_dev *pdev)
__func__, rc);
goto err1;
}
+
+   rc = ocxlflash_config_afu(pdev, afu);
+   if (unlikely(rc)) {
+   dev_err(dev, "%s: AFU configuration failed rc=%d\n",
+   __func__, rc);
+   goto err1;
+   }
  out:
return afu;
  err1:
diff --git a/drivers/scsi/cxlflash/ocxl_hw.h b/drivers/scsi/cxlflash/ocxl_hw.h
index 190d71a..f6af247 100644
--- a/drivers/scsi/cxlflash/ocxl_hw.h
+++ b/drivers/scsi/cxlflash/ocxl_hw.h
@@ -18,6 +18,7 @@ struct ocxl_hw_afu {
struct device *dev; /* Generic device */

struct ocxl_fn_config fcfg; /* DVSEC config of the function */
+   struct ocxl_afu_config acfg;/* AFU configuration data */

int fn_actag_base;  /* Function acTag base */
int fn_actag_enabled;   /* Function acTag number enabled */





Re: [PATCH v2 07/38] cxlflash: Setup function acTag range

2018-03-22 Thread Frederic Barrat



Le 26/02/2018 à 23:20, Uma Krishnan a écrit :

The OCXL specification supports distributing acTags amongst different
AFUs and functions on the link. The platform-specific acTag range for the
link is obtained using the OCXL provider services and then assigned to the
host function based on implementation. For cxlflash devices only a single
function per host is expected and thus the entire range is assigned.

Signed-off-by: Uma Krishnan 
Acked-by: Matthew R. Ochs 
---
  drivers/scsi/cxlflash/ocxl_hw.c | 15 +++
  drivers/scsi/cxlflash/ocxl_hw.h |  3 +++
  2 files changed, 18 insertions(+)

diff --git a/drivers/scsi/cxlflash/ocxl_hw.c b/drivers/scsi/cxlflash/ocxl_hw.c
index dc32a73..39cccb7 100644
--- a/drivers/scsi/cxlflash/ocxl_hw.c
+++ b/drivers/scsi/cxlflash/ocxl_hw.c
@@ -42,6 +42,7 @@ static int ocxlflash_config_fn(struct pci_dev *pdev, struct 
ocxl_hw_afu *afu)
  {
struct ocxl_fn_config *fcfg = >fcfg;
struct device *dev = >dev;
+   u16 base, enabled, supported;
int rc = 0;

/* Read DVSEC config of the function */
@@ -56,6 +57,20 @@ static int ocxlflash_config_fn(struct pci_dev *pdev, struct 
ocxl_hw_afu *afu)
if (fcfg->max_afu_index != 0)
dev_warn(dev, "%s: Unexpected AFU index value %d\n",
 __func__, fcfg->max_afu_index);
+
+   rc = ocxl_config_get_actag_info(pdev, , , );
+   if (unlikely(rc)) {
+   dev_err(dev, "%s: ocxl_config_get_actag_info failed rc=%d\n",
+   __func__, rc);
+   goto out;
+   }
+
+   afu->fn_actag_base = base;
+   afu->fn_actag_enabled = enabled;
+


Since you know your configuration (only 1 function with AFU, all actags 
to that AFU), you could even have a warning if supported != enabled, to 
detect suspicious AFU images.


Reviewed-by: Frederic Barrat 



+   ocxl_config_set_actag(pdev, fcfg->dvsec_function_pos, base, enabled);
+   dev_dbg(dev, "%s: Function acTag range base=%u enabled=%u\n",
+   __func__, base, enabled);
  out:
return rc;
  }
diff --git a/drivers/scsi/cxlflash/ocxl_hw.h b/drivers/scsi/cxlflash/ocxl_hw.h
index 658f420..190d71a 100644
--- a/drivers/scsi/cxlflash/ocxl_hw.h
+++ b/drivers/scsi/cxlflash/ocxl_hw.h
@@ -18,4 +18,7 @@ struct ocxl_hw_afu {
struct device *dev; /* Generic device */

struct ocxl_fn_config fcfg; /* DVSEC config of the function */
+
+   int fn_actag_base;  /* Function acTag base */
+   int fn_actag_enabled;   /* Function acTag number enabled */
  };





Re: [PATCH v2 06/38] cxlflash: Read host function configuration

2018-03-22 Thread Frederic Barrat



Le 26/02/2018 à 23:20, Uma Krishnan a écrit :

Per the OCXL specification, the underlying host can have multiple AFUs
per function with each function supporting its own configuration. The host
function configuration is read on the initialization path to evaluate the
number of functions present and identify the features and configuration of
the functions present. This data is cached for use in later configuration
steps. Note that for the OCXL hardware supported by the cxlflash driver,
only one AFU per function is expected.

Signed-off-by: Uma Krishnan 
Acked-by: Matthew R. Ochs 
---


Reviewed-by: Frederic Barrat 



  drivers/scsi/cxlflash/ocxl_hw.c | 41 +
  drivers/scsi/cxlflash/ocxl_hw.h |  2 ++
  2 files changed, 43 insertions(+)

diff --git a/drivers/scsi/cxlflash/ocxl_hw.c b/drivers/scsi/cxlflash/ocxl_hw.c
index e3a0a9b..dc32a73 100644
--- a/drivers/scsi/cxlflash/ocxl_hw.c
+++ b/drivers/scsi/cxlflash/ocxl_hw.c
@@ -32,6 +32,35 @@ static void ocxlflash_destroy_afu(void *afu_cookie)
  }

  /**
+ * ocxlflash_config_fn() - configure the host function
+ * @pdev:  PCI device associated with the host.
+ * @afu:   AFU associated with the host.
+ *
+ * Return: 0 on success, -errno on failure
+ */
+static int ocxlflash_config_fn(struct pci_dev *pdev, struct ocxl_hw_afu *afu)
+{
+   struct ocxl_fn_config *fcfg = >fcfg;
+   struct device *dev = >dev;
+   int rc = 0;
+
+   /* Read DVSEC config of the function */
+   rc = ocxl_config_read_function(pdev, fcfg);
+   if (unlikely(rc)) {
+   dev_err(dev, "%s: ocxl_config_read_function failed rc=%d\n",
+   __func__, rc);
+   goto out;
+   }
+
+   /* Only one AFU per function is supported by ocxlflash */
+   if (fcfg->max_afu_index != 0)
+   dev_warn(dev, "%s: Unexpected AFU index value %d\n",
+__func__, fcfg->max_afu_index);
+out:
+   return rc;
+}
+
+/**
   * ocxlflash_create_afu() - create the AFU for OCXL
   * @pdev: PCI device associated with the host.
   *
@@ -41,6 +70,7 @@ static void *ocxlflash_create_afu(struct pci_dev *pdev)
  {
struct device *dev = >dev;
struct ocxl_hw_afu *afu;
+   int rc;

afu = kzalloc(sizeof(*afu), GFP_KERNEL);
if (unlikely(!afu)) {
@@ -50,8 +80,19 @@ static void *ocxlflash_create_afu(struct pci_dev *pdev)

afu->pdev = pdev;
afu->dev = dev;
+
+   rc = ocxlflash_config_fn(pdev, afu);
+   if (unlikely(rc)) {
+   dev_err(dev, "%s: Function configuration failed rc=%d\n",
+   __func__, rc);
+   goto err1;
+   }
  out:
return afu;
+err1:
+   kfree(afu);
+   afu = NULL;
+   goto out;
  }

  /* Backend ops to ocxlflash services */
diff --git a/drivers/scsi/cxlflash/ocxl_hw.h b/drivers/scsi/cxlflash/ocxl_hw.h
index c7e5c4d..658f420 100644
--- a/drivers/scsi/cxlflash/ocxl_hw.h
+++ b/drivers/scsi/cxlflash/ocxl_hw.h
@@ -16,4 +16,6 @@
  struct ocxl_hw_afu {
struct pci_dev *pdev;   /* PCI device */
struct device *dev; /* Generic device */
+
+   struct ocxl_fn_config fcfg; /* DVSEC config of the function */
  };





Re: [PATCH v2] scsi: ufs: add trace event for ufs upiu

2018-03-22 Thread Bart Van Assche
On Thu, 2018-03-22 at 13:50 +0200, Ohad Sharabi wrote:
> +static void ufshcd_add_cmd_upiu_trace(struct ufs_hba *hba, unsigned int tag,
> + const char *str)
> +{
> + struct utp_upiu_req *rq = hba->lrb[tag].ucd_req_ptr;
> +
> + trace_ufshcd_upiu(dev_name(hba->dev), str, (u8 *)>header,
> + (u8 *)>sc.cdb);
> +}
> +
> +static void ufshcd_add_query_upiu_trace(struct ufs_hba *hba, unsigned int 
> tag,
> + const char *str)
> +{
> + struct utp_upiu_req *rq = hba->lrb[tag].ucd_req_ptr;
> +
> + trace_ufshcd_upiu(dev_name(hba->dev), str, (u8 *)>header,
> + (u8 *)>qr);
> +}
> +
> +static void ufshcd_add_tm_upiu_trace(struct ufs_hba *hba, unsigned int tag,
> + const char *str)
> +{
> + struct utp_task_req_desc *descp;
> + struct utp_upiu_task_req *task_req;
> + int off = (int)tag - hba->nutrs;
> +
> + descp = >utmrdl_base_addr[off];
> + task_req = (struct utp_upiu_task_req *)descp->task_req_upiu;
> + trace_ufshcd_upiu(dev_name(hba->dev), str, (u8 *)_req->header,
> + (u8 *)_req->input_param1);
> +}

Can the trace_ufshcd_upiu() definition be changed such that the third and
fourth argument are declared as 'void *' such that the ugly '(u8 *)' casts
can be left out from the callers?

Thanks,

Bart.




Re: [PATCH 1/2] smartpqi: workaround fw bug for oq deletion

2018-03-22 Thread Johannes Thumshirn
On Thu, Mar 22, 2018 at 08:45:33AM -0400, Martin K. Petersen wrote:
> 
> And he already removed the '-' and left aligned the commit message :)

Thanks a lot :)
-- 
Johannes Thumshirn  Storage
jthumsh...@suse.de+49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850


Re: [PATCH 1/2] smartpqi: workaround fw bug for oq deletion

2018-03-22 Thread Martin K. Petersen

Johannes,

>> Please remove the '-' and left align the commit message.
>
> Gnah, too late. Martin has already applied the patch.

And he already removed the '-' and left aligned the commit message :)

-- 
Martin K. Petersen  Oracle Linux Engineering


[PATCH v2] scsi: ufs: add trace event for ufs upiu

2018-03-22 Thread Ohad Sharabi
Add UFS Protocol Information Units(upiu) trace events for ufs driver,
used to trace various ufs transaction types- command, task-management
and device management.
The trace-point format is generic and can be easily adapted to trace
other upius if needed.
Currently tracing ufs transaction of type 'device management', which
this patch introduce, cannot be obtained from any other trace.
Device management transactions are used for communication with the
device such as reading and writing descriptor or attributes etc.

v1->v2:
- split to transaction specific functions (fix warnings and simplifies
  code)
- adding traces when sending query command

Signed-off-by: Ohad Sharabi 
---
 drivers/scsi/ufs/ufshcd.c  | 42 ++
 include/trace/events/ufs.h | 28 
 2 files changed, 70 insertions(+)

diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index c7da2c1..6f68d5c 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -274,6 +274,37 @@ static inline void ufshcd_remove_non_printable(char *val)
*val = ' ';
 }
 
+static void ufshcd_add_cmd_upiu_trace(struct ufs_hba *hba, unsigned int tag,
+   const char *str)
+{
+   struct utp_upiu_req *rq = hba->lrb[tag].ucd_req_ptr;
+
+   trace_ufshcd_upiu(dev_name(hba->dev), str, (u8 *)>header,
+   (u8 *)>sc.cdb);
+}
+
+static void ufshcd_add_query_upiu_trace(struct ufs_hba *hba, unsigned int tag,
+   const char *str)
+{
+   struct utp_upiu_req *rq = hba->lrb[tag].ucd_req_ptr;
+
+   trace_ufshcd_upiu(dev_name(hba->dev), str, (u8 *)>header,
+   (u8 *)>qr);
+}
+
+static void ufshcd_add_tm_upiu_trace(struct ufs_hba *hba, unsigned int tag,
+   const char *str)
+{
+   struct utp_task_req_desc *descp;
+   struct utp_upiu_task_req *task_req;
+   int off = (int)tag - hba->nutrs;
+
+   descp = >utmrdl_base_addr[off];
+   task_req = (struct utp_upiu_task_req *)descp->task_req_upiu;
+   trace_ufshcd_upiu(dev_name(hba->dev), str, (u8 *)_req->header,
+   (u8 *)_req->input_param1);
+}
+
 static void ufshcd_add_command_trace(struct ufs_hba *hba,
unsigned int tag, const char *str)
 {
@@ -283,6 +314,9 @@ static void ufshcd_add_command_trace(struct ufs_hba *hba,
struct ufshcd_lrb *lrbp;
int transfer_len = -1;
 
+   /* trace UPIU also */
+   ufshcd_add_cmd_upiu_trace(hba, tag, str);
+
if (!trace_ufshcd_command_enabled())
return;
 
@@ -2584,6 +2618,7 @@ static int ufshcd_exec_dev_cmd(struct ufs_hba *hba,
 
hba->dev_cmd.complete = 
 
+   ufshcd_add_query_upiu_trace(hba, tag, "query_send");
/* Make sure descriptors are ready before ringing the doorbell */
wmb();
spin_lock_irqsave(hba->host->host_lock, flags);
@@ -2593,6 +2628,9 @@ static int ufshcd_exec_dev_cmd(struct ufs_hba *hba,
 
err = ufshcd_wait_for_dev_cmd(hba, lrbp, timeout);
 
+   ufshcd_add_query_upiu_trace(hba, tag,
+   err ? "query_complete_err" : "query_complete");
+
 out_put_tag:
ufshcd_put_dev_cmd_tag(hba, tag);
wake_up(>dev_cmd.tag_wq);
@@ -5464,11 +5502,14 @@ static int ufshcd_issue_tm_cmd(struct ufs_hba *hba, int 
lun_id, int task_id,
 
spin_unlock_irqrestore(host->host_lock, flags);
 
+   ufshcd_add_tm_upiu_trace(hba, task_tag, "tm_send");
+
/* wait until the task management command is completed */
err = wait_event_timeout(hba->tm_wq,
test_bit(free_slot, >tm_condition),
msecs_to_jiffies(TM_CMD_TIMEOUT));
if (!err) {
+   ufshcd_add_tm_upiu_trace(hba, task_tag, "tm_complete_err");
dev_err(hba->dev, "%s: task management cmd 0x%.2x timed-out\n",
__func__, tm_function);
if (ufshcd_clear_tm_cmd(hba, free_slot))
@@ -5477,6 +5518,7 @@ static int ufshcd_issue_tm_cmd(struct ufs_hba *hba, int 
lun_id, int task_id,
err = -ETIMEDOUT;
} else {
err = ufshcd_task_req_compl(hba, free_slot, tm_response);
+   ufshcd_add_tm_upiu_trace(hba, task_tag, "tm_complete");
}
 
clear_bit(free_slot, >tm_condition);
diff --git a/include/trace/events/ufs.h b/include/trace/events/ufs.h
index bf6f826..0b2ff5d 100644
--- a/include/trace/events/ufs.h
+++ b/include/trace/events/ufs.h
@@ -257,6 +257,34 @@
)
 );
 
+TRACE_EVENT(ufshcd_upiu,
+   TP_PROTO(const char *dev_name, const char *str, unsigned char *hdr,
+   unsigned char *tsf),
+
+   TP_ARGS(dev_name, str, hdr, tsf),
+
+   TP_STRUCT__entry(
+   __string(dev_name, dev_name)
+   __string(str, str)
+   __array(unsigned char, hdr, 12)
+   __array(unsigned char, tsf, 16)
+   ),

Re: [PATCH] target: Use WARNON_NON_RT(!irqs_disabled())

2018-03-22 Thread kbuild test robot
Hi Arnaldo,

I love your patch! Yet something to improve:

[auto build test ERROR on linus/master]
[also build test ERROR on v4.16-rc6 next-20180322]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improve the system]

url:
https://github.com/0day-ci/linux/commits/Arnaldo-Carvalho-de-Melo/target-Use-WARNON_NON_RT-irqs_disabled/20180322-174549
config: i386-randconfig-s1-03221113 (attached as .config)
compiler: gcc-6 (Debian 6.4.0-9) 6.4.0 20171026
reproduce:
# save the attached .config to linux build tree
make ARCH=i386 

All errors (new ones prefixed by >>):

   drivers/target/target_core_transport.c: In function 
'__transport_check_aborted_status':
>> drivers/target/target_core_transport.c:3207:2: error: implicit declaration 
>> of function 'WARN_ON_ONCE_NONRT' [-Werror=implicit-function-declaration]
 WARN_ON_ONCE_NONRT(!irqs_disabled());
 ^~
   cc1: some warnings being treated as errors

vim +/WARN_ON_ONCE_NONRT +3207 drivers/target/target_core_transport.c

  3199  
  3200  static int __transport_check_aborted_status(struct se_cmd *cmd, int 
send_status)
  3201  __releases(>t_state_lock)
  3202  __acquires(>t_state_lock)
  3203  {
  3204  int ret;
  3205  
  3206  assert_spin_locked(>t_state_lock);
> 3207  WARN_ON_ONCE_NONRT(!irqs_disabled());
  3208  
  3209  if (!(cmd->transport_state & CMD_T_ABORTED))
  3210  return 0;
  3211  /*
  3212   * If cmd has been aborted but either no status is to be sent 
or it has
  3213   * already been sent, just return
  3214   */
  3215  if (!send_status || !(cmd->se_cmd_flags & 
SCF_SEND_DELAYED_TAS)) {
  3216  if (send_status)
  3217  cmd->se_cmd_flags |= SCF_SEND_DELAYED_TAS;
  3218  return 1;
  3219  }
  3220  
  3221  pr_debug("Sending delayed SAM_STAT_TASK_ABORTED status for CDB:"
  3222  " 0x%02x ITT: 0x%08llx\n", cmd->t_task_cdb[0], 
cmd->tag);
  3223  
  3224  cmd->se_cmd_flags &= ~SCF_SEND_DELAYED_TAS;
  3225  cmd->scsi_status = SAM_STAT_TASK_ABORTED;
  3226  trace_target_cmd_complete(cmd);
  3227  
  3228  spin_unlock_irq(>t_state_lock);
  3229  ret = cmd->se_tfo->queue_status(cmd);
  3230  if (ret)
  3231  transport_handle_queue_full(cmd, cmd->se_dev, ret, 
false);
  3232  spin_lock_irq(>t_state_lock);
  3233  
  3234  return 1;
  3235  }
  3236  

---
0-DAY kernel test infrastructureOpen Source Technology Center
https://lists.01.org/pipermail/kbuild-all   Intel Corporation


.config.gz
Description: application/gzip


Re: [PATCH] target: Use WARNON_NON_RT(!irqs_disabled())

2018-03-22 Thread kbuild test robot
Hi Arnaldo,

I love your patch! Yet something to improve:

[auto build test ERROR on linus/master]
[also build test ERROR on v4.16-rc6 next-20180322]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improve the system]

url:
https://github.com/0day-ci/linux/commits/Arnaldo-Carvalho-de-Melo/target-Use-WARNON_NON_RT-irqs_disabled/20180322-174549
config: i386-randconfig-x015-201811 (attached as .config)
compiler: gcc-7 (Debian 7.3.0-1) 7.3.0
reproduce:
# save the attached .config to linux build tree
make ARCH=i386 

All errors (new ones prefixed by >>):

   drivers//target/target_core_transport.c: In function 
'__transport_check_aborted_status':
>> drivers//target/target_core_transport.c:3207:2: error: implicit declaration 
>> of function 'WARN_ON_ONCE_NONRT'; did you mean 'WARN_ON_ONCE'? 
>> [-Werror=implicit-function-declaration]
 WARN_ON_ONCE_NONRT(!irqs_disabled());
 ^~
 WARN_ON_ONCE
   cc1: some warnings being treated as errors

vim +3207 drivers//target/target_core_transport.c

  3199  
  3200  static int __transport_check_aborted_status(struct se_cmd *cmd, int 
send_status)
  3201  __releases(>t_state_lock)
  3202  __acquires(>t_state_lock)
  3203  {
  3204  int ret;
  3205  
  3206  assert_spin_locked(>t_state_lock);
> 3207  WARN_ON_ONCE_NONRT(!irqs_disabled());
  3208  
  3209  if (!(cmd->transport_state & CMD_T_ABORTED))
  3210  return 0;
  3211  /*
  3212   * If cmd has been aborted but either no status is to be sent 
or it has
  3213   * already been sent, just return
  3214   */
  3215  if (!send_status || !(cmd->se_cmd_flags & 
SCF_SEND_DELAYED_TAS)) {
  3216  if (send_status)
  3217  cmd->se_cmd_flags |= SCF_SEND_DELAYED_TAS;
  3218  return 1;
  3219  }
  3220  
  3221  pr_debug("Sending delayed SAM_STAT_TASK_ABORTED status for CDB:"
  3222  " 0x%02x ITT: 0x%08llx\n", cmd->t_task_cdb[0], 
cmd->tag);
  3223  
  3224  cmd->se_cmd_flags &= ~SCF_SEND_DELAYED_TAS;
  3225  cmd->scsi_status = SAM_STAT_TASK_ABORTED;
  3226  trace_target_cmd_complete(cmd);
  3227  
  3228  spin_unlock_irq(>t_state_lock);
  3229  ret = cmd->se_tfo->queue_status(cmd);
  3230  if (ret)
  3231  transport_handle_queue_full(cmd, cmd->se_dev, ret, 
false);
  3232  spin_lock_irq(>t_state_lock);
  3233  
  3234  return 1;
  3235  }
  3236  

---
0-DAY kernel test infrastructureOpen Source Technology Center
https://lists.01.org/pipermail/kbuild-all   Intel Corporation


.config.gz
Description: application/gzip


[Bug 199155] /sys/block/dev/device/model truncates to 16 characters

2018-03-22 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=199155

--- Comment #3 from d gilbert (dgilb...@interlog.com) ---
On 2018-03-22 05:08 AM, bugzilla-dae...@bugzilla.kernel.org wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=199155
> 
> --- Comment #2 from Bryan Seitz (seit...@gmail.com) ---
> Is there some other place this is stored under /sys ?
> 

There shouldn't be any need. If that SATA SSD is behind a compliant
SATL (SCSI to ATA Translation Layer) then it should supply the ATA
Information VPD page plus a SCSI ATA PASS-THROUGH command so that
an ATA Identify command can be tunnelled through to the SSD (and
the response will have the untruncated device details). Tools like
smartmontools use that ATA PASS-THROUGH.

This is a SATA SSD connected via a USB Type C to SATA dongle that
uses UASP.

# lsscsi -gs
[0:0:0:0]  disk INTEL SS DSA2M080G2GC 2CV1 /dev/sda  /dev/sg0  80.0GB


# hdparm -I /dev/sg0

/dev/sg0:

ATA device, with non-removable media
Model Number:   INTEL SSDSA2M080G2GC
Serial Number:  CVPO017405HT080JGN
Firmware Revision:  2CV102M3
Transport:  Serial, ATA8-AST, SATA 1.0a, SATA II Extensions,
SATA Rev 
2.5, SATA Rev 2.6



Note that is not the transport connected to my laptop (USB is).

# sg_vpd -p ai /dev/sda
fetching VPD page failed: Illegal request
sg_vpd failed: Illegal request

That is not good, the SATL should implement the AI VPD page.


# smartctl -a /dev/sg0
smartctl 6.7 2018-03-07 r4718 [x86_64-linux-4.16.0-rc1+] (local build)
Copyright (C) 2002-18, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Vendor:   INTEL SS
Product:  DSA2M080G2GC
Revision: 2CV1
Compliance:   SPC-4


So smartctl thinks its a SCSI device, but if told there is a SATL:

# smartctl -a -d sat /dev/sg0
smartctl 6.7 2018-03-07 r4718 [x86_64-linux-4.16.0-rc1+] (local build)
Copyright (C) 2002-18, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family: Intel X18-M/X25-M/X25-V G2 SSDs
Device Model: INTEL SSDSA2M080G2GC
Serial Number:CVPO017405HT080JGN



So now smartctl sees that it is a SATA SSD. BTW there is some discussion
on the smartmontool dev list about changing that default. That means
that all SCSI storage devices would be checked for the presence of a SATL
and if it was present automatically switch to "ATA" mode.

Doug Gilbert

-- 
You are receiving this mail because:
You are the assignee for the bug.


Re: [PATCH] target: Use WARNON_NON_RT(!irqs_disabled())

2018-03-22 Thread Thomas Gleixner
On Thu, 22 Mar 2018, Arnaldo Carvalho de Melo wrote:
> Em Wed, Mar 21, 2018 at 11:43:58AM -0700, Linus Torvalds escreveu:
> > [ Adding PeterZ to participants due to query about lockdep_assert() ]
> > 
> > On Wed, Mar 21, 2018 at 8:38 AM, Arnaldo Carvalho de Melo
> >  wrote:
> > >
> > > assert_spin_locked(>t_state_lock);
> > > -   WARN_ON_ONCE(!irqs_disabled());
> > > +   WARN_ON_ONCE_NONRT(!irqs_disabled());
> > 
> > Ugh.
> > 
> > Can't we just replace both of those with a lockdep annotation?
> 
> Huh, even better, when that feature gets finished (tglx said it was
> being developed, not there yet tho) it'll allow further reduction of the
> PREEMPT_RT_FULL patchkit.

That's the evil plan :)


Re: [PATCH] target: Use WARNON_NON_RT(!irqs_disabled())

2018-03-22 Thread Arnaldo Carvalho de Melo
Em Thu, Mar 22, 2018 at 06:37:45AM -0300, Arnaldo Carvalho de Melo escreveu:
> Em Wed, Mar 21, 2018 at 11:43:58AM -0700, Linus Torvalds escreveu:
> > [ Adding PeterZ to participants due to query about lockdep_assert() ]
> > 
> > On Wed, Mar 21, 2018 at 8:38 AM, Arnaldo Carvalho de Melo
> >  wrote:
> > >
> > > assert_spin_locked(>t_state_lock);
> > > -   WARN_ON_ONCE(!irqs_disabled());
> > > +   WARN_ON_ONCE_NONRT(!irqs_disabled());
> > 
> > Ugh.
> > 
> > Can't we just replace both of those with a lockdep annotation?
> 
> Huh, even better, when that feature gets finished (tglx said it was
> being developed, not there yet tho) it'll allow further reduction of the
> PREEMPT_RT_FULL patchkit.

We'd get rid of these:

[acme@jouet patches-4.11.12-rt15]$ grep "^+[[:space:]]\+.*NONRT" *.patch
dm-make-rt-aware.patch:+BUG_ON_NONRT(!irqs_disabled());
fs-block-rt-support.patch:+ WARN_ON_NONRT(!irqs_disabled());
i915-bogus-warning-from-i915-when-running-on-PREEMPT.patch:+
WARN_ON_NONRT(!in_interrupt());
iommu-amd--Use-WARN_ON_NORT.patch:+ WARN_ON_NONRT(!irqs_disabled());
iommu-amd--Use-WARN_ON_NORT.patch:+ WARN_ON_NONRT(!irqs_disabled());
irqwork-push_most_work_into_softirq_context.patch:+ 
BUG_ON_NONRT(!irqs_disabled());
net-wireless-warn-nort.patch:+  WARN_ON_ONCE_NONRT(softirq_count() == 0);
posix-timers-thread-posix-cpu-timers-on-rt.patch:+  
WARN_ON_ONCE_NONRT(!irqs_disabled());
posix-timers-thread-posix-cpu-timers-on-rt.patch:+  
WARN_ON_ONCE_NONRT(!irqs_disabled());
posix-timers-thread-posix-cpu-timers-on-rt.patch:+  
WARN_ON_ONCE_NONRT(!irqs_disabled());
workqueue-use-locallock.patch:+ WARN_ON_ONCE_NONRT(!irqs_disabled());
[acme@jouet patches-4.11.12-rt15]$

- Arnaldo


Re: [PATCH] target: Use WARNON_NON_RT(!irqs_disabled())

2018-03-22 Thread Arnaldo Carvalho de Melo
Em Wed, Mar 21, 2018 at 11:43:58AM -0700, Linus Torvalds escreveu:
> [ Adding PeterZ to participants due to query about lockdep_assert() ]
> 
> On Wed, Mar 21, 2018 at 8:38 AM, Arnaldo Carvalho de Melo
>  wrote:
> >
> > assert_spin_locked(>t_state_lock);
> > -   WARN_ON_ONCE(!irqs_disabled());
> > +   WARN_ON_ONCE_NONRT(!irqs_disabled());
> 
> Ugh.
> 
> Can't we just replace both of those with a lockdep annotation?

Huh, even better, when that feature gets finished (tglx said it was
being developed, not there yet tho) it'll allow further reduction of the
PREEMPT_RT_FULL patchkit.

- Arnaldo


Re: [PATCH] target: Use WARNON_NON_RT(!irqs_disabled())

2018-03-22 Thread Thomas Gleixner
On Wed, 21 Mar 2018, Linus Torvalds wrote:
> [ Adding PeterZ to participants due to query about lockdep_assert() ]
> 
> On Wed, Mar 21, 2018 at 8:38 AM, Arnaldo Carvalho de Melo
>  wrote:
> >
> > assert_spin_locked(>t_state_lock);
> > -   WARN_ON_ONCE(!irqs_disabled());
> > +   WARN_ON_ONCE_NONRT(!irqs_disabled());
> 
> Ugh.
> 
> Can't we just replace both of those with a lockdep annotation?
> 
> Does "lockdep_assert_held()" already verify the irq contextr, or do we
> need lockdep_assert_irqs_disabled() too?
> 
> Honestly, the old-fashioned way of doing verification of state by hand
> is understandable, but it's legacy and kind of pointless when we have
> much better tools these days.
> 
> I'm perfectly willing to leave old assertions in place, but if they
> need fixing anyway, I'd damn well want to fix them *right* instead of
> starting to just add more piles of hacks on top of the old model.
> 
> Because when the details of the locking rules depend on RT vs non-RT,
> I want the checks to make sense.  And presumably lockdep is the thing
> that really knows what the status of a lock is, no?

We are working on replacing the _NONRT _RT variants with proper lockdep
mechnisms which are aware of the RT vs. non-RT magic under the hood. Just
not there yet.

Thanks,

tglx


Re: [PATCH 1/2] smartpqi: workaround fw bug for oq deletion

2018-03-22 Thread Johannes Thumshirn
On Thu, Mar 22, 2018 at 09:34:17AM +0100, Johannes Thumshirn wrote:
> On Wed, Mar 21, 2018 at 01:32:31PM -0500, Don Brace wrote:
> > From: Kevin Barnett 
> > 
> > - skip deleting PQI operational queues when there is
> >   an error creating a new queue group. It's not really necessary to
> >   delete the queues anyway because they get deleted during the PQI
> >   reset that is part of the error recovery path.
> 
> Please remove the '-' and left align the commit message.

Gnah, too late. Martin has already applied the patch.

-- 
Johannes Thumshirn  Storage
jthumsh...@suse.de+49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850


Re: [PATCH] scsi: sd: Remember that READ CAPACITY(16) succeeded

2018-03-22 Thread Menion
Maybe is worth to add a log in KERN_NOTICE also

2018-03-14 17:47 GMT+01:00 Laurence Oberman :
> On Wed, 2018-03-14 at 12:15 -0400, Martin K. Petersen wrote:
>> The USB storage glue sets the try_rc_10_first flag in an attempt to
>> avoid wedging poorly implemented legacy USB devices.
>>
>> If the device capacity is too large to be expressed in the provided
>> response buffer field of READ CAPACITY(10), a well-behaved device
>> will
>> set the reported capacity to 0x. We will then attempt to
>> issue a
>> READ CAPACITY(16) to obtain the real capacity.
>>
>> Since this part of the discovery logic is not covered by the
>> first_scan
>> flag, a warning will be printed a couple of times times per
>> revalidate
>> attempt if we upgrade from READ CAPACITY(10) to READ CAPACITY(16).
>>
>> Remember that we have successfully issued READ CAPACITY(16) so we can
>> take the fast path on subsequent revalidate attempts.
>>
>> Reported-by: Menion 
>> Signed-off-by: Martin K. Petersen 
>> ---
>>  drivers/scsi/sd.c | 2 ++
>>  1 file changed, 2 insertions(+)
>>
>> diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
>> index bff21e636ddd..6e971b94af7d 100644
>> --- a/drivers/scsi/sd.c
>> +++ b/drivers/scsi/sd.c
>> @@ -2484,6 +2484,8 @@ sd_read_capacity(struct scsi_disk *sdkp,
>> unsigned char *buffer)
>>   sector_size = old_sector_size;
>>   goto got_data;
>>   }
>> + /* Remember that READ CAPACITY(16) succeeded
>> */
>> + sdp->try_rc_10_first = 0;
>>   }
>>   }
>>
>
> Looks fine to me.
> Reviewed-by: Laurence Oberman 


[PATCH v2] target/file: add support of direct and async I/O

2018-03-22 Thread Andrei Vagin
There are two advantages:
* Direct I/O allows to avoid the write-back cache, so it reduces
  affects to other processes in the system.
* Async I/O allows to handle a few commands concurrently.

DIO + AIO shows a better perfomance for random write operations:

Mode: O_DSYNC Async: 1
$ ./fio --bs=4K --direct=1 --rw=randwrite --ioengine=libaio --iodepth=64 
--name=/dev/sda --runtime=20 --numjobs=2
  WRITE: bw=45.9MiB/s (48.1MB/s), 21.9MiB/s-23.0MiB/s (22.0MB/s-25.2MB/s), 
io=919MiB (963MB), run=20002-20020msec

Mode: O_DSYNC Async: 0
$ ./fio --bs=4K --direct=1 --rw=randwrite --ioengine=libaio --iodepth=64 
--name=/dev/sdb --runtime=20 --numjobs=2
  WRITE: bw=1607KiB/s (1645kB/s), 802KiB/s-805KiB/s (821kB/s-824kB/s), 
io=31.8MiB (33.4MB), run=20280-20295msec

Known issue:

DIF (PI) emulation doesn't work when a target uses async I/O, because
DIF metadata is saved in a separate file, and it is another non-trivial
task how to synchronize writing in two files, so that a following read
operation always returns a consisten metadata for a specified block.

v2: fix comments from Christoph Hellwig

Cc: "Nicholas A. Bellinger" 
Cc: Christoph Hellwig 
Cc: Bryant G. Ly 
Tested-by: Bryant G. Ly 
Signed-off-by: Andrei Vagin 
---
 drivers/target/target_core_file.c | 137 ++
 drivers/target/target_core_file.h |   1 +
 2 files changed, 124 insertions(+), 14 deletions(-)

diff --git a/drivers/target/target_core_file.c 
b/drivers/target/target_core_file.c
index 9b2c0c773022..16751ae55d7b 100644
--- a/drivers/target/target_core_file.c
+++ b/drivers/target/target_core_file.c
@@ -250,6 +250,84 @@ static void fd_destroy_device(struct se_device *dev)
}
 }
 
+struct target_core_file_cmd {
+   unsigned long   len;
+   struct se_cmd   *cmd;
+   struct kiocbiocb;
+};
+
+static void cmd_rw_aio_complete(struct kiocb *iocb, long ret, long ret2)
+{
+   struct target_core_file_cmd *cmd;
+
+   cmd = container_of(iocb, struct target_core_file_cmd, iocb);
+
+   if (ret != cmd->len)
+   target_complete_cmd(cmd->cmd, SAM_STAT_CHECK_CONDITION);
+   else
+   target_complete_cmd(cmd->cmd, SAM_STAT_GOOD);
+
+   kfree(cmd);
+}
+
+static sense_reason_t
+fd_execute_rw_aio(struct se_cmd *cmd, struct scatterlist *sgl, u32 sgl_nents,
+ enum dma_data_direction data_direction)
+{
+   int is_write = !(data_direction == DMA_FROM_DEVICE);
+   struct se_device *dev = cmd->se_dev;
+   struct fd_dev *fd_dev = FD_DEV(dev);
+   struct file *file = fd_dev->fd_file;
+   struct target_core_file_cmd *aio_cmd;
+   struct iov_iter iter = {};
+   struct scatterlist *sg;
+   struct bio_vec *bvec;
+   ssize_t len = 0;
+   int ret = 0, i;
+
+   aio_cmd = kmalloc(sizeof(struct target_core_file_cmd), GFP_KERNEL);
+   if (!aio_cmd)
+   return TCM_LOGICAL_UNIT_COMMUNICATION_FAILURE;
+
+   bvec = kcalloc(sgl_nents, sizeof(struct bio_vec), GFP_KERNEL);
+   if (!bvec) {
+   kfree(aio_cmd);
+   return TCM_LOGICAL_UNIT_COMMUNICATION_FAILURE;
+   }
+
+   for_each_sg(sgl, sg, sgl_nents, i) {
+   bvec[i].bv_page = sg_page(sg);
+   bvec[i].bv_len = sg->length;
+   bvec[i].bv_offset = sg->offset;
+
+   len += sg->length;
+   }
+
+   iov_iter_bvec(, ITER_BVEC | is_write, bvec, sgl_nents, len);
+
+   aio_cmd->cmd = cmd;
+   aio_cmd->len = len;
+   aio_cmd->iocb.ki_pos = cmd->t_task_lba * dev->dev_attrib.block_size;
+   aio_cmd->iocb.ki_filp = file;
+   aio_cmd->iocb.ki_complete = cmd_rw_aio_complete;
+   aio_cmd->iocb.ki_flags = IOCB_DIRECT;
+
+   if (is_write && (cmd->se_cmd_flags & SCF_FUA))
+   aio_cmd->iocb.ki_flags |= IOCB_DSYNC;
+
+   if (is_write)
+   ret = call_write_iter(file, _cmd->iocb, );
+   else
+   ret = call_read_iter(file, _cmd->iocb, );
+
+   kfree(bvec);
+
+   if (ret != -EIOCBQUEUED)
+   cmd_rw_aio_complete(_cmd->iocb, ret, 0);
+
+   return 0;
+}
+
 static int fd_do_rw(struct se_cmd *cmd, struct file *fd,
u32 block_size, struct scatterlist *sgl,
u32 sgl_nents, u32 data_length, int is_write)
@@ -527,7 +605,7 @@ fd_execute_unmap(struct se_cmd *cmd, sector_t lba, sector_t 
nolb)
 }
 
 static sense_reason_t
-fd_execute_rw(struct se_cmd *cmd, struct scatterlist *sgl, u32 sgl_nents,
+fd_execute_rw_buffered(struct se_cmd *cmd, struct scatterlist *sgl, u32 
sgl_nents,
  enum dma_data_direction data_direction)
 {
struct se_device *dev = cmd->se_dev;
@@ -537,16 +615,6 @@ fd_execute_rw(struct se_cmd *cmd, struct scatterlist *sgl, 
u32 sgl_nents,
sense_reason_t rc;
int ret = 0;
/*
-* We are currently limited by the 

Fiber Channel in LIO / TCM

2018-03-22 Thread TomK
Curious if there will be continued development and support of Fiber 
Channel in LIO / TCM?


I haven't seen much in the way in terms of commits on github or the 
linux-iscsi.org/wiki/Fibre_Channel page in recent months / years so 
wondering about it's future.


--
Cheers,
Tom K.