Re: [PATCH v2] virtio-blk: add DISCARD support to virtio-blk driver

2017-07-04 Thread Michael S. Tsirkin
On Tue, Jul 04, 2017 at 11:24:01AM +0200, Paolo Bonzini wrote:
> 
> 
> On 05/07/2017 10:44, Changpeng Liu wrote:
> > Currently the virtio-blk driver does not provide a discard feature flag, so
> > filesystems built on top of the block device will not send discard commands.
> > This is fine for an HDD backend, but it hurts performance for an SSD
> > backend.
> > 
> > Add a feature flag VIRTIO_BLK_F_DISCARD and a command VIRTIO_BLK_T_DISCARD
> > to extend the existing virtio-blk protocol, and define a 16-byte discard
> > descriptor for each discard segment. The discard segment definition aligns
> > with the SCSI and NVM Express protocols, and the virtio-blk driver will
> > support multi-range discard requests as well.
> > 
> > Signed-off-by: Changpeng Liu 
> 
> Please include a patch for the specification.


Most importantly, please remember to copy virtio-...@lists.oasis-open.org
on anything that changes the host/guest interface.

> Since we are at it, I
> would like to have three operations defined using the same descriptor:
> 
> - discard (SCSI UNMAP)
> 
> - write zeroes (SCSI WRITE SAME without UNMAP flag)
> 
> - write zeroes and possibly discard (SCSI WRITE SAME with UNMAP flag)
> 
> The last two can use the same command VIRTIO_BLK_T_WRITE_ZEROES, using
> the reserved field as a flags field.
> 
> Paolo

> > ---
> >  drivers/block/virtio_blk.c  | 76 +++--
> >  include/uapi/linux/virtio_blk.h | 19 +++
> >  2 files changed, 92 insertions(+), 3 deletions(-)
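
For reference, a minimal sketch of the 16-byte per-segment descriptor described
above, inferred from the range[] assignments in the driver patch; the field
order and exact uapi spelling are assumptions, since the
include/uapi/linux/virtio_blk.h hunk is not quoted in this message:

#include <linux/types.h>

/* Sketch only: layout inferred from range[n].slba/nlba/reserved in the driver;
 * the real header may order or name the fields differently. */
struct virtio_blk_discard {
	__le64 slba;		/* first logical block to discard */
	__le32 nlba;		/* number of logical blocks to discard */
	__le32 reserved;	/* reserved, written as zero by the driver */
};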


Re: [PATCH net] virtio-net: unbreak csumed packets for small buffer XDP

2017-07-04 Thread Jason Wang



On 2017-07-04 01:03, Michael S. Tsirkin wrote:

On Wed, Jun 28, 2017 at 08:05:06PM +0800, Jason Wang wrote:


On 2017-06-28 12:01, Michael S. Tsirkin wrote:

On Wed, Jun 28, 2017 at 11:40:30AM +0800, Jason Wang wrote:

On 2017-06-28 11:31, Michael S. Tsirkin wrote:

On Wed, Jun 28, 2017 at 10:45:18AM +0800, Jason Wang wrote:

On 2017-06-28 10:17, Michael S. Tsirkin wrote:

On Wed, Jun 28, 2017 at 10:14:34AM +0800, Jason Wang wrote:

On 2017-06-28 10:02, Michael S. Tsirkin wrote:

On Wed, Jun 28, 2017 at 09:54:03AM +0800, Jason Wang wrote:

We should allow csumed packets for small buffers, otherwise XDP_PASS
won't work correctly.

Fixes: bb91accf2733 ("virtio-net: XDP support for small buffers")
Signed-off-by: Jason Wang

The issue would be that VIRTIO_NET_HDR_F_DATA_VALID might be set.
What do you think?

I think it's safe. For XDP_PASS, it works just as it did in the past.

That's the part I don't get. With DATA_VALID the csum in the packet is wrong,
but XDP tools assume its value is valid.

DATA_VALID is CHECKSUM_UNNECESSARY on the host, and according to the comment
in skbuff.h:


"
*   The hardware you're dealing with doesn't calculate the full checksum
*   (as in CHECKSUM_COMPLETE), but it does parse headers and verify
checksums
*   for specific protocols. For such packets it will set
CHECKSUM_UNNECESSARY
*   if their checksums are okay. skb->csum is still undefined in this case
*   though. A driver or device must never modify the checksum field in the
*   packet even if checksum is verified.
"

The csum is correct I believe?

Thanks

That's on input. But I think for tun it's output, where that is equivalent
to CHECKSUM_NONE.



Yes, but the comment said:

"
 * CHECKSUM_NONE:
 *
 *   The skb was already checksummed by the protocol, or a checksum is not
 *   required.
 *
 * CHECKSUM_UNNECESSARY:
 *
 *   This has the same meaning as CHECKSUM_NONE for checksum offload on
 *   output.
 *
"

So it is still correct, I think?

Thanks

Hmm maybe I mean NEEDS_CHECKSUM actually.

I'll need to re-read the spec.
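
For reference, the two header flags being weighed here are defined in
include/uapi/linux/virtio_net.h (NEEDS_CSUM is what is called NEEDS_CHECKSUM
above):

#define VIRTIO_NET_HDR_F_NEEDS_CSUM	1	/* Use csum_start, csum_offset */
#define VIRTIO_NET_HDR_F_DATA_VALID	2	/* Csum is valid */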


Not sure this is an issue. But if it is, we can probably checksum the packet
before passing it to XDP, though that would be a little slow.

Thanks



Right. I confused DATA_VALID with NEEDS_CHECKSUM.

IIUC XDP generally refuses to attach if checksum offload
is enabled.


Any reason to do this? (I don't see any code for this.)



Could you please explain how to reproduce the issue you are seeing?



With small buffers, all csumed packets will be dropped.

Thanks
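
A minimal sketch of the guard under discussion in virtio-net's small-buffer
receive path when an XDP program is attached; the names follow the driver, but
this is an illustration of the change being proposed, not the literal diff:

	if (xdp_prog) {
		struct virtio_net_hdr_mrg_rxbuf *hdr = buf;

		/* Before the fix: any header flag (DATA_VALID or NEEDS_CSUM)
		 * sent the packet to the error path, so csumed packets never
		 * reached XDP_PASS:
		 *
		 *	if (unlikely(hdr->hdr.gso_type || hdr->hdr.flags))
		 *		goto err_xdp;
		 *
		 * Proposed: only refuse GSO packets, which XDP cannot handle,
		 * and let csumed packets through to the XDP program.
		 */
		if (unlikely(hdr->hdr.gso_type))
			goto err_xdp;
		/* then run the XDP program on the buffer */
	}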

Re: [PATCH v1] virtio_blk: Use sysfs_match_string() helper

2017-07-04 Thread Jason Wang



On 2017-07-03 20:05, Andy Shevchenko wrote:

On Fri, 2017-06-09 at 15:07 +0300, Andy Shevchenko wrote:

Use the sysfs_match_string() helper instead of the open-coded variant.

Did I miss a maintainer?


Nope :)

Reviewed-by: Jason Wang 




Cc: "Michael S. Tsirkin" 
Cc: Jason Wang 
Signed-off-by: Andy Shevchenko 
---
  drivers/block/virtio_blk.c | 7 ++-
  1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
index 553cc4c542b4..0e707b8cce9d 100644
--- a/drivers/block/virtio_blk.c
+++ b/drivers/block/virtio_blk.c
@@ -541,12 +541,9 @@ virtblk_cache_type_store(struct device *dev, struct device_attribute *attr,
 	int i;
 
 	BUG_ON(!virtio_has_feature(vblk->vdev, VIRTIO_BLK_F_CONFIG_WCE));
-	for (i = ARRAY_SIZE(virtblk_cache_types); --i >= 0; )
-		if (sysfs_streq(buf, virtblk_cache_types[i]))
-			break;
-
+	i = sysfs_match_string(virtblk_cache_types, buf);
 	if (i < 0)
-		return -EINVAL;
+		return i;
 
 	virtio_cwrite8(vdev, offsetof(struct virtio_blk_config, wce), i);
 	virtblk_update_cache_mode(vdev);
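
For context, sysfs_match_string(array, str) compares str against each array
entry with sysfs_streq(), so the trailing newline from a sysfs write is
ignored, and it returns the matching index or -EINVAL; that is why the error
path above can simply return i. A minimal sketch, with the strings assumed to
match the driver's virtblk_cache_types array:

#include <linux/string.h>

static const char *const virtblk_cache_types[] = {
	"write through", "write back"
};

	/* inside the store handler: 0 or 1 on a match, -EINVAL otherwise */
	i = sysfs_match_string(virtblk_cache_types, buf);
	if (i < 0)
		return i;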



Re: [PATCH 0/1] Change email address

2017-07-04 Thread Christian Borntraeger
On 07/04/2017 02:02 PM, Paolo Bonzini wrote:
> 
> 
> On 04/07/2017 11:30, Cornelia Huck wrote:
>> New employer, new address. Make sure people use the right one.
>>
>> Christian, Martin, it's probably quickest if one of you takes this.
> 
> I'm sending a pull request tomorrow so I'll include this too.

Ok,
Acked-by: Christian Borntraeger 


> 
> Paolo
> 
>> Cornelia Huck (1):
>>   Update my email address
>>
>>  MAINTAINERS | 6 +++---
>>  1 file changed, 3 insertions(+), 3 deletions(-)
>>
> 



Re: [PATCH 0/1] Change email address

2017-07-04 Thread Paolo Bonzini


On 04/07/2017 11:30, Cornelia Huck wrote:
> New employer, new address. Make sure people use the right one.
> 
> Christian, Martin, it's probably quickest if one of you takes this.

I'm sending a pull request tomorrow so I'll include this too.

Paolo

> Cornelia Huck (1):
>   Update my email address
> 
>  MAINTAINERS | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 


Re: [PATCH v2] virtio-blk: add DISCARD support to virtio-blk driver

2017-07-04 Thread Paolo Bonzini


On 05/07/2017 10:44, Changpeng Liu wrote:
> Currently the virtio-blk driver does not provide a discard feature flag, so
> filesystems built on top of the block device will not send discard commands.
> This is fine for an HDD backend, but it hurts performance for an SSD
> backend.
> 
> Add a feature flag VIRTIO_BLK_F_DISCARD and a command VIRTIO_BLK_T_DISCARD
> to extend the existing virtio-blk protocol, and define a 16-byte discard
> descriptor for each discard segment. The discard segment definition aligns
> with the SCSI and NVM Express protocols, and the virtio-blk driver will
> support multi-range discard requests as well.
> 
> Signed-off-by: Changpeng Liu 

Please include a patch for the specification.  Since we are at it, I
would like to have three operations defined using the same descriptor:

- discard (SCSI UNMAP)

- write zeroes (SCSI WRITE SAME without UNMAP flag)

- write zeroes and possibly discard (SCSI WRITE SAME with UNMAP flag)

The last two can use the same command VIRTIO_BLK_T_WRITE_ZEROES, using
the reserved field as a flags field.

Paolo
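
One possible encoding of this, reusing the 16-byte discard descriptor with the
reserved field repurposed as per-segment flags; the struct, flag name and bit
value below are purely illustrative, not taken from any spec or patch:

#include <linux/types.h>

/* Illustrative only: bit 0 of the former reserved field selects "write zeroes
 * and possibly discard" (the WRITE SAME with UNMAP case). */
#define VIRTIO_BLK_WRITE_ZEROES_FLAG_UNMAP	(1u << 0)

struct virtio_blk_write_zeroes {
	__le64 slba;	/* first logical block to zero */
	__le32 nlba;	/* number of logical blocks to write as zeroes */
	__le32 flags;	/* 0: plain write zeroes; FLAG_UNMAP: may deallocate */
};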

> ---
> >  drivers/block/virtio_blk.c  | 76 +++--
>  include/uapi/linux/virtio_blk.h | 19 +++
>  2 files changed, 92 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
> index 0297ad7..8f0c614 100644
> --- a/drivers/block/virtio_blk.c
> +++ b/drivers/block/virtio_blk.c
> @@ -172,10 +172,52 @@ static int virtblk_add_req(struct virtqueue *vq, struct 
> virtblk_req *vbr,
>   return virtqueue_add_sgs(vq, sgs, num_out, num_in, vbr, GFP_ATOMIC);
>  }
>  
> +static inline int virtblk_setup_discard(struct request *req)
> +{
> + unsigned short segments = blk_rq_nr_discard_segments(req), n = 0;
> + u32 block_size = queue_logical_block_size(req->q);
> + struct virtio_blk_discard *range;
> + struct bio *bio;
> +
> + if (block_size < 512 || !block_size)
> + return -1;
> +
> + range = kmalloc_array(segments, sizeof(*range), GFP_ATOMIC);
> + if (!range)
> + return -1;
> +
> + __rq_for_each_bio(bio, req) {
> + u64 slba = (bio->bi_iter.bi_sector << 9) / block_size;
> + u32 nlb = bio->bi_iter.bi_size / block_size;
> +
> + range[n].reserved = cpu_to_le32(0);
> + range[n].nlba = cpu_to_le32(nlb);
> + range[n].slba = cpu_to_le64(slba);
> + n++;
> + }
> +
> + if (WARN_ON_ONCE(n != segments)) {
> + kfree(range);
> + return -1;
> + }
> +
> + req->special_vec.bv_page = virt_to_page(range);
> + req->special_vec.bv_offset = offset_in_page(range);
> + req->special_vec.bv_len = sizeof(*range) * segments;
> + req->rq_flags |= RQF_SPECIAL_PAYLOAD;
> +
> + return 0;
> +}
> +
>  static inline void virtblk_request_done(struct request *req)
>  {
>   struct virtblk_req *vbr = blk_mq_rq_to_pdu(req);
>  
> + if (req->rq_flags & RQF_SPECIAL_PAYLOAD) {
> + kfree(page_address(req->special_vec.bv_page) +
> +   req->special_vec.bv_offset);
> + }
> +
>   switch (req_op(req)) {
>   case REQ_OP_SCSI_IN:
>   case REQ_OP_SCSI_OUT:
> @@ -237,6 +279,9 @@ static blk_status_t virtio_queue_rq(struct blk_mq_hw_ctx 
> *hctx,
>   case REQ_OP_FLUSH:
>   type = VIRTIO_BLK_T_FLUSH;
>   break;
> + case REQ_OP_DISCARD:
> + type = VIRTIO_BLK_T_DISCARD;
> + break;
>   case REQ_OP_SCSI_IN:
>   case REQ_OP_SCSI_OUT:
>   type = VIRTIO_BLK_T_SCSI_CMD;
> @@ -256,9 +301,15 @@ static blk_status_t virtio_queue_rq(struct blk_mq_hw_ctx 
> *hctx,
>  
>   blk_mq_start_request(req);
>  
> + if (type == VIRTIO_BLK_T_DISCARD) {
> + err = virtblk_setup_discard(req);
> + if (err)
> + return BLK_STS_IOERR;
> + }
> +
>   num = blk_rq_map_sg(hctx->queue, req, vbr->sg);
>   if (num) {
> - if (rq_data_dir(req) == WRITE)
> + if (rq_data_dir(req) == WRITE || type == VIRTIO_BLK_T_DISCARD)
>   vbr->out_hdr.type |= cpu_to_virtio32(vblk->vdev, 
> VIRTIO_BLK_T_OUT);
>   else
>   vbr->out_hdr.type |= cpu_to_virtio32(vblk->vdev, 
> VIRTIO_BLK_T_IN);
> @@ -767,6 +818,25 @@ static int virtblk_probe(struct virtio_device *vdev)
>   if (!err && opt_io_size)
>   blk_queue_io_opt(q, blk_size * opt_io_size);
>  
> + if (virtio_has_feature(vdev, VIRTIO_BLK_F_DISCARD)) {
> + q->limits.discard_alignment = blk_size;
> + q->limits.discard_granularity = blk_size;
> +
> > + virtio_cread(vdev, struct virtio_blk_config, max_discard_seg, &v);
> + if (v)
> + blk_queue_max_discard_sectors(q, v);
> + else
> + blk_queue_max_discard_sectors(q, -1U);
> +
> + virtio_cread(vdev, struct 

[PATCH v2] virtio-blk: add DISCARD support to virtio-blk driver

2017-07-04 Thread Changpeng Liu
Currently the virtio-blk driver does not provide a discard feature flag, so
filesystems built on top of the block device will not send discard commands.
This is fine for an HDD backend, but it hurts performance for an SSD backend.

Add a feature flag VIRTIO_BLK_F_DISCARD and a command VIRTIO_BLK_T_DISCARD
to extend the existing virtio-blk protocol, and define a 16-byte discard
descriptor for each discard segment. The discard segment definition aligns
with the SCSI and NVM Express protocols, and the virtio-blk driver will
support multi-range discard requests as well.

Signed-off-by: Changpeng Liu 
---
 drivers/block/virtio_blk.c  | 76 +++--
 include/uapi/linux/virtio_blk.h | 19 +++
 2 files changed, 92 insertions(+), 3 deletions(-)

diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
index 0297ad7..8f0c614 100644
--- a/drivers/block/virtio_blk.c
+++ b/drivers/block/virtio_blk.c
@@ -172,10 +172,52 @@ static int virtblk_add_req(struct virtqueue *vq, struct virtblk_req *vbr,
return virtqueue_add_sgs(vq, sgs, num_out, num_in, vbr, GFP_ATOMIC);
 }
 
+static inline int virtblk_setup_discard(struct request *req)
+{
+   unsigned short segments = blk_rq_nr_discard_segments(req), n = 0;
+   u32 block_size = queue_logical_block_size(req->q);
+   struct virtio_blk_discard *range;
+   struct bio *bio;
+
+   if (block_size < 512 || !block_size)
+   return -1;
+
+   range = kmalloc_array(segments, sizeof(*range), GFP_ATOMIC);
+   if (!range)
+   return -1;
+
+   __rq_for_each_bio(bio, req) {
+   u64 slba = (bio->bi_iter.bi_sector << 9) / block_size;
+   u32 nlb = bio->bi_iter.bi_size / block_size;
+
+   range[n].reserved = cpu_to_le32(0);
+   range[n].nlba = cpu_to_le32(nlb);
+   range[n].slba = cpu_to_le64(slba);
+   n++;
+   }
+
+   if (WARN_ON_ONCE(n != segments)) {
+   kfree(range);
+   return -1;
+   }
+
+   req->special_vec.bv_page = virt_to_page(range);
+   req->special_vec.bv_offset = offset_in_page(range);
+   req->special_vec.bv_len = sizeof(*range) * segments;
+   req->rq_flags |= RQF_SPECIAL_PAYLOAD;
+
+   return 0;
+}
+
 static inline void virtblk_request_done(struct request *req)
 {
struct virtblk_req *vbr = blk_mq_rq_to_pdu(req);
 
+   if (req->rq_flags & RQF_SPECIAL_PAYLOAD) {
+   kfree(page_address(req->special_vec.bv_page) +
+ req->special_vec.bv_offset);
+   }
+
switch (req_op(req)) {
case REQ_OP_SCSI_IN:
case REQ_OP_SCSI_OUT:
@@ -237,6 +279,9 @@ static blk_status_t virtio_queue_rq(struct blk_mq_hw_ctx *hctx,
case REQ_OP_FLUSH:
type = VIRTIO_BLK_T_FLUSH;
break;
+   case REQ_OP_DISCARD:
+   type = VIRTIO_BLK_T_DISCARD;
+   break;
case REQ_OP_SCSI_IN:
case REQ_OP_SCSI_OUT:
type = VIRTIO_BLK_T_SCSI_CMD;
@@ -256,9 +301,15 @@ static blk_status_t virtio_queue_rq(struct blk_mq_hw_ctx *hctx,
 
blk_mq_start_request(req);
 
+   if (type == VIRTIO_BLK_T_DISCARD) {
+   err = virtblk_setup_discard(req);
+   if (err)
+   return BLK_STS_IOERR;
+   }
+
num = blk_rq_map_sg(hctx->queue, req, vbr->sg);
if (num) {
-   if (rq_data_dir(req) == WRITE)
+   if (rq_data_dir(req) == WRITE || type == VIRTIO_BLK_T_DISCARD)
vbr->out_hdr.type |= cpu_to_virtio32(vblk->vdev, VIRTIO_BLK_T_OUT);
else
vbr->out_hdr.type |= cpu_to_virtio32(vblk->vdev, VIRTIO_BLK_T_IN);
@@ -767,6 +818,25 @@ static int virtblk_probe(struct virtio_device *vdev)
if (!err && opt_io_size)
blk_queue_io_opt(q, blk_size * opt_io_size);
 
+   if (virtio_has_feature(vdev, VIRTIO_BLK_F_DISCARD)) {
+   q->limits.discard_alignment = blk_size;
+   q->limits.discard_granularity = blk_size;
+
+   virtio_cread(vdev, struct virtio_blk_config, max_discard_seg, &v);
+   if (v)
+   blk_queue_max_discard_sectors(q, v);
+   else
+   blk_queue_max_discard_sectors(q, -1U);
+
+   virtio_cread(vdev, struct virtio_blk_config, max_discard_num, &v);
+   if (v)
+   blk_queue_max_discard_segments(q, v);
+   else
+   blk_queue_max_discard_segments(q, 256);
+
+   queue_flag_set_unlocked(QUEUE_FLAG_DISCARD, q);
+   }
+
virtio_device_ready(vdev);
 
device_add_disk(&vdev->dev, vblk->disk);
@@ -874,14 +944,14 @@ static int virtblk_restore(struct virtio_device *vdev)
VIRTIO_BLK_F_SCSI,
 #endif
VIRTIO_BLK_F_FLUSH, VIRTIO_BLK_F_TOPOLOGY, VIRTIO_BLK_F_CONFIG_WCE,
-
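
The include/uapi/linux/virtio_blk.h hunk is cut off above. Judging only from
the virtio_cread() calls in virtblk_probe(), it presumably adds configuration
fields along these lines; the field names come from the driver code, but their
types, their placement in struct virtio_blk_config and the VIRTIO_BLK_F_DISCARD
bit number are guesses:

/* Guessed reconstruction, not the actual patch hunk. */
struct virtio_blk_config {
	/* ... existing fields (capacity, size_max, seg_max, ...) ... */
	__u32 max_discard_seg;	/* used above as the max discard sectors limit */
	__u32 max_discard_num;	/* used above as the max discard segments limit */
};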