Re: [PATCH v3 3/7] block: add max_hw_transfer to BlockLimits
On 16.06.21 15:18, Paolo Bonzini wrote: On 15/06/21 18:18, Max Reitz wrote: } +/* Returns the maximum hardware transfer length, in bytes; guaranteed nonzero */ +uint64_t blk_get_max_hw_transfer(BlockBackend *blk) +{ + BlockDriverState *bs = blk_bs(blk); + uint64_t max = INT_MAX; + + if (bs) { + max = MIN_NON_ZERO(bs->bl.max_hw_transfer, bs->bl.max_transfer); + } + return max; Both `max_hw_transfer` and `max_transfer` can be 0, so this could return 0, contrary to what the comment above promises. Should `max` be initialized to 0 with a `MIN_NON_ZERO(max, INT_MAX)` here (like `blk_get_max_transfer()` does it)? Yes, something to that effect. (As for the rest, I think aligning to the request alignment makes sense, but then again we don’t do that for max_transfer either, so... this at least wouldn’t be a new bug. Ok, will do. I will also add a new patch to align max_transfer to the request alignment. Regarding the comment, checkpatch complains about it, so it should be fixed so that /* is on its own line. That makes it different from every other comment in block_int.h though. Is it okay to fix all of them in a follow-up? The reason it’s different is that the comment style in question was added to checkpatch only relatively recently. I can’t speak for others, but I’m a simple person. I just do what makes checkpatch happy. :) Given that the checkpatch complaint is only a warning, I think it’s OK to keep the comment as it is here, and perhaps optionally fix all comments in block_int.h in a follow-up. I don’t think we need to fix existing comments, but, well, it wouldn’t be wrong. Max
Re: [PATCH v3 3/7] block: add max_hw_transfer to BlockLimits
On 15/06/21 18:18, Max Reitz wrote: } +/* Returns the maximum hardware transfer length, in bytes; guaranteed nonzero */ +uint64_t blk_get_max_hw_transfer(BlockBackend *blk) +{ + BlockDriverState *bs = blk_bs(blk); + uint64_t max = INT_MAX; + + if (bs) { + max = MIN_NON_ZERO(bs->bl.max_hw_transfer, bs->bl.max_transfer); + } + return max; Both `max_hw_transfer` and `max_transfer` can be 0, so this could return 0, contrary to what the comment above promises. Should `max` be initialized to 0 with a `MIN_NON_ZERO(max, INT_MAX)` here (like `blk_get_max_transfer()` does it)? Yes, something to that effect. (As for the rest, I think aligning to the request alignment makes sense, but then again we don’t do that for max_transfer either, so... this at least wouldn’t be a new bug. Ok, will do. I will also add a new patch to align max_transfer to the request alignment. Regarding the comment, checkpatch complains about it, so it should be fixed so that /* is on its own line. That makes it different from every other comment in block_int.h though. Is it okay to fix all of them in a follow-up? Paolo Speaking of checkpatch, now that I ran it, it also complains about the new line in bdrv_merge_limits() exceeding 80 characters, so that should be fixed, too.)
Re: [PATCH v3 3/7] block: add max_hw_transfer to BlockLimits
On 03.06.21 15:37, Paolo Bonzini wrote: For block host devices, I/O can happen through either the kernel file descriptor I/O system calls (preadv/pwritev, io_submit, io_uring) or the SCSI passthrough ioctl SG_IO. In the latter case, the size of each transfer can be limited by the HBA, while for file descriptor I/O the kernel is able to split and merge I/O in smaller pieces as needed. Applying the HBA limits to file descriptor I/O results in more system calls and suboptimal performance, so this patch splits the max_transfer limit in two: max_transfer remains valid and is used in general, while max_hw_transfer is limited to the maximum hardware size. max_hw_transfer can then be included by the scsi-generic driver in the block limits page, to ensure that the stricter hardware limit is used. Signed-off-by: Paolo Bonzini --- block/block-backend.c | 12 block/file-posix.c | 2 +- block/io.c | 1 + hw/scsi/scsi-generic.c | 2 +- include/block/block_int.h | 7 +++ include/sysemu/block-backend.h | 1 + 6 files changed, 23 insertions(+), 2 deletions(-) diff --git a/block/block-backend.c b/block/block-backend.c index 15f1ea4288..2ea1412a54 100644 --- a/block/block-backend.c +++ b/block/block-backend.c @@ -1953,6 +1953,18 @@ uint32_t blk_get_request_alignment(BlockBackend *blk) return bs ? bs->bl.request_alignment : BDRV_SECTOR_SIZE; } +/* Returns the maximum hardware transfer length, in bytes; guaranteed nonzero */ +uint64_t blk_get_max_hw_transfer(BlockBackend *blk) +{ +BlockDriverState *bs = blk_bs(blk); +uint64_t max = INT_MAX; + +if (bs) { +max = MIN_NON_ZERO(bs->bl.max_hw_transfer, bs->bl.max_transfer); +} +return max; Both `max_hw_transfer` and `max_transfer` can be 0, so this could return 0, contrary to what the comment above promises. Should `max` be initialized to 0 with a `MIN_NON_ZERO(max, INT_MAX)` here (like `blk_get_max_transfer()` does it)? (As for the rest, I think aligning to the request alignment makes sense, but then again we don’t do that for max_transfer either, so... this at least wouldn’t be a new bug. Regarding the comment, checkpatch complains about it, so it should be fixed so that /* is on its own line. Speaking of checkpatch, now that I ran it, it also complains about the new line in bdrv_merge_limits() exceeding 80 characters, so that should be fixed, too.) Max
Re: [PATCH v3 3/7] block: add max_hw_transfer to BlockLimits
On 03/06/21 19:33, Eric Blake wrote: +/* Returns the maximum hardware transfer length, in bytes; guaranteed nonzero */ +uint64_t blk_get_max_hw_transfer(BlockBackend *blk) +{ +BlockDriverState *bs = blk_bs(blk); +uint64_t max = INT_MAX; This is an unaligned value; should we instead round it down to the request_alignment granularity? See below... +++ b/include/block/block_int.h @@ -695,6 +695,13 @@ typedef struct BlockLimits { * clamped down. */ uint32_t max_transfer; +/* Maximal hardware transfer length in bytes. Applies whenever Leading /* on its own line, per our style. The whole file still uses this style, I can change it if desired or do it later for the whole file or even the whole block subsystem. + * transfers to the device bypass the kernel I/O scheduler, for + * example with SG_IO. If larger than max_transfer or if zero, + * blk_get_max_hw_transfer will fall back to max_transfer. + */ Should we mandate any additional requirements on this value such as multiple of request_alignment or even power-of-2? Certainly not power of 2. Multiple of request_alignment probably makes sense, but max_transfer doesn't have that limit. Paolo +uint64_t max_hw_transfer; + /* memory alignment, in bytes so that no bounce buffer is needed */ size_t min_mem_alignment;
Re: [PATCH v3 3/7] block: add max_hw_transfer to BlockLimits
On Thu, Jun 03, 2021 at 03:37:18PM +0200, Paolo Bonzini wrote: > For block host devices, I/O can happen through either the kernel file > descriptor I/O system calls (preadv/pwritev, io_submit, io_uring) > or the SCSI passthrough ioctl SG_IO. > > In the latter case, the size of each transfer can be limited by the > HBA, while for file descriptor I/O the kernel is able to split and > merge I/O in smaller pieces as needed. Applying the HBA limits to > file descriptor I/O results in more system calls and suboptimal > performance, so this patch splits the max_transfer limit in two: > max_transfer remains valid and is used in general, while max_hw_transfer > is limited to the maximum hardware size. max_hw_transfer can then be > included by the scsi-generic driver in the block limits page, to ensure > that the stricter hardware limit is used. > > +/* Returns the maximum hardware transfer length, in bytes; guaranteed > nonzero */ > +uint64_t blk_get_max_hw_transfer(BlockBackend *blk) > +{ > +BlockDriverState *bs = blk_bs(blk); > +uint64_t max = INT_MAX; This is an unaligned value; should we instead round it down to the request_alignment granularity? > + > +if (bs) { > +max = MIN_NON_ZERO(bs->bl.max_hw_transfer, bs->bl.max_transfer); > +} > +return max; > +} > + > /* Returns the maximum transfer length, in bytes; guaranteed nonzero */ > uint32_t blk_get_max_transfer(BlockBackend *blk) > { > +++ b/include/block/block_int.h > @@ -695,6 +695,13 @@ typedef struct BlockLimits { > * clamped down. */ > uint32_t max_transfer; > > +/* Maximal hardware transfer length in bytes. Applies whenever Leading /* on its own line, per our style. > + * transfers to the device bypass the kernel I/O scheduler, for > + * example with SG_IO. If larger than max_transfer or if zero, > + * blk_get_max_hw_transfer will fall back to max_transfer. > + */ Should we mandate any additional requirements on this value such as multiple of request_alignment or even power-of-2? > +uint64_t max_hw_transfer; > + > /* memory alignment, in bytes so that no bounce buffer is needed */ > size_t min_mem_alignment; > -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org
[PATCH v3 3/7] block: add max_hw_transfer to BlockLimits
For block host devices, I/O can happen through either the kernel file descriptor I/O system calls (preadv/pwritev, io_submit, io_uring) or the SCSI passthrough ioctl SG_IO. In the latter case, the size of each transfer can be limited by the HBA, while for file descriptor I/O the kernel is able to split and merge I/O in smaller pieces as needed. Applying the HBA limits to file descriptor I/O results in more system calls and suboptimal performance, so this patch splits the max_transfer limit in two: max_transfer remains valid and is used in general, while max_hw_transfer is limited to the maximum hardware size. max_hw_transfer can then be included by the scsi-generic driver in the block limits page, to ensure that the stricter hardware limit is used. Signed-off-by: Paolo Bonzini --- block/block-backend.c | 12 block/file-posix.c | 2 +- block/io.c | 1 + hw/scsi/scsi-generic.c | 2 +- include/block/block_int.h | 7 +++ include/sysemu/block-backend.h | 1 + 6 files changed, 23 insertions(+), 2 deletions(-) diff --git a/block/block-backend.c b/block/block-backend.c index 15f1ea4288..2ea1412a54 100644 --- a/block/block-backend.c +++ b/block/block-backend.c @@ -1953,6 +1953,18 @@ uint32_t blk_get_request_alignment(BlockBackend *blk) return bs ? bs->bl.request_alignment : BDRV_SECTOR_SIZE; } +/* Returns the maximum hardware transfer length, in bytes; guaranteed nonzero */ +uint64_t blk_get_max_hw_transfer(BlockBackend *blk) +{ +BlockDriverState *bs = blk_bs(blk); +uint64_t max = INT_MAX; + +if (bs) { +max = MIN_NON_ZERO(bs->bl.max_hw_transfer, bs->bl.max_transfer); +} +return max; +} + /* Returns the maximum transfer length, in bytes; guaranteed nonzero */ uint32_t blk_get_max_transfer(BlockBackend *blk) { diff --git a/block/file-posix.c b/block/file-posix.c index e3241a0dd3..f55f92d0f5 100644 --- a/block/file-posix.c +++ b/block/file-posix.c @@ -1234,7 +1234,7 @@ static void raw_refresh_limits(BlockDriverState *bs, Error **errp) int ret = sg_get_max_transfer_length(s->fd); if (ret > 0 && ret <= BDRV_REQUEST_MAX_BYTES) { -bs->bl.max_transfer = pow2floor(ret); +bs->bl.max_hw_transfer = pow2floor(ret); } ret = sg_get_max_segments(s->fd); diff --git a/block/io.c b/block/io.c index 323854d063..089b99bb0c 100644 --- a/block/io.c +++ b/block/io.c @@ -127,6 +127,7 @@ static void bdrv_merge_limits(BlockLimits *dst, const BlockLimits *src) { dst->opt_transfer = MAX(dst->opt_transfer, src->opt_transfer); dst->max_transfer = MIN_NON_ZERO(dst->max_transfer, src->max_transfer); +dst->max_hw_transfer = MIN_NON_ZERO(dst->max_hw_transfer, src->max_hw_transfer); dst->opt_mem_alignment = MAX(dst->opt_mem_alignment, src->opt_mem_alignment); dst->min_mem_alignment = MAX(dst->min_mem_alignment, diff --git a/hw/scsi/scsi-generic.c b/hw/scsi/scsi-generic.c index 82e1e2ee79..3762dce749 100644 --- a/hw/scsi/scsi-generic.c +++ b/hw/scsi/scsi-generic.c @@ -179,7 +179,7 @@ static void scsi_handle_inquiry_reply(SCSIGenericReq *r, SCSIDevice *s) (r->req.cmd.buf[1] & 0x01)) { page = r->req.cmd.buf[2]; if (page == 0xb0) { -uint32_t max_transfer = blk_get_max_transfer(s->conf.blk); +uint64_t max_transfer = blk_get_max_hw_transfer(s->conf.blk); uint32_t max_iov = blk_get_max_iov(s->conf.blk); assert(max_transfer); diff --git a/include/block/block_int.h b/include/block/block_int.h index 057d88b1fc..f1a54db0f8 100644 --- a/include/block/block_int.h +++ b/include/block/block_int.h @@ -695,6 +695,13 @@ typedef struct BlockLimits { * clamped down. */ uint32_t max_transfer; +/* Maximal hardware transfer length in bytes. Applies whenever + * transfers to the device bypass the kernel I/O scheduler, for + * example with SG_IO. If larger than max_transfer or if zero, + * blk_get_max_hw_transfer will fall back to max_transfer. + */ +uint64_t max_hw_transfer; + /* memory alignment, in bytes so that no bounce buffer is needed */ size_t min_mem_alignment; diff --git a/include/sysemu/block-backend.h b/include/sysemu/block-backend.h index 5423e3d9c6..9ac5f7bbd3 100644 --- a/include/sysemu/block-backend.h +++ b/include/sysemu/block-backend.h @@ -208,6 +208,7 @@ void blk_eject(BlockBackend *blk, bool eject_flag); int blk_get_flags(BlockBackend *blk); uint32_t blk_get_request_alignment(BlockBackend *blk); uint32_t blk_get_max_transfer(BlockBackend *blk); +uint64_t blk_get_max_hw_transfer(BlockBackend *blk); int blk_get_max_iov(BlockBackend *blk); void blk_set_guest_block_size(BlockBackend *blk, int align); void *blk_try_blockalign(BlockBackend *blk, size_t size); -- 2.31.1