On Thu, Jun 03, 2021 at 03:37:18PM +0200, Paolo Bonzini wrote:
> For block host devices, I/O can happen through either the kernel file
> descriptor I/O system calls (preadv/pwritev, io_submit, io_uring)
> or the SCSI passthrough ioctl SG_IO.
>
> In the latter case, the size of each transfer can be limited by the
> HBA, while for file descriptor I/O the kernel is able to split and
> merge I/O in smaller pieces as needed. Applying the HBA limits to
> file descriptor I/O results in more system calls and suboptimal
> performance, so this patch splits the max_transfer limit in two:
> max_transfer remains valid and is used in general, while max_hw_transfer
> is limited to the maximum hardware size. max_hw_transfer can then be
> included by the scsi-generic driver in the block limits page, to ensure
> that the stricter hardware limit is used.
>
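To make the intended semantics concrete, here is a minimal standalone sketch of how the two limits could be resolved. It is not the patch's code: the struct, the helper names (all suffixed _sketch) and the round-down step are assumptions for illustration. The zero-means-unset convention follows the doc comment in the patch, and the rounding is only one possible answer to the alignment question raised further down in this review.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical stand-in for the BlockLimits fields discussed here. */
struct limits_sketch {
    uint32_t max_transfer;      /* general limit; 0 means "unset" */
    uint64_t max_hw_transfer;   /* hardware limit; 0 means "unset" */
    uint32_t request_alignment; /* assumed nonzero */
};

/* Smaller of a and b, treating 0 as "unset" (modelled on MIN_NON_ZERO). */
static uint64_t min_non_zero_sketch(uint64_t a, uint64_t b)
{
    if (a == 0) {
        return b;
    }
    if (b == 0) {
        return a;
    }
    return a < b ? a : b;
}

/* Limit for file descriptor I/O: only the general limit applies. */
static uint64_t fd_transfer_limit_sketch(const struct limits_sketch *bl)
{
    return bl->max_transfer ? bl->max_transfer : INT_MAX;
}

/*
 * Limit for passthrough I/O such as SG_IO: the stricter of the two
 * limits, falling back to INT_MAX when neither is set, then rounded
 * down to a multiple of request_alignment (an assumption, see above).
 */
static uint64_t hw_transfer_limit_sketch(const struct limits_sketch *bl)
{
    uint64_t max = min_non_zero_sketch(bl->max_hw_transfer, bl->max_transfer);

    assert(bl->request_alignment != 0);
    if (max == 0) {
        max = INT_MAX;
    }
    return max - max % bl->request_alignment;
}
```

With max_transfer = 1 MiB and max_hw_transfer = 128 KiB, file descriptor I/O would keep the 1 MiB limit while SG_IO would be held to 128 KiB; with neither set, the hardware limit would become INT_MAX rounded down to the alignment.
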
> +/* Returns the maximum hardware transfer length, in bytes; guaranteed nonzero */
> +uint64_t blk_get_max_hw_transfer(BlockBackend *blk)
> +{
> +    BlockDriverState *bs = blk_bs(blk);
> +    uint64_t max = INT_MAX;

This is an unaligned value; should we instead round it down to the
request_alignment granularity?

> +
> +    if (bs) {
> +        max = MIN_NON_ZERO(bs->bl.max_hw_transfer, bs->bl.max_transfer);
> +    }
> +    return max;
> +}
> +
>  /* Returns the maximum transfer length, in bytes; guaranteed nonzero */
>  uint32_t blk_get_max_transfer(BlockBackend *blk)
>  {

> +++ b/include/block/block_int.h
> @@ -695,6 +695,13 @@ typedef struct BlockLimits {
>       * clamped down. */
>      uint32_t max_transfer;
>
> +    /* Maximal hardware transfer length in bytes. Applies whenever

Leading /* on its own line, per our style.

> +     * transfers to the device bypass the kernel I/O scheduler, for
> +     * example with SG_IO. If larger than max_transfer or if zero,
> +     * blk_get_max_hw_transfer will fall back to max_transfer.
> +     */

Should we mandate any additional requirements on this value such as
multiple of request_alignment or even power-of-2?

> +    uint64_t max_hw_transfer;
> +
>      /* memory alignment, in bytes so that no bounce buffer is needed */
>      size_t min_mem_alignment;
>

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org