Re: [Qemu-block] [PATCH 5/6] raw-posix: Implement .bdrv_co_preadv/pwritev

2016-06-14 Thread Stefan Hajnoczi
On Wed, Jun 08, 2016 at 04:10:10PM +0200, Kevin Wolf wrote:
> The raw-posix block driver actually supports byte-aligned requests now
> on non-O_DIRECT images, like it already (and previously incorrectly)
> claimed in bs->request_alignment.
> 
> For some block drivers this means that a RMW cycle can be avoided when
> they write sub-sector metadata e.g. for cluster allocation.
> 
> Signed-off-by: Kevin Wolf 
> ---
>  block/linux-aio.c |  6 ++
>  block/raw-aio.h   |  2 +-
>  block/raw-posix.c | 42 ++
>  3 files changed, 25 insertions(+), 25 deletions(-)

Reviewed-by: Stefan Hajnoczi 




Re: [Qemu-block] [PATCH 5/6] raw-posix: Implement .bdrv_co_preadv/pwritev

2016-06-08 Thread Eric Blake
On 06/08/2016 08:10 AM, Kevin Wolf wrote:
> The raw-posix block driver actually supports byte-aligned requests now
> on non-O_DIRECT images, like it already (and previously incorrectly)
> claimed in bs->request_alignment.
> 
> For some block drivers this means that a RMW cycle can be avoided when
> they write sub-sector metadata e.g. for cluster allocation.

[well, there's still probably a RMW going on, but it's being done by the
kernel, rather than qemu - and choice of caching may let the kernel
optimize things... not worth cluttering the commit message with this,
though]
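
To illustrate the point about the kernel doing the work: a minimal sketch, assuming a plain POSIX system and a scratch file under /tmp. The helper name unaligned_write_roundtrip() is hypothetical and is not part of QEMU. It issues a sub-sector pwrite() on a file opened without O_DIRECT and reads the bytes back; whatever read-modify-write the backing device needs is left to the page cache.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper (not from QEMU): write `len` bytes at a byte offset
 * that need not be sector aligned, on a file opened WITHOUT O_DIRECT, then
 * read them back. Returns 0 on success, -1 on any failure. */
static int unaligned_write_roundtrip(off_t offset, const char *data, size_t len)
{
    char path[] = "/tmp/subsector-XXXXXX";
    char buf[64];
    int ret = -1;
    int fd = mkstemp(path);          /* buffered I/O: no O_DIRECT here */

    if (fd < 0 || len > sizeof(buf)) {
        goto out;
    }
    /* Make the file one 512-byte sector long so the write lands inside it. */
    if (ftruncate(fd, 512) < 0) {
        goto out;
    }
    /* The page cache absorbs the sub-sector write; user space does not have
     * to read the whole sector, patch it, and write it back. */
    if (pwrite(fd, data, len, offset) != (ssize_t)len) {
        goto out;
    }
    if (pread(fd, buf, len, offset) != (ssize_t)len) {
        goto out;
    }
    ret = memcmp(buf, data, len) == 0 ? 0 : -1;
out:
    if (fd >= 0) {
        close(fd);
        unlink(path);
    }
    return ret;
}
```

With O_DIRECT the same pwrite() would typically be rejected or need an aligned bounce buffer, which is why the byte-aligned claim only holds for non-O_DIRECT images.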

> 
> Signed-off-by: Kevin Wolf 
> ---

> +++ b/block/linux-aio.c
> @@ -272,14 +272,12 @@ static int laio_do_submit(int fd, struct qemu_laiocb *laiocb, off_t offset,
>  }
>  
>  int laio_submit_co(BlockDriverState *bs, LinuxAioState *s, int fd,
> -int64_t sector_num, QEMUIOVector *qiov, int nb_sectors, int type)
> +   uint64_t offset, QEMUIOVector *qiov, int type)
>  {
> -off_t offset = sector_num * 512;
>  int ret;
> -
>  struct qemu_laiocb laiocb = {
>  .co = qemu_coroutine_self(),
> -.nbytes = nb_sectors * 512,
> +.nbytes = qiov->size,

So for this interface, we require non-NULL qiov and no duplicated
length; I guess it isn't used for write_zeroes.  We may still want to do
some consistency sweep to decide what level of NULL-ness we want for
representing write_zeroes, rather than ad hoc decisions at each layer of
the call stack, but that's a task for another day.
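
As a rough illustration of the "no duplicated length" idea, here is a sketch using plain POSIX struct iovec rather than QEMUIOVector. Both helper names are hypothetical. The first derives the request length from the vector itself, which is the role qiov->size plays above; the second is the consistency check a layer would need if it also accepted a separate byte count.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/uio.h>

/* Hypothetical stand-in for qiov->size: the request length is the sum of
 * the element lengths, so no separate length argument is needed. */
static size_t iov_total_size(const struct iovec *iov, int niov)
{
    size_t total = 0;

    for (int i = 0; i < niov; i++) {
        total += iov[i].iov_len;
    }
    return total;
}

/* Hypothetical check for interfaces that take both a vector and a byte
 * count: returns nonzero only when the two agree. */
static int length_matches_vector(const struct iovec *iov, int niov,
                                 size_t bytes)
{
    return iov_total_size(iov, niov) == bytes;
}
```

An interface that takes only the vector cannot get the two values out of sync, but it also cannot express a request with a length and no buffer, which is the write_zeroes question raised above.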

> @@ -1344,26 +1344,27 @@ static int coroutine_fn raw_co_rw(BlockDriverState *bs, int64_t sector_num,
>  type |= QEMU_AIO_MISALIGNED;
>  #ifdef CONFIG_LINUX_AIO
>  } else if (s->use_aio) {
> -return laio_submit_co(bs, s->aio_ctx, s->fd, sector_num, qiov,
> -   nb_sectors, type);
> +assert(qiov->size == bytes);

Worth hoisting the assertion outside of the #ifdef?...

> +return laio_submit_co(bs, s->aio_ctx, s->fd, offset, qiov, type);
>  #endif
>  }
>  }
>  
> -return paio_submit_co(bs, s->fd, sector_num * BDRV_SECTOR_SIZE, qiov,
> -  nb_sectors * BDRV_SECTOR_SIZE, type);
> +return paio_submit_co(bs, s->fd, offset, qiov, bytes, type);

...then again, paio_submit_co() also does the assert - and this is more
evidence of our inconsistency on whether we duplicate a separate bytes
parameter or reuse qiov->size.

>  
> -static int coroutine_fn raw_co_readv(BlockDriverState *bs, int64_t sector_num,
> - int nb_sectors, QEMUIOVector *qiov)
> +static int coroutine_fn raw_co_preadv(BlockDriverState *bs, uint64_t offset,
> +  uint64_t bytes, QEMUIOVector *qiov,
> +  int flags)
>  {
> -return raw_co_rw(bs, sector_num, nb_sectors, qiov, QEMU_AIO_READ);
> +return raw_co_prw(bs, offset, bytes, qiov, QEMU_AIO_READ);

We ignore flags, but that's not a change in semantics.  (Maybe someday
we'll need .supported_read_flags.)

>  }
>  
> -static int coroutine_fn raw_co_writev(BlockDriverState *bs, int64_t sector_num,
> -  int nb_sectors, QEMUIOVector *qiov)
> +static int coroutine_fn raw_co_pwritev(BlockDriverState *bs, uint64_t offset,
> +   uint64_t bytes, QEMUIOVector *qiov,
> +   int flags)
>  {
> -return raw_co_rw(bs, sector_num, nb_sectors, qiov, QEMU_AIO_WRITE);
> +return raw_co_prw(bs, offset, bytes, qiov, QEMU_AIO_WRITE);

And here, we could assert(!flags) (since we intentionally don't set
.supported_write_flags) - but I won't insist.

None of my comments require a code change, other than a possible added
assertion, so:

Reviewed-by: Eric Blake 

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


