On Tue, Jun 25, 2019 at 06:06:02PM +0200, Max Reitz wrote: > On 06.05.19 14:23, Stefano Garzarella wrote: > > This patch adds the support of preallocation (off/full) for the RBD > > block driver. > > If available, we use rbd_writesame() to quickly fill the image when > > full preallocation is required. > > > > Signed-off-by: Stefano Garzarella <sgarz...@redhat.com> > > --- > > v2: > > - Use 4 KiB buffer for rbd_writesame() [Jason] > > - Use "rbd_concurrent_management_ops" and "stripe_unit" to limit > > concurrent ops to the backing cluster [Jason] > > --- > > block/rbd.c | 188 +++++++++++++++++++++++++++++++++++++++---- > > qapi/block-core.json | 4 +- > > 2 files changed, 175 insertions(+), 17 deletions(-) > > This patch conflicts a bit with the rbd file growth patch, so that would > need to be resolved in a v3.
Sure, I'll rebase this patch in a v3! > > But still, that doesn’t prevent me from reviewing it as it is: > > > diff --git a/block/rbd.c b/block/rbd.c > > index 0c549c9935..bc54395e1c 100644 > > --- a/block/rbd.c > > +++ b/block/rbd.c > > [...] > > > @@ -331,6 +333,147 @@ static void qemu_rbd_memset(RADOSCB *rcb, int64_t > > offs) > > [...] > > > +static int qemu_rbd_do_truncate(rados_t cluster, rbd_image_t image, > > + int64_t offset, PreallocMode prealloc, > > + Error **errp) > > +{ > > [...] > > > +#ifdef LIBRBD_SUPPORTS_WRITESAME > > + uint64_t stripe_unit, writesame_max_size; > > + int max_concurrent_ops; > > + > > + max_concurrent_ops = qemu_rbd_get_max_concurrent_ops(cluster); > > + ret = rbd_get_stripe_unit(image, &stripe_unit); > > + if (ret < 0) { > > + error_setg_errno(errp, -ret, "Failed to get stripe unit"); > > + goto out; > > + } > > + > > + /* > > + * We limit the rbd_writesame() size to avoid to spawn more then > > s/then/than/ Ooo, I'll fix it! > > > + * 'rbd_concurrent_management_ops' concurrent operations. > > + */ > > + writesame_max_size = MIN(stripe_unit * max_concurrent_ops, > > INT_MAX); > > +#endif /* LIBRBD_SUPPORTS_WRITESAME */ > > + > > + buf = g_malloc(buf_size); > > + /* > > + * Some versions of rbd_writesame() discards writes of buffers with > > + * all zeroes. In order to avoid this behaviour, we set the first > > byte > > + * to one. > > + */ > > + buf[0] = 1; > > But I guess you’ll need to rewrite it as zero later, or the > “bdrv_rbd.bdrv_has_zero_init = bdrv_has_zero_init_1” is no longer quite > true. Yes, and I should use g_malloc0. I'll check if the next rewrites is too slow, in this case I'll use rbd_writesame() only for ceph version where zeroed buffer is supported. > > [...] > > > @@ -449,6 +603,16 @@ static int coroutine_fn qemu_rbd_co_create_opts(const > > char *filename, > > BLOCK_OPT_CLUSTER_SIZE, > > 0); > > rbd_opts->has_cluster_size = (rbd_opts->cluster_size != 0); > > > > + prealloc = qemu_opt_get_del(opts, BLOCK_OPT_PREALLOC); > > + rbd_opts->preallocation = qapi_enum_parse(&PreallocMode_lookup, > > prealloc, > > + PREALLOC_MODE_OFF, > > &local_err); > > You also need to set rbd_opts->has_preallocation to true. I'll fix. > > > + g_free(prealloc); > > + if (local_err) { > > + ret = -EINVAL; > > + error_propagate(errp, local_err); > > + goto exit; > > + } > > + > > options = qdict_new(); > > qemu_rbd_parse_filename(filename, options, &local_err); > > if (local_err) { > > [...] > > > diff --git a/qapi/block-core.json b/qapi/block-core.json > > index 7ccbfff9d0..db25a4065b 100644 > > --- a/qapi/block-core.json > > +++ b/qapi/block-core.json > > @@ -4277,13 +4277,15 @@ > > # point to a snapshot. > > # @size Size of the virtual disk in bytes > > # @cluster-size RBD object size > > +# @preallocation Preallocation mode (allowed values: off, full) > > There should be a "(Since: 4.1)" note here. I'll add the note! Thanks for the review, Stefano