Stefano Garzarella <sgarz...@redhat.com> writes:

> On Wed, May 08, 2019 at 01:44:27PM +0200, Markus Armbruster wrote:
>> Stefano Garzarella <sgarz...@redhat.com> writes:
>> 
>> > On Tue, May 07, 2019 at 08:34:51AM +0200, Markus Armbruster wrote:
>> >> Cc: Peter for a libvirt perspective.
>> >> 
>> >> Stefano Garzarella <sgarz...@redhat.com> writes:
>> >> 
>> >> > This patch adds the support of preallocation (off/full) for the RBD
>> >> > block driver.
>> >> > If available, we use rbd_writesame() to quickly fill the image when
>> >> > full preallocation is required.
>> >> >
>> >> > Signed-off-by: Stefano Garzarella <sgarz...@redhat.com>
>> >> > ---
>> >> >  block/rbd.c          | 149 ++++++++++++++++++++++++++++++++++++++-----
>> >> >  qapi/block-core.json |   4 +-
>> >> >  2 files changed, 136 insertions(+), 17 deletions(-)
>> >> >
>> >> > diff --git a/block/rbd.c b/block/rbd.c
>> >> > index 0c549c9935..29dd1bb040 100644
>> >> > --- a/block/rbd.c
>> >> > +++ b/block/rbd.c
>> >> > @@ -13,6 +13,7 @@
>> >> >  
>> >> >  #include "qemu/osdep.h"
>> >> >  
>> >> > +#include "qemu/units.h"
>> >> >  #include <rbd/librbd.h>
>> >> >  #include "qapi/error.h"
>> >> >  #include "qemu/error-report.h"
>> >> > @@ -331,6 +332,110 @@ static void qemu_rbd_memset(RADOSCB *rcb, int64_t 
>> >> > offs)
>> >> >      }
>> >> >  }
>> >> >  
>> >> > +static int qemu_rbd_do_truncate(rbd_image_t image, int64_t offset,
>> >> > +                                PreallocMode prealloc, Error **errp)
>> >> > +{
>> >> > +    uint64_t current_length;
>> >> > +    char *buf = NULL;
>> >> > +    int ret;
>> >> > +
>> >> > +    ret = rbd_get_size(image, &current_length);
>> >> > +    if (ret < 0) {
>> >> > +        error_setg_errno(errp, -ret, "Failed to get file length");
>> >> > +        goto out;
>> >> > +    }
>> >> > +
>> >> > +    if (current_length > offset && prealloc != PREALLOC_MODE_OFF) {
>> >> > +        error_setg(errp, "Cannot use preallocation for shrinking 
>> >> > files");
>> >> > +        ret = -ENOTSUP;
>> >> > +        goto out;
>> >> > +    }
>> >> > +
>> >> > +    switch (prealloc) {
>> >> > +    case PREALLOC_MODE_FULL: {
>> >> [...]
>> >> > +    case PREALLOC_MODE_OFF:
>> >> [...]
>> >> > +    default:
>> >> > +        error_setg(errp, "Unsupported preallocation mode: %s",
>> >> > +                   PreallocMode_str(prealloc));
>> >> > +        ret = -ENOTSUP;
>> >> > +        goto out;
>> >> > +    }
>> >> 
>> >> Other block drivers also accept only some values of PreallocMode.  Okay.
>> >> 
>> >> I wonder whether management applications need to know which values are
>> >> supported.
>> >
>> > Good point!
>> 
>> We can continue to assume they don't until somebody tells us otherwise.
>> 
>> >> Let me review support in drivers:
>> >> 
>> >> * file (file-win32.c)
>> >> * iscsi
>> >> * nfs
>> >> * qed
>> >> * ssh
>> >> 
>> >>   - Reject all but PREALLOC_MODE_OFF
>> >> 
>> >> * copy-on-read
>> >> * luks (crypto.c)
>> >> * raw
>> >> 
>> >>   - Pass through only
>> >> 
>> >> * file host_cdrom host_device (file-posix.c)
>> >> 
>> >>   - Reject all but PREALLOC_MODE_OFF when shrinking and for non-regular
>> >>     files
>> >>   - Reject PREALLOC_MODE_FALLOC unless CONFIG_POSIX_FALLOCATE
>> >>   - Reject PREALLOC_MODE_METADATA
>> >> 
>> >> * gluster
>> >> 
>> >>   - Reject all but PREALLOC_MODE_OFF when shrinking
>> >>   - Reject PREALLOC_MODE_FALLOC unless CONFIG_GLUSTERFS_FALLOCATE
>> >>   - Reject PREALLOC_MODE_FULL unless CONFIG_GLUSTERFS_ZEROFILL
>> >>   - Reject PREALLOC_MODE_METADATA
>> >> 
>> >> * qcow2
>> >> 
>> >>   - Reject all but PREALLOC_MODE_OFF when shrinking and with a backing
>> >>     file
>> >>   
>> >> * rbd with this patch
>> >> 
>> >>   - Reject all but PREALLOC_MODE_OFF when shrinking
>> >>   - Reject PREALLOC_MODE_METADATA and PREALLOC_MODE_FALLOC
>> >> 
>> >> * sheepdog
>> >> 
>> >>   - Reject PREALLOC_MODE_METADATA and PREALLOC_MODE_FALLOC
>> >>   - Doesn't support shrinking
>> >> 
>> >> * vdi
>> >> 
>> >>   - Reject PREALLOC_MODE_FALLOC and PREALLOC_MODE_FULL
>> >>   - Doesn't support shrinking
>> >> 
>> >> * blkdebug
>> >> * blklogwrites
>> >> * blkverify
>> >> * bochs
>> >> * cloop
>> >> * dmg
>> >> * ftp
>> >> * ftps
>> >> * http
>> >> * https
>> >> * luks
>> >> * nbd
>> >> * null-aio
>> >> * null-co
>> >> * nvme
>> >> * parallels
>> >> * qcow
>> >> * quorum
>> >> * replication
>> >> * throttle
>> >> * vhdx
>> >> * vmdk
>> >> * vpc
>> >> * vvfat
>> >> * vxhs
>> >> 
>> >>   - These appear not to use PreallocMode: they don't implement
>> >>     .bdrv_co_truncate(), and either don't implement .bdrv_co_create() or
>> >>     implement it without a prealloc parameter.
>> >> 
>> >> Looks good to me.
>> >>
>> >
>> > Thanks for the analysis!
>> >
>> >> > +
>> >> > +    ret = 0;
>> >> > +
>> >> > +out:
>> >> > +    g_free(buf);
>> >> > +    return ret;
>> >> > +}
>> >> > +
>> >> >  static QemuOptsList runtime_opts = {
>> >> >      .name = "rbd",
>> >> >      .head = QTAILQ_HEAD_INITIALIZER(runtime_opts.head),
>> >> [...]
>> >> > diff --git a/qapi/block-core.json b/qapi/block-core.json
>> >> > index 7ccbfff9d0..db25a4065b 100644
>> >> > --- a/qapi/block-core.json
>> >> > +++ b/qapi/block-core.json
>> >> > @@ -4277,13 +4277,15 @@
>> >> >  #                   point to a snapshot.
>> >> >  # @size             Size of the virtual disk in bytes
>> >> >  # @cluster-size     RBD object size
>> >> > +# @preallocation    Preallocation mode (allowed values: off, full)
>> >> >  #
>> >> >  # Since: 2.12
>> >> >  ##
>> >> >  { 'struct': 'BlockdevCreateOptionsRbd',
>> >> >    'data': { 'location':         'BlockdevOptionsRbd',
>> >> >              'size':             'size',
>> >> > -            '*cluster-size' :   'size' } }
>> >> > +            '*cluster-size' :   'size',
>> >> > +            '*preallocation':   'PreallocMode' } }
>> >> >  
>> >> >  ##
>> >> >  # @BlockdevVmdkSubformat:
>> >> 
>> >> The non-support of values 'metadata' and 'falloc' is not visible in
>> >> introspection, only in documentation.  No reason to block this patch, as
>> >> the other block drivers have the same introspection weakness (only
>> >> sheepdog and vdi bother to document).
>> >> 
>> >> Should we address the introspection weakness?  Only if there's a use for
>> >> the information, I think.
>> >
>> > If the management applications will use that information (or maybe also
>> > our help pages), could be useful to have an array of 'PreallocMode'
>> > supported per-driver.
>> 
>> Ideally, query-qmp-schema would show only the supported values.
>> 
>> Not hard to do, just tedious: we'd get a number of sub-enums in addition
>> to the full one, and we'd have to map from sub-enum to the full one.
>> 
>> QAPI language support for sub-enums would remove most of the tedium.
>> Not worthwhile unless the need for sub-enums is actually common.
>
> I should study better the QMP and QAPI to understand how to implement
> the sub-enums.

Sub-enums of

    { 'enum': 'PreallocMode',
      'data': [ 'off', 'metadata', 'falloc', 'full' ] }

done the obvious way:

    { 'enum': 'PreallocModeOff',
      'data': [ 'off' ] }
    { 'enum': 'PreallocModeOffPosix',
      'data': [ 'off', 'metadata',
                 { 'name': 'falloc', 'if': 'defined(CONFIG_POSIX_FALLOCATE)' },
                 'full' ] }

and so forth.

This generates a bunch of different C enum types in addition to
PreallocMode: PreallocModeOff, PreallocModePosix, ...

Common C code continues to use just PreallocMode.  The QMP command
handlers using sub-enums will have to map between the sub-enums and
PreallocMode.

Tedious.

With QAPI language support for sub-enums, we could eliminate the
additional C enums.

> If you agree, I'll put it as a background task, until somebody from
> management applications tell us his interest.

Only act if there's a compelling use case.

>> >> Should we improve documentation for the other block drivers?
>> >> 
>> >
>> > Yes, e.g. for Gluster it is not updated.
>> > If you agree, I can check and update the documentation of all drivers 
>> > following
>> > your analysis.
>> 
>> Yes, please!
>
> Okay, I'll send a patch to update it.
>
> Thanks,
> Stefano

Reply via email to