rbd: increase dynamically the image size

Kevin Wolf Mon, 29 Apr 2019 07:30:51 -0700

Am 29.04.2019 um 16:04 hat Stefano Garzarella geschrieben:
> On Mon, Apr 29, 2019 at 12:25:10PM +0200, Kevin Wolf wrote:
> > Am 11.04.2019 um 12:50 hat Stefano Garzarella geschrieben:
> > > RBD APIs don't allow us to write more than the size set with rbd_create()
> > > or rbd_resize().
> > > In order to support growing images (eg. qcow2), we resize the image
> > > before RW operations that exceed the current size.
> > > 
> > > Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1171007
> > > Signed-off-by: Stefano Garzarella <sgarz...@redhat.com>
> > > ---
> > >  block/rbd.c | 25 +++++++++++++++++++++++++
> > >  1 file changed, 25 insertions(+)
> > > 
> > > diff --git a/block/rbd.c b/block/rbd.c
> > > index 0c549c9935..228658e20a 100644
> > > --- a/block/rbd.c
> > > +++ b/block/rbd.c
> > > @@ -102,6 +102,7 @@ typedef struct BDRVRBDState {
> > >      rbd_image_t image;
> > >      char *image_name;
> > >      char *snap;
> > > +    uint64_t image_size;
> > >  } BDRVRBDState;
> > 
> > Can't we use bs->total_sectors instead of adding a new image_size field?
> 
> I'm not sure we can use bs->total_sectors. IIUC, for example, it doesn't
> take care of bytes used by QCOW2 metadata.


bs->total_sectors for the rbd BLockDriverState is the image file size,
not the virtual disk size.

The only reason not to use it would be if we need byte granularity
rather than 512 byte granularity. But I don't think it's a problem to
round up offsets to the next 512 bytes (BDRV_SECTOR_SIZE) boundary.

> > >  static int qemu_rbd_connect(rados_t *cluster, rados_ioctx_t *io_ctx,
> > > @@ -777,6 +778,14 @@ static int qemu_rbd_open(BlockDriverState *bs, QDict 
> > > *options, int flags,
> > >          goto failed_open;
> > >      }
> > >  
> > > +    r = rbd_get_size(s->image, &s->image_size);
> > > +    if (r < 0) {
> > > +        error_setg_errno(errp, -r, "error reading image size from %s",
> > > +                         s->image_name);
> > > +        rbd_close(s->image);
> > > +        goto failed_open;
> > > +    }
> > > +
> > >      /* If we are using an rbd snapshot, we must be r/o, otherwise
> > >       * leave as-is */
> > >      if (s->snap != NULL) {
> > > @@ -921,6 +930,20 @@ static BlockAIOCB *rbd_start_aio(BlockDriverState 
> > > *bs,
> > >          rcb->buf = acb->bounce;
> > >      }
> > >  
> > > +    /*
> > > +     * RBD APIs don't allow us to write more than actual size, so in 
> > > order
> > > +     * to support growing images, we resize the image before RW 
> > > operations
> > > +     * that exceed the current size.
> > > +     */
> > > +    if (s->image_size < off + size) {
> > > +        r = rbd_resize(s->image, off + size);
> > > +        if (r < 0) {
> > > +            goto failed;
> > > +        }
> > > +
> > > +        s->image_size = off + size;
> > > +    }
> > 
> > This doesn't check the request type, so it's actually not limited to RW
> > operations, but even reads will try to resize the image. This is at
> > least surprising. For regular files, file-posix extends the file for
> > write requests, but for reads it returns a zeroed buffer without
> > actually changing the file size.
> 
> Yes, I'll change the behaviour in the v2.
> 
> I did some tries (i.e. using qemu-io and reading more than bytes used) and
> the RBD driver didn't receive 'read' requests that exceed the current
> size, maybe because it is managed in the QCOW2 protocol, but of course
> I'll handle also in the RBD driver.

I don't remember the exact scenario where it happened, but I know I
implemented it for file-posix to fix a bug. Maybe it actually doesn't
happen any more because we have made other changes in the meantime, but
I'm not sure.

Kevin

Re: [Qemu-devel] [PATCH RFC 1/1] block/rbd: increase dynamically the image size

Reply via email to