On Fri, Jul 5, 2019 at 6:43 AM Stefano Garzarella <sgarz...@redhat.com> wrote:
>
> On Fri, Jul 05, 2019 at 11:58:43AM +0200, Max Reitz wrote:
> > On 05.07.19 11:32, Stefano Garzarella wrote:
> > > This patch allows 'qemu-img info' to show the 'disk size' for
> > > the RBD images that have the fast-diff feature enabled.
> > >
> > > If this feature is enabled, we use the rbd_diff_iterate2() API
> > > to calculate the allocated size for the image.
> > >
> > > Signed-off-by: Stefano Garzarella <sgarz...@redhat.com>
> > > ---
> > > v3:
> > >   - return -ENOTSUP instead of -1 when fast-diff is not available
> > >     [John, Jason]
> > > v2:
> > >   - calculate the actual usage only if the fast-diff feature is
> > >     enabled [Jason]
> > > ---
> > >  block/rbd.c | 54 +++++++++++++++++++++++++++++++++++++++++++++++++++++
> > >  1 file changed, 54 insertions(+)
> >
> > Well, the librbd documentation is non-existing as always, but while
> > googling, I at least found that libvirt has exactly the same code.  So I
> > suppose it must be quite correct, then.
> >
>
> While I wrote this code I took a look at libvirt implementation and also
> at the "rbd" tool in the ceph repository: compute_image_disk_usage() in
> src/tools/rbd/action/DiskUsage.cc
>
> > > diff --git a/block/rbd.c b/block/rbd.c
> > > index 59757b3120..b6bed683e5 100644
> > > --- a/block/rbd.c
> > > +++ b/block/rbd.c
> > > @@ -1084,6 +1084,59 @@ static int64_t qemu_rbd_getlength(BlockDriverState *bs)
> > >      return info.size;
> > >  }
> > >
> > > +static int rbd_allocated_size_cb(uint64_t offset, size_t len, int exists,
> > > +                                 void *arg)
> > > +{
> > > +    int64_t *alloc_size = (int64_t *) arg;
> > > +
> > > +    if (exists) {
> > > +        (*alloc_size) += len;
> > > +    }
> > > +
> > > +    return 0;
> > > +}
> > > +
> > > +static int64_t qemu_rbd_get_allocated_file_size(BlockDriverState *bs)
> > > +{
> > > +    BDRVRBDState *s = bs->opaque;
> > > +    uint64_t flags, features;
> > > +    int64_t alloc_size = 0;
> > > +    int r;
> > > +
> > > +    r = rbd_get_flags(s->image, &flags);
> > > +    if (r < 0) {
> > > +        return r;
> > > +    }
> > > +
> > > +    r = rbd_get_features(s->image, &features);
> > > +    if (r < 0) {
> > > +        return r;
> > > +    }
> > > +
> > > +    /*
> > > +     * We use rbd_diff_iterate2() only if the RBD image have fast-diff
> > > +     * feature enabled. If it is disabled, rbd_diff_iterate2() could be
> > > +     * very slow on a big image.
> > > +     */
> > > +    if (!(features & RBD_FEATURE_FAST_DIFF) ||
> > > +        (flags & RBD_FLAG_FAST_DIFF_INVALID)) {
> > > +        return -ENOTSUP;
> > > +    }
> > > +
> > > +    /*
> > > +     * rbd_diff_iterate2(), if the source snapshot name is NULL, invokes
> > > +     * the callback on all allocated regions of the image.
> > > +     */
> > > +    r = rbd_diff_iterate2(s->image, NULL, 0,
> > > +                          bs->total_sectors * BDRV_SECTOR_SIZE, 0, 1,
> > > +                          &rbd_allocated_size_cb, &alloc_size);
> >
> > But I have a question.  This is basically block_status, right?  So it
> > gives us information on which areas are allocated and which are not.
> > The result thus gives us a lower bound on the allocation size, but is it
> > really exactly the allocation size?
> >
> > There are two things I’m concerned about:
> >
> > 1. What about metadata?
>
> Good question, I don't think it includes the size used by metadata and I
> don't know if there is a way to know it. I'll check better.

It does not include the size of metadata; the rbd_diff_iterate2()
function literally just looks for touched data blocks within the RBD
image.

> >
> > 2. If you have multiple snapshots, this will only report the overall
> > allocation information, right?  So say there is something like this:
> >
> > (“A” means an allocated MB, “-” is an unallocated MB)
> >
> > Snapshot 1: AAAA---
> > Snapshot 2: --AAAAA
> > Snapshot 3: -AAAA--
> >
> > I think the allocated data size is the number of As in total (13 MB).
> > But I suppose this API will just return 7 MB, because it looks on
> > everything an it sees the whole image range (7 MB) to be allocated.  It
> > doesn’t report in how many snapshots some region is allocated.

It should return 13 dirty data blocks (multiplied by the size of the
data block), since when you don't provide a "from snapshot" name, it
iterates from the first snapshot to the HEAD revision.

> Looking at the documentation of rbd_diff_iterate2() [1] they says:
>
>  * If the source snapshot name is NULL, we
>  * interpret that as the beginning of time and return all allocated
>  * regions of the image.
>
> But I don't know the answer of your question (maybe Jason can help
> here).
> I should check better the implementation to understand if I can cycle
> on all snapshot to get the exact allocated data size.
>
> https://github.com/ceph/ceph/blob/master/src/include/rbd/librbd.h#L925
>
> I'll back when I have more details on the rbd implementation to better
> answer your questions.
>
> Thanks,
> Stefano

--
Jason