On Mon, Jan 10, 2022 at 12:43 PM Peter Lieven <p...@kamp.de> wrote: > > librbd had a bug until early 2022 that affected all versions of ceph that > supported fast-diff. This bug results in reporting of incorrect offsets > if the offset parameter to rbd_diff_iterate2 is not object aligned. > Work around this bug by rounding down the offset to object boundaries. > > Fixes: https://tracker.ceph.com/issues/53784
I don't think the Fixes tag is appropriate here. Linking librbd ticket is fine but this patch doesn't really fix anything. > Cc: qemu-sta...@nongnu.org > Signed-off-by: Peter Lieven <p...@kamp.de> > --- > block/rbd.c | 17 ++++++++++++++++- > 1 file changed, 16 insertions(+), 1 deletion(-) > > diff --git a/block/rbd.c b/block/rbd.c > index 5e9dc91d81..260cb9f4b4 100644 > --- a/block/rbd.c > +++ b/block/rbd.c > @@ -1333,6 +1333,7 @@ static int coroutine_fn > qemu_rbd_co_block_status(BlockDriverState *bs, > int status, r; > RBDDiffIterateReq req = { .offs = offset }; > uint64_t features, flags; > + int64_t head; > > assert(offset + bytes <= s->image_size); > > @@ -1360,6 +1361,19 @@ static int coroutine_fn > qemu_rbd_co_block_status(BlockDriverState *bs, > return status; > } > > + /* > + * librbd had a bug until early 2022 that affected all versions of ceph > that > + * supported fast-diff. This bug results in reporting of incorrect > offsets > + * if the offset parameter to rbd_diff_iterate2 is not object aligned. > + * Work around this bug by rounding down the offset to object boundaries. > + * > + * See: https://tracker.ceph.com/issues/53784 > + */ > + head = offset & (s->object_size - 1); > + offset -= head; > + req.offs -= head; > + bytes += head; So it looks like the intention is to have more or less a permanent workaround since all librbd versions are affected, right? For that, I think we would need to also reject custom striping patterns and clones. For the above to be reliable, the image has to be standalone and have a default striping pattern (stripe_unit == object_size && stripe_count == 1). Otherwise, behave as if fast-diff is disabled or invalid. > + Nit: I'd replace { .offs = offset } initialization at the top with {} and assign to req.offs here, right before calling rbd_diff_iterate2(). > r = rbd_diff_iterate2(s->image, NULL, offset, bytes, true, true, > qemu_rbd_diff_iterate_cb, &req); > if (r < 0 && r != QEMU_RBD_EXIT_DIFF_ITERATE2) { > @@ -1379,7 +1393,8 @@ static int coroutine_fn > qemu_rbd_co_block_status(BlockDriverState *bs, > status = BDRV_BLOCK_ZERO | BDRV_BLOCK_OFFSET_VALID; > } > > - *pnum = req.bytes; > + assert(req.bytes > head); I'd expand the workaround comment with an explanation of why it's OK to round down the offset -- because rbd_diff_iterate2() is called with whole_object=true. If that wasn't the case, on top of inconsistent results for different offsets within an object, this assert could be triggered. Thanks, Ilya