On 09/08/2017 09:04 AM, Eric Blake wrote: >>> void bdrv_dirty_bitmap_truncate(BlockDriverState *bs) >>> { >>> BdrvDirtyBitmap *bitmap; >>> - uint64_t size = bdrv_nb_sectors(bs); >>> + int64_t size = bdrv_getlength(bs); >>> >>> + assert(size >= 0); >> >> How can you assert that there will never be an error? Even if it's >> correct (I don't know whether you can have dirty bitmaps on devices that >> don't use the cached value), this needs at least a comment. > > The old code wasn't checking for errors; if an error occurs, we have no > way to report it. So I indeed need to audit whether all callers have a > cached length at this point in time (it can't fail), or else change > bdrv_dirty_bitmap_truncate() to be able to fail (pass failure along) and > update all callers. This may indeed be reason for a respin, depending > on what I find.
Verdict - it can indeed fail; bdrv_truncate() was blindly calling the dirty bitmap resize even after a failed refresh_total_sectors(), which could then resize the dirty bitmap to -1. At least bdrv_truncate() itself still failed, but it's cleaner to fail up front rather than get internal state even more botched in the meantime, so fixing that will be a separate patch in v7. Sadly, the failure is probably more theoretical, and I did not quickly see an easy way to write an iotests to expose it (which has been the case with a lot of our recent bdrv_nb_sectors/bdrv_getlength failure cleanup patches). -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org
signature.asc
Description: OpenPGP digital signature