bdrv_co_block_status_above has several problems with handling short backing files:

1. With want_zero=true, it may return ret with BDRV_BLOCK_ZERO but
   without the BDRV_BLOCK_ALLOCATED flag, even though the short backing
   file that produces these after-EOF zeros is inside the requested
   backing sequence.

2. With want_zero=false, it will just stop inside the requested region
   if there is an unallocated region in the top node and the underlying
   backing file is short.

Fix these things, making the logic about short backing files clearer.

Note that the 154 output changed, because bdrv_block_status_above no
longer merges unallocated zeros with zeros after EOF (which are
actually "allocated" from the point of view of a read from the top of
the backing chain), and is_zero() just doesn't understand that the
whole head or tail is zero. We could update is_zero() to call
bdrv_block_status_above several times, or add a flag to
bdrv_block_status_above saying that we are not interested in the
ALLOCATED flag, so that ranges with different ALLOCATED status may be
merged. In practice, though, it seems better not to care about this
corner case.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsement...@virtuozzo.com>
---
 block/io.c                 | 41 ++++++++++++++++++++++++++++----------
 tests/qemu-iotests/154.out |  4 ++--
 2 files changed, 32 insertions(+), 13 deletions(-)

diff --git a/block/io.c b/block/io.c
index f75777f5ea..4d7fa99bd2 100644
--- a/block/io.c
+++ b/block/io.c
@@ -2434,25 +2434,44 @@ static int coroutine_fn bdrv_co_block_status_above(BlockDriverState *bs,
         ret = bdrv_co_block_status(p, want_zero, offset, bytes, pnum, map,
                                    file);
         if (ret < 0) {
-            break;
+            return ret;
         }
-        if (ret & BDRV_BLOCK_ZERO && ret & BDRV_BLOCK_EOF && !first) {
+        if (*pnum == 0) {
+            if (first) {
+                return ret;
+            }
+
             /*
-             * Reading beyond the end of the file continues to read
-             * zeroes, but we can only widen the result to the
-             * unallocated length we learned from an earlier
-             * iteration.
+             * Reads from bs for selected region will return zeroes, produced
+             * because current level is short. We should consider it as
+             * allocated.
+             *
+             * TODO: Should we report p as file here?
              */
+            assert(ret & BDRV_BLOCK_EOF);
             *pnum = bytes;
+            return BDRV_BLOCK_ZERO | BDRV_BLOCK_ALLOCATED;
         }
-        if (ret & (BDRV_BLOCK_ZERO | BDRV_BLOCK_DATA)) {
-            break;
+        if (ret & BDRV_BLOCK_ALLOCATED) {
+            /* We've found the node and the status, we must return. */
+
+            if (ret & BDRV_BLOCK_ZERO && ret & BDRV_BLOCK_EOF && !first) {
+                /*
+                 * This level also responsible for reads after EOF inside
+                 * unallocated region in previous level.
+                 */
+                *pnum = bytes;
+            }
+
+            return ret;
         }
-        /* [offset, pnum] unallocated on this layer, which could be only
-         * the first part of [offset, bytes]. */
-        bytes = MIN(bytes, *pnum);
+
+        /* Proceed to backing */
+        assert(*pnum <= bytes);
+        bytes = *pnum;
         first = false;
     }
+
     return ret;
 }
 
diff --git a/tests/qemu-iotests/154.out b/tests/qemu-iotests/154.out
index fa3673317f..a203dfcadd 100644
--- a/tests/qemu-iotests/154.out
+++ b/tests/qemu-iotests/154.out
@@ -310,13 +310,13 @@ wrote 512/512 bytes at offset 134217728
 512 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
 2048/2048 bytes allocated at offset 128 MiB
 [{ "start": 0, "length": 134217728, "depth": 1, "zero": true, "data": false},
-{ "start": 134217728, "length": 2048, "depth": 0, "zero": true, "data": false}]
+{ "start": 134217728, "length": 2048, "depth": 0, "zero": false, "data": true, "offset": OFFSET}]
 Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=134219776 backing_file=TEST_DIR/t.IMGFMT.base
 wrote 512/512 bytes at offset 134219264
 512 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
 2048/2048 bytes allocated at offset 128 MiB
 [{ "start": 0, "length": 134217728, "depth": 1, "zero": true, "data": false},
-{ "start": 134217728, "length": 2048, "depth": 0, "zero": true, "data": false}]
+{ "start": 134217728, "length": 2048, "depth": 0, "zero": false, "data": true, "offset": OFFSET}]
 Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=134219776 backing_file=TEST_DIR/t.IMGFMT.base
 wrote 1024/1024 bytes at offset 134218240
 1 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
--
2.21.0
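
For illustration only, here is a standalone toy model of the walk over
the backing chain as it behaves after this patch. It is a sketch under
stated assumptions, not QEMU code: the Layer type, the ST_* flags,
layer_status() and status_above() are hypothetical names invented for
this example, each layer holds data in at most one range, and the
branch that widens an allocated ZERO|EOF result is left out because
these toy layers never report allocated zeroes.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flags for this sketch; they only mimic the BDRV_BLOCK_* names. */
enum {
    ST_DATA      = 1 << 0,
    ST_ZERO      = 1 << 1,
    ST_ALLOCATED = 1 << 2,
    ST_EOF       = 1 << 3,
};

/*
 * Toy layer: 'len' bytes long, with data only in [alloc_start, alloc_end).
 * Assumes 0 <= alloc_start <= alloc_end <= len.
 */
typedef struct Layer {
    int64_t len;
    int64_t alloc_start;
    int64_t alloc_end;
} Layer;

/* Status of a single layer; *pnum is 0 when offset is at or past its EOF. */
static int layer_status(const Layer *l, int64_t offset, int64_t bytes,
                        int64_t *pnum)
{
    int ret = 0;

    if (offset >= l->len) {
        *pnum = 0;
        return ST_EOF;
    }
    if (bytes > l->len - offset) {
        bytes = l->len - offset;            /* clamp to this layer's length */
    }
    if (offset >= l->alloc_start && offset < l->alloc_end) {
        if (bytes > l->alloc_end - offset) {
            bytes = l->alloc_end - offset;  /* stop at end of the data range */
        }
        ret = ST_DATA | ST_ALLOCATED;
    } else if (offset < l->alloc_start) {
        if (bytes > l->alloc_start - offset) {
            bytes = l->alloc_start - offset; /* stop where data begins */
        }
    }
    *pnum = bytes;
    if (offset + bytes == l->len) {
        ret |= ST_EOF;
    }
    return ret;
}

/* Walk chain[0] (top) down to chain[n - 1], following the patched loop. */
static int status_above(const Layer *chain, int n, int64_t offset,
                        int64_t bytes, int64_t *pnum)
{
    int ret = 0;
    int first = 1;

    for (int i = 0; i < n; i++) {
        ret = layer_status(&chain[i], offset, bytes, pnum);
        if (*pnum == 0) {
            if (first) {
                return ret;                 /* request starts past top EOF */
            }
            /* This layer is short: upper-layer reads see zeroes here. */
            *pnum = bytes;
            return ST_ZERO | ST_ALLOCATED;
        }
        if (ret & ST_ALLOCATED) {
            return ret;                     /* found the layer holding data */
        }
        /* Unallocated here: narrow the request and proceed to backing. */
        assert(*pnum <= bytes);
        bytes = *pnum;
        first = 0;
    }
    return ret;
}
```

With a 100-byte top layer that has nothing allocated and a 40-byte
backing layer with data in [0, 10), a query at offset 50 for 20 bytes
reports ST_ZERO | ST_ALLOCATED for all 20 bytes (case 1 above). A query
at offset 30 for 50 bytes reports 10 unallocated bytes, and a follow-up
query at offset 40 then gets allocated zeroes instead of stopping
inside the requested region (case 2 above).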