Re: [PATCH 1/5] iotests: Update 241 to expose backing layer fragmentation
25.02.2021 18:52, Eric Blake wrote:
> On 2/25/21 8:57 AM, Vladimir Sementsov-Ogievskiy wrote:
>> 25.02.2021 16:50, Vladimir Sementsov-Ogievskiy wrote:
>>> 18.02.2021 23:15, Eric Blake wrote:
>>>> Previous commits (such as 6e280648, 75d34eb9) have mentioned that our
>>>> NBD server still sends unaligned fragments when an active layer with
>>>> large advertised minimum block size is backed by another layer with a
>>>> smaller block size. Expand the test to actually cover these scenarios,
>>>> by using two different approaches: qcow2 encryption (which forces
>>>> 512-byte alignment) with an unaligned raw backing file, and blkdebug
>>>> with a 4k alignment.
>>
>> Now I don't think that aligning qemu:allocation-depth information is a
>> correct thing to do.
>
> Why not? First, it's very rare that you'd have a qcow2 image with
> mandated 4k minimum block size, backed by another qcow2 image with 512
> block size (blkdebug made it possible to expose the bug, but I could not
> find a way in common day-to-day usage), so we really aren't impacting
> normal users. Second, from the perspective of copying backing chains
> over NBD, what difference does it make if we have the backing chain:
>
> A (granularity 512) <- B (granularity 512) <- C (granularity 4k)
>
> with the allocation pattern:
>
> A: -A-A-A-A-A-A-A-A
> B: --BB--BB--BB--BB
> C:
>
> and report the allocation depth as:
>
> instead of
>
> 03220322
>
> The former may be imprecise, but it obeys our bounds, and in all
> reality, if all we have access to is 4k chunks, any decisions we make
> about how to handle that 4k block should be based on the fact that at
> least some of the data was allocated in our backing file, and not
> treating the entire 4k as unallocated merely because the first 512
> bytes are neither in A nor B.

I'm not sure about the NBD client, but in qemu block jobs the decision
may differ between tasks, as I mentioned in my answer on [2/5]. For
example, block-stream will skip chunks allocated in top, because there is
nothing to do: the data is already in top. But if we imagine that top may
return ALLOCATED for something that is not ALLOCATED, the stream logic is
broken.

Probably that's a bad example. I agree that this is a rare case anyway
and we probably shouldn't care too much. But we should at least describe
it in the allocation-depth specification.

-- 
Best regards,
Vladimir
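The block-stream concern can be illustrated with a toy model. This is only a sketch under stated assumptions, not QEMU's block-stream implementation: the `stream` function, the list-of-sectors representation, and the "coarse" status (one allocated/unallocated answer for a whole 4k block) are all hypothetical.

```python
def stream(top, backing, status):
    """Copy backing data into every sector that `status` reports as
    not allocated in top; sectors reported as allocated are skipped."""
    return [t if allocated else b
            for t, b, allocated in zip(top, backing, status)]


# Eight 512-byte sectors forming one 4k block; None means "no data in top".
top = [None, "T1", None, None, None, None, None, None]
backing = ["B0", "B1", "B2", "B3", "B4", "B5", "B6", "B7"]

# Precise status: one answer per 512-byte sector.
precise = [sector is not None for sector in top]
# Coarse status: a single answer for the whole 4k block.
coarse = [any(precise)] * len(top)

after_precise = stream(top, backing, precise)
after_coarse = stream(top, backing, coarse)
print(after_precise)  # every sector now holds data
print(after_coarse)   # the whole block was skipped, gaps remain
```

In this model the coarse answer makes the job skip the entire block, so the sectors that only the backing file held are never copied into top.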
Re: [PATCH 1/5] iotests: Update 241 to expose backing layer fragmentation
On 2/25/21 8:57 AM, Vladimir Sementsov-Ogievskiy wrote:
> 25.02.2021 16:50, Vladimir Sementsov-Ogievskiy wrote:
>> 18.02.2021 23:15, Eric Blake wrote:
>>> Previous commits (such as 6e280648, 75d34eb9) have mentioned that our
>>> NBD server still sends unaligned fragments when an active layer with
>>> large advertised minimum block size is backed by another layer with a
>>> smaller block size. Expand the test to actually cover these scenarios,
>>> by using two different approaches: qcow2 encryption (which forces
>>> 512-byte alignment) with an unaligned raw backing file, and blkdebug
>>> with a 4k alignment.
>>>
>
> Now I don't think that aligning qemu:allocation-depth information is a
> correct thing to do.

Why not? First, it's very rare that you'd have a qcow2 image with
mandated 4k minimum block size, backed by another qcow2 image with 512
block size (blkdebug made it possible to expose the bug, but I could not
find a way in common day-to-day usage), so we really aren't impacting
normal users. Second, from the perspective of copying backing chains
over NBD, what difference does it make if we have the backing chain:

A (granularity 512) <- B (granularity 512) <- C (granularity 4k)

with the allocation pattern:

A: -A-A-A-A-A-A-A-A
B: --BB--BB--BB--BB
C:

and report the allocation depth as:

instead of

03220322

The former may be imprecise, but it obeys our bounds, and in all
reality, if all we have access to is 4k chunks, any decisions we make
about how to handle that 4k block should be based on the fact that at
least some of the data was allocated in our backing file, and not
treating the entire 4k as unallocated merely because the first 512
bytes are neither in A nor B.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc. +1-919-301-3226
Virtualization: qemu.org | libvirt.org
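To make the fine-grained depth string concrete, here is a minimal Python sketch that derives it from the allocation patterns of A and B for the first eight 512-byte sectors (one 4k block). The depth numbering (0 for unallocated, 1 for the active layer C, 2 for B, 3 for A) and the assumption that C has nothing allocated are inferred from the "03220322" string; the function name is hypothetical and this is not QEMU code.

```python
def allocation_depth(layers):
    """Return the per-sector depth string for a backing chain.

    `layers` is ordered from the active layer down to the base image.
    Each entry has one character per 512-byte sector, where '-' means
    "not allocated in this layer". Depth 0 means unallocated in every
    layer; depth 1 is the active layer.
    """
    result = []
    for i in range(len(layers[0])):
        depth = 0
        for level, pattern in enumerate(layers, start=1):
            if pattern[i] != "-":
                depth = level
                break  # the topmost layer holding the sector wins
        result.append(str(depth))
    return "".join(result)


# Assumption: C, the active layer, has nothing allocated.
chain = [
    "--------",  # C (active layer, depth 1)
    "--BB--BB",  # B (depth 2)
    "-A-A-A-A",  # A (depth 3)
]
fine_grained = allocation_depth(chain)
print(fine_grained)

# With a 4k minimum block size, one answer has to cover all eight sectors.
# Whatever value is chosen, this much is true of the block as a whole:
any_allocated = any(d != "0" for d in fine_grained)
print(any_allocated)
```

The per-sector result reproduces the string above, and `any_allocated` captures the point of the argument: at least part of the 4k block comes from the backing chain even though its first 512 bytes do not.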
Re: [PATCH 1/5] iotests: Update 241 to expose backing layer fragmentation
On 2/25/21 7:50 AM, Vladimir Sementsov-Ogievskiy wrote:
> 18.02.2021 23:15, Eric Blake wrote:
>> Previous commits (such as 6e280648, 75d34eb9) have mentioned that our
>> NBD server still sends unaligned fragments when an active layer with
>> large advertised minimum block size is backed by another layer with a
>> smaller block size. Expand the test to actually cover these scenarios,
>> by using two different approaches: qcow2 encryption (which forces
>> 512-byte alignment) with an unaligned raw backing file, and blkdebug
>> with a 4k alignment.
>>
>> The encryption test passes with the desired results, but only because
>> the client side works around the server's non-compliance; if you
>> repeat the test manually with tracing turned on, you will see the
>> server sending a status for 1000 bytes of data then 1048 bytes of
>> hole, which is not aligned. But reverting commit 737d3f5244 shows that
>> it is indeed the client working around the bug in the server.
>>
>> Meanwhile, the blkdebug test gives incorrect results: remember, when
>> using x-dirty-bitmap with qemu-img map as a way to sniff alternative
>> metadata contexts, the meanings of "data" and "zero" are determined by
>
> How I'm tired of this abuse:) It seems that the total amount of comments
> about it in code and commit messages would justify creating a more
> intuitive interface. Don't you have an idea in mind?

Yes: 'nbdinfo' as part of the libnbd project ;)

Sadly, libnbd is not available on all our common porting targets yet,
and nbdinfo is less than a year old (so even distros that have libnbd
1.0 are too old).

>
>> that context. Our client workaround is assuming that the fragmented
>> replies can be merged according to base:allocation rules, but those
>> rules do not work for other contexts (merging dirty and clean bitmap
>> should produce dirty; merging allocated and unallocated should produce
>> allocated; see the FIXME for more about the decoded values we expect).
>
> You could instead keep the test output correct (without FIXME marks) but
> add the test to the "disabled" group and drop it from the group when
> fixed.

Either way, it's fixed by the end of the series.

>
>>
>> Signed-off-by: Eric Blake
>
> Reviewed-by: Vladimir Sementsov-Ogievskiy
>

Thanks!

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc. +1-919-301-3226
Virtualization: qemu.org | libvirt.org
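The point about merge rules differing per metadata context can be sketched as follows. This is an illustration, not the QEMU client code: the two merge functions are hypothetical, and the "keep a flag only if every fragment has it" rule is just one conservative reading of base:allocation (where omitting a flag is always a safe answer). The flag values are the ones the NBD protocol defines for base:allocation; bit 0 meaning "dirty" is how qemu:dirty-bitmap reports its state.

```python
from functools import reduce

# base:allocation flag bits from the NBD protocol.
NBD_STATE_HOLE = 1 << 0
NBD_STATE_ZERO = 1 << 1
# Bit 0 of a qemu:dirty-bitmap extent means "dirty".
DIRTY = 1 << 0


def merge_base_allocation(fragments):
    """Keep a flag only if every fragment has it (bitwise AND)."""
    return reduce(lambda a, b: a & b, fragments)


def merge_dirty_bitmap(fragments):
    """The merged block is dirty if any fragment is dirty (bitwise OR)."""
    return reduce(lambda a, b: a | b, fragments)


# Data followed by a hole: reporting plain data for the whole block is safe.
data, hole = 0, NBD_STATE_HOLE | NBD_STATE_ZERO
merged_allocation = merge_base_allocation([data, hole])

# Dirty followed by clean: the base:allocation rule loses the dirty bit.
clean, dirty = 0, DIRTY
wrong = merge_base_allocation([dirty, clean])
right = merge_dirty_bitmap([dirty, clean])
print(merged_allocation, wrong, right)
```

Applying the base:allocation rule to a dirty bitmap yields "clean" for a block that is partly dirty, which is the incorrect result the blkdebug test exposes. The same reasoning applies to allocation: a block that is partly allocated has to be reported as allocated.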
Re: [PATCH 1/5] iotests: Update 241 to expose backing layer fragmentation
25.02.2021 16:50, Vladimir Sementsov-Ogievskiy wrote:
> 18.02.2021 23:15, Eric Blake wrote:
>> Previous commits (such as 6e280648, 75d34eb9) have mentioned that our
>> NBD server still sends unaligned fragments when an active layer with
>> large advertised minimum block size is backed by another layer with a
>> smaller block size. Expand the test to actually cover these scenarios,
>> by using two different approaches: qcow2 encryption (which forces
>> 512-byte alignment) with an unaligned raw backing file, and blkdebug
>> with a 4k alignment.
>>
>> The encryption test passes with the desired results, but only because
>> the client side works around the server's non-compliance; if you
>> repeat the test manually with tracing turned on, you will see the
>> server sending a status for 1000 bytes of data then 1048 bytes of
>> hole, which is not aligned. But reverting commit 737d3f5244 shows that
>> it is indeed the client working around the bug in the server.
>>
>> Meanwhile, the blkdebug test gives incorrect results: remember, when
>> using x-dirty-bitmap with qemu-img map as a way to sniff alternative
>> metadata contexts, the meanings of "data" and "zero" are determined by
>
> How I'm tired of this abuse:) It seems that the total amount of comments
> about it in code and commit messages would justify creating a more
> intuitive interface. Don't you have an idea in mind?
>
>> that context. Our client workaround is assuming that the fragmented
>> replies can be merged according to base:allocation rules, but those
>> rules do not work for other contexts (merging dirty and clean bitmap
>> should produce dirty; merging allocated and unallocated should produce
>> allocated; see the FIXME for more about the decoded values we expect).
>
> You could instead keep the test output correct (without FIXME marks) but
> add the test to the "disabled" group and drop it from the group when
> fixed.
>
>> Signed-off-by: Eric Blake
>
> Reviewed-by: Vladimir Sementsov-Ogievskiy

Now I don't think that aligning qemu:allocation-depth information is a
correct thing to do.

-- 
Best regards,
Vladimir
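The "1000 bytes of data then 1048 bytes of hole" reply quoted above can be checked against the 512-byte alignment that qcow2 encryption forces. The helper below is a hypothetical sketch for illustration; it is not part of the test or of QEMU.

```python
def unaligned_extents(lengths, min_block):
    """Return (offset, length) for every extent whose start or length
    is not a multiple of min_block."""
    bad = []
    offset = 0
    for length in lengths:
        if offset % min_block or length % min_block:
            bad.append((offset, length))
        offset += length
    return bad


reported = [1000, 1048]  # bytes of data, then bytes of hole
total = sum(reported)
violations = unaligned_extents(reported, 512)
print(total)
print(violations)
```

The two extents add up to the 2k image size, but the boundary between them falls at byte 1000, so both violate the advertised minimum block size.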
Re: [PATCH 1/5] iotests: Update 241 to expose backing layer fragmentation
18.02.2021 23:15, Eric Blake wrote:
> Previous commits (such as 6e280648, 75d34eb9) have mentioned that our
> NBD server still sends unaligned fragments when an active layer with
> large advertised minimum block size is backed by another layer with a
> smaller block size. Expand the test to actually cover these scenarios,
> by using two different approaches: qcow2 encryption (which forces
> 512-byte alignment) with an unaligned raw backing file, and blkdebug
> with a 4k alignment.
>
> The encryption test passes with the desired results, but only because
> the client side works around the server's non-compliance; if you
> repeat the test manually with tracing turned on, you will see the
> server sending a status for 1000 bytes of data then 1048 bytes of
> hole, which is not aligned. But reverting commit 737d3f5244 shows that
> it is indeed the client working around the bug in the server.
>
> Meanwhile, the blkdebug test gives incorrect results: remember, when
> using x-dirty-bitmap with qemu-img map as a way to sniff alternative
> metadata contexts, the meanings of "data" and "zero" are determined by

How I'm tired of this abuse:) It seems that the total amount of comments
about it in code and commit messages would justify creating a more
intuitive interface. Don't you have an idea in mind?

> that context. Our client workaround is assuming that the fragmented
> replies can be merged according to base:allocation rules, but those
> rules do not work for other contexts (merging dirty and clean bitmap
> should produce dirty; merging allocated and unallocated should produce
> allocated; see the FIXME for more about the decoded values we expect).

You could instead keep the test output correct (without FIXME marks) but
add the test to the "disabled" group and drop it from the group when
fixed.

> Signed-off-by: Eric Blake

Reviewed-by: Vladimir Sementsov-Ogievskiy

> ---
>  tests/qemu-iotests/241     | 60 ++
>  tests/qemu-iotests/241.out | 21 +
>  2 files changed, 76 insertions(+), 5 deletions(-)
>
> diff --git a/tests/qemu-iotests/241 b/tests/qemu-iotests/241
> index c962c8b6075d..5217af82dc65 100755
> --- a/tests/qemu-iotests/241
> +++ b/tests/qemu-iotests/241
> @@ -3,7 +3,7 @@
>  #
>  # Test qemu-nbd vs. unaligned images
>  #
> -# Copyright (C) 2018-2019 Red Hat, Inc.
> +# Copyright (C) 2018-2021 Red Hat, Inc.
>  #
>  # This program is free software; you can redistribute it and/or modify
>  # it under the terms of the GNU General Public License as published by
> @@ -27,7 +27,7 @@ status=1 # failure is the default!
>  _cleanup()
>  {
>      _cleanup_test_img
> -    rm -f "$TEST_DIR/server.log"
> +    rm -f "$TEST_DIR/server.log" "$TEST_IMG_FILE.mid" "$TEST_IMG_FILE.qcow2"
>      nbd_server_stop
>  }
>  trap "_cleanup; exit \$status" 0 1 2 3 15
> @@ -37,7 +37,7 @@ trap "_cleanup; exit \$status" 0 1 2 3 15
>  . ./common.filter
>  . ./common.nbd
>
> -_supported_fmt raw
> +_supported_fmt raw # although the test also requires use of qcow2
>  _supported_proto nbd
>  _supported_os Linux
>  _require_command QEMU_NBD
> @@ -89,11 +89,61 @@ $QEMU_IMG map --output=json "$TEST_IMG" | _filter_qemu_img_map
>  $QEMU_IO -c map "$TEST_IMG"
>  nbd_server_stop
>
> -# Not tested yet: we also want to ensure that qemu as NBD client does
> +echo
> +echo "=== Encrypted qcow2 file backed by unaligned raw image ==="
> +echo
> +
> +# Enabling encryption in qcow2 forces 512-alignment
> +SECRET=secret,id=sec0,data=12345
> +$QEMU_IMG create -f qcow2 -b "$TEST_IMG_FILE" -F raw --object "$SECRET" \
> +  -o encrypt.format=luks,encrypt.key-secret=sec0,encrypt.iter-time=10 \
> +  "$TEST_IMG_FILE.qcow2" 2k | _filter_img_create
> +nbd_server_start_unix_socket --object "$SECRET" --image-opts \
> +  driver=qcow2,file.filename="$TEST_IMG_FILE.qcow2",encrypt.key-secret=sec0
> +
> +$QEMU_NBD_PROG --list -k $nbd_unix_socket | grep '\(size\|min\)'
> +$QEMU_IMG map --output=json "$TEST_IMG" | _filter_qemu_img_map
> +$QEMU_IO -c map "$TEST_IMG"
> +nbd_server_stop
> +
> +echo
> +echo "=== Use blkdebug for larger alignment than backing layer ==="
> +echo
> +
> +$QEMU_IMG create -f qcow2 -o cluster_size=1024 \
> +  "$TEST_IMG_FILE" 2k | _filter_img_create
> +$QEMU_IMG create -f qcow2 -b "$TEST_IMG_FILE" -F qcow2 -o cluster_size=512 \
> +  "$TEST_IMG_FILE.mid" | _filter_img_create
> +$QEMU_IMG bitmap --add -g 512 --enable "$TEST_IMG_FILE.mid" b0
> +$QEMU_IO -f qcow2 -c 'w 512 512' "$TEST_IMG_FILE.mid" | _filter_qemu_io
> +$QEMU_IMG create -f qcow2 -b "$TEST_IMG_FILE.mid" -F qcow2 \
> +  "$TEST_IMG_FILE.qcow2" 4k | _filter_img_create
> +FILE=image.file.filename="$TEST_IMG_FILE.qcow2"
> +nbd_server_start_unix_socket -B b0 -A --image-opts \
> +  driver=blkdebug,align=4096,image.driver=qcow2,image.file.driver=file,"$FILE"
> +
> +TEST_IMG="driver=nbd,server.type=unix,server.path=$nbd_unix_socket"
> +$QEMU_NBD_PROG --list -k $nbd_unix_socket | grep '\(size\|min\)'
> +# FIXME: this should report a single 4k block of "data":false which translates
> +# to the dirty bitmap being set for at least part of the region; "data":true
> +# is wrong unless the entire 4k is clean.
> +$QEMU_IMG map --output=json --i
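The FIXME comments in the patch depend on how qemu-img map presents a metadata context's value when x-dirty-bitmap is used. The sketch below shows the decoding those comments imply: bit 0 set is shown as "data":false and bit 1 set is shown as "zero":true. The mapping is inferred from the combinations named in the FIXME text (dirty, unallocated, allocated in the active layer, allocated from the backing chain); the function name is hypothetical and this is not QEMU's code.

```python
def decode_as_qemu_img_map(value):
    """Translate a raw context value into the "data"/"zero" pair that
    qemu-img map would print when sniffing it through x-dirty-bitmap."""
    return {"data": not (value & 1), "zero": bool(value & 2)}


# qemu:dirty-bitmap: 0 is clean, 1 is dirty.
dirty = decode_as_qemu_img_map(1)

# qemu:allocation-depth: 0 unallocated, 1 active layer, 2 backing chain.
unallocated = decode_as_qemu_img_map(0)
active_layer = decode_as_qemu_img_map(1)
from_backing = decode_as_qemu_img_map(2)

for name, decoded in [("dirty", dirty), ("unallocated", unallocated),
                      ("active layer", active_layer),
                      ("backing chain", from_backing)]:
    print(name, decoded)
```

Read this way, the expected output for the blkdebug case is "data":false for the dirty bitmap and "zero":true,"data":true for the allocation depth, each covering the whole 4k block.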
[PATCH 1/5] iotests: Update 241 to expose backing layer fragmentation
Previous commits (such as 6e280648, 75d34eb9) have mentioned that our
NBD server still sends unaligned fragments when an active layer with
large advertised minimum block size is backed by another layer with a
smaller block size. Expand the test to actually cover these scenarios,
by using two different approaches: qcow2 encryption (which forces
512-byte alignment) with an unaligned raw backing file, and blkdebug
with a 4k alignment.

The encryption test passes with the desired results, but only because
the client side works around the server's non-compliance; if you
repeat the test manually with tracing turned on, you will see the
server sending a status for 1000 bytes of data then 1048 bytes of
hole, which is not aligned. But reverting commit 737d3f5244 shows that
it is indeed the client working around the bug in the server.

Meanwhile, the blkdebug test gives incorrect results: remember, when
using x-dirty-bitmap with qemu-img map as a way to sniff alternative
metadata contexts, the meanings of "data" and "zero" are determined by
that context. Our client workaround is assuming that the fragmented
replies can be merged according to base:allocation rules, but those
rules do not work for other contexts (merging dirty and clean bitmap
should produce dirty; merging allocated and unallocated should produce
allocated; see the FIXME for more about the decoded values we expect).

Signed-off-by: Eric Blake
---
 tests/qemu-iotests/241     | 60 ++
 tests/qemu-iotests/241.out | 21 +
 2 files changed, 76 insertions(+), 5 deletions(-)

diff --git a/tests/qemu-iotests/241 b/tests/qemu-iotests/241
index c962c8b6075d..5217af82dc65 100755
--- a/tests/qemu-iotests/241
+++ b/tests/qemu-iotests/241
@@ -3,7 +3,7 @@
 #
 # Test qemu-nbd vs. unaligned images
 #
-# Copyright (C) 2018-2019 Red Hat, Inc.
+# Copyright (C) 2018-2021 Red Hat, Inc.
 #
 # This program is free software; you can redistribute it and/or modify
 # it under the terms of the GNU General Public License as published by
@@ -27,7 +27,7 @@ status=1 # failure is the default!
 _cleanup()
 {
     _cleanup_test_img
-    rm -f "$TEST_DIR/server.log"
+    rm -f "$TEST_DIR/server.log" "$TEST_IMG_FILE.mid" "$TEST_IMG_FILE.qcow2"
     nbd_server_stop
 }
 trap "_cleanup; exit \$status" 0 1 2 3 15
@@ -37,7 +37,7 @@ trap "_cleanup; exit \$status" 0 1 2 3 15
 . ./common.filter
 . ./common.nbd

-_supported_fmt raw
+_supported_fmt raw # although the test also requires use of qcow2
 _supported_proto nbd
 _supported_os Linux
 _require_command QEMU_NBD
@@ -89,11 +89,61 @@ $QEMU_IMG map --output=json "$TEST_IMG" | _filter_qemu_img_map
 $QEMU_IO -c map "$TEST_IMG"
 nbd_server_stop

-# Not tested yet: we also want to ensure that qemu as NBD client does
+echo
+echo "=== Encrypted qcow2 file backed by unaligned raw image ==="
+echo
+
+# Enabling encryption in qcow2 forces 512-alignment
+SECRET=secret,id=sec0,data=12345
+$QEMU_IMG create -f qcow2 -b "$TEST_IMG_FILE" -F raw --object "$SECRET" \
+  -o encrypt.format=luks,encrypt.key-secret=sec0,encrypt.iter-time=10 \
+  "$TEST_IMG_FILE.qcow2" 2k | _filter_img_create
+nbd_server_start_unix_socket --object "$SECRET" --image-opts \
+  driver=qcow2,file.filename="$TEST_IMG_FILE.qcow2",encrypt.key-secret=sec0
+
+$QEMU_NBD_PROG --list -k $nbd_unix_socket | grep '\(size\|min\)'
+$QEMU_IMG map --output=json "$TEST_IMG" | _filter_qemu_img_map
+$QEMU_IO -c map "$TEST_IMG"
+nbd_server_stop
+
+echo
+echo "=== Use blkdebug for larger alignment than backing layer ==="
+echo
+
+$QEMU_IMG create -f qcow2 -o cluster_size=1024 \
+  "$TEST_IMG_FILE" 2k | _filter_img_create
+$QEMU_IMG create -f qcow2 -b "$TEST_IMG_FILE" -F qcow2 -o cluster_size=512 \
+  "$TEST_IMG_FILE.mid" | _filter_img_create
+$QEMU_IMG bitmap --add -g 512 --enable "$TEST_IMG_FILE.mid" b0
+$QEMU_IO -f qcow2 -c 'w 512 512' "$TEST_IMG_FILE.mid" | _filter_qemu_io
+$QEMU_IMG create -f qcow2 -b "$TEST_IMG_FILE.mid" -F qcow2 \
+  "$TEST_IMG_FILE.qcow2" 4k | _filter_img_create
+FILE=image.file.filename="$TEST_IMG_FILE.qcow2"
+nbd_server_start_unix_socket -B b0 -A --image-opts \
+  driver=blkdebug,align=4096,image.driver=qcow2,image.file.driver=file,"$FILE"
+
+TEST_IMG="driver=nbd,server.type=unix,server.path=$nbd_unix_socket"
+$QEMU_NBD_PROG --list -k $nbd_unix_socket | grep '\(size\|min\)'
+# FIXME: this should report a single 4k block of "data":false which translates
+# to the dirty bitmap being set for at least part of the region; "data":true
+# is wrong unless the entire 4k is clean.
+$QEMU_IMG map --output=json --image-opts \
+  "$TEST_IMG",x-dirty-bitmap=qemu:dirty-bitmap:b0 | _filter_qemu_img_map
+
+# FIXME: this should report a single 4k block of "zero":true,"data":true,
+# meaning allocated from the backing chain. Using "zero":false,"data":false
+# (allocated in active layer) or "zero":false,"data":true (entire region
+# unallocated) is wrong.
+$QEMU_IMG map --output=json --image-opts \
+  "$TEST_IMG",x-dirty-bitmap=qemu:allocation-de