Re: [Qemu-devel] [PATCH] block: Produce zeros when protocols reading beyond end of file
On Fri, Aug 16, 2013 at 10:41:36AM +0200, Stefan Hajnoczi wrote: On Mon, Aug 5, 2013 at 10:11 AM, Asias He as...@redhat.com wrote: From: MORITA Kazutaka morita.kazut...@lab.ntt.co.jp While Asias is debugging an issue creating qcow2 images on top of non-file protocols. It boils down to this example using NBD: $ qemu-io -c 'open -g nbd+unix:///?socket=/tmp/nbd.sock' -c 'read -v 0 512' Notice the open -g option to set bs-growable. This means you can read/write beyond end of file. Reading beyond end of file is supposed to produce zeroes. We rely on this behavior in qcow2_create2() during qcow2 image creation. We create a new file and then write the qcow2 header structure using bdrv_pwrite(). Since QCowHeader is not a multiple of sector size, block.c first uses bdrv_read() on the empty file to fetch the first sector (should be all zeroes). Here is the output from the qemu-io NBD example above: $ qemu-io -c 'open -g nbd+unix:///?socket=/tmp/nbd.sock' -c 'read -v 0 512' : ab ab ab ab ab ab ab ab ab ab ab ab ab ab ab ab 0010: ab ab ab ab ab ab ab ab ab ab ab ab ab ab ab ab 0020: ab ab ab ab ab ab ab ab ab ab ab ab ab ab ab ab ... We are not zeroing the buffer! As a result qcow2 image creation on top of protocols is not guaranteed to work even when file creation is supported by the protocol. Signed-off-by: MORITA Kazutaka morita.kazut...@lab.ntt.co.jp Signed-off-by: Asias He as...@redhat.com --- block.c | 30 +- 1 file changed, 29 insertions(+), 1 deletion(-) diff --git a/block.c b/block.c index 01b66d8..deaf0a0 100644 --- a/block.c +++ b/block.c @@ -2544,7 +2544,35 @@ static int coroutine_fn bdrv_co_do_readv(BlockDriverState *bs, } } -ret = drv-bdrv_co_readv(bs, sector_num, nb_sectors, qiov); +if (!bs-drv-protocol_name) { +ret = drv-bdrv_co_readv(bs, sector_num, nb_sectors, qiov); +} else { +/* NBD doesn't support reading beyond end of file. */ +int64_t len, total_sectors, max_nb_sectors; + +len = bdrv_getlength(bs); +if (len 0) { +ret = len; +goto out; +} + +total_sectors = len BDRV_SECTOR_BITS; +max_nb_sectors = MAX(0, total_sectors - sector_num); +if (max_nb_sectors 0) { +ret = drv-bdrv_co_readv(bs, sector_num, + MIN(nb_sectors, max_nb_sectors), qiov); +} else { +ret = 0; +} + +/* Reading beyond end of file is supposed to produce zeroes */ +if (ret == 0 total_sectors sector_num + nb_sectors) { +size_t offset = MAX(0, total_sectors - sector_num); +size_t bytes = (sector_num + nb_sectors - offset) * +BDRV_SECTOR_SIZE; +qemu_iovec_memset(qiov, offset * BDRV_SECTOR_SIZE, 0, bytes); +} +} This patch breaks qemu-iotests ./check -qcow2 022. This happens because qcow2 temporarily sets -growable = 1 for vmstate accesses (which are stored beyond the end of regular image data). I am a bit confused. This is from the other mail: I think it would break qcow2_load_vmstate(), which is basically a bdrv_pread() after the end of the image. I see, then only protocols have to zeroing the buffer? In case of protocols, I think bdrv_getlength() returns the underlying file length, so qcow2_load_vmstate() would be a bdrv_pread() within the result of bdrv_getlength(). Limiting it to protocols solves the problem, I think. And in v1 of this patch, Kevin wanted bs-growable check instad of the protocol_name one. -ret = drv-bdrv_co_readv(bs, sector_num, nb_sectors, qiov); +if (!bs-drv-protocol_name) { I think !bs-growable is the right check. Checking for the protocol name is always a hack and most times wrong. Switching back to the protocol_name check, ./check -qcow2 022 test passes. static int qcow2_load_vmstate(BlockDriverState *bs, uint8_t *buf,
Re: [Qemu-devel] [PATCH] block: Produce zeros when protocols reading beyond end of file
On Mon, Aug 19, 2013 at 02:36:14PM +0800, Asias He wrote: On Fri, Aug 16, 2013 at 10:41:36AM +0200, Stefan Hajnoczi wrote: On Mon, Aug 5, 2013 at 10:11 AM, Asias He as...@redhat.com wrote: A simple but ugly way to fix this is for block.c to also have a -zero_beyond_eof flag which enables the behavior you are adding. qcow2_load_vmstate() would disable -zero_beyond_eof temporarily in addition to enabling -growable. I am wondering why the -growable logic is introduced in the first place. Adding yet another this kind of flag looks realy ugly ;( bs-growable serves two functions: 1. It means you can read/write beyond the end of the file, for example, when creating a new image file. Normally BlockDriverState rejects requests beyond the EOF. 2. qcow2 uses it as a hack to implement the vmstate area after the end of regular guest data. This is the ugly part but it always worked up until now. #1 is a legitimate use case. You don't always know how big the file will end up so it's much more convenient to grow on demand, instead of having to bdrv_truncate() all the time. #2 is hacky but hard to solve without duplicating the bounce buffer and coroutine wrapping logic in bdrv_pread() (Kevin has suggested calling bdrv_co_readv() internally instead of bdrv_pread()). Stefan
Re: [Qemu-devel] [PATCH] block: Produce zeros when protocols reading beyond end of file
On Mon, Aug 5, 2013 at 10:11 AM, Asias He as...@redhat.com wrote: From: MORITA Kazutaka morita.kazut...@lab.ntt.co.jp While Asias is debugging an issue creating qcow2 images on top of non-file protocols. It boils down to this example using NBD: $ qemu-io -c 'open -g nbd+unix:///?socket=/tmp/nbd.sock' -c 'read -v 0 512' Notice the open -g option to set bs-growable. This means you can read/write beyond end of file. Reading beyond end of file is supposed to produce zeroes. We rely on this behavior in qcow2_create2() during qcow2 image creation. We create a new file and then write the qcow2 header structure using bdrv_pwrite(). Since QCowHeader is not a multiple of sector size, block.c first uses bdrv_read() on the empty file to fetch the first sector (should be all zeroes). Here is the output from the qemu-io NBD example above: $ qemu-io -c 'open -g nbd+unix:///?socket=/tmp/nbd.sock' -c 'read -v 0 512' : ab ab ab ab ab ab ab ab ab ab ab ab ab ab ab ab 0010: ab ab ab ab ab ab ab ab ab ab ab ab ab ab ab ab 0020: ab ab ab ab ab ab ab ab ab ab ab ab ab ab ab ab ... We are not zeroing the buffer! As a result qcow2 image creation on top of protocols is not guaranteed to work even when file creation is supported by the protocol. Signed-off-by: MORITA Kazutaka morita.kazut...@lab.ntt.co.jp Signed-off-by: Asias He as...@redhat.com --- block.c | 30 +- 1 file changed, 29 insertions(+), 1 deletion(-) diff --git a/block.c b/block.c index 01b66d8..deaf0a0 100644 --- a/block.c +++ b/block.c @@ -2544,7 +2544,35 @@ static int coroutine_fn bdrv_co_do_readv(BlockDriverState *bs, } } -ret = drv-bdrv_co_readv(bs, sector_num, nb_sectors, qiov); +if (!bs-drv-protocol_name) { +ret = drv-bdrv_co_readv(bs, sector_num, nb_sectors, qiov); +} else { +/* NBD doesn't support reading beyond end of file. */ +int64_t len, total_sectors, max_nb_sectors; + +len = bdrv_getlength(bs); +if (len 0) { +ret = len; +goto out; +} + +total_sectors = len BDRV_SECTOR_BITS; +max_nb_sectors = MAX(0, total_sectors - sector_num); +if (max_nb_sectors 0) { +ret = drv-bdrv_co_readv(bs, sector_num, + MIN(nb_sectors, max_nb_sectors), qiov); +} else { +ret = 0; +} + +/* Reading beyond end of file is supposed to produce zeroes */ +if (ret == 0 total_sectors sector_num + nb_sectors) { +size_t offset = MAX(0, total_sectors - sector_num); +size_t bytes = (sector_num + nb_sectors - offset) * +BDRV_SECTOR_SIZE; +qemu_iovec_memset(qiov, offset * BDRV_SECTOR_SIZE, 0, bytes); +} +} This patch breaks qemu-iotests ./check -qcow2 022. This happens because qcow2 temporarily sets -growable = 1 for vmstate accesses (which are stored beyond the end of regular image data). static int qcow2_load_vmstate(BlockDriverState *bs, uint8_t *buf, int64_t pos, int size) { BDRVQcowState *s = bs-opaque; int growable = bs-growable; int ret; BLKDBG_EVENT(bs-file, BLKDBG_VMSTATE_LOAD); bs-growable = 1; ret = bdrv_pread(bs, qcow2_vm_state_offset(s) + pos, buf, size); bs-growable = growable; return ret; } Please *always* run qemu-iotests before submitting block patches: http://qemu-project.org/Documentation/QemuIoTests A simple but ugly way to fix this is for block.c to also have a -zero_beyond_eof flag which enables the behavior you are adding. qcow2_load_vmstate() would disable -zero_beyond_eof temporarily in addition to enabling -growable. It's not easy to call the internal qcow2_co_readv() from qcow2_load_vmstate() because the vmstate functions are byte granularity (like bdrv_pread()) while .bdrv_co_readv() is sector-granularity. Stefan
[Qemu-devel] [PATCH] block: Produce zeros when protocols reading beyond end of file
From: MORITA Kazutaka morita.kazut...@lab.ntt.co.jp While Asias is debugging an issue creating qcow2 images on top of non-file protocols. It boils down to this example using NBD: $ qemu-io -c 'open -g nbd+unix:///?socket=/tmp/nbd.sock' -c 'read -v 0 512' Notice the open -g option to set bs-growable. This means you can read/write beyond end of file. Reading beyond end of file is supposed to produce zeroes. We rely on this behavior in qcow2_create2() during qcow2 image creation. We create a new file and then write the qcow2 header structure using bdrv_pwrite(). Since QCowHeader is not a multiple of sector size, block.c first uses bdrv_read() on the empty file to fetch the first sector (should be all zeroes). Here is the output from the qemu-io NBD example above: $ qemu-io -c 'open -g nbd+unix:///?socket=/tmp/nbd.sock' -c 'read -v 0 512' : ab ab ab ab ab ab ab ab ab ab ab ab ab ab ab ab 0010: ab ab ab ab ab ab ab ab ab ab ab ab ab ab ab ab 0020: ab ab ab ab ab ab ab ab ab ab ab ab ab ab ab ab ... We are not zeroing the buffer! As a result qcow2 image creation on top of protocols is not guaranteed to work even when file creation is supported by the protocol. Signed-off-by: MORITA Kazutaka morita.kazut...@lab.ntt.co.jp Signed-off-by: Asias He as...@redhat.com --- block.c | 30 +- 1 file changed, 29 insertions(+), 1 deletion(-) diff --git a/block.c b/block.c index 01b66d8..deaf0a0 100644 --- a/block.c +++ b/block.c @@ -2544,7 +2544,35 @@ static int coroutine_fn bdrv_co_do_readv(BlockDriverState *bs, } } -ret = drv-bdrv_co_readv(bs, sector_num, nb_sectors, qiov); +if (!bs-drv-protocol_name) { +ret = drv-bdrv_co_readv(bs, sector_num, nb_sectors, qiov); +} else { +/* NBD doesn't support reading beyond end of file. */ +int64_t len, total_sectors, max_nb_sectors; + +len = bdrv_getlength(bs); +if (len 0) { +ret = len; +goto out; +} + +total_sectors = len BDRV_SECTOR_BITS; +max_nb_sectors = MAX(0, total_sectors - sector_num); +if (max_nb_sectors 0) { +ret = drv-bdrv_co_readv(bs, sector_num, + MIN(nb_sectors, max_nb_sectors), qiov); +} else { +ret = 0; +} + +/* Reading beyond end of file is supposed to produce zeroes */ +if (ret == 0 total_sectors sector_num + nb_sectors) { +size_t offset = MAX(0, total_sectors - sector_num); +size_t bytes = (sector_num + nb_sectors - offset) * +BDRV_SECTOR_SIZE; +qemu_iovec_memset(qiov, offset * BDRV_SECTOR_SIZE, 0, bytes); +} +} out: tracked_request_end(req); -- 1.8.3.1
Re: [Qemu-devel] [PATCH] block: Produce zeros when protocols reading beyond end of file
Am 05.08.2013 um 10:11 hat Asias He geschrieben: From: MORITA Kazutaka morita.kazut...@lab.ntt.co.jp While Asias is debugging an issue creating qcow2 images on top of non-file protocols. It boils down to this example using NBD: $ qemu-io -c 'open -g nbd+unix:///?socket=/tmp/nbd.sock' -c 'read -v 0 512' Notice the open -g option to set bs-growable. This means you can read/write beyond end of file. Reading beyond end of file is supposed to produce zeroes. We rely on this behavior in qcow2_create2() during qcow2 image creation. We create a new file and then write the qcow2 header structure using bdrv_pwrite(). Since QCowHeader is not a multiple of sector size, block.c first uses bdrv_read() on the empty file to fetch the first sector (should be all zeroes). Here is the output from the qemu-io NBD example above: $ qemu-io -c 'open -g nbd+unix:///?socket=/tmp/nbd.sock' -c 'read -v 0 512' : ab ab ab ab ab ab ab ab ab ab ab ab ab ab ab ab 0010: ab ab ab ab ab ab ab ab ab ab ab ab ab ab ab ab 0020: ab ab ab ab ab ab ab ab ab ab ab ab ab ab ab ab ... We are not zeroing the buffer! As a result qcow2 image creation on top of protocols is not guaranteed to work even when file creation is supported by the protocol. Signed-off-by: MORITA Kazutaka morita.kazut...@lab.ntt.co.jp Signed-off-by: Asias He as...@redhat.com --- block.c | 30 +- 1 file changed, 29 insertions(+), 1 deletion(-) diff --git a/block.c b/block.c index 01b66d8..deaf0a0 100644 --- a/block.c +++ b/block.c @@ -2544,7 +2544,35 @@ static int coroutine_fn bdrv_co_do_readv(BlockDriverState *bs, } } -ret = drv-bdrv_co_readv(bs, sector_num, nb_sectors, qiov); +if (!bs-drv-protocol_name) { I think !bs-growable is the right check. Checking for the protocol name is always a hack and most times wrong. +ret = drv-bdrv_co_readv(bs, sector_num, nb_sectors, qiov); +} else { +/* NBD doesn't support reading beyond end of file. */ This is not only for NBD, make it a neutral comment like: /* Read zeros after EOF of growable BDSes */ +int64_t len, total_sectors, max_nb_sectors; + +len = bdrv_getlength(bs); +if (len 0) { +ret = len; +goto out; +} + +total_sectors = len BDRV_SECTOR_BITS; +max_nb_sectors = MAX(0, total_sectors - sector_num); +if (max_nb_sectors 0) { +ret = drv-bdrv_co_readv(bs, sector_num, + MIN(nb_sectors, max_nb_sectors), qiov); +} else { +ret = 0; +} + +/* Reading beyond end of file is supposed to produce zeroes */ +if (ret == 0 total_sectors sector_num + nb_sectors) { +size_t offset = MAX(0, total_sectors - sector_num); +size_t bytes = (sector_num + nb_sectors - offset) * +BDRV_SECTOR_SIZE; uint64_t for both offset and bytes, size_t can be 32 bits. +qemu_iovec_memset(qiov, offset * BDRV_SECTOR_SIZE, 0, bytes); +} +} out: tracked_request_end(req); Kevin