Re: [Qemu-devel] [PATCH] block: Produce zeros when protocols reading beyond end of file

2013-08-19 Thread Asias He
On Fri, Aug 16, 2013 at 10:41:36AM +0200, Stefan Hajnoczi wrote:
 On Mon, Aug 5, 2013 at 10:11 AM, Asias He as...@redhat.com wrote:
  From: MORITA Kazutaka morita.kazut...@lab.ntt.co.jp
 
  While Asias is debugging an issue creating qcow2 images on top of
  non-file protocols.  It boils down to this example using NBD:
 
  $ qemu-io -c 'open -g nbd+unix:///?socket=/tmp/nbd.sock' -c 'read -v 0 512'
 
  Notice the open -g option to set bs-growable.  This means you can
  read/write beyond end of file.  Reading beyond end of file is supposed
  to produce zeroes.
 
  We rely on this behavior in qcow2_create2() during qcow2 image
  creation.  We create a new file and then write the qcow2 header
  structure using bdrv_pwrite().  Since QCowHeader is not a multiple of
  sector size, block.c first uses bdrv_read() on the empty file to fetch
  the first sector (should be all zeroes).
 
  Here is the output from the qemu-io NBD example above:
 
  $ qemu-io -c 'open -g nbd+unix:///?socket=/tmp/nbd.sock' -c 'read -v 0 512'
  :  ab ab ab ab ab ab ab ab ab ab ab ab ab ab ab ab  
  0010:  ab ab ab ab ab ab ab ab ab ab ab ab ab ab ab ab  
  0020:  ab ab ab ab ab ab ab ab ab ab ab ab ab ab ab ab  
  ...
 
  We are not zeroing the buffer!  As a result qcow2 image creation on top
  of protocols is not guaranteed to work even when file creation is
  supported by the protocol.
 
  Signed-off-by: MORITA Kazutaka morita.kazut...@lab.ntt.co.jp
  Signed-off-by: Asias He as...@redhat.com
  ---
   block.c | 30 +-
   1 file changed, 29 insertions(+), 1 deletion(-)
 
  diff --git a/block.c b/block.c
  index 01b66d8..deaf0a0 100644
  --- a/block.c
  +++ b/block.c
  @@ -2544,7 +2544,35 @@ static int coroutine_fn 
  bdrv_co_do_readv(BlockDriverState *bs,
   }
   }
 
  -ret = drv-bdrv_co_readv(bs, sector_num, nb_sectors, qiov);
  +if (!bs-drv-protocol_name) {
  +ret = drv-bdrv_co_readv(bs, sector_num, nb_sectors, qiov);
  +} else {
  +/* NBD doesn't support reading beyond end of file. */
  +int64_t len, total_sectors, max_nb_sectors;
  +
  +len = bdrv_getlength(bs);
  +if (len  0) {
  +ret = len;
  +goto out;
  +}
  +
  +total_sectors = len  BDRV_SECTOR_BITS;
  +max_nb_sectors = MAX(0, total_sectors - sector_num);
  +if (max_nb_sectors  0) {
  +ret = drv-bdrv_co_readv(bs, sector_num,
  + MIN(nb_sectors, max_nb_sectors), 
  qiov);
  +} else {
  +ret = 0;
  +}
  +
  +/* Reading beyond end of file is supposed to produce zeroes */
  +if (ret == 0  total_sectors  sector_num + nb_sectors) {
  +size_t offset = MAX(0, total_sectors - sector_num);
  +size_t bytes = (sector_num + nb_sectors - offset) *
  +BDRV_SECTOR_SIZE;
  +qemu_iovec_memset(qiov, offset * BDRV_SECTOR_SIZE, 0, bytes);
  +}
  +}
 
 This patch breaks qemu-iotests ./check -qcow2 022.  This happens
 because qcow2 temporarily sets -growable = 1 for vmstate accesses
 (which are stored beyond the end of regular image data).

I am a bit confused. This is from the other mail:


  I think it would break qcow2_load_vmstate(), which is basically a   
  
  bdrv_pread() after the end of the image.
  
   
 
 I see, then only protocols have to zeroing the buffer?  In case of
 
 protocols, I think bdrv_getlength() returns the underlying file   
 
 length, so qcow2_load_vmstate() would be a bdrv_pread() within the
 
 result of bdrv_getlength().   
 

Limiting it to protocols solves the problem, I think.


And in v1 of this patch, Kevin wanted bs-growable check instad of the
protocol_name one.


 -ret = drv-bdrv_co_readv(bs, sector_num, nb_sectors, qiov);  
 
 +if (!bs-drv-protocol_name) {   
 

I think !bs-growable is the right check.

Checking for the protocol name is always a hack and most times wrong.


Switching back to the protocol_name check, ./check -qcow2 022 test passes.

 static int qcow2_load_vmstate(BlockDriverState *bs, uint8_t *buf,

Re: [Qemu-devel] [PATCH] block: Produce zeros when protocols reading beyond end of file

2013-08-19 Thread Stefan Hajnoczi
On Mon, Aug 19, 2013 at 02:36:14PM +0800, Asias He wrote:
 On Fri, Aug 16, 2013 at 10:41:36AM +0200, Stefan Hajnoczi wrote:
  On Mon, Aug 5, 2013 at 10:11 AM, Asias He as...@redhat.com wrote:
  A simple but ugly way to fix this is for block.c to also have a
  -zero_beyond_eof flag which enables the behavior you are adding.
  qcow2_load_vmstate() would disable -zero_beyond_eof temporarily in
  addition to enabling -growable.
 
 I am wondering why the -growable logic is introduced in the first
 place.  Adding yet another this kind of flag looks realy ugly ;(

bs-growable serves two functions:

1. It means you can read/write beyond the end of the file, for example,
   when creating a new image file.  Normally BlockDriverState rejects
   requests beyond the EOF.

2. qcow2 uses it as a hack to implement the vmstate area after the end
   of regular guest data.  This is the ugly part but it always worked up
   until now.

#1 is a legitimate use case.  You don't always know how big the file
will end up so it's much more convenient to grow on demand, instead of
having to bdrv_truncate() all the time.

#2 is hacky but hard to solve without duplicating the bounce buffer and
coroutine wrapping logic in bdrv_pread() (Kevin has suggested calling
bdrv_co_readv() internally instead of bdrv_pread()).

Stefan



Re: [Qemu-devel] [PATCH] block: Produce zeros when protocols reading beyond end of file

2013-08-16 Thread Stefan Hajnoczi
On Mon, Aug 5, 2013 at 10:11 AM, Asias He as...@redhat.com wrote:
 From: MORITA Kazutaka morita.kazut...@lab.ntt.co.jp

 While Asias is debugging an issue creating qcow2 images on top of
 non-file protocols.  It boils down to this example using NBD:

 $ qemu-io -c 'open -g nbd+unix:///?socket=/tmp/nbd.sock' -c 'read -v 0 512'

 Notice the open -g option to set bs-growable.  This means you can
 read/write beyond end of file.  Reading beyond end of file is supposed
 to produce zeroes.

 We rely on this behavior in qcow2_create2() during qcow2 image
 creation.  We create a new file and then write the qcow2 header
 structure using bdrv_pwrite().  Since QCowHeader is not a multiple of
 sector size, block.c first uses bdrv_read() on the empty file to fetch
 the first sector (should be all zeroes).

 Here is the output from the qemu-io NBD example above:

 $ qemu-io -c 'open -g nbd+unix:///?socket=/tmp/nbd.sock' -c 'read -v 0 512'
 :  ab ab ab ab ab ab ab ab ab ab ab ab ab ab ab ab  
 0010:  ab ab ab ab ab ab ab ab ab ab ab ab ab ab ab ab  
 0020:  ab ab ab ab ab ab ab ab ab ab ab ab ab ab ab ab  
 ...

 We are not zeroing the buffer!  As a result qcow2 image creation on top
 of protocols is not guaranteed to work even when file creation is
 supported by the protocol.

 Signed-off-by: MORITA Kazutaka morita.kazut...@lab.ntt.co.jp
 Signed-off-by: Asias He as...@redhat.com
 ---
  block.c | 30 +-
  1 file changed, 29 insertions(+), 1 deletion(-)

 diff --git a/block.c b/block.c
 index 01b66d8..deaf0a0 100644
 --- a/block.c
 +++ b/block.c
 @@ -2544,7 +2544,35 @@ static int coroutine_fn 
 bdrv_co_do_readv(BlockDriverState *bs,
  }
  }

 -ret = drv-bdrv_co_readv(bs, sector_num, nb_sectors, qiov);
 +if (!bs-drv-protocol_name) {
 +ret = drv-bdrv_co_readv(bs, sector_num, nb_sectors, qiov);
 +} else {
 +/* NBD doesn't support reading beyond end of file. */
 +int64_t len, total_sectors, max_nb_sectors;
 +
 +len = bdrv_getlength(bs);
 +if (len  0) {
 +ret = len;
 +goto out;
 +}
 +
 +total_sectors = len  BDRV_SECTOR_BITS;
 +max_nb_sectors = MAX(0, total_sectors - sector_num);
 +if (max_nb_sectors  0) {
 +ret = drv-bdrv_co_readv(bs, sector_num,
 + MIN(nb_sectors, max_nb_sectors), qiov);
 +} else {
 +ret = 0;
 +}
 +
 +/* Reading beyond end of file is supposed to produce zeroes */
 +if (ret == 0  total_sectors  sector_num + nb_sectors) {
 +size_t offset = MAX(0, total_sectors - sector_num);
 +size_t bytes = (sector_num + nb_sectors - offset) *
 +BDRV_SECTOR_SIZE;
 +qemu_iovec_memset(qiov, offset * BDRV_SECTOR_SIZE, 0, bytes);
 +}
 +}

This patch breaks qemu-iotests ./check -qcow2 022.  This happens
because qcow2 temporarily sets -growable = 1 for vmstate accesses
(which are stored beyond the end of regular image data).

static int qcow2_load_vmstate(BlockDriverState *bs, uint8_t *buf,
  int64_t pos, int size)
{
BDRVQcowState *s = bs-opaque;
int growable = bs-growable;
int ret;

BLKDBG_EVENT(bs-file, BLKDBG_VMSTATE_LOAD);
bs-growable = 1;
ret = bdrv_pread(bs, qcow2_vm_state_offset(s) + pos, buf, size);
bs-growable = growable;

return ret;
}

Please *always* run qemu-iotests before submitting block patches:
http://qemu-project.org/Documentation/QemuIoTests

A simple but ugly way to fix this is for block.c to also have a
-zero_beyond_eof flag which enables the behavior you are adding.
qcow2_load_vmstate() would disable -zero_beyond_eof temporarily in
addition to enabling -growable.

It's not easy to call the internal qcow2_co_readv() from
qcow2_load_vmstate() because the vmstate functions are byte
granularity (like bdrv_pread()) while .bdrv_co_readv() is
sector-granularity.

Stefan



[Qemu-devel] [PATCH] block: Produce zeros when protocols reading beyond end of file

2013-08-05 Thread Asias He
From: MORITA Kazutaka morita.kazut...@lab.ntt.co.jp

While Asias is debugging an issue creating qcow2 images on top of
non-file protocols.  It boils down to this example using NBD:

$ qemu-io -c 'open -g nbd+unix:///?socket=/tmp/nbd.sock' -c 'read -v 0 512'

Notice the open -g option to set bs-growable.  This means you can
read/write beyond end of file.  Reading beyond end of file is supposed
to produce zeroes.

We rely on this behavior in qcow2_create2() during qcow2 image
creation.  We create a new file and then write the qcow2 header
structure using bdrv_pwrite().  Since QCowHeader is not a multiple of
sector size, block.c first uses bdrv_read() on the empty file to fetch
the first sector (should be all zeroes).

Here is the output from the qemu-io NBD example above:

$ qemu-io -c 'open -g nbd+unix:///?socket=/tmp/nbd.sock' -c 'read -v 0 512'
:  ab ab ab ab ab ab ab ab ab ab ab ab ab ab ab ab  
0010:  ab ab ab ab ab ab ab ab ab ab ab ab ab ab ab ab  
0020:  ab ab ab ab ab ab ab ab ab ab ab ab ab ab ab ab  
...

We are not zeroing the buffer!  As a result qcow2 image creation on top
of protocols is not guaranteed to work even when file creation is
supported by the protocol.

Signed-off-by: MORITA Kazutaka morita.kazut...@lab.ntt.co.jp
Signed-off-by: Asias He as...@redhat.com
---
 block.c | 30 +-
 1 file changed, 29 insertions(+), 1 deletion(-)

diff --git a/block.c b/block.c
index 01b66d8..deaf0a0 100644
--- a/block.c
+++ b/block.c
@@ -2544,7 +2544,35 @@ static int coroutine_fn 
bdrv_co_do_readv(BlockDriverState *bs,
 }
 }
 
-ret = drv-bdrv_co_readv(bs, sector_num, nb_sectors, qiov);
+if (!bs-drv-protocol_name) {
+ret = drv-bdrv_co_readv(bs, sector_num, nb_sectors, qiov);
+} else {
+/* NBD doesn't support reading beyond end of file. */
+int64_t len, total_sectors, max_nb_sectors;
+
+len = bdrv_getlength(bs);
+if (len  0) {
+ret = len;
+goto out;
+}
+
+total_sectors = len  BDRV_SECTOR_BITS;
+max_nb_sectors = MAX(0, total_sectors - sector_num);
+if (max_nb_sectors  0) {
+ret = drv-bdrv_co_readv(bs, sector_num,
+ MIN(nb_sectors, max_nb_sectors), qiov);
+} else {
+ret = 0;
+}
+
+/* Reading beyond end of file is supposed to produce zeroes */
+if (ret == 0  total_sectors  sector_num + nb_sectors) {
+size_t offset = MAX(0, total_sectors - sector_num);
+size_t bytes = (sector_num + nb_sectors - offset) *
+BDRV_SECTOR_SIZE;
+qemu_iovec_memset(qiov, offset * BDRV_SECTOR_SIZE, 0, bytes);
+}
+}
 
 out:
 tracked_request_end(req);
-- 
1.8.3.1




Re: [Qemu-devel] [PATCH] block: Produce zeros when protocols reading beyond end of file

2013-08-05 Thread Kevin Wolf
Am 05.08.2013 um 10:11 hat Asias He geschrieben:
 From: MORITA Kazutaka morita.kazut...@lab.ntt.co.jp
 
 While Asias is debugging an issue creating qcow2 images on top of
 non-file protocols.  It boils down to this example using NBD:
 
 $ qemu-io -c 'open -g nbd+unix:///?socket=/tmp/nbd.sock' -c 'read -v 0 512'
 
 Notice the open -g option to set bs-growable.  This means you can
 read/write beyond end of file.  Reading beyond end of file is supposed
 to produce zeroes.
 
 We rely on this behavior in qcow2_create2() during qcow2 image
 creation.  We create a new file and then write the qcow2 header
 structure using bdrv_pwrite().  Since QCowHeader is not a multiple of
 sector size, block.c first uses bdrv_read() on the empty file to fetch
 the first sector (should be all zeroes).
 
 Here is the output from the qemu-io NBD example above:
 
 $ qemu-io -c 'open -g nbd+unix:///?socket=/tmp/nbd.sock' -c 'read -v 0 512'
 :  ab ab ab ab ab ab ab ab ab ab ab ab ab ab ab ab  
 0010:  ab ab ab ab ab ab ab ab ab ab ab ab ab ab ab ab  
 0020:  ab ab ab ab ab ab ab ab ab ab ab ab ab ab ab ab  
 ...
 
 We are not zeroing the buffer!  As a result qcow2 image creation on top
 of protocols is not guaranteed to work even when file creation is
 supported by the protocol.
 
 Signed-off-by: MORITA Kazutaka morita.kazut...@lab.ntt.co.jp
 Signed-off-by: Asias He as...@redhat.com
 ---
  block.c | 30 +-
  1 file changed, 29 insertions(+), 1 deletion(-)
 
 diff --git a/block.c b/block.c
 index 01b66d8..deaf0a0 100644
 --- a/block.c
 +++ b/block.c
 @@ -2544,7 +2544,35 @@ static int coroutine_fn 
 bdrv_co_do_readv(BlockDriverState *bs,
  }
  }
  
 -ret = drv-bdrv_co_readv(bs, sector_num, nb_sectors, qiov);
 +if (!bs-drv-protocol_name) {

I think !bs-growable is the right check.

Checking for the protocol name is always a hack and most times wrong.

 +ret = drv-bdrv_co_readv(bs, sector_num, nb_sectors, qiov);
 +} else {
 +/* NBD doesn't support reading beyond end of file. */

This is not only for NBD, make it a neutral comment like:

/* Read zeros after EOF of growable BDSes */

 +int64_t len, total_sectors, max_nb_sectors;
 +
 +len = bdrv_getlength(bs);
 +if (len  0) {
 +ret = len;
 +goto out;
 +}
 +
 +total_sectors = len  BDRV_SECTOR_BITS;
 +max_nb_sectors = MAX(0, total_sectors - sector_num);
 +if (max_nb_sectors  0) {
 +ret = drv-bdrv_co_readv(bs, sector_num,
 + MIN(nb_sectors, max_nb_sectors), qiov);
 +} else {
 +ret = 0;
 +}
 +
 +/* Reading beyond end of file is supposed to produce zeroes */
 +if (ret == 0  total_sectors  sector_num + nb_sectors) {
 +size_t offset = MAX(0, total_sectors - sector_num);
 +size_t bytes = (sector_num + nb_sectors - offset) *
 +BDRV_SECTOR_SIZE;

uint64_t for both offset and bytes, size_t can be 32 bits.

 +qemu_iovec_memset(qiov, offset * BDRV_SECTOR_SIZE, 0, bytes);
 +}
 +}
  
  out:
  tracked_request_end(req);

Kevin