Re: [Qemu-block] raw iotest regressions in 2.12.0-rc0

2018-03-21 Thread Peter Xu
On Wed, Mar 21, 2018 at 05:58:48PM -0400, John Snow wrote:
> ./check -v -raw
> Failures: 109 132 136 148 152 183
> 
> 3fd2457d18edf5736f713dfe1ada9c87a9badab1 is the first bad commit
> commit 3fd2457d18edf5736f713dfe1ada9c87a9badab1
> Author: Peter Xu 
> Date:   Fri Mar 9 17:00:03 2018 +0800
> 
> monitor: enable IO thread for (qmp & !mux) typed
> 
> Start to use dedicate IO thread for QMP monitors that are not using
> MUXed chardev.
> 
> Reviewed-by: Fam Zheng 
> Reviewed-by: Stefan Hajnoczi 
> Signed-off-by: Peter Xu 
> Message-Id: <20180309090006.10018-21-pet...@redhat.com>
> Signed-off-by: Eric Blake 
> 
> 
> The symptom appears to be extra "RESUME" events in the stream that
> weren't expected by the original output for tests 109 and 183; the rest
> are python and I didn't dig yet.
> 
> ./check -v raw
> Failures: 055
> Failed 5 of 5 tests
> 
> 91ad45061af0fe44ac5dadb5bedaf4d7a08077c8 is the first bad commit
> commit 91ad45061af0fe44ac5dadb5bedaf4d7a08077c8
> Author: Peter Xu 
> Date:   Fri Mar 9 17:00:05 2018 +0800
> 
> tests: qmp-test: verify command batching
> 
> OOB introduced DROP event for flow control.  This should not affect old
> QMP clients.  Add a command batching check to make sure of it.
> 
> Reviewed-by: Stefan Hajnoczi 
> Signed-off-by: Peter Xu 
> Message-Id: <20180309090006.10018-23-pet...@redhat.com>
> Reviewed-by: Eric Blake 
> Signed-off-by: Eric Blake 
> 
> 
> 
> Maybe these are known, but I wanted to consolidate them for rc0 for
> something easy to search for. There are others for qcow2 which I'll post
> in a bit...!

I'll have a look on this today and see whether this is the same
problem as reported by Max.  Sorry for that, and thanks for reporting!

-- 
Peter Xu



Re: [Qemu-block] [Qemu-devel] [PATCH] scsi: turn "is this a SCSI device?" into a conditional hint

2018-03-21 Thread Fam Zheng
On Wed, 03/21 11:58, Paolo Bonzini wrote:
> diff --git a/hw/scsi/scsi-generic.c b/hw/scsi/scsi-generic.c
> index 7414fe2d67..b3de5df324 100644
> --- a/hw/scsi/scsi-generic.c
> +++ b/hw/scsi/scsi-generic.c
> @@ -500,9 +500,10 @@ static void scsi_generic_realize(SCSIDevice *s, Error 
> **errp)
>  /* check we are using a driver managing SG_IO (version 3 and after */
>  rc = blk_ioctl(s->conf.blk, SG_GET_VERSION_NUM, _version);
>  if (rc < 0) {
> -error_setg(errp, "cannot get SG_IO version number: %s.  "
> - "Is this a SCSI device?",
> - strerror(-rc));
> +error_setg(errp, "cannot get SG_IO version number: %s", 
> strerror(-rc));
> +if (rc != -EPERM) {
> +error_append_hint(errp, "Is this a SCSI device?");
> +}

Out of the scope of this patch but I find it a bit annoying that hints are not
propagated up in QMP. So that if you hot plug from virsh, the user friendliness
of the old message is lost. Not sure if that is a problem or not.

Fam



Re: [Qemu-block] [Qemu-devel] [Libguestfs] [PATCH] tests: regressions: make test-big-heap use a temporary empty file

2018-03-21 Thread Richard W.M. Jones
On Wed, Mar 21, 2018 at 03:58:05PM -0500, Eric Blake wrote:
> On 03/21/2018 03:44 PM, Kevin Wolf wrote:
> >>>
> >>>You're right that file locking on a character device like /dev/null is
> >>>not going to work as expected, but is it a case where fcntl() actually
> >>>fails, or is it worse where the fcntl() claiming the locks "succeeds"
> >>>but doesn't do what we want?  That is, what were the actual error
> >>>messages you ran into?
> >>
> >>$ qemu-img --version
> >>qemu-img version 2.10.1(qemu-2.10.1-2.fc27)
> >>Copyright (c) 2003-2017 Fabrice Bellard and the QEMU Project developers
> >>$ qemu-img info /dev/null
> >>qemu-img: Could not open '/dev/null': Failed to get "consistent read" lock
> >>Is another process using the image?
> >
> >Not sure where the difference is, but I can't reproduce this on
> >upstream, neither git master nor the v2.10.1 tag:
> 
> Is it a case where file locking actually works, and more than one
> process is trying to lock /dev/null at once?  (qemu-img info is
> short-lived, but could there be another longer-lived process also
> using /dev/null)?
> 
> Does using -r help (if the only reason you're telling qemu-img to
> operate on /dev/null is to probe qemu-img features, can you probe
> those same features without needing to write, which in turn requests
> less locking)?

The original test (before Pino's patch) runs ‘qemu-img info /dev/null’
purely as a way to fork qemu-img.  It's a regression test for some
problem we had with the code that confines qemu-img using resource
limits.  So really nothing much to do with qemu-img at all.  The fix
is to create a temporary file and run qemu-img against that.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-top is 'top' for virtual machines.  Tiny program with many
powerful monitoring features, net stats, disk stats, logging, etc.
http://people.redhat.com/~rjones/virt-top



[Qemu-block] raw iotest regressions in 2.12.0-rc0

2018-03-21 Thread John Snow
./check -v -raw
Failures: 109 132 136 148 152 183

3fd2457d18edf5736f713dfe1ada9c87a9badab1 is the first bad commit
commit 3fd2457d18edf5736f713dfe1ada9c87a9badab1
Author: Peter Xu 
Date:   Fri Mar 9 17:00:03 2018 +0800

monitor: enable IO thread for (qmp & !mux) typed

Start to use dedicate IO thread for QMP monitors that are not using
MUXed chardev.

Reviewed-by: Fam Zheng 
Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Peter Xu 
Message-Id: <20180309090006.10018-21-pet...@redhat.com>
Signed-off-by: Eric Blake 


The symptom appears to be extra "RESUME" events in the stream that
weren't expected by the original output for tests 109 and 183; the rest
are python and I didn't dig yet.

./check -v raw
Failures: 055
Failed 5 of 5 tests

91ad45061af0fe44ac5dadb5bedaf4d7a08077c8 is the first bad commit
commit 91ad45061af0fe44ac5dadb5bedaf4d7a08077c8
Author: Peter Xu 
Date:   Fri Mar 9 17:00:05 2018 +0800

tests: qmp-test: verify command batching

OOB introduced DROP event for flow control.  This should not affect old
QMP clients.  Add a command batching check to make sure of it.

Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Peter Xu 
Message-Id: <20180309090006.10018-23-pet...@redhat.com>
Reviewed-by: Eric Blake 
Signed-off-by: Eric Blake 



Maybe these are known, but I wanted to consolidate them for rc0 for
something easy to search for. There are others for qcow2 which I'll post
in a bit...!


Thanks,
--js



Re: [Qemu-block] [Qemu-devel] [Libguestfs] [PATCH] tests: regressions: make test-big-heap use a temporary empty file

2018-03-21 Thread Eric Blake

On 03/21/2018 03:44 PM, Kevin Wolf wrote:


You're right that file locking on a character device like /dev/null is
not going to work as expected, but is it a case where fcntl() actually
fails, or is it worse where the fcntl() claiming the locks "succeeds"
but doesn't do what we want?  That is, what were the actual error
messages you ran into?


$ qemu-img --version
qemu-img version 2.10.1(qemu-2.10.1-2.fc27)
Copyright (c) 2003-2017 Fabrice Bellard and the QEMU Project developers
$ qemu-img info /dev/null
qemu-img: Could not open '/dev/null': Failed to get "consistent read" lock
Is another process using the image?


Not sure where the difference is, but I can't reproduce this on
upstream, neither git master nor the v2.10.1 tag:


Is it a case where file locking actually works, and more than one 
process is trying to lock /dev/null at once?  (qemu-img info is 
short-lived, but could there be another longer-lived process also using 
/dev/null)?


Does using -r help (if the only reason you're telling qemu-img to 
operate on /dev/null is to probe qemu-img features, can you probe those 
same features without needing to write, which in turn requests less 
locking)?


--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



Re: [Qemu-block] [Libguestfs] [PATCH] tests: regressions: make test-big-heap use a temporary empty file

2018-03-21 Thread Kevin Wolf
Am 21.03.2018 um 14:48 hat Pino Toscano geschrieben:
> On Wednesday, 21 March 2018 14:45:38 CET Eric Blake wrote:
> > [adding qemu lists]
> > 
> > On 03/21/2018 07:51 AM, Richard W.M. Jones wrote:
> > > On Wed, Mar 21, 2018 at 01:44:17PM +0100, Pino Toscano wrote:
> > >> Newer versions of qemu use file locking for the images by default, and
> > >> apparently that does not work with /dev/null.  Since this test just
> > >> calls qemu-img to get the format of an empty image, create a temporary
> > >> one instead.
> > > 
> > > ACK, but feels like this is a bug in qemu-img to me.
> > 
> > You're right that file locking on a character device like /dev/null is 
> > not going to work as expected, but is it a case where fcntl() actually 
> > fails, or is it worse where the fcntl() claiming the locks "succeeds" 
> > but doesn't do what we want?  That is, what were the actual error 
> > messages you ran into?
> 
> $ qemu-img --version
> qemu-img version 2.10.1(qemu-2.10.1-2.fc27)
> Copyright (c) 2003-2017 Fabrice Bellard and the QEMU Project developers
> $ qemu-img info /dev/null 
> qemu-img: Could not open '/dev/null': Failed to get "consistent read" lock
> Is another process using the image?

Not sure where the difference is, but I can't reproduce this on
upstream, neither git master nor the v2.10.1 tag:

$ ./qemu-img --version
qemu-img version 2.10.1 (v2.10.1-dirty)
Copyright (c) 2003-2017 Fabrice Bellard and the QEMU Project developers
$ ./qemu-img info /dev/null
image: /dev/null
file format: raw
virtual size: 0 (0 bytes)
disk size: 0

Also with strace:

open("/dev/null", O_RDONLY|O_CLOEXEC)   = 10
fcntl(10, F_OFD_SETLK, {l_type=F_RDLCK, l_whence=SEEK_SET, l_start=100, 
l_len=1}) = 0
...

So my kernel (4.15.9-200.fc26.x86_64) seems to have no problems with
locking /dev/null.

Kevin


signature.asc
Description: PGP signature


Re: [Qemu-block] [Qemu-devel] [PATCH for-2.12 v5] file-posix: specify expected filetypes

2018-03-21 Thread Kevin Wolf
Am 21.03.2018 um 21:26 hat Eric Blake geschrieben:
> On 03/21/2018 03:01 PM, John Snow wrote:
> > Adjust each caller of raw_open_common to specify if they are expecting
> > host and character devices or not. Tighten expectations of file types upon
> > open in the common code and refuse types that are not expected.
> > 
> > This has two effects:
> > 
> > (1) Character and block devices are now considered deprecated for the
> >  'file' driver, which expects only S_IFREG, and
> 
> Which may have interesting effects on v2v's attempt to use qemu-img on
> /dev/null as a quick feature probe of qemu-img...
> https://lists.gnu.org/archive/html/qemu-devel/2018-03/msg05753.html
> 
> but we may want to special-case that one in a separate patch.

Do we know the exact check is performs? I doubt it explicitly specifies
the file driver, and the automatic detection already returned the right
protocol driver.

Kevin



Re: [Qemu-block] [Qemu-devel] [PATCH for-2.12 v5] file-posix: specify expected filetypes

2018-03-21 Thread Eric Blake

On 03/21/2018 03:01 PM, John Snow wrote:

Adjust each caller of raw_open_common to specify if they are expecting
host and character devices or not. Tighten expectations of file types upon
open in the common code and refuse types that are not expected.

This has two effects:

(1) Character and block devices are now considered deprecated for the
 'file' driver, which expects only S_IFREG, and


Which may have interesting effects on v2v's attempt to use qemu-img on 
/dev/null as a quick feature probe of qemu-img...

https://lists.gnu.org/archive/html/qemu-devel/2018-03/msg05753.html

but we may want to special-case that one in a separate patch.


(2) no file-posix driver (file, host_cdrom, or host_device) can open
 directories now.

I don't think there's a legitimate reason to open directories as if
they were files. This prevents QEMU from opening and attempting to probe
a directory inode, which can break in exciting ways. One of those ways
is lseek on ext4/xfs, which will return 0x7fff as the file
size instead of EISDIR. This can coax QEMU into responding with a
confusing "file too big" instead of "Hey, that's not a file".

See: https://bugs.launchpad.net/qemu/+bug/1739304/
Signed-off-by: John Snow 
---

v5: rebase for 2.12.0-rc0


Still makes sense as a 2.12 bugfix.

Reviewed-by: Eric Blake 

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



Re: [Qemu-block] [PATCH v5] file-posix: specify expected filetypes

2018-03-21 Thread Kevin Wolf
Am 21.03.2018 um 21:01 hat John Snow geschrieben:
> Adjust each caller of raw_open_common to specify if they are expecting
> host and character devices or not. Tighten expectations of file types upon
> open in the common code and refuse types that are not expected.
> 
> This has two effects:
> 
> (1) Character and block devices are now considered deprecated for the
> 'file' driver, which expects only S_IFREG, and
> (2) no file-posix driver (file, host_cdrom, or host_device) can open
> directories now.
> 
> I don't think there's a legitimate reason to open directories as if
> they were files. This prevents QEMU from opening and attempting to probe
> a directory inode, which can break in exciting ways. One of those ways
> is lseek on ext4/xfs, which will return 0x7fff as the file
> size instead of EISDIR. This can coax QEMU into responding with a
> confusing "file too big" instead of "Hey, that's not a file".
> 
> See: https://bugs.launchpad.net/qemu/+bug/1739304/
> Signed-off-by: John Snow 
> ---
> 
> v5: rebase for 2.12.0-rc0
> 
>  block/file-posix.c | 37 +
>  qemu-doc.texi  |  6 ++
>  2 files changed, 35 insertions(+), 8 deletions(-)
> 
> diff --git a/block/file-posix.c b/block/file-posix.c
> index d7fb772c14..31d9afe026 100644
> --- a/block/file-posix.c
> +++ b/block/file-posix.c
> @@ -420,7 +420,8 @@ static QemuOptsList raw_runtime_opts = {
>  };
>  
>  static int raw_open_common(BlockDriverState *bs, QDict *options,
> -   int bdrv_flags, int open_flags, Error **errp)
> +   int bdrv_flags, int open_flags,
> +   bool device, Error **errp)
>  {
>  BDRVRawState *s = bs->opaque;
>  QemuOpts *opts;
> @@ -558,10 +559,30 @@ static int raw_open_common(BlockDriverState *bs, QDict 
> *options,
>  error_setg_errno(errp, errno, "Could not stat file");
>  goto fail;
>  }
> -if (S_ISREG(st.st_mode)) {
> -s->discard_zeroes = true;
> -s->has_fallocate = true;
> +
> +if (!device) {
> +if (S_ISBLK(st.st_mode)) {
> +warn_report("Opening a block device as file using 'file' "
> +"driver is deprecated");
> +} else if (S_ISCHR(st.st_mode)) {
> +warn_report("Opening a character device as file using the 'file' 
> "
> +"driver is deprecated");
> +} else if (!S_ISREG(st.st_mode)) {
> +error_setg(errp, "A regular file was expected by the 'file' 
> driver, "
> +   "but something else was given");
> +goto fail;

ret needs to be set here, otherwise we return success. In my test, I
still got the wrong message: "Could not refresh total sector count:
Invalid argument"

> +} else {
> +s->discard_zeroes = true;
> +s->has_fallocate = true;
> +}
> +} else {
> +if (!(S_ISCHR(st.st_mode) || S_ISBLK(st.st_mode))) {
> +error_setg(errp, "host_device/host_cdrom driver expects either "
> +   "a character or block device");
> +goto fail;

Same here.

> +}
>  }

Do we want a qemu-iotests case for this?

Kevin



Re: [Qemu-block] [PATCH v5] file-posix: specify expected filetypes

2018-03-21 Thread John Snow


On 03/21/2018 04:25 PM, Kevin Wolf wrote:
> Am 21.03.2018 um 21:01 hat John Snow geschrieben:
>> Adjust each caller of raw_open_common to specify if they are expecting
>> host and character devices or not. Tighten expectations of file types upon
>> open in the common code and refuse types that are not expected.
>>
>> This has two effects:
>>
>> (1) Character and block devices are now considered deprecated for the
>> 'file' driver, which expects only S_IFREG, and
>> (2) no file-posix driver (file, host_cdrom, or host_device) can open
>> directories now.
>>
>> I don't think there's a legitimate reason to open directories as if
>> they were files. This prevents QEMU from opening and attempting to probe
>> a directory inode, which can break in exciting ways. One of those ways
>> is lseek on ext4/xfs, which will return 0x7fff as the file
>> size instead of EISDIR. This can coax QEMU into responding with a
>> confusing "file too big" instead of "Hey, that's not a file".
>>
>> See: https://bugs.launchpad.net/qemu/+bug/1739304/
>> Signed-off-by: John Snow 
>> ---
>>
>> v5: rebase for 2.12.0-rc0
>>
>>  block/file-posix.c | 37 +
>>  qemu-doc.texi  |  6 ++
>>  2 files changed, 35 insertions(+), 8 deletions(-)
>>
>> diff --git a/block/file-posix.c b/block/file-posix.c
>> index d7fb772c14..31d9afe026 100644
>> --- a/block/file-posix.c
>> +++ b/block/file-posix.c
>> @@ -420,7 +420,8 @@ static QemuOptsList raw_runtime_opts = {
>>  };
>>  
>>  static int raw_open_common(BlockDriverState *bs, QDict *options,
>> -   int bdrv_flags, int open_flags, Error **errp)
>> +   int bdrv_flags, int open_flags,
>> +   bool device, Error **errp)
>>  {
>>  BDRVRawState *s = bs->opaque;
>>  QemuOpts *opts;
>> @@ -558,10 +559,30 @@ static int raw_open_common(BlockDriverState *bs, QDict 
>> *options,
>>  error_setg_errno(errp, errno, "Could not stat file");
>>  goto fail;
>>  }
>> -if (S_ISREG(st.st_mode)) {
>> -s->discard_zeroes = true;
>> -s->has_fallocate = true;
>> +
>> +if (!device) {
>> +if (S_ISBLK(st.st_mode)) {
>> +warn_report("Opening a block device as file using 'file' "
>> +"driver is deprecated");
>> +} else if (S_ISCHR(st.st_mode)) {
>> +warn_report("Opening a character device as file using the 
>> 'file' "
>> +"driver is deprecated");
>> +} else if (!S_ISREG(st.st_mode)) {
>> +error_setg(errp, "A regular file was expected by the 'file' 
>> driver, "
>> +   "but something else was given");
>> +goto fail;
> 
> ret needs to be set here, otherwise we return success. In my test, I
> still got the wrong message: "Could not refresh total sector count:
> Invalid argument"
> 
>> +} else {
>> +s->discard_zeroes = true;
>> +s->has_fallocate = true;
>> +}
>> +} else {
>> +if (!(S_ISCHR(st.st_mode) || S_ISBLK(st.st_mode))) {
>> +error_setg(errp, "host_device/host_cdrom driver expects either "
>> +   "a character or block device");
>> +goto fail;
> 
> Same here.
> 
>> +}
>>  }
> 
> Do we want a qemu-iotests case for this?
> 
> Kevin
> 

I'll take the hint :)

--js



[Qemu-block] [PATCH v5] file-posix: specify expected filetypes

2018-03-21 Thread John Snow
Adjust each caller of raw_open_common to specify if they are expecting
host and character devices or not. Tighten expectations of file types upon
open in the common code and refuse types that are not expected.

This has two effects:

(1) Character and block devices are now considered deprecated for the
'file' driver, which expects only S_IFREG, and
(2) no file-posix driver (file, host_cdrom, or host_device) can open
directories now.

I don't think there's a legitimate reason to open directories as if
they were files. This prevents QEMU from opening and attempting to probe
a directory inode, which can break in exciting ways. One of those ways
is lseek on ext4/xfs, which will return 0x7fff as the file
size instead of EISDIR. This can coax QEMU into responding with a
confusing "file too big" instead of "Hey, that's not a file".

See: https://bugs.launchpad.net/qemu/+bug/1739304/
Signed-off-by: John Snow 
---

v5: rebase for 2.12.0-rc0

 block/file-posix.c | 37 +
 qemu-doc.texi  |  6 ++
 2 files changed, 35 insertions(+), 8 deletions(-)

diff --git a/block/file-posix.c b/block/file-posix.c
index d7fb772c14..31d9afe026 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -420,7 +420,8 @@ static QemuOptsList raw_runtime_opts = {
 };
 
 static int raw_open_common(BlockDriverState *bs, QDict *options,
-   int bdrv_flags, int open_flags, Error **errp)
+   int bdrv_flags, int open_flags,
+   bool device, Error **errp)
 {
 BDRVRawState *s = bs->opaque;
 QemuOpts *opts;
@@ -558,10 +559,30 @@ static int raw_open_common(BlockDriverState *bs, QDict 
*options,
 error_setg_errno(errp, errno, "Could not stat file");
 goto fail;
 }
-if (S_ISREG(st.st_mode)) {
-s->discard_zeroes = true;
-s->has_fallocate = true;
+
+if (!device) {
+if (S_ISBLK(st.st_mode)) {
+warn_report("Opening a block device as file using 'file' "
+"driver is deprecated");
+} else if (S_ISCHR(st.st_mode)) {
+warn_report("Opening a character device as file using the 'file' "
+"driver is deprecated");
+} else if (!S_ISREG(st.st_mode)) {
+error_setg(errp, "A regular file was expected by the 'file' 
driver, "
+   "but something else was given");
+goto fail;
+} else {
+s->discard_zeroes = true;
+s->has_fallocate = true;
+}
+} else {
+if (!(S_ISCHR(st.st_mode) || S_ISBLK(st.st_mode))) {
+error_setg(errp, "host_device/host_cdrom driver expects either "
+   "a character or block device");
+goto fail;
+}
 }
+
 if (S_ISBLK(st.st_mode)) {
 #ifdef BLKDISCARDZEROES
 unsigned int arg;
@@ -614,7 +635,7 @@ static int raw_open(BlockDriverState *bs, QDict *options, 
int flags,
 BDRVRawState *s = bs->opaque;
 
 s->type = FTYPE_FILE;
-return raw_open_common(bs, options, flags, 0, errp);
+return raw_open_common(bs, options, flags, 0, false, errp);
 }
 
 typedef enum {
@@ -2611,7 +2632,7 @@ hdev_open_Mac_error:
 
 s->type = FTYPE_FILE;
 
-ret = raw_open_common(bs, options, flags, 0, _err);
+ret = raw_open_common(bs, options, flags, 0, true, _err);
 if (ret < 0) {
 error_propagate(errp, local_err);
 #if defined(__APPLE__) && defined(__MACH__)
@@ -2838,7 +2859,7 @@ static int cdrom_open(BlockDriverState *bs, QDict 
*options, int flags,
 s->type = FTYPE_CD;
 
 /* open will not fail even if no CD is inserted, so add O_NONBLOCK */
-return raw_open_common(bs, options, flags, O_NONBLOCK, errp);
+return raw_open_common(bs, options, flags, O_NONBLOCK, true, errp);
 }
 
 static int cdrom_probe_device(const char *filename)
@@ -2951,7 +2972,7 @@ static int cdrom_open(BlockDriverState *bs, QDict 
*options, int flags,
 
 s->type = FTYPE_CD;
 
-ret = raw_open_common(bs, options, flags, 0, _err);
+ret = raw_open_common(bs, options, flags, 0, true, _err);
 if (ret) {
 error_propagate(errp, local_err);
 return ret;
diff --git a/qemu-doc.texi b/qemu-doc.texi
index 89fa80518a..ff4a098fd7 100644
--- a/qemu-doc.texi
+++ b/qemu-doc.texi
@@ -2699,6 +2699,12 @@ that can be specified with the ``-device'' parameter.
 The drive addr argument is replaced by the the addr argument
 that can be specified with the ``-device'' parameter.
 
+@subsection -drive file=json:@{...@{'driver':'file'@}@} (since 2.12.0)
+
+The 'file' driver for drives is no longer appropriate for character or host
+devices and will only accept regular files (S_IFREG). The correct driver
+for these file types is 'host_cdrom' or 'host_device' as appropriate.
+
 @subsection -usbdevice (since 2.10.0)
 
 The ``-usbdevice DEV'' argument is now a synonym for setting
-- 
2.14.3




Re: [Qemu-block] [Qemu-devel] [PATCH 2/5] ide: push end_transfer callback to ide_transfer_halt

2018-03-21 Thread John Snow


On 03/21/2018 01:39 AM, Paolo Bonzini wrote:
> On 20/03/2018 23:11, John Snow wrote:
>> Seems sane, with some lingering questions about what the PIO Setup FIS
>> is supposed to do to begin with still remaining.
> 
> I agree here too.  Smashing the ATA and controller in the same device
> makes things tricky, so I tried to make these patches pure code motion,
> as much as I could.
> 
> Paolo
> 

Oh, I understand. There's a reason I haven't touched the PIO Setup FIS
much when I cleaned the device up before.

I was just hoping you'd magically know more about it.

:)



Re: [Qemu-block] [PATCH for-2.12 v2] qcow2: Reset free_cluster_index when allocating a new refcount block

2018-03-21 Thread Kevin Wolf
Am 21.03.2018 um 14:38 hat Alberto Garcia geschrieben:
> When we try to allocate new clusters we first look for available ones
> starting from s->free_cluster_index and once we find them we increase
> their reference counts. Before we get to call update_refcount() to do
> this last step s->free_cluster_index is already pointing to the next
> cluster after the ones we are trying to allocate.
> 
> During update_refcount() it may happen however that we also need to
> allocate a new refcount block in order to store the refcounts of these
> new clusters (and to complicate things further that may also require
> us to grow the refcount table). After all this we don't know if the
> clusters that we originally tried to allocate are still available, so
> we return -EAGAIN to ask the caller to restart the search for free
> clusters.
> 
> This is what can happen in a common scenario:
> 
>   1) We want to allocate a new cluster and we see that cluster N is
>  free.
> 
>   2) We try to increase N's refcount but all refcount blocks are full,
>  so we allocate a new one at N+1 (where s->free_cluster_index was
>  pointing at).
> 
>   3) Once we're done we return -EAGAIN to look again for a free
>  cluster, but now s->free_cluster_index points at N+2, so that's
>  the one we allocate. Cluster N remains unallocated and we have a
>  hole in the qcow2 file.
> 
> This can be reproduced easily:
> 
>  qemu-img create -f qcow2 -o cluster_size=512 hd.qcow2 1M
>  qemu-io -c 'write 0 124k' hd.qcow2
> 
> After this the image has 132608 bytes (256 clusters), and the refcount
> block is full. If we write 512 more bytes it should allocate two new
> clusters: the data cluster itself and a new refcount block.
> 
>  qemu-io -c 'write 124k 512' hd.qcow2
> 
> However the image has now three new clusters (259 in total), and the
> first one of them is empty (and unallocated):
> 
>  dd if=hd.qcow2 bs=512c skip=256 count=1 | hexdump -C
> 
> If we write larger amounts of data in the last step instead of the 512
> bytes used in this example we can create larger holes in the qcow2
> file.
> 
> What this patch does is reset s->free_cluster_index to its previous
> value when alloc_refcount_block() returns -EAGAIN. This way the caller
> will try to allocate again the original clusters if they are still
> free.
> 
> The output of iotest 026 also needs to be updated because now that
> images have no holes some tests fail at a different point and the
> number of leaked clusters is different.
> 
> Signed-off-by: Alberto Garcia 
> Reviewed-by: Eric Blake 

Thanks, applied to the block branch.

Kevin



[Qemu-block] [PATCH for-2.12 v2 11/12] vhdx: Check for 4 GB maximum log size on creation

2018-03-21 Thread Kevin Wolf
It's unclear what the real maximum is, but we use an uint32_t to store
the log size in vhdx_co_create(), so we should check that the given
value fits in 32 bits.

Signed-off-by: Kevin Wolf 
Reviewed-by: Eric Blake 
Reviewed-by: Jeff Cody 
---
 block/vhdx.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/block/vhdx.c b/block/vhdx.c
index 0e48179b81..a1a0302799 100644
--- a/block/vhdx.c
+++ b/block/vhdx.c
@@ -1829,6 +1829,10 @@ static int coroutine_fn 
vhdx_co_create(BlockdevCreateOptions *opts,
 if (!vhdx_opts->has_log_size) {
 log_size = DEFAULT_LOG_SIZE;
 } else {
+if (vhdx_opts->log_size > UINT32_MAX) {
+error_setg(errp, "Log size must be smaller than 4 GB");
+return -EINVAL;
+}
 log_size = vhdx_opts->log_size;
 }
 if (log_size < MiB || (log_size % MiB) != 0) {
-- 
2.13.6




[Qemu-block] [PATCH for-2.12 v2 12/12] qemu-iotests: Test vhdx image creation with QMP

2018-03-21 Thread Kevin Wolf
Signed-off-by: Kevin Wolf 
---
 tests/qemu-iotests/213 | 349 +
 tests/qemu-iotests/213.out | 121 
 tests/qemu-iotests/group   |   1 +
 3 files changed, 471 insertions(+)
 create mode 100755 tests/qemu-iotests/213
 create mode 100644 tests/qemu-iotests/213.out

diff --git a/tests/qemu-iotests/213 b/tests/qemu-iotests/213
new file mode 100755
index 00..3a00a0f6d6
--- /dev/null
+++ b/tests/qemu-iotests/213
@@ -0,0 +1,349 @@
+#!/bin/bash
+#
+# Test vhdx and file image creation
+#
+# Copyright (C) 2018 Red Hat, Inc.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see .
+#
+
+# creator
+owner=kw...@redhat.com
+
+seq=`basename $0`
+echo "QA output created by $seq"
+
+here=`pwd`
+status=1   # failure is the default!
+
+# get standard environment, filters and checks
+. ./common.rc
+. ./common.filter
+
+_supported_fmt vhdx
+_supported_proto file
+_supported_os Linux
+
+function do_run_qemu()
+{
+echo Testing: "$@"
+$QEMU -nographic -qmp stdio -serial none "$@"
+echo
+}
+
+function run_qemu()
+{
+do_run_qemu "$@" 2>&1 | _filter_testdir | _filter_qmp \
+  | _filter_qemu | _filter_imgfmt \
+  | _filter_actual_image_size
+}
+
+echo
+echo "=== Successful image creation (defaults) ==="
+echo
+
+size=$((128 * 1024 * 1024))
+
+run_qemu <

[Qemu-block] [PATCH for-2.12 v2 07/12] parallels: Check maximum cluster size on create

2018-03-21 Thread Kevin Wolf
It's unclear what the real maximum cluster size is for the Parallels
format, but let's at least make sure that we don't get integer
overflows in our .bdrv_co_create implementation.

Signed-off-by: Kevin Wolf 
Reviewed-by: Eric Blake 
---
 block/parallels.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/block/parallels.c b/block/parallels.c
index 2da5e56a9d..5cf16a3b3c 100644
--- a/block/parallels.c
+++ b/block/parallels.c
@@ -526,6 +526,11 @@ static int coroutine_fn 
parallels_co_create(BlockdevCreateOptions* opts,
 cl_size = DEFAULT_CLUSTER_SIZE;
 }
 
+/* XXX What is the real limit here? This is an insanely large maximum. */
+if (cl_size >= INT64_MAX / MAX_PARALLELS_IMAGE_FACTOR) {
+error_setg(errp, "Cluster size is too large");
+return -EINVAL;
+}
 if (total_size >= MAX_PARALLELS_IMAGE_FACTOR * cl_size) {
 error_setg(errp, "Image size is too large for this cluster size");
 return -E2BIG;
-- 
2.13.6




Re: [Qemu-block] [PATCH for-2.12 v2 01/12] vdi: Change 'static' create option to 'preallocation' in QMP

2018-03-21 Thread Eric Blake

On 03/21/2018 12:37 PM, Kevin Wolf wrote:

What static=on really does is what we call metadata preallocation for
other block drivers. While we can still change the QMP interface, make
it more consistent by using 'preallocation' for VDI, too.

This doesn't implement any new functionality, so the only supported
preallocation modes are 'off' and 'metadata' for now.


Disagrees with:



Signed-off-by: Kevin Wolf 
---
  qapi/block-core.json |  7 +++
  block/vdi.c  | 24 ++--
  2 files changed, 25 insertions(+), 6 deletions(-)



+# @preallocationPreallocation mode for the new image (allowed values: off,
+#   full; default: off)


this.



+if (!vdi_opts->has_preallocation) {
+vdi_opts->preallocation = PREALLOC_MODE_OFF;
+}
+switch (vdi_opts->preallocation) {
+case PREALLOC_MODE_OFF:
+image_type = VDI_TYPE_DYNAMIC;
+break;
+case PREALLOC_MODE_METADATA:
  image_type = VDI_TYPE_STATIC;
+break;
+default:
+error_setg(errp, "Preallocation mode not supported for vdi");


but matches this.  Fix your QMP documentation with s/full/metadata/ and 
you can have:


Reviewed-by: Eric Blake 

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



[Qemu-block] [PATCH for-2.12 v2 05/12] luks: Turn another invalid assertion into check

2018-03-21 Thread Kevin Wolf
Commit e39e959e fixed an invalid assertion in the .bdrv_length
implementation, but left a similar assertion in place for
.bdrv_truncate. Instead of crashing when the user requests a too large
image size, fail gracefully.

A file size of exactly INT64_MAX caused failure before, but is actually
legal.

Signed-off-by: Kevin Wolf 
Reviewed-by: Eric Blake 
Reviewed-by: Daniel P. Berrangé 
---
 block/crypto.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/block/crypto.c b/block/crypto.c
index e0b8856f74..bc6c7e3795 100644
--- a/block/crypto.c
+++ b/block/crypto.c
@@ -357,7 +357,11 @@ static int block_crypto_truncate(BlockDriverState *bs, 
int64_t offset,
 BlockCrypto *crypto = bs->opaque;
 uint64_t payload_offset =
 qcrypto_block_get_payload_offset(crypto->block);
-assert(payload_offset < (INT64_MAX - offset));
+
+if (payload_offset > INT64_MAX - offset) {
+error_setg(errp, "The requested file size is too large");
+return -EFBIG;
+}
 
 offset += payload_offset;
 
-- 
2.13.6




[Qemu-block] [PATCH for-2.12 v2 04/12] qemu-iotests: Enable 025 for luks

2018-03-21 Thread Kevin Wolf
We want to test resizing even for luks. The only change that is needed
is to explicitly zero out new space for luks because it's undefined.

Signed-off-by: Kevin Wolf 
Reviewed-by: Eric Blake 
Reviewed-by: Daniel P. Berrangé 
---
 tests/qemu-iotests/025 | 9 -
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/tests/qemu-iotests/025 b/tests/qemu-iotests/025
index f5e672e6b3..70dd5f10aa 100755
--- a/tests/qemu-iotests/025
+++ b/tests/qemu-iotests/025
@@ -38,7 +38,7 @@ trap "_cleanup; exit \$status" 0 1 2 3 15
 . ./common.filter
 . ./common.pattern
 
-_supported_fmt raw qcow2 qed
+_supported_fmt raw qcow2 qed luks
 _supported_proto file sheepdog rbd nfs
 _supported_os Linux
 
@@ -62,6 +62,13 @@ length
 EOF
 _check_test_img
 
+# bdrv_truncate() doesn't zero the new space, so we need to do that explicitly.
+# We still want to test automatic zeroing for other formats even though
+# bdrv_truncate() doesn't guarantee it.
+if [ "$IMGFMT" == "luks" ]; then
+$QEMU_IO -c "write -z $small_size $((big_size - small_size))" "$TEST_IMG" 
> /dev/null
+fi
+
 echo
 echo "=== Verifying image size after reopen"
 $QEMU_IO -c "length" "$TEST_IMG"
-- 
2.13.6




[Qemu-block] [PATCH for-2.12 v2 03/12] qemu-iotests: Test vdi image creation with QMP

2018-03-21 Thread Kevin Wolf
Signed-off-by: Kevin Wolf 
---
 tests/qemu-iotests/211 | 246 +
 tests/qemu-iotests/211.out |  97 ++
 tests/qemu-iotests/group   |   1 +
 3 files changed, 344 insertions(+)
 create mode 100755 tests/qemu-iotests/211
 create mode 100644 tests/qemu-iotests/211.out

diff --git a/tests/qemu-iotests/211 b/tests/qemu-iotests/211
new file mode 100755
index 00..1edec26517
--- /dev/null
+++ b/tests/qemu-iotests/211
@@ -0,0 +1,246 @@
+#!/bin/bash
+#
+# Test VDI and file image creation
+#
+# Copyright (C) 2018 Red Hat, Inc.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see .
+#
+
+# creator
+owner=kw...@redhat.com
+
+seq=`basename $0`
+echo "QA output created by $seq"
+
+here=`pwd`
+status=1   # failure is the default!
+
+# get standard environment, filters and checks
+. ./common.rc
+. ./common.filter
+
+_supported_fmt vdi
+_supported_proto file
+_supported_os Linux
+
+function do_run_qemu()
+{
+echo Testing: "$@"
+$QEMU -nographic -qmp stdio -serial none "$@"
+echo
+}
+
+function run_qemu()
+{
+do_run_qemu "$@" 2>&1 | _filter_testdir | _filter_qmp \
+  | _filter_qemu | _filter_imgfmt \
+  | _filter_actual_image_size
+}
+
+echo
+echo "=== Successful image creation (defaults) ==="
+echo
+
+size=$((128 * 1024 * 1024))
+
+run_qemu <

[Qemu-block] [PATCH for-2.12 v2 08/12] qemu-iotests: Test parallels image creation with QMP

2018-03-21 Thread Kevin Wolf
Signed-off-by: Kevin Wolf 
---
 tests/qemu-iotests/212 | 326 +
 tests/qemu-iotests/212.out | 111 +++
 tests/qemu-iotests/group   |   1 +
 3 files changed, 438 insertions(+)
 create mode 100755 tests/qemu-iotests/212
 create mode 100644 tests/qemu-iotests/212.out

diff --git a/tests/qemu-iotests/212 b/tests/qemu-iotests/212
new file mode 100755
index 00..e5a1ba77ce
--- /dev/null
+++ b/tests/qemu-iotests/212
@@ -0,0 +1,326 @@
+#!/bin/bash
+#
+# Test parallels and file image creation
+#
+# Copyright (C) 2018 Red Hat, Inc.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see .
+#
+
+# creator
+owner=kw...@redhat.com
+
+seq=`basename $0`
+echo "QA output created by $seq"
+
+here=`pwd`
+status=1   # failure is the default!
+
+# get standard environment, filters and checks
+. ./common.rc
+. ./common.filter
+
+_supported_fmt parallels
+_supported_proto file
+_supported_os Linux
+
+function do_run_qemu()
+{
+echo Testing: "$@"
+$QEMU -nographic -qmp stdio -serial none "$@"
+echo
+}
+
+function run_qemu()
+{
+do_run_qemu "$@" 2>&1 | _filter_testdir | _filter_qmp \
+  | _filter_qemu | _filter_imgfmt \
+  | _filter_actual_image_size
+}
+
+echo
+echo "=== Successful image creation (defaults) ==="
+echo
+
+size=$((128 * 1024 * 1024))
+
+run_qemu <

[Qemu-block] [PATCH for-2.12 v2 10/12] vhdx: Don't use error_setg_errno() with constant errno

2018-03-21 Thread Kevin Wolf
error_setg_errno() is meant for cases where we got an errno from the OS
that can add useful extra information to an error message. It's
pointless if we pass a constant errno, these cases should use plain
error_setg().

Signed-off-by: Kevin Wolf 
Reviewed-by: Eric Blake 
Reviewed-by: Jeff Cody 
---
 block/vhdx.c | 9 -
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/block/vhdx.c b/block/vhdx.c
index ca211a3196..0e48179b81 100644
--- a/block/vhdx.c
+++ b/block/vhdx.c
@@ -1822,7 +1822,7 @@ static int coroutine_fn 
vhdx_co_create(BlockdevCreateOptions *opts,
 /* Validate options and set default values */
 image_size = vhdx_opts->size;
 if (image_size > VHDX_MAX_IMAGE_SIZE) {
-error_setg_errno(errp, EINVAL, "Image size too large; max of 64TB");
+error_setg(errp, "Image size too large; max of 64TB");
 return -EINVAL;
 }
 
@@ -1832,7 +1832,7 @@ static int coroutine_fn 
vhdx_co_create(BlockdevCreateOptions *opts,
 log_size = vhdx_opts->log_size;
 }
 if (log_size < MiB || (log_size % MiB) != 0) {
-error_setg_errno(errp, EINVAL, "Log size must be a multiple of 1 MB");
+error_setg(errp, "Log size must be a multiple of 1 MB");
 return -EINVAL;
 }
 
@@ -1874,7 +1874,7 @@ static int coroutine_fn 
vhdx_co_create(BlockdevCreateOptions *opts,
 }
 
 if (block_size < MiB || (block_size % MiB) != 0) {
-error_setg_errno(errp, EINVAL, "Block size must be a multiple of 1 
MB");
+error_setg(errp, "Block size must be a multiple of 1 MB");
 return -EINVAL;
 }
 if (!is_power_of_2(block_size)) {
@@ -1882,8 +1882,7 @@ static int coroutine_fn 
vhdx_co_create(BlockdevCreateOptions *opts,
 return -EINVAL;
 }
 if (block_size > VHDX_BLOCK_SIZE_MAX) {
-error_setg_errno(errp, EINVAL, "Block size must not exceed %d",
- VHDX_BLOCK_SIZE_MAX);
+error_setg(errp, "Block size must not exceed %d", VHDX_BLOCK_SIZE_MAX);
 return -EINVAL;
 }
 
-- 
2.13.6




[Qemu-block] [PATCH for-2.12 v2 06/12] qemu-iotests: Test invalid resize on luks

2018-03-21 Thread Kevin Wolf
This tests that the .bdrv_truncate implementation for luks doesn't crash
for invalid image sizes.

Signed-off-by: Kevin Wolf 
---
 tests/qemu-iotests/210 | 37 +
 tests/qemu-iotests/210.out | 16 
 2 files changed, 53 insertions(+)

diff --git a/tests/qemu-iotests/210 b/tests/qemu-iotests/210
index 96a5213e77..e607c0d296 100755
--- a/tests/qemu-iotests/210
+++ b/tests/qemu-iotests/210
@@ -204,6 +204,43 @@ run_qemu -blockdev 
driver=file,filename="$TEST_IMG_FILE",node-name=node0 \
 { "execute": "quit" }
 EOF
 
+echo
+echo "=== Resize image with invalid sizes ==="
+echo
+
+run_qemu -blockdev driver=file,filename="$TEST_IMG_FILE",node-name=node0 \
+ -blockdev driver=luks,file=node0,key-secret=keysec0,node-name=node1 \
+ -object secret,id=keysec0,data="foo" <0 
size"}}
+{"return": {}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": 
"SHUTDOWN", "data": {"guest": false}}
+
+image: json:{"driver": "IMGFMT", "file": {"driver": "file", "filename": 
"TEST_DIR/t.IMGFMT"}, "key-secret": "keysec0"}
+file format: IMGFMT
+virtual size: 0 (0 bytes)
 *** done
-- 
2.13.6




[Qemu-block] [PATCH for-2.12 v2 09/12] vhdx: Require power-of-two block size on create

2018-03-21 Thread Kevin Wolf
Images with a non-power-of-two block size are invalid and cannot be
opened. Reject such block sizes when creating an image.

Signed-off-by: Kevin Wolf 
Reviewed-by: Eric Blake 
Reviewed-by: Jeff Cody 
---
 block/vhdx.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/block/vhdx.c b/block/vhdx.c
index f1b97f4b49..ca211a3196 100644
--- a/block/vhdx.c
+++ b/block/vhdx.c
@@ -1877,6 +1877,10 @@ static int coroutine_fn 
vhdx_co_create(BlockdevCreateOptions *opts,
 error_setg_errno(errp, EINVAL, "Block size must be a multiple of 1 
MB");
 return -EINVAL;
 }
+if (!is_power_of_2(block_size)) {
+error_setg(errp, "Block size must be a power of two");
+return -EINVAL;
+}
 if (block_size > VHDX_BLOCK_SIZE_MAX) {
 error_setg_errno(errp, EINVAL, "Block size must not exceed %d",
  VHDX_BLOCK_SIZE_MAX);
-- 
2.13.6




[Qemu-block] [PATCH for-2.12 v2 02/12] vdi: Fix build with CONFIG_VDI_DEBUG

2018-03-21 Thread Kevin Wolf
Use qemu_uuid_unparse() instead of uuid_unparse() to make vdi.c compile
again when CONFIG_VDI_DEBUG is set. In order to prevent future bitrot,
replace '#ifdef CONFIG_VDI_DEBUG' by 'if (VDI_DEBUG)' so that the
compiler always sees the code.

Signed-off-by: Kevin Wolf 
Reviewed-by: Eric Blake 
---
 block/vdi.c | 22 ++
 1 file changed, 10 insertions(+), 12 deletions(-)

diff --git a/block/vdi.c b/block/vdi.c
index 73c059e69d..4a2d1ff88d 100644
--- a/block/vdi.c
+++ b/block/vdi.c
@@ -235,7 +235,6 @@ static void vdi_header_to_le(VdiHeader *header)
 qemu_uuid_bswap(>uuid_parent);
 }
 
-#if defined(CONFIG_VDI_DEBUG)
 static void vdi_header_print(VdiHeader *header)
 {
 char uuid[37];
@@ -257,16 +256,15 @@ static void vdi_header_print(VdiHeader *header)
 logout("block extra 0x%04x\n", header->block_extra);
 logout("blocks tot. 0x%04x\n", header->blocks_in_image);
 logout("blocks all. 0x%04x\n", header->blocks_allocated);
-uuid_unparse(header->uuid_image, uuid);
+qemu_uuid_unparse(>uuid_image, uuid);
 logout("uuid image  %s\n", uuid);
-uuid_unparse(header->uuid_last_snap, uuid);
+qemu_uuid_unparse(>uuid_last_snap, uuid);
 logout("uuid snap   %s\n", uuid);
-uuid_unparse(header->uuid_link, uuid);
+qemu_uuid_unparse(>uuid_link, uuid);
 logout("uuid link   %s\n", uuid);
-uuid_unparse(header->uuid_parent, uuid);
+qemu_uuid_unparse(>uuid_parent, uuid);
 logout("uuid parent %s\n", uuid);
 }
-#endif
 
 static int coroutine_fn vdi_co_check(BlockDriverState *bs, BdrvCheckResult 
*res,
  BdrvCheckMode fix)
@@ -387,9 +385,9 @@ static int vdi_open(BlockDriverState *bs, QDict *options, 
int flags,
 }
 
 vdi_header_to_cpu();
-#if defined(CONFIG_VDI_DEBUG)
-vdi_header_print();
-#endif
+if (VDI_DEBUG) {
+vdi_header_print();
+}
 
 if (header.disk_size > VDI_DISK_SIZE_MAX) {
 error_setg(errp, "Unsupported VDI image size (size is 0x%" PRIx64
@@ -825,9 +823,9 @@ static int coroutine_fn 
vdi_co_do_create(BlockdevCreateOptions *create_options,
 qemu_uuid_generate(_image);
 qemu_uuid_generate(_last_snap);
 /* There is no need to set header.uuid_link or header.uuid_parent here. */
-#if defined(CONFIG_VDI_DEBUG)
-vdi_header_print();
-#endif
+if (VDI_DEBUG) {
+vdi_header_print();
+}
 vdi_header_to_le();
 ret = blk_pwrite(blk, offset, , sizeof(header), 0);
 if (ret < 0) {
-- 
2.13.6




[Qemu-block] [PATCH for-2.12 v2 00/12] block: Follow-up for .bdrv_co_create (part 1)

2018-03-21 Thread Kevin Wolf
This series adds qemu-iotests for a few more block drivers (yet more to
come in another series) and fixes a few things that previous review and
these tests brought up.

The only major design change is that I converted the vdi block driver
from a boolean 'static' create option to the standard 'preallocation'
one that other drivers are using. This seems like a good move to make
while the interface isn't stable yet.

v2:
- Patch 1: Mention allowed values for 'preallocation' [Eric]
- Patch 3: Fixed comments, removed 2^63-512 case [Eric]
- Patch 6: Added missing reference output change [Eric]
- Patch 7: s/UINT64_MAX/INT64_MAX/ [Eric]
- Patches 8 and 12: Fixed comments [Eric]

Kevin Wolf (12):
  vdi: Change 'static' create option to 'preallocation' in QMP
  vdi: Fix build with CONFIG_VDI_DEBUG
  qemu-iotests: Test vdi image creation with QMP
  qemu-iotests: Enable 025 for luks
  luks: Turn another invalid assertion into check
  qemu-iotests: Test invalid resize on luks
  parallels: Check maximum cluster size on create
  qemu-iotests: Test parallels image creation with QMP
  vhdx: Require power-of-two block size on create
  vhdx: Don't use error_setg_errno() with constant errno
  vhdx: Check for 4 GB maximum log size on creation
  qemu-iotests: Test vhdx image creation with QMP

 qapi/block-core.json   |   7 +-
 block/crypto.c |   6 +-
 block/parallels.c  |   5 +
 block/vdi.c|  46 --
 block/vhdx.c   |  17 ++-
 tests/qemu-iotests/025 |   9 +-
 tests/qemu-iotests/210 |  37 +
 tests/qemu-iotests/210.out |  16 +++
 tests/qemu-iotests/211 | 246 
 tests/qemu-iotests/211.out |  97 +
 tests/qemu-iotests/212 | 326 ++
 tests/qemu-iotests/212.out | 111 ++
 tests/qemu-iotests/213 | 349 +
 tests/qemu-iotests/213.out | 121 
 tests/qemu-iotests/group   |   3 +
 15 files changed, 1371 insertions(+), 25 deletions(-)
 create mode 100755 tests/qemu-iotests/211
 create mode 100644 tests/qemu-iotests/211.out
 create mode 100755 tests/qemu-iotests/212
 create mode 100644 tests/qemu-iotests/212.out
 create mode 100755 tests/qemu-iotests/213
 create mode 100644 tests/qemu-iotests/213.out

-- 
2.13.6




Re: [Qemu-block] [PATCH 4/4] qapi: new qmp command nbd-server-add-bitmap

2018-03-21 Thread Eric Blake

On 03/21/2018 07:19 AM, Vladimir Sementsov-Ogievskiy wrote:

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  qapi/block.json | 27 +++
  blockdev-nbd.c  | 23 +++
  2 files changed, 50 insertions(+)

diff --git a/qapi/block.json b/qapi/block.json
index c694524002..4afbbcd7b7 100644
--- a/qapi/block.json
+++ b/qapi/block.json
@@ -269,6 +269,33 @@
'data': {'name': 'str', '*mode': 'NbdServerRemoveMode'} }
  
  ##

+# @nbd-server-add-bitmap:
+#
+# Export dirty bitmap through selected export. Bitmaps are searched for in
+# device attached to the export and in all its backings. Exported bitmap
+# is locked until NBD export is removed.


Expose a dirty bitmap associated with the selected export.  The bitmap 
search starts at the device attached to the export, and includes all 
backing files.  The exported bitmap is then locked until the NBD export 
is removed.



+#
+# @name: Export name.
+#
+# @bitmap: Bitmap name to search.


s/search./search for./


+#
+# @bitmap-export-name: How the bitmap will be seen by nbd clients
+#  (default @bitmap)


Maybe mention that the client must use NBD_OPT_SET_META_CONTEXT with a 
query of "qemu-dirty-bitmap:NAME" (where NAME matches 
bitmap-export-name) to access the exposed bitmap.  (May need to be 
adjusted by my suggestion to use just the namespace "qemu:")



+#
+#
+# Returns: error on one of the following conditions:
+#   - the server is not running
+#   - export is not found
+#   - bitmap is not found
+#   - bitmap is disabled
+#   - bitmap is locked


Do we really need to list all the error conditions?  My worry is that a 
list this specific might go stale, compared to the obvious default that 
the command succeeds only if it was able to expose the bitmap and that 
the error message is specific enough for a human to figure out what to 
fix if it failed.



+#
+# Since: 2.13
+##
+  { 'command': 'nbd-server-add-bitmap',
+'data': {'name': 'str', 'bitmap': 'str', '*bitmap-export-name': 'str'} }
+


--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



Re: [Qemu-block] [PATCH for-2.12 12/12] qemu-iotests: Test vhdx image creation with QMP

2018-03-21 Thread Kevin Wolf
Am 20.03.2018 um 19:53 hat Eric Blake geschrieben:
> On 03/20/2018 12:36 PM, Kevin Wolf wrote:
> > Signed-off-by: Kevin Wolf 
> > ---
> >   tests/qemu-iotests/213 | 349 
> > +
> >   tests/qemu-iotests/213.out | 121 
> >   tests/qemu-iotests/group   |   1 +
> >   3 files changed, 471 insertions(+)
> >   create mode 100755 tests/qemu-iotests/213
> >   create mode 100644 tests/qemu-iotests/213.out
> > 
> 
> > +
> > +echo
> > +echo "=== Invalid sizes ==="
> > +echo
> > +
> > +# TODO Negative image sizes aren't handled correctly, but this is a problem
> > +# with QAPI's implementation of the 'size' type and affects other commands 
> > as
> > +# well. Once this is fixed, we may want to add a test case here.
> > +
> > +# 1. 2^64 - 512
> > +# 2. 2^63 = 8 EB (qemu-img enforces image sizes less than this)
> > +# 3. 2^63 - 512 (generally valid, but with the crypto header the file will
> > +#exceed 63 bits)
> 
> Same comments as before on whether this comment is slightly stale after
> copy-and-paste.

Will do the same thing as there ("image header").

> > +# 4. 2^46 + 1 (512 bytes more than maximum image size)
> 
> Does this image format require 512-byte alignment?  If so, are you missing a
> test of unaligned sizes?  If not, why not just 1 byte more than the maximum?

The comment is wrong, the code already does just 1 byte more.

Kevin



Re: [Qemu-block] [PATCH for-2.12 06/12] qemu-iotests: Test invalid resize on luks

2018-03-21 Thread Kevin Wolf
Am 20.03.2018 um 19:33 hat Eric Blake geschrieben:
> On 03/20/2018 12:36 PM, Kevin Wolf wrote:
> > This tests that the .bdrv_truncate implementation for luks doesn't crash
> > for invalid image sizes.
> > 
> > Signed-off-by: Kevin Wolf 
> > ---
> >   tests/qemu-iotests/210 | 37 +
> >   1 file changed, 37 insertions(+)
> 
> Missing a change to 210.out?

Obviously. I wonder how that happened... Thanks!

Kevin



Re: [Qemu-block] [PATCH 1/4] nbd/server: refactor nbd_negotiate_meta_query for several namespaces

2018-03-21 Thread Wouter Verhelst
On Wed, Mar 21, 2018 at 09:56:36AM -0500, Eric Blake wrote:
> no further qemu patches are submitted.  Is it worth me proposing a doc
> change to demonstrate what the difference would look like, along with
> corresponding qemu changes to match, to decide if including it in 2.12 is
> worth it?

If you're willing to spend the effort, I won't stop you :-) but I think the
pain of parsing a string once (during negotiation) is not that bad, and
I'm not too bothered about it.

Put otherwise, I'm not convinced that the downside (which I agree
exists) outweighs the downside of having to introduce yet another
command, or add some ugliness for backwards compat reasons.

But hey, go ahead, prove me wrong ;-)

-- 
Could you people please use IRC like normal people?!?

  -- Amaya Rodrigo Sastre, trying to quiet down the buzz in the DebConf 2008
 Hacklab



Re: [Qemu-block] [PATCH for-2.12 08/12] qemu-iotests: Test parallels image creation with QMP

2018-03-21 Thread Kevin Wolf
Am 20.03.2018 um 19:42 hat Eric Blake geschrieben:
> On 03/20/2018 12:36 PM, Kevin Wolf wrote:
> > Signed-off-by: Kevin Wolf 
> > ---
> >   tests/qemu-iotests/212 | 326 
> > +
> >   tests/qemu-iotests/212.out | 111 +++
> >   tests/qemu-iotests/group   |   1 +
> >   3 files changed, 438 insertions(+)
> >   create mode 100755 tests/qemu-iotests/212
> >   create mode 100644 tests/qemu-iotests/212.out
> > 
> 
> > +echo
> > +echo "=== Invalid sizes ==="
> > +echo
> > +
> > +# TODO Negative image sizes aren't handled correctly, but this is a problem
> > +# with QAPI's implementation of the 'size' type and affects other commands 
> > as
> > +# well. Once this is fixed, we may want to add a test case here.
> > +
> > +# 1. Misaligned image size
> > +# 2. 2^64 - 512
> > +# 3. 2^63 = 8 EB (qemu-img enforces image sizes less than this)
> > +# 4. 2^63 - 512 (generally valid, but with the crypto header the file will
> > +#exceed 63 bits)
> 
> Is this part of the comment stale copy-and-paste? There's no crypto header,
> and the real max is much smaller...

Yeah... Not sure if the case is all that useful, but it can't hurt, so
I'll just s/crypto header/image header/ rather than removing it.

Kevin



Re: [Qemu-block] [PATCH 3/4] nbd/server: implement dirty bitmap export

2018-03-21 Thread Eric Blake

On 03/21/2018 07:19 AM, Vladimir Sementsov-Ogievskiy wrote:

Signed-off-by: Vladimir Sementsov-Ogievskiy 


Rather sparse on the details in the commit message; I had to read the 
patch to even learn what the new namespace is.



---
  include/block/nbd.h |   2 +
  nbd/server.c| 207 ++--
  2 files changed, 203 insertions(+), 6 deletions(-)




+++ b/nbd/server.c
@@ -23,6 +23,12 @@
  #include "nbd-internal.h"
  
  #define NBD_META_ID_BASE_ALLOCATION 0

+#define NBD_META_ID_DIRTY_BITMAP 1
+
+#define NBD_MAX_BITMAP_EXTENTS (0x10 / 8) /* 1 mb of extents data */
+#define MAX_EXTENT_LENGTH QEMU_ALIGN_DOWN(INT32_MAX, 512)


Worth comments on these two limits?


+
+#define NBD_STATE_DIRTY 1


Does this belong better in nbd.h, alongside NBD_STATE_HOLE?  (And 
defining it as (1 << 0) might be nicer, too).


  
  static int system_errno_to_nbd_errno(int err)

  {
@@ -80,6 +86,9 @@ struct NBDExport {
  
  BlockBackend *eject_notifier_blk;

  Notifier eject_notifier;
+
+BdrvDirtyBitmap *export_bitmap;
+char *export_bitmap_name;
  };
  
  static QTAILQ_HEAD(, NBDExport) exports = QTAILQ_HEAD_INITIALIZER(exports);

@@ -92,6 +101,7 @@ typedef struct NBDExportMetaContexts {
  bool valid; /* means that negotiation of the option finished without
 errors */
  bool base_allocation; /* export base:allocation context (block status) */
+bool dirty_bitmap; /* export qemu-dirty-bitmap: */


That's a LONG namespace name, and it does not lend itself well to other 
qemu extensions; so future qemu BLOCK_STATUS additions would require 
consuming yet another namespace and additional upstream NBD 
registration.  Wouldn't it be better to have the namespace be 'qemu:' 
(which we register with upstream NBD just once), where we are then free 
to document leafnames to look like 'dirty-bitmap:name' or 'foo:bar'?



  } NBDExportMetaContexts;
  
  struct NBDClient {

@@ -786,12 +796,32 @@ static int nbd_meta_base_query(NBDClient *client, 
NBDExportMetaContexts *meta,
   >base_allocation, errp);
  }
  
+/* nbd_meta_bitmap_query

+ *
+ * Handle query to 'qemu-dirty-bitmap' namespace.
+ * 'len' is the amount of text remaining to be read from the current name, 
after
+ * the 'qemu-dirty-bitmap:' portion has been stripped.
+ *
+ * Return -errno on I/O error, 0 if option was completely handled by
+ * sending a reply about inconsistent lengths, or 1 on success. */
+static int nbd_meta_bitmap_query(NBDClient *client, NBDExportMetaContexts 
*meta,
+ uint32_t len, Error **errp)
+{
+if (!client->exp->export_bitmap) {
+return nbd_opt_skip(client, len, errp);
+}
+
+return nbd_meta_single_query(client, client->exp->export_bitmap_name, len,
+ >dirty_bitmap, errp);


So if I'm reading this right, a client can do _LIST_ 
'qemu-dirty-bitmap:' (or 'qemu:dirty-bitmap:' if we choose a shorter 
initial namespace) and get back the exported bitmap name; or the user 
already knows the bitmap name and both _LIST_ and _SET_ 
'qemu-dirty-bitmap:name' return the one supported bitmap for that NBD 
server.




  /* nbd_co_send_extents
+ *
+ * NBD_REPLY_FLAG_DONE is not set, don't forget to send it.
+ *
   * @extents should be in big-endian */
  static int nbd_co_send_extents(NBDClient *client, uint64_t handle,
 NBDExtent *extents, unsigned nb_extents,


Worth a bool flag that the caller can inform this function whether to 
include FLAG_DONE?

+
+static int nbd_co_send_bitmap(NBDClient *client, uint64_t handle,
+  BdrvDirtyBitmap *bitmap, uint64_t offset,
+  uint64_t length, uint32_t context_id,
+  Error **errp)
+{
+int ret;
+unsigned nb_extents;
+NBDExtent *extents = g_new(NBDExtent, NBD_MAX_BITMAP_EXTENTS);


Is this correct even when the client used NBD_CMD_FLAG_REQ_ONE?

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



Re: [Qemu-block] [PATCH v2] qemu: replace "" with <> in headers

2018-03-21 Thread Kevin Wolf
Am 21.03.2018 um 16:58 hat Michael S. Tsirkin geschrieben:
> On Wed, Mar 21, 2018 at 04:34:39PM +0100, Kevin Wolf wrote:
> > Am 21.03.2018 um 15:46 hat Michael S. Tsirkin geschrieben:
> > > Our current scheme is to use
> > >  #include ""
> > > for internal headers, and
> > >  #include <>
> > > for external ones.
> > > 
> > > Unfortunately this is not based on compiler support: from C point of
> > > view, the "" form merely looks up headers in the current directory
> > > and then falls back on <> directories.
> > > 
> > > Thus, for example, a system header trace.h - should it be present - will
> > > conflict with our local trace.h
> > 
> > You're right that there is a conflict, even though only in one
> > direction: "trace.h" is unambiguously the local trace.h in our source
> > tree, but  refers to the same local header rather than the
> > system header as you would expect.
> > 
> > An easy way to resolve this conflict would be using -iquote rather than
> > -I for directories in the source tree, so that  unambiguously
> > refers to the system header and "trace.h" unambiguously refers to the
> > QEMU header.
> 
> I posted patches to that effect for 2.12.

Ah, I missed those. That's good (and imho enough).

> It's all still very much a non-standard convention and so less robust
> than prefixing file name with a project-specifix prefix.

I've always had the impression that it's by far the most common
convention, to the point that I'd blindly assume it when joining a new
project.

> > > As another example of problems, a header by the same name in the source
> > > directory will always be picked up first - before any headers in
> > > the include directory.
> > > 
> > > Let's change the scheme: make sure all headers that are not
> > > in the source directory are included through a path
> > > starting with qemu/ , thus:
> > > 
> > >  #include <>
> > > 
> > > headers in the same directory as source are included with
> > > 
> > >  #include ""
> > > 
> > > as per standard.
> > > 
> > > This (untested) patch is just to start the discussion and does not
> > > change all of the codebase. If there's agreement, this will be
> > > run on all code to converting code to this scheme.
> > 
> > Renaming files is always painful. If that's the fix, the cure might be
> > worse than the disease. As far as I know, the conflict is only
> > theoretical, so in that case I'd say: If it ain't broke, don't fix it.
> > 
> > Kevin
> 
> It's broke I think, it's very hard for new people to contribute to QEMU.
> Look e.g. at rdma which all has messed up includes - and that's from an
> experienced conributor who just isn't an experienced maintainer.

I don't think the problem is that the convention is hard to apply (it's
definitely not). It's knowing about the convention. This problem isn't
going away by switching to a different, less common convention. We're
only going to see more offenders then.

> Amount of time spent on teaching new people trivia about our
> conventions just isn't funny. They should be self-documenting
> and violations should cause the build to fail.

Yes, but your proposal doesn't achieve this. You can still use
"qemu/foo.h" instead of  and it will build successfully.
That's something we can't change, as far as I know, because the include
path for "foo.h" is always a superset of .

If anything, this means that we should prefer "foo.h" for local headers
(i.e. the way it currently is) because we can let the compiler enforce
it:  for "foo.h" can become a build error, and does so with your
-iquote patch, but the other way round doesn't work.

Then it's only system headers that you can possibly get wrong, but for
those everyone should be used to using  anyway.

Kevin



Re: [Qemu-block] [PATCH v2] qemu: replace "" with <> in headers

2018-03-21 Thread Michael S. Tsirkin
On Wed, Mar 21, 2018 at 04:34:39PM +0100, Kevin Wolf wrote:
> Am 21.03.2018 um 15:46 hat Michael S. Tsirkin geschrieben:
> > Our current scheme is to use
> >  #include ""
> > for internal headers, and
> >  #include <>
> > for external ones.
> > 
> > Unfortunately this is not based on compiler support: from C point of
> > view, the "" form merely looks up headers in the current directory
> > and then falls back on <> directories.
> > 
> > Thus, for example, a system header trace.h - should it be present - will
> > conflict with our local trace.h
> 
> You're right that there is a conflict, even though only in one
> direction: "trace.h" is unambiguously the local trace.h in our source
> tree, but  refers to the same local header rather than the
> system header as you would expect.
> 
> An easy way to resolve this conflict would be using -iquote rather than
> -I for directories in the source tree, so that  unambiguously
> refers to the system header and "trace.h" unambiguously refers to the
> QEMU header.

I posted patches to that effect for 2.12. It's all still very much
a non-standard convention and so less robust than
prefixing file name with a project-specifix prefix.

> > As another example of problems, a header by the same name in the source
> > directory will always be picked up first - before any headers in
> > the include directory.
> > 
> > Let's change the scheme: make sure all headers that are not
> > in the source directory are included through a path
> > starting with qemu/ , thus:
> > 
> >  #include <>
> > 
> > headers in the same directory as source are included with
> > 
> >  #include ""
> > 
> > as per standard.
> > 
> > This (untested) patch is just to start the discussion and does not
> > change all of the codebase. If there's agreement, this will be
> > run on all code to converting code to this scheme.
> 
> Renaming files is always painful. If that's the fix, the cure might be
> worse than the disease. As far as I know, the conflict is only
> theoretical, so in that case I'd say: If it ain't broke, don't fix it.
> 
> Kevin

It's broke I think, it's very hard for new people to contribute to QEMU.
Look e.g. at rdma which all has messed up includes - and that's from an
experienced conributor who just isn't an experienced maintainer.

Amount of time spent on teaching new people trivia about our
conventions just isn't funny. They should be self-documenting
and violations should cause the build to fail.

-- 
MST



Re: [Qemu-block] [PATCH v2] qemu: replace "" with <> in headers

2018-03-21 Thread Daniel P . Berrangé
On Wed, Mar 21, 2018 at 05:39:48PM +0200, Michael S. Tsirkin wrote:
> On Wed, Mar 21, 2018 at 03:19:22PM +, Daniel P. Berrangé wrote:
> > On Wed, Mar 21, 2018 at 04:46:32PM +0200, Michael S. Tsirkin wrote:
> > > Our current scheme is to use
> > >  #include ""
> > > for internal headers, and
> > >  #include <>
> > > for external ones.
> > > 
> > > Unfortunately this is not based on compiler support: from C point of
> > > view, the "" form merely looks up headers in the current directory
> > > and then falls back on <> directories.
> > > 
> > > Thus, for example, a system header trace.h - should it be present - will
> > > conflict with our local trace.h
> > 
> > If our local "trace.h" is in the current directory, then using ""
> > is right and you can still use  to get the system version.
> > 
> > If our local trace.h is in include/ top level, then it is going to
> > block use of the system trace.h regardless of whether we use <> or ""
> > 
> > Fortunately our include/ tree uses sub-dirs, so we would typically
> > use  #include "$subdir/trace.h" and  #include  would still
> > find the system header.
> > We just have to be careful we don't add stuff at the top level of
> > our include/ dir with names that are liable to clash. This might
> > suggest renaming  include/elf.h to include/qemu/elf.h, or just
> > moving elf.h to the qemu/ subdirectory. Likewise include/glib-compat.h
> > might be better moved to qemu/ subdirectory.
> > 
> 
> This is exactly what this patch proposes, with a uniform scheme:
> start everything with qemu/.
> 
> > 
> > > As another example of problems, a header by the same name in the source
> > > directory will always be picked up first - before any headers in
> > > the include directory.
> > 
> > There's only a couple of headers in the top level of our include/
> > directory - everything else is pulled in with a named path
> > eg #include "block/block_int.h", so that would not conflict with
> > reference to a bare #include "block_int.h" from the current directory.
> 
> We can not know that there are no system headers that start with block/ on
> any current or future systems.

Ah true, good point.  I guess that's where the benefit of -iquote
comes into play.


Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|



Re: [Qemu-block] [Qemu-devel] [PATCH] block: drop moderated sheepdog mailing list from MAINTAINERS file

2018-03-21 Thread Daniel P . Berrangé
On Wed, Mar 21, 2018 at 10:42:57AM -0500, Eric Blake wrote:
> On 03/21/2018 10:31 AM, Daniel P. Berrangé wrote:
> > The sheepdog mailing list is setup to stop and queue messages from
> > non-subscribers, pending moderator approval. Unfortunately it seems
> > that the moderation queue is not actively dealt with. Even when messages
> > are approved, the sender is never added to the whitelist, so every
> > future mail the same sender continues to get stopped for moderation.
> > 
> > MAINTAINERS entries should be responsive and not uneccessarily block
> > mails from QEMU contributors, so drop the sheepdog mailing list.
> > 
> > Signed-off-by: Daniel P. Berrangé 
> > ---
> >   MAINTAINERS | 1 -
> >   1 file changed, 1 deletion(-)
> 
> Perhaps the sheepdog list would like to know what we are doing, along with a
> heads up that their list policy is unfriendly.  But I can't be the one to
> tell them - if I were to add the sheepdog list in cc, I'd get moderated.

I copied the other listed sheepdog maintainers on this patch so that they
would get the heads up & opportunity to object and / or talk to the list
admins to address it if desired.  We should give them a few days to
respond here before queue'ing this at least.

> It's annoyed me enough times that I agree with the change.
> 
> Reviewed-by: Eric Blake 


Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|



Re: [Qemu-block] [PATCH v2] qemu: replace "" with <> in headers

2018-03-21 Thread Michael S. Tsirkin
On Wed, Mar 21, 2018 at 03:19:22PM +, Daniel P. Berrangé wrote:
> On Wed, Mar 21, 2018 at 04:46:32PM +0200, Michael S. Tsirkin wrote:
> > Our current scheme is to use
> >  #include ""
> > for internal headers, and
> >  #include <>
> > for external ones.
> > 
> > Unfortunately this is not based on compiler support: from C point of
> > view, the "" form merely looks up headers in the current directory
> > and then falls back on <> directories.
> > 
> > Thus, for example, a system header trace.h - should it be present - will
> > conflict with our local trace.h
> 
> If our local "trace.h" is in the current directory, then using ""
> is right and you can still use  to get the system version.
> 
> If our local trace.h is in include/ top level, then it is going to
> block use of the system trace.h regardless of whether we use <> or ""
> 
> Fortunately our include/ tree uses sub-dirs, so we would typically
> use  #include "$subdir/trace.h" and  #include  would still
> find the system header.
> We just have to be careful we don't add stuff at the top level of
> our include/ dir with names that are liable to clash. This might
> suggest renaming  include/elf.h to include/qemu/elf.h, or just
> moving elf.h to the qemu/ subdirectory. Likewise include/glib-compat.h
> might be better moved to qemu/ subdirectory.
> 

This is exactly what this patch proposes, with a uniform scheme:
start everything with qemu/.

> 
> > As another example of problems, a header by the same name in the source
> > directory will always be picked up first - before any headers in
> > the include directory.
> 
> There's only a couple of headers in the top level of our include/
> directory - everything else is pulled in with a named path
> eg #include "block/block_int.h", so that would not conflict with
> reference to a bare #include "block_int.h" from the current directory.

We can not know that there are no system headers that start with block/ on
any current or future systems.

> > Let's change the scheme: make sure all headers that are not
> > in the source directory are included through a path
> > starting with qemu/ , thus:
> > 
> >  #include <>
> > 
> > headers in the same directory as source are included with
> > 
> >  #include ""
> > 
> > as per standard.
> 
> As stated before, I consider this a step backwards - it is a
> good clear standard to use "" for project local includes and
> <> for 3rd party / system includes IMHO. The change doesn't
> do anything beneficial for the two scenarios described above
> AFAICT.

I think you are mistaken on the last point:
1. Everything will be under qemu/ so we never clash with a system file
2. A local stale file anywhere in source directory is completely ignored
   since source is not on -I path.

I hope this clarifies things.

> 
> Regards,
> Daniel
> -- 
> |: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org -o-https://fstop138.berrange.com :|
> |: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|



Re: [Qemu-block] [Qemu-devel] [PATCH] block: drop moderated sheepdog mailing list from MAINTAINERS file

2018-03-21 Thread Eric Blake

On 03/21/2018 10:31 AM, Daniel P. Berrangé wrote:

The sheepdog mailing list is setup to stop and queue messages from
non-subscribers, pending moderator approval. Unfortunately it seems
that the moderation queue is not actively dealt with. Even when messages
are approved, the sender is never added to the whitelist, so every
future mail the same sender continues to get stopped for moderation.

MAINTAINERS entries should be responsive and not uneccessarily block
mails from QEMU contributors, so drop the sheepdog mailing list.

Signed-off-by: Daniel P. Berrangé 
---
  MAINTAINERS | 1 -
  1 file changed, 1 deletion(-)


Perhaps the sheepdog list would like to know what we are doing, along 
with a heads up that their list policy is unfriendly.  But I can't be 
the one to tell them - if I were to add the sheepdog list in cc, I'd get 
moderated.


It's annoyed me enough times that I agree with the change.

Reviewed-by: Eric Blake 

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



Re: [Qemu-block] [PATCH for-2.12 v2] qcow2: Reset free_cluster_index when allocating a new refcount block

2018-03-21 Thread Kevin Wolf
Am 21.03.2018 um 16:32 hat Alberto Garcia geschrieben:
> On Wed 21 Mar 2018 04:07:28 PM CET, Kevin Wolf wrote:
> > I just remembered that when I looked at an image recently, I noticed
> > that the refcount block wasn't in the first cluster and I disliked it,
> > though mostly because it felt untidy rather than being a problem.
> 
> I don't think you can fix that in general. In my test case I'm writing
> new data when the existing refcount block is already full, so we can
> allocate the new one before the new data and keep everything tidy.
> 
> But if there are, say, two refcount entries available and you write four
> data clusters you don't want the new refcount block in the middle of
> those four clusters just to have it at the beginning :-)

Agreed. I think if implemented like I suggested (hard error in
update_refcount() and then allocate a new refcount block in the caller),
we'd actually end up with the refcount block covered by an old refcount
block rather than being self-describing. And that in turn wouldn't make
things much tidier, so I guess you're right. :-)

> > Having a qcow2 analysis script in the repo sounds like a good
> > idea. John has something, too. Maybe we can check whether the two
> > things complement each other and then check in a script that combines
> > both (or if one provides a superset of the other, just check in that
> > one).
> 
> I'll take a look.

Thanks!

Kevin



Re: [Qemu-block] [PATCH] block: drop moderated sheepdog mailing list from MAINTAINERS file

2018-03-21 Thread Jeff Cody
On Wed, Mar 21, 2018 at 03:31:24PM +, Daniel P. Berrangé wrote:
> The sheepdog mailing list is setup to stop and queue messages from
> non-subscribers, pending moderator approval. Unfortunately it seems
> that the moderation queue is not actively dealt with. Even when messages
> are approved, the sender is never added to the whitelist, so every
> future mail the same sender continues to get stopped for moderation.
> 
> MAINTAINERS entries should be responsive and not uneccessarily block
> mails from QEMU contributors, so drop the sheepdog mailing list.
> 
> Signed-off-by: Daniel P. Berrangé 

Reviewed-by: Jeff Cody 

> ---
>  MAINTAINERS | 1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 354a18ce49..7eea869dcd 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -1871,7 +1871,6 @@ M: Hitoshi Mitake 
>  M: Liu Yuan 
>  M: Jeff Cody 
>  L: qemu-block@nongnu.org
> -L: sheep...@lists.wpkg.org
>  S: Supported
>  F: block/sheepdog.c
>  T: git git://github.com/codyprime/qemu-kvm-jtc.git block
> -- 
> 2.14.3
> 



Re: [Qemu-block] [PATCH v2] qemu: replace "" with <> in headers

2018-03-21 Thread Kevin Wolf
Am 21.03.2018 um 15:46 hat Michael S. Tsirkin geschrieben:
> Our current scheme is to use
>  #include ""
> for internal headers, and
>  #include <>
> for external ones.
> 
> Unfortunately this is not based on compiler support: from C point of
> view, the "" form merely looks up headers in the current directory
> and then falls back on <> directories.
> 
> Thus, for example, a system header trace.h - should it be present - will
> conflict with our local trace.h

You're right that there is a conflict, even though only in one
direction: "trace.h" is unambiguously the local trace.h in our source
tree, but  refers to the same local header rather than the
system header as you would expect.

An easy way to resolve this conflict would be using -iquote rather than
-I for directories in the source tree, so that  unambiguously
refers to the system header and "trace.h" unambiguously refers to the
QEMU header.

> As another example of problems, a header by the same name in the source
> directory will always be picked up first - before any headers in
> the include directory.
> 
> Let's change the scheme: make sure all headers that are not
> in the source directory are included through a path
> starting with qemu/ , thus:
> 
>  #include <>
> 
> headers in the same directory as source are included with
> 
>  #include ""
> 
> as per standard.
> 
> This (untested) patch is just to start the discussion and does not
> change all of the codebase. If there's agreement, this will be
> run on all code to converting code to this scheme.

Renaming files is always painful. If that's the fix, the cure might be
worse than the disease. As far as I know, the conflict is only
theoretical, so in that case I'd say: If it ain't broke, don't fix it.

Kevin



Re: [Qemu-block] [PATCH for-2.12 v2] qcow2: Reset free_cluster_index when allocating a new refcount block

2018-03-21 Thread Alberto Garcia
On Wed 21 Mar 2018 04:07:28 PM CET, Kevin Wolf wrote:
> I just remembered that when I looked at an image recently, I noticed
> that the refcount block wasn't in the first cluster and I disliked it,
> though mostly because it felt untidy rather than being a problem.

I don't think you can fix that in general. In my test case I'm writing
new data when the existing refcount block is already full, so we can
allocate the new one before the new data and keep everything tidy.

But if there are, say, two refcount entries available and you write four
data clusters you don't want the new refcount block in the middle of
those four clusters just to have it at the beginning :-)

> Having a qcow2 analysis script in the repo sounds like a good
> idea. John has something, too. Maybe we can check whether the two
> things complement each other and then check in a script that combines
> both (or if one provides a superset of the other, just check in that
> one).

I'll take a look.

Berto



Re: [Qemu-block] [PATCH v2] qemu: replace "" with <> in headers

2018-03-21 Thread Paolo Bonzini
On 21/03/2018 16:11, Michael S. Tsirkin wrote:
>> Can you explain the changes in the source tree layout?
> include/qemu -> include/qemu/common
> include/* -> include/qemu/*

Ok, then perhaps "util" instead of common would match the source layout
more.

Paolo

> Thus one uses any qemu headers with
> 
> #include 




Re: [Qemu-block] [PATCH v2] qemu: replace "" with <> in headers

2018-03-21 Thread Daniel P . Berrangé
On Wed, Mar 21, 2018 at 04:46:32PM +0200, Michael S. Tsirkin wrote:
> Our current scheme is to use
>  #include ""
> for internal headers, and
>  #include <>
> for external ones.
> 
> Unfortunately this is not based on compiler support: from C point of
> view, the "" form merely looks up headers in the current directory
> and then falls back on <> directories.
> 
> Thus, for example, a system header trace.h - should it be present - will
> conflict with our local trace.h

If our local "trace.h" is in the current directory, then using ""
is right and you can still use  to get the system version.

If our local trace.h is in include/ top level, then it is going to
block use of the system trace.h regardless of whether we use <> or ""

Fortunately our include/ tree uses sub-dirs, so we would typically
use  #include "$subdir/trace.h" and  #include  would still
find the system header.

We just have to be careful we don't add stuff at the top level of
our include/ dir with names that are liable to clash. This might
suggest renaming  include/elf.h to include/qemu/elf.h, or just
moving elf.h to the qemu/ subdirectory. Likewise include/glib-compat.h
might be better moved to qemu/ subdirectory.


> As another example of problems, a header by the same name in the source
> directory will always be picked up first - before any headers in
> the include directory.

There's only a couple of headers in the top level of our include/
directory - everything else is pulled in with a named path
eg #include "block/block_int.h", so that would not conflict with
reference to a bare #include "block_int.h" from the current directory.

> Let's change the scheme: make sure all headers that are not
> in the source directory are included through a path
> starting with qemu/ , thus:
> 
>  #include <>
> 
> headers in the same directory as source are included with
> 
>  #include ""
> 
> as per standard.

As stated before, I consider this a step backwards - it is a
good clear standard to use "" for project local includes and
<> for 3rd party / system includes IMHO. The change doesn't
do anything beneficial for the two scenarios described above
AFAICT.


Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|



Re: [Qemu-block] [PATCH v2] qemu: replace "" with <> in headers

2018-03-21 Thread Michael S. Tsirkin
On Wed, Mar 21, 2018 at 04:04:29PM +0100, Paolo Bonzini wrote:
> On 21/03/2018 15:46, Michael S. Tsirkin wrote:
> > +if (m@^\s*#include\s+"qemu/@o) {
> > +s@^(\s*#include\s+)"qemu/([^"]+)"(.*)$@$1$3@o) {
> > +} else {
> > +s@^(\s*#include\s+)"([^"]+)"(.*)$@$1$3@o) {
> > +}
> 
> Can you explain the changes in the source tree layout?

include/qemu -> include/qemu/common
include/* -> include/qemu/*

Thus one uses any qemu headers with

#include 

we can do the conversion gradually and avoid a flag day
with some use of softlinks.

> Also, s{}{} and m{} are a bit more readable.
> 
> Paolo

Thanks, will use.


-- 
MST



Re: [Qemu-block] [PATCH for-2.12 v2] qcow2: Reset free_cluster_index when allocating a new refcount block

2018-03-21 Thread Kevin Wolf
Am 21.03.2018 um 15:10 hat Alberto Garcia geschrieben:
> On Wed 21 Mar 2018 02:52:38 PM CET, Kevin Wolf wrote:
> 
> >>  qemu-img create -f qcow2 -o cluster_size=512 hd.qcow2 1M
> >>  qemu-io -c 'write 0 124k' hd.qcow2
> >>  qemu-io -c 'write 124k 512' hd.qcow2
> >
> > So if I understand correctly, this is what we get:
> >
> > 0x2: free
> > 0x20200: refcount block
> > 0x20400: data cluster
> > 0x20600: potential next data cluster
> 
> Yes.
> 
> >> What this patch does is reset s->free_cluster_index to its previous
> >> value when alloc_refcount_block() returns -EAGAIN. This way the caller
> >> will try to allocate again the original clusters if they are still
> >> free.
> >
> > This is an improvement, because now we should avoid the unused cluster:
> >
> > 0x2: data cluster
> > 0x20200: refcount block
> > 0x20400: potential next data cluster
> 
> That's correct.
> 
> > But now the data clusters are fragmented. Should we try to change the
> > logic so that already the refcount block allocation can make use of
> > the free space? That is:
> >
> > 0x2: refcount block
> > 0x20200: data cluster
> > 0x20400: contiguous potential next data cluster
> 
> Well, the clusters before 0x2 are also data clusters so there's
> going to be fragmentation anyway. Also, if you're writing more than 1
> data cluster in a single request when the refcount block allocation
> happens all those new data clusters are still going to be contiguous
> (before the new refcount block).

That's true.

I just remembered that when I looked at an image recently, I noticed
that the refcount block wasn't in the first cluster and I disliked it,
though mostly because it felt untidy rather than being a problem. There
was one effect that might count as an advantage, though: The first few
data clusters covered by the refcount block were corrupted before the
overlap check for the refcount block stopped it.

Not necessarily worth changing the logic, I just thought I'd bring it
up and see whether anyone else dislikes it.

> Another possibility that I considered was to make alloc_clusters_noref()
> return the offset but let free_cluster_index untouched until the
> refcounts are correct, but this requires more code and I fear that
> keeping free_cluster_index pointing to the clusters we're trying to
> allocate can lead to unexpected surprises.

Possibly. I think if I wanted to change the order, I'd consider
returning a hard error from update_refcount() when no refcount block is
available, and then the caller would have to call alloc_refcount_block()
manually before retrying update_refcount().

> The solution I chose is -I think- the simplest and easiest to audit.
> 
> On a related note, I'm using a script that I wrote myself to debug qcow2
> images. It prints the contents of the header and lists all host
> clusters, indicating the type of each one. I find it useful to debug
> problems like this one. If there's nothing similar already existing I
> can post it and we could put it in the scripts/ directory.

Having a qcow2 analysis script in the repo sounds like a good idea. John
has something, too. Maybe we can check whether the two things complement
each other and then check in a script that combines both (or if one
provides a superset of the other, just check in that one).

Kevin



Re: [Qemu-block] [PATCH 2/4] nbd/server: add nbd_meta_single_query helper

2018-03-21 Thread Eric Blake

On 03/21/2018 07:19 AM, Vladimir Sementsov-Ogievskiy wrote:

The helper will be reused for bitmaps namespace.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  nbd/server.c | 41 -
  1 file changed, 28 insertions(+), 13 deletions(-)




+/* Read len bytes and check matching to the pattern.
+ * @match is set to true on empty query for _LIST_ and for query matching the
+ * @pattern. @match is never set to false.


How about:

Read @len bytes, and set @match to true if they match @pattern, or if 
@len is 0 and the client is performing _LIST_.  @match is never set to 
false.


At any rate, the refactoring is sane; and comment touchups are trivial, so

Reviewed-by: Eric Blake 

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



Re: [Qemu-block] [PATCH v2] qemu: replace "" with <> in headers

2018-03-21 Thread Paolo Bonzini
On 21/03/2018 15:46, Michael S. Tsirkin wrote:
> +if (m@^\s*#include\s+"qemu/@o) {
> +s@^(\s*#include\s+)"qemu/([^"]+)"(.*)$@$1$3@o) {
> +} else {
> +s@^(\s*#include\s+)"([^"]+)"(.*)$@$1$3@o) {
> +}

Can you explain the changes in the source tree layout?

Also, s{}{} and m{} are a bit more readable.

Paolo



Re: [Qemu-block] [PATCH 1/4] nbd/server: refactor nbd_negotiate_meta_query for several namespaces

2018-03-21 Thread Eric Blake

[adding NBD list]

On 03/21/2018 07:19 AM, Vladimir Sementsov-Ogievskiy wrote:

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  nbd/server.c | 60 +++-
  1 file changed, 43 insertions(+), 17 deletions(-)



+struct {
+const char *ns;
+int (*func)(NBDClient *, NBDExportMetaContexts *, uint32_t, Error **);
+} meta_namespace_handlers[] = {
+/* namespaces should go in non-decreasing order by name length */
+{.ns = "base:", .func = nbd_meta_base_query},
+};
+



@@ -787,9 +793,12 @@ static int nbd_negotiate_meta_query(NBDClient *client,
  NBDExportMetaContexts *meta, Error **errp)
  {
  int ret;
-char query[sizeof("base:") - 1];
-size_t baselen = strlen("base:");
+int i;
  uint32_t len;
+int bytes_done = 0;
+char *query;
+int nb_ns = sizeof(meta_namespace_handlers) /
+sizeof(meta_namespace_handlers[0]);


Use the ARRAY_SIZE() macro here.


+query = g_malloc(strlen(meta_namespace_handlers[nb_ns - 1].ns));


So this sizes a buffer according to the largest namespace we expect to 
handle,...


  
-len -= baselen;

-ret = nbd_opt_read(client, query, baselen, errp);
-if (ret <= 0) {
-return ret;
-}
-if (strncmp(query, "base:", baselen) != 0) {
-return nbd_opt_skip(client, len, errp);
+for (i = 0; i < nb_ns; i++) {
+const char *ns = meta_namespace_handlers[i].ns;
+int ns_len = strlen(ns);
+int diff_len = strlen(ns) - bytes_done;
+
+assert(diff_len >= 0);
+
+if (diff_len > 0) {
+if (len < diff_len) {
+ret = nbd_opt_skip(client, len, errp);
+goto out;
+}
+
+len -= diff_len;
+ret = nbd_opt_read(client, query + bytes_done, diff_len, errp);
+if (ret <= 0) {
+goto out;
+}
+}
+
+if (!strncmp(query, ns, ns_len)) {
+ret = meta_namespace_handlers[i].func(client, meta, len, errp);
+goto out;
+}



...then you do multiple iterative reads as you step through the list. 
You know, if you encounter a ':' at any point in the iterative reads, 
you don't have to continue through the rest of the handlers (the query 
belongs to a short-named namespace we didn't recognize).


Is it any smarter to just blindly do a single read of MIN(querylen, 
maxlen), then strchr() for ':' (if not found, it's not a namespace we 
recognize), and then look up if the name matches, at which point we then 
read the rest of the query and refactor the namespace handler to be 
passed the already-parsed leafname, rather than having to parse the 
leafname off the wire in the handler?


I'm STILL wondering if the NBD spec should specify namespace and 
leafname as separate fields rather than requiring the server to parse 
for ':'.  We have only a couple of weeks before the qemu 2.12 release 
cements in place an implementation of the BLOCK_STATUS extension.  And 
we've already discussed that if we make a change, we have to consider 
using a different constant for NBD_OPT_SET_META_CONTEXT to play nicely 
with the existing Virtuozzo implementation that currently matches what 
is slated to land in qemu 2.12 if no further qemu patches are submitted. 
 Is it worth me proposing a doc change to demonstrate what the 
difference would look like, along with corresponding qemu changes to 
match, to decide if including it in 2.12 is worth it?


--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



[Qemu-block] [PATCH v2] qemu: replace "" with <> in headers

2018-03-21 Thread Michael S. Tsirkin
Our current scheme is to use
 #include ""
for internal headers, and
 #include <>
for external ones.

Unfortunately this is not based on compiler support: from C point of
view, the "" form merely looks up headers in the current directory
and then falls back on <> directories.

Thus, for example, a system header trace.h - should it be present - will
conflict with our local trace.h

As another example of problems, a header by the same name in the source
directory will always be picked up first - before any headers in
the include directory.

Let's change the scheme: make sure all headers that are not
in the source directory are included through a path
starting with qemu/ , thus:

 #include <>

headers in the same directory as source are included with

 #include ""

as per standard.

This (untested) patch is just to start the discussion and does not
change all of the codebase. If there's agreement, this will be
run on all code to converting code to this scheme.

Signed-off-by: Michael S. Tsirkin 
---
 scripts/changeheaders.pl | 20 
 1 file changed, 20 insertions(+)
 create mode 100755 scripts/changeheaders.pl

diff --git a/scripts/changeheaders.pl b/scripts/changeheaders.pl
new file mode 100755
index 000..22bd5b8
--- /dev/null
+++ b/scripts/changeheaders.pl
@@ -0,0 +1,20 @@
+#!/usr/bin/perl -pi
+
+if (m@^\s*#include\s+"([^"+]"@o) {
+next;
+}
+
+my $hdr = $1;
+my $file = $ARGV;
+$file =~ s@/[^/]+$@@g;
+$file .= $hdr;
+
+if (-e $file) {
+next;
+}
+
+if (m@^\s*#include\s+"qemu/@o) {
+s@^(\s*#include\s+)"qemu/([^"]+)"(.*)$@$1$3@o) {
+} else {
+s@^(\s*#include\s+)"([^"]+)"(.*)$@$1$3@o) {
+}
-- 
MST



Re: [Qemu-block] [PATCH for-2.12 v2] qcow2: Reset free_cluster_index when allocating a new refcount block

2018-03-21 Thread Alberto Garcia
On Wed 21 Mar 2018 02:52:38 PM CET, Kevin Wolf wrote:

>>  qemu-img create -f qcow2 -o cluster_size=512 hd.qcow2 1M
>>  qemu-io -c 'write 0 124k' hd.qcow2
>>  qemu-io -c 'write 124k 512' hd.qcow2
>
> So if I understand correctly, this is what we get:
>
> 0x2: free
> 0x20200: refcount block
> 0x20400: data cluster
> 0x20600: potential next data cluster

Yes.

>> What this patch does is reset s->free_cluster_index to its previous
>> value when alloc_refcount_block() returns -EAGAIN. This way the caller
>> will try to allocate again the original clusters if they are still
>> free.
>
> This is an improvement, because now we should avoid the unused cluster:
>
> 0x2: data cluster
> 0x20200: refcount block
> 0x20400: potential next data cluster

That's correct.

> But now the data clusters are fragmented. Should we try to change the
> logic so that already the refcount block allocation can make use of
> the free space? That is:
>
> 0x2: refcount block
> 0x20200: data cluster
> 0x20400: contiguous potential next data cluster

Well, the clusters before 0x2 are also data clusters so there's
going to be fragmentation anyway. Also, if you're writing more than 1
data cluster in a single request when the refcount block allocation
happens all those new data clusters are still going to be contiguous
(before the new refcount block).

Another possibility that I considered was to make alloc_clusters_noref()
return the offset but let free_cluster_index untouched until the
refcounts are correct, but this requires more code and I fear that
keeping free_cluster_index pointing to the clusters we're trying to
allocate can lead to unexpected surprises.

The solution I chose is -I think- the simplest and easiest to audit.

On a related note, I'm using a script that I wrote myself to debug qcow2
images. It prints the contents of the header and lists all host
clusters, indicating the type of each one. I find it useful to debug
problems like this one. If there's nothing similar already existing I
can post it and we could put it in the scripts/ directory.

Berto



Re: [Qemu-block] [PATCH for-2.12 v2] qcow2: Reset free_cluster_index when allocating a new refcount block

2018-03-21 Thread Kevin Wolf
Am 21.03.2018 um 14:38 hat Alberto Garcia geschrieben:
> This can be reproduced easily:
> 
>  qemu-img create -f qcow2 -o cluster_size=512 hd.qcow2 1M
>  qemu-io -c 'write 0 124k' hd.qcow2
> 
> After this the image has 132608 bytes (256 clusters), and the refcount
> block is full. If we write 512 more bytes it should allocate two new
> clusters: the data cluster itself and a new refcount block.
> 
>  qemu-io -c 'write 124k 512' hd.qcow2
> 
> However the image has now three new clusters (259 in total), and the
> first one of them is empty (and unallocated):
> 
>  dd if=hd.qcow2 bs=512c skip=256 count=1 | hexdump -C
> 
> If we write larger amounts of data in the last step instead of the 512
> bytes used in this example we can create larger holes in the qcow2
> file.

So if I understand correctly, this is what we get:

0x2: free
0x20200: refcount block
0x20400: data cluster
0x20600: potential next data cluster

> What this patch does is reset s->free_cluster_index to its previous
> value when alloc_refcount_block() returns -EAGAIN. This way the caller
> will try to allocate again the original clusters if they are still
> free.

This is an improvement, because now we should avoid the unused cluster:

0x2: data cluster
0x20200: refcount block
0x20400: potential next data cluster

But now the data clusters are fragmented. Should we try to change the
logic so that already the refcount block allocation can make use of the
free space? That is:

0x2: refcount block
0x20200: data cluster
0x20400: contiguous potential next data cluster

Kevin



Re: [Qemu-block] [Libguestfs] [PATCH] tests: regressions: make test-big-heap use a temporary empty file

2018-03-21 Thread Pino Toscano
On Wednesday, 21 March 2018 14:45:38 CET Eric Blake wrote:
> [adding qemu lists]
> 
> On 03/21/2018 07:51 AM, Richard W.M. Jones wrote:
> > On Wed, Mar 21, 2018 at 01:44:17PM +0100, Pino Toscano wrote:
> >> Newer versions of qemu use file locking for the images by default, and
> >> apparently that does not work with /dev/null.  Since this test just
> >> calls qemu-img to get the format of an empty image, create a temporary
> >> one instead.
> > 
> > ACK, but feels like this is a bug in qemu-img to me.
> 
> You're right that file locking on a character device like /dev/null is 
> not going to work as expected, but is it a case where fcntl() actually 
> fails, or is it worse where the fcntl() claiming the locks "succeeds" 
> but doesn't do what we want?  That is, what were the actual error 
> messages you ran into?

$ qemu-img --version
qemu-img version 2.10.1(qemu-2.10.1-2.fc27)
Copyright (c) 2003-2017 Fabrice Bellard and the QEMU Project developers
$ qemu-img info /dev/null 
qemu-img: Could not open '/dev/null': Failed to get "consistent read" lock
Is another process using the image?

-- 
Pino Toscano

signature.asc
Description: This is a digitally signed message part.


Re: [Qemu-block] [Libguestfs] [PATCH] tests: regressions: make test-big-heap use a temporary empty file

2018-03-21 Thread Eric Blake

[adding qemu lists]

On 03/21/2018 07:51 AM, Richard W.M. Jones wrote:

On Wed, Mar 21, 2018 at 01:44:17PM +0100, Pino Toscano wrote:

Newer versions of qemu use file locking for the images by default, and
apparently that does not work with /dev/null.  Since this test just
calls qemu-img to get the format of an empty image, create a temporary
one instead.


ACK, but feels like this is a bug in qemu-img to me.


You're right that file locking on a character device like /dev/null is 
not going to work as expected, but is it a case where fcntl() actually 
fails, or is it worse where the fcntl() claiming the locks "succeeds" 
but doesn't do what we want?  That is, what were the actual error 
messages you ran into?


Is it something where qemu should instead be checking fstat() results 
and only doing locking on regular files and block devices?  At any rate, 
I agree that qemu itself should behave nicer on attempts to use 
/dev/null as an image.


--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



Re: [Qemu-block] [Qemu-ppc] [PATCH] qemu: include generated files with <> and not ""

2018-03-21 Thread Michael S. Tsirkin
On Wed, Mar 21, 2018 at 01:29:53PM +, Daniel P. Berrangé wrote:
> On Wed, Mar 21, 2018 at 03:08:36PM +0200, Michael S. Tsirkin wrote:
> > On Wed, Mar 21, 2018 at 08:16:00AM +0100, Thomas Huth wrote:
> > > On 20.03.2018 13:05, Michael S. Tsirkin wrote:
> > > > On Tue, Mar 20, 2018 at 09:58:23AM +0100, Laurent Vivier wrote:
> > > >> Le 20/03/2018 à 02:54, Michael S. Tsirkin a écrit :
> > > >>> QEMU coding style at the moment asks for all non-system
> > > >>> include files to be used with #include "foo.h".
> > > >>> However this rule actually does not make sense and
> > > >>> creates issues for when the included file is generated.
> > > >>
> > > >> If you change that, we can have issue when a system include has the 
> > > >> same
> > > >> name as our local include. With "", system header are taken 
> > > >> first.
> > > > 
> > > > Are you sure? I just tested and that is not the case with
> > > > either gcc or clang.
> > > > 
> > > >>> In C, include "file" means look in current directory,
> > > >>> then on include search path. Current directory here
> > > >>> means the source file directory.
> > > >>> By comparison include  means look on include search path.
> > > >>
> > > >> Not exactly, there is the notion of "system header" too.
> > > >>
> > > >> https://gcc.gnu.org/onlinedocs/cpp/Include-Syntax.html
> > > >>
> > > >> #include 
> > > >> This variant is used for system header files. It searches for a file
> > > >> named file in a standard list of system directories. You can prepend
> > > >> directories to this list with the -I option (see Invocation).
> > > > 
> > > > This is exactly what we do.
> > > > 
> > > >> #include "file"
> > > >> This variant is used for header files of your own program. It searches
> > > >> for a file named file first in the directory containing the current
> > > >> file, then in the quote directories and then the same directories used
> > > >> for . You can prepend directories to the list of quote 
> > > >> directories
> > > >> with the -iquote option.
> > > > 
> > > > Since we do not use -iquote, "" just adds the current directory.
> > > 
> > > So why don't we simply switch to use -iquote instead of -I for adding
> > > search paths for our own headers? We then would get a clean separation
> > > of QEMU headers from system headers.
> > > 
> > >  Thomas
> > 
> > It still leaves us with a host of problems e.g. the problem of stale
> > headers in the source directory.
> 
> We have a patch on list which effectively solves the problem of stale
> generated files in source directory, so that's largely a non-issue at
> this point IMHO.
> 
> Regards,
> Daniel

That was just one, and the solution is just to fail build.
I think we can strive to address at least some of the following:
- make sure that an incorrect use of a header fails to build
- make it easier for new developers to understand the codebase
- build correctly rather than fail in as many configurations as possible
- actually support a mix of in and out of tree builds

I think my patch under discussion does not address all issues here.
I'll post a new proposal now.

> -- 
> |: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org -o-https://fstop138.berrange.com :|
> |: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|



[Qemu-block] [PATCH for-2.12 v2] qcow2: Reset free_cluster_index when allocating a new refcount block

2018-03-21 Thread Alberto Garcia
When we try to allocate new clusters we first look for available ones
starting from s->free_cluster_index and once we find them we increase
their reference counts. Before we get to call update_refcount() to do
this last step s->free_cluster_index is already pointing to the next
cluster after the ones we are trying to allocate.

During update_refcount() it may happen however that we also need to
allocate a new refcount block in order to store the refcounts of these
new clusters (and to complicate things further that may also require
us to grow the refcount table). After all this we don't know if the
clusters that we originally tried to allocate are still available, so
we return -EAGAIN to ask the caller to restart the search for free
clusters.

This is what can happen in a common scenario:

  1) We want to allocate a new cluster and we see that cluster N is
 free.

  2) We try to increase N's refcount but all refcount blocks are full,
 so we allocate a new one at N+1 (where s->free_cluster_index was
 pointing at).

  3) Once we're done we return -EAGAIN to look again for a free
 cluster, but now s->free_cluster_index points at N+2, so that's
 the one we allocate. Cluster N remains unallocated and we have a
 hole in the qcow2 file.

This can be reproduced easily:

 qemu-img create -f qcow2 -o cluster_size=512 hd.qcow2 1M
 qemu-io -c 'write 0 124k' hd.qcow2

After this the image has 132608 bytes (256 clusters), and the refcount
block is full. If we write 512 more bytes it should allocate two new
clusters: the data cluster itself and a new refcount block.

 qemu-io -c 'write 124k 512' hd.qcow2

However the image has now three new clusters (259 in total), and the
first one of them is empty (and unallocated):

 dd if=hd.qcow2 bs=512c skip=256 count=1 | hexdump -C

If we write larger amounts of data in the last step instead of the 512
bytes used in this example we can create larger holes in the qcow2
file.

What this patch does is reset s->free_cluster_index to its previous
value when alloc_refcount_block() returns -EAGAIN. This way the caller
will try to allocate again the original clusters if they are still
free.

The output of iotest 026 also needs to be updated because now that
images have no holes some tests fail at a different point and the
number of leaked clusters is different.

Signed-off-by: Alberto Garcia 
Reviewed-by: Eric Blake 
---

v2: Force refcount_bits=16 in iotest 121 [Eric]

---
 block/qcow2-refcount.c |  7 +++
 tests/qemu-iotests/026.out |  6 +++---
 tests/qemu-iotests/121 | 20 
 tests/qemu-iotests/121.out | 10 ++
 4 files changed, 40 insertions(+), 3 deletions(-)

diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
index 362deaf303..6b8b63514a 100644
--- a/block/qcow2-refcount.c
+++ b/block/qcow2-refcount.c
@@ -839,6 +839,13 @@ static int QEMU_WARN_UNUSED_RESULT 
update_refcount(BlockDriverState *bs,
 qcow2_cache_put(s->refcount_block_cache, _block);
 }
 ret = alloc_refcount_block(bs, cluster_index, _block);
+/* If the caller needs to restart the search for free clusters,
+ * try the same ones first to see if they're still free. */
+if (ret == -EAGAIN) {
+if (s->free_cluster_index > (start >> s->cluster_bits)) {
+s->free_cluster_index = (start >> s->cluster_bits);
+}
+}
 if (ret < 0) {
 goto fail;
 }
diff --git a/tests/qemu-iotests/026.out b/tests/qemu-iotests/026.out
index 86a50a2e13..8e89416a86 100644
--- a/tests/qemu-iotests/026.out
+++ b/tests/qemu-iotests/026.out
@@ -533,7 +533,7 @@ Failed to flush the L2 table cache: No space left on device
 Failed to flush the refcount block cache: No space left on device
 write failed: No space left on device
 
-11 leaked clusters were found on the image.
+10 leaked clusters were found on the image.
 This means waste of disk space, but no harm to data.
 Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1073741824
 
@@ -561,7 +561,7 @@ Failed to flush the L2 table cache: No space left on device
 Failed to flush the refcount block cache: No space left on device
 write failed: No space left on device
 
-11 leaked clusters were found on the image.
+10 leaked clusters were found on the image.
 This means waste of disk space, but no harm to data.
 Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1073741824
 
@@ -589,7 +589,7 @@ Failed to flush the L2 table cache: No space left on device
 Failed to flush the refcount block cache: No space left on device
 write failed: No space left on device
 
-11 leaked clusters were found on the image.
+10 leaked clusters were found on the image.
 This means waste of disk space, but no harm to data.
 Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1073741824
 
diff --git a/tests/qemu-iotests/121 b/tests/qemu-iotests/121

Re: [Qemu-block] [Qemu-ppc] [PATCH] qemu: include generated files with <> and not ""

2018-03-21 Thread Daniel P . Berrangé
On Wed, Mar 21, 2018 at 03:08:36PM +0200, Michael S. Tsirkin wrote:
> On Wed, Mar 21, 2018 at 08:16:00AM +0100, Thomas Huth wrote:
> > On 20.03.2018 13:05, Michael S. Tsirkin wrote:
> > > On Tue, Mar 20, 2018 at 09:58:23AM +0100, Laurent Vivier wrote:
> > >> Le 20/03/2018 à 02:54, Michael S. Tsirkin a écrit :
> > >>> QEMU coding style at the moment asks for all non-system
> > >>> include files to be used with #include "foo.h".
> > >>> However this rule actually does not make sense and
> > >>> creates issues for when the included file is generated.
> > >>
> > >> If you change that, we can have issue when a system include has the same
> > >> name as our local include. With "", system header are taken first.
> > > 
> > > Are you sure? I just tested and that is not the case with
> > > either gcc or clang.
> > > 
> > >>> In C, include "file" means look in current directory,
> > >>> then on include search path. Current directory here
> > >>> means the source file directory.
> > >>> By comparison include  means look on include search path.
> > >>
> > >> Not exactly, there is the notion of "system header" too.
> > >>
> > >> https://gcc.gnu.org/onlinedocs/cpp/Include-Syntax.html
> > >>
> > >> #include 
> > >> This variant is used for system header files. It searches for a file
> > >> named file in a standard list of system directories. You can prepend
> > >> directories to this list with the -I option (see Invocation).
> > > 
> > > This is exactly what we do.
> > > 
> > >> #include "file"
> > >> This variant is used for header files of your own program. It searches
> > >> for a file named file first in the directory containing the current
> > >> file, then in the quote directories and then the same directories used
> > >> for . You can prepend directories to the list of quote directories
> > >> with the -iquote option.
> > > 
> > > Since we do not use -iquote, "" just adds the current directory.
> > 
> > So why don't we simply switch to use -iquote instead of -I for adding
> > search paths for our own headers? We then would get a clean separation
> > of QEMU headers from system headers.
> > 
> >  Thomas
> 
> It still leaves us with a host of problems e.g. the problem of stale
> headers in the source directory.

We have a patch on list which effectively solves the problem of stale
generated files in source directory, so that's largely a non-issue at
this point IMHO.

Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|



Re: [Qemu-block] [Qemu-devel] [PATCH] qcow2: Reset free_cluster_index when allocating a new refcount block

2018-03-21 Thread Eric Blake

On 03/21/2018 08:30 AM, Eric Blake wrote:

On 03/20/2018 08:55 AM, Alberto Garcia wrote:



This can be reproduced easily:

  qemu-img create -f qcow2 -o cluster_size=512 hd.qcow2 1M
  qemu-io -c 'write 0 124k' hd.qcow2


This reproduction fails if you use non-default refcount_order...


+++ b/tests/qemu-iotests/121
@@ -93,6 +93,26 @@ $QEMU_IO -c 'write 63M 130K' "$TEST_IMG" | 
_filter_qemu_io

  _check_test_img
+echo
+echo '=== Allocating a new refcount block must not leave holes in the 
image ==='

+echo
+
+IMGOPTS='cluster_size=512' _make_test_img 1M


...so here, I think IMGOPTS also has to include refcount_bits=4, so that 
calling iotests './check -o refcount_bits=3' or similar doesn't fail the 
test.


Except that it's spelled refcount_bits=16 vs. refcount_bits=8 (we 
convert refcount_bits as a power of 2 to refcount_order as an exponent 
value written into the qcow2 header).


--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



Re: [Qemu-block] [Qemu-devel] [PATCH] qcow2: Reset free_cluster_index when allocating a new refcount block

2018-03-21 Thread Eric Blake

On 03/20/2018 08:55 AM, Alberto Garcia wrote:

When we try to allocate new clusters we first look for available ones
starting from s->free_cluster_index and once we find them we increase
their reference counts. Before we get to call update_refcount() to do
this last step s->free_cluster_index is already pointing to the next
cluster after the ones we are trying to allocate.




This can be reproduced easily:

  qemu-img create -f qcow2 -o cluster_size=512 hd.qcow2 1M
  qemu-io -c 'write 0 124k' hd.qcow2


This reproduction fails if you use non-default refcount_order...


+++ b/tests/qemu-iotests/121
@@ -93,6 +93,26 @@ $QEMU_IO -c 'write 63M 130K' "$TEST_IMG" | _filter_qemu_io
  
  _check_test_img
  
+echo

+echo '=== Allocating a new refcount block must not leave holes in the image 
==='
+echo
+
+IMGOPTS='cluster_size=512' _make_test_img 1M


...so here, I think IMGOPTS also has to include refcount_bits=4, so that 
calling iotests './check -o refcount_bits=3' or similar doesn't fail the 
test.


--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



Re: [Qemu-block] [Qemu-ppc] [PATCH] qemu: include generated files with <> and not ""

2018-03-21 Thread Michael S. Tsirkin
On Wed, Mar 21, 2018 at 02:15:20PM +0100, Stefan Weil wrote:
> Am 21.03.2018 um 14:08 schrieb Michael S. Tsirkin:
> > It still leaves us with a host of problems e.g. the problem of stale
> > headers in the source directory.
> 
> There have already been suggestions in the past to forbid in-tree
> builds. Would it help if configure would refuse to run from the root
> source directory? At least .gitignore could be much smaller then.
> 
> Stefan

I think I have a better idea, not relying on external tools.
Will post a proposal shortly.

-- 
MST



Re: [Qemu-block] [PATCH v4 for 2.12 0/3] fix bitmaps migration through shared storage

2018-03-21 Thread Max Reitz
On 2018-03-20 18:05, Vladimir Sementsov-Ogievskiy wrote:
> Hi all.
> 
> This fixes bitmaps migration through shared storage. Look at 02 for
> details.
> 
> The bug introduced in 2.10 with the whole qcow2 bitmaps feature, so
> qemu-stable in CC. However I doubt that someone really suffered from this.
> 
> Do we need dirty bitmaps at all in inactive case? - that was a question in v2.
> And, keeping in mind that we are going to use inactive mode not only for
> incoming migration, I'm not sure that answer is NO (but, it may be "NO" for 
> 2.10, 2.11), so let's fix it in proposed here manner at least for 2.12.

Thanks, applied to my block branch:

https://github.com/XanClic/qemu/commits/block

Max



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-block] [Qemu-devel] [PATCH for-2.12] qcow2: Reset free_cluster_index when allocating a new refcount block

2018-03-21 Thread Alberto Garcia
On Wed 21 Mar 2018 02:08:23 PM CET, Eric Blake wrote:
>>   - This scenario is harder to reach: in order to fill a 1-cluster
>>   refcount table the size of the image needs to be larger than
>>   (cluster_size³ / refcount_bits) bytes, that's 16TB with the default
>>   parameters. So although it can be reproduced easily if you reduce
>>   the cluster size I think it's very infrequent under normal
>>   conditions.
>
> Yes, 16TB for default size, but only 2M for the best-case (512-byte
> cluster, 64-bit refcount), so still easy to write a test for.

The test case is indeed very easy:

   qemu-img create -f qcow2 -o cluster_size=512 hd.qcow2 16M
   qemu-io -c 'write 0 808' hd.qcow2
   qemu-io -c 'write 808 512' hd.qcow2

What I haven't done is debug the problem yet.

As I said I don't think this happens often in a real-world scenario, so
I wanted to fix the simple (and more common) problem first.

Berto



Re: [Qemu-block] [Qemu-ppc] [PATCH] qemu: include generated files with <> and not ""

2018-03-21 Thread Stefan Weil
Am 21.03.2018 um 14:08 schrieb Michael S. Tsirkin:
> It still leaves us with a host of problems e.g. the problem of stale
> headers in the source directory.

There have already been suggestions in the past to forbid in-tree
builds. Would it help if configure would refuse to run from the root
source directory? At least .gitignore could be much smaller then.

Stefan




Re: [Qemu-block] [Qemu-ppc] [PATCH] qemu: include generated files with <> and not ""

2018-03-21 Thread Michael S. Tsirkin
On Wed, Mar 21, 2018 at 08:16:00AM +0100, Thomas Huth wrote:
> On 20.03.2018 13:05, Michael S. Tsirkin wrote:
> > On Tue, Mar 20, 2018 at 09:58:23AM +0100, Laurent Vivier wrote:
> >> Le 20/03/2018 à 02:54, Michael S. Tsirkin a écrit :
> >>> QEMU coding style at the moment asks for all non-system
> >>> include files to be used with #include "foo.h".
> >>> However this rule actually does not make sense and
> >>> creates issues for when the included file is generated.
> >>
> >> If you change that, we can have issue when a system include has the same
> >> name as our local include. With "", system header are taken first.
> > 
> > Are you sure? I just tested and that is not the case with
> > either gcc or clang.
> > 
> >>> In C, include "file" means look in current directory,
> >>> then on include search path. Current directory here
> >>> means the source file directory.
> >>> By comparison include  means look on include search path.
> >>
> >> Not exactly, there is the notion of "system header" too.
> >>
> >> https://gcc.gnu.org/onlinedocs/cpp/Include-Syntax.html
> >>
> >> #include 
> >> This variant is used for system header files. It searches for a file
> >> named file in a standard list of system directories. You can prepend
> >> directories to this list with the -I option (see Invocation).
> > 
> > This is exactly what we do.
> > 
> >> #include "file"
> >> This variant is used for header files of your own program. It searches
> >> for a file named file first in the directory containing the current
> >> file, then in the quote directories and then the same directories used
> >> for . You can prepend directories to the list of quote directories
> >> with the -iquote option.
> > 
> > Since we do not use -iquote, "" just adds the current directory.
> 
> So why don't we simply switch to use -iquote instead of -I for adding
> search paths for our own headers? We then would get a clean separation
> of QEMU headers from system headers.
> 
>  Thomas

It still leaves us with a host of problems e.g. the problem of stale
headers in the source directory.

-- 
MST



Re: [Qemu-block] [Qemu-devel] [PATCH for-2.12] qcow2: Reset free_cluster_index when allocating a new refcount block

2018-03-21 Thread Eric Blake

On 03/21/2018 04:28 AM, Alberto Garcia wrote:

On Tue 20 Mar 2018 06:54:15 PM CET, Eric Blake wrote:


When we try to allocate new clusters we first look for available ones
starting from s->free_cluster_index and once we find them we increase
their reference counts. Before we get to call update_refcount() to do
this last step s->free_cluster_index is already pointing to the next
cluster after the ones we are trying to allocate.


  > During update_refcount() it may happen however that we also need to
  > allocate a new refcount block in order to store the refcounts of
  > these new clusters

Your changes to test 121 covers this...

  > (and to complicate things further that may also require us to grow
  > the refcount table).

...but not this.  Is it worth also trying to cover this case in the
testsuite as well?


I checked and the patch doesn't really fix that scenario. There's a
different problem that I haven't debugged completely yet, because of two
reasons:

  - One difference is that when we grow the refcount table we actually
allocate a new one, so s->free_cluster_index points to the beginning
of the image (where the previous table was) and any holes left during
the process are allocated after that (depending on how much data we
write though).


Yeah, that can make the test harder to reason about, and is slightly 
more sensitive to our internal algorithm - but it also explains why you 
checked for index > start rather than unconditionally assigning index = 
start.




  - This scenario is harder to reach: in order to fill a 1-cluster
refcount table the size of the image needs to be larger than
(cluster_size³ / refcount_bits) bytes, that's 16TB with the default
parameters. So although it can be reproduced easily if you reduce the
cluster size I think it's very infrequent under normal conditions.


Yes, 16TB for default size, but only 2M for the best-case (512-byte 
cluster, 64-bit refcount), so still easy to write a test for.




But yes, it's a task left for the future.


Fair enough.




+/* If the caller needs to restart the search for free clusters,
+ * try the same ones first to see if they're still free. */
+if (ret == -EAGAIN) {
+if (s->free_cluster_index > (start >> s->cluster_bits)) {
+s->free_cluster_index = (start >> s->cluster_bits);
+}


Is there any harm in making this assignment unconditional, instead of
only doing it when free_cluster_index has grown larger than start?


It can happen that it is smaller than 'start' if we were moving the
refcount table to a new location, so we want to keep the lowest value.


Okay, then my R-b is sufficient on the patch as-is.




[And unrelated, but it might be nice to do a followup cleanup to track
free_cluster_offset by bytes instead of having to shift
free_cluster_index everywhere]


I've actually just seen that we already have free_byte_offset, we use
that for compressed clusters, so it might be possible to use that one...


free_byte_offset really IS a byte offset, as it can point mid-cluster. 
But having EVERYTHING be byte-based seems like it will make reasoning 
about the code easier to do.




I'll put that in my TODO list.

Berto



--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



Re: [Qemu-block] [PATCH for-2.12 10/12] vhdx: Don't use error_setg_errno() with constant errno

2018-03-21 Thread Jeff Cody
On Tue, Mar 20, 2018 at 06:36:30PM +0100, Kevin Wolf wrote:
> error_setg_errno() is meant for cases where we got an errno from the OS
> that can add useful extra information to an error message. It's
> pointless if we pass a constant errno, these cases should use plain
> error_setg().
> 
> Signed-off-by: Kevin Wolf 
> ---
>  block/vhdx.c | 9 -
>  1 file changed, 4 insertions(+), 5 deletions(-)
> 
> diff --git a/block/vhdx.c b/block/vhdx.c
> index ca211a3196..0e48179b81 100644
> --- a/block/vhdx.c
> +++ b/block/vhdx.c
> @@ -1822,7 +1822,7 @@ static int coroutine_fn 
> vhdx_co_create(BlockdevCreateOptions *opts,
>  /* Validate options and set default values */
>  image_size = vhdx_opts->size;
>  if (image_size > VHDX_MAX_IMAGE_SIZE) {
> -error_setg_errno(errp, EINVAL, "Image size too large; max of 64TB");
> +error_setg(errp, "Image size too large; max of 64TB");
>  return -EINVAL;
>  }
>  
> @@ -1832,7 +1832,7 @@ static int coroutine_fn 
> vhdx_co_create(BlockdevCreateOptions *opts,
>  log_size = vhdx_opts->log_size;
>  }
>  if (log_size < MiB || (log_size % MiB) != 0) {
> -error_setg_errno(errp, EINVAL, "Log size must be a multiple of 1 
> MB");
> +error_setg(errp, "Log size must be a multiple of 1 MB");
>  return -EINVAL;
>  }
>  
> @@ -1874,7 +1874,7 @@ static int coroutine_fn 
> vhdx_co_create(BlockdevCreateOptions *opts,
>  }
>  
>  if (block_size < MiB || (block_size % MiB) != 0) {
> -error_setg_errno(errp, EINVAL, "Block size must be a multiple of 1 
> MB");
> +error_setg(errp, "Block size must be a multiple of 1 MB");
>  return -EINVAL;
>  }
>  if (!is_power_of_2(block_size)) {
> @@ -1882,8 +1882,7 @@ static int coroutine_fn 
> vhdx_co_create(BlockdevCreateOptions *opts,
>  return -EINVAL;
>  }
>  if (block_size > VHDX_BLOCK_SIZE_MAX) {
> -error_setg_errno(errp, EINVAL, "Block size must not exceed %d",
> - VHDX_BLOCK_SIZE_MAX);
> +error_setg(errp, "Block size must not exceed %d", 
> VHDX_BLOCK_SIZE_MAX);
>  return -EINVAL;
>  }
>  
> -- 
> 2.13.6
> 

Reviewed-by: Jeff Cody 




Re: [Qemu-block] [PATCH for-2.12 09/12] vhdx: Require power-of-two block size on create

2018-03-21 Thread Jeff Cody
On Tue, Mar 20, 2018 at 06:36:29PM +0100, Kevin Wolf wrote:
> Images with a non-power-of-two block size are invalid and cannot be
> opened. Reject such block sizes when creating an image.
> 

Good catch.

Reviewed-by: Jeff Cody 


> Signed-off-by: Kevin Wolf 
> ---
>  block/vhdx.c | 4 
>  1 file changed, 4 insertions(+)
> 
> diff --git a/block/vhdx.c b/block/vhdx.c
> index f1b97f4b49..ca211a3196 100644
> --- a/block/vhdx.c
> +++ b/block/vhdx.c
> @@ -1877,6 +1877,10 @@ static int coroutine_fn 
> vhdx_co_create(BlockdevCreateOptions *opts,
>  error_setg_errno(errp, EINVAL, "Block size must be a multiple of 1 
> MB");
>  return -EINVAL;
>  }
> +if (!is_power_of_2(block_size)) {
> +error_setg(errp, "Block size must be a power of two");
> +return -EINVAL;
> +}
>  if (block_size > VHDX_BLOCK_SIZE_MAX) {
>  error_setg_errno(errp, EINVAL, "Block size must not exceed %d",
>   VHDX_BLOCK_SIZE_MAX);
> -- 
> 2.13.6
> 



Re: [Qemu-block] [PATCH] scsi: turn "is this a SCSI device?" into a conditional hint

2018-03-21 Thread Laurent Vivier
On 21/03/2018 13:54, Paolo Bonzini wrote:
> If the user does not have permissions to send ioctls to the device (due to
> SELinux or cgroups, for example), the output can look like
> 
> qemu-kvm: -device scsi-block,drive=disk: cannot get SG_IO version number:
>   Operation not permitted.  Is this a SCSI device?
> 
> but this is confusing because the ioctl was blocked _before_ the device
> even received the SG_GET_VERSION_NUM ioctl.  Therefore, for EPERM errors
> the suggestion should be eliminated.  To make that simpler, change the
> code to use error_append_hint.
> 
> Reported-by: Ala Hino 
> Reviewed-by: Eric Blake 
> Signed-off-by: Paolo Bonzini 
> ---
>  hw/scsi/scsi-disk.c| 7 ---
>  hw/scsi/scsi-generic.c | 7 ---
>  2 files changed, 8 insertions(+), 6 deletions(-)

Reviewed-by: Laurent Vivier 





[Qemu-block] [PATCH] scsi: turn "is this a SCSI device?" into a conditional hint

2018-03-21 Thread Paolo Bonzini
If the user does not have permissions to send ioctls to the device (due to
SELinux or cgroups, for example), the output can look like

qemu-kvm: -device scsi-block,drive=disk: cannot get SG_IO version number:
  Operation not permitted.  Is this a SCSI device?

but this is confusing because the ioctl was blocked _before_ the device
even received the SG_GET_VERSION_NUM ioctl.  Therefore, for EPERM errors
the suggestion should be eliminated.  To make that simpler, change the
code to use error_append_hint.

Reported-by: Ala Hino 
Reviewed-by: Eric Blake 
Signed-off-by: Paolo Bonzini 
---
 hw/scsi/scsi-disk.c| 7 ---
 hw/scsi/scsi-generic.c | 7 ---
 2 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/hw/scsi/scsi-disk.c b/hw/scsi/scsi-disk.c
index 94043ed024..3e29063a3f 100644
--- a/hw/scsi/scsi-disk.c
+++ b/hw/scsi/scsi-disk.c
@@ -2637,9 +2637,10 @@ static void scsi_block_realize(SCSIDevice *dev, Error 
**errp)
 /* check we are using a driver managing SG_IO (version 3 and after) */
 rc = blk_ioctl(s->qdev.conf.blk, SG_GET_VERSION_NUM, _version);
 if (rc < 0) {
-error_setg(errp, "cannot get SG_IO version number: %s.  "
- "Is this a SCSI device?",
- strerror(-rc));
+error_setg_errno(errp, -rc, "cannot get SG_IO version number");
+if (rc != -EPERM) {
+error_append_hint(errp, "Is this a SCSI device?\n");
+}
 return;
 }
 if (sg_version < 3) {
diff --git a/hw/scsi/scsi-generic.c b/hw/scsi/scsi-generic.c
index 7414fe2d67..4753f8738f 100644
--- a/hw/scsi/scsi-generic.c
+++ b/hw/scsi/scsi-generic.c
@@ -500,9 +500,10 @@ static void scsi_generic_realize(SCSIDevice *s, Error 
**errp)
 /* check we are using a driver managing SG_IO (version 3 and after */
 rc = blk_ioctl(s->conf.blk, SG_GET_VERSION_NUM, _version);
 if (rc < 0) {
-error_setg(errp, "cannot get SG_IO version number: %s.  "
- "Is this a SCSI device?",
- strerror(-rc));
+error_setg_errno(errp, -rc, "cannot get SG_IO version number");
+if (rc != -EPERM) {
+error_append_hint(errp, "Is this a SCSI device?\n");
+}
 return;
 }
 if (sg_version < 3) {
-- 
2.16.2




[Qemu-block] [PATCH 4/4] qapi: new qmp command nbd-server-add-bitmap

2018-03-21 Thread Vladimir Sementsov-Ogievskiy
Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 qapi/block.json | 27 +++
 blockdev-nbd.c  | 23 +++
 2 files changed, 50 insertions(+)

diff --git a/qapi/block.json b/qapi/block.json
index c694524002..4afbbcd7b7 100644
--- a/qapi/block.json
+++ b/qapi/block.json
@@ -269,6 +269,33 @@
   'data': {'name': 'str', '*mode': 'NbdServerRemoveMode'} }
 
 ##
+# @nbd-server-add-bitmap:
+#
+# Export dirty bitmap through selected export. Bitmaps are searched for in
+# device attached to the export and in all its backings. Exported bitmap
+# is locked until NBD export is removed.
+#
+# @name: Export name.
+#
+# @bitmap: Bitmap name to search.
+#
+# @bitmap-export-name: How the bitmap will be seen by nbd clients
+#  (default @bitmap)
+#
+#
+# Returns: error on one of the following conditions:
+#   - the server is not running
+#   - export is not found
+#   - bitmap is not found
+#   - bitmap is disabled
+#   - bitmap is locked
+#
+# Since: 2.13
+##
+  { 'command': 'nbd-server-add-bitmap',
+'data': {'name': 'str', 'bitmap': 'str', '*bitmap-export-name': 'str'} }
+
+##
 # @nbd-server-stop:
 #
 # Stop QEMU's embedded NBD server, and unregister all devices previously
diff --git a/blockdev-nbd.c b/blockdev-nbd.c
index 65a84739ed..6b0c50732c 100644
--- a/blockdev-nbd.c
+++ b/blockdev-nbd.c
@@ -220,3 +220,26 @@ void qmp_nbd_server_stop(Error **errp)
 nbd_server_free(nbd_server);
 nbd_server = NULL;
 }
+
+void qmp_nbd_server_add_bitmap(const char *name, const char *bitmap,
+   bool has_bitmap_export_name,
+   const char *bitmap_export_name,
+   Error **errp)
+{
+NBDExport *exp;
+
+if (!nbd_server) {
+error_setg(errp, "NBD server not running");
+return;
+}
+
+exp = nbd_export_find(name);
+if (exp == NULL) {
+error_setg(errp, "Export '%s' is not found", name);
+return;
+}
+
+nbd_export_bitmap(exp, bitmap,
+  has_bitmap_export_name ? bitmap_export_name : bitmap,
+  errp);
+}
-- 
2.11.1




[Qemu-block] [PATCH 3/4] nbd/server: implement dirty bitmap export

2018-03-21 Thread Vladimir Sementsov-Ogievskiy
Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 include/block/nbd.h |   2 +
 nbd/server.c| 207 ++--
 2 files changed, 203 insertions(+), 6 deletions(-)

diff --git a/include/block/nbd.h b/include/block/nbd.h
index fcdcd54502..f0b459283f 100644
--- a/include/block/nbd.h
+++ b/include/block/nbd.h
@@ -315,6 +315,8 @@ void nbd_client_put(NBDClient *client);
 void nbd_server_start(SocketAddress *addr, const char *tls_creds,
   Error **errp);
 
+void nbd_export_bitmap(NBDExport *exp, const char *bitmap,
+   const char *bitmap_export_name, Error **errp);
 
 /* nbd_read
  * Reads @size bytes from @ioc. Returns 0 on success.
diff --git a/nbd/server.c b/nbd/server.c
index 8fe53ffd4b..6554919ef2 100644
--- a/nbd/server.c
+++ b/nbd/server.c
@@ -23,6 +23,12 @@
 #include "nbd-internal.h"
 
 #define NBD_META_ID_BASE_ALLOCATION 0
+#define NBD_META_ID_DIRTY_BITMAP 1
+
+#define NBD_MAX_BITMAP_EXTENTS (0x10 / 8) /* 1 mb of extents data */
+#define MAX_EXTENT_LENGTH QEMU_ALIGN_DOWN(INT32_MAX, 512)
+
+#define NBD_STATE_DIRTY 1
 
 static int system_errno_to_nbd_errno(int err)
 {
@@ -80,6 +86,9 @@ struct NBDExport {
 
 BlockBackend *eject_notifier_blk;
 Notifier eject_notifier;
+
+BdrvDirtyBitmap *export_bitmap;
+char *export_bitmap_name;
 };
 
 static QTAILQ_HEAD(, NBDExport) exports = QTAILQ_HEAD_INITIALIZER(exports);
@@ -92,6 +101,7 @@ typedef struct NBDExportMetaContexts {
 bool valid; /* means that negotiation of the option finished without
errors */
 bool base_allocation; /* export base:allocation context (block status) */
+bool dirty_bitmap; /* export qemu-dirty-bitmap: */
 } NBDExportMetaContexts;
 
 struct NBDClient {
@@ -786,12 +796,32 @@ static int nbd_meta_base_query(NBDClient *client, 
NBDExportMetaContexts *meta,
  >base_allocation, errp);
 }
 
+/* nbd_meta_bitmap_query
+ *
+ * Handle query to 'qemu-dirty-bitmap' namespace.
+ * 'len' is the amount of text remaining to be read from the current name, 
after
+ * the 'qemu-dirty-bitmap:' portion has been stripped.
+ *
+ * Return -errno on I/O error, 0 if option was completely handled by
+ * sending a reply about inconsistent lengths, or 1 on success. */
+static int nbd_meta_bitmap_query(NBDClient *client, NBDExportMetaContexts 
*meta,
+ uint32_t len, Error **errp)
+{
+if (!client->exp->export_bitmap) {
+return nbd_opt_skip(client, len, errp);
+}
+
+return nbd_meta_single_query(client, client->exp->export_bitmap_name, len,
+ >dirty_bitmap, errp);
+}
+
 struct {
 const char *ns;
 int (*func)(NBDClient *, NBDExportMetaContexts *, uint32_t, Error **);
 } meta_namespace_handlers[] = {
 /* namespaces should go in non-decreasing order by name length */
 {.ns = "base:", .func = nbd_meta_base_query},
+{.ns = "qemu-dirty-bitmap:", .func = nbd_meta_bitmap_query},
 };
 
 /* nbd_negotiate_meta_query
@@ -921,6 +951,17 @@ static int nbd_negotiate_meta_queries(NBDClient *client,
 }
 }
 
+if (meta->dirty_bitmap) {
+char *context = g_strdup_printf("qemu-dirty-bitmap:%s",
+exp->export_bitmap_name);
+ret = nbd_negotiate_send_meta_context(client, context,
+  NBD_META_ID_DIRTY_BITMAP,
+  errp);
+if (ret < 0) {
+return ret;
+}
+}
+
 ret = nbd_negotiate_send_rep(client, NBD_REP_ACK, errp);
 if (ret == 0) {
 meta->valid = true;
@@ -1549,6 +1590,11 @@ void nbd_export_put(NBDExport *exp)
 exp->blk = NULL;
 }
 
+if (exp->export_bitmap) {
+bdrv_dirty_bitmap_set_qmp_locked(exp->export_bitmap, false);
+g_free(exp->export_bitmap_name);
+}
+
 g_free(exp);
 }
 }
@@ -1790,6 +1836,9 @@ static int blockstatus_to_extent_be(BlockDriverState *bs, 
uint64_t offset,
 }
 
 /* nbd_co_send_extents
+ *
+ * NBD_REPLY_FLAG_DONE is not set, don't forget to send it.
+ *
  * @extents should be in big-endian */
 static int nbd_co_send_extents(NBDClient *client, uint64_t handle,
NBDExtent *extents, unsigned nb_extents,
@@ -1802,7 +1851,7 @@ static int nbd_co_send_extents(NBDClient *client, 
uint64_t handle,
 {.iov_base = extents, .iov_len = nb_extents * sizeof(extents[0])}
 };
 
-set_be_chunk(, NBD_REPLY_FLAG_DONE, NBD_REPLY_TYPE_BLOCK_STATUS,
+set_be_chunk(, 0, NBD_REPLY_TYPE_BLOCK_STATUS,
  handle, sizeof(chunk) - sizeof(chunk.h) + iov[1].iov_len);
 stl_be_p(_id, context_id);
 
@@ -1827,6 +1876,91 @@ static int nbd_co_send_block_status(NBDClient *client, 
uint64_t handle,
 return nbd_co_send_extents(client, handle, , 1, context_id, errp);
 }
 
+/* Set several 

Re: [Qemu-block] [Qemu-devel] [PATCH] scsi: turn "is this a SCSI device?" into a conditional hint

2018-03-21 Thread Eric Blake

On 03/21/2018 05:58 AM, Paolo Bonzini wrote:

If the user does not have permissions to send ioctls to the device (due to
SELinux or cgroups, for example), the output can look like

qemu-kvm: -device scsi-block,drive=disk: cannot get SG_IO version number:
   Operation not permitted.  Is this a SCSI device?

but this is confusing because the ioctl was blocked _before_ the device
even received the SG_GET_VERSION_NUM ioctl.  Therefore, for EPERM errors
the suggestion should be eliminated.  To make that simpler, change the
code to use error_append_hint.

Reported-by: Ala Hino 
Signed-off-by: Paolo Bonzini 
---
  hw/scsi/scsi-disk.c| 7 ---
  hw/scsi/scsi-generic.c | 7 ---
  2 files changed, 8 insertions(+), 6 deletions(-)




+++ b/hw/scsi/scsi-disk.c
@@ -2637,9 +2637,10 @@ static void scsi_block_realize(SCSIDevice *dev, Error 
**errp)
  /* check we are using a driver managing SG_IO (version 3 and after) */
  rc = blk_ioctl(s->qdev.conf.blk, SG_GET_VERSION_NUM, _version);
  if (rc < 0) {
-error_setg(errp, "cannot get SG_IO version number: %s.  "
- "Is this a SCSI device?",
- strerror(-rc));
+error_setg(errp, "cannot get SG_IO version number: %s", strerror(-rc));
+if (rc != -EPERM) {
+error_append_hint(errp, "Is this a SCSI device?");


Missing the \n (error_append_hint does NOT automatically add one, 
because sometimes hints are pieced together but should still display in 
one line).



+++ b/hw/scsi/scsi-generic.c
@@ -500,9 +500,10 @@ static void scsi_generic_realize(SCSIDevice *s, Error 
**errp)
  /* check we are using a driver managing SG_IO (version 3 and after */
  rc = blk_ioctl(s->conf.blk, SG_GET_VERSION_NUM, _version);
  if (rc < 0) {
-error_setg(errp, "cannot get SG_IO version number: %s.  "
- "Is this a SCSI device?",
- strerror(-rc));
+error_setg(errp, "cannot get SG_IO version number: %s", strerror(-rc));
+if (rc != -EPERM) {
+error_append_hint(errp, "Is this a SCSI device?");


And again.  With that fixed,
Reviewed-by: Eric Blake 

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



[Qemu-block] [PATCH for-2.13 0/4] NBD export bitmaps

2018-03-21 Thread Vladimir Sementsov-Ogievskiy
Hi all.

This is a proposal and realization of new NBD meta context:
qemu-dirty-bitmap. (I'll send corresponding proposal to NBD protocol
after some negotiation here)

Corresponding queries will look like:
qemu-dirty-bitmap:

Mapping from export-bitmap-name to BdrvDirtyBitmap is done through qmp
command nbd-server-add-bitmap. For now, only one bitmap export is
allowed per NBD export, however it may be easily improved if needed 
(we don't have such cases for now)

Client and testing.
I wrote client code for Virtuozzo, but it turned out to be unused,
actually it's used only for tests. We don't have cases, where we need
to import dirty bitmap through qemu nbd-client. All this done for
exporting dirty bitmaps to the third tool. So, I think, it is not worth
refactoring, rebasing and merging client part upstream, if there are no
real usage cases.

Vladimir Sementsov-Ogievskiy (4):
  nbd/server: refactor nbd_negotiate_meta_query for several namespaces
  nbd/server: add nbd_meta_single_query helper
  nbd/server: implement dirty bitmap export
  qapi: new qmp command nbd-server-add-bitmap

 qapi/block.json |  27 +
 include/block/nbd.h |   2 +
 blockdev-nbd.c  |  23 
 nbd/server.c| 308 ++--
 4 files changed, 324 insertions(+), 36 deletions(-)

-- 
2.11.1




Re: [Qemu-block] [Qemu-devel] [PATCH] scsi: turn "is this a SCSI device?" into a conditional hint

2018-03-21 Thread Paolo Bonzini
On 21/03/2018 13:17, Laurent Vivier wrote:
> On 21/03/2018 11:58, Paolo Bonzini wrote:
>> If the user does not have permissions to send ioctls to the device (due to
>> SELinux or cgroups, for example), the output can look like
>>
>> qemu-kvm: -device scsi-block,drive=disk: cannot get SG_IO version number:
>>   Operation not permitted.  Is this a SCSI device?
>>
>> but this is confusing because the ioctl was blocked _before_ the device
>> even received the SG_GET_VERSION_NUM ioctl.  Therefore, for EPERM errors
>> the suggestion should be eliminated.  To make that simpler, change the
>> code to use error_append_hint.
>>
>> Reported-by: Ala Hino 
>> Signed-off-by: Paolo Bonzini 
>> ---
>>  hw/scsi/scsi-disk.c| 7 ---
>>  hw/scsi/scsi-generic.c | 7 ---
>>  2 files changed, 8 insertions(+), 6 deletions(-)
>>
>> diff --git a/hw/scsi/scsi-disk.c b/hw/scsi/scsi-disk.c
>> index 94043ed024..ccc245589a 100644
>> --- a/hw/scsi/scsi-disk.c
>> +++ b/hw/scsi/scsi-disk.c
>> @@ -2637,9 +2637,10 @@ static void scsi_block_realize(SCSIDevice *dev, Error 
>> **errp)
>>  /* check we are using a driver managing SG_IO (version 3 and after) */
>>  rc = blk_ioctl(s->qdev.conf.blk, SG_GET_VERSION_NUM, _version);
>>  if (rc < 0) {
>> -error_setg(errp, "cannot get SG_IO version number: %s.  "
>> - "Is this a SCSI device?",
>> - strerror(-rc));
>> +error_setg(errp, "cannot get SG_IO version number: %s", 
>> strerror(-rc));
> 
> 
> You could use:
> 
>   error_setg_errno(errp, -rc, "cannot get SG_IO version number");

Nice, thanks.  Will do.

Paolo



[Qemu-block] [PATCH 2/4] nbd/server: add nbd_meta_single_query helper

2018-03-21 Thread Vladimir Sementsov-Ogievskiy
The helper will be reused for bitmaps namespace.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 nbd/server.c | 41 -
 1 file changed, 28 insertions(+), 13 deletions(-)

diff --git a/nbd/server.c b/nbd/server.c
index b830997114..8fe53ffd4b 100644
--- a/nbd/server.c
+++ b/nbd/server.c
@@ -733,44 +733,59 @@ static int nbd_negotiate_send_meta_context(NBDClient 
*client,
 return qio_channel_writev_all(client->ioc, iov, 2, errp) < 0 ? -EIO : 0;
 }
 
-/* nbd_meta_base_query
- *
- * Handle query to 'base' namespace. For now, only base:allocation context is
- * available in it.  'len' is the amount of text remaining to be read from
- * the current name, after the 'base:' portion has been stripped.
+/* Read len bytes and check matching to the pattern.
+ * @match is set to true on empty query for _LIST_ and for query matching the
+ * @pattern. @match is never set to false.
  *
  * Return -errno on I/O error, 0 if option was completely handled by
  * sending a reply about inconsistent lengths, or 1 on success. */
-static int nbd_meta_base_query(NBDClient *client, NBDExportMetaContexts *meta,
-   uint32_t len, Error **errp)
+static int nbd_meta_single_query(NBDClient *client, const char *pattern,
+ uint32_t len, bool *match, Error **errp)
 {
 int ret;
-char query[sizeof("allocation") - 1];
-size_t alen = strlen("allocation");
+char *query;
 
 if (len == 0) {
 if (client->opt == NBD_OPT_LIST_META_CONTEXT) {
-meta->base_allocation = true;
+*match = true;
 }
 return 1;
 }
 
-if (len != alen) {
+if (len != strlen(pattern)) {
 return nbd_opt_skip(client, len, errp);
 }
 
+query = g_malloc(len);
 ret = nbd_opt_read(client, query, len, errp);
 if (ret <= 0) {
+g_free(query);
 return ret;
 }
 
-if (strncmp(query, "allocation", alen) == 0) {
-meta->base_allocation = true;
+if (strncmp(query, pattern, len) == 0) {
+*match = true;
 }
+g_free(query);
 
 return 1;
 }
 
+/* nbd_meta_base_query
+ *
+ * Handle query to 'base' namespace. For now, only base:allocation context is
+ * available in it.  'len' is the amount of text remaining to be read from
+ * the current name, after the 'base:' portion has been stripped.
+ *
+ * Return -errno on I/O error, 0 if option was completely handled by
+ * sending a reply about inconsistent lengths, or 1 on success. */
+static int nbd_meta_base_query(NBDClient *client, NBDExportMetaContexts *meta,
+   uint32_t len, Error **errp)
+{
+return nbd_meta_single_query(client, "allocation", len,
+ >base_allocation, errp);
+}
+
 struct {
 const char *ns;
 int (*func)(NBDClient *, NBDExportMetaContexts *, uint32_t, Error **);
-- 
2.11.1




[Qemu-block] [PATCH 1/4] nbd/server: refactor nbd_negotiate_meta_query for several namespaces

2018-03-21 Thread Vladimir Sementsov-Ogievskiy
Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 nbd/server.c | 60 +++-
 1 file changed, 43 insertions(+), 17 deletions(-)

diff --git a/nbd/server.c b/nbd/server.c
index cea158913b..b830997114 100644
--- a/nbd/server.c
+++ b/nbd/server.c
@@ -771,13 +771,19 @@ static int nbd_meta_base_query(NBDClient *client, 
NBDExportMetaContexts *meta,
 return 1;
 }
 
+struct {
+const char *ns;
+int (*func)(NBDClient *, NBDExportMetaContexts *, uint32_t, Error **);
+} meta_namespace_handlers[] = {
+/* namespaces should go in non-decreasing order by name length */
+{.ns = "base:", .func = nbd_meta_base_query},
+};
+
 /* nbd_negotiate_meta_query
  *
  * Parse namespace name and call corresponding function to parse body of the
  * query.
  *
- * The only supported namespace now is 'base'.
- *
  * The function aims not wasting time and memory to read long unknown namespace
  * names.
  *
@@ -787,9 +793,12 @@ static int nbd_negotiate_meta_query(NBDClient *client,
 NBDExportMetaContexts *meta, Error **errp)
 {
 int ret;
-char query[sizeof("base:") - 1];
-size_t baselen = strlen("base:");
+int i;
 uint32_t len;
+int bytes_done = 0;
+char *query;
+int nb_ns = sizeof(meta_namespace_handlers) /
+sizeof(meta_namespace_handlers[0]);
 
 ret = nbd_opt_read(client, , sizeof(len), errp);
 if (ret <= 0) {
@@ -797,22 +806,39 @@ static int nbd_negotiate_meta_query(NBDClient *client,
 }
 cpu_to_be32s();
 
-/* The only supported namespace for now is 'base'. So query should start
- * with 'base:'. Otherwise, we can ignore it and skip the remainder. */
-if (len < baselen) {
-return nbd_opt_skip(client, len, errp);
-}
+query = g_malloc(strlen(meta_namespace_handlers[nb_ns - 1].ns));
 
-len -= baselen;
-ret = nbd_opt_read(client, query, baselen, errp);
-if (ret <= 0) {
-return ret;
-}
-if (strncmp(query, "base:", baselen) != 0) {
-return nbd_opt_skip(client, len, errp);
+for (i = 0; i < nb_ns; i++) {
+const char *ns = meta_namespace_handlers[i].ns;
+int ns_len = strlen(ns);
+int diff_len = strlen(ns) - bytes_done;
+
+assert(diff_len >= 0);
+
+if (diff_len > 0) {
+if (len < diff_len) {
+ret = nbd_opt_skip(client, len, errp);
+goto out;
+}
+
+len -= diff_len;
+ret = nbd_opt_read(client, query + bytes_done, diff_len, errp);
+if (ret <= 0) {
+goto out;
+}
+}
+
+if (!strncmp(query, ns, ns_len)) {
+ret = meta_namespace_handlers[i].func(client, meta, len, errp);
+goto out;
+}
 }
 
-return nbd_meta_base_query(client, meta, len, errp);
+ret = nbd_opt_skip(client, len, errp);
+
+out:
+g_free(query);
+return ret;
 }
 
 /* nbd_negotiate_meta_queries
-- 
2.11.1




Re: [Qemu-block] [Qemu-devel] [PATCH] scsi: turn "is this a SCSI device?" into a conditional hint

2018-03-21 Thread Laurent Vivier
On 21/03/2018 11:58, Paolo Bonzini wrote:
> If the user does not have permissions to send ioctls to the device (due to
> SELinux or cgroups, for example), the output can look like
> 
> qemu-kvm: -device scsi-block,drive=disk: cannot get SG_IO version number:
>   Operation not permitted.  Is this a SCSI device?
> 
> but this is confusing because the ioctl was blocked _before_ the device
> even received the SG_GET_VERSION_NUM ioctl.  Therefore, for EPERM errors
> the suggestion should be eliminated.  To make that simpler, change the
> code to use error_append_hint.
> 
> Reported-by: Ala Hino 
> Signed-off-by: Paolo Bonzini 
> ---
>  hw/scsi/scsi-disk.c| 7 ---
>  hw/scsi/scsi-generic.c | 7 ---
>  2 files changed, 8 insertions(+), 6 deletions(-)
> 
> diff --git a/hw/scsi/scsi-disk.c b/hw/scsi/scsi-disk.c
> index 94043ed024..ccc245589a 100644
> --- a/hw/scsi/scsi-disk.c
> +++ b/hw/scsi/scsi-disk.c
> @@ -2637,9 +2637,10 @@ static void scsi_block_realize(SCSIDevice *dev, Error 
> **errp)
>  /* check we are using a driver managing SG_IO (version 3 and after) */
>  rc = blk_ioctl(s->qdev.conf.blk, SG_GET_VERSION_NUM, _version);
>  if (rc < 0) {
> -error_setg(errp, "cannot get SG_IO version number: %s.  "
> - "Is this a SCSI device?",
> - strerror(-rc));
> +error_setg(errp, "cannot get SG_IO version number: %s", 
> strerror(-rc));


You could use:

  error_setg_errno(errp, -rc, "cannot get SG_IO version number");

Thanks,
Laurent



Re: [Qemu-block] [Qemu-devel] [PATCH v3 15/16] block/mirror: Add copy mode QAPI interface

2018-03-21 Thread Max Reitz
On 2018-03-20 18:35, Eric Blake wrote:
> On 02/28/2018 12:05 PM, Max Reitz wrote:
>> This patch allows the user to specify whether to use active or only
>> background mode for mirror block jobs.  Currently, this setting will
>> remain constant for the duration of the entire block job.
>>
>> Signed-off-by: Max Reitz 
>> ---
>>   qapi/block-core.json  | 11 +--
>>   include/block/block_int.h |  4 +++-
>>   block/mirror.c    | 12 +++-
>>   blockdev.c    |  9 -
>>   4 files changed, 27 insertions(+), 9 deletions(-)
>>
>> diff --git a/qapi/block-core.json b/qapi/block-core.json
>> index c73b769c27..1186c007ae 100644
>> --- a/qapi/block-core.json
>> +++ b/qapi/block-core.json
>> @@ -1573,6 +1573,9 @@
>>   # written. Both will result in identical contents.
>>   # Default is true. (Since 2.4)
>>   #
>> +# @copy-mode: when to copy data to the destination; defaults to
>> 'background'
>> +# (Since: 2.12)
>> +#
> 
> Are we still aiming for 2.12, or is this a feature which missed
> softfreeze and should therefore be 2.13?

The latter, I'll change it in v4.

Max



signature.asc
Description: OpenPGP digital signature


[Qemu-block] [PATCH] scsi: turn "is this a SCSI device?" into a conditional hint

2018-03-21 Thread Paolo Bonzini
If the user does not have permissions to send ioctls to the device (due to
SELinux or cgroups, for example), the output can look like

qemu-kvm: -device scsi-block,drive=disk: cannot get SG_IO version number:
  Operation not permitted.  Is this a SCSI device?

but this is confusing because the ioctl was blocked _before_ the device
even received the SG_GET_VERSION_NUM ioctl.  Therefore, for EPERM errors
the suggestion should be eliminated.  To make that simpler, change the
code to use error_append_hint.

Reported-by: Ala Hino 
Signed-off-by: Paolo Bonzini 
---
 hw/scsi/scsi-disk.c| 7 ---
 hw/scsi/scsi-generic.c | 7 ---
 2 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/hw/scsi/scsi-disk.c b/hw/scsi/scsi-disk.c
index 94043ed024..ccc245589a 100644
--- a/hw/scsi/scsi-disk.c
+++ b/hw/scsi/scsi-disk.c
@@ -2637,9 +2637,10 @@ static void scsi_block_realize(SCSIDevice *dev, Error 
**errp)
 /* check we are using a driver managing SG_IO (version 3 and after) */
 rc = blk_ioctl(s->qdev.conf.blk, SG_GET_VERSION_NUM, _version);
 if (rc < 0) {
-error_setg(errp, "cannot get SG_IO version number: %s.  "
- "Is this a SCSI device?",
- strerror(-rc));
+error_setg(errp, "cannot get SG_IO version number: %s", strerror(-rc));
+if (rc != -EPERM) {
+error_append_hint(errp, "Is this a SCSI device?");
+}
 return;
 }
 if (sg_version < 3) {
diff --git a/hw/scsi/scsi-generic.c b/hw/scsi/scsi-generic.c
index 7414fe2d67..b3de5df324 100644
--- a/hw/scsi/scsi-generic.c
+++ b/hw/scsi/scsi-generic.c
@@ -500,9 +500,10 @@ static void scsi_generic_realize(SCSIDevice *s, Error 
**errp)
 /* check we are using a driver managing SG_IO (version 3 and after */
 rc = blk_ioctl(s->conf.blk, SG_GET_VERSION_NUM, _version);
 if (rc < 0) {
-error_setg(errp, "cannot get SG_IO version number: %s.  "
- "Is this a SCSI device?",
- strerror(-rc));
+error_setg(errp, "cannot get SG_IO version number: %s", strerror(-rc));
+if (rc != -EPERM) {
+error_append_hint(errp, "Is this a SCSI device?");
+}
 return;
 }
 if (sg_version < 3) {
-- 
2.16.2




Re: [Qemu-block] [PATCH for-2.12 04/12] qemu-iotests: Enable 025 for luks

2018-03-21 Thread Daniel P . Berrangé
On Tue, Mar 20, 2018 at 06:36:24PM +0100, Kevin Wolf wrote:
> We want to test resizing even for luks. The only change that is needed
> is to explicitly zero out new space for luks because it's undefined.
> 
> Signed-off-by: Kevin Wolf 
> ---
>  tests/qemu-iotests/025 | 9 -
>  1 file changed, 8 insertions(+), 1 deletion(-)

Reviewed-by: Daniel P. Berrangé 


Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|



Re: [Qemu-block] [PATCH for-2.12 06/12] qemu-iotests: Test invalid resize on luks

2018-03-21 Thread Daniel P . Berrangé
On Tue, Mar 20, 2018 at 06:36:26PM +0100, Kevin Wolf wrote:
> This tests that the .bdrv_truncate implementation for luks doesn't crash
> for invalid image sizes.
> 
> Signed-off-by: Kevin Wolf 
> ---
>  tests/qemu-iotests/210 | 37 +
>  1 file changed, 37 insertions(+)

Reviewed-by: Daniel P. Berrangé 


Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|



Re: [Qemu-block] [PATCH for-2.12 05/12] luks: Turn another invalid assertion into check

2018-03-21 Thread Daniel P . Berrangé
On Tue, Mar 20, 2018 at 06:36:25PM +0100, Kevin Wolf wrote:
> Commit e39e959e fixed an invalid assertion in the .bdrv_length
> implementation, but left a similar assertion in place for
> .bdrv_truncate. Instead of crashing when the user requests a too large
> image size, fail gracefully.
> 
> A file size of exactly INT64_MAX caused failure before, but is actually
> legal.
> 
> Signed-off-by: Kevin Wolf 
> ---
>  block/crypto.c | 6 +-
>  1 file changed, 5 insertions(+), 1 deletion(-)

Reviewed-by: Daniel P. Berrangé 


Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|



Re: [Qemu-block] [Qemu-devel] [PATCH for-2.12] qcow2: Reset free_cluster_index when allocating a new refcount block

2018-03-21 Thread Alberto Garcia
On Tue 20 Mar 2018 06:54:15 PM CET, Eric Blake wrote:

>> When we try to allocate new clusters we first look for available ones
>> starting from s->free_cluster_index and once we find them we increase
>> their reference counts. Before we get to call update_refcount() to do
>> this last step s->free_cluster_index is already pointing to the next
>> cluster after the ones we are trying to allocate.
>> 
>  > During update_refcount() it may happen however that we also need to
>  > allocate a new refcount block in order to store the refcounts of
>  > these new clusters
>
> Your changes to test 121 covers this...
>
>  > (and to complicate things further that may also require us to grow
>  > the refcount table).
>
> ...but not this.  Is it worth also trying to cover this case in the
> testsuite as well?

I checked and the patch doesn't really fix that scenario. There's a
different problem that I haven't debugged completely yet, because of two
reasons:

 - One difference is that when we grow the refcount table we actually
   allocate a new one, so s->free_cluster_index points to the beginning
   of the image (where the previous table was) and any holes left during
   the process are allocated after that (depending on how much data we
   write though).

 - This scenario is harder to reach: in order to fill a 1-cluster
   refcount table the size of the image needs to be larger than
   (cluster_size³ / refcount_bits) bytes, that's 16TB with the default
   parameters. So although it can be reproduced easily if you reduce the
   cluster size I think it's very infrequent under normal conditions.

But yes, it's a task left for the future.

>> +/* If the caller needs to restart the search for free clusters,
>> + * try the same ones first to see if they're still free. */
>> +if (ret == -EAGAIN) {
>> +if (s->free_cluster_index > (start >> s->cluster_bits)) {
>> +s->free_cluster_index = (start >> s->cluster_bits);
>> +}
>
> Is there any harm in making this assignment unconditional, instead of
> only doing it when free_cluster_index has grown larger than start?

It can happen that it is smaller than 'start' if we were moving the
refcount table to a new location, so we want to keep the lowest value.

> [And unrelated, but it might be nice to do a followup cleanup to track
> free_cluster_offset by bytes instead of having to shift
> free_cluster_index everywhere]

I've actually just seen that we already have free_byte_offset, we use
that for compressed clusters, so it might be possible to use that one...

I'll put that in my TODO list.

Berto



Re: [Qemu-block] [Qemu-ppc] [PATCH] qemu: include generated files with <> and not ""

2018-03-21 Thread Thomas Huth
On 20.03.2018 13:05, Michael S. Tsirkin wrote:
> On Tue, Mar 20, 2018 at 09:58:23AM +0100, Laurent Vivier wrote:
>> Le 20/03/2018 à 02:54, Michael S. Tsirkin a écrit :
>>> QEMU coding style at the moment asks for all non-system
>>> include files to be used with #include "foo.h".
>>> However this rule actually does not make sense and
>>> creates issues for when the included file is generated.
>>
>> If you change that, we can have issue when a system include has the same
>> name as our local include. With "", system header are taken first.
> 
> Are you sure? I just tested and that is not the case with
> either gcc or clang.
> 
>>> In C, include "file" means look in current directory,
>>> then on include search path. Current directory here
>>> means the source file directory.
>>> By comparison include  means look on include search path.
>>
>> Not exactly, there is the notion of "system header" too.
>>
>> https://gcc.gnu.org/onlinedocs/cpp/Include-Syntax.html
>>
>> #include 
>> This variant is used for system header files. It searches for a file
>> named file in a standard list of system directories. You can prepend
>> directories to this list with the -I option (see Invocation).
> 
> This is exactly what we do.
> 
>> #include "file"
>> This variant is used for header files of your own program. It searches
>> for a file named file first in the directory containing the current
>> file, then in the quote directories and then the same directories used
>> for . You can prepend directories to the list of quote directories
>> with the -iquote option.
> 
> Since we do not use -iquote, "" just adds the current directory.

So why don't we simply switch to use -iquote instead of -I for adding
search paths for our own headers? We then would get a clean separation
of QEMU headers from system headers.

 Thomas