Re: [PATCH 1/2] spec: Recommend cap on NBD_REPLY_TYPE_BLOCK_STATUS length

2022-04-08 Thread Wouter Verhelst
Hi Eric,

On Thu, Apr 07, 2022 at 04:37:19PM -0500, Eric Blake wrote:
> The spec was silent on how many extents a server could reply with.
> However, both qemu and nbdkit (the two server implementations known to
> have implemented the NBD_CMD_BLOCK_STATUS extension) implement a hard
> cap, and will truncate the amount of extents in a reply to avoid
> sending a client a reply larger than the maximum NBD_CMD_READ response
> they are willing to tolerate:
> 
> When qemu first implemented NBD_CMD_BLOCK_STATUS for the
> base:allocation context (qemu commit e7b1948d51, Mar 2018), it behaved
> as if NBD_CMD_FLAG_REQ_ONE were always passed by the client, and never
> responded with more than one extent.  Later, when adding its
> qemu:dirty-bitmap:XXX context extension (qemu commit 3d068aff16, Jun
> 2018), it added a cap to 128k extents (1M+4 bytes), and that cap was
> applied to base:allocation once qemu started sending multiple extents
> for that context as well (qemu commit fb7afc797e, Jul 2018).  Qemu
> extents are never smaller than 512 bytes (other than an exception at
> the end of a file whose size is not aligned to 512), but even so, a
> request for just under 4G of block status could produce 8M extents,
> resulting in a reply of 64M if it were not capped smaller.
> 
> When nbdkit first implemented NBD_CMD_BLOCK_STATUS (nbdkit 4ca66f70a5,
> Mar 2019), it did not impose any restriction on the number of extents
> in the reply chunk.  But because it allows extents as small as one
> byte, it is easy to write a server that can amplify a client's request
> of status over 1M of the image into a reply over 8M in size, and it
> was very easy to demonstrate that a hard cap was needed to avoid
> crashing clients or otherwise killing the connection (a bad server
> impacting the client negatively); unique to nbdkit's situation is the
> fact that because it is designed for plugin server implementations,
> not capping the length of extent also posed a problem to nbdkit as the
> server (a client requesting large block status could cause the server
> to run out of memory depending on the plugin providing the server
> callbacks).  So nbdkit enforced a bound of 1M extents (8M+4 bytes,
> nbdkit commit 6e0dc839ea, Jun 2019).
> 
> Since the limit chosen by these two implementations is different, and
> since nbdkit has versions that were not limited, add this as a SHOULD
> NOT instead of MUST NOT constraint on servers implementing block
> status.  It does not matter that qemu picked a smaller limit that it
> truncates to, since we have already documented that the server may
> truncate for other reasons (such as it being inefficient to collect
> that many extents in the first place).  But documenting the limit now
> becomes even more important in the face of a future addition of 64-bit
> requests, where a client's request is no longer bounded to 4G and
> could thereby produce even more than 8M extents for the corner case
> when every 512 bytes is a new extent, if it were not for this
> recommendation.

It feels backwards to me to make this a restriction on the server side.
You're saying there are server implementations that will be inefficient
if there are more than 2^20 extents, and therefore no server should send
more than those, even if it can do so efficiently.

Isn't it up to the server implementation to decide what can be done
efficiently?

Perhaps we can make the language about possibly reducing length of
extents a bit stronger; but I don't think adding explicit limits for a
server's own protection is necessary.

> ---
>  doc/proto.md | 17 +++++++++++++----
>  1 file changed, 13 insertions(+), 4 deletions(-)
> 
> diff --git a/doc/proto.md b/doc/proto.md
> index bacccfa..c3c7cd9 100644
> --- a/doc/proto.md
> +++ b/doc/proto.md
> @@ -1815,6 +1815,12 @@ MUST initiate a hard disconnect.
>the different contexts need not have the same number of extents or
>cumulative extent length.
> 
> +  Servers SHOULD NOT send more than 2^20 extents in a single reply
> +  chunk; in other words, the maximum size of
> +  `NBD_REPLY_TYPE_BLOCK_STATUS` should not be more than 4 + 8*2^20
> +  (8,388,612 bytes), even if this requires that the server truncate
> +  the response in relation to the *length* requested by the client.
> +
>Even if the client did not use the `NBD_CMD_FLAG_REQ_ONE` flag in
>its request, the server MAY return fewer descriptors in the reply
>than would be required to fully specify the whole range of requested
> @@ -2181,10 +2187,13 @@ The following request types exist:
>  multiple descriptors, and the final descriptor MAY extend beyond
>  the original requested size if the server can determine a larger
>  length without additional effort.  On the other hand, the server MAY
> -return less data than requested. However the server MUST return at
> -least one status descriptor (and since each status descriptor has
> -a non-zero length, a client can always make progress on a
> -successful
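
For concreteness, here is a minimal sketch of the truncation the proposed
wording asks servers to perform, assuming the server already has its answer
as a list of (length, flags) pairs. The function and variable names are
illustrative only, not taken from qemu or nbdkit:

    import struct

    MAX_EXTENTS = 2**20                    # cap proposed by the patch
    NBD_REPLY_TYPE_BLOCK_STATUS = 5        # structured reply type from the spec

    def encode_block_status(context_id, extents):
        # Truncate so the chunk payload never exceeds 4 + 8*2^20
        # (8,388,612) bytes: a 4-byte context id plus 8-byte descriptors.
        extents = extents[:MAX_EXTENTS]
        payload = [struct.pack(">I", context_id)]
        payload += [struct.pack(">II", length, flags)
                    for length, flags in extents]
        data = b"".join(payload)
        assert len(data) <= 4 + 8 * MAX_EXTENTS
        return data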

Re: [Libguestfs] [PATCH 1/2] spec: Recommend cap on NBD_REPLY_TYPE_BLOCK_STATUS length

2022-04-08 Thread Richard W.M. Jones
On Fri, Apr 08, 2022 at 09:25:01AM +0200, Wouter Verhelst wrote:
> Hi Eric,
> 
> On Thu, Apr 07, 2022 at 04:37:19PM -0500, Eric Blake wrote:
> > The spec was silent on how many extents a server could reply with.
> > However, both qemu and nbdkit (the two server implementations known to
> > have implemented the NBD_CMD_BLOCK_STATUS extension) implement a hard
> > cap, and will truncate the amount of extents in a reply to avoid
> > sending a client a reply larger than the maximum NBD_CMD_READ response
> > they are willing to tolerate:
> > 
> > When qemu first implemented NBD_CMD_BLOCK_STATUS for the
> > base:allocation context (qemu commit e7b1948d51, Mar 2018), it behaved
> > as if NBD_CMD_FLAG_REQ_ONE were always passed by the client, and never
> > responded with more than one extent.  Later, when adding its
> > qemu:dirty-bitmap:XXX context extension (qemu commit 3d068aff16, Jun
> > 2018), it added a cap to 128k extents (1M+4 bytes), and that cap was
> > applied to base:allocation once qemu started sending multiple extents
> > for that context as well (qemu commit fb7afc797e, Jul 2018).  Qemu
> > extents are never smaller than 512 bytes (other than an exception at
> > the end of a file whose size is not aligned to 512), but even so, a
> > request for just under 4G of block status could produce 8M extents,
> > resulting in a reply of 64M if it were not capped smaller.
> > 
> > When nbdkit first implemented NBD_CMD_BLOCK_STATUS (nbdkit 4ca66f70a5,
> > Mar 2019), it did not impose any restriction on the number of extents
> > in the reply chunk.  But because it allows extents as small as one
> > byte, it is easy to write a server that can amplify a client's request
> > of status over 1M of the image into a reply over 8M in size, and it
> > was very easy to demonstrate that a hard cap was needed to avoid
> > crashing clients or otherwise killing the connection (a bad server
> > impacting the client negatively); unique to nbdkit's situation is the
> > fact that because it is designed for plugin server implementations,
> > not capping the length of extent also posed a problem to nbdkit as the
> > server (a client requesting large block status could cause the server
> > to run out of memory depending on the plugin providing the server
> > callbacks).  So nbdkit enforced a bound of 1M extents (8M+4 bytes,
> > nbdkit commit 6e0dc839ea, Jun 2019).
> > 
> > Since the limit chosen by these two implementations is different, and
> > since nbdkit has versions that were not limited, add this as a SHOULD
> > NOT instead of MUST NOT constraint on servers implementing block
> > status.  It does not matter that qemu picked a smaller limit that it
> > truncates to, since we have already documented that the server may
> > truncate for other reasons (such as it being inefficient to collect
> > that many extents in the first place).  But documenting the limit now
> > becomes even more important in the face of a future addition of 64-bit
> > requests, where a client's request is no longer bounded to 4G and
> > could thereby produce even more than 8M extents for the corner case
> > when every 512 bytes is a new extent, if it were not for this
> > recommendation.
> 
> It feels backwards to me to make this a restriction on the server side.
> You're saying there are server implementations that will be inefficient
> if there are more than 2^20 extents, and therefore no server should send
> more than those, even if it can do so efficiently.
> 
> Isn't it up to the server implementation to decide what can be done
> efficiently?
> 
> Perhaps we can make the language about possibly reducing length of
> extents a bit stronger; but I don't think adding explicit limits for a
> server's own protection is necessary.

I agree, but for a different reason.

I think Eric should add language that servers can consider limiting
response sizes in order to prevent possible amplification issues
and/or simply overwhelming the client with work (bad server DoS
attacks against clients are a thing!), but I don't think it's
necessarily a "SHOULD" issue.

BTW attached is an nbdkit plugin that creates an NBD server that
responds with massive numbers of byte-granularity extents, in case
anyone wants to test how nbdkit and/or clients respond:

$ chmod +x /var/tmp/lots-of-extents.py
$ /var/tmp/lots-of-extents.py -f

$ nbdinfo --map nbd://localhost | head
 0  1  3  hole,zero
 1  1  0  data
 2  1  3  hole,zero
 3  1  0  data
 4  1  3  hole,zero
 5  1  0  data
 6  1  3  hole,zero
 7  1  0  data
 8  1  3  hole,zero
 9  1  0  data
$ nbdinfo --map --totals nbd://localhost 
524288  50.0%   0 data
524288  50.0%   3 hole,zero
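
The plugin attachment itself is not reproduced in this archive. A
byte-granularity nbdkit Python plugin producing a map like the one above
could look roughly like the sketch below; the 1M size and the typ() helper
are inferred from the output, so treat it as a reconstruction rather than
Rich's actual script:

    #!/usr/sbin/nbdkit python
    # Sketch: a 1M export whose status alternates between 1-byte hole+zero
    # extents (even offsets) and 1-byte data extents (odd offsets).
    import nbdkit

    API_VERSION = 2
    SIZE = 1024 * 1024

    def typ(i):
        return (nbdkit.EXTENT_HOLE | nbdkit.EXTENT_ZERO) if i % 2 == 0 else 0

    def open(readonly):
        return 1

    def get_size(h):
        return SIZE

    def pread(h, buf, offset, flags):
        buf[:] = bytes(len(buf))      # all-zero data is good enough here

    def can_extents(h):
        return True

    def extents(h, count, offset, flags):
        # one (offset, length, type) descriptor per byte requested
        return [(i, 1, typ(i)) for i in range(offset, offset + count)]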

From experimentation I found this really hurts my laptop :-(

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com

Re: [Libguestfs] [PATCH 1/2] spec: Recommend cap on NBD_REPLY_TYPE_BLOCK_STATUS length

2022-04-08 Thread Richard W.M. Jones
On Fri, Apr 08, 2022 at 12:52:44PM +0100, Richard W.M. Jones wrote:
> From experimentation I found this really hurts my laptop :-(

... because of a memory leak in the nbdkit Python bindings as it
turned out.  This function:

> def extents(h, count, offset, flags):
>     e = [(i, 1, typ(i)) for i in range(offset, offset + count + 1)]
>     return e

returns tuples which never got dereferenced in the C part.

Fixed in:

https://gitlab.com/nbdkit/nbdkit/-/commit/4ede3d7b0778cc1fe661307d2289fadfc18a1af2

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
nbdkit - Flexible, fast NBD server with plugins
https://gitlab.com/nbdkit/nbdkit




Re: [Libguestfs] [PATCH 1/2] spec: Recommend cap on NBD_REPLY_TYPE_BLOCK_STATUS length

2022-04-08 Thread Nir Soffer
On Fri, Apr 8, 2022 at 2:52 PM Richard W.M. Jones  wrote:
>
> On Fri, Apr 08, 2022 at 09:25:01AM +0200, Wouter Verhelst wrote:
> > Hi Eric,
> >
> > On Thu, Apr 07, 2022 at 04:37:19PM -0500, Eric Blake wrote:
> > > The spec was silent on how many extents a server could reply with.
> > > However, both qemu and nbdkit (the two server implementations known to
> > > have implemented the NBD_CMD_BLOCK_STATUS extension) implement a hard
> > > cap, and will truncate the amount of extents in a reply to avoid
> > > sending a client a reply larger than the maximum NBD_CMD_READ response
> > > they are willing to tolerate:
> > >
> > > When qemu first implemented NBD_CMD_BLOCK_STATUS for the
> > > base:allocation context (qemu commit e7b1948d51, Mar 2018), it behaved
> > > as if NBD_CMD_FLAG_REQ_ONE were always passed by the client, and never
> > > responded with more than one extent.  Later, when adding its
> > > qemu:dirty-bitmap:XXX context extension (qemu commit 3d068aff16, Jun
> > > 2018), it added a cap to 128k extents (1M+4 bytes), and that cap was
> > > applied to base:allocation once qemu started sending multiple extents
> > > for that context as well (qemu commit fb7afc797e, Jul 2018).  Qemu
> > > extents are never smaller than 512 bytes (other than an exception at
> > > the end of a file whose size is not aligned to 512), but even so, a
> > > request for just under 4G of block status could produce 8M extents,
> > > resulting in a reply of 64M if it were not capped smaller.
> > >
> > > When nbdkit first implemented NBD_CMD_BLOCK_STATUS (nbdkit 4ca66f70a5,
> > > Mar 2019), it did not impose any restriction on the number of extents
> > > in the reply chunk.  But because it allows extents as small as one
> > > byte, it is easy to write a server that can amplify a client's request
> > > of status over 1M of the image into a reply over 8M in size, and it
> > > was very easy to demonstrate that a hard cap was needed to avoid
> > > crashing clients or otherwise killing the connection (a bad server
> > > impacting the client negatively); unique to nbdkit's situation is the
> > > fact that because it is designed for plugin server implementations,
> > > not capping the length of extent also posed a problem to nbdkit as the
> > > server (a client requesting large block status could cause the server
> > > to run out of memory depending on the plugin providing the server
> > > callbacks).  So nbdkit enforced a bound of 1M extents (8M+4 bytes,
> > > nbdkit commit 6e0dc839ea, Jun 2019).
> > >
> > > Since the limit chosen by these two implementations is different, and
> > > since nbdkit has versions that were not limited, add this as a SHOULD
> > > NOT instead of MUST NOT constraint on servers implementing block
> > > status.  It does not matter that qemu picked a smaller limit that it
> > > truncates to, since we have already documented that the server may
> > > truncate for other reasons (such as it being inefficient to collect
> > > that many extents in the first place).  But documenting the limit now
> > > becomes even more important in the face of a future addition of 64-bit
> > > requests, where a client's request is no longer bounded to 4G and
> > > could thereby produce even more than 8M extents for the corner case
> > > when every 512 bytes is a new extent, if it were not for this
> > > recommendation.
> >
> > It feels backwards to me to make this a restriction on the server side.
> > You're saying there are server implementations that will be inefficient
> > if there are more than 2^20 extents, and therefore no server should send
> > more than those, even if it can do so efficiently.
> >
> > Isn't it up to the server implementation to decide what can be done
> > efficiently?
> >
> > Perhaps we can make the language about possibly reducing length of
> > extents a bit stronger; but I don't think adding explicit limits for a
> > server's own protection is necessary.
>
> I agree, but for a different reason.
>
> I think Eric should add language that servers can consider limiting
> response sizes in order to prevent possible amplification issues
> and/or simply overwhelming the client with work (bad server DoS
> attacks against clients are a thing!), but I don't think it's
> necessarily a "SHOULD" issue.
>
> BTW attached is an nbdkit plugin that creates an NBD server that
> responds with massive numbers of byte-granularity extents, in case
> anyone wants to test how nbdkit and/or clients respond:
>
> $ chmod +x /var/tmp/lots-of-extents.py
> $ /var/tmp/lots-of-extents.py -f
>
> $ nbdinfo --map nbd://localhost | head
>  0  1  3  hole,zero
>  1  1  0  data
>  2  1  3  hole,zero
>  3  1  0  data
>  4  1  3  hole,zero
>  5  1  0  data
>  6  1  3  hole,zero
>  7  1  0  data
>  8  1  3  hole,zero
>  9  1  0  data
> $ nbdinfo --map --totals nbd

Re: [Libguestfs] [PATCH 1/2] spec: Recommend cap on NBD_REPLY_TYPE_BLOCK_STATUS length

2022-04-08 Thread Richard W.M. Jones
On Fri, Apr 08, 2022 at 04:48:59PM +0300, Nir Soffer wrote:
> This is a malicious server. A good client will drop the connection when
> receiving the first 1 byte chunk.
> 
> The real issue here is not enforcing or suggesting a limit on the number of
> extents the server returns, but enforcing a limit on the minimum size of
> a chunk.
> 
> Since this is the network *block device* protocol it should not allow chunks
> smaller than the device block size, so anything smaller than 512 bytes
> should be an invalid response from the server.
> 
> Even the last chunk should not be smaller than 512 bytes. The fact that you
> can serve a file with a size that is not aligned to 512 bytes does not mean
> that the export size can be unaligned to the logical block size. There are
> no real block devices that have such alignment, so the protocol should not
> allow this. A good server will round the file size down to the logical block
> size to avoid this issue.
> 
> How about letting the client set a minimum size of a chunk? This way we
> avoid the issue of limiting the number of chunks. Merging small chunks
> is best done on the server side instead of wasting bandwidth and doing
> this on the client side.

While it's interesting to know if chunks should obey the
(server-specified) minimum block size, I don't think we should force
operations to only work on sector boundaries.  That's a step
backwards.  We've spent a long time and effort making nbdkit & NBD
work well for < 512 byte images, byte granularity tails, and disk
operations.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-top is 'top' for virtual machines.  Tiny program with many
powerful monitoring features, net stats, disk stats, logging, etc.
http://people.redhat.com/~rjones/virt-top




Re: [PATCH v4 3/7] iotests: add copy-before-write: on-cbw-error tests

2022-04-08 Thread Hanna Reitz

On 07.04.22 15:27, Vladimir Sementsov-Ogievskiy wrote:

Add tests for new option of copy-before-write filter: on-cbw-error.

Note that we use QEMUMachine instead of VM class, because in further
commit we'll want to use throttling which doesn't work with -accel
qtest used by VM.

We also touch pylintrc to not break iotest 297.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  tests/qemu-iotests/pylintrc   |   5 +
  tests/qemu-iotests/tests/copy-before-write| 132 ++
  .../qemu-iotests/tests/copy-before-write.out  |   5 +
  3 files changed, 142 insertions(+)
  create mode 100755 tests/qemu-iotests/tests/copy-before-write
  create mode 100644 tests/qemu-iotests/tests/copy-before-write.out


Reviewed-by: Hanna Reitz 




Re: [PATCH v4 7/7] iotests: copy-before-write: add cases for cbw-timeout option

2022-04-08 Thread Hanna Reitz

On 07.04.22 15:27, Vladimir Sementsov-Ogievskiy wrote:

Add two simple test-cases: timeout failure with
break-snapshot-on-cbw-error behavior and similar with
break-guest-write-on-cbw-error behavior.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  tests/qemu-iotests/tests/copy-before-write| 81 +++
  .../qemu-iotests/tests/copy-before-write.out  |  4 +-
  2 files changed, 83 insertions(+), 2 deletions(-)


Reviewed-by: Hanna Reitz 




Re: [Libguestfs] [PATCH 1/2] spec: Recommend cap on NBD_REPLY_TYPE_BLOCK_STATUS length

2022-04-08 Thread Eric Blake
On Fri, Apr 08, 2022 at 04:48:59PM +0300, Nir Soffer wrote:
> On Fri, Apr 8, 2022 at 2:52 PM Richard W.M. Jones  wrote:
> >
> > On Fri, Apr 08, 2022 at 09:25:01AM +0200, Wouter Verhelst wrote:
> > > Hi Eric,
> > >
> > > On Thu, Apr 07, 2022 at 04:37:19PM -0500, Eric Blake wrote:
> > > > The spec was silent on how many extents a server could reply with.
> > > > However, both qemu and nbdkit (the two server implementations known to
> > > > have implemented the NBD_CMD_BLOCK_STATUS extension) implement a hard
> > > > cap, and will truncate the amount of extents in a reply to avoid
> > > > sending a client a reply larger than the maximum NBD_CMD_READ response
> > > > they are willing to tolerate:
> > > >
> > > > When qemu first implemented NBD_CMD_BLOCK_STATUS for the
> > > > base:allocation context (qemu commit e7b1948d51, Mar 2018), it behaved
> > > > as if NBD_CMD_FLAG_REQ_ONE were always passed by the client, and never
> > > > responded with more than one extent.  Later, when adding its
> > > > qemu:dirty-bitmap:XXX context extension (qemu commit 3d068aff16, Jun
> > > > 2018), it added a cap to 128k extents (1M+4 bytes), and that cap was
> > > > applied to base:allocation once qemu started sending multiple extents
> > > > for that context as well (qemu commit fb7afc797e, Jul 2018).  Qemu
> > > > extents are never smaller than 512 bytes (other than an exception at
> > > > the end of a file whose size is not aligned to 512), but even so, a
> > > > request for just under 4G of block status could produce 8M extents,
> > > > resulting in a reply of 64M if it were not capped smaller.

Qemu as server imposed the limit because sending a 64M reply to a
block status request for 4G (where every 512 bytes alternates status)
would kill qemu as client (which refuses to accept responses from the
server larger than 32M).  Nothing in here about considerations for the
server.
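
For concreteness, the arithmetic behind those numbers (plain
back-of-the-envelope Python, not qemu code):

    request_len = 4 * 2**30          # a block status request of ~4 GiB
    extent_size = 512                # status flips every 512 bytes
    extents     = request_len // extent_size   # 8388608 extents (8M)
    reply_size  = 4 + 8 * extents              # 4-byte context id + 8-byte descriptors
    client_cap  = 32 * 2**20         # qemu as client rejects replies over 32M
    print(extents, reply_size // 2**20, reply_size > client_cap)  # 8388608 64 True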

> > > >
> > > > When nbdkit first implemented NBD_CMD_BLOCK_STATUS (nbdkit 4ca66f70a5,
> > > > Mar 2019), it did not impose any restriction on the number of extents
> > > > in the reply chunk.  But because it allows extents as small as one
> > > > byte, it is easy to write a server that can amplify a client's request
> > > > of status over 1M of the image into a reply over 8M in size, and it
> > > > was very easy to demonstrate that a hard cap was needed to avoid
> > > > crashing clients or otherwise killing the connection (a bad server
> > > > impacting the client negatively);

Likewise, THIS part of the commit message is about nbdkit limiting things to 
avoid killing the client.

> > > > unique to nbdkit's situation is the
> > > > fact that because it is designed for plugin server implementations,
> > > > not capping the length of extent also posed a problem to nbdkit as the
> > > > server (a client requesting large block status could cause the server
> > > > to run out of memory depending on the plugin providing the server
> > > > callbacks).  So nbdkit enforced a bound of 1M extents (8M+4 bytes,
> > > > nbdkit commit 6e0dc839ea, Jun 2019).

This part is an incidental anecdote (we also had problems with clients
being able to kill the server, when the server was poorly written and
consumed too much memory by heavily alternating status), but it does
NOT change the content of the text change in the patch.

> > > >
> > > > Since the limit chosen by these two implementations is different, and
> > > > since nbdkit has versions that were not limited, add this as a SHOULD
> > > > NOT instead of MUST NOT constraint on servers implementing block
> > > > status.  It does not matter that qemu picked a smaller limit that it
> > > > truncates to, since we have already documented that the server may
> > > > truncate for other reasons (such as it being inefficient to collect
> > > > that many extents in the first place).  But documenting the limit now
> > > > becomes even more important in the face of a future addition of 64-bit
> > > > requests, where a client's request is no longer bounded to 4G and
> > > > could thereby produce even more than 8M extents for the corner case
> > > > when every 512 bytes is a new extent, if it were not for this
> > > > recommendation.
> > >
> > > It feels backwards to me to make this a restriction on the server side.
> > > You're saying there are server implementations that will be inefficient
> > > if there are more than 2^20 extents, and therefore no server should send
> > > more than those, even if it can do so efficiently.

The limitation is on the server to avoid killing clients.  Whether or
not the server happens to be inefficient if it could otherwise
generate more than 2^20 extents (as nbdkit was temporarily able to do)
is not mentioned in the spec change in the patch itself, but only in
the commit message.  I can do a better job of marking the server
effect as an interesting side effect.  Servers can ALWAYS choose to
truncate early to avoid exhausting their own resources; the important
part of this patch is that servers 

[PATCH 1/2] python/machine.py: upgrade vm.command() method

2022-04-08 Thread Vladimir Sementsov-Ogievskiy
The method is not popular; we prefer to use vm.qmp() and then check
success by hand. But that's not optimal. To simplify moving to
vm.command(), support the same interface improvements as in vm.qmp()
and rename it to the shorter vm.cmd().

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 python/qemu/machine/machine.py | 16 +++++++++++++---
 tests/qemu-iotests/256         | 34 +++++++++++++++++-----------------
 tests/qemu-iotests/257         | 36 ++++++++++++++++++------------------
 3 files changed, 48 insertions(+), 38 deletions(-)

diff --git a/python/qemu/machine/machine.py b/python/qemu/machine/machine.py
index 07ac5a710b..a3fb840b93 100644
--- a/python/qemu/machine/machine.py
+++ b/python/qemu/machine/machine.py
@@ -648,14 +648,24 @@ def qmp(self, cmd: str,
             self._quit_issued = True
         return ret
 
-    def command(self, cmd: str,
-                conv_keys: bool = True,
-                **args: Any) -> QMPReturnValue:
+    def cmd(self, cmd: str,
+            args_dict: Optional[Dict[str, object]] = None,
+            conv_keys: Optional[bool] = None,
+            **args: Any) -> QMPReturnValue:
         """
         Invoke a QMP command.
         On success return the response dict.
         On failure raise an exception.
         """
+        if args_dict is not None:
+            assert not args
+            assert conv_keys is None
+            args = args_dict
+            conv_keys = False
+
+        if conv_keys is None:
+            conv_keys = True
+
         qmp_args = self._qmp_args(conv_keys, args)
         ret = self._qmp.command(cmd, **qmp_args)
         if cmd == 'quit':
diff --git a/tests/qemu-iotests/256 b/tests/qemu-iotests/256
index 13666813bd..fffc8ef055 100755
--- a/tests/qemu-iotests/256
+++ b/tests/qemu-iotests/256
@@ -40,25 +40,25 @@ with iotests.FilePath('img0') as img0_path, \
     def create_target(filepath, name, size):
         basename = os.path.basename(filepath)
         nodename = "file_{}".format(basename)
-        log(vm.command('blockdev-create', job_id='job1',
-                       options={
-                           'driver': 'file',
-                           'filename': filepath,
-                           'size': 0,
-                       }))
+        log(vm.cmd('blockdev-create', job_id='job1',
+                   options={
+                       'driver': 'file',
+                       'filename': filepath,
+                       'size': 0,
+                   }))
         vm.run_job('job1')
-        log(vm.command('blockdev-add', driver='file',
-                       node_name=nodename, filename=filepath))
-        log(vm.command('blockdev-create', job_id='job2',
-                       options={
-                           'driver': iotests.imgfmt,
-                           'file': nodename,
-                           'size': size,
-                       }))
+        log(vm.cmd('blockdev-add', driver='file',
+                   node_name=nodename, filename=filepath))
+        log(vm.cmd('blockdev-create', job_id='job2',
+                   options={
+                       'driver': iotests.imgfmt,
+                       'file': nodename,
+                       'size': size,
+                   }))
         vm.run_job('job2')
-        log(vm.command('blockdev-add', driver=iotests.imgfmt,
-                       node_name=name,
-                       file=nodename))
+        log(vm.cmd('blockdev-add', driver=iotests.imgfmt,
+                   node_name=name,
+                   file=nodename))
 
     log('--- Preparing images & VM ---\n')
     vm.add_object('iothread,id=iothread0')
diff --git a/tests/qemu-iotests/257 b/tests/qemu-iotests/257
index e7e7a2317e..7d3720b8e5 100755
--- a/tests/qemu-iotests/257
+++ b/tests/qemu-iotests/257
@@ -160,26 +160,26 @@ class Drive:
         file_node_name = "file_{}".format(basename)
         vm = self.vm
 
-        log(vm.command('blockdev-create', job_id='bdc-file-job',
-                       options={
-                           'driver': 'file',
-                           'filename': self.path,
-                           'size': 0,
-                       }))
+        log(vm.cmd('blockdev-create', job_id='bdc-file-job',
+                   options={
+                       'driver': 'file',
+                       'filename': self.path,
+                       'size': 0,
+                   }))
         vm.run_job('bdc-file-job')
-        log(vm.command('blockdev-add', driver='file',
-                       node_name=file_node_name, filename=self.path))
-
-        log(vm.command('blockdev-create', job_id='bdc-fmt-job',
-                       options={
-                           'driver': fmt,
-                           'file': file_node_name,
-                           'size': size,
-                       }))
+        log(vm.cmd('blockdev-add', driver='file',
+                   node_name=file_node_name, filename=self.path))
+
+        log(vm.cmd('blockdev-create', job_id='bdc-fmt-job'
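
To illustrate the reworked vm.cmd() interface introduced above, here are a
couple of hypothetical calls (img_path is a placeholder, not something from
the patch):

    # Keyword arguments behave as with vm.qmp(): with conv_keys left at its
    # default, underscores are converted to dashes before sending.
    vm.cmd('blockdev-add', driver='file', node_name='file0', filename=img_path)

    # A dict passed positionally is sent verbatim (conv_keys is forced off),
    # which is convenient when the QMP argument names already contain dashes.
    vm.cmd('blockdev-create', {'job-id': 'job0',
                               'options': {'driver': 'file',
                                           'filename': img_path,
                                           'size': 0}})

    # In both cases a QMP error raises an exception instead of being
    # returned, so callers no longer need to check the result by hand.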

[RFC 0/2] introduce QEMUMachine.cmd()

2022-04-08 Thread Vladimir Sementsov-Ogievskiy
Hi all!

I always dreamed about getting rid of the pattern

result = self.vm.qmp(...)
self.assert_qmp(result, 'return', {})

Here is a suggestion to switch to the

self.vm.cmd(...)

pattern instead.
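
For example, a check that today is written as (an illustrative snippet, not
taken from the series; the node and bitmap names are placeholders):

    result = self.vm.qmp('block-dirty-bitmap-add', node='drive0', name='bitmap0')
    self.assert_qmp(result, 'return', {})

becomes a single call that raises on error:

    self.vm.cmd('block-dirty-bitmap-add', node='drive0', name='bitmap0')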

I'm not sure we really want to update so many tests. Maybe we should
just commit patch 01 and use the new interface for new code. On the
other hand, old code is always used as an example when writing new code.

The series is based on John's python branch.

Vladimir Sementsov-Ogievskiy (2):
  python/machine.py: upgrade vm.command() method
  iotests: use vm.cmd() instead of vm.qmp() where appropriate

 python/qemu/machine/machine.py|  16 +-
 tests/qemu-iotests/030| 168 +++
 tests/qemu-iotests/040| 167 +++---
 tests/qemu-iotests/041| 474 --
 tests/qemu-iotests/045|  15 +-
 tests/qemu-iotests/055|  61 +--
 tests/qemu-iotests/056|  23 +-
 tests/qemu-iotests/093|  41 +-
 tests/qemu-iotests/118| 221 
 tests/qemu-iotests/124|  69 ++-
 tests/qemu-iotests/129|  13 +-
 tests/qemu-iotests/132|   5 +-
 tests/qemu-iotests/139|  43 +-
 tests/qemu-iotests/147|  30 +-
 tests/qemu-iotests/151|  40 +-
 tests/qemu-iotests/155|  53 +-
 tests/qemu-iotests/165|   7 +-
 tests/qemu-iotests/196|   3 +-
 tests/qemu-iotests/205|   6 +-
 tests/qemu-iotests/245| 245 -
 tests/qemu-iotests/256|  34 +-
 tests/qemu-iotests/257|  36 +-
 tests/qemu-iotests/264|  31 +-
 tests/qemu-iotests/281|  21 +-
 tests/qemu-iotests/295|  27 +-
 tests/qemu-iotests/296|  14 +-
 tests/qemu-iotests/298|  13 +-
 tests/qemu-iotests/300|  50 +-
 tests/qemu-iotests/iotests.py |   6 +-
 .../tests/migrate-bitmaps-postcopy-test   |  31 +-
 tests/qemu-iotests/tests/migrate-bitmaps-test |  37 +-
 .../qemu-iotests/tests/migrate-during-backup  |  40 +-
 .../qemu-iotests/tests/migration-permissions  |   9 +-
 tests/qemu-iotests/tests/mirror-top-perms |  15 +-
 34 files changed, 821 insertions(+), 1243 deletions(-)

-- 
2.35.1




Re: [Libguestfs] [PATCH 1/2] spec: Recommend cap on NBD_REPLY_TYPE_BLOCK_STATUS length

2022-04-08 Thread Nir Soffer
On Fri, Apr 8, 2022 at 6:47 PM Eric Blake  wrote:
>
> On Fri, Apr 08, 2022 at 04:48:59PM +0300, Nir Soffer wrote:
...
> > > BTW attached is an nbdkit plugin that creates an NBD server that
> > > responds with massive numbers of byte-granularity extents, in case
> > > anyone wants to test how nbdkit and/or clients respond:
> > >
> > > $ chmod +x /var/tmp/lots-of-extents.py
> > > $ /var/tmp/lots-of-extents.py -f
> > >
> > > $ nbdinfo --map nbd://localhost | head
> > >  0  1  3  hole,zero
> > >  1  1  0  data
> > >  2  1  3  hole,zero
> > >  3  1  0  data
> > >  4  1  3  hole,zero
> > >  5  1  0  data
> > >  6  1  3  hole,zero
> > >  7  1  0  data
> > >  8  1  3  hole,zero
> > >  9  1  0  data
> > > $ nbdinfo --map --totals nbd://localhost
> > > 524288  50.0%   0 data
> > > 524288  50.0%   3 hole,zero
> >
> > This is a malicious server. A good client will drop the connection when
> > receiving the first 1 byte chunk.
>
> Depends on the server.  Most servers don't serve 1-byte extents, and
> the NBD spec even recommends that extents be at least 512 bytes in
> size, and requires that extents be a multiple of any minimum block
> size if one was advertised by the server.
>
> But even though most servers don't have 1-byte extents does not mean
> that the NBD protocol must forbid them.

Forbidding this simplifies clients without limiting real world use cases.

What is a reason to allow this?

> > The real issue here is not enforcing or suggesting a limit on the number of
> > extents the server returns, but enforcing a limit on the minimum size of
> > a chunk.
> >
> > Since this is the network *block device* protocol it should not allow chunks
> > smaller than the device block size, so anything smaller than 512 bytes
> > should be an invalid response from the server.
>
> No, not an invalid response, but merely a discouraged one - and that
> text is already present in the wording of NBD_CMD_BLOCK_STATUS.

My suggestion is to make it an invalid response, because there are no block
devices that can return such a response.
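
As a concrete illustration of the check being debated, a client could verify
the granularity of a block status reply like this (a generic sketch, not code
from qemu, nbdkit or libnbd; note it still accepts the short unaligned tail
that the current spec allows and that Nir would like to forbid):

    def check_granularity(request_offset, extents, export_size, min_block=512):
        # extents: (length, flags) descriptors from one
        # NBD_REPLY_TYPE_BLOCK_STATUS chunk; min_block would come from the
        # server's advertised minimum block size.
        pos = request_offset
        for i, (length, flags) in enumerate(extents):
            tail = (i == len(extents) - 1 and pos + length == export_size)
            if length % min_block and not tail:
                raise ValueError("%d-byte extent breaks %d-byte granularity"
                                 % (length, min_block))
            pos += length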

> > Even the last chunk should not be smaller than 512 bytes. The fact that you
> > can serve a file with a size that is not aligned to 512 bytes does not mean
> > that the export size can be unaligned to the logical block size. There are
> > no real block devices that have such alignment, so the protocol should not
> > allow this. A good server will round the file size down to the logical block
> > size to avoid this issue.
> >
> > How about letting the client set a minimum size of a chunk? This way we
> > avoid the issue of limiting the number of chunks. Merging small chunks
> > is best done on the server side instead of wasting bandwidth and doing
> > this on the client side.
>
> The client can't set the minimum block size, but the server can
> certainly advertise one, and must obey that advertisement.  Or are you
> asking for a new extension where the client mandates what the minimum
> granularity must be from the server in responses to NBD_CMD_READ and
> NBD_CMD_BLOCK_STATUS, when the client wants a larger granularity than
> what the server advertises?  That's a different extension than this
> patch, but may be worth considering.

Yes, this should really be discussed in another thread.

Nir
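
For reference, the client-side merging that Nir calls wasteful amounts to
something like this (a generic sketch over (length, flags) descriptors, not
code from any of the implementations):

    def coalesce(extents):
        # Merge adjacent descriptors with identical status flags -- the work
        # a client has to do itself when the server reports many tiny extents.
        merged = []
        for length, flags in extents:
            if merged and merged[-1][1] == flags:
                merged[-1][0] += length
            else:
                merged.append([length, flags])
        return [tuple(e) for e in merged]

    # A run of equal flags collapses, alternating flags do not:
    assert coalesce([(1, 0), (1, 0), (2, 3)]) == [(2, 0), (2, 3)]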




Re: [Libguestfs] [PATCH 1/2] spec: Recommend cap on NBD_REPLY_TYPE_BLOCK_STATUS length

2022-04-08 Thread Richard W.M. Jones
On Fri, Apr 08, 2022 at 08:45:56PM +0300, Nir Soffer wrote:
> On Fri, Apr 8, 2022 at 6:47 PM Eric Blake  wrote:
> >
> > On Fri, Apr 08, 2022 at 04:48:59PM +0300, Nir Soffer wrote:
> ...
> > > > BTW attached is an nbdkit plugin that creates an NBD server that
> > > > responds with massive numbers of byte-granularity extents, in case
> > > > anyone wants to test how nbdkit and/or clients respond:
> > > >
> > > > $ chmod +x /var/tmp/lots-of-extents.py
> > > > $ /var/tmp/lots-of-extents.py -f
> > > >
> > > > $ nbdinfo --map nbd://localhost | head
> > > >  0  1  3  hole,zero
> > > >  1  1  0  data
> > > >  2  1  3  hole,zero
> > > >  3  1  0  data
> > > >  4  1  3  hole,zero
> > > >  5  1  0  data
> > > >  6  1  3  hole,zero
> > > >  7  1  0  data
> > > >  8  1  3  hole,zero
> > > >  9  1  0  data
> > > > $ nbdinfo --map --totals nbd://localhost
> > > > 524288  50.0%   0 data
> > > > 524288  50.0%   3 hole,zero
> > >
> > > This is a malicious server. A good client will drop the connection when
> > > receiving the first 1 byte chunk.
> >
> > Depends on the server.  Most servers don't serve 1-byte extents, and
> > the NBD spec even recommends that extents be at least 512 bytes in
> > size, and requires that extents be a multiple of any minimum block
> > size if one was advertised by the server.
> >
> > But even though most servers don't have 1-byte extents does not mean
> > that the NBD protocol must forbid them.
> 
> Forbidding this simplifies clients without limiting real world use cases.

I'm not even sure this is true.  Clients are quite free to only make
requests on 512 byte block boundaries if they want, and refuse to deal
with servers which offer non-aligned disk sizes or extents.

If clients are doing this and still have problems they ought to be
fixed, although I don't know what those would be.

> What is a reason to allow this?

Because you don't always want to serve things which are exactly disk
images, or which make assumptions related to modern PCs (256 byte
sectors were used into the 90s).

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-builder quickly builds VMs from scratch
http://libguestfs.org/virt-builder.1.html




Re: [PATCH v2 2/3] qapi: nbd-export: allow select bitmaps by node/name pair

2022-04-08 Thread Vladimir Sementsov-Ogievskiy

17.03.2022 00:28, Eric Blake wrote:

+++ b/qapi/block-export.json
@@ -6,6 +6,7 @@
  ##
  
  { 'include': 'sockets.json' }

+{ 'include': 'block-core.json' }

Hmm.  Does this extra inclusion negatively impact qemu-storage-daemon,
since that is why we created block-export.json in the first place (to
minimize the stuff that qsd pulled in without needing all of
block-core.json)?  In other words, would it be better to move
BlockDirtyBitmapOrStr to this file?


Actually, looking at storage-daemon/qapi/qapi-schema.json I see
block-core.json.

It's block.json which is not mentioned in
storage-daemon/qapi/qapi-schema.json.

So, I think it's OK to keep the simple include for now.

--
Best regards,
Vladimir



Re: [PATCH 2/3] libvhost-user: Fix extra vu_add/rem_mem_reg reply

2022-04-08 Thread Raphael Norwitz
On Thu, Apr 07, 2022 at 03:36:56PM +0200, Kevin Wolf wrote:
> Outside of postcopy mode, neither VHOST_USER_ADD_MEM_REG nor
> VHOST_USER_REM_MEM_REG are supposed to send a reply unless explicitly
> requested with the need_reply flag. Their current implementation always
> sends a reply, even if it isn't requested. This confuses the master
> because it will interpret the reply as a reply for the next message for
> which it actually expects a reply.
> 
> need_reply is already handled correctly by vu_dispatch(), so just don't
> send a reply in the non-postcopy part of the message handler for these
> two commands.
> 
> Signed-off-by: Kevin Wolf 

Reviewed-by: Raphael Norwitz 

> ---
>  subprojects/libvhost-user/libvhost-user.c | 9 +++------
>  1 file changed, 3 insertions(+), 6 deletions(-)
> 
> diff --git a/subprojects/libvhost-user/libvhost-user.c b/subprojects/libvhost-user/libvhost-user.c
> index 47d2efc60f..eccaff5168 100644
> --- a/subprojects/libvhost-user/libvhost-user.c
> +++ b/subprojects/libvhost-user/libvhost-user.c
> @@ -800,8 +800,7 @@ vu_add_mem_reg(VuDev *dev, VhostUserMsg *vmsg) {
>  
>          DPRINT("Successfully added new region\n");
>          dev->nregions++;
> -        vmsg_set_reply_u64(vmsg, 0);
> -        return true;
> +        return false;
>      }
>  }
>  
> @@ -874,15 +873,13 @@ vu_rem_mem_reg(VuDev *dev, VhostUserMsg *vmsg) {
>          }
>      }
>  
> -    if (found) {
> -        vmsg_set_reply_u64(vmsg, 0);
> -    } else {
> +    if (!found) {
>          vu_panic(dev, "Specified region not found\n");
>      }
>  
>      close(vmsg->fds[0]);
>  
> -    return true;
> +    return false;
>  }
>  
>  static bool
> -- 
> 2.35.1
> 


Re: [PATCH 3/3] vhost-user: Don't pass file descriptor for VHOST_USER_REM_MEM_REG

2022-04-08 Thread Raphael Norwitz
On Thu, Apr 07, 2022 at 03:36:57PM +0200, Kevin Wolf wrote:
> The spec clarifies now that QEMU should not send a file descriptor in a
> request to remove a memory region. Change it accordingly.
> 
> For libvhost-user, this is a bug fix that makes it compatible with
> rust-vmm's implementation that doesn't send a file descriptor. Keep
> accepting, but ignoring a file descriptor for compatibility with older
> QEMU versions.
> 
> Signed-off-by: Kevin Wolf 

Reviewed-by: Raphael Norwitz 

> ---
>  hw/virtio/vhost-user.c                    | 2 +-
>  subprojects/libvhost-user/libvhost-user.c | 8 ++++----
>  2 files changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
> index 6abbc9da32..82caf607e5 100644
> --- a/hw/virtio/vhost-user.c
> +++ b/hw/virtio/vhost-user.c
> @@ -751,7 +751,7 @@ static int send_remove_regions(struct vhost_dev *dev,
>          vhost_user_fill_msg_region(&region_buffer, shadow_reg, 0);
>          msg->payload.mem_reg.region = region_buffer;
>  
> -        ret = vhost_user_write(dev, msg, &fd, 1);
> +        ret = vhost_user_write(dev, msg, NULL, 0);
>          if (ret < 0) {
>              return ret;
>          }
> diff --git a/subprojects/libvhost-user/libvhost-user.c b/subprojects/libvhost-user/libvhost-user.c
> index eccaff5168..d0041c864b 100644
> --- a/subprojects/libvhost-user/libvhost-user.c
> +++ b/subprojects/libvhost-user/libvhost-user.c
> @@ -822,15 +822,15 @@ vu_rem_mem_reg(VuDev *dev, VhostUserMsg *vmsg) {
>      int i;
>      bool found = false;
>  
> -    if (vmsg->fd_num != 1) {
> +    if (vmsg->fd_num > 1) {
>          vmsg_close_fds(vmsg);
> -        vu_panic(dev, "VHOST_USER_REM_MEM_REG received %d fds - only 1 fd "
> +        vu_panic(dev, "VHOST_USER_REM_MEM_REG received %d fds - at most 1 fd "
>                   "should be sent for this message type", vmsg->fd_num);
>          return false;
>      }
>  
>      if (vmsg->size < VHOST_USER_MEM_REG_SIZE) {
> -        close(vmsg->fds[0]);
> +        vmsg_close_fds(vmsg);
>          vu_panic(dev, "VHOST_USER_REM_MEM_REG requires a message size of at "
>                   "least %d bytes and only %d bytes were received",
>                   VHOST_USER_MEM_REG_SIZE, vmsg->size);
> @@ -877,7 +877,7 @@ vu_rem_mem_reg(VuDev *dev, VhostUserMsg *vmsg) {
>          vu_panic(dev, "Specified region not found\n");
>      }
>  
> -    close(vmsg->fds[0]);
> +    vmsg_close_fds(vmsg);
>  
>      return false;
>  }
> -- 
> 2.35.1
> 


Re: [PATCH 1/3] docs/vhost-user: Clarifications for VHOST_USER_ADD/REM_MEM_REG

2022-04-08 Thread Raphael Norwitz
On Thu, Apr 07, 2022 at 03:36:55PM +0200, Kevin Wolf wrote:
> The specification for VHOST_USER_ADD/REM_MEM_REG messages is unclear
> in several points, which has led to clients having incompatible
> implementations. This changes the specification to be more explicit
> about them:
> 
> * VHOST_USER_ADD_MEM_REG is not specified as receiving a file
>   descriptor, though it obviously does need to do so. All
>   implementations agree on this one, fix the specification.
> 
> * VHOST_USER_REM_MEM_REG is not specified as receiving a file
>   descriptor either, and it also has no reason to do so. rust-vmm does
>   not send file descriptors for removing a memory region (in agreement
>   with the specification), libvhost-user and QEMU do (which is a bug),
>   though libvhost-user doesn't actually make any use of it.
> 
>   Change the specification so that for compatibility QEMU's behaviour
>   becomes legal, even if discouraged, but rust-vmm's behaviour becomes
>   the explicitly recommended mode of operation.
> 
> * VHOST_USER_ADD_MEM_REG doesn't have a documented return value, which
>   is the desired behaviour in the non-postcopy case. It is also implemented
>   like this in QEMU and rust-vmm, though libvhost-user is buggy and
>   sometimes sends an unexpected reply. This will be fixed in a separate
>   patch.
> 
>   However, in postcopy mode it does reply like VHOST_USER_SET_MEM_TABLE.
>   This behaviour is shared between libvhost-user and QEMU; rust-vmm
>   doesn't implement postcopy mode yet. Mention it explicitly in the
>   spec.
> 
> * The specification doesn't mention how VHOST_USER_REM_MEM_REG
>   identifies the memory region to be removed. Change it to describe the
>   existing behaviour of libvhost-user (guest address, user address and
>   size must match).
> 
> Signed-off-by: Kevin Wolf 

Reviewed-by: Raphael Norwitz 

> ---
>  docs/interop/vhost-user.rst | 17 +++++++++++++++++
>  1 file changed, 17 insertions(+)
> 
> diff --git a/docs/interop/vhost-user.rst b/docs/interop/vhost-user.rst
> index 4dbc84fd00..f9e721ba5f 100644
> --- a/docs/interop/vhost-user.rst
> +++ b/docs/interop/vhost-user.rst
> @@ -308,6 +308,7 @@ replies. Here is a list of the ones that do:
>  There are several messages that the master sends with file descriptors passed
>  in the ancillary data:
>  
> +* ``VHOST_USER_ADD_MEM_REG``
>  * ``VHOST_USER_SET_MEM_TABLE``
>  * ``VHOST_USER_SET_LOG_BASE`` (if ``VHOST_USER_PROTOCOL_F_LOG_SHMFD``)
>  * ``VHOST_USER_SET_LOG_FD``
> @@ -1334,6 +1335,14 @@ Master message types
>``VHOST_USER_REM_MEM_REG`` message, this message is used to set and
>update the memory tables of the slave device.
>  
> +  Exactly one file descriptor from which the memory is mapped is
> +  passed in the ancillary data.
> +
> +  In postcopy mode (see ``VHOST_USER_POSTCOPY_LISTEN``), the slave
> +  replies with the bases of the memory mapped region to the master.
> +  For further details on postcopy, see ``VHOST_USER_SET_MEM_TABLE``.
> +  They apply to ``VHOST_USER_ADD_MEM_REG`` accordingly.
> +
>  ``VHOST_USER_REM_MEM_REG``
>:id: 38
>:equivalent ioctl: N/A
> @@ -1349,6 +1358,14 @@ Master message types
>``VHOST_USER_ADD_MEM_REG`` message, this message is used to set and
>update the memory tables of the slave device.
>  
> +  The memory region to be removed is identified by its guest address,
> +  user address and size. The mmap offset is ignored.
> +
> +  No file descriptors SHOULD be passed in the ancillary data. For
> +  compatibility with existing incorrect implementations, the slave MAY
> +  accept messages with one file descriptor. If a file descriptor is
> +  passed, the slave MUST close it without using it otherwise.
> +
>  ``VHOST_USER_SET_STATUS``
>:id: 39
>:equivalent ioctl: VHOST_VDPA_SET_STATUS
> -- 
> 2.35.1
>