Re: [PATCH] block/file-posix: Fix return value translation for AIO discards.

2021-10-18 Thread Philippe Mathieu-Daudé
+Stefan

On 10/18/21 20:07, Ari Sundholm wrote:
> AIO discards regressed as a result of the following commit:
>   0dfc7af2 block/file-posix: Optimize for macOS
> 
> When trying to run blkdiscard within a Linux guest, the request would
> fail, with some errors in dmesg:
> 
>  [ snip ] 
> [4.010070] sd 2:0:0:0: [sda] tag#0 FAILED Result: hostbyte=DID_OK
> driverbyte=DRIVER_SENSE
> [4.011061] sd 2:0:0:0: [sda] tag#0 Sense Key : Aborted Command
> [current]
> [4.011061] sd 2:0:0:0: [sda] tag#0 Add. Sense: I/O process
> terminated
> [4.011061] sd 2:0:0:0: [sda] tag#0 CDB: Unmap/Read sub-channel 42
> 00 00 00 00 00 00 00 18 00
> [4.011061] blk_update_request: I/O error, dev sda, sector 0
>  [ snip ] 
> 
> This turns out to be a result of a flaw in changes to the error value
> translation logic in handle_aiocb_discard(). The default return value
> may be left untranslated in some configurations, and the wrong variable
> is used in one translation.
> 
> Fix both issues.

Worth including:

Cc: qemu-sta...@nongnu.org
Fixes: 0dfc7af2b28 ("block/file-posix: Optimize for macOS")

> Signed-off-by: Ari Sundholm 
> Signed-off-by: Emil Karlson 
> ---
>  block/file-posix.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/block/file-posix.c b/block/file-posix.c
> index 53be0bdc1b..6def2a4cba 100644
> --- a/block/file-posix.c
> +++ b/block/file-posix.c
> @@ -1807,7 +1807,7 @@ static int handle_aiocb_copy_range(void *opaque)
>  static int handle_aiocb_discard(void *opaque)
>  {
>  RawPosixAIOData *aiocb = opaque;
> -int ret = -EOPNOTSUPP;
> +int ret = -ENOTSUP;
>  BDRVRawState *s = aiocb->bs->opaque;
>  
>  if (!s->has_discard) {
> @@ -1829,7 +1829,7 @@ static int handle_aiocb_discard(void *opaque)
>  #ifdef CONFIG_FALLOCATE_PUNCH_HOLE
>  ret = do_fallocate(s->fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
> aiocb->aio_offset, aiocb->aio_nbytes);
> -ret = translate_err(-errno);
> +ret = translate_err(ret);
>  #elif defined(__APPLE__) && (__MACH__)
>  fpunchhole_t fpunchhole;
>  fpunchhole.fp_flags = 0;
> 




Re: [PATCH] block/file-posix: Fix return value translation for AIO discards.

2021-10-18 Thread Akihiko Odaki
Reviewed-by: Akihiko Odaki 

On Tue, Oct 19, 2021 at 3:08 AM Ari Sundholm  wrote:
>
> AIO discards regressed as a result of the following commit:
> 0dfc7af2 block/file-posix: Optimize for macOS
>
> When trying to run blkdiscard within a Linux guest, the request would
> fail, with some errors in dmesg:
>
>  [ snip ] 
> [4.010070] sd 2:0:0:0: [sda] tag#0 FAILED Result: hostbyte=DID_OK
> driverbyte=DRIVER_SENSE
> [4.011061] sd 2:0:0:0: [sda] tag#0 Sense Key : Aborted Command
> [current]
> [4.011061] sd 2:0:0:0: [sda] tag#0 Add. Sense: I/O process
> terminated
> [4.011061] sd 2:0:0:0: [sda] tag#0 CDB: Unmap/Read sub-channel 42
> 00 00 00 00 00 00 00 18 00
> [4.011061] blk_update_request: I/O error, dev sda, sector 0
>  [ snip ] 
>
> This turns out to be a result of a flaw in changes to the error value
> translation logic in handle_aiocb_discard(). The default return value
> may be left untranslated in some configurations, and the wrong variable
> is used in one translation.
>
> Fix both issues.
>
> Signed-off-by: Ari Sundholm 
> Signed-off-by: Emil Karlson 
> ---
>  block/file-posix.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/block/file-posix.c b/block/file-posix.c
> index 53be0bdc1b..6def2a4cba 100644
> --- a/block/file-posix.c
> +++ b/block/file-posix.c
> @@ -1807,7 +1807,7 @@ static int handle_aiocb_copy_range(void *opaque)
>  static int handle_aiocb_discard(void *opaque)
>  {
>  RawPosixAIOData *aiocb = opaque;
> -int ret = -EOPNOTSUPP;
> +int ret = -ENOTSUP;
>  BDRVRawState *s = aiocb->bs->opaque;
>
>  if (!s->has_discard) {
> @@ -1829,7 +1829,7 @@ static int handle_aiocb_discard(void *opaque)
>  #ifdef CONFIG_FALLOCATE_PUNCH_HOLE
>  ret = do_fallocate(s->fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
> aiocb->aio_offset, aiocb->aio_nbytes);
> -ret = translate_err(-errno);
> +ret = translate_err(ret);
>  #elif defined(__APPLE__) && (__MACH__)
>  fpunchhole_t fpunchhole;
>  fpunchhole.fp_flags = 0;
> --
> 2.31.1
>



[PATCH] block/file-posix: Fix return value translation for AIO discards.

2021-10-18 Thread Ari Sundholm
AIO discards regressed as a result of the following commit:
0dfc7af2 block/file-posix: Optimize for macOS

When trying to run blkdiscard within a Linux guest, the request would
fail, with some errors in dmesg:

 [ snip ] 
[4.010070] sd 2:0:0:0: [sda] tag#0 FAILED Result: hostbyte=DID_OK
driverbyte=DRIVER_SENSE
[4.011061] sd 2:0:0:0: [sda] tag#0 Sense Key : Aborted Command
[current]
[4.011061] sd 2:0:0:0: [sda] tag#0 Add. Sense: I/O process
terminated
[4.011061] sd 2:0:0:0: [sda] tag#0 CDB: Unmap/Read sub-channel 42
00 00 00 00 00 00 00 18 00
[4.011061] blk_update_request: I/O error, dev sda, sector 0
 [ snip ] 

This turns out to be a result of a flaw in changes to the error value
translation logic in handle_aiocb_discard(). The default return value
may be left untranslated in some configurations, and the wrong variable
is used in one translation.

Fix both issues.

Signed-off-by: Ari Sundholm 
Signed-off-by: Emil Karlson 
---
 block/file-posix.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/block/file-posix.c b/block/file-posix.c
index 53be0bdc1b..6def2a4cba 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -1807,7 +1807,7 @@ static int handle_aiocb_copy_range(void *opaque)
 static int handle_aiocb_discard(void *opaque)
 {
 RawPosixAIOData *aiocb = opaque;
-int ret = -EOPNOTSUPP;
+int ret = -ENOTSUP;
 BDRVRawState *s = aiocb->bs->opaque;
 
 if (!s->has_discard) {
@@ -1829,7 +1829,7 @@ static int handle_aiocb_discard(void *opaque)
 #ifdef CONFIG_FALLOCATE_PUNCH_HOLE
 ret = do_fallocate(s->fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
aiocb->aio_offset, aiocb->aio_nbytes);
-ret = translate_err(-errno);
+ret = translate_err(ret);
 #elif defined(__APPLE__) && (__MACH__)
 fpunchhole_t fpunchhole;
 fpunchhole.fp_flags = 0;
-- 
2.31.1




Re: [RFC PATCH 0/4] Replace custom test harness with "meson test"

2021-10-18 Thread Paolo Bonzini

On 18/10/21 11:51, Thomas Huth wrote:

* CTRL+C will only interrupt the longest running test.  Pressing
   CTRL+C repeatedly three times (which you would likely do anyway,
   that's how things work) interrupts the whole run


I tried this, and while hitting CTRL-C multiple times brought me back to 
the shell prompt, the remaining tests kept getting started in the 
background instead of getting stopped ... something is still fishy here, 
I think.


Ok, I checked that out.  Looks like CTRL+C magic and "make -j" are 
incompatible. :/  So this will have to wait a bit more, but in the 
meanwhile people can already use "meson test" if they want.



* Right now "make check-block" only does a single test run just like
   "../tests/check-block.sh", but it would be possible to add the 
thorough

   suite to "meson test --suite block" as well.


The output of the iotests is also not optimal yet... when running "make 
check SPEED=slow", the iotests are run multiple times with different 
target image types, but each run prints the same "▶ 1/1 test 001   OK" 
etc. to the console, so it's hard to say which target type  is currently 
exercised. Would it be possible to include the target image type here, 
e.g. something like:


Yes, that's trivial:

diff --git a/tests/qemu-iotests/testrunner.py 
b/tests/qemu-iotests/testrunner.py

index 3ef14af1fa..45debc1928 100644
--- a/tests/qemu-iotests/testrunner.py
+++ b/tests/qemu-iotests/testrunner.py
@@ -163,11 +163,11 @@ def test_print_one_line(self, test: str, 
starttime: str,


 if self.tap:
 if status == 'pass':
-print(f'ok test {test}')
+print(f'ok {self.env.imgfmt} {test}')
 elif status == 'fail':
-print(f'not ok test {test}')
+print(f'not ok {self.env.imgfmt} {test}')
 elif status == 'not run':
-print(f'ok test {test} # SKIP')
+print(f'ok {self.env.imgfmt} {test} # SKIP')
 return

 if lasttime:

In fact, that's exactly what was printed in the non-TAP case.  Thanks 
for the feedback, even though it was bad! :)


Paolo




Re: [PATCH 10/15] hw/nvme: Make max_ioqpairs and msix_qsize configurable in runtime

2021-10-18 Thread Łukasz Gieryk
On Mon, Oct 18, 2021 at 12:06:22PM +0200, Philippe Mathieu-Daudé wrote:
> On 10/7/21 18:24, Lukasz Maniak wrote:
> > From: Łukasz Gieryk 
> > 
> > The Nvme device defines two properties: max_ioqpairs, msix_qsize. Having
> > them as constants is problematic for SR-IOV support.
> > 
> > The SR-IOV feature introduces virtual resources (queues, interrupts)
> > that can be assigned to PF and its dependent VFs. Each device, following
> > a reset, should work with the configured number of queues. A single
> > constant is no longer sufficient to hold the whole state.
> > 
> > This patch tries to solve the problem by introducing additional
> > variables in NvmeCtrl’s state. The variables for, e.g., managing queues
> > are therefore organized as:
> > 
> >  - n->params.max_ioqpairs – no changes, constant set by the user.
> > 
> >  - n->max_ioqpairs - (new) value derived from n->params.* in realize();
> >  constant through device’s lifetime.
> > 
> >  - n->(mutable_state) – (not a part of this patch) user-configurable,
> > specifies number of queues available _after_
> > reset.
> > 
> >  - n->conf_ioqpairs - (new) used in all the places instead of the ‘old’
> >   n->params.max_ioqpairs; initialized in realize()
> >   and updated during reset() to reflect user’s
> >   changes to the mutable state.
> > 
> > Since the number of available i/o queues and interrupts can change in
> > runtime, buffers for sq/cqs and the MSIX-related structures are
> > allocated big enough to handle the limits, to completely avoid the
> > complicated reallocation. A helper function (nvme_update_msixcap_ts)
> > updates the corresponding capability register, to signal configuration
> > changes.
> > 
> > Signed-off-by: Łukasz Gieryk 
> > ---
> >  hw/nvme/ctrl.c | 62 +-
> >  hw/nvme/nvme.h |  4 
> >  2 files changed, 45 insertions(+), 21 deletions(-)
> 
> > @@ -6322,11 +6334,17 @@ static void nvme_init_state(NvmeCtrl *n)
> >  NvmeSecCtrlEntry *sctrl;
> >  int i;
> >  
> > +n->max_ioqpairs = n->params.max_ioqpairs;
> > +n->conf_ioqpairs = n->max_ioqpairs;
> > +
> > +n->max_msix_qsize = n->params.msix_qsize;
> > +n->conf_msix_qsize = n->max_msix_qsize;
> 
> From an developer perspective, the API becomes confusing.
> Most fields from NvmeParams are exposed via QMP, such max_ioqpairs.

Hi Philippe,

I’m not sure I understand your concern. The NvmeParams stays as it was,
so the interaction with QMP stays unchanged. Sure, if QMP allows
updating NvmeParams in runtime (I’m guessing, as I’m not really
accustomed with the feature), then the Nvme device will no longer
respond to those changes. But n->conf_ioqpairs is not meant to be
altered via QEMU’s interfaces, but rather though the NVME protocol, by
the guest OS kernel/user.

Could you explain how the changes are going to break (or make more
confusing) the interaction with QMP?

> I'm not sure we need 2 distinct fields. Maybe simply reorganize
> to not reset these values in the DeviceReset handler?

The idea was to calculate the max value once and use it in multiple
places later. The actual calculations are in the following 12/15 patch
(I’m also including the code below), so indeed, the intended use case
is not so obvious.

if (pci_is_vf(&n->parent_obj)) {
n->max_ioqpairs = n->params.sriov_max_vq_per_vf - 1;
} else {
n->max_ioqpairs = n->params.max_ioqpairs +
  n->params.sriov_max_vfs * n->params.sriov_max_vq_per_vf;
}

But as I’m thinking more about the problem, then indeed, the max_*
fields may be not necessary. I could calculate max_msix_qsize in the
only place it’s used, and turn the above snippet for max_iopairs into a
function. The downside is the code for calculating maximums is no longer
grouped together.

> Also, with this series we should consider implementing the migration
> state (nvme_vmstate).

I wasn’t aware of this feature. I have to do the readings first.

> > diff --git a/hw/nvme/nvme.h b/hw/nvme/nvme.h
> > index 9fbb0a70b5..65383e495c 100644
> > --- a/hw/nvme/nvme.h
> > +++ b/hw/nvme/nvme.h
> > @@ -420,6 +420,10 @@ typedef struct NvmeCtrl {
> >  uint64_tstarttime_ms;
> >  uint16_ttemperature;
> >  uint8_t smart_critical_warning;
> > +uint32_tmax_msix_qsize; /* Derived from 
> > params.msix.qsize */
> > +uint32_tconf_msix_qsize;/* Configured limit */
> > +uint32_tmax_ioqpairs;   /* Derived from 
> > params.max_ioqpairs */
> > +uint32_tconf_ioqpairs;  /* Configured limit */
> >  
> 

-- 
Regards,
Łukasz Gieryk




Re: [PATCH] block: Fail gracefully when blockdev-snapshot creates loops

2021-10-18 Thread Vladimir Sementsov-Ogievskiy

18.10.2021 16:47, Kevin Wolf wrote:

Using blockdev-snapshot to append a node as an overlay to itself, or to
any of its parents, causes crashes. Catch the condition and return an
error for these cases instead.

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1824363
Signed-off-by: Kevin Wolf 
---
  block.c| 10 ++
  tests/qemu-iotests/085 | 31 ++-
  tests/qemu-iotests/085.out | 33 ++---
  3 files changed, 70 insertions(+), 4 deletions(-)

diff --git a/block.c b/block.c
index 45f653a88b..231dddf655 100644
--- a/block.c
+++ b/block.c
@@ -84,6 +84,9 @@ static BlockDriverState *bdrv_open_inherit(const char 
*filename,
 BdrvChildRole child_role,
 Error **errp);
  
+static bool bdrv_recurse_has_child(BlockDriverState *bs,

+   BlockDriverState *child);
+
  static void bdrv_replace_child_noperm(BdrvChild *child,
BlockDriverState *new_bs);
  static void bdrv_remove_file_or_backing_child(BlockDriverState *bs,
@@ -2673,6 +2676,7 @@ static void bdrv_replace_child_noperm(BdrvChild *child,
  int drain_saldo;
  
  assert(!child->frozen);

+assert(old_bs != new_bs);
  
  if (old_bs && new_bs) {

  assert(bdrv_get_aio_context(old_bs) == bdrv_get_aio_context(new_bs));
@@ -2892,6 +2896,12 @@ static int bdrv_attach_child_noperm(BlockDriverState 
*parent_bs,
  
  assert(parent_bs->drv);
  
+if (bdrv_recurse_has_child(child_bs, parent_bs)) {

+error_setg(errp, "Making '%s' a %s child of '%s' would create a cycle",
+   parent_bs->node_name, child_name, child_bs->node_name);


Seems, child_bs and parent_bs should be swapped.

with that fixed:
Reviewed-by: Vladimir Sementsov-Ogievskiy 



+return -EINVAL;
+}
+
  bdrv_get_cumulative_perm(parent_bs, &perm, &shared_perm);
  bdrv_child_perm(parent_bs, child_bs, NULL, child_role, NULL,
  perm, shared_perm, &perm, &shared_perm);
diff --git a/tests/qemu-iotests/085 b/tests/qemu-iotests/085
index d557522943..de74262a26 100755
--- a/tests/qemu-iotests/085
+++ b/tests/qemu-iotests/085
@@ -103,11 +103,18 @@ do_blockdev_add()
  }
  
  # ${1}: unique identifier for the snapshot filename

-add_snapshot_image()
+create_snapshot_image()
  {
  base_image="${TEST_DIR}/$((${1}-1))-${snapshot_virt0}"
  snapshot_file="${TEST_DIR}/${1}-${snapshot_virt0}"
  TEST_IMG=$snapshot_file _make_test_img -u -b "${base_image}" -F $IMGFMT 
"$size"
+}
+
+# ${1}: unique identifier for the snapshot filename
+add_snapshot_image()
+{
+snapshot_file="${TEST_DIR}/${1}-${snapshot_virt0}"
+create_snapshot_image "$1"
  do_blockdev_add "$1" "'backing': null, " "${snapshot_file}"
  }
  
@@ -230,6 +237,28 @@ _make_test_img -b "${TEST_IMG}.base" -F $IMGFMT "$size"

  do_blockdev_add ${SNAPSHOTS} "" "${TEST_IMG}"
  blockdev_snapshot ${SNAPSHOTS} error
  
+echo

+echo === Invalid command - creating loops ===
+echo
+
+SNAPSHOTS=$((${SNAPSHOTS}+1))
+add_snapshot_image ${SNAPSHOTS}
+
+_send_qemu_cmd $h "{ 'execute': 'blockdev-snapshot',
+ 'arguments': { 'node':'snap_${SNAPSHOTS}',
+'overlay':'snap_${SNAPSHOTS}' }
+   }" "error"
+
+SNAPSHOTS=$((${SNAPSHOTS}+1))
+create_snapshot_image ${SNAPSHOTS}
+do_blockdev_add ${SNAPSHOTS} "'backing': 'snap_$((${SNAPSHOTS}-1))', " \
+"${TEST_DIR}/${SNAPSHOTS}-${snapshot_virt0}"
+
+_send_qemu_cmd $h "{ 'execute': 'blockdev-snapshot',
+ 'arguments': { 'node':'snap_${SNAPSHOTS}',
+'overlay':'snap_$((${SNAPSHOTS}-1))' }
+   }" "error"
+
  echo
  echo === Invalid command - The node does not exist ===
  echo
diff --git a/tests/qemu-iotests/085.out b/tests/qemu-iotests/085.out
index 1d4c565b6d..7750b3df5f 100644
--- a/tests/qemu-iotests/085.out
+++ b/tests/qemu-iotests/085.out
@@ -217,15 +217,42 @@ Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=134217728 
backing_file=TEST_DIR/
   'overlay':'snap_13' } }
  {"error": {"class": "GenericError", "desc": "The overlay already has a backing 
image"}}
  
+=== Invalid command - creating loops ===

+
+Formatting 'TEST_DIR/14-snapshot-v0.IMGFMT', fmt=IMGFMT size=134217728 
backing_file=TEST_DIR/13-snapshot-v0.IMGFMT backing_fmt=IMGFMT
+{ 'execute': 'blockdev-add', 'arguments':
+   { 'driver': 'IMGFMT', 'node-name': 'snap_14', 'backing': null,
+ 'file':
+ { 'driver': 'file', 'filename': 'TEST_DIR/14-snapshot-v0.IMGFMT',
+   'node-name': 'file_14' } } }
+{"return": {}}
+{ 'execute': 'blockdev-snapshot',
+ 'arguments': { 'node':'snap_14',
+'overlay':'snap_14' }
+   }
+{"error": {"class": "GenericError", "des

[PATCH] block: Fail gracefully when blockdev-snapshot creates loops

2021-10-18 Thread Kevin Wolf
Using blockdev-snapshot to append a node as an overlay to itself, or to
any of its parents, causes crashes. Catch the condition and return an
error for these cases instead.

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1824363
Signed-off-by: Kevin Wolf 
---
 block.c| 10 ++
 tests/qemu-iotests/085 | 31 ++-
 tests/qemu-iotests/085.out | 33 ++---
 3 files changed, 70 insertions(+), 4 deletions(-)

diff --git a/block.c b/block.c
index 45f653a88b..231dddf655 100644
--- a/block.c
+++ b/block.c
@@ -84,6 +84,9 @@ static BlockDriverState *bdrv_open_inherit(const char 
*filename,
BdrvChildRole child_role,
Error **errp);
 
+static bool bdrv_recurse_has_child(BlockDriverState *bs,
+   BlockDriverState *child);
+
 static void bdrv_replace_child_noperm(BdrvChild *child,
   BlockDriverState *new_bs);
 static void bdrv_remove_file_or_backing_child(BlockDriverState *bs,
@@ -2673,6 +2676,7 @@ static void bdrv_replace_child_noperm(BdrvChild *child,
 int drain_saldo;
 
 assert(!child->frozen);
+assert(old_bs != new_bs);
 
 if (old_bs && new_bs) {
 assert(bdrv_get_aio_context(old_bs) == bdrv_get_aio_context(new_bs));
@@ -2892,6 +2896,12 @@ static int bdrv_attach_child_noperm(BlockDriverState 
*parent_bs,
 
 assert(parent_bs->drv);
 
+if (bdrv_recurse_has_child(child_bs, parent_bs)) {
+error_setg(errp, "Making '%s' a %s child of '%s' would create a cycle",
+   parent_bs->node_name, child_name, child_bs->node_name);
+return -EINVAL;
+}
+
 bdrv_get_cumulative_perm(parent_bs, &perm, &shared_perm);
 bdrv_child_perm(parent_bs, child_bs, NULL, child_role, NULL,
 perm, shared_perm, &perm, &shared_perm);
diff --git a/tests/qemu-iotests/085 b/tests/qemu-iotests/085
index d557522943..de74262a26 100755
--- a/tests/qemu-iotests/085
+++ b/tests/qemu-iotests/085
@@ -103,11 +103,18 @@ do_blockdev_add()
 }
 
 # ${1}: unique identifier for the snapshot filename
-add_snapshot_image()
+create_snapshot_image()
 {
 base_image="${TEST_DIR}/$((${1}-1))-${snapshot_virt0}"
 snapshot_file="${TEST_DIR}/${1}-${snapshot_virt0}"
 TEST_IMG=$snapshot_file _make_test_img -u -b "${base_image}" -F $IMGFMT 
"$size"
+}
+
+# ${1}: unique identifier for the snapshot filename
+add_snapshot_image()
+{
+snapshot_file="${TEST_DIR}/${1}-${snapshot_virt0}"
+create_snapshot_image "$1"
 do_blockdev_add "$1" "'backing': null, " "${snapshot_file}"
 }
 
@@ -230,6 +237,28 @@ _make_test_img -b "${TEST_IMG}.base" -F $IMGFMT "$size"
 do_blockdev_add ${SNAPSHOTS} "" "${TEST_IMG}"
 blockdev_snapshot ${SNAPSHOTS} error
 
+echo
+echo === Invalid command - creating loops ===
+echo
+
+SNAPSHOTS=$((${SNAPSHOTS}+1))
+add_snapshot_image ${SNAPSHOTS}
+
+_send_qemu_cmd $h "{ 'execute': 'blockdev-snapshot',
+ 'arguments': { 'node':'snap_${SNAPSHOTS}',
+'overlay':'snap_${SNAPSHOTS}' }
+   }" "error"
+
+SNAPSHOTS=$((${SNAPSHOTS}+1))
+create_snapshot_image ${SNAPSHOTS}
+do_blockdev_add ${SNAPSHOTS} "'backing': 'snap_$((${SNAPSHOTS}-1))', " \
+"${TEST_DIR}/${SNAPSHOTS}-${snapshot_virt0}"
+
+_send_qemu_cmd $h "{ 'execute': 'blockdev-snapshot',
+ 'arguments': { 'node':'snap_${SNAPSHOTS}',
+'overlay':'snap_$((${SNAPSHOTS}-1))' }
+   }" "error"
+
 echo
 echo === Invalid command - The node does not exist ===
 echo
diff --git a/tests/qemu-iotests/085.out b/tests/qemu-iotests/085.out
index 1d4c565b6d..7750b3df5f 100644
--- a/tests/qemu-iotests/085.out
+++ b/tests/qemu-iotests/085.out
@@ -217,15 +217,42 @@ Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=134217728 
backing_file=TEST_DIR/
  'overlay':'snap_13' } }
 {"error": {"class": "GenericError", "desc": "The overlay already has a backing 
image"}}
 
+=== Invalid command - creating loops ===
+
+Formatting 'TEST_DIR/14-snapshot-v0.IMGFMT', fmt=IMGFMT size=134217728 
backing_file=TEST_DIR/13-snapshot-v0.IMGFMT backing_fmt=IMGFMT
+{ 'execute': 'blockdev-add', 'arguments':
+   { 'driver': 'IMGFMT', 'node-name': 'snap_14', 'backing': null,
+ 'file':
+ { 'driver': 'file', 'filename': 'TEST_DIR/14-snapshot-v0.IMGFMT',
+   'node-name': 'file_14' } } }
+{"return": {}}
+{ 'execute': 'blockdev-snapshot',
+ 'arguments': { 'node':'snap_14',
+'overlay':'snap_14' }
+   }
+{"error": {"class": "GenericError", "desc": "Making 'snap_14' a backing child 
of 'snap_14' would create a cycle"}}
+Formatting 'TEST_DIR/15-snapshot-v0.IMGFMT', fmt=IMGFMT size=134217728 
backing_file=TEST_DIR/14-snapshot-v0.IMGFMT backing_f

Re: regression on s390: was Re: [PULL 37/40] monitor: Tidy up find_device_state()

2021-10-18 Thread Markus Armbruster
Christian Borntraeger  writes:

[...]

> The 2nd thing to do is to fix the regression. Does anyone have an idea what 
> is broken?

I do: "device ID or QOM path" arguments where the device ID contains
'/'.  Undocumented feature, as far as I can tell.  I'll fix it anyway.
Affects device_del, qemu-io, and a number of other monitor commands
related to block devices.




Re: [PULL 37/40] monitor: Tidy up find_device_state()

2021-10-18 Thread Markus Armbruster
Christian Borntraeger  writes:

> Am 13.10.21 um 11:07 schrieb Paolo Bonzini:
>> From: Markus Armbruster 
>> Commit 6287d827d4 "monitor: allow device_del to accept QOM paths"
>> extended find_device_state() to accept QOM paths in addition to qdev
>> IDs.  This added a checked conversion to TYPE_DEVICE at the end, which
>> duplicates the check done for the qdev ID case earlier, except it sets
>> a *different* error: GenericError "ID is not a hotpluggable device"
>> when passed a QOM path, and DeviceNotFound "Device 'ID' not found"
>> when passed a qdev ID.  Fortunately, the latter won't happen as long
>> as we add only devices to /machine/peripheral/.
>> Earlier, commit b6cc36abb2 "qdev: device_del: Search for to be
>> unplugged device in 'peripheral' container" rewrote the lookup by qdev
>> ID to use QOM instead of qdev_find_recursive(), so it can handle
>> buss-less devices.  It does so by constructing an absolute QOM path.
>> Works, but object_resolve_path_component() is easier.  Switching to it
>> also gets rid of the unclean duplication described above.
>> While there, avoid converting to TYPE_DEVICE twice, first to check
>> whether it's possible, and then for real.
>
> This one broke qemu iotest 280 on s390:
>
>
> 280   fail   [13:06:19] [13:06:19]   0.3s   (last: 0.3s)  output mismatch 
> (see 280.out.bad)
> --- /home/cborntra/REPOS/qemu/tests/qemu-iotests/280.out
> +++ 280.out.bad
> @@ -37,14 +37,14 @@
>  === Resume the VM and simulate a write request ===
>  {"execute": "cont", "arguments": {}}
>  {"return": {}}
> -{"return": ""}
> +{"return": "Error: Device 'vda/virtio-backend' not found\r\n"}
>
>  === Commit it to the backing file ===
>  {"execute": "block-commit", "arguments": {"auto-dismiss": false, "device": 
> "top-fmt", "job-id": "job0", "top-node": "top-fmt"}}
>  {"return": {}}
>  {"execute": "job-complete", "arguments": {"id": "job0"}}
>  {"return": {}}
> -{"data": {"device": "job0", "len": 65536, "offset": 65536, "speed": 0, 
> "type": "commit"}, "event": "BLOCK_JOB_READY", "timestamp": {"microseconds": 
> "USECS", "seconds": "SECS"}}
> -{"data": {"device": "job0", "len": 65536, "offset": 65536, "speed": 0, 
> "type": "commit"}, "event": "BLOCK_JOB_COMPLETED", "timestamp": 
> {"microseconds": "USECS", "seconds": "SECS"}}
> +{"data": {"device": "job0", "len": 0, "offset": 0, "speed": 0, "type": 
> "commit"}, "event": "BLOCK_JOB_READY", "timestamp": {"microseconds": "USECS", 
> "seconds": "SECS"}}
> +{"data": {"device": "job0", "len": 0, "offset": 0, "speed": 0, "type": 
> "commit"}, "event": "BLOCK_JOB_COMPLETED", "timestamp": {"microseconds": 
> "USECS", "seconds": "SECS"}}
>  {"execute": "job-dismiss", "arguments": {"id": "job0"}}
>  {"return": {}}
> Failures: 280
> Failed 1 of 1 iotests

Reproduced.  I'll dig deeper.  Thanks!




regression on s390: was Re: [PULL 37/40] monitor: Tidy up find_device_state()

2021-10-18 Thread Christian Borntraeger




Am 15.10.21 um 21:15 schrieb Richard Henderson:

On 10/15/21 4:08 AM, Christian Borntraeger wrote:


Am 13.10.21 um 11:07 schrieb Paolo Bonzini:

From: Markus Armbruster 

Commit 6287d827d4 "monitor: allow device_del to accept QOM paths"
extended find_device_state() to accept QOM paths in addition to qdev
IDs.  This added a checked conversion to TYPE_DEVICE at the end, which
duplicates the check done for the qdev ID case earlier, except it sets
a *different* error: GenericError "ID is not a hotpluggable device"
when passed a QOM path, and DeviceNotFound "Device 'ID' not found"
when passed a qdev ID.  Fortunately, the latter won't happen as long
as we add only devices to /machine/peripheral/.

Earlier, commit b6cc36abb2 "qdev: device_del: Search for to be
unplugged device in 'peripheral' container" rewrote the lookup by qdev
ID to use QOM instead of qdev_find_recursive(), so it can handle
buss-less devices.  It does so by constructing an absolute QOM path.
Works, but object_resolve_path_component() is easier.  Switching to it
also gets rid of the unclean duplication described above.

While there, avoid converting to TYPE_DEVICE twice, first to check
whether it's possible, and then for real.


This one broke qemu iotest 280 on s390:


280   fail   [13:06:19] [13:06:19]   0.3s   (last: 0.3s)  output mismatch 
(see 280.out.bad)
--- /home/cborntra/REPOS/qemu/tests/qemu-iotests/280.out
+++ 280.out.bad
@@ -37,14 +37,14 @@
  === Resume the VM and simulate a write request ===
  {"execute": "cont", "arguments": {}}
  {"return": {}}
-{"return": ""}
+{"return": "Error: Device 'vda/virtio-backend' not found\r\n"}


Hmm, this test doesn't seem to have been attempted during staging:

   https://gitlab.com/qemu-project/qemu/-/jobs/1681194907

Is there something extra that needs to be installed on s390x.ci.qemu.org to 
have this test run?




No idea. Peter owns the machine. This is one thing to do.
The 2nd thing to do is to fix the regression. Does anyone have an idea what is 
broken?



Re: [PATCH 10/15] hw/nvme: Make max_ioqpairs and msix_qsize configurable in runtime

2021-10-18 Thread Philippe Mathieu-Daudé
On 10/7/21 18:24, Lukasz Maniak wrote:
> From: Łukasz Gieryk 
> 
> The Nvme device defines two properties: max_ioqpairs, msix_qsize. Having
> them as constants is problematic for SR-IOV support.
> 
> The SR-IOV feature introduces virtual resources (queues, interrupts)
> that can be assigned to PF and its dependent VFs. Each device, following
> a reset, should work with the configured number of queues. A single
> constant is no longer sufficient to hold the whole state.
> 
> This patch tries to solve the problem by introducing additional
> variables in NvmeCtrl’s state. The variables for, e.g., managing queues
> are therefore organized as:
> 
>  - n->params.max_ioqpairs – no changes, constant set by the user.
> 
>  - n->max_ioqpairs - (new) value derived from n->params.* in realize();
>  constant through device’s lifetime.
> 
>  - n->(mutable_state) – (not a part of this patch) user-configurable,
> specifies number of queues available _after_
> reset.
> 
>  - n->conf_ioqpairs - (new) used in all the places instead of the ‘old’
>   n->params.max_ioqpairs; initialized in realize()
>   and updated during reset() to reflect user’s
>   changes to the mutable state.
> 
> Since the number of available i/o queues and interrupts can change in
> runtime, buffers for sq/cqs and the MSIX-related structures are
> allocated big enough to handle the limits, to completely avoid the
> complicated reallocation. A helper function (nvme_update_msixcap_ts)
> updates the corresponding capability register, to signal configuration
> changes.
> 
> Signed-off-by: Łukasz Gieryk 
> ---
>  hw/nvme/ctrl.c | 62 +-
>  hw/nvme/nvme.h |  4 
>  2 files changed, 45 insertions(+), 21 deletions(-)

> @@ -6322,11 +6334,17 @@ static void nvme_init_state(NvmeCtrl *n)
>  NvmeSecCtrlEntry *sctrl;
>  int i;
>  
> +n->max_ioqpairs = n->params.max_ioqpairs;
> +n->conf_ioqpairs = n->max_ioqpairs;
> +
> +n->max_msix_qsize = n->params.msix_qsize;
> +n->conf_msix_qsize = n->max_msix_qsize;

>From an developer perspective, the API becomes confusing.
Most fields from NvmeParams are exposed via QMP, such max_ioqpairs.

I'm not sure we need 2 distinct fields. Maybe simply reorganize
to not reset these values in the DeviceReset handler?

Also, with this series we should consider implementing the migration
state (nvme_vmstate).

> diff --git a/hw/nvme/nvme.h b/hw/nvme/nvme.h
> index 9fbb0a70b5..65383e495c 100644
> --- a/hw/nvme/nvme.h
> +++ b/hw/nvme/nvme.h
> @@ -420,6 +420,10 @@ typedef struct NvmeCtrl {
>  uint64_tstarttime_ms;
>  uint16_ttemperature;
>  uint8_t smart_critical_warning;
> +uint32_tmax_msix_qsize; /* Derived from 
> params.msix.qsize */
> +uint32_tconf_msix_qsize;/* Configured limit */
> +uint32_tmax_ioqpairs;   /* Derived from 
> params.max_ioqpairs */
> +uint32_tconf_ioqpairs;  /* Configured limit */
>  




Re: [PATCH 11/15] hw/nvme: Calculate BAR atributes in a function

2021-10-18 Thread Philippe Mathieu-Daudé
Hi Łukasz,

On 10/7/21 18:24, Lukasz Maniak wrote:
> From: Łukasz Gieryk 
> 
> An Nvme device with SR-IOV capability calculates the BAR size
> differently for PF and VF, so it makes sense to extract the common code
> to a separate function.
> 
> Also: it seems the n->reg_size parameter unnecessarily splits the BAR
> size calculation in two phases; removed to simplify the code.

Preferably split in 2 patches, simplification in first, new function
in second.
> Signed-off-by: Łukasz Gieryk 
> ---
>  hw/nvme/ctrl.c | 52 +-
>  hw/nvme/nvme.h |  1 -
>  2 files changed, 35 insertions(+), 18 deletions(-)




Re: [RFC PATCH 0/4] Replace custom test harness with "meson test"

2021-10-18 Thread Thomas Huth

On 15/10/2021 12.07, Paolo Bonzini wrote:

Hi all,

Starting with Meson 0.57, "meson test" has all features of QEMU's
makefile-based harness and more.


I just gave it a try, and basically I like this ... but I also encountered 
two issues:



* CTRL+C will only interrupt the longest running test.  Pressing
   CTRL+C repeatedly three times (which you would likely do anyway,
   that's how things work) interrupts the whole run


I tried this, and while hitting CTRL-C multiple times brought me back to the 
shell prompt, the remaining tests kept getting started in the background 
instead of getting stopped ... something is still fishy here, I think.



* Right now "make check-block" only does a single test run just like
   "../tests/check-block.sh", but it would be possible to add the thorough
   suite to "meson test --suite block" as well.


The output of the iotests is also not optimal yet... when running "make 
check SPEED=slow", the iotests are run multiple times with different target 
image types, but each run prints the same "▶ 1/1 test 001   OK" etc. to the 
console, so it's hard to say which target type  is currently exercised. 
Would it be possible to include the target image type here, e.g. something like:


▶ 1/1 test-qcow2 001   OK

?

 Thomas