Re: [PATCH for-6.1? 3/6] jobs: Give Job.force_cancel more meaning

2021-07-22 Thread Vladimir Sementsov-Ogievskiy

22.07.2021 15:26, Max Reitz wrote:

We largely have two cancel modes for jobs:

First, there is actual cancelling.  The job is terminated as soon as
possible, without trying to reach a consistent result.

Second, we have mirror in the READY state.  Technically, the job is not
really cancelled, but it just is a different completion mode.  The job
can still run for an indefinite amount of time while it tries to reach a
consistent result.

We want to be able to clearly distinguish which cancel mode a job is in
(when it has been cancelled).  We can use Job.force_cancel for this, but
right now it only reflects cancel requests from the user with
force=true, but clearly, jobs that do not even distinguish between
force=false and force=true are effectively always force-cancelled.

So this patch has Job.force_cancel signify whether the job will
terminate as soon as possible (force_cancel=true) or whether it will
effectively remain running despite being "cancelled"
(force_cancel=false).

To this end, we let jobs that provide JobDriver.cancel() tell the
generic job code whether they will terminate as soon as possible or not,
and for jobs that do not provide that method we assume they will.

Signed-off-by: Max Reitz



In isolation this patch is rather strange: force_cancel is used only by mirror. 
But we keep in generic job layer. And make a handler to set a value to this 
variable. So in generic layer we ask mirror which value it want to set to 
generic variable, which is used only by mirror.. This probably shows that this 
feature of mirror should be mirror only and generic layer shouldn't take care 
of it (see also my answer to next commit).

But at the end of the series the variable is not more used by mirror directly. 
So, technically the commit is not wrong, and it is a preparation for the 
following ones.

Reviewed-by: Vladimir Sementsov-Ogievskiy 

--
Best regards,
Vladimir



Re: [PATCH for-6.1? 4/6] job: Add job_cancel_requested()

2021-07-22 Thread Vladimir Sementsov-Ogievskiy

22.07.2021 15:26, Max Reitz wrote:

Most callers of job_is_cancelled() actually want to know whether the job
is on its way to immediate termination.  For example, we refuse to pause
jobs that are cancelled; but this only makes sense for jobs that are
really actually cancelled.

A mirror job that is cancelled during READY with force=false should
absolutely be allowed to pause.  This "cancellation" (which is actually
a kind of completion) 


You have to repeat that this "cancel" is not "cancel".

So, the whole problem is that feature of mirror, on cancel in READY state do 
not cancel but do some specific kind of completion.

You try to make this thing correctly handled on generic layer..

Did you consider instead just drop the feature from generic layer? So that all 
*cancel* functions always do force-cancel. Then the internal implementation 
become a lot clearer.

But we have to support the qmp block-job-cancel of READY mirror (and commit) 
with force=false.

We can do it as an exclusion in qmp_block_job_cancel, something like:

if (job is mirror or commit AND it's ready AND force = false)
   mirror_soft_cancel(...);
else
   job_cancel(...);



may take an indefinite amount of time, and so
should behave like any job during normal operation.  For example, with
on-target-error=stop, the job should stop on write errors.  (In
contrast, force-cancelled jobs should not get write errors, as they
should just terminate and not do further I/O.)

Therefore, redefine job_is_cancelled() to only return true for jobs that
are force-cancelled (which as of HEAD^ means any job that interprets the
cancellation request as a request for immediate termination), and add
job_cancel_request() as the general variant, which returns true for any
jobs which have been requested to be cancelled, whether it be
immediately or after an arbitrarily long completion phase.

Buglink: https://gitlab.com/qemu-project/qemu/-/issues/462
Signed-off-by: Max Reitz 
---


[..]


--- a/job.c
+++ b/job.c
@@ -216,6 +216,11 @@ const char *job_type_str(const Job *job)
  }
  
  bool job_is_cancelled(Job *job)

+{
+return job->cancelled && job->force_cancel;
+}
+
+bool job_cancel_requested(Job *job)
  {
  return job->cancelled;
  }
@@ -650,7 +655,7 @@ static void job_conclude(Job *job)
  
  static void job_update_rc(Job *job)

  {
-if (!job->ret && job_is_cancelled(job)) {
+if (!job->ret && job_cancel_requested(job)) {


Why not job_is_cancelled() here?

So in case of mirror other kind of completion we set ret to -ECANCELED?


  job->ret = -ECANCELED;
  }
  if (job->ret) {
@@ -704,7 +709,7 @@ static int job_finalize_single(Job *job)
  
  /* Emit events only if we actually started */

  if (job_started(job)) {
-if (job_is_cancelled(job)) {
+if (job_cancel_requested(job)) {
  job_event_cancelled(job);


Same question here.. Shouldn't mirror report COMPLETED event in case of 
not-force cancelled in READY state?


  } else {
  job_event_completed(job);
@@ -1015,7 +1020,7 @@ void job_complete(Job *job, Error **errp)
  if (job_apply_verb(job, JOB_VERB_COMPLETE, errp)) {
  return;
  }
-if (job_is_cancelled(job) || !job->driver->complete) {
+if (job_cancel_requested(job) || !job->driver->complete) {
  error_setg(errp, "The active block job '%s' cannot be completed",
 job->id);
  return;
@@ -1043,7 +1048,7 @@ int job_finish_sync(Job *job, void (*finish)(Job *, Error 
**errp), Error **errp)
  AIO_WAIT_WHILE(job->aio_context,
 (job_enter(job), !job_is_completed(job)));
  
-ret = (job_is_cancelled(job) && job->ret == 0) ? -ECANCELED : job->ret;

+ret = (job_cancel_requested(job) && job->ret == 0) ? -ECANCELED : job->ret;
  job_unref(job);
  return ret;
  }




--
Best regards,
Vladimir



Re: [PATCH for-6.1? 2/6] job: @force parameter for job_cancel_sync{,_all}()

2021-07-22 Thread Vladimir Sementsov-Ogievskiy

22.07.2021 15:26, Max Reitz wrote:

Callers should be able to specify whether they want job_cancel_sync() to
force-cancel the job or not.

In fact, almost all invocations do not care about consistency of the
result and just want the job to terminate as soon as possible, so they
should pass force=true.  The replication block driver is the exception.

This changes some iotest outputs, because quitting qemu while a mirror
job is active will now lead to it being cancelled instead of completed,
which is what we want.  (Cancelling a READY mirror job with force=false
may take an indefinite amount of time, which we do not want when
quitting.  If users want consistent results, they must have all jobs be
done before they quit qemu.)

Buglink:https://gitlab.com/qemu-project/qemu/-/issues/462
Signed-off-by: Max Reitz


Reviewed-by: Vladimir Sementsov-Ogievskiy 

--
Best regards,
Vladimir



Re: [PATCH] nbd/server: Add --selinux-label option

2021-07-22 Thread Daniel P . Berrangé
On Thu, Jul 22, 2021 at 05:32:40PM +0100, Richard W.M. Jones wrote:
> Under SELinux, Unix domain sockets have two labels.  One is on the
> disk and can be set with commands such as chcon(1).  There is a
> different label stored in memory (called the process label).  This can
> only be set by the process creating the socket.  When using SELinux +
> SVirt and wanting qemu to be able to connect to a qemu-nbd instance,
> you must set both labels correctly first.
> 
> For qemu-nbd the options to set the second label are awkward.  You can
> create the socket in a wrapper program and then exec into qemu-nbd.
> Or you could try something with LD_PRELOAD.
> 
> This commit adds the ability to set the label straightforwardly on the
> command line, via the new --selinux-label flag.  (The name of the flag
> is the same as the equivalent nbdkit option.)
> 
> A worked example showing how to use the new option can be found in
> this bug: https://bugzilla.redhat.com/show_bug.cgi?id=1984938
> 
> Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1984938
> Signed-off-by: Richard W.M. Jones 
> ---
>  configure |  9 -
>  meson.build   | 10 +-
>  meson_options.txt |  3 +++
>  qemu-nbd.c| 33 +
>  4 files changed, 53 insertions(+), 2 deletions(-)


> diff --git a/meson.build b/meson.build
> index 2f377098d7..2d7206233e 100644
> --- a/meson.build
> +++ b/meson.build
> @@ -1064,6 +1064,11 @@ keyutils = dependency('libkeyutils', required: false,
>  
>  has_gettid = cc.has_function('gettid')
>  
> +# libselinux
> +selinux = dependency('libselinux',
> + required: get_option('selinux'),
> + method: 'pkg-config', kwargs: static_kwargs)
> +

For the new build dep we'll need updated package lists in
tests/docker/dockerfiles. For centos, fedra ,opensuse
add libselinux-devel and for ubuntu 18/20 add libselinux-dev

That'll make sure this new code gets tested in CI.


The rest looks ok to me

Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




[PATCH] nbd/server: Add --selinux-label option

2021-07-22 Thread Richard W.M. Jones
Under SELinux, Unix domain sockets have two labels.  One is on the
disk and can be set with commands such as chcon(1).  There is a
different label stored in memory (called the process label).  This can
only be set by the process creating the socket.  When using SELinux +
SVirt and wanting qemu to be able to connect to a qemu-nbd instance,
you must set both labels correctly first.

For qemu-nbd the options to set the second label are awkward.  You can
create the socket in a wrapper program and then exec into qemu-nbd.
Or you could try something with LD_PRELOAD.

This commit adds the ability to set the label straightforwardly on the
command line, via the new --selinux-label flag.  (The name of the flag
is the same as the equivalent nbdkit option.)

A worked example showing how to use the new option can be found in
this bug: https://bugzilla.redhat.com/show_bug.cgi?id=1984938

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1984938
Signed-off-by: Richard W.M. Jones 
---
 configure |  9 -
 meson.build   | 10 +-
 meson_options.txt |  3 +++
 qemu-nbd.c| 33 +
 4 files changed, 53 insertions(+), 2 deletions(-)

diff --git a/configure b/configure
index b5965b159f..7e04bd485f 100755
--- a/configure
+++ b/configure
@@ -443,6 +443,7 @@ fuse="auto"
 fuse_lseek="auto"
 multiprocess="auto"
 slirp_smbd="$default_feature"
+selinux="auto"
 
 malloc_trim="auto"
 gio="$default_feature"
@@ -1578,6 +1579,10 @@ for opt do
   ;;
   --disable-slirp-smbd) slirp_smbd=no
   ;;
+  --enable-selinux) selinux="enabled"
+  ;;
+  --disable-selinux) selinux="disabled"
+  ;;
   *)
   echo "ERROR: unknown option $opt"
   echo "Try '$0 --help' for more information"
@@ -1965,6 +1970,7 @@ disabled with --disable-FEATURE, default is enabled if 
available
   multiprocessOut of process device emulation support
   gio libgio support
   slirp-smbd  use smbd (at path --smbd=*) in slirp networking
+  selinux SELinux support in qemu-nbd
 
 NOTE: The object files are built at the place where configure is launched
 EOF
@@ -5220,7 +5226,8 @@ if test "$skip_meson" = no; then
 -Dattr=$attr -Ddefault_devices=$default_devices 
-Dvirglrenderer=$virglrenderer \
 -Ddocs=$docs -Dsphinx_build=$sphinx_build -Dinstall_blobs=$blobs \
 -Dvhost_user_blk_server=$vhost_user_blk_server 
-Dmultiprocess=$multiprocess \
--Dfuse=$fuse -Dfuse_lseek=$fuse_lseek 
-Dguest_agent_msi=$guest_agent_msi -Dbpf=$bpf\
+-Dfuse=$fuse -Dfuse_lseek=$fuse_lseek 
-Dguest_agent_msi=$guest_agent_msi -Dbpf=$bpf \
+-Dselinux=$selinux \
 $(if test "$default_features" = no; then echo 
"-Dauto_features=disabled"; fi) \
-Dtcg_interpreter=$tcg_interpreter \
 $cross_arg \
diff --git a/meson.build b/meson.build
index 2f377098d7..2d7206233e 100644
--- a/meson.build
+++ b/meson.build
@@ -1064,6 +1064,11 @@ keyutils = dependency('libkeyutils', required: false,
 
 has_gettid = cc.has_function('gettid')
 
+# libselinux
+selinux = dependency('libselinux',
+ required: get_option('selinux'),
+ method: 'pkg-config', kwargs: static_kwargs)
+
 # Malloc tests
 
 malloc = []
@@ -1291,6 +1296,7 @@ config_host_data.set('CONFIG_FUSE', fuse.found())
 config_host_data.set('CONFIG_FUSE_LSEEK', fuse_lseek.found())
 config_host_data.set('CONFIG_X11', x11.found())
 config_host_data.set('CONFIG_CFI', get_option('cfi'))
+config_host_data.set('CONFIG_SELINUX', selinux.found())
 config_host_data.set('QEMU_VERSION', '"@0@"'.format(meson.project_version()))
 config_host_data.set('QEMU_VERSION_MAJOR', 
meson.project_version().split('.')[0])
 config_host_data.set('QEMU_VERSION_MINOR', 
meson.project_version().split('.')[1])
@@ -2739,7 +2745,8 @@ if have_tools
   qemu_io = executable('qemu-io', files('qemu-io.c'),
  dependencies: [block, qemuutil], install: true)
   qemu_nbd = executable('qemu-nbd', files('qemu-nbd.c'),
-   dependencies: [blockdev, qemuutil, gnutls], install: true)
+   dependencies: [blockdev, qemuutil, gnutls, selinux],
+   install: true)
 
   subdir('storage-daemon')
   subdir('contrib/rdmacm-mux')
@@ -3104,6 +3111,7 @@ summary_info += {'libpmem support':   libpmem.found()}
 summary_info += {'libdaxctl support': libdaxctl.found()}
 summary_info += {'libudev':   libudev.found()}
 summary_info += {'FUSE lseek':fuse_lseek.found()}
+summary_info += {'selinux':   selinux.found()}
 summary(summary_info, bool_yn: true, section: 'Dependencies')
 
 if not supported_cpus.contains(cpu)
diff --git a/meson_options.txt b/meson_options.txt
index a9a9b8f4c6..a5938500a3 100644
--- a/meson_options.txt
+++ b/meson_options.txt
@@ -155,3 +155,6 @@ option('slirp', type: 'combo', value: 'auto',
 option('fdt', type: 'combo', value: 'auto',
choices: ['disabled', 'enabled', 'auto', 'system', 'internal'],
description: 'Whether and how to 

[PATCH] nbd/server: Add --selinux-label option

2021-07-22 Thread Richard W.M. Jones
https://bugzilla.redhat.com/show_bug.cgi?id=1984938

The purpose of the patch is explained in the commit message / bug.  In
the cover I want to explain a couple of design choices.

If libselinux isn't available at build time then the --selinux-label
option is still present.  It does not appear in the qemu-nbd --help
output.  If you still use it, it is ignored.  (By contrast nbdkit will
give an error if you try to use the option without having SELinux
support.  It's not clear which is better.)

We give an error if setsockcreatecon_raw fails.  In theory we could
ignore this error (warning?) and keep going.  Either SELinux would
later reject clients or it wouldn't.

Rich.






Re: [PATCH for-6.1? 1/6] mirror: Keep s->synced on error

2021-07-22 Thread Vladimir Sementsov-Ogievskiy

22.07.2021 15:26, Max Reitz wrote:

An error does not take us out of the READY phase, which is what
s->synced signifies.  It does of course mean that source and target are
no longer in sync, but that is what s->actively_sync is for -- s->synced
never meant that source and target are in sync, only that they were at
some point (and at that point we transitioned into the READY phase).

The tangible problem is that we transition to READY once we are in sync
and s->synced is false.  By resetting s->synced here, we will transition
from READY to READY once the error is resolved (if the job keeps
running), and that transition is not allowed.

Signed-off-by: Max Reitz 
---
  block/mirror.c | 1 -
  1 file changed, 1 deletion(-)

diff --git a/block/mirror.c b/block/mirror.c
index 98fc66eabf..d73b704473 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -121,7 +121,6 @@ typedef enum MirrorMethod {
  static BlockErrorAction mirror_error_action(MirrorBlockJob *s, bool read,
  int error)
  {
-s->synced = false;
  s->actively_synced = false;
  if (read) {
  return block_job_error_action(>common, s->on_source_error,



Looked through.. Yes, seems s->synced used as "is ready". Isn't it better to drop 
s->synced at all and use job_is_read() instead?

Hmm, s->actively_synced used only for assertion in active_write_settle().. 
That's not wrong, just interesting.

--
Best regards,
Vladimir



[PATCH for-6.1? 6/6] iotests: Add mirror-ready-cancel-error test

2021-07-22 Thread Max Reitz
Test what happens when there is an I/O error after a mirror job in the
READY phase has been cancelled.

Signed-off-by: Max Reitz 
---
 .../tests/mirror-ready-cancel-error   | 143 ++
 .../tests/mirror-ready-cancel-error.out   |   5 +
 2 files changed, 148 insertions(+)
 create mode 100755 tests/qemu-iotests/tests/mirror-ready-cancel-error
 create mode 100644 tests/qemu-iotests/tests/mirror-ready-cancel-error.out

diff --git a/tests/qemu-iotests/tests/mirror-ready-cancel-error 
b/tests/qemu-iotests/tests/mirror-ready-cancel-error
new file mode 100755
index 00..f2dc1f
--- /dev/null
+++ b/tests/qemu-iotests/tests/mirror-ready-cancel-error
@@ -0,0 +1,143 @@
+#!/usr/bin/env python3
+# group: rw quick
+#
+# Test what happens when errors occur to a mirror job after it has
+# been cancelled in the READY phase
+#
+# Copyright (C) 2021 Red Hat, Inc.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see .
+#
+
+import os
+import iotests
+
+
+image_size = 1 * 1024 * 1024
+source = os.path.join(iotests.test_dir, 'source.img')
+target = os.path.join(iotests.test_dir, 'target.img')
+
+
+class TestMirrorReadyCancelError(iotests.QMPTestCase):
+def setUp(self) -> None:
+assert iotests.qemu_img_create('-f', iotests.imgfmt, source,
+   str(image_size)) == 0
+assert iotests.qemu_img_create('-f', iotests.imgfmt, target,
+   str(image_size)) == 0
+
+self.vm = iotests.VM()
+self.vm.launch()
+
+def tearDown(self) -> None:
+self.vm.shutdown()
+os.remove(source)
+os.remove(target)
+
+def add_blockdevs(self, once: bool) -> None:
+res = self.vm.qmp('blockdev-add',
+  **{'node-name': 'source',
+ 'driver': iotests.imgfmt,
+ 'file': {
+ 'driver': 'file',
+ 'filename': source
+ }})
+self.assert_qmp(res, 'return', {})
+
+# blkdebug notes:
+# Enter state 2 on the first flush, which happens before the
+# job enters the READY state.  The second flush will happen
+# when the job is about to complete, and we want that one to
+# fail.
+res = self.vm.qmp('blockdev-add',
+  **{'node-name': 'target',
+ 'driver': iotests.imgfmt,
+ 'file': {
+ 'driver': 'blkdebug',
+ 'image': {
+ 'driver': 'file',
+ 'filename': target
+ },
+ 'set-state': [{
+ 'event': 'flush_to_disk',
+ 'state': 1,
+ 'new_state': 2
+ }],
+ 'inject-error': [{
+ 'event': 'flush_to_disk',
+ 'once': once,
+ 'immediately': True,
+ 'state': 2
+ }]}})
+self.assert_qmp(res, 'return', {})
+
+def start_mirror(self) -> None:
+res = self.vm.qmp('blockdev-mirror',
+  job_id='mirror',
+  device='source',
+  target='target',
+  filter_node_name='mirror-top',
+  sync='full',
+  on_target_error='stop')
+self.assert_qmp(res, 'return', {})
+
+def cancel_mirror_with_error(self) -> None:
+self.vm.event_wait('BLOCK_JOB_READY')
+
+# Write something so will not leave the job immediately, but
+# flush first (which will fail, thanks to blkdebug)
+res = self.vm.qmp('human-monitor-command',
+  command_line='qemu-io mirror-top "write 0 64k"')
+self.assert_qmp(res, 'return', '')
+
+# Drain status change events
+while self.vm.event_wait('JOB_STATUS_CHANGE', timeout=0.0) is not None:
+pass
+
+res = self.vm.qmp('block-job-cancel', 

[PATCH for-6.1? 5/6] mirror: Check job_is_cancelled() earlier

2021-07-22 Thread Max Reitz
We must check whether the job is force-cancelled early in our main loop,
most importantly before any `continue` statement.  For example, we used
to have `continue`s before our current checking location that are
triggered by `mirror_flush()` failing.  So, if `mirror_flush()` kept
failing, force-cancelling the job would not terminate it.

A job being force-cancelled should be treated the same as the job having
failed, so put the check in the same place where we check `s->ret < 0`.

Buglink: https://gitlab.com/qemu-project/qemu/-/issues/462
Signed-off-by: Max Reitz 
---
 block/mirror.c | 7 +--
 1 file changed, 1 insertion(+), 6 deletions(-)

diff --git a/block/mirror.c b/block/mirror.c
index 291d2ed040..a993ed37d0 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -995,7 +995,7 @@ static int coroutine_fn mirror_run(Job *job, Error **errp)
 mirror_wait_for_any_operation(s, true);
 }
 
-if (s->ret < 0) {
+if (s->ret < 0 || job_is_cancelled(>common.job)) {
 ret = s->ret;
 goto immediate_exit;
 }
@@ -1081,17 +1081,12 @@ static int coroutine_fn mirror_run(Job *job, Error 
**errp)
 break;
 }
 
-ret = 0;
-
 if (s->synced && !should_complete) {
 delay_ns = (s->in_flight == 0 &&
 cnt == 0 ? BLOCK_JOB_SLICE_TIME : 0);
 }
 trace_mirror_before_sleep(s, cnt, s->synced, delay_ns);
 job_sleep_ns(>common.job, delay_ns);
-if (job_is_cancelled(>common.job)) {
-break;
-}
 s->last_pause_ns = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
 }
 
-- 
2.31.1




[PATCH for-6.1? 3/6] jobs: Give Job.force_cancel more meaning

2021-07-22 Thread Max Reitz
We largely have two cancel modes for jobs:

First, there is actual cancelling.  The job is terminated as soon as
possible, without trying to reach a consistent result.

Second, we have mirror in the READY state.  Technically, the job is not
really cancelled, but it just is a different completion mode.  The job
can still run for an indefinite amount of time while it tries to reach a
consistent result.

We want to be able to clearly distinguish which cancel mode a job is in
(when it has been cancelled).  We can use Job.force_cancel for this, but
right now it only reflects cancel requests from the user with
force=true, but clearly, jobs that do not even distinguish between
force=false and force=true are effectively always force-cancelled.

So this patch has Job.force_cancel signify whether the job will
terminate as soon as possible (force_cancel=true) or whether it will
effectively remain running despite being "cancelled"
(force_cancel=false).

To this end, we let jobs that provide JobDriver.cancel() tell the
generic job code whether they will terminate as soon as possible or not,
and for jobs that do not provide that method we assume they will.

Signed-off-by: Max Reitz 
---
 include/qemu/job.h | 11 ++-
 block/backup.c |  3 ++-
 block/mirror.c | 24 ++--
 job.c  |  6 +-
 4 files changed, 35 insertions(+), 9 deletions(-)

diff --git a/include/qemu/job.h b/include/qemu/job.h
index 5e8edbc2c8..8aa90f7395 100644
--- a/include/qemu/job.h
+++ b/include/qemu/job.h
@@ -253,8 +253,17 @@ struct JobDriver {
 
 /**
  * If the callback is not NULL, it will be invoked in job_cancel_async
+ *
+ * This function must return true if the job will be cancelled
+ * immediately without any further I/O (mandatory if @force is
+ * true), and false otherwise.  This lets the generic job layer
+ * know whether a job has been truly (force-)cancelled, or whether
+ * it is just in a special completion mode (like mirror after
+ * READY).
+ * (If the callback is NULL, the job is assumed to terminate
+ * without I/O.)
  */
-void (*cancel)(Job *job, bool force);
+bool (*cancel)(Job *job, bool force);
 
 
 /** Called when the job is freed */
diff --git a/block/backup.c b/block/backup.c
index bd3614ce70..513e1c8a0b 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -331,11 +331,12 @@ static void coroutine_fn backup_set_speed(BlockJob *job, 
int64_t speed)
 }
 }
 
-static void backup_cancel(Job *job, bool force)
+static bool backup_cancel(Job *job, bool force)
 {
 BackupBlockJob *s = container_of(job, BackupBlockJob, common.job);
 
 bdrv_cancel_in_flight(s->target_bs);
+return true;
 }
 
 static const BlockJobDriver backup_job_driver = {
diff --git a/block/mirror.c b/block/mirror.c
index d73b704473..c3514f4196 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -1089,9 +1089,7 @@ static int coroutine_fn mirror_run(Job *job, Error **errp)
 }
 trace_mirror_before_sleep(s, cnt, s->synced, delay_ns);
 job_sleep_ns(>common.job, delay_ns);
-if (job_is_cancelled(>common.job) &&
-(!s->synced || s->common.job.force_cancel))
-{
+if (job_is_cancelled(>common.job) && s->common.job.force_cancel) {
 break;
 }
 s->last_pause_ns = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
@@ -1103,7 +1101,7 @@ immediate_exit:
  * or it was cancelled prematurely so that we do not guarantee that
  * the target is a copy of the source.
  */
-assert(ret < 0 || ((s->common.job.force_cancel || !s->synced) &&
+assert(ret < 0 || (s->common.job.force_cancel &&
job_is_cancelled(>common.job)));
 assert(need_drain);
 mirror_wait_for_all_io(s);
@@ -1189,14 +1187,27 @@ static bool mirror_drained_poll(BlockJob *job)
 return !!s->in_flight;
 }
 
-static void mirror_cancel(Job *job, bool force)
+static bool mirror_cancel(Job *job, bool force)
 {
 MirrorBlockJob *s = container_of(job, MirrorBlockJob, common.job);
 BlockDriverState *target = blk_bs(s->target);
 
-if (force || !job_is_ready(job)) {
+/*
+ * Before the job is READY, we treat any cancellation like a
+ * force-cancellation.
+ */
+force = force || !job_is_ready(job);
+
+if (force) {
 bdrv_cancel_in_flight(target);
 }
+return force;
+}
+
+static bool commit_active_cancel(Job *job, bool force)
+{
+/* Same as above in mirror_cancel() */
+return force || !job_is_ready(job);
 }
 
 static const BlockJobDriver mirror_job_driver = {
@@ -1226,6 +1237,7 @@ static const BlockJobDriver commit_active_job_driver = {
 .abort  = mirror_abort,
 .pause  = mirror_pause,
 .complete   = mirror_complete,
+.cancel = commit_active_cancel,
 },
 .drained_poll   = mirror_drained_poll,
 };
diff --git a/job.c 

[PATCH for-6.1? 2/6] job: @force parameter for job_cancel_sync{, _all}()

2021-07-22 Thread Max Reitz
Callers should be able to specify whether they want job_cancel_sync() to
force-cancel the job or not.

In fact, almost all invocations do not care about consistency of the
result and just want the job to terminate as soon as possible, so they
should pass force=true.  The replication block driver is the exception.

This changes some iotest outputs, because quitting qemu while a mirror
job is active will now lead to it being cancelled instead of completed,
which is what we want.  (Cancelling a READY mirror job with force=false
may take an indefinite amount of time, which we do not want when
quitting.  If users want consistent results, they must have all jobs be
done before they quit qemu.)

Buglink: https://gitlab.com/qemu-project/qemu/-/issues/462
Signed-off-by: Max Reitz 
---
 include/qemu/job.h| 10 ++---
 block/replication.c   |  4 +-
 blockdev.c|  4 +-
 job.c | 27 +---
 qemu-nbd.c|  2 +-
 softmmu/runstate.c|  2 +-
 storage-daemon/qemu-storage-daemon.c  |  2 +-
 tests/unit/test-block-iothread.c  |  2 +-
 tests/unit/test-blockjob.c|  2 +-
 tests/qemu-iotests/109.out| 60 +++
 tests/qemu-iotests/tests/qsd-jobs.out |  2 +-
 11 files changed, 61 insertions(+), 56 deletions(-)

diff --git a/include/qemu/job.h b/include/qemu/job.h
index 41162ed494..5e8edbc2c8 100644
--- a/include/qemu/job.h
+++ b/include/qemu/job.h
@@ -506,19 +506,19 @@ void job_user_cancel(Job *job, bool force, Error **errp);
 
 /**
  * Synchronously cancel the @job.  The completion callback is called
- * before the function returns.  The job may actually complete
- * instead of canceling itself; the circumstances under which this
- * happens depend on the kind of job that is active.
+ * before the function returns.  If @force is false, the job may
+ * actually complete instead of canceling itself; the circumstances
+ * under which this happens depend on the kind of job that is active.
  *
  * Returns the return value from the job if the job actually completed
  * during the call, or -ECANCELED if it was canceled.
  *
  * Callers must hold the AioContext lock of job->aio_context.
  */
-int job_cancel_sync(Job *job);
+int job_cancel_sync(Job *job, bool force);
 
 /** Synchronously cancels all jobs using job_cancel_sync(). */
-void job_cancel_sync_all(void);
+void job_cancel_sync_all(bool force);
 
 /**
  * @job: The job to be completed.
diff --git a/block/replication.c b/block/replication.c
index 32444b9a8f..e7a9327b12 100644
--- a/block/replication.c
+++ b/block/replication.c
@@ -149,7 +149,7 @@ static void replication_close(BlockDriverState *bs)
 if (s->stage == BLOCK_REPLICATION_FAILOVER) {
 commit_job = >commit_job->job;
 assert(commit_job->aio_context == qemu_get_current_aio_context());
-job_cancel_sync(commit_job);
+job_cancel_sync(commit_job, false);
 }
 
 if (s->mode == REPLICATION_MODE_SECONDARY) {
@@ -726,7 +726,7 @@ static void replication_stop(ReplicationState *rs, bool 
failover, Error **errp)
  * disk, secondary disk in backup_job_completed().
  */
 if (s->backup_job) {
-job_cancel_sync(>backup_job->job);
+job_cancel_sync(>backup_job->job, false);
 }
 
 if (!failover) {
diff --git a/blockdev.c b/blockdev.c
index 3d8ac368a1..aa95918c02 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1848,7 +1848,7 @@ static void drive_backup_abort(BlkActionState *common)
 aio_context = bdrv_get_aio_context(state->bs);
 aio_context_acquire(aio_context);
 
-job_cancel_sync(>job->job);
+job_cancel_sync(>job->job, true);
 
 aio_context_release(aio_context);
 }
@@ -1949,7 +1949,7 @@ static void blockdev_backup_abort(BlkActionState *common)
 aio_context = bdrv_get_aio_context(state->bs);
 aio_context_acquire(aio_context);
 
-job_cancel_sync(>job->job);
+job_cancel_sync(>job->job, true);
 
 aio_context_release(aio_context);
 }
diff --git a/job.c b/job.c
index e7a5d28854..9e971d64cf 100644
--- a/job.c
+++ b/job.c
@@ -763,7 +763,12 @@ static void job_completed_txn_abort(Job *job)
 if (other_job != job) {
 ctx = other_job->aio_context;
 aio_context_acquire(ctx);
-job_cancel_async(other_job, false);
+/*
+ * This is a transaction: If one job failed, no result will matter.
+ * Therefore, pass force=true to terminate all other jobs as 
quickly
+ * as possible.
+ */
+job_cancel_async(other_job, true);
 aio_context_release(ctx);
 }
 }
@@ -964,12 +969,24 @@ static void job_cancel_err(Job *job, Error **errp)
 job_cancel(job, false);
 }
 
-int job_cancel_sync(Job *job)
+/**
+ * Same as job_cancel_err(), but force-cancel.
+ */
+static 

[PATCH for-6.1? 1/6] mirror: Keep s->synced on error

2021-07-22 Thread Max Reitz
An error does not take us out of the READY phase, which is what
s->synced signifies.  It does of course mean that source and target are
no longer in sync, but that is what s->actively_sync is for -- s->synced
never meant that source and target are in sync, only that they were at
some point (and at that point we transitioned into the READY phase).

The tangible problem is that we transition to READY once we are in sync
and s->synced is false.  By resetting s->synced here, we will transition
from READY to READY once the error is resolved (if the job keeps
running), and that transition is not allowed.

Signed-off-by: Max Reitz 
---
 block/mirror.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/block/mirror.c b/block/mirror.c
index 98fc66eabf..d73b704473 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -121,7 +121,6 @@ typedef enum MirrorMethod {
 static BlockErrorAction mirror_error_action(MirrorBlockJob *s, bool read,
 int error)
 {
-s->synced = false;
 s->actively_synced = false;
 if (read) {
 return block_job_error_action(>common, s->on_source_error,
-- 
2.31.1




[PATCH for-6.1? 4/6] job: Add job_cancel_requested()

2021-07-22 Thread Max Reitz
Most callers of job_is_cancelled() actually want to know whether the job
is on its way to immediate termination.  For example, we refuse to pause
jobs that are cancelled; but this only makes sense for jobs that are
really actually cancelled.

A mirror job that is cancelled during READY with force=false should
absolutely be allowed to pause.  This "cancellation" (which is actually
a kind of completion) may take an indefinite amount of time, and so
should behave like any job during normal operation.  For example, with
on-target-error=stop, the job should stop on write errors.  (In
contrast, force-cancelled jobs should not get write errors, as they
should just terminate and not do further I/O.)

Therefore, redefine job_is_cancelled() to only return true for jobs that
are force-cancelled (which as of HEAD^ means any job that interprets the
cancellation request as a request for immediate termination), and add
job_cancel_request() as the general variant, which returns true for any
jobs which have been requested to be cancelled, whether it be
immediately or after an arbitrarily long completion phase.

Buglink: https://gitlab.com/qemu-project/qemu/-/issues/462
Signed-off-by: Max Reitz 
---
 include/qemu/job.h |  8 +++-
 block/mirror.c |  9 -
 job.c  | 13 +
 3 files changed, 20 insertions(+), 10 deletions(-)

diff --git a/include/qemu/job.h b/include/qemu/job.h
index 8aa90f7395..032edf3c5f 100644
--- a/include/qemu/job.h
+++ b/include/qemu/job.h
@@ -436,9 +436,15 @@ const char *job_type_str(const Job *job);
 /** Returns true if the job should not be visible to the management layer. */
 bool job_is_internal(Job *job);
 
-/** Returns whether the job is scheduled for cancellation. */
+/** Returns whether the job is being cancelled. */
 bool job_is_cancelled(Job *job);
 
+/**
+ * Returns whether the job is scheduled for cancellation (at an
+ * indefinite point).
+ */
+bool job_cancel_requested(Job *job);
+
 /** Returns whether the job is in a completed state. */
 bool job_is_completed(Job *job);
 
diff --git a/block/mirror.c b/block/mirror.c
index c3514f4196..291d2ed040 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -938,7 +938,7 @@ static int coroutine_fn mirror_run(Job *job, Error **errp)
 job_transition_to_ready(>common.job);
 s->synced = true;
 s->actively_synced = true;
-while (!job_is_cancelled(>common.job) && !s->should_complete) {
+while (!job_cancel_requested(>common.job) && !s->should_complete) {
 job_yield(>common.job);
 }
 s->common.job.cancelled = false;
@@ -1046,7 +1046,7 @@ static int coroutine_fn mirror_run(Job *job, Error **errp)
 }
 
 should_complete = s->should_complete ||
-job_is_cancelled(>common.job);
+job_cancel_requested(>common.job);
 cnt = bdrv_get_dirty_count(s->dirty_bitmap);
 }
 
@@ -1089,7 +1089,7 @@ static int coroutine_fn mirror_run(Job *job, Error **errp)
 }
 trace_mirror_before_sleep(s, cnt, s->synced, delay_ns);
 job_sleep_ns(>common.job, delay_ns);
-if (job_is_cancelled(>common.job) && s->common.job.force_cancel) {
+if (job_is_cancelled(>common.job)) {
 break;
 }
 s->last_pause_ns = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
@@ -1101,8 +1101,7 @@ immediate_exit:
  * or it was cancelled prematurely so that we do not guarantee that
  * the target is a copy of the source.
  */
-assert(ret < 0 || (s->common.job.force_cancel &&
-   job_is_cancelled(>common.job)));
+assert(ret < 0 || job_is_cancelled(>common.job));
 assert(need_drain);
 mirror_wait_for_all_io(s);
 }
diff --git a/job.c b/job.c
index e78d893a9c..c51c8077cb 100644
--- a/job.c
+++ b/job.c
@@ -216,6 +216,11 @@ const char *job_type_str(const Job *job)
 }
 
 bool job_is_cancelled(Job *job)
+{
+return job->cancelled && job->force_cancel;
+}
+
+bool job_cancel_requested(Job *job)
 {
 return job->cancelled;
 }
@@ -650,7 +655,7 @@ static void job_conclude(Job *job)
 
 static void job_update_rc(Job *job)
 {
-if (!job->ret && job_is_cancelled(job)) {
+if (!job->ret && job_cancel_requested(job)) {
 job->ret = -ECANCELED;
 }
 if (job->ret) {
@@ -704,7 +709,7 @@ static int job_finalize_single(Job *job)
 
 /* Emit events only if we actually started */
 if (job_started(job)) {
-if (job_is_cancelled(job)) {
+if (job_cancel_requested(job)) {
 job_event_cancelled(job);
 } else {
 job_event_completed(job);
@@ -1015,7 +1020,7 @@ void job_complete(Job *job, Error **errp)
 if (job_apply_verb(job, JOB_VERB_COMPLETE, errp)) {
 return;
 }
-if (job_is_cancelled(job) || !job->driver->complete) {
+if (job_cancel_requested(job) || !job->driver->complete) {
 error_setg(errp, "The active block job '%s' cannot be 

[PATCH for-6.1? 0/6] mirror: Handle errors after READY cancel

2021-07-22 Thread Max Reitz
Hi,

This is a rather complex series with changes that aren’t exactly local
to the mirror job, so maybe it’s too complex for 6.1.

However, it is a bug fix, and not an insignificant one, though probably
not a regression of any kind.

Bug report:
https://gitlab.com/qemu-project/qemu/-/issues/462

(I didn’t put any “Fixes:” or “Resolves:” into the commit messages,
because there is no single patch here that fixes the bug.)

The root of the problem is that if you cancel a mirror job during its
READY phase, any kind of I/O error (with the error action 'stop') is
likely to not be handled gracefully, which means that perhaps the job
will just loop forever without pausing, doing nothing but emitting
errors.  There is no way to stop the job, or cancel it with force=true,
and so you also cannot quit qemu normally, because, well, cancelling the
job doesn’t do anything.  So you have to kill qemu to stop the mess.

If you’re lucky, the error is transient.  Then qemu will just kill
itself with a failed assertion, because it’ll try a READY -> READY
transition, which isn’t allowed.

There are a couple of problems contributing to it all:

(1) The READY -> READY transition comes from the fact that we will enter
the READY state whenever source and target are synced, and whenever
s->synced is false.  I/O errors reset s->synced.  I believe they
shouldn’t.
(Patch 1)

(2) Quitting qemu doesn’t force-cancel jobs.  I don’t understand why.
If for all jobs but mirror we want them to be cancelled and not
properly completed, why do we want mirror to get a consistent
result?  (Which is what cancel with force=false gives you.)
I believe we actually don’t care, and so on many occasions where we
invoke job_cancel_sync() and job_cancel_sync_all(), we want to
force-cancel the job(s) in question.
(Patch 2)

(3) Cancelling mirror post-READY with force=false is actually not really
cancelling the job.  It’s a different completion mode.  The job
should run like any normal job, it shouldn’t be special-cased.
However, we have a couple of places that special-case cancelled job
because they believe that such jobs are on their way to definite
termination.  For example, we don’t allow pausing cancelled jobs.
We definitely do want to allow pausing a mirror post-READY job that
is being non-force-cancelled.  The job may still take an arbitrary
amount of time, so it absolutely should be pausable.
(Patches 3, 4)

(4) Mirror only checks whether it’s been force-cancelled at the bottom
of its main loop, after several `continue`s.  Therefore, if flushing
fails (and it then `continue`s), that check will be skipped.  If
flushing fails continuously, the job cannot be force-cancelled.
(Patch 5)


Max Reitz (6):
  mirror: Keep s->synced on error
  job: @force parameter for job_cancel_sync{,_all}()
  jobs: Give Job.force_cancel more meaning
  job: Add job_cancel_requested()
  mirror: Check job_is_cancelled() earlier
  iotests: Add mirror-ready-cancel-error test

 include/qemu/job.h|  29 +++-
 block/backup.c|   3 +-
 block/mirror.c|  35 +++--
 block/replication.c   |   4 +-
 blockdev.c|   4 +-
 job.c |  46 --
 qemu-nbd.c|   2 +-
 softmmu/runstate.c|   2 +-
 storage-daemon/qemu-storage-daemon.c  |   2 +-
 tests/unit/test-block-iothread.c  |   2 +-
 tests/unit/test-blockjob.c|   2 +-
 tests/qemu-iotests/109.out|  60 +++-
 .../tests/mirror-ready-cancel-error   | 143 ++
 .../tests/mirror-ready-cancel-error.out   |   5 +
 tests/qemu-iotests/tests/qsd-jobs.out |   2 +-
 15 files changed, 262 insertions(+), 79 deletions(-)
 create mode 100755 tests/qemu-iotests/tests/mirror-ready-cancel-error
 create mode 100644 tests/qemu-iotests/tests/mirror-ready-cancel-error.out

-- 
2.31.1




Re: [PATCH for-6.1?] iotest: Further enhance qemu-img-bitmaps

2021-07-22 Thread Vladimir Sementsov-Ogievskiy

21.07.2021 23:46, Eric Blake wrote:

Add a regression test to make sure we detect attempts to use 'qemu-img
bitmap' to modify an in-use local file.

Suggested-by: Nir Soffer 
Signed-off-by: Eric Blake 
---

Sadly, this missed my bitmaps pull request today.  If there's any
reason to respin that pull request, I'm inclined to add this in, as it
just touches the iotests; otherwise, if it slips to 6.2 it's not too
bad.

  tests/qemu-iotests/tests/qemu-img-bitmaps | 6 ++
  tests/qemu-iotests/tests/qemu-img-bitmaps.out | 5 +
  2 files changed, 11 insertions(+)

diff --git a/tests/qemu-iotests/tests/qemu-img-bitmaps 
b/tests/qemu-iotests/tests/qemu-img-bitmaps
index 7a3fe8c3d37a..3b6fade11735 100755
--- a/tests/qemu-iotests/tests/qemu-img-bitmaps
+++ b/tests/qemu-iotests/tests/qemu-img-bitmaps
@@ -129,6 +129,12 @@ $QEMU_IMG map --output=json --image-opts \
  $QEMU_IMG map --output=json --image-opts \
  "$IMG,x-dirty-bitmap=qemu:dirty-bitmap:b3" | _filter_qemu_img_map

+echo
+echo "=== bitmap command fails to modify image already in use ==="
+echo
+
+$QEMU_IMG bitmap --add "$TEST_IMG" b4 2>&1 | _filter_testdir
+
  nbd_server_stop

  echo
diff --git a/tests/qemu-iotests/tests/qemu-img-bitmaps.out 
b/tests/qemu-iotests/tests/qemu-img-bitmaps.out
index e851f0320ecb..c6e12dd700aa 100644
--- a/tests/qemu-iotests/tests/qemu-img-bitmaps.out
+++ b/tests/qemu-iotests/tests/qemu-img-bitmaps.out
@@ -116,6 +116,11 @@ Format specific information:
  { "start": 2097152, "length": 1048576, "depth": 0, "present": false, "zero": false, 
"data": false},
  { "start": 3145728, "length": 7340032, "depth": 0, "present": true, "zero": false, 
"data": true, "offset": OFFSET}]

+=== bitmap command fails to modify image already in use ===
+
+qemu-img: Could not open 'TEST_DIR/t.qcow2': Failed to get "write" lock
+Is another process using the image [TEST_DIR/t.qcow2]?
+
  === Check handling of inconsistent bitmap ===

  image: TEST_DIR/t.IMGFMT




I'm not against, but why you decided to add such a test? Lock should work for 
any operation, what's the reason to check bitmaps separately? Or it was broken 
recently?

anyway:
Reviewed-by: Vladimir Sementsov-Ogievskiy 

--
Best regards,
Vladimir



[PATCH] nbd/server: Suppress Broken pipe errors on abrupt disconnection

2021-07-22 Thread Richard W.M. Jones
This simple patch suppresses a warning message from qemu-nbd when a
client abruptly disconnects.  There is a way to work around this in
the client (by shutting down the connection properly).  Nevertheless
the "Broken pipe" error seems unnecessary to me as it does not convey
any useful information and might be used to fill up logs.

Rich.





[PATCH] nbd/server: Suppress Broken pipe errors on abrupt disconnection

2021-07-22 Thread Richard W.M. Jones
$ rm -f /tmp/sock /tmp/pid
$ qemu-img create -f qcow2 /tmp/disk.qcow2 1M
$ qemu-nbd -t --format=qcow2 --socket=/tmp/sock --pid-file=/tmp/pid 
/tmp/disk.qcow2 &
$ nbdsh -u 'nbd+unix:///?socket=/tmp/sock' -c 'h.get_size()'
qemu-nbd: Disconnect client, due to: Failed to send reply: Unable to write to 
socket: Broken pipe
$ killall qemu-nbd

nbdsh is abruptly dropping the NBD connection here which is a valid
way to close the connection.  It seems unnecessary to print an error
in this case so this commit suppresses it.

Note that if you call the nbdsh h.shutdown() method then the message
was not printed:

$ nbdsh -u 'nbd+unix:///?socket=/tmp/sock' -c 'h.get_size()' -c 'h.shutdown()'

Signed-off-by: Richard W.M. Jones 
---
 nbd/server.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/nbd/server.c b/nbd/server.c
index b60ebc3ab6..0f86535b88 100644
--- a/nbd/server.c
+++ b/nbd/server.c
@@ -2668,7 +2668,11 @@ static coroutine_fn void nbd_trip(void *opaque)
 ret = nbd_handle_request(client, , req->data, _err);
 }
 if (ret < 0) {
-error_prepend(_err, "Failed to send reply: ");
+if (errno != EPIPE) {
+error_prepend(_err, "Failed to send reply: ");
+} else {
+local_err = NULL;
+}
 goto disconnect;
 }
 
-- 
2.32.0




Re: [PATCH v6 4/6] qmp: add QMP command x-debug-virtio-queue-status

2021-07-22 Thread Jason Wang



在 2021/7/21 下午4:59, Jonah Palmer 写道:



On 7/13/21 10:37 PM, Jason Wang wrote:


在 2021/7/12 下午6:35, Jonah Palmer 写道:

From: Laurent Vivier 

This new command shows internal status of a VirtQueue.
(vrings and indexes).

Signed-off-by: Laurent Vivier 
Signed-off-by: Jonah Palmer 
---
  hw/virtio/virtio-stub.c |   6 +++
  hw/virtio/virtio.c  |  37 ++
  qapi/virtio.json    | 102 


  3 files changed, 145 insertions(+)

  [Jonah: Added 'device-type' field to VirtQueueStatus and
  qmp command x-debug-virtio-queue-status.]

diff --git a/hw/virtio/virtio-stub.c b/hw/virtio/virtio-stub.c
index ddb592f..3c1bf17 100644
--- a/hw/virtio/virtio-stub.c
+++ b/hw/virtio/virtio-stub.c
@@ -17,3 +17,9 @@ VirtioStatus *qmp_x_debug_virtio_status(const 
char* path, Error **errp)

  {
  return qmp_virtio_unsupported(errp);
  }
+
+VirtQueueStatus *qmp_x_debug_virtio_queue_status(const char *path,
+ uint16_t queue, 
Error **errp)

+{
+    return qmp_virtio_unsupported(errp);
+}
diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index 81a0ee8..ccd4371 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -3935,6 +3935,43 @@ static VirtIODevice *virtio_device_find(const 
char *path)

  return NULL;
  }
  +VirtQueueStatus *qmp_x_debug_virtio_queue_status(const char *path,
+ uint16_t queue, 
Error **errp)

+{
+    VirtIODevice *vdev;
+    VirtQueueStatus *status;
+
+    vdev = virtio_device_find(path);
+    if (vdev == NULL) {
+    error_setg(errp, "Path %s is not a VirtIO device", path);
+    return NULL;
+    }
+
+    if (queue >= VIRTIO_QUEUE_MAX || !virtio_queue_get_num(vdev, 
queue)) {

+    error_setg(errp, "Invalid virtqueue number %d", queue);
+    return NULL;
+    }
+
+    status = g_new0(VirtQueueStatus, 1);
+    status->device_type = qapi_enum_parse(_lookup, 
vdev->name,

+ VIRTIO_TYPE_UNKNOWN, NULL);
+    status->queue_index = vdev->vq[queue].queue_index;
+    status->inuse = vdev->vq[queue].inuse;
+    status->vring_num = vdev->vq[queue].vring.num;
+    status->vring_num_default = vdev->vq[queue].vring.num_default;
+    status->vring_align = vdev->vq[queue].vring.align;
+    status->vring_desc = vdev->vq[queue].vring.desc;
+    status->vring_avail = vdev->vq[queue].vring.avail;
+    status->vring_used = vdev->vq[queue].vring.used;
+    status->last_avail_idx = vdev->vq[queue].last_avail_idx;



As mentioned in previous versions. We need add vhost support 
otherwise the value here is wrong.

Got it. I'll add a case to determine if vhost is active for a given device.
So, in the case that vhost is active, should I just not print out the value or 
would I substitute it with
another value (whatever that might be)?



You can query the vhost for those index.

(vhost_get_vring_base())



  Same question for shadow_avail_idx below as well.



It's an implementation specific. I think we can simply not show it if 
vhost is enabled.


Thanks




Jonah




+    status->shadow_avail_idx = vdev->vq[queue].shadow_avail_idx;



The shadow index is something that is implementation specific e.g in 
the case of vhost it's kind of meaningless.


Thanks



+    status->used_idx = vdev->vq[queue].used_idx;
+    status->signalled_used = vdev->vq[queue].signalled_used;
+    status->signalled_used_valid = 
vdev->vq[queue].signalled_used_valid;

+
+    return status;
+}
+
  #define CONVERT_FEATURES(type, map)    \
  ({   \
  type *list = NULL; \
diff --git a/qapi/virtio.json b/qapi/virtio.json
index 78873cd..7007e0c 100644
--- a/qapi/virtio.json
+++ b/qapi/virtio.json
@@ -406,3 +406,105 @@
    'data': { 'path': 'str' },
    'returns': 'VirtioStatus'
  }
+
+##
+# @VirtQueueStatus:
+#
+# Status of a VirtQueue
+#
+# @device-type: VirtIO device type
+#
+# @queue-index: VirtQueue queue_index
+#
+# @inuse: VirtQueue inuse
+#
+# @vring-num: VirtQueue vring.num
+#
+# @vring-num-default: VirtQueue vring.num_default
+#
+# @vring-align: VirtQueue vring.align
+#
+# @vring-desc: VirtQueue vring.desc
+#
+# @vring-avail: VirtQueue vring.avail
+#
+# @vring-used: VirtQueue vring.used
+#
+# @last-avail-idx: VirtQueue last_avail_idx
+#
+# @shadow-avail-idx: VirtQueue shadow_avail_idx
+#
+# @used-idx: VirtQueue used_idx
+#
+# @signalled-used: VirtQueue signalled_used
+#
+# @signalled-used-valid: VirtQueue signalled_used_valid
+#
+# Since: 6.1
+#
+##
+
+{ 'struct': 'VirtQueueStatus',
+  'data': {
+    'device-type': 'VirtioType',
+    'queue-index': 'uint16',
+    'inuse': 'uint32',
+    'vring-num': 'int',
+    'vring-num-default': 'int',
+    'vring-align': 'int',
+    'vring-desc': 'uint64',
+    'vring-avail': 'uint64',
+    'vring-used': 'uint64',
+    'last-avail-idx': 'uint16',
+    'shadow-avail-idx': 'uint16',
+    'used-idx': 'uint16',
+    'signalled-used': 'uint16',
+  

Re: [PATCH v6 6/6] hmp: add virtio commands

2021-07-22 Thread Jason Wang



在 2021/7/21 下午5:11, Jonah Palmer 写道:



On 7/13/21 10:40 PM, Jason Wang wrote:


在 2021/7/12 下午6:35, Jonah Palmer 写道:

+void hmp_virtio_queue_status(Monitor *mon, const QDict *qdict)
+{
+    Error *err = NULL;
+    const char *path = qdict_get_try_str(qdict, "path");
+    int queue = qdict_get_int(qdict, "queue");
+    VirtQueueStatus *s = qmp_x_debug_virtio_queue_status(path, 
queue, );

+
+    if (err != NULL) {
+    hmp_handle_error(mon, err);
+    return;
+    }
+
+    monitor_printf(mon, "%s:\n", path);
+    monitor_printf(mon, "  device_type:  %s\n",
+   VirtioType_str(s->device_type));
+    monitor_printf(mon, "  index:    %d\n", 
s->queue_index);

+    monitor_printf(mon, "  inuse:    %d\n", s->inuse);
+    monitor_printf(mon, "  last_avail_idx:   %d (%"PRId64" %% 
%"PRId64")\n",
+   s->last_avail_idx, s->last_avail_idx % 
s->vring_num,

+   s->vring_num);
+    monitor_printf(mon, "  shadow_avail_idx: %d (%"PRId64" %% 
%"PRId64")\n",
+   s->shadow_avail_idx, s->shadow_avail_idx % 
s->vring_num,

+   s->vring_num);
+    monitor_printf(mon, "  used_idx: %d (%"PRId64" %% 
%"PRId64")\n",
+   s->used_idx, s->used_idx % s->vring_num, 
s->vring_num);



The modular information is not the case of packed ring where the 
queue size does not have to be a power of 2.

Doesn't modulo work for any integer, regardless if it's a power of 2 or not? 
Could you clarify this for me?



For packed ring, the index doesn't increase freely, it's always small 
than the virtqueue size.


So showing the modulo arithmetic seems useless since the device or 
driver doesn't use modulo for calculating the real offset.


Thanks




Thank you,





Re: [PATCH v6 0/6] hmp, qmp: Add some commands to introspect virtio devices

2021-07-22 Thread Jason Wang



在 2021/7/21 下午4:53, Jonah Palmer 写道:


Hi Jason. My apologies for the delayed response, several work-related 
things came up recently, but they're slowing down now so I'm turning 
my attention these patches to get taken care of.


A few questions and comments below (and in other following patches):


On 7/13/21 10:42 PM, Jason Wang wrote:


在 2021/7/12 下午6:35, Jonah Palmer 写道:
 Dump the information of the head element of the third queue 
of virtio-scsi:


 (qemu) virtio queue-element 
/machine/peripheral-anon/device[3]/virtio-backend 3

 index: 122
 ndescs: 3
 descs: addr 0x7302d000 len 4096 (write), addr 0x3c951763 
len 108 (write, next),

    addr 0x3c951728 len 51 (next)



I think it would be nice if we can show driver area and device area 
as well here.

Sure thing. And I apologize if it's obvious (I'm relatively new to virtio), but 
how can I expose the driver area?



So the spec defines three parts: the device area, the driver area, and 
the descriptor area. And they are all located in the guest memory.




I understand that virtio devices are part of the Qemu process, but I also 
thought that virtio drivers are in the
guest's kernel, which I don't believe I can see into from Qemu (or, at least, 
it's not obvious to me).



It works like how you access the descriptor ring (descriptor area).

Thanks




Jonah


Thanks