Re: [PATCH v10 4/8] qmp: add QMP command x-query-virtio-status

2021-12-06 Thread Markus Armbruster
Jonah Palmer  writes:

> From: Laurent Vivier 
>
> This new command shows the status of a VirtIODevice, including
> its corresponding vhost device's status (if active).
>
> Next patch will improve output by decoding feature bits, including
> vhost device's feature bits (backend, protocol, acked, and features).
> Also will decode status bits of a VirtIODevice.
>
> Signed-off-by: Jonah Palmer 
> ---

[...]

> diff --git a/qapi/virtio.json b/qapi/virtio.json
> index 5372cde..235b8f3 100644
> --- a/qapi/virtio.json
> +++ b/qapi/virtio.json
> @@ -66,3 +66,223 @@
>  { 'command': 'x-query-virtio',
>'returns': [ 'VirtioInfo' ],
>'features': [ 'unstable' ] }
> +
> +##
> +# @VhostStatus:
> +#
> +# Information about a vhost device. This information will only be
> +# displayed if the vhost device is active.
> +#
> +# @n-mem-sections: vhost_dev n_mem_sections
> +#
> +# @n-tmp-sections: vhost_dev n_tmp_sections
> +#
> +# @nvqs: vhost_dev nvqs (number of virtqueues being used)
> +#
> +# @vq-index: vhost_dev vq_index
> +#
> +# @features: vhost_dev features
> +#
> +# @acked-features: vhost_dev acked_features
> +#
> +# @backend-features: vhost_dev backend_features
> +#
> +# @protocol-features: vhost_dev protocol_features
> +#
> +# @max-queues: vhost_dev max_queues
> +#
> +# @backend-cap: vhost_dev backend_cap
> +#
> +# @log-enabled: vhost_dev log_enabled flag
> +#
> +# @log-size: vhost_dev log_size
> +#
> +# Since: 7.0
> +#
> +##
> +
> +{ 'struct': 'VhostStatus',
> +  'data': { 'n-mem-sections': 'int',
> +'n-tmp-sections': 'int',
> +'nvqs': 'uint32',
> +'vq-index': 'int',
> +'features': 'uint64',
> +'acked-features': 'uint64',
> +'backend-features': 'uint64',
> +'protocol-features': 'uint64',
> +'max-queues': 'uint64',
> +'backend-cap': 'uint64',
> +'log-enabled': 'bool',
> +'log-size': 'uint64'  } }

Extra space before } }

> +
> +##
> +# @VirtioStatus:
> +#
> +# Full status of the virtio device with most VirtIODevice members.
> +# Also includes the full status of the corresponding vhost device
> +# if the vhost device is active.
> +#
> +# @name: VirtIODevice name
> +#
> +# @device-id: VirtIODevice ID
> +#
> +# @vhost-started: VirtIODevice vhost_started flag
> +#
> +# @guest-features: VirtIODevice guest_features
> +#
> +# @host-features: VirtIODevice host_features
> +#
> +# @backend-features: VirtIODevice backend_features
> +#
> +# @device-endian: VirtIODevice device_endian
> +#
> +# @num-vqs: VirtIODevice virtqueue count. This is the number of active
> +#   virtqueues being used by the VirtIODevice.
> +#
> +# @status: VirtIODevice configuration status (VirtioDeviceStatus)
> +#
> +# @isr: VirtIODevice ISR
> +#
> +# @queue-sel: VirtIODevice queue_sel
> +#
> +# @vm-running: VirtIODevice vm_running flag
> +#
> +# @broken: VirtIODevice broken flag
> +#
> +# @disabled: VirtIODevice disabled flag
> +#
> +# @use-started: VirtIODevice use_started flag
> +#
> +# @started: VirtIODevice started flag
> +#
> +# @start-on-kick: VirtIODevice start_on_kick flag
> +#
> +# @disable-legacy-check: VirtIODevice disabled_legacy_check flag
> +#
> +# @bus-name: VirtIODevice bus_name
> +#
> +# @use-guest-notifier-mask: VirtIODevice use_guest_notifier_mask flag
> +#
> +# @vhost-dev: Corresponding vhost device info for a given VirtIODevice

Suggest to spell out: "Present if ..."

> +#
> +# Since: 7.0
> +#
> +##
> +
> +{ 'struct': 'VirtioStatus',
> +  'data': { 'name': 'str',
> +'device-id': 'uint16',
> +'vhost-started': 'bool',
> +'device-endian': 'str',
> +'guest-features': 'uint64',
> +'host-features': 'uint64',
> +'backend-features': 'uint64',
> +'num-vqs': 'int',
> +'status': 'uint8',
> +'isr': 'uint8',
> +'queue-sel': 'uint16',
> +'vm-running': 'bool',
> +'broken': 'bool',
> +'disabled': 'bool',
> +'use-started': 'bool',
> +'started': 'bool',
> +'start-on-kick': 'bool',
> +'disable-legacy-check': 'bool',
> +'bus-name': 'str',
> +'use-guest-notifier-mask': 'bool',
> +'*vhost-dev': 'VhostStatus' } }
> +
> +##
> +# @x-query-virtio-status:
> +#
> +# Poll for a comprehensive status of a given virtio device
> +#
> +# @path: Canonical QOM path of the VirtIODevice
> +#
> +# Features:
> +# @unstable: This command is meant for debugging

End with a period for consistency with existing docs, like you did in
v9.

> +#
> +# Returns: VirtioStatus of the virtio device
> +#
> +# Since: 7.0
> +#
> +# Examples:
> +#
> +# 1. Poll for the status of virtio-crypto (no vhost-crypto active)
> +#
> +# -> { "execute": "x-query-virtio-status",
> +#  "arguments": { "path": "/machine/peripheral/crypto0/virtio-backend" }
> +#}
> +# <- { "return": {
> +#"device-endian": 

Re: [PATCH v10 3/8] qmp: add QMP command x-query-virtio

2021-12-06 Thread Markus Armbruster
Jonah Palmer  writes:

> From: Laurent Vivier 
>
> This new command lists all the instances of VirtIODevices with
> their canonical QOM path and name.
>
> Signed-off-by: Jonah Palmer 
> ---
>  hw/virtio/meson.build  |  2 ++
>  hw/virtio/virtio-stub.c| 14 ++
>  hw/virtio/virtio.c | 27 ++
>  include/hw/virtio/virtio.h |  1 +
>  qapi/meson.build   |  1 +
>  qapi/qapi-schema.json  |  1 +
>  qapi/virtio.json   | 68 ++
>  tests/qtest/qmp-cmd-test.c |  1 +
>  8 files changed, 115 insertions(+)
>  create mode 100644 hw/virtio/virtio-stub.c
>  create mode 100644 qapi/virtio.json
>
> diff --git a/hw/virtio/meson.build b/hw/virtio/meson.build
> index 521f7d6..d893f5f 100644
> --- a/hw/virtio/meson.build
> +++ b/hw/virtio/meson.build
> @@ -6,8 +6,10 @@ softmmu_virtio_ss.add(when: 'CONFIG_VHOST', if_false: files('vhost-stub.c'))
>  
>  softmmu_ss.add_all(when: 'CONFIG_VIRTIO', if_true: softmmu_virtio_ss)
>  softmmu_ss.add(when: 'CONFIG_VIRTIO', if_false: files('vhost-stub.c'))
> +softmmu_ss.add(when: 'CONFIG_VIRTIO', if_false: files('virtio-stub.c'))
>  
>  softmmu_ss.add(when: 'CONFIG_ALL', if_true: files('vhost-stub.c'))
> +softmmu_ss.add(when: 'CONFIG_ALL', if_true: files('virtio-stub.c'))
>  
>  virtio_ss = ss.source_set()
>  virtio_ss.add(files('virtio.c'))
> diff --git a/hw/virtio/virtio-stub.c b/hw/virtio/virtio-stub.c
> new file mode 100644
> index 000..05a81ed
> --- /dev/null
> +++ b/hw/virtio/virtio-stub.c
> @@ -0,0 +1,14 @@
> +#include "qemu/osdep.h"
> +#include "qapi/error.h"
> +#include "qapi/qapi-commands-virtio.h"
> +
> +static void *qmp_virtio_unsupported(Error **errp)
> +{
> +error_setg(errp, "Virtio is disabled");
> +return NULL;
> +}
> +
> +VirtioInfoList *qmp_x_query_virtio(Error **errp)
> +{
> +return qmp_virtio_unsupported(errp);
> +}
> diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
> index 867834d..76a63a0 100644
> --- a/hw/virtio/virtio.c
> +++ b/hw/virtio/virtio.c
> @@ -13,6 +13,8 @@
>  
>  #include "qemu/osdep.h"
>  #include "qapi/error.h"
> +#include "qapi/qapi-commands-virtio.h"
> +#include "qapi/qapi-visit-virtio.h"
>  #include "cpu.h"
>  #include "trace.h"
>  #include "qemu/error-report.h"
> @@ -29,6 +31,9 @@
>  #include "sysemu/runstate.h"
>  #include "standard-headers/linux/virtio_ids.h"
>  
> +/* QAPI list of realized VirtIODevices */
> +static QTAILQ_HEAD(, VirtIODevice) virtio_list;
> +
>  /*
>   * The alignment to use between consumer and producer parts of vring.
>   * x86 pagesize again. This is the default, used by transports like PCI
> @@ -3700,6 +3705,7 @@ static void virtio_device_realize(DeviceState *dev, Error **errp)
>  vdev->listener.commit = virtio_memory_listener_commit;
>  vdev->listener.name = "virtio";
>  memory_listener_register(&vdev->listener, vdev->dma_as);
> +QTAILQ_INSERT_TAIL(&virtio_list, vdev, next);
>  }
>  
>  static void virtio_device_unrealize(DeviceState *dev)
> @@ -3714,6 +3720,7 @@ static void virtio_device_unrealize(DeviceState *dev)
>  vdc->unrealize(dev);
>  }
>  
> +QTAILQ_REMOVE(&virtio_list, vdev, next);
>  g_free(vdev->bus_name);
>  vdev->bus_name = NULL;
>  }
> @@ -3887,6 +3894,8 @@ static void virtio_device_class_init(ObjectClass *klass, void *data)
>  vdc->stop_ioeventfd = virtio_device_stop_ioeventfd_impl;
>  
>  vdc->legacy_features |= VIRTIO_LEGACY_FEATURES;
> +
> +QTAILQ_INIT(&virtio_list);
>  }

@virtio_list duplicates information that exists in the QOM composition
tree.  It needs to stay in sync.  You could search the composition tree
instead.  I don't particularly care, but a virtio or QOM maintainer
might have a preference.
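
[Editor's note: Markus's alternative could be sketched as below. This is untested pseudocode, not part of the patch: it assumes QEMU's object_child_foreach_recursive() and object_dynamic_cast() helpers and DeviceState's realized flag, and is not a drop-in replacement.]

```
/* Sketch only: enumerate VirtIODevices by walking the QOM composition
 * tree instead of maintaining a separate virtio_list. */
static int qmp_virtio_add_one(Object *obj, void *opaque)
{
    VirtioInfoList **list = opaque;
    VirtIODevice *vdev = (VirtIODevice *)
        object_dynamic_cast(obj, TYPE_VIRTIO_DEVICE);

    if (vdev && DEVICE(vdev)->realized) {
        VirtioInfo *info = g_new0(VirtioInfo, 1);
        info->path = g_strdup(DEVICE(vdev)->canonical_path);
        info->name = g_strdup(vdev->name);
        QAPI_LIST_PREPEND(*list, info);
    }
    return 0;   /* 0 means: keep walking the tree */
}

VirtioInfoList *qmp_x_query_virtio(Error **errp)
{
    VirtioInfoList *list = NULL;

    object_child_foreach_recursive(object_get_root(),
                                   qmp_virtio_add_one, &list);
    return list;
}
```

This trades the bookkeeping in realize/unrealize for a tree walk on each query, so the command can never go out of sync with QOM.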

>  
>  bool virtio_device_ioeventfd_enabled(VirtIODevice *vdev)
> @@ -3897,6 +3906,24 @@ bool virtio_device_ioeventfd_enabled(VirtIODevice *vdev)
>  return virtio_bus_ioeventfd_enabled(vbus);
>  }
>  
> +VirtioInfoList *qmp_x_query_virtio(Error **errp)
> +{
> +VirtioInfoList *list = NULL;
> +VirtioInfoList *node;
> +VirtIODevice *vdev;
> +
> +QTAILQ_FOREACH(vdev, &virtio_list, next) {
> +DeviceState *dev = DEVICE(vdev);
> +node = g_new0(VirtioInfoList, 1);
> +node->value = g_new(VirtioInfo, 1);
> +node->value->path = g_strdup(dev->canonical_path);
> +node->value->name = g_strdup(vdev->name);
> +QAPI_LIST_PREPEND(list, node->value);
> +}
> +
> +return list;
> +}
> +
>  static const TypeInfo virtio_device_info = {
>  .name = TYPE_VIRTIO_DEVICE,
>  .parent = TYPE_DEVICE,
> diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
> index 0a12bd5..3b4eb85 100644
> --- a/include/hw/virtio/virtio.h
> +++ b/include/hw/virtio/virtio.h
> @@ -110,6 +110,7 @@ struct VirtIODevice
>  bool use_guest_notifier_mask;
>  AddressSpace *dma_as;
>  QLIST_HEAD(, VirtQueue) *vector_queues;
> +QTAILQ_ENTRY(VirtIODevice) next;
>  };
>  
>  struct VirtioDeviceClass {
> diff 

Re: [PATCH] spec: Add NBD_OPT_EXTENDED_HEADERS

2021-12-06 Thread Eric Blake
On Mon, Dec 06, 2021 at 02:40:45PM +0300, Vladimir Sementsov-Ogievskiy wrote:
> >    Simple reply message
> > 
> >   The simple reply message MUST be sent by the server in response to all
> >   requests if structured replies have not been negotiated using
> > -`NBD_OPT_STRUCTURED_REPLY`. If structured replies have been negotiated, a simple
> > -reply MAY be used as a reply to any request other than `NBD_CMD_READ`,
> > -but only if the reply has no data payload.  The message looks as
> > -follows:
> > +`NBD_OPT_STRUCTURED_REPLY`. If structured replies have been
> > +negotiated, a simple reply MAY be used as a reply to any request other
> > +than `NBD_CMD_READ`, but only if the reply has no data payload.  If
> > +extended headers were not negotiated using `NBD_OPT_EXTENDED_HEADERS`,
> > +the message looks as follows:
> > 
> >   S: 32 bits, 0x67446698, magic (`NBD_SIMPLE_REPLY_MAGIC`; used to be
> >  `NBD_REPLY_MAGIC`)
> > @@ -369,6 +398,16 @@ S: 64 bits, handle
> >   S: (*length* bytes of data if the request is of type `NBD_CMD_READ` and
> >   *error* is zero)
> > 
> > +If extended headers were negotiated using `NBD_OPT_EXTENDED_HEADERS`,
> > +the message looks like:
> > +
> > +S: 32 bits, 0x60d12fd6, magic (`NBD_SIMPLE_REPLY_EXT_MAGIC`)
> > +S: 32 bits, error (MAY be zero)
> > +S: 64 bits, handle
> > +S: 128 bits, padding (MUST be zero)
> > +S: (*length* bytes of data if the request is of type `NBD_CMD_READ` and
> > +*error* is zero)
> > +
> 
> If we go this way, let's put payload length into padding: it will help to 
> make the protocol context-independent and less error-prone.

Easy enough to do (the payload length will be 0 except for
NBD_CMD_READ).

> 
> Or, the other way, maybe just forbid the payload for simple-64bit? What's 
> the reason to allow 64bit requests without structured reply negotiation?

The two happened to be orthogonal enough in my implementation.  It was
easy to demonstrate either one without the other, and it IS easier to
write a client using non-structured replies (structured reads ARE
tougher than simple reads, even if it is less efficient when it comes
to reading zeros).  But you are also right that we could require
structured reads prior to allowing 64-bit operations, and then have
only one supported reply type on the wire when negotiated.  Wouter,
which way do you prefer?

> 
> >    Structured reply chunk message
> > 
> >   Some of the major downsides of the default simple reply to
> > @@ -410,7 +449,9 @@ considered successful only if it did not contain any error chunks,
> >   although the client MAY be able to determine partial success based
> >   on the chunks received.
> > 
> > -A structured reply chunk message looks as follows:
> > +If extended headers were not negotiated using
> > +`NBD_OPT_EXTENDED_HEADERS`, a structured reply chunk message looks as
> > +follows:
> > 
> >   S: 32 bits, 0x668e33ef, magic (`NBD_STRUCTURED_REPLY_MAGIC`)
> >   S: 16 bits, flags
> > @@ -423,6 +464,17 @@ The use of *length* in the reply allows context-free division of
> >   the overall server traffic into individual reply messages; the
> >   *type* field describes how to further interpret the payload.
> > 
> > +If extended headers were negotiated using `NBD_OPT_EXTENDED_HEADERS`,
> > +the message looks like:
> > +
> > +S: 32 bits, 0x6e8a278c, magic (`NBD_STRUCTURED_REPLY_EXT_MAGIC`)
> > +S: 16 bits, flags
> > +S: 16 bits, type
> > +S: 64 bits, handle
> > +S: 64 bits, length of payload (unsigned)
> 
> Maybe, 64bits is too much for payload. But who knows. And it's good that it's 
> symmetric to 64bit length in request.

Indeed, both qemu and libnbd implementations explicitly kill the
connection to any server that replies with more than the max buffer
used for NBD_CMD_READ/WRITE (32M for qemu, 64M for libnbd).  And if
the spec is not already clear on the topic, I should add an
independent patch to NBD_CMD_BLOCK_STATUS to make it obvious that a
server cannot reply with too many extents because of such clients.

So none of my proof-of-concept code ever used the full 64-bits of the
reply header length.  On the other hand, there is indeed the symmetry
argument - if someone writes a server willing to accept a 4G
NBD_CMD_WRITE, then it should also support a 4G NBD_CMD_READ, even if
no known existing server or client allows buffers that large.

> 
> > +S: 64 bits, padding (MUST be zero)
> 
> Hmm. Extra 8 bytes to be power-of-2. Does 32 bytes really perform better than 
> 24 bytes?

Consider:
struct header[100];

if sizeof(header[0]) is a power of 2 <= the cache line size (and the
compiler prefers to start arrays aligned to the cache line) then we
are guaranteed that all array members each reside in a single cache
line.  But if it is not a power of 2, some of the array members
straddle two cache lines.
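
[Editor's note: the alignment argument can be checked mechanically. A standalone sketch follows; the struct layout and field names are guesses from the draft above, not normative.]

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Guessed layout of the proposed 32-byte extended simple reply header. */
struct ext_simple_reply {
    uint32_t magic;       /* NBD_SIMPLE_REPLY_EXT_MAGIC */
    uint32_t error;
    uint64_t handle;
    uint64_t padding[2];  /* MUST be zero (or could carry payload length) */
};

/* Returns 1 if, in a cache-line-aligned array of n headers of size sz,
 * no element straddles a 64-byte cache line boundary. */
static int no_straddle(size_t sz, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        /* first and last byte of element i must land in the same line */
        if ((i * sz) / 64 != (i * sz + sz - 1) / 64) {
            return 0;
        }
    }
    return 1;
}
```

With the 32-byte header every array element stays inside one line, while a 24-byte header makes some elements straddle a boundary.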

Will there be code that wants to create an array of headers?  Perhaps
so, because that is a logical way (along with scatter/gather to
combine the header with 

Re: [PATCH v10 0/8] hmp,qmp: Add commands to introspect virtio devices

2021-12-06 Thread Michael S. Tsirkin
On Mon, Dec 06, 2021 at 07:13:49PM +0100, Christian Schoenebeck wrote:
> On Montag, 6. Dezember 2021 17:09:45 CET Jonah Palmer wrote:
> > On 12/6/21 08:50, Christian Schoenebeck wrote:
> > > On Montag, 6. Dezember 2021 13:43:18 CET Jonah Palmer wrote:
> > >> This series introduces new QMP/HMP commands to dump the status of a
> > >> virtio device at different levels.
> > >> 
> > >> [Jonah: Rebasing previous patchset from Nov. 10 (v9). Original patches
> > >> 
> > >>   are by Laurent Vivier from May 2020.
> > >>   
> > >>   Rebase from v9 to v10 includes reformatting virtio.json examples and
> > >>   command structures for better consistency. Also removed all enums from
> > >>   virtio.json and replaced their purpose with string literals.
> > >>   
> > >>   Removed @ndescs from VirtioQueueElement, as the number of descriptors
> > >>   can be inferred from the length of the @descs chain.
> > >>   
> > >>   Lastly, removed the examples in hmp-commands-info.hx to fix
> > >>   'inconsistent
> > >>   literal block quoting' warning from Sphinx.]
> > > 
> > > I have not followed the entire discussion. AFAICS this is intended to
> > > monitor status information on virtio level only, like virtqueue fill
> > > status, etc.
> > > 
> > > One thing that I am looking for is monitoring device specific information
> > > above virtio level, e.g. certain performance numbers or statistics that
> > > only make sense for the specific device. That would not fit into any of
> > > these commands, right?
> > > 
> > > Best regards,
> > > Christian Schoenebeck
> > 
> > Correct. These are just one-shot commands that dump information on virtio
> > devices (including vhosts), their virtqueues, and virtqueue elements as they
> > are at the time of the command.
> > 
> > Jonah
> 
> What I would find useful though on this virtio level: also being able to query
> the maximum and average fill state of the two ring buffers of each virtqueue.
> That would allow identifying performance bottlenecks.
> 
> Best regards,
> Christian Schoenebeck

Adding this to vhost would need some interface work though.
Also, collecting these stats isn't free or trivial,
so I imagine we would need commands to enable/disable data collection.

-- 
MST




Re: [PULL 0/1] Block patches

2021-12-06 Thread Richard Henderson

On 12/6/21 7:27 AM, Stefan Hajnoczi wrote:

The following changes since commit 99fc08366b06282614daeda989d2fde6ab8a707f:

   Merge tag 'seabios-20211203-pull-request' of git://git.kraxel.org/qemu into staging (2021-12-03 05:26:40 -0800)

are available in the Git repository at:

   https://gitlab.com/stefanha/qemu.git tags/block-pull-request

for you to fetch changes up to 5b807181c27a940a3a7ad1f221a2e76a132cbdc0:

   virtio-blk: Fix clean up of host notifiers for single MR transaction. (2021-12-06 14:21:14 +)


Pull request



Mark Mielke (1):
   virtio-blk: Fix clean up of host notifiers for single MR transaction.

  hw/block/dataplane/virtio-blk.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)


Applied, thanks.

r~




Re: [PATCH 3/3] iotests: check: multiprocessing support

2021-12-06 Thread John Snow
On Mon, Dec 6, 2021 at 3:20 PM Vladimir Sementsov-Ogievskiy <vsement...@virtuozzo.com> wrote:

> 06.12.2021 21:35, John Snow wrote:
> >
> >
> > On Fri, Dec 3, 2021 at 7:22 AM Vladimir Sementsov-Ogievskiy <vsement...@virtuozzo.com> wrote:
> >
> > Add -j  parameter, to run tests in several jobs simultaneously.
> > For realization - simply utilize multiprocessing.Pool class.
> >
> > Notes:
> >
> > 1. Of course, tests can't run simultaneously in same TEST_DIR. So,
> > use subdirectories TEST_DIR/testname/ and SOCK_DIR/testname/
> > instead of simply TEST_DIR and SOCK_DIR
> >
> >
> > SOCK_DIR was introduced because TEST_DIR was getting too long, and the
> length of the filepath was causing problems on some platforms. I hope that
> we aren't pushing our luck by making the directory longer here. Using the
> test name is nice because a human operator can quickly correlate
> directories to the tests that populated them, but if test names get kind of
> long, I wonder if we'll cause problems again.
> >
> > Just a stray thought.
> >
> > 2. multiprocessing.Pool.starmap function doesn't support passing
> > context managers, so we can't simply pass "self". Happily, we need
> > self only for read-only access, and it just works if it is defined
> > in global space. So, add a temporary link TestRunner.shared_self
> > during run_tests().
> >
> >
> > I'm a little confused on this point, see below
> >
> > Signed-off-by: Vladimir Sementsov-Ogievskiy <vsement...@virtuozzo.com>
> > ---
> >   tests/qemu-iotests/check |  4 +-
> >   tests/qemu-iotests/testrunner.py | 69
> 
> >   2 files changed, 64 insertions(+), 9 deletions(-)
> >
> > diff --git a/tests/qemu-iotests/check b/tests/qemu-iotests/check
> > index 43a4b694cc..0c27721a41 100755
> > --- a/tests/qemu-iotests/check
> > +++ b/tests/qemu-iotests/check
> > @@ -34,6 +34,8 @@ def make_argparser() -> argparse.ArgumentParser:
> >  help='show me, do not run tests')
> >   p.add_argument('-makecheck', action='store_true',
> >  help='pretty print output for make check')
> > +p.add_argument('-j', dest='jobs', type=int, default=1,
> > +   help='run tests in multiple parallel jobs')
> >
> >   p.add_argument('-d', dest='debug', action='store_true', help='debug')
> >   p.add_argument('-p', dest='print', action='store_true',
> > @@ -165,6 +167,6 @@ if __name__ == '__main__':
> >   with TestRunner(env, makecheck=args.makecheck,
> >   color=args.color) as tr:
> >   paths = [os.path.join(env.source_iotests, t) for t in tests]
> > -ok = tr.run_tests(paths)
> > +ok = tr.run_tests(paths, args.jobs)
> >   if not ok:
> >   sys.exit(1)
> >
> >
> > (OK)
> >
> > diff --git a/tests/qemu-iotests/testrunner.py b/tests/qemu-iotests/testrunner.py
> > index a9f2feb58c..0feaa396d0 100644
> > --- a/tests/qemu-iotests/testrunner.py
> > +++ b/tests/qemu-iotests/testrunner.py
> > @@ -26,6 +26,7 @@
> >   import json
> >   import termios
> >   import sys
> > +from multiprocessing import Pool
> >   from contextlib import contextmanager
> >   from typing import List, Optional, Iterator, Any, Sequence, Dict, \
> >   ContextManager
> > @@ -126,6 +127,31 @@ def __init__(self, status: str, description: str = '',
> >
> >
> >   class TestRunner(ContextManager['TestRunner']):
> > +shared_self = None
> >
> > +
> > +@staticmethod
> > +def proc_run_test(test: str, test_field_width: int) -> TestResult:
> > +# We are in a subprocess, we can't change the runner object!
> >
> >
> > *can't*, or shouldn't? I thought changing anything would just result in
> > the child process diverging, but nothing harmful overall. Am I mistaken?
>
> Yes you are right. "Shouldn't" is OK
>
> >
> > +runner = TestRunner.shared_self
> > +assert runner is not None
> > +return runner.run_test(test, test_field_width, mp=True)
> > +
> > +def run_tests_pool(self, tests: List[str],
> > +   test_field_width: int, jobs: int) -> List[TestResult]:
> > +
> > +# passing self directly to Pool.starmap() just doesn't work, because
> > +# it's a context manager.
> >
> >
> > Why, what happens? (Or what doesn't happen?)
> >
> > (I believe you that it doesn't work, but it's not immediately obvious to
> > me why.)
>
> Simple check:
>
> diff --git a/tests/qemu-iotests/testrunner.py b/tests/qemu-iotests/testrunner.py
> index 0feaa396d0..49c1780697 100644
> --- a/tests/qemu-iotests/testrunner.py
> 

Re: [PATCH 0/3] iotests: multiprocessing!!

2021-12-06 Thread Vladimir Sementsov-Ogievskiy

06.12.2021 21:37, John Snow wrote:



On Fri, Dec 3, 2021 at 7:22 AM Vladimir Sementsov-Ogievskiy <vsement...@virtuozzo.com> wrote:

Hi all!

Finally, I can not stand it any longer. So, I'm happy to present
multiprocessing support for iotests test runner.

testing on tmpfs:

Before:

time check -qcow2
...
real    12m28.095s
user    9m53.398s
sys     2m55.548s

After:

time check -qcow2 -j 12
...
real    2m12.136s
user    12m40.948s
sys     4m7.449s


VERY excellent. And this will probably flush a lot more bugs loose, too. (Which 
I consider a good thing!)


Thanks!)


We could look into utilizing it for 'make check', but we'll have to be prepared 
for a greater risk of race conditions on the CI if we do. But... it's seriously 
hard to argue with this kind of optimization, very well done!


I thought about this too. I think we can at least pass the -j flag of "make -j9
check" through to ./check.

I think CIs mostly call make check without the -j flag. But I always call "make
-j9 check", and it always upset me that all tests run in parallel except for
iotests. So if it is possible to detect that we are called through "make -j9
check" and use "-j 9" for iotests/check in that case, it would be good.
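
[Editor's note: detecting this on the ./check side could be as simple as scanning MAKEFLAGS for a -jN token. A rough sketch; real GNU make parallelism also uses jobserver options that this deliberately ignores, so treat it as illustrative only.]

```python
import os


def jobs_from_makeflags(makeflags: str) -> int:
    """Best-effort: extract make's -jN level from a MAKEFLAGS-style string.

    Returns 1 when no explicit numeric job count is found (including the
    bare '-j' unlimited form and jobserver-based sub-makes).
    """
    for word in makeflags.split():
        if word.startswith("-j") and word[2:].isdigit():
            return int(word[2:])
    return 1


# e.g. default the check -j value from the environment:
default_jobs = jobs_from_makeflags(os.environ.get("MAKEFLAGS", ""))
```

So "make -j9 check" could export its level and ./check would pick 9 jobs by default, while a plain invocation stays serial.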




Hmm, seems -j 6 should be enough. I have 6 cores, 2 threads per core.
Anyway, that's so fast!

Vladimir Sementsov-Ogievskiy (3):
   iotests/testrunner.py: add doc string for run_test()
   iotests/testrunner.py: move updating last_elapsed to run_tests
   iotests: check: multiprocessing support

  tests/qemu-iotests/check         |  4 +-
  tests/qemu-iotests/testrunner.py | 86 
  2 files changed, 80 insertions(+), 10 deletions(-)

-- 
2.31.1





--
Best regards,
Vladimir



Re: [PATCH 3/3] iotests: check: multiprocessing support

2021-12-06 Thread Vladimir Sementsov-Ogievskiy

06.12.2021 21:35, John Snow wrote:



On Fri, Dec 3, 2021 at 7:22 AM Vladimir Sementsov-Ogievskiy <vsement...@virtuozzo.com> wrote:

Add -j  parameter, to run tests in several jobs simultaneously.
For realization - simply utilize multiprocessing.Pool class.

Notes:

1. Of course, tests can't run simultaneously in same TEST_DIR. So,
    use subdirectories TEST_DIR/testname/ and SOCK_DIR/testname/
    instead of simply TEST_DIR and SOCK_DIR


SOCK_DIR was introduced because TEST_DIR was getting too long, and the length 
of the filepath was causing problems on some platforms. I hope that we aren't 
pushing our luck by making the directory longer here. Using the test name is 
nice because a human operator can quickly correlate directories to the tests 
that populated them, but if test names get kind of long, I wonder if we'll 
cause problems again.

Just a stray thought.

2. multiprocessing.Pool.starmap function doesn't support passing
    context managers, so we can't simply pass "self". Happily, we need
    self only for read-only access, and it just works if it is defined
    in global space. So, add a temporary link TestRunner.shared_self
    during run_tests().


I'm a little confused on this point, see below

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsement...@virtuozzo.com>
---
  tests/qemu-iotests/check         |  4 +-
  tests/qemu-iotests/testrunner.py | 69 
  2 files changed, 64 insertions(+), 9 deletions(-)

diff --git a/tests/qemu-iotests/check b/tests/qemu-iotests/check
index 43a4b694cc..0c27721a41 100755
--- a/tests/qemu-iotests/check
+++ b/tests/qemu-iotests/check
@@ -34,6 +34,8 @@ def make_argparser() -> argparse.ArgumentParser:
                     help='show me, do not run tests')
      p.add_argument('-makecheck', action='store_true',
                     help='pretty print output for make check')
+    p.add_argument('-j', dest='jobs', type=int, default=1,
+                   help='run tests in multiple parallel jobs')

      p.add_argument('-d', dest='debug', action='store_true', help='debug')
      p.add_argument('-p', dest='print', action='store_true',
@@ -165,6 +167,6 @@ if __name__ == '__main__':
          with TestRunner(env, makecheck=args.makecheck,
                          color=args.color) as tr:
              paths = [os.path.join(env.source_iotests, t) for t in tests]
-            ok = tr.run_tests(paths)
+            ok = tr.run_tests(paths, args.jobs )
              if not ok:
                  sys.exit(1)


(OK)

diff --git a/tests/qemu-iotests/testrunner.py b/tests/qemu-iotests/testrunner.py
index a9f2feb58c..0feaa396d0 100644
--- a/tests/qemu-iotests/testrunner.py
+++ b/tests/qemu-iotests/testrunner.py
@@ -26,6 +26,7 @@
  import json
  import termios
  import sys
+from multiprocessing import Pool
  from contextlib import contextmanager
  from typing import List, Optional, Iterator, Any, Sequence, Dict, \
          ContextManager
@@ -126,6 +127,31 @@ def __init__(self, status: str, description: str = '',


  class TestRunner(ContextManager['TestRunner']):
+    shared_self = None

+
+    @staticmethod
+    def proc_run_test(test: str, test_field_width: int) -> TestResult:
+        # We are in a subprocess, we can't change the runner object!


*can't*, or shouldn't? I thought changing anything would just result in the 
child process diverging, but nothing harmful overall. Am I mistaken?


Yes you are right. "Shouldn't" is OK



+        runner = TestRunner.shared_self
+        assert runner is not None
+        return runner.run_test(test, test_field_width, mp=True)
+
+    def run_tests_pool(self, tests: List[str],
+                       test_field_width: int, jobs: int) -> List[TestResult]:
+
+        # passing self directly to Pool.starmap() just doesn't work, because
+        # it's a context manager.


Why, what happens? (Or what doesn't happen?)

(I believe you that it doesn't work, but it's not immediately obvious to me 
why.)


Simple check:

diff --git a/tests/qemu-iotests/testrunner.py b/tests/qemu-iotests/testrunner.py
index 0feaa396d0..49c1780697 100644
--- a/tests/qemu-iotests/testrunner.py
+++ b/tests/qemu-iotests/testrunner.py
@@ -130,7 +130,7 @@ class TestRunner(ContextManager['TestRunner']):
 shared_self = None
 
 @staticmethod

-def proc_run_test(test: str, test_field_width: int) -> TestResult:
+def proc_run_test(x, test: str, test_field_width: int) -> TestResult:
 # We are in a subprocess, we can't change the runner object!
 runner = TestRunner.shared_self
 assert runner is not None
@@ -146,7 +146,7 @@ def run_tests_pool(self, tests: List[str],
 
 with Pool(jobs) as p:
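
[Editor's note: the shared_self trick under discussion looks like this in isolation. Names are illustrative, not QEMU's; it relies on the fork start method (the Linux default), so class state set before Pool() creation is inherited by the workers.]

```python
from multiprocessing import Pool


class Runner:
    """Illustrative stand-in for TestRunner."""

    shared_self = None  # set only for the duration of run_all()

    def __init__(self, prefix: str) -> None:
        self.prefix = prefix

    @staticmethod
    def proc_run(test: str) -> str:
        # Executed in a worker: reach the runner through the class
        # attribute instead of pickling the (context-manager) instance.
        runner = Runner.shared_self
        assert runner is not None
        return f"{runner.prefix}:{test}"

    def run_all(self, tests, jobs: int):
        assert Runner.shared_self is None
        Runner.shared_self = self  # inherited by fork()ed workers
        try:
            with Pool(jobs) as p:
                return p.map(Runner.proc_run, tests)
        finally:
            Runner.shared_self = None
```

Any mutation a worker makes to shared_self stays in that worker's copy of the process image, which is why the read-only restriction discussed above matters.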

 

Re: [PATCH 0/3] iotests: multiprocessing!!

2021-12-06 Thread John Snow
On Fri, Dec 3, 2021 at 7:22 AM Vladimir Sementsov-Ogievskiy <vsement...@virtuozzo.com> wrote:

> Hi all!
>
> Finally, I can not stand it any longer. So, I'm happy to present
> multiprocessing support for iotests test runner.
>
> testing on tmpfs:
>
> Before:
>
> time check -qcow2
> ...
> real12m28.095s
> user9m53.398s
> sys 2m55.548s
>
> After:
>
> time check -qcow2 -j 12
> ...
> real2m12.136s
> user12m40.948s
> sys 4m7.449s
>
>
VERY excellent. And this will probably flush a lot more bugs loose, too.
(Which I consider a good thing!)
We could look into utilizing it for 'make check', but we'll have to be
prepared for a greater risk of race conditions on the CI if we do. But...
it's seriously hard to argue with this kind of optimization, very well done!


>
> Hmm, seems -j 6 should be enough. I have 6 cores, 2 threads per core.
> Anyway, that's so fast!
>
> Vladimir Sementsov-Ogievskiy (3):
>   iotests/testrunner.py: add doc string for run_test()
>   iotests/testrunner.py: move updating last_elapsed to run_tests
>   iotests: check: multiprocessing support
>
>  tests/qemu-iotests/check |  4 +-
>  tests/qemu-iotests/testrunner.py | 86 
>  2 files changed, 80 insertions(+), 10 deletions(-)
>
> --
> 2.31.1
>
>


Re: [PATCH 3/3] iotests: check: multiprocessing support

2021-12-06 Thread John Snow
On Fri, Dec 3, 2021 at 7:22 AM Vladimir Sementsov-Ogievskiy <vsement...@virtuozzo.com> wrote:

> Add -j  parameter, to run tests in several jobs simultaneously.
> For realization - simply utilize multiprocessing.Pool class.
>
> Notes:
>
> 1. Of course, tests can't run simultaneously in same TEST_DIR. So,
>use subdirectories TEST_DIR/testname/ and SOCK_DIR/testname/
>instead of simply TEST_DIR and SOCK_DIR
>
>
SOCK_DIR was introduced because TEST_DIR was getting too long, and the
length of the filepath was causing problems on some platforms. I hope that
we aren't pushing our luck by making the directory longer here. Using the
test name is nice because a human operator can quickly correlate
directories to the tests that populated them, but if test names get kind of
long, I wonder if we'll cause problems again.

Just a stray thought.


> 2. multiprocessing.Pool.starmap function doesn't support passing
>context managers, so we can't simply pass "self". Happily, we need
>self only for read-only access, and it just works if it is defined
>in global space. So, add a temporary link TestRunner.shared_self
>during run_tests().
>

I'm a little confused on this point, see below


> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
>  tests/qemu-iotests/check |  4 +-
>  tests/qemu-iotests/testrunner.py | 69 
>  2 files changed, 64 insertions(+), 9 deletions(-)
>
> diff --git a/tests/qemu-iotests/check b/tests/qemu-iotests/check
> index 43a4b694cc..0c27721a41 100755
> --- a/tests/qemu-iotests/check
> +++ b/tests/qemu-iotests/check
> @@ -34,6 +34,8 @@ def make_argparser() -> argparse.ArgumentParser:
> help='show me, do not run tests')
>  p.add_argument('-makecheck', action='store_true',
> help='pretty print output for make check')
> +p.add_argument('-j', dest='jobs', type=int, default=1,
> +   help='run tests in multiple parallel jobs')
>
>  p.add_argument('-d', dest='debug', action='store_true', help='debug')
>  p.add_argument('-p', dest='print', action='store_true',
> @@ -165,6 +167,6 @@ if __name__ == '__main__':
>  with TestRunner(env, makecheck=args.makecheck,
>  color=args.color) as tr:
>  paths = [os.path.join(env.source_iotests, t) for t in tests]
> -ok = tr.run_tests(paths)
> +ok = tr.run_tests(paths, args.jobs)
>  if not ok:
>  sys.exit(1)
>

(OK)


> diff --git a/tests/qemu-iotests/testrunner.py
> b/tests/qemu-iotests/testrunner.py
> index a9f2feb58c..0feaa396d0 100644
> --- a/tests/qemu-iotests/testrunner.py
> +++ b/tests/qemu-iotests/testrunner.py
> @@ -26,6 +26,7 @@
>  import json
>  import termios
>  import sys
> +from multiprocessing import Pool
>  from contextlib import contextmanager
>  from typing import List, Optional, Iterator, Any, Sequence, Dict, \
>  ContextManager
> @@ -126,6 +127,31 @@ def __init__(self, status: str, description: str = '',
>
>
>  class TestRunner(ContextManager['TestRunner']):
> +shared_self = None
> +
> +@staticmethod
> +def proc_run_test(test: str, test_field_width: int) -> TestResult:
> +# We are in a subprocess, we can't change the runner object!
>

*can't*, or shouldn't? I thought changing anything would just result in the
child process diverging, but nothing harmful overall. Am I mistaken?


> +runner = TestRunner.shared_self
> +assert runner is not None
> +return runner.run_test(test, test_field_width, mp=True)
> +
> +def run_tests_pool(self, tests: List[str],
> +   test_field_width: int, jobs: int) -> List[TestResult]:
> +
> +# passing self directly to Pool.starmap() just doesn't work, because
> +# it's a context manager.
>

Why, what happens? (Or what doesn't happen?)

(I believe you that it doesn't work, but it's not immediately obvious to me
why.)
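For what it's worth, the class-attribute workaround can be reproduced outside of the iotests harness. The sketch below uses invented names and assumes a fork-based multiprocessing start method (the default on Linux), where child processes inherit the class attribute set by the parent; with the spawn start method the attribute would come back as None in the children:

```python
from multiprocessing import Pool


class Runner:
    # Set in the parent just before the Pool forks; children inherit it.
    shared_self = None

    @staticmethod
    def _run_one(job):
        # Runs in a child process: fetch the runner via the class
        # attribute instead of pickling the runner object itself.
        runner = Runner.shared_self
        assert runner is not None
        return job * 2  # stand-in for runner.run_test(job)

    def run_all(self, jobs, nproc=2):
        assert Runner.shared_self is None
        Runner.shared_self = self
        try:
            with Pool(nproc) as p:
                return p.map(Runner._run_one, jobs)
        finally:
            Runner.shared_self = None


if __name__ == '__main__':
    print(Runner().run_all([1, 2, 3]))
```

The same shape appears in the patch as TestRunner.shared_self and proc_run_test().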


> +assert TestRunner.shared_self is None
> +TestRunner.shared_self = self
> +
> +with Pool(jobs) as p:
> +results = p.starmap(self.proc_run_test,
> +zip(tests, [test_field_width] * len(tests)))
> +
> +TestRunner.shared_self = None
> +
> +return results
> +
>  def __init__(self, env: TestEnv, makecheck: bool = False,
>   color: str = 'auto') -> None:
>  self.env = env
> @@ -219,11 +245,16 @@ def find_reference(self, test: str) -> str:
>
>  return f'{test}.out'
>
> -def do_run_test(self, test: str) -> TestResult:
> +def do_run_test(self, test: str, mp: bool) -> TestResult:
>  """
>  Run one test
>
>  :param test: test file path
> +:param mp: if true, we are in a multiprocessing environment, use
> +   personal subdirectories for test run
> +
> +Note: this method may be called from subprocess, so it does not
> +

Re: [PATCH v10 0/8] hmp, qmp: Add commands to introspect virtio devices

2021-12-06 Thread Christian Schoenebeck via
On Montag, 6. Dezember 2021 17:09:45 CET Jonah Palmer wrote:
> On 12/6/21 08:50, Christian Schoenebeck wrote:
> > On Montag, 6. Dezember 2021 13:43:18 CET Jonah Palmer wrote:
> >> This series introduces new QMP/HMP commands to dump the status of a
> >> virtio device at different levels.
> >> 
> >> [Jonah: Rebasing previous patchset from Nov. 10 (v9). Original patches
> >> 
> >>   are by Laurent Vivier from May 2020.
> >>   
> >>   Rebase from v9 to v10 includes reformatting virtio.json examples and
> >>   command structures for better consistency. Also removed all enums from
> >>   virtio.json and replaced their purpose with string literals.
> >>   
> >>   Removed @ndescs from VirtioQueueElement, as the number of descriptors
> >>   can be inferred from the length of the @descs chain.
> >>   
> >>   Lastly, removed the examples in hmp-commands-info.hx to fix
> >>   'inconsistent
> >>   literal block quoting' warning from Sphinx.]
> > 
> > I have not followed the entire discussion. AFAICS this is intended to
> > monitor status information on virtio level only, like virtqueue fill
> > status, etc.
> > 
> > One thing that I am looking for is monitoring device specific information
> > above virtio level, e.g. certain performance numbers or statistics that
> > only make sense for the specific device. That would not fit into any of
> > these commands, right?
> > 
> > Best regards,
> > Christian Schoenebeck
> 
> Correct. These are just one-shot commands that dump information on virtio
> devices (including vhosts), their virtqueues, and virtqueue elements as they
> are at the time of the command.
> 
> Jonah

What I would find useful at this virtio level, though: also being able to query
the maximum and average fill state of the two ring buffers of each virtqueue.
That would make it possible to identify performance bottlenecks.

Best regards,
Christian Schoenebeck





Re: [PATCH 2/3] iotests/testrunner.py: move updating last_elapsed to run_tests

2021-12-06 Thread John Snow
On Fri, Dec 3, 2021 at 7:22 AM Vladimir Sementsov-Ogievskiy <
vsement...@virtuozzo.com> wrote:

> We are going to use do_run_test() in multiprocessing environment, where
> we'll not be able to change original runner object.
>
> Happily, the only thing we change is last_elapsed, and it's simple
> to update it in run_tests() instead. All other accesses to self in
> do_run_test() and in run_test() are read-only.
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
>  tests/qemu-iotests/testrunner.py | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/tests/qemu-iotests/testrunner.py
> b/tests/qemu-iotests/testrunner.py
> index fa842252d3..a9f2feb58c 100644
> --- a/tests/qemu-iotests/testrunner.py
> +++ b/tests/qemu-iotests/testrunner.py
> @@ -287,7 +287,6 @@ def do_run_test(self, test: str) -> TestResult:
>diff=diff, casenotrun=casenotrun)
>  else:
>  f_bad.unlink()
> -self.last_elapsed.update(test, elapsed)
>  return TestResult(status='pass', elapsed=elapsed,
>casenotrun=casenotrun)
>
> @@ -353,6 +352,9 @@ def run_tests(self, tests: List[str]) -> bool:
>  print('\n'.join(res.diff))
>  elif res.status == 'not run':
>  notrun.append(name)
> +elif res.status == 'pass':
> +assert res.elapsed is not None
> +self.last_elapsed.update(t, res.elapsed)
>
>  sys.stdout.flush()
>  if res.interrupted:
> --
> 2.31.1
>
>
(I continue to be annoyed by the "None" problem in mypy, but I suppose it
really can't be helped. Nothing for you to change with this patch or
series. I just wish we didn't need so many assertions ...)

Reviewed-by: John Snow 


Re: [PATCH 1/3] iotests/testrunner.py: add doc string for run_test()

2021-12-06 Thread John Snow
On Fri, Dec 3, 2021 at 7:22 AM Vladimir Sementsov-Ogievskiy <
vsement...@virtuozzo.com> wrote:

> We are going to modify these methods and will add more documentation in
> further commit. As a preparation add basic documentation.
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
>  tests/qemu-iotests/testrunner.py | 13 +
>  1 file changed, 13 insertions(+)
>
> diff --git a/tests/qemu-iotests/testrunner.py
> b/tests/qemu-iotests/testrunner.py
> index 0e29c2fddd..fa842252d3 100644
> --- a/tests/qemu-iotests/testrunner.py
> +++ b/tests/qemu-iotests/testrunner.py
> @@ -220,6 +220,12 @@ def find_reference(self, test: str) -> str:
>  return f'{test}.out'
>
>  def do_run_test(self, test: str) -> TestResult:
> +"""
> +Run one test
> +
> +:param test: test file path
> +"""
> +
>  f_test = Path(test)
>  f_bad = Path(f_test.name + '.out.bad')
>  f_notrun = Path(f_test.name + '.notrun')
> @@ -287,6 +293,13 @@ def do_run_test(self, test: str) -> TestResult:
>
>  def run_test(self, test: str,
>   test_field_width: Optional[int] = None) -> TestResult:
> +"""
> +Run one test and print short status
> +
> +:param test: test file path
> +:param test_field_width: width for first field of status format
> +"""
> +
>  last_el = self.last_elapsed.get(test)
>  start = datetime.datetime.now().strftime('%H:%M:%S')
>
> --
> 2.31.1
>
>
Reviewed-by: John Snow 


Re: [PATCH v10 0/8] hmp,qmp: Add commands to introspect virtio devices

2021-12-06 Thread Jonah Palmer


On 12/6/21 08:50, Christian Schoenebeck wrote:

On Montag, 6. Dezember 2021 13:43:18 CET Jonah Palmer wrote:

This series introduces new QMP/HMP commands to dump the status of a
virtio device at different levels.

[Jonah: Rebasing previous patchset from Nov. 10 (v9). Original patches
  are by Laurent Vivier from May 2020.

  Rebase from v9 to v10 includes reformatting virtio.json examples and
  command structures for better consistency. Also removed all enums from
  virtio.json and replaced their purpose with string literals.

  Removed @ndescs from VirtioQueueElement, as the number of descriptors
  can be inferred from the length of the @descs chain.

  Lastly, removed the examples in hmp-commands-info.hx to fix 'inconsistent
  literal block quoting' warning from Sphinx.]

I have not followed the entire discussion. AFAICS this is intended to monitor
status information on virtio level only, like virtqueue fill status, etc.

One thing that I am looking for is monitoring device specific information
above virtio level, e.g. certain performance numbers or statistics that only
make sense for the specific device. That would not fit into any of these
commands, right?

Best regards,
Christian Schoenebeck


Correct. These are just one-shot commands that dump information on virtio
devices (including vhosts), their virtqueues, and virtqueue elements as they
are at the time of the command.

Jonah





Re: [PATCH v2 00/13] Eliminate drive_get_next()

2021-12-06 Thread Markus Armbruster
Markus Armbruster  writes:

> drive_get_next() is basically a bad idea.  It returns the "next" block
> backend of a certain interface type.  "Next" means bus=0,unit=N, where
> subsequent calls count N up from zero, per interface type.
>
> This lets you define unit numbers implicitly by execution order.  If
> the order changes, or new calls appear "in the middle", unit numbers
> change.  ABI break.  Hard to spot in review.  Replace its uses by
> drive_get(), then delete it.

Queued for 7.0.




[PULL 1/1] virtio-blk: Fix clean up of host notifiers for single MR transaction.

2021-12-06 Thread Stefan Hajnoczi
From: Mark Mielke 

The commit "virtio-blk: Configure all host notifiers in a single MR
transaction" introduced a second loop variable to perform cleanup in a
second loop, but mistakenly still referred to the first loop variable
within the second loop body.

Fixes: d0267da61489 ("virtio-blk: Configure all host notifiers in a single MR transaction")
Signed-off-by: Mark Mielke 
Message-id: CALm7yL08qarOu0dnQkTN+pa=bsrc92g31ypqqndeait4ylz...@mail.gmail.com
Signed-off-by: Stefan Hajnoczi 
---
 hw/block/dataplane/virtio-blk.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c
index 252c3a7a23..ee5a5352dc 100644
--- a/hw/block/dataplane/virtio-blk.c
+++ b/hw/block/dataplane/virtio-blk.c
@@ -222,7 +222,7 @@ int virtio_blk_data_plane_start(VirtIODevice *vdev)
 memory_region_transaction_commit();
 
 while (j--) {
-virtio_bus_cleanup_host_notifier(VIRTIO_BUS(qbus), i);
+virtio_bus_cleanup_host_notifier(VIRTIO_BUS(qbus), j);
 }
 goto fail_host_notifiers;
 }
-- 
2.33.1




[PULL 0/1] Block patches

2021-12-06 Thread Stefan Hajnoczi
The following changes since commit 99fc08366b06282614daeda989d2fde6ab8a707f:

  Merge tag 'seabios-20211203-pull-request' of git://git.kraxel.org/qemu into 
staging (2021-12-03 05:26:40 -0800)

are available in the Git repository at:

  https://gitlab.com/stefanha/qemu.git tags/block-pull-request

for you to fetch changes up to 5b807181c27a940a3a7ad1f221a2e76a132cbdc0:

  virtio-blk: Fix clean up of host notifiers for single MR transaction. 
(2021-12-06 14:21:14 +)


Pull request



Mark Mielke (1):
  virtio-blk: Fix clean up of host notifiers for single MR transaction.

 hw/block/dataplane/virtio-blk.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

-- 
2.33.1





[PATCH for-7.0] tests/qemu-iotests: Fix 051 for binaries without 'lsi53c895a'

2021-12-06 Thread Thomas Huth
The lsi53c895a SCSI adaptor might not be enabled in each and every
x86 QEMU binary, e.g. it's disabled in the RHEL/CentOS build.
Thus let's add a check to the 051 test so that it does not fail if
this device is not available.

Signed-off-by: Thomas Huth 
---
 tests/qemu-iotests/051 | 4 
 1 file changed, 4 insertions(+)

diff --git a/tests/qemu-iotests/051 b/tests/qemu-iotests/051
index 1d2fa93a11..e9042a6214 100755
--- a/tests/qemu-iotests/051
+++ b/tests/qemu-iotests/051
@@ -45,6 +45,10 @@ _supported_proto file
 _unsupported_imgopts 'refcount_bits=\([^1]\|.\([^6]\|$\)\)' data_file
 _require_drivers nbd
 
+if [ "$QEMU_DEFAULT_MACHINE" = "pc" ]; then
+_require_devices lsi53c895a
+fi
+
 do_run_qemu()
 {
 echo Testing: "$@"
-- 
2.27.0




Re: [RFC v3 0/4] tls: add macros for coroutine-safe TLS variables

2021-12-06 Thread Peter Maydell
On Mon, 6 Dec 2021 at 14:33, Stefan Hajnoczi  wrote:
>
> v3:
> - Added __attribute__((weak)) to get_ptr_*() [Florian]

Do we really need it *only* on get_ptr_*() ? If we need to
noinline the other two we probably also should use the same
attribute weak to force no optimizations at all.

-- PMM



[RFC v3 4/4] cpus: use coroutine TLS macros for iothread_locked

2021-12-06 Thread Stefan Hajnoczi
qemu_mutex_iothread_locked() may be used from coroutines. Standard
__thread variables cannot be used by coroutines. Use the coroutine TLS
macros instead.

Signed-off-by: Stefan Hajnoczi 
---
 softmmu/cpus.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/softmmu/cpus.c b/softmmu/cpus.c
index 071085f840..b02d3211e1 100644
--- a/softmmu/cpus.c
+++ b/softmmu/cpus.c
@@ -474,11 +474,11 @@ bool qemu_in_vcpu_thread(void)
 return current_cpu && qemu_cpu_is_self(current_cpu);
 }
 
-static __thread bool iothread_locked = false;
+QEMU_DEFINE_STATIC_CO_TLS(bool, iothread_locked)
 
 bool qemu_mutex_iothread_locked(void)
 {
-return iothread_locked;
+return get_iothread_locked();
 }
 
 /*
@@ -491,13 +491,13 @@ void qemu_mutex_lock_iothread_impl(const char *file, int line)
 
 g_assert(!qemu_mutex_iothread_locked());
 bql_lock(&qemu_global_mutex, file, line);
-iothread_locked = true;
+set_iothread_locked(true);
 }
 
 void qemu_mutex_unlock_iothread(void)
 {
 g_assert(qemu_mutex_iothread_locked());
-iothread_locked = false;
+set_iothread_locked(false);
 qemu_mutex_unlock(&qemu_global_mutex);
 }
 
-- 
2.33.1




[RFC v3 1/4] tls: add macros for coroutine-safe TLS variables

2021-12-06 Thread Stefan Hajnoczi
Compiler optimizations can cache TLS values across coroutine yield
points, resulting in stale values from the previous thread when a
coroutine is re-entered by a new thread.

Serge Guelton developed an __attribute__((noinline)) wrapper and tested
it with clang and gcc. I formatted his idea according to QEMU's coding
style and wrote documentation.

Richard Henderson developed an alternative approach that can be inlined
by the compiler. This is included for architectures where we have inline
assembly that determines the address of a TLS variable.

These macros must be used instead of __thread from now on to prevent
coroutine TLS bugs. Here is an x86_64 TLS variable access before this patch:

  mov %fs:-0x19c,%edx

And here is the same access using Richard's approach:

  rdfsbase %rax # %fs contains the base address
  lea -0x1a8(%rax),%rax # -0x1a8 is the offset of our variable
  mov 0xc(%rax),%edx    # here we access the TLS variable via %rax

Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1952483
Suggested-by: Serge Guelton 
Suggested-by: Richard Henderson 
Signed-off-by: Stefan Hajnoczi 
---
Richard's suggested code used a MOV instruction on x86_64 but we need
LEA semantics. LEA doesn't support %fs so I switched to RDFSBASE+LEA.
Otherwise Richard's approach is unchanged.
---
 include/qemu/coroutine-tls.h | 205 +++
 1 file changed, 205 insertions(+)
 create mode 100644 include/qemu/coroutine-tls.h

diff --git a/include/qemu/coroutine-tls.h b/include/qemu/coroutine-tls.h
new file mode 100644
index 00..b87c057243
--- /dev/null
+++ b/include/qemu/coroutine-tls.h
@@ -0,0 +1,205 @@
+/*
+ * QEMU Thread Local Storage for coroutines
+ *
+ * Copyright Red Hat
+ *
+ * SPDX-License-Identifier: LGPL-2.1-or-later
+ *
+ * This work is licensed under the terms of the GNU LGPL, version 2.1 or later.
+ * See the COPYING.LIB file in the top-level directory.
+ *
+ * It is forbidden to access Thread Local Storage in coroutines because
+ * compiler optimizations may cause values to be cached across coroutine
+ * re-entry. Coroutines can run in more than one thread through the course of
+ * their life, leading to bugs when stale TLS values from the wrong thread are
+ * used as a result of compiler optimization.
+ *
+ * An example is:
+ *
+ * .. code-block:: c
+ *   :caption: A coroutine that may see the wrong TLS value
+ *
+ *   static __thread AioContext *current_aio_context;
+ *   ...
+ *   static void coroutine_fn foo(void)
+ *   {
+ *   aio_notify(current_aio_context);
+ *   qemu_coroutine_yield();
+ *   aio_notify(current_aio_context); // <-- may be stale after yielding!
+ *   }
+ *
+ * This header provides macros for safely defining variables in Thread Local
+ * Storage:
+ *
+ * .. code-block:: c
+ *   :caption: A coroutine that safely uses TLS
+ *
+ *   QEMU_DEFINE_STATIC_CO_TLS(AioContext *, current_aio_context)
+ *   ...
+ *   static void coroutine_fn foo(void)
+ *   {
+ *   aio_notify(get_current_aio_context());
+ *   qemu_coroutine_yield();
+ *   aio_notify(get_current_aio_context()); // <-- safe
+ *   }
+ */
+
+#ifndef QEMU_COROUTINE_TLS_H
+#define QEMU_COROUTINE_TLS_H
+
+/*
+ * Two techniques are available to stop the compiler from caching TLS values:
+ * 1. Accessor functions with __attribute__((noinline)). This is
+ *architecture-independent but prevents inlining optimizations.
+ * 2. TLS address-of implemented as asm volatile so it can be inlined safely.
+ *This enables inlining optimizations but requires architecture-specific
+ *inline assembly.
+ */
+#if defined(__aarch64__)
+#define QEMU_CO_TLS_ADDR(ret, var)   \
+asm volatile("mrs %0, tpidr_el0\n\t" \
+ "add %0, %0, #:tprel_hi12:"#var", lsl #12\n\t"  \
+ "add %0, %0, #:tprel_lo12_nc:"#var  \
+ : "=r"(ret))
+#elif defined(__powerpc64__)
+#define QEMU_CO_TLS_ADDR(ret, var)   \
+asm volatile("addis %0,13,"#var"@tprel@ha\n\t"   \
+ "add   %0,%0,"#var"@tprel@l"\
+ : "=r"(ret))
+#elif defined(__riscv)
+#define QEMU_CO_TLS_ADDR(ret, var)   \
+asm volatile("lui  %0,%%tprel_hi("#var")\n\t"\
+ "add  %0,%0,%%tprel_add("#var")\n\t"\
+ "addi %0,%0,%%tprel_lo("#var")" \
+ : "=r"(ret))
+#elif defined(__x86_64__)
+#define QEMU_CO_TLS_ADDR(ret, var)   \
+asm volatile("movq %%fs:0, %0\n\t"   \
+ "lea "#var"@tpoff(%0), %0" : "=r"(ret))
+#endif
+
+/**
+ * QEMU_DECLARE_CO_TLS:
+ * @type: the variable's C type
+ * @var: the 

[RFC v3 0/4] tls: add macros for coroutine-safe TLS variables

2021-12-06 Thread Stefan Hajnoczi
v3:
- Added __attribute__((weak)) to get_ptr_*() [Florian]
- Replace rdfsbase with mov %%fs:0,%0 [Florian]

This patch series solves the coroutines TLS problem. Coroutines re-entered from
another thread sometimes see stale TLS values. This happens because compilers
may cache values across yield points, so a value from the previous thread will
be used when the coroutine is re-entered in another thread.

Serge Guelton developed a portable technique and Richard Henderson developed an
inline-friendly architecture-specific technique, see the first patch for
details.

I have audited all __thread variables in QEMU and converted those that can be
used from coroutines. Most actually look safe to me. This patch does not
include a checkpatch.pl rule that requires coroutine-tls.h usage, but perhaps
we should add one for block/ at least?

Stefan Hajnoczi (4):
  tls: add macros for coroutine-safe TLS variables
  util/async: replace __thread with QEMU TLS macros
  rcu: use coroutine TLS macros
  cpus: use coroutine TLS macros for iothread_locked

 include/qemu/coroutine-tls.h | 205 +++
 include/qemu/rcu.h   |   7 +-
 softmmu/cpus.c   |   8 +-
 tests/unit/rcutorture.c  |  10 +-
 tests/unit/test-rcu-list.c   |   4 +-
 util/async.c |  12 +-
 util/rcu.c   |  10 +-
 7 files changed, 232 insertions(+), 24 deletions(-)
 create mode 100644 include/qemu/coroutine-tls.h

-- 
2.33.1





[RFC v3 2/4] util/async: replace __thread with QEMU TLS macros

2021-12-06 Thread Stefan Hajnoczi
QEMU TLS macros must be used to make TLS variables safe with coroutines.

Signed-off-by: Stefan Hajnoczi 
---
 util/async.c | 12 +++-
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/util/async.c b/util/async.c
index 6f6717a34b..ddd9f24419 100644
--- a/util/async.c
+++ b/util/async.c
@@ -32,6 +32,7 @@
 #include "qemu/rcu_queue.h"
 #include "block/raw-aio.h"
 #include "qemu/coroutine_int.h"
+#include "qemu/coroutine-tls.h"
 #include "trace.h"
 
 /***/
@@ -669,12 +670,13 @@ void aio_context_release(AioContext *ctx)
 qemu_rec_mutex_unlock(&ctx->lock);
 }
 
-static __thread AioContext *my_aiocontext;
+QEMU_DEFINE_STATIC_CO_TLS(AioContext *, my_aiocontext)
 
 AioContext *qemu_get_current_aio_context(void)
 {
-if (my_aiocontext) {
-return my_aiocontext;
+AioContext *ctx = get_my_aiocontext();
+if (ctx) {
+return ctx;
 }
 if (qemu_mutex_iothread_locked()) {
 /* Possibly in a vCPU thread.  */
@@ -685,6 +687,6 @@ AioContext *qemu_get_current_aio_context(void)
 
 void qemu_set_current_aio_context(AioContext *ctx)
 {
-assert(!my_aiocontext);
-my_aiocontext = ctx;
+assert(!get_my_aiocontext());
+set_my_aiocontext(ctx);
 }
-- 
2.33.1




[RFC v3 3/4] rcu: use coroutine TLS macros

2021-12-06 Thread Stefan Hajnoczi
RCU may be used from coroutines. Standard __thread variables cannot be
used by coroutines. Use the coroutine TLS macros instead.

Signed-off-by: Stefan Hajnoczi 
---
 include/qemu/rcu.h |  7 ---
 tests/unit/rcutorture.c| 10 +-
 tests/unit/test-rcu-list.c |  4 ++--
 util/rcu.c | 10 +-
 4 files changed, 16 insertions(+), 15 deletions(-)

diff --git a/include/qemu/rcu.h b/include/qemu/rcu.h
index e69efbd47f..b063c6fde8 100644
--- a/include/qemu/rcu.h
+++ b/include/qemu/rcu.h
@@ -29,6 +29,7 @@
 #include "qemu/atomic.h"
 #include "qemu/notify.h"
 #include "qemu/sys_membarrier.h"
+#include "qemu/coroutine-tls.h"
 
 #ifdef __cplusplus
 extern "C" {
@@ -76,11 +77,11 @@ struct rcu_reader_data {
 NotifierList force_rcu;
 };
 
-extern __thread struct rcu_reader_data rcu_reader;
+QEMU_DECLARE_CO_TLS(struct rcu_reader_data, rcu_reader)
 
 static inline void rcu_read_lock(void)
 {
-struct rcu_reader_data *p_rcu_reader = &rcu_reader;
+struct rcu_reader_data *p_rcu_reader = get_ptr_rcu_reader();
 unsigned ctr;
 
 if (p_rcu_reader->depth++ > 0) {
@@ -96,7 +97,7 @@ static inline void rcu_read_lock(void)
 
 static inline void rcu_read_unlock(void)
 {
-struct rcu_reader_data *p_rcu_reader = &rcu_reader;
+struct rcu_reader_data *p_rcu_reader = get_ptr_rcu_reader();
 
 assert(p_rcu_reader->depth != 0);
 if (--p_rcu_reader->depth > 0) {
diff --git a/tests/unit/rcutorture.c b/tests/unit/rcutorture.c
index de6f649058..495a4e6f42 100644
--- a/tests/unit/rcutorture.c
+++ b/tests/unit/rcutorture.c
@@ -122,7 +122,7 @@ static void *rcu_read_perf_test(void *arg)
 
 rcu_register_thread();
 
-*(struct rcu_reader_data **)arg = &rcu_reader;
+*(struct rcu_reader_data **)arg = get_ptr_rcu_reader();
 qatomic_inc();
 while (goflag == GOFLAG_INIT) {
 g_usleep(1000);
@@ -148,7 +148,7 @@ static void *rcu_update_perf_test(void *arg)
 
 rcu_register_thread();
 
-*(struct rcu_reader_data **)arg = &rcu_reader;
+*(struct rcu_reader_data **)arg = get_ptr_rcu_reader();
 qatomic_inc();
 while (goflag == GOFLAG_INIT) {
 g_usleep(1000);
@@ -253,7 +253,7 @@ static void *rcu_read_stress_test(void *arg)
 
 rcu_register_thread();
 
-*(struct rcu_reader_data **)arg = &rcu_reader;
+*(struct rcu_reader_data **)arg = get_ptr_rcu_reader();
 while (goflag == GOFLAG_INIT) {
 g_usleep(1000);
 }
@@ -304,7 +304,7 @@ static void *rcu_update_stress_test(void *arg)
 struct rcu_stress *cp = qatomic_read(&rcu_stress_current);
 
 rcu_register_thread();
-*(struct rcu_reader_data **)arg = &rcu_reader;
+*(struct rcu_reader_data **)arg = get_ptr_rcu_reader();
 
 while (goflag == GOFLAG_INIT) {
 g_usleep(1000);
@@ -347,7 +347,7 @@ static void *rcu_fake_update_stress_test(void *arg)
 {
 rcu_register_thread();
 
-*(struct rcu_reader_data **)arg = &rcu_reader;
+*(struct rcu_reader_data **)arg = get_ptr_rcu_reader();
 while (goflag == GOFLAG_INIT) {
 g_usleep(1000);
 }
diff --git a/tests/unit/test-rcu-list.c b/tests/unit/test-rcu-list.c
index 49641e1936..64b81ae058 100644
--- a/tests/unit/test-rcu-list.c
+++ b/tests/unit/test-rcu-list.c
@@ -171,7 +171,7 @@ static void *rcu_q_reader(void *arg)
 
 rcu_register_thread();
 
-*(struct rcu_reader_data **)arg = &rcu_reader;
+*(struct rcu_reader_data **)arg = get_ptr_rcu_reader();
 qatomic_inc();
 while (qatomic_read(&goflag) == GOFLAG_INIT) {
 g_usleep(1000);
@@ -206,7 +206,7 @@ static void *rcu_q_updater(void *arg)
 long long n_removed_local = 0;
 struct list_element *el, *prev_el;
 
-*(struct rcu_reader_data **)arg = &rcu_reader;
+*(struct rcu_reader_data **)arg = get_ptr_rcu_reader();
 qatomic_inc();
 while (qatomic_read(&goflag) == GOFLAG_INIT) {
 g_usleep(1000);
diff --git a/util/rcu.c b/util/rcu.c
index c91da9f137..b6d6c71cff 100644
--- a/util/rcu.c
+++ b/util/rcu.c
@@ -65,7 +65,7 @@ static inline int rcu_gp_ongoing(unsigned long *ctr)
 /* Written to only by each individual reader. Read by both the reader and the
  * writers.
  */
-__thread struct rcu_reader_data rcu_reader;
+QEMU_DEFINE_CO_TLS(struct rcu_reader_data, rcu_reader)
 
 /* Protected by rcu_registry_lock.  */
 typedef QLIST_HEAD(, rcu_reader_data) ThreadList;
@@ -355,23 +355,23 @@ void drain_call_rcu(void)
 
 void rcu_register_thread(void)
 {
-assert(rcu_reader.ctr == 0);
+assert(get_ptr_rcu_reader()->ctr == 0);
 qemu_mutex_lock(&rcu_registry_lock);
-QLIST_INSERT_HEAD(&registry, &rcu_reader, node);
+QLIST_INSERT_HEAD(&registry, get_ptr_rcu_reader(), node);
 qemu_mutex_unlock(&rcu_registry_lock);
 }
 
 void rcu_unregister_thread(void)
 {
 qemu_mutex_lock(&rcu_registry_lock);
-QLIST_REMOVE(&rcu_reader, node);
+QLIST_REMOVE(get_ptr_rcu_reader(), node);
 qemu_mutex_unlock(&rcu_registry_lock);
 }
 
 void rcu_add_force_rcu_notifier(Notifier *n)
 {
 qemu_mutex_lock(&rcu_registry_lock);
-notifier_list_add(&rcu_reader.force_rcu, n);
+

Re: [PATCH v10 1/8] virtio: drop name parameter for virtio_init()

2021-12-06 Thread Christian Schoenebeck
On Montag, 6. Dezember 2021 13:43:19 CET Jonah Palmer wrote:
> This patch drops the name parameter for the virtio_init function.
> 
> The pair between the numeric device ID and the string device ID
> (name) of a virtio device already exists, but not in a way that
> lets us map between them.
> 
> This patch lets us do this and removes the need for the name
> parameter in the virtio_init function.
> 
> Signed-off-by: Jonah Palmer 
> ---

Acked-by: Christian Schoenebeck 

>  hw/9pfs/virtio-9p-device.c  |  2 +-
>  hw/block/vhost-user-blk.c   |  2 +-
>  hw/block/virtio-blk.c   |  2 +-
>  hw/char/virtio-serial-bus.c |  3 +-
>  hw/display/virtio-gpu-base.c|  2 +-
>  hw/input/virtio-input.c |  3 +-
>  hw/net/virtio-net.c |  2 +-
>  hw/scsi/virtio-scsi.c   |  3 +-
>  hw/virtio/vhost-user-fs.c   |  3 +-
>  hw/virtio/vhost-user-i2c.c  |  7 +
>  hw/virtio/vhost-user-rng.c  |  2 +-
>  hw/virtio/vhost-user-vsock.c|  2 +-
>  hw/virtio/vhost-vsock-common.c  |  5 ++--
>  hw/virtio/vhost-vsock.c |  2 +-
>  hw/virtio/virtio-balloon.c  |  3 +-
>  hw/virtio/virtio-crypto.c   |  2 +-
>  hw/virtio/virtio-iommu.c|  3 +-
>  hw/virtio/virtio-mem.c  |  3 +-
>  hw/virtio/virtio-pmem.c |  3 +-
>  hw/virtio/virtio-rng.c  |  2 +-
>  hw/virtio/virtio.c  | 44 +++--
>  include/hw/virtio/vhost-vsock-common.h  |  2 +-
>  include/hw/virtio/virtio-gpu.h  |  3 +-
>  include/hw/virtio/virtio.h  |  4 +--
>  include/standard-headers/linux/virtio_ids.h |  1 +
>  25 files changed, 67 insertions(+), 43 deletions(-)
> 
> diff --git a/hw/9pfs/virtio-9p-device.c b/hw/9pfs/virtio-9p-device.c
> index 54ee93b..5f522e6 100644
> --- a/hw/9pfs/virtio-9p-device.c
> +++ b/hw/9pfs/virtio-9p-device.c
> @@ -216,7 +216,7 @@ static void virtio_9p_device_realize(DeviceState *dev, Error **errp)
>  }
> 
>  v->config_size = sizeof(struct virtio_9p_config) + strlen(s->fsconf.tag);
> -virtio_init(vdev, "virtio-9p", VIRTIO_ID_9P, v->config_size);
> +virtio_init(vdev, VIRTIO_ID_9P, v->config_size);
>  v->vq = virtio_add_queue(vdev, MAX_REQ, handle_9p_output);
>  }
> 
> diff --git a/hw/block/vhost-user-blk.c b/hw/block/vhost-user-blk.c
> index ba13cb8..f61f8c1 100644
> --- a/hw/block/vhost-user-blk.c
> +++ b/hw/block/vhost-user-blk.c
> @@ -490,7 +490,7 @@ static void vhost_user_blk_device_realize(DeviceState *dev, Error **errp)
>  return;
>  }
> 
> -virtio_init(vdev, "virtio-blk", VIRTIO_ID_BLOCK,
> +virtio_init(vdev, VIRTIO_ID_BLOCK,
>  sizeof(struct virtio_blk_config));
> 
>  s->virtqs = g_new(VirtQueue *, s->num_queues);
> diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
> index f139cd7..505e574 100644
> --- a/hw/block/virtio-blk.c
> +++ b/hw/block/virtio-blk.c
> @@ -1213,7 +1213,7 @@ static void virtio_blk_device_realize(DeviceState *dev, Error **errp)
> 
>  virtio_blk_set_config_size(s, s->host_features);
> 
> -virtio_init(vdev, "virtio-blk", VIRTIO_ID_BLOCK, s->config_size);
> +virtio_init(vdev, VIRTIO_ID_BLOCK, s->config_size);
> 
>  s->blk = conf->conf.blk;
>  s->rq = NULL;
> diff --git a/hw/char/virtio-serial-bus.c b/hw/char/virtio-serial-bus.c
> index f01ec21..9f19fd0 100644
> --- a/hw/char/virtio-serial-bus.c
> +++ b/hw/char/virtio-serial-bus.c
> @@ -1044,8 +1044,7 @@ static void virtio_serial_device_realize(DeviceState *dev, Error **errp)
>  VIRTIO_CONSOLE_F_EMERG_WRITE)) {
>  config_size = offsetof(struct virtio_console_config, emerg_wr);
>  }
> -virtio_init(vdev, "virtio-serial", VIRTIO_ID_CONSOLE,
> -config_size);
> +virtio_init(vdev, VIRTIO_ID_CONSOLE, config_size);
> 
>  /* Spawn a new virtio-serial bus on which the ports will ride as devices */
>  qbus_init(&vser->bus, sizeof(vser->bus), TYPE_VIRTIO_SERIAL_BUS,
> diff --git a/hw/display/virtio-gpu-base.c b/hw/display/virtio-gpu-base.c
> index c8da480..5411a7b 100644
> --- a/hw/display/virtio-gpu-base.c
> +++ b/hw/display/virtio-gpu-base.c
> @@ -170,7 +170,7 @@ virtio_gpu_base_device_realize(DeviceState *qdev,
>  }
> 
>  g->virtio_config.num_scanouts = cpu_to_le32(g->conf.max_outputs);
> -virtio_init(VIRTIO_DEVICE(g), "virtio-gpu", VIRTIO_ID_GPU,
> +virtio_init(VIRTIO_DEVICE(g), VIRTIO_ID_GPU,
>  sizeof(struct virtio_gpu_config));
> 
>  if (virtio_gpu_virgl_enabled(g->conf)) {
> diff --git a/hw/input/virtio-input.c b/hw/input/virtio-input.c
> index 54bcb46..5b5398b 100644
> --- a/hw/input/virtio-input.c
> +++ b/hw/input/virtio-input.c
> @@ -257,8 +257,7 @@ static void virtio_input_device_realize(DeviceState *dev, Error **errp)

Re: [PATCH v10 0/8] hmp, qmp: Add commands to introspect virtio devices

2021-12-06 Thread Christian Schoenebeck via
On Montag, 6. Dezember 2021 13:43:18 CET Jonah Palmer wrote:
> This series introduces new QMP/HMP commands to dump the status of a
> virtio device at different levels.
> 
> [Jonah: Rebasing previous patchset from Nov. 10 (v9). Original patches
>  are by Laurent Vivier from May 2020.
> 
>  Rebase from v9 to v10 includes reformatting virtio.json examples and
>  command structures for better consistency. Also removed all enums from
>  virtio.json and replaced their purpose with string literals.
> 
>  Removed @ndescs from VirtioQueueElement, as the number of descriptors
>  can be inferred from the length of the @descs chain.
> 
>  Lastly, removed the examples in hmp-commands-info.hx to fix 'inconsistent
>  literal block quoting' warning from Sphinx.]

I have not followed the entire discussion. AFAICS this is intended to monitor 
status information on virtio level only, like virtqueue fill status, etc.

One thing that I am looking for is monitoring device specific information 
above virtio level, e.g. certain performance numbers or statistics that only 
make sense for the specific device. That would not fit into any of these 
commands, right?

Best regards,
Christian Schoenebeck





Re: [PATCH v2 01/13] hw/sd/ssi-sd: Do not create SD card within controller's realize

2021-12-06 Thread Peter Maydell
On Mon, 6 Dec 2021 at 12:35, Markus Armbruster  wrote:
> Philippe Mathieu-Daudé  writes:
> > I guess I already asked this once but don't remember now... What is the
> > point of creating a SD card without drive? Is this due to the old design
> > issue we hotplug the drive to the SD card and not the SD card on the SD
> > bus?
>
> Device model "sd-card" implements BlockdevOps method change_media_cb().
> This means it models a device with a removable medium.  No drive behaves
> like no medium.  I guess it's an SD card reader.

No, device sd-card really is the SD card -- the protocol between
the SD controller device and the sd-card device is (a slightly
abstracted version of) the protocol that goes between the card
and the controller in hardware. I think the reason it's implemented
as a "device with a removable medium" is historical -- back in
2007 that was the only way to support runtime ejecting and
insertion. We'd implement it differently today, but we've wanted
to preserve the user-facing compatibility of how the monitor
commands for ejecting and inserting an sd card work.

(Side note, the initial sd implementation actually tells the block
layer that it's BDRV_TYPE_FLOPPY as a "reasonable approximation"...)

-- PMM



[PATCH v10 5/8] qmp: decode feature & status bits in virtio-status

2021-12-06 Thread Jonah Palmer
From: Laurent Vivier 

Display feature names instead of bitmaps for host, guest, and
backend for VirtIODevices.

Display status names instead of bitmaps for VirtIODevices.

Display feature names instead of bitmaps for backend, protocol,
acked, and features (hdev->features) for vhost devices.

Decode features according to device ID. Decode statuses
according to configuration status bitmap (config_status_map).
Decode vhost user protocol features according to vhost user
protocol bitmap (vhost_user_protocol_map).

Transport features are on the first line. Undecoded bits (if
any) are stored in a separate field.

Signed-off-by: Jonah Palmer 
---
 hw/block/virtio-blk.c  |  29 +
 hw/char/virtio-serial-bus.c|  11 ++
 hw/display/virtio-gpu-base.c   |  18 ++-
 hw/input/virtio-input.c|  10 ++
 hw/net/virtio-net.c|  47 +++
 hw/scsi/virtio-scsi.c  |  17 +++
 hw/virtio/vhost-user-fs.c  |  10 ++
 hw/virtio/vhost-vsock-common.c |  10 ++
 hw/virtio/virtio-balloon.c |  14 ++
 hw/virtio/virtio-crypto.c  |  10 ++
 hw/virtio/virtio-iommu.c   |  14 ++
 hw/virtio/virtio-mem.c |  11 ++
 hw/virtio/virtio.c | 286 +++--
 include/hw/virtio/vhost.h  |   3 +
 include/hw/virtio/virtio.h |  18 +++
 qapi/virtio.json   | 156 +++---
 16 files changed, 635 insertions(+), 29 deletions(-)

diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index 505e574..d550f077 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -13,6 +13,7 @@
 
 #include "qemu/osdep.h"
 #include "qapi/error.h"
+#include "qapi/qapi-visit-virtio.h"
 #include "qemu/iov.h"
 #include "qemu/module.h"
 #include "qemu/error-report.h"
@@ -32,10 +33,38 @@
 #include "hw/virtio/virtio-bus.h"
 #include "migration/qemu-file-types.h"
 #include "hw/virtio/virtio-access.h"
+#include "standard-headers/linux/vhost_types.h"
 
 /* Config size before the discard support (hide associated config fields) */
 #define VIRTIO_BLK_CFG_SIZE offsetof(struct virtio_blk_config, \
  max_discard_sectors)
+
+qmp_virtio_feature_map_t blk_map[] = {
+#define FEATURE_ENTRY(name) \
+{ VIRTIO_BLK_F_##name, #name }
+FEATURE_ENTRY(SIZE_MAX),
+FEATURE_ENTRY(SEG_MAX),
+FEATURE_ENTRY(GEOMETRY),
+FEATURE_ENTRY(RO),
+FEATURE_ENTRY(BLK_SIZE),
+FEATURE_ENTRY(TOPOLOGY),
+FEATURE_ENTRY(MQ),
+FEATURE_ENTRY(DISCARD),
+FEATURE_ENTRY(WRITE_ZEROES),
+#ifndef VIRTIO_BLK_NO_LEGACY
+FEATURE_ENTRY(BARRIER),
+FEATURE_ENTRY(SCSI),
+FEATURE_ENTRY(FLUSH),
+FEATURE_ENTRY(CONFIG_WCE),
+#endif /* !VIRTIO_BLK_NO_LEGACY */
+#undef FEATURE_ENTRY
+#define FEATURE_ENTRY(name) \
+{ VHOST_F_##name, #name  }
+FEATURE_ENTRY(LOG_ALL),
+#undef FEATURE_ENTRY
+{ -1, ""  }
+};
+
 /*
  * Starting from the discard feature, we can use this array to properly
  * set the config size depending on the features enabled.
diff --git a/hw/char/virtio-serial-bus.c b/hw/char/virtio-serial-bus.c
index 9f19fd0..9de2575 100644
--- a/hw/char/virtio-serial-bus.c
+++ b/hw/char/virtio-serial-bus.c
@@ -20,6 +20,7 @@
 
 #include "qemu/osdep.h"
 #include "qapi/error.h"
+#include "qapi/qapi-visit-virtio.h"
 #include "qemu/iov.h"
 #include "qemu/main-loop.h"
 #include "qemu/module.h"
@@ -32,6 +33,16 @@
 #include "hw/virtio/virtio-serial.h"
 #include "hw/virtio/virtio-access.h"
 
+qmp_virtio_feature_map_t serial_map[] = {
+#define FEATURE_ENTRY(name) \
+{ VIRTIO_CONSOLE_F_##name, #name }
+FEATURE_ENTRY(SIZE),
+FEATURE_ENTRY(MULTIPORT),
+FEATURE_ENTRY(EMERG_WRITE),
+#undef FEATURE_ENTRY
+{ -1, "" }
+};
+
 static struct VirtIOSerialDevices {
 QLIST_HEAD(, VirtIOSerial) devices;
 } vserdevices;
diff --git a/hw/display/virtio-gpu-base.c b/hw/display/virtio-gpu-base.c
index 5411a7b..f6b2373 100644
--- a/hw/display/virtio-gpu-base.c
+++ b/hw/display/virtio-gpu-base.c
@@ -12,13 +12,29 @@
  */
 
 #include "qemu/osdep.h"
-
+#include "standard-headers/linux/vhost_types.h"
 #include "hw/virtio/virtio-gpu.h"
 #include "migration/blocker.h"
 #include "qapi/error.h"
+#include "qapi/qapi-visit-virtio.h"
 #include "qemu/error-report.h"
 #include "trace.h"
 
+qmp_virtio_feature_map_t gpu_map[] = {
+#define FEATURE_ENTRY(name) \
+{ VIRTIO_GPU_F_##name, #name }
+FEATURE_ENTRY(VIRGL),
+FEATURE_ENTRY(EDID),
+FEATURE_ENTRY(RESOURCE_UUID),
+FEATURE_ENTRY(RESOURCE_BLOB),
+#undef FEATURE_ENTRY
+#define FEATURE_ENTRY(name) \
+{ VHOST_F_##name, #name }
+FEATURE_ENTRY(LOG_ALL),
+#undef FEATURE_ENTRY
+{ -1, "" }
+};
+
 void
 virtio_gpu_base_reset(VirtIOGPUBase *g)
 {
diff --git a/hw/input/virtio-input.c b/hw/input/virtio-input.c
index 5b5398b..fe0ed6d 100644
--- a/hw/input/virtio-input.c
+++ b/hw/input/virtio-input.c
@@ -6,6 +6,7 @@
 
 #include "qemu/osdep.h"
 #include "qapi/error.h"
+#include "qapi/qapi-visit-virtio.h"
 #include "qemu/iov.h"
 #include "qemu/module.h"
 #include 

[PATCH v10 4/8] qmp: add QMP command x-query-virtio-status

2021-12-06 Thread Jonah Palmer
From: Laurent Vivier 

This new command shows the status of a VirtIODevice, including
its corresponding vhost device's status (if active).

The next patch will improve the output by decoding feature bits, including
the vhost device's feature bits (backend, protocol, acked, and features).
It will also decode the status bits of a VirtIODevice.
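
For illustration, a session with the new command might look like this
(the path and all field values are made up, and only a subset of the
VirtioStatus fields is shown):

```json
-> { "execute": "x-query-virtio-status",
     "arguments": {
       "path": "/machine/peripheral-anon/device[1]/virtio-backend" } }
<- { "return": {
       "name": "virtio-net",
       "device-id": 1,
       "vhost-started": false,
       "device-endian": "little",
       "num-vqs": 3,
       "status": 15 } }
```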

Signed-off-by: Jonah Palmer 
---
 hw/virtio/virtio-stub.c |   5 ++
 hw/virtio/virtio.c  |  84 ++
 qapi/virtio.json| 220 
 3 files changed, 309 insertions(+)

diff --git a/hw/virtio/virtio-stub.c b/hw/virtio/virtio-stub.c
index 05a81ed..0b432e8 100644
--- a/hw/virtio/virtio-stub.c
+++ b/hw/virtio/virtio-stub.c
@@ -12,3 +12,8 @@ VirtioInfoList *qmp_x_query_virtio(Error **errp)
 {
 return qmp_virtio_unsupported(errp);
 }
+
+VirtioStatus *qmp_x_query_virtio_status(const char *path, Error **errp)
+{
+return qmp_virtio_unsupported(errp);
+}
diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index 76a63a0..b8112ca 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -3924,6 +3924,90 @@ VirtioInfoList *qmp_x_query_virtio(Error **errp)
 return list;
 }
 
+static VirtIODevice *virtio_device_find(const char *path)
+{
+VirtIODevice *vdev;
+
+QTAILQ_FOREACH(vdev, &virtio_list, next) {
+DeviceState *dev = DEVICE(vdev);
+
+if (strcmp(dev->canonical_path, path) != 0) {
+continue;
+}
+return vdev;
+}
+
+return NULL;
+}
+
+VirtioStatus *qmp_x_query_virtio_status(const char *path, Error **errp)
+{
+VirtIODevice *vdev;
+VirtioStatus *status;
+
+vdev = virtio_device_find(path);
+if (vdev == NULL) {
+error_setg(errp, "Path %s is not a VirtIODevice", path);
+return NULL;
+}
+
+status = g_new0(VirtioStatus, 1);
+status->name = g_strdup(vdev->name);
+status->device_id = vdev->device_id;
+status->vhost_started = vdev->vhost_started;
+status->guest_features = vdev->guest_features;
+status->host_features = vdev->host_features;
+status->backend_features = vdev->backend_features;
+
+switch (vdev->device_endian) {
+case VIRTIO_DEVICE_ENDIAN_LITTLE:
+status->device_endian = g_strdup("little");
+break;
+case VIRTIO_DEVICE_ENDIAN_BIG:
+status->device_endian = g_strdup("big");
+break;
+default:
+status->device_endian = g_strdup("unknown");
+break;
+}
+
+status->num_vqs = virtio_get_num_queues(vdev);
+status->status = vdev->status;
+status->isr = vdev->isr;
+status->queue_sel = vdev->queue_sel;
+status->vm_running = vdev->vm_running;
+status->broken = vdev->broken;
+status->disabled = vdev->disabled;
+status->use_started = vdev->use_started;
+status->started = vdev->started;
+status->start_on_kick = vdev->start_on_kick;
+status->disable_legacy_check = vdev->disable_legacy_check;
+status->bus_name = g_strdup(vdev->bus_name);
+status->use_guest_notifier_mask = vdev->use_guest_notifier_mask;
+status->has_vhost_dev = vdev->vhost_started;
+
+if (vdev->vhost_started) {
+VirtioDeviceClass *vdc = VIRTIO_DEVICE_GET_CLASS(vdev);
+struct vhost_dev *hdev = vdc->get_vhost(vdev);
+
+status->vhost_dev = g_new0(VhostStatus, 1);
+status->vhost_dev->n_mem_sections = hdev->n_mem_sections;
+status->vhost_dev->n_tmp_sections = hdev->n_tmp_sections;
+status->vhost_dev->nvqs = hdev->nvqs;
+status->vhost_dev->vq_index = hdev->vq_index;
+status->vhost_dev->features = hdev->features;
+status->vhost_dev->acked_features = hdev->acked_features;
+status->vhost_dev->backend_features = hdev->backend_features;
+status->vhost_dev->protocol_features = hdev->protocol_features;
+status->vhost_dev->max_queues = hdev->max_queues;
+status->vhost_dev->backend_cap = hdev->backend_cap;
+status->vhost_dev->log_enabled = hdev->log_enabled;
+status->vhost_dev->log_size = hdev->log_size;
+}
+
+return status;
+}
+
 static const TypeInfo virtio_device_info = {
 .name = TYPE_VIRTIO_DEVICE,
 .parent = TYPE_DEVICE,
diff --git a/qapi/virtio.json b/qapi/virtio.json
index 5372cde..235b8f3 100644
--- a/qapi/virtio.json
+++ b/qapi/virtio.json
@@ -66,3 +66,223 @@
 { 'command': 'x-query-virtio',
   'returns': [ 'VirtioInfo' ],
   'features': [ 'unstable' ] }
+
+##
+# @VhostStatus:
+#
+# Information about a vhost device. This information will only be
+# displayed if the vhost device is active.
+#
+# @n-mem-sections: vhost_dev n_mem_sections
+#
+# @n-tmp-sections: vhost_dev n_tmp_sections
+#
+# @nvqs: vhost_dev nvqs (number of virtqueues being used)
+#
+# @vq-index: vhost_dev vq_index
+#
+# @features: vhost_dev features
+#
+# @acked-features: vhost_dev acked_features
+#
+# @backend-features: vhost_dev backend_features
+#
+# @protocol-features: vhost_dev protocol_features
+#
+# @max-queues: vhost_dev max_queues

[PATCH v10 1/8] virtio: drop name parameter for virtio_init()

2021-12-06 Thread Jonah Palmer
This patch drops the name parameter for the virtio_init function.

The pair between the numeric device ID and the string device ID
(name) of a virtio device already exists, but not in a way that
lets us map between them.

This patch lets us do this and removes the need for the name
parameter in the virtio_init function.

Signed-off-by: Jonah Palmer 
---
 hw/9pfs/virtio-9p-device.c  |  2 +-
 hw/block/vhost-user-blk.c   |  2 +-
 hw/block/virtio-blk.c   |  2 +-
 hw/char/virtio-serial-bus.c |  3 +-
 hw/display/virtio-gpu-base.c|  2 +-
 hw/input/virtio-input.c |  3 +-
 hw/net/virtio-net.c |  2 +-
 hw/scsi/virtio-scsi.c   |  3 +-
 hw/virtio/vhost-user-fs.c   |  3 +-
 hw/virtio/vhost-user-i2c.c  |  7 +
 hw/virtio/vhost-user-rng.c  |  2 +-
 hw/virtio/vhost-user-vsock.c|  2 +-
 hw/virtio/vhost-vsock-common.c  |  5 ++--
 hw/virtio/vhost-vsock.c |  2 +-
 hw/virtio/virtio-balloon.c  |  3 +-
 hw/virtio/virtio-crypto.c   |  2 +-
 hw/virtio/virtio-iommu.c|  3 +-
 hw/virtio/virtio-mem.c  |  3 +-
 hw/virtio/virtio-pmem.c |  3 +-
 hw/virtio/virtio-rng.c  |  2 +-
 hw/virtio/virtio.c  | 44 +++--
 include/hw/virtio/vhost-vsock-common.h  |  2 +-
 include/hw/virtio/virtio-gpu.h  |  3 +-
 include/hw/virtio/virtio.h  |  4 +--
 include/standard-headers/linux/virtio_ids.h |  1 +
 25 files changed, 67 insertions(+), 43 deletions(-)

diff --git a/hw/9pfs/virtio-9p-device.c b/hw/9pfs/virtio-9p-device.c
index 54ee93b..5f522e6 100644
--- a/hw/9pfs/virtio-9p-device.c
+++ b/hw/9pfs/virtio-9p-device.c
@@ -216,7 +216,7 @@ static void virtio_9p_device_realize(DeviceState *dev, 
Error **errp)
 }
 
 v->config_size = sizeof(struct virtio_9p_config) + strlen(s->fsconf.tag);
-virtio_init(vdev, "virtio-9p", VIRTIO_ID_9P, v->config_size);
+virtio_init(vdev, VIRTIO_ID_9P, v->config_size);
 v->vq = virtio_add_queue(vdev, MAX_REQ, handle_9p_output);
 }
 
diff --git a/hw/block/vhost-user-blk.c b/hw/block/vhost-user-blk.c
index ba13cb8..f61f8c1 100644
--- a/hw/block/vhost-user-blk.c
+++ b/hw/block/vhost-user-blk.c
@@ -490,7 +490,7 @@ static void vhost_user_blk_device_realize(DeviceState *dev, 
Error **errp)
 return;
 }
 
-virtio_init(vdev, "virtio-blk", VIRTIO_ID_BLOCK,
+virtio_init(vdev, VIRTIO_ID_BLOCK,
 sizeof(struct virtio_blk_config));
 
 s->virtqs = g_new(VirtQueue *, s->num_queues);
diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index f139cd7..505e574 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -1213,7 +1213,7 @@ static void virtio_blk_device_realize(DeviceState *dev, 
Error **errp)
 
 virtio_blk_set_config_size(s, s->host_features);
 
-virtio_init(vdev, "virtio-blk", VIRTIO_ID_BLOCK, s->config_size);
+virtio_init(vdev, VIRTIO_ID_BLOCK, s->config_size);
 
 s->blk = conf->conf.blk;
 s->rq = NULL;
diff --git a/hw/char/virtio-serial-bus.c b/hw/char/virtio-serial-bus.c
index f01ec21..9f19fd0 100644
--- a/hw/char/virtio-serial-bus.c
+++ b/hw/char/virtio-serial-bus.c
@@ -1044,8 +1044,7 @@ static void virtio_serial_device_realize(DeviceState 
*dev, Error **errp)
 VIRTIO_CONSOLE_F_EMERG_WRITE)) {
 config_size = offsetof(struct virtio_console_config, emerg_wr);
 }
-virtio_init(vdev, "virtio-serial", VIRTIO_ID_CONSOLE,
-config_size);
+virtio_init(vdev, VIRTIO_ID_CONSOLE, config_size);
 
 /* Spawn a new virtio-serial bus on which the ports will ride as devices */
qbus_init(&vser->bus, sizeof(vser->bus), TYPE_VIRTIO_SERIAL_BUS,
diff --git a/hw/display/virtio-gpu-base.c b/hw/display/virtio-gpu-base.c
index c8da480..5411a7b 100644
--- a/hw/display/virtio-gpu-base.c
+++ b/hw/display/virtio-gpu-base.c
@@ -170,7 +170,7 @@ virtio_gpu_base_device_realize(DeviceState *qdev,
 }
 
 g->virtio_config.num_scanouts = cpu_to_le32(g->conf.max_outputs);
-virtio_init(VIRTIO_DEVICE(g), "virtio-gpu", VIRTIO_ID_GPU,
+virtio_init(VIRTIO_DEVICE(g), VIRTIO_ID_GPU,
 sizeof(struct virtio_gpu_config));
 
 if (virtio_gpu_virgl_enabled(g->conf)) {
diff --git a/hw/input/virtio-input.c b/hw/input/virtio-input.c
index 54bcb46..5b5398b 100644
--- a/hw/input/virtio-input.c
+++ b/hw/input/virtio-input.c
@@ -257,8 +257,7 @@ static void virtio_input_device_realize(DeviceState *dev, 
Error **errp)
 vinput->cfg_size += 8;
 assert(vinput->cfg_size <= sizeof(virtio_input_config));
 
-virtio_init(vdev, "virtio-input", VIRTIO_ID_INPUT,
-vinput->cfg_size);
+virtio_init(vdev, VIRTIO_ID_INPUT, vinput->cfg_size);
 

[PATCH v10 8/8] hmp: add virtio commands

2021-12-06 Thread Jonah Palmer
From: Laurent Vivier 

This patch implements the HMP versions of the virtio QMP commands.
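
An illustrative monitor session (device path and output lines are made up):

```
(qemu) info virtio
/machine/peripheral-anon/device[1]/virtio-backend [virtio-net]
(qemu) info virtio-status /machine/peripheral-anon/device[1]/virtio-backend
/machine/peripheral-anon/device[1]/virtio-backend:
  device_name:   virtio-net (vhost)
  device_id:     1
  ...
```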

Signed-off-by: Jonah Palmer 
---
 hmp-commands-info.hx  |  70 
 include/monitor/hmp.h |   5 +
 monitor/hmp-cmds.c| 311 ++
 3 files changed, 386 insertions(+)

diff --git a/hmp-commands-info.hx b/hmp-commands-info.hx
index 407a1da..e49d852 100644
--- a/hmp-commands-info.hx
+++ b/hmp-commands-info.hx
@@ -877,3 +877,73 @@ SRST
   ``info sgx``
 Show intel SGX information.
 ERST
+
+{
+.name  = "virtio",
+.args_type = "",
+.params= "",
+.help  = "List all available virtio devices",
+.cmd   = hmp_virtio_query,
+.flags = "p",
+},
+
+SRST
+  ``info virtio``
+List all available virtio devices
+ERST
+
+{
+.name  = "virtio-status",
+.args_type = "path:s",
+.params= "path",
+.help  = "Display status of a given virtio device",
+.cmd   = hmp_virtio_status,
+.flags = "p",
+},
+
+SRST
+  ``info virtio-status`` *path*
+Display status of a given virtio device
+ERST
+
+{
+.name  = "virtio-queue-status",
+.args_type = "path:s,queue:i",
+.params= "path queue",
+.help  = "Display status of a given virtio queue",
+.cmd   = hmp_virtio_queue_status,
+.flags = "p",
+},
+
+SRST
+  ``info virtio-queue-status`` *path* *queue*
+Display status of a given virtio queue
+ERST
+
+{
+.name  = "virtio-vhost-queue-status",
+.args_type = "path:s,queue:i",
+.params= "path queue",
+.help  = "Display status of a given vhost queue",
+.cmd   = hmp_vhost_queue_status,
+.flags = "p",
+},
+
+SRST
+  ``info virtio-vhost-queue-status`` *path* *queue*
+Display status of a given vhost queue
+ERST
+
+{
+.name   = "virtio-queue-element",
+.args_type  = "path:s,queue:i,index:i?",
+.params = "path queue [index]",
+.help   = "Display element of a given virtio queue",
+.cmd= hmp_virtio_queue_element,
+.flags  = "p",
+},
+
+SRST
+  ``info virtio-queue-element`` *path* *queue* [*index*]
+Display element of a given virtio queue
+ERST
diff --git a/include/monitor/hmp.h b/include/monitor/hmp.h
index 96d0148..47446d8 100644
--- a/include/monitor/hmp.h
+++ b/include/monitor/hmp.h
@@ -95,6 +95,11 @@ void hmp_qom_list(Monitor *mon, const QDict *qdict);
 void hmp_qom_get(Monitor *mon, const QDict *qdict);
 void hmp_qom_set(Monitor *mon, const QDict *qdict);
 void hmp_info_qom_tree(Monitor *mon, const QDict *dict);
+void hmp_virtio_query(Monitor *mon, const QDict *qdict);
+void hmp_virtio_status(Monitor *mon, const QDict *qdict);
+void hmp_virtio_queue_status(Monitor *mon, const QDict *qdict);
+void hmp_vhost_queue_status(Monitor *mon, const QDict *qdict);
+void hmp_virtio_queue_element(Monitor *mon, const QDict *qdict);
 void object_add_completion(ReadLineState *rs, int nb_args, const char *str);
 void object_del_completion(ReadLineState *rs, int nb_args, const char *str);
 void device_add_completion(ReadLineState *rs, int nb_args, const char *str);
diff --git a/monitor/hmp-cmds.c b/monitor/hmp-cmds.c
index 9c91bf9..7092653 100644
--- a/monitor/hmp-cmds.c
+++ b/monitor/hmp-cmds.c
@@ -42,6 +42,8 @@
 #include "qapi/qapi-commands-run-state.h"
 #include "qapi/qapi-commands-tpm.h"
 #include "qapi/qapi-commands-ui.h"
+#include "qapi/qapi-commands-virtio.h"
+#include "qapi/qapi-visit-virtio.h"
 #include "qapi/qapi-visit-net.h"
 #include "qapi/qapi-visit-migration.h"
 #include "qapi/qmp/qdict.h"
@@ -2165,3 +2167,312 @@ void hmp_info_memory_size_summary(Monitor *mon, const 
QDict *qdict)
 }
 hmp_handle_error(mon, err);
 }
+
+static void hmp_virtio_dump_protocols(Monitor *mon,
+  VhostDeviceProtocols *pcol)
+{
+strList *pcol_list = pcol->protocols;
+while (pcol_list) {
+monitor_printf(mon, "%s", pcol_list->value);
+pcol_list = pcol_list->next;
+if (pcol_list != NULL) {
+monitor_printf(mon, ", ");
+}
+}
+monitor_printf(mon, "\n");
+if (pcol->has_unknown_protocols) {
+monitor_printf(mon, "  unknown-protocols(0x%016"PRIx64")\n",
+   pcol->unknown_protocols);
+}
+}
+
+static void hmp_virtio_dump_status(Monitor *mon,
+   VirtioDeviceStatus *status)
+{
+strList *status_list = status->statuses;
+while (status_list) {
+monitor_printf(mon, "%s", status_list->value);
+status_list = status_list->next;
+if (status_list != NULL) {
+monitor_printf(mon, ", ");
+}
+}
+monitor_printf(mon, "\n");
+if (status->has_unknown_statuses) {
+monitor_printf(mon, "  unknown-statuses(0x%016"PRIx32")\n",
+   

[PATCH v10 7/8] qmp: add QMP command x-query-virtio-queue-element

2021-12-06 Thread Jonah Palmer
From: Laurent Vivier 

This new command shows the information of a VirtQueue element.
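
An illustrative exchange (path, values, and the exact field set are made up
for the example):

```json
-> { "execute": "x-query-virtio-queue-element",
     "arguments": {
       "path": "/machine/peripheral-anon/device[1]/virtio-backend",
       "queue": 0 } }
<- { "return": {
       "name": "virtio-net",
       "index": 24,
       "descs": [
         { "addr": 5217370112, "len": 1536, "flags": [ "write" ] } ] } }
```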

Signed-off-by: Jonah Palmer 
---
 hw/virtio/virtio-stub.c |   9 +++
 hw/virtio/virtio.c  | 154 
 qapi/virtio.json| 183 
 3 files changed, 346 insertions(+)

diff --git a/hw/virtio/virtio-stub.c b/hw/virtio/virtio-stub.c
index 13e5f93..7ddb22c 100644
--- a/hw/virtio/virtio-stub.c
+++ b/hw/virtio/virtio-stub.c
@@ -31,3 +31,12 @@ VirtQueueStatus *qmp_x_query_virtio_queue_status(const char 
*path,
 {
 return qmp_virtio_unsupported(errp);
 }
+
+VirtioQueueElement *qmp_x_query_virtio_queue_element(const char *path,
+ uint16_t queue,
+ bool has_index,
+ uint16_t index,
+ Error **errp)
+{
+return qmp_virtio_unsupported(errp);
+}
diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index 459bfb2..8c6cc27 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -475,6 +475,19 @@ static inline void vring_used_write(VirtQueue *vq, 
VRingUsedElem *uelem,
address_space_cache_invalidate(&caches->used, pa, sizeof(VRingUsedElem));
 }
 
+/* Called within rcu_read_lock(). */
+static inline uint16_t vring_used_flags(VirtQueue *vq)
+{
+VRingMemoryRegionCaches *caches = vring_get_region_caches(vq);
+hwaddr pa = offsetof(VRingUsed, flags);
+
+if (!caches) {
+return 0;
+}
+
+return virtio_lduw_phys_cached(vq->vdev, &caches->used, pa);
+}
+
 /* Called within rcu_read_lock().  */
 static uint16_t vring_used_idx(VirtQueue *vq)
 {
@@ -4381,6 +4394,147 @@ VirtQueueStatus *qmp_x_query_virtio_queue_status(const 
char *path,
 return status;
 }
 
+static strList *qmp_decode_vring_desc_flags(uint16_t flags)
+{
+strList *list = NULL;
+strList *node;
+int i;
+
+struct {
+uint16_t flag;
+const char *value;
+} map[] = {
+{ VRING_DESC_F_NEXT, "next" },
+{ VRING_DESC_F_WRITE, "write" },
+{ VRING_DESC_F_INDIRECT, "indirect" },
+{ 1 << VRING_PACKED_DESC_F_AVAIL, "avail" },
+{ 1 << VRING_PACKED_DESC_F_USED, "used" },
+{ 0, "" }
+};
+
+for (i = 0; map[i].flag; i++) {
+if ((map[i].flag & flags) == 0) {
+continue;
+}
+node = g_malloc0(sizeof(strList));
+node->value = g_strdup(map[i].value);
+node->next = list;
+list = node;
+}
+
+return list;
+}
+
+VirtioQueueElement *qmp_x_query_virtio_queue_element(const char *path,
+ uint16_t queue,
+ bool has_index,
+ uint16_t index,
+ Error **errp)
+{
+VirtIODevice *vdev;
+VirtQueue *vq;
+VirtioQueueElement *element = NULL;
+
+vdev = virtio_device_find(path);
+if (vdev == NULL) {
+error_setg(errp, "Path %s is not a VirtIO device", path);
+return NULL;
+}
+
+if (queue >= VIRTIO_QUEUE_MAX || !virtio_queue_get_num(vdev, queue)) {
+error_setg(errp, "Invalid virtqueue number %d", queue);
+return NULL;
+}
+vq = &vdev->vq[queue];
+
+if (virtio_vdev_has_feature(vdev, VIRTIO_F_RING_PACKED)) {
+error_setg(errp, "Packed ring not supported");
+return NULL;
+} else {
+unsigned int head, i, max;
+VRingMemoryRegionCaches *caches;
+MemoryRegionCache indirect_desc_cache = MEMORY_REGION_CACHE_INVALID;
+MemoryRegionCache *desc_cache;
+VRingDesc desc;
+VirtioRingDescList *list = NULL;
+VirtioRingDescList *node;
+int rc; int ndescs;
+
+RCU_READ_LOCK_GUARD();
+
+max = vq->vring.num;
+
+if (!has_index) {
+head = vring_avail_ring(vq, vq->last_avail_idx % vq->vring.num);
+} else {
+head = vring_avail_ring(vq, index % vq->vring.num);
+}
+i = head;
+
+caches = vring_get_region_caches(vq);
+if (!caches) {
+error_setg(errp, "Region caches not initialized");
+return NULL;
+}
+if (caches->desc.len < max * sizeof(VRingDesc)) {
+error_setg(errp, "Cannot map descriptor ring");
+return NULL;
+}
+
+desc_cache = &caches->desc;
+vring_split_desc_read(vdev, &desc, desc_cache, i);
+if (desc.flags & VRING_DESC_F_INDIRECT) {
+int64_t len;
+len = address_space_cache_init(&indirect_desc_cache, vdev->dma_as,
+   desc.addr, desc.len, false);
+desc_cache = &indirect_desc_cache;
+if (len < desc.len) {
+error_setg(errp, "Cannot map indirect buffer");
+goto done;
+   

[PATCH v10 2/8] virtio: add vhost support for virtio devices

2021-12-06 Thread Jonah Palmer
This patch adds a get_vhost() callback function for VirtIODevices that
returns the device's corresponding vhost_dev structure, if the vhost
device is running. This patch also adds a vhost_started flag for
VirtIODevices.

Previously, a VirtIODevice wouldn't be able to tell if its corresponding
vhost device was active or not.

Signed-off-by: Jonah Palmer 
---
 hw/block/vhost-user-blk.c  |  7 +++
 hw/display/vhost-user-gpu.c|  7 +++
 hw/input/vhost-user-input.c|  7 +++
 hw/net/virtio-net.c|  9 +
 hw/scsi/vhost-scsi.c   |  8 
 hw/virtio/vhost-user-fs.c  |  7 +++
 hw/virtio/vhost-user-rng.c |  7 +++
 hw/virtio/vhost-vsock-common.c |  7 +++
 hw/virtio/vhost.c  |  4 +++-
 hw/virtio/virtio-crypto.c  | 10 ++
 hw/virtio/virtio.c |  1 +
 include/hw/virtio/virtio.h |  3 +++
 12 files changed, 76 insertions(+), 1 deletion(-)

diff --git a/hw/block/vhost-user-blk.c b/hw/block/vhost-user-blk.c
index f61f8c1..b059da1 100644
--- a/hw/block/vhost-user-blk.c
+++ b/hw/block/vhost-user-blk.c
@@ -568,6 +568,12 @@ static void vhost_user_blk_instance_init(Object *obj)
   "/disk@0,0", DEVICE(obj));
 }
 
+static struct vhost_dev *vhost_user_blk_get_vhost(VirtIODevice *vdev)
+{
+VHostUserBlk *s = VHOST_USER_BLK(vdev);
+return &s->dev;
+}
+
 static const VMStateDescription vmstate_vhost_user_blk = {
 .name = "vhost-user-blk",
 .minimum_version_id = 1,
@@ -602,6 +608,7 @@ static void vhost_user_blk_class_init(ObjectClass *klass, 
void *data)
 vdc->get_features = vhost_user_blk_get_features;
 vdc->set_status = vhost_user_blk_set_status;
 vdc->reset = vhost_user_blk_reset;
+vdc->get_vhost = vhost_user_blk_get_vhost;
 }
 
 static const TypeInfo vhost_user_blk_info = {
diff --git a/hw/display/vhost-user-gpu.c b/hw/display/vhost-user-gpu.c
index 49df56c..6e93b46 100644
--- a/hw/display/vhost-user-gpu.c
+++ b/hw/display/vhost-user-gpu.c
@@ -565,6 +565,12 @@ vhost_user_gpu_device_realize(DeviceState *qdev, Error 
**errp)
 g->vhost_gpu_fd = -1;
 }
 
+static struct vhost_dev *vhost_user_gpu_get_vhost(VirtIODevice *vdev)
+{
+VhostUserGPU *g = VHOST_USER_GPU(vdev);
+return &g->vhost->dev;
+}
+
 static Property vhost_user_gpu_properties[] = {
 VIRTIO_GPU_BASE_PROPERTIES(VhostUserGPU, parent_obj.conf),
 DEFINE_PROP_END_OF_LIST(),
@@ -586,6 +592,7 @@ vhost_user_gpu_class_init(ObjectClass *klass, void *data)
 vdc->guest_notifier_pending = vhost_user_gpu_guest_notifier_pending;
 vdc->get_config = vhost_user_gpu_get_config;
 vdc->set_config = vhost_user_gpu_set_config;
+vdc->get_vhost = vhost_user_gpu_get_vhost;
 
 device_class_set_props(dc, vhost_user_gpu_properties);
 }
diff --git a/hw/input/vhost-user-input.c b/hw/input/vhost-user-input.c
index 273e96a..43d2ff3 100644
--- a/hw/input/vhost-user-input.c
+++ b/hw/input/vhost-user-input.c
@@ -79,6 +79,12 @@ static void vhost_input_set_config(VirtIODevice *vdev,
 virtio_notify_config(vdev);
 }
 
+static struct vhost_dev *vhost_input_get_vhost(VirtIODevice *vdev)
+{
+VHostUserInput *vhi = VHOST_USER_INPUT(vdev);
+return &vhi->vhost->dev;
+}
+
 static const VMStateDescription vmstate_vhost_input = {
 .name = "vhost-user-input",
 .unmigratable = 1,
@@ -93,6 +99,7 @@ static void vhost_input_class_init(ObjectClass *klass, void 
*data)
dc->vmsd = &vmstate_vhost_input;
 vdc->get_config = vhost_input_get_config;
 vdc->set_config = vhost_input_set_config;
+vdc->get_vhost = vhost_input_get_vhost;
 vic->realize = vhost_input_realize;
 vic->change_active = vhost_input_change_active;
 }
diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index b275acf..2449b9c 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -3610,6 +3610,14 @@ static bool dev_unplug_pending(void *opaque)
 return vdc->primary_unplug_pending(dev);
 }
 
+static struct vhost_dev *virtio_net_get_vhost(VirtIODevice *vdev)
+{
+VirtIONet *n = VIRTIO_NET(vdev);
+NetClientState *nc = qemu_get_queue(n->nic);
+struct vhost_net *net = get_vhost_net(nc->peer);
+return &net->dev;
+}
+
 static const VMStateDescription vmstate_virtio_net = {
 .name = "virtio-net",
 .minimum_version_id = VIRTIO_NET_VM_VERSION,
@@ -3712,6 +3720,7 @@ static void virtio_net_class_init(ObjectClass *klass, 
void *data)
 vdc->post_load = virtio_net_post_load_virtio;
vdc->vmsd = &vmstate_virtio_net_device;
 vdc->primary_unplug_pending = primary_unplug_pending;
+vdc->get_vhost = virtio_net_get_vhost;
 }
 
 static const TypeInfo virtio_net_info = {
diff --git a/hw/scsi/vhost-scsi.c b/hw/scsi/vhost-scsi.c
index 039caf2..b0a9c45 100644
--- a/hw/scsi/vhost-scsi.c
+++ b/hw/scsi/vhost-scsi.c
@@ -264,6 +264,13 @@ static void vhost_scsi_unrealize(DeviceState *dev)
 virtio_scsi_common_unrealize(dev);
 }
 
+static struct vhost_dev *vhost_scsi_get_vhost(VirtIODevice *vdev)
+{
+VHostSCSI *s = 

[PATCH v10 6/8] qmp: add QMP commands for virtio/vhost queue-status

2021-12-06 Thread Jonah Palmer
From: Laurent Vivier 

These new commands show the internal status of a VirtIODevice's
VirtQueue and a vhost device's vhost_virtqueue (if active).
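
An illustrative exchange with one of the new commands (path, values, and
the exact field set are made up for the example):

```json
-> { "execute": "x-query-virtio-queue-status",
     "arguments": {
       "path": "/machine/peripheral-anon/device[1]/virtio-backend",
       "queue": 0 } }
<- { "return": {
       "name": "virtio-net",
       "queue-index": 0,
       "inuse": 0,
       "vring-num": 256,
       "used-idx": 0 } }
```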

Signed-off-by: Jonah Palmer 
---
 hw/virtio/virtio-stub.c |  14 +++
 hw/virtio/virtio.c  | 103 
 qapi/virtio.json| 252 
 3 files changed, 369 insertions(+)

diff --git a/hw/virtio/virtio-stub.c b/hw/virtio/virtio-stub.c
index 0b432e8..13e5f93 100644
--- a/hw/virtio/virtio-stub.c
+++ b/hw/virtio/virtio-stub.c
@@ -17,3 +17,17 @@ VirtioStatus *qmp_x_query_virtio_status(const char *path, 
Error **errp)
 {
 return qmp_virtio_unsupported(errp);
 }
+
+VirtVhostQueueStatus *qmp_x_query_virtio_vhost_queue_status(const char *path,
+uint16_t queue,
+Error **errp)
+{
+return qmp_virtio_unsupported(errp);
+}
+
+VirtQueueStatus *qmp_x_query_virtio_queue_status(const char *path,
+ uint16_t queue,
+ Error **errp)
+{
+return qmp_virtio_unsupported(errp);
+}
diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index b4b2af8..459bfb2 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -4278,6 +4278,109 @@ VirtioStatus *qmp_x_query_virtio_status(const char 
*path, Error **errp)
 return status;
 }
 
+VirtVhostQueueStatus *qmp_x_query_virtio_vhost_queue_status(const char *path,
+uint16_t queue,
+Error **errp)
+{
+VirtIODevice *vdev;
+VirtVhostQueueStatus *status;
+
+vdev = virtio_device_find(path);
+if (vdev == NULL) {
+error_setg(errp, "Path %s is not a VirtIODevice", path);
+return NULL;
+}
+
+if (!vdev->vhost_started) {
+error_setg(errp, "Error: vhost device has not started yet");
+return NULL;
+}
+
+VirtioDeviceClass *vdc = VIRTIO_DEVICE_GET_CLASS(vdev);
+struct vhost_dev *hdev = vdc->get_vhost(vdev);
+
+if (queue < hdev->vq_index || queue >= hdev->vq_index + hdev->nvqs) {
+error_setg(errp, "Invalid vhost virtqueue number %d", queue);
+return NULL;
+}
+
+status = g_new0(VirtVhostQueueStatus, 1);
+status->name = g_strdup(vdev->name);
+status->kick = hdev->vqs[queue].kick;
+status->call = hdev->vqs[queue].call;
+status->desc = (uint64_t)(unsigned long)hdev->vqs[queue].desc;
+status->avail = (uint64_t)(unsigned long)hdev->vqs[queue].avail;
+status->used = (uint64_t)(unsigned long)hdev->vqs[queue].used;
+status->num = hdev->vqs[queue].num;
+status->desc_phys = hdev->vqs[queue].desc_phys;
+status->desc_size = hdev->vqs[queue].desc_size;
+status->avail_phys = hdev->vqs[queue].avail_phys;
+status->avail_size = hdev->vqs[queue].avail_size;
+status->used_phys = hdev->vqs[queue].used_phys;
+status->used_size = hdev->vqs[queue].used_size;
+
+return status;
+}
+
+VirtQueueStatus *qmp_x_query_virtio_queue_status(const char *path,
+ uint16_t queue,
+ Error **errp)
+{
+VirtIODevice *vdev;
+VirtQueueStatus *status;
+
+vdev = virtio_device_find(path);
+if (vdev == NULL) {
+error_setg(errp, "Path %s is not a VirtIODevice", path);
+return NULL;
+}
+
+if (queue >= VIRTIO_QUEUE_MAX || !virtio_queue_get_num(vdev, queue)) {
+error_setg(errp, "Invalid virtqueue number %d", queue);
+return NULL;
+}
+
+status = g_new0(VirtQueueStatus, 1);
+status->name = g_strdup(vdev->name);
+status->queue_index = vdev->vq[queue].queue_index;
+status->inuse = vdev->vq[queue].inuse;
+status->vring_num = vdev->vq[queue].vring.num;
+status->vring_num_default = vdev->vq[queue].vring.num_default;
+status->vring_align = vdev->vq[queue].vring.align;
+status->vring_desc = vdev->vq[queue].vring.desc;
+status->vring_avail = vdev->vq[queue].vring.avail;
+status->vring_used = vdev->vq[queue].vring.used;
+status->used_idx = vdev->vq[queue].used_idx;
+status->signalled_used = vdev->vq[queue].signalled_used;
+status->signalled_used_valid = vdev->vq[queue].signalled_used_valid;
+
+if (vdev->vhost_started) {
+VirtioDeviceClass *vdc = VIRTIO_DEVICE_GET_CLASS(vdev);
+struct vhost_dev *hdev = vdc->get_vhost(vdev);
+
+/* check if vq index exists for vhost as well  */
+if (queue >= hdev->vq_index && queue < hdev->vq_index + hdev->nvqs) {
+status->has_last_avail_idx = true;
+
+int vhost_vq_index =
+hdev->vhost_ops->vhost_get_vq_index(hdev, queue);
+struct vhost_vring_state state = {
+.index = vhost_vq_index,
+};
+
+

[PATCH v10 3/8] qmp: add QMP command x-query-virtio

2021-12-06 Thread Jonah Palmer
From: Laurent Vivier 

This new command lists all the instances of VirtIODevices with
their canonical QOM path and name.

Signed-off-by: Jonah Palmer 
---
 hw/virtio/meson.build  |  2 ++
 hw/virtio/virtio-stub.c| 14 ++
 hw/virtio/virtio.c | 27 ++
 include/hw/virtio/virtio.h |  1 +
 qapi/meson.build   |  1 +
 qapi/qapi-schema.json  |  1 +
 qapi/virtio.json   | 68 ++
 tests/qtest/qmp-cmd-test.c |  1 +
 8 files changed, 115 insertions(+)
 create mode 100644 hw/virtio/virtio-stub.c
 create mode 100644 qapi/virtio.json

diff --git a/hw/virtio/meson.build b/hw/virtio/meson.build
index 521f7d6..d893f5f 100644
--- a/hw/virtio/meson.build
+++ b/hw/virtio/meson.build
@@ -6,8 +6,10 @@ softmmu_virtio_ss.add(when: 'CONFIG_VHOST', if_false: files('vhost-stub.c'))
 
 softmmu_ss.add_all(when: 'CONFIG_VIRTIO', if_true: softmmu_virtio_ss)
 softmmu_ss.add(when: 'CONFIG_VIRTIO', if_false: files('vhost-stub.c'))
+softmmu_ss.add(when: 'CONFIG_VIRTIO', if_false: files('virtio-stub.c'))
 
 softmmu_ss.add(when: 'CONFIG_ALL', if_true: files('vhost-stub.c'))
+softmmu_ss.add(when: 'CONFIG_ALL', if_true: files('virtio-stub.c'))
 
 virtio_ss = ss.source_set()
 virtio_ss.add(files('virtio.c'))
diff --git a/hw/virtio/virtio-stub.c b/hw/virtio/virtio-stub.c
new file mode 100644
index 000..05a81ed
--- /dev/null
+++ b/hw/virtio/virtio-stub.c
@@ -0,0 +1,14 @@
+#include "qemu/osdep.h"
+#include "qapi/error.h"
+#include "qapi/qapi-commands-virtio.h"
+
+static void *qmp_virtio_unsupported(Error **errp)
+{
+error_setg(errp, "Virtio is disabled");
+return NULL;
+}
+
+VirtioInfoList *qmp_x_query_virtio(Error **errp)
+{
+return qmp_virtio_unsupported(errp);
+}
diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index 867834d..76a63a0 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -13,6 +13,8 @@
 
 #include "qemu/osdep.h"
 #include "qapi/error.h"
+#include "qapi/qapi-commands-virtio.h"
+#include "qapi/qapi-visit-virtio.h"
 #include "cpu.h"
 #include "trace.h"
 #include "qemu/error-report.h"
@@ -29,6 +31,9 @@
 #include "sysemu/runstate.h"
 #include "standard-headers/linux/virtio_ids.h"
 
+/* QAPI list of realized VirtIODevices */
+static QTAILQ_HEAD(, VirtIODevice) virtio_list;
+
 /*
  * The alignment to use between consumer and producer parts of vring.
  * x86 pagesize again. This is the default, used by transports like PCI
@@ -3700,6 +3705,7 @@ static void virtio_device_realize(DeviceState *dev, Error **errp)
 vdev->listener.commit = virtio_memory_listener_commit;
 vdev->listener.name = "virtio";
memory_listener_register(&vdev->listener, vdev->dma_as);
+QTAILQ_INSERT_TAIL(&virtio_list, vdev, next);
 }
 
 static void virtio_device_unrealize(DeviceState *dev)
@@ -3714,6 +3720,7 @@ static void virtio_device_unrealize(DeviceState *dev)
 vdc->unrealize(dev);
 }
 
+QTAILQ_REMOVE(&virtio_list, vdev, next);
 g_free(vdev->bus_name);
 vdev->bus_name = NULL;
 }
@@ -3887,6 +3894,8 @@ static void virtio_device_class_init(ObjectClass *klass, void *data)
 vdc->stop_ioeventfd = virtio_device_stop_ioeventfd_impl;
 
 vdc->legacy_features |= VIRTIO_LEGACY_FEATURES;
+
+QTAILQ_INIT(&virtio_list);
 }
 
 bool virtio_device_ioeventfd_enabled(VirtIODevice *vdev)
@@ -3897,6 +3906,24 @@ bool virtio_device_ioeventfd_enabled(VirtIODevice *vdev)
 return virtio_bus_ioeventfd_enabled(vbus);
 }
 
+VirtioInfoList *qmp_x_query_virtio(Error **errp)
+{
+VirtioInfoList *list = NULL;
+VirtioInfoList *node;
+VirtIODevice *vdev;
+
+QTAILQ_FOREACH(vdev, &virtio_list, next) {
+DeviceState *dev = DEVICE(vdev);
+node = g_new0(VirtioInfoList, 1);
+node->value = g_new(VirtioInfo, 1);
+node->value->path = g_strdup(dev->canonical_path);
+node->value->name = g_strdup(vdev->name);
+QAPI_LIST_PREPEND(list, node->value);
+}
+
+return list;
+}
+
 static const TypeInfo virtio_device_info = {
 .name = TYPE_VIRTIO_DEVICE,
 .parent = TYPE_DEVICE,
diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
index 0a12bd5..3b4eb85 100644
--- a/include/hw/virtio/virtio.h
+++ b/include/hw/virtio/virtio.h
@@ -110,6 +110,7 @@ struct VirtIODevice
 bool use_guest_notifier_mask;
 AddressSpace *dma_as;
 QLIST_HEAD(, VirtQueue) *vector_queues;
+QTAILQ_ENTRY(VirtIODevice) next;
 };
 
 struct VirtioDeviceClass {
diff --git a/qapi/meson.build b/qapi/meson.build
index c0c49c1..e332f28 100644
--- a/qapi/meson.build
+++ b/qapi/meson.build
@@ -48,6 +48,7 @@ qapi_all_modules = [
   'sockets',
   'trace',
   'transaction',
+  'virtio',
   'yank',
 ]
 if have_system
diff --git a/qapi/qapi-schema.json b/qapi/qapi-schema.json
index 4912b97..1512ada 100644
--- a/qapi/qapi-schema.json
+++ b/qapi/qapi-schema.json
@@ -93,3 +93,4 @@
 { 'include': 'audio.json' }
 { 'include': 'acpi.json' }
 { 'include': 'pci.json' }
+{ 'include': 'virtio.json' }
diff --git 

[PATCH v10 0/8] hmp,qmp: Add commands to introspect virtio devices

2021-12-06 Thread Jonah Palmer
This series introduces new QMP/HMP commands to dump the status of a
virtio device at different levels.

[Jonah: Rebasing previous patchset from Nov. 10 (v9). Original patches
 are by Laurent Vivier from May 2020.

 Rebase from v9 to v10 includes reformatting virtio.json examples and
 command structures for better consistency. Also removed all enums from
 virtio.json and replaced their purpose with string literals.

 Removed @ndescs from VirtioQueueElement, as the number of descriptors
 can be inferred from the length of the @descs chain.

 Lastly, removed the examples in hmp-commands-info.hx to fix 'inconsistent
 literal block quoting' warning from Sphinx.]

1. List available virtio devices in the machine

HMP Form:

info virtio

Example:

(qemu) info virtio
/machine/peripheral/vsock0/virtio-backend [vhost-vsock]
/machine/peripheral/crypto0/virtio-backend [virtio-crypto]
/machine/peripheral-anon/device[2]/virtio-backend [virtio-scsi]
/machine/peripheral-anon/device[1]/virtio-backend [virtio-net]
/machine/peripheral-anon/device[0]/virtio-backend [virtio-serial]

QMP Form:

{ 'command': 'x-query-virtio',
  'returns': ['VirtioInfo'],
  'features': [ 'unstable' ] }

Example:

-> { "execute": "x-query-virtio" }
<- { "return": [
   {
   "path": "/machine/peripheral/vsock0/virtio-backend",
   "name": "vhost-vsock"
   },
   {
   "path": "/machine/peripheral/crypto0/virtio-backend",
   "name": "virtio-crypto"
   },
   {
   "path": "/machine/peripheral-anon/device[2]/virtio-backend",
   "name": "virtio-scsi"
   },
   {
   "path": "/machine/peripheral-anon/device[1]/virtio-backend",
   "name": "virtio-net"
   },
   {
   "path": "/machine/peripheral-anon/device[0]/virtio-backend",
   "name": "virtio-serial"
   }
 ]
   }

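At the wire level this QMP exchange is plain JSON over the monitor socket. A minimal client-side sketch of building the command and digesting the reply follows; the reply text is abbreviated from the example output above, and no live VM or socket connection is assumed (a real client would first complete the `qmp_capabilities` handshake):

```python
import json

# The command as it would be written to the QMP monitor socket.
request = json.dumps({"execute": "x-query-virtio"})

# Reply abbreviated from the example above.
reply_text = '''
{ "return": [
    { "path": "/machine/peripheral/vsock0/virtio-backend", "name": "vhost-vsock" },
    { "path": "/machine/peripheral-anon/device[1]/virtio-backend", "name": "virtio-net" }
] }
'''

# Index the returned devices by name; each "path" value is the canonical
# QOM path that the other x-query-virtio-* commands take as an argument.
devices = {info["name"]: info["path"] for info in json.loads(reply_text)["return"]}
```

The collected paths feed directly into the `x-query-virtio-status` command described next.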
2. Display status of a given virtio device

HMP Form:

info virtio-status <path>

Example:

(qemu) info virtio-status /machine/peripheral/vsock0/virtio-backend
/machine/peripheral/vsock0/virtio-backend:
device_name: vhost-vsock (vhost)
device_id:   19
vhost_started:   true
bus_name:(null)
broken:  false
disabled:false
disable_legacy_check:false
started: true
use_started: true
start_on_kick:   false
use_guest_notifier_mask: true
vm_running:  true
num_vqs: 3
queue_sel:   2
isr: 0
endianness:  little
status: ACKNOWLEDGE, DRIVER, FEATURES_OK, DRIVER_OK
Guest features:   EVENT_IDX, INDIRECT_DESC, VERSION_1
Host features:    PROTOCOL_FEATURES, EVENT_IDX, INDIRECT_DESC, VERSION_1,
                  ANY_LAYOUT, NOTIFY_ON_EMPTY
Backend features:
VHost:
nvqs:   2
vq_index:   0
max_queues: 0
n_mem_sections: 4
n_tmp_sections: 4
backend_cap:0
log_enabled:false
log_size:   0
Features:  EVENT_IDX, INDIRECT_DESC, VERSION_1, ANY_LAYOUT,
           NOTIFY_ON_EMPTY, LOG_ALL
Acked features:EVENT_IDX, INDIRECT_DESC, VERSION_1
Backend features:
Protocol features:

QMP Form:

{ 'command': 'x-query-virtio-status',
  'data': { 'path': 'str' },
  'returns': 'VirtioStatus',
  'features': [ 'unstable' ] }

Example:

-> { "execute": "x-query-virtio-status",
 "arguments": { "path": "/machine/peripheral/vsock0/virtio-backend" 
}
   }
<- { "return": {
   "device-endian": "little",
   "bus-name": "",
   "disable-legacy-check": false,
   "name": "vhost-vsock",
   "started": true,
   "device-id": 19,
   "vhost-dev": {
  "n-tmp-sections": 4,
  "n-mem-sections": 4,
  "max-queues": 0,
  "backend-cap": 0,
  "log-size": 0,
  "backend-features": {
 "transports": [],
 "dev-features": []
  },
  "nvqs": 2,
  "protocol-features": {
 

Re: [PATCH v2 01/13] hw/sd/ssi-sd: Do not create SD card within controller's realize

2021-12-06 Thread Markus Armbruster
Philippe Mathieu-Daudé  writes:

> Hi Markus, Peter,
>
> On 11/17/21 17:33, Markus Armbruster wrote:
>> ssi_sd_realize() creates an "sd-card" device.  This is inappropriate,
>> and marked FIXME.
>> 
>> Move it to the boards that create these devices.  Prior art: commit
>> eb4f566bbb for device "generic-sdhci", and commit 26c607b86b for
>> device "pl181".
>> 
>> The device remains not user-creatable, because its users should (and
>> do) wire up its GPIO chip-select line.
>> 
>> Cc: Peter Maydell 
>> Cc: Alistair Francis 
>> Cc: Bin Meng 
>> Cc: Palmer Dabbelt 
>> Cc: "Philippe Mathieu-Daudé" 
>> Cc: qemu-...@nongnu.org
>> Cc: qemu-ri...@nongnu.org
>> Signed-off-by: Markus Armbruster 
>
> Reviewed-by: Philippe Mathieu-Daudé 
>
>> ---
>>  hw/arm/stellaris.c  | 15 ++-
>>  hw/riscv/sifive_u.c | 13 -
>>  hw/sd/ssi-sd.c  | 29 +
>>  3 files changed, 27 insertions(+), 30 deletions(-)
>> 
>> diff --git a/hw/arm/stellaris.c b/hw/arm/stellaris.c
>> index 78827ace6b..b6c8a5d609 100644
>> --- a/hw/arm/stellaris.c
>> +++ b/hw/arm/stellaris.c
>> @@ -10,6 +10,7 @@
>>  #include "qemu/osdep.h"
>>  #include "qapi/error.h"
>>  #include "hw/sysbus.h"
>> +#include "hw/sd/sd.h"
>>  #include "hw/ssi/ssi.h"
>>  #include "hw/arm/boot.h"
>>  #include "qemu/timer.h"
>> @@ -1157,6 +1158,9 @@ static void stellaris_init(MachineState *ms, stellaris_board_info *board)
>>  void *bus;
>>  DeviceState *sddev;
>>  DeviceState *ssddev;
>> +DriveInfo *dinfo;
>> +DeviceState *carddev;
>> +BlockBackend *blk;
>>  
>>  /*
>>   * Some boards have both an OLED controller and SD card connected to
>> @@ -1221,8 +1225,17 @@ static void stellaris_init(MachineState *ms, stellaris_board_info *board)
>>   *  - Make the ssd0323 OLED controller chipselect active-low
>>   */
>>  bus = qdev_get_child_bus(dev, "ssi");
>> -
>>  sddev = ssi_create_peripheral(bus, "ssi-sd");
>> +
>> +dinfo = drive_get(IF_SD, 0, 0);
>> +blk = dinfo ? blk_by_legacy_dinfo(dinfo) : NULL;
>> +carddev = qdev_new(TYPE_SD_CARD);
>> +qdev_prop_set_drive_err(carddev, "drive", blk, &error_fatal);
>
> I guess a already asked this once but don't remember now... What is the
> point of creating a SD card without drive? Is this due to the old design
> issue we hotplug the drive to the SD card and not the SD card on the SD
> bus?

Device model "sd-card" implements BlockdevOps method change_media_cb().
This means it models a device with a removable medium.  No drive behaves
like no medium.  I guess it's an SD card reader.

>
>> +qdev_prop_set_bit(carddev, "spi", true);
>> +qdev_realize_and_unref(carddev,
>> +   qdev_get_child_bus(sddev, "sd-bus"),
>> +   &error_fatal);
>> +




Re: [PATCH 04/14] nbd/client: Add safety check on chunk payload length

2021-12-06 Thread Vladimir Sementsov-Ogievskiy

04.12.2021 02:15, Eric Blake wrote:

Our existing use of structured replies either reads into a qiov capped
at 32M (NBD_CMD_READ) or caps allocation to 1000 bytes (see
NBD_MAX_MALLOC_PAYLOAD in block/nbd.c).  But the existing length
checks are rather late; if we encounter a buggy (or malicious) server
that sends a super-large payload length, we should drop the connection
right then rather than assuming the layer on top will be careful.
This becomes more important when we permit 64-bit lengths which are
even more likely to have the potential for attempted denial of service
abuse.

Signed-off-by: Eric Blake 



Reviewed-by: Vladimir Sementsov-Ogievskiy 


---
  nbd/client.c | 12 
  1 file changed, 12 insertions(+)

diff --git a/nbd/client.c b/nbd/client.c
index 30d5383cb195..8f137c2320bb 100644
--- a/nbd/client.c
+++ b/nbd/client.c
@@ -1412,6 +1412,18 @@ static int nbd_receive_structured_reply_chunk(QIOChannel *ioc,
  chunk->handle = be64_to_cpu(chunk->handle);
  chunk->length = be32_to_cpu(chunk->length);

+/*
+ * Because we use BLOCK_STATUS with REQ_ONE, and cap READ requests
+ * at 32M, no valid server should send us payload larger than
+ * this.  Even if we stopped using REQ_ONE, sane servers will cap
+ * the number of extents they return for block status.
+ */
+if (chunk->length > NBD_MAX_BUFFER_SIZE + sizeof(NBDStructuredReadData)) {
+error_setg(errp, "server chunk %" PRIu32 " (%s) payload is too long",
+   chunk->type, nbd_rep_lookup(chunk->type));
+return -EINVAL;
+}
+
  return 0;
  }
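
The bound being enforced can be restated numerically. The constants below are assumptions about QEMU's definitions (NBD_MAX_BUFFER_SIZE as 32 MiB, and NBDStructuredReadData as a single 64-bit offset prefix) rather than values quoted in the diff itself:

```python
# Assumed values mirroring QEMU's constants (see lead-in above).
NBD_MAX_BUFFER_SIZE = 32 * 1024 * 1024   # 32 MiB cap on READ requests
SIZEOF_STRUCTURED_READ_DATA = 8          # one 64-bit offset field

def chunk_payload_ok(length: int) -> bool:
    """Mirror of the new check: accept a chunk payload only if it could
    fit a maximal read reply (data plus its offset prefix)."""
    return length <= NBD_MAX_BUFFER_SIZE + SIZEOF_STRUCTURED_READ_DATA

# A full-size read reply passes; a bogus ~4 GiB payload is rejected
# before any allocation happens.
assert chunk_payload_ok(NBD_MAX_BUFFER_SIZE + SIZEOF_STRUCTURED_READ_DATA)
assert not chunk_payload_ok(2**32 - 1)
```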




--
Best regards,
Vladimir



Re: [PATCH 03/14] qemu-io: Allow larger write zeroes under no fallback

2021-12-06 Thread Vladimir Sementsov-Ogievskiy

04.12.2021 02:15, Eric Blake wrote:

When writing zeroes can fall back to a slow write, permitting an
overly large request can become an amplification denial of service
attack in triggering a large amount of work from a small request.  But
the whole point of the no fallback flag is to quickly determine if
writing an entire device to zero can be done quickly (such as when it
is already known that the device started with zero contents); in those
cases, artificially capping things at 2G in qemu-io itself doesn't
help us.

Signed-off-by: Eric Blake 
---
  qemu-io-cmds.c | 9 +++--
  1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/qemu-io-cmds.c b/qemu-io-cmds.c
index 954955c12fb9..45a957093369 100644
--- a/qemu-io-cmds.c
+++ b/qemu-io-cmds.c
@@ -603,10 +603,6 @@ static int do_co_pwrite_zeroes(BlockBackend *blk, int64_t offset,
  .done   = false,
  };

-if (bytes > INT_MAX) {
-return -ERANGE;
-}
-
  co = qemu_coroutine_create(co_pwrite_zeroes_entry, &data);
  bdrv_coroutine_enter(blk_bs(blk), co);
  while (!data.done) {
@@ -1160,8 +1156,9 @@ static int write_f(BlockBackend *blk, int argc, char **argv)
  if (count < 0) {
  print_cvtnum_err(count, argv[optind]);
  return count;
-} else if (count > BDRV_REQUEST_MAX_BYTES) {
-printf("length cannot exceed %" PRIu64 ", given %s\n",
+} else if (count > BDRV_REQUEST_MAX_BYTES &&
+   !(flags & BDRV_REQ_NO_FALLBACK)) {
+printf("length cannot exceed %" PRIu64 " without -n, given %s\n",


Actually, I don't see the reason to restrict qemu-io, which is mostly a testing
tool, in this way. What if I want to test a data request > 2G, why not?

So, we restrict the user in testing, but don't avoid any kind of DoS: a bad guy
can always modify the code and rebuild qemu-io to overcome the restriction.

But I don't really care of it, patch is not wrong:

Reviewed-by: Vladimir Sementsov-Ogievskiy 


--
Best regards,
Vladimir



Re: [PATCH 02/14] qemu-io: Utilize 64-bit status during map

2021-12-06 Thread Vladimir Sementsov-Ogievskiy

04.12.2021 02:15, Eric Blake wrote:

The block layer has supported 64-bit block status from drivers since
commit 86a3d5c688 ("block: Add .bdrv_co_block_status() callback",
v2.12) and friends, with individual driver callbacks responsible for
capping things where necessary.  Artificially capping things below 2G
in the qemu-io 'map' command, added in commit d6a644bbfe ("block: Make
bdrv_is_allocated() byte-based", v2.10) is thus no longer necessary.

One way to test this is with qemu-nbd as server on a raw file larger
than 4G (the entire file should show as allocated), plus 'qemu-io -f
raw -c map nbd://localhost --trace=nbd_\*' as client.  Prior to this
patch, the NBD_CMD_BLOCK_STATUS requests are fragmented at 0x7e00
distances; with this patch, the fragmenting changes to 0x7fff
(since the NBD protocol is currently still limited to 32-bit
transactions - see block/nbd.c:nbd_client_co_block_status).  Then in
later patches, once I add an NBD extension for a 64-bit block status,
the same map command completes with just one NBD_CMD_BLOCK_STATUS.

Signed-off-by: Eric Blake


Reviewed-by: Vladimir Sementsov-Ogievskiy 

--
Best regards,
Vladimir



Re: [PATCH 01/14] nbd/server: Minor cleanups

2021-12-06 Thread Vladimir Sementsov-Ogievskiy

04.12.2021 02:15, Eric Blake wrote:

Spelling fixes, grammar improvements and consistent spacing, noticed
while preparing other patches in this file.

Signed-off-by: Eric Blake 


Reviewed-by: Vladimir Sementsov-Ogievskiy 


---
  nbd/server.c | 13 ++---
  1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/nbd/server.c b/nbd/server.c
index 4630dd732250..f302e1cbb03e 100644
--- a/nbd/server.c
+++ b/nbd/server.c
@@ -2085,11 +2085,10 @@ static void nbd_extent_array_convert_to_be(NBDExtentArray *ea)
   * Add extent to NBDExtentArray. If extent can't be added (no available 
space),
   * return -1.
   * For safety, when returning -1 for the first time, .can_add is set to false,
- * further call to nbd_extent_array_add() will crash.
- * (to avoid the situation, when after failing to add an extent (returned -1),
- * user miss this failure and add another extent, which is successfully added
- * (array is full, but new extent may be squashed into the last one), then we
- * have invalid array with skipped extent)
+ * and further calls to nbd_extent_array_add() will crash.
+ * (this avoids the situation where a caller ignores failure to add one extent,
+ * where adding another extent that would squash into the last array entry
+ * would result in an incorrect range reported to the client)
   */
  static int nbd_extent_array_add(NBDExtentArray *ea,
  uint32_t length, uint32_t flags)
@@ -2288,7 +2287,7 @@ static int nbd_co_receive_request(NBDRequestData *req, NBDRequest *request,
  assert(client->recv_coroutine == qemu_coroutine_self());
  ret = nbd_receive_request(client, request, errp);
  if (ret < 0) {
-return  ret;
+return ret;
  }

  trace_nbd_co_receive_request_decode_type(request->handle, request->type,
@@ -2648,7 +2647,7 @@ static coroutine_fn void nbd_trip(void *opaque)
  }

  if (ret < 0) {
-/* It wans't -EIO, so, according to nbd_co_receive_request()
+/* It wasn't -EIO, so, according to nbd_co_receive_request()
   * semantics, we should return the error to the client. */
  Error *export_err = local_err;




--
Best regards,
Vladimir



Re: [PATCH] spec: Add NBD_OPT_EXTENDED_HEADERS

2021-12-06 Thread Vladimir Sementsov-Ogievskiy

04.12.2021 02:14, Eric Blake wrote:

Add a new negotiation feature where the client and server agree to use
larger packet headers on every packet sent during transmission phase.
This has two purposes: first, it makes it possible to perform
operations like trim, write zeroes, and block status on more than 2^32
bytes in a single command; this in turn requires that some structured
replies from the server also be extended to match.  The wording chosen
here is careful to permit a server to use either flavor in its reply
(that is, a request less than 32-bits can trigger an extended reply,
and conversely a request larger than 32-bits can trigger a compact
reply).

Second, when structured replies are active, clients have to deal with
the difference between 16- and 20-byte headers of simple
vs. structured replies, which impacts performance if the client must
perform multiple syscalls to first read the magic before knowing how
many additional bytes to read.  In extended header mode, all headers
are the same width, so the client can read a full header before
deciding whether the header describes a simple or structured reply.
Similarly, by having extended mode use a power-of-2 sizing, it becomes
easier to manipulate headers within a single cache line, even if it
requires padding bytes sent over the wire.  However, note that this
change only affects the headers; as data payloads can still be
unaligned (for example, a client performing 1-byte reads or writes),
we would need to negotiate yet another extension if we wanted to
ensure that all NBD transmission packets started on an 8-byte boundary
after option haggling has completed.
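
To make the fixed 32-byte framing concrete, here is an illustrative (non-normative) Python packing of the proposed extended request header, using the field order and `NBD_REQUEST_EXT_MAGIC` value given in the diff of this proposal:

```python
import struct

NBD_REQUEST_EXT_MAGIC = 0x21E41C71  # magic proposed for extended requests

def pack_ext_request(flags: int, cmd_type: int, handle: int,
                     offset: int, length: int) -> bytes:
    """Pack an extended request header: 32-bit magic, two 16-bit fields,
    then three 64-bit fields, all in network byte order."""
    return struct.pack(">IHHQQQ", NBD_REQUEST_EXT_MAGIC,
                       flags, cmd_type, handle, offset, length)

# A 6 GiB trim (NBD_CMD_TRIM == 4) no longer has to be split into
# sub-2^32-byte requests.
hdr = pack_ext_request(0, 4, 0x1234, 0, 6 * 1024**3)
assert len(hdr) == 32  # every header is exactly 32 bytes wide
```

Note how the 4+2+2+8+8+8 layout lands exactly on the power-of-2 size the text argues for.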

This spec addition was done in parallel with a proof of concept
implementation in qemu (server and client) and libnbd (client), and I
also have plans to implement it in nbdkit (server).

Signed-off-by: Eric Blake 
---

Available at https://repo.or.cz/nbd/ericb.git/shortlog/refs/tags/exthdr-v1

  doc/proto.md | 218 +--
  1 file changed, 177 insertions(+), 41 deletions(-)

diff --git a/doc/proto.md b/doc/proto.md
index 3a877a9..46560b6 100644
--- a/doc/proto.md
+++ b/doc/proto.md
@@ -295,6 +295,21 @@ reply is also problematic for error handling of the `NBD_CMD_READ`
request.  Therefore, structured replies can be used to create a
context-free server stream; see below.

+The results of client negotiation also determine whether the client
+and server will utilize only compact requests and replies, or whether
+both sides will use only extended packets.  Compact messages are the
+default, but inherently limit single transactions to a 32-bit window
+starting at a 64-bit offset.  Extended messages make it possible to
+perform 64-bit transactions (although typically only for commands that
+do not include a data payload).  Furthermore, when structured replies
+have been negotiated, compact messages require the client to perform
+partial reads to determine which reply packet style (simple or
+structured) is on the wire before knowing the length of the rest of
+the reply, which can reduce client performance.  With extended
+messages, all packet headers have a fixed length of 32 bytes, and
+although this results in more traffic over the network due to padding,
+the resulting layout is friendlier for performance.
+
  Replies need not be sent in the same order as requests (i.e., requests
  may be handled by the server asynchronously), and structured reply
  chunks from one request may be interleaved with reply messages from
@@ -343,7 +358,9 @@ may be useful.

   Request message

-The request message, sent by the client, looks as follows:
+The compact request message, sent by the client when extended
+transactions are not negotiated using `NBD_OPT_EXTENDED_HEADERS`,
+looks as follows:

  C: 32 bits, 0x25609513, magic (`NBD_REQUEST_MAGIC`)
  C: 16 bits, command flags
@@ -353,14 +370,26 @@ C: 64 bits, offset (unsigned)
  C: 32 bits, length (unsigned)
  C: (*length* bytes of data if the request is of type `NBD_CMD_WRITE`)

+If negotiation agreed on extended transactions with
+`NBD_OPT_EXTENDED_HEADERS`, the client instead uses extended requests:
+
+C: 32 bits, 0x21e41c71, magic (`NBD_REQUEST_EXT_MAGIC`)
+C: 16 bits, command flags
+C: 16 bits, type
+C: 64 bits, handle
+C: 64 bits, offset (unsigned)
+C: 64 bits, length (unsigned)
+C: (*length* bytes of data if the request is of type `NBD_CMD_WRITE`)
+
   Simple reply message

  The simple reply message MUST be sent by the server in response to all
  requests if structured replies have not been negotiated using
-`NBD_OPT_STRUCTURED_REPLY`. If structured replies have been negotiated, a 
simple
-reply MAY be used as a reply to any request other than `NBD_CMD_READ`,
-but only if the reply has no data payload.  The message looks as
-follows:
+`NBD_OPT_STRUCTURED_REPLY`. If structured replies have been
+negotiated, a simple reply MAY be used as a reply to any request other
+than `NBD_CMD_READ`, but only if