Re: [PATCH v5 6/8] iotests/300: avoid abnormal shutdown race condition
On Wed, Oct 27, 2021 at 8:56 AM Kevin Wolf wrote:

> Am 26.10.2021 um 19:56 hat John Snow geschrieben:
> > Wait for the destination VM to close itself instead of racing to shut it
> > down first, which produces different error log messages from AQMP
> > depending on precisely when we tried to shut it down.
> >
> > (For example: We may try to issue 'quit' immediately prior to the target
> > VM closing its QMP socket, which will cause an ECONNRESET error to be
> > logged. Waiting for the VM to exit itself avoids the race on shutdown
> > behavior.)
> >
> > Reported-by: Hanna Reitz
> > Signed-off-by: John Snow
>
> I think this will still expose the race I described in my comment on
> patch 2.
>
> Kevin

I wrote a long, meandering explanation of how I think things work in
reply to that comment. TLDR: I didn't see the problem. Here in reply to
this comment I'll just ask: what exactly *is* the race you see happening
here even with this patch? I'm not sure I see it.

--js
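To make the difference concrete, here is a rough Python sketch using qemu.machine's QEMUMachine; the binary path and variable name are illustrative and not taken from iotests/300:

    from qemu.machine import QEMUMachine

    dest_vm = QEMUMachine('/usr/bin/qemu-system-x86_64')
    dest_vm.launch()
    # ... drive the test to the point where the destination VM is
    # expected to exit on its own ...

    # Racy variant: issue 'quit' ourselves and race the VM's own
    # teardown; may log ECONNRESET if the VM closes its QMP socket
    # first:
    #     dest_vm.shutdown()

    # What the patch switches to: let the VM exit itself, then reap it:
    dest_vm.wait()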
Re: [PATCH v5 2/8] python/machine: Handle QMP errors on close more meticulously
This reply is long, sorry.

On Wed, Oct 27, 2021 at 7:19 AM Kevin Wolf wrote:

> Am 26.10.2021 um 19:56 hat John Snow geschrieben:
> > To use the AQMP backend, Machine just needs to be a little more diligent
> > about what happens when closing a QMP connection. The operation is no
> > longer a freebie in the async world; it may return errors encountered in
> > the async bottom half on incoming message receipt, etc.
> >
> > (AQMP's disconnect, ultimately, serves as the quiescence point where all
> > async contexts are gathered together, and any final errors reported at
> > that point.)
> >
> > Because async QMP continues to check for messages asynchronously, it's
> > almost certain that the loop will have exited due to EOF after
> > issuing the last 'quit' command. That error will ultimately be bubbled
> > up when attempting to close the QMP connection. The manager class here
> > is then free to discard it -- if it was expected.
> >
> > Signed-off-by: John Snow
> > Reviewed-by: Hanna Reitz
> > ---
> >  python/qemu/machine/machine.py | 48 +-
> >  1 file changed, 42 insertions(+), 6 deletions(-)
> >
> > diff --git a/python/qemu/machine/machine.py b/python/qemu/machine/machine.py
> > index 0bd40bc2f76..a0cf69786b4 100644
> > --- a/python/qemu/machine/machine.py
> > +++ b/python/qemu/machine/machine.py
> > @@ -342,9 +342,15 @@ def _post_shutdown(self) -> None:
> >          # Comprehensive reset for the failed launch case:
> >          self._early_cleanup()
> >
> > -        if self._qmp_connection:
> > -            self._qmp.close()
> > -            self._qmp_connection = None
> > +        try:
> > +            self._close_qmp_connection()
> > +        except Exception as err:  # pylint: disable=broad-except
> > +            LOG.warning(
> > +                "Exception closing QMP connection: %s",
> > +                str(err) if str(err) else type(err).__name__
> > +            )
> > +        finally:
> > +            assert self._qmp_connection is None
> >
> >          self._close_qemu_log_file()
> >
> > @@ -420,6 +426,31 @@ def _launch(self) -> None:
> >                                 close_fds=False)
> >          self._post_launch()
> >
> > +    def _close_qmp_connection(self) -> None:
> > +        """
> > +        Close the underlying QMP connection, if any.
> > +
> > +        Dutifully report errors that occurred while closing, but assume
> > +        that any error encountered indicates an abnormal termination
> > +        process and not a failure to close.
> > +        """
> > +        if self._qmp_connection is None:
> > +            return
> > +
> > +        try:
> > +            self._qmp.close()
> > +        except EOFError:
> > +            # EOF can occur as an Exception here when using the Async
> > +            # QMP backend. It indicates that the server closed the
> > +            # stream. If we successfully issued 'quit' at any point,
> > +            # then this was expected. If the remote went away without
> > +            # our permission, it's worth reporting that as an abnormal
> > +            # shutdown case.
> > +            if not (self._user_killed or self._quit_issued):
> > +                raise
>
> Isn't this racy for those tests that expect QEMU to quit by itself and
> then later call wait()? self._quit_issued is only set to True in wait(),
> but whatever will cause QEMU to quit happens earlier and it might
> actually quit before wait() is called.

_quit_issued is also set to True via qmp() and command(), but I think
you're referring to cases where QEMU terminates for other reasons than
an explicit command. So, yes, QEMU might indeed terminate/abort/quit
before machine.py has recorded that fact somewhere. I wasn't aware of
that being a problem.

I suppose it'd be racy if, say, you orchestrated a "side-channel quit"
and then didn't call wait() and instead called shutdown(). I think
that's the case you're worried about?
But, this code should ultimately be called in only four cases:

(1) Connection failure
(2) shutdown()
(3) shutdown(), via wait()
(4) shutdown(), via kill()

I had considered it a matter of calling the correct exit path from the
test code. If shutdown() does what the label on the tin says (it shuts
down the VM), I actually would expect it to be racy if QEMU was in the
middle of deconstructing itself. That's the belief behind changing
around iotest 300 the way I did.

> It would make sense to me that such tests need to declare that they
> expect QEMU to quit before actually performing the action. And then
> wait() becomes less weird in patch 1, too, because it can just assert
> self._quit_issued instead of unconditionally setting it.

That's a thought. I was working on the model that calling wait()
implies that the writer expects the VM to terminate through some
mechanism otherwise invisible to machine.py (HMP perhaps, a QGA action,
migration failure, etc.). It's one less call vs. saying
"expect_shutdown()" before wait().
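For illustration, a minimal Python sketch of what Kevin's suggestion might look like from a test; expect_shutdown() is a hypothetical helper, not something this series adds:

    from qemu.machine import QEMUMachine

    vm = QEMUMachine('/usr/bin/qemu-system-x86_64')
    vm.launch()

    vm.expect_shutdown()        # hypothetical: record the intent up front,
                                # so _quit_issued is set before the event
    vm.qmp('system_powerdown')  # a side-channel action that ends the VM
    vm.wait()                   # wait() could then assert self._quit_issued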
Re: [PATCH 00/15] hw/nvme: SR-IOV with Virtualization Enhancements
On Tue, Oct 26, 2021 at 08:20:12PM +0200, Klaus Jensen wrote:
> On Oct 7 18:23, Lukasz Maniak wrote:
> > Hi,
> >
> > This series of patches is an attempt to add support for the following
> > sections of NVMe specification revision 1.4:
> >
> > 8.5 Virtualization Enhancements (Optional)
> > 8.5.1 VQ Resource Definition
> > 8.5.2 VI Resource Definition
> > 8.5.3 Secondary Controller States and Resource Configuration
> > 8.5.4 Single Root I/O Virtualization and Sharing (SR-IOV)
> >
> > The NVMe controller's Single Root I/O Virtualization and Sharing
> > implementation is based on patches introducing SR-IOV support for PCI
> > Express proposed by Knut Omang:
> > https://lists.gnu.org/archive/html/qemu-devel/2015-10/msg05155.html
> >
> > However, based on what I was able to find historically, Knut's patches
> > have not yet been pulled into QEMU due to no example of a working device
> > up to this point:
> > https://lists.gnu.org/archive/html/qemu-devel/2017-10/msg02722.html
> >
> > In terms of design, the Physical Function controller and the Virtual
> > Function controllers are almost independent, with few exceptions: the
> > PF handles flexible resource allocation for all its children (VFs have
> > read-only access to this data), and reset (the PF explicitly calls it
> > on VFs). Since the MMIO access is serialized, no extra precautions are
> > required to handle concurrent resets, and the secondary controller
> > state access doesn't need to be atomic.
> >
> > A controller with full SR-IOV support must be capable of handling the
> > Namespace Management command. As there is a pending review with this
> > functionality, this patch list is not duplicating efforts.
> > Yet, NS management patches are not required to test the SR-IOV support.
> >
> > We tested the patches on Ubuntu 20.04.3 LTS with kernel 5.4.0. We have
> > hit various issues with NVMe CLI (list and virt-mgmt commands) between
> > releases from version 1.09 to master, thus we chose this golden NVMe CLI
> > hash for testing: a50a0c1.
> >
> > The implementation is not 100% finished and certainly not bug free,
> > since we are already aware of some issues, e.g. interaction with
> > namespaces related to AER, or unexpected (?) kernel behavior in more
> > complex reset scenarios. However, our SR-IOV implementation is already
> > able to support typical SR-IOV use cases, so we believe the patches are
> > ready to share with the community.
> >
> > Hope you find some time to review the work we did, and share your
> > thoughts.
> >
> > Kind regards,
> > Lukasz
>
> Hi Lukasz,
>
> If possible, can you share a brief guide on testing this? I keep
> hitting an assert
>
>   qemu-system-x86_64: ../hw/pci/pci.c:1215: pci_register_bar: Assertion
>   `!pci_is_vf(pci_dev)' failed.
>
> when I try to modify sriov_numvfs. This should be fixed anyway, but I
> might be doing something wrong in the first place.

Hi Klaus,

Let me share all the details of the steps I took to run 7 fully
functional VF controllers; following them, I failed to reproduce the
assert.

To rule out any recent regression, I rebased the v1 patches onto the
current master 931ce30859.
I configured the build as follows:

  ./configure \
      --target-list=x86_64-softmmu \
      --enable-kvm

Then I launched QEMU using these options:

  ./qemu-system-x86_64 \
      -m 4096 \
      -smp 4 \
      -drive file=qemu-os.qcow2 \
      -nographic \
      -enable-kvm \
      -machine q35 \
      -device pcie-root-port,slot=0,id=rp0 \
      -device nvme-subsys,id=subsys0 \
      -device nvme,serial=1234,id=nvme0,subsys=subsys0,bus=rp0,sriov_max_vfs=127,sriov_max_vq_per_vf=2,sriov_max_vi_per_vf=1

Next, I issued the commands below as root to configure the VFs:

  nvme virt-mgmt /dev/nvme0 -c 0 -r 1 -a 1 -n 0
  nvme virt-mgmt /dev/nvme0 -c 0 -r 0 -a 1 -n 0
  nvme reset /dev/nvme0
  echo 1 > /sys/bus/pci/rescan

  nvme virt-mgmt /dev/nvme0 -c 1 -r 1 -a 8 -n 1
  nvme virt-mgmt /dev/nvme0 -c 1 -r 0 -a 8 -n 2
  nvme virt-mgmt /dev/nvme0 -c 1 -r 0 -a 9 -n 0
  nvme virt-mgmt /dev/nvme0 -c 2 -r 1 -a 8 -n 1
  nvme virt-mgmt /dev/nvme0 -c 2 -r 0 -a 8 -n 2
  nvme virt-mgmt /dev/nvme0 -c 2 -r 0 -a 9 -n 0
  nvme virt-mgmt /dev/nvme0 -c 3 -r 1 -a 8 -n 1
  nvme virt-mgmt /dev/nvme0 -c 3 -r 0 -a 8 -n 2
  nvme virt-mgmt /dev/nvme0 -c 3 -r 0 -a 9 -n 0
  nvme virt-mgmt /dev/nvme0 -c 4 -r 1 -a 8 -n 1
  nvme virt-mgmt /dev/nvme0 -c 4 -r 0 -a 8 -n 2
  nvme virt-mgmt /dev/nvme0 -c 4 -r 0 -a 9 -n 0
  nvme virt-mgmt /dev/nvme0 -c 5 -r 1 -a 8 -n 1
  nvme virt-mgmt /dev/nvme0 -c 5 -r 0 -a 8 -n 2
  nvme virt-mgmt /dev/nvme0 -c 5 -r 0 -a 9 -n 0
  nvme virt-mgmt /dev/nvme0 -c 6 -r 1 -a 8 -n 1
  nvme virt-mgmt /dev/nvme0 -c 6 -r 0 -a 8 -n 2
  nvme virt-mgmt /dev/nvme0 -c 6 -r 0 -a 9 -n 0
  nvme virt-mgmt /dev/nvme0 -c 7 -r 1 -a 8 -n 1
  nvme virt-mgmt /dev/nvme0 -c 7 -r 0 -a 8 -n 2
  nvme virt-mgmt /dev/nvme0 -c 7 -r 0 -a 9 -n 0

  echo 7 > /sys/bus/pci/devices/0000:01:00.0/sriov_numvfs

If you use only up to patch 05 inclusive, then this command should do
the whole job:

  echo 7 > /sys/bus/pci/devices/0000:01:00.0/sriov_numvfs

The OS I used for the host and guest was
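For anyone scripting the reproduction, here is an equivalent Python sketch of the per-VF loop above (same device paths and flag values as the commands listed; purely a convenience wrapper, not part of the series):

    import subprocess

    def virt_mgmt(ctrl, rt, act, nr):
        # Thin wrapper around the nvme-cli invocations listed above.
        subprocess.run(['nvme', 'virt-mgmt', '/dev/nvme0',
                        '-c', str(ctrl), '-r', str(rt),
                        '-a', str(act), '-n', str(nr)], check=True)

    for ctrl in range(1, 8):
        # Same -r/-a/-n triples as issued by hand above.
        virt_mgmt(ctrl, 1, 8, 1)
        virt_mgmt(ctrl, 0, 8, 2)
        virt_mgmt(ctrl, 0, 9, 0)

    # Finally enable 7 VFs, as with the echo command above:
    with open('/sys/bus/pci/devices/0000:01:00.0/sriov_numvfs', 'w') as f:
        f.write('7')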
Re: [PATCH v2 1/3] file-posix: add `aio-max-batch` option
On Wed, Oct 27, 2021 at 03:06:37PM +0200, Kevin Wolf wrote:
> Am 27.10.2021 um 11:23 hat Stefano Garzarella geschrieben:
> > On Wed, Oct 27, 2021 at 06:28:51AM +0200, Markus Armbruster wrote:
> > > Stefano Garzarella writes:
> > > >
> > > > Commit d7ddd0a161 ("linux-aio: limit the batch size using
> > > > `aio-max-batch` parameter") added a way to limit the batch size
> > > > of the Linux AIO backend for the entire AIO context.
> > > >
> > > > The same AIO context can be shared by multiple devices, so
> > > > latency-sensitive devices may want to limit the batch size even
> > > > more to avoid increasing latency.
> > > >
> > > > For this reason we add the `aio-max-batch` option to the file
> > > > backend, which will be used by the next commits to limit the size of
> > > > batches including requests generated by this device.
> > > >
> > > > Suggested-by: Kevin Wolf
> > > > Reviewed-by: Kevin Wolf
> > > > Signed-off-by: Stefano Garzarella
> > > > ---
> > > >
> > > > Notes:
> > > >     v2:
> > > >     - @aio-max-batch documentation rewrite [Stefan, Kevin]
> > > >
> > > >  qapi/block-core.json | 7 +++++++
> > > >  block/file-posix.c   | 9 +++++++++
> > > >  2 files changed, 16 insertions(+)
> > > >
> > > > diff --git a/qapi/block-core.json b/qapi/block-core.json
> > > > index 6d3217abb6..fef76b0ea2 100644
> > > > --- a/qapi/block-core.json
> > > > +++ b/qapi/block-core.json
> > > > @@ -2896,6 +2896,12 @@
> > > >  #                 for this device (default: none, forward the commands via SG_IO;
> > > >  #                 since 2.11)
> > > >  # @aio: AIO backend (default: threads) (since: 2.8)
> > > > +# @aio-max-batch: maximum number of requests to batch together into a single
> > > > +#                 submission in the AIO backend. The smallest value between
> > > > +#                 this and the aio-max-batch value of the IOThread object is
> > > > +#                 chosen.
> > > > +#                 0 means that the AIO backend will handle it automatically.
> > > > +#                 (default: 0, since 6.2)
> > >
> > > "(default 0) (since 6.2)" seems to be more common.
> >
> > Indeed I wasn't sure, so I followed @drop-cache, the last one added in
> > @BlockdevOptionsFile.
>
> Actually, I think your style is more common, both globally and in block-*:
>
>   $ git grep -i '[,;] since' qapi/ | wc -l
>   17
>   $ git grep -i '[,;] since' qapi/block* | wc -l
>   12
>
> Compared to:
>
>   $ git grep -i ') (since' qapi/ | wc -l
>   14
>   $ git grep -i ') (since' qapi/block* | wc -l
>   7

Thanks for checking!

> Also a few instances with "(since: ...; default: ...)", but none in
> that order with separate brackets.
>
> So I'd rather merge this version if this is the only comment.

Honestly I don't have a strong opinion. If Markus agrees, I think we
can merge this version.

Thanks,
Stefano
Re: [PATCH v2 1/3] file-posix: add `aio-max-batch` option
Am 27.10.2021 um 11:23 hat Stefano Garzarella geschrieben:
> On Wed, Oct 27, 2021 at 06:28:51AM +0200, Markus Armbruster wrote:
> > Stefano Garzarella writes:
> > >
> > > Commit d7ddd0a161 ("linux-aio: limit the batch size using
> > > `aio-max-batch` parameter") added a way to limit the batch size
> > > of the Linux AIO backend for the entire AIO context.
> > >
> > > The same AIO context can be shared by multiple devices, so
> > > latency-sensitive devices may want to limit the batch size even
> > > more to avoid increasing latency.
> > >
> > > For this reason we add the `aio-max-batch` option to the file
> > > backend, which will be used by the next commits to limit the size of
> > > batches including requests generated by this device.
> > >
> > > Suggested-by: Kevin Wolf
> > > Reviewed-by: Kevin Wolf
> > > Signed-off-by: Stefano Garzarella
> > > ---
> > >
> > > Notes:
> > >     v2:
> > >     - @aio-max-batch documentation rewrite [Stefan, Kevin]
> > >
> > >  qapi/block-core.json | 7 +++++++
> > >  block/file-posix.c   | 9 +++++++++
> > >  2 files changed, 16 insertions(+)
> > >
> > > diff --git a/qapi/block-core.json b/qapi/block-core.json
> > > index 6d3217abb6..fef76b0ea2 100644
> > > --- a/qapi/block-core.json
> > > +++ b/qapi/block-core.json
> > > @@ -2896,6 +2896,12 @@
> > >  #                 for this device (default: none, forward the commands via SG_IO;
> > >  #                 since 2.11)
> > >  # @aio: AIO backend (default: threads) (since: 2.8)
> > > +# @aio-max-batch: maximum number of requests to batch together into a single
> > > +#                 submission in the AIO backend. The smallest value between
> > > +#                 this and the aio-max-batch value of the IOThread object is
> > > +#                 chosen.
> > > +#                 0 means that the AIO backend will handle it automatically.
> > > +#                 (default: 0, since 6.2)
> >
> > "(default 0) (since 6.2)" seems to be more common.
>
> Indeed I wasn't sure, so I followed @drop-cache, the last one added in
> @BlockdevOptionsFile.

Actually, I think your style is more common, both globally and in block-*:

$ git grep -i '[,;] since' qapi/ | wc -l
17
$ git grep -i '[,;] since' qapi/block* | wc -l
12

Compared to:

$ git grep -i ') (since' qapi/ | wc -l
14
$ git grep -i ') (since' qapi/block* | wc -l
7

Also a few instances with "(since: ...; default: ...)", but none in that
order with separate brackets.

So I'd rather merge this version if this is the only comment.

Kevin
Re: [PATCH v5 6/8] iotests/300: avoid abnormal shutdown race condition
Am 26.10.2021 um 19:56 hat John Snow geschrieben:
> Wait for the destination VM to close itself instead of racing to shut it
> down first, which produces different error log messages from AQMP
> depending on precisely when we tried to shut it down.
>
> (For example: We may try to issue 'quit' immediately prior to the target
> VM closing its QMP socket, which will cause an ECONNRESET error to be
> logged. Waiting for the VM to exit itself avoids the race on shutdown
> behavior.)
>
> Reported-by: Hanna Reitz
> Signed-off-by: John Snow

I think this will still expose the race I described in my comment on
patch 2.

Kevin
Re: [RFC 0/2] tls: add macros for coroutine-safe TLS variables
On Tue, Oct 26, 2021 at 10:10:44AM -0700, Richard Henderson wrote:
> On 10/26/21 9:34 AM, Stefan Hajnoczi wrote:
> > On Tue, Oct 26, 2021 at 08:10:16AM -0700, Richard Henderson wrote:
> > > On 10/26/21 6:22 AM, Stefan Hajnoczi wrote:
> > > > If "safe" TLS variables are opt-in then we'll likely have obscure bugs
> > > > when code changes to access a TLS variable that was previously never
> > > > accessed from a coroutine. There is no compiler error and no way to
> > > > detect this. When it happens debugging it is painful.
> > >
> > > Co-routines are never used in user-only builds.
> >
> > If developers have the choice of using __thread then bugs can slip
> > through.
>
> Huh? How.

No, really. If there is no checkpatch.pl error then more instances of
__thread will slip in. Not everyone in the QEMU community will be aware
of this issue, so it's likely that code with __thread will get merged.

Subsystems that use coroutines today include block, 9p, mpqemu, io
channels, migration, colo, and monitor commands. I understand that
qemu-user is particularly unlikely to use coroutines.

Thomas' suggestion sounds good to me. Let's allow __thread only in
subsystems where it's safe.

Stefan
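To make the suggestion concrete, a rough Python sketch of the kind of tree-wide check being discussed; the real rule would live in checkpatch.pl, and the subsystem list here is an assumption for illustration only:

    import pathlib
    import re
    import sys

    # Assumed-for-illustration list of coroutine-using subsystems
    # (see above: block, 9p, mpqemu, io channels, migration, colo,
    # monitor).
    CO_SUBSYSTEMS = ['block', 'hw/9pfs', 'io', 'migration', 'monitor']

    hits = []
    for sub in CO_SUBSYSTEMS:
        for src in pathlib.Path(sub).rglob('*.[ch]'):
            text = src.read_text(errors='ignore')
            for lineno, line in enumerate(text.splitlines(), start=1):
                if re.search(r'\b__thread\b', line):
                    hits.append(f'{src}:{lineno}: {line.strip()}')

    for hit in hits:
        print(hit)  # each raw __thread in a coroutine-capable subsystem
    sys.exit(1 if hits else 0)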
[PATCH v8 8/8] hmp: add virtio commands
From: Laurent Vivier

This patch implements the HMP versions of the virtio QMP commands.

Signed-off-by: Jonah Palmer
---
 docs/system/monitor.rst |   2 +
 hmp-commands-virtio.hx  | 250 ++++++++++++++++++
 hmp-commands.hx         |  10 ++
 hw/virtio/virtio.c      | 355 ++++++++++++++++++++++++
 include/monitor/hmp.h   |   5 +
 meson.build             |   1 +
 monitor/misc.c          |  17 +++
 7 files changed, 640 insertions(+)
 create mode 100644 hmp-commands-virtio.hx

diff --git a/docs/system/monitor.rst b/docs/system/monitor.rst
index ff5c434..10418fc 100644
--- a/docs/system/monitor.rst
+++ b/docs/system/monitor.rst
@@ -21,6 +21,8 @@ The following commands are available:
 
 .. hxtool-doc:: hmp-commands.hx
 
+.. hxtool-doc:: hmp-commands-virtio.hx
+
 .. hxtool-doc:: hmp-commands-info.hx
 
 Integer expressions
diff --git a/hmp-commands-virtio.hx b/hmp-commands-virtio.hx
new file mode 100644
index 0000000..36aab94
--- /dev/null
+++ b/hmp-commands-virtio.hx
@@ -0,0 +1,250 @@
+HXCOMM Use DEFHEADING() to define headings in both help text and rST.
+HXCOMM Text between SRST and ERST is copied to the rST version and
+HXCOMM discarded from C version.
+HXCOMM
+HXCOMM DEF(command, args, callback, arg_string, help) is used to construct
+HXCOMM monitor info commands.
+HXCOMM
+HXCOMM HXCOMM can be used for comments, discarded from both rST and C.
+HXCOMM
+HXCOMM In this file, generally SRST fragments should have two extra
+HXCOMM spaces of indent, so that the documentation list item for "virtio cmd"
+HXCOMM appears inside the documentation list item for the top level
+HXCOMM "virtio" documentation entry. The exception is the first SRST
+HXCOMM fragment that defines that top level entry.
+
+SRST
+``virtio`` *subcommand*
+  Show various information about virtio
+
+  Example:
+
+  List all sub-commands::
+
+    (qemu) virtio
+    virtio query  -- List all available virtio devices
+    virtio status path -- Display status of a given virtio device
+    virtio queue-status path queue -- Display status of a given virtio queue
+    virtio vhost-queue-status path queue -- Display status of a given vhost queue
+    virtio queue-element path queue [index] -- Display element of a given virtio queue
+
+ERST
+
+    {
+        .name       = "query",
+        .args_type  = "",
+        .params     = "",
+        .help       = "List all available virtio devices",
+        .cmd        = hmp_virtio_query,
+        .flags      = "p",
+    },
+
+SRST
+  ``virtio query``
+    List all available virtio devices
+
+    Example:
+
+    List all available virtio devices in the machine::
+
+      (qemu) virtio query
+      /machine/peripheral/vsock0/virtio-backend [vhost-vsock]
+      /machine/peripheral/crypto0/virtio-backend [virtio-crypto]
+      /machine/peripheral-anon/device[2]/virtio-backend [virtio-scsi]
+      /machine/peripheral-anon/device[1]/virtio-backend [virtio-net]
+      /machine/peripheral-anon/device[0]/virtio-backend [virtio-serial]
+
+ERST
+
+    {
+        .name       = "status",
+        .args_type  = "path:s",
+        .params     = "path",
+        .help       = "Display status of a given virtio device",
+        .cmd        = hmp_virtio_status,
+        .flags      = "p",
+    },
+
+SRST
+  ``virtio status`` *path*
+    Display status of a given virtio device
+
+    Example:
+
+    Dump the status of virtio-net (vhost on)::
+
+      (qemu) virtio status /machine/peripheral-anon/device[1]/virtio-backend
+      /machine/peripheral-anon/device[1]/virtio-backend:
+        device_name:             virtio-net (vhost)
+        device_id:               1
+        vhost_started:           true
+        bus_name:                (null)
+        broken:                  false
+        disabled:                false
+        disable_legacy_check:    false
+        started:                 true
+        use_started:             true
+        start_on_kick:           false
+        use_guest_notifier_mask: true
+        vm_running:              true
+        num_vqs:                 3
+        queue_sel:               2
+        isr:                     1
+        endianness:              little
+        status: acknowledge, driver, features-ok, driver-ok
+        Guest features:   event-idx, indirect-desc, version-1
+                          ctrl-mac-addr, guest-announce, ctrl-vlan, ctrl-rx, ctrl-vq, status, mrg-rxbuf,
+                          host-ufo, host-ecn, host-tso6, host-tso4, guest-ufo, guest-ecn, guest-tso6,
+                          guest-tso4, mac, ctrl-guest-offloads, guest-csum, csum
+        Host features:    protocol-features, event-idx, indirect-desc, version-1, any-layout, notify-on-empty
+                          gso, ctrl-mac-addr, guest-announce, ctrl-rx-extra, ctrl-vlan, ctrl-rx, ctrl-vq,
+                          status, mrg-rxbuf, host-ufo, host-ecn, host-tso6, host-tso4, guest-ufo, guest-ecn,
+                          guest-tso6, guest-tso4, mac, ctrl-guest-offloads, guest-csum, csum
+        Backend features: protocol-features, event-idx, indirect-desc, version-1, any-layout, notify-on-empty
+                          gso, ctrl-mac-addr, guest-announce, ctrl-rx-ex
Re: [PATCH v8 5/8] qmp: decode feature & status bits in virtio-status
On 27/10/2021 13:59, David Hildenbrand wrote:
> On 27.10.21 13:41, Jonah Palmer wrote:
> > From: Laurent Vivier
> >
> > Display feature names instead of bitmaps for host, guest, and
> > backend for VirtIODevice.
> >
> > Display status names instead of bitmaps for VirtIODevice.
> >
> > Display feature names instead of bitmaps for backend, protocol,
> > acked, and features (hdev->features) for vhost devices.
> >
> > Decode features according to device type. Decode status
> > according to configuration status bitmap (config_status_map).
> > Decode vhost user protocol features according to vhost user
> > protocol bitmap (vhost_user_protocol_map).
> >
> > Transport features are on the first line. Undecoded bits
> > (if any) are stored in a separate field. The vhost device field
> > won't show if there's no vhost active for a given VirtIODevice.
> >
> > Signed-off-by: Jonah Palmer
> > ---
> >  hw/block/virtio-blk.c          |  28 ++
> >  hw/char/virtio-serial-bus.c    |  11 +
> >  hw/display/virtio-gpu-base.c   |  18 +-
> >  hw/input/virtio-input.c        |  11 +-
> >  hw/net/virtio-net.c            |  47 ++++
> >  hw/scsi/virtio-scsi.c          |  17 ++
> >  hw/virtio/vhost-user-fs.c      |  10 +
> >  hw/virtio/vhost-vsock-common.c |  10 +
> >  hw/virtio/virtio-balloon.c     |  14 +
> >  hw/virtio/virtio-crypto.c      |  10 +
> >  hw/virtio/virtio-iommu.c       |  14 +
> >  hw/virtio/virtio.c             | 273 ++++++++++++++-
> >  include/hw/virtio/vhost.h      |   3 +
> >  include/hw/virtio/virtio.h     |  17 ++
> >  qapi/virtio.json               | 577 ++++++++++++++++++++++-----
>
> Any particular reason we're not handling virtio-mem? (there is only a
> single feature bit so far, a second one to be introduced soon)

I think this is because v1 of the series was written in March 2020 and
it was not updated when virtio-mem was added (June 2020).

Thanks,
Laurent
Re: [PATCH v8 5/8] qmp: decode feature & status bits in virtio-status
On 27.10.21 13:41, Jonah Palmer wrote:
> From: Laurent Vivier
>
> Display feature names instead of bitmaps for host, guest, and
> backend for VirtIODevice.
>
> Display status names instead of bitmaps for VirtIODevice.
>
> Display feature names instead of bitmaps for backend, protocol,
> acked, and features (hdev->features) for vhost devices.
>
> Decode features according to device type. Decode status
> according to configuration status bitmap (config_status_map).
> Decode vhost user protocol features according to vhost user
> protocol bitmap (vhost_user_protocol_map).
>
> Transport features are on the first line. Undecoded bits
> (if any) are stored in a separate field. The vhost device field
> won't show if there's no vhost active for a given VirtIODevice.
>
> Signed-off-by: Jonah Palmer
> ---
>  hw/block/virtio-blk.c          |  28 ++
>  hw/char/virtio-serial-bus.c    |  11 +
>  hw/display/virtio-gpu-base.c   |  18 +-
>  hw/input/virtio-input.c        |  11 +-
>  hw/net/virtio-net.c            |  47 ++++
>  hw/scsi/virtio-scsi.c          |  17 ++
>  hw/virtio/vhost-user-fs.c      |  10 +
>  hw/virtio/vhost-vsock-common.c |  10 +
>  hw/virtio/virtio-balloon.c     |  14 +
>  hw/virtio/virtio-crypto.c      |  10 +
>  hw/virtio/virtio-iommu.c       |  14 +
>  hw/virtio/virtio.c             | 273 ++++++++++++++-
>  include/hw/virtio/vhost.h      |   3 +
>  include/hw/virtio/virtio.h     |  17 ++
>  qapi/virtio.json               | 577 ++++++++++++++++++++++-----

Any particular reason we're not handling virtio-mem? (there is only a
single feature bit so far, a second one to be introduced soon)

--
Thanks,

David / dhildenb
Re: [PATCH v8 0/8] hmp, qmp: Add commands to introspect virtio devices
On Wed, Oct 27, 2021 at 07:41:41AM -0400, Jonah Palmer wrote:
> This series introduces new QMP/HMP commands to dump the status of a
> virtio device at different levels.
>
> [Jonah: Rebasing previous patchset from Oct. 5 (v7). Original patches
>  are from Laurent Vivier from May 2020.
>
>  Rebase from v7 to v8 includes an additional assert to make sure
>  we're not returning NULL in virtio_id_to_name(). Rebase also
>  includes minor additions/edits to qapi/virtio.json.]
>
> 1. Main command
>
> HMP Only:
>
>     virtio [subcommand]
>
> Example:
>
> List all sub-commands:
>
>     (qemu) virtio
>     virtio query  -- List all available virtio devices
>     virtio status path -- Display status of a given virtio device
>     virtio queue-status path queue -- Display status of a given virtio queue
>     virtio vhost-queue-status path queue -- Display status of a given vhost queue
>     virtio queue-element path queue [index] -- Display element of a given virtio queue

I don't see a compelling reason why these are set up as sub-commands
under a new "virtio" top level. This HMP approach and the QMP
'x-debug-query' naming just feels needlessly different from the current
QEMU practices.

IMHO they should just be "info" subcommands for HMP, i.e.

  info virtio -- List all available virtio devices
  info virtio-status path -- Display status of a given virtio device
  info virtio-queue-status path queue -- Display status of a given virtio queue
  info virtio-vhost-queue-status path queue -- Display status of a given vhost queue
  info virtio-queue-element path queue [index] -- Display element of a given virtio queue

While the corresponding QMP commands ought to be

  x-query-virtio
  x-query-virtio-status
  x-query-virtio-queue-status
  x-query-virtio-vhost-queue-status
  x-query-virtio-queue-element

Regards,
Daniel

--
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
[PATCH v8 6/8] qmp: add QMP commands for virtio/vhost queue-status
From: Laurent Vivier

These new commands show the internal status of a VirtIODevice's
VirtQueue and a vhost device's vhost_virtqueue (if active).

Signed-off-by: Jonah Palmer
---
 hw/virtio/virtio-stub.c |  14 +++
 hw/virtio/virtio.c      | 103 +++++++++++
 qapi/virtio.json        | 268 ++++++++++++++++++++++
 3 files changed, 385 insertions(+)

diff --git a/hw/virtio/virtio-stub.c b/hw/virtio/virtio-stub.c
index ddb592f..387803d 100644
--- a/hw/virtio/virtio-stub.c
+++ b/hw/virtio/virtio-stub.c
@@ -17,3 +17,17 @@ VirtioStatus *qmp_x_debug_virtio_status(const char* path, Error **errp)
 {
     return qmp_virtio_unsupported(errp);
 }
+
+VirtVhostQueueStatus *qmp_x_debug_virtio_vhost_queue_status(const char *path,
+                                                            uint16_t queue,
+                                                            Error **errp)
+{
+    return qmp_virtio_unsupported(errp);
+}
+
+VirtQueueStatus *qmp_x_debug_virtio_queue_status(const char *path,
+                                                 uint16_t queue,
+                                                 Error **errp)
+{
+    return qmp_virtio_unsupported(errp);
+}
diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index 5bac549..7fd98c5 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -4284,6 +4284,109 @@ VirtioStatus *qmp_x_debug_virtio_status(const char *path, Error **errp)
     return status;
 }
 
+VirtVhostQueueStatus *qmp_x_debug_virtio_vhost_queue_status(const char *path,
+                                                            uint16_t queue,
+                                                            Error **errp)
+{
+    VirtIODevice *vdev;
+    VirtVhostQueueStatus *status;
+
+    vdev = virtio_device_find(path);
+    if (vdev == NULL) {
+        error_setg(errp, "Path %s is not a VirtIODevice", path);
+        return NULL;
+    }
+
+    if (!vdev->vhost_started) {
+        error_setg(errp, "Error: vhost device has not started yet");
+        return NULL;
+    }
+
+    VirtioDeviceClass *vdc = VIRTIO_DEVICE_GET_CLASS(vdev);
+    struct vhost_dev *hdev = vdc->get_vhost(vdev);
+
+    if (queue < hdev->vq_index || queue >= hdev->vq_index + hdev->nvqs) {
+        error_setg(errp, "Invalid vhost virtqueue number %d", queue);
+        return NULL;
+    }
+
+    status = g_new0(VirtVhostQueueStatus, 1);
+    status->device_name = g_strdup(vdev->name);
+    status->kick = hdev->vqs[queue].kick;
+    status->call = hdev->vqs[queue].call;
+    status->desc = (uint64_t)(unsigned long)hdev->vqs[queue].desc;
+    status->avail = (uint64_t)(unsigned long)hdev->vqs[queue].avail;
+    status->used = (uint64_t)(unsigned long)hdev->vqs[queue].used;
+    status->num = hdev->vqs[queue].num;
+    status->desc_phys = hdev->vqs[queue].desc_phys;
+    status->desc_size = hdev->vqs[queue].desc_size;
+    status->avail_phys = hdev->vqs[queue].avail_phys;
+    status->avail_size = hdev->vqs[queue].avail_size;
+    status->used_phys = hdev->vqs[queue].used_phys;
+    status->used_size = hdev->vqs[queue].used_size;
+
+    return status;
+}
+
+VirtQueueStatus *qmp_x_debug_virtio_queue_status(const char *path,
+                                                 uint16_t queue,
+                                                 Error **errp)
+{
+    VirtIODevice *vdev;
+    VirtQueueStatus *status;
+
+    vdev = virtio_device_find(path);
+    if (vdev == NULL) {
+        error_setg(errp, "Path %s is not a VirtIODevice", path);
+        return NULL;
+    }
+
+    if (queue >= VIRTIO_QUEUE_MAX || !virtio_queue_get_num(vdev, queue)) {
+        error_setg(errp, "Invalid virtqueue number %d", queue);
+        return NULL;
+    }
+
+    status = g_new0(VirtQueueStatus, 1);
+    status->device_name = g_strdup(vdev->name);
+    status->queue_index = vdev->vq[queue].queue_index;
+    status->inuse = vdev->vq[queue].inuse;
+    status->vring_num = vdev->vq[queue].vring.num;
+    status->vring_num_default = vdev->vq[queue].vring.num_default;
+    status->vring_align = vdev->vq[queue].vring.align;
+    status->vring_desc = vdev->vq[queue].vring.desc;
+    status->vring_avail = vdev->vq[queue].vring.avail;
+    status->vring_used = vdev->vq[queue].vring.used;
+    status->used_idx = vdev->vq[queue].used_idx;
+    status->signalled_used = vdev->vq[queue].signalled_used;
+    status->signalled_used_valid = vdev->vq[queue].signalled_used_valid;
+
+    if (vdev->vhost_started) {
+        VirtioDeviceClass *vdc = VIRTIO_DEVICE_GET_CLASS(vdev);
+        struct vhost_dev *hdev = vdc->get_vhost(vdev);
+
+        /* check if vq index exists for vhost as well */
+        if (queue >= hdev->vq_index && queue < hdev->vq_index + hdev->nvqs) {
+            status->has_last_avail_idx = true;
+
+            int vhost_vq_index =
+                hdev->vhost_ops->vhost_get_vq_index(hdev, queue);
+            struct vhost_vring_state state = {
+                .index = vhost_vq_index,
+            };
+
+            stat
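For a sense of how these surface to users, a minimal Python sketch of calling the new commands via the in-tree QMP client; the socket and device paths are placeholders:

    from qemu.qmp import QEMUMonitorProtocol

    qmp = QEMUMonitorProtocol('/tmp/qmp.sock')   # placeholder socket path
    qmp.connect()

    dev = '/machine/peripheral-anon/device[1]/virtio-backend'  # placeholder
    print(qmp.command('x-debug-virtio-queue-status', path=dev, queue=0))
    # Only meaningful while vhost is active for this device:
    print(qmp.command('x-debug-virtio-vhost-queue-status', path=dev, queue=0))

    qmp.close()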
[PATCH v8 4/8] qmp: add QMP command x-debug-virtio-status
From: Laurent Vivier

This new command shows the status of a VirtIODevice, including its
corresponding vhost device status (if active).

Next patch will improve output by decoding feature bits, including
vhost device's feature bits (backend, protocol, acked, and features).
It will also decode status bits of a VirtIODevice.

Next patch will also suppress the vhost device field from displaying
if no vhost device is active for a given VirtIODevice.

Signed-off-by: Jonah Palmer
---
 hw/virtio/virtio-stub.c |   5 +
 hw/virtio/virtio.c      |  96 ++++++++++
 qapi/virtio.json        | 255 ++++++++++++++++++++++
 3 files changed, 356 insertions(+)

diff --git a/hw/virtio/virtio-stub.c b/hw/virtio/virtio-stub.c
index d4a88f5..ddb592f 100644
--- a/hw/virtio/virtio-stub.c
+++ b/hw/virtio/virtio-stub.c
@@ -12,3 +12,8 @@ VirtioInfoList *qmp_x_debug_query_virtio(Error **errp)
 {
     return qmp_virtio_unsupported(errp);
 }
+
+VirtioStatus *qmp_x_debug_virtio_status(const char* path, Error **errp)
+{
+    return qmp_virtio_unsupported(errp);
+}
diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index ad17be7..8d13d27 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -3933,6 +3933,102 @@ VirtioInfoList *qmp_x_debug_query_virtio(Error **errp)
     return list;
 }
 
+static VirtIODevice *virtio_device_find(const char *path)
+{
+    VirtIODevice *vdev;
+
+    QTAILQ_FOREACH(vdev, &virtio_list, next) {
+        DeviceState *dev = DEVICE(vdev);
+
+        if (strcmp(dev->canonical_path, path) != 0) {
+            continue;
+        }
+        return vdev;
+    }
+
+    return NULL;
+}
+
+VirtioStatus *qmp_x_debug_virtio_status(const char *path, Error **errp)
+{
+    VirtIODevice *vdev;
+    VirtioStatus *status;
+
+    vdev = virtio_device_find(path);
+    if (vdev == NULL) {
+        error_setg(errp, "Path %s is not a VirtIO device", path);
+        return NULL;
+    }
+
+    status = g_new0(VirtioStatus, 1);
+    status->vhost_dev = g_new0(VhostStatus, 1);
+    status->name = g_strdup(vdev->name);
+    status->device_id = vdev->device_id;
+    status->vhost_started = vdev->vhost_started;
+    status->guest_features = vdev->guest_features;
+    status->host_features = vdev->host_features;
+    status->backend_features = vdev->backend_features;
+
+    switch (vdev->device_endian) {
+    case VIRTIO_DEVICE_ENDIAN_LITTLE:
+        status->device_endian = VIRTIO_STATUS_ENDIANNESS_LITTLE;
+        break;
+    case VIRTIO_DEVICE_ENDIAN_BIG:
+        status->device_endian = VIRTIO_STATUS_ENDIANNESS_BIG;
+        break;
+    default:
+        status->device_endian = VIRTIO_STATUS_ENDIANNESS_UNKNOWN;
+        break;
+    }
+
+    status->num_vqs = virtio_get_num_queues(vdev);
+    status->status = vdev->status;
+    status->isr = vdev->isr;
+    status->queue_sel = vdev->queue_sel;
+    status->vm_running = vdev->vm_running;
+    status->broken = vdev->broken;
+    status->disabled = vdev->disabled;
+    status->use_started = vdev->use_started;
+    status->started = vdev->started;
+    status->start_on_kick = vdev->start_on_kick;
+    status->disable_legacy_check = vdev->disable_legacy_check;
+    status->bus_name = g_strdup(vdev->bus_name);
+    status->use_guest_notifier_mask = vdev->use_guest_notifier_mask;
+
+    if (vdev->vhost_started) {
+        VirtioDeviceClass *vdc = VIRTIO_DEVICE_GET_CLASS(vdev);
+        struct vhost_dev *hdev = vdc->get_vhost(vdev);
+
+        status->vhost_dev->n_mem_sections = hdev->n_mem_sections;
+        status->vhost_dev->n_tmp_sections = hdev->n_tmp_sections;
+        status->vhost_dev->nvqs = hdev->nvqs;
+        status->vhost_dev->vq_index = hdev->vq_index;
+        status->vhost_dev->features = hdev->features;
+        status->vhost_dev->acked_features = hdev->acked_features;
+        status->vhost_dev->backend_features = hdev->backend_features;
+        status->vhost_dev->protocol_features = hdev->protocol_features;
+        status->vhost_dev->max_queues = hdev->max_queues;
+        status->vhost_dev->backend_cap = hdev->backend_cap;
+        status->vhost_dev->log_enabled = hdev->log_enabled;
+        status->vhost_dev->log_size = hdev->log_size;
+    } else {
+        status->vhost_dev->n_mem_sections = 0;
+        status->vhost_dev->n_tmp_sections = 0;
+        status->vhost_dev->nvqs = 0;
+        status->vhost_dev->vq_index = 0;
+        status->vhost_dev->features = 0;
+        status->vhost_dev->acked_features = 0;
+        status->vhost_dev->backend_features = 0;
+        status->vhost_dev->protocol_features = 0;
+        status->vhost_dev->max_queues = 0;
+        status->vhost_dev->backend_cap = 0;
+        status->vhost_dev->log_enabled = false;
+        status->vhost_dev->log_size = 0;
+    }
+
+    return status;
+}
+
 static const TypeInfo virtio_device_info = {
     .name = TYPE_VIRTIO_DEVICE,
     .parent = TYPE_DEVICE,
diff --git a/qapi/virtio.json b/qapi/virtio.json
index 4490c2c..656a26f 100644
--- a/qapi/virtio.json
+++ b/qapi/vi
[PATCH v8 3/8] qmp: add QMP command x-debug-query-virtio
From: Laurent Vivier

This new command lists all the instances of VirtIODevice with their
QOM paths and virtio type/name.

Signed-off-by: Jonah Palmer
---
 hw/virtio/meson.build      |  2 ++
 hw/virtio/virtio-stub.c    | 14 ++++++
 hw/virtio/virtio.c         | 27 +++++++++++
 include/hw/virtio/virtio.h |  1 +
 qapi/meson.build           |  1 +
 qapi/qapi-schema.json      |  1 +
 qapi/virtio.json           | 67 ++++++++++++++++++++++++
 tests/qtest/qmp-cmd-test.c |  1 +
 8 files changed, 114 insertions(+)
 create mode 100644 hw/virtio/virtio-stub.c
 create mode 100644 qapi/virtio.json

diff --git a/hw/virtio/meson.build b/hw/virtio/meson.build
index 521f7d6..d893f5f 100644
--- a/hw/virtio/meson.build
+++ b/hw/virtio/meson.build
@@ -6,8 +6,10 @@ softmmu_virtio_ss.add(when: 'CONFIG_VHOST', if_false: files('vhost-stub.c'))
 
 softmmu_ss.add_all(when: 'CONFIG_VIRTIO', if_true: softmmu_virtio_ss)
 softmmu_ss.add(when: 'CONFIG_VIRTIO', if_false: files('vhost-stub.c'))
+softmmu_ss.add(when: 'CONFIG_VIRTIO', if_false: files('virtio-stub.c'))
 
 softmmu_ss.add(when: 'CONFIG_ALL', if_true: files('vhost-stub.c'))
+softmmu_ss.add(when: 'CONFIG_ALL', if_true: files('virtio-stub.c'))
 
 virtio_ss = ss.source_set()
 virtio_ss.add(files('virtio.c'))
diff --git a/hw/virtio/virtio-stub.c b/hw/virtio/virtio-stub.c
new file mode 100644
index 0000000..d4a88f5
--- /dev/null
+++ b/hw/virtio/virtio-stub.c
@@ -0,0 +1,14 @@
+#include "qemu/osdep.h"
+#include "qapi/error.h"
+#include "qapi/qapi-commands-virtio.h"
+
+static void *qmp_virtio_unsupported(Error **errp)
+{
+    error_setg(errp, "Virtio is disabled");
+    return NULL;
+}
+
+VirtioInfoList *qmp_x_debug_query_virtio(Error **errp)
+{
+    return qmp_virtio_unsupported(errp);
+}
diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index 7050bd5..ad17be7 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -13,6 +13,8 @@
 
 #include "qemu/osdep.h"
 #include "qapi/error.h"
+#include "qapi/qapi-commands-virtio.h"
+#include "qapi/qapi-visit-virtio.h"
 #include "cpu.h"
 #include "trace.h"
 #include "qemu/error-report.h"
@@ -29,6 +31,9 @@
 #include "sysemu/runstate.h"
 #include "standard-headers/linux/virtio_ids.h"
 
+/* QAPI list of VirtIODevices */
+static QTAILQ_HEAD(, VirtIODevice) virtio_list;
+
 /*
  * The alignment to use between consumer and producer parts of vring.
  * x86 pagesize again. This is the default, used by transports like PCI
@@ -3709,6 +3714,7 @@ static void virtio_device_realize(DeviceState *dev, Error **errp)
     vdev->listener.commit = virtio_memory_listener_commit;
     vdev->listener.name = "virtio";
     memory_listener_register(&vdev->listener, vdev->dma_as);
+    QTAILQ_INSERT_TAIL(&virtio_list, vdev, next);
 }
 
 static void virtio_device_unrealize(DeviceState *dev)
@@ -3723,6 +3729,7 @@ static void virtio_device_unrealize(DeviceState *dev)
         vdc->unrealize(dev);
     }
 
+    QTAILQ_REMOVE(&virtio_list, vdev, next);
     g_free(vdev->bus_name);
     vdev->bus_name = NULL;
 }
@@ -3896,6 +3903,8 @@ static void virtio_device_class_init(ObjectClass *klass, void *data)
     vdc->stop_ioeventfd = virtio_device_stop_ioeventfd_impl;
 
     vdc->legacy_features |= VIRTIO_LEGACY_FEATURES;
+
+    QTAILQ_INIT(&virtio_list);
 }
 
 bool virtio_device_ioeventfd_enabled(VirtIODevice *vdev)
@@ -3906,6 +3915,24 @@ bool virtio_device_ioeventfd_enabled(VirtIODevice *vdev)
     return virtio_bus_ioeventfd_enabled(vbus);
 }
 
+VirtioInfoList *qmp_x_debug_query_virtio(Error **errp)
+{
+    VirtioInfoList *list = NULL;
+    VirtioInfoList *node;
+    VirtIODevice *vdev;
+
+    QTAILQ_FOREACH(vdev, &virtio_list, next) {
+        DeviceState *dev = DEVICE(vdev);
+        node = g_new0(VirtioInfoList, 1);
+        node->value = g_new(VirtioInfo, 1);
+        node->value->path = g_strdup(dev->canonical_path);
+        node->value->type = g_strdup(vdev->name);
+        QAPI_LIST_PREPEND(list, node->value);
+    }
+
+    return list;
+}
+
 static const TypeInfo virtio_device_info = {
     .name = TYPE_VIRTIO_DEVICE,
     .parent = TYPE_DEVICE,
diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
index 105b98c..eceaafc 100644
--- a/include/hw/virtio/virtio.h
+++ b/include/hw/virtio/virtio.h
@@ -110,6 +110,7 @@ struct VirtIODevice
     bool use_guest_notifier_mask;
     AddressSpace *dma_as;
     QLIST_HEAD(, VirtQueue) *vector_queues;
+    QTAILQ_ENTRY(VirtIODevice) next;
 };
 
 struct VirtioDeviceClass {
diff --git a/qapi/meson.build b/qapi/meson.build
index c356a38..df5662e 100644
--- a/qapi/meson.build
+++ b/qapi/meson.build
@@ -45,6 +45,7 @@ qapi_all_modules = [
   'sockets',
   'trace',
   'transaction',
+  'virtio',
   'yank',
 ]
 if have_system
diff --git a/qapi/qapi-schema.json b/qapi/qapi-schema.json
index 4912b97..1512ada 100644
--- a/qapi/qapi-schema.json
+++ b/qapi/qapi-schema.json
@@ -93,3 +93,4 @@
 { 'include': 'audio.json' }
 { 'include': 'acpi.json' }
 { 'include': 'pci.json' }
+{ 'include': 'virt
[PATCH v8 0/8] hmp,qmp: Add commands to introspect virtio devices
This series introduces new QMP/HMP commands to dump the status of a
virtio device at different levels.

[Jonah: Rebasing previous patchset from Oct. 5 (v7). Original patches
 are from Laurent Vivier from May 2020.

 Rebase from v7 to v8 includes an additional assert to make sure
 we're not returning NULL in virtio_id_to_name(). Rebase also
 includes minor additions/edits to qapi/virtio.json.]

1. Main command

HMP Only:

    virtio [subcommand]

Example:

List all sub-commands:

    (qemu) virtio
    virtio query  -- List all available virtio devices
    virtio status path -- Display status of a given virtio device
    virtio queue-status path queue -- Display status of a given virtio queue
    virtio vhost-queue-status path queue -- Display status of a given vhost queue
    virtio queue-element path queue [index] -- Display element of a given virtio queue

2. List available virtio devices in the machine

HMP Form:

    virtio query

Example:

    (qemu) virtio query
    /machine/peripheral/vsock0/virtio-backend [vhost-vsock]
    /machine/peripheral/crypto0/virtio-backend [virtio-crypto]
    /machine/peripheral-anon/device[2]/virtio-backend [virtio-scsi]
    /machine/peripheral-anon/device[1]/virtio-backend [virtio-net]
    /machine/peripheral-anon/device[0]/virtio-backend [virtio-serial]

QMP Form:

    { 'command': 'x-debug-query-virtio', 'returns': ['VirtioInfo'] }

Example:

    -> { "execute": "x-debug-query-virtio" }
    <- { "return": [
           {
             "path": "/machine/peripheral/vsock0/virtio-backend",
             "type": "vhost-vsock"
           },
           {
             "path": "/machine/peripheral/crypto0/virtio-backend",
             "type": "virtio-crypto"
           },
           {
             "path": "/machine/peripheral-anon/device[2]/virtio-backend",
             "type": "virtio-scsi"
           },
           {
             "path": "/machine/peripheral-anon/device[1]/virtio-backend",
             "type": "virtio-net"
           },
           {
             "path": "/machine/peripheral-anon/device[0]/virtio-backend",
             "type": "virtio-serial"
           }
         ]
       }

3. Display status of a given virtio device

HMP Form:

    virtio status <path>

Example:

    (qemu) virtio status /machine/peripheral/vsock0/virtio-backend
    /machine/peripheral/vsock0/virtio-backend:
      device_name:             vhost-vsock (vhost)
      device_id:               19
      vhost_started:           true
      bus_name:                (null)
      broken:                  false
      disabled:                false
      disable_legacy_check:    false
      started:                 true
      use_started:             true
      start_on_kick:           false
      use_guest_notifier_mask: true
      vm_running:              true
      num_vqs:                 3
      queue_sel:               2
      isr:                     0
      endianness:              little
      status: acknowledge, driver, features-ok, driver-ok
      Guest features:   event-idx, indirect-desc, version-1
      Host features:    protocol-features, event-idx, indirect-desc, version-1, any-layout, notify-on-empty
      Backend features:
      VHost:
        nvqs:           2
        vq_index:       0
        max_queues:     0
        n_mem_sections: 4
        n_tmp_sections: 4
        backend_cap:    0
        log_enabled:    false
        log_size:       0
        Features:          event-idx, indirect-desc, version-1, any-layout, notify-on-empty, log-all
        Acked features:    event-idx, indirect-desc, version-1
        Backend features:
        Protocol features:

QMP Form:

    { 'command': 'x-debug-virtio-status',
      'data': { 'path': 'str' },
      'returns': 'VirtioStatus' }

Example:

    -> { "execute": "x-debug-virtio-status",
         "arguments": { "path": "/machine/peripheral/vsock0/virtio-backend" }
       }
    <- { "return": {
           "device-endian": "little",
           "bus-name": "",
           "disable-legacy-check": false,
           "name": "vhost-vsock",
           "started": true,
           "device-id": 19,
           "vhost-dev": {
             "n-tmp-sections": 4,
             "n-mem-sections": 4,
             "max-queues": 0,
             "backend-cap": 0,
             "log-size": 0,
             "backend-features": {
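For completeness, a small Python sketch of driving the same commands from a script via the in-tree QMP client rather than the monitor (socket path is a placeholder):

    from qemu.qmp import QEMUMonitorProtocol

    qmp = QEMUMonitorProtocol('/tmp/qmp.sock')  # placeholder socket path
    qmp.connect()

    # Enumerate virtio devices, then fetch the full status of each one.
    for dev in qmp.command('x-debug-query-virtio'):
        status = qmp.command('x-debug-virtio-status', path=dev['path'])
        print(dev['path'], dev['type'], status['started'])

    qmp.close()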
[PATCH v8 7/8] qmp: add QMP command x-debug-virtio-queue-element
From: Laurent Vivier

This new command shows the information of a VirtQueue element.

Signed-off-by: Jonah Palmer
---
 hw/virtio/virtio-stub.c |   9 +++
 hw/virtio/virtio.c      | 154 ++++++++++++++
 qapi/virtio.json        | 204 ++++++++++++++++++
 3 files changed, 367 insertions(+)

diff --git a/hw/virtio/virtio-stub.c b/hw/virtio/virtio-stub.c
index 387803d..6c282b3 100644
--- a/hw/virtio/virtio-stub.c
+++ b/hw/virtio/virtio-stub.c
@@ -31,3 +31,12 @@ VirtQueueStatus *qmp_x_debug_virtio_queue_status(const char *path,
 {
     return qmp_virtio_unsupported(errp);
 }
+
+VirtioQueueElement *qmp_x_debug_virtio_queue_element(const char *path,
+                                                     uint16_t queue,
+                                                     bool has_index,
+                                                     uint16_t index,
+                                                     Error **errp)
+{
+    return qmp_virtio_unsupported(errp);
+}
diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index 7fd98c5..8c8a987 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -480,6 +480,19 @@ static inline void vring_used_write(VirtQueue *vq, VRingUsedElem *uelem,
     address_space_cache_invalidate(&caches->used, pa, sizeof(VRingUsedElem));
 }
 
+/* Called within rcu_read_lock(). */
+static inline uint16_t vring_used_flags(VirtQueue *vq)
+{
+    VRingMemoryRegionCaches *caches = vring_get_region_caches(vq);
+    hwaddr pa = offsetof(VRingUsed, flags);
+
+    if (!caches) {
+        return 0;
+    }
+
+    return virtio_lduw_phys_cached(vq->vdev, &caches->used, pa);
+}
+
 /* Called within rcu_read_lock(). */
 static uint16_t vring_used_idx(VirtQueue *vq)
 {
@@ -4387,6 +4400,147 @@ VirtQueueStatus *qmp_x_debug_virtio_queue_status(const char *path,
     return status;
 }
 
+static VirtioRingDescFlagsList *qmp_decode_vring_desc_flags(uint16_t flags)
+{
+    VirtioRingDescFlagsList *list = NULL;
+    VirtioRingDescFlagsList *node;
+    int i;
+
+    struct {
+        uint16_t flag;
+        VirtioRingDescFlags value;
+    } map[] = {
+        { VRING_DESC_F_NEXT, VIRTIO_RING_DESC_FLAGS_NEXT },
+        { VRING_DESC_F_WRITE, VIRTIO_RING_DESC_FLAGS_WRITE },
+        { VRING_DESC_F_INDIRECT, VIRTIO_RING_DESC_FLAGS_INDIRECT },
+        { 1 << VRING_PACKED_DESC_F_AVAIL, VIRTIO_RING_DESC_FLAGS_AVAIL },
+        { 1 << VRING_PACKED_DESC_F_USED, VIRTIO_RING_DESC_FLAGS_USED },
+        { 0, -1 }
+    };
+
+    for (i = 0; map[i].flag; i++) {
+        if ((map[i].flag & flags) == 0) {
+            continue;
+        }
+        node = g_malloc0(sizeof(VirtioRingDescFlagsList));
+        node->value = map[i].value;
+        node->next = list;
+        list = node;
+    }
+
+    return list;
+}
+
+VirtioQueueElement *qmp_x_debug_virtio_queue_element(const char *path,
+                                                     uint16_t queue,
+                                                     bool has_index,
+                                                     uint16_t index,
+                                                     Error **errp)
+{
+    VirtIODevice *vdev;
+    VirtQueue *vq;
+    VirtioQueueElement *element = NULL;
+
+    vdev = virtio_device_find(path);
+    if (vdev == NULL) {
+        error_setg(errp, "Path %s is not a VirtIO device", path);
+        return NULL;
+    }
+
+    if (queue >= VIRTIO_QUEUE_MAX || !virtio_queue_get_num(vdev, queue)) {
+        error_setg(errp, "Invalid virtqueue number %d", queue);
+        return NULL;
+    }
+    vq = &vdev->vq[queue];
+
+    if (virtio_vdev_has_feature(vdev, VIRTIO_F_RING_PACKED)) {
+        error_setg(errp, "Packed ring not supported");
+        return NULL;
+    } else {
+        unsigned int head, i, max;
+        VRingMemoryRegionCaches *caches;
+        MemoryRegionCache indirect_desc_cache = MEMORY_REGION_CACHE_INVALID;
+        MemoryRegionCache *desc_cache;
+        VRingDesc desc;
+        VirtioRingDescList *list = NULL;
+        VirtioRingDescList *node;
+        int rc;
+
+        RCU_READ_LOCK_GUARD();
+
+        max = vq->vring.num;
+
+        if (!has_index) {
+            head = vring_avail_ring(vq, vq->last_avail_idx % vq->vring.num);
+        } else {
+            head = vring_avail_ring(vq, index % vq->vring.num);
+        }
+        i = head;
+
+        caches = vring_get_region_caches(vq);
+        if (!caches) {
+            error_setg(errp, "Region caches not initialized");
+            return NULL;
+        }
+        if (caches->desc.len < max * sizeof(VRingDesc)) {
+            error_setg(errp, "Cannot map descriptor ring");
+            return NULL;
+        }
+
+        desc_cache = &caches->desc;
+        vring_split_desc_read(vdev, &desc, desc_cache, i);
+        if (desc.flags & VRING_DESC_F_INDIRECT) {
+            int64_t len;
+            len = address_space_cache_init(&indirect_desc_cache, vdev->dma_as,
+                                           desc.
[PATCH v8 2/8] virtio: add vhost support for virtio devices
This patch adds a get_vhost() callback function for VirtIODevices that
returns the device's corresponding vhost_dev structure if the vhost
device is running.

This patch also adds a vhost_started flag for VirtIODevices.
Previously, a VirtIODevice wouldn't be able to tell if its
corresponding vhost device was active or not.

Signed-off-by: Jonah Palmer
---
 hw/block/vhost-user-blk.c      |  7 +++++
 hw/display/vhost-user-gpu.c    |  7 +++++
 hw/input/vhost-user-input.c    |  7 +++++
 hw/net/virtio-net.c            |  9 +++++
 hw/scsi/vhost-scsi.c           |  8 +++++
 hw/virtio/vhost-user-fs.c      |  7 +++++
 hw/virtio/vhost-user-rng.c     |  7 +++++
 hw/virtio/vhost-vsock-common.c |  7 +++++
 hw/virtio/vhost.c              |  3 +++
 hw/virtio/virtio-crypto.c      | 10 ++++++
 hw/virtio/virtio.c             |  1 +
 include/hw/virtio/virtio.h     |  3 +++
 12 files changed, 76 insertions(+)

diff --git a/hw/block/vhost-user-blk.c b/hw/block/vhost-user-blk.c
index f61f8c1..b059da1 100644
--- a/hw/block/vhost-user-blk.c
+++ b/hw/block/vhost-user-blk.c
@@ -568,6 +568,12 @@ static void vhost_user_blk_instance_init(Object *obj)
                                   "/disk@0,0", DEVICE(obj));
 }
 
+static struct vhost_dev *vhost_user_blk_get_vhost(VirtIODevice *vdev)
+{
+    VHostUserBlk *s = VHOST_USER_BLK(vdev);
+    return &s->dev;
+}
+
 static const VMStateDescription vmstate_vhost_user_blk = {
     .name = "vhost-user-blk",
     .minimum_version_id = 1,
@@ -602,6 +608,7 @@ static void vhost_user_blk_class_init(ObjectClass *klass, void *data)
     vdc->get_features = vhost_user_blk_get_features;
     vdc->set_status = vhost_user_blk_set_status;
     vdc->reset = vhost_user_blk_reset;
+    vdc->get_vhost = vhost_user_blk_get_vhost;
 }
 
 static const TypeInfo vhost_user_blk_info = {
diff --git a/hw/display/vhost-user-gpu.c b/hw/display/vhost-user-gpu.c
index 49df56c..6e93b46 100644
--- a/hw/display/vhost-user-gpu.c
+++ b/hw/display/vhost-user-gpu.c
@@ -565,6 +565,12 @@ vhost_user_gpu_device_realize(DeviceState *qdev, Error **errp)
     g->vhost_gpu_fd = -1;
 }
 
+static struct vhost_dev *vhost_user_gpu_get_vhost(VirtIODevice *vdev)
+{
+    VhostUserGPU *g = VHOST_USER_GPU(vdev);
+    return &g->vhost->dev;
+}
+
 static Property vhost_user_gpu_properties[] = {
     VIRTIO_GPU_BASE_PROPERTIES(VhostUserGPU, parent_obj.conf),
     DEFINE_PROP_END_OF_LIST(),
@@ -586,6 +592,7 @@ vhost_user_gpu_class_init(ObjectClass *klass, void *data)
     vdc->guest_notifier_pending = vhost_user_gpu_guest_notifier_pending;
     vdc->get_config = vhost_user_gpu_get_config;
     vdc->set_config = vhost_user_gpu_set_config;
+    vdc->get_vhost = vhost_user_gpu_get_vhost;
 
     device_class_set_props(dc, vhost_user_gpu_properties);
 }
diff --git a/hw/input/vhost-user-input.c b/hw/input/vhost-user-input.c
index 273e96a..43d2ff3 100644
--- a/hw/input/vhost-user-input.c
+++ b/hw/input/vhost-user-input.c
@@ -79,6 +79,12 @@ static void vhost_input_set_config(VirtIODevice *vdev,
     virtio_notify_config(vdev);
 }
 
+static struct vhost_dev *vhost_input_get_vhost(VirtIODevice *vdev)
+{
+    VHostUserInput *vhi = VHOST_USER_INPUT(vdev);
+    return &vhi->vhost->dev;
+}
+
 static const VMStateDescription vmstate_vhost_input = {
     .name = "vhost-user-input",
     .unmigratable = 1,
@@ -93,6 +99,7 @@ static void vhost_input_class_init(ObjectClass *klass, void *data)
     dc->vmsd = &vmstate_vhost_input;
     vdc->get_config = vhost_input_get_config;
     vdc->set_config = vhost_input_set_config;
+    vdc->get_vhost = vhost_input_get_vhost;
     vic->realize = vhost_input_realize;
     vic->change_active = vhost_input_change_active;
 }
diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index b275acf..2449b9c 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -3610,6 +3610,14 @@ static bool dev_unplug_pending(void *opaque)
     return vdc->primary_unplug_pending(dev);
 }
 
+static struct vhost_dev *virtio_net_get_vhost(VirtIODevice *vdev)
+{
+    VirtIONet *n = VIRTIO_NET(vdev);
+    NetClientState *nc = qemu_get_queue(n->nic);
+    struct vhost_net *net = get_vhost_net(nc->peer);
+    return &net->dev;
+}
+
 static const VMStateDescription vmstate_virtio_net = {
     .name = "virtio-net",
     .minimum_version_id = VIRTIO_NET_VM_VERSION,
@@ -3712,6 +3720,7 @@ static void virtio_net_class_init(ObjectClass *klass, void *data)
     vdc->post_load = virtio_net_post_load_virtio;
     vdc->vmsd = &vmstate_virtio_net_device;
     vdc->primary_unplug_pending = primary_unplug_pending;
+    vdc->get_vhost = virtio_net_get_vhost;
 }
 
 static const TypeInfo virtio_net_info = {
diff --git a/hw/scsi/vhost-scsi.c b/hw/scsi/vhost-scsi.c
index 039caf2..b0a9c45 100644
--- a/hw/scsi/vhost-scsi.c
+++ b/hw/scsi/vhost-scsi.c
@@ -264,6 +264,13 @@ static void vhost_scsi_unrealize(DeviceState *dev)
     virtio_scsi_common_unrealize(dev);
 }
 
+static struct vhost_dev *vhost_scsi_get_vhost(VirtIODevice *vdev)
+{
+    VHostSCSI
[PATCH v8 5/8] qmp: decode feature & status bits in virtio-status
From: Laurent Vivier

Display feature names instead of bitmaps for host, guest, and
backend for VirtIODevice.

Display status names instead of bitmaps for VirtIODevice.

Display feature names instead of bitmaps for backend, protocol,
acked, and features (hdev->features) for vhost devices.

Decode features according to device type. Decode status
according to configuration status bitmap (config_status_map).
Decode vhost user protocol features according to vhost user
protocol bitmap (vhost_user_protocol_map).

Transport features are on the first line. Undecoded bits
(if any) are stored in a separate field. The vhost device field
won't show if there's no vhost active for a given VirtIODevice.

Signed-off-by: Jonah Palmer
---
 hw/block/virtio-blk.c          |  28 ++
 hw/char/virtio-serial-bus.c    |  11 +
 hw/display/virtio-gpu-base.c   |  18 +-
 hw/input/virtio-input.c        |  11 +-
 hw/net/virtio-net.c            |  47 ++++
 hw/scsi/virtio-scsi.c          |  17 ++
 hw/virtio/vhost-user-fs.c      |  10 +
 hw/virtio/vhost-vsock-common.c |  10 +
 hw/virtio/virtio-balloon.c     |  14 +
 hw/virtio/virtio-crypto.c      |  10 +
 hw/virtio/virtio-iommu.c       |  14 +
 hw/virtio/virtio.c             | 273 ++++++++++++++-
 include/hw/virtio/vhost.h      |   3 +
 include/hw/virtio/virtio.h     |  17 ++
 qapi/virtio.json               | 577 ++++++++++++++++++++++-----
 15 files changed, 1015 insertions(+), 45 deletions(-)

diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index 505e574..c2e901f 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -13,6 +13,7 @@
 
 #include "qemu/osdep.h"
 #include "qapi/error.h"
+#include "qapi/qapi-visit-virtio.h"
 #include "qemu/iov.h"
 #include "qemu/module.h"
 #include "qemu/error-report.h"
@@ -32,6 +33,7 @@
 #include "hw/virtio/virtio-bus.h"
 #include "migration/qemu-file-types.h"
 #include "hw/virtio/virtio-access.h"
+#include "standard-headers/linux/vhost_types.h"
 
 /* Config size before the discard support (hide associated config fields) */
 #define VIRTIO_BLK_CFG_SIZE offsetof(struct virtio_blk_config, \
@@ -48,6 +50,32 @@ static const VirtIOFeature feature_sizes[] = {
     {}
 };
 
+qmp_virtio_feature_map_t blk_map[] = {
+#define FEATURE_ENTRY(name) \
+    { VIRTIO_BLK_F_##name, VIRTIO_BLK_FEATURE_##name }
+    FEATURE_ENTRY(SIZE_MAX),
+    FEATURE_ENTRY(SEG_MAX),
+    FEATURE_ENTRY(GEOMETRY),
+    FEATURE_ENTRY(RO),
+    FEATURE_ENTRY(BLK_SIZE),
+    FEATURE_ENTRY(TOPOLOGY),
+    FEATURE_ENTRY(MQ),
+    FEATURE_ENTRY(DISCARD),
+    FEATURE_ENTRY(WRITE_ZEROES),
+#ifndef VIRTIO_BLK_NO_LEGACY
+    FEATURE_ENTRY(BARRIER),
+    FEATURE_ENTRY(SCSI),
+    FEATURE_ENTRY(FLUSH),
+    FEATURE_ENTRY(CONFIG_WCE),
+#endif /* !VIRTIO_BLK_NO_LEGACY */
+#undef FEATURE_ENTRY
+#define FEATURE_ENTRY(name) \
+    { VHOST_F_##name, VIRTIO_BLK_FEATURE_##name }
+    FEATURE_ENTRY(LOG_ALL),
+#undef FEATURE_ENTRY
+    { -1, -1 }
+};
+
 static void virtio_blk_set_config_size(VirtIOBlock *s, uint64_t host_features)
 {
     s->config_size = MAX(VIRTIO_BLK_CFG_SIZE,
diff --git a/hw/char/virtio-serial-bus.c b/hw/char/virtio-serial-bus.c
index 232f4c9..fa57059 100644
--- a/hw/char/virtio-serial-bus.c
+++ b/hw/char/virtio-serial-bus.c
@@ -20,6 +20,7 @@
 
 #include "qemu/osdep.h"
 #include "qapi/error.h"
+#include "qapi/qapi-visit-virtio.h"
 #include "qemu/iov.h"
 #include "qemu/main-loop.h"
 #include "qemu/module.h"
@@ -32,6 +33,16 @@
 #include "hw/virtio/virtio-serial.h"
 #include "hw/virtio/virtio-access.h"
 
+qmp_virtio_feature_map_t serial_map[] = {
+#define FEATURE_ENTRY(name) \
+    { VIRTIO_CONSOLE_F_##name, VIRTIO_SERIAL_FEATURE_##name }
+    FEATURE_ENTRY(SIZE),
+    FEATURE_ENTRY(MULTIPORT),
+    FEATURE_ENTRY(EMERG_WRITE),
+#undef FEATURE_ENTRY
+    { -1, -1 }
+};
+
 static struct VirtIOSerialDevices {
     QLIST_HEAD(, VirtIOSerial) devices;
 } vserdevices;
diff --git a/hw/display/virtio-gpu-base.c b/hw/display/virtio-gpu-base.c
index 5411a7b..a322349 100644
--- a/hw/display/virtio-gpu-base.c
+++ b/hw/display/virtio-gpu-base.c
@@ -12,13 +12,29 @@
  */
 
 #include "qemu/osdep.h"
-
+#include "standard-headers/linux/vhost_types.h"
 #include "hw/virtio/virtio-gpu.h"
 #include "migration/blocker.h"
 #include "qapi/error.h"
+#include "qapi/qapi-visit-virtio.h"
 #include "qemu/error-report.h"
 #include "trace.h"
 
+qmp_virtio_feature_map_t gpu_map[] = {
+#define FEATURE_ENTRY(name) \
+    { VIRTIO_GPU_F_##name, VIRTIO_GPU_FEATURE_##name }
+    FEATURE_ENTRY(VIRGL),
+    FEATURE_ENTRY(EDID),
+    FEATURE_ENTRY(RESOURCE_UUID),
+    FEATURE_ENTRY(RESOURCE_BLOB),
+#undef FEATURE_ENTRY
+#define FEATURE_ENTRY(name) \
+    { VHOST_F_##name, VIRTIO_GPU_FEATURE_##name }
+    FEATURE_ENTRY(LOG_ALL),
+#undef FEATURE_ENTRY
+    { -1, -1 }
+};
+
 void virtio_gpu_base_reset(VirtIOGPUBase *g)
 {
diff --git a/hw/input/virtio-input.c b/hw/input/virtio-input.c
index 5b5398b..b4562a3 100644
--- a/hw/input/virtio-input.c
+++ b/hw/input/virtio-input.c
@@ -6,6 +6,7 @@
 
 #include "qe
[PATCH v8 1/8] virtio: drop name parameter for virtio_init()
This patch drops the name parameter for the virtio_init function. The numeric device ID and the string device ID (name) of a virtio device already come in pairs, but not in a way that lets us map between them. This patch adds that mapping and removes the need for the name parameter in virtio_init().

Signed-off-by: Jonah Palmer
---
 hw/9pfs/virtio-9p-device.c                  |  2 +-
 hw/block/vhost-user-blk.c                   |  2 +-
 hw/block/virtio-blk.c                       |  2 +-
 hw/char/virtio-serial-bus.c                 |  4 +--
 hw/display/virtio-gpu-base.c                |  2 +-
 hw/input/virtio-input.c                     |  3 +-
 hw/net/virtio-net.c                         |  2 +-
 hw/scsi/virtio-scsi.c                       |  3 +-
 hw/virtio/vhost-user-fs.c                   |  3 +-
 hw/virtio/vhost-user-i2c.c                  |  6 +---
 hw/virtio/vhost-user-rng.c                  |  2 +-
 hw/virtio/vhost-user-vsock.c                |  2 +-
 hw/virtio/vhost-vsock-common.c              |  4 +--
 hw/virtio/vhost-vsock.c                     |  2 +-
 hw/virtio/virtio-balloon.c                  |  3 +-
 hw/virtio/virtio-crypto.c                   |  2 +-
 hw/virtio/virtio-iommu.c                    |  3 +-
 hw/virtio/virtio-mem.c                      |  3 +-
 hw/virtio/virtio-pmem.c                     |  3 +-
 hw/virtio/virtio-rng.c                      |  2 +-
 hw/virtio/virtio.c                          | 45 +++--
 include/hw/virtio/vhost-vsock-common.h      |  2 +-
 include/hw/virtio/virtio-gpu.h              |  3 +-
 include/hw/virtio/virtio.h                  |  3 +-
 include/standard-headers/linux/virtio_ids.h |  1 +
 25 files changed, 68 insertions(+), 41 deletions(-)

diff --git a/hw/9pfs/virtio-9p-device.c b/hw/9pfs/virtio-9p-device.c
index 54ee93b..5f522e6 100644
--- a/hw/9pfs/virtio-9p-device.c
+++ b/hw/9pfs/virtio-9p-device.c
@@ -216,7 +216,7 @@ static void virtio_9p_device_realize(DeviceState *dev, Error **errp)
     }

     v->config_size = sizeof(struct virtio_9p_config) + strlen(s->fsconf.tag);
-    virtio_init(vdev, "virtio-9p", VIRTIO_ID_9P, v->config_size);
+    virtio_init(vdev, VIRTIO_ID_9P, v->config_size);
     v->vq = virtio_add_queue(vdev, MAX_REQ, handle_9p_output);
 }

diff --git a/hw/block/vhost-user-blk.c b/hw/block/vhost-user-blk.c
index ba13cb8..f61f8c1 100644
--- a/hw/block/vhost-user-blk.c
+++ b/hw/block/vhost-user-blk.c
@@ -490,7 +490,7 @@ static void vhost_user_blk_device_realize(DeviceState *dev, Error **errp)
         return;
     }

-    virtio_init(vdev, "virtio-blk", VIRTIO_ID_BLOCK,
+    virtio_init(vdev, VIRTIO_ID_BLOCK,
                 sizeof(struct virtio_blk_config));

     s->virtqs = g_new(VirtQueue *, s->num_queues);

diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index f139cd7..505e574 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -1213,7 +1213,7 @@ static void virtio_blk_device_realize(DeviceState *dev, Error **errp)

     virtio_blk_set_config_size(s, s->host_features);

-    virtio_init(vdev, "virtio-blk", VIRTIO_ID_BLOCK, s->config_size);
+    virtio_init(vdev, VIRTIO_ID_BLOCK, s->config_size);

     s->blk = conf->conf.blk;
     s->rq = NULL;

diff --git a/hw/char/virtio-serial-bus.c b/hw/char/virtio-serial-bus.c
index f01ec21..232f4c9 100644
--- a/hw/char/virtio-serial-bus.c
+++ b/hw/char/virtio-serial-bus.c
@@ -1044,8 +1044,8 @@ static void virtio_serial_device_realize(DeviceState *dev, Error **errp)
         VIRTIO_CONSOLE_F_EMERG_WRITE)) {
         config_size = offsetof(struct virtio_console_config, emerg_wr);
     }
-    virtio_init(vdev, "virtio-serial", VIRTIO_ID_CONSOLE,
-                config_size);
+
+    virtio_init(vdev, VIRTIO_ID_CONSOLE, config_size);

     /* Spawn a new virtio-serial bus on which the ports will ride as devices */
     qbus_init(&vser->bus, sizeof(vser->bus), TYPE_VIRTIO_SERIAL_BUS,

diff --git a/hw/display/virtio-gpu-base.c b/hw/display/virtio-gpu-base.c
index c8da480..5411a7b 100644
--- a/hw/display/virtio-gpu-base.c
+++ b/hw/display/virtio-gpu-base.c
@@ -170,7 +170,7 @@ virtio_gpu_base_device_realize(DeviceState *qdev,
     }

     g->virtio_config.num_scanouts = cpu_to_le32(g->conf.max_outputs);
-    virtio_init(VIRTIO_DEVICE(g), "virtio-gpu", VIRTIO_ID_GPU,
+    virtio_init(VIRTIO_DEVICE(g), VIRTIO_ID_GPU,
                 sizeof(struct virtio_gpu_config));

     if (virtio_gpu_virgl_enabled(g->conf)) {

diff --git a/hw/input/virtio-input.c b/hw/input/virtio-input.c
index 54bcb46..5b5398b 100644
--- a/hw/input/virtio-input.c
+++ b/hw/input/virtio-input.c
@@ -257,8 +257,7 @@ static void virtio_input_device_realize(DeviceState *dev, Error **errp)
         vinput->cfg_size += 8;
     assert(vinput->cfg_size <= sizeof(virtio_input_config));

-    virtio_init(vdev, "virtio-input", VIRTIO_ID_INPUT,
-                vinput->cfg_size);
+    virtio_init(vdev, VIRTIO_ID_INPUT, vinput->cfg_size);
     vinput->evt
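The idea behind the series' ID-to-name mapping can be sketched in a few lines (a toy Python model, not QEMU's C implementation; the three IDs shown do match the virtio spec): once a single table keys names by device ID, callers of virtio_init() pass only the numeric ID and the two can never disagree.

VIRTIO_ID_BLOCK = 2
VIRTIO_ID_CONSOLE = 3
VIRTIO_ID_9P = 9

VIRTIO_DEVICE_NAMES = {
    VIRTIO_ID_BLOCK: "virtio-blk",
    VIRTIO_ID_CONSOLE: "virtio-serial",
    VIRTIO_ID_9P: "virtio-9p",
}

def virtio_init(device_id, config_size):
    # The name is derived from the ID instead of being passed in.
    name = VIRTIO_DEVICE_NAMES.get(device_id, "unknown-virtio-device")
    return {"name": name, "device_id": device_id, "config_size": config_size}

print(virtio_init(VIRTIO_ID_BLOCK, 60)["name"])  # virtio-blk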
Re: [PATCH v5 2/8] python/machine: Handle QMP errors on close more meticulously
Am 26.10.2021 um 19:56 hat John Snow geschrieben:
> To use the AQMP backend, Machine just needs to be a little more diligent
> about what happens when closing a QMP connection. The operation is no
> longer a freebie in the async world; it may return errors encountered in
> the async bottom half on incoming message receipt, etc.
>
> (AQMP's disconnect, ultimately, serves as the quiescence point where all
> async contexts are gathered together, and any final errors reported at
> that point.)
>
> Because async QMP continues to check for messages asynchronously, it's
> almost certainly likely that the loop will have exited due to EOF after
> issuing the last 'quit' command. That error will ultimately be bubbled
> up when attempting to close the QMP connection. The manager class here
> then is free to discard it -- if it was expected.
>
> Signed-off-by: John Snow
> Reviewed-by: Hanna Reitz
> ---
>  python/qemu/machine/machine.py | 48 +-
>  1 file changed, 42 insertions(+), 6 deletions(-)
>
> diff --git a/python/qemu/machine/machine.py b/python/qemu/machine/machine.py
> index 0bd40bc2f76..a0cf69786b4 100644
> --- a/python/qemu/machine/machine.py
> +++ b/python/qemu/machine/machine.py
> @@ -342,9 +342,15 @@ def _post_shutdown(self) -> None:
>          # Comprehensive reset for the failed launch case:
>          self._early_cleanup()
>
> -        if self._qmp_connection:
> -            self._qmp.close()
> -            self._qmp_connection = None
> +        try:
> +            self._close_qmp_connection()
> +        except Exception as err:  # pylint: disable=broad-except
> +            LOG.warning(
> +                "Exception closing QMP connection: %s",
> +                str(err) if str(err) else type(err).__name__
> +            )
> +        finally:
> +            assert self._qmp_connection is None
>
>          self._close_qemu_log_file()
>
> @@ -420,6 +426,31 @@ def _launch(self) -> None:
>                             close_fds=False)
>          self._post_launch()
>
> +    def _close_qmp_connection(self) -> None:
> +        """
> +        Close the underlying QMP connection, if any.
> +
> +        Dutifully report errors that occurred while closing, but assume
> +        that any error encountered indicates an abnormal termination
> +        process and not a failure to close.
> +        """
> +        if self._qmp_connection is None:
> +            return
> +
> +        try:
> +            self._qmp.close()
> +        except EOFError:
> +            # EOF can occur as an Exception here when using the Async
> +            # QMP backend. It indicates that the server closed the
> +            # stream. If we successfully issued 'quit' at any point,
> +            # then this was expected. If the remote went away without
> +            # our permission, it's worth reporting that as an abnormal
> +            # shutdown case.
> +            if not (self._user_killed or self._quit_issued):
> +                raise

Isn't this racy for those tests that expect QEMU to quit by itself and
then later call wait()? self._quit_issued is only set to True in wait(),
but whatever will cause QEMU to quit happens earlier, and it might
actually quit before wait() is called.

It would make sense to me that such tests need to declare that they
expect QEMU to quit before actually performing the action. And then
wait() becomes less weird in patch 1, too, because it can just assert
self._quit_issued instead of unconditionally setting it.

The other point I'm unsure about is whether you can actually kill QEMU
without getting either self._user_killed or self._quit_issued set. The
potentially problematic case I see is shutdown(hard=False) where soft
shutdown fails. Then a hard shutdown will be performed without setting
self._user_killed (is this a bug?).

Of course, sending the 'quit' command in _soft_shutdown() will set
self._quit_issued at least, but are we absolutely sure that it can never
raise an exception before getting to qmp()? I guess in theory, closing
the console socket in _early_cleanup() could raise one?

(But either way, not relying on such subtleties would make the code
easier to understand.)

Kevin
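A minimal sketch of the "declare the expectation before the action" pattern suggested above (set_quit_expected() is a hypothetical name, not the actual machine.py API):

class Machine:
    def __init__(self):
        self._quit_issued = False

    def set_quit_expected(self, expected=True):
        # Tests that trigger a side-channel quit (e.g. a failed migration
        # that terminates the VM) record that intent *before* the action,
        # so a later EOF on the QMP socket is never misread as a crash.
        self._quit_issued = expected

    def wait(self):
        # With the expectation declared up front, wait() can assert the
        # flag instead of unconditionally setting it.
        assert self._quit_issued, "wait() called without expecting a quit"
        # ... reap the process here ...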
Re: [RFC 0/2] tls: add macros for coroutine-safe TLS variables
Am 26.10.2021 um 19:10 hat Richard Henderson geschrieben:
> On 10/26/21 9:34 AM, Stefan Hajnoczi wrote:
> > On Tue, Oct 26, 2021 at 08:10:16AM -0700, Richard Henderson wrote:
> > > On 10/26/21 6:22 AM, Stefan Hajnoczi wrote:
> > > > If "safe" TLS variables are opt-in then we'll likely have obscure bugs
> > > > when code changes to access a TLS variable that was previously never
> > > > accessed from a coroutine. There is no compiler error and no way to
> > > > detect this. When it happens debugging it is painful.
> > >
> > > Co-routines are never used in user-only builds.
> >
> > If developers have the choice of using __thread then bugs can slip
> > through.
>
> Huh? How. No, really.
>
> > Are you concerned about performance, the awkwardness of calling
> > getters/setters, or something else for qemu-user?
>
> Awkwardness first, performance second.
>
> I'll also note that coroutines never run on vcpu threads, only io threads.
> So I'll resist any use of these interfaces in TCG as well.

This is wrong. Device emulation is called from vcpu threads, and it calls
into code using coroutines. Using dedicated iothreads is possible for some
devices, but not the default.

In fact, this is probably where the most visible case of the bug comes
from: A coroutine is created in the vcpu thread (while holding the BQL)
and after yielding first, it is subsequently reentered from the main
thread.

This is exactly the same pattern as you often get with callbacks, where
the request is made in the vcpu thread and the callback is run in the main
thread.

Kevin
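Kevin's scenario has a small, runnable Python analogy (not QEMU code): a generator stepped first in one thread and re-entered from another observes two different values behind the same thread-local name, which is exactly the hazard __thread poses for coroutines. The thread labels below are illustrative.

import threading

tls = threading.local()

def request():
    first = tls.owner   # read in the thread that starts the coroutine
    yield
    second = tls.owner  # read again after re-entry
    print(f"started in {first!r}, resumed in {second!r}")

co = request()
tls.owner = "vcpu thread"
next(co)  # first step runs here, in the "vcpu" thread

def reenter():
    tls.owner = "main thread"
    try:
        next(co)  # re-entry happens in a different thread
    except StopIteration:
        pass

t = threading.Thread(target=reenter)
t.start()
t.join()
# prints: started in 'vcpu thread', resumed in 'main thread'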
Re: [PATCH v2 1/3] file-posix: add `aio-max-batch` option
On Wed, Oct 27, 2021 at 06:28:51AM +0200, Markus Armbruster wrote:
> Stefano Garzarella writes:
>
> > Commit d7ddd0a161 ("linux-aio: limit the batch size using
> > `aio-max-batch` parameter") added a way to limit the batch size
> > of Linux AIO backend for the entire AIO context.
> >
> > The same AIO context can be shared by multiple devices, so
> > latency-sensitive devices may want to limit the batch size even
> > more to avoid increasing latency.
> >
> > For this reason we add the `aio-max-batch` option to the file
> > backend, which will be used by the next commits to limit the size of
> > batches including requests generated by this device.
> >
> > Suggested-by: Kevin Wolf
> > Reviewed-by: Kevin Wolf
> > Signed-off-by: Stefano Garzarella
> > ---
> >
> > Notes:
> >     v2:
> >     - @aio-max-batch documentation rewrite [Stefan, Kevin]
> >
> >  qapi/block-core.json | 7 +++
> >  block/file-posix.c   | 9 +
> >  2 files changed, 16 insertions(+)
> >
> > diff --git a/qapi/block-core.json b/qapi/block-core.json
> > index 6d3217abb6..fef76b0ea2 100644
> > --- a/qapi/block-core.json
> > +++ b/qapi/block-core.json
> > @@ -2896,6 +2896,12 @@
> >  #                 for this device (default: none, forward the commands via SG_IO;
> >  #                 since 2.11)
> >  # @aio: AIO backend (default: threads) (since: 2.8)
> > +# @aio-max-batch: maximum number of requests to batch together into a single
> > +#                 submission in the AIO backend. The smallest value between
> > +#                 this and the aio-max-batch value of the IOThread object is
> > +#                 chosen.
> > +#                 0 means that the AIO backend will handle it automatically.
> > +#                 (default: 0, since 6.2)
>
> "(default 0) (since 6.2)" seems to be more common.

Indeed I wasn't sure, so I followed @drop-cache, the last one added in
@BlockdevOptionsFile.

I'll fix in v3 :-)

Thanks,
Stefano
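The selection rule in the documentation hunk can be sketched as follows (illustrative Python, not QEMU source), with 0 treated as "no explicit limit":

def effective_max_batch(device_max, iothread_max):
    limits = [v for v in (device_max, iothread_max) if v > 0]
    # 0 means the AIO backend picks the batch size automatically.
    return min(limits) if limits else 0

assert effective_max_batch(8, 32) == 8   # the device is the stricter limit
assert effective_max_batch(0, 32) == 32  # the device defers to the IOThread
assert effective_max_batch(0, 0) == 0    # fully automatic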
Re: [PATCH v2 1/3] file-posix: add `aio-max-batch` option
On Tue, Oct 26, 2021 at 06:23:44PM +0200, Stefano Garzarella wrote:
> Commit d7ddd0a161 ("linux-aio: limit the batch size using
> `aio-max-batch` parameter") added a way to limit the batch size
> of Linux AIO backend for the entire AIO context.
>
> The same AIO context can be shared by multiple devices, so
> latency-sensitive devices may want to limit the batch size even
> more to avoid increasing latency.
>
> For this reason we add the `aio-max-batch` option to the file
> backend, which will be used by the next commits to limit the size of
> batches including requests generated by this device.
>
> Suggested-by: Kevin Wolf
> Reviewed-by: Kevin Wolf
> Signed-off-by: Stefano Garzarella
> ---
>
> Notes:
>     v2:
>     - @aio-max-batch documentation rewrite [Stefan, Kevin]
>
>  qapi/block-core.json | 7 +++
>  block/file-posix.c   | 9 +
>  2 files changed, 16 insertions(+)

Reviewed-by: Stefan Hajnoczi
Re: [PATCH v2 1/4] qemu-img: implement compare --stat
On 26.10.21 11:18, Vladimir Sementsov-Ogievskiy wrote:
> 26.10.2021 11:47, Hanna Reitz wrote:
> > On 26.10.21 09:53, Vladimir Sementsov-Ogievskiy wrote:
> > > 25.10.2021 19:40, Hanna Reitz wrote:
> > > > On 21.10.21 12:12, Vladimir Sementsov-Ogievskiy wrote:
> > > > > With new option qemu-img compare will not stop at first mismatch,
> > > > > but instead calculate statistics: how many clusters with different
> > > > > data, how many clusters with equal data, how many clusters were
> > > > > unallocated but become data and so on.
> > > > >
> > > > > We compare images chunk by chunk. Chunk size depends on what
> > > > > block_status returns for both images. It may return less than
> > > > > cluster (remember about qcow2 subclusters), it may return more than
> > > > > cluster (if several consecutive clusters share same status).
> > > > > Finally images may have different cluster sizes. This all leads to
> > > > > ambiguity in how to finally compare the data.
> > > > >
> > > > > What we can say for sure is that, when we compare two qcow2 images
> > > > > with same cluster size, we should compare clusters with data
> > > > > separately. Otherwise, if we for example compare 10 consecutive
> > > > > clusters of data where only one byte differs we'll report 10
> > > > > different clusters. Expected result in this case is 1 different
> > > > > cluster and 9 equal ones.
> > > > >
> > > > > So, to serve this case and just to have some defined rule let's do
> > > > > the following:
> > > > >
> > > > > 1. Select some block-size for compare procedure. In this commit it
> > > > >    must be specified by user, next commit will add some automatic
> > > > >    logic and make --block-size optional.
> > > > >
> > > > > 2. Go chunk-by-chunk using block_status as we do now with only one
> > > > >    difference: If block_status() returns DATA region that intersects
> > > > >    block-size aligned boundary, crop this region at this boundary.
> > > > >
> > > > > This way it's still possible to compare less than cluster and report
> > > > > subcluster-level accuracy, but we never compare more than one
> > > > > cluster of data.
> > > > >
> > > > > Signed-off-by: Vladimir Sementsov-Ogievskiy
> > > > > ---
> > > > >  docs/tools/qemu-img.rst | 18 +++-
> > > > >  qemu-img.c              | 206 +---
> > > > >  qemu-img-cmds.hx        | 4 +-
> > > > >  3 files changed, 212 insertions(+), 16 deletions(-)
> > > >
> > > > Looks good to me overall! Just some technical comments below.
> > > >
> > > > > diff --git a/docs/tools/qemu-img.rst b/docs/tools/qemu-img.rst
> > > > > index d58980aef8..21164253d4 100644
> > > > > --- a/docs/tools/qemu-img.rst
> > > > > +++ b/docs/tools/qemu-img.rst
> > > > > @@ -159,6 +159,18 @@ Parameters to compare subcommand:
> > > > >
> > > > >    Strict mode - fail on different image size or sector allocation
> > > > >
> > > > > +.. option:: --stat
> > > > > +
> > > > > +  Instead of exit on first mismatch compare the whole images and print
> > > > > +  statistics on amount of different pairs of clusters, based on their
> > > > > +  block-status and are they equal or not.
> > > >
> > > > I’d phrase this as:
> > > >
> > > > Instead of exiting on the first mismatch, compare the whole images
> > > > and print statistics on how much they differ in terms of block status
> > > > (i.e. are blocks allocated or not, do they contain data, are they
> > > > marked as containing only zeroes) and block content (a block of data
> > > > that contains only zero still has the same content as a marked-zero
> > > > block).
> > >
> > > For me the rest starting from "and block content" sounds unclear; it
> > > seems it doesn’t add any information to the previous part (i.e. are
> > > blocks allocated ...)
> >
> > By “block content” I meant what you said by “equal or not”, i.e. what
> > is returned when reading from the block.
> >
> > Now that I think about it again, I believe we should go with your
> > original “equal or not”, though, because that reflects what qemu-img
> > --stat prints, like so perhaps:
> >
> > Instead of exiting on the first mismatch, compare the whole images and
> > print statistics on the amount of different pairs of blocks, based on
> > their block status and whether they are equal or not.
>
> OK
>
> > I’d still like to add something like what I had in parentheses, though,
> > because as a user, I’d find the “block status” and “equal or not” terms
> > to be a bit handwave-y. I don’t think “block status” is a common term
> > in our documentation, so I wanted to add some examples; and I wanted to
> > show by example that “equal” blocks don’t need to have the same block
> > status.
>
> Actually, I think, that the resulting tool is anyway developer-oriented;
> to use it you should know what this block status really means. Maybe
> just s/block status/block type/, will it be more human friendly?

Oh, OK. I think I’d prefer “block status” still, because that’s what we
use in the code.

Hanna
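The cropping rule from the quoted commit message, as a short illustrative sketch (not the qemu-img implementation): a DATA region may be compared in pieces smaller than --block-size, but never in a piece that crosses a block-size aligned boundary, so one differing byte cannot mark many clusters as different.

def crop_at_alignment(offset, length, block_size):
    """Split [offset, offset+length) at block_size aligned boundaries."""
    end = offset + length
    while offset < end:
        boundary = (offset // block_size + 1) * block_size
        next_end = min(end, boundary)
        yield offset, next_end - offset
        offset = next_end

# A three-cluster DATA region with 64k blocks is compared cluster by cluster:
print(list(crop_at_alignment(0x10000, 0x30000, 0x10000)))
# [(65536, 65536), (131072, 65536), (196608, 65536)]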
Re: [PATCH v4 6/8] iotests/300: avoid abnormal shutdown race condition
On 26.10.21 19:07, John Snow wrote:
> On Mon, Oct 25, 2021 at 9:20 AM Hanna Reitz wrote:
> > On 13.10.21 23:57, John Snow wrote:
> > > Wait for the destination VM to close itself instead of racing to shut it
> > > down first, which produces different error log messages from AQMP
> > > depending on precisely when we tried to shut it down.
> > >
> > > (For example: We may try to issue 'quit' immediately prior to the target
> > > VM closing its QMP socket, which will cause an ECONNRESET error to be
> > > logged. Waiting for the VM to exit itself avoids the race on shutdown
> > > behavior.)
> > >
> > > Reported-by: Hanna Reitz
> > > Signed-off-by: John Snow
> > > ---
> > >  tests/qemu-iotests/300 | 12
> > >  1 file changed, 4 insertions(+), 8 deletions(-)
> > >
> > > diff --git a/tests/qemu-iotests/300 b/tests/qemu-iotests/300
> > > index 10f9f2a8da6..bbea7248005 100755
> > > --- a/tests/qemu-iotests/300
> > > +++ b/tests/qemu-iotests/300
> > > @@ -24,8 +24,6 @@ import random
> > >  import re
> > >  from typing import Dict, List, Optional
> > >
> > > -from qemu.machine import machine
> > > -
> > >  import iotests
> > >
> > >
> > > @@ -461,12 +459,10 @@ class TestBlockBitmapMappingErrors(TestDirtyBitmapMigration):
> > >              f"'{self.src_node_name}': Name is longer than 255 bytes",
> > >              log)
> > >
> > > -        # Expect abnormal shutdown of the destination VM because of
> > > -        # the failed migration
> > > -        try:
> > > -            self.vm_b.shutdown()
> > > -        except machine.AbnormalShutdown:
> > > -            pass
> > > +        # Destination VM will terminate w/ error of its own accord
> > > +        # due to the failed migration.
> > > +        self.vm_b.wait()
> > > +        assert self.vm_b.exitcode() > 0
> >
> > Trying to test, I can see that this fails iotest 297, because
> > `.exitcode()` is `Optional[int]`...
> >
> > (I can’t believe how long it took me to figure this out – the message
> > “300:465: Unsupported operand types for < ("int" and "None")” made me
> > believe that it was 300 that was failing, because `exitcode()` was
> > returning `None` for some inconceivable reason. I couldn’t understand
> > why my usual test setup failed on every run, but I couldn’t get 300 to
> > fail manually... Until I noticed that the message came below the “297”
> > line, not the “300” line...)
>
> Oops. Is there anything we can do to improve the visual clarity there?

You mean, like, separating iotests from the Python linting tests? :)

I think until then I’ll just need to pay more attention.

Hanna
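For reference, a sketch of the mypy-clean shape of that assertion (names here are illustrative): exitcode() returns Optional[int], so the None case has to be narrowed away before comparing.

from typing import Optional

def assert_abnormal_exit(exitcode: Optional[int]) -> None:
    assert exitcode is not None, "VM has not been reaped yet"
    assert exitcode > 0, "expected a non-zero (abnormal) exit status"

assert_abnormal_exit(1)       # passes: abnormal termination, as expected
# assert_abnormal_exit(None)  # would fail: exit status not yet available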