date:20240610


On Tue, 11 Jun 2024 00:38, Pierrick Bouvier  wrote:
Maybe it could be better if build.rs file was *not* needed for new 
devices/folders, and could be abstracted as a detail of the python 
wrapper script instead of something that should be committed.


That'd mean you cannot work on the rust files with a LanguageServer, 
you cannot run cargo build or cargo check or cargo clippy, etc. 
That's why I left the alternative choice of including a manually 
generated bindings file (generated.rs.inc)


Maybe I missed something, but it seems like it just checks/copies the 
generated.rs file where it's expected. Definitely something that could 
be done as part of the rust build.


Having to run the build before getting completion does not seem to be a 
huge compromise.


It only checks whether it's called from meson, in which case it should 
choose meson's generated.rs file. Otherwise it falls back to any 
manual bindings you have put there that are not checked into git. So 
essentially it does what you suggest, I think :)





Yes, vendor-the-world is a different topic than vendor e.g. two 
crates such as the dependencies I'm using here.




If there must be a discussion about dependencies, it's probably better 
to consider the "worse" case to take a decision about vendoring this or not.




Agreed. To re-cap, my opinion is that vendoring 1-2 small crates is 
fine, but any more than that needs rethinking.

Examining device state via monitor for debugging (was: [PATCH 0/2] hw/misc/mos6522: Do not open-code hmp_info_human_readable_text())

2024-06-10 Thread Markus Armbruster

Philippe Mathieu-Daudé  writes:

> Officialise the QMP command, use the existing
> hmp_info_human_readable_text() helper.

I'm not sure "officialise" is a word :)

Taking a step back...  "info via" and its new QMP counterpart
x-query-mos6522-devices dump device state.  I understand why examining
device state via monitor can be useful for debugging.  However, we have
more than 2000 devices in the tree.  Clearly, we don't want 2000 device
state queries.  Not even 100.  Could we have more generic means instead?

We could use QOM (read-only) properties to expose device state.

If we use one QOM property per "thing", examining device state becomes
quite tedious.  Also, you'd have to stop the guest to get a consistent
view, and adding lots of QOM properties bloats the code.

If we use a single, object-valued property for the entire state, we get
to define the objects in QAPI.  Differently tedious, and bloats the
generated code.

We could use a single string-valued property.  Too much of an abuse of
QOM?

We could add an optional "dump state for debugging" method to QOM, and
have a single query command that calls it if present.
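
To make this last option concrete, here is a minimal sketch in plain C. It
is not QEMU code: DeviceModel, DeviceModelClass, dump_state and
query_device_state are hypothetical stand-ins, and neither real QOM classes
nor the QMP plumbing are modelled. It only shows the shape of the idea, an
optional per-class method plus one generic query that calls it where present.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

typedef struct DeviceModel DeviceModel;

/* Hypothetical class struct; a stand-in for a QOM class, not QEMU's API. */
typedef struct DeviceModelClass {
    const char *type_name;
    /* Optional debugging hook; NULL when the device type has none. */
    int (*dump_state)(const DeviceModel *dev, char *buf, size_t len);
} DeviceModelClass;

struct DeviceModel {
    const DeviceModelClass *klass;
    const char *id;
    unsigned timer_counter; /* example state only */
};

static int via_dump_state(const DeviceModel *dev, char *buf, size_t len)
{
    return snprintf(buf, len, "%s: timer counter=%u\n",
                    dev->id, dev->timer_counter);
}

static const DeviceModelClass via_class = { "example-via", via_dump_state };
static const DeviceModelClass plain_class = { "example-plain", NULL };

/*
 * Single generic query: append the dump of every device whose class
 * implements the hook. Returns the number of devices dumped.
 */
static int query_device_state(const DeviceModel *const *devs, size_t n,
                              char *out, size_t out_len)
{
    size_t used = 0;
    int dumped = 0;

    assert(out && out_len > 0);
    out[0] = '\0';
    for (size_t i = 0; i < n; i++) {
        const DeviceModel *dev = devs[i];
        int w;

        if (!dev->klass->dump_state) {
            continue; /* no hook: silently skipped */
        }
        w = dev->klass->dump_state(dev, out + used, out_len - used);
        if (w < 0 || (size_t)w >= out_len - used) {
            out[used] = '\0'; /* drop the failed or truncated dump */
            break;
        }
        used += (size_t)w;
        dumped++;
    }
    return dumped;
}
```

In this sketch a device type opts in simply by filling in the hook, so the
number of query commands stays at one regardless of how many devices exist.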

Thoughts?

Re: [PATCH 1/2] hw/misc/mos6522: Expose x-query-mos6522-devices QMP command

2024-06-10 Thread Markus Armbruster

Philippe Mathieu-Daudé  writes:

> This is a counterpart to the HMP "info via" command. It is being
> added with an "x-" prefix because this QMP command is intended as an
> adhoc debugging tool and will thus not be modelled in QAPI as fully
> structured data, nor will it have long term guaranteed stability.
>
> Signed-off-by: Philippe Mathieu-Daudé 
> ---
>  MAINTAINERS |  2 +-
>  qapi/machine.json   | 17 +
>  hw/misc/mos6522-stubs.c | 18 ++
>  hw/misc/mos6522.c   |  5 +++--
>  hw/misc/meson.build |  3 ++-
>  5 files changed, 41 insertions(+), 4 deletions(-)
>  create mode 100644 hw/misc/mos6522-stubs.c
>
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 951556224a..e86638c68c 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -1453,7 +1453,7 @@ F: hw/ppc/mac_newworld.c
>  F: hw/pci-host/uninorth.c
>  F: hw/pci-bridge/dec.[hc]
>  F: hw/misc/macio/
> -F: hw/misc/mos6522.c
> +F: hw/misc/mos6522*.c
>  F: hw/nvram/mac_nvram.c
>  F: hw/ppc/fw_cfg.c
>  F: hw/input/adb*
> diff --git a/qapi/machine.json b/qapi/machine.json
> index 1283d14493..a82b8dd39d 100644
> --- a/qapi/machine.json
> +++ b/qapi/machine.json

I figure you pick machine.json because it already serves as grabbag of
vaguely device-specific queries like x-query-usb.  misc-target.json is
another grabbag.

> @@ -1865,6 +1865,23 @@
>'data': { 'filename': 'str' },
>'if': 'CONFIG_FDT' }
>  
> +##
> +# @x-query-mos6522-devices:
> +#
> +# Query information on MOS6522 VIA devices
> +#
> +# Features:
> +#
> +# @unstable: This command is meant for debugging.
> +#
> +# Returns: MOS6522 VIA devices information
> +#
> +# Since: 9.1
> +##
> +{ 'command': 'x-query-mos6522-devices',
> +  'returns': 'HumanReadableText',
> +  'features': [ 'unstable' ]}
> +
>  ##
>  # @x-query-interrupt-controllers:
>  #

HMP "info via" is compile-time conditional on CONFIG_MOS6522.

Its new QMP counterpart x-query-mos6522-devices is unconditional.
Can you explain why?

Possibly related:

commit 409e9f7131e55e74eb09e65535779e311df5ebf5
Author: Mark Cave-Ayland 
Date:   Sat Mar 5 15:09:53 2022 +

mos6522: add "info via" HMP command for debugging

This displays detailed information about the device registers and timers to aid
debugging problems with timers and interrupts.

--> Currently the QAPI generators for HumanReadableText don't work correctly if
--> used in qapi/target-misc.json when a non-specified target is built, so for
--> now manually add a hmp_info_via() wrapper until direct support for per-device
--> HMP/QMP commands is implemented.

Signed-off-by: Mark Cave-Ayland 
Reviewed-by: Laurent Vivier 
Message-Id: <20220305150957.5053-9-mark.cave-ayl...@ilande.co.uk>
Signed-off-by: Mark Cave-Ayland 

[...]

Re: [Stable-7.2.12 00/29] Patch Round-up for stable 7.2.12, frozen at 2024-06-07


10.06.2024 15:21, Eric Blake wrote:

On Mon, Jun 10, 2024 at 07:17:53AM GMT, Eric Blake wrote:



In addition to these two, we also need the following for NBD:

  14ddea7e3c81 Eric Blake:
   qio: Inherit follow_coroutine_ctx across TLS

and optionally:
  5905c09466f4 Eric Blake:
   iotests: test NBD+TLS+iothread


Hmm; I see you did include them for the 8.2.x branch; and the
regression they fix was only introduced in 8.2.  Unless we backported
the work of removing AioContext to 7.2.x, then not backporting these
two that far should not be an issue, after all.


Ah yes, some of these don't apply to older (here: 7.2) versions, especially
because 7.2 lacks AioContext removal.  Such change is too intrusive for a
stable release, I'd say.

I was in a hurry when I replied to your previous message and didn't check
before writing. Had I looked, I would have known why I hadn't picked them up
for 7.2 right away.  I'm sorry for the noise.

And thank you once again for checking and letting me know, - such attention
is appreciated, it is a good reality check for my own sanity ;)

(I keep stable-7.2 branch alive still, because it is used in debian stable
and in redhat).

Thanks!

/mjt


Re: qemu-riscv32 usermode still broken?

On Wed, Sep 20, 2023 at 6:39 AM Andreas K. Huettel  wrote:
>
> Hi Alistair,
>
> > It would be great to get a strace of the failure to narrow down what
> > it is. From there it should be not too hard to find and fix.
>
> thanks a lot. Here's as much info as I could get with strace mechanisms.
>
> 1) What I did, without any tracing
>
> pinacolada ~ # qemu-riscv32 -L /var/lib/machines/riscv32 
> /var/lib/machines/riscv32/bin/bash
> pinacolada ~ # python
> Python 3.11.5 (main, Aug 27 2023, 18:39:05) [GCC 12.3.1 20230623] on linux
> Type "help", "copyright", "credits" or "license" for more information.
> >>>
> [1]+  Stopped python
> ^C^C
> pinacolada ~ # ^C
> pinacolada ~ # fg
> python
>
> pinacolada ~ #
> exit
>
> * When I type Ctrl-Z at the python prompt, the terminal hangs.
> * With several Ctrl-C I can get back to the riscv32 bash, and then python is 
> suspended in the background.
>
> * Now I did this again, first with qemu tracing system calls, then with 
> strace tracing qemu
> * In both cases, the log starts when I type "python", and ends (with quickly 
> repeated output lines)
>   after pressing Ctrl-Z
>
> 2)
> pinacolada ~ # QEMU_STRACE=1 qemu-riscv32 -L /var/lib/machines/riscv32 
> /var/lib/machines/riscv32/bin/bash
> (QEMU_STRACE is getting unset in my bashrc, so no subprocesses are traced)
>
> (...)
> 2472050 write(2,0xe56c0,58)pinacolada ~ #  = 58
> 2472050 Unknown syscall 413
> 2472050 read(0,0x2b2aa29b,1) = 1
> 2472050 Unknown syscall 413
> 2472050 write(2,0xe56c0,1)p = 1
> 2472050 Unknown syscall 413
> 2472050 read(0,0x2b2aa29b,1) = 1
> 2472050 Unknown syscall 413
> 2472050 write(2,0xe56c0,1)y = 1
> 2472050 Unknown syscall 413
> 2472050 read(0,0x2b2aa29b,1) = 1
> 2472050 Unknown syscall 413
> 2472050 write(2,0xe56c0,1)t = 1
> 2472050 Unknown syscall 413
> 2472050 read(0,0x2b2aa29b,1) = 1
> 2472050 Unknown syscall 413
> 2472050 write(2,0xe56c0,1)h = 1
> 2472050 Unknown syscall 413
> 2472050 read(0,0x2b2aa29b,1) = 1
> 2472050 Unknown syscall 413
> 2472050 write(2,0xe56c0,1)o = 1
> 2472050 Unknown syscall 413
> 2472050 read(0,0x2b2aa29b,1) = 1
> 2472050 Unknown syscall 413
> 2472050 write(2,0xe56c0,1)n = 1
> 2472050 Unknown syscall 413
> 2472050 read(0,0x2b2aa29b,1) = 1
> 2472050 write(2,0xe56c0,1)
>  = 1
>  = 9050 write(2,0xe56c0,9)
> 2472050 ioctl(0,TCSETSW,{c_iflag = ICRNL|IXON|IXOFF|IUTF8,c_oflag = 
> OPOST|ONLCR,c_cflag = B38400,CS8,CREAD,c_lflag = 
> ISIG|ICANON|ECHO|ECHOE|ECHOK|ECHOCTL|ECHOKE|IEXTEN,c_cc = "",c_line = ''}) = 0
> 2472050 rt_sigaction(SIGINT,0x2b2aa1bc,0x2b2aa244) = 0
> 2472050 rt_sigaction(SIGHUP,0x2b2aa1bc,0x2b2aa244) = 0
> 2472050 rt_sigaction(SIGALRM,0x2b2aa1bc,0x2b2aa244) = 0
> 2472050 rt_sigaction(SIGWINCH,0x2b2aa1bc,0x2b2aa244) = 0
> 2472050 rt_sigaction(SIGINT,0x2b2aa14c,0x2b2aa1d4) = 0
> 2472050 clock_gettime64(CLOCK_REALTIME_COARSE,0x2b2aa268) = 0 
> ({tv_sec=1695154794,tv_nsec=760883171})
> 2472050 
> statx(AT_FDCWD,".",AT_NO_AUTOMOUNT|AT_STATX_SYNC_AS_STAT,STATX_BASIC_STATS,0x2b2aaa78)
>  = 0
> 2472050 
> statx(AT_FDCWD,"/usr/local/sbin/python",AT_NO_AUTOMOUNT|AT_STATX_SYNC_AS_STAT,STATX_BASIC_STATS,0x2b2aa998)
>  = -1 errno=2 (No such file or directory)
> 2472050 
> statx(AT_FDCWD,"/usr/local/bin/python",AT_NO_AUTOMOUNT|AT_STATX_SYNC_AS_STAT,STATX_BASIC_STATS,0x2b2aa998)
>  = -1 errno=2 (No such file or directory)
> 2472050 
> statx(AT_FDCWD,"/usr/sbin/python",AT_NO_AUTOMOUNT|AT_STATX_SYNC_AS_STAT,STATX_BASIC_STATS,0x2b2aa998)
>  = -1 errno=2 (No such file or directory)
> 2472050 
> statx(AT_FDCWD,"/usr/bin/python",AT_NO_AUTOMOUNT|AT_STATX_SYNC_AS_STAT,STATX_BASIC_STATS,0x2b2aa998)
>  = 0
> 2472050 
> statx(AT_FDCWD,"/usr/bin/python",AT_NO_AUTOMOUNT|AT_STATX_SYNC_AS_STAT,STATX_BASIC_STATS,0x2b2aa8e8)
>  = 0
> 2472050 geteuid() = 0
> 2472050 getegid() = 0
> 2472050 getuid() = 0
> 2472050 getgid() = 0
> 2472050 
> faccessat(AT_FDCWD,"/usr/bin/python",X_OK,AT_SYMLINK_NOFOLLOW|0x1da42089) = 0
> 2472050 
> statx(AT_FDCWD,"/usr/bin/python",AT_NO_AUTOMOUNT|AT_STATX_SYNC_AS_STAT,STATX_BASIC_STATS,0x2b2aa8e8)
>  = 0
> 2472050 geteuid() = 0
> 2472050 getegid() = 0
> 2472050 getuid() = 0
> 2472050 getgid() = 0
> 2472050 
> faccessat(AT_FDCWD,"/usr/bin/python",R_OK,AT_SYMLINK_NOFOLLOW|0x1da42089) = 0
> 2472050 rt_sigprocmask(SIG_BLOCK,NULL,0x2b2aabec,8) = 0
> 2472050 rt_sigprocmask(SIG_BLOCK,0x2b2aaaec,0x2b2aab6c,8) = 0
> 2472050 rt_sigaction(SIGTERM,0x2b2aa85c,0x2b2aa8e4) = 0
> 2472050 rt_sigprocmask(SIG_BLOCK,0x2b2aa98c,0x2b2aaa0c,8) = 0
> 2472050 rt_sigprocmask(SIG_SETMASK,0x2b2aaa0c,NULL,8) = 0
> 2472050 pipe2(0x5560d3f4,0) = 0
> 2472050 
> clone(CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|0x11,child_stack=0x,parent_tidptr=0x,tls=0x,child_tidptr=0x2b2d20c8)
>  = 2472055
> 2472050 rt_sigaction(SIGTERM,0x2b2aa85c,0x2b2aa8e4) = 0
>  = 0
> 2472050 setpgid(2472055,2472055) = 0
> 2472055 set_robust_list(0x2b2d20cc,12) = 2472050 
> rt_sigprocmask(SIG_SETMASK,0x2b2aab6c,-1 errno=38 (Function not 
> implemented)NULL,
> 8) = 0
> 2472055

[ANNOUNCE] QEMU 9.0.1 Stable released

Hi everyone,

The QEMU v9.0.1 stable release is now available.

You can grab the tarball from our download page here:

  https://www.qemu.org/download/#source

  https://download.qemu.org/qemu-9.0.1.tar.xz
  https://download.qemu.org/qemu-9.0.1.tar.xz.sig (signature)

v9.0.1 is now tagged in the official qemu.git repository, and the
stable-9.0 branch has been updated accordingly:

  https://gitlab.com/qemu-project/qemu/-/commits/stable-9.0

There are 71 changes since the previous v9.0.0 release.

Thank you everyone who has been involved and helped with the stable series!

/mjt

Changelog (stable-9.0-hash master-hash Author Name: Commit-Subject):

60b4f3aff4 Michael Tokarev:
 Update version for 9.0.1 release
2d673c3cdc 78f932ea1f lanyanzhi:
 target/loongarch: fix a wrong print in cpu dump
453a7c4f9b 2e701e6785 Bernhard Beschow:
 ui/sdl2: Allow host to power down screen
3fe67740ca 40a23ef643 Marc-André Lureau:
 virtio-gpu: fix v2 migration
e44389b0ac da7c95920d Xinyu Li:
 target/i386: fix SSE and SSE2 feature check
0ab2229daa 7604bbc2d8 Paolo Bonzini:
 target/i386: fix xsave.flat from kvm-unit-tests
9075bc0bdd 915758c537 Alistair Francis:
 disas/riscv: Decode all of the pmpcfg and pmpaddr CSRs
8746327f4b 583edc4efb Daniel Henrique Barboza:
 riscv, gdbstub.c: fix reg_width in ricsv_gen_dynamic_vector_feature()
e532fdb0eb 190b867f28 Yong-Xuan Wang:
 target/riscv/kvm.c: Fix the hart bit setting of AIA
fb1be88084 c5eb8d6336 Alistair Francis:
 target/riscv: rvzicbo: Fixup CBO extension register calculation
a58758c5df 6c9a344247 Alexei Filippov:
 target/riscv: do not set mtval2 for non guest-page faults
ab2d6e7412 68e7c86927 Daniel Henrique Barboza:
 target/riscv: prioritize pmp errors in raise_mmu_exception()
3ee5f0e313 93cb52b7a3 Max Chou:
 target/riscv: rvv: Remove redudant SEW checking for vector fp narrow/widen instructions
9f9cd6b7f9 692f33a3ab Max Chou:
 target/riscv: rvv: Check single width operator for vfncvt.rod.f.f.w
a0ea75e019 7a999d4dd7 Max Chou:
 target/riscv: rvv: Check single width operator for vector fp widen instructions
f3bea9603b 17b713c080 Max Chou:
 target/riscv: rvv: Fix Zvfhmin checking for vfwcvt.f.f.v and vfncvt.f.f.w instructions
3f4ab4b158 ff33b7a969 Yangyu Chen:
 target/riscv/cpu.c: fix Zvkb extension config
af1e2cdc57 75115d880c Huang Tao:
 target/riscv: Fix the element agnostic function problem
2dcc48b38b 1215d45b2a Daniel Henrique Barboza:
 target/riscv/kvm: tolerate KVM disable ext errors
2ae8e12964 86997772fa Andrew Jones:
 target/riscv/kvm: Fix exposure of Zkr
8d664e5bc2 c76b121840 yang.zhang:
 hw/intc/riscv_aplic: APLICs should add child earlier than realize
f7ddff7d5b a73c993780 Eric Blake:
 iotests: test NBD+TLS+iothread
a15989d89b 199e84de1c Eric Blake:
 qio: Inherit follow_coroutine_ctx across TLS
1c8a740fad daf9748ac0 Marcin Juszkiewicz:
 target/arm: Disable SVE extensions when SVE is disabled
65b44e55e4 daafa78b29 Andrey Shumilin:
 hw/intc/arm_gic: Fix handling of NS view of GICC_APR
68af25cd8e 19ed42e8ad Zenghui Yu:
 hvf: arm: Fix encodings for ID_AA64PFR1_EL1 and debug System registers
6df1431678 b563959b90 Daniel P. Berrangé:
 gitlab: use 'setarch -R' to workaround tsan bug
d488e255be c53f7a1078 Daniel P. Berrangé:
 gitlab: use $MAKE instead of 'make'
8fe634f851 bad7a2759c Daniel P. Berrangé:
 dockerfiles: add 'MAKE' env variable to remaining containers
fd4afd5a77 36fa7c686e Richard Henderson:
 gitlab: Update msys2-64bit runner tags
2cd8deb0d9 f0f0136abb Paolo Bonzini:
 target/i386: no single-step exception after MOV or POP SS
89ed6d4b6c 8225bff7c5 Paolo Bonzini:
 target/i386: disable jmp_opt if EFLAGS.RF is 1
0854469050 6204af704a Jiaxun Yang:
 hw/loongarch/virt: Fix FDT memory node address width
16b1ecee52 b11f981452 Song Gao:
 hw/loongarch: Fix fdt memory node wrong 'reg'
d27df7187b 07c0866103 Song Gao:
 target/loongarch/kvm: fpu save the vreg registers high 192bit
41558f42b3 9710401276 Fiona Ebner:
 hw/core/machine: move compatibility flags for VirtIO-net USO to machine 8.1
285cef5c39 84d4b72854 donsheng:
 target-i386: hyper-v: Correct kvm_hv_handle_exit return value
2569dec929 2563be6317 Gerd Hoffmann:
 hw/pflash: fix block write start
2965ecc487 c9290dfebf Richard Henderson:
 tcg/loongarch64: Fill out tcg_out_{ld,st} for vector regs
bbfe1d4e8b e4e62514e3 Dongwon Kim:
 ui/gtk: Check if fence_fd is equal to or greater than 0
ba27e71976 37e9141501 hikalium:
 ui/gtk: Fix mouse/motion event scaling issue with GTK display backend
33a17bcbaf 371d60dfdb Thomas Huth:
 configure: Fix error message when C compiler is not working
52d96ce37d 23b1f53c2c Paolo Bonzini:
 configure: quote -D options that are passed through to meson
6cb4afc418 fe01af5d47 Paolo Bonzini:
 target/i386: fix feature dependency for WAITPKG
1e5c6ceb27 40a3ec7b5f Paolo Bonzini:
 target/i386: rdpkru/wrpkru are no-prefix instructions
08eb23e4c9 41c685dc59 Paolo Bonzini:
 target/i386: fix operand size for DATA16 REX.W POPCNT
230b5c968e e6578f1f68 Mattias

[ANNOUNCE] QEMU 7.2.12 Stable released

Hi everyone,

The QEMU v7.2.12 stable release is now available.

You can grab the tarball from our download page here:

  https://www.qemu.org/download/#source

  https://download.qemu.org/qemu-7.2.12.tar.xz
  https://download.qemu.org/qemu-7.2.12.tar.xz.sig (signature)

v7.2.12 is now tagged in the official qemu.git repository, and the
stable-7.2 branch has been updated accordingly:

  https://gitlab.com/qemu-project/qemu/-/commits/stable-7.2

There are 29 changes since the previous v7.2.11 release.

Thank you everyone who has been involved and helped with the stable series!

/mjt

Changelog (stable-7.2-hash master-hash Author Name: Commit-Subject):

f48ba9b085 Michael Tokarev:
 Update version for 7.2.12 release
6f62fc9ff3 78f932ea1f lanyanzhi:
 target/loongarch: fix a wrong print in cpu dump
61687b3b43 2e701e6785 Bernhard Beschow:
 ui/sdl2: Allow host to power down screen
082940a5a1 da7c95920d Xinyu Li:
 target/i386: fix SSE and SSE2 feature check
9aca1a84de 7604bbc2d8 Paolo Bonzini:
 target/i386: fix xsave.flat from kvm-unit-tests
81ca6c2c9b 915758c537 Alistair Francis:
 disas/riscv: Decode all of the pmpcfg and pmpaddr CSRs
b73e3712a3 c76b121840 yang.zhang:
 hw/intc/riscv_aplic: APLICs should add child earlier than realize
e08fbea661 daf9748ac0 Marcin Juszkiewicz:
 target/arm: Disable SVE extensions when SVE is disabled
eed21e9574 daafa78b29 Andrey Shumilin:
 hw/intc/arm_gic: Fix handling of NS view of GICC_APR
c6fe98fe79 19ed42e8ad Zenghui Yu:
 hvf: arm: Fix encodings for ID_AA64PFR1_EL1 and debug System registers
07f686009f 36fa7c686e Richard Henderson:
 gitlab: Update msys2-64bit runner tags
f417712ef1 f0f0136abb Paolo Bonzini:
 target/i386: no single-step exception after MOV or POP SS
9abcd968e7 8225bff7c5 Paolo Bonzini:
 target/i386: disable jmp_opt if EFLAGS.RF is 1
ddc13a3c42 84d4b72854 donsheng:
 target-i386: hyper-v: Correct kvm_hv_handle_exit return value
5ec422a958 e4e62514e3 Dongwon Kim:
 ui/gtk: Check if fence_fd is equal to or greater than 0
659835d24b 37e9141501 hikalium:
 ui/gtk: Fix mouse/motion event scaling issue with GTK display backend
e6000bd7c7 40a3ec7b5f Paolo Bonzini:
 target/i386: rdpkru/wrpkru are no-prefix instructions
76b96c053f 41c685dc59 Paolo Bonzini:
 target/i386: fix operand size for DATA16 REX.W POPCNT
2b8be9cffb e6578f1f68 Mattias Nissler:
 hw/remote/vfio-user: Fix config space access byte order
41e052fc05 6a5a63f74b Ruihan Li:
 target/i386: Give IRQs a chance when resetting HF_INHIBIT_IRQ_MASK
2e3e5138d6 eb656a60fd Philippe Mathieu-Daudé:
 hw/arm/npcm7xx: Store derivative OTP fuse key in little endian
a004dfabea 4b00855f0e Alexandra Diupina:
 hw/dmax/xlnx_dpdma: fix handling of address_extension descriptor fields
9a005e30f5 a88a04906b Thomas Huth:
 .gitlab-ci.d/cirrus.yml: Shorten the runtime of the macOS and FreeBSD jobs
e00c9b4758 dcc5c018c7 Peter Maydell:
 tests/avocado: update sunxi kernel from armbian to 6.6.16
39a0961d0a 06479dbf3d Li Zhijian:
 backends/cryptodev-builtin: Fix local_error leaks
f7b46e82ce 4fa333e08d Eric Blake:
 nbd/server: Mark negotiation functions as coroutine_fn
a0823c2766 ae6d91a7e9 Zhu Yangyang:
 nbd/server: do not poll within a coroutine context
51cc8762a0 04f6fb897a Michael Tokarev:
 linux-user: do_setsockopt: fix SOL_ALG.ALG_SET_KEY
6ea6863f21 7bc1286b81 Palmer Dabbelt:
 gitlab/opensbi: Move to docker:stable
861fca8ce0 690ceb7193 Philippe Mathieu-Daudé:
 gitlab-ci: Remove job building EDK2 firmware binaries


[ANNOUNCE] QEMU 8.2.5 Stable released

Hi everyone,

The QEMU v8.2.5 stable release is now available.

You can grab the tarball from our download page here:

  https://www.qemu.org/download/#source

  https://download.qemu.org/qemu-8.2.5.tar.xz
  https://download.qemu.org/qemu-8.2.5.tar.xz.sig (signature)

v8.2.5 is now tagged in the official qemu.git repository, and the
stable-8.2 branch has been updated accordingly:

  https://gitlab.com/qemu-project/qemu/-/commits/stable-8.2

There are 45 changes since the previous v8.2.4 release.

Thank you everyone who has been involved and helped with the stable series!

/mjt

Changelog (stable-8.2-hash master-hash Author Name: Commit-Subject):

909772f0a5 Michael Tokarev:
 Update version for 8.2.5 release
6feae1d0dd 78f932ea1f lanyanzhi:
 target/loongarch: fix a wrong print in cpu dump
af008b379c 2e701e6785 Bernhard Beschow:
 ui/sdl2: Allow host to power down screen
276ec925a7 da7c95920d Xinyu Li:
 target/i386: fix SSE and SSE2 feature check
d84afebcee 7604bbc2d8 Paolo Bonzini:
 target/i386: fix xsave.flat from kvm-unit-tests
2891807479 915758c537 Alistair Francis:
 disas/riscv: Decode all of the pmpcfg and pmpaddr CSRs
ae5edeb084 190b867f28 Yong-Xuan Wang:
 target/riscv/kvm.c: Fix the hart bit setting of AIA
935be461eb c5eb8d6336 Alistair Francis:
 target/riscv: rvzicbo: Fixup CBO extension register calculation
37d6c6e495 6c9a344247 Alexei Filippov:
 target/riscv: do not set mtval2 for non guest-page faults
6da92af4f9 68e7c86927 Daniel Henrique Barboza:
 target/riscv: prioritize pmp errors in raise_mmu_exception()
0f9578497c 93cb52b7a3 Max Chou:
 target/riscv: rvv: Remove redudant SEW checking for vector fp narrow/widen instructions
c4173e4caf 692f33a3ab Max Chou:
 target/riscv: rvv: Check single width operator for vfncvt.rod.f.f.w
d813f356ad 7a999d4dd7 Max Chou:
 target/riscv: rvv: Check single width operator for vector fp widen instructions
749907f857 17b713c080 Max Chou:
 target/riscv: rvv: Fix Zvfhmin checking for vfwcvt.f.f.v and vfncvt.f.f.w instructions
4cba687b86 ff33b7a969 Yangyu Chen:
 target/riscv/cpu.c: fix Zvkb extension config
ec182b1045 75115d880c Huang Tao:
 target/riscv: Fix the element agnostic function problem
cf7143fdb7 1215d45b2a Daniel Henrique Barboza:
 target/riscv/kvm: tolerate KVM disable ext errors
cd1228a80e c76b121840 yang.zhang:
 hw/intc/riscv_aplic: APLICs should add child earlier than realize
b9b2f3bbab a73c993780 Eric Blake:
 iotests: test NBD+TLS+iothread
9a6143a73e 199e84de1c Eric Blake:
 qio: Inherit follow_coroutine_ctx across TLS
71c7036b18 daf9748ac0 Marcin Juszkiewicz:
 target/arm: Disable SVE extensions when SVE is disabled
3f470980b4 daafa78b29 Andrey Shumilin:
 hw/intc/arm_gic: Fix handling of NS view of GICC_APR
0970313b05 19ed42e8ad Zenghui Yu:
 hvf: arm: Fix encodings for ID_AA64PFR1_EL1 and debug System registers
8965709b86 b563959b90 Daniel P. Berrangé:
 gitlab: use 'setarch -R' to workaround tsan bug
3b36dd0005 c53f7a1078 Daniel P. Berrangé:
 gitlab: use $MAKE instead of 'make'
fc88204b82 bad7a2759c Daniel P. Berrangé:
 dockerfiles: add 'MAKE' env variable to remaining containers
ca0799624e 36fa7c686e Richard Henderson:
 gitlab: Update msys2-64bit runner tags
52031d6be5 f0f0136abb Paolo Bonzini:
 target/i386: no single-step exception after MOV or POP SS
c6171d524d 8225bff7c5 Paolo Bonzini:
 target/i386: disable jmp_opt if EFLAGS.RF is 1
93fa768d40 6204af704a Jiaxun Yang:
 hw/loongarch/virt: Fix FDT memory node address width
d679c82488 b11f981452 Song Gao:
 hw/loongarch: Fix fdt memory node wrong 'reg'
e3a2aa9542 9710401276 Fiona Ebner:
 hw/core/machine: move compatibility flags for VirtIO-net USO to machine 8.1
9b98ab7d3d 84d4b72854 donsheng:
 target-i386: hyper-v: Correct kvm_hv_handle_exit return value
90e023f2bc c9290dfebf Richard Henderson:
 tcg/loongarch64: Fill out tcg_out_{ld,st} for vector regs
355527b646 e4e62514e3 Dongwon Kim:
 ui/gtk: Check if fence_fd is equal to or greater than 0
f44d2398d8 37e9141501 hikalium:
 ui/gtk: Fix mouse/motion event scaling issue with GTK display backend
05bfa963df 371d60dfdb Thomas Huth:
 configure: Fix error message when C compiler is not working
19a931f207 23b1f53c2c Paolo Bonzini:
 configure: quote -D options that are passed through to meson
2b95625643 fe01af5d47 Paolo Bonzini:
 target/i386: fix feature dependency for WAITPKG
1cc3cb96b8 40a3ec7b5f Paolo Bonzini:
 target/i386: rdpkru/wrpkru are no-prefix instructions
eb761b4ee5 41c685dc59 Paolo Bonzini:
 target/i386: fix operand size for DATA16 REX.W POPCNT
7d7b770bde e6578f1f68 Mattias Nissler:
 hw/remote/vfio-user: Fix config space access byte order
7dbebba4a5 54c52ec719 Song Gao:
 hw/loongarch/virt: Fix memory leak
819f92ec3e 9157dccc7e Richard Henderson:
 target/sparc: Fix FMUL8x16
d3da3d02a0 7b616f36de Richard Henderson:
 target/sparc: Fix FEXPAND
50ed4f856a 6a5a63f74b Ruihan Li:
 target/i386: Give IRQs a chance when resetting HF_INHIBIT_IRQ_MASK


RE: [RFC v2 4/7] virtio-iommu: Compute host reserved regions




>-----Original Message-----
>From: Eric Auger 
>Subject: [RFC v2 4/7] virtio-iommu: Compute host reserved regions
>
>Compute the host reserved regions in virtio_iommu_set_iommu_device().
>The usable IOVA regions are retrieved from the HOSTIOMMUDevice.
>The virtio_iommu_set_host_iova_ranges() helper turns usable regions
>into complementary reserved regions while testing the inclusion
>into existing ones. virtio_iommu_set_host_iova_ranges() reuse the
>implementation of virtio_iommu_set_iova_ranges() which will be 
>removed in subsequent patches. rebuild_resv_regions() is just moved.
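
The conversion described above (usable regions turned into complementary
reserved regions, plus the inclusion test) can be sketched in standalone C.
This is an illustration under stated assumptions, not the patch's code:
IovaRange, invert_ranges and all_included are hypothetical names, bounds are
inclusive, and the usable ranges are assumed to be sorted, non-overlapping
and inside the [low, high] window.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical range type with inclusive bounds. */
typedef struct {
    uint64_t lob;
    uint64_t upb;
} IovaRange;

/*
 * Write the gaps left by usable[] inside [low, high] into resv[].
 * resv[] must have room for n + 1 entries. Returns the number of gaps.
 */
static size_t invert_ranges(const IovaRange *usable, size_t n,
                            uint64_t low, uint64_t high, IovaRange *resv)
{
    size_t count = 0;
    uint64_t next = low; /* first address not yet accounted for */

    for (size_t i = 0; i < n; i++) {
        assert(usable[i].lob <= usable[i].upb);
        if (usable[i].lob > next) {
            resv[count++] = (IovaRange){ next, usable[i].lob - 1 };
        }
        if (usable[i].upb == high) {
            return count; /* also avoids overflow when high is UINT64_MAX */
        }
        next = usable[i].upb + 1;
    }
    resv[count++] = (IovaRange){ next, high };
    return count;
}

static int range_contains(const IovaRange *outer, const IovaRange *inner)
{
    return outer->lob <= inner->lob && inner->upb <= outer->upb;
}

/* Return 1 if every incoming range lies inside some existing range. */
static int all_included(const IovaRange *existing, size_t n_existing,
                        const IovaRange *incoming, size_t n_incoming)
{
    for (size_t i = 0; i < n_incoming; i++) {
        int included = 0;

        for (size_t j = 0; j < n_existing; j++) {
            if (range_contains(&existing[j], &incoming[i])) {
                included = 1;
                break;
            }
        }
        if (!included) {
            return 0;
        }
    }
    return 1;
}
```

For example, usable ranges [0x1000, 0xfedfffff] and [0xfef00000, UINT64_MAX]
yield the reserved ranges [0x0, 0xfff] and [0xfee00000, 0xfeefffff]; a later
notification is acceptable in this sketch only if each of its reserved ranges
falls inside one of those.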
>
>Signed-off-by: Eric Auger 
>---
> hw/virtio/virtio-iommu.c | 151 ++
>-
> 1 file changed, 117 insertions(+), 34 deletions(-)
>
>diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
>index 0680a357f0..33e9682b83 100644
>--- a/hw/virtio/virtio-iommu.c
>+++ b/hw/virtio/virtio-iommu.c
>@@ -494,12 +494,114 @@ get_host_iommu_device(VirtIOIOMMU
>*viommu, PCIBus *bus, int devfn) {
> return g_hash_table_lookup(viommu->host_iommu_devices, );
> }
>
>+/**
>+ * rebuild_resv_regions: rebuild resv regions with both the
>+ * info of host resv ranges and property set resv ranges
>+ */
>+static int rebuild_resv_regions(IOMMUDevice *sdev)
>+{
>+GList *l;
>+int i = 0;
>+
>+/* free the existing list and rebuild it from scratch */
>+g_list_free_full(sdev->resv_regions, g_free);
>+sdev->resv_regions = NULL;
>+
>+/* First add host reserved regions if any, all tagged as RESERVED */
>+for (l = sdev->host_resv_ranges; l; l = l->next) {
>+ReservedRegion *reg = g_new0(ReservedRegion, 1);
>+Range *r = (Range *)l->data;
>+
>+reg->type = VIRTIO_IOMMU_RESV_MEM_T_RESERVED;
>+range_set_bounds(&reg->range, range_lob(r), range_upb(r));
>+sdev->resv_regions = resv_region_list_insert(sdev->resv_regions, reg);
>+trace_virtio_iommu_host_resv_regions(sdev->iommu_mr.parent_obj.name, i,
>+ range_lob(&reg->range),
>+ range_upb(&reg->range));
>+i++;
>+}
>+/*
>+ * then add higher priority reserved regions set by the machine
>+ * through properties
>+ */
>+add_prop_resv_regions(sdev);
>+return 0;
>+}
>+
>+static int virtio_iommu_set_host_iova_ranges(VirtIOIOMMU *s, PCIBus
>*bus,
>+ int devfn, GList *iova_ranges,
>+ Error **errp)
>+{
>+IOMMUPciBus *sbus = g_hash_table_lookup(s->as_by_busptr, bus);
>+IOMMUDevice *sdev;
>+GList *current_ranges;
>+GList *l, *tmp, *new_ranges = NULL;
>+int ret = -EINVAL;
>+
>+if (!sbus) {
>+error_report("%s no sbus", __func__);
>+}
>+
>+sdev = sbus->pbdev[devfn];
>+
>+current_ranges = sdev->host_resv_ranges;
>+
>+if (sdev->probe_done) {

Will this still happen with new interface?

>+error_setg(errp,
>+   "%s: Notified about new host reserved regions after probe",
>+   __func__);
>+goto out;
>+}
>+
>+/* check that each new resv region is included in an existing one */
>+if (sdev->host_resv_ranges) {

Same here.

>+range_inverse_array(iova_ranges,
>+&new_ranges,
>+0, UINT64_MAX);
>+
>+for (tmp = new_ranges; tmp; tmp = tmp->next) {
>+Range *newr = (Range *)tmp->data;
>+bool included = false;
>+
>+for (l = current_ranges; l; l = l->next) {
>+Range * r = (Range *)l->data;
>+
>+if (range_contains_range(r, newr)) {
>+included = true;
>+break;
>+}
>+}
>+if (!included) {
>+goto error;
>+}
>+}
>+/* all new reserved ranges are included in existing ones */
>+ret = 0;
>+goto out;
>+}
>+
>+range_inverse_array(iova_ranges,
>+&sdev->host_resv_ranges,
>+0, UINT64_MAX);
>+rebuild_resv_regions(sdev);
>+
>+return 0;
>+error:
>+error_setg(errp, "%s Conflicting host reserved ranges set!",
>+   __func__);
>+out:
>+g_list_free_full(new_ranges, g_free);
>+return ret;
>+}
>+
> static bool virtio_iommu_set_iommu_device(PCIBus *bus, void *opaque,
>int devfn,
>   HostIOMMUDevice *hiod, Error **errp)
> {
> VirtIOIOMMU *viommu = opaque;
> VirtioHostIOMMUDevice *vhiod;
>+HostIOMMUDeviceClass *hiodc =
>HOST_IOMMU_DEVICE_GET_CLASS(hiod);
> struct hiod_key *new_key;
>+GList *host_iova_ranges = NULL;

g_autoptr(GList)?

Thanks
Zhenzhong

>
> assert(hiod);
>
>@@ -509,6 +611,20 @@ static bool
>virtio_iommu_set_iommu_device(PCIBus *bus, void *opaque, int devfn,
> return false;
> }
>
>+if (hiodc->get_iova_ranges) {
>+

RE: [RFC v2 3/7] HostIOMMUDevice: Introduce get_iova_ranges callback




>-----Original Message-----
>From: Eric Auger 
>Subject: [RFC v2 3/7] HostIOMMUDevice: Introduce get_iova_ranges
>callback
>
>Introduce a new HostIOMMUDevice callback that allows to
>retrieve the usable IOVA ranges.
>
>Implement this callback in the legacy VFIO and IOMMUFD VFIO
>host iommu devices. This relies on the VFIODevice agent's
>base container iova_ranges resource.
>
>Signed-off-by: Eric Auger 
>---
> include/sysemu/host_iommu_device.h |  8 
> hw/vfio/container.c| 14 ++
> hw/vfio/iommufd.c  | 14 ++
> 3 files changed, 36 insertions(+)
>
>diff --git a/include/sysemu/host_iommu_device.h
>b/include/sysemu/host_iommu_device.h
>index 3e5f058e7b..40e0fa13ef 100644
>--- a/include/sysemu/host_iommu_device.h
>+++ b/include/sysemu/host_iommu_device.h
>@@ -80,6 +80,14 @@ struct HostIOMMUDeviceClass {
>  * i.e., HOST_IOMMU_DEVICE_CAP_AW_BITS.
>  */
> int (*get_cap)(HostIOMMUDevice *hiod, int cap, Error **errp);
>+/**
>+ * @get_iova_ranges: Return the list of usable iova_ranges along with
>+ * @hiod Host IOMMU device
>+ *
>+ * @hiod: handle to the host IOMMU device
>+ * @errp: error handle
>+ */
>+GList* (*get_iova_ranges)(HostIOMMUDevice *hiod, Error **errp);

Previously I thought of exposing iova_ranges directly in
HostIOMMUDevice::caps.iova_ranges, but a new callback looks better for a
GList pointer.

> };
>
> /*
>diff --git a/hw/vfio/container.c b/hw/vfio/container.c
>index b728b978a2..edd0df6262 100644
>--- a/hw/vfio/container.c
>+++ b/hw/vfio/container.c
>@@ -1164,12 +1164,26 @@ static int
>hiod_legacy_vfio_get_cap(HostIOMMUDevice *hiod, int cap,
> }
> }
>
>+static GList *
>+hiod_legacy_vfio_get_iova_ranges(HostIOMMUDevice *hiod, Error **errp)
>+{
>+VFIODevice *vdev = hiod->agent;
>+GList *l = NULL;

g_assert(vdev)?

>+
>+if (vdev && vdev->bcontainer) {
>+l = g_list_copy(vdev->bcontainer->iova_ranges);
>+}
>+
>+return l;
>+}
>+
> static void hiod_legacy_vfio_class_init(ObjectClass *oc, void *data)
> {
> HostIOMMUDeviceClass *hioc = HOST_IOMMU_DEVICE_CLASS(oc);
>
> hioc->realize = hiod_legacy_vfio_realize;
> hioc->get_cap = hiod_legacy_vfio_get_cap;
>+hioc->get_iova_ranges = hiod_legacy_vfio_get_iova_ranges;
> };
>
> static const TypeInfo types[] = {
>diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c
>index dbdae1adbb..1706784063 100644
>--- a/hw/vfio/iommufd.c
>+++ b/hw/vfio/iommufd.c
>@@ -645,11 +645,25 @@ static bool
>hiod_iommufd_vfio_realize(HostIOMMUDevice *hiod, void *opaque,
> return true;
> }
>
>+static GList *
>+hiod_iommufd_vfio_get_iova_ranges(HostIOMMUDevice *hiod, Error
>**errp)
>+{
>+VFIODevice *vdev = hiod->agent;
>+GList *l = NULL;
>+

Same here.

Thanks
Zhenzhong

>+if (vdev && vdev->bcontainer) {
>+l = g_list_copy(vdev->bcontainer->iova_ranges);
>+}
>+
>+return l;
>+}
>+
> static void hiod_iommufd_vfio_class_init(ObjectClass *oc, void *data)
> {
> HostIOMMUDeviceClass *hiodc = HOST_IOMMU_DEVICE_CLASS(oc);
>
> hiodc->realize = hiod_iommufd_vfio_realize;
>+hiodc->get_iova_ranges = hiod_iommufd_vfio_get_iova_ranges;
> };
>
> static const TypeInfo types[] = {
>--
>2.41.0

[PATCH v3] i386/cpu: fixup number of addressable IDs for processor cores in the physical package

2024-06-10 Thread Chuang Xu

When QEMU is started with:
-cpu host,host-cache-info=on,l3-cache=off \
-smp 2,sockets=1,dies=1,cores=1,threads=2
Guest can't acquire maximum number of addressable IDs for processor cores in
the physical package from CPUID[04H].

When creating a CPU topology of 1 core per package, host-cache-info only
uses the Host's addressable core IDs field (CPUID.04H.EAX[bits 31-26]),
resulting in a conflict (on the multicore Host) between the Guest core
topology information in this field and the Guest's actual cores number.

Fix it by removing the unnecessary condition to cover 1 core per package
case. This is safe because cores_per_pkg will not be 0 and will be at
least 1.

Fixes: d7caf13b5fcf ("x86: cpu: fixup number of addressable IDs for logical 
processors sharing cache")
Signed-off-by: Guixiong Wei 
Signed-off-by: Yipeng Yin 
Signed-off-by: Chuang Xu 
---
 target/i386/cpu.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index bc2dceb647..b68f7460db 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -6426,10 +6426,8 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, 
uint32_t count,
 if (*eax & 31) {
 int host_vcpus_per_cache = 1 + ((*eax & 0x3FFC000) >> 14);
 
-if (cores_per_pkg > 1) {
-*eax &= ~0xFC000000;
-*eax |= max_core_ids_in_package(&topo_info) << 26;
-}
+*eax &= ~0xFC000000;
+*eax |= max_core_ids_in_package(&topo_info) << 26;
 if (host_vcpus_per_cache > threads_per_pkg) {
 *eax &= ~0x3FFC000;
 
-- 
2.20.1
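
As an illustration of the field update described in the commit message
above, here is a minimal standalone sketch, not the QEMU code. It assumes
that CPUID.04H:EAX[31:26] stores the number of addressable core IDs minus
one, rounded up to a power of two. The helper names fixup_cpuid4_eax() and
max_core_ids_minus_one() are hypothetical stand-ins and do not exist in QEMU.

```c
#include <assert.h>
#include <stdint.h>

/* CPUID.04H:EAX[31:26]: addressable core IDs in the package, minus one. */
#define CPUID4_CORE_IDS_SHIFT 26
#define CPUID4_CORE_IDS_MASK  0xFC000000u

/*
 * Hypothetical stand-in for QEMU's max_core_ids_in_package():
 * round the core count up to a power of two and subtract one.
 */
static uint32_t max_core_ids_minus_one(uint32_t cores_per_pkg)
{
    uint32_t ids = 1;

    assert(cores_per_pkg >= 1);
    while (ids < cores_per_pkg) {
        ids <<= 1;
    }
    return (ids - 1) & 0x3Fu; /* the field is only 6 bits wide */
}

/*
 * Replace the host's core-ID field with the guest's value, always,
 * including the 1-core-per-package case. All other bits are preserved.
 */
static uint32_t fixup_cpuid4_eax(uint32_t host_eax, uint32_t guest_cores_per_pkg)
{
    uint32_t eax = host_eax & ~CPUID4_CORE_IDS_MASK;

    eax |= max_core_ids_minus_one(guest_cores_per_pkg) << CPUID4_CORE_IDS_SHIFT;
    return eax;
}
```

With a host value whose field is 7 and a guest topology of 1 core per
package, the field becomes 0 while the low bits are kept. This is the case
that the removed "cores_per_pkg > 1" condition used to skip.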

RE: [RFC v2 7/7] memory: Remove IOMMU MR iommu_set_iova_range API




>-Original Message-
>From: Eric Auger 
>Subject: [RFC v2 7/7] memory: Remove IOMMU MR iommu_set_iova_range
>API
>
>Since the host IOVA ranges are now passed through the
>PCIIOMMUOps set_host_resv_regions and we have removed
>the only implementation of iommu_set_iova_range() in
>the virtio-iommu and the only call site in vfio/common,
>let's retire the IOMMU MR API and its memory wrapper.
>
>Signed-off-by: Eric Auger 

Reviewed-by: Zhenzhong Duan  

Thanks
Zhenzhong

>---
> include/exec/memory.h | 32 
> system/memory.c   | 13 -
> 2 files changed, 45 deletions(-)
>
>diff --git a/include/exec/memory.h b/include/exec/memory.h
>index 9cdd64e9c6..35d772e52b 100644
>--- a/include/exec/memory.h
>+++ b/include/exec/memory.h
>@@ -530,26 +530,6 @@ struct IOMMUMemoryRegionClass {
>  int (*iommu_set_page_size_mask)(IOMMUMemoryRegion *iommu,
>  uint64_t page_size_mask,
>  Error **errp);
>-/**
>- * @iommu_set_iova_ranges:
>- *
>- * Propagate information about the usable IOVA ranges for a given
>IOMMU
>- * memory region. Used for example to propagate host physical device
>- * reserved memory region constraints to the virtual IOMMU.
>- *
>- * Optional method: if this method is not provided, then the default IOVA
>- * aperture is used.
>- *
>- * @iommu: the IOMMUMemoryRegion
>- *
>- * @iova_ranges: list of ordered IOVA ranges (at least one range)
>- *
>- * Returns 0 on success, or a negative error. In case of failure, the 
>error
>- * object must be created.
>- */
>- int (*iommu_set_iova_ranges)(IOMMUMemoryRegion *iommu,
>-  GList *iova_ranges,
>-  Error **errp);
> };
>
> typedef struct RamDiscardListener RamDiscardListener;
>@@ -1945,18 +1925,6 @@ int
>memory_region_iommu_set_page_size_mask(IOMMUMemoryRegion
>*iommu_mr,
>uint64_t page_size_mask,
>Error **errp);
>
>-/**
>- * memory_region_iommu_set_iova_ranges - Set the usable IOVA ranges
>- * for a given IOMMU MR region
>- *
>- * @iommu: IOMMU memory region
>- * @iova_ranges: list of ordered IOVA ranges (at least one range)
>- * @errp: pointer to Error*, to store an error if it happens.
>- */
>-int memory_region_iommu_set_iova_ranges(IOMMUMemoryRegion
>*iommu,
>-GList *iova_ranges,
>-Error **errp);
>-
> /**
>  * memory_region_name: get a memory region's name
>  *
>diff --git a/system/memory.c b/system/memory.c
>index 9540caa8a1..248d514f83 100644
>--- a/system/memory.c
>+++ b/system/memory.c
>@@ -1914,19 +1914,6 @@ int
>memory_region_iommu_set_page_size_mask(IOMMUMemoryRegion
>*iommu_mr,
> return ret;
> }
>
>-int memory_region_iommu_set_iova_ranges(IOMMUMemoryRegion
>*iommu_mr,
>-GList *iova_ranges,
>-Error **errp)
>-{
>-IOMMUMemoryRegionClass *imrc =
>IOMMU_MEMORY_REGION_GET_CLASS(iommu_mr);
>-int ret = 0;
>-
>-if (imrc->iommu_set_iova_ranges) {
>-ret = imrc->iommu_set_iova_ranges(iommu_mr, iova_ranges, errp);
>-}
>-return ret;
>-}
>-
> int memory_region_register_iommu_notifier(MemoryRegion *mr,
>   IOMMUNotifier *n, Error **errp)
> {
>--
>2.41.0

RE: [RFC v2 6/7] hw/vfio: Remove memory_region_iommu_set_iova_ranges() call




>-Original Message-
>From: Eric Auger 
>Subject: [RFC v2 6/7] hw/vfio: Remove
>memory_region_iommu_set_iova_ranges() call
>
>As we have just removed the only implementation of
>iommu_set_iova_ranges IOMMU MR callback in the virtio-iommu,
>let's remove the call to the memory wrapper. Usable IOVA ranges
>are now conveyed through the PCIIOMMUOps in VFIO-PCI.
>
>Signed-off-by: Eric Auger 

Reviewed-by: Zhenzhong Duan  

Thanks
Zhenzhong

>---
> hw/vfio/common.c | 10 --
> 1 file changed, 10 deletions(-)
>
>diff --git a/hw/vfio/common.c b/hw/vfio/common.c
>index f20a7b5bba..9e4c0cc95f 100644
>--- a/hw/vfio/common.c
>+++ b/hw/vfio/common.c
>@@ -630,16 +630,6 @@ static void
>vfio_listener_region_add(MemoryListener *listener,
> goto fail;
> }
>
>-if (bcontainer->iova_ranges) {
>-ret = memory_region_iommu_set_iova_ranges(giommu-
>>iommu_mr,
>-  bcontainer->iova_ranges,
>-  );
>-if (ret) {
>-g_free(giommu);
>-goto fail;
>-}
>-}
>-
> ret = memory_region_register_iommu_notifier(section->mr,
>>n,
> );
> if (ret) {
>--
>2.41.0

RE: [RFC v2 5/7] virtio-iommu: Remove the implementation of iommu_set_iova_range




>-Original Message-
>From: Eric Auger 
>Subject: [RFC v2 5/7] virtio-iommu: Remove the implementation of
>iommu_set_iova_range
>
>Now that we use PCIIOMMUOps to convey information about usable IOVA
>ranges we do not need to implement the iommu_set_iova_ranges IOMMU MR
>callback.
>
>Signed-off-by: Eric Auger 

Reviewed-by: Zhenzhong Duan  

Thanks
Zhenzhong

>---
> hw/virtio/virtio-iommu.c | 67 
> 1 file changed, 67 deletions(-)
>
>diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
>index 33e9682b83..643bbb060b 100644
>--- a/hw/virtio/virtio-iommu.c
>+++ b/hw/virtio/virtio-iommu.c
>@@ -1360,72 +1360,6 @@ static int
>virtio_iommu_set_page_size_mask(IOMMUMemoryRegion *mr,
> return 0;
> }
>
>-/**
>- * virtio_iommu_set_iova_ranges: Conveys the usable IOVA ranges
>- *
>- * The function turns those into reserved ranges. Once some
>- * reserved ranges have been set, new reserved regions cannot be
>- * added outside of the original ones.
>- *
>- * @mr: IOMMU MR
>- * @iova_ranges: list of usable IOVA ranges
>- * @errp: error handle
>- */
>-static int virtio_iommu_set_iova_ranges(IOMMUMemoryRegion *mr,
>-GList *iova_ranges,
>-Error **errp)
>-{
>-IOMMUDevice *sdev = container_of(mr, IOMMUDevice, iommu_mr);
>-GList *current_ranges = sdev->host_resv_ranges;
>-GList *l, *tmp, *new_ranges = NULL;
>-int ret = -EINVAL;
>-
>-/* check that each new resv region is included in an existing one */
>-if (sdev->host_resv_ranges) {
>-range_inverse_array(iova_ranges,
>-&new_ranges,
>-0, UINT64_MAX);
>-
>-for (tmp = new_ranges; tmp; tmp = tmp->next) {
>-Range *newr = (Range *)tmp->data;
>-bool included = false;
>-
>-for (l = current_ranges; l; l = l->next) {
>-Range * r = (Range *)l->data;
>-
>-if (range_contains_range(r, newr)) {
>-included = true;
>-break;
>-}
>-}
>-if (!included) {
>-goto error;
>-}
>-}
>-/* all new reserved ranges are included in existing ones */
>-ret = 0;
>-goto out;
>-}
>-
>-if (sdev->probe_done) {
>-warn_report("%s: Notified about new host reserved regions after
>probe",
>-mr->parent_obj.name);
>-}
>-
>-range_inverse_array(iova_ranges,
>-&sdev->host_resv_ranges,
>-0, UINT64_MAX);
>-rebuild_resv_regions(sdev);
>-
>-return 0;
>-error:
>-error_setg(errp, "IOMMU mr=%s Conflicting host reserved ranges set!",
>-   mr->parent_obj.name);
>-out:
>-g_list_free_full(new_ranges, g_free);
>-return ret;
>-}
>-
> static void virtio_iommu_system_reset(void *opaque)
> {
> VirtIOIOMMU *s = opaque;
>@@ -1751,7 +1685,6 @@ static void
>virtio_iommu_memory_region_class_init(ObjectClass *klass,
> imrc->replay = virtio_iommu_replay;
> imrc->notify_flag_changed = virtio_iommu_notify_flag_changed;
> imrc->iommu_set_page_size_mask = virtio_iommu_set_page_size_mask;
>-imrc->iommu_set_iova_ranges = virtio_iommu_set_iova_ranges;
> }
>
> static const TypeInfo virtio_iommu_info = {
>--
>2.41.0

RE: [RFC v2 1/7] HostIOMMUDevice: Store the VFIO/VDPA agent




>-Original Message-
>From: Eric Auger 
>Subject: [RFC v2 1/7] HostIOMMUDevice: Store the VFIO/VDPA agent
>
>Store the agent device (VFIO or VDPA) in the host IOMMU device.
>This will allow easy access to some of its resources.
>
>Signed-off-by: Eric Auger 
>---

Reviewed-by: Zhenzhong Duan  

Thanks
Zhenzhong

> include/sysemu/host_iommu_device.h | 1 +
> hw/vfio/container.c| 1 +
> hw/vfio/iommufd.c  | 2 ++
> 3 files changed, 4 insertions(+)
>
>diff --git a/include/sysemu/host_iommu_device.h
>b/include/sysemu/host_iommu_device.h
>index a57873958b..3e5f058e7b 100644
>--- a/include/sysemu/host_iommu_device.h
>+++ b/include/sysemu/host_iommu_device.h
>@@ -34,6 +34,7 @@ struct HostIOMMUDevice {
> Object parent_obj;
>
> char *name;
>+void *agent; /* pointer to agent device, ie. VFIO or VDPA device */
> HostIOMMUDeviceCaps caps;
> };
>
>diff --git a/hw/vfio/container.c b/hw/vfio/container.c
>index 26e6f7fb4f..b728b978a2 100644
>--- a/hw/vfio/container.c
>+++ b/hw/vfio/container.c
>@@ -1145,6 +1145,7 @@ static bool
>hiod_legacy_vfio_realize(HostIOMMUDevice *hiod, void *opaque,
>
> hiod->name = g_strdup(vdev->name);
> hiod->caps.aw_bits = vfio_device_get_aw_bits(vdev);
>+hiod->agent = opaque;
>
> return true;
> }
>diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c
>index 409ed3dcc9..dbdae1adbb 100644
>--- a/hw/vfio/iommufd.c
>+++ b/hw/vfio/iommufd.c
>@@ -631,6 +631,8 @@ static bool
>hiod_iommufd_vfio_realize(HostIOMMUDevice *hiod, void *opaque,
> struct iommu_hw_info_vtd vtd;
> } data;
>
>+hiod->agent = opaque;
>+
> if (!iommufd_backend_get_device_info(vdev->iommufd, vdev->devid,
>  , , sizeof(data), errp)) {
> return false;
>--
>2.41.0

Re:Re: [PATCH v5 00/10] Support persistent reservation operations

2024-06-10 Thread 卢长奇

Hi,

Sorry, I explained it in patch2 and forgot to reply to your email.

The existing PRManager only works with local SCSI devices. This series
completely decouples devices from drivers: the device need not be SCSI
and can be, for example, NVMe, and the driver is likewise unrestricted.

block/file-posix.c can also implement the new block driver callbacks, with
pr_manager invoked after the ioctl commands are assembled in that driver.
This will be implemented in subsequent patches.

On 2024/6/11 01:18, Stefan Hajnoczi wrote:
> On Thu, Jun 06, 2024 at 08:24:34PM +0800, Changqi Lu wrote:
>> Hi,
>>
>> patchv5 has been modified.
>>
>> Sincerely hope that everyone can help review the
>> code and provide some suggestions.
>>
>> v4->v5:
>> - Fixed a memory leak bug at hw/nvme/ctrl.c.
>>
>> v3->v4:
>> - At the nvme layer, the two patches of enabling the ONCS
>> function and enabling rescap are combined into one.
>> - At the nvme layer, add helper functions for pr capacity
>> conversion between the block layer and the nvme layer.
>>
>> v2->v3:
>> In v2, Persist Through Power Loss (PTPL) is enabled by default.
>> In v3, PTPL is supported and is passed as a parameter.
>>
>> v1->v2:
>> - Add sg_persist --report-capabilities for SCSI protocol and enable
>> oncs and rescap for NVMe protocol.
>> - Add persistent reservation capabilities constants and helper functions
>> for SCSI and NVMe protocol.
>> - Add comments for necessary APIs.
>>
>> v1:
>> - Add seven APIs about persistent reservation command for block layer.
>> These APIs including reading keys, reading reservations, registering,
>> reserving, releasing, clearing and preempting.
>> - Add the necessary pr-related operation APIs for both the
>> SCSI protocol and NVMe protocol at the device layer.
>> - Add scsi driver at the driver layer to verify the functions
>
> My question from v1 is unanswered:
>
> What is the relationship to the existing PRManager functionality
> (docs/interop/pr-helper.rst) where block/file-posix.c interprets SCSI
> ioctls and sends persistent reservation requests to an external helper
> process?
>
> I wonder if block/file-posix.c can implement the new block driver
> callbacks using pr_mgr (while keeping the existing scsi-generic
> support).
>
> Thanks,
> Stefan
>
>>
>>
>> Changqi Lu (10):
>> block: add persistent reservation in/out api
>> block/raw: add persistent reservation in/out driver
>> scsi/constant: add persistent reservation in/out protocol constants
>> scsi/util: add helper functions for persistent reservation types
>> conversion
>> hw/scsi: add persistent reservation in/out api for scsi device
>> block/nvme: add reservation command protocol constants
>> hw/nvme: add helper functions for converting reservation types
>> hw/nvme: enable ONCS and rescap function
>> hw/nvme: add reservation protocal command
>> block/iscsi: add persistent reservation in/out driver
>>
>> block/block-backend.c | 397 ++
>> block/io.c | 163 +++
>> block/iscsi.c | 443 ++
>> block/raw-format.c | 56 
>> hw/nvme/ctrl.c | 326 +-
>> hw/nvme/ns.c | 5 +
>> hw/nvme/nvme.h | 84 ++
>> hw/scsi/scsi-disk.c | 352 
>> include/block/block-common.h | 40 +++
>> include/block/block-io.h | 20 ++
>> include/block/block_int-common.h | 84 ++
>> include/block/nvme.h | 98 +++
>> include/scsi/constants.h | 52 
>> include/scsi/utils.h | 8 +
>> include/sysemu/block-backend-io.h | 24 ++
>> scsi/utils.c | 81 ++
>> 16 files changed, 2231 insertions(+), 2 deletions(-)
>>
>> --
>> 2.20.1
>>

Re: [External] Re: [PATCH v5 01/10] block: add persistent reservation in/out api

2024-06-10 Thread 卢长奇

Hi,

Thanks for your advice! I will add it.

On 2024/6/11 01:26, Stefan Hajnoczi wrote:
> On Thu, Jun 06, 2024 at 08:24:35PM +0800, Changqi Lu wrote:
>> Add persistent reservation in/out operations
>> at the block level. The following operations
>> are included:
>>
>> - read_keys: retrieves the list of registered keys.
>> - read_reservation: retrieves the current reservation status.
>> - register: registers a new reservation key.
>> - reserve: initiates a reservation for a specific key.
>> - release: releases a reservation for a specific key.
>> - clear: clears all existing reservations.
>> - preempt: preempts a reservation held by another key.
>>
>> Signed-off-by: Changqi Lu
>> Signed-off-by: zhenwei pi
>> ---
>> block/block-backend.c | 397 ++
>> block/io.c | 163 
>> include/block/block-common.h | 40 +++
>> include/block/block-io.h | 20 ++
>> include/block/block_int-common.h | 84 +++
>> include/sysemu/block-backend-io.h | 24 ++
>> 6 files changed, 728 insertions(+)
>>
>> diff --git a/block/block-backend.c b/block/block-backend.c
>> index db6f9b92a3..6707d94df7 100644
>> --- a/block/block-backend.c
>> +++ b/block/block-backend.c
>> @@ -1770,6 +1770,403 @@ BlockAIOCB *blk_aio_ioctl(BlockBackend *blk,
unsigned long int req, void *buf,
>> return blk_aio_prwv(blk, req, 0, buf, blk_aio_ioctl_entry, 0, cb,
opaque);
>> }
>>
>> +typedef struct BlkPrInCo {
>> + BlockBackend *blk;
>> + uint32_t *generation;
>> + uint32_t num_keys;
>> + BlockPrType *type;
>> + uint64_t *keys;
>> + int ret;
>> +} BlkPrInCo;
>> +
>> +typedef struct BlkPrInCB {
>> + BlockAIOCB common;
>> + BlkPrInCo prco;
>> + bool has_returned;
>> +} BlkPrInCB;
>> +
>> +static const AIOCBInfo blk_pr_in_aiocb_info = {
>> + .aiocb_size = sizeof(BlkPrInCB),
>> +};
>> +
>> +static void blk_pr_in_complete(BlkPrInCB *acb)
>> +{
>> + if (acb->has_returned) {
>> + acb->common.cb(acb->common.opaque, acb->prco.ret);
>> + blk_dec_in_flight(acb->prco.blk);
>
> Did you receive my replies to v1 of this patch series?
>
> Please take a look at them and respond:
>
https://lore.kernel.org/qemu-devel/20240508093629.441057-1-luchangqi@bytedance.com/
>
> Thanks,
> Stefan
>
>> + qemu_aio_unref(acb);
>> + }
>> +}
>> +
>> +static void blk_pr_in_complete_bh(void *opaque)
>> +{
>> + BlkPrInCB *acb = opaque;
>> + assert(acb->has_returned);
>> + blk_pr_in_complete(acb);
>> +}
>> +
>> +static BlockAIOCB *blk_aio_pr_in(BlockBackend *blk, uint32_t
*generation,
>> + uint32_t num_keys, BlockPrType *type,
>> + uint64_t *keys, CoroutineEntry co_entry,
>> + BlockCompletionFunc *cb, void *opaque)
>> +{
>> + BlkPrInCB *acb;
>> + Coroutine *co;
>> +
>> + blk_inc_in_flight(blk);
>> + acb = blk_aio_get(&blk_pr_in_aiocb_info, blk, cb, opaque);
>> + acb->prco = (BlkPrInCo) {
>> + .blk = blk,
>> + .generation = generation,
>> + .num_keys = num_keys,
>> + .type = type,
>> + .ret = NOT_DONE,
>> + .keys = keys,
>> + };
>> + acb->has_returned = false;
>> +
>> + co = qemu_coroutine_create(co_entry, acb);
>> + aio_co_enter(qemu_get_current_aio_context(), co);
>> +
>> + acb->has_returned = true;
>> + if (acb->prco.ret != NOT_DONE) {
>> + replay_bh_schedule_oneshot_event(qemu_get_current_aio_context(),
>> + blk_pr_in_complete_bh, acb);
>> + }
>> +
>> + return &acb->common;
>> +}
>> +
>> +/* To be called between exactly one pair of blk_inc/dec_in_flight() */
>> +static int coroutine_fn
>> +blk_aio_pr_do_read_keys(BlockBackend *blk, uint32_t *generation,
>> + uint32_t num_keys, uint64_t *keys)
>> +{
>> + IO_CODE();
>> +
>> + blk_wait_while_drained(blk);
>> + GRAPH_RDLOCK_GUARD();
>> +
>> + if (!blk_co_is_available(blk)) {
>> + return -ENOMEDIUM;
>> + }
>> +
>> + return bdrv_co_pr_read_keys(blk_bs(blk), generation, num_keys, keys);
>> +}
>> +
>> +static void coroutine_fn blk_aio_pr_read_keys_entry(void *opaque)
>> +{
>> + BlkPrInCB *acb = opaque;
>> + BlkPrInCo *prco = &acb->prco;
>> +
>> + prco->ret = blk_aio_pr_do_read_keys(prco->blk, prco->generation,
>> + prco->num_keys, prco->keys);
>> + blk_pr_in_complete(acb);
>> +}
>> +
>> +BlockAIOCB *blk_aio_pr_read_keys(BlockBackend *blk, uint32_t
*generation,
>> + uint32_t num_keys, uint64_t *keys,
>> + BlockCompletionFunc *cb, void *opaque)
>> +{
>> + IO_CODE();
>> + return blk_aio_pr_in(blk, generation, num_keys, NULL, keys,
>> + blk_aio_pr_read_keys_entry, cb, opaque);
>> +}
>> +
>> +/* To be called between exactly one pair of blk_inc/dec_in_flight() */
>> +static int coroutine_fn
>> +blk_aio_pr_do_read_reservation(BlockBackend *blk, uint32_t *generation,
>> + uint64_t *key, BlockPrType *type)
>> +{
>> + IO_CODE();
>> +
>> + blk_wait_while_drained(blk);
>> + GRAPH_RDLOCK_GUARD();
>> +
>> + if (!blk_co_is_available(blk)) {
>> + return -ENOMEDIUM;
>> + }
>> +
>> + return bdrv_co_pr_read_reservation(blk_bs(blk), generation, key,
type);
>> +}
>> +
>> +static void coroutine_fn blk_aio_pr_read_reservation_entry(void
*opaque)
>> +{
>> + BlkPrInCB *acb = opaque;
>> + BlkPrInCo *prco = &acb->prco;
>> +
>> + prco->ret =

RE: [RFC v2 2/7] virtio-iommu: Implement set|unset]_iommu_device() callbacks

Hi Eric,

>-Original Message-
>From: Eric Auger 
>Subject: [RFC v2 2/7] virtio-iommu: Implement set|unset]_iommu_device()
>callbacks
>
>Implement PCIIOMMUOPs [set|unset]_iommu_device() callbacks.
>In set(), a VirtioHostIOMMUDevice is allocated which holds
>a reference to the HostIOMMUDevice. This object is stored in a hash
>table indexed by PCI BDF. The handle to the Host IOMMU device
>will allow to retrieve information related to the physical IOMMU.
>
>Signed-off-by: Eric Auger 
>---
> include/hw/virtio/virtio-iommu.h |  9 
> hw/virtio/virtio-iommu.c | 87
>
> 2 files changed, 96 insertions(+)
>
>diff --git a/include/hw/virtio/virtio-iommu.h b/include/hw/virtio/virtio-
>iommu.h
>index 83a52cc446..4f664ea0c4 100644
>--- a/include/hw/virtio/virtio-iommu.h
>+++ b/include/hw/virtio/virtio-iommu.h
>@@ -45,6 +45,14 @@ typedef struct IOMMUDevice {
> bool probe_done;
> } IOMMUDevice;
>
>+typedef struct VirtioHostIOMMUDevice {
>+void *viommu;
>+PCIBus *bus;
>+uint8_t devfn;
>+HostIOMMUDevice *dev;
>+QLIST_ENTRY(VirtioHostIOMMUDevice) next;
>+} VirtioHostIOMMUDevice;
>+
> typedef struct IOMMUPciBus {
> PCIBus   *bus;
> IOMMUDevice  *pbdev[]; /* Parent array is sparse, so dynamically alloc
>*/
>@@ -57,6 +65,7 @@ struct VirtIOIOMMU {
> struct virtio_iommu_config config;
> uint64_t features;
> GHashTable *as_by_busptr;
>+GHashTable *host_iommu_devices;
> IOMMUPciBus *iommu_pcibus_by_bus_num[PCI_BUS_MAX];
> PCIBus *primary_bus;
> ReservedRegion *prop_resv_regions;
>diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
>index 1326c6ec41..0680a357f0 100644
>--- a/hw/virtio/virtio-iommu.c
>+++ b/hw/virtio/virtio-iommu.c
>@@ -28,6 +28,7 @@
> #include "sysemu/kvm.h"
> #include "sysemu/reset.h"
> #include "sysemu/sysemu.h"
>+#include "sysemu/host_iommu_device.h"

Not sure whether it would be better to move this include to
include/hw/virtio/virtio-iommu.h, as HostIOMMUDevice is used there.

> #include "qemu/reserved-region.h"
> #include "qemu/units.h"
> #include "qapi/error.h"
>@@ -69,6 +70,11 @@ typedef struct VirtIOIOMMUMapping {
> uint32_t flags;
> } VirtIOIOMMUMapping;
>
>+struct hiod_key {
>+PCIBus *bus;
>+uint8_t devfn;
>+};
>+
> static inline uint16_t virtio_iommu_get_bdf(IOMMUDevice *dev)
> {
> return PCI_BUILD_BDF(pci_bus_num(dev->bus), dev->devfn);
>@@ -462,8 +468,86 @@ static AddressSpace
>*virtio_iommu_find_add_as(PCIBus *bus, void *opaque,
> return >as;
> }
>
>+static gboolean hiod_equal(gconstpointer v1, gconstpointer v2)
>+{
>+const struct hiod_key *key1 = v1;
>+const struct hiod_key *key2 = v2;
>+
>+return (key1->bus == key2->bus) && (key1->devfn == key2->devfn);
>+}
>+
>+static guint hiod_hash(gconstpointer v)
>+{
>+const struct hiod_key *key = v;
>+guint value = (guint)(uintptr_t)key->bus;
>+
>+return (guint)(value << 8 | key->devfn);
>+}
>+
>+static VirtioHostIOMMUDevice *
>+get_host_iommu_device(VirtIOIOMMU *viommu, PCIBus *bus, int devfn) {
>+struct hiod_key key = {
>+.bus = bus,
>+.devfn = devfn,
>+};
>+
>+return g_hash_table_lookup(viommu->host_iommu_devices, &key);
>+}
>+
>+static bool virtio_iommu_set_iommu_device(PCIBus *bus, void *opaque,
>int devfn,
>+  HostIOMMUDevice *hiod, Error **errp)
>+{
>+VirtIOIOMMU *viommu = opaque;
>+VirtioHostIOMMUDevice *vhiod;
>+struct hiod_key *new_key;
>+
>+assert(hiod);
>+
>+vhiod = get_host_iommu_device(viommu, bus, devfn);
>+if (vhiod) {
>+error_setg(errp, "VirtioHostIOMMUDevice already exists");
>+return false;
>+}
>+
>+vhiod = g_malloc0(sizeof(VirtioHostIOMMUDevice));
>+vhiod->bus = bus;
>+vhiod->devfn = (uint8_t)devfn;
>+vhiod->viommu = viommu;
>+vhiod->dev = hiod;
>+
>+new_key = g_malloc(sizeof(*new_key));
>+new_key->bus = bus;
>+new_key->devfn = devfn;
>+
>+object_ref(hiod);
>+g_hash_table_insert(viommu->host_iommu_devices, new_key, vhiod);
>+
>+return true;
>+}
>+
>+static void
>+virtio_iommu_unset_iommu_device(PCIBus *bus, void *opaque, int devfn)
>+{
>+VirtIOIOMMU *viommu = opaque;
>+VirtioHostIOMMUDevice *vhiod;
>+struct hiod_key key = {
>+.bus = bus,
>+.devfn = devfn,
>+};
>+
>+vhiod = g_hash_table_lookup(viommu->host_iommu_devices, &key);
>+if (!vhiod) {
>+return;
>+}
>+
>+g_hash_table_remove(viommu->host_iommu_devices, &key);
>+object_unref(vhiod->dev);

This looks like a use-after-free: vhiod is dereferenced after its entry
has been removed from the hash table.

Thanks
Zhenzhong
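
To illustrate the ordering concern, here is a minimal plain-C sketch. It
assumes the table owns its values and frees them on removal, as a GHashTable
created with a value-destroy function would. The creation call is not
visible in this hunk, so that is an assumption. The types and helpers below
(HostDev, Entry, Table, table_remove, unset_device) are toy stand-ins, not
QEMU or GLib APIs.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-in for a refcounted HostIOMMUDevice. */
typedef struct HostDev {
    int refcount;
} HostDev;

/* Toy stand-in for VirtioHostIOMMUDevice: holds a reference to HostDev. */
typedef struct Entry {
    HostDev *dev;
} Entry;

/* One-slot "table" that owns its value and frees it on removal. */
typedef struct Table {
    Entry *slot;
} Table;

static void host_dev_unref(HostDev *d)
{
    d->refcount--;
}

/* Removal frees the stored entry, so any pointer to it dangles afterwards. */
static void table_remove(Table *t)
{
    free(t->slot);
    t->slot = NULL;
}

/* Safe ordering: read the field first, remove second, unref last. */
static void unset_device(Table *t)
{
    Entry *e = t->slot;
    HostDev *dev;

    if (!e) {
        return;
    }
    dev = e->dev;        /* read while the entry is still alive */
    table_remove(t);     /* e must not be dereferenced after this */
    host_dev_unref(dev);
}
```

An equivalent fix would be to drop the reference before removing the entry.
Either way, the entry is never touched after the table has released it.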

>+}
>+
> static const PCIIOMMUOps virtio_iommu_ops = {
> .get_address_space = virtio_iommu_find_add_as,
>+.set_iommu_device = virtio_iommu_set_iommu_device,
>+.unset_iommu_device = virtio_iommu_unset_iommu_device,
> };
>
> static int virtio_iommu_attach(VirtIOIOMMU *s,
>@@ -1357,6 +1441,9 @@ static void
>virtio_iommu_device_realize(DeviceState *dev, Error **errp)
>
> s->as_by_busptr =

RE: [PATCH v7 00/17] Add a host IOMMU device abstraction to check with vIOMMU



>-Original Message-
>From: Eric Auger 
>Subject: Re: [PATCH v7 00/17] Add a host IOMMU device abstraction to
>check with vIOMMU
>
>Hi Zhenzhong,
>
>On 6/5/24 10:30, Zhenzhong Duan wrote:
>> Hi,
>>
>> This series introduce a HostIOMMUDevice abstraction and sub-classes.
>> Also HostIOMMUDeviceCaps structure in HostIOMMUDevice and a new
>interface
>> between vIOMMU and HostIOMMUDevice.
>>
>> A HostIOMMUDevice is an abstraction for an assigned device that is
>protected
>> by a physical IOMMU (aka host IOMMU). The userspace interaction with
>this
>> physical IOMMU can be done either through the VFIO IOMMU type 1
>legacy
>> backend or the new iommufd backend. The assigned device can be a VFIO
>device
>> or a VDPA device. The HostIOMMUDevice is needed to interact with the
>host
>> IOMMU that protects the assigned device. It is especially useful when the
>> device is also protected by a virtual IOMMU as this latter use the
>translation
>> services of the physical IOMMU and is constrained by it. In that context the
>> HostIOMMUDevice can be passed to the virtual IOMMU to collect physical
>IOMMU
>> capabilities such as the supported address width. In the future, the virtual
>> IOMMU will use the HostIOMMUDevice to program the guest page tables
>in the
>> first translation stage of the physical IOMMU.
>>
>> HostIOMMUDeviceClass::realize() is introduced to initialize
>> HostIOMMUDeviceCaps and other fields of HostIOMMUDevice variants.
>>
>> HostIOMMUDeviceClass::get_cap() is introduced to query host IOMMU
>> device capabilities.
>>
>> The class tree is as below:
>>
>>                                  HostIOMMUDevice
>>                                         | .caps
>>                                         | .realize()
>>                                         | .get_cap()
>>                                         |
>>             .-----------------------------------------------------.
>>             |                           |                         |
>> HostIOMMUDeviceLegacyVFIO  {HostIOMMUDeviceLegacyVDPA}  HostIOMMUDeviceIOMMUFD
>>             |                           |                         | [.iommufd]
>>                                                                   | [.devid]
>>                                                                   | [.ioas_id]
>>                                                                   | [.attach_hwpt()]
>>                                                                   | [.detach_hwpt()]
>>                                                                   |
>>                                            .----------------------------.
>>                                            |                            |
>>                               HostIOMMUDeviceIOMMUFDVFIO  {HostIOMMUDeviceIOMMUFDVDPA}
>>                                            | [.vdev]                    | {.vdev}
>> * The attributes in [] will be implemented in nesting series.
>> * The classes in {} will be implemented in future.
>> * .vdev in different class points to different agent device,
>> * i.e., VFIODevice or VDPADevice.
>>
>> PATCH1-4: Introduce HostIOMMUDevice and its sub classes
>> PATCH5-10: Implement .realize() and .get_cap() handler
>> PATCH11-14: Create HostIOMMUDevice instance and pass to vIOMMU
>> PATCH15-17: Implement compatibility check between host IOMMU and
>vIOMMU(intel_iommu)
>>
>> Test done:
>> make check
>> vfio device hotplug/unplug with different backend on linux
>> reboot, kexec
>> build test on linux and windows11
>>
>> Qemu code can be found at:
>>
>https://github.com/yiliu1765/qemu/tree/zhenzhong/iommufd_nesting_pre
>q_v7
>>
>> Besides the compatibility check in this series, in nesting series, this
>> host IOMMU device is extended for much wider usage. For anyone
>interested
>> on the nesting series, here is the link:
>>
>https://github.com/yiliu1765/qemu/tree/zhenzhong/iommufd_nesting_rfc
>v2
>>
>> Thanks
>> Zhenzhong
>>
>> Changelog:
>> v7:
>> - drop config CONFIG_HOST_IOMMU_DEVICE (Cédric)
>> - introduce HOST_IOMMU_DEVICE_CAP_AW_BITS_MAX (Eric)
>> - use iova_ranges method in iommufd.realize() (Eric)
>> - introduce HostIOMMUDevice::name to facilitate tracing (Eric)
>> - implement a custom destroy hash function (Cédric)
>> - drop VTDHostIOMMUDevice and save HostIOMMUDevice in hash table
>(Eric)
>> - move patch5 after patch1 (Eric)
>> - squash patch3 and 4, squash patch12 and 13 (Eric)
>> - refine comments (Eric)
>> - collect Eric's R-B
>
>for the whole series:
>Reviewed-by: Eric Auger 

Thanks Eric.

>
>I exercised part of it using the virtio-iommu and this series on top
>[RFC v2 0/7] VIRTIO-IOMMU/VFIO: Fix host iommu geometry handling for
>hotplugged devices

You are super-efficient

BRs.
Zhenzhong

Re: [PULL 06/10] hw/loongarch: Refine fwcfg memory map

2024-06-10 Thread maobibo





On 2024/6/7 10:31 PM, Peter Maydell wrote:

On Thu, 23 May 2024 at 02:48, Song Gao  wrote:


From: Bibo Mao 

Memory map table for fwcfg is used for UEFI BIOS, UEFI BIOS uses the first
entry from fwcfg memory map as the first memory HOB, the second memory HOB
will be used if the first memory HOB is used up.

Memory map table for fwcfg does not care about numa node, however in
general the first memory HOB is part of numa node0, so that runtime
memory of UEFI which is allocated from the first memory HOB is located
at numa node0.

Signed-off-by: Bibo Mao 
Reviewed-by: Song Gao 
Message-Id: <20240515093927.3453674-4-maob...@loongson.cn>
Signed-off-by: Song Gao 


Hi; Coverity points out a possible issue with this code
(CID 1546441):


+static void fw_cfg_add_memory(MachineState *ms)
+{
+hwaddr base, size, ram_size, gap;
+int nb_numa_nodes, nodes;
+NodeInfo *numa_info;
+
+ram_size = ms->ram_size;
+base = VIRT_LOWMEM_BASE;
+gap = VIRT_LOWMEM_SIZE;
+nodes = nb_numa_nodes = ms->numa_state->num_nodes;
+numa_info = ms->numa_state->nodes;
+if (!nodes) {
+nodes = 1;
+}
+
+/* add fw_cfg memory map of node0 */
+if (nb_numa_nodes) {
+size = numa_info[0].node_mem;
+} else {
+size = ram_size;
+}
+
+if (size >= gap) {
+memmap_add_entry(base, gap, 1);
+size -= gap;
+base = VIRT_HIGHMEM_BASE;
+gap = ram_size - VIRT_LOWMEM_SIZE;


In this if() statement we set 'gap'...


+}
+
+if (size) {
+memmap_add_entry(base, size, 1);
+base += size;
+}
+
+if (nodes < 2) {
+return;
+}
+
+/* add fw_cfg memory map of other nodes */
+size = ram_size - numa_info[0].node_mem;
+gap  = VIRT_LOWMEM_BASE + VIRT_LOWMEM_SIZE;


...but then later here we unconditionally overwrite 'gap',
without ever using it in between, making the previous
assignment useless.

What was the intention here ?

This is an abuse of the variable gap: sometimes it represents the low
memory size, and sometimes the end address of low memory.

The assignment can be removed in both places. What about this patch?

--- a/hw/loongarch/virt.c
+++ b/hw/loongarch/virt.c
@@ -1054,7 +1054,6 @@ static void fw_cfg_add_memory(MachineState *ms)
 memmap_add_entry(base, gap, 1);
 size -= gap;
 base = VIRT_HIGHMEM_BASE;
-gap = ram_size - VIRT_LOWMEM_SIZE;
 }

 if (size) {
@@ -1068,15 +1067,14 @@ static void fw_cfg_add_memory(MachineState *ms)

 /* add fw_cfg memory map of other nodes */
 size = ram_size - numa_info[0].node_mem;
-gap  = VIRT_LOWMEM_BASE + VIRT_LOWMEM_SIZE;
-if (base < gap && (base + size) > gap) {
+if (numa_info[0].node_mem < gap && ram_size > gap) {
 /*
  * memory map for the maining nodes splited into two part
- *   lowram:  [base, +(gap - base))
- *   highram: [VIRT_HIGHMEM_BASE, +(size - (gap - base)))
+ * lowram:  [base, +(gap - numa_info[0].node_mem))
+ * highram: [VIRT_HIGHMEM_BASE, +(size - (gap - 
numa_info[0].node_mem)))

  */
-memmap_add_entry(base, gap - base, 1);
-size -= gap - base;
+memmap_add_entry(base, gap - numa_info[0].node_mem, 1);
+size -= gap - numa_info[0].node_mem;
 base = VIRT_HIGHMEM_BASE;
 }

Regards
Bibo Mao
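
For readers following the lowram/highram split discussed above, here is a
minimal standalone sketch of the idea, not the virt.c code. It maps a slice
of the flat guest RAM onto a low window and a high window, keeping the low
memory size as its own constant instead of reusing one variable for two
meanings. The constant values and the names MemEntry and map_ram_slice()
are assumptions chosen for illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative layout constants (assumed values, not taken from the patch). */
#define LOWMEM_BASE   0x0ULL
#define LOWMEM_SIZE   0x10000000ULL   /* 256 MiB low window */
#define HIGHMEM_BASE  0x80000000ULL

typedef struct MemEntry {
    uint64_t base;
    uint64_t size;
} MemEntry;

/*
 * Map the RAM slice [ram_off, ram_off + size) onto the address map.
 * The first LOWMEM_SIZE bytes of RAM live at LOWMEM_BASE, the rest at
 * HIGHMEM_BASE. Returns the number of entries written (0, 1 or 2).
 */
static int map_ram_slice(uint64_t ram_off, uint64_t size, MemEntry out[2])
{
    int n = 0;

    if (size && ram_off < LOWMEM_SIZE) {
        uint64_t low = LOWMEM_SIZE - ram_off;

        if (low > size) {
            low = size;
        }
        out[n].base = LOWMEM_BASE + ram_off;
        out[n].size = low;
        n++;
        ram_off += low;
        size -= low;
    }
    if (size) {
        out[n].base = HIGHMEM_BASE + (ram_off - LOWMEM_SIZE);
        out[n].size = size;
        n++;
    }
    return n;
}
```

For example, with 128 MiB on node 0 and 384 MiB on the remaining nodes, the
first call yields one low entry and the second call yields a low entry that
fills the rest of the window plus a high entry starting at HIGHMEM_BASE.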




thanks
-- PMM

Re: [PATCH RESEND 1/6] target/riscv: Introduce extension implied rules definition

2024-06-10 Thread Frank Chang

Hi Alistair,

On Tue, Jun 11, 2024 at 9:35 AM Alistair Francis 
wrote:

> On Wed, Jun 5, 2024 at 4:35 PM  wrote:
> >
> > From: Frank Chang 
> >
> > RISCVCPUImpliedExtsRule is created to store the implied rules.
> > 'is_misa' flag is used to distinguish whether the rule is derived
> > from the MISA or other extensions.
> > 'ext' stores the MISA bit if 'is_misa' is true. Otherwise, it stores
> > the offset of the extension defined in RISCVCPUConfig. 'ext' will also
> > serve as the key of the hash tables to look up the rule in the following
> > commit.
> >
> > Signed-off-by: Frank Chang 
> > ---
> >  target/riscv/cpu.c |  8 
> >  target/riscv/cpu.h | 18 ++
> >  2 files changed, 26 insertions(+)
> >
> > diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
> > index cee6fc4a9a..c7e5cec7ef 100644
> > --- a/target/riscv/cpu.c
> > +++ b/target/riscv/cpu.c
> > @@ -2242,6 +2242,14 @@ RISCVCPUProfile *riscv_profiles[] = {
> >  NULL,
> >  };
> >
> > +RISCVCPUImpliedExtsRule *riscv_misa_implied_rules[] = {
> > +NULL
> > +};
> > +
> > +RISCVCPUImpliedExtsRule *riscv_ext_implied_rules[] = {
> > +NULL
> > +};
> > +
> >  static Property riscv_cpu_properties[] = {
> >  DEFINE_PROP_BOOL("debug", RISCVCPU, cfg.debug, true),
> >
> > diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
> > index 1501868008..b5a036cf27 100644
> > --- a/target/riscv/cpu.h
> > +++ b/target/riscv/cpu.h
> > @@ -122,6 +122,24 @@ typedef enum {
> >  EXT_STATUS_DIRTY,
> >  } RISCVExtStatus;
> >
> > +typedef struct riscv_cpu_implied_exts_rule RISCVCPUImpliedExtsRule;
> > +
> > +struct riscv_cpu_implied_exts_rule {
> > +/* Bitmask indicates the rule enabled status for the harts. */
> > +uint64_t enabled;
>
> I'm not clear why we need this
>

This is because a rule may be implied more than once.
e.g. Zcf implies RVF, Zfa also implies RVF.
There's no need to check RVF's implied rule again for Zfa after Zcf's
implied rules are enabled.

The implied rules are checked recursively, so once the rule has been
enabled (per-CPU basis),
the rule (and all its implied rules) will not be rechecked.
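
A minimal sketch of the idea described above, under simplifying
assumptions: extensions are plain array indices, RVF is treated like any
other extension, and the names Rule, rules[], apply_rule() and rule_visits
are hypothetical rather than taken from the series. It only shows how a
per-hart bit in 'enabled' stops a shared implied rule from being expanded
twice.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define RULE_END (-1)

enum { EXT_F, EXT_ZICSR, EXT_ZCF, EXT_ZFA, EXT_COUNT };

typedef struct Rule {
    uint64_t enabled;   /* bit N set: rule already expanded for hart N */
    int ext;            /* extension that owns this rule */
    int implied[4];     /* implied extensions, RULE_END-terminated */
} Rule;

static Rule rules[EXT_COUNT] = {
    [EXT_F]     = { .ext = EXT_F,     .implied = { EXT_ZICSR, RULE_END } },
    [EXT_ZICSR] = { .ext = EXT_ZICSR, .implied = { RULE_END } },
    [EXT_ZCF]   = { .ext = EXT_ZCF,   .implied = { EXT_F, RULE_END } },
    [EXT_ZFA]   = { .ext = EXT_ZFA,   .implied = { EXT_F, RULE_END } },
};

/* Counts how many rules were actually expanded (for demonstration). */
static int rule_visits;

/* Enable `ext` for `hart` (hart < 64) and, recursively, what it implies. */
void apply_rule(int ext, unsigned hart, bool ext_on[EXT_COUNT])
{
    Rule *rule = &rules[ext];
    uint64_t hart_bit = UINT64_C(1) << hart;

    ext_on[ext] = true;

    if (rule->enabled & hart_bit) {
        return;         /* already expanded for this hart: skip the subtree */
    }
    rule->enabled |= hart_bit;
    rule_visits++;

    for (int i = 0; rule->implied[i] != RULE_END; i++) {
        apply_rule(rule->implied[i], hart, ext_on);
    }
}
```

In this sketch, applying Zcf for hart 0 expands three rules (Zcf, F,
Zicsr); applying Zfa afterwards expands only Zfa, because the F rule is
already marked as enabled for that hart.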

Regards,
Frank Chang


> Alistair
>
> > +/* True if this is a MISA implied rule. */
> > +bool is_misa;
> > +/* ext is MISA bit if is_misa flag is true, else extension offset. */
> > +const uint32_t ext;
> > +const uint32_t implied_misas;
> > +const uint32_t implied_exts[];
> > +};
> > +
> > +extern RISCVCPUImpliedExtsRule *riscv_misa_implied_rules[];
> > +extern RISCVCPUImpliedExtsRule *riscv_ext_implied_rules[];
> > +
> > +#define RISCV_IMPLIED_EXTS_RULE_END -1
> > +
> >  #define MMU_USER_IDX 3
> >
> >  #define MAX_RISCV_PMPS (16)
> > --
> > 2.43.2
> >
> >
>

Re: [PATCH v3 00/13] riscv: QEMU RISC-V IOMMU Support

2024-06-10 Thread LIU Zhiwei


Hi Daniel,

I want to know if we can use the IOMMU and IOPMP at the same time.

Is the relationship between them more similar to that between MMU and sPMP,
or to that between MMU and PMP?


Thanks,
Zhiwei

On 2024/5/24 1:39, Daniel Henrique Barboza wrote:

Hi,

In this new version a lot of changes were made throughout all the code,
most notably on patch 3. Link for the previous version is [1].

* How it was tested *

This series was tested using an emulated QEMU RISC-V host booting a QEMU
KVM guest, passing through an emulated e1000 network card from the host
to the guest. I can provide more details (e.g. QEMU command lines) if
required, just let me know. For now this cover-letter is too much of an
essay as is.

The Linux kernel used for tests can be found here:

https://github.com/tjeznach/linux/tree/riscv_iommu_v6-rc3

This is a newer version of the following work from Tomasz:

https://lore.kernel.org/linux-riscv/cover.1715708679.git.tjezn...@rivosinc.com/
("[PATCH v5 0/7] Linux RISC-V IOMMU Support")

The v5 wasn't enough for the testing being done. v6-rc3 did the trick.

Note that to test this work using riscv-iommu-pci we'll need to provide
the Rivos PCI ID in the command line. More details down below.

* Highlights of this version *

- patches removed from v2: platform driver (riscv-iommu-sys, former
patch 05) and the EDU changes (patches 14 and 15). The platform driver
will be sent later with a working example on the 'virt' machine,
either on a newer version of this series or via a follow-up series. We
already have a PoC on [2] created by Sunil. More tests are needed, so
it'll be left behind for now. The EDU changes will be sent separately
after I finish the doc changes that Frank cited in v2.

- patch 3 contains the bulk of changes made from v2. Please give special
attention to the following functions since this is entirely new code I
ended up adding:
  
  - riscv_iommu_report_fault()
  - riscv_iommu_validate_device_ctx()
  - riscv_iommu_update_ipsr()

   Aside from these helpers, most of the changes made in patch 3 were
small and localized.

- Red Hat PCI ID related changes. A new patch (4) that introduces a
generic RISC-V IOMMU PCI ID was added. This PCI ID was graciously given
to us by Red Hat and Gerd Hoffmann from their ID space. The
riscv-iommu-pci device now defaults to this PCI ID instead of Rivos PCI
ID. The device was changed slightly to allow vendor-id and device-id to
be set in the command-line, so it's now possible to use this reference
device as another RISC-V IOMMU PCI device to ease the burden of
testing/development.

   To instantiate the riscv-iommu-pci device using the previous Rivos PCI
ID, use the following cmd line:

   -device riscv-iommu-pci,vendor-id=0x1efd,device-id=0xedf1

   I'm using these options to test the series with the existing Linux RISC-V
IOMMU support that uses just a Rivos ID to identify the device.


Series based on alistair/riscv-to-apply.next. It's also applicable on
current QEMU master. It can also be fetched from:

https://gitlab.com/danielhb/qemu/-/tree/riscv_iommu_v3
  


Patches missing reviews/acks: 3, 5, 9, 10, 11.

Changes from v2 [1]:
- patch 05 (hw/riscv: add riscv-iommu-sys platform device): dropped
   - will be reintroduced in a later review or as a follow-up series

- patches 14 and 15: dropped
   - will be sent separately

- patches 2, 3, 4 and 5:
   - removed all 'Ziommu' references

- patch 2:
   - added extra bits that patch 3 ended up using

- patch 3:
   - fixed blank line at EOF in hw/riscv/trace.h
   - added a riscv_iommu_report_fault() helper to report faults. The helper
     checks if a given fault is eligible to be reported if DTF is 1
   - Use riscv_iommu_report_fault() in riscv_iommu_ctx() and
     riscv_iommu_translate() to avoid code repetition
   - added a riscv_iommu_validate_device_ctx() helper to validate the device
     context as specified in "Device configuration checks" section. This
     helper is being used in riscv_iommu_ctx_fetch()
   - added a new riscv_iommu_update_ipsr() helper to handle IPSR updates
     in riscv_iommu_mmio_write()
   - riscv_iommu_msi_write() now reports a fault in all error paths
   - check for fctl.WSI before issuing a MSI interrupt in riscv_iommu_notify()
   - change riscv-iommu region name to 'riscv-iommu'
   - change address_space_init() name for PCI devices to 'name' instead of
     using TYPE_RISCV_IOMMU_PCI
   - changed riscv_iommu_mmio_ops min_access_size to 4
   - do not check for min and max sizes on riscv_iommu_mmio_write()
   - changed riscv_iommu_trap_ops  min_access_size to 4
   - removed IOMMU qemu_thread thread:
     - riscv_iommu_mmio_write() will now execute a riscv_iommu_process_fn by
       holding 'core_lock'
   - init FSCR as zero explicitly
   - check for bus->iommu_opaque == NULL before calling pci_setup_iommu()

- patch 4 (new):
   - add Red Hat PCI RISC-V IOMMU ID

- patch 5 (former 4):
   - create vendor-id and device-id properties
   - set Red Hat PCI RISC-V IOMMU ID as default ID

- patch

Re: [PATCH RESEND 4/6] target/riscv: Add standard extension implied rules

On Wed, Jun 5, 2024 at 4:35 PM  wrote:
>
> From: Frank Chang 
>
> Add standard extension implied rules to enable the implied extensions of
> the standard extension recursively.
>
> Signed-off-by: Frank Chang 

Acked-by: Alistair Francis 

Alistair

> ---
>  target/riscv/cpu.c | 340 +
>  1 file changed, 340 insertions(+)
>
> diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
> index a6e9055c5f..80b238060a 100644
> --- a/target/riscv/cpu.c
> +++ b/target/riscv/cpu.c
> @@ -2289,12 +2289,352 @@ static RISCVCPUImpliedExtsRule RVV_IMPLIED = {
>  },
>  };
>
> +static RISCVCPUImpliedExtsRule ZCB_IMPLIED = {
> +.ext = CPU_CFG_OFFSET(ext_zcb),
> +.implied_exts = {
> +CPU_CFG_OFFSET(ext_zca),
> +
> +RISCV_IMPLIED_EXTS_RULE_END
> +},
> +};
> +
> +static RISCVCPUImpliedExtsRule ZCD_IMPLIED = {
> +.ext = CPU_CFG_OFFSET(ext_zcd),
> +.implied_misas = RVD,
> +.implied_exts = {
> +CPU_CFG_OFFSET(ext_zca),
> +
> +RISCV_IMPLIED_EXTS_RULE_END
> +},
> +};
> +
> +static RISCVCPUImpliedExtsRule ZCE_IMPLIED = {
> +.ext = CPU_CFG_OFFSET(ext_zce),
> +.implied_exts = {
> +CPU_CFG_OFFSET(ext_zcb), CPU_CFG_OFFSET(ext_zcmp),
> +CPU_CFG_OFFSET(ext_zcmt),
> +
> +RISCV_IMPLIED_EXTS_RULE_END
> +},
> +};
> +
> +static RISCVCPUImpliedExtsRule ZCF_IMPLIED = {
> +.ext = CPU_CFG_OFFSET(ext_zcf),
> +.implied_misas = RVF,
> +.implied_exts = {
> +CPU_CFG_OFFSET(ext_zca),
> +
> +RISCV_IMPLIED_EXTS_RULE_END
> +},
> +};
> +
> +static RISCVCPUImpliedExtsRule ZCMP_IMPLIED = {
> +.ext = CPU_CFG_OFFSET(ext_zcmp),
> +.implied_exts = {
> +CPU_CFG_OFFSET(ext_zca),
> +
> +RISCV_IMPLIED_EXTS_RULE_END
> +},
> +};
> +
> +static RISCVCPUImpliedExtsRule ZCMT_IMPLIED = {
> +.ext = CPU_CFG_OFFSET(ext_zcmt),
> +.implied_exts = {
> +CPU_CFG_OFFSET(ext_zca), CPU_CFG_OFFSET(ext_zicsr),
> +
> +RISCV_IMPLIED_EXTS_RULE_END
> +},
> +};
> +
> +static RISCVCPUImpliedExtsRule ZDINX_IMPLIED = {
> +.ext = CPU_CFG_OFFSET(ext_zdinx),
> +.implied_exts = {
> +CPU_CFG_OFFSET(ext_zfinx),
> +
> +RISCV_IMPLIED_EXTS_RULE_END
> +},
> +};
> +
> +static RISCVCPUImpliedExtsRule ZFA_IMPLIED = {
> +.ext = CPU_CFG_OFFSET(ext_zfa),
> +.implied_misas = RVF,
> +.implied_exts = { RISCV_IMPLIED_EXTS_RULE_END },
> +};
> +
> +static RISCVCPUImpliedExtsRule ZFBFMIN_IMPLIED = {
> +.ext = CPU_CFG_OFFSET(ext_zfbfmin),
> +.implied_misas = RVF,
> +.implied_exts = { RISCV_IMPLIED_EXTS_RULE_END },
> +};
> +
> +static RISCVCPUImpliedExtsRule ZFH_IMPLIED = {
> +.ext = CPU_CFG_OFFSET(ext_zfh),
> +.implied_exts = {
> +CPU_CFG_OFFSET(ext_zfhmin),
> +
> +RISCV_IMPLIED_EXTS_RULE_END
> +},
> +};
> +
> +static RISCVCPUImpliedExtsRule ZFHMIN_IMPLIED = {
> +.ext = CPU_CFG_OFFSET(ext_zfhmin),
> +.implied_misas = RVF,
> +.implied_exts = { RISCV_IMPLIED_EXTS_RULE_END },
> +};
> +
> +static RISCVCPUImpliedExtsRule ZFINX_IMPLIED = {
> +.ext = CPU_CFG_OFFSET(ext_zfinx),
> +.implied_exts = {
> +CPU_CFG_OFFSET(ext_zicsr),
> +
> +RISCV_IMPLIED_EXTS_RULE_END
> +},
> +};
> +
> +static RISCVCPUImpliedExtsRule ZHINX_IMPLIED = {
> +.ext = CPU_CFG_OFFSET(ext_zhinx),
> +.implied_exts = {
> +CPU_CFG_OFFSET(ext_zhinxmin),
> +
> +RISCV_IMPLIED_EXTS_RULE_END
> +},
> +};
> +
> +static RISCVCPUImpliedExtsRule ZHINXMIN_IMPLIED = {
> +.ext = CPU_CFG_OFFSET(ext_zhinxmin),
> +.implied_exts = {
> +CPU_CFG_OFFSET(ext_zfinx),
> +
> +RISCV_IMPLIED_EXTS_RULE_END
> +},
> +};
> +
> +static RISCVCPUImpliedExtsRule ZICNTR_IMPLIED = {
> +.ext = CPU_CFG_OFFSET(ext_zicntr),
> +.implied_exts = {
> +CPU_CFG_OFFSET(ext_zicsr),
> +
> +RISCV_IMPLIED_EXTS_RULE_END
> +},
> +};
> +
> +static RISCVCPUImpliedExtsRule ZIHPM_IMPLIED = {
> +.ext = CPU_CFG_OFFSET(ext_zihpm),
> +.implied_exts = {
> +CPU_CFG_OFFSET(ext_zicsr),
> +
> +RISCV_IMPLIED_EXTS_RULE_END
> +},
> +};
> +
> +static RISCVCPUImpliedExtsRule ZK_IMPLIED = {
> +.ext = CPU_CFG_OFFSET(ext_zk),
> +.implied_exts = {
> +CPU_CFG_OFFSET(ext_zkn), CPU_CFG_OFFSET(ext_zkr),
> +CPU_CFG_OFFSET(ext_zkt),
> +
> +RISCV_IMPLIED_EXTS_RULE_END
> +},
> +};
> +
> +static RISCVCPUImpliedExtsRule ZKN_IMPLIED = {
> +.ext = CPU_CFG_OFFSET(ext_zkn),
> +.implied_exts = {
> +CPU_CFG_OFFSET(ext_zbkb), CPU_CFG_OFFSET(ext_zbkc),
> +CPU_CFG_OFFSET(ext_zbkx), CPU_CFG_OFFSET(ext_zkne),
> +CPU_CFG_OFFSET(ext_zknd), CPU_CFG_OFFSET(ext_zknh),
> +
> +RISCV_IMPLIED_EXTS_RULE_END
> +},
> +};
> +
> +static RISCVCPUImpliedExtsRule ZKS_IMPLIED = {
> +.ext = CPU_CFG_OFFSET(ext_zks),
> +.implied_exts = {
> +CPU_CFG_OFFSET(ext_zbkb), CPU_CFG_OFFSET(ext_zbkc),
> +CPU_CFG_OFFSET(ext_zbkx),

Re: [PATCH RESEND 3/6] target/riscv: Add MISA implied rules

On Wed, Jun 5, 2024 at 4:34 PM  wrote:
>
> From: Frank Chang 
>
> Add MISA extension implied rules to enable the implied extensions
> of MISA recursively.
>
> Signed-off-by: Frank Chang 

Reviewed-by: Alistair Francis 

Alistair

> ---
>  target/riscv/cpu.c | 50 +-
>  1 file changed, 49 insertions(+), 1 deletion(-)
>
> diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
> index c7e5cec7ef..a6e9055c5f 100644
> --- a/target/riscv/cpu.c
> +++ b/target/riscv/cpu.c
> @@ -2242,8 +2242,56 @@ RISCVCPUProfile *riscv_profiles[] = {
>  NULL,
>  };
>
> +static RISCVCPUImpliedExtsRule RVA_IMPLIED = {
> +.is_misa = true,
> +.ext = RVA,
> +.implied_exts = {
> +CPU_CFG_OFFSET(ext_zalrsc), CPU_CFG_OFFSET(ext_zaamo),
> +
> +RISCV_IMPLIED_EXTS_RULE_END
> +},
> +};
> +
> +static RISCVCPUImpliedExtsRule RVD_IMPLIED = {
> +.is_misa = true,
> +.ext = RVD,
> +.implied_misas = RVF,
> +.implied_exts = { RISCV_IMPLIED_EXTS_RULE_END },
> +};
> +
> +static RISCVCPUImpliedExtsRule RVF_IMPLIED = {
> +.is_misa = true,
> +.ext = RVF,
> +.implied_exts = {
> +CPU_CFG_OFFSET(ext_zicsr),
> +
> +RISCV_IMPLIED_EXTS_RULE_END
> +},
> +};
> +
> +static RISCVCPUImpliedExtsRule RVM_IMPLIED = {
> +.is_misa = true,
> +.ext = RVM,
> +.implied_exts = {
> +CPU_CFG_OFFSET(ext_zmmul),
> +
> +RISCV_IMPLIED_EXTS_RULE_END
> +},
> +};
> +
> +static RISCVCPUImpliedExtsRule RVV_IMPLIED = {
> +.is_misa = true,
> +.ext = RVV,
> +.implied_exts = {
> +CPU_CFG_OFFSET(ext_zve64d),
> +
> +RISCV_IMPLIED_EXTS_RULE_END
> +},
> +};
> +
>  RISCVCPUImpliedExtsRule *riscv_misa_implied_rules[] = {
> -NULL
> +&RVA_IMPLIED, &RVD_IMPLIED, &RVF_IMPLIED,
> +&RVM_IMPLIED, &RVV_IMPLIED, NULL
>  };
>
>  RISCVCPUImpliedExtsRule *riscv_ext_implied_rules[] = {
> --
> 2.43.2
>
>
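
For readers new to the series, here is a small standalone sketch of how a
rule keyed by a struct offset can be applied. It is written under
assumptions: DemoCfg, DEMO_CFG_OFFSET, DemoRule and demo_apply() are
hypothetical names, the macro is assumed to behave like an offsetof()
wrapper, and a pointer to a terminated array stands in for the flexible
array member used in the patch so that the example stays in standard C.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for a CPU config struct made of per-extension bool flags. */
typedef struct {
    bool ext_zicsr;
    bool ext_zca;
    bool ext_zcmt;
} DemoCfg;

/* Hypothetical equivalent of an offset-of-field macro. */
#define DEMO_CFG_OFFSET(field) ((uint32_t)offsetof(DemoCfg, field))
/* A -1 sentinel stored in a uint32_t compares equal to UINT32_MAX. */
#define DEMO_RULE_END UINT32_MAX

typedef struct {
    uint32_t ext;                   /* offset of the owning extension flag */
    const uint32_t *implied_exts;   /* offsets, DEMO_RULE_END-terminated */
} DemoRule;

static const uint32_t zcmt_implied[] = {
    DEMO_CFG_OFFSET(ext_zca), DEMO_CFG_OFFSET(ext_zicsr), DEMO_RULE_END
};

const DemoRule demo_zcmt_rule = {
    .ext = DEMO_CFG_OFFSET(ext_zcmt),
    .implied_exts = zcmt_implied,
};

/* Read the bool flag stored at `offset` inside the config struct. */
static bool cfg_get(const DemoCfg *cfg, uint32_t offset)
{
    return *(const bool *)((const char *)cfg + offset);
}

/* Set the bool flag stored at `offset` inside the config struct. */
static void cfg_set(DemoCfg *cfg, uint32_t offset)
{
    *(bool *)((char *)cfg + offset) = true;
}

/* If the owning extension is on, turn on everything the rule implies. */
void demo_apply(DemoCfg *cfg, const DemoRule *rule)
{
    if (!cfg_get(cfg, rule->ext)) {
        return;
    }
    for (size_t i = 0; rule->implied_exts[i] != DEMO_RULE_END; i++) {
        cfg_set(cfg, rule->implied_exts[i]);
    }
}
```

Storing offsets instead of pointers keeps the rule tables constant and
independent of any particular CPU object, which seems to be the reason for
the offset-based design in the series.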

Re: [PATCH RESEND 1/6] target/riscv: Introduce extension implied rules definition

On Wed, Jun 5, 2024 at 4:35 PM  wrote:
>
> From: Frank Chang 
>
> RISCVCPUImpliedExtsRule is created to store the implied rules.
> 'is_misa' flag is used to distinguish whether the rule is derived
> from the MISA or other extensions.
> 'ext' stores the MISA bit if 'is_misa' is true. Otherwise, it stores
> the offset of the extension defined in RISCVCPUConfig. 'ext' will also
> serve as the key of the hash tables to look up the rule in the following
> commit.
>
> Signed-off-by: Frank Chang 
> ---
>  target/riscv/cpu.c |  8 
>  target/riscv/cpu.h | 18 ++
>  2 files changed, 26 insertions(+)
>
> diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
> index cee6fc4a9a..c7e5cec7ef 100644
> --- a/target/riscv/cpu.c
> +++ b/target/riscv/cpu.c
> @@ -2242,6 +2242,14 @@ RISCVCPUProfile *riscv_profiles[] = {
>  NULL,
>  };
>
> +RISCVCPUImpliedExtsRule *riscv_misa_implied_rules[] = {
> +NULL
> +};
> +
> +RISCVCPUImpliedExtsRule *riscv_ext_implied_rules[] = {
> +NULL
> +};
> +
>  static Property riscv_cpu_properties[] = {
>  DEFINE_PROP_BOOL("debug", RISCVCPU, cfg.debug, true),
>
> diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
> index 1501868008..b5a036cf27 100644
> --- a/target/riscv/cpu.h
> +++ b/target/riscv/cpu.h
> @@ -122,6 +122,24 @@ typedef enum {
>  EXT_STATUS_DIRTY,
>  } RISCVExtStatus;
>
> +typedef struct riscv_cpu_implied_exts_rule RISCVCPUImpliedExtsRule;
> +
> +struct riscv_cpu_implied_exts_rule {
> +/* Bitmask indicates the rule enabled status for the harts. */
> +uint64_t enabled;

I'm not clear why we need this

Alistair

> +/* True if this is a MISA implied rule. */
> +bool is_misa;
> +/* ext is MISA bit if is_misa flag is true, else extension offset. */
> +const uint32_t ext;
> +const uint32_t implied_misas;
> +const uint32_t implied_exts[];
> +};
> +
> +extern RISCVCPUImpliedExtsRule *riscv_misa_implied_rules[];
> +extern RISCVCPUImpliedExtsRule *riscv_ext_implied_rules[];
> +
> +#define RISCV_IMPLIED_EXTS_RULE_END -1
> +
>  #define MMU_USER_IDX 3
>
>  #define MAX_RISCV_PMPS (16)
> --
> 2.43.2
>
>

Re: [PATCH v3 00/13] riscv: QEMU RISC-V IOMMU Support

On Tue, Jun 11, 2024 at 5:16 AM Daniel Henrique Barboza
 wrote:
>
>
>
> On 6/10/24 3:32 PM, Andrew Jones wrote:
> > On June 10, 2024 2:34:58 AM GMT+02:00, Alistair Francis 
> >  wrote:
> >> On Fri, May 24, 2024 at 3:43 AM Daniel Henrique Barboza
> >>  wrote:
> >>>
> >>> Hi,
> >>>
> >>> In this new version a lot of changes were made throughout all the code,
> >>> most notably on patch 3. Link for the previous version is [1].
> >>>
> >>> * How it was tested *
> >>>
> >>> This series was tested using an emulated QEMU RISC-V host booting a QEMU
> >>> KVM guest, passing through an emulated e1000 network card from the host
> >>> to the guest. I can provide more details (e.g. QEMU command lines) if
> >>> required, just let me know. For now this cover-letter is too much of an
> >>> essay as is.
> >>
> >> It would probably be helpful to document these somewhere, so others
> >> can use them as a starting point for running this
> >>
> >
> > I've written up a testing procedure which I shared internally with Daniel. 
> > I'll sanitize it and post it somewhere public.
> >
>
> I can also add a QEMU docs under docs/system/riscv, both as a
> subsection of virt.rst and perhaps a new doc that describes the
> devices itself (riscv-iommu-pci and later on riscv-iommu-sys).

I think that would be great. Even if it isn't a simple "copy this
command and it works" it at least gives users a place to start to
figure out how to use this

Alistair

Re: [PATCH v3 4/5] target/riscv: Restrict semihosting to TCG

On Tue, Jun 11, 2024 at 12:59 AM Philippe Mathieu-Daudé
 wrote:
>
> Semihosting currently uses the TCG probe_access API. To prepare for
> encoding the TCG dependency in Kconfig, do not enable it unless TCG
> is available.
>
> Suggested-by: Paolo Bonzini 
> Signed-off-by: Philippe Mathieu-Daudé 
> Reviewed-by: Anton Johansson 

Acked-by: Alistair Francis 

Alistair

> ---
>  target/riscv/Kconfig | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/target/riscv/Kconfig b/target/riscv/Kconfig
> index 5f30df22f2..c332616d36 100644
> --- a/target/riscv/Kconfig
> +++ b/target/riscv/Kconfig
> @@ -1,9 +1,9 @@
>  config RISCV32
>  bool
> -select ARM_COMPATIBLE_SEMIHOSTING # for do_common_semihosting()
> +imply ARM_COMPATIBLE_SEMIHOSTING if TCG
>  select DEVICE_TREE # needed by boot.c
>
>  config RISCV64
>  bool
> -select ARM_COMPATIBLE_SEMIHOSTING # for do_common_semihosting()
> +imply ARM_COMPATIBLE_SEMIHOSTING if TCG
>  select DEVICE_TREE # needed by boot.c
> --
> 2.41.0
>
>

Re: [RFC PATCH v1 0/6] Implement ARM PL011 in Rust

2024-06-10 Thread Pierrick Bouvier

On 6/10/24 13:29, Manos Pitsidianakis wrote:

On Mon, 10 Jun 2024 22:37, Pierrick Bouvier wrote:

Hello Manos,

On 6/10/24 11:22, Manos Pitsidianakis wrote:

Hello everyone,

This is an early draft of my work on implementing a very simple device,
in this case the ARM PL011 (which in C code resides in hw/char/pl011.c
and is used in hw/arm/virt.c).

The device is functional, with copied logic from the C code but with
effort not to make a direct C to Rust translation. In other words, do
not write Rust as a C developer would.

That goal is not complete but a best-effort case. To give a specific
example, register values are typed but interrupt bit flags are not (but
could be). I will leave such minutiae for later iterations.

By the way, the wiki page for Rust was revived to keep track of all
current series on the mailing list https://wiki.qemu.org/RustInQemu

a #qemu-rust IRC channel was also created for rust-specific discussion
that might flood #qemu

Excellent work, and thanks for posting this RFC!

IMHO, having patches 2 and 5 split is a bit confusing, and exposing
(temporarily) the generated.rs file in patches is not a good move.
Any reason you kept it this way?

That was my first approach, I will rework it on the second version. The
generated code should not exist in committed code at all.

It was initially tricky setting up the dependency orders correctly, so I
first committed it and then made it a dependency.

Maybe it could be better if build.rs file was *not* needed for new
devices/folders, and could be abstracted as a detail of the python
wrapper script instead of something that should be committed.

That'd mean you cannot work on the rust files with a LanguageServer, you
cannot run cargo build or cargo check or cargo clippy, etc. That's why I
left the alternative choice of including a manually generated bindings
file (generated.rs.inc)

Maybe I missed something, but it seems like it just checks/copies the
generated.rs file where it's expected. Definitely something that could
be done as part of the rust build.

Having to run the build before getting completion does not seem to be a
huge compromise.

Having a simple rust/pl011/meson.build is nice and good taste!

A request: keep comments to Rust in relation to the QEMU project and no
debates on the merits of the language itself. These are valid concerns,
but it'd be better if they were on separate mailing list threads.

Table of contents: [TOC]

- How can I try it? [howcanItryit]
- What are the most important points to focus on, at this point?
[whatarethemostimportant]
- What are the issues with not using the compiler, rustc, directly?
[whataretheissueswith]
1. Tooling
2. Rust dependencies

- Should QEMU use third-party dependencies? [shouldqemuusethirdparty]
- Should QEMU provide wrapping Rust APIs over QEMU internals?
[qemuprovidewrappingrustapis]
- Will QEMU now depend on Rust and thus not build on my XYZ platform?
[qemudependonrustnotbuildonxyz]
- How is the compilation structured? [howisthecompilationstructured]
- The generated.rs rust file includes a bunch of junk definitions?
[generatedrsincludesjunk]
- The staticlib artifact contains a bunch of mangled .o objects?
[staticlibmangledobjects]

How can I try it?
=================
[howcanItryit] Back to [TOC]

Hopefully applying these patches (or checking out `master` branch from
https://gitlab.com/epilys/rust-for-qemu/ current commit
de81929e0e9d470deac2c6b449b7a5183325e7ee )

Tag for this RFC is rust-pl011-rfc-v1

Rustdoc documentation is hosted on

https://rust-for-qemu-epilys-aebb06ca9f9adfe6584811c14ae44156501d935ba4.gitlab.io/pl011/index.html

If `cargo` and `bindgen` are installed in your system, you should be able
to build qemu-system-aarch64 with configure flag --enable-rust and
launch an arm virt VM. One of the patches hardcodes the default UART of
the machine to the Rust one, so if something goes wrong you will see it
upon launching qemu-system-aarch64.

To confirm it is there for sure, run e.g. info qom-tree on the monitor
and look for x-pl011-rust.

What are the most important points to focus on, at this point?
==============================================================
[whatarethemostimportant] Back to [TOC]

In my opinion, integration of the go-to Rust build system (Cargo and
crates.io) with the build system we use in QEMU. This is "easily" done
in some definition of the word with a python wrapper script.

What are the issues with not using the compiler, rustc, directly?
-----------------------------------------------------------------
[whataretheissueswith] Back to [TOC]

1. Tooling
Mostly writing up the build-sys tooling to do so. Ideally we'd
compile everything without cargo but rustc directly.

If we decide we need Rust's `std` library

Re: [PATCH 1/3] hw/s390x: Declare target specific monitor commands in hmp-target.h

2024-06-10 Thread Dr. David Alan Gilbert

* Philippe Mathieu-Daudé (phi...@linaro.org) wrote:
> "monitor/hmp-target.h" is meant to hold target-specific commands.
> Move s390x specific commands there, slightly simplifying hmp-target.c.
> 
> Signed-off-by: Philippe Mathieu-Daudé 
> ---
>  include/hw/s390x/storage-attributes.h | 4 
>  include/hw/s390x/storage-keys.h   | 4 
>  include/monitor/hmp-target.h  | 5 +
>  hw/s390x/s390-skeys.c | 2 ++
>  hw/s390x/s390-stattrib.c  | 2 ++
>  monitor/hmp-target.c  | 5 -
>  6 files changed, 9 insertions(+), 13 deletions(-)
> 
> diff --git a/include/hw/s390x/storage-attributes.h 
> b/include/hw/s390x/storage-attributes.h
> index 8921a04d51..4916c75936 100644
> --- a/include/hw/s390x/storage-attributes.h
> +++ b/include/hw/s390x/storage-attributes.h
> @@ -13,7 +13,6 @@
>  #define S390_STORAGE_ATTRIBUTES_H
>  
>  #include "hw/qdev-core.h"
> -#include "monitor/monitor.h"
>  #include "qom/object.h"
>  
>  #define TYPE_S390_STATTRIB "s390-storage_attributes"
> @@ -73,7 +72,4 @@ static inline Object *kvm_s390_stattrib_create(void)
>  }
>  #endif
>  
> -void hmp_info_cmma(Monitor *mon, const QDict *qdict);
> -void hmp_migrationmode(Monitor *mon, const QDict *qdict);
> -
>  #endif /* S390_STORAGE_ATTRIBUTES_H */
> diff --git a/include/hw/s390x/storage-keys.h b/include/hw/s390x/storage-keys.h
> index aa2ec2aae5..1d9b7ead44 100644
> --- a/include/hw/s390x/storage-keys.h
> +++ b/include/hw/s390x/storage-keys.h
> @@ -13,7 +13,6 @@
>  #define S390_STORAGE_KEYS_H
>  
>  #include "hw/qdev-core.h"
> -#include "monitor/monitor.h"
>  #include "qom/object.h"
>  
>  #define TYPE_S390_SKEYS "s390-skeys"
> @@ -114,7 +113,4 @@ void s390_skeys_init(void);
>  
>  S390SKeysState *s390_get_skeys_device(void);
>  
> -void hmp_dump_skeys(Monitor *mon, const QDict *qdict);
> -void hmp_info_skeys(Monitor *mon, const QDict *qdict);
> -
>  #endif /* S390_STORAGE_KEYS_H */
> diff --git a/include/monitor/hmp-target.h b/include/monitor/hmp-target.h
> index b679aaebbf..024cff0052 100644
> --- a/include/monitor/hmp-target.h
> +++ b/include/monitor/hmp-target.h
> @@ -61,4 +61,9 @@ void hmp_gva2gpa(Monitor *mon, const QDict *qdict);
>  void hmp_gpa2hva(Monitor *mon, const QDict *qdict);
>  void hmp_gpa2hpa(Monitor *mon, const QDict *qdict);
>  
> +void hmp_dump_skeys(Monitor *mon, const QDict *qdict);
> +void hmp_info_skeys(Monitor *mon, const QDict *qdict);
> +void hmp_info_cmma(Monitor *mon, const QDict *qdict);
> +void hmp_migrationmode(Monitor *mon, const QDict *qdict);
> +

Could you please add a comment here saying that these are all s390,
since it's not obvious from their names.
(and if we're lucky the other s390 commands will stay with them).

Dave

>  #endif /* MONITOR_HMP_TARGET_H */
> diff --git a/hw/s390x/s390-skeys.c b/hw/s390x/s390-skeys.c
> index 5c535d483e..7b2ccb94a5 100644
> --- a/hw/s390x/s390-skeys.c
> +++ b/hw/s390x/s390-skeys.c
> @@ -23,6 +23,8 @@
>  #include "sysemu/kvm.h"
>  #include "migration/qemu-file-types.h"
>  #include "migration/register.h"
> +#include "monitor/hmp-target.h"
> +#include "monitor/monitor.h"
>  
>  #define S390_SKEYS_BUFFER_SIZE (128 * KiB)  /* Room for 128k storage keys */
>  #define S390_SKEYS_SAVE_FLAG_EOS 0x01
> diff --git a/hw/s390x/s390-stattrib.c b/hw/s390x/s390-stattrib.c
> index c4259b5327..9b4b8d8d0c 100644
> --- a/hw/s390x/s390-stattrib.c
> +++ b/hw/s390x/s390-stattrib.c
> @@ -19,6 +19,8 @@
>  #include "exec/ram_addr.h"
>  #include "qapi/error.h"
>  #include "qapi/qmp/qdict.h"
> +#include "monitor/hmp-target.h"
> +#include "monitor/monitor.h"
>  #include "cpu.h"
>  
>  /* 512KiB cover 2GB of guest memory */
> diff --git a/monitor/hmp-target.c b/monitor/hmp-target.c
> index 1eb72ac1bf..0466474354 100644
> --- a/monitor/hmp-target.c
> +++ b/monitor/hmp-target.c
> @@ -36,11 +36,6 @@
>  #include "qapi/error.h"
>  #include "qemu/cutils.h"
>  
> -#if defined(TARGET_S390X)
> -#include "hw/s390x/storage-keys.h"
> -#include "hw/s390x/storage-attributes.h"
> -#endif
> -
>  /* Make devices configuration available for use in hmp-commands*.hx 
> templates */
>  #include CONFIG_DEVICES
>  
> -- 
> 2.41.0
> 
-- 
 -Open up your eyes, open up your mind, open up your code ---   
/ Dr. David Alan Gilbert|   Running GNU/Linux   | Happy  \ 
\dave @ treblig.org |   | In Hex /
 \ _|_ http://www.treblig.org   |___/

Re: [RFC PATCH v1 0/6] Implement ARM PL011 in Rust

On Mon, 10 Jun 2024 at 16:27, Manos Pitsidianakis
 wrote:
>
> On Mon, 10 Jun 2024 22:59, Stefan Hajnoczi  wrote:
> >> What are the issues with not using the compiler, rustc, directly?
> >> -----------------------------------------------------------------
> >> [whataretheissueswith] Back to [TOC]
> >>
> >> 1. Tooling
> >>Mostly writing up the build-sys tooling to do so. Ideally we'd
> >>compile everything without cargo but rustc directly.
> >
> >Why would that be ideal?
>
> It removes the indirection level of meson<->cargo<->rustc. I don't have a
> concrete idea on how to tackle this, but if cargo ends up not strictly
> necessary, I don't see why we cannot use one build system.

The convenience of being able to use cargo dependencies without
special QEMU meson build system effort seems worth the overhead of
meson<->cargo<->rustc to me. There is a blog post that explores using
cargo crates using meson's wrap dependencies here, and it seems like
extra work:
https://coaxion.net/blog/2023/04/building-a-gstreamer-plugin-in-rust-with-meson-instead-of-cargo/

It's possible to use just meson today, but I don't think it's
practical when using cargo dependencies.

>
> >
> >>
> >>If we decide we need Rust's `std` library support, we could
> >>investigate whether building it from scratch is a good solution. This
> >>will only build the bits we need in our devices.
> >
> >Whether or not to use std is a fundamental decision. It might be
> >difficult to back out of std later on. This is something that should be
> >discussed in more detail.
> >
> >Do you want to avoid std for maximum flexibility in the future, or are
> >there QEMU use cases today where std is unavailable?
>
> For flexibility, and for being compatible with more versions.
>
> But I do not want to avoid it, what I am saying is we can do a custom
> build of it instead of linking to the rust toolchain's prebuilt version.

What advantages does a custom build of std bring?

>
> >
> >>
> >> 2. Rust dependencies
> >>We could go without them completely. I chose deliberately to include
> >>one dependency in my UART implementation, `bilge`[0], because it has
> >>an elegant way of representing typed bitfields for the UART's
> >>registers.
> >>
> >> [0]: Article: https://hecatia-elegua.github.io/blog/no-more-bit-fiddling/
> >>  Crates.io page: https://crates.io/crates/bilge
> >>  Repository: https://github.com/hecatia-elegua/bilge
> >
> >I guess there will be interest in using rust-vmm crates in some way.
> >
> >Bindings to platform features that are not available in core or std
> >will also be desirable. We probably don't want to reinvent them.
>
>
> Agreed.
>
> >
> >>
> >> Should QEMU use third-party dependencies?
> >> -----------------------------------------
> >> [shouldqemuusethirdparty] Back to [TOC]
> >>
> >> In my personal opinion, if we need a dependency we need a strong
> >> argument for it. A dependency needs a trusted upstream source, a QEMU
> >> maintainer to make sure it is up-to-date in QEMU etc.
> >>
> >> We already fetch some projects with meson subprojects, so this is not a
> >> new reality. Cargo allows you to define "locked" dependencies which is
> >> the same as only fetching specific commits by SHA. No suspicious
> >> tarballs, and no disappearing dependencies a la left-pad in npm.
> >>
> >> However, I believe it's worth considering vendoring every dependency by
> >> default, if they prove to be few, for the sake of having a local QEMU
> >> git clone buildable without network access.
> >
> >Do you mean vendoring by committing them to qemu.git or just the
> >practice of running `cargo vendor` locally for users who decide they
> >want to keep a copy of the dependencies?
>
>
> Committing, with an option to opt-out. They are generally not big in
> size. I am not of strong opinion on this one, I'm very open to
> alternatives.

Fedora and Debian want Rust applications to use distro-packaged
crates. No vendoring and no crates.io online access. It's a bit of a
pain because Rust developers need to make sure their code works with
whatever version of crates Fedora and Debian provide.

The `cargo vendor` command makes it easy for anyone wishing to collect
the required dependencies for offline builds (something I've used for
CentOS builds where vendoring is allowed).

I suggest not vendoring packages in qemu.git. Users can still run
`cargo vendor` for easy offline builds.

>
>
> >>
> >> Should QEMU provide wrapping Rust APIs over QEMU internals?
> >> ---
> >> [qemuprovidewrappingrustapis] Back to [TOC]
> >>
> >> My personal opinion is no, with the reasoning being that QEMU internals
> >> are not documented or stable. However I do not see why creating stable
> >> opt-in interfaces is bad. It just needs someone to volunteer to maintain
> >> it and ensure there are no breakages through versions.
> >
> >Rust code will need to interface with QEMU's C APIs, so Rust

Re: [RFC PATCH v1 0/6] Implement ARM PL011 in Rust

On Mon, 10 Jun 2024 22:37, Pierrick Bouvier wrote:

Hello Manos,

On 6/10/24 11:22, Manos Pitsidianakis wrote:

Hello everyone,

This is an early draft of my work on implementing a very simple device,
in this case the ARM PL011 (which in C code resides in hw/char/pl011.c
and is used in hw/arm/virt.c).

The device is functional, with copied logic from the C code but with
effort not to make a direct C to Rust translation. In other words, do
not write Rust as a C developer would.

By the way, the wiki page for Rust was revived to keep track of all
current series on the mailing list https://wiki.qemu.org/RustInQemu

a #qemu-rust IRC channel was also created for rust-specific discussion
that might flood #qemu

Excellent work, and thanks for posting this RFC!

IMHO, having patches 2 and 5 split is a bit confusing, and exposing
(temporarily) the generated.rs file in patches is not a good move.

Any reason you kept it this way?

That was my first approach, I will rework it on the second version. The
generated code should not exist in committed code at all.

It was initially tricky setting up the dependency order correctly, so I
first committed it and then made it a dependency.

Maybe it could be better if build.rs file was *not* needed for new
devices/folders, and could be abstracted as a detail of the python
wrapper script instead of something that should be committed.

Having a simple rust/pl011/meson.build is nice and good taste!

Table of contents: [TOC]

How can I try it?
=
[howcanItryit] Back to [TOC]

Hopefully applying these patches (or checking out `master` branch from
https://gitlab.com/epilys/rust-for-qemu/ current commit
de81929e0e9d470deac2c6b449b7a5183325e7ee )

Tag for this RFC is rust-pl011-rfc-v1

Rustdoc documentation is hosted on

https://rust-for-qemu-epilys-aebb06ca9f9adfe6584811c14ae44156501d935ba4.gitlab.io/pl011/index.html

To confirm it is there for sure, run e.g. info qom-tree on the monitor
and look for x-pl011-rust.

What are the most important points to focus on, at this point?
==
[whatarethemostimportant] Back to [TOC]

What are the issues with not using the compiler, rustc, directly?
-
[whataretheissueswith] Back to [TOC]

1. Tooling
Mostly writing up the build-sys tooling to do so. Ideally we'd
compile everything without cargo but rustc directly.

If we decide we need Rust's `std` library support, we could
investigate whether building it from scratch is a good solution. This
will only build the bits we need in our devices.
2. Rust dependencies
We could go without them completely. I chose deliberately to include
one dependency in my UART implementation, `bilge`[0], because it has
an

Re: [RFC PATCH v1 0/6] Implement ARM PL011 in Rust


On Mon, 10 Jun 2024 22:59, Stefan Hajnoczi  wrote:

What are the issues with not using the compiler, rustc, directly?
-
[whataretheissueswith] Back to [TOC]

1. Tooling
   Mostly writing up the build-sys tooling to do so. Ideally we'd
   compile everything without cargo but rustc directly.


Why would that be ideal?


It removes the indirection level of meson<->cargo<->rustc. I don't have a 
concrete idea on how to tackle this, but if cargo ends up not strictly 
necessary, I don't see why we cannot use one build system.






   If we decide we need Rust's `std` library support, we could
   investigate whether building it from scratch is a good solution. This
   will only build the bits we need in our devices.


Whether or not to use std is a fundamental decision. It might be
difficult to back out of std later on. This is something that should be
discussed in more detail.

Do you want to avoid std for maximum flexibility in the future, or are
there QEMU use cases today where std is unavailable?


For flexibility, and for being compatible with more versions.

But I do not want to avoid it; what I am saying is that we can do a custom 
build of it instead of linking to the Rust toolchain's prebuilt version.






2. Rust dependencies
   We could go without them completely. I chose deliberately to include
   one dependency in my UART implementation, `bilge`[0], because it has
   an elegant way of representing typed bitfields for the UART's
   registers.

[0]: Article: https://hecatia-elegua.github.io/blog/no-more-bit-fiddling/
 Crates.io page: https://crates.io/crates/bilge
 Repository: https://github.com/hecatia-elegua/bilge


I guess there will be interest in using rust-vmm crates in some way.

Bindings to platform features that are not available in core or std
will also be desirable. We probably don't want to reinvent them.



Agreed.





Should QEMU use third-party dependencies?
-
[shouldqemuusethirdparty] Back to [TOC]

In my personal opinion, if we need a dependency we need a strong
argument for it. A dependency needs a trusted upstream source, a QEMU
maintainer to make sure it is up-to-date in QEMU etc.

We already fetch some projects with meson subprojects, so this is not a
new reality. Cargo allows you to define "locked" dependencies which is
the same as only fetching specific commits by SHA. No suspicious
tarballs, and no disappearing dependencies a la left-pad in npm.

However, I believe it's worth considering vendoring every dependency by
default, if they prove to be few, for the sake of having a local QEMU
git clone buildable without network access.


Do you mean vendoring by committing them to qemu.git or just the
practice of running `cargo vendor` locally for users who decide they
want to keep a copy of the dependencies?



Committing, with an option to opt-out. They are generally not big in 
size. I am not of strong opinion on this one, I'm very open to 
alternatives.





Should QEMU provide wrapping Rust APIs over QEMU internals?
---
[qemuprovidewrappingrustapis] Back to [TOC]

My personal opinion is no, with the reasoning being that QEMU internals
are not documented or stable. However I do not see why creating stable
opt-in interfaces is bad. It just needs someone to volunteer to maintain
it and ensure there are no breakages through versions.


Rust code will need to interface with QEMU's C APIs, so Rust wrappers
seem unavoidable. Using a protocol like vhost-user might be possible
in some cases. It separates the two codebases so they can both be
native and without bindings, but that won't work for all parts of the
QEMU source tree.

Stable APIs aren't necessary if most developers in the QEMU community
are willing to work in both languages. They can adjust both C and Rust
code when making changes to APIs. I find this preferable to having
Rust maintainers whose job is to keep wrappers up-to-date. Those Rust
maintainers would probably burn out. This seems like a question of
which approach the developer community is comfortable with.



Me too.





Will QEMU now depend on Rust and thus not build on my XYZ platform?
---
[qemudependonrustnotbuildonxyz] Back to [TOC]

No, worry about this in some years if this experiment takes off. Rust
has broad platform support and is present in most distro package
managers. In the future we might have gcc support for it as well.

For now, Rust will have an experimental status, and will be aimed at
those who wish to try it. I leave it to the project leaders to make
proper decisions and statements on this if necessary.


This can be discussed in a separate email thread if you prefer, but I
do think it needs agreement soon so that people have the confidence to
invest their time in writing Rust. They need to know that the code
they develop

Re: [PATCH v2 18/18] migration/ram: Add direct-io support to precopy file migration

2024-06-10 Thread Fabiano Rosas

Peter Xu  writes:

> On Mon, Jun 10, 2024 at 02:45:53PM -0300, Fabiano Rosas wrote:
>> >> AIUI, the issue here that users are already allowed to specify in
>> >> libvirt the equivalent to direct-io and multifd independent of each
>> >> other (bypass-cache, parallel). To start requiring both together now in
>> >> some situations would be a regression. I confess I don't know libvirt
>> >> code to know whether this can be worked around somehow, but as I said,
>> >> it's a relatively simple change from the QEMU side.
>> >
>> > Firstly, I definitely want to already avoid all the calls to either
>> > migration_direct_io_start() or *_finish(), now we already need to
>> > explicitly call them in three paths, and that's not intuitive and less
>> > readable, just like the hard coded rdma codes.
>> 
>> Right, but that's just a side-effect of how the code is structured and
>> the fact that writes to the stream happen in small chunks. Setting
>> O_DIRECT needs to happen around aligned IO. We could move the calls
>> further down into qemu_put_buffer_at(), but that would be four fcntl()
>> calls for every page.
>
> Hmm.. why we need four fcntl()s instead of two?

Because we need to first get the flags before flipping the O_DIRECT
bit. And we do this once to enable and once to disable.

    int flags = fcntl(fioc->fd, F_GETFL);
    if (enabled) {
        flags |= O_DIRECT;
    } else {
        flags &= ~O_DIRECT;
    }
    fcntl(fioc->fd, F_SETFL, flags);

>> 
>> A tangent:
>>  one thing that occurred to me now is that we may be able to restrict
>>  calls to qemu_fflush() to internal code like add_to_iovec() and maybe
>>  use that function to gather the correct amount of data before writing,
>>  making sure it disables O_DIRECT in case alignment is about to be
>>  broken?
>
> IIUC dio doesn't require alignment if we don't care about perf?  I meant it
> should be legal to write(fd, buffer, 5) even if O_DIRECT?

No, we may get an -EINVAL. See Daniel's reply.

>
> I just noticed the asserts you added in previous patch, I think that's
> better indeed, but still I'm wondering whether we can avoid enabling it on
> qemufile.
>
> It makes me feel slightly nervous when introducing dio to QEMUFile rather
> than iochannels - the API design of QEMUFile seems to easily encourage
> breaking things in dio worlds with a default and static buffering. And if
> we're going to blacklist most of the API anyway except the new one for
> mapped-ram, I start to wonder too why bother on top of QEMUFile anyway.
>
> IIRC you also mentioned in the previous doc patch so that libvirt should
> always pass in two fds anyway to the fdset if dio is enabled.  I wonder
> whether it's also true for multifd=off and directio=on, then would it be
> possible to use the dio for guest pages with one fd, while keeping the
> normal stream to use !dio with the other fd.  I'm not sure whether it's
> easy to avoid qemufile in the dio fd, even if not looks like we may avoid
> frequent fcntl()s?

Hm, sounds like a good idea. We'd need a place to put that new ioc
though. Either QEMUFile.direct_ioc and then make use of it in
qemu_put_buffer_at() or a more transparent QIOChannelFile.direct_fd that
gets set somewhere during file_start_outgoing_migration(). Let me try to
come up with something.
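
For illustration only, here is a minimal sketch of the two-descriptor idea
discussed above. The names TwoFdFile, direct_fd, dio_aligned() and pick_fd()
are hypothetical and are not existing QEMU APIs. The assumption is that one fd
is opened with O_DIRECT and one without, so each write can be routed by
alignment instead of toggling O_DIRECT with fcntl() on a single fd.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container: one buffered fd and one fd opened with O_DIRECT. */
typedef struct {
    int fd;         /* regular fd, for the unaligned migration stream */
    int direct_fd;  /* O_DIRECT fd, for aligned guest pages only */
} TwoFdFile;

/* True if buffer address, length and file offset are all multiples of align. */
static bool dio_aligned(const void *buf, size_t len, uint64_t offset,
                        size_t align)
{
    assert(align != 0);
    return ((uintptr_t)buf % align) == 0 &&
           (len % align) == 0 &&
           (offset % align) == 0;
}

/* Route a write: aligned I/O goes to the O_DIRECT fd, the rest stays buffered. */
static int pick_fd(const TwoFdFile *f, const void *buf, size_t len,
                   uint64_t offset, size_t align)
{
    return dio_aligned(buf, len, offset, align) ? f->direct_fd : f->fd;
}
```

With this shape, a 5-byte header write would fall through to the buffered fd,
while a page-sized, page-aligned write would use the O_DIRECT fd, and no
per-write fcntl() calls are needed.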

Re: [RFC PATCH v1 0/6] Implement ARM PL011 in Rust

On Mon, 10 Jun 2024 at 14:23, Manos Pitsidianakis
 wrote:
>
> Hello everyone,
>
> This is an early draft of my work on implementing a very simple device,
> in this case the ARM PL011 (which in C code resides in hw/char/pl011.c
> and is used in hw/arm/virt.c).
>
> The device is functional, with copied logic from the C code but with
> effort not to make a direct C to Rust translation. In other words, do
> not write Rust as a C developer would.
>
> That goal is not complete but a best-effort case. To give a specific
> example, register values are typed but interrupt bit flags are not (but
> could be). I will leave such minutiae for later iterations.
>
> By the way, the wiki page for Rust was revived to keep track of all
> current series on the mailing list https://wiki.qemu.org/RustInQemu
>
> a #qemu-rust IRC channel was also created for rust-specific discussion
> that might flood #qemu
>
> 
> A request: keep comments to Rust in relation to the QEMU project and no
> debates on the merits of the language itself. These are valid concerns,
> but it'd be better if they were on separate mailing list threads.
> 
>
> Table of contents: [TOC]
>
> - How can I try it? [howcanItryit]
> - What are the most important points to focus on, at this point?
>   [whatarethemostimportant]
>   - What are the issues with not using the compiler, rustc, directly?
> [whataretheissueswith]
> 1. Tooling
> 2. Rust dependencies
>
>   - Should QEMU use third-party dependencies? [shouldqemuusethirdparty]
>   - Should QEMU provide wrapping Rust APIs over QEMU internals?
> [qemuprovidewrappingrustapis]
>   - Will QEMU now depend on Rust and thus not build on my XYZ platform?
> [qemudependonrustnotbuildonxyz]
> - How is the compilation structured? [howisthecompilationstructured]
> - The generated.rs rust file includes a bunch of junk definitions?
>   [generatedrsincludesjunk]
> - The staticlib artifact contains a bunch of mangled .o objects?
>   [staticlibmangledobjects]
>
> How can I try it?
> =
> [howcanItryit] Back to [TOC]
>
> Hopefully applying these patches (or checking out `master` branch from
> https://gitlab.com/epilys/rust-for-qemu/ current commit
> de81929e0e9d470deac2c6b449b7a5183325e7ee )
>
> Tag for this RFC is rust-pl011-rfc-v1
>
> Rustdoc documentation is hosted on
>
> https://rust-for-qemu-epilys-aebb06ca9f9adfe6584811c14ae44156501d935ba4.gitlab.io/pl011/index.html
>
> If `cargo` and `bindgen` are installed on your system, you should be able
> to build qemu-system-aarch64 with configure flag --enable-rust and
> launch an arm virt VM. One of the patches hardcodes the default UART of
> the machine to the Rust one, so if something goes wrong you will see it
> upon launching qemu-system-aarch64.
>
> To confirm it is there for sure, run e.g. info qom-tree on the monitor
> and look for x-pl011-rust.
>
>
> What are the most important points to focus on, at this point?
> ==
> [whatarethemostimportant] Back to [TOC]
>
> In my opinion, integration of the go-to Rust build system (Cargo and
> crates.io) with the build system we use in QEMU. This is "easily" done
> in some definition of the word with a python wrapper script.
>
> What are the issues with not using the compiler, rustc, directly?
> -
> [whataretheissueswith] Back to [TOC]
>
> 1. Tooling
>Mostly writing up the build-sys tooling to do so. Ideally we'd
>compile everything without cargo but rustc directly.

Why would that be ideal?

>
>If we decide we need Rust's `std` library support, we could
>investigate whether building it from scratch is a good solution. This
>will only build the bits we need in our devices.

Whether or not to use std is a fundamental decision. It might be
difficult to back out of std later on. This is something that should be
discussed in more detail.

Do you want to avoid std for maximum flexibility in the future, or are
there QEMU use cases today where std is unavailable?

>
> 2. Rust dependencies
>We could go without them completely. I chose deliberately to include
>one dependency in my UART implementation, `bilge`[0], because it has
>an elegant way of representing typed bitfields for the UART's
>registers.
>
> [0]: Article: https://hecatia-elegua.github.io/blog/no-more-bit-fiddling/
>  Crates.io page: https://crates.io/crates/bilge
>  Repository: https://github.com/hecatia-elegua/bilge

I guess there will be interest in using rust-vmm crates in some way.

Bindings to platform features that are not available in core or std
will also be desirable. We probably don't want to reinvent them.

>
> Should QEMU use third-party dependencies?
> -
> [shouldqemuusethirdparty] Back to

Re: [RFC PATCH v1 0/6] Implement ARM PL011 in Rust

2024-06-10 Thread Pierrick Bouvier

Hello Manos,

On 6/10/24 11:22, Manos Pitsidianakis wrote:

Hello everyone,

This is an early draft of my work on implementing a very simple device,
in this case the ARM PL011 (which in C code resides in hw/char/pl011.c
and is used in hw/arm/virt.c).

The device is functional, with copied logic from the C code but with
effort not to make a direct C to Rust translation. In other words, do
not write Rust as a C developer would.

By the way, the wiki page for Rust was revived to keep track of all
current series on the mailing list https://wiki.qemu.org/RustInQemu

a #qemu-rust IRC channel was also created for rust-specific discussion
that might flood #qemu

Excellent work, and thanks for posting this RFC!

IMHO, having patches 2 and 5 split is a bit confusing, and exposing
(temporarily) the generated.rs file in patches is not a good move.

Any reason you kept it this way?

Maybe it could be better if build.rs file was *not* needed for new
devices/folders, and could be abstracted as a detail of the python
wrapper script instead of something that should be committed.

Having a simple rust/pl011/meson.build is nice and good taste!

Table of contents: [TOC]

How can I try it?
=
[howcanItryit] Back to [TOC]

Hopefully applying these patches (or checking out `master` branch from
https://gitlab.com/epilys/rust-for-qemu/ current commit
de81929e0e9d470deac2c6b449b7a5183325e7ee )

Tag for this RFC is rust-pl011-rfc-v1

Rustdoc documentation is hosted on

https://rust-for-qemu-epilys-aebb06ca9f9adfe6584811c14ae44156501d935ba4.gitlab.io/pl011/index.html

To confirm it is there for sure, run e.g. info qom-tree on the monitor
and look for x-pl011-rust.

What are the most important points to focus on, at this point?
==
[whatarethemostimportant] Back to [TOC]

What are the issues with not using the compiler, rustc, directly?
-
[whataretheissueswith] Back to [TOC]

1. Tooling
Mostly writing up the build-sys tooling to do so. Ideally we'd
compile everything without cargo but rustc directly.

[0]: Article: https://hecatia-elegua.github.io/blog/no-more-bit-fiddling/
Crates.io page: https://crates.io/crates/bilge
Repository: https://github.com/hecatia-elegua/bilge

Should QEMU use third-party dependencies?
-
[shouldqemuusethirdparty] Back to [TOC]

In my personal opinion, if we need a dependency we need a strong
argument for it. A dependency needs a trusted upstream source, a QEMU
maintainer to make sure it is up-to-date in QEMU

Re: [PATCH] hw/openrisc: Fixed undercounting of TTCR in continuous mode

2024-06-10 Thread Joel Holdsworth

Hi Stafford, thanks for your response.

> - You sent this 2 times, is the only change in v2 the sender address?

Yes, I was just having some difficulty with Git and SMTP. Should be fixed now.


>> In the existing design, TTCR is prone to undercounting when running in
>> continuous mode. This manifests as a timer interrupt appearing to
>> trigger a few cycles prior to the deadline set in SPR_TTMR_TP.

> This is a good find, I have noticed the timer is off when running on OpenRISC
> but never tracked it down to this undercounting issue.  I also notice
> unexplained RCU stalls when running in Linux when there is no load, this timer
> issue might be related.

> Did you notice this via other system symptoms when running OpenRISC or just
> via code auditing of QEMU?

I'm working on an OpenRISC port of Zephyr. The under-counting issue causes 
consistent deadlocks in my experiments with the test suite. I wouldn't be 
surprised if it causes problems for other OS's.


> In QEMU there is a function clock_ns_to_ticks(). Could this maybe be used
> instead to give us a more standard fix?

Seems like a good idea, and I now have a nearly complete patch that brings 
hw/openrisc/cputimer.c into closer alignment with 
target/mips/sysemu/cp0_timer.c. However, don't we run into problems with 
undercounting with clock_ns_to_ticks? If I understand correctly, it will 
round ticks down, not up, which is the problem I was trying to avoid in the 
first place.
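
To make the rounding concern concrete, here is a small sketch contrasting a
round-down and a round-up conversion from nanoseconds to timer ticks. The
helpers ns_to_ticks_floor() and ns_to_ticks_ceil() are hypothetical and are
not QEMU's clock_ns_to_ticks(); the 20 MHz frequency in the note below is just
an example value.

```c
#include <assert.h>
#include <stdint.h>

#define NS_PER_SEC 1000000000ULL

/*
 * Convert elapsed nanoseconds to ticks, rounding down.
 * Whole seconds and the sub-second remainder are converted separately so
 * the remainder multiply cannot overflow 64 bits for the guarded range.
 */
static uint64_t ns_to_ticks_floor(uint64_t ns, uint64_t freq_hz)
{
    assert(freq_hz <= UINT64_MAX / NS_PER_SEC);
    uint64_t whole = (ns / NS_PER_SEC) * freq_hz;
    uint64_t rem = (ns % NS_PER_SEC) * freq_hz;
    return whole + rem / NS_PER_SEC;
}

/* Same conversion, but a partially elapsed tick counts as a full tick. */
static uint64_t ns_to_ticks_ceil(uint64_t ns, uint64_t freq_hz)
{
    assert(freq_hz <= UINT64_MAX / NS_PER_SEC);
    uint64_t whole = (ns / NS_PER_SEC) * freq_hz;
    uint64_t rem = (ns % NS_PER_SEC) * freq_hz;
    return whole + (rem + NS_PER_SEC - 1) / NS_PER_SEC;
}
```

With a 20 MHz clock (50 ns per tick), 149 ns of elapsed time gives 2 ticks
when rounding down and 3 ticks when rounding up, while exactly 150 ns gives 3
ticks either way. A counter derived by rounding down can therefore read one
tick short of a deadline that was computed in nanoseconds.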

Joel

Re: [PATCH v3 00/13] riscv: QEMU RISC-V IOMMU Support

2024-06-10 Thread Daniel Henrique Barboza





On 6/10/24 3:32 PM, Andrew Jones wrote:

On June 10, 2024 2:34:58 AM GMT+02:00, Alistair Francis  
wrote:

On Fri, May 24, 2024 at 3:43 AM Daniel Henrique Barboza
 wrote:


Hi,

In this new version a lot of changes were made throughout all the code,
most notably on patch 3. Link for the previous version is [1].

* How it was tested *

This series was tested using an emulated QEMU RISC-V host booting a QEMU
KVM guest, passing through an emulated e1000 network card from the host
to the guest. I can provide more details (e.g. QEMU command lines) if
required, just let me know. For now this cover-letter is too much of an
essay as is.


It would probably be helpful to document these somewhere, so others
can use them as a starting point for running this



I've written up a testing procedure which I shared internally with Daniel. I'll 
sanitize it and post it somewhere public.



I can also add QEMU docs under docs/system/riscv, both as a
subsection of virt.rst and perhaps a new doc that describes the
devices itself (riscv-iommu-pci and later on riscv-iommu-sys).


Thanks,


Daniel
 

Thanks,
drew


Alistair



The Linux kernel used for tests can be found here:

https://github.com/tjeznach/linux/tree/riscv_iommu_v6-rc3

This is a newer version of the following work from Tomasz:

https://lore.kernel.org/linux-riscv/cover.1715708679.git.tjezn...@rivosinc.com/
("[PATCH v5 0/7] Linux RISC-V IOMMU Support")

The v5 wasn't enough for the testing being done. v6-rc3 did the trick.

Note that to test this work using riscv-iommu-pci we'll need to provide
the Rivos PCI ID in the command line. More details down below.

* Highlights of this version *

- patches removed from v2: platform driver (riscv-iommu-sys, former
patch 05) and the EDU changes (patches 14 and 15). The platform driver
will be sent later with a working example on the 'virt' machine,
either on a newer version of this series or via a follow-up series. We
already have a PoC on [2] created by Sunil. More tests are needed, so
it'll be left behind for now. The EDU changes will be sent separately
after I finish the doc changes that Frank cited in v2.

- patch 3 contains the bulk of changes made from v2. Please give special
attention to the following functions since this is entirely new code I
ended up adding:

  - riscv_iommu_report_fault()
  - riscv_iommu_validate_device_ctx()
  - riscv_iommu_update_ipsr()

   Aside from these helpers most of the changes made in this patch 3 were
punctual.

- Red Hat PCI ID related changes. A new patch (4) that introduces a
generic RISC-V IOMMU PCI ID was added. This PCI ID was graciously given
to us by Red Hat and Gerd Hoffmann from their ID space. The
riscv-iommu-pci device now defaults to this PCI ID instead of Rivos PCI
ID. The device was changed slightly to allow vendor-id and device-id to
be set in the command-line, so it's now possible to use this reference
device as another RISC-V IOMMU PCI device to ease the burden of
testing/development.

   To instantiate the riscv-iommu-pci device using the previous Rivos PCI
ID, use the following cmd line:

   -device riscv-iommu-pci,vendor-id=0x1efd,device-id=0xedf1

   I'm using these options to test the series with the existing Linux RISC-V
IOMMU support that uses just a Rivos ID to identify the device.


Series based on alistair/riscv-to-apply.next. It's also applicable on
current QEMU master. It can also be fetched from:

https://gitlab.com/danielhb/qemu/-/tree/riscv_iommu_v3


Patches missing reviews/acks: 3, 5, 9, 10, 11.

Changes from v2 [1]:
- patch 05 (hw/riscv: add riscv-iommu-sys platform device): dropped
   - will be reintroduced in a later review or as a follow-up series

- patches 14 and 15: dropped
   - will be sent separately

- patches 2, 3, 4 and 5:
   - removed all 'Ziommu' references

- patch 2:
   - added extra bits that patch 3 ended up using

- patch 3:
   - fixed blank line at EOF in hw/riscv/trace.h
   - added a riscv_iommu_report_fault() helper to report faults. The helper
     checks if a given fault is eligible to be reported if DTF is 1
   - Use riscv_iommu_report_fault() in riscv_iommu_ctx() and
     riscv_iommu_translate() to avoid code repetition
   - added a riscv_iommu_validate_device_ctx() helper to validate the device
     context as specified in "Device configuration checks" section. This
     helper is being used in riscv_iommu_ctx_fetch()
   - added a new riscv_iommu_update_ipsr() helper to handle IPSR updates
     in riscv_iommu_mmio_write()
   - riscv_iommmu_msi_write() now reports a fault in all error paths
   - check for fctl.WSI before issuing a MSI interrupt in riscv_iommu_notify()
   - change riscv-iommu region name to 'riscv-iommu'
   - change address_space_init() name for PCI devices to 'name' instead of
     using TYPE_RISCV_IOMMU_PCI
   - changed riscv_iommu_mmio_ops min_access_size to 4
   - do not check for min and max sizes on riscv_iommu_mmio_write()
   - changed riscv_iommu_trap_ops  min_access_size to 4
   - removed

Re: [PATCH v2 18/18] migration/ram: Add direct-io support to precopy file migration

On Mon, Jun 10, 2024 at 03:02:10PM -0400, Peter Xu wrote:
> On Mon, Jun 10, 2024 at 02:45:53PM -0300, Fabiano Rosas wrote:
> > >> AIUI, the issue here that users are already allowed to specify in
> > >> libvirt the equivalent to direct-io and multifd independent of each
> > >> other (bypass-cache, parallel). To start requiring both together now in
> > >> some situations would be a regression. I confess I don't know libvirt
> > >> code to know whether this can be worked around somehow, but as I said,
> > >> it's a relatively simple change from the QEMU side.
> > >
> > > Firstly, I definitely want to already avoid all the calls to either
> > > migration_direct_io_start() or *_finish(), now we already need to
> > > explicitly call them in three paths, and that's not intuitive and less
> > > readable, just like the hard coded rdma codes.
> > 
> > Right, but that's just a side-effect of how the code is structured and
> > the fact that writes to the stream happen in small chunks. Setting
> > O_DIRECT needs to happen around aligned IO. We could move the calls
> > further down into qemu_put_buffer_at(), but that would be four fcntl()
> > calls for every page.
> 
> Hmm.. why we need four fcntl()s instead of two?
> 
> > 
> > A tangent:
> >  one thing that occurred to me now is that we may be able to restrict
> >  calls to qemu_fflush() to internal code like add_to_iovec() and maybe
> >  use that function to gather the correct amount of data before writing,
> >  making sure it disables O_DIRECT in case alignment is about to be
> >  broken?
> 
> IIUC dio doesn't require alignment if we don't care about perf?  I meant it
> should be legal to write(fd, buffer, 5) even if O_DIRECT?

No, we must assume that O_DIRECT requires alignment both of the userspace
memory buffers, and the file offset on disk:

[quote man(open)]
  O_DIRECT
   The O_DIRECT flag may impose alignment restrictions on the length
   and address of user-space buffers and the file offset of I/Os. In
   Linux alignment restrictions vary by filesystem and kernel version
   and might be absent entirely. The handling of misaligned O_DIRECT
   I/Os also varies; they can either fail with EINVAL or fall back to
   buffered I/O.
[/quote]

Given QEMU's code base, it is only safe for us to use O_DIRECT with RAM
blocks where we have predictable in-memory alignment, and have defined
a good on-disk offset alignment too.
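
As a rough illustration of what "predictable in-memory alignment" can look
like, here is a sketch that allocates an aligned buffer and pads lengths up to
the alignment. DIO_ALIGN, dio_round_up() and dio_alloc() are hypothetical
names, and the 4096-byte value is an assumption: the real requirement depends
on the filesystem and kernel, as the man page excerpt says.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumed alignment; the actual requirement varies by filesystem and kernel. */
#define DIO_ALIGN ((size_t)4096)

/* Round a length (or file offset) up to the next multiple of DIO_ALIGN. */
static size_t dio_round_up(size_t n)
{
    assert(n <= SIZE_MAX - (DIO_ALIGN - 1));
    return (n + DIO_ALIGN - 1) / DIO_ALIGN * DIO_ALIGN;
}

/*
 * Allocate a buffer whose address and size are both multiples of DIO_ALIGN.
 * C11 aligned_alloc() expects the size to be a multiple of the alignment,
 * hence the rounding. Callers should pass len > 0 and release with free().
 */
static void *dio_alloc(size_t len)
{
    return aligned_alloc(DIO_ALIGN, dio_round_up(len));
}
```

A 5-byte payload, as in the write(fd, buffer, 5) example above, would need a
4096-byte aligned buffer and a 4096-byte write under this assumption, which is
why small unaligned stream writes are a poor fit for an O_DIRECT fd.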


With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

Re: [PATCH v2 18/18] migration/ram: Add direct-io support to precopy file migration

On Mon, Jun 10, 2024 at 02:45:53PM -0300, Fabiano Rosas wrote:
> >> AIUI, the issue here that users are already allowed to specify in
> >> libvirt the equivalent to direct-io and multifd independent of each
> >> other (bypass-cache, parallel). To start requiring both together now in
> >> some situations would be a regression. I confess I don't know libvirt
> >> code to know whether this can be worked around somehow, but as I said,
> >> it's a relatively simple change from the QEMU side.
> >
> > Firstly, I definitely want to already avoid all the calls to either
> > migration_direct_io_start() or *_finish(), now we already need to
> > explicitly call them in three paths, and that's not intuitive and less
> > readable, just like the hard coded rdma codes.
> 
> Right, but that's just a side-effect of how the code is structured and
> the fact that writes to the stream happen in small chunks. Setting
> O_DIRECT needs to happen around aligned IO. We could move the calls
> further down into qemu_put_buffer_at(), but that would be four fcntl()
> calls for every page.

Hmm.. why do we need four fcntl()s instead of two?

> 
> A tangent:
>  one thing that occurred to me now is that we may be able to restrict
>  calls to qemu_fflush() to internal code like add_to_iovec() and maybe
>  use that function to gather the correct amount of data before writing,
>  making sure it disables O_DIRECT in case alignment is about to be
>  broken?

IIUC dio doesn't require alignment if we don't care about perf?  I meant it
should be legal to write(fd, buffer, 5) even if O_DIRECT?

I just noticed the asserts you added in previous patch, I think that's
better indeed, but still I'm wondering whether we can avoid enabling it on
qemufile.

It makes me feel slightly nervous when introducing dio to QEMUFile rather
than iochannels - the API design of QEMUFile seems to easily encourage
breaking things in dio worlds with a default and static buffering. And if
we're going to blacklist most of the API anyway except the new one for
mapped-ram, I start to wonder too why bother on top of QEMUFile anyway.

IIRC you also mentioned in the previous doc patch that libvirt should
always pass in two fds anyway to the fdset if dio is enabled.  I wonder
whether that's also true for multifd=off and directio=on; if so, would it be
possible to use dio for guest pages with one fd, while keeping the
normal stream on !dio with the other fd?  I'm not sure whether it's
easy to avoid qemufile on the dio fd, but even if not, it looks like we may
avoid frequent fcntl()s?

-- 
Peter Xu

Re: [PATCH v2 4/7] migration/multifd: Add UADK initialization

2024-06-10 Thread Fabiano Rosas

Shameer Kolothum via  writes:

> Initialize UADK session and allocate buffers required. The actual
> compression/decompression will only be done in a subsequent patch.
>
> Signed-off-by: Shameer Kolothum 

Reviewed-by: Fabiano Rosas

Re: [PATCH v3 00/13] riscv: QEMU RISC-V IOMMU Support

2024-06-10 Thread Andrew Jones

On June 10, 2024 2:34:58 AM GMT+02:00, Alistair Francis  
wrote:
>On Fri, May 24, 2024 at 3:43 AM Daniel Henrique Barboza
> wrote:
>>
>> Hi,
>>
>> In this new version a lot of changes were made throughout all the code,
>> most notably on patch 3. Link for the previous version is [1].
>>
>> * How it was tested *
>>
>> This series was tested using an emulated QEMU RISC-V host booting a QEMU
>> KVM guest, passing through an emulated e1000 network card from the host
>> to the guest. I can provide more details (e.g. QEMU command lines) if
>> required, just let me know. For now this cover-letter is too much of an
>> essay as is.
>
>It would probably be helpful to document these somewhere, so others
>can use them as a starting point for running this
>

I've written up a testing procedure which I shared internally with Daniel. I'll 
sanitize it and post it somewhere public.

Thanks,
drew

>Alistair
>
>>
>> The Linux kernel used for tests can be found here:
>>
>> https://github.com/tjeznach/linux/tree/riscv_iommu_v6-rc3
>>
>> This is a newer version of the following work from Tomasz:
>>
>> https://lore.kernel.org/linux-riscv/cover.1715708679.git.tjezn...@rivosinc.com/
>> ("[PATCH v5 0/7] Linux RISC-V IOMMU Support")
>>
>> The v5 wasn't enough for the testing being done. v6-rc3 did the trick.
>>
>> Note that to test this work using riscv-iommu-pci we'll need to provide
>> the Rivos PCI ID in the command line. More details down below.
>>
>> * Highlights of this version *
>>
>> - patches removed from v2: platform driver (riscv-iommu-sys, former
>> patch 05) and the EDU changes (patches 14 and 15). The platform driver
>> will be sent later with a working example on the 'virt' machine,
>> either on a newer version of this series or via a follow-up series. We
>> already have a PoC on [2] created by Sunil. More tests are needed, so
>> it'll be left behind for now. The EDU changes will be sent separately
>> after I finish the doc changes that Frank cited in v2.
>>
>> - patch 3 contains the bulk of changes made from v2. Please give special
>> attention to the following functions since this is entirely new code I
>> ended up adding:
>>
>>  - riscv_iommu_report_fault()
>>  - riscv_iommu_validate_device_ctx()
>>  - riscv_iommu_update_ipsr()
>>
>>   Aside from these helpers most of the changes made in this patch 3 were
>> minor and localized.
>>
>> - Red Hat PCI ID related changes. A new patch (4) that introduces a
>> generic RISC-V IOMMU PCI ID was added. This PCI ID was graciously given
>> to us by Red Hat and Gerd Hoffmann from their ID space. The
>> riscv-iommu-pci device now defaults to this PCI ID instead of Rivos PCI
>> ID. The device was changed slightly to allow vendor-id and device-id to
>> be set in the command-line, so it's now possible to use this reference
>> device as another RISC-V IOMMU PCI device to ease the burden of
>> testing/development.
>>
>>   To instantiate the riscv-iommu-pci device using the previous Rivos PCI
>> ID, use the following cmd line:
>>
>>   -device riscv-iommu-pci,vendor-id=0x1efd,device-id=0xedf1
>>
>>   I'm using these options to test the series with the existing Linux RISC-V
>> IOMMU support that uses just a Rivos ID to identify the device.
>>
>>
>> Series based on alistair/riscv-to-apply.next. It's also applicable on
>> current QEMU master. It can also be fetched from:
>>
>> https://gitlab.com/danielhb/qemu/-/tree/riscv_iommu_v3
>>
>>
>> Patches missing reviews/acks: 3, 5, 9, 10, 11.
>>
>> Changes from v2 [1]:
>> - patch 05 (hw/riscv: add riscv-iommu-sys platform device): dropped
>>   - will be reintroduced in a later review or as a follow-up series
>>
>> - patches 14 and 15: dropped
>>   - will be sent separately
>>
>> - patches 2, 3, 4 and 5:
>>   - removed all 'Ziommu' references
>>
>> - patch 2:
>>   - added extra bits that patch 3 ended up using
>>
>> - patch 3:
>>   - fixed blank line at EOF in hw/riscv/trace.h
>>   - added a riscv_iommu_report_fault() helper to report faults. The helper
>>     checks if a given fault is eligible to be reported if DTF is 1
>>   - Use riscv_iommu_report_fault() in riscv_iommu_ctx() and
>>     riscv_iommu_translate() to avoid code repetition
>>   - added a riscv_iommu_validate_device_ctx() helper to validate the device
>>     context as specified in "Device configuration checks" section. This
>>     helper is being used in riscv_iommu_ctx_fetch()
>>   - added a new riscv_iommu_update_ipsr() helper to handle IPSR updates
>>     in riscv_iommu_mmio_write()
>>   - riscv_iommu_msi_write() now reports a fault in all error paths
>>   - check for fctl.WSI before issuing a MSI interrupt in riscv_iommu_notify()
>>   - change riscv-iommu region name to 'riscv-iommu'
>>   - change address_space_init() name for PCI devices to 'name' instead of
>>     using TYPE_RISCV_IOMMU_PCI
>>   - changed riscv_iommu_mmio_ops min_access_size to 4
>>   - do not check for min and max sizes on riscv_iommu_mmio_write()
>>   - changed riscv_iommu_trap_ops  min_access_size

Re: [PATCH v3 2/5] target/xtensa: Restrict semihosting to TCG

2024-06-10 Thread Max Filippov

On Mon, Jun 10, 2024 at 04:58:04PM +0200, Philippe Mathieu-Daudé wrote:
> The semihosting feature depends on TCG (due to the probe_access
> API access). Although TCG is the single accelerator currently
> available for the xtensa target, use the Kconfig "imply" directive
> which is more correct (if we were to support a different accel).
> 
> Reported-by: Anton Johansson 
> Signed-off-by: Philippe Mathieu-Daudé 
> ---
>  target/xtensa/Kconfig | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Acked-by: Max Filippov 

-- 
Thanks.
-- Max

Re: [RFC PATCH 2/3] monitor: Allow passing HMP arguments to QMP HumanReadableText API

On Mon, Jun 10, 2024 at 07:58:51PM +0200, Philippe Mathieu-Daudé wrote:
> Allow HMP commands implemented using the HumanReadableText API
> (via the HMPCommand::cmd_info_hrt handler) to pass arguments
> to the QMP equivalent command. The arguments are serialized as
> a JSON dictionary.
> 
> Signed-off-by: Philippe Mathieu-Daudé 
> ---
>  docs/devel/writing-monitor-commands.rst | 15 ++-
>  qapi/machine.json   | 24 
>  include/monitor/monitor.h   |  3 ++-
>  monitor/monitor-internal.h  |  2 +-
>  accel/tcg/monitor.c |  4 ++--
>  hw/core/loader.c|  2 +-
>  hw/core/machine-qmp-cmds.c  |  9 +
>  hw/usb/bus.c|  2 +-
>  monitor/hmp-target.c|  3 ++-
>  monitor/hmp.c   | 11 +++
>  10 files changed, 59 insertions(+), 16 deletions(-)
> 
> diff --git a/docs/devel/writing-monitor-commands.rst 
> b/docs/devel/writing-monitor-commands.rst
> index 930da5cd06..843458e52c 100644
> --- a/docs/devel/writing-monitor-commands.rst
> +++ b/docs/devel/writing-monitor-commands.rst
> @@ -561,6 +561,7 @@ returns a ``HumanReadableText``::
>   # Since: 6.2
>   ##
>   { 'command': 'x-query-roms',
> +   'data': { 'json-args': 'str'},
> 'returns': 'HumanReadableText',
> 'features': [ 'unstable' ] }
>  
> @@ -578,7 +579,7 @@ Implementing the QMP command
>  The QMP implementation will typically involve creating a ``GString``
>  object and printing formatted data into it, like this::
>  
> - HumanReadableText *qmp_x_query_roms(Error **errp)
> + HumanReadableText *qmp_x_query_roms(const char *json_args, Error **errp)
>   {
>   g_autoptr(GString) buf = g_string_new("");
>   Rom *rom;
> @@ -596,6 +597,18 @@ object and printing formatted data into it, like this::
>  The actual implementation emits more information.  You can find it in
>  hw/core/loader.c.
>  
> +For QMP command taking (optional) parameters, these parameters are
> +serialized as a JSON dictionary, and can be retrieved using the QDict
> +API. If the previous ``x-query-roms`` command were taking a "index"
> +argument, it could be retrieved as::
> +
> + HumanReadableText *qmp_x_query_roms(const char *json_args, Error **errp)
> + {
> + g_autoptr(GString) buf = g_string_new("");
> + QDict *qdict = qobject_to(QDict, qobject_from_json(json_args, &error_abort));
> + uint64_t index = qdict_get_int(qdict, "index");
> + ...
> + }

Passing JSON inside JSON is pretty gross, and throws away a key
benefit of QAPI: that it de-serializes the JSON into the actual
data types that you need, avoiding manual & error-prone code for
unpacking args from a QDict.

IMHO if a command requires arguments, they should be modelled
explicitly, and not use the cmd_info_hrt convenience handler,
which was only ever intended for simple no-arg 'info' commands.
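
To make the objection concrete, here is a minimal Python sketch of what a QMP client would put on the wire in each case. The command name x-query-s390x-cmma and the 'json-args' member come from the series under review; the explicit 'addr' and 'count' members in the second form are a hypothetical schema used only for illustration, not something QEMU defines.

```python
import json


def nested_request(addr: int, count: int) -> str:
    """Build the request as the RFC proposes: arguments packed into a string."""
    inner = json.dumps({"addr": addr, "count": count})
    return json.dumps({
        "execute": "x-query-s390x-cmma",
        "arguments": {"json-args": inner},
    })


def explicit_request(addr: int, count: int) -> str:
    """Build the request with explicitly modelled arguments (hypothetical schema)."""
    return json.dumps({
        "execute": "x-query-s390x-cmma",
        "arguments": {"addr": addr, "count": count},
    })


def decode_addr_nested(wire: str) -> int:
    """The receiver has to parse twice and gets no type checking for free."""
    args = json.loads(wire)["arguments"]
    return json.loads(args["json-args"])["addr"]


def decode_addr_explicit(wire: str) -> int:
    """One parse yields typed values directly."""
    return json.loads(wire)["arguments"]["addr"]
```

In the nested form the inner quotes are escaped on the wire and the argument types are invisible to anything that only inspects the outer document, which is the part of QAPI's value that gets lost.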

With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

[RFC PATCH v1 4/6] DO NOT MERGE: replace TYPE_PL011 with x-pl011-rust in arm virt machine

Signed-off-by: Manos Pitsidianakis 
---
 hw/arm/virt.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 3c93c0c0a6..153be0f42d 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -912,7 +912,7 @@ static void create_uart(const VirtMachineState *vms, int uart,
 int irq = vms->irqmap[uart];
 const char compat[] = "arm,pl011\0arm,primecell";
 const char clocknames[] = "uartclk\0apb_pclk";
-DeviceState *dev = qdev_new(TYPE_PL011);
+DeviceState *dev = qdev_new("x-pl011-rust");
 SysBusDevice *s = SYS_BUS_DEVICE(dev);
 MachineState *ms = MACHINE(vms);
 
-- 
γαῖα πυρί μιχθήτω

[RFC PATCH v1 6/6] DO NOT MERGE: update rustdoc gitlab pages gen

Signed-off-by: Manos Pitsidianakis 
---
 .gitlab-ci.d/buildtest.yml | 13 +++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/.gitlab-ci.d/buildtest.yml b/.gitlab-ci.d/buildtest.yml
index 1cd6519506..da882813b8 100644
--- a/.gitlab-ci.d/buildtest.yml
+++ b/.gitlab-ci.d/buildtest.yml
@@ -744,11 +744,20 @@ build-tools-and-docs-debian:
 pages:
   image: rust:latest
   script:
-- cd ./rust/pl011/
+- rustup component add rustfmt
+- DEBIAN_FRONTEND=noninteractive apt-get update -y
+- DEBIAN_FRONTEND=noninteractive apt-get install -y python3-venv meson libgcrypt20-dev zlib1g-dev autoconf automake libtool bison flex git libglib2.0-dev libfdt-dev libpixman-1-dev ninja-build make libclang-14-dev
+- cargo install bindgen-cli
+- mkdir ./build/
+- cd ./build/
+- ../configure --enable-system --disable-kvm --target-list=aarch64-softmmu --enable-with-rust
+- ninja "generated.rs"
+- cp ./generated.rs ../rust/pl011/src/generated.rs.inc
+- cd ../rust/pl011/
 - cargo tree --depth 1 -e normal --prefix none | cut -d' ' -f1  | xargs
   printf -- '-p %s\n'  | xargs cargo doc --no-deps 
--document-private-items --target x86_64-unknown-linux-gnu
 - cd ./../..
-- mv ./rust/pl011/target/doc ./public
+- mv ./rust/pl011/target/x86_64-unknown-linux-gnu/doc ./public
   artifacts:
 when: on_success
 paths:
-- 
γαῖα πυρί μιχθήτω

[RFC PATCH v1 3/6] DO NOT MERGE: add rustdoc build for gitlab pages

Signed-off-by: Manos Pitsidianakis 
---
 .gitlab-ci.d/buildtest.yml | 55 +-
 1 file changed, 36 insertions(+), 19 deletions(-)

diff --git a/.gitlab-ci.d/buildtest.yml b/.gitlab-ci.d/buildtest.yml
index 91c57efded..1cd6519506 100644
--- a/.gitlab-ci.d/buildtest.yml
+++ b/.gitlab-ci.d/buildtest.yml
@@ -715,31 +715,48 @@ build-tools-and-docs-debian:
 # For contributor forks we want to publish from any repo so
 # that users can see the results of their commits, regardless
 # of what topic branch they're currently using
+# pages:
+#   extends: .base_job_template
+#   image: $CI_REGISTRY_IMAGE/qemu/debian:$QEMU_CI_CONTAINER_TAG
+#   stage: test
+#   needs:
+# - job: build-tools-and-docs-debian
+#   script:
+# - mkdir -p public
+# # HTML-ised source tree
+# - make gtags
+# # We unset variables to work around a bug in some htags versions
+# # which causes it to fail when the environment is large
+# - CI_COMMIT_MESSAGE= CI_COMMIT_TAG_MESSAGE= htags
+# -anT --tree-view=filetree -m qemu_init
+# -t "Welcome to the QEMU sourcecode"
+# - mv HTML public/src
+# # Project documentation
+# - make -C build install DESTDIR=$(pwd)/temp-install
+# - mv temp-install/usr/local/share/doc/qemu/* public/
+#   artifacts:
+# when: on_success
+# paths:
+#   - public
+#   variables:
+# QEMU_JOB_PUBLISH: 1
+# The Docker image that will be used to build your app
 pages:
-  extends: .base_job_template
-  image: $CI_REGISTRY_IMAGE/qemu/debian:$QEMU_CI_CONTAINER_TAG
-  stage: test
-  needs:
-- job: build-tools-and-docs-debian
+  image: rust:latest
   script:
-- mkdir -p public
-# HTML-ised source tree
-- make gtags
-# We unset variables to work around a bug in some htags versions
-# which causes it to fail when the environment is large
-- CI_COMMIT_MESSAGE= CI_COMMIT_TAG_MESSAGE= htags
--anT --tree-view=filetree -m qemu_init
--t "Welcome to the QEMU sourcecode"
-- mv HTML public/src
-# Project documentation
-- make -C build install DESTDIR=$(pwd)/temp-install
-- mv temp-install/usr/local/share/doc/qemu/* public/
+- cd ./rust/pl011/
+- cargo tree --depth 1 -e normal --prefix none | cut -d' ' -f1  | xargs
+  printf -- '-p %s\n'  | xargs cargo doc --no-deps 
--document-private-items --target x86_64-unknown-linux-gnu
+- cd ./../..
+- mv ./rust/pl011/target/doc ./public
   artifacts:
 when: on_success
 paths:
   - public
-  variables:
-QEMU_JOB_PUBLISH: 1
+  rules:
+# This ensures that only pushes to the default branch will trigger
+# a pages deploy
+- if: $CI_COMMIT_REF_NAME == $CI_DEFAULT_BRANCH
 
 coverity:
   image: $CI_REGISTRY_IMAGE/qemu/fedora:$QEMU_CI_CONTAINER_TAG
-- 
γαῖα πυρί μιχθήτω

[RFC PATCH v1 0/6] Implement ARM PL011 in Rust

Hello everyone,

This is an early draft of my work on implementing a very simple device,
in this case the ARM PL011 (which in C code resides in hw/char/pl011.c
and is used in hw/arm/virt.c).

The device is functional, with copied logic from the C code but with
effort not to make a direct C to Rust translation. In other words, do
not write Rust as a C developer would.

By the way, the wiki page for Rust was revived to keep track of all
current series on the mailing list: https://wiki.qemu.org/RustInQemu

A #qemu-rust IRC channel was also created for Rust-specific discussion
that might otherwise flood #qemu.

Table of contents: [TOC]

How can I try it?
=================
[howcanItryit] Back to [TOC]

Hopefully applying these patches (or checking out `master` branch from
https://gitlab.com/epilys/rust-for-qemu/ current commit
de81929e0e9d470deac2c6b449b7a5183325e7ee )

Tag for this RFC is rust-pl011-rfc-v1

Rustdoc documentation is hosted on

https://rust-for-qemu-epilys-aebb06ca9f9adfe6584811c14ae44156501d935ba4.gitlab.io/pl011/index.html

To confirm it is there for sure, run e.g. info qom-tree on the monitor
and look for x-pl011-rust.

What are the most important points to focus on, at this point?
==============================================================
[whatarethemostimportant] Back to [TOC]

What are the issues with not using the compiler, rustc, directly?
-----------------------------------------------------------------
[whataretheissueswith] Back to [TOC]

1. Tooling
Mostly writing up the build-sys tooling to do so. Ideally we'd
compile everything with rustc directly, without cargo.

If we decide we need Rust's `std` library support, we could
investigate whether building it from scratch is a good solution. That
would build only the bits we need in our devices.

2. Rust dependencies
We could go without them completely. I chose deliberately to include
one dependency in my UART implementation, `bilge`[0], because it has
an elegant way of representing typed bitfields for the UART's
registers.

[0]: Article: https://hecatia-elegua.github.io/blog/no-more-bit-fiddling/
Crates.io page: https://crates.io/crates/bilge
Repository: https://github.com/hecatia-elegua/bilge

Should QEMU use third-party dependencies?
-----------------------------------------
[shouldqemuusethirdparty] Back to [TOC]

In my personal opinion, if we need a dependency we need a strong
argument for it. A dependency needs a trusted upstream source, a QEMU
maintainer to make sure it is up-to-date in QEMU, etc.

We already fetch some projects with meson subprojects, so this is not a
new reality. Cargo allows you to define "locked" dependencies which is
the same as only fetching specific commits by SHA. No suspicious
tarballs, and no disappearing dependencies a la left-pad in npm.

However, I believe it's worth considering vendoring every dependency by
default, if they prove to be few, for the sake of having a local QEMU
git clone buildable without network access.

Should QEMU provide wrapping Rust APIs over QEMU internals?

[RFC PATCH v1 1/6] build-sys: Add rust feature option

Add options for Rust in meson_options.txt, meson.build, configure to
prepare for adding Rust code in the followup commits.

`rust` is a reserved meson name, so we have to use an alternative.
`with_rust` was chosen.

Signed-off-by: Manos Pitsidianakis 
---
The cargo wrapper script hardcodes some rust target triples. This is 
just temporary.
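
One possible direction for replacing the hardcoded list, sketched in Python under stated assumptions: derive the triple from the host unless one is passed explicitly (as --rust-target-triple does in configure). The helper name `guess_target_triple` and the lookup table are hypothetical and are not taken from cargo_wrapper.py; only the four triple strings come from the patch.

```python
import platform
import sys

# The same four triples the wrapper script lists, keyed by
# (sys.platform, platform.machine()) values. The keys are an assumption
# about what those calls report on each host.
KNOWN_TRIPLES = {
    ("linux", "aarch64"): "aarch64-unknown-linux-gnu",
    ("linux", "x86_64"): "x86_64-unknown-linux-gnu",
    ("darwin", "x86_64"): "x86_64-apple-darwin",
    ("darwin", "arm64"): "aarch64-apple-darwin",
}


def guess_target_triple(os_name=None, machine=None, override=""):
    """Return the Rust target triple, preferring an explicit override."""
    if override:
        return override
    os_name = os_name or sys.platform
    machine = machine or platform.machine()
    try:
        return KNOWN_TRIPLES[(os_name, machine)]
    except KeyError:
        raise ValueError(
            f"no known Rust target for {os_name}/{machine}"
        ) from None
```

Failing loudly on an unknown host keeps the build from silently picking a wrong target; cross builds would always go through the override.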
---
 .gitignore   |   2 +
 configure|  12 +++
 meson.build  |  11 ++
 meson_options.txt|   4 +
 scripts/cargo_wrapper.py | 211 +++
 5 files changed, 240 insertions(+)
 create mode 100644 scripts/cargo_wrapper.py

diff --git a/.gitignore b/.gitignore
index 61fa39967b..f42b0d937e 100644
--- a/.gitignore
+++ b/.gitignore
@@ -2,6 +2,8 @@
 /build/
 /.cache/
 /.vscode/
+/target/
+rust/**/target
 *.pyc
 .sdk
 .stgit-*
diff --git a/configure b/configure
index 38ee257701..c195630771 100755
--- a/configure
+++ b/configure
@@ -302,6 +302,9 @@ else
   objcc="${objcc-${cross_prefix}clang}"
 fi
 
+with_rust="auto"
+with_rust_target_triple=""
+
 ar="${AR-${cross_prefix}ar}"
 as="${AS-${cross_prefix}as}"
 ccas="${CCAS-$cc}"
@@ -760,6 +763,12 @@ for opt do
   ;;
   --gdb=*) gdb_bin="$optarg"
   ;;
+  --enable-rust) with_rust=enabled
+  ;;
+  --disable-rust) with_rust=disabled
+  ;;
+  --rust-target-triple=*) with_rust_target_triple="$optarg"
+  ;;
   # everything else has the same name in configure and meson
   --*) meson_option_parse "$opt" "$optarg"
   ;;
@@ -1796,6 +1805,9 @@ if test "$skip_meson" = no; then
   test -n "${LIB_FUZZING_ENGINE+xxx}" && meson_option_add 
"-Dfuzzing_engine=$LIB_FUZZING_ENGINE"
   test "$plugins" = yes && meson_option_add "-Dplugins=true"
   test "$tcg" != enabled && meson_option_add "-Dtcg=$tcg"
+  test "$with_rust" != enabled && meson_option_add "-Dwith_rust=$with_rust"
+  test "$with_rust" != enabled && meson_option_add "-Dwith_rust=$with_rust"
+  test "$with_rust_target_triple" != "" && meson_option_add 
"-Dwith_rust_target_triple=$with_rust_target_triple"
   run_meson() {
 NINJA=$ninja $meson setup "$@" "$PWD" "$source_path"
   }
diff --git a/meson.build b/meson.build
index a9de71d450..3533889852 100644
--- a/meson.build
+++ b/meson.build
@@ -290,6 +290,12 @@ foreach lang : all_languages
   endif
 endforeach
 
+cargo = not_found
+if get_option('with_rust').allowed()
+  cargo = find_program('cargo', required: get_option('with_rust'))
+endif
+with_rust = cargo.found()
+
 # default flags for all hosts
 # We use -fwrapv to tell the compiler that we require a C dialect where
 # left shift of signed integers is well defined and has the expected
@@ -2066,6 +2072,7 @@ endif
 
 config_host_data = configuration_data()
 
+config_host_data.set('CONFIG_WITH_RUST', with_rust)
 audio_drivers_selected = []
 if have_system
   audio_drivers_available = {
@@ -4190,6 +4197,10 @@ if 'objc' in all_languages
 else
   summary_info += {'Objective-C compiler': false}
 endif
+summary_info += {'Rust support':  with_rust}
+if with_rust and get_option('with_rust_target_triple') != ''
+  summary_info += {'Rust target': get_option('with_rust_target_triple')}
+endif
 option_cflags = (get_option('debug') ? ['-g'] : [])
 if get_option('optimization') != 'plain'
   option_cflags += ['-O' + get_option('optimization')]
diff --git a/meson_options.txt b/meson_options.txt
index 4c1583eb40..223491b731 100644
--- a/meson_options.txt
+++ b/meson_options.txt
@@ -366,3 +366,7 @@ option('qemu_ga_version', type: 'string', value: '',
 
 option('hexagon_idef_parser', type : 'boolean', value : true,
description: 'use idef-parser to automatically generate TCG code for 
the Hexagon frontend')
+option('with_rust', type: 'feature', value: 'auto',
+   description: 'Enable Rust support')
+option('with_rust_target_triple', type : 'string', value: '',
+   description: 'Rust target triple')
diff --git a/scripts/cargo_wrapper.py b/scripts/cargo_wrapper.py
new file mode 100644
index 0000000000..d338effdaa
--- /dev/null
+++ b/scripts/cargo_wrapper.py
@@ -0,0 +1,211 @@
+#!/usr/bin/env python3
+# Copyright (c) 2020 Red Hat, Inc.
+# Copyright (c) 2023 Linaro Ltd.
+#
+# Authors:
+#  Manos Pitsidianakis 
+#  Marc-André Lureau 
+#
+# This work is licensed under the terms of the GNU GPL, version 2 or
+# later.  See the COPYING file in the top-level directory.
+
+import argparse
+import configparser
+import distutils.file_util
+import json
+import logging
+import os
+import os.path
+import re
+import subprocess
+import sys
+import pathlib
+import shutil
+import tomllib
+
+from pathlib import Path
+from typing import Any, Dict, List, Tuple
+
+RUST_TARGET_TRIPLES = (
+    "aarch64-unknown-linux-gnu",
+    "x86_64-unknown-linux-gnu",
+    "x86_64-apple-darwin",
+    "aarch64-apple-darwin",
+)
+
+
+def cfg_name(name: str) -> str:
+    if (
+        name.startswith("CONFIG_")
+        or name.startswith("TARGET_")
+        or name.startswith("HAVE_")
+    ):
+        return name
+    return ""
+
+def

[PATCH 1/3] hw/s390x: Declare target specific monitor commands in hmp-target.h

"monitor/hmp-target.h" is meant to hold target-specific commands.
Move s390x specific commands there, slightly simplifying hmp-target.c.

Signed-off-by: Philippe Mathieu-Daudé 
---
 include/hw/s390x/storage-attributes.h | 4 
 include/hw/s390x/storage-keys.h   | 4 
 include/monitor/hmp-target.h  | 5 +
 hw/s390x/s390-skeys.c | 2 ++
 hw/s390x/s390-stattrib.c  | 2 ++
 monitor/hmp-target.c  | 5 -
 6 files changed, 9 insertions(+), 13 deletions(-)

diff --git a/include/hw/s390x/storage-attributes.h 
b/include/hw/s390x/storage-attributes.h
index 8921a04d51..4916c75936 100644
--- a/include/hw/s390x/storage-attributes.h
+++ b/include/hw/s390x/storage-attributes.h
@@ -13,7 +13,6 @@
 #define S390_STORAGE_ATTRIBUTES_H
 
 #include "hw/qdev-core.h"
-#include "monitor/monitor.h"
 #include "qom/object.h"
 
 #define TYPE_S390_STATTRIB "s390-storage_attributes"
@@ -73,7 +72,4 @@ static inline Object *kvm_s390_stattrib_create(void)
 }
 #endif
 
-void hmp_info_cmma(Monitor *mon, const QDict *qdict);
-void hmp_migrationmode(Monitor *mon, const QDict *qdict);
-
 #endif /* S390_STORAGE_ATTRIBUTES_H */
diff --git a/include/hw/s390x/storage-keys.h b/include/hw/s390x/storage-keys.h
index aa2ec2aae5..1d9b7ead44 100644
--- a/include/hw/s390x/storage-keys.h
+++ b/include/hw/s390x/storage-keys.h
@@ -13,7 +13,6 @@
 #define S390_STORAGE_KEYS_H
 
 #include "hw/qdev-core.h"
-#include "monitor/monitor.h"
 #include "qom/object.h"
 
 #define TYPE_S390_SKEYS "s390-skeys"
@@ -114,7 +113,4 @@ void s390_skeys_init(void);
 
 S390SKeysState *s390_get_skeys_device(void);
 
-void hmp_dump_skeys(Monitor *mon, const QDict *qdict);
-void hmp_info_skeys(Monitor *mon, const QDict *qdict);
-
 #endif /* S390_STORAGE_KEYS_H */
diff --git a/include/monitor/hmp-target.h b/include/monitor/hmp-target.h
index b679aaebbf..024cff0052 100644
--- a/include/monitor/hmp-target.h
+++ b/include/monitor/hmp-target.h
@@ -61,4 +61,9 @@ void hmp_gva2gpa(Monitor *mon, const QDict *qdict);
 void hmp_gpa2hva(Monitor *mon, const QDict *qdict);
 void hmp_gpa2hpa(Monitor *mon, const QDict *qdict);
 
+void hmp_dump_skeys(Monitor *mon, const QDict *qdict);
+void hmp_info_skeys(Monitor *mon, const QDict *qdict);
+void hmp_info_cmma(Monitor *mon, const QDict *qdict);
+void hmp_migrationmode(Monitor *mon, const QDict *qdict);
+
 #endif /* MONITOR_HMP_TARGET_H */
diff --git a/hw/s390x/s390-skeys.c b/hw/s390x/s390-skeys.c
index 5c535d483e..7b2ccb94a5 100644
--- a/hw/s390x/s390-skeys.c
+++ b/hw/s390x/s390-skeys.c
@@ -23,6 +23,8 @@
 #include "sysemu/kvm.h"
 #include "migration/qemu-file-types.h"
 #include "migration/register.h"
+#include "monitor/hmp-target.h"
+#include "monitor/monitor.h"
 
 #define S390_SKEYS_BUFFER_SIZE (128 * KiB)  /* Room for 128k storage keys */
 #define S390_SKEYS_SAVE_FLAG_EOS 0x01
diff --git a/hw/s390x/s390-stattrib.c b/hw/s390x/s390-stattrib.c
index c4259b5327..9b4b8d8d0c 100644
--- a/hw/s390x/s390-stattrib.c
+++ b/hw/s390x/s390-stattrib.c
@@ -19,6 +19,8 @@
 #include "exec/ram_addr.h"
 #include "qapi/error.h"
 #include "qapi/qmp/qdict.h"
+#include "monitor/hmp-target.h"
+#include "monitor/monitor.h"
 #include "cpu.h"
 
 /* 512KiB cover 2GB of guest memory */
diff --git a/monitor/hmp-target.c b/monitor/hmp-target.c
index 1eb72ac1bf..0466474354 100644
--- a/monitor/hmp-target.c
+++ b/monitor/hmp-target.c
@@ -36,11 +36,6 @@
 #include "qapi/error.h"
 #include "qemu/cutils.h"
 
-#if defined(TARGET_S390X)
-#include "hw/s390x/storage-keys.h"
-#include "hw/s390x/storage-attributes.h"
-#endif
-
 /* Make devices configuration available for use in hmp-commands*.hx templates 
*/
 #include CONFIG_DEVICES
 
-- 
2.41.0

[RFC PATCH 2/3] monitor: Allow passing HMP arguments to QMP HumanReadableText API

Allow HMP commands implemented using the HumanReadableText API
(via the HMPCommand::cmd_info_hrt handler) to pass arguments
to the QMP equivalent command. The arguments are serialized as
a JSON dictionary.

Signed-off-by: Philippe Mathieu-Daudé 
---
 docs/devel/writing-monitor-commands.rst | 15 ++-
 qapi/machine.json   | 24 
 include/monitor/monitor.h   |  3 ++-
 monitor/monitor-internal.h  |  2 +-
 accel/tcg/monitor.c |  4 ++--
 hw/core/loader.c|  2 +-
 hw/core/machine-qmp-cmds.c  |  9 +
 hw/usb/bus.c|  2 +-
 monitor/hmp-target.c|  3 ++-
 monitor/hmp.c   | 11 +++
 10 files changed, 59 insertions(+), 16 deletions(-)

diff --git a/docs/devel/writing-monitor-commands.rst 
b/docs/devel/writing-monitor-commands.rst
index 930da5cd06..843458e52c 100644
--- a/docs/devel/writing-monitor-commands.rst
+++ b/docs/devel/writing-monitor-commands.rst
@@ -561,6 +561,7 @@ returns a ``HumanReadableText``::
  # Since: 6.2
  ##
  { 'command': 'x-query-roms',
+   'data': { 'json-args': 'str'},
'returns': 'HumanReadableText',
'features': [ 'unstable' ] }
 
@@ -578,7 +579,7 @@ Implementing the QMP command
 The QMP implementation will typically involve creating a ``GString``
 object and printing formatted data into it, like this::
 
- HumanReadableText *qmp_x_query_roms(Error **errp)
+ HumanReadableText *qmp_x_query_roms(const char *json_args, Error **errp)
  {
  g_autoptr(GString) buf = g_string_new("");
  Rom *rom;
@@ -596,6 +597,18 @@ object and printing formatted data into it, like this::
 The actual implementation emits more information.  You can find it in
 hw/core/loader.c.
 
+For QMP command taking (optional) parameters, these parameters are
+serialized as a JSON dictionary, and can be retrieved using the QDict
+API. If the previous ``x-query-roms`` command were taking a "index"
+argument, it could be retrieved as::
+
+ HumanReadableText *qmp_x_query_roms(const char *json_args, Error **errp)
+ {
+ g_autoptr(GString) buf = g_string_new("");
+ QDict *qdict = qobject_to(QDict, qobject_from_json(json_args, &error_abort));
+ uint64_t index = qdict_get_int(qdict, "index");
+ ...
+ }
 
 Implementing the HMP command
 
diff --git a/qapi/machine.json b/qapi/machine.json
index 1283d14493..6da72f2585 100644
--- a/qapi/machine.json
+++ b/qapi/machine.json
@@ -1697,6 +1697,8 @@
 #
 # Query interrupt statistics
 #
+# @json-args: HMP arguments encoded as JSON string (unused for this command).
+#
 # Features:
 #
 # @unstable: This command is meant for debugging.
@@ -1706,6 +1708,7 @@
 # Since: 6.2
 ##
 { 'command': 'x-query-irq',
+  'data': { 'json-args': 'str'},
   'returns': 'HumanReadableText',
   'features': [ 'unstable' ] }
 
@@ -1714,6 +1717,8 @@
 #
 # Query TCG compiler statistics
 #
+# @json-args: HMP arguments encoded as JSON string (unused for this command).
+#
 # Features:
 #
 # @unstable: This command is meant for debugging.
@@ -1723,6 +1728,7 @@
 # Since: 6.2
 ##
 { 'command': 'x-query-jit',
+  'data': { 'json-args': 'str'},
   'returns': 'HumanReadableText',
   'if': 'CONFIG_TCG',
   'features': [ 'unstable' ] }
@@ -1732,6 +1738,8 @@
 #
 # Query NUMA topology information
 #
+# @json-args: HMP arguments encoded as JSON string (unused for this command).
+#
 # Features:
 #
 # @unstable: This command is meant for debugging.
@@ -1741,6 +1749,7 @@
 # Since: 6.2
 ##
 { 'command': 'x-query-numa',
+  'data': { 'json-args': 'str'},
   'returns': 'HumanReadableText',
   'features': [ 'unstable' ] }
 
@@ -1749,6 +1758,8 @@
 #
 # Query TCG opcode counters
 #
+# @json-args: HMP arguments encoded as JSON string (unused for this command).
+#
 # Features:
 #
 # @unstable: This command is meant for debugging.
@@ -1758,6 +1769,7 @@
 # Since: 6.2
 ##
 { 'command': 'x-query-opcount',
+  'data': { 'json-args': 'str'},
   'returns': 'HumanReadableText',
   'if': 'CONFIG_TCG',
   'features': [ 'unstable' ] }
@@ -1767,6 +1779,8 @@
 #
 # Query system ramblock information
 #
+# @json-args: HMP arguments encoded as JSON string (unused for this command).
+#
 # Features:
 #
 # @unstable: This command is meant for debugging.
@@ -1776,6 +1790,7 @@
 # Since: 6.2
 ##
 { 'command': 'x-query-ramblock',
+  'data': { 'json-args': 'str'},
   'returns': 'HumanReadableText',
   'features': [ 'unstable' ] }
 
@@ -1784,6 +1799,8 @@
 #
 # Query information on the registered ROMS
 #
+# @json-args: HMP arguments encoded as JSON string (unused for this command).
+#
 # Features:
 #
 # @unstable: This command is meant for debugging.
@@ -1793,6 +1810,7 @@
 # Since: 6.2
 ##
 { 'command': 'x-query-roms',
+  'data': { 'json-args': 'str'},
   'returns': 'HumanReadableText',
   'features': [ 'unstable' ] }
 
@@ -1801,6 +1819,8 @@
 #
 # Query information on the USB devices
 #
+# @json-args: HMP arguments

Re: [PATCH 0/5] trace: Remove and forbid newline characters in event format


On 10/6/24 19:05, Stefan Hajnoczi wrote:

On Thu, Jun 06, 2024 at 12:39:38PM +0200, Philippe Mathieu-Daudé wrote:

Trace events aren't designed to span multiple lines.
A few formats use the newline character: remove it
and forbid further uses.

Philippe Mathieu-Daudé (5):
   backends/tpm: Remove newline character in trace event
   hw/sh4: Remove newline character in trace events
   hw/usb: Remove newline character in trace events
   hw/vfio: Remove newline character in trace events
   tracetool: Forbid newline character in event format
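
For readers who want to audit their own trace-events files, the check described above could look roughly like the Python sketch below. It is a simplification and not the tracetool implementation: the regular expression only handles plain `name(args) "format"` lines, and the sample events are hypothetical.

```python
import re

# An event line looks like: name(type arg, ...) "format string"
EVENT_RE = re.compile(r'^(?P<name>\w+)\((?P<args>[^)]*)\)\s*(?P<fmt>".*")\s*$')


def events_with_newline(text: str) -> list[str]:
    """Return the names of events whose format string contains a \\n escape."""
    bad = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        m = EVENT_RE.match(line)
        # The file contains a literal backslash followed by 'n'.
        if m and "\\n" in m.group("fmt"):
            bad.append(m.group("name"))
    return bad


SAMPLE = r'''
# hypothetical events, for illustration only
example_ok(int value) "value=%d"
example_bad(int value) "value=%d\n"
'''

print(events_with_newline(SAMPLE))
```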




Thanks, applied to my tracing tree:
https://gitlab.com/stefanha/qemu/commits/tracing


Thanks!

[RFC PATCH 3/3] hw/s390x: Introduce x-query-s390x-cmma QMP command

This is a counterpart to the HMP "info cmma" command. It is being
added with an "x-" prefix because this QMP command is intended as an
ad hoc debugging tool and will thus not be modelled in QAPI as fully
structured data, nor will it have long term guaranteed stability.
The existing HMP command is rewritten to call the QMP command.

Signed-off-by: Philippe Mathieu-Daudé 
---
 qapi/machine.json| 20 
 hw/s390x/s390-stattrib.c | 28 ++--
 hmp-commands-info.hx |  2 +-
 3 files changed, 39 insertions(+), 11 deletions(-)

diff --git a/qapi/machine.json b/qapi/machine.json
index 6da72f2585..a56b7572b1 100644
--- a/qapi/machine.json
+++ b/qapi/machine.json
@@ -1905,3 +1905,23 @@
   'data': { 'json-args': 'str'},
   'returns': 'HumanReadableText',
   'features': [ 'unstable' ]}
+
+##
+# @x-query-s390x-cmma:
+#
+# Query information on s390x CMMA storage attributes
+#
+# @json-args: HMP arguments encoded as JSON string.
+#
+# Features:
+#
+# @unstable: This command is meant for debugging.
+#
+# Returns: s390x CMMA storage attributes information
+#
+# Since: 9.1
+##
+{ 'command': 'x-query-s390x-cmma',
+  'data': { 'json-args': 'str'},
+  'returns': 'HumanReadableText',
+  'features': [ 'unstable' ]}
diff --git a/hw/s390x/s390-stattrib.c b/hw/s390x/s390-stattrib.c
index 9b4b8d8d0c..8c2372bd71 100644
--- a/hw/s390x/s390-stattrib.c
+++ b/hw/s390x/s390-stattrib.c
@@ -19,6 +19,9 @@
 #include "exec/ram_addr.h"
 #include "qapi/error.h"
 #include "qapi/qmp/qdict.h"
+#include "qapi/qapi-commands-machine.h"
+#include "qapi/qmp/qjson.h"
+#include "qapi/type-helpers.h"
 #include "monitor/hmp-target.h"
 #include "monitor/monitor.h"
 #include "cpu.h"
@@ -73,10 +76,12 @@ void hmp_migrationmode(Monitor *mon, const QDict *qdict)
 }
 }
 
-void hmp_info_cmma(Monitor *mon, const QDict *qdict)
+HumanReadableText *qmp_x_query_s390x_cmma(const char *json_args, Error **errp)
 {
+    g_autoptr(GString) buf = g_string_new("");
     S390StAttribState *sas = s390_get_stattrib_device();
     S390StAttribClass *sac = S390_STATTRIB_GET_CLASS(sas);
+    QDict *qdict = qobject_to(QDict, qobject_from_json(json_args, &error_abort));
     uint64_t addr = qdict_get_int(qdict, "addr");
     uint64_t buflen = qdict_get_try_int(qdict, "count", 8);
     uint8_t *vals;
@@ -84,30 +89,33 @@ void hmp_info_cmma(Monitor *mon, const QDict *qdict)
 
     vals = g_try_malloc(buflen);
     if (!vals) {
-        monitor_printf(mon, "Error: %s\n", strerror(errno));
-        return;
+        error_setg(errp, "Failed to allocate memory");
+        return NULL;
     }
 
     len = sac->peek_stattr(sas, addr / TARGET_PAGE_SIZE, buflen, vals);
     if (len < 0) {
-        monitor_printf(mon, "Error: %s", strerror(-len));
+        error_setg_errno(errp, -len, "Could not get attributes");
         goto out;
     }
 
-    monitor_printf(mon, "  CMMA attributes, "
-                   "pages %" PRIu64 "+%d (0x%" PRIx64 "):\n",
-                   addr / TARGET_PAGE_SIZE, len, addr & ~TARGET_PAGE_MASK);
+    g_string_append_printf(buf, "  CMMA attributes, "
+                           "pages %" PRIu64 "+%d (0x%" PRIx64 "):\n",
+                           addr / TARGET_PAGE_SIZE, len,
+                           addr & ~TARGET_PAGE_MASK);
     for (cx = 0; cx < len; cx++) {
         if (cx % 8 == 7) {
-            monitor_printf(mon, "%02x\n", vals[cx]);
+            g_string_append_printf(buf, "%02x\n", vals[cx]);
         } else {
-            monitor_printf(mon, "%02x", vals[cx]);
+            g_string_append_printf(buf, "%02x", vals[cx]);
         }
     }
-    monitor_printf(mon, "\n");
+    g_string_append_c(buf, '\n');
 
 out:
+    qobject_unref(qdict);
     g_free(vals);
+    return human_readable_text_from_str(buf);
 }
 
 /* Migration support: */
diff --git a/hmp-commands-info.hx b/hmp-commands-info.hx
index cfd4ad5651..0a944e43ce 100644
--- a/hmp-commands-info.hx
+++ b/hmp-commands-info.hx
@@ -720,7 +720,7 @@ ERST
 .args_type  = "addr:l,count:l?",
 .params = "address [count]",
 .help   = "Display the values of the CMMA storage attributes for a 
range of pages",
-.cmd= hmp_info_cmma,
+.cmd_info_hrt = qmp_x_query_s390x_cmma,
 },
 #endif
 
-- 
2.41.0

[RFC PATCH 0/3] monitor: Pass HMP arguments to QMP HumanReadableText API as JSON

Current HMPCommand::cmd_info_hrt() handlers don't allow
passing arguments from the monitor. This series passes them
to the underlying QMP commands as a JSON dictionary,
easily deserialized as QDict, similarly to how current
HMP commands receive their arguments. Thus very few
changes are required to port to the new API. As an
example, the @x-query-s390x-cmma command is ported.
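
To illustrate the round trip described above, here is a minimal sketch in
Python rather than QEMU's C. The function names encode_hmp_args() and
decode_cmma_args() are hypothetical and are not part of the series; only the
"addr"/"count" keys and the default of 8 are taken from the s390x patch.

```python
import json


def encode_hmp_args(args):
    """Serialize already-parsed HMP arguments as a single JSON string.

    Hypothetical helper: it models the value carried in 'json-args'.
    """
    return json.dumps(args)


def decode_cmma_args(json_args):
    """Decode 'json-args' and apply the default for the optional argument."""
    qdict = json.loads(json_args)
    addr = qdict["addr"]           # mandatory, in the spirit of qdict_get_int()
    count = qdict.get("count", 8)  # optional, like qdict_get_try_int(..., 8)
    return addr, count


# "info cmma 0x1000" without a count falls back to 8 entries.
print(decode_cmma_args(encode_hmp_args({"addr": 0x1000})))
```

The point of the sketch is only that the handler on the QMP side sees the
same key/value pairs that an HMP handler would have received as a QDict.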

Based-on: <20240610063518.50680-1-phi...@linaro.org>

Philippe Mathieu-Daudé (3):
  hw/s390x: Declare target specific monitor commands in hmp-target.h
  monitor: Allow passing HMP arguments to QMP HumanReadableText API
  hw/s390x: Introduce x-query-s390x-cmma QMP command

 docs/devel/writing-monitor-commands.rst | 15 -
 qapi/machine.json   | 44 +
 include/hw/s390x/storage-attributes.h   |  4 ---
 include/hw/s390x/storage-keys.h |  4 ---
 include/monitor/hmp-target.h|  5 +++
 include/monitor/monitor.h   |  3 +-
 monitor/monitor-internal.h  |  2 +-
 accel/tcg/monitor.c |  4 +--
 hw/core/loader.c|  2 +-
 hw/core/machine-qmp-cmds.c  |  9 ++---
 hw/s390x/s390-skeys.c   |  2 ++
 hw/s390x/s390-stattrib.c| 30 +++--
 hw/usb/bus.c|  2 +-
 monitor/hmp-target.c|  8 ++---
 monitor/hmp.c   | 11 ---
 hmp-commands-info.hx|  2 +-
 16 files changed, 107 insertions(+), 40 deletions(-)

-- 
2.41.0

Re: [PATCH qemu ] hw/acpi: Fix big endian host creation of Generic Port Affinity Structures

2024-06-10 Thread Jonathan Cameron via



Hi Igor,

Some code snippets below to try and see if I'm on the correct track
for what you had in mind.

> >   
> > > diff --git a/hw/acpi/acpi_generic_initiator.c 
> > > b/hw/acpi/acpi_generic_initiator.c
> > > index 78b80dcf08..f064753b67 100644
> > > --- a/hw/acpi/acpi_generic_initiator.c
> > > +++ b/hw/acpi/acpi_generic_initiator.c
> > > @@ -151,7 +151,9 @@ build_srat_generic_node_affinity(GArray *table_data, 
> > > int node,
> > >  build_append_int_noprefix(table_data, 0, 12);
> > >  } else {
> > >  /* Device Handle - ACPI */
> > > -build_append_int_noprefix(table_data, handle->hid, 8);
> > > +for (int i = 0; i < sizeof(handle->hid); i++) {
> > > +build_append_int_noprefix(table_data, handle->hid[i], 1);
> > > +}
> > >  build_append_int_noprefix(table_data, handle->uid, 4);
> > >  build_append_int_noprefix(table_data, 0, 4);
> > 
> > instead of open coding the structure
> > 
> > it might be better to introduce helper in aml_build.c
> > something like 
> >   /* proper reference to spec as we do for other ACPI primitives */
> >   build_append_srat_acpi_device_handle(GArray *table_data, char* hid, 
> > uint32_t uid)
> >   assert(strlen(hid) ...
> >   for() {
> > build_append_byte()
> >   }  
> >   ...
> > 
> > the same applies to "Device Handle - PCI" structure  
> 
> I'll look at moving that stuff and the affinity structure creation
> code themselves in there. I think they ended up in this file because
> of the other infrastructure needed to create these nodes and it
> will have felt natural to keep this together.
> 
> Putting it in aml_build.c will put it with similar code though
> which makes sense to me.

This all works out fine, though there is less reason to keep a
ACPI_GENERIC_NODE base under GENERIC_PORT and GENERIC_INITIATOR
so I may drop that and just have a small amount of code duplication.
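
For reference, the reason the quoted hunk appends the HID one byte at a time
can be sketched in Python. This is only a model: append_int_noprefix() below
stands in for build_append_int_noprefix() on the assumption that it emits the
value least significant byte first, and it assumes the old code handed the
eight HID characters over as one host-order 8-byte integer.

```python
def append_int_noprefix(table, value, size):
    """Append 'value' as 'size' little-endian bytes (model of the C helper)."""
    mask = (1 << (8 * size)) - 1
    table += (value & mask).to_bytes(size, "little")


hid = b"ACPI0016"

# The same 8 bytes in memory read back as different integers depending on
# the byte order of the host.
as_int_le_host = int.from_bytes(hid, "little")
as_int_be_host = int.from_bytes(hid, "big")

old_le, old_be, new = bytearray(), bytearray(), bytearray()
append_int_noprefix(old_le, as_int_le_host, 8)  # old code, little-endian host
append_int_noprefix(old_be, as_int_be_host, 8)  # old code, big-endian host
for byte in hid:                                # byte-wise append from the fix
    append_int_noprefix(new, byte, 1)

print(bytes(old_le), bytes(old_be), bytes(new))
```

Under those assumptions only the big-endian host ends up with the characters
reversed in the table, while the byte-wise loop is independent of host order.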

> 
> > 
> > Also get rid of PCI deps in acpi_generic_initiator.c 
> > move build_all_acpi_generic_initiators/build_srat_generic_pci_initiator into
> > hw/acpi/pci.c  
> 
> Today it's used only for PCI devices, but that's partly an artifact
> of how we get to the root complex via the bus below it.
> 
> Spec wise, it's just as applicable to platform devices etc, but maybe
> we can move it to pci.c for now and move it out again if it gains other
> users. Or leave it in acpi_generic_initiator.c but have all the aml
> stuff in aml_build.c as you suggest. 
> 
> > file if it has to access PCI code/structures directly
> > (which I'm not convinced it should, can we get/expose what it needs as QOM 
> > properties?)  
> 
> Maybe. I'll see what I can come up with.  This feels involved
> however so I'm more doubtful about this as a precursor.

This is a little messy and tricky to get the right level of generic.
For the bdf, were you thinking something along the lines of the following?

diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index 324c1302d2..75366491b7 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -67,6 +67,19 @@ static char *pcibus_get_fw_dev_path(DeviceState *dev);
 static void pcibus_reset_hold(Object *obj, ResetType type);
 static bool pcie_has_upstream_port(PCIDevice *dev);

+static void prop_pci_bdf_get(Object *obj, Visitor *v, const char *name,
+ void *opaque, Error **errp)
+{
+uint16_t bdf = pci_get_bdf(PCI_DEVICE(obj));
+
+visit_type_uint16(v, name, &bdf, errp);
+}
+
+static const PropertyInfo prop_pci_bdf = {
+.name = "bdf",
+.get = prop_pci_bdf_get,
+};
+
 static Property pci_props[] = {
 DEFINE_PROP_PCI_DEVFN("addr", PCIDevice, devfn, -1),
 DEFINE_PROP_STRING("romfile", PCIDevice, romfile),
@@ -85,6 +98,7 @@ static Property pci_props[] = {
 QEMU_PCIE_ERR_UNC_MASK_BITNR, true),
 DEFINE_PROP_BIT("x-pcie-ari-nextfn-1", PCIDevice, cap_present,
 QEMU_PCIE_ARI_NEXTFN_1_BITNR, false),
+{ .name = "bdf", .info = &prop_pci_bdf },
 DEFINE_PROP_END_OF_LIST()
 };


The other case is where I need to get the ACPI UID associated with a
root complex. That has to be matched to the appropriate HID, and so
far the only one of those is ACPI0016, which is the HID for
TYPE_PXB_CXL_DEV. The UID happens to be the bus number of the
TYPE_PXB_CXL_BUS, but that connection should probably not be explicit
outside of the PXB specific code.

I can add a property like: 

diff --git a/hw/pci-bridge/pci_expander_bridge.c 
b/hw/pci-bridge/pci_expander_bridge.c
index f5431443b9..1c51f3f5b6 100644
--- a/hw/pci-bridge/pci_expander_bridge.c
+++ b/hw/pci-bridge/pci_expander_bridge.c
@@ -92,6 +92,21 @@ static void pxb_bus_class_init(ObjectClass *class, void 
*data)
 pbc->numa_node = pxb_bus_numa_node;
 }

+static void prop_pxb_cxl_uid_get(Object *obj, Visitor *v, const char *name,
+ void *opaque, Error **errp)
+{
+uint32_t uid = pci_bus_num(PCI_BUS(obj));
+
+visit_type_uint32(v, name, &uid, errp);
+}
+
+static

Re: [PATCH v2 18/18] migration/ram: Add direct-io support to precopy file migration

2024-06-10 Thread Fabiano Rosas

Peter Xu  writes:

> On Fri, Jun 07, 2024 at 03:42:35PM -0300, Fabiano Rosas wrote:
>> Peter Xu  writes:
>> 
>> > On Thu, May 23, 2024 at 04:05:48PM -0300, Fabiano Rosas wrote:
>> >> We've recently added support for direct-io with multifd, which brings
>> >> performance benefits, but creates a non-uniform user interface by
>> >> coupling direct-io with the multifd capability. This means that users
>> >> cannot keep the direct-io flag enabled while disabling multifd.
>> >> 
>> >> Libvirt in particular already has support for direct-io and parallel
>> >> migration separately from each other, so it would be a regression to
>> >> now require both options together. It's relatively simple for QEMU to
>> >> add support for direct-io migration without multifd, so let's do this
>> >> in order to keep both options decoupled.
>> >> 
>> >> We cannot simply enable the O_DIRECT flag, however, because not all IO
>> >> performed by the migration thread satisfies the alignment requirements
>> >> of O_DIRECT. There are many small read & writes that add headers and
>> >> synchronization flags to the stream, which at the moment are required
>> >> to always be present.
>> >> 
>> >> Fortunately, due to fixed-ram migration there is a discernible moment
>> >> where only RAM pages are written to the migration file. Enable
>> >> direct-io during that moment.
>> >> 
>> >> Signed-off-by: Fabiano Rosas 
>> >
>> > Is anyone going to consume this?  How's the performance?
>> 
>> I don't think we have a pre-determined consumer for this. This came up
>> in an internal discussion about making the interface simpler for libvirt
>> and in a thread on the libvirt mailing list[1] about using O_DIRECT to
>> keep the snapshot data out of the caches to avoid impacting the rest of
>> the system. (I could have described this better in the commit message,
>> sorry).
>> 
>> Quoting Daniel:
>> 
>>   "Note the reason for using O_DIRECT is *not* to make saving / restoring
>>the guest VM faster. Rather it is to ensure that saving/restoring a VM
>>does not trash the host I/O / buffer cache, which will negatively impact
>>performance of all the *other* concurrently running VMs."
>> 
>> 1- https://lore.kernel.org/r/87sez86ztq@suse.de
>> 
>> About performance, a quick test on a stopped 30G guest shows that
>> mapped-ram=on direct-io=on is 12% slower than mapped-ram=on
>> direct-io=off.
>
> Yes, this makes sense.
>
>> 
>> >
>> > It doesn't look super fast to me if we need to enable/disable dio in each
>> > loop.. then it's a matter of whether we should bother, or would it be
>> > easier that we simply require multifd when direct-io=on.
>> 
>> AIUI, the issue here is that users are already allowed to specify in
>> libvirt the equivalent to direct-io and multifd independent of each
>> other (bypass-cache, parallel). To start requiring both together now in
>> some situations would be a regression. I confess I don't know libvirt
>> code to know whether this can be worked around somehow, but as I said,
>> it's a relatively simple change from the QEMU side.
>
> Firstly, I definitely want to already avoid all the calls to either
> migration_direct_io_start() or *_finish(), now we already need to
> explicitly call them in three paths, and that's not intuitive and less
> readable, just like the hard coded rdma codes.

Right, but that's just a side-effect of how the code is structured and
the fact that writes to the stream happen in small chunks. Setting
O_DIRECT needs to happen around aligned IO. We could move the calls
further down into qemu_put_buffer_at(), but that would be four fcntl()
calls for every page.
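
To make "aligned IO" concrete, here is a minimal sketch of the kind of check
involved, in Python rather than QEMU's C. The name is_dio_aligned() and the
4096-byte default are assumptions for illustration only; the real alignment
requirement depends on the filesystem and the underlying device.

```python
def is_dio_aligned(address, offset, length, alignment=4096):
    """Return True if a write could be issued with O_DIRECT enabled.

    Hypothetical helper: buffer address, file offset and length must all be
    multiples of the (assumed) alignment.
    """
    return (address % alignment == 0
            and offset % alignment == 0
            and length % alignment == 0)


# A full RAM page at a page-aligned file offset qualifies.
print(is_dio_aligned(address=0x7F0000000000, offset=0x200000, length=4096))
# An 8-byte header or flag written into the stream does not.
print(is_dio_aligned(address=0x7F0000000010, offset=0x1F8, length=8))
```

The small header and flag writes fail such a check, which is why O_DIRECT can
only be switched on around the part of the stream that carries RAM pages.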

A tangent:
 one thing that occurred to me now is that we may be able to restrict
 calls to qemu_fflush() to internal code like add_to_iovec() and maybe
 use that function to gather the correct amount of data before writing,
 making sure it disables O_DIRECT in case alignment is about to be
 broken?

>
> I also worry we may overlook the complexity here, and pinning buffers
> definitely need more thoughts on its own.  It's easier to digest when using
> multifd and when QEMU only pins guest pages just like tcp-zerocopy does,
> which are naturally host page size aligned, and also guaranteed to not be
> freed (while reused / modified is fine here, as dirty tracking guarantees a
> new page will be migrated soon again).

I don't get this at all, sorry. What is different from multifd here?
We're writing on the same HVA as the one that would be given to multifd
(if it were enabled) and dirty tracking is working the same.

> IMHO here the "not be freed / modified" is even more important than
> "alignment": the latter is about perf, the former is about correctness.
> When we do directio on random buffers, AFAIU we don't want to have the
> buffer modified before flushed to disk, and that's IMHO not easy to
> guarantee.
>
> E.g., I don't think this guarantees a flush on the buffer usages:
>
>   migration_direct_io_start()
> /* flush any potentially unaligned

Re: [PATCH v5 01/10] block: add persistent reservation in/out api

On Thu, Jun 06, 2024 at 08:24:35PM +0800, Changqi Lu wrote:
> Add persistent reservation in/out operations
> at the block level. The following operations
> are included:
> 
> - read_keys:        retrieves the list of registered keys.
> - read_reservation: retrieves the current reservation status.
> - register:         registers a new reservation key.
> - reserve:          initiates a reservation for a specific key.
> - release:          releases a reservation for a specific key.
> - clear:            clears all existing reservations.
> - preempt:          preempts a reservation held by another key.
> 
> Signed-off-by: Changqi Lu 
> Signed-off-by: zhenwei pi 
> ---
>  block/block-backend.c | 397 ++
>  block/io.c| 163 
>  include/block/block-common.h  |  40 +++
>  include/block/block-io.h  |  20 ++
>  include/block/block_int-common.h  |  84 +++
>  include/sysemu/block-backend-io.h |  24 ++
>  6 files changed, 728 insertions(+)
> 
> diff --git a/block/block-backend.c b/block/block-backend.c
> index db6f9b92a3..6707d94df7 100644
> --- a/block/block-backend.c
> +++ b/block/block-backend.c
> @@ -1770,6 +1770,403 @@ BlockAIOCB *blk_aio_ioctl(BlockBackend *blk, unsigned 
> long int req, void *buf,
>  return blk_aio_prwv(blk, req, 0, buf, blk_aio_ioctl_entry, 0, cb, 
> opaque);
>  }
>  
> +typedef struct BlkPrInCo {
> +BlockBackend *blk;
> +uint32_t *generation;
> +uint32_t num_keys;
> +BlockPrType *type;
> +uint64_t *keys;
> +int ret;
> +} BlkPrInCo;
> +
> +typedef struct BlkPrInCB {
> +BlockAIOCB common;
> +BlkPrInCo prco;
> +bool has_returned;
> +} BlkPrInCB;
> +
> +static const AIOCBInfo blk_pr_in_aiocb_info = {
> +.aiocb_size = sizeof(BlkPrInCB),
> +};
> +
> +static void blk_pr_in_complete(BlkPrInCB *acb)
> +{
> +if (acb->has_returned) {
> +acb->common.cb(acb->common.opaque, acb->prco.ret);
> +blk_dec_in_flight(acb->prco.blk);

Did you receive my replies to v1 of this patch series?

Please take a look at them and respond:
https://lore.kernel.org/qemu-devel/20240508093629.441057-1-luchangqi@bytedance.com/

Thanks,
Stefan

> +qemu_aio_unref(acb);
> +}
> +}
> +
> +static void blk_pr_in_complete_bh(void *opaque)
> +{
> +BlkPrInCB *acb = opaque;
> +assert(acb->has_returned);
> +blk_pr_in_complete(acb);
> +}
> +
> +static BlockAIOCB *blk_aio_pr_in(BlockBackend *blk, uint32_t *generation,
> + uint32_t num_keys, BlockPrType *type,
> + uint64_t *keys, CoroutineEntry co_entry,
> + BlockCompletionFunc *cb, void *opaque)
> +{
> +BlkPrInCB *acb;
> +Coroutine *co;
> +
> +blk_inc_in_flight(blk);
> +acb = blk_aio_get(&blk_pr_in_aiocb_info, blk, cb, opaque);
> +acb->prco = (BlkPrInCo) {
> +.blk= blk,
> +.generation = generation,
> +.num_keys   = num_keys,
> +.type   = type,
> +.ret= NOT_DONE,
> +.keys   = keys,
> +};
> +acb->has_returned = false;
> +
> +co = qemu_coroutine_create(co_entry, acb);
> +aio_co_enter(qemu_get_current_aio_context(), co);
> +
> +acb->has_returned = true;
> +if (acb->prco.ret != NOT_DONE) {
> +replay_bh_schedule_oneshot_event(qemu_get_current_aio_context(),
> + blk_pr_in_complete_bh, acb);
> +}
> +
> +return &acb->common;
> +}
> +
> +/* To be called between exactly one pair of blk_inc/dec_in_flight() */
> +static int coroutine_fn
> +blk_aio_pr_do_read_keys(BlockBackend *blk, uint32_t *generation,
> +uint32_t num_keys, uint64_t *keys)
> +{
> +IO_CODE();
> +
> +blk_wait_while_drained(blk);
> +GRAPH_RDLOCK_GUARD();
> +
> +if (!blk_co_is_available(blk)) {
> +return -ENOMEDIUM;
> +}
> +
> +return bdrv_co_pr_read_keys(blk_bs(blk), generation, num_keys, keys);
> +}
> +
> +static void coroutine_fn blk_aio_pr_read_keys_entry(void *opaque)
> +{
> +BlkPrInCB *acb = opaque;
> +BlkPrInCo *prco = &acb->prco;
> +
> +prco->ret = blk_aio_pr_do_read_keys(prco->blk, prco->generation,
> +prco->num_keys, prco->keys);
> +blk_pr_in_complete(acb);
> +}
> +
> +BlockAIOCB *blk_aio_pr_read_keys(BlockBackend *blk, uint32_t *generation,
> + uint32_t num_keys, uint64_t *keys,
> + BlockCompletionFunc *cb, void *opaque)
> +{
> +IO_CODE();
> +return blk_aio_pr_in(blk, generation, num_keys, NULL, keys,
> + blk_aio_pr_read_keys_entry, cb, opaque);
> +}
> +
> +/* To be called between exactly one pair of blk_inc/dec_in_flight() */
> +static int coroutine_fn
> +blk_aio_pr_do_read_reservation(BlockBackend *blk, uint32_t *generation,
> +   uint64_t *key, BlockPrType

Re: [PATCH v5 00/10] Support persistent reservation operations

On Thu, Jun 06, 2024 at 08:24:34PM +0800, Changqi Lu wrote:
> Hi,
> 
> patchv5 has been modified. 
> 
> Sincerely hope that everyone can help review the
> code and provide some suggestions.
> 
> v4->v5:
> - Fixed a memory leak bug at hw/nvme/ctrl.c.
> 
> v3->v4:
> - At the nvme layer, the two patches of enabling the ONCS
>   function and enabling rescap are combined into one.
> - At the nvme layer, add helper functions for pr capacity
>   conversion between the block layer and the nvme layer.
> 
> v2->v3:
> In v2, Persist Through Power Loss (PTPL) was enabled by default.
> In v3, PTPL is supported and is passed as a parameter.
> 
> v1->v2:
> - Add sg_persist --report-capabilities for SCSI protocol and enable
>   oncs and rescap for NVMe protocol.
> - Add persistent reservation capabilities constants and helper functions for
>   SCSI and NVMe protocol.
> - Add comments for necessary APIs.
> 
> v1:
> - Add seven APIs about persistent reservation command for block layer.
>   These APIs including reading keys, reading reservations, registering,
>   reserving, releasing, clearing and preempting.
> - Add the necessary pr-related operation APIs for both the
>   SCSI protocol and NVMe protocol at the device layer.
> - Add scsi driver at the driver layer to verify the functions

My question from v1 is unanswered:

  What is the relationship to the existing PRManager functionality
  (docs/interop/pr-helper.rst) where block/file-posix.c interprets SCSI
  ioctls and sends persistent reservation requests to an external helper
  process?

  I wonder if block/file-posix.c can implement the new block driver
  callbacks using pr_mgr (while keeping the existing scsi-generic
  support).

Thanks,
Stefan

> 
> 
> Changqi Lu (10):
>   block: add persistent reservation in/out api
>   block/raw: add persistent reservation in/out driver
>   scsi/constant: add persistent reservation in/out protocol constants
>   scsi/util: add helper functions for persistent reservation types
> conversion
>   hw/scsi: add persistent reservation in/out api for scsi device
>   block/nvme: add reservation command protocol constants
>   hw/nvme: add helper functions for converting reservation types
>   hw/nvme: enable ONCS and rescap function
>   hw/nvme: add reservation protocal command
>   block/iscsi: add persistent reservation in/out driver
> 
>  block/block-backend.c | 397 ++
>  block/io.c| 163 +++
>  block/iscsi.c | 443 ++
>  block/raw-format.c|  56 
>  hw/nvme/ctrl.c| 326 +-
>  hw/nvme/ns.c  |   5 +
>  hw/nvme/nvme.h|  84 ++
>  hw/scsi/scsi-disk.c   | 352 
>  include/block/block-common.h  |  40 +++
>  include/block/block-io.h  |  20 ++
>  include/block/block_int-common.h  |  84 ++
>  include/block/nvme.h  |  98 +++
>  include/scsi/constants.h  |  52 
>  include/scsi/utils.h  |   8 +
>  include/sysemu/block-backend-io.h |  24 ++
>  scsi/utils.c  |  81 ++
>  16 files changed, 2231 insertions(+), 2 deletions(-)
> 
> -- 
> 2.20.1
> 



[PULL 5/6] hw/vfio: Remove newline character in trace events

From: Philippe Mathieu-Daudé 

Trace events aren't designed to be multi-line.
Remove the newline characters.

Signed-off-by: Philippe Mathieu-Daudé 
Acked-by: Mads Ynddal 
Reviewed-by: Daniel P. Berrangé 
Message-id: 20240606103943.79116-5-phi...@linaro.org
Signed-off-by: Stefan Hajnoczi 
---
 hw/vfio/trace-events | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/vfio/trace-events b/hw/vfio/trace-events
index 64161bf6f4..e16179b507 100644
--- a/hw/vfio/trace-events
+++ b/hw/vfio/trace-events
@@ -19,7 +19,7 @@ vfio_msix_fixup(const char *name, int bar, uint64_t start, 
uint64_t end) " (%s)
 vfio_msix_relo(const char *name, int bar, uint64_t offset) " (%s) BAR %d 
offset 0x%"PRIx64""
 vfio_msi_enable(const char *name, int nr_vectors) " (%s) Enabled %d MSI 
vectors"
 vfio_msi_disable(const char *name) " (%s)"
-vfio_pci_load_rom(const char *name, unsigned long size, unsigned long offset, 
unsigned long flags) "Device %s ROM:\n  size: 0x%lx, offset: 0x%lx, flags: 
0x%lx"
+vfio_pci_load_rom(const char *name, unsigned long size, unsigned long offset, 
unsigned long flags) "Device '%s' ROM: size: 0x%lx, offset: 0x%lx, flags: 0x%lx"
 vfio_rom_read(const char *name, uint64_t addr, int size, uint64_t data) " (%s, 
0x%"PRIx64", 0x%x) = 0x%"PRIx64
 vfio_pci_size_rom(const char *name, int size) "%s ROM size 0x%x"
 vfio_vga_write(uint64_t addr, uint64_t data, int size) " (0x%"PRIx64", 
0x%"PRIx64", %d)"
@@ -35,7 +35,7 @@ vfio_pci_hot_reset(const char *name, const char *type) " (%s) 
%s"
 vfio_pci_hot_reset_has_dep_devices(const char *name) "%s: hot reset dependent 
devices:"
 vfio_pci_hot_reset_dep_devices(int domain, int bus, int slot, int function, 
int group_id) "\t%04x:%02x:%02x.%x group %d"
 vfio_pci_hot_reset_result(const char *name, const char *result) "%s hot reset: 
%s"
-vfio_populate_device_config(const char *name, unsigned long size, unsigned 
long offset, unsigned long flags) "Device %s config:\n  size: 0x%lx, offset: 
0x%lx, flags: 0x%lx"
+vfio_populate_device_config(const char *name, unsigned long size, unsigned 
long offset, unsigned long flags) "Device '%s' config: size: 0x%lx, offset: 
0x%lx, flags: 0x%lx"
 vfio_populate_device_get_irq_info_failure(const char *errstr) 
"VFIO_DEVICE_GET_IRQ_INFO failure: %s"
 vfio_attach_device(const char *name, int group_id) " (%s) group %d"
 vfio_detach_device(const char *name, int group_id) " (%s) group %d"
-- 
2.45.1

[PULL 6/6] tracetool: Forbid newline character in event format

From: Philippe Mathieu-Daudé 

Events aren't designed to be multi-line. Multiple events
can be used instead. Prevent multi-line formats
by forbidding the newline character.

Signed-off-by: Philippe Mathieu-Daudé 
Acked-by: Mads Ynddal 
Reviewed-by: Daniel P. Berrangé 
Message-id: 20240606103943.79116-6-phi...@linaro.org
Signed-off-by: Stefan Hajnoczi 
---
 scripts/tracetool/__init__.py | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/scripts/tracetool/__init__.py b/scripts/tracetool/__init__.py
index 7237abe0e8..bc03238c0f 100644
--- a/scripts/tracetool/__init__.py
+++ b/scripts/tracetool/__init__.py
@@ -301,6 +301,8 @@ def build(line_str, lineno, filename):
 if fmt.endswith(r'\n"'):
 raise ValueError("Event format must not end with a newline "
  "character")
+if '\\n' in fmt:
+raise ValueError("Event format must not use new line character")
 
 if len(fmt_trans) > 0:
 fmt = [fmt_trans, fmt]
-- 
2.45.1
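
For readers who want to try the rule outside the build, here is a minimal
standalone sketch of the two checks shown in the hunk above. The wrapper name
check_event_format() is hypothetical; in tracetool the checks live inside
build(), and fmt is the quoted format string as it appears in the
trace-events file, so the test is for a literal backslash followed by 'n'.

```python
def check_event_format(fmt):
    """Reject a trace-event format string that contains an escaped newline.

    Hypothetical wrapper around the two checks from the patch.
    """
    if fmt.endswith(r'\n"'):
        raise ValueError("Event format must not end with a newline "
                         "character")
    if '\\n' in fmt:
        raise ValueError("Event format must not use new line character")
    return fmt


# Accepted: the single-line form used after the vfio cleanup.
check_event_format(r'"Device \'%s\' ROM: size: 0x%lx"')

# Rejected: the old multi-line form.
try:
    check_event_format(r'"Device %s ROM:\n  size: 0x%lx"')
except ValueError as err:
    print(err)
```

A format that needs several lines of output is expected to be split into
several events, as done for tpm_util_show_buffer in this pull request.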

[PULL 4/6] hw/usb: Remove newline character in trace events

From: Philippe Mathieu-Daudé 

Trace events aren't designed to be multi-line.
Remove the newline characters.

Signed-off-by: Philippe Mathieu-Daudé 
Acked-by: Mads Ynddal 
Reviewed-by: Daniel P. Berrangé 
Message-id: 20240606103943.79116-4-phi...@linaro.org
Signed-off-by: Stefan Hajnoczi 
---
 hw/usb/trace-events | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/hw/usb/trace-events b/hw/usb/trace-events
index fd7b90d70c..46732717a9 100644
--- a/hw/usb/trace-events
+++ b/hw/usb/trace-events
@@ -15,7 +15,7 @@ usb_ohci_exit(const char *s) "%s"
 
 # hcd-ohci.c
 usb_ohci_iso_td_read_failed(uint32_t addr) "ISO_TD read error at 0x%x"
-usb_ohci_iso_td_head(uint32_t head, uint32_t tail, uint32_t flags, uint32_t 
bp, uint32_t next, uint32_t be, uint32_t framenum, uint32_t startframe, 
uint32_t framecount, int rel_frame_num) "ISO_TD ED head 0x%.8x tailp 
0x%.8x\n0x%.8x 0x%.8x 0x%.8x 0x%.8x\nframe_number 0x%.8x starting_frame 
0x%.8x\nframe_count  0x%.8x relative %d"
+usb_ohci_iso_td_head(uint32_t head, uint32_t tail, uint32_t flags, uint32_t 
bp, uint32_t next, uint32_t be, uint32_t framenum, uint32_t startframe, 
uint32_t framecount, int rel_frame_num) "ISO_TD ED head 0x%.8x tailp 0x%.8x, 
flags 0x%.8x bp 0x%.8x next 0x%.8x be 0x%.8x, frame_number 0x%.8x 
starting_frame 0x%.8x, frame_count 0x%.8x relative %d"
 usb_ohci_iso_td_head_offset(uint32_t o0, uint32_t o1, uint32_t o2, uint32_t 
o3, uint32_t o4, uint32_t o5, uint32_t o6, uint32_t o7) "0x%.8x 0x%.8x 0x%.8x 
0x%.8x 0x%.8x 0x%.8x 0x%.8x 0x%.8x"
 usb_ohci_iso_td_relative_frame_number_neg(int rel) "ISO_TD R=%d < 0"
 usb_ohci_iso_td_relative_frame_number_big(int rel, int count) "ISO_TD R=%d > 
FC=%d"
@@ -23,7 +23,7 @@ usb_ohci_iso_td_bad_direction(int dir) "Bad direction %d"
 usb_ohci_iso_td_bad_bp_be(uint32_t bp, uint32_t be) "ISO_TD bp 0x%.8x be 
0x%.8x"
 usb_ohci_iso_td_bad_cc_not_accessed(uint32_t start, uint32_t next) "ISO_TD cc 
!= not accessed 0x%.8x 0x%.8x"
 usb_ohci_iso_td_bad_cc_overrun(uint32_t start, uint32_t next) "ISO_TD 
start_offset=0x%.8x > next_offset=0x%.8x"
-usb_ohci_iso_td_so(uint32_t so, uint32_t eo, uint32_t s, uint32_t e, const 
char *str, ssize_t len, int ret) "0x%.8x eo 0x%.8x\nsa 0x%.8x ea 0x%.8x\ndir %s 
len %zu ret %d"
+usb_ohci_iso_td_so(uint32_t so, uint32_t eo, uint32_t s, uint32_t e, const 
char *str, ssize_t len, int ret) "0x%.8x eo 0x%.8x sa 0x%.8x ea 0x%.8x dir %s 
len %zu ret %d"
 usb_ohci_iso_td_data_overrun(int ret, ssize_t len) "DataOverrun %d > %zu"
 usb_ohci_iso_td_data_underrun(int ret) "DataUnderrun %d"
 usb_ohci_iso_td_nak(int ret) "got NAK/STALL %d"
@@ -55,7 +55,7 @@ usb_ohci_td_pkt_full(const char *dir, const char *buf) "%s 
data: %s"
 usb_ohci_td_too_many_pending(int ep) "ep=%d"
 usb_ohci_td_packet_status(int status) "status=%d"
 usb_ohci_ed_read_error(uint32_t addr) "ED read error at 0x%x"
-usb_ohci_ed_pkt(uint32_t cur, int h, int c, uint32_t head, uint32_t tail, 
uint32_t next) "ED @ 0x%.8x h=%u c=%u\n  head=0x%.8x tailp=0x%.8x next=0x%.8x"
+usb_ohci_ed_pkt(uint32_t cur, int h, int c, uint32_t head, uint32_t tail, 
uint32_t next) "ED @ 0x%.8x h=%u c=%u head=0x%.8x tailp=0x%.8x next=0x%.8x"
 usb_ohci_ed_pkt_flags(uint32_t fa, uint32_t en, uint32_t d, int s, int k, int 
f, uint32_t mps) "fa=%u en=%u d=%u s=%u k=%u f=%u mps=%u"
 usb_ohci_hcca_read_error(uint32_t addr) "HCCA read error at 0x%x"
 usb_ohci_mem_read(uint32_t size, const char *name, uint32_t addr, uint32_t 
offs, uint32_t val) "%d %s 0x%x %d -> 0x%x"
-- 
2.45.1

Re: [PATCH 11/25] target/i386: replace read_crN helper with read_cr8

2024-06-10 Thread Paolo Bonzini

On Sat, Jun 8, 2024 at 8:46 PM Richard Henderson
 wrote:
>
> On 6/8/24 01:40, Paolo Bonzini wrote:
> > All other control registers are stored plainly in CPUX86State.
>
> s/stored/read/

I mean the CPUX86State is their storage and it's plain. :)


Paolo
>
> Reviewed-by: Richard Henderson 
>
>
> r~
>

[PULL 3/6] hw/sh4: Remove newline character in trace events

From: Philippe Mathieu-Daudé 

Trace events aren't designed to be multi-line. Remove
the newline character which doesn't bring much value.

Signed-off-by: Philippe Mathieu-Daudé 
Acked-by: Mads Ynddal 
Reviewed-by: Daniel P. Berrangé 
Message-id: 20240606103943.79116-3-phi...@linaro.org
Signed-off-by: Stefan Hajnoczi 
---
 hw/sh4/trace-events | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/sh4/trace-events b/hw/sh4/trace-events
index 4b61cd56c8..6bfd7eebc4 100644
--- a/hw/sh4/trace-events
+++ b/hw/sh4/trace-events
@@ -1,3 +1,3 @@
 # sh7750.c
-sh7750_porta(uint16_t prev, uint16_t cur, uint16_t pdtr, uint16_t pctr) "porta 
changed from 0x%04x to 0x%04x\npdtra=0x%04x, pctra=0x%08x"
-sh7750_portb(uint16_t prev, uint16_t cur, uint16_t pdtr, uint16_t pctr) "portb 
changed from 0x%04x to 0x%04x\npdtrb=0x%04x, pctrb=0x%08x"
+sh7750_porta(uint16_t prev, uint16_t cur, uint16_t pdtr, uint16_t pctr) "porta 
changed from 0x%04x to 0x%04x (pdtra=0x%04x, pctra=0x%08x)"
+sh7750_portb(uint16_t prev, uint16_t cur, uint16_t pdtr, uint16_t pctr) "portb 
changed from 0x%04x to 0x%04x (pdtrb=0x%04x, pctrb=0x%08x)"
-- 
2.45.1

[PULL 2/6] backends/tpm: Remove newline character in trace event

From: Philippe Mathieu-Daudé 

Split the 'tpm_util_show_buffer' event in two to avoid
using a newline character.

Signed-off-by: Philippe Mathieu-Daudé 
Acked-by: Mads Ynddal 
Reviewed-by: Daniel P. Berrangé 
Reviewed-by: Stefan Berger 
Message-id: 20240606103943.79116-2-phi...@linaro.org
Signed-off-by: Stefan Hajnoczi 
---
 backends/tpm/tpm_util.c   | 5 +++--
 backends/tpm/trace-events | 3 ++-
 2 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/backends/tpm/tpm_util.c b/backends/tpm/tpm_util.c
index 1856589c3b..cf138551df 100644
--- a/backends/tpm/tpm_util.c
+++ b/backends/tpm/tpm_util.c
@@ -339,10 +339,11 @@ void tpm_util_show_buffer(const unsigned char *buffer,
 size_t len, i;
 char *line_buffer, *p;
 
-if (!trace_event_get_state_backends(TRACE_TPM_UTIL_SHOW_BUFFER)) {
+if (!trace_event_get_state_backends(TRACE_TPM_UTIL_SHOW_BUFFER_CONTENT)) {
 return;
 }
 len = MIN(tpm_cmd_get_size(buffer), buffer_size);
+trace_tpm_util_show_buffer_header(string, len);
 
 /*
  * allocate enough room for 3 chars per buffer entry plus a
@@ -356,7 +357,7 @@ void tpm_util_show_buffer(const unsigned char *buffer,
 }
 p += sprintf(p, "%.2X ", buffer[i]);
 }
-trace_tpm_util_show_buffer(string, len, line_buffer);
+trace_tpm_util_show_buffer_content(line_buffer);
 
 g_free(line_buffer);
 }
diff --git a/backends/tpm/trace-events b/backends/tpm/trace-events
index 1ecef42a07..cb5cfa6510 100644
--- a/backends/tpm/trace-events
+++ b/backends/tpm/trace-events
@@ -10,7 +10,8 @@ tpm_util_get_buffer_size_len(uint32_t len, size_t expected) 
"tpm_resp->len = %u,
 tpm_util_get_buffer_size_hdr_len2(uint32_t len, size_t expected) 
"tpm2_resp->hdr.len = %u, expected = %zu"
 tpm_util_get_buffer_size_len2(uint32_t len, size_t expected) "tpm2_resp->len = 
%u, expected = %zu"
 tpm_util_get_buffer_size(size_t len) "buffersize of device: %zu"
-tpm_util_show_buffer(const char *direction, size_t len, const char *buf) 
"direction: %s len: %zu\n%s"
+tpm_util_show_buffer_header(const char *direction, size_t len) "direction: %s 
len: %zu"
+tpm_util_show_buffer_content(const char *buf) "%s"
 
 # tpm_emulator.c
 tpm_emulator_set_locality(uint8_t locty) "setting locality to %d"
-- 
2.45.1

[PULL 1/6] tracetool: Remove unused vcpu.py script

From: Philippe Mathieu-Daudé 

vcpu.py is pointless since commit 89aafcf2a7 ("trace:
remove code that depends on setting vcpu"), remove it.

Signed-off-by: Philippe Mathieu-Daudé 
Reviewed-by: Daniel P. Berrangé 
Reviewed-by: Zhao Liu 
Message-id: 20240606102631.78152-1-phi...@linaro.org
Signed-off-by: Stefan Hajnoczi 
---
 meson.build   |  1 -
 scripts/tracetool/__init__.py |  8 +
 scripts/tracetool/vcpu.py | 59 ---
 3 files changed, 1 insertion(+), 67 deletions(-)
 delete mode 100644 scripts/tracetool/vcpu.py

diff --git a/meson.build b/meson.build
index ec59effca2..91278667ea 100644
--- a/meson.build
+++ b/meson.build
@@ -3232,7 +3232,6 @@ tracetool_depends = files(
   'scripts/tracetool/format/log_stap.py',
   'scripts/tracetool/format/stap.py',
   'scripts/tracetool/__init__.py',
-  'scripts/tracetool/vcpu.py'
 )
 
 qemu_version_cmd = [find_program('scripts/qemu-version.sh'),
diff --git a/scripts/tracetool/__init__.py b/scripts/tracetool/__init__.py
index b887540a55..7237abe0e8 100644
--- a/scripts/tracetool/__init__.py
+++ b/scripts/tracetool/__init__.py
@@ -306,13 +306,7 @@ def build(line_str, lineno, filename):
 fmt = [fmt_trans, fmt]
 args = Arguments.build(groups["args"])
 
-event = Event(name, props, fmt, args, lineno, filename)
-
-# add implicit arguments when using the 'vcpu' property
-import tracetool.vcpu
-event = tracetool.vcpu.transform_event(event)
-
-return event
+return Event(name, props, fmt, args, lineno, filename)
 
 def __repr__(self):
 """Evaluable string representation for this object."""
diff --git a/scripts/tracetool/vcpu.py b/scripts/tracetool/vcpu.py
deleted file mode 100644
index d232cb1d06..00
--- a/scripts/tracetool/vcpu.py
+++ /dev/null
@@ -1,59 +0,0 @@
-# -*- coding: utf-8 -*-
-
-"""
-Generic management for the 'vcpu' property.
-
-"""
-
-__author__ = "Lluís Vilanova "
-__copyright__  = "Copyright 2016, Lluís Vilanova "
-__license__= "GPL version 2 or (at your option) any later version"
-
-__maintainer__ = "Stefan Hajnoczi"
-__email__  = "stefa...@redhat.com"
-
-
-from tracetool import Arguments, try_import
-
-
-def transform_event(event):
-"""Transform event to comply with the 'vcpu' property (if present)."""
-if "vcpu" in event.properties:
-event.args = Arguments([("void *", "__cpu"), event.args])
-fmt = "\"cpu=%p \""
-event.fmt = fmt + event.fmt
-return event
-
-
-def transform_args(format, event, *args, **kwargs):
-"""Transforms the arguments to suit the specified format.
-
-The format module must implement function 'vcpu_args', which receives the
-implicit arguments added by the 'vcpu' property, and must return suitable
-arguments for the given format.
-
-The function is only called for events with the 'vcpu' property.
-
-Parameters
-==
-format : str
-Format module name.
-event : Event
-args, kwargs
-Passed to 'vcpu_transform_args'.
-
-Returns
-===
-Arguments
-The transformed arguments, including the non-implicit ones.
-
-"""
-if "vcpu" in event.properties:
-ok, func = try_import("tracetool.format." + format,
-  "vcpu_transform_args")
-assert ok
-assert func
-return Arguments([func(event.args[:1], *args, **kwargs),
-  event.args[1:]])
-else:
-return event.args
-- 
2.45.1

[PULL 0/6] Tracing patches

The following changes since commit 80e8f0602168f451a93e71cbb1d59e93d745e62e:

  Merge tag 'bsd-user-misc-2024q2-pull-request' of gitlab.com:bsdimp/qemu into 
staging (2024-06-09 11:21:55 -0700)

are available in the Git repository at:

  https://gitlab.com/stefanha/qemu.git tags/tracing-pull-request

for you to fetch changes up to 4c2b6f328742084a5bd770af7c3a2ef07828c41c:

  tracetool: Forbid newline character in event format (2024-06-10 13:05:27 
-0400)


Pull request

Cleanups from Philippe Mathieu-Daudé.



Philippe Mathieu-Daudé (6):
  tracetool: Remove unused vcpu.py script
  backends/tpm: Remove newline character in trace event
  hw/sh4: Remove newline character in trace events
  hw/usb: Remove newline character in trace events
  hw/vfio: Remove newline character in trace events
  tracetool: Forbid newline character in event format

 meson.build   |  1 -
 backends/tpm/tpm_util.c   |  5 +--
 backends/tpm/trace-events |  3 +-
 hw/sh4/trace-events   |  4 +--
 hw/usb/trace-events   |  6 ++--
 hw/vfio/trace-events  |  4 +--
 scripts/tracetool/__init__.py | 10 ++
 scripts/tracetool/vcpu.py | 59 ---
 8 files changed, 15 insertions(+), 77 deletions(-)
 delete mode 100644 scripts/tracetool/vcpu.py

-- 
2.45.1

Re: [PATCH 24/25] target/i386: do not check PREFIX_LOCK in old-style decoder

2024-06-10 Thread Paolo Bonzini

On Sat, Jun 8, 2024 at 10:16 PM Richard Henderson
 wrote:
>
> On 6/8/24 01:41, Paolo Bonzini wrote:
> > It is already checked before getting there.
> >
> > Signed-off-by: Paolo Bonzini
> > ---
> >   target/i386/tcg/translate.c | 26 --
> >   1 file changed, 8 insertions(+), 18 deletions(-)
>
> Reviewed-by: Richard Henderson 

... except cmpxchg8b/cmpxchg16b do have to accept LOCK. Fortunately
it's trivial to convert them, with just an ugly temporary

if (decode.e.gen == gen_multi0F) {
accept_lock = true;
}

that only lasts one commit. I'll resend this part of the series later
(and BTx as well).

Paolo

Re: [PATCH 2/2] hw/misc/mos6522: Do not open-code hmp_info_human_readable_text()

On Mon, Jun 10, 2024 at 05:07:58PM +0200, Philippe Mathieu-Daudé wrote:
> Register the command 'info via' using HMPCommand::cmd_info_hrt(),
> so it is processed using the generic hmp_info_human_readable_text().
> 
> Signed-off-by: Philippe Mathieu-Daudé 
> ---
>  include/hw/misc/mos6522.h|  2 --
>  include/monitor/hmp-target.h |  1 -
>  hw/misc/mos6522.c| 13 -
>  hmp-commands-info.hx |  2 +-
>  4 files changed, 1 insertion(+), 17 deletions(-)

Reviewed-by: Daniel P. Berrangé 


With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

Re: [PATCH 1/2] hw/misc/mos6522: Expose x-query-mos6522-devices QMP command

On Mon, Jun 10, 2024 at 05:07:57PM +0200, Philippe Mathieu-Daudé wrote:
> This is a counterpart to the HMP "info via" command. It is being
> added with an "x-" prefix because this QMP command is intended as an
> adhoc debugging tool and will thus not be modelled in QAPI as fully
> structured data, nor will it have long term guaranteed stability.
> 
> Signed-off-by: Philippe Mathieu-Daudé 
> ---
>  MAINTAINERS |  2 +-
>  qapi/machine.json   | 17 +
>  hw/misc/mos6522-stubs.c | 18 ++
>  hw/misc/mos6522.c   |  5 +++--
>  hw/misc/meson.build |  3 ++-
>  5 files changed, 41 insertions(+), 4 deletions(-)
>  create mode 100644 hw/misc/mos6522-stubs.c

Reviewed-by: Daniel P. Berrangé 

With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

Re: [PATCH 0/5] trace: Remove and forbid newline characters in event format

On Thu, Jun 06, 2024 at 12:39:38PM +0200, Philippe Mathieu-Daudé wrote:
> Trace events aren't designed to span multiple lines.
> A few formats use the newline character: remove it
> and forbid further uses.
> 
> Philippe Mathieu-Daudé (5):
>   backends/tpm: Remove newline character in trace event
>   hw/sh4: Remove newline character in trace events
>   hw/usb: Remove newline character in trace events
>   hw/vfio: Remove newline character in trace events
>   tracetool: Forbid newline character in event format
> 
>  backends/tpm/tpm_util.c   | 5 +++--
>  backends/tpm/trace-events | 3 ++-
>  hw/sh4/trace-events   | 4 ++--
>  hw/usb/trace-events   | 6 +++---
>  hw/vfio/trace-events  | 4 ++--
>  scripts/tracetool/__init__.py | 2 ++
>  6 files changed, 14 insertions(+), 10 deletions(-)
> 
> -- 
> 2.41.0
> 

Thanks, applied to my tracing tree:
https://gitlab.com/stefanha/qemu/commits/tracing

Stefan
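
The rule this series enforces can be sketched in a few lines of Python. This is only an illustration, not the code added to scripts/tracetool/__init__.py: the function name check_event_format and the error text are assumptions made for the example. It relies on the fact that a format string in a trace-events file is C source text, so a newline shows up as the two characters backslash and 'n'.

```python
def check_event_format(fmt):
    """Reject a trace-events format string that embeds a newline escape.

    Hypothetical helper for illustration; the real check lives in tracetool.
    """
    # The format is C source text, so a newline is written as the two
    # characters backslash + 'n' rather than as a real line break.
    if "\\n" in fmt:
        raise ValueError(
            "trace event format must not contain a newline: %r" % fmt)
    return fmt


# The old TPM event format would be rejected ...
old = '"direction: %s len: %zu\\n%s"'
# ... while the two events that replace it are accepted.
new_header = '"direction: %s len: %zu"'
new_content = '"%s"'

for candidate in (old, new_header, new_content):
    try:
        check_event_format(candidate)
        print("ok:      ", candidate)
    except ValueError as err:
        print("rejected:", err)
```

A plain substring test like this would also flag an escaped backslash followed by 'n'; that corner case is ignored here to keep the sketch short.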



[PATCH v2] scsi-disk: Fix crash for VM configured with USB CDROM after live migration

2024-06-10 Thread Paolo Bonzini

From: Hyman Huang 

For VMs configured with the USB CDROM device:

-drive file=/path/to/local/file,id=drive-usb-disk0,media=cdrom,readonly=on...
-device usb-storage,drive=drive-usb-disk0,id=usb-disk0...

the QEMU process may crash after live migration. To reproduce the issue,
configure a VM (guest OS Ubuntu 20.04 or 21.10) with the following XML:


  
  
  
  
  



Perform live migration repeatedly; a crash may occur after a migration.
The trace log at the source before live migration is as follows:

324808@1711972823.521945:usb_uhci_frame_start nr 319
324808@1711972823.521978:usb_uhci_qh_load qh 0x35cb5400
324808@1711972823.521989:usb_uhci_qh_load qh 0x35cb5480
324808@1711972823.521997:usb_uhci_td_load qh 0x35cb5480, td 0x35cbe000, ctrl 
0x0, token 0xffe07f69
324808@1711972823.522010:usb_uhci_td_nextqh qh 0x35cb5480, td 0x35cbe000
324808@1711972823.522022:usb_uhci_qh_load qh 0x35cb5680
324808@1711972823.522030:usb_uhci_td_load qh 0x35cb5680, td 0x75ac5180, ctrl 
0x1980, token 0x3c903e1
324808@1711972823.522045:usb_uhci_packet_add token 0x103e1, td 0x75ac5180
324808@1711972823.522056:usb_packet_state_change bus 0, port 2, ep 2, packet 
0x559f9ba14b00, state undef -> setup
324808@1711972823.522079:usb_msd_cmd_submit lun 0, tag 0x472, flags 0x0080, 
len 10, data-len 8
324808@1711972823.522107:scsi_req_parsed target 0 lun 0 tag 1138 command 74 dir 
1 length 8
324808@1711972823.522124:scsi_req_parsed_lba target 0 lun 0 tag 1138 command 74 
lba 4096
324808@1711972823.522139:scsi_req_alloc target 0 lun 0 tag 1138
324808@1711972823.522169:scsi_req_continue target 0 lun 0 tag 1138
324808@1711972823.522181:scsi_req_data target 0 lun 0 tag 1138 len 8
324808@1711972823.522194:usb_packet_state_change bus 0, port 2, ep 2, packet 
0x559f9ba14b00, state setup -> complete
324808@1711972823.522209:usb_uhci_packet_complete_success token 0x103e1, td 
0x75ac5180
324808@1711972823.522219:usb_uhci_packet_del token 0x103e1, td 0x75ac5180
324808@1711972823.522232:usb_uhci_td_complete qh 0x35cb5680, td 0x75ac5180

The trace log at the destination after live migration is as follows:

3286206@1711972823.951646:usb_uhci_frame_start nr 320
3286206@1711972823.951663:usb_uhci_qh_load qh 0x35cb5100
3286206@1711972823.951671:usb_uhci_qh_load qh 0x35cb5480
3286206@1711972823.951680:usb_uhci_td_load qh 0x35cb5480, td 0x35cbe000, ctrl 
0x100, token 0xffe07f69
3286206@1711972823.951693:usb_uhci_td_nextqh qh 0x35cb5480, td 0x35cbe000
3286206@1711972823.951702:usb_uhci_qh_load qh 0x35cb5700
3286206@1711972823.951709:usb_uhci_td_load qh 0x35cb5700, td 0x75ac5240, ctrl 
0x3980, token 0xe08369
3286206@1711972823.951727:usb_uhci_queue_add token 0x8369
3286206@1711972823.951735:usb_uhci_packet_add token 0x8369, td 0x75ac5240
3286206@1711972823.951746:usb_packet_state_change bus 0, port 2, ep 1, packet 
0x56066b2fb5a0, state undef -> setup
3286206@1711972823.951766:usb_msd_data_in 8/8 (scsi 8)
2024-04-01 12:00:24.665+: shutting down, reason=crashed

The backtrace reveals the following:

Program terminated with signal SIGSEGV, Segmentation fault.
0  __memmove_sse2_unaligned_erms () at 
../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:312
312movq-8(%rsi,%rdx), %rcx
[Current thread is 1 (Thread 0x7f0a9025fc00 (LWP 3286206))]
(gdb) bt
0  __memmove_sse2_unaligned_erms () at 
../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:312
1  memcpy (__len=8, __src=, __dest=) at 
/usr/include/bits/string_fortified.h:34
2  iov_from_buf_full (iov=, iov_cnt=, 
offset=, buf=0x0, bytes=bytes@entry=8) at ../util/iov.c:33
3  iov_from_buf (bytes=8, buf=, offset=, 
iov_cnt=, iov=)
   at 
/usr/src/debug/qemu-6-6.2.0-75.7.oe1.smartx.git.40.x86_64/include/qemu/iov.h:49
4  usb_packet_copy (p=p@entry=0x56066b2fb5a0, ptr=, 
bytes=bytes@entry=8) at ../hw/usb/core.c:636
5  usb_msd_copy_data (s=s@entry=0x56066c62c770, p=p@entry=0x56066b2fb5a0) at 
../hw/usb/dev-storage.c:186
6  usb_msd_handle_data (dev=0x56066c62c770, p=0x56066b2fb5a0) at 
../hw/usb/dev-storage.c:496
7  usb_handle_packet (dev=0x56066c62c770, p=p@entry=0x56066b2fb5a0) at 
../hw/usb/core.c:455
8  uhci_handle_td (s=s@entry=0x56066bd5f210, q=0x56066bb7fbd0, q@entry=0x0, 
qh_addr=qh_addr@entry=902518530, td=td@entry=0x7fffe6e788f0, td_addr=,
   int_mask=int_mask@entry=0x7fffe6e788e4) at ../hw/usb/hcd-uhci.c:885
9  uhci_process_frame (s=s@entry=0x56066bd5f210) at ../hw/usb/hcd-uhci.c:1061
10 uhci_frame_timer (opaque=opaque@entry=0x56066bd5f210) at 
../hw/usb/hcd-uhci.c:1159
11 timerlist_run_timers (timer_list=0x56066af26bd0) at ../util/qemu-timer.c:642
12 qemu_clock_run_timers (type=QEMU_CLOCK_VIRTUAL) at ../util/qemu-timer.c:656
13 qemu_clock_run_all_timers () at ../util/qemu-timer.c:738
14 main_loop_wait (nonblocking=nonblocking@entry=0) at ../util/main-loop.c:542
15 qemu_main_loop () at ../softmmu/runstate.c:739
16 main (argc=, argv=, envp=) at 
../softmmu/main.c:52
(gdb) frame 5
(gdb) p ((SCSIDiskReq *)s->req)->iov
$1 = {iov_base = 0x0, iov_len = 0}
(gdb) p/x s->req->tag
$2 = 0x472

Re: [PATCH v4 00/15] vfio: VFIO migration support with vIOMMU

2024-06-10 Thread Cédric Le Goater


On 6/7/24 5:10 PM, Joao Martins wrote:

On 06/06/2024 16:43, Cédric Le Goater wrote:

Hello Joao,

On 6/22/23 23:48, Joao Martins wrote:

Hey,

This series introduces support for vIOMMU with VFIO device migration,
particularly related to how we do the dirty page tracking.

Today vIOMMUs serve two purposes: 1) enable interrupt remapping 2)
provide dma translation services for guests to provide some form of
guest kernel managed DMA e.g. for nested virt based usage; (1) is especially
required for big VMs with VFs with more than 255 vcpus. We tackle both
and remove the migration blocker when vIOMMU is present provided the
conditions are met. I have both use-cases here in one series, but I am happy
to tackle them in separate series.

As I found out we don't necessarily need to expose the whole vIOMMU
functionality in order to just support interrupt remapping. x86 IOMMUs
on Windows Server 2018[2] and Linux >=5.10, with qemu 7.1+ (or really
Linux guests with commit c40c10 and since qemu commit 8646d9c773d8)
can instantiate an IOMMU just for interrupt remapping without needing to
be advertised/support DMA translation. AMD IOMMU in theory can provide
the same, but Linux doesn't quite support the IR-only part there yet,
only intel-iommu.

The series is organized as follows:

Patches 1-5: Today we can't gather vIOMMU details before the guest
establishes their first DMA mapping via the vIOMMU. So these first four
patches add a way for vIOMMUs to be asked of their properties at start
of day. I choose the least churn possible way for now (as opposed to a
treewide conversion) and allow easy conversion a posteriori. As
suggested by Peter Xu[7], I have resurrected Yi's patches[5][6] which
allow us to fetch PCI backing vIOMMU attributes, without necessarily
tying the caller (VFIO or anyone else) to an IOMMU MR like I
was doing in v3.

Patches 6-8: Handle configs with vIOMMU interrupt remapping but without
DMA translation allowed. Today the 'dma-translation' attribute is
x86-iommu only, but the way this series is structured nothing stops from
other vIOMMUs supporting it too as long as they use
pci_setup_iommu_ops() and the necessary IOMMU MR get_attr attributes
are handled. The blocker is thus relaxed when vIOMMUs are able to
toggle/report the DMA_TRANSLATION attribute. With the patches up to this set,
we've then tackled item (1) of the second paragraph.

Patches 9-15: Simplified a lot from v2 (patch 9) to only track the complete
IOVA address space, leveraging the logic we use to compose the dirty ranges.
The blocker is once again relaxed for vIOMMUs that advertise their IOVA
addressing limits. This tackles item (2). So far I mainly use it with
intel-iommu, although I have a small set of patches for virtio-iommu per
Alex's suggestion in v2.

Comments, suggestions welcome. Thanks for the review!



I spent some time refreshing your series on upstream QEMU (See [1]) and
gave migration a try with CX-7 VF. LGTM. It doesn't seem we are far
from acceptance in QEMU 9.1. Are we?


Yeah.

There was a comment from Zhenzhong on vfio_viommu_preset() here[0]. I
looked at it again to remind myself what we had to change, but even
after re-reading the thread I can't spot any flaw that needs changing.

[0]
https://lore.kernel.org/qemu-devel/de2b72d2-f56b-9350-ce0f-70edfb58e...@intel.com/#r


I introduced a vfio_devices_all_viommu_preset() routine to check all devices
in a container and a simplified version of vfio_viommu_get_max_iova()
returning the space max_iova.



First, I will resend these with the changes I made :

   vfio/common: Extract vIOMMU code from vfio_sync_dirty_bitmap()
   vfio/common: Move dirty tracking ranges update to helper()

I guess the PCIIOMMUOps::get_iommu_attr needs a close review. Is
IOMMU_ATTR_DMA_TRANSLATION a must have ?


It's sort of the 'correct way' of relaxing vIOMMU checks, because you are 100%
guaranteed that the guest won't do DMA. The other outstanding thing related to
that is for older kernels which is to use the directmap for dirty page tracking,
but the moment a mapping is attempted the migration doesn't start or if it's in
progress it gets aborted[*]:

https://lore.kernel.org/qemu-devel/20230908120521.50903-1-joao.m.mart...@oracle.com/

The above link and DMA_TRANSLATION are mostly for our use case, which only
cares about vIOMMU for interrupt remapping and needs no DMA translation services.
But we can't just disable dma-translation in qemu because it may crash older
kernels, so it supports both old and new this way.

[*] Recently I noticed you improved error reporting, so
vfio_set_migration_error(-EOPNOTSUPP) probably has a better way of getting 
there.


Yes. So, I did a little more change to improve vfio_dirty_tracking_init().


The rest is mostly VFIO internals for dirty tracking.


Right.

I got so derailed by other work, and by stuff required for IOMMU dirty tracking,
that I forgot about these patches, sorry.


That's fine.

I am trying to sort out which patches

Re: [PATCH v4 4/4] iotests: Add `vvfat` tests

2024-06-10 Thread Kevin Wolf

Am 10.06.2024 um 16:11 hat Amjad Alsharafi geschrieben:
> On Mon, Jun 10, 2024 at 02:01:24PM +0200, Kevin Wolf wrote:
> > With the updated test, I can catch the problems that are fixed by
> > patches 1 and 2, but it still doesn't need patch 3 to pass.
> > 
> > Kevin
> > 
> 
> Thanks for reviewing, those are all mistakes, and I fixed them (included
> a small patch to fix these issues at the end...).
> 
> Regarding the failing test, I forgot to also read the files from the fat
> driver, and instead I was just reading from the host filesystem.
> I'm not sure exactly why reading from the filesystem works, but reading
> from the driver (i.e. guest) gives the weird buggy result.
> I have updated the test in the patch below to reflect this.
> 
> I would love if you can test the patch below and let me know if the
> issues are fixed, after that I can send the new series.

Yes, that looks good to me and reproduces a failure without patch 3.

Kevin

Re: [PATCH v4 2/4] vvfat: Fix usage of `info.file.offset`

2024-06-10 Thread Kevin Wolf

Am 05.06.2024 um 02:58 hat Amjad Alsharafi geschrieben:
> The field is marked as "the offset in the file (in clusters)", but it
> was being used like this
> `cluster_size*(nums)+mapping->info.file.offset`, which is incorrect.
> 
> Additionally, removed the `abort` when `first_mapping_index` does not
> match. This is the case when adding new clusters to a file, and
> it's inevitable that we reach this condition if the new
> clusters are not contiguous, so there is no reason to `abort`
> here; execution continues and the new clusters are written to disk
> correctly.
> 
> Signed-off-by: Amjad Alsharafi 

Can you help me understand how first_mapping_index really works?

It seems to me that you get a chain of mappings for each file on the FAT
filesystem, which are just the contiguous areas in it, and
first_mapping_index refers to the mapping at the start of the file. But
for much of the time, it actually doesn't seem to be set at all, so you
have mapping->first_mapping_index == -1. Do you understand the rules
around when it's set and when it isn't?

>  block/vvfat.c | 12 +++-
>  1 file changed, 7 insertions(+), 5 deletions(-)
> 
> diff --git a/block/vvfat.c b/block/vvfat.c
> index 19da009a5b..f0642ac3e4 100644
> --- a/block/vvfat.c
> +++ b/block/vvfat.c
> @@ -1408,7 +1408,9 @@ read_cluster_directory:
>  
>  assert(s->current_fd);
>  
> -
> offset=s->cluster_size*(cluster_num-s->current_mapping->begin)+s->current_mapping->info.file.offset;
> +offset = s->cluster_size *
> +((cluster_num - s->current_mapping->begin)
> ++ s->current_mapping->info.file.offset);
>  if(lseek(s->current_fd, offset, SEEK_SET)!=offset)
>  return -3;
>  s->cluster=s->cluster_buffer;
> @@ -1929,8 +1931,9 @@ get_cluster_count_for_direntry(BDRVVVFATState* s, 
> direntry_t* direntry, const ch
>  (mapping->mode & MODE_DIRECTORY) == 0) {
>  
>  /* was modified in qcow */
> -if (offset != mapping->info.file.offset + s->cluster_size
> -* (cluster_num - mapping->begin)) {
> +if (offset != s->cluster_size
> +* ((cluster_num - mapping->begin)
> ++ mapping->info.file.offset)) {
>  /* offset of this cluster in file chain has changed 
> */
>  abort();
>  copy_it = 1;
> @@ -1944,7 +1947,6 @@ get_cluster_count_for_direntry(BDRVVVFATState* s, 
> direntry_t* direntry, const ch
>  
>  if (mapping->first_mapping_index != first_mapping_index
>  && mapping->info.file.offset > 0) {
> -abort();
>  copy_it = 1;
>  }

I'm unsure which case this represents. If first_mapping_index refers to
the mapping of the first cluster in the file, does this mean we got a
mapping for a different file here? Or is the comparison between -1 and a
real value?

In any case it doesn't seem to be the case that the comment at the
declaration of copy_it describes.

>  
> @@ -2404,7 +2406,7 @@ static int commit_mappings(BDRVVVFATState* s,
>  (mapping->end - mapping->begin);
>  } else
>  next_mapping->info.file.offset = mapping->info.file.offset +
> -mapping->end - mapping->begin;
> +(mapping->end - mapping->begin);
>  
>  mapping = next_mapping;
>  }

Kevin

Re: [PATCH 0/6] refactor RDMA live migration based on rsocket API

On Fri, Jun 07, 2024 at 08:49:01AM +, Gonglei (Arei) wrote:
> Actually we tried this solution, but it didn't work. Pls see patch 3/6
> 
> Known limitations: 
>   For a blocking rsocket fd, if we use io_create_watch to wait for
>   POLLIN or POLLOUT events, since the rsocket fd is blocking, we
>   cannot determine when it is not ready to read/write as we can with
>   non-blocking fds. Therefore, once an event occurs, it will keep
>   occurring, potentially leaving QEMU hanging. So we need to be cautious
>   to avoid hanging when using io_create_watch.

I'm not sure I fully get that part, though.  In:

https://lore.kernel.org/all/ZldY21xVExtiMddB@x1n/

I was thinking that the iochannel would implement its own poll with the _POLL flag, so
in that case it'll call qio_channel_poll() which should call rpoll()
directly. So I didn't expect using qio_channel_create_watch().  I thought
the context was that gmainloop won't work with rsocket fds in general, but maybe
I missed something.

Thanks,

-- 
Peter Xu

Re: [PATCH 0/6] refactor RDMA live migration based on rsocket API

On Fri, Jun 07, 2024 at 08:28:29AM +, Gonglei (Arei) wrote:
> 
> 
> > -Original Message-
> > From: Jinpu Wang [mailto:jinpu.w...@ionos.com]
> > Sent: Friday, June 7, 2024 1:54 PM
> > To: Gonglei (Arei) 
> > Cc: qemu-devel@nongnu.org; pet...@redhat.com; yu.zh...@ionos.com;
> > mgal...@akamai.com; elmar.ger...@ionos.com; zhengchuan
> > ; berra...@redhat.com; arm...@redhat.com;
> > lizhij...@fujitsu.com; pbonz...@redhat.com; m...@redhat.com; Xiexiangyou
> > ; linux-r...@vger.kernel.org; lixiao (H)
> > ; Wangjialin 
> > Subject: Re: [PATCH 0/6] refactor RDMA live migration based on rsocket API
> > 
> > Hi Gonglei, hi folks on the list,
> > 
> > On Tue, Jun 4, 2024 at 2:14 PM Gonglei  wrote:
> > >
> > > From: Jialin Wang 
> > >
> > > Hi,
> > >
> > > This patch series attempts to refactor RDMA live migration by
> > > introducing a new QIOChannelRDMA class based on the rsocket API.
> > >
> > > The /usr/include/rdma/rsocket.h provides a higher level rsocket API
> > > that is a 1-1 match of the normal kernel 'sockets' API, which hides
> > > the detail of rdma protocol into rsocket and allows us to add support
> > > for some modern features like multifd more easily.
> > >
> > > Here is the previous discussion on refactoring RDMA live migration
> > > using the rsocket API:
> > >
> > > https://lore.kernel.org/qemu-devel/20240328130255.52257-1-philmd@linar
> > > o.org/
> > >
> > > We have encountered some bugs when using rsocket and plan to submit
> > > them to the rdma-core community.
> > >
> > > In addition, the use of rsocket makes our programming more convenient,
> > > but it must be noted that this method introduces multiple memory
> > > copies, which can be imagined that there will be a certain performance
> > > degradation, hoping that friends with RDMA network cards can help verify,
> > thank you!
> > First, thanks for the effort. We are running migration tests on our IB fabric with
> > different generations of HCAs from Mellanox. The migration works OK, but there are a few
> > failures; Yu will share the results later separately.
> > 
> 
> Thank you so much. 
> 
> > The one blocker for the change is that the old implementation and the new rsocket
> > implementation don't talk to each other, because they use different wire
> > protocols during connection establishment.
> > E.g. the old RDMA migration has a special control message during the migration
> > flow, while rsocket uses a different control message, so there is no way to
> > migrate a VM using the rdma transport from a version before the rsocket patchset
> > to a new version with the rsocket implementation.
> > 
> > Probably we should keep both implementations for a while, mark the old
> > implementation as deprecated, promote the new implementation, and
> > highlight in the docs that they are not compatible.
> > 
> 
> IMO It makes sense. What's your opinion? @Peter.

Sounds good to me.  We can use an internal property field and enable
rsocket rdma migration on new machine types with rdma protocol, deprecating
both old rdma and that internal field after 2 releases.  So that when
receiving rdma migrations it'll use old property (as old qemu will use old
machine types), but when initiating rdma migration on new binary it'll
switch to rsocket.

It might be more important to address either the failures or perf concerns
that others raised, though.

Thanks,

-- 
Peter Xu

[PATCH v2 1/3] hw/arm/virt: Add serial aliases in DTB

If there is more than one UART in the DTB, then there is no guarantee
on which order a guest is supposed to initialise them.  The standard
solution to this is "serialN" entries in the "/aliases" node of the
dtb which give the nodename of the UARTs.

At the moment we only have two UARTs in the DTB when one is for
the Secure world and one for the Non-Secure world, so this isn't
really a problem. However if we want to add a second NS UART we'll
need the aliases to ensure guests pick the right one.

Signed-off-by: Peter Maydell 
Reviewed-by: Philippe Mathieu-Daudé 
---
 hw/arm/virt.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 3c93c0c0a61..0c1dab67c00 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -284,6 +284,8 @@ static void create_fdt(VirtMachineState *vms)
 }
 }
 
+qemu_fdt_add_subnode(fdt, "/aliases");
+
 /* Clock node, for the benefit of the UART. The kernel device tree
  * binding documentation claims the PL011 node clock properties are
  * optional but in practice if you omit them the kernel refuses to
@@ -939,7 +941,9 @@ static void create_uart(const VirtMachineState *vms, int 
uart,
 
 if (uart == VIRT_UART) {
 qemu_fdt_setprop_string(ms->fdt, "/chosen", "stdout-path", nodename);
+qemu_fdt_setprop_string(ms->fdt, "/aliases", "serial0", nodename);
 } else {
+qemu_fdt_setprop_string(ms->fdt, "/aliases", "serial1", nodename);
 /* Mark as not usable by the normal world */
 qemu_fdt_setprop_string(ms->fdt, nodename, "status", "disabled");
 qemu_fdt_setprop_string(ms->fdt, nodename, "secure-status", "okay");
-- 
2.34.1
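
The role of the "/aliases" node described in this commit message can be illustrated with a toy model. This is a sketch in Python that represents the device tree as nested dicts; it does not use QEMU's qemu_fdt_* helpers or libfdt. The helper names and the two node names are assumptions made for the example. The point is that a guest which honours "serialN" aliases gets a stable UART numbering regardless of the order in which the nodes appear in the DTB.

```python
def add_serial_aliases(fdt, uart_nodes):
    """Record 'serialN' aliases for UART node names, in the given order."""
    aliases = fdt.setdefault("/aliases", {})
    for index, nodename in enumerate(uart_nodes):
        aliases["serial%d" % index] = nodename
    return fdt


def uarts_in_alias_order(fdt):
    """Return UART node names sorted by alias number, as a guest might."""
    prefix = "serial"
    numbered = []
    for key, nodename in fdt.get("/aliases", {}).items():
        suffix = key[len(prefix):]
        if key.startswith(prefix) and suffix.isdigit():
            numbered.append((int(suffix), nodename))
    return [nodename for _, nodename in sorted(numbered)]


# Hypothetical node names; here the second UART's node comes first in the
# tree, so walking the nodes alone would give the wrong order.
fdt = {"/pl011@9040000": {}, "/pl011@9000000": {}}
add_serial_aliases(fdt, ["/pl011@9000000", "/pl011@9040000"])
print(uarts_in_alias_order(fdt))
```

Sorting by the numeric suffix rather than by the key string keeps "serial10" after "serial2", which matters only for boards with many UARTs but costs nothing here.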

[PATCH v2 2/3] hw/arm/virt: Rename VIRT_UART and VIRT_SECURE_UART to VIRT_UART[01]

We're going to make the second UART not always a secure-only device.
Rename the constants VIRT_UART and VIRT_SECURE_UART to VIRT_UART0
and VIRT_UART1 accordingly.

Signed-off-by: Peter Maydell 
Reviewed-by: Philippe Mathieu-Daudé 
---
 include/hw/arm/virt.h|  4 ++--
 hw/arm/virt-acpi-build.c | 12 ++--
 hw/arm/virt.c| 14 +++---
 3 files changed, 15 insertions(+), 15 deletions(-)

diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
index bb486d36b14..1227e7f7f08 100644
--- a/include/hw/arm/virt.h
+++ b/include/hw/arm/virt.h
@@ -59,7 +59,7 @@ enum {
 VIRT_GIC_ITS,
 VIRT_GIC_REDIST,
 VIRT_SMMU,
-VIRT_UART,
+VIRT_UART0,
 VIRT_MMIO,
 VIRT_RTC,
 VIRT_FW_CFG,
@@ -69,7 +69,7 @@ enum {
 VIRT_PCIE_ECAM,
 VIRT_PLATFORM_BUS,
 VIRT_GPIO,
-VIRT_SECURE_UART,
+VIRT_UART1,
 VIRT_SECURE_MEM,
 VIRT_SECURE_GPIO,
 VIRT_PCDIMM_ACPI,
diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index c3ccfef026f..eb5796e309b 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -440,10 +440,10 @@ spcr_setup(GArray *table_data, BIOSLinker *linker, 
VirtMachineState *vms)
 .base_addr.width = 32,
 .base_addr.offset = 0,
 .base_addr.size = 3,
-.base_addr.addr = vms->memmap[VIRT_UART].base,
+.base_addr.addr = vms->memmap[VIRT_UART0].base,
 .interrupt_type = (1 << 3),/* Bit[3] ARMH GIC interrupt*/
 .pc_interrupt = 0, /* IRQ */
-.interrupt = (vms->irqmap[VIRT_UART] + ARM_SPI_BASE),
+.interrupt = (vms->irqmap[VIRT_UART0] + ARM_SPI_BASE),
 .baud_rate = 3,/* 9600 */
 .parity = 0,   /* No Parity */
 .stop_bits = 1,/* 1 Stop bit */
@@ -631,11 +631,11 @@ build_dbg2(GArray *table_data, BIOSLinker *linker, 
VirtMachineState *vms)
 
 /* BaseAddressRegister[] */
 build_append_gas(table_data, AML_AS_SYSTEM_MEMORY, 32, 0, 3,
- vms->memmap[VIRT_UART].base);
+ vms->memmap[VIRT_UART0].base);
 
 /* AddressSize[] */
 build_append_int_noprefix(table_data,
-  vms->memmap[VIRT_UART].size, 4);
+  vms->memmap[VIRT_UART0].size, 4);
 
 /* NamespaceString[] */
 g_array_append_vals(table_data, name, namespace_length);
@@ -816,8 +816,8 @@ build_dsdt(GArray *table_data, BIOSLinker *linker, 
VirtMachineState *vms)
  */
 scope = aml_scope("\\_SB");
 acpi_dsdt_add_cpus(scope, vms);
-acpi_dsdt_add_uart(scope, &memmap[VIRT_UART],
-   (irqmap[VIRT_UART] + ARM_SPI_BASE));
+acpi_dsdt_add_uart(scope, &memmap[VIRT_UART0],
+   (irqmap[VIRT_UART0] + ARM_SPI_BASE));
 if (vmc->acpi_expose_flash) {
 acpi_dsdt_add_flash(scope, &memmap[VIRT_FLASH]);
 }
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 0c1dab67c00..920a9db22f2 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -165,11 +165,11 @@ static const MemMapEntry base_memmap[] = {
 [VIRT_GIC_ITS] ={ 0x0808, 0x0002 },
 /* This redistributor space allows up to 2*64kB*123 CPUs */
 [VIRT_GIC_REDIST] = { 0x080A, 0x00F6 },
-[VIRT_UART] =   { 0x0900, 0x1000 },
+[VIRT_UART0] =  { 0x0900, 0x1000 },
 [VIRT_RTC] ={ 0x0901, 0x1000 },
 [VIRT_FW_CFG] = { 0x0902, 0x0018 },
 [VIRT_GPIO] =   { 0x0903, 0x1000 },
-[VIRT_SECURE_UART] ={ 0x0904, 0x1000 },
+[VIRT_UART1] =  { 0x0904, 0x1000 },
 [VIRT_SMMU] =   { 0x0905, 0x0002 },
 [VIRT_PCDIMM_ACPI] ={ 0x0907, MEMORY_HOTPLUG_IO_LEN },
 [VIRT_ACPI_GED] =   { 0x0908, ACPI_GED_EVT_SEL_LEN },
@@ -212,11 +212,11 @@ static MemMapEntry extended_memmap[] = {
 };
 
 static const int a15irqmap[] = {
-[VIRT_UART] = 1,
+[VIRT_UART0] = 1,
 [VIRT_RTC] = 2,
 [VIRT_PCIE] = 3, /* ... to 6 */
 [VIRT_GPIO] = 7,
-[VIRT_SECURE_UART] = 8,
+[VIRT_UART1] = 8,
 [VIRT_ACPI_GED] = 9,
 [VIRT_MMIO] = 16, /* ...to 16 + NUM_VIRTIO_TRANSPORTS - 1 */
 [VIRT_GIC_V2M] = 48, /* ...to 48 + NUM_GICV2M_SPIS - 1 */
@@ -939,7 +939,7 @@ static void create_uart(const VirtMachineState *vms, int 
uart,
 qemu_fdt_setprop(ms->fdt, nodename, "clock-names",
  clocknames, sizeof(clocknames));
 
-if (uart == VIRT_UART) {
+if (uart == VIRT_UART0) {
 qemu_fdt_setprop_string(ms->fdt, "/chosen", "stdout-path", nodename);
 qemu_fdt_setprop_string(ms->fdt, "/aliases", "serial0", nodename);
 } else {
@@ -2318,11 +2318,11 @@ static void machvirt_init(MachineState *machine)
 
 fdt_add_pmu_nodes(vms);
 
-create_uart(vms, VIRT_UART, sysmem, serial_hd(0));
+create_uart(vms, VIRT_UART0, sysmem, serial_hd(0));
 
 if (vms->secure) {

[PATCH v2 3/3] hw/arm/virt: allow creation of a second NonSecure UART

For some use-cases, it is helpful to have more than one UART
available to the guest.  If the second UART slot is not already used
for a TrustZone Secure-World-only UART, create it as a NonSecure UART
only when the user provides a serial backend (e.g.  via a second
-serial command line option).

This avoids problems where existing guest software only expects a
single UART, and gets confused by the second UART in the DTB.  The
major example of this is older EDK2 firmware, which will send the
GRUB bootloader output to UART1 and the guest serial output to UART0.
Users who want to use both UARTs with a guest setup including EDK2
are advised to update to EDK2 release edk2-stable202311 or newer.
(The prebuilt EDK2 blobs QEMU upstream provides are new enough.)
The relevant EDK2 changes are the ones described here:
https://bugzilla.tianocore.org/show_bug.cgi?id=4577

Inspired-by: Axel Heider 
Signed-off-by: Peter Maydell 
Tested-by: Laszlo Ersek 
---
 docs/system/arm/virt.rst |  6 +-
 include/hw/arm/virt.h|  1 +
 hw/arm/virt-acpi-build.c | 12 
 hw/arm/virt.c| 38 +++---
 4 files changed, 49 insertions(+), 8 deletions(-)

diff --git a/docs/system/arm/virt.rst b/docs/system/arm/virt.rst
index 26fcba00b76..e67e7f0f7c5 100644
--- a/docs/system/arm/virt.rst
+++ b/docs/system/arm/virt.rst
@@ -26,7 +26,7 @@ The virt board supports:
 
 - PCI/PCIe devices
 - Flash memory
-- One PL011 UART
+- Either one or two PL011 UARTs for the NonSecure World
 - An RTC
 - The fw_cfg device that allows a guest to obtain data from QEMU
 - A PL061 GPIO controller
@@ -48,6 +48,10 @@ The virt board supports:
   - A secure flash memory
   - 16MB of secure RAM
 
+The second NonSecure UART only exists if a backend is configured
+explicitly (e.g. with a second -serial command line option) and
+TrustZone emulation is not enabled.
+
 Supported guest CPU types:
 
 - ``cortex-a7`` (32-bit)
diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
index 1227e7f7f08..ab961bb6a9b 100644
--- a/include/hw/arm/virt.h
+++ b/include/hw/arm/virt.h
@@ -151,6 +151,7 @@ struct VirtMachineState {
 bool ras;
 bool mte;
 bool dtb_randomness;
+bool second_ns_uart_present;
 OnOffAuto acpi;
 VirtGICType gic_version;
 VirtIOMMUType iommu;
diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index eb5796e309b..b2366f24f96 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -79,11 +79,11 @@ static void acpi_dsdt_add_cpus(Aml *scope, VirtMachineState *vms)
 }
 
 static void acpi_dsdt_add_uart(Aml *scope, const MemMapEntry *uart_memmap,
-   uint32_t uart_irq)
+   uint32_t uart_irq, int uartidx)
 {
-Aml *dev = aml_device("COM0");
+Aml *dev = aml_device("COM%d", uartidx);
 aml_append(dev, aml_name_decl("_HID", aml_string("ARMH0011")));
-aml_append(dev, aml_name_decl("_UID", aml_int(0)));
+aml_append(dev, aml_name_decl("_UID", aml_int(uartidx)));
 
 Aml *crs = aml_resource_template();
 aml_append(crs, aml_memory32_fixed(uart_memmap->base,
@@ -817,7 +817,11 @@ build_dsdt(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
 scope = aml_scope("\\_SB");
 acpi_dsdt_add_cpus(scope, vms);
 acpi_dsdt_add_uart(scope, &memmap[VIRT_UART0],
-   (irqmap[VIRT_UART0] + ARM_SPI_BASE));
+   (irqmap[VIRT_UART0] + ARM_SPI_BASE), 0);
+if (vms->second_ns_uart_present) {
+acpi_dsdt_add_uart(scope, &memmap[VIRT_UART1],
+   (irqmap[VIRT_UART1] + ARM_SPI_BASE), 1);
+}
 if (vmc->acpi_expose_flash) {
 acpi_dsdt_add_flash(scope, &memmap[VIRT_FLASH]);
 }
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 920a9db22f2..5028af8eb56 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -906,7 +906,7 @@ static void create_gic(VirtMachineState *vms, MemoryRegion *mem)
 }
 
 static void create_uart(const VirtMachineState *vms, int uart,
-MemoryRegion *mem, Chardev *chr)
+MemoryRegion *mem, Chardev *chr, bool secure)
 {
 char *nodename;
 hwaddr base = vms->memmap[uart].base;
@@ -944,6 +944,8 @@ static void create_uart(const VirtMachineState *vms, int uart,
 qemu_fdt_setprop_string(ms->fdt, "/aliases", "serial0", nodename);
 } else {
 qemu_fdt_setprop_string(ms->fdt, "/aliases", "serial1", nodename);
+}
+if (secure) {
 /* Mark as not usable by the normal world */
 qemu_fdt_setprop_string(ms->fdt, nodename, "status", "disabled");
 qemu_fdt_setprop_string(ms->fdt, nodename, "secure-status", "okay");
@@ -2318,11 +2320,41 @@ static void machvirt_init(MachineState *machine)
 
 fdt_add_pmu_nodes(vms);
 
-create_uart(vms, VIRT_UART0, sysmem, serial_hd(0));
+/*
+ * The first UART always exists. If the security extensions are
+ * enabled, the second UART also always exists. Otherwise,

[PATCH v2 0/3] hw/arm: Create second NonSecure UART for virt board

This is v2 of a series I posted back in October last year:
 https://patchew.org/QEMU/20231023161532.2729084-1-peter.mayd...@linaro.org/
At the time I wanted to wait until EDK2 had been updated so it
didn't behave weirdly in the presence of a second UART. That
happened at the tail end of last year, but I'd forgotten that
we never committed the QEMU side of things until Laszlo kindly
reminded me a few days ago. So there are now no blockers to
getting this patchset into QEMU.


For some use-cases, it is helpful to have more than one UART available
to the guest, but the Arm 'virt' board only creates one.  If the
second UART slot is not already used for a TrustZone Secure-World-only
UART, create it as a NonSecure UART if the user provides a serial
backend for it (e.g. via a second -serial command line option).

We've wanted this for literally years; my first attempt at it
was this series in 2017:
https://lore.kernel.org/all/1512745328-5109-1-git-send-email-peter.mayd...@linaro.org/
More recently Axel Heider revived the idea with a patchset in 2022:
https://lore.kernel.org/qemu-devel/166990501232.22022.1658256124453401108...@git.sr.ht/

However it has previously foundered on the problem that EDK2 does
odd things when presented with multiple UARTs in the DTB. (Specifically,
it will send the guest GRUB bootloader output to UART1, debug output
to both UARTs 0 and 1 depending on how far through boot it is, and the
guest kernel will use UART0 since that's what the ACPI tables say.)

Several things here I think mean we can finally get over this issue:
 * I learnt about the device tree "aliases" node; this allows us to
   explicitly say "this node is the first UART and this node is the
   second UART". So guests like Linux that follow this part of the
   DTB spec will always get the UART order correct; and if there are
   obscure guests that turn out to misbehave, we can point at the
   spec and say "this is how you should fix this on your end"...
 * This patch, like Axel's patch, only creates the second UART if
   the user asks for it on the command line, so any pre-existing
   command lines will not change behaviour.
 * Laszlo Ersek has kindly written some EDK2 patches that rationalise
   what it does when it finds more than one UART. This means that
   we can tell any users who do want to use two UARTs with EDK2
   "you should upgrade your EDK2 blobs to version NNN if you want to
   do that". These are now in a released EDK2 and QEMU's EDK2
   blobs have been updated to a version including these changes.
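
The policy described in the points above can be sketched as a small standalone C function. This is only an illustration of the decision, not the code in hw/arm/virt.c: the enum, the function name classify_uart1() and its two boolean parameters are hypothetical stand-ins for "TrustZone emulation is enabled" and "the user supplied a second serial backend".

```c
#include <stdbool.h>

/* Hypothetical summary of what the second UART slot ends up being. */
enum uart1_kind {
    UART1_ABSENT,    /* slot left empty, guest sees a single UART */
    UART1_SECURE,    /* TrustZone Secure-World-only UART */
    UART1_NONSECURE  /* second NonSecure UART */
};

/*
 * secure:       TrustZone emulation is enabled (assumed flag)
 * have_backend: a second serial backend was given, e.g. a second -serial
 */
static enum uart1_kind classify_uart1(bool secure, bool have_backend)
{
    if (secure) {
        /* With security extensions the slot is always the Secure UART. */
        return UART1_SECURE;
    }
    /* Otherwise only create it when the user explicitly asked for it. */
    return have_backend ? UART1_NONSECURE : UART1_ABSENT;
}
```

With this shape, a pre-existing command line with a single -serial option and no TrustZone maps to UART1_ABSENT, which is why existing guests see no change in behaviour.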

Changes since v1:
 * rebased (the search-n-replace patch 2 needed some minor tweaks)
 * updated commit message to patch 3 with details of which EDK2
   release you need for second-uart support

Patches 1 and 2 have been reviewed; patch 3 needs review.

thanks
-- PMM

Peter Maydell (3):
  hw/arm/virt: Add serial aliases in DTB
  hw/arm/virt: Rename VIRT_UART and VIRT_SECURE_UART to VIRT_UART[01]
  hw/arm/virt: allow creation of a second NonSecure UART

 docs/system/arm/virt.rst |  6 -
 include/hw/arm/virt.h|  5 ++--
 hw/arm/virt-acpi-build.c | 22 ++---
 hw/arm/virt.c| 52 +---
 4 files changed, 65 insertions(+), 20 deletions(-)

-- 
2.34.1

Re: [PATCH v2 18/18] migration/ram: Add direct-io support to precopy file migration

On Fri, Jun 07, 2024 at 03:42:35PM -0300, Fabiano Rosas wrote:
> Peter Xu  writes:
> 
> > On Thu, May 23, 2024 at 04:05:48PM -0300, Fabiano Rosas wrote:
> >> We've recently added support for direct-io with multifd, which brings
> >> performance benefits, but creates a non-uniform user interface by
> >> coupling direct-io with the multifd capability. This means that users
> >> cannot keep the direct-io flag enabled while disabling multifd.
> >> 
> >> Libvirt in particular already has support for direct-io and parallel
> >> migration separately from each other, so it would be a regression to
> >> now require both options together. It's relatively simple for QEMU to
> >> add support for direct-io migration without multifd, so let's do this
> >> in order to keep both options decoupled.
> >> 
> >> We cannot simply enable the O_DIRECT flag, however, because not all IO
> >> performed by the migration thread satisfies the alignment requirements
> >> of O_DIRECT. There are many small read & writes that add headers and
> >> synchronization flags to the stream, which at the moment are required
> >> to always be present.
> >> 
> >> Fortunately, due to fixed-ram migration there is a discernible moment
> >> where only RAM pages are written to the migration file. Enable
> >> direct-io during that moment.
> >> 
> >> Signed-off-by: Fabiano Rosas 
> >
> > Is anyone going to consume this?  How's the performance?
> 
> I don't think we have a pre-determined consumer for this. This came up
> in an internal discussion about making the interface simpler for libvirt
> and in a thread on the libvirt mailing list[1] about using O_DIRECT to
> keep the snapshot data out of the caches to avoid impacting the rest of
> the system. (I could have described this better in the commit message,
> sorry).
> 
> Quoting Daniel:
> 
>   "Note the reason for using O_DIRECT is *not* to make saving / restoring
>the guest VM faster. Rather it is to ensure that saving/restoring a VM
>does not trash the host I/O / buffer cache, which will negatively impact
>performance of all the *other* concurrently running VMs."
> 
> 1- https://lore.kernel.org/r/87sez86ztq@suse.de
> 
> About performance, a quick test on a stopped 30G guest shows that
> mapped-ram=on direct-io=on is 12% slower than mapped-ram=on
> direct-io=off.

Yes, this makes sense.

> 
> >
> > It doesn't look super fast to me if we need to enable/disable dio in each
> > loop.. then it's a matter of whether we should bother, or would it be
> > easier that we simply require multifd when direct-io=on.
> 
> AIUI, the issue here is that users are already allowed to specify in
> libvirt the equivalent to direct-io and multifd independent of each
> other (bypass-cache, parallel). To start requiring both together now in
> some situations would be a regression. I confess I don't know libvirt
> code to know whether this can be worked around somehow, but as I said,
> it's a relatively simple change from the QEMU side.

Firstly, I definitely want to avoid all the calls to either
migration_direct_io_start() or *_finish(). We already need to call them
explicitly in three paths, which is neither intuitive nor readable, just
like the hard-coded RDMA code.

I also worry we may overlook the complexity here, and pinning buffers
definitely needs more thought on its own.  It's easier to digest when using
multifd and when QEMU only pins guest pages just like tcp-zerocopy does,
which are naturally host page size aligned, and also guaranteed to not be
freed (while reused / modified is fine here, as dirty tracking guarantees a
new page will be migrated soon again).
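
To make the alignment half of this concrete, here is a minimal sketch of the kind of check a direct-io write has to pass. It is not QEMU code: dio_aligned() is a hypothetical helper, and the required alignment is passed in as a parameter because the real value depends on the filesystem and block device.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Return true if a buffer/length/offset triple is suitable for an
 * O_DIRECT transfer with the given alignment (assumed to be a power
 * of two, e.g. the logical block size or the host page size).
 */
static bool dio_aligned(const void *buf, size_t len, uint64_t offset,
                        size_t align)
{
    /* Reject zero and non-power-of-two alignments. */
    if (align == 0 || (align & (align - 1)) != 0) {
        return false;
    }
    return ((uintptr_t)buf % align) == 0 &&
           (len % align) == 0 &&
           (offset % align) == 0;
}
```

Guest RAM pages pass such a check naturally, while the small header and flag writes that go through the QEMUFile buffer generally would not.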

IMHO here the "not be freed / modified" is even more important than
"alignment": the latter is about perf, the former is about correctness.
When we do directio on random buffers, AFAIU we don't want to have the
buffer modified before flushed to disk, and that's IMHO not easy to
guarantee.

E.g., I don't think this guarantees a flush on the buffer usages:

  migration_direct_io_start()
/* flush any potentially unaligned IO before setting O_DIRECT */
qemu_fflush(file);

qemu_fflush() internally does writev(), and that "flush" is about "flushing
qemufile iov[] to fd", not "flushing buffers to disk".  I think it means
if we do qemu_fflush() then we modify QEMUFile.buf[IO_BUF_SIZE] we're
doomed: we will never know whether dio has happened, and which version of
buffer will be sent; I don't think it's guaranteed it will always be the
old version of the buffer.

However, the issue is that QEMUFile defines qemu_fflush() such that after the
call, buf[] can be reused!  I guess that breaks things in a dio context.

IIUC currently mapped-ram is ok because mapped-ram is just special that it
doesn't have page headers, so it doesn't use the buf[] during iterations;
while for zeropage it uses file_bmap bitmap and that's separate too and
does not generate any byte on the wire either.

xbzrle could use that buf[], but maybe mapped-ram doesn't work

[PATCH v4 3/3] tests/qtest/x86: check for availability of older cpu models before running tests

It is better to check if some older cpu models like 486, athlon, pentium,
penryn, phenom, core2duo etc are available before running their corresponding
tests. Some downstream distributions may no longer support these older cpu
models.

Signature of add_feature_test() has been modified to return void as
FeatureTestArgs* was not used by the caller.

One minor correction. Replaced 'phenom' with '486' in the test
'x86/cpuid/auto-level/phenom/arat' matching the cpu used.

CC: th...@redhat.com
CC: imamm...@redhat.com
Signed-off-by: Ani Sinha 
Reviewed-by: Daniel P. Berrangé 
---
 tests/qtest/test-x86-cpuid-compat.c | 170 ++--
 1 file changed, 108 insertions(+), 62 deletions(-)

changelog:
v2: reworked as per suggestion from danpb.
v3: reworked add_feature_test() same way as add_cpuid_test()
v4: phil's suggestion. tags added.

diff --git a/tests/qtest/test-x86-cpuid-compat.c b/tests/qtest/test-x86-cpuid-compat.c
index 6a39454fce..b9e7e5ef7b 100644
--- a/tests/qtest/test-x86-cpuid-compat.c
+++ b/tests/qtest/test-x86-cpuid-compat.c
@@ -67,10 +67,29 @@ static void test_cpuid_prop(const void *data)
 g_free(path);
 }
 
-static void add_cpuid_test(const char *name, const char *cmdline,
+static void add_cpuid_test(const char *name, const char *cpu,
+   const char *cpufeat, const char *machine,
const char *property, int64_t expected_value)
 {
 CpuidTestArgs *args = g_new0(CpuidTestArgs, 1);
+char *cmdline;
+char *save;
+
+if (!qtest_has_cpu_model(cpu)) {
+return;
+}
+cmdline = g_strdup_printf("-cpu %s", cpu);
+
+if (cpufeat) {
+save = cmdline;
+cmdline = g_strdup_printf("%s,%s", cmdline, cpufeat);
+g_free(save);
+}
+if (machine) {
+save = cmdline;
+cmdline = g_strdup_printf("-machine %s %s", machine, cmdline);
+g_free(save);
+}
 args->cmdline = cmdline;
 args->property = property;
 args->expected_value = expected_value;
@@ -149,12 +168,24 @@ static void test_feature_flag(const void *data)
  * either "feature-words" or "filtered-features", when running QEMU
  * using cmdline
  */
-static FeatureTestArgs *add_feature_test(const char *name, const char *cmdline,
- uint32_t eax, uint32_t ecx,
- const char *reg, int bitnr,
- bool expected_value)
+static void add_feature_test(const char *name, const char *cpu,
+ const char *cpufeat, uint32_t eax,
+ uint32_t ecx, const char *reg,
+ int bitnr, bool expected_value)
 {
 FeatureTestArgs *args = g_new0(FeatureTestArgs, 1);
+char *cmdline;
+
+if (!qtest_has_cpu_model(cpu)) {
+return;
+}
+
+if (cpufeat) {
+cmdline = g_strdup_printf("-cpu %s,%s", cpu, cpufeat);
+} else {
+cmdline = g_strdup_printf("-cpu %s", cpu);
+}
+
 args->cmdline = cmdline;
 args->in_eax = eax;
 args->in_ecx = ecx;
@@ -162,13 +193,17 @@ static FeatureTestArgs *add_feature_test(const char *name, const char *cmdline,
 args->bitnr = bitnr;
 args->expected_value = expected_value;
 qtest_add_data_func(name, args, test_feature_flag);
-return args;
+return;
 }
 
 static void test_plus_minus_subprocess(void)
 {
 char *path;
 
+if (!qtest_has_cpu_model("pentium")) {
+return;
+}
+
 /* Rules:
  * 1)"-foo" overrides "+foo"
  * 2) "[+-]foo" overrides "foo=..."
@@ -198,6 +233,10 @@ static void test_plus_minus_subprocess(void)
 
 static void test_plus_minus(void)
 {
+if (!qtest_has_cpu_model("pentium")) {
+return;
+}
+
 g_test_trap_subprocess("/x86/cpuid/parsing-plus-minus/subprocess", 0, 0);
 g_test_trap_assert_passed();
 g_test_trap_assert_stderr("*Ambiguous CPU model string. "
@@ -217,99 +256,105 @@ int main(int argc, char **argv)
 
 /* Original level values for CPU models: */
 add_cpuid_test("x86/cpuid/phenom/level",
-   "-cpu phenom", "level", 5);
+   "phenom", NULL, NULL, "level", 5);
 add_cpuid_test("x86/cpuid/Conroe/level",
-   "-cpu Conroe", "level", 10);
+   "Conroe", NULL, NULL, "level", 10);
 add_cpuid_test("x86/cpuid/SandyBridge/level",
-   "-cpu SandyBridge", "level", 0xd);
+   "SandyBridge", NULL, NULL, "level", 0xd);
 add_cpuid_test("x86/cpuid/486/xlevel",
-   "-cpu 486", "xlevel", 0);
+   "486", NULL, NULL, "xlevel", 0);
 add_cpuid_test("x86/cpuid/core2duo/xlevel",
-   "-cpu core2duo", "xlevel", 0x80000008);
+   "core2duo", NULL, NULL, "xlevel", 0x80000008);
 add_cpuid_test("x86/cpuid/phenom/xlevel",
-   "-cpu phenom", "xlevel", 0x8000001A);
+   "phenom", NULL, NULL,

[PATCH v3 2/3] tests/qtest/libqtest: add qtest_has_cpu_model() api

Added a new test api qtest_has_cpu_model() in order to check availability of
some cpu models in the current QEMU binary. The specific architecture of the
QEMU binary is selected using the QTEST_QEMU_BINARY environment variable.
This api would be useful to run tests against some older cpu models after
checking if QEMU actually supported these models.
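
As a rough illustration of the lookup only (the QMP query is left out), the check boils down to scanning a NULL-terminated table for a matching name or alias target. The struct and function names below are hypothetical and use plain libc instead of GLib so the sketch stands alone.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for one reported CPU model. */
struct cpu_model_entry {
    const char *name;      /* model name; NULL terminates the table */
    const char *alias_of;  /* model this entry is an alias of, or NULL */
};

/* Return true if cpu matches a model name or the target of an alias. */
static bool has_cpu_model(const struct cpu_model_entry *models,
                          const char *cpu)
{
    for (size_t i = 0; models[i].name != NULL; i++) {
        if (strcmp(cpu, models[i].name) == 0 ||
            (models[i].alias_of &&
             strcmp(cpu, models[i].alias_of) == 0)) {
            return true;
        }
    }
    return false;
}
```

A test would then return early when has_cpu_model() reports false for the model it needs, instead of failing when QEMU rejects the -cpu option.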

CC: th...@redhat.com
Signed-off-by: Ani Sinha 
Reviewed-by: Daniel P. Berrangé 
---
 tests/qtest/libqtest.c | 83 ++
 tests/qtest/libqtest.h |  8 
 2 files changed, 91 insertions(+)

changelog:
v2: changes related to suggestions made by danpb. added tags.
v3: phil's suggestion to rename function and structure names.
diff --git a/tests/qtest/libqtest.c b/tests/qtest/libqtest.c
index d8f80d335e..18e2f7f282 100644
--- a/tests/qtest/libqtest.c
+++ b/tests/qtest/libqtest.c
@@ -37,6 +37,7 @@
 #include "qapi/qmp/qjson.h"
 #include "qapi/qmp/qlist.h"
 #include "qapi/qmp/qstring.h"
+#include "qapi/qmp/qbool.h"
 
 #define MAX_IRQ 256
 
@@ -1471,6 +1472,12 @@ struct MachInfo {
 char *alias;
 };
 
+struct CpuModel {
+char *name;
+char *alias_of;
+bool deprecated;
+};
+
 static void qtest_free_machine_list(struct MachInfo *machines)
 {
 if (machines) {
@@ -1550,6 +1557,82 @@ static struct MachInfo *qtest_get_machines(const char *var)
 return machines;
 }
 
+static struct CpuModel *qtest_get_cpu_models(void)
+{
+static struct CpuModel *cpus;
+QDict *response, *minfo;
+QList *list;
+const QListEntry *p;
+QObject *qobj;
+QString *qstr;
+QBool *qbool;
+QTestState *qts;
+int idx;
+
+if (cpus) {
+return cpus;
+}
+
+silence_spawn_log = !g_test_verbose();
+
+qts = qtest_init_with_env(NULL, "-machine none");
+response = qtest_qmp(qts, "{ 'execute': 'query-cpu-definitions' }");
+g_assert(response);
+list = qdict_get_qlist(response, "return");
+g_assert(list);
+
+cpus = g_new0(struct CpuModel, qlist_size(list) + 1);
+
+for (p = qlist_first(list), idx = 0; p; p = qlist_next(p), idx++) {
+minfo = qobject_to(QDict, qlist_entry_obj(p));
+g_assert(minfo);
+
+qobj = qdict_get(minfo, "name");
+g_assert(qobj);
+qstr = qobject_to(QString, qobj);
+g_assert(qstr);
+cpus[idx].name = g_strdup(qstring_get_str(qstr));
+
+qobj = qdict_get(minfo, "alias_of");
+if (qobj) { /* old machines do not report aliases */
+qstr = qobject_to(QString, qobj);
+g_assert(qstr);
+cpus[idx].alias_of = g_strdup(qstring_get_str(qstr));
+} else {
+cpus[idx].alias_of = NULL;
+}
+
+qobj = qdict_get(minfo, "deprecated");
+qbool = qobject_to(QBool, qobj);
+g_assert(qbool);
+cpus[idx].deprecated = qbool_get_bool(qbool);
+}
+
+qtest_quit(qts);
+qobject_unref(response);
+
+silence_spawn_log = false;
+
+return cpus;
+}
+
+bool qtest_has_cpu_model(const char *cpu)
+{
+struct CpuModel *cpus;
+int i;
+
+cpus = qtest_get_cpu_models();
+
+for (i = 0; cpus[i].name != NULL; i++) {
+if (g_str_equal(cpu, cpus[i].name) ||
+(cpus[i].alias_of && g_str_equal(cpu, cpus[i].alias_of))) {
+return true;
+}
+}
+
+return false;
+}
+
 void qtest_cb_for_every_machine(void (*cb)(const char *machine),
 bool skip_old_versioned)
 {
diff --git a/tests/qtest/libqtest.h b/tests/qtest/libqtest.h
index 6e3d3525bf..beb96b18eb 100644
--- a/tests/qtest/libqtest.h
+++ b/tests/qtest/libqtest.h
@@ -949,6 +949,14 @@ bool qtest_has_machine(const char *machine);
  */
 bool qtest_has_machine_with_env(const char *var, const char *machine);
 
+/**
+ * qtest_has_cpu_model:
+ * @cpu: The cpu to look for
+ *
+ * Returns: true if the cpu is available in the target binary.
+ */
+bool qtest_has_cpu_model(const char *cpu);
+
 /**
  * qtest_has_device:
  * @device: The device to look for
-- 
2.42.0

[PATCH v4 0/3] x86 cpu test refactoring

Add a new library api to check for the support of a specific cpu type.
Used the new api to check support for some older x86 cpu models before
running the tests.
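
For illustration, this is roughly how a test helper can assemble its command line from separate cpu, feature and machine arguments once the model is known to exist. It is a plain-libc sketch, not the GLib-based code in the patch, and build_cpu_cmdline() is a hypothetical name.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Build "[-machine M ]-cpu CPU[,FEAT]" from its parts.
 * cpufeat and machine may be NULL. Returns a malloc'd string that the
 * caller must free, or NULL on allocation failure.
 */
static char *build_cpu_cmdline(const char *cpu, const char *cpufeat,
                               const char *machine)
{
    /* Fixed text is "-machine ", " ", "-cpu " and ",": 16 bytes, plus NUL. */
    size_t len = 16 + strlen(cpu) +
                 (cpufeat ? strlen(cpufeat) : 0) +
                 (machine ? strlen(machine) : 0) + 1;
    char *out = malloc(len);
    size_t used = 0;

    if (!out) {
        return NULL;
    }
    if (machine) {
        used += (size_t)snprintf(out + used, len - used,
                                 "-machine %s ", machine);
    }
    used += (size_t)snprintf(out + used, len - used, "-cpu %s", cpu);
    if (cpufeat) {
        snprintf(out + used, len - used, ",%s", cpufeat);
    }
    return out;
}
```

Passing the model name separately is what lets the helper ask "is this model available?" before any command line is built.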

CC: th...@redhat.com
CC: imamm...@redhat.com
CC: qemu-devel@nongnu.org
CC: pbonz...@redhat.com
CC: lviv...@redhat.com
CC: m...@redhat.com


Ani Sinha (3):
  qtest/x86/numa-test: do not use the obsolete 'pentium' cpu
  tests/qtest/libqtest: add qtest_has_cpu_model() api
  tests/qtest/x86: check for availability of older cpu models before
running tests

 tests/qtest/libqtest.c  |  83 ++
 tests/qtest/libqtest.h  |   8 ++
 tests/qtest/numa-test.c |   3 +-
 tests/qtest/test-x86-cpuid-compat.c | 170 ++--
 4 files changed, 201 insertions(+), 63 deletions(-)

-- 
2.42.0

[PATCH 1/3] qtest/x86/numa-test: do not use the obsolete 'pentium' cpu

The 'pentium' cpu is old and obsolete and should be avoided for running tests
if it's not strictly needed. Use the 'max' cpu instead for the generic,
non-cpu-specific numa test.

CC: th...@redhat.com
Reviewed-by: Thomas Huth 
Reviewed-by: Igor Mammedov 
Signed-off-by: Ani Sinha 
---
 tests/qtest/numa-test.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/tests/qtest/numa-test.c b/tests/qtest/numa-test.c
index 7aa262dbb9..f01f19592d 100644
--- a/tests/qtest/numa-test.c
+++ b/tests/qtest/numa-test.c
@@ -125,7 +125,8 @@ static void pc_numa_cpu(const void *data)
 QTestState *qts;
 g_autofree char *cli = NULL;
 
-cli = make_cli(data, "-cpu pentium -machine smp.cpus=8,smp.sockets=2,smp.cores=2,smp.threads=2 "
+cli = make_cli(data,
+"-cpu max -machine smp.cpus=8,smp.sockets=2,smp.cores=2,smp.threads=2 "
 "-numa node,nodeid=0,memdev=ram -numa node,nodeid=1 "
 "-numa cpu,node-id=1,socket-id=0 "
 "-numa cpu,node-id=0,socket-id=1,core-id=0 "
-- 
2.42.0

Re: Historical QMP schema

2024-06-10 Thread John Snow

On Mon, Jun 10, 2024 at 9:39 AM Markus Armbruster  wrote:

> Daniel P. Berrangé  writes:
>
> > On Thu, Jun 06, 2024 at 01:22:14PM -0400, John Snow wrote:
> >> On Thu, Jun 6, 2024 at 6:25 AM Victor Toso wrote:
> >> > On Wed, Jun 05, 2024 at 11:47:53AM GMT, John Snow wrote:
> >> Importantly, old versions of the schema aren't contained *entirely* within
> >> the schema. Here's a timeline:
> >>
> >> v0.12.0: QMP first introduced. Events are hardcoded, commands are defined
> >> in qemu-monitor.hx. query commands are hard-coded in monitor.c.
> >> v0.14.0: qemu-monitor.hx is forked into qmp-commands.hx and hmp-commands.hx
> >> v1.0: First version which features qapi-schema.json; all query commands are
> >> qapified but most other commands are not.
> >> v1.1.0: A very large chunk of commands are QAPIfied.
> >> v1.3.0: Most commands are now QAPIfied, but there are 2-3 remaining.
> >> v2.1.0: events are now fully qapified; most are now defined in
> >> qapi/events.json
> >> v2.8.0: The remaining commands are fully qapified; qmp-commands.hx is
> >> removed.
> >
> > v2.8.0 was in Dec 2016 - 7+1/2 years ago.
> >
> > libvirt's min QEMU version is 4.2.0 - Dec 2019
> >
> > There are non-libvirt consumers of QEMU, but for them, do we think it is
> > reasonable for a consumer of QAPI *today*, to care about a QEMU version
> > from almost 8 years ago ?
> >
> > IOW, I wonder if the most pragmatic answer to this problem is to simply
> > entirely ignore the problems prior to 2.8.0 - accept that the versioning
> > is inaccurate/incomplete for versions before 2.8.0
>
> I'm in favour.  However, I'd prefer honest "Since: at least 2.8.0" to
> "Since: ".
>

That's certainly fine by me if it's community consensus to do so.

I wouldn't mind a phrasing in our HTML doc output in this way:

"Since: 4.2.0", when it's after the cutoff, or
"Since: at least 2.8.0" when it's at or prior to the cutoff.

However, because I am unreasonable, I do have a pretty accurate history of
everything that happened prior to then anyway, just in case ...!
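
The cutoff phrasing proposed above could be produced by something like the following. This is only a sketch of the rule in C, not part of the QAPI doc generator (which is Python); struct qemu_version, version_cmp() and format_since() are hypothetical names.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical version triple, e.g. {4, 2, 0} for 4.2.0. */
struct qemu_version {
    int major, minor, micro;
};

/* Three-way compare: negative, zero or positive like strcmp(). */
static int version_cmp(struct qemu_version a, struct qemu_version b)
{
    if (a.major != b.major) {
        return a.major - b.major;
    }
    if (a.minor != b.minor) {
        return a.minor - b.minor;
    }
    return a.micro - b.micro;
}

/*
 * After the cutoff, print the exact version; at or before it, only
 * claim "at least <cutoff>".
 */
static void format_since(char *buf, size_t buflen,
                         struct qemu_version introduced,
                         struct qemu_version cutoff)
{
    if (version_cmp(introduced, cutoff) > 0) {
        snprintf(buf, buflen, "Since: %d.%d.%d",
                 introduced.major, introduced.minor, introduced.micro);
    } else {
        snprintf(buf, buflen, "Since: at least %d.%d.%d",
                 cutoff.major, cutoff.minor, cutoff.micro);
    }
}
```

With a cutoff of 2.8.0, something introduced in 4.2.0 is reported exactly, while something from 1.1.0 or 2.8.0 itself is reported as "Since: at least 2.8.0".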

(I was afraid of the review feedback of when I went to cull such
information from our docs, admittedly...)

Even after 2.8.0, there are many "breaking changes" to the QAPI schema
format itself that require various hacks and workarounds in the QAPI
generator to be able to parse until at least v6.2.0 or so; things we
definitely don't feel like hacking into the upstream parser and
maintaining/supporting. It's far easier, I think, to compile this
information *once* and store the compiled result into a file we can check
back into the qemu.git tree to serve as historical reference instead.

So, even if we do ignore such older versions, it's still a question of how
we'd like to store and maintain the historical information so we have a
reference for new releases going forward, and what kind of features we'd
like to see such a format support us with.

My list right now is:

- The ability to see at a glance, as a "one-page summary", any changes to
the QMP wire protocol during Release Candidate phase.
- The ability to programmatically determine from the doc generator:
- when any Command or Event was introduced
- when any Command Argument/Return field was introduced (or modified
incompatibly?)
- when any Event member was introduced/modified incompatibly

I've got some prototypes for this, I hope to send some example output soon
when it's more reasonably complete and I don't have to explain the
Work-In-Progress caveats quite as much; however I'm still receptive to
ideas and suggestions about how to organize this data. Right now, I am
using a JSON Schema format for "compiled" data because it has the ability
to describe arbitrarily nested structures, which allows me to strip all
"non-API" information from the compiled schema, such as intermediate type
names which we do not consider API. This allows me to give accurate
version-to-version diff reports regardless of the type factoring on the
developer's side.

I'm not necessarily attached to this idea, but it has been useful in
prototyping for verifying that the rest of my qapihackborg is functioning
correctly, so it can serve as a starting point for critique and discussion,
I think.

(I just chose JSON Schema because it's something I am aware of and know how
to use, and it fit some loose criteria for the hacking I was doing. Maybe
we'll stick with it, maybe we won't. etc.)

--js

[PATCH 0/2] hw/misc/mos6522: Do not open-code hmp_info_human_readable_text()

Officialise the QMP command, use the existing
hmp_info_human_readable_text() helper.

Philippe Mathieu-Daudé (2):
  hw/misc/mos6522: Expose x-query-mos6522-devices QMP command
  hw/misc/mos6522: Do not open-code hmp_info_human_readable_text()

 MAINTAINERS  |  2 +-
 qapi/machine.json| 17 +
 include/hw/misc/mos6522.h|  2 --
 include/monitor/hmp-target.h |  1 -
 hw/misc/mos6522-stubs.c  | 18 ++
 hw/misc/mos6522.c| 16 ++--
 hmp-commands-info.hx |  2 +-
 hw/misc/meson.build  |  3 ++-
 8 files changed, 41 insertions(+), 20 deletions(-)
 create mode 100644 hw/misc/mos6522-stubs.c

-- 
2.41.0

[PATCH 2/2] hw/misc/mos6522: Do not open-code hmp_info_human_readable_text()

Register the command 'info via' using HMPCommand::cmd_info_hrt(),
so it is processed using the generic hmp_info_human_readable_text().
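
As a simplified model of why this removes code: when a command table entry carries a query function instead of a hand-written handler, one generic wrapper can do the error check and printing for every such command. The sketch below is standalone C with hypothetical types and names (struct text_result, struct info_cmd, run_info_cmd()); it only mirrors the shape of the mechanism and is not the monitor code.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a human-readable text result. */
struct text_result {
    const char *text;
};

/* Query function: returns a result, or NULL with *errp set. */
typedef const struct text_result *(*query_fn)(const char **errp);

/* Hypothetical table entry: either a custom handler or a query function. */
struct info_cmd {
    const char *name;
    void (*cmd)(FILE *out);
    query_fn cmd_info_hrt;
};

/* Generic wrapper shared by every entry that sets cmd_info_hrt. */
static int info_human_readable_text(FILE *out, query_fn handler)
{
    const char *err = NULL;
    const struct text_result *info = handler(&err);

    if (err || !info) {
        fprintf(out, "Error: %s\n", err ? err : "unknown");
        return -1;
    }
    fputs(info->text, out);
    return 0;
}

static int run_info_cmd(FILE *out, const struct info_cmd *c)
{
    if (c->cmd_info_hrt) {
        return info_human_readable_text(out, c->cmd_info_hrt);
    }
    c->cmd(out);
    return 0;
}

/* Two example query functions, purely for demonstration. */
static const struct text_result *query_demo_devices(const char **errp)
{
    static const struct text_result r = { "demo device state\n" };
    (void)errp;
    return &r;
}

static const struct text_result *query_unavailable(const char **errp)
{
    *errp = "support not built-in";
    return NULL;
}
```

Each device then only needs to provide its query function; the per-command HMP wrapper, like the one deleted by this patch, becomes unnecessary.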

Signed-off-by: Philippe Mathieu-Daudé 
---
 include/hw/misc/mos6522.h|  2 --
 include/monitor/hmp-target.h |  1 -
 hw/misc/mos6522.c| 13 -
 hmp-commands-info.hx |  2 +-
 4 files changed, 1 insertion(+), 17 deletions(-)

diff --git a/include/hw/misc/mos6522.h b/include/hw/misc/mos6522.h
index fba45668ab..a54fe063ac 100644
--- a/include/hw/misc/mos6522.h
+++ b/include/hw/misc/mos6522.h
@@ -172,6 +172,4 @@ extern const VMStateDescription vmstate_mos6522;
 uint64_t mos6522_read(void *opaque, hwaddr addr, unsigned size);
 void mos6522_write(void *opaque, hwaddr addr, uint64_t val, unsigned size);
 
-void hmp_info_via(Monitor *mon, const QDict *qdict);
-
 #endif /* MOS6522_H */
diff --git a/include/monitor/hmp-target.h b/include/monitor/hmp-target.h
index b679aaebbf..9b46fec84a 100644
--- a/include/monitor/hmp-target.h
+++ b/include/monitor/hmp-target.h
@@ -53,7 +53,6 @@ void hmp_mce(Monitor *mon, const QDict *qdict);
 void hmp_info_local_apic(Monitor *mon, const QDict *qdict);
 void hmp_info_sev(Monitor *mon, const QDict *qdict);
 void hmp_info_sgx(Monitor *mon, const QDict *qdict);
-void hmp_info_via(Monitor *mon, const QDict *qdict);
 void hmp_memory_dump(Monitor *mon, const QDict *qdict);
 void hmp_physical_memory_dump(Monitor *mon, const QDict *qdict);
 void hmp_info_registers(Monitor *mon, const QDict *qdict);
diff --git a/hw/misc/mos6522.c b/hw/misc/mos6522.c
index b1bb7f54f0..afa343dd27 100644
--- a/hw/misc/mos6522.c
+++ b/hw/misc/mos6522.c
@@ -29,8 +29,6 @@
 #include "hw/misc/mos6522.h"
 #include "hw/qdev-properties.h"
 #include "migration/vmstate.h"
-#include "monitor/monitor.h"
-#include "monitor/hmp.h"
 #include "qapi/qapi-commands-machine.h"
 #include "qapi/type-helpers.h"
 #include "qemu/timer.h"
@@ -587,17 +585,6 @@ HumanReadableText *qmp_x_query_mos6522_devices(Error **errp)
 return human_readable_text_from_str(buf);
 }
 
-void hmp_info_via(Monitor *mon, const QDict *qdict)
-{
-Error *err = NULL;
-g_autoptr(HumanReadableText) info = qmp_x_query_mos6522_devices();
-
-if (hmp_handle_error(mon, err)) {
-return;
-}
-monitor_puts(mon, info->human_readable_text);
-}
-
 static const MemoryRegionOps mos6522_ops = {
 .read = mos6522_read,
 .write = mos6522_write,
diff --git a/hmp-commands-info.hx b/hmp-commands-info.hx
index cfd4ad5651..a24c217d89 100644
--- a/hmp-commands-info.hx
+++ b/hmp-commands-info.hx
@@ -873,7 +873,7 @@ ERST
 .args_type= "",
 .params   = "",
 .help = "show guest mos6522 VIA devices",
-.cmd  = hmp_info_via,
+.cmd_info_hrt = qmp_x_query_mos6522_devices,
 },
 #endif
 
-- 
2.41.0

[PATCH 1/2] hw/misc/mos6522: Expose x-query-mos6522-devices QMP command

This is a counterpart to the HMP "info via" command. It is being
added with an "x-" prefix because this QMP command is intended as an
ad hoc debugging tool and will thus not be modelled in QAPI as fully
structured data, nor will it have long term guaranteed stability.

Signed-off-by: Philippe Mathieu-Daudé 
---
 MAINTAINERS |  2 +-
 qapi/machine.json   | 17 +
 hw/misc/mos6522-stubs.c | 18 ++
 hw/misc/mos6522.c   |  5 +++--
 hw/misc/meson.build |  3 ++-
 5 files changed, 41 insertions(+), 4 deletions(-)
 create mode 100644 hw/misc/mos6522-stubs.c

diff --git a/MAINTAINERS b/MAINTAINERS
index 951556224a..e86638c68c 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1453,7 +1453,7 @@ F: hw/ppc/mac_newworld.c
 F: hw/pci-host/uninorth.c
 F: hw/pci-bridge/dec.[hc]
 F: hw/misc/macio/
-F: hw/misc/mos6522.c
+F: hw/misc/mos6522*.c
 F: hw/nvram/mac_nvram.c
 F: hw/ppc/fw_cfg.c
 F: hw/input/adb*
diff --git a/qapi/machine.json b/qapi/machine.json
index 1283d14493..a82b8dd39d 100644
--- a/qapi/machine.json
+++ b/qapi/machine.json
@@ -1865,6 +1865,23 @@
   'data': { 'filename': 'str' },
   'if': 'CONFIG_FDT' }
 
+##
+# @x-query-mos6522-devices:
+#
+# Query information on MOS6522 VIA devices
+#
+# Features:
+#
+# @unstable: This command is meant for debugging.
+#
+# Returns: MOS6522 VIA devices information
+#
+# Since: 9.1
+##
+{ 'command': 'x-query-mos6522-devices',
+  'returns': 'HumanReadableText',
+  'features': [ 'unstable' ]}
+
 ##
 # @x-query-interrupt-controllers:
 #
diff --git a/hw/misc/mos6522-stubs.c b/hw/misc/mos6522-stubs.c
new file mode 100644
index 00..c953f01a16
--- /dev/null
+++ b/hw/misc/mos6522-stubs.c
@@ -0,0 +1,18 @@
+/*
+ * QEMU MOS6522 VIA stubs
+ *
+ * SPDX-FileContributor: Philippe Mathieu-Daudé 
+ * SPDX-FileCopyrightText: 2024 Linaro Ltd.
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ */
+
+#include "qemu/osdep.h"
+#include "qapi/error.h"
+#include "qapi/qapi-commands-machine.h"
+
+HumanReadableText *qmp_x_query_mos6522_devices(Error **errp)
+{
+error_setg(errp, "Support for MOS6522 VIA devices not built-in");
+
+return NULL;
+}
diff --git a/hw/misc/mos6522.c b/hw/misc/mos6522.c
index 515f62e687..b1bb7f54f0 100644
--- a/hw/misc/mos6522.c
+++ b/hw/misc/mos6522.c
@@ -31,6 +31,7 @@
 #include "migration/vmstate.h"
 #include "monitor/monitor.h"
 #include "monitor/hmp.h"
+#include "qapi/qapi-commands-machine.h"
 #include "qapi/type-helpers.h"
 #include "qemu/timer.h"
 #include "qemu/cutils.h"
@@ -576,7 +577,7 @@ static int qmp_x_query_via_foreach(Object *obj, void *opaque)
 return 0;
 }
 
-static HumanReadableText *qmp_x_query_via(Error **errp)
+HumanReadableText *qmp_x_query_mos6522_devices(Error **errp)
 {
 g_autoptr(GString) buf = g_string_new("");
 
@@ -589,7 +590,7 @@ static HumanReadableText *qmp_x_query_via(Error **errp)
 void hmp_info_via(Monitor *mon, const QDict *qdict)
 {
 Error *err = NULL;
-g_autoptr(HumanReadableText) info = qmp_x_query_via(&err);
+g_autoptr(HumanReadableText) info = qmp_x_query_mos6522_devices(&err);
 
 if (hmp_handle_error(mon, err)) {
 return;
diff --git a/hw/misc/meson.build b/hw/misc/meson.build
index 86596a3888..9fa0e98794 100644
--- a/hw/misc/meson.build
+++ b/hw/misc/meson.build
@@ -18,7 +18,8 @@ system_ss.add(when: 'CONFIG_ARM11SCU', if_true: files('arm11scu.c'))
 system_ss.add(when: 'CONFIG_ARM_V7M', if_true: files('armv7m_ras.c'))
 
 # Mac devices
-system_ss.add(when: 'CONFIG_MOS6522', if_true: files('mos6522.c'))
+system_ss.add(when: 'CONFIG_MOS6522', if_true: files('mos6522.c'),
+  if_false: files('mos6522-stubs.c'))
 system_ss.add(when: 'CONFIG_DJMEMC', if_true: files('djmemc.c'))
 system_ss.add(when: 'CONFIG_IOSB', if_true: files('iosb.c'))
 
-- 
2.41.0

[PATCH v3 3/5] target/mips: Restrict semihosting to TCG

Semihosting currently uses the TCG probe_access API. To prepare for
encoding the TCG dependency in Kconfig, do not enable it unless TCG
is available.

Suggested-by: Paolo Bonzini 
Signed-off-by: Philippe Mathieu-Daudé 
Reviewed-by: Anton Johansson 
---
 target/mips/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/mips/Kconfig b/target/mips/Kconfig
index eb19c94c7d..876048b150 100644
--- a/target/mips/Kconfig
+++ b/target/mips/Kconfig
@@ -1,6 +1,6 @@
 config MIPS
 bool
-select SEMIHOSTING
+imply SEMIHOSTING if TCG
 
 config MIPS64
 bool
-- 
2.41.0

[PATCH v3 2/5] target/xtensa: Restrict semihosting to TCG

The semihosting feature depends on TCG (due to the probe_access
API access). Although TCG is the single accelerator currently
available for the xtensa target, use the Kconfig "imply" directive
which is more correct (if we were to support a different accel).

Reported-by: Anton Johansson 
Signed-off-by: Philippe Mathieu-Daudé 
---
 target/xtensa/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/xtensa/Kconfig b/target/xtensa/Kconfig
index 5e46049262..e8c2598c4d 100644
--- a/target/xtensa/Kconfig
+++ b/target/xtensa/Kconfig
@@ -1,3 +1,3 @@
 config XTENSA
 bool
-select SEMIHOSTING
+imply SEMIHOSTING if TCG
-- 
2.41.0

[PATCH v3 4/5] target/riscv: Restrict semihosting to TCG

Semihosting currently uses the TCG probe_access API. To prepare for
encoding the TCG dependency in Kconfig, do not enable it unless TCG
is available.

Suggested-by: Paolo Bonzini 
Signed-off-by: Philippe Mathieu-Daudé 
Reviewed-by: Anton Johansson 
---
 target/riscv/Kconfig | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/target/riscv/Kconfig b/target/riscv/Kconfig
index 5f30df22f2..c332616d36 100644
--- a/target/riscv/Kconfig
+++ b/target/riscv/Kconfig
@@ -1,9 +1,9 @@
 config RISCV32
 bool
-select ARM_COMPATIBLE_SEMIHOSTING # for do_common_semihosting()
+imply ARM_COMPATIBLE_SEMIHOSTING if TCG
 select DEVICE_TREE # needed by boot.c
 
 config RISCV64
 bool
-select ARM_COMPATIBLE_SEMIHOSTING # for do_common_semihosting()
+imply ARM_COMPATIBLE_SEMIHOSTING if TCG
 select DEVICE_TREE # needed by boot.c
-- 
2.41.0

[PATCH v3 5/5] semihosting: Restrict to TCG

Semihosting currently uses the TCG probe_access API.
It is pointless to have it in the binary when TCG isn't.

Signed-off-by: Philippe Mathieu-Daudé 
Reviewed-by: Richard Henderson 
---
 semihosting/Kconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/semihosting/Kconfig b/semihosting/Kconfig
index eaf3a20ef5..fbe6ac87f9 100644
--- a/semihosting/Kconfig
+++ b/semihosting/Kconfig
@@ -1,6 +1,7 @@
 
 config SEMIHOSTING
bool
+   depends on TCG
 
 config ARM_COMPATIBLE_SEMIHOSTING
bool
-- 
2.41.0

[PATCH v3 1/5] target/m68k: Restrict semihosting to TCG

The semihosting feature depends on TCG (due to the probe_access
API access). Although TCG is the single accelerator currently
available for the m68k target, use the Kconfig "imply" directive
which is more correct (if we were to support a different accel).

Reported-by: Anton Johansson 
Signed-off-by: Philippe Mathieu-Daudé 
---
 target/m68k/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/m68k/Kconfig b/target/m68k/Kconfig
index 9eae71486f..23aae24ebe 100644
--- a/target/m68k/Kconfig
+++ b/target/m68k/Kconfig
@@ -1,3 +1,3 @@
 config M68K
 bool
-select SEMIHOSTING
+imply SEMIHOSTING if TCG
-- 
2.41.0

[PATCH v3 0/5] semihosting: Restrict to TCG

v3: Address Anton's comment
v2: Address Paolo's comment

Semihosting currently uses the TCG probe_access API,
so it is pointless to have it in the binary when TCG
isn't.

It could be implemented for other accelerators, but
work needs to be done. Meanwhile, do not enable it
unless TCG is available.
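
To make the difference between the old `select` and the new `imply ... if TCG` concrete, here is a deliberately simplified Python model. It is not how QEMU's Kconfig parser is implemented; the function and exception names are invented for illustration, and treating a `select` with an unmet dependency as an error reflects my understanding of QEMU's Kconfig rules rather than anything stated in this series.

```python
class KconfigContradiction(Exception):
    """Raised when a symbol is forced on while its dependency is unmet."""

def semihosting_enabled(tcg, directive):
    """Resolve SEMIHOSTING for an enabled target (toy model).

    tcg       -- whether the TCG accelerator is built in
    directive -- "select" (before the series) or "imply-if-tcg" (after)

    SEMIHOSTING itself carries "depends on TCG" (patch 5/5).
    """
    dependency_met = tcg
    if directive == "select":
        # select forces the symbol on, whatever its dependencies say.
        if not dependency_met:
            raise KconfigContradiction(
                "SEMIHOSTING selected but TCG is disabled")
        return True
    if directive == "imply-if-tcg":
        # imply only supplies a default, and only when the condition holds.
        return tcg and dependency_met
    raise ValueError(f"unknown directive: {directive}")

for tcg in (True, False):
    try:
        old = semihosting_enabled(tcg, "select")
    except KconfigContradiction as exc:
        old = f"error ({exc})"
    new = semihosting_enabled(tcg, "imply-if-tcg")
    print(f"TCG={tcg}: select -> {old}, imply if TCG -> {new}")
```

In this model a TCG-less build simply leaves semihosting off with `imply`, whereas `select` would conflict with the new `depends on TCG`.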

Philippe Mathieu-Daudé (5):
  target/m68k: Restrict semihosting to TCG
  target/xtensa: Restrict semihosting to TCG
  target/mips: Restrict semihosting to TCG
  target/riscv: Restrict semihosting to TCG
  semihosting: Restrict to TCG

 semihosting/Kconfig   | 1 +
 target/m68k/Kconfig   | 2 +-
 target/mips/Kconfig   | 2 +-
 target/riscv/Kconfig  | 4 ++--
 target/xtensa/Kconfig | 2 +-
 5 files changed, 6 insertions(+), 5 deletions(-)

-- 
2.41.0

Re: [PATCH v2 1/3] target/mips: Restrict semihosting to TCG


On 10/6/24 11:29, Alex Bennée wrote:
> Philippe Mathieu-Daudé  writes:
>
>> On 7/6/24 13:08, Anton Johansson wrote:
>>> On 30/05/24, Philippe Mathieu-Daudé wrote:
>>>> Semihosting currently uses the TCG probe_access API. To prepare for
>>>> encoding the TCG dependency in Kconfig, do not enable it unless TCG
>>>> is available.
>>>>
>>>> Suggested-by: Paolo Bonzini 
>>>> Signed-off-by: Philippe Mathieu-Daudé 
>>>> ---
>>>>   target/mips/Kconfig | 2 +-
>>>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> xtensa and m68k also `select SEMIHOSTING`, were these missed?
>>
>> TCG is the only accelerator they use, so it is kinda implicit,
>> but you are right, I'll update for completeness.
>
> So I'll wait for a v3?

Yes, on the way...

[PATCH] scripts/qcow2-to-stdout.py: Add script to write qcow2 images to stdout

2024-06-10 Thread Alberto Garcia

This tool converts a disk image to qcow2, writing the result directly
to stdout. This can be used for example to send the generated file
over the network.

This is equivalent to using qemu-img to convert a file to qcow2 and
then writing the result to stdout, with the difference that this tool
does not need to create this temporary qcow2 file and therefore does
not need any additional disk space.

The input file is read twice. The first pass is used to determine
which clusters contain non-zero data and that information is used to
create the qcow2 header, refcount table and blocks, and L1 and L2
tables. After all that metadata is created then the second pass is
used to write the guest data.
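
The first pass can be pictured with the following minimal Python sketch. It is not the script's actual code: `scan_clusters` is a hypothetical helper, it reads from any file-like object, and it records one bit per cluster in a bytearray, in the same spirit as the `bitmap_set()`/`bitmap_test()` helpers in the patch.

```python
import io

def scan_clusters(stream, cluster_size=65536):
    """Return (bitmap, total_clusters, allocated_clusters) for a raw image.

    A bit is set in the bitmap for every cluster that contains at least
    one non-zero byte; all-zero clusters need no data cluster in qcow2.
    """
    bitmap = bytearray()
    zeroes = bytes(cluster_size)
    total = 0
    allocated = 0
    while True:
        chunk = stream.read(cluster_size)
        if not chunk:
            break
        if total % 8 == 0:
            bitmap.append(0)          # grow the bitmap one byte at a time
        # The last cluster may be short, so compare against a zero
        # buffer of the same length.
        if chunk != zeroes[:len(chunk)]:
            bitmap[total // 8] |= 1 << (total % 8)
            allocated += 1
        total += 1
    return bitmap, total, allocated

# Tiny in-memory "image" with 4-byte clusters: zero, data, zero, data.
image = io.BytesIO(bytes(4) + b"\x01\x00\x00\x00" + bytes(4) + b"\x02")
bitmap, total, allocated = scan_clusters(image, cluster_size=4)
print(total, allocated, bin(bitmap[0]))
```

Knowing the number of allocated clusters before writing anything is what lets the metadata (refcounts, L1 and L2 tables) be sized and emitted ahead of the data.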

By default qcow2-to-stdout.py expects the input to be a raw file, but
if qemu-storage-daemon is available then it can also be used to read
images in other formats. Alternatively the user can also run qemu-nbd
or qemu-storage-daemon manually instead.

Signed-off-by: Alberto Garcia 
Signed-off-by: Madeeha Javed 
---
 scripts/qcow2-to-stdout.py | 330 +
 1 file changed, 330 insertions(+)
 create mode 100755 scripts/qcow2-to-stdout.py

diff --git a/scripts/qcow2-to-stdout.py b/scripts/qcow2-to-stdout.py
new file mode 100755
index 00..b9f75de690
--- /dev/null
+++ b/scripts/qcow2-to-stdout.py
@@ -0,0 +1,330 @@
+#!/usr/bin/env python3
+
+# This tool reads a disk image in any format and converts it to qcow2,
+# writing the result directly to stdout.
+#
+# Copyright (C) 2024 Igalia, S.L.
+#
+# Authors: Alberto Garcia 
+#  Madeeha Javed 
+#
+# SPDX-License-Identifier: GPL-2.0-or-later
+#
+# qcow2 files produced by this script are always arranged like this:
+#
+# - qcow2 header
+# - refcount table
+# - refcount blocks
+# - L1 table
+# - L2 tables
+# - Data clusters
+#
+# A note about variable names: in qcow2 there is one refcount table
+# and one (active) L1 table, although each can occupy several
+# clusters. For the sake of simplicity the code sometimes talks about
+# refcount tables and L1 tables when referring to those clusters.
+
+import argparse
+import atexit
+import math
+import os
+import signal
+import struct
+import sys
+import subprocess
+import tempfile
+import time
+
+QCOW2_DEFAULT_CLUSTER_SIZE = 65536
+QCOW2_DEFAULT_REFCOUNT_BITS = 16
+QCOW2_DEFAULT_VERSION = 3
+QCOW_OFLAG_COPIED = 1 << 63
+
+def bitmap_set(bitmap, idx):
+    bitmap[int(idx / 8)] |= (1 << (idx % 8))
+
+def bitmap_test(bitmap, idx):
+    return (bitmap[int(idx / 8)] & (1 << (idx % 8))) != 0
+
+# Kill the storage daemon on exit
+def kill_storage_daemon(pid_file, raw_file, temp_dir):
+    if os.path.exists(pid_file):
+        with open(pid_file, 'r') as f:
+            pid=int(f.readline())
+        os.kill(pid, signal.SIGTERM)
+        while os.path.exists(pid_file):
+            time.sleep(0.1)
+    os.unlink(raw_file)
+    os.rmdir(temp_dir)
+
+def write_features(header):
+    qcow2_features = [
+        # Incompatible
+        (0, 0, 'dirty bit'),
+        (0, 1, 'corrupt bit'),
+        (0, 2, 'external data file'),
+        (0, 3, 'compression type'),
+        (0, 4, 'extended L2 entries'),
+        # Compatible
+        (1, 0, 'lazy refcounts'),
+        # Autoclear
+        (2, 0, 'bitmaps'),
+        (2, 1, 'raw external data')
+    ]
+    struct.pack_into('>I', header, 0x70, 0x6803f857)
+    struct.pack_into('>I', header, 0x74, len(qcow2_features) * 48)
+    cur_offset = 0x78
+    for (feature_type, feature_bit, feature_name) in qcow2_features:
+        struct.pack_into('>BB46s', header, cur_offset,
+                         feature_type, feature_bit, feature_name.encode('ascii'))
+        cur_offset += 48
+
+# Command-line arguments
+parser = argparse.ArgumentParser(description='This program converts a QEMU disk image to qcow2 '
+ 'and writes it to the standard output')
+parser.add_argument('input_file', help='name of the input file')
+parser.add_argument('-f', dest='input_format', metavar='input_format',
+default='raw',
+help='format of the input file (default: raw)')
+parser.add_argument('-c', dest='cluster_size', metavar='cluster_size',
+help=f'qcow2 cluster size (default: {QCOW2_DEFAULT_CLUSTER_SIZE})',
+default=QCOW2_DEFAULT_CLUSTER_SIZE, type=int,
+choices=[1 << x for x in range(9,22)])
+parser.add_argument('-r', dest='refcount_bits', metavar='refcount_bits',
+help=f'width of the reference count entries (default: {QCOW2_DEFAULT_REFCOUNT_BITS})',
+default=QCOW2_DEFAULT_REFCOUNT_BITS, type=int,
+choices=[1 << x for x in range(7)])
+parser.add_argument('-v', dest='qcow2_version', metavar='qcow2_version',
+help=f'qcow2 version (default: {QCOW2_DEFAULT_VERSION})',
+default=QCOW2_DEFAULT_VERSION, type=int, choices=[2, 3])
+args = parser.parse_args()
+

Re: [PATCH v2 2/3] tests/qtest/libqtest: add qtest_has_cpu() api


Hi Ani,

On 10/6/24 15:21, Ani Sinha wrote:

Added a new test api qtest_has_cpu() in order to check availability of some
cpu models in the current QEMU binary. The specific architecture of the QEMU
binary is selected using the QTEST_QEMU_BINARY environment variable. This api
would be useful to run tests against some older cpu models after checking if
QEMU actually supported these models.
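
The kind of check described here can be sketched in Python as a lookup in a list of CPU model definitions. This assumes the list comes from the QMP `query-cpu-definitions` command, whose entries carry a `name` member; whether the patch obtains it that way is an assumption, and the helper name and sample data below are hypothetical.

```python
def has_cpu_model(definitions, name):
    """Return True if a CPU model called `name` appears in `definitions`.

    `definitions` is expected to look like the "return" list of a
    query-cpu-definitions reply: a list of dicts with a "name" key.
    """
    return any(entry.get("name") == name for entry in definitions)

# Hypothetical reply contents, for illustration only.
reply = [
    {"name": "qemu64", "typename": "qemu64-x86_64-cpu"},
    {"name": "pentium", "typename": "pentium-x86_64-cpu"},
]
print(has_cpu_model(reply, "pentium"))
print(has_cpu_model(reply, "no-such-model"))
```

A test would call such a check first and skip itself when the model is missing from the binary under test.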

CC: th...@redhat.com
Signed-off-by: Ani Sinha 
Reviewed-by: Daniel P. Berrangé 
---
  tests/qtest/libqtest.c | 83 ++
  tests/qtest/libqtest.h |  8 
  2 files changed, 91 insertions(+)


Since "CPU" commonly refers to a CPU state, I suggest renaming as:


+struct CpuInfo {


CpuModel


+static struct CpuInfo *qtest_get_cpus(void)


qtest_get_cpu_models()


+bool qtest_has_cpu(const char *cpu)


qtest_has_cpu_model()

Regards,

Phil.

Re: [PATCH] tracetool: Remove unused vcpu.py script