Re: [RFC v3 14/18] backends/iommufd: Introduce the iommufd object
Hi Nicolin,

On 2/16/23 00:48, Nicolin Chen wrote:
> Hi Eric,
>
> On Tue, Jan 31, 2023 at 09:53:01PM +0100, Eric Auger wrote:
>
>> diff --git a/include/sysemu/iommufd.h b/include/sysemu/iommufd.h
>> new file mode 100644
>> index 00..06a866d1bd
>> --- /dev/null
>> +++ b/include/sysemu/iommufd.h
>> @@ -0,0 +1,47 @@
>> +#ifndef SYSEMU_IOMMUFD_H
>> +#define SYSEMU_IOMMUFD_H
>> +
>> +#include "qom/object.h"
>> +#include "qemu/thread.h"
>> +#include "exec/hwaddr.h"
>> +#include "exec/ram_addr.h"
> After rebasing nesting patches on top of this, I see a build error:
>
>
> [47/876] Compiling C object libcommon.fa.p/hw_arm_smmu-common.c.o
> FAILED: libcommon.fa.p/hw_arm_smmu-common.c.o
> cc -Ilibcommon.fa.p -I../src/3rdparty/qemu/dtc/libfdt -I/usr/include/pixman-1
> -I/usr/include/libmount -I/usr/include/blkid -I/usr/include/glib-2.0
> -I/usr/lib/aarch64-linux-gnu/glib-2.0/include -I/usr/include/gio-unix-2.0
> -fdiagnostics-color=auto -Wall -Winvalid-pch -std=gnu11 -O2 -g -isystem
> /src/3rdparty/qemu/linux-headers -isystem linux-headers -iquote . -iquote
> /src/3rdparty/qemu -iquote /src/3rdparty/qemu/include -iquote
> /src/3rdparty/qemu/tcg/aarch64 -pthread -U_FORTIFY_SOURCE -D_FORTIFY_SOURCE=2
> -D_GNU_SOURCE -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -fno-strict-aliasing
> -fno-common -fwrapv -Wundef -Wwrite-strings -Wmissing-prototypes
> -Wstrict-prototypes -Wredundant-decls -Wold-style-declaration
> -Wold-style-definition -Wtype-limits -Wformat-security -Wformat-y2k
> -Winit-self -Wignored-qualifiers -Wempty-body -Wnested-externs -Wendif-labels
> -Wexpansion-to-defined -Wimplicit-fallthrough=2 -Wmissing-format-attribute
> -Wno-missing-include-dirs -Wno-shift-negative-value -Wno-psabi
> -fstack-protector-strong -fPIE -MD -MQ libcommon.fa.p/hw_arm_smmu-common.c.o
> -MF libcommon.fa.p/hw_arm_smmu-common.c.o.d -o
> libcommon.fa.p/hw_arm_smmu-common.c.o -c
> ../src/3rdparty/qemu/hw/arm/smmu-common.c
> In file included from /src/3rdparty/qemu/include/sysemu/iommufd.h:7,
> from ../src/3rdparty/qemu/hw/arm/smmu-common.c:29:
> /src/3rdparty/qemu/include/exec/ram_addr.h:23:10: fatal error: cpu.h: No such
> file or directory
>23 | #include "cpu.h"
> | ^~~
> compilation terminated.
>
>
> I guess it's resulted from the module inter-dependency. Though our
> nesting patches aren't finalized yet, the possibility of including
> iommufd.h is still there. Meanwhile, the ram_addr.h here is added
> for "ram_addr_t" type, I think. So, could we include "cpu-common.h"
> instead, where the "ram_addr_t" type is actually defined?

Sure. We will fix that on the next iteration

Eric

>
> The build error is gone after this replacement:
>
> diff --git a/include/sysemu/iommufd.h b/include/sysemu/iommufd.h
> index 45540de63986..86d370c221b3 100644
> --- a/include/sysemu/iommufd.h
> +++ b/include/sysemu/iommufd.h
> @@ -4,7 +4,7 @@
> #include "qom/object.h"
> #include "qemu/thread.h"
> #include "exec/hwaddr.h"
> -#include "exec/ram_addr.h"
> +#include "exec/cpu-common.h"
> #include
>
> #define TYPE_IOMMUFD_BACKEND "iommufd"
>
>
> Thanks
> Nic
>
Re: [PATCH 07/12] testing: update ubuntu2004 to ubuntu2204
On 15/02/2023 20.25, Alex Bennée wrote:
The 22.04 LTS release has been out for almost a year now so it's time to
update all the remaining images to the current LTS. We can also drop
some hacks we need for older clang TSAN support.

Signed-off-by: Alex Bennée
---
 docs/devel/testing.rst | 4 ++--
 .gitlab-ci.d/buildtest.yml | 22 +--
 .gitlab-ci.d/containers.yml | 4 ++--
 .../{ubuntu2004.docker => ubuntu2204.docker} | 16 +-
 tests/docker/test-tsan | 2 +-
 tests/lcitool/refresh | 10 +
 6 files changed, 23 insertions(+), 35 deletions(-)
 rename tests/docker/dockerfiles/{ubuntu2004.docker => ubuntu2204.docker} (91%)

Reviewed-by: Thomas Huth
Re: [PATCH v2 05/15] linux-user/sparc: Tidy window spill/fill traps
On 16/2/23 06:45, Richard Henderson wrote:
Add some macros to localize the hw difference between v9 and pre-v9.

Signed-off-by: Richard Henderson
---
 linux-user/sparc/cpu_loop.c | 23 +--
 1 file changed, 13 insertions(+), 10 deletions(-)

Reviewed-by: Philippe Mathieu-Daudé
Re: [PATCH v2 04/15] linux-user/sparc: Use TT_TRAP for flush windows
On 16/2/23 06:45, Richard Henderson wrote:
The v9 and pre-v9 code can be unified with this macro.

Signed-off-by: Richard Henderson
---
 linux-user/sparc/cpu_loop.c | 7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

Reviewed-by: Philippe Mathieu-Daudé
Re: [PATCH v2 02/15] linux-user/sparc: Tidy syscall trap
On 16/2/23 06:45, Richard Henderson wrote:
Use TT_TRAP.

For sparc32, 0x88 is the "Slowaris" system call, currently BAD_TRAP
in the kernel's ttable_32.S.

For sparc64, 0x110 is tl0_linux32, the sparc32 trap, now folded
into the TARGET_ABI32 case via TT_TRAP.

For sparc64, there does still exist trap 0x111 as tl0_oldlinux64,
which was replaced by 0x16d as tl0_linux64 in 1998. Since no one
has noticed, don't bother implementing it now.

Signed-off-by: Richard Henderson
---
 linux-user/sparc/cpu_loop.c | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

Reviewed-by: Philippe Mathieu-Daudé
Re: [PATCH 0/4] target/arm: Cache ARMVAParameters
Hi Richard,

On 2/2/23 08:52, Richard Henderson wrote:
Richard Henderson (4):
  target/arm: Flush only required tlbs for TCR_EL[12]
  target/arm: Store tbi for both insns and data in ARMVAParameters
  target/arm: Use FIELD for ARMVAParameters
  target/arm: Cache ARMVAParameters

Applying: target/arm: Flush only required tlbs for TCR_EL[12]
error: patch failed: target/arm/helper.c:4151
error: target/arm/helper.c: patch does not apply
Patch failed at 0001 target/arm: Flush only required tlbs for TCR_EL[12]

What is the base commit of this series?
Re: [PATCH 06/12] gitlab: extend custom runners with base_job_template
On 15/02/2023 20.25, Alex Bennée wrote:
The base job template is responsible for controlling how we kick off
testing on our various branches. Rename and extend the
custom_runner_template so we can take advantage of all that control.

Signed-off-by: Alex Bennée
---
 .gitlab-ci.d/custom-runners.yml | 3 ++-
 .gitlab-ci.d/custom-runners/ubuntu-20.04-s390x.yml | 10 +-
 .gitlab-ci.d/custom-runners/ubuntu-22.04-aarch32.yml | 2 +-
 .gitlab-ci.d/custom-runners/ubuntu-22.04-aarch64.yml | 10 +-
 4 files changed, 13 insertions(+), 12 deletions(-)

Reviewed-by: Thomas Huth
Re: [PATCH 05/12] gitlab: reduce default verbosity of cirrus run
On 15/02/2023 20.25, Alex Bennée wrote:
We also truncate the echoing of the test log if we fail. Ideally we
would want the build artefact to be available to gitlab but so far how
to do this eludes me.

Signed-off-by: Alex Bennée
Cc: Daniel P. Berrangé
---
 .gitlab-ci.d/cirrus/build.yml | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/.gitlab-ci.d/cirrus/build.yml b/.gitlab-ci.d/cirrus/build.yml
index 7ef6af8d33..6563ff3c7a 100644
--- a/.gitlab-ci.d/cirrus/build.yml
+++ b/.gitlab-ci.d/cirrus/build.yml
@@ -32,6 +32,6 @@ build_task:
     - $MAKE -j$(sysctl -n hw.ncpu)
     - for TARGET in $TEST_TARGETS ;
       do
-        $MAKE -j$(sysctl -n hw.ncpu) $TARGET V=1
-        || { cat meson-logs/testlog.txt; exit 1; } ;
+        $MAKE -j$(sysctl -n hw.ncpu) $TARGET
+        || { tail -n 200 meson-logs/testlog.txt; exit 1; } ;
       done

I think it should be OK to publish the artifacts on cirrus-ci.com
instead - you have to click a little bit more often, but you can still
get the artifacts there, see:

https://lore.kernel.org/qemu-devel/20230215142503.90660-1-th...@redhat.com/

Thomas
Re: [PATCH v2 01/13] vdpa net: move iova tree creation from init to start
On Thu, Feb 16, 2023 at 3:15 AM Si-Wei Liu wrote: > > > > On 2/14/2023 11:07 AM, Eugenio Perez Martin wrote: > > On Tue, Feb 14, 2023 at 2:45 AM Si-Wei Liu wrote: > >> > >> > >> On 2/13/2023 3:14 AM, Eugenio Perez Martin wrote: > >>> On Mon, Feb 13, 2023 at 7:51 AM Si-Wei Liu wrote: > > On 2/8/2023 1:42 AM, Eugenio Pérez wrote: > > Only create iova_tree if and when it is needed. > > > > The cleanup keeps being responsible of last VQ but this change allows it > > to merge both cleanup functions. > > > > Signed-off-by: Eugenio Pérez > > Acked-by: Jason Wang > > --- > > net/vhost-vdpa.c | 99 > > ++-- > > 1 file changed, 71 insertions(+), 28 deletions(-) > > > > diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c > > index de5ed8ff22..a9e6c8f28e 100644 > > --- a/net/vhost-vdpa.c > > +++ b/net/vhost-vdpa.c > > @@ -178,13 +178,9 @@ err_init: > > static void vhost_vdpa_cleanup(NetClientState *nc) > > { > > VhostVDPAState *s = DO_UPCAST(VhostVDPAState, nc, nc); > > -struct vhost_dev *dev = &s->vhost_net->dev; > > > > qemu_vfree(s->cvq_cmd_out_buffer); > > qemu_vfree(s->status); > > -if (dev->vq_index + dev->nvqs == dev->vq_index_end) { > > -g_clear_pointer(&s->vhost_vdpa.iova_tree, > > vhost_iova_tree_delete); > > -} > > if (s->vhost_net) { > > vhost_net_cleanup(s->vhost_net); > > g_free(s->vhost_net); > > @@ -234,10 +230,64 @@ static ssize_t vhost_vdpa_receive(NetClientState > > *nc, const uint8_t *buf, > > return size; > > } > > > > +/** From any vdpa net client, get the netclient of first queue pair */ > > +static VhostVDPAState *vhost_vdpa_net_first_nc_vdpa(VhostVDPAState *s) > > +{ > > +NICState *nic = qemu_get_nic(s->nc.peer); > > +NetClientState *nc0 = qemu_get_peer(nic->ncs, 0); > > + > > +return DO_UPCAST(VhostVDPAState, nc, nc0); > > +} > > + > > +static void vhost_vdpa_net_data_start_first(VhostVDPAState *s) > > +{ > > +struct vhost_vdpa *v = &s->vhost_vdpa; > > + > > +if (v->shadow_vqs_enabled) { > > +v->iova_tree = vhost_iova_tree_new(v->iova_range.first, > > + 
v->iova_range.last); > > +} > > +} > > + > > +static int vhost_vdpa_net_data_start(NetClientState *nc) > > +{ > > +VhostVDPAState *s = DO_UPCAST(VhostVDPAState, nc, nc); > > +struct vhost_vdpa *v = &s->vhost_vdpa; > > + > > +assert(nc->info->type == NET_CLIENT_DRIVER_VHOST_VDPA); > > + > > +if (v->index == 0) { > > +vhost_vdpa_net_data_start_first(s); > > +return 0; > > +} > > + > > +if (v->shadow_vqs_enabled) { > > +VhostVDPAState *s0 = vhost_vdpa_net_first_nc_vdpa(s); > > +v->iova_tree = s0->vhost_vdpa.iova_tree; > > +} > > + > > +return 0; > > +} > > + > > +static void vhost_vdpa_net_client_stop(NetClientState *nc) > > +{ > > +VhostVDPAState *s = DO_UPCAST(VhostVDPAState, nc, nc); > > +struct vhost_dev *dev; > > + > > +assert(nc->info->type == NET_CLIENT_DRIVER_VHOST_VDPA); > > + > > +dev = s->vhost_vdpa.dev; > > +if (dev->vq_index + dev->nvqs == dev->vq_index_end) { > > +g_clear_pointer(&s->vhost_vdpa.iova_tree, > > vhost_iova_tree_delete); > > +} > > +} > > + > > static NetClientInfo net_vhost_vdpa_info = { > > .type = NET_CLIENT_DRIVER_VHOST_VDPA, > > .size = sizeof(VhostVDPAState), > > .receive = vhost_vdpa_receive, > > +.start = vhost_vdpa_net_data_start, > > +.stop = vhost_vdpa_net_client_stop, > > .cleanup = vhost_vdpa_cleanup, > > .has_vnet_hdr = vhost_vdpa_has_vnet_hdr, > > .has_ufo = vhost_vdpa_has_ufo, > > @@ -351,7 +401,7 @@ dma_map_err: > > > > static int vhost_vdpa_net_cvq_start(NetClientState *nc) > > { > > -VhostVDPAState *s; > > +VhostVDPAState *s, *s0; > > struct vhost_vdpa *v; > > uint64_t backend_features; > > int64_t cvq_group; > > @@ -425,6 +475,15 @@ out: > > return 0; > > } > > > > +s0 = vhost_vdpa_net_first_nc_vdpa(s); > > +if (s0->vhost_vdpa.iova_tree) { > > +/* SVQ is already configured for all virtqueues */ > > +v->iova_tree = s0->vhost_vdpa.iova_tree; > > +} else { > > +v->iova_tree = vhost_iova_tree_new(v->iova_range.first, > > + v->iova_range.last); > I wonder how this case could happen, vh
Re: [PATCH 2/4] target/arm: Store tbi for both insns and data in ARMVAParameters
On 2/2/23 08:52, Richard Henderson wrote:
This is slightly more work on the consumer side, but means
we will be able to compute this once for multiple uses.

Signed-off-by: Richard Henderson
---
 target/arm/internals.h | 5 +++--
 target/arm/helper.c | 18 +-
 target/arm/pauth_helper.c | 29 -
 target/arm/ptw.c | 6 +++---
 4 files changed, 31 insertions(+), 27 deletions(-)

Reviewed-by: Philippe Mathieu-Daudé
Re: [PATCH 04/12] tests: be a bit more strict cleaning up fifos
On 15/02/2023 20.25, Alex Bennée wrote:
When we re-factored we dropped the unlink() step which turns out to be
required for rmdir to do its thing. If we had been checking the return
value we would have noticed so let's do that with this fix.

Fixes: 68406d1085 (tests/unit: cleanups for test-io-channel-command)
Signed-off-by: Alex Bennée
Suggested-by: Philippe Mathieu-Daudé
---
 tests/unit/test-io-channel-command.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

Reviewed-by: Thomas Huth
Re: [PATCH 01/12] gitlab: tweak and filter ninja output to reduce build noise
On 15/02/2023 20.25, Alex Bennée wrote:
A significant portion of our CI logs are just enumerating each
successfully built object file. The current widespread versions of
ninja don't have a quiet option so we use NINJA_STATUS to add a fixed
string to the ninja output which we then filter with grep. If there are
any errors in the output we get them from the compiler.

Signed-off-by: Alex Bennée
---
 .gitlab-ci.d/buildtest-template.yml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/.gitlab-ci.d/buildtest-template.yml b/.gitlab-ci.d/buildtest-template.yml
index 73ecfabb8d..3af51846cd 100644
--- a/.gitlab-ci.d/buildtest-template.yml
+++ b/.gitlab-ci.d/buildtest-template.yml
@@ -21,7 +21,7 @@
       then
         ../meson/meson.py configure . -Dbackend_max_links="$LD_JOBS" ;
       fi || exit 1;
-    - make -j"$JOBS"
+    - env NINJA_STATUS="[ninja][%f/%t] " make -j"$JOBS" | grep -v "\[ninja\]\[.*[123456789]/"
     - if test -n "$MAKE_CHECK_ARGS";
       then
         make -j"$JOBS" $MAKE_CHECK_ARGS ;

Not meant as a veto, but just for the record: I still don't like the
idea. Having a log of the files that got compiled is still sometimes
useful for me, e.g. when I want to check whether a certain file has been
compiled at all or not (when e.g. debugging meson.build problems). So
I'm still in favour of dropping this patch. IMHO if you want to shorten
the build log in the CI, please get those chatty softfloat tests fixed
instead.

Thomas
Re: [PATCH] target/i386: Fix 32-bit AD[CO]X insns in 64-bit mode
On 15/1/23 02:21, Richard Henderson wrote:
Failure to truncate the inputs results in garbage for the carry-out.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1373
Signed-off-by: Richard Henderson
---
 tests/tcg/x86_64/adox.c | 69
 target/i386/tcg/emit.c.inc | 2 +
 tests/tcg/x86_64/Makefile.target | 3 ++
 3 files changed, 74 insertions(+)
 create mode 100644 tests/tcg/x86_64/adox.c

Reviewed-by: Philippe Mathieu-Daudé
Re: [PATCH 02/27] accel/tcg: Pass max_insn to gen_intermediate_code by pointer
On 30/1/23 21:59, Richard Henderson wrote:
In preparation for returning the number of insns generated
via the same pointer. Adjust only the prototypes so far.

Signed-off-by: Richard Henderson
---
 include/exec/translator.h | 4 ++--
 accel/tcg/translate-all.c | 2 +-
 accel/tcg/translator.c | 4 ++--
 target/alpha/translate.c | 2 +-
 target/arm/translate.c | 2 +-
 target/avr/translate.c | 2 +-
 target/cris/translate.c | 2 +-
 target/hexagon/translate.c | 2 +-
 target/hppa/translate.c | 2 +-
 target/i386/tcg/translate.c | 2 +-
 target/loongarch/translate.c | 2 +-
 target/m68k/translate.c | 2 +-
 target/microblaze/translate.c | 2 +-
 target/mips/tcg/translate.c | 2 +-
 target/nios2/translate.c | 2 +-
 target/openrisc/translate.c | 2 +-
 target/ppc/translate.c | 2 +-
 target/riscv/translate.c | 2 +-
 target/rx/translate.c | 2 +-
 target/s390x/tcg/translate.c | 2 +-
 target/sh4/translate.c | 2 +-
 target/sparc/translate.c | 2 +-
 target/tricore/translate.c | 2 +-
 target/xtensa/translate.c | 2 +-
 24 files changed, 26 insertions(+), 26 deletions(-)

Reviewed-by: Philippe Mathieu-Daudé
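
For readers following the series, the "same pointer" idea in the commit message can be illustrated with a minimal sketch. This is not the code from the patch: translate_block() and its insns_available argument are hypothetical names made up for the illustration, and the loop only pretends to translate. The point is just that the caller passes the limit in through the pointer and reads the count actually produced back out of it.

```c
#include <assert.h>

/*
 * Hypothetical stand-in for a translator entry point; it is NOT
 * QEMU's gen_intermediate_code(). On entry *max_insns is the upper
 * bound, on return it holds the number of insns actually produced.
 */
static void translate_block(int *max_insns, int insns_available)
{
    int limit = *max_insns;   /* read the caller's limit */
    int done = 0;

    while (done < limit && done < insns_available) {
        done++;               /* pretend to translate one insn */
    }

    *max_insns = done;        /* report the count via the same pointer */
}
```

A caller would set a local variable to the limit, pass its address, and afterwards use the updated value as the real instruction count.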
Re: [PATCH 12/27] accel/tcg/plugin: Use tcg_temp_ebb_*
On 30/1/23 21:59, Richard Henderson wrote:
All of these uses have quite local scope.
Avoid tcg_const_*, because we haven't added a corresponding
interface for TEMP_EBB. Use explicit tcg_gen_movi_* instead.

Signed-off-by: Richard Henderson
---
 accel/tcg/plugin-gen.c | 24 ++--
 1 file changed, 14 insertions(+), 10 deletions(-)

Reviewed-by: Philippe Mathieu-Daudé
Re: [PATCH 10/27] tcg: Add tcg_gen_movi_ptr
On 30/1/23 21:59, Richard Henderson wrote:
Signed-off-by: Richard Henderson
---
 include/tcg/tcg-op.h | 5 +
 1 file changed, 5 insertions(+)

Reviewed-by: Philippe Mathieu-Daudé
Re: [PATCH 21/27] target/i386: Don't use tcg_temp_local_new
On 30/1/23 21:59, Richard Henderson wrote:
Since tcg_temp_new is now identical, use that.
In some cases we can avoid a copy from A0 or T0.

Signed-off-by: Richard Henderson
---
 target/i386/tcg/translate.c | 27 +--
 1 file changed, 9 insertions(+), 18 deletions(-)

Reviewed-by: Philippe Mathieu-Daudé
Re: [PATCH] target/microblaze: Add gdbstub xml
On Thu, Feb 16, 2023 at 12:56 AM Richard Henderson < richard.hender...@linaro.org> wrote: > Alex, Edgar, this has been reviewed. Will either of you take it with your > trees, or shall > I just queue it through tcg-next? > > Hi Richard, yeah if you don't mind, please take it through your tree! Thanks, Edgar > r~ > > On 12/30/22 06:24, Richard Henderson wrote: > > Mirroring the upstream gdb xml files, the two stack boundary > > registers are separated out. > > > > Signed-off-by: Richard Henderson > > --- > > > > I did this thinking I would be fixing: > > > >TESTbasic gdbstub support on microblaze > >Truncated register 35 in remote 'g' packet > >Traceback (most recent call last): > > File "/home/rth/qemu/src/tests/tcg/multiarch/gdbstub/sha1.py", > >line 71, in if gdb.parse_and_eval('$pc') == 0: > >gdb.error: No registers. > > > > but in the end it turned out that the gdb-multiarch supplied > > by ubuntu 22.04 simply doesn't support MicroBlaze, as can be > > seen with the "set architecture" command within gdb. > > > > (I built gdb from source, to try and debug why this still wasn't > > working, only to find that it did. :-P) > > > > Alex, any way to modify our gdb test to fail gracefully here? > > > > Regardless, having proper xml for all of our targets seems > > like the correct way forward. > > > > > > r~ > > > > Cc: Alex Bennée > > Cc: Edgar E. 
Iglesias > > --- > > configs/targets/microblaze-linux-user.mak | 1 + > > configs/targets/microblaze-softmmu.mak | 1 + > > configs/targets/microblazeel-linux-user.mak | 1 + > > configs/targets/microblazeel-softmmu.mak| 1 + > > target/microblaze/cpu.h | 2 + > > target/microblaze/cpu.c | 7 ++- > > target/microblaze/gdbstub.c | 51 +++- > > gdb-xml/microblaze-core.xml | 67 + > > gdb-xml/microblaze-stack-protect.xml| 12 > > 9 files changed, 128 insertions(+), 15 deletions(-) > > create mode 100644 gdb-xml/microblaze-core.xml > > create mode 100644 gdb-xml/microblaze-stack-protect.xml > > > > diff --git a/configs/targets/microblaze-linux-user.mak > b/configs/targets/microblaze-linux-user.mak > > index 4249a37f65..0a2322c249 100644 > > --- a/configs/targets/microblaze-linux-user.mak > > +++ b/configs/targets/microblaze-linux-user.mak > > @@ -3,3 +3,4 @@ TARGET_SYSTBL_ABI=common > > TARGET_SYSTBL=syscall.tbl > > TARGET_BIG_ENDIAN=y > > TARGET_HAS_BFLT=y > > +TARGET_XML_FILES=gdb-xml/microblaze-core.xml > gdb-xml/microblaze-stack-protect.xml > > diff --git a/configs/targets/microblaze-softmmu.mak > b/configs/targets/microblaze-softmmu.mak > > index 8385e2d333..e84c0cc728 100644 > > --- a/configs/targets/microblaze-softmmu.mak > > +++ b/configs/targets/microblaze-softmmu.mak > > @@ -2,3 +2,4 @@ TARGET_ARCH=microblaze > > TARGET_BIG_ENDIAN=y > > TARGET_SUPPORTS_MTTCG=y > > TARGET_NEED_FDT=y > > +TARGET_XML_FILES=gdb-xml/microblaze-core.xml > gdb-xml/microblaze-stack-protect.xml > > diff --git a/configs/targets/microblazeel-linux-user.mak > b/configs/targets/microblazeel-linux-user.mak > > index d0e775d840..270743156a 100644 > > --- a/configs/targets/microblazeel-linux-user.mak > > +++ b/configs/targets/microblazeel-linux-user.mak > > @@ -2,3 +2,4 @@ TARGET_ARCH=microblaze > > TARGET_SYSTBL_ABI=common > > TARGET_SYSTBL=syscall.tbl > > TARGET_HAS_BFLT=y > > +TARGET_XML_FILES=gdb-xml/microblaze-core.xml > gdb-xml/microblaze-stack-protect.xml > > diff --git 
a/configs/targets/microblazeel-softmmu.mak > b/configs/targets/microblazeel-softmmu.mak > > index af40391f2f..9b688036bd 100644 > > --- a/configs/targets/microblazeel-softmmu.mak > > +++ b/configs/targets/microblazeel-softmmu.mak > > @@ -1,3 +1,4 @@ > > TARGET_ARCH=microblaze > > TARGET_SUPPORTS_MTTCG=y > > TARGET_NEED_FDT=y > > +TARGET_XML_FILES=gdb-xml/microblaze-core.xml > gdb-xml/microblaze-stack-protect.xml > > diff --git a/target/microblaze/cpu.h b/target/microblaze/cpu.h > > index 1e84dd8f47..e541fbb0b3 100644 > > --- a/target/microblaze/cpu.h > > +++ b/target/microblaze/cpu.h > > @@ -367,6 +367,8 @@ hwaddr mb_cpu_get_phys_page_attrs_debug(CPUState > *cpu, vaddr addr, > > MemTxAttrs *attrs); > > int mb_cpu_gdb_read_register(CPUState *cpu, GByteArray *buf, int reg); > > int mb_cpu_gdb_write_register(CPUState *cpu, uint8_t *buf, int reg); > > +int mb_cpu_gdb_read_stack_protect(CPUArchState *cpu, GByteArray *buf, > int reg); > > +int mb_cpu_gdb_write_stack_protect(CPUArchState *cpu, uint8_t *buf, int > reg); > > > > static inline uint32_t mb_cpu_read_msr(const CPUMBState *env) > > { > > diff --git a/target/microblaze/cpu.c b/target/microblaze/cpu.c > > index 817681f9b2..a2d2f5c340 100644 > > --- a/target/microblaze/cpu.c > > +++ b/target/microblaze/cpu.c > > @@ -28,6 +28,7 @@ > > #include "qemu/module.h" > > #include "hw/qdev-properties.h" > > #inc
Re: [PATCH] target/microblaze: Add gdbstub xml
Alex, Edgar, this has been reviewed. Will either of you take it with your trees, or shall I just queue it through tcg-next? r~ On 12/30/22 06:24, Richard Henderson wrote: Mirroring the upstream gdb xml files, the two stack boundary registers are separated out. Signed-off-by: Richard Henderson --- I did this thinking I would be fixing: TESTbasic gdbstub support on microblaze Truncated register 35 in remote 'g' packet Traceback (most recent call last): File "/home/rth/qemu/src/tests/tcg/multiarch/gdbstub/sha1.py", line 71, in if gdb.parse_and_eval('$pc') == 0: gdb.error: No registers. but in the end it turned out that the gdb-multiarch supplied by ubuntu 22.04 simply doesn't support MicroBlaze, as can be seen with the "set architecture" command within gdb. (I built gdb from source, to try and debug why this still wasn't working, only to find that it did. :-P) Alex, any way to modify our gdb test to fail gracefully here? Regardless, having proper xml for all of our targets seems like the correct way forward. r~ Cc: Alex Bennée Cc: Edgar E. 
Iglesias --- configs/targets/microblaze-linux-user.mak | 1 + configs/targets/microblaze-softmmu.mak | 1 + configs/targets/microblazeel-linux-user.mak | 1 + configs/targets/microblazeel-softmmu.mak| 1 + target/microblaze/cpu.h | 2 + target/microblaze/cpu.c | 7 ++- target/microblaze/gdbstub.c | 51 +++- gdb-xml/microblaze-core.xml | 67 + gdb-xml/microblaze-stack-protect.xml| 12 9 files changed, 128 insertions(+), 15 deletions(-) create mode 100644 gdb-xml/microblaze-core.xml create mode 100644 gdb-xml/microblaze-stack-protect.xml diff --git a/configs/targets/microblaze-linux-user.mak b/configs/targets/microblaze-linux-user.mak index 4249a37f65..0a2322c249 100644 --- a/configs/targets/microblaze-linux-user.mak +++ b/configs/targets/microblaze-linux-user.mak @@ -3,3 +3,4 @@ TARGET_SYSTBL_ABI=common TARGET_SYSTBL=syscall.tbl TARGET_BIG_ENDIAN=y TARGET_HAS_BFLT=y +TARGET_XML_FILES=gdb-xml/microblaze-core.xml gdb-xml/microblaze-stack-protect.xml diff --git a/configs/targets/microblaze-softmmu.mak b/configs/targets/microblaze-softmmu.mak index 8385e2d333..e84c0cc728 100644 --- a/configs/targets/microblaze-softmmu.mak +++ b/configs/targets/microblaze-softmmu.mak @@ -2,3 +2,4 @@ TARGET_ARCH=microblaze TARGET_BIG_ENDIAN=y TARGET_SUPPORTS_MTTCG=y TARGET_NEED_FDT=y +TARGET_XML_FILES=gdb-xml/microblaze-core.xml gdb-xml/microblaze-stack-protect.xml diff --git a/configs/targets/microblazeel-linux-user.mak b/configs/targets/microblazeel-linux-user.mak index d0e775d840..270743156a 100644 --- a/configs/targets/microblazeel-linux-user.mak +++ b/configs/targets/microblazeel-linux-user.mak @@ -2,3 +2,4 @@ TARGET_ARCH=microblaze TARGET_SYSTBL_ABI=common TARGET_SYSTBL=syscall.tbl TARGET_HAS_BFLT=y +TARGET_XML_FILES=gdb-xml/microblaze-core.xml gdb-xml/microblaze-stack-protect.xml diff --git a/configs/targets/microblazeel-softmmu.mak b/configs/targets/microblazeel-softmmu.mak index af40391f2f..9b688036bd 100644 --- a/configs/targets/microblazeel-softmmu.mak +++ 
b/configs/targets/microblazeel-softmmu.mak @@ -1,3 +1,4 @@ TARGET_ARCH=microblaze TARGET_SUPPORTS_MTTCG=y TARGET_NEED_FDT=y +TARGET_XML_FILES=gdb-xml/microblaze-core.xml gdb-xml/microblaze-stack-protect.xml diff --git a/target/microblaze/cpu.h b/target/microblaze/cpu.h index 1e84dd8f47..e541fbb0b3 100644 --- a/target/microblaze/cpu.h +++ b/target/microblaze/cpu.h @@ -367,6 +367,8 @@ hwaddr mb_cpu_get_phys_page_attrs_debug(CPUState *cpu, vaddr addr, MemTxAttrs *attrs); int mb_cpu_gdb_read_register(CPUState *cpu, GByteArray *buf, int reg); int mb_cpu_gdb_write_register(CPUState *cpu, uint8_t *buf, int reg); +int mb_cpu_gdb_read_stack_protect(CPUArchState *cpu, GByteArray *buf, int reg); +int mb_cpu_gdb_write_stack_protect(CPUArchState *cpu, uint8_t *buf, int reg); static inline uint32_t mb_cpu_read_msr(const CPUMBState *env) { diff --git a/target/microblaze/cpu.c b/target/microblaze/cpu.c index 817681f9b2..a2d2f5c340 100644 --- a/target/microblaze/cpu.c +++ b/target/microblaze/cpu.c @@ -28,6 +28,7 @@ #include "qemu/module.h" #include "hw/qdev-properties.h" #include "exec/exec-all.h" +#include "exec/gdbstub.h" #include "fpu/softfloat-helpers.h" static const struct { @@ -294,6 +295,9 @@ static void mb_cpu_initfn(Object *obj) CPUMBState *env = &cpu->env; cpu_set_cpustate_pointers(cpu); +gdb_register_coprocessor(CPU(cpu), mb_cpu_gdb_read_stack_protect, + mb_cpu_gdb_write_stack_protect, 2, + "microblaze-stack-protect.xml", 0); set_float_rounding_mode(float_round_nearest_even, &env->fp_status); @@ -422,7 +426,8 @@ static void mb_cpu_class_init(ObjectClass *oc, v
Re: [PATCH 0/4] target/arm: Cache ARMVAParameters
Ping.

r~

On 2/1/23 21:52, Richard Henderson wrote:
Hi Anders,

I'm not well versed on tuxrun, and how to make that work with a qemu binary
outside of the container, so I'm not sure if I'm comparing apples to bananas.
Can you look and see if this fixes the kselftest slowdown you reported?

Anyway, for a boot and shutdown of your rootfs, I see:

Before:
    11.13%  [.] aa64_va_parameters
     8.38%  [.] helper_lookup_tb_ptr
     7.37%  [.] pauth_computepac
     3.79%  [.] qht_lookup_custom

After:
     9.17%  [.] helper_lookup_tb_ptr
     8.05%  [.] pauth_computepac
     4.22%  [.] qht_lookup_custom
     3.68%  [.] pauth_addpac
     ...
     1.67%  [.] aa64_va_parameters

This is all due to the heavy use pauth makes of aa64_va_parameters.
It "only" needs 2 parameters, tsz and tbi, but tsz is probably the most
expensive part of aa64_va_parameters -- do anything about that and we
might as well cache the whole thing.

The change from struct+bitfields to uint32_t+FIELD is meant to combat
some really ugly code that gcc produced. Seems like they should have
compiled to the same thing, more or less, but alas.

r~

Richard Henderson (4):
  target/arm: Flush only required tlbs for TCR_EL[12]
  target/arm: Store tbi for both insns and data in ARMVAParameters
  target/arm: Use FIELD for ARMVAParameters
  target/arm: Cache ARMVAParameters

 target/arm/cpu.h | 30 +++
 target/arm/internals.h | 21 +
 target/arm/helper.c | 177 --
 target/arm/pauth_helper.c | 39 +
 target/arm/ptw.c | 57 ++--
 5 files changed, 217 insertions(+), 107 deletions(-)
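
As a rough illustration of the struct+bitfields versus uint32_t packing trade-off mentioned in the cover letter, here is a minimal sketch using plain shifts and masks. It does not use QEMU's FIELD macro, and the layout (tsz in bits [7:0], a two-bit tbi in bits [9:8]) and the vap_* names are assumptions invented for the example, not the layout chosen in the series.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout, for illustration only. */
#define VAP_TSZ_SHIFT 0
#define VAP_TSZ_MASK  0xffu
#define VAP_TBI_SHIFT 8
#define VAP_TBI_MASK  0x3u   /* one bit each for data and insns */

/* Pack both parameters into one word that is cheap to cache and copy. */
static inline uint32_t vap_pack(uint32_t tsz, uint32_t tbi)
{
    return ((tsz & VAP_TSZ_MASK) << VAP_TSZ_SHIFT) |
           ((tbi & VAP_TBI_MASK) << VAP_TBI_SHIFT);
}

/* Extract tsz from the packed word. */
static inline uint32_t vap_tsz(uint32_t packed)
{
    return (packed >> VAP_TSZ_SHIFT) & VAP_TSZ_MASK;
}

/* Extract tbi from the packed word. */
static inline uint32_t vap_tbi(uint32_t packed)
{
    return (packed >> VAP_TBI_SHIFT) & VAP_TBI_MASK;
}
```

With an explicit integer the compiler sees ordinary shift and mask operations, which is presumably the kind of code the FIELD-based version aims for; whether it matches what gcc emits for the real structure is not something this sketch can show.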
Re: [PATCH 0/1] accel/tcg: Allow the second page of an instruction to be MMIO
On 2/6/23 09:38, Richard Henderson wrote:
Curious but true: two independent reports of the same issue within 24 hours,
one with an x86 guest and one with an arm guest. Neither report included
instructions for reproduction (and both seem to be with complex setup),
therefore this is untested, but seems simple enough to be the proper fix.

It matches up with

    /*
     * If the TB is not associated with a physical RAM page then it must be
     * a temporary one-insn TB, and we have nothing left to do. Return early
     * before attempting to link to other TBs or add to the lookup table.
     */
    if (tb_page_addr0(tb) == -1) {
        return tb;
    }

in tb_gen_code().

r~

Richard Henderson (1):
  accel/tcg: Allow the second page of an instruction to be MMIO

 accel/tcg/translator.c | 12 ++--
 1 file changed, 10 insertions(+), 2 deletions(-)

Queued to tcg-next.

r~
Re: [PATCH] target/i386: Fix 32-bit AD[CO]X insns in 64-bit mode
Ping. Paolo, I see you've queued a fix for a different ADCOX bug in your latest pull. You could probably adjust your new test for this case, but this problem is exclusively x86_64. r~ On 1/14/23 15:21, Richard Henderson wrote: Failure to truncate the inputs results in garbage for the carry-out. Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1373 Signed-off-by: Richard Henderson --- tests/tcg/x86_64/adox.c | 69 target/i386/tcg/emit.c.inc | 2 + tests/tcg/x86_64/Makefile.target | 3 ++ 3 files changed, 74 insertions(+) create mode 100644 tests/tcg/x86_64/adox.c diff --git a/tests/tcg/x86_64/adox.c b/tests/tcg/x86_64/adox.c new file mode 100644 index 00..36be644c8b --- /dev/null +++ b/tests/tcg/x86_64/adox.c @@ -0,0 +1,69 @@ +/* See if ADOX give expected results */ + +#include +#include +#include + +static uint64_t adoxq(bool *c_out, uint64_t a, uint64_t b, bool c) +{ +asm ("addl $0x7fff, %k1\n\t" + "adoxq %2, %0\n\t" + "seto %b1" + : "+r"(a), "=&r"(c) : "r"(b), "1"((int)c)); +*c_out = c; +return a; +} + +static uint64_t adoxl(bool *c_out, uint64_t a, uint64_t b, bool c) +{ +asm ("addl $0x7fff, %k1\n\t" + "adoxl %k2, %k0\n\t" + "seto %b1" + : "+r"(a), "=&r"(c) : "r"(b), "1"((int)c)); +*c_out = c; +return a; +} + +int main() +{ +uint64_t r; +bool c; + +r = adoxq(&c, 0, 0, 0); +assert(r == 0); +assert(c == 0); + +r = adoxl(&c, 0, 0, 0); +assert(r == 0); +assert(c == 0); + +r = adoxl(&c, 0x1, 0, 0); +assert(r == 0); +assert(c == 0); + +r = adoxq(&c, 0, 0, 1); +assert(r == 1); +assert(c == 0); + +r = adoxl(&c, 0, 0, 1); +assert(r == 1); +assert(c == 0); + +r = adoxq(&c, -1, -1, 0); +assert(r == -2); +assert(c == 1); + +r = adoxl(&c, -1, -1, 0); +assert(r == 0xfffe); +assert(c == 1); + +r = adoxq(&c, -1, -1, 1); +assert(r == -1); +assert(c == 1); + +r = adoxl(&c, -1, -1, 1); +assert(r == 0x); +assert(c == 1); + +return 0; +} diff --git a/target/i386/tcg/emit.c.inc b/target/i386/tcg/emit.c.inc index 1eace1231a..d44c51209d 100644 --- a/target/i386/tcg/emit.c.inc +++ 
b/target/i386/tcg/emit.c.inc @@ -1042,6 +1042,8 @@ static void gen_ADCOX(DisasContext *s, CPUX86State *env, MemOp ot, int cc_op) #ifdef TARGET_X86_64 case MO_32: /* If TL is 64-bit just do everything in 64-bit arithmetic. */ +tcg_gen_ext32u_tl(s->T0, s->T0); +tcg_gen_ext32u_tl(s->T1, s->T1); tcg_gen_add_i64(s->T0, s->T0, s->T1); tcg_gen_add_i64(s->T0, s->T0, carry_in); tcg_gen_shri_i64(carry_out, s->T0, 32); diff --git a/tests/tcg/x86_64/Makefile.target b/tests/tcg/x86_64/Makefile.target index 4eac78293f..e64aab1b81 100644 --- a/tests/tcg/x86_64/Makefile.target +++ b/tests/tcg/x86_64/Makefile.target @@ -12,11 +12,14 @@ ifeq ($(filter %-linux-user, $(TARGET)),$(TARGET)) X86_64_TESTS += vsyscall X86_64_TESTS += noexec X86_64_TESTS += cmpxchg +X86_64_TESTS += adox TESTS=$(MULTIARCH_TESTS) $(X86_64_TESTS) test-x86_64 else TESTS=$(MULTIARCH_TESTS) endif +adox: CFLAGS=-O2 + run-test-i386-ssse3: QEMU_OPTS += -cpu max run-plugin-test-i386-ssse3-%: QEMU_OPTS += -cpu max
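
To see why the two added zero-extensions matter, here is a small host-side model of the MO_32 path. It is a sketch, not the TCG code: add32_via_64() and its truncate_inputs switch are hypothetical, and the model only mirrors the three visible steps (add, add carry-in, shift right by 32). With stale upper halves in the inputs, everything above bit 31 leaks into the carry-out.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical model of a 32-bit add-with-carry done in 64-bit
 * arithmetic. The carry-out is taken as (sum >> 32), unmasked,
 * which is only correct if both inputs have clear upper halves.
 */
static uint32_t add32_via_64(uint64_t t0, uint64_t t1, uint32_t carry_in,
                             int truncate_inputs, uint32_t *carry_out)
{
    uint64_t sum;

    if (truncate_inputs) {
        t0 = (uint32_t)t0;   /* zero-extend the low 32 bits */
        t1 = (uint32_t)t1;
    }

    sum = t0 + t1 + (carry_in & 1);
    *carry_out = (uint32_t)(sum >> 32);
    return (uint32_t)sum;
}
```

For example, with t0 = 0xdeadbeef00000001 and t1 = 1 the untruncated version reports 0xdeadbeef as the "carry", while the truncated version reports 0, which matches the "garbage for the carry-out" description in the commit message.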
Re: [PATCH] target/i386: Fix BZHI instruction
Ping.

r~

On 1/14/23 13:32, Richard Henderson wrote:
We did not correctly handle N >= operand size.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1374
Signed-off-by: Richard Henderson
---
 tests/tcg/i386/test-i386-bmi2.c | 3 +++
 target/i386/tcg/emit.c.inc | 14 +++---
 2 files changed, 10 insertions(+), 7 deletions(-)

diff --git a/tests/tcg/i386/test-i386-bmi2.c b/tests/tcg/i386/test-i386-bmi2.c
index 982d4abda4..0244df7987 100644
--- a/tests/tcg/i386/test-i386-bmi2.c
+++ b/tests/tcg/i386/test-i386-bmi2.c
@@ -123,6 +123,9 @@ int main(int argc, char *argv[]) {
     result = bzhiq(mask, 0x1f);
     assert(result == (mask & ~(-1 << 30)));
 
+    result = bzhiq(mask, 0x40);
+    assert(result == mask);
+
     result = rorxq(0x2132435465768798, 8);
     assert(result == 0x9821324354657687);
 
diff --git a/target/i386/tcg/emit.c.inc b/target/i386/tcg/emit.c.inc
index 4d7702c106..1eace1231a 100644
--- a/target/i386/tcg/emit.c.inc
+++ b/target/i386/tcg/emit.c.inc
@@ -1143,20 +1143,20 @@ static void gen_BLSR(DisasContext *s, CPUX86State *env, X86DecodedInsn *decode)
 static void gen_BZHI(DisasContext *s, CPUX86State *env, X86DecodedInsn *decode)
 {
     MemOp ot = decode->op[0].ot;
-    TCGv bound;
+    TCGv bound = tcg_constant_tl(ot == MO_64 ? 63 : 31);
+    TCGv zero = tcg_constant_tl(0);
+    TCGv mone = tcg_constant_tl(-1);
 
-    tcg_gen_ext8u_tl(s->T1, cpu_regs[s->vex_v]);
-    bound = tcg_constant_tl(ot == MO_64 ? 63 : 31);
+    tcg_gen_ext8u_tl(s->T1, s->T1);
 
     /*
      * Note that since we're using BMILG (in order to get O
      * cleared) we need to store the inverse into C.
      */
-    tcg_gen_setcond_tl(TCG_COND_LT, cpu_cc_src, s->T1, bound);
-    tcg_gen_movcond_tl(TCG_COND_GT, s->T1, s->T1, bound, bound, s->T1);
+    tcg_gen_setcond_tl(TCG_COND_LEU, cpu_cc_src, s->T1, bound);
 
-    tcg_gen_movi_tl(s->A0, -1);
-    tcg_gen_shl_tl(s->A0, s->A0, s->T1);
+    tcg_gen_shl_tl(s->A0, mone, s->T1);
+    tcg_gen_movcond_tl(TCG_COND_LEU, s->A0, s->T1, bound, s->A0, zero);
     tcg_gen_andc_tl(s->T0, s->T0, s->A0);
 
     gen_op_update1_cc(s);
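
For reference when reading the fix, the intended 64-bit behaviour can be written as a plain C model. This is a sketch under the assumption that the architectural rule is "use the low 8 bits of the index; if that value is 64 or more, the source is returned unchanged and CF is set"; bzhi64_model() is a hypothetical helper, not part of the patch or the test.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical reference model of 64-bit BZHI: clear all bits of src
 * from position n upwards, where n is the low byte of index.
 * When n >= 64 nothing is cleared and the carry flag is set.
 */
static uint64_t bzhi64_model(uint64_t src, uint64_t index, bool *cf)
{
    unsigned n = (unsigned)(index & 0xff);  /* only the low 8 bits count */

    *cf = n > 63;
    if (n > 63) {
        return src;                         /* N >= operand size */
    }
    /* n is in 0..63 here, so the shift is well defined. */
    return src & ~(UINT64_MAX << n);
}
```

The new test case in the patch, bzhiq(mask, 0x40) == mask, corresponds to the n > 63 branch of this model.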
[PATCH v11 31/59] hw/xen: Implement EVTCHNOP_unmask
From: David Woodhouse This finally comes with a mechanism for actually injecting events into the guest vCPU, with all the atomic-test-and-set that's involved in setting the bit in the shinfo, then the index in the vcpu_info, and injecting either the lapic vector as MSI, or letting KVM inject the bare vector. Signed-off-by: David Woodhouse Reviewed-by: Paul Durrant --- hw/i386/kvm/xen_evtchn.c | 175 ++ hw/i386/kvm/xen_evtchn.h | 2 + target/i386/kvm/xen-emu.c | 12 +++ 3 files changed, 189 insertions(+) diff --git a/hw/i386/kvm/xen_evtchn.c b/hw/i386/kvm/xen_evtchn.c index 08c6fac357..deea7de027 100644 --- a/hw/i386/kvm/xen_evtchn.c +++ b/hw/i386/kvm/xen_evtchn.c @@ -224,6 +224,13 @@ int xen_evtchn_set_callback_param(uint64_t param) return ret; } +static void inject_callback(XenEvtchnState *s, uint32_t vcpu) +{ +int type = s->callback_param >> CALLBACK_VIA_TYPE_SHIFT; + +kvm_xen_inject_vcpu_callback_vector(vcpu, type); +} + static bool valid_port(evtchn_port_t port) { if (!port) { @@ -294,6 +301,152 @@ int xen_evtchn_status_op(struct evtchn_status *status) return 0; } +/* + * Never thought I'd hear myself say this, but C++ templates would be + * kind of nice here. + * + * template static int do_unmask_port(T *shinfo, ...); + */ +static int do_unmask_port_lm(XenEvtchnState *s, evtchn_port_t port, + bool do_unmask, struct shared_info *shinfo, + struct vcpu_info *vcpu_info) +{ +const int bits_per_word = BITS_PER_BYTE * sizeof(shinfo->evtchn_pending[0]); +typeof(shinfo->evtchn_pending[0]) mask; +int idx = port / bits_per_word; +int offset = port % bits_per_word; + +mask = 1UL << offset; + +if (idx >= bits_per_word) { +return -EINVAL; +} + +if (do_unmask) { +/* + * If this is a true unmask operation, clear the mask bit. If + * it was already unmasked, we have nothing further to do. + */ +if (!((qatomic_fetch_and(&shinfo->evtchn_mask[idx], ~mask) & mask))) { +return 0; +} +} else { +/* + * This is a pseudo-unmask for affinity changes. 
We don't + * change the mask bit, and if it's *masked* we have nothing + * else to do. + */ +if (qatomic_fetch_or(&shinfo->evtchn_mask[idx], 0) & mask) { +return 0; +} +} + +/* If the event was not pending, we're done. */ +if (!(qatomic_fetch_or(&shinfo->evtchn_pending[idx], 0) & mask)) { +return 0; +} + +/* Now on to the vcpu_info evtchn_pending_sel index... */ +mask = 1UL << idx; + +/* If a port in this word was already pending for this vCPU, all done. */ +if (qatomic_fetch_or(&vcpu_info->evtchn_pending_sel, mask) & mask) { +return 0; +} + +/* Set evtchn_upcall_pending for this vCPU */ +if (qatomic_fetch_or(&vcpu_info->evtchn_upcall_pending, 1)) { +return 0; +} + +inject_callback(s, s->port_table[port].vcpu); + +return 0; +} + +static int do_unmask_port_compat(XenEvtchnState *s, evtchn_port_t port, + bool do_unmask, + struct compat_shared_info *shinfo, + struct compat_vcpu_info *vcpu_info) +{ +const int bits_per_word = BITS_PER_BYTE * sizeof(shinfo->evtchn_pending[0]); +typeof(shinfo->evtchn_pending[0]) mask; +int idx = port / bits_per_word; +int offset = port % bits_per_word; + +mask = 1UL << offset; + +if (idx >= bits_per_word) { +return -EINVAL; +} + +if (do_unmask) { +/* + * If this is a true unmask operation, clear the mask bit. If + * it was already unmasked, we have nothing further to do. + */ +if (!((qatomic_fetch_and(&shinfo->evtchn_mask[idx], ~mask) & mask))) { +return 0; +} +} else { +/* + * This is a pseudo-unmask for affinity changes. We don't + * change the mask bit, and if it's *masked* we have nothing + * else to do. + */ +if (qatomic_fetch_or(&shinfo->evtchn_mask[idx], 0) & mask) { +return 0; +} +} + +/* If the event was not pending, we're done. */ +if (!(qatomic_fetch_or(&shinfo->evtchn_pending[idx], 0) & mask)) { +return 0; +} + +/* Now on to the vcpu_info evtchn_pending_sel index... */ +mask = 1UL << idx; + +/* If a port in this word was already pending for this vCPU, all done. 
*/ +if (qatomic_fetch_or(&vcpu_info->evtchn_pending_sel, mask) & mask) { +return 0; +} + +/* Set evtchn_upcall_pending for this vCPU */ +if (qatomic_fetch_or(&vcpu_info->evtchn_upcall_pending, 1)) { +return 0; +} + +inject_callback(s, s->port_table[port].vcpu); + +return 0; +} + +static int unmask_port(XenEvtchnState *s, evtchn_port_t port, bool do_unmask) +{ +
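The delivery sequence described in the commit message (mask bit in the shared info, then the selector bit in vcpu_info, then the upcall flag) can be sketched outside QEMU with C11 atomics. Everything below is a toy under stated assumptions: the toy_* types and names are hypothetical, the word size is fixed at 64 bits, and the return value merely tells the caller whether a callback would need injecting.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_WORD_BITS 64

/* Hypothetical stand-ins for the shared_info / vcpu_info fields used. */
struct toy_shared_info {
    _Atomic uint64_t pending[TOY_WORD_BITS];
    _Atomic uint64_t mask[TOY_WORD_BITS];
};

struct toy_vcpu_info {
    _Atomic uint64_t pending_sel;
    _Atomic uint8_t upcall_pending;
};

/* Returns true only if the caller would now have to inject the callback. */
static bool toy_unmask_port(struct toy_shared_info *sh,
                            struct toy_vcpu_info *vi, unsigned port)
{
    unsigned idx = port / TOY_WORD_BITS;
    uint64_t bit = UINT64_C(1) << (port % TOY_WORD_BITS);

    assert(sh && vi);
    if (idx >= TOY_WORD_BITS) {
        return false;
    }

    /* Clear the mask bit; if it was already clear there is nothing to do. */
    if (!(atomic_fetch_and(&sh->mask[idx], ~bit) & bit)) {
        return false;
    }

    /* Nothing pending on this port: done. */
    if (!(atomic_load(&sh->pending[idx]) & bit)) {
        return false;
    }

    /* Set the per-vCPU selector bit for this word of the pending bitmap. */
    bit = UINT64_C(1) << idx;
    if (atomic_fetch_or(&vi->pending_sel, bit) & bit) {
        return false;
    }

    /* Finally raise the upcall flag; only the first setter injects. */
    return atomic_fetch_or(&vi->upcall_pending, 1) == 0;
}
```

Each step stops early when the bit it sets was already set, which is what keeps the injection from happening more than once per batch of events.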
[PATCH v11 54/59] i386/xen: Implement HYPERVISOR_physdev_op
From: David Woodhouse Just hook up the basic hypercalls to stubs in xen_evtchn.c for now. Signed-off-by: David Woodhouse Reviewed-by: Paul Durrant --- hw/i386/kvm/xen_evtchn.c | 25 hw/i386/kvm/xen_evtchn.h | 11 target/i386/kvm/xen-compat.h | 19 ++ target/i386/kvm/xen-emu.c| 118 +++ 4 files changed, 173 insertions(+) diff --git a/hw/i386/kvm/xen_evtchn.c b/hw/i386/kvm/xen_evtchn.c index 7412139154..ca9f15698f 100644 --- a/hw/i386/kvm/xen_evtchn.c +++ b/hw/i386/kvm/xen_evtchn.c @@ -1347,6 +1347,31 @@ int xen_evtchn_set_port(uint16_t port) return ret; } +int xen_physdev_map_pirq(struct physdev_map_pirq *map) +{ +return -ENOTSUP; +} + +int xen_physdev_unmap_pirq(struct physdev_unmap_pirq *unmap) +{ +return -ENOTSUP; +} + +int xen_physdev_eoi_pirq(struct physdev_eoi *eoi) +{ +return -ENOTSUP; +} + +int xen_physdev_query_pirq(struct physdev_irq_status_query *query) +{ +return -ENOTSUP; +} + +int xen_physdev_get_free_pirq(struct physdev_get_free_pirq *get) +{ +return -ENOTSUP; +} + struct xenevtchn_handle *xen_be_evtchn_open(void) { struct xenevtchn_handle *xc = g_new0(struct xenevtchn_handle, 1); diff --git a/hw/i386/kvm/xen_evtchn.h b/hw/i386/kvm/xen_evtchn.h index 5a71ffb753..352c875976 100644 --- a/hw/i386/kvm/xen_evtchn.h +++ b/hw/i386/kvm/xen_evtchn.h @@ -62,4 +62,15 @@ int xen_evtchn_bind_interdomain_op(struct evtchn_bind_interdomain *interdomain); int xen_evtchn_bind_vcpu_op(struct evtchn_bind_vcpu *vcpu); int xen_evtchn_reset_op(struct evtchn_reset *reset); +struct physdev_map_pirq; +struct physdev_unmap_pirq; +struct physdev_eoi; +struct physdev_irq_status_query; +struct physdev_get_free_pirq; +int xen_physdev_map_pirq(struct physdev_map_pirq *map); +int xen_physdev_unmap_pirq(struct physdev_unmap_pirq *unmap); +int xen_physdev_eoi_pirq(struct physdev_eoi *eoi); +int xen_physdev_query_pirq(struct physdev_irq_status_query *query); +int xen_physdev_get_free_pirq(struct physdev_get_free_pirq *get); + #endif /* QEMU_XEN_EVTCHN_H */ diff --git 
a/target/i386/kvm/xen-compat.h b/target/i386/kvm/xen-compat.h index 448336de92..7f30180cc2 100644 --- a/target/i386/kvm/xen-compat.h +++ b/target/i386/kvm/xen-compat.h @@ -48,4 +48,23 @@ struct compat_xen_add_to_physmap_batch { COMPAT_HANDLE(int) errs; }; +struct compat_physdev_map_pirq { +domid_t domid; +uint16_t pad; +/* IN */ +int type; +/* IN (ignored for ..._MULTI_MSI) */ +int index; +/* IN or OUT */ +int pirq; +/* IN - high 16 bits hold segment for ..._MSI_SEG and ..._MULTI_MSI */ +int bus; +/* IN */ +int devfn; +/* IN (also OUT for ..._MULTI_MSI) */ +int entry_nr; +/* IN */ +uint64_t table_base; +} __attribute__((packed)); + #endif /* QEMU_I386_XEN_COMPAT_H */ diff --git a/target/i386/kvm/xen-emu.c b/target/i386/kvm/xen-emu.c index 389acd0c42..e8e7092c66 100644 --- a/target/i386/kvm/xen-emu.c +++ b/target/i386/kvm/xen-emu.c @@ -1517,6 +1517,121 @@ static bool kvm_xen_hcall_gnttab_op(struct kvm_xen_exit *exit, X86CPU *cpu, return true; } +static bool kvm_xen_hcall_physdev_op(struct kvm_xen_exit *exit, X86CPU *cpu, + int cmd, uint64_t arg) +{ +CPUState *cs = CPU(cpu); +int err; + +switch (cmd) { +case PHYSDEVOP_map_pirq: { +struct physdev_map_pirq map; + +if (hypercall_compat32(exit->u.hcall.longmode)) { +struct compat_physdev_map_pirq *map32 = (void *)&map; + +if (kvm_copy_from_gva(cs, arg, map32, sizeof(*map32))) { +return -EFAULT; +} + +/* + * The only thing that's different is the alignment of the + * uint64_t table_base at the end, which gets padding to make + * it 64-bit aligned in the 64-bit version.
+ */ +qemu_build_assert(sizeof(*map32) == 36); +qemu_build_assert(offsetof(struct physdev_map_pirq, entry_nr) == + offsetof(struct compat_physdev_map_pirq, entry_nr)); +memmove(&map.table_base, &map32->table_base, sizeof(map.table_base)); +} else { +if (kvm_copy_from_gva(cs, arg, &map, sizeof(map))) { +err = -EFAULT; +break; +} +} +err = xen_physdev_map_pirq(&map); +/* + * Since table_base is an IN parameter and won't be changed, just + * copy the size of the compat structure back to the guest. + */ +if (!err && kvm_copy_to_gva(cs, arg, &map, +sizeof(struct compat_physdev_map_pirq))) { +err = -EFAULT; +} +break; +} +case PHYSDEVOP_unmap_pirq: { +struct physdev_unmap_pirq unmap; + +qemu_build_assert(sizeof(unmap) == 8); +if (kvm_copy_from_gva(cs, arg, &unmap, sizeof(unmap))) { +err = -EFAULT; +
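The layout point made in the comment (only table_base moves between the 32-bit and 64-bit views) can be checked in isolation. The sketch below uses hypothetical toy_* structs with the same field order, assumes GCC or Clang for the packed attribute, and assumes a host ABI where uint64_t is 8-byte aligned (x86-64, for example). It copies into a separate destination rather than aliasing in place as the patch does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-ins for the native and 32-bit compat layouts (names hypothetical). */
struct toy_map_pirq {
    uint16_t domid;
    uint16_t pad;
    int type, index, pirq, bus, devfn, entry_nr;
    uint64_t table_base;        /* naturally aligned: padding precedes it */
};

struct toy_compat_map_pirq {
    uint16_t domid;
    uint16_t pad;
    int type, index, pirq, bus, devfn, entry_nr;
    uint64_t table_base;        /* packed: directly after entry_nr */
} __attribute__((packed));

_Static_assert(sizeof(struct toy_compat_map_pirq) == 36, "compat size");
_Static_assert(offsetof(struct toy_map_pirq, entry_nr) ==
               offsetof(struct toy_compat_map_pirq, entry_nr),
               "common prefix must line up");

static void toy_widen_map_pirq(struct toy_map_pirq *dst,
                               const struct toy_compat_map_pirq *src)
{
    assert(dst && src);
    /* Everything up to and including entry_nr sits at identical offsets. */
    memcpy(dst, src, offsetof(struct toy_compat_map_pirq, table_base));
    /* Only table_base moves; the compiler handles the unaligned read. */
    dst->table_base = src->table_base;
}
```

On an ABI that only 4-byte aligns uint64_t the two layouts would coincide and no widening would be needed, which is why the offset check is left to the caller here rather than asserted at build time.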
[PATCH v11 07/59] xen-platform: exclude vfio-pci from the PCI platform unplug
From: Joao Martins This is needed so that PCI passthrough devices work for Xen-emulated guests. Signed-off-by: Joao Martins Signed-off-by: David Woodhouse Reviewed-by: Paul Durrant --- hw/i386/xen/xen_platform.c | 18 +++--- 1 file changed, 15 insertions(+), 3 deletions(-) diff --git a/hw/i386/xen/xen_platform.c b/hw/i386/xen/xen_platform.c index 66e6de31a6..d601a5509d 100644 --- a/hw/i386/xen/xen_platform.c +++ b/hw/i386/xen/xen_platform.c @@ -109,12 +109,25 @@ static void log_writeb(PCIXenPlatformState *s, char val) #define _UNPLUG_NVME_DISKS 3 #define UNPLUG_NVME_DISKS (1u << _UNPLUG_NVME_DISKS) +static bool pci_device_is_passthrough(PCIDevice *d) +{ +if (!strcmp(d->name, "xen-pci-passthrough")) { +return true; +} + +if (xen_mode == XEN_EMULATE && !strcmp(d->name, "vfio-pci")) { +return true; +} + +return false; +} + static void unplug_nic(PCIBus *b, PCIDevice *d, void *o) { /* We have to ignore passthrough devices */ if (pci_get_word(d->config + PCI_CLASS_DEVICE) == PCI_CLASS_NETWORK_ETHERNET -&& strcmp(d->name, "xen-pci-passthrough") != 0) { +&& !pci_device_is_passthrough(d)) { object_unparent(OBJECT(d)); } } @@ -187,9 +200,8 @@ static void unplug_disks(PCIBus *b, PCIDevice *d, void *opaque) !(flags & UNPLUG_IDE_SCSI_DISKS); /* We have to ignore passthrough devices */ -if (!strcmp(d->name, "xen-pci-passthrough")) { +if (pci_device_is_passthrough(d)) return; -} switch (pci_get_word(d->config + PCI_CLASS_DEVICE)) { case PCI_CLASS_STORAGE_IDE: -- 2.39.0
[PATCH v11 38/59] hw/xen: Implement EVTCHNOP_reset
From: David Woodhouse Signed-off-by: David Woodhouse Reviewed-by: Paul Durrant --- hw/i386/kvm/xen_evtchn.c | 30 ++ hw/i386/kvm/xen_evtchn.h | 3 +++ target/i386/kvm/xen-emu.c | 17 + 3 files changed, 50 insertions(+) diff --git a/hw/i386/kvm/xen_evtchn.c b/hw/i386/kvm/xen_evtchn.c index f87b6a3b23..9b1fb47e85 100644 --- a/hw/i386/kvm/xen_evtchn.c +++ b/hw/i386/kvm/xen_evtchn.c @@ -12,6 +12,7 @@ #include "qemu/osdep.h" #include "qemu/host-utils.h" #include "qemu/module.h" +#include "qemu/lockable.h" #include "qemu/main-loop.h" #include "qemu/log.h" #include "qapi/error.h" @@ -745,6 +746,35 @@ static int close_port(XenEvtchnState *s, evtchn_port_t port) return 0; } +int xen_evtchn_soft_reset(void) +{ +XenEvtchnState *s = xen_evtchn_singleton; +int i; + +if (!s) { +return -ENOTSUP; +} + +assert(qemu_mutex_iothread_locked()); + +QEMU_LOCK_GUARD(&s->port_lock); + +for (i = 0; i < s->nr_ports; i++) { +close_port(s, i); +} + +return 0; +} + +int xen_evtchn_reset_op(struct evtchn_reset *reset) +{ +if (reset->dom != DOMID_SELF && reset->dom != xen_domid) { +return -ESRCH; +} + +return xen_evtchn_soft_reset(); +} + int xen_evtchn_close_op(struct evtchn_close *close) { XenEvtchnState *s = xen_evtchn_singleton; diff --git a/hw/i386/kvm/xen_evtchn.h b/hw/i386/kvm/xen_evtchn.h index 486b031c82..5d3e03553f 100644 --- a/hw/i386/kvm/xen_evtchn.h +++ b/hw/i386/kvm/xen_evtchn.h @@ -13,6 +13,7 @@ #define QEMU_XEN_EVTCHN_H void xen_evtchn_create(void); +int xen_evtchn_soft_reset(void); int xen_evtchn_set_callback_param(uint64_t param); struct evtchn_status; @@ -24,6 +25,7 @@ struct evtchn_send; struct evtchn_alloc_unbound; struct evtchn_bind_interdomain; struct evtchn_bind_vcpu; +struct evtchn_reset; int xen_evtchn_status_op(struct evtchn_status *status); int xen_evtchn_close_op(struct evtchn_close *close); int xen_evtchn_unmask_op(struct evtchn_unmask *unmask); @@ -33,5 +35,6 @@ int xen_evtchn_send_op(struct evtchn_send *send); int xen_evtchn_alloc_unbound_op(struct 
evtchn_alloc_unbound *alloc); int xen_evtchn_bind_interdomain_op(struct evtchn_bind_interdomain *interdomain); int xen_evtchn_bind_vcpu_op(struct evtchn_bind_vcpu *vcpu); +int xen_evtchn_reset_op(struct evtchn_reset *reset); #endif /* QEMU_XEN_EVTCHN_H */ diff --git a/target/i386/kvm/xen-emu.c b/target/i386/kvm/xen-emu.c index ec7aefadfc..96261c10a0 100644 --- a/target/i386/kvm/xen-emu.c +++ b/target/i386/kvm/xen-emu.c @@ -961,6 +961,18 @@ static bool kvm_xen_hcall_evtchn_op(struct kvm_xen_exit *exit, X86CPU *cpu, err = xen_evtchn_bind_vcpu_op(&vcpu); break; } +case EVTCHNOP_reset: { +struct evtchn_reset reset; + +qemu_build_assert(sizeof(reset) == 2); +if (kvm_copy_from_gva(cs, arg, &reset, sizeof(reset))) { +err = -EFAULT; +break; +} + +err = xen_evtchn_reset_op(&reset); +break; +} default: return false; } @@ -978,6 +990,11 @@ int kvm_xen_soft_reset(void) trace_kvm_xen_soft_reset(); +err = xen_evtchn_soft_reset(); +if (err) { +return err; +} + /* * Zero is the reset/startup state for HVM_PARAM_CALLBACK_IRQ. Strictly, * it maps to HVM_PARAM_CALLBACK_TYPE_GSI with GSI#0, but Xen refuses to -- 2.39.0
[PATCH v11 08/59] xen-platform: allow its creation with XEN_EMULATE mode
From: Joao Martins The only thing we need to fix to make this build is the PIO hack which sets the BIOS memory areas to R/W vs. R/O. Theoretically we could hook that up to the PAM registers on the emulated PIIX, but in practice nobody cares, so just leave it doing nothing. Now that it builds without actual Xen, move it to CONFIG_XEN_BUS to include it in the KVM-only builds. Signed-off-by: Joao Martins Signed-off-by: David Woodhouse Reviewed-by: Paul Durrant --- hw/i386/xen/meson.build | 5 - hw/i386/xen/xen_platform.c | 39 +- 2 files changed, 30 insertions(+), 14 deletions(-) diff --git a/hw/i386/xen/meson.build b/hw/i386/xen/meson.build index 2fcc46e6ca..3dc4c4f106 100644 --- a/hw/i386/xen/meson.build +++ b/hw/i386/xen/meson.build @@ -1,6 +1,9 @@ i386_ss.add(when: 'CONFIG_XEN', if_true: files( 'xen-hvm.c', 'xen_apic.c', - 'xen_platform.c', 'xen_pvdevice.c', )) + +i386_ss.add(when: 'CONFIG_XEN_BUS', if_true: files( + 'xen_platform.c', +)) diff --git a/hw/i386/xen/xen_platform.c b/hw/i386/xen/xen_platform.c index d601a5509d..319049d80c 100644 --- a/hw/i386/xen/xen_platform.c +++ b/hw/i386/xen/xen_platform.c @@ -28,9 +28,9 @@ #include "hw/ide.h" #include "hw/ide/pci.h" #include "hw/pci/pci.h" -#include "hw/xen/xen_common.h" #include "migration/vmstate.h" -#include "hw/xen/xen-legacy-backend.h" +#include "hw/xen/xen.h" +#include "net/net.h" #include "trace.h" #include "sysemu/xen.h" #include "sysemu/block-backend.h" @@ -38,6 +38,11 @@ #include "qemu/module.h" #include "qom/object.h" +#ifdef CONFIG_XEN +#include "hw/xen/xen_common.h" +#include "hw/xen/xen-legacy-backend.h" +#endif + //#define DEBUG_PLATFORM #ifdef DEBUG_PLATFORM @@ -280,18 +285,26 @@ static void platform_fixed_ioport_writeb(void *opaque, uint32_t addr, uint32_t v PCIXenPlatformState *s = opaque; switch (addr) { -case 0: /* Platform flags */ { -hvmmem_type_t mem_type = (val & PFFLAG_ROM_LOCK) ?
-HVMMEM_ram_ro : HVMMEM_ram_rw; -if (xen_set_mem_type(xen_domid, mem_type, 0xc0, 0x40)) { -DPRINTF("unable to change ro/rw state of ROM memory area!\n"); -} else { +case 0: /* Platform flags */ +if (xen_mode == XEN_EMULATE) { +/* XX: Use i440gx/q35 PAM setup to do this? */ s->flags = val & PFFLAG_ROM_LOCK; -DPRINTF("changed ro/rw state of ROM memory area. now is %s state.\n", -(mem_type == HVMMEM_ram_ro ? "ro":"rw")); +#ifdef CONFIG_XEN +} else { +hvmmem_type_t mem_type = (val & PFFLAG_ROM_LOCK) ? +HVMMEM_ram_ro : HVMMEM_ram_rw; + +if (xen_set_mem_type(xen_domid, mem_type, 0xc0, 0x40)) { +DPRINTF("unable to change ro/rw state of ROM memory area!\n"); +} else { +s->flags = val & PFFLAG_ROM_LOCK; +DPRINTF("changed ro/rw state of ROM memory area. now is %s state.\n", +(mem_type == HVMMEM_ram_ro ? "ro" : "rw")); +} +#endif } break; -} + case 2: log_writeb(s, val); break; @@ -509,8 +522,8 @@ static void xen_platform_realize(PCIDevice *dev, Error **errp) uint8_t *pci_conf; /* Device will crash on reset if xen is not initialized */ -if (!xen_enabled()) { -error_setg(errp, "xen-platform device requires the Xen accelerator"); +if (xen_mode == XEN_DISABLED) { +error_setg(errp, "xen-platform device requires a Xen guest"); return; } -- 2.39.0
[PATCH v11 09/59] i386/xen: handle guest hypercalls
From: Joao Martins This means handling the new exit reason for Xen but still crashing on purpose. As we implement each of the hypercalls we will then return the right return code. Signed-off-by: Joao Martins [dwmw2: Add CPL to hypercall tracing, disallow hypercalls from CPL > 0] Signed-off-by: David Woodhouse --- target/i386/kvm/kvm.c| 5 target/i386/kvm/trace-events | 3 +++ target/i386/kvm/xen-emu.c| 44 target/i386/kvm/xen-emu.h| 1 + 4 files changed, 53 insertions(+) diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c index 165fa5232d..a7ba3476ac 100644 --- a/target/i386/kvm/kvm.c +++ b/target/i386/kvm/kvm.c @@ -5478,6 +5478,11 @@ int kvm_arch_handle_exit(CPUState *cs, struct kvm_run *run) assert(run->msr.reason == KVM_MSR_EXIT_REASON_FILTER); ret = kvm_handle_wrmsr(cpu, run); break; +#ifdef CONFIG_XEN_EMU +case KVM_EXIT_XEN: +ret = kvm_xen_handle_exit(cpu, &run->xen); +break; +#endif default: fprintf(stderr, "KVM: unknown exit reason %d\n", run->exit_reason); ret = -1; diff --git a/target/i386/kvm/trace-events b/target/i386/kvm/trace-events index 7c369db1e1..cd6f842b1f 100644 --- a/target/i386/kvm/trace-events +++ b/target/i386/kvm/trace-events @@ -5,3 +5,6 @@ kvm_x86_fixup_msi_error(uint32_t gsi) "VT-d failed to remap interrupt for GSI %" kvm_x86_add_msi_route(int virq) "Adding route entry for virq %d" kvm_x86_remove_msi_route(int virq) "Removing route entry for virq %d" kvm_x86_update_msi_routes(int num) "Updated %d MSI routes" + +# xen-emu.c +kvm_xen_hypercall(int cpu, uint8_t cpl, uint64_t input, uint64_t a0, uint64_t a1, uint64_t a2, uint64_t ret) "xen_hypercall: cpu %d cpl %d input %" PRIu64 " a0 0x%" PRIx64 " a1 0x%" PRIx64 " a2 0x%" PRIx64" ret 0x%" PRIx64 diff --git a/target/i386/kvm/xen-emu.c b/target/i386/kvm/xen-emu.c index 4883b95d9d..476f464ee2 100644 --- a/target/i386/kvm/xen-emu.c +++ b/target/i386/kvm/xen-emu.c @@ -10,10 +10,12 @@ */ #include "qemu/osdep.h" +#include "qemu/log.h" #include "sysemu/kvm_int.h" #include "sysemu/kvm_xen.h" 
#include "kvm/kvm_i386.h" #include "xen-emu.h" +#include "trace.h" int kvm_xen_init(KVMState *s, uint32_t hypercall_msr) { @@ -84,3 +86,45 @@ uint32_t kvm_xen_get_caps(void) { return kvm_state->xen_caps; } + +static bool do_kvm_xen_handle_exit(X86CPU *cpu, struct kvm_xen_exit *exit) +{ +uint16_t code = exit->u.hcall.input; + +if (exit->u.hcall.cpl > 0) { +exit->u.hcall.result = -EPERM; +return true; +} + +switch (code) { +default: +return false; +} +} + +int kvm_xen_handle_exit(X86CPU *cpu, struct kvm_xen_exit *exit) +{ +if (exit->type != KVM_EXIT_XEN_HCALL) { +return -1; +} + +if (!do_kvm_xen_handle_exit(cpu, exit)) { +/* + * Some hypercalls will be deliberately "implemented" by returning + * -ENOSYS. This case is for hypercalls which are unexpected. + */ +exit->u.hcall.result = -ENOSYS; +qemu_log_mask(LOG_UNIMP, "Unimplemented Xen hypercall %" + PRId64 " (0x%" PRIx64 " 0x%" PRIx64 " 0x%" PRIx64 ")\n", + (uint64_t)exit->u.hcall.input, + (uint64_t)exit->u.hcall.params[0], + (uint64_t)exit->u.hcall.params[1], + (uint64_t)exit->u.hcall.params[2]); +} + +trace_kvm_xen_hypercall(CPU(cpu)->cpu_index, exit->u.hcall.cpl, +exit->u.hcall.input, exit->u.hcall.params[0], +exit->u.hcall.params[1], exit->u.hcall.params[2], +exit->u.hcall.result); +return 0; +} diff --git a/target/i386/kvm/xen-emu.h b/target/i386/kvm/xen-emu.h index d62f1d8ed8..21faf6bf38 100644 --- a/target/i386/kvm/xen-emu.h +++ b/target/i386/kvm/xen-emu.h @@ -25,5 +25,6 @@ int kvm_xen_init(KVMState *s, uint32_t hypercall_msr); int kvm_xen_init_vcpu(CPUState *cs); +int kvm_xen_handle_exit(X86CPU *cpu, struct kvm_xen_exit *exit); #endif /* QEMU_I386_KVM_XEN_EMU_H */ -- 2.39.0
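The control flow this patch sets up (refuse anything from CPL > 0 with -EPERM, fall back to -ENOSYS for calls nobody handled) is easy to exercise on its own. The following is a stripped-down sketch, not the QEMU code: struct toy_hcall is a hypothetical stand-in for the hcall part of struct kvm_xen_exit, and the result is kept signed for readability.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the hcall part of struct kvm_xen_exit. */
struct toy_hcall {
    uint64_t input;     /* hypercall number */
    uint8_t cpl;        /* guest privilege level at the time of the call */
    int64_t result;
};

/* Returns true if the call was handled (result is already filled in). */
static bool toy_do_handle(struct toy_hcall *h)
{
    if (h->cpl > 0) {
        h->result = -EPERM;
        return true;
    }

    switch (h->input) {
    /* Real handlers would be added here, one case per hypercall. */
    default:
        return false;
    }
}

static int toy_handle_exit(struct toy_hcall *h)
{
    assert(h);
    if (!toy_do_handle(h)) {
        /* Unexpected hypercall: report it as unimplemented to the guest. */
        h->result = -ENOSYS;
    }
    return 0;
}
```

Later patches in the series would then only need to add cases to the inner switch.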
[PATCH v11 12/59] i386/xen: Implement SCHEDOP_poll and SCHEDOP_yield
From: David Woodhouse They both do the same thing and just call sched_yield. This is enough to stop the Linux guest panicking when running on a host kernel which doesn't intercept SCHEDOP_poll and lets it reach userspace. Signed-off-by: David Woodhouse Reviewed-by: Paul Durrant --- target/i386/kvm/xen-emu.c | 13 + 1 file changed, 13 insertions(+) diff --git a/target/i386/kvm/xen-emu.c b/target/i386/kvm/xen-emu.c index 4ed833656f..ebea27caf6 100644 --- a/target/i386/kvm/xen-emu.c +++ b/target/i386/kvm/xen-emu.c @@ -234,6 +234,19 @@ static bool kvm_xen_hcall_sched_op(struct kvm_xen_exit *exit, X86CPU *cpu, err = schedop_shutdown(cs, arg); break; +case SCHEDOP_poll: +/* + * Linux will panic if this doesn't work. Just yield; it's not + * worth overthinking it because with event channel handling + * in KVM, the kernel will intercept this and it will never + * reach QEMU anyway. The semantics of the hypercall explicitly + * permit spurious wakeups. + */ +case SCHEDOP_yield: +sched_yield(); +err = 0; +break; + default: return false; } -- 2.39.0
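A standalone illustration of the shared poll/yield path, for anyone wanting to try it outside QEMU. The TOY_SCHEDOP_* constants and toy_sched_op() are placeholders for this sketch, not the Xen header definitions; only sched_yield() from <sched.h> is a real call.

```c
#include <assert.h>
#include <sched.h>
#include <stdbool.h>

/* Placeholder command numbers for this sketch only. */
enum { TOY_SCHEDOP_yield = 0, TOY_SCHEDOP_poll = 3 };

/* Returns true if cmd was handled; *err then holds the hypercall result. */
static bool toy_sched_op(int cmd, int *err)
{
    assert(err);

    switch (cmd) {
    case TOY_SCHEDOP_poll:
        /* Spurious wakeups are allowed, so returning at once is valid. */
    case TOY_SCHEDOP_yield:
        sched_yield();
        *err = 0;
        return true;
    default:
        return false;
    }
}
```

Treating poll as an immediate wakeup is only acceptable because the caller is expected to re-check its condition and poll again.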
[PATCH v11 02/59] xen: add CONFIG_XEN_BUS and CONFIG_XEN_EMU options for Xen emulation
From: David Woodhouse

The XEN_EMU option will cover core Xen support in target/, which exists only for x86 with KVM today but could theoretically also be implemented on Arm/AArch64 and with TCG or other accelerators (if anyone wants to run the gauntlet of struct layout compatibility, errno mapping, and the rest of that fun).

It will also cover the architecture-independent grant table and event channel support which will be added in hw/i386/kvm/ (on the basis that the non-KVM support is very theoretical and making it not use KVM directly seems like gratuitous overengineering at this point).

The XEN_BUS option is for the xenfv platform support, which will now be used both by XEN_EMU and by real Xen.

The XEN option remains dependent on the Xen runtime libraries, and covers support for real Xen. Some code which currently resides under CONFIG_XEN will be moving to CONFIG_XEN_BUS over time as the direct dependencies on Xen runtime libraries are eliminated. The Xen PCI platform device will also reside under CONFIG_XEN_BUS.

Signed-off-by: David Woodhouse Reviewed-by: Paul Durrant --- hw/Kconfig | 1 + hw/i386/Kconfig | 5 + hw/xen/Kconfig | 3 +++ meson.build | 1 + 4 files changed, 10 insertions(+) create mode 100644 hw/xen/Kconfig diff --git a/hw/Kconfig b/hw/Kconfig index 38233bbb0f..ba62ff6417 100644 --- a/hw/Kconfig +++ b/hw/Kconfig @@ -41,6 +41,7 @@ source tpm/Kconfig source usb/Kconfig source virtio/Kconfig source vfio/Kconfig +source xen/Kconfig source watchdog/Kconfig # arch Kconfig diff --git a/hw/i386/Kconfig b/hw/i386/Kconfig index 9fbfe748b5..d40802d83f 100644 --- a/hw/i386/Kconfig +++ b/hw/i386/Kconfig @@ -136,3 +136,8 @@ config VMPORT config VMMOUSE bool depends on VMPORT + +config XEN_EMU +bool +default y +depends on KVM && (I386 || X86_64) diff --git a/hw/xen/Kconfig b/hw/xen/Kconfig new file mode 100644 index 00..3467efb986 --- /dev/null +++ b/hw/xen/Kconfig @@ -0,0 +1,3 @@ +config XEN_BUS +bool +default y if (XEN || XEN_EMU) diff --git a/meson.build b/meson.build index 3f08bceba0..12071688cd 100644 --- a/meson.build +++ b/meson.build @@ -3853,6 +3853,7 @@ if have_system if xen.found() summary_info += {'xen ctrl version': xen.version()} endif + summary_info += {'Xen emulation': config_all.has_key('CONFIG_XEN_EMU')} endif summary_info += {'TCG support': config_all.has_key('CONFIG_TCG')} if config_all.has_key('CONFIG_TCG') -- 2.39.0
[PATCH v11 23/59] i386/xen: handle VCPUOP_register_runstate_memory_area
From: Joao Martins Allow the guest to set up the vCPU runstate area, which is used as a steal clock. Signed-off-by: Joao Martins Signed-off-by: David Woodhouse Reviewed-by: Paul Durrant --- target/i386/cpu.h | 1 + target/i386/kvm/xen-emu.c | 57 +++ target/i386/machine.c | 1 + 3 files changed, 59 insertions(+) diff --git a/target/i386/cpu.h b/target/i386/cpu.h index 96c2d0d5cb..bf44a87ddb 100644 --- a/target/i386/cpu.h +++ b/target/i386/cpu.h @@ -1791,6 +1791,7 @@ typedef struct CPUArchState { uint64_t xen_vcpu_info_gpa; uint64_t xen_vcpu_info_default_gpa; uint64_t xen_vcpu_time_info_gpa; +uint64_t xen_vcpu_runstate_gpa; #endif #if defined(CONFIG_HVF) HVFX86LazyFlags hvf_lflags; diff --git a/target/i386/kvm/xen-emu.c b/target/i386/kvm/xen-emu.c index 0b3bd0b889..f5c8b6d20c 100644 --- a/target/i386/kvm/xen-emu.c +++ b/target/i386/kvm/xen-emu.c @@ -160,6 +160,7 @@ int kvm_xen_init_vcpu(CPUState *cs) env->xen_vcpu_info_gpa = INVALID_GPA; env->xen_vcpu_info_default_gpa = INVALID_GPA; env->xen_vcpu_time_info_gpa = INVALID_GPA; +env->xen_vcpu_runstate_gpa = INVALID_GPA; return 0; } @@ -254,6 +255,17 @@ static void do_set_vcpu_time_info_gpa(CPUState *cs, run_on_cpu_data data) env->xen_vcpu_time_info_gpa); } +static void do_set_vcpu_runstate_gpa(CPUState *cs, run_on_cpu_data data) +{ +X86CPU *cpu = X86_CPU(cs); +CPUX86State *env = &cpu->env; + +env->xen_vcpu_runstate_gpa = data.host_ulong; + +kvm_xen_set_vcpu_attr(cs, KVM_XEN_VCPU_ATTR_TYPE_RUNSTATE_ADDR, + env->xen_vcpu_runstate_gpa); +} + static void do_vcpu_soft_reset(CPUState *cs, run_on_cpu_data data) { X86CPU *cpu = X86_CPU(cs); @@ -262,10 +274,14 @@ static void do_vcpu_soft_reset(CPUState *cs, run_on_cpu_data data) env->xen_vcpu_info_gpa = INVALID_GPA; env->xen_vcpu_info_default_gpa = INVALID_GPA; env->xen_vcpu_time_info_gpa = INVALID_GPA; +env->xen_vcpu_runstate_gpa = INVALID_GPA; kvm_xen_set_vcpu_attr(cs, KVM_XEN_VCPU_ATTR_TYPE_VCPU_INFO, INVALID_GPA); kvm_xen_set_vcpu_attr(cs, KVM_XEN_VCPU_ATTR_TYPE_VCPU_TIME_INFO,
INVALID_GPA); +kvm_xen_set_vcpu_attr(cs, KVM_XEN_VCPU_ATTR_TYPE_RUNSTATE_ADDR, + INVALID_GPA); + } static int xen_set_shared_info(uint64_t gfn) @@ -517,6 +533,35 @@ static int vcpuop_register_vcpu_time_info(CPUState *cs, CPUState *target, return 0; } +static int vcpuop_register_runstate_info(CPUState *cs, CPUState *target, + uint64_t arg) +{ +struct vcpu_register_runstate_memory_area rma; +uint64_t gpa; +size_t len; + +/* No need for 32/64 compat handling */ +qemu_build_assert(sizeof(rma) == 8); +/* The runstate area actually does change size, but Linux copes. */ + +if (!target) { +return -ENOENT; +} + +if (kvm_copy_from_gva(cs, arg, &rma, sizeof(rma))) { +return -EFAULT; +} + +/* As with vcpu_time_info, Xen actually uses the GVA but KVM doesn't. */ +if (!kvm_gva_to_gpa(cs, rma.addr.p, &gpa, &len, false)) { +return -EFAULT; +} + +async_run_on_cpu(target, do_set_vcpu_runstate_gpa, + RUN_ON_CPU_HOST_ULONG(gpa)); +return 0; +} + static bool kvm_xen_hcall_vcpu_op(struct kvm_xen_exit *exit, X86CPU *cpu, int cmd, int vcpu_id, uint64_t arg) { @@ -525,6 +570,9 @@ static bool kvm_xen_hcall_vcpu_op(struct kvm_xen_exit *exit, X86CPU *cpu, int err; switch (cmd) { +case VCPUOP_register_runstate_memory_area: +err = vcpuop_register_runstate_info(cs, dest, arg); +break; case VCPUOP_register_vcpu_time_memory_area: err = vcpuop_register_vcpu_time_info(cs, dest, arg); break; @@ -730,6 +778,15 @@ int kvm_put_xen_state(CPUState *cs) } } +gpa = env->xen_vcpu_runstate_gpa; +if (gpa != INVALID_GPA) { +ret = kvm_xen_set_vcpu_attr(cs, KVM_XEN_VCPU_ATTR_TYPE_RUNSTATE_ADDR, +gpa); +if (ret < 0) { +return ret; +} +} + return 0; } diff --git a/target/i386/machine.c b/target/i386/machine.c index eb657907ca..3f3d436aaa 100644 --- a/target/i386/machine.c +++ b/target/i386/machine.c @@ -1273,6 +1273,7 @@ static const VMStateDescription vmstate_xen_vcpu = { VMSTATE_UINT64(env.xen_vcpu_info_gpa, X86CPU), VMSTATE_UINT64(env.xen_vcpu_info_default_gpa, X86CPU), VMSTATE_UINT64(env.xen_vcpu_time_info_gpa, 
X86CPU), +VMSTATE_UINT64(env.xen_vcpu_runstate_gpa, X86CPU), VMSTATE_END_OF_LIST() } }; -- 2.39.0
[PATCH v11 29/59] hw/xen: Implement EVTCHNOP_status
From: David Woodhouse This adds the basic structure for maintaining the port table and reporting the status of ports therein. Signed-off-by: David Woodhouse Reviewed-by: Paul Durrant --- hw/i386/kvm/xen_evtchn.c | 104 ++ hw/i386/kvm/xen_evtchn.h | 3 ++ target/i386/kvm/xen-emu.c | 20 +++- 3 files changed, 125 insertions(+), 2 deletions(-) diff --git a/hw/i386/kvm/xen_evtchn.c b/hw/i386/kvm/xen_evtchn.c index 9d6f4076ad..8bed33890f 100644 --- a/hw/i386/kvm/xen_evtchn.c +++ b/hw/i386/kvm/xen_evtchn.c @@ -22,6 +22,7 @@ #include "hw/sysbus.h" #include "hw/xen/xen.h" #include "xen_evtchn.h" +#include "xen_overlay.h" #include "sysemu/kvm.h" #include "sysemu/kvm_xen.h" @@ -33,6 +34,22 @@ #define TYPE_XEN_EVTCHN "xen-evtchn" OBJECT_DECLARE_SIMPLE_TYPE(XenEvtchnState, XEN_EVTCHN) +typedef struct XenEvtchnPort { +uint32_t vcpu; /* Xen/ACPI vcpu_id */ +uint16_t type; /* EVTCHNSTAT_ */ +uint16_t type_val; /* pirq# / virq# / remote port according to type */ +} XenEvtchnPort; + +#define COMPAT_EVTCHN_2L_NR_CHANNELS 1024 + +/* + * For unbound/interdomain ports there are only two possible remote + * domains; self and QEMU. Use a single high bit in type_val for that, + * and the low bits for the remote port number (or 0 for unbound).
+ */ +#define PORT_INFO_TYPEVAL_REMOTE_QEMU 0x8000 +#define PORT_INFO_TYPEVAL_REMOTE_PORT_MASK 0x7FFF + struct XenEvtchnState { /*< private >*/ SysBusDevice busdev; @@ -42,6 +59,8 @@ struct XenEvtchnState { bool evtchn_in_kernel; QemuMutex port_lock; +uint32_t nr_ports; +XenEvtchnPort port_table[EVTCHN_2L_NR_CHANNELS]; }; struct XenEvtchnState *xen_evtchn_singleton; @@ -65,6 +84,18 @@ static bool xen_evtchn_is_needed(void *opaque) return xen_mode == XEN_EMULATE; } +static const VMStateDescription xen_evtchn_port_vmstate = { +.name = "xen_evtchn_port", +.version_id = 1, +.minimum_version_id = 1, +.fields = (VMStateField[]) { +VMSTATE_UINT32(vcpu, XenEvtchnPort), +VMSTATE_UINT16(type, XenEvtchnPort), +VMSTATE_UINT16(type_val, XenEvtchnPort), +VMSTATE_END_OF_LIST() +} +}; + static const VMStateDescription xen_evtchn_vmstate = { .name = "xen_evtchn", .version_id = 1, @@ -73,6 +104,9 @@ static const VMStateDescription xen_evtchn_vmstate = { .post_load = xen_evtchn_post_load, .fields = (VMStateField[]) { VMSTATE_UINT64(callback_param, XenEvtchnState), +VMSTATE_UINT32(nr_ports, XenEvtchnState), +VMSTATE_STRUCT_VARRAY_UINT32(port_table, XenEvtchnState, nr_ports, 1, + xen_evtchn_port_vmstate, XenEvtchnPort), VMSTATE_END_OF_LIST() } }; @@ -153,3 +187,73 @@ int xen_evtchn_set_callback_param(uint64_t param) return ret; } + +static bool valid_port(evtchn_port_t port) +{ +if (!port) { +return false; +} + +if (xen_is_long_mode()) { +return port < EVTCHN_2L_NR_CHANNELS; +} else { +return port < COMPAT_EVTCHN_2L_NR_CHANNELS; +} +} + +int xen_evtchn_status_op(struct evtchn_status *status) +{ +XenEvtchnState *s = xen_evtchn_singleton; +XenEvtchnPort *p; + +if (!s) { +return -ENOTSUP; +} + +if (status->dom != DOMID_SELF && status->dom != xen_domid) { +return -ESRCH; +} + +if (!valid_port(status->port)) { +return -EINVAL; +} + +qemu_mutex_lock(&s->port_lock); + +p = &s->port_table[status->port]; + +status->status = p->type; +status->vcpu = p->vcpu; + +switch (p->type) { +case 
EVTCHNSTAT_unbound: +if (p->type_val & PORT_INFO_TYPEVAL_REMOTE_QEMU) { +status->u.unbound.dom = DOMID_QEMU; +} else { +status->u.unbound.dom = xen_domid; +} +break; + +case EVTCHNSTAT_interdomain: +if (p->type_val & PORT_INFO_TYPEVAL_REMOTE_QEMU) { +status->u.interdomain.dom = DOMID_QEMU; +} else { +status->u.interdomain.dom = xen_domid; +} + +status->u.interdomain.port = p->type_val & +PORT_INFO_TYPEVAL_REMOTE_PORT_MASK; +break; + +case EVTCHNSTAT_pirq: +status->u.pirq = p->type_val; +break; + +case EVTCHNSTAT_virq: +status->u.virq = p->type_val; +break; +} + +qemu_mutex_unlock(&s->port_lock); +return 0; +} diff --git a/hw/i386/kvm/xen_evtchn.h b/hw/i386/kvm/xen_evtchn.h index c9b7f9d11f..76467636ee 100644 --- a/hw/i386/kvm/xen_evtchn.h +++ b/hw/i386/kvm/xen_evtchn.h @@ -15,4 +15,7 @@ void xen_evtchn_create(void); int xen_evtchn_set_callback_param(uint64_t param); +struct evtchn_status; +int xen_evtchn_status_op(struct evtchn_status *status); + #endif /* QEMU_XEN_EVTCHN_H */ diff --git a/target/i386/kvm/xen-emu.c b/target/i386/kvm/xen-emu.c index 4513f07c68..3811153724 100644 --- a/target/i386/kvm/xen-emu.c +++ b/targ
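The type_val encoding used for unbound and interdomain ports (one high bit for "remote is QEMU", fifteen low bits for the remote port) is simple enough to show with a pair of helpers. This is only a sketch; the TOY_* macros mirror the values in the patch, but the helper functions are hypothetical and do not exist in the series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Same bit layout as the PORT_INFO_TYPEVAL_* macros in the patch. */
#define TOY_REMOTE_QEMU       0x8000u
#define TOY_REMOTE_PORT_MASK  0x7FFFu

/* Pack the remote-domain flag and remote port into a 16-bit type_val. */
static uint16_t toy_pack_type_val(bool remote_is_qemu, uint16_t remote_port)
{
    assert(remote_port <= TOY_REMOTE_PORT_MASK);
    return (uint16_t)((remote_is_qemu ? TOY_REMOTE_QEMU : 0u) | remote_port);
}

static bool toy_remote_is_qemu(uint16_t type_val)
{
    return (type_val & TOY_REMOTE_QEMU) != 0;
}

static uint16_t toy_remote_port(uint16_t type_val)
{
    return type_val & TOY_REMOTE_PORT_MASK;
}
```

A remote port of 0 with the QEMU bit clear corresponds to a port that is unbound with the guest itself as the remote domain.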
[PATCH v11 40/59] hw/xen: Support HVM_PARAM_CALLBACK_TYPE_GSI callback
From: David Woodhouse

The GSI callback (and later PCI_INTX) is a level triggered interrupt. It
is asserted when an event channel is delivered to vCPU0, and is supposed
to be cleared when the vcpu_info->evtchn_upcall_pending field for vCPU0
is cleared again.

Thankfully, Xen does *not* assert the GSI if the guest sets its own
evtchn_upcall_pending field; we only need to assert the GSI when we
have delivered an event for ourselves. So that's the easy part, kind of.

There's a slight complexity in that we need to hold the BQL before we
can call qemu_set_irq(), and we definitely can't do that while holding
our own port_lock (because we'll need to take that from the qemu-side
functions that the PV backend drivers will call). So if we end up
wanting to set the IRQ in a context where we *don't* already hold the
BQL, defer to a BH.

However, we *do* need to poll for the evtchn_upcall_pending flag being
cleared. In an ideal world we would poll that when the EOI happens on
the PIC/IOAPIC. That's how it works in the kernel with the VFIO eventfd
pairs: one is used to trigger the interrupt, and the other works in the
other direction to 'resample' on EOI, and trigger the first eventfd
again if the line is still active.

However, QEMU doesn't seem to do that. Even VFIO level interrupts seem
to be supported by temporarily unmapping the device's BARs from the
guest when an interrupt happens, then trapping *all* MMIO to the device
and sending the 'resample' event on *every* MMIO access until the IRQ
is cleared!

Maybe in future we'll plumb the 'resample' concept through QEMU's irq
framework but for now we'll do what Xen itself does: just check the
flag on every vmexit if the upcall GSI is known to be asserted.

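To make the level-triggered behaviour described above concrete, here is a small state-machine sketch. It is only a model under stated assumptions: `struct gsi_model` and the `model_*()` functions are hypothetical names, the BQL is reduced to a boolean argument, and the bottom half is run by hand rather than by an event loop. It is not the code in this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model state; not the structures used by the patch. */
struct gsi_model {
    bool upcall_pending; /* stands in for vCPU0's evtchn_upcall_pending */
    bool gsi_asserted;   /* current level of the callback GSI */
    bool bh_scheduled;   /* a deferred level update is queued */
};

/* Deliver an event to vCPU0; have_bql says whether the caller holds the BQL. */
static inline void model_deliver(struct gsi_model *m, bool have_bql)
{
    assert(m != NULL);
    m->upcall_pending = true;
    if (have_bql) {
        m->gsi_asserted = true;  /* safe to drive the line directly */
    } else {
        m->bh_scheduled = true;  /* defer: the line cannot be touched here */
    }
}

/* Bottom half: runs later with the BQL held and samples the flag. */
static inline void model_run_bh(struct gsi_model *m)
{
    assert(m != NULL);
    if (m->bh_scheduled) {
        m->bh_scheduled = false;
        m->gsi_asserted = m->upcall_pending;
    }
}

/* Poll on every vmexit: drop the line once the guest has cleared the flag. */
static inline void model_vmexit(struct gsi_model *m)
{
    assert(m != NULL);
    if (m->gsi_asserted && !m->upcall_pending) {
        m->gsi_asserted = false;
    }
}
```

Note that in this model a guest setting `upcall_pending` by itself never raises the line; only `model_deliver()` does, which mirrors the Xen behaviour the message relies on.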
Signed-off-by: David Woodhouse Reviewed-by: Paul Durrant --- hw/i386/kvm/xen_evtchn.c | 97 +++ hw/i386/kvm/xen_evtchn.h | 4 ++ hw/i386/pc.c | 6 +++ include/sysemu/kvm_xen.h | 1 + target/i386/cpu.h | 1 + target/i386/kvm/kvm.c | 11 + target/i386/kvm/xen-emu.c | 40 target/i386/kvm/xen-emu.h | 1 + 8 files changed, 161 insertions(+) diff --git a/hw/i386/kvm/xen_evtchn.c b/hw/i386/kvm/xen_evtchn.c index fa54d185cd..ecc93da172 100644 --- a/hw/i386/kvm/xen_evtchn.c +++ b/hw/i386/kvm/xen_evtchn.c @@ -27,6 +27,8 @@ #include "hw/sysbus.h" #include "hw/xen/xen.h" +#include "hw/i386/x86.h" +#include "hw/irq.h" #include "xen_evtchn.h" #include "xen_overlay.h" @@ -100,9 +102,12 @@ struct XenEvtchnState { uint64_t callback_param; bool evtchn_in_kernel; +QEMUBH *gsi_bh; + QemuMutex port_lock; uint32_t nr_ports; XenEvtchnPort port_table[EVTCHN_2L_NR_CHANNELS]; +qemu_irq gsis[GSI_NUM_PINS]; }; struct XenEvtchnState *xen_evtchn_singleton; @@ -167,13 +172,42 @@ static const TypeInfo xen_evtchn_info = { .class_init= xen_evtchn_class_init, }; +static void gsi_assert_bh(void *opaque) +{ +struct vcpu_info *vi = kvm_xen_get_vcpu_info_hva(0); +if (vi) { +xen_evtchn_set_callback_level(!!vi->evtchn_upcall_pending); +} +} + void xen_evtchn_create(void) { XenEvtchnState *s = XEN_EVTCHN(sysbus_create_simple(TYPE_XEN_EVTCHN, -1, NULL)); +int i; + xen_evtchn_singleton = s; qemu_mutex_init(&s->port_lock); +s->gsi_bh = aio_bh_new(qemu_get_aio_context(), gsi_assert_bh, s); + +for (i = 0; i < GSI_NUM_PINS; i++) { +sysbus_init_irq(SYS_BUS_DEVICE(s), &s->gsis[i]); +} +} + +void xen_evtchn_connect_gsis(qemu_irq *system_gsis) +{ +XenEvtchnState *s = xen_evtchn_singleton; +int i; + +if (!s) { +return; +} + +for (i = 0; i < GSI_NUM_PINS; i++) { +sysbus_connect_irq(SYS_BUS_DEVICE(s), i, system_gsis[i]); +} } static void xen_evtchn_register_types(void) @@ -183,6 +217,64 @@ static void xen_evtchn_register_types(void) type_init(xen_evtchn_register_types) +void xen_evtchn_set_callback_level(int level) +{ 
+XenEvtchnState *s = xen_evtchn_singleton; +uint32_t param; + +if (!s) { +return; +} + +/* + * We get to this function in a number of ways: + * + * • From I/O context, via PV backend drivers sending a notification to + *the guest. + * + * • From guest vCPU context, via loopback interdomain event channels + *(or theoretically even IPIs but guests don't use those with GSI + *delivery because that's pointless. We don't want a malicious guest + *to be able to trigger a deadlock though, so we can't rule it out.) + * + * • From guest vCPU context when the HVM_PARAM_CALLBACK_IRQ is being + *configured. + * + * • From guest vCPU context in the KVM exit handler, if the upcall + *pending flag has been cleared and the GSI needs to be deasserted. + * + * • Maybe in future, in an interrupt ack/eoi notifier when the GSI has +
[PATCH v11 33/59] hw/xen: Implement EVTCHNOP_bind_ipi
From: David Woodhouse Signed-off-by: David Woodhouse Reviewed-by: Paul Durrant --- hw/i386/kvm/xen_evtchn.c | 69 +++ hw/i386/kvm/xen_evtchn.h | 2 ++ target/i386/kvm/xen-emu.c | 15 + 3 files changed, 86 insertions(+) diff --git a/hw/i386/kvm/xen_evtchn.c b/hw/i386/kvm/xen_evtchn.c index da2f5711dd..d8527483b9 100644 --- a/hw/i386/kvm/xen_evtchn.c +++ b/hw/i386/kvm/xen_evtchn.c @@ -13,6 +13,7 @@ #include "qemu/host-utils.h" #include "qemu/module.h" #include "qemu/main-loop.h" +#include "qemu/log.h" #include "qapi/error.h" #include "qom/object.h" #include "exec/target_page.h" @@ -231,6 +232,43 @@ static void inject_callback(XenEvtchnState *s, uint32_t vcpu) kvm_xen_inject_vcpu_callback_vector(vcpu, type); } +static void deassign_kernel_port(evtchn_port_t port) +{ +struct kvm_xen_hvm_attr ha; +int ret; + +ha.type = KVM_XEN_ATTR_TYPE_EVTCHN; +ha.u.evtchn.send_port = port; +ha.u.evtchn.flags = KVM_XEN_EVTCHN_DEASSIGN; + +ret = kvm_vm_ioctl(kvm_state, KVM_XEN_HVM_SET_ATTR, &ha); +if (ret) { +qemu_log_mask(LOG_GUEST_ERROR, "Failed to unbind kernel port %d: %s\n", + port, strerror(-ret)); +} +} + +static int assign_kernel_port(uint16_t type, evtchn_port_t port, + uint32_t vcpu_id) +{ +CPUState *cpu = qemu_get_cpu(vcpu_id); +struct kvm_xen_hvm_attr ha; + +if (!cpu) { +return -ENOENT; +} + +ha.type = KVM_XEN_ATTR_TYPE_EVTCHN; +ha.u.evtchn.send_port = port; +ha.u.evtchn.type = type; +ha.u.evtchn.flags = 0; +ha.u.evtchn.deliver.port.port = port; +ha.u.evtchn.deliver.port.vcpu = kvm_arch_vcpu_id(cpu); +ha.u.evtchn.deliver.port.priority = KVM_IRQ_ROUTING_XEN_EVTCHN_PRIO_2LEVEL; + +return kvm_vm_ioctl(kvm_state, KVM_XEN_HVM_SET_ATTR, &ha); +} + static bool valid_port(evtchn_port_t port) { if (!port) { @@ -549,6 +587,12 @@ static int close_port(XenEvtchnState *s, evtchn_port_t port) p->type_val, 0); break; +case EVTCHNSTAT_ipi: +if (s->evtchn_in_kernel) { +deassign_kernel_port(port); +} +break; + default: break; } @@ -638,3 +682,28 @@ int xen_evtchn_bind_virq_op(struct 
evtchn_bind_virq *virq) return ret; } + +int xen_evtchn_bind_ipi_op(struct evtchn_bind_ipi *ipi) +{ +XenEvtchnState *s = xen_evtchn_singleton; +int ret; + +if (!s) { +return -ENOTSUP; +} + +if (!valid_vcpu(ipi->vcpu)) { +return -ENOENT; +} + +qemu_mutex_lock(&s->port_lock); + +ret = allocate_port(s, ipi->vcpu, EVTCHNSTAT_ipi, 0, &ipi->port); +if (!ret && s->evtchn_in_kernel) { +assign_kernel_port(EVTCHNSTAT_ipi, ipi->port, ipi->vcpu); +} + +qemu_mutex_unlock(&s->port_lock); + +return ret; +} diff --git a/hw/i386/kvm/xen_evtchn.h b/hw/i386/kvm/xen_evtchn.h index 0ea13dda3a..107f420848 100644 --- a/hw/i386/kvm/xen_evtchn.h +++ b/hw/i386/kvm/xen_evtchn.h @@ -19,9 +19,11 @@ struct evtchn_status; struct evtchn_close; struct evtchn_unmask; struct evtchn_bind_virq; +struct evtchn_bind_ipi; int xen_evtchn_status_op(struct evtchn_status *status); int xen_evtchn_close_op(struct evtchn_close *close); int xen_evtchn_unmask_op(struct evtchn_unmask *unmask); int xen_evtchn_bind_virq_op(struct evtchn_bind_virq *virq); +int xen_evtchn_bind_ipi_op(struct evtchn_bind_ipi *ipi); #endif /* QEMU_XEN_EVTCHN_H */ diff --git a/target/i386/kvm/xen-emu.c b/target/i386/kvm/xen-emu.c index 0c4988ad63..4a20ccdf78 100644 --- a/target/i386/kvm/xen-emu.c +++ b/target/i386/kvm/xen-emu.c @@ -891,6 +891,21 @@ static bool kvm_xen_hcall_evtchn_op(struct kvm_xen_exit *exit, X86CPU *cpu, } break; } +case EVTCHNOP_bind_ipi: { +struct evtchn_bind_ipi ipi; + +qemu_build_assert(sizeof(ipi) == 8); +if (kvm_copy_from_gva(cs, arg, &ipi, sizeof(ipi))) { +err = -EFAULT; +break; +} + +err = xen_evtchn_bind_ipi_op(&ipi); +if (!err && kvm_copy_to_gva(cs, arg, &ipi, sizeof(ipi))) { +err = -EFAULT; +} +break; +} default: return false; } -- 2.39.0
[PATCH v11 06/59] i386/hvm: Set Xen vCPU ID in KVM
From: David Woodhouse There are (at least) three different vCPU ID number spaces. One is the internal KVM vCPU index, based purely on which vCPU was chronologically created in the kernel first. If userspace threads are all spawned and create their KVM vCPUs in essentially random order, then the KVM indices are basically random too. The second number space is the APIC ID space, which is consistent and useful for referencing vCPUs. MSIs will specify the target vCPU using the APIC ID, for example, and the KVM Xen APIs also take an APIC ID from userspace whenever a vCPU needs to be specified (as opposed to just using the appropriate vCPU fd). The third number space is not normally relevant to the kernel, and is the ACPI/MADT/Xen CPU number which corresponds to cs->cpu_index. But Xen timer hypercalls use it, and Xen timer hypercalls *really* want to be accelerated in the kernel rather than handled in userspace, so the kernel needs to be told. Signed-off-by: David Woodhouse Reviewed-by: Paul Durrant --- target/i386/kvm/kvm.c | 5 + target/i386/kvm/xen-emu.c | 28 target/i386/kvm/xen-emu.h | 1 + 3 files changed, 34 insertions(+) diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c index 2b3daabf7b..165fa5232d 100644 --- a/target/i386/kvm/kvm.c +++ b/target/i386/kvm/kvm.c @@ -1869,6 +1869,11 @@ int kvm_arch_init_vcpu(CPUState *cs) } } +r = kvm_xen_init_vcpu(cs); +if (r) { +return r; +} + kvm_base += 0x100; #else /* CONFIG_XEN_EMU */ /* This should never happen as kvm_arch_init() would have died first. */ diff --git a/target/i386/kvm/xen-emu.c b/target/i386/kvm/xen-emu.c index 34d5bc1bc9..4883b95d9d 100644 --- a/target/i386/kvm/xen-emu.c +++ b/target/i386/kvm/xen-emu.c @@ -52,6 +52,34 @@ int kvm_xen_init(KVMState *s, uint32_t hypercall_msr) return 0; } +int kvm_xen_init_vcpu(CPUState *cs) +{ +int err; + +/* + * The kernel needs to know the Xen/ACPI vCPU ID because that's + * what the guest uses in hypercalls such as timers. 
It doesn't + * match the APIC ID which is generally used for talking to the + * kernel about vCPUs. And if vCPU threads race with creating + * their KVM vCPUs out of order, it doesn't necessarily match + * with the kernel's internal vCPU indices either. + */ +if (kvm_xen_has_cap(EVTCHN_SEND)) { +struct kvm_xen_vcpu_attr va = { +.type = KVM_XEN_VCPU_ATTR_TYPE_VCPU_ID, +.u.vcpu_id = cs->cpu_index, +}; +err = kvm_vcpu_ioctl(cs, KVM_XEN_VCPU_SET_ATTR, &va); +if (err) { +error_report("kvm: Failed to set Xen vCPU ID attribute: %s", + strerror(-err)); +return err; +} +} + +return 0; +} + uint32_t kvm_xen_get_caps(void) { return kvm_state->xen_caps; diff --git a/target/i386/kvm/xen-emu.h b/target/i386/kvm/xen-emu.h index 2101df0182..d62f1d8ed8 100644 --- a/target/i386/kvm/xen-emu.h +++ b/target/i386/kvm/xen-emu.h @@ -24,5 +24,6 @@ #define XEN_VERSION(maj, min) ((maj) << 16 | (min)) int kvm_xen_init(KVMState *s, uint32_t hypercall_msr); +int kvm_xen_init_vcpu(CPUState *cs); #endif /* QEMU_I386_KVM_XEN_EMU_H */ -- 2.39.0
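The three vCPU numbering schemes described in this commit message are easy to mix up, so a small illustration may help. The sketch below is purely hypothetical: `struct vcpu_ids` and `find_by_xen_id()` are made-up names, and the table values are invented to show a case where all three numbers differ. It does not reflect how QEMU or KVM actually store these IDs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record tying together the three ID spaces for one vCPU. */
struct vcpu_ids {
    uint32_t kvm_index;   /* order in which the kernel created the vCPU */
    uint32_t apic_id;     /* used by MSIs and by the KVM Xen APIs */
    uint32_t xen_vcpu_id; /* ACPI/MADT/Xen number, cs->cpu_index in QEMU */
};

/* Look up a vCPU by the ID the guest uses in hypercalls; NULL if absent. */
static inline const struct vcpu_ids *
find_by_xen_id(const struct vcpu_ids *table, size_t n, uint32_t xen_vcpu_id)
{
    assert(table != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (table[i].xen_vcpu_id == xen_vcpu_id) {
            return &table[i];
        }
    }
    return NULL;
}
```

For example, if the thread for Xen vCPU 1 happens to create its KVM vCPU first, it gets KVM index 0, and a kernel-side timer hypercall handler that only knew the KVM index would target the wrong vCPU. That is the gap `KVM_XEN_VCPU_ATTR_TYPE_VCPU_ID` is meant to close.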
[PATCH v11 51/59] hw/xen: Add xen_xenstore device for xenstore emulation
From: David Woodhouse Just the basic shell, with the event channel hookup. It only dumps the buffer for now; a real ring implementation will come in a subsequent patch. Signed-off-by: David Woodhouse Reviewed-by: Paul Durrant --- hw/i386/kvm/meson.build| 1 + hw/i386/kvm/xen_evtchn.c | 1 + hw/i386/kvm/xen_xenstore.c | 248 + hw/i386/kvm/xen_xenstore.h | 20 +++ hw/i386/pc.c | 2 + target/i386/kvm/xen-emu.c | 12 ++ 6 files changed, 284 insertions(+) create mode 100644 hw/i386/kvm/xen_xenstore.c create mode 100644 hw/i386/kvm/xen_xenstore.h diff --git a/hw/i386/kvm/meson.build b/hw/i386/kvm/meson.build index e02449e4d4..6d6981fced 100644 --- a/hw/i386/kvm/meson.build +++ b/hw/i386/kvm/meson.build @@ -8,6 +8,7 @@ i386_kvm_ss.add(when: 'CONFIG_XEN_EMU', if_true: files( 'xen_overlay.c', 'xen_evtchn.c', 'xen_gnttab.c', + 'xen_xenstore.c', )) i386_ss.add_all(when: 'CONFIG_KVM', if_true: i386_kvm_ss) diff --git a/hw/i386/kvm/xen_evtchn.c b/hw/i386/kvm/xen_evtchn.c index 519b8e0600..7412139154 100644 --- a/hw/i386/kvm/xen_evtchn.c +++ b/hw/i386/kvm/xen_evtchn.c @@ -34,6 +34,7 @@ #include "xen_evtchn.h" #include "xen_overlay.h" +#include "xen_xenstore.h" #include "sysemu/kvm.h" #include "sysemu/kvm_xen.h" diff --git a/hw/i386/kvm/xen_xenstore.c b/hw/i386/kvm/xen_xenstore.c new file mode 100644 index 00..702f417633 --- /dev/null +++ b/hw/i386/kvm/xen_xenstore.c @@ -0,0 +1,248 @@ +/* + * QEMU Xen emulation: Shared/overlay pages support + * + * Copyright © 2022 Amazon.com, Inc. or its affiliates. All Rights Reserved. + * + * Authors: David Woodhouse + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. 
+ */ + +#include "qemu/osdep.h" + +#include "qemu/host-utils.h" +#include "qemu/module.h" +#include "qemu/main-loop.h" +#include "qemu/cutils.h" +#include "qapi/error.h" +#include "qom/object.h" +#include "migration/vmstate.h" + +#include "hw/sysbus.h" +#include "hw/xen/xen.h" +#include "xen_overlay.h" +#include "xen_evtchn.h" +#include "xen_xenstore.h" + +#include "sysemu/kvm.h" +#include "sysemu/kvm_xen.h" + +#include "hw/xen/interface/io/xs_wire.h" +#include "hw/xen/interface/event_channel.h" + +#define TYPE_XEN_XENSTORE "xen-xenstore" +OBJECT_DECLARE_SIMPLE_TYPE(XenXenstoreState, XEN_XENSTORE) + +#define XEN_PAGE_SHIFT 12 +#define XEN_PAGE_SIZE (1ULL << XEN_PAGE_SHIFT) + +#define ENTRIES_PER_FRAME_V1 (XEN_PAGE_SIZE / sizeof(grant_entry_v1_t)) +#define ENTRIES_PER_FRAME_V2 (XEN_PAGE_SIZE / sizeof(grant_entry_v2_t)) + +#define XENSTORE_HEADER_SIZE ((unsigned int)sizeof(struct xsd_sockmsg)) + +struct XenXenstoreState { +/*< private >*/ +SysBusDevice busdev; +/*< public >*/ + +MemoryRegion xenstore_page; +struct xenstore_domain_interface *xs; +uint8_t req_data[XENSTORE_HEADER_SIZE + XENSTORE_PAYLOAD_MAX]; +uint8_t rsp_data[XENSTORE_HEADER_SIZE + XENSTORE_PAYLOAD_MAX]; +uint32_t req_offset; +uint32_t rsp_offset; +bool rsp_pending; +bool fatal_error; + +evtchn_port_t guest_port; +evtchn_port_t be_port; +struct xenevtchn_handle *eh; +}; + +struct XenXenstoreState *xen_xenstore_singleton; + +static void xen_xenstore_event(void *opaque); + +static void xen_xenstore_realize(DeviceState *dev, Error **errp) +{ +XenXenstoreState *s = XEN_XENSTORE(dev); + +if (xen_mode != XEN_EMULATE) { +error_setg(errp, "Xen xenstore support is for Xen emulation"); +return; +} +memory_region_init_ram(&s->xenstore_page, OBJECT(dev), "xen:xenstore_page", + XEN_PAGE_SIZE, &error_abort); +memory_region_set_enabled(&s->xenstore_page, true); +s->xs = memory_region_get_ram_ptr(&s->xenstore_page); +memset(s->xs, 0, XEN_PAGE_SIZE); + +/* We can't map it this early as KVM isn't ready */ 
+xen_xenstore_singleton = s; + +s->eh = xen_be_evtchn_open(); +if (!s->eh) { +error_setg(errp, "Xenstore evtchn port init failed"); +return; +} +aio_set_fd_handler(qemu_get_aio_context(), xen_be_evtchn_fd(s->eh), true, + xen_xenstore_event, NULL, NULL, NULL, s); +} + +static bool xen_xenstore_is_needed(void *opaque) +{ +return xen_mode == XEN_EMULATE; +} + +static int xen_xenstore_pre_save(void *opaque) +{ +XenXenstoreState *s = opaque; + +if (s->eh) { +s->guest_port = xen_be_evtchn_get_guest_port(s->eh); +} +return 0; +} + +static int xen_xenstore_post_load(void *opaque, int ver) +{ +XenXenstoreState *s = opaque; + +/* + * As qemu/dom0, rebind to the guest's port. The Windows drivers may + * unbind the XenStore evtchn and rebind to it, having obtained the + * "remote" port through EVTCHNOP_status. In the case that migration + * occurs while it's unbound, the "remote" port needs to be the same + * as before so that the guest can find it, but should
[PATCH v11 39/59] i386/xen: add monitor commands to test event injection
From: Joao Martins Specifically add listing, injection of event channels. Signed-off-by: Joao Martins Signed-off-by: David Woodhouse Acked-by: Dr. David Alan Gilbert Reviewed-by: Paul Durrant --- hmp-commands.hx | 29 + hw/i386/kvm/xen_evtchn.c | 137 +++ include/monitor/hmp.h| 2 + qapi/misc-target.json| 116 + 4 files changed, 284 insertions(+) diff --git a/hmp-commands.hx b/hmp-commands.hx index fbb5daf09b..b87c250e23 100644 --- a/hmp-commands.hx +++ b/hmp-commands.hx @@ -1815,3 +1815,32 @@ SRST Dump the FDT in dtb format to *filename*. ERST #endif + +#if defined(CONFIG_XEN_EMU) +{ +.name = "xen-event-inject", +.args_type = "port:i", +.params = "port", +.help = "inject event channel", +.cmd= hmp_xen_event_inject, +}, + +SRST +``xen-event-inject`` *port* + Notify guest via event channel on port *port*. +ERST + + +{ +.name = "xen-event-list", +.args_type = "", +.params = "", +.help = "list event channel state", +.cmd= hmp_xen_event_list, +}, + +SRST +``xen-event-list`` + List event channels in the guest +ERST +#endif diff --git a/hw/i386/kvm/xen_evtchn.c b/hw/i386/kvm/xen_evtchn.c index 9b1fb47e85..fa54d185cd 100644 --- a/hw/i386/kvm/xen_evtchn.c +++ b/hw/i386/kvm/xen_evtchn.c @@ -15,7 +15,11 @@ #include "qemu/lockable.h" #include "qemu/main-loop.h" #include "qemu/log.h" +#include "monitor/monitor.h" +#include "monitor/hmp.h" #include "qapi/error.h" +#include "qapi/qapi-commands-misc-target.h" +#include "qapi/qmp/qdict.h" #include "qom/object.h" #include "exec/target_page.h" #include "exec/address-spaces.h" @@ -1067,3 +1071,136 @@ int xen_evtchn_send_op(struct evtchn_send *send) return ret; } +EvtchnInfoList *qmp_xen_event_list(Error **errp) +{ +XenEvtchnState *s = xen_evtchn_singleton; +EvtchnInfoList *head = NULL, **tail = &head; +void *shinfo, *pending, *mask; +int i; + +if (!s) { +error_setg(errp, "Xen event channel emulation not enabled"); +return NULL; +} + +shinfo = xen_overlay_get_shinfo_ptr(); +if (!shinfo) { +error_setg(errp, "Xen shared info page not 
allocated"); +return NULL; +} + +if (xen_is_long_mode()) { +pending = shinfo + offsetof(struct shared_info, evtchn_pending); +mask = shinfo + offsetof(struct shared_info, evtchn_mask); +} else { +pending = shinfo + offsetof(struct compat_shared_info, evtchn_pending); +mask = shinfo + offsetof(struct compat_shared_info, evtchn_mask); +} + +QEMU_LOCK_GUARD(&s->port_lock); + +for (i = 0; i < s->nr_ports; i++) { +XenEvtchnPort *p = &s->port_table[i]; +EvtchnInfo *info; + +if (p->type == EVTCHNSTAT_closed) { +continue; +} + +info = g_new0(EvtchnInfo, 1); + +info->port = i; +qemu_build_assert(EVTCHN_PORT_TYPE_CLOSED == EVTCHNSTAT_closed); +qemu_build_assert(EVTCHN_PORT_TYPE_UNBOUND == EVTCHNSTAT_unbound); +qemu_build_assert(EVTCHN_PORT_TYPE_INTERDOMAIN == EVTCHNSTAT_interdomain); +qemu_build_assert(EVTCHN_PORT_TYPE_PIRQ == EVTCHNSTAT_pirq); +qemu_build_assert(EVTCHN_PORT_TYPE_VIRQ == EVTCHNSTAT_virq); +qemu_build_assert(EVTCHN_PORT_TYPE_IPI == EVTCHNSTAT_ipi); + +info->type = p->type; +if (p->type == EVTCHNSTAT_interdomain) { +info->remote_domain = g_strdup((p->type_val & PORT_INFO_TYPEVAL_REMOTE_QEMU) ? 
+ "qemu" : "loopback"); +info->target = p->type_val & PORT_INFO_TYPEVAL_REMOTE_PORT_MASK; +} else { +info->target = p->type_val; +} +info->vcpu = p->vcpu; +info->pending = test_bit(i, pending); +info->masked = test_bit(i, mask); + +QAPI_LIST_APPEND(tail, info); +} + +return head; +} + +void qmp_xen_event_inject(uint32_t port, Error **errp) +{ +XenEvtchnState *s = xen_evtchn_singleton; + +if (!s) { +error_setg(errp, "Xen event channel emulation not enabled"); +return; +} + +if (!valid_port(port)) { +error_setg(errp, "Invalid port %u", port); +return; +} + +QEMU_LOCK_GUARD(&s->port_lock); + +if (set_port_pending(s, port)) { +error_setg(errp, "Failed to set port %u", port); +return; +} +} + +void hmp_xen_event_list(Monitor *mon, const QDict *qdict) +{ +EvtchnInfoList *iter, *info_list; +Error *err = NULL; + +info_list = qmp_xen_event_list(&err); +if (err) { +hmp_handle_error(mon, err); +return; +} + +for (iter = info_list; iter; iter = iter->next) { +EvtchnInfo *info = iter->value; + +monitor_printf(mon, "port %4lu: vcpu: %ld %s", info->port, info->vcpu, + EvtchnPortType_str(inf
[PATCH v11 44/59] hw/xen: Support mapping grant frames
From: David Woodhouse Signed-off-by: David Woodhouse --- hw/i386/kvm/xen_gnttab.c | 73 ++- hw/i386/kvm/xen_overlay.c | 2 +- hw/i386/kvm/xen_overlay.h | 2 ++ 3 files changed, 75 insertions(+), 2 deletions(-) diff --git a/hw/i386/kvm/xen_gnttab.c b/hw/i386/kvm/xen_gnttab.c index ef8857e50c..72e87aea6a 100644 --- a/hw/i386/kvm/xen_gnttab.c +++ b/hw/i386/kvm/xen_gnttab.c @@ -37,13 +37,26 @@ OBJECT_DECLARE_SIMPLE_TYPE(XenGnttabState, XEN_GNTTAB) #define XEN_PAGE_SHIFT 12 #define XEN_PAGE_SIZE (1ULL << XEN_PAGE_SHIFT) +#define ENTRIES_PER_FRAME_V1 (XEN_PAGE_SIZE / sizeof(grant_entry_v1_t)) + struct XenGnttabState { /*< private >*/ SysBusDevice busdev; /*< public >*/ +QemuMutex gnt_lock; + uint32_t nr_frames; uint32_t max_frames; + +union { +grant_entry_v1_t *v1; +/* Theoretically, v2 support could be added here. */ +} entries; + +MemoryRegion gnt_frames; +MemoryRegion *gnt_aliases; +uint64_t *gnt_frame_gpas; }; struct XenGnttabState *xen_gnttab_singleton; @@ -51,6 +64,7 @@ struct XenGnttabState *xen_gnttab_singleton; static void xen_gnttab_realize(DeviceState *dev, Error **errp) { XenGnttabState *s = XEN_GNTTAB(dev); +int i; if (xen_mode != XEN_EMULATE) { error_setg(errp, "Xen grant table support is for Xen emulation"); @@ -58,6 +72,38 @@ static void xen_gnttab_realize(DeviceState *dev, Error **errp) } s->nr_frames = 0; s->max_frames = kvm_xen_get_gnttab_max_frames(); +memory_region_init_ram(&s->gnt_frames, OBJECT(dev), "xen:grant_table", + XEN_PAGE_SIZE * s->max_frames, &error_abort); +memory_region_set_enabled(&s->gnt_frames, true); +s->entries.v1 = memory_region_get_ram_ptr(&s->gnt_frames); +memset(s->entries.v1, 0, XEN_PAGE_SIZE * s->max_frames); + +/* Create individual page-sizes aliases for overlays */ +s->gnt_aliases = (void *)g_new0(MemoryRegion, s->max_frames); +s->gnt_frame_gpas = (void *)g_new(uint64_t, s->max_frames); +for (i = 0; i < s->max_frames; i++) { +memory_region_init_alias(&s->gnt_aliases[i], OBJECT(dev), + NULL, &s->gnt_frames, + i * XEN_PAGE_SIZE, 
XEN_PAGE_SIZE); +s->gnt_frame_gpas[i] = INVALID_GPA; +} + +qemu_mutex_init(&s->gnt_lock); + +xen_gnttab_singleton = s; +} + +static int xen_gnttab_post_load(void *opaque, int version_id) +{ +XenGnttabState *s = XEN_GNTTAB(opaque); +uint32_t i; + +for (i = 0; i < s->nr_frames; i++) { +if (s->gnt_frame_gpas[i] != INVALID_GPA) { +xen_overlay_do_map_page(&s->gnt_aliases[i], s->gnt_frame_gpas[i]); +} +} +return 0; } static bool xen_gnttab_is_needed(void *opaque) @@ -70,8 +116,11 @@ static const VMStateDescription xen_gnttab_vmstate = { .version_id = 1, .minimum_version_id = 1, .needed = xen_gnttab_is_needed, +.post_load = xen_gnttab_post_load, .fields = (VMStateField[]) { VMSTATE_UINT32(nr_frames, XenGnttabState), +VMSTATE_VARRAY_UINT32(gnt_frame_gpas, XenGnttabState, nr_frames, 0, + vmstate_info_uint64, uint64_t), VMSTATE_END_OF_LIST() } }; @@ -106,6 +155,28 @@ type_init(xen_gnttab_register_types) int xen_gnttab_map_page(uint64_t idx, uint64_t gfn) { -return -ENOSYS; +XenGnttabState *s = xen_gnttab_singleton; +uint64_t gpa = gfn << XEN_PAGE_SHIFT; + +if (!s) { +return -ENOTSUP; +} + +if (idx >= s->max_frames) { +return -EINVAL; +} + +QEMU_IOTHREAD_LOCK_GUARD(); +QEMU_LOCK_GUARD(&s->gnt_lock); + +xen_overlay_do_map_page(&s->gnt_aliases[idx], gpa); + +s->gnt_frame_gpas[idx] = gpa; + +if (s->nr_frames <= idx) { +s->nr_frames = idx + 1; +} + +return 0; } diff --git a/hw/i386/kvm/xen_overlay.c b/hw/i386/kvm/xen_overlay.c index 8685d87959..39fda1b72c 100644 --- a/hw/i386/kvm/xen_overlay.c +++ b/hw/i386/kvm/xen_overlay.c @@ -49,7 +49,7 @@ struct XenOverlayState { struct XenOverlayState *xen_overlay_singleton; -static void xen_overlay_do_map_page(MemoryRegion *page, uint64_t gpa) +void xen_overlay_do_map_page(MemoryRegion *page, uint64_t gpa) { /* * Xen allows guests to map the same page as many times as it likes diff --git a/hw/i386/kvm/xen_overlay.h b/hw/i386/kvm/xen_overlay.h index 5c46a0b036..75ecb6b359 100644 --- a/hw/i386/kvm/xen_overlay.h +++ b/hw/i386/kvm/xen_overlay.h 
@@ -21,4 +21,6 @@ int xen_sync_long_mode(void); int xen_set_long_mode(bool long_mode); bool xen_is_long_mode(void); +void xen_overlay_do_map_page(MemoryRegion *page, uint64_t gpa); + #endif /* QEMU_XEN_OVERLAY_H */ -- 2.39.0
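The bookkeeping in `xen_gnttab_map_page()` is compact enough that a standalone model may help when reviewing it. The sketch below is an illustration under assumptions, not the patch's code: `struct gnttab_model`, `model_init()` and `model_map_page()` are hypothetical names, `INVALID_GPA` is assumed to be an all-ones sentinel, the frame limit is fixed at the series' default of 64, and the real overlay mapping and locking are left out entirely.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define XEN_PAGE_SHIFT 12
#define XEN_PAGE_SIZE  (1ULL << XEN_PAGE_SHIFT)
#define INVALID_GPA    UINT64_MAX /* assumed sentinel for "not mapped" */
#define MAX_FRAMES     64         /* default xen_gnttab_max_frames here */

/* Stand-in with the same 8-byte layout as Xen's grant_entry_v1_t. */
struct grant_entry_v1_sketch {
    uint16_t flags;
    uint16_t domid;
    uint32_t frame;
};

#define ENTRIES_PER_FRAME_V1 \
    (XEN_PAGE_SIZE / sizeof(struct grant_entry_v1_sketch))

/* Hypothetical model of the migrated part of XenGnttabState. */
struct gnttab_model {
    uint32_t nr_frames;              /* high-water mark of mapped frames */
    uint64_t frame_gpas[MAX_FRAMES]; /* guest physical address per frame */
};

static inline void model_init(struct gnttab_model *s)
{
    assert(s != NULL);
    s->nr_frames = 0;
    for (uint32_t i = 0; i < MAX_FRAMES; i++) {
        s->frame_gpas[i] = INVALID_GPA;
    }
}

/* Record that grant frame idx is mapped at guest frame number gfn. */
static inline int model_map_page(struct gnttab_model *s, uint64_t idx,
                                 uint64_t gfn)
{
    assert(s != NULL);
    if (idx >= MAX_FRAMES) {
        return -EINVAL;
    }
    s->frame_gpas[idx] = gfn << XEN_PAGE_SHIFT;
    if (s->nr_frames <= idx) {
        s->nr_frames = (uint32_t)idx + 1;
    }
    return 0;
}
```

Because `nr_frames` only ever grows to one past the highest index mapped so far, frames below it may still hold `INVALID_GPA`, which is why the `post_load` hook in the patch checks each entry before remapping it.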
[PATCH v11 26/59] i386/xen: implement HVMOP_set_param
From: Ankur Arora This is the hook for adding the HVM_PARAM_CALLBACK_IRQ parameter in a subsequent commit. Signed-off-by: Ankur Arora Signed-off-by: Joao Martins [dwmw2: Split out from another commit] Signed-off-by: David Woodhouse Reviewed-by: Paul Durrant --- target/i386/kvm/xen-emu.c | 33 + 1 file changed, 33 insertions(+) diff --git a/target/i386/kvm/xen-emu.c b/target/i386/kvm/xen-emu.c index 55dc2ac012..67c5832d09 100644 --- a/target/i386/kvm/xen-emu.c +++ b/target/i386/kvm/xen-emu.c @@ -489,6 +489,36 @@ static bool kvm_xen_hcall_memory_op(struct kvm_xen_exit *exit, X86CPU *cpu, return true; } +static bool handle_set_param(struct kvm_xen_exit *exit, X86CPU *cpu, + uint64_t arg) +{ +CPUState *cs = CPU(cpu); +struct xen_hvm_param hp; +int err = 0; + +/* No need for 32/64 compat handling */ +qemu_build_assert(sizeof(hp) == 16); + +if (kvm_copy_from_gva(cs, arg, &hp, sizeof(hp))) { +err = -EFAULT; +goto out; +} + +if (hp.domid != DOMID_SELF && hp.domid != xen_domid) { +err = -ESRCH; +goto out; +} + +switch (hp.index) { +default: +return false; +} + +out: +exit->u.hcall.result = err; +return true; +} + static int kvm_xen_hcall_evtchn_upcall_vector(struct kvm_xen_exit *exit, X86CPU *cpu, uint64_t arg) { @@ -530,6 +560,9 @@ static bool kvm_xen_hcall_hvm_op(struct kvm_xen_exit *exit, X86CPU *cpu, ret = -ENOSYS; break; +case HVMOP_set_param: +return handle_set_param(exit, cpu, arg); + default: return false; } -- 2.39.0
[PATCH v11 58/59] kvm/i386: Add xen-evtchn-max-pirq property
From: David Woodhouse The default number of PIRQs is set to 256 to avoid issues with 32-bit MSI devices. Allow it to be increased if the user desires. Signed-off-by: David Woodhouse Reviewed-by: Paul Durrant --- accel/kvm/kvm-all.c | 1 + hw/i386/kvm/xen_evtchn.c | 21 +++-- include/sysemu/kvm_int.h | 1 + include/sysemu/kvm_xen.h | 1 + target/i386/kvm/kvm.c | 34 ++ target/i386/kvm/xen-emu.c | 6 ++ 6 files changed, 54 insertions(+), 10 deletions(-) diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c index dc5b0bb434..3b7881e949 100644 --- a/accel/kvm/kvm-all.c +++ b/accel/kvm/kvm-all.c @@ -3705,6 +3705,7 @@ static void kvm_accel_instance_init(Object *obj) s->notify_window = 0; s->xen_version = 0; s->xen_gnttab_max_frames = 64; +s->xen_evtchn_max_pirq = 256; } /** diff --git a/hw/i386/kvm/xen_evtchn.c b/hw/i386/kvm/xen_evtchn.c index 4ec0c7af75..3f60461e5c 100644 --- a/hw/i386/kvm/xen_evtchn.c +++ b/hw/i386/kvm/xen_evtchn.c @@ -302,17 +302,18 @@ void xen_evtchn_create(void) } /* - * We could parameterise the number of PIRQs available if needed, - * but for now limit it to 256. The Xen scheme for encoding PIRQ# - * into an MSI message is not compatible with 32-bit MSI, as it - * puts the high bits of the PIRQ# into the high bits of the MSI - * message address, instead of using the Extended Destination ID - * in address bits 4-11 which perhaps would have been a better - * choice. So to keep life simple, just stick with 256 as the - * default, which conveniently doesn't need to set anything - * outside the low 32 bits of the address. + * The Xen scheme for encoding PIRQ# into an MSI message is not + * compatible with 32-bit MSI, as it puts the high bits of the + * PIRQ# into the high bits of the MSI message address, instead of + * using the Extended Destination ID in address bits 4-11 which + * perhaps would have been a better choice. + * + * To keep life simple, kvm_accel_instance_init() initialises the + * default to 256, 
which conveniently doesn't need to set anything + * outside the low 32 bits of the address. It can be increased by + * setting the xen-evtchn-max-pirq property. */ -s->nr_pirqs = 256; +s->nr_pirqs = kvm_xen_get_evtchn_max_pirq(); s->nr_pirq_inuse_words = DIV_ROUND_UP(s->nr_pirqs, 64); s->pirq_inuse_bitmap = g_new0(uint64_t, s->nr_pirq_inuse_words); diff --git a/include/sysemu/kvm_int.h b/include/sysemu/kvm_int.h index 39ce4d36f6..a641c974ea 100644 --- a/include/sysemu/kvm_int.h +++ b/include/sysemu/kvm_int.h @@ -121,6 +121,7 @@ struct KVMState uint32_t xen_version; uint32_t xen_caps; uint16_t xen_gnttab_max_frames; +uint16_t xen_evtchn_max_pirq; }; void kvm_memory_listener_register(KVMState *s, KVMMemoryListener *kml, diff --git a/include/sysemu/kvm_xen.h b/include/sysemu/kvm_xen.h index 0b63bb81df..400aaa1490 100644 --- a/include/sysemu/kvm_xen.h +++ b/include/sysemu/kvm_xen.h @@ -26,6 +26,7 @@ void kvm_xen_inject_vcpu_callback_vector(uint32_t vcpu_id, int type); void kvm_xen_set_callback_asserted(void); int kvm_xen_set_vcpu_virq(uint32_t vcpu_id, uint16_t virq, uint16_t port); uint16_t kvm_xen_get_gnttab_max_frames(void); +uint16_t kvm_xen_get_evtchn_max_pirq(void); #define kvm_xen_has_cap(cap) (!!(kvm_xen_get_caps() & \ KVM_XEN_HVM_CONFIG_ ## cap)) diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c index b497225fbd..4decd2559b 100644 --- a/target/i386/kvm/kvm.c +++ b/target/i386/kvm/kvm.c @@ -5907,6 +5907,33 @@ static void kvm_arch_set_xen_gnttab_max_frames(Object *obj, Visitor *v, s->xen_gnttab_max_frames = value; } +static void kvm_arch_get_xen_evtchn_max_pirq(Object *obj, Visitor *v, + const char *name, void *opaque, + Error **errp) +{ +KVMState *s = KVM_STATE(obj); +uint16_t value = s->xen_evtchn_max_pirq; + +visit_type_uint16(v, name, &value, errp); +} + +static void kvm_arch_set_xen_evtchn_max_pirq(Object *obj, Visitor *v, + const char *name, void *opaque, + Error **errp) +{ +KVMState *s = KVM_STATE(obj); +Error *error = NULL; +uint16_t value; + 
+visit_type_uint16(v, name, &value, &error); +if (error) { +error_propagate(errp, error); +return; +} + +s->xen_evtchn_max_pirq = value; +} + void kvm_arch_accel_class_init(ObjectClass *oc) { object_class_property_add_enum(oc, "notify-vmexit", "NotifyVMexitOption", @@ -5939,6 +5966,13 @@ void kvm_arch_accel_class_init(ObjectClass *oc) NULL, NULL); object_class_property_set_description(oc, "xen-gnttab-max-frames", "Maximum nu
[PATCH v11 45/59] i386/xen: Implement HYPERVISOR_grant_table_op and GNTTABOP_[gs]et_version
From: David Woodhouse Signed-off-by: David Woodhouse Reviewed-by: Paul Durrant --- hw/i386/kvm/xen_gnttab.c | 31 hw/i386/kvm/xen_gnttab.h | 5 target/i386/kvm/xen-emu.c | 60 +++ 3 files changed, 96 insertions(+) diff --git a/hw/i386/kvm/xen_gnttab.c b/hw/i386/kvm/xen_gnttab.c index 72e87aea6a..b54a94e2bd 100644 --- a/hw/i386/kvm/xen_gnttab.c +++ b/hw/i386/kvm/xen_gnttab.c @@ -180,3 +180,34 @@ int xen_gnttab_map_page(uint64_t idx, uint64_t gfn) return 0; } +int xen_gnttab_set_version_op(struct gnttab_set_version *set) +{ +int ret; + +switch (set->version) { +case 1: +ret = 0; +break; + +case 2: +/* Behave as before set_version was introduced. */ +ret = -ENOSYS; +break; + +default: +ret = -EINVAL; +} + +set->version = 1; +return ret; +} + +int xen_gnttab_get_version_op(struct gnttab_get_version *get) +{ +if (get->dom != DOMID_SELF && get->dom != xen_domid) { +return -ESRCH; +} + +get->version = 1; +return 0; +} diff --git a/hw/i386/kvm/xen_gnttab.h b/hw/i386/kvm/xen_gnttab.h index a7caa94c83..79579677ba 100644 --- a/hw/i386/kvm/xen_gnttab.h +++ b/hw/i386/kvm/xen_gnttab.h @@ -15,4 +15,9 @@ void xen_gnttab_create(void); int xen_gnttab_map_page(uint64_t idx, uint64_t gfn); +struct gnttab_set_version; +struct gnttab_get_version; +int xen_gnttab_set_version_op(struct gnttab_set_version *set); +int xen_gnttab_get_version_op(struct gnttab_get_version *get); + #endif /* QEMU_XEN_GNTTAB_H */ diff --git a/target/i386/kvm/xen-emu.c b/target/i386/kvm/xen-emu.c index 41976e85af..e35b2d5557 100644 --- a/target/i386/kvm/xen-emu.c +++ b/target/i386/kvm/xen-emu.c @@ -34,6 +34,7 @@ #include "hw/xen/interface/hvm/params.h" #include "hw/xen/interface/vcpu.h" #include "hw/xen/interface/event_channel.h" +#include "hw/xen/interface/grant_table.h" #include "xen-compat.h" @@ -1166,6 +1167,61 @@ static bool kvm_xen_hcall_sched_op(struct kvm_xen_exit *exit, X86CPU *cpu, return true; } +static bool kvm_xen_hcall_gnttab_op(struct kvm_xen_exit *exit, X86CPU *cpu, +int cmd, uint64_t arg, int count) 
+{ +CPUState *cs = CPU(cpu); +int err; + +switch (cmd) { +case GNTTABOP_set_version: { +struct gnttab_set_version set; + +qemu_build_assert(sizeof(set) == 4); +if (kvm_copy_from_gva(cs, arg, &set, sizeof(set))) { +err = -EFAULT; +break; +} + +err = xen_gnttab_set_version_op(&set); +if (!err && kvm_copy_to_gva(cs, arg, &set, sizeof(set))) { +err = -EFAULT; +} +break; +} +case GNTTABOP_get_version: { +struct gnttab_get_version get; + +qemu_build_assert(sizeof(get) == 8); +if (kvm_copy_from_gva(cs, arg, &get, sizeof(get))) { +err = -EFAULT; +break; +} + +err = xen_gnttab_get_version_op(&get); +if (!err && kvm_copy_to_gva(cs, arg, &get, sizeof(get))) { +err = -EFAULT; +} +break; +} +case GNTTABOP_query_size: +case GNTTABOP_setup_table: +case GNTTABOP_copy: +case GNTTABOP_map_grant_ref: +case GNTTABOP_unmap_grant_ref: +case GNTTABOP_swap_grant_ref: +return false; + +default: +/* Xen explicitly returns -ENOSYS to HVM guests for all others */ +err = -ENOSYS; +break; +} + +exit->u.hcall.result = err; +return true; +} + static bool do_kvm_xen_handle_exit(X86CPU *cpu, struct kvm_xen_exit *exit) { uint16_t code = exit->u.hcall.input; @@ -1176,6 +1232,10 @@ static bool do_kvm_xen_handle_exit(X86CPU *cpu, struct kvm_xen_exit *exit) } switch (code) { +case __HYPERVISOR_grant_table_op: +return kvm_xen_hcall_gnttab_op(exit, cpu, exit->u.hcall.params[0], + exit->u.hcall.params[1], + exit->u.hcall.params[2]); case __HYPERVISOR_sched_op: return kvm_xen_hcall_sched_op(exit, cpu, exit->u.hcall.params[0], exit->u.hcall.params[1]); -- 2.39.0
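The version negotiation in this patch can be condensed into a small standalone sketch. The struct and function names below are hypothetical stand-ins, not the real Xen or QEMU definitions; only the return-code policy is taken from the diff above.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct gnttab_set_version (only the field used here). */
struct sketch_set_version {
    uint32_t version;
};

/*
 * Same policy as xen_gnttab_set_version_op() in the patch:
 * v1 is accepted, v2 is refused with -ENOSYS, anything else is -EINVAL.
 * The version actually in effect (always 1) is written back in every case.
 */
static int sketch_set_version_op(struct sketch_set_version *set)
{
    int ret;

    assert(set != NULL);

    switch (set->version) {
    case 1:
        ret = 0;
        break;
    case 2:
        ret = -ENOSYS;
        break;
    default:
        ret = -EINVAL;
        break;
    }

    set->version = 1;
    return ret;
}
```

Note that the hypercall wrapper in the patch only copies the struct back to the guest when the operation succeeded, so a guest asking for v2 sees -ENOSYS and its own buffer unchanged.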
[PATCH v11 35/59] hw/xen: Implement EVTCHNOP_alloc_unbound
From: David Woodhouse Signed-off-by: David Woodhouse Reviewed-by: Paul Durrant --- hw/i386/kvm/xen_evtchn.c | 32 hw/i386/kvm/xen_evtchn.h | 2 ++ target/i386/kvm/xen-emu.c | 15 +++ 3 files changed, 49 insertions(+) diff --git a/hw/i386/kvm/xen_evtchn.c b/hw/i386/kvm/xen_evtchn.c index a97d6ba61d..9dc5a98d94 100644 --- a/hw/i386/kvm/xen_evtchn.c +++ b/hw/i386/kvm/xen_evtchn.c @@ -835,6 +835,38 @@ int xen_evtchn_bind_ipi_op(struct evtchn_bind_ipi *ipi) return ret; } +int xen_evtchn_alloc_unbound_op(struct evtchn_alloc_unbound *alloc) +{ +XenEvtchnState *s = xen_evtchn_singleton; +uint16_t type_val; +int ret; + +if (!s) { +return -ENOTSUP; +} + +if (alloc->dom != DOMID_SELF && alloc->dom != xen_domid) { +return -ESRCH; +} + +if (alloc->remote_dom == DOMID_QEMU) { +type_val = PORT_INFO_TYPEVAL_REMOTE_QEMU; +} else if (alloc->remote_dom == DOMID_SELF || + alloc->remote_dom == xen_domid) { +type_val = 0; +} else { +return -EPERM; +} + +qemu_mutex_lock(&s->port_lock); + +ret = allocate_port(s, 0, EVTCHNSTAT_unbound, type_val, &alloc->port); + +qemu_mutex_unlock(&s->port_lock); + +return ret; +} + int xen_evtchn_send_op(struct evtchn_send *send) { XenEvtchnState *s = xen_evtchn_singleton; diff --git a/hw/i386/kvm/xen_evtchn.h b/hw/i386/kvm/xen_evtchn.h index 500fdbe8b8..fc080138e3 100644 --- a/hw/i386/kvm/xen_evtchn.h +++ b/hw/i386/kvm/xen_evtchn.h @@ -21,11 +21,13 @@ struct evtchn_unmask; struct evtchn_bind_virq; struct evtchn_bind_ipi; struct evtchn_send; +struct evtchn_alloc_unbound; int xen_evtchn_status_op(struct evtchn_status *status); int xen_evtchn_close_op(struct evtchn_close *close); int xen_evtchn_unmask_op(struct evtchn_unmask *unmask); int xen_evtchn_bind_virq_op(struct evtchn_bind_virq *virq); int xen_evtchn_bind_ipi_op(struct evtchn_bind_ipi *ipi); int xen_evtchn_send_op(struct evtchn_send *send); +int xen_evtchn_alloc_unbound_op(struct evtchn_alloc_unbound *alloc); #endif /* QEMU_XEN_EVTCHN_H */ diff --git a/target/i386/kvm/xen-emu.c 
b/target/i386/kvm/xen-emu.c index 5299614d3c..e186dec9a9 100644 --- a/target/i386/kvm/xen-emu.c +++ b/target/i386/kvm/xen-emu.c @@ -918,6 +918,21 @@ static bool kvm_xen_hcall_evtchn_op(struct kvm_xen_exit *exit, X86CPU *cpu, err = xen_evtchn_send_op(&send); break; } +case EVTCHNOP_alloc_unbound: { +struct evtchn_alloc_unbound alloc; + +qemu_build_assert(sizeof(alloc) == 8); +if (kvm_copy_from_gva(cs, arg, &alloc, sizeof(alloc))) { +err = -EFAULT; +break; +} + +err = xen_evtchn_alloc_unbound_op(&alloc); +if (!err && kvm_copy_to_gva(cs, arg, &alloc, sizeof(alloc))) { +err = -EFAULT; +} +break; +} default: return false; } -- 2.39.0
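As a reading aid, the remote-domain check in xen_evtchn_alloc_unbound_op() can be sketched on its own. The constant values below are assumptions for illustration (the real ones live in the Xen and QEMU headers), and the function name is hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative values only; treat them as assumptions, not the real headers. */
#define SKETCH_DOMID_SELF          0x7FF0u
#define SKETCH_DOMID_QEMU          0u
#define SKETCH_TYPEVAL_REMOTE_QEMU 0x8000u

/*
 * Decide what to store in the port's type_val for an unbound port:
 *  - remote end is QEMU itself   -> flag the port as QEMU-facing
 *  - remote end is the guest     -> plain loopback, type_val 0
 *  - any other domain            -> refused with -EPERM
 */
static int sketch_remote_type_val(uint16_t remote_dom, uint16_t own_domid,
                                  uint16_t *type_val)
{
    assert(type_val != NULL);

    if (remote_dom == SKETCH_DOMID_QEMU) {
        *type_val = SKETCH_TYPEVAL_REMOTE_QEMU;
    } else if (remote_dom == SKETCH_DOMID_SELF || remote_dom == own_domid) {
        *type_val = 0;
    } else {
        return -EPERM;
    }

    return 0;
}
```

The ordering matters in the same way as in the patch: the QEMU check comes first, so the sketch assumes the guest's own domid is not equal to the value used for QEMU.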
[PATCH v11 21/59] i386/xen: handle VCPUOP_register_vcpu_info
From: Joao Martins Handle the hypercall to set a per vcpu info, and also wire up the default vcpu_info in the shared_info page for the first 32 vCPUs. To avoid deadlock within KVM a vCPU thread must set its *own* vcpu_info rather than it being set from the context in which the hypercall is invoked. Add the vcpu_info (and default) GPA to the vmstate_x86_cpu for migration, and restore it in kvm_arch_put_registers() appropriately. Signed-off-by: Joao Martins Signed-off-by: David Woodhouse Reviewed-by: Paul Durrant --- target/i386/cpu.h| 2 + target/i386/kvm/kvm.c| 17 target/i386/kvm/trace-events | 1 + target/i386/kvm/xen-emu.c| 152 ++- target/i386/kvm/xen-emu.h| 2 + target/i386/machine.c| 19 + 6 files changed, 190 insertions(+), 3 deletions(-) diff --git a/target/i386/cpu.h b/target/i386/cpu.h index c6c57baed5..109b2e5669 100644 --- a/target/i386/cpu.h +++ b/target/i386/cpu.h @@ -1788,6 +1788,8 @@ typedef struct CPUArchState { #endif #if defined(CONFIG_KVM) struct kvm_nested_state *nested_state; +uint64_t xen_vcpu_info_gpa; +uint64_t xen_vcpu_info_default_gpa; #endif #if defined(CONFIG_HVF) HVFX86LazyFlags hvf_lflags; diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c index a7ba3476ac..766a757bae 100644 --- a/target/i386/kvm/kvm.c +++ b/target/i386/kvm/kvm.c @@ -4735,6 +4735,15 @@ int kvm_arch_put_registers(CPUState *cpu, int level) kvm_arch_set_tsc_khz(cpu); } +#ifdef CONFIG_XEN_EMU +if (xen_mode == XEN_EMULATE && level == KVM_PUT_FULL_STATE) { +ret = kvm_put_xen_state(cpu); +if (ret < 0) { +return ret; +} +} +#endif + ret = kvm_getput_regs(x86_cpu, 1); if (ret < 0) { return ret; @@ -4834,6 +4843,14 @@ int kvm_arch_get_registers(CPUState *cs) if (ret < 0) { goto out; } +#ifdef CONFIG_XEN_EMU +if (xen_mode == XEN_EMULATE) { +ret = kvm_get_xen_state(cs); +if (ret < 0) { +goto out; +} +} +#endif ret = 0; out: cpu_sync_bndcs_hflags(&cpu->env); diff --git a/target/i386/kvm/trace-events b/target/i386/kvm/trace-events index 8e9f269f56..a840e0333d 100644 --- 
a/target/i386/kvm/trace-events +++ b/target/i386/kvm/trace-events @@ -10,3 +10,4 @@ kvm_x86_update_msi_routes(int num) "Updated %d MSI routes" kvm_xen_hypercall(int cpu, uint8_t cpl, uint64_t input, uint64_t a0, uint64_t a1, uint64_t a2, uint64_t ret) "xen_hypercall: cpu %d cpl %d input %" PRIu64 " a0 0x%" PRIx64 " a1 0x%" PRIx64 " a2 0x%" PRIx64" ret 0x%" PRIx64 kvm_xen_soft_reset(void) "" kvm_xen_set_shared_info(uint64_t gfn) "shared info at gfn 0x%" PRIx64 +kvm_xen_set_vcpu_attr(int cpu, int type, uint64_t gpa) "vcpu attr cpu %d type %d gpa 0x%" PRIx64 diff --git a/target/i386/kvm/xen-emu.c b/target/i386/kvm/xen-emu.c index e5ae0a9a38..1cec8566ec 100644 --- a/target/i386/kvm/xen-emu.c +++ b/target/i386/kvm/xen-emu.c @@ -119,6 +119,8 @@ int kvm_xen_init(KVMState *s, uint32_t hypercall_msr) int kvm_xen_init_vcpu(CPUState *cs) { +X86CPU *cpu = X86_CPU(cs); +CPUX86State *env = &cpu->env; int err; /* @@ -142,6 +144,9 @@ int kvm_xen_init_vcpu(CPUState *cs) } } +env->xen_vcpu_info_gpa = INVALID_GPA; +env->xen_vcpu_info_default_gpa = INVALID_GPA; + return 0; } @@ -187,10 +192,58 @@ static bool kvm_xen_hcall_xen_version(struct kvm_xen_exit *exit, X86CPU *cpu, return true; } +static int kvm_xen_set_vcpu_attr(CPUState *cs, uint16_t type, uint64_t gpa) +{ +struct kvm_xen_vcpu_attr xhsi; + +xhsi.type = type; +xhsi.u.gpa = gpa; + +trace_kvm_xen_set_vcpu_attr(cs->cpu_index, type, gpa); + +return kvm_vcpu_ioctl(cs, KVM_XEN_VCPU_SET_ATTR, &xhsi); +} + +static void do_set_vcpu_info_default_gpa(CPUState *cs, run_on_cpu_data data) +{ +X86CPU *cpu = X86_CPU(cs); +CPUX86State *env = &cpu->env; + +env->xen_vcpu_info_default_gpa = data.host_ulong; + +/* Changing the default does nothing if a vcpu_info was explicitly set. 
*/ +if (env->xen_vcpu_info_gpa == INVALID_GPA) { +kvm_xen_set_vcpu_attr(cs, KVM_XEN_VCPU_ATTR_TYPE_VCPU_INFO, + env->xen_vcpu_info_default_gpa); +} +} + +static void do_set_vcpu_info_gpa(CPUState *cs, run_on_cpu_data data) +{ +X86CPU *cpu = X86_CPU(cs); +CPUX86State *env = &cpu->env; + +env->xen_vcpu_info_gpa = data.host_ulong; + +kvm_xen_set_vcpu_attr(cs, KVM_XEN_VCPU_ATTR_TYPE_VCPU_INFO, + env->xen_vcpu_info_gpa); +} + +static void do_vcpu_soft_reset(CPUState *cs, run_on_cpu_data data) +{ +X86CPU *cpu = X86_CPU(cs); +CPUX86State *env = &cpu->env; + +env->xen_vcpu_info_gpa = INVALID_GPA; +env->xen_vcpu_info_default_gpa = INVALID_GPA; + +kvm_xen_set_vcpu_attr(cs, KVM_XEN_VCPU_ATTR_TYPE_VCPU_INFO, INVALID_GPA); +} + static int xen_set_shared_info(uint64_t gfn) { uint64_t g
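The relationship between the default and the explicitly registered vcpu_info can be sketched as follows. This is an illustration under stated assumptions: the vcpu_info array is taken to start at offset 0 of the shared_info page with 64-byte slots, and all names prefixed sketch_ are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_INVALID_GPA      UINT64_MAX
#define SKETCH_LEGACY_MAX_VCPUS 32u
#define SKETCH_VCPU_INFO_SIZE   64u   /* assumed size of one vcpu_info slot */

/* All default slots are expected to fit in the single shared_info page. */
static_assert(SKETCH_LEGACY_MAX_VCPUS * SKETCH_VCPU_INFO_SIZE <= 4096,
              "default vcpu_info slots must fit in one 4 KiB page");

/* Default slot for a vCPU inside shared_info, or INVALID_GPA if it has none. */
static uint64_t sketch_default_vcpu_info_gpa(uint64_t shinfo_gpa,
                                             uint32_t vcpu_id)
{
    if (shinfo_gpa == SKETCH_INVALID_GPA ||
        vcpu_id >= SKETCH_LEGACY_MAX_VCPUS) {
        return SKETCH_INVALID_GPA;
    }
    return shinfo_gpa + (uint64_t)vcpu_id * SKETCH_VCPU_INFO_SIZE;
}

/* An explicitly registered vcpu_info always wins over the default. */
static uint64_t sketch_effective_vcpu_info_gpa(uint64_t explicit_gpa,
                                               uint64_t default_gpa)
{
    return explicit_gpa != SKETCH_INVALID_GPA ? explicit_gpa : default_gpa;
}
```

This matches the comment in do_set_vcpu_info_default_gpa(): changing the default has no visible effect once the guest has registered its own vcpu_info.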
[PATCH v11 43/59] hw/xen: Add xen_gnttab device for grant table emulation
From: David Woodhouse Signed-off-by: David Woodhouse Reviewed-by: Paul Durrant --- hw/i386/kvm/meson.build | 1 + hw/i386/kvm/xen_gnttab.c | 111 ++ hw/i386/kvm/xen_gnttab.h | 18 +++ hw/i386/pc.c | 2 + target/i386/kvm/xen-emu.c | 3 ++ 5 files changed, 135 insertions(+) create mode 100644 hw/i386/kvm/xen_gnttab.c create mode 100644 hw/i386/kvm/xen_gnttab.h diff --git a/hw/i386/kvm/meson.build b/hw/i386/kvm/meson.build index cab64df339..e02449e4d4 100644 --- a/hw/i386/kvm/meson.build +++ b/hw/i386/kvm/meson.build @@ -7,6 +7,7 @@ i386_kvm_ss.add(when: 'CONFIG_IOAPIC', if_true: files('ioapic.c')) i386_kvm_ss.add(when: 'CONFIG_XEN_EMU', if_true: files( 'xen_overlay.c', 'xen_evtchn.c', + 'xen_gnttab.c', )) i386_ss.add_all(when: 'CONFIG_KVM', if_true: i386_kvm_ss) diff --git a/hw/i386/kvm/xen_gnttab.c b/hw/i386/kvm/xen_gnttab.c new file mode 100644 index 00..ef8857e50c --- /dev/null +++ b/hw/i386/kvm/xen_gnttab.c @@ -0,0 +1,111 @@ +/* + * QEMU Xen emulation: Grant table support + * + * Copyright © 2022 Amazon.com, Inc. or its affiliates. All Rights Reserved. + * + * Authors: David Woodhouse + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. 
+ */ + +#include "qemu/osdep.h" +#include "qemu/host-utils.h" +#include "qemu/module.h" +#include "qemu/lockable.h" +#include "qemu/main-loop.h" +#include "qapi/error.h" +#include "qom/object.h" +#include "exec/target_page.h" +#include "exec/address-spaces.h" +#include "migration/vmstate.h" + +#include "hw/sysbus.h" +#include "hw/xen/xen.h" +#include "xen_overlay.h" +#include "xen_gnttab.h" + +#include "sysemu/kvm.h" +#include "sysemu/kvm_xen.h" + +#include "hw/xen/interface/memory.h" +#include "hw/xen/interface/grant_table.h" + +#define TYPE_XEN_GNTTAB "xen-gnttab" +OBJECT_DECLARE_SIMPLE_TYPE(XenGnttabState, XEN_GNTTAB) + +#define XEN_PAGE_SHIFT 12 +#define XEN_PAGE_SIZE (1ULL << XEN_PAGE_SHIFT) + +struct XenGnttabState { +/*< private >*/ +SysBusDevice busdev; +/*< public >*/ + +uint32_t nr_frames; +uint32_t max_frames; +}; + +struct XenGnttabState *xen_gnttab_singleton; + +static void xen_gnttab_realize(DeviceState *dev, Error **errp) +{ +XenGnttabState *s = XEN_GNTTAB(dev); + +if (xen_mode != XEN_EMULATE) { +error_setg(errp, "Xen grant table support is for Xen emulation"); +return; +} +s->nr_frames = 0; +s->max_frames = kvm_xen_get_gnttab_max_frames(); +} + +static bool xen_gnttab_is_needed(void *opaque) +{ +return xen_mode == XEN_EMULATE; +} + +static const VMStateDescription xen_gnttab_vmstate = { +.name = "xen_gnttab", +.version_id = 1, +.minimum_version_id = 1, +.needed = xen_gnttab_is_needed, +.fields = (VMStateField[]) { +VMSTATE_UINT32(nr_frames, XenGnttabState), +VMSTATE_END_OF_LIST() +} +}; + +static void xen_gnttab_class_init(ObjectClass *klass, void *data) +{ +DeviceClass *dc = DEVICE_CLASS(klass); + +dc->realize = xen_gnttab_realize; +dc->vmsd = &xen_gnttab_vmstate; +} + +static const TypeInfo xen_gnttab_info = { +.name = TYPE_XEN_GNTTAB, +.parent= TYPE_SYS_BUS_DEVICE, +.instance_size = sizeof(XenGnttabState), +.class_init= xen_gnttab_class_init, +}; + +void xen_gnttab_create(void) +{ +xen_gnttab_singleton = 
XEN_GNTTAB(sysbus_create_simple(TYPE_XEN_GNTTAB, + -1, NULL)); +} + +static void xen_gnttab_register_types(void) +{ +type_register_static(&xen_gnttab_info); +} + +type_init(xen_gnttab_register_types) + +int xen_gnttab_map_page(uint64_t idx, uint64_t gfn) +{ +return -ENOSYS; +} + diff --git a/hw/i386/kvm/xen_gnttab.h b/hw/i386/kvm/xen_gnttab.h new file mode 100644 index 00..a7caa94c83 --- /dev/null +++ b/hw/i386/kvm/xen_gnttab.h @@ -0,0 +1,18 @@ +/* + * QEMU Xen emulation: Grant table support + * + * Copyright © 2022 Amazon.com, Inc. or its affiliates. All Rights Reserved. + * + * Authors: David Woodhouse + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. + */ + +#ifndef QEMU_XEN_GNTTAB_H +#define QEMU_XEN_GNTTAB_H + +void xen_gnttab_create(void); +int xen_gnttab_map_page(uint64_t idx, uint64_t gfn); + +#endif /* QEMU_XEN_GNTTAB_H */ diff --git a/hw/i386/pc.c b/hw/i386/pc.c index 2d3f316d10..ae1d50e084 100644 --- a/hw/i386/pc.c +++ b/hw/i386/pc.c @@ -91,6 +91,7 @@ #include "hw/virtio/virtio-mem-pci.h" #include "hw/i386/kvm/xen_overlay.h" #include "hw/i386/kvm/xen_evtchn.h" +#include "hw/i386/kvm/xen_gnttab.h" #include "hw/mem/memory-device.h" #include "sysemu/replay.h" #include "target/i386/cpu.h" @@ -1858,6 +1859,7 @@ int pc_machine_kvm_type(MachineState *machine, const char *kvm_type) if (xen_mode == XEN_EMULATE) { xen_overlay_create(); xen_e
[PATCH v11 15/59] i386/xen: add pc_machine_kvm_type to initialize XEN_EMULATE mode
From: David Woodhouse The xen_overlay device (and later similar devices for event channels and grant tables) need to be instantiated. Do this from a kvm_type method on the PC machine derivatives, since KVM is the only way to support Xen emulation for now. Signed-off-by: David Woodhouse Reviewed-by: Paul Durrant --- hw/i386/pc.c | 11 +++ include/hw/i386/pc.h | 3 +++ 2 files changed, 14 insertions(+) diff --git a/hw/i386/pc.c b/hw/i386/pc.c index 6e592bd969..9169305f4f 100644 --- a/hw/i386/pc.c +++ b/hw/i386/pc.c @@ -89,6 +89,7 @@ #include "hw/virtio/virtio-iommu.h" #include "hw/virtio/virtio-pmem-pci.h" #include "hw/virtio/virtio-mem-pci.h" +#include "hw/i386/kvm/xen_overlay.h" #include "hw/mem/memory-device.h" #include "sysemu/replay.h" #include "target/i386/cpu.h" @@ -1844,6 +1845,16 @@ static void pc_machine_initfn(Object *obj) cxl_machine_init(obj, &pcms->cxl_devices_state); } +int pc_machine_kvm_type(MachineState *machine, const char *kvm_type) +{ +#ifdef CONFIG_XEN_EMU +if (xen_mode == XEN_EMULATE) { +xen_overlay_create(); +} +#endif +return 0; +} + static void pc_machine_reset(MachineState *machine, ShutdownCause reason) { CPUState *cs; diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h index 66e3d059ef..740497a961 100644 --- a/include/hw/i386/pc.h +++ b/include/hw/i386/pc.h @@ -291,12 +291,15 @@ extern const size_t pc_compat_1_5_len; extern GlobalProperty pc_compat_1_4[]; extern const size_t pc_compat_1_4_len; +extern int pc_machine_kvm_type(MachineState *machine, const char *vm_type); + #define DEFINE_PC_MACHINE(suffix, namestr, initfn, optsfn) \ static void pc_machine_##suffix##_class_init(ObjectClass *oc, void *data) \ { \ MachineClass *mc = MACHINE_CLASS(oc); \ optsfn(mc); \ mc->init = initfn; \ +mc->kvm_type = pc_machine_kvm_type; \ } \ static const TypeInfo pc_machine_type_##suffix = { \ .name = namestr TYPE_MACHINE_SUFFIX, \ -- 2.39.0
[PATCH v11 30/59] hw/xen: Implement EVTCHNOP_close
From: David Woodhouse It calls an internal close_port() helper which will also be used from EVTCHNOP_reset and will actually do the work to disconnect/unbind a port once any of that is actually implemented in the first place. That in turn calls a free_port() internal function which will be in error paths after allocation. Signed-off-by: David Woodhouse Reviewed-by: Paul Durrant --- hw/i386/kvm/xen_evtchn.c | 121 ++ hw/i386/kvm/xen_evtchn.h | 2 + target/i386/kvm/xen-emu.c | 12 3 files changed, 135 insertions(+) diff --git a/hw/i386/kvm/xen_evtchn.c b/hw/i386/kvm/xen_evtchn.c index 8bed33890f..08c6fac357 100644 --- a/hw/i386/kvm/xen_evtchn.c +++ b/hw/i386/kvm/xen_evtchn.c @@ -21,6 +21,7 @@ #include "hw/sysbus.h" #include "hw/xen/xen.h" + #include "xen_evtchn.h" #include "xen_overlay.h" @@ -40,6 +41,41 @@ typedef struct XenEvtchnPort { uint16_t type_val; /* pirq# / virq# / remote port according to type */ } XenEvtchnPort; +/* 32-bit compatibility definitions, also used natively in 32-bit build */ +struct compat_arch_vcpu_info { +unsigned int cr2; +unsigned int pad[5]; +}; + +struct compat_vcpu_info { +uint8_t evtchn_upcall_pending; +uint8_t evtchn_upcall_mask; +uint16_t pad; +uint32_t evtchn_pending_sel; +struct compat_arch_vcpu_info arch; +struct vcpu_time_info time; +}; /* 64 bytes (x86) */ + +struct compat_arch_shared_info { +unsigned int max_pfn; +unsigned int pfn_to_mfn_frame_list_list; +unsigned int nmi_reason; +unsigned int p2m_cr3; +unsigned int p2m_vaddr; +unsigned int p2m_generation; +uint32_t wc_sec_hi; +}; + +struct compat_shared_info { +struct compat_vcpu_info vcpu_info[XEN_LEGACY_MAX_VCPUS]; +uint32_t evtchn_pending[32]; +uint32_t evtchn_mask[32]; +uint32_t wc_version; /* Version counter: see vcpu_time_info_t. 
*/ +uint32_t wc_sec; +uint32_t wc_nsec; +struct compat_arch_shared_info arch; +}; + #define COMPAT_EVTCHN_2L_NR_CHANNELS1024 /* @@ -257,3 +293,88 @@ int xen_evtchn_status_op(struct evtchn_status *status) qemu_mutex_unlock(&s->port_lock); return 0; } + +static int clear_port_pending(XenEvtchnState *s, evtchn_port_t port) +{ +void *p = xen_overlay_get_shinfo_ptr(); +if (!p) +return -ENOTSUP; + +if (xen_is_long_mode()) { +struct shared_info *shinfo = p; +const int bits_per_word = BITS_PER_BYTE * sizeof(shinfo->evtchn_pending[0]); +typeof(shinfo->evtchn_pending[0]) mask; +int idx = port / bits_per_word; +int offset = port % bits_per_word; + +mask = 1UL << offset; + +qatomic_fetch_and(&shinfo->evtchn_pending[idx], ~mask); +} else { +struct compat_shared_info *shinfo = p; +const int bits_per_word = BITS_PER_BYTE * sizeof(shinfo->evtchn_pending[0]); +typeof(shinfo->evtchn_pending[0]) mask; +int idx = port / bits_per_word; +int offset = port % bits_per_word; + +mask = 1UL << offset; + +qatomic_fetch_and(&shinfo->evtchn_pending[idx], ~mask); +} +return 0; +} + +static void free_port(XenEvtchnState *s, evtchn_port_t port) +{ +s->port_table[port].type = EVTCHNSTAT_closed; +s->port_table[port].type_val = 0; +s->port_table[port].vcpu = 0; + +if (s->nr_ports == port + 1) { +do { +s->nr_ports--; +} while (s->nr_ports && + s->port_table[s->nr_ports - 1].type == EVTCHNSTAT_closed); +} + +/* Clear pending event to avoid unexpected behavior on re-bind. 
*/ +clear_port_pending(s, port); +} + +static int close_port(XenEvtchnState *s, evtchn_port_t port) +{ +XenEvtchnPort *p = &s->port_table[port]; + +switch (p->type) { +case EVTCHNSTAT_closed: +return -ENOENT; + +default: +break; +} + +free_port(s, port); +return 0; +} + +int xen_evtchn_close_op(struct evtchn_close *close) +{ +XenEvtchnState *s = xen_evtchn_singleton; +int ret; + +if (!s) { +return -ENOTSUP; +} + +if (!valid_port(close->port)) { +return -EINVAL; +} + +qemu_mutex_lock(&s->port_lock); + +ret = close_port(s, close->port); + +qemu_mutex_unlock(&s->port_lock); + +return ret; +} diff --git a/hw/i386/kvm/xen_evtchn.h b/hw/i386/kvm/xen_evtchn.h index 76467636ee..cb3924941a 100644 --- a/hw/i386/kvm/xen_evtchn.h +++ b/hw/i386/kvm/xen_evtchn.h @@ -16,6 +16,8 @@ void xen_evtchn_create(void); int xen_evtchn_set_callback_param(uint64_t param); struct evtchn_status; +struct evtchn_close; int xen_evtchn_status_op(struct evtchn_status *status); +int xen_evtchn_close_op(struct evtchn_close *close); #endif /* QEMU_XEN_EVTCHN_H */ diff --git a/target/i386/kvm/xen-emu.c b/target/i386/kvm/xen-emu.c index 3811153724..c54372700a 100644 --- a/target/i386/kvm/xen-emu.c +++ b/target/i386/kvm/xen-emu.c @@ -802,6 +802,18 @@ static bool kvm_xen_hcall_evtchn_op(str
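The index arithmetic in clear_port_pending() differs between the two guest layouts only in the word width. A minimal sketch of that arithmetic, using plain (non-atomic) stores for brevity where the patch uses qatomic_fetch_and(), and with hypothetical function names:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Clear the pending bit for 'port' in a bitmap of 64-bit words (long mode). */
static void sketch_clear_pending64(uint64_t *pending, unsigned int port)
{
    const unsigned int bits_per_word = 8 * sizeof(pending[0]);

    assert(pending != NULL);
    pending[port / bits_per_word] &= ~((uint64_t)1 << (port % bits_per_word));
}

/* Same operation for the 32-bit compat layout, where words are 32 bits wide. */
static void sketch_clear_pending32(uint32_t *pending, unsigned int port)
{
    const unsigned int bits_per_word = 8 * sizeof(pending[0]);

    assert(pending != NULL);
    pending[port / bits_per_word] &= ~((uint32_t)1 << (port % bits_per_word));
}
```

For example, port 70 lands in word 1, bit 6 of the 64-bit bitmap, but in word 2, bit 6 of the 32-bit one, which is why the patch needs both branches.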
[PATCH v11 00/59] Xen HVM support under KVM
Updated to base it on the incoming Arm Xen PVH support which at least yesterday was in the staging branch, and a couple of tweaks from Paul's review feedback. Most of the changes we've actually been making are in the XenStore part which we're keeping out of this patch set as it's large enough already. As ever it can be seen in all its glory, even running guests with PV disk now, at https://git.infradead.org/users/dwmw2/qemu.git/shortlog/refs/heads/xenfv

v11: https://git.infradead.org/users/dwmw2/qemu.git/shortlog/refs/heads/xenfv-kvm-11
 • Rebase on Arm PVH support.
 • Fix 32-bit set_timer_op hypercall.
 • Drop references to grant table v2 which might imply imminent support.

v10: https://lore.kernel.org/qemu-devel/20230201143148.1744093-1-dw...@infradead.org/
     https://git.infradead.org/users/dwmw2/qemu.git/shortlog/refs/heads/xenfv-kvm-10
 • Move imported Xen headers to include/hw/xen/interface/.
 • Allow --xen-domid to be set, and default to non-zero.
 • Update documentation to include xen-evtchn-max-pirq and xen-gnttab-max-frames properties.
 • Explicitly include "qemu/lockable.h" in xen_evtchn.c to fix build.

v9: https://lore.kernel.org/qemu-devel/20230128081113.1615111-1-dw...@infradead.org/
    https://git.infradead.org/users/dwmw2/qemu.git/shortlog/refs/heads/xenfv-kvm-9
 • Fix race in GSI deassertion. I still hate this and want to fix it to happen on EOI at the irqchip and fix VFIO too, but we can do that in a separate series rather than piling it into this one. At least this one is nicer than the VFIO one that already exists.
 • Fix user builds by not including xen-stubs.c in those.
 • On rebasing, add some explicit includes needed after header cleanups.

v8: https://lore.kernel.org/qemu-devel/20230120131343.1441939-1-dw...@infradead.org/
    https://git.infradead.org/users/dwmw2/qemu.git/shortlog/refs/heads/xenfv-kvm-8
 • Instantiate xen pci-platform device automatically.
 • Add documentation.
 • Rename (newly-added) CONFIG_XENFV_PLATFORM to CONFIG_XEN_EMU. That's basically what it enables now that the dust is settling on the rest of the patch set that comes next.
 • Shift QMP commands to qapi/misc-target.json, other review feedback.
 • Clear upcall vector on soft reset.
 • Wire up soft reset to occur on qemu_devices_reset() (e.g. reboot).
 • Locking tweaks largely resulting from doing soft reset with the BQL.
 • Poll for deassertion of event channel GSI from kvm_arch_post_run() instead of kvm_arch_handle_exit().
 • Add PIRQ support.

v7: https://lore.kernel.org/qemu-devel/20230116215805.1123514-1-dw...@infradead.org/
    https://git.infradead.org/users/dwmw2/qemu.git/shortlog/refs/heads/xenfv-kvm-7
 • Trivial review feedback and collected ack/review tags.
 • Only call qemu_set_irq() under the BQL, which means doing so from a BH in some circumstances.

v6: https://lore.kernel.org/qemu-devel/20230110122042.1562155-1-dw...@infradead.org/
    https://git.infradead.org/users/dwmw2/qemu.git/shortlog/refs/heads/xenfv-kvm-6
 • Require split irqchip to ensure the GSI handling works correctly.
 • Rework monitor commands to be QMP-based.
 • Cache vcpu_info hva to avoid MemoryRegion refcount leaks.
 • Pull in more Xen headers to allow for later PV backend work.
 • Define __XEN_TOOLS__ in hw/xen/xen.h instead of littering C files with separate definitions of __XEN_INTERFACE_VERSION__.
 • Drop debugging hexdump from xenstore processing.
 • Minor fixes in event channel backend handling.
 • Drop "Refactor xen_be_init()" patch. It turns out we're going to do that all quite differently, so it's neither necessary nor sufficient.

v5: https://lore.kernel.org/qemu-devel/20221230121235.1282915-1-dw...@infradead.org/
    https://git.infradead.org/users/dwmw2/qemu.git/shortlog/refs/heads/xenfv-kvm-5
 • Add backend implementation of event channel support, to parallel the libxenevtchn API used by existing backend drivers.
 • Add basic XenStore ring implementation, test migration and kexec.
 • Some kexec/soft reset fixes (clear port pending bits, kernel timer virq).
 • Fix race with setting the xen_callback_asserted flag before actually doing so, which could lead to it being *cleared* again before we even assert it... and leave it asserted for ever.

v4: https://lore.kernel.org/qemu-devel/20221221010623.1000191-1-dw...@infradead.org/
    https://git.infradead.org/users/dwmw2/qemu.git/shortlog/refs/heads/xenfv-kvm-4
 • Add soft reset support near the beginning and thread it through the rest of the feature enablement.
 • Add PV timer support and advertise XENFEAT_safe_hvm_pvclock.
 • Add basic grant table mapping and [gs]et_version / query_size support.
 • Make xen_platform device build (and work) without CONFIG_XEN.
 • Fix Xen HVM mode not to require --xen-attach.

v3: https://lore.kernel.org/qemu-devel/20221216004117.862106-1-dw...@infradead.org/
    https://git.infradead.org/users/dwmw2/qemu.git/shortlog/refs/heads/x
[PATCH v11 57/59] hw/xen: Support MSI mapping to PIRQ
From: David Woodhouse The way that Xen handles MSI PIRQs is kind of awful. There is a special MSI message which targets a PIRQ. The vector in the low bits of data must be zero. The low 8 bits of the PIRQ# are in the destination ID field, the extended destination ID field is unused, and instead the high bits of the PIRQ# are in the high 32 bits of the address. Using the high bits of the address means that we can't intercept and translate these messages in kvm_send_msi(), because they won't be caught by the APIC — addresses like 0x1000fee46000 aren't in the APIC's range. So we catch them in pci_msi_trigger() instead, and deliver the event channel directly. That isn't even the worst part. The worst part is that Xen snoops on writes to devices' MSI vectors while they are *masked*. When a MSI message is written which looks like it targets a PIRQ, it remembers the device and vector for later. When the guest makes a hypercall to bind that PIRQ# (snooped from a marked MSI vector) to an event channel port, Xen *unmasks* that MSI vector on the device. Xen guests using PIRQ delivery of MSI don't ever actually unmask the MSI for themselves. Now that this is working we can finally enable XENFEAT_hvm_pirqs and let the guest use it all. Tested with passthrough igb and emulated e1000e + AHCI. 
       CPU0   CPU1
  0:     65      0  IO-APIC          2-edge          timer
  1:      0     14  xen-pirq         1-ioapic-edge   i8042
  4:      0    846  xen-pirq         4-ioapic-edge   ttyS0
  8:      1      0  xen-pirq         8-ioapic-edge   rtc0
  9:      0      0  xen-pirq         9-ioapic-level  acpi
 12:    257      0  xen-pirq         12-ioapic-edge  i8042
 24:   9600      0  xen-percpu-virq  timer0
 25:   2758      0  xen-percpu-ipi   resched0
 26:      0      0  xen-percpu-ipi   callfunc0
 27:      0      0  xen-percpu-virq  debug0
 28:   1526      0  xen-percpu-ipi   callfuncsingle0
 29:      0      0  xen-percpu-ipi   spinlock0
 30:      0   8608  xen-percpu-virq  timer1
 31:      0    874  xen-percpu-ipi   resched1
 32:      0      0  xen-percpu-ipi   callfunc1
 33:      0      0  xen-percpu-virq  debug1
 34:      0   1617  xen-percpu-ipi   callfuncsingle1
 35:      0      0  xen-percpu-ipi   spinlock1
 36:      8      0  xen-dyn-event    xenbus
 37:      0   6046  xen-pirq-msi     ahci[:00:03.0]
 38:      1      0  xen-pirq-msi-x   ens4
 39:      0     73  xen-pirq-msi-x   ens4-rx-0
 40:     14      0  xen-pirq-msi-x   ens4-rx-1
 41:      0     32  xen-pirq-msi-x   ens4-tx-0
 42:     47      0  xen-pirq-msi-x   ens4-tx-1

Signed-off-by: David Woodhouse Reviewed-by: Paul Durrant --- hw/i386/kvm/meson.build| 7 + hw/i386/kvm/trace-events | 1 + hw/i386/kvm/xen-stubs.c| 27 hw/i386/kvm/xen_evtchn.c | 261 - hw/i386/kvm/xen_evtchn.h | 8 ++ hw/pci/msi.c | 11 ++ hw/pci/msix.c | 7 + hw/pci/pci.c | 17 +++ include/hw/pci/msi.h | 1 + target/i386/kvm/kvm.c | 19 ++- target/i386/kvm/kvm_i386.h | 2 + target/i386/kvm/xen-emu.c | 3 +- 12 files changed, 354 insertions(+), 10 deletions(-) create mode 100644 hw/i386/kvm/xen-stubs.c diff --git a/hw/i386/kvm/meson.build b/hw/i386/kvm/meson.build index 6d6981fced..82dd6ae7c6 100644 --- a/hw/i386/kvm/meson.build +++ b/hw/i386/kvm/meson.build @@ -12,3 +12,10 @@ i386_kvm_ss.add(when: 'CONFIG_XEN_EMU', if_true: files( )) i386_ss.add_all(when: 'CONFIG_KVM', if_true: i386_kvm_ss) + +xen_stubs_ss = ss.source_set() +xen_stubs_ss.add(when: 'CONFIG_XEN_EMU', if_false: files( + 'xen-stubs.c', +)) + +specific_ss.add_all(when: 'CONFIG_SOFTMMU', if_true: xen_stubs_ss) diff --git a/hw/i386/kvm/trace-events b/hw/i386/kvm/trace-events index 04e60c5bb8..b83c3eb965 100644 --- a/hw/i386/kvm/trace-events +++
b/hw/i386/kvm/trace-events @@ -2,3 +2,4 @@ kvm_xen_map_pirq(int pirq, int gsi) "pirq %d gsi %d" kvm_xen_unmap_pirq(int pirq, int gsi) "pirq %d gsi %d" kvm_xen_get_free_pirq(int pirq, int type) "pirq %d type %d" kvm_xen_bind_pirq(int pirq, int port) "pirq %d port %d" +kvm_xen_unmask_pirq(int pirq, char *dev, int vector) "pirq %d dev %s vector %d" diff --git a/hw/i386/kvm/xen-stubs.c b/hw/i386/kvm/xen-stubs.c new file mode 100644 index 00..a95964bbac --- /dev/null +++ b/hw/i386/kvm/xen-stubs.c @@ -0,0 +1,27 @@ +/* + * QEMU Xen emulation: QMP stubs + * + * Copyright © 2023 Amazon.com, Inc. or its affiliates. All Rights Reserved. + * + * Authors: David Woodhouse + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. + */ + +#include "qemu/osdep.h" +#include "xen_ev
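The MSI encoding described in the commit message of this patch (vector zero, low 8 bits of the PIRQ# in the destination ID field, remaining bits in the high 32 bits of the address) can be illustrated with a small decoder. This is a sketch based on that description and on the standard x86 MSI layout, where the destination ID occupies address bits 12 to 19; the function name is hypothetical, and returning 0 for "not a PIRQ message" is a convention chosen for this sketch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Extract the PIRQ# from an MSI address/data pair, or return 0 if the
 * message does not look like it targets a PIRQ.
 */
static uint32_t sketch_msi_pirq_target(uint64_t addr, uint32_t data)
{
    uint32_t pirq;

    /* The vector (low 8 bits of data) must be zero for a PIRQ message. */
    if (data & 0xff) {
        return 0;
    }

    /* Low 8 bits of the PIRQ#: destination ID field, address bits 12..19. */
    pirq = (uint32_t)((addr >> 12) & 0xff);

    /* Remaining bits of the PIRQ#: taken from the high 32 bits of the address. */
    pirq |= (uint32_t)(addr >> 32) & 0xffffff00u;

    return pirq;
}
```

With the example address from the commit message, 0x1000fee46000, the destination ID field is 0x46 and the high word is 0x1000, giving PIRQ 0x1046. That address lies outside the APIC's range, which is the reason the message has to be caught before it reaches kvm_send_msi().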
[PATCH v11 55/59] hw/xen: Implement emulated PIRQ hypercall support
From: David Woodhouse This wires up the basic infrastructure but the actual interrupts aren't there yet, so don't advertise it to the guest. Signed-off-by: David Woodhouse Reviewed-by: Paul Durrant --- hw/i386/kvm/trace-events | 4 + hw/i386/kvm/trace.h | 1 + hw/i386/kvm/xen_evtchn.c | 300 +- hw/i386/kvm/xen_evtchn.h | 2 + meson.build | 1 + target/i386/kvm/xen-emu.c | 15 ++ 6 files changed, 318 insertions(+), 5 deletions(-) create mode 100644 hw/i386/kvm/trace-events create mode 100644 hw/i386/kvm/trace.h diff --git a/hw/i386/kvm/trace-events b/hw/i386/kvm/trace-events new file mode 100644 index 00..04e60c5bb8 --- /dev/null +++ b/hw/i386/kvm/trace-events @@ -0,0 +1,4 @@ +kvm_xen_map_pirq(int pirq, int gsi) "pirq %d gsi %d" +kvm_xen_unmap_pirq(int pirq, int gsi) "pirq %d gsi %d" +kvm_xen_get_free_pirq(int pirq, int type) "pirq %d type %d" +kvm_xen_bind_pirq(int pirq, int port) "pirq %d port %d" diff --git a/hw/i386/kvm/trace.h b/hw/i386/kvm/trace.h new file mode 100644 index 00..e55d0812fd --- /dev/null +++ b/hw/i386/kvm/trace.h @@ -0,0 +1 @@ +#include "trace/trace-hw_i386_kvm.h" diff --git a/hw/i386/kvm/xen_evtchn.c b/hw/i386/kvm/xen_evtchn.c index ca9f15698f..f5e835ff70 100644 --- a/hw/i386/kvm/xen_evtchn.c +++ b/hw/i386/kvm/xen_evtchn.c @@ -24,6 +24,7 @@ #include "exec/target_page.h" #include "exec/address-spaces.h" #include "migration/vmstate.h" +#include "trace.h" #include "hw/sysbus.h" #include "hw/xen/xen.h" @@ -105,6 +106,21 @@ struct xenevtchn_handle { #define PORT_INFO_TYPEVAL_REMOTE_QEMU 0x8000 #define PORT_INFO_TYPEVAL_REMOTE_PORT_MASK 0x7FFF +/* + * These 'emuirq' values are used by Xen in the LM stream... and yes, I am + * insane enough to think about guest-transparent live migration from actual + * Xen to QEMU, and ensuring that we can convert/consume the stream. 
+ */ +#define IRQ_UNBOUND -1 +#define IRQ_PT -2 +#define IRQ_MSI_EMU -3 + + +struct pirq_info { +int gsi; +uint16_t port; +}; + struct XenEvtchnState { /*< private >*/ SysBusDevice busdev; @@ -122,8 +138,25 @@ struct XenEvtchnState { qemu_irq gsis[GSI_NUM_PINS]; struct xenevtchn_handle *be_handles[EVTCHN_2L_NR_CHANNELS]; + +uint32_t nr_pirqs; + +/* Bitmap of allocated PIRQs (serialized) */ +uint16_t nr_pirq_inuse_words; +uint64_t *pirq_inuse_bitmap; + +/* GSI → PIRQ mapping (serialized) */ +uint16_t gsi_pirq[GSI_NUM_PINS]; + +/* Per-PIRQ information (rebuilt on migration) */ +struct pirq_info *pirq; }; +#define pirq_inuse_word(s, pirq) (s->pirq_inuse_bitmap[((pirq) / 64)]) +#define pirq_inuse_bit(pirq) (1ULL << ((pirq) & 63)) + +#define pirq_inuse(s, pirq) (pirq_inuse_word(s, pirq) & pirq_inuse_bit(pirq)) + struct XenEvtchnState *xen_evtchn_singleton; /* Top bits of callback_param are the type (HVM_PARAM_CALLBACK_TYPE_xxx) */ @@ -138,17 +171,45 @@ static int xen_evtchn_pre_load(void *opaque) /* Unbind all the backend-side ports; they need to rebind */ unbind_backend_ports(s); +/* It'll be leaked otherwise. */ +g_free(s->pirq_inuse_bitmap); +s->pirq_inuse_bitmap = NULL; + return 0; } static int xen_evtchn_post_load(void *opaque, int version_id) { XenEvtchnState *s = opaque; +uint32_t i; if (s->callback_param) { xen_evtchn_set_callback_param(s->callback_param); } +/* Rebuild s->pirq[].port mapping */ +for (i = 0; i < s->nr_ports; i++) { +XenEvtchnPort *p = &s->port_table[i]; + +if (p->type == EVTCHNSTAT_pirq) { +assert(p->type_val); +assert(p->type_val < s->nr_pirqs); + +/* + * Set the gsi to IRQ_UNBOUND; it may be changed to an actual + * GSI# below, or to IRQ_MSI_EMU when the MSI table snooping + * catches up with it. 
+ */ +s->pirq[p->type_val].gsi = IRQ_UNBOUND; +s->pirq[p->type_val].port = i; +} +} +/* Rebuild s->pirq[].gsi mapping */ +for (i = 0; i < GSI_NUM_PINS; i++) { +if (s->gsi_pirq[i]) { +s->pirq[s->gsi_pirq[i]].gsi = i; +} +} return 0; } @@ -181,6 +242,10 @@ static const VMStateDescription xen_evtchn_vmstate = { VMSTATE_UINT32(nr_ports, XenEvtchnState), VMSTATE_STRUCT_VARRAY_UINT32(port_table, XenEvtchnState, nr_ports, 1, xen_evtchn_port_vmstate, XenEvtchnPort), +VMSTATE_UINT16_ARRAY(gsi_pirq, XenEvtchnState, GSI_NUM_PINS), +VMSTATE_VARRAY_UINT16_ALLOC(pirq_inuse_bitmap, XenEvtchnState, +nr_pirq_inuse_words, 0, +vmstate_info_uint64, uint64_t), VMSTATE_END_OF_LIST() } }; @@ -221,6 +286,23 @@ void xen_evtchn_create(void) for (i = 0; i < GSI_NUM_PINS; i++) { sysbus_init_irq(SYS_BUS_DEVICE(s), &s->gsis[i]); }
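For readers following the PIRQ bookkeeping in this patch, the in-use bitmap arithmetic (pirq_inuse_word() / pirq_inuse_bit()) can be sketched on its own. This is only a minimal illustration: struct pirq_bitmap and the helper names are hypothetical stand-ins, not part of the patch, and only the divide-by-64 / mask-by-63 arithmetic is taken from the macros above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the PIRQ fields of XenEvtchnState. */
struct pirq_bitmap {
    uint32_t nr_pirqs;   /* number of PIRQs tracked */
    uint16_t nr_words;   /* number of 64-bit words in 'words' */
    uint64_t *words;     /* one bit per PIRQ */
};

/* Same arithmetic as pirq_inuse_word() in the patch: 64 PIRQs per word. */
static uint64_t *pirq_word(struct pirq_bitmap *b, uint32_t pirq)
{
    assert(pirq < b->nr_pirqs);
    return &b->words[pirq / 64];
}

/* Same arithmetic as pirq_inuse_bit() in the patch. */
static uint64_t pirq_bit(uint32_t pirq)
{
    return 1ULL << (pirq & 63);
}

/* Allocate a zeroed bitmap; returns 0 on success, -1 on allocation failure. */
static int pirq_bitmap_init(struct pirq_bitmap *b, uint32_t nr_pirqs)
{
    b->nr_pirqs = nr_pirqs;
    b->nr_words = (uint16_t)((nr_pirqs + 63) / 64);
    b->words = calloc(b->nr_words, sizeof(uint64_t));
    return b->words ? 0 : -1;
}

static bool pirq_in_use(struct pirq_bitmap *b, uint32_t pirq)
{
    return (*pirq_word(b, pirq) & pirq_bit(pirq)) != 0;
}

static void pirq_mark(struct pirq_bitmap *b, uint32_t pirq)
{
    *pirq_word(b, pirq) |= pirq_bit(pirq);
}

static void pirq_release(struct pirq_bitmap *b, uint32_t pirq)
{
    *pirq_word(b, pirq) &= ~pirq_bit(pirq);
}
```

With 256 PIRQs this needs four words, and PIRQ 65 lands in bit 1 of word 1. Only the word array and its length would need to travel in a migration stream under this layout, which seems to be why the patch serializes nr_pirq_inuse_words and the bitmap but rebuilds the per-PIRQ table in post_load.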
[PATCH v11 52/59] hw/xen: Add basic ring handling to xenstore
From: David Woodhouse Extract requests, return ENOSYS to all of them. This is enough to allow older Linux guests to boot, as they need *something* back but it doesn't matter much what. A full implementation of a single-tenant internal XenStore copy-on-write tree with transactions and watches is waiting in the wings to be sent in a subsequent round of patches along with hooking up the actual PV disk back end in qemu, but this is enough to get guests booting for now. Signed-off-by: David Woodhouse Reviewed-by: Paul Durrant --- hw/i386/kvm/xen_xenstore.c | 223 - 1 file changed, 220 insertions(+), 3 deletions(-) diff --git a/hw/i386/kvm/xen_xenstore.c b/hw/i386/kvm/xen_xenstore.c index 702f417633..2388842d15 100644 --- a/hw/i386/kvm/xen_xenstore.c +++ b/hw/i386/kvm/xen_xenstore.c @@ -188,18 +188,235 @@ uint16_t xen_xenstore_get_port(void) return s->guest_port; } +static bool req_pending(XenXenstoreState *s) +{ +struct xsd_sockmsg *req = (struct xsd_sockmsg *)s->req_data; + +return s->req_offset == XENSTORE_HEADER_SIZE + req->len; +} + +static void reset_req(XenXenstoreState *s) +{ +memset(s->req_data, 0, sizeof(s->req_data)); +s->req_offset = 0; +} + +static void reset_rsp(XenXenstoreState *s) +{ +s->rsp_pending = false; + +memset(s->rsp_data, 0, sizeof(s->rsp_data)); +s->rsp_offset = 0; +} + +static void process_req(XenXenstoreState *s) +{ +struct xsd_sockmsg *req = (struct xsd_sockmsg *)s->req_data; +struct xsd_sockmsg *rsp = (struct xsd_sockmsg *)s->rsp_data; +const char enosys[] = "ENOSYS"; + +assert(req_pending(s)); + assert(!s->rsp_pending); + +rsp->type = XS_ERROR; +rsp->req_id = req->req_id; +rsp->tx_id = req->tx_id; +rsp->len = sizeof(enosys); +memcpy((void *)&rsp[1], enosys, sizeof(enosys)); + +s->rsp_pending = true; +reset_req(s); +} + +static unsigned int copy_from_ring(XenXenstoreState *s, uint8_t *ptr, unsigned int len) +{ +if (!len) +return 0; + +XENSTORE_RING_IDX prod = qatomic_read(&s->xs->req_prod); +XENSTORE_RING_IDX cons = 
qatomic_read(&s->xs->req_cons); +unsigned int copied = 0; + +smp_mb(); + +while (len) { +unsigned int avail = prod - cons; +unsigned int offset = MASK_XENSTORE_IDX(cons); +unsigned int copylen = avail; + +if (avail > XENSTORE_RING_SIZE) { +error_report("XenStore ring handling error"); +s->fatal_error = true; +break; +} else if (avail == 0) +break; + +if (copylen > len) { +copylen = len; +} +if (copylen > XENSTORE_RING_SIZE - offset) { +copylen = XENSTORE_RING_SIZE - offset; +} + +memcpy(ptr, &s->xs->req[offset], copylen); +copied += copylen; + +ptr += copylen; +len -= copylen; + +cons += copylen; +} + +smp_mb(); + +qatomic_set(&s->xs->req_cons, cons); + +return copied; +} + +static unsigned int copy_to_ring(XenXenstoreState *s, uint8_t *ptr, unsigned int len) +{ +if (!len) +return 0; + +XENSTORE_RING_IDX cons = qatomic_read(&s->xs->rsp_cons); +XENSTORE_RING_IDX prod = qatomic_read(&s->xs->rsp_prod); +unsigned int copied = 0; + +smp_mb(); + +while (len) { +unsigned int avail = cons + XENSTORE_RING_SIZE - prod; +unsigned int offset = MASK_XENSTORE_IDX(prod); +unsigned int copylen = len; + +if (avail > XENSTORE_RING_SIZE) { +error_report("XenStore ring handling error"); +s->fatal_error = true; +break; +} else if (avail == 0) +break; + +if (copylen > avail) { +copylen = avail; +} +if (copylen > XENSTORE_RING_SIZE - offset) { +copylen = XENSTORE_RING_SIZE - offset; +} + + +memcpy(&s->xs->rsp[offset], ptr, copylen); +copied += copylen; + +ptr += copylen; +len -= copylen; + +prod += copylen; +} + +smp_mb(); + +qatomic_set(&s->xs->rsp_prod, prod); + +return copied; +} + +static unsigned int get_req(XenXenstoreState *s) +{ +unsigned int copied = 0; + +if (s->fatal_error) +return 0; + +assert(!req_pending(s)); + +if (s->req_offset < XENSTORE_HEADER_SIZE) { +void *ptr = s->req_data + s->req_offset; +unsigned int len = XENSTORE_HEADER_SIZE; +unsigned int copylen = copy_from_ring(s, ptr, len); + +copied += copylen; +s->req_offset += copylen; +} + +if (s->req_offset >= 
XENSTORE_HEADER_SIZE) { +struct xsd_sockmsg *req = (struct xsd_sockmsg *)s->req_data; + +if (req->len > (uint32_t)XENSTORE_PAYLOAD_MAX) { +error_report("Illegal XenStore request"); +s->fatal_error = true; +return 0; +} + +void *ptr = s->req_data + s->req_offset; +unsigned int len = XENSTOR
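The ring copy loops in copy_from_ring() and copy_to_ring() rely on free-running 32-bit producer/consumer indices that are only masked when indexing the buffer. A stripped-down sketch of that technique follows. It is single-threaded and deliberately omits the smp_mb() barriers and qatomic accessors the patch uses, and struct ring, RING_SIZE and the function names are assumptions made for illustration rather than names from the Xen headers.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define RING_SIZE 1024u                      /* must be a power of two */
#define RING_MASK(idx) ((idx) & (RING_SIZE - 1))

/* Hypothetical single-direction ring; indices run freely and wrap at 2^32. */
struct ring {
    uint8_t  data[RING_SIZE];
    uint32_t prod;   /* total bytes ever produced */
    uint32_t cons;   /* total bytes ever consumed */
};

/* Copy up to len bytes out of the ring. Returns bytes copied, or -1 if the
 * indices are inconsistent (more than RING_SIZE bytes claimed available). */
static int ring_read(struct ring *r, uint8_t *dst, unsigned int len)
{
    uint32_t prod = r->prod, cons = r->cons;
    unsigned int copied = 0;

    while (len) {
        unsigned int avail = prod - cons;        /* correct across wraparound */
        unsigned int offset = RING_MASK(cons);
        unsigned int chunk = avail;

        if (avail > RING_SIZE) {
            return -1;
        }
        if (avail == 0) {
            break;
        }
        if (chunk > len) {
            chunk = len;
        }
        if (chunk > RING_SIZE - offset) {        /* stop at end of buffer */
            chunk = RING_SIZE - offset;
        }
        memcpy(dst, &r->data[offset], chunk);
        dst += chunk;
        len -= chunk;
        cons += chunk;
        copied += chunk;
    }
    r->cons = cons;
    return (int)copied;
}

/* Copy up to len bytes into the ring. Returns bytes copied, or -1 if the
 * indices are inconsistent. */
static int ring_write(struct ring *r, const uint8_t *src, unsigned int len)
{
    uint32_t prod = r->prod, cons = r->cons;
    unsigned int copied = 0;

    while (len) {
        unsigned int space = cons + RING_SIZE - prod;
        unsigned int offset = RING_MASK(prod);
        unsigned int chunk = len;

        if (space > RING_SIZE) {
            return -1;
        }
        if (space == 0) {
            break;
        }
        if (chunk > space) {
            chunk = space;
        }
        if (chunk > RING_SIZE - offset) {
            chunk = RING_SIZE - offset;
        }
        memcpy(&r->data[offset], src, chunk);
        src += chunk;
        len -= chunk;
        prod += chunk;
        copied += chunk;
    }
    r->prod = prod;
    return (int)copied;
}
```

Because the subtraction is done in unsigned arithmetic, the available-bytes calculation stays correct even when prod has wrapped past zero and cons has not. Each pass of the loop copies at most up to the end of the buffer, so a request that straddles the end is handled in two memcpy() calls.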
[PATCH v11 36/59] hw/xen: Implement EVTCHNOP_bind_interdomain
From: David Woodhouse Signed-off-by: David Woodhouse Reviewed-by: Paul Durrant --- hw/i386/kvm/xen_evtchn.c | 78 +++ hw/i386/kvm/xen_evtchn.h | 2 + target/i386/kvm/xen-emu.c | 16 3 files changed, 96 insertions(+) diff --git a/hw/i386/kvm/xen_evtchn.c b/hw/i386/kvm/xen_evtchn.c index 9dc5a98d94..3e6f7afcbc 100644 --- a/hw/i386/kvm/xen_evtchn.c +++ b/hw/i386/kvm/xen_evtchn.c @@ -720,6 +720,23 @@ static int close_port(XenEvtchnState *s, evtchn_port_t port) } break; +case EVTCHNSTAT_interdomain: +if (p->type_val & PORT_INFO_TYPEVAL_REMOTE_QEMU) { +/* Not yet implemented. This can't happen! */ +} else { +/* Loopback interdomain */ +XenEvtchnPort *rp = &s->port_table[p->type_val]; +if (!valid_port(p->type_val) || rp->type_val != port || +rp->type != EVTCHNSTAT_interdomain) { +error_report("Inconsistent state for interdomain unbind"); +} else { +/* Set the other end back to unbound */ +rp->type = EVTCHNSTAT_unbound; +rp->type_val = 0; +} +} +break; + default: break; } @@ -835,6 +852,67 @@ int xen_evtchn_bind_ipi_op(struct evtchn_bind_ipi *ipi) return ret; } +int xen_evtchn_bind_interdomain_op(struct evtchn_bind_interdomain *interdomain) +{ +XenEvtchnState *s = xen_evtchn_singleton; +uint16_t type_val; +int ret; + +if (!s) { +return -ENOTSUP; +} + +if (interdomain->remote_dom == DOMID_QEMU) { +type_val = PORT_INFO_TYPEVAL_REMOTE_QEMU; +} else if (interdomain->remote_dom == DOMID_SELF || + interdomain->remote_dom == xen_domid) { +type_val = 0; +} else { +return -ESRCH; +} + +if (!valid_port(interdomain->remote_port)) { +return -EINVAL; +} + +qemu_mutex_lock(&s->port_lock); + +/* The newly allocated port starts out as unbound */ +ret = allocate_port(s, 0, EVTCHNSTAT_unbound, type_val, +&interdomain->local_port); +if (ret) { +goto out; +} + +if (interdomain->remote_dom == DOMID_QEMU) { +/* We haven't hooked up QEMU's PV drivers to this yet */ +ret = -ENOSYS; +} else { +/* Loopback */ +XenEvtchnPort *rp = &s->port_table[interdomain->remote_port]; +XenEvtchnPort *lp = 
&s->port_table[interdomain->local_port]; + +if (rp->type == EVTCHNSTAT_unbound && rp->type_val == 0) { +/* It's a match! */ +rp->type = EVTCHNSTAT_interdomain; +rp->type_val = interdomain->local_port; + +lp->type = EVTCHNSTAT_interdomain; +lp->type_val = interdomain->remote_port; +} else { +ret = -EINVAL; +} +} + +if (ret) { +free_port(s, interdomain->local_port); +} + out: +qemu_mutex_unlock(&s->port_lock); + +return ret; + +} int xen_evtchn_alloc_unbound_op(struct evtchn_alloc_unbound *alloc) { XenEvtchnState *s = xen_evtchn_singleton; diff --git a/hw/i386/kvm/xen_evtchn.h b/hw/i386/kvm/xen_evtchn.h index fc080138e3..1ebc7580eb 100644 --- a/hw/i386/kvm/xen_evtchn.h +++ b/hw/i386/kvm/xen_evtchn.h @@ -22,6 +22,7 @@ struct evtchn_bind_virq; struct evtchn_bind_ipi; struct evtchn_send; struct evtchn_alloc_unbound; +struct evtchn_bind_interdomain; int xen_evtchn_status_op(struct evtchn_status *status); int xen_evtchn_close_op(struct evtchn_close *close); int xen_evtchn_unmask_op(struct evtchn_unmask *unmask); @@ -29,5 +30,6 @@ int xen_evtchn_bind_virq_op(struct evtchn_bind_virq *virq); int xen_evtchn_bind_ipi_op(struct evtchn_bind_ipi *ipi); int xen_evtchn_send_op(struct evtchn_send *send); int xen_evtchn_alloc_unbound_op(struct evtchn_alloc_unbound *alloc); +int xen_evtchn_bind_interdomain_op(struct evtchn_bind_interdomain *interdomain); #endif /* QEMU_XEN_EVTCHN_H */ diff --git a/target/i386/kvm/xen-emu.c b/target/i386/kvm/xen-emu.c index e186dec9a9..a07d1d39f3 100644 --- a/target/i386/kvm/xen-emu.c +++ b/target/i386/kvm/xen-emu.c @@ -933,6 +933,22 @@ static bool kvm_xen_hcall_evtchn_op(struct kvm_xen_exit *exit, X86CPU *cpu, } break; } +case EVTCHNOP_bind_interdomain: { +struct evtchn_bind_interdomain interdomain; + +qemu_build_assert(sizeof(interdomain) == 12); +if (kvm_copy_from_gva(cs, arg, &interdomain, sizeof(interdomain))) { +err = -EFAULT; +break; +} + +err = xen_evtchn_bind_interdomain_op(&interdomain); +if (!err && +kvm_copy_to_gva(cs, arg, &interdomain, 
sizeof(interdomain))) { +err = -EFAULT; +} +break; +} default: return false; } -- 2.39.0
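As an aside for reviewers, the loopback case of this patch boils down to pairing two entries of the port table and undoing that pairing on close. The sketch below models just that state machine. The table size, enum names and function names are hypothetical, there is no locking, and unlike the patch it checks the remote port before allocating the local one, which is a simplification chosen here and not a statement about what the patch should do.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define NR_PORTS 16   /* hypothetical table size, kept small for illustration */

enum port_type { PORT_CLOSED = 0, PORT_UNBOUND, PORT_INTERDOMAIN };

struct port {
    enum port_type type;
    uint16_t type_val;   /* peer port number once bound, otherwise 0 */
};

static int valid_port(uint16_t port)
{
    return port > 0 && port < NR_PORTS;
}

/* Find the lowest closed port and mark it unbound. */
static int alloc_unbound(struct port *table, uint16_t *out)
{
    for (uint16_t i = 1; i < NR_PORTS; i++) {
        if (table[i].type == PORT_CLOSED) {
            table[i].type = PORT_UNBOUND;
            table[i].type_val = 0;
            *out = i;
            return 0;
        }
    }
    return -ENOSPC;
}

/* Bind a freshly allocated local port to an existing unbound remote port. */
static int bind_loopback(struct port *table, uint16_t remote, uint16_t *local)
{
    int ret;

    if (!valid_port(remote)) {
        return -EINVAL;
    }
    /* The remote end must be waiting for a loopback peer. */
    if (table[remote].type != PORT_UNBOUND || table[remote].type_val != 0) {
        return -EINVAL;
    }
    ret = alloc_unbound(table, local);
    if (ret) {
        return ret;
    }
    table[remote].type = PORT_INTERDOMAIN;
    table[remote].type_val = *local;
    table[*local].type = PORT_INTERDOMAIN;
    table[*local].type_val = remote;
    return 0;
}

/* Close one end; the other end goes back to unbound, as in close_port(). */
static int close_port(struct port *table, uint16_t port)
{
    if (!valid_port(port) || table[port].type == PORT_CLOSED) {
        return -EINVAL;
    }
    if (table[port].type == PORT_INTERDOMAIN) {
        struct port *rp = &table[table[port].type_val];

        if (rp->type == PORT_INTERDOMAIN && rp->type_val == port) {
            rp->type = PORT_UNBOUND;
            rp->type_val = 0;
        }
    }
    table[port].type = PORT_CLOSED;
    table[port].type_val = 0;
    return 0;
}
```

After a successful bind each end stores the other's port number in type_val, which is the invariant the "Inconsistent state for interdomain unbind" check in the patch verifies before resetting the peer.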
[PATCH v11 47/59] i386/xen: handle PV timer hypercalls
From: Joao Martins Introduce support for one shot and periodic mode of Xen PV timers, whereby timer interrupts come through a special virq event channel with deadlines being set through: 1) set_timer_op hypercall (only oneshot) 2) vcpu_op hypercall for {set,stop}_{singleshot,periodic}_timer hypercalls Signed-off-by: Joao Martins Signed-off-by: David Woodhouse --- hw/i386/kvm/xen_evtchn.c | 31 + hw/i386/kvm/xen_evtchn.h | 2 + target/i386/cpu.h | 5 + target/i386/kvm/xen-emu.c | 252 +- target/i386/machine.c | 1 + 5 files changed, 289 insertions(+), 2 deletions(-) diff --git a/hw/i386/kvm/xen_evtchn.c b/hw/i386/kvm/xen_evtchn.c index 5d5996641d..06572b3e10 100644 --- a/hw/i386/kvm/xen_evtchn.c +++ b/hw/i386/kvm/xen_evtchn.c @@ -1220,6 +1220,37 @@ int xen_evtchn_send_op(struct evtchn_send *send) return ret; } +int xen_evtchn_set_port(uint16_t port) +{ +XenEvtchnState *s = xen_evtchn_singleton; +XenEvtchnPort *p; +int ret = -EINVAL; + +if (!s) { +return -ENOTSUP; +} + +if (!valid_port(port)) { +return -EINVAL; +} + +qemu_mutex_lock(&s->port_lock); + +p = &s->port_table[port]; + +/* QEMU has no business sending to anything but these */ +if (p->type == EVTCHNSTAT_virq || +(p->type == EVTCHNSTAT_interdomain && + (p->type_val & PORT_INFO_TYPEVAL_REMOTE_QEMU))) { +set_port_pending(s, port); +ret = 0; +} + +qemu_mutex_unlock(&s->port_lock); + +return ret; +} + EvtchnInfoList *qmp_xen_event_list(Error **errp) { XenEvtchnState *s = xen_evtchn_singleton; diff --git a/hw/i386/kvm/xen_evtchn.h b/hw/i386/kvm/xen_evtchn.h index b03c3108bc..24611478b8 100644 --- a/hw/i386/kvm/xen_evtchn.h +++ b/hw/i386/kvm/xen_evtchn.h @@ -20,6 +20,8 @@ int xen_evtchn_set_callback_param(uint64_t param); void xen_evtchn_connect_gsis(qemu_irq *system_gsis); void xen_evtchn_set_callback_level(int level); +int xen_evtchn_set_port(uint16_t port); + struct evtchn_status; struct evtchn_close; struct evtchn_unmask; diff --git a/target/i386/cpu.h b/target/i386/cpu.h index e8718c31e5..b579f0f0f8 100644 --- 
a/target/i386/cpu.h +++ b/target/i386/cpu.h @@ -26,6 +26,7 @@ #include "exec/cpu-defs.h" #include "qapi/qapi-types-common.h" #include "qemu/cpu-float.h" +#include "qemu/timer.h" #define XEN_NR_VIRQS 24 @@ -1800,6 +1801,10 @@ typedef struct CPUArchState { bool xen_callback_asserted; uint16_t xen_virq[XEN_NR_VIRQS]; uint64_t xen_singleshot_timer_ns; +QEMUTimer *xen_singleshot_timer; +uint64_t xen_periodic_timer_period; +QEMUTimer *xen_periodic_timer; +QemuMutex xen_timers_lock; #endif #if defined(CONFIG_HVF) HVFX86LazyFlags hvf_lflags; diff --git a/target/i386/kvm/xen-emu.c b/target/i386/kvm/xen-emu.c index 44fa0de784..4781b1fa97 100644 --- a/target/i386/kvm/xen-emu.c +++ b/target/i386/kvm/xen-emu.c @@ -38,6 +38,9 @@ #include "xen-compat.h" +static void xen_vcpu_singleshot_timer_event(void *opaque); +static void xen_vcpu_periodic_timer_event(void *opaque); + #ifdef TARGET_X86_64 #define hypercall_compat32(longmode) (!(longmode)) #else @@ -201,6 +204,23 @@ int kvm_xen_init_vcpu(CPUState *cs) env->xen_vcpu_time_info_gpa = INVALID_GPA; env->xen_vcpu_runstate_gpa = INVALID_GPA; +qemu_mutex_init(&env->xen_timers_lock); +env->xen_singleshot_timer = timer_new_ns(QEMU_CLOCK_VIRTUAL, + xen_vcpu_singleshot_timer_event, + cpu); +if (!env->xen_singleshot_timer) { +return -ENOMEM; +} +env->xen_singleshot_timer->opaque = cs; + +env->xen_periodic_timer = timer_new_ns(QEMU_CLOCK_VIRTUAL, + xen_vcpu_periodic_timer_event, + cpu); +if (!env->xen_periodic_timer) { +return -ENOMEM; +} +env->xen_periodic_timer->opaque = cs; + return 0; } @@ -232,7 +252,8 @@ static bool kvm_xen_hcall_xen_version(struct kvm_xen_exit *exit, X86CPU *cpu, 1 << XENFEAT_writable_descriptor_tables | 1 << XENFEAT_auto_translated_physmap | 1 << XENFEAT_supervisor_mode_kernel | - 1 << XENFEAT_hvm_callback_vector; + 1 << XENFEAT_hvm_callback_vector | + 1 << XENFEAT_hvm_safe_pvclock; } err = kvm_copy_to_gva(CPU(cpu), arg, &fi, sizeof(fi)); @@ -875,13 +896,192 @@ static int vcpuop_register_runstate_info(CPUState *cs, 
CPUState *target, return 0; } +static uint64_t kvm_get_current_ns(void) +{ +struct kvm_clock_data data; +int ret; + +ret = kvm_vm_ioctl(kvm_state, KVM_GET_CLOCK, &data); +if (ret < 0) { +fprintf(stderr, "KVM_GET_CLOCK failed: %s\n", strerror(-ret)); +abort(); +} + +return data.clo
[PATCH v11 42/59] kvm/i386: Add xen-gnttab-max-frames property
From: David Woodhouse Signed-off-by: David Woodhouse Reviewed-by: Paul Durrant --- accel/kvm/kvm-all.c | 1 + include/sysemu/kvm_int.h | 1 + include/sysemu/kvm_xen.h | 1 + target/i386/kvm/kvm.c | 34 ++ target/i386/kvm/xen-emu.c | 6 ++ 5 files changed, 43 insertions(+) diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c index f242e36316..dc5b0bb434 100644 --- a/accel/kvm/kvm-all.c +++ b/accel/kvm/kvm-all.c @@ -3704,6 +3704,7 @@ static void kvm_accel_instance_init(Object *obj) s->notify_vmexit = NOTIFY_VMEXIT_OPTION_RUN; s->notify_window = 0; s->xen_version = 0; +s->xen_gnttab_max_frames = 64; } /** diff --git a/include/sysemu/kvm_int.h b/include/sysemu/kvm_int.h index 7f945bc763..39ce4d36f6 100644 --- a/include/sysemu/kvm_int.h +++ b/include/sysemu/kvm_int.h @@ -120,6 +120,7 @@ struct KVMState uint32_t notify_window; uint32_t xen_version; uint32_t xen_caps; +uint16_t xen_gnttab_max_frames; }; void kvm_memory_listener_register(KVMState *s, KVMMemoryListener *kml, diff --git a/include/sysemu/kvm_xen.h b/include/sysemu/kvm_xen.h index 1edff29541..7fee28dec7 100644 --- a/include/sysemu/kvm_xen.h +++ b/include/sysemu/kvm_xen.h @@ -25,6 +25,7 @@ void *kvm_xen_get_vcpu_info_hva(uint32_t vcpu_id); void kvm_xen_inject_vcpu_callback_vector(uint32_t vcpu_id, int type); void kvm_xen_set_callback_asserted(void); int kvm_xen_set_vcpu_virq(uint32_t vcpu_id, uint16_t virq, uint16_t port); +uint16_t kvm_xen_get_gnttab_max_frames(void); #define kvm_xen_has_cap(cap) (!!(kvm_xen_get_caps() & \ KVM_XEN_HVM_CONFIG_ ## cap)) diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c index f6ae70c831..6d112ccddd 100644 --- a/target/i386/kvm/kvm.c +++ b/target/i386/kvm/kvm.c @@ -5865,6 +5865,33 @@ static void kvm_arch_set_xen_version(Object *obj, Visitor *v, } } +static void kvm_arch_get_xen_gnttab_max_frames(Object *obj, Visitor *v, + const char *name, void *opaque, + Error **errp) +{ +KVMState *s = KVM_STATE(obj); +uint16_t value = s->xen_gnttab_max_frames; + +visit_type_uint16(v, name, 
&value, errp); +} + +static void kvm_arch_set_xen_gnttab_max_frames(Object *obj, Visitor *v, + const char *name, void *opaque, + Error **errp) +{ +KVMState *s = KVM_STATE(obj); +Error *error = NULL; +uint16_t value; + +visit_type_uint16(v, name, &value, &error); +if (error) { +error_propagate(errp, error); +return; +} + +s->xen_gnttab_max_frames = value; +} + void kvm_arch_accel_class_init(ObjectClass *oc) { object_class_property_add_enum(oc, "notify-vmexit", "NotifyVMexitOption", @@ -5890,6 +5917,13 @@ void kvm_arch_accel_class_init(ObjectClass *oc) "Xen version to be emulated " "(in XENVER_version form " "e.g. 0x4000a for 4.10)"); + +object_class_property_add(oc, "xen-gnttab-max-frames", "uint16", + kvm_arch_get_xen_gnttab_max_frames, + kvm_arch_set_xen_gnttab_max_frames, + NULL, NULL); +object_class_property_set_description(oc, "xen-gnttab-max-frames", + "Maximum number of grant table frames"); } void kvm_set_max_apic_id(uint32_t max_apic_id) diff --git a/target/i386/kvm/xen-emu.c b/target/i386/kvm/xen-emu.c index ec82170261..c57620ca51 100644 --- a/target/i386/kvm/xen-emu.c +++ b/target/i386/kvm/xen-emu.c @@ -1235,6 +1235,12 @@ int kvm_xen_handle_exit(X86CPU *cpu, struct kvm_xen_exit *exit) return 0; } +uint16_t kvm_xen_get_gnttab_max_frames(void) +{ +KVMState *s = KVM_STATE(current_accel()); +return s->xen_gnttab_max_frames; +} + int kvm_put_xen_state(CPUState *cs) { X86CPU *cpu = X86_CPU(cs); -- 2.39.0
[PATCH v11 48/59] i386/xen: Reserve Xen special pages for console, xenstore rings
From: David Woodhouse Xen has eight frames at 0xfeff8000 for this; we only really need two for now and KVM puts the identity map at 0xfeffc000, so limit ourselves to four. Signed-off-by: David Woodhouse Reviewed-by: Paul Durrant --- include/sysemu/kvm_xen.h | 8 target/i386/kvm/xen-emu.c | 10 ++ 2 files changed, 18 insertions(+) diff --git a/include/sysemu/kvm_xen.h b/include/sysemu/kvm_xen.h index 7fee28dec7..0b63bb81df 100644 --- a/include/sysemu/kvm_xen.h +++ b/include/sysemu/kvm_xen.h @@ -30,4 +30,12 @@ uint16_t kvm_xen_get_gnttab_max_frames(void); #define kvm_xen_has_cap(cap) (!!(kvm_xen_get_caps() & \ KVM_XEN_HVM_CONFIG_ ## cap)) +#define XEN_SPECIAL_AREA_ADDR 0xfeff8000UL +#define XEN_SPECIAL_AREA_SIZE 0x4000UL + +#define XEN_SPECIALPAGE_CONSOLE 0 +#define XEN_SPECIALPAGE_XENSTORE 1 + +#define XEN_SPECIAL_PFN(x) ((XEN_SPECIAL_AREA_ADDR >> TARGET_PAGE_BITS) + XEN_SPECIALPAGE_##x) + #endif /* QEMU_SYSEMU_KVM_XEN_H */ diff --git a/target/i386/kvm/xen-emu.c b/target/i386/kvm/xen-emu.c index 4781b1fa97..f55ab08959 100644 --- a/target/i386/kvm/xen-emu.c +++ b/target/i386/kvm/xen-emu.c @@ -23,6 +23,7 @@ #include "hw/pci/msi.h" #include "hw/i386/apic-msidef.h" +#include "hw/i386/e820_memory_layout.h" #include "hw/i386/kvm/xen_overlay.h" #include "hw/i386/kvm/xen_evtchn.h" #include "hw/i386/kvm/xen_gnttab.h" @@ -169,6 +170,15 @@ int kvm_xen_init(KVMState *s, uint32_t hypercall_msr) } s->xen_caps = xen_caps; + +/* Tell fw_cfg to notify the BIOS to reserve the range. */ +ret = e820_add_entry(XEN_SPECIAL_AREA_ADDR, XEN_SPECIAL_AREA_SIZE, + E820_RESERVED); +if (ret < 0) { +fprintf(stderr, "e820_add_entry() table is full\n"); +return ret; +} + return 0; } -- 2.39.0
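To make the address arithmetic in this patch concrete, here is a small worked sketch of how the special-page frame numbers fall out of the constants. It assumes 4 KiB pages (PAGE_BITS of 12, standing in for TARGET_PAGE_BITS on x86), and the helper function names are made up for the example.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_BITS 12                    /* assumption: 4 KiB pages, as on x86 */

#define SPECIAL_AREA_ADDR 0xfeff8000UL  /* same values as the patch */
#define SPECIAL_AREA_SIZE 0x4000UL

#define SPECIALPAGE_CONSOLE  0
#define SPECIALPAGE_XENSTORE 1

/* Guest frame number of the n-th special page. */
static uint64_t special_pfn(unsigned int page)
{
    return ((uint64_t)SPECIAL_AREA_ADDR >> PAGE_BITS) + page;
}

/* Guest physical address of the n-th special page. */
static uint64_t special_gpa(unsigned int page)
{
    return special_pfn(page) << PAGE_BITS;
}

/* Number of whole pages in the reserved area. */
static unsigned int special_page_count(void)
{
    return (unsigned int)(SPECIAL_AREA_SIZE >> PAGE_BITS);
}

/* First address past the reserved area. */
static uint64_t special_area_end(void)
{
    return (uint64_t)SPECIAL_AREA_ADDR + SPECIAL_AREA_SIZE;
}
```

Under that assumption the console ring sits at frame 0xfeff8, the xenstore ring at frame 0xfeff9, and the four-page area ends exactly at 0xfeffc000, the address the commit message gives for the KVM identity map.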
[PATCH v11 17/59] i386/xen: implement HYPERVISOR_memory_op
From: Joao Martins Specifically XENMEM_add_to_physmap with space XENMAPSPACE_shared_info to allow the guest to set its shared_info page. Signed-off-by: Joao Martins [dwmw2: Use the xen_overlay device, add compat support] Signed-off-by: David Woodhouse Reviewed-by: Paul Durrant --- target/i386/kvm/trace-events | 1 + target/i386/kvm/xen-compat.h | 27 target/i386/kvm/xen-emu.c| 116 ++- 3 files changed, 143 insertions(+), 1 deletion(-) create mode 100644 target/i386/kvm/xen-compat.h diff --git a/target/i386/kvm/trace-events b/target/i386/kvm/trace-events index bb732e1da8..8e9f269f56 100644 --- a/target/i386/kvm/trace-events +++ b/target/i386/kvm/trace-events @@ -9,3 +9,4 @@ kvm_x86_update_msi_routes(int num) "Updated %d MSI routes" # xen-emu.c kvm_xen_hypercall(int cpu, uint8_t cpl, uint64_t input, uint64_t a0, uint64_t a1, uint64_t a2, uint64_t ret) "xen_hypercall: cpu %d cpl %d input %" PRIu64 " a0 0x%" PRIx64 " a1 0x%" PRIx64 " a2 0x%" PRIx64" ret 0x%" PRIx64 kvm_xen_soft_reset(void) "" +kvm_xen_set_shared_info(uint64_t gfn) "shared info at gfn 0x%" PRIx64 diff --git a/target/i386/kvm/xen-compat.h b/target/i386/kvm/xen-compat.h new file mode 100644 index 00..2d852e2a28 --- /dev/null +++ b/target/i386/kvm/xen-compat.h @@ -0,0 +1,27 @@ +/* + * Xen HVM emulation support in KVM + * + * Copyright © 2022 Amazon.com, Inc. or its affiliates. All Rights Reserved. + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. 
+ * + */ + +#ifndef QEMU_I386_KVM_XEN_COMPAT_H +#define QEMU_I386_KVM_XEN_COMPAT_H + +#include "hw/xen/interface/memory.h" + +typedef uint32_t compat_pfn_t; +typedef uint32_t compat_ulong_t; + +struct compat_xen_add_to_physmap { +domid_t domid; +uint16_t size; +unsigned int space; +compat_ulong_t idx; +compat_pfn_t gpfn; +}; + +#endif /* QEMU_I386_XEN_COMPAT_H */ diff --git a/target/i386/kvm/xen-emu.c b/target/i386/kvm/xen-emu.c index be6d85f2cb..5d79827128 100644 --- a/target/i386/kvm/xen-emu.c +++ b/target/i386/kvm/xen-emu.c @@ -12,6 +12,7 @@ #include "qemu/osdep.h" #include "qemu/log.h" #include "qemu/main-loop.h" +#include "hw/xen/xen.h" #include "sysemu/kvm_int.h" #include "sysemu/kvm_xen.h" #include "kvm/kvm_i386.h" @@ -24,6 +25,15 @@ #include "hw/xen/interface/version.h" #include "hw/xen/interface/sched.h" +#include "hw/xen/interface/memory.h" + +#include "xen-compat.h" + +#ifdef TARGET_X86_64 +#define hypercall_compat32(longmode) (!(longmode)) +#else +#define hypercall_compat32(longmode) (false) +#endif static int kvm_gva_rw(CPUState *cs, uint64_t gva, void *_buf, size_t sz, bool is_write) @@ -175,13 +185,114 @@ static bool kvm_xen_hcall_xen_version(struct kvm_xen_exit *exit, X86CPU *cpu, return true; } +static int xen_set_shared_info(uint64_t gfn) +{ +uint64_t gpa = gfn << TARGET_PAGE_BITS; +int err; + +QEMU_IOTHREAD_LOCK_GUARD(); + +/* + * The xen_overlay device tells KVM about it too, since it had to + * do that on migration load anyway (unless we're going to jump + * through lots of hoops to maintain the fiction that this isn't + * KVM-specific. 
+ */ +err = xen_overlay_map_shinfo_page(gpa); +if (err) { +return err; +} + +trace_kvm_xen_set_shared_info(gfn); + +return err; +} + +static int add_to_physmap_one(uint32_t space, uint64_t idx, uint64_t gfn) +{ +switch (space) { +case XENMAPSPACE_shared_info: +if (idx > 0) { +return -EINVAL; +} +return xen_set_shared_info(gfn); + +case XENMAPSPACE_grant_table: +case XENMAPSPACE_gmfn: +case XENMAPSPACE_gmfn_range: +return -ENOTSUP; + +case XENMAPSPACE_gmfn_foreign: +case XENMAPSPACE_dev_mmio: +return -EPERM; + +default: +return -EINVAL; +} +} + +static int do_add_to_physmap(struct kvm_xen_exit *exit, X86CPU *cpu, + uint64_t arg) +{ +struct xen_add_to_physmap xatp; +CPUState *cs = CPU(cpu); + +if (hypercall_compat32(exit->u.hcall.longmode)) { +struct compat_xen_add_to_physmap xatp32; + +qemu_build_assert(sizeof(struct compat_xen_add_to_physmap) == 16); +if (kvm_copy_from_gva(cs, arg, &xatp32, sizeof(xatp32))) { +return -EFAULT; +} +xatp.domid = xatp32.domid; +xatp.size = xatp32.size; +xatp.space = xatp32.space; +xatp.idx = xatp32.idx; +xatp.gpfn = xatp32.gpfn; +} else { +if (kvm_copy_from_gva(cs, arg, &xatp, sizeof(xatp))) { +return -EFAULT; +} +} + +if (xatp.domid != DOMID_SELF && xatp.domid != xen_domid) { +return -ESRCH; +} + +return add_to_physmap_one(xatp.space, xatp.idx, xatp.gpfn); +} + +static bool kvm_xen_hcall_memory_op(struct kvm_xen_exit *exit, X86CPU *cpu, +
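The compat handling in do_add_to_physmap() depends on the 32-bit structure being exactly 16 bytes and on each field being widened into the native layout. A self-contained sketch of that idea is below. The struct native_add_to_physmap here is an assumed stand-in for the 64-bit struct xen_add_to_physmap (with idx and gpfn taken to be 64-bit), and widen_add_to_physmap() is a hypothetical helper name; the patch does the same assignments inline.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint16_t domid_t;          /* as in Xen's public headers */
typedef uint32_t compat_pfn_t;     /* as in xen-compat.h from the patch */
typedef uint32_t compat_ulong_t;

/* 32-bit guest layout, copied from the patch. */
struct compat_xen_add_to_physmap {
    domid_t domid;
    uint16_t size;
    unsigned int space;
    compat_ulong_t idx;
    compat_pfn_t gpfn;
};

/* Assumed 64-bit guest layout: idx and gpfn widen to 64 bits. */
struct native_add_to_physmap {
    domid_t domid;
    uint16_t size;
    unsigned int space;
    uint64_t idx;
    uint64_t gpfn;
};

/* Compile-time layout check, in the spirit of qemu_build_assert(). */
_Static_assert(sizeof(struct compat_xen_add_to_physmap) == 16,
               "compat layout must match the 32-bit guest ABI");

/* Field-by-field widening; a plain memcpy() would misplace idx and gpfn. */
static void widen_add_to_physmap(const struct compat_xen_add_to_physmap *in,
                                 struct native_add_to_physmap *out)
{
    out->domid = in->domid;
    out->size = in->size;
    out->space = in->space;
    out->idx = in->idx;
    out->gpfn = in->gpfn;
}
```

The first three fields sit at the same offsets in both layouts, but idx and gpfn do not, which is why the hypercall handler copies the guest buffer into the compat struct first and then converts.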
[PATCH v11 37/59] hw/xen: Implement EVTCHNOP_bind_vcpu
From: David Woodhouse Signed-off-by: David Woodhouse Reviewed-by: Paul Durrant --- hw/i386/kvm/xen_evtchn.c | 40 +++ hw/i386/kvm/xen_evtchn.h | 2 ++ target/i386/kvm/xen-emu.c | 12 3 files changed, 54 insertions(+) diff --git a/hw/i386/kvm/xen_evtchn.c b/hw/i386/kvm/xen_evtchn.c index 3e6f7afcbc..f87b6a3b23 100644 --- a/hw/i386/kvm/xen_evtchn.c +++ b/hw/i386/kvm/xen_evtchn.c @@ -789,6 +789,46 @@ int xen_evtchn_unmask_op(struct evtchn_unmask *unmask) return ret; } +int xen_evtchn_bind_vcpu_op(struct evtchn_bind_vcpu *vcpu) +{ +XenEvtchnState *s = xen_evtchn_singleton; +XenEvtchnPort *p; +int ret = -EINVAL; + +if (!s) { +return -ENOTSUP; +} + +if (!valid_port(vcpu->port)) { +return -EINVAL; +} + +if (!valid_vcpu(vcpu->vcpu)) { +return -ENOENT; +} + +qemu_mutex_lock(&s->port_lock); + +p = &s->port_table[vcpu->port]; + +if (p->type == EVTCHNSTAT_interdomain || +p->type == EVTCHNSTAT_unbound || +p->type == EVTCHNSTAT_pirq || +(p->type == EVTCHNSTAT_virq && virq_is_global(p->type_val))) { +/* + * unmask_port() with do_unmask==false will just raise the event + * on the new vCPU if the port was already pending. 
+ */ +p->vcpu = vcpu->vcpu; +unmask_port(s, vcpu->port, false); +ret = 0; +} + +qemu_mutex_unlock(&s->port_lock); + +return ret; +} + int xen_evtchn_bind_virq_op(struct evtchn_bind_virq *virq) { XenEvtchnState *s = xen_evtchn_singleton; diff --git a/hw/i386/kvm/xen_evtchn.h b/hw/i386/kvm/xen_evtchn.h index 1ebc7580eb..486b031c82 100644 --- a/hw/i386/kvm/xen_evtchn.h +++ b/hw/i386/kvm/xen_evtchn.h @@ -23,6 +23,7 @@ struct evtchn_bind_ipi; struct evtchn_send; struct evtchn_alloc_unbound; struct evtchn_bind_interdomain; +struct evtchn_bind_vcpu; int xen_evtchn_status_op(struct evtchn_status *status); int xen_evtchn_close_op(struct evtchn_close *close); int xen_evtchn_unmask_op(struct evtchn_unmask *unmask); @@ -31,5 +32,6 @@ int xen_evtchn_bind_ipi_op(struct evtchn_bind_ipi *ipi); int xen_evtchn_send_op(struct evtchn_send *send); int xen_evtchn_alloc_unbound_op(struct evtchn_alloc_unbound *alloc); int xen_evtchn_bind_interdomain_op(struct evtchn_bind_interdomain *interdomain); +int xen_evtchn_bind_vcpu_op(struct evtchn_bind_vcpu *vcpu); #endif /* QEMU_XEN_EVTCHN_H */ diff --git a/target/i386/kvm/xen-emu.c b/target/i386/kvm/xen-emu.c index a07d1d39f3..ec7aefadfc 100644 --- a/target/i386/kvm/xen-emu.c +++ b/target/i386/kvm/xen-emu.c @@ -949,6 +949,18 @@ static bool kvm_xen_hcall_evtchn_op(struct kvm_xen_exit *exit, X86CPU *cpu, } break; } +case EVTCHNOP_bind_vcpu: { +struct evtchn_bind_vcpu vcpu; + +qemu_build_assert(sizeof(vcpu) == 8); +if (kvm_copy_from_gva(cs, arg, &vcpu, sizeof(vcpu))) { +err = -EFAULT; +break; +} + +err = xen_evtchn_bind_vcpu_op(&vcpu); +break; +} default: return false; } -- 2.39.0
[PATCH v11 34/59] hw/xen: Implement EVTCHNOP_send
From: David Woodhouse Signed-off-by: David Woodhouse Reviewed-by: Paul Durrant --- hw/i386/kvm/xen_evtchn.c | 180 ++ hw/i386/kvm/xen_evtchn.h | 2 + target/i386/kvm/xen-emu.c | 12 +++ 3 files changed, 194 insertions(+) diff --git a/hw/i386/kvm/xen_evtchn.c b/hw/i386/kvm/xen_evtchn.c index d8527483b9..a97d6ba61d 100644 --- a/hw/i386/kvm/xen_evtchn.c +++ b/hw/i386/kvm/xen_evtchn.c @@ -490,6 +490,133 @@ static int unmask_port(XenEvtchnState *s, evtchn_port_t port, bool do_unmask) } } +static int do_set_port_lm(XenEvtchnState *s, evtchn_port_t port, + struct shared_info *shinfo, + struct vcpu_info *vcpu_info) +{ +const int bits_per_word = BITS_PER_BYTE * sizeof(shinfo->evtchn_pending[0]); +typeof(shinfo->evtchn_pending[0]) mask; +int idx = port / bits_per_word; +int offset = port % bits_per_word; + +mask = 1UL << offset; + +if (idx >= bits_per_word) { +return -EINVAL; +} + +/* Update the pending bit itself. If it was already set, we're done. */ +if (qatomic_fetch_or(&shinfo->evtchn_pending[idx], mask) & mask) { +return 0; +} + +/* Check if it's masked. */ +if (qatomic_fetch_or(&shinfo->evtchn_mask[idx], 0) & mask) { +return 0; +} + +/* Now on to the vcpu_info evtchn_pending_sel index... */ +mask = 1UL << idx; + +/* If a port in this word was already pending for this vCPU, all done. 
*/ +if (qatomic_fetch_or(&vcpu_info->evtchn_pending_sel, mask) & mask) { +return 0; +} + +/* Set evtchn_upcall_pending for this vCPU */ +if (qatomic_fetch_or(&vcpu_info->evtchn_upcall_pending, 1)) { +return 0; +} + +inject_callback(s, s->port_table[port].vcpu); + +return 0; +} + +static int do_set_port_compat(XenEvtchnState *s, evtchn_port_t port, + struct compat_shared_info *shinfo, + struct compat_vcpu_info *vcpu_info) +{ +const int bits_per_word = BITS_PER_BYTE * sizeof(shinfo->evtchn_pending[0]); +typeof(shinfo->evtchn_pending[0]) mask; +int idx = port / bits_per_word; +int offset = port % bits_per_word; + +mask = 1UL << offset; + +if (idx >= bits_per_word) { +return -EINVAL; +} + +/* Update the pending bit itself. If it was already set, we're done. */ +if (qatomic_fetch_or(&shinfo->evtchn_pending[idx], mask) & mask) { +return 0; +} + +/* Check if it's masked. */ +if (qatomic_fetch_or(&shinfo->evtchn_mask[idx], 0) & mask) { +return 0; +} + +/* Now on to the vcpu_info evtchn_pending_sel index... */ +mask = 1UL << idx; + +/* If a port in this word was already pending for this vCPU, all done. 
*/ +if (qatomic_fetch_or(&vcpu_info->evtchn_pending_sel, mask) & mask) { +return 0; +} + +/* Set evtchn_upcall_pending for this vCPU */ +if (qatomic_fetch_or(&vcpu_info->evtchn_upcall_pending, 1)) { +return 0; +} + +inject_callback(s, s->port_table[port].vcpu); + +return 0; +} + +static int set_port_pending(XenEvtchnState *s, evtchn_port_t port) +{ +void *vcpu_info, *shinfo; + +if (s->port_table[port].type == EVTCHNSTAT_closed) { +return -EINVAL; +} + +if (s->evtchn_in_kernel) { +XenEvtchnPort *p = &s->port_table[port]; +CPUState *cpu = qemu_get_cpu(p->vcpu); +struct kvm_irq_routing_xen_evtchn evt; + +if (!cpu) { +return 0; +} + +evt.port = port; +evt.vcpu = kvm_arch_vcpu_id(cpu); +evt.priority = KVM_IRQ_ROUTING_XEN_EVTCHN_PRIO_2LEVEL; + +return kvm_vm_ioctl(kvm_state, KVM_XEN_HVM_EVTCHN_SEND, &evt); +} + +shinfo = xen_overlay_get_shinfo_ptr(); +if (!shinfo) { +return -ENOTSUP; +} + +vcpu_info = kvm_xen_get_vcpu_info_hva(s->port_table[port].vcpu); +if (!vcpu_info) { +return -EINVAL; +} + +if (xen_is_long_mode()) { +return do_set_port_lm(s, port, shinfo, vcpu_info); +} else { +return do_set_port_compat(s, port, shinfo, vcpu_info); +} +} + static int clear_port_pending(XenEvtchnState *s, evtchn_port_t port) { void *p = xen_overlay_get_shinfo_ptr(); @@ -707,3 +834,56 @@ int xen_evtchn_bind_ipi_op(struct evtchn_bind_ipi *ipi) return ret; } + +int xen_evtchn_send_op(struct evtchn_send *send) +{ +XenEvtchnState *s = xen_evtchn_singleton; +XenEvtchnPort *p; +int ret = 0; + +if (!s) { +return -ENOTSUP; +} + +if (!valid_port(send->port)) { +return -EINVAL; +} + +qemu_mutex_lock(&s->port_lock); + +p = &s->port_table[send->port]; + +switch (p->type) { +case EVTCHNSTAT_interdomain: +if (p->type_val & PORT_INFO_TYPEVAL_REMOTE_QEMU) { +/* + * This is an event from the guest to qemu itself, which is + * serving as the driver domain. Not yet
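The do_set_port_lm() / do_set_port_compat() pair implements the 2-level event channel delivery sequence: set the per-port pending bit, check the mask, set the per-vCPU selector bit, then raise the upcall flag. The sketch below walks the same steps without atomics, so it is only valid single-threaded, whereas the patch uses qatomic_fetch_or() because the guest touches the same words concurrently. The struct and function names are hypothetical, and the return value of 1 stands in for the point where the patch calls inject_callback().

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define WORD_BITS 64u   /* 64-bit guest; the compat (32-bit) case would use 32 */

/* Hypothetical cut-down versions of shared_info and vcpu_info. */
struct shared_bits {
    uint64_t pending[WORD_BITS];   /* one bit per port */
    uint64_t mask[WORD_BITS];      /* one bit per port; set means masked */
};

struct vcpu_bits {
    uint8_t  upcall_pending;       /* "look at pending_sel" flag */
    uint64_t pending_sel;          /* one bit per word of pending[] */
};

/*
 * Mark a port pending. Returns 1 if the vCPU needs an upcall injected,
 * 0 if nothing further is needed, or -EINVAL for an out-of-range port.
 */
static int mark_port_pending(struct shared_bits *sh, struct vcpu_bits *v,
                             unsigned int port)
{
    unsigned int idx = port / WORD_BITS;
    unsigned int offset = port % WORD_BITS;
    uint64_t bit = 1ULL << offset;

    if (idx >= WORD_BITS) {
        return -EINVAL;
    }

    /* Level 1: the pending bit itself. Already set means nothing to do. */
    if (sh->pending[idx] & bit) {
        return 0;
    }
    sh->pending[idx] |= bit;

    /* A masked port stays pending but is not delivered. */
    if (sh->mask[idx] & bit) {
        return 0;
    }

    /* Level 2: tell the vCPU which word to scan. */
    bit = 1ULL << idx;
    if (v->pending_sel & bit) {
        return 0;
    }
    v->pending_sel |= bit;

    /* Finally the per-vCPU upcall flag; only the first setter injects. */
    if (v->upcall_pending) {
        return 0;
    }
    v->upcall_pending = 1;
    return 1;
}
```

Port 130, for example, maps to bit 2 of word 2, so a first delivery sets pending[2] to 4 and pending_sel to 4. Each early return corresponds to a case where someone else has already done, or will do, the remaining work, which keeps the number of injected callbacks to one per burst of events.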
[PATCH v11 16/59] i386/xen: manage and save/restore Xen guest long_mode setting
From: David Woodhouse Xen will "latch" the guest's 32-bit or 64-bit ("long mode") setting when the guest writes the MSR to fill in the hypercall page, or when the guest sets the event channel callback in HVM_PARAM_CALLBACK_IRQ. KVM handles the former and sets the kernel's long_mode flag accordingly. The latter will be handled in userspace. Keep them in sync by noticing when a hypercall is made in a mode that doesn't match qemu's idea of the guest mode, and resyncing from the kernel. Do that same sync right before serialization too, in case the guest has set the hypercall page but hasn't yet made a hypercall. Signed-off-by: David Woodhouse Reviewed-by: Paul Durrant --- hw/i386/kvm/xen_overlay.c | 62 +++ hw/i386/kvm/xen_overlay.h | 4 +++ target/i386/kvm/xen-emu.c | 12 3 files changed, 78 insertions(+) diff --git a/hw/i386/kvm/xen_overlay.c b/hw/i386/kvm/xen_overlay.c index a2441e2b4e..8685d87959 100644 --- a/hw/i386/kvm/xen_overlay.c +++ b/hw/i386/kvm/xen_overlay.c @@ -44,6 +44,7 @@ struct XenOverlayState { MemoryRegion shinfo_mem; void *shinfo_ptr; uint64_t shinfo_gpa; +bool long_mode; }; struct XenOverlayState *xen_overlay_singleton; @@ -96,9 +97,21 @@ static void xen_overlay_realize(DeviceState *dev, Error **errp) s->shinfo_ptr = memory_region_get_ram_ptr(&s->shinfo_mem); s->shinfo_gpa = INVALID_GPA; +s->long_mode = false; memset(s->shinfo_ptr, 0, XEN_PAGE_SIZE); } +static int xen_overlay_pre_save(void *opaque) +{ +/* + * Fetch the kernel's idea of long_mode to avoid the race condition + * where the guest has set the hypercall page up in 64-bit mode but + * not yet made a hypercall by the time migration happens, so qemu + * hasn't yet noticed. 
+ */ +return xen_sync_long_mode(); +} + static int xen_overlay_post_load(void *opaque, int version_id) { XenOverlayState *s = opaque; @@ -107,6 +120,9 @@ static int xen_overlay_post_load(void *opaque, int version_id) xen_overlay_do_map_page(&s->shinfo_mem, s->shinfo_gpa); xen_overlay_set_be_shinfo(s->shinfo_gpa >> XEN_PAGE_SHIFT); } +if (s->long_mode) { +xen_set_long_mode(true); +} return 0; } @@ -121,9 +137,11 @@ static const VMStateDescription xen_overlay_vmstate = { .version_id = 1, .minimum_version_id = 1, .needed = xen_overlay_is_needed, +.pre_save = xen_overlay_pre_save, .post_load = xen_overlay_post_load, .fields = (VMStateField[]) { VMSTATE_UINT64(shinfo_gpa, XenOverlayState), +VMSTATE_BOOL(long_mode, XenOverlayState), VMSTATE_END_OF_LIST() } }; @@ -208,3 +226,47 @@ void *xen_overlay_get_shinfo_ptr(void) return s->shinfo_ptr; } + +int xen_sync_long_mode(void) +{ +int ret; +struct kvm_xen_hvm_attr xa = { +.type = KVM_XEN_ATTR_TYPE_LONG_MODE, +}; + +if (!xen_overlay_singleton) { +return -ENOENT; +} + +ret = kvm_vm_ioctl(kvm_state, KVM_XEN_HVM_GET_ATTR, &xa); +if (!ret) { +xen_overlay_singleton->long_mode = xa.u.long_mode; +} + +return ret; +} + +int xen_set_long_mode(bool long_mode) +{ +int ret; +struct kvm_xen_hvm_attr xa = { +.type = KVM_XEN_ATTR_TYPE_LONG_MODE, +.u.long_mode = long_mode, +}; + +if (!xen_overlay_singleton) { +return -ENOENT; +} + +ret = kvm_vm_ioctl(kvm_state, KVM_XEN_HVM_SET_ATTR, &xa); +if (!ret) { +xen_overlay_singleton->long_mode = xa.u.long_mode; +} + +return ret; +} + +bool xen_is_long_mode(void) +{ +return xen_overlay_singleton && xen_overlay_singleton->long_mode; +} diff --git a/hw/i386/kvm/xen_overlay.h b/hw/i386/kvm/xen_overlay.h index 00cff05bb0..5c46a0b036 100644 --- a/hw/i386/kvm/xen_overlay.h +++ b/hw/i386/kvm/xen_overlay.h @@ -17,4 +17,8 @@ void xen_overlay_create(void); int xen_overlay_map_shinfo_page(uint64_t gpa); void *xen_overlay_get_shinfo_ptr(void); +int xen_sync_long_mode(void); +int xen_set_long_mode(bool long_mode); 
+bool xen_is_long_mode(void); + #endif /* QEMU_XEN_OVERLAY_H */ diff --git a/target/i386/kvm/xen-emu.c b/target/i386/kvm/xen-emu.c index ebea27caf6..be6d85f2cb 100644 --- a/target/i386/kvm/xen-emu.c +++ b/target/i386/kvm/xen-emu.c @@ -20,6 +20,8 @@ #include "trace.h" #include "sysemu/runstate.h" +#include "hw/i386/kvm/xen_overlay.h" + #include "hw/xen/interface/version.h" #include "hw/xen/interface/sched.h" @@ -282,6 +284,16 @@ int kvm_xen_handle_exit(X86CPU *cpu, struct kvm_xen_exit *exit) return -1; } +/* + * The kernel latches the guest 32/64 mode when the MSR is used to fill + * the hypercall page. So if we see a hypercall in a mode that doesn't + * match our own idea of the guest mode, fetch the kernel's idea of the + * "long mode" to remain in sync. + */ +if (exit->u.hcall.longmode != xen_is_long_mode()) { +xen_sync_long_mode(); +} + if (!do_
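For readers following the thread, the latch-and-resync behaviour described in this patch can be modelled outside QEMU. This is only a minimal sketch: every sketch_* name is a hypothetical stand-in (the "kernel" flag is a plain variable rather than a KVM attribute), and it is not the code from the series.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the flag the kernel latches on the MSR write. */
bool sketch_kernel_long_mode;
/* Hypothetical stand-in for userspace's cached copy of that flag. */
bool sketch_cached_long_mode;
/* Counts resyncs so the behaviour can be observed from outside. */
int sketch_sync_count;

/* Plays the role of xen_sync_long_mode(): refresh the cache from the "kernel". */
void sketch_sync_long_mode(void)
{
    sketch_cached_long_mode = sketch_kernel_long_mode;
    sketch_sync_count++;
}

/* Resync only when the mode of the incoming hypercall disagrees with the cache. */
void sketch_handle_hypercall(bool hcall_longmode)
{
    if (hcall_longmode != sketch_cached_long_mode) {
        sketch_sync_long_mode();
    }
}

/* Always resync before serialization, since no hypercall may have happened yet. */
bool sketch_pre_save(void)
{
    sketch_sync_long_mode();
    assert(sketch_cached_long_mode == sketch_kernel_long_mode);
    return sketch_cached_long_mode;
}
```

The unconditional sync in sketch_pre_save() covers the window the commit message describes: the guest has written the hypercall page MSR in 64-bit mode, but userspace has had no hypercall exit from which to notice.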
[PATCH v11 20/59] i386/xen: implement HYPERVISOR_vcpu_op
From: Joao Martins

This handles the case where the guest tries to register a vcpu_info.
Since vcpu_info placement is optional in the minimum ABI, we can simply
fail the request with -ENOSYS.

Signed-off-by: Joao Martins
Signed-off-by: David Woodhouse
Reviewed-by: Paul Durrant
---
 target/i386/kvm/xen-emu.c | 25 +
 1 file changed, 25 insertions(+)

diff --git a/target/i386/kvm/xen-emu.c b/target/i386/kvm/xen-emu.c
index 4002b1b797..e5ae0a9a38 100644
--- a/target/i386/kvm/xen-emu.c
+++ b/target/i386/kvm/xen-emu.c
@@ -27,6 +27,7 @@
 #include "hw/xen/interface/sched.h"
 #include "hw/xen/interface/memory.h"
 #include "hw/xen/interface/hvm/hvm_op.h"
+#include "hw/xen/interface/vcpu.h"

 #include "xen-compat.h"

@@ -363,6 +364,25 @@ static bool kvm_xen_hcall_hvm_op(struct kvm_xen_exit *exit, X86CPU *cpu,
     }
 }

+static bool kvm_xen_hcall_vcpu_op(struct kvm_xen_exit *exit, X86CPU *cpu,
+                                  int cmd, int vcpu_id, uint64_t arg)
+{
+    int err;
+
+    switch (cmd) {
+    case VCPUOP_register_vcpu_info:
+        /* no vcpu info placement for now */
+        err = -ENOSYS;
+        break;
+
+    default:
+        return false;
+    }
+
+    exit->u.hcall.result = err;
+    return true;
+}
+
 int kvm_xen_soft_reset(void)
 {
     int err;
@@ -464,6 +484,11 @@ static bool do_kvm_xen_handle_exit(X86CPU *cpu, struct kvm_xen_exit *exit)
     case __HYPERVISOR_sched_op:
         return kvm_xen_hcall_sched_op(exit, cpu, exit->u.hcall.params[0],
                                       exit->u.hcall.params[1]);
+    case __HYPERVISOR_vcpu_op:
+        return kvm_xen_hcall_vcpu_op(exit, cpu,
+                                     exit->u.hcall.params[0],
+                                     exit->u.hcall.params[1],
+                                     exit->u.hcall.params[2]);
     case __HYPERVISOR_hvm_op:
         return kvm_xen_hcall_hvm_op(exit, cpu, exit->u.hcall.params[0],
                                     exit->u.hcall.params[1]);
-- 
2.39.0
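The handler in this patch distinguishes "recognised, and here is the guest-visible result" from "not recognised at all". A minimal standalone sketch of that convention follows; the struct, the command numbers and the function name are all hypothetical and do not use the Xen ABI values.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical command numbers for the sketch; not the Xen ABI values. */
enum {
    SKETCH_VCPUOP_REGISTER_VCPU_INFO = 1,
    SKETCH_VCPUOP_UNKNOWN = 99,
};

/* Minimal stand-in for the result slot of struct kvm_xen_exit. */
struct sketch_exit {
    int64_t result;
};

/*
 * Returns true when the command was recognised; the guest-visible result
 * (here always the error -ENOSYS) is stored in ex->result.
 * Returns false for unrecognised commands and leaves ex->result untouched,
 * so the caller can apply its own fallback.
 */
bool sketch_vcpu_op(struct sketch_exit *ex, int cmd)
{
    int err;

    assert(ex != NULL);

    switch (cmd) {
    case SKETCH_VCPUOP_REGISTER_VCPU_INFO:
        err = -ENOSYS;      /* recognised, but deliberately unsupported */
        break;
    default:
        return false;       /* not recognised: let the caller decide */
    }

    ex->result = err;
    return true;
}
```

Returning true with -ENOSYS tells the guest explicitly that the optional feature is absent, which is distinct from the dispatcher not knowing the hypercall at all.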
[PATCH v11 27/59] hw/xen: Add xen_evtchn device for event channel emulation
From: David Woodhouse Include basic support for setting HVM_PARAM_CALLBACK_IRQ to the global vector method HVM_PARAM_CALLBACK_TYPE_VECTOR, which is handled in-kernel by raising the vector whenever the vCPU's vcpu_info->evtchn_upcall_pending flag is set. Signed-off-by: David Woodhouse Reviewed-by: Paul Durrant --- hw/i386/kvm/meson.build | 5 +- hw/i386/kvm/xen_evtchn.c | 155 ++ hw/i386/kvm/xen_evtchn.h | 18 + hw/i386/pc.c | 2 + target/i386/kvm/xen-emu.c | 15 5 files changed, 194 insertions(+), 1 deletion(-) create mode 100644 hw/i386/kvm/xen_evtchn.c create mode 100644 hw/i386/kvm/xen_evtchn.h diff --git a/hw/i386/kvm/meson.build b/hw/i386/kvm/meson.build index 6165cbf019..cab64df339 100644 --- a/hw/i386/kvm/meson.build +++ b/hw/i386/kvm/meson.build @@ -4,6 +4,9 @@ i386_kvm_ss.add(when: 'CONFIG_APIC', if_true: files('apic.c')) i386_kvm_ss.add(when: 'CONFIG_I8254', if_true: files('i8254.c')) i386_kvm_ss.add(when: 'CONFIG_I8259', if_true: files('i8259.c')) i386_kvm_ss.add(when: 'CONFIG_IOAPIC', if_true: files('ioapic.c')) -i386_kvm_ss.add(when: 'CONFIG_XEN_EMU', if_true: files('xen_overlay.c')) +i386_kvm_ss.add(when: 'CONFIG_XEN_EMU', if_true: files( + 'xen_overlay.c', + 'xen_evtchn.c', + )) i386_ss.add_all(when: 'CONFIG_KVM', if_true: i386_kvm_ss) diff --git a/hw/i386/kvm/xen_evtchn.c b/hw/i386/kvm/xen_evtchn.c new file mode 100644 index 00..9d6f4076ad --- /dev/null +++ b/hw/i386/kvm/xen_evtchn.c @@ -0,0 +1,155 @@ +/* + * QEMU Xen emulation: Event channel support + * + * Copyright © 2022 Amazon.com, Inc. or its affiliates. All Rights Reserved. + * + * Authors: David Woodhouse + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. 
+ */ + +#include "qemu/osdep.h" +#include "qemu/host-utils.h" +#include "qemu/module.h" +#include "qemu/main-loop.h" +#include "qapi/error.h" +#include "qom/object.h" +#include "exec/target_page.h" +#include "exec/address-spaces.h" +#include "migration/vmstate.h" + +#include "hw/sysbus.h" +#include "hw/xen/xen.h" +#include "xen_evtchn.h" + +#include "sysemu/kvm.h" +#include "sysemu/kvm_xen.h" +#include + +#include "hw/xen/interface/memory.h" +#include "hw/xen/interface/hvm/params.h" + +#define TYPE_XEN_EVTCHN "xen-evtchn" +OBJECT_DECLARE_SIMPLE_TYPE(XenEvtchnState, XEN_EVTCHN) + +struct XenEvtchnState { +/*< private >*/ +SysBusDevice busdev; +/*< public >*/ + +uint64_t callback_param; +bool evtchn_in_kernel; + +QemuMutex port_lock; +}; + +struct XenEvtchnState *xen_evtchn_singleton; + +/* Top bits of callback_param are the type (HVM_PARAM_CALLBACK_TYPE_xxx) */ +#define CALLBACK_VIA_TYPE_SHIFT 56 + +static int xen_evtchn_post_load(void *opaque, int version_id) +{ +XenEvtchnState *s = opaque; + +if (s->callback_param) { +xen_evtchn_set_callback_param(s->callback_param); +} + +return 0; +} + +static bool xen_evtchn_is_needed(void *opaque) +{ +return xen_mode == XEN_EMULATE; +} + +static const VMStateDescription xen_evtchn_vmstate = { +.name = "xen_evtchn", +.version_id = 1, +.minimum_version_id = 1, +.needed = xen_evtchn_is_needed, +.post_load = xen_evtchn_post_load, +.fields = (VMStateField[]) { +VMSTATE_UINT64(callback_param, XenEvtchnState), +VMSTATE_END_OF_LIST() +} +}; + +static void xen_evtchn_class_init(ObjectClass *klass, void *data) +{ +DeviceClass *dc = DEVICE_CLASS(klass); + +dc->vmsd = &xen_evtchn_vmstate; +} + +static const TypeInfo xen_evtchn_info = { +.name = TYPE_XEN_EVTCHN, +.parent= TYPE_SYS_BUS_DEVICE, +.instance_size = sizeof(XenEvtchnState), +.class_init= xen_evtchn_class_init, +}; + +void xen_evtchn_create(void) +{ +XenEvtchnState *s = XEN_EVTCHN(sysbus_create_simple(TYPE_XEN_EVTCHN, +-1, NULL)); +xen_evtchn_singleton = s; + 
+qemu_mutex_init(&s->port_lock); +} + +static void xen_evtchn_register_types(void) +{ +type_register_static(&xen_evtchn_info); +} + +type_init(xen_evtchn_register_types) + +int xen_evtchn_set_callback_param(uint64_t param) +{ +XenEvtchnState *s = xen_evtchn_singleton; +struct kvm_xen_hvm_attr xa = { +.type = KVM_XEN_ATTR_TYPE_UPCALL_VECTOR, +.u.vector = 0, +}; +bool in_kernel = false; +int ret; + +if (!s) { +return -ENOTSUP; +} + +qemu_mutex_lock(&s->port_lock); + +switch (param >> CALLBACK_VIA_TYPE_SHIFT) { +case HVM_PARAM_CALLBACK_TYPE_VECTOR: { +xa.u.vector = (uint8_t)param, + +ret = kvm_vm_ioctl(kvm_state, KVM_XEN_HVM_SET_ATTR, &xa); +if (!ret && kvm_xen_has_cap(EVTCHN_SEND)) { +in_kernel = true; +} +break; +} +default: +/* Xen doesn't return error even if you set something bogus */ +ret = 0; +break; +}
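The switch on param >> CALLBACK_VIA_TYPE_SHIFT relies on the delivery type living in the top byte of the 64-bit parameter, with the vector in the low byte. A small sketch of that decoding, assuming the vector type is encoded as 2 (my reading of the Xen public headers; please check hvm/params.h), with hypothetical sketch_* names:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CALLBACK_VIA_TYPE_SHIFT 56        /* as in the patch */
#define SKETCH_CALLBACK_TYPE_VECTOR 2     /* assumed value, see hvm/params.h */

/* Extract the delivery type from the top byte of the parameter. */
unsigned int sketch_callback_type(uint64_t param)
{
    return (unsigned int)(param >> CALLBACK_VIA_TYPE_SHIFT);
}

/*
 * For the vector delivery type, the vector is the low 8 bits.
 * Returns true and stores the vector when the type matches, false otherwise.
 */
bool sketch_callback_vector(uint64_t param, uint8_t *vector)
{
    assert(vector != NULL);

    if (sketch_callback_type(param) != SKETCH_CALLBACK_TYPE_VECTOR) {
        return false;
    }
    *vector = (uint8_t)param;
    return true;
}
```

With these assumptions, a guest asking for vector 0xf3 would write ((uint64_t)2 << 56) | 0xf3 into HVM_PARAM_CALLBACK_IRQ.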
[PATCH v11 03/59] xen: Add XEN_DISABLED mode and make it default
From: David Woodhouse

Also set XEN_ATTACH mode in xen_init() to reflect the truth; not that
anyone ever cared before. It was *only* ever checked in xen_init_pv()
before.

Suggested-by: Paolo Bonzini
Signed-off-by: David Woodhouse
Reviewed-by: Paul Durrant
---
 accel/xen/xen-all.c  | 2 ++
 include/hw/xen/xen.h | 5 +++--
 softmmu/globals.c    | 2 +-
 3 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/accel/xen/xen-all.c b/accel/xen/xen-all.c
index 69aa7d018b..2329556595 100644
--- a/accel/xen/xen-all.c
+++ b/accel/xen/xen-all.c
@@ -181,6 +181,8 @@ static int xen_init(MachineState *ms)
      * opt out of system RAM being allocated by generic code
      */
     mc->default_ram_id = NULL;
+
+    xen_mode = XEN_ATTACH;
     return 0;
 }

diff --git a/include/hw/xen/xen.h b/include/hw/xen/xen.h
index 4d412fd4b2..b3873c581b 100644
--- a/include/hw/xen/xen.h
+++ b/include/hw/xen/xen.h
@@ -22,8 +22,9 @@

 /* xen-machine.c */
 enum xen_mode {
-    XEN_EMULATE = 0,  // xen emulation, using xenner (default)
-    XEN_ATTACH        // attach to xen domain created by libxl
+    XEN_DISABLED = 0, // xen support disabled (default)
+    XEN_ATTACH,       // attach to xen domain created by libxl
+    XEN_EMULATE,
 };

 extern uint32_t xen_domid;
diff --git a/softmmu/globals.c b/softmmu/globals.c
index 527edbefdd..0a4405614e 100644
--- a/softmmu/globals.c
+++ b/softmmu/globals.c
@@ -63,5 +63,5 @@ QemuUUID qemu_uuid;
 bool qemu_uuid_set;

 uint32_t xen_domid;
-enum xen_mode xen_mode = XEN_EMULATE;
+enum xen_mode xen_mode = XEN_DISABLED;
 bool xen_domid_restrict;
-- 
2.39.0
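One consequence of the reordering is that the zero value of the enum now means "disabled", so code can tell "no Xen" apart from "Xen emulation" without extra state. A tiny sketch of that idea, using hypothetical sketch_* names rather than the QEMU globals:

```c
#include <assert.h>
#include <stdbool.h>

/* Same ordering as the patched enum: zero now means "disabled". */
enum sketch_xen_mode {
    SKETCH_XEN_DISABLED = 0,
    SKETCH_XEN_ATTACH,
    SKETCH_XEN_EMULATE,
};

/* A file-scope variable is zero-initialised, so it starts out disabled. */
enum sketch_xen_mode sketch_mode;

/* Plays the role of xen_init(): attaching to a real Xen domain. */
void sketch_xen_init(void)
{
    sketch_mode = SKETCH_XEN_ATTACH;
}

/* True for either flavour of Xen support, false when disabled. */
bool sketch_xen_in_use(void)
{
    return sketch_mode != SKETCH_XEN_DISABLED;
}
```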
[PATCH v11 25/59] i386/xen: implement HVMOP_set_evtchn_upcall_vector
From: Ankur Arora The HVMOP_set_evtchn_upcall_vector hypercall sets the per-vCPU upcall vector, to be delivered to the local APIC just like an MSI (with an EOI). This takes precedence over the system-wide delivery method set by the HVMOP_set_param hypercall with HVM_PARAM_CALLBACK_IRQ. It's used by Windows and Xen (PV shim) guests but normally not by Linux. Signed-off-by: Ankur Arora Signed-off-by: Joao Martins [dwmw2: Rework for upstream kernel changes and split from HVMOP_set_param] Signed-off-by: David Woodhouse Reviewed-by: Paul Durrant --- target/i386/cpu.h| 1 + target/i386/kvm/trace-events | 1 + target/i386/kvm/xen-emu.c| 84 ++-- target/i386/machine.c| 1 + 4 files changed, 84 insertions(+), 3 deletions(-) diff --git a/target/i386/cpu.h b/target/i386/cpu.h index bf44a87ddb..938a1b9c8b 100644 --- a/target/i386/cpu.h +++ b/target/i386/cpu.h @@ -1792,6 +1792,7 @@ typedef struct CPUArchState { uint64_t xen_vcpu_info_default_gpa; uint64_t xen_vcpu_time_info_gpa; uint64_t xen_vcpu_runstate_gpa; +uint8_t xen_vcpu_callback_vector; #endif #if defined(CONFIG_HVF) HVFX86LazyFlags hvf_lflags; diff --git a/target/i386/kvm/trace-events b/target/i386/kvm/trace-events index a840e0333d..b365a8e8e2 100644 --- a/target/i386/kvm/trace-events +++ b/target/i386/kvm/trace-events @@ -11,3 +11,4 @@ kvm_xen_hypercall(int cpu, uint8_t cpl, uint64_t input, uint64_t a0, uint64_t a1 kvm_xen_soft_reset(void) "" kvm_xen_set_shared_info(uint64_t gfn) "shared info at gfn 0x%" PRIx64 kvm_xen_set_vcpu_attr(int cpu, int type, uint64_t gpa) "vcpu attr cpu %d type %d gpa 0x%" PRIx64 +kvm_xen_set_vcpu_callback(int cpu, int vector) "callback vcpu %d vector %d" diff --git a/target/i386/kvm/xen-emu.c b/target/i386/kvm/xen-emu.c index 0bca370ea4..55dc2ac012 100644 --- a/target/i386/kvm/xen-emu.c +++ b/target/i386/kvm/xen-emu.c @@ -27,6 +27,7 @@ #include "hw/xen/interface/sched.h" #include "hw/xen/interface/memory.h" #include "hw/xen/interface/hvm/hvm_op.h" +#include "hw/xen/interface/hvm/params.h" 
#include "hw/xen/interface/vcpu.h" #include "hw/xen/interface/event_channel.h" @@ -193,7 +194,8 @@ static bool kvm_xen_hcall_xen_version(struct kvm_xen_exit *exit, X86CPU *cpu, fi.submap |= 1 << XENFEAT_writable_page_tables | 1 << XENFEAT_writable_descriptor_tables | 1 << XENFEAT_auto_translated_physmap | - 1 << XENFEAT_supervisor_mode_kernel; + 1 << XENFEAT_supervisor_mode_kernel | + 1 << XENFEAT_hvm_callback_vector; } err = kvm_copy_to_gva(CPU(cpu), arg, &fi, sizeof(fi)); @@ -220,6 +222,31 @@ static int kvm_xen_set_vcpu_attr(CPUState *cs, uint16_t type, uint64_t gpa) return kvm_vcpu_ioctl(cs, KVM_XEN_VCPU_SET_ATTR, &xhsi); } +static int kvm_xen_set_vcpu_callback_vector(CPUState *cs) +{ +uint8_t vector = X86_CPU(cs)->env.xen_vcpu_callback_vector; +struct kvm_xen_vcpu_attr xva; + +xva.type = KVM_XEN_VCPU_ATTR_TYPE_UPCALL_VECTOR; +xva.u.vector = vector; + +trace_kvm_xen_set_vcpu_callback(cs->cpu_index, vector); + +return kvm_vcpu_ioctl(cs, KVM_XEN_VCPU_SET_ATTR, &xva); +} + +static void do_set_vcpu_callback_vector(CPUState *cs, run_on_cpu_data data) +{ +X86CPU *cpu = X86_CPU(cs); +CPUX86State *env = &cpu->env; + +env->xen_vcpu_callback_vector = data.host_int; + +if (kvm_xen_has_cap(EVTCHN_SEND)) { +kvm_xen_set_vcpu_callback_vector(cs); +} +} + static void do_set_vcpu_info_default_gpa(CPUState *cs, run_on_cpu_data data) { X86CPU *cpu = X86_CPU(cs); @@ -276,12 +303,16 @@ static void do_vcpu_soft_reset(CPUState *cs, run_on_cpu_data data) env->xen_vcpu_info_default_gpa = INVALID_GPA; env->xen_vcpu_time_info_gpa = INVALID_GPA; env->xen_vcpu_runstate_gpa = INVALID_GPA; +env->xen_vcpu_callback_vector = 0; kvm_xen_set_vcpu_attr(cs, KVM_XEN_VCPU_ATTR_TYPE_VCPU_INFO, INVALID_GPA); kvm_xen_set_vcpu_attr(cs, KVM_XEN_VCPU_ATTR_TYPE_VCPU_TIME_INFO, INVALID_GPA); kvm_xen_set_vcpu_attr(cs, KVM_XEN_VCPU_ATTR_TYPE_RUNSTATE_ADDR, INVALID_GPA); +if (kvm_xen_has_cap(EVTCHN_SEND)) { +kvm_xen_set_vcpu_callback_vector(cs); +} } @@ -458,17 +489,53 @@ static bool 
kvm_xen_hcall_memory_op(struct kvm_xen_exit *exit, X86CPU *cpu, return true; } +static int kvm_xen_hcall_evtchn_upcall_vector(struct kvm_xen_exit *exit, + X86CPU *cpu, uint64_t arg) +{ +struct xen_hvm_evtchn_upcall_vector up; +CPUState *target_cs; + +/* No need for 32/64 compat handling */ +qemu_build_assert(sizeof(up) == 8); + +if (kvm_copy_from_gva(CPU(cpu), arg, &up, sizeof(up))) { +return -EFAULT; +} + +if (up.vector < 0x10) { +return -EINVAL; +} + +target_cs = qemu_get_c
[PATCH v11 13/59] hw/xen: Add xen_overlay device for emulating shared xenheap pages
From: David Woodhouse For the shared info page and for grant tables, Xen shares its own pages from the "Xen heap" to the guest. The guest requests that a given page from a certain address space (XENMAPSPACE_shared_info, etc.) be mapped to a given GPA using the XENMEM_add_to_physmap hypercall. To support that in qemu when *emulating* Xen, create a memory region (migratable) and allow it to be mapped as an overlay when requested. Xen theoretically allows the same page to be mapped multiple times into the guest, but that's hard to track and reinstate over migration, so we automatically *unmap* any previous mapping when creating a new one. This approach has been used in production with a non-trivial number of guests expecting true Xen, without any problems yet being noticed. This adds just the shared info page for now. The grant tables will be a larger region, and will need to be overlaid one page at a time. I think that means I need to create separate aliases for each page of the overall grant_frames region, so that they can be mapped individually. 
Signed-off-by: David Woodhouse Reviewed-by: Paul Durrant --- hw/i386/kvm/meson.build | 1 + hw/i386/kvm/xen_overlay.c | 210 ++ hw/i386/kvm/xen_overlay.h | 20 include/sysemu/kvm_xen.h | 7 ++ 4 files changed, 238 insertions(+) create mode 100644 hw/i386/kvm/xen_overlay.c create mode 100644 hw/i386/kvm/xen_overlay.h diff --git a/hw/i386/kvm/meson.build b/hw/i386/kvm/meson.build index 95467f1ded..6165cbf019 100644 --- a/hw/i386/kvm/meson.build +++ b/hw/i386/kvm/meson.build @@ -4,5 +4,6 @@ i386_kvm_ss.add(when: 'CONFIG_APIC', if_true: files('apic.c')) i386_kvm_ss.add(when: 'CONFIG_I8254', if_true: files('i8254.c')) i386_kvm_ss.add(when: 'CONFIG_I8259', if_true: files('i8259.c')) i386_kvm_ss.add(when: 'CONFIG_IOAPIC', if_true: files('ioapic.c')) +i386_kvm_ss.add(when: 'CONFIG_XEN_EMU', if_true: files('xen_overlay.c')) i386_ss.add_all(when: 'CONFIG_KVM', if_true: i386_kvm_ss) diff --git a/hw/i386/kvm/xen_overlay.c b/hw/i386/kvm/xen_overlay.c new file mode 100644 index 00..a2441e2b4e --- /dev/null +++ b/hw/i386/kvm/xen_overlay.c @@ -0,0 +1,210 @@ +/* + * QEMU Xen emulation: Shared/overlay pages support + * + * Copyright © 2022 Amazon.com, Inc. or its affiliates. All Rights Reserved. + * + * Authors: David Woodhouse + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. 
+ */ + +#include "qemu/osdep.h" +#include "qemu/host-utils.h" +#include "qemu/module.h" +#include "qemu/main-loop.h" +#include "qapi/error.h" +#include "qom/object.h" +#include "exec/target_page.h" +#include "exec/address-spaces.h" +#include "migration/vmstate.h" + +#include "hw/sysbus.h" +#include "hw/xen/xen.h" +#include "xen_overlay.h" + +#include "sysemu/kvm.h" +#include "sysemu/kvm_xen.h" +#include + +#include "hw/xen/interface/memory.h" + + +#define TYPE_XEN_OVERLAY "xen-overlay" +OBJECT_DECLARE_SIMPLE_TYPE(XenOverlayState, XEN_OVERLAY) + +#define XEN_PAGE_SHIFT 12 +#define XEN_PAGE_SIZE (1ULL << XEN_PAGE_SHIFT) + +struct XenOverlayState { +/*< private >*/ +SysBusDevice busdev; +/*< public >*/ + +MemoryRegion shinfo_mem; +void *shinfo_ptr; +uint64_t shinfo_gpa; +}; + +struct XenOverlayState *xen_overlay_singleton; + +static void xen_overlay_do_map_page(MemoryRegion *page, uint64_t gpa) +{ +/* + * Xen allows guests to map the same page as many times as it likes + * into guest physical frames. We don't, because it would be hard + * to track and restore them all. One mapping of each page is + * perfectly sufficient for all known guests... and we've tested + * that theory on a few now in other implementations. dwmw2. + */ +if (memory_region_is_mapped(page)) { +if (gpa == INVALID_GPA) { +memory_region_del_subregion(get_system_memory(), page); +} else { +/* Just move it */ +memory_region_set_address(page, gpa); +} +} else if (gpa != INVALID_GPA) { +memory_region_add_subregion_overlap(get_system_memory(), gpa, page, 0); +} +} + +/* KVM is the only existing back end for now. Let's not overengineer it yet. 
*/ +static int xen_overlay_set_be_shinfo(uint64_t gfn) +{ +struct kvm_xen_hvm_attr xa = { +.type = KVM_XEN_ATTR_TYPE_SHARED_INFO, +.u.shared_info.gfn = gfn, +}; + +return kvm_vm_ioctl(kvm_state, KVM_XEN_HVM_SET_ATTR, &xa); +} + + +static void xen_overlay_realize(DeviceState *dev, Error **errp) +{ +XenOverlayState *s = XEN_OVERLAY(dev); + +if (xen_mode != XEN_EMULATE) { +error_setg(errp, "Xen overlay page support is for Xen emulation"); +return; +} + +memory_region_init_ram(&s->shinfo_mem, OBJECT(dev), "xen:shared_info", + XEN_PAGE_SIZE, &error_abort); +memory_region_set_enabled(&s->shinfo_mem, true); + +s->shinfo_ptr = memory_
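The single-mapping policy in xen_overlay_do_map_page() boils down to a three-way decision: unmap, move, or map for the first time. Here is a minimal model of just that decision, with a hypothetical struct standing in for the MemoryRegion and UINT64_MAX assumed as the invalid-GPA sentinel:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_INVALID_GPA UINT64_MAX   /* assumed sentinel for "no mapping" */

/* Hypothetical stand-in for a page that may be mapped at most once. */
struct sketch_page {
    bool mapped;
    uint64_t gpa;
};

/* Apply a mapping request while keeping at most one mapping of the page. */
void sketch_map_page(struct sketch_page *page, uint64_t gpa)
{
    assert(page != NULL);

    if (page->mapped) {
        if (gpa == SKETCH_INVALID_GPA) {
            page->mapped = false;   /* request to unmap */
        } else {
            page->gpa = gpa;        /* move the existing mapping */
        }
    } else if (gpa != SKETCH_INVALID_GPA) {
        page->mapped = true;        /* first mapping */
        page->gpa = gpa;
    }
    /* Unmapping a page that is not mapped is a no-op. */
}
```

Because only one (mapped, gpa) pair ever exists per page, saving and restoring it across migration needs nothing more than the GPA itself, which matches the single shinfo_gpa field in the vmstate.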
[PATCH v11 14/59] xen: Permit --xen-domid argument when accel is KVM
From: Paul Durrant

Signed-off-by: Paul Durrant
Signed-off-by: David Woodhouse
---
 softmmu/vl.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/softmmu/vl.c b/softmmu/vl.c
index b2ee3fee3f..2b071159c5 100644
--- a/softmmu/vl.c
+++ b/softmmu/vl.c
@@ -3359,7 +3359,7 @@ void qemu_init(int argc, char **argv)
                 has_defaults = 0;
                 break;
             case QEMU_OPTION_xen_domid:
-                if (!(accel_find("xen"))) {
+                if (!(accel_find("xen")) && !(accel_find("kvm"))) {
                     error_report("Option not supported for this target");
                     exit(1);
                 }
-- 
2.39.0
[PATCH v11 49/59] i386/xen: handle HVMOP_get_param
From: Joao Martins

HVMOP_get_param is used by the guest to fetch the xenstore PFN and port.
These are preallocated by the toolstack; the guest just reads them and
uses them straight away.

Signed-off-by: Joao Martins
Signed-off-by: David Woodhouse
Reviewed-by: Paul Durrant
---
 target/i386/kvm/xen-emu.c | 39 +++
 1 file changed, 39 insertions(+)

diff --git a/target/i386/kvm/xen-emu.c b/target/i386/kvm/xen-emu.c
index f55ab08959..36e60bd2a5 100644
--- a/target/i386/kvm/xen-emu.c
+++ b/target/i386/kvm/xen-emu.c
@@ -762,6 +762,42 @@ out:
     return true;
 }

+static bool handle_get_param(struct kvm_xen_exit *exit, X86CPU *cpu,
+                             uint64_t arg)
+{
+    CPUState *cs = CPU(cpu);
+    struct xen_hvm_param hp;
+    int err = 0;
+
+    /* No need for 32/64 compat handling */
+    qemu_build_assert(sizeof(hp) == 16);
+
+    if (kvm_copy_from_gva(cs, arg, &hp, sizeof(hp))) {
+        err = -EFAULT;
+        goto out;
+    }
+
+    if (hp.domid != DOMID_SELF && hp.domid != xen_domid) {
+        err = -ESRCH;
+        goto out;
+    }
+
+    switch (hp.index) {
+    case HVM_PARAM_STORE_PFN:
+        hp.value = XEN_SPECIAL_PFN(XENSTORE);
+        break;
+    default:
+        return false;
+    }
+
+    if (kvm_copy_to_gva(cs, arg, &hp, sizeof(hp))) {
+        err = -EFAULT;
+    }
+out:
+    exit->u.hcall.result = err;
+    return true;
+}
+
 static int kvm_xen_hcall_evtchn_upcall_vector(struct kvm_xen_exit *exit,
                                               X86CPU *cpu, uint64_t arg)
 {
@@ -806,6 +842,9 @@ static bool kvm_xen_hcall_hvm_op(struct kvm_xen_exit *exit, X86CPU *cpu,
     case HVMOP_set_param:
         return handle_set_param(exit, cpu, arg);

+    case HVMOP_get_param:
+        return handle_get_param(exit, cpu, arg);
+
     default:
         return false;
     }
-- 
2.39.0
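The "no need for 32/64 compat handling" remark holds because the parameter struct is built only from fixed-width fields, so both guest ABIs see the same 16-byte layout. The sketch below illustrates that together with the domid check; the struct layout, the constants and the returned PFN are assumptions for illustration, and it folds the "unhandled" case into -ENOSYS for brevity, which the patch does not do.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_DOMID_SELF 0x7FF0u       /* assumed to match Xen's DOMID_SELF */
#define SKETCH_PARAM_STORE_PFN 1u       /* assumed index for the store PFN */
#define SKETCH_STORE_PFN 0x1234u        /* hypothetical PFN handed to the guest */

/* Assumed layout: fixed-width fields only, so 32- and 64-bit guests agree. */
struct sketch_hvm_param {
    uint16_t domid;
    uint8_t pad[2];
    uint32_t index;
    uint64_t value;
};

_Static_assert(sizeof(struct sketch_hvm_param) == 16,
               "layout must not depend on the guest word size");

/* Returns 0 and fills hp->value on success, or a negative errno value. */
int sketch_get_param(struct sketch_hvm_param *hp, uint16_t own_domid)
{
    assert(hp != NULL);

    /* Only the guest's own domain may be queried, by ID or as "self". */
    if (hp->domid != SKETCH_DOMID_SELF && hp->domid != own_domid) {
        return -ESRCH;
    }

    switch (hp->index) {
    case SKETCH_PARAM_STORE_PFN:
        hp->value = SKETCH_STORE_PFN;
        return 0;
    default:
        return -ENOSYS;     /* simplification: the patch returns "unhandled" */
    }
}
```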
[PATCH v11 18/59] i386/xen: implement XENMEM_add_to_physmap_batch
From: David Woodhouse Signed-off-by: David Woodhouse Reviewed-by: Paul Durrant --- target/i386/kvm/xen-compat.h | 24 + target/i386/kvm/xen-emu.c| 69 2 files changed, 93 insertions(+) diff --git a/target/i386/kvm/xen-compat.h b/target/i386/kvm/xen-compat.h index 2d852e2a28..448336de92 100644 --- a/target/i386/kvm/xen-compat.h +++ b/target/i386/kvm/xen-compat.h @@ -15,6 +15,20 @@ typedef uint32_t compat_pfn_t; typedef uint32_t compat_ulong_t; +typedef uint32_t compat_ptr_t; + +#define __DEFINE_COMPAT_HANDLE(name, type) \ +typedef struct {\ +compat_ptr_t c; \ +type *_[0] __attribute__((packed)); \ +} __compat_handle_ ## name; \ + +#define DEFINE_COMPAT_HANDLE(name) __DEFINE_COMPAT_HANDLE(name, name) +#define COMPAT_HANDLE(name) __compat_handle_ ## name + +DEFINE_COMPAT_HANDLE(compat_pfn_t); +DEFINE_COMPAT_HANDLE(compat_ulong_t); +DEFINE_COMPAT_HANDLE(int); struct compat_xen_add_to_physmap { domid_t domid; @@ -24,4 +38,14 @@ struct compat_xen_add_to_physmap { compat_pfn_t gpfn; }; +struct compat_xen_add_to_physmap_batch { +domid_t domid; +uint16_t space; +uint16_t size; +uint16_t extra; +COMPAT_HANDLE(compat_ulong_t) idxs; +COMPAT_HANDLE(compat_pfn_t) gpfns; +COMPAT_HANDLE(int) errs; +}; + #endif /* QEMU_I386_XEN_COMPAT_H */ diff --git a/target/i386/kvm/xen-emu.c b/target/i386/kvm/xen-emu.c index 5d79827128..2b235e7b27 100644 --- a/target/i386/kvm/xen-emu.c +++ b/target/i386/kvm/xen-emu.c @@ -262,6 +262,71 @@ static int do_add_to_physmap(struct kvm_xen_exit *exit, X86CPU *cpu, return add_to_physmap_one(xatp.space, xatp.idx, xatp.gpfn); } +static int do_add_to_physmap_batch(struct kvm_xen_exit *exit, X86CPU *cpu, + uint64_t arg) +{ +struct xen_add_to_physmap_batch xatpb; +unsigned long idxs_gva, gpfns_gva, errs_gva; +CPUState *cs = CPU(cpu); +size_t op_sz; + +if (hypercall_compat32(exit->u.hcall.longmode)) { +struct compat_xen_add_to_physmap_batch xatpb32; + +qemu_build_assert(sizeof(struct compat_xen_add_to_physmap_batch) == 20); +if (kvm_copy_from_gva(cs, arg, 
&xatpb32, sizeof(xatpb32))) { +return -EFAULT; +} +xatpb.domid = xatpb32.domid; +xatpb.space = xatpb32.space; +xatpb.size = xatpb32.size; + +idxs_gva = xatpb32.idxs.c; +gpfns_gva = xatpb32.gpfns.c; +errs_gva = xatpb32.errs.c; +op_sz = sizeof(uint32_t); +} else { +if (kvm_copy_from_gva(cs, arg, &xatpb, sizeof(xatpb))) { +return -EFAULT; +} +op_sz = sizeof(unsigned long); +idxs_gva = (unsigned long)xatpb.idxs.p; +gpfns_gva = (unsigned long)xatpb.gpfns.p; +errs_gva = (unsigned long)xatpb.errs.p; +} + +if (xatpb.domid != DOMID_SELF && xatpb.domid != xen_domid) { +return -ESRCH; +} + +/* Explicitly invalid for the batch op. Not that we implement it anyway. */ +if (xatpb.space == XENMAPSPACE_gmfn_range) { +return -EINVAL; +} + +while (xatpb.size--) { +unsigned long idx = 0; +unsigned long gpfn = 0; +int err; + +/* For 32-bit compat this only copies the low 32 bits of each */ +if (kvm_copy_from_gva(cs, idxs_gva, &idx, op_sz) || +kvm_copy_from_gva(cs, gpfns_gva, &gpfn, op_sz)) { +return -EFAULT; +} +idxs_gva += op_sz; +gpfns_gva += op_sz; + +err = add_to_physmap_one(xatpb.space, idx, gpfn); + +if (kvm_copy_to_gva(cs, errs_gva, &err, sizeof(err))) { +return -EFAULT; +} +errs_gva += sizeof(err); +} +return 0; +} + static bool kvm_xen_hcall_memory_op(struct kvm_xen_exit *exit, X86CPU *cpu, int cmd, uint64_t arg) { @@ -272,6 +337,10 @@ static bool kvm_xen_hcall_memory_op(struct kvm_xen_exit *exit, X86CPU *cpu, err = do_add_to_physmap(exit, cpu, arg); break; +case XENMEM_add_to_physmap_batch: +err = do_add_to_physmap_batch(exit, cpu, arg); +break; + default: return false; } -- 2.39.0
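The batch loop reads each index and frame number with a per-guest element size (op_sz) into a zero-initialised unsigned long, which is what the "only copies the low 32 bits" comment refers to. A standalone sketch of that widening read, with a plain byte buffer standing in for guest memory and a hypothetical function name; like the original, it is only correct on a little-endian host:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Read element n of a guest array whose elements are op_sz bytes wide
 * (4 for a 32-bit guest, 8 for a 64-bit guest) into a zeroed 64-bit value.
 * Copying into the start of the zeroed value gives the right number only on
 * a little-endian host, which is the assumption here (x86).
 */
uint64_t sketch_read_elem(const uint8_t *guest_array, size_t n, size_t op_sz)
{
    uint64_t val = 0;

    assert(guest_array != NULL);
    assert(op_sz == sizeof(uint32_t) || op_sz == sizeof(uint64_t));

    memcpy(&val, guest_array + n * op_sz, op_sz);
    return val;
}
```

Zeroing the destination first is what makes the 32-bit case safe: the upper half of the value is never written, so it must not carry stale data from a previous iteration.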
[PATCH v11 04/59] i386/kvm: Add xen-version KVM accelerator property and init KVM Xen support
From: David Woodhouse This just initializes the basic Xen support in KVM for now. Only permitted on TYPE_PC_MACHINE because that's where the sysbus devices for Xen heap overlay, event channel, grant tables and other stuff will exist. There's no point having the basic hypercall support if nothing else works. Provide sysemu/kvm_xen.h and a kvm_xen_get_caps() which will be used later by support devices. Signed-off-by: David Woodhouse Reviewed-by: Paul Durrant --- accel/kvm/kvm-all.c | 1 + include/sysemu/kvm_int.h| 2 ++ include/sysemu/kvm_xen.h| 20 + target/i386/kvm/kvm.c | 59 + target/i386/kvm/meson.build | 2 ++ target/i386/kvm/xen-emu.c | 58 target/i386/kvm/xen-emu.h | 19 7 files changed, 161 insertions(+) create mode 100644 include/sysemu/kvm_xen.h create mode 100644 target/i386/kvm/xen-emu.c create mode 100644 target/i386/kvm/xen-emu.h diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c index 9b26582655..f242e36316 100644 --- a/accel/kvm/kvm-all.c +++ b/accel/kvm/kvm-all.c @@ -3703,6 +3703,7 @@ static void kvm_accel_instance_init(Object *obj) s->kvm_dirty_ring_size = 0; s->notify_vmexit = NOTIFY_VMEXIT_OPTION_RUN; s->notify_window = 0; +s->xen_version = 0; } /** diff --git a/include/sysemu/kvm_int.h b/include/sysemu/kvm_int.h index 60b520a13e..7f945bc763 100644 --- a/include/sysemu/kvm_int.h +++ b/include/sysemu/kvm_int.h @@ -118,6 +118,8 @@ struct KVMState struct KVMDirtyRingReaper reaper; NotifyVmexitOption notify_vmexit; uint32_t notify_window; +uint32_t xen_version; +uint32_t xen_caps; }; void kvm_memory_listener_register(KVMState *s, KVMMemoryListener *kml, diff --git a/include/sysemu/kvm_xen.h b/include/sysemu/kvm_xen.h new file mode 100644 index 00..296533f2d5 --- /dev/null +++ b/include/sysemu/kvm_xen.h @@ -0,0 +1,20 @@ +/* + * Xen HVM emulation support in KVM + * + * Copyright © 2019 Oracle and/or its affiliates. All rights reserved. + * Copyright © 2022 Amazon.com, Inc. or its affiliates. All Rights Reserved. 
+ * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. + * + */ + +#ifndef QEMU_SYSEMU_KVM_XEN_H +#define QEMU_SYSEMU_KVM_XEN_H + +uint32_t kvm_xen_get_caps(void); + +#define kvm_xen_has_cap(cap) (!!(kvm_xen_get_caps() & \ + KVM_XEN_HVM_CONFIG_ ## cap)) + +#endif /* QEMU_SYSEMU_KVM_XEN_H */ diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c index 5870301991..aa6eac7cad 100644 --- a/target/i386/kvm/kvm.c +++ b/target/i386/kvm/kvm.c @@ -31,6 +31,7 @@ #include "sysemu/runstate.h" #include "kvm_i386.h" #include "sev.h" +#include "xen-emu.h" #include "hyperv.h" #include "hyperv-proto.h" @@ -42,6 +43,7 @@ #include "qemu/error-report.h" #include "qemu/memalign.h" #include "hw/i386/x86.h" +#include "hw/i386/pc.h" #include "hw/i386/apic.h" #include "hw/i386/apic_internal.h" #include "hw/i386/apic-msidef.h" @@ -49,6 +51,8 @@ #include "hw/i386/x86-iommu.h" #include "hw/i386/e820_memory_layout.h" +#include "hw/xen/xen.h" + #include "hw/pci/pci.h" #include "hw/pci/msi.h" #include "hw/pci/msix.h" @@ -2514,6 +2518,22 @@ int kvm_arch_init(MachineState *ms, KVMState *s) } } +if (s->xen_version) { +#ifdef CONFIG_XEN_EMU +if (!object_dynamic_cast(OBJECT(ms), TYPE_PC_MACHINE)) { +error_report("kvm: Xen support only available in PC machine"); +return -ENOTSUP; +} +ret = kvm_xen_init(s); +if (ret < 0) { +return ret; +} +#else +error_report("kvm: Xen support not enabled in qemu"); +return -ENOTSUP; +#endif +} + ret = kvm_get_supported_msrs(s); if (ret < 0) { return ret; @@ -5704,6 +5724,36 @@ static void kvm_arch_set_notify_window(Object *obj, Visitor *v, s->notify_window = value; } +static void kvm_arch_get_xen_version(Object *obj, Visitor *v, + const char *name, void *opaque, + Error **errp) +{ +KVMState *s = KVM_STATE(obj); +uint32_t value = s->xen_version; + +visit_type_uint32(v, name, &value, errp); +} + +static void kvm_arch_set_xen_version(Object *obj, Visitor *v, + const char *name, void 
*opaque, + Error **errp) +{ +KVMState *s = KVM_STATE(obj); +Error *error = NULL; +uint32_t value; + +visit_type_uint32(v, name, &value, &error); +if (error) { +error_propagate(errp, error); +return; +} + +s->xen_version = value; +if (value && xen_mode == XEN_DISABLED) { +xen_mode = XEN_EMULATE; +} +} + void kvm_arch_accel_class_init(ObjectClass *oc) { object_class_property_add_enum(oc, "notify-vmexit", "NotifyVMex
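The kvm_xen_has_cap() macro introduced in kvm_xen.h pastes a short capability name onto the KVM_XEN_HVM_CONFIG_ prefix and normalises the masked result to 0 or 1. The same shape can be tried in isolation; the bit values and sketch_* names here are hypothetical, since the real flags come from the kernel headers:

```c
#include <stdint.h>

/* Hypothetical capability bits; the real ones live in the kernel headers. */
#define SKETCH_CONFIG_SHARED_INFO (1u << 2)
#define SKETCH_CONFIG_EVTCHN_SEND (1u << 5)

/* Stand-in for the capability word cached at init time. */
uint32_t sketch_caps;

uint32_t sketch_get_caps(void)
{
    return sketch_caps;
}

/*
 * Same shape as kvm_xen_has_cap(): "##" pastes the short name onto the
 * prefix, and "!!" collapses any non-zero masked value to exactly 1.
 */
#define sketch_has_cap(cap) (!!(sketch_get_caps() & SKETCH_CONFIG_ ## cap))
```

A call site then reads as sketch_has_cap(EVTCHN_SEND), and a misspelt capability name fails at compile time because the pasted identifier does not exist.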
[PATCH v11 28/59] i386/xen: Add support for Xen event channel delivery to vCPU
From: David Woodhouse The kvm_xen_inject_vcpu_callback_vector() function will either deliver the per-vCPU local APIC vector (as an MSI), or just kick the vCPU out of the kernel to trigger KVM's automatic delivery of the global vector. Support for asserting the GSI/PCI_INTX callbacks will come later. Also add kvm_xen_get_vcpu_info_hva() which returns the vcpu_info of a given vCPU. Signed-off-by: David Woodhouse Reviewed-by: Paul Durrant --- include/sysemu/kvm_xen.h | 2 + target/i386/cpu.h | 2 + target/i386/kvm/xen-emu.c | 91 --- 3 files changed, 89 insertions(+), 6 deletions(-) diff --git a/include/sysemu/kvm_xen.h b/include/sysemu/kvm_xen.h index 0c3a273549..0c0efbe699 100644 --- a/include/sysemu/kvm_xen.h +++ b/include/sysemu/kvm_xen.h @@ -21,6 +21,8 @@ int kvm_xen_soft_reset(void); uint32_t kvm_xen_get_caps(void); +void *kvm_xen_get_vcpu_info_hva(uint32_t vcpu_id); +void kvm_xen_inject_vcpu_callback_vector(uint32_t vcpu_id, int type); #define kvm_xen_has_cap(cap) (!!(kvm_xen_get_caps() & \ KVM_XEN_HVM_CONFIG_ ## cap)) diff --git a/target/i386/cpu.h b/target/i386/cpu.h index 938a1b9c8b..c9b12e7476 100644 --- a/target/i386/cpu.h +++ b/target/i386/cpu.h @@ -1788,6 +1788,8 @@ typedef struct CPUArchState { #endif #if defined(CONFIG_KVM) struct kvm_nested_state *nested_state; +MemoryRegion *xen_vcpu_info_mr; +void *xen_vcpu_info_hva; uint64_t xen_vcpu_info_gpa; uint64_t xen_vcpu_info_default_gpa; uint64_t xen_vcpu_time_info_gpa; diff --git a/target/i386/kvm/xen-emu.c b/target/i386/kvm/xen-emu.c index e80de809fc..4513f07c68 100644 --- a/target/i386/kvm/xen-emu.c +++ b/target/i386/kvm/xen-emu.c @@ -21,6 +21,8 @@ #include "trace.h" #include "sysemu/runstate.h" +#include "hw/pci/msi.h" +#include "hw/i386/apic-msidef.h" #include "hw/i386/kvm/xen_overlay.h" #include "hw/i386/kvm/xen_evtchn.h" @@ -248,6 +250,40 @@ static void do_set_vcpu_callback_vector(CPUState *cs, run_on_cpu_data data) } } +static int set_vcpu_info(CPUState *cs, uint64_t gpa) +{ +X86CPU *cpu = X86_CPU(cs); 
+CPUX86State *env = &cpu->env; +MemoryRegionSection mrs = { .mr = NULL }; +void *vcpu_info_hva = NULL; +int ret; + +ret = kvm_xen_set_vcpu_attr(cs, KVM_XEN_VCPU_ATTR_TYPE_VCPU_INFO, gpa); +if (ret || gpa == INVALID_GPA) { +goto out; +} + +mrs = memory_region_find(get_system_memory(), gpa, + sizeof(struct vcpu_info)); +if (!mrs.mr) { +ret = -EINVAL; +} else if (!mrs.mr->ram_block || mrs.size < sizeof(struct vcpu_info) || + !(vcpu_info_hva = qemu_map_ram_ptr(mrs.mr->ram_block, + mrs.offset_within_region))) { +ret = -EINVAL; +memory_region_unref(mrs.mr); +mrs.mr = NULL; +} + + out: +if (env->xen_vcpu_info_mr) { +memory_region_unref(env->xen_vcpu_info_mr); +} +env->xen_vcpu_info_hva = vcpu_info_hva; +env->xen_vcpu_info_mr = mrs.mr; +return ret; +} + static void do_set_vcpu_info_default_gpa(CPUState *cs, run_on_cpu_data data) { X86CPU *cpu = X86_CPU(cs); @@ -257,8 +293,7 @@ static void do_set_vcpu_info_default_gpa(CPUState *cs, run_on_cpu_data data) /* Changing the default does nothing if a vcpu_info was explicitly set. */ if (env->xen_vcpu_info_gpa == INVALID_GPA) { -kvm_xen_set_vcpu_attr(cs, KVM_XEN_VCPU_ATTR_TYPE_VCPU_INFO, - env->xen_vcpu_info_default_gpa); +set_vcpu_info(cs, env->xen_vcpu_info_default_gpa); } } @@ -269,8 +304,52 @@ static void do_set_vcpu_info_gpa(CPUState *cs, run_on_cpu_data data) env->xen_vcpu_info_gpa = data.host_ulong; -kvm_xen_set_vcpu_attr(cs, KVM_XEN_VCPU_ATTR_TYPE_VCPU_INFO, - env->xen_vcpu_info_gpa); +set_vcpu_info(cs, env->xen_vcpu_info_gpa); +} + +void *kvm_xen_get_vcpu_info_hva(uint32_t vcpu_id) +{ +CPUState *cs = qemu_get_cpu(vcpu_id); +if (!cs) { +return NULL; +} + +return X86_CPU(cs)->env.xen_vcpu_info_hva; +} + +void kvm_xen_inject_vcpu_callback_vector(uint32_t vcpu_id, int type) +{ +CPUState *cs = qemu_get_cpu(vcpu_id); +uint8_t vector; + +if (!cs) { +return; +} + +vector = X86_CPU(cs)->env.xen_vcpu_callback_vector; +if (vector) { +/* + * The per-vCPU callback vector injected via lapic. Just + * deliver it as an MSI. 
+ */ +MSIMessage msg = { +.address = APIC_DEFAULT_ADDRESS | X86_CPU(cs)->apic_id, +.data = vector | (1UL << MSI_DATA_LEVEL_SHIFT), +}; +kvm_irqchip_send_msi(kvm_state, msg); +return; +} + +switch (type) { +case HVM_PARAM_CALLBACK_TYPE_VECTOR: +/* + * If the evtchn_upcall_pending field in the vcpu_info is set, then + * KVM will automatically deliver the
[PATCH v11 32/59] hw/xen: Implement EVTCHNOP_bind_virq
From: David Woodhouse Add the array of virq ports to each vCPU so that we can deliver timers, debug ports, etc. Global virqs are allocated against vCPU 0 initially, but can be migrated to other vCPUs (when we implement that). The kernel needs to know about VIRQ_TIMER in order to accelerate timers, so tell it via KVM_XEN_VCPU_ATTR_TYPE_TIMER. Also save/restore the value of the singleshot timer across migration, as the kernel will handle the hypercalls automatically now. Signed-off-by: David Woodhouse Reviewed-by: Paul Durrant --- hw/i386/kvm/xen_evtchn.c | 85 hw/i386/kvm/xen_evtchn.h | 2 + include/sysemu/kvm_xen.h | 1 + target/i386/cpu.h | 4 ++ target/i386/kvm/xen-emu.c | 91 +++ target/i386/machine.c | 2 + 6 files changed, 185 insertions(+) diff --git a/hw/i386/kvm/xen_evtchn.c b/hw/i386/kvm/xen_evtchn.c index deea7de027..da2f5711dd 100644 --- a/hw/i386/kvm/xen_evtchn.c +++ b/hw/i386/kvm/xen_evtchn.c @@ -244,6 +244,11 @@ static bool valid_port(evtchn_port_t port) } } +static bool valid_vcpu(uint32_t vcpu) +{ +return !!qemu_get_cpu(vcpu); +} + int xen_evtchn_status_op(struct evtchn_status *status) { XenEvtchnState *s = xen_evtchn_singleton; @@ -494,6 +499,43 @@ static void free_port(XenEvtchnState *s, evtchn_port_t port) clear_port_pending(s, port); } +static int allocate_port(XenEvtchnState *s, uint32_t vcpu, uint16_t type, + uint16_t val, evtchn_port_t *port) +{ +evtchn_port_t p = 1; + +for (p = 1; valid_port(p); p++) { +if (s->port_table[p].type == EVTCHNSTAT_closed) { +s->port_table[p].vcpu = vcpu; +s->port_table[p].type = type; +s->port_table[p].type_val = val; + +*port = p; + +if (s->nr_ports < p + 1) { +s->nr_ports = p + 1; +} + +return 0; +} +} +return -ENOSPC; +} + +static bool virq_is_global(uint32_t virq) +{ +switch (virq) { +case VIRQ_TIMER: +case VIRQ_DEBUG: +case VIRQ_XENOPROF: +case VIRQ_XENPMU: +return false; + +default: +return true; +} +} + static int close_port(XenEvtchnState *s, evtchn_port_t port) { XenEvtchnPort *p = &s->port_table[port]; @@ 
-502,6 +544,11 @@ static int close_port(XenEvtchnState *s, evtchn_port_t port) case EVTCHNSTAT_closed: return -ENOENT; +case EVTCHNSTAT_virq: +kvm_xen_set_vcpu_virq(virq_is_global(p->type_val) ? 0 : p->vcpu, + p->type_val, 0); +break; + default: break; } @@ -553,3 +600,41 @@ int xen_evtchn_unmask_op(struct evtchn_unmask *unmask) return ret; } + +int xen_evtchn_bind_virq_op(struct evtchn_bind_virq *virq) +{ +XenEvtchnState *s = xen_evtchn_singleton; +int ret; + +if (!s) { +return -ENOTSUP; +} + +if (virq->virq >= NR_VIRQS) { +return -EINVAL; +} + +/* Global VIRQ must be allocated on vCPU0 first */ +if (virq_is_global(virq->virq) && virq->vcpu != 0) { +return -EINVAL; +} + +if (!valid_vcpu(virq->vcpu)) { +return -ENOENT; +} + +qemu_mutex_lock(&s->port_lock); + +ret = allocate_port(s, virq->vcpu, EVTCHNSTAT_virq, virq->virq, +&virq->port); +if (!ret) { +ret = kvm_xen_set_vcpu_virq(virq->vcpu, virq->virq, virq->port); +if (ret) { +free_port(s, virq->port); +} +} + +qemu_mutex_unlock(&s->port_lock); + +return ret; +} diff --git a/hw/i386/kvm/xen_evtchn.h b/hw/i386/kvm/xen_evtchn.h index 69c6b0d743..0ea13dda3a 100644 --- a/hw/i386/kvm/xen_evtchn.h +++ b/hw/i386/kvm/xen_evtchn.h @@ -18,8 +18,10 @@ int xen_evtchn_set_callback_param(uint64_t param); struct evtchn_status; struct evtchn_close; struct evtchn_unmask; +struct evtchn_bind_virq; int xen_evtchn_status_op(struct evtchn_status *status); int xen_evtchn_close_op(struct evtchn_close *close); int xen_evtchn_unmask_op(struct evtchn_unmask *unmask); +int xen_evtchn_bind_virq_op(struct evtchn_bind_virq *virq); #endif /* QEMU_XEN_EVTCHN_H */ diff --git a/include/sysemu/kvm_xen.h b/include/sysemu/kvm_xen.h index 0c0efbe699..297630cd87 100644 --- a/include/sysemu/kvm_xen.h +++ b/include/sysemu/kvm_xen.h @@ -23,6 +23,7 @@ int kvm_xen_soft_reset(void); uint32_t kvm_xen_get_caps(void); void *kvm_xen_get_vcpu_info_hva(uint32_t vcpu_id); void kvm_xen_inject_vcpu_callback_vector(uint32_t vcpu_id, int type); +int 
kvm_xen_set_vcpu_virq(uint32_t vcpu_id, uint16_t virq, uint16_t port); #define kvm_xen_has_cap(cap) (!!(kvm_xen_get_caps() & \ KVM_XEN_HVM_CONFIG_ ## cap)) diff --git a/target/i386/cpu.h b/target/i386/cpu.h index c9b12e7476..dba8732fc6 100644 --- a/target/i386/cpu.h +++ b/target/i386/cpu.h @@ -27,6 +27,8 @@ #include "qapi/qapi-types-common.h" #include "qemu/cpu-float
[PATCH v11 05/59] i386/kvm: handle Xen HVM cpuid leaves
From: Joao Martins Introduce support for emulating CPUID for Xen HVM guests. It doesn't make sense to advertise the KVM leaves to a Xen guest, so do Xen unconditionally when the xen-version machine property is set. Signed-off-by: Joao Martins [dwmw2: Obtain xen_version from KVM property, make it automatic] Signed-off-by: David Woodhouse Reviewed-by: Paul Durrant --- target/i386/cpu.c | 1 + target/i386/cpu.h | 2 + target/i386/kvm/kvm.c | 77 ++- target/i386/kvm/xen-emu.c | 4 +- target/i386/kvm/xen-emu.h | 13 ++- 5 files changed, 91 insertions(+), 6 deletions(-) diff --git a/target/i386/cpu.c b/target/i386/cpu.c index 4d2b8d0444..eb5a466d4e 100644 --- a/target/i386/cpu.c +++ b/target/i386/cpu.c @@ -7070,6 +7070,7 @@ static Property x86_cpu_properties[] = { * own cache information (see x86_cpu_load_def()). */ DEFINE_PROP_BOOL("legacy-cache", X86CPU, legacy_cache, true), +DEFINE_PROP_BOOL("xen-vapic", X86CPU, xen_vapic, false), /* * From "Requirements for Implementing the Microsoft diff --git a/target/i386/cpu.h b/target/i386/cpu.h index d4bc19577a..c6c57baed5 100644 --- a/target/i386/cpu.h +++ b/target/i386/cpu.h @@ -1964,6 +1964,8 @@ struct ArchCPU { int32_t thread_id; int32_t hv_max_vps; + +bool xen_vapic; }; diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c index aa6eac7cad..2b3daabf7b 100644 --- a/target/i386/kvm/kvm.c +++ b/target/i386/kvm/kvm.c @@ -22,6 +22,7 @@ #include #include "standard-headers/asm-x86/kvm_para.h" +#include "hw/xen/interface/arch-x86/cpuid.h" #include "cpu.h" #include "host-cpu.h" @@ -1804,7 +1805,77 @@ int kvm_arch_init_vcpu(CPUState *cs) has_msr_hv_hypercall = true; } -if (cpu->expose_kvm) { +if (cs->kvm_state->xen_version) { +#ifdef CONFIG_XEN_EMU +struct kvm_cpuid_entry2 *xen_max_leaf; + +memcpy(signature, "XenVMMXenVMM", 12); + +xen_max_leaf = c = &cpuid_data.entries[cpuid_i++]; +c->function = kvm_base + XEN_CPUID_SIGNATURE; +c->eax = kvm_base + XEN_CPUID_TIME; +c->ebx = signature[0]; +c->ecx = signature[1]; +c->edx = 
signature[2]; + +c = &cpuid_data.entries[cpuid_i++]; +c->function = kvm_base + XEN_CPUID_VENDOR; +c->eax = cs->kvm_state->xen_version; +c->ebx = 0; +c->ecx = 0; +c->edx = 0; + +c = &cpuid_data.entries[cpuid_i++]; +c->function = kvm_base + XEN_CPUID_HVM_MSR; +/* Number of hypercall-transfer pages */ +c->eax = 1; +/* Hypercall MSR base address */ +if (hyperv_enabled(cpu)) { +c->ebx = XEN_HYPERCALL_MSR_HYPERV; +kvm_xen_init(cs->kvm_state, c->ebx); +} else { +c->ebx = XEN_HYPERCALL_MSR; +} +c->ecx = 0; +c->edx = 0; + +c = &cpuid_data.entries[cpuid_i++]; +c->function = kvm_base + XEN_CPUID_TIME; +c->eax = ((!!tsc_is_stable_and_known(env) << 1) | +(!!(env->features[FEAT_8000_0001_EDX] & CPUID_EXT2_RDTSCP) << 2)); +/* default=0 (emulate if necessary) */ +c->ebx = 0; +/* guest tsc frequency */ +c->ecx = env->user_tsc_khz; +/* guest tsc incarnation (migration count) */ +c->edx = 0; + +c = &cpuid_data.entries[cpuid_i++]; +c->function = kvm_base + XEN_CPUID_HVM; +xen_max_leaf->eax = kvm_base + XEN_CPUID_HVM; +if (cs->kvm_state->xen_version >= XEN_VERSION(4, 5)) { +c->function = kvm_base + XEN_CPUID_HVM; + +if (cpu->xen_vapic) { +c->eax |= XEN_HVM_CPUID_APIC_ACCESS_VIRT; +c->eax |= XEN_HVM_CPUID_X2APIC_VIRT; +} + +c->eax |= XEN_HVM_CPUID_IOMMU_MAPPINGS; + +if (cs->kvm_state->xen_version >= XEN_VERSION(4, 6)) { +c->eax |= XEN_HVM_CPUID_VCPU_ID_PRESENT; +c->ebx = cs->cpu_index; +} +} + +kvm_base += 0x100; +#else /* CONFIG_XEN_EMU */ +/* This should never happen as kvm_arch_init() would have died first. */ +fprintf(stderr, "Cannot enable Xen CPUID without Xen support\n"); +abort(); +#endif +} else if (cpu->expose_kvm) { memcpy(signature, "KVMKVMKVM\0\0\0", 12); c = &cpuid_data.entries[cpuid_i++]; c->function = KVM_CPUID_SIGNATURE | kvm_base; @@ -2524,7 +2595,9 @@ int kvm_arch_init(MachineState *ms, KVMState *s) error_report("kvm: Xen support only available in PC machine"); return -ENOTSUP; } -ret = kvm_xen_init(s); +/* hyperv_enabled() doesn't work yet. 
*/ +uint32_t msr = XEN_HYPERCALL_MSR; +ret = kvm_xen_init(s, msr); if (ret < 0) { return ret; } diff --git a/target/i386/kvm/xen-emu.c b/target/i386/kvm/xen-emu.c index b556d903aa..34d5bc1bc9 100644 --- a/target/i386/kvm/xen
[PATCH v11 59/59] i386/xen: Document Xen HVM emulation
From: David Woodhouse

Signed-off-by: David Woodhouse
Reviewed-by: Paul Durrant
---
 docs/system/i386/xen.rst    | 76 +
 docs/system/target-i386.rst |  1 +
 2 files changed, 77 insertions(+)
 create mode 100644 docs/system/i386/xen.rst

diff --git a/docs/system/i386/xen.rst b/docs/system/i386/xen.rst
new file mode 100644
index 0000000000..a00523b492
--- /dev/null
+++ b/docs/system/i386/xen.rst
@@ -0,0 +1,76 @@
+Xen HVM guest support
+=====================
+
+
+Description
+-----------
+
+KVM has support for hosting Xen guests, intercepting Xen hypercalls and event
+channel (Xen PV interrupt) delivery. This allows guests which expect to be
+run under Xen to be hosted in QEMU under Linux/KVM instead.
+
+Setup
+-----
+
+Xen mode is enabled by setting the ``xen-version`` property of the KVM
+accelerator, for example for Xen 4.10:
+
+.. parsed-literal::
+
+  |qemu_system| --accel kvm,xen-version=0x4000a
+
+Additionally, virtual APIC support can be advertised to the guest through the
+``xen-vapic`` CPU flag:
+
+.. parsed-literal::
+
+  |qemu_system| --accel kvm,xen-version=0x4000a --cpu host,+xen-vapic
+
+When Xen support is enabled, QEMU changes hypervisor identification (CPUID
+0x40000000..0x4000000A) to Xen. The KVM identification and features are not
+advertised to a Xen guest. If Hyper-V is also enabled, the Xen identification
+moves to leaves 0x40000100..0x4000010A.
+
+The Xen platform device is enabled automatically for a Xen guest. This allows
+a guest to unplug all emulated devices, in order to use Xen PV block and network
+drivers instead. Note that until the Xen PV device back ends are enabled to work
+with Xen mode in QEMU, that is unlikely to cause significant joy. Linux guests
+can be dissuaded from this by adding 'xen_emul_unplug=never' on their command
+line, and it can also be noted that AHCI disk controllers are exempt from being
+unplugged, as are passthrough VFIO PCI devices.
+
+Properties
+----------
+
+The following properties exist on the KVM accelerator object:
+
+``xen-version``
+  This property contains the Xen version in ``XENVER_version`` form, with the
+  major version in the top 16 bits and the minor version in the low 16 bits.
+  Setting this property enables the Xen guest support.
+
+``xen-evtchn-max-pirq``
+  Xen PIRQs represent an emulated physical interrupt, either GSI or MSI, which
+  can be routed to an event channel instead of to the emulated I/O or local
+  APIC. By default, QEMU permits only 256 PIRQs because this allows maximum
+  compatibility with 32-bit MSI where the higher bits of the PIRQ# would need
+  to be in the upper 64 bits of the MSI message. For guests with large numbers
+  of PCI devices (and none which are limited to 32-bit addressing) it may be
+  desirable to increase this value.
+
+``xen-gnttab-max-frames``
+  Xen grant tables are the means by which a Xen guest grants access to its
+  memory for PV back ends (disk, network, etc.). Since QEMU only supports v1
+  grant tables which are 8 bytes in size, each page (each frame) of the grant
+  table can reference 512 pages of guest memory. The default number of frames
+  is 64, allowing for 32768 pages of guest memory to be accessed by PV backends
+  through simultaneous grants. For guests with large numbers of PV devices and
+  high throughput, it may be desirable to increase this value.
+
+OS requirements
+---------------
+
+The minimal Xen support in the KVM accelerator requires the host to be running
+Linux v5.12 or newer. Later versions add optimisations: Linux v5.17 added
+acceleration of interrupt delivery via the Xen PIRQ mechanism, and Linux v5.19
+accelerated Xen PV timers and inter-processor interrupts (IPIs).
diff --git a/docs/system/target-i386.rst b/docs/system/target-i386.rst
index e64c013077..77c2f3b979 100644
--- a/docs/system/target-i386.rst
+++ b/docs/system/target-i386.rst
@@ -27,6 +27,7 @@ Architectural features
 
    i386/cpu
    i386/hyperv
+   i386/xen
    i386/kvm-pv
    i386/sgx
    i386/amd-memory-encryption
-- 
2.39.0
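For readers checking the numbers in the documentation above, here is a minimal sketch of the two calculations it describes: the ``XENVER_version`` encoding used by ``xen-version`` and the number of guest pages reachable through v1 grant tables. The helper names and the 4096-byte frame size are assumptions made for illustration; they are not taken from the QEMU tree.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: pack a Xen version the way the text describes,
 * major in the top 16 bits and minor in the low 16 bits.
 */
static uint32_t xen_version_encode(uint16_t major, uint16_t minor)
{
    return ((uint32_t)major << 16) | minor;
}

/*
 * Hypothetical helper: number of guest pages that can be granted at once
 * with v1 grant tables, where each entry is 8 bytes (per the text) and
 * each frame is assumed to be one 4096-byte page.
 */
static uint32_t gnttab_v1_pages(uint32_t nr_frames)
{
    const uint32_t frame_size = 4096; /* assumption: x86 4 KiB pages */
    const uint32_t entry_size = 8;    /* v1 grant entry size */

    assert(nr_frames > 0);
    return nr_frames * (frame_size / entry_size);
}
```

Under those assumptions, xen_version_encode(4, 10) gives the 0x4000a used in the command lines above, and gnttab_v1_pages(64) gives the 32768 pages quoted for the default of 64 frames.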
[PATCH v11 56/59] hw/xen: Support GSI mapping to PIRQ
From: David Woodhouse

If I advertise XENFEAT_hvm_pirqs then a guest now boots successfully as
long as I tell it 'pci=nomsi'.

[root@localhost ~]# cat /proc/interrupts
           CPU0
  0:         52   IO-APIC   2-edge      timer
  1:         16  xen-pirq   1-ioapic-edge  i8042
  4:       1534  xen-pirq   4-ioapic-edge  ttyS0
  8:          1  xen-pirq   8-ioapic-edge  rtc0
  9:          0  xen-pirq   9-ioapic-level  acpi
 11:       5648  xen-pirq  11-ioapic-level  ahci[0000:00:04.0]
 12:        257  xen-pirq  12-ioapic-edge  i8042
...

Signed-off-by: David Woodhouse
Reviewed-by: Paul Durrant
---
 hw/i386/kvm/xen_evtchn.c | 56 +++-
 hw/i386/kvm/xen_evtchn.h |  2 ++
 hw/i386/x86.c            | 16
 3 files changed, 73 insertions(+), 1 deletion(-)

diff --git a/hw/i386/kvm/xen_evtchn.c b/hw/i386/kvm/xen_evtchn.c
index f5e835ff70..8df95742a7 100644
--- a/hw/i386/kvm/xen_evtchn.c
+++ b/hw/i386/kvm/xen_evtchn.c
@@ -148,6 +148,9 @@ struct XenEvtchnState {
     /* GSI → PIRQ mapping (serialized) */
     uint16_t gsi_pirq[GSI_NUM_PINS];
 
+    /* Per-GSI assertion state (serialized) */
+    uint32_t pirq_gsi_set;
+
     /* Per-PIRQ information (rebuilt on migration) */
     struct pirq_info *pirq;
 };
@@ -246,6 +249,7 @@ static const VMStateDescription xen_evtchn_vmstate = {
         VMSTATE_VARRAY_UINT16_ALLOC(pirq_inuse_bitmap, XenEvtchnState,
                                     nr_pirq_inuse_words, 0,
                                     vmstate_info_uint64, uint64_t),
+        VMSTATE_UINT32(pirq_gsi_set, XenEvtchnState),
         VMSTATE_END_OF_LIST()
     }
 };
@@ -1506,6 +1510,51 @@ static int allocate_pirq(XenEvtchnState *s, int type, int gsi)
     return pirq;
 }
 
+bool xen_evtchn_set_gsi(int gsi, int level)
+{
+    XenEvtchnState *s = xen_evtchn_singleton;
+    int pirq;
+
+    assert(qemu_mutex_iothread_locked());
+
+    if (!s || gsi < 0 || gsi >= GSI_NUM_PINS) {
+        return false;
+    }
+
+    /*
+     * Check that it *isn't* the event channel GSI, and thus
+     * that we are not recursing and it's safe to take s->port_lock.
+     *
+     * Locking aside, it's perfectly sane to bail out early for that
+     * special case, as it would make no sense for the event channel
+     * GSI to be routed back to event channels, when the delivery
+     * method is to raise the GSI... that recursion wouldn't *just*
+     * be a locking issue.
+     */
+    if (gsi && gsi == s->callback_gsi) {
+        return false;
+    }
+
+    QEMU_LOCK_GUARD(&s->port_lock);
+
+    pirq = s->gsi_pirq[gsi];
+    if (!pirq) {
+        return false;
+    }
+
+    if (level) {
+        int port = s->pirq[pirq].port;
+
+        s->pirq_gsi_set |= (1U << gsi);
+        if (port) {
+            set_port_pending(s, port);
+        }
+    } else {
+        s->pirq_gsi_set &= ~(1U << gsi);
+    }
+    return true;
+}
+
 int xen_physdev_map_pirq(struct physdev_map_pirq *map)
 {
     XenEvtchnState *s = xen_evtchn_singleton;
@@ -1612,8 +1661,13 @@ int xen_physdev_eoi_pirq(struct physdev_eoi *eoi)
     if (gsi < 0) {
         return -EINVAL;
     }
+    if (s->pirq_gsi_set & (1U << gsi)) {
+        int port = s->pirq[pirq].port;
+        if (port) {
+            set_port_pending(s, port);
+        }
+    }
 
-    // XX: Reassert a level IRQ if needed */
     return 0;
 }
 
diff --git a/hw/i386/kvm/xen_evtchn.h b/hw/i386/kvm/xen_evtchn.h
index a7383f760c..95400b7fbf 100644
--- a/hw/i386/kvm/xen_evtchn.h
+++ b/hw/i386/kvm/xen_evtchn.h
@@ -24,6 +24,8 @@ void xen_evtchn_set_callback_level(int level);
 
 int xen_evtchn_set_port(uint16_t port);
 
+bool xen_evtchn_set_gsi(int gsi, int level);
+
 /*
  * These functions mirror the libxenevtchn library API, providing the QEMU
  * backend side of "interdomain" event channels.
diff --git a/hw/i386/x86.c b/hw/i386/x86.c
index eaff4227bd..594fd25c55 100644
--- a/hw/i386/x86.c
+++ b/hw/i386/x86.c
@@ -62,6 +62,11 @@
 #include CONFIG_DEVICES
 #include "kvm/kvm_i386.h"
 
+#ifdef CONFIG_XEN_EMU
+#include "hw/xen/xen.h"
+#include "hw/i386/kvm/xen_evtchn.h"
+#endif
+
 /* Physical Address of PVH entry point read from kernel ELF NOTE */
 static size_t pvh_start_addr;
 
@@ -609,6 +614,17 @@ void gsi_handler(void *opaque, int n, int level)
         }
         /* fall through */
     case ISA_NUM_IRQS ... IOAPIC_NUM_PINS - 1:
+#ifdef CONFIG_XEN_EMU
+        /*
+         * Xen delivers the GSI to the Legacy PIC (not that Legacy PIC
+         * routing actually works properly under Xen). And then to
+         * *either* the PIRQ handling or the I/OAPIC depending on
+         * whether the former wants it.
+         */
+        if (xen_mode == XEN_EMULATE && xen_evtchn_set_gsi(n, level)) {
+            break;
+        }
+#endif
         qemu_set_irq(s->ioapic_irq[n], level);
         break;
     case IO_APIC_SECONDARY_IRQBASE
-- 
2.39.0
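The level-triggered handling in this patch (remember which GSIs are asserted, deliver on assertion, and deliver again on EOI while the line is still high) can be illustrated with a small standalone model. This is only a sketch under simplifying assumptions: struct gsi_model, model_set_gsi() and model_eoi() are hypothetical names, the GSI maps straight to a port rather than going through a PIRQ, and a counter stands in for set_port_pending().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_NUM_GSI 24 /* assumed pin count, for illustration only */

/* Hypothetical, simplified stand-in for the event channel state. */
struct gsi_model {
    uint32_t asserted;            /* one bit per GSI, like pirq_gsi_set */
    uint16_t port[MODEL_NUM_GSI]; /* GSI -> event channel port, 0 = unbound */
    unsigned int deliveries;      /* counts simulated port deliveries */
};

/*
 * Returns true if the GSI was consumed by the event channel path, false
 * if the caller should fall back to the I/O APIC.
 */
static bool model_set_gsi(struct gsi_model *m, int gsi, int level)
{
    if (gsi < 0 || gsi >= MODEL_NUM_GSI || !m->port[gsi]) {
        return false;
    }
    if (level) {
        m->asserted |= 1U << gsi;
        m->deliveries++;
    } else {
        m->asserted &= ~(1U << gsi);
    }
    return true;
}

/* On EOI, redeliver the event if the line is still asserted. */
static void model_eoi(struct gsi_model *m, int gsi)
{
    assert(gsi >= 0 && gsi < MODEL_NUM_GSI);
    if ((m->asserted & (1U << gsi)) && m->port[gsi]) {
        m->deliveries++;
    }
}
```

Note the bounds check uses >= so that an index equal to the pin count never reaches the array or the shift.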
[PATCH v11 53/59] hw/xen: Automatically add xen-platform PCI device for emulated Xen guests
From: David Woodhouse

It isn't strictly mandatory but Linux guests at least will only map
their grant tables over the dummy BAR that it provides, and don't have
sufficient wit to map them in any other unused part of their guest
address space. So include it by default for minimal surprise factor.

As I come to document "how to run a Xen guest in QEMU", this means one
fewer thing to tell the user about, according to the mantra of "if it
needs documenting, fix it first, then document what remains".

Signed-off-by: David Woodhouse
Reviewed-by: Paul Durrant
---
 hw/i386/pc.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index a12a7a67e9..5ec3518b9e 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -1313,6 +1313,9 @@ void pc_basic_device_init(struct PCMachineState *pcms,
 #ifdef CONFIG_XEN_EMU
     if (xen_mode == XEN_EMULATE) {
         xen_evtchn_connect_gsis(gsi);
+        if (pcms->bus) {
+            pci_create_simple(pcms->bus, -1, "xen-platform");
+        }
     }
 #endif
 
-- 
2.39.0
[PATCH v11 24/59] i386/xen: implement HYPERVISOR_event_channel_op
From: Joao Martins

Signed-off-by: Joao Martins
[dwmw2: Ditch event_channel_op_compat which was never available to HVM guests]
Signed-off-by: David Woodhouse
Reviewed-by: Paul Durrant
---
 target/i386/kvm/xen-emu.c | 25 +
 1 file changed, 25 insertions(+)

diff --git a/target/i386/kvm/xen-emu.c b/target/i386/kvm/xen-emu.c
index f5c8b6d20c..0bca370ea4 100644
--- a/target/i386/kvm/xen-emu.c
+++ b/target/i386/kvm/xen-emu.c
@@ -28,6 +28,7 @@
 #include "hw/xen/interface/memory.h"
 #include "hw/xen/interface/hvm/hvm_op.h"
 #include "hw/xen/interface/vcpu.h"
+#include "hw/xen/interface/event_channel.h"
 
 #include "xen-compat.h"
 
@@ -588,6 +589,27 @@ static bool kvm_xen_hcall_vcpu_op(struct kvm_xen_exit *exit, X86CPU *cpu,
     return true;
 }
 
+static bool kvm_xen_hcall_evtchn_op(struct kvm_xen_exit *exit,
+                                    int cmd, uint64_t arg)
+{
+    int err = -ENOSYS;
+
+    switch (cmd) {
+    case EVTCHNOP_init_control:
+    case EVTCHNOP_expand_array:
+    case EVTCHNOP_set_priority:
+        /* We do not support FIFO channels at this point */
+        err = -ENOSYS;
+        break;
+
+    default:
+        return false;
+    }
+
+    exit->u.hcall.result = err;
+    return true;
+}
+
 int kvm_xen_soft_reset(void)
 {
     CPUState *cpu;
@@ -694,6 +716,9 @@ static bool do_kvm_xen_handle_exit(X86CPU *cpu, struct kvm_xen_exit *exit)
     case __HYPERVISOR_sched_op:
         return kvm_xen_hcall_sched_op(exit, cpu, exit->u.hcall.params[0],
                                       exit->u.hcall.params[1]);
+    case __HYPERVISOR_event_channel_op:
+        return kvm_xen_hcall_evtchn_op(exit, exit->u.hcall.params[0],
+                                       exit->u.hcall.params[1]);
     case __HYPERVISOR_vcpu_op:
         return kvm_xen_hcall_vcpu_op(exit, cpu,
                                      exit->u.hcall.params[0],
-- 
2.39.0
[PATCH v11 50/59] hw/xen: Add backend implementation of interdomain event channel support
From: David Woodhouse

This provides the QEMU side of interdomain event channels, allowing
events to be sent to/from the guest.

The API mirrors libxenevtchn, and in time both this and the real Xen
one will be available through ops structures so that the PV backend
drivers can use the correct one as appropriate.

For now, this implementation can be used directly by our XenStore,
which will be for emulated mode only.

Signed-off-by: David Woodhouse
Reviewed-by: Paul Durrant
---
 hw/i386/kvm/xen_evtchn.c | 340 ++-
 hw/i386/kvm/xen_evtchn.h |  19 +++
 2 files changed, 352 insertions(+), 7 deletions(-)

diff --git a/hw/i386/kvm/xen_evtchn.c b/hw/i386/kvm/xen_evtchn.c
index 06572b3e10..519b8e0600 100644
--- a/hw/i386/kvm/xen_evtchn.c
+++ b/hw/i386/kvm/xen_evtchn.c
@@ -38,6 +38,7 @@
 #include "sysemu/kvm.h"
 #include "sysemu/kvm_xen.h"
 #include
+#include
 
 #include "hw/xen/interface/memory.h"
 #include "hw/xen/interface/hvm/params.h"
@@ -88,6 +89,13 @@ struct compat_shared_info {
 
 #define COMPAT_EVTCHN_2L_NR_CHANNELS 1024
 
+/* Local private implementation of struct xenevtchn_handle */
+struct xenevtchn_handle {
+    evtchn_port_t be_port;
+    evtchn_port_t guest_port; /* Or zero for unbound */
+    int fd;
+};
+
 /*
  * For unbound/interdomain ports there are only two possible remote
  * domains; self and QEMU.
Use a single high bit in type_val for that, @@ -111,6 +119,8 @@ struct XenEvtchnState { uint32_t nr_ports; XenEvtchnPort port_table[EVTCHN_2L_NR_CHANNELS]; qemu_irq gsis[GSI_NUM_PINS]; + +struct xenevtchn_handle *be_handles[EVTCHN_2L_NR_CHANNELS]; }; struct XenEvtchnState *xen_evtchn_singleton; @@ -118,6 +128,18 @@ struct XenEvtchnState *xen_evtchn_singleton; /* Top bits of callback_param are the type (HVM_PARAM_CALLBACK_TYPE_xxx) */ #define CALLBACK_VIA_TYPE_SHIFT 56 +static void unbind_backend_ports(XenEvtchnState *s); + +static int xen_evtchn_pre_load(void *opaque) +{ +XenEvtchnState *s = opaque; + +/* Unbind all the backend-side ports; they need to rebind */ +unbind_backend_ports(s); + +return 0; +} + static int xen_evtchn_post_load(void *opaque, int version_id) { XenEvtchnState *s = opaque; @@ -151,6 +173,7 @@ static const VMStateDescription xen_evtchn_vmstate = { .version_id = 1, .minimum_version_id = 1, .needed = xen_evtchn_is_needed, +.pre_load = xen_evtchn_pre_load, .post_load = xen_evtchn_post_load, .fields = (VMStateField[]) { VMSTATE_UINT64(callback_param, XenEvtchnState), @@ -423,6 +446,20 @@ static int assign_kernel_port(uint16_t type, evtchn_port_t port, return kvm_vm_ioctl(kvm_state, KVM_XEN_HVM_SET_ATTR, &ha); } +static int assign_kernel_eventfd(uint16_t type, evtchn_port_t port, int fd) +{ +struct kvm_xen_hvm_attr ha; + +ha.type = KVM_XEN_ATTR_TYPE_EVTCHN; +ha.u.evtchn.send_port = port; +ha.u.evtchn.type = type; +ha.u.evtchn.flags = 0; +ha.u.evtchn.deliver.eventfd.port = 0; +ha.u.evtchn.deliver.eventfd.fd = fd; + +return kvm_vm_ioctl(kvm_state, KVM_XEN_HVM_SET_ATTR, &ha); +} + static bool valid_port(evtchn_port_t port) { if (!port) { @@ -441,6 +478,32 @@ static bool valid_vcpu(uint32_t vcpu) return !!qemu_get_cpu(vcpu); } +static void unbind_backend_ports(XenEvtchnState *s) +{ +XenEvtchnPort *p; +int i; + +for (i = 1; i < s->nr_ports; i++) { +p = &s->port_table[i]; +if (p->type == EVTCHNSTAT_interdomain && +(p->type_val & 
PORT_INFO_TYPEVAL_REMOTE_QEMU) ) { +evtchn_port_t be_port = p->type_val & PORT_INFO_TYPEVAL_REMOTE_PORT_MASK; + +if (s->be_handles[be_port]) { +/* This part will be overwritten on the load anyway. */ +p->type = EVTCHNSTAT_unbound; +p->type_val = PORT_INFO_TYPEVAL_REMOTE_QEMU; + +/* Leave the backend port open and unbound too. */ +if (kvm_xen_has_cap(EVTCHN_SEND)) { +deassign_kernel_port(i); +} +s->be_handles[be_port]->guest_port = 0; +} +} +} +} + int xen_evtchn_status_op(struct evtchn_status *status) { XenEvtchnState *s = xen_evtchn_singleton; @@ -876,7 +939,14 @@ static int close_port(XenEvtchnState *s, evtchn_port_t port) case EVTCHNSTAT_interdomain: if (p->type_val & PORT_INFO_TYPEVAL_REMOTE_QEMU) { -/* Not yet implemented. This can't happen! */ +uint16_t be_port = p->type_val & ~PORT_INFO_TYPEVAL_REMOTE_QEMU; +struct xenevtchn_handle *xc = s->be_handles[be_port]; +if (xc) { +if (kvm_xen_has_cap(EVTCHN_SEND)) { +deassign_kernel_port(port); +} +xc->guest_port = 0; +} } else { /* Loopback interdomain */ XenEvtchnPort *rp = &s->port_table[p->type_val]; @@ -1108,8 +1178,27
[PATCH v11 10/59] i386/xen: implement HYPERVISOR_xen_version
From: Joao Martins

This is just meant to serve as an example of how we can implement
hypercalls. xen_version is chosen specifically because QEMU controls
which features are advertised to the guest, so handling it here seems
appropriate.

Signed-off-by: Joao Martins
[dwmw2: Implement kvm_gva_rw() safely]
Signed-off-by: David Woodhouse
Reviewed-by: Paul Durrant
---
 target/i386/kvm/xen-emu.c | 86 +++
 1 file changed, 86 insertions(+)

diff --git a/target/i386/kvm/xen-emu.c b/target/i386/kvm/xen-emu.c
index 476f464ee2..56b80a7880 100644
--- a/target/i386/kvm/xen-emu.c
+++ b/target/i386/kvm/xen-emu.c
@@ -14,9 +14,55 @@
 #include "sysemu/kvm_int.h"
 #include "sysemu/kvm_xen.h"
 #include "kvm/kvm_i386.h"
+#include "exec/address-spaces.h"
 #include "xen-emu.h"
 #include "trace.h"
 
+#include "hw/xen/interface/version.h"
+
+static int kvm_gva_rw(CPUState *cs, uint64_t gva, void *_buf, size_t sz,
+                      bool is_write)
+{
+    uint8_t *buf = (uint8_t *)_buf;
+    int ret;
+
+    while (sz) {
+        struct kvm_translation tr = {
+            .linear_address = gva,
+        };
+
+        size_t len = TARGET_PAGE_SIZE - (tr.linear_address & ~TARGET_PAGE_MASK);
+        if (len > sz) {
+            len = sz;
+        }
+
+        ret = kvm_vcpu_ioctl(cs, KVM_TRANSLATE, &tr);
+        if (ret || !tr.valid || (is_write && !tr.writeable)) {
+            return -EFAULT;
+        }
+
+        cpu_physical_memory_rw(tr.physical_address, buf, len, is_write);
+
+        buf += len;
+        sz -= len;
+        gva += len;
+    }
+
+    return 0;
+}
+
+static inline int kvm_copy_from_gva(CPUState *cs, uint64_t gva, void *buf,
+                                    size_t sz)
+{
+    return kvm_gva_rw(cs, gva, buf, sz, false);
+}
+
+static inline int kvm_copy_to_gva(CPUState *cs, uint64_t gva, void *buf,
+                                  size_t sz)
+{
+    return kvm_gva_rw(cs, gva, buf, sz, true);
+}
+
 int kvm_xen_init(KVMState *s, uint32_t hypercall_msr)
 {
     const int required_caps = KVM_XEN_HVM_CONFIG_HYPERCALL_MSR |
@@ -87,6 +133,43 @@ uint32_t kvm_xen_get_caps(void)
     return kvm_state->xen_caps;
 }
 
+static bool kvm_xen_hcall_xen_version(struct kvm_xen_exit *exit, X86CPU *cpu,
+                                      int cmd, uint64_t arg)
+{
+    int err = 0;
+
+    switch (cmd) {
+    case XENVER_get_features: {
+        struct xen_feature_info fi;
+
+        /* No need for 32/64 compat handling */
+        qemu_build_assert(sizeof(fi) == 8);
+
+        err = kvm_copy_from_gva(CPU(cpu), arg, &fi, sizeof(fi));
+        if (err) {
+            break;
+        }
+
+        fi.submap = 0;
+        if (fi.submap_idx == 0) {
+            fi.submap |= 1 << XENFEAT_writable_page_tables |
+                         1 << XENFEAT_writable_descriptor_tables |
+                         1 << XENFEAT_auto_translated_physmap |
+                         1 << XENFEAT_supervisor_mode_kernel;
+        }
+
+        err = kvm_copy_to_gva(CPU(cpu), arg, &fi, sizeof(fi));
+        break;
+    }
+
+    default:
+        return false;
+    }
+
+    exit->u.hcall.result = err;
+    return true;
+}
+
 static bool do_kvm_xen_handle_exit(X86CPU *cpu, struct kvm_xen_exit *exit)
 {
     uint16_t code = exit->u.hcall.input;
@@ -97,6 +180,9 @@ static bool do_kvm_xen_handle_exit(X86CPU *cpu, struct kvm_xen_exit *exit)
     }
 
     switch (code) {
+    case __HYPERVISOR_xen_version:
+        return kvm_xen_hcall_xen_version(exit, cpu, exit->u.hcall.params[0],
+                                         exit->u.hcall.params[1]);
     default:
         return false;
     }
-- 
2.39.0
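The reason kvm_gva_rw() loops is that a guest virtual range may cross page boundaries, and each page can translate to an unrelated physical address, so each iteration handles at most the bytes up to the end of the current page. The chunking arithmetic on its own looks roughly like the sketch below. MODEL_PAGE_SIZE, chunk_len() and count_chunks() are hypothetical names for illustration, and the 4096-byte page size is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_PAGE_SIZE 4096u /* assumed page size for this sketch */

/*
 * Bytes that can be copied starting at gva without crossing into the
 * next page, capped at the number of bytes still wanted.
 */
static size_t chunk_len(uint64_t gva, size_t remaining)
{
    size_t len = MODEL_PAGE_SIZE - (size_t)(gva & (MODEL_PAGE_SIZE - 1));

    return len > remaining ? remaining : len;
}

/* Number of per-page translations a copy of sz bytes at gva would need. */
static unsigned int count_chunks(uint64_t gva, size_t sz)
{
    unsigned int n = 0;

    while (sz) {
        size_t len = chunk_len(gva, sz);

        assert(len > 0 && len <= sz);
        gva += len;
        sz -= len;
        n++;
    }
    return n;
}
```

For example, an 8-byte structure starting 4 bytes before a page boundary is split into two 4-byte chunks, each needing its own translation.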
[PATCH v11 19/59] i386/xen: implement HYPERVISOR_hvm_op
From: Joao Martins

This handles the case where the guest queries for support of
HVMOP_pagetable_dying.

Signed-off-by: Joao Martins
Signed-off-by: David Woodhouse
Reviewed-by: Paul Durrant
---
 target/i386/kvm/xen-emu.c | 17 +
 1 file changed, 17 insertions(+)

diff --git a/target/i386/kvm/xen-emu.c b/target/i386/kvm/xen-emu.c
index 2b235e7b27..4002b1b797 100644
--- a/target/i386/kvm/xen-emu.c
+++ b/target/i386/kvm/xen-emu.c
@@ -26,6 +26,7 @@
 #include "hw/xen/interface/version.h"
 #include "hw/xen/interface/sched.h"
 #include "hw/xen/interface/memory.h"
+#include "hw/xen/interface/hvm/hvm_op.h"
 
 #include "xen-compat.h"
 
@@ -349,6 +350,19 @@ static bool kvm_xen_hcall_memory_op(struct kvm_xen_exit *exit, X86CPU *cpu,
     return true;
 }
 
+static bool kvm_xen_hcall_hvm_op(struct kvm_xen_exit *exit, X86CPU *cpu,
+                                 int cmd, uint64_t arg)
+{
+    switch (cmd) {
+    case HVMOP_pagetable_dying:
+        exit->u.hcall.result = -ENOSYS;
+        return true;
+
+    default:
+        return false;
+    }
+}
+
 int kvm_xen_soft_reset(void)
 {
     int err;
@@ -450,6 +464,9 @@ static bool do_kvm_xen_handle_exit(X86CPU *cpu, struct kvm_xen_exit *exit)
     case __HYPERVISOR_sched_op:
         return kvm_xen_hcall_sched_op(exit, cpu, exit->u.hcall.params[0],
                                       exit->u.hcall.params[1]);
+    case __HYPERVISOR_hvm_op:
+        return kvm_xen_hcall_hvm_op(exit, cpu, exit->u.hcall.params[0],
+                                    exit->u.hcall.params[1]);
     case __HYPERVISOR_memory_op:
         return kvm_xen_hcall_memory_op(exit, cpu, exit->u.hcall.params[0],
                                        exit->u.hcall.params[1]);
-- 
2.39.0
[PATCH v11 22/59] i386/xen: handle VCPUOP_register_vcpu_time_info
From: Joao Martins In order to support Linux vdso in Xen. Signed-off-by: Joao Martins Signed-off-by: David Woodhouse Reviewed-by: Paul Durrant --- target/i386/cpu.h | 1 + target/i386/kvm/xen-emu.c | 100 +- target/i386/machine.c | 1 + 3 files changed, 90 insertions(+), 12 deletions(-) diff --git a/target/i386/cpu.h b/target/i386/cpu.h index 109b2e5669..96c2d0d5cb 100644 --- a/target/i386/cpu.h +++ b/target/i386/cpu.h @@ -1790,6 +1790,7 @@ typedef struct CPUArchState { struct kvm_nested_state *nested_state; uint64_t xen_vcpu_info_gpa; uint64_t xen_vcpu_info_default_gpa; +uint64_t xen_vcpu_time_info_gpa; #endif #if defined(CONFIG_HVF) HVFX86LazyFlags hvf_lflags; diff --git a/target/i386/kvm/xen-emu.c b/target/i386/kvm/xen-emu.c index 1cec8566ec..0b3bd0b889 100644 --- a/target/i386/kvm/xen-emu.c +++ b/target/i386/kvm/xen-emu.c @@ -37,28 +37,41 @@ #define hypercall_compat32(longmode) (false) #endif -static int kvm_gva_rw(CPUState *cs, uint64_t gva, void *_buf, size_t sz, - bool is_write) +static bool kvm_gva_to_gpa(CPUState *cs, uint64_t gva, uint64_t *gpa, + size_t *len, bool is_write) { -uint8_t *buf = (uint8_t *)_buf; -int ret; - -while (sz) { struct kvm_translation tr = { .linear_address = gva, }; -size_t len = TARGET_PAGE_SIZE - (tr.linear_address & ~TARGET_PAGE_MASK); -if (len > sz) { -len = sz; +if (len) { +*len = TARGET_PAGE_SIZE - (gva & ~TARGET_PAGE_MASK); +} + +if (kvm_vcpu_ioctl(cs, KVM_TRANSLATE, &tr) || !tr.valid || +(is_write && !tr.writeable)) { +return false; } +*gpa = tr.physical_address; +return true; +} + +static int kvm_gva_rw(CPUState *cs, uint64_t gva, void *_buf, size_t sz, + bool is_write) +{ +uint8_t *buf = (uint8_t *)_buf; +uint64_t gpa; +size_t len; -ret = kvm_vcpu_ioctl(cs, KVM_TRANSLATE, &tr); -if (ret || !tr.valid || (is_write && !tr.writeable)) { +while (sz) { +if (!kvm_gva_to_gpa(cs, gva, &gpa, &len, is_write)) { return -EFAULT; } +if (len > sz) { +len = sz; +} -cpu_physical_memory_rw(tr.physical_address, buf, len, is_write); 
+cpu_physical_memory_rw(gpa, buf, len, is_write); buf += len; sz -= len; @@ -146,6 +159,7 @@ int kvm_xen_init_vcpu(CPUState *cs) env->xen_vcpu_info_gpa = INVALID_GPA; env->xen_vcpu_info_default_gpa = INVALID_GPA; +env->xen_vcpu_time_info_gpa = INVALID_GPA; return 0; } @@ -229,6 +243,17 @@ static void do_set_vcpu_info_gpa(CPUState *cs, run_on_cpu_data data) env->xen_vcpu_info_gpa); } +static void do_set_vcpu_time_info_gpa(CPUState *cs, run_on_cpu_data data) +{ +X86CPU *cpu = X86_CPU(cs); +CPUX86State *env = &cpu->env; + +env->xen_vcpu_time_info_gpa = data.host_ulong; + +kvm_xen_set_vcpu_attr(cs, KVM_XEN_VCPU_ATTR_TYPE_VCPU_TIME_INFO, + env->xen_vcpu_time_info_gpa); +} + static void do_vcpu_soft_reset(CPUState *cs, run_on_cpu_data data) { X86CPU *cpu = X86_CPU(cs); @@ -236,8 +261,11 @@ static void do_vcpu_soft_reset(CPUState *cs, run_on_cpu_data data) env->xen_vcpu_info_gpa = INVALID_GPA; env->xen_vcpu_info_default_gpa = INVALID_GPA; +env->xen_vcpu_time_info_gpa = INVALID_GPA; kvm_xen_set_vcpu_attr(cs, KVM_XEN_VCPU_ATTR_TYPE_VCPU_INFO, INVALID_GPA); +kvm_xen_set_vcpu_attr(cs, KVM_XEN_VCPU_ATTR_TYPE_VCPU_TIME_INFO, + INVALID_GPA); } static int xen_set_shared_info(uint64_t gfn) @@ -453,6 +481,42 @@ static int vcpuop_register_vcpu_info(CPUState *cs, CPUState *target, return 0; } +static int vcpuop_register_vcpu_time_info(CPUState *cs, CPUState *target, + uint64_t arg) +{ +struct vcpu_register_time_memory_area tma; +uint64_t gpa; +size_t len; + +/* No need for 32/64 compat handling */ +qemu_build_assert(sizeof(tma) == 8); +qemu_build_assert(sizeof(struct vcpu_time_info) == 32); + +if (!target) { +return -ENOENT; +} + +if (kvm_copy_from_gva(cs, arg, &tma, sizeof(tma))) { +return -EFAULT; +} + +/* + * Xen actually uses the GVA and does the translation through the guest + * page tables each time. But Linux/KVM uses the GPA, on the assumption + * that guests only ever use *global* addresses (kernel virtual addresses) + * for it. 
If Linux is changed to redo the GVA→GPA translation each time, + * it will offer a new vCPU attribute for that, and we'll use it instead. + */ +if (!kvm_gva_to_gpa(cs, tma.addr.p, &gpa, &len, false) || +len < sizeof(struct vcpu_time_info)) { +return -EFAULT; +} + +async_run_on_cpu(tar
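Note on the length check in vcpuop_register_vcpu_time_info(): kvm_gva_to_gpa() reports how many bytes remain in the page that contains the GVA, and the caller rejects the registration when that remainder is smaller than struct vcpu_time_info. A minimal standalone sketch of that arithmetic follows. It is not QEMU code: the 4 KiB page size and the names SK_PAGE_SIZE, SK_PAGE_MASK, bytes_left_in_page() and time_info_fits() are assumptions made for illustration; only the 32-byte structure size comes from the patch's qemu_build_assert().

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for TARGET_PAGE_SIZE / TARGET_PAGE_MASK; 4 KiB is an assumption. */
#define SK_PAGE_SIZE 4096u
#define SK_PAGE_MASK (~(uint64_t)(SK_PAGE_SIZE - 1))

/* Size the patch asserts for struct vcpu_time_info. */
#define SK_VCPU_TIME_INFO_SIZE 32u

/* Bytes from gva to the end of its page, i.e. what *len receives. */
static size_t bytes_left_in_page(uint64_t gva)
{
    return SK_PAGE_SIZE - (size_t)(gva & ~SK_PAGE_MASK);
}

/* Hypothetical helper: true when the whole structure sits inside one page. */
static bool time_info_fits(uint64_t gva)
{
    return bytes_left_in_page(gva) >= SK_VCPU_TIME_INFO_SIZE;
}
```

With these assumptions, an address 32 bytes before a page boundary is accepted and one byte later is rejected, which is why a structure straddling two pages yields -EFAULT in the patch.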
[PATCH v11 41/59] hw/xen: Support HVM_PARAM_CALLBACK_TYPE_PCI_INTX callback
From: David Woodhouse The guest is permitted to specify an arbitrary domain/bus/device/function and INTX pin from which the callback IRQ shall appear to have come. In QEMU we can only easily do this for devices that actually exist, and even that requires us "knowing" that it's a PCMachine in order to find the PCI root bus — although that's OK really because it's always true. We also don't get to get notified of INTX routing changes, because we can't do that as a passive observer; if we try to register a notifier it will overwrite any existing notifier callback on the device. But in practice, guests using PCI_INTX will only ever use pin A on the Xen platform device, and won't swizzle the INTX routing after they set it up. So this is just fine. Signed-off-by: David Woodhouse Reviewed-by: Paul Durrant --- hw/i386/kvm/xen_evtchn.c | 80 --- target/i386/kvm/xen-emu.c | 34 + 2 files changed, 100 insertions(+), 14 deletions(-) diff --git a/hw/i386/kvm/xen_evtchn.c b/hw/i386/kvm/xen_evtchn.c index ecc93da172..5d5996641d 100644 --- a/hw/i386/kvm/xen_evtchn.c +++ b/hw/i386/kvm/xen_evtchn.c @@ -28,6 +28,8 @@ #include "hw/sysbus.h" #include "hw/xen/xen.h" #include "hw/i386/x86.h" +#include "hw/i386/pc.h" +#include "hw/pci/pci.h" #include "hw/irq.h" #include "xen_evtchn.h" @@ -101,6 +103,7 @@ struct XenEvtchnState { uint64_t callback_param; bool evtchn_in_kernel; +uint32_t callback_gsi; QEMUBH *gsi_bh; @@ -217,11 +220,41 @@ static void xen_evtchn_register_types(void) type_init(xen_evtchn_register_types) +static int set_callback_pci_intx(XenEvtchnState *s, uint64_t param) +{ +PCMachineState *pcms = PC_MACHINE(qdev_get_machine()); +uint8_t pin = param & 3; +uint8_t devfn = (param >> 8) & 0xff; +uint16_t bus = (param >> 16) & 0x; +uint16_t domain = (param >> 32) & 0x; +PCIDevice *pdev; +PCIINTxRoute r; + +if (domain || !pcms) { +return 0; +} + +pdev = pci_find_device(pcms->bus, bus, devfn); +if (!pdev) { +return 0; +} + +r = pci_device_route_intx_to_irq(pdev, pin); +if (r.mode != 
PCI_INTX_ENABLED) { +return 0; +} + +/* + * Hm, can we be notified of INTX routing changes? Not without + * *owning* the device and being allowed to overwrite its own + * ->intx_routing_notifier, AFAICT. So let's not. + */ +return r.irq; +} + void xen_evtchn_set_callback_level(int level) { XenEvtchnState *s = xen_evtchn_singleton; -uint32_t param; - if (!s) { return; } @@ -260,18 +293,12 @@ void xen_evtchn_set_callback_level(int level) return; } -param = (uint32_t)s->callback_param; - -switch (s->callback_param >> CALLBACK_VIA_TYPE_SHIFT) { -case HVM_PARAM_CALLBACK_TYPE_GSI: -if (param < GSI_NUM_PINS) { -qemu_set_irq(s->gsis[param], level); -if (level) { -/* Ensure the vCPU polls for deassertion */ -kvm_xen_set_callback_asserted(); -} +if (s->callback_gsi && s->callback_gsi < GSI_NUM_PINS) { +qemu_set_irq(s->gsis[s->callback_gsi], level); +if (level) { +/* Ensure the vCPU polls for deassertion */ +kvm_xen_set_callback_asserted(); } -break; } } @@ -283,15 +310,22 @@ int xen_evtchn_set_callback_param(uint64_t param) .u.vector = 0, }; bool in_kernel = false; +uint32_t gsi = 0; +int type = param >> CALLBACK_VIA_TYPE_SHIFT; int ret; if (!s) { return -ENOTSUP; } +/* + * We need the BQL because set_callback_pci_intx() may call into PCI code, + * and because we may need to manipulate the old and new GSI levels. + */ +assert(qemu_mutex_iothread_locked()); qemu_mutex_lock(&s->port_lock); -switch (param >> CALLBACK_VIA_TYPE_SHIFT) { +switch (type) { case HVM_PARAM_CALLBACK_TYPE_VECTOR: { xa.u.vector = (uint8_t)param, @@ -299,10 +333,17 @@ int xen_evtchn_set_callback_param(uint64_t param) if (!ret && kvm_xen_has_cap(EVTCHN_SEND)) { in_kernel = true; } +gsi = 0; break; } +case HVM_PARAM_CALLBACK_TYPE_PCI_INTX: +gsi = set_callback_pci_intx(s, param); +ret = gsi ? 
0 : -EINVAL; +break; + case HVM_PARAM_CALLBACK_TYPE_GSI: +gsi = (uint32_t)param; ret = 0; break; @@ -320,6 +361,17 @@ int xen_evtchn_set_callback_param(uint64_t param) } s->callback_param = param; s->evtchn_in_kernel = in_kernel; + +if (gsi != s->callback_gsi) { +struct vcpu_info *vi = kvm_xen_get_vcpu_info_hva(0); + +xen_evtchn_set_callback_level(0); +s->callback_gsi = gsi; + +if (gsi && vi && vi->evtchn_upcall_pending) { +kvm_xen_inject_vcpu_callback_vector(0, type); +
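Note on the parameter layout used by set_callback_pci_intx(): the callback parameter packs the INTX pin, devfn, bus and domain into one 64-bit value. The sketch below decodes it the same way outside QEMU. The struct and function names are hypothetical, and the 16-bit masks for bus and domain are an assumption inferred from the uint16_t types in the patch, because the archived text shows those masks only in abbreviated form.

```c
#include <stdint.h>

/* Hypothetical container for the decoded fields. */
struct intx_callback {
    uint16_t domain;
    uint16_t bus;
    uint8_t devfn;
    uint8_t pin;
};

/* Split a PCI_INTX callback parameter into its fields. */
static struct intx_callback decode_pci_intx(uint64_t param)
{
    struct intx_callback cb;

    cb.pin = param & 3;                  /* INTA..INTD */
    cb.devfn = (param >> 8) & 0xff;      /* device and function */
    cb.bus = (param >> 16) & 0xffff;     /* mask width is an assumption */
    cb.domain = (param >> 32) & 0xffff;  /* mask width is an assumption */
    return cb;
}
```

In the patch, a non-zero domain makes the lookup fail, so only the bus, devfn and pin fields matter for a successful registration.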
[PATCH v11 11/59] i386/xen: implement HYPERVISOR_sched_op, SCHEDOP_shutdown
From: Joao Martins

Allow the guest to shut itself down via hypercall with any of the
3 reasons:
  1) self-reboot
  2) shutdown
  3) crash

Implementing the SCHEDOP_shutdown sub-op lets us handle crashes
gracefully, rather than leading to triple faults if it remains
unimplemented.

In addition, the SHUTDOWN_soft_reset reason is used for kexec, to reset
Xen shared pages and other enlightenments and leave a clean slate for
the new kernel without the hypervisor helpfully writing information at
unexpected addresses.

Signed-off-by: Joao Martins
[dwmw2: Ditch sched_op_compat which was never available for HVM guests,
        Add SCHEDOP_soft_reset]
Signed-off-by: David Woodhouse
Reviewed-by: Paul Durrant
---
 include/sysemu/kvm_xen.h     | 1 +
 target/i386/kvm/trace-events | 1 +
 target/i386/kvm/xen-emu.c    | 75
 3 files changed, 77 insertions(+)

diff --git a/include/sysemu/kvm_xen.h b/include/sysemu/kvm_xen.h
index 296533f2d5..5dffcc0542 100644
--- a/include/sysemu/kvm_xen.h
+++ b/include/sysemu/kvm_xen.h
@@ -12,6 +12,7 @@
 #ifndef QEMU_SYSEMU_KVM_XEN_H
 #define QEMU_SYSEMU_KVM_XEN_H
 
+int kvm_xen_soft_reset(void);
 uint32_t kvm_xen_get_caps(void);
 
 #define kvm_xen_has_cap(cap) (!!(kvm_xen_get_caps() & \
diff --git a/target/i386/kvm/trace-events b/target/i386/kvm/trace-events
index cd6f842b1f..bb732e1da8 100644
--- a/target/i386/kvm/trace-events
+++ b/target/i386/kvm/trace-events
@@ -8,3 +8,4 @@ kvm_x86_update_msi_routes(int num) "Updated %d MSI routes"
 
 # xen-emu.c
 kvm_xen_hypercall(int cpu, uint8_t cpl, uint64_t input, uint64_t a0, uint64_t a1, uint64_t a2, uint64_t ret) "xen_hypercall: cpu %d cpl %d input %" PRIu64 " a0 0x%" PRIx64 " a1 0x%" PRIx64 " a2 0x%" PRIx64" ret 0x%" PRIx64
+kvm_xen_soft_reset(void) ""
diff --git a/target/i386/kvm/xen-emu.c b/target/i386/kvm/xen-emu.c
index 56b80a7880..4ed833656f 100644
--- a/target/i386/kvm/xen-emu.c
+++ b/target/i386/kvm/xen-emu.c
@@ -11,14 +11,17 @@
 
 #include "qemu/osdep.h"
 #include "qemu/log.h"
+#include "qemu/main-loop.h"
 #include "sysemu/kvm_int.h"
 #include "sysemu/kvm_xen.h"
 #include "kvm/kvm_i386.h"
 #include "exec/address-spaces.h"
 #include "xen-emu.h"
 #include "trace.h"
+#include "sysemu/runstate.h"
 
 #include "hw/xen/interface/version.h"
+#include "hw/xen/interface/sched.h"
 
 static int kvm_gva_rw(CPUState *cs, uint64_t gva, void *_buf, size_t sz,
                       bool is_write)
@@ -170,6 +173,75 @@ static bool kvm_xen_hcall_xen_version(struct kvm_xen_exit *exit, X86CPU *cpu,
     return true;
 }
 
+int kvm_xen_soft_reset(void)
+{
+    assert(qemu_mutex_iothread_locked());
+
+    trace_kvm_xen_soft_reset();
+
+    /* Nothing to reset... yet. */
+    return 0;
+}
+
+static int schedop_shutdown(CPUState *cs, uint64_t arg)
+{
+    struct sched_shutdown shutdown;
+    int ret = 0;
+
+    /* No need for 32/64 compat handling */
+    qemu_build_assert(sizeof(shutdown) == 4);
+
+    if (kvm_copy_from_gva(cs, arg, &shutdown, sizeof(shutdown))) {
+        return -EFAULT;
+    }
+
+    switch (shutdown.reason) {
+    case SHUTDOWN_crash:
+        cpu_dump_state(cs, stderr, CPU_DUMP_CODE);
+        qemu_system_guest_panicked(NULL);
+        break;
+
+    case SHUTDOWN_reboot:
+        qemu_system_reset_request(SHUTDOWN_CAUSE_GUEST_RESET);
+        break;
+
+    case SHUTDOWN_poweroff:
+        qemu_system_shutdown_request(SHUTDOWN_CAUSE_GUEST_SHUTDOWN);
+        break;
+
+    case SHUTDOWN_soft_reset:
+        qemu_mutex_lock_iothread();
+        ret = kvm_xen_soft_reset();
+        qemu_mutex_unlock_iothread();
+        break;
+
+    default:
+        ret = -EINVAL;
+        break;
+    }
+
+    return ret;
+}
+
+static bool kvm_xen_hcall_sched_op(struct kvm_xen_exit *exit, X86CPU *cpu,
+                                   int cmd, uint64_t arg)
+{
+    CPUState *cs = CPU(cpu);
+    int err = -ENOSYS;
+
+    switch (cmd) {
+    case SCHEDOP_shutdown:
+        err = schedop_shutdown(cs, arg);
+        break;
+
+    default:
+        return false;
+    }
+
+    exit->u.hcall.result = err;
+    return true;
+}
+
 static bool do_kvm_xen_handle_exit(X86CPU *cpu, struct kvm_xen_exit *exit)
 {
     uint16_t code = exit->u.hcall.input;
@@ -180,6 +252,9 @@ static bool do_kvm_xen_handle_exit(X86CPU *cpu, struct kvm_xen_exit *exit)
     }
 
     switch (code) {
+    case __HYPERVISOR_sched_op:
+        return kvm_xen_hcall_sched_op(exit, cpu, exit->u.hcall.params[0],
+                                      exit->u.hcall.params[1]);
     case __HYPERVISOR_xen_version:
         return kvm_xen_hcall_xen_version(exit, cpu, exit->u.hcall.params[0],
                                          exit->u.hcall.params[1]);
-- 
2.39.0
[PATCH v11 46/59] hw/xen: Implement GNTTABOP_query_size
From: David Woodhouse

Signed-off-by: David Woodhouse
Reviewed-by: Paul Durrant
---
 hw/i386/kvm/xen_gnttab.c  | 19 +++
 hw/i386/kvm/xen_gnttab.h  | 2 ++
 target/i386/kvm/xen-emu.c | 16 +++-
 3 files changed, 36 insertions(+), 1 deletion(-)

diff --git a/hw/i386/kvm/xen_gnttab.c b/hw/i386/kvm/xen_gnttab.c
index b54a94e2bd..1e691ded32 100644
--- a/hw/i386/kvm/xen_gnttab.c
+++ b/hw/i386/kvm/xen_gnttab.c
@@ -211,3 +211,22 @@ int xen_gnttab_get_version_op(struct gnttab_get_version *get)
     get->version = 1;
     return 0;
 }
+
+int xen_gnttab_query_size_op(struct gnttab_query_size *size)
+{
+    XenGnttabState *s = xen_gnttab_singleton;
+
+    if (!s) {
+        return -ENOTSUP;
+    }
+
+    if (size->dom != DOMID_SELF && size->dom != xen_domid) {
+        size->status = GNTST_bad_domain;
+        return 0;
+    }
+
+    size->status = GNTST_okay;
+    size->nr_frames = s->nr_frames;
+    size->max_nr_frames = s->max_frames;
+    return 0;
+}
diff --git a/hw/i386/kvm/xen_gnttab.h b/hw/i386/kvm/xen_gnttab.h
index 79579677ba..3bdbe96191 100644
--- a/hw/i386/kvm/xen_gnttab.h
+++ b/hw/i386/kvm/xen_gnttab.h
@@ -17,7 +17,9 @@ int xen_gnttab_map_page(uint64_t idx, uint64_t gfn);
 
 struct gnttab_set_version;
 struct gnttab_get_version;
+struct gnttab_query_size;
 int xen_gnttab_set_version_op(struct gnttab_set_version *set);
 int xen_gnttab_get_version_op(struct gnttab_get_version *get);
+int xen_gnttab_query_size_op(struct gnttab_query_size *size);
 
 #endif /* QEMU_XEN_GNTTAB_H */
diff --git a/target/i386/kvm/xen-emu.c b/target/i386/kvm/xen-emu.c
index e35b2d5557..44fa0de784 100644
--- a/target/i386/kvm/xen-emu.c
+++ b/target/i386/kvm/xen-emu.c
@@ -1204,7 +1204,21 @@ static bool kvm_xen_hcall_gnttab_op(struct kvm_xen_exit *exit, X86CPU *cpu,
         }
         break;
     }
-    case GNTTABOP_query_size:
+    case GNTTABOP_query_size: {
+        struct gnttab_query_size size;
+
+        qemu_build_assert(sizeof(size) == 16);
+        if (kvm_copy_from_gva(cs, arg, &size, sizeof(size))) {
+            err = -EFAULT;
+            break;
+        }
+
+        err = xen_gnttab_query_size_op(&size);
+        if (!err && kvm_copy_to_gva(cs, arg, &size, sizeof(size))) {
+            err = -EFAULT;
+        }
+        break;
+    }
     case GNTTABOP_setup_table:
     case GNTTABOP_copy:
     case GNTTABOP_map_grant_ref:
-- 
2.39.0
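Note on the two error channels in xen_gnttab_query_size_op(): a request for a foreign domain is reported through the status field while the hypercall itself still returns 0, so the structure is copied back to the guest. The sketch below models that behaviour outside QEMU. The struct layout, the sk_ and SK_ names, and the numeric values of the DOMID_SELF and GNTST_* stand-ins are assumptions modelled on Xen's public headers, not taken from the patch.

```c
#include <stdint.h>

/* Stand-ins for Xen constants; the numeric values are assumptions. */
#define SK_DOMID_SELF        0x7FF0u
#define SK_GNTST_okay        0
#define SK_GNTST_bad_domain  (-2)

/* Hypothetical mirror of struct gnttab_query_size (16 bytes with padding). */
struct sk_query_size {
    uint16_t dom;
    uint32_t nr_frames;
    uint32_t max_nr_frames;
    int16_t status;
};

/*
 * Fill in the reply. The return value is the hypercall result; a bad
 * domain is reported only via q->status, as in the patch.
 */
static int sk_query_size(struct sk_query_size *q, uint16_t own_domid,
                         uint32_t nr_frames, uint32_t max_frames)
{
    if (q->dom != SK_DOMID_SELF && q->dom != own_domid) {
        q->status = SK_GNTST_bad_domain;
        return 0;
    }

    q->status = SK_GNTST_okay;
    q->nr_frames = nr_frames;
    q->max_nr_frames = max_frames;
    return 0;
}
```

Keeping the hypercall result at 0 for a bad domain is what lets the caller in xen-emu.c run kvm_copy_to_gva() and hand the status back to the guest.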
Re: [PATCH 00/27] tcg: Simplify temporary usage
On 2/10/23 02:35, Emilio Cota wrote:
> I ran yesterday linux-user SPEC06 benchmarks from your tcg-life branch.
> I do see perf regressions for two workloads (sjeng and xalancbmk).
> With perf(1) I see liveness_pass* are at 0.00%, so I wonder: is it
> possible that the emitted code isn't quite the same?

Everything that I checked by hand was the same, but it's possible.
It's a tedious process.  You'd definitely want to turn off ASLR.

My current branch has __attribute__((noinline)) added to all of the
liveness passes, so that they don't get folded into tcg_gen_code.
But I still would expect 0%.

r~
Re: [PATCH 00/27] tcg: Simplify temporary usage
Ping for the 9 patches lacking review.

r~

On 1/30/23 10:59, Richard Henderson wrote:
> Based-on: 20230126043824.54819-1-richard.hender...@linaro.org
> ("[PATCH v5 00/36] tcg: Support for Int128 with helpers")
>
> The biggest pitfall for new users of TCG is the fact that "normal"
> temporaries die at branches, and we must therefore use a different
> "local" temporary in that case.
>
> The following patch set changes that, so that the "normal" temporary
> is the one that lives across branches, and there is a special temporary
> that dies at the end of the extended basic block, and this special
> case is reserved for tcg internals.
>
> TEMP_LOCAL is renamed TEMP_TB, which I believe to be more explicit and
> less confusing.  TEMP_NORMAL is removed entirely.
>
> I thought about putting in a proper full-power liveness analysis pass.
> This would have eliminated the differences between all non-global
> temporaries, and would have noticed when TEMP_LOCAL finally dies within
> a translation and avoided any final writeback.  But I came to the
> conclusion that it was too expensive in runtime, and so retaining some
> distinction in the types was required.
>
> In addition, I found that the usage of temps within plugin-gen.c
> (9 per guest memory operation) meant that we *must* have some form of
> temp that can be re-used.  (There is one x86 instruction which generates
> 62 memory operations; 62 * 9 == 558, which is larger than our current
> TCG_MAX_TEMPS.)
>
> However I did add a new liveness pass which, with a single pass over
> the opcode stream, can see that a TEMP_LOCAL is only live within a
> single extended basic block, and thus may be transformed to TEMP_EBB.
>
> With this, and by not recycling TEMP_LOCAL, we can get identical code
> out of the backend even when the front end translators are adjusted
> to use TEMP_LOCAL for everything.
>
> Benchmarking one test case, qemu-arm linux-test, the new liveness pass
> comes in at about 1.6% on perf, but I can't see any difference in
> wall clock time before and after the patch set.
> r~
>
> Richard Henderson (27):
>   tcg: Adjust TCGContext.temps_in_use check
>   accel/tcg: Pass max_insn to gen_intermediate_code by pointer
>   accel/tcg: Use more accurate max_insns for tb_overflow
>   tcg: Remove branch-to-next regardless of reference count
>   tcg: Rename TEMP_LOCAL to TEMP_TB
>   tcg: Add liveness_pass_0
>   tcg: Remove TEMP_NORMAL
>   tcg: Pass TCGTempKind to tcg_temp_new_internal
>   tcg: Add tcg_temp_ebb_new_{i32,i64,ptr}
>   tcg: Add tcg_gen_movi_ptr
>   tcg: Use tcg_temp_ebb_new_* in tcg/
>   accel/tcg/plugin: Use tcg_temp_ebb_*
>   accel/tcg/plugin: Tidy plugin_gen_disable_mem_helpers
>   tcg: Don't re-use TEMP_TB temporaries
>   tcg: Change default temp lifetime to TEMP_TB
>   target/arm: Drop copies in gen_sve_{ldr,str}
>   target/arm: Don't use tcg_temp_local_new_*
>   target/cris: Don't use tcg_temp_local_new
>   target/hexagon: Don't use tcg_temp_local_new_*
>   target/hppa: Don't use tcg_temp_local_new
>   target/i386: Don't use tcg_temp_local_new
>   target/mips: Don't use tcg_temp_local_new
>   target/ppc: Don't use tcg_temp_local_new
>   target/xtensa: Don't use tcg_temp_local_new_*
>   exec/gen-icount: Don't use tcg_temp_local_new_i32
>   tcg: Remove tcg_temp_local_new_*, tcg_const_local_*
>   tcg: Update docs/devel/tcg-ops.rst for temporary changes
>
>  docs/devel/tcg-ops.rst                      | 103
>  target/hexagon/idef-parser/README.rst       | 4 +-
>  include/exec/gen-icount.h                   | 8 +-
>  include/exec/translator.h                   | 4 +-
>  include/tcg/tcg-op.h                        | 7 +-
>  include/tcg/tcg.h                           | 64 ++---
>  target/arm/translate-a64.h                  | 1 -
>  target/hexagon/gen_tcg.h                    | 4 +-
>  accel/tcg/plugin-gen.c                      | 33 +--
>  accel/tcg/translate-all.c                   | 2 +-
>  accel/tcg/translator.c                      | 6 +-
>  target/alpha/translate.c                    | 2 +-
>  target/arm/translate-a64.c                  | 6 -
>  target/arm/translate-sve.c                  | 38 +--
>  target/arm/translate.c                      | 8 +-
>  target/avr/translate.c                      | 2 +-
>  target/cris/translate.c                     | 8 +-
>  target/hexagon/genptr.c                     | 16 +-
>  target/hexagon/idef-parser/parser-helpers.c | 4 +-
>  target/hexagon/translate.c                  | 4 +-
>  target/hppa/translate.c                     | 5 +-
>  target/i386/tcg/translate.c                 | 29 +--
>  target/loongarch/translate.c                | 2 +-
>  target/m68k/translate.c                     | 2 +-
>  target/microblaze/translate.c               | 2 +-
>  target/mips/tcg/translate.c                 | 59 ++---
>  target/nios2/translate.c                    | 2 +-
>  target/openrisc/translate.c                 | 2 +-
>  target/ppc/translate.c                      | 8 +-
>  target/riscv/translate.c                    | 2 +-
>  target/rx/translate.c                       | 2 +-
Re: [PATCH v4 0/4] Fix deadlock when dying because of a signal
On 2/14/23 04:08, Ilya Leoshkevich wrote:
> Based-on:<20230202005204.2055899-1-richard.hender...@linaro.org>
> ("[PATCH 00/14] linux-user/sparc: Handle missing traps")
>
> v3:https://lists.gnu.org/archive/html/qemu-devel/2023-02/msg03534.html
> v3 -> v4: Add printfs to the test in order to make the uncaught signals
>           less scary:
>
>   $ build/x86_64-linux-user/qemu-x86_64 build/tests/tcg/x86_64-linux-user/linux-fork-trap
>   about to trigger fault...
>   qemu: uncaught target signal 4 (Illegal instruction) - core dumped
>   faulting thread exited cleanly

Queued to tcg-next, thanks.

r~
Re: [PATCH v2 01/15] linux-user/sparc: Raise SIGILL for all unhandled software traps
On 2/15/23 19:45, Richard Henderson wrote:
> The linux kernel's trap tables vector all unassigned trap
> numbers to BAD_TRAP, which then raises SIGILL.
>
> Tested-by: Ilya Leoshkevich
> Reported-by: Ilya Leoshkevich
> Signed-off-by: Richard Henderson
> ---
>  linux-user/sparc/cpu_loop.c | 8
>  1 file changed, 8 insertions(+)

I'll queue this to tcg-next, along with Ilya's other start_exclusive patches.

r~
Re: [PATCH v1 RFC Zisslpcfi 7/9] target/riscv: Tracking indirect branches (fcfi) using TCG
On 2/8/23 20:24, Deepak Gupta wrote:
> +    if (cpu->cfg.ext_cfi) {
> +        /*
> +         * For Forward CFI, only the expectation of a lpcll at
> +         * the start of the block is tracked (which can only happen
> +         * when FCFI is enabled for the current processor mode). A jump
> +         * or call at the end of the previous TB will have updated
> +         * env->elp to indicate the expectation.
> +         */
> +        flags = FIELD_DP32(flags, TB_FLAGS, FCFI_LP_EXPECTED,
> +                           env->elp != NO_LP_EXPECTED);

You should also check cpu_fcfien here.  We can completely ignore elp if
the feature is disabled.  Which means that the tb flag will be set if and
only if we require a landing pad.

>  static void riscv_tr_tb_start(DisasContextBase *db, CPUState *cpu)
>  {
> +    DisasContext *ctx = container_of(db, DisasContext, base);
> +
> +    if (ctx->fcfi_lp_expected) {
> +        /*
> +         * Since we can't look ahead to confirm that the first
> +         * instruction is a legal landing pad instruction, emit
> +         * compare-and-branch sequence that will be fixed-up in
> +         * riscv_tr_tb_stop() to either statically hit or skip an
> +         * illegal instruction exception depending on whether the
> +         * flag was lowered by translation of a CJLP or JLP as
> +         * the first instruction in the block.

You can "look ahead" by deferring this to riscv_tr_translate_insn.
Compare target/arm/translate-a64.c, btype_destination_ok and uses thereof.
Note that risc-v does not have the same "guarded page" bit that aa64 does.

r~
RE: [PATCH] Adding ability to change disassembler syntax in TCG plugins
> On 2/15/23 19:04, Mikhail Tyutin wrote:
> >> On 2/15/23 18:17, Mikhail Tyutin wrote:
> >>> ping
> >>>
> >>> patchew link:
> >>> https://patchew.org/QEMU/7d17f0cbb5ed4c90bbadd39924290...@yadro.com/
> >>>
> >>> 10.02.2023 18:24, Mikhail Tyutin wrote:
> >>>> This patch adds new function qemu_plugin_insn_disas_with_syntax() that
> >>>> allows TCG plugins to get disassembler string with non-default syntax
> >>>> if it wants to.
> >>>>
> >>>> Signed-off-by: Mikhail Tyutin
> >>
> >> Why?
> >>
> >> It's certainly not very generic, exposing a disassembly quirk for exactly
> >> one guest architecture.  I mean, you could just as easily link your plugin
> >> directly to libcapstone via qemu_plugin_insn_data().
> >>
> >>
> >> r~
> >
> > I agree it can be done outside of Qemu using another disassembler library.
> > However, there are few reasons to do it in Qemu from architecture standpoint:
> >
> > 1. To have a single place of instruction decoding logic. TCG has to decode
> > guest instructions anyway. If plugins add another decoder, it causes double
> > work and prone to errors (however current implementation does double decode
> > work anyway). For example, TCG might support new instruction which is not
> > available in external decoder yet.
> >
> > 2. Under the hood Qemu uses different implementations of decoder (in
> > addition to capstone) which is not exposed in public interface. If there is
> > a need to configure its output, proposed API allows that as well.
> >
> > 3. If multiple plugins want to use another disassembler syntax, they have
> > to share implementation as utility function.
>
> What's all this got to do with preferring intel over at&t syntax?
> I still think it's a generally useless switch.
>
>
> r~

The Linux world prefers AT&T style and the Windows world prefers Intel style
for the x86_64 ISA. That causes a lot of pain for developers and tools that
have to compare and parse assembler texts. If you have to work on different
hosts, it is better to use one style for both.
[PATCH v2 09/15] linux-user/sparc: Handle getcc, setcc, getpsr traps
These are really only meaningful for sparc32, but they're still present for backward compatibility for sparc64. Signed-off-by: Richard Henderson --- linux-user/sparc/cpu_loop.c | 62 +++-- 1 file changed, 59 insertions(+), 3 deletions(-) diff --git a/linux-user/sparc/cpu_loop.c b/linux-user/sparc/cpu_loop.c index e04c842867..a3edb353f6 100644 --- a/linux-user/sparc/cpu_loop.c +++ b/linux-user/sparc/cpu_loop.c @@ -149,6 +149,51 @@ static void flush_windows(CPUSPARCState *env) #endif } +static void next_instruction(CPUSPARCState *env) +{ +env->pc = env->npc; +env->npc = env->npc + 4; +} + +static uint32_t do_getcc(CPUSPARCState *env) +{ +#ifdef TARGET_SPARC64 +return cpu_get_ccr(env) & 0xf; +#else +return extract32(cpu_get_psr(env), 20, 4); +#endif +} + +static void do_setcc(CPUSPARCState *env, uint32_t icc) +{ +#ifdef TARGET_SPARC64 +cpu_put_ccr(env, (cpu_get_ccr(env) & 0xf0) | (icc & 0xf)); +#else +cpu_put_psr(env, deposit32(cpu_get_psr(env), 20, 4, icc)); +#endif +} + +static uint32_t do_getpsr(CPUSPARCState *env) +{ +#ifdef TARGET_SPARC64 +const uint64_t TSTATE_CWP = 0x1f; +const uint64_t TSTATE_ICC = 0xfull << 32; +const uint64_t TSTATE_XCC = 0xfull << 36; +const uint32_t PSR_S = 0x0080u; +const uint32_t PSR_V8PLUS = 0xff00u; +uint64_t tstate = sparc64_tstate(env); + +/* See , tstate_to_psr. */ +return ((tstate & TSTATE_CWP) | +PSR_S | +((tstate & TSTATE_ICC) >> 12) | +((tstate & TSTATE_XCC) >> 20) | +PSR_V8PLUS); +#else +return (cpu_get_psr(env) & (PSR_ICC | PSR_CWP)) | PSR_S; +#endif +} + /* Avoid ifdefs below for the abi32 and abi64 paths. 
*/ #ifdef TARGET_ABI32 #define TARGET_TT_SYSCALL (TT_TRAP + 0x10) /* t_linux */ @@ -218,9 +263,20 @@ void cpu_loop (CPUSPARCState *env) case TT_TRAP + 0x03: /* flush windows */ flush_windows(env); -/* next instruction */ -env->pc = env->npc; -env->npc = env->npc + 4; +next_instruction(env); +break; + +case TT_TRAP + 0x20: /* getcc */ +env->gregs[1] = do_getcc(env); +next_instruction(env); +break; +case TT_TRAP + 0x21: /* setcc */ +do_setcc(env, env->gregs[1]); +next_instruction(env); +break; +case TT_TRAP + 0x22: /* getpsr */ +env->gregs[1] = do_getpsr(env); +next_instruction(env); break; #ifdef TARGET_SPARC64 -- 2.34.1
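Note on the sparc64 branch of do_getpsr(): it builds a 32-bit PSR image from the 64-bit TSTATE by keeping CWP, moving the icc field from bits 35:32 down to bits 23:20, and moving the xcc field from bits 39:36 down to bits 19:16. A standalone sketch of that bit shuffling follows. The function name is hypothetical, and the full-width values used for PSR_S and PSR_V8PLUS are assumptions, since the archived text shows those two constants only in abbreviated form.

```c
#include <stdint.h>

/* Hypothetical standalone version of the TSTATE -> PSR conversion. */
static uint32_t tstate_to_psr_sketch(uint64_t tstate)
{
    const uint64_t TSTATE_CWP = 0x1f;
    const uint64_t TSTATE_ICC = 0xfull << 32;
    const uint64_t TSTATE_XCC = 0xfull << 36;
    const uint32_t PSR_S = 0x00000080u;       /* assumed full-width value */
    const uint32_t PSR_V8PLUS = 0xff000000u;  /* assumed full-width value */

    return (uint32_t)((tstate & TSTATE_CWP) |
                      PSR_S |
                      ((tstate & TSTATE_ICC) >> 12) |  /* icc -> bits 23:20 */
                      ((tstate & TSTATE_XCC) >> 20) |  /* xcc -> bits 19:16 */
                      PSR_V8PLUS);
}
```

Under these assumptions, a TSTATE with icc = 0xa, xcc = 0x5 and CWP = 3 maps to 0xffa50083.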
[PATCH v2 15/15] linux-user/sparc: Handle tag overflow traps
This trap is raised by taddcctv and tsubcctv insns.

Signed-off-by: Richard Henderson
---
 linux-user/sparc/target_signal.h | 2 +-
 linux-user/syscall_defs.h        | 5 +
 linux-user/sparc/cpu_loop.c      | 3 +++
 3 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/linux-user/sparc/target_signal.h b/linux-user/sparc/target_signal.h
index 87757f0c4e..f223eb4af6 100644
--- a/linux-user/sparc/target_signal.h
+++ b/linux-user/sparc/target_signal.h
@@ -8,7 +8,7 @@
 #define TARGET_SIGTRAP           5
 #define TARGET_SIGABRT           6
 #define TARGET_SIGIOT            6
-#define TARGET_SIGSTKFLT         7 /* actually EMT */
+#define TARGET_SIGEMT            7
 #define TARGET_SIGFPE            8
 #define TARGET_SIGKILL           9
 #define TARGET_SIGBUS           10
diff --git a/linux-user/syscall_defs.h b/linux-user/syscall_defs.h
index 77864de57f..614a1cbc8e 100644
--- a/linux-user/syscall_defs.h
+++ b/linux-user/syscall_defs.h
@@ -717,6 +717,11 @@ typedef struct target_siginfo {
 #define TARGET_TRAP_HWBKPT      (4)     /* hardware breakpoint/watchpoint */
 #define TARGET_TRAP_UNK         (5)     /* undiagnosed trap */
 
+/*
+ * SIGEMT si_codes
+ */
+#define TARGET_EMT_TAGOVF      1       /* tag overflow */
+
 #include "target_resource.h"
 
 struct target_pollfd {
diff --git a/linux-user/sparc/cpu_loop.c b/linux-user/sparc/cpu_loop.c
index 5a8a71e976..b36bb2574b 100644
--- a/linux-user/sparc/cpu_loop.c
+++ b/linux-user/sparc/cpu_loop.c
@@ -328,6 +328,9 @@ void cpu_loop (CPUSPARCState *env)
         case TT_PRIV_INSN:
             force_sig_fault(TARGET_SIGILL, TARGET_ILL_PRVOPC, env->pc);
             break;
+        case TT_TOVF:
+            force_sig_fault(TARGET_SIGEMT, TARGET_EMT_TAGOVF, env->pc);
+            break;
 #ifdef TARGET_SPARC64
         case TT_PRIV_ACT:
             /* Note do_privact defers to do_privop. */
-- 
2.34.1
[PATCH v2 14/15] linux-user/sparc: Handle floating-point exceptions
Raise SIGFPE for ieee exceptions. The other types, such as FSR_FTT_UNIMPFPOP, should not appear, because we enable normal emulation of missing insns at the start of sparc_cpu_realizefn(). Signed-off-by: Richard Henderson --- target/sparc/cpu.h | 3 +-- linux-user/sparc/cpu_loop.c | 22 ++ 2 files changed, 23 insertions(+), 2 deletions(-) diff --git a/target/sparc/cpu.h b/target/sparc/cpu.h index e478c5eb16..ae8de606d5 100644 --- a/target/sparc/cpu.h +++ b/target/sparc/cpu.h @@ -197,8 +197,7 @@ enum { #define FSR_FTT2 (1ULL << 16) #define FSR_FTT1 (1ULL << 15) #define FSR_FTT0 (1ULL << 14) -//gcc warns about constant overflow for ~FSR_FTT_MASK -//#define FSR_FTT_MASK (FSR_FTT2 | FSR_FTT1 | FSR_FTT0) +#define FSR_FTT_MASK (FSR_FTT2 | FSR_FTT1 | FSR_FTT0) #ifdef TARGET_SPARC64 #define FSR_FTT_NMASK 0xfffe3fffULL #define FSR_FTT_CEXC_NMASK 0xfffe3fe0ULL diff --git a/linux-user/sparc/cpu_loop.c b/linux-user/sparc/cpu_loop.c index 093358a39a..5a8a71e976 100644 --- a/linux-user/sparc/cpu_loop.c +++ b/linux-user/sparc/cpu_loop.c @@ -297,6 +297,28 @@ void cpu_loop (CPUSPARCState *env) restore_window(env); break; +case TT_FP_EXCP: +{ +int code = TARGET_FPE_FLTUNK; +target_ulong fsr = env->fsr; + +if ((fsr & FSR_FTT_MASK) == FSR_FTT_IEEE_EXCP) { +if (fsr & FSR_NVC) { +code = TARGET_FPE_FLTINV; +} else if (fsr & FSR_OFC) { +code = TARGET_FPE_FLTOVF; +} else if (fsr & FSR_UFC) { +code = TARGET_FPE_FLTUND; +} else if (fsr & FSR_DZC) { +code = TARGET_FPE_FLTDIV; +} else if (fsr & FSR_NXC) { +code = TARGET_FPE_FLTRES; +} +} +force_sig_fault(TARGET_SIGFPE, code, env->pc); +} +break; + case EXCP_INTERRUPT: /* just indicate that signals should be handled asap */ break; -- 2.34.1
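Note on the si_code selection in the TT_FP_EXCP case: the code is only derived from the current-exception bits when the FSR trap type is "IEEE exception", and the if/else chain gives invalid-operation priority over overflow, underflow, divide-by-zero and inexact. The sketch below reproduces that selection outside QEMU. All SK_ names are hypothetical, and the bit positions and si_code numbers are assumptions modelled on the SPARC FSR layout and the Linux siginfo values rather than taken from the patch.

```c
#include <stdint.h>

/* Current-exception bits of the FSR; positions are assumptions. */
#define SK_FSR_NXC (1u << 0)   /* inexact */
#define SK_FSR_DZC (1u << 1)   /* divide by zero */
#define SK_FSR_UFC (1u << 2)   /* underflow */
#define SK_FSR_OFC (1u << 3)   /* overflow */
#define SK_FSR_NVC (1u << 4)   /* invalid operation */

/* Trap-type field (bits 16:14) and its IEEE-exception encoding. */
#define SK_FSR_FTT_MASK      (7u << 14)
#define SK_FSR_FTT_IEEE_EXCP (1u << 14)

/* si_code stand-ins; numeric values are assumptions. */
enum {
    SK_FPE_FLTDIV = 3,
    SK_FPE_FLTOVF = 4,
    SK_FPE_FLTUND = 5,
    SK_FPE_FLTRES = 6,
    SK_FPE_FLTINV = 7,
    SK_FPE_FLTUNK = 14,
};

/* Pick the SIGFPE si_code for a given FSR value. */
static int sk_fpe_code(uint32_t fsr)
{
    int code = SK_FPE_FLTUNK;

    if ((fsr & SK_FSR_FTT_MASK) == SK_FSR_FTT_IEEE_EXCP) {
        if (fsr & SK_FSR_NVC) {
            code = SK_FPE_FLTINV;
        } else if (fsr & SK_FSR_OFC) {
            code = SK_FPE_FLTOVF;
        } else if (fsr & SK_FSR_UFC) {
            code = SK_FPE_FLTUND;
        } else if (fsr & SK_FSR_DZC) {
            code = SK_FPE_FLTDIV;
        } else if (fsr & SK_FSR_NXC) {
            code = SK_FPE_FLTRES;
        }
    }
    return code;
}
```

Any other trap type, or an IEEE trap with no exception bit set, falls through to the "unknown" code, matching the default in the patch.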
[PATCH v2 04/15] linux-user/sparc: Use TT_TRAP for flush windows
The v9 and pre-v9 code can be unified with this macro.

Signed-off-by: Richard Henderson
---
 linux-user/sparc/cpu_loop.c | 7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/linux-user/sparc/cpu_loop.c b/linux-user/sparc/cpu_loop.c
index 051a292ce5..e1d08ff204 100644
--- a/linux-user/sparc/cpu_loop.c
+++ b/linux-user/sparc/cpu_loop.c
@@ -196,15 +196,14 @@ void cpu_loop (CPUSPARCState *env)
             env->pc = env->npc;
             env->npc = env->npc + 4;
             break;
-        case 0x83: /* flush windows */
-#ifdef TARGET_ABI32
-        case 0x103:
-#endif
+
+        case TT_TRAP + 0x03: /* flush windows */
             flush_windows(env);
             /* next instruction */
             env->pc = env->npc;
             env->npc = env->npc + 4;
             break;
+
 #ifndef TARGET_SPARC64
         case TT_WIN_OVF: /* window overflow */
             save_window(env);
-- 
2.34.1
Re: [PATCH 1/2] configure: Add 'mkdir build' check
*ping*

Patch series: https://lore.kernel.org/qemu-devel/20230208233111.398577-1-dinahbaum...@gmail.com/

-Dinah

On Wed, Feb 8, 2023 at 6:31 PM Dinah Baum wrote:
> QEMU configure script goes into an infinite error printing loop
> when in read only directory due to 'build' dir never being created.
>
> Checking if 'mkdir dir' succeeds prevents this error.
>
> Resolves: https://gitlab.com/qemu-project/qemu/-/issues/321
> ---
>  configure | 15 ++-
>  1 file changed, 10 insertions(+), 5 deletions(-)
>
> diff --git a/configure b/configure
> index 64960c6000..3b384914ce 100755
> --- a/configure
> +++ b/configure
> @@ -31,10 +31,11 @@ then
>          fi
>      fi
>
> -    mkdir build
> -    touch $MARKER
> +    if mkdir build
> +    then
> +        touch $MARKER
>
> -    cat > GNUmakefile <<'EOF'
> +        cat > GNUmakefile <<'EOF'
>  # This file is auto-generated by configure to support in-source tree
>  # 'make' command invocation
>
> @@ -56,8 +57,12 @@ force: ;
>  GNUmakefile: ;
>
>  EOF
> -    cd build
> -    exec "$source_path/configure" "$@"
> +        cd build
> +        exec "$source_path/configure" "$@"
> +    else
> +        echo "ERROR: Unable to use ./build dir, try using a ../qemu/configure build"
> +        exit 1
> +    fi
>  fi
>
>  # Temporary directory used for files created while
> --
> 2.30.2
[PATCH v2 05/15] linux-user/sparc: Tidy window spill/fill traps
Add some macros to localize the hw difference between v9 and pre-v9.

Signed-off-by: Richard Henderson
---
 linux-user/sparc/cpu_loop.c | 23 +--
 1 file changed, 13 insertions(+), 10 deletions(-)

diff --git a/linux-user/sparc/cpu_loop.c b/linux-user/sparc/cpu_loop.c
index e1d08ff204..2bcf32590f 100644
--- a/linux-user/sparc/cpu_loop.c
+++ b/linux-user/sparc/cpu_loop.c
@@ -158,6 +158,15 @@ static void flush_windows(CPUSPARCState *env)
 #define syscall_cc xcc
 #endif
 
+/* Avoid ifdefs below for the v9 and pre-v9 hw traps. */
+#ifdef TARGET_SPARC64
+#define TARGET_TT_SPILL TT_SPILL
+#define TARGET_TT_FILL TT_FILL
+#else
+#define TARGET_TT_SPILL TT_WIN_OVF
+#define TARGET_TT_FILL TT_WIN_UNF
+#endif
+
 void cpu_loop (CPUSPARCState *env)
 {
     CPUState *cs = env_cpu(env);
@@ -204,20 +213,14 @@ void cpu_loop (CPUSPARCState *env)
             env->npc = env->npc + 4;
             break;
 
-#ifndef TARGET_SPARC64
-        case TT_WIN_OVF: /* window overflow */
+        case TARGET_TT_SPILL: /* window overflow */
             save_window(env);
             break;
-        case TT_WIN_UNF: /* window underflow */
-            restore_window(env);
-            break;
-#else
-        case TT_SPILL: /* window overflow */
-            save_window(env);
-            break;
-        case TT_FILL: /* window underflow */
+        case TARGET_TT_FILL: /* window underflow */
             restore_window(env);
             break;
+
+#ifdef TARGET_SPARC64
 #ifndef TARGET_ABI32
         case 0x16e:
             flush_windows(env);
-- 
2.34.1
[PATCH v2 13/15] linux-user/sparc: Handle unimplemented flush trap
For sparc64, TT_UNIMP_FLUSH == TT_ILL_INSN, so this is already handled.
For sparc32, the kernel uses SKIP_TRAP.

Signed-off-by: Richard Henderson
---
 linux-user/sparc/cpu_loop.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/linux-user/sparc/cpu_loop.c b/linux-user/sparc/cpu_loop.c
index bf7e10216f..093358a39a 100644
--- a/linux-user/sparc/cpu_loop.c
+++ b/linux-user/sparc/cpu_loop.c
@@ -315,6 +315,9 @@ void cpu_loop (CPUSPARCState *env)
         case TT_NCP_INSN:
             force_sig_fault(TARGET_SIGILL, TARGET_ILL_COPROC, env->pc);
             break;
+        case TT_UNIMP_FLUSH:
+            next_instruction(env);
+            break;
 #endif
         case EXCP_ATOMIC:
             cpu_exec_step_atomic(cs);
-- 
2.34.1
[PATCH v2 07/15] linux-user/sparc: Handle software breakpoint trap
This is 'ta 1' for both v9 and pre-v9.

Signed-off-by: Richard Henderson
---
 linux-user/sparc/cpu_loop.c | 8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/linux-user/sparc/cpu_loop.c b/linux-user/sparc/cpu_loop.c
index edbc4f3bdc..c14eaea163 100644
--- a/linux-user/sparc/cpu_loop.c
+++ b/linux-user/sparc/cpu_loop.c
@@ -206,6 +206,11 @@ void cpu_loop (CPUSPARCState *env)
             env->npc = env->npc + 4;
             break;

+        case TT_TRAP + 0x01: /* breakpoint */
+        case EXCP_DEBUG:
+            force_sig_fault(TARGET_SIGTRAP, TARGET_TRAP_BRKPT, env->pc);
+            break;
+
         case TT_TRAP + 0x03: /* flush windows */
             flush_windows(env);
             /* next instruction */
@@ -237,9 +242,6 @@ void cpu_loop (CPUSPARCState *env)
         case TT_ILL_INSN:
             force_sig_fault(TARGET_SIGILL, TARGET_ILL_ILLOPC, env->pc);
             break;
-        case EXCP_DEBUG:
-            force_sig_fault(TARGET_SIGTRAP, TARGET_TRAP_BRKPT, env->pc);
-            break;
         case EXCP_ATOMIC:
             cpu_exec_step_atomic(cs);
             break;
-- 
2.34.1
[PATCH v2 12/15] linux-user/sparc: Handle coprocessor disabled trap
Since qemu does not implement a sparc coprocessor, all such instructions
raise this trap. Because of that, we never raise the coprocessor exception
trap, which would be vector 0x28.

Signed-off-by: Richard Henderson
---
 linux-user/sparc/cpu_loop.c | 4
 1 file changed, 4 insertions(+)

diff --git a/linux-user/sparc/cpu_loop.c b/linux-user/sparc/cpu_loop.c
index 43f19fbd91..bf7e10216f 100644
--- a/linux-user/sparc/cpu_loop.c
+++ b/linux-user/sparc/cpu_loop.c
@@ -311,6 +311,10 @@ void cpu_loop (CPUSPARCState *env)
             /* Note do_privact defers to do_privop. */
             force_sig_fault(TARGET_SIGILL, TARGET_ILL_PRVOPC, env->pc);
             break;
+#else
+        case TT_NCP_INSN:
+            force_sig_fault(TARGET_SIGILL, TARGET_ILL_COPROC, env->pc);
+            break;
 #endif
         case EXCP_ATOMIC:
             cpu_exec_step_atomic(cs);
-- 
2.34.1
[PATCH v2 03/15] linux-user/sparc: Tidy syscall error return
Reduce ifdefs with #define syscall_cc.

Signed-off-by: Richard Henderson
---
 linux-user/sparc/cpu_loop.c | 15 +--
 1 file changed, 5 insertions(+), 10 deletions(-)

diff --git a/linux-user/sparc/cpu_loop.c b/linux-user/sparc/cpu_loop.c
index d31ea057db..051a292ce5 100644
--- a/linux-user/sparc/cpu_loop.c
+++ b/linux-user/sparc/cpu_loop.c
@@ -149,10 +149,13 @@ static void flush_windows(CPUSPARCState *env)
 #endif
 }

+/* Avoid ifdefs below for the abi32 and abi64 paths. */
 #ifdef TARGET_ABI32
 #define TARGET_TT_SYSCALL (TT_TRAP + 0x10) /* t_linux */
+#define syscall_cc psr
 #else
 #define TARGET_TT_SYSCALL (TT_TRAP + 0x6d) /* tl0_linux64 */
+#define syscall_cc xcc
 #endif

 void cpu_loop (CPUSPARCState *env)
@@ -183,18 +186,10 @@ void cpu_loop (CPUSPARCState *env)
                 break;
             }
             if ((abi_ulong)ret >= (abi_ulong)(-515)) {
-#if defined(TARGET_SPARC64) && !defined(TARGET_ABI32)
-                env->xcc |= PSR_CARRY;
-#else
-                env->psr |= PSR_CARRY;
-#endif
+                env->syscall_cc |= PSR_CARRY;
                 ret = -ret;
             } else {
-#if defined(TARGET_SPARC64) && !defined(TARGET_ABI32)
-                env->xcc &= ~PSR_CARRY;
-#else
-                env->psr &= ~PSR_CARRY;
-#endif
+                env->syscall_cc &= ~PSR_CARRY;
             }
             env->regwptr[0] = ret;
             /* next instruction */
-- 
2.34.1
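The trick here is that syscall_cc expands to a struct member name, so env->syscall_cc resolves to env->psr or env->xcc at preprocessing time. A standalone sketch of that pattern, combined with the carry-flag error convention visible in the hunk, might look as follows. All demo_* and DEMO_* names are hypothetical, the struct is a cut-down stand-in for the real CPU state, and the carry bit position is an assumption made for the example.

```c
#include <stdint.h>

#define DEMO_PSR_CARRY (1u << 20)   /* assumed position of the carry bit */

/* Hypothetical cut-down CPU state with both condition-code registers. */
struct demo_env {
    uint32_t psr;   /* 32-bit condition codes */
    uint32_t xcc;   /* 64-bit condition codes */
    uint64_t o0;    /* syscall return register */
};

/* The macro names a member, so "env->demo_syscall_cc" picks one field. */
#ifdef DEMO_ABI32
#define demo_syscall_cc psr
#else
#define demo_syscall_cc xcc
#endif

/*
 * Mirror of the hunk above: results in the errno range set carry and
 * return the positive errno, everything else clears carry.
 */
void demo_set_syscall_result(struct demo_env *env, int64_t ret)
{
    if ((uint64_t)ret >= (uint64_t)(-515)) {
        env->demo_syscall_cc |= DEMO_PSR_CARRY;
        ret = -ret;
    } else {
        env->demo_syscall_cc &= ~DEMO_PSR_CARRY;
    }
    env->o0 = (uint64_t)ret;
}
```

Because the selection happens once at the #define, both branches of the if stay free of conditional compilation, which is the point of the patch.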
[PATCH v2 06/15] linux-user/sparc: Fix sparc64_{get, set}_context traps
These traps are present for sparc64 with ilp32, aka sparc32plus.
Enabling them means adjusting the defines over in signal.c,
and fixing an incorrect usage of abi_ulong when we really meant
the full register, target_ulong.

Signed-off-by: Richard Henderson
---
 linux-user/sparc/cpu_loop.c | 23 +++
 linux-user/sparc/signal.c | 36 +++-
 2 files changed, 30 insertions(+), 29 deletions(-)

diff --git a/linux-user/sparc/cpu_loop.c b/linux-user/sparc/cpu_loop.c
index 2bcf32590f..edbc4f3bdc 100644
--- a/linux-user/sparc/cpu_loop.c
+++ b/linux-user/sparc/cpu_loop.c
@@ -213,6 +213,17 @@ void cpu_loop (CPUSPARCState *env)
             env->npc = env->npc + 4;
             break;

+#ifdef TARGET_SPARC64
+        case TT_TRAP + 0x6e:
+            flush_windows(env);
+            sparc64_get_context(env);
+            break;
+        case TT_TRAP + 0x6f:
+            flush_windows(env);
+            sparc64_set_context(env);
+            break;
+#endif
+
         case TARGET_TT_SPILL: /* window overflow */
             save_window(env);
             break;
@@ -220,18 +231,6 @@ void cpu_loop (CPUSPARCState *env)
             restore_window(env);
             break;

-#ifdef TARGET_SPARC64
-#ifndef TARGET_ABI32
-        case 0x16e:
-            flush_windows(env);
-            sparc64_get_context(env);
-            break;
-        case 0x16f:
-            flush_windows(env);
-            sparc64_set_context(env);
-            break;
-#endif
-#endif
         case EXCP_INTERRUPT:
             /* just indicate that signals should be handled asap */
             break;
diff --git a/linux-user/sparc/signal.c b/linux-user/sparc/signal.c
index b501750fe0..2be9000b9e 100644
--- a/linux-user/sparc/signal.c
+++ b/linux-user/sparc/signal.c
@@ -503,7 +503,23 @@ long do_rt_sigreturn(CPUSPARCState *env)
     return -QEMU_ESIGRETURN;
 }

-#if defined(TARGET_SPARC64) && !defined(TARGET_ABI32)
+#ifdef TARGET_ABI32
+void setup_sigtramp(abi_ulong sigtramp_page)
+{
+    uint32_t *tramp = lock_user(VERIFY_WRITE, sigtramp_page, 2 * 8, 0);
+    assert(tramp != NULL);
+
+    default_sigreturn = sigtramp_page;
+    install_sigtramp(tramp, TARGET_NR_sigreturn);
+
+    default_rt_sigreturn = sigtramp_page + 8;
+    install_sigtramp(tramp + 2, TARGET_NR_rt_sigreturn);
+
+    unlock_user(tramp, sigtramp_page, 2 * 8);
+}
+#endif
+
+#ifdef TARGET_SPARC64
 #define SPARC_MC_TSTATE 0
 #define SPARC_MC_PC 1
 #define SPARC_MC_NPC 2
@@ -575,7 +591,7 @@ void sparc64_set_context(CPUSPARCState *env)
     struct target_ucontext *ucp;
     target_mc_gregset_t *grp;
     target_mc_fpu_t *fpup;
-    abi_ulong pc, npc, tstate;
+    target_ulong pc, npc, tstate;
     unsigned int i;
     unsigned char fenab;

@@ -773,18 +789,4 @@ do_sigsegv:
     unlock_user_struct(ucp, ucp_addr, 1);
     force_sig(TARGET_SIGSEGV);
 }
-#else
-void setup_sigtramp(abi_ulong sigtramp_page)
-{
-    uint32_t *tramp = lock_user(VERIFY_WRITE, sigtramp_page, 2 * 8, 0);
-    assert(tramp != NULL);
-
-    default_sigreturn = sigtramp_page;
-    install_sigtramp(tramp, TARGET_NR_sigreturn);
-
-    default_rt_sigreturn = sigtramp_page + 8;
-    install_sigtramp(tramp + 2, TARGET_NR_rt_sigreturn);
-
-    unlock_user(tramp, sigtramp_page, 2 * 8);
-}
-#endif
+#endif /* TARGET_SPARC64 */
-- 
2.34.1
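To see why the abi_ulong to target_ulong change matters once these traps are enabled for an ilp32 ABI on a 64-bit CPU, consider the sketch below. It is only an illustration under the assumption that the ABI long is 32 bits wide while registers are 64 bits wide; the demo_* typedefs and functions are hypothetical stand-ins, not QEMU's definitions.

```c
#include <stdint.h>

/* Assumed widths for a sparc32plus-like configuration. */
typedef uint32_t demo_abi_ulong;      /* "long" as seen by the 32-bit ABI */
typedef uint64_t demo_target_ulong;   /* full width of a CPU register */

/* Holding a saved register in an ABI-sized local drops the high half. */
uint64_t demo_restore_via_abi_ulong(uint64_t saved)
{
    demo_abi_ulong tmp = (demo_abi_ulong)saved;
    return tmp;
}

/* Holding it in a register-sized local keeps every bit. */
uint64_t demo_restore_via_target_ulong(uint64_t saved)
{
    demo_target_ulong tmp = saved;
    return tmp;
}
```

For a saved value such as 0x0000004400001234, the first helper returns 0x1234 while the second returns the value unchanged, so any state stored above bit 31 survives only with the register-sized type.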
[PATCH v2 08/15] linux-user/sparc: Handle division by zero traps
In addition to the hw trap vector, there is a software trap assigned for
older sparc without hw division instructions.

Signed-off-by: Richard Henderson
---
 linux-user/sparc/cpu_loop.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/linux-user/sparc/cpu_loop.c b/linux-user/sparc/cpu_loop.c
index c14eaea163..e04c842867 100644
--- a/linux-user/sparc/cpu_loop.c
+++ b/linux-user/sparc/cpu_loop.c
@@ -211,6 +211,11 @@ void cpu_loop (CPUSPARCState *env)
             force_sig_fault(TARGET_SIGTRAP, TARGET_TRAP_BRKPT, env->pc);
             break;

+        case TT_TRAP + 0x02: /* div0 */
+        case TT_DIV_ZERO:
+            force_sig_fault(TARGET_SIGFPE, TARGET_FPE_INTDIV, env->pc);
+            break;
+
         case TT_TRAP + 0x03: /* flush windows */
             flush_windows(env);
             /* next instruction */
-- 
2.34.1
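The two stacked case labels route the software trap and the hardware vector to one handler. A small sketch of that dispatch is given below; the DEMO_* constants are hypothetical and their numeric values are assumptions for the example, not values taken from this patch.

```c
/* Hypothetical trap numbers; the values are placeholders for the sketch. */
enum {
    DEMO_TT_TRAP     = 0x80,   /* assumed base of the software trap range */
    DEMO_TT_DIV_ZERO = 0x2a,   /* assumed hw division-by-zero vector */
};

enum demo_signal {
    DEMO_SIG_NONE,
    DEMO_SIG_FPE,
};

/*
 * Both the software trap (base + 0x02) and the hardware vector fall
 * through to the same result, as in the hunk above.
 */
enum demo_signal demo_signal_for_trap(int trapnr)
{
    switch (trapnr) {
    case DEMO_TT_TRAP + 0x02:   /* software div0 */
    case DEMO_TT_DIV_ZERO:      /* hardware div0 */
        return DEMO_SIG_FPE;
    default:
        return DEMO_SIG_NONE;
    }
}
```

A guest that divides in software and raises the software trap therefore sees the same signal as one whose hardware divide instruction faults.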