[PATCH] docs: add reference to release cycle discussion
As it is coming up basically every release cycle of Xen, add a reference
to the discussion why the current release scheme has been selected in the
release management documentation.

Signed-off-by: Juergen Gross
---
 docs/process/xen-release-management.pandoc | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/docs/process/xen-release-management.pandoc b/docs/process/xen-release-management.pandoc
index b746c7157d..8f80d61d2f 100644
--- a/docs/process/xen-release-management.pandoc
+++ b/docs/process/xen-release-management.pandoc
@@ -19,6 +19,8 @@
 The Xen hypervisor project now releases every 8 months. We aim to
 release in the first half of March/July/November. These dates have been
 chosen to avoid major holidays and cultural events; if one release slips,
 ideally the subsequent release cycle would be shortened.
+The reasons for this schedule have been discussed on
+[xen-devel](https://lists.xen.org/archives/html/xen-devel/2018-07/msg02240.html).
 
 We can roughly divide one release into two periods. The development period
 and the freeze period. The former is 6 months long and the latter is about 2
-- 
2.35.3
[linux-linus test] 171547: regressions - FAIL
flight 171547 linux-linus real [real]
http://logs.test-lab.xenproject.org/osstest/logs/171547/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-xl-credit1 8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-libvirt 8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-dom0pvh-xl-intel 8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-xl-credit2 8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-xl-qemuu-ws16-amd64 8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-xl-qemut-win7-amd64 8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-libvirt-pair 12 xen-boot/src_host fail REGR. vs. 171277
 test-amd64-amd64-libvirt-pair 13 xen-boot/dst_host fail REGR. vs. 171277
 test-amd64-amd64-libvirt-raw 8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-libvirt-qcow2 8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-freebsd11-amd64 8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-pygrub 8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-xl-qemuu-debianhvm-amd64 8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-xl 8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-qemuu-nested-amd 8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-shadow 8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-xl-qemuu-ovmf-amd64 8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-xl-multivcpu 8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-xl-vhd 8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-xl-qemuu-debianhvm-i386-xsm 8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-xl-xsm 8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-examine 8 reboot fail REGR. vs. 171277
 test-amd64-amd64-xl-qemut-ws16-amd64 8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-xl-qemuu-win7-amd64 8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-xl-qemut-debianhvm-i386-xsm 8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm 8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-qemuu-nested-intel 8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-xl-pvshim 8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-xl-pvhv2-intel 8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-libvirt-xsm 8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-xl-shadow 8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-xl-qemuu-dmrestrict-amd64-dmrestrict 8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-examine-bios 8 reboot fail REGR. vs. 171277
 test-amd64-amd64-freebsd12-amd64 8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-examine-uefi 8 reboot fail REGR. vs. 171277
 test-amd64-amd64-xl-pvhv2-amd 8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-dom0pvh-xl-amd 8 xen-boot fail REGR. vs. 171277
 test-amd64-coresched-amd64-xl 8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-xl-qemut-debianhvm-amd64 8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-pair 12 xen-boot/src_host fail REGR. vs. 171277
 test-amd64-amd64-pair 13 xen-boot/dst_host fail REGR. vs. 171277

Regressions which are regarded as allowable (not blocking):
 test-amd64-amd64-xl-rtds 8 xen-boot fail REGR. vs. 171277

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-libvirt 16 saverestore-support-check fail like 171277
 test-armhf-armhf-libvirt-qcow2 15 saverestore-support-check fail like 171277
 test-armhf-armhf-libvirt-raw 15 saverestore-support-check fail like 171277
 test-arm64-arm64-xl-seattle 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-seattle 16 saverestore-support-check fail never pass
 test-arm64-arm64-xl 15 migrate-support-check fail never pass
 test-arm64-arm64-xl 16 saverestore-support-check fail never pass
 test-arm64-arm64-libvirt-xsm 15 migrate-support-check fail never pass
 test-arm64-arm64-libvirt-xsm 16 saverestore-support-check fail never pass
 test-arm64-arm64-xl-xsm 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-xsm 16 saverestore-support-check fail never pass
 test-arm64-arm64-xl-credit2 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-credit2 16 saverestore-support-check fail never pass
 test-arm64-arm64-xl-thunderx 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-thunderx 16 saverestore-support-check fail never pass
Re: PCI pass-through problem for SN570 NVME SSD
On Fri, Jul 8, 2022 at 12:38 AM Jan Beulich wrote:
>
> On 07.07.2022 17:24, G.R. wrote:
> > On Wed, Jul 6, 2022 at 2:33 PM Jan Beulich wrote:
> >>
> >> On 06.07.2022 08:25, G.R. wrote:
> >>> On Tue, Jul 5, 2022 at 7:59 PM Jan Beulich wrote:
> >>>> Nothing useful in there. Yet independent of that I guess we need to
> >>>> separate the issues you're seeing. Otherwise it'll be impossible to
> >>>> know what piece of data belongs where.
> >>> Yep, I think I'm seeing several different issues here:
> >>> 1. The FLR related DPC / AER message seen on the 1st attempt only when
> >>> pciback tries to seize and release the SN570
> >>> - Later-on pciback operations appear just fine.
> >>> 2. MSI-X preparation failure message that shows up each time the SN570
> >>> is seized by pciback or when it's passed to domU.
> >>> 3. XEN tries to map BAR from two devices to the same page
> >>> 4. The "write-back to unknown field" message in QEMU log that goes
> >>> away with permissive=1 passthrough config.
> >>> 5. The "irq 16: nobody cared" message shows up *sometimes* in a
> >>> pattern that I haven't figured out (See attached)
> >>> 6. The FreeBSD domU sees the device but fails to use it because low
> >>> level commands sent to it are aborted.
> >>> 7. The device does not return to the pci-assignable-list when the domU
> >>> it was assigned shuts-down. (See attached)
> >>>
> >>> #3 appears to be a known issue that could be worked around with
> >>> patches from the list.
> >>> I suspect #1 may have something to do with the device itself. It's
> >>> still not clear if it's deadly or just annoying.
> >>> I was able to update the firmware to the latest version and confirmed
> >>> that the new firmware didn't make any noticeable difference.
> >>>
> >>> I suspect issue #2, #4, #5, #6, #7 may be related, and the
> >>> pass-through was not completely successful...
> >>>
> >>> Should I expect a debug build of XEN hypervisor to give better
> >>> diagnostic messages, without the debug patch that Roger mentioned?
> >>
> >> Well, "expect" is perhaps too much to say, but with problems like
> >> yours (and even more so with multiple ones) using a debug
> >> hypervisor (or kernel, if such a build mode existed) is imo
> >> always a good idea. As is using as up-to-date a version as
> >> possible.
> >
> > I built both 4.14.3 debug version and 4.16.1 release version for
> > testing purposes.
> > Unfortunately they gave me absolutely zero information, since both of
> > them are not able to get through issue #1
> > the FLR related DPC / AER issue.
> > With 4.16.1 release, it actually can survive the 'xl
> > pci-assignable-add' which triggers the first AER failure.
>
> Then that's what needs debugging first. Yet from all I've seen so
> far I'm not sure who on the Xen side could be doing that, the more
> without themselves being able to repro - this seems more like a
> Linux side issue (and even outside of the pciback driver).
>
Yep, this one is likely not XEN related, as I've seen some discussions
([1],[2]) on similar symptoms (not necessarily the same root cause though).
The question is why this only shows up during the FLR attempt and if
the following pci-assignable-adds that do not trigger the error are actually
reliable.
BTW, I'm under the impression that the device is still usable in dom0
afterwards, I'll have to double check though...

[1] https://patchwork.kernel.org/project/linux-pci/patch/20220408153159.106741-1-kai.heng.f...@canonical.com/
[2] https://patchwork.kernel.org/project/linux-pci/patch/20220127025418.1989642-1-kai.heng.f...@canonical.com/#24713767

> > But the 'xl pci-assignable-remove' will lead to xl segmentation fault...
> >> [ 655.041442] xl[975]: segfault at 0 ip 7f2cccdaf71f sp
> >> 7ffd73a3d4d0 error 4 in libxenlight.so.4.16.0[7f2cccd92000+7c000]
> >> [ 655.041460] Code: 61 06 00 eb 13 66 0f 1f 44 00 00 83 c3 01 39 5c 24 2c
> >> 0f 86 1b 01 00 00 48 8b 34 24 89 d8 4d 89 f9 4d 89 f0 4c 89 e9 4c 89 e2
> >> <48> 8b 3c c6 31 c0 48 89 ee e8 53 44 fe ff 83 f8 04 75 ce 48 8b 44
>
> That'll need debugging. Cc-ing Anthony for awareness, but I'm sure
> he'll need more data to actually stand a chance of doing something
> about it.
>
> Is there any chance you could be doing some debugging work yourself,
> at the very least to figure out where this (apparent) NULL deref is
> happening?
Yep, I can collect the call-stack for sure.

>
> Jan
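
As a possible first step towards locating the apparent NULL dereference, the kernel log line already gives the faulting instruction pointer and the start of the libxenlight.so mapping, so the offset inside that mapping can be computed directly. This is only a sketch: the helper name map_offset is made up for illustration, the library path in the commented-out addr2line call is an assumption, and the mapping offset may still need adjusting by the segment's file offset before it matches a symbol address.

```shell
#!/bin/bash
# Values copied from the "segfault at 0 ip ... in libxenlight.so.4.16.0[...]" log line.
ip=0x7f2cccdaf71f        # faulting instruction pointer
map_base=0x7f2cccd92000  # start of the libxenlight.so mapping

# Hypothetical helper: print (ip - base) as a hex offset into the mapping.
map_offset() {
    printf '0x%x\n' $(( $1 - $2 ))
}

offset=$(map_offset "$ip" "$map_base")
echo "offset into mapping: $offset"

# Assumption: the library is installed at this path and has debug info.
# The offset is relative to the mapping, not necessarily to the start of
# the file, so treat the result as a hint rather than a definitive answer.
# addr2line -f -e /usr/lib/libxenlight.so.4.16.0 "$offset"
```

Running xl under gdb and taking a backtrace at the crash would give the full call stack; the offset above is mainly useful when only the log line is available.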
Re: [PATCH v2 1/9] drivers/char: Add support for Xue USB3 debugger
On Wed, Jul 06, 2022 at 05:32:06PM +0200, Marek Marczykowski-Górecki wrote:
> diff --git a/xen/drivers/char/Kconfig b/xen/drivers/char/Kconfig
> index e5f7b1d8eb8a..d12b2205dafc 100644
> --- a/xen/drivers/char/Kconfig
> +++ b/xen/drivers/char/Kconfig
> @@ -74,3 +74,12 @@ config HAS_EHCI
>  	help
>  	  This selects the USB based EHCI debug port to be used as a UART. If
>  	  you have an x86 based system with USB, say Y.
> +
> +config HAS_XHCI
> +	bool "XHCI DbC UART driver"
> +	depends on X86
> +	help
> +	  This selects the USB based XHCI debug capability to be used as a UART.
> +	  Enabling this option makes Xen use extra ~2MB memory, even if XHCI UART

My math sucks here... 58 pages is 232KiB.

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
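
The corrected figure can be double-checked with a one-line calculation. The sketch below assumes the usual 4 KiB page size on x86; the function name pages_to_kib is only illustrative.

```shell
#!/bin/bash
# Convert a page count to KiB. The page size in KiB is an assumption
# (4 KiB, the normal x86 page size) and can be overridden as $2.
pages_to_kib() {
    local pages=$1 page_kib=${2:-4}
    echo $(( pages * page_kib ))
}

pages_to_kib 58   # 58 * 4 = 232 (KiB), well below the ~2MB in the help text
```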
Re: [PATCH 1/2] automation: Remove XEN_CONFIG_EXPERT leftovers
On Thu, 7 Jul 2022, Xenia Ragiadakou wrote:
> The EXPERT config option cannot anymore be selected via the environmental
> variable XEN_CONFIG_EXPERT. Remove stale references to XEN_CONFIG_EXPERT
> from the automation code.
>
> Signed-off-by: Xenia Ragiadakou

Reviewed-by: Stefano Stabellini

> ---
>  automation/build/README.md      | 3 ---
>  automation/scripts/build        | 4 ++--
>  automation/scripts/containerize | 1 -
>  3 files changed, 2 insertions(+), 6 deletions(-)
>
> diff --git a/automation/build/README.md b/automation/build/README.md
> index 2137957408..00305eed03 100644
> --- a/automation/build/README.md
> +++ b/automation/build/README.md
> @@ -65,9 +65,6 @@ understands.
>  - CONTAINER_NO_PULL: If set to 1, the script will not pull from docker hub.
>    This is useful when testing container locally.
>
> -- XEN_CONFIG_EXPERT: If this is defined in your shell it will be
> -  automatically passed through to the container.
> -
>  If your docker host has Linux kernel > 4.11, and you want to use containers
>  that run old glibc (for example, CentOS 6 or SLES11SP4), you may need to add
>
> diff --git a/automation/scripts/build b/automation/scripts/build
> index 281f8b1fcc..21b3bc57c8 100755
> --- a/automation/scripts/build
> +++ b/automation/scripts/build
> @@ -91,6 +91,6 @@ for cfg in `ls ${cfg_dir}`; do
>      echo "Building $cfg"
>      make -j$(nproc) -C xen clean
>      rm -f xen/.config
> -    make -C xen KBUILD_DEFCONFIG=../../../../${cfg_dir}/${cfg} XEN_CONFIG_EXPERT=y defconfig
> -    make -j$(nproc) -C xen XEN_CONFIG_EXPERT=y
> +    make -C xen KBUILD_DEFCONFIG=../../../../${cfg_dir}/${cfg} defconfig
> +    make -j$(nproc) -C xen
>  done
> diff --git a/automation/scripts/containerize b/automation/scripts/containerize
> index 8992c67278..9d4beca4fa 100755
> --- a/automation/scripts/containerize
> +++ b/automation/scripts/containerize
> @@ -101,7 +101,6 @@ exec ${docker_cmd} run \
>      -v "${CONTAINER_PATH}":/build:rw${selinux} \
>      -v "${HOME}/.ssh":/root/.ssh:ro \
>      ${SSH_AUTH_DIR:+-v "${SSH_AUTH_DIR}":/tmp/ssh-agent${selinux}} \
> -    ${XEN_CONFIG_EXPERT:+-e XEN_CONFIG_EXPERT=${XEN_CONFIG_EXPERT}} \
>      ${CONTAINER_ARGS} \
>      -${termint}i --rm -- \
>      ${CONTAINER} \
> --
> 2.34.1
>
Re: [PATCH 2/2] automation: arm64: Create a test job for testing static allocation on qemu
On Thu, 7 Jul 2022, Julien Grall wrote:
> Hi Xenia,
>
> On 07/07/2022 21:38, Xenia Ragiadakou wrote:
> > Add an arm subdirectory under automation/configs for the arm specific configs
> > and add a config that enables static allocation.
> >
> > Modify the build script to search for configs also in this subdirectory and to
> > keep the generated xen binary, suffixed with the config file name, as artifact.
> >
> > Create a test job that
> > - boots xen on qemu with a single direct mapped dom0less guest configured with
> > statically allocated memory
> > - verifies that the memory ranges reported in the guest's logs are the same
> > with the provided static memory regions
> >
> > For guest kernel, use the 5.9.9 kernel from the tests-artifacts containers.
> > Use busybox-static package, to create the guest ramdisk.
> > To generate the u-boot script, use ImageBuilder.
> > Use the qemu from the tests-artifacts containers.
> >
> > Signed-off-by: Xenia Ragiadakou
> > ---
> >  automation/configs/arm/static_mem          |   3 +
> >  automation/gitlab-ci/test.yaml             |  24 +
> >  automation/scripts/build                   |   4 +
> >  automation/scripts/qemu-staticmem-arm64.sh | 114 +
> >  4 files changed, 145 insertions(+)
> >  create mode 100644 automation/configs/arm/static_mem
> >  create mode 100755 automation/scripts/qemu-staticmem-arm64.sh
> >
> > diff --git a/automation/configs/arm/static_mem b/automation/configs/arm/static_mem
> > new file mode 100644
> > index 00..84675ddf4e
> > --- /dev/null
> > +++ b/automation/configs/arm/static_mem
> > @@ -0,0 +1,3 @@
> > +CONFIG_EXPERT=y
> > +CONFIG_UNSUPPORTED=y
> > +CONFIG_STATIC_MEMORY=y
> > \ No newline at end of file
>
> Any particular reason to build a new Xen rather than enable
> CONFIG_STATIC_MEMORY in the existing build?
>
> > diff --git a/automation/scripts/build b/automation/scripts/build
> > index 21b3bc57c8..9c6196d9bd 100755
> > --- a/automation/scripts/build
> > +++ b/automation/scripts/build
> > @@ -83,6 +83,7 @@ fi
> >  # Build all the configs we care about
> >  case ${XEN_TARGET_ARCH} in
> >      x86_64) arch=x86 ;;
> > +    arm64) arch=arm ;;
> >      *) exit 0 ;;
> >  esac
> > @@ -93,4 +94,7 @@ for cfg in `ls ${cfg_dir}`; do
> >      rm -f xen/.config
> >      make -C xen KBUILD_DEFCONFIG=../../../../${cfg_dir}/${cfg} defconfig
> >      make -j$(nproc) -C xen
> > +    if [[ ${arch} == "arm" ]]; then
> > +        cp xen/xen binaries/xen-${cfg}
> > +    fi
>
> This feels a bit of a hack to be arm only. Can you explain why this is not
> enabled for x86 (other than this is not yet used)?
>
> >  done
> > diff --git a/automation/scripts/qemu-staticmem-arm64.sh b/automation/scripts/qemu-staticmem-arm64.sh
> > new file mode 100755
> > index 00..5b89a151aa
> > --- /dev/null
> > +++ b/automation/scripts/qemu-staticmem-arm64.sh
> > @@ -0,0 +1,114 @@
> > +#!/bin/bash
> > +
> > +base=(0x5000 0x1)
> > +size=(0x1000 0x1000)
>
> From the name, it is not clear what the base and size refer to. Looking a
> bit below, it seems to be referring to the domain memory. If so, I would
> suggest to comment and rename to "domu_{base, size}".
>
> > +
> > +set -ex
> > +
> > +apt-get -qy update
> > +apt-get -qy install --no-install-recommends u-boot-qemu \
> > +    u-boot-tools \
> > +    device-tree-compiler \
> > +    cpio \
> > +    curl \
> > +    busybox-static
> > +
> > +# DomU Busybox
> > +cd binaries
> > +mkdir -p initrd
> > +mkdir -p initrd/bin
> > +mkdir -p initrd/sbin
> > +mkdir -p initrd/etc
> > +mkdir -p initrd/dev
> > +mkdir -p initrd/proc
> > +mkdir -p initrd/sys
> > +mkdir -p initrd/lib
> > +mkdir -p initrd/var
> > +mkdir -p initrd/mnt
> > +cp /bin/busybox initrd/bin/busybox
> > +initrd/bin/busybox --install initrd/bin
> > +echo "#!/bin/sh
> > +
> > +mount -t proc proc /proc
> > +mount -t sysfs sysfs /sys
> > +mount -t devtmpfs devtmpfs /dev
> > +/bin/sh" > initrd/init
> > +chmod +x initrd/init
> > +cd initrd
> > +find . | cpio --create --format='newc' | gzip > ../initrd.cpio.gz
> > +cd ../..
> > +
> > +# XXX QEMU looks for "efi-virtio.rom" even if it is unneeded
> > +curl -fsSLO https://github.com/qemu/qemu/raw/v5.2.0/pc-bios/efi-virtio.rom
> > +
> > +./binaries/qemu-system-aarch64 -nographic \
> > +    -M virtualization=true \
> > +    -M virt \
> > +    -M virt,gic-version=2 \
> > +    -cpu cortex-a57 \
> > +    -smp 2 \
> > +    -m 8G \
> > +    -M dumpdtb=binaries/virt-gicv2.dtb
> > +
> > +#dtc -I dtb -O dts binaries/virt-gicv2.dtb > binaries/virt-gicv2.dts
> > +
> > +# ImageBuilder
> > +rm -rf imagebuilder
> > +git clone https://gitlab.com/ViryaOS/imagebuilder
> > +
> > +echo "MEMORY_START=\"0x4000\"
> > +MEMORY_END=\"0x02\"
> > +
[xen-unstable test] 171545: tolerable FAIL - PUSHED
flight 171545 xen-unstable real [real]
http://logs.test-lab.xenproject.org/osstest/logs/171545/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-qemut-win7-amd64 19 guest-stop fail like 171516
 test-armhf-armhf-libvirt 16 saverestore-support-check fail like 171516
 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 171516
 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stop fail like 171516
 test-amd64-i386-xl-qemut-ws16-amd64 19 guest-stop fail like 171516
 test-amd64-i386-xl-qemut-win7-amd64 19 guest-stop fail like 171516
 test-armhf-armhf-libvirt-qcow2 15 saverestore-support-check fail like 171516
 test-armhf-armhf-libvirt-raw 15 saverestore-support-check fail like 171516
 test-amd64-amd64-xl-qemut-ws16-amd64 19 guest-stop fail like 171516
 test-amd64-i386-xl-qemuu-ws16-amd64 19 guest-stop fail like 171516
 test-amd64-i386-xl-qemuu-win7-amd64 19 guest-stop fail like 171516
 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stop fail like 171516
 test-amd64-i386-libvirt-xsm 15 migrate-support-check fail never pass
 test-amd64-i386-xl-pvshim 14 guest-start fail never pass
 test-amd64-amd64-libvirt 15 migrate-support-check fail never pass
 test-arm64-arm64-libvirt-xsm 15 migrate-support-check fail never pass
 test-arm64-arm64-libvirt-xsm 16 saverestore-support-check fail never pass
 test-arm64-arm64-xl-credit1 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-credit1 16 saverestore-support-check fail never pass
 test-arm64-arm64-xl-thunderx 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-thunderx 16 saverestore-support-check fail never pass
 test-arm64-arm64-xl 15 migrate-support-check fail never pass
 test-arm64-arm64-xl 16 saverestore-support-check fail never pass
 test-arm64-arm64-xl-xsm 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-xsm 16 saverestore-support-check fail never pass
 test-amd64-i386-libvirt 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-credit2 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-credit2 16 saverestore-support-check fail never pass
 test-armhf-armhf-xl-arndale 15 migrate-support-check fail never pass
 test-armhf-armhf-xl-arndale 16 saverestore-support-check fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check fail never pass
 test-amd64-amd64-libvirt-xsm 15 migrate-support-check fail never pass
 test-amd64-amd64-libvirt-vhd 14 migrate-support-check fail never pass
 test-amd64-i386-libvirt-raw 14 migrate-support-check fail never pass
 test-arm64-arm64-libvirt-raw 14 migrate-support-check fail never pass
 test-arm64-arm64-libvirt-raw 15 saverestore-support-check fail never pass
 test-arm64-arm64-xl-vhd 14 migrate-support-check fail never pass
 test-arm64-arm64-xl-vhd 15 saverestore-support-check fail never pass
 test-armhf-armhf-xl-cubietruck 15 migrate-support-check fail never pass
 test-armhf-armhf-xl-cubietruck 16 saverestore-support-check fail never pass
 test-armhf-armhf-libvirt 15 migrate-support-check fail never pass
 test-armhf-armhf-xl-rtds 15 migrate-support-check fail never pass
 test-armhf-armhf-xl-rtds 16 saverestore-support-check fail never pass
 test-arm64-arm64-xl-seattle 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-seattle 16 saverestore-support-check fail never pass
 test-armhf-armhf-libvirt-qcow2 14 migrate-support-check fail never pass
 test-armhf-armhf-xl-multivcpu 15 migrate-support-check fail never pass
 test-armhf-armhf-xl-multivcpu 16 saverestore-support-check fail never pass
 test-armhf-armhf-xl-credit2 15 migrate-support-check fail never pass
 test-armhf-armhf-xl-credit2 16 saverestore-support-check fail never pass
 test-armhf-armhf-libvirt-raw 14 migrate-support-check fail never pass
 test-armhf-armhf-xl-vhd 14 migrate-support-check fail never pass
 test-armhf-armhf-xl-vhd 15 saverestore-support-check fail never pass
 test-armhf-armhf-xl 15 migrate-support-check fail never pass
 test-armhf-armhf-xl 16 saverestore-support-check fail never pass
 test-armhf-armhf-xl-credit1 15 migrate-support-check fail never pass
 test-armhf-armhf-xl-credit1 16 saverestore-support-check fail never pass

version targeted for testing:
 xen 46cbd76faf737e9fe2d57aaf335a0203f66ba21c
baseline version:
 xen
Re: [PATCH 2/2] automation: arm64: Create a test job for testing static allocation on qemu
Hi Xenia,

On 07/07/2022 21:38, Xenia Ragiadakou wrote:
> Add an arm subdirectory under automation/configs for the arm specific configs
> and add a config that enables static allocation.
>
> Modify the build script to search for configs also in this subdirectory and to
> keep the generated xen binary, suffixed with the config file name, as artifact.
>
> Create a test job that
> - boots xen on qemu with a single direct mapped dom0less guest configured with
> statically allocated memory
> - verifies that the memory ranges reported in the guest's logs are the same
> with the provided static memory regions
>
> For guest kernel, use the 5.9.9 kernel from the tests-artifacts containers.
> Use busybox-static package, to create the guest ramdisk.
> To generate the u-boot script, use ImageBuilder.
> Use the qemu from the tests-artifacts containers.
>
> Signed-off-by: Xenia Ragiadakou
> ---
>  automation/configs/arm/static_mem          |   3 +
>  automation/gitlab-ci/test.yaml             |  24 +
>  automation/scripts/build                   |   4 +
>  automation/scripts/qemu-staticmem-arm64.sh | 114 +
>  4 files changed, 145 insertions(+)
>  create mode 100644 automation/configs/arm/static_mem
>  create mode 100755 automation/scripts/qemu-staticmem-arm64.sh
>
> diff --git a/automation/configs/arm/static_mem b/automation/configs/arm/static_mem
> new file mode 100644
> index 00..84675ddf4e
> --- /dev/null
> +++ b/automation/configs/arm/static_mem
> @@ -0,0 +1,3 @@
> +CONFIG_EXPERT=y
> +CONFIG_UNSUPPORTED=y
> +CONFIG_STATIC_MEMORY=y
> \ No newline at end of file

Any particular reason to build a new Xen rather than enable
CONFIG_STATIC_MEMORY in the existing build?

> diff --git a/automation/scripts/build b/automation/scripts/build
> index 21b3bc57c8..9c6196d9bd 100755
> --- a/automation/scripts/build
> +++ b/automation/scripts/build
> @@ -83,6 +83,7 @@ fi
>  # Build all the configs we care about
>  case ${XEN_TARGET_ARCH} in
>      x86_64) arch=x86 ;;
> +    arm64) arch=arm ;;
>      *) exit 0 ;;
>  esac
> @@ -93,4 +94,7 @@ for cfg in `ls ${cfg_dir}`; do
>      rm -f xen/.config
>      make -C xen KBUILD_DEFCONFIG=../../../../${cfg_dir}/${cfg} defconfig
>      make -j$(nproc) -C xen
> +    if [[ ${arch} == "arm" ]]; then
> +        cp xen/xen binaries/xen-${cfg}
> +    fi

This feels a bit of a hack to be arm only. Can you explain why this is not
enabled for x86 (other than this is not yet used)?

>  done
> diff --git a/automation/scripts/qemu-staticmem-arm64.sh b/automation/scripts/qemu-staticmem-arm64.sh
> new file mode 100755
> index 00..5b89a151aa
> --- /dev/null
> +++ b/automation/scripts/qemu-staticmem-arm64.sh
> @@ -0,0 +1,114 @@
> +#!/bin/bash
> +
> +base=(0x5000 0x1)
> +size=(0x1000 0x1000)

From the name, it is not clear what the base and size refer to. Looking a
bit below, it seems to be referring to the domain memory. If so, I would
suggest to comment and rename to "domu_{base, size}".

> +
> +set -ex
> +
> +apt-get -qy update
> +apt-get -qy install --no-install-recommends u-boot-qemu \
> +    u-boot-tools \
> +    device-tree-compiler \
> +    cpio \
> +    curl \
> +    busybox-static
> +
> +# DomU Busybox
> +cd binaries
> +mkdir -p initrd
> +mkdir -p initrd/bin
> +mkdir -p initrd/sbin
> +mkdir -p initrd/etc
> +mkdir -p initrd/dev
> +mkdir -p initrd/proc
> +mkdir -p initrd/sys
> +mkdir -p initrd/lib
> +mkdir -p initrd/var
> +mkdir -p initrd/mnt
> +cp /bin/busybox initrd/bin/busybox
> +initrd/bin/busybox --install initrd/bin
> +echo "#!/bin/sh
> +
> +mount -t proc proc /proc
> +mount -t sysfs sysfs /sys
> +mount -t devtmpfs devtmpfs /dev
> +/bin/sh" > initrd/init
> +chmod +x initrd/init
> +cd initrd
> +find . | cpio --create --format='newc' | gzip > ../initrd.cpio.gz
> +cd ../..
> +
> +# XXX QEMU looks for "efi-virtio.rom" even if it is unneeded
> +curl -fsSLO https://github.com/qemu/qemu/raw/v5.2.0/pc-bios/efi-virtio.rom
> +
> +./binaries/qemu-system-aarch64 -nographic \
> +    -M virtualization=true \
> +    -M virt \
> +    -M virt,gic-version=2 \
> +    -cpu cortex-a57 \
> +    -smp 2 \
> +    -m 8G \
> +    -M dumpdtb=binaries/virt-gicv2.dtb
> +
> +#dtc -I dtb -O dts binaries/virt-gicv2.dtb > binaries/virt-gicv2.dts
> +
> +# ImageBuilder
> +rm -rf imagebuilder
> +git clone https://gitlab.com/ViryaOS/imagebuilder
> +
> +echo "MEMORY_START=\"0x4000\"
> +MEMORY_END=\"0x02\"
> +
> +DEVICE_TREE=\"virt-gicv2.dtb\"
> +
> +XEN=\"xen-static_mem\"
> +XEN_CMD=\"console=dtuart earlyprintk xsm=dummy\"

AFAIK, earlyprintk is not an option for Xen on Arm (at least). It is also
not clear why you need to pass xsm=dummy.

> +
> +NUM_DOMUS=1
> +DOMU_MEM[0]=512
> +DOMU_VCPUS[0]=1
> +DOMU_KERNEL[0]=\"Image\"
> +DOMU_RAMDISK[0]=\"initrd.cpio.gz\"
> +DOMU_CMD[0]=\"earlyprintk console=ttyAMA0\"
> +DOMU_STATIC_MEM[0]=\"${base[0]} ${size[0]} ${base[1]} ${size[1]}\"
> +
> +UBOOT_SOURCE=\"boot.source\"
> +UBOOT_SCRIPT=\"boot.scr\"" > binaries/imagebuilder_config
> +
> +bash
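
To make the naming suggestion above concrete, here is a rough sketch of how the two arrays could be renamed and commented, and how the DOMU_STATIC_MEM string could be built from them in a loop instead of by indexing each element. The addresses are hypothetical placeholders rather than the values from the patch, and static_mem_string is an illustrative helper name, not something the patch defines.

```shell
#!/bin/bash
# Statically allocated memory banks of the dom0less DomU.
# Hypothetical placeholder values, not the ones used in the patch.
domu_base=(0x10000000 0x20000000)   # start address of each bank
domu_size=(0x01000000 0x01000000)   # size of each bank

# Build the "base0 size0 base1 size1 ..." string in the form that the
# DOMU_STATIC_MEM[0] setting in the quoted config expects.
static_mem_string() {
    local out="" i
    for i in "${!domu_base[@]}"; do
        out+="${domu_base[$i]} ${domu_size[$i]} "
    done
    echo "${out% }"   # drop the trailing space
}

static_mem_string
```

Looping over the array indices keeps the config line correct if a third bank is added later.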
[PATCH 2/2] automation: arm64: Create a test job for testing static allocation on qemu
Add an arm subdirectory under automation/configs for the arm specific configs
and add a config that enables static allocation.

Modify the build script to search for configs also in this subdirectory and to
keep the generated xen binary, suffixed with the config file name, as artifact.

Create a test job that
- boots xen on qemu with a single direct mapped dom0less guest configured with
statically allocated memory
- verifies that the memory ranges reported in the guest's logs are the same
with the provided static memory regions

For guest kernel, use the 5.9.9 kernel from the tests-artifacts containers.
Use busybox-static package, to create the guest ramdisk.
To generate the u-boot script, use ImageBuilder.
Use the qemu from the tests-artifacts containers.

Signed-off-by: Xenia Ragiadakou
---
 automation/configs/arm/static_mem          |   3 +
 automation/gitlab-ci/test.yaml             |  24 +
 automation/scripts/build                   |   4 +
 automation/scripts/qemu-staticmem-arm64.sh | 114 +
 4 files changed, 145 insertions(+)
 create mode 100644 automation/configs/arm/static_mem
 create mode 100755 automation/scripts/qemu-staticmem-arm64.sh

diff --git a/automation/configs/arm/static_mem b/automation/configs/arm/static_mem
new file mode 100644
index 00..84675ddf4e
--- /dev/null
+++ b/automation/configs/arm/static_mem
@@ -0,0 +1,3 @@
+CONFIG_EXPERT=y
+CONFIG_UNSUPPORTED=y
+CONFIG_STATIC_MEMORY=y
\ No newline at end of file
diff --git a/automation/gitlab-ci/test.yaml b/automation/gitlab-ci/test.yaml
index 42cd725a12..dc181f3777 100644
--- a/automation/gitlab-ci/test.yaml
+++ b/automation/gitlab-ci/test.yaml
@@ -205,3 +205,27 @@ qemu-smoke-x86-64-clang-pvh:
     - smoke
     - /^coverity-tested\/.*/
     - /^stable-.*/
+
+qemu-staticmem-arm64-gcc:
+  stage: test
+  image: registry.gitlab.com/xen-project/xen/${CONTAINER}
+  variables:
+    CONTAINER: debian:unstable-arm64v8
+  script:
+    - ./automation/scripts/qemu-staticmem-arm64.sh 2>&1 | tee qemu-staticmem-arm64.log
+  dependencies:
+    - debian-unstable-gcc-arm64
+    - kernel-5.9.9-arm64-export
+    - qemu-system-aarch64-6.0.0-arm64-export
+  artifacts:
+    paths:
+      - qemu-staticmem.serial
+      - '*.log'
+    when: always
+  tags:
+    - arm64
+  except:
+    - master
+    - smoke
+    - /^coverity-tested\/.*/
+    - /^stable-.*/
diff --git a/automation/scripts/build b/automation/scripts/build
index 21b3bc57c8..9c6196d9bd 100755
--- a/automation/scripts/build
+++ b/automation/scripts/build
@@ -83,6 +83,7 @@ fi
 # Build all the configs we care about
 case ${XEN_TARGET_ARCH} in
     x86_64) arch=x86 ;;
+    arm64) arch=arm ;;
     *) exit 0 ;;
 esac
@@ -93,4 +94,7 @@ for cfg in `ls ${cfg_dir}`; do
     rm -f xen/.config
     make -C xen KBUILD_DEFCONFIG=../../../../${cfg_dir}/${cfg} defconfig
     make -j$(nproc) -C xen
+    if [[ ${arch} == "arm" ]]; then
+        cp xen/xen binaries/xen-${cfg}
+    fi
 done
diff --git a/automation/scripts/qemu-staticmem-arm64.sh b/automation/scripts/qemu-staticmem-arm64.sh
new file mode 100755
index 00..5b89a151aa
--- /dev/null
+++ b/automation/scripts/qemu-staticmem-arm64.sh
@@ -0,0 +1,114 @@
+#!/bin/bash
+
+base=(0x5000 0x1)
+size=(0x1000 0x1000)
+
+set -ex
+
+apt-get -qy update
+apt-get -qy install --no-install-recommends u-boot-qemu \
+    u-boot-tools \
+    device-tree-compiler \
+    cpio \
+    curl \
+    busybox-static
+
+# DomU Busybox
+cd binaries
+mkdir -p initrd
+mkdir -p initrd/bin
+mkdir -p initrd/sbin
+mkdir -p initrd/etc
+mkdir -p initrd/dev
+mkdir -p initrd/proc
+mkdir -p initrd/sys
+mkdir -p initrd/lib
+mkdir -p initrd/var
+mkdir -p initrd/mnt
+cp /bin/busybox initrd/bin/busybox
+initrd/bin/busybox --install initrd/bin
+echo "#!/bin/sh
+
+mount -t proc proc /proc
+mount -t sysfs sysfs /sys
+mount -t devtmpfs devtmpfs /dev
+/bin/sh" > initrd/init
+chmod +x initrd/init
+cd initrd
+find . | cpio --create --format='newc' | gzip > ../initrd.cpio.gz
+cd ../..
+
+# XXX QEMU looks for "efi-virtio.rom" even if it is unneeded
+curl -fsSLO https://github.com/qemu/qemu/raw/v5.2.0/pc-bios/efi-virtio.rom
+
+./binaries/qemu-system-aarch64 -nographic \
+    -M virtualization=true \
+    -M virt \
+    -M virt,gic-version=2 \
+    -cpu cortex-a57 \
+    -smp 2 \
+    -m 8G \
+    -M dumpdtb=binaries/virt-gicv2.dtb
+
+#dtc -I dtb -O dts binaries/virt-gicv2.dtb > binaries/virt-gicv2.dts
+
+# ImageBuilder
+rm -rf imagebuilder
+git clone https://gitlab.com/ViryaOS/imagebuilder
+
+echo "MEMORY_START=\"0x4000\"
+MEMORY_END=\"0x02\"
+
+DEVICE_TREE=\"virt-gicv2.dtb\"
+
+XEN=\"xen-static_mem\"
+XEN_CMD=\"console=dtuart earlyprintk xsm=dummy\"
+
+NUM_DOMUS=1
+DOMU_MEM[0]=512
+DOMU_VCPUS[0]=1
+DOMU_KERNEL[0]=\"Image\"
[PATCH 0/2] Create a test job for testing static memory on qemu
This patch series
- removes all the references to the XEN_CONFIG_EXPERT environmental variable
which is not used anymore
- creates a trivial arm64 test job that boots xen on qemu with a direct mapped
dom0less domu with static memory and verifies, based on its logs, that the
domu's memory node ranges are the same as the static memory ranges with which
it was configured

Xenia Ragiadakou (2):
  automation: Remove XEN_CONFIG_EXPERT leftovers
  automation: arm64: Create a test job for testing static allocation on qemu

 automation/build/README.md                 |   3 -
 automation/configs/arm/static_mem          |   3 +
 automation/gitlab-ci/test.yaml             |  24 +
 automation/scripts/build                   |   8 +-
 automation/scripts/containerize            |   1 -
 automation/scripts/qemu-staticmem-arm64.sh | 114 +
 6 files changed, 147 insertions(+), 6 deletions(-)
 create mode 100644 automation/configs/arm/static_mem
 create mode 100755 automation/scripts/qemu-staticmem-arm64.sh

-- 
2.34.1
[PATCH 1/2] automation: Remove XEN_CONFIG_EXPERT leftovers
The EXPERT config option can no longer be selected via the environmental
variable XEN_CONFIG_EXPERT. Remove stale references to XEN_CONFIG_EXPERT
from the automation code.

Signed-off-by: Xenia Ragiadakou
---
 automation/build/README.md      | 3 ---
 automation/scripts/build        | 4 ++--
 automation/scripts/containerize | 1 -
 3 files changed, 2 insertions(+), 6 deletions(-)

diff --git a/automation/build/README.md b/automation/build/README.md
index 2137957408..00305eed03 100644
--- a/automation/build/README.md
+++ b/automation/build/README.md
@@ -65,9 +65,6 @@ understands.
 - CONTAINER_NO_PULL: If set to 1, the script will not pull from docker hub.
   This is useful when testing container locally.
 
-- XEN_CONFIG_EXPERT: If this is defined in your shell it will be
-  automatically passed through to the container.
-
 If your docker host has Linux kernel > 4.11, and you want to use containers
 that run old glibc (for example, CentOS 6 or SLES11SP4), you may need to add
diff --git a/automation/scripts/build b/automation/scripts/build
index 281f8b1fcc..21b3bc57c8 100755
--- a/automation/scripts/build
+++ b/automation/scripts/build
@@ -91,6 +91,6 @@ for cfg in `ls ${cfg_dir}`; do
     echo "Building $cfg"
     make -j$(nproc) -C xen clean
     rm -f xen/.config
-    make -C xen KBUILD_DEFCONFIG=../../../../${cfg_dir}/${cfg} XEN_CONFIG_EXPERT=y defconfig
-    make -j$(nproc) -C xen XEN_CONFIG_EXPERT=y
+    make -C xen KBUILD_DEFCONFIG=../../../../${cfg_dir}/${cfg} defconfig
+    make -j$(nproc) -C xen
 done
diff --git a/automation/scripts/containerize b/automation/scripts/containerize
index 8992c67278..9d4beca4fa 100755
--- a/automation/scripts/containerize
+++ b/automation/scripts/containerize
@@ -101,7 +101,6 @@ exec ${docker_cmd} run \
     -v "${CONTAINER_PATH}":/build:rw${selinux} \
     -v "${HOME}/.ssh":/root/.ssh:ro \
     ${SSH_AUTH_DIR:+-v "${SSH_AUTH_DIR}":/tmp/ssh-agent${selinux}} \
-    ${XEN_CONFIG_EXPERT:+-e XEN_CONFIG_EXPERT=${XEN_CONFIG_EXPERT}} \
     ${CONTAINER_ARGS} \
     -${termint}i --rm -- \
     ${CONTAINER} \
--
2.34.1
[qemu-mainline test] 171544: regressions - FAIL
flight 171544 qemu-mainline real [real]
flight 171549 qemu-mainline real-retest [real]
http://logs.test-lab.xenproject.org/osstest/logs/171544/
http://logs.test-lab.xenproject.org/osstest/logs/171549/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-xl-qcow2  21 guest-start/debian.repeat  fail REGR. vs. 171539

Tests which are failing intermittently (not blocking):
 test-amd64-amd64-libvirt-vhd  19 guest-start/debian.repeat  fail pass in 171549-retest

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-qemuu-win7-amd64  19 guest-stop  fail like 171539
 test-armhf-armhf-libvirt  16 saverestore-support-check  fail like 171539
 test-amd64-amd64-qemuu-nested-amd  20 debian-hvm-install/l1/l2  fail like 171539
 test-armhf-armhf-libvirt-qcow2  15 saverestore-support-check  fail like 171539
 test-amd64-i386-xl-qemuu-win7-amd64  19 guest-stop  fail like 171539
 test-armhf-armhf-libvirt-raw  15 saverestore-support-check  fail like 171539
 test-amd64-i386-xl-qemuu-ws16-amd64  19 guest-stop  fail like 171539
 test-amd64-amd64-xl-qemuu-ws16-amd64  19 guest-stop  fail like 171539
 test-amd64-i386-xl-pvshim  14 guest-start  fail never pass
 test-amd64-amd64-libvirt  15 migrate-support-check  fail never pass
 test-amd64-i386-libvirt-xsm  15 migrate-support-check  fail never pass
 test-amd64-i386-libvirt  15 migrate-support-check  fail never pass
 test-arm64-arm64-xl-credit2  15 migrate-support-check  fail never pass
 test-arm64-arm64-xl  15 migrate-support-check  fail never pass
 test-arm64-arm64-xl-credit2  16 saverestore-support-check  fail never pass
 test-arm64-arm64-xl  16 saverestore-support-check  fail never pass
 test-arm64-arm64-libvirt-xsm  15 migrate-support-check  fail never pass
 test-arm64-arm64-libvirt-xsm  16 saverestore-support-check  fail never pass
 test-arm64-arm64-xl-xsm  15 migrate-support-check  fail never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-check  fail never pass
 test-arm64-arm64-xl-thunderx  15 migrate-support-check  fail never pass
 test-arm64-arm64-xl-thunderx  16 saverestore-support-check  fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm  13 migrate-support-check  fail never pass
 test-armhf-armhf-xl-arndale  15 migrate-support-check  fail never pass
 test-armhf-armhf-xl-arndale  16 saverestore-support-check  fail never pass
 test-amd64-amd64-libvirt-xsm  15 migrate-support-check  fail never pass
 test-amd64-i386-libvirt-raw  14 migrate-support-check  fail never pass
 test-arm64-arm64-libvirt-raw  14 migrate-support-check  fail never pass
 test-arm64-arm64-libvirt-raw  15 saverestore-support-check  fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm  13 migrate-support-check  fail never pass
 test-amd64-amd64-libvirt-vhd  14 migrate-support-check  fail never pass
 test-armhf-armhf-xl  15 migrate-support-check  fail never pass
 test-armhf-armhf-xl  16 saverestore-support-check  fail never pass
 test-armhf-armhf-xl-multivcpu  15 migrate-support-check  fail never pass
 test-armhf-armhf-xl-multivcpu  16 saverestore-support-check  fail never pass
 test-armhf-armhf-xl-cubietruck  15 migrate-support-check  fail never pass
 test-armhf-armhf-xl-cubietruck  16 saverestore-support-check  fail never pass
 test-arm64-arm64-xl-vhd  14 migrate-support-check  fail never pass
 test-armhf-armhf-xl-credit2  15 migrate-support-check  fail never pass
 test-arm64-arm64-xl-vhd  15 saverestore-support-check  fail never pass
 test-armhf-armhf-xl-credit2  16 saverestore-support-check  fail never pass
 test-armhf-armhf-xl-credit1  15 migrate-support-check  fail never pass
 test-armhf-armhf-xl-credit1  16 saverestore-support-check  fail never pass
 test-armhf-armhf-libvirt  15 migrate-support-check  fail never pass
 test-armhf-armhf-xl-rtds  15 migrate-support-check  fail never pass
 test-armhf-armhf-xl-rtds  16 saverestore-support-check  fail never pass
 test-arm64-arm64-xl-credit1  15 migrate-support-check  fail never pass
 test-arm64-arm64-xl-credit1  16 saverestore-support-check  fail never pass
 test-arm64-arm64-xl-seattle  15 migrate-support-check  fail never pass
 test-arm64-arm64-xl-seattle  16 saverestore-support-check  fail never pass
 test-armhf-armhf-libvirt-qcow2  14 migrate-support-check  fail never pass
 test-armhf-armhf-xl-vhd  14 migrate-support-check  fail never pass
 test-armhf-armhf-xl-vhd  15 saverestore-support-check  fail never pass
 test-armhf-armhf-libvirt-raw  14 migrate-support-check  fail never pass

version targeted for testing: qemuu
Re: [PATCH v5 2/6] evtchn: convert domain event lock to an r/w one
Hi Jan,

As discussed in [1], I think it would be good to revive this patch. AFAICT, this patch was dropped because the performance benefit was thought to be minimal. However, I think it would be a better way to resolve the problem that one is trying to address in [1].

So I will do another review of this patch.

On 27/01/2021 08:16, Jan Beulich wrote:
> Especially for the use in evtchn_move_pirqs() (called when moving a vCPU
> across pCPU-s) and the ones in EOI handling in PCI pass-through code,
> serializing perhaps an entire domain isn't helpful when no state (which
> isn't e.g. further protected by the per-channel lock) changes.
>
> Unfortunately this implies dropping of lock profiling for this lock,
> until r/w locks may get enabled for such functionality.
>
> While ->notify_vcpu_id is now meant to be consistently updated with the
> per-channel lock held, an extension applies to ECS_PIRQ: The field is
> also guaranteed to not change with the per-domain event lock held for
> writing. Therefore the link_pirq_port() call from evtchn_bind_pirq()
> could in principle be moved out of the per-channel locked regions, but
> this further code churn didn't seem worth it.

This doesn't seem to apply on upstream anymore. Would you be able to respin it?

I have looked at the places where you use read_lock() rather than write_lock(). They all look fine to me, so I would be fine to give my reviewed-by on the next version (assuming there is nothing wrong with the rebase :)).

Cheers,

[1] https://lore.kernel.org/xen-devel/acd0dfae-b045-8505-3f6c-30ce72653...@suse.com/

--
Julien Grall
Re: PCI pass-through problem for SN570 NVME SSD
On 07.07.2022 17:24, G.R. wrote: > On Wed, Jul 6, 2022 at 2:33 PM Jan Beulich wrote: >> >> On 06.07.2022 08:25, G.R. wrote: >>> On Tue, Jul 5, 2022 at 7:59 PM Jan Beulich wrote: Nothing useful in there. Yet independent of that I guess we need to separate the issues you're seeing. Otherwise it'll be impossible to know what piece of data belongs where. >>> Yep, I think I'm seeing several different issues here: >>> 1. The FLR related DPC / AER message seen on the 1st attempt only when >>> pciback tries to seize and release the SN570 >>> - Later-on pciback operations appear just fine. >>> 2. MSI-X preparation failure message that shows up each time the SN570 >>> is seized by pciback or when it's passed to domU. >>> 3. XEN tries to map BAR from two devices to the same page >>> 4. The "write-back to unknown field" message in QEMU log that goes >>> away with permissive=1 passthrough config. >>> 5. The "irq 16: nobody cared" message shows up *sometimes* in a >>> pattern that I haven't figured out (See attached) >>> 6. The FreeBSD domU sees the device but fails to use it because low >>> level commands sent to it are aborted. >>> 7. The device does not return to the pci-assignable-list when the domU >>> it was assigned shuts-down. (See attached) >>> >>> #3 appears to be a known issue that could be worked around with >>> patches from the list. >>> I suspect #1 may have something to do with the device itself. It's >>> still not clear if it's deadly or just annoying. >>> I was able to update the firmware to the latest version and confirmed >>> that the new firmware didn't make any noticeable difference. >>> >>> I suspect issue #2, #4, #5, #6, #7 may be related, and the >>> pass-through was not completely successful... >>> >>> Should I expect a debug build of XEN hypervisor to give better >>> diagnose messages, without the debug patch that Roger mentioned? 
>> >> Well, "expect" is perhaps too much to say, but with problems like >> yours (and even more so with multiple ones) using a debug >> hypervisor (or kernel, if there such a build mode existed) is imo >> always a good idea. As is using as up-to-date a version as >> possible. > > I built both 4.14.3 debug version and 4.16.1 release version for > testing purposes. > Unfortunately they gave me absolutely zero information, since both of > them are not able to get through issue #1 > the FlR related DPC / AER issue. > With 4.16.1 release, it actually can survive the 'xl > pci-assignable-add' which triggers the first AER failure. Then that's what needs debugging first. Yet from all I've seen so far I'm not sure who one the Xen side could be doing that, the more without themselves being able to repro - this seems more like a Linux side issue (and even outside of the pciback driver). > But the 'xl pci-assignable-remove' will lead to xl segmentation fault... >> [ 655.041442] xl[975]: segfault at 0 ip 7f2cccdaf71f sp >> 7ffd73a3d4d0 error 4 in libxenlight.so.4.16.0[7f2cccd92000+7c000] >> [ 655.041460] Code: 61 06 00 eb 13 66 0f 1f 44 00 00 83 c3 01 39 5c 24 2c >> 0f 86 1b 01 00 00 48 8b 34 24 89 d8 4d 89 f9 4d 89 f0 4c 89 e9 4c 89 e2 <48> >> 8b 3c c6 31 c0 48 89 ee e8 53 44 fe ff 83 f8 04 75 ce 48 8b 44 That'll need debugging. Cc-ing Anthony for awareness, but I'm sure he'll need more data to actually stand a chance of doing something about it. Is there any chance you could be doing some debugging work yourself, at the very least to figure out where this (apparent) NULL deref is happening? Jan
Re: PCI pass-through problem for SN570 NVME SSD
On 07.07.2022 17:36, G.R. wrote: > On Thu, Jul 7, 2022 at 11:24 PM G.R. wrote: >> >> On Wed, Jul 6, 2022 at 2:33 PM Jan Beulich wrote: >>> Should I expect a debug build of XEN hypervisor to give better diagnose messages, without the debug patch that Roger mentioned? >>> >>> Well, "expect" is perhaps too much to say, but with problems like >>> yours (and even more so with multiple ones) using a debug >>> hypervisor (or kernel, if there such a build mode existed) is imo >>> always a good idea. As is using as up-to-date a version as >>> possible. >> >> I built both 4.14.3 debug version and 4.16.1 release version for >> testing purposes. >> Unfortunately they gave me absolutely zero information, since both of >> them are not able to get through issue #1 >> the FlR related DPC / AER issue. >> With 4.16.1 release, it actually can survive the 'xl >> pci-assignable-add' which triggers the first AER failure. >> But the 'xl pci-assignable-remove' will lead to xl segmentation fault... >>> [ 655.041442] xl[975]: segfault at 0 ip 7f2cccdaf71f sp >>> 7ffd73a3d4d0 error 4 in libxenlight.so.4.16.0[7f2cccd92000+7c000] >>> [ 655.041460] Code: 61 06 00 eb 13 66 0f 1f 44 00 00 83 c3 01 39 5c 24 2c >>> 0f 86 1b 01 00 00 48 8b 34 24 89 d8 4d 89 f9 4d 89 f0 4c 89 e9 4c 89 e2 >>> <48> 8b 3c c6 31 c0 48 89 ee e8 53 44 fe ff 83 f8 04 75 ce 48 8b 44 >> Since I'll need a couple of pci-assignable-add && >> pci-assignable-remove to get to a seemingly normal state, I cannot >> proceed from here. >> >> With 4.14.3 debug build, the hypervisor / dom0 reboots on 'xl >> pci-assignable-add'. 
>> >> [ 574.623143] pciback :05:00.0: xen_pciback: resetting (FLR, D3, >> etc) the device >> [ 574.623203] pcieport :00:1d.0: DPC: containment event, >> status:0x1f11 source:0x >> [ 574.623204] pcieport :00:1d.0: DPC: unmasked uncorrectable error >> detected >> [ 574.623209] pcieport :00:1d.0: PCIe Bus Error: >> severity=Uncorrected (Non-Fatal), type=Transaction Layer, (Receiver >> ID) >> [ 574.623240] pcieport :00:1d.0: device [8086:a330] error >> status/mask=0020/0001 >> [ 574.623261] pcieport :00:1d.0:[21] ACSViol(First) >> [ 575.855026] pciback :05:00.0: not ready 1023ms after FLR; waiting >> [ 576.895015] pciback :05:00.0: not ready 2047ms after FLR; waiting >> [ 579.028311] pciback :05:00.0: not ready 4095ms after FLR; waiting >> [ 583.294910] pciback :05:00.0: not ready 8191ms after FLR; waiting >> [ 591.614965] pciback :05:00.0: not ready 16383ms after FLR; waiting >> [ 609.534502] pciback :05:00.0: not ready 32767ms after FLR; waiting >> [ 643.667069] pciback :05:00.0: not ready 65535ms after FLR; giving up >> //<===The reboot happens somewhere here, not immediately, but >> after a while... >> //Maybe I can get something from xl dmesg if I was quick enough and >> have connected from a second terminal... > > Unfortunately I didn't see anything from xl dmesg... > I wish the 'xl dmesg' can support the follow mode (dmesg -w) that the > Linux dmesg does. > Here I have to manually repeat this command. The machine suddenly > freezes after the 'giving up' message is out. > I see nothing special in the log. Maybe I'm just not lucky enough to > catch the output, not sure. If the box reboots in the middle, I guess you really want to hook up a serial console. Jan
PING: [PATCH v6 0/3] amd/msr: implement MSR_VIRT_SPEC_CTRL for HVM guests
On 17.05.2022 17:31, Roger Pau Monne wrote:
> Roger Pau Monne (3):
>   amd/msr: implement VIRT_SPEC_CTRL for HVM guests on top of SPEC_CTRL
>   amd/msr: allow passthrough of VIRT_SPEC_CTRL for HVM guests
>   amd/msr: implement VIRT_SPEC_CTRL for HVM guests using legacy SSBD

While, somewhat different from Jürgen's series, here the delay is more voluntary (in that I had told Roger that I'd prefer to commit this only with your (perhaps informal) agreement), I think we also can't wait much longer. I'm willing to give it until the end of next week, so I guess I'd move forward with committing early the week after, unless I hear substantial arguments against doing so (at which point the two of us would likely need to decide who's going to pick up this work while Roger is away).

Once again thanks for your understanding,
Jan
Re: [PATCH v6 2/9] xen: harmonize return types of hypercall handlers
On Wed, Jul 6, 2022 at 12:22 PM Christopher Clark wrote: > > On Tue, Jun 28, 2022 at 11:24 PM Juergen Gross wrote: > > > > On 24.03.22 15:01, Juergen Gross wrote: > > > Today most hypercall handlers have a return type of long, while the > > > compat ones return an int. There are a few exceptions from that rule, > > > however. > > > > > > Get rid of the exceptions by letting compat handlers always return int > > > and others always return long, with the exception of the Arm specific > > > physdev_op handler. > > > > > > For the compat hvm case use eax instead of rax for the stored result as > > > it should have been from the beginning. > > > > > > Additionally move some prototypes to include/asm-x86/hypercall.h > > > as they are x86 specific. Move the compat_platform_op() prototype to > > > the common header. > > > > > > Rename paging_domctl_continuation() to do_paging_domctl_cont() and add > > > a matching define for the associated hypercall. > > > > > > Make do_callback_op() and compat_callback_op() more similar by adding > > > the const attribute to compat_callback_op()'s 2nd parameter. > > > > > > Change the type of the cmd parameter for [do|compat]_kexec_op() to > > > unsigned int, as this is more appropriate for the compat case. > > > > > > Signed-off-by: Juergen Gross > > > Reviewed-by: Jan Beulich > > > > Could I please have some feedback regarding the kexec and argo changes? > > Thanks for the ping on this and apologies for the delay. The > Argo-related changes in this patch look ok, and I have built and > tested Argo functionality with this patch applied with PV 32 and 64 > bit guests, using XTF -- with successful results for the patch -- but > I am behind on exercising it on a system that can run HVM guests, > sorry. Given the HVM-related change described in the description of > the patch, I think that such a test is needed and I am working on > getting a build and a system installed to get that done. 
As discussed on the Xen Community Call today, for the Argo changes: Reviewed-by: Christopher Clark and tested for 32 and 64-bit PV guests on x86. thanks, Christopher > > Christopher > > > > > > > Juergen > > > > > --- > > > V2: > > > - rework platform_op compat handling (Jan Beulich) > > > V3: > > > - remove include of types.h (Jan Beulich) > > > V4: > > > - don't move do_physdev_op() (Julien Grall) > > > - carve out non style compliant parameter replacements (Julien Grall) > > > V6: > > > - remove rebase artifact (Jan Beulich) > > > --- > > > xen/arch/x86/domctl.c| 4 ++-- > > > xen/arch/x86/hvm/hypercall.c | 8 ++- > > > xen/arch/x86/hypercall.c | 2 +- > > > xen/arch/x86/include/asm/hypercall.h | 31 ++-- > > > xen/arch/x86/include/asm/paging.h| 3 --- > > > xen/arch/x86/mm/paging.c | 3 ++- > > > xen/arch/x86/pv/callback.c | 14 ++--- > > > xen/arch/x86/pv/emul-priv-op.c | 2 +- > > > xen/arch/x86/pv/hypercall.c | 5 + > > > xen/arch/x86/pv/iret.c | 4 ++-- > > > xen/arch/x86/pv/misc-hypercalls.c| 14 - > > > xen/common/argo.c| 6 +++--- > > > xen/common/kexec.c | 6 +++--- > > > xen/include/xen/hypercall.h | 20 -- > > > 14 files changed, 58 insertions(+), 64 deletions(-) > > > > > > diff --git a/xen/arch/x86/domctl.c b/xen/arch/x86/domctl.c > > > index e49f9e91b9..ea7d60ffb6 100644 > > > --- a/xen/arch/x86/domctl.c > > > +++ b/xen/arch/x86/domctl.c > > > @@ -221,8 +221,8 @@ long arch_do_domctl( > > > case XEN_DOMCTL_shadow_op: > > > ret = paging_domctl(d, >u.shadow_op, u_domctl, 0); > > > if ( ret == -ERESTART ) > > > -return hypercall_create_continuation(__HYPERVISOR_arch_1, > > > - "h", u_domctl); > > > +return hypercall_create_continuation( > > > + __HYPERVISOR_paging_domctl_cont, "h", u_domctl); > > > copyback = true; > > > break; > > > > > > diff --git a/xen/arch/x86/hvm/hypercall.c b/xen/arch/x86/hvm/hypercall.c > > > index 62b5349e7d..3a35543997 100644 > > > --- a/xen/arch/x86/hvm/hypercall.c > > > +++ b/xen/arch/x86/hvm/hypercall.c > > > @@ -124,8 +124,6 @@ static 
long cf_check hvm_physdev_op(int cmd, > > > XEN_GUEST_HANDLE_PARAM(void) arg) > > > [ __HYPERVISOR_ ## x ] = { (hypercall_fn_t *) do_ ## x, \ > > > (hypercall_fn_t *) compat_ ## x } > > > > > > -#define do_arch_1 paging_domctl_continuation > > > - > > > static const struct { > > > hypercall_fn_t *native, *compat; > > > } hvm_hypercall_table[] = { > > > @@ -158,11 +156,9 @@ static const struct { > > > #ifdef CONFIG_HYPFS > > > HYPERCALL(hypfs_op), > > > #endif > > > -HYPERCALL(arch_1) > > > +HYPERCALL(paging_domctl_cont) > > > }; > > > > > > -#undef do_arch_1 > > >
***PING***: [PATCH v6 0/9] xen: drop hypercall function tables
Andrew,

On 24.03.2022 15:01, Juergen Gross wrote:
> Juergen Gross (9):
>   xen: move do_vcpu_op() to arch specific code
>   xen: harmonize return types of hypercall handlers
>   xen: don't include asm/hypercall.h from C sources
>   xen: include compat/platform.h from hypercall.h
>   xen: generate hypercall interface related code
>   xen: use generated prototypes for hypercall handlers
>   xen/x86: call hypercall handlers via generated macro
>   xen/arm: call hypercall handlers via generated macro
>   xen/x86: remove cf_check attribute from hypercall handlers

we've discussed the state of this on the Community Call today. Unfortunately you weren't there. It was common consensus that we've waited long enough here, so unless very good reasons (including a timeline) appear very quickly, the plan is to commit the series (with REST acks to stand in for the few small areas where acks are still missing) early next week. Should performance issues really turn out excessively bad, we can still consider reverting down the road; I don't expect we would want to go that route though, and rather make incremental adjustments as necessary.

Thanks for your understanding,
Jan
Re: PCI pass-through problem for SN570 NVME SSD
On Thu, Jul 7, 2022 at 11:24 PM G.R. wrote: > > On Wed, Jul 6, 2022 at 2:33 PM Jan Beulich wrote: > > > > > Should I expect a debug build of XEN hypervisor to give better > > > diagnose messages, without the debug patch that Roger mentioned? > > > > Well, "expect" is perhaps too much to say, but with problems like > > yours (and even more so with multiple ones) using a debug > > hypervisor (or kernel, if there such a build mode existed) is imo > > always a good idea. As is using as up-to-date a version as > > possible. > > I built both 4.14.3 debug version and 4.16.1 release version for > testing purposes. > Unfortunately they gave me absolutely zero information, since both of > them are not able to get through issue #1 > the FlR related DPC / AER issue. > With 4.16.1 release, it actually can survive the 'xl > pci-assignable-add' which triggers the first AER failure. > But the 'xl pci-assignable-remove' will lead to xl segmentation fault... > >[ 655.041442] xl[975]: segfault at 0 ip 7f2cccdaf71f sp > >7ffd73a3d4d0 error 4 in libxenlight.so.4.16.0[7f2cccd92000+7c000] > >[ 655.041460] Code: 61 06 00 eb 13 66 0f 1f 44 00 00 83 c3 01 39 5c 24 2c > >0f 86 1b 01 00 00 48 8b 34 24 89 d8 4d 89 f9 4d 89 f0 4c 89 e9 4c 89 e2 <48> > >8b 3c c6 31 c0 48 89 ee e8 53 44 fe ff 83 f8 04 75 ce 48 8b 44 > Since I'll need a couple of pci-assignable-add && > pci-assignable-remove to get to a seemingly normal state, I cannot > proceed from here. > > With 4.14.3 debug build, the hypervisor / dom0 reboots on 'xl > pci-assignable-add'. 
> > [ 574.623143] pciback :05:00.0: xen_pciback: resetting (FLR, D3, > etc) the device > [ 574.623203] pcieport :00:1d.0: DPC: containment event, > status:0x1f11 source:0x > [ 574.623204] pcieport :00:1d.0: DPC: unmasked uncorrectable error > detected > [ 574.623209] pcieport :00:1d.0: PCIe Bus Error: > severity=Uncorrected (Non-Fatal), type=Transaction Layer, (Receiver > ID) > [ 574.623240] pcieport :00:1d.0: device [8086:a330] error > status/mask=0020/0001 > [ 574.623261] pcieport :00:1d.0:[21] ACSViol(First) > [ 575.855026] pciback :05:00.0: not ready 1023ms after FLR; waiting > [ 576.895015] pciback :05:00.0: not ready 2047ms after FLR; waiting > [ 579.028311] pciback :05:00.0: not ready 4095ms after FLR; waiting > [ 583.294910] pciback :05:00.0: not ready 8191ms after FLR; waiting > [ 591.614965] pciback :05:00.0: not ready 16383ms after FLR; waiting > [ 609.534502] pciback :05:00.0: not ready 32767ms after FLR; waiting > [ 643.667069] pciback :05:00.0: not ready 65535ms after FLR; giving up > //<===The reboot happens somewhere here, not immediately, but > after a while... > //Maybe I can get something from xl dmesg if I was quick enough and > have connected from a second terminal... Unfortunately I didn't see anything from xl dmesg... I wish the 'xl dmesg' can support the follow mode (dmesg -w) that the Linux dmesg does. Here I have to manually repeat this command. The machine suddenly freezes after the 'giving up' message is out. I see nothing special in the log. Maybe I'm just not lucky enough to catch the output, not sure.
Re: [PATCH v4] xen/arm: avoid overflow when setting vtimer in context switch
Hi Julien,

> On 7 Jul 2022, at 16:33, Julien Grall wrote:
>
> Hi Jiamei,
>
> On 06/07/2022 09:25, Jiamei Xie wrote:
>> virt_vtimer_save() will calculate the next deadline when the vCPU is
>> scheduled out. At the moment, Xen will use the following equation:
>> virt_timer.cval + virt_time_base.offset - boot_count
>> The three values are 64-bit and one (cval) is controlled by domain. In
>> theory, it would be possible that the domain has started a long time
>> after the system boot. So virt_time_base.offset - boot_count may be a
>> large numbers.
>> This means a domain may inadvertently set a cval so the result would
>> overflow. Consequently, the deadline would be set very far in the
>> future. This could result to loss of timer interrupts or the vCPU
>> getting block "forever".
>> One way to solve the problem, would be to separately
>> 1) compute when the domain was created in ns
>> 2) convert cval to ns
>> 3) Add 1 and 2 together
>> The first part of the equation never change (the value is set/known at
>> domain creation). So take the opportunity to store it in domain structure.
>> Signed-off-by: Jiamei Xie
>
> Reviewed-by: Julien Grall
>
> The commit message is my own, I would like to Bertrand or Stefano to confirm
> they are happy with it :).

I am ok with it so:
Reviewed-by: Bertrand Marquis

Cheers
Bertrand

>
> Cheers,
>
> --
> Julien Grall
Re: [PATCH v4] xen/arm: avoid overflow when setting vtimer in context switch
Hi Jiamei,

On 06/07/2022 09:25, Jiamei Xie wrote:
> virt_vtimer_save() will calculate the next deadline when the vCPU is
> scheduled out. At the moment, Xen will use the following equation:
> virt_timer.cval + virt_time_base.offset - boot_count
> The three values are 64-bit and one (cval) is controlled by domain. In
> theory, it would be possible that the domain has started a long time
> after the system boot. So virt_time_base.offset - boot_count may be a
> large numbers.
> This means a domain may inadvertently set a cval so the result would
> overflow. Consequently, the deadline would be set very far in the
> future. This could result to loss of timer interrupts or the vCPU
> getting block "forever".
> One way to solve the problem, would be to separately
> 1) compute when the domain was created in ns
> 2) convert cval to ns
> 3) Add 1 and 2 together
> The first part of the equation never change (the value is set/known at
> domain creation). So take the opportunity to store it in domain structure.
> Signed-off-by: Jiamei Xie

Reviewed-by: Julien Grall

The commit message is my own, I would like to Bertrand or Stefano to confirm they are happy with it :).

Cheers,

--
Julien Grall
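
The three steps listed in the commit message can be sketched as follows. This is a minimal illustration and not the code from the patch: the structure, the function names, and the `freq_hz` parameter are hypothetical, and the sketch only shows the shape of the computation (convert each part to nanoseconds, then add), without trying to reproduce the overflow itself.

```c
#include <assert.h>
#include <stdint.h>

#define NS_PER_SEC 1000000000ULL

/*
 * Convert counter ticks to nanoseconds for a counter running at freq_hz.
 * Whole seconds and the sub-second remainder are converted separately so
 * the intermediate product fits in 64 bits for frequencies up to a few
 * GHz. The final result still wraps if it does not fit in 64 bits.
 */
static uint64_t ticks_to_ns(uint64_t ticks, uint64_t freq_hz)
{
    assert(freq_hz != 0);
    return (ticks / freq_hz) * NS_PER_SEC +
           (ticks % freq_hz) * NS_PER_SEC / freq_hz;
}

/* Hypothetical per-domain state; the field name is illustrative only. */
struct vtimer_base {
    uint64_t created_ns;  /* step 1: domain creation time since boot, in ns */
};

/* Step 1, done once at domain creation: this part never changes later. */
static void vtimer_base_init(struct vtimer_base *b, uint64_t offset,
                             uint64_t boot_count, uint64_t freq_hz)
{
    b->created_ns = ticks_to_ns(offset - boot_count, freq_hz);
}

/* Steps 2 and 3, done when the vCPU is scheduled out. */
static uint64_t vtimer_deadline_ns(const struct vtimer_base *b,
                                   uint64_t cval, uint64_t freq_hz)
{
    return b->created_ns + ticks_to_ns(cval, freq_hz);
}
```

For example, with an assumed 62.5 MHz counter, a domain created 625000000 ticks after boot gets `created_ns` of 10 s, and a `cval` of 62500000 ticks adds 1 s, giving a deadline of 11 s after boot.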
Re: PCI pass-through problem for SN570 NVME SSD
On Wed, Jul 6, 2022 at 2:33 PM Jan Beulich wrote: > > On 06.07.2022 08:25, G.R. wrote: > > On Tue, Jul 5, 2022 at 7:59 PM Jan Beulich wrote: > >> Nothing useful in there. Yet independent of that I guess we need to > >> separate the issues you're seeing. Otherwise it'll be impossible to > >> know what piece of data belongs where. > > Yep, I think I'm seeing several different issues here: > > 1. The FLR related DPC / AER message seen on the 1st attempt only when > > pciback tries to seize and release the SN570 > > - Later-on pciback operations appear just fine. > > 2. MSI-X preparation failure message that shows up each time the SN570 > > is seized by pciback or when it's passed to domU. > > 3. XEN tries to map BAR from two devices to the same page > > 4. The "write-back to unknown field" message in QEMU log that goes > > away with permissive=1 passthrough config. > > 5. The "irq 16: nobody cared" message shows up *sometimes* in a > > pattern that I haven't figured out (See attached) > > 6. The FreeBSD domU sees the device but fails to use it because low > > level commands sent to it are aborted. > > 7. The device does not return to the pci-assignable-list when the domU > > it was assigned shuts-down. (See attached) > > > > #3 appears to be a known issue that could be worked around with > > patches from the list. > > I suspect #1 may have something to do with the device itself. It's > > still not clear if it's deadly or just annoying. > > I was able to update the firmware to the latest version and confirmed > > that the new firmware didn't make any noticeable difference. > > > > I suspect issue #2, #4, #5, #6, #7 may be related, and the > > pass-through was not completely successful... > > > > Should I expect a debug build of XEN hypervisor to give better > > diagnose messages, without the debug patch that Roger mentioned? 
> > Well, "expect" is perhaps too much to say, but with problems like > yours (and even more so with multiple ones) using a debug > hypervisor (or kernel, if there such a build mode existed) is imo > always a good idea. As is using as up-to-date a version as > possible. I built both 4.14.3 debug version and 4.16.1 release version for testing purposes. Unfortunately they gave me absolutely zero information, since both of them are not able to get through issue #1 the FlR related DPC / AER issue. With 4.16.1 release, it actually can survive the 'xl pci-assignable-add' which triggers the first AER failure. But the 'xl pci-assignable-remove' will lead to xl segmentation fault... >[ 655.041442] xl[975]: segfault at 0 ip 7f2cccdaf71f sp 7ffd73a3d4d0 >error 4 in libxenlight.so.4.16.0[7f2cccd92000+7c000] >[ 655.041460] Code: 61 06 00 eb 13 66 0f 1f 44 00 00 83 c3 01 39 5c 24 2c 0f >86 1b 01 00 00 48 8b 34 24 89 d8 4d 89 f9 4d 89 f0 4c 89 e9 4c 89 e2 <48> 8b >3c c6 31 c0 48 89 ee e8 53 44 fe ff 83 f8 04 75 ce 48 8b 44 Since I'll need a couple of pci-assignable-add && pci-assignable-remove to get to a seemingly normal state, I cannot proceed from here. With 4.14.3 debug build, the hypervisor / dom0 reboots on 'xl pci-assignable-add'. 
[ 574.623143] pciback :05:00.0: xen_pciback: resetting (FLR, D3, etc) the device [ 574.623203] pcieport :00:1d.0: DPC: containment event, status:0x1f11 source:0x [ 574.623204] pcieport :00:1d.0: DPC: unmasked uncorrectable error detected [ 574.623209] pcieport :00:1d.0: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, (Receiver ID) [ 574.623240] pcieport :00:1d.0: device [8086:a330] error status/mask=0020/0001 [ 574.623261] pcieport :00:1d.0:[21] ACSViol(First) [ 575.855026] pciback :05:00.0: not ready 1023ms after FLR; waiting [ 576.895015] pciback :05:00.0: not ready 2047ms after FLR; waiting [ 579.028311] pciback :05:00.0: not ready 4095ms after FLR; waiting [ 583.294910] pciback :05:00.0: not ready 8191ms after FLR; waiting [ 591.614965] pciback :05:00.0: not ready 16383ms after FLR; waiting [ 609.534502] pciback :05:00.0: not ready 32767ms after FLR; waiting [ 643.667069] pciback :05:00.0: not ready 65535ms after FLR; giving up //<===The reboot happens somewhere here, not immediately, but after a while... //Maybe I can get something from xl dmesg if I was quick enough and have connected from a second terminal... [ 644.773922] pciback :05:00.0: xen_pciback: reset device [ 644.774050] pciback :05:00.0: xen_pciback: xen_pcibk_error_detected(bus:5,devfn:0) [ 644.774051] pciback :05:00.0: xen_pciback: device is not found/assigned [ 644.923432] pciback :05:00.0: xen_pciback: xen_pcibk_error_resume(bus:5,devfn:0) [ 644.923437] pciback :05:00.0: xen_pciback: device is not found/assigned [ 644.923616] pcieport :00:1d.0: AER: device recovery successful > > Jan
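
Two details of the kernel log quoted in this message can be checked with a little arithmetic. The sketch below is an illustration with hypothetical function names. The first helper turns the `ip` and the `name[start+size]` fields of the segfault line into an offset inside the mapping of libxenlight; depending on how the library is mapped, the mapping's file offset may still have to be added before handing the value to a tool such as `addr2line`. The second helper reproduces the "not ready ... after FLR" values under the assumption that the wait loop doubles its delay each round, starts reporting once the delay exceeds 1000 ms, and gives up once it exceeds 60000 ms; both thresholds are assumptions chosen to match the log, not values stated in the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Offset of the faulting instruction from the start of the mapping that
 * the kernel prints as "name[start+size]" in a segfault line.
 */
static uint64_t fault_offset(uint64_t ip, uint64_t map_start,
                             uint64_t map_size)
{
    assert(ip >= map_start && ip - map_start < map_size);
    return ip - map_start;
}

/*
 * Fill out[] with the elapsed-time values (in ms) that a doubling wait
 * loop would report. The value reported in each round is delay - 1,
 * which equals the total time slept so far (1 + 2 + ... + delay/2).
 * The last value written corresponds to the "giving up" message.
 * Returns the number of values written.
 */
static size_t flr_wait_reports(unsigned int timeout_ms,
                               unsigned int report_after_ms,
                               unsigned int *out, size_t max)
{
    unsigned int delay = 1;
    size_t n = 0;

    assert(out != NULL);
    assert(timeout_ms < (1u << 30));  /* keep delay *= 2 from wrapping */

    for (;;) {
        if (delay > timeout_ms) {
            if (n < max)
                out[n++] = delay - 1;  /* "giving up" */
            return n;
        }
        if (delay > report_after_ms && n < max)
            out[n++] = delay - 1;      /* "waiting" */
        /* A real loop would sleep for `delay` ms here. */
        delay *= 2;
    }
}
```

Calling `fault_offset(0x7f2cccdaf71f, 0x7f2cccd92000, 0x7c000)` yields `0x1d71f`, and `flr_wait_reports(60000, 1000, ...)` yields 1023, 2047, 4095, 8191, 16383, 32767 and finally 65535, the same sequence as in the log.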
[linux-linus test] 171542: regressions - FAIL
flight 171542 linux-linus real [real]
http://logs.test-lab.xenproject.org/osstest/logs/171542/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-xl-credit1 8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-libvirt 8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-dom0pvh-xl-intel 8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-xl-credit2 8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-xl-qemuu-ws16-amd64 8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-xl-qemut-win7-amd64 8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-libvirt-pair 12 xen-boot/src_host fail REGR. vs. 171277
 test-amd64-amd64-libvirt-pair 13 xen-boot/dst_host fail REGR. vs. 171277
 test-amd64-amd64-libvirt-raw 8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-libvirt-qcow2 8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-freebsd11-amd64 8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-pygrub 8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-xl-qemuu-debianhvm-amd64 8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-xl 8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-qemuu-nested-amd 8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-shadow 8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-xl-qemuu-ovmf-amd64 8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-xl-multivcpu 8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-xl-vhd 8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-xl-qemuu-debianhvm-i386-xsm 8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-xl-xsm 8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-examine 8 reboot fail REGR. vs. 171277
 test-amd64-amd64-xl-qemut-ws16-amd64 8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-xl-qemuu-win7-amd64 8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-xl-qemut-debianhvm-i386-xsm 8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm 8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-qemuu-nested-intel 8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-xl-pvshim 8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-xl-pvhv2-intel 8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-libvirt-xsm 8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-xl-shadow 8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-xl-qemuu-dmrestrict-amd64-dmrestrict 8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-xl-pvhv2-amd 8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-examine-bios 8 reboot fail REGR. vs. 171277
 test-amd64-amd64-freebsd12-amd64 8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-examine-uefi 8 reboot fail REGR. vs. 171277
 test-amd64-amd64-dom0pvh-xl-amd 8 xen-boot fail REGR. vs. 171277
 test-amd64-coresched-amd64-xl 8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-xl-qemut-debianhvm-amd64 8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-pair 12 xen-boot/src_host fail REGR. vs. 171277
 test-amd64-amd64-pair 13 xen-boot/dst_host fail REGR. vs. 171277

Tests which are failing intermittently (not blocking):
 test-arm64-arm64-xl-vhd 18 guest-start.2 fail pass in 171537

Regressions which are regarded as allowable (not blocking):
 test-amd64-amd64-xl-rtds 8 xen-boot fail REGR. vs. 171277

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-libvirt 16 saverestore-support-check fail like 171277
 test-armhf-armhf-libvirt-qcow2 15 saverestore-support-check fail like 171277
 test-armhf-armhf-libvirt-raw 15 saverestore-support-check fail like 171277
 test-arm64-arm64-xl-seattle 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-seattle 16 saverestore-support-check fail never pass
 test-arm64-arm64-xl 15 migrate-support-check fail never pass
 test-arm64-arm64-xl 16 saverestore-support-check fail never pass
 test-arm64-arm64-libvirt-xsm 15 migrate-support-check fail never pass
 test-arm64-arm64-libvirt-xsm 16 saverestore-support-check fail never pass
 test-arm64-arm64-xl-xsm 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-xsm 16 saverestore-support-check fail never pass
 test-arm64-arm64-xl-credit2 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-credit2 16 saverestore-support-check fail never pass
 test-arm64-arm64-xl-thunderx 15
Re: [PATCH v10 2/2] x86/xen: Allow per-domain usage of hardware virtualized APIC
> On 13 Apr 2022, at 12:21, Jane Malalane wrote:
>
> Introduce a new per-domain creation x86 specific flag to
> select whether hardware assisted virtualization should be used for
> x{2}APIC.
>
> A per-domain option is added to xl in order to select the usage of
> x{2}APIC hardware assisted virtualization, as well as a global
> configuration option.
>
> Having all APIC interaction exit to Xen for emulation is slow and can
> induce much overhead. Hardware can speed up x{2}APIC by decoding the
> APIC access and providing a VM exit with a more specific exit reason
> than a regular EPT fault or by altogether avoiding a VM exit.
>
> On the other hand, being able to disable x{2}APIC hardware assisted
> virtualization can be useful for testing and debugging purposes.
>
> Note:
>
> - vmx_install_vlapic_mapping doesn't require modifications regardless
>   of whether the guest has "Virtualize APIC accesses" enabled or not,
>   i.e., setting the APIC_ACCESS_ADDR VMCS field is fine so long as
>   virtualize_apic_accesses is supported by the CPU.
>
> - Both per-domain and global assisted_x{2}apic options are not part of
>   the migration stream, unless explicitly set in the respective
>   configuration files. Default settings of assisted_x{2}apic done
>   internally by the toolstack, based on host capabilities at create
>   time, are not migrated.
>
> Suggested-by: Andrew Cooper
> Signed-off-by: Jane Malalane
> Reviewed-by: "Roger Pau Monné"

Golang bits:

Reviewed-by: George Dunlap
Re: [PATCH v10 1/2] xen+tools: Report Interrupt Controller Virtualization capabilities on x86
> On 13 Apr 2022, at 12:21, Jane Malalane wrote:
>
> Add XEN_SYSCTL_PHYSCAP_X86_ASSISTED_XAPIC and
> XEN_SYSCTL_PHYSCAP_X86_ASSISTED_X2APIC to report accelerated xAPIC and
> x2APIC, on x86 hardware. This is so that xAPIC and x2APIC virtualization
> can subsequently be enabled on a per-domain basis.
> No such features are currently implemented on AMD hardware.
>
> HW assisted xAPIC virtualization will be reported if HW, at the
> minimum, supports virtualize_apic_accesses as this feature alone means
> that an access to the APIC page will cause an APIC-access VM exit. An
> APIC-access VM exit provides a VMM with information about the access
> causing the VM exit, unlike a regular EPT fault, thus simplifying some
> internal handling.
>
> HW assisted x2APIC virtualization will be reported if HW supports
> virtualize_x2apic_mode and, at least, either apic_reg_virt or
> virtual_intr_delivery. This also means that
> sysctl follows the conditionals in vmx_vlapic_msr_changed().
>
> For that purpose, also add an arch-specific "capabilities" parameter
> to struct xen_sysctl_physinfo.
>
> Note that this interface is intended to be compatible with AMD so that
> AVIC support can be introduced in a future patch. Unlike Intel that
> has multiple controls for APIC Virtualization, AMD has one global
> 'AVIC Enable' control bit, so fine-graining of APIC virtualization
> control cannot be done on a common interface.
>
> Suggested-by: Andrew Cooper
> Signed-off-by: Jane Malalane
> Reviewed-by: "Roger Pau Monné"

Reviewed-by: George Dunlap

Sorry for the delay
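
The two reporting rules in the commit message can be restated as a small predicate. This is only a sketch under assumptions: the struct and the bit positions below are hypothetical stand-ins, not the actual Xen definitions (the patch itself uses Xen's VMX feature checks and the XEN_SYSCTL_PHYSCAP_X86_* constants, whose values are not shown in this thread).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit positions, chosen only for this illustration. */
#define CAP_ASSISTED_XAPIC   (1u << 0)
#define CAP_ASSISTED_X2APIC  (1u << 1)

/* Hypothetical summary of the VMX controls named in the commit message. */
struct vmx_features {
    bool virtualize_apic_accesses;
    bool virtualize_x2apic_mode;
    bool apic_reg_virt;
    bool virtual_intr_delivery;
};

/* Compute the arch-specific capability bits as the commit message describes. */
static uint32_t arch_capabilities(const struct vmx_features *f)
{
    uint32_t caps = 0;

    assert(f != NULL);

    /* xAPIC: virtualize_apic_accesses alone is sufficient. */
    if (f->virtualize_apic_accesses)
        caps |= CAP_ASSISTED_XAPIC;

    /* x2APIC: x2APIC mode plus at least one of the two helper controls. */
    if (f->virtualize_x2apic_mode &&
        (f->apic_reg_virt || f->virtual_intr_delivery))
        caps |= CAP_ASSISTED_X2APIC;

    return caps;
}
```

Note that virtualize_x2apic_mode on its own does not set the x2APIC bit in this sketch, which matches the "and, at least, either" wording above.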
Re: [PATCH] tools/init-xenstore-domain: fix memory map for PVH stubdom
On 07.07.22 16:45, Anthony PERARD wrote: On Fri, Jun 24, 2022 at 11:28:06AM +0200, Juergen Gross wrote: In case of maxmem != memsize the E820 map of the PVH stubdom is wrong, as it is missing the RAM above memsize. Additionally the MMIO area should only cover the HVM special pages. Signed-off-by: Juergen Gross --- tools/helpers/init-xenstore-domain.c | 16 ++-- 1 file changed, 10 insertions(+), 6 deletions(-) diff --git a/tools/helpers/init-xenstore-domain.c b/tools/helpers/init-xenstore-domain.c index b4f3c65a8a..dad8e43c42 100644 --- a/tools/helpers/init-xenstore-domain.c +++ b/tools/helpers/init-xenstore-domain.c @@ -157,21 +158,24 @@ static int build(xc_interface *xch) config.flags |= XEN_DOMCTL_CDF_hvm | XEN_DOMCTL_CDF_hap; config.arch.emulation_flags = XEN_X86_EMU_LAPIC; dom->target_pages = mem_size >> XC_PAGE_SHIFT; -dom->mmio_size = GB(4) - LAPIC_BASE_ADDRESS; +dom->mmio_size = X86_HVM_NR_SPECIAL_PAGES << XC_PAGE_SHIFT; dom->lowmem_end = (mem_size > LAPIC_BASE_ADDRESS) ? LAPIC_BASE_ADDRESS : mem_size; dom->highmem_end = (mem_size > LAPIC_BASE_ADDRESS) ? GB(4) + mem_size - LAPIC_BASE_ADDRESS : 0; -dom->mmio_start = LAPIC_BASE_ADDRESS; +dom->mmio_start = (X86_HVM_END_SPECIAL_REGION - + X86_HVM_NR_SPECIAL_PAGES) << XC_PAGE_SHIFT; dom->max_vcpus = 1; e820[0].addr = 0; -e820[0].size = dom->lowmem_end; +e820[0].size = (max_size > LAPIC_BASE_ADDRESS) ? + LAPIC_BASE_ADDRESS : max_size; e820[0].type = E820_RAM; -e820[1].addr = LAPIC_BASE_ADDRESS; +e820[1].addr = dom->mmio_start; So, it isn't expected to have an entry covering the LAPIC addresses, right? I guess not as seen in df1ca1dfe20. But based on that same commit info, shouldn't the LAPIC address part of `dom->mmio_start, dom->mmio_size`? (I don't know how dom->mmio_start is used, yet, but maybe it's used by Xen or xen libraries to avoid allocations in the wrong places) In my understanding this is the purpose of lowmem_end. OTOH I can modify the patch to be along the lines of df1ca1dfe20. 
Juergen
Re: [PATCH] tools/init-xenstore-domain: fix memory map for PVH stubdom
On Fri, Jun 24, 2022 at 11:28:06AM +0200, Juergen Gross wrote:
> In case of maxmem != memsize the E820 map of the PVH stubdom is wrong,
> as it is missing the RAM above memsize.
>
> Additionally the MMIO area should only cover the HVM special pages.
>
> Signed-off-by: Juergen Gross
> ---
> tools/helpers/init-xenstore-domain.c | 16 ++--
> 1 file changed, 10 insertions(+), 6 deletions(-)
>
> diff --git a/tools/helpers/init-xenstore-domain.c b/tools/helpers/init-xenstore-domain.c
> index b4f3c65a8a..dad8e43c42 100644
> --- a/tools/helpers/init-xenstore-domain.c
> +++ b/tools/helpers/init-xenstore-domain.c
> @@ -157,21 +158,24 @@ static int build(xc_interface *xch)
>  config.flags |= XEN_DOMCTL_CDF_hvm | XEN_DOMCTL_CDF_hap;
>  config.arch.emulation_flags = XEN_X86_EMU_LAPIC;
>  dom->target_pages = mem_size >> XC_PAGE_SHIFT;
> -dom->mmio_size = GB(4) - LAPIC_BASE_ADDRESS;
> +dom->mmio_size = X86_HVM_NR_SPECIAL_PAGES << XC_PAGE_SHIFT;
>  dom->lowmem_end = (mem_size > LAPIC_BASE_ADDRESS) ?
>      LAPIC_BASE_ADDRESS : mem_size;
>  dom->highmem_end = (mem_size > LAPIC_BASE_ADDRESS) ?
>      GB(4) + mem_size - LAPIC_BASE_ADDRESS : 0;
> -dom->mmio_start = LAPIC_BASE_ADDRESS;
> +dom->mmio_start = (X86_HVM_END_SPECIAL_REGION -
> +    X86_HVM_NR_SPECIAL_PAGES) << XC_PAGE_SHIFT;
>  dom->max_vcpus = 1;
>  e820[0].addr = 0;
> -e820[0].size = dom->lowmem_end;
> +e820[0].size = (max_size > LAPIC_BASE_ADDRESS) ?
> +    LAPIC_BASE_ADDRESS : max_size;
>  e820[0].type = E820_RAM;
> -e820[1].addr = LAPIC_BASE_ADDRESS;
> +e820[1].addr = dom->mmio_start;

So, it isn't expected to have an entry covering the LAPIC addresses, right? I guess not, as seen in df1ca1dfe20. But based on that same commit info, shouldn't the LAPIC address be part of `dom->mmio_start, dom->mmio_size`? (I don't know how dom->mmio_start is used yet, but maybe it's used by Xen or the Xen libraries to avoid allocations in the wrong places.)

Thanks,

--
Anthony PERARD
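
To make the question about LAPIC coverage concrete, the address arithmetic from the quoted hunk can be evaluated in isolation. This is a sketch, not the patch: the numeric values of the macros are assumptions made for illustration (the authoritative definitions live in the Xen tree), and `struct range`, `low_ram()`, `special_pages()` and `covers()` are hypothetical helpers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values assumed for illustration only; check the Xen headers for the real ones. */
#define XC_PAGE_SHIFT               12
#define GB(x)                       ((uint64_t)(x) << 30)
#define LAPIC_BASE_ADDRESS          0xfee00000ULL
#define X86_HVM_NR_SPECIAL_PAGES    8ULL
#define X86_HVM_END_SPECIAL_REGION  0xff000ULL

/* Hypothetical helper type: a start address plus a length in bytes. */
struct range { uint64_t addr, size; };

/* e820[0] in the new code: RAM from 0, capped at the LAPIC base. */
static struct range low_ram(uint64_t max_size)
{
    struct range r = {
        0,
        max_size > LAPIC_BASE_ADDRESS ? LAPIC_BASE_ADDRESS : max_size,
    };
    return r;
}

/* dom->mmio_start / dom->mmio_size in the new code: only the special pages. */
static struct range special_pages(void)
{
    struct range r = {
        (X86_HVM_END_SPECIAL_REGION - X86_HVM_NR_SPECIAL_PAGES) << XC_PAGE_SHIFT,
        X86_HVM_NR_SPECIAL_PAGES << XC_PAGE_SHIFT,
    };
    return r;
}

/* True if addr falls inside [r.addr, r.addr + r.size). */
static bool covers(struct range r, uint64_t addr)
{
    assert(r.addr + r.size >= r.addr); /* no wrap-around */
    return addr >= r.addr && addr - r.addr < r.size;
}
```

With these assumed values the special-pages range starts at 0xfeff8000 and is 0x8000 bytes long, so LAPIC_BASE_ADDRESS (0xfee00000) lies below it and is not covered.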
[ovmf test] 171546: all pass - PUSHED
flight 171546 ovmf real [real]
http://logs.test-lab.xenproject.org/osstest/logs/171546/

Perfect :-)
All tests in this flight passed as required
version targeted for testing:
 ovmf 5496c763aaddc4a47639d4652abe23aa3419263a
baseline version:
 ovmf f193b945eac58ca379d3d21c77d5550b063580d6

Last test of basis 171540 2022-07-07 01:10:28 Z 0 days
Testing same since 171546 2022-07-07 10:42:07 Z 0 days 1 attempts

People who touched revisions under test:
 Ming Huang

jobs:
 build-amd64-xsm pass
 build-i386-xsm pass
 build-amd64 pass
 build-i386 pass
 build-amd64-libvirt pass
 build-i386-libvirt pass
 build-amd64-pvops pass
 build-i386-pvops pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64 pass
 test-amd64-i386-xl-qemuu-ovmf-amd64 pass

sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
 http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
 http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
 http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
 http://xenbits.xen.org/gitweb?p=osstest.git;a=summary

Pushing revision :

To xenbits.xen.org:/home/xen/git/osstest/ovmf.git
 f193b945ea..5496c763aa 5496c763aaddc4a47639d4652abe23aa3419263a -> xen-tested-master
Re: [PATCH 5/8] xen/evtchn: don't close the static event channel.
Hi Juergen, > On 6 Jul 2022, at 12:33 pm, Juergen Gross wrote: > > On 06.07.22 13:04, Julien Grall wrote: >> (+ Juergen for the Linux question) >> On 06/07/2022 11:42, Rahul Singh wrote: >>> Hi Julien, >>> On 5 Jul 2022, at 2:56 pm, Julien Grall wrote: On 05/07/2022 14:28, Rahul Singh wrote: > Hi Julien, Hi Rahul, >> On 28 Jun 2022, at 4:18 pm, Julien Grall wrote: >>> a new driver in linux kernel, etc where right now we just need to >>> introduce an extra IOCTL in linux to support this feature. >> >> I don't understand why would need a new driver, etc. Given that you are >> introducing a new IOCTL you could pass a flag to say "This is a static >> event channel so don't close it". > I tried to implement other solutions to this issue. We can introduce a > new event channel state “ECS_STATIC” and set the > event channel state to ECS_STATIC when Xen allocate and create the static > event channels. From what you wrote, ECS_STATIC is just an interdomain behind but where you want Xen to prevent closing the port. From Xen PoV, it is still not clear why this is a problem to let Linux closing such port. From the guest PoV, there are other way to pass this information (see below). >>> >>> If Linux closes the port, the static event channel created by Xen >>> associated with such port will not be available to use afterward. >>> >>> When I started implemented the static event channel series, I thought the >>> static event channel has to be available for use during >>> the lifetime of the guest. This patch avoids closing the port if the Linux >>> user-space application wants to use the event channel again. >>> >>> This patch is fixing the problem for Linux OS, and I agree with you that we >>> should not modify the Xen to fix the Linux problem. >>> Therefore, If the guest decided to close the static event channel, Xen will >>> close the port. Event Chanel associated with the port >>> will not be available for use after that.I will discard this patch in the >>> next series. 
>>> > From guest OS we can check if the event channel is static (via > EVTCHNOP_status() hypercall ), if the event channel is > static don’t try to close the event channel. If guest OS try to close the > static event channel Xen will return error as static event channel can’t > be closed. Why do you need this? You already need a binding indicating which ports will be pre-allocated. So you could update your binding to pass a flag telling Linux "don't close it". I have already proposed that before and I haven't seen any explanation why this is not a viable solution. >>> >>> Sorry I didn’t mention this earlier, I started with your suggestion to fix >>> the issue but after going through the Linux evtchn driver code >>> it is not straight forward to tell Linux don’t close the port. Let me try >>> to explain. >>> >>> In Linux, struct user_evtchn {} is the struct that hold the information for >>> each user evtchn opened. We can add one bool parameter in this struct to >>> tell Linux driver >>> via IOCTL if evtchn is static. When user application close the fd >>> "/dev/xen/evtchn” , evtchn_release() will traverse all the evtchn and call >>> evtchn_unbind_from_user() >>> for each evtchn. evtchn_unbind_from_user() will call >>> __unbind_from_irq(irq) that will call xen_evtchn_close() . We need >>> references to "struct user_evtchn” in >>> function __unbind_from_irq() to pass as argument to xen_evtchn_close() not >>> to close the static event channel. I am not able to find any way to get >>> struct user_evtchn in function __unbind_from_irq() , without modifying the >>> other Linux structure. > > The "static" flag should be added to struct irq_info. In case all relevant > event channels are really user ones, we could easily add another "static" > flag to evtchn_make_refcounted(), which is already used to set a user > event channel specific value into struct irq_info when binding the event > channel. 
>

As you suggested, I modified the Linux kernel by adding a "static" flag to struct irq_info, and it works fine. We can skip closing a static event channel if required. I will send that patch for review once I have sent the patch for the new ioctl for static event channels.

Regards,
Rahul
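
For readers following the thread, the idea of a per-channel "static" flag that suppresses the close can be sketched as below. This is not the Linux code: the struct is a heavily reduced, hypothetical stand-in for struct irq_info, and `xen_evtchn_close_stub()` and `unbind_from_irq_sketch()` are made-up names that only model the control flow.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily reduced stand-in for Linux's struct irq_info. */
struct irq_info {
    unsigned int evtchn;
    bool is_static;  /* set at bind time for channels pre-allocated by Xen */
    bool bound;
};

/* Counts how often the (simulated) close hypercall would have been issued. */
static unsigned int close_hypercalls;

/* Stand-in for the real close path; only records that it was called. */
static void xen_evtchn_close_stub(unsigned int evtchn)
{
    (void)evtchn;
    close_hypercalls++;
}

/* Unbind path: tear down local state, but leave static channels open in Xen. */
static void unbind_from_irq_sketch(struct irq_info *info)
{
    assert(info != NULL);

    if (!info->bound)
        return;

    if (!info->is_static)
        xen_evtchn_close_stub(info->evtchn);

    info->bound = false;
}
```

The point of keeping the flag next to the per-IRQ state is that the unbind path needs no reference back to the user-space bookkeeping to decide whether to close.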
Re: [RFC PATCH] flask: Remove magic SID setting
On Thu, Jul 7, 2022 at 6:14 AM Daniel P. Smith wrote: > > On 7/6/22 15:13, Jason Andryuk wrote: > > flask_domain_alloc_security and flask_domain_create has special code to > > magically label dom0 as dom0_t. This can all be streamlined by making > > create_dom0 set ssidref before creating dom0. > > Hmm, I wouldn't call it magical, it is the initialization policy for a > domain labeling, which is specific to each policy module. I considered > this approach already and my concern here is two fold. First, it now > hard codes the concept of dom0 vs domU into the XSM API. There is an > ever growing desire by solution providers to not have a dom0 and at most > have a hardware domain if at all and this is a step backwards from that > movement. Second, and related, is this now pushes the initial label > policy up into the domain builder code away from the policy module and > spreads it out. Hopefully Xen will evolve to have a richer set of > initial domains and an appropriate initial label policy will be needed > for this case. This approach will result in having to continually expand > the XSM API for each new initial domain type. Yeah, adding dom0 vs. domU into the XSM API isn't nice. My original idea was just for dom0, but I added the domU hook after you basically said in your other email that dom0less had to work. There should not be any more of these since they are just to provide backwards compatibility. A dom0/domU flask policy is not interesting for dom0less/hyperlaunch. So I don't see why xen/flask needs support for determining sids for domains. If you have dom0less/hyperlaunch + flask, every domain should have a ssidref defined in its config when building. If you require ssidrefs for dom0less/hyperlaunch + flask, then there is less initial label policy. An unspecified ssidref defaulting to unlabeled_t is fine. I saw your other patch as adding more "initial label policy" since it adds more special cases. 
I see requiring an explicit ssidref or getting unlabeled_t as a feature. Automatic labeling seems like a misfeature to me. Regards, Jason
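
The labeling rule argued for here (an explicit ssidref from the domain config, otherwise unlabeled_t) is small enough to write down. A sketch under assumptions: the numeric SID value and the convention that 0 means "not specified" are hypothetical choices for this example, not taken from the flask sources.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical values for illustration; not copied from Xen's flask headers. */
#define SSIDREF_UNSPECIFIED  0u
#define SID_UNLABELED        3u

/* Use the configured label if there is one, otherwise fall back to unlabeled. */
static uint32_t initial_sid(uint32_t config_ssidref)
{
    uint32_t sid = config_ssidref != SSIDREF_UNSPECIFIED ? config_ssidref
                                                         : SID_UNLABELED;

    assert(sid != SSIDREF_UNSPECIFIED);
    return sid;
}
```

There is deliberately no dom0 or domU special case in this function; that is the difference from automatic labeling.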
Re: [RFC] DVFS and Thermal management subsystem proposal
Hi Jan,

On Thu, Jul 07, 2022 at 01:55:30PM +0200, Jan Beulich wrote:
> On 07.07.2022 12:35, Oleksii Moisieiev wrote:
> > # Synopsis
> > This document is intended to describe the design of the thermal based cpu
> > throttling in virtualized environments. The goal is to provide generic thermal
> > management subsystem, which should work with existing cpufreq subsystem in XEN
> > and could be used on various architectures and hardware.
>
> Looks quite plausible to me, just two questions:
>
> > # Cpufreq subsystem in XEN
> >
> > ## Brief overview
> >
> > [Diagram, layout lost: governors (ondemand, powersave, performance, userspace, ...), each a struct cpufreq_governor { .name, .governor, .handle_option }, registered via cpufreq_register_governor(); drivers (cpufreq_dev_drv, ...), each a struct cpufreq_driver { .init, .verify, .setpolicy, .update, .target, .get, .getavg, .exit }, registered via cpufreq_register_driver(); Hardware below the drivers.]
> >
> > Cpufreq subsystem consists of 2 parts:
> > 1) Cpufreq governor, which should be registered using cpufreq_register_governor
> > call;
> > 2) Cpufreq driver, which provides access to the hardware should be registered
> > using cpufreq_register_driver call.
> >
> > ## Hardware drivers
> >
> > There are two Cpufreq hardware drivers implemented by us (see Appendix 1 and
> > Appendix 2) to provide support for Rcar-3 and i.MX8 boards. Those drivers are
> > designed to support thermal throttling subsystem. They are going to be the part
> > of the contribution package.
>
> Are these drivers also intended to act as "ordinary" cpufreq drivers,
> i.e. controlled by cpufreq governors instead of thermal ones?
>

The idea is that the cpufreq drivers act as ordinary cpufreq drivers, controlled by cpufreq governors, as long as the temperature is fine. But the cpufreq OPP level can be overridden by the thermal subsystem if the critical or passive temperature is reached.

> > # XEN Dynamic Thermal management design
> >
> > ## Synopsis
> >
> > Introducing the design of the Dynamic Thermal Management for Xen hypervisor.
> > This feature is an enhancement of the Xen DVFS feature and will allow system
> > admin to configure different thermal governors which will perform CPU
> > throttling, based on the CPU cores temperature and thermal configuration.
> >
> > ## Top level design.
> >
> > [Diagram, layout lost: inside XEN, a Thermal Governor connected to a Thermal Driver and to Cpufreq; Hardware below XEN.]
> >
> > ## Thermal management subsystem design in XEN
> >
> > [Diagram, layout lost and cut off in this copy: thermal governors (powersave, stepwise, ...), each a struct thermal_governor { .name, .governor, .handle_option }]
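
The override described in the reply above (the governor decides while the temperature is fine, the thermal subsystem caps the OPP once a passive or critical trip point is reached) could look roughly like this. Everything here is assumed for illustration: the OPP indices, the trip-point struct, and in particular the policy of halving the maximum OPP at the passive trip point are not taken from the RFC.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical OPP indices: 0 is the slowest operating point, 4 the fastest. */
enum { OPP_MIN = 0, OPP_MAX = 4 };

/* Hypothetical trip points, in millidegrees Celsius. */
struct trips {
    int passive_mc;
    int critical_mc;
};

/* Highest OPP the thermal side allows at the given temperature (assumed policy). */
static int thermal_cap(const struct trips *t, int temp_mc)
{
    assert(t != NULL && t->passive_mc <= t->critical_mc);

    if (temp_mc >= t->critical_mc)
        return OPP_MIN;          /* critical: force the lowest OPP */
    if (temp_mc >= t->passive_mc)
        return OPP_MAX / 2;      /* passive: throttle to a middle OPP */
    return OPP_MAX;              /* temperature fine: no restriction */
}

/* The governor's request is honoured unless it exceeds the thermal cap. */
static int effective_opp(int governor_opp, const struct trips *t, int temp_mc)
{
    int cap = thermal_cap(t, temp_mc);

    return governor_opp < cap ? governor_opp : cap;
}
```

Taking the minimum of the two values means a powersave-style governor that already asks for a low OPP is never pushed upwards by the thermal side.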
Re: [PATCH] EFI: strip xen.efi when putting it on the EFI partition
On 07.07.2022 13:58, Wei Chen wrote: > Hi Jan, > > On 2022/7/6 13:44, Henry Wang wrote: >> Hi Jan, >> >>> -Original Message- >>> Subject: Re: [PATCH] EFI: strip xen.efi when putting it on the EFI partition >>> >>> On 09.06.2022 17:52, Jan Beulich wrote: With debug info retained, xen.efi can be quite large. Unlike for xen.gz there's no intermediate step (mkelf32 there) involved which would strip debug info kind of as a side effect. While the installing of xen.efi on the EFI partition is an optional step (intended to be a courtesy to the developer), adjust it also for the purpose of documenting what distros would be expected to do during boot loader configuration (which is what would normally put xen.efi into the EFI partition). Model the control over stripping after Linux'es module installation, except that the stripped executable is constructed in the build area instead of in the destination location. This is to conserve on space used there - EFI partitions tend to be only a few hundred Mb in size. Signed-off-by: Jan Beulich --- GNU strip 2.38 appears to have issues when acting on a PE binary: - file name symbols are also stripped; while there is a separate --keep-file-symbols option (which I would have thought to be on by default anyway), its use so far makes no difference, - the string table grows in size, when one would expect it to retain its size (or shrink), - linker version is changed in and timestamp zapped from the header. Older GNU strip (observed with 2.35.1) doesn't work at all ("Data Directory size (1c) exceeds space left in section (8)"). Future GNU strip is going to honor --keep-file-symbols (and will also have the other issues fixed). Question is whether we should use that option (for the symbol table as a whole to make sense), or whether instead we should (by default) strip the symbol table as well. 
>>> >>> Without any feedback / ack I guess I'll consider this of no interest >>> (despite having heard otherwise, triggering me to put together the >>> patch in the first place), and put on the pile of effectively >>> rejected patches. >> >> I did a test for this patch on my x86 machine and I think this patch is >> doing the correct thing, so: >> >> Tested-by: Henry Wang >> > > Because there was no Arm EFI environment in hand at the time, Henry only > tested the x86 part.I have setup an Arm platform with UEFI v2.70 (EDK > II, 0x0001) today, and this patch works well when boot Xen as an EFI > application from UEFI shell. > > But the binaries sizes are the same with/without this patch. Is it expected? > I have enabled: > CONFIG_DEBUG=y > CONFIG_DEBUG_INFO=y Well, the way "xen" is built (and "xen.efi" only being an alias thereof), debug info is stripped in the course. That's quite different from x86, where - with a new enough linker - debug info is retained while linking (and it is truly linking by which xen.efi is built), and hence can make sense to strip while installing. > Is there anything I should be aware to test this patch? > > -rwxrwxr-x 1 weic weic 1081504 Jul 7 18:43 xen > -rwxrwxr-x 1 weic weic 1081504 Jul 7 19:43 xen > > Tested-by (Arm only): Wei Chen Thanks. Btw the proper form of the tag, as of a couple of months ago, is Tested-by: Wei Chen # arm Jan
Re: [PATCH] EFI: strip xen.efi when putting it on the EFI partition
Hi Jan, On 2022/7/6 13:44, Henry Wang wrote: Hi Jan, -Original Message- Subject: Re: [PATCH] EFI: strip xen.efi when putting it on the EFI partition On 09.06.2022 17:52, Jan Beulich wrote: With debug info retained, xen.efi can be quite large. Unlike for xen.gz there's no intermediate step (mkelf32 there) involved which would strip debug info kind of as a side effect. While the installing of xen.efi on the EFI partition is an optional step (intended to be a courtesy to the developer), adjust it also for the purpose of documenting what distros would be expected to do during boot loader configuration (which is what would normally put xen.efi into the EFI partition). Model the control over stripping after Linux'es module installation, except that the stripped executable is constructed in the build area instead of in the destination location. This is to conserve on space used there - EFI partitions tend to be only a few hundred Mb in size. Signed-off-by: Jan Beulich --- GNU strip 2.38 appears to have issues when acting on a PE binary: - file name symbols are also stripped; while there is a separate --keep-file-symbols option (which I would have thought to be on by default anyway), its use so far makes no difference, - the string table grows in size, when one would expect it to retain its size (or shrink), - linker version is changed in and timestamp zapped from the header. Older GNU strip (observed with 2.35.1) doesn't work at all ("Data Directory size (1c) exceeds space left in section (8)"). Future GNU strip is going to honor --keep-file-symbols (and will also have the other issues fixed). Question is whether we should use that option (for the symbol table as a whole to make sense), or whether instead we should (by default) strip the symbol table as well. Without any feedback / ack I guess I'll consider this of no interest (despite having heard otherwise, triggering me to put together the patch in the first place), and put on the pile of effectively rejected patches. 
> I did a test for this patch on my x86 machine and I think this patch is doing the correct thing, so:
>
> Tested-by: Henry Wang

Because there was no Arm EFI environment at hand at the time, Henry only tested the x86 part. I have set up an Arm platform with UEFI v2.70 (EDK II, 0x0001) today, and this patch works well when booting Xen as an EFI application from the UEFI shell.

But the binary sizes are the same with and without this patch. Is that expected? I have enabled:
CONFIG_DEBUG=y
CONFIG_DEBUG_INFO=y

Is there anything I should be aware of when testing this patch?

-rwxrwxr-x 1 weic weic 1081504 Jul 7 18:43 xen
-rwxrwxr-x 1 weic weic 1081504 Jul 7 19:43 xen

Tested-by (Arm only): Wei Chen

Thanks,
Wei Chen

> I also noticed that Julien is suggesting maybe we can have Anthony as the reviewer for this patch, so I also add him in the CC of this email.
>
> Kind regards,
> Henry
>
>> Jan
Re: [RFC] DVFS and Thermal management subsystem proposal
On 07.07.2022 12:35, Oleksii Moisieiev wrote:
> # Synopsis
> This document is intended to describe the design of the thermal based cpu
> throttling in virtualized environments. The goal is to provide generic thermal
> management subsystem, which should work with existing cpufreq subsystem in XEN
> and could be used on various architectures and hardware.

Looks quite plausible to me, just two questions:

> # Cpufreq subsystem in XEN
>
> ## Brief overview
>
> [Diagram, layout lost: governors (ondemand, powersave, performance, userspace, ...), each a struct cpufreq_governor { .name, .governor, .handle_option }, registered via cpufreq_register_governor(); drivers (cpufreq_dev_drv, ...), each a struct cpufreq_driver { .init, .verify, .setpolicy, .update, .target, .get, .getavg, .exit }, registered via cpufreq_register_driver(); Hardware below the drivers.]
>
> Cpufreq subsystem consists of 2 parts:
> 1) Cpufreq governor, which should be registered using cpufreq_register_governor
> call;
> 2) Cpufreq driver, which provides access to the hardware should be registered
> using cpufreq_register_driver call.
>
> ## Hardware drivers
>
> There are two Cpufreq hardware drivers implemented by us (see Appendix 1 and
> Appendix 2) to provide support for Rcar-3 and i.MX8 boards. Those drivers are
> designed to support thermal throttling subsystem. They are going to be the part
> of the contribution package.

Are these drivers also intended to act as "ordinary" cpufreq drivers, i.e. controlled by cpufreq governors instead of thermal ones?

> # XEN Dynamic Thermal management design
>
> ## Synopsis
>
> Introducing the design of the Dynamic Thermal Management for Xen hypervisor.
> This feature is an enhancement of the Xen DVFS feature and will allow system
> admin to configure different thermal governors which will perform CPU
> throttling, based on the CPU cores temperature and thermal configuration.
>
> ## Top level design.
>
> [Diagram, layout lost: inside XEN, a Thermal Governor connected to a Thermal Driver and to Cpufreq; Hardware below XEN.]
>
> ## Thermal management subsystem design in XEN
>
> [Diagram, layout lost: thermal governors (powersave, stepwise, ...), each a struct thermal_governor { .name, .governor, .handle_option }, registered via register_thermal_governor() with dyn_thermal; dyn_thermal is connected to polling_handler(), labelled "Polling temperature".]

Polling (only)?

Jan
Re: [PATCH v8 6/9] xen/arm: introduce CDF_staticmem
On 07.07.2022 11:22, Penny Zheng wrote:
> In order to have an easy and quick way to find out whether this domain memory
> is statically configured, this commit introduces a new flag CDF_staticmem and a
> new helper is_domain_using_staticmem() to tell.
>
> Signed-off-by: Penny Zheng

Acked-by: Jan Beulich
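
A flag-plus-helper pair of the kind the commit message describes usually amounts to a bit test on the domain's creation flags. The sketch below is an assumption about the shape only: the bit position, the reduced `struct domain` and its `cdf` field are hypothetical here, and the real helper in the patch may well be a macro.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bit position; the real value is defined by the patch itself. */
#define CDF_staticmem  (1U << 2)

/* Hypothetical, reduced domain structure holding only the creation flags. */
struct domain {
    unsigned int cdf;
};

/* True if the domain was created with statically configured memory. */
static bool is_domain_using_staticmem(const struct domain *d)
{
    assert(d != NULL);
    return (d->cdf & CDF_staticmem) != 0;
}
```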
Re: [PATCH] Config.mk: use newest Mini-OS commit
On 07.07.2022 11:39, Juergen Gross wrote:
> Switch to use the newest Mini-OS commit in order to get the recent
> fixes.
>
> Signed-off-by: Juergen Gross

Acked-by: Jan Beulich
Re: [PATCH v3 1/8] irqchip/mips-gic: Only register IPI domain when SMP is enabled
On Thu, Jul 07, 2022 at 09:22:26AM +0100, Marc Zyngier wrote:
> On Tue, 05 Jul 2022 14:52:43 +0100,
> Serge Semin wrote:
> >
> > Hi Samuel
> >
> > On Fri, Jul 01, 2022 at 03:00:49PM -0500, Samuel Holland wrote:
> > > The MIPS GIC irqchip driver may be selected in a uniprocessor
> > > configuration, but it unconditionally registers an IPI domain.
> > >
> > > Limit the part of the driver dealing with IPIs to only be compiled when
> > > GENERIC_IRQ_IPI is enabled, which corresponds to an SMP configuration.
> >
> > Thanks for the patch. Some comment is below.
> >
> > >
> > > Reported-by: kernel test robot
> > > Signed-off-by: Samuel Holland
> > > ---
> > >
> > > Changes in v3:
> > >  - New patch to fix build errors in uniprocessor MIPS configs
> > >
> > >  drivers/irqchip/Kconfig        |  3 +-
> > >  drivers/irqchip/irq-mips-gic.c | 80 +++---
> > >  2 files changed, 56 insertions(+), 27 deletions(-)
> > >
> > > diff --git a/drivers/irqchip/Kconfig b/drivers/irqchip/Kconfig
> > > index 1f23a6be7d88..d26a4ff7c99f 100644
> > > --- a/drivers/irqchip/Kconfig
> > > +++ b/drivers/irqchip/Kconfig
> > > @@ -322,7 +322,8 @@ config KEYSTONE_IRQ
> > >
> > >  config MIPS_GIC
> > >      bool
> > > -    select GENERIC_IRQ_IPI
> > > +    select GENERIC_IRQ_IPI if SMP
> >
> > > +    select IRQ_DOMAIN_HIERARCHY
> >
> > It seems to me that the IRQ domains hierarchy is supposed to be
> > created only if IPI is required. At least that's what the MIPS GIC
> > driver implies. Thus we can go further and CONFIG_IRQ_DOMAIN_HIERARCHY
> > ifdef-out the gic_irq_domain_alloc() and gic_irq_domain_free()
> > methods definition together with the initialization:
> >
> >  static const struct irq_domain_ops gic_irq_domain_ops = {
> >      .xlate = gic_irq_domain_xlate,
> > +#ifdef CONFIG_IRQ_DOMAIN_HIERARCHY
> >      .alloc = gic_irq_domain_alloc,
> >      .free = gic_irq_domain_free,
> > +#endif
> >      .map = gic_irq_domain_map,
> >  };
> >
> > If the GENERIC_IRQ_IPI config is enabled, CONFIG_IRQ_DOMAIN_HIERARCHY
> > will be automatically selected (see the config definition in
> > kernel/irq/Kconfig). If the IRQs hierarchy is needed for some another
> > functionality like GENERIC_MSI_IRQ_DOMAIN or GPIOs then they will
> > explicitly enable the IRQ_DOMAIN_HIERARCHY config thus activating the
> > denoted .alloc and .free methods definitions.
> >
> > To sum up you can get rid of the IRQ_DOMAIN_HIERARCHY config
> > force-select from this patch and make the MIPS GIC driver code a bit
> > more coherent.
> >
> > @Marc, please correct me if were wrong.
>
> Either way probably works correctly, but Samuel's approach is more
> readable IMO. It is far easier to reason about a high-level feature
> (GENERIC_IRQ_IPI) than an implementation detail (IRQ_DOMAIN_HIERARCHY).
>

The main idea of my comment was to get rid of the forced
IRQ_DOMAIN_HIERARCHY config selection, because the basic part of the
driver doesn't depend on the hierarchical IRQ-domain functionality.
It's needed only for IPIs and, implicitly, for lower-level IRQ device
drivers like GPIO or PCIe controllers, which explicitly enable the
IRQ_DOMAIN_HIERARCHY config anyway. That's why, instead of
force-selecting IRQ_DOMAIN_HIERARCHY (see Samuel's patch), I suggested
defining the corresponding functionality under IRQ_DOMAIN_HIERARCHY
ifdefs, so that the driver is capable of creating hierarchical IRQ
domains only when that is required.

> If you really want to save a handful of bytes, you can make the
> callbacks conditional on GENERIC_IRQ_IPI, and be done with it.

AFAIU I can't in this case. It must be either IRQ_DOMAIN_HIERARCHY
ifdefs or an explicit IRQ_DOMAIN_HIERARCHY select. There can be non-SMP
(UP) systems that have no need for IPIs but do have, for instance, a
GPIO or PCIe controller which requires hierarchical IRQ-domain support
from the parent IRQ controller. Making the callback definitions depend
on the GENERIC_IRQ_IPI config state would break the driver for these
systems. That's why I suggested using CONFIG_IRQ_DOMAIN_HIERARCHY,
which activates the hierarchical IRQ domains support in the IRQ-chip
subsystem (see the conditional fields in the irq_domain_ops structure)
and would do the same in the MIPS GIC driver if the suggested approach
were implemented.

-Sergey

> But this can come as an additional patch.
>
> Thanks,
>
> M.
>
> --
> Without deviation from the norm, progress is not possible.
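The configuration argued about above can be mocked up in user space. This is only a sketch: struct mock_domain_ops, the mock_* callbacks and ops_support_hierarchy() are invented names that merely imitate the shape of the kernel's irq_domain_ops, and the CONFIG_* macros are plain preprocessor defines standing in for Kconfig symbols. It models a UP build (GENERIC_IRQ_IPI off) in which a GPIO or PCIe driver has selected IRQ_DOMAIN_HIERARCHY.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-ins for Kconfig: a UP system whose GPIO/PCIe driver selected
 * IRQ_DOMAIN_HIERARCHY, while GENERIC_IRQ_IPI stays disabled. */
#define CONFIG_IRQ_DOMAIN_HIERARCHY 1
/* #define CONFIG_GENERIC_IRQ_IPI 1 */

/* Hypothetical, cut-down imitation of struct irq_domain_ops. */
struct mock_domain_ops {
    int (*map)(unsigned int virq);
#ifdef CONFIG_IRQ_DOMAIN_HIERARCHY
    int (*alloc)(unsigned int virq, unsigned int nr_irqs);
    void (*free)(unsigned int virq, unsigned int nr_irqs);
#endif
};

static int mock_map(unsigned int virq)
{
    (void)virq;
    return 0;
}

#ifdef CONFIG_IRQ_DOMAIN_HIERARCHY
static int mock_alloc(unsigned int virq, unsigned int nr_irqs)
{
    (void)virq;
    (void)nr_irqs;
    return 0;
}

static void mock_free(unsigned int virq, unsigned int nr_irqs)
{
    (void)virq;
    (void)nr_irqs;
}
#endif

static const struct mock_domain_ops mock_gic_ops = {
    .map = mock_map,
#ifdef CONFIG_IRQ_DOMAIN_HIERARCHY   /* the guard suggested in this mail */
    .alloc = mock_alloc,
    .free = mock_free,
#endif
};

/* True if a child irqchip could stack on top of these ops. */
static bool ops_support_hierarchy(const struct mock_domain_ops *ops)
{
#ifdef CONFIG_IRQ_DOMAIN_HIERARCHY
    return ops->alloc != NULL && ops->free != NULL;
#else
    (void)ops;
    return false;
#endif
}
```

If the guard around the initializer were CONFIG_GENERIC_IRQ_IPI instead, .alloc and .free would stay NULL in this configuration and ops_support_hierarchy() would report false, which is the breakage for UP systems described above.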
[RFC] DVFS and Thermal management subsystem proposal
# Synopsis This document is intended to describe the design of the thermal based cpu throttling in virtualized environments. The goal is to provide generic thermal management subsystem, which should work with existing cpufreq subsystem in XEN and could be used on various architectures and hardware. # Cpufreq subsystem in XEN ## Brief overview Governors ++ | ++ | struct cpufreq_governor { | | ondemand | | .name | ++ | .governor | ++ | .handle_option | | powersave | | } | ++ | | ++ | +--+ | | performance | |->cpufreq_register_governor() | +---+| | ++ | | | cpufreq_dev_drv || | ++ | cpufreq_register_driver()->| +---+| | | userspace | | | +---+| | ++ | | | ... || | ++ | | +---+| | | ... | |struct cpufreq_driver { +--+ | ++ | .init +--+ ++ .verify|Hardware | .setpolicy +--+ .update .target .get .getavg .exit } Cpufreq subsystem consists of 2 parts: 1) Cpufreq governor, which should be registered using cpufreq_register_governor call; 2) Cpufreq driver, which provides access to the hardware should be registered using cpufreq_register_driver call. ## Hardware drivers There are two Cpufreq hardware drivers implemented by us (see Appendix 1 and Appendix 2) to provide support for Rcar-3 and i.MX8 boards. Those drivers are designed to support thermal throttling subsystem. They are going to be the part of the contribution package. ## Configuration options Cpufreq subsystem enables with the following config param: +-+ CONFIG_HAS_CPUFREQ=y +-+ Cpufreq device driver is platform specific and can be selected on compile time by setting config parameter: +-+ CONFIG_CPUFREQ_XXX +-+ Where XXX is the platform name. Additional configuration is also possible. This could be done by device tree nodes or using ACPI configuration. Current implementation supports only device-tree configuration. Device tree configuration is defined by the cpufreq driver implementation and mostly using device-tree bindings from linux kernel. Linux kernel defines common and platform specific cpufreq bindings. 
See [0] /Documentation/devicetree/bindings/cpufreq and [0] /Documentation/devicetree/bindings/opp for details. Some examples can be found in Appendix 1 and Appendix 2.

The Cpufreq driver is initialized on Xen start based on the configuration parameters. Only one cpufreq device driver can be enabled on a system. Switching to a different Cpufreq hardware driver should be probed based on device-tree nodes or ACPI configuration.

The default governor can be set from the xen-bootargs and has the following format:
+-+
cpufreq=xen:ondemand
+-+

xl.cfg (guest configuration files) supports the following configuration option: guestpm. It defines the PM policy for the given guest. For example:
+-+
guestpm = "0-7"
+-+

The guestpm = "0-7" line allows the guest to choose OPP levels from 0 to 7 out of 15. Higher OPP levels will be ignored by the hypervisor.

# XEN Dynamic Thermal management design

## Synopsis

Introducing the design of the Dynamic Thermal Management for Xen hypervisor. This feature is an enhancement of the Xen DVFS feature and will allow the system admin to configure different thermal governors which will perform CPU throttling, based on the CPU core temperature and thermal configuration.

## Top level design.

+---+ |XEN| | +---+| | | Thermal || | +->| Governor || | | +-|-+| | ||
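As an illustration of the guestpm option described earlier in this proposal, the range check could look roughly like the sketch below. The RFC does not show how the option is parsed, so struct opp_range, parse_guestpm() and opp_level_allowed() are hypothetical names and the "lo-hi" syntax is inferred from the single "0-7" example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical holder for a parsed guestpm range. */
struct opp_range {
    unsigned int lo;
    unsigned int hi;
};

/* Parse "lo-hi" (e.g. "0-7"); returns false on malformed input. */
static bool parse_guestpm(const char *s, struct opp_range *r)
{
    char trailing;

    /* %c catches garbage after the range, so exactly 2 fields must match. */
    if (sscanf(s, "%u-%u%c", &r->lo, &r->hi, &trailing) != 2)
        return false;
    return r->lo <= r->hi;
}

/* True if the guest may request this OPP level; other levels are ignored. */
static bool opp_level_allowed(const struct opp_range *r, unsigned int level)
{
    return level >= r->lo && level <= r->hi;
}
```

With guestpm = "0-7", a guest request for level 8 or above would fail the check and be dropped.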
[libvirt test] 171541: regressions - FAIL
flight 171541 libvirt real [real] http://logs.test-lab.xenproject.org/osstest/logs/171541/ Regressions :-( Tests which did not succeed and are blocking, including tests which could not be run: build-amd64-libvirt 6 libvirt-buildfail REGR. vs. 151777 build-arm64-libvirt 6 libvirt-buildfail REGR. vs. 151777 build-i386-libvirt6 libvirt-buildfail REGR. vs. 151777 build-armhf-libvirt 6 libvirt-buildfail REGR. vs. 151777 Tests which did not succeed, but are not blocking: test-amd64-amd64-libvirt 1 build-check(1) blocked n/a test-amd64-amd64-libvirt-pair 1 build-check(1) blocked n/a test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a test-amd64-amd64-libvirt-vhd 1 build-check(1) blocked n/a test-amd64-amd64-libvirt-xsm 1 build-check(1) blocked n/a test-amd64-i386-libvirt 1 build-check(1) blocked n/a test-amd64-i386-libvirt-pair 1 build-check(1) blocked n/a test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a test-amd64-i386-libvirt-raw 1 build-check(1) blocked n/a test-amd64-i386-libvirt-xsm 1 build-check(1) blocked n/a test-arm64-arm64-libvirt 1 build-check(1) blocked n/a test-arm64-arm64-libvirt-qcow2 1 build-check(1) blocked n/a test-arm64-arm64-libvirt-raw 1 build-check(1) blocked n/a test-armhf-armhf-libvirt-raw 1 build-check(1) blocked n/a test-arm64-arm64-libvirt-xsm 1 build-check(1) blocked n/a test-armhf-armhf-libvirt 1 build-check(1) blocked n/a test-armhf-armhf-libvirt-qcow2 1 build-check(1) blocked n/a version targeted for testing: libvirt 35609616a2353d23b43d6c490daed333f60c917c baseline version: libvirt 2c846fa6bcc11929c9fb857a22430fb9945654ad Last test of basis 151777 2020-07-10 04:19:19 Z 727 days Failing since151818 2020-07-11 04:18:52 Z 726 days 708 attempts Testing same since 171497 2022-07-05 04:20:30 Z2 days3 attempts People who touched revisions under test: Adolfo Jayme Barrientos Aleksandr Alekseev Aleksei Zakharov Amneesh Singh Andika Triwidada Andrea Bolognani Andrew Melnychenko Ani Sinha 
Balázs Meskó Barrett Schonefeld Bastian Germann Bastien Orivel BiaoXiang Ye Bihong Yu Binfeng Wu Bjoern Walk Boris Fiuczynski Brad Laue Brian Turek Bruno Haible Chris Mayo Christian Borntraeger Christian Ehrhardt Christian Kirbach Christian Schoenebeck Christophe Fergeau Claudio Fontana Cole Robinson Collin Walling Cornelia Huck Cédric Bosdonnat Côme Borsoi Daniel Henrique Barboza Daniel Letai Daniel P. Berrange Daniel P. Berrangé David Michael Didik Supriadi dinglimin Divya Garg Dmitrii Shcherbakov Dmytro Linkin Eiichi Tsukata Emilio Herrera Eric Farman Erik Skultety Fabian Affolter Fabian Freyer Fabiano Fidêncio Fangge Jin Farhan Ali Fedora Weblate Translation Florian Schmidt Franck Ridel Gavi Teitz gongwei Guoyi Tu Göran Uddeborg Halil Pasic Han Han Hao Wang Haonan Wang Hela Basa Helmut Grohne Hiroki Narukawa Hyman Huang(黄勇) Ian Wienand Ioanna Alifieraki Ivan Teterevkov Jakob Meng Jamie Strandboge Jamie Strandboge Jan Kuparinen jason lee Jean-Baptiste Holcroft Jia Zhou Jianan Gao Jim Fehlig Jin Yan Jing Qi Jinsheng Zhang Jiri Denemark Joachim Falk John Ferlan John Levon John Levon Jonathan Watt Jonathon Jongsma Julio Faracco Justin Gatzen Ján Tomko Kashyap Chamarthy Kevin Locke Kim InSoo Koichi Murase Kristina Hanicova Laine Stump Laszlo Ersek Lee Yarwood Lei Yang Lena Voytek Liang Yan Liang Yan Liao Pingfang Lin Ma Lin Ma Lin Ma Liu Yiding Lubomir Rintel Luke Yue Luyao Zhong luzhipeng Marc Hartmayer Marc-André Lureau Marek Marczykowski-Górecki Mark Mielke Markus Schade Martin Kletzander Martin Pitt Masayoshi Mizuma Matej Cepl Matt Coleman Matt Coleman Mauro Matteo Cascella Max Goodhart Maxim Nestratov Meina Li Michal Privoznik Michał Smyk Milo Casagrande Moshe Levi Moteen Shah Moteen Shah Muha Aliss Nathan Neal Gompa Nick Chevsky Nick Shyrokovskiy Nickys Music Group Nico Pache Nicolas Lécureuil Nicolas Lécureuil Nikolay Shirokovskiy Nikolay Shirokovskiy Nikolay Shirokovskiy Niteesh Dubey Olaf Hering Olesya Gerasimenko Or Ozeri Orion Poplawski Pany Paolo Bonzini 
Patrick Magauran Paulo de Rezende Pinatti Pavel
Re: [RFC PATCH] flask: Remove magic SID setting
On 7/6/22 15:13, Jason Andryuk wrote:

flask_domain_alloc_security and flask_domain_create has special code to magically label dom0 as dom0_t. This can all be streamlined by making create_dom0 set ssidref before creating dom0.

Hmm, I wouldn't call it magical, it is the initialization policy for domain labeling, which is specific to each policy module. I considered this approach already and my concern here is twofold.

First, it now hard-codes the concept of dom0 vs domU into the XSM API. There is an ever-growing desire by solution providers to not have a dom0 and at most have a hardware domain, if at all, and this is a step backwards from that movement.

Second, and related, this now pushes the initial label policy up into the domain builder code, away from the policy module, and spreads it out. Hopefully Xen will evolve to have a richer set of initial domains, and an appropriate initial label policy will be needed for this case. This approach will result in having to continually expand the XSM API for each new initial domain type.

create_domU is also extended to create domains with domU_t. xsm_ssidref_domU and xsm_ssidref_dom0 are introduced to abstract away the details.

Signed-off-by: Jason Andryuk
---
Untested on ARM. Minimally tested on x86. Needs your Flask permission changes for xenboot_t to create dom0_t and domU_t. This is what I was thinking would be a better way to handle SID assignment.
Regards, Jason --- xen/arch/arm/domain_build.c | 2 ++ xen/arch/x86/setup.c| 1 + xen/include/xsm/dummy.h | 10 ++ xen/include/xsm/xsm.h | 12 xen/xsm/dummy.c | 2 ++ xen/xsm/flask/hooks.c | 31 +-- 6 files changed, 44 insertions(+), 14 deletions(-) diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c index 3fd1186b53..a7e88944c2 100644 --- a/xen/arch/arm/domain_build.c +++ b/xen/arch/arm/domain_build.c @@ -3281,6 +3281,7 @@ void __init create_domUs(void) .max_grant_frames = -1, .max_maptrack_frames = -1, .grant_opts = XEN_DOMCTL_GRANT_version(opt_gnttab_max_version), +.ssidref = xsm_ssidref_domU(), }; unsigned int flags = 0U; @@ -3438,6 +3439,7 @@ void __init create_dom0(void) .max_grant_frames = gnttab_dom0_frames(), .max_maptrack_frames = -1, .grant_opts = XEN_DOMCTL_GRANT_version(opt_gnttab_max_version), +.ssidref = xsm_ssidref_dom0(), }; /* The vGIC for DOM0 is exactly emulating the hardware GIC */ diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c index f08b07b8de..5a6086cfe3 100644 --- a/xen/arch/x86/setup.c +++ b/xen/arch/x86/setup.c @@ -771,6 +771,7 @@ static struct domain *__init create_dom0(const module_t *image, .arch = { .misc_flags = opt_dom0_msr_relaxed ? 
XEN_X86_MSR_RELAXED : 0, }, +.ssidref = xsm_ssidref_dom0(), }; struct domain *d; char *cmdline; diff --git a/xen/include/xsm/dummy.h b/xen/include/xsm/dummy.h index 77f27e7163..12fbc224d0 100644 --- a/xen/include/xsm/dummy.h +++ b/xen/include/xsm/dummy.h @@ -124,6 +124,16 @@ static XSM_INLINE void cf_check xsm_security_domaininfo( return; } +static XSM_INLINE int cf_check xsm_ssidref_dom0(XSM_DEFAULT_VOID) +{ +return 0; +} + +static XSM_INLINE int cf_check xsm_ssidref_domU(XSM_DEFAULT_VOID) +{ +return 0; +} + static XSM_INLINE int cf_check xsm_domain_create( XSM_DEFAULT_ARG struct domain *d, uint32_t ssidref) { diff --git a/xen/include/xsm/xsm.h b/xen/include/xsm/xsm.h index 8dad03fd3d..a6a4ffe05a 100644 --- a/xen/include/xsm/xsm.h +++ b/xen/include/xsm/xsm.h @@ -55,6 +55,8 @@ struct xsm_ops { int (*set_system_active)(void); void (*security_domaininfo)(struct domain *d, struct xen_domctl_getdomaininfo *info); +int (*ssidref_dom0)(void); +int (*ssidref_domU)(void); int (*domain_create)(struct domain *d, uint32_t ssidref); int (*getdomaininfo)(struct domain *d); int (*domctl_scheduler_op)(struct domain *d, int op); @@ -220,6 +222,16 @@ static inline void xsm_security_domaininfo( alternative_vcall(xsm_ops.security_domaininfo, d, info); } +static inline int xsm_ssidref_dom0(void) +{ +return alternative_call(xsm_ops.ssidref_dom0); +} + +static inline int xsm_ssidref_domU(void) +{ +return alternative_call(xsm_ops.ssidref_domU); +} + static inline int xsm_domain_create( xsm_default_t def, struct domain *d, uint32_t ssidref) { diff --git a/xen/xsm/dummy.c b/xen/xsm/dummy.c index e6ffa948f7..d46cfef0ec 100644 --- a/xen/xsm/dummy.c +++ b/xen/xsm/dummy.c @@ -16,6 +16,8 @@ static const struct xsm_ops __initconst_cf_clobber dummy_ops = { .set_system_active = xsm_set_system_active, .security_domaininfo = xsm_security_domaininfo, +.ssidref_dom0
[xen-unstable test] 171538: FAIL
flight 171538 xen-unstable real [real] http://logs.test-lab.xenproject.org/osstest/logs/171538/ Failures and problems with tests :-( Tests which did not succeed and are blocking, including tests which could not be run: test-amd64-amd64-libvirt-pair broken in 171536 Tests which are failing intermittently (not blocking): test-amd64-amd64-libvirt-pair 7 host-install/dst_host(7) broken in 171536 pass in 171538 test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm 7 xen-install fail pass in 171536 test-amd64-amd64-xl-rtds 20 guest-localmigrate/x10 fail pass in 171536 test-armhf-armhf-xl-credit2 14 guest-startfail pass in 171536 Tests which did not succeed, but are not blocking: test-armhf-armhf-xl-credit2 15 migrate-support-check fail in 171536 never pass test-armhf-armhf-xl-credit2 16 saverestore-support-check fail in 171536 never pass test-amd64-amd64-xl-qemut-win7-amd64 19 guest-stopfail like 171516 test-armhf-armhf-libvirt 16 saverestore-support-checkfail like 171516 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 171516 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stopfail like 171516 test-amd64-i386-xl-qemut-ws16-amd64 19 guest-stop fail like 171516 test-amd64-i386-xl-qemut-win7-amd64 19 guest-stop fail like 171516 test-armhf-armhf-libvirt-qcow2 15 saverestore-support-check fail like 171516 test-armhf-armhf-libvirt-raw 15 saverestore-support-checkfail like 171516 test-amd64-amd64-xl-qemut-ws16-amd64 19 guest-stopfail like 171516 test-amd64-i386-xl-qemuu-ws16-amd64 19 guest-stop fail like 171516 test-amd64-i386-xl-qemuu-win7-amd64 19 guest-stop fail like 171516 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stopfail like 171516 test-amd64-i386-libvirt-xsm 15 migrate-support-checkfail never pass test-amd64-i386-xl-pvshim14 guest-start fail never pass test-amd64-amd64-libvirt 15 migrate-support-checkfail never pass test-arm64-arm64-libvirt-xsm 15 migrate-support-checkfail never pass test-arm64-arm64-libvirt-xsm 16 saverestore-support-checkfail 
never pass test-arm64-arm64-xl-credit1 15 migrate-support-checkfail never pass test-arm64-arm64-xl-credit1 16 saverestore-support-checkfail never pass test-arm64-arm64-xl-thunderx 15 migrate-support-checkfail never pass test-arm64-arm64-xl-thunderx 16 saverestore-support-checkfail never pass test-arm64-arm64-xl 15 migrate-support-checkfail never pass test-arm64-arm64-xl 16 saverestore-support-checkfail never pass test-arm64-arm64-xl-xsm 15 migrate-support-checkfail never pass test-arm64-arm64-xl-xsm 16 saverestore-support-checkfail never pass test-amd64-i386-libvirt 15 migrate-support-checkfail never pass test-arm64-arm64-xl-credit2 15 migrate-support-checkfail never pass test-arm64-arm64-xl-credit2 16 saverestore-support-checkfail never pass test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check fail never pass test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check fail never pass test-armhf-armhf-xl-arndale 15 migrate-support-checkfail never pass test-armhf-armhf-xl-arndale 16 saverestore-support-checkfail never pass test-amd64-amd64-libvirt-xsm 15 migrate-support-checkfail never pass test-amd64-amd64-libvirt-vhd 14 migrate-support-checkfail never pass test-amd64-i386-libvirt-raw 14 migrate-support-checkfail never pass test-arm64-arm64-libvirt-raw 14 migrate-support-checkfail never pass test-arm64-arm64-libvirt-raw 15 saverestore-support-checkfail never pass test-armhf-armhf-xl-rtds 15 migrate-support-checkfail never pass test-armhf-armhf-xl-rtds 16 saverestore-support-checkfail never pass test-armhf-armhf-xl-credit1 15 migrate-support-checkfail never pass test-armhf-armhf-xl-credit1 16 saverestore-support-checkfail never pass test-arm64-arm64-xl-vhd 14 migrate-support-checkfail never pass test-arm64-arm64-xl-vhd 15 saverestore-support-checkfail never pass test-armhf-armhf-xl-cubietruck 15 migrate-support-checkfail never pass test-armhf-armhf-xl-cubietruck 16 saverestore-support-checkfail never pass 
test-armhf-armhf-libvirt 15 migrate-support-checkfail never pass test-arm64-arm64-xl-seattle 15 migrate-support-checkfail never pass test-arm64-arm64-xl-seattle 16 saverestore-support-checkfail never pass test-armhf-armhf-libvirt-qcow2 14 migrate-support-checkfail never pass test-armhf-armhf-xl-multivcpu 15 migrate-support-checkfail never pass test-armhf-armhf-xl-multivcpu 16 saverestore-support-checkfail
Re: [PATCH v3 6/8] genirq: Add and use an irq_data_update_affinity helper
On 07.07.22 11:39, Marc Zyngier wrote: Hello Marc > On Sun, 03 Jul 2022 16:22:03 +0100, > Oleksandr wrote: >> >> On 01.07.22 23:00, Samuel Holland wrote: >> >> >> Hello Samuel >> >>> Some architectures and irqchip drivers modify the cpumask returned by >>> irq_data_get_affinity_mask, usually by copying in to it. This is >>> problematic for uniprocessor configurations, where the affinity mask >>> should be constant, as it is known at compile time. >>> >>> Add and use a setter for the affinity mask, following the pattern of >>> irq_data_update_effective_affinity. This allows the getter function to >>> return a const cpumask pointer. >>> >>> Signed-off-by: Samuel Holland >>> --- >>> >>> Changes in v3: >>>- New patch to introduce irq_data_update_affinity >>> >>>arch/alpha/kernel/irq.c | 2 +- >>>arch/ia64/kernel/iosapic.c | 2 +- >>>arch/ia64/kernel/irq.c | 4 ++-- >>>arch/ia64/kernel/msi_ia64.c | 4 ++-- >>>arch/parisc/kernel/irq.c | 2 +- >>>drivers/irqchip/irq-bcm6345-l1.c | 4 ++-- >>>drivers/parisc/iosapic.c | 2 +- >>>drivers/sh/intc/chip.c | 2 +- >>>drivers/xen/events/events_base.c | 7 --- >>>include/linux/irq.h | 6 ++ >>>10 files changed, 21 insertions(+), 14 deletions(-) >>> >>> diff --git a/arch/alpha/kernel/irq.c b/arch/alpha/kernel/irq.c >>> index f6d2946edbd2..15f2effd6baf 100644 >>> --- a/arch/alpha/kernel/irq.c >>> +++ b/arch/alpha/kernel/irq.c >>> @@ -60,7 +60,7 @@ int irq_select_affinity(unsigned int irq) >>> cpu = (cpu < (NR_CPUS-1) ? 
cpu + 1 : 0); >>> last_cpu = cpu; >>>-cpumask_copy(irq_data_get_affinity_mask(data), >>> cpumask_of(cpu)); >>> + irq_data_update_affinity(data, cpumask_of(cpu)); >>> chip->irq_set_affinity(data, cpumask_of(cpu), false); >>> return 0; >>>} >>> diff --git a/arch/ia64/kernel/iosapic.c b/arch/ia64/kernel/iosapic.c >>> index 35adcf89035a..99300850abc1 100644 >>> --- a/arch/ia64/kernel/iosapic.c >>> +++ b/arch/ia64/kernel/iosapic.c >>> @@ -834,7 +834,7 @@ iosapic_unregister_intr (unsigned int gsi) >>> if (iosapic_intr_info[irq].count == 0) { >>>#ifdef CONFIG_SMP >>> /* Clear affinity */ >>> - cpumask_setall(irq_get_affinity_mask(irq)); >>> + irq_data_update_affinity(irq_get_irq_data(irq), cpu_all_mask); >>>#endif >>> /* Clear the interrupt information */ >>> iosapic_intr_info[irq].dest = 0; >>> diff --git a/arch/ia64/kernel/irq.c b/arch/ia64/kernel/irq.c >>> index ecef17c7c35b..275b9ea58c64 100644 >>> --- a/arch/ia64/kernel/irq.c >>> +++ b/arch/ia64/kernel/irq.c >>> @@ -57,8 +57,8 @@ static char irq_redir [NR_IRQS]; // = { [0 ... 
NR_IRQS-1] >>> = 1 }; >>>void set_irq_affinity_info (unsigned int irq, int hwid, int redir) >>>{ >>> if (irq < NR_IRQS) { >>> - cpumask_copy(irq_get_affinity_mask(irq), >>> -cpumask_of(cpu_logical_id(hwid))); >>> + irq_data_update_affinity(irq_get_irq_data(irq), >>> +cpumask_of(cpu_logical_id(hwid))); >>> irq_redir[irq] = (char) (redir & 0xff); >>> } >>>} >>> diff --git a/arch/ia64/kernel/msi_ia64.c b/arch/ia64/kernel/msi_ia64.c >>> index df5c28f252e3..025e5133c860 100644 >>> --- a/arch/ia64/kernel/msi_ia64.c >>> +++ b/arch/ia64/kernel/msi_ia64.c >>> @@ -37,7 +37,7 @@ static int ia64_set_msi_irq_affinity(struct irq_data >>> *idata, >>> msg.data = data; >>> pci_write_msi_msg(irq, &msg); >>> - cpumask_copy(irq_data_get_affinity_mask(idata), cpumask_of(cpu)); >>> + irq_data_update_affinity(idata, cpumask_of(cpu)); >>> return 0; >>>} >>> @@ -132,7 +132,7 @@ static int dmar_msi_set_affinity(struct irq_data *data, >>> msg.address_lo |= MSI_ADDR_DEST_ID_CPU(cpu_physical_id(cpu)); >>> dmar_msi_write(irq, &msg); >>> - cpumask_copy(irq_data_get_affinity_mask(data), mask); >>> + irq_data_update_affinity(data, mask); >>> return 0; >>>} >>> diff --git a/arch/parisc/kernel/irq.c b/arch/parisc/kernel/irq.c >>> index 0fe2d79fb123..5ebb1771b4ab 100644 >>> --- a/arch/parisc/kernel/irq.c >>> +++ b/arch/parisc/kernel/irq.c >>> @@ -315,7 +315,7 @@ unsigned long txn_affinity_addr(unsigned int irq, int >>> cpu) >>>{ >>>#ifdef CONFIG_SMP >>> struct irq_data *d = irq_get_irq_data(irq); >>> - cpumask_copy(irq_data_get_affinity_mask(d), cpumask_of(cpu)); >>> + irq_data_update_affinity(d, cpumask_of(cpu)); >>>#endif >>> return per_cpu(cpu_data, cpu).txn_addr; >>> diff --git a/drivers/irqchip/irq-bcm6345-l1.c >>> b/drivers/irqchip/irq-bcm6345-l1.c >>> index 142a7431745f..6899e37810a8 100644 >>> --- a/drivers/irqchip/irq-bcm6345-l1.c >>> +++ b/drivers/irqchip/irq-bcm6345-l1.c >>> @@ -216,11 +216,11 @@ static int bcm6345_l1_set_affinity(struct irq_data *d, >>> enabled =
intc->cpus[old_cpu]->enable_cache[word] & mask; >>>
[PATCH] Config.mk: use newest Mini-OS commit
Switch to use the newest Mini-OS commit in order to get the recent
fixes.

Signed-off-by: Juergen Gross
---
 Config.mk | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Config.mk b/Config.mk
index a806ef0afb..e56844d964 100644
--- a/Config.mk
+++ b/Config.mk
@@ -230,7 +230,7 @@ MINIOS_UPSTREAM_URL ?= git://xenbits.xen.org/mini-os.git
 endif
 OVMF_UPSTREAM_REVISION ?= 7b4a99be8a39c12d3a7fc4b8db9f0eab4ac688d5
 QEMU_UPSTREAM_REVISION ?= master
-MINIOS_UPSTREAM_REVISION ?= 83ff43bff4bdd6879539fcb2b3d6ba5e61a64135
+MINIOS_UPSTREAM_REVISION ?= 5bcb28aaeba1c2506a82fab0cdad0201cd9b54b3
 SEABIOS_UPSTREAM_REVISION ?= rel-1.16.0

--
2.35.3
[PATCH v8 8/9] xen: introduce prepare_staticmem_pages
Later, we want to use acquire_domstatic_pages() for populating memory
for a static domain at runtime. However, there is a lot of pointless work
(checking mfn_valid(), scrubbing the free part, cleaning the cache...)
considering we know the page is valid and belongs to the guest.

This commit splits acquire_staticmem_pages() in two parts, and
introduces prepare_staticmem_pages to bypass all "pointless work".

Signed-off-by: Penny Zheng
Acked-by: Jan Beulich
---
v8 changes:
- no change
---
v7 changes:
- no change
---
v6 changes:
- adapt to PGC_static
---
v5 changes:
- new commit
---
 xen/common/page_alloc.c | 61 -
 1 file changed, 36 insertions(+), 25 deletions(-)

diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
index b01272a59a..6112f6a3ed 100644
--- a/xen/common/page_alloc.c
+++ b/xen/common/page_alloc.c
@@ -2702,26 +2702,13 @@ void free_domstatic_page(struct page_info *page)
     put_domain(d);
 }
 
-/*
- * Acquire nr_mfns contiguous reserved pages, starting at #smfn, of
- * static memory.
- * This function needs to be reworked if used outside of boot.
- */
-static struct page_info * __init acquire_staticmem_pages(mfn_t smfn,
-                                                         unsigned long nr_mfns,
-                                                         unsigned int memflags)
+static bool __init prepare_staticmem_pages(struct page_info *pg,
+                                           unsigned long nr_mfns,
+                                           unsigned int memflags)
 {
     bool need_tlbflush = false;
     uint32_t tlbflush_timestamp = 0;
     unsigned long i;
-    struct page_info *pg;
-
-    ASSERT(nr_mfns);
-    for ( i = 0; i < nr_mfns; i++ )
-        if ( !mfn_valid(mfn_add(smfn, i)) )
-            return NULL;
-
-    pg = mfn_to_page(smfn);
 
     spin_lock(&heap_lock);
 
@@ -2732,7 +2719,7 @@ static struct page_info * __init acquire_staticmem_pages(mfn_t smfn,
         {
             printk(XENLOG_ERR
                    "pg[%lu] Static MFN %"PRI_mfn" c=%#lx t=%#x\n",
-                   i, mfn_x(smfn) + i,
+                   i, mfn_x(page_to_mfn(pg)) + i,
                    pg[i].count_info, pg[i].tlbflush_timestamp);
             goto out_err;
         }
@@ -2756,6 +2743,38 @@ static struct page_info * __init acquire_staticmem_pages(mfn_t smfn,
     if ( need_tlbflush )
         filtered_flush_tlb_mask(tlbflush_timestamp);
 
+    return true;
+
+ out_err:
+    while ( i-- )
+        pg[i].count_info = PGC_static | PGC_state_free;
+
+    spin_unlock(&heap_lock);
+
+    return false;
+}
+
+/*
+ * Acquire nr_mfns contiguous reserved pages, starting at #smfn, of
+ * static memory.
+ * This function needs to be reworked if used outside of boot.
+ */
+static struct page_info * __init acquire_staticmem_pages(mfn_t smfn,
+                                                         unsigned long nr_mfns,
+                                                         unsigned int memflags)
+{
+    unsigned long i;
+    struct page_info *pg;
+
+    ASSERT(nr_mfns);
+    for ( i = 0; i < nr_mfns; i++ )
+        if ( !mfn_valid(mfn_add(smfn, i)) )
+            return NULL;
+
+    pg = mfn_to_page(smfn);
+    if ( !prepare_staticmem_pages(pg, nr_mfns, memflags) )
+        return NULL;
+
     /*
      * Ensure cache and RAM are consistent for platforms where the guest
      * can control its own visibility of/through the cache.
@@ -2764,14 +2783,6 @@ static struct page_info * __init acquire_staticmem_pages(mfn_t smfn,
         flush_page_to_ram(mfn_x(smfn) + i, !(memflags & MEMF_no_icache_flush));
 
     return pg;
-
- out_err:
-    while ( i-- )
-        pg[i].count_info = PGC_static | PGC_state_free;
-
-    spin_unlock(&heap_lock);
-
-    return NULL;
 }
 
 /*
--
2.25.1
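The shape of the split, a validating wrapper around a "prepare" step that rolls back on failure, can be sketched in isolation. This is a user-space illustration only: struct page, the two-state enum, prepare_pages() and acquire_pages() are invented stand-ins, and the locking, TLB flushing and cache maintenance of the real code are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical page states standing in for PGC_state_free / PGC_state_inuse. */
enum page_state { PAGE_FREE, PAGE_INUSE };

struct page {
    enum page_state state;
};

/*
 * Mark nr pages starting at pg as in use. If any page is not free, undo
 * the pages already claimed and report failure (cf. the out_err path).
 */
static bool prepare_pages(struct page *pg, size_t nr)
{
    size_t i;

    for (i = 0; i < nr; i++) {
        if (pg[i].state != PAGE_FREE)
            goto out_err;
        pg[i].state = PAGE_INUSE;
    }
    return true;

out_err:
    while (i--)               /* roll back pages [0, i) only */
        pg[i].state = PAGE_FREE;
    return false;
}

/*
 * Boot-time style wrapper: validate the range first, then hand over to
 * prepare_pages(). A caller that already trusts the range can call
 * prepare_pages() directly and skip the validation.
 */
static struct page *acquire_pages(struct page *pool, size_t pool_len,
                                  size_t start, size_t nr)
{
    if (nr == 0 || start > pool_len || nr > pool_len - start)
        return NULL;          /* stands in for the mfn_valid() loop */
    if (!prepare_pages(&pool[start], nr))
        return NULL;
    return &pool[start];
}
```

Note that the rollback leaves the page that caused the failure untouched, since it was never claimed by this call.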
Re: [PATCH] tools/init-xenstore-domain: fix memory map for PVH stubdom
Ping?

On 24.06.22 11:28, Juergen Gross wrote:
> In case of maxmem != memsize the E820 map of the PVH stubdom is wrong,
> as it is missing the RAM above memsize.
>
> Additionally the MMIO area should only cover the HVM special pages.
>
> Signed-off-by: Juergen Gross
> ---
>  tools/helpers/init-xenstore-domain.c | 16 ++--
>  1 file changed, 10 insertions(+), 6 deletions(-)
>
> diff --git a/tools/helpers/init-xenstore-domain.c b/tools/helpers/init-xenstore-domain.c
> index b4f3c65a8a..dad8e43c42 100644
> --- a/tools/helpers/init-xenstore-domain.c
> +++ b/tools/helpers/init-xenstore-domain.c
> @@ -71,8 +71,9 @@ static int build(xc_interface *xch)
>      char cmdline[512];
>      int rv, xs_fd;
>      struct xc_dom_image *dom = NULL;
> -    int limit_kb = (maxmem ? : (memory + 1)) * 1024;
> +    int limit_kb = (maxmem ? : memory) * 1024 + X86_HVM_NR_SPECIAL_PAGES * 4;
>      uint64_t mem_size = MB(memory);
> +    uint64_t max_size = MB(maxmem);
>      struct e820entry e820[3];
>      struct xen_domctl_createdomain config = {
>          .ssidref = SECINITSID_DOMU,
> @@ -157,21 +158,24 @@ static int build(xc_interface *xch)
>          config.flags |= XEN_DOMCTL_CDF_hvm | XEN_DOMCTL_CDF_hap;
>          config.arch.emulation_flags = XEN_X86_EMU_LAPIC;
>          dom->target_pages = mem_size >> XC_PAGE_SHIFT;
> -        dom->mmio_size = GB(4) - LAPIC_BASE_ADDRESS;
> +        dom->mmio_size = X86_HVM_NR_SPECIAL_PAGES << XC_PAGE_SHIFT;
>          dom->lowmem_end = (mem_size > LAPIC_BASE_ADDRESS) ?
>                            LAPIC_BASE_ADDRESS : mem_size;
>          dom->highmem_end = (mem_size > LAPIC_BASE_ADDRESS) ?
>                             GB(4) + mem_size - LAPIC_BASE_ADDRESS : 0;
> -        dom->mmio_start = LAPIC_BASE_ADDRESS;
> +        dom->mmio_start = (X86_HVM_END_SPECIAL_REGION -
> +                           X86_HVM_NR_SPECIAL_PAGES) << XC_PAGE_SHIFT;
>          dom->max_vcpus = 1;
>          e820[0].addr = 0;
> -        e820[0].size = dom->lowmem_end;
> +        e820[0].size = (max_size > LAPIC_BASE_ADDRESS) ?
> +                       LAPIC_BASE_ADDRESS : max_size;
>          e820[0].type = E820_RAM;
> -        e820[1].addr = LAPIC_BASE_ADDRESS;
> +        e820[1].addr = dom->mmio_start;
>          e820[1].size = dom->mmio_size;
>          e820[1].type = E820_RESERVED;
>          e820[2].addr = GB(4);
> -        e820[2].size = dom->highmem_end - GB(4);
> +        e820[2].size = (max_size > LAPIC_BASE_ADDRESS) ?
> +                       max_size - LAPIC_BASE_ADDRESS : 0;
>          e820[2].type = E820_RAM;
>      }
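To see what map the patched code produces, the three E820 entries can be computed stand-alone. This is a sketch under assumptions: build_e820() is an invented helper that takes max_size directly, and the numeric values of LAPIC_BASE_ADDRESS, X86_HVM_NR_SPECIAL_PAGES and X86_HVM_END_SPECIAL_REGION below are assumed rather than taken from this patch, so they should be checked against the Xen headers.

```c
#include <assert.h>
#include <stdint.h>

#define MB(x) ((uint64_t)(x) << 20)
#define GB(x) ((uint64_t)(x) << 30)
#define SKETCH_PAGE_SHIFT 12

/* Assumed values; verify against the Xen headers before relying on them. */
#define LAPIC_BASE_ADDRESS         0xfee00000ULL
#define X86_HVM_NR_SPECIAL_PAGES   8ULL
#define X86_HVM_END_SPECIAL_REGION 0xff000ULL

#define E820_RAM      1
#define E820_RESERVED 2

struct e820entry {
    uint64_t addr;
    uint64_t size;
    uint32_t type;
};

/* Fill the three-entry map following the arithmetic in the patch. */
static void build_e820(struct e820entry e820[3], uint64_t max_size)
{
    /* The reserved hole covers only the HVM special pages below 4 GiB. */
    uint64_t mmio_start = (X86_HVM_END_SPECIAL_REGION -
                           X86_HVM_NR_SPECIAL_PAGES) << SKETCH_PAGE_SHIFT;
    uint64_t mmio_size = X86_HVM_NR_SPECIAL_PAGES << SKETCH_PAGE_SHIFT;

    /* Low RAM: everything up to max_size, capped at the LAPIC base. */
    e820[0].addr = 0;
    e820[0].size = (max_size > LAPIC_BASE_ADDRESS) ?
                   LAPIC_BASE_ADDRESS : max_size;
    e820[0].type = E820_RAM;

    e820[1].addr = mmio_start;
    e820[1].size = mmio_size;
    e820[1].type = E820_RESERVED;

    /* High RAM: whatever did not fit below the LAPIC base, at 4 GiB. */
    e820[2].addr = GB(4);
    e820[2].size = (max_size > LAPIC_BASE_ADDRESS) ?
                   max_size - LAPIC_BASE_ADDRESS : 0;
    e820[2].type = E820_RAM;
}
```

With the assumed constants, a 64 MiB maximum yields a single 64 MiB RAM entry, a 32 KiB reserved entry at 0xfeff8000 and an empty high-memory entry.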
[PATCH v8 9/9] xen: retrieve reserved pages on populate_physmap
When a static domain populates memory through populate_physmap at runtime, it shall retrieve reserved pages from resv_page_list to make sure that guest RAM is still restricted in statically configured memory regions. This commit also introduces a new helper acquire_reserved_page to make it work. Signed-off-by: Penny Zheng --- v8 changes: - As concurrent free/allocate could modify the resv_page_list, we still need the lock --- v7 changes: - remove the lock, since we add the page to rsv_page_list after it has been totally freed. --- v6 changes: - drop the lock before returning --- v5 changes: - extract common codes for assigning pages into a helper assign_domstatic_pages - refine commit message - remove stub function acquire_reserved_page - Alloc/free of memory can happen concurrently. So access to rsv_page_list needs to be protected with a spinlock --- v4 changes: - miss dropping __init in acquire_domstatic_pages - add the page back to the reserved list in case of error - remove redundant printk - refine log message and make it warn level --- v3 changes: - move is_domain_using_staticmem to the common header file - remove #ifdef CONFIG_STATIC_MEMORY-ary - remove meaningless page_to_mfn(page) in error log --- v2 changes: - introduce acquire_reserved_page to retrieve reserved pages from resv_page_list - forbid non-zero-order requests in populate_physmap - let is_domain_static return ((void)(d), false) on x86 --- xen/common/memory.c | 23 ++ xen/common/page_alloc.c | 70 +++-- xen/include/xen/mm.h| 1 + 3 files changed, 77 insertions(+), 17 deletions(-) diff --git a/xen/common/memory.c b/xen/common/memory.c index f2d009843a..cb330ce877 100644 --- a/xen/common/memory.c +++ b/xen/common/memory.c @@ -245,6 +245,29 @@ static void populate_physmap(struct memop_args *a) mfn = _mfn(gpfn); } +else if ( is_domain_using_staticmem(d) ) +{ +/* + * No easy way to guarantee the retrieved pages are contiguous, + * so forbid non-zero-order requests here. 
+ */ +if ( a->extent_order != 0 ) +{ +gdprintk(XENLOG_WARNING, + "Cannot allocate static order-%u pages for static %pd\n", + a->extent_order, d); +goto out; +} + +mfn = acquire_reserved_page(d, a->memflags); +if ( mfn_eq(mfn, INVALID_MFN) ) +{ +gdprintk(XENLOG_WARNING, + "%pd: failed to retrieve a reserved page\n", + d); +goto out; +} +} else { page = alloc_domheap_pages(d, a->extent_order, a->memflags); diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c index 6112f6a3ed..390a9c002d 100644 --- a/xen/common/page_alloc.c +++ b/xen/common/page_alloc.c @@ -2702,9 +2702,8 @@ void free_domstatic_page(struct page_info *page) put_domain(d); } -static bool __init prepare_staticmem_pages(struct page_info *pg, - unsigned long nr_mfns, - unsigned int memflags) +static bool prepare_staticmem_pages(struct page_info *pg, unsigned long nr_mfns, +unsigned int memflags) { bool need_tlbflush = false; uint32_t tlbflush_timestamp = 0; @@ -2785,21 +2784,9 @@ static struct page_info * __init acquire_staticmem_pages(mfn_t smfn, return pg; } -/* - * Acquire nr_mfns contiguous pages, starting at #smfn, of static memory, - * then assign them to one specific domain #d. - */ -int __init acquire_domstatic_pages(struct domain *d, mfn_t smfn, - unsigned int nr_mfns, unsigned int memflags) +static int assign_domstatic_pages(struct domain *d, struct page_info *pg, + unsigned int nr_mfns, unsigned int memflags) { -struct page_info *pg; - -ASSERT_ALLOC_CONTEXT(); - -pg = acquire_staticmem_pages(smfn, nr_mfns, memflags); -if ( !pg ) -return -ENOENT; - if ( !d || (memflags & (MEMF_no_owner | MEMF_no_refcount)) ) { /* @@ -2818,6 +2805,55 @@ int __init acquire_domstatic_pages(struct domain *d, mfn_t smfn, return 0; } + +/* + * Acquire nr_mfns contiguous pages, starting at #smfn, of static memory, + * then assign them to one specific domain #d. 
+ */ +int __init acquire_domstatic_pages(struct domain *d, mfn_t smfn, + unsigned int nr_mfns, unsigned int memflags) +{ +struct page_info *pg; + +ASSERT_ALLOC_CONTEXT(); + +pg = acquire_staticmem_pages(smfn, nr_mfns, memflags); +if ( !pg ) +return -ENOENT; + +if ( assign_domstatic_pages(d, pg, nr_mfns,
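The populate path described in this patch can be pictured with a small stand-alone model. This is only a sketch under stated assumptions, not the Xen code: struct page, struct dom, resv_add_tail() and take_reserved_page() are hypothetical stand-ins, and the locking that the v8 changelog asks for is only indicated by a comment.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for Xen's page_info and domain; not the real types. */
struct page {
    struct page *next;
    unsigned long mfn;
};

struct dom {
    struct page *resv_head;   /* models d->resv_page_list */
};

#define INVALID_MFN (~0UL)

/* Append a page to the tail of the reserved list. */
static void resv_add_tail(struct dom *d, struct page *pg)
{
    struct page **pp = &d->resv_head;

    assert(pg != NULL);
    while ( *pp )
        pp = &(*pp)->next;
    pg->next = NULL;
    *pp = pg;
}

/*
 * Model of the populate path: only order-0 requests are served, because
 * pages on the reserved list are not guaranteed to be contiguous.
 * In the hypervisor the list would be accessed under a lock; this
 * single-threaded model leaves the lock out.
 */
static unsigned long take_reserved_page(struct dom *d, unsigned int order)
{
    struct page *pg;

    if ( order != 0 || !d->resv_head )
        return INVALID_MFN;

    pg = d->resv_head;
    d->resv_head = pg->next;
    pg->next = NULL;

    return pg->mfn;
}
```

In this model a caller treats INVALID_MFN as "nothing could be retrieved", which mirrors the mfn_eq(mfn, INVALID_MFN) check in the populate_physmap() hunk above.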
[PATCH v8 6/9] xen/arm: introduce CDF_staticmem
In order to have an easy and quick way to find out whether a domain's
memory is statically configured, this commit introduces a new flag
CDF_staticmem and a new helper is_domain_using_staticmem() to test it.

Signed-off-by: Penny Zheng
---
v8 changes:
- #ifdef-ary around is_domain_using_staticmem() is not needed anymore
---
v7 changes:
- IS_ENABLED(CONFIG_STATIC_MEMORY) would not be needed anymore
---
v6 changes:
- move non-zero is_domain_using_staticmem() from ARM header to common header
---
v5 changes:
- guard "is_domain_using_staticmem" under CONFIG_STATIC_MEMORY
- #define is_domain_using_staticmem zero if undefined
---
v4 changes:
- no changes
---
v3 changes:
- change name from "is_domain_static()" to "is_domain_using_staticmem"
---
v2 changes:
- change name from "is_domain_on_static_allocation" to "is_domain_static()"
---
 xen/arch/arm/domain_build.c | 5 ++++-
 xen/include/xen/domain.h    | 8 ++++++++
 2 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
index 3fd1186b53..b76a84e8f5 100644
--- a/xen/arch/arm/domain_build.c
+++ b/xen/arch/arm/domain_build.c
@@ -3287,9 +3287,12 @@ void __init create_domUs(void)
         if ( !dt_device_is_compatible(node, "xen,domain") )
             continue;
 
+        if ( dt_find_property(node, "xen,static-mem", NULL) )
+            flags |= CDF_staticmem;
+
         if ( dt_property_read_bool(node, "direct-map") )
         {
-            if ( !IS_ENABLED(CONFIG_STATIC_MEMORY) || !dt_find_property(node, "xen,static-mem", NULL) )
+            if ( !(flags & CDF_staticmem) )
                 panic("direct-map is not valid for domain %s without static allocation.\n",
                       dt_node_name(node));
 
diff --git a/xen/include/xen/domain.h b/xen/include/xen/domain.h
index 628b14b086..2c8116afba 100644
--- a/xen/include/xen/domain.h
+++ b/xen/include/xen/domain.h
@@ -35,6 +35,14 @@ void arch_get_domain_info(const struct domain *d,
 /* Should domain memory be directly mapped? */
 #define CDF_directmap            (1U << 1)
 #endif
+/* Is domain memory on static allocation? */
+#ifdef CONFIG_STATIC_MEMORY
+#define CDF_staticmem            (1U << 2)
+#else
+#define CDF_staticmem            0
+#endif
+
+#define is_domain_using_staticmem(d) ((d)->cdf & CDF_staticmem)
 
 /*
  * Arch-specifics.
-- 
2.25.1
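The reason no #ifdef is needed around is_domain_using_staticmem() is that the flag collapses to 0 when the feature is compiled out, so the test becomes a constant false. A minimal stand-alone sketch of that pattern follows. struct dom and flags_are_valid() are hypothetical names for illustration, and CONFIG_STATIC_MEMORY is defined by hand here rather than coming from Kconfig.

```c
#include <assert.h>
#include <stdbool.h>

/* Defined by hand for this model; remove to model a build without the feature. */
#define CONFIG_STATIC_MEMORY 1

#define CDF_directmap (1U << 1)
#ifdef CONFIG_STATIC_MEMORY
#define CDF_staticmem (1U << 2)
#else
#define CDF_staticmem 0          /* every test against it is constant false */
#endif

/* Hypothetical stand-in for struct domain; only the flag field is modelled. */
struct dom {
    unsigned int cdf;
};

#define is_domain_using_staticmem(d) ((d)->cdf & CDF_staticmem)
#define is_domain_direct_mapped(d)   ((d)->cdf & CDF_directmap)

/* Models the create_domUs() rule: direct-map requires static allocation. */
static bool flags_are_valid(unsigned int flags)
{
    if ( (flags & CDF_directmap) && !(flags & CDF_staticmem) )
        return false;

    return true;
}
```

With CDF_staticmem defined as 0, flags_are_valid() rejects every direct-map request, which matches the effect of the IS_ENABLED(CONFIG_STATIC_MEMORY) check that the patch removes.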
[PATCH v8 7/9] xen/arm: unpopulate memory when domain is static
Today, when a domain unpopulates memory at runtime, the memory is always
handed back to the heap allocator. This is a problem if the domain is
static.

Pages used as guest RAM for a static domain shall be reserved to this
domain only and not be used for any other purpose, so they shall never go
back to the heap allocator.

This commit puts reserved pages on the new list resv_page_list only after
they have been taken off the "normal" list, when the last reference is
dropped.

Signed-off-by: Penny Zheng
---
v8 changes:
- adapt this patch for newly introduced free_domstatic_page
- order as a parameter is not needed here, as all staticmem operations are
  limited to order-0 regions
- move d->page_alloc_lock after operation on d->resv_page_list
---
v7 changes:
- Add page on the rsv_page_list *after* it has been freed
---
v6 changes:
- refine in-code comment
- move PGC_static !CONFIG_STATIC_MEMORY definition to common header
---
v5 changes:
- adapt this patch for PGC_staticmem
---
v4 changes:
- no changes
---
v3 changes:
- have page_list_del() just once out of the if()
- remove resv_pages counter
- make arch_free_heap_page be an expression, not a compound statement.
---
v2 changes:
- put reserved pages on resv_page_list after having taken them off
  the "normal" list
---
 xen/common/domain.c     |  4 ++++
 xen/common/page_alloc.c | 10 ++++++++--
 xen/include/xen/mm.h    |  6 ++++++
 xen/include/xen/sched.h |  3 +++
 4 files changed, 21 insertions(+), 2 deletions(-)

diff --git a/xen/common/domain.c b/xen/common/domain.c
index 875730df50..4043498ffa 100644
--- a/xen/common/domain.c
+++ b/xen/common/domain.c
@@ -604,6 +604,10 @@ struct domain *domain_create(domid_t domid,
     INIT_PAGE_LIST_HEAD(&d->page_list);
     INIT_PAGE_LIST_HEAD(&d->extra_page_list);
     INIT_PAGE_LIST_HEAD(&d->xenpage_list);
+#ifdef CONFIG_STATIC_MEMORY
+    INIT_PAGE_LIST_HEAD(&d->resv_page_list);
+#endif
+
 
     spin_lock_init(&d->node_affinity_lock);
     d->node_affinity = NODE_MASK_ALL;
diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
index 3260490688..b01272a59a 100644
--- a/xen/common/page_alloc.c
+++ b/xen/common/page_alloc.c
@@ -2681,8 +2681,6 @@ void free_domstatic_page(struct page_info *page)
         need_scrub = d->is_dying || scrub_debug || opt_scrub_domheap;
 
         drop_dom_ref = !domain_adjust_tot_pages(d, -1);
-
-        spin_unlock_recursive(&d->page_alloc_lock);
     }
     else
     {
@@ -2692,6 +2690,14 @@ void free_domstatic_page(struct page_info *page)
 
     free_staticmem_pages(page, 1, need_scrub);
 
+    if ( likely(d) )
+    {
+        /* Add page on the resv_page_list *after* it has been freed. */
+        if ( !drop_dom_ref )
+            put_static_page(d, page);
+        spin_unlock_recursive(&d->page_alloc_lock);
+    }
+
     if ( drop_dom_ref )
         put_domain(d);
 }
diff --git a/xen/include/xen/mm.h b/xen/include/xen/mm.h
index f1a7d5c991..07b8a45f1a 100644
--- a/xen/include/xen/mm.h
+++ b/xen/include/xen/mm.h
@@ -91,6 +91,12 @@ void free_staticmem_pages(struct page_info *pg, unsigned long nr_mfns,
 void free_domstatic_page(struct page_info *page);
 int acquire_domstatic_pages(struct domain *d, mfn_t smfn, unsigned int nr_mfns,
                             unsigned int memflags);
+#ifdef CONFIG_STATIC_MEMORY
+#define put_static_page(d, page) \
+    page_list_add_tail((page), &(d)->resv_page_list)
+#else
+#define put_static_page(d, page) ((void)(d), (void)(page))
+#endif
 
 /* Map machine page range in Xen virtual address space. */
 int map_pages_to_xen(
diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index 98e8001c89..d4fbd3dea7 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -381,6 +381,9 @@ struct domain
     struct page_list_head page_list;  /* linked list */
     struct page_list_head extra_page_list; /* linked list (size extra_pages) */
     struct page_list_head xenpage_list; /* linked list (size xenheap_pages) */
+#ifdef CONFIG_STATIC_MEMORY
+    struct page_list_head resv_page_list; /* linked list */
+#endif
 
     /*
      * This field should only be directly accessed by domain_adjust_tot_pages()
-- 
2.25.1
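The ordering this patch settles on (free the page first, then park it on the reserved list unless the domain itself is going away) can be shown with a small single-threaded model. The types and the free_static_page() helper below are hypothetical stand-ins, not the hypervisor's; the tot_pages counter only models the "last page gone" condition that domain_adjust_tot_pages() reports in the real code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; the real code uses struct page_info / struct domain. */
struct page {
    struct page *next;
    bool free;
};

struct dom {
    struct page *resv_head;   /* models d->resv_page_list */
    struct page *resv_tail;
    unsigned int tot_pages;
};

/* Tail insertion, as page_list_add_tail() does for the real list. */
static void put_static_page(struct dom *d, struct page *pg)
{
    pg->next = NULL;
    if ( d->resv_tail )
        d->resv_tail->next = pg;
    else
        d->resv_head = pg;
    d->resv_tail = pg;
}

/* Returns true when the caller should drop its reference on the domain. */
static bool free_static_page(struct dom *d, struct page *pg)
{
    bool drop_dom_ref;

    assert(d->tot_pages > 0);
    drop_dom_ref = (--d->tot_pages == 0);

    pg->free = true;              /* stands in for free_staticmem_pages() */

    /* Only a fully freed page goes onto the reserved list. */
    if ( !drop_dom_ref )
        put_static_page(d, pg);

    return drop_dom_ref;
}
```

When the last page is released the model skips the list insertion, matching the !drop_dom_ref condition in the hunk above.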
[PATCH v8 5/9] xen: add field "flags" to cover all internal CDF_XXX
With more and more CDF_xxx internal flags being added, and in order to
save space, this commit introduces a new field "cdf" (originally proposed
as "flags") in struct domain to store the CDF_* internal flags directly.

Another new CDF_xxx will be introduced in the next patch.

Signed-off-by: Penny Zheng
Acked-by: Julien Grall
---
v8 changes:
- no change
---
v7 changes:
- no change
---
v6 changes:
- no change
---
v5 changes:
- no change
---
v4 changes:
- no change
---
v3 changes:
- change fixed width type uint32_t to unsigned int
- change "flags" to a more descriptive name "cdf"
---
v2 changes:
- let "flags" live in the struct domain. So other arch can take
  advantage of it in the future
- fix coding style
---
 xen/arch/arm/domain.c             | 2 --
 xen/arch/arm/include/asm/domain.h | 3 +--
 xen/common/domain.c               | 3 +++
 xen/include/xen/sched.h           | 3 +++
 4 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
index 2f8eaab7b5..4722988ee7 100644
--- a/xen/arch/arm/domain.c
+++ b/xen/arch/arm/domain.c
@@ -709,8 +709,6 @@ int arch_domain_create(struct domain *d,
     ioreq_domain_init(d);
 #endif
 
-    d->arch.directmap = flags & CDF_directmap;
-
     /* p2m_init relies on some value initialized by the IOMMU subsystem */
     if ( (rc = iommu_domain_init(d, config->iommu_opts)) != 0 )
         goto fail;
diff --git a/xen/arch/arm/include/asm/domain.h b/xen/arch/arm/include/asm/domain.h
index ed63c2b6f9..fe7a029ebf 100644
--- a/xen/arch/arm/include/asm/domain.h
+++ b/xen/arch/arm/include/asm/domain.h
@@ -29,7 +29,7 @@ enum domain_type {
 #define is_64bit_domain(d) (0)
 #endif
 
-#define is_domain_direct_mapped(d) (d)->arch.directmap
+#define is_domain_direct_mapped(d) ((d)->cdf & CDF_directmap)
 
 /*
  * Is the domain using the host memory layout?
@@ -103,7 +103,6 @@ struct arch_domain
     void *tee;
 #endif
 
-    bool directmap;
 }  __cacheline_aligned;
 
 struct arch_vcpu
diff --git a/xen/common/domain.c b/xen/common/domain.c
index 3b1169d79b..875730df50 100644
--- a/xen/common/domain.c
+++ b/xen/common/domain.c
@@ -567,6 +567,9 @@ struct domain *domain_create(domid_t domid,
     /* Sort out our idea of is_system_domain(). */
     d->domain_id = domid;
 
+    /* Holding CDF_* internal flags. */
+    d->cdf = flags;
+
     /* Debug sanity. */
     ASSERT(is_system_domain(d) ? config == NULL : config != NULL);
 
diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index b9515eb497..98e8001c89 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -596,6 +596,9 @@ struct domain
         struct ioreq_server     *server[MAX_NR_IOREQ_SERVERS];
     } ioreq_server;
 #endif
+
+    /* Holding CDF_* constant. Internal flags for domain creation. */
+    unsigned int cdf;
 };
 
 static inline struct page_list_head *page_to_list(
-- 
2.25.1
[PATCH v8 4/9] xen: do not merge reserved pages in free_heap_pages()
The code in free_heap_pages() will try to merge pages with the
successor/predecessor if pages are suitably aligned. So if the reserved
pages are right next to the pages given to the heap allocator,
free_heap_pages() will merge them, and accidentally give the reserved
pages to the heap allocator as a result.

To avoid this scenario, this commit updates free_heap_pages() to check
whether the predecessor and/or successor has PGC_static set when trying
to merge the about-to-be-freed chunk with the predecessor and/or
successor.

Suggested-by: Julien Grall
Signed-off-by: Penny Zheng
Reviewed-by: Jan Beulich
Reviewed-by: Julien Grall
---
v8 changes:
- no change
---
v7 changes:
- no change
---
v6 changes:
- adapt to PGC_static
---
v5 changes:
- change PGC_reserved to adapt to PGC_staticmem
---
v4 changes:
- no changes
---
v3 changes:
- no changes
---
v2 changes:
- new commit
---
 xen/common/page_alloc.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
index 9a80ca10fa..3260490688 100644
--- a/xen/common/page_alloc.c
+++ b/xen/common/page_alloc.c
@@ -1475,6 +1475,7 @@ static void free_heap_pages(
             /* Merge with predecessor block? */
             if ( !mfn_valid(page_to_mfn(predecessor)) ||
                  !page_state_is(predecessor, free) ||
+                 (predecessor->count_info & PGC_static) ||
                  (PFN_ORDER(predecessor) != order) ||
                  (phys_to_nid(page_to_maddr(predecessor)) != node) )
                 break;
@@ -1498,6 +1499,7 @@ static void free_heap_pages(
             /* Merge with successor block? */
             if ( !mfn_valid(page_to_mfn(successor)) ||
                  !page_state_is(successor, free) ||
+                 (successor->count_info & PGC_static) ||
                  (PFN_ORDER(successor) != order) ||
                  (phys_to_nid(page_to_maddr(successor)) != node) )
                 break;
-- 
2.25.1
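The merge rule can be illustrated in isolation. The following is a simplified model, not the allocator itself: the bit values, struct page and can_merge_with() are hypothetical, and the mfn_valid() and NUMA-node checks of the real condition are left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bit values, chosen for this model only. */
#define PGC_state_free 0x1u
#define PGC_static     0x2u

struct page {
    unsigned int count_info;
    unsigned int order;
};

/*
 * A neighbouring block may be merged only if it is free, is NOT static
 * memory, and has the same order as the chunk being freed.
 */
static bool can_merge_with(const struct page *neighbour, unsigned int order)
{
    if ( !(neighbour->count_info & PGC_state_free) )
        return false;

    if ( neighbour->count_info & PGC_static )
        return false;

    return neighbour->order == order;
}
```

A free static neighbour is refused even though it is in the free state, which is exactly the case the extra check is meant to catch.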
[PATCH v8 1/9] xen/arm: rename PGC_reserved to PGC_static
PGC_reserved could be ambiguous, and we have to tell what the pages are reserved for, so this commit intends to rename PGC_reserved to PGC_static, which clearly indicates the page is reserved for static memory. Signed-off-by: Penny Zheng Acked-by: Jan Beulich --- v8 changes - no change --- v7 changes: - no change --- v6 changes: - rename PGC_staticmem to PGC_static --- v5 changes: - new commit --- xen/arch/arm/include/asm/mm.h | 6 +++--- xen/common/page_alloc.c | 22 +++--- 2 files changed, 14 insertions(+), 14 deletions(-) diff --git a/xen/arch/arm/include/asm/mm.h b/xen/arch/arm/include/asm/mm.h index c4bc3cd1e5..8b2481c1f3 100644 --- a/xen/arch/arm/include/asm/mm.h +++ b/xen/arch/arm/include/asm/mm.h @@ -108,9 +108,9 @@ struct page_info /* Page is Xen heap? */ #define _PGC_xen_heap PG_shift(2) #define PGC_xen_heap PG_mask(1, 2) - /* Page is reserved */ -#define _PGC_reserved PG_shift(3) -#define PGC_reserved PG_mask(1, 3) + /* Page is static memory */ +#define _PGC_staticPG_shift(3) +#define PGC_static PG_mask(1, 3) /* ... */ /* Page is broken? */ #define _PGC_broken PG_shift(7) diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c index fe0e15429a..ed56379b96 100644 --- a/xen/common/page_alloc.c +++ b/xen/common/page_alloc.c @@ -151,8 +151,8 @@ #define p2m_pod_offline_or_broken_replace(pg) BUG_ON(pg != NULL) #endif -#ifndef PGC_reserved -#define PGC_reserved 0 +#ifndef PGC_static +#define PGC_static 0 #endif /* @@ -2286,7 +2286,7 @@ int assign_pages( for ( i = 0; i < nr; i++ ) { -ASSERT(!(pg[i].count_info & ~(PGC_extra | PGC_reserved))); +ASSERT(!(pg[i].count_info & ~(PGC_extra | PGC_static))); if ( pg[i].count_info & PGC_extra ) extra_pages++; } @@ -2346,7 +2346,7 @@ int assign_pages( page_set_owner([i], d); smp_wmb(); /* Domain pointer must be visible before updating refcnt. 
*/ pg[i].count_info = -(pg[i].count_info & (PGC_extra | PGC_reserved)) | PGC_allocated | 1; +(pg[i].count_info & (PGC_extra | PGC_static)) | PGC_allocated | 1; page_list_add_tail([i], page_to_list(d, [i])); } @@ -2652,8 +2652,8 @@ void __init free_staticmem_pages(struct page_info *pg, unsigned long nr_mfns, scrub_one_page(pg); } -/* In case initializing page of static memory, mark it PGC_reserved. */ -pg[i].count_info |= PGC_reserved; +/* In case initializing page of static memory, mark it PGC_static. */ +pg[i].count_info |= PGC_static; } } @@ -2682,8 +2682,8 @@ static struct page_info * __init acquire_staticmem_pages(mfn_t smfn, for ( i = 0; i < nr_mfns; i++ ) { -/* The page should be reserved and not yet allocated. */ -if ( pg[i].count_info != (PGC_state_free | PGC_reserved) ) +/* The page should be static and not yet allocated. */ +if ( pg[i].count_info != (PGC_state_free | PGC_static) ) { printk(XENLOG_ERR "pg[%lu] Static MFN %"PRI_mfn" c=%#lx t=%#x\n", @@ -2697,10 +2697,10 @@ static struct page_info * __init acquire_staticmem_pages(mfn_t smfn, _timestamp); /* - * Preserve flag PGC_reserved and change page state + * Preserve flag PGC_static and change page state * to PGC_state_inuse. */ -pg[i].count_info = PGC_reserved | PGC_state_inuse; +pg[i].count_info = PGC_static | PGC_state_inuse; /* Initialise fields which have other uses for free pages. */ pg[i].u.inuse.type_info = 0; page_set_owner([i], NULL); @@ -2722,7 +2722,7 @@ static struct page_info * __init acquire_staticmem_pages(mfn_t smfn, out_err: while ( i-- ) -pg[i].count_info = PGC_reserved | PGC_state_free; +pg[i].count_info = PGC_static | PGC_state_free; spin_unlock(_lock); -- 2.25.1
[PATCH v8 3/9] xen: update SUPPORT.md for static allocation
SUPPORT.md doesn't explicitly say whether static memory is supported, so
this commit updates SUPPORT.md to add the static allocation feature,
marked as Tech Preview for now.

Signed-off-by: Penny Zheng
Reviewed-by: Stefano Stabellini
---
v8 changes:
- no change
---
v7 changes:
- no change
---
v6 changes:
- use domain instead of sub-systems
---
v5 changes:
- new commit
---
 SUPPORT.md | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/SUPPORT.md b/SUPPORT.md
index 70e98964cb..8e040d1c1e 100644
--- a/SUPPORT.md
+++ b/SUPPORT.md
@@ -286,6 +286,13 @@ to boot with memory < maxmem.
 
     Status, x86 HVM: Supported
 
+### Static Allocation
+
+Static allocation refers to domains for which memory areas are
+pre-defined by configuration using physical address ranges.
+
+    Status, ARM: Tech Preview
+
 ### Memory Sharing
 
 Allow sharing of identical pages between guests
-- 
2.25.1
[PATCH v8 2/9] xen: do not free reserved memory into heap
Pages used as guest RAM for static domain, shall be reserved to this domain only. So in case reserved pages being used for other purpose, users shall not free them back to heap, even when last ref gets dropped. This commit introduces a new helper free_domstatic_page to free static page in runtime, and free_staticmem_pages will be called by it in runtime, so let's drop the __init flag. Signed-off-by: Penny Zheng --- v8 changes: - introduce new helper free_domstatic_page - let put_page call free_domstatic_page for static page, when last ref drops - #define PGC_static zero when !CONFIG_STATIC_MEMORY in xen/mm.h, as it is used outside page_alloc.c --- v7 changes: - protect free_staticmem_pages with heap_lock to match its reverse function acquire_staticmem_pages --- v6 changes: - adapt to PGC_static - remove #ifdef aroud function declaration --- v5 changes: - In order to avoid stub functions, we #define PGC_staticmem to non-zero only when CONFIG_STATIC_MEMORY - use "unlikely()" around pg->count_info & PGC_staticmem - remove pointless "if", since mark_page_free() is going to set count_info to PGC_state_free and by consequence clear PGC_staticmem - move #define PGC_staticmem 0 to mm.h --- v4 changes: - no changes --- v3 changes: - fix possible racy issue in free_staticmem_pages() - introduce a stub free_staticmem_pages() for the !CONFIG_STATIC_MEMORY case - move the change to free_heap_pages() to cover other potential call sites - fix the indentation --- v2 changes: - new commit --- --- xen/arch/arm/include/asm/mm.h | 4 ++- xen/arch/arm/mm.c | 2 ++ xen/common/page_alloc.c | 51 ++- xen/include/xen/mm.h | 7 +++-- 4 files changed, 54 insertions(+), 10 deletions(-) diff --git a/xen/arch/arm/include/asm/mm.h b/xen/arch/arm/include/asm/mm.h index 8b2481c1f3..f1640bbda4 100644 --- a/xen/arch/arm/include/asm/mm.h +++ b/xen/arch/arm/include/asm/mm.h @@ -108,9 +108,11 @@ struct page_info /* Page is Xen heap? 
*/ #define _PGC_xen_heap PG_shift(2) #define PGC_xen_heap PG_mask(1, 2) - /* Page is static memory */ +#ifdef CONFIG_STATIC_MEMORY +/* Page is static memory */ #define _PGC_staticPG_shift(3) #define PGC_static PG_mask(1, 3) +#endif /* ... */ /* Page is broken? */ #define _PGC_broken PG_shift(7) diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c index 009b8cd9ef..a3bc6d7a24 100644 --- a/xen/arch/arm/mm.c +++ b/xen/arch/arm/mm.c @@ -1622,6 +1622,8 @@ void put_page(struct page_info *page) if ( unlikely((nx & PGC_count_mask) == 0) ) { +if ( unlikely(nx & PGC_static) ) +free_domstatic_page(page); free_domheap_page(page); } } diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c index ed56379b96..9a80ca10fa 100644 --- a/xen/common/page_alloc.c +++ b/xen/common/page_alloc.c @@ -151,10 +151,6 @@ #define p2m_pod_offline_or_broken_replace(pg) BUG_ON(pg != NULL) #endif -#ifndef PGC_static -#define PGC_static 0 -#endif - /* * Comma-separated list of hexadecimal page numbers containing bad bytes. * e.g. 'badpage=0x3f45,0x8a321'. @@ -2636,12 +2632,14 @@ struct domain *get_pg_owner(domid_t domid) #ifdef CONFIG_STATIC_MEMORY /* Equivalent of free_heap_pages to free nr_mfns pages of static memory. */ -void __init free_staticmem_pages(struct page_info *pg, unsigned long nr_mfns, - bool need_scrub) +void free_staticmem_pages(struct page_info *pg, unsigned long nr_mfns, + bool need_scrub) { mfn_t mfn = page_to_mfn(pg); unsigned long i; +spin_lock(_lock); + for ( i = 0; i < nr_mfns; i++ ) { mark_page_free([i], mfn_add(mfn, i)); @@ -2652,9 +2650,48 @@ void __init free_staticmem_pages(struct page_info *pg, unsigned long nr_mfns, scrub_one_page(pg); } -/* In case initializing page of static memory, mark it PGC_static. */ pg[i].count_info |= PGC_static; } + +spin_unlock(_lock); +} + +void free_domstatic_page(struct page_info *page) +{ +struct domain *d = page_get_owner(page); +bool drop_dom_ref, need_scrub; + +ASSERT_ALLOC_CONTEXT(); + +if ( likely(d) ) +{ +/* NB. 
May recursively lock from relinquish_memory(). */ +spin_lock_recursive(>page_alloc_lock); + +arch_free_heap_page(d, page); + +/* + * Normally we expect a domain to clear pages before freeing them, + * if it cares about the secrecy of their contents. However, after + * a domain has died we assume responsibility for erasure. We do + * scrub regardless if option scrub_domheap is set. + */ +need_scrub = d->is_dying || scrub_debug || opt_scrub_domheap; + +drop_dom_ref = !domain_adjust_tot_pages(d, -1); + +spin_unlock_recursive(>page_alloc_lock); +} +else +{ +drop_dom_ref = false; +need_scrub = true; +} + +
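The put_page() change in this patch routes the last reference drop of a static page to free_domstatic_page(). In the hunk as quoted, free_domheap_page() is still an unconditional context line after the new "if"; the sketch below assumes the two paths are meant to be mutually exclusive, as the commit message describes. The bit layout and put_page_model() are hypothetical and only model the dispatch decision.

```c
#include <assert.h>

/* Hypothetical bit layout for this model; not Xen's real count_info layout. */
#define PGC_count_mask 0x00ffu
#define PGC_static     0x0100u

enum free_path { FREE_NONE, FREE_STATIC, FREE_HEAP };

/*
 * Drop one reference and report which free path would be taken.
 * Assumption: a static page never reaches the heap path.
 */
static enum free_path put_page_model(unsigned int *count_info)
{
    unsigned int nx;

    assert((*count_info & PGC_count_mask) != 0);

    nx = *count_info - 1;
    *count_info = nx;

    if ( (nx & PGC_count_mask) != 0 )
        return FREE_NONE;       /* other references remain */

    if ( nx & PGC_static )
        return FREE_STATIC;     /* would call free_domstatic_page() */

    return FREE_HEAP;           /* would call free_domheap_page() */
}
```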
[PATCH v8 0/9] populate/unpopulate memory when domain on static allocation
Today, when a domain unpopulates memory at runtime, the memory is always
handed over to the heap allocator. This is a problem if the domain is
static.

Pages used as guest RAM for a static domain shall always be reserved to
this domain only, and not be used for any other purpose, so they shall
never go back to the heap allocator.

This patch series intends to fix this issue by adding pages to the new
list resv_page_list after taking them off the "normal" list when
unpopulating memory, and by retrieving pages from resv_page_list when
populating memory.

---
v8 changes:
- introduce new helper free_domstatic_page
- let put_page call free_domstatic_page for static page, when last ref
  drops
- #define PGC_static zero when !CONFIG_STATIC_MEMORY, as it is used
  outside page_alloc.c
- #ifdef-ary around is_domain_using_staticmem() is not needed anymore
- order as a parameter is not needed here, as all staticmem operations are
  limited to order-0 regions
- move d->page_alloc_lock after operation on d->resv_page_list
- As concurrent free/allocate could modify the resv_page_list, we still
  need the lock
---
v7 changes:
- protect free_staticmem_pages with heap_lock to match its reverse function
  acquire_staticmem_pages
- IS_ENABLED(CONFIG_STATIC_MEMORY) would not be needed anymore
- add page on the rsv_page_list *after* it has been freed
- remove the lock, since we add the page to rsv_page_list after it has
  been totally freed.
---
v6 changes:
- rename PGC_staticmem to PGC_static
- remove #ifdef around function declaration
- use domain instead of sub-systems
- move non-zero is_domain_using_staticmem() from ARM header to common
  header
- move PGC_static !CONFIG_STATIC_MEMORY definition to common header
- drop the lock before returning
---
v5 changes:
- introduce three new commits
- In order to avoid stub functions, we #define PGC_staticmem to non-zero
  only when CONFIG_STATIC_MEMORY
- use "unlikely()" around pg->count_info & PGC_staticmem
- remove pointless "if", since mark_page_free() is going to set count_info
  to PGC_state_free and by consequence clear PGC_staticmem
- move #define PGC_staticmem 0 to mm.h
- guard "is_domain_using_staticmem" under CONFIG_STATIC_MEMORY
- #define is_domain_using_staticmem zero if undefined
- extract common codes for assigning pages into a helper
  assign_domstatic_pages
- refine commit message
- remove stub function acquire_reserved_page
- Alloc/free of memory can happen concurrently. So access to rsv_page_list
  needs to be protected with a spinlock
---
v4 changes:
- commit message refinement
- miss dropping __init in acquire_domstatic_pages
- add the page back to the reserved list in case of error
- remove redundant printk
- refine log message and make it warn level
- guard "is_domain_using_staticmem" under CONFIG_STATIC_MEMORY
- #define is_domain_using_staticmem zero if undefined
---
v3 changes:
- fix possible racy issue in free_staticmem_pages()
- introduce a stub free_staticmem_pages() for the !CONFIG_STATIC_MEMORY case
- move the change to free_heap_pages() to cover other potential call sites
- change fixed width type uint32_t to unsigned int
- change "flags" to a more descriptive name "cdf"
- change name from "is_domain_static()" to "is_domain_using_staticmem"
- have page_list_del() just once out of the if()
- remove resv_pages counter
- make arch_free_heap_page be an expression, not a compound statement.
- move #ifndef is_domain_using_staticmem to the common header file
- remove #ifdef CONFIG_STATIC_MEMORY-ary
- remove meaningless page_to_mfn(page) in error log
---
v2 changes:
- let "flags" live in the struct domain. So other arch can take
  advantage of it in the future
- change name from "is_domain_on_static_allocation" to "is_domain_static()"
- put reserved pages on resv_page_list after having taken them off
  the "normal" list
- introduce acquire_reserved_page to retrieve reserved pages from
  resv_page_list
- forbid non-zero-order requests in populate_physmap
- let is_domain_static return ((void)(d), false) on x86
- fix coding style

Penny Zheng (9):
  xen/arm: rename PGC_reserved to PGC_static
  xen: do not free reserved memory into heap
  xen: update SUPPORT.md for static allocation
  xen: do not merge reserved pages in free_heap_pages()
  xen: add field "flags" to cover all internal CDF_XXX
  xen/arm: introduce CDF_staticmem
  xen/arm: unpopulate memory when domain is static
  xen: introduce prepare_staticmem_pages
  xen: retrieve reserved pages on populate_physmap

 SUPPORT.md                        | 7 ++
 xen/arch/arm/domain.c             | 2 -
 xen/arch/arm/domain_build.c       | 5 +-
 xen/arch/arm/include/asm/domain.h | 3 +-
 xen/arch/arm/include/asm/mm.h     | 8 +-
 xen/arch/arm/mm.c                 | 2 +
 xen/common/domain.c               | 7 ++
 xen/common/memory.c               | 23
 xen/common/page_alloc.c           | 190 ++
 xen/include/xen/domain.h          | 8 ++
 xen/include/xen/mm.h              |
Re: [PATCH v3 6/8] genirq: Add and use an irq_data_update_affinity helper
On Sun, 03 Jul 2022 16:22:03 +0100, Oleksandr wrote: > > > On 01.07.22 23:00, Samuel Holland wrote: > > > Hello Samuel > > > Some architectures and irqchip drivers modify the cpumask returned by > > irq_data_get_affinity_mask, usually by copying in to it. This is > > problematic for uniprocessor configurations, where the affinity mask > > should be constant, as it is known at compile time. > > > > Add and use a setter for the affinity mask, following the pattern of > > irq_data_update_effective_affinity. This allows the getter function to > > return a const cpumask pointer. > > > > Signed-off-by: Samuel Holland > > --- > > > > Changes in v3: > > - New patch to introduce irq_data_update_affinity > > > > arch/alpha/kernel/irq.c | 2 +- > > arch/ia64/kernel/iosapic.c | 2 +- > > arch/ia64/kernel/irq.c | 4 ++-- > > arch/ia64/kernel/msi_ia64.c | 4 ++-- > > arch/parisc/kernel/irq.c | 2 +- > > drivers/irqchip/irq-bcm6345-l1.c | 4 ++-- > > drivers/parisc/iosapic.c | 2 +- > > drivers/sh/intc/chip.c | 2 +- > > drivers/xen/events/events_base.c | 7 --- > > include/linux/irq.h | 6 ++ > > 10 files changed, 21 insertions(+), 14 deletions(-) > > > > diff --git a/arch/alpha/kernel/irq.c b/arch/alpha/kernel/irq.c > > index f6d2946edbd2..15f2effd6baf 100644 > > --- a/arch/alpha/kernel/irq.c > > +++ b/arch/alpha/kernel/irq.c > > @@ -60,7 +60,7 @@ int irq_select_affinity(unsigned int irq) > > cpu = (cpu < (NR_CPUS-1) ? 
cpu + 1 : 0); > > last_cpu = cpu; > > - cpumask_copy(irq_data_get_affinity_mask(data), > > cpumask_of(cpu)); > > + irq_data_update_affinity(data, cpumask_of(cpu)); > > chip->irq_set_affinity(data, cpumask_of(cpu), false); > > return 0; > > } > > diff --git a/arch/ia64/kernel/iosapic.c b/arch/ia64/kernel/iosapic.c > > index 35adcf89035a..99300850abc1 100644 > > --- a/arch/ia64/kernel/iosapic.c > > +++ b/arch/ia64/kernel/iosapic.c > > @@ -834,7 +834,7 @@ iosapic_unregister_intr (unsigned int gsi) > > if (iosapic_intr_info[irq].count == 0) { > > #ifdef CONFIG_SMP > > /* Clear affinity */ > > - cpumask_setall(irq_get_affinity_mask(irq)); > > + irq_data_update_affinity(irq_get_irq_data(irq), cpu_all_mask); > > #endif > > /* Clear the interrupt information */ > > iosapic_intr_info[irq].dest = 0; > > diff --git a/arch/ia64/kernel/irq.c b/arch/ia64/kernel/irq.c > > index ecef17c7c35b..275b9ea58c64 100644 > > --- a/arch/ia64/kernel/irq.c > > +++ b/arch/ia64/kernel/irq.c > > @@ -57,8 +57,8 @@ static char irq_redir [NR_IRQS]; // = { [0 ... 
NR_IRQS-1] > > = 1 }; > > void set_irq_affinity_info (unsigned int irq, int hwid, int redir) > > { > > if (irq < NR_IRQS) { > > - cpumask_copy(irq_get_affinity_mask(irq), > > -cpumask_of(cpu_logical_id(hwid))); > > + irq_data_update_affinity(irq_get_irq_data(irq), > > +cpumask_of(cpu_logical_id(hwid))); > > irq_redir[irq] = (char) (redir & 0xff); > > } > > } > > diff --git a/arch/ia64/kernel/msi_ia64.c b/arch/ia64/kernel/msi_ia64.c > > index df5c28f252e3..025e5133c860 100644 > > --- a/arch/ia64/kernel/msi_ia64.c > > +++ b/arch/ia64/kernel/msi_ia64.c > > @@ -37,7 +37,7 @@ static int ia64_set_msi_irq_affinity(struct irq_data > > *idata, > > msg.data = data; > > pci_write_msi_msg(irq, ); > > - cpumask_copy(irq_data_get_affinity_mask(idata), cpumask_of(cpu)); > > + irq_data_update_affinity(idata, cpumask_of(cpu)); > > return 0; > > } > > @@ -132,7 +132,7 @@ static int dmar_msi_set_affinity(struct irq_data *data, > > msg.address_lo |= MSI_ADDR_DEST_ID_CPU(cpu_physical_id(cpu)); > > dmar_msi_write(irq, ); > > - cpumask_copy(irq_data_get_affinity_mask(data), mask); > > + irq_data_update_affinity(data, mask); > > return 0; > > } > > diff --git a/arch/parisc/kernel/irq.c b/arch/parisc/kernel/irq.c > > index 0fe2d79fb123..5ebb1771b4ab 100644 > > --- a/arch/parisc/kernel/irq.c > > +++ b/arch/parisc/kernel/irq.c > > @@ -315,7 +315,7 @@ unsigned long txn_affinity_addr(unsigned int irq, int > > cpu) > > { > > #ifdef CONFIG_SMP > > struct irq_data *d = irq_get_irq_data(irq); > > - cpumask_copy(irq_data_get_affinity_mask(d), cpumask_of(cpu)); > > + irq_data_update_affinity(d, cpumask_of(cpu)); > > #endif > > return per_cpu(cpu_data, cpu).txn_addr; > > diff --git a/drivers/irqchip/irq-bcm6345-l1.c > > b/drivers/irqchip/irq-bcm6345-l1.c > > index 142a7431745f..6899e37810a8 100644 > > --- a/drivers/irqchip/irq-bcm6345-l1.c > > +++ b/drivers/irqchip/irq-bcm6345-l1.c > > @@ -216,11 +216,11 @@ static int bcm6345_l1_set_affinity(struct irq_data *d, > > enabled = 
intc->cpus[old_cpu]->enable_cache[word] & mask; > > if (enabled) > > __bcm6345_l1_mask(d); > > -
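The getter/setter split the patch description argues for can be sketched outside the kernel. This is a model under stated assumptions, not the kernel implementation: struct cpumask here is a single word, and struct irq_data_model, model_get_affinity_mask() and model_update_affinity() are hypothetical names that only mirror the shape of irq_data_get_affinity_mask() and irq_data_update_affinity().

```c
#include <assert.h>

/* One-word stand-in for the kernel's cpumask; for illustration only. */
struct cpumask {
    unsigned long bits;
};

/* Hypothetical stand-in for struct irq_data. */
struct irq_data_model {
    struct cpumask affinity;
};

/* Getter returns a const pointer, so callers cannot write through it. */
static const struct cpumask *
model_get_affinity_mask(const struct irq_data_model *d)
{
    return &d->affinity;
}

/* All writes go through the setter, mirroring the new helper's role. */
static void model_update_affinity(struct irq_data_model *d,
                                  const struct cpumask *m)
{
    assert(m != 0);
    d->affinity = *m;
}
```

With writes funnelled through one function, a configuration with a fixed mask could turn the setter into a no-op without touching any caller, which is the motivation given above for uniprocessor builds.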
[qemu-mainline test] 171539: tolerable FAIL - PUSHED
flight 171539 qemu-mainline real [real]
flight 171543 qemu-mainline real-retest [real]
http://logs.test-lab.xenproject.org/osstest/logs/171539/
http://logs.test-lab.xenproject.org/osstest/logs/171543/

Failures :-/ but no regressions.

Tests which are failing intermittently (not blocking):
 test-amd64-amd64-xl-qcow2 22 guest-start.2 fail pass in 171543-retest
 test-amd64-i386-xl-vhd 21 guest-start/debian.repeat fail pass in 171543-retest

Regressions which are regarded as allowable (not blocking):
 test-armhf-armhf-xl-rtds 18 guest-start/debian.repeat fail REGR. vs. 171525

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stop fail like 171525
 test-armhf-armhf-libvirt 16 saverestore-support-check fail like 171525
 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 171525
 test-amd64-i386-xl-qemuu-win7-amd64 19 guest-stop fail like 171525
 test-armhf-armhf-libvirt-qcow2 15 saverestore-support-check fail like 171525
 test-armhf-armhf-libvirt-raw 15 saverestore-support-check fail like 171525
 test-amd64-i386-xl-qemuu-ws16-amd64 19 guest-stop fail like 171525
 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stop fail like 171525
 test-amd64-i386-xl-pvshim 14 guest-start fail never pass
 test-amd64-amd64-libvirt 15 migrate-support-check fail never pass
 test-amd64-i386-libvirt-xsm 15 migrate-support-check fail never pass
 test-amd64-i386-libvirt 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-credit2 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-credit2 16 saverestore-support-check fail never pass
 test-arm64-arm64-xl 15 migrate-support-check fail never pass
 test-arm64-arm64-xl 16 saverestore-support-check fail never pass
 test-arm64-arm64-libvirt-xsm 15 migrate-support-check fail never pass
 test-arm64-arm64-libvirt-xsm 16 saverestore-support-check fail never pass
 test-arm64-arm64-xl-xsm 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-xsm 16 saverestore-support-check fail never pass
 test-arm64-arm64-xl-thunderx 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-thunderx 16 saverestore-support-check fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check fail never pass
 test-armhf-armhf-xl-arndale 15 migrate-support-check fail never pass
 test-armhf-armhf-xl-arndale 16 saverestore-support-check fail never pass
 test-amd64-amd64-libvirt-vhd 14 migrate-support-check fail never pass
 test-amd64-amd64-libvirt-xsm 15 migrate-support-check fail never pass
 test-amd64-i386-libvirt-raw 14 migrate-support-check fail never pass
 test-arm64-arm64-libvirt-raw 14 migrate-support-check fail never pass
 test-arm64-arm64-libvirt-raw 15 saverestore-support-check fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check fail never pass
 test-armhf-armhf-xl 15 migrate-support-check fail never pass
 test-armhf-armhf-xl 16 saverestore-support-check fail never pass
 test-armhf-armhf-xl-multivcpu 15 migrate-support-check fail never pass
 test-armhf-armhf-xl-multivcpu 16 saverestore-support-check fail never pass
 test-arm64-arm64-xl-vhd 14 migrate-support-check fail never pass
 test-armhf-armhf-xl-credit2 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-vhd 15 saverestore-support-check fail never pass
 test-armhf-armhf-xl-credit2 16 saverestore-support-check fail never pass
 test-armhf-armhf-xl-credit1 15 migrate-support-check fail never pass
 test-armhf-armhf-xl-credit1 16 saverestore-support-check fail never pass
 test-armhf-armhf-libvirt 15 migrate-support-check fail never pass
 test-armhf-armhf-xl-rtds 15 migrate-support-check fail never pass
 test-armhf-armhf-xl-rtds 16 saverestore-support-check fail never pass
 test-armhf-armhf-xl-cubietruck 15 migrate-support-check fail never pass
 test-armhf-armhf-xl-cubietruck 16 saverestore-support-check fail never pass
 test-arm64-arm64-xl-credit1 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-credit1 16 saverestore-support-check fail never pass
 test-arm64-arm64-xl-seattle 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-seattle 16 saverestore-support-check fail never pass
 test-armhf-armhf-libvirt-qcow2 14 migrate-support-check fail never pass
 test-armhf-armhf-xl-vhd 14 migrate-support-check fail never pass
 test-armhf-armhf-xl-vhd 15 saverestore-support-check fail never pass
 test-armhf-armhf-libvirt-raw 14 migrate-support-check fail never pass
Re: [PATCH v3 1/8] irqchip/mips-gic: Only register IPI domain when SMP is enabled
On Tue, 05 Jul 2022 14:52:43 +0100, Serge Semin wrote:
>
> Hi Samuel
>
> On Fri, Jul 01, 2022 at 03:00:49PM -0500, Samuel Holland wrote:
> > The MIPS GIC irqchip driver may be selected in a uniprocessor
> > configuration, but it unconditionally registers an IPI domain.
> >
> > Limit the part of the driver dealing with IPIs to only be compiled when
> > GENERIC_IRQ_IPI is enabled, which corresponds to an SMP configuration.
>
> Thanks for the patch. Some comment is below.
>
> >
> > Reported-by: kernel test robot
> > Signed-off-by: Samuel Holland
> > ---
> >
> > Changes in v3:
> >  - New patch to fix build errors in uniprocessor MIPS configs
> >
> >  drivers/irqchip/Kconfig        |  3 +-
> >  drivers/irqchip/irq-mips-gic.c | 80 +++--
> >  2 files changed, 56 insertions(+), 27 deletions(-)
> >
> > diff --git a/drivers/irqchip/Kconfig b/drivers/irqchip/Kconfig
> > index 1f23a6be7d88..d26a4ff7c99f 100644
> > --- a/drivers/irqchip/Kconfig
> > +++ b/drivers/irqchip/Kconfig
> > @@ -322,7 +322,8 @@ config KEYSTONE_IRQ
> >
> >  config MIPS_GIC
> >          bool
> > -        select GENERIC_IRQ_IPI
> > +        select GENERIC_IRQ_IPI if SMP
>
> > +        select IRQ_DOMAIN_HIERARCHY
>
> It seems to me that the IRQ domains hierarchy is supposed to be
> created only if IPI is required. At least that's what the MIPS GIC
> driver implies. Thus we can go further and CONFIG_IRQ_DOMAIN_HIERARCHY
> ifdef-out the gic_irq_domain_alloc() and gic_irq_domain_free()
> methods definition together with the initialization:
>
>  static const struct irq_domain_ops gic_irq_domain_ops = {
>          .xlate = gic_irq_domain_xlate,
> +#ifdef CONFIG_IRQ_DOMAIN_HIERARCHY
>          .alloc = gic_irq_domain_alloc,
>          .free = gic_irq_domain_free,
> +#endif
>          .map = gic_irq_domain_map,
>  };
>
> If the GENERIC_IRQ_IPI config is enabled, CONFIG_IRQ_DOMAIN_HIERARCHY
> will be automatically selected (see the config definition in
> kernel/irq/Kconfig). If the IRQs hierarchy is needed for some another
> functionality like GENERIC_MSI_IRQ_DOMAIN or GPIOs then they will
> explicitly enable the IRQ_DOMAIN_HIERARCHY config thus activating the
> denoted .alloc and .free methods definitions.
>
> To sum up you can get rid of the IRQ_DOMAIN_HIERARCHY config
> force-select from this patch and make the MIPS GIC driver code a bit
> more coherent.
>
> @Marc, please correct me if were wrong.

Either way probably works correctly, but Samuel's approach is more
readable IMO. It is far easier to reason about a high-level feature
(GENERIC_IRQ_IPI) than an implementation detail (IRQ_DOMAIN_HIERARCHY).

If you really want to save a handful of bytes, you can make the
callbacks conditional on GENERIC_IRQ_IPI, and be done with it. But
this can come as an additional patch.

Thanks,

	M.

--
Without deviation from the norm, progress is not possible.
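
[Editor's sketch] The closing suggestion in this message, making the callbacks conditional on the high-level feature rather than on the implementation detail, can be illustrated outside the kernel roughly as follows. This is a minimal stand-alone model, not the irq-mips-gic.c code: `struct demo_domain_ops`, every `demo_*` function, and the `DEMO_GENERIC_IRQ_IPI` macro are hypothetical stand-ins for `struct irq_domain_ops`, the `gic_irq_domain_*` callbacks, and `CONFIG_GENERIC_IRQ_IPI`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct irq_domain_ops; only the members
 * named in the thread are modelled. */
struct demo_domain_ops {
    int (*xlate)(unsigned int hwirq);
    int (*alloc)(unsigned int virq);
    void (*free)(unsigned int virq);
    int (*map)(unsigned int virq);
};

static int demo_xlate(unsigned int hwirq) { return (int)hwirq; }
static int demo_map(unsigned int virq) { (void)virq; return 0; }

/* The hierarchy callbacks exist only when the high-level feature is on.
 * DEMO_GENERIC_IRQ_IPI is an assumed macro, left undefined by default. */
#ifdef DEMO_GENERIC_IRQ_IPI
static int demo_alloc(unsigned int virq) { (void)virq; return 0; }
static void demo_free(unsigned int virq) { (void)virq; }
#endif

static const struct demo_domain_ops demo_ops = {
    .xlate = demo_xlate,
#ifdef DEMO_GENERIC_IRQ_IPI
    .alloc = demo_alloc,
    .free = demo_free,
#endif
    .map = demo_map,
};

/* Returns 1 when both hierarchy callbacks were compiled in, else 0.
 * Members omitted from the initializer are null pointers. */
int demo_has_hierarchy_callbacks(void)
{
    /* The unconditional callbacks must always be present. */
    assert(demo_ops.xlate != NULL && demo_ops.map != NULL);
    return demo_ops.alloc != NULL && demo_ops.free != NULL;
}

/* Calls through the table to show it remains usable either way. */
int demo_translate(unsigned int hwirq)
{
    return demo_ops.xlate(hwirq);
}
```

Building with `-DDEMO_GENERIC_IRQ_IPI` compiles the two extra callbacks in; without it they cost no code or data, which is the "handful of bytes" trade-off mentioned above.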
Re: [PATCH 3/4] xen/arm: domain: Fix MISRA C 2012 Rule 8.7 violation
On 07.07.2022 09:27, Xenia Ragiadakou wrote:
> On 7/6/22 11:51, Jan Beulich wrote:
>> On 06.07.2022 10:43, Xenia Ragiadakou wrote:
>>> On 7/6/22 10:10, Jan Beulich wrote:
>>>> On 05.07.2022 23:02, Xenia Ragiadakou wrote:
>>>>> The function idle_loop() is referenced only in domain.c.
>>>>> Change its linkage from external to internal by adding the storage-class
>>>>> specifier static to its definitions.
>>>>>
>>>>> Since idle_loop() is referenced only in inline assembly, add the 'used'
>>>>> attribute to suppress unused-function compiler warning.
>>>>
>>>> While I see that Julien has already acked the patch, I'd like to point
>>>> out that using __used here is somewhat bogus. Imo the better approach
>>>> is to properly make visible to the compiler that the symbol is used by
>>>> the asm(), by adding a fake(?) input.
>>>
>>> I'm afraid I do not understand what do you mean by "adding a fake(?)
>>> input". Can you please elaborate a little on your suggestion?
>>
>> Once the asm() in question takes the function as an input, the compiler
>> will know that the function has a user (unless, of course, it finds a
>> way to elide the asm() itself). The "fake(?)" was because I'm not deeply
>> enough into Arm inline assembly to know whether the input could then
>> also be used as an instruction operand (which imo would be preferable) -
>> if it can't (e.g. because there's no suitable constraint or operand
>> modifier), it still can be an input just to inform the compiler.
>
> According to the following statement, your approach is the recommended one:
> "To prevent the compiler from removing global data or functions which
> are referenced from inline assembly statements, you can:
> -use __attribute__((used)) with the global data or functions.
> -pass the reference to global data or functions as operands to inline
> assembly statements.
> Arm recommends passing the reference to global data or functions as
> operands to inline assembly statements so that if the final image does
> not require the inline assembly statements and the referenced global
> data or function, then they can be removed."
>
> IIUC, you are suggesting to change
> asm volatile ("mov sp,%0; b " STR(fn) : : "r" (stack) : "memory" )
> into
> asm volatile ("mov sp,%0; b %1" : : "r" (stack), "S" (fn) : "memory" );

Yes, except that I can't judge about the "S" constraint.

> This gives, respectively:
> reset_stack_and_jump(idle_loop);
>  460: 911f mov sp, x0
>  464: 1407 b 480
>
> reset_stack_and_jump(idle_loop);
>  460: 911f mov sp, x0
>  464: 1400 b 600
>
> With this change, the functions return_to_new_vcpu32 and
> return_to_new_vcpu64, implemented in assembly and called in the same way
> as idle_loop(), need to be declared.

Which imo is a good thing - these probably shouldn't have lived entirely
behind the back of the compiler.

Jan
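
[Editor's sketch] The idea discussed in this thread, naming a function as an operand of the asm() so the compiler can see that it has a user, can be tried in a portable form. The following is a minimal sketch assuming GCC or Clang extended asm; it is not the Xen reset_stack_and_jump() macro. It uses an empty asm template with a generic "r"-class operand instead of the Arm-specific "S" constraint and branch instruction, and all `demo_*` names are hypothetical.

```c
#include <assert.h>

typedef void (*demo_fn)(void);

static int demo_idle_count;

/* Internal linkage, and no direct C caller: the only reference to this
 * function is the asm operand below, so neither __attribute__((used))
 * nor external linkage is needed to keep it alive. */
static void demo_idle(void)
{
    demo_idle_count++;
}

/* Passes demo_idle's address through an empty asm statement. The "0"
 * matching constraint places the input in the same register as output
 * operand 0, so the address comes back unchanged while the compiler
 * records that the asm consumes the function. */
static demo_fn demo_jump_target(void)
{
    demo_fn target;

    __asm__ volatile ("" : "=r" (target) : "0" (&demo_idle) : "memory");
    assert(target != 0);
    return target;
}

/* Calls the function obtained via the asm operand and returns how many
 * times it has run so far. */
int demo_run_once(void)
{
    demo_jump_target()();
    return demo_idle_count;
}
```

Because the operand is an ordinary C expression, the compiler also type-checks it, which is why assembly-only symbols referenced this way need a declaration, as noted at the end of the message above.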
Re: [PATCH 3/4] xen/arm: domain: Fix MISRA C 2012 Rule 8.7 violation
Hi Jan,

On 7/6/22 11:51, Jan Beulich wrote:
> On 06.07.2022 10:43, Xenia Ragiadakou wrote:
>> On 7/6/22 10:10, Jan Beulich wrote:
>>> On 05.07.2022 23:02, Xenia Ragiadakou wrote:
>>>> The function idle_loop() is referenced only in domain.c.
>>>> Change its linkage from external to internal by adding the storage-class
>>>> specifier static to its definitions.
>>>>
>>>> Since idle_loop() is referenced only in inline assembly, add the 'used'
>>>> attribute to suppress unused-function compiler warning.
>>>
>>> While I see that Julien has already acked the patch, I'd like to point
>>> out that using __used here is somewhat bogus. Imo the better approach
>>> is to properly make visible to the compiler that the symbol is used by
>>> the asm(), by adding a fake(?) input.
>>
>> I'm afraid I do not understand what do you mean by "adding a fake(?)
>> input". Can you please elaborate a little on your suggestion?
>
> Once the asm() in question takes the function as an input, the compiler
> will know that the function has a user (unless, of course, it finds a
> way to elide the asm() itself). The "fake(?)" was because I'm not deeply
> enough into Arm inline assembly to know whether the input could then
> also be used as an instruction operand (which imo would be preferable) -
> if it can't (e.g. because there's no suitable constraint or operand
> modifier), it still can be an input just to inform the compiler.

According to the following statement, your approach is the recommended one:
"To prevent the compiler from removing global data or functions which
are referenced from inline assembly statements, you can:
-use __attribute__((used)) with the global data or functions.
-pass the reference to global data or functions as operands to inline
assembly statements.
Arm recommends passing the reference to global data or functions as
operands to inline assembly statements so that if the final image does
not require the inline assembly statements and the referenced global
data or function, then they can be removed."

IIUC, you are suggesting to change
asm volatile ("mov sp,%0; b " STR(fn) : : "r" (stack) : "memory" )
into
asm volatile ("mov sp,%0; b %1" : : "r" (stack), "S" (fn) : "memory" );

This gives, respectively:

reset_stack_and_jump(idle_loop);
 460: 911f mov sp, x0
 464: 1407 b 480

reset_stack_and_jump(idle_loop);
 460: 911f mov sp, x0
 464: 1400 b 600

With this change, the functions return_to_new_vcpu32 and
return_to_new_vcpu64, implemented in assembly and called in the same way
as idle_loop(), need to be declared.

--
Xenia
[ovmf test] 171540: all pass - PUSHED
flight 171540 ovmf real [real]
http://logs.test-lab.xenproject.org/osstest/logs/171540/

Perfect :-)
All tests in this flight passed as required

version targeted for testing:
 ovmf f193b945eac58ca379d3d21c77d5550b063580d6
baseline version:
 ovmf e1eef3a8b01a25e75abf63d15bdc90157a74cba9

Last test of basis 171446 2022-07-01 17:40:26 Z 5 days
Testing same since 171540 2022-07-07 01:10:28 Z 0 days 1 attempts

People who touched revisions under test:
 Kun Qin
 Kun Qin
 Kun Qin
 kuqin
 Michael Kubacki

jobs:
 build-amd64-xsm pass
 build-i386-xsm pass
 build-amd64 pass
 build-i386 pass
 build-amd64-libvirt pass
 build-i386-libvirt pass
 build-amd64-pvops pass
 build-i386-pvops pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64 pass
 test-amd64-i386-xl-qemuu-ovmf-amd64 pass

sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
 http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
 http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
 http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
 http://xenbits.xen.org/gitweb?p=osstest.git;a=summary

Pushing revision :

To xenbits.xen.org:/home/xen/git/osstest/ovmf.git
 e1eef3a8b0..f193b945ea f193b945eac58ca379d3d21c77d5550b063580d6 -> xen-tested-master
Re: [PATCH] x86/PAT: have pat_enabled() properly reflect state when running on e.g. Xen
On 06.07.2022 19:01, Borislav Petkov wrote:
> On Wed, Jul 06, 2022 at 08:17:41AM +0200, Jan Beulich wrote:
>> Sure, but that alone won't help.
>
> Well, the MTRR code looks at X86_FEATURE_MTRR. If Xen doesn't expose the
> MTRRs, then that bit should be clear in the CPUID the guest sees.
>
> So in that case, you could test X86_FEATURE_XENPV at the end of
> mtrr_bp_init() and not disable PAT if running as a PV guest. Would that
> work?
>
>> There's a beneficial side effect of running through pat_disable():
>> That way pat_init() will bail right away. Without that I'd need to
>> further special case things there (as under Xen/PV PAT must not be
>> written, only read)
>
> We have wrmsr_safe for that.

Well, right now the pvops hook for Xen swallows #GP anyway (wrongly
so imo, but any of my earlier pointing out of that has been left
unheard, despite even the code comments there saying "It may be worth
changing that"). The point is therefore that after writing PAT, it
would need reading back. In which case it feels (slightly) more clean
to me to avoid the write attempt in the first place, when we know it's
not going to work.

>> Any decent hypervisor will allow overriding CPUID, so in principle
>> I'd expect any to permit disabling MTRR to leave a guest to use
>> the (more modern and less cumbersome) PAT alone.
>
> So I'm being told that it would be generally beneficial for all kinds of
> virtualization solutions to be able to support PAT only, without MTRRs
> so it would be interesting to see how ugly it would become to decouple
> PAT from MTRRs in Linux...

If I may ask - doesn't this mean this patch, in its current shape, is
already a (small) step in that direction? In any event what you say
doesn't sound to me like a viable (backportable) route to addressing
the regression at hand.

Jan
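
[Editor's sketch] The point made in this message, that a "safe" write whose fault is swallowed reports success even when nothing was written and so the value has to be read back, can be modelled in plain C. This is a simulation under stated assumptions, not kernel code: `struct demo_msr`, all `demo_*` functions, and the values 0x11 and 0x22 are hypothetical and do not correspond to real PAT contents or to the actual wrmsr_safe() implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one register as a guest sees it. */
struct demo_msr {
    uint64_t value;        /* what a read returns */
    bool writes_dropped;   /* true: writes are silently discarded */
};

/* Models a write hook that swallows the fault: it reports success (0)
 * whether or not the value was actually stored. */
static int demo_wrmsr_safe(struct demo_msr *msr, uint64_t val)
{
    if (!msr->writes_dropped)
        msr->value = val;
    return 0;
}

static uint64_t demo_rdmsr(const struct demo_msr *msr)
{
    return msr->value;
}

/* Writes, then reads back; only the read-back tells whether the write
 * took effect, since the return code is always 0 in this model. */
bool demo_write_and_verify(struct demo_msr *msr, uint64_t val)
{
    assert(msr != NULL);
    if (demo_wrmsr_safe(msr, val) != 0)
        return false;
    return demo_rdmsr(msr) == val;
}

/* Convenience wrapper: builds a register with arbitrary demo contents
 * and reports whether programming a new value succeeded. */
bool demo_probe(bool writes_dropped)
{
    struct demo_msr msr = { .value = 0x11, .writes_dropped = writes_dropped };

    return demo_write_and_verify(&msr, 0x22);
}
```

In this model `demo_probe(true)` fails only at the read-back step, which is the extra work the message argues can be avoided by not attempting the write when it is known not to work.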
Re: [PATCH v7 00/14] IOMMU: superpage support when not sharing pagetables
On 05.07.2022 14:41, Jan Beulich wrote:
> For a long time we've been rather inefficient with IOMMU page table
> management when not sharing page tables, i.e. in particular for PV (and
> further specifically also for PV Dom0) and AMD (where nowadays we never
> share page tables). While up to about 3.5 years ago AMD code had logic
> to un-shatter page mappings, that logic was ripped out for being buggy
> (XSA-275 plus follow-on).
>
> This series enables use of large pages in AMD and Intel (VT-d) code;
> Arm is presently not in need of any enabling as pagetables are always
> shared there. It also augments PV Dom0 creation with suitable explicit
> IOMMU mapping calls to facilitate use of large pages there. Depending
> on the amount of memory handed to Dom0 this improves booting time
> (latency until Dom0 actually starts) quite a bit; subsequent shattering
> of some of the large pages may of course consume some of the saved time.
>
> Known fallout has been spelled out here:
> https://lists.xen.org/archives/html/xen-devel/2021-08/msg00781.html
>
> See individual patches for details on the v7 changes.
>
> 01: iommu: add preemption support to iommu_{un,}map()
> 02: IOMMU/x86: perform PV Dom0 mappings in batches

Paul, without meaning this to be a ping, may I ask whether - with Roger
away for the next two months - you could find time to review these first
two patches? I think this would then allow the entire series to go in.

Thanks, Jan

> 03: IOMMU/x86: support freeing of pagetables
> 04: IOMMU/x86: new command line option to suppress use of superpage mappings
> 05: AMD/IOMMU: allow use of superpage mappings
> 06: VT-d: allow use of superpage mappings
> 07: x86: introduce helper for recording degree of contiguity in page tables
> 08: IOMMU/x86: prefill newly allocate page tables
> 09: AMD/IOMMU: free all-empty page tables
> 10: VT-d: free all-empty page tables
> 11: AMD/IOMMU: replace all-contiguous page tables by superpage mappings
> 12: VT-d: replace all-contiguous page tables by superpage mappings
> 13: IOMMU/x86: add perf counters for page table splitting / coalescing
> 14: VT-d: fold dma_pte_clear_one() into its only caller
>
> While not directly related (except that making this mode work properly
> here was a fair part of the overall work), at this occasion I'd also
> like to renew my proposal to make "iommu=dom0-strict" the default going
> forward. It already is not only the default, but the only possible mode
> for PVH Dom0.
>
> Jan
>
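
[Editor's sketch] As background to the "use of large pages" the cover letter refers to, a mapping can only use a superpage when both frame numbers are aligned to the superpage size and enough frames remain. The following is a hedged illustration of that alignment check, not code from the series: the function name, its parameters, and the choice of orders 0, 9 and 18 (4 KiB, 2 MiB and 1 GiB with 4 KiB base pages on x86) are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Candidate mapping orders, largest first (assumed set for the sketch). */
static const unsigned int demo_orders[] = { 18, 9, 0 };

/* Returns the largest order usable for the next mapping that starts at
 * device frame dfn and machine frame mfn with nr frames still to map. */
unsigned int demo_mapping_order(uint64_t dfn, uint64_t mfn, uint64_t nr)
{
    size_t i;

    assert(nr > 0);
    for (i = 0; i < sizeof(demo_orders) / sizeof(demo_orders[0]); i++) {
        uint64_t frames = UINT64_C(1) << demo_orders[i];

        /* Both addresses must be aligned to the candidate size, and the
         * remaining range must cover at least one full superpage. */
        if (((dfn | mfn) & (frames - 1)) == 0 && nr >= frames)
            return demo_orders[i];
    }
    return 0; /* not reached: order 0 always matches when nr > 0 */
}
```

A caller would typically loop, mapping `1 << order` frames at a time and advancing `dfn`, `mfn` and `nr` accordingly, so a misaligned head or tail of a range falls back to smaller pages.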