Re: [PATCH 1/2] automation: add a QEMU aarch64 smoke test
On Fri, 13 Nov 2020, Stefano Stabellini wrote:
> Use QEMU to start Xen (just the hypervisor) up until it stops because
> there is no dom0 kernel to boot.
>
> It is based on the existing build job unstable-arm64v8.
>
> Also use make -j$(nproc) to build Xen.
>
> Signed-off-by: Stefano Stabellini
> ---
> [...]
> diff --git a/automation/scripts/build b/automation/scripts/build
> index 0cd0f3971d..95ad23eadb 100755
> --- a/automation/scripts/build
> +++ b/automation/scripts/build
> [...]
> @@ -45,9 +45,7 @@ make -j$(nproc) dist
>  # Extract artifacts to avoid getting rewritten by customised builds
>  cp xen/.config xen-config
>  mkdir binaries
> -if [[ "${XEN_TARGET_ARCH}" == "x86_64" ]]; then
> -    cp xen/xen binaries/xen
> -fi
> +cp xen/xen binaries/xen

This was a mistake, it should have been:

if [[ "${XEN_TARGET_ARCH}" != "x86_32" ]]; then
    cp xen/xen binaries/xen
fi

Unrelated: should we temporarily disable the stubdom build
(--disable-stubdom) in the gitlab build? (Even better if somebody
volunteers to fix the stubdom build somehow.) At the moment a bunch of
jobs are failing with:

mini-os.c: In function 'main':
mini-os.c:756: warning: cast from pointer to integer of different size
ar: creating /builds/xen-project/people/sstabellini/xen/stubdom/mini-os-x86_64-xenstorepvh/arch/x86/libx86_64.a
/builds/xen-project/people/sstabellini/xen/stubdom/mini-os-x86_64-xenstorepvh/mini-os.o: could not read symbols: File in wrong format
make[2]: *** [/builds/xen-project/people/sstabellini/xen/stubdom/mini-os-x86_64-xenstorepvh/mini-os] Error 1
make[1]: *** [xenstorepvh-stubdom] Error 2
make[1]: *** Waiting for unfinished jobs

With the above x86_32 fix, and stubdoms disabled, all jobs (including
the new aarch64 smoke test) complete successfully.
Re: [SPECIFICATION RFC] The firmware and bootloader log specification
On 11/13/20 3:52 PM, Daniel Kiper wrote:
> Hey,
>
> Here is the description (pseudocode) of the structures which will be
> used to store the log data.
>
> Anyway, I am aware that this is not specification per se.

Yes, you have caveats here.

I'm sure that you either already know or would learn soon enough that
struct bf_log has some padding added to it (for alignment) unless it is
packed. Or you could rearrange the order of some of its fields and save
8 bytes per struct on x86_64.

> struct bf_log
> {
>   uint32_t version;
>   char producer[64];
>   uint64_t flags;
>   uint64_t next_bf_log_addr;
>   uint32_t next_msg_off;
>   bf_log_msg msgs[];
> }
>
> struct bf_log_msg
> {
>   uint32_t size;
>   uint64_t ts_nsec;
>   uint32_t level;
>   uint32_t facility;
>   uint32_t msg_off;
>   char strings[];
> }

cheers.
--
~Randy
[xen-unstable test] 156737: tolerable FAIL - PUSHED
flight 156737 xen-unstable real [real]
http://logs.test-lab.xenproject.org/osstest/logs/156737/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stop fail like 156443
 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stop fail like 156443
 test-amd64-i386-xl-qemut-ws16-amd64 19 guest-stop fail like 156443
 test-amd64-amd64-xl-qemut-win7-amd64 19 guest-stop fail like 156443
 test-amd64-i386-xl-qemut-win7-amd64 19 guest-stop fail like 156443
 test-armhf-armhf-libvirt 16 saverestore-support-check fail like 156443
 test-armhf-armhf-libvirt-raw 15 saverestore-support-check fail like 156443
 test-amd64-amd64-xl-qemut-ws16-amd64 19 guest-stop fail like 156443
 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 156443
 test-amd64-i386-xl-qemuu-win7-amd64 19 guest-stop fail like 156443
 test-amd64-i386-xl-qemuu-ws16-amd64 19 guest-stop fail like 156443
 test-amd64-i386-xl-pvshim 14 guest-start fail never pass
 test-amd64-i386-libvirt-xsm 15 migrate-support-check fail never pass
 test-amd64-amd64-libvirt-xsm 15 migrate-support-check fail never pass
 test-amd64-amd64-libvirt 15 migrate-support-check fail never pass
 test-amd64-i386-libvirt 15 migrate-support-check fail never pass
 test-arm64-arm64-xl 15 migrate-support-check fail never pass
 test-arm64-arm64-xl 16 saverestore-support-check fail never pass
 test-arm64-arm64-xl-credit1 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-credit1 16 saverestore-support-check fail never pass
 test-arm64-arm64-libvirt-xsm 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-xsm 15 migrate-support-check fail never pass
 test-arm64-arm64-libvirt-xsm 16 saverestore-support-check fail never pass
 test-arm64-arm64-xl-xsm 16 saverestore-support-check fail never pass
 test-arm64-arm64-xl-credit2 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-credit2 16 saverestore-support-check fail never pass
 test-arm64-arm64-xl-thunderx 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-thunderx 16 saverestore-support-check fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check fail never pass
 test-armhf-armhf-xl-arndale 15 migrate-support-check fail never pass
 test-armhf-armhf-xl-arndale 16 saverestore-support-check fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check fail never pass
 test-amd64-amd64-libvirt-vhd 14 migrate-support-check fail never pass
 test-armhf-armhf-xl 15 migrate-support-check fail never pass
 test-armhf-armhf-xl 16 saverestore-support-check fail never pass
 test-armhf-armhf-xl-cubietruck 15 migrate-support-check fail never pass
 test-armhf-armhf-xl-cubietruck 16 saverestore-support-check fail never pass
 test-armhf-armhf-xl-credit1 15 migrate-support-check fail never pass
 test-armhf-armhf-xl-credit1 16 saverestore-support-check fail never pass
 test-arm64-arm64-xl-seattle 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-seattle 16 saverestore-support-check fail never pass
 test-armhf-armhf-xl-credit2 15 migrate-support-check fail never pass
 test-armhf-armhf-xl-credit2 16 saverestore-support-check fail never pass
 test-armhf-armhf-libvirt 15 migrate-support-check fail never pass
 test-armhf-armhf-libvirt-raw 14 migrate-support-check fail never pass
 test-armhf-armhf-xl-rtds 15 migrate-support-check fail never pass
 test-armhf-armhf-xl-rtds 16 saverestore-support-check fail never pass
 test-armhf-armhf-xl-vhd 14 migrate-support-check fail never pass
 test-armhf-armhf-xl-vhd 15 saverestore-support-check fail never pass
 test-armhf-armhf-xl-multivcpu 15 migrate-support-check fail never pass
 test-armhf-armhf-xl-multivcpu 16 saverestore-support-check fail never pass

version targeted for testing:
 xen       5505f5f8e7e805365cfe70b6a4af6115940bb749
baseline version:
 xen       9ff9705647646aa937b5f5c1426a64c69a62b3bd

Last test of basis   156443  2020-11-05 15:47:13 Z  8 days
Failing since        156524  2020-11-06 14:22:28 Z  7 days  9 attempts
Testing same since   156711  2020-11-12 09:06:03 Z  1 days  2 attempts

People who touched revisions under test:
  Andrew Cooper
  Anthony PERARD
  Jan Beulich
  Jason Andryuk
  Juergen Gross
  Julien Grall
  Marek Marczykowski-Górecki
  Olaf Hering
  Penny Zheng
  Roger Pau Monné
  Stefano Stabellini
  Stefano
[PATCH 2/2] automation: add dom0less to the QEMU aarch64 smoke test
Add a trivial dom0less test:
- build Linux defconfig arm64 to be used as dom0/U kernel
- use busybox-static to create a trivial dom0/U ramdisk
- use ImageBuilder to generate the uboot boot script automatically
- install and use u-boot from the Debian package to start the test
- binaries are loaded from uboot via tftp

Disabling the pl061 device is required to get any version of Linux to
boot on Xen on qemu-system-aarch64.

Signed-off-by: Stefano Stabellini
---
 automation/gitlab-ci/test.yaml         |  1 +
 automation/scripts/qemu-smoke-arm64.sh | 87 +++---
 2 files changed, 81 insertions(+), 7 deletions(-)

diff --git a/automation/gitlab-ci/test.yaml b/automation/gitlab-ci/test.yaml
index 35346e3f6e..76eff1004e 100644
--- a/automation/gitlab-ci/test.yaml
+++ b/automation/gitlab-ci/test.yaml
@@ -27,6 +27,7 @@ qemu-smoke-arm64-gcc:
   image: registry.gitlab.com/xen-project/xen/${CONTAINER}
   variables:
     CONTAINER: debian:unstable-arm64v8
+    LINUX_VERSION: 5.9.8
   script:
     - ./automation/scripts/qemu-smoke-arm64.sh 2>&1 | tee qemu-smoke-arm64.log
   dependencies:
diff --git a/automation/scripts/qemu-smoke-arm64.sh b/automation/scripts/qemu-smoke-arm64.sh
index a7efbf8b6f..e8dc5b19cb 100755
--- a/automation/scripts/qemu-smoke-arm64.sh
+++ b/automation/scripts/qemu-smoke-arm64.sh
@@ -6,27 +6,100 @@ set -ex
 export DEBIAN_FRONTENT=noninteractive
 apt-get -qy update
 apt-get -qy install --no-install-recommends qemu-system-aarch64 \
-u-boot-qemu
+u-boot-qemu \
+u-boot-tools \
+device-tree-compiler \
+busybox-static \
+curl \
+cpio \
+bc
 
 # XXX Silly workaround to get the following QEMU command to work
 cp /usr/share/qemu/pvh.bin /usr/share/qemu/efi-virtio.rom
 qemu-system-aarch64 \
  -machine virtualization=true \
  -cpu cortex-a57 -machine type=virt \
- -m 512 -display none \
+ -m 1024 -display none \
  -machine dumpdtb=binaries/virt-gicv3.dtb
 
+# XXX disable pl061 to avoid Linux crash
+dtc -I dtb -O dts binaries/virt-gicv3.dtb > binaries/virt-gicv3.dts
+sed 's/compatible = "arm,pl061.*/status = "disabled";/g' binaries/virt-gicv3.dts > binaries/virt-gicv3-edited.dts
+dtc -I dts -O dtb binaries/virt-gicv3-edited.dts > binaries/virt-gicv3.dtb
+
+# Busybox Dom0
+mkdir -p initrd
+mkdir -p initrd/bin
+mkdir -p initrd/sbin
+mkdir -p initrd/etc
+mkdir -p initrd/dev
+mkdir -p initrd/proc
+mkdir -p initrd/sys
+mkdir -p initrd/lib
+mkdir -p initrd/var
+mkdir -p initrd/mnt
+cp /bin/busybox initrd/bin/busybox
+initrd/bin/busybox --install initrd/bin
+echo "#!/bin/sh
+
+mount -t proc proc /proc
+mount -t sysfs sysfs /sys
+mount -t devtmpfs devtmpfs /dev
+/bin/sh" > initrd/init
+chmod +x initrd/init
+cd initrd
+find . | cpio --create --format='newc' | gzip > ../binaries/initrd
+cd ..
+
+
+# Linux Dom0
+curl -fsSLO https://cdn.kernel.org/pub/linux/kernel/v5.x/linux-"$LINUX_VERSION".tar.xz
+tar xvJf linux-"$LINUX_VERSION".tar.xz
+cd linux-"$LINUX_VERSION"
+make defconfig
+make -j$(nproc) Image.gz
+cp arch/arm64/boot/Image ../binaries
+cd ..
+
+
+# ImageBuilder
+echo 'MEMORY_START="0x4000"
+MEMORY_END="0x8000"
+
+DEVICE_TREE="virt-gicv3.dtb"
+XEN="xen"
+DOM0_KERNEL="Image"
+DOM0_RAMDISK="initrd"
+XEN_CMD="console=dtuart dom0_mem=512M"
+
+NUM_DOMUS=1
+DOMU_KERNEL[0]="Image"
+DOMU_RAMDISK[0]="initrd"
+DOMU_MEM[0]="256"
+
+LOAD_CMD="tftpb"
+UBOOT_SOURCE="boot.source"
+UBOOT_SCRIPT="boot.scr"' > binaries/config
+rm -rf imagebuilder
+git clone https://gitlab.com/ViryaOS/imagebuilder
+bash imagebuilder/scripts/uboot-script-gen -t tftp -d binaries/ -c binaries/config
+
+
+# Run the test
 rm -f smoke.serial
 set +e
-echo " booti 0x4900 - 0x4400" | timeout -k 1 30 qemu-system-aarch64 \
+echo " virtio scan; dhcp; tftpb 0x4000 boot.scr; source 0x4000"| \
+timeout -k 1 240 \
+qemu-system-aarch64 \
 -machine virtualization=true \
 -cpu cortex-a57 -machine type=virt \
--m 512 -monitor none -serial stdio \
+-m 1024 -monitor none -serial stdio \
+-smp 2 \
 -no-reboot \
--device loader,file=binaries/virt-gicv3.dtb,force-raw=on,addr=0x4400 \
--device loader,file=binaries/xen,force-raw=on,addr=0x4900 \
+-device virtio-net-pci,netdev=n0 \
+-netdev user,id=n0,tftp=binaries \
 -bios /usr/lib/u-boot/qemu_arm64/u-boot.bin |& tee smoke.serial
 set -e
 
-grep -q 'LOADING DOMAIN 0' smoke.serial || exit 1
+(grep -q "^BusyBox" smoke.serial && grep -q "DOM1: BusyBox" smoke.serial) || exit 1
 exit 0
-- 
2.17.1
[PATCH 1/2] automation: add a QEMU aarch64 smoke test
Use QEMU to start Xen (just the hypervisor) up until it stops because
there is no dom0 kernel to boot.

It is based on the existing build job unstable-arm64v8.

Also use make -j$(nproc) to build Xen.

Signed-off-by: Stefano Stabellini
---
 automation/gitlab-ci/test.yaml         | 22 ++
 automation/scripts/build               |  8 +++
 automation/scripts/qemu-smoke-arm64.sh | 32 ++
 3 files changed, 57 insertions(+), 5 deletions(-)
 create mode 100755 automation/scripts/qemu-smoke-arm64.sh

diff --git a/automation/gitlab-ci/test.yaml b/automation/gitlab-ci/test.yaml
index 793feafe8b..35346e3f6e 100644
--- a/automation/gitlab-ci/test.yaml
+++ b/automation/gitlab-ci/test.yaml
@@ -22,6 +22,28 @@ build-each-commit-gcc:
     - /^coverity-tested\/.*/
     - /^stable-.*/
 
+qemu-smoke-arm64-gcc:
+  stage: test
+  image: registry.gitlab.com/xen-project/xen/${CONTAINER}
+  variables:
+    CONTAINER: debian:unstable-arm64v8
+  script:
+    - ./automation/scripts/qemu-smoke-arm64.sh 2>&1 | tee qemu-smoke-arm64.log
+  dependencies:
+    - debian-unstable-gcc-arm64
+  artifacts:
+    paths:
+      - smoke.serial
+      - '*.log'
+    when: always
+  tags:
+    - arm64
+  except:
+    - master
+    - smoke
+    - /^coverity-tested\/.*/
+    - /^stable-.*/
+
 qemu-smoke-x86-64-gcc:
   stage: test
   image: registry.gitlab.com/xen-project/xen/${CONTAINER}
diff --git a/automation/scripts/build b/automation/scripts/build
index 0cd0f3971d..95ad23eadb 100755
--- a/automation/scripts/build
+++ b/automation/scripts/build
@@ -10,9 +10,9 @@ cc-ver()
 
 # random config or default config
 if [[ "${RANDCONFIG}" == "y" ]]; then
-    make -C xen KCONFIG_ALLCONFIG=tools/kconfig/allrandom.config randconfig
+    make -j$(nproc) -C xen KCONFIG_ALLCONFIG=tools/kconfig/allrandom.config randconfig
 else
-    make -C xen defconfig
+    make -j$(nproc) -C xen defconfig
 fi
 
 # build up our configure options
@@ -45,9 +45,7 @@ make -j$(nproc) dist
 
 # Extract artifacts to avoid getting rewritten by customised builds
 cp xen/.config xen-config
 mkdir binaries
-if [[ "${XEN_TARGET_ARCH}" == "x86_64" ]]; then
-    cp xen/xen binaries/xen
-fi
+cp xen/xen binaries/xen
 
 # Build all the configs we care about
 case ${XEN_TARGET_ARCH} in
diff --git a/automation/scripts/qemu-smoke-arm64.sh b/automation/scripts/qemu-smoke-arm64.sh
new file mode 100755
index 00..a7efbf8b6f
--- /dev/null
+++ b/automation/scripts/qemu-smoke-arm64.sh
@@ -0,0 +1,32 @@
+#!/bin/bash
+
+set -ex
+
+# Install QEMU
+export DEBIAN_FRONTENT=noninteractive
+apt-get -qy update
+apt-get -qy install --no-install-recommends qemu-system-aarch64 \
+u-boot-qemu
+
+# XXX Silly workaround to get the following QEMU command to work
+cp /usr/share/qemu/pvh.bin /usr/share/qemu/efi-virtio.rom
+qemu-system-aarch64 \
+ -machine virtualization=true \
+ -cpu cortex-a57 -machine type=virt \
+ -m 512 -display none \
+ -machine dumpdtb=binaries/virt-gicv3.dtb
+
+rm -f smoke.serial
+set +e
+echo " booti 0x4900 - 0x4400" | timeout -k 1 30 qemu-system-aarch64 \
+-machine virtualization=true \
+-cpu cortex-a57 -machine type=virt \
+-m 512 -monitor none -serial stdio \
+-no-reboot \
+-device loader,file=binaries/virt-gicv3.dtb,force-raw=on,addr=0x4400 \
+-device loader,file=binaries/xen,force-raw=on,addr=0x4900 \
+-bios /usr/lib/u-boot/qemu_arm64/u-boot.bin |& tee smoke.serial
+
+set -e
+grep -q 'LOADING DOMAIN 0' smoke.serial || exit 1
+exit 0
-- 
2.17.1
[PATCH 0/2] automation: arm64 dom0less smoke test
Hi all,

This short series introduces a very simple Xen Dom0less smoke test
based on qemu-system-aarch64 to be run as part of the gitlab CI-loop.

It currently passes on staging.

Cheers,

Stefano

Stefano Stabellini (2):
  automation: add a QEMU aarch64 smoke test
  automation: add dom0less to the QEMU aarch64 smoke test

 automation/gitlab-ci/test.yaml         |  23
 automation/scripts/build               |   8 +--
 automation/scripts/qemu-smoke-arm64.sh | 105 +
 3 files changed, 131 insertions(+), 5 deletions(-)
 create mode 100755 automation/scripts/qemu-smoke-arm64.sh
[ovmf test] 156742: all pass - PUSHED
flight 156742 ovmf real [real]
http://logs.test-lab.xenproject.org/osstest/logs/156742/

Perfect :-)
All tests in this flight passed as required

version targeted for testing:
 ovmf      662b42db76a5b195c3aa94ab2946e342a15cd185
baseline version:
 ovmf      544cb0132dc1778b9e791202995533523fa6cccd

Last test of basis   156720  2020-11-12 16:00:44 Z  1 days
Testing same since   156742  2020-11-13 11:32:54 Z  0 days  1 attempts

People who touched revisions under test:
  Pete Batard
  Yunhua Feng

jobs:
 build-amd64-xsm                      pass
 build-i386-xsm                       pass
 build-amd64                          pass
 build-i386                           pass
 build-amd64-libvirt                  pass
 build-i386-libvirt                   pass
 build-amd64-pvops                    pass
 build-i386-pvops                     pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64 pass
 test-amd64-i386-xl-qemuu-ovmf-amd64  pass

sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
 http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
 http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
 http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
 http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To xenbits.xen.org:/home/xen/git/osstest/ovmf.git
   544cb0132d..662b42db76  662b42db76a5b195c3aa94ab2946e342a15cd185 -> xen-tested-master
[SPECIFICATION RFC] The firmware and bootloader log specification
Hey,

This is the next attempt to create a firmware and bootloader log
specification. Due to high interest across the industry, it is an
extension of the initial bootloader-log-only specification. It takes
into account most of the comments which I have received so far.

The goal is to pass all logs produced by various boot components to the
running OS. The OS kernel should expose these logs to user space and/or
process them internally if needed. The content of these logs should be
human readable. However, they should also contain the information which
allows admins to do e.g. boot time analysis.

The log specification should be as platform agnostic and self contained
as possible. The final version of this spec should be merged into
existing specifications, e.g. UEFI, ACPI, Multiboot2, or be a
standalone spec, e.g. as a part of OASIS Standards. The former seems
better but is not perfect either...

Here is the description (pseudocode) of the structures which will be
used to store the log data.

struct bf_log
{
    uint32_t version;
    char producer[64];
    uint64_t flags;
    uint64_t next_bf_log_addr;
    uint32_t next_msg_off;
    bf_log_msg msgs[];
}

struct bf_log_msg
{
    uint32_t size;
    uint64_t ts_nsec;
    uint32_t level;
    uint32_t facility;
    uint32_t msg_off;
    char strings[];
}

The members of struct bf_log:
- version: the firmware and bootloader log format version number,
  1 for now,
- producer: the producer/firmware/bootloader/... type; the length
  allows ASCII UUID storage if somebody needs that functionality,
- flags: it can be used to store information about the log state, e.g.
  whether it was truncated or not (does it make sense to have
  information about the number of lost messages?),
- next_bf_log_addr: address of the next bf_log struct; none if zero
  (I think newer spec versions should not change anything in the first
  5 bf_log members; this way older log parsers will be able to
  traverse/copy all logs regardless of the version used in one log or
  another),
- next_msg_off: the offset, in bytes, from the beginning of the bf_log
  struct, of the next byte after the last log message in msgs[]; i.e.
  the offset of the next available log message slot; it is equal to
  the total size of the log buffer including the bf_log struct,
- msgs: the array of log messages,
- should we add a CRC, hash or signatures here?

The members of struct bf_log_msg:
- size: total size of the bf_log_msg struct,
- ts_nsec: timestamp expressed in nanoseconds starting from 0,
- level: similar to the syslog meaning; can be used to differentiate
  normal messages from debug messages; the exact interpretation depends
  on the current producer type specified in bf_log.producer,
- facility: similar to the syslog meaning; can be used to differentiate
  the sources of the messages, e.g. a message produced by a networking
  module; the exact interpretation depends on the current producer type
  specified in bf_log.producer,
- msg_off: the log message offset in strings[],
- strings[0]: the beginning of the log message type, similar to the
  facility member but a NUL terminated string instead of an integer;
  this will be used by, e.g., GRUB2 for messages printed using
  grub_dprintf(),
- strings[msg_off]: the beginning of the log message, a NUL terminated
  string.

Note: The producers are free to use/ignore any given set of the level,
facility and/or log type members. Though the usage of these members has
to be clearly defined. Ignored integer members should be set to 0. An
ignored log message type should contain an empty NUL terminated string.
The log message is mandatory but can be an empty NUL terminated string.

There is still a not fully solved problem of how the logs should be
presented to the OS. On UEFI platforms we can use config tables to do
that. Then probably bf_log.next_bf_log_addr should not be used. On ACPI
and Device Tree platforms we can use these mechanisms to present the
logs to the OSes. The situation gets more difficult if neither of these
mechanisms is present. However, maybe we should not bother too much
about that, because such platforms are getting less and less common.

Anyway, I am aware that this is not a specification per se. The goal of
this email is to continue the discussion about the idea of the firmware
and bootloader log and to find out where the final specification should
land. Of course taking into account the assumptions made above. You can
find previous discussions about related topics at [1], [2] and [3].

Additionally, I am going to present this during the GRUB mini-summit
session on Tuesday, 17th of November at 15:45 UTC. So, if you want to
discuss the log design please join us. You can find more details
here [4].

Daniel

[1] https://lists.gnu.org/archive/html/grub-devel/2019-10/msg00107.html
[2] https://lists.gnu.org/archive/html/grub-devel/2019-11/msg00079.html
[3]
[qemu-mainline test] 156733: regressions - FAIL
flight 156733 qemu-mainline real [real]
flight 156763 qemu-mainline real-retest [real]
http://logs.test-lab.xenproject.org/osstest/logs/156733/
http://logs.test-lab.xenproject.org/osstest/logs/156763/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-libvirt-vhd 19 guest-start/debian.repeat fail REGR. vs. 152631
 test-armhf-armhf-libvirt-raw 12 debian-di-install fail REGR. vs. 152631
 test-amd64-amd64-xl-qcow2 21 guest-start/debian.repeat fail REGR. vs. 152631
 test-arm64-arm64-libvirt-xsm 14 guest-start fail REGR. vs. 152631
 test-armhf-armhf-xl-vhd 12 debian-di-install fail REGR. vs. 152631
 test-armhf-armhf-libvirt 14 guest-start fail REGR. vs. 152631

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-xl-rtds 19 guest-start.2 fail blocked in 152631
 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stop fail like 152631
 test-amd64-i386-xl-qemuu-win7-amd64 19 guest-stop fail like 152631
 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stop fail like 152631
 test-amd64-i386-xl-qemuu-ws16-amd64 19 guest-stop fail like 152631
 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 152631
 test-arm64-arm64-xl-seattle 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-seattle 16 saverestore-support-check fail never pass
 test-amd64-amd64-libvirt 15 migrate-support-check fail never pass
 test-amd64-amd64-libvirt-xsm 15 migrate-support-check fail never pass
 test-amd64-i386-libvirt 15 migrate-support-check fail never pass
 test-amd64-i386-libvirt-xsm 15 migrate-support-check fail never pass
 test-amd64-i386-xl-pvshim 14 guest-start fail never pass
 test-amd64-amd64-libvirt-vhd 14 migrate-support-check fail never pass
 test-arm64-arm64-xl-credit2 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-credit1 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-credit2 16 saverestore-support-check fail never pass
 test-arm64-arm64-xl-credit1 16 saverestore-support-check fail never pass
 test-arm64-arm64-xl 15 migrate-support-check fail never pass
 test-arm64-arm64-xl 16 saverestore-support-check fail never pass
 test-arm64-arm64-xl-thunderx 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-thunderx 16 saverestore-support-check fail never pass
 test-arm64-arm64-xl-xsm 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-xsm 16 saverestore-support-check fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check fail never pass
 test-armhf-armhf-xl-arndale 15 migrate-support-check fail never pass
 test-armhf-armhf-xl-arndale 16 saverestore-support-check fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check fail never pass
 test-armhf-armhf-xl 15 migrate-support-check fail never pass
 test-armhf-armhf-xl 16 saverestore-support-check fail never pass
 test-armhf-armhf-xl-multivcpu 15 migrate-support-check fail never pass
 test-armhf-armhf-xl-multivcpu 16 saverestore-support-check fail never pass
 test-armhf-armhf-xl-credit2 15 migrate-support-check fail never pass
 test-armhf-armhf-xl-credit2 16 saverestore-support-check fail never pass
 test-armhf-armhf-xl-credit1 15 migrate-support-check fail never pass
 test-armhf-armhf-xl-credit1 16 saverestore-support-check fail never pass
 test-armhf-armhf-xl-cubietruck 15 migrate-support-check fail never pass
 test-armhf-armhf-xl-cubietruck 16 saverestore-support-check fail never pass
 test-armhf-armhf-xl-rtds 15 migrate-support-check fail never pass
 test-armhf-armhf-xl-rtds 16 saverestore-support-check fail never pass

version targeted for testing:
 qemuu     cb5d19e8294486551c422759260883ed290226d9
baseline version:
 qemuu     1d806cef0e38b5db8347a8e12f214d543204a314

Last test of basis   152631  2020-08-20 09:07:46 Z  85 days
Failing since        152659  2020-08-21 14:07:39 Z  84 days  180 attempts
Testing same since   156733  2020-11-13 05:00:33 Z   0 days    1 attempts

People who touched revisions under test:
  Aaron Lindsay
  Alberto Garcia
  Aleksandar Markovic
  Alex Bennée
  Alex Chen
  Alex Williamson
  Alexander Bulekov
  AlexChen
  Alexey Kirillov
  Alistair Francis
  Alistair Francis
  Amey Narkhede
  Ana Pazos
  Andreas Gustafsson
  Andrew Jones
  Andrey Konovalov
  Andrey Shinkevich
  Ani Sinha
  Anthony PERARD
  Anton Blanchard
  Anup Patel
  Artyom Tarasenko
  Babu Moger
  BALATON Zoltan
  Bastian
[linux-linus test] 156725: regressions - FAIL
flight 156725 linux-linus real [real]
http://logs.test-lab.xenproject.org/osstest/logs/156725/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-i386-qemut-rhel6hvm-intel 7 xen-install fail REGR. vs. 152332
 test-amd64-i386-xl-qemuu-dmrestrict-amd64-dmrestrict 7 xen-install fail REGR. vs. 152332
 test-amd64-i386-xl-qemuu-ws16-amd64 7 xen-install fail REGR. vs. 152332
 test-amd64-i386-qemuu-rhel6hvm-intel 7 xen-install fail REGR. vs. 152332
 test-amd64-i386-xl-xsm 7 xen-install fail REGR. vs. 152332
 test-amd64-i386-xl-qemuu-debianhvm-amd64-shadow 7 xen-install fail REGR. vs. 152332
 test-amd64-i386-xl-qemut-debianhvm-amd64 7 xen-install fail REGR. vs. 152332
 test-amd64-i386-xl-qemuu-debianhvm-i386-xsm 7 xen-install fail REGR. vs. 152332
 test-amd64-i386-examine 6 xen-install fail REGR. vs. 152332
 test-amd64-i386-libvirt 7 xen-install fail REGR. vs. 152332
 test-amd64-i386-xl-qemuu-debianhvm-amd64 7 xen-install fail REGR. vs. 152332
 test-amd64-i386-qemuu-rhel6hvm-amd 7 xen-install fail REGR. vs. 152332
 test-amd64-i386-xl 7 xen-install fail REGR. vs. 152332
 test-amd64-i386-xl-qemut-ws16-amd64 7 xen-install fail REGR. vs. 152332
 test-amd64-i386-pair 10 xen-install/src_host fail REGR. vs. 152332
 test-amd64-i386-pair 11 xen-install/dst_host fail REGR. vs. 152332
 test-amd64-i386-libvirt-xsm 7 xen-install fail REGR. vs. 152332
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 7 xen-install fail REGR. vs. 152332
 test-amd64-i386-freebsd10-amd64 7 xen-install fail REGR. vs. 152332
 test-amd64-i386-xl-pvshim 7 xen-install fail REGR. vs. 152332
 test-amd64-i386-xl-raw 7 xen-install fail REGR. vs. 152332
 test-amd64-i386-freebsd10-i386 7 xen-install fail REGR. vs. 152332
 test-amd64-i386-xl-shadow 7 xen-install fail REGR. vs. 152332
 test-amd64-i386-xl-qemut-debianhvm-i386-xsm 7 xen-install fail REGR. vs. 152332
 test-amd64-i386-xl-qemut-win7-amd64 7 xen-install fail REGR. vs. 152332
 test-amd64-i386-xl-qemuu-win7-amd64 7 xen-install fail REGR. vs. 152332
 test-amd64-i386-xl-qemuu-ovmf-amd64 7 xen-install fail REGR. vs. 152332
 test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm 7 xen-install fail REGR. vs. 152332
 test-amd64-i386-libvirt-pair 10 xen-install/src_host fail REGR. vs. 152332
 test-amd64-i386-libvirt-pair 11 xen-install/dst_host fail REGR. vs. 152332
 test-arm64-arm64-xl-credit2 10 host-ping-check-xen fail REGR. vs. 152332
 test-amd64-coresched-i386-xl 7 xen-install fail REGR. vs. 152332
 test-amd64-i386-qemut-rhel6hvm-amd 7 xen-install fail REGR. vs. 152332
 test-arm64-arm64-examine 8 reboot fail REGR. vs. 152332
 test-arm64-arm64-libvirt-xsm 8 xen-boot fail REGR. vs. 152332
 test-arm64-arm64-xl 8 xen-boot fail REGR. vs. 152332
 test-arm64-arm64-xl-credit1 8 xen-boot fail REGR. vs. 152332
 test-arm64-arm64-xl-seattle 8 xen-boot fail REGR. vs. 152332
 test-amd64-amd64-amd64-pvgrub 20 guest-stop fail REGR. vs. 152332
 test-amd64-amd64-i386-pvgrub 20 guest-stop fail REGR. vs. 152332
 test-armhf-armhf-xl-multivcpu 8 xen-boot fail REGR. vs. 152332
 test-armhf-armhf-xl-credit1 8 xen-boot fail REGR. vs. 152332
 test-armhf-armhf-xl-vhd 8 xen-boot fail REGR. vs. 152332
 test-armhf-armhf-xl-cubietruck 8 xen-boot fail REGR. vs. 152332
 test-armhf-armhf-libvirt-raw 8 xen-boot fail REGR. vs. 152332
 test-armhf-armhf-libvirt 8 xen-boot fail REGR. vs. 152332
 test-armhf-armhf-examine 8 reboot fail REGR. vs. 152332
 test-arm64-arm64-xl-xsm 8 xen-boot fail REGR. vs. 152332
 test-armhf-armhf-xl-credit2 8 xen-boot fail REGR. vs. 152332
 test-armhf-armhf-xl 8 xen-boot fail REGR. vs. 152332

Regressions which are regarded as allowable (not blocking):
 test-armhf-armhf-xl-rtds 8 xen-boot fail REGR. vs. 152332

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-qemut-win7-amd64 19 guest-stop fail like 152332
 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stop fail like 152332
 test-amd64-amd64-xl-qemut-ws16-amd64 19 guest-stop fail like 152332
 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stop fail like 152332
 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 152332
 test-amd64-amd64-libvirt-xsm 15 migrate-support-check fail never pass
 test-amd64-amd64-libvirt 15 migrate-support-check fail never pass
[OSSTEST PATCH] cr-daily-branch: Sort out retest build reuse
Retest flights ought to reuse precisely the builds from the flight
which prompts the retests.

mg-adjust-flight-makexrefs is the wrong tool for this job.  It can
often leave the retry flights with no build jobs and no references to
the main flights' build jobs, so the results are just broken jobs.

Signed-off-by: Ian Jackson
---
 cr-daily-branch | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/cr-daily-branch b/cr-daily-branch
index e54ca227..6ec3aa62 100755
--- a/cr-daily-branch
+++ b/cr-daily-branch
@@ -529,9 +529,8 @@ END
 		$retry_jobs
 	)
 
-	./mg-adjust-flight-makexrefs -v $rflight \
-		--branch=$branch --revision-osstest=$narness_rev \
-		'^build-*' --debug --blessings=real
+	./cs-adjust-flight -D $rflight \
+		runvar-build-set . . "^$rflight\\." $flight.
 
 	if [ "${OSSTEST_SIMULATE_FAIL_RETRY+y}" = y ]; then
 		export OSSTEST_SIMULATE_FAIL="${OSSTEST_SIMULATE_FAIL_RETRY}"
-- 
2.20.1
Re: [PATCH V2 23/23] [RFC] libxl: Add support for virtio-disk configuration
On 11.11.20 06:28, Wei Chen wrote:

Hi Oleksandr,

Hi Wei

An example of domain configuration (two disks are assigned to the guest, the latter is in readonly mode):
vdisk = [ 'backend=DomD, disks=rw:/dev/mmcblk0p3;ro:/dev/mmcblk1p3' ]

Can we keep using the same 'disk' parameter for virtio-disk, but add an option like "model=virtio-disk"? For example:
disk = [ 'backend=DomD, disks=rw:/dev/mmcblk0p3,model=virtio-disk' ]
Just like what Xen has done for x86 virtio-net.

I think this needs additional investigation. In general I agree with you that it would be nice to reuse the existing 'disk' parameter somehow rather than introducing a new one for the same purpose (to handle virtual block devices). One note: although both are used for the same purpose, they differ in at least one important option. For example:

1. disk = [ 'backend=DomD, phy:/dev/mmcblk0p3, xvda1' ]
2. vdisk = [ 'backend=DomD, disks=rw:/dev/mmcblk0p3' ]

As you can see, the existing "disk" parameter contains xvda1; this means that a new device /dev/xvda1 will appear on the guest side, but "vdisk" doesn't contain anything similar. So with the Xen PV driver (xen-blkfront)

Yes, I understand your concerns. But I think specifying a device name for a virtio disk is not a mandatory requirement. Even if we're using physical disks on a bare-metal machine, we can't guarantee that the slot#1 disk is always 'sda'. So most modern OSes prefer to use blkid to mount filesystems.

we are able to configure a device name, but with the VirtIO solution (virtio-blk) we aren't (at least I don't know how exactly).

Just my understanding, not exactly accurate: virtio-blk cannot get VDEV information from a bus like xenbus. So the disk ID is allocated dynamically during bus probe. The first probed disk gets ID 'a', and then the ID keeps increasing.
If we want to specify the disk ID for virtio disk, one possible way to do this is to construct a reasonable position on bus (fdt node position of virtio mmio device, PCI Function ID of virtio pci block) for virtio disk. But it is not always successful, we can't skip 'vda' to specify a virtio disk as 'vdb'. Thank you for the explanation. I got your point regarding using the same disk parameter and I think it makes sense. Now I am focusing on IOREQ transport to be available on Arm, and after that (I hope) I will be able to return to virtio stuff. -- Regards, Oleksandr Tyshchenko
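Wei's point about blkid can be made concrete with a sketch (the device names and the UUID below are invented for illustration, not taken from the series):

```
# xl guest config: no guest-side device name is given for the virtio disk
vdisk = [ 'backend=DomD, disks=rw:/dev/mmcblk0p3' ]

# Inside the guest, mount by filesystem UUID rather than relying on the
# probe order that hands out vda, vdb, ...
# (find the real UUID with: blkid /dev/vda)
# /etc/fstab entry:
UUID=3f1b2c4d-1111-2222-3333-444455556666  /data  ext4  defaults  0  2
```

Mounting by UUID sidesteps the lack of a stable guest-visible device name for virtio-blk devices.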
Re: [XEN PATCH] tools/xenstore: Log xenstored build ID on startup
On Fri, 2020-11-13 at 17:13 +, Andrew Cooper wrote: > On 13/11/2020 16:56, Bjoern Doebel wrote: > > On 13.11.20 16:36, Jürgen Groß wrote: > > > On 13.11.20 15:18, Bjoern Doebel wrote: > > > > Right now we do not have a mechanism to determine the version > > > > of the > > > > currently running xenstored at runtime. As xenstored runs > > > > throughout > > > > the > > > > lifetime of a Xen host, this may lead to problems when newer > > > > user space > > > > builds are staged. Then, the running xenstored will no longer > > > > match the > > > > version of the installed xenstored. > > > > > > > > To allow users to always identify the running version of > > > > xenstored, add > > > > a linker-generated unique build ID to every xenstored build. > > > > Add > > > > functionality to log this build ID into a file upon service > > > > startup. > > > > > > > > Signed-off-by: Bjoern Doebel > > > > Reviewed-by: Martin Mazein > > > > Reviewed-by: Paul Durrant > > > > > > No support for oxenstored or xenstore-stubdom? > > > > Your suggestion further down will apparently help for stubdom. I do > > not speak ocaml at all - how do we address this? > > CC'ing Edwin and Christian who have done the bulk of the oxenstored > recently. > > It sounds like it might not be possible right now, but would be > possible > with a future plan to switch the Ocaml build system over to dune (the > new/preferred Ocaml upstream toolchain). See here what is possible with Dune: https://dune.readthedocs.io/en/stable/dune-libs.html#build-info Would the output of 'git describe --always --dirty' (perhaps combined with a build date) serve as a useful build ID? > > If it does end up being an XS_CONTROL sub-op, we can implement it at > a > future point when we can usefully answer the question. Wouldn't using readelf on /proc//exe give you the running buildid? 
readelf -a /usr/sbin/oxenstored /proc/$(pidof oxenstored)/exe | grep 'Build ID'
    Build ID: bdd5304c8984ed22570d51308ae8717d03fe60ae
    Build ID: bdd5304c8984ed22570d51308ae8717d03fe60ae

readelf -a /usr/sbin/oxenstored /proc/$(pidof oxenstored)/exe | grep 'Build ID'
    Build ID: b44ff99b216db7526e3ee7841068d584cc9c2b95
    Build ID: bdd5304c8984ed22570d51308ae8717d03fe60ae

When you're inside a stubdom it is probably not so easy though.

Best regards,
--Edwin

>
> ~Andrew
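For reference, the Build ID that readelf prints above is just the descriptor of an ELF NT_GNU_BUILD_ID note. A minimal decoder for that note layout (the note bytes below are synthetic, not tied to the patch) looks like:

```python
import struct

NT_GNU_BUILD_ID = 3

def parse_build_id(note: bytes) -> str:
    # ELF note layout: namesz, descsz, type as 4-byte little-endian words,
    # followed by the name and the descriptor, each padded to 4 bytes.
    namesz, descsz, ntype = struct.unpack_from("<III", note, 0)
    if ntype != NT_GNU_BUILD_ID:
        raise ValueError("not a build-id note")
    name_off = 12
    name = note[name_off:name_off + namesz].rstrip(b"\0")
    if name != b"GNU":
        raise ValueError("unexpected note name")
    # Descriptor starts after the name, padded to 4-byte alignment.
    desc_off = name_off + ((namesz + 3) & ~3)
    return note[desc_off:desc_off + descsz].hex()

# Synthetic note equivalent to a 4-byte build ID de:ad:be:ef
note = struct.pack("<III4s4s", 4, 4, NT_GNU_BUILD_ID, b"GNU\0",
                   bytes.fromhex("deadbeef"))
print(parse_build_id(note))  # -> deadbeef
```

The same layout is what readelf decodes from the .note.gnu.build-id section (on a real binary the descriptor is typically a 20-byte SHA-1).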
Re: [XEN PATCH] tools/xenstore: Log xenstored build ID on startup
On 13.11.20 15:27, Jan Beulich wrote:
On 13.11.2020 15:18, Bjoern Doebel wrote:
--- a/tools/xenstore/Makefile
+++ b/tools/xenstore/Makefile
@@ -66,6 +66,10 @@ $(XENSTORED_OBJS): CFLAGS += $(SYSTEMD_CFLAGS)
 xenstored: LDFLAGS += $(SYSTEMD_LIBS)
 endif
+# xenstored: enforce creation of a buildID section and use a linker
+# script to add additional symbols around that section
+xenstored: LDFLAGS += -Wl,--build-id=sha1 -Wl,-T,buildid_symbols.ld

Since in the hypervisor build we're careful to not use --build-id when the linker doesn't support it, I suppose the same care needs applying here. See the setting of build_id_linker in ./Config.mk.

Ok, I will make this conditional on the setting of $(build_id_linker).

Jan

Bjoern

Amazon Development Center Germany GmbH Krausenstr. 38 10117 Berlin Geschaeftsfuehrung: Christian Schlaeger, Jonathan Weiss Eingetragen am Amtsgericht Charlottenburg unter HRB 149173 B Sitz: Berlin Ust-ID: DE 289 237 879
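A sketch of the conditional Bjoern mentions (this assumes Config.mk leaves build_id_linker empty when the linker lacks --build-id support; the exact variable contents may differ):

```make
# Only request a build ID when the toolchain check in Config.mk
# found linker support (build_id_linker is empty otherwise).
ifneq ($(build_id_linker),)
xenstored: LDFLAGS += -Wl,--build-id=sha1 -Wl,-T,buildid_symbols.ld
endif
```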
Re: [XEN PATCH] tools/xenstore: Log xenstored build ID on startup
On 13/11/2020 16:56, Bjoern Doebel wrote: > On 13.11.20 16:36, Jürgen Groß wrote: >> On 13.11.20 15:18, Bjoern Doebel wrote: >>> Right now we do not have a mechanism to determine the version of the >>> currently running xenstored at runtime. As xenstored runs throughout >>> the >>> lifetime of a Xen host, this may lead to problems when newer user space >>> builds are staged. Then, the running xenstored will no longer match the >>> version of the installed xenstored. >>> >>> To allow users to always identify the running version of xenstored, add >>> a linker-generated unique build ID to every xenstored build. Add >>> functionality to log this build ID into a file upon service startup. >>> >>> Signed-off-by: Bjoern Doebel >>> Reviewed-by: Martin Mazein >>> Reviewed-by: Paul Durrant >> >> No support for oxenstored or xenstore-stubdom? > > Your suggestion further down will apparently help for stubdom. I do > not speak ocaml at all - how do we address this? CC'ing Edwin and Christian who have done the bulk of the oxenstored recently. It sounds like it might not be possible right now, but would be possible with a future plan to switch the Ocaml build system over to dune (the new/preferred Ocaml upstream toolchain). If it does end up being an XS_CONTROL sub-op, we can implement it at a future point when we can usefully answer the question. ~Andrew
Re: [XEN PATCH] tools/xenstore: Log xenstored build ID on startup
On 13/11/2020 16:55, Bjoern Doebel wrote: > > On 13.11.20 15:30, Andrew Cooper wrote: >> On 13/11/2020 14:18, Bjoern Doebel wrote: >>> Right now we do not have a mechanism to determine the version of the >>> currently running xenstored at runtime. As xenstored runs throughout >>> the >>> lifetime of a Xen host, this may lead to problems when newer user space >>> builds are staged. Then, the running xenstored will no longer match the >>> version of the installed xenstored. >>> >>> To allow users to always identify the running version of xenstored, add >>> a linker-generated unique build ID to every xenstored build. Add >>> functionality to log this build ID into a file upon service startup. >>> >>> Signed-off-by: Bjoern Doebel >>> Reviewed-by: Martin Mazein >>> Reviewed-by: Paul Durrant >> I understand the problem you're trying to solve, but why is this >> anything more than just enabling build-id's by default across tools/ ? >> >> There are already standard ways of interacting with the build id of >> running executables on the system. I'd strongly discourage doing >> anything custom in xenstored specifically. > May I ask what tooling you would use to interact with a running > process' buildid? Amongst other things, yes. Although as Juergen points out, we want something which works with stub domains as well, and "normal userspace tools" won't cut it there. I still think a first patch in this series should be to turn build-id's on by default if supported by the toolchain, generally. ~Andrew
Re: [PATCH V2 16/23] xen/mm: Handle properly reference in set_foreign_p2m_entry() on Arm
On 12.11.20 14:18, Jan Beulich wrote:

Hi Jan

On 15.10.2020 18:44, Oleksandr Tyshchenko wrote:
--- a/xen/arch/arm/p2m.c
+++ b/xen/arch/arm/p2m.c
@@ -1380,6 +1380,27 @@ int guest_physmap_remove_page(struct domain *d, gfn_t gfn, mfn_t mfn,
     return p2m_remove_mapping(d, gfn, (1 << page_order), mfn);
 }

+int set_foreign_p2m_entry(struct domain *d, const struct domain *fd,
+                          unsigned long gfn, mfn_t mfn)
+{
+    struct page_info *page = mfn_to_page(mfn);
+    int rc;
+
+    if ( !get_page(page, fd) )
+        return -EINVAL;
+
+    /*
+     * It is valid to always use p2m_map_foreign_rw here as if this gets
+     * called that d != fd. A case when d == fd would be rejected by
+     * rcu_lock_remote_domain_by_id() earlier.
+     */

Are you describing things here on the assumption that no new callers may surface later on? To catch such, I'd recommend adding at least a respective ASSERT().

Agree, will add.

+    rc = guest_physmap_add_entry(d, _gfn(gfn), mfn, 0, p2m_map_foreign_rw);
+    if ( rc )
+        put_page(page);
+
+    return 0;

I can't imagine it's correct to not signal the error to the caller(s).

It is not correct, I just missed putting "return rc;" there. Thank you for the catch.

--- a/xen/common/memory.c
+++ b/xen/common/memory.c
@@ -1099,7 +1099,8 @@ static int acquire_resource(
      * reference counted, it is unsafe to allow mapping of
      * resource pages unless the caller is the hardware domain.
      */
-    if ( paging_mode_translate(currd) && !is_hardware_domain(currd) )
+    if ( paging_mode_translate(currd) && !is_hardware_domain(currd) &&
+         !arch_refcounts_p2m() )
         return -EACCES;

I guess the hook may want naming differently, as both prior parts of the condition should be needed only on the x86 side, and there (for PV) there's no p2m involved in the refcounting. Maybe arch_acquire_resource_check()? And then the comment wants moving into the x86 hook as well. You may want to consider leaving a more generic one here...

Ok, again, I am fine with the name. Will follow everything suggested above.

--
Regards,
Oleksandr Tyshchenko
Re: [XEN PATCH] tools/xenstore: Log xenstored build ID on startup
On 13.11.20 16:36, Jürgen Groß wrote:
On 13.11.20 15:18, Bjoern Doebel wrote:
Right now we do not have a mechanism to determine the version of the currently running xenstored at runtime. As xenstored runs throughout the lifetime of a Xen host, this may lead to problems when newer user space builds are staged. Then, the running xenstored will no longer match the version of the installed xenstored.

To allow users to always identify the running version of xenstored, add a linker-generated unique build ID to every xenstored build. Add functionality to log this build ID into a file upon service startup.

Signed-off-by: Bjoern Doebel
Reviewed-by: Martin Mazein
Reviewed-by: Paul Durrant

No support for oxenstored or xenstore-stubdom?

Your suggestion further down will apparently help for stubdom. I do not speak ocaml at all - how do we address this?

---
 tools/hotplug/Linux/launch-xenstore.in |  2 +-
 tools/xenstore/Makefile                |  4 +++
 tools/xenstore/buildid_symbols.ld      | 10 +++
 tools/xenstore/xenstored_core.c        |  8 ++
 tools/xenstore/xenstored_core.h        |  3 ++
 tools/xenstore/xenstored_minios.c      |  4 +++
 tools/xenstore/xenstored_posix.c       | 52 ++
 7 files changed, 82 insertions(+), 1 deletion(-)
 create mode 100644 tools/xenstore/buildid_symbols.ld

diff --git a/tools/hotplug/Linux/launch-xenstore.in b/tools/hotplug/Linux/launch-xenstore.in
index 991dec8d25..a6f2254030 100644
--- a/tools/hotplug/Linux/launch-xenstore.in
+++ b/tools/hotplug/Linux/launch-xenstore.in
@@ -62,7 +62,7 @@ test -f @CONFIG_DIR@/@CONFIG_LEAF_DIR@/xencommons && . @CONFIG_DIR@/@CONFIG_LEAF
 	}
 	echo -n Starting $XENSTORED...
-	$XENSTORED --pid-file @XEN_RUN_DIR@/xenstored.pid $XENSTORED_ARGS
+	$XENSTORED --pid-file @XEN_RUN_DIR@/xenstored.pid --buildid-file @XEN_RUN_DIR@/xenstored.buildid $XENSTORED_ARGS

 	systemd-notify --booted 2>/dev/null || timeout_xenstore $XENSTORED || exit 1

diff --git a/tools/xenstore/Makefile b/tools/xenstore/Makefile
index 9a0f0d012d..c63350980b 100644
--- a/tools/xenstore/Makefile
+++ b/tools/xenstore/Makefile
@@ -66,6 +66,10 @@ $(XENSTORED_OBJS): CFLAGS += $(SYSTEMD_CFLAGS)
 xenstored: LDFLAGS += $(SYSTEMD_LIBS)
 endif

+# xenstored: enforce creation of a buildID section and use a linker
+# script to add additional symbols around that section
+xenstored: LDFLAGS += -Wl,--build-id=sha1 -Wl,-T,buildid_symbols.ld
+
 $(XENSTORED_OBJS): CFLAGS += $(CFLAGS_libxengnttab)

 xenstored: $(XENSTORED_OBJS)

diff --git a/tools/xenstore/buildid_symbols.ld b/tools/xenstore/buildid_symbols.ld
new file mode 100644
index 00..d74024c4e9
--- /dev/null
+++ b/tools/xenstore/buildid_symbols.ld
@@ -0,0 +1,10 @@
+SECTIONS
+{
+	__buildid_note_section = . ;
+	.note.gnu.build-id :
+	{
+		*(.note.gnu.build-id)
+	}
+	__buildid_end = . ;
+}
+INSERT AFTER .data

diff --git a/tools/xenstore/xenstored_core.c b/tools/xenstore/xenstored_core.c
index b4be374d3f..c6f107bdd9 100644
--- a/tools/xenstore/xenstored_core.c
+++ b/tools/xenstore/xenstored_core.c
@@ -1844,6 +1844,7 @@ static void usage(void)

 static struct option options[] = {
+	{ "buildid-file", 1, NULL, 'B' },
 	{ "no-domain-init", 0, NULL, 'D' },
 	{ "entry-nb", 1, NULL, 'E' },
 	{ "pid-file", 1, NULL, 'F' },
@@ -1875,12 +1876,16 @@ int main(int argc, char *argv[])
 	bool outputpid = false;
 	bool no_domain_init = false;
 	const char *pidfile = NULL;
+	const char *buildid_file = NULL;
 	int timeout;

 	while ((opt = getopt_long(argc, argv, "DE:F:HNPS:t:T:RVW:", options,

You are missing "B:" in the short options.

Ack. Fixing.

 			  NULL)) != -1) {
 		switch (opt) {
+		case 'B':
+			buildid_file = optarg;
+			break;
 		case 'D':
 			no_domain_init = true;
 			break;
@@ -1948,6 +1953,9 @@ int main(int argc, char *argv[])
 	if (pidfile)
 		write_pidfile(pidfile);

+	if (buildid_file)
+		write_buildid_file(buildid_file);
+
 	/* Talloc leak reports go to stderr, which is closed if we fork. */
 	if (!dofork)
 		talloc_enable_leak_report_full();

You should update the usage printout, too.

Ok.

diff --git a/tools/xenstore/xenstored_core.h b/tools/xenstore/xenstored_core.h
index 1df6ad94ab..712280626c 100644
--- a/tools/xenstore/xenstored_core.h
+++ b/tools/xenstore/xenstored_core.h
@@ -193,6 +193,9 @@ void xenbus_notify_running(void);
 /* Write out the pidfile */
 void write_pidfile(const char *pidfile);

+/* Write the buildid file */
+void write_buildid_file(const char *buildidfile);
+
 /* Fork but do not close terminal FDs */
 void daemonize(void);
 /* Close stdin/stdout/stderr to complete daemonize */

diff --git a/tools/xenstore/xenstored_minios.c b/tools/xenstore/xenstored_minios.c
index c94493e52a..ef1151aee4 100644
---
Re: [XEN PATCH] tools/xenstore: Log xenstored build ID on startup
On 13.11.20 15:30, Andrew Cooper wrote: On 13/11/2020 14:18, Bjoern Doebel wrote: Right now we do not have a mechanism to determine the version of the currently running xenstored at runtime. As xenstored runs throughout the lifetime of a Xen host, this may lead to problems when newer user space builds are staged. Then, the running xenstored will no longer match the version of the installed xenstored. To allow users to always identify the running version of xenstored, add a linker-generated unique build ID to every xenstored build. Add functionality to log this build ID into a file upon service startup. Signed-off-by: Bjoern Doebel Reviewed-by: Martin Mazein Reviewed-by: Paul Durrant I understand the problem you're trying to solve, but why is this anything more than just enabling build-id's by default across tools/ ? There are already standard ways of interacting with the build id of running executables on the system. I'd strongly discourage doing anything custom in xenstored specifically. May I ask what tooling you would use to interact with a running process' buildid? ~Andrew Bjoern
Re: dom0 PVH: 'entry->arch.pirq != INVALID_PIRQ' failed at vmsi.c:843
I just noticed that CPU0 also stops receiving clock interrupts - which may explain why the kernel wedges. I can still enter NetBSD's debugger, which means that console interrupts are still working (the console's event channel is also handled by CPU 0). A 'q' in Xen doesn't show any pending or masked events, for any CPU. The NetBSD interrupt counters show event channel 2's counter (the CPU0 clock) stuck at 13, while others are happily increasing. Any idea what to look at from here? -- Manuel Bouyer NetBSD: 26 ans d'experience feront toujours la difference --
Re: [PATCH V2 20/23] xen/ioreq: Make x86's send_invalidate_req() common
On 12.11.20 13:55, Jan Beulich wrote: Hi Jan On 15.10.2020 18:44, Oleksandr Tyshchenko wrote: --- a/xen/include/asm-x86/hvm/io.h +++ b/xen/include/asm-x86/hvm/io.h @@ -97,7 +97,6 @@ bool relocate_portio_handler( unsigned int size); void send_timeoffset_req(unsigned long timeoff); -void send_invalidate_req(void); bool handle_mmio_with_translation(unsigned long gla, unsigned long gpfn, struct npfec); bool handle_pio(uint16_t port, unsigned int size, int dir); diff --git a/xen/include/xen/ioreq.h b/xen/include/xen/ioreq.h index 0679fef..aad682f 100644 --- a/xen/include/xen/ioreq.h +++ b/xen/include/xen/ioreq.h @@ -126,6 +126,7 @@ struct ioreq_server *select_ioreq_server(struct domain *d, int send_ioreq(struct ioreq_server *s, ioreq_t *proto_p, bool buffered); unsigned int broadcast_ioreq(ioreq_t *p, bool buffered); +void send_invalidate_ioreq(void); Again while renaming this function anyway could we see about giving it a suitable and consistent name? Maybe ioreq_request_mapcache_invalidate() or (to avoid the double "request") ioreq_signal_mapcache_invalidate()? Maybe even ioreq_server_...(). iirc send_invalidate_ioreq() was one of the proposed names during V1 review. But to follow new scheme, I am fine with both the first and second. -- Regards, Oleksandr Tyshchenko
Re: dom0 PVH: 'entry->arch.pirq != INVALID_PIRQ' failed at vmsi.c:843
On Fri, Nov 13, 2020 at 04:14:28PM +0100, Manuel Bouyer wrote: > I can try a kernel without this driver, to see if other devices reports > interrupt. This didn't change anything, I don't get more interrupts with this driver commented out. -- Manuel Bouyer NetBSD: 26 ans d'experience feront toujours la difference --
[linux-5.4 test] 156722: tolerable FAIL - PUSHED
flight 156722 linux-5.4 real [real]
http://logs.test-lab.xenproject.org/osstest/logs/156722/

Failures :-/ but no regressions.

Tests which are failing intermittently (not blocking):
 test-amd64-i386-xl-xsm                               20 guest-localmigrate/x10    fail in 156677 pass in 156722
 test-amd64-i386-libvirt-xsm                          21 guest-start.2             fail pass in 156677

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-qemuu-win7-amd64                 19 guest-stop                fail like 156412
 test-amd64-i386-xl-qemut-win7-amd64                  19 guest-stop                fail like 156412
 test-amd64-amd64-xl-qemut-win7-amd64                 19 guest-stop                fail like 156412
 test-amd64-amd64-qemuu-nested-amd                    20 debian-hvm-install/l1/l2  fail like 156412
 test-armhf-armhf-libvirt                             16 saverestore-support-check fail like 156412
 test-amd64-amd64-xl-qemuu-ws16-amd64                 19 guest-stop                fail like 156412
 test-amd64-amd64-xl-qemut-ws16-amd64                 19 guest-stop                fail like 156412
 test-amd64-i386-xl-qemuu-win7-amd64                  19 guest-stop                fail like 156412
 test-amd64-i386-xl-qemut-ws16-amd64                  19 guest-stop                fail like 156412
 test-armhf-armhf-libvirt-raw                         15 saverestore-support-check fail like 156412
 test-amd64-i386-xl-qemuu-ws16-amd64                  19 guest-stop                fail like 156412
 test-amd64-i386-xl-pvshim                            14 guest-start               fail never pass
 test-amd64-amd64-libvirt                             15 migrate-support-check     fail never pass
 test-arm64-arm64-xl-seattle                          15 migrate-support-check     fail never pass
 test-arm64-arm64-xl-seattle                          16 saverestore-support-check fail never pass
 test-amd64-amd64-libvirt-xsm                         15 migrate-support-check     fail never pass
 test-amd64-i386-libvirt-xsm                          15 migrate-support-check     fail never pass
 test-amd64-i386-libvirt                              15 migrate-support-check     fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm    13 migrate-support-check     fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm   13 migrate-support-check     fail never pass
 test-arm64-arm64-xl                                  15 migrate-support-check     fail never pass
 test-arm64-arm64-xl                                  16 saverestore-support-check fail never pass
 test-arm64-arm64-xl-xsm                              15 migrate-support-check     fail never pass
 test-arm64-arm64-xl-credit2                          15 migrate-support-check     fail never pass
 test-arm64-arm64-xl-xsm                              16 saverestore-support-check fail never pass
 test-arm64-arm64-xl-credit2                          16 saverestore-support-check fail never pass
 test-arm64-arm64-xl-credit1                          15 migrate-support-check     fail never pass
 test-arm64-arm64-xl-credit1                          16 saverestore-support-check fail never pass
 test-arm64-arm64-libvirt-xsm                         15 migrate-support-check     fail never pass
 test-arm64-arm64-libvirt-xsm                         16 saverestore-support-check fail never pass
 test-arm64-arm64-xl-thunderx                         15 migrate-support-check     fail never pass
 test-arm64-arm64-xl-thunderx                         16 saverestore-support-check fail never pass
 test-armhf-armhf-xl                                  15 migrate-support-check     fail never pass
 test-armhf-armhf-xl                                  16 saverestore-support-check fail never pass
 test-armhf-armhf-xl-arndale                          15 migrate-support-check     fail never pass
 test-armhf-armhf-xl-arndale                          16 saverestore-support-check fail never pass
 test-amd64-amd64-libvirt-vhd                         14 migrate-support-check     fail never pass
 test-armhf-armhf-xl-rtds                             15 migrate-support-check     fail never pass
 test-armhf-armhf-xl-rtds                             16 saverestore-support-check fail never pass
 test-armhf-armhf-xl-multivcpu                        15 migrate-support-check     fail never pass
 test-armhf-armhf-xl-multivcpu                        16 saverestore-support-check fail never pass
 test-armhf-armhf-xl-credit2                          15 migrate-support-check     fail never pass
 test-armhf-armhf-xl-credit2                          16 saverestore-support-check fail never pass
 test-armhf-armhf-xl-cubietruck                       15 migrate-support-check     fail never pass
 test-armhf-armhf-xl-cubietruck                       16 saverestore-support-check fail never pass
 test-armhf-armhf-xl-credit1                          15 migrate-support-check     fail never pass
 test-armhf-armhf-xl-credit1                          16 saverestore-support-check fail never pass
 test-armhf-armhf-libvirt                             15 migrate-support-check     fail never pass
 test-armhf-armhf-xl-vhd                              14 migrate-support-check     fail never pass
 test-armhf-armhf-xl-vhd                              15 saverestore-support-check fail never pass
 test-armhf-armhf-libvirt-raw                         14 migrate-support-check     fail never pass

version targeted for testing:
 linux                2544d06afd8d060f35b159809274e4b7477e63e8
baseline version:
 linux                6e97ed6efa701db070da0054b055c085895aba86

Last test of basis   156412  2020-11-05 11:13:59 Z    8 days
Failing since        156620  2020-11-10 12:10:31 Z    3 days    3 attempts
Testing same since   156677  2020-11-11 07:50:18 Z    2 days    2 attempts

People who touched revisions under test:
Re: [PATCH V2 12/23] xen/ioreq: Remove "hvm" prefixes from involved function names
On 12.11.20 13:45, Jan Beulich wrote: Hi Jan. On 15.10.2020 18:44, Oleksandr Tyshchenko wrote: From: Oleksandr Tyshchenko This patch removes "hvm" prefixes and infixes from IOREQ related function names in the common code. At least some of the functions touched here would be nice to move to a more consistent new naming scheme right away, to avoid having to touch all the same places again. I guess ioreq server functions would be nice to all start with ioreq_server_ and ioreq functions to all start with ioreq_. E.g. ioreq_send() and ioreq_server_select(). As initially discussed at patch #1, ok, I will prepare a list of the proposed renamings so we can hopefully find common ground. After that I will do the required renaming. -- Regards, Oleksandr Tyshchenko
Re: [PATCH V2 14/23] arm/ioreq: Introduce arch specific bits for IOREQ/DM features
On 12.11.20 13:48, Jan Beulich wrote: Hi Jan On 15.10.2020 18:44, Oleksandr Tyshchenko wrote: --- a/xen/common/ioreq.c +++ b/xen/common/ioreq.c @@ -18,6 +18,7 @@ #include #include +#include #include #include #include There preferably wouldn't be a need to touch non-Arm code in this patch. I suppose the added #include could easily be introduced e.g. while moving the file? ok -- Regards, Oleksandr Tyshchenko
Re: [XEN PATCH] tools/xenstore: Log xenstored build ID on startup
On 13.11.20 15:18, Bjoern Doebel wrote:
Right now we do not have a mechanism to determine the version of the currently running xenstored at runtime. As xenstored runs throughout the lifetime of a Xen host, this may lead to problems when newer user space builds are staged. Then, the running xenstored will no longer match the version of the installed xenstored.

To allow users to always identify the running version of xenstored, add a linker-generated unique build ID to every xenstored build. Add functionality to log this build ID into a file upon service startup.

Signed-off-by: Bjoern Doebel
Reviewed-by: Martin Mazein
Reviewed-by: Paul Durrant

No support for oxenstored or xenstore-stubdom?

---
 tools/hotplug/Linux/launch-xenstore.in |  2 +-
 tools/xenstore/Makefile                |  4 +++
 tools/xenstore/buildid_symbols.ld      | 10 +++
 tools/xenstore/xenstored_core.c        |  8 ++
 tools/xenstore/xenstored_core.h        |  3 ++
 tools/xenstore/xenstored_minios.c      |  4 +++
 tools/xenstore/xenstored_posix.c       | 52 ++
 7 files changed, 82 insertions(+), 1 deletion(-)
 create mode 100644 tools/xenstore/buildid_symbols.ld

diff --git a/tools/hotplug/Linux/launch-xenstore.in b/tools/hotplug/Linux/launch-xenstore.in
index 991dec8d25..a6f2254030 100644
--- a/tools/hotplug/Linux/launch-xenstore.in
+++ b/tools/hotplug/Linux/launch-xenstore.in
@@ -62,7 +62,7 @@ test -f @CONFIG_DIR@/@CONFIG_LEAF_DIR@/xencommons && . @CONFIG_DIR@/@CONFIG_LEAF
 	}
 	echo -n Starting $XENSTORED...
-	$XENSTORED --pid-file @XEN_RUN_DIR@/xenstored.pid $XENSTORED_ARGS
+	$XENSTORED --pid-file @XEN_RUN_DIR@/xenstored.pid --buildid-file @XEN_RUN_DIR@/xenstored.buildid $XENSTORED_ARGS

 	systemd-notify --booted 2>/dev/null || timeout_xenstore $XENSTORED || exit 1

diff --git a/tools/xenstore/Makefile b/tools/xenstore/Makefile
index 9a0f0d012d..c63350980b 100644
--- a/tools/xenstore/Makefile
+++ b/tools/xenstore/Makefile
@@ -66,6 +66,10 @@ $(XENSTORED_OBJS): CFLAGS += $(SYSTEMD_CFLAGS)
 xenstored: LDFLAGS += $(SYSTEMD_LIBS)
 endif

+# xenstored: enforce creation of a buildID section and use a linker
+# script to add additional symbols around that section
+xenstored: LDFLAGS += -Wl,--build-id=sha1 -Wl,-T,buildid_symbols.ld
+
 $(XENSTORED_OBJS): CFLAGS += $(CFLAGS_libxengnttab)

 xenstored: $(XENSTORED_OBJS)

diff --git a/tools/xenstore/buildid_symbols.ld b/tools/xenstore/buildid_symbols.ld
new file mode 100644
index 00..d74024c4e9
--- /dev/null
+++ b/tools/xenstore/buildid_symbols.ld
@@ -0,0 +1,10 @@
+SECTIONS
+{
+	__buildid_note_section = . ;
+	.note.gnu.build-id :
+	{
+		*(.note.gnu.build-id)
+	}
+	__buildid_end = . ;
+}
+INSERT AFTER .data

diff --git a/tools/xenstore/xenstored_core.c b/tools/xenstore/xenstored_core.c
index b4be374d3f..c6f107bdd9 100644
--- a/tools/xenstore/xenstored_core.c
+++ b/tools/xenstore/xenstored_core.c
@@ -1844,6 +1844,7 @@ static void usage(void)

 static struct option options[] = {
+	{ "buildid-file", 1, NULL, 'B' },
 	{ "no-domain-init", 0, NULL, 'D' },
 	{ "entry-nb", 1, NULL, 'E' },
 	{ "pid-file", 1, NULL, 'F' },
@@ -1875,12 +1876,16 @@ int main(int argc, char *argv[])
 	bool outputpid = false;
 	bool no_domain_init = false;
 	const char *pidfile = NULL;
+	const char *buildid_file = NULL;
 	int timeout;

 	while ((opt = getopt_long(argc, argv, "DE:F:HNPS:t:T:RVW:", options,

You are missing "B:" in the short options.

 			  NULL)) != -1) {
 		switch (opt) {
+		case 'B':
+			buildid_file = optarg;
+			break;
 		case 'D':
 			no_domain_init = true;
 			break;
@@ -1948,6 +1953,9 @@ int main(int argc, char *argv[])
 	if (pidfile)
 		write_pidfile(pidfile);

+	if (buildid_file)
+		write_buildid_file(buildid_file);
+
 	/* Talloc leak reports go to stderr, which is closed if we fork. */
 	if (!dofork)
 		talloc_enable_leak_report_full();

You should update the usage printout, too.

diff --git a/tools/xenstore/xenstored_core.h b/tools/xenstore/xenstored_core.h
index 1df6ad94ab..712280626c 100644
--- a/tools/xenstore/xenstored_core.h
+++ b/tools/xenstore/xenstored_core.h
@@ -193,6 +193,9 @@ void xenbus_notify_running(void);
 /* Write out the pidfile */
 void write_pidfile(const char *pidfile);

+/* Write the buildid file */
+void write_buildid_file(const char *buildidfile);
+
 /* Fork but do not close terminal FDs */
 void daemonize(void);
 /* Close stdin/stdout/stderr to complete daemonize */

diff --git a/tools/xenstore/xenstored_minios.c b/tools/xenstore/xenstored_minios.c
index c94493e52a..ef1151aee4 100644
---
[libvirt test] 156731: regressions - FAIL
flight 156731 libvirt real [real]
http://logs.test-lab.xenproject.org/osstest/logs/156731/

Regressions :-(

Tests which did not succeed and are blocking, including tests which could not be run:
 build-armhf-libvirt                                   6 libvirt-build    fail REGR. vs. 151777
 build-amd64-libvirt                                   6 libvirt-build    fail REGR. vs. 151777
 build-i386-libvirt                                    6 libvirt-build    fail REGR. vs. 151777
 build-arm64-libvirt                                   6 libvirt-build    fail REGR. vs. 151777

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt                              1 build-check(1)   blocked n/a
 test-amd64-amd64-libvirt-pair                         1 build-check(1)   blocked n/a
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm    1 build-check(1)   blocked n/a
 test-amd64-amd64-libvirt-vhd                          1 build-check(1)   blocked n/a
 test-amd64-amd64-libvirt-xsm                          1 build-check(1)   blocked n/a
 test-amd64-i386-libvirt                               1 build-check(1)   blocked n/a
 test-amd64-i386-libvirt-pair                          1 build-check(1)   blocked n/a
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm     1 build-check(1)   blocked n/a
 test-amd64-i386-libvirt-xsm                           1 build-check(1)   blocked n/a
 test-arm64-arm64-libvirt                              1 build-check(1)   blocked n/a
 test-arm64-arm64-libvirt-qcow2                        1 build-check(1)   blocked n/a
 test-arm64-arm64-libvirt-xsm                          1 build-check(1)   blocked n/a
 test-armhf-armhf-libvirt                              1 build-check(1)   blocked n/a
 test-armhf-armhf-libvirt-raw                          1 build-check(1)   blocked n/a

version targeted for testing:
 libvirt              09ba97ad6b4aa725364ed5e5c48031829dc9549f
baseline version:
 libvirt              2c846fa6bcc11929c9fb857a22430fb9945654ad

Last test of basis   151777  2020-07-10 04:19:19 Z  126 days
Failing since        151818  2020-07-11 04:18:52 Z  125 days  120 attempts
Testing same since   156731  2020-11-13 04:19:03 Z    0 days    1 attempts

People who touched revisions under test:
 Adolfo Jayme Barrientos Aleksandr Alekseev Andika Triwidada Andrea Bolognani Balázs Meskó Bastien Orivel Bihong Yu Binfeng Wu Boris Fiuczynski Brian Turek Christian Ehrhardt Christian Schoenebeck Cole Robinson Collin Walling Cornelia Huck Côme Borsoi Daniel Henrique Barboza Daniel Letai Daniel P. Berrange Daniel P. Berrangé Erik Skultety Fabian Freyer Fangge Jin Fedora Weblate Translation Göran Uddeborg Halil Pasic Han Han Hao Wang Ian Wienand Jamie Strandboge Jamie Strandboge Jean-Baptiste Holcroft Jianan Gao Jim Fehlig Jin Yan Jiri Denemark Jonathon Jongsma Julio Faracco Ján Tomko Kashyap Chamarthy Kevin Locke Laine Stump Liao Pingfang Lin Ma Lin Ma Lin Ma Marc Hartmayer Marek Marczykowski-Górecki Markus Schade Martin Kletzander Masayoshi Mizuma Matt Coleman Matt Coleman Mauro Matteo Cascella Michal Privoznik Michał Smyk Milo Casagrande Neal Gompa Nico Pache Nikolay Shirokovskiy Olesya Gerasimenko Orion Poplawski Patrick Magauran Paulo de Rezende Pinatti Pavel Hrdina Peter Krempa Pino Toscano Pino Toscano Piotr Drąg Prathamesh Chavan Ricky Tigg Roman Bogorodskiy Roman Bolshakov Ryan Schmidt Sam Hartman Scott Shambarger Sebastian Mitterle Simon Gaiser Stefan Bader Stefan Berger Szymon Scholz Thomas Huth Tim Wiederhake Tomáš Golembiovský Wang Xin Weblate Yang Hang Yanqiu Zhang Yi Li Yi Wang Yuri Chornoivan Zheng Chuan zhenwei pi Zhenyu Zheng

jobs:
 build-amd64-xsm                                      pass
 build-arm64-xsm                                      pass
 build-i386-xsm                                       pass
 build-amd64                                          pass
 build-arm64                                          pass
 build-armhf                                          pass
 build-i386                                           pass
 build-amd64-libvirt                                  fail
 build-arm64-libvirt                                  fail
 build-armhf-libvirt                                  fail
 build-i386-libvirt                                   fail
 build-amd64-pvops                                    pass
 build-arm64-pvops                                    pass
 build-armhf-pvops                                    pass
 build-i386-pvops                                     pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm   blocked
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm    blocked
Re: dom0 PVH: 'entry->arch.pirq != INVALID_PIRQ' failed at vmsi.c:843
On Fri, Nov 13, 2020 at 03:35:13PM +0100, Roger Pau Monné wrote:
> Forgot to mention, it might also be helpful to boot Xen with
> iommu=debug, just in case.

I put the serial console log at
http://www-soc.lip6.fr/~bouyer/xen-log2.txt
in case it helps

--
Manuel Bouyer
     NetBSD: 26 ans d'experience feront toujours la difference
--
Re: dom0 PVH: 'entry->arch.pirq != INVALID_PIRQ' failed at vmsi.c:843
On Fri, Nov 13, 2020 at 03:33:49PM +0100, Roger Pau Monné wrote:
> On Fri, Nov 13, 2020 at 12:54:57PM +0100, Manuel Bouyer wrote:
> > On Thu, Nov 12, 2020 at 09:19:39PM +0100, Roger Pau Monné wrote:
> > > The following might be able to get you going, but I think I need to
> > > refine the logic a bit there, will have to give it some thought.
> >
> > I also tested with xen devel (Xen version 4.15-unstable, Latest ChangeSet:
> > Wed Nov 4 09:27:22 2020 +0100 git:9ff9705647-dirty).
> > Your patch is needed there too to avoid the panic.
> >
> > As with 4.13, I have problems with interrupts not being properly
> > delivered. The strange thing is that the counter is not 0, but 3 (with 4.13)
> > or 2 (with 4.15), which would mean that interrupts stop being delivered
> > at some point in the setup process. Maybe something to do with mask/unmask?
> >
> > The problematic interrupt is identified as "ioapic2 pin 2" by the NetBSD
> > kernel, so it's not MSI/MSI-X (not sure it matters though).
> > Maybe something related to mask/unmask?
>
> What device do you have on that pin?

The PERC H740P raid controller.

> Is it the only device not working
> properly?

I'm not sure, as the kernel is stalling because of this. The interrupt
counters for the other devices are 0. I can try a kernel without this
driver, to see if other devices report interrupts.

> I get from that that MSI/MSI-X is now working fine.

See above.

> You can get some interrupt info from the 'i' and the 'z' debug keys,
> albeit that won't reflect the state of the emulated IO-APIC used by
> dom0, which is likely what we care about. There's also the 'M' debug
> key, but that's only useful for MSI/MSI-X.
>
> I can try to prepare a patch to dump some info from the emulated
> IO-APIC, but I'm afraid I won't get to it until Monday.

No problem.

--
Manuel Bouyer
     NetBSD: 26 ans d'experience feront toujours la difference
--
Re: [PATCH V2 10/23] xen/mm: Make x86's XENMEM_resource_ioreq_server handling common
On 12.11.20 13:40, Jan Beulich wrote:

Hi Jan

> On 15.10.2020 18:44, Oleksandr Tyshchenko wrote:
>> --- a/xen/common/memory.c
>> +++ b/xen/common/memory.c
>> @@ -30,6 +30,10 @@
>>  #include
>>  #include
>> +#ifdef CONFIG_IOREQ_SERVER
>> +#include
>> +#endif
>
> Preferably #ifdef-s would not be needed here. If any, they'd better
> live in xen/ioreq.h itself then.

ok

>> @@ -1045,6 +1049,38 @@ static int acquire_grant_table(struct domain *d, unsigned int id,
>>      return 0;
>>  }
>> +#ifdef CONFIG_IOREQ_SERVER
>
> To limit the number of #ifdef-s, could this be moved ...
>
>> +static int acquire_ioreq_server(struct domain *d,
>> +                                unsigned int id,
>> +                                unsigned long frame,
>> +                                unsigned int nr_frames,
>> +                                xen_pfn_t mfn_list[])
>> +{
>
> ... here such that ...
>
>> @@ -1103,9 +1139,14 @@ static int acquire_resource(
>>                                  mfn_list);
>>          break;
>> +#ifdef CONFIG_IOREQ_SERVER
>> +    case XENMEM_resource_ioreq_server:
>> +        rc = acquire_ioreq_server(d, xmar.id, xmar.frame, xmar.nr_frames,
>> +                                  mfn_list);
>> +        break;
>> +#endif
>
> ... the ones here then can be dropped?

I think yes, that would be better.

>> default:
>
> Also you'll want to add a blank line between the new case statement
> and the "default:".

ok

--
Regards,
Oleksandr Tyshchenko
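[Editorial illustration] The layout Jan asks for — the helper definition and its case label each under a single #ifdef, with no further #ifdef-s inside the switch body — can be sketched as a stripped-down, compilable model. All names, types, and the error value below are placeholders, not Xen's real API:

```c
#include <assert.h>

#define CONFIG_IOREQ_SERVER 1   /* stand-in for the Kconfig option */

enum { RES_GRANT_TABLE, RES_IOREQ_SERVER };

static int acquire_grant_table(void) { return 0; }

#ifdef CONFIG_IOREQ_SERVER
/* The helper encapsulates all feature-specific work, so the only other
 * #ifdef needed is the one around its case label below. */
static int acquire_ioreq_server(void) { return 0; }
#endif

static int acquire_resource(int type)
{
    switch (type) {
    case RES_GRANT_TABLE:
        return acquire_grant_table();

#ifdef CONFIG_IOREQ_SERVER
    case RES_IOREQ_SERVER:
        return acquire_ioreq_server();
#endif

    default:
        return -38; /* stand-in for -ENOSYS */
    }
}
```

When CONFIG_IOREQ_SERVER is not defined, both the helper and its case label compile away together, which is the point of grouping them.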
Re: [PATCH 06/10] vpci: Make every domain handle its own BARs
On 11/13/20 4:51 PM, Jan Beulich wrote: > On 13.11.2020 15:44, Oleksandr Andrushchenko wrote: >> On 11/13/20 4:38 PM, Jan Beulich wrote: >>> On 13.11.2020 15:32, Oleksandr Andrushchenko wrote: On 11/13/20 4:23 PM, Jan Beulich wrote: > Earlier on I didn't say you should get this to work, only > that I think the general logic around what you add shouldn't make > things more arch specific than they really should be. That said, > something similar to the above should still be doable on x86, > utilizing struct pci_seg's bus2bridge[]. There ought to be > DEV_TYPE_PCI_HOST_BRIDGE entries there, albeit a number of them > (provided by the CPUs themselves rather than the chipset) aren't > really host bridges for the purposes you're after. You mean I can still use DEV_TYPE_PCI_HOST_BRIDGE as a marker while trying to detect what I need? >>> I'm afraid I don't understand what marker you're thinking about >>> here. >> I mean that when I go over bus2bridge entries, should I only work with >> >> those who have DEV_TYPE_PCI_HOST_BRIDGE? > Well, if you're after host bridges - yes. Ok, I'll try to see what I can do about it. > > Jan Thank you, Oleksandr
Re: [PATCH 06/10] vpci: Make every domain handle its own BARs
On 13.11.2020 15:44, Oleksandr Andrushchenko wrote: > > On 11/13/20 4:38 PM, Jan Beulich wrote: >> On 13.11.2020 15:32, Oleksandr Andrushchenko wrote: >>> On 11/13/20 4:23 PM, Jan Beulich wrote: Earlier on I didn't say you should get this to work, only that I think the general logic around what you add shouldn't make things more arch specific than they really should be. That said, something similar to the above should still be doable on x86, utilizing struct pci_seg's bus2bridge[]. There ought to be DEV_TYPE_PCI_HOST_BRIDGE entries there, albeit a number of them (provided by the CPUs themselves rather than the chipset) aren't really host bridges for the purposes you're after. >>> You mean I can still use DEV_TYPE_PCI_HOST_BRIDGE as a marker >>> >>> while trying to detect what I need? >> I'm afraid I don't understand what marker you're thinking about >> here. > > I mean that when I go over bus2bridge entries, should I only work with > > those who have DEV_TYPE_PCI_HOST_BRIDGE? Well, if you're after host bridges - yes. Jan
Re: [PATCH 06/10] vpci: Make every domain handle its own BARs
On 11/13/20 4:38 PM, Jan Beulich wrote: > On 13.11.2020 15:32, Oleksandr Andrushchenko wrote: >> On 11/13/20 4:23 PM, Jan Beulich wrote: >>> On 13.11.2020 13:41, Oleksandr Andrushchenko wrote: On 11/13/20 1:35 PM, Jan Beulich wrote: > On 13.11.2020 12:02, Oleksandr Andrushchenko wrote: >> On 11/13/20 12:50 PM, Jan Beulich wrote: >>> On 13.11.2020 11:46, Oleksandr Andrushchenko wrote: On 11/13/20 12:25 PM, Jan Beulich wrote: > On 13.11.2020 07:32, Oleksandr Andrushchenko wrote: >> I'll try to replace is_hardware_domain with something like: >> >> +/* >> + * Detect whether physical PCI devices in this segment belong >> + * to the domain given, e.g. on x86 all PCI devices live in hwdom, >> + * but in case of ARM this might not be the case: those may also >> + * live in driver domains or even Xen itself. >> + */ >> +bool pci_is_hardware_domain(struct domain *d, u16 seg) >> +{ >> +#ifdef CONFIG_X86 >> + return is_hardware_domain(d); >> +#elif CONFIG_ARM >> + return pci_is_owner_domain(d, seg); >> +#else >> +#error "Unsupported architecture" >> +#endif >> +} >> + >> +/* >> + * Get domain which owns this segment: for x86 this is always >> hardware >> + * domain and for ARM this can be different. >> + */ >> +struct domain *pci_get_hardware_domain(u16 seg) >> +{ >> +#ifdef CONFIG_X86 >> + return hardware_domain; >> +#elif CONFIG_ARM >> + return pci_get_owner_domain(seg); >> +#else >> +#error "Unsupported architecture" >> +#endif >> +} >> >> This is what I use to properly detect the domain that really owns >> physical host bridge > I consider this problematic. We should try to not let Arm's and x86'es > PCI implementations diverge too much, i.e. at least the underlying > basic > model would better be similar. For example, if entire segments can be > assigned to a driver domain on Arm, why should the same not be > possible > on x86? 
Good question, probably in this case x86 == ARM and I can use pci_is_owner_domain for both architectures instead of using is_hardware_domain for x86 > Furthermore I'm suspicious about segments being the right granularity > here. On ia64 multiple host bridges could (and typically would) live > on segment 0. Iirc I had once seen output from an x86 system which was > apparently laid out similarly. Therefore, just like for MCFG, I think > the granularity wants to be bus ranges within a segment. Can you please suggest something we can use as a hint for such a detection logic? >>> The underlying information comes from ACPI tables, iirc. I don't >>> recall the details, though - sorry. >> Ok, so seg + bus should be enough for both ARM and Xen then, right? >> >> pci_get_hardware_domain(u16 seg, u8 bus) > Whether an individual bus number can suitable express things I can't > tell; I did say bus range, but of course if you care about just > individual devices, then a single bus number will of course do. I can implement the lookup whether a PCI host bridge owned by a particular domain with something like: struct pci_host_bridge *bridge = pci_find_host_bridge(seg, bus); return bridge->dt_node->used_by == d->domain_id; Could you please give me a hint how this can be done on x86? >>> Bridges can't be assigned to other than the hardware domain right >>> now. >> So, I can probably then have pci_get_hardware_domain implemented >> >> by both ARM and x86 in their arch specific code. And for x86 for now >> >> it can simply be a wrapper for is_hardware_domain? > "get" can't be a wrapper for "is" Sure, s/get/is > , but I think I get what you're > saying. But no, preferably I would not see you do this (hence my > earlier comment). I still think you could use the owner of the > respective device; I may be mistaken, but iirc we do assign > bridges to Dom0, so deriving from that would be better than > hardwiring to is_hardware_domain(). 
Ok, I'll try to figure out how to do that > >>>Earlier on I didn't say you should get this to work, only >>> that I think the general logic around what you add shouldn't make >>> things more arch specific than they really should be. That said, >>> something similar to the above should still be doable on x86, >>> utilizing struct pci_seg's bus2bridge[]. There ought to be >>> DEV_TYPE_PCI_HOST_BRIDGE entries there, albeit a number of them >>> (provided by the CPUs themselves rather than the chipset) aren't >>> really host bridges for the purposes you're after. >> You mean I can still use
Re: [PATCH 06/10] vpci: Make every domain handle its own BARs
On 13.11.2020 15:32, Oleksandr Andrushchenko wrote: > > On 11/13/20 4:23 PM, Jan Beulich wrote: >> On 13.11.2020 13:41, Oleksandr Andrushchenko wrote: >>> On 11/13/20 1:35 PM, Jan Beulich wrote: On 13.11.2020 12:02, Oleksandr Andrushchenko wrote: > On 11/13/20 12:50 PM, Jan Beulich wrote: >> On 13.11.2020 11:46, Oleksandr Andrushchenko wrote: >>> On 11/13/20 12:25 PM, Jan Beulich wrote: On 13.11.2020 07:32, Oleksandr Andrushchenko wrote: > I'll try to replace is_hardware_domain with something like: > > +/* > + * Detect whether physical PCI devices in this segment belong > + * to the domain given, e.g. on x86 all PCI devices live in hwdom, > + * but in case of ARM this might not be the case: those may also > + * live in driver domains or even Xen itself. > + */ > +bool pci_is_hardware_domain(struct domain *d, u16 seg) > +{ > +#ifdef CONFIG_X86 > + return is_hardware_domain(d); > +#elif CONFIG_ARM > + return pci_is_owner_domain(d, seg); > +#else > +#error "Unsupported architecture" > +#endif > +} > + > +/* > + * Get domain which owns this segment: for x86 this is always > hardware > + * domain and for ARM this can be different. > + */ > +struct domain *pci_get_hardware_domain(u16 seg) > +{ > +#ifdef CONFIG_X86 > + return hardware_domain; > +#elif CONFIG_ARM > + return pci_get_owner_domain(seg); > +#else > +#error "Unsupported architecture" > +#endif > +} > > This is what I use to properly detect the domain that really owns > physical host bridge I consider this problematic. We should try to not let Arm's and x86'es PCI implementations diverge too much, i.e. at least the underlying basic model would better be similar. For example, if entire segments can be assigned to a driver domain on Arm, why should the same not be possible on x86? 
>>> Good question, probably in this case x86 == ARM and I can use >>> >>> pci_is_owner_domain for both architectures instead of using >>> is_hardware_domain for x86 >>> Furthermore I'm suspicious about segments being the right granularity here. On ia64 multiple host bridges could (and typically would) live on segment 0. Iirc I had once seen output from an x86 system which was apparently laid out similarly. Therefore, just like for MCFG, I think the granularity wants to be bus ranges within a segment. >>> Can you please suggest something we can use as a hint for such a >>> detection logic? >> The underlying information comes from ACPI tables, iirc. I don't >> recall the details, though - sorry. > Ok, so seg + bus should be enough for both ARM and Xen then, right? > > pci_get_hardware_domain(u16 seg, u8 bus) Whether an individual bus number can suitable express things I can't tell; I did say bus range, but of course if you care about just individual devices, then a single bus number will of course do. >>> I can implement the lookup whether a PCI host bridge owned by a particular >>> >>> domain with something like: >>> >>> struct pci_host_bridge *bridge = pci_find_host_bridge(seg, bus); >>> >>> return bridge->dt_node->used_by == d->domain_id; >>> >>> Could you please give me a hint how this can be done on x86? >> Bridges can't be assigned to other than the hardware domain right >> now. > > So, I can probably then have pci_get_hardware_domain implemented > > by both ARM and x86 in their arch specific code. And for x86 for now > > it can simply be a wrapper for is_hardware_domain? "get" can't be a wrapper for "is", but I think I get what you're saying. But no, preferably I would not see you do this (hence my earlier comment). I still think you could use the owner of the respective device; I may be mistaken, but iirc we do assign bridges to Dom0, so deriving from that would be better than hardwiring to is_hardware_domain(). 
>> Earlier on I didn't say you should get this to work, only >> that I think the general logic around what you add shouldn't make >> things more arch specific than they really should be. That said, >> something similar to the above should still be doable on x86, >> utilizing struct pci_seg's bus2bridge[]. There ought to be >> DEV_TYPE_PCI_HOST_BRIDGE entries there, albeit a number of them >> (provided by the CPUs themselves rather than the chipset) aren't >> really host bridges for the purposes you're after. > > You mean I can still use DEV_TYPE_PCI_HOST_BRIDGE as a marker > > while trying to detect what I need? I'm afraid I don't understand what marker you're thinking about here. Jan
Re: dom0 PVH: 'entry->arch.pirq != INVALID_PIRQ' failed at vmsi.c:843
On Fri, Nov 13, 2020 at 12:54:57PM +0100, Manuel Bouyer wrote:
> On Thu, Nov 12, 2020 at 09:19:39PM +0100, Roger Pau Monné wrote:
> > The following might be able to get you going, but I think I need to
> > refine the logic a bit there, will have to give it some thought.
>
> I also tested with xen devel (Xen version 4.15-unstable, Latest ChangeSet:
> Wed Nov 4 09:27:22 2020 +0100 git:9ff9705647-dirty).
> Your patch is needed there too to avoid the panic.
>
> As with 4.13, I have problems with interrupts not being properly
> delivered. The strange thing is that the counter is not 0, but 3 (with 4.13)
> or 2 (with 4.15), which would mean that interrupts stop being delivered
> at some point in the setup process. Maybe something to do with mask/unmask?
>
> The problematic interrupt is identified as "ioapic2 pin 2" by the NetBSD
> kernel, so it's not MSI/MSI-X (not sure it matters though).
> Maybe something related to mask/unmask?

Forgot to mention, it might also be helpful to boot Xen with
iommu=debug, just in case.

Roger.
Re: dom0 PVH: 'entry->arch.pirq != INVALID_PIRQ' failed at vmsi.c:843
On Fri, Nov 13, 2020 at 12:54:57PM +0100, Manuel Bouyer wrote:
> On Thu, Nov 12, 2020 at 09:19:39PM +0100, Roger Pau Monné wrote:
> > The following might be able to get you going, but I think I need to
> > refine the logic a bit there, will have to give it some thought.
>
> I also tested with xen devel (Xen version 4.15-unstable, Latest ChangeSet:
> Wed Nov 4 09:27:22 2020 +0100 git:9ff9705647-dirty).
> Your patch is needed there too to avoid the panic.
>
> As with 4.13, I have problems with interrupts not being properly
> delivered. The strange thing is that the counter is not 0, but 3 (with 4.13)
> or 2 (with 4.15), which would mean that interrupts stop being delivered
> at some point in the setup process. Maybe something to do with mask/unmask?
>
> The problematic interrupt is identified as "ioapic2 pin 2" by the NetBSD
> kernel, so it's not MSI/MSI-X (not sure it matters though).
> Maybe something related to mask/unmask?

What device do you have on that pin? Is it the only device not working
properly? I get from that that MSI/MSI-X is now working fine.

You can get some interrupt info from the 'i' and the 'z' debug keys,
albeit that won't reflect the state of the emulated IO-APIC used by
dom0, which is likely what we care about. There's also the 'M' debug
key, but that's only useful for MSI/MSI-X.

I can try to prepare a patch to dump some info from the emulated
IO-APIC, but I'm afraid I won't get to it until Monday.

Roger.
Re: [PATCH 06/10] vpci: Make every domain handle its own BARs
On 11/13/20 4:23 PM, Jan Beulich wrote: > On 13.11.2020 13:41, Oleksandr Andrushchenko wrote: >> On 11/13/20 1:35 PM, Jan Beulich wrote: >>> On 13.11.2020 12:02, Oleksandr Andrushchenko wrote: On 11/13/20 12:50 PM, Jan Beulich wrote: > On 13.11.2020 11:46, Oleksandr Andrushchenko wrote: >> On 11/13/20 12:25 PM, Jan Beulich wrote: >>> On 13.11.2020 07:32, Oleksandr Andrushchenko wrote: I'll try to replace is_hardware_domain with something like: +/* + * Detect whether physical PCI devices in this segment belong + * to the domain given, e.g. on x86 all PCI devices live in hwdom, + * but in case of ARM this might not be the case: those may also + * live in driver domains or even Xen itself. + */ +bool pci_is_hardware_domain(struct domain *d, u16 seg) +{ +#ifdef CONFIG_X86 + return is_hardware_domain(d); +#elif CONFIG_ARM + return pci_is_owner_domain(d, seg); +#else +#error "Unsupported architecture" +#endif +} + +/* + * Get domain which owns this segment: for x86 this is always hardware + * domain and for ARM this can be different. + */ +struct domain *pci_get_hardware_domain(u16 seg) +{ +#ifdef CONFIG_X86 + return hardware_domain; +#elif CONFIG_ARM + return pci_get_owner_domain(seg); +#else +#error "Unsupported architecture" +#endif +} This is what I use to properly detect the domain that really owns physical host bridge >>> I consider this problematic. We should try to not let Arm's and x86'es >>> PCI implementations diverge too much, i.e. at least the underlying basic >>> model would better be similar. For example, if entire segments can be >>> assigned to a driver domain on Arm, why should the same not be possible >>> on x86? >> Good question, probably in this case x86 == ARM and I can use >> >> pci_is_owner_domain for both architectures instead of using >> is_hardware_domain for x86 >> >>> Furthermore I'm suspicious about segments being the right granularity >>> here. On ia64 multiple host bridges could (and typically would) live >>> on segment 0. 
Iirc I had once seen output from an x86 system which was >>> apparently laid out similarly. Therefore, just like for MCFG, I think >>> the granularity wants to be bus ranges within a segment. >> Can you please suggest something we can use as a hint for such a >> detection logic? > The underlying information comes from ACPI tables, iirc. I don't > recall the details, though - sorry. Ok, so seg + bus should be enough for both ARM and Xen then, right? pci_get_hardware_domain(u16 seg, u8 bus) >>> Whether an individual bus number can suitable express things I can't >>> tell; I did say bus range, but of course if you care about just >>> individual devices, then a single bus number will of course do. >> I can implement the lookup whether a PCI host bridge owned by a particular >> >> domain with something like: >> >> struct pci_host_bridge *bridge = pci_find_host_bridge(seg, bus); >> >> return bridge->dt_node->used_by == d->domain_id; >> >> Could you please give me a hint how this can be done on x86? > Bridges can't be assigned to other than the hardware domain right > now. So, I can probably then have pci_get_hardware_domain implemented by both ARM and x86 in their arch specific code. And for x86 for now it can simply be a wrapper for is_hardware_domain? > Earlier on I didn't say you should get this to work, only > that I think the general logic around what you add shouldn't make > things more arch specific than they really should be. That said, > something similar to the above should still be doable on x86, > utilizing struct pci_seg's bus2bridge[]. There ought to be > DEV_TYPE_PCI_HOST_BRIDGE entries there, albeit a number of them > (provided by the CPUs themselves rather than the chipset) aren't > really host bridges for the purposes you're after. You mean I can still use DEV_TYPE_PCI_HOST_BRIDGE as a marker while trying to detect what I need? > > Jan Thank you, Oleksandr
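[Editorial illustration] The ownership lookup discussed in this thread — per-segment host-bridge entries keyed by a bus range, so that multiple host bridges can coexist on segment 0 as on ia64 — can be modeled as a toy. Every structure and field name below is invented for the sketch; Xen's real struct pci_seg and its bus2bridge[] table are laid out differently:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical device types, mirroring the DEV_TYPE_PCI_HOST_BRIDGE
 * marker mentioned in the discussion. */
enum dev_type { DEV_TYPE_PCI, DEV_TYPE_PCI_HOST_BRIDGE };

struct bridge_entry {
    enum dev_type type;
    uint8_t bus_start, bus_end;   /* bus range behind this bridge */
    uint16_t owner_domid;         /* domain owning the bridge */
};

/* Invented stand-in for a per-segment table such as struct pci_seg. */
struct pci_segment {
    uint16_t seg;
    struct bridge_entry bridges[4];
    unsigned int nr_bridges;
};

/* Return the domid owning the host bridge that covers @bus on @s,
 * or -1 if no host bridge entry covers that bus. */
static int pci_host_bridge_owner(const struct pci_segment *s, uint8_t bus)
{
    for (unsigned int i = 0; i < s->nr_bridges; i++) {
        const struct bridge_entry *b = &s->bridges[i];

        if (b->type == DEV_TYPE_PCI_HOST_BRIDGE &&
            bus >= b->bus_start && bus <= b->bus_end)
            return b->owner_domid;
    }
    return -1;
}
```

The key design point from the thread is that (segment, bus range) rather than segment alone identifies a host bridge, so two bridges on segment 0 with disjoint bus ranges can have different owners.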
Re: [PATCH V2 01/23] x86/ioreq: Prepare IOREQ feature for making it common
On 13.11.2020 13:45, Oleksandr wrote:
> On 13.11.20 13:20, Jan Beulich wrote:
>> On 13.11.2020 12:09, Oleksandr wrote:
>>> On 12.11.20 12:58, Jan Beulich wrote:
>>>> On 15.10.2020 18:44, Oleksandr Tyshchenko wrote:
>>>>> @@ -855,7 +841,7 @@ int hvm_destroy_ioreq_server(struct domain *d,
>>>>> ioservid_t id)
>>>>>
>>>>>     domain_pause(d);
>>>>>
>>>>> -    p2m_set_ioreq_server(d, 0, s);
>>>>> +    arch_hvm_destroy_ioreq_server(s);
>>>>
>>>> Iirc there are plans to rename hvm_destroy_ioreq_server() in the
>>>> course of the generalization. If so, this arch hook would imo better
>>>> be named following the new scheme right away.
>>>
>>> Could you please clarify, are you speaking about the plans discussed in
>>> "[PATCH V2 12/23] xen/ioreq: Remove "hvm" prefixes from involved
>>> function names"?
>>>
>>> Copy text for the convenience:
>>> At least some of the functions touched here would be nice to be
>>> moved to a more consistent new naming scheme right away, to
>>> avoid having to touch all the same places again. I guess ioreq
>>> server functions would be nice to all start with ioreq_server_
>>> and ioreq functions to all start with ioreq_. E.g. ioreq_send()
>>> and ioreq_server_select().
>>>
>>> or some other plans I am not aware of?
>>>
>>> What I really want to avoid with the IOREQ enabling work is
>>> round-trips related to naming things; this patch series became quite
>>> big (and consumes some time to rebase and test it) and I expect it to
>>> become bigger.
>>>
>>> So the arch_hvm_destroy_ioreq_server() should be
>>> arch_ioreq_server_destroy()?
>> I think so, yes. If you want to avoid doing full patches, how
>> about you simply list the functions / variables you plan to
>> rename alongside the intended new names? Would likely be easier
>> for all involved parties.
> I think it is a good idea. I will prepare a list once I analyze all new
> comments to this series.
> As I understand that only global IOREQ functions need renaming
> according to the new scheme, local ones can be left as is but without
> "hvm" prefixes of course?

Please apply common sense: Static ones, if you have to drop a hvm_
prefix, may in some cases better further be renamed as well, when their
names aren't really in line with their purpose (anymore). But yes,
following a consistent naming model is more relevant for non-static
functions.

Jan
Re: [XEN PATCH] tools/xenstore: Log xenstored build ID on startup
On 13/11/2020 14:18, Bjoern Doebel wrote:
> Right now we do not have a mechanism to determine the version of the
> currently running xenstored at runtime. As xenstored runs throughout the
> lifetime of a Xen host, this may lead to problems when newer user space
> builds are staged. Then, the running xenstored will no longer match the
> version of the installed xenstored.
>
> To allow users to always identify the running version of xenstored, add
> a linker-generated unique build ID to every xenstored build. Add
> functionality to log this build ID into a file upon service startup.
>
> Signed-off-by: Bjoern Doebel
> Reviewed-by: Martin Mazein
> Reviewed-by: Paul Durrant

I understand the problem you're trying to solve, but why is this
anything more than just enabling build-ids by default across tools/?

There are already standard ways of interacting with the build ID of
running executables on the system. I'd strongly discourage doing
anything custom in xenstored specifically.

~Andrew
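[Editorial illustration] The "standard ways" referred to here include `readelf -n` on a binary and parsing the NT_GNU_BUILD_ID ELF note; a process can also locate its own build ID at runtime by walking its program headers. A hedged sketch using plain glibc (independent of the patch under discussion — not code from the thread):

```c
#define _GNU_SOURCE
#include <elf.h>
#include <link.h>
#include <stdio.h>
#include <string.h>

/* Hex-encode len bytes of buf into out (out needs 2*len+1 chars). */
static void hex_encode(const unsigned char *buf, size_t len, char *out)
{
    for (size_t i = 0; i < len; i++)
        sprintf(out + 2 * i, "%02x", buf[i]);
    out[2 * len] = '\0';
}

/* Scan one object's PT_NOTE segments for the NT_GNU_BUILD_ID note. */
static int build_id_cb(struct dl_phdr_info *info, size_t size, void *data)
{
    char *out = data;
    (void)size;

    for (size_t i = 0; i < info->dlpi_phnum; i++) {
        const ElfW(Phdr) *p = &info->dlpi_phdr[i];

        if (p->p_type != PT_NOTE)
            continue;

        const char *pos = (const char *)(info->dlpi_addr + p->p_vaddr);
        const char *end = pos + p->p_memsz;

        /* Notes are packed back to back, 4-byte aligned. */
        while (pos + sizeof(ElfW(Nhdr)) <= end) {
            const ElfW(Nhdr) *n = (const ElfW(Nhdr) *)pos;
            const char *name = pos + sizeof(*n);
            const unsigned char *desc =
                (const unsigned char *)name + ((n->n_namesz + 3) & ~3u);

            if (n->n_type == NT_GNU_BUILD_ID && n->n_namesz == 4 &&
                memcmp(name, "GNU", 4) == 0) {
                hex_encode(desc, n->n_descsz, out);
                return 1;
            }
            pos = (const char *)desc + ((n->n_descsz + 3) & ~3u);
        }
    }
    return 1; /* only inspect the main executable (first callback) */
}

/* Fill out with the main executable's build ID as hex, or "" if none
 * (e.g. when the binary was linked without --build-id). */
static void get_own_build_id(char *out)
{
    out[0] = '\0';
    dl_iterate_phdr(build_id_cb, out);
}
```

This is the same information `readelf -n /proc/<pid>/exe` shows for a running process, which is Andrew's point: no xenstored-specific plumbing is strictly required to recover it.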
Re: [PATCH V2 09/23] xen/dm: Make x86's DM feature common
On 12.11.20 13:32, Jan Beulich wrote:

Hi Jan

> On 15.10.2020 18:44, Oleksandr Tyshchenko wrote:
>> From: Julien Grall
>>
>> As a lot of x86 code can be re-used on Arm later on, this patch splits
>> devicemodel support into common and arch specific parts. The common DM
>> feature is supposed to be built with IOREQ_SERVER option enabled (as
>> well as the IOREQ feature), which is selected for x86's config HVM for
>> now.
>
> Did you consider doing it the other way around? It would seem more
> natural to have the top level dm-op handling arch-specific and call
> into e.g. ioreq_server_dm_op() for otherwise unhandled ops, just like
> e.g. do_domctl() calls into iommu_do_domctl() (indirectly via
> arch_do_domctl()). I ask because in the long run I expect the ioreq
> server sub-ops to only be a small part of the overall set of dm-ops;
> already now it's 7 out of 18 if I got the counting right. This would
> then also leave compat_dm_op() in x86 code. But yes, there are also
> downsides with this alternative.

No, I didn't consider. I separated the proposed DM changes from
Julien's patch without modifying the logic. My changes on top (except
rebasing of course) are updating XSM code and introducing a xen/dm.h
for definitions.

--
Regards,
Oleksandr Tyshchenko
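[Editorial illustration] The alternative layering Jan describes — an arch-specific top-level dispatcher that falls through to a common handler for the sub-ops it does not treat itself, as do_domctl() does with iommu_do_domctl() — can be modeled minimally. All names and the -EOPNOTSUPP stand-in below are illustrative, not Xen's actual dm-op API:

```c
#include <assert.h>

/* Toy model of the two-layer dispatch. */
struct dm_op { int op; };

enum { DMOP_ARCH_ONLY = 1, DMOP_CREATE_IOREQ_SERVER = 2, DMOP_UNKNOWN = 99 };

/* Common code: handles only the ioreq-server sub-ops. */
static int common_ioreq_server_dm_op(struct dm_op *op)
{
    switch (op->op) {
    case DMOP_CREATE_IOREQ_SERVER:
        return 0;       /* handled by common code */
    default:
        return -95;     /* stand-in for -EOPNOTSUPP */
    }
}

/* Arch code keeps the top-level dispatcher and forwards what it
 * doesn't know to the common ioreq-server handler. */
static int arch_dm_op(struct dm_op *op)
{
    switch (op->op) {
    case DMOP_ARCH_ONLY:
        return 0;       /* handled by arch-specific code */
    default:
        return common_ioreq_server_dm_op(op);
    }
}
```

The trade-off Jan alludes to: this keeps arch-only ops (and compat_dm_op()) out of common code, at the cost of each architecture carrying its own top-level switch.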
Re: [XEN PATCH] tools/xenstore: Log xenstored build ID on startup
On 13.11.2020 15:18, Bjoern Doebel wrote:
> --- a/tools/xenstore/Makefile
> +++ b/tools/xenstore/Makefile
> @@ -66,6 +66,10 @@ $(XENSTORED_OBJS): CFLAGS += $(SYSTEMD_CFLAGS)
>  xenstored: LDFLAGS += $(SYSTEMD_LIBS)
>  endif
>
> +# xenstored: enforce creation of a buildID section and use a linker
> +# script to add additional symbols around that section
> +xenstored: LDFLAGS += -Wl,--build-id=sha1 -Wl,-T,buildid_symbols.ld

Since in the hypervisor build we're careful to not use --build-id when
the linker doesn't support it, I suppose the same care needs applying
here. See the setting of build_id_linker in ./Config.mk.

Jan
Re: [PATCH 06/10] vpci: Make every domain handle its own BARs
On 13.11.2020 13:41, Oleksandr Andrushchenko wrote: > > On 11/13/20 1:35 PM, Jan Beulich wrote: >> On 13.11.2020 12:02, Oleksandr Andrushchenko wrote: >>> On 11/13/20 12:50 PM, Jan Beulich wrote: On 13.11.2020 11:46, Oleksandr Andrushchenko wrote: > On 11/13/20 12:25 PM, Jan Beulich wrote: >> On 13.11.2020 07:32, Oleksandr Andrushchenko wrote: >>> I'll try to replace is_hardware_domain with something like: >>> >>> +/* >>> + * Detect whether physical PCI devices in this segment belong >>> + * to the domain given, e.g. on x86 all PCI devices live in hwdom, >>> + * but in case of ARM this might not be the case: those may also >>> + * live in driver domains or even Xen itself. >>> + */ >>> +bool pci_is_hardware_domain(struct domain *d, u16 seg) >>> +{ >>> +#ifdef CONFIG_X86 >>> + return is_hardware_domain(d); >>> +#elif CONFIG_ARM >>> + return pci_is_owner_domain(d, seg); >>> +#else >>> +#error "Unsupported architecture" >>> +#endif >>> +} >>> + >>> +/* >>> + * Get domain which owns this segment: for x86 this is always hardware >>> + * domain and for ARM this can be different. >>> + */ >>> +struct domain *pci_get_hardware_domain(u16 seg) >>> +{ >>> +#ifdef CONFIG_X86 >>> + return hardware_domain; >>> +#elif CONFIG_ARM >>> + return pci_get_owner_domain(seg); >>> +#else >>> +#error "Unsupported architecture" >>> +#endif >>> +} >>> >>> This is what I use to properly detect the domain that really owns >>> physical host bridge >> I consider this problematic. We should try to not let Arm's and x86'es >> PCI implementations diverge too much, i.e. at least the underlying basic >> model would better be similar. For example, if entire segments can be >> assigned to a driver domain on Arm, why should the same not be possible >> on x86? > Good question, probably in this case x86 == ARM and I can use > > pci_is_owner_domain for both architectures instead of using > is_hardware_domain for x86 > >> Furthermore I'm suspicious about segments being the right granularity >> here. 
On ia64 multiple host bridges could (and typically would) live >> on segment 0. Iirc I had once seen output from an x86 system which was >> apparently laid out similarly. Therefore, just like for MCFG, I think >> the granularity wants to be bus ranges within a segment. > Can you please suggest something we can use as a hint for such a > detection logic? The underlying information comes from ACPI tables, iirc. I don't recall the details, though - sorry. >>> Ok, so seg + bus should be enough for both ARM and Xen then, right? >>> >>> pci_get_hardware_domain(u16 seg, u8 bus) >> Whether an individual bus number can suitable express things I can't >> tell; I did say bus range, but of course if you care about just >> individual devices, then a single bus number will of course do. > > I can implement the lookup whether a PCI host bridge owned by a particular > > domain with something like: > > struct pci_host_bridge *bridge = pci_find_host_bridge(seg, bus); > > return bridge->dt_node->used_by == d->domain_id; > > Could you please give me a hint how this can be done on x86? Bridges can't be assigned to other than the hardware domain right now. Earlier on I didn't say you should get this to work, only that I think the general logic around what you add shouldn't make things more arch specific than they really should be. That said, something similar to the above should still be doable on x86, utilizing struct pci_seg's bus2bridge[]. There ought to be DEV_TYPE_PCI_HOST_BRIDGE entries there, albeit a number of them (provided by the CPUs themselves rather than the chipset) aren't really host bridges for the purposes you're after. Jan
[XEN PATCH] tools/xenstore: Log xenstored build ID on startup
Right now we do not have a mechanism to determine the version of the
currently running xenstored at runtime. As xenstored runs throughout the
lifetime of a Xen host, this may lead to problems when newer user space
builds are staged. Then, the running xenstored will no longer match the
version of the installed xenstored.

To allow users to always identify the running version of xenstored, add
a linker-generated unique build ID to every xenstored build. Add
functionality to log this build ID into a file upon service startup.

Signed-off-by: Bjoern Doebel
Reviewed-by: Martin Mazein
Reviewed-by: Paul Durrant
---
 tools/hotplug/Linux/launch-xenstore.in |  2 +-
 tools/xenstore/Makefile                |  4 +++
 tools/xenstore/buildid_symbols.ld      | 10 +++
 tools/xenstore/xenstored_core.c        |  8 ++
 tools/xenstore/xenstored_core.h        |  3 ++
 tools/xenstore/xenstored_minios.c      |  4 +++
 tools/xenstore/xenstored_posix.c       | 52 ++
 7 files changed, 82 insertions(+), 1 deletion(-)
 create mode 100644 tools/xenstore/buildid_symbols.ld

diff --git a/tools/hotplug/Linux/launch-xenstore.in b/tools/hotplug/Linux/launch-xenstore.in
index 991dec8d25..a6f2254030 100644
--- a/tools/hotplug/Linux/launch-xenstore.in
+++ b/tools/hotplug/Linux/launch-xenstore.in
@@ -62,7 +62,7 @@ test -f @CONFIG_DIR@/@CONFIG_LEAF_DIR@/xencommons && . @CONFIG_DIR@/@CONFIG_LEAF
 	}

 	echo -n Starting $XENSTORED...
-	$XENSTORED --pid-file @XEN_RUN_DIR@/xenstored.pid $XENSTORED_ARGS
+	$XENSTORED --pid-file @XEN_RUN_DIR@/xenstored.pid --buildid-file @XEN_RUN_DIR@/xenstored.buildid $XENSTORED_ARGS

 	systemd-notify --booted 2>/dev/null || timeout_xenstore $XENSTORED || exit 1
diff --git a/tools/xenstore/Makefile b/tools/xenstore/Makefile
index 9a0f0d012d..c63350980b 100644
--- a/tools/xenstore/Makefile
+++ b/tools/xenstore/Makefile
@@ -66,6 +66,10 @@ $(XENSTORED_OBJS): CFLAGS += $(SYSTEMD_CFLAGS)
 xenstored: LDFLAGS += $(SYSTEMD_LIBS)
 endif

+# xenstored: enforce creation of a buildID section and use a linker
+# script to add additional symbols around that section
+xenstored: LDFLAGS += -Wl,--build-id=sha1 -Wl,-T,buildid_symbols.ld
+
 $(XENSTORED_OBJS): CFLAGS += $(CFLAGS_libxengnttab)

 xenstored: $(XENSTORED_OBJS)
diff --git a/tools/xenstore/buildid_symbols.ld b/tools/xenstore/buildid_symbols.ld
new file mode 100644
index 00..d74024c4e9
--- /dev/null
+++ b/tools/xenstore/buildid_symbols.ld
@@ -0,0 +1,10 @@
+SECTIONS
+{
+	__buildid_note_section = . ;
+	.note.gnu.build-id :
+	{
+		*(.note.gnu.build-id)
+	}
+	__buildid_end = . ;
+}
+INSERT AFTER .data
diff --git a/tools/xenstore/xenstored_core.c b/tools/xenstore/xenstored_core.c
index b4be374d3f..c6f107bdd9 100644
--- a/tools/xenstore/xenstored_core.c
+++ b/tools/xenstore/xenstored_core.c
@@ -1844,6 +1844,7 @@ static void usage(void)

 static struct option options[] = {
+	{ "buildid-file", 1, NULL, 'B' },
 	{ "no-domain-init", 0, NULL, 'D' },
 	{ "entry-nb", 1, NULL, 'E' },
 	{ "pid-file", 1, NULL, 'F' },
@@ -1875,12 +1876,16 @@ int main(int argc, char *argv[])
 	bool outputpid = false;
 	bool no_domain_init = false;
 	const char *pidfile = NULL;
+	const char *buildid_file = NULL;
 	int timeout;

 	while ((opt = getopt_long(argc, argv, "DE:F:HNPS:t:T:RVW:", options,
 				  NULL)) != -1) {
 		switch (opt) {
+		case 'B':
+			buildid_file = optarg;
+			break;
 		case 'D':
 			no_domain_init = true;
 			break;
@@ -1948,6 +1953,9 @@ int main(int argc, char *argv[])
 	if (pidfile)
 		write_pidfile(pidfile);

+	if (buildid_file)
+		write_buildid_file(buildid_file);
+
 	/* Talloc leak reports go to stderr, which is closed if we fork. */
 	if (!dofork)
 		talloc_enable_leak_report_full();
diff --git a/tools/xenstore/xenstored_core.h b/tools/xenstore/xenstored_core.h
index 1df6ad94ab..712280626c 100644
--- a/tools/xenstore/xenstored_core.h
+++ b/tools/xenstore/xenstored_core.h
@@ -193,6 +193,9 @@ void xenbus_notify_running(void);
 /* Write out the pidfile */
 void write_pidfile(const char *pidfile);

+/* Write the buildid file */
+void write_buildid_file(const char *buildidfile);
+
 /* Fork but do not close terminal FDs */
 void daemonize(void);
 /* Close stdin/stdout/stderr to complete daemonize */
diff --git a/tools/xenstore/xenstored_minios.c b/tools/xenstore/xenstored_minios.c
index c94493e52a..ef1151aee4 100644
--- a/tools/xenstore/xenstored_minios.c
+++ b/tools/xenstore/xenstored_minios.c
@@ -24,6 +24,10 @@
 void write_pidfile(const char *pidfile)
 {
 }

+void write_buildid_file(const char *buildid_file)
+{
+}
+
 void daemonize(void)
 {
 }
diff --git
Re: [PATCH V2 07/23] xen/ioreq: Move x86's ioreq_gfn(server) to struct domain
On 12.11.20 13:21, Jan Beulich wrote: Hi Jan On 15.10.2020 18:44, Oleksandr Tyshchenko wrote: --- a/xen/include/asm-x86/hvm/ioreq.h +++ b/xen/include/asm-x86/hvm/ioreq.h @@ -77,7 +77,7 @@ static inline int hvm_map_mem_type_to_ioreq_server(struct domain *d, if ( flags & ~XEN_DMOP_IOREQ_MEM_ACCESS_WRITE ) return -EINVAL; -spin_lock_recursive(&d->arch.hvm.ioreq_server.lock); +spin_lock_recursive(&d->ioreq_server.lock); s = get_ioreq_server(d, id); @@ -92,7 +92,7 @@ static inline int hvm_map_mem_type_to_ioreq_server(struct domain *d, rc = p2m_set_ioreq_server(d, flags, s); out: -spin_unlock_recursive(&d->arch.hvm.ioreq_server.lock); +spin_unlock_recursive(&d->ioreq_server.lock); if ( rc == 0 && flags == 0 ) { Does this build at this point, when !CONFIG_IOREQ_SERVER? Patch 1 moves the code here without guards, and patch 2, when introducing the Kconfig symbol, doesn't add guards here. I admit I didn't check further intermediate patches. Yes. I can confirm I checked x86 patch by patch with CONFIG_IOREQ_SERVER; as for !CONFIG_IOREQ_SERVER I can't recollect well enough to be 100% sure, but I likely tested that patch by patch too. Anyway, I have just rechecked. Probably it is because this header isn't in use with !CONFIG_IOREQ_SERVER, since all users are x86/hvm/* and common/ioreq.c -- Regards, Oleksandr Tyshchenko
Re: [PATCH V2 02/23] xen/ioreq: Make x86's IOREQ feature common
On 12.11.20 13:11, Jan Beulich wrote: Hi Jan On 15.10.2020 18:44, Oleksandr Tyshchenko wrote: From: Oleksandr Tyshchenko As a lot of x86 code can be re-used on Arm later on, this patch moves previously prepared x86/hvm/ioreq.c to the common code. The common IOREQ feature is supposed to be built with IOREQ_SERVER option enabled, which is selected for x86's config HVM for now. In order to avoid having a gigantic patch here, the subsequent patches will update remaining bits in the common code step by step: - Make IOREQ related structs/materials common - Drop the "hvm" prefixes and infixes - Remove layering violation by moving corresponding fields out of *arch.hvm* or abstracting away accesses to them This support is going to be used on Arm to be able run device emulator outside of Xen hypervisor. Signed-off-by: Oleksandr Tyshchenko CC: Julien Grall --- Please note, this is a split/cleanup/hardening of Julien's PoC: "Add support for Guest IO forwarding to a device emulator" *** Please note, this patch depends on the following which is on review: https://patchwork.kernel.org/patch/11816689/ *** Changes RFC -> V1: - was split into three patches: - x86/ioreq: Prepare IOREQ feature for making it common - xen/ioreq: Make x86's IOREQ feature common - xen/ioreq: Make x86's hvm_ioreq_needs_completion() common - update MAINTAINERS file - do not use a separate subdir for the IOREQ stuff, move it to: - xen/common/ioreq.c - xen/include/xen/ioreq.h - update x86's files to include xen/ioreq.h - remove unneeded headers in arch/x86/hvm/ioreq.c - re-order the headers alphabetically in common/ioreq.c - update common/ioreq.c according to the newly introduced arch functions: arch_hvm_destroy_ioreq_server()/arch_handle_hvm_io_completion() Changes V1 -> V2: - update patch description - make everything needed in the previous patch to achieve a truly rename here - don't include unnecessary headers from asm-x86/hvm/ioreq.h and xen/ioreq.h - use __XEN_IOREQ_H__ instead of __IOREQ_H__ - move 
get_ioreq_server() to common/ioreq.c --- MAINTAINERS |8 +- xen/arch/x86/Kconfig|1 + xen/arch/x86/hvm/Makefile |1 - xen/arch/x86/hvm/ioreq.c| 1422 --- xen/arch/x86/mm.c |2 +- xen/arch/x86/mm/shadow/common.c |2 +- xen/common/Kconfig |3 + xen/common/Makefile |1 + xen/common/ioreq.c | 1422 +++ xen/include/asm-x86/hvm/ioreq.h | 39 +- xen/include/xen/ioreq.h | 71 ++ 11 files changed, 1509 insertions(+), 1463 deletions(-) delete mode 100644 xen/arch/x86/hvm/ioreq.c create mode 100644 xen/common/ioreq.c create mode 100644 xen/include/xen/ioreq.h Iirc I've previously asked to make sure the diff here gets created with git's rename detection enabled, so we wouldn't see 1422 lines deleted and 1422 lines added, _hoping_ they're all the same (or going through the extra steps needed to compare old and new code), but instead seeing just the diff between old and new files (which in the ideal case would then be empty). This is my fault, I misread this last time. I have just rechecked this patch again. git config diff.renames false MAINTAINERS | 8 +- xen/arch/x86/Kconfig | 1 + xen/arch/x86/hvm/Makefile | 1 - xen/arch/x86/hvm/ioreq.c | 1544 xen/arch/x86/mm.c | 2 +- xen/arch/x86/mm/shadow/common.c | 2 +- xen/common/Kconfig | 3 + xen/common/Makefile | 1 + xen/common/ioreq.c | 1544 xen/include/asm-x86/hvm/ioreq.h | 38 + xen/include/xen/ioreq.h | 73 + 11 files changed, 1633 insertions(+), 1584 deletions(-) git config diff.renames true MAINTAINERS | 8 +++- xen/arch/x86/Kconfig | 1 + xen/arch/x86/hvm/Makefile | 1 - xen/arch/x86/mm.c | 2 +- xen/arch/x86/mm/shadow/common.c | 2 +- xen/common/Kconfig | 3 +++ xen/common/Makefile | 1 + xen/{arch/x86/hvm => common}/ioreq.c | 0 xen/include/asm-x86/hvm/ioreq.h | 38 ++ xen/include/xen/ioreq.h | 73 + 10 files changed, 89 insertions(+), 40 deletions(-) Although it is not needed for the patch in question anymore (since there
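The rename-detection comparison shown above (diffstat with and without `diff.renames`) can be reproduced locally; the sketch below uses throwaway paths in a temporary repository, standing in for the ~1500-line ioreq.c move:

```shell
# Demonstrate git rename detection: a pure file move shows up as a
# rename ("old => new") in the diffstat instead of delete + add.
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.email you@example.org && git config user.name you

seq 1 1500 > ioreq.c            # stand-in for the moved file
git add . && git commit -qm 'add ioreq.c'
git mv ioreq.c common-ioreq.c && git commit -qm 'move to common'

git diff --no-renames --stat HEAD~1 HEAD   # 1500 deletions + 1500 additions
git diff -M --stat HEAD~1 HEAD             # "ioreq.c => common-ioreq.c | 0"
git format-patch -M -1 HEAD -o "$repo" >/dev/null  # patch records the rename too
```

Passing `-M` (or setting `diff.renames true`) when generating the patch lets reviewers see only the delta between the old and new file, which for a pure move is empty.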
Re: [PATCH V2 01/23] x86/ioreq: Prepare IOREQ feature for making it common
On 13.11.20 13:20, Jan Beulich wrote: Hi Jan. On 13.11.2020 12:09, Oleksandr wrote: On 12.11.20 12:58, Jan Beulich wrote: On 15.10.2020 18:44, Oleksandr Tyshchenko wrote: @@ -855,7 +841,7 @@ int hvm_destroy_ioreq_server(struct domain *d, ioservid_t id) domain_pause(d); -p2m_set_ioreq_server(d, 0, s); +arch_hvm_destroy_ioreq_server(s); Iirc there are plans to rename hvm_destroy_ioreq_server() in the course of the generalization. If so, this arch hook would imo better be named following the new scheme right away. Could you please clarify, are you speaking about the plans discussed in "[PATCH V2 12/23] xen/ioreq: Remove "hvm" prefixes from involved function names"? Copying the text for convenience: At least some of the functions touched here would be nice to be moved to a more consistent new naming scheme right away, to avoid having to touch all the same places again. I guess ioreq server functions would be nice to all start with ioreq_server_ and ioreq functions to all start with ioreq_. E.g. ioreq_send() and ioreq_server_select(). or some other plans I am not aware of? What I really want to avoid with the IOREQ enabling work is the round-trips related to naming things; this patch series became quite big (and consumes some time to rebase and test) and I expect it to become bigger. So the arch_hvm_destroy_ioreq_server() should be arch_ioreq_server_destroy()? I think so, yes. If you want to avoid doing full patches, how about you simply list the functions / variables you plan to rename alongside the intended new names? Would likely be easier for all involved parties. I think it is a good idea. I will prepare a list once I analyze all new comments to this series. As I understand it, only global IOREQ functions need renaming according to the new scheme; local ones can be left as is, but without "hvm" prefixes of course?
@@ -1215,7 +1153,7 @@ void hvm_destroy_all_ioreq_servers(struct domain *d) struct hvm_ioreq_server *s; unsigned int id; -if ( !relocate_portio_handler(d, 0xcf8, 0xcf8, 4) ) +if ( !arch_hvm_ioreq_destroy(d) ) There's no ioreq being destroyed here, so I think this wants renaming (and again ideally right away following the planned new scheme). Agree that no ioreq is being destroyed here. Probably ioreq_server_check_for_destroy()? I couldn't think of a better name. "check" implies no change (and d ought to then be const struct domain *). With the containing function likely becoming ioreq_server_destroy_all(), arch_ioreq_server_destroy_all() would come to mind, or arch_ioreq_server_prepare_destroy_all(). ok, agree +static inline int hvm_map_mem_type_to_ioreq_server(struct domain *d, + ioservid_t id, + uint32_t type, + uint32_t flags) +{ +struct hvm_ioreq_server *s; +int rc; + +if ( type != HVMMEM_ioreq_server ) +return -EINVAL; + +if ( flags & ~XEN_DMOP_IOREQ_MEM_ACCESS_WRITE ) +return -EINVAL; + +spin_lock_recursive(&d->arch.hvm.ioreq_server.lock); + +s = get_ioreq_server(d, id); + +rc = -ENOENT; +if ( !s ) +goto out; + +rc = -EPERM; +if ( s->emulator != current->domain ) +goto out; + +rc = p2m_set_ioreq_server(d, flags, s); + + out: +spin_unlock_recursive(&d->arch.hvm.ioreq_server.lock); + +if ( rc == 0 && flags == 0 ) +{ +struct p2m_domain *p2m = p2m_get_hostp2m(d); I realize I may be asking too much, but would it be possible if, while moving code, you made simple and likely uncontroversial adjustments like adding const here? (Such adjustments would be less desirable to make if they increased the size of the patch, e.g. if you were touching only nearby code.) This function, as well as one located below, won't be moved to this header for the next version of the patch. ok, will add const. Well, if you don't move the code, then better keep the diff small and leave things as they are. ok, in case I have to move that code for any reason, I will take the suggestions into account. 
-- Regards, Oleksandr Tyshchenko
Re: [PATCH 06/10] vpci: Make every domain handle its own BARs
On 11/13/20 1:35 PM, Jan Beulich wrote: > On 13.11.2020 12:02, Oleksandr Andrushchenko wrote: >> On 11/13/20 12:50 PM, Jan Beulich wrote: >>> On 13.11.2020 11:46, Oleksandr Andrushchenko wrote: On 11/13/20 12:25 PM, Jan Beulich wrote: > On 13.11.2020 07:32, Oleksandr Andrushchenko wrote: >> On 11/12/20 4:46 PM, Roger Pau Monné wrote: >>> On Thu, Nov 12, 2020 at 01:16:10PM +, Oleksandr Andrushchenko wrote: On 11/12/20 11:40 AM, Roger Pau Monné wrote: > On Mon, Nov 09, 2020 at 02:50:27PM +0200, Oleksandr Andrushchenko > wrote: >> From: Oleksandr Andrushchenko >> +static uint32_t bar_read_dispatch(const struct pci_dev *pdev, >> unsigned int reg, >> + void *data) >> +{ >> +struct vpci_bar *vbar, *bar = data; >> + >> +if ( is_hardware_domain(current->domain) ) >> +return bar_read_hwdom(pdev, reg, data); >> + >> +vbar = get_vpci_bar(current->domain, pdev, bar->index); >> +if ( !vbar ) >> +return ~0; >> + >> +return bar_read_guest(pdev, reg, vbar); >> +} >> + >> +static void bar_write_dispatch(const struct pci_dev *pdev, unsigned >> int reg, >> + uint32_t val, void *data) >> +{ >> +struct vpci_bar *bar = data; >> + >> +if ( is_hardware_domain(current->domain) ) >> +bar_write_hwdom(pdev, reg, val, data); >> +else >> +{ >> +struct vpci_bar *vbar = get_vpci_bar(current->domain, pdev, >> bar->index); >> + >> +if ( !vbar ) >> +return; >> +bar_write_guest(pdev, reg, val, vbar); >> +} >> +} > You should assign different handlers based on whether the domain that > has the device assigned is a domU or the hardware domain, rather than > doing the selection here. Hm, handlers are assigned once in init_bars and this function is only called for hwdom, so there is no way I can do that for the guests. Hence, the dispatcher >>> I think we might want to reset the vPCI handlers when a devices gets >>> assigned and deassigned. 
>> In ARM case init_bars is called too early: PCI device assignment is >> currently >> >> initiated by Domain-0' kernel and is done *before* PCI devices are given >> memory >> >> ranges and BARs assigned: >> >> [ 0.429514] pci_bus :00: root bus resource [bus 00-ff] >> [ 0.429532] pci_bus :00: root bus resource [io 0x-0xf] >> [ 0.429555] pci_bus :00: root bus resource [mem >> 0xfe20-0xfe3f] >> [ 0.429575] pci_bus :00: root bus resource [mem >> 0x3000-0x37ff] >> [ 0.429604] pci_bus :00: root bus resource [mem >> 0x3800-0x3fff pref] >> [ 0.429670] pci :00:00.0: enabling Extended Tags >> [ 0.453764] pci :00:00.0: >> BUS_NOTIFY_ADD_DEVICE >> >> < init_bars > >> >> [ 0.453793] pci :00:00.0: -- IRQ 0 >> [ 0.458825] pci :00:00.0: Failed to add - passthrough or >> MSI/MSI-X might fail! >> [ 0.471790] pci :01:00.0: >> BUS_NOTIFY_ADD_DEVICE >> >> < init_bars > >> >> [ 0.471821] pci :01:00.0: -- IRQ 255 >> [ 0.476809] pci :01:00.0: Failed to add - passthrough or >> MSI/MSI-X might fail! >> >> < BAR assignments below > >> >> [ 0.488233] pci :00:00.0: BAR 14: assigned [mem >> 0xfe20-0xfe2f] >> [ 0.488265] pci :00:00.0: BAR 15: assigned [mem >> 0x3800-0x380f pref] >> >> In case of x86 this is pretty much ok as BARs are already in place, but >> for ARM we >> >> need to take care and re-setup vPCI BARs for hwdom. > Even on x86 there's no guarantee that all devices have their BARs set > up by firmware. This is true. But there you could have config space trapped in "x86 generic way", please correct me if I'm wrong here > In a subsequent reply you've suggested to move init_bars from "add" to > "assign", but I'm having trouble seeing what this would change: It's > not Dom0 controlling assignment (to itself), but Xen assigns the device > towards the end of pci_add_device(). 
PHYSDEVOP_pci_device_add vs XEN_DOMCTL_assign_device Currently we initialize BARs during PHYSDEVOP_pci_device_add and if we do that during XEN_DOMCTL_assign_device things seem to change >>> But there can't possibly be any XEN_DOMCTL_assign_device involved in >>> booting of Dom0. >> Indeed. So, do you have an idea when we should call init_bars suitable
Re: dom0 PVH: 'entry->arch.pirq != INVALID_PIRQ' failed at vmsi.c:843
On Thu, Nov 12, 2020 at 09:19:39PM +0100, Roger Pau Monné wrote: > The following might be able to get you going, but I think I need to > refine the logic a bit there, will have to give it some thought. I also tested with xen devel (Xen version 4.15-unstable, Latest ChangeSet: Wed Nov 4 09:27:22 2020 +0100 git:9ff9705647-dirty). Your patch is needed there too to avoid the panic. As with 4.13, I have problems with interrupts not being properly delivered. The strange thing is that the counter is not 0, but 3 (with 4.13) or 2 (with 4.15), which would mean that interrupts stop being delivered at some point in the setup process. Maybe something to do with mask/unmask ? The problematic interrupt is identified as "ioapic2 pin 2" by the NetBSD kernel, so it's not MSI/MSI-X (not sure it matters though). -- Manuel Bouyer NetBSD: 26 ans d'experience feront toujours la difference --
Re: [PATCH 06/10] vpci: Make every domain handle its own BARs
On 13/11/2020 11:26, Jan Beulich wrote: On 13.11.2020 12:06, Julien Grall wrote: Hi Jan, On 13/11/2020 10:53, Jan Beulich wrote: On 13.11.2020 11:36, Julien Grall wrote: On 13/11/2020 10:25, Jan Beulich wrote: On 13.11.2020 07:32, Oleksandr Andrushchenko wrote: On 11/12/20 4:46 PM, Roger Pau Monné wrote: On Thu, Nov 12, 2020 at 01:16:10PM +, Oleksandr Andrushchenko wrote: On 11/12/20 11:40 AM, Roger Pau Monné wrote: On Mon, Nov 09, 2020 at 02:50:27PM +0200, Oleksandr Andrushchenko wrote: From: Oleksandr Andrushchenko +static uint32_t bar_read_dispatch(const struct pci_dev *pdev, unsigned int reg, + void *data) +{ +struct vpci_bar *vbar, *bar = data; + +if ( is_hardware_domain(current->domain) ) +return bar_read_hwdom(pdev, reg, data); + +vbar = get_vpci_bar(current->domain, pdev, bar->index); +if ( !vbar ) +return ~0; + +return bar_read_guest(pdev, reg, vbar); +} + +static void bar_write_dispatch(const struct pci_dev *pdev, unsigned int reg, + uint32_t val, void *data) +{ +struct vpci_bar *bar = data; + +if ( is_hardware_domain(current->domain) ) +bar_write_hwdom(pdev, reg, val, data); +else +{ +struct vpci_bar *vbar = get_vpci_bar(current->domain, pdev, bar->index); + +if ( !vbar ) +return; +bar_write_guest(pdev, reg, val, vbar); +} +} You should assign different handlers based on whether the domain that has the device assigned is a domU or the hardware domain, rather than doing the selection here. Hm, handlers are assigned once in init_bars and this function is only called for hwdom, so there is no way I can do that for the guests. Hence, the dispatcher I think we might want to reset the vPCI handlers when a devices gets assigned and deassigned. 
In ARM case init_bars is called too early: PCI device assignment is currently initiated by Domain-0' kernel and is done *before* PCI devices are given memory ranges and BARs assigned: [ 0.429514] pci_bus :00: root bus resource [bus 00-ff] [ 0.429532] pci_bus :00: root bus resource [io 0x-0xf] [ 0.429555] pci_bus :00: root bus resource [mem 0xfe20-0xfe3f] [ 0.429575] pci_bus :00: root bus resource [mem 0x3000-0x37ff] [ 0.429604] pci_bus :00: root bus resource [mem 0x3800-0x3fff pref] [ 0.429670] pci :00:00.0: enabling Extended Tags [ 0.453764] pci :00:00.0: BUS_NOTIFY_ADD_DEVICE < init_bars > [ 0.453793] pci :00:00.0: -- IRQ 0 [ 0.458825] pci :00:00.0: Failed to add - passthrough or MSI/MSI-X might fail! [ 0.471790] pci :01:00.0: BUS_NOTIFY_ADD_DEVICE < init_bars > [ 0.471821] pci :01:00.0: -- IRQ 255 [ 0.476809] pci :01:00.0: Failed to add - passthrough or MSI/MSI-X might fail! < BAR assignments below > [ 0.488233] pci :00:00.0: BAR 14: assigned [mem 0xfe20-0xfe2f] [ 0.488265] pci :00:00.0: BAR 15: assigned [mem 0x3800-0x380f pref] In case of x86 this is pretty much ok as BARs are already in place, but for ARM we need to take care and re-setup vPCI BARs for hwdom. Even on x86 there's no guarantee that all devices have their BARs set up by firmware. In a subsequent reply you've suggested to move init_bars from "add" to "assign", but I'm having trouble seeing what this would change: It's not Dom0 controlling assignment (to itself), but Xen assigns the device towards the end of pci_add_device(). Things are getting even more complicated if the host PCI bridge is not ECAM like, so you cannot set mmio_handlers and trap hwdom's access to the config space to update BARs etc. This is why I have that ugly hack for rcar_gen3 to read actual BARs for hwdom. How to config space accesses work there? The latest for MSI/MSI-X it'll be imperative that Xen be able to intercept config space writes. I am not sure to understand your last sentence. 
Are you saying that we always need to trap access to MSI/MSI-X message in order to sanitize it? If one is using the GICv3 ITS (I haven't investigated other MSI controller), then I don't believe you need to sanitize the MSI/MSI-X message in most of the situation. Well, if it's fine for the guest to write arbitrary values to message address and message data, The message address would be the doorbell of the ITS that is usually going through the IOMMU page-tables. Although, I am aware of a couple of platforms where the doorbell access (among other address ranges including P2P transaction) bypass the IOMMU. In this situation, we would need a lot more work than just trapping the access. When you say "The message address would be the doorbell of the ITS" am I right in understanding that's the designated address to be put there? What if the guest puts some random different address there? My point was that all the accesses from a PCI
Re: [PATCH 06/10] vpci: Make every domain handle its own BARs
On 13.11.2020 12:02, Oleksandr Andrushchenko wrote: > > On 11/13/20 12:50 PM, Jan Beulich wrote: >> On 13.11.2020 11:46, Oleksandr Andrushchenko wrote: >>> On 11/13/20 12:25 PM, Jan Beulich wrote: On 13.11.2020 07:32, Oleksandr Andrushchenko wrote: > On 11/12/20 4:46 PM, Roger Pau Monné wrote: >> On Thu, Nov 12, 2020 at 01:16:10PM +, Oleksandr Andrushchenko wrote: >>> On 11/12/20 11:40 AM, Roger Pau Monné wrote: On Mon, Nov 09, 2020 at 02:50:27PM +0200, Oleksandr Andrushchenko wrote: > From: Oleksandr Andrushchenko > +static uint32_t bar_read_dispatch(const struct pci_dev *pdev, > unsigned int reg, > + void *data) > +{ > +struct vpci_bar *vbar, *bar = data; > + > +if ( is_hardware_domain(current->domain) ) > +return bar_read_hwdom(pdev, reg, data); > + > +vbar = get_vpci_bar(current->domain, pdev, bar->index); > +if ( !vbar ) > +return ~0; > + > +return bar_read_guest(pdev, reg, vbar); > +} > + > +static void bar_write_dispatch(const struct pci_dev *pdev, unsigned > int reg, > + uint32_t val, void *data) > +{ > +struct vpci_bar *bar = data; > + > +if ( is_hardware_domain(current->domain) ) > +bar_write_hwdom(pdev, reg, val, data); > +else > +{ > +struct vpci_bar *vbar = get_vpci_bar(current->domain, pdev, > bar->index); > + > +if ( !vbar ) > +return; > +bar_write_guest(pdev, reg, val, vbar); > +} > +} You should assign different handlers based on whether the domain that has the device assigned is a domU or the hardware domain, rather than doing the selection here. >>> Hm, handlers are assigned once in init_bars and this function is only >>> called >>> >>> for hwdom, so there is no way I can do that for the guests. Hence, the >>> dispatcher >> I think we might want to reset the vPCI handlers when a devices gets >> assigned and deassigned. 
> In ARM case init_bars is called too early: PCI device assignment is > currently > > initiated by Domain-0' kernel and is done *before* PCI devices are given > memory > > ranges and BARs assigned: > > [ 0.429514] pci_bus :00: root bus resource [bus 00-ff] > [ 0.429532] pci_bus :00: root bus resource [io 0x-0xf] > [ 0.429555] pci_bus :00: root bus resource [mem > 0xfe20-0xfe3f] > [ 0.429575] pci_bus :00: root bus resource [mem > 0x3000-0x37ff] > [ 0.429604] pci_bus :00: root bus resource [mem > 0x3800-0x3fff pref] > [ 0.429670] pci :00:00.0: enabling Extended Tags > [ 0.453764] pci :00:00.0: > BUS_NOTIFY_ADD_DEVICE > > < init_bars > > > [ 0.453793] pci :00:00.0: -- IRQ 0 > [ 0.458825] pci :00:00.0: Failed to add - passthrough or MSI/MSI-X > might fail! > [ 0.471790] pci :01:00.0: > BUS_NOTIFY_ADD_DEVICE > > < init_bars > > > [ 0.471821] pci :01:00.0: -- IRQ 255 > [ 0.476809] pci :01:00.0: Failed to add - passthrough or MSI/MSI-X > might fail! > > < BAR assignments below > > > [ 0.488233] pci :00:00.0: BAR 14: assigned [mem > 0xfe20-0xfe2f] > [ 0.488265] pci :00:00.0: BAR 15: assigned [mem > 0x3800-0x380f pref] > > In case of x86 this is pretty much ok as BARs are already in place, but > for ARM we > > need to take care and re-setup vPCI BARs for hwdom. Even on x86 there's no guarantee that all devices have their BARs set up by firmware. >>> This is true. But there you could have config space trapped in "x86 generic >>> way", >>> >>> please correct me if I'm wrong here >>> In a subsequent reply you've suggested to move init_bars from "add" to "assign", but I'm having trouble seeing what this would change: It's not Dom0 controlling assignment (to itself), but Xen assigns the device towards the end of pci_add_device(). 
>>> PHYSDEVOP_pci_device_add vs XEN_DOMCTL_assign_device >>> >>> Currently we initialize BARs during PHYSDEVOP_pci_device_add and >>> >>> if we do that during XEN_DOMCTL_assign_device things seem to change >> But there can't possibly be any XEN_DOMCTL_assign_device involved in >> booting of Dom0. > > Indeed. So, do you have an idea when we should call init_bars suitable > > for both ARM and x86? > > Another question is: what happens bad if x86 and ARM won't call init_bars > > until the moment we really assign a PCI
[xen-4.14-testing test] 156716: tolerable FAIL - PUSHED
flight 156716 xen-4.14-testing real [real] http://logs.test-lab.xenproject.org/osstest/logs/156716/ Failures :-/ but no regressions. Tests which are failing intermittently (not blocking): test-amd64-amd64-xl-qemut-debianhvm-i386-xsm 12 debian-hvm-install fail in 156670 pass in 156716 test-amd64-i386-xl-raw 19 guest-localmigrate/x10 fail in 156670 pass in 156716 test-armhf-armhf-xl-credit2 18 guest-start/debian.repeat fail pass in 156670 Tests which did not succeed, but are not blocking: test-amd64-i386-xl-qemuu-win7-amd64 19 guest-stop fail like 156525 test-amd64-amd64-xl-qemut-win7-amd64 19 guest-stopfail like 156525 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 156525 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stopfail like 156525 test-amd64-amd64-xl-qemut-ws16-amd64 19 guest-stopfail like 156525 test-amd64-i386-xl-qemut-ws16-amd64 19 guest-stop fail like 156525 test-amd64-i386-xl-qemut-win7-amd64 19 guest-stop fail like 156525 test-armhf-armhf-libvirt 16 saverestore-support-checkfail like 156525 test-armhf-armhf-libvirt-raw 15 saverestore-support-checkfail like 156525 test-amd64-i386-xl-qemuu-ws16-amd64 19 guest-stop fail like 156525 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stopfail like 156525 test-amd64-i386-xl-pvshim14 guest-start fail never pass test-amd64-amd64-libvirt-xsm 15 migrate-support-checkfail never pass test-arm64-arm64-xl-seattle 15 migrate-support-checkfail never pass test-arm64-arm64-xl-seattle 16 saverestore-support-checkfail never pass test-amd64-amd64-libvirt 15 migrate-support-checkfail never pass test-amd64-i386-libvirt 15 migrate-support-checkfail never pass test-amd64-i386-libvirt-xsm 15 migrate-support-checkfail never pass test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check fail never pass test-arm64-arm64-xl 15 migrate-support-checkfail never pass test-arm64-arm64-xl 16 saverestore-support-checkfail never pass test-arm64-arm64-xl-xsm 15 migrate-support-checkfail never pass 
test-arm64-arm64-xl-xsm 16 saverestore-support-checkfail never pass test-arm64-arm64-xl-credit2 15 migrate-support-checkfail never pass test-arm64-arm64-xl-credit2 16 saverestore-support-checkfail never pass test-arm64-arm64-xl-credit1 15 migrate-support-checkfail never pass test-arm64-arm64-xl-credit1 16 saverestore-support-checkfail never pass test-arm64-arm64-xl-thunderx 15 migrate-support-checkfail never pass test-arm64-arm64-xl-thunderx 16 saverestore-support-checkfail never pass test-arm64-arm64-libvirt-xsm 15 migrate-support-checkfail never pass test-arm64-arm64-libvirt-xsm 16 saverestore-support-checkfail never pass test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check fail never pass test-amd64-amd64-libvirt-vhd 14 migrate-support-checkfail never pass test-armhf-armhf-xl 15 migrate-support-checkfail never pass test-armhf-armhf-xl 16 saverestore-support-checkfail never pass test-armhf-armhf-xl-multivcpu 15 migrate-support-checkfail never pass test-armhf-armhf-xl-multivcpu 16 saverestore-support-checkfail never pass test-armhf-armhf-xl-credit1 15 migrate-support-checkfail never pass test-armhf-armhf-xl-credit1 16 saverestore-support-checkfail never pass test-armhf-armhf-xl-cubietruck 15 migrate-support-checkfail never pass test-armhf-armhf-xl-cubietruck 16 saverestore-support-checkfail never pass test-armhf-armhf-xl-credit2 15 migrate-support-checkfail never pass test-armhf-armhf-xl-credit2 16 saverestore-support-checkfail never pass test-armhf-armhf-xl-rtds 15 migrate-support-checkfail never pass test-armhf-armhf-xl-rtds 16 saverestore-support-checkfail never pass test-armhf-armhf-libvirt 15 migrate-support-checkfail never pass test-armhf-armhf-xl-vhd 14 migrate-support-checkfail never pass test-armhf-armhf-xl-vhd 15 saverestore-support-checkfail never pass test-armhf-armhf-libvirt-raw 14 migrate-support-checkfail never pass test-armhf-armhf-xl-arndale 15 migrate-support-checkfail never pass test-armhf-armhf-xl-arndale 16 
saverestore-support-checkfail never pass version targeted for testing: xen d101b417b784a26326fc7800a79cc539ba570b79 baseline version: xen 0c96e4297da07944525729ddbe438b0131ab5b7e Last test of basis 156525 2020-11-06 16:01:25 Z6 days Failing since156594 2020-11-09 18:08:08 Z3 days4 attempts Testing same since 156670 2020-11-11 04:05:07 Z2 days2 attempts
[ovmf test] 156720: all pass - PUSHED
flight 156720 ovmf real [real] http://logs.test-lab.xenproject.org/osstest/logs/156720/ Perfect :-) All tests in this flight passed as required version targeted for testing: ovmf 544cb0132dc1778b9e791202995533523fa6cccd baseline version: ovmf 1c48866e041d2afaabb170086c5bb0c69a4653d3 Last test of basis 156684 2020-11-11 12:14:13 Z1 days Testing same since 156720 2020-11-12 16:00:44 Z0 days1 attempts People who touched revisions under test: Abner Chang gechao jobs: build-amd64-xsm pass build-i386-xsm pass build-amd64 pass build-i386 pass build-amd64-libvirt pass build-i386-libvirt pass build-amd64-pvopspass build-i386-pvops pass test-amd64-amd64-xl-qemuu-ovmf-amd64 pass test-amd64-i386-xl-qemuu-ovmf-amd64 pass sg-report-flight on osstest.test-lab.xenproject.org logs: /home/logs/logs images: /home/logs/images Logs, config files, etc. are available at http://logs.test-lab.xenproject.org/osstest/logs Explanation of these reports, and of osstest in general, is at http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master Test harness code can be found at http://xenbits.xen.org/gitweb?p=osstest.git;a=summary Pushing revision : To xenbits.xen.org:/home/xen/git/osstest/ovmf.git 1c48866e04..544cb0132d 544cb0132dc1778b9e791202995533523fa6cccd -> xen-tested-master
Re: [PATCH 06/10] vpci: Make every domain handle its own BARs
On 13.11.2020 12:06, Julien Grall wrote: > Hi Jan, > > On 13/11/2020 10:53, Jan Beulich wrote: >> On 13.11.2020 11:36, Julien Grall wrote: >>> On 13/11/2020 10:25, Jan Beulich wrote: On 13.11.2020 07:32, Oleksandr Andrushchenko wrote: > On 11/12/20 4:46 PM, Roger Pau Monné wrote: >> On Thu, Nov 12, 2020 at 01:16:10PM +, Oleksandr Andrushchenko wrote: >>> On 11/12/20 11:40 AM, Roger Pau Monné wrote: On Mon, Nov 09, 2020 at 02:50:27PM +0200, Oleksandr Andrushchenko wrote: > From: Oleksandr Andrushchenko > +static uint32_t bar_read_dispatch(const struct pci_dev *pdev, > unsigned int reg, > + void *data) > +{ > +struct vpci_bar *vbar, *bar = data; > + > +if ( is_hardware_domain(current->domain) ) > +return bar_read_hwdom(pdev, reg, data); > + > +vbar = get_vpci_bar(current->domain, pdev, bar->index); > +if ( !vbar ) > +return ~0; > + > +return bar_read_guest(pdev, reg, vbar); > +} > + > +static void bar_write_dispatch(const struct pci_dev *pdev, unsigned > int reg, > + uint32_t val, void *data) > +{ > +struct vpci_bar *bar = data; > + > +if ( is_hardware_domain(current->domain) ) > +bar_write_hwdom(pdev, reg, val, data); > +else > +{ > +struct vpci_bar *vbar = get_vpci_bar(current->domain, pdev, > bar->index); > + > +if ( !vbar ) > +return; > +bar_write_guest(pdev, reg, val, vbar); > +} > +} You should assign different handlers based on whether the domain that has the device assigned is a domU or the hardware domain, rather than doing the selection here. >>> Hm, handlers are assigned once in init_bars and this function is only >>> called >>> >>> for hwdom, so there is no way I can do that for the guests. Hence, the >>> dispatcher >> I think we might want to reset the vPCI handlers when a devices gets >> assigned and deassigned. 
> > In ARM case init_bars is called too early: PCI device assignment is > currently > > initiated by Domain-0' kernel and is done *before* PCI devices are given > memory > > ranges and BARs assigned: > > [ 0.429514] pci_bus :00: root bus resource [bus 00-ff] > [ 0.429532] pci_bus :00: root bus resource [io 0x-0xf] > [ 0.429555] pci_bus :00: root bus resource [mem > 0xfe20-0xfe3f] > [ 0.429575] pci_bus :00: root bus resource [mem > 0x3000-0x37ff] > [ 0.429604] pci_bus :00: root bus resource [mem > 0x3800-0x3fff pref] > [ 0.429670] pci :00:00.0: enabling Extended Tags > [ 0.453764] pci :00:00.0: > BUS_NOTIFY_ADD_DEVICE > > < init_bars > > > [ 0.453793] pci :00:00.0: -- IRQ 0 > [ 0.458825] pci :00:00.0: Failed to add - passthrough or MSI/MSI-X > might fail! > [ 0.471790] pci :01:00.0: > BUS_NOTIFY_ADD_DEVICE > > < init_bars > > > [ 0.471821] pci :01:00.0: -- IRQ 255 > [ 0.476809] pci :01:00.0: Failed to add - passthrough or MSI/MSI-X > might fail! > > < BAR assignments below > > > [ 0.488233] pci :00:00.0: BAR 14: assigned [mem > 0xfe20-0xfe2f] > [ 0.488265] pci :00:00.0: BAR 15: assigned [mem > 0x3800-0x380f pref] > > In case of x86 this is pretty much ok as BARs are already in place, but > for ARM we > > need to take care and re-setup vPCI BARs for hwdom. Even on x86 there's no guarantee that all devices have their BARs set up by firmware. In a subsequent reply you've suggested to move init_bars from "add" to "assign", but I'm having trouble seeing what this would change: It's not Dom0 controlling assignment (to itself), but Xen assigns the device towards the end of pci_add_device(). > Things are getting even more > > complicated if the host PCI bridge is not ECAM like, so you cannot set > mmio_handlers > > and trap hwdom's access to the config space to update BARs etc. This is > why I have that > > ugly hack for rcar_gen3 to read actual BARs for hwdom. How to config space accesses work there? 
The latest for MSI/MSI-X it'll be imperative that Xen be able to intercept config space writes. >>> >>> I am not sure to understand your last sentence. Are you saying that we >>> always need to trap access to MSI/MSI-X message in order to sanitize it? >>> >>> If one is using the GICv3 ITS
Re: [PATCH v3 09/53] qdev: Make qdev_get_prop_ptr() get Object* arg
On Thu, 12 Nov 2020 16:43:06 -0500 Eduardo Habkost wrote: > Make the code more generic and not specific to TYPE_DEVICE. > > Reviewed-by: Marc-André Lureau > Signed-off-by: Eduardo Habkost > --- > Changes v1 -> v2: > - Fix build error with CONFIG_XEN > I took the liberty of keeping the Reviewed-by line from > Marc-André as the build fix is a trivial one line change > --- > Cc: Stefan Berger > Cc: Stefano Stabellini > Cc: Anthony Perard > Cc: Paul Durrant > Cc: Kevin Wolf > Cc: Max Reitz > Cc: Paolo Bonzini > Cc: "Daniel P. Berrangé" > Cc: Eduardo Habkost > Cc: Cornelia Huck > Cc: Thomas Huth > Cc: Richard Henderson > Cc: David Hildenbrand > Cc: Halil Pasic > Cc: Christian Borntraeger > Cc: Matthew Rosato > Cc: Alex Williamson > Cc: qemu-de...@nongnu.org > Cc: xen-devel@lists.xenproject.org > Cc: qemu-bl...@nongnu.org > Cc: qemu-s3...@nongnu.org > --- > include/hw/qdev-properties.h | 2 +- > backends/tpm/tpm_util.c | 8 ++-- > hw/block/xen-block.c | 5 +- > hw/core/qdev-properties-system.c | 57 +- > hw/core/qdev-properties.c| 82 +--- > hw/s390x/css.c | 5 +- > hw/s390x/s390-pci-bus.c | 4 +- > hw/vfio/pci-quirks.c | 5 +- > 8 files changed, 68 insertions(+), 100 deletions(-) Reviewed-by: Cornelia Huck #s390 parts
Re: [PATCH V2 01/23] x86/ioreq: Prepare IOREQ feature for making it common
On 13.11.2020 12:09, Oleksandr wrote: > On 12.11.20 12:58, Jan Beulich wrote: >> On 15.10.2020 18:44, Oleksandr Tyshchenko wrote: >>> @@ -855,7 +841,7 @@ int hvm_destroy_ioreq_server(struct domain *d, >>> ioservid_t id) >>> >>> domain_pause(d); >>> >>> -p2m_set_ioreq_server(d, 0, s); >>> +arch_hvm_destroy_ioreq_server(s); >> Iirc there are plans to rename hvm_destroy_ioreq_server() in the >> course of the generalization. If so, this arch hook would imo >> better be named following the new scheme right away. > Could you please clarify, are you speaking about the plans discussed there > > "[PATCH V2 12/23] xen/ioreq: Remove "hvm" prefixes from involved > function names"? > > Copy text for the convenience: > AT least some of the functions touched here would be nice to be > moved to a more consistent new naming scheme right away, to > avoid having to touch all the same places again. I guess ioreq > server functions would be nice to all start with ioreq_server_ > and ioreq functions to all start with ioreq_. E.g. ioreq_send() > and ioreq_server_select(). > > or some other plans I am not aware of? > > > What I really want to avoid with IOREQ enabling work is the round-trips > related to naming things, this patch series > became quite big (and consumes som time to rebase and test it) and I > expect it to become bigger. > > So the arch_hvm_destroy_ioreq_server() should be > arch_ioreq_server_destroy()? I think so, yes. If you want to avoid doing full patches, how about you simply list the functions / variables you plan to rename alongside the intended new names? Would likely be easier for all involved parties. 
>>> @@ -1215,7 +1153,7 @@ void hvm_destroy_all_ioreq_servers(struct domain *d) >>> struct hvm_ioreq_server *s; >>> unsigned int id; >>> >>> -if ( !relocate_portio_handler(d, 0xcf8, 0xcf8, 4) ) >>> +if ( !arch_hvm_ioreq_destroy(d) ) >> There's no ioreq being destroyed here, so I think this wants >> renaming (and again ideally right away following the planned >> new scheme). > Agree that no ioreq being destroyed here. Probably > ioreq_server_check_for_destroy()? > I couldn't think of a better name. "check" implies no change (and d ought to then be const struct domain *). With the containing function likely becoming ioreq_server_destroy_all(), arch_ioreq_server_destroy_all() would come to mind, or arch_ioreq_server_prepare_destroy_all(). >>> +static inline int hvm_map_mem_type_to_ioreq_server(struct domain *d, >>> + ioservid_t id, >>> + uint32_t type, >>> + uint32_t flags) >>> +{ >>> +struct hvm_ioreq_server *s; >>> +int rc; >>> + >>> +if ( type != HVMMEM_ioreq_server ) >>> +return -EINVAL; >>> + >>> +if ( flags & ~XEN_DMOP_IOREQ_MEM_ACCESS_WRITE ) >>> +return -EINVAL; >>> + >>> +spin_lock_recursive(&d->arch.hvm.ioreq_server.lock); >>> + >>> +s = get_ioreq_server(d, id); >>> + >>> +rc = -ENOENT; >>> +if ( !s ) >>> +goto out; >>> + >>> +rc = -EPERM; >>> +if ( s->emulator != current->domain ) >>> +goto out; >>> + >>> +rc = p2m_set_ioreq_server(d, flags, s); >>> + >>> + out: >>> +spin_unlock_recursive(&d->arch.hvm.ioreq_server.lock); >>> + >>> +if ( rc == 0 && flags == 0 ) >>> +{ >>> +struct p2m_domain *p2m = p2m_get_hostp2m(d); >> I realize I may be asking too much, but would it be possible if, >> while moving code, you made simple and likely uncontroversial >> adjustments like adding const here? (Such adjustments would be >> less desirable to make if they increased the size of the patch, >> e.g. if you were touching only nearby code.) > This function as well as one located below won't be moved to this header > for the next version of patch. > > ok, will add const. 
Well, if you don't move the code, then better keep the diff small and leave things as they are. Jan
Re: [PATCH V2 01/23] x86/ioreq: Prepare IOREQ feature for making it common
On 12.11.20 12:58, Jan Beulich wrote: Hi Jan. On 15.10.2020 18:44, Oleksandr Tyshchenko wrote: From: Oleksandr Tyshchenko As a lot of x86 code can be re-used on Arm later on, this patch makes some preparation to x86/hvm/ioreq.c before moving to the common code. This way we will get a verbatim copy for a code movement in subsequent patch (arch/x86/hvm/ioreq.c will be *just* renamed to common/ioreq). This patch does the following: 1. Introduce *inline* arch_hvm_ioreq_init(), arch_hvm_ioreq_destroy(), arch_hvm_io_completion(), arch_hvm_destroy_ioreq_server() and hvm_ioreq_server_get_type_addr() to abstract arch specific materials. 2 Make hvm_map_mem_type_to_ioreq_server() *inline*. It is not going to be called from the common code. As already indicated on another sub-thread, I think some of these are too large to be inline functions. Additionally, considering their single-use purpose, I don't think they should be placed in a header consumed by more than the producer and the sole consumer. ok, the only reason I made these inline was to achieve a moving of the whole x86/hvm/ioreq.c to the common code. I will move some of them back to ioreq.c. 3. Make get_ioreq_server() global. It is going to be called from a few places. And with this its name ought to change, to fit the general naming model of global functions of this subsystem. I think, with new requirements (making hvm_map_mem_type_to_ioreq_server() common) this helper doesn't need to be global. I will make it static again. 4. Add IOREQ_STATUS_* #define-s and update candidates for moving. This, it seems to me, could be a separate patch. Well, will do. @@ -855,7 +841,7 @@ int hvm_destroy_ioreq_server(struct domain *d, ioservid_t id) domain_pause(d); -p2m_set_ioreq_server(d, 0, s); +arch_hvm_destroy_ioreq_server(s); Iirc there are plans to rename hvm_destroy_ioreq_server() in the course of the generalization. If so, this arch hook would imo better be named following the new scheme right away. 
Could you please clarify, are you speaking about the plans discussed there "[PATCH V2 12/23] xen/ioreq: Remove "hvm" prefixes from involved function names"? Copy text for the convenience: AT least some of the functions touched here would be nice to be moved to a more consistent new naming scheme right away, to avoid having to touch all the same places again. I guess ioreq server functions would be nice to all start with ioreq_server_ and ioreq functions to all start with ioreq_. E.g. ioreq_send() and ioreq_server_select(). or some other plans I am not aware of? What I really want to avoid with IOREQ enabling work is the round-trips related to naming things, this patch series became quite big (and consumes some time to rebase and test it) and I expect it to become bigger. So the arch_hvm_destroy_ioreq_server() should be arch_ioreq_server_destroy()? @@ -1215,7 +1153,7 @@ void hvm_destroy_all_ioreq_servers(struct domain *d) struct hvm_ioreq_server *s; unsigned int id; -if ( !relocate_portio_handler(d, 0xcf8, 0xcf8, 4) ) +if ( !arch_hvm_ioreq_destroy(d) ) There's no ioreq being destroyed here, so I think this wants renaming (and again ideally right away following the planned new scheme). Agree that no ioreq being destroyed here. Probably ioreq_server_check_for_destroy()? I couldn't think of a better name. 
+static inline int hvm_map_mem_type_to_ioreq_server(struct domain *d, + ioservid_t id, + uint32_t type, + uint32_t flags) +{ +struct hvm_ioreq_server *s; +int rc; + +if ( type != HVMMEM_ioreq_server ) +return -EINVAL; + +if ( flags & ~XEN_DMOP_IOREQ_MEM_ACCESS_WRITE ) +return -EINVAL; + +spin_lock_recursive(&d->arch.hvm.ioreq_server.lock); + +s = get_ioreq_server(d, id); + +rc = -ENOENT; +if ( !s ) +goto out; + +rc = -EPERM; +if ( s->emulator != current->domain ) +goto out; + +rc = p2m_set_ioreq_server(d, flags, s); + + out: +spin_unlock_recursive(&d->arch.hvm.ioreq_server.lock); + +if ( rc == 0 && flags == 0 ) +{ +struct p2m_domain *p2m = p2m_get_hostp2m(d); I realize I may be asking too much, but would it be possible if, while moving code, you made simple and likely uncontroversial adjustments like adding const here? (Such adjustments would be less desirable to make if they increased the size of the patch, e.g. if you were touching only nearby code.) This function as well as one located below won't be moved to this header for the next version of patch. ok, will add const. +if ( read_atomic(&p2m->ioreq.entry_count) ) +p2m_change_entry_type_global(d, p2m_ioreq_server, p2m_ram_rw); +} + +return rc; +} + +static inline int hvm_ioreq_server_get_type_addr(const struct domain *d, +
Re: [PATCH 06/10] vpci: Make every domain handle its own BARs
Hi Jan, On 13/11/2020 10:53, Jan Beulich wrote: On 13.11.2020 11:36, Julien Grall wrote: On 13/11/2020 10:25, Jan Beulich wrote: On 13.11.2020 07:32, Oleksandr Andrushchenko wrote: On 11/12/20 4:46 PM, Roger Pau Monné wrote: On Thu, Nov 12, 2020 at 01:16:10PM +, Oleksandr Andrushchenko wrote: On 11/12/20 11:40 AM, Roger Pau Monné wrote: On Mon, Nov 09, 2020 at 02:50:27PM +0200, Oleksandr Andrushchenko wrote: From: Oleksandr Andrushchenko +static uint32_t bar_read_dispatch(const struct pci_dev *pdev, unsigned int reg, + void *data) +{ +struct vpci_bar *vbar, *bar = data; + +if ( is_hardware_domain(current->domain) ) +return bar_read_hwdom(pdev, reg, data); + +vbar = get_vpci_bar(current->domain, pdev, bar->index); +if ( !vbar ) +return ~0; + +return bar_read_guest(pdev, reg, vbar); +} + +static void bar_write_dispatch(const struct pci_dev *pdev, unsigned int reg, + uint32_t val, void *data) +{ +struct vpci_bar *bar = data; + +if ( is_hardware_domain(current->domain) ) +bar_write_hwdom(pdev, reg, val, data); +else +{ +struct vpci_bar *vbar = get_vpci_bar(current->domain, pdev, bar->index); + +if ( !vbar ) +return; +bar_write_guest(pdev, reg, val, vbar); +} +} You should assign different handlers based on whether the domain that has the device assigned is a domU or the hardware domain, rather than doing the selection here. Hm, handlers are assigned once in init_bars and this function is only called for hwdom, so there is no way I can do that for the guests. Hence, the dispatcher I think we might want to reset the vPCI handlers when a devices gets assigned and deassigned. 
In ARM case init_bars is called too early: PCI device assignment is currently initiated by Domain-0' kernel and is done *before* PCI devices are given memory ranges and BARs assigned: [ 0.429514] pci_bus :00: root bus resource [bus 00-ff] [ 0.429532] pci_bus :00: root bus resource [io 0x-0xf] [ 0.429555] pci_bus :00: root bus resource [mem 0xfe20-0xfe3f] [ 0.429575] pci_bus :00: root bus resource [mem 0x3000-0x37ff] [ 0.429604] pci_bus :00: root bus resource [mem 0x3800-0x3fff pref] [ 0.429670] pci :00:00.0: enabling Extended Tags [ 0.453764] pci :00:00.0: BUS_NOTIFY_ADD_DEVICE < init_bars > [ 0.453793] pci :00:00.0: -- IRQ 0 [ 0.458825] pci :00:00.0: Failed to add - passthrough or MSI/MSI-X might fail! [ 0.471790] pci :01:00.0: BUS_NOTIFY_ADD_DEVICE < init_bars > [ 0.471821] pci :01:00.0: -- IRQ 255 [ 0.476809] pci :01:00.0: Failed to add - passthrough or MSI/MSI-X might fail! < BAR assignments below > [ 0.488233] pci :00:00.0: BAR 14: assigned [mem 0xfe20-0xfe2f] [ 0.488265] pci :00:00.0: BAR 15: assigned [mem 0x3800-0x380f pref] In case of x86 this is pretty much ok as BARs are already in place, but for ARM we need to take care and re-setup vPCI BARs for hwdom. Even on x86 there's no guarantee that all devices have their BARs set up by firmware. In a subsequent reply you've suggested to move init_bars from "add" to "assign", but I'm having trouble seeing what this would change: It's not Dom0 controlling assignment (to itself), but Xen assigns the device towards the end of pci_add_device(). Things are getting even more complicated if the host PCI bridge is not ECAM like, so you cannot set mmio_handlers and trap hwdom's access to the config space to update BARs etc. This is why I have that ugly hack for rcar_gen3 to read actual BARs for hwdom. How to config space accesses work there? The latest for MSI/MSI-X it'll be imperative that Xen be able to intercept config space writes. I am not sure to understand your last sentence. 
Are you saying that we always need to trap access to MSI/MSI-X message in order to sanitize it? If one is using the GICv3 ITS (I haven't investigated other MSI controller), then I don't believe you need to sanitize the MSI/MSI-X message in most of the situation. Well, if it's fine for the guest to write arbitrary values to message address and message data, The message address would be the doorbell of the ITS that is usually going through the IOMMU page-tables. Although, I am aware of a couple of platforms where the doorbell access (among other address ranges including P2P transaction) bypass the IOMMU. In this situation, we would need a lot more work than just trapping the access. Regarding the message data, for the ITS this is an event ID. The HW will then tag each message with the device ID (this prevents spoofing). The tupple (device ID, event ID) is used by the ITS to decide where to inject the event. Whether other MSI controllers (e.g. GICv2m) have similar isolation feature will be on the case by case
Re: [PATCH 06/10] vpci: Make every domain handle its own BARs
On 11/13/20 12:50 PM, Jan Beulich wrote: > On 13.11.2020 11:46, Oleksandr Andrushchenko wrote: >> On 11/13/20 12:25 PM, Jan Beulich wrote: >>> On 13.11.2020 07:32, Oleksandr Andrushchenko wrote: On 11/12/20 4:46 PM, Roger Pau Monné wrote: > On Thu, Nov 12, 2020 at 01:16:10PM +, Oleksandr Andrushchenko wrote: >> On 11/12/20 11:40 AM, Roger Pau Monné wrote: >>> On Mon, Nov 09, 2020 at 02:50:27PM +0200, Oleksandr Andrushchenko wrote: From: Oleksandr Andrushchenko +static uint32_t bar_read_dispatch(const struct pci_dev *pdev, unsigned int reg, + void *data) +{ +struct vpci_bar *vbar, *bar = data; + +if ( is_hardware_domain(current->domain) ) +return bar_read_hwdom(pdev, reg, data); + +vbar = get_vpci_bar(current->domain, pdev, bar->index); +if ( !vbar ) +return ~0; + +return bar_read_guest(pdev, reg, vbar); +} + +static void bar_write_dispatch(const struct pci_dev *pdev, unsigned int reg, + uint32_t val, void *data) +{ +struct vpci_bar *bar = data; + +if ( is_hardware_domain(current->domain) ) +bar_write_hwdom(pdev, reg, val, data); +else +{ +struct vpci_bar *vbar = get_vpci_bar(current->domain, pdev, bar->index); + +if ( !vbar ) +return; +bar_write_guest(pdev, reg, val, vbar); +} +} >>> You should assign different handlers based on whether the domain that >>> has the device assigned is a domU or the hardware domain, rather than >>> doing the selection here. >> Hm, handlers are assigned once in init_bars and this function is only >> called >> >> for hwdom, so there is no way I can do that for the guests. Hence, the >> dispatcher > I think we might want to reset the vPCI handlers when a devices gets > assigned and deassigned. 
In ARM case init_bars is called too early: PCI device assignment is currently initiated by Domain-0' kernel and is done *before* PCI devices are given memory ranges and BARs assigned: [ 0.429514] pci_bus :00: root bus resource [bus 00-ff] [ 0.429532] pci_bus :00: root bus resource [io 0x-0xf] [ 0.429555] pci_bus :00: root bus resource [mem 0xfe20-0xfe3f] [ 0.429575] pci_bus :00: root bus resource [mem 0x3000-0x37ff] [ 0.429604] pci_bus :00: root bus resource [mem 0x3800-0x3fff pref] [ 0.429670] pci :00:00.0: enabling Extended Tags [ 0.453764] pci :00:00.0: BUS_NOTIFY_ADD_DEVICE < init_bars > [ 0.453793] pci :00:00.0: -- IRQ 0 [ 0.458825] pci :00:00.0: Failed to add - passthrough or MSI/MSI-X might fail! [ 0.471790] pci :01:00.0: BUS_NOTIFY_ADD_DEVICE < init_bars > [ 0.471821] pci :01:00.0: -- IRQ 255 [ 0.476809] pci :01:00.0: Failed to add - passthrough or MSI/MSI-X might fail! < BAR assignments below > [ 0.488233] pci :00:00.0: BAR 14: assigned [mem 0xfe20-0xfe2f] [ 0.488265] pci :00:00.0: BAR 15: assigned [mem 0x3800-0x380f pref] In case of x86 this is pretty much ok as BARs are already in place, but for ARM we need to take care and re-setup vPCI BARs for hwdom. >>> Even on x86 there's no guarantee that all devices have their BARs set >>> up by firmware. >> This is true. But there you could have config space trapped in "x86 generic >> way", >> >> please correct me if I'm wrong here >> >>> In a subsequent reply you've suggested to move init_bars from "add" to >>> "assign", but I'm having trouble seeing what this would change: It's >>> not Dom0 controlling assignment (to itself), but Xen assigns the device >>> towards the end of pci_add_device(). >> PHYSDEVOP_pci_device_add vs XEN_DOMCTL_assign_device >> >> Currently we initialize BARs during PHYSDEVOP_pci_device_add and >> >> if we do that during XEN_DOMCTL_assign_device things seem to change > But there can't possibly be any XEN_DOMCTL_assign_device involved in > booting of Dom0. Indeed. 
So, do you have an idea when we should call init_bars that is suitable for both ARM and x86? Another question is: what would go wrong if x86 and ARM don't call init_bars until the moment we really assign a PCI device to the first guest? > > In order to do passthrough to domUs safely > we will have to add more handlers than what's required for dom0, Can you please tell what you are thinking about? What are
Re: [PATCH 08/10] vpci/arm: Allow updating BAR's header for non-ECAM bridges
On 11/13/20 12:47 PM, Jan Beulich wrote: > On 13.11.2020 11:39, Oleksandr Andrushchenko wrote: >> On 11/13/20 12:29 PM, Jan Beulich wrote: >>> On 09.11.2020 13:50, Oleksandr Andrushchenko wrote: From: Oleksandr Andrushchenko Non-ECAM host bridges in hwdom go directly to PCI config space, not through vpci (they use their specific method for accessing PCI configuration, e.g. dedicated registers etc.). >>> And access to these dedicated registers can't be intercepted? >> It can. But then you have to fully emulate that bridge, e.g. >> >> "if we write A to regB and after that write C to regZ then it >> >> means we are accessing config space. If we write" > Sounds pretty much like the I/O port based access mechanism on > x86, which also has some sort of "enable". Of course, I/O port > accesses are particularly easy to intercept and handle... Yes, it has somewhat similar idea > >> I mean this would need lots of code in Xen to achieve that > Possibly, but look at the amount of code we have in Xen on the > x86 side to handle MCFG writes by Dom0. But MCFG is handled the same way for all x86 machines, right? And here I'll have to have a SoC specific code, e.g. a specific driver > > Jan Thank you, Oleksandr
Re: [PATCH 06/10] vpci: Make every domain handle its own BARs
On 13.11.2020 11:36, Julien Grall wrote: > On 13/11/2020 10:25, Jan Beulich wrote: >> On 13.11.2020 07:32, Oleksandr Andrushchenko wrote: >>> On 11/12/20 4:46 PM, Roger Pau Monné wrote: On Thu, Nov 12, 2020 at 01:16:10PM +, Oleksandr Andrushchenko wrote: > On 11/12/20 11:40 AM, Roger Pau Monné wrote: >> On Mon, Nov 09, 2020 at 02:50:27PM +0200, Oleksandr Andrushchenko wrote: >>> From: Oleksandr Andrushchenko >>> +static uint32_t bar_read_dispatch(const struct pci_dev *pdev, unsigned >>> int reg, >>> + void *data) >>> +{ >>> +struct vpci_bar *vbar, *bar = data; >>> + >>> +if ( is_hardware_domain(current->domain) ) >>> +return bar_read_hwdom(pdev, reg, data); >>> + >>> +vbar = get_vpci_bar(current->domain, pdev, bar->index); >>> +if ( !vbar ) >>> +return ~0; >>> + >>> +return bar_read_guest(pdev, reg, vbar); >>> +} >>> + >>> +static void bar_write_dispatch(const struct pci_dev *pdev, unsigned >>> int reg, >>> + uint32_t val, void *data) >>> +{ >>> +struct vpci_bar *bar = data; >>> + >>> +if ( is_hardware_domain(current->domain) ) >>> +bar_write_hwdom(pdev, reg, val, data); >>> +else >>> +{ >>> +struct vpci_bar *vbar = get_vpci_bar(current->domain, pdev, >>> bar->index); >>> + >>> +if ( !vbar ) >>> +return; >>> +bar_write_guest(pdev, reg, val, vbar); >>> +} >>> +} >> You should assign different handlers based on whether the domain that >> has the device assigned is a domU or the hardware domain, rather than >> doing the selection here. > Hm, handlers are assigned once in init_bars and this function is only > called > > for hwdom, so there is no way I can do that for the guests. Hence, the > dispatcher I think we might want to reset the vPCI handlers when a devices gets assigned and deassigned. 
>>> >>> In ARM case init_bars is called too early: PCI device assignment is >>> currently >>> >>> initiated by Domain-0' kernel and is done *before* PCI devices are given >>> memory >>> >>> ranges and BARs assigned: >>> >>> [ 0.429514] pci_bus :00: root bus resource [bus 00-ff] >>> [ 0.429532] pci_bus :00: root bus resource [io 0x-0xf] >>> [ 0.429555] pci_bus :00: root bus resource [mem >>> 0xfe20-0xfe3f] >>> [ 0.429575] pci_bus :00: root bus resource [mem >>> 0x3000-0x37ff] >>> [ 0.429604] pci_bus :00: root bus resource [mem >>> 0x3800-0x3fff pref] >>> [ 0.429670] pci :00:00.0: enabling Extended Tags >>> [ 0.453764] pci :00:00.0: BUS_NOTIFY_ADD_DEVICE >>> >>> < init_bars > >>> >>> [ 0.453793] pci :00:00.0: -- IRQ 0 >>> [ 0.458825] pci :00:00.0: Failed to add - passthrough or MSI/MSI-X >>> might fail! >>> [ 0.471790] pci :01:00.0: BUS_NOTIFY_ADD_DEVICE >>> >>> < init_bars > >>> >>> [ 0.471821] pci :01:00.0: -- IRQ 255 >>> [ 0.476809] pci :01:00.0: Failed to add - passthrough or MSI/MSI-X >>> might fail! >>> >>> < BAR assignments below > >>> >>> [ 0.488233] pci :00:00.0: BAR 14: assigned [mem >>> 0xfe20-0xfe2f] >>> [ 0.488265] pci :00:00.0: BAR 15: assigned [mem >>> 0x3800-0x380f pref] >>> >>> In case of x86 this is pretty much ok as BARs are already in place, but for >>> ARM we >>> >>> need to take care and re-setup vPCI BARs for hwdom. >> >> Even on x86 there's no guarantee that all devices have their BARs set >> up by firmware. >> >> In a subsequent reply you've suggested to move init_bars from "add" to >> "assign", but I'm having trouble seeing what this would change: It's >> not Dom0 controlling assignment (to itself), but Xen assigns the device >> towards the end of pci_add_device(). >> >>> Things are getting even more >>> >>> complicated if the host PCI bridge is not ECAM like, so you cannot set >>> mmio_handlers >>> >>> and trap hwdom's access to the config space to update BARs etc. 
This is why >>> I have that >>> >>> ugly hack for rcar_gen3 to read actual BARs for hwdom. >> >> How to config space accesses work there? The latest for MSI/MSI-X it'll >> be imperative that Xen be able to intercept config space writes. > > I am not sure to understand your last sentence. Are you saying that we > always need to trap access to MSI/MSI-X message in order to sanitize it? > > If one is using the GICv3 ITS (I haven't investigated other MSI > controller), then I don't believe you need to sanitize the MSI/MSI-X > message in most of the situation. Well, if it's fine for the guest to write arbitrary values to message address and message data, _and_ to arbitrarily enable/disable MSI / MSI-X, then yes, no interception would be needed. Jan
Re: [PATCH 06/10] vpci: Make every domain handle its own BARs
On 13.11.2020 11:46, Oleksandr Andrushchenko wrote: > On 11/13/20 12:25 PM, Jan Beulich wrote: >> On 13.11.2020 07:32, Oleksandr Andrushchenko wrote: >>> On 11/12/20 4:46 PM, Roger Pau Monné wrote: On Thu, Nov 12, 2020 at 01:16:10PM +, Oleksandr Andrushchenko wrote: > On 11/12/20 11:40 AM, Roger Pau Monné wrote: >> On Mon, Nov 09, 2020 at 02:50:27PM +0200, Oleksandr Andrushchenko wrote: >>> From: Oleksandr Andrushchenko >>> +static uint32_t bar_read_dispatch(const struct pci_dev *pdev, unsigned >>> int reg, >>> + void *data) >>> +{ >>> +struct vpci_bar *vbar, *bar = data; >>> + >>> +if ( is_hardware_domain(current->domain) ) >>> +return bar_read_hwdom(pdev, reg, data); >>> + >>> +vbar = get_vpci_bar(current->domain, pdev, bar->index); >>> +if ( !vbar ) >>> +return ~0; >>> + >>> +return bar_read_guest(pdev, reg, vbar); >>> +} >>> + >>> +static void bar_write_dispatch(const struct pci_dev *pdev, unsigned >>> int reg, >>> + uint32_t val, void *data) >>> +{ >>> +struct vpci_bar *bar = data; >>> + >>> +if ( is_hardware_domain(current->domain) ) >>> +bar_write_hwdom(pdev, reg, val, data); >>> +else >>> +{ >>> +struct vpci_bar *vbar = get_vpci_bar(current->domain, pdev, >>> bar->index); >>> + >>> +if ( !vbar ) >>> +return; >>> +bar_write_guest(pdev, reg, val, vbar); >>> +} >>> +} >> You should assign different handlers based on whether the domain that >> has the device assigned is a domU or the hardware domain, rather than >> doing the selection here. > Hm, handlers are assigned once in init_bars and this function is only > called > > for hwdom, so there is no way I can do that for the guests. Hence, the > dispatcher I think we might want to reset the vPCI handlers when a devices gets assigned and deassigned. 
>>> In ARM case init_bars is called too early: PCI device assignment is >>> currently >>> >>> initiated by Domain-0' kernel and is done *before* PCI devices are given >>> memory >>> >>> ranges and BARs assigned: >>> >>> [ 0.429514] pci_bus :00: root bus resource [bus 00-ff] >>> [ 0.429532] pci_bus :00: root bus resource [io 0x-0xf] >>> [ 0.429555] pci_bus :00: root bus resource [mem >>> 0xfe20-0xfe3f] >>> [ 0.429575] pci_bus :00: root bus resource [mem >>> 0x3000-0x37ff] >>> [ 0.429604] pci_bus :00: root bus resource [mem >>> 0x3800-0x3fff pref] >>> [ 0.429670] pci :00:00.0: enabling Extended Tags >>> [ 0.453764] pci :00:00.0: BUS_NOTIFY_ADD_DEVICE >>> >>> < init_bars > >>> >>> [ 0.453793] pci :00:00.0: -- IRQ 0 >>> [ 0.458825] pci :00:00.0: Failed to add - passthrough or MSI/MSI-X >>> might fail! >>> [ 0.471790] pci :01:00.0: BUS_NOTIFY_ADD_DEVICE >>> >>> < init_bars > >>> >>> [ 0.471821] pci :01:00.0: -- IRQ 255 >>> [ 0.476809] pci :01:00.0: Failed to add - passthrough or MSI/MSI-X >>> might fail! >>> >>> < BAR assignments below > >>> >>> [ 0.488233] pci :00:00.0: BAR 14: assigned [mem >>> 0xfe20-0xfe2f] >>> [ 0.488265] pci :00:00.0: BAR 15: assigned [mem >>> 0x3800-0x380f pref] >>> >>> In case of x86 this is pretty much ok as BARs are already in place, but for >>> ARM we >>> >>> need to take care and re-setup vPCI BARs for hwdom. >> Even on x86 there's no guarantee that all devices have their BARs set >> up by firmware. > > This is true. But there you could have config space trapped in "x86 generic > way", > > please correct me if I'm wrong here > >> >> In a subsequent reply you've suggested to move init_bars from "add" to >> "assign", but I'm having trouble seeing what this would change: It's >> not Dom0 controlling assignment (to itself), but Xen assigns the device >> towards the end of pci_add_device(). 
> > PHYSDEVOP_pci_device_add vs XEN_DOMCTL_assign_device > > Currently we initialize BARs during PHYSDEVOP_pci_device_add and > > if we do that during XEN_DOMCTL_assign_device things seem to change But there can't possibly be any XEN_DOMCTL_assign_device involved in booting of Dom0. In order to do passthrough to domUs safely we will have to add more handlers than what's required for dom0, >>> Can you please tell what are thinking about? What are the missing handlers? and having is_hardware_domain sprinkled in all of them is not a suitable solution. >>> I'll try to replace is_hardware_domain with something like: >>> >>> +/* >>> + * Detect whether physical PCI devices in this segment belong >>> + * to the domain given, e.g. on x86 all PCI devices live in hwdom, >>> + * but in case of ARM this might not be the
Re: [PATCH 08/10] vpci/arm: Allow updating BAR's header for non-ECAM bridges
On 13.11.2020 11:39, Oleksandr Andrushchenko wrote: > > On 11/13/20 12:29 PM, Jan Beulich wrote: >> On 09.11.2020 13:50, Oleksandr Andrushchenko wrote: >>> From: Oleksandr Andrushchenko >>> >>> Non-ECAM host bridges in hwdom go directly to PCI config space, >>> not through vpci (they use their specific method for accessing PCI >>> configuration, e.g. dedicated registers etc.). >> And access to these dedicated registers can't be intercepted? > > It can. But then you have to fully emulate that bridge, e.g. > > "if we write A to regB and after that write C to regZ then it > > means we are accessing config space. If we write" Sounds pretty much like the I/O port based access mechanism on x86, which also has some sort of "enable". Of course, I/O port accesses are particularly easy to intercept and handle... > I mean this would need lots of code in Xen to achieve that Possibly, but look at the amount of code we have in Xen on the x86 side to handle MCFG writes by Dom0. Jan
Re: [PATCH 06/10] vpci: Make every domain handle its own BARs
On 11/13/20 12:25 PM, Jan Beulich wrote:
> On 13.11.2020 07:32, Oleksandr Andrushchenko wrote:
>> On 11/12/20 4:46 PM, Roger Pau Monné wrote:
>>> On Thu, Nov 12, 2020 at 01:16:10PM +, Oleksandr Andrushchenko wrote:
On 11/12/20 11:40 AM, Roger Pau Monné wrote:
> On Mon, Nov 09, 2020 at 02:50:27PM +0200, Oleksandr Andrushchenko wrote:
>> From: Oleksandr Andrushchenko
>> +static uint32_t bar_read_dispatch(const struct pci_dev *pdev, unsigned int reg,
>> +                                  void *data)
>> +{
>> +    struct vpci_bar *vbar, *bar = data;
>> +
>> +    if ( is_hardware_domain(current->domain) )
>> +        return bar_read_hwdom(pdev, reg, data);
>> +
>> +    vbar = get_vpci_bar(current->domain, pdev, bar->index);
>> +    if ( !vbar )
>> +        return ~0;
>> +
>> +    return bar_read_guest(pdev, reg, vbar);
>> +}
>> +
>> +static void bar_write_dispatch(const struct pci_dev *pdev, unsigned int reg,
>> +                               uint32_t val, void *data)
>> +{
>> +    struct vpci_bar *bar = data;
>> +
>> +    if ( is_hardware_domain(current->domain) )
>> +        bar_write_hwdom(pdev, reg, val, data);
>> +    else
>> +    {
>> +        struct vpci_bar *vbar = get_vpci_bar(current->domain, pdev, bar->index);
>> +
>> +        if ( !vbar )
>> +            return;
>> +        bar_write_guest(pdev, reg, val, vbar);
>> +    }
>> +}
> You should assign different handlers based on whether the domain that
> has the device assigned is a domU or the hardware domain, rather than
> doing the selection here.
Hm, handlers are assigned once in init_bars and this function is only called
for hwdom, so there is no way I can do that for the guests. Hence, the
dispatcher
>>> I think we might want to reset the vPCI handlers when a device gets
>>> assigned and deassigned.
>> In ARM case init_bars is called too early: PCI device assignment is currently
>> initiated by Domain-0's kernel and is done *before* PCI devices are given
>> memory ranges and BARs assigned:
>>
>> [ 0.429514] pci_bus :00: root bus resource [bus 00-ff]
>> [ 0.429532] pci_bus :00: root bus resource [io 0x-0xf]
>> [ 0.429555] pci_bus :00: root bus resource [mem 0xfe20-0xfe3f]
>> [ 0.429575] pci_bus :00: root bus resource [mem 0x3000-0x37ff]
>> [ 0.429604] pci_bus :00: root bus resource [mem 0x3800-0x3fff pref]
>> [ 0.429670] pci :00:00.0: enabling Extended Tags
>> [ 0.453764] pci :00:00.0: BUS_NOTIFY_ADD_DEVICE
>>
>> < init_bars >
>>
>> [ 0.453793] pci :00:00.0: -- IRQ 0
>> [ 0.458825] pci :00:00.0: Failed to add - passthrough or MSI/MSI-X might fail!
>> [ 0.471790] pci :01:00.0: BUS_NOTIFY_ADD_DEVICE
>>
>> < init_bars >
>>
>> [ 0.471821] pci :01:00.0: -- IRQ 255
>> [ 0.476809] pci :01:00.0: Failed to add - passthrough or MSI/MSI-X might fail!
>>
>> < BAR assignments below >
>>
>> [ 0.488233] pci :00:00.0: BAR 14: assigned [mem 0xfe20-0xfe2f]
>> [ 0.488265] pci :00:00.0: BAR 15: assigned [mem 0x3800-0x380f pref]
>>
>> In case of x86 this is pretty much ok as BARs are already in place, but for
>> ARM we need to take care and re-setup vPCI BARs for hwdom.
> Even on x86 there's no guarantee that all devices have their BARs set
> up by firmware.

This is true. But there you could have config space trapped in "x86 generic
way", please correct me if I'm wrong here.

> In a subsequent reply you've suggested to move init_bars from "add" to
> "assign", but I'm having trouble seeing what this would change: It's
> not Dom0 controlling assignment (to itself), but Xen assigns the device
> towards the end of pci_add_device().

PHYSDEVOP_pci_device_add vs XEN_DOMCTL_assign_device

Currently we initialize BARs during PHYSDEVOP_pci_device_add and
if we do that during XEN_DOMCTL_assign_device things seem to change

>> Things are getting even more
>> complicated if the host PCI bridge is not ECAM like, so you cannot set
>> mmio_handlers and trap hwdom's access to the config space to update BARs etc.
>> This is why I have that ugly hack for rcar_gen3 to read actual BARs for hwdom.
> How do config space accesses work there? The latest for MSI/MSI-X it'll
> be imperative that Xen be able to intercept config space writes.
>>> In order to do passthrough to domUs safely
>>> we will have to add more handlers than what's required for dom0,
>> Can you please tell what you are thinking about? What are the missing handlers?
>>> and
>>> having is_hardware_domain sprinkled in all of them is not a suitable
>>> solution.
>> I'll try to replace is_hardware_domain with something like:
>>
>> +/*
>> + * Detect whether physical PCI devices in
Re: [PATCH 08/10] vpci/arm: Allow updating BAR's header for non-ECAM bridges
On 11/13/20 12:29 PM, Jan Beulich wrote:
> On 09.11.2020 13:50, Oleksandr Andrushchenko wrote:
>> From: Oleksandr Andrushchenko
>>
>> Non-ECAM host bridges in hwdom go directly to PCI config space,
>> not through vpci (they use their specific method for accessing PCI
>> configuration, e.g. dedicated registers etc.).
> And access to these dedicated registers can't be intercepted?

It can. But then you have to fully emulate that bridge, e.g.
"if we write A to regB and after that write C to regZ then it
means we are accessing config space. If we write"

I mean this would need lots of code in Xen to achieve that

> It would seem to me that if so, such a platform is not capable of
> being virtualized (without cooperation by all the domains in
> possession of devices).

Guest domains always use an emulated ECAM bridge, so their accesses
are easily trapped and emulated.

> Jan
Re: [PATCH 06/10] vpci: Make every domain handle its own BARs
On 13/11/2020 10:25, Jan Beulich wrote:
> On 13.11.2020 07:32, Oleksandr Andrushchenko wrote:
>> On 11/12/20 4:46 PM, Roger Pau Monné wrote:
>>> On Thu, Nov 12, 2020 at 01:16:10PM +, Oleksandr Andrushchenko wrote:
>>>> On 11/12/20 11:40 AM, Roger Pau Monné wrote:
>>>>> On Mon, Nov 09, 2020 at 02:50:27PM +0200, Oleksandr Andrushchenko wrote:
>>>>>> From: Oleksandr Andrushchenko
>>>>>> +static uint32_t bar_read_dispatch(const struct pci_dev *pdev, unsigned int reg,
>>>>>> +                                  void *data)
>>>>>> +{
>>>>>> +    struct vpci_bar *vbar, *bar = data;
>>>>>> +
>>>>>> +    if ( is_hardware_domain(current->domain) )
>>>>>> +        return bar_read_hwdom(pdev, reg, data);
>>>>>> +
>>>>>> +    vbar = get_vpci_bar(current->domain, pdev, bar->index);
>>>>>> +    if ( !vbar )
>>>>>> +        return ~0;
>>>>>> +
>>>>>> +    return bar_read_guest(pdev, reg, vbar);
>>>>>> +}
>>>>>> +
>>>>>> +static void bar_write_dispatch(const struct pci_dev *pdev, unsigned int reg,
>>>>>> +                               uint32_t val, void *data)
>>>>>> +{
>>>>>> +    struct vpci_bar *bar = data;
>>>>>> +
>>>>>> +    if ( is_hardware_domain(current->domain) )
>>>>>> +        bar_write_hwdom(pdev, reg, val, data);
>>>>>> +    else
>>>>>> +    {
>>>>>> +        struct vpci_bar *vbar = get_vpci_bar(current->domain, pdev, bar->index);
>>>>>> +
>>>>>> +        if ( !vbar )
>>>>>> +            return;
>>>>>> +        bar_write_guest(pdev, reg, val, vbar);
>>>>>> +    }
>>>>>> +}
>>>>> You should assign different handlers based on whether the domain that
>>>>> has the device assigned is a domU or the hardware domain, rather than
>>>>> doing the selection here.
>>>> Hm, handlers are assigned once in init_bars and this function is only called
>>>> for hwdom, so there is no way I can do that for the guests. Hence, the
>>>> dispatcher
>>> I think we might want to reset the vPCI handlers when a device gets
>>> assigned and deassigned.
>> In ARM case init_bars is called too early: PCI device assignment is currently
>> initiated by Domain-0's kernel and is done *before* PCI devices are given
>> memory ranges and BARs assigned:
>>
>> [ 0.429514] pci_bus :00: root bus resource [bus 00-ff]
>> [ 0.429532] pci_bus :00: root bus resource [io 0x-0xf]
>> [ 0.429555] pci_bus :00: root bus resource [mem 0xfe20-0xfe3f]
>> [ 0.429575] pci_bus :00: root bus resource [mem 0x3000-0x37ff]
>> [ 0.429604] pci_bus :00: root bus resource [mem 0x3800-0x3fff pref]
>> [ 0.429670] pci :00:00.0: enabling Extended Tags
>> [ 0.453764] pci :00:00.0: BUS_NOTIFY_ADD_DEVICE
>>
>> < init_bars >
>>
>> [ 0.453793] pci :00:00.0: -- IRQ 0
>> [ 0.458825] pci :00:00.0: Failed to add - passthrough or MSI/MSI-X might fail!
>> [ 0.471790] pci :01:00.0: BUS_NOTIFY_ADD_DEVICE
>>
>> < init_bars >
>>
>> [ 0.471821] pci :01:00.0: -- IRQ 255
>> [ 0.476809] pci :01:00.0: Failed to add - passthrough or MSI/MSI-X might fail!
>>
>> < BAR assignments below >
>>
>> [ 0.488233] pci :00:00.0: BAR 14: assigned [mem 0xfe20-0xfe2f]
>> [ 0.488265] pci :00:00.0: BAR 15: assigned [mem 0x3800-0x380f pref]
>>
>> In case of x86 this is pretty much ok as BARs are already in place, but for
>> ARM we need to take care and re-setup vPCI BARs for hwdom.
> Even on x86 there's no guarantee that all devices have their BARs set
> up by firmware.
>
> In a subsequent reply you've suggested to move init_bars from "add" to
> "assign", but I'm having trouble seeing what this would change: It's
> not Dom0 controlling assignment (to itself), but Xen assigns the device
> towards the end of pci_add_device().
>> Things are getting even more
>> complicated if the host PCI bridge is not ECAM like, so you cannot set
>> mmio_handlers and trap hwdom's access to the config space to update BARs etc.
>> This is why I have that ugly hack for rcar_gen3 to read actual BARs for hwdom.
> How do config space accesses work there? The latest for MSI/MSI-X it'll
> be imperative that Xen be able to intercept config space writes.

I am not sure I understand your last sentence. Are you saying that we
always need to trap access to the MSI/MSI-X message in order to sanitize
it?

If one is using the GICv3 ITS (I haven't investigated other MSI
controllers), then I don't believe you need to sanitize the MSI/MSI-X
message in most situations.

Cheers,

--
Julien Grall
Re: [PATCH 08/10] vpci/arm: Allow updating BAR's header for non-ECAM bridges
On 09.11.2020 13:50, Oleksandr Andrushchenko wrote:
> From: Oleksandr Andrushchenko
>
> Non-ECAM host bridges in hwdom go directly to PCI config space,
> not through vpci (they use their specific method for accessing PCI
> configuration, e.g. dedicated registers etc.).

And access to these dedicated registers can't be intercepted? It
would seem to me that if so, such a platform is not capable of
being virtualized (without cooperation by all the domains in
possession of devices).

Jan
Re: [PATCH 06/10] vpci: Make every domain handle its own BARs
On 13.11.2020 07:32, Oleksandr Andrushchenko wrote:
> On 11/12/20 4:46 PM, Roger Pau Monné wrote:
>> On Thu, Nov 12, 2020 at 01:16:10PM +, Oleksandr Andrushchenko wrote:
>>> On 11/12/20 11:40 AM, Roger Pau Monné wrote:
On Mon, Nov 09, 2020 at 02:50:27PM +0200, Oleksandr Andrushchenko wrote:
> From: Oleksandr Andrushchenko
> +static uint32_t bar_read_dispatch(const struct pci_dev *pdev, unsigned int reg,
> +                                  void *data)
> +{
> +    struct vpci_bar *vbar, *bar = data;
> +
> +    if ( is_hardware_domain(current->domain) )
> +        return bar_read_hwdom(pdev, reg, data);
> +
> +    vbar = get_vpci_bar(current->domain, pdev, bar->index);
> +    if ( !vbar )
> +        return ~0;
> +
> +    return bar_read_guest(pdev, reg, vbar);
> +}
> +
> +static void bar_write_dispatch(const struct pci_dev *pdev, unsigned int reg,
> +                               uint32_t val, void *data)
> +{
> +    struct vpci_bar *bar = data;
> +
> +    if ( is_hardware_domain(current->domain) )
> +        bar_write_hwdom(pdev, reg, val, data);
> +    else
> +    {
> +        struct vpci_bar *vbar = get_vpci_bar(current->domain, pdev, bar->index);
> +
> +        if ( !vbar )
> +            return;
> +        bar_write_guest(pdev, reg, val, vbar);
> +    }
> +}
You should assign different handlers based on whether the domain that
has the device assigned is a domU or the hardware domain, rather than
doing the selection here.
>>> Hm, handlers are assigned once in init_bars and this function is only called
>>> for hwdom, so there is no way I can do that for the guests. Hence, the
>>> dispatcher
>> I think we might want to reset the vPCI handlers when a device gets
>> assigned and deassigned.
> In ARM case init_bars is called too early: PCI device assignment is currently
> initiated by Domain-0's kernel and is done *before* PCI devices are given
> memory ranges and BARs assigned:
>
> [ 0.429514] pci_bus :00: root bus resource [bus 00-ff]
> [ 0.429532] pci_bus :00: root bus resource [io 0x-0xf]
> [ 0.429555] pci_bus :00: root bus resource [mem 0xfe20-0xfe3f]
> [ 0.429575] pci_bus :00: root bus resource [mem 0x3000-0x37ff]
> [ 0.429604] pci_bus :00: root bus resource [mem 0x3800-0x3fff pref]
> [ 0.429670] pci :00:00.0: enabling Extended Tags
> [ 0.453764] pci :00:00.0: BUS_NOTIFY_ADD_DEVICE
>
> < init_bars >
>
> [ 0.453793] pci :00:00.0: -- IRQ 0
> [ 0.458825] pci :00:00.0: Failed to add - passthrough or MSI/MSI-X might fail!
> [ 0.471790] pci :01:00.0: BUS_NOTIFY_ADD_DEVICE
>
> < init_bars >
>
> [ 0.471821] pci :01:00.0: -- IRQ 255
> [ 0.476809] pci :01:00.0: Failed to add - passthrough or MSI/MSI-X might fail!
>
> < BAR assignments below >
>
> [ 0.488233] pci :00:00.0: BAR 14: assigned [mem 0xfe20-0xfe2f]
> [ 0.488265] pci :00:00.0: BAR 15: assigned [mem 0x3800-0x380f pref]
>
> In case of x86 this is pretty much ok as BARs are already in place, but for
> ARM we need to take care and re-setup vPCI BARs for hwdom.

Even on x86 there's no guarantee that all devices have their BARs set
up by firmware.

In a subsequent reply you've suggested to move init_bars from "add" to
"assign", but I'm having trouble seeing what this would change: It's
not Dom0 controlling assignment (to itself), but Xen assigns the device
towards the end of pci_add_device().

> Things are getting even more
> complicated if the host PCI bridge is not ECAM like, so you cannot set
> mmio_handlers and trap hwdom's access to the config space to update BARs etc.
> This is why I have that ugly hack for rcar_gen3 to read actual BARs for hwdom.

How do config space accesses work there? The latest for MSI/MSI-X it'll
be imperative that Xen be able to intercept config space writes.

>> In order to do passthrough to domUs safely
>> we will have to add more handlers than what's required for dom0,
> Can you please tell what you are thinking about? What are the missing handlers?
>> and
>> having is_hardware_domain sprinkled in all of them is not a suitable
>> solution.
> I'll try to replace is_hardware_domain with something like:
>
> +/*
> + * Detect whether physical PCI devices in this segment belong
> + * to the domain given, e.g. on x86 all PCI devices live in hwdom,
> + * but in case of ARM this might not be the case: those may also
> + * live in driver domains or even Xen itself.
> + */
> +bool pci_is_hardware_domain(struct domain *d, u16 seg)
> +{
> +#ifdef CONFIG_X86
> +    return is_hardware_domain(d);
> +#elif CONFIG_ARM
> +    return pci_is_owner_domain(d, seg);
> +#else
> +#error "Unsupported architecture"
> +#endif
> +}
> +
Re: [PATCH 5/5] x86/p2m: split write_p2m_entry() hook
On 12.11.2020 18:52, Tim Deegan wrote:
> At 15:04 +0100 on 12 Nov (1605193496), Jan Beulich wrote:
>> On 12.11.2020 14:07, Roger Pau Monné wrote:
>>> On Thu, Nov 12, 2020 at 01:29:33PM +0100, Jan Beulich wrote:
I agree with all this. If only it was merely about TLB flushes. In
the shadow case, shadow_blow_all_tables() gets invoked, and that
one - looking at the other call sites - wants the paging lock held.
> [...]
>>> The post hook for shadow could pick the lock again, as I don't think
>>> the removal of the tables needs to be strictly done inside of the same
>>> locked region?
>>
>> I think it does, or else a use of the now stale tables may occur
>> before they got blown away. Tim?
>
> Is this the call to shadow_blow_tables() in the write_p2m_entry path?

Yes.

> I think it would be safe to drop and re-take the paging lock there as
> long as the call happens before the write is considered to have
> finished.
>
> But it would not be a useful performance improvement - any update that
> takes this path is going to be very slow regardless. So unless you
> have another pressing reason to split it up, I would be inclined to
> leave it as it is. That way it's easier to see that the locking is
> correct.

Thanks for the clarification. Roger - what's your view at this point?

Jan
Re: dom0 PVH: 'entry->arch.pirq != INVALID_PIRQ' failed at vmsi.c:843
On Thu, Nov 12, 2020 at 09:19:39PM +0100, Roger Pau Monné wrote:
> The following might be able to get you going, but I think I need to
> refine the logic a bit there, will have to give it some thought.

Thanks. It avoids the panic, but it seems that MSI and MSI-X interrupts
are not delivered to the dom0 kernel. The counters stay at 0.
I get some ioapic interrupts though, as well as some Xen events.

--
Manuel Bouyer
NetBSD: 26 years of experience will always make the difference
--