Re: [PATCH 1/2] automation: add a QEMU aarch64 smoke test

2020-11-13 Thread Stefano Stabellini
On Fri, 13 Nov 2020, Stefano Stabellini wrote:
> Use QEMU to start Xen (just the hypervisor) up until it stops because
> there is no dom0 kernel to boot.
> 
> It is based on the existing build job unstable-arm64v8.
> 
> Also use make -j$(nproc) to build Xen.
> 
> Signed-off-by: Stefano Stabellini 
> ---
>  automation/gitlab-ci/test.yaml | 22 ++
>  automation/scripts/build   |  8 +++
>  automation/scripts/qemu-smoke-arm64.sh | 32 ++
>  3 files changed, 57 insertions(+), 5 deletions(-)
>  create mode 100755 automation/scripts/qemu-smoke-arm64.sh
> 
> diff --git a/automation/gitlab-ci/test.yaml b/automation/gitlab-ci/test.yaml
> index 793feafe8b..35346e3f6e 100644
> --- a/automation/gitlab-ci/test.yaml
> +++ b/automation/gitlab-ci/test.yaml
> @@ -22,6 +22,28 @@ build-each-commit-gcc:
>  - /^coverity-tested\/.*/
>  - /^stable-.*/
>  
> +qemu-smoke-arm64-gcc:
> +  stage: test
> +  image: registry.gitlab.com/xen-project/xen/${CONTAINER}
> +  variables:
> +    CONTAINER: debian:unstable-arm64v8
> +  script:
> +    - ./automation/scripts/qemu-smoke-arm64.sh 2>&1 | tee qemu-smoke-arm64.log
> +  dependencies:
> +    - debian-unstable-gcc-arm64
> +  artifacts:
> +    paths:
> +      - smoke.serial
> +      - '*.log'
> +    when: always
> +  tags:
> +    - arm64
> +  except:
> +    - master
> +    - smoke
> +    - /^coverity-tested\/.*/
> +    - /^stable-.*/
> +
>  qemu-smoke-x86-64-gcc:
>stage: test
>image: registry.gitlab.com/xen-project/xen/${CONTAINER}
> diff --git a/automation/scripts/build b/automation/scripts/build
> index 0cd0f3971d..95ad23eadb 100755
> --- a/automation/scripts/build
> +++ b/automation/scripts/build
> @@ -10,9 +10,9 @@ cc-ver()
>  
>  # random config or default config
>  if [[ "${RANDCONFIG}" == "y" ]]; then
> -make -C xen KCONFIG_ALLCONFIG=tools/kconfig/allrandom.config randconfig
> +make -j$(nproc) -C xen KCONFIG_ALLCONFIG=tools/kconfig/allrandom.config randconfig
>  else
> -make -C xen defconfig
> +make -j$(nproc) -C xen defconfig
>  fi
>  
>  # build up our configure options
> @@ -45,9 +45,7 @@ make -j$(nproc) dist
>  # Extract artifacts to avoid getting rewritten by customised builds
>  cp xen/.config xen-config
>  mkdir binaries
> -if [[ "${XEN_TARGET_ARCH}" == "x86_64" ]]; then
> -cp xen/xen binaries/xen
> -fi
> +cp xen/xen binaries/xen

This was a mistake, it should have been:

if [[ "${XEN_TARGET_ARCH}" != "x86_32" ]]; then
cp xen/xen binaries/xen
fi


Unrelated: should we temporarily disable the stubdom build
(--disable-stubdom) in the gitlab build? (Even better if somebody
volunteers to fix the stubdom build somehow.) At the moment a bunch of
jobs are failing with:

mini-os.c: In function 'main':
mini-os.c:756: warning: cast from pointer to integer of different size
ar: creating /builds/xen-project/people/sstabellini/xen/stubdom/mini-os-x86_64-xenstorepvh/arch/x86/libx86_64.a
/builds/xen-project/people/sstabellini/xen/stubdom/mini-os-x86_64-xenstorepvh/mini-os.o: could not read symbols: File in wrong format
make[2]: *** [/builds/xen-project/people/sstabellini/xen/stubdom/mini-os-x86_64-xenstorepvh/mini-os] Error 1
make[1]: *** [xenstorepvh-stubdom] Error 2
make[1]: *** Waiting for unfinished jobs


With the above x86_32 fix and with stubdoms disabled, all jobs (including
the new aarch64 smoke test) complete successfully.



Re: [SPECIFICATION RFC] The firmware and bootloader log specification

2020-11-13 Thread Randy Dunlap
On 11/13/20 3:52 PM, Daniel Kiper wrote:
> Hey,
> 
> 
> Here is the description (pseudocode) of the structures which will be
> used to store the log data.
> 
> Anyway, I am aware that this is not a specification per se.


Yes, you have caveats here. I'm sure that you either already know
or would learn soon enough that struct bf_log has some
padding added to it (for alignment) unless it is packed.
Alternatively, you could rearrange the order of some of its fields
and save 8 bytes per struct on x86_64.


>   struct bf_log
>   {
> uint32_t   version;
> char   producer[64];
> uint64_t   flags;
> uint64_t   next_bf_log_addr;
> uint32_t   next_msg_off;
> bf_log_msg msgs[];
>   }
> 
>   struct bf_log_msg
>   {
> uint32_t size;
> uint64_t ts_nsec;
> uint32_t level;
> uint32_t facility;
> uint32_t msg_off;
> char strings[];
>   }


cheers.
-- 
~Randy




[xen-unstable test] 156737: tolerable FAIL - PUSHED

2020-11-13 Thread osstest service owner
flight 156737 xen-unstable real [real]
http://logs.test-lab.xenproject.org/osstest/logs/156737/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stop fail like 156443
 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stop fail like 156443
 test-amd64-i386-xl-qemut-ws16-amd64 19 guest-stop fail like 156443
 test-amd64-amd64-xl-qemut-win7-amd64 19 guest-stop fail like 156443
 test-amd64-i386-xl-qemut-win7-amd64 19 guest-stop fail like 156443
 test-armhf-armhf-libvirt 16 saverestore-support-check fail like 156443
 test-armhf-armhf-libvirt-raw 15 saverestore-support-check fail like 156443
 test-amd64-amd64-xl-qemut-ws16-amd64 19 guest-stop fail like 156443
 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 156443
 test-amd64-i386-xl-qemuu-win7-amd64 19 guest-stop fail like 156443
 test-amd64-i386-xl-qemuu-ws16-amd64 19 guest-stop fail like 156443
 test-amd64-i386-xl-pvshim 14 guest-start fail never pass
 test-amd64-i386-libvirt-xsm 15 migrate-support-check fail never pass
 test-amd64-amd64-libvirt-xsm 15 migrate-support-check fail never pass
 test-amd64-amd64-libvirt 15 migrate-support-check fail never pass
 test-amd64-i386-libvirt 15 migrate-support-check fail never pass
 test-arm64-arm64-xl 15 migrate-support-check fail never pass
 test-arm64-arm64-xl 16 saverestore-support-check fail never pass
 test-arm64-arm64-xl-credit1 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-credit1 16 saverestore-support-check fail never pass
 test-arm64-arm64-libvirt-xsm 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-xsm 15 migrate-support-check fail never pass
 test-arm64-arm64-libvirt-xsm 16 saverestore-support-check fail never pass
 test-arm64-arm64-xl-xsm 16 saverestore-support-check fail never pass
 test-arm64-arm64-xl-credit2 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-credit2 16 saverestore-support-check fail never pass
 test-arm64-arm64-xl-thunderx 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-thunderx 16 saverestore-support-check fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check fail never pass
 test-armhf-armhf-xl-arndale 15 migrate-support-check fail never pass
 test-armhf-armhf-xl-arndale 16 saverestore-support-check fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check fail never pass
 test-amd64-amd64-libvirt-vhd 14 migrate-support-check fail never pass
 test-armhf-armhf-xl 15 migrate-support-check fail never pass
 test-armhf-armhf-xl 16 saverestore-support-check fail never pass
 test-armhf-armhf-xl-cubietruck 15 migrate-support-check fail never pass
 test-armhf-armhf-xl-cubietruck 16 saverestore-support-check fail never pass
 test-armhf-armhf-xl-credit1 15 migrate-support-check fail never pass
 test-armhf-armhf-xl-credit1 16 saverestore-support-check fail never pass
 test-arm64-arm64-xl-seattle 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-seattle 16 saverestore-support-check fail never pass
 test-armhf-armhf-xl-credit2 15 migrate-support-check fail never pass
 test-armhf-armhf-xl-credit2 16 saverestore-support-check fail never pass
 test-armhf-armhf-libvirt 15 migrate-support-check fail never pass
 test-armhf-armhf-libvirt-raw 14 migrate-support-check fail never pass
 test-armhf-armhf-xl-rtds 15 migrate-support-check fail never pass
 test-armhf-armhf-xl-rtds 16 saverestore-support-check fail never pass
 test-armhf-armhf-xl-vhd 14 migrate-support-check fail never pass
 test-armhf-armhf-xl-vhd 15 saverestore-support-check fail never pass
 test-armhf-armhf-xl-multivcpu 15 migrate-support-check fail never pass
 test-armhf-armhf-xl-multivcpu 16 saverestore-support-check fail never pass

version targeted for testing:
 xen  5505f5f8e7e805365cfe70b6a4af6115940bb749
baseline version:
 xen  9ff9705647646aa937b5f5c1426a64c69a62b3bd

Last test of basis   156443  2020-11-05 15:47:13 Z    8 days
Failing since        156524  2020-11-06 14:22:28 Z    7 days    9 attempts
Testing same since   156711  2020-11-12 09:06:03 Z    1 days    2 attempts


People who touched revisions under test:
  Andrew Cooper 
  Anthony PERARD 
  Jan Beulich 
  Jason Andryuk 
  Juergen Gross 
  Julien Grall 
  Marek Marczykowski-Górecki 
  Olaf Hering 
  Penny Zheng 
  Roger Pau Monné 
  Stefano Stabellini 
  Stefano 

[PATCH 2/2] automation: add dom0less to the QEMU aarch64 smoke test

2020-11-13 Thread Stefano Stabellini
Add a trivial dom0less test:
- build Linux defconfig arm64 to be used as dom0/U kernel
- use busybox-static to create a trivial dom0/U ramdisk
- use ImageBuilder to generate the uboot boot script automatically
- install and use u-boot from the Debian package to start the test
- binaries are loaded from uboot via tftp

Disabling the pl061 device is required to get any version of Linux to
boot on Xen on qemu-system-aarch64.

Signed-off-by: Stefano Stabellini 
---
 automation/gitlab-ci/test.yaml |  1 +
 automation/scripts/qemu-smoke-arm64.sh | 87 +++---
 2 files changed, 81 insertions(+), 7 deletions(-)

diff --git a/automation/gitlab-ci/test.yaml b/automation/gitlab-ci/test.yaml
index 35346e3f6e..76eff1004e 100644
--- a/automation/gitlab-ci/test.yaml
+++ b/automation/gitlab-ci/test.yaml
@@ -27,6 +27,7 @@ qemu-smoke-arm64-gcc:
   image: registry.gitlab.com/xen-project/xen/${CONTAINER}
   variables:
     CONTAINER: debian:unstable-arm64v8
+    LINUX_VERSION: 5.9.8
   script:
     - ./automation/scripts/qemu-smoke-arm64.sh 2>&1 | tee qemu-smoke-arm64.log
   dependencies:
diff --git a/automation/scripts/qemu-smoke-arm64.sh b/automation/scripts/qemu-smoke-arm64.sh
index a7efbf8b6f..e8dc5b19cb 100755
--- a/automation/scripts/qemu-smoke-arm64.sh
+++ b/automation/scripts/qemu-smoke-arm64.sh
@@ -6,27 +6,100 @@ set -ex
 export DEBIAN_FRONTENT=noninteractive
 apt-get -qy update
 apt-get -qy install --no-install-recommends qemu-system-aarch64 \
-u-boot-qemu
+u-boot-qemu \
+u-boot-tools \
+device-tree-compiler \
+busybox-static \
+curl \
+cpio \
+bc
 
 # XXX Silly workaround to get the following QEMU command to work
 cp /usr/share/qemu/pvh.bin /usr/share/qemu/efi-virtio.rom
 qemu-system-aarch64 \
    -machine virtualization=true \
    -cpu cortex-a57 -machine type=virt \
-   -m 512 -display none \
+   -m 1024 -display none \
    -machine dumpdtb=binaries/virt-gicv3.dtb
+# XXX disable pl061 to avoid Linux crash
+dtc -I dtb -O dts binaries/virt-gicv3.dtb > binaries/virt-gicv3.dts
+sed 's/compatible = "arm,pl061.*/status = "disabled";/g' binaries/virt-gicv3.dts > binaries/virt-gicv3-edited.dts
+dtc -I dts -O dtb binaries/virt-gicv3-edited.dts > binaries/virt-gicv3.dtb
 
+
+# Busybox Dom0
+mkdir -p initrd
+mkdir -p initrd/bin
+mkdir -p initrd/sbin
+mkdir -p initrd/etc
+mkdir -p initrd/dev
+mkdir -p initrd/proc
+mkdir -p initrd/sys
+mkdir -p initrd/lib
+mkdir -p initrd/var
+mkdir -p initrd/mnt
+cp /bin/busybox initrd/bin/busybox
+initrd/bin/busybox --install initrd/bin
+echo "#!/bin/sh
+
+mount -t proc proc /proc
+mount -t sysfs sysfs /sys
+mount -t devtmpfs devtmpfs /dev
+/bin/sh" > initrd/init
+chmod +x initrd/init
+cd initrd
+find . | cpio --create --format='newc' | gzip > ../binaries/initrd
+cd ..
+
+
+# Linux Dom0
+curl -fsSLO https://cdn.kernel.org/pub/linux/kernel/v5.x/linux-"$LINUX_VERSION".tar.xz
+tar xvJf linux-"$LINUX_VERSION".tar.xz
+cd linux-"$LINUX_VERSION"
+make defconfig
+make -j$(nproc) Image.gz
+cp arch/arm64/boot/Image ../binaries
+cd ..
+
+
+# ImageBuilder
+echo 'MEMORY_START="0x40000000"
+MEMORY_END="0x80000000"
+
+DEVICE_TREE="virt-gicv3.dtb"
+XEN="xen"
+DOM0_KERNEL="Image"
+DOM0_RAMDISK="initrd"
+XEN_CMD="console=dtuart dom0_mem=512M"
+
+NUM_DOMUS=1
+DOMU_KERNEL[0]="Image"
+DOMU_RAMDISK[0]="initrd"
+DOMU_MEM[0]="256"
+
+LOAD_CMD="tftpb"
+UBOOT_SOURCE="boot.source"
+UBOOT_SCRIPT="boot.scr"' > binaries/config
+rm -rf imagebuilder
+git clone https://gitlab.com/ViryaOS/imagebuilder
+bash imagebuilder/scripts/uboot-script-gen -t tftp -d binaries/ -c binaries/config
+
+
+# Run the test
 rm -f smoke.serial
 set +e
-echo "  booti 0x49000000 - 0x44000000" | timeout -k 1 30 qemu-system-aarch64 \
+echo "  virtio scan; dhcp; tftpb 0x40000000 boot.scr; source 0x40000000"| \
+timeout -k 1 240 \
+qemu-system-aarch64 \
 -machine virtualization=true \
 -cpu cortex-a57 -machine type=virt \
--m 512 -monitor none -serial stdio \
+-m 1024 -monitor none -serial stdio \
+-smp 2 \
 -no-reboot \
--device loader,file=binaries/virt-gicv3.dtb,force-raw=on,addr=0x44000000 \
--device loader,file=binaries/xen,force-raw=on,addr=0x49000000 \
+-device virtio-net-pci,netdev=n0 \
+-netdev user,id=n0,tftp=binaries \
 -bios /usr/lib/u-boot/qemu_arm64/u-boot.bin |& tee smoke.serial
 
 set -e
-grep -q 'LOADING DOMAIN 0' smoke.serial || exit 1
+(grep -q "^BusyBox" smoke.serial && grep -q "DOM1: BusyBox" smoke.serial) || exit 1
 exit 0
-- 
2.17.1




[PATCH 1/2] automation: add a QEMU aarch64 smoke test

2020-11-13 Thread Stefano Stabellini
Use QEMU to start Xen (just the hypervisor) up until it stops because
there is no dom0 kernel to boot.

It is based on the existing build job unstable-arm64v8.

Also use make -j$(nproc) to build Xen.

Signed-off-by: Stefano Stabellini 
---
 automation/gitlab-ci/test.yaml | 22 ++
 automation/scripts/build   |  8 +++
 automation/scripts/qemu-smoke-arm64.sh | 32 ++
 3 files changed, 57 insertions(+), 5 deletions(-)
 create mode 100755 automation/scripts/qemu-smoke-arm64.sh

diff --git a/automation/gitlab-ci/test.yaml b/automation/gitlab-ci/test.yaml
index 793feafe8b..35346e3f6e 100644
--- a/automation/gitlab-ci/test.yaml
+++ b/automation/gitlab-ci/test.yaml
@@ -22,6 +22,28 @@ build-each-commit-gcc:
 - /^coverity-tested\/.*/
 - /^stable-.*/
 
+qemu-smoke-arm64-gcc:
+  stage: test
+  image: registry.gitlab.com/xen-project/xen/${CONTAINER}
+  variables:
+    CONTAINER: debian:unstable-arm64v8
+  script:
+    - ./automation/scripts/qemu-smoke-arm64.sh 2>&1 | tee qemu-smoke-arm64.log
+  dependencies:
+    - debian-unstable-gcc-arm64
+  artifacts:
+    paths:
+      - smoke.serial
+      - '*.log'
+    when: always
+  tags:
+    - arm64
+  except:
+    - master
+    - smoke
+    - /^coverity-tested\/.*/
+    - /^stable-.*/
+
 qemu-smoke-x86-64-gcc:
   stage: test
   image: registry.gitlab.com/xen-project/xen/${CONTAINER}
diff --git a/automation/scripts/build b/automation/scripts/build
index 0cd0f3971d..95ad23eadb 100755
--- a/automation/scripts/build
+++ b/automation/scripts/build
@@ -10,9 +10,9 @@ cc-ver()
 
 # random config or default config
 if [[ "${RANDCONFIG}" == "y" ]]; then
-make -C xen KCONFIG_ALLCONFIG=tools/kconfig/allrandom.config randconfig
+make -j$(nproc) -C xen KCONFIG_ALLCONFIG=tools/kconfig/allrandom.config randconfig
 else
-make -C xen defconfig
+make -j$(nproc) -C xen defconfig
 fi
 
 # build up our configure options
@@ -45,9 +45,7 @@ make -j$(nproc) dist
 # Extract artifacts to avoid getting rewritten by customised builds
 cp xen/.config xen-config
 mkdir binaries
-if [[ "${XEN_TARGET_ARCH}" == "x86_64" ]]; then
-cp xen/xen binaries/xen
-fi
+cp xen/xen binaries/xen
 
 # Build all the configs we care about
 case ${XEN_TARGET_ARCH} in
diff --git a/automation/scripts/qemu-smoke-arm64.sh b/automation/scripts/qemu-smoke-arm64.sh
new file mode 100755
index 00..a7efbf8b6f
--- /dev/null
+++ b/automation/scripts/qemu-smoke-arm64.sh
@@ -0,0 +1,32 @@
+#!/bin/bash
+
+set -ex
+
+# Install QEMU
+export DEBIAN_FRONTENT=noninteractive
+apt-get -qy update
+apt-get -qy install --no-install-recommends qemu-system-aarch64 \
+u-boot-qemu
+
+# XXX Silly workaround to get the following QEMU command to work
+cp /usr/share/qemu/pvh.bin /usr/share/qemu/efi-virtio.rom
+qemu-system-aarch64 \
+   -machine virtualization=true \
+   -cpu cortex-a57 -machine type=virt \
+   -m 512 -display none \
+   -machine dumpdtb=binaries/virt-gicv3.dtb
+
+rm -f smoke.serial
+set +e
+echo "  booti 0x49000000 - 0x44000000" | timeout -k 1 30 qemu-system-aarch64 \
+-machine virtualization=true \
+-cpu cortex-a57 -machine type=virt \
+-m 512 -monitor none -serial stdio \
+-no-reboot \
+-device loader,file=binaries/virt-gicv3.dtb,force-raw=on,addr=0x44000000 \
+-device loader,file=binaries/xen,force-raw=on,addr=0x49000000 \
+-bios /usr/lib/u-boot/qemu_arm64/u-boot.bin |& tee smoke.serial
+
+set -e
+grep -q 'LOADING DOMAIN 0' smoke.serial || exit 1
+exit 0
-- 
2.17.1




[PATCH 0/2] automation: arm64 dom0less smoke test

2020-11-13 Thread Stefano Stabellini
Hi all,

This short series introduces a very simple Xen Dom0less smoke test based
on qemu-system-aarch64, to be run as part of the GitLab CI loop. It
currently passes on staging.

Cheers,

Stefano


Stefano Stabellini (2):
  automation: add a QEMU aarch64 smoke test
  automation: add dom0less to the QEMU aarch64 smoke test

 automation/gitlab-ci/test.yaml |  23 
 automation/scripts/build   |   8 +--
 automation/scripts/qemu-smoke-arm64.sh | 105 +
 3 files changed, 131 insertions(+), 5 deletions(-)
 create mode 100755 automation/scripts/qemu-smoke-arm64.sh



[ovmf test] 156742: all pass - PUSHED

2020-11-13 Thread osstest service owner
flight 156742 ovmf real [real]
http://logs.test-lab.xenproject.org/osstest/logs/156742/

Perfect :-)
All tests in this flight passed as required
version targeted for testing:
 ovmf 662b42db76a5b195c3aa94ab2946e342a15cd185
baseline version:
 ovmf 544cb0132dc1778b9e791202995533523fa6cccd

Last test of basis   156720  2020-11-12 16:00:44 Z    1 days
Testing same since   156742  2020-11-13 11:32:54 Z    0 days    1 attempts


People who touched revisions under test:
  Pete Batard 
  Yunhua Feng 

jobs:
 build-amd64-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvopspass
 build-i386-pvops pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64 pass
 test-amd64-i386-xl-qemuu-ovmf-amd64  pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To xenbits.xen.org:/home/xen/git/osstest/ovmf.git
   544cb0132d..662b42db76  662b42db76a5b195c3aa94ab2946e342a15cd185 -> xen-tested-master



[SPECIFICATION RFC] The firmware and bootloader log specification

2020-11-13 Thread Daniel Kiper
Hey,

This is the next attempt to create a firmware and bootloader log specification.
Due to high interest in the industry, it is an extension of the initial
bootloader-log-only specification. It takes into account most of the
comments which I have received up until now.

The goal is to pass all logs produced by various boot components to the
running OS. The OS kernel should expose these logs to the user space
and/or process them internally if needed. The content of these logs
should be human readable. However, they should also contain the
information which allows admins to do e.g. boot time analysis.

The log specification should be as platform-agnostic and self-contained as
possible. The final version of this spec should be merged into existing
specifications, e.g. UEFI, ACPI, Multiboot2, or be a standalone spec, e.g. as
a part of the OASIS Standards. The former seems better but is not perfect
either...

Here is the description (pseudocode) of the structures which will be
used to store the log data.

  struct bf_log
  {
uint32_t   version;
char   producer[64];
uint64_t   flags;
uint64_t   next_bf_log_addr;
uint32_t   next_msg_off;
bf_log_msg msgs[];
  }

  struct bf_log_msg
  {
uint32_t size;
uint64_t ts_nsec;
uint32_t level;
uint32_t facility;
uint32_t msg_off;
char strings[];
  }

The members of struct bf_log:
  - version: the firmware and bootloader log format version number, 1 for now,
  - producer: the producer/firmware/bootloader/... type; the length
allows ASCII UUID storage if somebody needs that functionality,
  - flags: it can be used to store information about log state, e.g.
    whether it was truncated or not (does it make sense to have information
    about the number of lost messages?),
  - next_bf_log_addr: address of next bf_log struct; none if zero (I think
newer spec versions should not change anything in first 5 bf_log members;
this way older log parsers will be able to traverse/copy all logs regardless
of version used in one log or another),
  - next_msg_off: the offset, in bytes, from the beginning of the bf_log struct,
of the next byte after the last log message in the msgs[]; i.e. the offset
of the next available log message slot; it is equal to the total size of
the log buffer including the bf_log struct,
  - msgs: the array of log messages,
  - should we add CRC or hash or signatures here?

The members of struct bf_log_msg:
  - size: total size of bf_log_msg struct,
  - ts_nsec: timestamp expressed in nanoseconds starting from 0,
  - level: similar to syslog meaning; can be used to differentiate normal
    messages from debug messages; the exact interpretation depends on the
    current producer type specified in the bf_log.producer,
  - facility: similar to syslog meaning; can be used to differentiate the
    sources of the messages, e.g. a message produced by a networking module;
    the exact interpretation depends on the current producer type specified
    in the bf_log.producer,
  - msg_off: the log message offset in strings[],
  - strings[0]: the beginning of the log message type, similar to the
    facility member but a NUL terminated string instead of an integer; this
    will be used by, e.g., GRUB2 for messages printed using grub_dprintf(),
  - strings[msg_off]: the beginning of the log message, a NUL terminated
    string.

Note: The producers are free to use/ignore any given set of level, facility
  and/or log type members. Though the usage of these members has to be
  clearly defined. Ignored integer members should be set to 0. An ignored
  log message type should contain an empty NUL terminated string. The log
  message is mandatory but can be an empty NUL terminated string.

There is still the not-fully-solved problem of how the logs should be
presented to the OS. On UEFI platforms we can use config tables to do that.
Then probably bf_log.next_bf_log_addr should not be used. On ACPI and Device
Tree platforms we can use these mechanisms to present the logs to the OSes.
The situation gets more difficult if neither of these mechanisms is present.
However, maybe we should not bother too much about that, because these
platforms are probably getting less and less common.

Anyway, I am aware that this is not a specification per se. The goal of this
email is to continue the discussion about the idea of the firmware and
bootloader log and to find out where the final specification should land. Of
course taking into account the assumptions made above.

You can find previous discussions about related topics at [1], [2] and [3].

Additionally, I am going to present this during a GRUB mini-summit session on
Tuesday, 17th of November at 15:45 UTC. So, if you want to discuss the log
design, please join us. You can find more details here [4].

Daniel

[1] https://lists.gnu.org/archive/html/grub-devel/2019-10/msg00107.html
[2] https://lists.gnu.org/archive/html/grub-devel/2019-11/msg00079.html
[3] 

[qemu-mainline test] 156733: regressions - FAIL

2020-11-13 Thread osstest service owner
flight 156733 qemu-mainline real [real]
flight 156763 qemu-mainline real-retest [real]
http://logs.test-lab.xenproject.org/osstest/logs/156733/
http://logs.test-lab.xenproject.org/osstest/logs/156763/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-libvirt-vhd 19 guest-start/debian.repeat fail REGR. vs. 152631
 test-armhf-armhf-libvirt-raw 12 debian-di-install fail REGR. vs. 152631
 test-amd64-amd64-xl-qcow2   21 guest-start/debian.repeat fail REGR. vs. 152631
 test-arm64-arm64-libvirt-xsm 14 guest-start  fail REGR. vs. 152631
 test-armhf-armhf-xl-vhd 12 debian-di-install fail REGR. vs. 152631
 test-armhf-armhf-libvirt 14 guest-start  fail REGR. vs. 152631

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-xl-rtds 19 guest-start.2   fail blocked in 152631
 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stop fail like 152631
 test-amd64-i386-xl-qemuu-win7-amd64 19 guest-stop fail like 152631
 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stop fail like 152631
 test-amd64-i386-xl-qemuu-ws16-amd64 19 guest-stop fail like 152631
 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 152631
 test-arm64-arm64-xl-seattle 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-seattle 16 saverestore-support-check fail never pass
 test-amd64-amd64-libvirt 15 migrate-support-check fail never pass
 test-amd64-amd64-libvirt-xsm 15 migrate-support-check fail never pass
 test-amd64-i386-libvirt 15 migrate-support-check fail never pass
 test-amd64-i386-libvirt-xsm 15 migrate-support-check fail never pass
 test-amd64-i386-xl-pvshim 14 guest-start fail never pass
 test-amd64-amd64-libvirt-vhd 14 migrate-support-check fail never pass
 test-arm64-arm64-xl-credit2 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-credit1 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-credit2 16 saverestore-support-check fail never pass
 test-arm64-arm64-xl-credit1 16 saverestore-support-check fail never pass
 test-arm64-arm64-xl 15 migrate-support-check fail never pass
 test-arm64-arm64-xl 16 saverestore-support-check fail never pass
 test-arm64-arm64-xl-thunderx 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-thunderx 16 saverestore-support-check fail never pass
 test-arm64-arm64-xl-xsm 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-xsm 16 saverestore-support-check fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check fail never pass
 test-armhf-armhf-xl-arndale 15 migrate-support-check fail never pass
 test-armhf-armhf-xl-arndale 16 saverestore-support-check fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check fail never pass
 test-armhf-armhf-xl 15 migrate-support-check fail never pass
 test-armhf-armhf-xl 16 saverestore-support-check fail never pass
 test-armhf-armhf-xl-multivcpu 15 migrate-support-check fail never pass
 test-armhf-armhf-xl-multivcpu 16 saverestore-support-check fail never pass
 test-armhf-armhf-xl-credit2 15 migrate-support-check fail never pass
 test-armhf-armhf-xl-credit2 16 saverestore-support-check fail never pass
 test-armhf-armhf-xl-credit1 15 migrate-support-check fail never pass
 test-armhf-armhf-xl-credit1 16 saverestore-support-check fail never pass
 test-armhf-armhf-xl-cubietruck 15 migrate-support-check fail never pass
 test-armhf-armhf-xl-cubietruck 16 saverestore-support-check fail never pass
 test-armhf-armhf-xl-rtds 15 migrate-support-check fail never pass
 test-armhf-armhf-xl-rtds 16 saverestore-support-check fail never pass

version targeted for testing:
 qemuucb5d19e8294486551c422759260883ed290226d9
baseline version:
 qemuu1d806cef0e38b5db8347a8e12f214d543204a314

Last test of basis   152631  2020-08-20 09:07:46 Z   85 days
Failing since        152659  2020-08-21 14:07:39 Z   84 days  180 attempts
Testing same since   156733  2020-11-13 05:00:33 Z    0 days    1 attempts


People who touched revisions under test:
Aaron Lindsay 
  Alberto Garcia 
  Aleksandar Markovic 
  Alex Bennée 
  Alex Chen 
  Alex Williamson 
  Alexander Bulekov 
  AlexChen 
  Alexey Kirillov 
  Alistair Francis 
  Alistair Francis 
  Amey Narkhede 
  Ana Pazos 
  Andreas Gustafsson 
  Andrew Jones 
  Andrey Konovalov 
  Andrey Shinkevich 
  Ani Sinha 
  Anthony PERARD 
  Anton Blanchard 
  Anup Patel 
  Artyom Tarasenko 
  Babu Moger 
  BALATON Zoltan 
  Bastian 

[linux-linus test] 156725: regressions - FAIL

2020-11-13 Thread osstest service owner
flight 156725 linux-linus real [real]
http://logs.test-lab.xenproject.org/osstest/logs/156725/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-i386-qemut-rhel6hvm-intel  7 xen-install  fail REGR. vs. 152332
 test-amd64-i386-xl-qemuu-dmrestrict-amd64-dmrestrict 7 xen-install fail REGR. vs. 152332
 test-amd64-i386-xl-qemuu-ws16-amd64  7 xen-install   fail REGR. vs. 152332
 test-amd64-i386-qemuu-rhel6hvm-intel  7 xen-install  fail REGR. vs. 152332
 test-amd64-i386-xl-xsm7 xen-install  fail REGR. vs. 152332
 test-amd64-i386-xl-qemuu-debianhvm-amd64-shadow 7 xen-install fail REGR. vs. 152332
 test-amd64-i386-xl-qemut-debianhvm-amd64  7 xen-install  fail REGR. vs. 152332
 test-amd64-i386-xl-qemuu-debianhvm-i386-xsm 7 xen-install fail REGR. vs. 152332
 test-amd64-i386-examine   6 xen-install  fail REGR. vs. 152332
 test-amd64-i386-libvirt   7 xen-install  fail REGR. vs. 152332
 test-amd64-i386-xl-qemuu-debianhvm-amd64  7 xen-install  fail REGR. vs. 152332
 test-amd64-i386-qemuu-rhel6hvm-amd  7 xen-installfail REGR. vs. 152332
 test-amd64-i386-xl7 xen-install  fail REGR. vs. 152332
 test-amd64-i386-xl-qemut-ws16-amd64  7 xen-install   fail REGR. vs. 152332
 test-amd64-i386-pair 10 xen-install/src_host fail REGR. vs. 152332
 test-amd64-i386-pair 11 xen-install/dst_host fail REGR. vs. 152332
 test-amd64-i386-libvirt-xsm   7 xen-install  fail REGR. vs. 152332
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 7 xen-install fail REGR. vs. 152332
 test-amd64-i386-freebsd10-amd64  7 xen-install   fail REGR. vs. 152332
 test-amd64-i386-xl-pvshim 7 xen-install  fail REGR. vs. 152332
 test-amd64-i386-xl-raw7 xen-install  fail REGR. vs. 152332
 test-amd64-i386-freebsd10-i386  7 xen-installfail REGR. vs. 152332
 test-amd64-i386-xl-shadow 7 xen-install  fail REGR. vs. 152332
 test-amd64-i386-xl-qemut-debianhvm-i386-xsm 7 xen-install fail REGR. vs. 152332
 test-amd64-i386-xl-qemut-win7-amd64  7 xen-install   fail REGR. vs. 152332
 test-amd64-i386-xl-qemuu-win7-amd64  7 xen-install   fail REGR. vs. 152332
 test-amd64-i386-xl-qemuu-ovmf-amd64  7 xen-install   fail REGR. vs. 152332
 test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm 7 xen-install fail REGR. vs. 152332
 test-amd64-i386-libvirt-pair 10 xen-install/src_host fail REGR. vs. 152332
 test-amd64-i386-libvirt-pair 11 xen-install/dst_host fail REGR. vs. 152332
 test-arm64-arm64-xl-credit2  10 host-ping-check-xen  fail REGR. vs. 152332
 test-amd64-coresched-i386-xl  7 xen-install  fail REGR. vs. 152332
 test-amd64-i386-qemut-rhel6hvm-amd  7 xen-installfail REGR. vs. 152332
 test-arm64-arm64-examine  8 reboot   fail REGR. vs. 152332
 test-arm64-arm64-libvirt-xsm  8 xen-boot fail REGR. vs. 152332
 test-arm64-arm64-xl   8 xen-boot fail REGR. vs. 152332
 test-arm64-arm64-xl-credit1   8 xen-boot fail REGR. vs. 152332
 test-arm64-arm64-xl-seattle   8 xen-boot fail REGR. vs. 152332
 test-amd64-amd64-amd64-pvgrub 20 guest-stop  fail REGR. vs. 152332
 test-amd64-amd64-i386-pvgrub 20 guest-stop   fail REGR. vs. 152332
 test-armhf-armhf-xl-multivcpu  8 xen-bootfail REGR. vs. 152332
 test-armhf-armhf-xl-credit1   8 xen-boot fail REGR. vs. 152332
 test-armhf-armhf-xl-vhd   8 xen-boot fail REGR. vs. 152332
 test-armhf-armhf-xl-cubietruck  8 xen-boot   fail REGR. vs. 152332
 test-armhf-armhf-libvirt-raw  8 xen-boot fail REGR. vs. 152332
 test-armhf-armhf-libvirt  8 xen-boot fail REGR. vs. 152332
 test-armhf-armhf-examine  8 reboot   fail REGR. vs. 152332
 test-arm64-arm64-xl-xsm   8 xen-boot fail REGR. vs. 152332
 test-armhf-armhf-xl-credit2   8 xen-boot fail REGR. vs. 152332
 test-armhf-armhf-xl   8 xen-boot fail REGR. vs. 152332

Regressions which are regarded as allowable (not blocking):
 test-armhf-armhf-xl-rtds  8 xen-boot fail REGR. vs. 152332

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-qemut-win7-amd64 19 guest-stopfail like 152332
 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stopfail like 152332
 test-amd64-amd64-xl-qemut-ws16-amd64 19 guest-stopfail like 152332
 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stopfail like 152332
 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 152332
 test-amd64-amd64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt 15 migrate-support-checkfail   never pass
 

[OSSTEST PATCH] cr-daily-branch: Sort out retest build reuse

2020-11-13 Thread Ian Jackson
Retest flights ought to reuse precisely the builds from the flight
which prompts the retests.

mg-adjust-flight-makexrefs is the wrong tool for this job.  It can
often leave the retry flights with no build jobs and no references to
the main flights' build jobs, so the results are just broken jobs.

Signed-off-by: Ian Jackson 
---
 cr-daily-branch | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/cr-daily-branch b/cr-daily-branch
index e54ca227..6ec3aa62 100755
--- a/cr-daily-branch
+++ b/cr-daily-branch
@@ -529,9 +529,8 @@ END
$retry_jobs
)
 
-   ./mg-adjust-flight-makexrefs -v $rflight \
-   --branch=$branch --revision-osstest=$narness_rev \
-   '^build-*' --debug --blessings=real
+   ./cs-adjust-flight -D $rflight \
+   runvar-build-set . . "^$rflight\\." $flight.
 
if [ "${OSSTEST_SIMULATE_FAIL_RETRY+y}" = y ]; then
export OSSTEST_SIMULATE_FAIL="${OSSTEST_SIMULATE_FAIL_RETRY}"
-- 
2.20.1




Re: [PATCH V2 23/23] [RFC] libxl: Add support for virtio-disk configuration

2020-11-13 Thread Oleksandr



On 11.11.20 06:28, Wei Chen wrote:

Hi Oleksandr,


Hi Wei



An example of domain configuration (two disks are assigned to the guest,
the latter is in readonly mode):

vdisk = [ 'backend=DomD, disks=rw:/dev/mmcblk0p3;ro:/dev/mmcblk1p3' ]


Can we keep using the same 'disk' parameter for virtio-disk, but add an option like "model=virtio-disk"?
For example:
disk = [ 'backend=DomD, disks=rw:/dev/mmcblk0p3,model=virtio-disk' ]
Just like what Xen has done for x86 virtio-net.

I think this needs additional investigation. In general I agree with you that it would be nice to reuse the existing 'disk' parameter somehow rather than introducing a new one for the same purpose (handling virtual block devices).


One note: although both are used for the same purpose, they differ in at least one important option.

For example:
1. disk = [ 'backend=DomD, phy:/dev/mmcblk0p3, xvda1' ]
2. vdisk = [ 'backend=DomD, disks=rw:/dev/mmcblk0p3' ]
As you can see, the existing "disk" parameter contains xvda1, which means that
a new device /dev/xvda1 will appear on the guest side, but "vdisk"
doesn't contain anything similar. So with the Xen PV driver (xen-blkfront)

Yes, I understand your concerns. But I think specifying a device name
for a virtio disk is not a mandatory requirement. Even when using physical
disks on a bare-metal machine, we can't guarantee the slot#1 disk is always 'sda'.
So most modern OSes prefer to use blkid to mount filesystems.
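
The UUID-based approach Wei mentions can be illustrated with an /etc/fstab entry that stays stable regardless of whether the kernel names the disk sda, xvda1, or vda (the UUID below is a placeholder; the real value comes from running blkid against the device):

```
# Mount by filesystem UUID instead of a probe-order device name;
# 'blkid /dev/vda1' prints the UUID to use here.
UUID=0a1b2c3d-0000-0000-0000-000000000000  /data  ext4  defaults  0  2
```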


we are able to configure a device name, but with the VirtIO solution
(virtio-blk) we aren't (at least I don't know exactly how).


Just my understanding, not exactly accurate: virtio-blk cannot get VDEV
information from a bus the way xen-bus can, so the disk ID is allocated
dynamically during bus probing. The first probed disk gets ID 'a', and the
ID keeps increasing from there. If we want to specify the disk ID for a
virtio disk, one possible way is to construct a reasonable position on the
bus (the fdt node position of a virtio-mmio device, or the PCI function ID
of a virtio-pci block device) for it. But this is not always successful;
we can't skip 'vda' to specify a virtio disk as 'vdb'.


Thank you for the explanation. I got your point regarding using the same
disk parameter, and I think it makes sense. Right now I am focusing on
making the IOREQ transport available on Arm; after that (I hope) I will be
able to return to the virtio work.

--
Regards,

Oleksandr Tyshchenko




Re: [XEN PATCH] tools/xenstore: Log xenstored build ID on startup

2020-11-13 Thread Edwin Torok
On Fri, 2020-11-13 at 17:13 +, Andrew Cooper wrote:
> On 13/11/2020 16:56, Bjoern Doebel wrote:
> > On 13.11.20 16:36, Jürgen Groß wrote:
> > > On 13.11.20 15:18, Bjoern Doebel wrote:
> > > > Right now we do not have a mechanism to determine the version
> > > > of the
> > > > currently running xenstored at runtime. As xenstored runs
> > > > throughout
> > > > the
> > > > lifetime of a Xen host, this may lead to problems when newer
> > > > user space
> > > > builds are staged. Then, the running xenstored will no longer
> > > > match the
> > > > version of the installed xenstored.
> > > > 
> > > > To allow users to always identify the running version of
> > > > xenstored, add
> > > > a linker-generated unique build ID to every xenstored build.
> > > > Add
> > > > functionality to log this build ID into a file upon service
> > > > startup.
> > > > 
> > > > Signed-off-by: Bjoern Doebel 
> > > > Reviewed-by: Martin Mazein 
> > > > Reviewed-by: Paul Durrant 
> > > 
> > > No support for oxenstored or xenstore-stubdom?
> > 
> > Your suggestion further down will apparently help for stubdom. I do
> > not speak ocaml at all - how do we address this?
> 
> CC'ing Edwin and Christian who have done the bulk of the oxenstored
> recently.
> 
> It sounds like it might not be possible right now, but would be
> possible
> with a future plan to switch the Ocaml build system over to dune (the
> new/preferred Ocaml upstream toolchain).

See here what is possible with Dune:
https://dune.readthedocs.io/en/stable/dune-libs.html#build-info

Would the output of 'git describe --always --dirty' (perhaps combined
with a build date) serve as a useful build ID?

> 
> If it does end up being an XS_CONTROL sub-op, we can implement it at
> a
> future point when we can usefully answer the question.

Wouldn't using readelf on /proc//exe give you the running buildid?

readelf -a /usr/sbin/oxenstored /proc/$(pidof oxenstored)/exe | grep
'Build ID'
Build ID: bdd5304c8984ed22570d51308ae8717d03fe60ae
Build ID: bdd5304c8984ed22570d51308ae8717d03fe60ae

readelf -a /usr/sbin/oxenstored /proc/$(pidof oxenstored)/exe | grep
'Build ID'
Build ID: b44ff99b216db7526e3ee7841068d584cc9c2b95
Build ID: bdd5304c8984ed22570d51308ae8717d03fe60ae


When you're inside a stubdom it is probably not so easy though.

Best regards,
--Edwin

> 
> ~Andrew



Re: [XEN PATCH] tools/xenstore: Log xenstored build ID on startup

2020-11-13 Thread Bjoern Doebel

On 13.11.20 15:27, Jan Beulich wrote:

On 13.11.2020 15:18, Bjoern Doebel wrote:

--- a/tools/xenstore/Makefile
+++ b/tools/xenstore/Makefile
@@ -66,6 +66,10 @@ $(XENSTORED_OBJS): CFLAGS += $(SYSTEMD_CFLAGS)
  xenstored: LDFLAGS += $(SYSTEMD_LIBS)
  endif

+# xenstored: enforce creation of a buildID section and use a linker
+# script to add additional symbols around that section
+xenstored: LDFLAGS +=  -Wl,--build-id=sha1 -Wl,-T,buildid_symbols.ld

Since in the hypervisor build we're careful to not use --build-id
when the linker doesn't support it, I suppose the same care needs
applying here. See the setting of build_id_linker in ./Config.mk.


Ok, I will make this conditional on the setting of $(build_id_linker).
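
A minimal sketch of what that conditional might look like, assuming $(build_id_linker) from ./Config.mk is non-empty exactly when the linker supports build IDs (the exact flag spelling is an assumption; flags may need the -Wl, prefix when linking through the compiler driver):

```make
# Hypothetical sketch: only request a build ID and pull in the extra
# linker script when the toolchain supports --build-id.
ifneq ($(build_id_linker),)
xenstored: LDFLAGS += -Wl,--build-id=sha1 -Wl,-T,buildid_symbols.ld
endif
```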



Jan


Bjoern




Amazon Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrung: Christian Schlaeger, Jonathan Weiss
Eingetragen am Amtsgericht Charlottenburg unter HRB 149173 B
Sitz: Berlin
Ust-ID: DE 289 237 879




Re: [XEN PATCH] tools/xenstore: Log xenstored build ID on startup

2020-11-13 Thread Andrew Cooper
On 13/11/2020 16:56, Bjoern Doebel wrote:
> On 13.11.20 16:36, Jürgen Groß wrote:
>> On 13.11.20 15:18, Bjoern Doebel wrote:
>>> Right now we do not have a mechanism to determine the version of the
>>> currently running xenstored at runtime. As xenstored runs throughout
>>> the
>>> lifetime of a Xen host, this may lead to problems when newer user space
>>> builds are staged. Then, the running xenstored will no longer match the
>>> version of the installed xenstored.
>>>
>>> To allow users to always identify the running version of xenstored, add
>>> a linker-generated unique build ID to every xenstored build. Add
>>> functionality to log this build ID into a file upon service startup.
>>>
>>> Signed-off-by: Bjoern Doebel 
>>> Reviewed-by: Martin Mazein 
>>> Reviewed-by: Paul Durrant 
>>
>> No support for oxenstored or xenstore-stubdom?
>
> Your suggestion further down will apparently help for stubdom. I do
> not speak ocaml at all - how do we address this?

CC'ing Edwin and Christian who have done the bulk of the oxenstored
recently.

It sounds like it might not be possible right now, but would be possible
with a future plan to switch the Ocaml build system over to dune (the
new/preferred Ocaml upstream toolchain).

If it does end up being an XS_CONTROL sub-op, we can implement it at a
future point when we can usefully answer the question.

~Andrew



Re: [XEN PATCH] tools/xenstore: Log xenstored build ID on startup

2020-11-13 Thread Andrew Cooper
On 13/11/2020 16:55, Bjoern Doebel wrote:
>
> On 13.11.20 15:30, Andrew Cooper wrote:
>> On 13/11/2020 14:18, Bjoern Doebel wrote:
>>> Right now we do not have a mechanism to determine the version of the
>>> currently running xenstored at runtime. As xenstored runs throughout
>>> the
>>> lifetime of a Xen host, this may lead to problems when newer user space
>>> builds are staged. Then, the running xenstored will no longer match the
>>> version of the installed xenstored.
>>>
>>> To allow users to always identify the running version of xenstored, add
>>> a linker-generated unique build ID to every xenstored build. Add
>>> functionality to log this build ID into a file upon service startup.
>>>
>>> Signed-off-by: Bjoern Doebel 
>>> Reviewed-by: Martin Mazein 
>>> Reviewed-by: Paul Durrant 
>> I understand the problem you're trying to solve, but why is this
>> anything more than just enabling build-id's by default across tools/ ?
>>
>> There are already standard ways of interacting with the build id of
>> running executables on the system.  I'd strongly discourage doing
>> anything custom in xenstored specifically.
> May I ask what tooling you would use to interact with a running
> process' buildid?

Amongst other things, yes.  Although as Juergen points out, we want
something which works with stub domains as well, and "normal userspace
tools" won't cut it there.

I still think a first patch in this series should be to turn build-id's
on by default if supported by the toolchain, generally.

~Andrew



Re: [PATCH V2 16/23] xen/mm: Handle properly reference in set_foreign_p2m_entry() on Arm

2020-11-13 Thread Oleksandr



On 12.11.20 14:18, Jan Beulich wrote:

Hi Jan


On 15.10.2020 18:44, Oleksandr Tyshchenko wrote:

--- a/xen/arch/arm/p2m.c
+++ b/xen/arch/arm/p2m.c
@@ -1380,6 +1380,27 @@ int guest_physmap_remove_page(struct domain *d, gfn_t 
gfn, mfn_t mfn,
  return p2m_remove_mapping(d, gfn, (1 << page_order), mfn);
  }
  
+int set_foreign_p2m_entry(struct domain *d, const struct domain *fd,

+  unsigned long gfn, mfn_t mfn)
+{
+struct page_info *page = mfn_to_page(mfn);
+int rc;
+
+if ( !get_page(page, fd) )
+return -EINVAL;
+
+/*
+ * It is valid to always use p2m_map_foreign_rw here as if this gets
+ * called that d != fd. A case when d == fd would be rejected by
+ * rcu_lock_remote_domain_by_id() earlier.
+ */

Are you describing things here on the assumption that no new
callers may surface later on? To catch such, I'd recommend
adding at least a respective ASSERT().


Agree, will add.





+rc = guest_physmap_add_entry(d, _gfn(gfn), mfn, 0, p2m_map_foreign_rw);
+if ( rc )
+put_page(page);
+
+return 0;

I can't imagine it's correct to not signal the error to the
caller(s).


It is not correct; I just missed placing the 'return rc'. Thank you for
the catch.
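
For clarity, the corrected tail of set_foreign_p2m_entry() per the review would propagate rc rather than returning 0 unconditionally (a sketch of the fix under discussion, not the final patch):

```c
    rc = guest_physmap_add_entry(d, _gfn(gfn), mfn, 0, p2m_map_foreign_rw);
    if ( rc )
        put_page(page);

    return rc;   /* propagate the error instead of returning 0 */
}
```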






--- a/xen/common/memory.c
+++ b/xen/common/memory.c
@@ -1099,7 +1099,8 @@ static int acquire_resource(
   *reference counted, it is unsafe to allow mapping of
   *resource pages unless the caller is the hardware domain.
   */
-if ( paging_mode_translate(currd) && !is_hardware_domain(currd) )
+if ( paging_mode_translate(currd) && !is_hardware_domain(currd) &&
+ !arch_refcounts_p2m() )
  return -EACCES;

I guess the hook may want naming differently, as both prior parts of
the condition should be needed only on the x86 side, and there (for
PV) there's no p2m involved in the refcounting. Maybe
arch_acquire_resource_check()? And then the comment wants moving into
the x86 hook as well. You may want to consider leaving a more generic
one here...


Ok, again, I am fine with the name. I will follow everything suggested
above.


--
Regards,

Oleksandr Tyshchenko




Re: [XEN PATCH] tools/xenstore: Log xenstored build ID on startup

2020-11-13 Thread Bjoern Doebel

On 13.11.20 16:36, Jürgen Groß wrote:

On 13.11.20 15:18, Bjoern Doebel wrote:

Right now we do not have a mechanism to determine the version of the
currently running xenstored at runtime. As xenstored runs throughout the
lifetime of a Xen host, this may lead to problems when newer user space
builds are staged. Then, the running xenstored will no longer match the
version of the installed xenstored.

To allow users to always identify the running version of xenstored, add
a linker-generated unique build ID to every xenstored build. Add
functionality to log this build ID into a file upon service startup.

Signed-off-by: Bjoern Doebel 
Reviewed-by: Martin Mazein 
Reviewed-by: Paul Durrant 


No support for oxenstored or xenstore-stubdom?


Your suggestion further down will apparently help for stubdom. I do not 
speak ocaml at all - how do we address this?





---
  tools/hotplug/Linux/launch-xenstore.in |  2 +-
  tools/xenstore/Makefile    |  4 +++
  tools/xenstore/buildid_symbols.ld  | 10 +++
  tools/xenstore/xenstored_core.c    |  8 ++
  tools/xenstore/xenstored_core.h    |  3 ++
  tools/xenstore/xenstored_minios.c  |  4 +++
  tools/xenstore/xenstored_posix.c   | 52 
++

  7 files changed, 82 insertions(+), 1 deletion(-)
  create mode 100644 tools/xenstore/buildid_symbols.ld

diff --git a/tools/hotplug/Linux/launch-xenstore.in 
b/tools/hotplug/Linux/launch-xenstore.in

index 991dec8d25..a6f2254030 100644
--- a/tools/hotplug/Linux/launch-xenstore.in
+++ b/tools/hotplug/Linux/launch-xenstore.in
@@ -62,7 +62,7 @@ test -f @CONFIG_DIR@/@CONFIG_LEAF_DIR@/xencommons 
&& . @CONFIG_DIR@/@CONFIG_LEAF

  }
    echo -n Starting $XENSTORED...
-    $XENSTORED --pid-file @XEN_RUN_DIR@/xenstored.pid $XENSTORED_ARGS
+    $XENSTORED --pid-file @XEN_RUN_DIR@/xenstored.pid --buildid-file 
@XEN_RUN_DIR@/xenstored.buildid $XENSTORED_ARGS
    systemd-notify --booted 2>/dev/null || timeout_xenstore 
$XENSTORED || exit 1

  diff --git a/tools/xenstore/Makefile b/tools/xenstore/Makefile
index 9a0f0d012d..c63350980b 100644
--- a/tools/xenstore/Makefile
+++ b/tools/xenstore/Makefile
@@ -66,6 +66,10 @@ $(XENSTORED_OBJS): CFLAGS += $(SYSTEMD_CFLAGS)
  xenstored: LDFLAGS += $(SYSTEMD_LIBS)
  endif
  +# xenstored: enforce creation of a buildID section and use a linker
+# script to add additional symbols around that section
+xenstored: LDFLAGS +=  -Wl,--build-id=sha1 -Wl,-T,buildid_symbols.ld
+
  $(XENSTORED_OBJS): CFLAGS += $(CFLAGS_libxengnttab)
    xenstored: $(XENSTORED_OBJS)
diff --git a/tools/xenstore/buildid_symbols.ld 
b/tools/xenstore/buildid_symbols.ld

new file mode 100644
index 00..d74024c4e9
--- /dev/null
+++ b/tools/xenstore/buildid_symbols.ld
@@ -0,0 +1,10 @@
+SECTIONS
+{
+   __buildid_note_section = . ;
+   .note.gnu.build-id :
+   {
+   *(.note.gnu.build-id)
+   }
+   __buildid_end = . ;
+}
+INSERT AFTER .data
diff --git a/tools/xenstore/xenstored_core.c 
b/tools/xenstore/xenstored_core.c

index b4be374d3f..c6f107bdd9 100644
--- a/tools/xenstore/xenstored_core.c
+++ b/tools/xenstore/xenstored_core.c
@@ -1844,6 +1844,7 @@ static void usage(void)
      static struct option options[] = {
+    { "buildid-file", 1, NULL, 'B' },
  { "no-domain-init", 0, NULL, 'D' },
  { "entry-nb", 1, NULL, 'E' },
  { "pid-file", 1, NULL, 'F' },
@@ -1875,12 +1876,16 @@ int main(int argc, char *argv[])
  bool outputpid = false;
  bool no_domain_init = false;
  const char *pidfile = NULL;
+    const char *buildid_file = NULL;
  int timeout;
      while ((opt = getopt_long(argc, argv, "DE:F:HNPS:t:T:RVW:", 
options,


You are missing "B:" in the short options.

Ack. Fixing.



    NULL)) != -1) {
  switch (opt) {
+    case 'B':
+    buildid_file = optarg;
+    break;
  case 'D':
  no_domain_init = true;
  break;
@@ -1948,6 +1953,9 @@ int main(int argc, char *argv[])
  if (pidfile)
  write_pidfile(pidfile);
  +    if (buildid_file)
+    write_buildid_file(buildid_file);
+
  /* Talloc leak reports go to stderr, which is closed if we 
fork. */

  if (!dofork)
  talloc_enable_leak_report_full();


You should update the usage printout, too.

Ok.


diff --git a/tools/xenstore/xenstored_core.h 
b/tools/xenstore/xenstored_core.h

index 1df6ad94ab..712280626c 100644
--- a/tools/xenstore/xenstored_core.h
+++ b/tools/xenstore/xenstored_core.h
@@ -193,6 +193,9 @@ void xenbus_notify_running(void);
  /* Write out the pidfile */
  void write_pidfile(const char *pidfile);
  +/* Write the buildid file */
+void write_buildid_file(const char *buildidfile);
+
  /* Fork but do not close terminal FDs */
  void daemonize(void);
  /* Close stdin/stdout/stderr to complete daemonize */
diff --git a/tools/xenstore/xenstored_minios.c 
b/tools/xenstore/xenstored_minios.c

index c94493e52a..ef1151aee4 100644
--- 

Re: [XEN PATCH] tools/xenstore: Log xenstored build ID on startup

2020-11-13 Thread Bjoern Doebel


On 13.11.20 15:30, Andrew Cooper wrote:

On 13/11/2020 14:18, Bjoern Doebel wrote:

Right now we do not have a mechanism to determine the version of the
currently running xenstored at runtime. As xenstored runs throughout the
lifetime of a Xen host, this may lead to problems when newer user space
builds are staged. Then, the running xenstored will no longer match the
version of the installed xenstored.

To allow users to always identify the running version of xenstored, add
a linker-generated unique build ID to every xenstored build. Add
functionality to log this build ID into a file upon service startup.

Signed-off-by: Bjoern Doebel 
Reviewed-by: Martin Mazein 
Reviewed-by: Paul Durrant 

I understand the problem you're trying to solve, but why is this
anything more than just enabling build-id's by default across tools/ ?

There are already standard ways of interacting with the build id of
running executables on the system.  I'd strongly discourage doing
anything custom in xenstored specifically.
May I ask what tooling you would use to interact with a running process' 
buildid?

~Andrew


Bjoern








Re: dom0 PVH: 'entry->arch.pirq != INVALID_PIRQ' failed at vmsi.c:843

2020-11-13 Thread Manuel Bouyer
I just noticed that CPU0 also stops receiving clock interrupts - which may
explain why the kernel wedges. I can still enter NetBSD's debugger,
which means that console interrupts are still working (the console's event
channel is also handled by CPU 0).
A 'q' in Xen doesn't show any pending or masked events, for any CPU.

The NetBSD interrupt counters show event channel 2's counter (the CPU0 clock)
stuck at 13, while others are happily increasing.

Any idea what to look at from here ?

-- 
Manuel Bouyer 
 NetBSD: 26 ans d'experience feront toujours la difference
--



Re: [PATCH V2 20/23] xen/ioreq: Make x86's send_invalidate_req() common

2020-11-13 Thread Oleksandr



On 12.11.20 13:55, Jan Beulich wrote:

Hi Jan


On 15.10.2020 18:44, Oleksandr Tyshchenko wrote:

--- a/xen/include/asm-x86/hvm/io.h
+++ b/xen/include/asm-x86/hvm/io.h
@@ -97,7 +97,6 @@ bool relocate_portio_handler(
  unsigned int size);
  
  void send_timeoffset_req(unsigned long timeoff);

-void send_invalidate_req(void);
  bool handle_mmio_with_translation(unsigned long gla, unsigned long gpfn,
struct npfec);
  bool handle_pio(uint16_t port, unsigned int size, int dir);
diff --git a/xen/include/xen/ioreq.h b/xen/include/xen/ioreq.h
index 0679fef..aad682f 100644
--- a/xen/include/xen/ioreq.h
+++ b/xen/include/xen/ioreq.h
@@ -126,6 +126,7 @@ struct ioreq_server *select_ioreq_server(struct domain *d,
  int send_ioreq(struct ioreq_server *s, ioreq_t *proto_p,
 bool buffered);
  unsigned int broadcast_ioreq(ioreq_t *p, bool buffered);
+void send_invalidate_ioreq(void);

Again while renaming this function anyway could we see about giving
it a suitable and consistent name? Maybe
ioreq_request_mapcache_invalidate() or (to avoid the double "request")
ioreq_signal_mapcache_invalidate()? Maybe even ioreq_server_...().

IIRC, send_invalidate_ioreq() was one of the names proposed during the V1 review.
But to follow the new scheme, I am fine with either the first or the second.

--
Regards,

Oleksandr Tyshchenko




Re: dom0 PVH: 'entry->arch.pirq != INVALID_PIRQ' failed at vmsi.c:843

2020-11-13 Thread Manuel Bouyer
On Fri, Nov 13, 2020 at 04:14:28PM +0100, Manuel Bouyer wrote:
> I can try a kernel without this driver, to see if other devices reports
> interrupt.

This didn't change anything; I don't get more interrupts with this driver
commented out.

-- 
Manuel Bouyer 
 NetBSD: 26 ans d'experience feront toujours la difference
--



[linux-5.4 test] 156722: tolerable FAIL - PUSHED

2020-11-13 Thread osstest service owner
flight 156722 linux-5.4 real [real]
http://logs.test-lab.xenproject.org/osstest/logs/156722/

Failures :-/ but no regressions.

Tests which are failing intermittently (not blocking):
 test-amd64-i386-xl-xsm 20 guest-localmigrate/x10 fail in 156677 pass in 156722
 test-amd64-i386-libvirt-xsm  21 guest-start.2  fail pass in 156677

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stopfail like 156412
 test-amd64-i386-xl-qemut-win7-amd64 19 guest-stop fail like 156412
 test-amd64-amd64-xl-qemut-win7-amd64 19 guest-stopfail like 156412
 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 156412
 test-armhf-armhf-libvirt 16 saverestore-support-checkfail  like 156412
 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stopfail like 156412
 test-amd64-amd64-xl-qemut-ws16-amd64 19 guest-stopfail like 156412
 test-amd64-i386-xl-qemuu-win7-amd64 19 guest-stop fail like 156412
 test-amd64-i386-xl-qemut-ws16-amd64 19 guest-stop fail like 156412
 test-armhf-armhf-libvirt-raw 15 saverestore-support-checkfail  like 156412
 test-amd64-i386-xl-qemuu-ws16-amd64 19 guest-stop fail like 156412
 test-amd64-i386-xl-pvshim14 guest-start  fail   never pass
 test-amd64-amd64-libvirt 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  16 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  15 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt  15 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check fail never pass
 test-arm64-arm64-xl  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  16 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-vhd 14 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 15 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 16 saverestore-support-checkfail  never pass
 test-armhf-armhf-xl-credit2  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 15 migrate-support-checkfail never pass
 test-armhf-armhf-xl-cubietruck 16 saverestore-support-checkfail never pass
 test-armhf-armhf-xl-credit1  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt 15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  14 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  15 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt-raw 14 migrate-support-checkfail   never pass

version targeted for testing:
 linux2544d06afd8d060f35b159809274e4b7477e63e8
baseline version:
 linux6e97ed6efa701db070da0054b055c085895aba86

Last test of basis   156412  2020-11-05 11:13:59 Z8 days
Failing since156620  2020-11-10 12:10:31 Z3 days3 attempts
Testing same since   156677  2020-11-11 07:50:18 Z2 days2 attempts


People who touched revisions under test:
  

Re: [PATCH V2 12/23] xen/ioreq: Remove "hvm" prefixes from involved function names

2020-11-13 Thread Oleksandr



On 12.11.20 13:45, Jan Beulich wrote:

Hi Jan.


On 15.10.2020 18:44, Oleksandr Tyshchenko wrote:

From: Oleksandr Tyshchenko 

This patch removes "hvm" prefixes and infixes from IOREQ-related
function names in the common code.

At least some of the functions touched here would be nice to be
moved to a more consistent new naming scheme right away, to
avoid having to touch all the same places again. I guess ioreq
server functions would be nice to all start with ioreq_server_
and ioreq functions to all start with ioreq_. E.g. ioreq_send()
and ioreq_server_select().


As initially discussed at patch #1, OK, I will prepare a list of the
proposed renamings so that we can find common ground, I hope.
After that I will make the required renaming.

--
Regards,

Oleksandr Tyshchenko




Re: [PATCH V2 14/23] arm/ioreq: Introduce arch specific bits for IOREQ/DM features

2020-11-13 Thread Oleksandr



On 12.11.20 13:48, Jan Beulich wrote:

Hi Jan


On 15.10.2020 18:44, Oleksandr Tyshchenko wrote:

--- a/xen/common/ioreq.c
+++ b/xen/common/ioreq.c
@@ -18,6 +18,7 @@
  
  #include 

  #include 
+#include 
  #include 
  #include 
  #include 

There preferably wouldn't be a need to touch non-Arm code in this
patch. I suppose the added #include could easily be introduced
e.g. while moving the file?


ok


--
Regards,

Oleksandr Tyshchenko




Re: [XEN PATCH] tools/xenstore: Log xenstored build ID on startup

2020-11-13 Thread Jürgen Groß

On 13.11.20 15:18, Bjoern Doebel wrote:

Right now we do not have a mechanism to determine the version of the
currently running xenstored at runtime. As xenstored runs throughout the
lifetime of a Xen host, this may lead to problems when newer user space
builds are staged. Then, the running xenstored will no longer match the
version of the installed xenstored.

To allow users to always identify the running version of xenstored, add
a linker-generated unique build ID to every xenstored build. Add
functionality to log this build ID into a file upon service startup.

Signed-off-by: Bjoern Doebel 
Reviewed-by: Martin Mazein 
Reviewed-by: Paul Durrant 


No support for oxenstored or xenstore-stubdom?


---
  tools/hotplug/Linux/launch-xenstore.in |  2 +-
  tools/xenstore/Makefile|  4 +++
  tools/xenstore/buildid_symbols.ld  | 10 +++
  tools/xenstore/xenstored_core.c|  8 ++
  tools/xenstore/xenstored_core.h|  3 ++
  tools/xenstore/xenstored_minios.c  |  4 +++
  tools/xenstore/xenstored_posix.c   | 52 ++
  7 files changed, 82 insertions(+), 1 deletion(-)
  create mode 100644 tools/xenstore/buildid_symbols.ld

diff --git a/tools/hotplug/Linux/launch-xenstore.in 
b/tools/hotplug/Linux/launch-xenstore.in
index 991dec8d25..a6f2254030 100644
--- a/tools/hotplug/Linux/launch-xenstore.in
+++ b/tools/hotplug/Linux/launch-xenstore.in
@@ -62,7 +62,7 @@ test -f @CONFIG_DIR@/@CONFIG_LEAF_DIR@/xencommons && . 
@CONFIG_DIR@/@CONFIG_LEAF
}
  
  	echo -n Starting $XENSTORED...

-   $XENSTORED --pid-file @XEN_RUN_DIR@/xenstored.pid $XENSTORED_ARGS
+   $XENSTORED --pid-file @XEN_RUN_DIR@/xenstored.pid --buildid-file 
@XEN_RUN_DIR@/xenstored.buildid $XENSTORED_ARGS
  
  	systemd-notify --booted 2>/dev/null || timeout_xenstore $XENSTORED || exit 1
  
diff --git a/tools/xenstore/Makefile b/tools/xenstore/Makefile

index 9a0f0d012d..c63350980b 100644
--- a/tools/xenstore/Makefile
+++ b/tools/xenstore/Makefile
@@ -66,6 +66,10 @@ $(XENSTORED_OBJS): CFLAGS += $(SYSTEMD_CFLAGS)
  xenstored: LDFLAGS += $(SYSTEMD_LIBS)
  endif
  
+# xenstored: enforce creation of a buildID section and use a linker

+# script to add additional symbols around that section
+xenstored: LDFLAGS +=  -Wl,--build-id=sha1 -Wl,-T,buildid_symbols.ld
+
  $(XENSTORED_OBJS): CFLAGS += $(CFLAGS_libxengnttab)
  
  xenstored: $(XENSTORED_OBJS)

diff --git a/tools/xenstore/buildid_symbols.ld 
b/tools/xenstore/buildid_symbols.ld
new file mode 100644
index 00..d74024c4e9
--- /dev/null
+++ b/tools/xenstore/buildid_symbols.ld
@@ -0,0 +1,10 @@
+SECTIONS
+{
+   __buildid_note_section = . ;
+   .note.gnu.build-id :
+   {
+   *(.note.gnu.build-id)
+   }
+   __buildid_end = . ;
+}
+INSERT AFTER .data
diff --git a/tools/xenstore/xenstored_core.c b/tools/xenstore/xenstored_core.c
index b4be374d3f..c6f107bdd9 100644
--- a/tools/xenstore/xenstored_core.c
+++ b/tools/xenstore/xenstored_core.c
@@ -1844,6 +1844,7 @@ static void usage(void)
  
  
  static struct option options[] = {

+   { "buildid-file", 1, NULL, 'B' },
{ "no-domain-init", 0, NULL, 'D' },
{ "entry-nb", 1, NULL, 'E' },
{ "pid-file", 1, NULL, 'F' },
@@ -1875,12 +1876,16 @@ int main(int argc, char *argv[])
bool outputpid = false;
bool no_domain_init = false;
const char *pidfile = NULL;
+   const char *buildid_file = NULL;
int timeout;
  
  
  	while ((opt = getopt_long(argc, argv, "DE:F:HNPS:t:T:RVW:", options,


You are missing "B:" in the short options.


  NULL)) != -1) {
switch (opt) {
+   case 'B':
+   buildid_file = optarg;
+   break;
case 'D':
no_domain_init = true;
break;
@@ -1948,6 +1953,9 @@ int main(int argc, char *argv[])
if (pidfile)
write_pidfile(pidfile);
  
+	if (buildid_file)

+   write_buildid_file(buildid_file);
+
/* Talloc leak reports go to stderr, which is closed if we fork. */
if (!dofork)
talloc_enable_leak_report_full();


You should update the usage printout, too.


diff --git a/tools/xenstore/xenstored_core.h b/tools/xenstore/xenstored_core.h
index 1df6ad94ab..712280626c 100644
--- a/tools/xenstore/xenstored_core.h
+++ b/tools/xenstore/xenstored_core.h
@@ -193,6 +193,9 @@ void xenbus_notify_running(void);
  /* Write out the pidfile */
  void write_pidfile(const char *pidfile);
  
+/* Write the buildid file */
+void write_buildid_file(const char *buildidfile);
+
  /* Fork but do not close terminal FDs */
  void daemonize(void);
  /* Close stdin/stdout/stderr to complete daemonize */
diff --git a/tools/xenstore/xenstored_minios.c b/tools/xenstore/xenstored_minios.c
index c94493e52a..ef1151aee4 100644
--- a/tools/xenstore/xenstored_minios.c
+++ 

[libvirt test] 156731: regressions - FAIL

2020-11-13 Thread osstest service owner
flight 156731 libvirt real [real]
http://logs.test-lab.xenproject.org/osstest/logs/156731/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-armhf-libvirt   6 libvirt-buildfail REGR. vs. 151777
 build-amd64-libvirt   6 libvirt-buildfail REGR. vs. 151777
 build-i386-libvirt6 libvirt-buildfail REGR. vs. 151777
 build-arm64-libvirt   6 libvirt-buildfail REGR. vs. 151777

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-pair  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a
 test-amd64-amd64-libvirt-vhd  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt   1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt-pair  1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a
 test-amd64-i386-libvirt-xsm   1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt  1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt-qcow2  1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt  1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt-raw  1 build-check(1)   blocked  n/a

version targeted for testing:
 libvirt  09ba97ad6b4aa725364ed5e5c48031829dc9549f
baseline version:
 libvirt  2c846fa6bcc11929c9fb857a22430fb9945654ad

Last test of basis   151777  2020-07-10 04:19:19 Z  126 days
Failing since151818  2020-07-11 04:18:52 Z  125 days  120 attempts
Testing same since   156731  2020-11-13 04:19:03 Z0 days1 attempts


People who touched revisions under test:
  Adolfo Jayme Barrientos 
  Aleksandr Alekseev 
  Andika Triwidada 
  Andrea Bolognani 
  Balázs Meskó 
  Bastien Orivel 
  Bihong Yu 
  Binfeng Wu 
  Boris Fiuczynski 
  Brian Turek 
  Christian Ehrhardt 
  Christian Schoenebeck 
  Cole Robinson 
  Collin Walling 
  Cornelia Huck 
  Côme Borsoi 
  Daniel Henrique Barboza 
  Daniel Letai 
  Daniel P. Berrange 
  Daniel P. Berrangé 
  Erik Skultety 
  Fabian Freyer 
  Fangge Jin 
  Fedora Weblate Translation 
  Göran Uddeborg 
  Halil Pasic 
  Han Han 
  Hao Wang 
  Ian Wienand 
  Jamie Strandboge 
  Jamie Strandboge 
  Jean-Baptiste Holcroft 
  Jianan Gao 
  Jim Fehlig 
  Jin Yan 
  Jiri Denemark 
  Jonathon Jongsma 
  Julio Faracco 
  Ján Tomko 
  Kashyap Chamarthy 
  Kevin Locke 
  Laine Stump 
  Liao Pingfang 
  Lin Ma 
  Lin Ma 
  Lin Ma 
  Marc Hartmayer 
  Marek Marczykowski-Górecki 
  Markus Schade 
  Martin Kletzander 
  Masayoshi Mizuma 
  Matt Coleman 
  Matt Coleman 
  Mauro Matteo Cascella 
  Michal Privoznik 
  Michał Smyk 
  Milo Casagrande 
  Neal Gompa 
  Nico Pache 
  Nikolay Shirokovskiy 
  Olesya Gerasimenko 
  Orion Poplawski 
  Patrick Magauran 
  Paulo de Rezende Pinatti 
  Pavel Hrdina 
  Peter Krempa 
  Pino Toscano 
  Pino Toscano 
  Piotr Drąg 
  Prathamesh Chavan 
  Ricky Tigg 
  Roman Bogorodskiy 
  Roman Bolshakov 
  Ryan Schmidt 
  Sam Hartman 
  Scott Shambarger 
  Sebastian Mitterle 
  Simon Gaiser 
  Stefan Bader 
  Stefan Berger 
  Szymon Scholz 
  Thomas Huth 
  Tim Wiederhake 
  Tomáš Golembiovský 
  Wang Xin 
  Weblate 
  Yang Hang 
  Yanqiu Zhang 
  Yi Li 
  Yi Wang 
  Yuri Chornoivan 
  Zheng Chuan 
  zhenwei pi 
  Zhenyu Zheng 

jobs:
 build-amd64-xsm  pass
 build-arm64-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-arm64  pass
 build-armhf  pass
 build-i386   pass
 build-amd64-libvirt  fail
 build-arm64-libvirt  fail
 build-armhf-libvirt  fail
 build-i386-libvirt   fail
 build-amd64-pvopspass
 build-arm64-pvopspass
 build-armhf-pvopspass
 build-i386-pvops pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm   blocked 
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsmblocked 
 

Re: dom0 PVH: 'entry->arch.pirq != INVALID_PIRQ' failed at vmsi.c:843

2020-11-13 Thread Manuel Bouyer
On Fri, Nov 13, 2020 at 03:35:13PM +0100, Roger Pau Monné wrote:
> Forgot to mention, it might also be helpful to boot Xen with
> iommu=debug, just in case.

I put the serial console log at
http://www-soc.lip6.fr/~bouyer/xen-log2.txt
in case it helps

-- 
Manuel Bouyer 
 NetBSD: 26 years of experience will always make the difference
--



Re: dom0 PVH: 'entry->arch.pirq != INVALID_PIRQ' failed at vmsi.c:843

2020-11-13 Thread Manuel Bouyer
On Fri, Nov 13, 2020 at 03:33:49PM +0100, Roger Pau Monné wrote:
> On Fri, Nov 13, 2020 at 12:54:57PM +0100, Manuel Bouyer wrote:
> > On Thu, Nov 12, 2020 at 09:19:39PM +0100, Roger Pau Monné wrote:
> > > The following might be able to get you going, but I think I need to
> > > refine the logic a bit there, will have to give it some thought.
> > 
> > I also tested with xen devel (Xen version 4.15-unstable, Latest ChangeSet: 
> > Wed Nov 4 09:27:22 2020 +0100 git:9ff9705647-dirty).
> > Your patch is needed there too to avoid the panic.
> > 
> > As with 4.13, I have problems with interrupts not being properly
> > delivered. The strange thing is that the counter is not 0, but 3 (with 4.13)
> > or 2 (with 4.15), which would mean that interrupts stop being delivered
> > at some point in the setup process. Maybe something to do with mask/unmask ?
> > 
> > The problematic interrupt is identified as "ioapic2 pin 2" by the NetBSD
> > kernel, so it's not MSI/MSI-X (not sure it matters though).
> > Maybe something related to mask/unmask ?
> 
> What device do you have on that pin?

The PERC H740P raid controller.

> Is it the only device not working
> properly?

I'm not sure, as the kernel is stalling because of this.
The other devices' interrupt counters are 0.
I can try a kernel without this driver, to see if other devices report
interrupts.

> I get from that that MSI/MSI-X is now working fine.

See above.

> 
> You can get some interrupt info from the 'i' and the 'z' debug keys,
> albeit that won't reflect the state of the emulated IO-APIC used by
> dom0, which is likely what we care about. There's also the 'M' debug
> key, but that's only useful for MSI/MSI-X.
> 
> I can try to prepare a patch to dump some info from the emulated
> IO-APIC, but I'm afraid I won't get to it until Monday.

No problem.

-- 
Manuel Bouyer 
 NetBSD: 26 years of experience will always make the difference
--



Re: [PATCH V2 10/23] xen/mm: Make x86's XENMEM_resource_ioreq_server handling common

2020-11-13 Thread Oleksandr



On 12.11.20 13:40, Jan Beulich wrote:

Hi Jan


On 15.10.2020 18:44, Oleksandr Tyshchenko wrote:

--- a/xen/common/memory.c
+++ b/xen/common/memory.c
@@ -30,6 +30,10 @@
  #include 
  #include 
  
+#ifdef CONFIG_IOREQ_SERVER
+#include 
+#endif

Preferably #ifdef-s would not be needed here. If any, they'd better
live in xen/ioreq.h itself then.


ok





@@ -1045,6 +1049,38 @@ static int acquire_grant_table(struct domain *d, 
unsigned int id,
  return 0;
  }
  
+#ifdef CONFIG_IOREQ_SERVER

To limit the number of #ifdef-s, could this be moved ...


+static int acquire_ioreq_server(struct domain *d,
+unsigned int id,
+unsigned long frame,
+unsigned int nr_frames,
+xen_pfn_t mfn_list[])
+{

... here such that ...


@@ -1103,9 +1139,14 @@ static int acquire_resource(
   mfn_list);
  break;
  
+#ifdef CONFIG_IOREQ_SERVER
+case XENMEM_resource_ioreq_server:
+rc = acquire_ioreq_server(d, xmar.id, xmar.frame, xmar.nr_frames,
+  mfn_list);
+break;
+#endif

... the ones here then can be dropped?


I think yes, that would be better.





  default:

Also you'll want to add a blank line between the new case statement and
the "default:".


ok

--
Regards,

Oleksandr Tyshchenko




Re: [PATCH 06/10] vpci: Make every domain handle its own BARs

2020-11-13 Thread Oleksandr Andrushchenko

On 11/13/20 4:51 PM, Jan Beulich wrote:
> On 13.11.2020 15:44, Oleksandr Andrushchenko wrote:
>> On 11/13/20 4:38 PM, Jan Beulich wrote:
>>> On 13.11.2020 15:32, Oleksandr Andrushchenko wrote:
 On 11/13/20 4:23 PM, Jan Beulich wrote:
> Earlier on I didn't say you should get this to work, only
> that I think the general logic around what you add shouldn't make
> things more arch specific than they really should be. That said,
> something similar to the above should still be doable on x86,
> utilizing struct pci_seg's bus2bridge[]. There ought to be
> DEV_TYPE_PCI_HOST_BRIDGE entries there, albeit a number of them
> (provided by the CPUs themselves rather than the chipset) aren't
> really host bridges for the purposes you're after.
 You mean I can still use DEV_TYPE_PCI_HOST_BRIDGE as a marker

 while trying to detect what I need?
>>> I'm afraid I don't understand what marker you're thinking about
>>> here.
>> I mean that when I go over bus2bridge entries, should I only work with
>>
>> those who have DEV_TYPE_PCI_HOST_BRIDGE?
> Well, if you're after host bridges - yes.
Ok, I'll try to see what I can do about it.
>
> Jan

Thank you,

Oleksandr


Re: [PATCH 06/10] vpci: Make every domain handle its own BARs

2020-11-13 Thread Jan Beulich
On 13.11.2020 15:44, Oleksandr Andrushchenko wrote:
> 
> On 11/13/20 4:38 PM, Jan Beulich wrote:
>> On 13.11.2020 15:32, Oleksandr Andrushchenko wrote:
>>> On 11/13/20 4:23 PM, Jan Beulich wrote:
Earlier on I didn't say you should get this to work, only
 that I think the general logic around what you add shouldn't make
 things more arch specific than they really should be. That said,
 something similar to the above should still be doable on x86,
 utilizing struct pci_seg's bus2bridge[]. There ought to be
 DEV_TYPE_PCI_HOST_BRIDGE entries there, albeit a number of them
 (provided by the CPUs themselves rather than the chipset) aren't
 really host bridges for the purposes you're after.
>>> You mean I can still use DEV_TYPE_PCI_HOST_BRIDGE as a marker
>>>
>>> while trying to detect what I need?
>> I'm afraid I don't understand what marker you're thinking about
>> here.
> 
> I mean that when I go over bus2bridge entries, should I only work with
> 
> those who have DEV_TYPE_PCI_HOST_BRIDGE?

Well, if you're after host bridges - yes.

Jan



Re: [PATCH 06/10] vpci: Make every domain handle its own BARs

2020-11-13 Thread Oleksandr Andrushchenko

On 11/13/20 4:38 PM, Jan Beulich wrote:
> On 13.11.2020 15:32, Oleksandr Andrushchenko wrote:
>> On 11/13/20 4:23 PM, Jan Beulich wrote:
>>> On 13.11.2020 13:41, Oleksandr Andrushchenko wrote:
 On 11/13/20 1:35 PM, Jan Beulich wrote:
> On 13.11.2020 12:02, Oleksandr Andrushchenko wrote:
>> On 11/13/20 12:50 PM, Jan Beulich wrote:
>>> On 13.11.2020 11:46, Oleksandr Andrushchenko wrote:
 On 11/13/20 12:25 PM, Jan Beulich wrote:
> On 13.11.2020 07:32, Oleksandr Andrushchenko wrote:
>> I'll try to replace is_hardware_domain with something like:
>>
>> +/*
>> + * Detect whether physical PCI devices in this segment belong
>> + * to the domain given, e.g. on x86 all PCI devices live in hwdom,
>> + * but in case of ARM this might not be the case: those may also
>> + * live in driver domains or even Xen itself.
>> + */
>> +bool pci_is_hardware_domain(struct domain *d, u16 seg)
>> +{
>> +#ifdef CONFIG_X86
>> +    return is_hardware_domain(d);
>> +#elif CONFIG_ARM
>> +    return pci_is_owner_domain(d, seg);
>> +#else
>> +#error "Unsupported architecture"
>> +#endif
>> +}
>> +
>> +/*
>> + * Get domain which owns this segment: for x86 this is always 
>> hardware
>> + * domain and for ARM this can be different.
>> + */
>> +struct domain *pci_get_hardware_domain(u16 seg)
>> +{
>> +#ifdef CONFIG_X86
>> +    return hardware_domain;
>> +#elif CONFIG_ARM
>> +    return pci_get_owner_domain(seg);
>> +#else
>> +#error "Unsupported architecture"
>> +#endif
>> +}
>>
>> This is what I use to properly detect the domain that really owns 
>> physical host bridge
> I consider this problematic. We should try to not let Arm's and x86'es
> PCI implementations diverge too much, i.e. at least the underlying 
> basic
> model would better be similar. For example, if entire segments can be
> assigned to a driver domain on Arm, why should the same not be 
> possible
> on x86?
 Good question, probably in this case x86 == ARM and I can use

 pci_is_owner_domain for both architectures instead of using 
 is_hardware_domain for x86

> Furthermore I'm suspicious about segments being the right granularity
> here. On ia64 multiple host bridges could (and typically would) live
> on segment 0. Iirc I had once seen output from an x86 system which was
> apparently laid out similarly. Therefore, just like for MCFG, I think
> the granularity wants to be bus ranges within a segment.
 Can you please suggest something we can use as a hint for such a 
 detection logic?
>>> The underlying information comes from ACPI tables, iirc. I don't
>>> recall the details, though - sorry.
>> Ok, so seg + bus should be enough for both ARM and x86 then, right?
>>
>> pci_get_hardware_domain(u16 seg, u8 bus)
> Whether an individual bus number can suitably express things I can't
> tell; I did say bus range, but of course if you care about just
> individual devices, then a single bus number will of course do.
 I can implement the lookup whether a PCI host bridge owned by a particular

 domain with something like:

 struct pci_host_bridge *bridge = pci_find_host_bridge(seg, bus);

 return bridge->dt_node->used_by == d->domain_id;

 Could you please give me a hint how this can be done on x86?
>>> Bridges can't be assigned to other than the hardware domain right
>>> now.
>> So, I can probably then have pci_get_hardware_domain implemented
>>
>> by both ARM and x86 in their arch specific code. And for x86 for now
>>
>> it can simply be a wrapper for is_hardware_domain?
> "get" can't be a wrapper for "is"
Sure, s/get/is
> , but I think I get what you're
> saying. But no, preferably I would not see you do this (hence my
> earlier comment). I still think you could use the owner of the
> respective device; I may be mistaken, but iirc we do assign
> bridges to Dom0, so deriving from that would be better than
> hardwiring to is_hardware_domain().
Ok, I'll try to figure out how to do that
>
>>>Earlier on I didn't say you should get this to work, only
>>> that I think the general logic around what you add shouldn't make
>>> things more arch specific than they really should be. That said,
>>> something similar to the above should still be doable on x86,
>>> utilizing struct pci_seg's bus2bridge[]. There ought to be
>>> DEV_TYPE_PCI_HOST_BRIDGE entries there, albeit a number of them
>>> (provided by the CPUs themselves rather than the chipset) aren't
>>> really host bridges for the purposes you're after.
>> You mean I can still use DEV_TYPE_PCI_HOST_BRIDGE as a marker

Re: [PATCH 06/10] vpci: Make every domain handle its own BARs

2020-11-13 Thread Jan Beulich
On 13.11.2020 15:32, Oleksandr Andrushchenko wrote:
> 
> On 11/13/20 4:23 PM, Jan Beulich wrote:
>> On 13.11.2020 13:41, Oleksandr Andrushchenko wrote:
>>> On 11/13/20 1:35 PM, Jan Beulich wrote:
 On 13.11.2020 12:02, Oleksandr Andrushchenko wrote:
> On 11/13/20 12:50 PM, Jan Beulich wrote:
>> On 13.11.2020 11:46, Oleksandr Andrushchenko wrote:
>>> On 11/13/20 12:25 PM, Jan Beulich wrote:
 On 13.11.2020 07:32, Oleksandr Andrushchenko wrote:
> I'll try to replace is_hardware_domain with something like:
>
> +/*
> + * Detect whether physical PCI devices in this segment belong
> + * to the domain given, e.g. on x86 all PCI devices live in hwdom,
> + * but in case of ARM this might not be the case: those may also
> + * live in driver domains or even Xen itself.
> + */
> +bool pci_is_hardware_domain(struct domain *d, u16 seg)
> +{
> +#ifdef CONFIG_X86
> +    return is_hardware_domain(d);
> +#elif CONFIG_ARM
> +    return pci_is_owner_domain(d, seg);
> +#else
> +#error "Unsupported architecture"
> +#endif
> +}
> +
> +/*
> + * Get domain which owns this segment: for x86 this is always 
> hardware
> + * domain and for ARM this can be different.
> + */
> +struct domain *pci_get_hardware_domain(u16 seg)
> +{
> +#ifdef CONFIG_X86
> +    return hardware_domain;
> +#elif CONFIG_ARM
> +    return pci_get_owner_domain(seg);
> +#else
> +#error "Unsupported architecture"
> +#endif
> +}
>
> This is what I use to properly detect the domain that really owns 
> physical host bridge
 I consider this problematic. We should try to not let Arm's and x86'es
 PCI implementations diverge too much, i.e. at least the underlying 
 basic
 model would better be similar. For example, if entire segments can be
 assigned to a driver domain on Arm, why should the same not be possible
 on x86?
>>> Good question, probably in this case x86 == ARM and I can use
>>>
>>> pci_is_owner_domain for both architectures instead of using 
>>> is_hardware_domain for x86
>>>
 Furthermore I'm suspicious about segments being the right granularity
 here. On ia64 multiple host bridges could (and typically would) live
 on segment 0. Iirc I had once seen output from an x86 system which was
 apparently laid out similarly. Therefore, just like for MCFG, I think
 the granularity wants to be bus ranges within a segment.
>>> Can you please suggest something we can use as a hint for such a 
>>> detection logic?
>> The underlying information comes from ACPI tables, iirc. I don't
>> recall the details, though - sorry.
> Ok, so seg + bus should be enough for both ARM and x86 then, right?
>
> pci_get_hardware_domain(u16 seg, u8 bus)
 Whether an individual bus number can suitably express things I can't
 tell; I did say bus range, but of course if you care about just
 individual devices, then a single bus number will of course do.
>>> I can implement the lookup whether a PCI host bridge owned by a particular
>>>
>>> domain with something like:
>>>
>>> struct pci_host_bridge *bridge = pci_find_host_bridge(seg, bus);
>>>
>>> return bridge->dt_node->used_by == d->domain_id;
>>>
>>> Could you please give me a hint how this can be done on x86?
>> Bridges can't be assigned to other than the hardware domain right
>> now.
> 
> So, I can probably then have pci_get_hardware_domain implemented
> 
> by both ARM and x86 in their arch specific code. And for x86 for now
> 
> it can simply be a wrapper for is_hardware_domain?

"get" can't be a wrapper for "is", but I think I get what you're
saying. But no, preferably I would not see you do this (hence my
earlier comment). I still think you could use the owner of the
respective device; I may be mistaken, but iirc we do assign
bridges to Dom0, so deriving from that would be better than
hardwiring to is_hardware_domain().

>>   Earlier on I didn't say you should get this to work, only
>> that I think the general logic around what you add shouldn't make
>> things more arch specific than they really should be. That said,
>> something similar to the above should still be doable on x86,
>> utilizing struct pci_seg's bus2bridge[]. There ought to be
>> DEV_TYPE_PCI_HOST_BRIDGE entries there, albeit a number of them
>> (provided by the CPUs themselves rather than the chipset) aren't
>> really host bridges for the purposes you're after.
> 
> You mean I can still use DEV_TYPE_PCI_HOST_BRIDGE as a marker
> 
> while trying to detect what I need?

I'm afraid I don't understand what marker you're thinking about
here.

Jan



Re: dom0 PVH: 'entry->arch.pirq != INVALID_PIRQ' failed at vmsi.c:843

2020-11-13 Thread Roger Pau Monné
On Fri, Nov 13, 2020 at 12:54:57PM +0100, Manuel Bouyer wrote:
> On Thu, Nov 12, 2020 at 09:19:39PM +0100, Roger Pau Monné wrote:
> > The following might be able to get you going, but I think I need to
> > refine the logic a bit there, will have to give it some thought.
> 
> I also tested with xen devel (Xen version 4.15-unstable, Latest ChangeSet: 
> Wed Nov 4 09:27:22 2020 +0100 git:9ff9705647-dirty).
> Your patch is needed there too to avoid the panic.
> 
> As with 4.13, I have problems with interrupts not being properly
> delivered. The strange thing is that the counter is not 0, but 3 (with 4.13)
> or 2 (with 4.15), which would mean that interrupts stop being delivered
> at some point in the setup process. Maybe something to do with mask/unmask ?
> 
> The problematic interrupt is identified as "ioapic2 pin 2" by the NetBSD kernel,
> so it's not MSI/MSI-X (not sure it matters though).
> Maybe something related to mask/unmask ?

Forgot to mention, it might also be helpful to boot Xen with
iommu=debug, just in case.

Roger.



Re: dom0 PVH: 'entry->arch.pirq != INVALID_PIRQ' failed at vmsi.c:843

2020-11-13 Thread Roger Pau Monné
On Fri, Nov 13, 2020 at 12:54:57PM +0100, Manuel Bouyer wrote:
> On Thu, Nov 12, 2020 at 09:19:39PM +0100, Roger Pau Monné wrote:
> > The following might be able to get you going, but I think I need to
> > refine the logic a bit there, will have to give it some thought.
> 
> I also tested with xen devel (Xen version 4.15-unstable, Latest ChangeSet: 
> Wed Nov 4 09:27:22 2020 +0100 git:9ff9705647-dirty).
> Your patch is needed there too to avoid the panic.
> 
> As with 4.13, I have problems with interrupts not being properly
> delivered. The strange thing is that the counter is not 0, but 3 (with 4.13)
> or 2 (with 4.15), which would mean that interrupts stop being delivered
> at some point in the setup process. Maybe something to do with mask/unmask ?
> 
> The problematic interrupt is identified as "ioapic2 pin 2" by the NetBSD kernel,
> so it's not MSI/MSI-X (not sure it matters though).
> Maybe something related to mask/unmask ?

What device do you have on that pin? Is it the only device not working
properly? I get from that that MSI/MSI-X is now working fine.

You can get some interrupt info from the 'i' and the 'z' debug keys,
albeit that won't reflect the state of the emulated IO-APIC used by
dom0, which is likely what we care about. There's also the 'M' debug
key, but that's only useful for MSI/MSI-X.

I can try to prepare a patch to dump some info from the emulated
IO-APIC, but I'm afraid I won't get to it until Monday.

Roger.



Re: [PATCH 06/10] vpci: Make every domain handle its own BARs

2020-11-13 Thread Oleksandr Andrushchenko

On 11/13/20 4:23 PM, Jan Beulich wrote:
> On 13.11.2020 13:41, Oleksandr Andrushchenko wrote:
>> On 11/13/20 1:35 PM, Jan Beulich wrote:
>>> On 13.11.2020 12:02, Oleksandr Andrushchenko wrote:
 On 11/13/20 12:50 PM, Jan Beulich wrote:
> On 13.11.2020 11:46, Oleksandr Andrushchenko wrote:
>> On 11/13/20 12:25 PM, Jan Beulich wrote:
>>> On 13.11.2020 07:32, Oleksandr Andrushchenko wrote:
 I'll try to replace is_hardware_domain with something like:

 +/*
 + * Detect whether physical PCI devices in this segment belong
 + * to the domain given, e.g. on x86 all PCI devices live in hwdom,
 + * but in case of ARM this might not be the case: those may also
 + * live in driver domains or even Xen itself.
 + */
 +bool pci_is_hardware_domain(struct domain *d, u16 seg)
 +{
 +#ifdef CONFIG_X86
 +    return is_hardware_domain(d);
 +#elif CONFIG_ARM
 +    return pci_is_owner_domain(d, seg);
 +#else
 +#error "Unsupported architecture"
 +#endif
 +}
 +
 +/*
 + * Get domain which owns this segment: for x86 this is always hardware
 + * domain and for ARM this can be different.
 + */
 +struct domain *pci_get_hardware_domain(u16 seg)
 +{
 +#ifdef CONFIG_X86
 +    return hardware_domain;
 +#elif CONFIG_ARM
 +    return pci_get_owner_domain(seg);
 +#else
 +#error "Unsupported architecture"
 +#endif
 +}

 This is what I use to properly detect the domain that really owns 
 physical host bridge
>>> I consider this problematic. We should try to not let Arm's and x86'es
>>> PCI implementations diverge too much, i.e. at least the underlying basic
>>> model would better be similar. For example, if entire segments can be
>>> assigned to a driver domain on Arm, why should the same not be possible
>>> on x86?
>> Good question, probably in this case x86 == ARM and I can use
>>
>> pci_is_owner_domain for both architectures instead of using 
>> is_hardware_domain for x86
>>
>>> Furthermore I'm suspicious about segments being the right granularity
>>> here. On ia64 multiple host bridges could (and typically would) live
>>> on segment 0. Iirc I had once seen output from an x86 system which was
>>> apparently laid out similarly. Therefore, just like for MCFG, I think
>>> the granularity wants to be bus ranges within a segment.
>> Can you please suggest something we can use as a hint for such a 
>> detection logic?
> The underlying information comes from ACPI tables, iirc. I don't
> recall the details, though - sorry.
 Ok, so seg + bus should be enough for both ARM and x86 then, right?

 pci_get_hardware_domain(u16 seg, u8 bus)
>>> Whether an individual bus number can suitably express things I can't
>>> tell; I did say bus range, but of course if you care about just
>>> individual devices, then a single bus number will of course do.
>> I can implement the lookup whether a PCI host bridge owned by a particular
>>
>> domain with something like:
>>
>> struct pci_host_bridge *bridge = pci_find_host_bridge(seg, bus);
>>
>> return bridge->dt_node->used_by == d->domain_id;
>>
>> Could you please give me a hint how this can be done on x86?
> Bridges can't be assigned to other than the hardware domain right
> now.

So, I can probably then have pci_get_hardware_domain implemented

by both ARM and x86 in their arch specific code. And for x86 for now

it can simply be a wrapper for is_hardware_domain?

>   Earlier on I didn't say you should get this to work, only
> that I think the general logic around what you add shouldn't make
> things more arch specific than they really should be. That said,
> something similar to the above should still be doable on x86,
> utilizing struct pci_seg's bus2bridge[]. There ought to be
> DEV_TYPE_PCI_HOST_BRIDGE entries there, albeit a number of them
> (provided by the CPUs themselves rather than the chipset) aren't
> really host bridges for the purposes you're after.

You mean I can still use DEV_TYPE_PCI_HOST_BRIDGE as a marker

while trying to detect what I need?

>
> Jan

Thank you,

Oleksandr


Re: [PATCH V2 01/23] x86/ioreq: Prepare IOREQ feature for making it common

2020-11-13 Thread Jan Beulich
On 13.11.2020 13:45, Oleksandr wrote:
> On 13.11.20 13:20, Jan Beulich wrote:
>> On 13.11.2020 12:09, Oleksandr wrote:
>>> On 12.11.20 12:58, Jan Beulich wrote:
 On 15.10.2020 18:44, Oleksandr Tyshchenko wrote:
> @@ -855,7 +841,7 @@ int hvm_destroy_ioreq_server(struct domain *d, 
> ioservid_t id)
>
>domain_pause(d);
>
> -p2m_set_ioreq_server(d, 0, s);
> +arch_hvm_destroy_ioreq_server(s);
 Iirc there are plans to rename hvm_destroy_ioreq_server() in the
 course of the generalization. If so, this arch hook would imo
 better be named following the new scheme right away.
>>> Could you please clarify, are you speaking about the plans discussed there
>>>
>>> "[PATCH V2 12/23] xen/ioreq: Remove "hvm" prefixes from involved
>>> function names"?
>>>
>>> Copy text for the convenience:
>>> AT least some of the functions touched here would be nice to be
>>> moved to a more consistent new naming scheme right away, to
>>> avoid having to touch all the same places again. I guess ioreq
>>> server functions would be nice to all start with ioreq_server_
>>> and ioreq functions to all start with ioreq_. E.g. ioreq_send()
>>> and ioreq_server_select().
>>>
>>> or some other plans I am not aware of?
>>>
>>>
>>> What I really want to avoid with IOREQ enabling work is the round-trips
>>> related to naming things, this patch series
>>> became quite big (and consumes som time to rebase and test it) and I
>>> expect it to become bigger.
>>>
>>> So the arch_hvm_destroy_ioreq_server() should be
>>> arch_ioreq_server_destroy()?
>> I think so, yes. If you want to avoid doing full patches, how
>> about you simply list the functions / variables you plan to
>> rename alongside the intended new names? Would likely be easier
>> for all involved parties.
> I think it is a good idea. I will prepare a list once I analyze all new 
> comments to this series.
> As I understand that only global IOREQ functions need renaming according 
> to the new scheme,
> local ones can be left as is but without "hvm" prefixes of course?

Please apply common sense: Static ones, if you have to drop a
hvm_ prefix, may in some cases better further be renamed as well,
when their names aren't really in line with their purpose
(anymore). But yes, following a consistent naming model is more
relevant for non-static functions.

Jan



Re: [XEN PATCH] tools/xenstore: Log xenstored build ID on startup

2020-11-13 Thread Andrew Cooper
On 13/11/2020 14:18, Bjoern Doebel wrote:
> Right now we do not have a mechanism to determine the version of the
> currently running xenstored at runtime. As xenstored runs throughout the
> lifetime of a Xen host, this may lead to problems when newer user space
> builds are staged. Then, the running xenstored will no longer match the
> version of the installed xenstored.
>
> To allow users to always identify the running version of xenstored, add
> a linker-generated unique build ID to every xenstored build. Add
> functionality to log this build ID into a file upon service startup.
>
> Signed-off-by: Bjoern Doebel 
> Reviewed-by: Martin Mazein 
> Reviewed-by: Paul Durrant 

I understand the problem you're trying to solve, but why is this
anything more than just enabling build-id's by default across tools/ ?

There are already standard ways of interacting with the build id of
running executables on the system.  I'd strongly discourage doing
anything custom in xenstored specifically.

~Andrew



Re: [PATCH V2 09/23] xen/dm: Make x86's DM feature common

2020-11-13 Thread Oleksandr



On 12.11.20 13:32, Jan Beulich wrote:

Hi Jan


On 15.10.2020 18:44, Oleksandr Tyshchenko wrote:

From: Julien Grall 

As a lot of x86 code can be re-used on Arm later on, this patch
splits devicemodel support into common and arch specific parts.

The common DM feature is supposed to be built with IOREQ_SERVER
option enabled (as well as the IOREQ feature), which is selected
for x86's config HVM for now.

Did you consider doing it the other way around? It would seem
more natural to have the top level dm-op handling arch-specific
and call into e.g. ioreq_server_dm_op() for otherwise unhandled
ops, just like e.g. do_domctl() calls into iommu_do_domctl()
(indirectly via arch_do_domctl()). I ask because in the long
run I expect the ioreq server sub-ops to only be a small part
of the overall set of dm-ops; already now it's 7 out of 18 if
I got the counting right.

This would then also leave compat_dm_op() in x86 code.

But yes, there are also downsides with this alternative.



No, I didn't consider. I separated the proposed DM changes from Julien's 
patch without modifying the logic.
My changes on top (except rebasing of course) are updating XSM code and 
introducing a xen/dm.h for definitions.


--
Regards,

Oleksandr Tyshchenko




Re: [XEN PATCH] tools/xenstore: Log xenstored build ID on startup

2020-11-13 Thread Jan Beulich
On 13.11.2020 15:18, Bjoern Doebel wrote:
> --- a/tools/xenstore/Makefile
> +++ b/tools/xenstore/Makefile
> @@ -66,6 +66,10 @@ $(XENSTORED_OBJS): CFLAGS += $(SYSTEMD_CFLAGS)
>  xenstored: LDFLAGS += $(SYSTEMD_LIBS)
>  endif
>  
> +# xenstored: enforce creation of a buildID section and use a linker
> +# script to add additional symbols around that section
> +xenstored: LDFLAGS +=  -Wl,--build-id=sha1 -Wl,-T,buildid_symbols.ld

Since in the hypervisor build we're careful to not use --build-id
when the linker doesn't support it, I suppose the same care needs
applying here. See the setting of build_id_linker in ./Config.mk.

Jan
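
For reference, a hedged sketch of what consuming such a build-id note involves:
format the ELF note found between two linker-provided bounds as a hex string.
In the proposed patch those bounds would be the __buildid_note_section and
__buildid_end symbols from buildid_symbols.ld; the note layout follows elf(5)
(a 12-byte header, the name padded to a 4-byte boundary, then the id bytes).
The function name below is made up for illustration:

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

struct elf_note {
    uint32_t namesz;
    uint32_t descsz;
    uint32_t type;    /* NT_GNU_BUILD_ID == 3 */
};

/* Format the build id found in [start, end) as lowercase hex into buf.
 * Returns the number of characters written, or -1 on a malformed note. */
int format_build_id(const char *start, const char *end,
                    char *buf, size_t len)
{
    const struct elf_note *n = (const struct elf_note *)start;
    const unsigned char *desc;
    uint32_t i;
    size_t off = 0;

    if ( end - start < (long)sizeof(*n) )
        return -1;

    /* The name ("GNU\0") is padded to a 4-byte boundary; the descriptor
     * carrying the build-id bytes follows it. */
    desc = (const unsigned char *)(n + 1) + ((n->namesz + 3) & ~3u);

    for ( i = 0; i < n->descsz && off + 3 <= len; i++ )
        off += snprintf(buf + off, len - off, "%02x", desc[i]);

    return (int)off;
}
```

A write_buildid_file() implementation could then dump this string to the file
named by --buildid-file, guarded (per the remark above) by a build-time check
that the linker actually supports --build-id.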



Re: [PATCH 06/10] vpci: Make every domain handle its own BARs

2020-11-13 Thread Jan Beulich
On 13.11.2020 13:41, Oleksandr Andrushchenko wrote:
> 
> On 11/13/20 1:35 PM, Jan Beulich wrote:
>> On 13.11.2020 12:02, Oleksandr Andrushchenko wrote:
>>> On 11/13/20 12:50 PM, Jan Beulich wrote:
 On 13.11.2020 11:46, Oleksandr Andrushchenko wrote:
> On 11/13/20 12:25 PM, Jan Beulich wrote:
>> On 13.11.2020 07:32, Oleksandr Andrushchenko wrote:
>>> I'll try to replace is_hardware_domain with something like:
>>>
>>> +/*
>>> + * Detect whether physical PCI devices in this segment belong
>>> + * to the domain given, e.g. on x86 all PCI devices live in hwdom,
>>> + * but in case of ARM this might not be the case: those may also
>>> + * live in driver domains or even Xen itself.
>>> + */
>>> +bool pci_is_hardware_domain(struct domain *d, u16 seg)
>>> +{
>>> +#ifdef CONFIG_X86
>>> +    return is_hardware_domain(d);
>>> +#elif defined(CONFIG_ARM)
>>> +    return pci_is_owner_domain(d, seg);
>>> +#else
>>> +#error "Unsupported architecture"
>>> +#endif
>>> +}
>>> +
>>> +/*
>>> + * Get domain which owns this segment: for x86 this is always hardware
>>> + * domain and for ARM this can be different.
>>> + */
>>> +struct domain *pci_get_hardware_domain(u16 seg)
>>> +{
>>> +#ifdef CONFIG_X86
>>> +    return hardware_domain;
>>> +#elif defined(CONFIG_ARM)
>>> +    return pci_get_owner_domain(seg);
>>> +#else
>>> +#error "Unsupported architecture"
>>> +#endif
>>> +}
>>>
>>> This is what I use to properly detect the domain that really owns 
>>> physical host bridge
>> I consider this problematic. We should try to not let Arm's and x86'es
>> PCI implementations diverge too much, i.e. at least the underlying basic
>> model would better be similar. For example, if entire segments can be
>> assigned to a driver domain on Arm, why should the same not be possible
>> on x86?
> Good question, probably in this case x86 == ARM and I can use
>
> pci_is_owner_domain for both architectures instead of using 
> is_hardware_domain for x86
>
>> Furthermore I'm suspicious about segments being the right granularity
>> here. On ia64 multiple host bridges could (and typically would) live
>> on segment 0. Iirc I had once seen output from an x86 system which was
>> apparently laid out similarly. Therefore, just like for MCFG, I think
>> the granularity wants to be bus ranges within a segment.
> Can you please suggest something we can use as a hint for such a 
> detection logic?
 The underlying information comes from ACPI tables, iirc. I don't
 recall the details, though - sorry.
>>> Ok, so seg + bus should be enough for both ARM and Xen then, right?
>>>
>>> pci_get_hardware_domain(u16 seg, u8 bus)
>> Whether an individual bus number can suitable express things I can't
>> tell; I did say bus range, but of course if you care about just
>> individual devices, then a single bus number will of course do.
> 
> I can implement the lookup whether a PCI host bridge owned by a particular
> 
> domain with something like:
> 
> struct pci_host_bridge *bridge = pci_find_host_bridge(seg, bus);
> 
> return bridge->dt_node->used_by == d->domain_id;
> 
> Could you please give me a hint how this can be done on x86?

Bridges can't be assigned to other than the hardware domain right
now. Earlier on I didn't say you should get this to work, only
that I think the general logic around what you add shouldn't make
things more arch specific than they really should be. That said,
something similar to the above should still be doable on x86,
utilizing struct pci_seg's bus2bridge[]. There ought to be
DEV_TYPE_PCI_HOST_BRIDGE entries there, albeit a number of them
(provided by the CPUs themselves rather than the chipset) aren't
really host bridges for the purposes you're after.

Jan
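
The (segment, bus-range) granularity discussed above could be sketched with a
simple ownership table; everything here is illustrative (the table, function
name, and domain ids are made up — on x86 the data would instead come from
struct pci_seg's bus2bridge[]), except DOMID_INVALID, which matches Xen's
public value:

```c
#include <stdint.h>
#include <stddef.h>

typedef uint16_t domid_t;
#define DOMID_INVALID ((domid_t)0x7FF4)

struct bridge_owner {
    uint16_t seg;
    uint8_t bus_start, bus_end;   /* inclusive bus range on this segment */
    domid_t owner;
};

/* Hypothetical layout: two host bridges on segment 0, owned by
 * different domains — the case segments alone cannot express. */
static const struct bridge_owner bridges[] = {
    { 0, 0x00, 0x3f, 0 },   /* hwdom owns buses 0x00-0x3f of segment 0 */
    { 0, 0x40, 0x7f, 1 },   /* a driver domain owns buses 0x40-0x7f */
};

domid_t pci_bus_owner(uint16_t seg, uint8_t bus)
{
    size_t i;

    for ( i = 0; i < sizeof(bridges) / sizeof(bridges[0]); i++ )
        if ( bridges[i].seg == seg &&
             bus >= bridges[i].bus_start && bus <= bridges[i].bus_end )
            return bridges[i].owner;

    return DOMID_INVALID;
}
```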



[XEN PATCH] tools/xenstore: Log xenstored build ID on startup

2020-11-13 Thread Bjoern Doebel
Right now we do not have a mechanism to determine the version of the
currently running xenstored at runtime. As xenstored runs throughout the
lifetime of a Xen host, this may lead to problems when newer user space
builds are staged. Then, the running xenstored will no longer match the
version of the installed xenstored.

To allow users to always identify the running version of xenstored, add
a linker-generated unique build ID to every xenstored build. Add
functionality to log this build ID into a file upon service startup.

Signed-off-by: Bjoern Doebel 
Reviewed-by: Martin Mazein 
Reviewed-by: Paul Durrant 
---
 tools/hotplug/Linux/launch-xenstore.in |  2 +-
 tools/xenstore/Makefile|  4 +++
 tools/xenstore/buildid_symbols.ld  | 10 +++
 tools/xenstore/xenstored_core.c|  8 ++
 tools/xenstore/xenstored_core.h|  3 ++
 tools/xenstore/xenstored_minios.c  |  4 +++
 tools/xenstore/xenstored_posix.c   | 52 ++
 7 files changed, 82 insertions(+), 1 deletion(-)
 create mode 100644 tools/xenstore/buildid_symbols.ld

diff --git a/tools/hotplug/Linux/launch-xenstore.in 
b/tools/hotplug/Linux/launch-xenstore.in
index 991dec8d25..a6f2254030 100644
--- a/tools/hotplug/Linux/launch-xenstore.in
+++ b/tools/hotplug/Linux/launch-xenstore.in
@@ -62,7 +62,7 @@ test -f @CONFIG_DIR@/@CONFIG_LEAF_DIR@/xencommons && . 
@CONFIG_DIR@/@CONFIG_LEAF
}
 
echo -n Starting $XENSTORED...
-   $XENSTORED --pid-file @XEN_RUN_DIR@/xenstored.pid $XENSTORED_ARGS
+   $XENSTORED --pid-file @XEN_RUN_DIR@/xenstored.pid --buildid-file 
@XEN_RUN_DIR@/xenstored.buildid $XENSTORED_ARGS
 
systemd-notify --booted 2>/dev/null || timeout_xenstore $XENSTORED || 
exit 1
 
diff --git a/tools/xenstore/Makefile b/tools/xenstore/Makefile
index 9a0f0d012d..c63350980b 100644
--- a/tools/xenstore/Makefile
+++ b/tools/xenstore/Makefile
@@ -66,6 +66,10 @@ $(XENSTORED_OBJS): CFLAGS += $(SYSTEMD_CFLAGS)
 xenstored: LDFLAGS += $(SYSTEMD_LIBS)
 endif
 
+# xenstored: enforce creation of a buildID section and use a linker
+# script to add additional symbols around that section
+xenstored: LDFLAGS +=  -Wl,--build-id=sha1 -Wl,-T,buildid_symbols.ld
+
 $(XENSTORED_OBJS): CFLAGS += $(CFLAGS_libxengnttab)
 
 xenstored: $(XENSTORED_OBJS)
diff --git a/tools/xenstore/buildid_symbols.ld 
b/tools/xenstore/buildid_symbols.ld
new file mode 100644
index 00..d74024c4e9
--- /dev/null
+++ b/tools/xenstore/buildid_symbols.ld
@@ -0,0 +1,10 @@
+SECTIONS
+{
+   __buildid_note_section = . ;
+   .note.gnu.build-id :
+   {
+   *(.note.gnu.build-id)
+   }
+   __buildid_end = . ;
+}
+INSERT AFTER .data
diff --git a/tools/xenstore/xenstored_core.c b/tools/xenstore/xenstored_core.c
index b4be374d3f..c6f107bdd9 100644
--- a/tools/xenstore/xenstored_core.c
+++ b/tools/xenstore/xenstored_core.c
@@ -1844,6 +1844,7 @@ static void usage(void)
 
 
 static struct option options[] = {
+   { "buildid-file", 1, NULL, 'B' },
{ "no-domain-init", 0, NULL, 'D' },
{ "entry-nb", 1, NULL, 'E' },
{ "pid-file", 1, NULL, 'F' },
@@ -1875,12 +1876,16 @@ int main(int argc, char *argv[])
bool outputpid = false;
bool no_domain_init = false;
const char *pidfile = NULL;
+   const char *buildid_file = NULL;
int timeout;
 
 
while ((opt = getopt_long(argc, argv, "DE:F:HNPS:t:T:RVW:", options,
  NULL)) != -1) {
switch (opt) {
+   case 'B':
+   buildid_file = optarg;
+   break;
case 'D':
no_domain_init = true;
break;
@@ -1948,6 +1953,9 @@ int main(int argc, char *argv[])
if (pidfile)
write_pidfile(pidfile);
 
+   if (buildid_file)
+   write_buildid_file(buildid_file);
+
/* Talloc leak reports go to stderr, which is closed if we fork. */
if (!dofork)
talloc_enable_leak_report_full();
diff --git a/tools/xenstore/xenstored_core.h b/tools/xenstore/xenstored_core.h
index 1df6ad94ab..712280626c 100644
--- a/tools/xenstore/xenstored_core.h
+++ b/tools/xenstore/xenstored_core.h
@@ -193,6 +193,9 @@ void xenbus_notify_running(void);
 /* Write out the pidfile */
 void write_pidfile(const char *pidfile);
 
+/* Write the buildid file */
+void write_buildid_file(const char *buildidfile);
+
 /* Fork but do not close terminal FDs */
 void daemonize(void);
 /* Close stdin/stdout/stderr to complete daemonize */
diff --git a/tools/xenstore/xenstored_minios.c 
b/tools/xenstore/xenstored_minios.c
index c94493e52a..ef1151aee4 100644
--- a/tools/xenstore/xenstored_minios.c
+++ b/tools/xenstore/xenstored_minios.c
@@ -24,6 +24,10 @@ void write_pidfile(const char *pidfile)
 {
 }
 
+void write_buildid_file(const char *buildid_file)
+{
+}
+
 void daemonize(void)
 {
 }
diff --git 

Re: [PATCH V2 07/23] xen/ioreq: Move x86's ioreq_gfn(server) to struct domain

2020-11-13 Thread Oleksandr



On 12.11.20 13:21, Jan Beulich wrote:

Hi Jan


On 15.10.2020 18:44, Oleksandr Tyshchenko wrote:

--- a/xen/include/asm-x86/hvm/ioreq.h
+++ b/xen/include/asm-x86/hvm/ioreq.h
@@ -77,7 +77,7 @@ static inline int hvm_map_mem_type_to_ioreq_server(struct 
domain *d,
  if ( flags & ~XEN_DMOP_IOREQ_MEM_ACCESS_WRITE )
  return -EINVAL;
  
-spin_lock_recursive(&d->arch.hvm.ioreq_server.lock);

+spin_lock_recursive(&d->ioreq_server.lock);
  
  s = get_ioreq_server(d, id);
  
@@ -92,7 +92,7 @@ static inline int hvm_map_mem_type_to_ioreq_server(struct domain *d,

  rc = p2m_set_ioreq_server(d, flags, s);
  
   out:

-spin_unlock_recursive(&d->arch.hvm.ioreq_server.lock);
+spin_unlock_recursive(&d->ioreq_server.lock);
  
  if ( rc == 0 && flags == 0 )

  {


Does this build at this point, when !CONFIG_IOREQ_SERVER? Patch 1
moves the code here without guards, and patch 2, when introducing
the Kconfig symbol, doesn't add guards here. I admit I didn't
check further intermediate patches.

Yes.


I can confirm I checked x86 patch by patch with CONFIG_IOREQ_SERVER; as 
for !CONFIG_IOREQ_SERVER I can't recollect to be 100% sure, but I likely 
tested that patch by patch as well.

Anyway, I have just rechecked.
Probably it is because this header isn't in use with 
!CONFIG_IOREQ_SERVER, since all its users are x86/hvm/* and common/ioreq.c



--
Regards,

Oleksandr Tyshchenko




Re: [PATCH V2 02/23] xen/ioreq: Make x86's IOREQ feature common

2020-11-13 Thread Oleksandr



On 12.11.20 13:11, Jan Beulich wrote:

Hi Jan


On 15.10.2020 18:44, Oleksandr Tyshchenko wrote:

From: Oleksandr Tyshchenko 

As a lot of x86 code can be re-used on Arm later on, this patch
moves previously prepared x86/hvm/ioreq.c to the common code.

The common IOREQ feature is supposed to be built with IOREQ_SERVER
option enabled, which is selected for x86's config HVM for now.

In order to avoid having a gigantic patch here, the subsequent
patches will update remaining bits in the common code step by step:
- Make IOREQ related structs/materials common
- Drop the "hvm" prefixes and infixes
- Remove layering violation by moving corresponding fields
   out of *arch.hvm* or abstracting away accesses to them

This support is going to be used on Arm to be able run device
emulator outside of Xen hypervisor.

Signed-off-by: Oleksandr Tyshchenko 
CC: Julien Grall 

---
Please note, this is a split/cleanup/hardening of Julien's PoC:
"Add support for Guest IO forwarding to a device emulator"

***
Please note, this patch depends on the following which is
on review:
https://patchwork.kernel.org/patch/11816689/
***

Changes RFC -> V1:
- was split into three patches:
  - x86/ioreq: Prepare IOREQ feature for making it common
  - xen/ioreq: Make x86's IOREQ feature common
  - xen/ioreq: Make x86's hvm_ioreq_needs_completion() common
- update MAINTAINERS file
- do not use a separate subdir for the IOREQ stuff, move it to:
  - xen/common/ioreq.c
  - xen/include/xen/ioreq.h
- update x86's files to include xen/ioreq.h
- remove unneeded headers in arch/x86/hvm/ioreq.c
- re-order the headers alphabetically in common/ioreq.c
- update common/ioreq.c according to the newly introduced arch functions:
  arch_hvm_destroy_ioreq_server()/arch_handle_hvm_io_completion()

Changes V1 -> V2:
- update patch description
- make everything needed in the previous patch to achieve
  a true rename here
- don't include unnecessary headers from asm-x86/hvm/ioreq.h
  and xen/ioreq.h
- use __XEN_IOREQ_H__ instead of __IOREQ_H__
- move get_ioreq_server() to common/ioreq.c
---
  MAINTAINERS |8 +-
  xen/arch/x86/Kconfig|1 +
  xen/arch/x86/hvm/Makefile   |1 -
  xen/arch/x86/hvm/ioreq.c| 1422 ---
  xen/arch/x86/mm.c   |2 +-
  xen/arch/x86/mm/shadow/common.c |2 +-
  xen/common/Kconfig  |3 +
  xen/common/Makefile |1 +
  xen/common/ioreq.c  | 1422 +++
  xen/include/asm-x86/hvm/ioreq.h |   39 +-
  xen/include/xen/ioreq.h |   71 ++
  11 files changed, 1509 insertions(+), 1463 deletions(-)
  delete mode 100644 xen/arch/x86/hvm/ioreq.c
  create mode 100644 xen/common/ioreq.c
  create mode 100644 xen/include/xen/ioreq.h

Iirc I've previously asked to make sure the diff here gets created with
git's rename detection enabled, so we wouldn't see 1422 lines deleted
and 1422 lines added, _hoping_ they're all the same (or going through
the extra steps needed to compare old and new code), but instead seeing
just the diff between old and new files (which in the ideal case would
then be empty).


This is my fault, I misread this last time. I have just rechecked this 
patch again.


git config diff.renames false

 MAINTAINERS |    8 +-
 xen/arch/x86/Kconfig    |    1 +
 xen/arch/x86/hvm/Makefile   |    1 -
 xen/arch/x86/hvm/ioreq.c    | 1544 


 xen/arch/x86/mm.c   |    2 +-
 xen/arch/x86/mm/shadow/common.c |    2 +-
 xen/common/Kconfig  |    3 +
 xen/common/Makefile |    1 +
 xen/common/ioreq.c  | 1544 


 xen/include/asm-x86/hvm/ioreq.h |   38 +
 xen/include/xen/ioreq.h |   73 +
 11 files changed, 1633 insertions(+), 1584 deletions(-)

git config diff.renames true

 MAINTAINERS  |  8 +++-
 xen/arch/x86/Kconfig |  1 +
 xen/arch/x86/hvm/Makefile    |  1 -
 xen/arch/x86/mm.c    |  2 +-
 xen/arch/x86/mm/shadow/common.c  |  2 +-
 xen/common/Kconfig   |  3 +++
 xen/common/Makefile  |  1 +
 xen/{arch/x86/hvm => common}/ioreq.c |  0
 xen/include/asm-x86/hvm/ioreq.h  | 38 
++
 xen/include/xen/ioreq.h  | 73 
+

 10 files changed, 89 insertions(+), 40 deletions(-)

Although it is not needed for the patch in question anymore (since there 

Re: [PATCH V2 01/23] x86/ioreq: Prepare IOREQ feature for making it common

2020-11-13 Thread Oleksandr



On 13.11.20 13:20, Jan Beulich wrote:

Hi Jan.


On 13.11.2020 12:09, Oleksandr wrote:

On 12.11.20 12:58, Jan Beulich wrote:

On 15.10.2020 18:44, Oleksandr Tyshchenko wrote:

@@ -855,7 +841,7 @@ int hvm_destroy_ioreq_server(struct domain *d, ioservid_t 
id)
   
   domain_pause(d);
   
-p2m_set_ioreq_server(d, 0, s);

+arch_hvm_destroy_ioreq_server(s);

Iirc there are plans to rename hvm_destroy_ioreq_server() in the
course of the generalization. If so, this arch hook would imo
better be named following the new scheme right away.

Could you please clarify, are you speaking about the plans discussed there

"[PATCH V2 12/23] xen/ioreq: Remove "hvm" prefixes from involved
function names"?

Copy text for the convenience:
AT least some of the functions touched here would be nice to be
moved to a more consistent new naming scheme right away, to
avoid having to touch all the same places again. I guess ioreq
server functions would be nice to all start with ioreq_server_
and ioreq functions to all start with ioreq_. E.g. ioreq_send()
and ioreq_server_select().

or some other plans I am not aware of?


What I really want to avoid with the IOREQ enabling work is the round-trips
related to naming things; this patch series became quite big (and consumes
some time to rebase and test), and I expect it to become bigger.

So the arch_hvm_destroy_ioreq_server() should be
arch_ioreq_server_destroy()?

I think so, yes. If you want to avoid doing full patches, how
about you simply list the functions / variables you plan to
rename alongside the intended new names? Would likely be easier
for all involved parties.
I think it is a good idea. I will prepare a list once I have analyzed all new 
comments on this series.
As I understand it, only global IOREQ functions need renaming according 
to the new scheme;

local ones can be left as-is, but without "hvm" prefixes of course?






@@ -1215,7 +1153,7 @@ void hvm_destroy_all_ioreq_servers(struct domain *d)
   struct hvm_ioreq_server *s;
   unsigned int id;
   
-if ( !relocate_portio_handler(d, 0xcf8, 0xcf8, 4) )

+if ( !arch_hvm_ioreq_destroy(d) )

There's no ioreq being destroyed here, so I think this wants
renaming (and again ideally right away following the planned
new scheme).

Agreed that no ioreq is being destroyed here. Probably
ioreq_server_check_for_destroy()?
I couldn't think of a better name.

"check" implies no change (and d ought to then be const struct
domain *). With the containing function likely becoming
ioreq_server_destroy_all(), arch_ioreq_server_destroy_all()
would come to mind, or arch_ioreq_server_prepare_destroy_all().


ok, agree





+static inline int hvm_map_mem_type_to_ioreq_server(struct domain *d,
+   ioservid_t id,
+   uint32_t type,
+   uint32_t flags)
+{
+struct hvm_ioreq_server *s;
+int rc;
+
+if ( type != HVMMEM_ioreq_server )
+return -EINVAL;
+
+if ( flags & ~XEN_DMOP_IOREQ_MEM_ACCESS_WRITE )
+return -EINVAL;
+
+spin_lock_recursive(&d->arch.hvm.ioreq_server.lock);
+
+s = get_ioreq_server(d, id);
+
+rc = -ENOENT;
+if ( !s )
+goto out;
+
+rc = -EPERM;
+if ( s->emulator != current->domain )
+goto out;
+
+rc = p2m_set_ioreq_server(d, flags, s);
+
+ out:
+spin_unlock_recursive(&d->arch.hvm.ioreq_server.lock);
+
+if ( rc == 0 && flags == 0 )
+{
+struct p2m_domain *p2m = p2m_get_hostp2m(d);

I realize I may be asking too much, but would it be possible if,
while moving code, you made simple and likely uncontroversial
adjustments like adding const here? (Such adjustments would be
less desirable to make if they increased the size of the patch,
e.g. if you were touching only nearby code.)

This function, as well as the one located below, won't be moved to this header
in the next version of the patch.

ok, will add const.

Well, if you don't move the code, then better keep the diff small
and leave things as they are.


ok, in case I have to move that code for any reason, I will take the 
suggestions into account.


--
Regards,

Oleksandr Tyshchenko




Re: [PATCH 06/10] vpci: Make every domain handle its own BARs

2020-11-13 Thread Oleksandr Andrushchenko

On 11/13/20 1:35 PM, Jan Beulich wrote:
> On 13.11.2020 12:02, Oleksandr Andrushchenko wrote:
>> On 11/13/20 12:50 PM, Jan Beulich wrote:
>>> On 13.11.2020 11:46, Oleksandr Andrushchenko wrote:
 On 11/13/20 12:25 PM, Jan Beulich wrote:
> On 13.11.2020 07:32, Oleksandr Andrushchenko wrote:
>> On 11/12/20 4:46 PM, Roger Pau Monné wrote:
>>> On Thu, Nov 12, 2020 at 01:16:10PM +, Oleksandr Andrushchenko wrote:
 On 11/12/20 11:40 AM, Roger Pau Monné wrote:
> On Mon, Nov 09, 2020 at 02:50:27PM +0200, Oleksandr Andrushchenko 
> wrote:
>> From: Oleksandr Andrushchenko 
>> +static uint32_t bar_read_dispatch(const struct pci_dev *pdev, 
>> unsigned int reg,
>> +  void *data)
>> +{
>> +struct vpci_bar *vbar, *bar = data;
>> +
>> +if ( is_hardware_domain(current->domain) )
>> +return bar_read_hwdom(pdev, reg, data);
>> +
>> +vbar = get_vpci_bar(current->domain, pdev, bar->index);
>> +if ( !vbar )
>> +return ~0;
>> +
>> +return bar_read_guest(pdev, reg, vbar);
>> +}
>> +
>> +static void bar_write_dispatch(const struct pci_dev *pdev, unsigned 
>> int reg,
>> +   uint32_t val, void *data)
>> +{
>> +struct vpci_bar *bar = data;
>> +
>> +if ( is_hardware_domain(current->domain) )
>> +bar_write_hwdom(pdev, reg, val, data);
>> +else
>> +{
>> +struct vpci_bar *vbar = get_vpci_bar(current->domain, pdev, 
>> bar->index);
>> +
>> +if ( !vbar )
>> +return;
>> +bar_write_guest(pdev, reg, val, vbar);
>> +}
>> +}
> You should assign different handlers based on whether the domain that
> has the device assigned is a domU or the hardware domain, rather than
> doing the selection here.
 Hm, handlers are assigned once in init_bars and this function is only 
 called

 for hwdom, so there is no way I can do that for the guests. Hence, the 
 dispatcher
>>> I think we might want to reset the vPCI handlers when a devices gets
>>> assigned and deassigned.
>> In ARM case init_bars is called too early: PCI device assignment is 
>> currently
>>
>> initiated by Domain-0' kernel and is done *before* PCI devices are given 
>> memory
>>
>> ranges and BARs assigned:
>>
>> [    0.429514] pci_bus :00: root bus resource [bus 00-ff]
>> [    0.429532] pci_bus :00: root bus resource [io 0x-0xf]
>> [    0.429555] pci_bus :00: root bus resource [mem 
>> 0xfe20-0xfe3f]
>> [    0.429575] pci_bus :00: root bus resource [mem 
>> 0x3000-0x37ff]
>> [    0.429604] pci_bus :00: root bus resource [mem 
>> 0x3800-0x3fff pref]
>> [    0.429670] pci :00:00.0: enabling Extended Tags
>> [    0.453764] pci :00:00.0:  
>> BUS_NOTIFY_ADD_DEVICE
>>
>> < init_bars >
>>
>> [    0.453793] pci :00:00.0: -- IRQ 0
>> [    0.458825] pci :00:00.0: Failed to add - passthrough or 
>> MSI/MSI-X might fail!
>> [    0.471790] pci :01:00.0:  
>> BUS_NOTIFY_ADD_DEVICE
>>
>> < init_bars >
>>
>> [    0.471821] pci :01:00.0: -- IRQ 255
>> [    0.476809] pci :01:00.0: Failed to add - passthrough or 
>> MSI/MSI-X might fail!
>>
>> < BAR assignments below >
>>
>> [    0.488233] pci :00:00.0: BAR 14: assigned [mem 
>> 0xfe20-0xfe2f]
>> [    0.488265] pci :00:00.0: BAR 15: assigned [mem 
>> 0x3800-0x380f pref]
>>
>> In case of x86 this is pretty much ok as BARs are already in place, but 
>> for ARM we
>>
>> need to take care and re-setup vPCI BARs for hwdom.
> Even on x86 there's no guarantee that all devices have their BARs set
> up by firmware.
 This is true. But there you could have config space trapped in "x86 
 generic way",

 please correct me if I'm wrong here

> In a subsequent reply you've suggested to move init_bars from "add" to
> "assign", but I'm having trouble seeing what this would change: It's
> not Dom0 controlling assignment (to itself), but Xen assigns the device
> towards the end of pci_add_device().
 PHYSDEVOP_pci_device_add vs XEN_DOMCTL_assign_device

 Currently we initialize BARs during PHYSDEVOP_pci_device_add and

 if we do that during XEN_DOMCTL_assign_device things seem to change
>>> But there can't possibly be any XEN_DOMCTL_assign_device involved in
>>> booting of Dom0.
>> Indeed. So, do you have an idea when we should call init_bars suitable
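
The handler-selection approach Roger suggests earlier in this thread — picking
the hwdom or guest BAR handlers once, when vPCI handlers are (re)registered on
assign/deassign, rather than branching on is_hardware_domain() in a dispatch
wrapper on every access — might look like the sketch below. The types are
minimal stand-ins, not Xen's real ones, and the return values are fake:

```c
#include <stdbool.h>
#include <stdint.h>
#include <stddef.h>

struct domain { bool is_hwdom; };
struct pci_dev { struct domain *domain; };

typedef uint32_t vpci_bar_read_t(const struct pci_dev *pdev,
                                 unsigned int reg, void *data);

bool is_hardware_domain(const struct domain *d) { return d->is_hwdom; }

uint32_t bar_read_hwdom(const struct pci_dev *pdev, unsigned int reg,
                        void *data)
{ return 1; /* stand-in: would return the physical BAR value */ }

uint32_t bar_read_guest(const struct pci_dev *pdev, unsigned int reg,
                        void *data)
{ return 2; /* stand-in: would return the virtualised BAR value */ }

/* Called from init_bars()/device (de)assignment, not on each access. */
vpci_bar_read_t *select_bar_read(const struct pci_dev *pdev)
{
    return is_hardware_domain(pdev->domain) ? bar_read_hwdom
                                            : bar_read_guest;
}
```
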

Re: dom0 PVH: 'entry->arch.pirq != INVALID_PIRQ' failed at vmsi.c:843

2020-11-13 Thread Manuel Bouyer
On Thu, Nov 12, 2020 at 09:19:39PM +0100, Roger Pau Monné wrote:
> The following might be able to get you going, but I think I need to
> refine the logic a bit there, will have to give it some thought.

I also tested with xen devel (Xen version 4.15-unstable, Latest ChangeSet: Wed 
Nov 4 09:27:22 2020 +0100 git:9ff9705647-dirty).
Your patch is needed there too to avoid the panic.

As with 4.13, I have problems with interrupts not being properly
delivered. The strange thing is that the counter is not 0, but 3 (with 4.13)
or 2 (with 4.15), which would mean that interrupts stop being delivered
at some point in the setup process.

The problematic interrupt is identified as "ioapic2 pin 2" by the NetBSD kernel,
so it's not MSI/MSI-X (not sure it matters though).
Maybe something related to mask/unmask?

-- 
Manuel Bouyer 
 NetBSD: 26 years of experience will always make the difference
--



Re: [PATCH 06/10] vpci: Make every domain handle its own BARs

2020-11-13 Thread Julien Grall




On 13/11/2020 11:26, Jan Beulich wrote:

On 13.11.2020 12:06, Julien Grall wrote:

Hi Jan,

On 13/11/2020 10:53, Jan Beulich wrote:

On 13.11.2020 11:36, Julien Grall wrote:

On 13/11/2020 10:25, Jan Beulich wrote:

On 13.11.2020 07:32, Oleksandr Andrushchenko wrote:

On 11/12/20 4:46 PM, Roger Pau Monné wrote:

On Thu, Nov 12, 2020 at 01:16:10PM +, Oleksandr Andrushchenko wrote:

On 11/12/20 11:40 AM, Roger Pau Monné wrote:

On Mon, Nov 09, 2020 at 02:50:27PM +0200, Oleksandr Andrushchenko wrote:

From: Oleksandr Andrushchenko 
+static uint32_t bar_read_dispatch(const struct pci_dev *pdev, unsigned int reg,
+  void *data)
+{
+struct vpci_bar *vbar, *bar = data;
+
+if ( is_hardware_domain(current->domain) )
+return bar_read_hwdom(pdev, reg, data);
+
+vbar = get_vpci_bar(current->domain, pdev, bar->index);
+if ( !vbar )
+return ~0;
+
+return bar_read_guest(pdev, reg, vbar);
+}
+
+static void bar_write_dispatch(const struct pci_dev *pdev, unsigned int reg,
+   uint32_t val, void *data)
+{
+struct vpci_bar *bar = data;
+
+if ( is_hardware_domain(current->domain) )
+bar_write_hwdom(pdev, reg, val, data);
+else
+{
+struct vpci_bar *vbar = get_vpci_bar(current->domain, pdev, 
bar->index);
+
+if ( !vbar )
+return;
+bar_write_guest(pdev, reg, val, vbar);
+}
+}

You should assign different handlers based on whether the domain that
has the device assigned is a domU or the hardware domain, rather than
doing the selection here.

Hm, handlers are assigned once in init_bars and this function is only called

for hwdom, so there is no way I can do that for the guests. Hence, the 
dispatcher

I think we might want to reset the vPCI handlers when a devices gets
assigned and deassigned.


In ARM case init_bars is called too early: PCI device assignment is currently

initiated by Domain-0' kernel and is done *before* PCI devices are given memory

ranges and BARs assigned:

[    0.429514] pci_bus :00: root bus resource [bus 00-ff]
[    0.429532] pci_bus :00: root bus resource [io 0x-0xf]
[    0.429555] pci_bus :00: root bus resource [mem 0xfe20-0xfe3f]
[    0.429575] pci_bus :00: root bus resource [mem 0x3000-0x37ff]
[    0.429604] pci_bus :00: root bus resource [mem 0x3800-0x3fff 
pref]
[    0.429670] pci :00:00.0: enabling Extended Tags
[    0.453764] pci :00:00.0:  BUS_NOTIFY_ADD_DEVICE

< init_bars >

[    0.453793] pci :00:00.0: -- IRQ 0
[    0.458825] pci :00:00.0: Failed to add - passthrough or MSI/MSI-X might 
fail!
[    0.471790] pci :01:00.0:  BUS_NOTIFY_ADD_DEVICE

< init_bars >

[    0.471821] pci :01:00.0: -- IRQ 255
[    0.476809] pci :01:00.0: Failed to add - passthrough or MSI/MSI-X might 
fail!

< BAR assignments below >

[    0.488233] pci :00:00.0: BAR 14: assigned [mem 0xfe20-0xfe2f]
[    0.488265] pci :00:00.0: BAR 15: assigned [mem 0x3800-0x380f 
pref]

In case of x86 this is pretty much ok as BARs are already in place, but for ARM 
we

need to take care and re-setup vPCI BARs for hwdom.


Even on x86 there's no guarantee that all devices have their BARs set
up by firmware.

In a subsequent reply you've suggested to move init_bars from "add" to
"assign", but I'm having trouble seeing what this would change: It's
not Dom0 controlling assignment (to itself), but Xen assigns the device
towards the end of pci_add_device().


Things are getting even more complicated if the host PCI bridge is not
ECAM-like, so you cannot set mmio_handlers and trap hwdom's access to the
config space to update BARs etc. This is why I have that ugly hack for
rcar_gen3 to read actual BARs for hwdom.


How do config space accesses work there? At the latest for MSI/MSI-X it'll
be imperative that Xen be able to intercept config space writes.


I am not sure I understand your last sentence. Are you saying that we
always need to trap accesses to the MSI/MSI-X message in order to sanitize it?

If one is using the GICv3 ITS (I haven't investigated other MSI
controller), then I don't believe you need to sanitize the MSI/MSI-X
message in most of the situation.


Well, if it's fine for the guest to write arbitrary values to message
address and message data,


The message address would be the doorbell of the ITS that is usually
going through the IOMMU page-tables. Although, I am aware of a couple of
platforms where the doorbell access (among other address ranges
including P2P transaction) bypass the IOMMU. In this situation, we would
need a lot more work than just trapping the access.


When you say "The message address would be the doorbell of the ITS" am
I right in understanding that's the designated address to be put there?
What if the guest puts some random different address there?


My point was that all the accesses from a PCI 

Re: [PATCH 06/10] vpci: Make every domain handle its own BARs

2020-11-13 Thread Jan Beulich
On 13.11.2020 12:02, Oleksandr Andrushchenko wrote:
> 
> On 11/13/20 12:50 PM, Jan Beulich wrote:
>> On 13.11.2020 11:46, Oleksandr Andrushchenko wrote:
>>> On 11/13/20 12:25 PM, Jan Beulich wrote:
 On 13.11.2020 07:32, Oleksandr Andrushchenko wrote:
> On 11/12/20 4:46 PM, Roger Pau Monné wrote:
>> On Thu, Nov 12, 2020 at 01:16:10PM +, Oleksandr Andrushchenko wrote:
>>> On 11/12/20 11:40 AM, Roger Pau Monné wrote:
 On Mon, Nov 09, 2020 at 02:50:27PM +0200, Oleksandr Andrushchenko 
 wrote:
> From: Oleksandr Andrushchenko 
> +static uint32_t bar_read_dispatch(const struct pci_dev *pdev, 
> unsigned int reg,
> +  void *data)
> +{
> +struct vpci_bar *vbar, *bar = data;
> +
> +if ( is_hardware_domain(current->domain) )
> +return bar_read_hwdom(pdev, reg, data);
> +
> +vbar = get_vpci_bar(current->domain, pdev, bar->index);
> +if ( !vbar )
> +return ~0;
> +
> +return bar_read_guest(pdev, reg, vbar);
> +}
> +
> +static void bar_write_dispatch(const struct pci_dev *pdev, unsigned 
> int reg,
> +   uint32_t val, void *data)
> +{
> +struct vpci_bar *bar = data;
> +
> +if ( is_hardware_domain(current->domain) )
> +bar_write_hwdom(pdev, reg, val, data);
> +else
> +{
> +struct vpci_bar *vbar = get_vpci_bar(current->domain, pdev, 
> bar->index);
> +
> +if ( !vbar )
> +return;
> +bar_write_guest(pdev, reg, val, vbar);
> +}
> +}
 You should assign different handlers based on whether the domain that
 has the device assigned is a domU or the hardware domain, rather than
 doing the selection here.
>>> Hm, handlers are assigned once in init_bars and this function is only 
>>> called
>>>
>>> for hwdom, so there is no way I can do that for the guests. Hence, the 
>>> dispatcher
>> I think we might want to reset the vPCI handlers when a devices gets
>> assigned and deassigned.
> In ARM case init_bars is called too early: PCI device assignment is 
> currently
>
> initiated by Domain-0' kernel and is done *before* PCI devices are given 
> memory
>
> ranges and BARs assigned:
>
> [    0.429514] pci_bus :00: root bus resource [bus 00-ff]
> [    0.429532] pci_bus :00: root bus resource [io 0x-0xf]
> [    0.429555] pci_bus :00: root bus resource [mem 
> 0xfe20-0xfe3f]
> [    0.429575] pci_bus :00: root bus resource [mem 
> 0x3000-0x37ff]
> [    0.429604] pci_bus :00: root bus resource [mem 
> 0x3800-0x3fff pref]
> [    0.429670] pci :00:00.0: enabling Extended Tags
> [    0.453764] pci :00:00.0:  
> BUS_NOTIFY_ADD_DEVICE
>
> < init_bars >
>
> [    0.453793] pci :00:00.0: -- IRQ 0
> [    0.458825] pci :00:00.0: Failed to add - passthrough or MSI/MSI-X 
> might fail!
> [    0.471790] pci :01:00.0:  
> BUS_NOTIFY_ADD_DEVICE
>
> < init_bars >
>
> [    0.471821] pci :01:00.0: -- IRQ 255
> [    0.476809] pci :01:00.0: Failed to add - passthrough or MSI/MSI-X 
> might fail!
>
> < BAR assignments below >
>
> [    0.488233] pci :00:00.0: BAR 14: assigned [mem 
> 0xfe20-0xfe2f]
> [    0.488265] pci :00:00.0: BAR 15: assigned [mem 
> 0x3800-0x380f pref]
>
> In case of x86 this is pretty much ok as BARs are already in place, but 
> for ARM we
>
> need to take care and re-setup vPCI BARs for hwdom.
 Even on x86 there's no guarantee that all devices have their BARs set
 up by firmware.
>>> This is true. But there you could have config space trapped in "x86 generic 
>>> way",
>>>
>>> please correct me if I'm wrong here
>>>
 In a subsequent reply you've suggested to move init_bars from "add" to
 "assign", but I'm having trouble seeing what this would change: It's
 not Dom0 controlling assignment (to itself), but Xen assigns the device
 towards the end of pci_add_device().
>>> PHYSDEVOP_pci_device_add vs XEN_DOMCTL_assign_device
>>>
>>> Currently we initialize BARs during PHYSDEVOP_pci_device_add and
>>>
>>> if we do that during XEN_DOMCTL_assign_device things seem to change
>> But there can't possibly be any XEN_DOMCTL_assign_device involved in
>> booting of Dom0.
> 
> Indeed. So, do you have an idea when we should call init_bars suitable
> 
> for both ARM and x86?
> 
> Another question is: what happens bad if x86 and ARM won't call init_bars
> 
> until the moment we really assign a PCI device to the first guest?

[xen-4.14-testing test] 156716: tolerable FAIL - PUSHED

2020-11-13 Thread osstest service owner
flight 156716 xen-4.14-testing real [real]
http://logs.test-lab.xenproject.org/osstest/logs/156716/

Failures :-/ but no regressions.

Tests which are failing intermittently (not blocking):
 test-amd64-amd64-xl-qemut-debianhvm-i386-xsm 12 debian-hvm-install fail in 
156670 pass in 156716
 test-amd64-i386-xl-raw 19 guest-localmigrate/x10 fail in 156670 pass in 156716
 test-armhf-armhf-xl-credit2  18 guest-start/debian.repeat  fail pass in 156670

Tests which did not succeed, but are not blocking:
 test-amd64-i386-xl-qemuu-win7-amd64 19 guest-stop fail like 156525
 test-amd64-amd64-xl-qemut-win7-amd64 19 guest-stop fail like 156525
 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 156525
 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stop fail like 156525
 test-amd64-amd64-xl-qemut-ws16-amd64 19 guest-stop fail like 156525
 test-amd64-i386-xl-qemut-ws16-amd64 19 guest-stop fail like 156525
 test-amd64-i386-xl-qemut-win7-amd64 19 guest-stop fail like 156525
 test-armhf-armhf-libvirt 16 saverestore-support-check fail like 156525
 test-armhf-armhf-libvirt-raw 15 saverestore-support-check fail like 156525
 test-amd64-i386-xl-qemuu-ws16-amd64 19 guest-stop fail like 156525
 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stop fail like 156525
 test-amd64-i386-xl-pvshim 14 guest-start fail never pass
 test-amd64-amd64-libvirt-xsm 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-seattle 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-seattle 16 saverestore-support-check fail never pass
 test-amd64-amd64-libvirt 15 migrate-support-check fail never pass
 test-amd64-i386-libvirt 15 migrate-support-check fail never pass
 test-amd64-i386-libvirt-xsm 15 migrate-support-check fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check fail never pass
 test-arm64-arm64-xl 15 migrate-support-check fail never pass
 test-arm64-arm64-xl 16 saverestore-support-check fail never pass
 test-arm64-arm64-xl-xsm 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-xsm 16 saverestore-support-check fail never pass
 test-arm64-arm64-xl-credit2 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-credit2 16 saverestore-support-check fail never pass
 test-arm64-arm64-xl-credit1 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-credit1 16 saverestore-support-check fail never pass
 test-arm64-arm64-xl-thunderx 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-thunderx 16 saverestore-support-check fail never pass
 test-arm64-arm64-libvirt-xsm 15 migrate-support-check fail never pass
 test-arm64-arm64-libvirt-xsm 16 saverestore-support-check fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check fail never pass
 test-amd64-amd64-libvirt-vhd 14 migrate-support-check fail never pass
 test-armhf-armhf-xl 15 migrate-support-check fail never pass
 test-armhf-armhf-xl 16 saverestore-support-check fail never pass
 test-armhf-armhf-xl-multivcpu 15 migrate-support-check fail never pass
 test-armhf-armhf-xl-multivcpu 16 saverestore-support-check fail never pass
 test-armhf-armhf-xl-credit1 15 migrate-support-check fail never pass
 test-armhf-armhf-xl-credit1 16 saverestore-support-check fail never pass
 test-armhf-armhf-xl-cubietruck 15 migrate-support-check fail never pass
 test-armhf-armhf-xl-cubietruck 16 saverestore-support-check fail never pass
 test-armhf-armhf-xl-credit2 15 migrate-support-check fail never pass
 test-armhf-armhf-xl-credit2 16 saverestore-support-check fail never pass
 test-armhf-armhf-xl-rtds 15 migrate-support-check fail never pass
 test-armhf-armhf-xl-rtds 16 saverestore-support-check fail never pass
 test-armhf-armhf-libvirt 15 migrate-support-check fail never pass
 test-armhf-armhf-xl-vhd 14 migrate-support-check fail never pass
 test-armhf-armhf-xl-vhd 15 saverestore-support-check fail never pass
 test-armhf-armhf-libvirt-raw 14 migrate-support-check fail never pass
 test-armhf-armhf-xl-arndale 15 migrate-support-check fail never pass
 test-armhf-armhf-xl-arndale 16 saverestore-support-check fail never pass

version targeted for testing:
 xen  d101b417b784a26326fc7800a79cc539ba570b79
baseline version:
 xen  0c96e4297da07944525729ddbe438b0131ab5b7e

Last test of basis   156525  2020-11-06 16:01:25 Z6 days
Failing since156594  2020-11-09 18:08:08 Z3 days4 attempts
Testing same since   156670  2020-11-11 04:05:07 Z2 days2 attempts


[ovmf test] 156720: all pass - PUSHED

2020-11-13 Thread osstest service owner
flight 156720 ovmf real [real]
http://logs.test-lab.xenproject.org/osstest/logs/156720/

Perfect :-)
All tests in this flight passed as required
version targeted for testing:
 ovmf 544cb0132dc1778b9e791202995533523fa6cccd
baseline version:
 ovmf 1c48866e041d2afaabb170086c5bb0c69a4653d3

Last test of basis   156684  2020-11-11 12:14:13 Z1 days
Testing same since   156720  2020-11-12 16:00:44 Z0 days1 attempts


People who touched revisions under test:
  Abner Chang 
  gechao 

jobs:
 build-amd64-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvopspass
 build-i386-pvops pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64 pass
 test-amd64-i386-xl-qemuu-ovmf-amd64  pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To xenbits.xen.org:/home/xen/git/osstest/ovmf.git
   1c48866e04..544cb0132d  544cb0132dc1778b9e791202995533523fa6cccd -> 
xen-tested-master



Re: [PATCH 06/10] vpci: Make every domain handle its own BARs

2020-11-13 Thread Jan Beulich
On 13.11.2020 12:06, Julien Grall wrote:
> Hi Jan,
> 
> On 13/11/2020 10:53, Jan Beulich wrote:
>> On 13.11.2020 11:36, Julien Grall wrote:
>>> On 13/11/2020 10:25, Jan Beulich wrote:
 On 13.11.2020 07:32, Oleksandr Andrushchenko wrote:
> On 11/12/20 4:46 PM, Roger Pau Monné wrote:
>> On Thu, Nov 12, 2020 at 01:16:10PM +, Oleksandr Andrushchenko wrote:
>>> On 11/12/20 11:40 AM, Roger Pau Monné wrote:
 On Mon, Nov 09, 2020 at 02:50:27PM +0200, Oleksandr Andrushchenko 
 wrote:
> From: Oleksandr Andrushchenko 
> +static uint32_t bar_read_dispatch(const struct pci_dev *pdev, 
> unsigned int reg,
> +  void *data)
> +{
> +struct vpci_bar *vbar, *bar = data;
> +
> +if ( is_hardware_domain(current->domain) )
> +return bar_read_hwdom(pdev, reg, data);
> +
> +vbar = get_vpci_bar(current->domain, pdev, bar->index);
> +if ( !vbar )
> +return ~0;
> +
> +return bar_read_guest(pdev, reg, vbar);
> +}
> +
> +static void bar_write_dispatch(const struct pci_dev *pdev, unsigned 
> int reg,
> +   uint32_t val, void *data)
> +{
> +struct vpci_bar *bar = data;
> +
> +if ( is_hardware_domain(current->domain) )
> +bar_write_hwdom(pdev, reg, val, data);
> +else
> +{
> +struct vpci_bar *vbar = get_vpci_bar(current->domain, pdev, 
> bar->index);
> +
> +if ( !vbar )
> +return;
> +bar_write_guest(pdev, reg, val, vbar);
> +}
> +}
 You should assign different handlers based on whether the domain that
 has the device assigned is a domU or the hardware domain, rather than
 doing the selection here.
>>> Hm, handlers are assigned once in init_bars and this function is only 
>>> called
>>>
>>> for hwdom, so there is no way I can do that for the guests. Hence, the 
>>> dispatcher
>> I think we might want to reset the vPCI handlers when a devices gets
>> assigned and deassigned.
>
> In ARM case init_bars is called too early: PCI device assignment is 
> currently
>
> initiated by Domain-0's kernel and is done *before* PCI devices are given 
> memory
>
> ranges and BARs assigned:
>
> [    0.429514] pci_bus :00: root bus resource [bus 00-ff]
> [    0.429532] pci_bus :00: root bus resource [io 0x-0xf]
> [    0.429555] pci_bus :00: root bus resource [mem 
> 0xfe20-0xfe3f]
> [    0.429575] pci_bus :00: root bus resource [mem 
> 0x3000-0x37ff]
> [    0.429604] pci_bus :00: root bus resource [mem 
> 0x3800-0x3fff pref]
> [    0.429670] pci :00:00.0: enabling Extended Tags
> [    0.453764] pci :00:00.0:  
> BUS_NOTIFY_ADD_DEVICE
>
> < init_bars >
>
> [    0.453793] pci :00:00.0: -- IRQ 0
> [    0.458825] pci :00:00.0: Failed to add - passthrough or MSI/MSI-X 
> might fail!
> [    0.471790] pci :01:00.0:  
> BUS_NOTIFY_ADD_DEVICE
>
> < init_bars >
>
> [    0.471821] pci :01:00.0: -- IRQ 255
> [    0.476809] pci :01:00.0: Failed to add - passthrough or MSI/MSI-X 
> might fail!
>
> < BAR assignments below >
>
> [    0.488233] pci :00:00.0: BAR 14: assigned [mem 
> 0xfe20-0xfe2f]
> [    0.488265] pci :00:00.0: BAR 15: assigned [mem 
> 0x3800-0x380f pref]
>
> In case of x86 this is pretty much ok as BARs are already in place, but 
> for ARM we
>
> need to take care and re-setup vPCI BARs for hwdom.

 Even on x86 there's no guarantee that all devices have their BARs set
 up by firmware.

 In a subsequent reply you've suggested to move init_bars from "add" to
 "assign", but I'm having trouble seeing what this would change: It's
 not Dom0 controlling assignment (to itself), but Xen assigns the device
 towards the end of pci_add_device().

> Things are getting even more
>
> complicated if the host PCI bridge is not ECAM like, so you cannot set 
> mmio_handlers
>
> and trap hwdom's access to the config space to update BARs etc. This is 
> why I have that
>
> ugly hack for rcar_gen3 to read actual BARs for hwdom.

 How do config space accesses work there? The latest for MSI/MSI-X it'll
 be imperative that Xen be able to intercept config space writes.
>>>
>>> I am not sure I understand your last sentence. Are you saying that we
>>> always need to trap access to MSI/MSI-X message in order to sanitize it?
>>>
>>> If one is using the GICv3 ITS (I haven't investigated other MSI
>>> controllers), then I don't believe you need to sanitize the MSI/MSI-X
>>> message in most situations.

Re: [PATCH v3 09/53] qdev: Make qdev_get_prop_ptr() get Object* arg

2020-11-13 Thread Cornelia Huck
On Thu, 12 Nov 2020 16:43:06 -0500
Eduardo Habkost  wrote:

> Make the code more generic and not specific to TYPE_DEVICE.
> 
> Reviewed-by: Marc-André Lureau 
> Signed-off-by: Eduardo Habkost 
> ---
> Changes v1 -> v2:
> - Fix build error with CONFIG_XEN
>   I took the liberty of keeping the Reviewed-by line from
>   Marc-André as the build fix is a trivial one line change
> ---
> Cc: Stefan Berger 
> Cc: Stefano Stabellini 
> Cc: Anthony Perard 
> Cc: Paul Durrant 
> Cc: Kevin Wolf 
> Cc: Max Reitz 
> Cc: Paolo Bonzini 
> Cc: "Daniel P. Berrangé" 
> Cc: Eduardo Habkost 
> Cc: Cornelia Huck 
> Cc: Thomas Huth 
> Cc: Richard Henderson 
> Cc: David Hildenbrand 
> Cc: Halil Pasic 
> Cc: Christian Borntraeger 
> Cc: Matthew Rosato 
> Cc: Alex Williamson 
> Cc: qemu-de...@nongnu.org
> Cc: xen-devel@lists.xenproject.org
> Cc: qemu-bl...@nongnu.org
> Cc: qemu-s3...@nongnu.org
> ---
>  include/hw/qdev-properties.h |  2 +-
>  backends/tpm/tpm_util.c  |  8 ++--
>  hw/block/xen-block.c |  5 +-
>  hw/core/qdev-properties-system.c | 57 +-
>  hw/core/qdev-properties.c| 82 +---
>  hw/s390x/css.c   |  5 +-
>  hw/s390x/s390-pci-bus.c  |  4 +-
>  hw/vfio/pci-quirks.c |  5 +-
>  8 files changed, 68 insertions(+), 100 deletions(-)

Reviewed-by: Cornelia Huck  #s390 parts




Re: [PATCH V2 01/23] x86/ioreq: Prepare IOREQ feature for making it common

2020-11-13 Thread Jan Beulich
On 13.11.2020 12:09, Oleksandr wrote:
> On 12.11.20 12:58, Jan Beulich wrote:
>> On 15.10.2020 18:44, Oleksandr Tyshchenko wrote:
>>> @@ -855,7 +841,7 @@ int hvm_destroy_ioreq_server(struct domain *d, 
>>> ioservid_t id)
>>>   
>>>   domain_pause(d);
>>>   
>>> -p2m_set_ioreq_server(d, 0, s);
>>> +arch_hvm_destroy_ioreq_server(s);
>> Iirc there are plans to rename hvm_destroy_ioreq_server() in the
>> course of the generalization. If so, this arch hook would imo
>> better be named following the new scheme right away.
> Could you please clarify, are you speaking about the plans discussed there
> 
> "[PATCH V2 12/23] xen/ioreq: Remove "hvm" prefixes from involved 
> function names"?
> 
> Copy text for the convenience:
> At least some of the functions touched here would be nice to be
> moved to a more consistent new naming scheme right away, to
> avoid having to touch all the same places again. I guess ioreq
> server functions would be nice to all start with ioreq_server_
> and ioreq functions to all start with ioreq_. E.g. ioreq_send()
> and ioreq_server_select().
> 
> or some other plans I am not aware of?
> 
> 
> What I really want to avoid with IOREQ enabling work is the round-trips 
> related to naming things, this patch series
> became quite big (and consumes some time to rebase and test it) and I 
> expect it to become bigger.
> 
> So the arch_hvm_destroy_ioreq_server() should be 
> arch_ioreq_server_destroy()?

I think so, yes. If you want to avoid doing full patches, how
about you simply list the functions / variables you plan to
rename alongside the intended new names? Would likely be easier
for all involved parties.

>>> @@ -1215,7 +1153,7 @@ void hvm_destroy_all_ioreq_servers(struct domain *d)
>>>   struct hvm_ioreq_server *s;
>>>   unsigned int id;
>>>   
>>> -if ( !relocate_portio_handler(d, 0xcf8, 0xcf8, 4) )
>>> +if ( !arch_hvm_ioreq_destroy(d) )
>> There's no ioreq being destroyed here, so I think this wants
>> renaming (and again ideally right away following the planned
>> new scheme).
> Agree that no ioreq being destroyed here. Probably 
> ioreq_server_check_for_destroy()?
> I couldn't think of a better name.

"check" implies no change (and d ought to then be const struct
domain *). With the containing function likely becoming
ioreq_server_destroy_all(), arch_ioreq_server_destroy_all()
would come to mind, or arch_ioreq_server_prepare_destroy_all().

>>> +static inline int hvm_map_mem_type_to_ioreq_server(struct domain *d,
>>> +   ioservid_t id,
>>> +   uint32_t type,
>>> +   uint32_t flags)
>>> +{
>>> +struct hvm_ioreq_server *s;
>>> +int rc;
>>> +
>>> +if ( type != HVMMEM_ioreq_server )
>>> +return -EINVAL;
>>> +
>>> +if ( flags & ~XEN_DMOP_IOREQ_MEM_ACCESS_WRITE )
>>> +return -EINVAL;
>>> +
>>> +spin_lock_recursive(&d->arch.hvm.ioreq_server.lock);
>>> +
>>> +s = get_ioreq_server(d, id);
>>> +
>>> +rc = -ENOENT;
>>> +if ( !s )
>>> +goto out;
>>> +
>>> +rc = -EPERM;
>>> +if ( s->emulator != current->domain )
>>> +goto out;
>>> +
>>> +rc = p2m_set_ioreq_server(d, flags, s);
>>> +
>>> + out:
>>> +spin_unlock_recursive(&d->arch.hvm.ioreq_server.lock);
>>> +
>>> +if ( rc == 0 && flags == 0 )
>>> +{
>>> +struct p2m_domain *p2m = p2m_get_hostp2m(d);
>> I realize I may be asking too much, but would it be possible if,
>> while moving code, you made simple and likely uncontroversial
>> adjustments like adding const here? (Such adjustments would be
>> less desirable to make if they increased the size of the patch,
>> e.g. if you were touching only nearby code.)
> This function as well as one located below won't be moved to this header
> for the next version of patch.
> 
> ok, will add const.

Well, if you don't move the code, then better keep the diff small
and leave things as they are.

Jan



Re: [PATCH V2 01/23] x86/ioreq: Prepare IOREQ feature for making it common

2020-11-13 Thread Oleksandr



On 12.11.20 12:58, Jan Beulich wrote:

Hi Jan.


On 15.10.2020 18:44, Oleksandr Tyshchenko wrote:

From: Oleksandr Tyshchenko 

As a lot of x86 code can be re-used on Arm later on, this
patch makes some preparation to x86/hvm/ioreq.c before moving
to the common code. This way we will get a verbatim copy for
a code movement in subsequent patch (arch/x86/hvm/ioreq.c
will be *just* renamed to common/ioreq).

This patch does the following:
1. Introduce *inline* arch_hvm_ioreq_init(), arch_hvm_ioreq_destroy(),
arch_hvm_io_completion(), arch_hvm_destroy_ioreq_server() and
hvm_ioreq_server_get_type_addr() to abstract arch specific materials.
2. Make hvm_map_mem_type_to_ioreq_server() *inline*. It is not going
to be called from the common code.

As already indicated on another sub-thread, I think some of these
are too large to be inline functions. Additionally, considering
their single-use purpose, I don't think they should be placed in
a header consumed by more than the producer and the sole consumer.
ok, the only reason I made these inline was to allow moving the whole
x86/hvm/ioreq.c to the common code.

I will move some of them back to ioreq.c.





3. Make get_ioreq_server() global. It is going to be called from
a few places.

And with this its name ought to change, to fit the general naming
model of global functions of this subsystem.
I think, with new requirements (making 
hvm_map_mem_type_to_ioreq_server() common) this helper

doesn't need to be global. I will make it static again.





4. Add IOREQ_STATUS_* #define-s and update candidates for moving.

This, it seems to me, could be a separate patch.


Well, will do.





@@ -855,7 +841,7 @@ int hvm_destroy_ioreq_server(struct domain *d, ioservid_t 
id)
  
  domain_pause(d);
  
-p2m_set_ioreq_server(d, 0, s);

+arch_hvm_destroy_ioreq_server(s);

Iirc there are plans to rename hvm_destroy_ioreq_server() in the
course of the generalization. If so, this arch hook would imo
better be named following the new scheme right away.

Could you please clarify, are you speaking about the plans discussed there

"[PATCH V2 12/23] xen/ioreq: Remove "hvm" prefixes from involved 
function names"?


Copy text for the convenience:
At least some of the functions touched here would be nice to be
moved to a more consistent new naming scheme right away, to
avoid having to touch all the same places again. I guess ioreq
server functions would be nice to all start with ioreq_server_
and ioreq functions to all start with ioreq_. E.g. ioreq_send()
and ioreq_server_select().

or some other plans I am not aware of?


What I really want to avoid with IOREQ enabling work is the round-trips 
related to naming things, this patch series
became quite big (and consumes some time to rebase and test it) and I 
expect it to become bigger.


So the arch_hvm_destroy_ioreq_server() should be 
arch_ioreq_server_destroy()?






@@ -1215,7 +1153,7 @@ void hvm_destroy_all_ioreq_servers(struct domain *d)
  struct hvm_ioreq_server *s;
  unsigned int id;
  
-if ( !relocate_portio_handler(d, 0xcf8, 0xcf8, 4) )

+if ( !arch_hvm_ioreq_destroy(d) )

There's no ioreq being destroyed here, so I think this wants
renaming (and again ideally right away following the planned
new scheme).
Agree that no ioreq being destroyed here. Probably 
ioreq_server_check_for_destroy()?

I couldn't think of a better name.





+static inline int hvm_map_mem_type_to_ioreq_server(struct domain *d,
+   ioservid_t id,
+   uint32_t type,
+   uint32_t flags)
+{
+struct hvm_ioreq_server *s;
+int rc;
+
+if ( type != HVMMEM_ioreq_server )
+return -EINVAL;
+
+if ( flags & ~XEN_DMOP_IOREQ_MEM_ACCESS_WRITE )
+return -EINVAL;
+
+spin_lock_recursive(&d->arch.hvm.ioreq_server.lock);
+
+s = get_ioreq_server(d, id);
+
+rc = -ENOENT;
+if ( !s )
+goto out;
+
+rc = -EPERM;
+if ( s->emulator != current->domain )
+goto out;
+
+rc = p2m_set_ioreq_server(d, flags, s);
+
+ out:
+spin_unlock_recursive(&d->arch.hvm.ioreq_server.lock);
+
+if ( rc == 0 && flags == 0 )
+{
+struct p2m_domain *p2m = p2m_get_hostp2m(d);

I realize I may be asking too much, but would it be possible if,
while moving code, you made simple and likely uncontroversial
adjustments like adding const here? (Such adjustments would be
less desirable to make if they increased the size of the patch,
e.g. if you were touching only nearby code.)

This function as well as one located below won't be moved to this header
for the next version of patch.

ok, will add const.





+if ( read_atomic(&p2m->ioreq.entry_count) )
+p2m_change_entry_type_global(d, p2m_ioreq_server, p2m_ram_rw);
+}
+
+return rc;
+}
+
+static inline int hvm_ioreq_server_get_type_addr(const struct domain *d,
+

Re: [PATCH 06/10] vpci: Make every domain handle its own BARs

2020-11-13 Thread Julien Grall

Hi Jan,

On 13/11/2020 10:53, Jan Beulich wrote:

On 13.11.2020 11:36, Julien Grall wrote:

On 13/11/2020 10:25, Jan Beulich wrote:

On 13.11.2020 07:32, Oleksandr Andrushchenko wrote:

On 11/12/20 4:46 PM, Roger Pau Monné wrote:

On Thu, Nov 12, 2020 at 01:16:10PM +, Oleksandr Andrushchenko wrote:

On 11/12/20 11:40 AM, Roger Pau Monné wrote:

On Mon, Nov 09, 2020 at 02:50:27PM +0200, Oleksandr Andrushchenko wrote:

From: Oleksandr Andrushchenko 
+static uint32_t bar_read_dispatch(const struct pci_dev *pdev, unsigned int reg,
+  void *data)
+{
+struct vpci_bar *vbar, *bar = data;
+
+if ( is_hardware_domain(current->domain) )
+return bar_read_hwdom(pdev, reg, data);
+
+vbar = get_vpci_bar(current->domain, pdev, bar->index);
+if ( !vbar )
+return ~0;
+
+return bar_read_guest(pdev, reg, vbar);
+}
+
+static void bar_write_dispatch(const struct pci_dev *pdev, unsigned int reg,
+   uint32_t val, void *data)
+{
+struct vpci_bar *bar = data;
+
+if ( is_hardware_domain(current->domain) )
+bar_write_hwdom(pdev, reg, val, data);
+else
+{
+struct vpci_bar *vbar = get_vpci_bar(current->domain, pdev, 
bar->index);
+
+if ( !vbar )
+return;
+bar_write_guest(pdev, reg, val, vbar);
+}
+}

You should assign different handlers based on whether the domain that
has the device assigned is a domU or the hardware domain, rather than
doing the selection here.

Hm, handlers are assigned once in init_bars and this function is only called

for hwdom, so there is no way I can do that for the guests. Hence, the 
dispatcher

I think we might want to reset the vPCI handlers when a devices gets
assigned and deassigned.


In ARM case init_bars is called too early: PCI device assignment is currently

initiated by Domain-0's kernel and is done *before* PCI devices are given memory

ranges and BARs assigned:

[    0.429514] pci_bus :00: root bus resource [bus 00-ff]
[    0.429532] pci_bus :00: root bus resource [io 0x-0xf]
[    0.429555] pci_bus :00: root bus resource [mem 0xfe20-0xfe3f]
[    0.429575] pci_bus :00: root bus resource [mem 0x3000-0x37ff]
[    0.429604] pci_bus :00: root bus resource [mem 0x3800-0x3fff 
pref]
[    0.429670] pci :00:00.0: enabling Extended Tags
[    0.453764] pci :00:00.0:  BUS_NOTIFY_ADD_DEVICE

< init_bars >

[    0.453793] pci :00:00.0: -- IRQ 0
[    0.458825] pci :00:00.0: Failed to add - passthrough or MSI/MSI-X might 
fail!
[    0.471790] pci :01:00.0:  BUS_NOTIFY_ADD_DEVICE

< init_bars >

[    0.471821] pci :01:00.0: -- IRQ 255
[    0.476809] pci :01:00.0: Failed to add - passthrough or MSI/MSI-X might 
fail!

< BAR assignments below >

[    0.488233] pci :00:00.0: BAR 14: assigned [mem 0xfe20-0xfe2f]
[    0.488265] pci :00:00.0: BAR 15: assigned [mem 0x3800-0x380f 
pref]

In case of x86 this is pretty much ok as BARs are already in place, but for ARM 
we

need to take care and re-setup vPCI BARs for hwdom.


Even on x86 there's no guarantee that all devices have their BARs set
up by firmware.

In a subsequent reply you've suggested to move init_bars from "add" to
"assign", but I'm having trouble seeing what this would change: It's
not Dom0 controlling assignment (to itself), but Xen assigns the device
towards the end of pci_add_device().


Things are getting even more

complicated if the host PCI bridge is not ECAM like, so you cannot set 
mmio_handlers

and trap hwdom's access to the config space to update BARs etc. This is why I 
have that

ugly hack for rcar_gen3 to read actual BARs for hwdom.


How do config space accesses work there? The latest for MSI/MSI-X it'll
be imperative that Xen be able to intercept config space writes.


I am not sure I understand your last sentence. Are you saying that we
always need to trap access to MSI/MSI-X message in order to sanitize it?

If one is using the GICv3 ITS (I haven't investigated other MSI
controllers), then I don't believe you need to sanitize the MSI/MSI-X
message in most situations.


Well, if it's fine for the guest to write arbitrary values to message
address and message data,


The message address would be the doorbell of the ITS that is usually 
going through the IOMMU page-tables. Although, I am aware of a couple of 
platforms where the doorbell access (among other address ranges 
including P2P transactions) bypasses the IOMMU. In this situation, we would 
need a lot more work than just trapping the access.


Regarding the message data, for the ITS this is an event ID. The HW will 
then tag each message with the device ID (this prevents spoofing). The 
tuple (device ID, event ID) is used by the ITS to decide where to 
inject the event.


Whether other MSI controllers (e.g. GICv2m) have similar isolation 
feature will be on a case by case basis.

Re: [PATCH 06/10] vpci: Make every domain handle its own BARs

2020-11-13 Thread Oleksandr Andrushchenko

On 11/13/20 12:50 PM, Jan Beulich wrote:
> On 13.11.2020 11:46, Oleksandr Andrushchenko wrote:
>> On 11/13/20 12:25 PM, Jan Beulich wrote:
>>> On 13.11.2020 07:32, Oleksandr Andrushchenko wrote:
 On 11/12/20 4:46 PM, Roger Pau Monné wrote:
> On Thu, Nov 12, 2020 at 01:16:10PM +, Oleksandr Andrushchenko wrote:
>> On 11/12/20 11:40 AM, Roger Pau Monné wrote:
>>> On Mon, Nov 09, 2020 at 02:50:27PM +0200, Oleksandr Andrushchenko wrote:
 From: Oleksandr Andrushchenko 
 +static uint32_t bar_read_dispatch(const struct pci_dev *pdev, 
 unsigned int reg,
 +  void *data)
 +{
 +struct vpci_bar *vbar, *bar = data;
 +
 +if ( is_hardware_domain(current->domain) )
 +return bar_read_hwdom(pdev, reg, data);
 +
 +vbar = get_vpci_bar(current->domain, pdev, bar->index);
 +if ( !vbar )
 +return ~0;
 +
 +return bar_read_guest(pdev, reg, vbar);
 +}
 +
 +static void bar_write_dispatch(const struct pci_dev *pdev, unsigned 
 int reg,
 +   uint32_t val, void *data)
 +{
 +struct vpci_bar *bar = data;
 +
 +if ( is_hardware_domain(current->domain) )
 +bar_write_hwdom(pdev, reg, val, data);
 +else
 +{
 +struct vpci_bar *vbar = get_vpci_bar(current->domain, pdev, 
 bar->index);
 +
 +if ( !vbar )
 +return;
 +bar_write_guest(pdev, reg, val, vbar);
 +}
 +}
>>> You should assign different handlers based on whether the domain that
>>> has the device assigned is a domU or the hardware domain, rather than
>>> doing the selection here.
>> Hm, handlers are assigned once in init_bars and this function is only 
>> called
>>
>> for hwdom, so there is no way I can do that for the guests. Hence, the 
>> dispatcher
> I think we might want to reset the vPCI handlers when a devices gets
> assigned and deassigned.
 In ARM case init_bars is called too early: PCI device assignment is 
 currently

 initiated by Domain-0's kernel and is done *before* PCI devices are given 
 memory

 ranges and BARs assigned:

 [    0.429514] pci_bus :00: root bus resource [bus 00-ff]
 [    0.429532] pci_bus :00: root bus resource [io 0x-0xf]
 [    0.429555] pci_bus :00: root bus resource [mem 
 0xfe20-0xfe3f]
 [    0.429575] pci_bus :00: root bus resource [mem 
 0x3000-0x37ff]
 [    0.429604] pci_bus :00: root bus resource [mem 
 0x3800-0x3fff pref]
 [    0.429670] pci :00:00.0: enabling Extended Tags
 [    0.453764] pci :00:00.0:  BUS_NOTIFY_ADD_DEVICE

 < init_bars >

 [    0.453793] pci :00:00.0: -- IRQ 0
 [    0.458825] pci :00:00.0: Failed to add - passthrough or MSI/MSI-X 
 might fail!
 [    0.471790] pci :01:00.0:  BUS_NOTIFY_ADD_DEVICE

 < init_bars >

 [    0.471821] pci :01:00.0: -- IRQ 255
 [    0.476809] pci :01:00.0: Failed to add - passthrough or MSI/MSI-X 
 might fail!

 < BAR assignments below >

 [    0.488233] pci :00:00.0: BAR 14: assigned [mem 
 0xfe20-0xfe2f]
 [    0.488265] pci :00:00.0: BAR 15: assigned [mem 
 0x3800-0x380f pref]

 In case of x86 this is pretty much ok as BARs are already in place, but 
 for ARM we

 need to take care and re-setup vPCI BARs for hwdom.
>>> Even on x86 there's no guarantee that all devices have their BARs set
>>> up by firmware.
>> This is true. But there you could have config space trapped in "x86 generic 
>> way",
>>
>> please correct me if I'm wrong here
>>
>>> In a subsequent reply you've suggested to move init_bars from "add" to
>>> "assign", but I'm having trouble seeing what this would change: It's
>>> not Dom0 controlling assignment (to itself), but Xen assigns the device
>>> towards the end of pci_add_device().
>> PHYSDEVOP_pci_device_add vs XEN_DOMCTL_assign_device
>>
>> Currently we initialize BARs during PHYSDEVOP_pci_device_add and
>>
>> if we do that during XEN_DOMCTL_assign_device things seem to change
> But there can't possibly be any XEN_DOMCTL_assign_device involved in
> booting of Dom0.

Indeed. So, do you have an idea when we should call init_bars suitable

for both ARM and x86?

Another question is: what happens bad if x86 and ARM won't call init_bars

until the moment we really assign a PCI device to the first guest?

>
> In order to do passthrough to domUs safely
> we will have to add more handlers than what's required for dom0,
 Can you please tell what you are thinking about? What are 

Re: [PATCH 08/10] vpci/arm: Allow updating BAR's header for non-ECAM bridges

2020-11-13 Thread Oleksandr Andrushchenko

On 11/13/20 12:47 PM, Jan Beulich wrote:
> On 13.11.2020 11:39, Oleksandr Andrushchenko wrote:
>> On 11/13/20 12:29 PM, Jan Beulich wrote:
>>> On 09.11.2020 13:50, Oleksandr Andrushchenko wrote:
 From: Oleksandr Andrushchenko 

 Non-ECAM host bridges in hwdom go directly to PCI config space,
 not through vpci (they use their specific method for accessing PCI
 configuration, e.g. dedicated registers etc.).
>>> And access to these dedicated registers can't be intercepted?
>> It can. But then you have to fully emulate that bridge, e.g.
>>
>> "if we write A to regB and after that write C to regZ then it
>>
>> means we are accessing config space. If we write"
> Sounds pretty much like the I/O port based access mechanism on
> x86, which also has some sort of "enable". Of course, I/O port
> accesses are particularly easy to intercept and handle...
Yes, it has a somewhat similar idea
>
>> I mean this would need lots of code in Xen to achieve that
> Possibly, but look at the amount of code we have in Xen on the
> x86 side to handle MCFG writes by Dom0.

But MCFG is handled the same way for all x86 machines, right?

And here I'll have to have SoC-specific code, e.g. a specific driver

>
> Jan

Thank you,

Oleksandr


Re: [PATCH 06/10] vpci: Make every domain handle its own BARs

2020-11-13 Thread Jan Beulich
On 13.11.2020 11:36, Julien Grall wrote:
> On 13/11/2020 10:25, Jan Beulich wrote:
>> On 13.11.2020 07:32, Oleksandr Andrushchenko wrote:
>>> On 11/12/20 4:46 PM, Roger Pau Monné wrote:
 On Thu, Nov 12, 2020 at 01:16:10PM +, Oleksandr Andrushchenko wrote:
> On 11/12/20 11:40 AM, Roger Pau Monné wrote:
>> On Mon, Nov 09, 2020 at 02:50:27PM +0200, Oleksandr Andrushchenko wrote:
>>> From: Oleksandr Andrushchenko 
>>> +static uint32_t bar_read_dispatch(const struct pci_dev *pdev, unsigned 
>>> int reg,
>>> +  void *data)
>>> +{
>>> +struct vpci_bar *vbar, *bar = data;
>>> +
>>> +if ( is_hardware_domain(current->domain) )
>>> +return bar_read_hwdom(pdev, reg, data);
>>> +
>>> +vbar = get_vpci_bar(current->domain, pdev, bar->index);
>>> +if ( !vbar )
>>> +return ~0;
>>> +
>>> +return bar_read_guest(pdev, reg, vbar);
>>> +}
>>> +
>>> +static void bar_write_dispatch(const struct pci_dev *pdev, unsigned 
>>> int reg,
>>> +   uint32_t val, void *data)
>>> +{
>>> +struct vpci_bar *bar = data;
>>> +
>>> +if ( is_hardware_domain(current->domain) )
>>> +bar_write_hwdom(pdev, reg, val, data);
>>> +else
>>> +{
>>> +struct vpci_bar *vbar = get_vpci_bar(current->domain, pdev, 
>>> bar->index);
>>> +
>>> +if ( !vbar )
>>> +return;
>>> +bar_write_guest(pdev, reg, val, vbar);
>>> +}
>>> +}
>> You should assign different handlers based on whether the domain that
>> has the device assigned is a domU or the hardware domain, rather than
>> doing the selection here.
> Hm, handlers are assigned once in init_bars and this function is only 
> called
>
> for hwdom, so there is no way I can do that for the guests. Hence, the 
> dispatcher
 I think we might want to reset the vPCI handlers when a devices gets
 assigned and deassigned.
>>>
>>> In the ARM case init_bars is called too early: PCI device assignment is 
>>> currently
>>>
>>> initiated by Domain-0's kernel and is done *before* PCI devices are given 
>>> memory
>>>
>>> ranges and BARs assigned:
>>>
>>> [    0.429514] pci_bus :00: root bus resource [bus 00-ff]
>>> [    0.429532] pci_bus :00: root bus resource [io 0x-0xf]
>>> [    0.429555] pci_bus :00: root bus resource [mem 
>>> 0xfe20-0xfe3f]
>>> [    0.429575] pci_bus :00: root bus resource [mem 
>>> 0x3000-0x37ff]
>>> [    0.429604] pci_bus :00: root bus resource [mem 
>>> 0x3800-0x3fff pref]
>>> [    0.429670] pci :00:00.0: enabling Extended Tags
>>> [    0.453764] pci :00:00.0:  BUS_NOTIFY_ADD_DEVICE
>>>
>>> < init_bars >
>>>
>>> [    0.453793] pci :00:00.0: -- IRQ 0
>>> [    0.458825] pci :00:00.0: Failed to add - passthrough or MSI/MSI-X 
>>> might fail!
>>> [    0.471790] pci :01:00.0:  BUS_NOTIFY_ADD_DEVICE
>>>
>>> < init_bars >
>>>
>>> [    0.471821] pci :01:00.0: -- IRQ 255
>>> [    0.476809] pci :01:00.0: Failed to add - passthrough or MSI/MSI-X 
>>> might fail!
>>>
>>> < BAR assignments below >
>>>
>>> [    0.488233] pci :00:00.0: BAR 14: assigned [mem 
>>> 0xfe20-0xfe2f]
>>> [    0.488265] pci :00:00.0: BAR 15: assigned [mem 
>>> 0x3800-0x380f pref]
>>>
>>> In case of x86 this is pretty much ok as BARs are already in place, but for 
>>> ARM we
>>>
>>> need to take care and re-setup vPCI BARs for hwdom.
>>
>> Even on x86 there's no guarantee that all devices have their BARs set
>> up by firmware.
>>
>> In a subsequent reply you've suggested to move init_bars from "add" to
>> "assign", but I'm having trouble seeing what this would change: It's
>> not Dom0 controlling assignment (to itself), but Xen assigns the device
>> towards the end of pci_add_device().
>>
>>> Things are getting even more
>>>
>>> complicated if the host PCI bridge is not ECAM like, so you cannot set 
>>> mmio_handlers
>>>
>>> and trap hwdom's access to the config space to update BARs etc. This is why 
>>> I have that
>>>
>>> ugly hack for rcar_gen3 to read actual BARs for hwdom.
>>
>> How do config space accesses work there? At the latest for MSI/MSI-X it'll
>> be imperative that Xen be able to intercept config space writes.
> 
> I am not sure I understand your last sentence. Are you saying that we 
> always need to trap accesses to the MSI/MSI-X message in order to sanitize it?
> 
> If one is using the GICv3 ITS (I haven't investigated other MSI 
> controllers), then I don't believe you need to sanitize the MSI/MSI-X 
> message in most situations.

Well, if it's fine for the guest to write arbitrary values to message
address and message data, _and_ to arbitrarily enable/disable MSI / MSI-X,
then yes, no interception would be needed.

Jan



Re: [PATCH 06/10] vpci: Make every domain handle its own BARs

2020-11-13 Thread Jan Beulich
On 13.11.2020 11:46, Oleksandr Andrushchenko wrote:
> On 11/13/20 12:25 PM, Jan Beulich wrote:
>> On 13.11.2020 07:32, Oleksandr Andrushchenko wrote:
>>> On 11/12/20 4:46 PM, Roger Pau Monné wrote:
 On Thu, Nov 12, 2020 at 01:16:10PM +, Oleksandr Andrushchenko wrote:
> On 11/12/20 11:40 AM, Roger Pau Monné wrote:
>> On Mon, Nov 09, 2020 at 02:50:27PM +0200, Oleksandr Andrushchenko wrote:
>>> From: Oleksandr Andrushchenko 
>>> +static uint32_t bar_read_dispatch(const struct pci_dev *pdev, unsigned 
>>> int reg,
>>> +  void *data)
>>> +{
>>> +struct vpci_bar *vbar, *bar = data;
>>> +
>>> +if ( is_hardware_domain(current->domain) )
>>> +return bar_read_hwdom(pdev, reg, data);
>>> +
>>> +vbar = get_vpci_bar(current->domain, pdev, bar->index);
>>> +if ( !vbar )
>>> +return ~0;
>>> +
>>> +return bar_read_guest(pdev, reg, vbar);
>>> +}
>>> +
>>> +static void bar_write_dispatch(const struct pci_dev *pdev, unsigned 
>>> int reg,
>>> +   uint32_t val, void *data)
>>> +{
>>> +struct vpci_bar *bar = data;
>>> +
>>> +if ( is_hardware_domain(current->domain) )
>>> +bar_write_hwdom(pdev, reg, val, data);
>>> +else
>>> +{
>>> +struct vpci_bar *vbar = get_vpci_bar(current->domain, pdev, 
>>> bar->index);
>>> +
>>> +if ( !vbar )
>>> +return;
>>> +bar_write_guest(pdev, reg, val, vbar);
>>> +}
>>> +}
>> You should assign different handlers based on whether the domain that
>> has the device assigned is a domU or the hardware domain, rather than
>> doing the selection here.
> Hm, handlers are assigned once in init_bars and this function is only 
> called
>
> for hwdom, so there is no way I can do that for the guests. Hence, the 
> dispatcher
 I think we might want to reset the vPCI handlers when a devices gets
 assigned and deassigned.
>>> In the ARM case init_bars is called too early: PCI device assignment is 
>>> currently
>>>
>>> initiated by Domain-0's kernel and is done *before* PCI devices are given 
>>> memory
>>>
>>> ranges and BARs assigned:
>>>
>>> [    0.429514] pci_bus :00: root bus resource [bus 00-ff]
>>> [    0.429532] pci_bus :00: root bus resource [io 0x-0xf]
>>> [    0.429555] pci_bus :00: root bus resource [mem 
>>> 0xfe20-0xfe3f]
>>> [    0.429575] pci_bus :00: root bus resource [mem 
>>> 0x3000-0x37ff]
>>> [    0.429604] pci_bus :00: root bus resource [mem 
>>> 0x3800-0x3fff pref]
>>> [    0.429670] pci :00:00.0: enabling Extended Tags
>>> [    0.453764] pci :00:00.0:  BUS_NOTIFY_ADD_DEVICE
>>>
>>> < init_bars >
>>>
>>> [    0.453793] pci :00:00.0: -- IRQ 0
>>> [    0.458825] pci :00:00.0: Failed to add - passthrough or MSI/MSI-X 
>>> might fail!
>>> [    0.471790] pci :01:00.0:  BUS_NOTIFY_ADD_DEVICE
>>>
>>> < init_bars >
>>>
>>> [    0.471821] pci :01:00.0: -- IRQ 255
>>> [    0.476809] pci :01:00.0: Failed to add - passthrough or MSI/MSI-X 
>>> might fail!
>>>
>>> < BAR assignments below >
>>>
>>> [    0.488233] pci :00:00.0: BAR 14: assigned [mem 
>>> 0xfe20-0xfe2f]
>>> [    0.488265] pci :00:00.0: BAR 15: assigned [mem 
>>> 0x3800-0x380f pref]
>>>
>>> In case of x86 this is pretty much ok as BARs are already in place, but for 
>>> ARM we
>>>
>>> need to take care and re-setup vPCI BARs for hwdom.
>> Even on x86 there's no guarantee that all devices have their BARs set
>> up by firmware.
> 
> This is true. But there config space accesses could be trapped in the generic
> x86 way,
> 
> please correct me if I'm wrong here
> 
>>
>> In a subsequent reply you've suggested to move init_bars from "add" to
>> "assign", but I'm having trouble seeing what this would change: It's
>> not Dom0 controlling assignment (to itself), but Xen assigns the device
>> towards the end of pci_add_device().
> 
> PHYSDEVOP_pci_device_add vs XEN_DOMCTL_assign_device
> 
> Currently we initialize BARs during PHYSDEVOP_pci_device_add and
> 
> if we do that during XEN_DOMCTL_assign_device things seem to change

But there can't possibly be any XEN_DOMCTL_assign_device involved in
booting of Dom0. 

In order to do passthrough to domUs safely
 we will have to add more handlers than what's required for dom0,
>>> Can you please tell what you are thinking about? What are the missing handlers?
and
 having is_hardware_domain sprinkled in all of them is not a suitable
 solution.
>>> I'll try to replace is_hardware_domain with something like:
>>>
>>> +/*
>>> + * Detect whether physical PCI devices in this segment belong
>>> + * to the domain given, e.g. on x86 all PCI devices live in hwdom,
>>> + * but in case of ARM this might not be the 

Re: [PATCH 08/10] vpci/arm: Allow updating BAR's header for non-ECAM bridges

2020-11-13 Thread Jan Beulich
On 13.11.2020 11:39, Oleksandr Andrushchenko wrote:
> 
> On 11/13/20 12:29 PM, Jan Beulich wrote:
>> On 09.11.2020 13:50, Oleksandr Andrushchenko wrote:
>>> From: Oleksandr Andrushchenko 
>>>
>>> Non-ECAM host bridges in hwdom go directly to PCI config space,
>>> not through vpci (they use their specific method for accessing PCI
>>> configuration, e.g. dedicated registers etc.).
>> And access to these dedicated registers can't be intercepted?
> 
> It can. But then you have to fully emulate that bridge, e.g.
> 
> "if we write A to regB and after that write C to regZ then it
> 
> means we are accessing config space. If we write"

Sounds pretty much like the I/O port based access mechanism on
x86, which also has some sort of "enable". Of course, I/O port
accesses are particularly easy to intercept and handle...

> I mean this would need lots of code in Xen to achieve that

Possibly, but look at the amount of code we have in Xen on the
x86 side to handle MCFG writes by Dom0.

Jan
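
The I/O port based mechanism Jan compares this to is the classic x86 "Configuration
Mechanism #1": a write to an address register with an enable bit set selects
bus/device/function/register, and a subsequent access to a data register reaches
config space. A minimal model (an illustrative sketch, not Xen code; the names
`cfg_mech`, `addr_write` and `data_access_decodes` are hypothetical) shows why
intercepting both registers is enough to reconstruct such two-step sequences:

```c
#include <stdint.h>
#include <stdbool.h>

/*
 * Hypothetical model of an x86-style indirect config access mechanism
 * (address register with an enable bit, followed by a data register
 * access).  An emulator intercepting both registers can track the
 * "write A to regB, then access regZ" sequence discussed above.
 */
#define CFG_ENABLE   (1u << 31)

struct cfg_mech {
    uint32_t addr;      /* last value written to the address register */
};

static void addr_write(struct cfg_mech *m, uint32_t val)
{
    m->addr = val;
}

/* Returns true iff a data-register access would reach config space. */
static bool data_access_decodes(const struct cfg_mech *m,
                                unsigned int *bus, unsigned int *dev,
                                unsigned int *fn, unsigned int *reg)
{
    if ( !(m->addr & CFG_ENABLE) )
        return false;               /* mechanism not enabled */
    *bus = (m->addr >> 16) & 0xff;
    *dev = (m->addr >> 11) & 0x1f;
    *fn  = (m->addr >> 8) & 0x7;
    *reg = m->addr & 0xfc;          /* dword-aligned register offset */
    return true;
}
```

A bridge using dedicated registers would need a similar, but SoC-specific,
state machine; the point of contention above is how much of that Xen should carry.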



Re: [PATCH 06/10] vpci: Make every domain handle its own BARs

2020-11-13 Thread Oleksandr Andrushchenko

On 11/13/20 12:25 PM, Jan Beulich wrote:
> On 13.11.2020 07:32, Oleksandr Andrushchenko wrote:
>> On 11/12/20 4:46 PM, Roger Pau Monné wrote:
>>> On Thu, Nov 12, 2020 at 01:16:10PM +, Oleksandr Andrushchenko wrote:
 On 11/12/20 11:40 AM, Roger Pau Monné wrote:
> On Mon, Nov 09, 2020 at 02:50:27PM +0200, Oleksandr Andrushchenko wrote:
>> From: Oleksandr Andrushchenko 
>> +static uint32_t bar_read_dispatch(const struct pci_dev *pdev, unsigned 
>> int reg,
>> +  void *data)
>> +{
>> +struct vpci_bar *vbar, *bar = data;
>> +
>> +if ( is_hardware_domain(current->domain) )
>> +return bar_read_hwdom(pdev, reg, data);
>> +
>> +vbar = get_vpci_bar(current->domain, pdev, bar->index);
>> +if ( !vbar )
>> +return ~0;
>> +
>> +return bar_read_guest(pdev, reg, vbar);
>> +}
>> +
>> +static void bar_write_dispatch(const struct pci_dev *pdev, unsigned int 
>> reg,
>> +   uint32_t val, void *data)
>> +{
>> +struct vpci_bar *bar = data;
>> +
>> +if ( is_hardware_domain(current->domain) )
>> +bar_write_hwdom(pdev, reg, val, data);
>> +else
>> +{
>> +struct vpci_bar *vbar = get_vpci_bar(current->domain, pdev, 
>> bar->index);
>> +
>> +if ( !vbar )
>> +return;
>> +bar_write_guest(pdev, reg, val, vbar);
>> +}
>> +}
> You should assign different handlers based on whether the domain that
> has the device assigned is a domU or the hardware domain, rather than
> doing the selection here.
 Hm, handlers are assigned once in init_bars and this function is only 
 called

 for hwdom, so there is no way I can do that for the guests. Hence, the 
 dispatcher
>>> I think we might want to reset the vPCI handlers when a devices gets
>>> assigned and deassigned.
>> In the ARM case init_bars is called too early: PCI device assignment is currently
>>
>> initiated by Domain-0's kernel and is done *before* PCI devices are given 
>> memory
>>
>> ranges and BARs assigned:
>>
>> [    0.429514] pci_bus :00: root bus resource [bus 00-ff]
>> [    0.429532] pci_bus :00: root bus resource [io 0x-0xf]
>> [    0.429555] pci_bus :00: root bus resource [mem 0xfe20-0xfe3f]
>> [    0.429575] pci_bus :00: root bus resource [mem 0x3000-0x37ff]
>> [    0.429604] pci_bus :00: root bus resource [mem 0x3800-0x3fff 
>> pref]
>> [    0.429670] pci :00:00.0: enabling Extended Tags
>> [    0.453764] pci :00:00.0:  BUS_NOTIFY_ADD_DEVICE
>>
>> < init_bars >
>>
>> [    0.453793] pci :00:00.0: -- IRQ 0
>> [    0.458825] pci :00:00.0: Failed to add - passthrough or MSI/MSI-X 
>> might fail!
>> [    0.471790] pci :01:00.0:  BUS_NOTIFY_ADD_DEVICE
>>
>> < init_bars >
>>
>> [    0.471821] pci :01:00.0: -- IRQ 255
>> [    0.476809] pci :01:00.0: Failed to add - passthrough or MSI/MSI-X 
>> might fail!
>>
>> < BAR assignments below >
>>
>> [    0.488233] pci :00:00.0: BAR 14: assigned [mem 0xfe20-0xfe2f]
>> [    0.488265] pci :00:00.0: BAR 15: assigned [mem 0x3800-0x380f 
>> pref]
>>
>> In case of x86 this is pretty much ok as BARs are already in place, but for 
>> ARM we
>>
>> need to take care and re-setup vPCI BARs for hwdom.
> Even on x86 there's no guarantee that all devices have their BARs set
> up by firmware.

This is true. But there config space accesses could be trapped in the generic
x86 way,

please correct me if I'm wrong here

>
> In a subsequent reply you've suggested to move init_bars from "add" to
> "assign", but I'm having trouble seeing what this would change: It's
> not Dom0 controlling assignment (to itself), but Xen assigns the device
> towards the end of pci_add_device().

PHYSDEVOP_pci_device_add vs XEN_DOMCTL_assign_device

Currently we initialize BARs during PHYSDEVOP_pci_device_add and

if we do that during XEN_DOMCTL_assign_device things seem to change

>
>> Things are getting even more
>>
>> complicated if the host PCI bridge is not ECAM like, so you cannot set 
>> mmio_handlers
>>
>> and trap hwdom's access to the config space to update BARs etc. This is why 
>> I have that
>>
>> ugly hack for rcar_gen3 to read actual BARs for hwdom.
> How do config space accesses work there? At the latest for MSI/MSI-X it'll
> be imperative that Xen be able to intercept config space writes.
>
>>>In order to do passthrough to domUs safely
>>> we will have to add more handlers than what's required for dom0,
>> Can you please tell what you are thinking about? What are the missing handlers?
>>>and
>>> having is_hardware_domain sprinkled in all of them is not a suitable
>>> solution.
>> I'll try to replace is_hardware_domain with something like:
>>
>> +/*
>> + * Detect whether physical PCI devices in 

Re: [PATCH 08/10] vpci/arm: Allow updating BAR's header for non-ECAM bridges

2020-11-13 Thread Oleksandr Andrushchenko

On 11/13/20 12:29 PM, Jan Beulich wrote:
> On 09.11.2020 13:50, Oleksandr Andrushchenko wrote:
>> From: Oleksandr Andrushchenko 
>>
>> Non-ECAM host bridges in hwdom go directly to PCI config space,
>> not through vpci (they use their specific method for accessing PCI
>> configuration, e.g. dedicated registers etc.).
> And access to these dedicated registers can't be intercepted?

It can. But then you have to fully emulate that bridge, e.g.

"if we write A to regB and after that write C to regZ then it

means we are accessing config space. If we write"

I mean this would need lots of code in Xen to achieve that

>   It
> would seem to me that if so, such a platform is not capable of
> being virtualized (without cooperation by all the domains in
> possession of devices).

Guest domains always use an emulated ECAM bridge, so their accesses are easily

trapped and emulated

>
> Jan
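
Why an emulated ECAM bridge is easy to trap: ECAM maps every config register to
a fixed linear MMIO offset, so one trap handler over the emulated window can
decode any access. A short sketch (illustrative only, not the patch's code; the
function name `ecam_decode` is hypothetical):

```c
#include <stdint.h>

/*
 * Decode an ECAM MMIO offset into bus/device/function/register.
 * The ECAM layout is a fixed linear mapping:
 *   offset = bus << 20 | device << 15 | function << 12 | register
 * so a single MMIO trap over the emulated window suffices.
 */
static void ecam_decode(uint64_t offset, unsigned int *bus,
                        unsigned int *dev, unsigned int *fn,
                        unsigned int *reg)
{
    *bus = (offset >> 20) & 0xff;
    *dev = (offset >> 15) & 0x1f;
    *fn  = (offset >> 12) & 0x7;
    *reg = offset & 0xfff;
}
```

Non-ECAM host bridges lack this property, which is what forces the SoC-specific
handling debated in this thread.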

Re: [PATCH 06/10] vpci: Make every domain handle its own BARs

2020-11-13 Thread Julien Grall




On 13/11/2020 10:25, Jan Beulich wrote:

On 13.11.2020 07:32, Oleksandr Andrushchenko wrote:

On 11/12/20 4:46 PM, Roger Pau Monné wrote:

On Thu, Nov 12, 2020 at 01:16:10PM +, Oleksandr Andrushchenko wrote:

On 11/12/20 11:40 AM, Roger Pau Monné wrote:

On Mon, Nov 09, 2020 at 02:50:27PM +0200, Oleksandr Andrushchenko wrote:

From: Oleksandr Andrushchenko 
+static uint32_t bar_read_dispatch(const struct pci_dev *pdev, unsigned int reg,
+  void *data)
+{
+struct vpci_bar *vbar, *bar = data;
+
+if ( is_hardware_domain(current->domain) )
+return bar_read_hwdom(pdev, reg, data);
+
+vbar = get_vpci_bar(current->domain, pdev, bar->index);
+if ( !vbar )
+return ~0;
+
+return bar_read_guest(pdev, reg, vbar);
+}
+
+static void bar_write_dispatch(const struct pci_dev *pdev, unsigned int reg,
+   uint32_t val, void *data)
+{
+struct vpci_bar *bar = data;
+
+if ( is_hardware_domain(current->domain) )
+bar_write_hwdom(pdev, reg, val, data);
+else
+{
+struct vpci_bar *vbar = get_vpci_bar(current->domain, pdev, 
bar->index);
+
+if ( !vbar )
+return;
+bar_write_guest(pdev, reg, val, vbar);
+}
+}

You should assign different handlers based on whether the domain that
has the device assigned is a domU or the hardware domain, rather than
doing the selection here.

Hm, handlers are assigned once in init_bars and this function is only called

for hwdom, so there is no way I can do that for the guests. Hence, the 
dispatcher

I think we might want to reset the vPCI handlers when a devices gets
assigned and deassigned.


In the ARM case init_bars is called too early: PCI device assignment is currently

initiated by Domain-0's kernel and is done *before* PCI devices are given memory

ranges and BARs assigned:

[    0.429514] pci_bus :00: root bus resource [bus 00-ff]
[    0.429532] pci_bus :00: root bus resource [io 0x-0xf]
[    0.429555] pci_bus :00: root bus resource [mem 0xfe20-0xfe3f]
[    0.429575] pci_bus :00: root bus resource [mem 0x3000-0x37ff]
[    0.429604] pci_bus :00: root bus resource [mem 0x3800-0x3fff 
pref]
[    0.429670] pci :00:00.0: enabling Extended Tags
[    0.453764] pci :00:00.0:  BUS_NOTIFY_ADD_DEVICE

< init_bars >

[    0.453793] pci :00:00.0: -- IRQ 0
[    0.458825] pci :00:00.0: Failed to add - passthrough or MSI/MSI-X might 
fail!
[    0.471790] pci :01:00.0:  BUS_NOTIFY_ADD_DEVICE

< init_bars >

[    0.471821] pci :01:00.0: -- IRQ 255
[    0.476809] pci :01:00.0: Failed to add - passthrough or MSI/MSI-X might 
fail!

< BAR assignments below >

[    0.488233] pci :00:00.0: BAR 14: assigned [mem 0xfe20-0xfe2f]
[    0.488265] pci :00:00.0: BAR 15: assigned [mem 0x3800-0x380f 
pref]

In case of x86 this is pretty much ok as BARs are already in place, but for ARM 
we

need to take care and re-setup vPCI BARs for hwdom.


Even on x86 there's no guarantee that all devices have their BARs set
up by firmware.

In a subsequent reply you've suggested to move init_bars from "add" to
"assign", but I'm having trouble seeing what this would change: It's
not Dom0 controlling assignment (to itself), but Xen assigns the device
towards the end of pci_add_device().


Things are getting even more

complicated if the host PCI bridge is not ECAM like, so you cannot set 
mmio_handlers

and trap hwdom's access to the config space to update BARs etc. This is why I 
have that

ugly hack for rcar_gen3 to read actual BARs for hwdom.


How do config space accesses work there? At the latest for MSI/MSI-X it'll
be imperative that Xen be able to intercept config space writes.


I am not sure I understand your last sentence. Are you saying that we 
always need to trap accesses to the MSI/MSI-X message in order to sanitize it?


If one is using the GICv3 ITS (I haven't investigated other MSI 
controllers), then I don't believe you need to sanitize the MSI/MSI-X 
message in most situations.


Cheers,

--
Julien Grall



Re: [PATCH 08/10] vpci/arm: Allow updating BAR's header for non-ECAM bridges

2020-11-13 Thread Jan Beulich
On 09.11.2020 13:50, Oleksandr Andrushchenko wrote:
> From: Oleksandr Andrushchenko 
> 
> Non-ECAM host bridges in hwdom go directly to PCI config space,
> not through vpci (they use their specific method for accessing PCI
> configuration, e.g. dedicated registers etc.).

And access to these dedicated registers can't be intercepted? It
would seem to me that if so, such a platform is not capable of
being virtualized (without cooperation by all the domains in
possession of devices).

Jan



Re: [PATCH 06/10] vpci: Make every domain handle its own BARs

2020-11-13 Thread Jan Beulich
On 13.11.2020 07:32, Oleksandr Andrushchenko wrote:
> On 11/12/20 4:46 PM, Roger Pau Monné wrote:
>> On Thu, Nov 12, 2020 at 01:16:10PM +, Oleksandr Andrushchenko wrote:
>>> On 11/12/20 11:40 AM, Roger Pau Monné wrote:
 On Mon, Nov 09, 2020 at 02:50:27PM +0200, Oleksandr Andrushchenko wrote:
> From: Oleksandr Andrushchenko 
> +static uint32_t bar_read_dispatch(const struct pci_dev *pdev, unsigned 
> int reg,
> +  void *data)
> +{
> +struct vpci_bar *vbar, *bar = data;
> +
> +if ( is_hardware_domain(current->domain) )
> +return bar_read_hwdom(pdev, reg, data);
> +
> +vbar = get_vpci_bar(current->domain, pdev, bar->index);
> +if ( !vbar )
> +return ~0;
> +
> +return bar_read_guest(pdev, reg, vbar);
> +}
> +
> +static void bar_write_dispatch(const struct pci_dev *pdev, unsigned int 
> reg,
> +   uint32_t val, void *data)
> +{
> +struct vpci_bar *bar = data;
> +
> +if ( is_hardware_domain(current->domain) )
> +bar_write_hwdom(pdev, reg, val, data);
> +else
> +{
> +struct vpci_bar *vbar = get_vpci_bar(current->domain, pdev, 
> bar->index);
> +
> +if ( !vbar )
> +return;
> +bar_write_guest(pdev, reg, val, vbar);
> +}
> +}
 You should assign different handlers based on whether the domain that
 has the device assigned is a domU or the hardware domain, rather than
 doing the selection here.
>>> Hm, handlers are assigned once in init_bars and this function is only called
>>>
>>> for hwdom, so there is no way I can do that for the guests. Hence, the 
>>> dispatcher
>> I think we might want to reset the vPCI handlers when a devices gets
>> assigned and deassigned.
> 
> In the ARM case init_bars is called too early: PCI device assignment is currently
> 
> initiated by Domain-0's kernel and is done *before* PCI devices are given 
> memory
> 
> ranges and BARs assigned:
> 
> [    0.429514] pci_bus :00: root bus resource [bus 00-ff]
> [    0.429532] pci_bus :00: root bus resource [io 0x-0xf]
> [    0.429555] pci_bus :00: root bus resource [mem 0xfe20-0xfe3f]
> [    0.429575] pci_bus :00: root bus resource [mem 0x3000-0x37ff]
> [    0.429604] pci_bus :00: root bus resource [mem 0x3800-0x3fff 
> pref]
> [    0.429670] pci :00:00.0: enabling Extended Tags
> [    0.453764] pci :00:00.0:  BUS_NOTIFY_ADD_DEVICE
> 
> < init_bars >
> 
> [    0.453793] pci :00:00.0: -- IRQ 0
> [    0.458825] pci :00:00.0: Failed to add - passthrough or MSI/MSI-X 
> might fail!
> [    0.471790] pci :01:00.0:  BUS_NOTIFY_ADD_DEVICE
> 
> < init_bars >
> 
> [    0.471821] pci :01:00.0: -- IRQ 255
> [    0.476809] pci :01:00.0: Failed to add - passthrough or MSI/MSI-X 
> might fail!
> 
> < BAR assignments below >
> 
> [    0.488233] pci :00:00.0: BAR 14: assigned [mem 0xfe20-0xfe2f]
> [    0.488265] pci :00:00.0: BAR 15: assigned [mem 0x3800-0x380f 
> pref]
> 
> In case of x86 this is pretty much ok as BARs are already in place, but for 
> ARM we
> 
> need to take care and re-setup vPCI BARs for hwdom.

Even on x86 there's no guarantee that all devices have their BARs set
up by firmware.

In a subsequent reply you've suggested to move init_bars from "add" to
"assign", but I'm having trouble seeing what this would change: It's
not Dom0 controlling assignment (to itself), but Xen assigns the device
towards the end of pci_add_device().

> Things are getting even more
> 
> complicated if the host PCI bridge is not ECAM like, so you cannot set 
> mmio_handlers
> 
> and trap hwdom's access to the config space to update BARs etc. This is why I 
> have that
> 
> ugly hack for rcar_gen3 to read actual BARs for hwdom.

How do config space accesses work there? At the latest for MSI/MSI-X it'll
be imperative that Xen be able to intercept config space writes.

>>   In order to do passthrough to domUs safely
>> we will have to add more handlers than what's required for dom0,
> Can you please tell what you are thinking about? What are the missing handlers?
>>   and
>> having is_hardware_domain sprinkled in all of them is not a suitable
>> solution.
> 
> I'll try to replace is_hardware_domain with something like:
> 
> +/*
> + * Detect whether physical PCI devices in this segment belong
> + * to the domain given, e.g. on x86 all PCI devices live in hwdom,
> + * but in case of ARM this might not be the case: those may also
> + * live in driver domains or even Xen itself.
> + */
> +bool pci_is_hardware_domain(struct domain *d, u16 seg)
> +{
> +#ifdef CONFIG_X86
> +    return is_hardware_domain(d);
> +#elif CONFIG_ARM
> +    return pci_is_owner_domain(d, seg);
> +#else
> +#error "Unsupported architecture"
> +#endif
> +}
> +
> 
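
Roger's suggestion earlier in the thread, that handlers be chosen when the
device is assigned rather than via an is_hardware_domain() check on every
access, can be modeled in a few lines. This is a toy sketch, not Xen code: the
names `vdev`, `assign_device` and the two read handlers are hypothetical, and a
real implementation would key off something like the pci_is_hardware_domain()
helper proposed above:

```c
#include <stdint.h>
#include <stdbool.h>

/*
 * Toy model of per-assignment handler selection: the BAR read handler
 * is picked once, at device assignment time, so the fast path carries
 * no per-access domain check.
 */
typedef uint32_t (*bar_read_t)(unsigned int reg);

/* Distinct return patterns so the selection is observable below. */
static uint32_t bar_read_hwdom(unsigned int reg) { return 0xf0000000u | reg; }
static uint32_t bar_read_guest(unsigned int reg) { return 0x40000000u | reg; }

struct vdev {
    bar_read_t read;    /* chosen at assignment time */
};

static void assign_device(struct vdev *v, bool owner_is_hwdom)
{
    v->read = owner_is_hwdom ? bar_read_hwdom : bar_read_guest;
}
```

Deassignment would simply reinstall the other handler, which is what the
"reset the vPCI handlers when a device gets assigned and deassigned" remark
amounts to.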

Re: [PATCH 5/5] x86/p2m: split write_p2m_entry() hook

2020-11-13 Thread Jan Beulich
On 12.11.2020 18:52, Tim Deegan wrote:
> At 15:04 +0100 on 12 Nov (1605193496), Jan Beulich wrote:
>> On 12.11.2020 14:07, Roger Pau Monné wrote:
>>> On Thu, Nov 12, 2020 at 01:29:33PM +0100, Jan Beulich wrote:
 I agree with all this. If only it was merely about TLB flushes. In
 the shadow case, shadow_blow_all_tables() gets invoked, and that
 one - looking at the other call sites - wants the paging lock held.
> [...]
>>> The post hook for shadow could pick the lock again, as I don't think
>>> the removal of the tables needs to be strictly done inside of the same
>>> locked region?
>>
>> I think it does, or else a use of the now stale tables may occur
>> before they got blown away. Tim?
> 
> Is this the call to shadow_blow_tables() in the write_p2m_entry path?

Yes.

> I think it would be safe to drop and re-take the paging lock there as
> long as the call happens before the write is considered to have
> finished.
> 
> But it would not be a useful performance improvement - any update that
> takes this path is going to be very slow regardless.  So unless you
> have another pressing reason to split it up, I would be inclined to
> leave it as it is.  That way it's easier to see that the locking is
> correct.

Thanks for the clarification.

Roger - what's your view at this point?

Jan



Re: dom0 PVH: 'entry->arch.pirq != INVALID_PIRQ' failed at vmsi.c:843

2020-11-13 Thread Manuel Bouyer
On Thu, Nov 12, 2020 at 09:19:39PM +0100, Roger Pau Monné wrote:
> 
> The following might be able to get you going, but I think I need to
> refine the logic a bit there, will have to give it some thought.

Thanks. It avoids the panic, but it seems that MSI and MSI-X interrupts are
not delivered to the dom0 kernel. The counters stay at 0.
I get some IOAPIC interrupts though, as well as some Xen events.

-- 
Manuel Bouyer 
 NetBSD: 26 years of experience will always make the difference
--