[Xen-devel] [linux-next test] 131060: regressions - FAIL

2018-12-06 Thread osstest service owner
flight 131060 linux-next real [real]
http://logs.test-lab.xenproject.org/osstest/logs/131060/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-shadow 7 xen-boot fail REGR. vs. 130908
 test-amd64-amd64-xl-pvshim 7 xen-boot fail REGR. vs. 130908
 test-amd64-amd64-libvirt  7 xen-boot fail REGR. vs. 130908
 test-amd64-i386-xl-qemuu-debianhvm-amd64-shadow 7 xen-boot fail REGR. vs. 130908
 test-amd64-amd64-xl-qemuu-ovmf-amd64  7 xen-boot fail REGR. vs. 130908
 build-amd64-xsm   6 xen-build fail REGR. vs. 130908
 test-armhf-armhf-examine  8 reboot   fail REGR. vs. 130908
 test-armhf-armhf-xl-credit1   7 xen-boot fail REGR. vs. 130908
 test-armhf-armhf-xl-cubietruck  7 xen-boot   fail REGR. vs. 130908
 test-armhf-armhf-xl-multivcpu  7 xen-boot fail REGR. vs. 130908
 test-armhf-armhf-xl   7 xen-boot fail REGR. vs. 130908
 test-armhf-armhf-xl-vhd  10 debian-di-install fail REGR. vs. 130908

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm 1 build-check(1) blocked n/a
 test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm  1 build-check(1) blocked n/a
 test-amd64-i386-libvirt-xsm   1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-xsm  1 build-check(1) blocked n/a
 test-amd64-amd64-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-xsm 1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemut-debianhvm-amd64-xsm  1 build-check(1) blocked n/a
 test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm 1 build-check(1) blocked n/a
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a
 test-amd64-i386-xl-qemut-debianhvm-amd64-xsm  1 build-check(1) blocked n/a
 test-amd64-amd64-xl-xsm   1 build-check(1)   blocked  n/a
 test-amd64-amd64-rumprun-amd64 17 rumprun-demo-xenstorels/xenstorels.repeat fail blocked in 130908
 test-armhf-armhf-libvirt  7 xen-boot fail blocked in 130908
 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stop  fail blocked in 130908
 test-armhf-armhf-libvirt-raw 13 saverestore-support-check fail blocked in 130908
 test-amd64-i386-xl-qemuu-debianhvm-amd64 10 debian-hvm-install fail like 130908
 test-amd64-i386-qemuu-rhel6hvm-intel 10 redhat-install fail like 130908
 test-amd64-i386-xl-qemuu-ovmf-amd64 10 debian-hvm-install fail like 130908
 test-amd64-i386-freebsd10-i386 11 guest-start fail like 130908
 test-amd64-i386-xl-qemuu-ws16-amd64 10 windows-install fail like 130908
 test-amd64-i386-xl-qemuu-win7-amd64 10 windows-install fail like 130908
 test-amd64-amd64-pygrub   7 xen-boot fail  like 130908
 test-amd64-amd64-pair 10 xen-boot/src_host fail  like 130908
 test-amd64-amd64-pair 11 xen-boot/dst_host fail  like 130908
 test-amd64-amd64-libvirt-pair 10 xen-boot/src_host fail like 130908
 test-amd64-amd64-libvirt-pair 11 xen-boot/dst_host fail like 130908
 test-amd64-i386-examine   8 reboot   fail  like 130908
 test-amd64-i386-qemut-rhel6hvm-intel  7 xen-boot  fail like 130908
 test-amd64-i386-pair 10 xen-boot/src_host fail  like 130908
 test-amd64-i386-pair 11 xen-boot/dst_host fail  like 130908
 test-amd64-i386-qemuu-rhel6hvm-amd 10 redhat-install  fail like 130908
 test-amd64-i386-libvirt-pair 10 xen-boot/src_host fail  like 130908
 test-amd64-i386-libvirt-pair 11 xen-boot/dst_host fail  like 130908
 test-amd64-amd64-examine  8 reboot   fail  like 130908
 test-amd64-i386-freebsd10-amd64 11 guest-start fail like 130908
 test-amd64-amd64-xl-qemut-win7-amd64 17 guest-stop fail like 130908
 test-amd64-amd64-xl-qemuu-dmrestrict-amd64-dmrestrict 10 debian-hvm-install fail like 130908
 test-amd64-amd64-xl-qemuu-ws16-amd64 17 guest-stop fail like 130908
 test-amd64-i386-xl-qemut-win7-amd64 17 guest-stop fail like 130908
 test-amd64-amd64-xl-qemut-ws16-amd64 17 guest-stop fail like 130908
 test-amd64-i386-xl-qemuu-win10-i386 10 windows-install fail never pass
 test-amd64-i386-xl-qemuu-dmrestrict-amd64-dmrestrict 10 debian-hvm-install fail never pass
 test-amd64-i386-libvirt  13 migrate-support-check fail   never pass
 test-amd64-i386-xl-pvshim 12 guest-start  fail   never pass
 test-armhf-armhf-xl-arndale  13 migrate-support-check fail   never pass
 test-armhf-armhf-xl-arndale  14 saverestore-support-check 

[Xen-devel] [PATCH v6.1 09/20] xen: add basic hooks for PVH in current code

2018-12-06 Thread Juergen Gross
Add the hooks to current code needed for Xen PVH. They will be filled
with code later when the related functionality is being added.

loader/i386/linux.c needs to include machine/kernel.h now as it needs
to get GRUB_KERNEL_USE_RSDP_ADDR from there. This in turn requires
adding an empty kernel.h header for some i386 platforms (efi, coreboot,
ieee1275, xen).

Signed-off-by: Juergen Gross 
Reviewed-by: Daniel Kiper 
---
V3: xenpvh->xen_pvh (Daniel Kiper)
adjust copyright date (Roger Pau Monné)
V5: update commit message (Daniel Kiper)
move including xen/hvm/start_info.h to the sources really needing
  it (Daniel Kiper)
V6.1: add empty kernel.h headers for i386 platforms
It should be noted that i386_efi build is broken even without this
patch, but this is clearly beyond the scope of this series.
---
 grub-core/Makefile.am |  5 +
 grub-core/kern/i386/xen/pvh.c | 37 +++
 grub-core/kern/i386/xen/startup_pvh.S | 29 +++
 grub-core/kern/xen/init.c |  4 
 grub-core/loader/i386/linux.c |  1 +
 include/grub/i386/coreboot/kernel.h   |  0
 include/grub/i386/ieee1275/kernel.h   |  0
 include/grub/i386/xen/kernel.h|  0
 include/grub/i386/xen_pvh/kernel.h| 30 
 include/grub/xen.h|  5 +
 10 files changed, 111 insertions(+)
 create mode 100644 grub-core/kern/i386/xen/pvh.c
 create mode 100644 grub-core/kern/i386/xen/startup_pvh.S
 create mode 100644 include/grub/i386/coreboot/kernel.h
 create mode 100644 include/grub/i386/ieee1275/kernel.h
 create mode 100644 include/grub/i386/xen/kernel.h
 create mode 100644 include/grub/i386/xen_pvh/kernel.h

diff --git a/grub-core/Makefile.am b/grub-core/Makefile.am
index f4ff62b76..89433f498 100644
--- a/grub-core/Makefile.am
+++ b/grub-core/Makefile.am
@@ -102,6 +102,7 @@ KERNEL_HEADER_FILES += $(top_srcdir)/include/grub/i386/tsc.h
 endif
 
 if COND_i386_efi
+KERNEL_HEADER_FILES += $(top_builddir)/include/grub/machine/kernel.h
 KERNEL_HEADER_FILES += $(top_srcdir)/include/grub/efi/efi.h
 KERNEL_HEADER_FILES += $(top_srcdir)/include/grub/efi/disk.h
 KERNEL_HEADER_FILES += $(top_srcdir)/include/grub/i386/tsc.h
@@ -111,6 +112,7 @@ KERNEL_HEADER_FILES += $(top_srcdir)/include/grub/i386/pmtimer.h
 endif
 
 if COND_i386_coreboot
+KERNEL_HEADER_FILES += $(top_builddir)/include/grub/machine/kernel.h
 KERNEL_HEADER_FILES += $(top_srcdir)/include/grub/i386/tsc.h
 KERNEL_HEADER_FILES += $(top_srcdir)/include/grub/coreboot/lbio.h
 KERNEL_HEADER_FILES += $(top_srcdir)/include/grub/video.h
@@ -122,6 +124,7 @@ KERNEL_HEADER_FILES += $(top_srcdir)/include/grub/acpi.h
 endif
 
 if COND_i386_multiboot
+KERNEL_HEADER_FILES += $(top_builddir)/include/grub/machine/kernel.h
 KERNEL_HEADER_FILES += $(top_srcdir)/include/grub/i386/tsc.h
 KERNEL_HEADER_FILES += $(top_srcdir)/include/grub/acpi.h
 endif
@@ -132,6 +135,7 @@ KERNEL_HEADER_FILES += $(top_srcdir)/include/grub/i386/tsc.h
 endif
 
 if COND_i386_ieee1275
+KERNEL_HEADER_FILES += $(top_builddir)/include/grub/machine/kernel.h
 KERNEL_HEADER_FILES += $(top_srcdir)/include/grub/ieee1275/ieee1275.h
 KERNEL_HEADER_FILES += $(top_srcdir)/include/grub/terminfo.h
 KERNEL_HEADER_FILES += $(top_srcdir)/include/grub/extcmd.h
@@ -140,6 +144,7 @@ KERNEL_HEADER_FILES += $(top_srcdir)/include/grub/i386/tsc.h
 endif
 
 if COND_i386_xen
+KERNEL_HEADER_FILES += $(top_builddir)/include/grub/machine/kernel.h
 KERNEL_HEADER_FILES += $(top_srcdir)/include/grub/xen.h
 KERNEL_HEADER_FILES += $(top_srcdir)/include/grub/i386/xen/hypercall.h
 KERNEL_HEADER_FILES += $(top_srcdir)/include/grub/terminfo.h
diff --git a/grub-core/kern/i386/xen/pvh.c b/grub-core/kern/i386/xen/pvh.c
new file mode 100644
index 0..4f629b15e
--- /dev/null
+++ b/grub-core/kern/i386/xen/pvh.c
@@ -0,0 +1,37 @@
+/*
+ *  GRUB  --  GRand Unified Bootloader
+ *  Copyright (C) 2018  Free Software Foundation, Inc.
+ *
+ *  GRUB is free software: you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation, either version 3 of the License, or
+ *  (at your option) any later version.
+ *
+ *  GRUB is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with GRUB.  If not, see .
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+grub_uint64_t grub_rsdp_addr;
+
+void
+grub_xen_setup_pvh (void)
+{
+}
+
+grub_err_t
+grub_machine_mmap_iterate (grub_memory_hook_t hook, void *hook_data)
+{
+}
diff --git a/grub-core/kern/i386/xen/startup_pvh.S b/grub-core/kern/i386/xen/startup_pvh.S
new file mode 100644
index 

Re: [Xen-devel] [PATCH v8 1/7] xen/pvh: Split CONFIG_XEN_PVH into CONFIG_PVH and CONFIG_XEN_PVH

2018-12-06 Thread Juergen Gross
On 06/12/2018 23:11, Paolo Bonzini wrote:
> On 06/12/18 07:04, Maran Wilson wrote:
>> +config PVH
>> +bool "Support for running PVH guests"
>> +---help---
>> +  This option enables the PVH entry point for guest virtual machines
>> +  as specified in the x86/HVM direct boot ABI.
>> +
> 
> IIUC this breaks "normal" bzImage boot, so we should have something like
> 
>   The resulting kernel will not boot with most x86 boot loaders
>   such as GRUB or SYSLINUX.  Unless you plan to start the kernel
>   using QEMU or Xen, you probably want to say N here.

The resulting kernel should be able to be booted either in PVH mode
via the PVH entry point or the "normal" way via the still existing
old entry point(s).

It is an _additional_ way to boot the kernel, not an exclusive
alternative.


Juergen

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [xen-4.10-testing test] 131061: regressions - FAIL

2018-12-06 Thread osstest service owner
flight 131061 xen-4.10-testing real [real]
http://logs.test-lab.xenproject.org/osstest/logs/131061/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-i386-xl-qemut-ws16-amd64 17 guest-stop fail REGR. vs. 129676

Tests which did not succeed, but are not blocking:
 test-xtf-amd64-amd64-3   69 xtf/test-hvm64-xsa-278  fail blocked in 129676
 test-xtf-amd64-amd64-4   69 xtf/test-hvm64-xsa-278  fail blocked in 129676
 test-xtf-amd64-amd64-2   69 xtf/test-hvm64-xsa-278  fail blocked in 129676
 test-xtf-amd64-amd64-5   69 xtf/test-hvm64-xsa-278  fail blocked in 129676
 test-amd64-amd64-xl-qemuu-dmrestrict-amd64-dmrestrict 10 debian-hvm-install fail never pass
 test-amd64-i386-xl-qemuu-dmrestrict-amd64-dmrestrict 10 debian-hvm-install fail never pass
 test-amd64-i386-libvirt  13 migrate-support-check fail   never pass
 test-amd64-i386-libvirt-xsm  13 migrate-support-check fail   never pass
 test-amd64-amd64-libvirt 13 migrate-support-check fail   never pass
 test-amd64-amd64-libvirt-xsm 13 migrate-support-check fail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass
 test-armhf-armhf-xl-arndale  13 migrate-support-check fail   never pass
 test-armhf-armhf-xl-arndale  14 saverestore-support-check fail   never pass
 test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2  fail never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-check fail   never pass
 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stop fail never pass
 test-armhf-armhf-xl-rtds 13 migrate-support-check fail   never pass
 test-armhf-armhf-xl-rtds 14 saverestore-support-check fail   never pass
 test-amd64-amd64-xl-qemut-win7-amd64 17 guest-stop fail never pass
 test-armhf-armhf-xl  13 migrate-support-check fail   never pass
 test-armhf-armhf-xl  14 saverestore-support-check fail   never pass
 test-armhf-armhf-xl-cubietruck 13 migrate-support-check fail never pass
 test-armhf-armhf-xl-cubietruck 14 saverestore-support-check fail never pass
 test-armhf-armhf-libvirt 13 migrate-support-check fail   never pass
 test-armhf-armhf-libvirt 14 saverestore-support-check fail   never pass
 test-armhf-armhf-xl-multivcpu 13 migrate-support-check fail  never pass
 test-armhf-armhf-xl-multivcpu 14 saverestore-support-check fail  never pass
 test-armhf-armhf-xl-credit2  13 migrate-support-check fail   never pass
 test-armhf-armhf-xl-credit2  14 saverestore-support-check fail   never pass
 test-armhf-armhf-xl-credit1  13 migrate-support-check fail   never pass
 test-armhf-armhf-xl-credit1  14 saverestore-support-check fail   never pass
 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop  fail never pass
 test-amd64-i386-xl-qemut-win7-amd64 17 guest-stop  fail never pass
 test-armhf-armhf-libvirt-raw 12 migrate-support-check fail   never pass
 test-armhf-armhf-libvirt-raw 13 saverestore-support-check fail   never pass
 test-amd64-amd64-xl-qemuu-ws16-amd64 17 guest-stop fail never pass
 test-amd64-amd64-xl-qemut-ws16-amd64 17 guest-stop fail never pass
 test-armhf-armhf-xl-vhd  12 migrate-support-check fail   never pass
 test-armhf-armhf-xl-vhd  13 saverestore-support-check fail   never pass
 test-amd64-i386-xl-qemuu-ws16-amd64 17 guest-stop  fail never pass
 test-amd64-amd64-xl-qemut-win10-i386 10 windows-install fail never pass
 test-amd64-i386-xl-qemut-win10-i386 10 windows-install fail never pass
 test-amd64-amd64-xl-qemuu-win10-i386 10 windows-install fail never pass
 test-amd64-i386-xl-qemuu-win10-i386 10 windows-install fail never pass

version targeted for testing:
 xen  b6e203bc80e9d3e1dc7eb579d9665a77700d78cc
baseline version:
 xen  e907460fd61c350487ffee5d8aa375bef56bc81c

Last test of basis   129676  2018-11-09 01:56:32 Z   28 days
Testing same since   130611  2018-11-20 15:07:52 Z   16 days    8 attempts


People who touched revisions under test:
  Andrew Cooper 
  Jan Beulich 
  Roger Pau Monné 

jobs:
 build-amd64-xsm  pass
 build-i386-xsm   pass
 build-amd64-xtf  pass
 build-amd64  pass
 build-armhf  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-armhf-libvirt   

[Xen-devel] [linux-3.18 bisection] complete test-amd64-i386-xl-qemuu-debianhvm-amd64

2018-12-06 Thread osstest service owner
branch xen-unstable
xenbranch xen-unstable
job test-amd64-i386-xl-qemuu-debianhvm-amd64
testid xen-boot

Tree: linux git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git
Tree: linuxfirmware git://xenbits.xen.org/osstest/linux-firmware.git
Tree: qemu git://xenbits.xen.org/qemu-xen-traditional.git
Tree: qemuu git://xenbits.xen.org/qemu-xen.git
Tree: xen git://xenbits.xen.org/xen.git

*** Found and reproduced problem changeset ***

  Bug is in tree:  linux git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git
  Bug introduced:  7b8052e19304865477e03a0047062d977309a22f
  Bug not present: d255d18a34a8d53ccc4a019dc07e17b6e8cf6bd1
  Last fail repro: http://logs.test-lab.xenproject.org/osstest/logs/131105/


  commit 7b8052e19304865477e03a0047062d977309a22f
  Author: Jan Beulich 
  Date:   Mon Oct 19 04:23:29 2015 -0600
  
  igb: fix NULL derefs due to skipped SR-IOV enabling
  
  [ Upstream commit be06998f96ecb93938ad2cce46c4289bf7cf45bc ]
  
  The combined effect of commits 6423fc3416 ("igb: do not re-init SR-IOV
  during probe") and ceee3450b3 ("igb: make sure SR-IOV init uses the
  right number of queues") causes VFs no longer getting set up, leading
  to NULL pointer dereferences due to the adapter's ->vf_data being NULL
  while ->vfs_allocated_count is non-zero. The first commit not only
  neglected the side effect of igb_sriov_reinit() that the second commit
  tried to account for, but also that of setting IGB_FLAG_HAS_MSIX,
  without which igb_enable_sriov() is effectively a no-op. Calling
  igb_{,re}set_interrupt_capability() as done here seems to address this,
  but I'm not sure whether this is better than simply reverting the other
  two commits.
  
  Signed-off-by: Jan Beulich 
  Tested-by: Aaron Brown 
  Signed-off-by: Jeff Kirsher 
  Signed-off-by: Sasha Levin 


For bisection revision-tuple graph see:
   
http://logs.test-lab.xenproject.org/osstest/results/bisect/linux-3.18/test-amd64-i386-xl-qemuu-debianhvm-amd64.xen-boot.html
Revision IDs in each graph node refer, respectively, to the Trees above.


Running cs-bisection-step 
--graph-out=/home/logs/results/bisect/linux-3.18/test-amd64-i386-xl-qemuu-debianhvm-amd64.xen-boot
 --summary-out=tmp/131105.bisection-summary --basis-template=128858 
--blessings=real,real-bisect linux-3.18 
test-amd64-i386-xl-qemuu-debianhvm-amd64 xen-boot
Searching for failure / basis pass:
 131035 fail [host=debina1] / 130367 [host=fiano0] 130203 [host=rimava1] 130067 
[host=albana1] 129845 [host=huxelrebe1] 129760 [host=joubertin0] 128858 
[host=baroque1] 128841 [host=rimava1] 128807 [host=joubertin0] 128691 
[host=huxelrebe1] 128258 [host=pinot1] 128232 [host=pinot1] 128177 
[host=huxelrebe1] 128096 [host=albana0] 127486 [host=rimava1] 127472 
[host=fiano0] 127455 [host=italia0] 127296 [host=huxelrebe1] 127001 
[host=baroque0] 126926 [host=baroque0] 126813 [host=baroque0] 126711 
[host=baroque0] 126583 [host=baroque0] 126472 [host=baroque0] 126362 
[host=baroque0] 126270 [host=baroque0] 126189 [host=baroque0] 126042 
[host=pinot0] 125899 ok.
Failure / basis pass flights: 131035 / 125899
(tree with no url: minios)
(tree with no url: ovmf)
(tree with no url: seabios)
Tree: linux git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git
Tree: linuxfirmware git://xenbits.xen.org/osstest/linux-firmware.git
Tree: qemu git://xenbits.xen.org/qemu-xen-traditional.git
Tree: qemuu git://xenbits.xen.org/qemu-xen.git
Tree: xen git://xenbits.xen.org/xen.git
Latest 3879c163e8681939b1d93139521aee983623884f 
c530a75c1e6a472b0eb9558310b518f0dfcd8860 
d0d8ad39ecb51cd7497cd524484fe09f50876798 
de5b678ca4dcdfa83e322491d478d66df56c1986 
6d8ffac1f7a782dc2c7f8df3871a294729ae36bd
Basis pass 830f9674e76d08d04585e53fc200ae8af99966e7 
c530a75c1e6a472b0eb9558310b518f0dfcd8860 
c8ea0457495342c417c3dc033bba25148b279f60 
43139135a8938de44f66333831d3a8655d07663a 
1f7574763cbb2c85825b8cc4d81f386e767a476f
Generating revisions with ./adhoc-revtuple-generator  
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git#830f9674e76d08d04585e53fc200ae8af99966e7-3879c163e8681939b1d93139521aee983623884f
 
git://xenbits.xen.org/osstest/linux-firmware.git#c530a75c1e6a472b0eb9558310b518f0dfcd8860-c530a75c1e6a472b0eb9558310b518f0dfcd8860
 
git://xenbits.xen.org/qemu-xen-traditional.git#c8ea0457495342c417c3dc033bba25148b279f60-d0d8ad39ecb51cd7497cd524484fe09f50876798
 
git://xenbits.xen.org/qemu-xen.git#43139135a8938de44f66333831d3a8655d07663a-de5b678ca4dcdfa83e322491d478d66df56c1986
 
git://xenbits.xen.org/xen.git#1f7574763cbb2c85825b8cc4d81f386e767a476f-6d8ffac1f7a782dc2c7f8df3871a294729ae36bd
adhoc-revtuple-generator: tree discontiguous: qemu-xen
Loaded 3005 nodes in revision graph
Searching for test results:
 125899 pass 830f9674e76d08d04585e53fc200ae8af99966e7 
c530a75c1e6a472b0eb9558310b518f0dfcd8860 

[Xen-devel] [PATCH 1/1] xen/blkback: rework connect_ring() to avoid inconsistent xenstore 'ring-page-order' set by malicious blkfront

2018-12-06 Thread Dongli Zhang
The xenstore 'ring-page-order' is used globally for each blkback queue and
therefore should be read from xenstore only once. However, it is obtained
in read_per_ring_refs() which might be called multiple times during the
initialization of each blkback queue.

If the blkfront is malicious and sets 'ring-page-order' to a different
value each time before blkback reads it, blkback may end up hitting the
"WARN_ON(i != (XEN_BLKIF_REQS_PER_PAGE * blkif->nr_ring_pages));" in
xen_blkif_disconnect() when the frontend is destroyed.

This patch reworks connect_ring() to read xenstore 'ring-page-order' only
once.
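
The read-once idea described above can be illustrated with a small
stand-alone model. This is only a sketch and not the blkback code:
struct fake_store, read_order(), setup_rings_racy(), setup_rings_once()
and MAX_RING_ORDER are hypothetical names invented for the example, and
the store simply returns a different value on successive reads to mimic
a frontend that rewrites the key.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical limit, standing in for xen_blkif_max_ring_order. */
#define MAX_RING_ORDER 4u

/* Hypothetical stand-in for a xenstore key the frontend can rewrite. */
struct fake_store {
    const unsigned int *orders; /* values returned by successive reads */
    size_t count;               /* number of entries in orders */
    size_t reads;               /* how many reads have been performed */
};

/* Each call models one xenstore read of 'ring-page-order'. */
static unsigned int read_order(struct fake_store *s)
{
    unsigned int v;

    assert(s->count > 0);
    v = s->orders[s->reads % s->count];
    s->reads++;
    return v;
}

/* Racy pattern: every ring re-reads the key, so rings can disagree. */
static int setup_rings_racy(struct fake_store *s, unsigned int *pages,
                            size_t nr_rings)
{
    for (size_t i = 0; i < nr_rings; i++) {
        unsigned int order = read_order(s);

        if (order > MAX_RING_ORDER)
            return -1;
        pages[i] = 1u << order;
    }
    return 0;
}

/* Read-once pattern: read and validate a single time, then reuse. */
static int setup_rings_once(struct fake_store *s, unsigned int *pages,
                            size_t nr_rings)
{
    unsigned int order = read_order(s);

    if (order > MAX_RING_ORDER)
        return -1;
    for (size_t i = 0; i < nr_rings; i++)
        pages[i] = 1u << order;
    return 0;
}
```

With a store that alternates between order 0 and order 2, the racy
variant gives two rings different page counts, while the read-once
variant gives every ring the same count after a single read.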

Signed-off-by: Dongli Zhang 
---
 drivers/block/xen-blkback/xenbus.c | 49 --
 1 file changed, 31 insertions(+), 18 deletions(-)

diff --git a/drivers/block/xen-blkback/xenbus.c b/drivers/block/xen-blkback/xenbus.c
index a4bc74e..4a8ce20 100644
--- a/drivers/block/xen-blkback/xenbus.c
+++ b/drivers/block/xen-blkback/xenbus.c
@@ -919,14 +919,15 @@ static void connect(struct backend_info *be)
 /*
  * Each ring may have multi pages, depends on "ring-page-order".
  */
-static int read_per_ring_refs(struct xen_blkif_ring *ring, const char *dir)
+static int read_per_ring_refs(struct xen_blkif_ring *ring, const char *dir,
+ bool use_ring_page_order)
 {
unsigned int ring_ref[XENBUS_MAX_RING_GRANTS];
struct pending_req *req, *n;
int err, i, j;
struct xen_blkif *blkif = ring->blkif;
struct xenbus_device *dev = blkif->be->dev;
-   unsigned int ring_page_order, nr_grefs, evtchn;
+   unsigned int nr_grefs, evtchn;
 
err = xenbus_scanf(XBT_NIL, dir, "event-channel", "%u",
  &evtchn);
@@ -936,28 +937,18 @@ static int read_per_ring_refs(struct xen_blkif_ring *ring, const char *dir)
return err;
}
 
-   err = xenbus_scanf(XBT_NIL, dev->otherend, "ring-page-order", "%u",
- &ring_page_order);
-   if (err != 1) {
+   nr_grefs = blkif->nr_ring_pages;
+
+   if (!use_ring_page_order) {
err = xenbus_scanf(XBT_NIL, dir, "ring-ref", "%u", &ring_ref[0]);
if (err != 1) {
err = -EINVAL;
xenbus_dev_fatal(dev, err, "reading %s/ring-ref", dir);
return err;
}
-   nr_grefs = 1;
} else {
unsigned int i;
 
-   if (ring_page_order > xen_blkif_max_ring_order) {
-   err = -EINVAL;
-   xenbus_dev_fatal(dev, err, "%s/request %d ring page order exceed max:%d",
-dir, ring_page_order,
-xen_blkif_max_ring_order);
-   return err;
-   }
-
-   nr_grefs = 1 << ring_page_order;
for (i = 0; i < nr_grefs; i++) {
char ring_ref_name[RINGREF_NAME_LEN];
 
@@ -972,7 +963,6 @@ static int read_per_ring_refs(struct xen_blkif_ring *ring, const char *dir)
}
}
}
-   blkif->nr_ring_pages = nr_grefs;
 
for (i = 0; i < nr_grefs * XEN_BLKIF_REQS_PER_PAGE; i++) {
req = kzalloc(sizeof(*req), GFP_KERNEL);
@@ -1030,6 +1020,8 @@ static int connect_ring(struct backend_info *be)
size_t xspathsize;
const size_t xenstore_path_ext_size = 11; /* sufficient for "/queue-NNN" */
unsigned int requested_num_queues = 0;
+   bool use_ring_page_order = false;
+   unsigned int ring_page_order;
 
pr_debug("%s %s\n", __func__, dev->otherend);
 
@@ -1075,8 +1067,28 @@ static int connect_ring(struct backend_info *be)
 be->blkif->nr_rings, be->blkif->blk_protocol, protocol,
 pers_grants ? "persistent grants" : "");
 
+   err = xenbus_scanf(XBT_NIL, dev->otherend, "ring-page-order", "%u",
+  &ring_page_order);
+
+   if (err != 1) {
+   be->blkif->nr_ring_pages = 1;
+   } else {
+   if (ring_page_order > xen_blkif_max_ring_order) {
+   err = -EINVAL;
+   xenbus_dev_fatal(dev, err,
+"requested ring page order %d exceed max:%d",
+ring_page_order,
+xen_blkif_max_ring_order);
+   return err;
+   }
+
+   use_ring_page_order = true;
+   be->blkif->nr_ring_pages = 1 << ring_page_order;
+   }
+
if (be->blkif->nr_rings == 1)
-   return read_per_ring_refs(&be->blkif->rings[0], dev->otherend);
+   return read_per_ring_refs(&be->blkif->rings[0], dev->otherend,
+ use_ring_page_order);
else {
xspathsize = strlen(dev->otherend) + 

Re: [Xen-devel] [PATCH for-4.12 v2 13/17] xen/arm: p2m: Rework p2m_cache_flush_range

2018-12-06 Thread Stefano Stabellini
On Tue, 4 Dec 2018, Julien Grall wrote:
> A follow-up patch will add support for preemption in p2m_cache_flush_range.
> Because the function currently uses 2 nested loops, preemption would have
> to be added in both of them.
> 
> This can be avoided by merging the 2 loops together and still keeping
> the code fairly simple to read and extend.
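
A simplified model of the merged loop is sketched below. It is an
illustration under stated assumptions, not the Xen implementation:
BLOCK_ORDER, struct walk_stats, lookup_block(), next_boundary() and
flush_range_merged() are hypothetical names, every block has the same
fixed size, and the "flush" is just a counter. What it shows is the
single loop that advances page by page while only looking up the
mapping once per block.

```c
#include <assert.h>
#include <stdbool.h>

/* Example-only block size: 4 frames per block (order 2). */
#define BLOCK_ORDER 2u

struct walk_stats {
    unsigned int lookups; /* how many per-block lookups were done */
    unsigned int flushed; /* how many pages were "flushed" */
};

/* Hypothetical stand-in for p2m_get_entry(): is this block RAM? */
static bool lookup_block(const bool *block_is_ram, unsigned long gfn,
                         struct walk_stats *st)
{
    st->lookups++;
    return block_is_ram[gfn >> BLOCK_ORDER];
}

/* First frame of the block following the one that contains gfn. */
static unsigned long next_boundary(unsigned long gfn)
{
    return ((gfn >> BLOCK_ORDER) + 1) << BLOCK_ORDER;
}

/* Single loop over [start, end): look up once per block, flush per page. */
static struct walk_stats flush_range_merged(const bool *block_is_ram,
                                            unsigned long start,
                                            unsigned long end)
{
    struct walk_stats st = {0, 0};
    unsigned long next_block = start;

    assert(start <= end);

    while (start < end) {
        if (start == next_block) {
            bool is_ram = lookup_block(block_is_ram, start, &st);

            next_block = next_boundary(start);

            /* Holes and non-RAM blocks are skipped in one step. */
            if (!is_ram) {
                start = next_block;
                continue;
            }
        }

        st.flushed++; /* stands in for flushing one page */
        start++;
    }

    return st;
}
```

Because there is only one loop, a preemption check would need to be
placed in just one spot, which is the motivation given in the commit
message.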
> 
> Signed-off-by: Julien Grall 
> 
> ---
> Changes in v2:
> - Patch added
> ---
>  xen/arch/arm/p2m.c | 52 +---
>  1 file changed, 37 insertions(+), 15 deletions(-)
> 
> diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
> index c713226561..db22b53bfd 100644
> --- a/xen/arch/arm/p2m.c
> +++ b/xen/arch/arm/p2m.c
> @@ -1527,7 +1527,8 @@ int relinquish_p2m_mapping(struct domain *d)
>  int p2m_cache_flush_range(struct domain *d, gfn_t start, gfn_t end)
>  {
>  struct p2m_domain *p2m = p2m_get_hostp2m(d);
> -gfn_t next_gfn;
> +gfn_t next_block_gfn;
> +mfn_t mfn = INVALID_MFN;
>  p2m_type_t t;
>  unsigned int order;
>  
> @@ -1542,24 +1543,45 @@ int p2m_cache_flush_range(struct domain *d, gfn_t start, gfn_t end)
>  start = gfn_max(start, p2m->lowest_mapped_gfn);
>  end = gfn_min(end, p2m->max_mapped_gfn);
>  
> -for ( ; gfn_x(start) < gfn_x(end); start = next_gfn )
> -{
> -mfn_t mfn = p2m_get_entry(p2m, start, &t, NULL, &order, NULL);
> +next_block_gfn = start;
>  
> -next_gfn = gfn_next_boundary(start, order);
> -
> -/* Skip hole and non-RAM page */
> -if ( mfn_eq(mfn, INVALID_MFN) || !p2m_is_any_ram(t) )
> -continue;
> -
> -/* XXX: Implement preemption */
> -while ( gfn_x(start) < gfn_x(next_gfn) )
> +while ( gfn_x(start) < gfn_x(end) )
> +{
> +/*
> + * We want to flush page by page as:
> + *  - it may not be possible to map the full block (can be up to 1GB)
> + *in Xen memory
> + *  - we may want to do fine grain preemption as flushing multiple
> + *page in one go may take a long time
> + *
> + * As p2m_get_entry is able to return the size of the mapping
> + * in the p2m, it is pointless to execute it for each page.
> + *
> + * We can optimize it by tracking the gfn of the next
> + * block. So we will only call p2m_get_entry for each block (can
> + * be up to 1GB).
> + */
> +if ( gfn_eq(start, next_block_gfn) )
>  {
> -flush_page_to_ram(mfn_x(mfn), false);
> +mfn = p2m_get_entry(p2m, start, &t, NULL, &order, NULL);
> +next_block_gfn = gfn_next_boundary(start, order);
>  
> -start = gfn_add(start, 1);
> -mfn = mfn_add(mfn, 1);
> +/*
> + * The following regions can be skipped:
> + *  - Hole
> + *  - non-RAM
> + */

I think this comment is superfluous as the code is already obvious. You
can remove it.

Reviewed-by: Stefano Stabellini 


> +if ( mfn_eq(mfn, INVALID_MFN) || !p2m_is_any_ram(t) )
> +{
> +start = next_block_gfn;
> +continue;
> +}
>  }
> +
> +flush_page_to_ram(mfn_x(mfn), false);
> +
> +start = gfn_add(start, 1);
> +mfn = mfn_add(mfn, 1);
>  }
>  
>  invalidate_icache();
> -- 
> 2.11.0
> 


Re: [Xen-devel] [PATCH v8 1/7] xen/pvh: Split CONFIG_XEN_PVH into CONFIG_PVH and CONFIG_XEN_PVH

2018-12-06 Thread Andrew Cooper
On 06/12/2018 23:30, Paolo Bonzini wrote:
> On 07/12/18 00:11, Boris Ostrovsky wrote:
>> On 12/6/18 5:49 PM, Paolo Bonzini wrote:
>>> On 06/12/18 23:34, Boris Ostrovsky wrote:
>>>> On 12/6/18 5:11 PM, Paolo Bonzini wrote:
>>>>
>>>>> and also
>>>>>
>>>>>   depends on !EFI
>>>>>
>>>>> because even though in principle it would be possible to write a PVH
>>>>> loader for UEFI, PVH's start info does not support the EFI handover
>>>>> protocol.
>>>> But we should be able to build the binary with both EFI and PVH?
>>> Can you?  It's a completely different binary format, the EFI handover
>>> protocol is invoked via a special entry point and needs the Linux header
>>> format, not ELF.
>> Right, but I think it is desirable to be able to build both from the
>> same config file.
> Ah, "make bzImage" and use the vmlinux for PVH, because PVH fetches the
> entry point from the special note.  That's clever. :)
>
> I don't see why it should not work, and if so the "depends on !EFI" is
> indeed unnecessary.

We do strive for single binaries in the Xen world, because that is how
people actually want to consume Xen.

It is for this reason that a single xen.gz binary can be loaded as a
straight ELF (including this PVH boot protocol), or via Multiboot 1 or
2, and even do full EFI if your bootloader is up to date on its
Multiboot2 spec :)

~Andrew


Re: [Xen-devel] [PATCH for-4.12 v2 16/17] xen/arm: Implement Set/Way operations

2018-12-06 Thread Stefano Stabellini
On Tue, 4 Dec 2018, Julien Grall wrote:
> Set/Way operations are used to perform maintenance on a given cache.
> At the moment, Set/Way operations are not trapped and therefore a guest
> OS will directly act on the local cache. However, a vCPU may migrate to
> another pCPU in the middle of the process. This will leave a cache
> holding stale data (Set/Way operations are not propagated), potentially
> causing a crash. This may be the cause of the heisenbug noticed in
> Osstest [1].
> 
> Furthermore, Set/Way operations are not available on system caches. This
> means that an OS, such as 32-bit Linux, relying on those operations to
> fully clean the cache before disabling the MMU may break because data may
> sit in system caches and not in RAM.
> 
> For more details about Set/Way, see the talk "The Art of Virtualizing
> Cache Maintenance" given at Xen Summit 2018 [2].
> 
> In the context of Xen, we need to trap Set/Way operations and emulate
> them. According to the Arm ARM (B1.14.4 in DDI 0406C.c), Set/Way operations
> are difficult to virtualize. So we can assume that a guest OS using them
> will suffer the consequences (i.e. slowness) until its developers remove
> all usage of Set/Way.
> 
> As the software is not allowed to infer the Set/Way to Physical Address
> mapping, Xen will need to go through the guest P2M and clean &
> invalidate all the entries mapped.
> 
> Because Set/Way operations happen in batches (a loop over all Sets/Ways of
> a cache), Xen would need to go through the P2M for every instruction. This
> is quite expensive and would severely impact the guest OS. The
> implementation re-uses the KVM policy to limit the number of flushes:
> - If we trap a Set/Way operation, we enable VM trapping (i.e.
>   HCR_EL2.TVM) to detect the cache being turned on/off, and do a full
>   clean.
> - We clean the caches when turning on and off
> - Once the caches are enabled, we stop trapping VM instructions
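
The policy in the list above can be modelled as a small state machine.
The following is a sketch under assumptions, not the patch itself:
struct vcpu_model, on_set_way_trap() and on_vm_reg_write() are
hypothetical names, a "full clean" is reduced to a counter, and the
model assumes that further Set/Way traps are absorbed while VM trapping
is already enabled, which is what lets a whole batch cost one flush.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-vCPU state for the model. */
struct vcpu_model {
    bool trap_vm_regs;         /* models HCR_EL2.TVM being set */
    bool cache_enabled;        /* models the guest's cache enable state */
    unsigned int full_flushes; /* number of full clean & invalidate passes */
};

/* A Set/Way instruction trapped. */
static void on_set_way_trap(struct vcpu_model *v)
{
    assert(v != NULL);

    /* Only the first trap of a batch flushes and arms VM trapping. */
    if (!v->trap_vm_regs) {
        v->full_flushes++;
        v->trap_vm_regs = true;
    }
}

/* A trapped write to a VM control register changed (or kept) cache state. */
static void on_vm_reg_write(struct vcpu_model *v, bool cache_now_enabled)
{
    assert(v != NULL);

    /* Clean when the cache is turned on or off. */
    if (v->cache_enabled != cache_now_enabled)
        v->full_flushes++;

    v->cache_enabled = cache_now_enabled;

    /* Once the caches are enabled, stop trapping VM instructions. */
    if (cache_now_enabled)
        v->trap_vm_regs = false;
}
```

In this model a loop of Set/Way instructions followed by the guest
turning its caches off and back on costs three full flushes in total,
rather than one per instruction.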
> 
> [1] https://lists.xenproject.org/archives/html/xen-devel/2017-09/msg03191.html
> [2] https://fr.slideshare.net/xen_com_mgr/virtualizing-cache
> 
> Signed-off-by: Julien Grall 
> 
> ---
> Changes in v2:
> - Fix emulation for Set/Way cache flush arm64 sysreg
> - Add support for preemption
> - Check cache status on every VM traps in Arm64
> - Remove spurious change
> ---
>  xen/arch/arm/arm64/vsysreg.c | 17 
>  xen/arch/arm/p2m.c   | 92 
> 
>  xen/arch/arm/traps.c | 25 +++-
>  xen/arch/arm/vcpreg.c| 22 +++
>  xen/include/asm-arm/domain.h |  8 
>  xen/include/asm-arm/p2m.h| 20 ++
>  6 files changed, 183 insertions(+), 1 deletion(-)
> 
> diff --git a/xen/arch/arm/arm64/vsysreg.c b/xen/arch/arm/arm64/vsysreg.c
> index 16ac9c344a..8a85507d9d 100644
> --- a/xen/arch/arm/arm64/vsysreg.c
> +++ b/xen/arch/arm/arm64/vsysreg.c
> @@ -34,9 +34,14 @@
>  static bool vreg_emulate_##reg(struct cpu_user_regs *regs,  \
> uint64_t *r, bool read)  \
>  {   \
> +struct vcpu *v = current;   \
> +bool cache_enabled = vcpu_has_cache_enabled(v); \
> +\
>  GUEST_BUG_ON(read); \
>  WRITE_SYSREG64(*r, reg);\
>  \
> +p2m_toggle_cache(v, cache_enabled); \
> +\
>  return true;\
>  }
>  
> @@ -85,6 +90,18 @@ void do_sysreg(struct cpu_user_regs *regs,
>  break;
>  
>  /*
> + * HCR_EL2.TSW
> + *
> + * ARMv8 (DDI 0487B.b): Table D1-42
> + */
> +case HSR_SYSREG_DCISW:
> +case HSR_SYSREG_DCCSW:
> +case HSR_SYSREG_DCCISW:
> +if ( !hsr.sysreg.read )
> +p2m_set_way_flush(current);
> +break;
> +
> +/*
>   * HCR_EL2.TVM
>   *
>   * ARMv8 (DDI 0487D.a): Table D1-38
> diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
> index ca9f0d9ebe..8ee6ff7bd7 100644
> --- a/xen/arch/arm/p2m.c
> +++ b/xen/arch/arm/p2m.c
> @@ -3,6 +3,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #include 
>  #include 
> @@ -1620,6 +1621,97 @@ int p2m_cache_flush_range(struct domain *d, gfn_t *pstart, gfn_t end)
>  return rc;
>  }
>  
> +/*
> + * Clean & invalidate RAM associated to the guest vCPU.
> + *
> + * The function can only work with the current vCPU and should be called
> + * with IRQ enabled as the vCPU could get preempted.
> + */
> +void p2m_flush_vm(struct vcpu *v)
> +{
> +int rc;
> +gfn_t start = _gfn(0);
> +
> +ASSERT(v == current);
> +ASSERT(local_irq_is_enabled());
> +

Re: [Xen-devel] [PATCH for-4.12 v2 15/17] xen/arm: p2m: Add support for preemption in p2m_cache_flush_range

2018-12-06 Thread Stefano Stabellini
On Tue, 4 Dec 2018, Julien Grall wrote:
> p2m_cache_flush_range does not yet support preemption. This may be an
> issue as cleaning the cache can take a long time. While the current
> caller (XEN_DOMCTL_cacheflush) does not strictly require preemption, this
> will be necessary for a new caller in a follow-up patch.
> 
> The preemption implemented is quite simple: a counter is incremented by:
> - 1 on region skipped
> - 10 for each page requiring a flush
> 
> When the counter reaches 512 or above, we will check if preemption is
> needed. If not, the counter will be reset to 0. If yes, the function
> will stop, update start (to allow resuming later on) and return
> -ERESTART. This allows the caller to decide how the preemption will be
> done.
> 
> For now, XEN_DOMCTL_cacheflush will continue to ignore the preemption.
> 
> Signed-off-by: Julien Grall 
> 
> ---
> Changes in v2:
> - Patch added
> ---
>  xen/arch/arm/domctl.c |  8 +++-
>  xen/arch/arm/p2m.c| 35 ---
>  xen/include/asm-arm/p2m.h |  4 +++-
>  3 files changed, 42 insertions(+), 5 deletions(-)
> 
> diff --git a/xen/arch/arm/domctl.c b/xen/arch/arm/domctl.c
> index 20691528a6..9da88b8c64 100644
> --- a/xen/arch/arm/domctl.c
> +++ b/xen/arch/arm/domctl.c
> @@ -54,6 +54,7 @@ long arch_do_domctl(struct xen_domctl *domctl, struct domain *d,
>  {
>  gfn_t s = _gfn(domctl->u.cacheflush.start_pfn);
>  gfn_t e = gfn_add(s, domctl->u.cacheflush.nr_pfns);
> +int rc;

This is unnecessary...


>  if ( domctl->u.cacheflush.nr_pfns > (1U<<MAX_ORDER) )
>  return -EINVAL;
> @@ -61,7 +62,12 @@ long arch_do_domctl(struct xen_domctl *domctl, struct domain *d,
>  if ( gfn_x(e) < gfn_x(s) )
>  return -EINVAL;
>  
> -return p2m_cache_flush_range(d, s, e);
> +/* XXX: Handle preemption */
> +do
> +rc = p2m_cache_flush_range(d, &s, e);
> +while ( rc == -ERESTART );

... you can just do:

  while ( -ERESTART == p2m_cache_flush_range(d, &s, e) )

But given that it is just style, I'll leave it up to you.


> +return rc;
>  }
>  case XEN_DOMCTL_bind_pt_irq:
>  {
> diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
> index db22b53bfd..ca9f0d9ebe 100644
> --- a/xen/arch/arm/p2m.c
> +++ b/xen/arch/arm/p2m.c
> @@ -1524,13 +1524,17 @@ int relinquish_p2m_mapping(struct domain *d)
>  return rc;
>  }
>  
> -int p2m_cache_flush_range(struct domain *d, gfn_t start, gfn_t end)
> +int p2m_cache_flush_range(struct domain *d, gfn_t *pstart, gfn_t end)
>  {
>  struct p2m_domain *p2m = p2m_get_hostp2m(d);
>  gfn_t next_block_gfn;
> +gfn_t start = *pstart;
>  mfn_t mfn = INVALID_MFN;
>  p2m_type_t t;
>  unsigned int order;
> +int rc = 0;
> +/* Counter for preemption */
> +unsigned long count = 0;
>  
>  /*
>   * The operation cache flush will invalidate the RAM assigned to the
> @@ -1547,6 +1551,25 @@ int p2m_cache_flush_range(struct domain *d, gfn_t start, gfn_t end)
>  
>  while ( gfn_x(start) < gfn_x(end) )
>  {
> +   /*
> + * Cleaning the cache for the P2M may take a long time. So we
> + * need to be able to preempt. We will arbitrarily preempt every
> + * time count reaches 512 or above.
> +
> + *
> + * The count will be incremented by:
> + *  - 1 on region skipped
> + *  - 10 for each page requiring a flush

Why this choice? A page flush should cost much more than 10x a region
skipped, more like 100x or 1000x. In fact, doing the full loop without
calling flush_page_to_ram should be cheap and fast, right? I would:

- not increase count on region skipped at all
- increase it by 1 on each page requiring a flush
- set the limit lower, if we go with your proposal it would be about 50,
  I am not sure what the limit should be though
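
A minimal sketch of the two weighting schemes under discussion, written as a
standalone C model rather than Xen code. Everything in it is hypothetical:
flush_range(), flush_all(), struct weights, fake_softirq_pending(), the toy
is_ram array and the value picked for ERESTART are illustrative stand-ins,
not the hypervisor's API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define ERESTART 85 /* illustrative value only, not taken from Xen headers */

/* Hypothetical cost model: how much each kind of iteration adds to count. */
struct weights {
    unsigned long skip;   /* cost of a skipped (non-RAM) entry */
    unsigned long flush;  /* cost of a page that needs a flush */
    unsigned long limit;  /* check for preemption at or above this */
};

/* Stand-in for softirq_pending(): pretend work is always pending. */
static bool fake_softirq_pending(void) { return true; }

/*
 * Walk [*pstart, end) over a toy address space where is_ram[i] says
 * whether page i needs flushing. On preemption, store the resume point
 * in *pstart and return -ERESTART; return 0 once the range is done.
 */
static int flush_range(const bool *is_ram, size_t *pstart, size_t end,
                       const struct weights *w, unsigned long *flushed)
{
    size_t start;
    unsigned long count = 0;
    int rc = 0;

    assert(pstart != NULL && w->limit > 0);
    start = *pstart;

    while ( start < end )
    {
        if ( count >= w->limit )
        {
            if ( fake_softirq_pending() )
            {
                rc = -ERESTART;
                break;
            }
            count = 0;
        }

        if ( !is_ram[start] )
        {
            count += w->skip;
            start++;
            continue;
        }

        count += w->flush;
        (*flushed)++;          /* stands in for flush_page_to_ram() */
        start++;
    }

    *pstart = start;
    return rc;
}

/* Caller-side retry loop, in the style of the XEN_DOMCTL_cacheflush hunk. */
static unsigned int flush_all(const bool *is_ram, size_t n,
                              const struct weights *w, unsigned long *flushed)
{
    size_t s = 0;
    unsigned int restarts = 0;

    while ( flush_range(is_ram, &s, n, w, flushed) == -ERESTART )
        restarts++;

    return restarts;
}
```

Feeding the model the weights from the patch (1, 10, limit 512) or the
alternative floated above (0, 1, limit around 50) flushes the same pages
either way; only the number of -ERESTART round trips changes.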


> + */
> +if ( count >= 512 )
> +{
> +if ( softirq_pending(smp_processor_id()) )
> +{
> +rc = -ERESTART;
> +break;
> +}
> +count = 0;

No need to set count to 0 here


> +}
> +
>  /*
>   * We want to flush page by page as:
>   *  - it may not be possible to map the full block (can be up to 1GB)
> @@ -1573,22 +1596,28 @@ int p2m_cache_flush_range(struct domain *d, gfn_t start, gfn_t end)
>   */
>  if ( mfn_eq(mfn, INVALID_MFN) || !p2m_is_any_ram(t) )
>  {
> +count++;

This is just an iteration doing nothing, I would not increment count.

>  start = next_block_gfn;
>  continue;
>  }
>  }
>  
> +count += 10;

This makes sense, but if we skip the count++ above, we might as well
just count++ here and have a lower limit.


>  flush_page_to_ram(mfn_x(mfn), false);
>  
>  start = gfn_add(start, 1);
>  mfn 

Re: [Xen-devel] [PATCH v8 1/7] xen/pvh: Split CONFIG_XEN_PVH into CONFIG_PVH and CONFIG_XEN_PVH

2018-12-06 Thread Paolo Bonzini
On 07/12/18 00:11, Boris Ostrovsky wrote:
> On 12/6/18 5:49 PM, Paolo Bonzini wrote:
>> On 06/12/18 23:34, Boris Ostrovsky wrote:
>>> On 12/6/18 5:11 PM, Paolo Bonzini wrote:
>>>
>>>> and also
>>>>
>>>> depends on !EFI
>>>>
>>>> because even though in principle it would be possible to write a PVH
>>>> loader for UEFI, PVH's start info does not support the EFI handover
>>>> protocol.
>>> But we should be able to build the binary with both EFI and PVH?
>> Can you?  It's a completely different binary format, the EFI handover
>> protocol is invoked via a special entry point and needs the Linux header
>> format, not ELF.
> 
> Right, but I think it is desirable to be able to build both from the
> same config file.

Ah, "make bzImage" and use the vmlinux for PVH, because PVH fetches the
entry point from the special note.  That's clever. :)

I don't see why it should not work, and if so the "depends on !EFI" is
indeed unnecessary.

Paolo


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH for-4.12 v2 14/17] xen/arm: domctl: Use typesafe gfn in XEN_DOMCTL_cacheflush

2018-12-06 Thread Stefano Stabellini
On Tue, 4 Dec 2018, Julien Grall wrote:
> This will make changes in a follow-up patch easier.
> 
> Signed-off-by: Julien Grall 

Acked-by: Stefano Stabellini 

> ---
> Changes in v2:
> - Patch added
> ---
>  xen/arch/arm/domctl.c | 8 
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/xen/arch/arm/domctl.c b/xen/arch/arm/domctl.c
> index c10f568aad..20691528a6 100644
> --- a/xen/arch/arm/domctl.c
> +++ b/xen/arch/arm/domctl.c
> @@ -52,16 +52,16 @@ long arch_do_domctl(struct xen_domctl *domctl, struct domain *d,
>  {
>  case XEN_DOMCTL_cacheflush:
>  {
> -unsigned long s = domctl->u.cacheflush.start_pfn;
> -unsigned long e = s + domctl->u.cacheflush.nr_pfns;
> +gfn_t s = _gfn(domctl->u.cacheflush.start_pfn);
> +gfn_t e = gfn_add(s, domctl->u.cacheflush.nr_pfns);
>  
>  if ( domctl->u.cacheflush.nr_pfns > (1U<<MAX_ORDER) )
>  return -EINVAL;
>  
> -if ( e < s )
> +if ( gfn_x(e) < gfn_x(s) )
>  return -EINVAL;
>  
> -return p2m_cache_flush_range(d, _gfn(s), _gfn(e));
> +return p2m_cache_flush_range(d, s, e);
>  }
>  case XEN_DOMCTL_bind_pt_irq:
>  {
> -- 
> 2.11.0
> 
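
A minimal sketch of what the typesafe conversion buys, as a standalone toy
and not the Xen definitions. The names _gfn, gfn_x and gfn_add come from the
patch; the struct layout shown here and the helper range_is_valid() are
assumptions made for illustration.

```c
#include <stdbool.h>

/* Toy typesafe wrapper: a GFN cannot be mixed up with a bare integer. */
typedef struct { unsigned long gfn; } gfn_t;

static inline gfn_t _gfn(unsigned long n) { return (gfn_t){ n }; }
static inline unsigned long gfn_x(gfn_t g) { return g.gfn; }
static inline gfn_t gfn_add(gfn_t g, unsigned long i)
{
    return _gfn(gfn_x(g) + i);
}

/* Hypothetical helper mirroring the wrap-around check in the hunk above. */
static bool range_is_valid(unsigned long start_pfn, unsigned long nr_pfns)
{
    gfn_t s = _gfn(start_pfn);
    gfn_t e = gfn_add(s, nr_pfns);

    /* Unsigned addition wraps, so e < s means start + nr overflowed. */
    return gfn_x(e) >= gfn_x(s);
}
```

With the wrapper in place, passing a plain unsigned long where a gfn_t is
expected no longer compiles, which is why the comparison has to go through
gfn_x() explicitly.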

Re: [Xen-devel] [PATCH v8 1/7] xen/pvh: Split CONFIG_XEN_PVH into CONFIG_PVH and CONFIG_XEN_PVH

2018-12-06 Thread Boris Ostrovsky
On 12/6/18 5:49 PM, Paolo Bonzini wrote:
> On 06/12/18 23:34, Boris Ostrovsky wrote:
>> On 12/6/18 5:11 PM, Paolo Bonzini wrote:
>>
>>> and also
>>>
>>> depends on !EFI
>>>
>>> because even though in principle it would be possible to write a PVH
>>> loader for UEFI, PVH's start info does not support the EFI handover
>>> protocol.
>> But we should be able to build the binary with both EFI and PVH?
> Can you?  It's a completely different binary format, the EFI handover
> protocol is invoked via a special entry point and needs the Linux header
> format, not ELF.

Right, but I think it is desirable to be able to build both from the
same config file.

-boris


Re: [Xen-devel] [PATCH for-4.12 v2 12/17] xen/arm: traps: Rework leave_hypervisor_tail

2018-12-06 Thread Stefano Stabellini
On Tue, 4 Dec 2018, Julien Grall wrote:
> The function leave_hypervisor_tail is called before each return to the
> guest vCPU. It has two main purposes:
> 1) Process physical CPU work (e.g rescheduling) if required
> 2) Prepare the physical CPU to run the guest vCPU
> 
> 2) will always be done once we have finished processing physical CPU work.
> At the moment, it is done as part of the last iteration of 1), which adds
> some extra indentation in the code.
> 
> This could be streamlined by moving 2) out of the loop. At the same
> time, 1) is moved into a separate function, making the flow more obvious.
> 
> All those changes will help a follow-up patch where we would want to
> introduce some vCPU work before returning to the guest vCPU.
> 
> Signed-off-by: Julien Grall 

Reviewed-by: Stefano Stabellini 


> ---
> Changes in v2:
> - Patch added
> ---
>  xen/arch/arm/traps.c | 61 
> 
>  1 file changed, 33 insertions(+), 28 deletions(-)
> 
> diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
> index b00d0b8e1e..02665cc7b4 100644
> --- a/xen/arch/arm/traps.c
> +++ b/xen/arch/arm/traps.c
> @@ -2241,36 +2241,12 @@ void do_trap_fiq(struct cpu_user_regs *regs)
>  gic_interrupt(regs, 1);
>  }
>  
> -void leave_hypervisor_tail(void)
> +static void check_for_pcpu_work(void)
>  {
> -while (1)
> -{
> -local_irq_disable();
> -if ( !softirq_pending(smp_processor_id()) )
> -{
> -vgic_sync_to_lrs();
> -
> -/*
> - * If the SErrors handle option is "DIVERSE", we have to prevent
> - * slipping the hypervisor SError to guest. In this option, before
> - * returning from trap, we have to synchronize SErrors to guarantee
> - * that the pending SError would be caught in hypervisor.
> - *
> - * If option is NOT "DIVERSE", SKIP_SYNCHRONIZE_SERROR_ENTRY_EXIT
> - * will be set to cpu_hwcaps. This means we can use the alternative
> - * to skip synchronizing SErrors for other SErrors handle options.
> - */
> -SYNCHRONIZE_SERROR(SKIP_SYNCHRONIZE_SERROR_ENTRY_EXIT);
> -
> -/*
> - * The hypervisor runs with the workaround always present.
> - * If the guest wants it disabled, so be it...
> - */
> -if ( needs_ssbd_flip(current) )
> -arm_smccc_1_1_smc(ARM_SMCCC_ARCH_WORKAROUND_2_FID, 0, NULL);
> +ASSERT(!local_irq_is_enabled());
>  
> -return;
> -}
> +while ( softirq_pending(smp_processor_id()) )
> +{
>  local_irq_enable();
>  do_softirq();
>  /*
> @@ -2278,9 +2254,38 @@ void leave_hypervisor_tail(void)
>   * and we want to patch the hypervisor with almost no stack.
>   */
>  check_for_livepatch_work();
> +local_irq_disable();
>  }
>  }
>  
> +void leave_hypervisor_tail(void)
> +{
> +local_irq_disable();
> +
> +check_for_pcpu_work();
> +
> +vgic_sync_to_lrs();
> +
> +/*
> + * If the SErrors handle option is "DIVERSE", we have to prevent
> + * slipping the hypervisor SError to guest. In this option, before
> + * returning from trap, we have to synchronize SErrors to guarantee
> + * that the pending SError would be caught in hypervisor.
> + *
> + * If option is NOT "DIVERSE", SKIP_SYNCHRONIZE_SERROR_ENTRY_EXIT
> + * will be set to cpu_hwcaps. This means we can use the alternative
> + * to skip synchronizing SErrors for other SErrors handle options.
> + */
> +SYNCHRONIZE_SERROR(SKIP_SYNCHRONIZE_SERROR_ENTRY_EXIT);
> +
> +/*
> + * The hypervisor runs with the workaround always present.
> + * If the guest wants it disabled, so be it...
> + */
> +if ( needs_ssbd_flip(current) )
> +arm_smccc_1_1_smc(ARM_SMCCC_ARCH_WORKAROUND_2_FID, 0, NULL);
> +}
> +
>  /*
>   * Local variables:
>   * mode: C
> -- 
> 2.11.0
> 
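
A toy model of the control flow after the rework, under the assumption that
only the IRQ state and the ordering of steps 1) and 2) matter. All names
prefixed with toy_ and the three state variables are invented for this
sketch; none of them exists in Xen.

```c
#include <stdbool.h>

/* Toy per-CPU state; every name below is illustrative. */
static bool irq_enabled = true;
static unsigned int pending_softirqs;
static unsigned int guest_prepared;

static void toy_irq_disable(void) { irq_enabled = false; }
static void toy_irq_enable(void)  { irq_enabled = true; }

/* Step 1): drain pending work; entered and left with IRQs disabled. */
static void toy_check_for_pcpu_work(void)
{
    while ( pending_softirqs > 0 )
    {
        toy_irq_enable();
        pending_softirqs--;      /* stands in for do_softirq() */
        toy_irq_disable();       /* re-check the condition with IRQs off */
    }
}

/* Step 2) runs exactly once, after the loop, with IRQs still disabled. */
static void toy_leave_hypervisor_tail(void)
{
    toy_irq_disable();
    toy_check_for_pcpu_work();
    guest_prepared++;            /* stands in for the vGIC and SError sync */
}
```

The point of the split is visible in the model: the guest-preparation step
no longer lives inside the loop body, so a later patch can slot extra work
between the two steps without touching the loop.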

Re: [Xen-devel] [PATCH v8 1/7] xen/pvh: Split CONFIG_XEN_PVH into CONFIG_PVH and CONFIG_XEN_PVH

2018-12-06 Thread Paolo Bonzini
On 06/12/18 23:34, Boris Ostrovsky wrote:
> On 12/6/18 5:11 PM, Paolo Bonzini wrote:
>> On 06/12/18 07:04, Maran Wilson wrote:
>>> +config PVH
>>> +   bool "Support for running PVH guests"
>>> +   ---help---
>>> + This option enables the PVH entry point for guest virtual machines
>>> + as specified in the x86/HVM direct boot ABI.
>>> +
>> IIUC this breaks "normal" bzImage boot, so we should have something like
>>
>>  The resulting kernel will not boot with most x86 boot loaders
>>  such as GRUB 
> 
> 
> Grub support for PVH guests (for Xen) is well under way.

Oh, nice. :)

>> or SYSLINUX.  Unless you plan to start the kernel
>>  using QEMU or Xen, you probably want to say N here.
> 
> I think PVH should not be user-selectable at all. It should be selected
> by either XEN_PVH or KVM_GUEST_PVH (which you suggested to drop).

KVM_GUEST_PVH is not entirely accurate because it's not just for KVM (it
can be used with QEMU and Apple's Hypervisor.framework for example).

It's also not necessarily just for QEMU (it could be implemented for
kvmtool if desired), but as long as it's in the help I guess it's
acceptable.

I think we could just drop the sentence about boot loaders from my
suggestion.

>>
>> and also
>>
>>  depends on !EFI
>>
>> because even though in principle it would be possible to write a PVH
>> loader for UEFI, PVH's start info does not support the EFI handover
>> protocol.
> 
> But we should be able to build the binary with both EFI and PVH?

Can you?  It's a completely different binary format, the EFI handover
protocol is invoked via a special entry point and needs the Linux header
format, not ELF.

Paolo

Re: [Xen-devel] [PATCH v8 1/7] xen/pvh: Split CONFIG_XEN_PVH into CONFIG_PVH and CONFIG_XEN_PVH

2018-12-06 Thread Boris Ostrovsky
On 12/6/18 5:11 PM, Paolo Bonzini wrote:
> On 06/12/18 07:04, Maran Wilson wrote:
>> +config PVH
>> +bool "Support for running PVH guests"
>> +---help---
>> +  This option enables the PVH entry point for guest virtual machines
>> +  as specified in the x86/HVM direct boot ABI.
>> +
> IIUC this breaks "normal" bzImage boot, so we should have something like
>
>   The resulting kernel will not boot with most x86 boot loaders
>   such as GRUB 


Grub support for PVH guests (for Xen) is well under way.


> or SYSLINUX.  Unless you plan to start the kernel
>   using QEMU or Xen, you probably want to say N here.

I think PVH should not be user-selectable at all. It should be selected
by either XEN_PVH or KVM_GUEST_PVH (which you suggested to drop).

>
> and also
>
>   depends on !EFI
>
> because even though in principle it would be possible to write a PVH
> loader for UEFI, PVH's start info does not support the EFI handover
> protocol.

But we should be able to build the binary with both EFI and PVH?

-boris

Re: [Xen-devel] [PATCH for-4.12 v2 07/17] xen/arm: vcpreg: Add wrappers to handle co-proc access trapped by HCR_EL2.TVM

2018-12-06 Thread Stefano Stabellini
On Tue, 4 Dec 2018, Julien Grall wrote:
> A follow-up patch will need to emulate some accesses to
> co-processor registers trapped by HCR_EL2.TVM. When set, all NS EL1 writes
> to the virtual memory control registers will be trapped to the hypervisor.
> 
> This patch adds the infrastructure to pass the access through to the host
> registers. For convenience, a bunch of macros have been added to
> generate the different helpers.
> 
> Note that HCR_EL2.TVM will be set in a follow-up patch dynamically.
> 
> Signed-off-by: Julien Grall 

Reviewed-by: Stefano Stabellini 

> ---
> Changes in v2:
> - Add missing include vreg.h
> - Fixup mask TMV_REG32_COMBINED
> - Update comments
> ---
>  xen/arch/arm/vcpreg.c| 149 +++
>  xen/include/asm-arm/cpregs.h |   1 +
>  2 files changed, 150 insertions(+)
> 
> diff --git a/xen/arch/arm/vcpreg.c b/xen/arch/arm/vcpreg.c
> index 7b783e4bcc..550c25ec3f 100644
> --- a/xen/arch/arm/vcpreg.c
> +++ b/xen/arch/arm/vcpreg.c
> @@ -23,8 +23,129 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  
> +/*
> + * Macros to help generating helpers for registers trapped when
> + * HCR_EL2.TVM is set.
> + *
> + * Note that it only traps NS write access from EL1.
> + *
> + *  - TVM_REG() should not be used outside of the macros. It is there to
> + *help defining TVM_REG32() and TVM_REG64()
> + *  - TVM_REG32(regname, xreg) and TVM_REG64(regname, xreg) are used to
> + *resp. generate helper accessing 32-bit and 64-bit register. "regname"
> + *is the Arm32 name and "xreg" the Arm64 name.
> + *  - TVM_REG32_COMBINED(lowreg, hireg, xreg) is used to generate a
> + *pair of registers sharing the same Arm64 register, but which are 2 distinct
> + *Arm32 registers. "lowreg" and "hireg" contain the names of the Arm32
> + *registers, "xreg" contains the name of the combined register on Arm64.
> + *The definitions of "lowreg" and "hireg" match the Armv8 specification,
> + *this means "lowreg" is an alias to xreg[31:0] and "hireg" is an alias to
> + *xreg[63:32].
> + *
> + */
> +
> +/* The name is passed from the upper macro to workaround macro expansion. */
> +#define TVM_REG(sz, func, reg...)   \
> +static bool func(struct cpu_user_regs *regs, uint##sz##_t *r, bool read)\
> +{   \
> +GUEST_BUG_ON(read); \
> +WRITE_SYSREG##sz(*r, reg);  \
> +\
> +return true;\
> +}
> +
> +#define TVM_REG32(regname, xreg) TVM_REG(32, vreg_emulate_##regname, xreg)
> +#define TVM_REG64(regname, xreg) TVM_REG(64, vreg_emulate_##regname, xreg)
> +
> +#ifdef CONFIG_ARM_32
> +#define TVM_REG32_COMBINED(lowreg, hireg, xreg) \
> +/* Use TVM_REG directly to workaround macro expansion. */   \
> +TVM_REG(32, vreg_emulate_##lowreg, lowreg)  \
> +TVM_REG(32, vreg_emulate_##hireg, hireg)
> +
> +#else /* CONFIG_ARM_64 */
> +#define TVM_REG32_COMBINED(lowreg, hireg, xreg) \
> +static bool vreg_emulate_##xreg(struct cpu_user_regs *regs, uint32_t *r,\
> +bool read, bool hi) \
> +{   \
> +register_t reg = READ_SYSREG(xreg); \
> +\
> +GUEST_BUG_ON(read); \
> +if ( hi ) /* reg[63:32] is AArch32 register hireg */\
> +{   \
> +reg &= GENMASK(31, 0);  \
> +reg |= ((uint64_t)*r) << 32;\
> +}   \
> +else /* reg[31:0] is AArch32 register lowreg. */\
> +{   \
> +reg &= GENMASK(63, 32); \
> +reg |= *r;  \
> +}   \
> +WRITE_SYSREG(reg, xreg);\
> +\
> +return true;\
> +}   \
> +
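
The lowreg/hireg aliasing described in the TVM_REG32_COMBINED comment can be
sketched as a plain function. This is a standalone illustration: write_half()
is a hypothetical name, and the literal masks stand in for the GENMASK()
calls used in the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Return the new 64-bit register value after a 32-bit write to one half.
 * hi == true  -> the write targets bits [63:32] (the "hireg" alias)
 * hi == false -> the write targets bits [31:0]  (the "lowreg" alias)
 */
static uint64_t write_half(uint64_t reg, uint32_t val, bool hi)
{
    if ( hi )
    {
        reg &= UINT64_C(0x00000000ffffffff);   /* keep the low half */
        reg |= (uint64_t)val << 32;
    }
    else
    {
        reg &= UINT64_C(0xffffffff00000000);   /* keep the high half */
        reg |= val;
    }

    return reg;
}
```

The cast to uint64_t before the shift matters: shifting a 32-bit value left
by 32 is undefined behaviour in C, so the value has to be widened first.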

Re: [Xen-devel] [PATCH for-4.12 v2 06/17] xen/arm: p2m: Introduce a function to resolve translation fault

2018-12-06 Thread Stefano Stabellini
On Tue, 4 Dec 2018, Julien Grall wrote:
> Currently a Stage-2 translation fault could happen because of:
> 1) MMIO emulation
> 2) Another pCPU was modifying the P2M using Break-Before-Make
> 3) Guest Physical address is not mapped
> 
> A follow-up patch will re-purpose the valid bit in an entry to generate
> translation fault. This would be used to do an action on each entry to
> track pages used for a given period.
> 
> When receiving the translation fault, we would need to walk the page
> tables to find the faulting entry and then toggle the valid bit. We can't use
> p2m_lookup() for this purpose as it only tells us whether the mapping exists.
> 
> So this patch adds a new function to walk the page-tables and updates
> the entry. This function will also handle 2) as it also requires walking
> the page-table.
> 
> The function is able to cope with both table and block entries having the
> valid bit unset. This gives flexibility to the function clearing the
> valid bits. To keep the algorithm simple, the fault will be propagated
> one level down. This will be repeated until a block entry has been
> reached.
> 
> At the moment, no action is taken when reaching a block/page entry
> other than setting the valid bit to 1.

Thanks, this explanation is much better

Acked-by: Stefano Stabellini 
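
The "propagate the fault one level down" scheme from the commit message can
be modelled with a toy page table. This is a sketch under simplifying
assumptions, not the Xen implementation: struct toy_entry, its "populated"
flag, TOY_ENTRIES and both toy_ functions are invented here, and locking,
mapping and TLB maintenance are left out entirely.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_ENTRIES 4

/*
 * Toy page-table entry: "valid" models the hardware valid bit, while
 * "populated" says the entry still holds real information even when
 * "valid" has been cleared.
 */
struct toy_entry {
    bool valid;
    bool populated;
    struct toy_entry *table;   /* non-NULL for a table entry, NULL for a leaf */
};

/* Clear the valid bit of every entry, skipping those already clear. */
static unsigned int toy_invalidate_table(struct toy_entry *table)
{
    unsigned int writes = 0;

    for ( size_t i = 0; i < TOY_ENTRIES; i++ )
    {
        if ( !table[i].valid )
            continue;
        table[i].valid = false;
        writes++;
    }

    return writes;
}

/*
 * Resolve one fault for the path idx[0..levels-1]. Returns true if the
 * access can be retried, false if the address is genuinely unmapped.
 */
static bool toy_resolve_fault(struct toy_entry *root, const size_t *idx,
                              size_t levels)
{
    struct toy_entry *table = root, *entry = NULL;

    for ( size_t level = 0; level < levels; level++ )
    {
        entry = &table[idx[level]];
        /* Stop at the first entry with the valid bit unset, or at a leaf. */
        if ( !entry->valid || entry->table == NULL )
            break;
        table = entry->table;
    }

    if ( entry == NULL )
        return false;           /* empty path, nothing to resolve */
    if ( entry->valid )
        return true;            /* someone else already fixed it up */
    if ( !entry->populated )
        return false;           /* nothing is mapped here */

    /* Push the fault one level down, then revalidate this entry. */
    if ( entry->table != NULL )
        toy_invalidate_table(entry->table);
    entry->valid = true;

    return true;
}
```

In this model, an access through an invalidated table entry faults once per
level: the first fault revalidates the table entry and invalidates its
children, and the retry then faults on the child until a leaf is reached.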


> Signed-off-by: Julien Grall 
> 
> ---
> Changes in v2:
> - Typoes
> - Add more comment
> - Skip clearing valid bit if it was already done
> - Move the prototype in p2m.h
> - Expand commit message
> ---
>  xen/arch/arm/p2m.c| 142 ++
>  xen/arch/arm/traps.c  |  10 ++--
>  xen/include/asm-arm/p2m.h |   2 +
>  3 files changed, 148 insertions(+), 6 deletions(-)
> 
> diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
> index 39680eeb6e..2706db3e67 100644
> --- a/xen/arch/arm/p2m.c
> +++ b/xen/arch/arm/p2m.c
> @@ -1035,6 +1035,148 @@ int p2m_set_entry(struct p2m_domain *p2m,
>  return rc;
>  }
>  
> +/* Invalidate all entries in the table. The p2m should be write locked. */
> +static void p2m_invalidate_table(struct p2m_domain *p2m, mfn_t mfn)
> +{
> +lpae_t *table;
> +unsigned int i;
> +
> +ASSERT(p2m_is_write_locked(p2m));
> +
> +table = map_domain_page(mfn);
> +
> +for ( i = 0; i < LPAE_ENTRIES; i++ )
> +{
> +lpae_t pte = table[i];
> +
> +/*
> + * Writing an entry can be expensive because it may involve
> + * cleaning the cache. So avoid updating the entry if the valid
> + * bit is already cleared.
> + */
> +if ( !pte.p2m.valid )
> +continue;
> +
> +pte.p2m.valid = 0;
> +
> +p2m_write_pte(&table[i], pte, p2m->clean_pte);
> +}
> +
> +unmap_domain_page(table);
> +
> +p2m->need_flush = true;
> +}
> +
> +/*
> + * Resolve any translation fault due to change in the p2m. This
> + * includes break-before-make and valid bit cleared.
> + */
> +bool p2m_resolve_translation_fault(struct domain *d, gfn_t gfn)
> +{
> +struct p2m_domain *p2m = p2m_get_hostp2m(d);
> +unsigned int level = 0;
> +bool resolved = false;
> +lpae_t entry, *table;
> +paddr_t addr = gfn_to_gaddr(gfn);
> +
> +/* Convenience aliases */
> +const unsigned int offsets[4] = {
> +zeroeth_table_offset(addr),
> +first_table_offset(addr),
> +second_table_offset(addr),
> +third_table_offset(addr)
> +};
> +
> +p2m_write_lock(p2m);
> +
> +/* This gfn is higher than the highest the p2m map currently holds */
> +if ( gfn_x(gfn) > gfn_x(p2m->max_mapped_gfn) )
> +goto out;
> +
> +table = p2m_get_root_pointer(p2m, gfn);
> +/*
> + * The table should always be non-NULL because the gfn is below
> + * p2m->max_mapped_gfn and the root table pages are always present.
> + */
> +BUG_ON(table == NULL);
> +
> +/*
> + * Go down the page-tables until an entry has the valid bit unset or
> + * a block/page entry has been hit.
> + */
> +for ( level = P2M_ROOT_LEVEL; level <= 3; level++ )
> +{
> +int rc;
> +
> +entry = table[offsets[level]];
> +
> +if ( level == 3 )
> +break;
> +
> +/* Stop as soon as we hit an entry with the valid bit unset. */
> +if ( !lpae_is_valid(entry) )
> +break;
> +
> +rc = p2m_next_level(p2m, true, level, &table, offsets[level]);
> +if ( rc == GUEST_TABLE_MAP_FAILED )
> +goto out_unmap;
> +else if ( rc != GUEST_TABLE_NORMAL_PAGE )
> +break;
> +}
> +
> +/*
> + * If the valid bit of the entry is set, it means someone was playing with
> + * the Stage-2 page table. Nothing to do and mark the fault as resolved.
> + */
> +if ( lpae_is_valid(entry) )
> +{
> +resolved = true;
> +goto out_unmap;
> +}
> +
> +/*
> + * The valid bit is unset. If the entry is still not valid then the 

Re: [Xen-devel] [PATCH for-4.12 v2 11/17] xen/arm: p2m: Extend p2m_get_entry to return the value of bit[0] (valid bit)

2018-12-06 Thread Stefano Stabellini
On Tue, 4 Dec 2018, Razvan Cojocaru wrote:
> On 12/4/18 10:26 PM, Julien Grall wrote:
> > With the recent changes, a P2M entry may be populated but may as not
> > valid. In some situation, it would be useful to know whether the entry
> 
> I think you mean to say "may not be valid"?
> 
> > has been marked available to guest in order to perform a specific
> > action. So extend p2m_get_entry to return the value of bit[0] (valid bit).
> > 
> > Signed-off-by: Julien Grall 
> 
> Other than that,
> 
> Acked-by: Razvan Cojocaru 

Same here:

Reviewed-by: Stefano Stabellini 

Re: [Xen-devel] [PATCH v8 0/7] KVM: x86: Allow Qemu/KVM to use PVH entry point

2018-12-06 Thread Paolo Bonzini
On 06/12/18 22:58, Boris Ostrovsky wrote:
> On 12/6/18 4:37 PM, Borislav Petkov wrote:
>> On Thu, Dec 06, 2018 at 10:21:12PM +0100, Paolo Bonzini wrote:
>>> Thanks!  I should be able to post a Tested-by next Monday.  Boris, are
>>> you going to pick it up for 4.21?
>> Boris me or Boris O.?
>>
>> :-)
>>
> 
> O. ;-)
> 
> There are some minor changes in non-xen x86 code so it would be good to
> get x86 maintainers' ack.

It's not really code, only Kconfig (and I remarked on it just now), but
it doesn't hurt of course.

> And as far as qemu/qboot changes, should we assume that the general
> approach is acceptable? I understand that the patches will probably need
> to go through some iterations but I want to make sure we have a path
> forward there.

Yes, the general approach is fine.  I have already reviewed the qboot
parts, I guess we will also want an option ROM similar to
linuxboot/multiboot for SeaBIOS support but that's a simple matter of
programming. :)

Paolo

[Xen-devel] [linux-3.18 test] 131035: regressions - FAIL

2018-12-06 Thread osstest service owner
flight 131035 linux-3.18 real [real]
http://logs.test-lab.xenproject.org/osstest/logs/131035/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-i386-examine  8 reboot  fail REGR. vs. 128858
 test-amd64-i386-xl-qemuu-debianhvm-amd64-shadow  7 xen-boot  fail REGR. vs. 128858
 test-amd64-i386-xl-raw  7 xen-boot  fail REGR. vs. 128858
 test-amd64-amd64-xl-qemuu-dmrestrict-amd64-dmrestrict  7 xen-boot  fail REGR. vs. 128858
 test-amd64-amd64-xl-multivcpu  7 xen-boot  fail REGR. vs. 128858
 test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm  7 xen-boot  fail REGR. vs. 128858
 test-amd64-i386-xl  7 xen-boot  fail REGR. vs. 128858
 test-amd64-i386-xl-qemuu-debianhvm-amd64  7 xen-boot  fail REGR. vs. 128858
 test-amd64-amd64-xl-pvshim  7 xen-boot  fail REGR. vs. 128858
 test-amd64-i386-pair  10 xen-boot/src_host  fail REGR. vs. 128858
 test-amd64-i386-pair  11 xen-boot/dst_host  fail REGR. vs. 128858
 test-amd64-i386-xl-qemuu-ovmf-amd64  7 xen-boot  fail REGR. vs. 128858
 test-amd64-i386-libvirt-pair  10 xen-boot/src_host  fail REGR. vs. 128858
 test-amd64-i386-libvirt-pair  11 xen-boot/dst_host  fail REGR. vs. 128858
 test-amd64-amd64-rumprun-amd64  7 xen-boot  fail REGR. vs. 128858
 test-amd64-amd64-libvirt-pair  10 xen-boot/src_host  fail REGR. vs. 128858
 test-amd64-amd64-libvirt-pair  11 xen-boot/dst_host  fail REGR. vs. 128858
 test-amd64-amd64-xl-xsm  7 xen-boot  fail REGR. vs. 128858
 test-amd64-amd64-xl-qemuu-ovmf-amd64  7 xen-boot  fail REGR. vs. 128858
 test-amd64-amd64-pair  10 xen-boot/src_host  fail REGR. vs. 128858
 test-amd64-amd64-pair  11 xen-boot/dst_host  fail REGR. vs. 128858

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-xl-vhd  10 debian-di-install  fail like 128841
 test-armhf-armhf-libvirt  14 saverestore-support-check  fail like 128858
 test-armhf-armhf-libvirt-raw  13 saverestore-support-check  fail like 128858
 test-amd64-amd64-xl-qemut-win7-amd64  17 guest-stop  fail like 128858
 test-amd64-amd64-xl-qemuu-win7-amd64  17 guest-stop  fail like 128858
 test-amd64-i386-xl-qemuu-win7-amd64  17 guest-stop  fail like 128858
 test-amd64-i386-xl-qemut-win7-amd64  17 guest-stop  fail like 128858
 test-amd64-i386-xl-qemuu-dmrestrict-amd64-dmrestrict  10 debian-hvm-install  fail never pass
 test-amd64-amd64-xl-pvhv2-intel  12 guest-start  fail never pass
 test-amd64-amd64-xl-pvhv2-amd  12 guest-start  fail never pass
 test-amd64-i386-xl-pvshim  12 guest-start  fail never pass
 test-amd64-amd64-libvirt  13 migrate-support-check  fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm  11 migrate-support-check  fail never pass
 test-amd64-amd64-libvirt-xsm  13 migrate-support-check  fail never pass
 test-amd64-i386-libvirt  13 migrate-support-check  fail never pass
 test-amd64-i386-libvirt-xsm  13 migrate-support-check  fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm  11 migrate-support-check  fail never pass
 test-armhf-armhf-xl-arndale  13 migrate-support-check  fail never pass
 test-armhf-armhf-xl-arndale  14 saverestore-support-check  fail never pass
 test-amd64-amd64-qemuu-nested-amd  17 debian-hvm-install/l1/l2  fail never pass
 test-amd64-amd64-libvirt-vhd  12 migrate-support-check  fail never pass
 test-armhf-armhf-xl-rtds  13 migrate-support-check  fail never pass
 test-armhf-armhf-xl-rtds  14 saverestore-support-check  fail never pass
 test-armhf-armhf-libvirt  13 migrate-support-check  fail never pass
 test-armhf-armhf-xl  13 migrate-support-check  fail never pass
 test-armhf-armhf-xl  14 saverestore-support-check  fail never pass
 test-armhf-armhf-xl-credit1  13 migrate-support-check  fail never pass
 test-armhf-armhf-xl-credit1  14 saverestore-support-check  fail never pass
 test-armhf-armhf-xl-cubietruck  13 migrate-support-check  fail never pass
 test-armhf-armhf-xl-cubietruck  14 saverestore-support-check  fail never pass
 test-armhf-armhf-xl-multivcpu  13 migrate-support-check  fail never pass
 test-armhf-armhf-xl-multivcpu  14 saverestore-support-check  fail never pass
 test-armhf-armhf-libvirt-raw  12 migrate-support-check  fail never pass
 test-amd64-amd64-xl-qemuu-ws16-amd64  17 guest-stop  fail never pass
 test-amd64-i386-xl-qemut-ws16-amd64  17 guest-stop  fail never pass
 test-armhf-armhf-xl-credit2  13 migrate-support-check  fail never pass
 test-armhf-armhf-xl-credit2  14 saverestore-support-check  fail never pass
 test-amd64-i386-xl-qemuu-ws16-amd64 17 

Re: [Xen-devel] [PATCH v8 1/7] xen/pvh: Split CONFIG_XEN_PVH into CONFIG_PVH and CONFIG_XEN_PVH

2018-12-06 Thread Paolo Bonzini
On 06/12/18 07:04, Maran Wilson wrote:
> +config PVH
> + bool "Support for running PVH guests"
> + ---help---
> +   This option enables the PVH entry point for guest virtual machines
> +   as specified in the x86/HVM direct boot ABI.
> +

IIUC this breaks "normal" bzImage boot, so we should have something like

The resulting kernel will not boot with most x86 boot loaders
such as GRUB or SYSLINUX.  Unless you plan to start the kernel
using QEMU or Xen, you probably want to say N here.

and also

depends on !EFI

because even though in principle it would be possible to write a PVH
loader for UEFI, PVH's start info does not support the EFI handover
protocol.

Paolo

Re: [Xen-devel] [PATCH v8 7/7] KVM: x86: Allow Qemu/KVM to use PVH entry point

2018-12-06 Thread Paolo Bonzini
On 06/12/18 07:06, Maran Wilson wrote:
> +config KVM_GUEST_PVH
> + bool "Support for running as a KVM PVH guest"
> + depends on KVM_GUEST
> + select PVH
> + ---help---
> +   This option enables starting KVM guests via the PVH entry point as
> +   specified in the x86/HVM direct boot ABI.
> +

This symbol is unused, so it can be removed.

Paolo

Re: [Xen-devel] [PATCH for-4.12 v2 05/17] xen/arm: p2m: Handle translation fault in get_page_from_gva

2018-12-06 Thread Stefano Stabellini
On Wed, 5 Dec 2018, Julien Grall wrote:
> On 04/12/2018 23:59, Stefano Stabellini wrote:
> > On Tue, 4 Dec 2018, Julien Grall wrote:
> > > A follow-up patch will re-purpose the valid bit of LPAE entries to
> > > generate fault even on entry containing valid information.
> > > 
> > > This means that when translating a guest VA to guest PA (e.g IPA) will
> > > fail if the Stage-2 entries used have the valid bit unset. Because of
> > > that, we need to fallback to walk the page-table in software to check
> > > whether the fault was expected.
> > > 
> > > This patch adds the software page-table walk on all the translation
> > > fault. It would be possible in the future to avoid pointless walk when
> > > the fault in PAR_EL1 is not a translation fault.
> > > 
> > > Signed-off-by: Julien Grall 
> > > 
> > > ---
> > > 
> > > There are a couple of TODO in the code. They are clean-up and performance
> > > improvement (e.g when the fault cannot be handled) that could be delayed
> > > after
> > > the series has been merged.
> > > 
> > >  Changes in v2:
> > >  - Check stage-2 permission during software lookup
> > >  - Fix typoes
> > > ---
> > >   xen/arch/arm/p2m.c | 66
> > > --
> > > >   1 file changed, 59 insertions(+), 7 deletions(-)
> > > 
> > > diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
> > > index 47b54c792e..39680eeb6e 100644
> > > --- a/xen/arch/arm/p2m.c
> > > +++ b/xen/arch/arm/p2m.c
> > > @@ -6,6 +6,7 @@
> > > #include 
> > >   #include 
> > > +#include 
> > >   #include 
> > > #define MAX_VMID_8_BIT  (1UL << 8)
> > > @@ -1430,6 +1431,8 @@ struct page_info *get_page_from_gva(struct vcpu *v,
> > > vaddr_t va,
> > >   struct page_info *page = NULL;
> > >   paddr_t maddr = 0;
> > >   uint64_t par;
> > > +mfn_t mfn;
> > > +p2m_type_t t;
> > > /*
> > >* XXX: To support a different vCPU, we would need to load the
> > > @@ -1446,8 +1449,29 @@ struct page_info *get_page_from_gva(struct vcpu *v,
> > > vaddr_t va,
> > > >   par = gvirt_to_maddr(va, &maddr, flags);
> > >   p2m_read_unlock(p2m);
> > >   +/*
> > > + * gvirt_to_maddr may fail if the entry does not have the valid bit
> > > + * set. Fallback to the second method:
> > > + *  1) Translate the VA to IPA using software lookup -> Stage-1
> > > page-table
> > > + *  may not be accessible because the stage-2 entries may have valid
> > > + *  bit unset.
> > > + *  2) Software lookup of the MFN
> > > + *
> > > + * Note that when memaccess is enabled, we instead call directly
> > > + * p2m_mem_access_check_and_get_page(...). Because the function is a
> > > + * a variant of the methods described above, it will be able to
> > > + * handle entries with valid bit unset.
> > > + *
> > > + * TODO: Integrate more nicely memaccess with the rest of the
> > > + * function.
> > > + * TODO: Use the fault error in PAR_EL1 to avoid pointless
> > > + *  translation.
> > > + */
> > >   if ( par )
> > >   {
> > > +paddr_t ipa;
> > > +unsigned int s1_perms;
> > > +
> > >   /*
> > >* When memaccess is enabled, the translation GVA to MADDR may
> > >* have failed because of a permission fault.
> > > @@ -1455,20 +1479,48 @@ struct page_info *get_page_from_gva(struct vcpu
> > > *v, vaddr_t va,
> > >   if ( p2m->mem_access_enabled )
> > >   return p2m_mem_access_check_and_get_page(va, flags, v);
> > >   -dprintk(XENLOG_G_DEBUG,
> > > -"%pv: gvirt_to_maddr failed va=%#"PRIvaddr" flags=0x%lx
> > > par=%#"PRIx64"\n",
> > > -v, va, flags, par);
> > > -return NULL;
> > > +/*
> > > + * The software stage-1 table walk can still fail, e.g, if the
> > > + * GVA is not mapped.
> > > + */
> > > > +if ( !guest_walk_tables(v, va, &ipa, &s1_perms) )
> > > +{
> > > +dprintk(XENLOG_G_DEBUG,
> > > +"%pv: Failed to walk page-table va %#"PRIvaddr"\n",
> > > v, va);
> > > +return NULL;
> > > +}
> > > +
> > > > +mfn = p2m_lookup(d, gaddr_to_gfn(ipa), &t);
> > > +if ( mfn_eq(INVALID_MFN, mfn) || !p2m_is_ram(t) )
> > > +return NULL;
> > > +
> > > +/*
> > > + * Check permission that are assumed by the caller. For instance
> > > + * in case of guestcopy, the caller assumes that the translated
> > > + * page can be accessed with the requested permissions. If this
> > > + * is not the case, we should fail.
> > > + *
> > > + * Please note that we do not check for the GV2M_EXEC
> > > + * permission. This is fine because the hardware-based
> > > translation
> > > + * instruction does not test for execute permissions.
> > > + */
> > > +if ( (flags & GV2M_WRITE) && !(s1_perms & GV2M_WRITE) )
> > > +  

Re: [Xen-devel] [PATCH for-4.12 v2 04/17] xen/arm: p2m: Introduce p2m_is_valid and use it

2018-12-06 Thread Stefano Stabellini
On Wed, 5 Dec 2018, Julien Grall wrote:
> Hi Stefano,
> 
> On 04/12/2018 23:50, Stefano Stabellini wrote:
> > On Tue, 4 Dec 2018, Julien Grall wrote:
> > > The LPAE format allows to store information in an entry even with the
> > > valid bit unset. In a follow-up patch, we will take advantage of this
> > > feature to re-purpose the valid bit for generating a translation fault
> > > even if an entry contains valid information.
> > > 
> > > So we need a different way to know whether an entry contains valid
> > > information. It is possible to use the information hold in the p2m_type
> > > to know for that purpose. Indeed all entries containing valid
> > > information will have a valid p2m type (i.e p2m_type != p2m_invalid).
> > > 
> > > This patch introduces a new helper p2m_is_valid, which implements that
> > > idea, and replace most of lpae_is_valid call with the new helper. The ones
> > > remaining are for TLBs handling and entries accounting.
> > > 
> > > With the renaming there are 2 others changes required:
> > >  - Generate table entry with a valid p2m type
> > >  - Detect new mapping for proper stats accounting
> > > 
> > > Signed-off-by: Julien Grall 
> > 
> > Reviewed-by: Stefano Stabellini 
> > 
> > (This patch doesn't apply to master, please rebase)
> 
> Why are you trying to apply to master? This series (as most of my series) are
> based on staging at the time it was sent. I tried to apply this patch today on
> staging and I didn't find any issue.

No problem then. I thought the series was based on an older tree, but
instead it was one step ahead.
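
For readers following the thread, the idea in the quoted commit message
can be illustrated with a minimal sketch. It assumes a simplified entry
that holds only a hardware valid bit and a p2m type. The names demo_pte,
demo_hw_is_valid, demo_p2m_is_valid and demo_force_fault are hypothetical
and are not the Xen definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified p2m types; DEMO_P2M_INVALID marks an empty entry. */
typedef enum {
    DEMO_P2M_INVALID = 0,
    DEMO_P2M_RAM_RW,
    DEMO_P2M_RAM_RO
} demo_p2m_type_t;

/* Toy entry: the hardware valid bit and the software-visible type. */
struct demo_pte {
    bool hw_valid;
    demo_p2m_type_t type;
};

/* What the MMU looks at: only the valid bit. */
static bool demo_hw_is_valid(struct demo_pte pte)
{
    return pte.hw_valid;
}

/* What the software looks at: whether the entry carries a real type. */
static bool demo_p2m_is_valid(struct demo_pte pte)
{
    return pte.type != DEMO_P2M_INVALID;
}

/* Clear the valid bit to force a translation fault, keeping the contents. */
static void demo_force_fault(struct demo_pte *pte)
{
    assert(demo_p2m_is_valid(*pte));
    pte->hw_valid = false;
}
```

After demo_force_fault() the entry still reports a valid p2m type even
though the hardware would fault on it, which is the distinction the
commit message relies on.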


Re: [Xen-devel] [PATCH v8 0/7] KVM: x86: Allow Qemu/KVM to use PVH entry point

2018-12-06 Thread Boris Ostrovsky
On 12/6/18 4:37 PM, Borislav Petkov wrote:
> On Thu, Dec 06, 2018 at 10:21:12PM +0100, Paolo Bonzini wrote:
>> Thanks!  I should be able to post a Tested-by next Monday.  Boris, are
>> you going to pick it up for 4.21?
> Boris me or Boris O.?
>
> :-)
>

O. ;-)

There are some minor changes in non-xen x86 code so it would be good to
get x86 maintainers' ack.

And as far as qemu/qboot changes, should we assume that the general
approach is acceptable? I understand that the patches will probably need
to go through some iterations but I want to make sure we have a path
forward there.


-boris


Re: [Xen-devel] [PATCH v8 0/7] KVM: x86: Allow Qemu/KVM to use PVH entry point

2018-12-06 Thread Borislav Petkov
On Thu, Dec 06, 2018 at 10:21:12PM +0100, Paolo Bonzini wrote:
> Thanks!  I should be able to post a Tested-by next Monday.  Boris, are
> you going to pick it up for 4.21?

Boris me or Boris O.?

:-)

-- 
Regards/Gruss,
Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.


Re: [Xen-devel] [PATCH v8 0/7] KVM: x86: Allow Qemu/KVM to use PVH entry point

2018-12-06 Thread Paolo Bonzini
On 06/12/18 07:02, Maran Wilson wrote:
> For certain applications it is desirable to rapidly boot a KVM virtual
> machine. In cases where legacy hardware and software support within the
> guest is not needed, Qemu should be able to boot directly into the
> uncompressed Linux kernel binary without the need to run firmware.
> 
> There already exists an ABI to allow this for Xen PVH guests and the ABI
> is supported by Linux and FreeBSD:
> 
>https://xenbits.xen.org/docs/unstable/misc/pvh.html
> 
> This patch series would enable Qemu to use that same entry point for
> booting KVM guests.

Thanks!  I should be able to post a Tested-by next Monday.  Boris, are
you going to pick it up for 4.21?

Paolo


Re: [Xen-devel] Patch "x86/entry/64: Remove %ebx handling from error_entry/exit" has been added to the 4.9-stable tree

2018-12-06 Thread David Woodhouse
On Thu, 2018-12-06 at 10:49 -0800, Andy Lutomirski wrote:
> > On Dec 6, 2018, at 9:36 AM, Andrew Cooper  wrote:
> > Basically - what is happening is that xen_load_tls() is invalidating the
> > %gs selector while %gs is still non-NUL.
> > 
> > If this happens to intersect with a vcpu reschedule, %gs (being non-NUL)
> > takes precedence over KERNGSBASE, and faults when Xen tries to reload
> > it.  This results in the failsafe callback being invoked.
> > 
> > I think the correct course of action is to use xen_load_gs_index(0)
> > (poorly named - it is a hypercall which does swapgs; mov to %gs; swapgs)
> > before using update_descriptor() to invalidate the segment.
> > 
> > That will reset %gs to 0 without touching KERNGSBASE, and can be queued
> > in the same multicall as the update_descriptor() hypercall.
> 
> Sounds good to me as long as we skip it on native.

Like this? The other option is just to declare that we don't care. On
the rare occasion that it does happen to preempt and then take the trap
on reloading, xen_failsafe_callback is actually doing the right thing
and just leaving %gs as zero. We'd just need to fix the comments so
they explicitly note this case is handled there too. At the moment it
just says 'Retry the IRET', as I noted before.
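
The ordering argument above can be illustrated with a toy model. This is
only a sketch under stated assumptions: the operation names, the
demo_vcpu state and the "stale selector window" counter are all
hypothetical and do not correspond to real hypercall numbers or Xen
state. It shows why clearing the selector must be queued before the
descriptor is wiped when both go into one batch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical operation kinds; these are not real hypercall numbers. */
enum demo_op { DEMO_CLEAR_GS_SELECTOR, DEMO_UPDATE_DESCRIPTOR };

struct demo_entry {
    enum demo_op op;
    unsigned long arg;
};

#define DEMO_BATCH_MAX 8

struct demo_batch {
    struct demo_entry entries[DEMO_BATCH_MAX];
    size_t count;
};

/* Toy vCPU state: the selector in %gs and whether its descriptor is live. */
struct demo_vcpu {
    unsigned int gs_selector;
    int gs_descriptor_present;
    unsigned int stale_selector_windows;
};

/* Append one operation to the batch; returns -1 if the batch is full. */
static int demo_queue(struct demo_batch *b, enum demo_op op, unsigned long arg)
{
    assert(b->count <= DEMO_BATCH_MAX);
    if (b->count == DEMO_BATCH_MAX)
        return -1;
    b->entries[b->count].op = op;
    b->entries[b->count].arg = arg;
    b->count++;
    return 0;
}

/* Execute queued operations strictly in submission order, then reset. */
static void demo_flush(struct demo_batch *b, struct demo_vcpu *v)
{
    for (size_t i = 0; i < b->count; i++) {
        const struct demo_entry *e = &b->entries[i];

        if (e->op == DEMO_CLEAR_GS_SELECTOR)
            v->gs_selector = 0;
        else
            v->gs_descriptor_present = (e->arg != 0);

        /* A reschedule here would reload a selector with no descriptor. */
        if (v->gs_selector != 0 && !v->gs_descriptor_present)
            v->stale_selector_windows++;
    }
    b->count = 0;
}
```

Queuing only the descriptor update leaves one such window in the model;
queuing the selector clear first leaves none.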

diff --git a/arch/x86/include/asm/xen/hypercall.h 
b/arch/x86/include/asm/xen/hypercall.h
index ef05bea7010d..e8b383b24246 100644
--- a/arch/x86/include/asm/xen/hypercall.h
+++ b/arch/x86/include/asm/xen/hypercall.h
@@ -520,4 +520,15 @@ MULTI_stack_switch(struct multicall_entry *mcl,
trace_xen_mc_entry(mcl, 2);
 }
 
+static inline void
+MULTI_set_segment_base(struct multicall_entry *mcl,
+  int reg, unsigned long value)
+{
+   mcl->op = __HYPERVISOR_set_segment_base;
+   mcl->args[0] = reg;
+   mcl->args[1] = value;
+
+   trace_xen_mc_entry(mcl, 2);
+}
+
 #endif /* _ASM_X86_XEN_HYPERCALL_H */
diff --git a/arch/x86/xen/enlighten_pv.c b/arch/x86/xen/enlighten_pv.c
index 2f6787fc7106..722f1f51e20c 100644
--- a/arch/x86/xen/enlighten_pv.c
+++ b/arch/x86/xen/enlighten_pv.c
@@ -527,6 +527,8 @@ static void load_TLS_descriptor(struct thread_struct *t,
 
 static void xen_load_tls(struct thread_struct *t, unsigned int cpu)
 {
+   xen_mc_batch();
+
/*
 * XXX sleazy hack: If we're being called in a lazy-cpu zone
 * and lazy gs handling is enabled, it means we're in a
@@ -537,24 +539,24 @@ static void xen_load_tls(struct thread_struct *t, 
unsigned int cpu)
 * This will go away as soon as Xen has been modified to not
 * save/restore %gs for normal hypercalls.
 *
-* On x86_64, this hack is not used for %gs, because gs points
-* to KERNEL_GS_BASE (and uses it for PDA references), so we
-* must not zero %gs on x86_64
-*
 * For x86_64, we need to zero %fs, otherwise we may get an
 * exception between the new %fs descriptor being loaded and
-* %fs being effectively cleared at __switch_to().
+* %fs being effectively cleared at __switch_to(). We can't
+* just zero %gs, but we do need to clear the selector in
+* case of a Xen vCPU context switch before it gets reloaded
+* which would also cause a fault.
 */
if (paravirt_get_lazy_mode() == PARAVIRT_LAZY_CPU) {
 #ifdef CONFIG_X86_32
lazy_load_gs(0);
 #else
+   struct multicall_space mc = __xen_mc_entry(0);
+   MULTI_set_segment_base(mc.mc, SEGBASE_GS_USER_SEL, 0);
+
loadsegment(fs, 0);
 #endif
}
 
-   xen_mc_batch();
-
load_TLS_descriptor(t, cpu, 0);
load_TLS_descriptor(t, cpu, 1);
load_TLS_descriptor(t, cpu, 2);



Re: [Xen-devel] [PATCH 8/9] x86/amd: Virtualise MSR_VIRT_SPEC_CTRL for guests

2018-12-06 Thread Woods, Brian
On Wed, Dec 05, 2018 at 01:41:30AM -0700, Jan Beulich wrote:
> >>> On 04.12.18 at 22:35,  wrote:
> > The other thing I don't get is why advertise virtualized SSBD when the
> > guest setting it does nothing?  If ssbd_opt=true is set, as the code is
> > now, why even advertise it to the guest?  I'd suggest either allowing
> > the guest to turn it off or not advertise it at all (when ssbd_opt =
> > true).
> 
> I think it's better to advertise the feature nevertheless: Otherwise
> the guest might either try some other way of mitigating the
> (believed) vulnerability, or it may report in its logs that it's vulnerable
> (without mitigation) when it really isn't.
> 
> Jan
> 

I can understand that reasoning, but I'd still argue that it would be
more correct to add an option that forces guests to use SSBD (like
setting ssbd=yes in these patches), and to have the default of ssbd=yes
allow the guest to turn it off.  I'm not going to be adamant about it
though.

-- 
Brian Woods


Re: [Xen-devel] [PATCH 6/9] x86/amd: Allocate resources to cope with LS_CFG being per-core on Fam17h

2018-12-06 Thread Woods, Brian
On Thu, Dec 06, 2018 at 06:46:51PM +, Andrew Cooper wrote:
> On 06/12/2018 08:54, Jan Beulich wrote:
> >>>> On 05.12.18 at 18:05,  wrote:
> >> On 05/12/2018 16:57, Jan Beulich wrote:
> >>>>>> On 03.12.18 at 17:18,  wrote:
> >>>> --- a/xen/arch/x86/cpu/amd.c
> >>>> +++ b/xen/arch/x86/cpu/amd.c
> >>>> @@ -419,6 +419,97 @@ static void __init noinline amd_probe_legacy_ssbd(void)
> >>>>  }
> >>>>
> >>>>  /*
> >>>> + * This is all a gross hack, but Xen really doesn't have flexible-enough
> >>>> + * per-cpu infrastructure to do it properly.  For Zen(v1) with SMT active,
> >>>> + * MSR_AMD64_LS_CFG is per-core rather than per-thread, so we need a per-core
> >>>> + * spinlock to synchronise updates of the MSR.
> >>>> + *
> >>>> + * We can't use per-cpu state because taking one CPU offline would free state
> >>>> + * under the feet of another.  Ideally, we'd allocate memory on the AP boot
> >>>> + * path, but by the time the sibling information is calculated sufficiently
> >>>> + * for us to locate the per-core state, it's too late to fail the AP boot.
> >>>> + *
> >>>> + * We also can't afford to end up in a heterogeneous scenario with some CPUs
> >>>> + * unable to safely use LS_CFG.
> >>>> + *
> >>>> + * Therefore, we have to allocate for the worse-case scenario, which is
> >>>> + * believed to be 4 sockets.  Any allocation failure cause us to turn LS_CFG
> >>>> + * off, as this is fractionally better than failing to boot.
> >>>> + */
> >>>> +static struct ssbd_ls_cfg {
> >>>> +spinlock_t lock;
> >>>> +unsigned int disable_count;
> >>>> +} *ssbd_ls_cfg[4];
> >>> Same question as to Brian for his original code: Instead of the
> >>> hard-coding of 4, can't you use nr_sockets here?
> >>> smp_prepare_cpus() runs before pre-SMP initcalls after all.
> >> nr_sockets has zero connection with reality as far as I can tell.
> >>
> >> On this particular box it reports 6 when the correct answer is 2.  I've
> >> got some Intel boxes where nr_sockets reports 15 and the correct answer
> >> is 4.
> > If you look back at when it was introduced, the main goal was
> > for it to never be too low. Any improvements to its calculation
> > are welcome, provided they maintain that guarantee. Too high
> > a socket count is imo still better than a hard-coded one.
> 
> Even for the extra 2k of memory it will waste?
> 
> ~Andrew

Just as a side note, for processors that use MSR LS_CFG and have SMT
enabled (F17h), there should only be 2 physical sockets.  The 4 was a
worst case (chosen before some other information was available).
Realistically, there should only be a max of 2 physical sockets when
this is needed.  Although, having 4 could be nice as a safe buffer and
only costs 16 bytes.
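
The 16-byte figure can be checked with a small sketch. It assumes the
table is an array of per-socket pointers, as in the quoted
"*ssbd_ls_cfg[4]" declaration, and a platform with 8-byte pointers. The
demo_ names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

struct demo_ls_cfg;  /* opaque: only pointers to it are stored in the table */

/* Bytes used by a table of n_sockets per-socket pointers. */
static size_t demo_slot_bytes(size_t n_sockets)
{
    return n_sockets * sizeof(struct demo_ls_cfg *);
}

/* Extra bytes needed to size the table for `bigger` sockets, not `smaller`. */
static size_t demo_extra_bytes(size_t bigger, size_t smaller)
{
    assert(bigger >= smaller);
    return demo_slot_bytes(bigger) - demo_slot_bytes(smaller);
}
```

With 8-byte pointers, demo_extra_bytes(4, 2) is 2 * 8 = 16 bytes. This
counts only the pointer table, not any per-core state allocated behind
each pointer.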

-- 
Brian Woods


[Xen-devel] [ovmf test] 131054: regressions - FAIL

2018-12-06 Thread osstest service owner
flight 131054 ovmf real [real]
http://logs.test-lab.xenproject.org/osstest/logs/131054/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-i386-xsm                6 xen-build                fail REGR. vs. 129475
 build-i386                    6 xen-build                fail REGR. vs. 129475
 build-amd64                   6 xen-build                fail REGR. vs. 129475
 build-amd64-xsm               6 xen-build                fail REGR. vs. 129475

Tests which did not succeed, but are not blocking:
 test-amd64-i386-xl-qemuu-ovmf-amd64   1 build-check(1)        blocked  n/a
 build-amd64-libvirt                   1 build-check(1)        blocked  n/a
 test-amd64-amd64-xl-qemuu-ovmf-amd64  1 build-check(1)        blocked  n/a
 build-i386-libvirt                    1 build-check(1)        blocked  n/a

version targeted for testing:
 ovmf 8efc6d84ca41e692cc60702e1f27276f7883b6db
baseline version:
 ovmf 5ae3184d8c59f7bbb84bad482df6b8020ba58188

Last test of basis   129475  2018-11-05 21:13:11 Z   30 days
Failing since        129526  2018-11-06 20:49:26 Z   29 days  146 attempts
Testing same since   131054  2018-12-05 10:42:30 Z    1 days    1 attempts


People who touched revisions under test:
  Achin Gupta 
  Ard Biesheuvel 
  BobCF 
  Chasel Chiu 
  Chasel, Chiu 
  Dandan Bi 
  David Wei 
  Eric Dong 
  Feng, Bob C 
  Fu Siyuan 
  Gary Lin 
  Hao Wu 
  Jeff Brasen 
  Jian J Wang 
  Jiaxin Wu 
  Jiewen Yao 
  Laszlo Ersek 
  Leif Lindholm 
  Liming Gao 
  Liu Yu 
  Marc Zyngier 
  Marcin Wojtas 
  Ming Huang 
  Pedroa Liu 
  Ruiyu Ni 
  shenglei 
  Shenglei Zhang 
  Star Zeng 
  Sughosh Ganu 
  Sun, Zailiang 
  Thomas Abraham 
  Tomasz Michalec 
  Vijayenthiran Subramaniam 
  Wang BinX A 
  Wu Jiaxin 
  Yonghong Zhu 
  yuchenlin 
  Zailiang Sun 
  Zhang, Chao B 
  zwei4 

jobs:
 build-amd64-xsm                                              fail
 build-i386-xsm                                               fail
 build-amd64                                                  fail
 build-i386                                                   fail
 build-amd64-libvirt                                          blocked
 build-i386-libvirt                                           blocked
 build-amd64-pvops                                            pass
 build-i386-pvops                                             pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64                         blocked
 test-amd64-i386-xl-qemuu-ovmf-amd64                          blocked



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Not pushing.

(No revision log; it would be 2658 lines long.)


[Xen-devel] [PATCH 2/2] x86/pv: Code improvements to do_update_descriptor()

2018-12-06 Thread Andrew Cooper
 * Add "uint64_t raw" to seg_desc_t to remove the opencoded uint64_t casting
   in this function.
 * Rename the 'pa' parameter to 'gaddr', because it lives in GFN space rather
   than physical address space.
 * Use gfn_t and mfn_t rather than unsigned longs.
 * Check the alignment and proposed new descriptor before taking a page
   reference.
 * Reuse the out label for all exit paths.
 * Use the more flexible ACCESS_ONCE() accessor in preference to
   write_atomic()

No expected change in behaviour.

Signed-off-by: Andrew Cooper 
---
CC: Jan Beulich 
CC: Wei Liu 
CC: Roger Pau Monné 
---
 xen/arch/x86/pv/descriptor-tables.c | 41 +
 xen/include/asm-x86/desc.h  |  7 +--
 2 files changed, 24 insertions(+), 24 deletions(-)

diff --git a/xen/arch/x86/pv/descriptor-tables.c 
b/xen/arch/x86/pv/descriptor-tables.c
index caa62eb..efcfbab 100644
--- a/xen/arch/x86/pv/descriptor-tables.c
+++ b/xen/arch/x86/pv/descriptor-tables.c
@@ -206,30 +206,26 @@ int compat_set_gdt(XEN_GUEST_HANDLE_PARAM(uint) 
frame_list,
 return ret;
 }
 
-long do_update_descriptor(uint64_t pa, uint64_t desc)
+long do_update_descriptor(uint64_t gaddr, uint64_t desc)
 {
 struct domain *currd = current->domain;
-unsigned long gmfn = pa >> PAGE_SHIFT;
-unsigned long mfn;
-unsigned int  offset;
-seg_desc_t *gdt_pent, d;
+gfn_t gfn = gaddr_to_gfn(gaddr);
+mfn_t mfn;
+seg_desc_t *entry, d;
 struct page_info *page;
 long ret = -EINVAL;
 
-offset = ((unsigned int)pa & ~PAGE_MASK) / sizeof(seg_desc_t);
+d.raw = desc;
 
-*(uint64_t *)&d = desc;
+/* gaddr must be aligned, or it will corrupt adjacent descriptors. */
+if ( !IS_ALIGNED(gaddr, sizeof(d)) || !check_descriptor(currd, &d) )
+goto out;
 
-page = get_page_from_gfn(currd, gmfn, NULL, P2M_ALLOC);
-if ( (((unsigned int)pa % sizeof(seg_desc_t)) != 0) ||
- !page ||
- !check_descriptor(currd, &d) )
-{
-if ( page )
-put_page(page);
-return -EINVAL;
-}
-mfn = mfn_x(page_to_mfn(page));
+page = get_page_from_gfn(currd, gfn_x(gfn), NULL, P2M_ALLOC);
+if ( !page )
+goto out;
+
+mfn = page_to_mfn(page);
 
 /* Check if the given frame is in use in an unsafe context. */
 switch ( page->u.inuse.type_info & PGT_type_mask )
@@ -244,19 +240,20 @@ long do_update_descriptor(uint64_t pa, uint64_t desc)
 break;
 }
 
-paging_mark_dirty(currd, _mfn(mfn));
+paging_mark_dirty(currd, mfn);
 
 /* All is good so make the update. */
-gdt_pent = map_domain_page(_mfn(mfn));
-write_atomic((uint64_t *)&gdt_pent[offset], *(uint64_t *)&d);
-unmap_domain_page(gdt_pent);
+entry = map_domain_page(mfn) + (gaddr & ~PAGE_MASK);
+ACCESS_ONCE(entry->raw) = d.raw;
+unmap_domain_page(entry);
 
 put_page_type(page);
 
 ret = 0; /* success */
 
  out:
-put_page(page);
+if ( page )
+put_page(page);
 
 return ret;
 }
diff --git a/xen/include/asm-x86/desc.h b/xen/include/asm-x86/desc.h
index 5a8afb6..85e83bc 100644
--- a/xen/include/asm-x86/desc.h
+++ b/xen/include/asm-x86/desc.h
@@ -102,8 +102,11 @@
 #define SYS_DESC_irq_gate 14
 #define SYS_DESC_trap_gate15
 
-typedef struct {
-uint32_t a, b;
+typedef union {
+struct {
+uint32_t a, b;
+};
+uint64_t raw;
 } seg_desc_t;
 
 typedef union {
-- 
2.1.4
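
As an illustration of the first and fourth bullets in the commit message
above, here is a minimal user-space sketch, not the Xen code. It assumes
4 KiB pages and uses hypothetical demo_ names. It shows a descriptor
union with a "raw" view, the alignment test, and the byte offset of a
descriptor within its page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_PAGE_SIZE 4096u  /* assumption: 4 KiB pages */

/* Two 32-bit halves and a 64-bit view of the same descriptor (C11). */
typedef union {
    struct {
        uint32_t a, b;
    };
    uint64_t raw;
} demo_seg_desc_t;

/* True if addr is a multiple of align; align must be a power of two. */
static bool demo_is_aligned(uint64_t addr, uint64_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr & (align - 1)) == 0;
}

/* Byte offset of the descriptor within its page. */
static unsigned int demo_page_offset(uint64_t gaddr)
{
    return (unsigned int)(gaddr & (DEMO_PAGE_SIZE - 1));
}
```

Assigning to .raw replaces a pointer cast such as "*(uint64_t *)&d = desc",
and an address that is not a multiple of sizeof(demo_seg_desc_t) would
straddle two descriptor slots, which is why the check comes first.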



Re: [Xen-devel] [PATCH 7/9] x86/amd: Support context switching legacy SSBD interface

2018-12-06 Thread Andrew Cooper
On 06/12/2018 10:51, Jan Beulich wrote:
>
>> +unsigned int socket = c->phys_proc_id, core = c->cpu_core_id;
>> +struct ssbd_ls_cfg *cfg;
>> +uint64_t val;
>> +
>> +ASSERT(cpu_has_legacy_ssbd);
>> +
>> +/*
>> + * Update hardware lazily, as these MSRs are expensive.  However, on
>> + * the boot paths which pass NULL, force a write to set a consistent
>> + * initial state.
>> + */
>> +if (*this_ssbd == disable && next)
>> +return;
>> +
>> +if (cpu_has_virt_sc_ssbd) {
>> +wrmsrl(MSR_VIRT_SPEC_CTRL,
>> +   disable ? SPEC_CTRL_SSBD : 0);
>> +goto done;
>> +}
>> +
>> +val = ls_cfg_base | (disable ? ls_cfg_ssbd_mask : 0);
>> +
>> +if (c->x86 < 0x17 || c->x86_num_siblings == 1) {
>> +/* No threads to be concerned with. */
>> +wrmsrl(MSR_AMD64_LS_CFG, val);
>> +goto done;
>> +}
>> +
>> +/* Check that we won't overflow the worse-case allocation. */
>> +BUG_ON(socket >= ARRAY_SIZE(ssbd_ls_cfg));
>> +BUG_ON(core   >= ssbd_max_cores);
> Wouldn't it be better to fail onlining of such CPUs?

How?  We've not currently got an ability to fail in the middle of
start_secondary(), which is why the previous patch really does go and
allocate the worst case.

These are here because I don't really trust the topology logic
(which turned out to be very wise, in retrospect), not because I
actually expect them to trigger from now on.

>
>> +cfg = &ssbd_ls_cfg[socket][core];
>> +
>> +if (disable) {
>> +spin_lock(&cfg->lock);
>> +
>> +/* First sibling to disable updates hardware. */
>> +if (!cfg->disable_count)
>> +wrmsrl(MSR_AMD64_LS_CFG, val);
>> +cfg->disable_count++;
>> +
>> +spin_unlock(&cfg->lock);
>> +} else {
>> +spin_lock(&cfg->lock);
>> +
>> +/* Last sibling to enable updates hardware. */
>> +cfg->disable_count--;
>> +if (!cfg->disable_count)
>> +wrmsrl(MSR_AMD64_LS_CFG, val);
>> +
>> +spin_unlock(&cfg->lock);
>> +}
> Any reason for duplicating the spin_{,un}lock() calls?

To avoid having a context-dependent jump in the critical region.  Then
again, I suppose that is completely dwarfed by the WRMSR.
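
For comparison, the single lock/unlock variant raised in the review could
look roughly like the user-space sketch below. It is only a model under
stated assumptions: the MSR is replaced by a plain variable, the
spinlock by a C11 atomic_flag, and every demo_ name is hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for the shared per-core MSR; not a real register access. */
static uint64_t demo_ls_cfg;
static unsigned int demo_msr_writes;

static void demo_wrmsr(uint64_t val)
{
    demo_ls_cfg = val;
    demo_msr_writes++;
}

/* Per-core state shared by sibling threads. */
struct demo_core_cfg {
    atomic_flag lock;
    unsigned int disable_count;
};

static struct demo_core_cfg demo_core = { ATOMIC_FLAG_INIT, 0 };

/* One lock/unlock pair; the branch sits inside the critical region. */
static void demo_set_ssbd(struct demo_core_cfg *cfg, bool disable, uint64_t val)
{
    while (atomic_flag_test_and_set_explicit(&cfg->lock, memory_order_acquire))
        ;  /* spin until the sibling releases the lock */

    if (disable) {
        /* First sibling to disable updates hardware. */
        if (cfg->disable_count++ == 0)
            demo_wrmsr(val);
    } else {
        /* Last sibling to enable updates hardware. */
        assert(cfg->disable_count > 0);
        if (--cfg->disable_count == 0)
            demo_wrmsr(val);
    }

    atomic_flag_clear_explicit(&cfg->lock, memory_order_release);
}
```

In this model two disables followed by two enables produce exactly two
hardware writes: one on the first disable and one on the last enable.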

~Andrew


Re: [Xen-devel] Patch "x86/entry/64: Remove %ebx handling from error_entry/exit" has been added to the 4.9-stable tree

2018-12-06 Thread Andy Lutomirski
> On Dec 6, 2018, at 9:36 AM, Andrew Cooper  wrote:
>
>> On 06/12/2018 17:10, David Woodhouse wrote:
>> On Wed, 2018-11-28 at 08:44 -0800, Andy Lutomirski wrote:
>>>> Can we assume it's always from kernel? The Xen code definitely seems to
>>>> handle invoking this from both kernel and userspace contexts.
>>> I learned that my comment here was wrong shortly after the patch landed :(
>> Turns out the only place I see it getting called from is under
>> __context_switch().
>>
>> #7 [8801144a7cf0] new_xen_failsafe_callback at a028028a 
>> [kmod_ebxfix]
>> #8 [8801144a7d90] xen_hypercall_update_descriptor at 8100114a
>> #9 [8801144a7db8] xen_hypercall_update_descriptor at 8100114a
>> #10 [8801144a7df0] xen_mc_flush at 81006ab9
>> #11 [8801144a7e30] xen_end_context_switch at 81004e12
>> #12 [8801144a7e48] __switch_to at 81016582
>> #13 [8801144a7ea0] __schedule at 815d2b37
>>
>> That …114a in xen_hypercall_update_descriptor is the 'pop' instruction
>> right after the syscall; it's happening when Xen is preempting the
>> domain in the hypercall and then reloads the segment registers to run
>> that vCPU again later.
>>
>> [  44185.225289]   WARN: RDX:  RSI:  RDI: 
>> 000abbd76060
>>
>> The update_descriptor hypercall args (rdi, rsi) were 0xabbd76060 and 0
>> respectively — it was setting a descriptor at that address, to zero.
>>
>> Xen then failed to load the selector 0x63 into the %gs register (since
>> that descriptor has just been wiped?), leaving it zero.
>>
>> [  44185.225256]   WARN: xen_failsafe_callback from 
>> xen_hypercall_update_descriptor+0xa/0x40
>> [  44185.225263]   WARN: DS: 2b/2b ES: 2b/2b FS: 0/0 GS:0/63
>>
>> This is on context switch from a 32-bit task to idle. So
>> xen_failsafe_callback is returning to the "faulting" instruction, with
>> a comment saying "Retry the IRET", but in fact is just continuing on
>> its merry way with %gs unexpectedly set to zero.
>>
>> In fact I think this is probably fine in practice, since it's about to
>> get explicitly set a few lines further down in __context_switch(). But
>> it's odd enough, and far enough away from what's actually said by the
>> comments, that I'm utterly unsure.
>>
>> In xen_load_tls() we explicitly only do the lazy_load_gs(0) for the
>> 32-bit kernel. Is that really right?
>
> Basically - what is happening is that xen_load_tls() is invalidating the
> %gs selector while %gs is still non-NUL.
>
> If this happens to intersect with a vcpu reschedule, %gs (being non-NUL)
> takes precedence over KERNGSBASE, and faults when Xen tries to reload
> it.  This results in the failsafe callback being invoked.
>
> I think the correct course of action is to use xen_load_gs_index(0)
> (poorly named - it is a hypercall which does swapgs; mov to %gs; swapgs)
> before using update_descriptor() to invalidate the segment.
>
> That will reset %gs to 0 without touching KERNGSBASE, and can be queued
> in the same multicall as the update_descriptor() hypercall.

Sounds good to me as long as we skip it on native.


Re: [Xen-devel] [PATCH 6/9] x86/amd: Allocate resources to cope with LS_CFG being per-core on Fam17h

2018-12-06 Thread Andrew Cooper
On 06/12/2018 08:54, Jan Beulich wrote:
>>>> On 05.12.18 at 18:05,  wrote:
>> On 05/12/2018 16:57, Jan Beulich wrote:
>>>>>> On 03.12.18 at 17:18,  wrote:
>>>> --- a/xen/arch/x86/cpu/amd.c
>>>> +++ b/xen/arch/x86/cpu/amd.c
>>>> @@ -419,6 +419,97 @@ static void __init noinline amd_probe_legacy_ssbd(void)
>>>>  }
>>>>
>>>>  /*
>>>> + * This is all a gross hack, but Xen really doesn't have flexible-enough
>>>> + * per-cpu infrastructure to do it properly.  For Zen(v1) with SMT active,
>>>> + * MSR_AMD64_LS_CFG is per-core rather than per-thread, so we need a per-core
>>>> + * spinlock to synchronise updates of the MSR.
>>>> + *
>>>> + * We can't use per-cpu state because taking one CPU offline would free state
>>>> + * under the feet of another.  Ideally, we'd allocate memory on the AP boot
>>>> + * path, but by the time the sibling information is calculated sufficiently
>>>> + * for us to locate the per-core state, it's too late to fail the AP boot.
>>>> + *
>>>> + * We also can't afford to end up in a heterogeneous scenario with some CPUs
>>>> + * unable to safely use LS_CFG.
>>>> + *
>>>> + * Therefore, we have to allocate for the worse-case scenario, which is
>>>> + * believed to be 4 sockets.  Any allocation failure cause us to turn LS_CFG
>>>> + * off, as this is fractionally better than failing to boot.
>>>> + */
>>>> +static struct ssbd_ls_cfg {
>>>> +  spinlock_t lock;
>>>> +  unsigned int disable_count;
>>>> +} *ssbd_ls_cfg[4];
>>> Same question as to Brian for his original code: Instead of the
>>> hard-coding of 4, can't you use nr_sockets here?
>>> smp_prepare_cpus() runs before pre-SMP initcalls after all.
>> nr_sockets has zero connection with reality as far as I can tell.
>>
>> On this particular box it reports 6 when the correct answer is 2.  I've
>> got some Intel boxes where nr_sockets reports 15 and the correct answer
>> is 4.
> If you look back at when it was introduced, the main goal was
> for it to never be too low. Any improvements to its calculation
> are welcome, provided they maintain that guarantee. Too high
> a socket count is imo still better than a hard-coded one.

Even for the extra 2k of memory it will waste?

~Andrew


[Xen-devel] [PATCH v3 9/9] xen/privcmd-buf.c: Convert to use vm_insert_range

2018-12-06 Thread Souptick Joarder
Convert to use vm_insert_range() to map range of kernel
memory to user vma.

Signed-off-by: Souptick Joarder 
Reviewed-by: Matthew Wilcox 
Reviewed-by: Boris Ostrovsky 
---
 drivers/xen/privcmd-buf.c | 8 ++--
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/drivers/xen/privcmd-buf.c b/drivers/xen/privcmd-buf.c
index df1ed37..8d8255b 100644
--- a/drivers/xen/privcmd-buf.c
+++ b/drivers/xen/privcmd-buf.c
@@ -180,12 +180,8 @@ static int privcmd_buf_mmap(struct file *file, struct 
vm_area_struct *vma)
if (vma_priv->n_pages != count)
ret = -ENOMEM;
else
-   for (i = 0; i < vma_priv->n_pages; i++) {
-   ret = vm_insert_page(vma, vma->vm_start + i * PAGE_SIZE,
-vma_priv->pages[i]);
-   if (ret)
-   break;
-   }
+   ret = vm_insert_range(vma, vma->vm_start, vma_priv->pages,
+   vma_priv->n_pages);
 
if (ret)
privcmd_buf_vmapriv_free(vma_priv);
-- 
1.9.1



[Xen-devel] [PATCH v3 8/9] xen/gntdev.c: Convert to use vm_insert_range

2018-12-06 Thread Souptick Joarder
Convert to use vm_insert_range() to map range of kernel
memory to user vma.

Signed-off-by: Souptick Joarder 
Reviewed-by: Matthew Wilcox 
Reviewed-by: Boris Ostrovsky 
---
 drivers/xen/gntdev.c | 11 ---
 1 file changed, 4 insertions(+), 7 deletions(-)

diff --git a/drivers/xen/gntdev.c b/drivers/xen/gntdev.c
index b0b02a5..430d4cb 100644
--- a/drivers/xen/gntdev.c
+++ b/drivers/xen/gntdev.c
@@ -1084,7 +1084,7 @@ static int gntdev_mmap(struct file *flip, struct 
vm_area_struct *vma)
int index = vma->vm_pgoff;
int count = vma_pages(vma);
struct gntdev_grant_map *map;
-   int i, err = -EINVAL;
+   int err = -EINVAL;
 
if ((vma->vm_flags & VM_WRITE) && !(vma->vm_flags & VM_SHARED))
return -EINVAL;
@@ -1145,12 +1145,9 @@ static int gntdev_mmap(struct file *flip, struct vm_area_struct *vma)
goto out_put_map;
 
if (!use_ptemod) {
-   for (i = 0; i < count; i++) {
-   err = vm_insert_page(vma, vma->vm_start + i*PAGE_SIZE,
-   map->pages[i]);
-   if (err)
-   goto out_put_map;
-   }
+   err = vm_insert_range(vma, vma->vm_start, map->pages, count);
+   if (err)
+   goto out_put_map;
} else {
 #ifdef CONFIG_X86
/*
-- 
1.9.1



[Xen-devel] [PATCH v3 5/9] drm/xen/xen_drm_front_gem.c: Convert to use vm_insert_range

2018-12-06 Thread Souptick Joarder
Convert to use vm_insert_range() to map a range of kernel
memory to a user vma.

Signed-off-by: Souptick Joarder 
Reviewed-by: Matthew Wilcox 
Reviewed-by: Oleksandr Andrushchenko 
---
 drivers/gpu/drm/xen/xen_drm_front_gem.c | 20 ++--
 1 file changed, 6 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/xen/xen_drm_front_gem.c b/drivers/gpu/drm/xen/xen_drm_front_gem.c
index 47ff019..c21e5d1 100644
--- a/drivers/gpu/drm/xen/xen_drm_front_gem.c
+++ b/drivers/gpu/drm/xen/xen_drm_front_gem.c
@@ -225,8 +225,7 @@ struct drm_gem_object *
 static int gem_mmap_obj(struct xen_gem_object *xen_obj,
struct vm_area_struct *vma)
 {
-   unsigned long addr = vma->vm_start;
-   int i;
+   int ret;
 
/*
 * clear the VM_PFNMAP flag that was set by drm_gem_mmap(), and set the
@@ -247,18 +246,11 @@ static int gem_mmap_obj(struct xen_gem_object *xen_obj,
 * FIXME: as we insert all the pages now then no .fault handler must
 * be called, so don't provide one
 */
-   for (i = 0; i < xen_obj->num_pages; i++) {
-   int ret;
-
-   ret = vm_insert_page(vma, addr, xen_obj->pages[i]);
-   if (ret < 0) {
-   DRM_ERROR("Failed to insert pages into vma: %d\n", ret);
-   return ret;
-   }
-
-   addr += PAGE_SIZE;
-   }
-   return 0;
+   ret = vm_insert_range(vma, vma->vm_start, xen_obj->pages,
+   xen_obj->num_pages);
+   if (ret < 0)
+   DRM_ERROR("Failed to insert pages into vma: %d\n", ret);
+   return ret;
 }
 
 int xen_drm_front_gem_mmap(struct file *filp, struct vm_area_struct *vma)
-- 
1.9.1



Re: [Xen-devel] [PATCH 4/9] x86/amd: Introduce CPUID/MSR definitions for per-vcpu SSBD support

2018-12-06 Thread Andrew Cooper
On 06/12/2018 08:49, Jan Beulich wrote:
>>>> +    {"amd_stibp",    0x80000008, NA, CPUID_REG_EBX, 15,  1},
>>>> +    {"amd_ssbd",     0x80000008, NA, CPUID_REG_EBX, 24,  1},
>>>> +    {"virt_sc_ssbd", 0x80000008, NA, CPUID_REG_EBX, 25,  1},
>>>> +    {"amd_ssb_no",   0x80000008, NA, CPUID_REG_EBX, 26,  1},
>>> Since you're at it, why not also introduce names for bits 16-18
>>> at this occasion?
>> I haven't previously filled in names for the sake of it.
>>
>> The reason that ibrs/stibp/ssbd are here is because they're related and
>> I've also got a followon few patches to support MSR_VIRT_SPEC_CTRL on
>> Rome hardware via MSR_SPEC_CTRL, but I need an SDP and some
>> experimentation time before I'd be happy posting them.
>>
>> But to address your question, I can't locate those bits at all.  Not
>> even in the NDA docs or Linux source.
> Hmm, that's certainly odd. I've found them quite some time ago in this
> public whitepaper:
> https://developer.amd.com/wp-content/resources/Architecture_Guidelines_Update_Indirect_Branch_Control.pdf
> They're all clearly IBRS/STIBP related.

Oh - I'd completely forgotten about that whitepaper.  Some of the
details are superseded by the SSBD paper.

For now, I'll drop the bit names for features not used in this series. 
One way or another, doing anything with the others will require some
experimentation on hardware which supports them.

~Andrew


[Xen-devel] [PATCH v3 1/9] mm: Introduce new vm_insert_range API

2018-12-06 Thread Souptick Joarder
Previously, drivers had their own way of mapping a range of
kernel pages/memory into a user vma, and this was done by
invoking vm_insert_page() within a loop.

As this pattern is common across different drivers, it can
be generalized by creating a new function and using it across
the drivers.

vm_insert_range is the new API which will be used to map a
range of kernel memory/pages to a user vma.

This API was tested by Heiko for the Rockchip drm driver, on rk3188,
rk3288, rk3328 and rk3399 with graphics.

Signed-off-by: Souptick Joarder 
Reviewed-by: Matthew Wilcox 
Reviewed-by: Mike Rapoport 
Tested-by: Heiko Stuebner 
---
 include/linux/mm.h |  2 ++
 mm/memory.c        | 38 ++
 mm/nommu.c         |  7 +++
 3 files changed, 47 insertions(+)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index fcf9cc9..2bc399f 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2506,6 +2506,8 @@ unsigned long change_prot_numa(struct vm_area_struct *vma,
 int remap_pfn_range(struct vm_area_struct *, unsigned long addr,
unsigned long pfn, unsigned long size, pgprot_t);
 int vm_insert_page(struct vm_area_struct *, unsigned long addr, struct page *);
+int vm_insert_range(struct vm_area_struct *vma, unsigned long addr,
+   struct page **pages, unsigned long page_count);
 vm_fault_t vmf_insert_pfn(struct vm_area_struct *vma, unsigned long addr,
unsigned long pfn);
 vm_fault_t vmf_insert_pfn_prot(struct vm_area_struct *vma, unsigned long addr,
diff --git a/mm/memory.c b/mm/memory.c
index 15c417e..84ea46c 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1478,6 +1478,44 @@ static int insert_page(struct vm_area_struct *vma, unsigned long addr,
 }
 
 /**
+ * vm_insert_range - insert range of kernel pages into user vma
+ * @vma: user vma to map to
+ * @addr: target user address of this page
+ * @pages: pointer to array of source kernel pages
+ * @page_count: number of pages need to insert into user vma
+ *
+ * This allows drivers to insert range of kernel pages they've allocated
+ * into a user vma. This is a generic function which drivers can use
+ * rather than using their own way of mapping range of kernel pages into
+ * user vma.
+ *
+ * If we fail to insert any page into the vma, the function will return
+ * immediately leaving any previously-inserted pages present.  Callers
+ * from the mmap handler may immediately return the error as their caller
+ * will destroy the vma, removing any successfully-inserted pages. Other
+ * callers should make their own arrangements for calling unmap_region().
+ *
+ * Context: Process context. Called by mmap handlers.
+ * Return: 0 on success and error code otherwise
+ */
+int vm_insert_range(struct vm_area_struct *vma, unsigned long addr,
+   struct page **pages, unsigned long page_count)
+{
+   unsigned long uaddr = addr;
+   int ret = 0, i;
+
+   for (i = 0; i < page_count; i++) {
+   ret = vm_insert_page(vma, uaddr, pages[i]);
+   if (ret < 0)
+   return ret;
+   uaddr += PAGE_SIZE;
+   }
+
+   return ret;
+}
+EXPORT_SYMBOL(vm_insert_range);
+
+/**
  * vm_insert_page - insert single page into user vma
  * @vma: user vma to map to
  * @addr: target user address of this page
diff --git a/mm/nommu.c b/mm/nommu.c
index 749276b..d6ef5c7 100644
--- a/mm/nommu.c
+++ b/mm/nommu.c
@@ -473,6 +473,13 @@ int vm_insert_page(struct vm_area_struct *vma, unsigned long addr,
 }
 EXPORT_SYMBOL(vm_insert_page);
 
+int vm_insert_range(struct vm_area_struct *vma, unsigned long addr,
+   struct page **pages, unsigned long page_count)
+{
+   return -EINVAL;
+}
+EXPORT_SYMBOL(vm_insert_range);
+
 /*
  *  sys_brk() for the most part doesn't need the global kernel
  *  lock, except when an application is doing something nasty
-- 
1.9.1



[Xen-devel] [PATCH v3 0/9] Use vm_insert_range

2018-12-06 Thread Souptick Joarder
Previously, drivers had their own way of mapping a range of
kernel pages/memory into a user vma, and this was done by
invoking vm_insert_page() within a loop.

As this pattern is common across different drivers, it can
be generalized by creating a new function and using it across
the drivers.

vm_insert_range is the new API which will be used to map a
range of kernel memory/pages to a user vma.

All the applicable places are converted to use the new vm_insert_range
in this patch series.
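
As a rough illustration of the conversion described above, here is a
user-space sketch in C. It is not kernel code: struct page, struct
vm_area_struct and vm_insert_page() below are simplified stand-ins written
for this example, and demo_mmap(), DEMO_PAGE_SIZE and DEMO_MAX_PAGES are
hypothetical names. Only the loop in vm_insert_range() follows patch 1/9.

```c
#include <errno.h>
#include <stddef.h>

#define DEMO_PAGE_SIZE 4096UL   /* assumed page size, for the sketch only */
#define DEMO_MAX_PAGES 16UL     /* capacity of the mock vma below */

struct page { int id; };        /* stand-in for the kernel's struct page */

struct vm_area_struct {         /* stand-in with only the fields used here */
    unsigned long vm_start;
    unsigned long vm_end;
    struct page *slots[DEMO_MAX_PAGES]; /* page backing each user page */
};

/* Mock of vm_insert_page(): fails for addresses outside the vma. */
static int vm_insert_page(struct vm_area_struct *vma, unsigned long addr,
                          struct page *page)
{
    unsigned long idx;

    if (addr < vma->vm_start || addr >= vma->vm_end)
        return -EFAULT;
    idx = (addr - vma->vm_start) / DEMO_PAGE_SIZE;
    if (idx >= DEMO_MAX_PAGES)
        return -EFAULT;
    vma->slots[idx] = page;
    return 0;
}

/* Same loop as the helper in patch 1/9: stop at the first failure and
 * leave any pages that were already inserted in place. */
static int vm_insert_range(struct vm_area_struct *vma, unsigned long addr,
                           struct page **pages, unsigned long page_count)
{
    unsigned long uaddr = addr;
    unsigned long i;
    int ret = 0;

    for (i = 0; i < page_count; i++) {
        ret = vm_insert_page(vma, uaddr, pages[i]);
        if (ret < 0)
            return ret;
        uaddr += DEMO_PAGE_SIZE;
    }
    return ret;
}

/* Hypothetical mmap handler: one call replaces the per-driver loop. */
static int demo_mmap(struct vm_area_struct *vma, struct page **pages,
                     unsigned long n_pages)
{
    unsigned long count = (vma->vm_end - vma->vm_start) / DEMO_PAGE_SIZE;

    if (n_pages != count)
        return -ENOMEM;
    return vm_insert_range(vma, vma->vm_start, pages, n_pages);
}
```

A caller that is an mmap handler can return the error directly, since
the failed vma is torn down by its caller along with any pages that were
inserted before the failure.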

v1 -> v2:
Address review comments on mm/memory.c. Add EXPORT_SYMBOL
for vm_insert_range and correct the documentation
for this API.

In drivers/gpu/drm/xen/xen_drm_front_gem.c, replace err
with ret as suggested.

In drivers/iommu/dma-iommu.c, handle the scenario of a partial
mmap() of a large buffer by passing *pages + vma->vm_pgoff* to
vm_insert_range().
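
The partial-mmap case can be pictured with a small user-space sketch in C.
This is an assumption-laden illustration, not the dma-iommu.c change
itself: demo_select_pages() is a hypothetical helper, struct page is a
stand-in, and the bounds check and the -ENXIO return are choices made for
this example rather than something taken from the patch.

```c
#include <errno.h>
#include <stddef.h>

struct page { int id; };    /* stand-in for the kernel's struct page */

/*
 * Hypothetical helper: choose the slice of @pages that a vma starting
 * @pgoff pages into the buffer and spanning @vma_pages pages would map.
 * On success, *first is pages + pgoff (the pointer that would be handed
 * to vm_insert_range()) and *count is the number of pages to insert.
 */
static int demo_select_pages(struct page **pages, unsigned long total_pages,
                             unsigned long pgoff, unsigned long vma_pages,
                             struct page ***first, unsigned long *count)
{
    /* Reject requests that start or end beyond the buffer. */
    if (pgoff >= total_pages || vma_pages > total_pages - pgoff)
        return -ENXIO;

    *first = pages + pgoff;
    *count = vma_pages;
    return 0;
}
```

With an 8-page buffer, a vma with a page offset of 3 and a length of 4
pages would map entries 3 to 6 of the array.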

v2 -> v3:
Declaration of vm_insert_range() moved to include/linux/mm.h

Souptick Joarder (9):
  mm: Introduce new vm_insert_range API
  arch/arm/mm/dma-mapping.c: Convert to use vm_insert_range
  drivers/firewire/core-iso.c: Convert to use vm_insert_range
  drm/rockchip/rockchip_drm_gem.c: Convert to use vm_insert_range
  drm/xen/xen_drm_front_gem.c: Convert to use vm_insert_range
  iommu/dma-iommu.c: Convert to use vm_insert_range
  videobuf2/videobuf2-dma-sg.c: Convert to use vm_insert_range
  xen/gntdev.c: Convert to use vm_insert_range
  xen/privcmd-buf.c: Convert to use vm_insert_range

 arch/arm/mm/dma-mapping.c | 21 +
 drivers/firewire/core-iso.c   | 15 ++---
 drivers/gpu/drm/rockchip/rockchip_drm_gem.c   | 20 ++--
 drivers/gpu/drm/xen/xen_drm_front_gem.c   | 20 
 drivers/iommu/dma-iommu.c | 13 ++--
 drivers/media/common/videobuf2/videobuf2-dma-sg.c | 23 +-
 drivers/xen/gntdev.c  | 11 +++
 drivers/xen/privcmd-buf.c |  8 ++---
 include/linux/mm.h|  2 ++
 mm/memory.c   | 38 +++
 mm/nommu.c|  7 +
 11 files changed, 80 insertions(+), 98 deletions(-)

-- 
1.9.1



[Xen-devel] [PATCH 1/2] x86: Switch "struct desc_struct" to being seg_desc_t

2018-12-06 Thread Andrew Cooper
The struct suffix is redundant in the name, and a future change will want to
turn it into a union, rather than a structure.  As this represents a segment
descriptor, give it an appropriate typedef.

No functional change.

Signed-off-by: Andrew Cooper 
---
CC: Jan Beulich 
CC: Wei Liu 
CC: Roger Pau Monné 
---
 xen/arch/x86/cpu/common.c   |  4 ++--
 xen/arch/x86/domain.c   |  2 +-
 xen/arch/x86/hvm/hvm.c  |  4 ++--
 xen/arch/x86/hvm/svm/svm.c  |  4 ++--
 xen/arch/x86/mm.c   |  2 +-
 xen/arch/x86/pv/descriptor-tables.c |  6 +++---
 xen/arch/x86/pv/emul-gate-op.c  |  4 ++--
 xen/arch/x86/pv/emulate.c   |  2 +-
 xen/arch/x86/pv/emulate.h   |  4 ++--
 xen/arch/x86/smpboot.c  |  2 +-
 xen/arch/x86/traps.c|  4 ++--
 xen/arch/x86/x86_64/mm.c|  2 +-
 xen/include/asm-x86/desc.h  | 14 +++---
 xen/include/asm-x86/ldt.h   |  2 +-
 xen/include/asm-x86/mm.h|  2 +-
 15 files changed, 29 insertions(+), 29 deletions(-)

diff --git a/xen/arch/x86/cpu/common.c b/xen/arch/x86/cpu/common.c
index 90f4a9b..de6c5c9 100644
--- a/xen/arch/x86/cpu/common.c
+++ b/xen/arch/x86/cpu/common.c
@@ -724,9 +724,9 @@ void load_system_tables(void)
stack_top = stack_bottom & ~(STACK_SIZE - 1);
 
struct tss_struct *tss = &this_cpu(init_tss);
-   struct desc_struct *gdt =
+   seg_desc_t *gdt =
this_cpu(gdt_table) - FIRST_RESERVED_GDT_ENTRY;
-   struct desc_struct *compat_gdt =
+   seg_desc_t *compat_gdt =
this_cpu(compat_gdt_table) - FIRST_RESERVED_GDT_ENTRY;
 
const struct desc_ptr gdtr = {
diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index b4d5948..f0e0cdb 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -1665,7 +1665,7 @@ static void __context_switch(void)
 struct vcpu  *p = per_cpu(curr_vcpu, cpu);
 struct vcpu  *n = current;
 struct domain*pd = p->domain, *nd = n->domain;
-struct desc_struct   *gdt;
+seg_desc_t   *gdt;
 struct desc_ptr   gdt_desc;
 
 ASSERT(p != n);
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 0039e8c..d64b6b6 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -2695,7 +2695,7 @@ static int task_switch_load_seg(
 enum x86_segment seg, uint16_t sel, unsigned int cpl, unsigned int eflags)
 {
 struct segment_register desctab, segr;
-struct desc_struct *pdesc = NULL, desc;
+seg_desc_t *pdesc = NULL, desc;
 u8 dpl, rpl;
 bool_t writable;
 int fault_type = TRAP_invalid_tss;
@@ -2876,7 +2876,7 @@ void hvm_task_switch(
 struct vcpu *v = current;
 struct cpu_user_regs *regs = guest_cpu_user_regs();
 struct segment_register gdt, tr, prev_tr, segr;
-struct desc_struct *optss_desc = NULL, *nptss_desc = NULL, tss_desc;
+seg_desc_t *optss_desc = NULL, *nptss_desc = NULL, tss_desc;
 bool_t otd_writable, ntd_writable;
 unsigned int eflags, new_cpl;
 pagefault_info_t pfinfo;
diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
index b9a8900..40937bf 100644
--- a/xen/arch/x86/hvm/svm/svm.c
+++ b/xen/arch/x86/hvm/svm/svm.c
@@ -1636,8 +1636,8 @@ bool svm_load_segs(unsigned int ldt_ents, unsigned long ldt_base,
 else
 {
 /* Keep GDT in sync. */
-struct desc_struct *desc = this_cpu(gdt_table) + LDT_ENTRY -
-   FIRST_RESERVED_GDT_ENTRY;
+seg_desc_t *desc =
+this_cpu(gdt_table) + LDT_ENTRY - FIRST_RESERVED_GDT_ENTRY;
 
 _set_tssldt_desc(desc, ldt_base, ldt_ents * 8 - 1, SYS_DESC_ldt);
 
diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index b3350ee..1431f34 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -626,7 +626,7 @@ const char __section(".bss.page_aligned.const") __aligned(PAGE_SIZE)
 static int alloc_segdesc_page(struct page_info *page)
 {
 const struct domain *owner = page_get_owner(page);
-struct desc_struct *descs = __map_domain_page(page);
+seg_desc_t *descs = __map_domain_page(page);
 unsigned i;
 
 for ( i = 0; i < 512; i++ )
diff --git a/xen/arch/x86/pv/descriptor-tables.c b/xen/arch/x86/pv/descriptor-tables.c
index 8b2d55f..caa62eb 100644
--- a/xen/arch/x86/pv/descriptor-tables.c
+++ b/xen/arch/x86/pv/descriptor-tables.c
@@ -212,16 +212,16 @@ long do_update_descriptor(uint64_t pa, uint64_t desc)
 unsigned long gmfn = pa >> PAGE_SHIFT;
 unsigned long mfn;
 unsigned int  offset;
-struct desc_struct *gdt_pent, d;
+seg_desc_t *gdt_pent, d;
 struct page_info *page;
 long ret = -EINVAL;
 
-offset = ((unsigned int)pa & ~PAGE_MASK) / sizeof(struct desc_struct);
+offset = ((unsigned int)pa & ~PAGE_MASK) / sizeof(seg_desc_t);
 
 *(uint64_t *)&d = desc;
 
 page = get_page_from_gfn(currd, gmfn, NULL, P2M_ALLOC);
-if ( (((unsigned int)pa % sizeof(struct 

[Xen-devel] [PATCH 0/2] x86: Cleanup to segment handling

2018-12-06 Thread Andrew Cooper
Andrew Cooper (2):
  x86: Switch "struct desc_struct" to being seg_desc_t
  x86/pv: Code improvements to do_update_descriptor()

 xen/arch/x86/cpu/common.c   |  4 ++--
 xen/arch/x86/domain.c   |  2 +-
 xen/arch/x86/hvm/hvm.c  |  4 ++--
 xen/arch/x86/hvm/svm/svm.c  |  4 ++--
 xen/arch/x86/mm.c   |  2 +-
 xen/arch/x86/pv/descriptor-tables.c | 41 +
 xen/arch/x86/pv/emul-gate-op.c  |  4 ++--
 xen/arch/x86/pv/emulate.c   |  2 +-
 xen/arch/x86/pv/emulate.h   |  4 ++--
 xen/arch/x86/smpboot.c  |  2 +-
 xen/arch/x86/traps.c|  4 ++--
 xen/arch/x86/x86_64/mm.c|  2 +-
 xen/include/asm-x86/desc.h  | 17 ---
 xen/include/asm-x86/ldt.h   |  2 +-
 xen/include/asm-x86/mm.h|  2 +-
 15 files changed, 48 insertions(+), 48 deletions(-)

-- 
2.1.4



[Xen-devel] [linux-linus test] 131008: regressions - FAIL

2018-12-06 Thread osstest service owner
flight 131008 linux-linus real [real]
http://logs.test-lab.xenproject.org/osstest/logs/131008/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 10 debian-hvm-install fail REGR. vs. 125898
 test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm 10 debian-hvm-install fail REGR. vs. 125898
 test-amd64-i386-xl-qemuu-debianhvm-amd64-shadow 10 debian-hvm-install fail REGR. vs. 125898
 test-amd64-i386-qemuu-rhel6hvm-intel 10 redhat-install fail REGR. vs. 125898
 test-amd64-i386-freebsd10-i386 11 guest-start fail REGR. vs. 125898
 test-amd64-i386-qemuu-rhel6hvm-amd 10 redhat-install fail REGR. vs. 125898
 test-amd64-i386-xl-qemuu-win7-amd64 10 windows-install fail REGR. vs. 125898
 test-amd64-i386-xl-qemuu-ovmf-amd64 10 debian-hvm-install fail REGR. vs. 125898
 test-amd64-i386-xl-qemuu-ws16-amd64 10 windows-install fail REGR. vs. 125898
 test-amd64-i386-xl-qemuu-debianhvm-amd64 10 debian-hvm-install fail REGR. vs. 125898
 test-amd64-amd64-rumprun-amd64 7 xen-boot fail REGR. vs. 125898
 test-amd64-amd64-xl-qcow2 7 xen-boot fail REGR. vs. 125898
 test-amd64-amd64-xl-qemut-debianhvm-amd64-xsm 7 xen-boot fail REGR. vs. 125898
 test-amd64-amd64-xl-qemuu-win7-amd64 7 xen-boot fail REGR. vs. 125898
 test-amd64-amd64-xl-pvhv2-intel 7 xen-boot fail REGR. vs. 125898
 test-amd64-amd64-pygrub 7 xen-boot fail REGR. vs. 125898
 test-amd64-amd64-xl-multivcpu 7 xen-boot fail REGR. vs. 125898
 test-amd64-amd64-libvirt-xsm 7 xen-boot fail REGR. vs. 125898
 test-amd64-amd64-libvirt-vhd 7 xen-boot fail REGR. vs. 125898
 test-amd64-amd64-libvirt-pair 10 xen-boot/src_host fail REGR. vs. 125898
 test-amd64-amd64-libvirt-pair 11 xen-boot/dst_host fail REGR. vs. 125898
 test-amd64-amd64-pair 10 xen-boot/src_host fail REGR. vs. 125898
 test-amd64-amd64-pair 11 xen-boot/dst_host fail REGR. vs. 125898
 test-amd64-i386-examine 8 reboot fail REGR. vs. 125898
 test-amd64-i386-xl-xsm 7 xen-boot fail REGR. vs. 125898
 test-amd64-i386-xl-shadow 7 xen-boot fail REGR. vs. 125898
 test-amd64-i386-xl-qemut-debianhvm-amd64-xsm 7 xen-boot fail REGR. vs. 125898
 test-amd64-i386-xl-qemut-win10-i386 7 xen-boot fail REGR. vs. 125898
 test-amd64-i386-xl 7 xen-boot fail REGR. vs. 125898
 test-amd64-i386-qemut-rhel6hvm-intel 7 xen-boot fail REGR. vs. 125898
 test-amd64-i386-libvirt 7 xen-boot fail REGR. vs. 125898
 test-amd64-i386-rumprun-i386 7 xen-boot fail REGR. vs. 125898
 test-amd64-i386-xl-qemut-debianhvm-amd64 7 xen-boot fail REGR. vs. 125898
 test-amd64-i386-libvirt-xsm 7 xen-boot fail REGR. vs. 125898
 test-amd64-i386-pair 10 xen-boot/src_host fail REGR. vs. 125898
 test-amd64-i386-pair 11 xen-boot/dst_host fail REGR. vs. 125898
 test-amd64-i386-freebsd10-amd64 11 guest-start fail REGR. vs. 125898
 test-amd64-amd64-examine 8 reboot fail REGR. vs. 125898
 test-armhf-armhf-libvirt 7 xen-boot fail REGR. vs. 125898
 test-amd64-i386-qemut-rhel6hvm-amd 10 redhat-install fail REGR. vs. 125898
 test-amd64-i386-libvirt-pair 10 xen-boot/src_host fail REGR. vs. 125898
 test-amd64-i386-libvirt-pair 11 xen-boot/dst_host fail REGR. vs. 125898
 test-amd64-amd64-qemuu-nested-intel 17 debian-hvm-install/l1/l2 fail REGR. vs. 125898

Regressions which are regarded as allowable (not blocking):
 test-amd64-amd64-xl-rtds  7 xen-boot fail REGR. vs. 125898

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-credit1 7 xen-boot fail baseline untested
 test-amd64-amd64-xl-qemut-win7-amd64 17 guest-stop fail like 125898
 test-amd64-amd64-xl-qemuu-dmrestrict-amd64-dmrestrict 10 debian-hvm-install fail like 125898
 test-amd64-i386-xl-qemut-win7-amd64 17 guest-stop fail like 125898
 test-amd64-amd64-xl-qemut-ws16-amd64 17 guest-stop fail like 125898
 test-amd64-amd64-xl-qemuu-ws16-amd64 17 guest-stop fail like 125898
 test-armhf-armhf-libvirt-raw 13 saverestore-support-check fail like 125898
 test-amd64-i386-xl-qemuu-dmrestrict-amd64-dmrestrict 10 debian-hvm-install fail never pass
 test-amd64-i386-xl-pvshim 12 guest-start fail never pass
 test-amd64-i386-xl-qemuu-win10-i386 10 windows-install fail never pass
 test-amd64-amd64-libvirt 13 migrate-support-check fail never pass
 test-armhf-armhf-xl-arndale 13 migrate-support-check fail never pass
 test-armhf-armhf-xl-arndale 14 saverestore-support-check fail never pass
 test-amd64-amd64-qemuu-nested-amd 17 

Re: [Xen-devel] Patch "x86/entry/64: Remove %ebx handling from error_entry/exit" has been added to the 4.9-stable tree

2018-12-06 Thread Andrew Cooper
On 06/12/2018 17:10, David Woodhouse wrote:
> On Wed, 2018-11-28 at 08:44 -0800, Andy Lutomirski wrote:
>>> Can we assume it's always from kernel? The Xen code definitely seems to
>>> handle invoking this from both kernel and userspace contexts.
>> I learned that my comment here was wrong shortly after the patch landed :(
> Turns out the only place I see it getting called from is under
> __context_switch().
>
>  #7 [8801144a7cf0] new_xen_failsafe_callback at a028028a 
> [kmod_ebxfix]
>  #8 [8801144a7d90] xen_hypercall_update_descriptor at 8100114a
>  #9 [8801144a7db8] xen_hypercall_update_descriptor at 8100114a
> #10 [8801144a7df0] xen_mc_flush at 81006ab9
> #11 [8801144a7e30] xen_end_context_switch at 81004e12
> #12 [8801144a7e48] __switch_to at 81016582
> #13 [8801144a7ea0] __schedule at 815d2b37
>
> That …114a in xen_hypercall_update_descriptor is the 'pop' instruction
> right after the syscall; it's happening when Xen is preempting the
> domain in the hypercall and then reloads the segment registers to run
> that vCPU again later.
>
> [  44185.225289]   WARN: RDX:  RSI:  RDI: 
> 000abbd76060
>
> The update_descriptor hypercall args (rdi, rsi) were 0xabbd76060 and 0
> respectively — it was setting a descriptor at that address, to zero.
>
> Xen then failed to load the selector 0x63 into the %gs register (since
> that descriptor has just been wiped?), leaving it zero.
>
> [  44185.225256]   WARN: xen_failsafe_callback from 
> xen_hypercall_update_descriptor+0xa/0x40
> [  44185.225263]   WARN: DS: 2b/2b ES: 2b/2b FS: 0/0 GS:0/63
>
> This is on context switch from a 32-bit task to idle. So
> xen_failsafe_callback is returning to the "faulting" instruction, with
> a comment saying "Retry the IRET", but in fact is just continuing on
> its merry way with %gs unexpectedly set to zero.
>
> In fact I think this is probably fine in practice, since it's about to
> get explicitly set a few lines further down in __context_switch(). But
> it's odd enough, and far enough away from what's actually said by the
> comments, that I'm utterly unsure.
>
> In xen_load_tls() we explicitly only do the lazy_load_gs(0) for the
> 32-bit kernel. Is that really right?

Basically - what is happening is that xen_load_tls() is invalidating the
%gs selector while %gs is still non-NUL.

If this happens to intersect with a vcpu reschedule, %gs (being non-NUL)
takes precedence over KERNGSBASE, and faults when Xen tries to reload
it.  This results in the failsafe callback being invoked.

I think the correct course of action is to use xen_load_gs_index(0)
(poorly named - it is a hypercall which does swapgs; mov to %gs; swapgs)
before using update_descriptor() to invalidate the segment.

That will reset %gs to 0 without touching KERNGSBASE, and can be queued
in the same multicall as the update_descriptor() hypercall.
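
The ordering suggested above can be pictured with a toy state model in C.
This is only a sketch: it does not call any real Xen or Linux interface,
and struct demo_vcpu and every demo_* function are hypothetical names
invented for illustration. The model assumes the only relevant state is
the selector held in %gs and whether its descriptor is still valid.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model of the per-vcpu state involved in the failure. */
struct demo_vcpu {
    uint16_t gs_sel;          /* selector currently loaded in %gs */
    bool desc_valid;          /* descriptor behind that selector is valid */
    unsigned int failsafe_hits; /* times the failsafe callback would run */
};

/* Models the load_gs_index hypercall: only the selector changes. */
static void demo_load_gs_index(struct demo_vcpu *v, uint16_t sel)
{
    v->gs_sel = sel;
}

/* Models update_descriptor() writing a zero descriptor. */
static void demo_clear_descriptor(struct demo_vcpu *v)
{
    v->desc_valid = false;
}

/* Models Xen rescheduling the vcpu and reloading its segment registers:
 * a non-zero selector with an invalid descriptor faults, %gs ends up
 * zero and the failsafe callback is invoked. */
static void demo_reschedule(struct demo_vcpu *v)
{
    if (v->gs_sel != 0 && !v->desc_valid) {
        v->gs_sel = 0;
        v->failsafe_hits++;
    }
}

/* Ordering described as the current behaviour: descriptor wiped first. */
static void demo_invalidate_current(struct demo_vcpu *v)
{
    demo_clear_descriptor(v);
    demo_reschedule(v);       /* worst case: reschedule right here */
}

/* Suggested ordering: zero %gs first, then wipe the descriptor. */
static void demo_invalidate_suggested(struct demo_vcpu *v)
{
    demo_load_gs_index(v, 0);
    demo_clear_descriptor(v);
    demo_reschedule(v);
}
```

In this model the first ordering records one failsafe hit when the
reschedule lands after the descriptor is cleared, and the second records
none, because %gs is already zero by then.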

~Andrew


Re: [Xen-devel] Patch "x86/entry/64: Remove %ebx handling from error_entry/exit" has been added to the 4.9-stable tree

2018-12-06 Thread David Woodhouse
On Wed, 2018-11-28 at 08:44 -0800, Andy Lutomirski wrote:
> > Can we assume it's always from kernel? The Xen code definitely seems to
> > handle invoking this from both kernel and userspace contexts.
> 
> I learned that my comment here was wrong shortly after the patch landed :(

Turns out the only place I see it getting called from is under
__context_switch().

 #7 [8801144a7cf0] new_xen_failsafe_callback at a028028a 
[kmod_ebxfix]
 #8 [8801144a7d90] xen_hypercall_update_descriptor at 8100114a
 #9 [8801144a7db8] xen_hypercall_update_descriptor at 8100114a
#10 [8801144a7df0] xen_mc_flush at 81006ab9
#11 [8801144a7e30] xen_end_context_switch at 81004e12
#12 [8801144a7e48] __switch_to at 81016582
#13 [8801144a7ea0] __schedule at 815d2b37

That …114a in xen_hypercall_update_descriptor is the 'pop' instruction
right after the syscall; it's happening when Xen is preempting the
domain in the hypercall and then reloads the segment registers to run
that vCPU again later.

[  44185.225289]   WARN: RDX:  RSI:  RDI: 
000abbd76060

The update_descriptor hypercall args (rdi, rsi) were 0xabbd76060 and 0
respectively — it was setting a descriptor at that address, to zero.

Xen then failed to load the selector 0x63 into the %gs register (since
that descriptor has just been wiped?), leaving it zero.

[  44185.225256]   WARN: xen_failsafe_callback from 
xen_hypercall_update_descriptor+0xa/0x40
[  44185.225263]   WARN: DS: 2b/2b ES: 2b/2b FS: 0/0 GS:0/63

This is on context switch from a 32-bit task to idle. So
xen_failsafe_callback is returning to the "faulting" instruction, with
a comment saying "Retry the IRET", but in fact is just continuing on
its merry way with %gs unexpectedly set to zero.

In fact I think this is probably fine in practice, since it's about to
get explicitly set a few lines further down in __context_switch(). But
it's odd enough, and far enough away from what's actually said by the
comments, that I'm utterly unsure.

In xen_load_tls() we explicitly only do the lazy_load_gs(0) for the
32-bit kernel. Is that really right?



Re: [Xen-devel] [PATCH] libxl: Documentation about the domain configuration on disk

2018-12-06 Thread Wei Liu
On Thu, Dec 06, 2018 at 04:09:14PM +, Anthony PERARD wrote:
> On Thu, Dec 06, 2018 at 03:46:22PM +, Wei Liu wrote:
> > Anyway, I'm not overly opposed to adding some easy to grep pointers, but
> > CODING_STYLE looks wrong to me.  Maybe README.dev?
> 
> To me, CODING_STYLE in libxl looks like a combination of both
> CODING_STYLE and HACKING that exist in qemu.git.
> 
> Maybe adding README.dev or HACKING might be ok. Or maybe adding pointers
> in some places in libxl_internal.h might work too.

Either README.dev or HACKING works for me.

Wei.

> 
> -- 
> Anthony PERARD


Re: [Xen-devel] [PATCH] libxl: Documentation about the domain configuration on disk

2018-12-06 Thread Anthony PERARD
On Thu, Dec 06, 2018 at 03:46:22PM +, Wei Liu wrote:
> Anyway, I'm not overly opposed to adding some easy to grep pointers, but
> CODING_STYLE looks wrong to me.  Maybe README.dev?

To me, CODING_STYLE in libxl looks like a combination of both
CODING_STYLE and HACKING that exist in qemu.git.

Maybe adding README.dev or HACKING might be ok. Or maybe adding pointers
in some places in libxl_internal.h might work too.

-- 
Anthony PERARD


[Xen-devel] [qemu-mainline test] 131024: regressions - FAIL

2018-12-06 Thread osstest service owner
flight 131024 qemu-mainline real [real]
http://logs.test-lab.xenproject.org/osstest/logs/131024/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-i386-xl-qemuu-dmrestrict-amd64-dmrestrict 10 debian-hvm-install fail REGR. vs. 129996
 test-amd64-amd64-xl-qemuu-dmrestrict-amd64-dmrestrict 10 debian-hvm-install fail REGR. vs. 129996

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stop fail like 129996
 test-armhf-armhf-libvirt 14 saverestore-support-check fail like 129996
 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop fail like 129996
 test-amd64-amd64-xl-qemuu-ws16-amd64 17 guest-stop fail like 129996
 test-armhf-armhf-libvirt-raw 13 saverestore-support-check fail like 129996
 test-amd64-i386-xl-pvshim 12 guest-start fail never pass
 test-amd64-amd64-libvirt-xsm 13 migrate-support-check fail never pass
 test-amd64-i386-libvirt 13 migrate-support-check fail never pass
 test-amd64-i386-libvirt-xsm 13 migrate-support-check fail never pass
 test-amd64-amd64-libvirt 13 migrate-support-check fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass
 test-armhf-armhf-xl-arndale 13 migrate-support-check fail never pass
 test-armhf-armhf-xl-arndale 14 saverestore-support-check fail never pass
 test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2 fail never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-check fail never pass
 test-armhf-armhf-libvirt 13 migrate-support-check fail never pass
 test-armhf-armhf-xl 13 migrate-support-check fail never pass
 test-armhf-armhf-xl-credit2 13 migrate-support-check fail never pass
 test-armhf-armhf-xl 14 saverestore-support-check fail never pass
 test-armhf-armhf-xl-credit2 14 saverestore-support-check fail never pass
 test-armhf-armhf-xl-cubietruck 13 migrate-support-check fail never pass
 test-armhf-armhf-xl-cubietruck 14 saverestore-support-check fail never pass
 test-armhf-armhf-xl-credit1 13 migrate-support-check fail never pass
 test-armhf-armhf-xl-credit1 14 saverestore-support-check fail never pass
 test-armhf-armhf-xl-multivcpu 13 migrate-support-check fail never pass
 test-armhf-armhf-xl-multivcpu 14 saverestore-support-check fail never pass
 test-armhf-armhf-libvirt-raw 12 migrate-support-check fail never pass
 test-armhf-armhf-xl-rtds 13 migrate-support-check fail never pass
 test-armhf-armhf-xl-rtds 14 saverestore-support-check fail never pass
 test-armhf-armhf-xl-vhd 12 migrate-support-check fail never pass
 test-armhf-armhf-xl-vhd 13 saverestore-support-check fail never pass
 test-amd64-i386-xl-qemuu-ws16-amd64 17 guest-stop fail never pass
 test-amd64-amd64-xl-qemuu-win10-i386 10 windows-install fail never pass
 test-amd64-i386-xl-qemuu-win10-i386 10 windows-install fail never pass

version targeted for testing:
 qemuu 118cafff251318d16e1cfdef9cbf6b7d1e74cdb5
baseline version:
 qemuu cb968d275c145467c8b385a3618a207ec111eab1

Last test of basis   129996  2018-11-13 22:49:16 Z   22 days
Failing since        130168  2018-11-16 04:27:30 Z   20 days   12 attempts
Testing same since   131024  2018-12-04 18:23:39 Z    1 days    1 attempts


People who touched revisions under test:
  Alberto Garcia 
  Aleksandar Markovic 
  Alex Bennée 
  Alistair Francis 
  baldu...@units.it
  Bandan Das 
  Bastian Koppelmann 
  Corey Minyard 
  Cornelia Huck 
  Daniel P. Berrangé 
  David Hildenbrand 
  Dr. David Alan Gilbert 
  Edgar E. Iglesias 
  Eduardo Habkost 
  Eric Auger 
  Eric Blake 
  Erik Skultety 
  Fredrik Noring 
  George Kennedy 
  Gerd Hoffmann 
  Greg Kurz 
  Guenter Roeck 
  Hervé Poussineau 
  Igor Druzhinin 
  Jason Wang 
  John Snow 
  Keith Busch 
  Kevin Wolf 
  Laurent Vivier 
  Li Qiang 
  linzhecheng 
  Logan Gunthorpe 
  Luc Michel 
  Mao Zhongyi 
  Marc-André Lureau 
  Mark Cave-Ayland 
  Markus Armbruster 
  Max Filippov 
  Max Reitz 
  Michael Roth 
  Palmer Dabbelt 
  Paolo Bonzini 
  Peter Maydell 
  Philippe Mathieu-Daudé 
  Philippe Mathieu-Daudé 
  Prasad J Pandit 
  Richard Henderson 
  Richard W.M. Jones 
  Roman Bolshakov 
  Roman Kagan 
  Seth Kintigh 
  Stefan Berger 
  Stefan Berger 
  Stefan Markovic 
  Thomas Huth 
  Vladimir Sementsov-Ogievskiy 
  Wang Xin 
  Zhang Chen 
  Zhang Chen 
  ZhiPeng Lu 

jobs:
 build-amd64-xsm  pass
 build-i386-xsm   pass
 build-amd64  

Re: [Xen-devel] [PATCH 15/18] xen: add a mechanism to automatically create XenDevice-s...

2018-12-06 Thread Paul Durrant
> -Original Message-
> From: Anthony PERARD [mailto:anthony.per...@citrix.com]
> Sent: 06 December 2018 15:24
> To: Paul Durrant 
> Cc: qemu-bl...@nongnu.org; qemu-de...@nongnu.org; xen-
> de...@lists.xenproject.org; Stefano Stabellini 
> Subject: Re: [PATCH 15/18] xen: add a mechanism to automatically create
> XenDevice-s...
> 
> On Thu, Dec 06, 2018 at 12:36:52PM +, Paul Durrant wrote:
> > > -Original Message-
> > > From: Anthony PERARD [mailto:anthony.per...@citrix.com]
> > > Sent: 04 December 2018 15:35
> > >
> > > On Wed, Nov 21, 2018 at 03:12:08PM +, Paul Durrant wrote:
> > > > +xenbus->backend_watch =
> > > > +xen_bus_add_watch(xenbus, "", /* domain root node */
> > > > +  "backend", xen_bus_enumerate, xenbus,
> > > > &local_err);
> > > > +if (local_err) {
> > > > +error_propagate(errp, local_err);
> > > > +error_prepend(errp, "failed to set up enumeration watch:
> ");
> > >
> > > > You should use error_propagate_prepend instead of
> > > > error_propagate;error_prepend. And it looks like there is the same
> > > > mistake in other patches that I haven't noticed.
> > >
> >
> > Oh, I didn't know about that one either... I've only seen the separate
> calls used elsewhere.
> 
> > That information is all in "include/qapi/error.h", if you wish to know
> more on how to use Error.
> 

Thanks.

> > > Also you probably want goto fail here.
> > >
> >
> > Not sure about that. Whilst the bus scan won't happen, it doesn't mean
> devices can't be added via QMP.
> 
> In that case, don't modify errp, and use error_reportf_err instead, or
> warn_reportf_err (then local_err = NULL, in case it is reused in a
> future modification of the function).
> 
> Setting errp (with error_propagate) means that the function failed, and
> QEMU is going to exit(1), because of qdev_init_nofail call in
> xen_bus_init.

Ah, good point. I'll wait for more feedback on v2 and then fix in v3.

  Paul


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH] libxl: Documentation about the domain configuration on disk

2018-12-06 Thread Wei Liu
On Thu, Dec 06, 2018 at 02:57:33PM +, Anthony PERARD wrote:
> On Thu, Dec 06, 2018 at 12:16:40PM +, Wei Liu wrote:
> > On Thu, Dec 06, 2018 at 10:43:32AM +, Anthony PERARD wrote:
> > > +UPDATE OF DOMAIN CONFIGURATION
> > > +--
> > > +
> > > +Also known as "libxl-json" userdata or `libxl_domain_config'.
> > > +
> > > > +Whenever a running domain has its configuration updated, like changing
> > > > +media in a cdrom drive, the domain configuration in libxl private data
> > > > +store needs to be updated as well. The domain configuration should
> > > > +contain *more* information about the domain rather than less; stale data
> > > > +are easier to spot than missing data.
> > > +
> > > +Here is an example of how to update the domain configuration:
> > > + * Remove current media from cdrom drive
> > > + * Update domain configuration with media removed
> > 
> > We may not even need this because the primary source in this case is
> > QEMU. See below.
> > 
> > > + ( we could stop here)
> > > + * Update domain configuration to add media we are about to insert
> > > + * Insert media into cdrom drive
> > 
> > In essence we need a primary reference while using libxl-json file as a
> > secondary source.
> > 
> > When doing device hotplug, the primary source is xenstore. It may become
> > QEMU in the future if we move to a model where everything is
> > communicated via QMP.
> > 
> > > When doing CDROM insertion and ejection, the primary source is QEMU
> > state.
> 
> I'm not trying to figure out what primary source should be here, I'm
> trying to find out how the secondary source, namely "libxl-json", should
> > behave, what it should contain, when to update it compared to the primary
> > source (what a guest ultimately sees).
> 
> > All in all I think your description is not wrong but it failed to
> > capture the high-level intent -- always update libxl-json before
> > updating the primary source.
> 
> That isn't what Ian said IRL, I don't think. From what I understand,
> when removing a media/disk, first remove the media, then update
> libxl-json; when adding a media/disk, first update libxl-json, then add
> the media.

OK I should have been clearer on this.

When removing a CD, you only need to update the primary source -- QEMU
in this case, you can leave libxl-json untouched. It is allowed to have
stale entries in libxl-json. This is implied in "We may not even need
this ..." further above.

When inserting a CD, always update libxl-json first, then add the media
to QEMU. My previous reply was for this part.

Yet I think CDROM has its own quirks. IIRC it is more an exception than
the norm. The existing code doesn't match what you wrote either.
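
The ordering above can be sketched as a toy model in C. This is only an illustration under stated assumptions: every name here (toy_cdrom, toy_insert_cd, toy_eject_cd, toy_invariant_holds) is hypothetical and none of them is a libxl or QEMU API. The model tracks one cdrom slot in two places and checks that whatever the primary source holds also has a libxl-json entry.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model: one cdrom slot tracked in two places (hypothetical names). */
struct toy_cdrom {
    bool json_has_media;   /* secondary source: libxl-json */
    bool qemu_has_media;   /* primary source: QEMU state */
};

/* Invariant: anything present in the primary source has a libxl-json entry. */
static bool toy_invariant_holds(const struct toy_cdrom *s)
{
    return !s->qemu_has_media || s->json_has_media;
}

/*
 * Insert: record the media in libxl-json first, then update QEMU.
 * If the QEMU step fails, libxl-json is merely stale, which is allowed.
 */
static int toy_insert_cd(struct toy_cdrom *s, bool qemu_step_fails)
{
    s->json_has_media = true;
    if (qemu_step_fails)
        return -1;
    s->qemu_has_media = true;
    return 0;
}

/* Eject: only the primary source has to change; libxl-json may stay stale. */
static void toy_eject_cd(struct toy_cdrom *s)
{
    s->qemu_has_media = false;
}
```

In this model a failure between the two insert steps leaves a stale libxl-json entry but never a QEMU-side media that libxl-json does not know about, which is the point of updating libxl-json first.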

> 
> > > +
> > > > +Retrieving / storing the domain configuration from / to libxl private data
> > > > +store is done with `libxl__get_domain_configuration' and
> > > +`libxl__set_domain_configuration'. Consult libxl_internal.h for more
> > > +information.
> > > +
> > 
> > What do you think about the text around libxl_internal.h:L2598?
> 
> > If only I knew this comment existed :-(. It is buried, doesn't mention
> > "libxl-json" or "userdata" or "domain config" but only the not very
> > helpful term "json config"... Hmm, ... it actually has "domain
> > configuration" once.
> 
> Anyway, that comment block isn't very helpful because it basically says
> that we can't depriv QEMU, I mean do hotplug with a deprived QEMU. It
> assumes that we can keep a lock on the userdata while updating the
> guest, but we can't keep the lock while talking with QEMU (or more
> > generally: we can't keep the lock while doing any async operation).
> 
> But there is one useful piece of information:
> Here we maintain one invariant: every device in xenstore must have
> an entry in JSON file.
> > (xenstore is described as "primary reference" just before that sentence).
> 

Yes. That.

> This is what I would like my past self to be able to find out more
> > easily, and having the information in CODING_STYLE would make sense I
> think.
> 
> > Maybe we should extend that comment block?
> 
> I still think it would be helpful to have pointers in CODING_STYLE, as
> there isn't a single place in libxl_internal.h where the information I
> was looking for could be added.
> 

Anyway, I'm not overly opposed to adding some easy to grep pointers, but
CODING_STYLE looks wrong to me.  Maybe README.dev?

Wei.

> Thanks,
> 
> -- 
> Anthony PERARD


Re: [Xen-devel] [GRUB PATCH 1/2] verifiers: Xen fallout cleanup

2018-12-06 Thread Ross Philipson
On 12/06/2018 10:40 AM, Daniel Kiper wrote:
> On Thu, Dec 06, 2018 at 10:37:43AM -0500, Ross Philipson wrote:
>> On 12/06/2018 08:40 AM, Daniel Kiper wrote:
>>> Xen fallout cleanup after commit ca0a4f689 (verifiers: File type for
>>> fine-grained signature-verification controlling).
>>>
>>> Signed-off-by: Daniel Kiper 
>>> ---
>>>  grub-core/loader/i386/xen.c | 14 +++---
>>>  1 file changed, 7 insertions(+), 7 deletions(-)
>>>
>>> diff --git a/grub-core/loader/i386/xen.c b/grub-core/loader/i386/xen.c
>>> index 1a99ca72c..8f662c8ac 100644
>>> --- a/grub-core/loader/i386/xen.c
>>> +++ b/grub-core/loader/i386/xen.c
>>> @@ -645,10 +645,10 @@ grub_cmd_xen (grub_command_t cmd __attribute__ 
>>> ((unused)),
>>>
>>>grub_xen_reset ();
>>>
>>> -  grub_create_loader_cmdline (argc - 1, argv + 1,
>>> - (char *) xen_state.next_start.cmd_line,
>>> - sizeof (xen_state.next_start.cmd_line) - 1);
>>> -  err = grub_verify_string (xen_state.next_start.cmd_line, 
>>> GRUB_VERIFY_MODULE_CMDLINE);
>>> +  err = grub_create_loader_cmdline (argc - 1, argv + 1,
>>> +   (char *) xen_state.next_start.cmd_line,
>>> +   sizeof (xen_state.next_start.cmd_line) - 1,
>>> +   GRUB_VERIFY_KERNEL_CMDLINE);
>>
>> How did this compile previously if you were missing an argument to
>> grub_create_loader_cmdline?
> 
> > This is only built if xen platform is enabled. Otherwise this file is
> not used.

Ack, that is what I was starting to guess happened. For the series:

Reviewed-by: Ross Philipson 

> 
> Daniel
> 



Re: [Xen-devel] [GRUB PATCH 1/2] verifiers: Xen fallout cleanup

2018-12-06 Thread Daniel Kiper
On Thu, Dec 06, 2018 at 10:37:43AM -0500, Ross Philipson wrote:
> On 12/06/2018 08:40 AM, Daniel Kiper wrote:
> > Xen fallout cleanup after commit ca0a4f689 (verifiers: File type for
> > fine-grained signature-verification controlling).
> >
> > Signed-off-by: Daniel Kiper 
> > ---
> >  grub-core/loader/i386/xen.c | 14 +++---
> >  1 file changed, 7 insertions(+), 7 deletions(-)
> >
> > diff --git a/grub-core/loader/i386/xen.c b/grub-core/loader/i386/xen.c
> > index 1a99ca72c..8f662c8ac 100644
> > --- a/grub-core/loader/i386/xen.c
> > +++ b/grub-core/loader/i386/xen.c
> > @@ -645,10 +645,10 @@ grub_cmd_xen (grub_command_t cmd __attribute__ 
> > ((unused)),
> >
> >grub_xen_reset ();
> >
> > -  grub_create_loader_cmdline (argc - 1, argv + 1,
> > - (char *) xen_state.next_start.cmd_line,
> > - sizeof (xen_state.next_start.cmd_line) - 1);
> > -  err = grub_verify_string (xen_state.next_start.cmd_line, 
> > GRUB_VERIFY_MODULE_CMDLINE);
> > +  err = grub_create_loader_cmdline (argc - 1, argv + 1,
> > +   (char *) xen_state.next_start.cmd_line,
> > +   sizeof (xen_state.next_start.cmd_line) - 1,
> > +   GRUB_VERIFY_KERNEL_CMDLINE);
>
> How did this compile previously if you were missing an argument to
> grub_create_loader_cmdline?

This is only built if xen platform is enabled. Otherwise this file is
not used.

Daniel


Re: [Xen-devel] [GRUB PATCH 2/2] verifiers: ARM Xen fallout cleanup

2018-12-06 Thread Ross Philipson
On 12/06/2018 08:40 AM, Daniel Kiper wrote:
> ARM Xen fallout cleanup after commit ca0a4f689 (verifiers: File type for
> fine-grained signature-verification controlling).
> 
> Signed-off-by: Daniel Kiper 
> ---
>  grub-core/loader/arm64/xen_boot.c | 8 
>  include/grub/file.h   | 5 +
>  2 files changed, 9 insertions(+), 4 deletions(-)
> 
> diff --git a/grub-core/loader/arm64/xen_boot.c 
> b/grub-core/loader/arm64/xen_boot.c
> index 33a855df4..a742868a4 100644
> --- a/grub-core/loader/arm64/xen_boot.c
> +++ b/grub-core/loader/arm64/xen_boot.c
> @@ -429,9 +429,9 @@ grub_cmd_xen_module (grub_command_t cmd 
> __attribute__((unused)),
>  
>grub_dprintf ("xen_loader", "Init module and node info\n");
>  
> -  if (nounzip)
> -grub_file_filter_disable_compression ();
> -  file = grub_file_open (argv[0]);
> +  file = grub_file_open (argv[0], GRUB_FILE_TYPE_XEN_MODULE
> +  | (nounzip ? GRUB_FILE_TYPE_NO_DECOMPRESS
> + : GRUB_FILE_TYPE_NONE));

Same question, how did this compile if you were missing an argument? I
guess maybe you were not building xen bits in and you missed fixing this up?

>if (!file)
>  goto fail;
>  
> @@ -463,7 +463,7 @@ grub_cmd_xen_hypervisor (grub_command_t cmd __attribute__ 
> ((unused)),
>goto fail;
>  }
>  
> -  file = grub_file_open (argv[0]);
> +  file = grub_file_open (argv[0], GRUB_FILE_TYPE_XEN_HYPERVISOR);
>if (!file)
>  goto fail;
>  
> diff --git a/include/grub/file.h b/include/grub/file.h
> index 9aae46355..cbbd29465 100644
> --- a/include/grub/file.h
> +++ b/include/grub/file.h
> @@ -42,6 +42,11 @@ enum grub_file_type
>  /* Multiboot module.  */
>  GRUB_FILE_TYPE_MULTIBOOT_MODULE,
>  
> +/* Xen hypervisor - used on ARM only. */
> +GRUB_FILE_TYPE_XEN_HYPERVISOR,
> +/* Xen module - used on ARM only. */
> +GRUB_FILE_TYPE_XEN_MODULE,
> +
>  GRUB_FILE_TYPE_BSD_KERNEL,
>  GRUB_FILE_TYPE_FREEBSD_ENV,
>  GRUB_FILE_TYPE_FREEBSD_MODULE,
> 



Re: [Xen-devel] [GRUB PATCH 1/2] verifiers: Xen fallout cleanup

2018-12-06 Thread Ross Philipson
On 12/06/2018 08:40 AM, Daniel Kiper wrote:
> Xen fallout cleanup after commit ca0a4f689 (verifiers: File type for
> fine-grained signature-verification controlling).
> 
> Signed-off-by: Daniel Kiper 
> ---
>  grub-core/loader/i386/xen.c | 14 +++---
>  1 file changed, 7 insertions(+), 7 deletions(-)
> 
> diff --git a/grub-core/loader/i386/xen.c b/grub-core/loader/i386/xen.c
> index 1a99ca72c..8f662c8ac 100644
> --- a/grub-core/loader/i386/xen.c
> +++ b/grub-core/loader/i386/xen.c
> @@ -645,10 +645,10 @@ grub_cmd_xen (grub_command_t cmd __attribute__ 
> ((unused)),
>  
>grub_xen_reset ();
>  
> -  grub_create_loader_cmdline (argc - 1, argv + 1,
> -   (char *) xen_state.next_start.cmd_line,
> -   sizeof (xen_state.next_start.cmd_line) - 1);
> -  err = grub_verify_string (xen_state.next_start.cmd_line, 
> GRUB_VERIFY_MODULE_CMDLINE);
> +  err = grub_create_loader_cmdline (argc - 1, argv + 1,
> + (char *) xen_state.next_start.cmd_line,
> + sizeof (xen_state.next_start.cmd_line) - 1,
> + GRUB_VERIFY_KERNEL_CMDLINE);

How did this compile previously if you were missing an argument to
grub_create_loader_cmdline?

>if (err)
>  return err;
>  
> @@ -910,9 +910,9 @@ grub_cmd_module (grub_command_t cmd __attribute__ 
> ((unused)),
>if (err)
>  goto fail;
>  
> -  grub_create_loader_cmdline (argc - 1, argv + 1,
> -   get_virtual_current_address (ch), cmdline_len);
> -  err = grub_verify_string (get_virtual_current_address (ch), 
> GRUB_VERIFY_MODULE_CMDLINE);
> +  err = grub_create_loader_cmdline (argc - 1, argv + 1,
> + get_virtual_current_address (ch), 
> cmdline_len,
> + GRUB_VERIFY_MODULE_CMDLINE);
>if (err)
>  goto fail;
>  
> 



[Xen-devel] [seabios test] 131012: regressions - FAIL

2018-12-06 Thread osstest service owner
flight 131012 seabios real [real]
http://logs.test-lab.xenproject.org/osstest/logs/131012/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-i386-xl-qemuu-dmrestrict-amd64-dmrestrict 10 debian-hvm-install 
fail REGR. vs. 130373
 test-amd64-amd64-xl-qemuu-dmrestrict-amd64-dmrestrict 10 debian-hvm-install 
fail REGR. vs. 130373

Tests which did not succeed, but are not blocking:
 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop fail like 130373
 test-amd64-amd64-xl-qemuu-ws16-amd64 17 guest-stop fail like 130373
 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stop fail like 130373
 test-amd64-i386-xl-qemuu-ws16-amd64 17 guest-stop fail like 130373
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2  fail never pass
 test-amd64-amd64-xl-qemuu-win10-i386 10 windows-install fail never pass
 test-amd64-i386-xl-qemuu-win10-i386 10 windows-install fail never pass

version targeted for testing:
 seabios  628b2e6b0e390e26d59b3c5db07a4226175b6f8a
baseline version:
 seabios  a698c8995ffb2838296ec284fe3c4ad33dfca307

Last test of basis   130373  2018-11-18 03:30:13 Z   18 days
Failing since        130842  2018-11-28 02:10:59 Z    8 days    4 attempts
Testing same since   130871  2018-11-30 09:17:08 Z    6 days    3 attempts


People who touched revisions under test:
  Liran Alon 
  Stephen Douthit 

jobs:
 build-amd64-xsm                                          pass
 build-i386-xsm                                           pass
 build-amd64                                              pass
 build-i386                                               pass
 build-amd64-libvirt                                      pass
 build-i386-libvirt                                       pass
 build-amd64-pvops                                        pass
 build-i386-pvops                                         pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm       pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm        pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-xsm            pass
 test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm             pass
 test-amd64-amd64-qemuu-nested-amd                        fail
 test-amd64-i386-qemuu-rhel6hvm-amd                       pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64                pass
 test-amd64-i386-xl-qemuu-debianhvm-amd64                 pass
 test-amd64-amd64-xl-qemuu-win7-amd64                     fail
 test-amd64-i386-xl-qemuu-win7-amd64                      fail
 test-amd64-amd64-xl-qemuu-ws16-amd64                     fail
 test-amd64-i386-xl-qemuu-ws16-amd64                      fail
 test-amd64-amd64-xl-qemuu-dmrestrict-amd64-dmrestrict    fail
 test-amd64-i386-xl-qemuu-dmrestrict-amd64-dmrestrict     fail
 test-amd64-amd64-xl-qemuu-win10-i386                     fail
 test-amd64-i386-xl-qemuu-win10-i386                      fail
 test-amd64-amd64-qemuu-nested-intel                      pass
 test-amd64-i386-qemuu-rhel6hvm-intel                     pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-shadow         pass
 test-amd64-i386-xl-qemuu-debianhvm-amd64-shadow          pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Not pushing.


commit 628b2e6b0e390e26d59b3c5db07a4226175b6f8a
Author: Liran Alon 
Date:   Tue Nov 13 17:53:40 2018 +0200

pvscsi: ring_desc do not have to be page aligned

In contrast to other allocations made by pvscsi_init_rings(),
ring_desc is only used internally by SeaBIOS (not passed to
device-controller) and there is not restriction which force
it to be page aligned.

Reviewed-by: Mark Kanda 
Signed-off-by: Liran Alon 

commit 42efebdf1d120554e1a30e8debf562527ec6a53d
Author: Stephen Douthit 
Date:   Wed Mar 7 13:17:36 2018 -0500

tpm: Check for TPM related ACPI tables before attempting hw probe

[Xen-devel] [PATCH v4 3/4] iommu: elide flushing for higher order map/unmap operations

2018-12-06 Thread Paul Durrant
This patch removes any implicit flushing that occurs in the implementation
of map and unmap operations and adds new iommu_map/unmap() wrapper
functions. To maintain semantics of the iommu_legacy_map/unmap() wrapper
functions, these are modified to call the new wrapper functions and then
perform an explicit flush operation.

Because VT-d currently performs two different types of flush dependent upon
whether a PTE is being modified versus merely added (i.e. replacing a non-
present PTE) 'iommu flush flags' are defined by this patch and the
iommu_ops map_page() and unmap_page() methods are modified to OR the type
of flush necessary for the PTE that has been populated or depopulated into
an accumulated flags value. The accumulated value can then be passed into
the explicit flush operation.
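
A minimal sketch of the accumulate-then-flush idea, assuming a toy page table rather than any real IOMMU code. The TOY_FLUSHF_* values only mirror the IOMMU_FLUSHF_added / IOMMU_FLUSHF_modified names used in this series, and struct toy_pt and the toy_* helpers are hypothetical. Each per-page operation ORs the kind of flush it needs into a caller-owned flags word, and the range operation issues a single flush at the end.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the flush flags named in the patch. */
enum {
    TOY_FLUSHF_added    = 1u << 0,  /* a non-present entry became present */
    TOY_FLUSHF_modified = 1u << 1,  /* a present entry was changed or removed */
};

#define TOY_PAGES 8

struct toy_pt {
    bool present[TOY_PAGES];
    unsigned int flushes;     /* number of explicit flushes issued */
    unsigned int last_flags;  /* flags passed to the most recent flush */
};

/* Per-page map: no implicit flush, just accumulate what would be needed. */
static void toy_map_page(struct toy_pt *pt, size_t idx, unsigned int *flush_flags)
{
    if (pt->present[idx])
        *flush_flags |= TOY_FLUSHF_modified;
    pt->present[idx] = true;
    *flush_flags |= TOY_FLUSHF_added;
}

/* Per-page unmap: only a previously present entry needs flushing. */
static void toy_unmap_page(struct toy_pt *pt, size_t idx, unsigned int *flush_flags)
{
    if (pt->present[idx])
        *flush_flags |= TOY_FLUSHF_modified;
    pt->present[idx] = false;
}

static void toy_flush(struct toy_pt *pt, unsigned int flush_flags)
{
    pt->flushes++;
    pt->last_flags = flush_flags;
}

/* Range map: one flush for the whole range. Caller keeps indices in bounds. */
static unsigned int toy_map_range(struct toy_pt *pt, size_t first, size_t count)
{
    unsigned int flush_flags = 0;

    for (size_t i = 0; i < count; i++)
        toy_map_page(pt, first + i, &flush_flags);
    if (flush_flags)
        toy_flush(pt, flush_flags);
    return flush_flags;
}

/* Range unmap: skips the flush entirely when nothing was present. */
static unsigned int toy_unmap_range(struct toy_pt *pt, size_t first, size_t count)
{
    unsigned int flush_flags = 0;

    for (size_t i = 0; i < count; i++)
        toy_unmap_page(pt, first + i, &flush_flags);
    if (flush_flags)
        toy_flush(pt, flush_flags);
    return flush_flags;
}
```

Mapping four fresh pages in this model yields only the "added" flag and one flush, while unmapping a range that was never mapped yields no flags and no flush at all.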

The ARM SMMU implementations of map_page() and unmap_page() currently
perform no implicit flushing and therefore the modified methods do not
adjust the flush flags.

NOTE: The per-cpu 'iommu_dont_flush_iotlb' is respected by the
  iommu_legacy_map/unmap() wrapper functions and therefore this now
  applies to all IOMMU implementations rather than just VT-d.

Signed-off-by: Paul Durrant 
---
Cc: Stefano Stabellini 
Cc: Julien Grall 
Cc: Andrew Cooper 
Cc: George Dunlap 
Cc: Ian Jackson 
Cc: Jan Beulich 
Cc: Konrad Rzeszutek Wilk 
Cc: Tim Deegan 
Cc: Wei Liu 
Cc: Suravee Suthikulpanit 
Cc: Brian Woods 
Cc: Kevin Tian 
Cc: "Roger Pau Monné" 

v4:
 - Formatting fixes.
 - Respect flush flags even on a failed map or unmap.

v3:
 - Make AMD IOMMU and Intel VT-d map/unmap operations pass back accurate
   flush_flags.
 - Respect 'iommu_dont_flush_iotlb' in legacy unmap wrapper.
 - Pass flush_flags into iommu_iotlb_flush_all().
 - Improve comments and fix style issues.

v2:
 - Add the new iommu_map/unmap() and don't proliferate use of
   iommu_dont_flush_iotlb.
 - Use 'flush flags' instead of a 'iommu_flush_type'
 - Add a 'flush_flags' argument to iommu_flush() and modify the call-sites.

This code has only been compile tested for ARM.
---
 xen/arch/arm/p2m.c| 11 +++-
 xen/common/memory.c   |  6 +-
 xen/drivers/passthrough/amd/iommu_map.c   | 87 ++-
 xen/drivers/passthrough/arm/smmu.c| 15 +++--
 xen/drivers/passthrough/iommu.c   | 84 --
 xen/drivers/passthrough/vtd/iommu.c   | 32 +-
 xen/drivers/passthrough/x86/iommu.c   | 27 ++---
 xen/include/asm-x86/hvm/svm/amd-iommu-proto.h |  9 ++-
 xen/include/xen/iommu.h   | 44 +++---
 9 files changed, 228 insertions(+), 87 deletions(-)

diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
index 6c76298ebc..8b783b602b 100644
--- a/xen/arch/arm/p2m.c
+++ b/xen/arch/arm/p2m.c
@@ -971,8 +971,17 @@ static int __p2m_set_entry(struct p2m_domain *p2m,
 
 if ( need_iommu_pt_sync(p2m->domain) &&
  (lpae_is_valid(orig_pte) || lpae_is_valid(*entry)) )
+{
+unsigned int flush_flags = 0;
+
+if ( lpae_is_valid(orig_pte) )
+flush_flags |= IOMMU_FLUSHF_modified;
+if ( lpae_is_valid(*entry) )
+flush_flags |= IOMMU_FLUSHF_added;
+
 rc = iommu_iotlb_flush(p2m->domain, _dfn(gfn_x(sgfn)),
-   1UL << page_order);
+   1UL << page_order, flush_flags);
+}
 else
 rc = 0;
 
diff --git a/xen/common/memory.c b/xen/common/memory.c
index f37eb288d4..b6cf09585c 100644
--- a/xen/common/memory.c
+++ b/xen/common/memory.c
@@ -853,11 +853,13 @@ int xenmem_add_to_physmap(struct domain *d, struct 
xen_add_to_physmap *xatp,
 
 this_cpu(iommu_dont_flush_iotlb) = 0;
 
-ret = iommu_flush(d, _dfn(xatp->idx - done), done);
+ret = iommu_iotlb_flush(d, _dfn(xatp->idx - done), done,
+IOMMU_FLUSHF_added | IOMMU_FLUSHF_modified);
 if ( unlikely(ret) && rc >= 0 )
 rc = ret;
 
-ret = iommu_flush(d, _dfn(xatp->gpfn - done), done);
+ret = iommu_iotlb_flush(d, _dfn(xatp->gpfn - done), done,
+IOMMU_FLUSHF_added | IOMMU_FLUSHF_modified);
 if ( unlikely(ret) && rc >= 0 )
 rc = ret;
 }
diff --git a/xen/drivers/passthrough/amd/iommu_map.c 
b/xen/drivers/passthrough/amd/iommu_map.c
index de5a880070..21d147411e 100644
--- a/xen/drivers/passthrough/amd/iommu_map.c
+++ b/xen/drivers/passthrough/amd/iommu_map.c
@@ -35,23 +35,37 @@ static unsigned int pfn_to_pde_idx(unsigned long pfn, 
unsigned int level)
 return idx;
 }
 
-static void clear_iommu_pte_present(unsigned long l1_mfn, unsigned long dfn)
+static unsigned int clear_iommu_pte_present(unsigned long l1_mfn,
+unsigned long dfn)
 {
 uint64_t *table, *pte;
+uint32_t entry;
+unsigned int flush_flags;
 
 table = map_domain_page(_mfn(l1_mfn));
-pte = table + 

[Xen-devel] [PATCH v4 2/4] iommu: rename wrapper functions

2018-12-06 Thread Paul Durrant
A subsequent patch will add semantically different versions of
iommu_map/unmap() so, in advance of that change, this patch renames the
existing functions to iommu_legacy_map/unmap() and modifies all call-sites.
It also adjusts a comment that refers to iommu_map_page(), which was re-
named by a previous patch.

This patch is purely cosmetic. No functional change.

Signed-off-by: Paul Durrant 
Acked-by: Jan Beulich 
---
Cc: Andrew Cooper 
Cc: Wei Liu 
Cc: "Roger Pau Monné" 
Cc: George Dunlap 
Cc: Ian Jackson 
Cc: Julien Grall 
Cc: Konrad Rzeszutek Wilk 
Cc: Stefano Stabellini 
Cc: Tim Deegan 
Cc: Jun Nakajima 
Cc: Kevin Tian 
Cc: George Dunlap 

v2:
 - New in v2.

v3:
 - Leave iommu_iotlb_flush[_all] alone.
 - Make patch purely cosmetic.
 - Fix comment in xen/iommu.h.
---
 xen/arch/x86/mm.c   | 11 ++-
 xen/arch/x86/mm/p2m-ept.c   |  4 ++--
 xen/arch/x86/mm/p2m-pt.c|  5 +++--
 xen/arch/x86/mm/p2m.c   | 12 ++--
 xen/arch/x86/x86_64/mm.c|  9 +
 xen/common/grant_table.c| 14 +++---
 xen/common/memory.c |  4 ++--
 xen/drivers/passthrough/iommu.c |  6 +++---
 xen/drivers/passthrough/x86/iommu.c |  4 ++--
 xen/include/xen/iommu.h | 16 +++-
 10 files changed, 47 insertions(+), 38 deletions(-)

diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index b3350eee35..a903fa7ba5 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -2801,12 +2801,13 @@ static int _get_page_type(struct page_info *page, 
unsigned long type,
 mfn_t mfn = page_to_mfn(page);
 
 if ( (x & PGT_type_mask) == PGT_writable_page )
-iommu_ret = iommu_unmap(d, _dfn(mfn_x(mfn)),
-PAGE_ORDER_4K);
+iommu_ret = iommu_legacy_unmap(d, _dfn(mfn_x(mfn)),
+   PAGE_ORDER_4K);
 else if ( type == PGT_writable_page )
-iommu_ret = iommu_map(d, _dfn(mfn_x(mfn)), mfn,
-  PAGE_ORDER_4K,
-  IOMMUF_readable | IOMMUF_writable);
+iommu_ret = iommu_legacy_map(d, _dfn(mfn_x(mfn)), mfn,
+ PAGE_ORDER_4K,
+ IOMMUF_readable |
+ IOMMUF_writable);
 }
 }
 
diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c
index 6e4e375bad..64a49c07b7 100644
--- a/xen/arch/x86/mm/p2m-ept.c
+++ b/xen/arch/x86/mm/p2m-ept.c
@@ -882,8 +882,8 @@ out:
 rc = iommu_pte_flush(d, gfn, &ept_entry->epte, order, 
vtd_pte_present);
 else if ( need_iommu_pt_sync(d) )
 rc = iommu_flags ?
-iommu_map(d, _dfn(gfn), mfn, order, iommu_flags) :
-iommu_unmap(d, _dfn(gfn), order);
+iommu_legacy_map(d, _dfn(gfn), mfn, order, iommu_flags) :
+iommu_legacy_unmap(d, _dfn(gfn), order);
 }
 
 unmap_domain_page(table);
diff --git a/xen/arch/x86/mm/p2m-pt.c b/xen/arch/x86/mm/p2m-pt.c
index 17a6b61f12..69ffb08179 100644
--- a/xen/arch/x86/mm/p2m-pt.c
+++ b/xen/arch/x86/mm/p2m-pt.c
@@ -686,8 +686,9 @@ p2m_pt_set_entry(struct p2m_domain *p2m, gfn_t gfn_, mfn_t 
mfn,
 
 if ( need_iommu_pt_sync(p2m->domain) )
 rc = iommu_pte_flags ?
-iommu_map(d, _dfn(gfn), mfn, page_order, iommu_pte_flags) :
-iommu_unmap(d, _dfn(gfn), page_order);
+iommu_legacy_map(d, _dfn(gfn), mfn, page_order,
+ iommu_pte_flags) :
+iommu_legacy_unmap(d, _dfn(gfn), page_order);
 else if ( iommu_use_hap_pt(d) && iommu_old_flags )
 amd_iommu_flush_pages(p2m->domain, gfn, page_order);
 }
diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
index fea4497910..ed76e96d33 100644
--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -733,7 +733,7 @@ p2m_remove_page(struct p2m_domain *p2m, unsigned long 
gfn_l, unsigned long mfn,
 
 if ( !paging_mode_translate(p2m->domain) )
 return need_iommu_pt_sync(p2m->domain) ?
-iommu_unmap(p2m->domain, _dfn(mfn), page_order) : 0;
+iommu_legacy_unmap(p2m->domain, _dfn(mfn), page_order) : 0;
 
 ASSERT(gfn_locked_by_me(p2m, gfn));
 P2M_DEBUG("removing gfn=%#lx mfn=%#lx\n", gfn_l, mfn);
@@ -780,8 +780,8 @@ guest_physmap_add_entry(struct domain *d, gfn_t gfn, mfn_t 
mfn,
 
 if ( !paging_mode_translate(d) )
 return (need_iommu_pt_sync(d) && t == p2m_ram_rw) ?
-iommu_map(d, _dfn(mfn_x(mfn)), mfn, page_order,
-  IOMMUF_readable | IOMMUF_writable) : 0;
+iommu_legacy_map(d, _dfn(mfn_x(mfn)), mfn, page_order,
+ IOMMUF_readable | IOMMUF_writable) : 0;
 
 /* foreign pages are added thru p2m_add_foreign */
 if ( 

[Xen-devel] [PATCH v4 4/4] x86/mm/p2m: stop checking for IOMMU shared page tables in mmio_order()

2018-12-06 Thread Paul Durrant
Now that the iommu_map() and iommu_unmap() operations take an order
parameter and elide flushing there's no strong reason why modifying MMIO
ranges in the p2m should be restricted to a 4k granularity simply because
the IOMMU is enabled but shared page tables are not in operation.
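
As a rough illustration of the alignment check involved, here is a standalone sketch of just the first test in mmio_order(). It assumes PAGE_ORDER_2M is 9 (512 4k pages), uses a hypothetical name toy_mmio_order(), takes the HAP state as a plain boolean, and ignores any larger orders the real function may go on to consider.

```c
#include <assert.h>
#include <stdbool.h>

#define PAGE_ORDER_4K 0
#define PAGE_ORDER_2M 9   /* assumption: 2M = 512 pages of 4k */

/*
 * Hypothetical reduction of the check in the hunk below: fall back to 4k
 * unless HAP is enabled, the start frame is 2M-aligned, and at least one
 * full 2M chunk remains.
 */
static unsigned int toy_mmio_order(bool hap_enabled, unsigned long start_fn,
                                   unsigned long nr)
{
    if ( !hap_enabled ||
         (start_fn & ((1UL << PAGE_ORDER_2M) - 1)) ||
         !(nr >> PAGE_ORDER_2M) )
        return PAGE_ORDER_4K;

    return PAGE_ORDER_2M;
}
```

With these assumptions, an aligned range of 512 or more frames gets order 9 when HAP is enabled, and anything unaligned, shorter, or without HAP gets order 0.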

Signed-off-by: Paul Durrant 
Reviewed-by: Jan Beulich 
---
Cc: George Dunlap 
Cc: Andrew Cooper 
Cc: Wei Liu 
Cc: "Roger Pau Monné" 

v2:
 - New in v2. (Adapted from a previously independent patch).
---
 xen/arch/x86/mm/p2m.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
index ed76e96d33..a9cfd1b2e4 100644
--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -2059,13 +2059,12 @@ static unsigned int mmio_order(const struct domain *d,
unsigned long start_fn, unsigned long nr)
 {
 /*
- * Note that the !iommu_use_hap_pt() here has three effects:
- * - cover iommu_{,un}map_page() not having an "order" input yet,
+ * Note that the !hap_enabled() here has two effects:
  * - exclude shadow mode (which doesn't support large MMIO mappings),
  * - exclude PV guests, should execution reach this code for such.
  * So be careful when altering this.
  */
-if ( !iommu_use_hap_pt(d) ||
+if ( !hap_enabled(d) ||
  (start_fn & ((1UL << PAGE_ORDER_2M) - 1)) || !(nr >> PAGE_ORDER_2M) )
 return PAGE_ORDER_4K;
 
-- 
2.11.0



[Xen-devel] [PATCH v4 1/4] amd-iommu: add flush iommu_ops

2018-12-06 Thread Paul Durrant
The iommu_ops structure contains two methods for flushing: 'iotlb_flush' and
'iotlb_flush_all'. This patch adds implementations of these for AMD IOMMUs.

The iotlb_flush method takes a base DFN and a (4k) page count, but the
flush needs to be done by page order (i.e. 0, 9 or 18). Because a flush
operation is fairly expensive to perform, the code calculates the minimum
order single flush that will cover the specified page range rather than
performing multiple flushes.
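
To make the order selection concrete, here is a small standalone sketch. flush_count() is copied from the diff below (minus the ASSERT); pick_flush_order() is a hypothetical helper that is not part of the patch and simply returns the order the patch would flush with, or -1 where the patch falls back to flushing everything. It assumes page_count is non-zero, as the patch asserts.

```c
#include <assert.h>

/* Number of 2^order-page blocks touched by [dfn, dfn + page_count). */
static unsigned long flush_count(unsigned long dfn, unsigned int page_count,
                                 unsigned int order)
{
    unsigned long start = dfn >> order;
    unsigned long end = ((dfn + page_count - 1) >> order) + 1;

    return end - start;
}

/*
 * Hypothetical helper: smallest order (0, 9 or 18) for which a single
 * aligned block covers the range, or -1 if a full flush is needed.
 * Precondition: page_count != 0.
 */
static int pick_flush_order(unsigned long dfn, unsigned int page_count)
{
    if ( dfn + page_count < dfn )   /* range wraps: flush everything */
        return -1;

    if ( page_count == 1 )
        return 0;
    if ( flush_count(dfn, page_count, 9) == 1 )
        return 9;
    if ( flush_count(dfn, page_count, 18) == 1 )
        return 18;

    return -1;
}
```

For example, two pages starting at DFN 511 straddle a 2M boundary, so order 9 would need two flushes and the sketch picks order 18 instead; two pages straddling a 1G boundary fit no single block and give -1.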

Signed-off-by: Paul Durrant 
Reviewed-by: Jan Beulich 
---
Cc: Suravee Suthikulpanit 
Cc: Brian Woods 
Cc: Andrew Cooper 
Cc: Wei Liu 
Cc: "Roger Pau Monné" 

v4:
 - Fix flush_count() properly this time.

v3:
 - Really get rid of dfn_lt().
 - Fix flush_count().

v2:
 - Treat passing INVALID_DFN to iommu_iotlb_flush() as an error, and a zero
   page_count as a no-op.
 - Get rid of dfn_lt().
---
 xen/drivers/passthrough/amd/iommu_map.c   | 50 +++
 xen/drivers/passthrough/amd/pci_amd_iommu.c   |  2 ++
 xen/drivers/passthrough/iommu.c   |  6 +++-
 xen/drivers/passthrough/vtd/iommu.c   |  2 ++
 xen/include/asm-x86/hvm/svm/amd-iommu-proto.h |  3 ++
 5 files changed, 62 insertions(+), 1 deletion(-)

diff --git a/xen/drivers/passthrough/amd/iommu_map.c 
b/xen/drivers/passthrough/amd/iommu_map.c
index 2429e01bb4..de5a880070 100644
--- a/xen/drivers/passthrough/amd/iommu_map.c
+++ b/xen/drivers/passthrough/amd/iommu_map.c
@@ -634,6 +634,56 @@ int amd_iommu_unmap_page(struct domain *d, dfn_t dfn)
 spin_unlock(&hd->arch.mapping_lock);
 
 amd_iommu_flush_pages(d, dfn_x(dfn), 0);
+return 0;
+}
+
+static unsigned long flush_count(unsigned long dfn, unsigned int page_count,
+ unsigned int order)
+{
+unsigned long start = dfn >> order;
+unsigned long end = ((dfn + page_count - 1) >> order) + 1;
+
+ASSERT(end > start);
+return end - start;
+}
+
+int amd_iommu_flush_iotlb_pages(struct domain *d, dfn_t dfn,
+unsigned int page_count)
+{
+unsigned long dfn_l = dfn_x(dfn);
+
+ASSERT(page_count && !dfn_eq(dfn, INVALID_DFN));
+
+/* If the range wraps then just flush everything */
+if ( dfn_l + page_count < dfn_l )
+{
+amd_iommu_flush_all_pages(d);
+return 0;
+}
+
+/*
+ * Flushes are expensive so find the minimal single flush that will
+ * cover the page range.
+ *
+ * NOTE: It is unnecessary to round down the DFN value to align with
+ *   the flush order here. This is done by the internals of the
+ *   flush code.
+ */
+if ( page_count == 1 ) /* order 0 flush count */
+amd_iommu_flush_pages(d, dfn_l, 0);
+else if ( flush_count(dfn_l, page_count, 9) == 1 )
+amd_iommu_flush_pages(d, dfn_l, 9);
+else if ( flush_count(dfn_l, page_count, 18) == 1 )
+amd_iommu_flush_pages(d, dfn_l, 18);
+else
+amd_iommu_flush_all_pages(d);
+
+return 0;
+}
+
+int amd_iommu_flush_iotlb_all(struct domain *d)
+{
+amd_iommu_flush_all_pages(d);
 
 return 0;
 }
diff --git a/xen/drivers/passthrough/amd/pci_amd_iommu.c 
b/xen/drivers/passthrough/amd/pci_amd_iommu.c
index 900136390d..33a3798f36 100644
--- a/xen/drivers/passthrough/amd/pci_amd_iommu.c
+++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c
@@ -579,6 +579,8 @@ static const struct iommu_ops __initconstrel amd_iommu_ops 
= {
 .teardown = amd_iommu_domain_destroy,
 .map_page = amd_iommu_map_page,
 .unmap_page = amd_iommu_unmap_page,
+.iotlb_flush = amd_iommu_flush_iotlb_pages,
+.iotlb_flush_all = amd_iommu_flush_iotlb_all,
 .free_page_table = deallocate_page_table,
 .reassign_device = reassign_device,
 .get_device_group_id = amd_iommu_group_id,
diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
index ac62d7f52a..c1cce08551 100644
--- a/xen/drivers/passthrough/iommu.c
+++ b/xen/drivers/passthrough/iommu.c
@@ -414,9 +414,13 @@ int iommu_iotlb_flush(struct domain *d, dfn_t dfn, 
unsigned int page_count)
 const struct domain_iommu *hd = dom_iommu(d);
 int rc;
 
-if ( !iommu_enabled || !hd->platform_ops || !hd->platform_ops->iotlb_flush 
)
+if ( !iommu_enabled || !hd->platform_ops ||
+ !hd->platform_ops->iotlb_flush || !page_count )
 return 0;
 
+if ( dfn_eq(dfn, INVALID_DFN) )
+return -EINVAL;
+
 rc = hd->platform_ops->iotlb_flush(d, dfn, page_count);
 if ( unlikely(rc) )
 {
diff --git a/xen/drivers/passthrough/vtd/iommu.c 
b/xen/drivers/passthrough/vtd/iommu.c
index 1601278b07..d2fa5e2b25 100644
--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -635,6 +635,8 @@ static int __must_check iommu_flush_iotlb_pages(struct domain *d,
 dfn_t dfn,
 unsigned int page_count)
 {
+ASSERT(page_count && !dfn_eq(dfn, INVALID_DFN));
+
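
The order selection in amd_iommu_flush_iotlb_pages() earlier in this patch
can be illustrated with a small stand-alone sketch. The body of flush_count()
is not shown in this excerpt, so the version below is an assumption: it counts
how many naturally aligned 2^order-page regions the range touches. The enum
and pick_flush() are hypothetical names used only to make the decision
visible; they do not exist in the patch.

```c
#include <assert.h>

/* Hypothetical labels for the four outcomes of the if/else ladder. */
enum flush_kind { FLUSH_ORDER_0, FLUSH_ORDER_9, FLUSH_ORDER_18, FLUSH_ALL };

/*
 * Assumed behaviour of flush_count(): the number of naturally aligned
 * 2^order-page regions touched by the range [dfn, dfn + page_count).
 */
static unsigned long flush_count(unsigned long dfn, unsigned int page_count,
                                 unsigned int order)
{
    unsigned long start = dfn >> order;
    unsigned long end = ((dfn + page_count - 1) >> order) + 1;

    assert(page_count > 0 && end > start);
    return end - start;
}

/* Same ladder as in the patch, returning a label instead of flushing. */
static enum flush_kind pick_flush(unsigned long dfn, unsigned int page_count)
{
    if ( page_count == 1 )
        return FLUSH_ORDER_0;
    if ( flush_count(dfn, page_count, 9) == 1 )
        return FLUSH_ORDER_9;
    if ( flush_count(dfn, page_count, 18) == 1 )
        return FLUSH_ORDER_18;
    return FLUSH_ALL;
}
```

Under this assumption, a two-page range that straddles a 2MB boundary
(dfn 511, count 2) no longer fits one order-9 region, so it falls through to
the order-18 case, and a range straddling a 1GB boundary falls back to a full
flush.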
 

[Xen-devel] [PATCH v4 0/4] iommu improvements

2018-12-06 Thread Paul Durrant
Paul Durrant (4):
  amd-iommu: add flush iommu_ops
  iommu: rename wrapper functions
  iommu: elide flushing for higher order map/unmap operations
  x86/mm/p2m: stop checking for IOMMU shared page tables in mmio_order()

 xen/arch/arm/p2m.c                            |  11 ++-
 xen/arch/x86/mm.c                             |  11 ++-
 xen/arch/x86/mm/p2m-ept.c                     |   4 +-
 xen/arch/x86/mm/p2m-pt.c                      |   5 +-
 xen/arch/x86/mm/p2m.c                         |  17 ++--
 xen/arch/x86/x86_64/mm.c                      |   9 +-
 xen/common/grant_table.c                      |  14 +--
 xen/common/memory.c                           |   6 +-
 xen/drivers/passthrough/amd/iommu_map.c       | 135 --
 xen/drivers/passthrough/amd/pci_amd_iommu.c   |   2 +
 xen/drivers/passthrough/arm/smmu.c            |  15 ++-
 xen/drivers/passthrough/iommu.c               |  86 +---
 xen/drivers/passthrough/vtd/iommu.c           |  34 ---
 xen/drivers/passthrough/x86/iommu.c           |  25 +++--
 xen/include/asm-x86/hvm/svm/amd-iommu-proto.h |  10 +-
 xen/include/xen/iommu.h                       |  56 +--
 16 files changed, 325 insertions(+), 115 deletions(-)

-- 
2.11.0


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v3 3/4] iommu: elide flushing for higher order map/unmap operations

2018-12-06 Thread Paul Durrant
> -----Original Message-----
> From: Jan Beulich [mailto:jbeul...@suse.com]
> Sent: 06 December 2018 15:08
> To: Paul Durrant 
> Cc: Brian Woods; Suravee Suthikulpanit; Julien Grall; Andrew Cooper;
> Roger Pau Monne; Wei Liu; George Dunlap; Ian Jackson; Kevin Tian;
> Stefano Stabellini; xen-devel; Konrad Rzeszutek Wilk; Tim (Xen.org)
> Subject: Re: [PATCH v3 3/4] iommu: elide flushing for higher order
> map/unmap operations
> 
> >>> On 05.12.18 at 12:29,  wrote:
> > --- a/xen/common/memory.c
> > +++ b/xen/common/memory.c
> > @@ -865,11 +865,15 @@ int xenmem_add_to_physmap(struct domain *d, struct
> xen_add_to_physmap *xatp,
> >
> >  this_cpu(iommu_dont_flush_iotlb) = 0;
> >
> > -ret = iommu_flush(d, _dfn(xatp->idx - done), done);
> > +ret = iommu_iotlb_flush(d, _dfn(xatp->idx - done), done,
> > +IOMMU_FLUSHF_added |
> > +IOMMU_FLUSHF_modified);
> 
> No need to split these last two lines afaict, nor ...
> 
> >  if ( unlikely(ret) && rc >= 0 )
> >  rc = ret;
> >
> > -ret = iommu_flush(d, _dfn(xatp->gpfn - done), done);
> > +ret = iommu_iotlb_flush(d, _dfn(xatp->gpfn - done), done,
> > +IOMMU_FLUSHF_added |
> > +IOMMU_FLUSHF_modified);
> 
> ... these.
> 
> > @@ -573,18 +589,17 @@ int amd_iommu_map_page(struct domain *d, dfn_t
> dfn, mfn_t mfn,
> >  }
> >
> >  /* Install 4k mapping */
> > -need_flush = set_iommu_pte_present(pt_mfn[1], dfn_x(dfn),
> mfn_x(mfn), 1,
> > -   !!(flags & IOMMUF_writable),
> > -   !!(flags & IOMMUF_readable));
> > -
> > -if ( need_flush )
> > -amd_iommu_flush_pages(d, dfn_x(dfn), 0);
> > +*flush_flags |= set_iommu_pte_present(pt_mfn[1], dfn_x(dfn),
> mfn_x(mfn),
> > +  1, !!(flags &
> IOMMUF_writable),
> > +  !!(flags & IOMMUF_readable));
> 
> I don't think the !! here need retaining.
> 
> > @@ -235,6 +236,10 @@ void __hwdom_init iommu_hwdom_init(struct domain
> *d)
> >  process_pending_softirqs();
> >  }
> >
> > +/* Use while-break to avoid compiler warning */
> > +while ( !iommu_iotlb_flush_all(d, flush_flags) )
> > +break;
> 
> With just the "break;" as body, what's the ! good for?
> 
> > @@ -320,7 +326,8 @@ int iommu_legacy_map(struct domain *d, dfn_t dfn,
> mfn_t mfn,
> >  for ( i = 0; i < (1ul << page_order); i++ )
> >  {
> >  rc = hd->platform_ops->map_page(d, dfn_add(dfn, i),
> > -mfn_add(mfn, i), flags);
> > +mfn_add(mfn, i), flags,
> > +flush_flags);
> 
> Again no need for two lines here as it seems.
> 
> > @@ -345,7 +353,20 @@ int iommu_legacy_map(struct domain *d, dfn_t dfn,
> mfn_t mfn,
> >  return rc;
> >  }
> >
> > -int iommu_legacy_unmap(struct domain *d, dfn_t dfn, unsigned int
> page_order)
> > +int iommu_legacy_map(struct domain *d, dfn_t dfn, mfn_t mfn,
> > + unsigned int page_order, unsigned int flags)
> > +{
> > +unsigned int flush_flags = 0;
> > +int rc = iommu_map(d, dfn, mfn, page_order, flags, &flush_flags);
> > +
> > +if ( !rc && !this_cpu(iommu_dont_flush_iotlb) )
> > +rc = iommu_iotlb_flush(d, dfn, (1u << page_order),
> flush_flags);
> 
> The question was raised in a different context (but iirc this same
> series) already: Is it correct to skip flushing when failure occurred
> on other than the first page of a set? There's no rollback afaict,
> and even if there was the transiently available mappings would
> then still need purging. Same on the unmap side then. (Note that
> this is different from the arch_iommu_populate_page_table()
> case, where I/O can't be initiated yet by the guest.)

That's true... the code should respect the flush_flags even in the failure 
case. I'll send v4.

  Paul

> 
> > @@ -241,8 +245,10 @@ void __hwdom_init arch_iommu_hwdom_init(struct
> domain *d)
> >  if ( paging_mode_translate(d) )
> >  rc = set_identity_p2m_entry(d, pfn, p2m_access_rw, 0);
> >  else
> > -rc = iommu_legacy_map(d, _dfn(pfn), _mfn(pfn),
> PAGE_ORDER_4K,
> > -  IOMMUF_readable | IOMMUF_writable);
> > +rc = iommu_map(d, _dfn(pfn), _mfn(pfn), PAGE_ORDER_4K,
> > +   IOMMUF_readable | IOMMUF_writable,
> > +   &flush_flags);
> 
> Again overly aggressive line wrapping?
> 
> Jan
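
The point agreed in this exchange (flush whatever was changed, even when a
later page in the set fails) can be sketched as follows. This is not the v4
code: the flag values, stub_map_page(), stub_flush() and map_and_flush() are
all hypothetical stand-ins, and the per-CPU iommu_dont_flush_iotlb check is
left out for brevity.

```c
#include <assert.h>

#define FLUSHF_added    (1u << 0)   /* hypothetical flag values */
#define FLUSHF_modified (1u << 1)

static unsigned int fail_at = ~0u;  /* page index at which the stub fails */
static unsigned int pages_flushed;  /* page count passed to the last flush */

/* Stub per-page map: records that a mapping was added, or fails. */
static int stub_map_page(unsigned int i, unsigned int *flush_flags)
{
    if ( i == fail_at )
        return -1;

    *flush_flags |= FLUSHF_added;
    return 0;
}

/* Stub flush: only ever called when there is something to flush. */
static int stub_flush(unsigned int page_count, unsigned int flush_flags)
{
    assert(flush_flags != 0);
    pages_flushed = page_count;
    return 0;
}

static int map_and_flush(unsigned int page_order)
{
    unsigned int flush_flags = 0, i;
    int rc = 0;

    for ( i = 0; i < (1u << page_order); i++ )
    {
        rc = stub_map_page(i, &flush_flags);
        if ( rc )
            break;
    }

    /* Flush if anything changed, whether or not the loop completed. */
    if ( flush_flags )
    {
        int flush_rc = stub_flush(1u << page_order, flush_flags);

        if ( !rc )
            rc = flush_rc;
    }

    return rc;
}
```

With fail_at set to 2 and page_order 2, two pages are mapped before the
failure, so the flush still runs and the original error is returned. With
fail_at set to 0 nothing was changed and no flush is issued.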



[Xen-devel] [PATCH v2 13/18] xen: purge 'blk' and 'ioreq' from function names in dataplane/xen-block.c

2018-12-06 Thread Paul Durrant
This is a purely cosmetic patch that purges remaining use of 'blk' and
'ioreq' in local function names, and then makes sure all functions are
prefixed with 'xen_block_'.

No functional change.

Signed-off-by: Paul Durrant 
---
Cc: Stefano Stabellini 
Cc: Anthony Perard 
Cc: Stefan Hajnoczi 
Cc: Kevin Wolf 
Cc: Max Reitz 

v2:
 - Add 'xen_block_' prefix
---
 hw/block/dataplane/xen-block.c | 90 +-
 1 file changed, 46 insertions(+), 44 deletions(-)

diff --git a/hw/block/dataplane/xen-block.c b/hw/block/dataplane/xen-block.c
index 426e83c..8c451ae 100644
--- a/hw/block/dataplane/xen-block.c
+++ b/hw/block/dataplane/xen-block.c
@@ -73,7 +73,7 @@ struct XenBlockDataPlane {
 AioContext *ctx;
 };
 
-static void ioreq_reset(XenBlockRequest *request)
+static void reset_request(XenBlockRequest *request)
 {
 memset(&request->req, 0, sizeof(request->req));
 request->status = 0;
@@ -92,7 +92,7 @@ static void ioreq_reset(XenBlockRequest *request)
 qemu_iovec_reset(&request->v);
 }
 
-static XenBlockRequest *ioreq_start(XenBlockDataPlane *dataplane)
+static XenBlockRequest *xen_block_start_request(XenBlockDataPlane *dataplane)
 {
 XenBlockRequest *request = NULL;
 
@@ -117,7 +117,7 @@ out:
 return request;
 }
 
-static void ioreq_finish(XenBlockRequest *request)
+static void xen_block_finish_request(XenBlockRequest *request)
 {
 XenBlockDataPlane *dataplane = request->dataplane;
 
@@ -127,12 +127,12 @@ static void ioreq_finish(XenBlockRequest *request)
 dataplane->requests_finished++;
 }
 
-static void ioreq_release(XenBlockRequest *request, bool finish)
+static void xen_block_release_request(XenBlockRequest *request, bool finish)
 {
 XenBlockDataPlane *dataplane = request->dataplane;
 
 QLIST_REMOVE(request, list);
-ioreq_reset(request);
+reset_request(request);
 request->dataplane = dataplane;
 QLIST_INSERT_HEAD(&dataplane->freelist, request, list);
 if (finish) {
@@ -146,7 +146,7 @@ static void ioreq_release(XenBlockRequest *request, bool 
finish)
  * translate request into iovec + start offset
  * do sanity checks along the way
  */
-static int ioreq_parse(XenBlockRequest *request)
+static int xen_block_parse_request(XenBlockRequest *request)
 {
 XenBlockDataPlane *dataplane = request->dataplane;
 size_t len;
@@ -207,7 +207,7 @@ err:
 return -1;
 }
 
-static int ioreq_grant_copy(XenBlockRequest *request)
+static int xen_block_copy_request(XenBlockRequest *request)
 {
 XenBlockDataPlane *dataplane = request->dataplane;
 XenDevice *xendev = dataplane->xendev;
@@ -253,9 +253,9 @@ static int ioreq_grant_copy(XenBlockRequest *request)
 return 0;
 }
 
-static int ioreq_runio_qemu_aio(XenBlockRequest *request);
+static int xen_block_do_aio(XenBlockRequest *request);
 
-static void qemu_aio_complete(void *opaque, int ret)
+static void xen_block_complete_aio(void *opaque, int ret)
 {
 XenBlockRequest *request = opaque;
 XenBlockDataPlane *dataplane = request->dataplane;
@@ -272,7 +272,7 @@ static void qemu_aio_complete(void *opaque, int ret)
 request->aio_inflight--;
 if (request->presync) {
 request->presync = 0;
-ioreq_runio_qemu_aio(request);
+xen_block_do_aio(request);
 goto done;
 }
 if (request->aio_inflight > 0) {
@@ -283,7 +283,7 @@ static void qemu_aio_complete(void *opaque, int ret)
 case BLKIF_OP_READ:
 /* in case of failure request->aio_errors is increased */
 if (ret == 0) {
-ioreq_grant_copy(request);
+xen_block_copy_request(request);
 }
 qemu_vfree(request->buf);
 break;
@@ -299,7 +299,7 @@ static void qemu_aio_complete(void *opaque, int ret)
 }
 
 request->status = request->aio_errors ? BLKIF_RSP_ERROR : BLKIF_RSP_OKAY;
-ioreq_finish(request);
+xen_block_finish_request(request);
 
 switch (request->req.operation) {
 case BLKIF_OP_WRITE:
@@ -324,9 +324,9 @@ done:
 aio_context_release(dataplane->ctx);
 }
 
-static bool blk_split_discard(XenBlockRequest *request,
-  blkif_sector_t sector_number,
-  uint64_t nr_sectors)
+static bool xen_block_split_discard(XenBlockRequest *request,
+blkif_sector_t sector_number,
+uint64_t nr_sectors)
 {
 XenBlockDataPlane *dataplane = request->dataplane;
 int64_t byte_offset;
@@ -349,7 +349,7 @@ static bool blk_split_discard(XenBlockRequest *request,
 byte_chunk = byte_remaining > limit ? limit : byte_remaining;
 request->aio_inflight++;
 blk_aio_pdiscard(dataplane->blk, byte_offset, byte_chunk,
- qemu_aio_complete, request);
+ xen_block_complete_aio, request);
 byte_remaining -= byte_chunk;
 byte_offset += byte_chunk;
 } while (byte_remaining > 0);
@@ -357,7 +357,7 @@ static bool blk_split_discard(XenBlockRequest 

[Xen-devel] [PATCH v2 10/18] xen: add header and build dataplane/xen-block.c

2018-12-06 Thread Paul Durrant
This patch adds the transformations necessary to get dataplane/xen-block.c
to build against the new XenBus/XenDevice framework. MAINTAINERS is also
updated due to the introduction of dataplane/xen-block.h.

NOTE: Existing data structure names are retained for the moment. These will
  be modified by subsequent patches. A typedef for XenBlockDataPlane
  has been added to the header (based on the old struct XenBlkDev name
  for the moment) so that the old names don't need to leak out of the
  dataplane code.

Signed-off-by: Paul Durrant 
---
Cc: Stefan Hajnoczi 
Cc: Kevin Wolf 
Cc: Max Reitz 
Cc: Stefano Stabellini 
Cc: Anthony Perard 

v2:
 - Tidy up header inclusions
 - Get rid of error_fatal
---
 MAINTAINERS  |   1 +
 hw/block/dataplane/Makefile.objs |   1 +
 hw/block/dataplane/xen-block.c   | 356 ---
 hw/block/dataplane/xen-block.h   |  29 
 4 files changed, 287 insertions(+), 100 deletions(-)
 create mode 100644 hw/block/dataplane/xen-block.h

diff --git a/MAINTAINERS b/MAINTAINERS
index ab62ad4..9875581 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -408,6 +408,7 @@ F: hw/block/dataplane/xen*
 F: hw/xen/
 F: hw/xenpv/
 F: hw/i386/xen/
+F: include/hw/block/dataplane/xen*
 F: include/hw/xen/
 F: include/sysemu/xen-mapcache.h
 
diff --git a/hw/block/dataplane/Makefile.objs b/hw/block/dataplane/Makefile.objs
index e786f66..c6c68db 100644
--- a/hw/block/dataplane/Makefile.objs
+++ b/hw/block/dataplane/Makefile.objs
@@ -1 +1,2 @@
 obj-y += virtio-blk.o
+obj-$(CONFIG_XEN) += xen-block.o
diff --git a/hw/block/dataplane/xen-block.c b/hw/block/dataplane/xen-block.c
index 98f987d..20d16e7 100644
--- a/hw/block/dataplane/xen-block.c
+++ b/hw/block/dataplane/xen-block.c
@@ -24,65 +24,53 @@
  * See the COPYING file in the top-level directory.
  */
 
+#include "qemu/osdep.h"
+#include "qemu/error-report.h"
+#include "qapi/error.h"
+#include "hw/hw.h"
+#include "hw/xen/xen_common.h"
+#include "hw/block/xen_blkif.h"
+#include "sysemu/block-backend.h"
+#include "sysemu/iothread.h"
+#include "xen-block.h"
+
 struct ioreq {
-blkif_request_t req;
-int16_t status;
-
-/* parsed request */
-off_t   start;
-QEMUIOVectorv;
-void*buf;
-size_t  size;
-int presync;
-
-/* aio status */
-int aio_inflight;
-int aio_errors;
-
-struct XenBlkDev*blkdev;
-QLIST_ENTRY(ioreq)   list;
-BlockAcctCookie acct;
+blkif_request_t req;
+int16_t status;
+off_t start;
+QEMUIOVector v;
+void *buf;
+size_t size;
+int presync;
+int aio_inflight;
+int aio_errors;
+struct XenBlkDev *blkdev;
+QLIST_ENTRY(ioreq) list;
+BlockAcctCookie acct;
 };
 
-#define MAX_RING_PAGE_ORDER 4
-
 struct XenBlkDev {
-struct XenLegacyDevicexendev;  /* must be first */
-char*params;
-char*mode;
-char*type;
-char*dev;
-char*devtype;
-booldirectiosafe;
-const char  *fileproto;
-const char  *filename;
-unsigned intring_ref[1 << MAX_RING_PAGE_ORDER];
-unsigned intnr_ring_ref;
-void*sring;
-int64_t file_blk;
-int64_t file_size;
-int protocol;
-blkif_back_rings_t  rings;
-int more_work;
-
-/* request lists */
+XenDevice *xendev;
+XenEventChannel *event_channel;
+unsigned int *ring_ref;
+unsigned int nr_ring_ref;
+void *sring;
+int64_t file_blk;
+int64_t file_size;
+int protocol;
+blkif_back_rings_t rings;
+int more_work;
 QLIST_HEAD(inflight_head, ioreq) inflight;
 QLIST_HEAD(finished_head, ioreq) finished;
 QLIST_HEAD(freelist_head, ioreq) freelist;
-int requests_total;
-int requests_inflight;
-int requests_finished;
-unsigned intmax_requests;
-
-gbooleanfeature_discard;
-
-/* qemu block driver */
-DriveInfo   *dinfo;
-BlockBackend*blk;
-QEMUBH  *bh;
-
-IOThread*iothread;
-AioContext  *ctx;
+int requests_total;
+int requests_inflight;
+int requests_finished;
+unsigned int max_requests;
+BlockBackend *blk;
+QEMUBH *bh;
+IOThread *iothread;
+AioContext *ctx;
 };
 
 static void ioreq_reset(struct ioreq *ioreq)
@@ -161,7 +149,6 @@ static void ioreq_release(struct ioreq *ioreq, bool finish)
 static int ioreq_parse(struct ioreq *ioreq)
 {
 struct XenBlkDev *blkdev = ioreq->blkdev;
-struct XenLegacyDevice *xendev = &blkdev->xendev;
 size_t len;
 int i;
 
@@ -183,7 +170,8 @@ static int ioreq_parse(struct ioreq *ioreq)
 goto err;
 };
 
-if (ioreq->req.operation 

[Xen-devel] [PATCH v2 16/18] xen: automatically create XenBlockDevice-s

2018-12-06 Thread Paul Durrant
This patch adds a creator function for XenBlockDevice-s so that they can
be created automatically when the Xen toolstack instantiates a new
PV backend. When the XenBlockDevice is created this way it is also
necessary to create a drive which matches the configuration that the Xen
toolstack has written into xenstore. This drive is marked 'auto_del' so
that it will be removed when the XenBlockDevice is destroyed. Also, for
compatibility with the legacy 'xen_disk' implementation, an iothread
is automatically created for the new XenBlockDevice. This will also be
removed when the XenBlockDevice is destroyed.

Correspondingly the legacy backend scan for 'qdisk' is removed.

After this patch is applied the legacy 'xen_disk' code is redundant. It
will be removed by a subsequent patch.

Signed-off-by: Paul Durrant 
---
Cc: Kevin Wolf 
Cc: Max Reitz 
Cc: Stefano Stabellini 
Cc: Anthony Perard 

v2:
 - Get rid of error_abort
 - Don't use qdev_init_nofail
 - Explain why file locking needs to be off
---
 hw/block/trace-events   |   1 +
 hw/block/xen-block.c| 262 +++-
 hw/xen/xen-bus.c|   2 +-
 hw/xen/xen-legacy-backend.c |   1 -
 include/hw/xen/xen-block.h  |   1 +
 5 files changed, 264 insertions(+), 3 deletions(-)

diff --git a/hw/block/trace-events b/hw/block/trace-events
index 89e2583..a89c8a6 100644
--- a/hw/block/trace-events
+++ b/hw/block/trace-events
@@ -137,3 +137,4 @@ xen_disk_realize(void) ""
 xen_disk_unrealize(void) ""
 xen_cdrom_realize(void) ""
 xen_cdrom_unrealize(void) ""
+xen_block_device_create(const char *name) "name: %s"
diff --git a/hw/block/xen-block.c b/hw/block/xen-block.c
index fc64aaf..2430dae 100644
--- a/hw/block/xen-block.c
+++ b/hw/block/xen-block.c
@@ -7,12 +7,15 @@
 
 #include "qemu/osdep.h"
 #include "qemu/cutils.h"
+#include "qemu/option.h"
 #include "qapi/error.h"
 #include "qapi/visitor.h"
+#include "qapi/qmp/qdict.h"
 #include "hw/hw.h"
 #include "hw/xen/xen_common.h"
 #include "hw/block/xen_blkif.h"
 #include "hw/xen/xen-block.h"
+#include "hw/xen/xen-backend.h"
 #include "sysemu/blockdev.h"
 #include "sysemu/block-backend.h"
 #include "sysemu/iothread.h"
@@ -135,6 +138,11 @@ static void xen_block_unrealize(XenDevice *xendev, Error 
**errp)
 xen_block_dataplane_destroy(blockdev->dataplane);
 blockdev->dataplane = NULL;
 
+if (blockdev->auto_iothread) {
+iothread_destroy(blockdev->auto_iothread);
+blockdev->auto_iothread = NULL;
+}
+
 if (blockdev_class->unrealize) {
 blockdev_class->unrealize(blockdev, &local_err);
 if (local_err) {
@@ -152,6 +160,8 @@ static void xen_block_realize(XenDevice *xendev, Error **errp)
 XenBlockVdev *vdev = &blockdev->vdev;
 Error *local_err = NULL;
 BlockConf *conf = &blockdev->conf;
+IOThread *iothread = blockdev->auto_iothread ?
+blockdev->auto_iothread : blockdev->iothread;
 
 if (vdev->type == XEN_BLOCK_VDEV_TYPE_INVALID) {
 error_setg(errp, "vdev property not set");
@@ -218,7 +228,7 @@ static void xen_block_realize(XenDevice *xendev, Error 
**errp)
   conf->logical_block_size);
 
 blockdev->dataplane = xen_block_dataplane_create(xendev, conf,
- blockdev->iothread);
+ iothread);
 }
 
 static void xen_block_frontend_changed(XenDevice *xendev,
@@ -480,6 +490,8 @@ static void xen_block_class_init(ObjectClass *class, void 
*data)
 DeviceClass *dev_class = DEVICE_CLASS(class);
 XenDeviceClass *xendev_class = XEN_DEVICE_CLASS(class);
 
+xendev_class->backend = "qdisk";
+xendev_class->device = "vbd";
 xendev_class->get_name = xen_block_get_name;
 xendev_class->realize = xen_block_realize;
 xendev_class->frontend_changed = xen_block_frontend_changed;
@@ -591,3 +603,251 @@ static void xen_block_register_types(void)
 }
 
 type_init(xen_block_register_types)
+
+static void xen_block_drive_create(const char *id, const char *device_type,
+   QDict *opts, Error **errp)
+{
+const char *params = qdict_get_try_str(opts, "params");
+const char *mode = qdict_get_try_str(opts, "mode");
+const char *direct_io_safe = qdict_get_try_str(opts, "direct-io-safe");
+const char *discard_enable = qdict_get_try_str(opts, "discard-enable");
+char *format = NULL;
+char *file = NULL;
+char *drive_optstr = NULL;
+QemuOpts *drive_opts;
+Error *local_err = NULL;
+
+if (params) {
+char **v = g_strsplit(params, ":", 2);
+
+if (v[1] == NULL) {
+file = g_strdup(v[0]);
+} else {
+if (strcmp(v[0], "aio") == 0) {
+format = g_strdup("raw");
+} else if (strcmp(v[0], "vhd") == 0) {
+format = g_strdup("vpc");
+} else {
+format = g_strdup(v[0]);
+}
+file = g_strdup(v[1]);
+}
+
+
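
The handling of the xenstore 'params' value at the start of
xen_block_drive_create() can be illustrated without GLib. The sketch below
assumes the same rule as the visible part of the patch: split at the first
':' only, map the 'aio' prefix to the 'raw' format and 'vhd' to 'vpc', pass
any other prefix through unchanged, and treat a value with no ':' as a bare
file name. parse_params() and its buffer-based interface are hypothetical and
not part of the patch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Split "<prefix>:<file>" or "<file>" into a block format name and a
 * file name. 'format' is left empty when no prefix is present.
 */
static void parse_params(const char *params,
                         char *format, size_t format_len,
                         char *file, size_t file_len)
{
    const char *colon = strchr(params, ':');
    size_t prefix_len;

    assert(format_len > 0 && file_len > 0);
    format[0] = '\0';

    if (colon == NULL) {
        snprintf(file, file_len, "%s", params);
        return;
    }

    prefix_len = (size_t)(colon - params);
    if (prefix_len == 3 && strncmp(params, "aio", 3) == 0) {
        snprintf(format, format_len, "raw");
    } else if (prefix_len == 3 && strncmp(params, "vhd", 3) == 0) {
        snprintf(format, format_len, "vpc");
    } else {
        snprintf(format, format_len, "%.*s", (int)prefix_len, params);
    }

    /* Everything after the first ':' is the file, later ':' included. */
    snprintf(file, file_len, "%s", colon + 1);
}
```

For instance, "aio:/dev/vg/disk" yields format "raw" and file
"/dev/vg/disk", while "/plain.img" yields an empty format and the file name
as given.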

[Xen-devel] [PATCH v2 17/18] MAINTAINERS: add myself as a Xen maintainer

2018-12-06 Thread Paul Durrant
I have made many significant contributions to the Xen code in QEMU,
particularly the recent patches introducing a new PV device framework.
I intend to make further significant contributions, porting other PV
backends to the new framework with the intent of eventually removing the
legacy code. It therefore seems reasonable that I become a maintainer of
the Xen code.

Signed-off-by: Paul Durrant 
Acked-by: Anthony Perard 
Acked-by: Stefano Stabellini 
---
Cc: Paolo Bonzini 

v2:
 - Fix typo
---
 MAINTAINERS | 1 +
 1 file changed, 1 insertion(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 9875581..e6bd441 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -396,6 +396,7 @@ Guest CPU Cores (Xen):
 X86
 M: Stefano Stabellini 
 M: Anthony Perard 
+M: Paul Durrant 
 L: xen-devel@lists.xenproject.org
 S: Supported
 F: */xen*
-- 
2.1.4



[Xen-devel] [PATCH v2 18/18] xen: remove the legacy 'xen_disk' backend

2018-12-06 Thread Paul Durrant
This backend has now been replaced by the 'xen-qdisk' XenDevice.

Signed-off-by: Paul Durrant 
---
Cc: Kevin Wolf 
Cc: Max Reitz 
Cc: Anthony Perard 
Cc: Stefano Stabellini 
---
 hw/block/Makefile.objs |1 -
 hw/block/xen_disk.c| 1011 
 2 files changed, 1012 deletions(-)
 delete mode 100644 hw/block/xen_disk.c

diff --git a/hw/block/Makefile.objs b/hw/block/Makefile.objs
index f34813a..e206b8e 100644
--- a/hw/block/Makefile.objs
+++ b/hw/block/Makefile.objs
@@ -5,7 +5,6 @@ common-obj-$(CONFIG_NAND) += nand.o
 common-obj-$(CONFIG_PFLASH_CFI01) += pflash_cfi01.o
 common-obj-$(CONFIG_PFLASH_CFI02) += pflash_cfi02.o
 common-obj-$(CONFIG_XEN) += xen-block.o
-common-obj-$(CONFIG_XEN) += xen_disk.o
 common-obj-$(CONFIG_ECC) += ecc.o
 common-obj-$(CONFIG_ONENAND) += onenand.o
 common-obj-$(CONFIG_NVME_PCI) += nvme.o
diff --git a/hw/block/xen_disk.c b/hw/block/xen_disk.c
deleted file mode 100644
index 75fe55f..0000000
--- a/hw/block/xen_disk.c
+++ /dev/null
@@ -1,1011 +0,0 @@
-/*
- *  xen paravirt block device backend
- *
- *  (c) Gerd Hoffmann 
- *
- *  This program is free software; you can redistribute it and/or modify
- *  it under the terms of the GNU General Public License as published by
- *  the Free Software Foundation; under version 2 of the License.
- *
- *  This program is distributed in the hope that it will be useful,
- *  but WITHOUT ANY WARRANTY; without even the implied warranty of
- *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
- *  GNU General Public License for more details.
- *
- *  You should have received a copy of the GNU General Public License along
- *  with this program; if not, see .
- *
- *  Contributions after 2012-01-13 are licensed under the terms of the
- *  GNU GPL, version 2 or (at your option) any later version.
- */
-
-#include "qemu/osdep.h"
-#include "qemu/units.h"
-#include 
-#include 
-
-#include "hw/hw.h"
-#include "hw/xen/xen-legacy-backend.h"
-#include "xen_blkif.h"
-#include "sysemu/blockdev.h"
-#include "sysemu/iothread.h"
-#include "sysemu/block-backend.h"
-#include "qapi/error.h"
-#include "qapi/qmp/qdict.h"
-#include "qapi/qmp/qstring.h"
-#include "trace.h"
-
-/* - */
-
-#define BLOCK_SIZE  512
-#define IOCB_COUNT  (BLKIF_MAX_SEGMENTS_PER_REQUEST + 2)
-
-struct ioreq {
-blkif_request_t req;
-int16_t status;
-
-/* parsed request */
-off_t   start;
-QEMUIOVectorv;
-void*buf;
-size_t  size;
-int presync;
-
-/* aio status */
-int aio_inflight;
-int aio_errors;
-
-struct XenBlkDev*blkdev;
-QLIST_ENTRY(ioreq)   list;
-BlockAcctCookie acct;
-};
-
-#define MAX_RING_PAGE_ORDER 4
-
-struct XenBlkDev {
-struct XenLegacyDevicexendev;  /* must be first */
-char*params;
-char*mode;
-char*type;
-char*dev;
-char*devtype;
-booldirectiosafe;
-const char  *fileproto;
-const char  *filename;
-unsigned intring_ref[1 << MAX_RING_PAGE_ORDER];
-unsigned intnr_ring_ref;
-void*sring;
-int64_t file_blk;
-int64_t file_size;
-int protocol;
-blkif_back_rings_t  rings;
-int more_work;
-
-/* request lists */
-QLIST_HEAD(inflight_head, ioreq) inflight;
-QLIST_HEAD(finished_head, ioreq) finished;
-QLIST_HEAD(freelist_head, ioreq) freelist;
-int requests_total;
-int requests_inflight;
-int requests_finished;
-unsigned intmax_requests;
-
-gbooleanfeature_discard;
-
-/* qemu block driver */
-DriveInfo   *dinfo;
-BlockBackend*blk;
-QEMUBH  *bh;
-
-IOThread*iothread;
-AioContext  *ctx;
-};
-
-/* - */
-
-static void ioreq_reset(struct ioreq *ioreq)
-{
-memset(&ioreq->req, 0, sizeof(ioreq->req));
-ioreq->status = 0;
-ioreq->start = 0;
-ioreq->buf = NULL;
-ioreq->size = 0;
-ioreq->presync = 0;
-
-ioreq->aio_inflight = 0;
-ioreq->aio_errors = 0;
-
-ioreq->blkdev = NULL;
-memset(&ioreq->list, 0, sizeof(ioreq->list));
-memset(&ioreq->acct, 0, sizeof(ioreq->acct));
-
-qemu_iovec_reset(&ioreq->v);
-}
-
-static struct ioreq *ioreq_start(struct XenBlkDev *blkdev)
-{
-struct ioreq *ioreq = NULL;
-
-if (QLIST_EMPTY(&blkdev->freelist)) {
-if (blkdev->requests_total >= blkdev->max_requests) {
-goto out;
-}
-/* allocate new struct */
-ioreq = g_malloc0(sizeof(*ioreq));
-ioreq->blkdev = blkdev;
-

[Xen-devel] [PATCH v2 11/18] xen: remove 'XenBlkDev' and 'blkdev' names from dataplane/xen-block

2018-12-06 Thread Paul Durrant
This is a purely cosmetic patch that substitutes the old 'struct XenBlkDev'
name with 'XenBlockDataPlane' and 'blkdev' field/variable names with
'dataplane', and then does necessary fix-up to adhere to coding style.

No functional change.

Signed-off-by: Paul Durrant 
Acked-by: Anthony Perard 
---
Cc: Stefano Stabellini 
Cc: Stefan Hajnoczi 
Cc: Kevin Wolf 
Cc: Max Reitz 
---
 hw/block/dataplane/xen-block.c | 352 +
 hw/block/dataplane/xen-block.h |   2 +-
 2 files changed, 183 insertions(+), 171 deletions(-)

diff --git a/hw/block/dataplane/xen-block.c b/hw/block/dataplane/xen-block.c
index 20d16e7..6ecd160 100644
--- a/hw/block/dataplane/xen-block.c
+++ b/hw/block/dataplane/xen-block.c
@@ -44,12 +44,12 @@ struct ioreq {
 int presync;
 int aio_inflight;
 int aio_errors;
-struct XenBlkDev *blkdev;
+XenBlockDataPlane *dataplane;
 QLIST_ENTRY(ioreq) list;
 BlockAcctCookie acct;
 };
 
-struct XenBlkDev {
+struct XenBlockDataPlane {
 XenDevice *xendev;
 XenEventChannel *event_channel;
 unsigned int *ring_ref;
@@ -85,33 +85,33 @@ static void ioreq_reset(struct ioreq *ioreq)
 ioreq->aio_inflight = 0;
 ioreq->aio_errors = 0;
 
-ioreq->blkdev = NULL;
+ioreq->dataplane = NULL;
 memset(&ioreq->list, 0, sizeof(ioreq->list));
 memset(&ioreq->acct, 0, sizeof(ioreq->acct));
 
 qemu_iovec_reset(&ioreq->v);
 }
 
-static struct ioreq *ioreq_start(struct XenBlkDev *blkdev)
+static struct ioreq *ioreq_start(XenBlockDataPlane *dataplane)
 {
 struct ioreq *ioreq = NULL;
 
-if (QLIST_EMPTY(&blkdev->freelist)) {
-if (blkdev->requests_total >= blkdev->max_requests) {
+if (QLIST_EMPTY(&dataplane->freelist)) {
+if (dataplane->requests_total >= dataplane->max_requests) {
 goto out;
 }
 /* allocate new struct */
 ioreq = g_malloc0(sizeof(*ioreq));
-ioreq->blkdev = blkdev;
-blkdev->requests_total++;
+ioreq->dataplane = dataplane;
+dataplane->requests_total++;
 qemu_iovec_init(&ioreq->v, 1);
 } else {
 /* get one from freelist */
-ioreq = QLIST_FIRST(&blkdev->freelist);
+ioreq = QLIST_FIRST(&dataplane->freelist);
 QLIST_REMOVE(ioreq, list);
 }
-QLIST_INSERT_HEAD(&blkdev->inflight, ioreq, list);
-blkdev->requests_inflight++;
+QLIST_INSERT_HEAD(&dataplane->inflight, ioreq, list);
+dataplane->requests_inflight++;
 
 out:
 return ioreq;
@@ -119,26 +119,26 @@ out:
 
 static void ioreq_finish(struct ioreq *ioreq)
 {
-struct XenBlkDev *blkdev = ioreq->blkdev;
+XenBlockDataPlane *dataplane = ioreq->dataplane;
 
 QLIST_REMOVE(ioreq, list);
-QLIST_INSERT_HEAD(&blkdev->finished, ioreq, list);
-blkdev->requests_inflight--;
-blkdev->requests_finished++;
+QLIST_INSERT_HEAD(&dataplane->finished, ioreq, list);
+dataplane->requests_inflight--;
+dataplane->requests_finished++;
 }
 
 static void ioreq_release(struct ioreq *ioreq, bool finish)
 {
-struct XenBlkDev *blkdev = ioreq->blkdev;
+XenBlockDataPlane *dataplane = ioreq->dataplane;
 
 QLIST_REMOVE(ioreq, list);
 ioreq_reset(ioreq);
-ioreq->blkdev = blkdev;
-QLIST_INSERT_HEAD(&blkdev->freelist, ioreq, list);
+ioreq->dataplane = dataplane;
+QLIST_INSERT_HEAD(&dataplane->freelist, ioreq, list);
 if (finish) {
-blkdev->requests_finished--;
+dataplane->requests_finished--;
 } else {
-blkdev->requests_inflight--;
+dataplane->requests_inflight--;
 }
 }
 
@@ -148,7 +148,7 @@ static void ioreq_release(struct ioreq *ioreq, bool finish)
  */
 static int ioreq_parse(struct ioreq *ioreq)
 {
-struct XenBlkDev *blkdev = ioreq->blkdev;
+XenBlockDataPlane *dataplane = ioreq->dataplane;
 size_t len;
 int i;
 
@@ -171,12 +171,12 @@ static int ioreq_parse(struct ioreq *ioreq)
 };
 
 if (ioreq->req.operation != BLKIF_OP_READ &&
-blk_is_read_only(blkdev->blk)) {
+blk_is_read_only(dataplane->blk)) {
 error_report("error: write req for ro device");
 goto err;
 }
 
-ioreq->start = ioreq->req.sector_number * blkdev->file_blk;
+ioreq->start = ioreq->req.sector_number * dataplane->file_blk;
 for (i = 0; i < ioreq->req.nr_segments; i++) {
 if (i == BLKIF_MAX_SEGMENTS_PER_REQUEST) {
 error_report("error: nr_segments too big");
@@ -186,16 +186,16 @@ static int ioreq_parse(struct ioreq *ioreq)
 error_report("error: first > last sector");
 goto err;
 }
-if (ioreq->req.seg[i].last_sect * blkdev->file_blk >= XC_PAGE_SIZE) {
+if (ioreq->req.seg[i].last_sect * dataplane->file_blk >= XC_PAGE_SIZE) {
 error_report("error: page crossing");
 goto err;
 }
 
 len = (ioreq->req.seg[i].last_sect -
-   ioreq->req.seg[i].first_sect + 1) * blkdev->file_blk;
+   ioreq->req.seg[i].first_sect + 1) * dataplane->file_blk;
 ioreq->size += len;
 }
-if (ioreq->start + ioreq->size > 
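
The per-segment sanity checks in ioreq_parse() (first sector not after last
sector, last sector still inside the granted page, then accumulate the
length) can be tried in isolation with the sketch below. The struct, the
function name and the 4096-byte page size are assumptions made for
illustration; the BLKIF_MAX_SEGMENTS_PER_REQUEST bound and the final
end-of-device check from the patch are omitted.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SIZE_BYTES 4096u   /* assumed value of XC_PAGE_SIZE */

/* Hypothetical, reduced view of one blkif request segment. */
struct segment {
    unsigned int first_sect;
    unsigned int last_sect;
};

/*
 * Return the total payload size in bytes, or -1 if any segment is
 * malformed. 'sector_size' plays the role of file_blk in the patch.
 */
static long request_size(const struct segment *seg, size_t nr_segments,
                         unsigned int sector_size)
{
    long size = 0;
    size_t i;

    assert(sector_size > 0);

    for (i = 0; i < nr_segments; i++) {
        if (seg[i].first_sect > seg[i].last_sect) {
            return -1;          /* "first > last sector" */
        }
        if (seg[i].last_sect * sector_size >= PAGE_SIZE_BYTES) {
            return -1;          /* "page crossing" */
        }
        size += (long)(seg[i].last_sect - seg[i].first_sect + 1) *
                sector_size;
    }

    return size;
}
```

With 512-byte sectors, sector indices 0 to 7 cover exactly one page, so a
segment with last_sect equal to 8 is rejected as page crossing.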

[Xen-devel] [PATCH v2 15/18] xen: add a mechanism to automatically create XenDevice-s...

2018-12-06 Thread Paul Durrant
...that maintains compatibility with existing Xen toolstacks.

Xen toolstacks instantiate PV backends by simply writing information into
xenstore and expecting a backend implementation to be watching for this.

This patch adds a new 'xen-backend' module to allow individual XenDevice
implementations to register a creator function to be called when a
toolstack instantiates a new backend in this way.

To support this it is also necessary to add new watchers into the XenBus
implementation to handle enumeration of new backends and also destruction
of XenDevice-s when the toolstack sets the backend 'online' key to 0.

NOTE: This patch only adds the framework. A subsequent patch will add a
  creator function for xen-block devices.

Signed-off-by: Paul Durrant 
---
Cc: Stefano Stabellini 
Cc: Anthony Perard 

v2:
 - Sort out error paths and error reporting
---
 hw/xen/Makefile.objs |   2 +-
 hw/xen/trace-events  |   5 +
 hw/xen/xen-backend.c |  69 +
 hw/xen/xen-bus.c | 226 +++
 include/hw/xen/xen-backend.h |  26 +
 include/hw/xen/xen-bus.h |   5 +-
 include/qemu/module.h|   3 +
 7 files changed, 315 insertions(+), 21 deletions(-)
 create mode 100644 hw/xen/xen-backend.c
 create mode 100644 include/hw/xen/xen-backend.h

diff --git a/hw/xen/Makefile.objs b/hw/xen/Makefile.objs
index 77c0868..84df60a 100644
--- a/hw/xen/Makefile.objs
+++ b/hw/xen/Makefile.objs
@@ -1,5 +1,5 @@
 # xen backend driver support
-common-obj-$(CONFIG_XEN) += xen-legacy-backend.o xen_devconfig.o xen_pvdev.o 
xen-common.o xen-bus.o xen-bus-helper.o
+common-obj-$(CONFIG_XEN) += xen-legacy-backend.o xen_devconfig.o xen_pvdev.o 
xen-common.o xen-bus.o xen-bus-helper.o xen-backend.o
 
 obj-$(CONFIG_XEN_PCI_PASSTHROUGH) += xen-host-pci-device.o
 obj-$(CONFIG_XEN_PCI_PASSTHROUGH) += xen_pt.o xen_pt_config_init.o 
xen_pt_graphics.o xen_pt_msi.o
diff --git a/hw/xen/trace-events b/hw/xen/trace-events
index 22055b5..d567242 100644
--- a/hw/xen/trace-events
+++ b/hw/xen/trace-events
@@ -16,13 +16,18 @@ xen_domid_restrict(int err) "err: %u"
 # include/hw/xen/xen-bus.c
 xen_bus_realize(void) ""
 xen_bus_unrealize(void) ""
+xen_bus_enumerate(void) ""
+xen_bus_type_enumerate(const char *type) "type: %s"
+xen_bus_backend_create(const char *type, const char *path) "type: %s path: %s"
 xen_bus_add_watch(const char *node, const char *key, char *token) "node: %s 
key: %s token: %s"
 xen_bus_remove_watch(const char *node, const char *key, char *token) "node: %s 
key: %s token: %s"
 xen_bus_watch(const char *token) "token: %s"
 xen_device_realize(const char *type, char *name) "type: %s name: %s"
 xen_device_unrealize(const char *type, char *name) "type: %s name: %s"
 xen_device_backend_state(const char *type, char *name, const char *state) 
"type: %s name: %s -> %s"
+xen_device_backend_online(const char *type, char *name, bool online) "type: %s 
name: %s -> %u"
 xen_device_frontend_state(const char *type, char *name, const char *state) 
"type: %s name: %s -> %s"
+xen_device_backend_changed(const char *type, char *name) "type: %s name: %s"
 xen_device_frontend_changed(const char *type, char *name) "type: %s name: %s"
 
 # include/hw/xen/xen-bus-helper.c
diff --git a/hw/xen/xen-backend.c b/hw/xen/xen-backend.c
new file mode 100644
index 0000000..d87e6ec
--- /dev/null
+++ b/hw/xen/xen-backend.c
@@ -0,0 +1,69 @@
+/*
+ * Copyright (c) 2018  Citrix Systems Inc.
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/error-report.h"
+#include "hw/xen/xen-backend.h"
+
+typedef struct XenBackendImpl {
+const char *type;
+XenBackendDeviceCreate create;
+} XenBackendImpl;
+
+static GHashTable *xen_backend_table_get(void)
+{
+static GHashTable *table;
+
+if (table == NULL) {
+table = g_hash_table_new(g_str_hash, g_str_equal);
+}
+
+return table;
+}
+
+static void xen_backend_table_add(XenBackendImpl *impl)
+{
+g_hash_table_insert(xen_backend_table_get(), (void *)impl->type, impl);
+}
+
+static XenBackendImpl *xen_backend_table_lookup(const char *type)
+{
+return g_hash_table_lookup(xen_backend_table_get(), type);
+}
+
+void xen_backend_register(const XenBackendInfo *info)
+{
+XenBackendImpl *impl = g_new0(XenBackendImpl, 1);
+
+g_assert(info->type);
+
+if (xen_backend_table_lookup(info->type)) {
+error_report("attempt to register duplicate Xen backend type '%s'",
+ info->type);
+abort();
+}
+
+if (!info->create) {
+error_report("backend type '%s' has no creator", info->type);
+abort();
+}
+
+impl->type = info->type;
+impl->create = info->create;
+
+xen_backend_table_add(impl);
+}
+
+void xen_backend_device_create(BusState *bus, const char *type,
+   const char *name, QDict *opts, Error 

[Xen-devel] [PATCH v2 14/18] xen: add implementations of xen-block connect and disconnect functions...

2018-12-06 Thread Paul Durrant
...and wire in the dataplane.

This patch adds the remaining code to make the xen-block XenDevice
functional. The parameters that a block frontend expects to find are
populated in the backend xenstore area, and the 'ring-ref' and
'event-channel' values specified in the frontend xenstore area are
mapped/bound and used to set up the dataplane.
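
The ring reference lookup described above can be sketched in isolation. This is only an illustration under stated assumptions: the `demo_*` helpers are hypothetical and do not exist in the patch; they model how the presence of 'ring-page-order' appears to select between a single 'ring-ref' key and a set of numbered 'ring-refN' keys.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, not part of the patch: work out how many ring
 * references the backend should read from the frontend area.
 *
 * have_order: non-zero if the frontend wrote "ring-page-order"
 * order:      value read from that key (ignored if have_order is zero)
 * max_order:  largest order the backend is prepared to accept
 *
 * Returns the number of ring references, or 0 if the order is invalid.
 */
static unsigned int demo_nr_ring_ref(int have_order, unsigned int order,
                                     unsigned int max_order)
{
    if (!have_order) {
        return 1;           /* single page, key is plain "ring-ref" */
    }
    if (order > max_order) {
        return 0;           /* invalid ring-page-order */
    }
    return 1u << order;     /* keys are "ring-ref0", "ring-ref1", ... */
}

/* Format the xenstore key name for ring reference i into buf. */
static void demo_ring_ref_key(int have_order, unsigned int i,
                              char *buf, size_t len)
{
    assert(buf != NULL && len > 0);

    if (have_order) {
        snprintf(buf, len, "ring-ref%u", i);
    } else {
        snprintf(buf, len, "ring-ref");
    }
}
```

With an order of 2 and a maximum of 4, for example, the sketch yields four references named 'ring-ref0' to 'ring-ref3'.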

Signed-off-by: Paul Durrant 
---
Cc: Stefano Stabellini 
Cc: Anthony Perard 
Cc: Kevin Wolf 
Cc: Max Reitz 

v2:
 - Tidy up header inclusions
 - Stop leaking ring_ref on error
 - Auto-create drive for CDRom devices
---
 hw/block/xen-block.c   | 164 +
 hw/xen/xen-bus.c   |  12 ++--
 include/hw/xen/xen-block.h |   9 +++
 include/hw/xen/xen-bus.h   |  10 +++
 4 files changed, 189 insertions(+), 6 deletions(-)

diff --git a/hw/block/xen-block.c b/hw/block/xen-block.c
index d2334ef..fc64aaf 100644
--- a/hw/block/xen-block.c
+++ b/hw/block/xen-block.c
@@ -10,7 +10,13 @@
 #include "qapi/error.h"
 #include "qapi/visitor.h"
 #include "hw/hw.h"
+#include "hw/xen/xen_common.h"
+#include "hw/block/xen_blkif.h"
 #include "hw/xen/xen-block.h"
+#include "sysemu/blockdev.h"
+#include "sysemu/block-backend.h"
+#include "sysemu/iothread.h"
+#include "dataplane/xen-block.h"
 #include "trace.h"
 
 static char *xen_block_get_name(XenDevice *xendev, Error **errp)
@@ -28,6 +34,8 @@ static void xen_block_disconnect(XenDevice *xendev, Error **errp)
 XenBlockVdev *vdev = &blockdev->vdev;
 
 trace_xen_block_disconnect(type, vdev->disk, vdev->partition);
+
+xen_block_dataplane_stop(blockdev->dataplane);
 }
 
 static void xen_block_connect(XenDevice *xendev, Error **errp)
@@ -35,8 +43,72 @@ static void xen_block_connect(XenDevice *xendev, Error **errp)
 XenBlockDevice *blockdev = XEN_BLOCK_DEVICE(xendev);
 const char *type = object_get_typename(OBJECT(blockdev));
 XenBlockVdev *vdev = &blockdev->vdev;
+unsigned int order, nr_ring_ref, *ring_ref, event_channel, protocol;
+char *str;
 
 trace_xen_block_connect(type, vdev->disk, vdev->partition);
+
+if (xen_device_frontend_scanf(xendev, "ring-page-order", "%u",
+  &order) != 1) {
+nr_ring_ref = 1;
+ring_ref = g_new(unsigned int, nr_ring_ref);
+
+if (xen_device_frontend_scanf(xendev, "ring-ref", "%u",
+  &ring_ref[0]) != 1) {
+error_setg(errp, "failed to read ring-ref");
+g_free(ring_ref);
+return;
+}
+} else if (order <= blockdev->max_ring_page_order) {
+unsigned int i;
+
+nr_ring_ref = 1 << order;
+ring_ref = g_new(unsigned int, nr_ring_ref);
+
+for (i = 0; i < nr_ring_ref; i++) {
+const char *key = g_strdup_printf("ring-ref%u", i);
+
+if (xen_device_frontend_scanf(xendev, key, "%u",
+  &ring_ref[i]) != 1) {
+error_setg(errp, "failed to read %s", key);
+g_free((gpointer)key);
+g_free(ring_ref);
+return;
+}
+
+g_free((gpointer)key);
+}
+} else {
+error_setg(errp, "invalid ring-page-order (%u)", order);
+return;
+}
+
+if (xen_device_frontend_scanf(xendev, "event-channel", "%u",
+  &event_channel) != 1) {
+error_setg(errp, "failed to read event-channel");
+g_free(ring_ref);
+return;
+}
+
+if (xen_device_frontend_scanf(xendev, "protocol", "%ms",
+  &str) != 1) {
+protocol = BLKIF_PROTOCOL_NATIVE;
+} else {
+if (strcmp(str, XEN_IO_PROTO_ABI_X86_32) == 0) {
+protocol = BLKIF_PROTOCOL_X86_32;
+} else if (strcmp(str, XEN_IO_PROTO_ABI_X86_64) == 0) {
+protocol = BLKIF_PROTOCOL_X86_64;
+} else {
+protocol = BLKIF_PROTOCOL_NATIVE;
+}
+
+free(str);
+}
+
+xen_block_dataplane_start(blockdev->dataplane, ring_ref, nr_ring_ref,
+  event_channel, protocol, errp);
+
+g_free(ring_ref);
 }
 
 static void xen_block_unrealize(XenDevice *xendev, Error **errp)
@@ -60,6 +132,9 @@ static void xen_block_unrealize(XenDevice *xendev, Error **errp)
 error_propagate(errp, local_err);
 }
 
+xen_block_dataplane_destroy(blockdev->dataplane);
+blockdev->dataplane = NULL;
+
 if (blockdev_class->unrealize) {
 blockdev_class->unrealize(blockdev, &local_err);
 if (local_err) {
@@ -76,6 +151,7 @@ static void xen_block_realize(XenDevice *xendev, Error **errp)
 const char *type = object_get_typename(OBJECT(blockdev));
 XenBlockVdev *vdev = &blockdev->vdev;
 Error *local_err = NULL;
+BlockConf *conf = &blockdev->conf;
 
 if (vdev->type == XEN_BLOCK_VDEV_TYPE_INVALID) {
 error_setg(errp, "vdev property not set");
@@ -90,6 +166,59 @@ static void xen_block_realize(XenDevice *xendev, Error **errp)
 

[Xen-devel] [PATCH v2 12/18] xen: remove 'ioreq' struct/variable/field names from dataplane/xen-block.c

2018-12-06 Thread Paul Durrant
This is a purely cosmetic patch that purges the name 'ioreq' from struct,
variable and field names. (This name has been problematic for a long time
as 'ioreq' is the name used for generic I/O requests coming from Xen).
The patch replaces 'struct ioreq' with a new 'XenBlockRequest' type and
'ioreq' field/variable names with 'request', and then does necessary
fix-up to adhere to coding style.

Function names are not modified by this patch. They will be dealt with in
a subsequent patch.

No functional change.

Signed-off-by: Paul Durrant 
Acked-by: Anthony Perard 
---
Cc: Stefan Hajnoczi 
Cc: Stefano Stabellini 
Cc: Kevin Wolf 
Cc: Max Reitz 
---
 hw/block/dataplane/xen-block.c | 310 +
 1 file changed, 156 insertions(+), 154 deletions(-)

diff --git a/hw/block/dataplane/xen-block.c b/hw/block/dataplane/xen-block.c
index 6ecd160..426e83c 100644
--- a/hw/block/dataplane/xen-block.c
+++ b/hw/block/dataplane/xen-block.c
@@ -34,7 +34,7 @@
 #include "sysemu/iothread.h"
 #include "xen-block.h"
 
-struct ioreq {
+typedef struct XenBlockRequest {
 blkif_request_t req;
 int16_t status;
 off_t start;
@@ -45,9 +45,9 @@ struct ioreq {
 int aio_inflight;
 int aio_errors;
 XenBlockDataPlane *dataplane;
-QLIST_ENTRY(ioreq) list;
+QLIST_ENTRY(XenBlockRequest) list;
 BlockAcctCookie acct;
-};
+} XenBlockRequest;
 
 struct XenBlockDataPlane {
 XenDevice *xendev;
@@ -60,9 +60,9 @@ struct XenBlockDataPlane {
 int protocol;
 blkif_back_rings_t rings;
 int more_work;
-QLIST_HEAD(inflight_head, ioreq) inflight;
-QLIST_HEAD(finished_head, ioreq) finished;
-QLIST_HEAD(freelist_head, ioreq) freelist;
+QLIST_HEAD(inflight_head, XenBlockRequest) inflight;
+QLIST_HEAD(finished_head, XenBlockRequest) finished;
+QLIST_HEAD(freelist_head, XenBlockRequest) freelist;
 int requests_total;
 int requests_inflight;
 int requests_finished;
@@ -73,68 +73,68 @@ struct XenBlockDataPlane {
 AioContext *ctx;
 };
 
-static void ioreq_reset(struct ioreq *ioreq)
+static void ioreq_reset(XenBlockRequest *request)
 {
-memset(&ioreq->req, 0, sizeof(ioreq->req));
-ioreq->status = 0;
-ioreq->start = 0;
-ioreq->buf = NULL;
-ioreq->size = 0;
-ioreq->presync = 0;
+memset(&request->req, 0, sizeof(request->req));
+request->status = 0;
+request->start = 0;
+request->buf = NULL;
+request->size = 0;
+request->presync = 0;
 
-ioreq->aio_inflight = 0;
-ioreq->aio_errors = 0;
+request->aio_inflight = 0;
+request->aio_errors = 0;
 
-ioreq->dataplane = NULL;
-memset(&ioreq->list, 0, sizeof(ioreq->list));
-memset(&ioreq->acct, 0, sizeof(ioreq->acct));
+request->dataplane = NULL;
+memset(&request->list, 0, sizeof(request->list));
+memset(&request->acct, 0, sizeof(request->acct));
 
-qemu_iovec_reset(&ioreq->v);
+qemu_iovec_reset(&request->v);
 }
 
-static struct ioreq *ioreq_start(XenBlockDataPlane *dataplane)
+static XenBlockRequest *ioreq_start(XenBlockDataPlane *dataplane)
 {
-struct ioreq *ioreq = NULL;
+XenBlockRequest *request = NULL;
 
 if (QLIST_EMPTY(&dataplane->freelist)) {
 if (dataplane->requests_total >= dataplane->max_requests) {
 goto out;
 }
 /* allocate new struct */
-ioreq = g_malloc0(sizeof(*ioreq));
-ioreq->dataplane = dataplane;
+request = g_malloc0(sizeof(*request));
+request->dataplane = dataplane;
 dataplane->requests_total++;
-qemu_iovec_init(&ioreq->v, 1);
+qemu_iovec_init(&request->v, 1);
 } else {
 /* get one from freelist */
-ioreq = QLIST_FIRST(&dataplane->freelist);
-QLIST_REMOVE(ioreq, list);
+request = QLIST_FIRST(&dataplane->freelist);
+QLIST_REMOVE(request, list);
 }
-QLIST_INSERT_HEAD(&dataplane->inflight, ioreq, list);
+QLIST_INSERT_HEAD(&dataplane->inflight, request, list);
 dataplane->requests_inflight++;
 
 out:
-return ioreq;
+return request;
 }
 
-static void ioreq_finish(struct ioreq *ioreq)
+static void ioreq_finish(XenBlockRequest *request)
 {
-XenBlockDataPlane *dataplane = ioreq->dataplane;
+XenBlockDataPlane *dataplane = request->dataplane;
 
-QLIST_REMOVE(ioreq, list);
-QLIST_INSERT_HEAD(&dataplane->finished, ioreq, list);
+QLIST_REMOVE(request, list);
+QLIST_INSERT_HEAD(&dataplane->finished, request, list);
 dataplane->requests_inflight--;
 dataplane->requests_finished++;
 }
 
-static void ioreq_release(struct ioreq *ioreq, bool finish)
+static void ioreq_release(XenBlockRequest *request, bool finish)
 {
-XenBlockDataPlane *dataplane = ioreq->dataplane;
+XenBlockDataPlane *dataplane = request->dataplane;
 
-QLIST_REMOVE(ioreq, list);
-ioreq_reset(ioreq);
-ioreq->dataplane = dataplane;
-QLIST_INSERT_HEAD(&dataplane->freelist, ioreq, list);
+QLIST_REMOVE(request, list);
+ioreq_reset(request);
+request->dataplane = dataplane;
+QLIST_INSERT_HEAD(&dataplane->freelist, request, list);
 if (finish) {
 dataplane->requests_finished--;
 

Re: [Xen-devel] [PATCH v6 09/20] xen: add basic hooks for PVH in current code

2018-12-06 Thread Daniel Kiper
On Wed, Nov 28, 2018 at 02:55:19PM +0100, Juergen Gross wrote:
> Add the hooks to current code needed for Xen PVH. They will be filled
> with code later when the related functionality is being added.
>
> loader/i386/linux.c needs to include machine/kernel.h now as it needs
> to get GRUB_KERNEL_USE_RSDP_ADDR from there.
>
> Signed-off-by: Juergen Gross 
> Reviewed-by: Daniel Kiper 
> ---
> V3: xenpvh->xen_pvh (Daniel Kiper)
> adjust copyright date (Roger Pau Monné)
> V5: update commit message (Daniel Kiper)
> move including xen/hvm/start_info.h to the sources really needing
>   it (Daniel Kiper)
> ---

[...]

> diff --git a/grub-core/loader/i386/linux.c b/grub-core/loader/i386/linux.c
> index 375ee80dc..b6015913b 100644
> --- a/grub-core/loader/i386/linux.c
> +++ b/grub-core/loader/i386/linux.c
> @@ -35,6 +35,7 @@
>  #include 
>  #include 
>  #include 
> +#include 

This include breaks i386 ieee1275 builds. Could you fix that and repost?

Daniel

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH 15/18] xen: add a mechanism to automatically create XenDevice-s...

2018-12-06 Thread Anthony PERARD
On Thu, Dec 06, 2018 at 12:36:52PM +, Paul Durrant wrote:
> > -Original Message-
> > From: Anthony PERARD [mailto:anthony.per...@citrix.com]
> > Sent: 04 December 2018 15:35
> > 
> > On Wed, Nov 21, 2018 at 03:12:08PM +, Paul Durrant wrote:
> > > +xenbus->backend_watch =
> > > +xen_bus_add_watch(xenbus, "", /* domain root node */
> > > +  "backend", xen_bus_enumerate, xenbus, &local_err);
> > > +if (local_err) {
> > > +error_propagate(errp, local_err);
> > > +error_prepend(errp, "failed to set up enumeration watch: ");
> > 
> > You should use error_propagate_prepend instead of
> > error_propagate;error_prepend. And it looks like there is the same
> > mistake in other patches that I haven't noticed.
> > 
> 
> Oh, I didn't know about that one either... I've only seen the separate calls 
> used elsewhere.

That information is all in "include/qapi/error.h", if you wish to know
more on how to use Error.

> > Also you probably want goto fail here.
> > 
> 
> Not sure about that. Whilst the bus scan won't happen, it doesn't mean 
> devices can't be added via QMP.

In that case, don't modify errp, and use error_reportf_err instead, or
warn_reportf_err (then local_err = NULL, in case it is reused in a
future modification of the function).

Setting errp (with error_propagate) means that the function failed, and
QEMU is going to exit(1), because of qdev_init_nofail call in
xen_bus_init.
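
To make the distinction concrete, here is a small stand-alone sketch. It deliberately does not use QEMU's Error API: DemoError and the demo_* helpers are made-up stand-ins that only mimic the two policies (hand the error to the caller and fail, or report it as a warning and carry on).

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for an error object; NOT QEMU's real Error type. */
typedef struct DemoError {
    char msg[128];
} DemoError;

static DemoError *demo_error_new(const char *msg)
{
    DemoError *err = calloc(1, sizeof(*err));

    assert(err != NULL);
    snprintf(err->msg, sizeof(err->msg), "%s", msg);
    return err;
}

/* Policy 1: prepend context and hand the error to the caller. */
static void demo_propagate_prepend(DemoError **errp, DemoError *local,
                                   const char *prefix)
{
    char tmp[sizeof(local->msg)];

    snprintf(tmp, sizeof(tmp), "%s%s", prefix, local->msg);
    strcpy(local->msg, tmp);
    if (errp) {
        *errp = local;      /* caller now owns the error and sees failure */
    } else {
        free(local);
    }
}

/* Policy 2: report as a warning and consume the error. */
static void demo_warn_report(DemoError *local, const char *prefix)
{
    fprintf(stderr, "warning: %s%s\n", prefix, local->msg);
    free(local);
}

/* Returns 0 on success, -1 on failure; 'fatal' selects the policy. */
static int demo_setup_watch(int watch_fails, int fatal, DemoError **errp)
{
    const char *prefix = "failed to set up enumeration watch: ";
    DemoError *local_err = NULL;

    if (watch_fails) {
        local_err = demo_error_new("demo failure");
    }

    if (local_err) {
        if (fatal) {
            demo_propagate_prepend(errp, local_err, prefix);
            return -1;
        }
        demo_warn_report(local_err, prefix);
        local_err = NULL;   /* in case it is reused later in the function */
    }

    return 0;
}
```

In the non-fatal case the caller's error pointer is left untouched, which is the behaviour wanted if devices should still be addable by other means.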

> > > +static void xen_device_backend_changed(void *opaque)
> > > +{
> > > +XenDevice *xendev = opaque;
> > > +const char *type = object_get_typename(OBJECT(xendev));
> > > +enum xenbus_state state;
> > > +unsigned int online;
> > > +
> > > +trace_xen_device_backend_changed(type, xendev->name);
> > > +
> > > +if (xen_device_backend_scanf(xendev, "state", "%u", &state) != 1) {
> > > +state = XenbusStateUnknown;
> > > +}
> > > +
> > > +xen_device_backend_set_state(xendev, state);
> > 
> > It's kind of weird to set the internal state based on the external one
> > that something else may have modified. Shouldn't we check that it is
> > fine for something else to modify the state and that it is a correct
> > transition?
> 
> The only thing (apart from this code) that's going to have perms to write the 
> backend state is the toolstack... which is, of course, by definition trusted.

"trusted" doesn't mean that there isn't a bug somewhere else :-). But I
guess it's good enough for now.

> > Also aren't we going in a loop by having QEMU set the state, then the
> > watch fires again? (Not really a loop since the function _set_state
> > checks for changes.)
> 
> No. It's de-bounced inside the set_state function.
> 
> > 
> > Also maybe we should watch for the state changes only when something
> > else like libxl creates (ask for) the backend, and ignore changes when
> > QEMU did it itself.
> 
> I don't think it's necessary to add that complexity.

Ok.
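
On the de-bouncing point above, the idea can be pictured roughly as follows. This is a sketch with stand-in names (DemoDevice, demo_set_state, demo_backend_changed), not the actual xen-bus code: a write only happens when the state really changes, so the watch event caused by QEMU's own write is absorbed on the next pass.

```c
#include <assert.h>

/* Hypothetical stand-in for a backend device; not the real XenDevice. */
typedef struct DemoDevice {
    int state;              /* last state written to the "state" node */
    unsigned int writes;    /* number of (simulated) xenstore writes */
} DemoDevice;

/* Only write, and therefore only trigger the watch, on a real change. */
static void demo_set_state(DemoDevice *dev, int state)
{
    if (dev->state == state) {
        return;             /* de-bounce: no write, so no new watch event */
    }

    dev->state = state;
    dev->writes++;          /* stands in for the xenstore write */
}

/* Watch handler: feed the value currently in xenstore back in. */
static void demo_backend_changed(DemoDevice *dev, int state_in_xenstore)
{
    demo_set_state(dev, state_in_xenstore);
}
```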

-- 
Anthony PERARD


[Xen-devel] [PATCH v2 05/18] xen: add xenstore watcher infrastructure

2018-12-06 Thread Paul Durrant
A Xen PV frontend communicates its state to the PV backend by writing to
the 'state' key in the frontend area in xenstore. It is therefore
necessary for a XenDevice implementation to be notified whenever the
value of this key changes.

This patch adds code to do this as follows:

- an 'fd handler' is registered on the libxenstore handle which will be
  triggered whenever a 'watch' event occurs
- primitives are added to xen-bus-helper to add or remove watch events
- a list of Notifier objects is added to XenBus to provide a mechanism
  to call the appropriate 'watch handler' when its associated event
  occurs

The xen-block implementation is extended with a 'frontend_changed' method,
which calls as-yet stub 'connect' and 'disconnect' functions when the
relevant frontend state transitions occur. A subsequent patch will supply
a full implementation for these functions.
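
As a rough illustration of the dispatch mechanism outlined above, the sketch below keeps a table of watches and calls the matching handler when an event with a given token arrives. All names (DemoBus, DemoWatch, demo_add_watch, demo_dispatch) are hypothetical; the patch itself uses Notifier objects and libxenstore, neither of which is modelled here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_MAX_WATCHES 8

typedef void (*DemoWatchHandler)(void *opaque);

/* One registered watch: a token plus the handler to call for it. */
typedef struct DemoWatch {
    const char *token;
    DemoWatchHandler handler;
    void *opaque;
} DemoWatch;

/* Hypothetical stand-in for the per-bus list of watches. */
typedef struct DemoBus {
    DemoWatch watches[DEMO_MAX_WATCHES];
    size_t nr_watches;
} DemoBus;

/* Register a handler; returns 0 on success, -1 if the table is full. */
static int demo_add_watch(DemoBus *bus, const char *token,
                          DemoWatchHandler handler, void *opaque)
{
    if (bus->nr_watches == DEMO_MAX_WATCHES) {
        return -1;
    }

    bus->watches[bus->nr_watches++] = (DemoWatch){ token, handler, opaque };
    return 0;
}

/*
 * Called (conceptually) from the fd handler for each watch event.
 * Returns the number of handlers that were invoked.
 */
static unsigned int demo_dispatch(DemoBus *bus, const char *token)
{
    unsigned int called = 0;
    size_t i;

    for (i = 0; i < bus->nr_watches; i++) {
        if (strcmp(bus->watches[i].token, token) == 0) {
            bus->watches[i].handler(bus->watches[i].opaque);
            called++;
        }
    }

    return called;
}

/* Example handler: counts how often the frontend 'state' key changed. */
static void demo_frontend_changed(void *opaque)
{
    unsigned int *count = opaque;

    (*count)++;
}
```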

Signed-off-by: Paul Durrant 
---
Cc: Kevin Wolf 
Cc: Max Reitz 
Cc: Stefano Stabellini 
Cc: Anthony Perard 

v2:
 - Don't crash when xen_block_disconnect() fails
 - Check xs_unwatch() for error
 - Add new_watch() and free_watch() utility functions
 - Use xs_check_watch() rather than xs_read_watch()
---
 hw/block/trace-events   |   2 +
 hw/block/xen-block.c|  73 ++
 hw/xen/trace-events |   6 ++
 hw/xen/xen-bus-helper.c |  34 +++
 hw/xen/xen-bus.c| 217 +++-
 include/hw/xen/xen-bus-helper.h |   6 ++
 include/hw/xen/xen-bus.h|  15 +++
 7 files changed, 351 insertions(+), 2 deletions(-)

diff --git a/hw/block/trace-events b/hw/block/trace-events
index 4afbd62..89e2583 100644
--- a/hw/block/trace-events
+++ b/hw/block/trace-events
@@ -130,6 +130,8 @@ xen_disk_free(char *name) "%s"
 
 # hw/block/xen-block.c
 xen_block_realize(const char *type, uint32_t disk, uint32_t partition) "%s d%up%u"
+xen_block_connect(const char *type, uint32_t disk, uint32_t partition) "%s d%up%u"
+xen_block_disconnect(const char *type, uint32_t disk, uint32_t partition) "%s d%up%u"
 xen_block_unrealize(const char *type, uint32_t disk, uint32_t partition) "%s d%up%u"
 xen_disk_realize(void) ""
 xen_disk_unrealize(void) ""
diff --git a/hw/block/xen-block.c b/hw/block/xen-block.c
index 440bec2..d2334ef 100644
--- a/hw/block/xen-block.c
+++ b/hw/block/xen-block.c
@@ -21,6 +21,24 @@ static char *xen_block_get_name(XenDevice *xendev, Error **errp)
 return g_strdup_printf("%lu", vdev->number);
 }
 
+static void xen_block_disconnect(XenDevice *xendev, Error **errp)
+{
+XenBlockDevice *blockdev = XEN_BLOCK_DEVICE(xendev);
+const char *type = object_get_typename(OBJECT(blockdev));
+XenBlockVdev *vdev = &blockdev->vdev;
+
+trace_xen_block_disconnect(type, vdev->disk, vdev->partition);
+}
+
+static void xen_block_connect(XenDevice *xendev, Error **errp)
+{
+XenBlockDevice *blockdev = XEN_BLOCK_DEVICE(xendev);
+const char *type = object_get_typename(OBJECT(blockdev));
+XenBlockVdev *vdev = &blockdev->vdev;
+
+trace_xen_block_connect(type, vdev->disk, vdev->partition);
+}
+
 static void xen_block_unrealize(XenDevice *xendev, Error **errp)
 {
 XenBlockDevice *blockdev = XEN_BLOCK_DEVICE(xendev);
@@ -36,6 +54,12 @@ static void xen_block_unrealize(XenDevice *xendev, Error **errp)
 
 trace_xen_block_unrealize(type, vdev->disk, vdev->partition);
 
+/* Disconnect from the frontend in case this has not already happened */
+xen_block_disconnect(xendev, &local_err);
+if (local_err) {
+error_propagate(errp, local_err);
+}
+
 if (blockdev_class->unrealize) {
 blockdev_class->unrealize(blockdev, &local_err);
 if (local_err) {
@@ -68,6 +92,54 @@ static void xen_block_realize(XenDevice *xendev, Error **errp)
 }
 }
 
+static void xen_block_frontend_changed(XenDevice *xendev,
+   enum xenbus_state frontend_state,
+   Error **errp)
+{
+enum xenbus_state backend_state = xen_device_backend_get_state(xendev);
+Error *local_err = NULL;
+
+switch (frontend_state) {
+case XenbusStateInitialised:
+case XenbusStateConnected:
+if (backend_state == XenbusStateConnected) {
+break;
+}
+
+xen_block_disconnect(xendev, &local_err);
+if (local_err) {
+error_propagate(errp, local_err);
+break;
+}
+
+xen_block_connect(xendev, &local_err);
+if (local_err) {
+error_propagate(errp, local_err);
+break;
+}
+
+xen_device_backend_set_state(xendev, XenbusStateConnected);
+break;
+
+case XenbusStateClosing:
+xen_device_backend_set_state(xendev, XenbusStateClosing);
+break;
+
+case XenbusStateClosed:
+xen_block_disconnect(xendev, &local_err);
+if (local_err) {
+error_propagate(errp, local_err);
+break;
+}
+
+xen_device_backend_set_state(xendev, XenbusStateClosed);

[Xen-devel] [PATCH v2 02/18] xen: introduce new 'XenBus' and 'XenDevice' object hierarchy

2018-12-06 Thread Paul Durrant
This patch adds the basic boilerplate for a 'XenBus' object that will act
as a parent to 'XenDevice' PV backends.
A new 'XenBridge' object is also added to connect XenBus to the system bus.

The XenBus object is instantiated by a new xen_bus_init() function called
from the same sites as the legacy xen_be_init() function.

Subsequent patches will flesh-out the functionality of these objects.

Signed-off-by: Paul Durrant 
---
Cc: Stefano Stabellini 
Cc: Anthony Perard 
Cc: "Michael S. Tsirkin" 
Cc: Marcel Apfelbaum 
Cc: Paolo Bonzini 
Cc: Richard Henderson 
Cc: Eduardo Habkost 

v2:
 - Fix boilerplate
 - Make xen-bus hotplug capable
---
 hw/i386/xen/xen-hvm.c |   3 ++
 hw/xen/Makefile.objs  |   2 +-
 hw/xen/trace-events   |   6 +++
 hw/xen/xen-bus.c  | 131 ++
 hw/xenpv/xen_machine_pv.c |   3 ++
 include/hw/xen/xen-bus.h  |  55 +++
 6 files changed, 199 insertions(+), 1 deletion(-)
 create mode 100644 hw/xen/xen-bus.c
 create mode 100644 include/hw/xen/xen-bus.h

diff --git a/hw/i386/xen/xen-hvm.c b/hw/i386/xen/xen-hvm.c
index 1d63763..4497f75 100644
--- a/hw/i386/xen/xen-hvm.c
+++ b/hw/i386/xen/xen-hvm.c
@@ -17,6 +17,7 @@
 #include "hw/i386/apic-msidef.h"
 #include "hw/xen/xen_common.h"
 #include "hw/xen/xen-legacy-backend.h"
+#include "hw/xen/xen-bus.h"
 #include "qapi/error.h"
 #include "qapi/qapi-commands-misc.h"
 #include "qemu/error-report.h"
@@ -1479,6 +1480,8 @@ void xen_hvm_init(PCMachineState *pcms, MemoryRegion **ram_memory)
 QLIST_INIT(&state->dev_list);
 device_listener_register(&state->device_listener);
 
+xen_bus_init();
+
 /* Initialize backend core & drivers */
 if (xen_be_init() != 0) {
 error_report("xen backend core setup failed");
diff --git a/hw/xen/Makefile.objs b/hw/xen/Makefile.objs
index 3f64a44..d9d6d7b 100644
--- a/hw/xen/Makefile.objs
+++ b/hw/xen/Makefile.objs
@@ -1,5 +1,5 @@
 # xen backend driver support
-common-obj-$(CONFIG_XEN) += xen-legacy-backend.o xen_devconfig.o xen_pvdev.o xen-common.o
+common-obj-$(CONFIG_XEN) += xen-legacy-backend.o xen_devconfig.o xen_pvdev.o xen-common.o xen-bus.o
 
 obj-$(CONFIG_XEN_PCI_PASSTHROUGH) += xen-host-pci-device.o
 obj-$(CONFIG_XEN_PCI_PASSTHROUGH) += xen_pt.o xen_pt_config_init.o xen_pt_graphics.o xen_pt_msi.o
diff --git a/hw/xen/trace-events b/hw/xen/trace-events
index c7e7a3b..0172cd4 100644
--- a/hw/xen/trace-events
+++ b/hw/xen/trace-events
@@ -12,3 +12,9 @@ xen_unmap_portio_range(uint32_t id, uint64_t start_addr, uint64_t end_addr) "id:
 xen_map_pcidev(uint32_t id, uint8_t bus, uint8_t dev, uint8_t func) "id: %u bdf: %02x.%02x.%02x"
 xen_unmap_pcidev(uint32_t id, uint8_t bus, uint8_t dev, uint8_t func) "id: %u bdf: %02x.%02x.%02x"
 xen_domid_restrict(int err) "err: %u"
+
+# include/hw/xen/xen-bus.c
+xen_bus_realize(void) ""
+xen_bus_unrealize(void) ""
+xen_device_realize(const char *type) "type: %s"
+xen_device_unrealize(const char *type) "type: %s"
diff --git a/hw/xen/xen-bus.c b/hw/xen/xen-bus.c
new file mode 100644
index 0000000..1385bab
--- /dev/null
+++ b/hw/xen/xen-bus.c
@@ -0,0 +1,131 @@
+/*
+ * Copyright (c) 2018  Citrix Systems Inc.
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#include "qemu/osdep.h"
+#include "hw/hw.h"
+#include "hw/sysbus.h"
+#include "hw/xen/xen-bus.h"
+#include "qapi/error.h"
+#include "trace.h"
+
+static void xen_bus_unrealize(BusState *bus, Error **errp)
+{
+trace_xen_bus_unrealize();
+}
+
+static void xen_bus_realize(BusState *bus, Error **errp)
+{
+trace_xen_bus_realize();
+}
+
+static void xen_bus_class_init(ObjectClass *class, void *data)
+{
+BusClass *bus_class = BUS_CLASS(class);
+
+bus_class->realize = xen_bus_realize;
+bus_class->unrealize = xen_bus_unrealize;
+}
+
+static const TypeInfo xen_bus_type_info = {
+.name = TYPE_XEN_BUS,
+.parent = TYPE_BUS,
+.instance_size = sizeof(XenBus),
+.class_size = sizeof(XenBusClass),
+.class_init = xen_bus_class_init,
+.interfaces = (InterfaceInfo[]) {
+{ TYPE_HOTPLUG_HANDLER },
+{ }
+},
+};
+
+static void xen_device_unrealize(DeviceState *dev, Error **errp)
+{
+XenDevice *xendev = XEN_DEVICE(dev);
+XenDeviceClass *xendev_class = XEN_DEVICE_GET_CLASS(xendev);
+const char *type = object_get_typename(OBJECT(xendev));
+Error *local_err = NULL;
+
+trace_xen_device_unrealize(type);
+
+if (xendev_class->unrealize) {
+xendev_class->unrealize(xendev, &local_err);
+if (local_err) {
+error_propagate(errp, local_err);
+}
+}
+}
+
+static void xen_device_realize(DeviceState *dev, Error **errp)
+{
+XenDevice *xendev = XEN_DEVICE(dev);
+XenDeviceClass *xendev_class = XEN_DEVICE_GET_CLASS(xendev);
+const char *type = object_get_typename(OBJECT(xendev));
+Error *local_err = NULL;
+
+trace_xen_device_realize(type);
+
+if 

[Xen-devel] [PATCH v2 01/18] xen: re-name XenDevice to XenLegacyDevice...

2018-12-06 Thread Paul Durrant
...and xen_backend.h to xen-legacy-backend.h

Rather than attempting to convert the existing backend infrastructure to
be QOM compliant (which would be hard to do in an incremental fashion),
subsequent patches will introduce a completely new framework for Xen PV
backends. Hence it is necessary to re-name parts of existing code to avoid
name clashes. The re-named 'legacy' infrastructure will be removed once all
backends have been ported to the new framework.

This patch is purely cosmetic. No functional change.

Signed-off-by: Paul Durrant 
Acked-by: Anthony Perard 
---
Cc: Stefano Stabellini 
Cc: Greg Kurz 
Cc: Kevin Wolf 
Cc: Max Reitz 
Cc: "Marc-André Lureau" 
Cc: Paolo Bonzini 
Cc: Richard Henderson 
Cc: Eduardo Habkost 
Cc: "Michael S. Tsirkin" 
Cc: Marcel Apfelbaum 
Cc: Jason Wang 
Cc: Gerd Hoffmann 
---
 hw/9pfs/xen-9p-backend.c|  16 +-
 hw/block/xen_disk.c |  24 +-
 hw/char/xen_console.c   |  12 +-
 hw/display/xenfb.c  |  25 +-
 hw/i386/xen/xen-hvm.c   |   2 +-
 hw/i386/xen/xen-mapcache.c  |   2 +-
 hw/i386/xen/xen_platform.c  |   2 +-
 hw/net/xen_nic.c|  14 +-
 hw/usb/xen-usb.c|  25 +-
 hw/xen/Makefile.objs|   2 +-
 hw/xen/xen-common.c |   2 +-
 hw/xen/xen-legacy-backend.c | 854 
 hw/xen/xen_backend.c| 845 ---
 hw/xen/xen_devconfig.c  |   2 +-
 hw/xen/xen_pt.c |   2 +-
 hw/xen/xen_pt_config_init.c |   2 +-
 hw/xen/xen_pt_graphics.c|   2 +-
 hw/xen/xen_pt_msi.c |   2 +-
 hw/xen/xen_pvdev.c  |  20 +-
 hw/xenpv/xen_domainbuild.c  |   2 +-
 hw/xenpv/xen_machine_pv.c   |   2 +-
 include/hw/xen/xen-legacy-backend.h | 104 +
 include/hw/xen/xen_backend.h|  99 -
 include/hw/xen/xen_pvdev.h  |  38 +-
 24 files changed, 1059 insertions(+), 1041 deletions(-)
 create mode 100644 hw/xen/xen-legacy-backend.c
 delete mode 100644 hw/xen/xen_backend.c
 create mode 100644 include/hw/xen/xen-legacy-backend.h
 delete mode 100644 include/hw/xen/xen_backend.h

diff --git a/hw/9pfs/xen-9p-backend.c b/hw/9pfs/xen-9p-backend.c
index 3f54a21..3859a06 100644
--- a/hw/9pfs/xen-9p-backend.c
+++ b/hw/9pfs/xen-9p-backend.c
@@ -12,7 +12,7 @@
 
 #include "hw/hw.h"
 #include "hw/9pfs/9p.h"
-#include "hw/xen/xen_backend.h"
+#include "hw/xen/xen-legacy-backend.h"
 #include "hw/9pfs/xen-9pfs.h"
 #include "qapi/error.h"
 #include "qemu/config-file.h"
@@ -45,7 +45,7 @@ typedef struct Xen9pfsRing {
 } Xen9pfsRing;
 
 typedef struct Xen9pfsDev {
-struct XenDevice xendev;  /* must be first */
+struct XenLegacyDevice xendev;  /* must be first */
 V9fsState state;
 char *path;
 char *security_model;
@@ -56,7 +56,7 @@ typedef struct Xen9pfsDev {
 Xen9pfsRing *rings;
 } Xen9pfsDev;
 
-static void xen_9pfs_disconnect(struct XenDevice *xendev);
+static void xen_9pfs_disconnect(struct XenLegacyDevice *xendev);
 
 static void xen_9pfs_in_sg(Xen9pfsRing *ring,
struct iovec *in_sg,
@@ -243,7 +243,7 @@ static const V9fsTransport xen_9p_transport = {
 .push_and_notify = xen_9pfs_push_and_notify,
 };
 
-static int xen_9pfs_init(struct XenDevice *xendev)
+static int xen_9pfs_init(struct XenLegacyDevice *xendev)
 {
 return 0;
 }
@@ -305,7 +305,7 @@ static void xen_9pfs_evtchn_event(void *opaque)
 qemu_bh_schedule(ring->bh);
 }
 
-static void xen_9pfs_disconnect(struct XenDevice *xendev)
+static void xen_9pfs_disconnect(struct XenLegacyDevice *xendev)
 {
 Xen9pfsDev *xen_9pdev = container_of(xendev, Xen9pfsDev, xendev);
 int i;
@@ -321,7 +321,7 @@ static void xen_9pfs_disconnect(struct XenDevice *xendev)
 }
 }
 
-static int xen_9pfs_free(struct XenDevice *xendev)
+static int xen_9pfs_free(struct XenLegacyDevice *xendev)
 {
 Xen9pfsDev *xen_9pdev = container_of(xendev, Xen9pfsDev, xendev);
 int i;
@@ -354,7 +354,7 @@ static int xen_9pfs_free(struct XenDevice *xendev)
 return 0;
 }
 
-static int xen_9pfs_connect(struct XenDevice *xendev)
+static int xen_9pfs_connect(struct XenLegacyDevice *xendev)
 {
 Error *err = NULL;
 int i;
@@ -467,7 +467,7 @@ out:
 return -1;
 }
 
-static void xen_9pfs_alloc(struct XenDevice *xendev)
+static void xen_9pfs_alloc(struct XenLegacyDevice *xendev)
 {
 xenstore_write_be_str(xendev, "versions", VERSIONS);
 xenstore_write_be_int(xendev, "max-rings", MAX_RINGS);
diff --git a/hw/block/xen_disk.c b/hw/block/xen_disk.c
index 36eff94..75fe55f 100644
--- a/hw/block/xen_disk.c
+++ b/hw/block/xen_disk.c
@@ -25,7 +25,7 @@
 #include 
 
 #include "hw/hw.h"
-#include "hw/xen/xen_backend.h"
+#include "hw/xen/xen-legacy-backend.h"
 #include "xen_blkif.h"
 #include "sysemu/blockdev.h"
 #include "sysemu/iothread.h"
@@ -63,7 +63,7 @@ struct ioreq {
 #define 

[Xen-devel] [PATCH v2 08/18] xen: duplicate xen_disk.c as basis of dataplane/xen-block.c

2018-12-06 Thread Paul Durrant
The new xen-block XenDevice implementation requires the same core
dataplane as the legacy xen_disk implementation it will eventually replace.
This patch therefore copies the legacy xen_disk.c source module into a new
dataplane/xen-block.c source module as the basis for the new dataplane and
adjusts the MAINTAINERS file accordingly.

NOTE: The duplicated code is not yet built. It is simply put into place by
  this patch (just fixing style violations) such that the
  modifications that will need to be made to the code are not
  conflated with code movement, thus making review harder.

Signed-off-by: Paul Durrant 
Acked-by: Anthony Perard 
---
Cc: Stefano Stabellini 
Cc: Stefan Hajnoczi 
Cc: Kevin Wolf 
Cc: Max Reitz 
---
 MAINTAINERS|1 +
 hw/block/dataplane/xen-block.c | 1019 
 2 files changed, 1020 insertions(+)
 create mode 100644 hw/block/dataplane/xen-block.c

diff --git a/MAINTAINERS b/MAINTAINERS
index dd728c3..ab62ad4 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -404,6 +404,7 @@ F: hw/char/xen_console.c
 F: hw/display/xenfb.c
 F: hw/net/xen_nic.c
 F: hw/block/xen*
+F: hw/block/dataplane/xen*
 F: hw/xen/
 F: hw/xenpv/
 F: hw/i386/xen/
diff --git a/hw/block/dataplane/xen-block.c b/hw/block/dataplane/xen-block.c
new file mode 100644
index 0000000..9fae505
--- /dev/null
+++ b/hw/block/dataplane/xen-block.c
@@ -0,0 +1,1019 @@
+/*
+ *  xen paravirt block device backend
+ *
+ *  (c) Gerd Hoffmann 
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; under version 2 of the License.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License along
+ *  with this program; if not, see <http://www.gnu.org/licenses/>.
+ *
+ *  Contributions after 2012-01-13 are licensed under the terms of the
+ *  GNU GPL, version 2 or (at your option) any later version.
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/units.h"
+#include 
+#include 
+
+#include "hw/hw.h"
+#include "hw/xen/xen_backend.h"
+#include "xen_blkif.h"
+#include "sysemu/blockdev.h"
+#include "sysemu/iothread.h"
+#include "sysemu/block-backend.h"
+#include "qapi/error.h"
+#include "qapi/qmp/qdict.h"
+#include "qapi/qmp/qstring.h"
+#include "trace.h"
+
+/* - */
+
+#define BLOCK_SIZE  512
+#define IOCB_COUNT  (BLKIF_MAX_SEGMENTS_PER_REQUEST + 2)
+
+struct ioreq {
+    blkif_request_t     req;
+    int16_t             status;
+
+    /* parsed request */
+    off_t               start;
+    QEMUIOVector        v;
+    void                *buf;
+    size_t              size;
+    int                 presync;
+
+    /* aio status */
+    int                 aio_inflight;
+    int                 aio_errors;
+
+    struct XenBlkDev    *blkdev;
+    QLIST_ENTRY(ioreq)   list;
+    BlockAcctCookie     acct;
+};
+
+#define MAX_RING_PAGE_ORDER 4
+
+struct XenBlkDev {
+struct XenLegacyDevice xendev;  /* must be first */
+char *params;
+char *mode;
+char *type;
+char *dev;
+char *devtype;
+bool directiosafe;
+const char  *fileproto;
+const char  *filename;
+unsigned int ring_ref[1 << MAX_RING_PAGE_ORDER];
+unsigned int nr_ring_ref;
+void *sring;
+int64_t file_blk;
+int64_t file_size;
+int protocol;
+blkif_back_rings_t  rings;
+int more_work;
+
+/* request lists */
+QLIST_HEAD(inflight_head, ioreq) inflight;
+QLIST_HEAD(finished_head, ioreq) finished;
+QLIST_HEAD(freelist_head, ioreq) freelist;
+int requests_total;
+int requests_inflight;
+int requests_finished;
+unsigned int max_requests;
+
+gboolean feature_discard;
+
+/* qemu block driver */
+DriveInfo   *dinfo;
+BlockBackend *blk;
+QEMUBH  *bh;
+
+IOThread *iothread;
+AioContext  *ctx;
+};
+
+/* - */
+
+static void ioreq_reset(struct ioreq *ioreq)
+{
+memset(&ioreq->req, 0, sizeof(ioreq->req));
+ioreq->status = 0;
+ioreq->start = 0;
+ioreq->buf = NULL;
+ioreq->size = 0;
+ioreq->presync = 0;
+
+ioreq->aio_inflight = 0;
+ioreq->aio_errors = 0;
+
+ioreq->blkdev = NULL;
+memset(&ioreq->list, 0, sizeof(ioreq->list));
+memset(&ioreq->acct, 0, sizeof(ioreq->acct));
+
+

[Xen-devel] [PATCH v2 00/18] Xen PV backend 'qdevification'

2018-12-06 Thread Paul Durrant
This series introduces a new QOM compliant framework for Xen PV backends.
This is achieved by first moving the current non-compliant framework aside,
before building up a new framework incrementally.

This series was prompted by a thread [1] started by Kevin Wolf in response
to patches against xen_disk.c posted by Tim Smith. Therefore, alongside
the patches introducing the new framework, other patches build up a QOM
compliant replacement for 'xen_disk', called 'xen-qdisk'. Patch #16 swaps
this new device into place (having established a mechanism to
auto-instantiate devices that is compliant with existing Xen toolstacks in
patch #15) and patch #18 then removes the old xen_disk code.

Subsequent series will port other Xen PV backends across to the new
framework.

The series is also available as a repository branch [2] on xenbits.xen.org.

[1] https://lists.gnu.org/archive/html/qemu-devel/2018-11/msg00259.html
[2] http://xenbits.xen.org/gitweb/?p=people/pauldu/qemu.git;a=shortlog;h=refs/heads/qom27

Paul Durrant (18):
  xen: re-name XenDevice to XenLegacyDevice...
  xen: introduce new 'XenBus' and 'XenDevice' object hierarchy
  xen: introduce 'xen-block', 'xen-disk' and 'xen-cdrom'
  xen: create xenstore areas for XenDevice-s
  xen: add xenstore watcher infrastructure
  xen: add grant table interface for XenDevice-s
  xen: add event channel interface for XenDevice-s
  xen: duplicate xen_disk.c as basis of dataplane/xen-block.c
  xen: remove unnecessary code from dataplane/xen-block.c
  xen: add header and build dataplane/xen-block.c
  xen: remove 'XenBlkDev' and 'blkdev' names from dataplane/xen-block
  xen: remove 'ioreq' struct/varable/field names from
dataplane/xen-block.c
  xen: purge 'blk' and 'ioreq' from function names in
dataplane/xen-block.c
  xen: add implementations of xen-block connect and disconnect
functions...
  xen: add a mechanism to automatically create XenDevice-s...
  xen: automatically create XenBlockDevice-s
  MAINTAINERS: add myself as a Xen maintainer
  xen: remove the legacy 'xen_disk' backend

 MAINTAINERS |5 +-
 hw/9pfs/xen-9p-backend.c|   16 +-
 hw/block/Makefile.objs  |2 +-
 hw/block/dataplane/Makefile.objs|1 +
 hw/block/dataplane/xen-block.c  |  814 ++
 hw/block/dataplane/xen-block.h  |   29 +
 hw/block/trace-events   |   11 +
 hw/block/xen-block.c|  853 +++
 hw/block/xen_disk.c | 1011 
 hw/char/xen_console.c   |   12 +-
 hw/display/xenfb.c  |   25 +-
 hw/i386/xen/xen-hvm.c   |5 +-
 hw/i386/xen/xen-mapcache.c  |2 +-
 hw/i386/xen/xen_platform.c  |2 +-
 hw/net/xen_nic.c|   14 +-
 hw/usb/xen-usb.c|   25 +-
 hw/xen/Makefile.objs|2 +-
 hw/xen/trace-events |   25 +
 hw/xen/xen-backend.c|   69 +++
 hw/xen/xen-bus-helper.c |  181 ++
 hw/xen/xen-bus.c| 1094 +++
 hw/xen/xen-common.c |2 +-
 hw/xen/xen-legacy-backend.c |  853 +++
 hw/xen/xen_backend.c|  845 ---
 hw/xen/xen_devconfig.c  |2 +-
 hw/xen/xen_pt.c |2 +-
 hw/xen/xen_pt_config_init.c |2 +-
 hw/xen/xen_pt_graphics.c|2 +-
 hw/xen/xen_pt_msi.c |2 +-
 hw/xen/xen_pvdev.c  |   20 +-
 hw/xenpv/xen_domainbuild.c  |2 +-
 hw/xenpv/xen_machine_pv.c   |5 +-
 include/hw/xen/xen-backend.h|   26 +
 include/hw/xen/xen-block.h  |   79 +++
 include/hw/xen/xen-bus-helper.h |   40 ++
 include/hw/xen/xen-bus.h|  138 +
 include/hw/xen/xen-legacy-backend.h |  104 
 include/hw/xen/xen_backend.h|   99 
 include/hw/xen/xen_pvdev.h  |   38 +-
 include/qemu/module.h   |3 +
 40 files changed, 4420 insertions(+), 2042 deletions(-)
 create mode 100644 hw/block/dataplane/xen-block.c
 create mode 100644 hw/block/dataplane/xen-block.h
 create mode 100644 hw/block/xen-block.c
 delete mode 100644 hw/block/xen_disk.c
 create mode 100644 hw/xen/xen-backend.c
 create mode 100644 hw/xen/xen-bus-helper.c
 create mode 100644 hw/xen/xen-bus.c
 create mode 100644 hw/xen/xen-legacy-backend.c
 delete mode 100644 hw/xen/xen_backend.c
 create mode 100644 include/hw/xen/xen-backend.h
 create mode 100644 include/hw/xen/xen-block.h
 create mode 100644 include/hw/xen/xen-bus-helper.h
 create mode 100644 include/hw/xen/xen-bus.h
 create mode 100644 include/hw/xen/xen-legacy-backend.h
 delete mode 100644 include/hw/xen/xen_backend.h
---
Cc: Anthony Perard 
Cc: Eduardo Habkost 
Cc: Gerd Hoffmann 
Cc: Greg Kurz 
Cc: Jason Wang 
Cc: Kevin Wolf 
Cc: "Marc-André Lureau" 

[Xen-devel] [PATCH v2 03/18] xen: introduce 'xen-block', 'xen-disk' and 'xen-cdrom'

2018-12-06 Thread Paul Durrant
This patch adds new XenDevice-s: 'xen-disk' and 'xen-cdrom', both derived
from a common 'xen-block' parent type. These will eventually replace the
'xen_disk' (note the underscore rather than hyphen) legacy PV backend but
it is illustrative to build up the implementation incrementally, along with
the XenBus/XenDevice framework. Subsequent patches will therefore add to
these devices' implementation as new features are added to the framework.

After this patch has been applied it is possible to instantiate new
'xen-disk' or 'xen-cdrom' devices with a single 'vdev' parameter, which
accepts values adhering to the Xen VBD naming scheme [1]. For example, a
command-line instantiation of a xen-disk can be done with an argument
similar to the following:

-device xen-disk,vdev=hda

The implementation of the vdev parameter formulates the appropriate VBD
number for use in the PV protocol.

[1] https://xenbits.xen.org/docs/unstable/man/xen-vbd-interface.7.html

Signed-off-by: Paul Durrant 
---
Cc: Kevin Wolf 
Cc: Max Reitz 
Cc: Stefano Stabellini 
Cc: Anthony Perard 

v2:
 - Fix boilerplate
 - Fix vdev parsing
 - Change name from 'xen-qdisk' to 'xen-block', make abstract, and split
   off 'xen-disk' and 'xen-cdrom' as concrete sub-types
---
 MAINTAINERS|   2 +-
 hw/block/Makefile.objs |   1 +
 hw/block/trace-events  |   8 ++
 hw/block/xen-block.c   | 347 +
 include/hw/xen/xen-block.h |  69 +
 5 files changed, 426 insertions(+), 1 deletion(-)
 create mode 100644 hw/block/xen-block.c
 create mode 100644 include/hw/xen/xen-block.h

diff --git a/MAINTAINERS b/MAINTAINERS
index 63effdc..dd728c3 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -403,7 +403,7 @@ F: hw/9pfs/xen-9p-backend.c
 F: hw/char/xen_console.c
 F: hw/display/xenfb.c
 F: hw/net/xen_nic.c
-F: hw/block/xen_*
+F: hw/block/xen*
 F: hw/xen/
 F: hw/xenpv/
 F: hw/i386/xen/
diff --git a/hw/block/Makefile.objs b/hw/block/Makefile.objs
index 53ce575..f34813a 100644
--- a/hw/block/Makefile.objs
+++ b/hw/block/Makefile.objs
@@ -4,6 +4,7 @@ common-obj-$(CONFIG_SSI_M25P80) += m25p80.o
 common-obj-$(CONFIG_NAND) += nand.o
 common-obj-$(CONFIG_PFLASH_CFI01) += pflash_cfi01.o
 common-obj-$(CONFIG_PFLASH_CFI02) += pflash_cfi02.o
+common-obj-$(CONFIG_XEN) += xen-block.o
 common-obj-$(CONFIG_XEN) += xen_disk.o
 common-obj-$(CONFIG_ECC) += ecc.o
 common-obj-$(CONFIG_ONENAND) += onenand.o
diff --git a/hw/block/trace-events b/hw/block/trace-events
index 335c092..4afbd62 100644
--- a/hw/block/trace-events
+++ b/hw/block/trace-events
@@ -127,3 +127,11 @@ xen_disk_init(char *name) "%s"
 xen_disk_connect(char *name) "%s"
 xen_disk_disconnect(char *name) "%s"
 xen_disk_free(char *name) "%s"
+
+# hw/block/xen-block.c
+xen_block_realize(const char *type, uint32_t disk, uint32_t partition) "%s d%up%u"
+xen_block_unrealize(const char *type, uint32_t disk, uint32_t partition) "%s d%up%u"
+xen_disk_realize(void) ""
+xen_disk_unrealize(void) ""
+xen_cdrom_realize(void) ""
+xen_cdrom_unrealize(void) ""
diff --git a/hw/block/xen-block.c b/hw/block/xen-block.c
new file mode 100644
index 000..78f4218
--- /dev/null
+++ b/hw/block/xen-block.c
@@ -0,0 +1,347 @@
+/*
+ * Copyright (c) 2018  Citrix Systems Inc.
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/cutils.h"
+#include "qapi/error.h"
+#include "qapi/visitor.h"
+#include "hw/hw.h"
+#include "hw/xen/xen-block.h"
+#include "trace.h"
+
+static void xen_block_unrealize(XenDevice *xendev, Error **errp)
+{
+XenBlockDevice *blockdev = XEN_BLOCK_DEVICE(xendev);
+XenBlockDeviceClass *blockdev_class =
+XEN_BLOCK_DEVICE_GET_CLASS(xendev);
+const char *type = object_get_typename(OBJECT(blockdev));
+XenBlockVdev *vdev = &blockdev->vdev;
+Error *local_err = NULL;
+
+if (vdev->type == XEN_BLOCK_VDEV_TYPE_INVALID) {
+return;
+}
+
+trace_xen_block_unrealize(type, vdev->disk, vdev->partition);
+
+if (blockdev_class->unrealize) {
+blockdev_class->unrealize(blockdev, &local_err);
+if (local_err) {
+error_propagate(errp, local_err);
+}
+}
+}
+
+static void xen_block_realize(XenDevice *xendev, Error **errp)
+{
+XenBlockDevice *blockdev = XEN_BLOCK_DEVICE(xendev);
+XenBlockDeviceClass *blockdev_class =
+XEN_BLOCK_DEVICE_GET_CLASS(xendev);
+const char *type = object_get_typename(OBJECT(blockdev));
+XenBlockVdev *vdev = &blockdev->vdev;
+Error *local_err = NULL;
+
+if (vdev->type == XEN_BLOCK_VDEV_TYPE_INVALID) {
+error_setg(errp, "vdev property not set");
+return;
+}
+
+trace_xen_block_realize(type, vdev->disk, vdev->partition);
+
+if (blockdev_class->realize) {
+blockdev_class->realize(blockdev, &local_err);
+if (local_err) {
+error_propagate(errp, local_err);
+}
+}
+}
+
+static char 

[Xen-devel] [PATCH v2 06/18] xen: add grant table interface for XenDevice-s

2018-12-06 Thread Paul Durrant
The legacy PV backend infrastructure provides functions to map, unmap and
copy pages granted by frontends. Similar functionality will be required
by XenDevice implementations so this patch adds the necessary support.
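
The copy path can be pictured with the following minimal sketch, which models only the map-then-memcpy fallback used when a native grant copy is unavailable. It is an illustration under stated assumptions: a plain byte array stands in for the pages that would really be mapped through libxengnttab, and struct copy_seg, copy_segs and SKETCH_PAGE_SIZE are hypothetical names rather than anything defined by this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096u  /* stand-in for the real page size */

/* Hypothetical, simplified segment: one segment per mapped page. */
struct copy_seg {
    size_t offset;  /* byte offset within the granted page */
    void *virt;     /* local (backend) buffer */
    size_t len;     /* number of bytes to copy */
};

/*
 * Copy each segment between its local buffer and the i-th page of 'map'.
 * to_domain != 0 copies local -> granted page, otherwise the reverse.
 */
static void copy_segs(unsigned char *map, int to_domain,
                      const struct copy_seg *segs, size_t nr_segs)
{
    size_t i;

    for (i = 0; i < nr_segs; i++) {
        const struct copy_seg *seg = &segs[i];
        unsigned char *page = map + i * SKETCH_PAGE_SIZE;

        /* A segment must not cross the end of its page. */
        assert(seg->offset + seg->len <= SKETCH_PAGE_SIZE);

        if (to_domain) {
            memcpy(page + seg->offset, seg->virt, seg->len);
        } else {
            memcpy(seg->virt, page + seg->offset, seg->len);
        }
    }
}
```

A caller would map the granted pages, run the loop, and then unmap them; only the middle step is shown here.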

Signed-off-by: Paul Durrant 
Reviewed-by: Anthony Perard 
---
Cc: Stefano Stabellini 
---
 hw/xen/xen-bus.c | 146 +++
 include/hw/xen/xen-bus.h |  25 
 2 files changed, 171 insertions(+)

diff --git a/hw/xen/xen-bus.c b/hw/xen/xen-bus.c
index f0732f8..b40dc83 100644
--- a/hw/xen/xen-bus.c
+++ b/hw/xen/xen-bus.c
@@ -489,6 +489,138 @@ static void xen_device_frontend_destroy(XenDevice *xendev)
 }
 }
 
+void xen_device_set_max_grant_refs(XenDevice *xendev, unsigned int nr_refs,
+   Error **errp)
+{
+if (xengnttab_set_max_grants(xendev->xgth, nr_refs)) {
+error_setg_errno(errp, errno, "xengnttab_set_max_grants failed");
+}
+}
+
+void *xen_device_map_grant_refs(XenDevice *xendev, uint32_t *refs,
+unsigned int nr_refs, int prot,
+Error **errp)
+{
+void *map = xengnttab_map_domain_grant_refs(xendev->xgth, nr_refs,
+xendev->frontend_id, refs,
+prot);
+
+if (!map) {
+error_setg_errno(errp, errno,
+ "xengnttab_map_domain_grant_refs failed");
+}
+
+return map;
+}
+
+void xen_device_unmap_grant_refs(XenDevice *xendev, void *map,
+ unsigned int nr_refs, Error **errp)
+{
+if (xengnttab_unmap(xendev->xgth, map, nr_refs)) {
+error_setg_errno(errp, errno, "xengnttab_unmap failed");
+}
+}
+
+static void compat_copy_grant_refs(XenDevice *xendev, bool to_domain,
+   XenDeviceGrantCopySegment segs[],
+   unsigned int nr_segs, Error **errp)
+{
+uint32_t *refs = g_new(uint32_t, nr_segs);
+int prot = to_domain ? PROT_WRITE : PROT_READ;
+void *map;
+unsigned int i;
+
+for (i = 0; i < nr_segs; i++) {
+XenDeviceGrantCopySegment *seg = &segs[i];
+
+refs[i] = to_domain ? seg->dest.foreign.ref :
+seg->source.foreign.ref;
+}
+
+map = xengnttab_map_domain_grant_refs(xendev->xgth, nr_segs,
+  xendev->frontend_id, refs,
+  prot);
+if (!map) {
+error_setg_errno(errp, errno,
+ "xengnttab_map_domain_grant_refs failed");
+goto done;
+}
+
+for (i = 0; i < nr_segs; i++) {
+XenDeviceGrantCopySegment *seg = &segs[i];
+void *page = map + (i * XC_PAGE_SIZE);
+
+if (to_domain) {
+memcpy(page + seg->dest.foreign.offset, seg->source.virt,
+   seg->len);
+} else {
+memcpy(seg->dest.virt, page + seg->source.foreign.offset,
+   seg->len);
+}
+}
+
+if (xengnttab_unmap(xendev->xgth, map, nr_segs)) {
+error_setg_errno(errp, errno, "xengnttab_unmap failed");
+}
+
+done:
+g_free(refs);
+}
+
+void xen_device_copy_grant_refs(XenDevice *xendev, bool to_domain,
+XenDeviceGrantCopySegment segs[],
+unsigned int nr_segs, Error **errp)
+{
+xengnttab_grant_copy_segment_t *xengnttab_segs;
+unsigned int i;
+
+if (!xendev->feature_grant_copy) {
+compat_copy_grant_refs(xendev, to_domain, segs, nr_segs, errp);
+return;
+}
+
+xengnttab_segs = g_new0(xengnttab_grant_copy_segment_t, nr_segs);
+
+for (i = 0; i < nr_segs; i++) {
+XenDeviceGrantCopySegment *seg = &segs[i];
+xengnttab_grant_copy_segment_t *xengnttab_seg = &xengnttab_segs[i];
+
+if (to_domain) {
+xengnttab_seg->flags = GNTCOPY_dest_gref;
+xengnttab_seg->dest.foreign.domid = xendev->frontend_id;
+xengnttab_seg->dest.foreign.ref = seg->dest.foreign.ref;
+xengnttab_seg->dest.foreign.offset = seg->dest.foreign.offset;
+xengnttab_seg->source.virt = seg->source.virt;
+} else {
+xengnttab_seg->flags = GNTCOPY_source_gref;
+xengnttab_seg->source.foreign.domid = xendev->frontend_id;
+xengnttab_seg->source.foreign.ref = seg->source.foreign.ref;
+xengnttab_seg->source.foreign.offset =
+seg->source.foreign.offset;
+xengnttab_seg->dest.virt = seg->dest.virt;
+}
+
+xengnttab_seg->len = seg->len;
+}
+
+if (xengnttab_grant_copy(xendev->xgth, nr_segs, xengnttab_segs)) {
+error_setg_errno(errp, errno, "xengnttab_grant_copy failed");
+goto done;
+}
+
+for (i = 0; i < nr_segs; i++) {
+xengnttab_grant_copy_segment_t *xengnttab_seg = &xengnttab_segs[i];
+
+if (xengnttab_seg->status != GNTST_okay) {
+   

[Xen-devel] [PATCH v2 04/18] xen: create xenstore areas for XenDevice-s

2018-12-06 Thread Paul Durrant
This patch adds a new source module, xen-bus-helper.c, which builds on
basic libxenstore primitives to provide functions to create (setting
permissions appropriately) and destroy xenstore areas, and functions to
'printf' and 'scanf' nodes therein. The main xen-bus code then uses
these primitives [1] to initialize and destroy the frontend and backend
areas for a XenDevice during realize and unrealize respectively.
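
To make the 'printf' idea concrete, here is a minimal sketch of a printf-like node helper. It is not the xs_node_* code from this patch: struct fake_node and fake_node_printf are hypothetical, the value is written to an in-memory buffer rather than to xenstore, and the format attribute is spelled in its GCC/Clang form instead of using QEMU's GCC_FMT_ATTR macro.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical in-memory stand-in for a single xenstore entry. */
struct fake_node {
    char path[128];
    char value[128];
};

/* The attribute lets the compiler check fmt against the variadic arguments. */
static int fake_node_printf(struct fake_node *out, const char *node,
                            const char *key, const char *fmt, ...)
    __attribute__((format(printf, 4, 5)));

/*
 * Build "<node>/<key>" as the path and format the value into 'out'.
 * Returns 0 on success or -1 if either string would be truncated.
 */
static int fake_node_printf(struct fake_node *out, const char *node,
                            const char *key, const char *fmt, ...)
{
    va_list ap;
    int len;

    assert(out != NULL && node != NULL && fmt != NULL);
    assert(key != NULL && strlen(key) > 0);

    len = snprintf(out->path, sizeof(out->path), "%s/%s", node, key);
    if (len < 0 || (size_t)len >= sizeof(out->path)) {
        return -1;
    }

    va_start(ap, fmt);
    len = vsnprintf(out->value, sizeof(out->value), fmt, ap);
    va_end(ap);
    if (len < 0 || (size_t)len >= sizeof(out->value)) {
        return -1;
    }

    return 0;
}
```

A real helper would hand the resulting path and value to libxenstore and report failures through an Error pointer; the sketch stops at string construction.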

The 'xen-block' implementation is extended with a 'get_name' method that
returns the VBD number. This number is required to 'name' the xenstore
areas.

NOTE: An exit handler is also added to make sure the xenstore areas are
  cleaned up if QEMU terminates without devices being unrealized.

[1] The 'scanf' functions are actually not yet needed, but they will be
needed by code delivered in subsequent patches.

Signed-off-by: Paul Durrant 
---
Cc: Stefano Stabellini 
Cc: Anthony Perard 
Cc: Kevin Wolf 
Cc: Max Reitz 

v2:
 - Fix boilerplate
 - Add error pointers to all xs_node... helpers
 - Add GCC_FMT_ATTR to declarations of printf-like helpers
---
 hw/block/xen-block.c|   9 ++
 hw/xen/Makefile.objs|   2 +-
 hw/xen/trace-events |  12 +-
 hw/xen/xen-bus-helper.c | 147 ++
 hw/xen/xen-bus.c| 319 +++-
 include/hw/xen/xen-bus-helper.h |  34 +
 include/hw/xen/xen-bus.h|  12 ++
 7 files changed, 530 insertions(+), 5 deletions(-)
 create mode 100644 hw/xen/xen-bus-helper.c
 create mode 100644 include/hw/xen/xen-bus-helper.h

diff --git a/hw/block/xen-block.c b/hw/block/xen-block.c
index 78f4218..440bec2 100644
--- a/hw/block/xen-block.c
+++ b/hw/block/xen-block.c
@@ -13,6 +13,14 @@
 #include "hw/xen/xen-block.h"
 #include "trace.h"
 
+static char *xen_block_get_name(XenDevice *xendev, Error **errp)
+{
+XenBlockDevice *blockdev = XEN_BLOCK_DEVICE(xendev);
+XenBlockVdev *vdev = &blockdev->vdev;
+
+return g_strdup_printf("%lu", vdev->number);
+}
+
 static void xen_block_unrealize(XenDevice *xendev, Error **errp)
 {
 XenBlockDevice *blockdev = XEN_BLOCK_DEVICE(xendev);
@@ -266,6 +274,7 @@ static void xen_block_class_init(ObjectClass *class, void *data)
 DeviceClass *dev_class = DEVICE_CLASS(class);
 XenDeviceClass *xendev_class = XEN_DEVICE_CLASS(class);
 
+xendev_class->get_name = xen_block_get_name;
 xendev_class->realize = xen_block_realize;
 xendev_class->unrealize = xen_block_unrealize;
 
diff --git a/hw/xen/Makefile.objs b/hw/xen/Makefile.objs
index d9d6d7b..77c0868 100644
--- a/hw/xen/Makefile.objs
+++ b/hw/xen/Makefile.objs
@@ -1,5 +1,5 @@
 # xen backend driver support
-common-obj-$(CONFIG_XEN) += xen-legacy-backend.o xen_devconfig.o xen_pvdev.o xen-common.o xen-bus.o
+common-obj-$(CONFIG_XEN) += xen-legacy-backend.o xen_devconfig.o xen_pvdev.o xen-common.o xen-bus.o xen-bus-helper.o
 
 obj-$(CONFIG_XEN_PCI_PASSTHROUGH) += xen-host-pci-device.o
 obj-$(CONFIG_XEN_PCI_PASSTHROUGH) += xen_pt.o xen_pt_config_init.o xen_pt_graphics.o xen_pt_msi.o
diff --git a/hw/xen/trace-events b/hw/xen/trace-events
index 0172cd4..75dc226 100644
--- a/hw/xen/trace-events
+++ b/hw/xen/trace-events
@@ -16,5 +16,13 @@ xen_domid_restrict(int err) "err: %u"
 # include/hw/xen/xen-bus.c
 xen_bus_realize(void) ""
 xen_bus_unrealize(void) ""
-xen_device_realize(const char *type) "type: %s"
-xen_device_unrealize(const char *type) "type: %s"
+xen_device_realize(const char *type, char *name) "type: %s name: %s"
+xen_device_unrealize(const char *type, char *name) "type: %s name: %s"
+xen_device_backend_state(const char *type, char *name, const char *state) "type: %s name: %s -> %s"
+xen_device_frontend_state(const char *type, char *name, const char *state) "type: %s name: %s -> %s"
+
+# include/hw/xen/xen-bus-helper.c
+xs_node_create(const char *node) "%s"
+xs_node_destroy(const char *node) "%s"
+xs_node_vprintf(char *path, char *value) "%s %s"
+xs_node_vscanf(char *path, char *value) "%s %s"
diff --git a/hw/xen/xen-bus-helper.c b/hw/xen/xen-bus-helper.c
new file mode 100644
index 000..2304f8e
--- /dev/null
+++ b/hw/xen/xen-bus-helper.c
@@ -0,0 +1,147 @@
+/*
+ * Copyright (c) 2018  Citrix Systems Inc.
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#include "qemu/osdep.h"
+#include "hw/hw.h"
+#include "hw/sysbus.h"
+#include "hw/xen/xen.h"
+#include "hw/xen/xen-bus.h"
+#include "hw/xen/xen-bus-helper.h"
+#include "qapi/error.h"
+
+#include 
+
+struct xs_state {
+enum xenbus_state statenum;
+const char *statestr;
+};
+#define XS_STATE(state) { state, #state }
+
+static struct xs_state xs_state[] = {
+XS_STATE(XenbusStateUnknown),
+XS_STATE(XenbusStateInitialising),
+XS_STATE(XenbusStateInitWait),
+XS_STATE(XenbusStateInitialised),
+XS_STATE(XenbusStateConnected),
+XS_STATE(XenbusStateClosing),
+XS_STATE(XenbusStateClosed),
+

[Xen-devel] [PATCH v2 09/18] xen: remove unnecessary code from dataplane/xen-block.c

2018-12-06 Thread Paul Durrant
Not all of the code duplicated from xen_disk.c is required as the basis for
the new dataplane implementation so this patch removes extraneous code,
along with the legacy #includes and calls to the legacy xen_pv_printf()
function. Error messages are changed to be reported using error_report().

NOTE: The code is still not yet built. Further transformations will be
  required to make it correctly interface to the new XenBus/XenDevice
  framework. They will be delivered in a subsequent patch.

Signed-off-by: Paul Durrant 
Acked-by: Anthony Perard 
---
Cc: Stefano Stabellini 
Cc: Stefan Hajnoczi 
Cc: Kevin Wolf 
Cc: Max Reitz 

v2:
 - Leave existing boilerplate alone, other than removing the now-incorrect
   description
---
 hw/block/dataplane/xen-block.c | 409 ++---
 1 file changed, 16 insertions(+), 393 deletions(-)

diff --git a/hw/block/dataplane/xen-block.c b/hw/block/dataplane/xen-block.c
index 9fae505..98f987d 100644
--- a/hw/block/dataplane/xen-block.c
+++ b/hw/block/dataplane/xen-block.c
@@ -1,6 +1,4 @@
 /*
- *  xen paravirt block device backend
- *
  *  (c) Gerd Hoffmann 
  *
  *  This program is free software; you can redistribute it and/or modify
@@ -19,26 +17,12 @@
  *  GNU GPL, version 2 or (at your option) any later version.
  */
 
-#include "qemu/osdep.h"
-#include "qemu/units.h"
-#include 
-#include 
-
-#include "hw/hw.h"
-#include "hw/xen/xen_backend.h"
-#include "xen_blkif.h"
-#include "sysemu/blockdev.h"
-#include "sysemu/iothread.h"
-#include "sysemu/block-backend.h"
-#include "qapi/error.h"
-#include "qapi/qmp/qdict.h"
-#include "qapi/qmp/qstring.h"
-#include "trace.h"
-
-/* - */
-
-#define BLOCK_SIZE  512
-#define IOCB_COUNT  (BLKIF_MAX_SEGMENTS_PER_REQUEST + 2)
+/*
+ * Copyright (c) 2018  Citrix Systems Inc.
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
 
 struct ioreq {
 blkif_request_t req;
@@ -101,8 +85,6 @@ struct XenBlkDev {
 AioContext  *ctx;
 };
 
-/* - */
-
 static void ioreq_reset(struct ioreq *ioreq)
 {
 memset(&ioreq->req, 0, sizeof(ioreq->req));
@@ -183,11 +165,6 @@ static int ioreq_parse(struct ioreq *ioreq)
 size_t len;
 int i;
 
-xen_pv_printf(
-xendev, 3,
-"op %d, nr %d, handle %d, id %" PRId64 ", sector %" PRId64 "\n",
-ioreq->req.operation, ioreq->req.nr_segments,
-ioreq->req.handle, ioreq->req.id, ioreq->req.sector_number);
 switch (ioreq->req.operation) {
 case BLKIF_OP_READ:
 break;
@@ -202,28 +179,27 @@ static int ioreq_parse(struct ioreq *ioreq)
 case BLKIF_OP_DISCARD:
 return 0;
 default:
-xen_pv_printf(xendev, 0, "error: unknown operation (%d)\n",
-  ioreq->req.operation);
+error_report("error: unknown operation (%d)", ioreq->req.operation);
 goto err;
 };
 
 if (ioreq->req.operation != BLKIF_OP_READ && blkdev->mode[0] != 'w') {
-xen_pv_printf(xendev, 0, "error: write req for ro device\n");
+error_report("error: write req for ro device");
 goto err;
 }
 
 ioreq->start = ioreq->req.sector_number * blkdev->file_blk;
 for (i = 0; i < ioreq->req.nr_segments; i++) {
 if (i == BLKIF_MAX_SEGMENTS_PER_REQUEST) {
-xen_pv_printf(xendev, 0, "error: nr_segments too big\n");
+error_report("error: nr_segments too big");
 goto err;
 }
 if (ioreq->req.seg[i].first_sect > ioreq->req.seg[i].last_sect) {
-xen_pv_printf(xendev, 0, "error: first > last sector\n");
+error_report("error: first > last sector");
 goto err;
 }
 if (ioreq->req.seg[i].last_sect * BLOCK_SIZE >= XC_PAGE_SIZE) {
-xen_pv_printf(xendev, 0, "error: page crossing\n");
+error_report("error: page crossing");
 goto err;
 }
 
@@ -232,7 +208,7 @@ static int ioreq_parse(struct ioreq *ioreq)
 ioreq->size += len;
 }
 if (ioreq->start + ioreq->size > blkdev->file_size) {
-xen_pv_printf(xendev, 0, "error: access beyond end of file\n");
+error_report("error: access beyond end of file");
 goto err;
 }
 return 0;
@@ -278,8 +254,7 @@ static int ioreq_grant_copy(struct ioreq *ioreq)
 rc = xen_be_copy_grant_refs(xendev, to_domain, segs, count);
 
 if (rc) {
-xen_pv_printf(xendev, 0,
-  "failed to copy data %d\n", rc);
+error_report("failed to copy data %d", rc);
 ioreq->aio_errors++;
 return -1;
 }
@@ -298,8 +273,9 @@ static void qemu_aio_complete(void *opaque, int ret)
 aio_context_acquire(blkdev->ctx);
 
 if (ret != 0) {
-xen_pv_printf(xendev, 0, "%s I/O error\n",
-  ioreq->req.operation == 

[Xen-devel] [PATCH v2 07/18] xen: add event channel interface for XenDevice-s

2018-12-06 Thread Paul Durrant
The legacy PV backend infrastructure provides functions to bind, unbind
and send notifications to event channels. Similar functionality will be
required by XenDevice implementations so this patch adds the necessary
support.
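
The dispatch model behind this can be sketched as follows: every bound channel registers a handler, and when a port becomes pending each registered channel compares the port against its own before invoking that handler. This is a simplified sketch with hypothetical names (struct channel, channel_add, dispatch_port, count_event); it uses a hand-rolled linked list in place of QEMU's notifier list and does not touch libxenevtchn.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*event_handler)(void *opaque);

/* Hypothetical, simplified channel record. */
struct channel {
    unsigned int local_port;  /* port this channel is bound to */
    event_handler handler;    /* callback run when the port fires */
    void *opaque;             /* caller-supplied context */
    struct channel *next;
};

/* Push a channel onto the front of the list. */
static void channel_add(struct channel **head, struct channel *ch)
{
    assert(head != NULL && ch != NULL);
    ch->next = *head;
    *head = ch;
}

/* Run the handler of every channel bound to 'port'; return how many ran. */
static unsigned int dispatch_port(struct channel *head, unsigned int port)
{
    unsigned int ran = 0;
    struct channel *ch;

    for (ch = head; ch != NULL; ch = ch->next) {
        if (ch->local_port == port) {
            ch->handler(ch->opaque);
            ran++;
        }
    }

    return ran;
}

/* Example handler: treats opaque as a counter and increments it. */
static void count_event(void *opaque)
{
    (*(unsigned int *)opaque)++;
}
```

In the real code the pending port would also be unmasked after dispatch so that further events can be delivered.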

Signed-off-by: Paul Durrant 
---
Cc: Stefano Stabellini 
Cc: Anthony Perard 

v2:
 - Added error pointers to notify and unbind
---
 hw/xen/xen-bus.c | 101 +++
 include/hw/xen/xen-bus.h |  18 +
 2 files changed, 119 insertions(+)

diff --git a/hw/xen/xen-bus.c b/hw/xen/xen-bus.c
index b40dc83..0e6f194 100644
--- a/hw/xen/xen-bus.c
+++ b/hw/xen/xen-bus.c
@@ -621,6 +621,81 @@ done:
 g_free(xengnttab_segs);
 }
 
+struct XenEventChannel {
+unsigned int local_port;
+XenEventHandler handler;
+void *opaque;
+Notifier notifier;
+};
+
+static void event_notify(Notifier *n, void *data)
+{
+XenEventChannel *channel = container_of(n, XenEventChannel, notifier);
+unsigned long port = (unsigned long)data;
+
+if (port == channel->local_port) {
+channel->handler(channel->opaque);
+}
+}
+
+XenEventChannel *xen_device_bind_event_channel(XenDevice *xendev,
+   unsigned int port,
+   XenEventHandler handler,
+   void *opaque, Error **errp)
+{
+XenEventChannel *channel = g_new0(XenEventChannel, 1);
+
+channel->local_port = xenevtchn_bind_interdomain(xendev->xeh,
+ xendev->frontend_id,
+ port);
+if (xendev->local_port < 0) {
+error_setg_errno(errp, errno, "xenevtchn_bind_interdomain failed");
+
+g_free(channel);
+return NULL;
+}
+
+channel->handler = handler;
+channel->opaque = opaque;
+channel->notifier.notify = event_notify;
+
+notifier_list_add(&xendev->event_notifiers, &channel->notifier);
+
+return channel;
+}
+
+void xen_device_notify_event_channel(XenDevice *xendev,
+ XenEventChannel *channel,
+ Error **errp)
+{
+if (!channel) {
+error_setg(errp, "bad channel");
+return;
+}
+
+if (xenevtchn_notify(xendev->xeh, channel->local_port) < 0) {
+error_setg_errno(errp, errno, "xenevtchn_notify failed");
+}
+}
+
+void xen_device_unbind_event_channel(XenDevice *xendev,
+ XenEventChannel *channel,
+ Error **errp)
+{
+if (!channel) {
+error_setg(errp, "bad channel");
+return;
+}
+
+notifier_remove(&channel->notifier);
+
+if (xenevtchn_unbind(xendev->xeh, channel->local_port) < 0) {
+error_setg_errno(errp, errno, "xenevtchn_unbind failed");
+}
+
+g_free(channel);
+}
+
 static void xen_device_unrealize(DeviceState *dev, Error **errp)
 {
 XenDevice *xendev = XEN_DEVICE(dev);
@@ -649,6 +724,12 @@ static void xen_device_unrealize(DeviceState *dev, Error **errp)
 xen_device_frontend_destroy(xendev);
 xen_device_backend_destroy(xendev);
 
+if (xendev->xeh) {
+qemu_set_fd_handler(xenevtchn_fd(xendev->xeh), NULL, NULL, NULL);
+xenevtchn_close(xendev->xeh);
+xendev->xeh = NULL;
+}
+
 if (xendev->xgth) {
 xengnttab_close(xendev->xgth);
 xendev->xgth = NULL;
@@ -665,6 +746,16 @@ static void xen_device_exit(Notifier *n, void *data)
 xen_device_unrealize(DEVICE(xendev), &error_abort);
 }
 
+static void xen_device_event(void *opaque)
+{
+XenDevice *xendev = opaque;
+unsigned long port = xenevtchn_pending(xendev->xeh);
+
+notifier_list_notify(&xendev->event_notifiers, (void *)port);
+
+xenevtchn_unmask(xendev->xeh, port);
+}
+
 static void xen_device_realize(DeviceState *dev, Error **errp)
 {
 XenDevice *xendev = XEN_DEVICE(dev);
@@ -705,6 +796,16 @@ static void xen_device_realize(DeviceState *dev, Error **errp)
 xendev->feature_grant_copy =
 (xengnttab_grant_copy(xendev->xgth, 0, NULL) == 0);
 
+xendev->xeh = xenevtchn_open(NULL, 0);
+if (!xendev->xeh) {
+error_setg_errno(errp, errno, "failed xenevtchn_open");
+goto unrealize;
+}
+
+notifier_list_init(&xendev->event_notifiers);
+qemu_set_fd_handler(xenevtchn_fd(xendev->xeh), xen_device_event, NULL,
+xendev);
+
 xen_device_backend_create(xendev, &local_err);
 if (local_err) {
 error_propagate(errp, local_err);
diff --git a/include/hw/xen/xen-bus.h b/include/hw/xen/xen-bus.h
index 63a09b6..f83a95c 100644
--- a/include/hw/xen/xen-bus.h
+++ b/include/hw/xen/xen-bus.h
@@ -26,6 +26,9 @@ typedef struct XenDevice {
 XenWatch *frontend_state_watch;
 xengnttab_handle *xgth;
 bool feature_grant_copy;
+xenevtchn_handle *xeh;
+xenevtchn_port_or_error_t local_port;
+NotifierList event_notifiers;
 } XenDevice;
 
 typedef char 

Re: [Xen-devel] [PATCH v3 3/4] iommu: elide flushing for higher order map/unmap operations

2018-12-06 Thread Jan Beulich
>>> On 05.12.18 at 12:29,  wrote:
> --- a/xen/common/memory.c
> +++ b/xen/common/memory.c
> @@ -865,11 +865,15 @@ int xenmem_add_to_physmap(struct domain *d, struct 
> xen_add_to_physmap *xatp,
>  
>  this_cpu(iommu_dont_flush_iotlb) = 0;
>  
> -ret = iommu_flush(d, _dfn(xatp->idx - done), done);
> +ret = iommu_iotlb_flush(d, _dfn(xatp->idx - done), done,
> +IOMMU_FLUSHF_added |
> +IOMMU_FLUSHF_modified);

No need to split these last two lines afaict, nor ...

>  if ( unlikely(ret) && rc >= 0 )
>  rc = ret;
>  
> -ret = iommu_flush(d, _dfn(xatp->gpfn - done), done);
> +ret = iommu_iotlb_flush(d, _dfn(xatp->gpfn - done), done,
> +IOMMU_FLUSHF_added |
> +IOMMU_FLUSHF_modified);

... these.

> @@ -573,18 +589,17 @@ int amd_iommu_map_page(struct domain *d, dfn_t dfn, 
> mfn_t mfn,
>  }
>  
>  /* Install 4k mapping */
> -need_flush = set_iommu_pte_present(pt_mfn[1], dfn_x(dfn), mfn_x(mfn), 1,
> -   !!(flags & IOMMUF_writable),
> -   !!(flags & IOMMUF_readable));
> -
> -if ( need_flush )
> -amd_iommu_flush_pages(d, dfn_x(dfn), 0);
> +*flush_flags |= set_iommu_pte_present(pt_mfn[1], dfn_x(dfn), mfn_x(mfn),
> +  1, !!(flags & IOMMUF_writable),
> +  !!(flags & IOMMUF_readable));

I don't think the !! here need retaining.

> @@ -235,6 +236,10 @@ void __hwdom_init iommu_hwdom_init(struct domain *d)
>  process_pending_softirqs();
>  }
>  
> +/* Use while-break to avoid compiler warning */
> +while ( !iommu_iotlb_flush_all(d, flush_flags) )
> +break;

With just the "break;" as body, what's the ! good for?

> @@ -320,7 +326,8 @@ int iommu_legacy_map(struct domain *d, dfn_t dfn, mfn_t 
> mfn,
>  for ( i = 0; i < (1ul << page_order); i++ )
>  {
>  rc = hd->platform_ops->map_page(d, dfn_add(dfn, i),
> -mfn_add(mfn, i), flags);
> +mfn_add(mfn, i), flags,
> +flush_flags);

Again no need for two lines here as it seems.

> @@ -345,7 +353,20 @@ int iommu_legacy_map(struct domain *d, dfn_t dfn, mfn_t 
> mfn,
>  return rc;
>  }
>  
> -int iommu_legacy_unmap(struct domain *d, dfn_t dfn, unsigned int page_order)
> +int iommu_legacy_map(struct domain *d, dfn_t dfn, mfn_t mfn,
> + unsigned int page_order, unsigned int flags)
> +{
> +unsigned int flush_flags = 0;
> +int rc = iommu_map(d, dfn, mfn, page_order, flags, _flags);
> +
> +if ( !rc && !this_cpu(iommu_dont_flush_iotlb) )
> +rc = iommu_iotlb_flush(d, dfn, (1u << page_order), flush_flags);

The question was raised in a different context (but iirc this same
series) already: Is it correct to skip flushing when failure occurred
on other than the first page of a set? There's no rollback afaict,
and even if there was the transiently available mappings would
then still need purging. Same on the unmap side then. (Note that
this is different from the arch_iommu_populate_page_table()
case, where I/O can't be initiated yet by the guest.)

> @@ -241,8 +245,10 @@ void __hwdom_init arch_iommu_hwdom_init(struct domain *d)
>  if ( paging_mode_translate(d) )
>  rc = set_identity_p2m_entry(d, pfn, p2m_access_rw, 0);
>  else
> -rc = iommu_legacy_map(d, _dfn(pfn), _mfn(pfn), PAGE_ORDER_4K,
> -  IOMMUF_readable | IOMMUF_writable);
> +rc = iommu_map(d, _dfn(pfn), _mfn(pfn), PAGE_ORDER_4K,
> +   IOMMUF_readable | IOMMUF_writable,
> +   _flags);

Again overly aggressive line wrapping?

Jan


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v3 2/4] iommu: rename wrapper functions

2018-12-06 Thread Andrew Cooper
On 06/12/2018 14:44, Jan Beulich wrote:
 On 05.12.18 at 12:29,  wrote:
>> A subsequent patch will add semantically different versions of
>> iommu_map/unmap() so, in advance of that change, this patch renames the
>> existing functions to iommu_legacy_map/unmap() and modifies all call-sites.
>> It also adjusts a comment that refers to iommu_map_page(), which was re-
>> named by a previous patch.
>>
>> This patch is purely cosmetic. No functional change.
>>
>> Signed-off-by: Paul Durrant 
> Acked-by: Jan Beulich 
>
> I have to admit that I'm undecided whether to ask that this be
> committed in the same development window where all the
> "legacy" infixes would go away again, i.e. presumably only
> after 4.12 now.

The legacy infixes won't disappear without a substantial quantity of
cleanup and fixing in the P2M code.

The P2M and IOMMU code is in far too much of a mess to be cleaned up in
one release cycle.  I specifically suggested the legacy infix so
we can get the new proper APIs in place and starting to be used, while
ensuring that we won't gain new users of the old interfaces.

~Andrew


[Xen-devel] [PATCH v2 10/10] dm_depriv: Mark `UID cleanup` as completed

2018-12-06 Thread George Dunlap
Signed-off-by: George Dunlap 
---
CC: Ian Jackson 
CC: Wei Liu 
---
 docs/designs/qemu-deprivilege.md | 40 
 1 file changed, 20 insertions(+), 20 deletions(-)

diff --git a/docs/designs/qemu-deprivilege.md b/docs/designs/qemu-deprivilege.md
index f7444a434d..81a5f5c05d 100644
--- a/docs/designs/qemu-deprivilege.md
+++ b/docs/designs/qemu-deprivilege.md
@@ -128,26 +128,6 @@ are specified; this does not apply to QEMU running as a 
Xen DM.
 
 '''Tested''': Not tested
 
-# Restrictions / improvements still to do
-
-This lists potential restrictions still to do.  It is meant to be
-listed in order of ease of implementation, with low-hanging fruit
-first.
-
-### Further RLIMITs
-
-RLIMIT_AS limits the total amount of memory; but this includes the
-virtual memory which QEMU uses as a mapcache.  xen-mapcache.c already
-fiddles with this; it would be straightforward to make it *set* the
-rlimit to what it thinks a sensible limit is.
-
-RLIMIT_NPROC limits total number of processes or threads.  QEMU uses
-threads for some devices, so this would require some thought.
-
-Other things that would take some cleverness / changes to QEMU to
-utilize due to ordering constrants:
- - RLIMIT_NOFILES (after all necessary files are opened)
-
 ### libxl UID cleanup
 
 '''Description''': Domain IDs are reused, and thus restricted UIDs are
@@ -223,6 +203,26 @@ Since this will kill all other `reaper_uid` processes as 
well, we must
 either allocate a separate `reaper_uid` per domain, or use locking to
 ensure that only one killing process is active at a time.
 
+# Restrictions / improvements still to do
+
+This lists potential restrictions still to do.  It is meant to be
+listed in order of ease of implementation, with low-hanging fruit
+first.
+
+### Further RLIMITs
+
+RLIMIT_AS limits the total amount of memory; but this includes the
+virtual memory which QEMU uses as a mapcache.  xen-mapcache.c already
+fiddles with this; it would be straightforward to make it *set* the
+rlimit to what it thinks a sensible limit is.
+
+RLIMIT_NPROC limits total number of processes or threads.  QEMU uses
+threads for some devices, so this would require some thought.
+
+Other things that would take some cleverness / changes to QEMU to
+utilize due to ordering constrants:
+ - RLIMIT_NOFILES (after all necessary files are opened)
+
 ## libxl: Treat QMP connection as untrusted
 
 '''Description''': Currently libxl talks with QEMU via QMP; but its
-- 
2.19.2


[Xen-devel] [PATCH v2 08/10] libxl: Kill QEMU by uid when possible

2018-12-06 Thread George Dunlap
The privcmd fd that a dm_restrict'ed QEMU has gives it permission to
one specific domain ID.  This domain ID will probably eventually be
used again.  It is therefore necessary to make absolutely sure that a
rogue QEMU process cannot hang around after its domain has exited.

Killing QEMU by pid is insufficient in this situation, because QEMU
may be able to fork() to escape killing.  It is surprisingly tricky to
kill a process which can call fork() without races; the only reliable
way is to use kill(-1) to kill all processes with a given uid.

We can use this method only when we're sure that there's only one QEMU
instance per uid.  Add a dm_uid into the domain_build_state struct,
and set it in libxl__domain_get_device_model_uid() when it's safe to
kill by UID.  Store this in xenstore next to device-model-pid.

On domain destroy, check to see if device-model-uid is present in
xenstore.  If so, fork off a reaper process, setuid to that uid, and
do kill(-1, SIGKILL) to kill all processes running as that uid.
Otherwise, carry on destroying by pid.

While we're here, make libxl__destroy_device_model() consistently:
 1. Return an error when anything fails
 2. But continue to do as much clean-up as possible

NOTE that this is not yet completely safe: with ruid == dm_uid, the
device model may be able to SIGKILL the 'reaper' process before the
reaper process can kill it.  Further patches will address this.
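
The kill-by-uid approach described above can be sketched as follows.
This is a minimal illustration, not the code in the patch: the name
`kill_all_by_uid` is hypothetical, the caller is assumed to be
privileged (so that setuid() changes the real, effective and saved
uids together), and error reporting is reduced to a return code.

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: kill every process running as `uid`.
 * Returns 0 on success, -1 on failure.  Refuses uid 0, because
 * kill(-1) as root would signal nearly every process on the host.
 */
int kill_all_by_uid(uid_t uid)
{
    pid_t child;
    int status;

    if (uid == 0)
        return -1;

    child = fork();
    if (child < 0)
        return -1;

    if (child == 0) {
        /*
         * Child: when called by a privileged process, setuid() sets
         * the real, effective and saved uids to the target uid.
         */
        if (setuid(uid) != 0)
            _exit(1);
        /*
         * Signal every process we are permitted to signal.  ESRCH
         * only means there was nothing left to kill.
         */
        if (kill(-1, SIGKILL) != 0 && errno != ESRCH)
            _exit(1);
        _exit(0);
    }

    /* Parent: wait for the reaper child, retrying on EINTR. */
    while (waitpid(child, &status, 0) < 0) {
        if (errno != EINTR)
            return -1;
    }

    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

Because the child ends up with all of its uids equal to the target
uid, this sketch has the same weakness the NOTE describes: a process
running as that uid may signal the child first.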

Signed-off-by: George Dunlap 
---
v2:
- Rebase on top of previous "goto out" refactoring
- Rather than introducing a `uid` string, Introduce a boolean,
  "kill_by_uid"; and do the GCSPRINTF() once if that is set.
- Fix typo "starting"
- Always call kill_device_model_uid_cb(); only call
  libxl__qmp_cleanup() from there
- Refactor libxl__destroy_device_model() to follow "goto out on error"
  pattern
- Retain and report errors even when we continue trying to clean up
- Report errors removing DM xenstore directory (except -ENOENT)
- Report errors reading device-model-uid
- Put "kill by uid" child logic in a separate function
- Refactor "kill by uid" to follow "goto out on error" pattern
- Change "kill by uid" to return libxl-style error, rather than errno
- Document the intention of when to return errors
- Assert that dm_uid != 0
- Log what the reaper process setresuid'd to

CC: Ian Jackson 
CC: Wei Liu 
CC: Anthony Perard 
---
 tools/libxl/libxl_dm.c   | 206 +--
 tools/libxl/libxl_internal.h |   4 +-
 2 files changed, 200 insertions(+), 10 deletions(-)

diff --git a/tools/libxl/libxl_dm.c b/tools/libxl/libxl_dm.c
index 7f9c6a62fe..53fdf8daf7 100644
--- a/tools/libxl/libxl_dm.c
+++ b/tools/libxl/libxl_dm.c
@@ -129,6 +129,8 @@ static int libxl__domain_get_device_model_uid(libxl__gc *gc,
 int rc;
 char *user;
 uid_t intended_uid;
+bool kill_by_uid;
+
 
 /* Only qemu-upstream can run as a different uid */
 if (b_info->device_model_version != LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN)
@@ -148,8 +150,10 @@ static int libxl__domain_get_device_model_uid(libxl__gc 
*gc,
 LOGD(ERROR, guest_domid, "Couldn't find device_model_user %s",
  user);
 rc = ERROR_INVAL;
-} else
+} else {
 intended_uid = user_base->pw_uid;
+kill_by_uid = true;
+}
 
 goto out;
 }
@@ -192,11 +196,14 @@ static int libxl__domain_get_device_model_uid(libxl__gc 
*gc,
 LOGD(DEBUG, guest_domid, "using uid %ld", (long)intended_uid);
 user = GCSPRINTF("%ld:%ld", (long)intended_uid,
  (long)user_base->pw_gid);
+kill_by_uid = true;
 goto out;
 }
 
 /*
- * We couldn't find QEMU_USER_BASE_RANGE; look for QEMU_USER_SHARED
+ * We couldn't find QEMU_USER_BASE_RANGE; look for
+ * QEMU_USER_SHARED.  NB for QEMU_USER_SHARED, all QEMU will run
+ * as the same UID, we can't kill by uid; therefore don't set uid.
  */
 user = LIBXL_QEMU_USER_SHARED;
 rc = userlookup_helper_getpwnam(gc, user, &user_pwbuf, &user_base);
@@ -206,6 +213,7 @@ static int libxl__domain_get_device_model_uid(libxl__gc *gc,
 LOGD(WARN, guest_domid, "Could not find user %s, falling back to %s",
  LIBXL_QEMU_USER_RANGE_BASE, LIBXL_QEMU_USER_SHARED);
 intended_uid = user_base->pw_uid;
+kill_by_uid = false;
 goto out;
 }
 
@@ -226,6 +234,8 @@ out:
 }
 
 state->dm_runas = user;
+if (kill_by_uid)
+state->dm_uid = GCSPRINTF("%ld", (long)intended_uid);
 }
 
 return rc;
@@ -2408,6 +2418,15 @@ void libxl__spawn_local_dm(libxl__egc *egc, 
libxl__dm_spawn_state *dmss)
 
 const char *dom_path = libxl__xs_get_dompath(gc, domid);
 
+/*
+ * If we're starting the dm with a non-root UID, save the UID so
+ * that we can reliably kill it and any subprocesses
+ */
+if (state->dm_uid)
+libxl__xs_printf(gc, XBT_NULL,
+ GCSPRINTF("%s/image/device-model-uid", 

[Xen-devel] [PATCH v2 09/10] libxl: Kill QEMU with "reaper" ruid

2018-12-06 Thread George Dunlap
Using kill(-1) to kill an untrusted dm process with the real uid
equal to the dm_uid isn't guaranteed to succeed: the process in
question may be able to kill the reaper process after the setresuid()
and before the kill().

Instead, set the real uid to the QEMU user for domain 0
(QEMU_USER_RANGE_BASE + 0).  The reaper process will still be able to
kill the dm process, but not vice versa.

This, in turn, requires locking to make sure that only one reaper
process is using that uid at a time; otherwise one reaper process may
kill the other reaper process.

Create a lockfile in RUNDIR/dm-reaper-lock, and grab the lock before
executing kill.

In the event that we can't get the lock for some reason, go ahead with
the kill using dm_uid for both real and effective UIDs.  This isn't
guaranteed to work, but it's no worse than not trying to kill the
process at all.
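
Why a separate real uid protects the reaper can be illustrated with a
small model of the POSIX permission rule for kill() between
unprivileged processes.  This is a sketch for reasoning only: the
struct, the function name `may_signal` and the uid values used below
are hypothetical, and privileged senders (root, CAP_KILL) are not
modelled.

```c
#include <stdbool.h>
#include <sys/types.h>

/* The uids of a process that matter for kill() permission checks. */
struct proc_uids {
    uid_t ruid;   /* real uid */
    uid_t euid;   /* effective uid */
    uid_t suid;   /* saved set-user-ID */
};

/*
 * Model of the POSIX rule for unprivileged senders: a signal is
 * permitted if the sender's real or effective uid matches the
 * target's real or saved set-user-ID.
 */
bool may_signal(struct proc_uids sender, struct proc_uids target)
{
    return sender.ruid == target.ruid || sender.ruid == target.suid ||
           sender.euid == target.ruid || sender.euid == target.suid;
}
```

With a dm process whose uids are all 5007 and a reaper whose real uid
is 5000 and effective uid is 5007 (and whose saved uid is something
the dm cannot match, assumed here), the reaper may signal the dm but
the dm may not signal the reaper.  Two processes that share all three
uids can always signal each other, which is the race the previous
patch left open.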

Signed-off-by: George Dunlap 
---
v2:
- Port over previous changes
- libxl__get_reaper_uid() won't set errno, use LOG rather than LOGE.
- Accumulate error and return for all failures
- Move flock() outside of the condition.  Also fix EINTR check (check
  errno rather than return value).
- Add a comment explaining why we return an error even if the kill()
  succeeds
- Move locking to a separate function to minimize gotos
- Refactor libxl__get_reaper_id to take a pointer for reaper_uid;
  return only success/failure.  Also return EINVAL if reaper_uid would
  resolve to 0.
- Handle "reaper_uid not found" specially; note issue with
  device_model_user feature
- Assert that final reaper_uid != 0

CC: Ian Jackson 
CC: Wei Liu 
---
 tools/libxl/libxl_dm.c | 117 +
 1 file changed, 108 insertions(+), 9 deletions(-)

diff --git a/tools/libxl/libxl_dm.c b/tools/libxl/libxl_dm.c
index 53fdf8daf7..90b4e21d48 100644
--- a/tools/libxl/libxl_dm.c
+++ b/tools/libxl/libxl_dm.c
@@ -241,6 +241,35 @@ out:
 return rc;
 }
 
+/*
+ * Look up "reaper UID".  If present and non-root, returns 0 and sets
+ * reaper_uid.  If not present, returns 0 and leaves reaper_uid unset;
+ * otherwise returns libxl-style error.
+ */
+static int libxl__get_reaper_uid(libxl__gc *gc, uid_t *reaper_uid)
+{
+struct passwd *user_base, user_pwbuf;
+int rc;
+
+rc = userlookup_helper_getpwnam(gc, LIBXL_QEMU_USER_RANGE_BASE,
+ &user_pwbuf, &user_base);
+if (rc)
+return rc;
+
+if (!user_base) {
+LOG(WARN, "Couldn't find uid for reaper process");
+} else {
+if(user_base->pw_uid == 0) {
+LOG(ERROR, "UID for reaper process maps to root!");
+return ERROR_INVAL;
+}
+
+*reaper_uid = user_base->pw_uid;
+}
+
+return 0;
+}
+
 const char *libxl__domain_device_model(libxl__gc *gc,
const libxl_domain_build_info *info)
 {
@@ -2810,11 +2839,61 @@ out:
 return;
 }
 
+static int get_reaper_lock_and_uid(libxl__destroy_devicemodel_state *ddms,
+   uid_t *reaper_uid)
+{
+STATE_AO_GC(ddms->ao);
+int domid = ddms->domid;
+int r;
+const char * lockfile;
+int fd;
+
+/* Try to lock the "reaper uid" */
+lockfile = GCSPRINTF("%s/dm-reaper-lock", libxl__run_dir_path());
+
+/*
+ * NB that since we've just forked, we can't have any
+ * threads; so we don't need the libxl__carefd
+ * infrastructure here.
+ */
+fd = open(lockfile, O_RDWR|O_CREAT, 0666);
+if (fd < 0) {
+/* All other errno: EBADF, EINVAL, ENOLCK, EWOULDBLOCK */
+LOGED(ERROR, domid,
+  "unexpected error while trying to open lockfile %s, errno=%d",
+  lockfile, errno);
+return ERROR_FAIL;
+}
+
+/* Try to lock the file, retrying on EINTR */
+for (;;) {
+r = flock(fd, LOCK_EX);
+if (!r)
+break;
+if (errno != EINTR) {
+/* All other errno: EBADF, EINVAL, ENOLCK, EWOULDBLOCK */
+LOGED(ERROR, domid,
+  "unexpected error while trying to lock %s, fd=%d, errno=%d",
+  lockfile, fd, errno);
+return ERROR_FAIL;
+}
+}
+
+/*
+ * Get reaper_uid.  If we can't find such a uid, return an error.
+ *
+ * FIXME: This means that domain destruction will fail if
+ * device_model_user is set but QEMU_USER_RANGE_BASE doesn't exist.
+ */
+return libxl__get_reaper_uid(gc, reaper_uid);
+}
+
+
 /*
  * Destroy all processes of the given uid by setresuid to the
  * specified uid and kill(-1).  NB this MUST BE CALLED FROM A SEPARATE
- * PROCESS from the normal libxl process.  Returns a libxl-style error
- * code.
+ * PROCESS from the normal libxl process, and should exit immediately
+ * after return.  Returns a libxl-style error code.
  */
 static int kill_device_model_uid_child(libxl__destroy_devicemodel_state *ddms,
const char *dm_uid_str) {
@@ -2822,24 +2901,44 @@ 

[Xen-devel] [PATCH v2 03/10] libxl: Clean up userlookup_helper_getpw* helper

2018-12-06 Thread George Dunlap
Bring conventions more in line with libxl__xs_read_checked():
- If found, return 0 and set pointer to non-NULL
- If not found, return 0 and set pointer to NULL
- On error, return libxl-style error number.
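
The three-way convention above can be sketched with a thin wrapper
around getpwnam_r().  This is an illustration only: `lookup_user` is a
hypothetical name, -1 stands in for a libxl-style error code, and the
caller supplies a fixed scratch buffer rather than growing it on
ERANGE as the real helper does.

```c
#include <pwd.h>
#include <stddef.h>

/*
 * Hypothetical wrapper around getpwnam_r():
 *   found      -> returns 0, *out points at *pwbuf
 *   not found  -> returns 0, *out is NULL
 *   error      -> returns -1, *out is NULL
 */
int lookup_user(const char *name, struct passwd *pwbuf,
                char *buf, size_t buflen, struct passwd **out)
{
    struct passwd *result = NULL;
    int r = getpwnam_r(name, pwbuf, buf, buflen, &result);

    *out = NULL;
    if (r != 0)
        return -1;      /* the lookup itself failed */

    *out = result;      /* NULL when there is no such user */
    return 0;
}
```

A caller then tests the return value for errors and the pointer for
presence, instead of decoding a 1 / 0 / -1 return value.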

Update documentation to match.

Use CODING_STYLE compliant `r` rather than `ret`.

On error, log the error code before returning instead of discarding
it.

Signed-off-by: George Dunlap 
---
CC: Ian Jackson 
CC: Wei Liu 
---
 tools/libxl/libxl_dm.c | 29 +
 1 file changed, 17 insertions(+), 12 deletions(-)

diff --git a/tools/libxl/libxl_dm.c b/tools/libxl/libxl_dm.c
index 6024d4b7b8..959fa0f46c 100644
--- a/tools/libxl/libxl_dm.c
+++ b/tools/libxl/libxl_dm.c
@@ -72,7 +72,13 @@ static int libxl__create_qemu_logfile(libxl__gc *gc, char 
*name)
  *  userlookup_helper_getpwuid(libxl__gc*, uid_t uid,
  * struct passwd **pwd_r);
  *
- *  returns 1 if the user was found, 0 if it was not, -1 on error
+ *  If the user is found, return 0 and set *pwd_r to the appropriat
+ *  value.
+ *
+ *  If the user is not found but there are no errors, return 0
+ *  and set *pwd_r to NULL.
+ *
+ *  On error, return a libxl-style error code.
  */
 #define DEFINE_USERLOOKUP_HELPER(NAME,SPEC_TYPE,STRUCTNAME,SYSCONF) \
 static int userlookup_helper_##NAME(libxl__gc *gc,  \
@@ -83,7 +89,7 @@ static int libxl__create_qemu_logfile(libxl__gc *gc, char 
*name)
 struct STRUCTNAME *resultp = NULL;  \
 char *buf = NULL;   \
 long buf_size;  \
-int ret;\
+int r;  \
 \
 buf_size = sysconf(SYSCONF);\
 if (buf_size < 0) { \
@@ -95,17 +101,16 @@ static int libxl__create_qemu_logfile(libxl__gc *gc, char 
*name)
 \
 while (1) { \
 buf = libxl__realloc(gc, buf, buf_size);\
-ret = NAME##_r(spec, resultbuf, buf, buf_size, &resultp);  \
-if (ret == ERANGE) {\
+r = NAME##_r(spec, resultbuf, buf, buf_size, &resultp);    \
+if (r == ERANGE) {  \
 buf_size += 128;\
 continue;   \
 }   \
-if (ret != 0)   \
+if (r != 0) {   \
+LOGEV(ERROR, r, "Looking up username/uid with " #NAME); \
 return ERROR_FAIL;  \
-if (resultp != NULL) {  \
-if (out) *out = resultp;\
-return 1;   \
 }   \
+*out = resultp; \
 return 0;   \
 }   \
 }
@@ -142,14 +147,14 @@ static int libxl__domain_get_device_model_uid(libxl__gc 
*gc,
  &user_pwbuf, &user_base);
 if (ret < 0)
 return ret;
-if (ret > 0) {
+if (user_base) {
 struct passwd *user_clash, user_clash_pwbuf;
 uid_t intended_uid = user_base->pw_uid + guest_domid;
 ret = userlookup_helper_getpwuid(gc, intended_uid,
  &user_clash_pwbuf, &user_clash);
 if (ret < 0)
 return ret;
-if (ret > 0) {
+if (user_clash) {
 LOGD(ERROR, guest_domid,
  "wanted to use uid %ld (%s + %d) but that is user %s !",
  (long)intended_uid, LIBXL_QEMU_USER_RANGE_BASE,
@@ -163,10 +168,10 @@ static int libxl__domain_get_device_model_uid(libxl__gc 
*gc,
 }
 
 user = LIBXL_QEMU_USER_SHARED;
-ret = userlookup_helper_getpwnam(gc, user, &user_pwbuf, 0);
+ret = userlookup_helper_getpwnam(gc, user, &user_pwbuf, &user_base);
 if (ret < 0)
 return ret;
-if (ret > 0) {
+if (user_base) {
 LOGD(WARN, guest_domid, "Could not find user %s, falling back to %s",
  LIBXL_QEMU_USER_RANGE_BASE, LIBXL_QEMU_USER_SHARED);
 goto end_search;
-- 
2.19.2



[Xen-devel] [PATCH v2 02/10] libxl: Get rid of support for QEMU_USER_BASE (xen-qemuuser-domidNN)

2018-12-06 Thread George Dunlap
QEMU_USER_BASE allows a user to specify the UID to use when running
the devicemodel for a specific domain number.  Unfortunately, this is
not really practical: It requires nearly 32,000 entries in
/etc/passwd.  QEMU_USER_RANGE_BASE is much more practical.

Remove support for QEMU_USER_BASE.

Signed-off-by: George Dunlap 
Acked-by: Ian Jackson 
---
NB that I've chosen not to update the xl.cfg man page at this time; it
needs a lot of other updates as well, which would be easier to do all
at once at the end.

CC: Ian Jackson 
CC: Wei Liu 
---
 tools/libxl/libxl_dm.c   | 16 
 tools/libxl/libxl_internal.h |  1 -
 2 files changed, 4 insertions(+), 13 deletions(-)

diff --git a/tools/libxl/libxl_dm.c b/tools/libxl/libxl_dm.c
index bbcbc94b6c..6024d4b7b8 100644
--- a/tools/libxl/libxl_dm.c
+++ b/tools/libxl/libxl_dm.c
@@ -138,13 +138,6 @@ static int libxl__domain_get_device_model_uid(libxl__gc 
*gc,
 return 0;
 }
 
-user = GCSPRINTF("%s%d", LIBXL_QEMU_USER_BASE, guest_domid);
-ret = userlookup_helper_getpwnam(gc, user, &user_pwbuf, 0);
-if (ret < 0)
-return ret;
-if (ret > 0)
-goto end_search;
-
 ret = userlookup_helper_getpwnam(gc, LIBXL_QEMU_USER_RANGE_BASE,
  &user_pwbuf, &user_base);
 if (ret < 0)
@@ -174,15 +167,14 @@ static int libxl__domain_get_device_model_uid(libxl__gc 
*gc,
 if (ret < 0)
 return ret;
 if (ret > 0) {
-LOGD(WARN, guest_domid, "Could not find user %s%d, falling back to %s",
- LIBXL_QEMU_USER_BASE, guest_domid, LIBXL_QEMU_USER_SHARED);
+LOGD(WARN, guest_domid, "Could not find user %s, falling back to %s",
+ LIBXL_QEMU_USER_RANGE_BASE, LIBXL_QEMU_USER_SHARED);
 goto end_search;
 }
 
 LOGD(ERROR, guest_domid,
- "Could not find user %s%d or %s or range base pseudo-user %s, cannot 
restrict",
- LIBXL_QEMU_USER_BASE, guest_domid, LIBXL_QEMU_USER_SHARED,
- LIBXL_QEMU_USER_RANGE_BASE);
+ "Could not find user %s or range base pseudo-user %s, cannot 
restrict",
+ LIBXL_QEMU_USER_SHARED, LIBXL_QEMU_USER_RANGE_BASE);
 return ERROR_INVAL;
 
 end_search:
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index c4a43bd0b7..b147f3803c 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -4387,7 +4387,6 @@ _hidden int libxl__read_sysfs_file_contents(libxl__gc *gc,
 int *datalen_r);
 
 #define LIBXL_QEMU_USER_PREFIX "xen-qemuuser"
-#define LIBXL_QEMU_USER_BASE   LIBXL_QEMU_USER_PREFIX"-domid"
 #define LIBXL_QEMU_USER_SHARED LIBXL_QEMU_USER_PREFIX"-shared"
 #define LIBXL_QEMU_USER_RANGE_BASE LIBXL_QEMU_USER_PREFIX"-range-base"
 
-- 
2.19.2



[Xen-devel] [PATCH v2 05/10] libxl: Do root checks once in libxl__domain_get_device_model_uid

2018-12-06 Thread George Dunlap
At the moment, we check for equivalence to literal "root" before
deciding whether to add the `runas` command-line option to QEMU.  This
is unsatisfactory for several reasons.

First, just because the string doesn't match "root" doesn't mean the
final uid won't end up being zero; in particular, the range_base
calculations may end up producing "0:NNN", which would be root in any
case.

Secondly, it's almost certainly a configuration error if the resulting
uid ends up being zero; rather than silently do what was specified but
probably not intended, throw an error.

To fix this, check for root once in
libxl__domain_get_device_model_uid.  If the result is root, return an
error; if appropriate, set the user.
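
The point about checking the final uid rather than the user string can
be sketched like this.  It is an illustration under assumptions, not
the patch's code: `dm_uid_for_domain` is a hypothetical name and -1
stands in for a libxl-style error.

```c
#include <stdint.h>
#include <sys/types.h>

/*
 * Hypothetical helper: compute the per-domain uid from the range
 * base and reject a result of 0.  A string such as "0:NNN" does not
 * compare equal to "root", so only a check on the numeric uid
 * catches this case.
 */
int dm_uid_for_domain(uid_t range_base_uid, uint32_t domid, uid_t *out)
{
    uid_t uid = range_base_uid + domid;

    if (uid == 0)
        return -1;      /* would run the device model as root */

    *out = uid;
    return 0;
}
```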

After that, assume that the presence of state->dm_runas implies that a
`runas` argument should be constructed.

One side effect of this is to check whether device_model_user exists
before passing it to qemu, resulting in better error reporting.

While we're here:
- Refactor the function to use the "goto out" idiom in most cases
- Use 'rc' rather than 'ret', in line with CODING_STYLE
- Change the error returned in the "uid collision" case to
  ERROR_DEVICE_EXISTS

Signed-off-by: George Dunlap 
---
v2:
- Refactor to use `out` rather than multiple labels
- Only check for root once
- Use 'out' rather than direct returns for errors (only use direct returns
  for early `succeed-without-setting-runas` paths)
- Use `rc` rather than `ret` to more closely align with CODING_STYLE
- Fill out comments about the cases we're handling
- Return ERROR_DEVICE_EXISTS rather than ERROR_FAIL if there's another
  username that maps to our calculated uid
- Report an error if the specified device_model_user doesn't exist

CC: Ian Jackson 
CC: Wei Liu 
---
 tools/libxl/libxl_dm.c | 86 +++---
 1 file changed, 65 insertions(+), 21 deletions(-)

diff --git a/tools/libxl/libxl_dm.c b/tools/libxl/libxl_dm.c
index 959fa0f46c..7ff3e3160a 100644
--- a/tools/libxl/libxl_dm.c
+++ b/tools/libxl/libxl_dm.c
@@ -126,65 +126,109 @@ static int libxl__domain_get_device_model_uid(libxl__gc 
*gc,
 const libxl_domain_build_info *b_info = >guest_config->b_info;
 
 struct passwd *user_base, user_pwbuf;
-int ret;
+int rc;
 char *user;
+uid_t intended_uid;
 
 /* Only qemu-upstream can run as a different uid */
 if (b_info->device_model_version != LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN)
 return 0;
 
+/*
+ * If device_model_user is present, set `-runas` even if
+ * dm_restrict isn't in use
+ */
 user = b_info->device_model_user;
-if (user)
-goto end_search;
+if (user) {
+rc = userlookup_helper_getpwnam(gc, user, &user_pwbuf, &user_base);
+if (rc)
+goto out;
+
+if (!user_base) {
+LOGD(ERROR, guest_domid, "Couldn't find device_model_user %s",
+ user);
+rc = ERROR_INVAL;
+} else
+intended_uid = user_base->pw_uid;
 
+goto out;
+}
+
+/*
+ * If dm_restrict isn't set, and we don't have a specified user, don't
+ * bother setting a `-runas` parameter.
+ */
 if (!libxl_defbool_val(b_info->dm_restrict)) {
 LOGD(DEBUG, guest_domid,
  "dm_restrict disabled, starting QEMU as root");
 return 0;
 }
 
-ret = userlookup_helper_getpwnam(gc, LIBXL_QEMU_USER_RANGE_BASE,
+/*
+ * dm_restrict is set, but device_model_user isn't set; look for
+ * QEMU_USER_BASE_RANGE
+ */
+rc = userlookup_helper_getpwnam(gc, LIBXL_QEMU_USER_RANGE_BASE,
  &user_pwbuf, &user_base);
-if (ret < 0)
-return ret;
+if (rc)
+goto out;
 if (user_base) {
 struct passwd *user_clash, user_clash_pwbuf;
-uid_t intended_uid = user_base->pw_uid + guest_domid;
-ret = userlookup_helper_getpwuid(gc, intended_uid,
+
+intended_uid = user_base->pw_uid + guest_domid;
+rc = userlookup_helper_getpwuid(gc, intended_uid,
  &user_clash_pwbuf, &user_clash);
-if (ret < 0)
-return ret;
+if (rc < 0)
+goto out;
 if (user_clash) {
 LOGD(ERROR, guest_domid,
  "wanted to use uid %ld (%s + %d) but that is user %s !",
  (long)intended_uid, LIBXL_QEMU_USER_RANGE_BASE,
  guest_domid, user_clash->pw_name);
-return ERROR_FAIL;
+rc = ERROR_DEVICE_EXISTS;
+goto out;
 }
+
 LOGD(DEBUG, guest_domid, "using uid %ld", (long)intended_uid);
 user = GCSPRINTF("%ld:%ld", (long)intended_uid,
  (long)user_base->pw_gid);
-goto end_search;
+goto out;
 }
 
+/*
+ * We couldn't find QEMU_USER_BASE_RANGE; look for QEMU_USER_SHARED
+ */
 user = LIBXL_QEMU_USER_SHARED;
-ret = userlookup_helper_getpwnam(gc, user, &user_pwbuf, &user_base);
-  

[Xen-devel] [PATCH v2 04/10] dm_depriv: Describe expected usage of device_model_user parameter

2018-12-06 Thread George Dunlap
A number of subsequent patches rely on as-yet undefined behavior for
what the `device_model_user` parameter does.  Rather than implement it
incorrectly (or randomly), or remove the feature, describe an expected
usage for the feature.  Further patches will make decisions based on
this expected usage.

Signed-off-by: George Dunlap 
Acked-by: Ian Jackson 
---
v2:
- Remove stale comment about device_model_user not being ready

RFC: As we'll see in a later patch, this implementation is still
incomplete: we need a `reaper` uid from which to kill uids.

CC: Ian Jackson 
CC: Wei Liu 
CC: Anthony Perard 
---
 docs/features/qemu-deprivilege.pandoc | 17 +
 tools/libxl/libxl_types.idl   |  1 -
 2 files changed, 17 insertions(+), 1 deletion(-)

diff --git a/docs/features/qemu-deprivilege.pandoc 
b/docs/features/qemu-deprivilege.pandoc
index f941525189..49b571980e 100644
--- a/docs/features/qemu-deprivilege.pandoc
+++ b/docs/features/qemu-deprivilege.pandoc
@@ -66,6 +66,23 @@ this, create a user named `xen-qemuuser-shared`; for example:
 
 adduser --no-create-home --system xen-qemuuser-shared
 
+A final way to set up a separate process for qemus is to allocate one
+UID per VM, and set the UID in the domain config file with the
+`device_model_user` argument.  For example, suppose you have a VM
+named `c6-01`.  You might do the following:
+
+adduser --system --no-create-home --group xen-qemuuser-c6-01
+
+And then in your config file, the following line:
+
+device_model_user="xen-qemuuser-c6-01"
+
+NOTE: It is important when using `device_model_user` that EACH VM HAVE
+A SEPARATE UID, and that none of these UIDs map to root.  xl will
+throw an error if a uid maps to zero, but not if multiple VMs have the
+same uid.  Multiple VMs with the same device model uid will cause
+problems.
+
 ## Domain config changes
 
 The core domain config change is to add the following line to the
diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index 51cf06a3a2..141c46e42a 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -495,7 +495,6 @@ libxl_domain_build_info = Struct("domain_build_info",[
 ("device_model", string),
 ("device_model_ssidref", uint32),
 ("device_model_ssid_label", string),
-# device_model_user is not ready for use yet
 ("device_model_user", string),
 
 # extra parameters pass directly to qemu, NULL terminated
-- 
2.19.2



[Xen-devel] [PATCH v2 07/10] libxl: Make killing of device model asynchronous

2018-12-06 Thread George Dunlap
Or at least, give it an asynchronous interface so that we can make it
actually asynchronous in subsequent patches.

Create state structures and callback function signatures.  Add the
state structure to libxl__destroy_domid_state.  Break
libxl__destroy_domid down into two functions.
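
The shape of this change, a state structure embedded in its parent
plus a completion callback, can be sketched in isolation.  All names
below are hypothetical stand-ins rather than libxl's types, and the
"destroy" step completes immediately, just as the function remains
synchronous in this patch.

```c
#include <stddef.h>

/* Recover the enclosing struct from a pointer to one of its members. */
#define CONTAINER_OF_SKETCH(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

struct dm_destroy_state;
typedef void dm_destroy_cb(struct dm_destroy_state *ddms, int rc);

/* Per-operation state for destroying the device model. */
struct dm_destroy_state {
    int domid;
    dm_destroy_cb *callback;
};

/* Parent state; the device model state is embedded, not pointed to. */
struct domain_destroy_state {
    int domid;
    int dm_rc;
    int devices_destroyed;
    struct dm_destroy_state ddms;
};

/* Second half of the split function: runs once the dm is gone. */
static void on_dm_destroyed(struct dm_destroy_state *ddms, int rc)
{
    struct domain_destroy_state *dis =
        CONTAINER_OF_SKETCH(ddms, struct domain_destroy_state, ddms);

    dis->dm_rc = rc;
    dis->devices_destroyed = 1;   /* stand-in for device teardown */
}

/* Asynchronous interface: reports its result through the callback. */
void destroy_device_model(struct dm_destroy_state *ddms)
{
    int rc = 0;                   /* stand-in for the real kill */

    ddms->callback(ddms, rc);
}

/* First half of the split function. */
void destroy_domain(struct domain_destroy_state *dis, int dm_present)
{
    dis->ddms.domid = dis->domid;
    dis->ddms.callback = on_dm_destroyed;

    if (dm_present)
        destroy_device_model(&dis->ddms);
    else
        on_dm_destroyed(&dis->ddms, 0);
}
```

Since the caller never inspects a return value, the destroy step can
later defer its callback (for example until a child process exits)
without any change to the caller.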

No functional change intended.

Signed-off-by: George Dunlap 
---
v2:
- Note that libxl__devicemodel_destroy_cb may be called reentrantly

NB that I retain the comment before libxl__destroy_device_model, in
spite of the fact that it looks "pointless", to separate it logically
from the previous prototype.

CC: Ian Jackson 
CC: Wei Liu 
---
 tools/libxl/libxl_dm.c   | 11 +++---
 tools/libxl/libxl_domain.c   | 40 
 tools/libxl/libxl_internal.h | 20 --
 3 files changed, 58 insertions(+), 13 deletions(-)

diff --git a/tools/libxl/libxl_dm.c b/tools/libxl/libxl_dm.c
index db10b692dc..7f9c6a62fe 100644
--- a/tools/libxl/libxl_dm.c
+++ b/tools/libxl/libxl_dm.c
@@ -2677,19 +2677,24 @@ out:
 return rc;
 }
 
-int libxl__destroy_device_model(libxl__gc *gc, uint32_t domid)
+void libxl__destroy_device_model(libxl__egc *egc,
+ libxl__destroy_devicemodel_state *ddms)
 {
+STATE_AO_GC(ddms->ao);
 int rc;
+int domid = ddms->domid;
 char *path = DEVICE_MODEL_XS_PATH(gc, LIBXL_TOOLSTACK_DOMID, domid, "");
+
 if (!xs_rm(CTX->xsh, XBT_NULL, path))
 LOGD(ERROR, domid, "xs_rm failed for %s", path);
+
 /* We should try to destroy the device model anyway. */
 rc = kill_device_model(gc,
   GCSPRINTF("/local/domain/%d/image/device-model-pid", domid));
-
+
 libxl__qmp_cleanup(gc, domid);
 
-return rc;
+ddms->callback(egc, ddms, rc);
 }
 
 /* Return 0 if no dm needed, 1 if needed and <0 if error. */
diff --git a/tools/libxl/libxl_domain.c b/tools/libxl/libxl_domain.c
index d46b97dedf..0ce1ba1327 100644
--- a/tools/libxl/libxl_domain.c
+++ b/tools/libxl/libxl_domain.c
@@ -1008,6 +1008,10 @@ static void destroy_finish_check(libxl__egc *egc,
 }
 
 /* Callbacks for libxl__destroy_domid */
+static void dm_destroy_cb(libxl__egc *egc,
+  libxl__destroy_devicemodel_state *ddms,
+  int rc);
+
 static void devices_destroy_cb(libxl__egc *egc,
libxl__devices_remove_state *drs,
int rc);
@@ -1066,16 +1070,18 @@ void libxl__destroy_domid(libxl__egc *egc, 
libxl__destroy_domid_state *dis)
 if (rc < 0) {
 LOGEVD(ERROR, rc, domid, "xc_domain_pause failed");
 }
+
 if (dm_present) {
-if (libxl__destroy_device_model(gc, domid) < 0)
-LOGD(ERROR, domid, "libxl__destroy_device_model failed");
+dis->ddms.ao = ao;
+dis->ddms.domid = domid;
+dis->ddms.callback = dm_destroy_cb;
+
+libxl__destroy_device_model(egc, &dis->ddms);
+return;
+} else {
+dm_destroy_cb(egc, &dis->ddms, 0);
+return;
 }
-dis->drs.ao = ao;
-dis->drs.domid = domid;
-dis->drs.callback = devices_destroy_cb;
-dis->drs.force = 1;
-libxl__devices_destroy(egc, &dis->drs);
-return;
 
 out:
 assert(rc);
@@ -1083,6 +1089,24 @@ out:
 return;
 }
 
+static void dm_destroy_cb(libxl__egc *egc,
+  libxl__destroy_devicemodel_state *ddms,
+  int rc)
+{
+libxl__destroy_domid_state *dis = CONTAINER_OF(ddms, *dis, ddms);
+STATE_AO_GC(dis->ao);
+uint32_t domid = dis->domid;
+
+if (rc < 0)
+LOGD(ERROR, domid, "libxl__destroy_device_model failed");
+
+dis->drs.ao = ao;
+dis->drs.domid = domid;
+dis->drs.callback = devices_destroy_cb;
+dis->drs.force = 1;
+libxl__devices_destroy(egc, &dis->drs);
+}
+
 static void devices_destroy_cb(libxl__egc *egc,
libxl__devices_remove_state *drs,
int rc)
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index b147f3803c..f9e0bf6578 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -1705,8 +1705,6 @@ _hidden int 
libxl__wait_for_device_model_deprecated(libxl__gc *gc,
   void *userdata),
 void *check_callback_userdata);
 
-_hidden int libxl__destroy_device_model(libxl__gc *gc, uint32_t domid);
-
 _hidden const libxl_vnc_info *libxl__dm_vnc(const libxl_domain_config *g_cfg);
 
 _hidden char *libxl__abs_path(libxl__gc *gc, const char *s, const char *path);
@@ -3672,6 +3670,7 @@ extern const struct libxl_device_type *device_type_tbl[];
 
 typedef struct libxl__domain_destroy_state libxl__domain_destroy_state;
 typedef struct libxl__destroy_domid_state libxl__destroy_domid_state;
+typedef struct libxl__destroy_devicemodel_state 
libxl__destroy_devicemodel_state;
 typedef struct libxl__devices_remove_state libxl__devices_remove_state;
 
 

[Xen-devel] [PATCH v2 06/10] libxl: Move qmp cleanup into devicemodel destroy function

2018-12-06 Thread George Dunlap
Removing the qmp connection is logically part of the device model
destruction; having the caller destroy it is a mild layering
violation.

Move libxl__qmp_cleanup() into libxl__destroy_device_model().  This
will make it easier when we make devicemodel destruction asynchronous.

Signed-off-by: George Dunlap 
Acked-by: Ian Jackson 
---
CC: Ian Jackson 
CC: Wei Liu 
---
 tools/libxl/libxl_dm.c | 9 +++--
 tools/libxl/libxl_domain.c | 2 --
 2 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/tools/libxl/libxl_dm.c b/tools/libxl/libxl_dm.c
index 7ff3e3160a..db10b692dc 100644
--- a/tools/libxl/libxl_dm.c
+++ b/tools/libxl/libxl_dm.c
@@ -2679,12 +2679,17 @@ out:
 
 int libxl__destroy_device_model(libxl__gc *gc, uint32_t domid)
 {
+int rc;
 char *path = DEVICE_MODEL_XS_PATH(gc, LIBXL_TOOLSTACK_DOMID, domid, "");
 if (!xs_rm(CTX->xsh, XBT_NULL, path))
 LOGD(ERROR, domid, "xs_rm failed for %s", path);
 /* We should try to destroy the device model anyway. */
-return kill_device_model(gc,
-GCSPRINTF("/local/domain/%d/image/device-model-pid", domid));
+rc = kill_device_model(gc,
+  GCSPRINTF("/local/domain/%d/image/device-model-pid", domid));
+
+libxl__qmp_cleanup(gc, domid);
+
+return rc;
 }
 
 /* Return 0 if no dm needed, 1 if needed and <0 if error. */
diff --git a/tools/libxl/libxl_domain.c b/tools/libxl/libxl_domain.c
index 3377bba994..d46b97dedf 100644
--- a/tools/libxl/libxl_domain.c
+++ b/tools/libxl/libxl_domain.c
@@ -1069,8 +1069,6 @@ void libxl__destroy_domid(libxl__egc *egc, 
libxl__destroy_domid_state *dis)
 if (dm_present) {
 if (libxl__destroy_device_model(gc, domid) < 0)
 LOGD(ERROR, domid, "libxl__destroy_device_model failed");
-
-libxl__qmp_cleanup(gc, domid);
 }
 dis->drs.ao = ao;
 dis->drs.domid = domid;
-- 
2.19.2


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v2 01/10] libxl: Move dm user determination logic into a helper function

2018-12-06 Thread George Dunlap
To reliably kill an untrusted devicemodel, we need to know not only
its pid, but its uid.  In preparation for this, move the userid
determination logic into a helper function.

Create a new field, `dm_runas`, in libxl__domain_build_state to store
the value during domain creation.

This change also removes unnecessary duplication of the argument
construction code.

While here, clean up some minor CODING_STYLE infractions (space
between * and variable name).

No functional change intended.

While here, delete some trailing whitespace.

Signed-off-by: George Dunlap 
Acked-by: Wei Liu 
Acked-by: Ian Jackson 
---
v2:
- Remove unnecessary space between * and dm_runas
- Additional code clean-up
- Delete trailing whitespace

CC: Ian Jackson 
CC: Wei Liu 
CC: Anthony Perard 
---
 tools/libxl/libxl_dm.c   | 260 +++
 tools/libxl/libxl_internal.h |  22 +--
 2 files changed, 151 insertions(+), 131 deletions(-)

diff --git a/tools/libxl/libxl_dm.c b/tools/libxl/libxl_dm.c
index 5698fe8af3..bbcbc94b6c 100644
--- a/tools/libxl/libxl_dm.c
+++ b/tools/libxl/libxl_dm.c
@@ -65,6 +65,131 @@ static int libxl__create_qemu_logfile(libxl__gc *gc, char *name)
     return logfile_w;
 }
 
+/*
+ *  userlookup_helper_getpwnam(libxl__gc*, const char *user,
+ * struct passwd **pwd_r);
+ *
+ *  userlookup_helper_getpwuid(libxl__gc*, uid_t uid,
+ * struct passwd **pwd_r);
+ *
+ *  returns 1 if the user was found, 0 if it was not, -1 on error
+ */
+#define DEFINE_USERLOOKUP_HELPER(NAME,SPEC_TYPE,STRUCTNAME,SYSCONF)     \
+    static int userlookup_helper_##NAME(libxl__gc *gc,                  \
+                                        SPEC_TYPE spec,                 \
+                                        struct STRUCTNAME *resultbuf,   \
+                                        struct STRUCTNAME **out)        \
+    {                                                                   \
+        struct STRUCTNAME *resultp = NULL;                              \
+        char *buf = NULL;                                               \
+        long buf_size;                                                  \
+        int ret;                                                        \
+                                                                        \
+        buf_size = sysconf(SYSCONF);                                    \
+        if (buf_size < 0) {                                             \
+            buf_size = 2048;                                            \
+            LOG(DEBUG,                                                  \
+            "sysconf failed, setting the initial buffer size to %ld",   \
+                buf_size);                                              \
+        }                                                               \
+                                                                        \
+        while (1) {                                                     \
+            buf = libxl__realloc(gc, buf, buf_size);                    \
+            ret = NAME##_r(spec, resultbuf, buf, buf_size, &resultp);   \
+            if (ret == ERANGE) {                                        \
+                buf_size += 128;                                        \
+                continue;                                               \
+            }                                                           \
+            if (ret != 0)                                               \
+                return ERROR_FAIL;                                      \
+            if (resultp != NULL) {                                      \
+                if (out) *out = resultp;                                \
+                return 1;                                               \
+            }                                                           \
+            return 0;                                                   \
+        }                                                               \
+    }
+
+DEFINE_USERLOOKUP_HELPER(getpwnam, const char*, passwd, _SC_GETPW_R_SIZE_MAX);
+DEFINE_USERLOOKUP_HELPER(getpwuid, uid_t,   passwd, _SC_GETPW_R_SIZE_MAX);
+
+static int libxl__domain_get_device_model_uid(libxl__gc *gc,
+  libxl__dm_spawn_state *dmss)
+{
+int guest_domid = dmss->guest_domid;
+libxl__domain_build_state *const state = dmss->build_state;
+const libxl_domain_build_info *b_info = >guest_config->b_info;
+
+struct passwd *user_base, user_pwbuf;
+int ret;
+char *user;
+
+/* Only qemu-upstream can run as a different uid */
+if (b_info->device_model_version != LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN)
+return 0;
+
+user = b_info->device_model_user;
+if (user)
+goto end_search;
+
+if (!libxl_defbool_val(b_info->dm_restrict)) {
+LOGD(DEBUG, guest_domid,
+ "dm_restrict disabled, 
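
The lookup helper in the (truncated) patch above follows the usual pattern for the reentrant passwd calls: ask sysconf() for a buffer size hint, then retry with a larger buffer while the call reports ERANGE. A minimal standalone sketch of that pattern, using plain realloc()/free() instead of libxl's gc allocator; the function name lookup_uid_by_name is hypothetical and not part of libxl:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pwd.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper, not libxl code: look up a user by name.
 * Returns 1 if the user was found (uid stored in *uid_out),
 * 0 if there is no such user, -1 on error.
 */
static int lookup_uid_by_name(const char *name, uid_t *uid_out)
{
    struct passwd pwbuf, *result = NULL;
    char *buf = NULL;
    long buf_size = sysconf(_SC_GETPW_R_SIZE_MAX);
    int ret, found;

    assert(name != NULL);

    if (buf_size <= 0)
        buf_size = 2048;        /* no hint from sysconf: start with a guess */

    for (;;) {
        char *nbuf = realloc(buf, (size_t)buf_size);
        if (nbuf == NULL) {
            free(buf);
            return -1;
        }
        buf = nbuf;

        ret = getpwnam_r(name, &pwbuf, buf, (size_t)buf_size, &result);
        if (ret != ERANGE)
            break;
        buf_size += 128;        /* buffer too small: grow and retry */
    }

    /* Success with result == NULL means "no such user", not an error. */
    found = (ret == 0 && result != NULL);
    if (found && uid_out != NULL)
        *uid_out = result->pw_uid;
    free(buf);

    return ret != 0 ? -1 : found;
}
```

The three-way return value mirrors the convention stated in the patch comment (1 found, 0 not found, negative on error), so a caller can tell a missing user apart from a failed lookup.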

Re: [Xen-devel] remove the ->mapping_error method from dma_map_ops V3

2018-12-06 Thread Christoph Hellwig
I've pulled this into the dma-mapping for-next tree, with the suggestion
from Robin that improves bisectability, and two unused variables found
by the build bot.


Re: [Xen-devel] [PATCH] libxl: Documentation about the domain configuration on disk

2018-12-06 Thread Anthony PERARD
On Thu, Dec 06, 2018 at 12:16:40PM +, Wei Liu wrote:
> On Thu, Dec 06, 2018 at 10:43:32AM +, Anthony PERARD wrote:
> > +UPDATE OF DOMAIN CONFIGURATION
> > +------------------------------
> > +
> > +Also known as "libxl-json" userdata or `libxl_domain_config'.
> > +
> > +Whenever a running domain has its configuration updated, like changing
> > +media in a cdrom drive, the domain configuration in libxl private data
> > +store needs to be updated as well. The domain configuration should
> > +contain *more* information about the domain rather than less: stale data
> > +are easier to spot than missing data.
> > +
> > +Here is an example of how to update the domain configuration:
> > + * Remove current media from cdrom drive
> > + * Update domain configuration with media removed
> 
> We may not even need this because the primary source in this case is
> QEMU. See below.
> 
> > + (we could stop here)
> > + * Update domain configuration to add media we are about to insert
> > + * Insert media into cdrom drive
> 
> In essence we need a primary reference while using libxl-json file as a
> secondary source.
> 
> When doing device hotplug, the primary source is xenstore. It may become
> QEMU in the future if we move to a model where everything is
> communicated via QMP.
> 
> When doing CDROM insertion and ejection, the primary source is QEMU
> state.

I'm not trying to figure out what the primary source should be here; I'm
trying to find out how the secondary source, namely "libxl-json", should
behave: what it should contain, and when to update it relative to the
primary source (what a guest ultimately sees).

> All in all I think your description is not wrong but it failed to
> capture the high-level intent -- always update libxl-json before
> updating the primary source.

That isn't what Ian said IRL, I don't think. From what I understand,
when removing a media/disk, first remove the media, then update
libxl-json; when adding a media/disk, first update libxl-json, then add
the media.
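
As I read the ordering described above, the point is that libxl-json may be stale but should never be missing something the guest can see. A small sketch of that idea with a toy model; struct cdrom_model and the function names are hypothetical, and nothing here calls libxl or QEMU:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Toy model, not libxl code: one cdrom drive and its libxl-json record. */
struct cdrom_model {
    const char *drive_media;   /* what the guest sees (primary source) */
    const char *json_media;    /* what libxl-json records (secondary)  */
};

/* Stale json data is tolerated; missing json data is not. */
static bool json_covers_drive(const struct cdrom_model *m)
{
    if (m->drive_media == NULL)
        return true;
    return m->json_media != NULL &&
           strcmp(m->json_media, m->drive_media) == 0;
}

static void eject_media(struct cdrom_model *m)
{
    m->drive_media = NULL;            /* 1. remove the media            */
    assert(json_covers_drive(m));     /*    json is stale but still safe */
    m->json_media = NULL;             /* 2. then update libxl-json      */
}

static void insert_media(struct cdrom_model *m, const char *path)
{
    assert(m->drive_media == NULL);   /* sketch assumes an empty drive  */
    m->json_media = path;             /* 1. update libxl-json first     */
    assert(json_covers_drive(m));
    m->drive_media = path;            /* 2. then insert the media       */
    assert(json_covers_drive(m));
}
```

If either operation is interrupted between its two steps, the record errs on the side of listing media that is no longer (or not yet) in the drive, which matches the "more information rather than less" rule in the quoted text.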

> > +
> > +Retrieving / storing the domain configuration from / to the libxl private
> > +data store is done with `libxl__get_domain_configuration' and
> > +`libxl__set_domain_configuration'. Consult libxl_internal.h for more
> > +information.
> > +
> 
> What do you think about the text around libxl_internal.h:L2598?

If only I had known this comment existed :-(. It is buried, and doesn't
mention "libxl-json" or "userdata" or "domain config" but only the not very
helpful term "json config"... Hmm, ... it actually has "domain
configuration" once.

Anyway, that comment block isn't very helpful because it basically says
that we can't depriv QEMU, I mean do hotplug with a deprived QEMU. It
assumes that we can keep a lock on the userdata while updating the
guest, but we can't keep the lock while talking with QEMU (or more
generally: we can't keep the lock while doing any async operation).

But there is one useful piece of information:
Here we maintain one invariant: every device in xenstore must have
an entry in JSON file.
(xenstore is described as "primary reference" just before that sentence).

This is what I would like my past self to have been able to find out more
easily, and having the information in CODING_STYLE would make sense, I
think.
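
The invariant quoted above (every device in xenstore must have an entry in the JSON file) amounts to a subset check. A minimal sketch, assuming devices are identified by plain strings; the function names and the device identifiers used with them are made up for illustration and do not correspond to libxl data structures:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: is dev listed among the JSON entries? */
static bool json_has_device(const char *const *json, size_t n_json,
                            const char *dev)
{
    assert(dev != NULL);
    for (size_t i = 0; i < n_json; i++)
        if (strcmp(json[i], dev) == 0)
            return true;
    return false;
}

/*
 * Invariant: every device in xenstore has an entry in the JSON file.
 * Extra JSON entries (stale data) are allowed.
 */
static bool invariant_holds(const char *const *xs, size_t n_xs,
                            const char *const *json, size_t n_json)
{
    for (size_t i = 0; i < n_xs; i++)
        if (!json_has_device(json, n_json, xs[i]))
            return false;
    return true;
}
```

Note the check is deliberately one-directional: a JSON entry with no xenstore counterpart does not violate it.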

> Maybe we should extend that comment block?

I still think it would be helpful to have pointers in CODING_STYLE, as
there isn't a single place in libxl_internal.h where the information I
was looking for could be added.

Thanks,

-- 
Anthony PERARD


Re: [Xen-devel] [PATCH v3 2/4] iommu: rename wrapper functions

2018-12-06 Thread Jan Beulich
>>> On 05.12.18 at 12:29,  wrote:
> A subsequent patch will add semantically different versions of
> iommu_map/unmap() so, in advance of that change, this patch renames the
> existing functions to iommu_legacy_map/unmap() and modifies all call-sites.
> It also adjusts a comment that refers to iommu_map_page(), which was re-
> named by a previous patch.
> 
> This patch is purely cosmetic. No functional change.
> 
> Signed-off-by: Paul Durrant 

Acked-by: Jan Beulich 

I have to admit that I'm undecided whether to ask that this be
committed in the same development window where all the
"legacy" infixes would go away again, i.e. presumably only
after 4.12 now.

Jan




[Xen-devel] [xen-unstable-smoke test] 131076: tolerable all pass - PUSHED

2018-12-06 Thread osstest service owner
flight 131076 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/131076/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt     13 migrate-support-check        fail   never pass
 test-armhf-armhf-xl          13 migrate-support-check        fail   never pass
 test-armhf-armhf-xl          14 saverestore-support-check    fail   never pass

version targeted for testing:
 xen  ae01a8e315fecb1914edd99980a619d387951d3f
baseline version:
 xen  81cfc1b3c78f5d4abafdb368ede914b1dd825a7b

Last test of basis   131069  2018-12-06 00:00:47 Z    0 days
Testing same since   131076  2018-12-06 12:00:38 Z    0 days    1 attempts


People who touched revisions under test:
  Andrew Cooper 
  Jan Beulich 
  Wei Liu 

jobs:
 build-amd64  pass
 build-armhf  pass
 build-amd64-libvirt  pass
 test-armhf-armhf-xl  pass
 test-amd64-amd64-xl-qemuu-debianhvm-i386 pass
 test-amd64-amd64-libvirt pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To xenbits.xen.org:/home/xen/git/xen.git
   81cfc1b3c7..ae01a8e315  ae01a8e315fecb1914edd99980a619d387951d3f -> smoke

