[Xen-devel] [linux-3.18 test] 102732: regressions - FAIL

2016-11-30 Thread osstest service owner
flight 102732 linux-3.18 real [real]
http://logs.test-lab.xenproject.org/osstest/logs/102732/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-xl-qemut-debianhvm-amd64  6 xen-boot fail REGR. vs. 101675
 test-amd64-amd64-xl-multivcpu  6 xen-boot fail REGR. vs. 101675
 test-amd64-amd64-qemuu-nested-intel  6 xen-boot  fail REGR. vs. 101675
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 6 xen-boot fail REGR. vs. 101675
 test-amd64-amd64-libvirt-xsm  6 xen-boot fail REGR. vs. 101675
 test-amd64-amd64-xl-qemuu-ovmf-amd64  6 xen-boot fail REGR. vs. 101675
 test-amd64-amd64-xl-qemut-debianhvm-amd64-xsm 6 xen-boot fail REGR. vs. 101675
 test-amd64-amd64-xl-qemut-win7-amd64  6 xen-boot fail REGR. vs. 101675
 test-amd64-amd64-libvirt-pair  9 xen-boot/src_host   fail REGR. vs. 101675
 test-amd64-amd64-libvirt-pair 10 xen-boot/dst_host   fail REGR. vs. 101675
 test-amd64-amd64-amd64-pvgrub  6 xen-boot fail REGR. vs. 101675
 build-i386-pvops  5 kernel-build fail REGR. vs. 101675

Regressions which are regarded as allowable (not blocking):
 test-armhf-armhf-libvirt-xsm 13 saverestore-support-check fail like 101675
 test-armhf-armhf-libvirt-qcow2 12 saverestore-support-check fail like 101675
 test-armhf-armhf-libvirt 13 saverestore-support-check fail like 101675
 test-armhf-armhf-libvirt-raw 12 saverestore-support-check fail like 101675
 test-amd64-amd64-xl-rtds  9 debian-install   fail  like 101675

Tests which did not succeed, but are not blocking:
 test-amd64-i386-freebsd10-i386  1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a
 test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm  1 build-check(1) blocked n/a
 test-amd64-i386-xl-qemut-debianhvm-amd64-xsm  1 build-check(1) blocked n/a
 test-amd64-i386-xl-qemuu-winxpsp3-vcpus1  1 build-check(1) blocked n/a
 test-amd64-i386-libvirt-xsm   1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-qemuu-winxpsp3  1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-qemut-winxpsp3  1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-qemut-debianhvm-amd64  1 build-check(1) blocked n/a
 test-amd64-i386-qemut-rhel6hvm-intel  1 build-check(1) blocked n/a
 test-amd64-i386-freebsd10-amd64  1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-qemuu-debianhvm-amd64  1 build-check(1) blocked n/a
 test-amd64-i386-xl  1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt-pair  1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-xsm  1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-qemuu-ovmf-amd64  1 build-check(1)  blocked n/a
 test-amd64-i386-xl-raw  1 build-check(1)   blocked  n/a
 test-amd64-i386-qemuu-rhel6hvm-amd  1 build-check(1)   blocked n/a
 test-amd64-i386-libvirt   1 build-check(1)   blocked  n/a
 test-amd64-i386-qemuu-rhel6hvm-intel  1 build-check(1) blocked n/a
 test-amd64-i386-xl-qemut-winxpsp3-vcpus1  1 build-check(1) blocked n/a
 test-amd64-i386-xl-qemuu-win7-amd64  1 build-check(1)  blocked n/a
 test-amd64-i386-pair  1 build-check(1)   blocked  n/a
 test-amd64-i386-rumprun-i386  1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm 1 build-check(1) blocked n/a
 test-amd64-i386-xl-qemut-win7-amd64  1 build-check(1)  blocked n/a
 test-amd64-i386-qemut-rhel6hvm-amd  1 build-check(1)   blocked n/a
 test-amd64-amd64-rumprun-amd64  6 xen-boot fail never pass
 test-amd64-amd64-xl-pvh-intel 14 guest-saverestore fail  never pass
 test-amd64-amd64-libvirt 12 migrate-support-check fail   never pass
 test-amd64-amd64-xl-pvh-amd  11 guest-start  fail   never pass
 test-amd64-amd64-libvirt-vhd 11 migrate-support-check fail   never pass
 test-armhf-armhf-libvirt-xsm 12 migrate-support-check fail   never pass
 test-armhf-armhf-xl-xsm  12 migrate-support-check fail   never pass
 test-armhf-armhf-xl-arndale  12 migrate-support-check fail   never pass
 test-armhf-armhf-xl-xsm  13 saverestore-support-check fail   never pass
 test-armhf-armhf-xl-arndale  13 saverestore-support-check fail   never pass
 test-amd64-amd64-qemuu-nested-amd 16 debian-hvm-install/l1/l2  fail never pass
 test-armhf-armhf-xl-rtds 12 migrate-support-check fail   never pass
 test-armhf-armhf-xl-rtds 13 saverestore-support-check fail   never pass
 test-armhf-armhf-libvirt-qcow2 11 migrate-support-check fail never pass
 test-armhf-armhf-xl  12 migrate-support-check fail   never pass
 

[Xen-devel] [xen-4.4-testing test] 102730: regressions - trouble: blocked/broken/fail/pass

2016-11-30 Thread osstest service owner
flight 102730 xen-4.4-testing real [real]
http://logs.test-lab.xenproject.org/osstest/logs/102730/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-armhf   3 host-install(3) broken REGR. vs. 102521
 test-amd64-i386-xend-qemut-winxpsp3  9 windows-install   fail REGR. vs. 102521
 test-armhf-armhf-xl-multivcpu 11 guest-start   fail in 102718 REGR. vs. 102521

Tests which are failing intermittently (not blocking):
 test-amd64-amd64-amd64-pvgrub  9 debian-di-install fail pass in 102718
 test-amd64-i386-xl-qemuu-debianhvm-amd64 9 debian-hvm-install fail pass in 102718
 test-amd64-i386-xl-qemuu-ovmf-amd64  9 debian-hvm-install  fail pass in 102718
 test-amd64-i386-qemuu-rhel6hvm-amd  9 redhat-install   fail pass in 102718
 test-amd64-amd64-i386-pvgrub  9 debian-di-install  fail pass in 102718
 test-xtf-amd64-amd64-2 20 xtf/test-hvm32-invlpg~shadow fail pass in 102718
 test-amd64-i386-qemut-rhel6hvm-intel  9 redhat-install fail pass in 102718
 test-xtf-amd64-amd64-2  29 xtf/test-hvm32pae-invlpg~shadow fail pass in 102718
 test-xtf-amd64-amd64-2 40 xtf/test-hvm64-invlpg~shadow fail pass in 102718
 test-amd64-i386-pv   14 guest-saverestore  fail pass in 102718
 test-amd64-amd64-xl-qemuu-winxpsp3 15 guest-localmigrate/x10 fail pass in 102718

Regressions which are regarded as allowable (not blocking):
 test-xtf-amd64-amd64-4 16 xtf/test-pv32pae-selftest fail in 102718 like 102521
 test-xtf-amd64-amd64-2 16 xtf/test-pv32pae-selftest fail in 102718 like 102521
 test-amd64-i386-xl-qemut-win7-amd64 16 guest-stop fail like 102521
 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-stop fail like 102521
 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop fail like 102521
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop fail like 102521

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-libvirt  1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt-raw  1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-credit2   1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-arndale   1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-multivcpu  1 build-check(1)   blocked  n/a
 test-amd64-amd64-rumprun-amd64  1 build-check(1)   blocked  n/a
 build-armhf-libvirt   1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt-qcow2  1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl   1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-vhd   1 build-check(1)   blocked  n/a
 test-amd64-i386-rumprun-i386  1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-cubietruck  1 build-check(1)   blocked  n/a
 test-xtf-amd64-amd64-4 18 xtf/test-hvm32-cpuid-faulting fail in 102718 never pass
 test-armhf-armhf-libvirt-qcow2  9 debian-di-install  fail in 102718 never pass
 test-armhf-armhf-xl-vhd   9 debian-di-install fail in 102718 never pass
 test-armhf-armhf-libvirt-raw  9 debian-di-install fail in 102718 never pass
 test-xtf-amd64-amd64-2 18 xtf/test-hvm32-cpuid-faulting fail in 102718 never pass
 test-xtf-amd64-amd64-4 27 xtf/test-hvm32pae-cpuid-faulting fail in 102718 never pass
 test-xtf-amd64-amd64-2 27 xtf/test-hvm32pae-cpuid-faulting fail in 102718 never pass
 test-xtf-amd64-amd64-4 33 xtf/test-hvm32pse-cpuid-faulting fail in 102718 never pass
 test-xtf-amd64-amd64-2 33 xtf/test-hvm32pse-cpuid-faulting fail in 102718 never pass
 test-xtf-amd64-amd64-4 37 xtf/test-hvm64-cpuid-faulting fail in 102718 never pass
 test-xtf-amd64-amd64-2 37 xtf/test-hvm64-cpuid-faulting fail in 102718 never pass
 test-xtf-amd64-amd64-2 49 xtf/test-pv64-cpuid-faulting fail in 102718 never pass
 test-xtf-amd64-amd64-4 49 xtf/test-pv64-cpuid-faulting fail in 102718 never pass
 test-armhf-armhf-xl-credit2 12 migrate-support-check fail in 102718 never pass
 test-armhf-armhf-xl-arndale 12 migrate-support-check fail in 102718 never pass
 test-armhf-armhf-xl-credit2 13 saverestore-support-check fail in 102718 never pass
 test-armhf-armhf-xl-arndale 13 saverestore-support-check fail in 102718 never pass
 test-armhf-armhf-libvirt 11 guest-start  fail in 102718 never pass
 test-armhf-armhf-xl-cubietruck 12 migrate-support-check fail in 102718 never pass
 test-armhf-armhf-xl 12 migrate-support-check fail in 102718 never pass
 test-armhf-armhf-xl-cubietruck 13 saverestore-support-check fail in 102718 never pass
 test-armhf-armhf-xl 13 saverestore-support-check fail in 102718 never pass
 test-xtf-amd64-amd64-5   10 xtf-fep  fail   never pass
 build-i386-rumprun  7 xen-build fail   never pass
 test-xtf-amd64-amd64-5   16 xtf/test-pv32pae-selftest fail   never pass
 test-xtf-amd64-amd64-5   18 

Re: [Xen-devel] Xen ARM small task (WAS: Re: [Xen Question])

2016-11-30 Thread Stefano Stabellini
On Fri, 25 Nov 2016, Julien Grall wrote:
> Hi Stefano,
> 
> On 23/11/16 19:55, Stefano Stabellini wrote:
> > Actually I am thinking that the default values should be in the
> > emulators themselves. After all they are the part of the code that knows
> > more about vuarts.
> 
> Can you expand what you mean by emulator? I was never expecting to have a
> fully emulated UART exposed to the guest (i.e read/write character support)
> except for a PL011.

Once we start having emulators, it is possible that we'll end up with
more than one. For example, we introduce the PL011 now, then in a couple
of years somebody wants to add ns16550 because it is the only one that
Windows 2019 supports. I am assuming that one way or another they'll run
in a low privilege mode (see other recent threads).


> The current vuart (see xen/arch/arm/vuart.c) is very simple but requires
> someone to configure it. For DOM0, this is configured by the serial driver.
> For guests we need someone to do the same.

I understand. For clarity, I'll call "PL011 emulator" the one that will
end up being used for DomUs, which might be based on, or different from,
xen/arch/arm/vuart.c. It doesn't exist yet.

The PL011 emulator should have default values for everything. Some of
these values could be configured by libxl, but none should be required
to be configured by libxl. The last thing we want is to disseminate
numbers and addresses in libxl. One of these parameters could be the
MMIO address, but that is just an example; we don't necessarily need to
support changing it. It could be a decent feature to have, but I don't
think it is important if we'll support configurable memory layouts soon.
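
To make the "defaults live in the emulator" idea concrete, here is a rough
sketch. Everything in it is an assumption made up for illustration: the
structure, the function names and the default values are hypothetical and
do not exist in Xen or libxl.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* An optional value: 'set' is true only when the toolstack supplied it. */
struct opt_u64 {
    bool set;
    uint64_t value;
};

/* Hypothetical PL011 emulator configuration; not an existing Xen structure. */
struct pl011_emu_config {
    struct opt_u64 mmio_base;
    struct opt_u64 irq;
};

/* Placeholder defaults chosen for illustration only. */
#define PL011_EMU_DEFAULT_MMIO_BASE UINT64_C(0x10000000)
#define PL011_EMU_DEFAULT_IRQ       UINT64_C(32)

/* Use the fallback only when the caller did not provide a value. */
static void opt_u64_default(struct opt_u64 *opt, uint64_t fallback)
{
    if (!opt->set) {
        opt->value = fallback;
        opt->set = true;
    }
}

/* Fill in every field the toolstack left unset; explicit values win. */
static void pl011_emu_apply_defaults(struct pl011_emu_config *cfg)
{
    assert(cfg != NULL);
    opt_u64_default(&cfg->mmio_base, PL011_EMU_DEFAULT_MMIO_BASE);
    opt_u64_default(&cfg->irq, PL011_EMU_DEFAULT_IRQ);
}
```

With this shape, libxl would only forward whatever the user wrote in the VM
config, and the list of default numbers stays in one place, next to the
emulator code.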


> > So the toolstack would pass down all the info provided by the users to
> > Xen. Xen would start the appropriate emulator, initializing anything not
> > specifically configured by libxl to default values. No need for long
> > lists of defaults in libxl.
> > 
> > 
> > > If so, you will end up with people asking to implement each of their
> > > UARTs (8250, exynos, pl011...) in the toolstack. A user would have to
> > > pay attention to whether this model is supported or not by their
> > > toolstack.
> > 
> > It is up to the maintainers to decide how many and which vuart should be
> > configurable. libxl would have the capability of listing supported models
> > of vuarts. Today libxl already does that for nics and vgas.
> 
> As we discussed recently, the goal of exposing the vuart is to let the guest
> write data, not read it, without having to bring in full PV drivers.
> 
> Supporting multiple fully emulated UARTs would be very similar to
> incorporating pieces of QEMU code within Xen. I think we both know well what
> that means in terms of security.
> 
> We have to emulate a PL011 because this is part of the VM spec. If you think
> that more kinds of UART have to be emulated, then I would like to see a real
> use case, as nobody has stepped up for that on the ML so far.

Unfortunately we have to expect that the number of requests for
emulators will only increase going forward. We need to have a proper low
privilege mode to run them in, to avoid security issues in Xen.


> > > Lastly, the pl011 emulation needs to be easily enabled by any user
> > > without requiring knowledge of the guest memory layout (which is not
> > > stable BTW). By default the layout is static, so what's the point of
> > > letting the user configure it?
> > 
> > This is my reasoning: people that request a vuart explicitly in the VM
> > config file are people that are configuring an embedded system with
> > non-Linux OSes because all others should be able to use the PV console
> > effectively.
> > 
> > In that case, to setup the system with the minimum amount of
> > configuration and efforts, they might want to emulate one of the UARTs
> > supported by their non-Linux OSes. The PL011 is pretty widespread, so it
> > could be a good choice.
> > Additionally they know the memory layout of all their VMs, so they can
> > easily pick an unused address and configure it both in the VM config
> > file and in their non-Linux OS.
> 
> Their non-Linux OSes will already need to be aware of the guest memory layout
> because it is fully static. They will either use device-tree/ACPI or hardcode
> the layout.
> 
> In both cases, could you explain why they would want to configure the base
> address of the UART? It looks to me like choosing the address is more of a
> burden. They would be fine using the one in the static memory layout.
> 
> If they want a dynamic memory layout (host or custom [1]), then it needs to be
> fully defined separately. I am not in favor of having a layout that can be
> half-static, half-dynamic as you are currently suggesting.
> 
> Note that I know this is currently the case for iomem parameters, but I found
> this ugly and there was no better solution. Let's not continue that way.

I see, I was exactly thinking of iomem as a model :-)

My thinking is that some users might have half-hacked Xen 

Re: [Xen-devel] [PATCH] xen/arm: Fix misplaced parentheses for PSCI version check

2016-11-30 Thread Wei Liu
On Wed, Nov 30, 2016 at 11:16:49AM -0800, Stefano Stabellini wrote:
> On Wed, 30 Nov 2016, Julien Grall wrote:
> > Hi Artem,
> > 
> > On 30/11/16 13:53, Artem Mygaiev wrote:
> > > Fix misplaced parentheses for PSCI version check
> > > 
> > > Signed-off-by: Artem Mygaiev 
> > 
> > Can you please include the coverity ID:
> > 
> > Coverity-ID: 1381830
> > 
> > With that:
> > 
> > Reviewed-by: Julien Grall 
> > 
> > > This has been introduced by me in commit 2831f20 "xen/arm: Add support of
> > > PSCI v1.0 for the host" in Xen 4.7. I am not sure whether we should
> > > backport it.
>  
> I think we should backport it.
> 
> Wei, can I have a release-ack or do you want to wait until after the
> release?

Please wait until after the release.

Wei.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [xen-4.5-testing baseline-only test] 68129: regressions - FAIL

2016-11-30 Thread Platform Team regression test user
This run is configured for baseline tests only.

flight 68129 xen-4.5-testing real [real]
http://osstest.xs.citrite.net/~osstest/testlogs/logs/68129/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-xtf-amd64-amd64-5 20 xtf/test-hvm32-invlpg~shadow fail REGR. vs. 68122
 test-xtf-amd64-amd64-4 20 xtf/test-hvm32-invlpg~shadow fail REGR. vs. 68122
 test-xtf-amd64-amd64-5 29 xtf/test-hvm32pae-invlpg~shadow fail REGR. vs. 68122
 test-xtf-amd64-amd64-4 29 xtf/test-hvm32pae-invlpg~shadow fail REGR. vs. 68122
 test-xtf-amd64-amd64-5 40 xtf/test-hvm64-invlpg~shadow fail REGR. vs. 68122
 test-xtf-amd64-amd64-4 40 xtf/test-hvm64-invlpg~shadow fail REGR. vs. 68122
 test-xtf-amd64-amd64-1 20 xtf/test-hvm32-invlpg~shadow fail REGR. vs. 68122
 test-xtf-amd64-amd64-1 29 xtf/test-hvm32pae-invlpg~shadow fail REGR. vs. 68122
 test-xtf-amd64-amd64-1 40 xtf/test-hvm64-invlpg~shadow fail REGR. vs. 68122
 test-amd64-amd64-xl-qemuu-win7-amd64 15 guest-localmigrate/x10 fail REGR. vs. 68122

Regressions which are regarded as allowable (not blocking):
 test-xtf-amd64-amd64-3   20 xtf/test-hvm32-invlpg~shadow fail   like 68122
 test-xtf-amd64-amd64-3  29 xtf/test-hvm32pae-invlpg~shadow fail like 68122
 test-xtf-amd64-amd64-3   40 xtf/test-hvm64-invlpg~shadow fail   like 68122
 test-amd64-amd64-xl-rtds  6 xen-boot fail   like 68122
 test-amd64-amd64-qemuu-nested-intel 13 xen-boot/l1 fail like 68122
 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-stop fail like 68122
 test-amd64-amd64-xl-qemut-winxpsp3  9 windows-install  fail like 68122

Tests which did not succeed, but are not blocking:
 test-amd64-i386-rumprun-i386 10 rumprun-demo-xenstorels/xenstorels fail never pass
 test-amd64-amd64-rumprun-amd64 10 rumprun-demo-xenstorels/xenstorels fail never pass
 test-amd64-amd64-xl-pvh-intel 11 guest-start  fail  never pass
 test-xtf-amd64-amd64-2   20 xtf/test-hvm32-invlpg~shadow fail   never pass
 test-xtf-amd64-amd64-2  29 xtf/test-hvm32pae-invlpg~shadow fail never pass
 test-xtf-amd64-amd64-2   40 xtf/test-hvm64-invlpg~shadow fail   never pass
 test-armhf-armhf-xl-multivcpu 12 migrate-support-check fail  never pass
 test-armhf-armhf-xl-multivcpu 13 saverestore-support-check fail  never pass
 test-armhf-armhf-xl-credit2  12 migrate-support-check fail   never pass
 test-armhf-armhf-xl-credit2  13 saverestore-support-check fail   never pass
 test-armhf-armhf-xl-midway   12 migrate-support-check fail   never pass
 test-armhf-armhf-xl-midway   13 saverestore-support-check fail   never pass
 test-armhf-armhf-libvirt 12 migrate-support-check fail   never pass
 test-armhf-armhf-libvirt 14 guest-saverestore fail   never pass
 test-armhf-armhf-xl  12 migrate-support-check fail   never pass
 test-armhf-armhf-xl  13 saverestore-support-check fail   never pass
 test-amd64-amd64-xl-pvh-amd  11 guest-start  fail   never pass
 test-armhf-armhf-xl-rtds 12 migrate-support-check fail   never pass
 test-armhf-armhf-xl-rtds 13 saverestore-support-check fail   never pass
 test-armhf-armhf-libvirt-qcow2 10 guest-start  fail never pass
 test-amd64-amd64-libvirt 12 migrate-support-check fail   never pass
 test-amd64-i386-libvirt  12 migrate-support-check fail   never pass
 test-armhf-armhf-libvirt-raw 10 guest-start  fail   never pass
 test-armhf-armhf-xl-vhd  10 guest-start  fail   never pass
 test-amd64-amd64-qemuu-nested-amd 16 debian-hvm-install/l1/l2  fail never pass
 test-amd64-amd64-libvirt-vhd 11 migrate-support-check fail   never pass
 test-amd64-i386-xl-qemut-win7-amd64 16 guest-stop  fail never pass
 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop  fail never pass

version targeted for testing:
 xen  cc325c0bd56d5cc327f8a426d661cc1a2f3a52bd
baseline version:
 xen  8e7b84dd2a187edc74f44b69437734b8e4af9628

Last test of basis    68122  2016-11-29 17:14:25 Z    1 days
Testing same since    68129  2016-11-30 15:47:00 Z    0 days    1 attempts


People who touched revisions under test:
  Ian Jackson 

jobs:
 build-amd64-xtf  pass
 build-amd64  pass
 build-armhf  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-armhf-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-prev

[Xen-devel] tap device name for emulated NIC too long

2016-11-30 Thread Jim Fehlig

Hi All,

During the last Wg-openstack meetup we briefly discussed a long-standing bug 
when using Xen+libvirt+OpenStack with Neutron networking:


https://bugs.launchpad.net/nova/+bug/1450465

The bug was also discussed on this list with no resolution

https://lists.xenproject.org/archives/html/xen-devel/2015-06/msg04116.html

To summarize: the tap device name for an emulated NIC is too long after libxl 
appends '-emu' to the name provided by Neutron. Some proposed fixes include:


1. Shorten '-emu' to just '-e' to stay within the IFNAMSIZ limit. But users are 
free to provide a name that already occupies the full IFNAMSIZ. Also, the 
user-provided name may be used in rules, filters, etc. elsewhere in the network, 
so modifying it at all seems questionable.


2. Change OpenStack to not exceed IFNAMSIZ-4 when specifying Xen vif name. This 
could be proposed to the Neutron devs, but IMO adding such Xen-specific hacks in 
OpenStack is undesirable.


3. Change the Xen default vif type from 'ioemu' to 'vif' (see 
docs/misc/xl-network-configuration.markdown), which avoids creating an emulated 
device. (Note: such a change could be made in Xen or libvirt.) But I think this 
is a no-go. I'd suspect it would result in a lot of broken configurations. E.g. 
a guest may not have PV drivers and is relying on the emulated device. Or the 
guest may be configured to network boot, in which case the emulated device would 
be needed for PXE [0].
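
The length constraint behind proposals 1 and 2 can be sketched as a small 
check. This is only an illustration: ifname_fits() is a hypothetical helper, 
not libxl code, and it assumes that IFNAMSIZ counts the terminating NUL, as it 
does on Linux.

```c
#include <assert.h>
#include <net/if.h>   /* IF_NAMESIZE, and IFNAMSIZ where available */
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Some strict compilation modes only expose the POSIX name IF_NAMESIZE. */
#ifndef IFNAMSIZ
#define IFNAMSIZ IF_NAMESIZE
#endif

/*
 * Hypothetical helper: true if base + suffix still fits in an interface
 * name. IFNAMSIZ includes the terminating NUL, so at most IFNAMSIZ - 1
 * characters are usable.
 */
static bool ifname_fits(const char *base, const char *suffix)
{
    assert(base != NULL && suffix != NULL);
    return strlen(base) + strlen(suffix) <= (size_t)IFNAMSIZ - 1;
}
```

With IFNAMSIZ equal to 16, an 11-character name plus '-emu' still fits, while 
a 12-character name does not. A name that already uses all 15 available 
characters fails for any non-empty suffix, including '-e', which is the 
objection raised against proposal 1.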


We (the Wg-openstack folks) would like to hear your opinions on these proposals, 
or alternatives for fixing this bug.


Regards,
Jim

[0] iPXE claims support for Xen netfront devices, but I've not yet got it to 
work: http://lists.ipxe.org/pipermail/ipxe-devel/2014-July/003674.html




Re: [Xen-devel] [RFC PATCH 21/24] ARM: vITS: handle INVALL command

2016-11-30 Thread Stefano Stabellini
On Fri, 25 Nov 2016, Julien Grall wrote:
> Hi,
> 
> On 18/11/16 18:39, Stefano Stabellini wrote:
> > On Fri, 11 Nov 2016, Stefano Stabellini wrote:
> > > On Fri, 11 Nov 2016, Julien Grall wrote:
> > > > On 10/11/16 20:42, Stefano Stabellini wrote:
> > > > That's why the approach we had in the previous series was "host ITS
> > > > commands should be limited when emulating guest ITS commands". From
> > > > what I recall, in that series the host and guest LPIs were fully
> > > > separated (enabling guest LPIs did not enable host LPIs).
> > > 
> > > I am interested in reading what Ian suggested to do when the physical
> > > ITS queue is full, but I cannot find anything specific about it in the
> > > doc.
> > > 
> > > Do you have a suggestion for this?
> > > 
> > > The only things that come to mind right now are:
> > > 
> > > 1) check if the ITS queue is full and busy loop until it is not
> > >    (spin_lock style)
> > > 2) check if the ITS queue is full and sleep until it is not (mutex style)
> > 
> > Another, probably better idea, is to map all pLPIs of a device when the
> > device is assigned to a guest (including Dom0). This is what was written
> > in Ian's design doc. The advantage of this approach is that Xen doesn't
> > need to take any actions on the physical ITS command queue when the
> > guest issues virtual ITS commands, therefore completely solving this
> > problem at the root. (Although I am not sure about enable/disable
> > commands: could we avoid issuing enable/disable on pLPIs?)
> 
> In the previous design document (see [1]), the pLPIs are enabled when the
> device is assigned to the guest. This means that it is not necessary to send
> commands there. This also means we may receive a pLPI before the associated
> vLPI has been configured.
> 
> That said, given that LPIs are edge-triggered, there is no deactivate state
> (see 4.1 in ARM IHI 0069C). So as soon as the priority drop is done, the same
> LPI could potentially be raised again. This could generate a storm.

Thank you for raising this important point. You are correct.


> The priority drop is necessary if we don't want to block the reception of
> interrupts on the current physical CPU.
> 
> What I am more concerned about is that this problem can also happen in normal
> operation (i.e. the pLPI is bound to a vLPI and the vLPI is enabled). For
> edge-triggered interrupts, there is no way to prevent them from firing again.
> Maybe it is time to introduce interrupt rate-limiting for ARM. Any opinions?

Yes. It could be as simple as disabling the pLPI when Xen receives a
second pLPI before the guest EOIs the first corresponding vLPI, which
shouldn't happen in normal circumstances.

We need a simple per-LPI inflight counter, incremented when a pLPI is
received, decremented when the corresponding vLPI is EOIed (the LR is
cleared).

When the counter > 1, we disable the pLPI and request a maintenance
interrupt for the corresponding vLPI.

When we receive the maintenance interrupt and we clear the LR of the
vLPI, Xen should re-enable the pLPI.

Given that the state of the LRs is sync'ed before calling gic_interrupt,
we can be sure to know exactly in what state the vLPI is at any given
time. But for this to work correctly, it is important to configure the
pLPI to be delivered to the same pCPU running the vCPU which handles
the vLPI (as it is already the case today for SPIs).
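
As a rough sketch of the counter scheme above: the structure and function
names below are hypothetical, the two booleans merely stand in for the real
"disable the pLPI" and "request a maintenance interrupt" operations, and
locking is left out entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-LPI bookkeeping; not an existing Xen structure. */
struct lpi_track {
    unsigned int inflight;     /* pLPIs received whose vLPI is not yet EOIed */
    bool plpi_enabled;         /* stand-in for the physical enable state */
    bool maintenance_pending;  /* maintenance interrupt requested for the vLPI */
};

static void lpi_track_init(struct lpi_track *t)
{
    assert(t != NULL);
    t->inflight = 0;
    t->plpi_enabled = true;
    t->maintenance_pending = false;
}

/* A physical LPI was received for this entry. */
static void lpi_on_plpi(struct lpi_track *t)
{
    t->inflight++;
    if (t->inflight > 1) {
        /* Second pLPI before the guest EOIed the first one: throttle. */
        t->plpi_enabled = false;
        t->maintenance_pending = true;
    }
}

/* The LR of the corresponding vLPI was cleared (the guest EOIed it). */
static void lpi_on_lr_cleared(struct lpi_track *t)
{
    assert(t->inflight > 0);
    t->inflight--;
    if (t->maintenance_pending) {
        /* Maintenance interrupt handled: allow the pLPI to fire again. */
        t->maintenance_pending = false;
        t->plpi_enabled = true;
    }
}
```

In the normal case the counter only moves between 0 and 1 and the pLPI is
never touched; the disable path is taken only when a second pLPI arrives
while the first vLPI is still in an LR.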



Re: [Xen-devel] [PATCH v4 06/14] x86/vtd: refuse to enable IOMMU if the PCI scan fails

2016-11-30 Thread Tian, Kevin
> From: Roger Pau Monne [mailto:roger@citrix.com]
> Sent: Thursday, December 01, 2016 12:50 AM
> 
> This provides uniform behavior between Intel and AMD IOMMU initialization, and
> is a requirement for PVHv2 Dom0, that depends on a working IOMMU plus the PCI
> bus being scanned for devices.
> 
> Signed-off-by: Roger Pau Monné 

Acked-by: Kevin Tian 


Re: [Xen-devel] [ARM] Handling CMA pool device nodes in Dom0

2016-11-30 Thread Stefano Stabellini
On Tue, 29 Nov 2016, Julien Grall wrote:
> (CC Stefano)
> 
> On 25/11/16 12:19, Iurii Mykhalskyi wrote:
> > Hello!
> 
> Hi Iurii,
> 
> > 
> > I'm working on support for the Renesas Gen3 H3 board with 4GB RAM
> > (Salvator-X) in Xen mainline.
> > 
> > Salvator-X has several  CMA pool nodes, for example:
> > 
> > 1:
> > adsp_reserved: linux,adsp {
> >         compatible = "shared-dma-pool";
> >         reusable;
> >         reg = <0x 0x5700 0x0 0x0100>;
> > };
> > 
> > 2:
> > linux,cma {
> >         compatible = "shared-dma-pool";
> >         reusable;
> >         reg = <0x 0x5800 0x0 0x1800>;
> >         linux,cma-default;
> > };
> > 
> > During Dom0 allocation, we can't guarantee that the allocated memory will
> > contain the mentioned regions.
> > In the second case, we can actually hardcode the mapped region by using a
> > separate DTS for Dom0 with changed memory regions.
> > But for the first one, this is not an option: this pool is used for the
> > audio DSP and its firmware relies on these addresses.
> > 
> > What is the correct way to solve this situation?
> > Does Xen have some mechanism to handle such cases?
> 
> From my understanding all the nodes you mentioned are living under the node
> /reserved-memory, right? Currently Xen is not parsing this node.
> 
> Before answering about possible implementation in Xen, I would like to
> understand what are the constraints on these reserved memory regions.
> 
> I understand that when "reg" property is specified, it is a static allocation
> and we need to be able to map those regions at the same address in DOM0.
> 
> However, do these regions need to be included in memory node?

Another question: what caching attributes do they need in the stage2 mapping?


[Xen-devel] [ovmf test] 102727: all pass - PUSHED

2016-11-30 Thread osstest service owner
flight 102727 ovmf real [real]
http://logs.test-lab.xenproject.org/osstest/logs/102727/

Perfect :-)
All tests in this flight passed as required
version targeted for testing:
 ovmf e148e6e9625f8a0054f131bacba4e5c9a21a4377
baseline version:
 ovmf ff9a1358b3ff98b1c3a9b4b584fca71653a1c9fe

Last test of basis   102715  2016-11-29 16:43:34 Z    1 days
Testing same since   102727  2016-11-30 04:36:01 Z    0 days    1 attempts


People who touched revisions under test:
  Liming Gao 
  Yonghong Zhu 

jobs:
 build-amd64-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvops    pass
 build-i386-pvops pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64 pass
 test-amd64-i386-xl-qemuu-ovmf-amd64  pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

+ branch=ovmf
+ revision=e148e6e9625f8a0054f131bacba4e5c9a21a4377
+ . ./cri-lock-repos
++ . ./cri-common
+++ . ./cri-getconfig
+++ umask 002
+++ getrepos
++++ getconfig Repos
++++ perl -e '
use Osstest;
readglobalconfig();
print $c{"Repos"} or die $!;
'
+++ local repos=/home/osstest/repos
+++ '[' -z /home/osstest/repos ']'
+++ '[' '!' -d /home/osstest/repos ']'
+++ echo /home/osstest/repos
++ repos=/home/osstest/repos
++ repos_lock=/home/osstest/repos/lock
++ '[' x '!=' x/home/osstest/repos/lock ']'
++ OSSTEST_REPOS_LOCK_LOCKED=/home/osstest/repos/lock
++ exec with-lock-ex -w /home/osstest/repos/lock ./ap-push ovmf e148e6e9625f8a0054f131bacba4e5c9a21a4377
+ branch=ovmf
+ revision=e148e6e9625f8a0054f131bacba4e5c9a21a4377
+ . ./cri-lock-repos
++ . ./cri-common
+++ . ./cri-getconfig
+++ umask 002
+++ getrepos
++++ getconfig Repos
++++ perl -e '
use Osstest;
readglobalconfig();
print $c{"Repos"} or die $!;
'
+++ local repos=/home/osstest/repos
+++ '[' -z /home/osstest/repos ']'
+++ '[' '!' -d /home/osstest/repos ']'
+++ echo /home/osstest/repos
++ repos=/home/osstest/repos
++ repos_lock=/home/osstest/repos/lock
++ '[' x/home/osstest/repos/lock '!=' x/home/osstest/repos/lock ']'
+ . ./cri-common
++ . ./cri-getconfig
++ umask 002
+ select_xenbranch
+ case "$branch" in
+ tree=ovmf
+ xenbranch=xen-unstable
+ '[' xovmf = xlinux ']'
+ linuxbranch=
+ '[' x = x ']'
+ qemuubranch=qemu-upstream-unstable
+ select_prevxenbranch
++ ./cri-getprevxenbranch xen-unstable
+ prevxenbranch=xen-4.7-testing
+ '[' xe148e6e9625f8a0054f131bacba4e5c9a21a4377 = x ']'
+ : tested/2.6.39.x
+ . ./ap-common
++ : osst...@xenbits.xen.org
+++ getconfig OsstestUpstream
+++ perl -e '
use Osstest;
readglobalconfig();
print $c{"OsstestUpstream"} or die $!;
'
++ :
++ : git://xenbits.xen.org/xen.git
++ : osst...@xenbits.xen.org:/home/xen/git/xen.git
++ : git://xenbits.xen.org/qemu-xen-traditional.git
++ : git://git.kernel.org
++ : git://git.kernel.org/pub/scm/linux/kernel/git
++ : git
++ : git://xenbits.xen.org/xtf.git
++ : osst...@xenbits.xen.org:/home/xen/git/xtf.git
++ : git://xenbits.xen.org/xtf.git
++ : git://xenbits.xen.org/libvirt.git
++ : osst...@xenbits.xen.org:/home/xen/git/libvirt.git
++ : git://xenbits.xen.org/libvirt.git
++ : git://xenbits.xen.org/osstest/rumprun.git
++ : git
++ : git://xenbits.xen.org/osstest/rumprun.git
++ : osst...@xenbits.xen.org:/home/xen/git/osstest/rumprun.git
++ : git://git.seabios.org/seabios.git
++ : osst...@xenbits.xen.org:/home/xen/git/osstest/seabios.git
++ : git://xenbits.xen.org/osstest/seabios.git
++ : https://github.com/tianocore/edk2.git
++ : osst...@xenbits.xen.org:/home/xen/git/osstest/ovmf.git
++ : git://xenbits.xen.org/osstest/ovmf.git
++ : git://xenbits.xen.org/osstest/linux-firmware.git
++ : osst...@xenbits.xen.org:/home/osstest/ext/linux-firmware.git
++ : 

Re: [Xen-devel] [PATCH v5 3/3] Significant changes to decision making; some new roles and minor changes

2016-11-30 Thread Stefano Stabellini
On Wed, 23 Nov 2016, Lars Kurth wrote:
> List of changes
> - Added Goal: Local Decision Making
> - Split roles into project wide and sub-project specific roles
> - Added new roles: Community Manager, Security Response Team, Leadership Team
> - Added RTC Policy
> - Added +2 ... -2 scheme for expressing opinions more clearly
> - Clarified lazy consensus / lazy voting with examples
> - Added Informal Votes or Surveys
> - Added Project Team Leadership decisions (majority vote, non-monotonicity)
> - Clarified and Adapted Conflict Resolution to previous changes
> - Updated Elections to cover new roles and terminology
> - Changed Project Wide Decision making (per project, non-monotonicity)
> - Clarified scope of Decision making
> - Added section on Community Decisions with Funding and Legal Implications
> - Modified all other sections which have dependencies on changes above
> - Added Per Sub-Project Governance Specialisation
> - Fixed various typos
> - Fixed changelog
> 
> Signed-off-by: Lars Kurth 
> ---
>  governance.pandoc | 628 
> ++
>  1 file changed, 496 insertions(+), 132 deletions(-)
> 
> diff --git a/governance.pandoc b/governance.pandoc
> index 2ce780c..188fa41 100644
> --- a/governance.pandoc
> +++ b/governance.pandoc
> @@ -1,5 +1,5 @@
>  This document has come in effect in June 2011 and will be reviewed 
> periodically 
> -(see revision sections). The last modification has been made in July 2016.
> +(see revision sections). The last modification has been made in December 
> 2016.
>  
>  Content
>  ---
> @@ -11,8 +11,10 @@ Content
>  -   [Making Contributions](#contributions)
>  -   [Decision Making, Conflict Resolution, Role Nominations and 
>  Elections](#decisions)
> --   [Formal Votes](#formal-votes)
> +-   [Project Wide Decision Making](#project-decisions)
> +-   [Community Decisions with Funding and Legal 
> Implications](#funding-and-legal)
>  -   [Project Governance](#project-governance)
> +-   [Per Sub-Project Governance Specialisations](#specialisations)
>  
>  Goals {#goals}
>  -
> @@ -54,7 +56,12 @@ The Xen Project is a meritocracy. The more you contribute 
> the more
>  responsibility you will earn. Leadership roles in Xen are also merit-based 
> and 
>  earned by peer acclaim.
>  
> -Xen Project Wide Roles {#roles-global}
> +### Local Decision Making
> +
> +The Xen Project consists of a number of sub-projects: each sub-project makes 
> +technical and other decisions that solely affect it locally.
> +
> +Xen Project Wide Roles {#roles-global} 
>  --
>  
>  ### Sub-projects and Teams
> @@ -64,9 +71,22 @@ the [Project Governance](#project-governance) (or Project 
> Lifecycle) as
>  outlined in this document. Sub-projects (sometimes simply referred to as 
>  projects) are run by individuals and are often referred to as teams to 
>  highlight the collaborative nature of development. For example, each 
> -sub-project has a [team portal](/developers/teams.html) on Xenproject.org.
> +sub-project has a [team portal](/developers/teams.html) on Xenproject.org. 
> +Sub-projects own and are responsible for a collection of source repositories 
> +and other resources (e.g. test infrastructure, CI infrastructure, ...), 
> which 
> +we call **sub-project assets** (or team assets) in this document.
> +
> +Sub-projects can either be **incubation projects** or **mature projects** as 
> +outlined in [Basic Project Life Cycle](#project-governance). In line with 
> the 
> +meritocratic principle, mature projects have more influence than incubation 
> +projects, on [project wide decisions](#project-decisions).
> +
> +### Community Manager
>  
> -### Xen Project Advisory Board
> +The Xen Project has a community manager, whose primary role it is to support 
> +the entire Xen Project Community.
> +
> +### Xen Project Advisory Board {#roles-ab}
>  
>  The [Xen Project Advisory Board](/join.html) consists of members who are 
>  committed to steering the project to advance its market and technical 
> success, 
> @@ -76,7 +96,7 @@ shared project infrastructure, marketing and events, and 
> managing the Xen
>  Project trademark. The Advisory Board leaves all technical decisions to the 
>  open source meritocracy.
>  
> -### The Linux Foundation
> +### The Linux Foundation {#roles-lf}
>  
>  The Xen Project is a [Linux Foundation](/linux-foundation.html) 
> Collaborative 
>  Project. Collaborative Projects are independently funded software projects 
> that 
> @@ -95,21 +115,48 @@ members or other distinguished community members.
>  ### Sponsor
>  
>  To form a new sub-project or team on Xenproject.org, we require a sponsor to 
> -support the creation of the new project. A sponsor can be a project lead or 
> -committer of a mature project, a member of the advisory board or the 
> community 
> -manager. This ensures that a distinguished community member supports the 
> idea 
> -behind the project.
> +support the 

[Xen-devel] [libvirt test] 102726: tolerable all pass - PUSHED

2016-11-30 Thread osstest service owner
flight 102726 libvirt real [real]
http://logs.test-lab.xenproject.org/osstest/logs/102726/

Failures :-/ but no regressions.

Regressions which are regarded as allowable (not blocking):
 test-armhf-armhf-libvirt-xsm 13 saverestore-support-check fail  like 102706
 test-armhf-armhf-libvirt 13 saverestore-support-check fail  like 102706
 test-armhf-armhf-libvirt-raw 12 saverestore-support-check fail  like 102706
 test-armhf-armhf-libvirt-qcow2 12 saverestore-support-check   fail like 102706

Tests which did not succeed, but are not blocking:
 test-amd64-i386-libvirt-xsm  12 migrate-support-check fail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass
 test-amd64-amd64-libvirt 12 migrate-support-check fail   never pass
 test-amd64-amd64-libvirt-xsm 12 migrate-support-check fail   never pass
 test-amd64-amd64-libvirt-vhd 11 migrate-support-check fail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass
 test-amd64-i386-libvirt  12 migrate-support-check fail   never pass
 test-armhf-armhf-libvirt-xsm 12 migrate-support-check fail   never pass
 test-armhf-armhf-libvirt 12 migrate-support-check fail   never pass
 test-armhf-armhf-libvirt-raw 11 migrate-support-check fail   never pass
 test-armhf-armhf-libvirt-qcow2 11 migrate-support-check fail never pass

version targeted for testing:
 libvirt  bb738f9fcdc3967903a6ff78111dfa989f61d04d
baseline version:
 libvirt  17879605fe08bfe446fc10ae512ab83e0f37b08a

Last test of basis   102706  2016-11-29 04:21:25 Z    1 days
Testing same since   102726  2016-11-30 04:24:03 Z    0 days    1 attempts


People who touched revisions under test:
  Jiri Denemark 
  Michal Privoznik 

jobs:
 build-amd64-xsm  pass
 build-armhf-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-armhf  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-armhf-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvops  pass
 build-armhf-pvops  pass
 build-i386-pvops pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm   pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm  pass
 test-amd64-amd64-libvirt-xsm pass
 test-armhf-armhf-libvirt-xsm pass
 test-amd64-i386-libvirt-xsm  pass
 test-amd64-amd64-libvirt pass
 test-armhf-armhf-libvirt pass
 test-amd64-i386-libvirt  pass
 test-amd64-amd64-libvirt-pair  pass
 test-amd64-i386-libvirt-pair pass
 test-armhf-armhf-libvirt-qcow2   pass
 test-armhf-armhf-libvirt-raw pass
 test-amd64-amd64-libvirt-vhd pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

+ branch=libvirt
+ revision=bb738f9fcdc3967903a6ff78111dfa989f61d04d
+ . ./cri-lock-repos
++ . ./cri-common
+++ . ./cri-getconfig
+++ umask 002
+++ getrepos
++++ getconfig Repos
++++ perl -e '
use Osstest;
readglobalconfig();
print $c{"Repos"} or die $!;
'
+++ local repos=/home/osstest/repos
+++ '[' -z /home/osstest/repos ']'
+++ '[' '!' -d /home/osstest/repos ']'
+++ echo /home/osstest/repos
++ repos=/home/osstest/repos
++ repos_lock=/home/osstest/repos/lock
++ '[' x '!=' x/home/osstest/repos/lock ']'
++ OSSTEST_REPOS_LOCK_LOCKED=/home/osstest/repos/lock
++ exec with-lock-ex -w 

[Xen-devel] [xen-4.7-testing test] 102725: regressions - FAIL

2016-11-30 Thread osstest service owner
flight 102725 xen-4.7-testing real [real]
http://logs.test-lab.xenproject.org/osstest/logs/102725/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-i386-xl-raw   10 guest-start  fail REGR. vs. 102536
 test-armhf-armhf-xl-vhd   9 debian-di-install fail REGR. vs. 102536

Regressions which are regarded as allowable (not blocking):
 test-amd64-i386-rumprun-i386 16 rumprun-demo-xenstorels/xenstorels.repeat fail REGR. vs. 102536
 test-armhf-armhf-libvirt-xsm 13 saverestore-support-check fail  like 102536
 test-armhf-armhf-libvirt 13 saverestore-support-check fail  like 102536
 test-armhf-armhf-libvirt-qcow2 12 saverestore-support-check   fail like 102536
 test-armhf-armhf-libvirt-raw 12 saverestore-support-check fail  like 102536
 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop fail like 102536
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop fail like 102536

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-pvh-intel 11 guest-start  fail  never pass
 test-amd64-amd64-libvirt-xsm 12 migrate-support-check fail   never pass
 test-amd64-amd64-xl-pvh-amd  11 guest-start  fail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass
 test-amd64-amd64-libvirt 12 migrate-support-check fail   never pass
 test-armhf-armhf-xl-arndale  12 migrate-support-check fail   never pass
 test-armhf-armhf-xl-arndale  13 saverestore-support-check fail   never pass
 test-amd64-i386-libvirt-xsm  12 migrate-support-check fail   never pass
 test-amd64-amd64-qemuu-nested-amd 16 debian-hvm-install/l1/l2  fail never pass
 test-amd64-amd64-libvirt-vhd 11 migrate-support-check fail   never pass
 test-armhf-armhf-xl-multivcpu 12 migrate-support-check fail  never pass
 test-armhf-armhf-xl-multivcpu 13 saverestore-support-check fail  never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass
 test-amd64-i386-libvirt  12 migrate-support-check fail   never pass
 test-armhf-armhf-xl-cubietruck 12 migrate-support-check fail never pass
 test-armhf-armhf-xl-cubietruck 13 saverestore-support-check fail never pass
 test-armhf-armhf-libvirt-xsm 12 migrate-support-check fail   never pass
 test-armhf-armhf-libvirt 12 migrate-support-check fail   never pass
 test-armhf-armhf-xl-credit2  12 migrate-support-check fail   never pass
 test-armhf-armhf-xl-credit2  13 saverestore-support-check fail   never pass
 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-stop fail never pass
 test-armhf-armhf-libvirt-qcow2 11 migrate-support-check fail never pass
 test-armhf-armhf-libvirt-raw 11 migrate-support-check fail   never pass
 test-armhf-armhf-xl  12 migrate-support-check fail   never pass
 test-armhf-armhf-xl  13 saverestore-support-check fail   never pass
 test-armhf-armhf-xl-xsm  12 migrate-support-check fail   never pass
 test-armhf-armhf-xl-xsm  13 saverestore-support-check fail   never pass
 test-armhf-armhf-xl-rtds 12 migrate-support-check fail   never pass
 test-armhf-armhf-xl-rtds 13 saverestore-support-check fail   never pass

version targeted for testing:
 xen  e144f21309b1f637919433caf734600da34962ec
baseline version:
 xen  206fc7084dfaf05c55fd9de650f93a7ef9fe0722

Last test of basis   102536  2016-11-22 20:45:38 Z    8 days
Failing since        102711  2016-11-29 15:46:49 Z    1 days    2 attempts
Testing same since   102725  2016-11-30 03:06:03 Z    0 days    1 attempts


People who touched revisions under test:
  Ian Jackson 
  Stefano Stabellini 
  Wei Chen 

jobs:
 build-amd64-xsm  pass
 build-armhf-xsm  pass
 build-i386-xsm   pass
 build-amd64-xtf  pass
 build-amd64  pass
 build-armhf  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-armhf-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-prev pass
 build-i386-prev  pass
 build-amd64-pvops  pass
 build-armhf-pvops  

Re: [Xen-devel] Xen ARM - Exposing a PL011 to the guest

2016-11-30 Thread Stefano Stabellini
On Wed, 30 Nov 2016, Julien Grall wrote:
> Hi all,
> 
> A few months ago, Linaro published version 2 of the VM specification
> [1].
> 
> For those who don't know, the specification provides guidelines to guarantee
> that compliant OS images can run on various hypervisors (e.g. Xen, KVM).
> 
> Looking at the specification, it will require Xen to expose new devices to the
> guest: pl011, rtc, persistent flash (for UEFI variables).
> 
> The RTC and persistent flash will only be used by the UEFI firmware. The
> firmware is custom made for Xen guests and is loaded by the toolstack, so we
> could theoretically provide PV drivers for those.
> 
> This is not the case for the PL011. The guest will be shipped with a
> PL011/SBSA UART driver. This means it will expect to access it through MMIO.
> 
> So we have to emulate a PL011. The question is where? Before suggesting some
> ideas, note that the guest/user will expect to be able to interact with the
> console through the UART. This means that the UART and xenconsoled need to
> communicate with each other.
> 
> I think we can distinguish two places where the PL011 could be emulated:
> in the hypervisor, or outside the hypervisor.
> 
> Emulating the UART in the hypervisor means that we risk increasing the
> attack surface of Xen if there is a bug in the emulation code. The
> attack surface could be reduced by emulating the UART in another exception
> level (e.g. EL1, EL0) but still under the control of the hypervisor. Usually
> the guest communicates with xenconsoled using a ring. For the
> first console this can be discovered using the HVMOP_get_param hypercall. For
> the second and onwards, it is described in xenstore. I would not worry too much
> about emulating multiple PL011s, so we could implement the PV frontend in Xen.
> 
> Emulating the UART outside the hypervisor (e.g. in DOM0 or a special domain)
> would require bringing the concept of an ioreq server to ARM. That leaves the
> question of where we emulate the PL011. The best place would be xenconsoled,
> but I am not sure what the security impact would be here. Are all guest
> consoles emulated within the same daemon?

One xenconsoled instance handles all PV console frontends. However QEMU,
not xenconsoled, handles secondary PV consoles and emulated serials. One
QEMU instance per VM. The PV console protocol is pretty trivial and not
easy to exploit.

Instead of emulating PL011 in xenconsoled we could do it in QEMU (or
something like it). But I think we should do something else instead, see
below.


> I would lean towards the first solution if we implement all the security
> safeguards I mentioned, although the second solution would be a good move if
> we decide to implement more devices (e.g. RTC, pflash) in the future.
> 
> Do you have any opinions?

As I have just written in this other email:

http://marc.info/?l=xen-devel&m=148054285829397

I don't think we should introduce userspace emulators in Dom0 on ARM.
The reason is that they are bad both for security and performance. In
general I think that the best solution is to introduce emulators in EL1
in Xen.

That said, PL011 is pretty small and the PL011 emulator needs to feed
data back to Dom0 for user interaction, so, if we need to, we could make
an exception for it and run it in userspace Dom0.

But do we really need to make an exception? I don't think so. The PL011
emulator could run in EL1 in Xen and connect to xenconsoled using the
regular PV console frontend/backend protocol, as you suggested. One
caveat: I would avoid introducing xenstore support in Xen, but
fortunately we don't need xenstore support for the first PV console
connection, so we should be OK. In any case if the xenconsole protocol
or xenconsoled are the problem, we could easily rewrite them or add
specific support for this new use case.
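
To make the idea above concrete, here is a minimal sketch of a PL011
front half that turns trapped MMIO accesses into bytes on an output
ring. The register offsets and flag bits are the standard PL011 ones;
everything else (the console_ring layout, RING_SIZE, the function names)
is a hypothetical stand-in and not the real Xen console ring ABI. Memory
barriers and the receive path are deliberately left out.

```c
#include <stdint.h>

/* Standard PL011 register offsets and flag bits. */
#define UARTDR       0x000u      /* data register */
#define UARTFR       0x018u      /* flag register */
#define UARTFR_RXFE  (1u << 4)   /* receive FIFO empty */
#define UARTFR_TXFE  (1u << 7)   /* transmit FIFO empty */

/* Hypothetical output ring shared with the console backend. */
#define RING_SIZE 64u            /* must be a power of two */
struct console_ring {
    uint8_t  out[RING_SIZE];
    uint32_t out_prod;           /* advanced by the emulator */
    uint32_t out_cons;           /* advanced by the backend */
};

/*
 * Handle a trapped guest write. Returns 1 if a byte was queued and the
 * backend should be notified, 0 otherwise.
 */
int pl011_mmio_write(struct console_ring *ring, uint32_t offset,
                     uint32_t value)
{
    if (offset != UARTDR)
        return 0;                /* other registers ignored in this sketch */
    if (ring->out_prod - ring->out_cons == RING_SIZE)
        return 0;                /* ring full: drop the byte */
    ring->out[ring->out_prod % RING_SIZE] = (uint8_t)value;
    ring->out_prod++;
    return 1;
}

/* Handle a trapped guest read. */
uint32_t pl011_mmio_read(const struct console_ring *ring, uint32_t offset)
{
    (void)ring;
    if (offset == UARTFR)        /* always ready to send, nothing to read */
        return UARTFR_TXFE | UARTFR_RXFE;
    return 0;
}
```

A real emulator would also have to model the interrupt registers and
feed input from the backend into UARTDR reads; the point here is only
that the guest-visible side is MMIO while the backend side stays a ring.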

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [RFD] OP-TEE (and probably other TEEs) support

2016-11-30 Thread Stefano Stabellini
On Mon, 28 Nov 2016, Julien Grall wrote:
> > > If not, then it might be worth considering a 3rd solution where the TEE
> > > SMC calls are forwarded to a specific domain handling the SMC on behalf
> > > of the guests. This would allow upgrading the TEE layer without having
> > > to upgrade the hypervisor.
> > Yes, this is a good idea. How could this look? I imagine the following flow:
> > the hypervisor traps the SMC and uses an event channel to pass the request to
> > Dom0. Some userspace daemon receives it, maps pages with request data, alters
> > it (e.g. by replacing IPAs with PAs), sends a request to the hypervisor to
> > issue the real SMC, then alters the response and only then returns data back
> > to the guest.
> 
> The event channel is only a way to notify (similar to an interrupt), you would
> need a shared memory page between the hypervisor and the client to communicate
> the SMC information.
> 
> I was thinking of taking advantage of the VM event API for trapping the SMC. But
> I am not sure if it is the best solution here. Stefano, do you have any
> opinions here?
> 
> > I can see only one benefit there: this code will not be in the
> > hypervisor. And there are a number of drawbacks:
> > 
> > Stability: if this userspace daemon crashes or gets killed by, say,
> > the OOM killer, we will lose information about all opened sessions, mapped
> > shared buffers, etc. That would be a complete disaster.
> 
> I disagree with your statement: you would gain in isolation. If your userspace
> crashes (because of an emulation bug), you will only lose access to the TEE for a
> bit. If the hypervisor crashes (because of an emulation bug), then you take
> down the platform. I agree that you lose information when the userspace app
> crashes, but your platform is still up. Isn't that the most important thing?
> 
> Note that I think it would be "fairly easy" to implement code to reset
> everything or having a backup on the side.
> 
> > Performance: how much latency will be introduced by switching between
> > hypervisor, dom0 SVC and USR modes? I have seen a use case where the TEE
> > is part of a video playback pipeline (it decodes DRM media).
> > There can also be questions about security, but Dom0 can in any case
> > access any memory from any guest.
> 
> But those concerns would be the same in the hypervisor, right? If your
> emulation is buggy then a guest would get access to all the memory.
> 
> > But I really like the idea, because I don't want to mess with the
> > hypervisor when I don't need to. So, how do you think it will
> > affect performance?
> 
> I can't tell here. I would recommend that you try a quick prototype (e.g.
> receiving and sending SMC) and see what the overhead would be.
> 
> When I wrote my previous e-mail, I mentioned "specific domain", because I
> don't think it is strictly necessary to forward the SMC to DOM0. If you are
> concerned about overloading DOM0, you could have a separate service domain that
> would handle TEE for you. You could have your "custom OS" handling TEE requests
> directly in kernel space (i.e. SVC).
> 
> This would be up to the developer of this TEE-layer to decide what to do.
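
The point made above, that an event channel only notifies and the SMC
arguments have to travel through shared memory, can be sketched as a
single-slot mailbox. This is only an illustration under stated
assumptions: the smc_shared_page layout, smc_post(), smc_handle() and
demo_handler() are hypothetical names, not an existing Xen or OP-TEE
interface, and the actual notification and memory barriers are omitted.

```c
#include <stdint.h>

/* Hypothetical page shared by the hypervisor and the TEE-handling domain. */
struct smc_shared_page {
    uint32_t pending;    /* 1 while a request is waiting to be handled */
    uint32_t domid;      /* guest that issued the SMC */
    uint64_t args[8];    /* x0..x7 as trapped */
    uint64_t ret[4];     /* x0..x3 to hand back to the guest */
};

/*
 * Hypervisor side: copy the trapped registers into the page. The caller
 * would then raise the event channel. Returns 0 on success, -1 if the
 * previous request has not been handled yet.
 */
int smc_post(struct smc_shared_page *pg, uint32_t domid,
             const uint64_t args[8])
{
    if (pg->pending)
        return -1;
    pg->domid = domid;
    for (int i = 0; i < 8; i++)
        pg->args[i] = args[i];
    pg->pending = 1;
    return 0;
}

/*
 * Handler side: called when the notification arrives. Returns 0 if a
 * request was processed, -1 if nothing was pending.
 */
int smc_handle(struct smc_shared_page *pg,
               uint64_t (*handler)(uint32_t domid, const uint64_t *args))
{
    if (!pg->pending)
        return -1;
    pg->ret[0] = handler(pg->domid, pg->args);
    pg->pending = 0;
    return 0;
}

/* Placeholder handler for illustration: echoes the function ID in x0. */
uint64_t demo_handler(uint32_t domid, const uint64_t *args)
{
    (void)domid;
    return args[0];
}
```

A prototype along these lines would be enough to measure the round-trip
overhead discussed above before committing to a design.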

Thanks Julien for bringing me into the discussion. These are my
thoughts on the matter.


Running emulators in Dom0 (AKA QEMU on x86) has always meant giving them
full Dom0 privileges so far. I don't think that is acceptable. There is
work under way on the x86 side of things to fix the situation, see:

http://marc.info/?i=1479489244-2201-1-git-send-email-paul.durrant%40citrix.com

But if the past is any indication of future development speed, we are
still a couple of Xen releases away at least from having unprivileged
emulators in Dom0 on x86. By unprivileged, I mean that they are not able
to map any random page in memory, but just the ones belonging to the
virtual machine that they are serving. Until then, having an emulator in
userspace Dom0 is just as bad as having it in the hypervisor from a
security standpoint.

I would only consider this option, if we mandate from the start, in the
design doc and implementations, that the emulators need to be
unprivileged on ARM. This would likely require a new set of hypercalls
and possibly Linux privcmds. And even then, this solution would still
present a series of problems:

- latency
- scalability
- validation against the root of trust
- certifications (because they are part of Dom0 and nobody can certify
  that)


The other option that traditionally is proposed is using stubdoms.
Specialized little VMs to run emulators, each VM runs one emulator
instance. They are far better from a security standpoint, and could be
certifiable. They might still pose problems from a root of trust point
of view. However the real issue with stubdoms, is just that being
treated as VMs they show up in "xl list", they introduce latency, they
consume a lot of memory, etc. Also dealing with Mini-OS can be unfunny.
I think that this option is only a little better than the previous
option, but it is still not great.


This brings us to the third 

Re: [Xen-devel] How to Build Xen for UEFI native boot

2016-11-30 Thread Konrad Rzeszutek Wilk
On Wed, Nov 30, 2016 at 06:31:15PM +, Bill Jacobs (billjac) wrote:
> Hi All
> 
> Relative newb to Xen.
> 
> Would like to build Xen for UEFI Native boot behind GRUB2 (EFI), and see many 
> resources 'out there', many of which are dated. Where's the best current 
> place to start down this path?
> 

Here.

Basically you need to have binutils with pecoff enabled.

And then follow:
https://wiki.xenproject.org/wiki/Compiling_Xen_From_Source

> Thanks
> -Bill
> 





[Xen-devel] [xen-4.6-testing test] 102723: regressions - FAIL

2016-11-30 Thread osstest service owner
flight 102723 xen-4.6-testing real [real]
http://logs.test-lab.xenproject.org/osstest/logs/102723/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-i386-freebsd10-amd64 10 guest-start   fail REGR. vs. 102712

Regressions which are regarded as allowable (not blocking):
 test-armhf-armhf-libvirt 13 saverestore-support-check fail  like 102712
 test-armhf-armhf-libvirt-xsm 13 saverestore-support-check fail  like 102712
 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop fail like 102712
 test-armhf-armhf-libvirt-qcow2 12 saverestore-support-check   fail like 102712
 test-amd64-i386-xl-qemut-win7-amd64 16 guest-stop fail like 102712
 test-armhf-armhf-libvirt-raw 12 saverestore-support-check fail  like 102712
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop fail like 102712
 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-stop fail like 102712
 test-armhf-armhf-xl-rtds 15 guest-start/debian.repeat fail  like 102712

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt-xsm 12 migrate-support-check fail   never pass
 test-amd64-amd64-xl-pvh-intel 11 guest-start  fail  never pass
 test-amd64-i386-libvirt  12 migrate-support-check fail   never pass
 test-amd64-amd64-libvirt 12 migrate-support-check fail   never pass
 test-amd64-i386-libvirt-xsm  12 migrate-support-check fail   never pass
 test-amd64-amd64-libvirt-vhd 11 migrate-support-check fail   never pass
 test-armhf-armhf-xl-arndale  12 migrate-support-check fail   never pass
 test-armhf-armhf-xl-arndale  13 saverestore-support-check fail   never pass
 test-amd64-amd64-xl-pvh-amd  11 guest-start  fail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass
 test-armhf-armhf-libvirt 12 migrate-support-check fail   never pass
 test-armhf-armhf-xl-multivcpu 12 migrate-support-check fail  never pass
 test-armhf-armhf-xl-multivcpu 13 saverestore-support-check fail  never pass
 test-armhf-armhf-libvirt-xsm 12 migrate-support-check fail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass
 test-armhf-armhf-xl-xsm  12 migrate-support-check fail   never pass
 test-armhf-armhf-xl-xsm  13 saverestore-support-check fail   never pass
 test-amd64-amd64-qemuu-nested-amd 16 debian-hvm-install/l1/l2  fail never pass
 test-armhf-armhf-libvirt-qcow2 11 migrate-support-check fail never pass
 test-armhf-armhf-xl-credit2  12 migrate-support-check fail   never pass
 test-armhf-armhf-xl-credit2  13 saverestore-support-check fail   never pass
 test-armhf-armhf-libvirt-raw 11 migrate-support-check fail   never pass
 test-armhf-armhf-xl-cubietruck 12 migrate-support-check fail never pass
 test-armhf-armhf-xl-cubietruck 13 saverestore-support-check fail never pass
 test-armhf-armhf-xl-rtds 12 migrate-support-check fail   never pass
 test-armhf-armhf-xl-rtds 13 saverestore-support-check fail   never pass
 test-armhf-armhf-xl  12 migrate-support-check fail   never pass
 test-armhf-armhf-xl  13 saverestore-support-check fail   never pass
 test-armhf-armhf-xl-vhd  11 migrate-support-check fail   never pass
 test-armhf-armhf-xl-vhd  12 saverestore-support-check fail   never pass

version targeted for testing:
 xen  22f70a3edd2f7f798391a381433da350e2272871
baseline version:
 xen  0ba95621b8988ad5ceb76b43e76be404fd798f7b

Last test of basis   102712  2016-11-29 15:46:48 Z    1 days
Testing same since   102723  2016-11-30 01:15:21 Z    0 days    1 attempts


People who touched revisions under test:
  Ian Jackson 

jobs:
 build-amd64-xsm  pass
 build-armhf-xsm  pass
 build-i386-xsm   pass
 build-amd64-xtf  pass
 build-amd64  pass
 build-armhf  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-armhf-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-prev pass
 build-i386-prev  pass
 build-amd64-pvops  pass
 build-armhf-pvops  pass

[Xen-devel] How to Build Xen for UEFI native boot

2016-11-30 Thread Bill Jacobs (billjac)
Hi All

Relative newb to Xen.

Would like to build Xen for UEFI Native boot behind GRUB2 (EFI), and see many 
resources 'out there', many of which are dated. Where's the best current place 
to start down this path?

Thanks
-Bill



[Xen-devel] [PATCH 02/36] qdict: Add convenience helpers for wrapped puts

2016-11-30 Thread Eric Blake
Quite a few users of qdict_put() were manually wrapping a
non-QObject. We can make such call-sites shorter, by providing
common macros to do the tedious work.  Also shorten nearby
qdict_put_obj(,,QOBJECT()) sequences.

Signed-off-by: Eric Blake 

---
I'm okay if you want me to break this patch into smaller pieces.
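
For readers who have not seen this kind of wrapper before, the following
self-contained toy shows the same pattern: one generic put function plus
thin macros that box a plain C value at the call site. It is only a
sketch and does not use QEMU's types; ToyObj, ToyDict, toy_put() and the
toy_put_*() macros are hypothetical names invented for illustration, and
values are stored by value rather than as reference-counted objects.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Toy tagged value standing in for a boxed object (not QEMU's QObject). */
typedef enum { T_INT, T_BOOL, T_STR } ToyType;
typedef struct {
    ToyType type;
    union { int64_t i; bool b; const char *s; } u;
} ToyObj;

typedef struct { const char *key; ToyObj val; } ToyEntry;
typedef struct { ToyEntry entries[16]; int len; } ToyDict;

/* Boxing helpers: wrap a plain C value in a ToyObj. */
ToyObj toy_from_int(int64_t v)     { ToyObj o = { .type = T_INT };  o.u.i = v; return o; }
ToyObj toy_from_bool(bool v)       { ToyObj o = { .type = T_BOOL }; o.u.b = v; return o; }
ToyObj toy_from_str(const char *v) { ToyObj o = { .type = T_STR };  o.u.s = v; return o; }

/* Generic put: overwrite an existing key or append; -1 when the dict is full. */
int toy_put(ToyDict *d, const char *key, ToyObj val)
{
    for (int i = 0; i < d->len; i++) {
        if (strcmp(d->entries[i].key, key) == 0) {
            d->entries[i].val = val;
            return 0;
        }
    }
    if (d->len == 16)
        return -1;
    d->entries[d->len].key = key;
    d->entries[d->len].val = val;
    d->len++;
    return 0;
}

/* Convenience wrappers: the call site no longer spells out the boxing. */
#define toy_put_int(d, key, value)  toy_put(d, key, toy_from_int(value))
#define toy_put_bool(d, key, value) toy_put(d, key, toy_from_bool(value))
#define toy_put_str(d, key, value)  toy_put(d, key, toy_from_str(value))
```

With these, toy_put(&d, "driver", toy_from_str("raw")) shrinks to
toy_put_str(&d, "driver", "raw"). Because the boxing helper takes a
bool parameter, an expression such as flags & SOME_FLAG is converted to
true or false at the call, just as when the helper is called directly.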
---
 include/qapi/qmp/qdict.h|   8 +++
 block.c |  59 +++-
 block/archipelago.c |   4 +-
 block/blkdebug.c|   6 +-
 block/blkverify.c   |  11 ++-
 block/curl.c|   2 +-
 block/iscsi.c   |   2 +-
 block/nbd.c |  41 ++-
 block/nfs.c |  43 +---
 block/null.c|   2 +-
 block/qcow2.c   |   4 +-
 block/quorum.c  |  13 ++--
 block/raw-posix.c   |   8 +--
 block/raw-win32.c   |   4 +-
 block/ssh.c |  16 ++---
 block/vvfat.c   |  10 +--
 blockdev.c  |  28 
 hw/block/xen_disk.c |   2 +-
 hw/usb/xen-usb.c|  12 ++--
 monitor.c   |  18 ++---
 qapi/qmp-event.c|   2 +-
 qemu-img.c  |   6 +-
 qemu-io.c   |   2 +-
 qemu-nbd.c  |   2 +-
 qobject/qdict.c |   2 +-
 target-s390x/cpu_models.c   |   4 +-
 tests/check-qdict.c | 132 ++--
 tests/test-qmp-commands.c   |  30 
 tests/test-qmp-event.c  |  30 
 tests/test-qobject-output-visitor.c |   6 +-
 util/qemu-option.c  |   6 +-
 31 files changed, 245 insertions(+), 270 deletions(-)

diff --git a/include/qapi/qmp/qdict.h b/include/qapi/qmp/qdict.h
index fe9a4c5..9d9f9a3 100644
--- a/include/qapi/qmp/qdict.h
+++ b/include/qapi/qmp/qdict.h
@@ -52,6 +52,14 @@ void qdict_destroy_obj(QObject *obj);
 #define qdict_put(qdict, key, obj) \
 qdict_put_obj(qdict, key, QOBJECT(obj))

+/* Helpers for int, bool, and const char*. */
+#define qdict_put_int(qdict, key, value) \
+qdict_put(qdict, key, qint_from_int(value))
+#define qdict_put_bool(qdict, key, value) \
+qdict_put(qdict, key, qbool_from_bool(value))
+#define qdict_put_str(qdict, key, value) \
+qdict_put(qdict, key, qstring_from_str(value))
+
 /* High level helpers */
 double qdict_get_double(const QDict *qdict, const char *key);
 int64_t qdict_get_int(const QDict *qdict, const char *key);
diff --git a/block.c b/block.c
index 39ddea3..e816657 100644
--- a/block.c
+++ b/block.c
@@ -876,16 +876,16 @@ static void update_flags_from_options(int *flags, 
QemuOpts *opts)
 static void update_options_from_flags(QDict *options, int flags)
 {
 if (!qdict_haskey(options, BDRV_OPT_CACHE_DIRECT)) {
-qdict_put(options, BDRV_OPT_CACHE_DIRECT,
-  qbool_from_bool(flags & BDRV_O_NOCACHE));
+qdict_put_bool(options, BDRV_OPT_CACHE_DIRECT,
+   flags & BDRV_O_NOCACHE);
 }
 if (!qdict_haskey(options, BDRV_OPT_CACHE_NO_FLUSH)) {
-qdict_put(options, BDRV_OPT_CACHE_NO_FLUSH,
-  qbool_from_bool(flags & BDRV_O_NO_FLUSH));
+qdict_put_bool(options, BDRV_OPT_CACHE_NO_FLUSH,
+   flags & BDRV_O_NO_FLUSH);
 }
 if (!qdict_haskey(options, BDRV_OPT_READ_ONLY)) {
-qdict_put(options, BDRV_OPT_READ_ONLY,
-  qbool_from_bool(!(flags & BDRV_O_RDWR)));
+qdict_put_bool(options, BDRV_OPT_READ_ONLY,
+   !(flags & BDRV_O_RDWR));
 }
 }

@@ -1244,7 +1244,7 @@ static int bdrv_fill_options(QDict **options, const char 
*filename,
 /* Fetch the file name from the options QDict if necessary */
 if (protocol && filename) {
 if (!qdict_haskey(*options, "filename")) {
-qdict_put(*options, "filename", qstring_from_str(filename));
+qdict_put_str(*options, "filename", filename);
 parse_filename = true;
 } else {
 error_setg(errp, "Can't specify 'file' and 'filename' options at "
@@ -1264,7 +1264,7 @@ static int bdrv_fill_options(QDict **options, const char 
*filename,
 }

 drvname = drv->format_name;
-qdict_put(*options, "driver", qstring_from_str(drvname));
+qdict_put_str(*options, "driver", drvname);
 } else {
 error_setg(errp, "Must specify either driver or file");
 return -EINVAL;
@@ -1517,7 +1517,7 @@ int bdrv_open_backing_file(BlockDriverState *bs, QDict 
*parent_options,
 }

 if (bs->backing_format[0] != '\0' && !qdict_haskey(options, "driver")) {
-qdict_put(options, "driver", qstring_from_str(bs->backing_format));
+

[Xen-devel] [distros-debian-squeeze test] 68128: tolerable FAIL

2016-11-30 Thread Platform Team regression test user
flight 68128 distros-debian-squeeze real [real]
http://osstest.xs.citrite.net/~osstest/testlogs/logs/68128/

Failures :-/ but no regressions.

Regressions which are regarded as allowable (not blocking):
 test-amd64-amd64-amd64-squeeze-netboot-pygrub 9 debian-di-install fail like 68084
 test-amd64-amd64-i386-squeeze-netboot-pygrub 9 debian-di-install fail like 68084
 test-amd64-i386-amd64-squeeze-netboot-pygrub 9 debian-di-install fail like 68084
 test-amd64-i386-i386-squeeze-netboot-pygrub 9 debian-di-install fail like 68084

baseline version:
 flight   68084

jobs:
 build-amd64  pass
 build-armhf  pass
 build-i386   pass
 build-amd64-pvops  pass
 build-armhf-pvops  pass
 build-i386-pvops pass
 test-amd64-amd64-amd64-squeeze-netboot-pygrub  fail
 test-amd64-i386-amd64-squeeze-netboot-pygrub fail
 test-amd64-amd64-i386-squeeze-netboot-pygrub fail
 test-amd64-i386-i386-squeeze-netboot-pygrub  fail



sg-report-flight on osstest.xs.citrite.net
logs: /home/osstest/logs
images: /home/osstest/images

Logs, config files, etc. are available at
http://osstest.xs.citrite.net/~osstest/testlogs/logs

Test harness code can be found at
http://xenbits.xensource.com/gitweb?p=osstest.git;a=summary


Push not applicable.




Re: [Xen-devel] DomU application crashes while mmap'ing device memory on x86_64

2016-11-30 Thread Oleksandr Andrushchenko

Thank you for the explanation, now it is clear.

BTW, is PAGE_SHARED the right choice in my case,
or should I use something else instead?

Thank you,
Oleksandr

On 11/30/2016 09:10 PM, Andrew Cooper wrote:
> On 30/11/16 19:00, Oleksandr Andrushchenko wrote:
>> I traced the problem down to vma->vm_page_prot which
>> in my case is set as:
>>
>> vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
>>
>> This sets additional flags _PAGE_BIT_PSE/_PAGE_BIT_PAT +_PAGE_BIT_PCD
>> so after that remap_pfn_range makes Xen complain.
>>
>> (pgprot_noncached(vma->vm_page_prot) == 0x80b7)
>>
>> If I change prot to
>>
>> vma->vm_page_prot = PAGE_SHARED;
>>
>> (PAGE_SHARED == 0x8027) then I am able to mmap.
>>
>> Can anyone please help me understand if this is a valid use-case for DomU
>> (pgprot_noncached) and if so why Xen cannot make it?
>>
>> Thank you,
>> Oleksandr
>
> Superpages are not supported in a PV guest.
>
> You can enable the use of 2mb superpages for PV guests by booting Xen
> with the "allowsuperpage" command line option, but quite a few features
> are broken in combination with PV superpages, and this area has been a
> ripe source of security bugs.
>
> Unprivileged guests (i.e. ones without hardware) are not permitted to
> make mappings with anything other than a writeback memory type, because
> all kinds of chaos can ensue if the guest constructs aliasing mappings
> with different cacheabilities, and it is prohibitively expensive for Xen
> to track for auditing purposes.
>
> ~Andrew





Re: [Xen-devel] [COVERITY ACCESS] for Embedded/Automotive team

2016-11-30 Thread Andrew Cooper
On 29/11/16 15:09, Artem Mygaiev wrote:
> Hi Julien
>
> On 29.11.16 16:27, Julien Grall wrote:
>> Hi Artem,
>>
>> On 29/11/16 14:21, Artem Mygaiev wrote:
>>> Lars, the project is approved by Coverity. Scan has found some issues in
>>> xen/arch/arm on master, part of them are false positives.
>> Perfect. It would be interesting to know the list of issues so we can
>> categorize them (i.e are they security issue) and address them.
> Let me clean up the build scripts a bit and I will send you invite to
> Coverity Scan project

Can I get an invite as well please?

~Andrew



Re: [Xen-devel] [PATCH] xen/arm: Fix misplaced parentheses for PSCI version check

2016-11-30 Thread Stefano Stabellini
On Wed, 30 Nov 2016, Julien Grall wrote:
> Hi Artem,
> 
> On 30/11/16 13:53, Artem Mygaiev wrote:
> > Fix misplaced parentheses for PSCI version check
> > 
> > Signed-off-by: Artem Mygaiev 
> 
> Can you please include the coverity ID:
> 
> Coverity-ID: 1381830
> 
> With that:
> 
> Reviewed-by: Julien Grall 
> 
> This has been introduced by me in commit 2831f20 "xen/arm: Add support of PSCI
> v1.0 for the host" in Xen 4.7. I am not sure whether we should backport it.
 
I think we should backport it.

Wei, can I have a release-ack or do you want to wait until after the
release?
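
For readers following the thread, here is a minimal sketch of the class of bug being discussed. The patch body is not quoted in this message, so the macro definitions and the helper names below are assumptions made for illustration (they follow the PSCI convention of a major version in the upper 16 bits) and are not copied from the Xen tree.

```c
#include <stdint.h>

/* Assumed encoding: major version in bits 31:16, minor in bits 15:0. */
#define PSCI_VERSION(major, minor) (((uint32_t)(major) << 16) | (minor))
#define PSCI_VERSION_MAJOR(ver)    (((ver) >> 16) & 0xffffu)

/*
 * Misplaced parenthesis: the comparison is evaluated first and yields
 * 0 or 1; the macro then extracts the "major" field of that small value,
 * which is always 0. The check can therefore never fire.
 */
static int major_is_not_1_misplaced(uint32_t ver)
{
    return PSCI_VERSION_MAJOR(ver != 1);
}

/* Intended form: extract the major field first, then compare it. */
static int major_is_not_1_fixed(uint32_t ver)
{
    return PSCI_VERSION_MAJOR(ver) != 1;
}
```

With these hypothetical helpers, a firmware reporting version 2.0 is accepted by the misplaced form and rejected by the fixed one, which is the kind of difference a static analyser can flag.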



Re: [Xen-devel] DomU application crashes while mmap'ing device memory on x86_64

2016-11-30 Thread Andrew Cooper
On 30/11/16 19:00, Oleksandr Andrushchenko wrote:
> I traced the problem down to vma->vm_page_prot which
>
> in my case is set as:
>
> vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
>
> This sets additional flags _PAGE_BIT_PSE/_PAGE_BIT_PAT +_PAGE_BIT_PCD
>
> so after that remap_pfn_range makes Xen complain.
>
> (pgprot_noncached(vma->vm_page_prot) == 0x80b7)
>
> If I change prot to
>
> vma->vm_page_prot = PAGE_SHARED;
>
> (PAGE_SHARED == 0x8027) then I am able to mmap.
>
> Can anyone please help me understand if this is a valid use-case for DomU
> (pgprot_noncached) and if so why Xen cannot make it?
>
> Thank you,
> Oleksandr

Superpages are not supported in a PV guest.

You can enable the use of 2mb superpages for PV guests by booting Xen
with the "allowsuperpage" command line option, but quite a few features
are broken in combination with PV superpages, and this area has been a
ripe source of security bugs.

Unprivileged guests (i.e. ones without hardware) are not permitted to
make mappings with anything other than a writeback memory type, because
all kinds of chaos can ensue if the guest constructs aliasing mappings
with different cacheabilities, and it is prohibitively expensive for Xen
to track for auditing purposes.

~Andrew
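
To make the two protection values in the quoted report easier to compare, the sketch below decodes the low flag bits of an x86 page-table entry. The bit positions are architectural; the macro and function names are hypothetical, and the final check assumes that PAT index 0 (PWT, PCD and PAT all clear) selects write-back, which is the usual configuration but is not stated in the thread.

```c
#include <stdint.h>

/* Low x86 page-table entry flag bits (architectural bit positions). */
#define PTE_PRESENT   (1u << 0)
#define PTE_RW        (1u << 1)
#define PTE_USER      (1u << 2)
#define PTE_PWT       (1u << 3)
#define PTE_PCD       (1u << 4)
#define PTE_ACCESSED  (1u << 5)
#define PTE_DIRTY     (1u << 6)
#define PTE_PAT_PSE   (1u << 7)   /* PAT in a 4K PTE, PSE in a PDE */

/* The three bits that together form the PAT index of a 4K mapping. */
#define PTE_CACHE_MASK (PTE_PWT | PTE_PCD | PTE_PAT_PSE)

/* Return only the cache-attribute bits of a protection value. */
static uint32_t pte_cache_bits(uint64_t prot)
{
    return (uint32_t)prot & PTE_CACHE_MASK;
}

/* Assumption: PAT index 0 is write-back, so no cache bit may be set. */
static int uses_default_memory_type(uint64_t prot)
{
    return pte_cache_bits(prot) == 0;
}
```

Applied to the low bits quoted above, 0x80b7 has PCD and PAT/PSE set (0x90), while 0x8027 has none of the cache bits set, which matches the observation that only the PAGE_SHARED mapping is accepted.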



Re: [Xen-devel] [COVERITY ACCESS] for Embedded/Automotive team

2016-11-30 Thread Stefano Stabellini
Thank you!

On Wed, 30 Nov 2016, Artem Mygaiev wrote:
> Done
> 
> 
> On 29.11.16 20:19, Stefano Stabellini wrote:
> > On Tue, 29 Nov 2016, Artem Mygaiev wrote:
> >> Hi Julien
> >>
> >> On 29.11.16 16:27, Julien Grall wrote:
> >>> Hi Artem,
> >>>
> >>> On 29/11/16 14:21, Artem Mygaiev wrote:
> >>>> Lars, the project is approved by Coverity. Scan has found some issues in
> >>>> xen/arch/arm on master, part of them are false positives.
> >>> Perfect. It would be interesting to know the list of issues so we can
> >>> categorize them (i.e are they security issue) and address them.
> >> Let me clean up the build scripts a bit and I will send you invite to
> >> Coverity Scan project
> > I would like access too if possible, thanks!
> 
> 
> 



Re: [Xen-devel] [PATCH] arm/acpi: hide watchdog timer in GTDT table for dom0

2016-11-30 Thread Stefano Stabellini
On Wed, 30 Nov 2016, Shanker Donthineni wrote:
> Hi Julien,
> 
> We are using Fu's [v5] patch series
> https://patchwork.codeaurora.org/patch/20325/ in our testing. We thought the
> system crash in Xen was related to the watchdog timer driver, so we removed the
> watchdog timer sections, including the GT blocks, from the GTDT to fix the
> crash. Let me root-cause the issue and report the results to you by the end
> of this week.

FYI the Linux notifier is
drivers/xen/arm-device.c:xen_platform_notifier. If it doesn't get called
for some reason, it would be useful to know why and fix the problem.



[Xen-devel] [ovmf baseline-only test] 68127: all pass

2016-11-30 Thread Platform Team regression test user
This run is configured for baseline tests only.

flight 68127 ovmf real [real]
http://osstest.xs.citrite.net/~osstest/testlogs/logs/68127/

Perfect :-)
All tests in this flight passed as required
version targeted for testing:
 ovmf ff9a1358b3ff98b1c3a9b4b584fca71653a1c9fe
baseline version:
 ovmf 2b2efe33eaceb3fd2b5c6859dcb5151970dc797b

Last test of basis    68121  2016-11-29 16:19:18 Z    1 days
Testing same since    68127  2016-11-30 04:52:02 Z    0 days    1 attempts


People who touched revisions under test:
  Laszlo Ersek 
  Michael Kinney 
  Yonghong Zhu 

jobs:
 build-amd64-xsm                          pass
 build-i386-xsm                           pass
 build-amd64                              pass
 build-i386                               pass
 build-amd64-libvirt                      pass
 build-i386-libvirt                       pass
 build-amd64-pvops                        pass
 build-i386-pvops                         pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64     pass
 test-amd64-i386-xl-qemuu-ovmf-amd64      pass



sg-report-flight on osstest.xs.citrite.net
logs: /home/osstest/logs
images: /home/osstest/images

Logs, config files, etc. are available at
http://osstest.xs.citrite.net/~osstest/testlogs/logs

Test harness code can be found at
http://xenbits.xensource.com/gitweb?p=osstest.git;a=summary


Push not applicable.


commit ff9a1358b3ff98b1c3a9b4b584fca71653a1c9fe
Author: Michael Kinney 
Date:   Tue Nov 29 01:15:28 2016 -0800

Vlv2TbltDevicePkg: Fix IA32 boot timeouts

https://bugzilla.tianocore.org/show_bug.cgi?id=264

The IA32 build gets timeouts booting to the UEFI Shell.
Update the IA32 DSC file to match the X64 DSC file
disabling the fTPM feature.

Cc: Jiewen Yao 
Cc: David Wei 
Cc: Mang Guo 
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Michael Kinney 
Reviewed-by: Jiewen Yao 

commit 71d86ec879ae573a264eed71dd1bff9d547e70b0
Author: Michael Kinney 
Date:   Tue Nov 29 01:13:21 2016 -0800

Vlv2TbltDevicePkg/PlatformFlashAccessLib: Fix IA32 build issues

https://bugzilla.tianocore.org/show_bug.cgi?id=263

Fix IA32 build issues in the PlatformFlashAccessLib.  Some of the
UINT64 FLASH addresses values need to be typecast to UINTN.

Cc: Jiewen Yao 
Cc: David Wei 
Cc: Mang Guo 
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Michael Kinney 
Reviewed-by: Jiewen Yao 

commit 97e862bbbdd112c0d9523292b0184e74c1a220fc
Author: Michael Kinney 
Date:   Tue Nov 29 01:11:43 2016 -0800

Vlv2TbltDevicePkg: Remove SMM binary modules from FDF

https://bugzilla.tianocore.org/show_bug.cgi?id=261
https://bugzilla.tianocore.org/show_bug.cgi?id=262

Remove the PowerManagement2 binary SMM module that generates an
ASSERT() and the DigitalThermalSensor binary SMM module that
causes an AP to be stuck in the busy state.

This is a workaround until these two SMM binary modules can be
updated.

Cc: Jiewen Yao 
Cc: David Wei 
Cc: Mang Guo 
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Michael Kinney 
Reviewed-by: Jiewen Yao 

commit 890f11d4286b29f008a0deecd2bf01b4d767c118
Author: Michael Kinney 
Date:   Tue Nov 29 01:05:05 2016 -0800

Vlv2TbltDevicePkg/PlatformInitPei: Workaround unaligned SMRAM size

https://bugzilla.tianocore.org/show_bug.cgi?id=260

The PiSmmCPuDxeSmm module requires the SMRR base address and length
to be aligned.  The memory initialization for Vlv2TbltDevicePkg
produces an SMRAM base address that is on a 16MB boundary and an
SMRAM length of 12MB.  The SMRAM length is rounded up to 16MB.

This is a workaround until the binary module that produces the
gEfiSmmPeiSmramMemoryReserveGuid HOB is updated

Cc: Jiewen Yao 
Cc: David Wei 
Cc: Mang Guo 
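
The commit message above describes rounding a 12MB SMRAM length up to 16MB. The commit's code is not included in this report, so the following is only a sketch of that arithmetic, assuming the alignment requirement means a power-of-two size with a base aligned to that size; both function names are hypothetical.

```c
#include <stdint.h>

/*
 * Round a size up to the next power of two.
 * Returns 0 for sizes that cannot be represented (0 or above 2^63).
 */
static uint64_t round_up_pow2(uint64_t size)
{
    uint64_t p = 1;

    if (size == 0 || size > (UINT64_C(1) << 63))
        return 0;
    while (p < size)
        p <<= 1;
    return p;
}

/* A range qualifies if its size is a power of two and base is size-aligned. */
static int range_is_aligned(uint64_t base, uint64_t size)
{
    return size != 0 &&
           (size & (size - 1)) == 0 &&
           (base & (size - 1)) == 0;
}
```

Under these assumptions, a base on a 16MB boundary with a 12MB length does not qualify, while the same base with the rounded-up 16MB length does.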

Re: [Xen-devel] DomU application crashes while mmap'ing device memory on x86_64

2016-11-30 Thread Oleksandr Andrushchenko

I traced the problem down to vma->vm_page_prot which

in my case is set as:

vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);

This sets additional flags _PAGE_BIT_PSE/_PAGE_BIT_PAT +_PAGE_BIT_PCD

so after that remap_pfn_range makes Xen complain.

(pgprot_noncached(vma->vm_page_prot) == 0x80b7)

If I change prot to

vma->vm_page_prot = PAGE_SHARED;

(PAGE_SHARED == 0x8027) then I am able to mmap.

Can anyone please help me understand if this is a valid use-case for DomU
(pgprot_noncached) and if so why Xen cannot make it?

Thank you,
Oleksandr

On 11/22/2016 08:27 PM, Oleksandr Andrushchenko wrote:


Hi,

just wanted to bump this as I also have the same issue on real HW now 
(x86_64)


Nov 14 10:30:18 DomU kernel: [ 1169.569936]  [] 
xen_mc_flush+0x19c/0x1b0


Thank you in advance,
Oleksandr


On Mon, Nov 14, 2016 at 6:07 PM, Oleksandr Andrushchenko wrote:


Hi, there!

Sorry for the long read ahead, but it seems I've got stuck...

I am working on a PV driver and facing an mmap issue.
This actually happens when user-space tries to mmap
the memory allocated by the driver:

cma_obj->vaddr = dma_alloc_wc(drm->dev, size, &cma_obj->paddr,
  GFP_KERNEL | __GFP_NOWARN);

and mapping:

vma->vm_flags &= ~VM_PFNMAP;
vma->vm_pgoff = 0;

ret = dma_mmap_wc(cma_obj->base.dev->dev, vma, cma_obj->vaddr,
 cma_obj->paddr, vma->vm_end - vma->vm_start);

Return of the dma_mmap_wc is 0, but I see in the DomU kernel logs:

Nov 14 10:30:18 DomU kernel: [ 1169.569909] ------------[ cut here ]------------
Nov 14 10:30:18 DomU kernel: [ 1169.569911] WARNING: CPU: 1 PID:
5146 at /home/kernel/COD/linux/arch/x86/xen/multicalls.c:129
xen_mc_flush+0x19c/0x1b0
Nov 14 10:30:18 DomU kernel: [ 1169.569912] Modules linked in:
xen_drmfront(OE) drm_kms_helper(OE) drm(OE) fb_sys_fops
syscopyarea sysfillrect sysimgblt crct10dif_pclmul crc32_pclmul
ghash_clmulni_intel aesni_intel aes_x86_64 lrw glue_helper
ablk_helper cryptd intel_rapl_perf autofs4 [last unloaded:
xen_drmfront]
Nov 14 10:30:18 DomU kernel: [ 1169.569919] CPU: 1 PID: 5146 Comm:
lt-modetest Tainted: GW  OE 4.9.0-040900rc3-generic
#201610291831
Nov 14 10:30:18 DomU kernel: [ 1169.569920]  c900406ffb10
81416bf2  
Nov 14 10:30:18 DomU kernel: [ 1169.569923]  c900406ffb50
8108361b 0081406ffb30 88003f90b8e0
Nov 14 10:30:18 DomU kernel: [ 1169.569925]  0001
0010  0201
Nov 14 10:30:18 DomU kernel: [ 1169.569928] Call Trace:
Nov 14 10:30:18 DomU kernel: [ 1169.569930]  []
dump_stack+0x63/0x81
Nov 14 10:30:18 DomU kernel: [ 1169.569932]  []
__warn+0xcb/0xf0
Nov 14 10:30:18 DomU kernel: [ 1169.569934]  []
warn_slowpath_null+0x1d/0x20
Nov 14 10:30:18 DomU kernel: [ 1169.569936]  []
xen_mc_flush+0x19c/0x1b0
Nov 14 10:30:18 DomU kernel: [ 1169.569938]  []
__xen_mc_entry+0xf6/0x150
Nov 14 10:30:18 DomU kernel: [ 1169.569940]  []
xen_extend_mmu_update+0x56/0xd0
Nov 14 10:30:18 DomU kernel: [ 1169.569942]  []
xen_set_pte_at+0x177/0x2f0
Nov 14 10:30:18 DomU kernel: [ 1169.569944]  []
remap_pfn_range+0x30b/0x430
Nov 14 10:30:18 DomU kernel: [ 1169.569946]  []
dma_common_mmap+0x87/0xa0
Nov 14 10:30:18 DomU kernel: [ 1169.569953]  []
drm_gem_cma_mmap_obj+0x8f/0xa0 [drm]
Nov 14 10:30:18 DomU kernel: [ 1169.569960]  []
drm_gem_cma_mmap+0x25/0x30 [drm]
Nov 14 10:30:18 DomU kernel: [ 1169.569962]  []
mmap_region+0x3a5/0x640
Nov 14 10:30:18 DomU kernel: [ 1169.569964]  []
do_mmap+0x446/0x530
Nov 14 10:30:18 DomU kernel: [ 1169.569966]  []
? common_mmap+0x45/0x50
Nov 14 10:30:18 DomU kernel: [ 1169.569968]  []
? apparmor_mmap_file+0x16/0x20
Nov 14 10:30:18 DomU kernel: [ 1169.569970]  []
? security_mmap_file+0xdd/0xf0
Nov 14 10:30:18 DomU kernel: [ 1169.569972]  []
vm_mmap_pgoff+0xba/0xf0
Nov 14 10:30:18 DomU kernel: [ 1169.569974]  []
SyS_mmap_pgoff+0x1c1/0x290
Nov 14 10:30:18 DomU kernel: [ 1169.569976]  []
SyS_mmap+0x1b/0x30
Nov 14 10:30:18 DomU kernel: [ 1169.569978]  []
entry_SYSCALL_64_fastpath+0x1e/0xad
Nov 14 10:30:18 DomU kernel: [ 1169.569979] ---[ end trace
ce1796cb265ebe08 ]---
Nov 14 10:30:18 DomU kernel: [ 1169.569982] ------------[ cut here ]------------


And output of xl dmesg says:

(XEN) memory.c:226:d0v0 Could not allocate order=9 extent: id=31
memflags=0x40 (488 of 512)
(d31) mapping kernel into physical memory
(d31) about to get started...
(XEN) d31 attempted to change d31v0's CR4 flags 0620 -> 00040660
(XEN) d31 attempted to change d31v1's CR4 flags 0620 -> 00040660
(XEN) traps.c:3657: GPF (): 82d0801a1a09 -> 82d08024b970
   

Re: [Xen-devel] [PATCH] arm/acpi: hide watchdog timer in GTDT table for dom0

2016-11-30 Thread Stefano Stabellini
On Wed, 30 Nov 2016, Julien Grall wrote:
> Hi Stefano,
> 
> On 29/11/2016 19:08, Stefano Stabellini wrote:
> > On Mon, 28 Nov 2016, Shanker Donthineni wrote:
> > > Either we have to hide the watchdog timer section in GTDT or emulate
> > > watchdog timer block for dom0. Otherwise, system gets panic when
> > > dom0 accesses its MMIO registers. The current XEN doesn't support
> > > virtualization of watchdog timer, so hide the watchdog timer section
> > > for dom0.
> > > 
> > > Signed-off-by: Shanker Donthineni 
> > 
> > Thanks for the patch, it looks good. Just a couple of questions below.
> > 
> > 
> > >  xen/arch/arm/domain_build.c | 41
> > > +
> > >  xen/include/asm-arm/acpi.h  |  1 +
> > >  2 files changed, 42 insertions(+)
> > > 
> > > diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
> > > index e8a400c..611c803 100644
> > > --- a/xen/arch/arm/domain_build.c
> > > +++ b/xen/arch/arm/domain_build.c
> > > @@ -1668,6 +1668,8 @@ static int acpi_create_xsdt(struct domain *d, struct
> > > membank tbl_add[])
> > > ACPI_SIG_FADT, tbl_add[TBL_FADT].start);
> > >  acpi_xsdt_modify_entry(xsdt->table_offset_entry, entry_count,
> > > ACPI_SIG_MADT, tbl_add[TBL_MADT].start);
> > > +acpi_xsdt_modify_entry(xsdt->table_offset_entry, entry_count,
> > > +   ACPI_SIG_GTDT, tbl_add[TBL_GTDT].start);
> > >  xsdt->table_offset_entry[entry_count] = tbl_add[TBL_STAO].start;
> > > 
> > >  xsdt->header.length = table_size;
> > > @@ -1718,6 +1720,41 @@ static int acpi_create_stao(struct domain *d,
> > > struct membank tbl_add[])
> > >  return 0;
> > >  }
> > > 
> > > +static int acpi_create_gtdt(struct domain *d, struct membank tbl_add[])
> > > +{
> > > +struct acpi_table_header *table = NULL;
> > > +struct acpi_table_gtdt *gtdt = NULL;
> > > +u32 table_size = sizeof(struct acpi_table_gtdt);
> > > +u32 offset = acpi_get_table_offset(tbl_add, TBL_GTDT);
> > > +acpi_status status;
> > > +u8 *base_ptr, checksum;
> > > +
> > > +status = acpi_get_table(ACPI_SIG_GTDT, 0, &table);
> > > +
> > > +if ( ACPI_FAILURE(status) )
> > > +{
> > > +const char *msg = acpi_format_exception(status);
> > > +
> > > +printk("Failed to get GTDT table, %s\n", msg);
> > > +return -EINVAL;
> > > +}
> > > +
> > > +base_ptr = d->arch.efi_acpi_table + offset;
> > > +ACPI_MEMCPY(base_ptr, table, sizeof(struct acpi_table_gtdt));
> > 
> > Use table_size
> > 
> > 
> > > +gtdt = (struct acpi_table_gtdt *)base_ptr;
> > > +gtdt->header.length = table_size;
> > > +gtdt->platform_timer_count = 0;
> > > +gtdt->platform_timer_offset = table_size;
> > 
> > Why table_size instead of 0? Is that the expected values when the array
> > is empty?
> 
> platform_timer_offset contains the offset to the start of the array from the
> beginning of the table. So I don't think it matters here.
> 
> Actually I would even avoid to update this parameter.

I understand that it shouldn't matter, but this kind of parameters
usually have a default value regardless. However in this instance it is
not in the spec so I guess anything should be OK.
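
The quoted hunk declares a `checksum` variable but stops before it is used. As background, an ACPI table whose contents are modified needs its checksum byte recomputed so that all bytes of the table sum to zero modulo 256. The sketch below shows that step on a raw byte buffer; the helper names are hypothetical and this is not the code from the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Byte offset of the checksum field in the common ACPI table header
 * (4-byte signature, 4-byte length, 1-byte revision, then checksum). */
#define ACPI_CHECKSUM_OFFSET 9

/* Sum all bytes of the table, wrapping modulo 256. */
static uint8_t acpi_byte_sum(const uint8_t *table, size_t len)
{
    uint8_t sum = 0;

    for (size_t i = 0; i < len; i++)
        sum = (uint8_t)(sum + table[i]);
    return sum;
}

/* Recompute the checksum so the whole table sums to zero.
 * The caller must pass a table of at least ACPI_CHECKSUM_OFFSET + 1 bytes. */
static void acpi_fix_checksum(uint8_t *table, size_t len)
{
    table[ACPI_CHECKSUM_OFFSET] = 0;
    table[ACPI_CHECKSUM_OFFSET] = (uint8_t)(0u - acpi_byte_sum(table, len));
}
```

In the context of this thread, such a fix-up would run after `header.length`, `platform_timer_count` and `platform_timer_offset` have been rewritten, whatever value is finally chosen for the offset.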



[Xen-devel] [xen-4.6-testing baseline-only test] 68126: regressions - FAIL

2016-11-30 Thread Platform Team regression test user
This run is configured for baseline tests only.

flight 68126 xen-4.6-testing real [real]
http://osstest.xs.citrite.net/~osstest/testlogs/logs/68126/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-xtf-amd64-amd64-1 20 xtf/test-hvm32-invlpg~shadow fail REGR. vs. 68092
 test-xtf-amd64-amd64-1 29 xtf/test-hvm32pae-invlpg~shadow fail REGR. vs. 68092
 test-xtf-amd64-amd64-1 40 xtf/test-hvm64-invlpg~shadow fail REGR. vs. 68092
 test-amd64-amd64-xl-qemuu-debianhvm-amd64  6 xen-boot fail REGR. vs. 68092

Regressions which are regarded as allowable (not blocking):
 test-amd64-amd64-i386-pvgrub 10 guest-start  fail   like 68092
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop fail like 68092
 test-amd64-i386-xl-qemut-win7-amd64 16 guest-stop  fail like 68092
 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop  fail like 68092
 test-amd64-amd64-qemuu-nested-intel 16 debian-hvm-install/l1/l2 fail like 68092
 test-amd64-amd64-xl-qemut-winxpsp3  9 windows-install  fail like 68092

Tests which did not succeed, but are not blocking:
 test-amd64-i386-rumprun-i386 10 rumprun-demo-xenstorels/xenstorels fail never pass
 test-armhf-armhf-xl-xsm 12 migrate-support-check fail never pass
 test-armhf-armhf-xl-xsm 13 saverestore-support-check fail never pass
 test-armhf-armhf-xl-midway 12 migrate-support-check fail never pass
 test-armhf-armhf-xl-multivcpu 12 migrate-support-check fail never pass
 test-armhf-armhf-xl-multivcpu 13 saverestore-support-check fail never pass
 test-armhf-armhf-xl-midway 13 saverestore-support-check fail never pass
 test-armhf-armhf-libvirt-xsm 12 migrate-support-check fail never pass
 test-armhf-armhf-libvirt 12 migrate-support-check fail never pass
 test-armhf-armhf-libvirt-xsm 14 guest-saverestore fail never pass
 test-armhf-armhf-xl-credit2 12 migrate-support-check fail never pass
 test-armhf-armhf-libvirt 14 guest-saverestore fail never pass
 test-armhf-armhf-xl-credit2 13 saverestore-support-check fail never pass
 test-armhf-armhf-xl 12 migrate-support-check fail never pass
 test-armhf-armhf-xl 13 saverestore-support-check fail never pass
 test-amd64-i386-libvirt 12 migrate-support-check fail never pass
 test-amd64-amd64-xl-pvh-intel 11 guest-start fail never pass
 test-amd64-amd64-xl-pvh-amd 11 guest-start fail never pass
 test-amd64-amd64-rumprun-amd64 10 rumprun-demo-xenstorels/xenstorels fail never pass
 test-amd64-i386-libvirt-xsm 12 migrate-support-check fail never pass
 test-armhf-armhf-xl-rtds 12 migrate-support-check fail never pass
 test-armhf-armhf-xl-rtds 13 saverestore-support-check fail never pass
 test-armhf-armhf-libvirt-qcow2 11 migrate-support-check fail never pass
 test-armhf-armhf-libvirt-qcow2 13 guest-saverestore fail never pass
 test-amd64-amd64-libvirt 12 migrate-support-check fail never pass
 test-amd64-amd64-libvirt-xsm 12 migrate-support-check fail never pass
 test-armhf-armhf-libvirt-raw 11 migrate-support-check fail never pass
 test-armhf-armhf-libvirt-raw 13 guest-saverestore fail never pass
 test-armhf-armhf-xl-vhd 11 migrate-support-check fail never pass
 test-armhf-armhf-xl-vhd 12 saverestore-support-check fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass
 test-amd64-amd64-qemuu-nested-amd 16 debian-hvm-install/l1/l2 fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass
 test-amd64-amd64-libvirt-vhd 11 migrate-support-check fail never pass
 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-stop fail never pass

version targeted for testing:
 xen  0ba95621b8988ad5ceb76b43e76be404fd798f7b
baseline version:
 xen  514173d3f8623194aa607f71866a240d02d432e8

Last test of basis    68092  2016-11-24 12:17:31 Z    6 days
Testing same since    68126  2016-11-30 01:21:37 Z    0 days    1 attempts


People who touched revisions under test:
  Stefano Stabellini 
  Wei Chen 

jobs:
 build-amd64-xsm                          pass
 build-armhf-xsm                          pass
 build-i386-xsm                           pass
 build-amd64-xtf                          pass
 build-amd64                              pass
 build-armhf                              pass
 build-i386                               pass

[Xen-devel] [qemu-mainline test] 102722: tolerable FAIL - PUSHED

2016-11-30 Thread osstest service owner
flight 102722 qemu-mainline real [real]
http://logs.test-lab.xenproject.org/osstest/logs/102722/

Failures :-/ but no regressions.

Regressions which are regarded as allowable (not blocking):
 test-armhf-armhf-libvirt 13 saverestore-support-check fail like 102621
 test-armhf-armhf-libvirt-xsm 13 saverestore-support-check fail like 102621
 test-armhf-armhf-libvirt-qcow2 12 saverestore-support-check fail like 102621
 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop fail like 102621
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop fail like 102621
 test-armhf-armhf-libvirt-raw 12 saverestore-support-check fail like 102621
 test-amd64-amd64-xl-rtds 9 debian-install fail like 102621

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-pvh-intel 11 guest-start fail never pass
 test-amd64-i386-libvirt-xsm 12 migrate-support-check fail never pass
 test-amd64-i386-libvirt 12 migrate-support-check fail never pass
 test-amd64-amd64-libvirt 12 migrate-support-check fail never pass
 test-amd64-amd64-libvirt-vhd 11 migrate-support-check fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass
 test-armhf-armhf-libvirt 12 migrate-support-check fail never pass
 test-amd64-amd64-libvirt-xsm 12 migrate-support-check fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass
 test-amd64-amd64-xl-pvh-amd 11 guest-start fail never pass
 test-armhf-armhf-libvirt-xsm 12 migrate-support-check fail never pass
 test-amd64-amd64-qemuu-nested-amd 16 debian-hvm-install/l1/l2 fail never pass
 test-armhf-armhf-xl-cubietruck 12 migrate-support-check fail never pass
 test-armhf-armhf-xl-cubietruck 13 saverestore-support-check fail never pass
 test-armhf-armhf-libvirt-qcow2 11 migrate-support-check fail never pass
 test-armhf-armhf-xl-credit2 12 migrate-support-check fail never pass
 test-armhf-armhf-xl-credit2 13 saverestore-support-check fail never pass
 test-armhf-armhf-xl-vhd 11 migrate-support-check fail never pass
 test-armhf-armhf-xl-vhd 12 saverestore-support-check fail never pass
 test-armhf-armhf-xl-arndale 12 migrate-support-check fail never pass
 test-armhf-armhf-xl-arndale 13 saverestore-support-check fail never pass
 test-armhf-armhf-xl-rtds 12 migrate-support-check fail never pass
 test-armhf-armhf-xl-rtds 13 saverestore-support-check fail never pass
 test-armhf-armhf-xl-multivcpu 12 migrate-support-check fail never pass
 test-armhf-armhf-xl-multivcpu 13 saverestore-support-check fail never pass
 test-armhf-armhf-xl-xsm 12 migrate-support-check fail never pass
 test-armhf-armhf-xl-xsm 13 saverestore-support-check fail never pass
 test-armhf-armhf-xl 12 migrate-support-check fail never pass
 test-armhf-armhf-xl 13 saverestore-support-check fail never pass
 test-armhf-armhf-libvirt-raw 11 migrate-support-check fail never pass

version targeted for testing:
 qemuu 1cd56fd2e14f67ead2f0458b4ae052f19865c41c
baseline version:
 qemuu 00227fefd2059464cd2f59aed29944874c630e2f

Last test of basis   102621  2016-11-24 19:58:38 Z    5 days
Testing same since   102722  2016-11-29 23:14:18 Z    0 days    1 attempts


People who touched revisions under test:
  Adrian Bunk 
  Alberto Garcia 
  Alistair Francis 
  Benjamin Herrenschmidt 
  Bobby Bingham 
  Changlimin 
  David Gibson 
  Dr. David Alan Gilbert 
  Eduardo Habkost 
  Eric Blake 
  Fam Zheng 
  Francis Deslauriers 
  Greg Kurz 
  Guenter Roeck 
  Hervé Poussineau 
  Jan Beulich 
  Jose Ricardo Ziviani 
  Kevin Wolf 
  Laurent Vivier 
  Li Qiang 
  Mark Cave-Ayland 
  Max Reitz 
  Michael Roth 
  Olaf Hering 
  Paolo Bonzini 
  Peter Maydell 
  Peter Xu 
  Richard Henderson 
  Stefan Hajnoczi 
  Stefano Stabellini 
  Thomas Huth 
  Vladimir Svoboda 

jobs:
 build-amd64-xsm  pass
 build-armhf-xsm   

Re: [Xen-devel] [PATCH v3 10/24] x86/emul: Always use fault semantics for software events

2016-11-30 Thread Boris Ostrovsky
On 11/30/2016 08:50 AM, Andrew Cooper wrote:
> The common case is already using fault semantics out of x86_emulate(), as that
> is how VT-x/SVM expects to inject the event (given suitable hardware support).
>
> However, x86_emulate() returning X86EMUL_EXCEPTION and also completing a
> register writeback is problematic for callers.
>
> Switch the logic to always using fault semantics, and leave svm_inject_trap()
> to fix up %eip if necessary.
>
> Signed-off-by: Andrew Cooper 
> ---
> CC: Jan Beulich 
> CC: Tim Deegan 
> CC: Boris Ostrovsky 
> CC: Suravee Suthikulpanit 

Reviewed-by:  Boris Ostrovsky 





Re: [Xen-devel] [PATCH v9 07/13] x86: add multiboot2 protocol support for EFI platforms

2016-11-30 Thread Daniel Kiper
On Wed, Nov 30, 2016 at 06:59:57AM -0700, Jan Beulich wrote:
> >>> On 30.11.16 at 14:45,  wrote:
> > On Fri, Nov 25, 2016 at 12:50:55AM -0700, Jan Beulich wrote:
> >> >>> On 24.11.16 at 22:44,  wrote:
> >> > On Thu, Nov 24, 2016 at 04:08:12AM -0700, Jan Beulich wrote:
> >> >> >>> On 23.11.16 at 19:52,  wrote:
> >> >> > Always use add/sub 1 in preference to inc and dec.  They are the same
> >> >> > length to encode in 64bit, and avoids a pipeline stall from a merge of
> >> >> > the eflags register.
> >> >>
> >> >> What you say regarding length is not true - add/sub need to encode
> >> >> the immediate somewhere (even if the operand was a register,
> >> >> inc/dec would still be smaller than add/sub, just not by as much as
> >> >> in 32-bit code). And the pipeline stall, afaik, affects only rather old
> >> >> processors.
> >> >
> >> > Intel 64 and IA-32 Architectures Optimization Reference Manual, section
> >> > 3.5.1.1, Use of the INC and DEC Instructions says nothing about 
> >> > exceptions.
> >>
> >> Which by itself is suspicious, as the dependency issue had been
> >> introduced only in (iirc) Pentium4. And anyway, this is a general
> >
> > Hmmm... Interesting... It looks that INC/DEC behavior has not changed since
> > the beginning (why it would change? It would not make sense).
>
> Please properly disambiguate "behavior": Architectural behavior of
> course can't change. Performance, otoh, has changed many times.

Right!

> And the resource dependency issue did appear only once the
> pipelining of instructions was sophisticated enough, but not good
> enough yet to track (as a dependency) EFLAGS.CF separately from
> the other arithmetic result flags.

OK, that makes sense now. Thanks for the explanation.

Daniel
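
As a small illustration of the encoding-length point raised earlier in this thread, the byte sequences below are the standard encodings for incrementing %eax by one. The array and function names are hypothetical; the sketch only compares sizes and says nothing about the performance question.

```c
#include <stddef.h>
#include <stdint.h>

/* inc %eax in 64-bit mode: opcode FF /0 plus a ModRM byte. */
static const uint8_t inc_eax_64[] = { 0xff, 0xc0 };

/* add $1, %eax: opcode 83 /0, a ModRM byte and an 8-bit immediate. */
static const uint8_t add1_eax[] = { 0x83, 0xc0, 0x01 };

/* inc %eax in 32-bit mode: the one-byte 40+r form, which is reused
 * as a REX prefix in 64-bit mode and so is unavailable there. */
static const uint8_t inc_eax_32[] = { 0x40 };

/* Bytes saved by using inc instead of add $1 in 64-bit code. */
static size_t bytes_saved_64(void)
{
    return sizeof(add1_eax) - sizeof(inc_eax_64);
}

/* Bytes saved by using inc instead of add $1 in 32-bit code. */
static size_t bytes_saved_32(void)
{
    return sizeof(add1_eax) - sizeof(inc_eax_32);
}
```

The immediate byte is what makes add/sub longer in both modes; the saving is simply smaller in 64-bit code because the one-byte inc form is gone.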



Re: [Xen-devel] Xen ARM - Exposing a PL011 to the guest

2016-11-30 Thread Volodymyr Babchuk
Hello Julien,



On 30 November 2016 at 17:29, Julien Grall  wrote:
[...]
> I think we can distinguish two places where the PL011 could be emulated:
> in the hypervisor, or outside the hypervisor.
>
> Emulating the UART in the hypervisor means that we risk increasing the
> attack surface of Xen if there is a bug in the emulation code. The
> attack surface could be reduced by emulating the UART in another exception
> level (e.g. EL1, EL0) but still under the control of the hypervisor. Usually
> the guest communicates with xenconsoled using a ring. For the
> first console this can be discovered using the hypercall HVMOP_get_param. For
> the second and onwards, it is described in xenstore. I would not worry too much
> about emulating multiple PL011s, so we could implement the PV frontend in
> Xen.
>
[...]

> I would lean towards the first solution if we implement all the security
> safety I mentioned. Although, the second solution would be a good move if we
> decide to implement more devices (e.g RTC, pflash) in the future.
>
> Do you have any opinions?
Looks like this topic has something in common with the OP-TEE thread. I like
the first solution, because if there is an easy and reliable way to run
code in Xen's EL1/EL0, then this would be an ideal solution for a TEE
emulation/mediation layer.
So, if you choose this way, please bear in mind other uses, like
TEE emulation.

-- 
WBR Volodymyr Babchuk aka lorc [+380976646013]
mailto: vlad.babc...@gmail.com
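
The quoted proposal mentions that the guest usually talks to xenconsoled through a ring. The sketch below shows the general shape of such a single-producer, single-consumer byte ring with free-running indices. It is a simplified illustration with hypothetical names: a real shared ring also needs memory barriers and an event-channel notification, both omitted here, and this is not the layout of Xen's console interface.

```c
#include <stddef.h>
#include <stdint.h>

#define RING_SIZE 2048u   /* must be a power of two */

struct console_ring {
    char     buf[RING_SIZE];
    uint32_t cons;        /* free-running index, advanced by the consumer */
    uint32_t prod;        /* free-running index, advanced by the producer */
};

/* Copy up to len bytes into the ring; returns how many were accepted. */
static size_t ring_write(struct console_ring *r, const char *data, size_t len)
{
    size_t sent = 0;

    while (sent < len && (r->prod - r->cons) < RING_SIZE) {
        r->buf[r->prod & (RING_SIZE - 1)] = data[sent++];
        r->prod++;
    }
    return sent;
}

/* Drain up to len bytes from the ring; returns how many were read. */
static size_t ring_read(struct console_ring *r, char *out, size_t len)
{
    size_t got = 0;

    while (got < len && r->cons != r->prod) {
        out[got++] = r->buf[r->cons & (RING_SIZE - 1)];
        r->cons++;
    }
    return got;
}
```

An emulated PL011 placed in front of such a ring would translate guest writes to the data register into `ring_write` calls and feed `ring_read` output back as received characters; where that translation runs is exactly the question discussed in this thread.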



[Xen-devel] [PATCH v4 11/14] xen/x86: parse Dom0 kernel for PVHv2

2016-11-30 Thread Roger Pau Monne
Introduce a helper to parse the Dom0 kernel.

Signed-off-by: Roger Pau Monné 
---
Cc: Jan Beulich 
Cc: Andrew Cooper 
---
Changes since v3:
 - Change one error message.
 - Indent "out" label by one space.
 - Introduce hvm_copy_to_phys and slightly simplify the code in hvm_load_kernel.

Changes since v2:
 - Remove debug messages.
 - Don't hardcode the number of modules to 1.
---
 xen/arch/x86/domain_build.c | 145 
 1 file changed, 145 insertions(+)

diff --git a/xen/arch/x86/domain_build.c b/xen/arch/x86/domain_build.c
index 8602566..e40fb94 100644
--- a/xen/arch/x86/domain_build.c
+++ b/xen/arch/x86/domain_build.c
@@ -39,6 +39,7 @@
 #include 
 
 #include 
+#include 
 
 static long __initdata dom0_nrpages;
 static long __initdata dom0_min_nrpages;
@@ -1930,12 +1931,148 @@ static int __init hvm_setup_p2m(struct domain *d)
 #undef MB1_PAGES
 }
 
+static int __init hvm_copy_to_phys(struct domain *d, paddr_t paddr, void *buf,
+   int size)
+{
+struct vcpu *saved_current;
+int rc;
+
+saved_current = current;
+set_current(d->vcpu[0]);
+rc = hvm_copy_to_guest_phys(paddr, buf, size);
+set_current(saved_current);
+
+return rc != HVMCOPY_okay ? -EFAULT : 0;
+}
+
+static int __init hvm_load_kernel(struct domain *d, const module_t *image,
+  unsigned long image_headroom,
+  module_t *initrd, char *image_base,
+  char *cmdline, paddr_t *entry,
+  paddr_t *start_info_addr)
+{
+char *image_start = image_base + image_headroom;
+unsigned long image_len = image->mod_end;
+struct elf_binary elf;
+struct elf_dom_parms parms;
+paddr_t last_addr;
+struct hvm_start_info start_info;
+struct hvm_modlist_entry mod;
+struct vcpu *saved_current, *v = d->vcpu[0];
+int rc;
+
+if ( (rc = bzimage_parse(image_base, &image_start, &image_len)) != 0 )
+{
+printk("Error trying to detect bz compressed kernel\n");
+return rc;
+}
+
+if ( (rc = elf_init(&elf, image_start, image_len)) != 0 )
+{
+printk("Unable to init ELF\n");
+return rc;
+}
+#ifdef VERBOSE
+elf_set_verbose(&elf);
+#endif
+elf_parse_binary(&elf);
+if ( (rc = elf_xen_parse(&elf, &parms)) != 0 )
+{
+printk("Unable to parse kernel for ELFNOTES\n");
+return rc;
+}
+
+if ( parms.phys_entry == UNSET_ADDR32 ) {
+printk("Unable to find XEN_ELFNOTE_PHYS32_ENTRY address\n");
+return -EINVAL;
+}
+
+printk("OS: %s version: %s loader: %s bitness: %s\n", parms.guest_os,
+   parms.guest_ver, parms.loader,
+   elf_64bit(&elf) ? "64-bit" : "32-bit");
+
+/* Copy the OS image and free temporary buffer. */
+elf.dest_base = (void *)(parms.virt_kstart - parms.virt_base);
+elf.dest_size = parms.virt_kend - parms.virt_kstart;
+
+saved_current = current;
+set_current(v);
+rc = elf_load_binary(&elf);
+set_current(saved_current);
+if ( rc < 0 )
+{
+printk("Failed to load kernel: %d\n", rc);
+printk("Xen dom0 kernel broken ELF: %s\n", elf_check_broken(&elf));
+return rc;
+}
+
+last_addr = ROUNDUP(parms.virt_kend - parms.virt_base, PAGE_SIZE);
+
+if ( initrd != NULL )
+{
+rc = hvm_copy_to_phys(d, last_addr, mfn_to_virt(initrd->mod_start),
+  initrd->mod_end);
+if ( rc )
+{
+printk("Unable to copy initrd to guest\n");
+return rc;
+}
+
+mod.paddr = last_addr;
+mod.size = initrd->mod_end;
+last_addr += ROUNDUP(initrd->mod_end, PAGE_SIZE);
+}
+
+/* Free temporary buffers. */
+discard_initial_images();
+
+memset(&start_info, 0, sizeof(start_info));
+if ( cmdline != NULL )
+{
+rc = hvm_copy_to_phys(d, last_addr, cmdline, strlen(cmdline) + 1);
+if ( rc )
+{
+printk("Unable to copy guest command line\n");
+return rc;
+}
+start_info.cmdline_paddr = last_addr;
+last_addr += ROUNDUP(strlen(cmdline) + 1, 8);
+}
+if ( initrd != NULL )
+{
+rc = hvm_copy_to_phys(d, last_addr, &mod, sizeof(mod));
+if ( rc )
+{
+printk("Unable to copy guest modules\n");
+return rc;
+}
+start_info.modlist_paddr = last_addr;
+start_info.nr_modules = 1;
+last_addr += sizeof(mod);
+}
+
+start_info.magic = XEN_HVM_START_MAGIC_VALUE;
+start_info.flags = SIF_PRIVILEGED | SIF_INITDOMAIN;
+rc = hvm_copy_to_phys(d, last_addr, &start_info, sizeof(start_info));
+if ( rc )
+{
+printk("Unable to copy start info to guest\n");
+return rc;
+}
+
+*entry = parms.phys_entry;
+*start_info_addr = last_addr;
+
+return 0;
+}
+
 static int 

[Xen-devel] [PATCH v4 09/14] xen/x86: split Dom0 build into PV and PVHv2

2016-11-30 Thread Roger Pau Monne
Split the Dom0 builder into two different functions, one for PV (and classic
PVH), and another one for PVHv2. Introduce a new command line parameter
called 'dom0' that can be used to request the creation of a PVHv2 Dom0 by
setting the 'hvm' sub-option. For now, Xen panics if a user tries to use
dom0=hvm; the panic will be removed once all the code is in place.
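
For reference, the comma-separated parsing that the new 'dom0' option relies on can be sketched in isolation as follows. This is only an illustration modelled on parse_dom0_param from the diff below; the dom0_opts struct and the function name are made up for the example (the patch itself sets global variables), and unlike the patch the sketch does not advance the pointer once the last token has been consumed.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical container; the patch uses dom0_hvm and opt_dom0_shadow. */
struct dom0_opts {
    bool hvm;
    bool shadow;
};

/* Parse a comma-separated option list in place, e.g. "hvm,shadow". */
static void parse_dom0_opts(char *s, struct dom0_opts *opts)
{
    char *ss;

    assert(s != NULL && opts != NULL);

    do {
        /* Terminate the current token at the next comma, if any. */
        ss = strchr(s, ',');
        if ( ss )
            *ss = '\0';

        if ( !strcmp(s, "hvm") )
            opts->hvm = true;
        else if ( !strcmp(s, "shadow") )
            opts->shadow = true;
        /* Unknown sub-options are silently ignored, as in the patch. */

        if ( ss )
            s = ss + 1;
    } while ( ss );
}
```

With this shape, booting with dom0=hvm,shadow would set both flags, while an unrecognised token leaves the defaults untouched.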

Signed-off-by: Roger Pau Monné 
---
Cc: Jan Beulich 
Cc: Andrew Cooper 
---
Changes since v3:
 - Correctly declare the parameter list.
 - Add a panic if dom0=hvm is used. This will be removed once all the code is in
   place.

Changes since v2:
 - Fix coding style.
 - Introduce a new dom0 option that allows passing several parameters.
   Currently supported ones are hvm and shadow.

Changes since RFC:
 - Add documentation for the new command line option.
 - Simplify the logic in construct_dom0.
---
 docs/misc/xen-command-line.markdown | 17 +
 xen/arch/x86/domain_build.c | 28 +++
 xen/arch/x86/setup.c| 38 +
 xen/include/asm-x86/setup.h |  6 ++
 4 files changed, 85 insertions(+), 4 deletions(-)

diff --git a/docs/misc/xen-command-line.markdown b/docs/misc/xen-command-line.markdown
index 0138978..c9729be 100644
--- a/docs/misc/xen-command-line.markdown
+++ b/docs/misc/xen-command-line.markdown
@@ -656,6 +656,23 @@ affinities to prefer but be not limited to the specified node(s).
 
 Pin dom0 vcpus to their respective pcpus
 
+### dom0
+> `= List of [ hvm | shadow ]`
+
+> Sub-options:
+
+> `hvm`
+
+> Default: `false`
+
+Flag that makes a dom0 boot in PVHv2 mode.
+
+> `shadow`
+
+> Default: `false`
+
+Flag that makes a dom0 use shadow paging.
+
 ### dom0pvh
 > `= `
 
diff --git a/xen/arch/x86/domain_build.c b/xen/arch/x86/domain_build.c
index 1e557b9..2c9ebf2 100644
--- a/xen/arch/x86/domain_build.c
+++ b/xen/arch/x86/domain_build.c
@@ -191,10 +191,8 @@ struct vcpu *__init alloc_dom0_vcpu0(struct domain *dom0)
 }
 
 #ifdef CONFIG_SHADOW_PAGING
-static bool_t __initdata opt_dom0_shadow;
+bool __initdata opt_dom0_shadow;
 boolean_param("dom0_shadow", opt_dom0_shadow);
-#else
-#define opt_dom0_shadow 0
 #endif
 
 static char __initdata opt_dom0_ioports_disable[200] = "";
@@ -951,7 +949,7 @@ static int __init setup_permissions(struct domain *d)
 return rc;
 }
 
-int __init construct_dom0(
+static int __init construct_dom0_pv(
 struct domain *d,
 const module_t *image, unsigned long image_headroom,
 module_t *initrd,
@@ -1655,6 +1653,28 @@ out:
 return rc;
 }
 
+static int __init construct_dom0_hvm(struct domain *d, const module_t *image,
+ unsigned long image_headroom,
+ module_t *initrd,
+ void *(*bootstrap_map)(const module_t *),
+ char *cmdline)
+{
+
+printk("** Building a PVH Dom0 **\n");
+
+return 0;
+}
+
+int __init construct_dom0(struct domain *d, const module_t *image,
+  unsigned long image_headroom, module_t *initrd,
+  void *(*bootstrap_map)(const module_t *),
+  char *cmdline)
+{
+
+return (is_hvm_domain(d) ? construct_dom0_hvm : construct_dom0_pv)
+   (d, image, image_headroom, initrd,bootstrap_map, cmdline);
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
index b130671..255e20c 100644
--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -187,6 +187,35 @@ static void __init parse_acpi_param(char *s)
 }
 }
 
+/*
+ * List of parameters that affect Dom0 creation:
+ *
+ *  - hvm   Create a PVHv2 Dom0.
+ *  - shadowUse shadow paging for Dom0.
+ */
+static bool __initdata dom0_hvm;
+static void __init parse_dom0_param(char *s)
+{
+char *ss;
+
+do {
+
+ss = strchr(s, ',');
+if ( ss )
+*ss = '\0';
+
+if ( !strcmp(s, "hvm") )
+dom0_hvm = true;
+#ifdef CONFIG_SHADOW_PAGING
+else if ( !strcmp(s, "shadow") )
+opt_dom0_shadow = true;
+#endif
+
+s = ss + 1;
+} while ( ss );
+}
+custom_param("dom0", parse_dom0_param);
+
 static const module_t *__initdata initial_images;
 static unsigned int __initdata nr_initial_images;
 
@@ -1541,6 +1570,15 @@ void __init noreturn __start_xen(unsigned long mbi_p)
 if ( opt_dom0pvh )
 domcr_flags |= DOMCRF_pvh | DOMCRF_hap;
 
+if ( dom0_hvm )
+{
+panic("Building a PVHv2 Dom0 is not yet supported.");
+domcr_flags |= DOMCRF_hvm |
+   ((hvm_funcs.hap_supported && !opt_dom0_shadow) ?
+ DOMCRF_hap : 0);
+config.emulation_flags = XEN_X86_EMU_LAPIC|XEN_X86_EMU_IOAPIC;
+}
+
 /* Create initial domain 0. */
 dom0 = 

[Xen-devel] [PATCH v4 05/14] xen/x86: split the setup of Dom0 permissions to a function

2016-11-30 Thread Roger Pau Monne
So that it can also be used by the PVH-specific domain builder. This is just
code motion; it should not introduce any functional change.

Signed-off-by: Roger Pau Monné 
Acked-by: Jan Beulich 
---
Cc: Andrew Cooper 
Cc: Jan Beulich 
---
Changes since v2:
 - Fix comment style.
 - Convert i to unsigned int.
 - Restore previous BUG_ON in case of failure (instead of panic).
 - Remove unneeded rc initializer.
---
 xen/arch/x86/domain_build.c | 160 +++-
 1 file changed, 83 insertions(+), 77 deletions(-)

diff --git a/xen/arch/x86/domain_build.c b/xen/arch/x86/domain_build.c
index 17f8e91..1e557b9 100644
--- a/xen/arch/x86/domain_build.c
+++ b/xen/arch/x86/domain_build.c
@@ -869,6 +869,88 @@ static __init void setup_pv_physmap(struct domain *d, unsigned long pgtbl_pfn,
 unmap_domain_page(l4start);
 }
 
+static int __init setup_permissions(struct domain *d)
+{
+unsigned long mfn;
+unsigned int i;
+int rc;
+
+/* The hardware domain is initially permitted full I/O capabilities. */
+rc = ioports_permit_access(d, 0, 0x);
+rc |= iomem_permit_access(d, 0UL, (1UL << (paddr_bits - PAGE_SHIFT)) - 1);
+rc |= irqs_permit_access(d, 1, nr_irqs_gsi - 1);
+
+/* Modify I/O port access permissions. */
+
+/* Master Interrupt Controller (PIC). */
+rc |= ioports_deny_access(d, 0x20, 0x21);
+/* Slave Interrupt Controller (PIC). */
+rc |= ioports_deny_access(d, 0xA0, 0xA1);
+/* Interval Timer (PIT). */
+rc |= ioports_deny_access(d, 0x40, 0x43);
+/* PIT Channel 2 / PC Speaker Control. */
+rc |= ioports_deny_access(d, 0x61, 0x61);
+/* ACPI PM Timer. */
+if ( pmtmr_ioport )
+rc |= ioports_deny_access(d, pmtmr_ioport, pmtmr_ioport + 3);
+/* PCI configuration space (NB. 0xcf8 has special treatment). */
+rc |= ioports_deny_access(d, 0xcfc, 0xcff);
+/* Command-line I/O ranges. */
+process_dom0_ioports_disable(d);
+
+/* Modify I/O memory access permissions. */
+
+/* Local APIC. */
+if ( mp_lapic_addr != 0 )
+{
+mfn = paddr_to_pfn(mp_lapic_addr);
+rc |= iomem_deny_access(d, mfn, mfn);
+}
+/* I/O APICs. */
+for ( i = 0; i < nr_ioapics; i++ )
+{
+mfn = paddr_to_pfn(mp_ioapics[i].mpc_apicaddr);
+if ( !rangeset_contains_singleton(mmio_ro_ranges, mfn) )
+rc |= iomem_deny_access(d, mfn, mfn);
+}
+/* MSI range. */
+rc |= iomem_deny_access(d, paddr_to_pfn(MSI_ADDR_BASE_LO),
+paddr_to_pfn(MSI_ADDR_BASE_LO +
+ MSI_ADDR_DEST_ID_MASK));
+/* HyperTransport range. */
+if ( boot_cpu_data.x86_vendor == X86_VENDOR_AMD )
+rc |= iomem_deny_access(d, paddr_to_pfn(0xfdULL << 32),
+paddr_to_pfn((1ULL << 40) - 1));
+
+/* Remove access to E820_UNUSABLE I/O regions above 1MB. */
+for ( i = 0; i < e820.nr_map; i++ )
+{
+unsigned long sfn, efn;
+sfn = max_t(unsigned long, paddr_to_pfn(e820.map[i].addr), 0x100ul);
+efn = paddr_to_pfn(e820.map[i].addr + e820.map[i].size - 1);
+if ( (e820.map[i].type == E820_UNUSABLE) &&
+ (e820.map[i].size != 0) &&
+ (sfn <= efn) )
+rc |= iomem_deny_access(d, sfn, efn);
+}
+
+/* Prevent access to HPET */
+if ( hpet_address )
+{
+u8 prot_flags = hpet_flags & ACPI_HPET_PAGE_PROTECT_MASK;
+
+mfn = paddr_to_pfn(hpet_address);
+if ( prot_flags == ACPI_HPET_PAGE_PROTECT4 )
+rc |= iomem_deny_access(d, mfn, mfn);
+else if ( prot_flags == ACPI_HPET_PAGE_PROTECT64 )
+rc |= iomem_deny_access(d, mfn, mfn + 15);
+else if ( ro_hpet )
+rc |= rangeset_add_singleton(mmio_ro_ranges, mfn);
+}
+
+return rc;
+}
+
 int __init construct_dom0(
 struct domain *d,
 const module_t *image, unsigned long image_headroom,
@@ -1539,83 +1621,7 @@ int __init construct_dom0(
 if ( test_bit(XENFEAT_supervisor_mode_kernel, parms.f_required) )
 panic("Dom0 requires supervisor-mode execution");
 
-rc = 0;
-
-/* The hardware domain is initially permitted full I/O capabilities. */
-rc |= ioports_permit_access(d, 0, 0x);
-rc |= iomem_permit_access(d, 0UL, (1UL << (paddr_bits - PAGE_SHIFT)) - 1);
-rc |= irqs_permit_access(d, 1, nr_irqs_gsi - 1);
-
-/*
- * Modify I/O port access permissions.
- */
-/* Master Interrupt Controller (PIC). */
-rc |= ioports_deny_access(d, 0x20, 0x21);
-/* Slave Interrupt Controller (PIC). */
-rc |= ioports_deny_access(d, 0xA0, 0xA1);
-/* Interval Timer (PIT). */
-rc |= ioports_deny_access(d, 0x40, 0x43);
-/* PIT Channel 2 / PC Speaker Control. */
-rc |= ioports_deny_access(d, 0x61, 0x61);
-/* ACPI PM Timer. */
-if ( pmtmr_ioport )
-rc |= 

[Xen-devel] [PATCH v4 13/14] xen/x86: hack to setup PVHv2 Dom0 CPUs

2016-11-30 Thread Roger Pau Monne
Initialize Dom0 BSP/APs and setup the memory and IO permissions.

Signed-off-by: Roger Pau Monné 
---
Cc: Jan Beulich 
Cc: Andrew Cooper 
---
DO NOT APPLY.

The logic used to set up the CPUID leaves is clearly lacking. This patch will
be rebased on top of Andrew's CPUID work, which will move CPUID setup from
libxc into Xen. For the time being this is needed in order to be able to
boot a PVHv2 Dom0 and test the rest of the patches.
---
 xen/arch/x86/domain_build.c | 97 +
 1 file changed, 97 insertions(+)

diff --git a/xen/arch/x86/domain_build.c b/xen/arch/x86/domain_build.c
index 7e22ba3..5c7592b 100644
--- a/xen/arch/x86/domain_build.c
+++ b/xen/arch/x86/domain_build.c
@@ -41,6 +41,7 @@
 #include 
 #include 
 #include 
+#include 
 
 static long __initdata dom0_nrpages;
 static long __initdata dom0_min_nrpages;
@@ -2069,6 +2070,93 @@ static int __init hvm_load_kernel(struct domain *d, const module_t *image,
 return 0;
 }
 
+static int __init hvm_setup_cpus(struct domain *d, paddr_t entry,
+ paddr_t start_info)
+{
+vcpu_hvm_context_t cpu_ctx;
+struct vcpu *v = d->vcpu[0];
+int cpu, i, rc;
+struct {
+uint32_t index;
+uint32_t count;
+} cpuid_leaves[] = {
+{0, XEN_CPUID_INPUT_UNUSED},
+{1, XEN_CPUID_INPUT_UNUSED},
+{2, XEN_CPUID_INPUT_UNUSED},
+{4, 0},
+{4, 1},
+{4, 2},
+{4, 3},
+{4, 4},
+{7, 0},
+{0xa, XEN_CPUID_INPUT_UNUSED},
+{0xd, 0},
+{0x8000, XEN_CPUID_INPUT_UNUSED},
+{0x8001, XEN_CPUID_INPUT_UNUSED},
+{0x8002, XEN_CPUID_INPUT_UNUSED},
+{0x8003, XEN_CPUID_INPUT_UNUSED},
+{0x8004, XEN_CPUID_INPUT_UNUSED},
+{0x8005, XEN_CPUID_INPUT_UNUSED},
+{0x8006, XEN_CPUID_INPUT_UNUSED},
+{0x8007, XEN_CPUID_INPUT_UNUSED},
+{0x8008, XEN_CPUID_INPUT_UNUSED},
+};
+
+cpu = v->processor;
+for ( i = 1; i < d->max_vcpus; i++ )
+{
+cpu = cpumask_cycle(cpu, _cpus);
+setup_dom0_vcpu(d, i, cpu);
+}
+
+memset(&cpu_ctx, 0, sizeof(cpu_ctx));
+
+cpu_ctx.mode = VCPU_HVM_MODE_32B;
+
+cpu_ctx.cpu_regs.x86_32.ebx = start_info;
+cpu_ctx.cpu_regs.x86_32.eip = entry;
+cpu_ctx.cpu_regs.x86_32.cr0 = X86_CR0_PE | X86_CR0_ET;
+
+cpu_ctx.cpu_regs.x86_32.cs_limit = ~0u;
+cpu_ctx.cpu_regs.x86_32.ds_limit = ~0u;
+cpu_ctx.cpu_regs.x86_32.ss_limit = ~0u;
+cpu_ctx.cpu_regs.x86_32.tr_limit = 0x67;
+cpu_ctx.cpu_regs.x86_32.cs_ar = 0xc9b;
+cpu_ctx.cpu_regs.x86_32.ds_ar = 0xc93;
+cpu_ctx.cpu_regs.x86_32.ss_ar = 0xc93;
+cpu_ctx.cpu_regs.x86_32.tr_ar = 0x8b;
+
+rc = arch_set_info_hvm_guest(v, &cpu_ctx);
+if ( rc )
+{
+printk("Unable to setup Dom0 BSP context: %d\n", rc);
+return rc;
+}
+
+for ( i = 0; i < ARRAY_SIZE(cpuid_leaves); i++ )
+{
+d->arch.cpuids[i].input[0] = cpuid_leaves[i].index;
+d->arch.cpuids[i].input[1] = cpuid_leaves[i].count;
+cpuid_count(d->arch.cpuids[i].input[0], d->arch.cpuids[i].input[1],
+&d->arch.cpuids[i].eax, &d->arch.cpuids[i].ebx,
+&d->arch.cpuids[i].ecx, &d->arch.cpuids[i].edx);
+/* XXX: we need to do much more filtering here. */
+if ( d->arch.cpuids[i].input[0] == 1 )
+d->arch.cpuids[i].ecx &= ~X86_FEATURE_VMX;
+}
+
+rc = setup_permissions(d);
+if ( rc )
+{
+panic("Unable to setup Dom0 permissions: %d\n", rc);
+return rc;
+}
+
+update_domain_wallclock_time(d);
+
+return 0;
+}
+
 static int __init construct_dom0_hvm(struct domain *d, const module_t *image,
  unsigned long image_headroom,
  module_t *initrd,
@@ -2101,6 +2189,15 @@ static int __init construct_dom0_hvm(struct domain *d, const module_t *image,
 return rc;
 }
 
+rc = hvm_setup_cpus(d, entry, start_info);
+if ( rc )
+{
+printk("Failed to setup Dom0 CPUs: %d\n", rc);
+return rc;
+}
+
+clear_bit(_VPF_down, &d->vcpu[0]->pause_flags);
+
 return 0;
 }
 
-- 
2.9.3 (Apple Git-75)




[Xen-devel] [PATCH v4 00/14] Initial PVHv2 Dom0 support

2016-11-30 Thread Roger Pau Monne
Hello,

This is the first batch of the PVHv2 Dom0 support series, which includes
everything up to the point where ACPI tables for Dom0 are crafted. I've
decided to keep the last part of the series (the one that contains the PCI
config space handlers, and other emulation/trapping related code) separate,
in order to focus and ease the review. This is of course not functional: one
might be able to partially boot a Dom0 kernel if it doesn't try to access
any physical devices and the panic in setup.c is removed.

The full series can also be found on a git branch in my personal git repo:

git://xenbits.xen.org/people/royger/xen.git dom0_hvm_v4

Each patch contains the changelog between versions.

Thanks, Roger.




[Xen-devel] [PATCH v4 01/14] xen/x86: remove XENFEAT_hvm_pirqs for PVHv2 guests

2016-11-30 Thread Roger Pau Monne
PVHv2 guests, unlike HVM guests, won't have the option to route interrupts
from physical or emulated devices over event channels using PIRQs. This
applies to both DomU and Dom0 PVHv2 guests.

Introduce a new XEN_X86_EMU_USE_PIRQ to notify Xen whether a HVM guest can
route physical interrupts (even from emulated devices) over event channels,
and is thus allowed to use some of the PHYSDEV ops.

Signed-off-by: Roger Pau Monné 
---
Cc: Jan Beulich 
Cc: Andrew Cooper 
---
Changes since v3:
 - Update docs.

Changes since v2:
 - Change local variable name to currd instead of d.
 - Use currd where it makes sense.
---
 docs/misc/hvmlite.markdown| 20 
 xen/arch/x86/hvm/hvm.c| 25 -
 xen/arch/x86/physdev.c|  5 +++--
 xen/common/kernel.c   |  3 ++-
 xen/include/public/arch-x86/xen.h |  4 +++-
 5 files changed, 44 insertions(+), 13 deletions(-)

diff --git a/docs/misc/hvmlite.markdown b/docs/misc/hvmlite.markdown
index 898b8ee..b2557f7 100644
--- a/docs/misc/hvmlite.markdown
+++ b/docs/misc/hvmlite.markdown
@@ -75,3 +75,23 @@ info structure that's passed at boot time (field rsdp_paddr).
 
 Description of paravirtualized devices will come from XenStore, just as it's
 done for HVM guests.
+
+## Interrupts ##
+
+### Interrupts from physical devices ###
+
+Interrupts from physical devices are delivered using native methods, this is
+done in order to take advantage of new hardware assisted virtualization
+functions, like posted interrupts. This implies that PVHv2 guests with physical
+devices will also have the necessary interrupt controllers in order to manage
+the delivery of interrupts from those devices, using the same interfaces that
+are available on native hardware.
+
+### Interrupts from paravirtualized devices ###
+
+Interrupts from paravirtualized devices are delivered using event channels, see
+[Event Channel Internals][event_channels] for more detailed information about
+event channels. Delivery of those interrupts can be configured in the same way
+as HVM guests, check xen/include/public/hvm/params.h and
+xen/include/public/hvm/hvm_op.h for more information about available delivery
+methods.
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 25dc759..306d6b0 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -4186,10 +4186,12 @@ static long hvm_memory_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
 
 static long hvm_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
 {
+struct domain *currd = current->domain;
+
 switch ( cmd )
 {
 default:
-if ( !is_pvh_vcpu(current) || !is_hardware_domain(current->domain) )
+if ( !is_pvh_domain(currd) || !is_hardware_domain(currd) )
 return -ENOSYS;
 /* fall through */
 case PHYSDEVOP_map_pirq:
@@ -4197,7 +4199,9 @@ static long hvm_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
 case PHYSDEVOP_eoi:
 case PHYSDEVOP_irq_status_query:
 case PHYSDEVOP_get_free_pirq:
-return do_physdev_op(cmd, arg);
+return ((currd->arch.emulation_flags & XEN_X86_EMU_USE_PIRQ) ||
+   is_pvh_domain(currd)) ?
+do_physdev_op(cmd, arg) : -ENOSYS;
 }
 }
 
@@ -4230,17 +4234,20 @@ static long hvm_memory_op_compat32(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
 static long hvm_physdev_op_compat32(
 int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
 {
+struct domain *d = current->domain;
+
 switch ( cmd )
 {
-case PHYSDEVOP_map_pirq:
-case PHYSDEVOP_unmap_pirq:
-case PHYSDEVOP_eoi:
-case PHYSDEVOP_irq_status_query:
-case PHYSDEVOP_get_free_pirq:
-return compat_physdev_op(cmd, arg);
+case PHYSDEVOP_map_pirq:
+case PHYSDEVOP_unmap_pirq:
+case PHYSDEVOP_eoi:
+case PHYSDEVOP_irq_status_query:
+case PHYSDEVOP_get_free_pirq:
+return (d->arch.emulation_flags & XEN_X86_EMU_USE_PIRQ) ?
+compat_physdev_op(cmd, arg) : -ENOSYS;
 break;
 default:
-return -ENOSYS;
+return -ENOSYS;
 break;
 }
 }
diff --git a/xen/arch/x86/physdev.c b/xen/arch/x86/physdev.c
index 5a49796..0bea6e1 100644
--- a/xen/arch/x86/physdev.c
+++ b/xen/arch/x86/physdev.c
@@ -94,7 +94,8 @@ int physdev_map_pirq(domid_t domid, int type, int *index, int *pirq_p,
 int pirq, irq, ret = 0;
 void *map_data = NULL;
 
-if ( domid == DOMID_SELF && is_hvm_domain(d) )
+if ( domid == DOMID_SELF && is_hvm_domain(d) &&
+ (d->arch.emulation_flags & XEN_X86_EMU_USE_PIRQ) )
 {
 /*
  * Only makes sense for vector-based callback, else HVM-IRQ logic
@@ -265,7 +266,7 @@ int physdev_unmap_pirq(domid_t domid, int pirq)
 if ( ret )
 goto free_domain;
 
-if ( is_hvm_domain(d) )
+if ( is_hvm_domain(d) && (d->arch.emulation_flags & XEN_X86_EMU_USE_PIRQ) 

[Xen-devel] [PATCH v4 03/14] xen/x86: allow calling {shadow/hap}_set_allocation with the idle domain

2016-11-30 Thread Roger Pau Monne
... and using the "preempted" parameter. Introduce a new helper that can
be used from either hypercall or idle vcpu context (i.e. during Dom0
creation) in order to check if preemption is needed. If such preemption
happens, the caller should then call process_pending_softirqs in order to
drain the pending softirqs, and then call *_set_allocation again to continue
with its execution.

This allows us to call *_set_allocation() when building domain 0.

While there, also document hypercall_preempt_check and add an assert to
local_events_need_delivery to make sure it's not called by the idle
domain, which doesn't receive any events (which in turn means that
hypercall_preempt_check must not be called by the idle domain either).

Signed-off-by: Roger Pau Monné 
Acked-by: George Dunlap 
---
Cc: George Dunlap 
Cc: Jan Beulich 
Cc: Andrew Cooper 
Cc: Tim Deegan 
---
Changes since v3:
 - Add general_preempt_check as a macro that can be called from both
   hypercall context and idle vcpu context.
 - Document hypercall_preempt_check.
 - Add assert to local_events_need_delivery in order to be sure it's not
   called by the idle domain.

Changes since v2:
 - Fix commit message.
---
 xen/arch/x86/mm/hap/hap.c   |  2 +-
 xen/arch/x86/mm/shadow/common.c |  2 +-
 xen/include/asm-x86/event.h |  3 +++
 xen/include/xen/sched.h | 15 +++
 4 files changed, 20 insertions(+), 2 deletions(-)

diff --git a/xen/arch/x86/mm/hap/hap.c b/xen/arch/x86/mm/hap/hap.c
index f099e94..b9faba6 100644
--- a/xen/arch/x86/mm/hap/hap.c
+++ b/xen/arch/x86/mm/hap/hap.c
@@ -379,7 +379,7 @@ hap_set_allocation(struct domain *d, unsigned int pages, int *preempted)
 break;
 
 /* Check to see if we need to yield and try again */
-if ( preempted && hypercall_preempt_check() )
+if ( preempted && general_preempt_check() )
 {
 *preempted = 1;
 return 0;
diff --git a/xen/arch/x86/mm/shadow/common.c b/xen/arch/x86/mm/shadow/common.c
index 756c276..ddbdb73 100644
--- a/xen/arch/x86/mm/shadow/common.c
+++ b/xen/arch/x86/mm/shadow/common.c
@@ -1681,7 +1681,7 @@ static int sh_set_allocation(struct domain *d,
 break;
 
 /* Check to see if we need to yield and try again */
-if ( preempted && hypercall_preempt_check() )
+if ( preempted && general_preempt_check() )
 {
 *preempted = 1;
 return 0;
diff --git a/xen/include/asm-x86/event.h b/xen/include/asm-x86/event.h
index a82062e..d589d6f 100644
--- a/xen/include/asm-x86/event.h
+++ b/xen/include/asm-x86/event.h
@@ -23,6 +23,9 @@ int hvm_local_events_need_delivery(struct vcpu *v);
 static inline int local_events_need_delivery(void)
 {
 struct vcpu *v = current;
+
+ASSERT(!is_idle_vcpu(v));
+
 return (has_hvm_container_vcpu(v) ? hvm_local_events_need_delivery(v) :
 (vcpu_info(v, evtchn_upcall_pending) &&
  !vcpu_info(v, evtchn_upcall_mask)));
diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index 1fbda87..063efe6 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -708,11 +708,26 @@ unsigned long hypercall_create_continuation(
 unsigned int op, const char *format, ...);
 void hypercall_cancel_continuation(void);
 
+/*
+ * For long-running operations that must be in hypercall context, check
+ * if there is background work to be done that should interrupt this
+ * operation.
+ */
 #define hypercall_preempt_check() (unlikely(\
 softirq_pending(smp_processor_id()) |   \
 local_events_need_delivery()\
 ))
 
+/*
+ * For long-running operations that may be in hypercall context or on
+ * the idle vcpu (e.g. during dom0 construction), check if there is
+ * background work to be done that should interrupt this operation.
+ */
+#define general_preempt_check() (unlikely(  \
+softirq_pending(smp_processor_id()) ||  \
+(!is_idle_vcpu(current) && local_events_need_delivery())\
+))
+
 extern struct domain *domain_list;
 
 /* Caller must hold the domlist_read_lock or domlist_update_lock. */
-- 
2.9.3 (Apple Git-75)




[Xen-devel] [PATCH v4 07/14] x86/iommu: add IOMMU entries for p2m_mmio_direct pages

2016-11-30 Thread Roger Pau Monne
There's nothing wrong with allowing the domain to perform DMA transfers to
MMIO areas that it already can access from the CPU, and this allows us to
remove the hack in set_identity_p2m_entry for PVH Dom0.

Signed-off-by: Roger Pau Monné 
---
 xen/arch/x86/mm/p2m.c | 9 -
 xen/include/asm-x86/p2m.h | 1 +
 2 files changed, 1 insertion(+), 9 deletions(-)

diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
index 6a45185..7e33ab6 100644
--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -1053,16 +1053,7 @@ int set_identity_p2m_entry(struct domain *d, unsigned long gfn,
 ret = p2m_set_entry(p2m, gfn, _mfn(gfn), PAGE_ORDER_4K,
 p2m_mmio_direct, p2ma);
 else if ( mfn_x(mfn) == gfn && p2mt == p2m_mmio_direct && a == p2ma )
-{
 ret = 0;
-/*
- * PVH fixme: during Dom0 PVH construction, p2m entries are being set
- * but iomem regions are not mapped with IOMMU. This makes sure that
- * RMRRs are correctly mapped with IOMMU.
- */
-if ( is_hardware_domain(d) && !iommu_use_hap_pt(d) )
-ret = iommu_map_page(d, gfn, gfn, IOMMUF_readable|IOMMUF_writable);
-}
 else
 {
 if ( flag & XEN_DOMCTL_DEV_RDM_RELAXED )
diff --git a/xen/include/asm-x86/p2m.h b/xen/include/asm-x86/p2m.h
index 7035860..b562da3 100644
--- a/xen/include/asm-x86/p2m.h
+++ b/xen/include/asm-x86/p2m.h
@@ -834,6 +834,7 @@ static inline unsigned int p2m_get_iommu_flags(p2m_type_t p2mt)
 case p2m_grant_map_rw:
 case p2m_ram_logdirty:
 case p2m_map_foreign:
+case p2m_mmio_direct:
 flags =  IOMMUF_readable | IOMMUF_writable;
 break;
 case p2m_ram_ro:
-- 
2.9.3 (Apple Git-75)




[Xen-devel] [PATCH v4 12/14] x86/PVHv2: fix dom0_max_vcpus so it's capped to 128 for PVHv2 Dom0

2016-11-30 Thread Roger Pau Monne
PVHv2 Dom0 is limited to 128 vCPUs, as are all HVM guests at the moment. Fix
dom0_max_vcpus so it takes this limitation into account by poking at the
dom0_hvm variable.
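
The capping logic amounts to applying two limits in sequence. A minimal sketch, where the 128 limit comes from the description above and the function name and the generic limit passed by the caller are assumptions for the example:

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_HVM_MAX_VCPUS 128u   /* limit stated in the commit message */

/*
 * Clamp the requested vCPU count to the generic limit first, then to the
 * HVM limit when Dom0 is going to be a PVHv2 (HVM container) domain.
 */
static unsigned int cap_dom0_vcpus(unsigned int requested,
                                   unsigned int max_virt_cpus,
                                   bool dom0_hvm)
{
    unsigned int max_vcpus = requested;

    assert(max_virt_cpus > 0);

    if ( max_vcpus > max_virt_cpus )
        max_vcpus = max_virt_cpus;
    if ( dom0_hvm && max_vcpus > SKETCH_HVM_MAX_VCPUS )
        max_vcpus = SKETCH_HVM_MAX_VCPUS;

    return max_vcpus;
}
```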

Signed-off-by: Roger Pau Monné 
---
Cc: Jan Beulich 
Cc: Andrew Cooper 
---
Changes since v3:
 - New in the series.
---
 xen/arch/x86/domain_build.c | 3 +++
 xen/arch/x86/setup.c| 2 +-
 xen/include/asm-x86/setup.h | 2 ++
 3 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/xen/arch/x86/domain_build.c b/xen/arch/x86/domain_build.c
index e40fb94..7e22ba3 100644
--- a/xen/arch/x86/domain_build.c
+++ b/xen/arch/x86/domain_build.c
@@ -40,6 +40,7 @@
 
 #include 
 #include 
+#include 
 
 static long __initdata dom0_nrpages;
 static long __initdata dom0_min_nrpages;
@@ -176,6 +177,8 @@ unsigned int __init dom0_max_vcpus(void)
 max_vcpus = opt_dom0_max_vcpus_max;
 if ( max_vcpus > MAX_VIRT_CPUS )
 max_vcpus = MAX_VIRT_CPUS;
+if ( dom0_hvm )
+max_vcpus = min_t(typeof(max_vcpus), max_vcpus, HVM_MAX_VCPUS);
 
 return max_vcpus;
 }
diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
index 255e20c..737f2ca 100644
--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -193,7 +193,7 @@ static void __init parse_acpi_param(char *s)
  *  - hvm   Create a PVHv2 Dom0.
  *  - shadowUse shadow paging for Dom0.
  */
-static bool __initdata dom0_hvm;
+bool __initdata dom0_hvm;
 static void __init parse_dom0_param(char *s)
 {
 char *ss;
diff --git a/xen/include/asm-x86/setup.h b/xen/include/asm-x86/setup.h
index c4179d1..3c9389e 100644
--- a/xen/include/asm-x86/setup.h
+++ b/xen/include/asm-x86/setup.h
@@ -63,4 +63,6 @@ extern bool opt_dom0_shadow;
 #define opt_dom0_shadow 0
 #endif
 
+extern bool dom0_hvm;
+
 #endif
-- 
2.9.3 (Apple Git-75)




[Xen-devel] [PATCH v4 10/14] xen/x86: populate PVHv2 Dom0 physical memory map

2016-11-30 Thread Roger Pau Monne
Craft the Dom0 e820 memory map and populate it. Introduce a helper to remove
memory pages that are shared between Xen and a domain, and use it in order to
remove low 1MB RAM regions from dom_io in order to assign them to a PVHv2 Dom0.

Signed-off-by: Roger Pau Monné 
---
Cc: Jan Beulich 
Cc: Andrew Cooper 
---
Changes since v3:
 - Drop get_order_from_bytes_floor, it was only used by
   hvm_populate_memory_range.
 - Switch hvm_populate_memory_range to use frame numbers instead of full memory
   addresses.
 - Add a helper to steal the low 1MB RAM areas from dom_io and add them to Dom0
   as normal RAM.
 - Introduce unshare_xen_page_with_guest in order to remove pages from dom_io,
   so they can be assigned to other domains. This is needed in order to remove
   the low 1MB RAM regions from dom_io and assign them to the hardware_domain.
 - Simplify the loop in hvm_steal_ram.
 - Move definition of map_identity_mmio into this patch.

Changes since v2:
 - Introduce get_order_from_bytes_floor as a local function to
   domain_build.c.
 - Remove extra asserts.
 - Make hvm_populate_memory_range return an error code instead of panicking.
 - Fix comments and printks.
 - Use ULL suffix instead of casting to uint64_t.
 - Rename hvm_setup_vmx_unrestricted_guest to
   hvm_setup_vmx_realmode_helpers.
 - Only subtract two pages from the memory calculation, that will be used
   by the MADT replacement.
 - Remove some comments.
 - Remove printing allocation information.
 - Don't stash any pages for the MADT, TSS or ident PT, those will be
   subtracted directly from RAM regions of the memory map.
 - Count the number of iterations before calling process_pending_softirqs
   when populating the memory map.
 - Move the initial call to process_pending_softirqs into construct_dom0,
   and remove the ones from construct_dom0_hvm and construct_dom0_pv.
 - Make memflags global so it can be shared between alloc_chunk and
   hvm_populate_memory_range.

Changes since RFC:
 - Use IS_ALIGNED instead of checking with PAGE_MASK.
 - Use the new %pB specifier in order to print sizes in human readable form.
 - Create a VM86 TSS for hardware that doesn't support unrestricted mode.
 - Subtract guest RAM for the identity page table and the VM86 TSS.
 - Split the creation of the unrestricted mode helper structures to a
   separate function.
 - Use preemption with paging_set_allocation.
 - Use get_order_from_bytes_floor.
---
 xen/arch/x86/domain_build.c | 310 ++--
 xen/arch/x86/mm.c   |  37 ++
 xen/include/asm-x86/mm.h|   2 +
 3 files changed, 340 insertions(+), 9 deletions(-)

diff --git a/xen/arch/x86/domain_build.c b/xen/arch/x86/domain_build.c
index 2c9ebf2..8602566 100644
--- a/xen/arch/x86/domain_build.c
+++ b/xen/arch/x86/domain_build.c
@@ -22,6 +22,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -43,6 +44,9 @@ static long __initdata dom0_nrpages;
 static long __initdata dom0_min_nrpages;
 static long __initdata dom0_max_nrpages = LONG_MAX;
 
+/* Size of the VM86 TSS for virtual 8086 mode to use. */
+#define HVM_VM86_TSS_SIZE   128
+
 /*
  * dom0_mem=[min:,][max:,][]
  * 
@@ -213,11 +217,12 @@ boolean_param("ro-hpet", ro_hpet);
 #define round_pgup(_p)    (((_p)+(PAGE_SIZE-1))&PAGE_MASK)
 #define round_pgdown(_p)  ((_p)&PAGE_MASK)
 
+static unsigned int __initdata memflags = MEMF_no_dma|MEMF_exact_node;
+
 static struct page_info * __init alloc_chunk(
 struct domain *d, unsigned long max_pages)
 {
 static unsigned int __initdata last_order = MAX_ORDER;
-static unsigned int __initdata memflags = MEMF_no_dma|MEMF_exact_node;
 struct page_info *page;
 unsigned int order = get_order_from_pages(max_pages), free_order;
 
@@ -302,7 +307,8 @@ static unsigned long __init compute_dom0_nr_pages(
 avail -= max_pdx >> s;
 }
 
-need_paging = opt_dom0_shadow || (is_pvh_domain(d) && !iommu_hap_pt_share);
+need_paging = opt_dom0_shadow || (has_hvm_container_domain(d) &&
+  (!iommu_hap_pt_share || !paging_mode_hap(d)));
 for ( ; ; need_paging = 0 )
 {
 nr_pages = dom0_nrpages;
@@ -334,7 +340,8 @@ static unsigned long __init compute_dom0_nr_pages(
 avail -= dom0_paging_pages(d, nr_pages);
 }
 
-if ( (parms->p2m_base == UNSET_ADDR) && (dom0_nrpages <= 0) &&
+if ( is_pv_domain(d) &&
+ (parms->p2m_base == UNSET_ADDR) && (dom0_nrpages <= 0) &&
  ((dom0_min_nrpages <= 0) || (nr_pages > min_pages)) )
 {
 /*
@@ -545,11 +552,12 @@ static __init void pvh_map_all_iomem(struct domain *d, unsigned long nr_pages)
 ASSERT(nr_holes == 0);
 }
 
-static __init void pvh_setup_e820(struct domain *d, unsigned long nr_pages)
+static __init void hvm_setup_e820(struct domain *d, unsigned long nr_pages)
 {
 struct e820entry *entry, *entry_guest;
 unsigned int i;
 unsigned long pages, cur_pages = 0;

[Xen-devel] [PATCH v4 04/14] x86/paging: introduce paging_set_allocation

2016-11-30 Thread Roger Pau Monne
... and remove hap_set_alloc_for_pvh_dom0. While there also change the last
parameter of the {hap/shadow}_set_allocation functions to be a boolean.

Signed-off-by: Roger Pau Monné 
Acked-by: Tim Deegan 
Acked-by: George Dunlap 
Reviewed-by: Jan Beulich 
---
Cc: Jan Beulich 
Cc: Andrew Cooper 
Cc: George Dunlap 
Cc: Tim Deegan 
---
Changes since v3:
 - Rename sh_set_allocation to shadow_set_allocation (public shadow
   functions use the shadow prefix instead of sh).

Changes since v2:
 - Convert the preempt parameter into a bool.
 - Fix Dom0 builder comment to reflect that paging.mode should be correct
   before calling paging_set_allocation.

Changes since RFC:
 - Make paging_set_allocation preemptible.
 - Move comments.
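
The retry pattern that this patch adds to construct_dom0 can be sketched in
isolation. This is only an illustration under stated assumptions: struct pool,
BATCH, set_allocation and fill_pool are hypothetical names, and a fixed batch
size stands in for Xen's general_preempt_check().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a paging pool; not Xen's real data structure. */
struct pool {
    unsigned int total_pages;
};

/* Pages adjusted per call before the sketch pretends a preemption is due. */
#define BATCH 64u

/*
 * Move the pool size towards 'pages'. If 'preempted' is non-NULL and the
 * work is not finished after one batch, set *preempted and return 0 so the
 * caller can retry. Returns 0 for success, non-zero for failure.
 */
static int set_allocation(struct pool *p, unsigned int pages, bool *preempted)
{
    unsigned int done = 0;

    while ( p->total_pages != pages )
    {
        if ( preempted && done == BATCH )
        {
            *preempted = true;
            return 0;
        }
        if ( p->total_pages < pages )
            p->total_pages++;
        else
            p->total_pages--;
        done++;
    }

    return 0;
}

/* Caller-side loop with the same shape as the one added to construct_dom0. */
static unsigned int fill_pool(struct pool *p, unsigned int pages)
{
    unsigned int rounds = 0;
    bool preempted;

    do {
        preempted = false;
        (void)set_allocation(p, pages, &preempted);
        /* In Xen, process_pending_softirqs() would run at this point. */
        rounds++;
    } while ( preempted );

    assert(p->total_pages == pages);
    return rounds;
}
```

Resetting the flag at the top of every iteration matters: the callee only ever
sets it, so a stale true value would make the loop spin forever.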
---
 xen/arch/x86/domain_build.c | 21 +++--
 xen/arch/x86/mm/hap/hap.c   | 22 +-
 xen/arch/x86/mm/paging.c| 19 ++-
 xen/arch/x86/mm/shadow/common.c | 31 +--
 xen/include/asm-x86/hap.h   |  4 ++--
 xen/include/asm-x86/paging.h|  7 +++
 xen/include/asm-x86/shadow.h| 11 ++-
 7 files changed, 70 insertions(+), 45 deletions(-)

diff --git a/xen/arch/x86/domain_build.c b/xen/arch/x86/domain_build.c
index 0a02d65..17f8e91 100644
--- a/xen/arch/x86/domain_build.c
+++ b/xen/arch/x86/domain_build.c
@@ -35,7 +35,6 @@
 #include 
 #include  /* for bzimage_parse */
 #include 
-#include 
 #include 
 
 #include 
@@ -1383,15 +1382,25 @@ int __init construct_dom0(
  nr_pages);
 }
 
-if ( is_pvh_domain(d) )
-hap_set_alloc_for_pvh_dom0(d, dom0_paging_pages(d, nr_pages));
-
 /*
- * We enable paging mode again so guest_physmap_add_page will do the
- * right thing for us.
+ * We enable paging mode again so guest_physmap_add_page and
+ * paging_set_allocation will do the right thing for us.
  */
 d->arch.paging.mode = save_pvh_pg_mode;
 
+if ( is_pvh_domain(d) )
+{
+bool preempted;
+
+do {
+preempted = false;
+paging_set_allocation(d, dom0_paging_pages(d, nr_pages),
+  &preempted);
+process_pending_softirqs();
+} while ( preempted );
+}
+
+
 /* Write the phys->machine and machine->phys table entries. */
 for ( pfn = 0; pfn < count; pfn++ )
 {
diff --git a/xen/arch/x86/mm/hap/hap.c b/xen/arch/x86/mm/hap/hap.c
index b9faba6..e6dc088 100644
--- a/xen/arch/x86/mm/hap/hap.c
+++ b/xen/arch/x86/mm/hap/hap.c
@@ -334,8 +334,7 @@ hap_get_allocation(struct domain *d)
 
 /* Set the pool of pages to the required number of pages.
  * Returns 0 for success, non-zero for failure. */
-static int
-hap_set_allocation(struct domain *d, unsigned int pages, int *preempted)
+int hap_set_allocation(struct domain *d, unsigned int pages, bool *preempted)
 {
 struct page_info *pg;
 
@@ -381,7 +380,7 @@ hap_set_allocation(struct domain *d, unsigned int pages, int *preempted)
 /* Check to see if we need to yield and try again */
 if ( preempted && general_preempt_check() )
 {
-*preempted = 1;
+*preempted = true;
 return 0;
 }
 }
@@ -561,7 +560,7 @@ void hap_final_teardown(struct domain *d)
 paging_unlock(d);
 }
 
-void hap_teardown(struct domain *d, int *preempted)
+void hap_teardown(struct domain *d, bool *preempted)
 {
 struct vcpu *v;
 mfn_t mfn;
@@ -609,7 +608,8 @@ out:
 int hap_domctl(struct domain *d, xen_domctl_shadow_op_t *sc,
XEN_GUEST_HANDLE_PARAM(void) u_domctl)
 {
-int rc, preempted = 0;
+int rc;
+bool preempted = false;
 
 switch ( sc->op )
 {
@@ -636,18 +636,6 @@ int hap_domctl(struct domain *d, xen_domctl_shadow_op_t *sc,
 }
 }
 
-void __init hap_set_alloc_for_pvh_dom0(struct domain *d,
-   unsigned long hap_pages)
-{
-int rc;
-
-paging_lock(d);
-rc = hap_set_allocation(d, hap_pages, NULL);
-paging_unlock(d);
-
-BUG_ON(rc);
-}
-
 static const struct paging_mode hap_paging_real_mode;
 static const struct paging_mode hap_paging_protected_mode;
 static const struct paging_mode hap_paging_pae_mode;
diff --git a/xen/arch/x86/mm/paging.c b/xen/arch/x86/mm/paging.c
index cc44682..853a035 100644
--- a/xen/arch/x86/mm/paging.c
+++ b/xen/arch/x86/mm/paging.c
@@ -809,7 +809,8 @@ long paging_domctl_continuation(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl)
 /* Call when destroying a domain */
 int paging_teardown(struct domain *d)
 {
-int rc, preempted = 0;
+int rc;
+bool preempted = false;
 
 if ( hap_enabled(d) )
 hap_teardown(d, &preempted);
@@ -954,6 +955,22 @@ void paging_write_p2m_entry(struct p2m_domain *p2m, unsigned long gfn,
 safe_write_pte(p, new);
 }
 
+int 

[Xen-devel] [PATCH v4 08/14] xen/x86: allow the emulated APICs to be enabled for the hardware domain

2016-11-30 Thread Roger Pau Monne
Allow the use of both the emulated local APIC and IO APIC for the hardware
domain.

Signed-off-by: Roger Pau Monné 
---
Cc: Jan Beulich 
Cc: Andrew Cooper 
---
Changes since v3:
 - Don't enable the emulated PIT for PVHv2 Dom0.

Changes since v2:
 - Allow all PV guests to use the emulated PIT.

Changes since RFC:
 - Move the emulation flags check to a separate helper.
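
The policy encoded by the new helper can be restated as a standalone
predicate. This is a sketch only: the bit values below are illustrative (only
the flag names come from the patch), EMU_OTHER is a hypothetical placeholder
for every remaining emulated device, and two booleans replace the
struct domain queries.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit values; only the names mirror the patch. */
#define EMU_LAPIC   (1u << 0)
#define EMU_IOAPIC  (1u << 1)
#define EMU_PIT     (1u << 2)
#define EMU_OTHER   (1u << 3)   /* placeholder for all other devices */
#define EMU_ALL     (EMU_LAPIC | EMU_IOAPIC | EMU_PIT | EMU_OTHER)

static bool emulation_flags_ok(bool is_hvm, bool is_hwdom, uint32_t emflags)
{
    if ( is_hvm )
    {
        /* HVM hardware domain: exactly local APIC plus IO APIC. */
        if ( is_hwdom )
            return emflags == (EMU_LAPIC | EMU_IOAPIC);

        /* Other HVM guests: nothing, everything, or local APIC only. */
        return emflags == 0 || emflags == EMU_ALL || emflags == EMU_LAPIC;
    }

    /* PV or classic PVH: nothing, or the PIT only. */
    return emflags == 0 || emflags == EMU_PIT;
}

/* A few combinations and the verdict the policy gives for each. */
static void example(void)
{
    assert(emulation_flags_ok(true, true, EMU_LAPIC | EMU_IOAPIC));
    assert(!emulation_flags_ok(true, true, EMU_LAPIC));
    assert(emulation_flags_ok(true, false, EMU_LAPIC));
    assert(!emulation_flags_ok(true, false, EMU_LAPIC | EMU_IOAPIC));
    assert(emulation_flags_ok(false, true, EMU_PIT));
    assert(!emulation_flags_ok(false, false, EMU_LAPIC));
}
```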
---
 xen/arch/x86/domain.c | 29 +++--
 1 file changed, 23 insertions(+), 6 deletions(-)

diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index eae643f..2c7a528 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -509,6 +509,27 @@ void vcpu_destroy(struct vcpu *v)
 xfree(v->arch.pv_vcpu.trap_ctxt);
 }
 
+static bool emulation_flags_ok(const struct domain *d, uint32_t emflags)
+{
+
+if ( is_hvm_domain(d) )
+{
+if ( is_hardware_domain(d) &&
+ emflags != (XEN_X86_EMU_LAPIC|XEN_X86_EMU_IOAPIC) )
+return false;
+if ( !is_hardware_domain(d) && emflags &&
+ emflags != XEN_X86_EMU_ALL && emflags != XEN_X86_EMU_LAPIC )
+return false;
+}
+else if ( emflags != 0 && emflags != XEN_X86_EMU_PIT )
+{
+/* PV or classic PVH. */
+return false;
+}
+
+return true;
+}
+
 int arch_domain_create(struct domain *d, unsigned int domcr_flags,
struct xen_arch_domainconfig *config)
 {
@@ -547,7 +568,7 @@ int arch_domain_create(struct domain *d, unsigned int domcr_flags,
 {
 uint32_t emflags;
 
-if ( is_hardware_domain(d) )
+if ( is_hardware_domain(d) && is_pv_domain(d) )
 config->emulation_flags |= XEN_X86_EMU_PIT;
 
 emflags = config->emulation_flags;
@@ -558,11 +579,7 @@ int arch_domain_create(struct domain *d, unsigned int domcr_flags,
 return -EINVAL;
 }
 
-/* PVHv2 guests can request emulated APIC. */
-if ( emflags &&
-(is_hvm_domain(d) ? ((emflags != XEN_X86_EMU_ALL) &&
- (emflags != XEN_X86_EMU_LAPIC)) :
-(emflags != XEN_X86_EMU_PIT)) )
+if ( !emulation_flags_ok(d, emflags) )
 {
 printk(XENLOG_G_ERR "d%d: Xen does not allow %s domain creation "
"with the current selection of emulators: %#x\n",
-- 
2.9.3 (Apple Git-75)


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [PATCH v4 02/14] xen/x86: fix return value of *_set_allocation functions

2016-11-30 Thread Roger Pau Monne
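The *_set_allocation functions return 0 or a negative errno value, so their
return type should be int rather than unsigned int.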

Signed-off-by: Roger Pau Monné 
Acked-by: George Dunlap 
Acked-by: Tim Deegan 
---
Cc: George Dunlap 
Cc: Jan Beulich 
Cc: Andrew Cooper 
Cc: Tim Deegan 
---
Changes since v2:
 - Also fix the callers to treat the return value as an int.
 - Don't convert the pages parameter to unsigned long.
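
A small sketch of why the return type matters, assuming a hypothetical
allocator with a fixed page budget (POOL_BUDGET and all three function names
are made up for this example and are not Xen code):

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical allocator that fails once 'pages' exceeds a fixed budget. */
#define POOL_BUDGET 512u

/* Old shape: the error code is squeezed through an unsigned return type. */
static unsigned int set_allocation_unsigned(unsigned int pages)
{
    return pages > POOL_BUDGET ? -ENOMEM : 0;
}

/* New shape: a plain int carries the negative errno value intact. */
static int set_allocation_int(unsigned int pages)
{
    return pages > POOL_BUDGET ? -ENOMEM : 0;
}

/* Callers can forward the callee's error instead of hard-coding one. */
static int enable_paging(unsigned int pages)
{
    int rv = set_allocation_int(pages);

    if ( rv != 0 )
        return rv;      /* propagates -ENOMEM or any other errno value */

    return 0;
}

static void example(void)
{
    /* The unsigned variant turns -ENOMEM into a large positive number. */
    assert(set_allocation_unsigned(1024) == (unsigned int)-ENOMEM);
    assert(set_allocation_unsigned(1024) > 0);

    /* The int variant keeps the sign, so the usual "< 0" test works. */
    assert(set_allocation_int(1024) < 0);
    assert(enable_paging(1024) == -ENOMEM);
    assert(enable_paging(256) == 0);
}
```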
---
 xen/arch/x86/mm/hap/hap.c   |  8 +++-
 xen/arch/x86/mm/shadow/common.c | 12 +---
 2 files changed, 8 insertions(+), 12 deletions(-)

diff --git a/xen/arch/x86/mm/hap/hap.c b/xen/arch/x86/mm/hap/hap.c
index 3218fa2..f099e94 100644
--- a/xen/arch/x86/mm/hap/hap.c
+++ b/xen/arch/x86/mm/hap/hap.c
@@ -334,7 +334,7 @@ hap_get_allocation(struct domain *d)
 
 /* Set the pool of pages to the required number of pages.
  * Returns 0 for success, non-zero for failure. */
-static unsigned int
+static int
 hap_set_allocation(struct domain *d, unsigned int pages, int *preempted)
 {
 struct page_info *pg;
@@ -468,14 +468,12 @@ int hap_enable(struct domain *d, u32 mode)
 old_pages = d->arch.paging.hap.total_pages;
 if ( old_pages == 0 )
 {
-unsigned int r;
 paging_lock(d);
-r = hap_set_allocation(d, 256, NULL);
-if ( r != 0 )
+rv = hap_set_allocation(d, 256, NULL);
+if ( rv != 0 )
 {
 hap_set_allocation(d, 0, NULL);
 paging_unlock(d);
-rv = -ENOMEM;
 goto out;
 }
 paging_unlock(d);
diff --git a/xen/arch/x86/mm/shadow/common.c b/xen/arch/x86/mm/shadow/common.c
index ced2313..756c276 100644
--- a/xen/arch/x86/mm/shadow/common.c
+++ b/xen/arch/x86/mm/shadow/common.c
@@ -1615,9 +1615,9 @@ shadow_free_p2m_page(struct domain *d, struct page_info *pg)
  * Input will be rounded up to at least shadow_min_acceptable_pages(),
  * plus space for the p2m table.
  * Returns 0 for success, non-zero for failure. */
-static unsigned int sh_set_allocation(struct domain *d,
-  unsigned int pages,
-  int *preempted)
+static int sh_set_allocation(struct domain *d,
+ unsigned int pages,
+ int *preempted)
 {
 struct page_info *sp;
 unsigned int lower_bound;
@@ -3153,13 +3153,11 @@ int shadow_enable(struct domain *d, u32 mode)
 old_pages = d->arch.paging.shadow.total_pages;
 if ( old_pages == 0 )
 {
-unsigned int r;
 paging_lock(d);
-r = sh_set_allocation(d, 1024, NULL); /* Use at least 4MB */
-if ( r != 0 )
+rv = sh_set_allocation(d, 1024, NULL); /* Use at least 4MB */
+if ( rv != 0 )
 {
 sh_set_allocation(d, 0, NULL);
-rv = -ENOMEM;
 goto out_locked;
 }
 paging_unlock(d);
-- 
2.9.3 (Apple Git-75)


[Xen-devel] [PATCH v4 06/14] x86/vtd: refuse to enable IOMMU if the PCI scan fails

2016-11-30 Thread Roger Pau Monne
This provides uniform behavior between Intel and AMD IOMMU initialization, and
is a requirement for PVHv2 Dom0, which depends on a working IOMMU and on the
PCI bus being scanned for devices.

Signed-off-by: Roger Pau Monné 
---
Cc: Jan Beulich 
Cc: Andrew Cooper 
Cc: Kevin Tian 
Cc: Feng Wu 
---
Changes since v3:
 - New in this revision.
---
 xen/drivers/passthrough/vtd/iommu.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/xen/drivers/passthrough/vtd/iommu.c b/xen/drivers/passthrough/vtd/iommu.c
index 48f120b..78b5a6a 100644
--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -2299,7 +2299,9 @@ int __init intel_vtd_setup(void)
 P(iommu_hap_pt_share, "Shared EPT tables");
 #undef P
 
-scan_pci_devices();
+ret = scan_pci_devices();
+if ( ret )
+goto error;
 
 ret = init_vtd_hw();
 if ( ret )
-- 
2.9.3 (Apple Git-75)


[Xen-devel] [PATCH v4 14/14] xen/x86: setup PVHv2 Dom0 ACPI tables

2016-11-30 Thread Roger Pau Monne
Create a new MADT table that contains the topology exposed to the guest. A
new XSDT table is also created, in order to filter the tables that we want
to expose to the guest, plus the Xen crafted MADT. This in turn requires Xen
to also create a new RSDP in order to make it point to the custom XSDT.

Also, regions marked as E820_ACPI or E820_NVS are identity mapped into Dom0
p2m, plus any top-level ACPI tables that should be accessible to Dom0 and
reside in reserved regions. This is needed because some memory maps don't
properly account for all the memory used by ACPI, so it's common to find ACPI
tables in reserved regions.

Signed-off-by: Roger Pau Monné 
---
Cc: Jan Beulich 
Cc: Andrew Cooper 
---
Changes since v3:
 - Use hvm_copy_to_phys in order to copy the tables to Dom0 memory.
 - Return EEXIST for overlapping ranges in hvm_add_mem_range.
 - s/ov/ovr/ for interrupt override parsing functions.
 - Constify intr local variable in acpi_set_intr_ovr.
 - Use structure assignment for type safety.
 - Perform sizeof using local variables in hvm_setup_acpi_madt.
 - Manually set revision of crafted/modified tables.
 - Only map tables to guest that reside in reserved or ACPI memory regions.
 - Copy the RSDP OEM signature to the crafted RSDP.
 - Pair calls to acpi_os_map_memory/acpi_os_unmap_memory.
 - Add memory regions for allowed ACPI tables to the memory map and then
   perform the identity mappings. This avoids having to call
   modify_identity_mmio multiple times.
 - Add a FIXME comment regarding the lack of multiple vIO-APICs.

Changes since v2:
 - Completely reworked.
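
The sorted-insert logic of hvm_add_mem_range can be exercised outside Xen
with a sketch along these lines. It is a loose model, not the patch itself:
struct range, struct memmap, MAP_MAX and add_mem_range are hypothetical
names, and, unlike the hunk below, the downward merge here also grows the
region's size, which is an assumption about the intended behaviour.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

#define MAP_MAX 8   /* illustrative capacity, standing in for E820MAX */

struct range {
    uint64_t addr, size;
    unsigned int type;
};

struct memmap {
    struct range r[MAP_MAX];
    unsigned int nr;
};

/* Insert [s, e) into a map kept sorted by address, merging where possible. */
static int add_mem_range(struct memmap *m, uint64_t s, uint64_t e,
                         unsigned int type)
{
    unsigned int i;

    for ( i = 0; i < m->nr; i++ )
    {
        uint64_t rs = m->r[i].addr;
        uint64_t re = rs + m->r[i].size;

        if ( rs == e && m->r[i].type == type )
        {
            /* New range ends where this one starts: extend it downwards. */
            m->r[i].addr = s;
            m->r[i].size += e - s;
            return 0;
        }

        if ( re == s && m->r[i].type == type &&
             (i + 1 == m->nr || m->r[i + 1].addr >= e) )
        {
            /* New range starts where this one ends: extend it upwards. */
            m->r[i].size += e - s;
            return 0;
        }

        if ( rs >= e )
            break;              /* insertion point found */

        if ( re > s )
            return -EEXIST;     /* overlaps an existing entry */
    }

    if ( m->nr >= MAP_MAX )
        return -ENOMEM;

    /* Shift the tail up by one slot and fill in the new entry. */
    memmove(&m->r[i + 1], &m->r[i], (m->nr - i) * sizeof(m->r[0]));
    m->nr++;
    m->r[i].addr = s;
    m->r[i].size = e - s;
    m->r[i].type = type;

    return 0;
}

static void example(void)
{
    struct memmap m = { .nr = 0 };

    assert(add_mem_range(&m, 0x1000, 0x2000, 1) == 0);
    assert(add_mem_range(&m, 0x3000, 0x4000, 1) == 0);
    /* Different type: inserted between the two entries, no merge. */
    assert(add_mem_range(&m, 0x2000, 0x3000, 2) == 0);
    assert(m.nr == 3 && m.r[1].addr == 0x2000 && m.r[1].type == 2);
    /* Overlap with an existing entry is rejected. */
    assert(add_mem_range(&m, 0x1800, 0x2800, 1) == -EEXIST);
}
```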
---
 xen/arch/x86/domain_build.c | 429 +++-
 1 file changed, 428 insertions(+), 1 deletion(-)

diff --git a/xen/arch/x86/domain_build.c b/xen/arch/x86/domain_build.c
index 5c7592b..fc778b2 100644
--- a/xen/arch/x86/domain_build.c
+++ b/xen/arch/x86/domain_build.c
@@ -23,6 +23,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -38,6 +39,8 @@
 #include 
 #include 
 
+#include 
+
 #include 
 #include 
 #include 
@@ -50,6 +53,9 @@ static long __initdata dom0_max_nrpages = LONG_MAX;
 /* Size of the VM86 TSS for virtual 8086 mode to use. */
 #define HVM_VM86_TSS_SIZE   128
 
+static unsigned int __initdata acpi_intr_overrrides;
+static struct acpi_madt_interrupt_override __initdata *intsrcovr;
+
 /*
  * dom0_mem=[min:,][max:,][]
  * 
@@ -567,7 +573,7 @@ static __init void hvm_setup_e820(struct domain *d, unsigned long nr_pages)
 /*
  * Craft the e820 memory map for Dom0 based on the hardware e820 map.
  */
-d->arch.e820 = xzalloc_array(struct e820entry, e820.nr_map);
+d->arch.e820 = xzalloc_array(struct e820entry, E820MAX);
 if ( !d->arch.e820 )
 panic("Unable to allocate memory for Dom0 e820 map");
 entry_guest = d->arch.e820;
@@ -1779,6 +1785,55 @@ static int __init hvm_steal_ram(struct domain *d, unsigned long size,
 return -ENOMEM;
 }
 
+/* NB: memory map must be sorted at all times for this to work correctly. */
+static int __init hvm_add_mem_range(struct domain *d, uint64_t s, uint64_t e,
+unsigned int type)
+{
+unsigned int i;
+
+for ( i = 0; i < d->arch.nr_e820; i++ )
+{
+uint64_t rs = d->arch.e820[i].addr;
+uint64_t re = rs + d->arch.e820[i].size;
+
+if ( rs == e && d->arch.e820[i].type == type )
+{
+d->arch.e820[i].addr = s;
+return 0;
+}
+
+if ( re == s && d->arch.e820[i].type == type &&
+ (i + 1 == d->arch.nr_e820 || d->arch.e820[i + 1].addr >= e) )
+{
+d->arch.e820[i].size += e - s;
+return 0;
+}
+
+if ( rs >= e )
+break;
+
+if ( re > s )
+return -EEXIST;
+}
+
+if ( d->arch.nr_e820 >= E820MAX )
+{
+printk(XENLOG_WARNING "E820: overflow while adding region"
+   "[%"PRIx64", %"PRIx64")\n", s, e);
+return -ENOMEM;
+}
+
+memmove(d->arch.e820 + i + 1, d->arch.e820 + i,
+(d->arch.nr_e820 - i) * sizeof(*d->arch.e820));
+
+d->arch.nr_e820++;
+d->arch.e820[i].addr = s;
+d->arch.e820[i].size = e - s;
+d->arch.e820[i].type = type;
+
+return 0;
+}
+
 static int __init hvm_setup_vmx_realmode_helpers(struct domain *d)
 {
 p2m_type_t p2mt;
@@ -2157,6 +2212,371 @@ static int __init hvm_setup_cpus(struct domain *d, paddr_t entry,
 return 0;
 }
 
+static int __init acpi_count_intr_ovr(struct acpi_subtable_header *header,
+ const unsigned long end)
+{
+
+acpi_intr_overrrides++;
+return 0;
+}
+
+static int __init acpi_set_intr_ovr(struct acpi_subtable_header *header,
+const unsigned long end)
+{
+const struct acpi_madt_interrupt_override *intr =
+container_of(header, struct 

Re: [Xen-devel] [PATCH RFC 1/2] xen/page_alloc: Add size_align parameter to provide MFNs which are size aligned.

2016-11-30 Thread Jan Beulich
>>> On 30.11.16 at 17:42,  wrote:
> On Wed, Nov 30, 2016 at 02:30:41AM -0700, Jan Beulich wrote:
>> >>> On 30.11.16 at 05:39,  wrote:
>> > This is to support the requirement that exists in PV dom0
>> > when doing DMA requests:
>> > 
>> > "dma_alloc_coherent()
>> > [...]
>> > The CPU virtual address and the DMA address are both guaranteed to be
>> > aligned to the smallest PAGE_SIZE order which is greater than or equal
>> > to the requested size.  This invariant exists (for example) to guarantee
>> > that if you allocate a chunk which is smaller than or equal to 64
>> > kilobytes, the extent of the buffer you receive will not cross a 64K
>> > boundary."
>> 
>> So I'm having trouble understanding what it is that actually needs
>> fixing / changing here: Any order-N allocation will be order-N-aligned
>> already. Is your caller perhaps simply not passing in a large enough
>> order? And changing alloc_heap_pages(), which guarantees the
>> requested alignment already anyway (after all it takes an order
>> input, not a size one), looks completely pointless regardless of what
>> extra requirements you may want to put on the exchange hypercall.
> 
> The page_alloc.c code walks through different order pages. Which means
> that if it can't find one within the requested order pages it will
> go one up (and so on). Eventually that means you do get the requested
> order pages, but they are not guaranteed to be order aligned (as they
> may be order aligned to a higher value).

But that's _better_ alignment than you asked for then.

Jan
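
The point about alignment can be checked with a few lines of C. This is a
sketch assuming 4 KiB pages; order_from_bytes and mfn_order_aligned are
hypothetical helpers written for this example, not functions from
page_alloc.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT 12   /* 4 KiB pages, as on x86 */

/*
 * Smallest order such that (1 << order) pages cover 'bytes'.
 * Assumes 'bytes' is far below 2^63, so the shift cannot overflow.
 */
static unsigned int order_from_bytes(uint64_t bytes)
{
    unsigned int order = 0;

    while ( ((uint64_t)1 << (order + PAGE_SHIFT)) < bytes )
        order++;

    return order;
}

/* True if frame number 'mfn' sits on a (1 << order)-page boundary. */
static bool mfn_order_aligned(uint64_t mfn, unsigned int order)
{
    return (mfn & (((uint64_t)1 << order) - 1)) == 0;
}

static void example(void)
{
    unsigned int order;

    /* A 64 KiB buffer needs an order-4 allocation (16 pages). */
    assert(order_from_bytes(64 * 1024) == 4);

    /* A frame aligned to order 8 is aligned to every lower order too. */
    for ( order = 0; order <= 8; order++ )
        assert(mfn_order_aligned(0x100, order));

    /* The converse does not hold: order-3 aligned, but not order-4. */
    assert(mfn_order_aligned(0x18, 3) && !mfn_order_aligned(0x18, 4));
}
```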


Re: [Xen-devel] [PATCH v3 07/24] x86/emul: Clean up the naming of the retire union

2016-11-30 Thread Jan Beulich
>>> On 30.11.16 at 15:05,  wrote:
>> From: Andrew Cooper
>> Sent: 30 November 2016 14:02
>> On 30/11/16 13:58, Paul Durrant wrote:
>> > Also, anonymous unions are not part of C99 AFAIK... are we now stipulating
>> something more recent?
>> 
>> We used gnu99 for as long as I can remember, and we have other examples
>> of this pattern already in Xen.
>> 
> 
> If there's precedent then that's fine.

Tighter rules only apply for the public headers.

Jan
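
For reference, the pattern under discussion looks roughly like the sketch
below. Anonymous unions are a GNU extension in gnu99 and standard from C11
onwards. The structure and field names here are hypothetical and only
loosely modelled on the retire flags; they are not copied from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical emulation context; names are illustrative only. */
struct emul_ctxt {
    /* Anonymous union: its members are reached directly via the parent. */
    union {
        uint8_t retire_raw;     /* all retire flags viewed as one byte */
        struct {
            bool hlt:1;
            bool singlestep:1;
        } retire;
    };
};

/* Clear every flag in one store through the raw view. */
static void clear_retire(struct emul_ctxt *c)
{
    c->retire_raw = 0;
}

/* True if any retire flag is set, again using the raw view. */
static bool retire_pending(const struct emul_ctxt *c)
{
    return c->retire_raw != 0;
}

static void example(void)
{
    struct emul_ctxt c;

    clear_retire(&c);
    assert(!retire_pending(&c));

    c.retire.singlestep = true;     /* no intermediate union name needed */
    assert(retire_pending(&c));

    clear_retire(&c);
    assert(!c.retire.singlestep && !c.retire.hlt);
}
```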


Re: [Xen-devel] [PATCH RFC 1/2] xen/page_alloc: Add size_align parameter to provide MFNs which are size aligned.

2016-11-30 Thread Konrad Rzeszutek Wilk
On Wed, Nov 30, 2016 at 02:30:41AM -0700, Jan Beulich wrote:
> >>> On 30.11.16 at 05:39,  wrote:
> > This is to support the requirement that exists in PV dom0
> > when doing DMA requests:
> > 
> > "dma_alloc_coherent()
> > [...]
> > The CPU virtual address and the DMA address are both guaranteed to be
> > aligned to the smallest PAGE_SIZE order which is greater than or equal
> > to the requested size.  This invariant exists (for example) to guarantee
> > that if you allocate a chunk which is smaller than or equal to 64
> > kilobytes, the extent of the buffer you receive will not cross a 64K
> > boundary."
> 
> So I'm having trouble understanding what it is that actually needs
> fixing / changing here: Any order-N allocation will be order-N-aligned
> already. Is your caller perhaps simply not passing in a large enough
> order? And changing alloc_heap_pages(), which guarantees the
> requested alignment already anyway (after all it takes an order
> input, not a size one), looks completely pointless regardless of what
> extra requirements you may want to put on the exchange hypercall.

The page_alloc.c code walks through different order pages. Which means
that if it can't find one within the requested order pages it will
go one up (and so on). Eventually that means you do get the requested
order pages, but they are not guaranteed to be order aligned (as they
may be order aligned to a higher value).

> 
> Jan
> 

Re: [Xen-devel] [PATCH v3.1 15/15] xen/x86: setup PVHv2 Dom0 ACPI tables

2016-11-30 Thread Jan Beulich
>>> On 30.11.16 at 15:23,  wrote:
> On Wed, Nov 30, 2016 at 07:09:47AM -0700, Jan Beulich wrote:
>> >>> On 30.11.16 at 13:40,  wrote:
>> > On Mon, Nov 14, 2016 at 09:15:37AM -0700, Jan Beulich wrote:
>> >> >>> On 29.10.16 at 11:00,  wrote:
>> >> > Also, regions marked as E820_ACPI or E820_NVS are identity mapped into 
>> >> > Dom0
>> >> > p2m, plus any top-level ACPI tables that should be accessible to Dom0 
>> >> > and
>> >> > that don't reside in RAM regions. This is needed because some memory 
>> >> > maps
>> >> > don't properly account for all the memory used by ACPI, so it's common 
>> >> > to
>> >> > find ACPI tables in holes.
>> >> 
>> >> I question whether this behavior should be enabled by default. Not
>> >> having seen the code yet I also wonder whether these regions
>> >> shouldn't simply be added to the guest's E820 as E820_ACPI, which
>> >> should then result in them getting mapped without further special
>> >> casing.
>> >> 
>> >> > +static int __init hvm_add_mem_range(struct domain *d, uint64_t s, 
>> >> > uint64_t e,
>> >> > +uint32_t type)
>> >> 
>> >> I see s and e being uint64_t, but I don't see why type can't be plain
>> >> unsigned int.
>> > 
>> > Well, that's the type for "type" as defined in e820.h. I'm just using 
>> > uint32_t 
>> > for consistency with that.
>> 
>> As said a number of times in various contexts: We should try to
>> get away from using fixed width types where we don't really need
>> them.
> 
> Done, I've changed it. Would you like me to also change the uint64_t's to 
> paddr_t?

To me paddr_t is not better or worse than uint64_t, perhaps with
the slight exception that in a (very old) non-PAE 32-bit build
paddr_t would have been actively wrong.

Jan


Re: [Xen-devel] Xen ARM - Exposing a PL011 to the guest

2016-11-30 Thread Christoffer Dall
On Wed, Nov 30, 2016 at 03:29:32PM +, Julien Grall wrote:
> Hi all,
> 
> A few months ago, Linaro published version 2 of the VM
> specification [1].
> 
> For those who don't know, the specification provides guidelines to
> guarantee that compliant OS images can run on various hypervisors (e.g.
> Xen, KVM).
> 
> Looking at the specification, it will require Xen to expose new
> devices to the guest: pl011, rtc, persistent flash (for UEFI
> variables).
> 
> The RTC and persistent flash will only be used by the UEFI firmware.

Why would a guest booting without UEFI not want to use the RTC directly?

Linux does this on KVM today...

-Christoffer

Re: [Xen-devel] [PATCH] xen/arm: Fix misplaced parentheses for PSCI version check

2016-11-30 Thread Julien Grall

Hi Artem,

On 30/11/16 13:53, Artem Mygaiev wrote:

Fix misplaced parentheses for PSCI version check

Signed-off-by: Artem Mygaiev 


Can you please include the coverity ID:

Coverity-ID: 1381830

With that:

Reviewed-by: Julien Grall 

This has been introduced by me in commit 2831f20 "xen/arm: Add support 
of PSCI v1.0 for the host" in Xen 4.7. I am not sure whether we should 
backport it.


Cheers,

--
Julien Grall

Re: [Xen-devel] [PATCH v3 13/24] x86/emul: Rework emulator event injection

2016-11-30 Thread Paul Durrant
> -Original Message-
> From: Andrew Cooper [mailto:andrew.coop...@citrix.com]
> Sent: 30 November 2016 13:51
> To: Xen-devel 
> Cc: Andrew Cooper ; Jan Beulich
> ; Paul Durrant ; Tim
> (Xen.org) ; George Dunlap 
> Subject: [PATCH v3 13/24] x86/emul: Rework emulator event injection
> 
> The emulator needs to gain an understanding of interrupts and exceptions
> generated by its actions.
> 
> Move hvm_emulate_ctxt.{exn_pending,trap} into struct x86_emulate_ctxt so
> they are visible to the emulator.  This removes the need for the
> inject_{hw_exception,sw_interrupt}() hooks, which are dropped and replaced
> with x86_emul_{hw_exception,software_event,reset_event}() instead.
> 
> For exceptions raised by x86_emulate() itself (rather than its callbacks), the
> shadow pagetable and PV uses of x86_emulate() previously failed with
> X86EMUL_UNHANDLEABLE due to the lack of inject_*() hooks.
> 
> This behaviour has changed, and such cases will now return
> X86EMUL_EXCEPTION with event_pending set.  Until the callers of
> x86_emulate() have been updated to inject events back into the guest,
> divert the event_pending case back into the X86EMUL_UNHANDLEABLE path to
> maintain the same guest-visible behaviour.
> 
> No overall functional change.
> 
> Signed-off-by: Andrew Cooper 
> Reviewed-by: Boris Ostrovsky 
> Reviewed-by: Kevin Tian 
> ---
> CC: Jan Beulich 
> CC: Paul Durrant 

Reviewed-by: Paul Durrant 

> CC: Tim Deegan 
> CC: George Dunlap 
> 
> v3:
>  * Rework how the event_pending case is currently handled
> v2:
>  * Change x86_emul_hw_exception()'s error_code parameter to being signed
>  * Clarify how software interrupt injection happens.
>  * More ASSERT()'s and description of how event_pending works without the
>    inject_sw_interrupt() hook
> ---
>  xen/arch/x86/hvm/emulate.c | 81 
> --
>  xen/arch/x86/hvm/hvm.c |  4 +-
>  xen/arch/x86/hvm/io.c  |  4 +-
>  xen/arch/x86/hvm/vmx/realmode.c| 16 +++
>  xen/arch/x86/mm.c  | 26 +++
>  xen/arch/x86/mm/shadow/multi.c | 17 +++
>  xen/arch/x86/x86_emulate/x86_emulate.c | 12 +++--
>  xen/arch/x86/x86_emulate/x86_emulate.h | 76
> +--
>  xen/include/asm-x86/hvm/emulate.h  |  3 --
>  9 files changed, 132 insertions(+), 107 deletions(-)
> 
> diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
> index 91c79fa..4b8c9a0 100644
> --- a/xen/arch/x86/hvm/emulate.c
> +++ b/xen/arch/x86/hvm/emulate.c
> @@ -568,12 +568,9 @@ static int hvmemul_virtual_to_linear(
>  return X86EMUL_UNHANDLEABLE;
> 
>  /* This is a singleton operation: fail it with an exception. */
> -hvmemul_ctxt->exn_pending = 1;
> -hvmemul_ctxt->trap.vector =
> -(seg == x86_seg_ss) ? TRAP_stack_error : TRAP_gp_fault;
> -hvmemul_ctxt->trap.type = X86_EVENTTYPE_HW_EXCEPTION;
> -hvmemul_ctxt->trap.error_code = 0;
> -hvmemul_ctxt->trap.insn_len = 0;
> +x86_emul_hw_exception((seg == x86_seg_ss)
> +  ? TRAP_stack_error
> +  : TRAP_gp_fault, 0, _ctxt->ctxt);
>  return X86EMUL_EXCEPTION;
>  }
> 
> @@ -1562,59 +1559,6 @@ int hvmemul_cpuid(
>  return X86EMUL_OKAY;
>  }
> 
> -static int hvmemul_inject_hw_exception(
> -uint8_t vector,
> -int32_t error_code,
> -struct x86_emulate_ctxt *ctxt)
> -{
> -struct hvm_emulate_ctxt *hvmemul_ctxt =
> -container_of(ctxt, struct hvm_emulate_ctxt, ctxt);
> -
> -hvmemul_ctxt->exn_pending = 1;
> -hvmemul_ctxt->trap.vector = vector;
> -hvmemul_ctxt->trap.type = X86_EVENTTYPE_HW_EXCEPTION;
> -hvmemul_ctxt->trap.error_code = error_code;
> -hvmemul_ctxt->trap.insn_len = 0;
> -
> -return X86EMUL_OKAY;
> -}
> -
> -static int hvmemul_inject_sw_interrupt(
> -enum x86_swint_type type,
> -uint8_t vector,
> -uint8_t insn_len,
> -struct x86_emulate_ctxt *ctxt)
> -{
> -struct hvm_emulate_ctxt *hvmemul_ctxt =
> -container_of(ctxt, struct hvm_emulate_ctxt, ctxt);
> -
> -switch ( type )
> -{
> -case x86_swint_icebp:
> -hvmemul_ctxt->trap.type = X86_EVENTTYPE_PRI_SW_EXCEPTION;
> -break;
> -
> -case x86_swint_int3:
> -case x86_swint_into:
> -hvmemul_ctxt->trap.type = X86_EVENTTYPE_SW_EXCEPTION;
> -break;
> -
> -case x86_swint_int:
> -hvmemul_ctxt->trap.type = X86_EVENTTYPE_SW_INTERRUPT;
> -break;
> -
> -default:
> -return X86EMUL_UNHANDLEABLE;
> -}
> -
> -hvmemul_ctxt->exn_pending = 1;
> -hvmemul_ctxt->trap.vector = vector;
> -

Re: [Xen-devel] [PATCH v3 22/24] x86/hvm: Avoid __hvm_copy() raising #PF behind the emulators back

2016-11-30 Thread Paul Durrant
> -Original Message-
> From: Andrew Cooper [mailto:andrew.coop...@citrix.com]
> Sent: 30 November 2016 13:51
> To: Xen-devel 
> Cc: Andrew Cooper ; Paul Durrant
> 
> Subject: [PATCH v3 22/24] x86/hvm: Avoid __hvm_copy() raising #PF behind
> the emulators back
> 
> Drop the call to hvm_inject_page_fault() in __hvm_copy(), and require
> callers to inject the pagefault themselves.
> 
> Signed-off-by: Andrew Cooper 
> Acked-by: Tim Deegan 
> Acked-by: Kevin Tian 
> Reviewed-by: Jan Beulich 
> ---
> CC: Paul Durrant 

Reviewed-by: Paul Durrant 

> 
> v3:
>  * Correct patch description
>  * Fix rebasing error over previous TSS series
> ---
>  xen/arch/x86/hvm/emulate.c|  2 ++
>  xen/arch/x86/hvm/hvm.c| 14 --
>  xen/arch/x86/hvm/vmx/vvmx.c   | 20 +++-
>  xen/arch/x86/mm/shadow/common.c   |  1 +
>  xen/include/asm-x86/hvm/support.h |  4 +---
>  5 files changed, 31 insertions(+), 10 deletions(-)
> 
> diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
> index 035b654..ccf3aa2 100644
> --- a/xen/arch/x86/hvm/emulate.c
> +++ b/xen/arch/x86/hvm/emulate.c
> @@ -799,6 +799,7 @@ static int __hvmemul_read(
>  case HVMCOPY_okay:
>  break;
>  case HVMCOPY_bad_gva_to_gfn:
> +x86_emul_pagefault(pfinfo.ec, pfinfo.linear, &hvmemul_ctxt->ctxt);
>  return X86EMUL_EXCEPTION;
>  case HVMCOPY_bad_gfn_to_mfn:
>  if ( access_type == hvm_access_insn_fetch )
> @@ -905,6 +906,7 @@ static int hvmemul_write(
>  case HVMCOPY_okay:
>  break;
>  case HVMCOPY_bad_gva_to_gfn:
> +x86_emul_pagefault(pfinfo.ec, pfinfo.linear, &hvmemul_ctxt->ctxt);
>  return X86EMUL_EXCEPTION;
>  case HVMCOPY_bad_gfn_to_mfn:
>  return hvmemul_linear_mmio_write(addr, bytes, p_data, pfec,
> hvmemul_ctxt, 0);
> diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
> index 37eaee2..3596f2c 100644
> --- a/xen/arch/x86/hvm/hvm.c
> +++ b/xen/arch/x86/hvm/hvm.c
> @@ -2927,6 +2927,8 @@ void hvm_task_switch(
> 
>  rc = hvm_copy_from_guest_linear(
>  &tss, prev_tr.base, sizeof(tss), PFEC_page_present, &pfinfo);
> +if ( rc == HVMCOPY_bad_gva_to_gfn )
> +hvm_inject_page_fault(pfinfo.ec, pfinfo.linear);
>  if ( rc != HVMCOPY_okay )
>  goto out;
> 
> @@ -2965,11 +2967,15 @@ void hvm_task_switch(
>offsetof(typeof(tss), trace) -
>offsetof(typeof(tss), eip),
>PFEC_page_present, &pfinfo);
> +if ( rc == HVMCOPY_bad_gva_to_gfn )
> +hvm_inject_page_fault(pfinfo.ec, pfinfo.linear);
>  if ( rc != HVMCOPY_okay )
>  goto out;
> 
>  rc = hvm_copy_from_guest_linear(
>  &tss, tr.base, sizeof(tss), PFEC_page_present, &pfinfo);
> +if ( rc == HVMCOPY_bad_gva_to_gfn )
> +hvm_inject_page_fault(pfinfo.ec, pfinfo.linear);
>  /*
>   * Note: The HVMCOPY_gfn_shared case could be optimised, if the callee
>   * functions knew we want RO access.
> @@ -3012,7 +3018,10 @@ void hvm_task_switch(
>&tss.back_link, sizeof(tss.back_link), 0,
>&pfinfo);
>  if ( rc == HVMCOPY_bad_gva_to_gfn )
> +{
> +hvm_inject_page_fault(pfinfo.ec, pfinfo.linear);
>  exn_raised = 1;
> +}
>  else if ( rc != HVMCOPY_okay )
>  goto out;
>  }
> @@ -3050,7 +3059,10 @@ void hvm_task_switch(
>  rc = hvm_copy_to_guest_linear(linear_addr, , opsz, 0,
>);
>  if ( rc == HVMCOPY_bad_gva_to_gfn )
> +{
> +hvm_inject_page_fault(pfinfo.ec, pfinfo.linear);
>  exn_raised = 1;
> +}
>  else if ( rc != HVMCOPY_okay )
>  goto out;
>  }
> @@ -3114,8 +3126,6 @@ static enum hvm_copy_result __hvm_copy(
>  {
>  pfinfo->linear = addr;
>  pfinfo->ec = pfec;
> -
> -hvm_inject_page_fault(pfec, addr);
>  }
>  return HVMCOPY_bad_gva_to_gfn;
>  }
> diff --git a/xen/arch/x86/hvm/vmx/vvmx.c
> b/xen/arch/x86/hvm/vmx/vvmx.c
> index fd7ea0a..e6e9ebd 100644
> --- a/xen/arch/x86/hvm/vmx/vvmx.c
> +++ b/xen/arch/x86/hvm/vmx/vvmx.c
> @@ -396,7 +396,6 @@ static int decode_vmx_inst(struct cpu_user_regs
> *regs,
>  struct vcpu *v = current;
>  union vmx_inst_info info;
>  struct segment_register seg;
> -pagefault_info_t pfinfo;
>  unsigned long base, index, seg_base, disp, offset;
>  int scale, size;
> 
> @@ -451,10 +450,17 @@ static int decode_vmx_inst(struct cpu_user_regs
> *regs,
>offset + 

Re: [Xen-devel] [PATCH v3 11/24] x86/emul: Implement singlestep as a retire flag

2016-11-30 Thread Paul Durrant
> -Original Message-
> From: Andrew Cooper [mailto:andrew.coop...@citrix.com]
> Sent: 30 November 2016 13:50
> To: Xen-devel 
> Cc: Andrew Cooper ; Jan Beulich
> ; Tim (Xen.org) ; Paul Durrant
> 
> Subject: [PATCH v3 11/24] x86/emul: Implement singlestep as a retire flag
> 
> The behaviour of singlestep is to raise #DB after the instruction has been
> completed, but implementing it with inject_hw_exception() causes
> x86_emulate()
> to return X86EMUL_EXCEPTION, despite successfully completing execution of
> the
> instruction, including register writeback.
> 
> Instead, use a retire flag to indicate singlestep, which causes x86_emulate()
> to return X86EMUL_OKAY.
> 
> Update all callers of x86_emulate() to use the new retire flag.  This fixes
> the behaviour of singlestep for shadow pagetable updates and
> mmcfg/mmio_ro
> intercepts, which previously discarded the exception.
> 
> With this change, all uses of X86EMUL_EXCEPTION from x86_emulate() are
> believed to have strictly fault semantics.
> 
> Signed-off-by: Andrew Cooper 
> ---
> CC: Jan Beulich 
> CC: Tim Deegan 
> CC: Paul Durrant 

Reviewed-by: Paul Durrant 
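
For readers following along, the caller-side pattern described in the commit message can be sketched roughly as below. This is a stand-alone illustration under stated assumptions, not Xen code: toy_emulate(), struct toy_ctxt, inject_debug_trap() and the TOY_* return codes are hypothetical stand-ins for x86_emulate(), the emulation context, the *_inject_hw_exception() helpers and the X86EMUL_* codes. The anonymous struct needs C11 or a GNU dialect such as gnu99.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for X86EMUL_OKAY / X86EMUL_EXCEPTION. */
enum toy_rc { TOY_OKAY, TOY_EXCEPTION };

struct toy_ctxt {
    bool trap_flag;             /* models the guest running with EFLAGS.TF set */
    union {
        uint8_t raw;
        struct {
            bool singlestep:1;  /* instruction retired while single-stepping */
        };
    } retire;
};

/* Counts injected #DB events so the effect is observable in this sketch. */
static int debug_traps_injected;

static void inject_debug_trap(void)
{
    debug_traps_injected++;
}

/* Toy emulator: the instruction always completes, including writeback. */
static enum toy_rc toy_emulate(struct toy_ctxt *ctxt)
{
    ctxt->retire.raw = 0;

    /* ... decode, execute and write back registers here ... */

    if ( ctxt->trap_flag )
        ctxt->retire.singlestep = true;  /* a flag, not a fault */

    return TOY_OKAY;
}

/* Caller: only inspect the retire flags once emulation has succeeded. */
static enum toy_rc toy_emulate_one(struct toy_ctxt *ctxt)
{
    enum toy_rc rc = toy_emulate(ctxt);

    if ( rc != TOY_OKAY )
        return rc;

    if ( ctxt->retire.singlestep )
        inject_debug_trap();

    return TOY_OKAY;
}
```

The point of the split is that the emulator reports "completed, and by the way single-stepping was active", leaving each caller to decide how to deliver the #DB.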

> 
> v3:
>  * New
> ---
>  xen/arch/x86/hvm/emulate.c |  3 +++
>  xen/arch/x86/mm.c  | 11 ++-
>  xen/arch/x86/mm/shadow/multi.c | 21 -
>  xen/arch/x86/x86_emulate/x86_emulate.c |  9 -
>  xen/arch/x86/x86_emulate/x86_emulate.h |  6 ++
>  5 files changed, 43 insertions(+), 7 deletions(-)
> 
> diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
> index fe62500..91c79fa 100644
> --- a/xen/arch/x86/hvm/emulate.c
> +++ b/xen/arch/x86/hvm/emulate.c
> @@ -1788,6 +1788,9 @@ static int _hvm_emulate_one(struct
> hvm_emulate_ctxt *hvmemul_ctxt,
>  if ( rc != X86EMUL_OKAY )
>  return rc;
> 
> +if ( hvmemul_ctxt->ctxt.retire.singlestep )
> +hvm_inject_hw_exception(TRAP_debug, X86_EVENT_NO_EC);
> +
>  new_intr_shadow = hvmemul_ctxt->intr_shadow;
> 
>  /* MOV-SS instruction toggles MOV-SS shadow, else we just clear it. */
> diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
> index b7c7122..231c7bf 100644
> --- a/xen/arch/x86/mm.c
> +++ b/xen/arch/x86/mm.c
> @@ -5382,6 +5382,9 @@ int ptwr_do_page_fault(struct vcpu *v, unsigned
> long addr,
>  if ( rc == X86EMUL_UNHANDLEABLE )
>  goto bail;
> 
> +if ( ptwr_ctxt.ctxt.retire.singlestep )
> +pv_inject_hw_exception(TRAP_debug, X86_EVENT_NO_EC);
> +
>  perfc_incr(ptwr_emulations);
>  return EXCRET_fault_fixed;
> 
> @@ -5503,7 +5506,13 @@ int mmio_ro_do_page_fault(struct vcpu *v,
> unsigned long addr,
>  else
>  rc = x86_emulate(, _ro_emulate_ops);
> 
> -return rc != X86EMUL_UNHANDLEABLE ? EXCRET_fault_fixed : 0;
> +if ( rc == X86EMUL_UNHANDLEABLE )
> +return 0;
> +
> +if ( ctxt.retire.singlestep )
> +pv_inject_hw_exception(TRAP_debug, X86_EVENT_NO_EC);
> +
> +return EXCRET_fault_fixed;
>  }
> 
>  void *alloc_xen_pagetable(void)
> diff --git a/xen/arch/x86/mm/shadow/multi.c
> b/xen/arch/x86/mm/shadow/multi.c
> index 9ee48a8..ddfb815 100644
> --- a/xen/arch/x86/mm/shadow/multi.c
> +++ b/xen/arch/x86/mm/shadow/multi.c
> @@ -3422,6 +3422,16 @@ static int sh_page_fault(struct vcpu *v,
>  v->arch.paging.last_write_emul_ok = 0;
>  #endif
> 
> +if ( emul_ctxt.ctxt.retire.singlestep )
> +{
> +if ( is_hvm_vcpu(v) )
> +hvm_inject_hw_exception(TRAP_debug, X86_EVENT_NO_EC);
> +else
> +pv_inject_hw_exception(TRAP_debug, X86_EVENT_NO_EC);
> +
> +goto emulate_done;
> +}
> +
>  #if GUEST_PAGING_LEVELS == 3 /* PAE guest */
>  if ( r == X86EMUL_OKAY ) {
>  int i, emulation_count=0;
> @@ -3433,7 +3443,7 @@ static int sh_page_fault(struct vcpu *v,
>  shadow_continue_emulation(_ctxt, regs);
>  v->arch.paging.last_write_was_pt = 0;
>  r = x86_emulate(_ctxt.ctxt, emul_ops);
> -if ( r == X86EMUL_OKAY )
> +if ( r == X86EMUL_OKAY && !emul_ctxt.ctxt.retire.raw )
>  {
>  emulation_count++;
>  if ( v->arch.paging.last_write_was_pt )
> @@ -3449,6 +3459,15 @@ static int sh_page_fault(struct vcpu *v,
>  {
>  perfc_incr(shadow_em_ex_fail);
> 
> TRACE_SHADOW_PATH_FLAG(TRCE_SFLAG_EMULATION_LAST_FAILED);
> +
> +if ( emul_ctxt.ctxt.retire.singlestep )
> +{
> +if ( is_hvm_vcpu(v) )
> +hvm_inject_hw_exception(TRAP_debug, X86_EVENT_NO_EC);
> +else
> +pv_inject_hw_exception(TRAP_debug, X86_EVENT_NO_EC);
> +}
> +
>  break; /* 

Re: [Xen-devel] [PATCH] arm/acpi: hide watchdog timer in GTDT table for dom0

2016-11-30 Thread Shanker Donthineni

Hi Julien,

We are using Fu's [v5] patch series 
https://patchwork.codeaurora.org/patch/20325/ in our testing. We thought the 
system crash in Xen was related to the watchdog timer driver, so we removed the 
watchdog timer sections, including GT blocks, in GTDT to fix the crash. 
Let me root-cause the issue and update you with the results by the end of this 
week.


Thanks,
Shanker

--
Shanker Donthineni
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm 
Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux 
Foundation Collaborative Project.


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [xen-4.5-testing test] 102721: tolerable FAIL - PUSHED

2016-11-30 Thread osstest service owner
flight 102721 xen-4.5-testing real [real]
http://logs.test-lab.xenproject.org/osstest/logs/102721/

Failures :-/ but no regressions.

Regressions which are regarded as allowable (not blocking):
 test-amd64-i386-xl-qemuu-win7-amd64 13 guest-localmigrate fail like 102543
 test-amd64-i386-xl-qemuu-winxpsp3-vcpus1 15 guest-localmigrate/x10 fail like 102673
 test-armhf-armhf-xl-rtds 11 guest-start  fail  like 102703
 test-amd64-amd64-xl-rtds  6 xen-boot fail  like 102710
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop fail like 102710
 test-armhf-armhf-libvirt 13 saverestore-support-check fail  like 102710
 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-stop fail like 102710
 test-amd64-i386-xl-qemut-win7-amd64 16 guest-stop fail like 102710

Tests which did not succeed, but are not blocking:
 test-xtf-amd64-amd64-2   18 xtf/test-hvm32-cpuid-faulting fail  never pass
 test-xtf-amd64-amd64-5   18 xtf/test-hvm32-cpuid-faulting fail  never pass
 test-xtf-amd64-amd64-2 27 xtf/test-hvm32pae-cpuid-faulting fail never pass
 test-xtf-amd64-amd64-1   18 xtf/test-hvm32-cpuid-faulting fail  never pass
 test-xtf-amd64-amd64-5 27 xtf/test-hvm32pae-cpuid-faulting fail never pass
 test-amd64-amd64-xl-pvh-amd  11 guest-start  fail   never pass
 test-xtf-amd64-amd64-4   18 xtf/test-hvm32-cpuid-faulting fail  never pass
 test-xtf-amd64-amd64-3   18 xtf/test-hvm32-cpuid-faulting fail  never pass
 test-xtf-amd64-amd64-1 27 xtf/test-hvm32pae-cpuid-faulting fail never pass
 test-xtf-amd64-amd64-3 27 xtf/test-hvm32pae-cpuid-faulting fail never pass
 test-xtf-amd64-amd64-1 33 xtf/test-hvm32pse-cpuid-faulting fail never pass
 test-xtf-amd64-amd64-3 33 xtf/test-hvm32pse-cpuid-faulting fail never pass
 test-xtf-amd64-amd64-3   37 xtf/test-hvm64-cpuid-faulting fail  never pass
 test-xtf-amd64-amd64-1   37 xtf/test-hvm64-cpuid-faulting fail  never pass
 test-xtf-amd64-amd64-3  49 xtf/test-pv32pae-cpuid-faulting fail never pass
 test-xtf-amd64-amd64-3   57 xtf/test-pv64-cpuid-faulting fail   never pass
 test-xtf-amd64-amd64-4 27 xtf/test-hvm32pae-cpuid-faulting fail never pass
 test-xtf-amd64-amd64-1  49 xtf/test-pv32pae-cpuid-faulting fail never pass
 test-xtf-amd64-amd64-1   57 xtf/test-pv64-cpuid-faulting fail   never pass
 test-xtf-amd64-amd64-4 33 xtf/test-hvm32pse-cpuid-faulting fail never pass
 test-xtf-amd64-amd64-4   37 xtf/test-hvm64-cpuid-faulting fail  never pass
 test-xtf-amd64-amd64-2 33 xtf/test-hvm32pse-cpuid-faulting fail never pass
 test-xtf-amd64-amd64-2   37 xtf/test-hvm64-cpuid-faulting fail  never pass
 test-xtf-amd64-amd64-4  49 xtf/test-pv32pae-cpuid-faulting fail never pass
 test-amd64-amd64-xl-pvh-intel 11 guest-start  fail  never pass
 test-xtf-amd64-amd64-5 33 xtf/test-hvm32pse-cpuid-faulting fail never pass
 test-xtf-amd64-amd64-4   57 xtf/test-pv64-cpuid-faulting fail   never pass
 test-xtf-amd64-amd64-5   37 xtf/test-hvm64-cpuid-faulting fail  never pass
 test-xtf-amd64-amd64-2  49 xtf/test-pv32pae-cpuid-faulting fail never pass
 test-xtf-amd64-amd64-2   57 xtf/test-pv64-cpuid-faulting fail   never pass
 test-xtf-amd64-amd64-5  49 xtf/test-pv32pae-cpuid-faulting fail never pass
 test-amd64-amd64-libvirt 12 migrate-support-check fail   never pass
 test-amd64-amd64-libvirt-vhd 11 migrate-support-check fail   never pass
 test-xtf-amd64-amd64-5   57 xtf/test-pv64-cpuid-faulting fail   never pass
 test-amd64-i386-libvirt  12 migrate-support-check fail   never pass
 test-armhf-armhf-xl  12 migrate-support-check fail   never pass
 test-armhf-armhf-xl  13 saverestore-support-check fail   never pass
 test-armhf-armhf-xl-multivcpu 12 migrate-support-check fail  never pass
 test-armhf-armhf-xl-multivcpu 13 saverestore-support-check fail  never pass
 test-armhf-armhf-xl-arndale  12 migrate-support-check fail   never pass
 test-armhf-armhf-libvirt 12 migrate-support-check fail   never pass
 test-armhf-armhf-xl-arndale  13 saverestore-support-check fail   never pass
 test-armhf-armhf-libvirt-raw 10 guest-start  fail   never pass
 test-armhf-armhf-libvirt-qcow2 10 guest-start  fail never pass
 test-armhf-armhf-xl-cubietruck 12 migrate-support-check fail never pass
 test-armhf-armhf-xl-cubietruck 13 saverestore-support-check fail never pass
 test-armhf-armhf-xl-credit2  12 migrate-support-check fail   never pass
 test-amd64-amd64-qemuu-nested-amd 16 debian-hvm-install/l1/l2  fail never pass
 test-armhf-armhf-xl-credit2  13 saverestore-support-check fail   never pass
 test-armhf-armhf-xl-vhd  10 guest-start  fail   never pass

version targeted for testing:
 xen  cc325c0bd56d5cc327f8a426d661cc1a2f3a52bd
baseline 

Re: [Xen-devel] [PATCH v3.1 15/15] xen/x86: setup PVHv2 Dom0 ACPI tables

2016-11-30 Thread Roger Pau Monne
On Wed, Nov 30, 2016 at 07:09:47AM -0700, Jan Beulich wrote:
> >>> On 30.11.16 at 13:40,  wrote:
> > On Mon, Nov 14, 2016 at 09:15:37AM -0700, Jan Beulich wrote:
> >> >>> On 29.10.16 at 11:00,  wrote:
> >> > Also, regions marked as E820_ACPI or E820_NVS are identity mapped into 
> >> > Dom0
> >> > p2m, plus any top-level ACPI tables that should be accessible to Dom0 and
> >> > that don't reside in RAM regions. This is needed because some memory maps
> >> > don't properly account for all the memory used by ACPI, so it's common to
> >> > find ACPI tables in holes.
> >> 
> >> I question whether this behavior should be enabled by default. Not
> >> having seen the code yet I also wonder whether these regions
> >> shouldn't simply be added to the guest's E820 as E820_ACPI, which
> >> should then result in them getting mapped without further special
> >> casing.
> >> 
> >> > +static int __init hvm_add_mem_range(struct domain *d, uint64_t s, 
> >> > uint64_t e,
> >> > +uint32_t type)
> >> 
> >> I see s and e being uint64_t, but I don't see why type can't be plain
> >> unsigned int.
> > 
> > Well, that's the type for "type" as defined in e820.h. I'm just using 
> > uint32_t 
> > for consistency with that.
> 
> As said a number of times in various contexts: We should try to
> get away from using fixed width types where we don't really need
> them.

Done, I've changed it. Would you like me to also change the uint64_t's to 
paddr_t?

> >> > +{
> >> > +d->arch.e820[i].size += e - s;
> >> > +return 0;
> >> > +}
> >> > +
> >> > +if ( rs >= e )
> >> > +break;
> >> > +
> >> > +if ( re > s )
> >> > +return -ENOMEM;
> >> 
> >> I don't think ENOMEM is appropriate to signal an overlap. And don't
> >> you need to reverse these last two if()s?
> > 
> > I've changed ENOMEM to EEXIST. Hm, I don't think so, if I reversed those we 
> > will 
> > get error when trying to add a non-contiguous region to fill a hole between 
> > two 
> > existing regions right?
> 
> Looks like I've managed to write something else than I meant. I was
> really thinking of
> 
> if ( re > s )
> {
> if ( rs >= e )
> break;
> return -ENOMEM;
> }
> 
> But then again I think with things being sorted it may not matter at all.

I slightly prefer the current one since it has fewer nested ifs, but if you have 
a strong preference for the latter I don't really mind changing it.

> >> > +if ( nr_ioapics > 1 )
> >> > +printk("WARNING: found %d IO APICs, Dom0 will only have access 
> >> > to 1 emulated IO APIC\n",
> >> > +   nr_ioapics);
> >> 
> >> I've said elsewhere already that I think we should provide 1 vIO-APIC
> >> per physical one.
> > 
> > Agree, but the current vIO-APIC is not really up to it. I will work on 
> > getting 
> > it to support multiple instances.
> 
> Until then this should obtain a grep-able "fixme" annotation.

Oh, right (you said that several times, sorry).
 
Roger.



[Xen-devel] Xen ARM - Exposing a PL011 to the guest

2016-11-30 Thread Julien Grall

Hi all,

A few months ago, Linaro published version 2 of the VM 
specification [1].


For those who don't know, the specification provides guidelines to 
guarantee that a compliant OS image can run on various hypervisors (e.g. 
Xen, KVM).


Looking at the specification, it will require Xen to expose new devices 
to the guest: pl011, rtc, persistent flash (for UEFI variables).


The RTC and persistent flash will only be used by the UEFI firmware. The 
firmware is custom made for Xen guests and is loaded by the toolstack, so 
we could theoretically provide PV drivers for those.


This is not the case for the PL011. The guest will be shipped with a 
PL011/SBSA UART driver. This means it will expect to access it through MMIO.


So we have to emulate a PL011. The question is where? Before suggesting 
some ideas, note that the guest/user will expect to be able to interact 
with the console through the UART. This means that the UART and xenconsoled 
need to communicate with each other.


I think we can distinguish two places where the PL011 could be emulated:
in the hypervisor, or outside the hypervisor.

Emulating the UART in the hypervisor means that we risk increasing the 
attack surface of Xen if there is a bug in the emulation code. The attack 
surface could be reduced by emulating the UART in another exception level 
(e.g. EL1, EL0) but still under the control of the hypervisor. Usually the 
guest communicates with xenconsoled using a ring. For the first console, 
the ring can be discovered using the hypercall HVMOP_get_param. For the 
second and onwards, it is described in xenstore. I would not worry too 
much about emulating multiple PL011s, so we could implement the PV 
frontend in Xen.
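
To give an idea of how small the guest-visible part could be, here is a rough sketch of an MMIO handler that forwards writes to the PL011 data register into a transmit ring. Only the register offsets (UARTDR at 0x000, UARTFR at 0x018) and the flag bits (RXFE, TXFF, TXFE) come from the PL011 register map; struct toy_ring, struct toy_pl011 and the handler names are hypothetical, the ring is not the layout xenconsoled actually uses, and the receive path and interrupts are not modelled.

```c
#include <stdint.h>

/* PL011 register offsets and flag register bits. */
#define UARTDR   0x000u
#define UARTFR   0x018u
#define FR_RXFE  (1u << 4)   /* receive FIFO empty */
#define FR_TXFF  (1u << 5)   /* transmit FIFO full */
#define FR_TXFE  (1u << 7)   /* transmit FIFO empty */

#define RING_SIZE 16u        /* power of two, so free-running indices work */

/* Hypothetical output ring; a real one would be shared with the backend. */
struct toy_ring {
    uint8_t buf[RING_SIZE];
    uint32_t prod, cons;
};

struct toy_pl011 {
    struct toy_ring out;
};

static uint32_t ring_used(const struct toy_ring *r)
{
    return r->prod - r->cons;
}

/* Guest write: a store to UARTDR queues one character, if there is room. */
static void toy_pl011_write(struct toy_pl011 *u, uint32_t off, uint32_t val)
{
    if ( off == UARTDR && ring_used(&u->out) < RING_SIZE )
        u->out.buf[u->out.prod++ % RING_SIZE] = (uint8_t)val;

    /* Writes to every other register are ignored in this sketch. */
}

/* Guest read: only the flag register is modelled. */
static uint32_t toy_pl011_read(const struct toy_pl011 *u, uint32_t off)
{
    uint32_t fr;

    if ( off != UARTFR )
        return 0;

    fr = FR_RXFE;                          /* no input path modelled */
    if ( ring_used(&u->out) == 0 )
        fr |= FR_TXFE;
    if ( ring_used(&u->out) == RING_SIZE )
        fr |= FR_TXFF;

    return fr;
}
```

A real implementation would also need the receive side, the interrupt registers and a notification to the console backend whenever the producer index moves.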


Emulating the UART outside the hypervisor (e.g. in DOM0 or a special 
domain) would require bringing the concept of an ioreq server to ARM. That 
leaves the question: where do we emulate the PL011? The best place would be 
xenconsoled. But I am not sure what the security impact would be here. 
Are all guest consoles emulated within the same daemon?


I would lean towards the first solution if we implement all the security 
measures I mentioned. However, the second solution would be a good move 
if we decide to implement more devices (e.g. RTC, pflash) in the future.


Do you have any opinions?

Cheers,

[1] 
http://people.linaro.org/~christoffer.dall/VMSystemSpecificationForARM-v2.0-rc1.pdf


--
Julien Grall



Re: [Xen-devel] [PATCH v3 07/24] x86/emul: Clean up the naming of the retire union

2016-11-30 Thread Paul Durrant
> -Original Message-
> From: Andrew Cooper
> Sent: 30 November 2016 14:02
> To: Paul Durrant ; Xen-devel  de...@lists.xen.org>
> Cc: Jan Beulich 
> Subject: Re: [PATCH v3 07/24] x86/emul: Clean up the naming of the retire
> union
> 
> On 30/11/16 13:58, Paul Durrant wrote:
> >> -Original Message-
> >> From: Andrew Cooper [mailto:andrew.coop...@citrix.com]
> >> Sent: 30 November 2016 13:50
> >> To: Xen-devel 
> >> Cc: Andrew Cooper ; Jan Beulich
> >> ; Paul Durrant 
> >> Subject: [PATCH v3 07/24] x86/emul: Clean up the naming of the retire
> union
> >>
> >> Rename byte to raw, as the field being a single byte long is an
> >> implementation
> >> detail.  Make the bitfields part of an anonymous struct to remove the
> .flags
> >> qualifier.  Change the types of the flags to being booleans, to match their
> >> use.
> >>
> > Is it legitimate to use a bool in a bitfield?
> 
> Yes.  Why wouldn't it be?
> 

They always used to be restricted to int or unsigned int. Looks like this was 
relaxed in C99.

> > Also, anonymous unions are not part of C99 AFAIK... are we now stipulating
> something more recent?
> 
> We used gnu99 for as long as I can remember, and we have other examples
> of this pattern already in Xen.
> 

If there's precedent then that's fine.

Reviewed-by: Paul Durrant 

> ~Andrew
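
For reference, the construct under discussion can be tried in isolation. The sketch below mirrors the shape of the union in the patch, but union retire_state and retire_any() are hypothetical names chosen for this example. _Bool bit-fields have been permitted since C99; anonymous structs and unions were standardised in C11 and were a GNU extension before that, which is why gnu99 accepts them.

```c
#include <stdbool.h>
#include <stdint.h>

/* Raw view plus named flags sharing the same storage. */
union retire_state {
    uint8_t raw;
    struct {                /* anonymous: members are accessed as r.hlt etc. */
        bool hlt:1;
        bool mov_ss:1;
        bool sti:1;
    };
};

/* True if any retire flag is set, without naming each flag. */
static bool retire_any(const union retire_state *r)
{
    return r->raw != 0;
}
```

One practical difference from a uint8_t bit-field is that assigning any non-zero value to a bool bit-field stores 1, instead of truncating to the low bit.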



Re: [Xen-devel] [PATCH v3 07/24] x86/emul: Clean up the naming of the retire union

2016-11-30 Thread Andrew Cooper
On 30/11/16 13:58, Paul Durrant wrote:
>> -Original Message-
>> From: Andrew Cooper [mailto:andrew.coop...@citrix.com]
>> Sent: 30 November 2016 13:50
>> To: Xen-devel 
>> Cc: Andrew Cooper ; Jan Beulich
>> ; Paul Durrant 
>> Subject: [PATCH v3 07/24] x86/emul: Clean up the naming of the retire union
>>
>> Rename byte to raw, as the field being a single byte long is an
>> implementation
>> detail.  Make the bitfields part of an anonymous struct to remove the .flags
>> qualifier.  Change the types of the flags to being booleans, to match their
>> use.
>>
> Is it legitimate to use a bool in a bitfield?

Yes.  Why wouldn't it be?

> Also, anonymous unions are not part of C99 AFAIK... are we now stipulating 
> something more recent?

We used gnu99 for as long as I can remember, and we have other examples
of this pattern already in Xen.

~Andrew



Re: [Xen-devel] [PATCH v3 07/24] x86/emul: Clean up the naming of the retire union

2016-11-30 Thread Paul Durrant
> -Original Message-
> From: Andrew Cooper [mailto:andrew.coop...@citrix.com]
> Sent: 30 November 2016 13:50
> To: Xen-devel 
> Cc: Andrew Cooper ; Jan Beulich
> ; Paul Durrant 
> Subject: [PATCH v3 07/24] x86/emul: Clean up the naming of the retire union
> 
> Rename byte to raw, as the field being a single byte long is an
> implementation
> detail.  Make the bitfields part of an anonymous struct to remove the .flags
> qualifier.  Change the types of the flags to being booleans, to match their
> use.
> 

Is it legitimate to use a bool in a bitfield? Also, anonymous unions are not 
part of C99 AFAIK... are we now stipulating something more recent?

  Paul

> No functional change.
> 
> Signed-off-by: Andrew Cooper 
> ---
> CC: Jan Beulich 
> CC: Paul Durrant 
> 
> v3:
>  * New
> ---
>  xen/arch/x86/hvm/emulate.c |  6 +++---
>  xen/arch/x86/x86_emulate/x86_emulate.c | 10 +-
>  xen/arch/x86/x86_emulate/x86_emulate.h | 10 +-
>  3 files changed, 13 insertions(+), 13 deletions(-)
> 
> diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
> index bc259ec..fe62500 100644
> --- a/xen/arch/x86/hvm/emulate.c
> +++ b/xen/arch/x86/hvm/emulate.c
> @@ -1791,13 +1791,13 @@ static int _hvm_emulate_one(struct
> hvm_emulate_ctxt *hvmemul_ctxt,
>  new_intr_shadow = hvmemul_ctxt->intr_shadow;
> 
>  /* MOV-SS instruction toggles MOV-SS shadow, else we just clear it. */
> -if ( hvmemul_ctxt->ctxt.retire.flags.mov_ss )
> +if ( hvmemul_ctxt->ctxt.retire.mov_ss )
>  new_intr_shadow ^= HVM_INTR_SHADOW_MOV_SS;
>  else
>  new_intr_shadow &= ~HVM_INTR_SHADOW_MOV_SS;
> 
>  /* STI instruction toggles STI shadow, else we just clear it. */
> -if ( hvmemul_ctxt->ctxt.retire.flags.sti )
> +if ( hvmemul_ctxt->ctxt.retire.sti )
>  new_intr_shadow ^= HVM_INTR_SHADOW_STI;
>  else
>  new_intr_shadow &= ~HVM_INTR_SHADOW_STI;
> @@ -1808,7 +1808,7 @@ static int _hvm_emulate_one(struct
> hvm_emulate_ctxt *hvmemul_ctxt,
>  hvm_funcs.set_interrupt_shadow(curr, new_intr_shadow);
>  }
> 
> -if ( hvmemul_ctxt->ctxt.retire.flags.hlt &&
> +if ( hvmemul_ctxt->ctxt.retire.hlt &&
>   !hvm_local_events_need_delivery(curr) )
>  {
>  hvm_hlt(regs->eflags);
> diff --git a/xen/arch/x86/x86_emulate/x86_emulate.c
> b/xen/arch/x86/x86_emulate/x86_emulate.c
> index 9c28ed4..416812e 100644
> --- a/xen/arch/x86/x86_emulate/x86_emulate.c
> +++ b/xen/arch/x86/x86_emulate/x86_emulate.c
> @@ -1905,7 +1905,7 @@ x86_decode(
>  state->eip = ctxt->regs->eip;
> 
>  /* Initialise output state in x86_emulate_ctxt */
> -ctxt->retire.byte = 0;
> +ctxt->retire.raw = 0;
> 
>  op_bytes = def_op_bytes = ad_bytes = def_ad_bytes = ctxt-
> >addr_size/8;
>  if ( op_bytes == 8 )
> @@ -2668,7 +2668,7 @@ x86_emulate(
> 
>  case 0x17: /* pop %%ss */
>  src.val = x86_seg_ss;
> -ctxt->retire.flags.mov_ss = 1;
> +ctxt->retire.mov_ss = 1;
>  goto pop_seg;
> 
>  case 0x1e: /* push %%ds */
> @@ -2996,7 +2996,7 @@ x86_emulate(
>  if ( (rc = load_seg(seg, src.val, 0, NULL, ctxt, ops)) != 0 )
>  goto done;
>  if ( seg == x86_seg_ss )
> -ctxt->retire.flags.mov_ss = 1;
> +ctxt->retire.mov_ss = 1;
>  dst.type = OP_NONE;
>  break;
> 
> @@ -4033,7 +4033,7 @@ x86_emulate(
> 
>  case 0xf4: /* hlt */
>  generate_exception_if(!mode_ring0(), EXC_GP, 0);
> -ctxt->retire.flags.hlt = 1;
> +ctxt->retire.hlt = 1;
>  break;
> 
>  case 0xf5: /* cmc */
> @@ -4247,7 +4247,7 @@ x86_emulate(
>  if ( !(_regs.eflags & EFLG_IF) )
>  {
>  _regs.eflags |= EFLG_IF;
> -ctxt->retire.flags.sti = 1;
> +ctxt->retire.sti = 1;
>  }
>  break;
> 
> diff --git a/xen/arch/x86/x86_emulate/x86_emulate.h
> b/xen/arch/x86/x86_emulate/x86_emulate.h
> index b0f0304..ef39601 100644
> --- a/xen/arch/x86/x86_emulate/x86_emulate.h
> +++ b/xen/arch/x86/x86_emulate/x86_emulate.h
> @@ -468,12 +468,12 @@ struct x86_emulate_ctxt
> 
>  /* Retirement state, set by the emulator (valid only on X86EMUL_OKAY).
> */
>  union {
> +uint8_t raw;
>  struct {
> -uint8_t hlt:1;  /* Instruction HLTed. */
> -uint8_t mov_ss:1;   /* Instruction sets MOV-SS irq shadow. */
> -uint8_t sti:1;  /* Instruction sets STI irq shadow. */
> -} flags;
> -uint8_t byte;
> +bool hlt:1;  /* Instruction HLTed. */
> +bool mov_ss:1;   /* Instruction sets MOV-SS irq shadow. */
> +bool sti:1;  /* Instruction sets STI irq shadow. */
> +};
>  } retire;
>  };
> 
> --
> 2.1.4



Re: [Xen-devel] [PATCH] arm/acpi: hide watchdog timer in GTDT table for dom0

2016-11-30 Thread Julien Grall



On 30/11/16 14:43, Shanker Donthineni wrote:

Hi Julien,


Hi Shanker,


On 11/30/2016 08:31 AM, Shanker Donthineni wrote:

On 11/30/2016 04:29 AM, Julien Grall wrote:

Hi Shanker,

On 29/11/2016 02:59, Shanker Donthineni wrote:

Either we have to hide the watchdog timer section in GTDT or emulate
the watchdog timer block for dom0. Otherwise, the system panics when
dom0 accesses its MMIO registers. Xen currently doesn't support
virtualization of the watchdog timer, so hide the watchdog timer section
for dom0.


IMHO, the patch description is not really accurate. You are removing
the platform timer array, which contains the watchdog but also the Block
Timer.

Whilst you mention the watchdog, you don't say a word about the Block Timer.



Sure, I'll include a detailed description that mentions the 'Block Timer'
in the v2 patch.


Taking a step back, DOM0 is not able to use it because it does not
request to map the memory region (this is the behavior expected for
PCI and AMBA devices). So this is a bug in the kernel for me.


Not all the drivers/modules in the Linux kernel are bound to a device
object 'struct device'.  The watchdog timer driver is one of those that
are not; it is neither a PCIe nor an AMBA device.

You can find GTDT watchdog timer code at
https://lkml.org/lkml/2016/7/25/34.



Sorry, right link is https://lkml.org/lkml/2016/7/25/345


Thank you. Looking at patch #9 [1], which adds the watchdog driver, a new 
platform device is created (see the call to 
platform_device_register_simple).


We already have a platform bus notifier for Xen in Linux (see 
drivers/xen/arm-device.c). So I am not sure I understand what the 
problem is here, as the region should be mapped by the notifier. Could you 
give a bit more detail?


Regards,

[1] https://lkml.org/lkml/2016/7/25/353

--
Julien Grall



Re: [Xen-devel] [PATCH] arm/acpi: hide watchdog timer in GTDT table for dom0

2016-11-30 Thread Shanker Donthineni

Hi Julien,


On 11/30/2016 08:31 AM, Shanker Donthineni wrote:

Hi Julien,


On 11/30/2016 04:29 AM, Julien Grall wrote:

Hi Shanker,

On 29/11/2016 02:59, Shanker Donthineni wrote:

Either we have to hide the watchdog timer section in GTDT or emulate
the watchdog timer block for dom0. Otherwise, the system panics when
dom0 accesses its MMIO registers. Xen currently doesn't support
virtualization of the watchdog timer, so hide the watchdog timer section
for dom0.


IMHO, the patch description is not really accurate. You are removing
the platform timer array, which contains the watchdog but also the Block
Timer.

Whilst you mention the watchdog, you don't say a word about the Block Timer.



Sure, I'll include a detailed description that mentions the 'Block Timer' 
in the v2 patch.



Taking a step back, DOM0 is not able to use it because it does not
request to map the memory region (this is the behavior expected for
PCI and AMBA devices). So this is a bug in the kernel for me.


Not all the drivers/modules in the Linux kernel are bound to a device
object 'struct device'.  The watchdog timer driver is one of those that
are not; it is neither a PCIe nor an AMBA device.

You can find GTDT watchdog timer code at 
https://lkml.org/lkml/2016/7/25/34.




Sorry, right link is https://lkml.org/lkml/2016/7/25/345


Assuming this would be fixed, what would be the drawback to give
access to dom0 to the watchdogs?


Absolutely no problem.


My worry with that change is what if in the future we decide to expose
watchdog to DOM0? Linux will still not be ready, unless we have Xen to
map those regions at DOM0 build time. That would break the design we
have for ACPI.



Agree, I was also thinking in the same direction. Let me create a v2
patch that does stage-2 map of MMIO regions at the time of building the
dom0 domain to fix the problem.


Regards,





--
Shanker Donthineni
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm 
Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux 
Foundation Collaborative Project.




Re: [Xen-devel] [PATCH] arm/acpi: hide watchdog timer in GTDT table for dom0

2016-11-30 Thread Julien Grall

On 30/11/16 14:31, Shanker Donthineni wrote:

Hi Julien,


Hi Shanker,



On 11/30/2016 04:29 AM, Julien Grall wrote:

Hi Shanker,

On 29/11/2016 02:59, Shanker Donthineni wrote:

Either we have to hide the watchdog timer section in GTDT or emulate
the watchdog timer block for dom0. Otherwise, the system panics when
dom0 accesses its MMIO registers. Xen currently doesn't support
virtualization of the watchdog timer, so hide the watchdog timer section
for dom0.


IMHO, the patch description is not really accurate. You are removing
the platform timer array, which contains the watchdog but also the Block
Timer.

Whilst you mention the watchdog, you don't say a word about the Block Timer.



Sure, I'll include a detailed description that mentions the 'Block Timer'.


Taking a step back, DOM0 is not able to use it because it does not
request to map the memory region (this is the behavior expected for
PCI and AMBA devices). So this is a bug in the kernel for me.


Not all the drivers/modules in the Linux kernel are bound to a device
object 'struct device'.  The watchdog timer driver is one of those that
are not; it is neither a PCIe nor an AMBA device.

You can find GTDT watchdog timer code at
https://lkml.org/lkml/2016/7/25/34.


This link points to an i2c driver. Did you intend to post a different link?




Assuming this would be fixed, what would be the drawback to give
access to dom0 to the watchdogs?


Absolutely no problem.


My worry with that change is what if in the future we decide to expose
watchdog to DOM0? Linux will still not be ready, unless we have Xen to
map those regions at DOM0 build time. That would break the design we
have for ACPI.



Agree, I was also thinking in the same direction. Let me create a v2
patch that does stage-2 map of MMIO regions at the time of building the
dom0 domain to fix the problem.


You didn't understand my point here. The design we agreed on for ACPI
support is that DOM0 should request the mapping before using any device. 
This is because Xen cannot know in advance which memory attributes to use.


So for me this solution is just a workaround. It does not address the 
real problem, which is that Linux does not request the mapping of the 
memory region before using it. So we will end up with the same problem 
later if a device is neither AMBA nor PCI.


This is easy to solve because watchdogs are described in the static tables. 
What if the device is described in ASL?

Regards,

--
Julien Grall



Re: [Xen-devel] [PATCH v2 06/19] x86/pv: Implement pv_inject_{event, page_fault, hw_exception}()

2016-11-30 Thread Andrew Cooper
On 30/11/16 08:41, Jan Beulich wrote:
  if ( unlikely(null_trap_bounce(v, tb)) )
 -gprintk(XENLOG_WARNING,
 -"Unhandled %s fault/trap [#%d, ec=%04x]\n",
 -trapstr(trapnr), trapnr, regs->error_code);
 +{
 +if ( vector == TRAP_page_fault )
 +{
 +printk("%pv: unhandled page fault (ec=%04X)\n", v, 
 error_code);
 +show_page_walk(event->cr2);
 +
 +if ( unlikely(error_code & PFEC_reserved_bit) )
 +reserved_bit_page_fault(event->cr2, regs);
>>> I think you want to move the show_page_walk() into an else here,
>>> to avoid logging two of them. But then again - why do you move
>>> this behind the null_trap_bounce() check? It had been logged
>>> independently in the original code, and for (imo) a good reason
>>> (reserved bit faults are always a sign of a hypervisor problem
>>> after all).
>> TBH, I found it odd that it was in propagate_page_fault() to start with.
>>
>> It is the kind of thing which should be in the pagefault handler itself,
>> not in the reinjection-to-pv-guests code.
>>
>> Would moving it into fixup_page_fault() be ok?  It should probably go
>> between the hap/shadow and memadd checks, as shadow guests can have
>> reserved bits set.
> That would move it ahead in the flow quite a bit, which I'm not sure
> is a good idea (due to possible [hypothetical] other uses of reserved
> bits). Also note that there already is a call to reserved_bit_page_fault()
> in the !guest_mode() case, so if anything I would see it moved right
> ahead of the pv_inject_page_fault() invocation from do_page_fault().
> (This would then shrink patch size too, as you wouldn't have to move
> around reserved_bit_page_fault() itself.)

Done.  This looks much cleaner.

>
> Btw, looking at the full patch again I notice that the error code
> parameter to pv_inject_page_fault() is signed, which is contrary to
> what I think I recall you saying on the HVM side of things (the
> error code being non-optional here, and hence better being unsigned,
> as X86_EVENT_NO_EC is not allowed).

hvm_inject_page_fault() has always had a signed error code, and the API
is maintained by x86_emul_* and pv_inject_* for consistency.

struct pfinfo currently has an unsigned ec parameter, because all the
existing code uses uint32_t pfec.  In reality, this isn't a problem at
the pfinfo / *_inject_page_fault() boundary.

I have a number of patches focusing on pagefault, and in particular
trying to clean up a lot of misuse of pfec.

~Andrew



Re: [Xen-devel] [PATCH] arm/acpi: hide watchdog timer in GTDT table for dom0

2016-11-30 Thread Shanker Donthineni

Hi Julien,


On 11/30/2016 04:29 AM, Julien Grall wrote:
> Hi Shanker,
>
> On 29/11/2016 02:59, Shanker Donthineni wrote:
>> Either we have to hide the watchdog timer section in GTDT or emulate
>> watchdog timer block for dom0. Otherwise, system gets panic when
>> dom0 accesses its MMIO registers. The current XEN doesn't support
>> virtualization of watchdog timer, so hide the watchdog timer section
>> for dom0.
>
> IMHO, the patch description is not really accurate. You are removing
> the platform timer array that contains watchdog but also Block Timer.
>
> Whilst you mention watchdog, you don't have a word on the Block Timer.

Sure, I'll include a detailed description that also mentions the Block Timer.

> Taking a step back, DOM0 is not able to use it because it does not
> request to map the memory region (this is the behavior expected for
> PCI and AMBA devices). So this is a bug in the kernel for me.

Not all the drivers/modules in the Linux kernel are bound to a device
object ('struct device').  The watchdog timer driver falls into this
category: it is neither a PCIe nor an AMBA device.

You can find the GTDT watchdog timer code at https://lkml.org/lkml/2016/7/25/34.

> Assuming this would be fixed, what would be the drawback to give
> access to dom0 to the watchdogs?

Absolutely no problem.

> My worry with that change is what if in the future we decide to expose
> watchdog to DOM0? Linux will still not be ready, unless we have Xen to
> map those regions at DOM0 build time. That would break the design we
> have for ACPI.

Agreed, I was thinking in the same direction. Let me create a v2
patch that does a stage-2 mapping of the MMIO regions at the time of
building the dom0 domain to fix the problem.



Regards,



--
Shanker Donthineni
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm 
Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux 
Foundation Collaborative Project.




[Xen-devel] [OSSTEST PATCH 1/2] Executive database: set isolation level in Perl

2016-11-30 Thread Ian Jackson
The Perl was lacking SET TRANSACTION ISOLATION LEVEL SERIALIZABLE,
which is sadly not the default.  Currently that does not matter
because of all the table locking, but we are about to abolish that.

Signed-off-by: Ian Jackson 
---
 Osstest/JobDB/Executive.pm | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/Osstest/JobDB/Executive.pm b/Osstest/JobDB/Executive.pm
index 76f3293..557cee1 100644
--- a/Osstest/JobDB/Executive.pm
+++ b/Osstest/JobDB/Executive.pm
@@ -42,6 +42,9 @@ sub begin_work ($$$) { #method
 my ($jd, $dbh,$tables) = @_;
 
 return if $ENV{'OSSTEST_DEBUG_NOSQLLOCK'};
+
+$dbh->do("SET TRANSACTION ISOLATION LEVEL SERIALIZABLE");
+
 foreach my $tab (@$tables) {
 $dbh->do("LOCK TABLE $tab IN EXCLUSIVE MODE");
 }
-- 
2.1.4
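
Once table locks go away, a SERIALIZABLE transaction can be aborted by the database on conflict and has to be rerun by the caller. The following is a minimal sketch of such a retry loop, not osstest code: the callback type, the retry limit and the toy transaction are all hypothetical. The only external fact relied on is that PostgreSQL reports a serialization failure as SQLSTATE 40001.

```c
#include <string.h>

/* Hypothetical callback: runs one whole transaction and returns its
 * SQLSTATE ("00000" on success).  A real caller would wrap
 * BEGIN ... COMMIT in here. */
typedef const char *(*demo_txn_fn)(void *arg);

/* PostgreSQL reports a serialization failure as SQLSTATE 40001. */
#define DEMO_SQLSTATE_SERIALIZATION_FAILURE "40001"

/* Run txn until it succeeds, fails for another reason, or max_tries is
 * reached.  Returns 0 on success, -1 otherwise; *tries receives the
 * number of attempts made. */
static int demo_run_serializable(demo_txn_fn txn, void *arg,
                                 int max_tries, int *tries)
{
    for (*tries = 1; *tries <= max_tries; (*tries)++) {
        const char *state = txn(arg);

        if (strcmp(state, "00000") == 0)
            return 0;
        if (strcmp(state, DEMO_SQLSTATE_SERIALIZATION_FAILURE) != 0)
            return -1;          /* not a conflict: do not retry */
    }
    *tries = max_tries;
    return -1;
}

/* Toy transaction for demonstration: conflicts a fixed number of times. */
static const char *demo_flaky_txn(void *arg)
{
    int *conflicts_left = arg;

    if (*conflicts_left > 0) {
        (*conflicts_left)--;
        return DEMO_SQLSTATE_SERIALIZATION_FAILURE;
    }
    return "00000";
}
```

The point of the sketch is that only conflict errors are retried; any other failure is returned to the caller immediately.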




[Xen-devel] [OSSTEST PATCH 2/2] Executive database: stub out use of LOCK TABLES

2016-11-30 Thread Ian Jackson
We want to improve database performance, and one of the problems is
excessive locking.  PostgreSQL now has predicate locking, and we
have, we think, eliminated all the places that do not handle a
database transaction failure.  So we can rely on optimistic
concurrency.

So, eliminate all uses of LOCK TABLES.

However, I'm not quite sure that all of the above is actually true -
particularly, with relation to our own error handling.  So, we want to
leave ourselves an escape hatch and an easy reversion path.

The approach adopted is to change the semantics of the transaction
support routines (one in Perl, and one in Tcl) so that the meaning of
all the existing call sites is changed to "do not lock any tables".

But the facility for table locking is retained and any call sites
which still need locking or fixing can use a new parameter format to
say they actually want the locking.

Hopefully this will turn out to be unnecessary.  In that case, in due
course, we can strip out all the locking machinery, abolish all the
corresponding parameters, and so on.

Signed-off-by: Ian Jackson 
---
 Osstest/JobDB/Executive.pm | 8 ++++++++
 tcl/JobDB-Executive.tcl    | 2 ++
 2 files changed, 10 insertions(+)

diff --git a/Osstest/JobDB/Executive.pm b/Osstest/JobDB/Executive.pm
index 557cee1..c02ff19 100644
--- a/Osstest/JobDB/Executive.pm
+++ b/Osstest/JobDB/Executive.pm
@@ -45,6 +45,14 @@ sub begin_work ($$$) { #method
 
 $dbh->do("SET TRANSACTION ISOLATION LEVEL SERIALIZABLE");
 
+# $tables used to be [qw(some tables)]
+#  we ignore that for now
+# callers can pass for $tables [[qw(some tables)]]
+#  to override this ignorement, in case it causes trouble
+return unless @$tables && ref $tables->[0];
+die '[[qw(some tables)], something] ?' unless @$tables == 1;
+$tables = $tables->[0];
+
 foreach my $tab (@$tables) {
 $dbh->do("LOCK TABLE $tab IN EXCLUSIVE MODE");
 }
diff --git a/tcl/JobDB-Executive.tcl b/tcl/JobDB-Executive.tcl
index 6225bd9..05677e3 100644
--- a/tcl/JobDB-Executive.tcl
+++ b/tcl/JobDB-Executive.tcl
@@ -258,6 +258,8 @@ proc db-execute-array {arrayvar stmt {body {}}} {
 
 proc lock-tables {tables} {
 # must be inside transaction
+set first [lshift tables]
+if {[string compare $first REALLY]} return
 foreach tab $tables {
 db-execute "
LOCK TABLE $tab IN EXCLUSIVE MODE
-- 
2.1.4
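
The opt-in convention in the Tcl hunk (the first list element must be the word REALLY for any locking to happen) can be sketched in C as follows. This only models the decision logic; the function name and the table names in the usage note are hypothetical, and treating an empty list as "lock nothing" is an assumption of the sketch.

```c
#include <stddef.h>
#include <string.h>

/* Models the Tcl hunk: only a list whose first element is the literal
 * word REALLY requests locking; the remaining elements are the tables.
 * Returns the number of tables to lock and points *out at the first one.
 * Assumption: an empty list locks nothing. */
static size_t demo_tables_to_lock(const char *const *tables, size_t n,
                                  const char *const **out)
{
    *out = NULL;
    if (n == 0 || strcmp(tables[0], "REALLY") != 0)
        return 0;               /* legacy call site: lock nothing */
    *out = tables + 1;
    return n - 1;
}
```

With this shape, an existing call site passing {"flights", "jobs"} silently stops locking, while {"REALLY", "flights", "jobs"} still locks both tables, so reverting an individual call site is a one-word change.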




[Xen-devel] [PATCH v3 19/24] x86/hvm: Extend the hvm_copy_*() API with a pagefault_info pointer

2016-11-30 Thread Andrew Cooper
which is filled with pagefault information should one occur.

No functional change.

Signed-off-by: Andrew Cooper 
Reviewed-by: Jan Beulich 
Acked-by: Tim Deegan 
Reviewed-by: Paul Durrant 
Reviewed-by: Kevin Tian 
---
 xen/arch/x86/hvm/emulate.c|  8 ---
 xen/arch/x86/hvm/hvm.c| 49 +--
 xen/arch/x86/hvm/vmx/vvmx.c   |  9 ---
 xen/arch/x86/mm/shadow/common.c   |  5 ++--
 xen/include/asm-x86/hvm/support.h | 23 +-
 5 files changed, 63 insertions(+), 31 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index 614e182..41f689e 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -770,6 +770,7 @@ static int __hvmemul_read(
 struct hvm_emulate_ctxt *hvmemul_ctxt)
 {
 struct vcpu *curr = current;
+pagefault_info_t pfinfo;
 unsigned long addr, reps = 1;
 uint32_t pfec = PFEC_page_present;
 struct hvm_vcpu_io *vio = &curr->arch.hvm_vcpu.hvm_io;
@@ -790,8 +791,8 @@ static int __hvmemul_read(
 pfec |= PFEC_user_mode;
 
 rc = ((access_type == hvm_access_insn_fetch) ?
-  hvm_fetch_from_guest_virt(p_data, addr, bytes, pfec) :
-  hvm_copy_from_guest_virt(p_data, addr, bytes, pfec));
+  hvm_fetch_from_guest_virt(p_data, addr, bytes, pfec, &pfinfo) :
+  hvm_copy_from_guest_virt(p_data, addr, bytes, pfec, &pfinfo));
 
 switch ( rc )
 {
@@ -878,6 +879,7 @@ static int hvmemul_write(
 struct hvm_emulate_ctxt *hvmemul_ctxt =
 container_of(ctxt, struct hvm_emulate_ctxt, ctxt);
 struct vcpu *curr = current;
+pagefault_info_t pfinfo;
 unsigned long addr, reps = 1;
 uint32_t pfec = PFEC_page_present | PFEC_write_access;
 struct hvm_vcpu_io *vio = &curr->arch.hvm_vcpu.hvm_io;
@@ -896,7 +898,7 @@ static int hvmemul_write(
  (hvmemul_ctxt->seg_reg[x86_seg_ss].attr.fields.dpl == 3) )
 pfec |= PFEC_user_mode;
 
-rc = hvm_copy_to_guest_virt(addr, p_data, bytes, pfec);
+rc = hvm_copy_to_guest_virt(addr, p_data, bytes, pfec, &pfinfo);
 
 switch ( rc )
 {
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index bdfd94e..390f76d 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -2859,6 +2859,7 @@ void hvm_task_switch(
 struct desc_struct *optss_desc = NULL, *nptss_desc = NULL, tss_desc;
 bool_t otd_writable, ntd_writable;
 unsigned long eflags;
+pagefault_info_t pfinfo;
 int exn_raised, rc;
 struct {
 u16 back_link,__blh;
@@ -2925,7 +2926,7 @@ void hvm_task_switch(
 }
 
 rc = hvm_copy_from_guest_virt(
-&tss, prev_tr.base, sizeof(tss), PFEC_page_present);
+&tss, prev_tr.base, sizeof(tss), PFEC_page_present, &pfinfo);
 if ( rc != HVMCOPY_okay )
 goto out;
 
@@ -2963,12 +2964,12 @@ void hvm_task_switch(
 &tss.eip,
 offsetof(typeof(tss), trace) -
 offsetof(typeof(tss), eip),
-PFEC_page_present);
+PFEC_page_present, &pfinfo);
 if ( rc != HVMCOPY_okay )
 goto out;
 
 rc = hvm_copy_from_guest_virt(
-&tss, tr.base, sizeof(tss), PFEC_page_present);
+&tss, tr.base, sizeof(tss), PFEC_page_present, &pfinfo);
 /*
  * Note: The HVMCOPY_gfn_shared case could be optimised, if the callee
  * functions knew we want RO access.
@@ -3008,7 +3009,8 @@ void hvm_task_switch(
 tss.back_link = prev_tr.sel;
 
 rc = hvm_copy_to_guest_virt(tr.base + offsetof(typeof(tss), back_link),
-&tss.back_link, sizeof(tss.back_link), 0);
+&tss.back_link, sizeof(tss.back_link), 0,
+&pfinfo);
 if ( rc == HVMCOPY_bad_gva_to_gfn )
 exn_raised = 1;
 else if ( rc != HVMCOPY_okay )
@@ -3045,7 +3047,8 @@ void hvm_task_switch(
 16 << segr.attr.fields.db,
 &linear_addr) )
 {
-rc = hvm_copy_to_guest_virt(linear_addr, &errcode, opsz, 0);
+rc = hvm_copy_to_guest_virt(linear_addr, &errcode, opsz, 0,
+&pfinfo);
 if ( rc == HVMCOPY_bad_gva_to_gfn )
 exn_raised = 1;
 else if ( rc != HVMCOPY_okay )
@@ -3068,7 +3071,8 @@ void hvm_task_switch(
 #define HVMCOPY_phys   (0u<<2)
 #define HVMCOPY_virt   (1u<<2)
 static enum hvm_copy_result __hvm_copy(
-void *buf, paddr_t addr, int size, unsigned int flags, uint32_t pfec)
+void *buf, paddr_t addr, int size, unsigned int flags, uint32_t pfec,
+pagefault_info_t *pfinfo)
 {
 struct vcpu *curr = current;
 unsigned long gfn;
@@ -3109,7 +3113,15 @@ static enum hvm_copy_result __hvm_copy(
 if ( pfec & PFEC_page_shared )
 

[Xen-devel] [PATCH v3 14/24] x86/vmx: Use hvm_{get, set}_segment_register() rather than vmx_{get, set}_segment_register()

2016-11-30 Thread Andrew Cooper
No functional change at this point, but this is a prerequisite for forthcoming
functional changes.

Make vmx_get_segment_register() private to vmx.c like all the other Vendor
get/set functions.

Signed-off-by: Andrew Cooper 
Reviewed-by: Jan Beulich 
Reviewed-by: George Dunlap 
Acked-by: Kevin Tian 
---
 xen/arch/x86/hvm/vmx/vmx.c| 14 +++---
 xen/arch/x86/hvm/vmx/vvmx.c   |  6 +++---
 xen/include/asm-x86/hvm/vmx/vmx.h |  2 --
 3 files changed, 10 insertions(+), 12 deletions(-)

diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 31f08d2..377c789 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -940,8 +940,8 @@ static void vmx_ctxt_switch_to(struct vcpu *v)
 .fields = { .type = 0xb, .s = 0, .dpl = 0, .p = 1, .avl = 0,\
 .l = 0, .db = 0, .g = 0, .pad = 0 } }).bytes)
 
-void vmx_get_segment_register(struct vcpu *v, enum x86_segment seg,
-  struct segment_register *reg)
+static void vmx_get_segment_register(struct vcpu *v, enum x86_segment seg,
+ struct segment_register *reg)
 {
 unsigned long attr = 0, sel = 0, limit;
 
@@ -1504,19 +1504,19 @@ static void vmx_update_guest_cr(struct vcpu *v, unsigned int cr)
  * Need to read them all either way, as realmode reads can update
  * the saved values we'll use when returning to prot mode. */
 for ( s = 0; s < ARRAY_SIZE(reg); s++ )
-vmx_get_segment_register(v, s, &reg[s]);
+hvm_get_segment_register(v, s, &reg[s]);
 v->arch.hvm_vmx.vmx_realmode = realmode;
 
 if ( realmode )
 {
 for ( s = 0; s < ARRAY_SIZE(reg); s++ )
-vmx_set_segment_register(v, s, &reg[s]);
+hvm_set_segment_register(v, s, &reg[s]);
 }
 else 
 {
 for ( s = 0; s < ARRAY_SIZE(reg); s++ )
 if ( !(v->arch.hvm_vmx.vm86_segment_mask & (1<<s)) )
-vmx_set_segment_register(
+hvm_set_segment_register(
 v, s, &v->arch.hvm_vmx.vm86_saved_seg[s]);
 }
 
@@ -3907,7 +3907,7 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs)
 gdprintk(XENLOG_WARNING, "Bad vmexit (reason %#lx)\n",
  exit_reason);
 
-vmx_get_segment_register(v, x86_seg_ss, &ss);
+hvm_get_segment_register(v, x86_seg_ss, &ss);
 if ( ss.attr.fields.dpl )
 hvm_inject_hw_exception(TRAP_invalid_op,
 X86_EVENT_NO_EC);
@@ -3939,7 +3939,7 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs)
 
 gprintk(XENLOG_WARNING, "Bad rIP %lx for mode %u\n", regs->rip, mode);
 
-vmx_get_segment_register(v, x86_seg_ss, &ss);
+hvm_get_segment_register(v, x86_seg_ss, &ss);
 if ( ss.attr.fields.dpl )
 {
 __vmread(VM_ENTRY_INTR_INFO, &intr_info);
diff --git a/xen/arch/x86/hvm/vmx/vvmx.c b/xen/arch/x86/hvm/vmx/vvmx.c
index efaf54c..bcc4a97 100644
--- a/xen/arch/x86/hvm/vmx/vvmx.c
+++ b/xen/arch/x86/hvm/vmx/vvmx.c
@@ -360,7 +360,7 @@ static int vmx_inst_check_privilege(struct cpu_user_regs *regs, int vmxop_check)
 else if ( !vcpu_2_nvmx(v).vmxon_region_pa )
 goto invalid_op;
 
-vmx_get_segment_register(v, x86_seg_cs, &cs);
+hvm_get_segment_register(v, x86_seg_cs, &cs);
 
 if ( (regs->eflags & X86_EFLAGS_VM) ||
  (hvm_long_mode_enabled(v) && cs.attr.fields.l == 0) )
@@ -419,13 +419,13 @@ static int decode_vmx_inst(struct cpu_user_regs *regs,
 
 if ( hvm_long_mode_enabled(v) )
 {
-vmx_get_segment_register(v, x86_seg_cs, &seg);
+hvm_get_segment_register(v, x86_seg_cs, &seg);
 mode_64bit = seg.attr.fields.l;
 }
 
 if ( info.fields.segment > VMX_SREG_GS )
 goto gp_fault;
-vmx_get_segment_register(v, sreg_to_index[info.fields.segment], &seg);
+hvm_get_segment_register(v, sreg_to_index[info.fields.segment], &seg);
 seg_base = seg.base;
 
 base = info.fields.base_reg_invalid ? 0 :
diff --git a/xen/include/asm-x86/hvm/vmx/vmx.h b/xen/include/asm-x86/hvm/vmx/vmx.h
index 4cdd9b1..0e5902d 100644
--- a/xen/include/asm-x86/hvm/vmx/vmx.h
+++ b/xen/include/asm-x86/hvm/vmx/vmx.h
@@ -550,8 +550,6 @@ static inline int __vmxon(u64 addr)
 return rc;
 }
 
-void vmx_get_segment_register(struct vcpu *, enum x86_segment,
-  struct segment_register *);
 void vmx_inject_extint(int trap, uint8_t source);
 void vmx_inject_nmi(void);
 
-- 
2.1.4




[Xen-devel] [PATCH v3 16/24] x86/emul: Avoid raising faults behind the emulators back

2016-11-30 Thread Andrew Cooper
Introduce a new x86_emul_pagefault() similar to x86_emul_hw_exception(), and
use this instead of hvm_inject_page_fault() from emulation codepaths.

Signed-off-by: Andrew Cooper 
Reviewed-by: Paul Durrant 
Reviewed-by: Jan Beulich 
---
v2:
 * Change x86_emul_pagefault()'s error_code parameter to being signed
 * Split out shadow changes
---
 xen/arch/x86/hvm/emulate.c |  4 ++--
 xen/arch/x86/x86_emulate/x86_emulate.h | 13 +
 2 files changed, 15 insertions(+), 2 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index 4b8c9a0..614e182 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -459,7 +459,7 @@ static int hvmemul_linear_to_phys(
 {
 if ( pfec & (PFEC_page_paged | PFEC_page_shared) )
 return X86EMUL_RETRY;
-hvm_inject_page_fault(pfec, addr);
+x86_emul_pagefault(pfec, addr, &hvmemul_ctxt->ctxt);
 return X86EMUL_EXCEPTION;
 }
 
@@ -483,7 +483,7 @@ static int hvmemul_linear_to_phys(
 ASSERT(!reverse);
 if ( npfn != gfn_x(INVALID_GFN) )
 return X86EMUL_UNHANDLEABLE;
-hvm_inject_page_fault(pfec, addr & PAGE_MASK);
+x86_emul_pagefault(pfec, addr & PAGE_MASK, &hvmemul_ctxt->ctxt);
 return X86EMUL_EXCEPTION;
 }
 *reps = done;
diff --git a/xen/arch/x86/x86_emulate/x86_emulate.h b/xen/arch/x86/x86_emulate/x86_emulate.h
index 3c0b25d..8aa4b0b 100644
--- a/xen/arch/x86/x86_emulate/x86_emulate.h
+++ b/xen/arch/x86/x86_emulate/x86_emulate.h
@@ -648,6 +648,19 @@ static inline void x86_emul_hw_exception(
 ctxt->event_pending = true;
 }
 
+static inline void x86_emul_pagefault(
+int error_code, unsigned long cr2, struct x86_emulate_ctxt *ctxt)
+{
+ASSERT(!ctxt->event_pending);
+
+ctxt->event.vector = 14; /* TRAP_page_fault */
+ctxt->event.type = X86_EVENTTYPE_HW_EXCEPTION;
+ctxt->event.error_code = error_code;
+ctxt->event.cr2 = cr2;
+
+ctxt->event_pending = true;
+}
+
 static inline void x86_emul_software_event(
 enum x86_swint_type type, uint8_t vector, uint8_t insn_len,
 struct x86_emulate_ctxt *ctxt)
-- 
2.1.4
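
The pattern this patch introduces, recording the fault in the emulation context and leaving delivery to the caller, can be sketched in isolation as below. All demo_* names, the return codes and the "unmapped" address are hypothetical; only the shape of the helper follows x86_emul_pagefault() from the hunk above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_OKAY       0
#define DEMO_EXCEPTION  1

struct demo_event { uint8_t vector; int error_code; unsigned long cr2; };

struct demo_ctxt {
    bool event_pending;
    struct demo_event event;
};

/* Record the fault in the context; nothing is delivered to the guest yet. */
static void demo_emul_pagefault(int ec, unsigned long cr2, struct demo_ctxt *c)
{
    assert(!c->event_pending);      /* at most one pending event */
    c->event.vector = 14;           /* #PF */
    c->event.error_code = ec;
    c->event.cr2 = cr2;
    c->event_pending = true;
}

/* Hypothetical emulation step that faults on one fixed "unmapped" address. */
static int demo_emulate_access(unsigned long addr, struct demo_ctxt *c)
{
    if (addr == 0xdead000UL) {
        demo_emul_pagefault(0x2 /* write to a not-present page */, addr, c);
        return DEMO_EXCEPTION;
    }
    return DEMO_OKAY;
}

/* The caller owns the decision; here it merely counts injections. */
static int demo_handle(unsigned long addr, int *injected)
{
    struct demo_ctxt c = { .event_pending = false };
    int rc = demo_emulate_access(addr, &c);

    if (rc == DEMO_EXCEPTION && c.event_pending)
        (*injected)++;              /* stand-in for the real injection call */
    return rc;
}
```

Because the event only lives in the context until the caller acts on it, a caller that wants to discard or replace the fault can do so without having to undo an injection that already happened.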




[Xen-devel] [PATCH v3 10/24] x86/emul: Always use fault semantics for software events

2016-11-30 Thread Andrew Cooper
The common case is already using fault semantics out of x86_emulate(), as that
is how VT-x/SVM expects to inject the event (given suitable hardware support).

However, x86_emulate() returning X86EMUL_EXCEPTION and also completing a
register writeback is problematic for callers.

Switch the logic to always using fault semantics, and leave svm_inject_trap()
to fix up %eip if necessary.
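
As a rough model of the resulting fixup (a sketch only, with hypothetical demo_* names and a simplified state structure rather than the VMCB and register frame), the three software event types are handled like this once the emulator always reports fault semantics:

```c
#include <stdbool.h>
#include <stdint.h>

enum demo_type { DEMO_SW_INTERRUPT, DEMO_PRI_SW_EXCEPTION, DEMO_SW_EXCEPTION };

struct demo_state {
    uint64_t eip;       /* guest instruction pointer (fault semantics on entry) */
    uint64_t nextrip;   /* only meaningful when has_nrips is true */
};

/* Move from fault semantics to what the injection path needs. */
static void demo_fixup(struct demo_state *s, enum demo_type type,
                       unsigned int insn_len, bool has_nrips)
{
    switch (type) {
    case DEMO_PRI_SW_EXCEPTION:     /* icebp: always emulated */
        s->eip += insn_len;
        if (has_nrips)
            s->nextrip = s->eip;
        break;

    case DEMO_SW_INTERRUPT:         /* int $n */
    case DEMO_SW_EXCEPTION:         /* int3, into */
        if (has_nrips)
            s->nextrip = s->eip + insn_len;   /* hardware advances for us */
        else
            s->eip += insn_len;               /* emulate the trap frame */
        break;
    }
}
```
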

Signed-off-by: Andrew Cooper 
---
CC: Jan Beulich 
CC: Tim Deegan 
CC: Boris Ostrovsky 
CC: Suravee Suthikulpanit 

v3:
 * New
---
 xen/arch/x86/hvm/svm/svm.c | 44 --
 xen/arch/x86/x86_emulate/x86_emulate.c |  2 --
 xen/arch/x86/x86_emulate/x86_emulate.h |  8 ++-
 3 files changed, 33 insertions(+), 21 deletions(-)

diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
index 912d871..65eeab7 100644
--- a/xen/arch/x86/hvm/svm/svm.c
+++ b/xen/arch/x86/hvm/svm/svm.c
@@ -1209,7 +1209,7 @@ static void svm_inject_event(const struct x86_event *event)
 struct vmcb_struct *vmcb = curr->arch.hvm_svm.vmcb;
 eventinj_t eventinj = vmcb->eventinj;
 struct x86_event _event = *event;
-const struct cpu_user_regs *regs = guest_cpu_user_regs();
+struct cpu_user_regs *regs = guest_cpu_user_regs();
 
 switch ( _event.vector )
 {
@@ -1242,27 +1242,38 @@ static void svm_inject_event(const struct x86_event *event)
 eventinj.fields.v = 1;
 eventinj.fields.vector = _event.vector;
 
-/* Refer to AMD Vol 2: System Programming, 15.20 Event Injection. */
+/*
+ * Refer to AMD Vol 2: System Programming, 15.20 Event Injection.
+ *
+ * On hardware lacking NextRIP support, and all hardware in the case of
+ * icebp, software events with trap semantics need emulating, so %eip in
+ * the trap frame points after the instruction.
+ *
+ * The x86 emulator (if requested by the x86_swint_emulate_* choice) will
+ * have performed checks such as presence/dpl/etc and believes that the
+ * event injection will succeed without faulting.
+ *
+ * The x86 emulator will always provide fault semantics for software
+ * events, with _trap.insn_len set appropriately.  If the injection
+ * requires emulation, move %eip forwards at this point.
+ */
 switch ( _event.type )
 {
 case X86_EVENTTYPE_SW_INTERRUPT: /* int $n */
-/*
- * Software interrupts (type 4) cannot be properly injected if the
- * processor doesn't support NextRIP.  Without NextRIP, the emulator
- * will have performed DPL and presence checks for us, and will have
- * moved eip forward if appropriate.
- */
 if ( cpu_has_svm_nrips )
 vmcb->nextrip = regs->eip + _event.insn_len;
+else
+regs->eip += _event.insn_len;
 eventinj.fields.type = X86_EVENTTYPE_SW_INTERRUPT;
 break;
 
 case X86_EVENTTYPE_PRI_SW_EXCEPTION: /* icebp */
 /*
- * icebp's injection must always be emulated.  Software injection help
- * in x86_emulate has moved eip forward, but NextRIP (if used) still
- * needs setting or execution will resume from 0.
+ * icebp's injection must always be emulated, as hardware does not
+ * special case HW_EXCEPTION with vector 1 (#DB) as having trap
+ * semantics.
  */
+regs->eip += _event.insn_len;
 if ( cpu_has_svm_nrips )
 vmcb->nextrip = regs->eip;
 eventinj.fields.type = X86_EVENTTYPE_HW_EXCEPTION;
@@ -1270,16 +1281,13 @@ static void svm_inject_event(const struct x86_event *event)
 
 case X86_EVENTTYPE_SW_EXCEPTION: /* int3, into */
 /*
- * The AMD manual states that .type=3 (HW exception), .vector=3 or 4,
- * will perform DPL checks.  Experimentally, DPL and presence checks
- * are indeed performed, even without NextRIP support.
- *
- * However without NextRIP support, the event injection still needs
- * fully emulating to get the correct eip in the trap frame, yet get
- * the correct faulting eip should a fault occur.
+ * Hardware special cases HW_EXCEPTION with vectors 3 and 4 as having
+ * trap semantics, and will perform DPL checks.
  */
 if ( cpu_has_svm_nrips )
 vmcb->nextrip = regs->eip + _event.insn_len;
+else
+regs->eip += _event.insn_len;
 eventinj.fields.type = X86_EVENTTYPE_HW_EXCEPTION;
 break;
 
diff --git a/xen/arch/x86/x86_emulate/x86_emulate.c b/xen/arch/x86/x86_emulate/x86_emulate.c
index e4643a3..8a1f1f5 100644
--- a/xen/arch/x86/x86_emulate/x86_emulate.c
+++ b/xen/arch/x86/x86_emulate/x86_emulate.c
@@ -1694,8 +1694,6 @@ static int inject_swint(enum x86_swint_type type,
 goto raise_exn;
 }
 }
-
-

[Xen-devel] [PATCH v3 21/24] x86/hvm: Rename hvm_copy_*_guest_virt() to hvm_copy_*_guest_linear()

2016-11-30 Thread Andrew Cooper
The functions use linear addresses, not virtual addresses, as no segmentation
is used.  (Lots of other code in Xen makes this mistake.)

Signed-off-by: Andrew Cooper 
Acked-by: Tim Deegan 
Reviewed-by: Kevin Tian 
Reviewed-by: Jan Beulich 
Reviewed-by: Paul Durrant 
---
 xen/arch/x86/hvm/emulate.c| 12 
 xen/arch/x86/hvm/hvm.c| 60 +++
 xen/arch/x86/hvm/vmx/vvmx.c   |  6 ++--
 xen/arch/x86/mm/shadow/common.c   |  8 +++---
 xen/include/asm-x86/hvm/support.h | 14 -
 5 files changed, 50 insertions(+), 50 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index 321c5aa..035b654 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -791,8 +791,8 @@ static int __hvmemul_read(
 pfec |= PFEC_user_mode;
 
 rc = ((access_type == hvm_access_insn_fetch) ?
-  hvm_fetch_from_guest_virt(p_data, addr, bytes, pfec, &pfinfo) :
-  hvm_copy_from_guest_virt(p_data, addr, bytes, pfec, &pfinfo));
+  hvm_fetch_from_guest_linear(p_data, addr, bytes, pfec, &pfinfo) :
+  hvm_copy_from_guest_linear(p_data, addr, bytes, pfec, &pfinfo));
 
 switch ( rc )
 {
@@ -898,7 +898,7 @@ static int hvmemul_write(
  (hvmemul_ctxt->seg_reg[x86_seg_ss].attr.fields.dpl == 3) )
 pfec |= PFEC_user_mode;
 
-rc = hvm_copy_to_guest_virt(addr, p_data, bytes, pfec, &pfinfo);
+rc = hvm_copy_to_guest_linear(addr, p_data, bytes, pfec, &pfinfo);
 
 switch ( rc )
 {
@@ -1937,9 +1937,9 @@ void hvm_emulate_init_per_insn(
 hvm_access_insn_fetch,
 hvmemul_ctxt->ctxt.addr_size,
 &addr) &&
- hvm_fetch_from_guest_virt(hvmemul_ctxt->insn_buf, addr,
-   sizeof(hvmemul_ctxt->insn_buf),
-   pfec, NULL) == HVMCOPY_okay) ?
+ hvm_fetch_from_guest_linear(hvmemul_ctxt->insn_buf, addr,
+ sizeof(hvmemul_ctxt->insn_buf),
+ pfec, NULL) == HVMCOPY_okay) ?
 sizeof(hvmemul_ctxt->insn_buf) : 0;
 }
 else
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 5eae06a..37eaee2 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -2925,7 +2925,7 @@ void hvm_task_switch(
 goto out;
 }
 
-rc = hvm_copy_from_guest_virt(
+rc = hvm_copy_from_guest_linear(
 &tss, prev_tr.base, sizeof(tss), PFEC_page_present, &pfinfo);
 if ( rc != HVMCOPY_okay )
 goto out;
@@ -2960,15 +2960,15 @@ void hvm_task_switch(
 hvm_get_segment_register(v, x86_seg_ldtr, &segr);
 tss.ldt = segr.sel;
 
-rc = hvm_copy_to_guest_virt(prev_tr.base + offsetof(typeof(tss), eip),
-&tss.eip,
-offsetof(typeof(tss), trace) -
-offsetof(typeof(tss), eip),
-PFEC_page_present, &pfinfo);
+rc = hvm_copy_to_guest_linear(prev_tr.base + offsetof(typeof(tss), eip),
+  &tss.eip,
+  offsetof(typeof(tss), trace) -
+  offsetof(typeof(tss), eip),
+  PFEC_page_present, &pfinfo);
 if ( rc != HVMCOPY_okay )
 goto out;
 
-rc = hvm_copy_from_guest_virt(
+rc = hvm_copy_from_guest_linear(
 &tss, tr.base, sizeof(tss), PFEC_page_present, &pfinfo);
 /*
  * Note: The HVMCOPY_gfn_shared case could be optimised, if the callee
@@ -3008,9 +3008,9 @@ void hvm_task_switch(
 regs->eflags |= X86_EFLAGS_NT;
 tss.back_link = prev_tr.sel;
 
-rc = hvm_copy_to_guest_virt(tr.base + offsetof(typeof(tss), back_link),
-&tss.back_link, sizeof(tss.back_link), 0,
-&pfinfo);
+rc = hvm_copy_to_guest_linear(tr.base + offsetof(typeof(tss), back_link),
+  &tss.back_link, sizeof(tss.back_link), 0,
+  &pfinfo);
 if ( rc == HVMCOPY_bad_gva_to_gfn )
 exn_raised = 1;
 else if ( rc != HVMCOPY_okay )
@@ -3047,8 +3047,8 @@ void hvm_task_switch(
 16 << segr.attr.fields.db,
 &linear_addr) )
 {
-rc = hvm_copy_to_guest_virt(linear_addr, &errcode, opsz, 0,
-&pfinfo);
+rc = hvm_copy_to_guest_linear(linear_addr, &errcode, opsz, 0,
+  &pfinfo);
 if ( rc == HVMCOPY_bad_gva_to_gfn )
 exn_raised = 1;
 else if ( rc != HVMCOPY_okay )
@@ -3067,7 +3067,7 @@ void hvm_task_switch(
 #define HVMCOPY_from_guest (0u<<0)
 #define HVMCOPY_to_guest   (1u<<0)
 

[Xen-devel] [PATCH v3 15/24] x86/hvm: Reposition the modification of raw segment data from the VMCB/VMCS

2016-11-30 Thread Andrew Cooper
Intel VT-x and AMD SVM provide access to the full segment descriptor cache via
fields in the VMCB/VMCS.  However, the bits which are actually checked by
hardware and preserved across vmentry/exit are inconsistent, and the vendor
accessor functions perform inconsistent modification to the raw values.

Convert {svm,vmx}_{get,set}_segment_register() into raw accessors, and alter
hvm_{get,set}_segment_register() to cook the values consistently.  This allows
the common emulation code to better rely on finding architecturally-expected
values.

While moving the code performing the cooking, fix the %ss.db quirk.  A NULL
selector is indicated by .p being clear, not the value of the .type field.

This does cause some functional changes because of the modifications being
applied uniformly.  A side effect of this fixes latent bugs where
vmx_set_segment_register() didn't correctly fix up .G for segments, and
inconsistent fixing up of the GDTR/IDTR limits.
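
Two of the cooking rules described above are small enough to sketch on their own. This is a simplified illustration with a hypothetical demo_seg structure, not the struct segment_register layout used by Xen.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, flattened view of a cached segment descriptor. */
struct demo_seg {
    uint32_t limit;
    bool p, g, db;
};

/* For present segments, recompute G from the limit: a limit that does not
 * fit in 20 bits can only have come from a page-granular descriptor. */
static void demo_cook_g(struct demo_seg *reg)
{
    if (reg->p)
        reg->g = (reg->limit >> 20) != 0;
}

/* %ss loaded with a NULL selector (.p clear) must not report a stale DB. */
static void demo_cook_ss(struct demo_seg *ss)
{
    if (!ss->p)
        ss->db = false;
}
```

Keying the %ss rule on .p rather than on the type field is the quirk fix mentioned in the description.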

Signed-off-by: Andrew Cooper 
Reviewed-by: Kevin Tian 
Reviewed-by: Jan Beulich 
Reviewed-by: Boris Ostrovsky 
---
v2:
 * Clarify the change of the %ss.db quirk
 * Rework %tr typecheck logic
 * Swap a break for return following ASSERT_UNREACHABLE()
---
 xen/arch/x86/hvm/hvm.c| 154 ++
 xen/arch/x86/hvm/svm/svm.c|  20 +-
 xen/arch/x86/hvm/vmx/vmx.c|   6 +-
 xen/include/asm-x86/desc.h|   6 ++
 xen/include/asm-x86/hvm/hvm.h |  17 ++---
 5 files changed, 167 insertions(+), 36 deletions(-)

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index ef83100..bdfd94e 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -6051,6 +6051,160 @@ void hvm_domain_soft_reset(struct domain *d)
 }
 
 /*
+ * Segment caches in VMCB/VMCS are inconsistent about which bits are checked,
+ * important, and preserved across vmentry/exit.  Cook the values to make them
+ * closer to what is architecturally expected from entries in the segment
+ * cache.
+ */
+void hvm_get_segment_register(struct vcpu *v, enum x86_segment seg,
+  struct segment_register *reg)
+{
+hvm_funcs.get_segment_register(v, seg, reg);
+
+switch ( seg )
+{
+case x86_seg_ss:
+/* SVM may retain %ss.DB when %ss is loaded with a NULL selector. */
+if ( !reg->attr.fields.p )
+reg->attr.fields.db = 0;
+break;
+
+case x86_seg_tr:
+/*
+ * SVM doesn't track %tr.B. Architecturally, a loaded TSS segment will
+ * always be busy.
+ */
+reg->attr.fields.type |= 0x2;
+
+/*
+ * %cs and %tr are unconditionally present.  SVM ignores these present
+ * bits and will happily run without them set.
+ */
+case x86_seg_cs:
+reg->attr.fields.p = 1;
+break;
+
+case x86_seg_gdtr:
+case x86_seg_idtr:
+/*
+ * Treat GDTR/IDTR as being present system segments.  This avoids them
+ * needing special casing for segmentation checks.
+ */
+reg->attr.bytes = 0x80;
+break;
+
+default: /* Avoid triggering -Werror=switch */
+break;
+}
+
+if ( reg->attr.fields.p )
+{
+/*
+ * For segments which are present/usable, cook the system flag.  SVM
+ * ignores the S bit on all segments and will happily run with them in
+ * any state.
+ */
+reg->attr.fields.s = is_x86_user_segment(seg);
+
+/*
+ * SVM discards %cs.G on #VMEXIT.  Other user segments do have .G
+ * tracked, but Linux commit 80112c89ed87 "KVM: Synthesize G bit for
+ * all segments." indicates that this isn't necessarily the case when
+ * nested under ESXi.
+ *
+ * Unconditionally recalculate G.
+ */
+reg->attr.fields.g = !!(reg->limit >> 20);
+
+/*
+ * SVM doesn't track the Accessed flag.  It will always be set for
+ * usable user segments loaded into the descriptor cache.
+ */
+if ( is_x86_user_segment(seg) )
+reg->attr.fields.type |= 0x1;
+}
+}
+
+void hvm_set_segment_register(struct vcpu *v, enum x86_segment seg,
+  struct segment_register *reg)
+{
+/* Set G to match the limit field.  VT-x cares, while SVM doesn't. */
+if ( reg->attr.fields.p )
+reg->attr.fields.g = !!(reg->limit >> 20);
+
+switch ( seg )
+{
+case x86_seg_cs:
+ASSERT(reg->attr.fields.p);  /* Usable. */
+ASSERT(reg->attr.fields.s);  /* User segment. */
+ASSERT((reg->base >> 32) == 0);  /* Upper bits clear. */
+break;
+
+case x86_seg_ss:
+if ( reg->attr.fields.p )
+{
+ASSERT(reg->attr.fields.s);  /* User segment. */
+ASSERT(!(reg->attr.fields.type & 0x8));  

[Xen-devel] [PATCH v3 20/24] x86/hvm: Reimplement hvm_copy_*_nofault() in terms of no pagefault_info

2016-11-30 Thread Andrew Cooper
No functional change.

Signed-off-by: Andrew Cooper 
Reviewed-by: Jan Beulich 
Acked-by: Tim Deegan 
Reviewed-by: Paul Durrant 
---
 xen/arch/x86/hvm/emulate.c|  6 ++---
 xen/arch/x86/hvm/hvm.c| 56 +--
 xen/arch/x86/mm/shadow/common.c   |  8 +++---
 xen/include/asm-x86/hvm/support.h | 11 
 4 files changed, 19 insertions(+), 62 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index 41f689e..321c5aa 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -1937,9 +1937,9 @@ void hvm_emulate_init_per_insn(
 hvm_access_insn_fetch,
 hvmemul_ctxt->ctxt.addr_size,
 &addr) &&
- hvm_fetch_from_guest_virt_nofault(hvmemul_ctxt->insn_buf, addr,
-   sizeof(hvmemul_ctxt->insn_buf),
-   pfec) == HVMCOPY_okay) ?
+ hvm_fetch_from_guest_virt(hvmemul_ctxt->insn_buf, addr,
+   sizeof(hvmemul_ctxt->insn_buf),
+   pfec, NULL) == HVMCOPY_okay) ?
 sizeof(hvmemul_ctxt->insn_buf) : 0;
 }
 else
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 390f76d..5eae06a 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -3066,8 +3066,6 @@ void hvm_task_switch(
 
 #define HVMCOPY_from_guest (0u<<0)
 #define HVMCOPY_to_guest   (1u<<0)
-#define HVMCOPY_no_fault   (0u<<1)
-#define HVMCOPY_fault  (1u<<1)
 #define HVMCOPY_phys   (0u<<2)
 #define HVMCOPY_virt   (1u<<2)
 static enum hvm_copy_result __hvm_copy(
@@ -3112,13 +3110,10 @@ static enum hvm_copy_result __hvm_copy(
 return HVMCOPY_gfn_paged_out;
 if ( pfec & PFEC_page_shared )
 return HVMCOPY_gfn_shared;
-if ( flags & HVMCOPY_fault )
+if ( pfinfo )
 {
-if ( pfinfo )
-{
-pfinfo->linear = addr;
-pfinfo->ec = pfec;
-}
+pfinfo->linear = addr;
+pfinfo->ec = pfec;
 
 hvm_inject_page_fault(pfec, addr);
 }
@@ -3290,16 +3285,14 @@ enum hvm_copy_result hvm_copy_to_guest_phys(
 paddr_t paddr, void *buf, int size)
 {
 return __hvm_copy(buf, paddr, size,
-  HVMCOPY_to_guest | HVMCOPY_fault | HVMCOPY_phys,
-  0, NULL);
+  HVMCOPY_to_guest | HVMCOPY_phys, 0, NULL);
 }
 
 enum hvm_copy_result hvm_copy_from_guest_phys(
 void *buf, paddr_t paddr, int size)
 {
 return __hvm_copy(buf, paddr, size,
-  HVMCOPY_from_guest | HVMCOPY_fault | HVMCOPY_phys,
-  0, NULL);
+  HVMCOPY_from_guest | HVMCOPY_phys, 0, NULL);
 }
 
 enum hvm_copy_result hvm_copy_to_guest_virt(
@@ -3307,7 +3300,7 @@ enum hvm_copy_result hvm_copy_to_guest_virt(
 pagefault_info_t *pfinfo)
 {
 return __hvm_copy(buf, vaddr, size,
-  HVMCOPY_to_guest | HVMCOPY_fault | HVMCOPY_virt,
+  HVMCOPY_to_guest | HVMCOPY_virt,
   PFEC_page_present | PFEC_write_access | pfec, pfinfo);
 }
 
@@ -3316,7 +3309,7 @@ enum hvm_copy_result hvm_copy_from_guest_virt(
 pagefault_info_t *pfinfo)
 {
 return __hvm_copy(buf, vaddr, size,
-  HVMCOPY_from_guest | HVMCOPY_fault | HVMCOPY_virt,
+  HVMCOPY_from_guest | HVMCOPY_virt,
   PFEC_page_present | pfec, pfinfo);
 }
 
@@ -3325,34 +3318,10 @@ enum hvm_copy_result hvm_fetch_from_guest_virt(
 pagefault_info_t *pfinfo)
 {
 return __hvm_copy(buf, vaddr, size,
-  HVMCOPY_from_guest | HVMCOPY_fault | HVMCOPY_virt,
+  HVMCOPY_from_guest | HVMCOPY_virt,
   PFEC_page_present | PFEC_insn_fetch | pfec, pfinfo);
 }
 
-enum hvm_copy_result hvm_copy_to_guest_virt_nofault(
-unsigned long vaddr, void *buf, int size, uint32_t pfec)
-{
-return __hvm_copy(buf, vaddr, size,
-  HVMCOPY_to_guest | HVMCOPY_no_fault | HVMCOPY_virt,
-  PFEC_page_present | PFEC_write_access | pfec, NULL);
-}
-
-enum hvm_copy_result hvm_copy_from_guest_virt_nofault(
-void *buf, unsigned long vaddr, int size, uint32_t pfec)
-{
-return __hvm_copy(buf, vaddr, size,
-  HVMCOPY_from_guest | HVMCOPY_no_fault | HVMCOPY_virt,
-  PFEC_page_present | pfec, NULL);
-}
-
-enum hvm_copy_result hvm_fetch_from_guest_virt_nofault(
-void *buf, unsigned long vaddr, int size, uint32_t pfec)
-{
-return 

[Xen-devel] [PATCH v3 22/24] x86/hvm: Avoid __hvm_copy() raising #PF behind the emulators back

2016-11-30 Thread Andrew Cooper
Drop the call to hvm_inject_page_fault() in __hvm_copy(), and require callers
to inject the pagefault themselves.
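
As a rough illustration of the resulting calling convention, the sketch below uses mock types and functions (copy_from_guest_linear(), inject_page_fault() and pfinfo_t here are simplified stand-ins invented for this example, not the real Xen signatures). The copy routine only reports the fault through pfinfo, and the caller decides whether to inject it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for enum hvm_copy_result and pagefault_info_t. */
enum copy_result { COPY_okay, COPY_bad_gva_to_gfn };

typedef struct {
    unsigned long linear; /* faulting linear address */
    uint32_t ec;          /* page fault error code */
} pfinfo_t;

/* Records what was "injected" so the effect is observable. */
static struct {
    int count;
    unsigned long linear;
    uint32_t ec;
} injected;

static void inject_page_fault(uint32_t ec, unsigned long linear)
{
    injected.count++;
    injected.linear = linear;
    injected.ec = ec;
}

/*
 * Mock copy: any byte at or above unmapped_from fails translation.  The
 * fault is only reported through pfinfo; nothing is injected here.
 */
static enum copy_result copy_from_guest_linear(
    unsigned long addr, size_t size, uint32_t pfec,
    unsigned long unmapped_from, pfinfo_t *pfinfo)
{
    assert(size > 0);

    if ( addr + size > unmapped_from )
    {
        if ( pfinfo )
        {
            pfinfo->linear = addr > unmapped_from ? addr : unmapped_from;
            pfinfo->ec = pfec;
        }
        return COPY_bad_gva_to_gfn;
    }

    return COPY_okay;
}

/* Caller pattern: check the result, then inject the fault explicitly. */
static int read_operand(unsigned long addr, size_t size,
                        unsigned long unmapped_from)
{
    pfinfo_t pfinfo;
    enum copy_result rc =
        copy_from_guest_linear(addr, size, 0, unmapped_from, &pfinfo);

    if ( rc == COPY_bad_gva_to_gfn )
        inject_page_fault(pfinfo.ec, pfinfo.linear);

    return rc == COPY_okay ? 0 : -1;
}
```

Keeping the injection in the caller means an emulator-driven caller could record the fault in its own context instead of injecting it directly, which appears to be the point of the emulate.c hunks in this patch.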

Signed-off-by: Andrew Cooper 
Acked-by: Tim Deegan 
Acked-by: Kevin Tian 
Reviewed-by: Jan Beulich 
---
CC: Paul Durrant 

v3:
 * Correct patch description
 * Fix rebasing error over previous TSS series
---
 xen/arch/x86/hvm/emulate.c        |  2 ++
 xen/arch/x86/hvm/hvm.c            | 14 --
 xen/arch/x86/hvm/vmx/vvmx.c       | 20 +++-
 xen/arch/x86/mm/shadow/common.c   |  1 +
 xen/include/asm-x86/hvm/support.h |  4 +---
 5 files changed, 31 insertions(+), 10 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index 035b654..ccf3aa2 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -799,6 +799,7 @@ static int __hvmemul_read(
 case HVMCOPY_okay:
 break;
 case HVMCOPY_bad_gva_to_gfn:
+x86_emul_pagefault(pfinfo.ec, pfinfo.linear, _ctxt->ctxt);
 return X86EMUL_EXCEPTION;
 case HVMCOPY_bad_gfn_to_mfn:
 if ( access_type == hvm_access_insn_fetch )
@@ -905,6 +906,7 @@ static int hvmemul_write(
 case HVMCOPY_okay:
 break;
 case HVMCOPY_bad_gva_to_gfn:
+x86_emul_pagefault(pfinfo.ec, pfinfo.linear, _ctxt->ctxt);
 return X86EMUL_EXCEPTION;
 case HVMCOPY_bad_gfn_to_mfn:
 return hvmemul_linear_mmio_write(addr, bytes, p_data, pfec, hvmemul_ctxt, 0);
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 37eaee2..3596f2c 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -2927,6 +2927,8 @@ void hvm_task_switch(
 
 rc = hvm_copy_from_guest_linear(
 , prev_tr.base, sizeof(tss), PFEC_page_present, );
+if ( rc == HVMCOPY_bad_gva_to_gfn )
+hvm_inject_page_fault(pfinfo.ec, pfinfo.linear);
 if ( rc != HVMCOPY_okay )
 goto out;
 
@@ -2965,11 +2967,15 @@ void hvm_task_switch(
   offsetof(typeof(tss), trace) -
   offsetof(typeof(tss), eip),
   PFEC_page_present, );
+if ( rc == HVMCOPY_bad_gva_to_gfn )
+hvm_inject_page_fault(pfinfo.ec, pfinfo.linear);
 if ( rc != HVMCOPY_okay )
 goto out;
 
 rc = hvm_copy_from_guest_linear(
 , tr.base, sizeof(tss), PFEC_page_present, );
+if ( rc == HVMCOPY_bad_gva_to_gfn )
+hvm_inject_page_fault(pfinfo.ec, pfinfo.linear);
 /*
  * Note: The HVMCOPY_gfn_shared case could be optimised, if the callee
  * functions knew we want RO access.
@@ -3012,7 +3018,10 @@ void hvm_task_switch(
   _link, sizeof(tss.back_link), 0,
   );
 if ( rc == HVMCOPY_bad_gva_to_gfn )
+{
+hvm_inject_page_fault(pfinfo.ec, pfinfo.linear);
 exn_raised = 1;
+}
 else if ( rc != HVMCOPY_okay )
 goto out;
 }
@@ -3050,7 +3059,10 @@ void hvm_task_switch(
 rc = hvm_copy_to_guest_linear(linear_addr, , opsz, 0,
   );
 if ( rc == HVMCOPY_bad_gva_to_gfn )
+{
+hvm_inject_page_fault(pfinfo.ec, pfinfo.linear);
 exn_raised = 1;
+}
 else if ( rc != HVMCOPY_okay )
 goto out;
 }
@@ -3114,8 +3126,6 @@ static enum hvm_copy_result __hvm_copy(
 {
 pfinfo->linear = addr;
 pfinfo->ec = pfec;
-
-hvm_inject_page_fault(pfec, addr);
 }
 return HVMCOPY_bad_gva_to_gfn;
 }
diff --git a/xen/arch/x86/hvm/vmx/vvmx.c b/xen/arch/x86/hvm/vmx/vvmx.c
index fd7ea0a..e6e9ebd 100644
--- a/xen/arch/x86/hvm/vmx/vvmx.c
+++ b/xen/arch/x86/hvm/vmx/vvmx.c
@@ -396,7 +396,6 @@ static int decode_vmx_inst(struct cpu_user_regs *regs,
 struct vcpu *v = current;
 union vmx_inst_info info;
 struct segment_register seg;
-pagefault_info_t pfinfo;
 unsigned long base, index, seg_base, disp, offset;
 int scale, size;
 
@@ -451,10 +450,17 @@ static int decode_vmx_inst(struct cpu_user_regs *regs,
   offset + size - 1 > seg.limit) )
 goto gp_fault;
 
-if ( poperandS != NULL &&
- hvm_copy_from_guest_linear(poperandS, base, size, 0, )
-  != HVMCOPY_okay )
-return X86EMUL_EXCEPTION;
+if ( poperandS != NULL )
+{
+pagefault_info_t pfinfo;
+int rc = hvm_copy_from_guest_linear(poperandS, base, size,
+0, );
+
+if ( rc == HVMCOPY_bad_gva_to_gfn )
+hvm_inject_page_fault(pfinfo.ec, pfinfo.linear);
+if ( rc != HVMCOPY_okay )
+return X86EMUL_EXCEPTION;

[Xen-devel] [PATCH v3 13/24] x86/emul: Rework emulator event injection

2016-11-30 Thread Andrew Cooper
The emulator needs to gain an understanding of interrupts and exceptions
generated by its actions.

Move hvm_emulate_ctxt.{exn_pending,trap} into struct x86_emulate_ctxt so they
are visible to the emulator.  This removes the need for the
inject_{hw_exception,sw_interrupt}() hooks, which are dropped and replaced
with x86_emul_{hw_exception,software_event,reset_event}() instead.

For exceptions raised by x86_emulate() itself (rather than its callbacks), the
shadow pagetable and PV uses of x86_emulate() previously failed with
X86EMUL_UNHANDLEABLE due to the lack of inject_*() hooks.

This behaviour has changed, and such cases will now return X86EMUL_EXCEPTION
with event_pending set.  Until the callers of x86_emulate() have been updated
to inject events back into the guest, divert the event_pending case back into
the X86EMUL_UNHANDLEABLE path to maintain the same guest-visible behaviour.

No overall functional change.
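
To make the description above concrete, here is a minimal sketch of the idea, assuming simplified types: struct emulate_ctxt, emul_hw_exception(), emul_reset_event() and divert_pending() are hypothetical names chosen for this example and only loosely mirror the x86_emul_*() helpers named in the patch. The assertion that only one event may be pending is an assumption of the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

enum { EMUL_OKAY, EMUL_UNHANDLEABLE, EMUL_EXCEPTION };
enum event_type { EVT_HW_EXCEPTION, EVT_SW_INTERRUPT };

struct event {
    uint8_t vector;
    enum event_type type;
    int32_t error_code;
    uint8_t insn_len;
};

/* The pending event lives in the emulator-visible context. */
struct emulate_ctxt {
    bool event_pending;
    struct event event;
};

/* Record a hardware exception in the context instead of calling a hook. */
static void emul_hw_exception(uint8_t vector, int32_t error_code,
                              struct emulate_ctxt *ctxt)
{
    assert(!ctxt->event_pending); /* sketch assumption: one event at a time */

    ctxt->event.vector = vector;
    ctxt->event.type = EVT_HW_EXCEPTION;
    ctxt->event.error_code = error_code;
    ctxt->event.insn_len = 0;
    ctxt->event_pending = true;
}

/* Squash a previously recorded event. */
static void emul_reset_event(struct emulate_ctxt *ctxt)
{
    ctxt->event_pending = false;
    memset(&ctxt->event, 0, sizeof(ctxt->event));
}

/*
 * Transitional caller policy from the description: a pending event is
 * diverted back to the UNHANDLEABLE path until the caller learns to
 * inject it.
 */
static int divert_pending(int rc, const struct emulate_ctxt *ctxt)
{
    if ( rc == EMUL_EXCEPTION && ctxt->event_pending )
        rc = EMUL_UNHANDLEABLE;

    return rc;
}
```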

Signed-off-by: Andrew Cooper 
Reviewed-by: Boris Ostrovsky 
Reviewed-by: Kevin Tian 
---
CC: Jan Beulich 
CC: Paul Durrant 
CC: Tim Deegan 
CC: George Dunlap 

v3:
 * Rework how the event_pending case is currently handled
v2:
 * Change x86_emul_hw_exception()'s error_code parameter to being signed
 * Clarify how software interrupt injection happens.
 * More ASSERT()'s and description of how event_pending works without the
   inject_sw_interrupt() hook
---
 xen/arch/x86/hvm/emulate.c             | 81 --
 xen/arch/x86/hvm/hvm.c                 |  4 +-
 xen/arch/x86/hvm/io.c                  |  4 +-
 xen/arch/x86/hvm/vmx/realmode.c        | 16 +++
 xen/arch/x86/mm.c                      | 26 +++
 xen/arch/x86/mm/shadow/multi.c         | 17 +++
 xen/arch/x86/x86_emulate/x86_emulate.c | 12 +++--
 xen/arch/x86/x86_emulate/x86_emulate.h | 76 +--
 xen/include/asm-x86/hvm/emulate.h      |  3 --
 9 files changed, 132 insertions(+), 107 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index 91c79fa..4b8c9a0 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -568,12 +568,9 @@ static int hvmemul_virtual_to_linear(
 return X86EMUL_UNHANDLEABLE;
 
 /* This is a singleton operation: fail it with an exception. */
-hvmemul_ctxt->exn_pending = 1;
-hvmemul_ctxt->trap.vector =
-(seg == x86_seg_ss) ? TRAP_stack_error : TRAP_gp_fault;
-hvmemul_ctxt->trap.type = X86_EVENTTYPE_HW_EXCEPTION;
-hvmemul_ctxt->trap.error_code = 0;
-hvmemul_ctxt->trap.insn_len = 0;
+x86_emul_hw_exception((seg == x86_seg_ss)
+  ? TRAP_stack_error
+  : TRAP_gp_fault, 0, _ctxt->ctxt);
 return X86EMUL_EXCEPTION;
 }
 
@@ -1562,59 +1559,6 @@ int hvmemul_cpuid(
 return X86EMUL_OKAY;
 }
 
-static int hvmemul_inject_hw_exception(
-uint8_t vector,
-int32_t error_code,
-struct x86_emulate_ctxt *ctxt)
-{
-struct hvm_emulate_ctxt *hvmemul_ctxt =
-container_of(ctxt, struct hvm_emulate_ctxt, ctxt);
-
-hvmemul_ctxt->exn_pending = 1;
-hvmemul_ctxt->trap.vector = vector;
-hvmemul_ctxt->trap.type = X86_EVENTTYPE_HW_EXCEPTION;
-hvmemul_ctxt->trap.error_code = error_code;
-hvmemul_ctxt->trap.insn_len = 0;
-
-return X86EMUL_OKAY;
-}
-
-static int hvmemul_inject_sw_interrupt(
-enum x86_swint_type type,
-uint8_t vector,
-uint8_t insn_len,
-struct x86_emulate_ctxt *ctxt)
-{
-struct hvm_emulate_ctxt *hvmemul_ctxt =
-container_of(ctxt, struct hvm_emulate_ctxt, ctxt);
-
-switch ( type )
-{
-case x86_swint_icebp:
-hvmemul_ctxt->trap.type = X86_EVENTTYPE_PRI_SW_EXCEPTION;
-break;
-
-case x86_swint_int3:
-case x86_swint_into:
-hvmemul_ctxt->trap.type = X86_EVENTTYPE_SW_EXCEPTION;
-break;
-
-case x86_swint_int:
-hvmemul_ctxt->trap.type = X86_EVENTTYPE_SW_INTERRUPT;
-break;
-
-default:
-return X86EMUL_UNHANDLEABLE;
-}
-
-hvmemul_ctxt->exn_pending = 1;
-hvmemul_ctxt->trap.vector = vector;
-hvmemul_ctxt->trap.error_code = X86_EVENT_NO_EC;
-hvmemul_ctxt->trap.insn_len = insn_len;
-
-return X86EMUL_OKAY;
-}
-
 static int hvmemul_get_fpu(
 void (*exception_callback)(void *, struct cpu_user_regs *),
 void *exception_callback_arg,
@@ -1678,8 +1622,7 @@ static int hvmemul_invlpg(
  * hvmemul_virtual_to_linear() raises exceptions for type/limit
  * violations, so squash them.
  */
-hvmemul_ctxt->exn_pending = 0;
-hvmemul_ctxt->trap = (struct x86_event){};
+x86_emul_reset_event(ctxt);
 rc = X86EMUL_OKAY;
 }
 
@@ -1696,7 +1639,7 @@ static int hvmemul_vmfunc(
 
 rc = hvm_funcs.altp2m_vcpu_emulate_vmfunc(ctxt->regs);
 if ( rc != 

[Xen-devel] [PATCH v3 18/24] x86/shadow: Avoid raising faults behind the emulators back

2016-11-30 Thread Andrew Cooper
Use x86_emul_{hw_exception,pagefault}() rather than
{pv,hvm}_inject_page_fault() and hvm_inject_hw_exception() to cause raised
faults to be known to the emulator.  This requires altering the callers of
x86_emulate() to properly re-inject the event.

While fixing this, fix the singlestep behaviour.  Previously, an otherwise
successful emulation would fail if singlestepping was active, as the emulator
couldn't raise #DB.  This is unreasonable from the point of view of the guest.

We therefore tolerate #PF/#GP/#SS and #DB being raised by the emulator, but
reject anything else as unexpected.
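
The "tolerated vectors" test can be pictured as a bitmask lookup. The following is a small sketch only: vector_tolerated() and the VEC_* names are made up for this example, using the architectural vector numbers. #DB is deliberately absent from the mask, on the assumption (per the v3 note) that singlestep is handled separately from event injection.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* x86 architectural exception vectors. */
#define VEC_DB  1
#define VEC_UD  6
#define VEC_SS 12
#define VEC_GP 13
#define VEC_PF 14

/* True for the hardware exceptions a pagetable-write emulation may raise. */
static bool vector_tolerated(uint8_t vector)
{
    const uint32_t allowed =
        (1u << VEC_SS) | (1u << VEC_GP) | (1u << VEC_PF);

    /* The range check comes first so the shift is always well defined. */
    return vector < 32 && ((1u << vector) & allowed);
}
```

A caller would re-inject the recorded event when this returns true, and otherwise log the unexpected event and fall back to the unhandleable path.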

Signed-off-by: Andrew Cooper 
---
CC: Jan Beulich 
CC: Tim Deegan 

v3:
 * Split out #DB handling to an earlier part of the series
 * Don't inject #GP faults for unexpected events, but do reenter the guest.
v2:
 * New
---
 xen/arch/x86/mm/shadow/common.c | 13 ++---
 xen/arch/x86/mm/shadow/multi.c  | 39 ---
 2 files changed, 34 insertions(+), 18 deletions(-)

diff --git a/xen/arch/x86/mm/shadow/common.c b/xen/arch/x86/mm/shadow/common.c
index f07803b..e509cc1 100644
--- a/xen/arch/x86/mm/shadow/common.c
+++ b/xen/arch/x86/mm/shadow/common.c
@@ -162,8 +162,9 @@ static int hvm_translate_linear_addr(
 
 if ( !okay )
 {
-hvm_inject_hw_exception(
-(seg == x86_seg_ss) ? TRAP_stack_error : TRAP_gp_fault, 0);
+x86_emul_hw_exception(
+(seg == x86_seg_ss) ? TRAP_stack_error : TRAP_gp_fault,
+0, _ctxt->ctxt);
 return X86EMUL_EXCEPTION;
 }
 
@@ -323,7 +324,7 @@ pv_emulate_read(enum x86_segment seg,
 
 if ( (rc = copy_from_user(p_data, (void *)offset, bytes)) != 0 )
 {
-pv_inject_page_fault(0, offset + bytes - rc); /* Read fault. */
+x86_emul_pagefault(0, offset + bytes - rc, ctxt); /* Read fault. */
 return X86EMUL_EXCEPTION;
 }
 
@@ -1720,10 +1721,8 @@ static mfn_t emulate_gva_to_mfn(struct vcpu *v, unsigned long vaddr,
 gfn = paging_get_hostmode(v)->gva_to_gfn(v, NULL, vaddr, );
 if ( gfn == gfn_x(INVALID_GFN) )
 {
-if ( is_hvm_vcpu(v) )
-hvm_inject_page_fault(pfec, vaddr);
-else
-pv_inject_page_fault(pfec, vaddr);
+x86_emul_pagefault(pfec, vaddr, _ctxt->ctxt);
+
 return _mfn(BAD_GVA_TO_GFN);
 }
 
diff --git a/xen/arch/x86/mm/shadow/multi.c b/xen/arch/x86/mm/shadow/multi.c
index 56c40f8..098b653 100644
--- a/xen/arch/x86/mm/shadow/multi.c
+++ b/xen/arch/x86/mm/shadow/multi.c
@@ -3373,18 +3373,35 @@ static int sh_page_fault(struct vcpu *v,
 
 r = x86_emulate(_ctxt.ctxt, emul_ops);
 
-/*
- * The previous lack of inject_{sw,hw}*() hooks caused exceptions raised
- * by the emulator itself to become X86EMUL_UNHANDLEABLE.  Such exceptions
- * now set event_pending instead.  Exceptions raised behind the back of
- * the emulator don't yet set event_pending.
- *
- * For now, cause such cases to return to the X86EMUL_UNHANDLEABLE path,
- * for no functional change from before.  Future patches will fix this
- * properly.
- */
 if ( r == X86EMUL_EXCEPTION && emul_ctxt.ctxt.event_pending )
-r = X86EMUL_UNHANDLEABLE;
+{
+/*
+ * This emulation covers writes to shadow pagetables.  We tolerate #PF
+ * (from hitting adjacent pages) and #GP/#SS (from segmentation
+ * errors).  Anything else is an emulation bug, or a guest playing
+ * with the instruction stream under Xen's feet.
+ */
+if ( emul_ctxt.ctxt.event.type == X86_EVENTTYPE_HW_EXCEPTION &&
+ (emul_ctxt.ctxt.event.vector < 32) &&
+ ((1u << emul_ctxt.ctxt.event.vector) &
+  ((1u << TRAP_stack_error) | (1u << TRAP_gp_fault) |
+   (1u << TRAP_page_fault))) )
+{
+if ( is_hvm_vcpu(v) )
+hvm_inject_event(_ctxt.ctxt.event);
+else
+pv_inject_event(_ctxt.ctxt.event);
+
+goto emulate_done;
+}
+else
+{
+SHADOW_PRINTK(
+"Unexpected event (type %u, vector %#x) from emulation\n",
+emul_ctxt.ctxt.event.type, emul_ctxt.ctxt.event.vector);
+r = X86EMUL_UNHANDLEABLE;
+}
+}
 
 /*
  * NB. We do not unshadow on X86EMUL_EXCEPTION. It's not clear that it
-- 
2.1.4




[Xen-devel] [PATCH v3 11/24] x86/emul: Implement singlestep as a retire flag

2016-11-30 Thread Andrew Cooper
The behaviour of singlestep is to raise #DB after the instruction has been
completed, but implementing it with inject_hw_exception() causes x86_emulate()
to return X86EMUL_EXCEPTION, despite successfully completing execution of the
instruction, including register writeback.

Instead, use a retire flag to indicate singlestep, which causes x86_emulate()
to return X86EMUL_OKAY.

Update all callers of x86_emulate() to use the new retire flag.  This fixes
the behaviour of singlestep for shadow pagetable updates and mmcfg/mmio_ro
intercepts, which previously discarded the exception.

With this change, all uses of X86EMUL_EXCEPTION from x86_emulate() are
believed to have strictly fault semantics.
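
A minimal sketch of the retire-flag idea follows, under stated assumptions: union retire_flags, struct ctxt, emulate_one() and run_one() are hypothetical names for this example, and the anonymous struct member requires C11. The emulation returns success and merely records that a #DB is due, leaving the injection to the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum { EMUL_OKAY, EMUL_EXCEPTION };

#define EFLAGS_TF (1u << 8) /* trap flag */

/* Flags describing what happened when the instruction retired. */
union retire_flags {
    struct {
        bool singlestep:1;
    };
    uint8_t raw; /* nonzero if any flag is set */
};

struct ctxt {
    uint32_t eflags;
    union retire_flags retire;
};

static int emulate_one(struct ctxt *ctxt)
{
    /* Sample TF before the instruction has a chance to change it. */
    bool tf = ctxt->eflags & EFLAGS_TF;

    ctxt->retire.raw = 0;

    /* ... the instruction would execute and write back state here ... */

    if ( tf )
        ctxt->retire.singlestep = true;

    return EMUL_OKAY; /* completion is success, not an exception */
}

/* Caller: inject #DB (counted here) only after a successful emulation. */
static int run_one(struct ctxt *ctxt, int *db_injected)
{
    int rc = emulate_one(ctxt);

    if ( rc == EMUL_OKAY && ctxt->retire.singlestep )
        (*db_injected)++;

    return rc;
}
```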

Signed-off-by: Andrew Cooper 
---
CC: Jan Beulich 
CC: Tim Deegan 
CC: Paul Durrant 

v3:
 * New
---
 xen/arch/x86/hvm/emulate.c             |  3 +++
 xen/arch/x86/mm.c                      | 11 ++-
 xen/arch/x86/mm/shadow/multi.c         | 21 -
 xen/arch/x86/x86_emulate/x86_emulate.c |  9 -
 xen/arch/x86/x86_emulate/x86_emulate.h |  6 ++
 5 files changed, 43 insertions(+), 7 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index fe62500..91c79fa 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -1788,6 +1788,9 @@ static int _hvm_emulate_one(struct hvm_emulate_ctxt *hvmemul_ctxt,
 if ( rc != X86EMUL_OKAY )
 return rc;
 
+if ( hvmemul_ctxt->ctxt.retire.singlestep )
+hvm_inject_hw_exception(TRAP_debug, X86_EVENT_NO_EC);
+
 new_intr_shadow = hvmemul_ctxt->intr_shadow;
 
 /* MOV-SS instruction toggles MOV-SS shadow, else we just clear it. */
diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index b7c7122..231c7bf 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -5382,6 +5382,9 @@ int ptwr_do_page_fault(struct vcpu *v, unsigned long addr,
 if ( rc == X86EMUL_UNHANDLEABLE )
 goto bail;
 
+if ( ptwr_ctxt.ctxt.retire.singlestep )
+pv_inject_hw_exception(TRAP_debug, X86_EVENT_NO_EC);
+
 perfc_incr(ptwr_emulations);
 return EXCRET_fault_fixed;
 
@@ -5503,7 +5506,13 @@ int mmio_ro_do_page_fault(struct vcpu *v, unsigned long addr,
 else
 rc = x86_emulate(, _ro_emulate_ops);
 
-return rc != X86EMUL_UNHANDLEABLE ? EXCRET_fault_fixed : 0;
+if ( rc == X86EMUL_UNHANDLEABLE )
+return 0;
+
+if ( ctxt.retire.singlestep )
+pv_inject_hw_exception(TRAP_debug, X86_EVENT_NO_EC);
+
+return EXCRET_fault_fixed;
 }
 
 void *alloc_xen_pagetable(void)
diff --git a/xen/arch/x86/mm/shadow/multi.c b/xen/arch/x86/mm/shadow/multi.c
index 9ee48a8..ddfb815 100644
--- a/xen/arch/x86/mm/shadow/multi.c
+++ b/xen/arch/x86/mm/shadow/multi.c
@@ -3422,6 +3422,16 @@ static int sh_page_fault(struct vcpu *v,
 v->arch.paging.last_write_emul_ok = 0;
 #endif
 
+if ( emul_ctxt.ctxt.retire.singlestep )
+{
+if ( is_hvm_vcpu(v) )
+hvm_inject_hw_exception(TRAP_debug, X86_EVENT_NO_EC);
+else
+pv_inject_hw_exception(TRAP_debug, X86_EVENT_NO_EC);
+
+goto emulate_done;
+}
+
 #if GUEST_PAGING_LEVELS == 3 /* PAE guest */
 if ( r == X86EMUL_OKAY ) {
 int i, emulation_count=0;
@@ -3433,7 +3443,7 @@ static int sh_page_fault(struct vcpu *v,
 shadow_continue_emulation(_ctxt, regs);
 v->arch.paging.last_write_was_pt = 0;
 r = x86_emulate(_ctxt.ctxt, emul_ops);
-if ( r == X86EMUL_OKAY )
+if ( r == X86EMUL_OKAY && !emul_ctxt.ctxt.retire.raw )
 {
 emulation_count++;
 if ( v->arch.paging.last_write_was_pt )
@@ -3449,6 +3459,15 @@ static int sh_page_fault(struct vcpu *v,
 {
 perfc_incr(shadow_em_ex_fail);
 TRACE_SHADOW_PATH_FLAG(TRCE_SFLAG_EMULATION_LAST_FAILED);
+
+if ( emul_ctxt.ctxt.retire.singlestep )
+{
+if ( is_hvm_vcpu(v) )
+hvm_inject_hw_exception(TRAP_debug, X86_EVENT_NO_EC);
+else
+pv_inject_hw_exception(TRAP_debug, X86_EVENT_NO_EC);
+}
+
 break; /* Don't emulate again if we failed! */
 }
 }
diff --git a/xen/arch/x86/x86_emulate/x86_emulate.c b/xen/arch/x86/x86_emulate/x86_emulate.c
index 8a1f1f5..0af532e 100644
--- a/xen/arch/x86/x86_emulate/x86_emulate.c
+++ b/xen/arch/x86/x86_emulate/x86_emulate.c
@@ -2417,7 +2417,6 @@ x86_emulate(
 struct x86_emulate_state state;
 int rc;
 uint8_t b, d;
-bool tf = ctxt->regs->eflags & EFLG_TF;
 struct operand src = { .reg = PTR_POISON };
 struct operand dst = { .reg = PTR_POISON };
 enum x86_swint_type swint_type;
@@ -5415,11 +5414,11 @@ x86_emulate(
 if ( !mode_64bit() )
 _regs.eip = (uint32_t)_regs.eip;
 
-

[Xen-devel] [PATCH v3 24/24] x86/emul: Use system-segment relative memory accesses

2016-11-30 Thread Andrew Cooper
With hvm_virtual_to_linear_addr() capable of doing proper system-segment
relative memory accesses, avoid open-coding the address and limit calculations
locally.

When a table spans the 4GB boundary (32bit) or non-canonical boundary (64bit),
segmentation errors are now raised.  Previously, the use of x86_seg_none
resulted in segmentation being skipped, and the linear address being truncated
through the pagewalk, and possibly coming out valid on the far side.
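
One way to picture what a shared segment-relative check buys: a single helper can validate offset and length against the limit, so call sites no longer need their own arithmetic. The sketch below is an assumption-laden illustration (seg_range_ok() is a hypothetical helper, not the actual hvm_virtual_to_linear_addr() logic); it shows that a two-byte read at offset 0x66 implies the old "limit >= 0x67" test, and that a wrapping range is rejected.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Is [offset, offset + bytes - 1] inside a segment with the given limit?
 * A zero-length access is treated as touching only 'offset'.
 */
static bool seg_range_ok(uint32_t limit, uint32_t offset, uint32_t bytes)
{
    uint32_t last = offset + bytes - !!bytes;

    /* last < offset means the range wrapped past 0xffffffff. */
    return last >= offset && last <= limit;
}
```

With this shape, reading the I/O bitmap offset becomes "read 2 bytes at 0x66 relative to the task register", and reading a descriptor becomes "read 8 bytes at (sel & 0xfff8) relative to the descriptor table", with the bounds handled in one place.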

Signed-off-by: Andrew Cooper 
Signed-off-by: Jan Beulich 
Reviewed-by: George Dunlap 
---
v2:
 * Shorten exception handling
 * Replace ->cmpxchg() assertion with proper exception handling
---
 xen/arch/x86/hvm/hvm.c                 |   8 +++
 xen/arch/x86/x86_emulate/x86_emulate.c | 123 +
 2 files changed, 85 insertions(+), 46 deletions(-)

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 426edee..599363b 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -2470,6 +2470,14 @@ bool_t hvm_virtual_to_linear_addr(
 unsigned long addr = offset, last_byte;
 bool_t okay = 0;
 
+/*
+ * These checks are for a memory access through an active segment.
+ *
+ * It is expected that the access rights of reg are suitable for seg (and
+ * that this is enforced at the point that seg is loaded).
+ */
+ASSERT(seg < x86_seg_none);
+
 if ( !(current->arch.hvm_vcpu.guest_cr[0] & X86_CR0_PE) )
 {
 /*
diff --git a/xen/arch/x86/x86_emulate/x86_emulate.c b/xen/arch/x86/x86_emulate/x86_emulate.c
index 0fb2c09..c18adbe 100644
--- a/xen/arch/x86/x86_emulate/x86_emulate.c
+++ b/xen/arch/x86/x86_emulate/x86_emulate.c
@@ -1181,20 +1181,36 @@ static int ioport_access_check(
 return rc;
 
 /* Ensure the TSS has an io-bitmap-offset field. */
-generate_exception_if(tr.attr.fields.type != 0xb ||
-  tr.limit < 0x67, EXC_GP, 0);
+generate_exception_if(tr.attr.fields.type != 0xb, EXC_GP, 0);
 
-if ( (rc = read_ulong(x86_seg_none, tr.base + 0x66,
-  , 2, ctxt, ops)) )
+switch ( rc = read_ulong(x86_seg_tr, 0x66, , 2, ctxt, ops) )
+{
+case X86EMUL_OKAY:
+break;
+
+case X86EMUL_EXCEPTION:
+generate_exception_if(!ctxt->event_pending, EXC_GP, 0);
+/* fallthrough */
+
+default:
 return rc;
+}
 
-/* Ensure TSS includes two bytes including byte containing first port. */
-iobmp += first_port / 8;
-generate_exception_if(tr.limit <= iobmp, EXC_GP, 0);
+/* Read two bytes including byte containing first port. */
+switch ( rc = read_ulong(x86_seg_tr, iobmp + first_port / 8,
+ , 2, ctxt, ops) )
+{
+case X86EMUL_OKAY:
+break;
+
+case X86EMUL_EXCEPTION:
+generate_exception_if(!ctxt->event_pending, EXC_GP, 0);
+/* fallthrough */
 
-if ( (rc = read_ulong(x86_seg_none, tr.base + iobmp,
-  , 2, ctxt, ops)) )
+default:
 return rc;
+}
+
 generate_exception_if(iobmp & (((1 << bytes) - 1) << (first_port & 7)),
   EXC_GP, 0);
 
@@ -1317,9 +1333,12 @@ realmode_load_seg(
 struct x86_emulate_ctxt *ctxt,
 const struct x86_emulate_ops *ops)
 {
-int rc = ops->read_segment(seg, sreg, ctxt);
+int rc;
+
+if ( !ops->read_segment )
+return X86EMUL_UNHANDLEABLE;
 
-if ( !rc )
+if ( (rc = ops->read_segment(seg, sreg, ctxt)) == X86EMUL_OKAY )
 {
 sreg->sel  = sel;
 sreg->base = (uint32_t)sel << 4;
@@ -1336,7 +1355,7 @@ protmode_load_seg(
 struct x86_emulate_ctxt *ctxt,
 const struct x86_emulate_ops *ops)
 {
-struct segment_register desctab;
+enum x86_segment sel_seg = (sel & 4) ? x86_seg_ldtr : x86_seg_gdtr;
 struct { uint32_t a, b; } desc;
 uint8_t dpl, rpl;
 int cpl = get_cpl(ctxt, ops);
@@ -1369,21 +1388,19 @@ protmode_load_seg(
 if ( !is_x86_user_segment(seg) && (sel & 4) )
 goto raise_exn;
 
-if ( (rc = ops->read_segment((sel & 4) ? x86_seg_ldtr : x86_seg_gdtr,
- , ctxt)) )
-return rc;
-
-/* Segment not valid for use (cooked meaning of .p)? */
-if ( !desctab.attr.fields.p )
-goto raise_exn;
+switch ( rc = ops->read(sel_seg, sel & 0xfff8, , sizeof(desc), ctxt) )
+{
+case X86EMUL_OKAY:
+break;
 
-/* Check against descriptor table limit. */
-if ( ((sel & 0xfff8) + 7) > desctab.limit )
-goto raise_exn;
+case X86EMUL_EXCEPTION:
+if ( !ctxt->event_pending )
+goto raise_exn;
+/* fallthrough */
 
-if ( (rc = ops->read(x86_seg_none, desctab.base + (sel & 0xfff8),
- , sizeof(desc), ctxt)) )
+default:
 return rc;
+}
 
 if ( !is_x86_user_segment(seg) )
 {
@@ -1471,9 +1488,20 @@ protmode_load_seg(
 

[Xen-devel] [PATCH v3 12/24] x86/emul: Remove opencoded exception generation

2016-11-30 Thread Andrew Cooper
Introduce generate_exception() for unconditional exception generation, and
replace existing uses.  Both generate_exception() and generate_exception_if()
are updated to make their error code parameters optional, which removes the
use of the -1 sentinel.

The ioport_access_check() check loses the presence check for %tr, as the x86
architecture has no concept of a non-usable task register.
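
The optional error code can be sketched in standard C as follows. This is a variation on the patch's approach rather than a copy of it: the patch uses GNU named variadic macro parameters, whereas the MKEC() wrapper below is a hypothetical name that relies only on __VA_ARGS__, and NO_EC stands in for X86_EVENT_NO_EC. Appending a trailing 0 supplies the default error code, and the variadic function silently ignores the surplus argument.

```c
#include <assert.h>
#include <stdint.h>

#define EXC_UD  6
#define EXC_DF  8
#define EXC_TS 10
#define EXC_NP 11
#define EXC_SS 12
#define EXC_GP 13
#define EXC_PF 14
#define EXC_AC 17

/* Vectors which architecturally push an error code. */
#define EXC_HAS_EC                                          \
    ((1u << EXC_DF) | (1u << EXC_TS) | (1u << EXC_NP) |     \
     (1u << EXC_SS) | (1u << EXC_GP) | (1u << EXC_PF) | (1u << EXC_AC))

#define NO_EC (-1) /* hypothetical "no error code" marker */

/* Return ec for vectors that carry one, NO_EC otherwise. */
static inline int32_t mkec(uint8_t e, int32_t ec, ...)
{
    return (e < 32 && ((1u << e) & EXC_HAS_EC)) ? ec : NO_EC;
}

/* MKEC(vec) and MKEC(vec, ec) both expand to a valid mkec() call. */
#define MKEC(...) mkec(__VA_ARGS__, 0)
```

The design choice is that call sites for vectors without an error code no longer need to pass a placeholder at all, and passing one by mistake is harmless.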

No functional change.

Signed-off-by: Andrew Cooper 
Reviewed-by: Jan Beulich 
---
v3:
 * Rebase over singlestepping changes
v2:
 * Brackets around &
---
 xen/arch/x86/x86_emulate/x86_emulate.c | 193 +
 1 file changed, 99 insertions(+), 94 deletions(-)

diff --git a/xen/arch/x86/x86_emulate/x86_emulate.c b/xen/arch/x86/x86_emulate/x86_emulate.c
index 0af532e..6adfdbe 100644
--- a/xen/arch/x86/x86_emulate/x86_emulate.c
+++ b/xen/arch/x86/x86_emulate/x86_emulate.c
@@ -457,14 +457,20 @@ typedef union {
 #define EXC_BR  5
 #define EXC_UD  6
 #define EXC_NM  7
+#define EXC_DF  8
 #define EXC_TS 10
 #define EXC_NP 11
 #define EXC_SS 12
 #define EXC_GP 13
 #define EXC_PF 14
 #define EXC_MF 16
+#define EXC_AC 17
 #define EXC_XM 19
 
+#define EXC_HAS_EC  \
+((1u << EXC_DF) | (1u << EXC_TS) | (1u << EXC_NP) | \
+ (1u << EXC_SS) | (1u << EXC_GP) | (1u << EXC_PF) | (1u << EXC_AC))
+
 /* Segment selector error code bits. */
 #define ECODE_EXT (1 << 0)
 #define ECODE_IDT (1 << 1)
@@ -667,14 +673,22 @@ do {\
 if ( rc ) goto done;\
 } while (0)
 
-#define generate_exception_if(p, e, ec)   \
+static inline int mkec(uint8_t e, int32_t ec, ...)
+{
+return (e < 32 && ((1u << e) & EXC_HAS_EC)) ? ec : X86_EVENT_NO_EC;
+}
+
+#define generate_exception_if(p, e, ec...)\
 ({  if ( (p) ) {  \
 fail_if(ops->inject_hw_exception == NULL);\
-rc = ops->inject_hw_exception(e, ec, ctxt) ? : X86EMUL_EXCEPTION; \
+rc = ops->inject_hw_exception(e, mkec(e, ##ec, 0), ctxt)  \
+? : X86EMUL_EXCEPTION;\
 goto done;\
 } \
 })
 
+#define generate_exception(e, ec...) generate_exception_if(true, e, ##ec)
+
 /*
  * Given byte has even parity (even number of 1s)? SDM Vol. 1 Sec. 3.4.3.1,
  * "Status Flags": EFLAGS.PF reflects parity of least-sig. byte of result only.
@@ -785,7 +799,7 @@ static int _get_fpu(
 return rc;
 generate_exception_if(!(cr4 & ((type == X86EMUL_FPU_xmm)
? CR4_OSFXSR : CR4_OSXSAVE)),
-  EXC_UD, -1);
+  EXC_UD);
 }
 
 rc = ops->read_cr(0, , ctxt);
@@ -798,13 +812,13 @@ static int _get_fpu(
 }
 if ( cr0 & CR0_EM )
 {
-generate_exception_if(type == X86EMUL_FPU_fpu, EXC_NM, -1);
-generate_exception_if(type == X86EMUL_FPU_mmx, EXC_UD, -1);
-generate_exception_if(type == X86EMUL_FPU_xmm, EXC_UD, -1);
+generate_exception_if(type == X86EMUL_FPU_fpu, EXC_NM);
+generate_exception_if(type == X86EMUL_FPU_mmx, EXC_UD);
+generate_exception_if(type == X86EMUL_FPU_xmm, EXC_UD);
 }
 generate_exception_if((cr0 & CR0_TS) &&
   (type != X86EMUL_FPU_wait || (cr0 & CR0_MP)),
-  EXC_NM, -1);
+  EXC_NM);
 }
 
  done:
@@ -832,7 +846,7 @@ do { \
 (_fic)->exn_raised = EXC_UD;\
 }   \
 generate_exception_if((_fic)->exn_raised >= 0,  \
-  (_fic)->exn_raised, -1);  \
+  (_fic)->exn_raised);  \
 } while (0)
 
 #define emulate_fpu_insn(_op)   \
@@ -1167,11 +1181,9 @@ static int ioport_access_check(
 if ( (rc = ops->read_segment(x86_seg_tr, , ctxt)) != 0 )
 return rc;
 
-/* Ensure that the TSS is valid and has an io-bitmap-offset field. */
-if ( !tr.attr.fields.p ||
- ((tr.attr.fields.type & 0xd) != 0x9) ||
- (tr.limit < 0x67) )
-goto raise_exception;
+/* Ensure the TSS has an io-bitmap-offset field. */
+generate_exception_if(tr.attr.fields.type != 0xb ||
+  tr.limit < 0x67, EXC_GP, 0);
 
 if ( (rc = read_ulong(x86_seg_none, tr.base + 0x66,
   , 2, ctxt, ops)) )
@@ 

[Xen-devel] [PATCH v3 23/24] x86/emul: Prepare to allow use of system segments for memory references

2016-11-30 Thread Andrew Cooper
All system segments (GDT/IDT/LDT and TR) describe a linear address and limit,
and act similarly to user segments.  However all current uses of these tables
in the emulator opencode the address calculations and limit checks.  In
particular, no care is taken for accesses which wrap around the 4GB or
non-canonical boundaries.

Alter hvm_virtual_to_linear_addr() to cope with performing segmentation checks
on system segments.  This involves restricting access checks in the 32bit case
to user segments only, and adding presence/limit checks in the 64bit case.

When suffering a segmentation fault for a system segment, return
X86EMUL_EXCEPTION but leave the fault injection to the caller.  The fault type
depends on the higher level action being performed.
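
The split of responsibility described above might look roughly like this. Everything in the sketch is hypothetical: the enum ordering, segmentation_failure() and finish_access() are invented for illustration. For user segments the low-level helper knows the fault (#SS or #GP) and records it; for system segments it only returns the exception status, and the higher-level caller chooses the vector and error code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical ordering: user segments first, then system segments. */
enum seg { SEG_ES, SEG_CS, SEG_SS, SEG_DS, SEG_FS, SEG_GS,
           SEG_TR, SEG_LDTR, SEG_GDTR, SEG_IDTR };

enum { EMUL_OKAY, EMUL_EXCEPTION };

#define VEC_TS 10
#define VEC_SS 12
#define VEC_GP 13

struct ctxt {
    bool event_pending;
    uint8_t vector;
    int32_t error_code;
};

static bool is_user_segment(enum seg seg)
{
    return seg <= SEG_GS;
}

static void raise_event(struct ctxt *ctxt, uint8_t vector, int32_t ec)
{
    assert(!ctxt->event_pending);
    ctxt->vector = vector;
    ctxt->error_code = ec;
    ctxt->event_pending = true;
}

/* Low level: only user-segment failures have a known fault type. */
static int segmentation_failure(enum seg seg, struct ctxt *ctxt)
{
    if ( is_user_segment(seg) )
        raise_event(ctxt, seg == SEG_SS ? VEC_SS : VEC_GP, 0);

    return EMUL_EXCEPTION;
}

/* High level: fill in the event if the lower layer left it open. */
static int finish_access(int rc, uint8_t vector, int32_t ec,
                         struct ctxt *ctxt)
{
    if ( rc == EMUL_EXCEPTION && !ctxt->event_pending )
        raise_event(ctxt, vector, ec);

    return rc;
}
```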

Signed-off-by: Andrew Cooper 
Reviewed-by: Jan Beulich 
Reviewed-by: George Dunlap 
Reviewed-by: Paul Durrant 
---
 xen/arch/x86/hvm/emulate.c             | 14 
 xen/arch/x86/hvm/hvm.c                 | 40 ++
 xen/arch/x86/mm/shadow/common.c        | 12 +++---
 xen/arch/x86/x86_emulate/x86_emulate.h | 26 ++
 4 files changed, 62 insertions(+), 30 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index ccf3aa2..d0a043b 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -567,10 +567,16 @@ static int hvmemul_virtual_to_linear(
 if ( *reps != 1 )
 return X86EMUL_UNHANDLEABLE;
 
-/* This is a singleton operation: fail it with an exception. */
-x86_emul_hw_exception((seg == x86_seg_ss)
-  ? TRAP_stack_error
-  : TRAP_gp_fault, 0, _ctxt->ctxt);
+/*
+ * Leave exception injection to the caller for non-user segments: We
+ * neither know the exact error code to be used, nor can we easily
+ * determine the kind of exception (#GP or #TS) in that case.
+ */
+if ( is_x86_user_segment(seg) )
+x86_emul_hw_exception((seg == x86_seg_ss)
+  ? TRAP_stack_error
+  : TRAP_gp_fault, 0, _ctxt->ctxt);
+
 return X86EMUL_EXCEPTION;
 }
 
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 3596f2c..426edee 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -2497,24 +2497,28 @@ bool_t hvm_virtual_to_linear_addr(
 if ( !reg->attr.fields.p )
 goto out;
 
-switch ( access_type )
+/* Read/write restrictions only exist for user segments. */
+if ( reg->attr.fields.s )
 {
-case hvm_access_read:
-if ( (reg->attr.fields.type & 0xa) == 0x8 )
-goto out; /* execute-only code segment */
-break;
-case hvm_access_write:
-if ( (reg->attr.fields.type & 0xa) != 0x2 )
-goto out; /* not a writable data segment */
-break;
-default:
-break;
+switch ( access_type )
+{
+case hvm_access_read:
+if ( (reg->attr.fields.type & 0xa) == 0x8 )
+goto out; /* execute-only code segment */
+break;
+case hvm_access_write:
+if ( (reg->attr.fields.type & 0xa) != 0x2 )
+goto out; /* not a writable data segment */
+break;
+default:
+break;
+}
 }
 
 last_byte = (uint32_t)offset + bytes - !!bytes;
 
 /* Is this a grows-down data segment? Special limit check if so. */
-if ( (reg->attr.fields.type & 0xc) == 0x4 )
+if ( reg->attr.fields.s && (reg->attr.fields.type & 0xc) == 0x4 )
 {
 /* Is upper limit 0x or 0x? */
 if ( !reg->attr.fields.db )
@@ -2530,10 +2534,18 @@ bool_t hvm_virtual_to_linear_addr(
 else
 {
 /*
- * LONG MODE: FS and GS add segment base. Addresses must be canonical.
+ * User segments are always treated as present.  System segment may
+ * not be, and also incur limit checks.
  */
+if ( is_x86_system_segment(seg) &&
+ (!reg->attr.fields.p || (offset + bytes - !!bytes) > reg->limit) )
+goto out;
 
-if ( (seg == x86_seg_fs) || (seg == x86_seg_gs) )
+/*
+ * LONG MODE: FS, GS and system segments: add segment base. All
+ * addresses must be canonical.
+ */
+if ( seg >= x86_seg_fs )
 addr += reg->base;
 
 last_byte = addr + bytes - !!bytes;
diff --git a/xen/arch/x86/mm/shadow/common.c b/xen/arch/x86/mm/shadow/common.c
index fbe49e1..6c146f8 100644
--- a/xen/arch/x86/mm/shadow/common.c
+++ b/xen/arch/x86/mm/shadow/common.c
@@ -162,9 +162,15 @@ static int hvm_translate_linear_addr(
 
 if ( !okay )
 {
-x86_emul_hw_exception(
- 

[Xen-devel] [PATCH v3 17/24] x86/pv: Avoid raising faults behind the emulators back

2016-11-30 Thread Andrew Cooper
Use x86_emul_pagefault() rather than pv_inject_page_fault() to cause raised
pagefaults to be known to the emulator.  This requires altering the callers of
x86_emulate() to properly re-inject the event.
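
The re-injection logic in the caller can be sketched as a switch with deliberate fallthrough. The types and names here (struct ctxt, struct outcome, finish_emulation()) are simplified assumptions for illustration, and "injection" is modelled by setting flags rather than calling real Xen functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum { EMUL_OKAY, EMUL_UNHANDLEABLE, EMUL_EXCEPTION, EMUL_RETRY };
enum { EVT_HW_EXCEPTION, EVT_OTHER };

#define VEC_PF 14

struct ctxt {
    bool event_pending;
    int event_type;
    uint8_t vector;
    bool singlestep; /* retire flag */
};

/* What the caller ended up doing, for observation. */
struct outcome {
    bool pf_injected;
    bool warned;
    bool db_injected;
};

/* Returns 1 if the fault counts as fixed, 0 to bail out. */
static int finish_emulation(int rc, const struct ctxt *ctxt,
                            struct outcome *out)
{
    /* A pending event and an EXCEPTION result must go together. */
    assert(ctxt->event_pending == (rc == EMUL_EXCEPTION));

    switch ( rc )
    {
    case EMUL_EXCEPTION:
        /* Only #PF is expected; anything else is merely logged. */
        if ( ctxt->event_type == EVT_HW_EXCEPTION &&
             ctxt->vector == VEC_PF )
            out->pf_injected = true;
        else
            out->warned = true;
        /* Fallthrough */
    case EMUL_OKAY:
        if ( ctxt->singlestep )
            out->db_injected = true;
        /* Fallthrough */
    case EMUL_RETRY:
        return 1;

    default:
        return 0;
    }
}
```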

Signed-off-by: Andrew Cooper 
---
CC: Jan Beulich 
CC: Tim Deegan 

v3:
 * Split out #DB handling to an earlier part of the series
 * Don't raise #GP faults for unexpected events, but do return back to the
   guest.
v2:
 * New
---
 xen/arch/x86/mm.c | 104 ++
 1 file changed, 65 insertions(+), 39 deletions(-)

diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 5d59479..cdfa85e 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -5136,7 +5136,7 @@ static int ptwr_emulated_read(
 if ( !__addr_ok(addr) ||
  (rc = __copy_from_user(p_data, (void *)addr, bytes)) )
 {
-pv_inject_page_fault(0, addr + bytes - rc); /* Read fault. */
+x86_emul_pagefault(0, addr + bytes - rc, ctxt);  /* Read fault. */
 return X86EMUL_EXCEPTION;
 }
 
@@ -5177,8 +5177,9 @@ static int ptwr_emulated_update(
 addr &= ~(sizeof(paddr_t)-1);
 if ( (rc = copy_from_user(, (void *)addr, sizeof(paddr_t))) != 0 )
 {
-pv_inject_page_fault(0, /* Read fault. */
- addr + sizeof(paddr_t) - rc);
+x86_emul_pagefault(0, /* Read fault. */
+   addr + sizeof(paddr_t) - rc,
+   _ctxt->ctxt);
 return X86EMUL_EXCEPTION;
 }
 /* Mask out bits provided by caller. */
@@ -5379,30 +5380,41 @@ int ptwr_do_page_fault(struct vcpu *v, unsigned long addr,
 page_unlock(page);
 put_page(page);
 
-/*
- * The previous lack of inject_{sw,hw}*() hooks caused exceptions raised
- * by the emulator itself to become X86EMUL_UNHANDLEABLE.  Such exceptions
- * now set event_pending instead.  Exceptions raised behind the back of
- * the emulator don't yet set event_pending.
- *
- * For now, cause such cases to return to the X86EMUL_UNHANDLEABLE path,
- * for no functional change from before.  Future patches will fix this
- * properly.
- */
-if ( rc == X86EMUL_EXCEPTION && ptwr_ctxt.ctxt.event_pending )
-rc = X86EMUL_UNHANDLEABLE;
+/* More strict than x86_emulate_wrapper(), as this is now true for PV. */
+ASSERT(ptwr_ctxt.ctxt.event_pending == (rc == X86EMUL_EXCEPTION));
 
-if ( rc == X86EMUL_UNHANDLEABLE )
-goto bail;
+switch ( rc )
+{
+case X86EMUL_EXCEPTION:
+/*
+ * This emulation only covers writes to pagetables which marked
+ * read-only by Xen.  We tolerate #PF (from hitting an adjacent page).
+ * Anything else is an emulation bug, or a guest playing with the
+ * instruction stream under Xen's feet.
+ */
+if ( ptwr_ctxt.ctxt.event.type == X86_EVENTTYPE_HW_EXCEPTION &&
+ ptwr_ctxt.ctxt.event.vector == TRAP_page_fault )
+pv_inject_event(_ctxt.ctxt.event);
+else
+gdprintk(XENLOG_WARNING,
+ "Unexpected event (type %u, vector %#x) from emulation\n",
+ ptwr_ctxt.ctxt.event.type, ptwr_ctxt.ctxt.event.vector);
+
+/* Fallthrough */
+case X86EMUL_OKAY:
 
-if ( ptwr_ctxt.ctxt.retire.singlestep )
-pv_inject_hw_exception(TRAP_debug, X86_EVENT_NO_EC);
+if ( ptwr_ctxt.ctxt.retire.singlestep )
+pv_inject_hw_exception(TRAP_debug, X86_EVENT_NO_EC);
 
-perfc_incr(ptwr_emulations);
-return EXCRET_fault_fixed;
+/* Fallthrough */
+case X86EMUL_RETRY:
+perfc_incr(ptwr_emulations);
+return EXCRET_fault_fixed;
 
  bail:
-return 0;
+default:
+return 0;
+}
 }
 
 /*
@@ -5519,26 +5531,40 @@ int mmio_ro_do_page_fault(struct vcpu *v, unsigned long addr,
 else
 rc = x86_emulate(, _ro_emulate_ops);
 
-/*
- * The previous lack of inject_{sw,hw}*() hooks caused exceptions raised
- * by the emulator itself to become X86EMUL_UNHANDLEABLE.  Such exceptions
- * now set event_pending instead.  Exceptions raised behind the back of
- * the emulator don't yet set event_pending.
- *
- * For now, cause such cases to return to the X86EMUL_UNHANDLEABLE path,
- * for no functional change from before.  Future patches will fix this
- * properly.
- */
-if ( rc == X86EMUL_EXCEPTION && ctxt.event_pending )
-rc = X86EMUL_UNHANDLEABLE;
+/* More strict than x86_emulate_wrapper(), as this is now true for PV. */
+ASSERT(ctxt.event_pending == (rc == X86EMUL_EXCEPTION));
 
-if ( rc == X86EMUL_UNHANDLEABLE )
-return 0;
+switch ( rc )
+{
+case X86EMUL_EXCEPTION:
+/*
+ * This emulation only covers writes to MMCFG space or read-only MFNs.
+ * We 

Re: [Xen-devel] [PATCH v3.1 15/15] xen/x86: setup PVHv2 Dom0 ACPI tables

2016-11-30 Thread Jan Beulich
>>> On 30.11.16 at 13:40,  wrote:
> On Mon, Nov 14, 2016 at 09:15:37AM -0700, Jan Beulich wrote:
>> >>> On 29.10.16 at 11:00,  wrote:
>> > Also, regions marked as E820_ACPI or E820_NVS are identity mapped into Dom0
>> > p2m, plus any top-level ACPI tables that should be accessible to Dom0 and
>> > that don't reside in RAM regions. This is needed because some memory maps
>> > don't properly account for all the memory used by ACPI, so it's common to
>> > find ACPI tables in holes.
>> 
>> I question whether this behavior should be enabled by default. Not
>> having seen the code yet I also wonder whether these regions
>> shouldn't simply be added to the guest's E820 as E820_ACPI, which
>> should then result in them getting mapped without further special
>> casing.
>> 
>> > +static int __init hvm_add_mem_range(struct domain *d, uint64_t s, uint64_t e,
>> > +uint32_t type)
>> 
>> I see s and e being uint64_t, but I don't see why type can't be plain
>> unsigned int.
> 
> Well, that's the type for "type" as defined in e820.h. I'm just using
> uint32_t for consistency with that.

As said a number of times in various contexts: We should try to
get away from using fixed width types where we don't really need
them.

>> > +{
>> > +d->arch.e820[i].size += e - s;
>> > +return 0;
>> > +}
>> > +
>> > +if ( rs >= e )
>> > +break;
>> > +
>> > +if ( re > s )
>> > +return -ENOMEM;
>> 
>> I don't think ENOMEM is appropriate to signal an overlap. And don't
>> you need to reverse these last two if()s?
> 
> I've changed ENOMEM to EEXIST. Hm, I don't think so: if I reversed those, we
> would get an error when trying to add a non-contiguous region to fill a hole
> between two existing regions, right?

Looks like I've managed to write something else than I meant. I was
really thinking of

    if ( re > s )
    {
        if ( rs >= e )
            break;
        return -ENOMEM;
    }

But then again I think with things being sorted it may not matter at all.
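
The equivalence of the two orderings can be checked with a small sketch. It assumes half-open ranges, with [rs, re) an existing region and [s, e) the range being added; the enum and function names are hypothetical and do not come from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical outcome labels for one loop iteration over a region list. */
enum range_cmp { RANGE_SKIP, RANGE_INSERT_HERE, RANGE_OVERLAP };

/* Ordering as posted in the patch: "break" test first, overlap test second. */
static enum range_cmp classify_posted(uint64_t rs, uint64_t re,
                                      uint64_t s, uint64_t e)
{
    assert(rs < re && s < e);   /* both ranges must be non-empty */

    if ( rs >= e )
        return RANGE_INSERT_HERE;   /* the loop would "break" here */
    if ( re > s )
        return RANGE_OVERLAP;       /* the loop would return an error */
    return RANGE_SKIP;              /* region lies entirely below [s, e) */
}

/* Nested ordering from the reply. */
static enum range_cmp classify_nested(uint64_t rs, uint64_t re,
                                      uint64_t s, uint64_t e)
{
    assert(rs < re && s < e);

    if ( re > s )
    {
        if ( rs >= e )
            return RANGE_INSERT_HERE;
        return RANGE_OVERLAP;
    }
    return RANGE_SKIP;
}
```

The two forms could only disagree when rs >= e and re <= s hold at once, which would require rs > re. For well-formed regions that cannot happen, so the choice is one of readability.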

>> > +ACPI_MEMCPY(intsrcovr, intr, sizeof(*intr));
>> 
>> Structure assignment (for type safety; also elsewhere)?
> 
> I wasn't sure what to do here, since there's a specific ACPI_MEMCPY function,
> but I guess this is designed to be used by acpica code itself, and
> ACPI_MEMCPY is just an OS-agnostic wrapper to memcpy.

Indeed.

>> > +/* Setup the IO APIC entry. */
>> > +if ( nr_ioapics > 1 )
>> > +printk("WARNING: found %d IO APICs, Dom0 will only have access to 1 emulated IO APIC\n",
>> > +   nr_ioapics);
>> 
>> I've said elsewhere already that I think we should provide 1 vIO-APIC
>> per physical one.
> 
> Agree, but the current vIO-APIC is not really up to it. I will work on
> getting it to support multiple instances.

Until then this should obtain a grep-able "fixme" annotation.

>> > +io_apic = (struct acpi_madt_io_apic *)(madt + 1);
>> > +io_apic->header.type = ACPI_MADT_TYPE_IO_APIC;
>> > +io_apic->header.length = sizeof(*io_apic);
>> > +io_apic->id = 1;
>> > +io_apic->address = VIOAPIC_DEFAULT_BASE_ADDRESS;
>> > +
>> > +local_apic = (struct acpi_madt_local_apic *)(io_apic + 1);
>> > +for ( i = 0; i < dom0_max_vcpus(); i++ )
>> > +{
>> > +local_apic->header.type = ACPI_MADT_TYPE_LOCAL_APIC;
>> > +local_apic->header.length = sizeof(*local_apic);
>> > +local_apic->processor_id = i;
>> > +local_apic->id = i * 2;
>> > +local_apic->lapic_flags = ACPI_MADT_ENABLED;
>> > +local_apic++;
>> > +}
>> 
>> What about x2apic? And for lapic, do you limit vCPU count anywhere?
> 
> Yes, there's no x2apic information, I'm currently looking at libacpi in
> tools, and there doesn't seem to be any local x2apic structure there either.
> Am I missing something?

I don't think you are.

> Regarding vCPU count, I will limit it to 128.

With it limited there'll be no strict need for x2apic structures. Still
we should get them added eventually.

Jan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v10 09/13] x86: change default load address from 1 MiB to 2 MiB

2016-11-30 Thread Juergen Gross
On 30/11/16 14:51, Juergen Gross wrote:
> On 30/11/16 14:04, Daniel Kiper wrote:
>> Subsequent patches introducing relocatable early boot code play with
>> page tables using 2 MiB huge pages. If load address is not aligned at
>> 2 MiB then code touching such page tables must have special cases for
>> start and end of Xen image memory region. So, let's make life easier
>> and move default load address from 1 MiB to 2 MiB. This way page table
>> code will be nice and easy. Hence, there is a chance that it will be
>> less error prone too... :-)))
>>
>> Additionally, drop first 2 MiB mapping from Xen image mapping.
>> It is no longer needed.
>>
>> Signed-off-by: Daniel Kiper 
>> Reviewed-by: Jan Beulich 
>> ---
>> v8 - suggestions/fixes:
>>- drop first 2 MiB mapping from Xen image mapping
>>  (suggested by Jan Beulich),
>>- improve commit message.
>>
>> v7 - suggestions/fixes:
>>- minor cleanups
>>  (suggested by Jan Beulich).
>> ---
>>  xen/arch/x86/Makefile  |2 +-
>>  xen/arch/x86/Rules.mk  |3 +++
>>  xen/arch/x86/boot/head.S   |8 
>>  xen/arch/x86/boot/x86_64.S |5 +++--
>>  xen/arch/x86/setup.c   |3 ++-
>>  xen/arch/x86/xen.lds.S |2 +-
>>  6 files changed, 10 insertions(+), 13 deletions(-)
>>
>> diff --git a/xen/arch/x86/Makefile b/xen/arch/x86/Makefile
>> index e74fe62..d5d0651 100644
>> --- a/xen/arch/x86/Makefile
>> +++ b/xen/arch/x86/Makefile
>> @@ -90,7 +90,7 @@ all_symbols =
>>  endif
>>  
>>  $(TARGET): $(TARGET)-syms $(efi-y) boot/mkelf32
>> -./boot/mkelf32 $(notes_phdrs) $(TARGET)-syms $(TARGET) 0x10 \
>> +./boot/mkelf32 $(notes_phdrs) $(TARGET)-syms $(TARGET) $(XEN_IMG_OFFSET) \
>> `$(NM) $(TARGET)-syms | sed -ne 's/^\([^ ]*\) . __2M_rwdata_end$$/0x\1/p'`
> 
> This doesn't apply (somehow you managed to insert spaces into the patch
> file).

Sorry for the noise, somehow I managed to skip patch 4.


Juergen



Re: [Xen-devel] [PATCH v9 07/13] x86: add multiboot2 protocol support for EFI platforms

2016-11-30 Thread Jan Beulich
>>> On 30.11.16 at 14:45,  wrote:
> On Fri, Nov 25, 2016 at 12:50:55AM -0700, Jan Beulich wrote:
>> >>> On 24.11.16 at 22:44,  wrote:
>> > On Thu, Nov 24, 2016 at 04:08:12AM -0700, Jan Beulich wrote:
>> >> >>> On 23.11.16 at 19:52,  wrote:
>> >> > Always use add/sub 1 in preference to inc and dec.  They are the same
>> >> > length to encode in 64bit, and avoids a pipeline stall from a merge of
>> >> > the eflags register.
>> >>
>> >> What you say regarding length is not true - add/sub need to encode
>> >> the immediate somewhere (even if the operand was a register,
>> >> inc/dec would still be smaller than add/sub, just not by as much as
>> >> in 32-bit code). And the pipeline stall, afaik, affects only rather old
>> >> processors.
>> >
>> > Intel 64 and IA-32 Architectures Optimization Reference Manual, section
>> > 3.5.1.1, Use of the INC and DEC Instructions says nothing about exceptions.
>>
>> Which by itself is suspicious, as the dependency issue had been
>> introduced only in (iirc) Pentium4. And anyway, this is a general
> 
> Hmmm... Interesting... It looks that INC/DEC behavior has not changed since
> the beginning (why it would change? It would not make sense).

Please properly disambiguate "behavior": Architectural behavior of
course can't change. Performance, otoh, has changed many times.
And the resource dependency issue did appear only once the
pipelining of instructions was sophisticated enough, but not good
enough yet to track (as a dependency) EFLAGS.CF separately from
the other arithmetic result flags.
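
The architectural difference behind this dependency can be illustrated with a toy model. It covers only CF and ZF for 8-bit operands, says nothing about performance, and uses hypothetical names throughout: ADD rewrites CF, while INC leaves CF as it was, so the flags value after INC still depends on whatever last produced CF.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy flags state: only the two bits needed for the illustration. */
struct flags_model {
    bool cf;   /* carry flag */
    bool zf;   /* zero flag */
};

/* Model of "add $1": every modelled flag is rewritten, including CF. */
static uint8_t model_add1(uint8_t v, struct flags_model *f)
{
    uint8_t r = (uint8_t)(v + 1);

    assert(f != NULL);
    f->cf = (r == 0);   /* 8-bit carry out happens only for 0xff + 1 */
    f->zf = (r == 0);
    return r;
}

/* Model of "inc": ZF is rewritten, CF keeps its previous value. */
static uint8_t model_inc(uint8_t v, struct flags_model *f)
{
    uint8_t r = (uint8_t)(v + 1);

    assert(f != NULL);
    f->zf = (r == 0);
    return r;
}
```

In this model the result of model_inc() is a merge of old and new flag bits, which is the situation a pipeline has to handle unless it tracks CF separately.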

Jan




[Xen-devel] [PATCH] xen/arm: Fix misplaced parentheses for PSCI version check

2016-11-30 Thread Artem Mygaiev
Fix misplaced parentheses for PSCI version check

Signed-off-by: Artem Mygaiev 
---
 xen/arch/arm/psci.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/xen/arch/arm/psci.c b/xen/arch/arm/psci.c
index 7966b5e..34ee97e 100644
--- a/xen/arch/arm/psci.c
+++ b/xen/arch/arm/psci.c
@@ -147,7 +147,7 @@ int __init psci_init_0_2(void)
 psci_ver = call_smc(PSCI_0_2_FN_PSCI_VERSION, 0, 0, 0);
 
 /* For the moment, we only support PSCI 0.2 and PSCI 1.x */
-if ( psci_ver != PSCI_VERSION(0, 2) && PSCI_VERSION_MAJOR(psci_ver != 1) )
+if ( psci_ver != PSCI_VERSION(0, 2) && PSCI_VERSION_MAJOR(psci_ver) != 1 )
 {
 printk("Error: Unrecognized PSCI version %u.%u\n",
PSCI_VERSION_MAJOR(psci_ver), PSCI_VERSION_MINOR(psci_ver));
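
A minimal sketch of why the old condition never rejected anything. The macros below are local stand-ins that assume the usual PSCI layout (major version in bits 31:16, minor in bits 15:0), and the two helper functions are hypothetical; neither is copied from Xen's headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in macros; the bit layout is an assumption stated above. */
#define PSCI_VERSION_MAJOR(ver)    (((ver) >> 16) & 0xffffu)
#define PSCI_VERSION_MINOR(ver)    ((ver) & 0xffffu)
#define PSCI_VERSION(major, minor) (((uint32_t)(major) << 16) | (minor))

static_assert(PSCI_VERSION_MAJOR(PSCI_VERSION(1, 0)) == 1,
              "major version must round-trip through the macros");

/*
 * Pre-patch form: the comparison is evaluated first, so the macro is applied
 * to 0 or 1, whose "major" field is always 0.  The right-hand side of && is
 * therefore always false.
 */
static bool rejected_before(uint32_t psci_ver)
{
    return psci_ver != PSCI_VERSION(0, 2) && PSCI_VERSION_MAJOR(psci_ver != 1);
}

/* Post-patch form: extract the major version, then compare it with 1. */
static bool rejected_after(uint32_t psci_ver)
{
    return psci_ver != PSCI_VERSION(0, 2) && PSCI_VERSION_MAJOR(psci_ver) != 1;
}
```

Under these assumptions a firmware reporting 0.1 or 2.0 passes the old check silently and is rejected by the new one.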



[Xen-devel] [PATCH v3 02/24] x86/emul: Drop X86EMUL_CMPXCHG_FAILED

2016-11-30 Thread Andrew Cooper
X86EMUL_CMPXCHG_FAILED was introduced in c/s d430aae25 in 2005.  Even at the
time it aliased what is now X86EMUL_RETRY (as well as what is now
X86EMUL_EXCEPTION).  I am not sure why the distinction was considered useful
at the time.

It is only used twice; there is no need to call it out differently from other
uses of X86EMUL_RETRY.

No functional change.

Signed-off-by: Andrew Cooper 
Acked-by: Tim Deegan 
Acked-by: Jan Beulich 
---
v2:
 * New
---
 xen/arch/x86/mm.c  | 2 +-
 xen/arch/x86/mm/shadow/multi.c | 2 +-
 xen/arch/x86/x86_emulate/x86_emulate.h | 2 --
 3 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 03dcd71..5b0e9f3 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -5254,7 +5254,7 @@ static int ptwr_emulated_update(
 {
 unmap_domain_page(pl1e);
 put_page_from_l1e(nl1e, d);
-return X86EMUL_CMPXCHG_FAILED;
+return X86EMUL_RETRY;
 }
 }
 else
diff --git a/xen/arch/x86/mm/shadow/multi.c b/xen/arch/x86/mm/shadow/multi.c
index d70b1c6..9ee48a8 100644
--- a/xen/arch/x86/mm/shadow/multi.c
+++ b/xen/arch/x86/mm/shadow/multi.c
@@ -4694,7 +4694,7 @@ sh_x86_emulate_cmpxchg(struct vcpu *v, unsigned long vaddr,
 }
 
 if ( prev != old )
-rv = X86EMUL_CMPXCHG_FAILED;
+rv = X86EMUL_RETRY;
 
 SHADOW_DEBUG(EMULATE, "va %#lx was %#lx expected %#lx"
   " wanted %#lx now %#lx bytes %u\n",
diff --git a/xen/arch/x86/x86_emulate/x86_emulate.h b/xen/arch/x86/x86_emulate/x86_emulate.h
index 993c576..ec824ce 100644
--- a/xen/arch/x86/x86_emulate/x86_emulate.h
+++ b/xen/arch/x86/x86_emulate/x86_emulate.h
@@ -109,8 +109,6 @@ struct __attribute__((__packed__)) segment_register {
 #define X86EMUL_EXCEPTION  2
  /* Retry the emulation for some reason. No state modified. */
 #define X86EMUL_RETRY  3
- /* (cmpxchg accessor): CMPXCHG failed. Maps to X86EMUL_RETRY in caller. */
-#define X86EMUL_CMPXCHG_FAILED 3
 
 /* FPU sub-types which may be requested via ->get_fpu(). */
 enum x86_emulate_fpu_type {
-- 
2.1.4
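
To illustrate why the separate name carried no information, here is a small sketch using the three values visible in the hunk above. The rc_name() helper is hypothetical: because both macros expand to 3, a caller cannot tell them apart, and listing both as case labels would not even compile.

```c
#include <assert.h>
#include <string.h>

/* Values as shown in the x86_emulate.h hunk. */
#define X86EMUL_EXCEPTION      2
#define X86EMUL_RETRY          3
#define X86EMUL_CMPXCHG_FAILED 3   /* the alias being dropped */

static_assert(X86EMUL_CMPXCHG_FAILED == X86EMUL_RETRY,
              "the two names denote the same return code");

/* Hypothetical helper mapping a return code to a label. */
static const char *rc_name(int rc)
{
    switch ( rc )
    {
    case X86EMUL_EXCEPTION:
        return "exception";
    case X86EMUL_RETRY:
        return "retry";
    /* "case X86EMUL_CMPXCHG_FAILED:" here would be a duplicate case label. */
    default:
        return "other";
    }
}
```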




[Xen-devel] [PATCH for-4.9 v3 00/24] XSA-191 followup

2016-11-30 Thread Andrew Cooper
This is the quantity of changes required to fix some edgecases in XSA-191
which were ultimately chosen not to go out in the security fix.  The main
purpose of this series is to fix emulation sufficiently to allow the final
patch to avoid opencoding all of the segmentation logic.

Changes from v2:

 * 5 new patches (7-11) fixing x86_emulate() not to return X86EMUL_EXCEPTION
   with trap semantics.
 * Adjustments to callers of x86_emulate() to cope with the fault semantics.
 * Tweaks to the implementation of pv_inject_{event,page_fault,hw_exception}().

Andrew Cooper (24):
  x86/shadow: Fix #PFs from emulated writes crossing a page boundary
  x86/emul: Drop X86EMUL_CMPXCHG_FAILED
  x86/emul: Simplfy emulation state setup
  x86/emul: Rename hvm_trap to x86_event and move it into the emulation infrastructure
  x86/emul: Rename HVM_DELIVER_NO_ERROR_CODE to X86_EVENT_NO_EC
  x86/pv: Implement pv_inject_{event,page_fault,hw_exception}()
  x86/emul: Clean up the naming of the retire union
  x86/emul: Correct the behaviour of pop %ss and interrupt shadowing
  x86/emul: Provide a wrapper to x86_emulate() to ASSERT() certain behaviour
  x86/emul: Always use fault semantics for software events
  x86/emul: Implement singlestep as a retire flag
  x86/emul: Remove opencoded exception generation
  x86/emul: Rework emulator event injection
  x86/vmx: Use hvm_{get,set}_segment_register() rather than vmx_{get,set}_segment_register()
  x86/hvm: Reposition the modification of raw segment data from the VMCB/VMCS
  x86/emul: Avoid raising faults behind the emulators back
  x86/pv: Avoid raising faults behind the emulators back
  x86/shadow: Avoid raising faults behind the emulators back
  x86/hvm: Extend the hvm_copy_*() API with a pagefault_info pointer
  x86/hvm: Reimplement hvm_copy_*_nofault() in terms of no pagefault_info
  x86/hvm: Rename hvm_copy_*_guest_virt() to hvm_copy_*_guest_linear()
  x86/hvm: Avoid __hvm_copy() raising #PF behind the emulators back
  x86/emul: Prepare to allow use of system segments for memory references
  x86/emul: Use system-segment relative memory accesses

 tools/tests/x86_emulator/test_x86_emulator.c |   1 +
 tools/tests/x86_emulator/x86_emulate.c   |   3 +
 xen/arch/x86/hvm/emulate.c   | 147 ---
 xen/arch/x86/hvm/hvm.c   | 370 +++
 xen/arch/x86/hvm/io.c|   4 +-
 xen/arch/x86/hvm/nestedhvm.c |   2 +-
 xen/arch/x86/hvm/svm/nestedsvm.c |  13 +-
 xen/arch/x86/hvm/svm/svm.c   | 144 +--
 xen/arch/x86/hvm/vmx/intr.c  |   2 +-
 xen/arch/x86/hvm/vmx/realmode.c  |  16 +-
 xen/arch/x86/hvm/vmx/vmx.c   | 109 
 xen/arch/x86/hvm/vmx/vvmx.c  |  44 ++--
 xen/arch/x86/mm.c|  94 +--
 xen/arch/x86/mm/shadow/common.c  |  40 +--
 xen/arch/x86/mm/shadow/multi.c   |  57 -
 xen/arch/x86/traps.c | 147 ++-
 xen/arch/x86/x86_emulate/x86_emulate.c   | 357 +++---
 xen/arch/x86/x86_emulate/x86_emulate.h   | 219 +---
 xen/include/asm-x86/desc.h   |   6 +
 xen/include/asm-x86/domain.h |  26 ++
 xen/include/asm-x86/hvm/emulate.h|   3 -
 xen/include/asm-x86/hvm/hvm.h|  86 +++
 xen/include/asm-x86/hvm/support.h|  42 ++-
 xen/include/asm-x86/hvm/svm/nestedsvm.h  |   6 +-
 xen/include/asm-x86/hvm/vcpu.h   |   2 +-
 xen/include/asm-x86/hvm/vmx/vmx.h|   2 -
 xen/include/asm-x86/hvm/vmx/vvmx.h   |   4 +-
 xen/include/asm-x86/mm.h |   1 -
 28 files changed, 1190 insertions(+), 757 deletions(-)

-- 
2.1.4




[Xen-devel] [PATCH v3 09/24] x86/emul: Provide a wrapper to x86_emulate() to ASSERT() certain behaviour

2016-11-30 Thread Andrew Cooper
In debug builds, confirm that some properties of x86_emulate()'s behaviour
actually hold.  The first property, fixed in a previous change, is that retire
flags are only ever set in the X86EMUL_OKAY case.

While adjusting the userspace test harness to cope with ASSERT() in
x86_emulate.h, fix a build problem introduced in c/s 122dd9575c7 "x86emul:
in_longmode() should not ignore ->read_msr() errors" by providing an
implementation of likely()/unlikely().

Signed-off-by: Andrew Cooper 
---
CC: Jan Beulich 

v3:
 * New
---
 tools/tests/x86_emulator/test_x86_emulator.c |  1 +
 tools/tests/x86_emulator/x86_emulate.c   |  3 +++
 xen/arch/x86/x86_emulate/x86_emulate.c   |  5 +
 xen/arch/x86/x86_emulate/x86_emulate.h   | 25 +
 4 files changed, 34 insertions(+)

diff --git a/tools/tests/x86_emulator/test_x86_emulator.c b/tools/tests/x86_emulator/test_x86_emulator.c
index f255fef..b54fd11 100644
--- a/tools/tests/x86_emulator/test_x86_emulator.c
+++ b/tools/tests/x86_emulator/test_x86_emulator.c
@@ -1,3 +1,4 @@
+#include 
 #include 
 #include 
 #include 
diff --git a/tools/tests/x86_emulator/x86_emulate.c b/tools/tests/x86_emulator/x86_emulate.c
index c46b7fc..3272867 100644
--- a/tools/tests/x86_emulator/x86_emulate.c
+++ b/tools/tests/x86_emulator/x86_emulate.c
@@ -50,4 +50,7 @@ typedef bool bool_t;
 #define __init
 #define __maybe_unused __attribute__((__unused__))
 
+#define likely(x) __builtin_expect(!!(x),1)
+#define unlikely(x)   __builtin_expect(!!(x),0)
+
 #include "x86_emulate/x86_emulate.c"
diff --git a/xen/arch/x86/x86_emulate/x86_emulate.c b/xen/arch/x86/x86_emulate/x86_emulate.c
index bacdee6..e4643a3 100644
--- a/xen/arch/x86/x86_emulate/x86_emulate.c
+++ b/xen/arch/x86/x86_emulate/x86_emulate.c
@@ -2404,6 +2404,11 @@ x86_decode(
 #undef insn_fetch_bytes
 #undef insn_fetch_type
 
+/* Undo DEBUG wrapper. */
+#ifdef x86_emulate
+#undef x86_emulate
+#endif
+
 int
 x86_emulate(
 struct x86_emulate_ctxt *ctxt,
diff --git a/xen/arch/x86/x86_emulate/x86_emulate.h b/xen/arch/x86/x86_emulate/x86_emulate.h
index ef39601..f84ced2 100644
--- a/xen/arch/x86/x86_emulate/x86_emulate.h
+++ b/xen/arch/x86/x86_emulate/x86_emulate.h
@@ -23,6 +23,10 @@
 #ifndef __X86_EMULATE_H__
 #define __X86_EMULATE_H__
 
+#ifndef ASSERT
+#define ASSERT assert
+#endif
+
 #define MAX_INST_LEN 15
 
 struct x86_emulate_ctxt;
@@ -554,6 +558,27 @@ x86_emulate(
 const struct x86_emulate_ops *ops);
 
 /*
+ * In debug builds, wrap x86_emulate() with some assertions about its expected
+ * behaviour.
+ */
+#ifndef NDEBUG
+static inline int x86_emulate_wrapper(
+struct x86_emulate_ctxt *ctxt,
+const struct x86_emulate_ops *ops)
+{
+int rc = x86_emulate(ctxt, ops);
+
+/* Retire flags should only be set for successful instruction emulation. */
+if ( rc != X86EMUL_OKAY )
+ASSERT(ctxt->retire.raw == 0);
+
+return rc;
+}
+
+#define x86_emulate x86_emulate_wrapper
+#endif
+
+/*
  * Given the 'reg' portion of a ModRM byte, and a register block, return a
  * pointer into the block that addresses the relevant register.
  * @highbyte_regs specifies whether to decode AH,CH,DH,BH.
-- 
2.1.4
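
The wrap-and-rename technique used by this patch can be sketched in a single file as follows. All names here (emulate, emulate_wrapper, struct emul_ctxt, want_rc, run_one) are hypothetical stand-ins; only the overall shape (a debug-only inline wrapper, a macro redirecting callers to it, and an #undef ahead of the real definition) mirrors the diff above.

```c
#include <assert.h>
#include <stdint.h>

#define EMUL_OKAY      0
#define EMUL_EXCEPTION 2

struct emul_ctxt {
    uint8_t retire_raw;   /* stand-in for ctxt->retire.raw */
    int     want_rc;      /* demo knob: return code the stub should produce */
};

/* "Header" part: declaration plus a debug-only checking wrapper. */
int emulate(struct emul_ctxt *ctxt);

#ifndef NDEBUG
static inline int emulate_wrapper(struct emul_ctxt *ctxt)
{
    int rc = emulate(ctxt);   /* macro not defined yet: calls the real one */

    /* Retire flags should only be set for successful emulation. */
    if ( rc != EMUL_OKAY )
        assert(ctxt->retire_raw == 0);

    return rc;
}

#define emulate emulate_wrapper
#endif

/* "Implementation" part: undo the wrapper before the real definition. */
#ifdef emulate
#undef emulate
#endif

int emulate(struct emul_ctxt *ctxt)
{
    /* Stub behaviour: report retire state only on success. */
    ctxt->retire_raw = (ctxt->want_rc == EMUL_OKAY) ? 1 : 0;
    return ctxt->want_rc;
}

/*
 * "Caller" part: a separate source file would still see the macro from the
 * header; this single-file sketch has to re-establish it by hand.
 */
#ifndef NDEBUG
#define emulate emulate_wrapper
#endif

static int run_one(struct emul_ctxt *ctxt)
{
    return emulate(ctxt);   /* emulate_wrapper() in debug builds */
}
```

The appeal of this shape is that call sites stay unchanged and release builds pay nothing, since the wrapper and the macro vanish when NDEBUG is defined.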




[Xen-devel] [PATCH v3 06/24] x86/pv: Implement pv_inject_{event, page_fault, hw_exception}()

2016-11-30 Thread Andrew Cooper
To help with event injection improvements for the PV uses of x86_emulate(),
implement an event injection API which matches its hvm counterpart.

This starts by taking do_guest_trap() and changing its calling API to become
pv_inject_event(), subsequently implementing the former in terms of the
latter.

The existing propagate_page_fault() is fairly similar to
pv_inject_page_fault(), although it has a return value.  Only a single caller
makes use of the return value, and non-NULL is only returned if the passed cr2
is non-canonical.  Opencode this single case in
handle_gdt_ldt_mapping_fault(), allowing propagate_page_fault() to become
void.

The call to reserved_bit_page_fault() in propagate_page_fault() was
conceptually wrong to start with.  Complaining about reserved bits should be
part of handling the pagefault itself, not part of injecting a pagefault into
the guest.  It is therefore moved ahead of the injection call in
do_page_fault() to compensate.

The remaining #PF specific bits are moved into pv_inject_event(), and
pv_inject_page_fault() is implemented as a static inline wrapper.

No practical change from a guest's point of view.

Signed-off-by: Andrew Cooper 
Acked-by: Tim Deegan 
---
CC: Jan Beulich 

v3:
 * Reposition reserved_bit_page_fault() handling
v2:
 * New
---
 xen/arch/x86/mm.c   |   5 +-
 xen/arch/x86/mm/shadow/common.c |   4 +-
 xen/arch/x86/traps.c| 147 
 xen/include/asm-x86/domain.h|  26 +++
 xen/include/asm-x86/mm.h|   1 -
 5 files changed, 104 insertions(+), 79 deletions(-)

diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index d365f59..b7c7122 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -5136,7 +5136,7 @@ static int ptwr_emulated_read(
 if ( !__addr_ok(addr) ||
  (rc = __copy_from_user(p_data, (void *)addr, bytes)) )
 {
-propagate_page_fault(addr + bytes - rc, 0); /* read fault */
+pv_inject_page_fault(0, addr + bytes - rc); /* Read fault. */
 return X86EMUL_EXCEPTION;
 }
 
@@ -5177,7 +5177,8 @@ static int ptwr_emulated_update(
 addr &= ~(sizeof(paddr_t)-1);
 if ( (rc = copy_from_user(, (void *)addr, sizeof(paddr_t))) != 0 )
 {
-propagate_page_fault(addr+sizeof(paddr_t)-rc, 0); /* read fault */
+pv_inject_page_fault(0, /* Read fault. */
+ addr + sizeof(paddr_t) - rc);
 return X86EMUL_EXCEPTION;
 }
 /* Mask out bits provided by caller. */
diff --git a/xen/arch/x86/mm/shadow/common.c b/xen/arch/x86/mm/shadow/common.c
index a4a3c4b..f07803b 100644
--- a/xen/arch/x86/mm/shadow/common.c
+++ b/xen/arch/x86/mm/shadow/common.c
@@ -323,7 +323,7 @@ pv_emulate_read(enum x86_segment seg,
 
 if ( (rc = copy_from_user(p_data, (void *)offset, bytes)) != 0 )
 {
-propagate_page_fault(offset + bytes - rc, 0); /* read fault */
+pv_inject_page_fault(0, offset + bytes - rc); /* Read fault. */
 return X86EMUL_EXCEPTION;
 }
 
@@ -1723,7 +1723,7 @@ static mfn_t emulate_gva_to_mfn(struct vcpu *v, unsigned long vaddr,
 if ( is_hvm_vcpu(v) )
 hvm_inject_page_fault(pfec, vaddr);
 else
-propagate_page_fault(vaddr, pfec);
+pv_inject_page_fault(pfec, vaddr);
 return _mfn(BAD_GVA_TO_GFN);
 }
 
diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
index b464211..195d590 100644
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -625,37 +625,75 @@ void fatal_trap(const struct cpu_user_regs *regs, bool_t show_remote)
   (regs->eflags & X86_EFLAGS_IF) ? "" : ", IN INTERRUPT CONTEXT");
 }
 
-static void do_guest_trap(unsigned int trapnr,
-  const struct cpu_user_regs *regs)
+void pv_inject_event(const struct x86_event *event)
 {
 struct vcpu *v = current;
+struct cpu_user_regs *regs = guest_cpu_user_regs();
 struct trap_bounce *tb;
 const struct trap_info *ti;
+const uint8_t vector = event->vector;
 const bool use_error_code =
-((trapnr < 32) && (TRAP_HAVE_EC & (1u << trapnr)));
+((vector < 32) && (TRAP_HAVE_EC & (1u << vector)));
+unsigned int error_code = event->error_code;
 
-trace_pv_trap(trapnr, regs->eip, use_error_code, regs->error_code);
+ASSERT(vector == event->vector); /* Confirm no truncation. */
+if ( use_error_code )
+ASSERT(error_code != X86_EVENT_NO_EC);
+else
+ASSERT(error_code == X86_EVENT_NO_EC);
 
 tb = >arch.pv_vcpu.trap_bounce;
-ti = >arch.pv_vcpu.trap_ctxt[trapnr];
+ti = >arch.pv_vcpu.trap_ctxt[vector];
 
 tb->flags = TBF_EXCEPTION;
 tb->cs= ti->cs;
 tb->eip   = ti->address;
 
+if ( vector == TRAP_page_fault )
+{
+v->arch.pv_vcpu.ctrlreg[2] = event->cr2;
+arch_set_cr2(v, event->cr2);
+
+/* Re-set error_code.user flag 

Re: [Xen-devel] [PATCH v10 09/13] x86: change default load address from 1 MiB to 2 MiB

2016-11-30 Thread Juergen Gross
On 30/11/16 14:04, Daniel Kiper wrote:
> Subsequent patches introducing relocatable early boot code play with
> page tables using 2 MiB huge pages. If load address is not aligned at
> 2 MiB then code touching such page tables must have special cases for
> start and end of Xen image memory region. So, let's make life easier
> and move default load address from 1 MiB to 2 MiB. This way page table
> code will be nice and easy. Hence, there is a chance that it will be
> less error prone too... :-)))
> 
> Additionally, drop first 2 MiB mapping from Xen image mapping.
> It is no longer needed.
> 
> Signed-off-by: Daniel Kiper 
> Reviewed-by: Jan Beulich 
> ---
> v8 - suggestions/fixes:
>- drop first 2 MiB mapping from Xen image mapping
>  (suggested by Jan Beulich),
>- improve commit message.
> 
> v7 - suggestions/fixes:
>- minor cleanups
>  (suggested by Jan Beulich).
> ---
>  xen/arch/x86/Makefile  |2 +-
>  xen/arch/x86/Rules.mk  |3 +++
>  xen/arch/x86/boot/head.S   |8 
>  xen/arch/x86/boot/x86_64.S |5 +++--
>  xen/arch/x86/setup.c   |3 ++-
>  xen/arch/x86/xen.lds.S |2 +-
>  6 files changed, 10 insertions(+), 13 deletions(-)
> 
> diff --git a/xen/arch/x86/Makefile b/xen/arch/x86/Makefile
> index e74fe62..d5d0651 100644
> --- a/xen/arch/x86/Makefile
> +++ b/xen/arch/x86/Makefile
> @@ -90,7 +90,7 @@ all_symbols =
>  endif
>  
>  $(TARGET): $(TARGET)-syms $(efi-y) boot/mkelf32
> - ./boot/mkelf32 $(notes_phdrs) $(TARGET)-syms $(TARGET) 0x10 \
> + ./boot/mkelf32 $(notes_phdrs) $(TARGET)-syms $(TARGET) $(XEN_IMG_OFFSET) \
>  `$(NM) $(TARGET)-syms | sed -ne 's/^\([^ ]*\) . __2M_rwdata_end$$/0x\1/p'`

This doesn't apply (somehow you managed to insert spaces into the patch
file).


Juergen



[Xen-devel] [PATCH v3 07/24] x86/emul: Clean up the naming of the retire union

2016-11-30 Thread Andrew Cooper
Rename byte to raw, as the field being a single byte long is an implementation
detail.  Make the bitfields part of an anonymous struct to remove the .flags
qualifier.  Change the types of the flags to being booleans, to match their
use.

No functional change.

Signed-off-by: Andrew Cooper 
---
CC: Jan Beulich 
CC: Paul Durrant 

v3:
 * New
---
 xen/arch/x86/hvm/emulate.c |  6 +++---
 xen/arch/x86/x86_emulate/x86_emulate.c | 10 +-
 xen/arch/x86/x86_emulate/x86_emulate.h | 10 +-
 3 files changed, 13 insertions(+), 13 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index bc259ec..fe62500 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -1791,13 +1791,13 @@ static int _hvm_emulate_one(struct hvm_emulate_ctxt *hvmemul_ctxt,
 new_intr_shadow = hvmemul_ctxt->intr_shadow;
 
 /* MOV-SS instruction toggles MOV-SS shadow, else we just clear it. */
-if ( hvmemul_ctxt->ctxt.retire.flags.mov_ss )
+if ( hvmemul_ctxt->ctxt.retire.mov_ss )
 new_intr_shadow ^= HVM_INTR_SHADOW_MOV_SS;
 else
 new_intr_shadow &= ~HVM_INTR_SHADOW_MOV_SS;
 
 /* STI instruction toggles STI shadow, else we just clear it. */
-if ( hvmemul_ctxt->ctxt.retire.flags.sti )
+if ( hvmemul_ctxt->ctxt.retire.sti )
 new_intr_shadow ^= HVM_INTR_SHADOW_STI;
 else
 new_intr_shadow &= ~HVM_INTR_SHADOW_STI;
@@ -1808,7 +1808,7 @@ static int _hvm_emulate_one(struct hvm_emulate_ctxt *hvmemul_ctxt,
 hvm_funcs.set_interrupt_shadow(curr, new_intr_shadow);
 }
 
-if ( hvmemul_ctxt->ctxt.retire.flags.hlt &&
+if ( hvmemul_ctxt->ctxt.retire.hlt &&
  !hvm_local_events_need_delivery(curr) )
 {
 hvm_hlt(regs->eflags);
diff --git a/xen/arch/x86/x86_emulate/x86_emulate.c b/xen/arch/x86/x86_emulate/x86_emulate.c
index 9c28ed4..416812e 100644
--- a/xen/arch/x86/x86_emulate/x86_emulate.c
+++ b/xen/arch/x86/x86_emulate/x86_emulate.c
@@ -1905,7 +1905,7 @@ x86_decode(
 state->eip = ctxt->regs->eip;
 
 /* Initialise output state in x86_emulate_ctxt */
-ctxt->retire.byte = 0;
+ctxt->retire.raw = 0;
 
 op_bytes = def_op_bytes = ad_bytes = def_ad_bytes = ctxt->addr_size/8;
 if ( op_bytes == 8 )
@@ -2668,7 +2668,7 @@ x86_emulate(
 
 case 0x17: /* pop %%ss */
 src.val = x86_seg_ss;
-ctxt->retire.flags.mov_ss = 1;
+ctxt->retire.mov_ss = 1;
 goto pop_seg;
 
 case 0x1e: /* push %%ds */
@@ -2996,7 +2996,7 @@ x86_emulate(
 if ( (rc = load_seg(seg, src.val, 0, NULL, ctxt, ops)) != 0 )
 goto done;
 if ( seg == x86_seg_ss )
-ctxt->retire.flags.mov_ss = 1;
+ctxt->retire.mov_ss = 1;
 dst.type = OP_NONE;
 break;
 
@@ -4033,7 +4033,7 @@ x86_emulate(
 
 case 0xf4: /* hlt */
 generate_exception_if(!mode_ring0(), EXC_GP, 0);
-ctxt->retire.flags.hlt = 1;
+ctxt->retire.hlt = 1;
 break;
 
 case 0xf5: /* cmc */
@@ -4247,7 +4247,7 @@ x86_emulate(
 if ( !(_regs.eflags & EFLG_IF) )
 {
 _regs.eflags |= EFLG_IF;
-ctxt->retire.flags.sti = 1;
+ctxt->retire.sti = 1;
 }
 break;
 
diff --git a/xen/arch/x86/x86_emulate/x86_emulate.h b/xen/arch/x86/x86_emulate/x86_emulate.h
index b0f0304..ef39601 100644
--- a/xen/arch/x86/x86_emulate/x86_emulate.h
+++ b/xen/arch/x86/x86_emulate/x86_emulate.h
@@ -468,12 +468,12 @@ struct x86_emulate_ctxt
 
 /* Retirement state, set by the emulator (valid only on X86EMUL_OKAY). */
 union {
+uint8_t raw;
 struct {
-uint8_t hlt:1;  /* Instruction HLTed. */
-uint8_t mov_ss:1;   /* Instruction sets MOV-SS irq shadow. */
-uint8_t sti:1;  /* Instruction sets STI irq shadow. */
-} flags;
-uint8_t byte;
+bool hlt:1;  /* Instruction HLTed. */
+bool mov_ss:1;   /* Instruction sets MOV-SS irq shadow. */
+bool sti:1;  /* Instruction sets STI irq shadow. */
+};
 } retire;
 };
 
-- 
2.1.4
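
A standalone sketch of the resulting layout, assuming a C11 compiler (anonymous structs are used). The struct name ctxt_sketch and the retire_reset() helper are hypothetical; the union members follow the hunk above. Writing the raw view clears every flag at once, while individual flags are reached without an intermediate .flags member.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Cut-down context holding only the retire union from the patch. */
struct ctxt_sketch {
    union {
        uint8_t raw;          /* whole-state view, used for bulk reset */
        struct {
            bool hlt:1;       /* Instruction HLTed. */
            bool mov_ss:1;    /* Instruction sets MOV-SS irq shadow. */
            bool sti:1;       /* Instruction sets STI irq shadow. */
        };                    /* anonymous: members are reached directly */
    } retire;
};

/* Clear all retire flags through the raw view. */
static void retire_reset(struct ctxt_sketch *c)
{
    c->retire.raw = 0;
    assert(!c->retire.hlt && !c->retire.mov_ss && !c->retire.sti);
}
```

The exact bit positions of the flags within raw are implementation-defined, so code should only rely on raw being zero or non-zero, which is how the emulator uses it.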




[Xen-devel] [PATCH v3 05/24] x86/emul: Rename HVM_DELIVER_NO_ERROR_CODE to X86_EVENT_NO_EC

2016-11-30 Thread Andrew Cooper
and move it to live with the other x86_event infrastructure in x86_emulate.h.
Switch it and x86_event.error_code to being signed, matching the rest of the
code.

Signed-off-by: Andrew Cooper 
Reviewed-by: Paul Durrant 
Reviewed-by: Boris Ostrovsky 
Reviewed-by: Kevin Tian 
Reviewed-by: Jan Beulich 
---
v2:
 * Rebase over corrections to the use of HVM_DELIVER_NO_ERROR_CODE
---
 xen/arch/x86/hvm/emulate.c |  5 ++---
 xen/arch/x86/hvm/hvm.c |  6 +++---
 xen/arch/x86/hvm/nestedhvm.c   |  2 +-
 xen/arch/x86/hvm/svm/nestedsvm.c   |  6 +++---
 xen/arch/x86/hvm/svm/svm.c | 20 ++--
 xen/arch/x86/hvm/vmx/intr.c|  2 +-
 xen/arch/x86/hvm/vmx/vmx.c | 25 +
 xen/arch/x86/hvm/vmx/vvmx.c|  2 +-
 xen/arch/x86/x86_emulate/x86_emulate.h |  3 ++-
 xen/include/asm-x86/hvm/support.h  |  2 --
 10 files changed, 36 insertions(+), 37 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index bb26d40..bc259ec 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -1609,7 +1609,7 @@ static int hvmemul_inject_sw_interrupt(
 
 hvmemul_ctxt->exn_pending = 1;
 hvmemul_ctxt->trap.vector = vector;
-hvmemul_ctxt->trap.error_code = HVM_DELIVER_NO_ERROR_CODE;
+hvmemul_ctxt->trap.error_code = X86_EVENT_NO_EC;
 hvmemul_ctxt->trap.insn_len = insn_len;
 
 return X86EMUL_OKAY;
@@ -1696,8 +1696,7 @@ static int hvmemul_vmfunc(
 
 rc = hvm_funcs.altp2m_vcpu_emulate_vmfunc(ctxt->regs);
 if ( rc != X86EMUL_OKAY )
-hvmemul_inject_hw_exception(TRAP_invalid_op, HVM_DELIVER_NO_ERROR_CODE,
-ctxt);
+hvmemul_inject_hw_exception(TRAP_invalid_op, X86_EVENT_NO_EC, ctxt);
 
 return rc;
 }
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 7b434aa..b950842 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -502,7 +502,7 @@ void hvm_do_resume(struct vcpu *v)
 kind = EMUL_KIND_SET_CONTEXT_INSN;
 
 hvm_emulate_one_vm_event(kind, TRAP_invalid_op,
- HVM_DELIVER_NO_ERROR_CODE);
+ X86_EVENT_NO_EC);
 
 v->arch.vm_event->emulate_flags = 0;
 }
@@ -3054,7 +3054,7 @@ void hvm_task_switch(
 }
 
 if ( (tss.trace & 1) && !exn_raised )
-hvm_inject_hw_exception(TRAP_debug, HVM_DELIVER_NO_ERROR_CODE);
+hvm_inject_hw_exception(TRAP_debug, X86_EVENT_NO_EC);
 
  out:
 hvm_unmap_entry(optss_desc);
@@ -4073,7 +4073,7 @@ void hvm_ud_intercept(struct cpu_user_regs *regs)
 switch ( hvm_emulate_one() )
 {
 case X86EMUL_UNHANDLEABLE:
-hvm_inject_hw_exception(TRAP_invalid_op, HVM_DELIVER_NO_ERROR_CODE);
+hvm_inject_hw_exception(TRAP_invalid_op, X86_EVENT_NO_EC);
 break;
 case X86EMUL_EXCEPTION:
 if ( ctxt.exn_pending )
diff --git a/xen/arch/x86/hvm/nestedhvm.c b/xen/arch/x86/hvm/nestedhvm.c
index caad525..c4671d8 100644
--- a/xen/arch/x86/hvm/nestedhvm.c
+++ b/xen/arch/x86/hvm/nestedhvm.c
@@ -17,7 +17,7 @@
  */
 
 #include 
-#include/* for HVM_DELIVER_NO_ERROR_CODE */
+#include 
 #include 
 #include /* for struct p2m_domain */
 #include 
diff --git a/xen/arch/x86/hvm/svm/nestedsvm.c b/xen/arch/x86/hvm/svm/nestedsvm.c
index b6b8526..8c9b073 100644
--- a/xen/arch/x86/hvm/svm/nestedsvm.c
+++ b/xen/arch/x86/hvm/svm/nestedsvm.c
@@ -756,7 +756,7 @@ nsvm_vcpu_vmrun(struct vcpu *v, struct cpu_user_regs *regs)
 default:
 gdprintk(XENLOG_ERR,
 "nsvm_vcpu_vmentry failed, injecting #UD\n");
-hvm_inject_hw_exception(TRAP_invalid_op, HVM_DELIVER_NO_ERROR_CODE);
+hvm_inject_hw_exception(TRAP_invalid_op, X86_EVENT_NO_EC);
 /* Must happen after hvm_inject_hw_exception or it doesn't work right. */
 nv->nv_vmswitch_in_progress = 0;
 return 1;
@@ -1581,7 +1581,7 @@ void svm_vmexit_do_stgi(struct cpu_user_regs *regs, struct vcpu *v)
 unsigned int inst_len;
 
 if ( !nestedhvm_enabled(v->domain) ) {
-hvm_inject_hw_exception(TRAP_invalid_op, HVM_DELIVER_NO_ERROR_CODE);
+hvm_inject_hw_exception(TRAP_invalid_op, X86_EVENT_NO_EC);
 return;
 }
 
@@ -1601,7 +1601,7 @@ void svm_vmexit_do_clgi(struct cpu_user_regs *regs, struct vcpu *v)
 vintr_t intr;
 
 if ( !nestedhvm_enabled(v->domain) ) {
-hvm_inject_hw_exception(TRAP_invalid_op, HVM_DELIVER_NO_ERROR_CODE);
+hvm_inject_hw_exception(TRAP_invalid_op, X86_EVENT_NO_EC);
 return;
 }
 
diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
index caab5ce..912d871 100644
--- a/xen/arch/x86/hvm/svm/svm.c
+++ b/xen/arch/x86/hvm/svm/svm.c
@@ -89,7 +89,7 @@ static DEFINE_SPINLOCK(osvw_lock);
 static void 

[Xen-devel] [PATCH v3 04/24] x86/emul: Rename hvm_trap to x86_event and move it into the emulation infrastructure

2016-11-30 Thread Andrew Cooper
The x86 emulator needs to gain an understanding of interrupts and exceptions
generated by its actions.  The naming choice is to match both the Intel and
AMD terms, and to avoid 'trap' specifically as it has an architectural meaning
different to its current usage.

While making this change, make other changes for consistency:

 * Rename *_trap() infrastructure to *_event()
 * Rename trapnr/trap parameters to vector
 * Convert hvm_inject_hw_exception() and hvm_inject_page_fault() to being
   static inlines, as they are only thin wrappers around hvm_inject_event()

No functional change.
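
As a rough sketch of the "thin wrapper" shape described in the list above (not
the patch's actual code): every sketch_* name below is hypothetical, the event
structure is trimmed down, and the numeric constants are assumptions chosen for
illustration.

```c
#include <stdint.h>

/* Hypothetical constants for illustration; the values are assumptions. */
enum { SKETCH_TYPE_HW_EXCEPTION = 3, SKETCH_VEC_PAGE_FAULT = 14 };

/* Hypothetical, trimmed-down event record; not the real struct x86_event. */
struct sketch_event {
    int16_t vector;
    uint8_t type;
    int32_t error_code;
    unsigned long cr2;
};

/* Records the last injected event so the sketch can be inspected. */
static struct sketch_event sketch_last;

/* Stand-in for the single real entry point, hvm_inject_event(). */
static void sketch_inject_event(const struct sketch_event *event)
{
    sketch_last = *event;
}

/* Thin wrapper: fill in an event and hand it to the common entry point. */
static inline void sketch_inject_hw_exception(unsigned int vector, int errcode)
{
    struct sketch_event event = {
        .vector = (int16_t)vector,
        .type = SKETCH_TYPE_HW_EXCEPTION,
        .error_code = errcode,
    };

    sketch_inject_event(&event);
}

/* Thin wrapper: same as above, with the vector fixed and cr2 supplied. */
static inline void sketch_inject_page_fault(int errcode, unsigned long cr2)
{
    struct sketch_event event = {
        .vector = SKETCH_VEC_PAGE_FAULT,
        .type = SKETCH_TYPE_HW_EXCEPTION,
        .error_code = errcode,
        .cr2 = cr2,
    };

    sketch_inject_event(&event);
}
```

Because each wrapper only builds a structure and forwards it, making them
static inlines costs nothing and lets the compiler fold them into callers.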

Signed-off-by: Andrew Cooper 
Reviewed-by: Paul Durrant 
Reviewed-by: Boris Ostrovsky 
Reviewed-by: Kevin Tian 
Reviewed-by: Jan Beulich 
---
 xen/arch/x86/hvm/emulate.c  |  6 +--
 xen/arch/x86/hvm/hvm.c  | 33 
 xen/arch/x86/hvm/io.c   |  2 +-
 xen/arch/x86/hvm/svm/nestedsvm.c|  7 ++--
 xen/arch/x86/hvm/svm/svm.c  | 62 ++---
 xen/arch/x86/hvm/vmx/vmx.c  | 66 +++
 xen/arch/x86/hvm/vmx/vvmx.c | 11 +++---
 xen/arch/x86/x86_emulate/x86_emulate.c  | 11 ++
 xen/arch/x86/x86_emulate/x86_emulate.h  | 22 +++
 xen/include/asm-x86/hvm/emulate.h   |  2 +-
 xen/include/asm-x86/hvm/hvm.h   | 69 -
 xen/include/asm-x86/hvm/svm/nestedsvm.h |  6 +--
 xen/include/asm-x86/hvm/vcpu.h  |  2 +-
 xen/include/asm-x86/hvm/vmx/vvmx.h  |  4 +-
 14 files changed, 159 insertions(+), 144 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index 3efeead..bb26d40 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -1679,7 +1679,7 @@ static int hvmemul_invlpg(
  * violations, so squash them.
  */
 hvmemul_ctxt->exn_pending = 0;
-hvmemul_ctxt->trap = (struct hvm_trap){};
+hvmemul_ctxt->trap = (struct x86_event){};
 rc = X86EMUL_OKAY;
 }
 
@@ -1869,7 +1869,7 @@ int hvm_emulate_one_mmio(unsigned long mfn, unsigned long gla)
 break;
 case X86EMUL_EXCEPTION:
 if ( ctxt.exn_pending )
-hvm_inject_trap(&ctxt.trap);
+hvm_inject_event(&ctxt.trap);
 /* fallthrough */
 default:
 hvm_emulate_writeback();
@@ -1929,7 +1929,7 @@ void hvm_emulate_one_vm_event(enum emul_kind kind, unsigned int trapnr,
 break;
 case X86EMUL_EXCEPTION:
 if ( ctx.exn_pending )
-hvm_inject_trap(&ctx.trap);
+hvm_inject_event(&ctx.trap);
 break;
 }
 
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 25dc759..7b434aa 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -535,7 +535,7 @@ void hvm_do_resume(struct vcpu *v)
 /* Inject pending hw/sw trap */
 if ( v->arch.hvm_vcpu.inject_trap.vector != -1 )
 {
-hvm_inject_trap(&v->arch.hvm_vcpu.inject_trap);
+hvm_inject_event(&v->arch.hvm_vcpu.inject_trap);
 v->arch.hvm_vcpu.inject_trap.vector = -1;
 }
 }
@@ -1676,19 +1676,19 @@ void hvm_triple_fault(void)
 domain_shutdown(d, reason);
 }
 
-void hvm_inject_trap(const struct hvm_trap *trap)
+void hvm_inject_event(const struct x86_event *event)
 {
 struct vcpu *curr = current;
 
 if ( nestedhvm_enabled(curr->domain) &&
  !nestedhvm_vmswitch_in_progress(curr) &&
  nestedhvm_vcpu_in_guestmode(curr) &&
- nhvm_vmcx_guest_intercepts_trap(
- curr, trap->vector, trap->error_code) )
+ nhvm_vmcx_guest_intercepts_event(
+ curr, event->vector, event->error_code) )
 {
 enum nestedhvm_vmexits nsret;
 
-nsret = nhvm_vcpu_vmexit_trap(curr, trap);
+nsret = nhvm_vcpu_vmexit_event(curr, event);
 
 switch ( nsret )
 {
@@ -1704,26 +1704,7 @@ void hvm_inject_trap(const struct hvm_trap *trap)
 }
 }
 
-hvm_funcs.inject_trap(trap);
-}
-
-void hvm_inject_hw_exception(unsigned int trapnr, int errcode)
-{
-struct hvm_trap trap = {
-.vector = trapnr,
-.type = X86_EVENTTYPE_HW_EXCEPTION,
-.error_code = errcode };
-hvm_inject_trap(&trap);
-}
-
-void hvm_inject_page_fault(int errcode, unsigned long cr2)
-{
-struct hvm_trap trap = {
-.vector = TRAP_page_fault,
-.type = X86_EVENTTYPE_HW_EXCEPTION,
-.error_code = errcode,
-.cr2 = cr2 };
-hvm_inject_trap(&trap);
+hvm_funcs.inject_event(event);
 }
 
 int hvm_hap_nested_page_fault(paddr_t gpa, unsigned long gla,
@@ -4096,7 +4077,7 @@ void hvm_ud_intercept(struct cpu_user_regs *regs)
 break;
 case X86EMUL_EXCEPTION:
 if ( ctxt.exn_pending )
-hvm_inject_trap(&ctxt.trap);
+hvm_inject_event(&ctxt.trap);
 /* fall through */
 default:
 hvm_emulate_writeback();
diff --git 

[Xen-devel] [PATCH v3 08/24] x86/emul: Correct the behaviour of pop %ss and interrupt shadowing

2016-11-30 Thread Andrew Cooper
The mov_ss retire flag should only be set once load_seg() has returned
success.  In particular, it should not be set if an exception occurred when
trying to load %ss.

_hvm_emulate_one(), currently the sole user of mov_ss, only considers it in
the case that x86_emulate() returns X86EMUL_OKAY, so this bug isn't actually
exposed to guests.
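
To make the ordering concrete, here is a small self-contained sketch of the
intended behaviour. It is not the emulator's code: sketch_ctxt, sketch_load_seg
and sketch_pop_seg are hypothetical names, and the load_fails field is a test
knob invented for this example.

```c
#include <stdbool.h>

enum { SKETCH_OKAY = 0, SKETCH_EXCEPTION = 1 };
enum sketch_seg { SKETCH_SEG_DS, SKETCH_SEG_SS };

struct sketch_ctxt {
    bool mov_ss;       /* models ctxt->retire.mov_ss */
    bool load_fails;   /* hypothetical knob: make the segment load fault */
};

/* Stand-in for load_seg(): fails on demand, otherwise succeeds. */
static int sketch_load_seg(enum sketch_seg seg, struct sketch_ctxt *ctxt)
{
    (void)seg;
    return ctxt->load_fails ? SKETCH_EXCEPTION : SKETCH_OKAY;
}

/* Set the retire flag only after the load of %ss has succeeded. */
static int sketch_pop_seg(enum sketch_seg seg, struct sketch_ctxt *ctxt)
{
    int rc = sketch_load_seg(seg, ctxt);

    if ( rc != SKETCH_OKAY )
        return rc;               /* exception: flag stays clear */

    if ( seg == SKETCH_SEG_SS )
        ctxt->mov_ss = true;     /* success: request the interrupt shadow */

    return SKETCH_OKAY;
}
```

A failing load leaves mov_ss clear, and a successful pop of any segment other
than %ss never sets it.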

Signed-off-by: Andrew Cooper 
---
CC: Jan Beulich 

v3:
 * New
---
 xen/arch/x86/x86_emulate/x86_emulate.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/xen/arch/x86/x86_emulate/x86_emulate.c b/xen/arch/x86/x86_emulate/x86_emulate.c
index 416812e..bacdee6 100644
--- a/xen/arch/x86/x86_emulate/x86_emulate.c
+++ b/xen/arch/x86/x86_emulate/x86_emulate.c
@@ -2656,6 +2656,8 @@ x86_emulate(
   , op_bytes, ctxt, ops)) != 0 ||
  (rc = load_seg(src.val, dst.val, 0, NULL, ctxt, ops)) != 0 )
 goto done;
+if ( src.val == x86_seg_ss )
+ctxt->retire.mov_ss = 1;
 break;
 
 case 0x0e: /* push %%cs */
@@ -2668,7 +2670,6 @@ x86_emulate(
 
 case 0x17: /* pop %%ss */
 src.val = x86_seg_ss;
-ctxt->retire.mov_ss = 1;
 goto pop_seg;
 
 case 0x1e: /* push %%ds */
-- 
2.1.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

