[Xen-devel] [qemu-mainline test] 104153: regressions - FAIL

2017-01-12 Thread osstest service owner
flight 104153 qemu-mainline real [real]
http://logs.test-lab.xenproject.org/osstest/logs/104153/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-armhf-armhf-xl   6 xen-boot fail REGR. vs. 104106
 test-armhf-armhf-xl-credit2 15 guest-start/debian.repeat fail REGR. vs. 104106

Regressions which are regarded as allowable (not blocking):
 test-armhf-armhf-libvirt-xsm 13 saverestore-support-check    fail  like 104106
 test-armhf-armhf-libvirt     13 saverestore-support-check    fail  like 104106
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop           fail  like 104106
 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop            fail  like 104106
 test-armhf-armhf-libvirt-raw 12 saverestore-support-check    fail  like 104106
 test-armhf-armhf-libvirt-qcow2 12 saverestore-support-check  fail  like 104106
 test-amd64-amd64-xl-rtds      9 debian-install               fail  like 104106

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-pvh-amd  11 guest-start                  fail   never pass
 test-amd64-amd64-xl-pvh-intel 11 guest-start                 fail   never pass
 test-amd64-amd64-libvirt-xsm 12 migrate-support-check        fail   never pass
 test-amd64-amd64-libvirt     12 migrate-support-check        fail   never pass
 test-amd64-i386-libvirt      12 migrate-support-check        fail   never pass
 test-amd64-i386-libvirt-xsm  12 migrate-support-check        fail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass
 test-amd64-amd64-libvirt-vhd 11 migrate-support-check        fail   never pass
 test-amd64-amd64-qemuu-nested-amd 16 debian-hvm-install/l1/l2 fail  never pass
 test-armhf-armhf-xl-xsm      12 migrate-support-check        fail   never pass
 test-armhf-armhf-xl-xsm      13 saverestore-support-check    fail   never pass
 test-armhf-armhf-libvirt-xsm 12 migrate-support-check        fail   never pass
 test-armhf-armhf-libvirt     12 migrate-support-check        fail   never pass
 test-armhf-armhf-xl-cubietruck 12 migrate-support-check      fail   never pass
 test-armhf-armhf-xl-cubietruck 13 saverestore-support-check  fail   never pass
 test-armhf-armhf-xl-arndale  12 migrate-support-check        fail   never pass
 test-armhf-armhf-xl-arndale  13 saverestore-support-check    fail   never pass
 test-armhf-armhf-libvirt-raw 11 migrate-support-check        fail   never pass
 test-armhf-armhf-xl-rtds     12 migrate-support-check        fail   never pass
 test-armhf-armhf-xl-rtds     13 saverestore-support-check    fail   never pass
 test-armhf-armhf-xl-multivcpu 12 migrate-support-check       fail   never pass
 test-armhf-armhf-xl-multivcpu 13 saverestore-support-check   fail   never pass
 test-armhf-armhf-xl-credit2  12 migrate-support-check        fail   never pass
 test-armhf-armhf-xl-credit2  13 saverestore-support-check    fail   never pass
 test-armhf-armhf-libvirt-qcow2 11 migrate-support-check      fail   never pass
 test-armhf-armhf-xl-vhd      11 migrate-support-check        fail   never pass
 test-armhf-armhf-xl-vhd      12 saverestore-support-check    fail   never pass

version targeted for testing:
 qemuu                b6c08970bc989bfddcf830684ea7a96b7a4d62a7
baseline version:
 qemuu                b44486dfb9447c88e4b216e730adcc780190852c

Last test of basis   104106  2017-01-11 01:46:06 Z    2 days
Failing since        104142  2017-01-12 12:12:27 Z    0 days    3 attempts
Testing same since   104153  2017-01-12 23:13:46 Z    0 days    1 attempts


People who touched revisions under test:
  Alex Bennée 
  Bastian Koppelmann 
  Bruce Rogers 
  Eduardo Habkost 
  Gerd Hoffmann 
  Greg Kurz 
  Li Qiang 
  Mark Cave-Ayland 
  Peer Adelt 
  Peter Maydell 
  Richard Henderson 
  xiaoqiang zhao 

jobs:
 build-amd64-xsm  pass
 build-armhf-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-armhf  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-armhf-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvops  pass
 build-armhf-pvops   

Re: [Xen-devel] [PATCH v2 7/7] uapi: export all headers under uapi directories

2017-01-12 Thread Jeff Epler
On Thu, Jan 12, 2017 at 05:32:09PM +0100, Nicolas Dichtel wrote:
> What I was trying to say is that I export those directories like other are.
> Removing those files is not related to that series.

Perhaps the correct solution is to only copy files matching "*.h" to
reduce the risk of copying files incidentally created by kbuild but
which shouldn't be installed as uapi headers.
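
A minimal sketch of that filtering idea, written in Python purely for
illustration. The kernel's real header installation is driven by kbuild
makefiles, not by code like this, and the function name `copy_headers` and
the directory layout are assumptions made for the example:

```python
import shutil
from pathlib import Path
from typing import List


def copy_headers(src_dir: Path, dst_dir: Path) -> List[Path]:
    """Copy only regular files named *.h from src_dir into dst_dir.

    The relative directory layout is preserved. Anything else found in the
    tree (Kbuild files, .cmd files left behind by a build, and so on) is
    skipped rather than installed.
    """
    copied = []
    for src in sorted(src_dir.rglob("*.h")):
        if not src.is_file():
            continue  # ignore directories that happen to end in .h
        dst = dst_dir / src.relative_to(src_dir)
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)
        copied.append(dst)
    return copied
```

Filtering by suffix is an allow-list: a stray file has to look like a header
before it is exported, instead of every unwanted file having to be excluded
by name.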

jeff

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [ovmf test] 104155: all pass - PUSHED

2017-01-12 Thread osstest service owner
flight 104155 ovmf real [real]
http://logs.test-lab.xenproject.org/osstest/logs/104155/

Perfect :-)
All tests in this flight passed as required
version targeted for testing:
 ovmf 521981ee7608b75b51693ea367c9e1d83687d110
baseline version:
 ovmf 6157f6500c4098d7b541c1f1e9ba28e73fe9b70c

Last test of basis   104151  2017-01-12 22:46:07 Z    0 days
Testing same since   104155  2017-01-13 01:47:29 Z    0 days    1 attempts


People who touched revisions under test:
  Augustine Linson P 
  Hegde Nagaraj P 
  hegdenag 
  Linson Augustine 

jobs:
 build-amd64-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvops  pass
 build-i386-pvops pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64 pass
 test-amd64-i386-xl-qemuu-ovmf-amd64  pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

+ branch=ovmf
+ revision=521981ee7608b75b51693ea367c9e1d83687d110
+ . ./cri-lock-repos
++ . ./cri-common
+++ . ./cri-getconfig
+++ umask 002
+++ getrepos
++++ getconfig Repos
++++ perl -e '
use Osstest;
readglobalconfig();
print $c{"Repos"} or die $!;
'
+++ local repos=/home/osstest/repos
+++ '[' -z /home/osstest/repos ']'
+++ '[' '!' -d /home/osstest/repos ']'
+++ echo /home/osstest/repos
++ repos=/home/osstest/repos
++ repos_lock=/home/osstest/repos/lock
++ '[' x '!=' x/home/osstest/repos/lock ']'
++ OSSTEST_REPOS_LOCK_LOCKED=/home/osstest/repos/lock
++ exec with-lock-ex -w /home/osstest/repos/lock ./ap-push ovmf 521981ee7608b75b51693ea367c9e1d83687d110
+ branch=ovmf
+ revision=521981ee7608b75b51693ea367c9e1d83687d110
+ . ./cri-lock-repos
++ . ./cri-common
+++ . ./cri-getconfig
+++ umask 002
+++ getrepos
++++ getconfig Repos
++++ perl -e '
use Osstest;
readglobalconfig();
print $c{"Repos"} or die $!;
'
+++ local repos=/home/osstest/repos
+++ '[' -z /home/osstest/repos ']'
+++ '[' '!' -d /home/osstest/repos ']'
+++ echo /home/osstest/repos
++ repos=/home/osstest/repos
++ repos_lock=/home/osstest/repos/lock
++ '[' x/home/osstest/repos/lock '!=' x/home/osstest/repos/lock ']'
+ . ./cri-common
++ . ./cri-getconfig
++ umask 002
+ select_xenbranch
+ case "$branch" in
+ tree=ovmf
+ xenbranch=xen-unstable
+ '[' xovmf = xlinux ']'
+ linuxbranch=
+ '[' x = x ']'
+ qemuubranch=qemu-upstream-unstable
+ select_prevxenbranch
++ ./cri-getprevxenbranch xen-unstable
+ prevxenbranch=xen-4.8-testing
+ '[' x521981ee7608b75b51693ea367c9e1d83687d110 = x ']'
+ : tested/2.6.39.x
+ . ./ap-common
++ : osst...@xenbits.xen.org
+++ getconfig OsstestUpstream
+++ perl -e '
use Osstest;
readglobalconfig();
print $c{"OsstestUpstream"} or die $!;
'
++ :
++ : git://xenbits.xen.org/xen.git
++ : osst...@xenbits.xen.org:/home/xen/git/xen.git
++ : git://xenbits.xen.org/qemu-xen-traditional.git
++ : git://git.kernel.org
++ : git://git.kernel.org/pub/scm/linux/kernel/git
++ : git
++ : git://xenbits.xen.org/xtf.git
++ : osst...@xenbits.xen.org:/home/xen/git/xtf.git
++ : git://xenbits.xen.org/xtf.git
++ : git://xenbits.xen.org/libvirt.git
++ : osst...@xenbits.xen.org:/home/xen/git/libvirt.git
++ : git://xenbits.xen.org/libvirt.git
++ : git://xenbits.xen.org/osstest/rumprun.git
++ : git
++ : git://xenbits.xen.org/osstest/rumprun.git
++ : osst...@xenbits.xen.org:/home/xen/git/osstest/rumprun.git
++ : git://git.seabios.org/seabios.git
++ : osst...@xenbits.xen.org:/home/xen/git/osstest/seabios.git
++ : git://xenbits.xen.org/osstest/seabios.git
++ : https://github.com/tianocore/edk2.git
++ : osst...@xenbits.xen.org:/home/xen/git/osstest/ovmf.git
++ : git://xenbits.xen.org/osstest/ovmf.git
++ : git://xenbits.xen.org/osstest/linux-firmware.git
++ : 

Re: [Xen-devel] Xen 4.8 + Linux 4.9 + Credit2 = can't bootup

2017-01-12 Thread Boris Ostrovsky



On 01/12/2017 01:27 PM, Ian Jackson wrote:
> Dario Faggioli writes ("Re: [Xen-devel] Xen 4.8 + Linux 4.9 + Credit2 = can't bootup"):
>> Anyway, we should have some multi-socket boxes on OSSTest, AFAICR.
>
> I think we do but I haven't got a systematic way of answering that
> question other than by manual eyeballing of the spec sheets.
>
> If there were something easy to look for in the dmesg output (say) I
> could probably grep historical logs.



[root@ovs104 ~]# xl dmesg | grep Scrubbing
(XEN) Scrubbing Free RAM on 2 nodes using 16 CPUs
[root@ovs104 ~]#

or

[root@ovs104 ~]# xl info | grep nr_nodes
nr_nodes   : 2
[root@ovs104 ~]#

may be useful.
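
For grepping historical logs in bulk, the two outputs above could be matched
with something like the following sketch. The regular expressions are based
only on the two sample lines shown here; the exact wording of the Scrubbing
message on other Xen versions is an assumption, and the function names are
made up for the example:

```python
import re
from typing import Optional

# Patterns modelled on the sample "xl dmesg" and "xl info" output above.
SCRUB_RE = re.compile(
    r"\(XEN\) Scrubbing Free RAM on (\d+) nodes using (\d+) CPUs")
NR_NODES_RE = re.compile(r"^nr_nodes\s*:\s*(\d+)\s*$", re.MULTILINE)


def node_count(text: str) -> Optional[int]:
    """Return the NUMA node count found in a log, or None if absent."""
    match = SCRUB_RE.search(text) or NR_NODES_RE.search(text)
    return int(match.group(1)) if match else None


def is_multi_node(text: str) -> bool:
    """True only when the log positively reports more than one node."""
    count = node_count(text)
    return count is not None and count > 1
```

A log with neither line is reported as "not multi-node" by `is_multi_node`,
so missing data should be checked separately with `node_count`.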

BTW, when I said that the problem that this thread was started with 
required multi-socket system I should have also said that dom0 needs to 
span nodes (or so I think).



-boris



[Xen-devel] [xen-unstable-smoke test] 104156: tolerable all pass - PUSHED

2017-01-12 Thread osstest service owner
flight 104156 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/104156/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt     12 migrate-support-check        fail   never pass
 test-armhf-armhf-xl          12 migrate-support-check        fail   never pass
 test-armhf-armhf-xl          13 saverestore-support-check    fail   never pass

version targeted for testing:
 xen  904f9314540bcfbcfa60245e8f41ff1b671cdd9a
baseline version:
 xen  0d045d65c19ac48b31344b566cbf82a0270e6e44

Last test of basis   104127  2017-01-11 17:01:07 Z    1 days
Testing same since   104156  2017-01-13 02:01:01 Z    0 days    1 attempts


People who touched revisions under test:
  Konrad Rzeszutek Wilk 
  Stefano Stabellini 
  Wei Liu 

jobs:
 build-amd64  pass
 build-armhf  pass
 build-amd64-libvirt  pass
 test-armhf-armhf-xl  pass
 test-amd64-amd64-xl-qemuu-debianhvm-i386 pass
 test-amd64-amd64-libvirt pass





Pushing revision :

+ branch=xen-unstable-smoke
+ revision=904f9314540bcfbcfa60245e8f41ff1b671cdd9a
+ . ./cri-lock-repos
++ . ./cri-common
+++ . ./cri-getconfig
+++ umask 002
+++ getrepos
++++ getconfig Repos
++++ perl -e '
use Osstest;
readglobalconfig();
print $c{"Repos"} or die $!;
'
+++ local repos=/home/osstest/repos
+++ '[' -z /home/osstest/repos ']'
+++ '[' '!' -d /home/osstest/repos ']'
+++ echo /home/osstest/repos
++ repos=/home/osstest/repos
++ repos_lock=/home/osstest/repos/lock
++ '[' x '!=' x/home/osstest/repos/lock ']'
++ OSSTEST_REPOS_LOCK_LOCKED=/home/osstest/repos/lock
++ exec with-lock-ex -w /home/osstest/repos/lock ./ap-push xen-unstable-smoke 904f9314540bcfbcfa60245e8f41ff1b671cdd9a
+ branch=xen-unstable-smoke
+ revision=904f9314540bcfbcfa60245e8f41ff1b671cdd9a
+ . ./cri-lock-repos
++ . ./cri-common
+++ . ./cri-getconfig
+++ umask 002
+++ getrepos
++++ getconfig Repos
++++ perl -e '
use Osstest;
readglobalconfig();
print $c{"Repos"} or die $!;
'
+++ local repos=/home/osstest/repos
+++ '[' -z /home/osstest/repos ']'
+++ '[' '!' -d /home/osstest/repos ']'
+++ echo /home/osstest/repos
++ repos=/home/osstest/repos
++ repos_lock=/home/osstest/repos/lock
++ '[' x/home/osstest/repos/lock '!=' x/home/osstest/repos/lock ']'
+ . ./cri-common
++ . ./cri-getconfig
++ umask 002
+ select_xenbranch
+ case "$branch" in
+ tree=xen
+ xenbranch=xen-unstable-smoke
+ qemuubranch=qemu-upstream-unstable
+ '[' xxen = xlinux ']'
+ linuxbranch=
+ '[' xqemu-upstream-unstable = x ']'
+ select_prevxenbranch
++ ./cri-getprevxenbranch xen-unstable-smoke
+ prevxenbranch=xen-4.8-testing
+ '[' x904f9314540bcfbcfa60245e8f41ff1b671cdd9a = x ']'
+ : tested/2.6.39.x
+ . ./ap-common
++ : osst...@xenbits.xen.org
+++ getconfig OsstestUpstream
+++ perl -e '
use Osstest;
readglobalconfig();
print $c{"OsstestUpstream"} or die $!;
'
++ :
++ : git://xenbits.xen.org/xen.git
++ : osst...@xenbits.xen.org:/home/xen/git/xen.git
++ : git://xenbits.xen.org/qemu-xen-traditional.git
++ : git://git.kernel.org
++ : git://git.kernel.org/pub/scm/linux/kernel/git
++ : git
++ : git://xenbits.xen.org/xtf.git
++ : osst...@xenbits.xen.org:/home/xen/git/xtf.git
++ : git://xenbits.xen.org/xtf.git
++ : git://xenbits.xen.org/libvirt.git
++ : osst...@xenbits.xen.org:/home/xen/git/libvirt.git
++ : git://xenbits.xen.org/libvirt.git
++ : git://xenbits.xen.org/osstest/rumprun.git
++ : git
++ : git://xenbits.xen.org/osstest/rumprun.git
++ : osst...@xenbits.xen.org:/home/xen/git/osstest/rumprun.git
++ : git://git.seabios.org/seabios.git
++ : osst...@xenbits.xen.org:/home/xen/git/osstest/seabios.git
++ : git://xenbits.xen.org/osstest/seabios.git
++ : https://github.com/tianocore/edk2.git
++ : osst...@xenbits.xen.org:/home/xen/git/osstest/ovmf.git
++ : git://xenbits.xen.org/osstest/ovmf.git
++ : git://xenbits.xen.org/osstest/linux-firmware.git
++ : 

Re: [Xen-devel] PVH CPU hotplug design document

2017-01-12 Thread Boris Ostrovsky



On 01/12/2017 02:00 PM, Andrew Cooper wrote:
> On 12/01/17 12:13, Roger Pau Monné wrote:
>> ## Proposed solution using the STAO
>>
>> The general idea of this method is to use the STAO in order to hide the pCPUs
>> from the hardware domain, and provide processor objects for vCPUs in an extra
>> SSDT table.
>>
>> This method requires one change to the STAO, in order to be able to notify the
>> hardware domain of which processors found in ACPI tables are pCPUs. The
>> description of the new STAO field is as follows:
>>
>>  |       Field        | Byte Length | Byte Offset |       Description        |
>>  |--------------------|:-----------:|:-----------:|--------------------------|
>>  | Processor List [n] |      -      |      -      | A list of ACPI numbers,  |
>>  |                    |             |             | where each number is the |
>>  |                    |             |             | Processor UID of a       |
>>  |                    |             |             | physical CPU, and should |
>>  |                    |             |             | be treated specially by  |
>>  |                    |             |             | the OSPM                 |
>>
>> The list of UIDs in this new field would be matched against the ACPI Processor
>> UID field found in local/x2 APIC MADT structs and Processor objects in the ACPI
>> namespace, and the OSPM should either ignore those objects, or in case it
>> implements pCPU hotplug, it should notify Xen of changes to these objects.
>>
>> The contents of the MADT provided to the hardware domain are also going to be
>> different from the contents of the MADT as found in native ACPI. The local/x2
>> APIC entries for all the pCPUs are going to be marked as disabled.
>>
>> Extra entries are going to be added for each vCPU available to the hardware
>> domain, up to the maximum number of supported vCPUs. Note that supported vCPUs
>> might be different than enabled vCPUs, so it's possible that some of these
>> entries are also going to be marked as disabled. The entries for vCPUs on the
>> MADT are going to use a processor local x2 APIC structure, and the ACPI
>> processor ID of the first vCPU is going to be UINT32_MAX - HVM_MAX_VCPUS, in
>> order to avoid clashes with IDs of pCPUs.
>
> This is slightly problematic.  There is no restriction (so far as I am
> aware) on which ACPI IDs the firmware picks for its objects.  They need
> not be consecutive, logical, or start from 0.
>
> If STAO is being extended to list the IDs of the physical processor
> objects, we should go one step further and explicitly list the IDs of
> the virtual processor objects.  This leaves us flexibility if we have to
> avoid awkward firmware ID layouts.



I don't think I understand how we'd use VCPU list in STAO. Can you 
explain this?





> It is also worth stating that this puts an upper limit on nr_pcpus +
> nr_dom0_vcpus (but 4 billion processors really ought to be enough for
> anyone...)
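
To make the ID arithmetic under discussion concrete, here is a small sketch.
Only the starting point, UINT32_MAX - HVM_MAX_VCPUS, comes from the proposal.
The value 128 for HVM_MAX_VCPUS, the consecutive numbering of the remaining
vCPU IDs, and both function names are assumptions made for illustration:

```python
UINT32_MAX = 0xFFFFFFFF
HVM_MAX_VCPUS = 128  # assumed value, used here only for illustration


def vcpu_acpi_ids(max_vcpus: int = HVM_MAX_VCPUS) -> range:
    """ACPI processor IDs that the scheme would reserve for vCPUs.

    The first ID is taken from the proposal; numbering the rest
    consecutively from there is an assumption of this sketch.
    """
    first = UINT32_MAX - max_vcpus
    return range(first, first + max_vcpus)


def clashing_ids(firmware_ids, max_vcpus: int = HVM_MAX_VCPUS) -> set:
    """Return the firmware-chosen pCPU IDs that land in the vCPU range."""
    reserved = vcpu_acpi_ids(max_vcpus)
    return {uid for uid in firmware_ids if uid in reserved}
```

If firmware is free to pick any ID, a non-empty result from `clashing_ids`
is exactly the awkward layout that an explicit vCPU list in the STAO would
let Xen work around.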


>> In order to be able to perform vCPU hotplug, the vCPUs must have an ACPI
>> processor object in the ACPI namespace, so that the OSPM can request
>> notifications and get the value of the \_STA and \_MAT methods. This can be
>> problematic because Xen doesn't know the ACPI name of the other processor
>> objects, so blindly adding new ones can create namespace clashes.
>>
>> This can be solved by using a different ACPI name in order to describe vCPUs in
>> the ACPI namespace. Most hardware vendors tend to use CPU or PR prefixes for
>> the processor objects, so using a 'VP' (ie: Virtual Processor) prefix should
>> prevent clashes.
>
> One system I have to hand (with more than 255 pcpus) uses Cxxx
>
> To avoid namespace collisions, I can't see any option but to parse the
> DSDT/SSDTs to at least confirm that VPxx is available to use.


You are talking about Xen doing this, right? Meaning that we'd need to 
add AML parser to the hypervisor?


If we do that, I wonder whether this will also help us to deal with _PSS 
and _CST, which we now have to pass down from dom0.






>> A Xen GPE device block will be used in order to deliver events related to the
>> vCPUs available to the guest, since Xen doesn't know if there are any bits
>> available in the native GPEs. A SCI interrupt will be injected into the guest
>> in order to trigger the event.
>>
>> The following snippet is a representation of the ASL SSDT code that is proposed
>> for the hardware domain:
>>
>>     DefinitionBlock ("SSDT.aml", "SSDT", 5, "Xen", "HVM", 0)
>>     {
>>         Scope (\_SB)
>>         {
>>            OperationRegion(XEN, SystemMemory, 0xDEADBEEF, 40)
>>            Field(XEN, ByteAcc, NoLock, Preserve) {
>>                NCPU, 16, /* Number of vCPUs */
>>                MSUA, 32, /* MADT checksum address */
>>                MAPA, 32, /* MADT LAPIC0 address */
>>            }
>>         }
>>         Scope ( \_SB ) {
>>             OperationRegion ( MSUM, SystemMemory, \_SB.MSUA, 1 )
>>             Field ( MSUM, ByteAcc, NoLock, Preserve ) {
>>                 MSU, 8
>>             }
>>             Method ( PMAT, 2 ) {
>>                 If ( LLess(Arg0, NCPU) ) {
>>                     Return ( ToBuffer(Arg1) )
>>                 }
>>                 Return ( 

Re: [Xen-devel] [PATCH v2 16/25] x86/pv: Use per-domain policy information in pv_cpuid()

2017-01-12 Thread Boris Ostrovsky
On 01/12/2017 03:51 PM, Boris Ostrovsky wrote:
> On 01/12/2017 03:48 PM, Andrew Cooper wrote:
>> On 12/01/17 20:46, Boris Ostrovsky wrote:
>>> On 01/12/2017 02:27 PM, Andrew Cooper wrote:
>>>> On 12/01/17 18:00, Boris Ostrovsky wrote:
>>>>>> Ahh! found it.  This is a side effect of starting to generate the dom0
>>>>>> policy in Xen.
>>>>>>
>>>>>> Can you try this patch?
>>>>> Intel/AMD HVM/PV 64/32bit all look good. So
>>>>>
>>>>> Tested-by: Boris Ostrovsky
>>>> Does this mean that newer versions of Linux more picky about what they
>>>> tolerate in cpuid?
>>> We started to fail after change in Xen so I am not sure it's something
>>> new in Linux.
>> Right, but Linux 4.4 was entirely happy with this bug, both with and
>> without having CPUID faulting imposed on it.
> Oh, I see. My tests (typically) build and run the latest Linux tree (and
> Xen staging) every morning.
>
> I am trying to see what part of Linux caused the crash.


So the problem starts in Linux ht_detect(), where we check
X86_FEATURE_CMP_LEGACY. On Intel this is supposed to be clear and we
should end up setting phys_proc_id below. This value is then used in
topology_update_package_map(). If the value is incorrect (which it will
be if we bail early in ht_detect()) we may get a BUG_ON() at the caller.
Unfortunately we were too early to see the splat from the BUG_ON so it
wasn't clear right away why we were dying.

On AMD phys_proc_id is set elsewhere.

And the reason you haven't seen problems with earlier versions of Linux
is because the last two or so kernel releases saw major changes in
topology discovery (and, more importantly, topology validation). There
have been a bunch of Xen regressions due to that (the most recent is the
one Konrad reported a few days ago with 32 cores). This all is very
fragile for Xen guests due to bogus CPUID/APICID values.

(+Mohit who has been looking into another problem related to topology)

-boris




[Xen-devel] [xen-unstable test] 104150: tolerable FAIL - PUSHED

2017-01-12 Thread osstest service owner
flight 104150 xen-unstable real [real]
http://logs.test-lab.xenproject.org/osstest/logs/104150/

Failures :-/ but no regressions.

Regressions which are regarded as allowable (not blocking):
 test-armhf-armhf-libvirt-xsm 13 saverestore-support-check    fail  like 104104
 test-armhf-armhf-libvirt     13 saverestore-support-check    fail  like 104119
 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop            fail  like 104119
 test-amd64-i386-xl-qemut-win7-amd64 16 guest-stop            fail  like 104119
 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-stop           fail  like 104119
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop           fail  like 104119
 test-armhf-armhf-libvirt-qcow2 12 saverestore-support-check  fail  like 104119
 test-armhf-armhf-libvirt-raw 12 saverestore-support-check    fail  like 104119
 test-amd64-amd64-xl-rtds      9 debian-install               fail  like 104119

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-pvh-amd  11 guest-start                  fail   never pass
 test-amd64-amd64-xl-pvh-intel 11 guest-start                 fail   never pass
 test-amd64-i386-libvirt      12 migrate-support-check        fail   never pass
 test-amd64-i386-libvirt-xsm  12 migrate-support-check        fail   never pass
 test-amd64-amd64-libvirt-xsm 12 migrate-support-check        fail   never pass
 test-amd64-amd64-libvirt     12 migrate-support-check        fail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass
 test-armhf-armhf-xl-arndale  12 migrate-support-check        fail   never pass
 test-armhf-armhf-xl-arndale  13 saverestore-support-check    fail   never pass
 test-amd64-amd64-libvirt-vhd 11 migrate-support-check        fail   never pass
 test-amd64-amd64-qemuu-nested-amd 16 debian-hvm-install/l1/l2 fail  never pass
 test-armhf-armhf-xl-xsm      12 migrate-support-check        fail   never pass
 test-armhf-armhf-xl-credit2  12 migrate-support-check        fail   never pass
 test-armhf-armhf-xl-xsm      13 saverestore-support-check    fail   never pass
 test-armhf-armhf-xl-cubietruck 12 migrate-support-check      fail   never pass
 test-armhf-armhf-libvirt     12 migrate-support-check        fail   never pass
 test-armhf-armhf-xl-credit2  13 saverestore-support-check    fail   never pass
 test-armhf-armhf-xl-cubietruck 13 saverestore-support-check  fail   never pass
 test-armhf-armhf-libvirt-xsm 12 migrate-support-check        fail   never pass
 test-armhf-armhf-libvirt-qcow2 11 migrate-support-check      fail   never pass
 test-armhf-armhf-libvirt-raw 11 migrate-support-check        fail   never pass
 test-armhf-armhf-xl-vhd      11 migrate-support-check        fail   never pass
 test-armhf-armhf-xl-vhd      12 saverestore-support-check    fail   never pass
 test-armhf-armhf-xl-multivcpu 12 migrate-support-check       fail   never pass
 test-armhf-armhf-xl-multivcpu 13 saverestore-support-check   fail   never pass
 test-armhf-armhf-xl          12 migrate-support-check        fail   never pass
 test-armhf-armhf-xl          13 saverestore-support-check    fail   never pass
 test-armhf-armhf-xl-rtds     12 migrate-support-check        fail   never pass
 test-armhf-armhf-xl-rtds     13 saverestore-support-check    fail   never pass

version targeted for testing:
 xen  0d045d65c19ac48b31344b566cbf82a0270e6e44
baseline version:
 xen  ffc103c223a6d12e5221f66b7e96396a61ba1b20

Last test of basis   104119  2017-01-11 06:45:46 Z    1 days
Failing since        104126  2017-01-11 16:44:54 Z    1 days    5 attempts
Testing same since   104131  2017-01-11 22:43:41 Z    1 days    4 attempts


People who touched revisions under test:
  Andrew Cooper 
  Jan Beulich 
  Kevin Tian 
  Stefano Stabellini 
  Wei Liu 

jobs:
 build-amd64-xsm  pass
 build-armhf-xsm  pass
 build-i386-xsm   pass
 build-amd64-xtf  pass
 build-amd64  pass
 build-armhf  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-armhf-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-oldkern  pass
 build-i386-oldkern   pass
 build-amd64-prev pass  

Re: [Xen-devel] [RFC v2] Xen PV Drivers Lifecycle

2017-01-12 Thread Stefano Stabellini
On Wed, 11 Jan 2017, Konrad Rzeszutek Wilk wrote:
> On Wed, Jan 11, 2017 at 10:49:15AM -0800, Stefano Stabellini wrote:
> > On Wed, 4 Jan 2017, Stefano Stabellini wrote:
> > > On Wed, 4 Jan 2017, Konrad Rzeszutek Wilk wrote:
> > > > On Wed, Jan 04, 2017 at 10:00:01AM -0800, Stefano Stabellini wrote:
> > > > > Hi all,
> > > > > 
> > > > > > as you know, we have an issue with the speed of review and acceptance of
> > > > > > new PV drivers. In a discussion among committers, George wrote an email
> > > > > with a short proposal to clarify the development lifecycle of new PV
> > > > > drivers and the different expectations at each stage of the process. I
> > > > > took that email, polished it and turned it into markdown. Here it is.
> > > > > 
> > > > > ---
> > > > > Acks:
> > > > > +1 from Wei Liu
> > > > 
> > > > +1.
> > > > 
> > > > Albeit I am concerned about the 
> > > > ..
> > > > > > from one stage to the next within a reasonable time frame unless someone
> > > > 
> > > > .. of what 'reasonable time' is when somebody is on vacation
> > > > or sick.
> > > > 
> > > > Is it worth spelling that out?
> > > 
> > > I don't think we should introduce hard time limits, but it should be in
> > > terms of few months, not years.
> > 
> > We have only had positive feedback so far and two explicit +1's. Should
> > we call "lazy-consensus" and commit it?
> 
> Yes!

Committed. I translated the +1's into Acked-by in the commit message for
clarity.



Re: [Xen-devel] [PATCH] partially revert "xen: Remove event channel notification through Xen PCI platform device"

2017-01-12 Thread Stefano Stabellini
On Thu, 12 Jan 2017, Boris Ostrovsky wrote:
> On 01/12/2017 04:33 PM, Stefano Stabellini wrote:
> > On Thu, 12 Jan 2017, Boris Ostrovsky wrote:
> >> On 01/11/2017 06:36 PM, Stefano Stabellini wrote:
> >>> The following commit:
> >>>
> >>> commit 72a9b186292d98494f26cfd24a1621796209
> >>> Author: KarimAllah Ahmed 
> >>> Date:   Fri Aug 26 23:55:36 2016 +0200
> >>>
> >>> xen: Remove event channel notification through Xen PCI platform device
> 
> Can you also replace this with
> 
> "Commit 72a9b186292d ("xen: Remove event channel notification through
> Xen PCI platform device")" ... ?

Too late. But if you are Ok with the patch, please fix the message while
committing.
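
The citation style requested above (a 12-character abbreviated hash followed
by the quoted subject in parentheses) is easy to generate mechanically. A
minimal sketch, in which the function name `cite_commit` and its validation
rules are assumptions made for the example:

```python
import re


def cite_commit(sha: str, subject: str, abbrev: int = 12) -> str:
    """Format a commit reference as: Commit <abbrev-sha> ("<subject>")."""
    sha = sha.strip().lower()
    # Accept anything from the abbreviated length up to a full 40-hex SHA-1.
    if not re.fullmatch(r"[0-9a-f]{%d,40}" % abbrev, sha):
        raise ValueError("not a usable commit hash: %r" % sha)
    return 'Commit %s ("%s")' % (sha[:abbrev], subject.strip())
```

For example, `cite_commit("72a9b186292d", "xen: Remove event channel
notification through Xen PCI platform device")` yields the one-line form
suggested in the quoted review comment.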



[Xen-devel] [ovmf test] 104151: all pass - PUSHED

2017-01-12 Thread osstest service owner
flight 104151 ovmf real [real]
http://logs.test-lab.xenproject.org/osstest/logs/104151/

Perfect :-)
All tests in this flight passed as required
version targeted for testing:
 ovmf 6157f6500c4098d7b541c1f1e9ba28e73fe9b70c
baseline version:
 ovmf b494cf96e70f8640acd9288951be39a0f714f2be

Last test of basis   104144  2017-01-12 13:46:32 Z    0 days
Testing same since   104151  2017-01-12 22:46:07 Z    0 days    1 attempts


People who touched revisions under test:
  Maurice Ma 

jobs:
 build-amd64-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvops  pass
 build-i386-pvops pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64 pass
 test-amd64-i386-xl-qemuu-ovmf-amd64  pass





Pushing revision :

+ branch=ovmf
+ revision=6157f6500c4098d7b541c1f1e9ba28e73fe9b70c
+ . ./cri-lock-repos
++ . ./cri-common
+++ . ./cri-getconfig
+++ umask 002
+++ getrepos
++++ getconfig Repos
++++ perl -e '
use Osstest;
readglobalconfig();
print $c{"Repos"} or die $!;
'
+++ local repos=/home/osstest/repos
+++ '[' -z /home/osstest/repos ']'
+++ '[' '!' -d /home/osstest/repos ']'
+++ echo /home/osstest/repos
++ repos=/home/osstest/repos
++ repos_lock=/home/osstest/repos/lock
++ '[' x '!=' x/home/osstest/repos/lock ']'
++ OSSTEST_REPOS_LOCK_LOCKED=/home/osstest/repos/lock
++ exec with-lock-ex -w /home/osstest/repos/lock ./ap-push ovmf 6157f6500c4098d7b541c1f1e9ba28e73fe9b70c
+ branch=ovmf
+ revision=6157f6500c4098d7b541c1f1e9ba28e73fe9b70c
+ . ./cri-lock-repos
++ . ./cri-common
+++ . ./cri-getconfig
+++ umask 002
+++ getrepos
++++ getconfig Repos
++++ perl -e '
use Osstest;
readglobalconfig();
print $c{"Repos"} or die $!;
'
+++ local repos=/home/osstest/repos
+++ '[' -z /home/osstest/repos ']'
+++ '[' '!' -d /home/osstest/repos ']'
+++ echo /home/osstest/repos
++ repos=/home/osstest/repos
++ repos_lock=/home/osstest/repos/lock
++ '[' x/home/osstest/repos/lock '!=' x/home/osstest/repos/lock ']'
+ . ./cri-common
++ . ./cri-getconfig
++ umask 002
+ select_xenbranch
+ case "$branch" in
+ tree=ovmf
+ xenbranch=xen-unstable
+ '[' xovmf = xlinux ']'
+ linuxbranch=
+ '[' x = x ']'
+ qemuubranch=qemu-upstream-unstable
+ select_prevxenbranch
++ ./cri-getprevxenbranch xen-unstable
+ prevxenbranch=xen-4.8-testing
+ '[' x6157f6500c4098d7b541c1f1e9ba28e73fe9b70c = x ']'
+ : tested/2.6.39.x
+ . ./ap-common
++ : osst...@xenbits.xen.org
+++ getconfig OsstestUpstream
+++ perl -e '
use Osstest;
readglobalconfig();
print $c{"OsstestUpstream"} or die $!;
'
++ :
++ : git://xenbits.xen.org/xen.git
++ : osst...@xenbits.xen.org:/home/xen/git/xen.git
++ : git://xenbits.xen.org/qemu-xen-traditional.git
++ : git://git.kernel.org
++ : git://git.kernel.org/pub/scm/linux/kernel/git
++ : git
++ : git://xenbits.xen.org/xtf.git
++ : osst...@xenbits.xen.org:/home/xen/git/xtf.git
++ : git://xenbits.xen.org/xtf.git
++ : git://xenbits.xen.org/libvirt.git
++ : osst...@xenbits.xen.org:/home/xen/git/libvirt.git
++ : git://xenbits.xen.org/libvirt.git
++ : git://xenbits.xen.org/osstest/rumprun.git
++ : git
++ : git://xenbits.xen.org/osstest/rumprun.git
++ : osst...@xenbits.xen.org:/home/xen/git/osstest/rumprun.git
++ : git://git.seabios.org/seabios.git
++ : osst...@xenbits.xen.org:/home/xen/git/osstest/seabios.git
++ : git://xenbits.xen.org/osstest/seabios.git
++ : https://github.com/tianocore/edk2.git
++ : osst...@xenbits.xen.org:/home/xen/git/osstest/ovmf.git
++ : git://xenbits.xen.org/osstest/ovmf.git
++ : git://xenbits.xen.org/osstest/linux-firmware.git
++ : osst...@xenbits.xen.org:/home/osstest/ext/linux-firmware.git
++ : git://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git
++ : 

Re: [Xen-devel] [PATCH v11 07/13] x86: add multiboot2 protocol support for EFI platforms

2017-01-12 Thread Doug Goldstein
On 1/12/17 6:04 PM, Daniel Kiper wrote:
> On Thu, Jan 12, 2017 at 04:23:59PM -0600, Doug Goldstein wrote:
>> On 1/12/17 2:28 PM, Daniel Kiper wrote:
>>> On Thu, Jan 12, 2017 at 09:52:15AM -0600, Doug Goldstein wrote:
>>>> On 1/12/17 6:50 AM, Daniel Kiper wrote:
>>>>> On Wed, Jan 11, 2017 at 02:20:15PM -0600, Doug Goldstein wrote:
>>>>>> On 1/11/17 1:47 PM, Daniel Kiper wrote:
>>>>>>> On Tue, Jan 10, 2017 at 02:51:27PM -0600, Doug Goldstein wrote:
>>>>>>>> On 1/9/17 7:37 PM, Doug Goldstein wrote:
> 
> [...]
> 
 memory region). You need to use AllocatePages() otherwise you are
 trampling memory that might have been allocated by the bootloader or 
 any
>>>
>>> Bootloader code/data should be dead here.
>>
>> Correct. Unfortunately on my Lenovo laptop and my Intel NUCs I can't
>> currently call ExitBootServices and a timer that iPXE has wired up has
>
> If you disable an important wheel in a machine you should not expect
> that the machine will work. Sorry! No way!

 Speak to your co-workers Konrad and Boris. We've had long email threads
 about how certain hardware does not work with the way Xen calls
 ExitBootServices.
>>>
>>> Could you be more precise what is wrong? Or at least send links to
>>> relevant threads.
>>
>> There have been several on the ML over the past 2 years. A quick Google
>> search turns these up.
>>
>> http://xen.markmail.org/message/f6lx2ab4o2fch35r
>> https://lists.xenproject.org/archives/html/xen-devel/2015-01/msg03164.html
> 
> This is more or less what I expected. However, IIRC, it was not related
> to ExitBootServices() itself. The problem was that some runtime services
> code lived in boot services code and data regions. So, I suppose that if
> you map boot services code and data regions with runtime services code
> and data everything should work. However, I have just realized that we
> need an option to enable this functionality from GRUB2 command line.
> Though you can do a test by setting map_bs to 1 at the beginning of
> efi_multiboot2(). Do not forget to enable ExitBootServices() call in Xen.

On Intel NUCs, Super Micro boards and others, you are unable to call
ExitBootServices() without having called SetVirtualAddressMap(). If you
follow through both threads you'll see there's more than just map_bs.
Konrad also proposed adding /noexitbs

-- 
Doug Goldstein



signature.asc
Description: OpenPGP digital signature
___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v11 07/13] x86: add multiboot2 protocol support for EFI platforms

2017-01-12 Thread Doug Goldstein
On 1/12/17 6:04 PM, Daniel Kiper wrote:
> On Thu, Jan 12, 2017 at 04:23:59PM -0600, Doug Goldstein wrote:
>> On 1/12/17 2:28 PM, Daniel Kiper wrote:
>>> On Thu, Jan 12, 2017 at 09:52:15AM -0600, Doug Goldstein wrote:
 On 1/12/17 6:50 AM, Daniel Kiper wrote:
> On Wed, Jan 11, 2017 at 02:20:15PM -0600, Doug Goldstein wrote:
>> On 1/11/17 1:47 PM, Daniel Kiper wrote:
>>> On Tue, Jan 10, 2017 at 02:51:27PM -0600, Doug Goldstein wrote:
 On 1/9/17 7:37 PM, Doug Goldstein wrote:
> 
> [...]
> 
 memory region). You need to use AllocatePages() otherwise you are
 trampling memory that might have been allocated by the bootloader or 
 any
>>>
>>> Bootloader code/data should be dead here.
>>
>> Correct. Unfortunately on my Lenovo laptop and my Intel NUCs I can't
>> currently call ExitBootServices and a timer that iPXE has wired up has
>
> If you disable an important wheel in a machine you should not expect
> that the machine will work. Sorry! No way!

 Speak to your co-workers Konrad and Boris. We've had long email threads
 about how certain hardware does not work with the way Xen calls
 ExitBootServices.
>>>
>>> Could you be more precise what is wrong? Or at least send links to
>>> relevant threads.
>>
>> There have been several on the ML over the past 2 years. A quick Google
>> search turns these up.
>>
>> http://xen.markmail.org/message/f6lx2ab4o2fch35r
>> https://lists.xenproject.org/archives/html/xen-devel/2015-01/msg03164.html
> 
> This is more or less what I expected. However, IIRC, it was not related
> to ExitBootServices() itself. The problem was that some runtime services
> code lived in boot services code and data regions. So, I suppose that if
> you map boot services code and data regions with runtime services code
> and data everything should work. However, I have just realized that we
> need an option to enable this functionality from GRUB2 command line.
> Though you can do a test by setting map_bs to 1 at the beginning of
> efi_multiboot2(). Do not forget to enable ExitBootServices() call in Xen.
> 
>> some memory reserved down there and it was getting trampled. The real
>
> I still do not know why remnants of iPXE should run at this Xen boot 
> stage.
> It looks like an iPXE bug and IMO it should be fixed first.

 Like I said above, its because on this machine I am unable to call Xen's
 EBS.
>>>
>>> I do not understand how ExitBootServices() call is related to iPXE timer 
>>> remnants
>>> or so. Though if it is related somehow then I think that you should blame 
>>> machine
>>> and/or iPXE designer/developer not Xen developer.
>>
>> iPXE registers a callback for when EBS is called to clean up a timer.
> 
> Could not you unregister this callback just before jump to the Xen image?
> I do not think it is needed for Xen boot.

Yep. Already done and merged. But my point is we should prefer to use
AllocatePages() and only fall back to trampling any conventional memory
region if the call didn't work.

-- 
Doug Goldstein





Re: [Xen-devel] [PATCH v11 07/13] x86: add multiboot2 protocol support for EFI platforms

2017-01-12 Thread Daniel Kiper
On Thu, Jan 12, 2017 at 04:23:59PM -0600, Doug Goldstein wrote:
> On 1/12/17 2:28 PM, Daniel Kiper wrote:
> > On Thu, Jan 12, 2017 at 09:52:15AM -0600, Doug Goldstein wrote:
> >> On 1/12/17 6:50 AM, Daniel Kiper wrote:
> >>> On Wed, Jan 11, 2017 at 02:20:15PM -0600, Doug Goldstein wrote:
>  On 1/11/17 1:47 PM, Daniel Kiper wrote:
> > On Tue, Jan 10, 2017 at 02:51:27PM -0600, Doug Goldstein wrote:
> >> On 1/9/17 7:37 PM, Doug Goldstein wrote:

[...]

> >> memory region). You need to use AllocatePages() otherwise you are
> >> trampling memory that might have been allocated by the bootloader or 
> >> any
> >
> > Bootloader code/data should be dead here.
> 
>  Correct. Unfortunately on my Lenovo laptop and my Intel NUCs I can't
>  currently call ExitBootServices and a timer that iPXE has wired up has
> >>>
> >>> If you disable an important wheel in a machine you should not expect
> >>> that the machine will work. Sorry! No way!
> >>
> >> Speak to your co-workers Konrad and Boris. We've had long email threads
> >> about how certain hardware does not work with the way Xen calls
> >> ExitBootServices.
> >
> > Could you be more precise what is wrong? Or at least send links to
> > relevant threads.
>
> There have been several on the ML over the past 2 years. A quick Google
> search turns these up.
>
> http://xen.markmail.org/message/f6lx2ab4o2fch35r
> https://lists.xenproject.org/archives/html/xen-devel/2015-01/msg03164.html

This is more or less what I expected. However, IIRC, it was not related
to ExitBootServices() itself. The problem was that some runtime services
code lived in boot services code and data regions. So, I suppose that if
you map boot services code and data regions with runtime services code
and data everything should work. However, I have just realized that we
need an option to enable this functionality from GRUB2 command line.
Though you can do a test by setting map_bs to 1 at the beginning of
efi_multiboot2(). Do not forget to enable ExitBootServices() call in Xen.

>  some memory reserved down there and it was getting trampled. The real
> >>>
> >>> I still do not know why remnants of iPXE should run at this Xen boot 
> >>> stage.
> >>> It looks like an iPXE bug and IMO it should be fixed first.
> >>
> >> Like I said above, its because on this machine I am unable to call Xen's
> >> EBS.
> >
> > I do not understand how ExitBootServices() call is related to iPXE timer 
> > remnants
> > or so. Though if it is related somehow then I think that you should blame 
> > machine
> > and/or iPXE designer/developer not Xen developer.
>
> iPXE registers a callback for when EBS is called to clean up a timer.

Could not you unregister this callback just before jump to the Xen image?
I do not think it is needed for Xen boot.

Daniel



Re: [Xen-devel] [PATCH v11 07/13] x86: add multiboot2 protocol support for EFI platforms

2017-01-12 Thread Daniel Kiper
On Thu, Jan 12, 2017 at 04:20:00PM -0600, Doug Goldstein wrote:
> On 1/12/17 3:45 PM, Daniel Kiper wrote:
> > On Thu, Jan 12, 2017 at 01:46:41PM -0600, Doug Goldstein wrote:
> >> On 1/12/17 1:30 PM, Daniel Kiper wrote:
> >>> On Thu, Jan 12, 2017 at 09:44:59AM -0600, Doug Goldstein wrote:
>  view there's no reason for adding MB2 support for BIOS since it provides
>  no advantage over MB1 when booting from the BIOS. Now MB2 solves a
> >>>
> >>> From your point of view maybe it does not. However, from user point of 
> >>> view it may.
> >>> If you have support for MB2 on legacy BIOS and EFI platforms then you can 
> >>> boot Xen
> >>> on both platforms without changing anything in boot config files. 
> >>> Otherwise you have
> >>> to prepare separate configuration for different platforms.
> >>
> >> Neither Grub nor iPXE require different configs for MB1 vs MB2 so I'm
> >> not seeing the validity of this logic.
> >
> > Hmmm... This is interesting. I do not know iPXE, however, in GRUB you must
> > use multiboot/module for MB1 and multiboot2/module2 for MB2. I suppose that
> > you have to differentiate between both of them in iPXE somehow too. Hence,
> > there is pretty good chance that configs for MB1 and MB2 are different.
>
> multiboot/multiboot2 and module/module2 are aliases of each other. They
> work interchangeably. It's the same way in iPXE.

If you look carefully at the GRUB2 code and at how the multiboot and
multiboot2 modules are built, you will quickly realize that they are not
aliases. Though I do not know how it works in iPXE.

>  problem with booting over EFI vs MB1 so they'll be willing to take a
>  change there. I'll also disagree that BIOS is easier than EFI since with
>  EFI its just load the ELF into memory and set a few pointers in tags.
>  With BIOS it requires me to build up the memory map into a MB2 structure.
> >>>
> >>> Xen uses only these tags on legacy BIOS platforms: 
> >>> MULTIBOOT2_TAG_TYPE_BASIC_MEMINFO
> >>> (well, nice to have but it can be also not provided), 
> >>> MULTIBOOT2_TAG_TYPE_MMAP (same
> >>> as MULTIBOOT2_TAG_TYPE_BASIC_MEMINFO), 
> >>> MULTIBOOT2_TAG_TYPE_BOOT_LOADER_NAME
> >>> (same as MULTIBOOT2_TAG_TYPE_BASIC_MEMINFO) ,MULTIBOOT2_TAG_TYPE_CMDLINE,
> >>> MULTIBOOT2_TAG_TYPE_MODULE. I do not mention MULTIBOOT2_TAG_TYPE_END which
> >>> is obvious. So, if you are real hardcore minimalist then you have to 
> >>> provide
> >>> MULTIBOOT2_TAG_TYPE_CMDLINE and MULTIBOOT2_TAG_TYPE_MODULE. All of them
> >>> are provided also on EFI. So, I do not see any reason to not provide MB2
> >>> for legacy BIOS. And I do not think that it is very difficult to provide
> >>> all optional tags mentioned above.
> >>
> >> I don't understand what you're attempting to convey here. You've listed
> >> out a number of tags that I mentioned in my message that I don't have to
> >> implement for EFI. You've basically reinforced my point that its easier
> >> to implement this for EFI than BIOS. MULTIBOOT2_TAG_TYPE_BASIC_MEMINFO
> >> and MULTIBOOT2_TAG_TYPE_MMAP are unused by Xen on EFI. It gets this info
> >
> > I showed you that if you are real minimalist you can enable the same MB2 
> > code
> > on legacy BIOS and EFI. I do not understand your objection against providing
> > MB2 in iPXE on legacy BIOS if you do not need extra code (maybe a few 
> > #ifdefs).
> > Though I am not going to convince you. It is your choice but I am still 
> > thinking
> > that it is wrong choice.
>
> It's not my choice. It's the feedback I've received from upstream.

OK, they are iPXE maintainers. Though it still does not change my
opinion about their decision.

> > By the way, does iPXE check MULTIBOOT2_HEADER_TAG_INFORMATION_REQUEST in 
> > Xen header.
> > If it does (it should) and do not understand 
> > MULTIBOOT2_TAG_TYPE_BASIC_MEMINFO and
> > MULTIBOOT2_TAG_TYPE_MMAP then it should fail.
>
> It does but I know that Xen doesn't use that information if Boot
> Services are available by code inspection. Which is what my comments are
> related to.

I am not sure that you correctly understood what I mean. Please read
multiboot2 "Information request header tag" section for more details.
iPXE should obey this rule even if you do not provide 
MULTIBOOT2_TAG_TYPE_BASIC_MEMINFO
and MULTIBOOT2_TAG_TYPE_MMAP. The same is relevant for other tags in
image header. To be precise: I mean that iPXE should complain if it
sees __REQUESTS__ for unknown tags.
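
The rule described above can be sketched as follows. This is only a minimal illustration, not iPXE or GRUB code: the helper names `loader_knows_tag()` and `check_info_request()` are hypothetical, and the tag type values and the "optional" flag bit are taken from my reading of the Multiboot2 specification.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Tag type values as listed in the Multiboot2 specification. */
#define MULTIBOOT2_TAG_TYPE_CMDLINE           1
#define MULTIBOOT2_TAG_TYPE_BOOT_LOADER_NAME  2
#define MULTIBOOT2_TAG_TYPE_MODULE            3
#define MULTIBOOT2_TAG_TYPE_BASIC_MEMINFO     4
#define MULTIBOOT2_TAG_TYPE_MMAP              6

/* Bit 0 of a header tag's flags marks the tag as optional. */
#define MULTIBOOT2_HEADER_TAG_OPTIONAL        1

/* Hypothetical helper: can this loader produce the given tag type? */
static int loader_knows_tag(const uint32_t *known, size_t nr_known,
                            uint32_t type)
{
    for (size_t i = 0; i < nr_known; i++)
        if (known[i] == type)
            return 1;
    return 0;
}

/*
 * Hypothetical check of an information request header tag.
 * Returns 0 if loading may continue, -1 if the loader must refuse.
 */
static int check_info_request(uint16_t flags,
                              const uint32_t *requested, size_t nr_requested,
                              const uint32_t *known, size_t nr_known)
{
    assert(requested != NULL || nr_requested == 0);

    /* An optional request never blocks loading. */
    if (flags & MULTIBOOT2_HEADER_TAG_OPTIONAL)
        return 0;

    /* A mandatory request for a tag the loader does not know is fatal. */
    for (size_t i = 0; i < nr_requested; i++)
        if (!loader_knows_tag(known, nr_known, requested[i]))
            return -1;

    return 0;
}
```

With this shape, a loader that knows only CMDLINE and MODULE would refuse an image whose mandatory request lists MMAP, even if the image would not use that tag when boot services are available.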

> >> from a call to GetMemoryMap(). You actually reminded me of another bug.
> >> Calling ExitBootServices() on Grub and letting it pass the memory info
> >> causes Xen to fail to load.
> >
> > How come... Which GRUB version do you use? Xen clearly says that it needs
> > boot services (look into MB2 header). So, GRUB is not allowed to call
> > ExitBootServices(). If it does then it is GRUB bug.
>
> No. That's not how it works at all. To quote 3.1.12 of the Multiboot2
> spec...
>
> "This tag indicates that payload supports starting without terminating
> boot services."
>
> This 

Re: [Xen-devel] [PATCH] xen-netfront: Fix Rx stall during network stress and OOM

2017-01-12 Thread Vineeth Remanan Pillai



On 01/12/2017 12:17 PM, David Miller wrote:

From: Vineeth Remanan Pillai 
Date: Wed, 11 Jan 2017 23:17:17 +


@@ -1054,7 +1059,11 @@ static int xennet_poll(struct napi_struct *napi, int 
budget)
napi_complete(napi);
  
  		RING_FINAL_CHECK_FOR_RESPONSES(&queue->rx, more_to_do);

-   if (more_to_do)
+
+   /* If there is more work to do or could not allocate
+* rx buffers, re-enable polling.
+*/
+   if (more_to_do || err != 0)
napi_schedule(napi);

Just polling endlessly in a loop retrying the SKB allocation over and over
again until it succeeds is not very nice behavior.

You already have that refill timer, so please use that to retry instead
of wasting cpu cycles looping in NAPI poll.

Thanks Dave for the inputs.
On further look, I think I can fix it more simply by correcting the test
condition for the minimum number of slots when pushing requests. The
existing test is like this:


/* Not enough requests? Try again later. */
   if (req_prod - queue->rx.rsp_cons < NET_RX_SLOTS_MIN) {
mod_timer(&queue->rx_refill_timer, jiffies + (HZ/10));
return;
}


Actually the above check counts more than the newly created request slots,
because it counts from rsp_cons. The actual count should be the difference
between the new req_prod and the old req_prod (in the queue). If skbs
cannot be created, this count remains small and hence we would schedule
the timer. So the fix could be:


/* Not enough requests? Try again later. */
-   if (req_prod - queue->rx.rsp_cons < NET_RX_SLOTS_MIN) {
+   if (req_prod - queue->rx.sring->req_prod < NET_RX_SLOTS_MIN) {


I have done some initial testing to verify the fix. I will send out a v2
patch after a couple more rounds of testing.

Thanks,
Vineeth



[Xen-devel] [qemu-mainline test] 104148: regressions - FAIL

2017-01-12 Thread osstest service owner
flight 104148 qemu-mainline real [real]
http://logs.test-lab.xenproject.org/osstest/logs/104148/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-armhf-armhf-xl-credit2 15 guest-start/debian.repeat fail REGR. vs. 104106

Regressions which are regarded as allowable (not blocking):
 test-armhf-armhf-libvirt-xsm 13 saverestore-support-checkfail  like 104106
 test-armhf-armhf-libvirt 13 saverestore-support-checkfail  like 104106
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stopfail like 104106
 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop fail like 104106
 test-armhf-armhf-libvirt-raw 12 saverestore-support-checkfail  like 104106
 test-armhf-armhf-libvirt-qcow2 12 saverestore-support-check   fail like 104106
 test-amd64-amd64-xl-rtds  9 debian-install   fail  like 104106

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-pvh-amd  11 guest-start  fail   never pass
 test-amd64-amd64-xl-pvh-intel 11 guest-start  fail  never pass
 test-amd64-amd64-libvirt 12 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 12 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt  12 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  12 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check 
fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check 
fail never pass
 test-amd64-amd64-libvirt-vhd 11 migrate-support-checkfail   never pass
 test-amd64-amd64-qemuu-nested-amd 16 debian-hvm-install/l1/l2  fail never pass
 test-armhf-armhf-xl-xsm  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-xsm  13 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  13 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt-xsm 12 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt 12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 12 migrate-support-checkfail never pass
 test-armhf-armhf-xl-cubietruck 13 saverestore-support-checkfail never pass
 test-armhf-armhf-xl-arndale  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  13 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt-raw 11 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 13 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 12 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 13 saverestore-support-checkfail  never pass
 test-armhf-armhf-xl-credit2  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  13 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt-qcow2 11 migrate-support-checkfail never pass
 test-armhf-armhf-xl-vhd  11 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  12 saverestore-support-checkfail   never pass

version targeted for testing:
 qemuu4201e616c0fa48de99a505a6ade979f0c2c65e28
baseline version:
 qemuub44486dfb9447c88e4b216e730adcc780190852c

Last test of basis   104106  2017-01-11 01:46:06 Z1 days
Failing since104142  2017-01-12 12:12:27 Z0 days2 attempts
Testing same since   104148  2017-01-12 18:12:06 Z0 days1 attempts


People who touched revisions under test:
  Alex Bennée 
  Bastian Koppelmann 
  Eduardo Habkost 
  Gerd Hoffmann 
  Greg Kurz 
  Li Qiang 
  Mark Cave-Ayland 
  Peter Maydell 
  Richard Henderson 
  xiaoqiang zhao 

jobs:
 build-amd64-xsm  pass
 build-armhf-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-armhf  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-armhf-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvopspass
 

Re: [Xen-devel] [PATCH v11 07/13] x86: add multiboot2 protocol support for EFI platforms

2017-01-12 Thread Doug Goldstein
On 1/12/17 2:28 PM, Daniel Kiper wrote:
> On Thu, Jan 12, 2017 at 09:52:15AM -0600, Doug Goldstein wrote:
>> On 1/12/17 6:50 AM, Daniel Kiper wrote:
>>> On Wed, Jan 11, 2017 at 02:20:15PM -0600, Doug Goldstein wrote:
 On 1/11/17 1:47 PM, Daniel Kiper wrote:
> On Tue, Jan 10, 2017 at 02:51:27PM -0600, Doug Goldstein wrote:
>> On 1/9/17 7:37 PM, Doug Goldstein wrote:
>>> On 12/5/16 4:25 PM, Daniel Kiper wrote:
>>
 diff --git a/xen/arch/x86/efi/efi-boot.h b/xen/arch/x86/efi/efi-boot.h
 index 62c010e..dc857d8 100644
 --- a/xen/arch/x86/efi/efi-boot.h
 +++ b/xen/arch/x86/efi/efi-boot.h
 @@ -146,6 +146,8 @@ static void __init 
 efi_arch_process_memory_map(EFI_SYSTEM_TABLE *SystemTable,
  {
  struct e820entry *e;
  unsigned int i;
 +/* Check for extra mem for mbi data if Xen is loaded via 
 multiboot2 protocol. */
 +UINTN extra_mem = efi_enabled(EFI_LOADER) ? 0 : (64 << 10);
>>>
>>> Just wondering where the constant came from? And if there should be a
>>> little bit of information about it. To me its just weird to shift 64.
>>
>> Its the size of the stack used in the assembly code.
>
> No, it is trampoline region size.

 trampoline + stack in head.S We take the address where we're going to
 copy the trampoline and set the stack to 0x10000 past it.
>>>
>>> I suppose that you think about this:
>>>
>>> /* Switch to low-memory stack.  */
>>> mov sym_fs(trampoline_phys),%edi
>>> lea 0x10000(%edi),%esp
>>>
>>> However, trampoline region size is (should be) 64 KiB. No way. Please
>>> look below for more details.
>>
>> The trampoline + stack are 64kb together. The stack grows down and the
>> trampoline grows up. The stack starts at 64kb past the start of the
>> trampoline. %edi is the start of the trampoline.
> 
> Yep. I think that right now we are on the same boat.
> 
  /* Populate E820 table and check trampoline area availability. */
  e = e820map - 1;
 @@ -168,7 +170,8 @@ static void __init 
 efi_arch_process_memory_map(EFI_SYSTEM_TABLE *SystemTable,
  /* fall through */
  case EfiConventionalMemory:
  if ( !trampoline_phys && desc->PhysicalStart + len <= 
 0x100000 &&
 - len >= cfg.size && desc->PhysicalStart + len > 
 cfg.addr )
 + len >= cfg.size + extra_mem &&
 + desc->PhysicalStart + len > cfg.addr )
  cfg.addr = (desc->PhysicalStart + len - cfg.size) & 
 PAGE_MASK;
>>>
>>> So this is where the current series blows up and fails on real hardware.
>>
>> Honestly this was my misunderstanding and this shouldn't ever be used to
>> get memory for the trampoline. This also has the bug in it that it needs
>> to be:
>>
>> ASSERT(cfg.size > 0);
>> cfg.addr = (desc->PhysicalStart + len - (cfg.size + extra_mem)) & 
>> PAGE_MASK;
>
> As I said earlier. This extra_mem stuff is (maybe) wrong and should be 
> fixed
> in one way or another. Hmmm... It looks OK. I will double check it because
> I do not looked at this code long time and maybe I am missing something.

 cfg.size needs to be the size of the trampolines + stack.
>>>
>>> It looks that during some code rearrangement I moved one instruction too
>>> much to trampoline_bios_setup. So, I can agree that right now cfg.size
>>> should be properly initialized. Though it should be cfg.size = 64 << 10.
>>> Then extra_mem should be dropped.
>>
>> That's fine as long as its clear that 64kb is for the trampoline + the
>> stack.
> 
> OK, but there are two stacks. We talk about "low-memory stack". I will improve
> the comment.
> 
> [...]
> 
>> memory region). You need to use AllocatePages() otherwise you are
>> trampling memory that might have been allocated by the bootloader or any
>
> Bootloader code/data should be dead here.

 Correct. Unfortunately on my Lenovo laptop and my Intel NUCs I can't
 currently call ExitBootServices and a timer that iPXE has wired up has
>>>
>>> If you disable an important wheel in a machine you should not expect
>>> that the machine will work. Sorry! No way!
>>
>> Speak to your co-workers Konrad and Boris. We've had long email threads
>> about how certain hardware does not work with the way Xen calls
>> ExitBootServices.
> 
> Could you be more precise what is wrong? Or at least send links to
> relevant threads.

There have been several on the ML over the past 2 years. A quick Google
search turns these up.

http://xen.markmail.org/message/f6lx2ab4o2fch35r
https://lists.xenproject.org/archives/html/xen-devel/2015-01/msg03164.html


> 
 some memory reserved down there and it was getting trampled. The real
>>>
>>> I still do not know why remnants 

Re: [Xen-devel] [PATCH v11 07/13] x86: add multiboot2 protocol support for EFI platforms

2017-01-12 Thread Doug Goldstein
On 1/12/17 3:45 PM, Daniel Kiper wrote:
> On Thu, Jan 12, 2017 at 01:46:41PM -0600, Doug Goldstein wrote:
>> On 1/12/17 1:30 PM, Daniel Kiper wrote:
>>> On Thu, Jan 12, 2017 at 09:44:59AM -0600, Doug Goldstein wrote:
>>
>>>
 view there's no reason for adding MB2 support for BIOS since it provides
 no advantage over MB1 when booting from the BIOS. Now MB2 solves a
>>>
>>> From your point of view maybe it does not. However, from user point of view 
>>> it may.
>>> If you have support for MB2 on legacy BIOS and EFI platforms then you can 
>>> boot Xen
>>> on both platforms without changing anything in boot config files. Otherwise 
>>> you have
>>> to prepare separate configuration for different platforms.
>>
>> Neither Grub nor iPXE require different configs for MB1 vs MB2 so I'm
>> not seeing the validity of this logic.
> 
> Hmmm... This is interesting. I do not know iPXE, however, in GRUB you must
> use multiboot/module for MB1 and multiboot2/module2 for MB2. I suppose that
> you have to differentiate between both of them in iPXE somehow too. Hence,
> there is pretty good chance that configs for MB1 and MB2 are different.

multiboot/multiboot2 and module/module2 are aliases of each other. They
work interchangeably. It's the same way in iPXE.


> 
 problem with booting over EFI vs MB1 so they'll be willing to take a
 change there. I'll also disagree that BIOS is easier than EFI since with
 EFI its just load the ELF into memory and set a few pointers in tags.
 With BIOS it requires me to build up the memory map into a MB2 structure.
>>>
>>> Xen uses only these tags on legacy BIOS platforms: 
>>> MULTIBOOT2_TAG_TYPE_BASIC_MEMINFO
>>> (well, nice to have but it can be also not provided), 
>>> MULTIBOOT2_TAG_TYPE_MMAP (same
>>> as MULTIBOOT2_TAG_TYPE_BASIC_MEMINFO), MULTIBOOT2_TAG_TYPE_BOOT_LOADER_NAME
>>> (same as MULTIBOOT2_TAG_TYPE_BASIC_MEMINFO) ,MULTIBOOT2_TAG_TYPE_CMDLINE,
>>> MULTIBOOT2_TAG_TYPE_MODULE. I do not mention MULTIBOOT2_TAG_TYPE_END which
>>> is obvious. So, if you are real hardcore minimalist then you have to provide
>>> MULTIBOOT2_TAG_TYPE_CMDLINE and MULTIBOOT2_TAG_TYPE_MODULE. All of them
>>> are provided also on EFI. So, I do not see any reason to not provide MB2
>>> for legacy BIOS. And I do not think that it is very difficult to provide
>>> all optional tags mentioned above.
>>
>> I don't understand what you're attempting to convey here. You've listed
>> out a number of tags that I mentioned in my message that I don't have to
>> implement for EFI. You've basically reinforced my point that its easier
>> to implement this for EFI than BIOS. MULTIBOOT2_TAG_TYPE_BASIC_MEMINFO
>> and MULTIBOOT2_TAG_TYPE_MMAP are unused by Xen on EFI. It gets this info
> 
> I showed you that if you are real minimalist you can enable the same MB2 code
> on legacy BIOS and EFI. I do not understand your objection against providing
> MB2 in iPXE on legacy BIOS if you do not need extra code (maybe a few 
> #ifdefs).
> Though I am not going to convince you. It is your choice but I am still 
> thinking
> that it is wrong choice.

It's not my choice. It's the feedback I've received from upstream.

> 
> By the way, does iPXE check MULTIBOOT2_HEADER_TAG_INFORMATION_REQUEST in Xen 
> header.
> If it does (it should) and do not understand 
> MULTIBOOT2_TAG_TYPE_BASIC_MEMINFO and
> MULTIBOOT2_TAG_TYPE_MMAP then it should fail.

It does but I know that Xen doesn't use that information if Boot
Services are available by code inspection. Which is what my comments are
related to.

> 
>> from a call to GetMemoryMap(). You actually reminded me of another bug.
>> Calling ExitBootServices() on Grub and letting it pass the memory info
>> causes Xen to fail to load.
> 
> How come... Which GRUB version do you use? Xen clearly says that it needs
> boot services (look into MB2 header). So, GRUB is not allowed to call
> ExitBootServices(). If it does then it is GRUB bug.

No. That's not how it works at all. To quote 3.1.12 of the Multiboot2
spec...

"This tag indicates that payload supports starting without terminating
boot services."

This tag is not required to be respected but instead means that the
payload supports using boot services. Additionally section 3.6.3 which
talks about passing MULTIBOOT2_TAG_TYPE_BASIC_MEMINFO states...

"This tag may not be provided by some boot loaders on EFI platforms if
EFI boot services are enabled and available for the loaded image (EFI
boot services not terminated tag exists in Multiboot2 information
structure)."

And section 3.6.8 talks about passing MULTIBOOT2_TAG_TYPE_MMAP states...

"This tag may not be provided by some boot loaders on EFI platforms if
EFI boot services are enabled and available for the loaded image (EFI
boot services not terminated tag exists in Multiboot2 information
structure)."


So for my iPXE support if the payload (in this case Xen) reports that it
supports not having boot services exited then I don't exit it and I
don't provide 

[Xen-devel] [PATCH v2] kexec: implement STATUS hypercall to check if image is loaded

2017-01-12 Thread Eric DeVolder
The tools that use kexec are asynchronous in nature and do not keep
state changes. As such, provide a hypercall to find out whether an
image has been loaded for either type.

Note: No need to modify XSM as it has one size fits all check and
does not check for subcommands.

Note: No need to check KEXEC_FLAG_IN_PROGRESS (and error out of
kexec_status()) as this flag is set only once by the first/only
cpu on the crash path.

Note: The __XEN_LATEST_INTERFACE_VERSION__ has been bumped to
0x00040900 due to the introduction of a new hypervisor call.

Note: This is just the Xen side of the hypercall, kexec-tools patch
to come separately.

Signed-off-by: Konrad Rzeszutek Wilk 
Signed-off-by: Eric DeVolder 
Reviewed-by: Daniel Kiper 
---
CC: Elena Ufimtseva 
CC: Daniel Kiper 

v0: Internal version.
v1: Dropped Reviewed-by, posting on xen-devel.
v2: Incorporated xen-devel feedback
---
 tools/libxc/include/xenctrl.h   | 10 ++
 tools/libxc/xc_kexec.c  | 24 
 xen/common/kexec.c  | 19 +++
 xen/include/public/kexec.h  | 13 +
 xen/include/public/xen-compat.h |  2 +-
 5 files changed, 67 insertions(+), 1 deletion(-)

diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h
index 4ab0f57117..63c616ff6a 100644
--- a/tools/libxc/include/xenctrl.h
+++ b/tools/libxc/include/xenctrl.h
@@ -2574,6 +2574,16 @@ int xc_kexec_load(xc_interface *xch, uint8_t type, 
uint16_t arch,
  */
 int xc_kexec_unload(xc_interface *xch, int type);
 
+/*
+ * Find out whether the image has been successfully loaded.
+ *
+ * The type can be either KEXEC_TYPE_DEFAULT or KEXEC_TYPE_CRASH.
+ * If zero is returned, that means no image is loaded for the type.
+ * If one is returned, that means an image is loaded for the type.
+ * Otherwise, negative return value indicates error.
+ */
+int xc_kexec_status(xc_interface *xch, int type);
+
 typedef xenpf_resource_entry_t xc_resource_entry_t;
 
 /*
diff --git a/tools/libxc/xc_kexec.c b/tools/libxc/xc_kexec.c
index 989e225192..a0a9fd3841 100644
--- a/tools/libxc/xc_kexec.c
+++ b/tools/libxc/xc_kexec.c
@@ -126,3 +126,27 @@ out:
 
 return ret;
 }
+
+int xc_kexec_status(xc_interface *xch, int type)
+{
+DECLARE_HYPERCALL_BUFFER(xen_kexec_status_t, status);
+int ret = -1;
+
+status = xc_hypercall_buffer_alloc(xch, status, sizeof(*status));
+if ( status == NULL )
+{
+PERROR("Could not alloc buffer for kexec status hypercall");
+goto out;
+}
+
+status->type = type;
+
+ret = xencall2(xch->xcall, __HYPERVISOR_kexec_op,
+   KEXEC_CMD_kexec_status,
+   HYPERCALL_BUFFER_AS_ARG(status));
+
+out:
+xc_hypercall_buffer_free(xch, status);
+
+return ret;
+}
diff --git a/xen/common/kexec.c b/xen/common/kexec.c
index c83d48fc79..aa808cb2f2 100644
--- a/xen/common/kexec.c
+++ b/xen/common/kexec.c
@@ -1169,6 +1169,22 @@ static int kexec_unload(XEN_GUEST_HANDLE_PARAM(void) 
uarg)
 return kexec_do_unload();
 }
 
+static int kexec_status(XEN_GUEST_HANDLE_PARAM(void) uarg)
+{
+xen_kexec_status_t status;
+int base, bit;
+
+if ( unlikely(copy_from_guest(&status, uarg, 1)) )
+return -EFAULT;
+
+/* No need to check KEXEC_FLAG_IN_PROGRESS. */
+
+if ( kexec_load_get_bits(status.type, &base, &bit) )
+return -EINVAL;
+
+return test_bit(bit, _flags);
+}
+
 static int do_kexec_op_internal(unsigned long op,
 XEN_GUEST_HANDLE_PARAM(void) uarg,
 bool_t compat)
@@ -1208,6 +1224,9 @@ static int do_kexec_op_internal(unsigned long op,
 case KEXEC_CMD_kexec_unload:
 ret = kexec_unload(uarg);
 break;
+case KEXEC_CMD_kexec_status:
+ret = kexec_status(uarg);
+break;
 }
 
 return ret;
diff --git a/xen/include/public/kexec.h b/xen/include/public/kexec.h
index a6a0a88f4f..c200e8ceee 100644
--- a/xen/include/public/kexec.h
+++ b/xen/include/public/kexec.h
@@ -227,6 +227,19 @@ typedef struct xen_kexec_unload {
 } xen_kexec_unload_t;
 DEFINE_XEN_GUEST_HANDLE(xen_kexec_unload_t);
 
+/*
+ * Figure out whether we have an image loaded. A return value of
+ * zero indicates no image loaded. A return value of one
+ * indicates an image is loaded. A negative return value
+ * indicates an error.
+ *
+ * Type must be one of KEXEC_TYPE_DEFAULT or KEXEC_TYPE_CRASH.
+ */
+#define KEXEC_CMD_kexec_status 6
+typedef struct xen_kexec_status {
+    uint8_t type;
+} xen_kexec_status_t;
+DEFINE_XEN_GUEST_HANDLE(xen_kexec_status_t);
 #else /* __XEN_INTERFACE_VERSION__ < 0x00040400 */
 
 #define KEXEC_CMD_kexec_load KEXEC_CMD_kexec_load_v1
diff --git a/xen/include/public/xen-compat.h b/xen/include/public/xen-compat.h
index dd8a5c0d0e..b67365340b 100644
--- a/xen/include/public/xen-compat.h
+++ b/xen/include/public/xen-compat.h
@@ 
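
The tri-state return value documented for xc_kexec_status() in the patch above (0 = not loaded, 1 = loaded, negative = error) can be sketched from the caller's side as follows. This is only an illustration: the callback type, report_kexec_status(), describe_kexec_status() and fake_status() are hypothetical names made up for this sketch, and the fake backend stands in for a real libxc call so that the snippet builds without Xen headers. The two type values are assumed to match those in Xen's public kexec.h.

```c
#include <assert.h>
#include <stdio.h>

/* Assumed to match the values in Xen's public kexec.h. */
#define KEXEC_TYPE_DEFAULT 0
#define KEXEC_TYPE_CRASH   1

/* Hypothetical stand-in for xc_kexec_status(); real code would call libxc. */
typedef int (*kexec_status_fn)(void *xch, int type);

/* Map the tri-state return value onto a short description. */
static const char *describe_kexec_status(int rc)
{
    if (rc < 0)
        return "error";
    return rc ? "loaded" : "not loaded";
}

/* Query one image type through the supplied callback and report it. */
static int report_kexec_status(kexec_status_fn query, void *xch, int type)
{
    int rc;

    assert(type == KEXEC_TYPE_DEFAULT || type == KEXEC_TYPE_CRASH);

    rc = query(xch, type);
    printf("%s image: %s\n",
           type == KEXEC_TYPE_CRASH ? "crash" : "default",
           describe_kexec_status(rc));
    return rc;
}

/* Fake backend for illustration: pretends only a crash image is loaded. */
static int fake_status(void *xch, int type)
{
    (void)xch;
    return type == KEXEC_TYPE_CRASH ? 1 : 0;
}
```

A caller should treat any negative value as a failure of the query itself rather than as "not loaded", which is why the sketch checks for errors first.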

Re: [Xen-devel] [PATCH v11 07/13] x86: add multiboot2 protocol support for EFI platforms

2017-01-12 Thread Daniel Kiper
On Thu, Jan 12, 2017 at 01:46:41PM -0600, Doug Goldstein wrote:
> On 1/12/17 1:30 PM, Daniel Kiper wrote:
> > On Thu, Jan 12, 2017 at 09:44:59AM -0600, Doug Goldstein wrote:
>
> >
> >> view there's no reason for adding MB2 support for BIOS since it provides
> >> no advantage over MB1 when booting from the BIOS. Now MB2 solves a
> >
> > From your point of view maybe it does not. However, from user point of view
> > it may. If you have support for MB2 on legacy BIOS and EFI platforms then
> > you can boot Xen on both platforms without changing anything in boot config
> > files. Otherwise you have to prepare separate configuration for different
> > platforms.
>
> Neither Grub nor iPXE require different configs for MB1 vs MB2 so I'm
> not seeing the validity of this logic.

Hmmm... This is interesting. I do not know iPXE; however, in GRUB you must
use multiboot/module for MB1 and multiboot2/module2 for MB2. I suppose that
you have to differentiate between both of them in iPXE somehow too. Hence,
there is a pretty good chance that configs for MB1 and MB2 are different.

> >> problem with booting over EFI vs MB1 so they'll be willing to take a
> >> change there. I'll also disagree that BIOS is easier than EFI since with
> >> EFI its just load the ELF into memory and set a few pointers in tags.
> >> With BIOS it requires me to build up the memory map into a MB2 structure.
> >
> > Xen uses only these tags on legacy BIOS platforms:
> > MULTIBOOT2_TAG_TYPE_BASIC_MEMINFO (well, nice to have but it can be also
> > not provided), MULTIBOOT2_TAG_TYPE_MMAP (same as
> > MULTIBOOT2_TAG_TYPE_BASIC_MEMINFO), MULTIBOOT2_TAG_TYPE_BOOT_LOADER_NAME
> > (same as MULTIBOOT2_TAG_TYPE_BASIC_MEMINFO), MULTIBOOT2_TAG_TYPE_CMDLINE,
> > MULTIBOOT2_TAG_TYPE_MODULE. I do not mention MULTIBOOT2_TAG_TYPE_END which
> > is obvious. So, if you are real hardcore minimalist then you have to provide
> > MULTIBOOT2_TAG_TYPE_CMDLINE and MULTIBOOT2_TAG_TYPE_MODULE. All of them
> > are provided also on EFI. So, I do not see any reason to not provide MB2
> > for legacy BIOS. And I do not think that it is very difficult to provide
> > all optional tags mentioned above.
>
> I don't understand what you're attempting to convey here. You've listed
> out a number of tags that I mentioned in my message that I don't have to
> implement for EFI. You've basically reinforced my point that its easier
> to implement this for EFI than BIOS. MULTIBOOT2_TAG_TYPE_BASIC_MEMINFO
> and MULTIBOOT2_TAG_TYPE_MMAP are unused by Xen on EFI. It gets this info

I showed you that if you are a real minimalist you can enable the same MB2 code
on legacy BIOS and EFI. I do not understand your objection to providing
MB2 in iPXE on legacy BIOS if you do not need extra code (maybe a few #ifdefs).
Though I am not going to convince you. It is your choice, but I still think
that it is the wrong choice.

By the way, does iPXE check MULTIBOOT2_HEADER_TAG_INFORMATION_REQUEST in the
Xen header? If it does (it should) and does not understand
MULTIBOOT2_TAG_TYPE_BASIC_MEMINFO and MULTIBOOT2_TAG_TYPE_MMAP then it should
fail.
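
To make the "minimal tag set" argument concrete, here is a rough sketch of walking a Multiboot2 tag list and checking that the two tags called mandatory above (CMDLINE and MODULE) are present. It is not Xen's or iPXE's code: mb2_collect_tags() and mb2_has_minimal_tags() are hypothetical helpers, the tag type numbers are taken from the Multiboot2 specification as I understand it, and the buffer is assumed to start at the first tag (that is, after the 8-byte total_size/reserved header of the boot information).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Tag type numbers as given in the Multiboot2 specification. */
enum {
    MB2_TAG_END              = 0,
    MB2_TAG_CMDLINE          = 1,
    MB2_TAG_BOOT_LOADER_NAME = 2,
    MB2_TAG_MODULE           = 3,
    MB2_TAG_BASIC_MEMINFO    = 4,
    MB2_TAG_MMAP             = 6,
};

/* Common header shared by every tag. */
struct mb2_tag {
    uint32_t type;
    uint32_t size;   /* includes this header, excludes padding */
};

/* Return a bitmask of the tag types seen; bit N set means type N present. */
static uint32_t mb2_collect_tags(const uint8_t *tags, size_t len)
{
    uint32_t seen = 0;
    size_t off = 0;

    assert(tags != NULL);

    while (off + sizeof(struct mb2_tag) <= len) {
        struct mb2_tag tag;

        memcpy(&tag, tags + off, sizeof(tag));
        if (tag.size < sizeof(tag))
            break;                        /* malformed tag, stop walking */
        if (tag.type < 32)
            seen |= 1u << tag.type;
        if (tag.type == MB2_TAG_END)
            break;
        off += (tag.size + 7) & ~(size_t)7;  /* tags are 8-byte aligned */
    }

    return seen;
}

/* The "hardcore minimalist" set from the mail: CMDLINE and MODULE. */
static int mb2_has_minimal_tags(uint32_t seen)
{
    uint32_t need = (1u << MB2_TAG_CMDLINE) | (1u << MB2_TAG_MODULE);

    return (seen & need) == need;
}
```

The optional tags (BASIC_MEMINFO, MMAP, BOOT_LOADER_NAME) would simply show up as extra bits in the mask, so the same walker serves both the BIOS and the EFI case.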

> from a call to GetMemoryMap(). You actually reminded me of another bug.
> Calling ExitBootServices() on Grub and letting it pass the memory info
> causes Xen to fail to load.

How come... Which GRUB version do you use? Xen clearly says that it needs
boot services (look into the MB2 header). So, GRUB is not allowed to call
ExitBootServices(). If it does then it is a GRUB bug.

> Andrew helped me troubleshoot this and he discovered the fix. You've got
> code:
>
> /* Store Xen image load base address in place accessible for 32-bit code. */
> mov %r15d,%esi
>
> But if any of the checks under the run_bs: label specifically:
> - /* Are EFI boot services available? */
> - /* Is EFI SystemTable address provided by boot loader? */
> - /* Is EFI ImageHandle address provided by boot loader? */
>
> Will not run the mov instruction and then fail to boot. Its only if any
> of these are false will it attempt to use the tags mentioned above as well.

OK, this is a bug and I will fix it. However, this is not related to
ExitBootServices() call in GRUB2.

Daniel

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2] x86, locking/spinlocks: Remove paravirt_ticketlocks_enabled

2017-01-12 Thread Boris Ostrovsky

> diff --git a/arch/x86/xen/spinlock.c b/arch/x86/xen/spinlock.c
> index e8a9ea7..25a7c43 100644
> --- a/arch/x86/xen/spinlock.c
> +++ b/arch/x86/xen/spinlock.c
> @@ -141,25 +141,6 @@ void __init xen_init_spinlocks(void)
>   pv_lock_ops.vcpu_is_preempted = PV_CALLEE_SAVE(xen_vcpu_stolen);
>  }
>  
> -/*
> - * While the jump_label init code needs to happend _after_ the jump labels are
> - * enabled and before SMP is started. Hence we use pre-SMP initcall level
> - * init. We cannot do it in xen_init_spinlocks as that is done before
> - * jump labels are activated.
> - */
> -static __init int xen_init_spinlocks_jump(void)
> -{
> - if (!xen_pvspin)
> - return 0;
> -
> - if (!xen_domain())
> - return 0;
> -
> - static_key_slow_inc(&paravirt_ticketlocks_enabled);
> - return 0;
> -}
> -early_initcall(xen_init_spinlocks_jump);
> -
>  static __init int xen_parse_nopvspin(char *arg)
>  {
>   xen_pvspin = false;


Xen bits:

Reviewed-by: Boris Ostrovsky 



[Xen-devel] [PATCH v2] partially revert "xen: Remove event channel notification through Xen PCI platform device"

2017-01-12 Thread Stefano Stabellini
The following commit:

commit 72a9b186292d98494f26cfd24a1621796209
Author: KarimAllah Ahmed 
Date:   Fri Aug 26 23:55:36 2016 +0200

xen: Remove event channel notification through Xen PCI platform device

broke Linux when booting as Dom0 on Xen in a nested Xen environment (Xen
installed inside a Xen VM). In this scenario, Linux is a PV guest, but
at the same time it uses the platform-pci driver to receive
notifications from L0 Xen. vector callbacks are not available because L1
Xen doesn't allow them.

Partially revert the offending commit, by restoring IRQ based
notifications for PV guests only. I restored only the code which is
strictly needed and replaced the xen_have_vector_callback checks within
it with xen_pv_domain() checks.

Signed-off-by: Stefano Stabellini 

---
Changes in v2:
- add in-code comment
- use HVM_CALLBACK_VIA_TYPE_SHIFT
---

diff --git a/drivers/xen/platform-pci.c b/drivers/xen/platform-pci.c
index b59c9455..549c618 100644
--- a/drivers/xen/platform-pci.c
+++ b/drivers/xen/platform-pci.c
@@ -42,6 +42,7 @@
 static unsigned long platform_mmio;
 static unsigned long platform_mmio_alloc;
 static unsigned long platform_mmiolen;
+static uint64_t callback_via;
 
 static unsigned long alloc_xen_mmio(unsigned long len)
 {
@@ -54,6 +55,51 @@ static unsigned long alloc_xen_mmio(unsigned long len)
return addr;
 }
 
+static uint64_t get_callback_via(struct pci_dev *pdev)
+{
+   u8 pin;
+   int irq;
+
+   irq = pdev->irq;
+   if (irq < 16)
+   return irq; /* ISA IRQ */
+
+   pin = pdev->pin;
+
+   /* We don't know the GSI. Specify the PCI INTx line instead. */
+   return ((uint64_t)0x01 << HVM_CALLBACK_VIA_TYPE_SHIFT) | /* PCI INTx identifier */
+   ((uint64_t)pci_domain_nr(pdev->bus) << 32) |
+   ((uint64_t)pdev->bus->number << 16) |
+   ((uint64_t)(pdev->devfn & 0xff) << 8) |
+   ((uint64_t)(pin - 1) & 3);
+}
+
+static irqreturn_t do_hvm_evtchn_intr(int irq, void *dev_id)
+{
+   xen_hvm_evtchn_do_upcall();
+   return IRQ_HANDLED;
+}
+
+static int xen_allocate_irq(struct pci_dev *pdev)
+{
+   return request_irq(pdev->irq, do_hvm_evtchn_intr,
+   IRQF_NOBALANCING | IRQF_TRIGGER_RISING,
+   "xen-platform-pci", pdev);
+}
+
+static int platform_pci_resume(struct pci_dev *pdev)
+{
+   int err;
+   if (!xen_pv_domain())
+   return 0;
+   err = xen_set_callback_via(callback_via);
+   if (err) {
+   dev_err(&pdev->dev, "platform_pci_resume failure!\n");
+   return err;
+   }
+   return 0;
+}
+
 static int platform_pci_probe(struct pci_dev *pdev,
  const struct pci_device_id *ent)
 {
@@ -92,6 +138,28 @@ static int platform_pci_probe(struct pci_dev *pdev,
platform_mmio = mmio_addr;
platform_mmiolen = mmio_len;
 
+   /* 
+* Xen HVM guests always use the vector callback mechanism.
+* L1 Dom0 in a nested Xen environment is a PV guest inside an
+* HVM environment. It needs the platform-pci driver to get
+* notifications from L0 Xen, but it cannot use the vector callback
+* as it is not exported by L1 Xen.
+*/
+   if (xen_pv_domain()) {
+   ret = xen_allocate_irq(pdev);
+   if (ret) {
+   dev_warn(&pdev->dev, "request_irq failed err=%d\n", ret);
+   goto out;
+   }
+   callback_via = get_callback_via(pdev);
+   ret = xen_set_callback_via(callback_via);
+   if (ret) {
+   dev_warn(&pdev->dev, "Unable to set the evtchn callback "
+"err=%d\n", ret);
+   goto out;
+   }
+   }
+
max_nr_gframes = gnttab_max_grant_frames();
grant_frames = alloc_xen_mmio(PAGE_SIZE * max_nr_gframes);
ret = gnttab_setup_auto_xlat_frames(grant_frames);
@@ -123,6 +191,9 @@ static int platform_pci_probe(struct pci_dev *pdev,
.name =   DRV_NAME,
.probe =  platform_pci_probe,
.id_table =   platform_pci_tbl,
+#ifdef CONFIG_PM
+   .resume_early =   platform_pci_resume,
+#endif
 };
 
 static int __init platform_pci_init(void)
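
For readers who want to see the bit layout that get_callback_via() builds, here is a standalone sketch of the same packing. It assumes the type field sits at bit 56 (the v1 patch used a literal 56 where v2 uses HVM_CALLBACK_VIA_TYPE_SHIFT); encode_callback_via() and the CALLBACK_VIA_* macros are hypothetical names for this illustration and take plain integers instead of a struct pci_dev.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed to equal HVM_CALLBACK_VIA_TYPE_SHIFT (v1 of the patch used 56). */
#define CALLBACK_VIA_TYPE_SHIFT    56
#define CALLBACK_VIA_TYPE_PCI_INTX 0x01ULL

/*
 * Pack a callback "via" value:
 *  - ISA IRQs (< 16) are passed through as-is;
 *  - otherwise: type in the top byte, PCI domain from bit 32,
 *    bus from bit 16, devfn in bits 8..15, INTx line (0..3) in bits 0..1.
 */
static uint64_t encode_callback_via(unsigned int irq, unsigned int domain,
                                    unsigned int bus, unsigned int devfn,
                                    unsigned int pin)
{
    if (irq < 16)
        return irq;                       /* ISA IRQ */

    assert(pin >= 1 && pin <= 4);         /* INTA..INTD */

    return (CALLBACK_VIA_TYPE_PCI_INTX << CALLBACK_VIA_TYPE_SHIFT) |
           ((uint64_t)domain << 32) |
           ((uint64_t)bus << 16) |
           ((uint64_t)(devfn & 0xff) << 8) |
           ((uint64_t)(pin - 1) & 3);
}
```

For example, a device at 0000:00:03.0 (devfn 0x18) using INTA with a non-ISA IRQ would, under these assumptions, encode to 0x0100000000001800.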



Re: [Xen-devel] [PATCH] partially revert "xen: Remove event channel notification through Xen PCI platform device"

2017-01-12 Thread Boris Ostrovsky
On 01/12/2017 04:33 PM, Stefano Stabellini wrote:
> On Thu, 12 Jan 2017, Boris Ostrovsky wrote:
>> On 01/11/2017 06:36 PM, Stefano Stabellini wrote:
>>> The following commit:
>>>
>>> commit 72a9b186292d98494f26cfd24a1621796209
>>> Author: KarimAllah Ahmed 
>>> Date:   Fri Aug 26 23:55:36 2016 +0200
>>>
>>> xen: Remove event channel notification through Xen PCI platform device

Can you also replace this with

"Commit 72a9b186292d ("xen: Remove event channel notification through
Xen PCI platform device")" ... ?

Thanks.

-boris





Re: [Xen-devel] [PATCH] partially revert "xen: Remove event channel notification through Xen PCI platform device"

2017-01-12 Thread Stefano Stabellini
On Thu, 12 Jan 2017, Boris Ostrovsky wrote:
> On 01/11/2017 06:36 PM, Stefano Stabellini wrote:
> > The following commit:
> >
> > commit 72a9b186292d98494f26cfd24a1621796209
> > Author: KarimAllah Ahmed 
> > Date:   Fri Aug 26 23:55:36 2016 +0200
> >
> > xen: Remove event channel notification through Xen PCI platform device
> >
> > broke Linux when booting as Dom0 on Xen in a nested Xen environment (Xen
> > installed inside a Xen VM). In this scenario, Linux is a PV guest, but
> > at the same time it uses the platform-pci driver to receive
> > notifications from L0 Xen. vector callbacks are not available because L1
> > Xen doesn't allow them.
> 
> (+Konrad who has been running nested)
> 
> >
> > Partially revert the offending commit, by restoring IRQ based
> > notifications for PV guests only. I restored only the code which is
> > strictly needed and replaced the xen_have_vector_callback checks within
> > it with xen_pv_domain() checks.
> >
> > Signed-off-by: Stefano Stabellini 
> >
> > ---
> > Alternatively, I could also restore the xen_have_vector_callback
> > checks. In general, it's best to have feature flag checks than umbrella
> > xen_pv/hvm_domain() checks.
> 
> I don't think it's worth doing given that we know that on HVM we can't run
> without this feature (we have
> BUG_ON(!xen_feature(XENFEAT_hvm_callback_vector)) in xen_hvm_guest_init()).

OK. I'll add an in-code comment.


> > ---
> >
> > diff --git a/drivers/xen/platform-pci.c b/drivers/xen/platform-pci.c
> > index b59c9455..1477f1d 100644
> > --- a/drivers/xen/platform-pci.c
> > +++ b/drivers/xen/platform-pci.c
> > @@ -42,6 +42,7 @@
> >  static unsigned long platform_mmio;
> >  static unsigned long platform_mmio_alloc;
> >  static unsigned long platform_mmiolen;
> > +static uint64_t callback_via;
> >  
> >  static unsigned long alloc_xen_mmio(unsigned long len)
> >  {
> > @@ -54,6 +55,51 @@ static unsigned long alloc_xen_mmio(unsigned long len)
> > return addr;
> >  }
> >  
> > +static uint64_t get_callback_via(struct pci_dev *pdev)
> > +{
> > +   u8 pin;
> > +   int irq;
> > +
> > +   irq = pdev->irq;
> > +   if (irq < 16)
> > +   return irq; /* ISA IRQ */
> > +
> > +   pin = pdev->pin;
> > +
> > +   /* We don't know the GSI. Specify the PCI INTx line instead. */
> > +   return ((uint64_t)0x01 << 56) | /* PCI INTx identifier */
> 
> You can use HVM_CALLBACK_VIA_TYPE_SHIFT here.

OK


> 
> > +   ((uint64_t)pci_domain_nr(pdev->bus) << 32) |
> > +   ((uint64_t)pdev->bus->number << 16) |
> > +   ((uint64_t)(pdev->devfn & 0xff) << 8) |
> > +   ((uint64_t)(pin - 1) & 3);
> > +}
> > +
> >
> 
> 



[Xen-devel] [xen-unstable test] 104146: regressions - FAIL

2017-01-12 Thread osstest service owner
flight 104146 xen-unstable real [real]
http://logs.test-lab.xenproject.org/osstest/logs/104146/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-armhf-xsm  4 host-build-prep fail in 104136 REGR. vs. 104119

Tests which are failing intermittently (not blocking):
 test-armhf-armhf-xl-vhd   9 debian-di-install  fail pass in 104136
 test-amd64-i386-xl-qemuu-winxpsp3-vcpus1 12 guest-saverestore fail pass in 104136

Regressions which are regarded as allowable (not blocking):
 test-armhf-armhf-libvirt-xsm 13 saverestore-support-check fail  like 104104
 test-armhf-armhf-libvirt 13 saverestore-support-check fail  like 104119
 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop fail like 104119
 test-amd64-i386-xl-qemut-win7-amd64 16 guest-stop fail like 104119
 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-stop fail like 104119
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop fail like 104119
 test-armhf-armhf-libvirt-qcow2 12 saverestore-support-check fail like 104119
 test-armhf-armhf-libvirt-raw 12 saverestore-support-check fail  like 104119
 test-amd64-amd64-xl-rtds  9 debian-install   fail  like 104119

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-libvirt-xsm  1 build-check(1)   blocked in 104136 n/a
 test-armhf-armhf-xl-xsm   1 build-check(1)   blocked in 104136 n/a
 test-armhf-armhf-xl-vhd 11 migrate-support-check fail in 104136 never pass
 test-armhf-armhf-xl-vhd 12 saverestore-support-check fail in 104136 never pass
 test-amd64-amd64-xl-pvh-amd  11 guest-start  fail   never pass
 test-amd64-amd64-xl-pvh-intel 11 guest-start  fail  never pass
 test-amd64-i386-libvirt  12 migrate-support-check fail   never pass
 test-amd64-i386-libvirt-xsm  12 migrate-support-check fail   never pass
 test-amd64-amd64-libvirt-xsm 12 migrate-support-check fail   never pass
 test-amd64-amd64-libvirt 12 migrate-support-check fail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass
 test-armhf-armhf-xl-arndale  12 migrate-support-check fail   never pass
 test-armhf-armhf-xl-arndale  13 saverestore-support-check fail   never pass
 test-amd64-amd64-libvirt-vhd 11 migrate-support-check fail   never pass
 test-amd64-amd64-qemuu-nested-amd 16 debian-hvm-install/l1/l2  fail never pass
 test-armhf-armhf-xl-xsm  12 migrate-support-check fail   never pass
 test-armhf-armhf-xl-credit2  12 migrate-support-check fail   never pass
 test-armhf-armhf-xl-xsm  13 saverestore-support-check fail   never pass
 test-armhf-armhf-xl-cubietruck 12 migrate-support-check fail never pass
 test-armhf-armhf-libvirt 12 migrate-support-check fail   never pass
 test-armhf-armhf-xl-credit2  13 saverestore-support-check fail   never pass
 test-armhf-armhf-xl-cubietruck 13 saverestore-support-check fail never pass
 test-armhf-armhf-libvirt-xsm 12 migrate-support-check fail   never pass
 test-armhf-armhf-libvirt-qcow2 11 migrate-support-check fail never pass
 test-armhf-armhf-libvirt-raw 11 migrate-support-check fail   never pass
 test-armhf-armhf-xl-multivcpu 12 migrate-support-check fail  never pass
 test-armhf-armhf-xl-multivcpu 13 saverestore-support-check fail  never pass
 test-armhf-armhf-xl  12 migrate-support-check fail   never pass
 test-armhf-armhf-xl  13 saverestore-support-check fail   never pass
 test-armhf-armhf-xl-rtds 12 migrate-support-check fail   never pass
 test-armhf-armhf-xl-rtds 13 saverestore-support-check fail   never pass

version targeted for testing:
 xen  0d045d65c19ac48b31344b566cbf82a0270e6e44
baseline version:
 xen  ffc103c223a6d12e5221f66b7e96396a61ba1b20

Last test of basis   104119  2017-01-11 06:45:46 Z    1 days
Failing since        104126  2017-01-11 16:44:54 Z    1 days    4 attempts
Testing same since   104131  2017-01-11 22:43:41 Z    0 days    3 attempts


People who touched revisions under test:
  Andrew Cooper 
  Jan Beulich 
  Kevin Tian 
  Stefano Stabellini 
  Wei Liu 

jobs:
 build-amd64-xsm  pass
 build-armhf-xsm  pass
 build-i386-xsm   pass
 build-amd64-xtf  pass
 build-amd64  pass
 build-armhf   

Re: [Xen-devel] [PATCH v11 07/13] x86: add multiboot2 protocol support for EFI platforms

2017-01-12 Thread Doug Goldstein
On 1/12/17 1:30 PM, Daniel Kiper wrote:
> On Thu, Jan 12, 2017 at 09:44:59AM -0600, Doug Goldstein wrote:

> 
>> view there's no reason for adding MB2 support for BIOS since it provides
>> no advantage over MB1 when booting from the BIOS. Now MB2 solves a
> 
> From your point of view maybe it does not. However, from user point of view
> it may. If you have support for MB2 on legacy BIOS and EFI platforms then
> you can boot Xen on both platforms without changing anything in boot config
> files. Otherwise you have to prepare separate configuration for different
> platforms.

Neither Grub nor iPXE require different configs for MB1 vs MB2 so I'm
not seeing the validity of this logic.

> 
>> problem with booting over EFI vs MB1 so they'll be willing to take a
>> change there. I'll also disagree that BIOS is easier than EFI since with
>> EFI its just load the ELF into memory and set a few pointers in tags.
>> With BIOS it requires me to build up the memory map into a MB2 structure.
> 
> Xen uses only these tags on legacy BIOS platforms:
> MULTIBOOT2_TAG_TYPE_BASIC_MEMINFO (well, nice to have but it can be also
> not provided), MULTIBOOT2_TAG_TYPE_MMAP (same as
> MULTIBOOT2_TAG_TYPE_BASIC_MEMINFO), MULTIBOOT2_TAG_TYPE_BOOT_LOADER_NAME
> (same as MULTIBOOT2_TAG_TYPE_BASIC_MEMINFO), MULTIBOOT2_TAG_TYPE_CMDLINE,
> MULTIBOOT2_TAG_TYPE_MODULE. I do not mention MULTIBOOT2_TAG_TYPE_END which
> is obvious. So, if you are real hardcore minimalist then you have to provide
> MULTIBOOT2_TAG_TYPE_CMDLINE and MULTIBOOT2_TAG_TYPE_MODULE. All of them
> are provided also on EFI. So, I do not see any reason to not provide MB2
> for legacy BIOS. And I do not think that it is very difficult to provide
> all optional tags mentioned above.

I don't understand what you're attempting to convey here. You've listed
out a number of tags that I mentioned in my message that I don't have to
implement for EFI. You've basically reinforced my point that its easier
to implement this for EFI than BIOS. MULTIBOOT2_TAG_TYPE_BASIC_MEMINFO
and MULTIBOOT2_TAG_TYPE_MMAP are unused by Xen on EFI. It gets this info
from a call to GetMemoryMap(). You actually reminded me of another bug.
Calling ExitBootServices() on Grub and letting it pass the memory info
causes Xen to fail to load.

Andrew helped me troubleshoot this and he discovered the fix. You've got
code:

/* Store Xen image load base address in place accessible for 32-bit code. */
mov %r15d,%esi

But if any of the checks under the run_bs: label specifically:
- /* Are EFI boot services available? */
- /* Is EFI SystemTable address provided by boot loader? */
- /* Is EFI ImageHandle address provided by boot loader? */

Will not run the mov instruction and then fail to boot. Its only if any
of these are false will it attempt to use the tags mentioned above as well.

-- 
Doug Goldstein





Re: [Xen-devel] [PATCH v2 16/25] x86/pv: Use per-domain policy information in pv_cpuid()

2017-01-12 Thread Boris Ostrovsky
On 01/12/2017 03:48 PM, Andrew Cooper wrote:
> On 12/01/17 20:46, Boris Ostrovsky wrote:
>> On 01/12/2017 02:27 PM, Andrew Cooper wrote:
>>> On 12/01/17 18:00, Boris Ostrovsky wrote:
>>>>> Ahh! found it.  This is a side effect of starting to generate the dom0
>>>>> policy in Xen.
>>>>>
>>>>> Can you try this patch?
>>>> Intel/AMD HVM/PV 64/32bit all look good. So
>>>>
>>>> Tested-by: Boris Ostrovsky 
>>> Does this mean that newer versions of Linux more picky about what they
>>> tolerate in cpuid?
>> We started to fail after change in Xen so I am not sure it's something
>> new in Linux.
> Right, but Linux 4.4 was entirely happy with this bug, both with and
> without having CPUID faulting imposed on it.

Oh, I see. My tests (typically) build and run the latest Linux tree (and
Xen staging) every morning.

I am trying to see what part of Linux caused the crash.

-boris





Re: [Xen-devel] [PATCH v2 16/25] x86/pv: Use per-domain policy information in pv_cpuid()

2017-01-12 Thread Andrew Cooper
On 12/01/17 20:46, Boris Ostrovsky wrote:
> On 01/12/2017 02:27 PM, Andrew Cooper wrote:
>> On 12/01/17 18:00, Boris Ostrovsky wrote:
>>>> Ahh! found it.  This is a side effect of starting to generate the dom0
>>>> policy in Xen.
>>>>
>>>> Can you try this patch?
>>> Intel/AMD HVM/PV 64/32bit all look good. So
>>>
>>> Tested-by: Boris Ostrovsky 
>> Does this mean that newer versions of Linux more picky about what they
>> tolerate in cpuid?
> We started to fail after change in Xen so I am not sure it's something
> new in Linux.

Right, but Linux 4.4 was entirely happy with this bug, both with and
without having CPUID faulting imposed on it.

Either way, it is definitely a bug in my code.

~Andrew



Re: [Xen-devel] [PATCH v2 16/25] x86/pv: Use per-domain policy information in pv_cpuid()

2017-01-12 Thread Boris Ostrovsky
On 01/12/2017 02:27 PM, Andrew Cooper wrote:
> On 12/01/17 18:00, Boris Ostrovsky wrote:
>>> Ahh! found it.  This is a side effect of starting to generate the dom0
>>> policy in Xen.
>>>
>>> Can you try this patch?
>> Intel/AMD HVM/PV 64/32bit all look good. So
>>
>> Tested-by: Boris Ostrovsky 
> Does this mean that newer versions of Linux more picky about what they
> tolerate in cpuid?

We started to fail after change in Xen so I am not sure it's something
new in Linux.

-boris


>
> This bug highlights a hole in my testing strategy, which I will attempt
> to plug.
>
> ~Andrew





Re: [Xen-devel] [PATCH v11 07/13] x86: add multiboot2 protocol support for EFI platforms

2017-01-12 Thread Daniel Kiper
On Thu, Jan 12, 2017 at 09:52:15AM -0600, Doug Goldstein wrote:
> On 1/12/17 6:50 AM, Daniel Kiper wrote:
> > On Wed, Jan 11, 2017 at 02:20:15PM -0600, Doug Goldstein wrote:
> >> On 1/11/17 1:47 PM, Daniel Kiper wrote:
> >>> On Tue, Jan 10, 2017 at 02:51:27PM -0600, Doug Goldstein wrote:
> >>>> On 1/9/17 7:37 PM, Doug Goldstein wrote:
> >>>>> On 12/5/16 4:25 PM, Daniel Kiper wrote:
> >>>>
> >>>>>> diff --git a/xen/arch/x86/efi/efi-boot.h b/xen/arch/x86/efi/efi-boot.h
> >>>>>> index 62c010e..dc857d8 100644
> >>>>>> --- a/xen/arch/x86/efi/efi-boot.h
> >>>>>> +++ b/xen/arch/x86/efi/efi-boot.h
> >>>>>> @@ -146,6 +146,8 @@ static void __init efi_arch_process_memory_map(EFI_SYSTEM_TABLE *SystemTable,
> >>>>>>  {
> >>>>>>  struct e820entry *e;
> >>>>>>  unsigned int i;
> >>>>>> +/* Check for extra mem for mbi data if Xen is loaded via multiboot2 protocol. */
> >>>>>> +UINTN extra_mem = efi_enabled(EFI_LOADER) ? 0 : (64 << 10);
> >>>>>
> >>>>> Just wondering where the constant came from? And if there should be a
> >>>>> little bit of information about it. To me its just weird to shift 64.
> >>>>
> >>>> Its the size of the stack used in the assembly code.
> >>>
> >>> No, it is trampoline region size.
> >>
> >> trampoline + stack in head.S We take the address where we're going to
> >> copy the trampoline and set the stack to 0x10000 past it.
> >
> > I suppose that you think about this:
> >
> > /* Switch to low-memory stack.  */
> > mov sym_fs(trampoline_phys),%edi
> > lea 0x10000(%edi),%esp
> >
> > However, trampoline region size is (should be) 64 KiB. No way. Please
> > look below for more details.
>
> The trampoline + stack are 64kb together. The stack grows down and the
> trampoline grows up. The stack starts at 64kb past the start of the
> trampoline. %edi is the start of the trampoline.

Yep. I think that right now we are on the same boat.
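
The layout agreed on above (one 64 KiB region, trampoline growing up from its start, low-memory stack growing down from its end) can be written out as a small sketch. The struct and function names are hypothetical and only model the arithmetic; they are not taken from head.S.

```c
#include <assert.h>
#include <stdint.h>

/* 64 KiB shared by the trampoline and the low-memory stack. */
#define LOW_REGION_SIZE (64u << 10)

struct low_region {
    uint32_t trampoline_start;  /* trampoline code grows up from here */
    uint32_t stack_top;         /* initial %esp; stack grows down from here */
    uint32_t stack_room;        /* bytes left for the stack */
};

/* Derive the layout from the trampoline base and the trampoline's length. */
static struct low_region low_region_layout(uint32_t trampoline_phys,
                                           uint32_t trampoline_len)
{
    struct low_region r;

    assert(trampoline_len <= LOW_REGION_SIZE);

    r.trampoline_start = trampoline_phys;
    r.stack_top = trampoline_phys + LOW_REGION_SIZE;
    r.stack_room = LOW_REGION_SIZE - trampoline_len;
    return r;
}
```

With a trampoline placed at 0x8f000, for instance, the stack would start at 0x9f000, which is what the lea in the quoted assembly computes.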

> >>>>>>  /* Populate E820 table and check trampoline area availability. */
> >>>>>>  e = e820map - 1;
> >>>>>> @@ -168,7 +170,8 @@ static void __init efi_arch_process_memory_map(EFI_SYSTEM_TABLE *SystemTable,
> >>>>>>  /* fall through */
> >>>>>>  case EfiConventionalMemory:
> >>>>>>  if ( !trampoline_phys && desc->PhysicalStart + len <= 0x100000 &&
> >>>>>> - len >= cfg.size && desc->PhysicalStart + len > cfg.addr )
> >>>>>> + len >= cfg.size + extra_mem &&
> >>>>>> + desc->PhysicalStart + len > cfg.addr )
> >>>>>>  cfg.addr = (desc->PhysicalStart + len - cfg.size) & PAGE_MASK;
> >>>>>
> >>>>> So this is where the current series blows up and fails on real hardware.
> >>>>
> >>>> Honestly this was my misunderstanding and this shouldn't ever be used to
> >>>> get memory for the trampoline. This also has the bug in it that it needs
> >>>> to be:
> >>>>
> >>>> ASSERT(cfg.size > 0);
> >>>> cfg.addr = (desc->PhysicalStart + len - (cfg.size + extra_mem) & PAGE_MASK;
> >>>
> >>> As I said earlier. This extra_mem stuff is (maybe) wrong and should be
> >>> fixed in one way or another. Hmmm... It looks OK. I will double check it
> >>> because I do not looked at this code long time and maybe I am missing
> >>> something.
> >>
> >> cfg.size needs to be the size of the trampolines + stack.
> >
> > It looks that during some code rearrangement I moved one instruction too
> > much to trampoline_bios_setup. So, I can agree that right now cfg.size
> > should be properly initialized. Though it should be cfg.size = 64 << 10.
> > Then extra_mem should be dropped.
>
> That's fine as long as its clear that 64kb is for the trampoline + the
> stack.

OK, but there are two stacks. We talk about "low-memory stack". I will improve
the comment.
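
As a rough illustration of the placement rule being discussed (carve a cfg.size-byte region from the top of a conventional-memory range below 1 MiB and round down to a page boundary), consider the sketch below. place_low_region() is a hypothetical helper, the 1 MiB limit and 4 KiB page size are assumptions of this example, and the "pick the highest candidate so far" bookkeeping of the real loop is left out.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_MASK_4K  (~(uint64_t)0xfff)   /* assumes 4 KiB pages */
#define LOW_MEM_LIMIT 0x100000ULL          /* assumes a 1 MiB ceiling */

/*
 * Return the page-aligned start of a `size`-byte region taken from the
 * top of [start, start + len), or 0 if the range is unsuitable.
 */
static uint64_t place_low_region(uint64_t start, uint64_t len, uint64_t size)
{
    assert(size > 0);   /* mirrors the ASSERT(cfg.size > 0) suggested above */

    if (start + len > LOW_MEM_LIMIT || len < size)
        return 0;

    return (start + len - size) & PAGE_MASK_4K;
}
```

If cfg.size is set to 64 << 10 as proposed, no separate extra_mem term is needed: the single size already covers trampoline plus low-memory stack.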

[...]

> >>>> memory region). You need to use AllocatePages() otherwise you are
> >>>> trampling memory that might have been allocated by the bootloader or any
> >>>
> >>> Bootloader code/data should be dead here.
> >>
> >> Correct. Unfortunately on my Lenovo laptop and my Intel NUCs I can't
> >> currently call ExitBootServices and a timer that iPXE has wired up has
> >
> > If you disable an important wheel in a machine you should not expect
> > that the machine will work. Sorry! No way!
>
> Speak to your co-workers Konrad and Boris. We've had long email threads
> about how certain hardware does not work with the way Xen calls
> ExitBootServices.

Could you be more precise what is wrong? Or at least send links to
relevant threads.

> >> some memory reserved down there and it was getting trampled. The real
> >
> > I still do not know why remnants of iPXE should run at this Xen boot stage.
> > It looks like an iPXE bug and IMO it should be fixed first.
>
> Like I said above, its because on this machine I am unable to call Xen's
> EBS.

I do not understand how ExitBootServices() call is related to iPXE timer
remnants or so. Though if it is 

[Xen-devel] [PATCH v2] x86, locking/spinlocks: Remove paravirt_ticketlocks_enabled

2017-01-12 Thread Waiman Long
This is a follow-up of commit cfd8983f03c7b2 ("x86, locking/spinlocks:
Remove ticket (spin)lock implementation"). The static_key structure
paravirt_ticketlocks_enabled is now removed as it is no longer used.
As a result, the init functions kvm_spinlock_init_jump() and
xen_init_spinlocks_jump() are also removed.

A simple build and boot test was done to verify it.

Signed-off-by: Waiman Long 
---
 v1->v2:
  - Remove init functions kvm_spinlock_init_jump() and
xen_init_spinlocks_jump().

 arch/x86/include/asm/spinlock.h  |  3 ---
 arch/x86/kernel/kvm.c| 14 --
 arch/x86/kernel/paravirt-spinlocks.c |  3 ---
 arch/x86/xen/spinlock.c  | 19 ---
 4 files changed, 39 deletions(-)

diff --git a/arch/x86/include/asm/spinlock.h b/arch/x86/include/asm/spinlock.h
index 921bea7..6d39190 100644
--- a/arch/x86/include/asm/spinlock.h
+++ b/arch/x86/include/asm/spinlock.h
@@ -23,9 +23,6 @@
 /* How long a lock should spin before we consider blocking */
 #define SPIN_THRESHOLD (1 << 15)
 
-extern struct static_key paravirt_ticketlocks_enabled;
-static __always_inline bool static_key_false(struct static_key *key);
-
 #include <asm/qspinlock.h>
 
 /*
diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index 36bc664..099fcba 100644
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -620,18 +620,4 @@ void __init kvm_spinlock_init(void)
}
 }
 
-static __init int kvm_spinlock_init_jump(void)
-{
-   if (!kvm_para_available())
-   return 0;
-   if (!kvm_para_has_feature(KVM_FEATURE_PV_UNHALT))
-   return 0;
-
-   static_key_slow_inc(&paravirt_ticketlocks_enabled);
-   printk(KERN_INFO "KVM setup paravirtual spinlock\n");
-
-   return 0;
-}
-early_initcall(kvm_spinlock_init_jump);
-
 #endif /* CONFIG_PARAVIRT_SPINLOCKS */
diff --git a/arch/x86/kernel/paravirt-spinlocks.c b/arch/x86/kernel/paravirt-spinlocks.c
index 6d4bf81..6259327 100644
--- a/arch/x86/kernel/paravirt-spinlocks.c
+++ b/arch/x86/kernel/paravirt-spinlocks.c
@@ -42,6 +42,3 @@ struct pv_lock_ops pv_lock_ops = {
 #endif /* SMP */
 };
 EXPORT_SYMBOL(pv_lock_ops);
-
-struct static_key paravirt_ticketlocks_enabled = STATIC_KEY_INIT_FALSE;
-EXPORT_SYMBOL(paravirt_ticketlocks_enabled);
diff --git a/arch/x86/xen/spinlock.c b/arch/x86/xen/spinlock.c
index e8a9ea7..25a7c43 100644
--- a/arch/x86/xen/spinlock.c
+++ b/arch/x86/xen/spinlock.c
@@ -141,25 +141,6 @@ void __init xen_init_spinlocks(void)
pv_lock_ops.vcpu_is_preempted = PV_CALLEE_SAVE(xen_vcpu_stolen);
 }
 
-/*
- * While the jump_label init code needs to happend _after_ the jump labels are
- * enabled and before SMP is started. Hence we use pre-SMP initcall level
- * init. We cannot do it in xen_init_spinlocks as that is done before
- * jump labels are activated.
- */
-static __init int xen_init_spinlocks_jump(void)
-{
-   if (!xen_pvspin)
-   return 0;
-
-   if (!xen_domain())
-   return 0;
-
-   static_key_slow_inc(&paravirt_ticketlocks_enabled);
-   return 0;
-}
-early_initcall(xen_init_spinlocks_jump);
-
 static __init int xen_parse_nopvspin(char *arg)
 {
xen_pvspin = false;
-- 
1.8.3.1


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] xen-netfront: Fix Rx stall during network stress and OOM

2017-01-12 Thread David Miller
From: Vineeth Remanan Pillai 
Date: Wed, 11 Jan 2017 23:17:17 +

> @@ -1054,7 +1059,11 @@ static int xennet_poll(struct napi_struct *napi, int budget)
>   napi_complete(napi);
>  
>   RING_FINAL_CHECK_FOR_RESPONSES(&queue->rx, more_to_do);
> - if (more_to_do)
> +
> + /* If there is more work to do or could not allocate
> +  * rx buffers, re-enable polling.
> +  */
> + if (more_to_do || err != 0)
>   napi_schedule(napi);

Just polling endlessly in a loop retrying the SKB allocation over and over
again until it succeeds is not very nice behavior.

You already have that refill timer, so please use that to retry instead
of wasting cpu cycles looping in NAPI poll.

Thanks.
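
For illustration only, the control flow requested above can be modelled in a few lines of plain C: when buffer allocation fails, the poll routine arms a retry timer and returns, and it reschedules itself only when there is real work left. This is a sketch under stated assumptions, not the xen-netfront code; `fake_queue`, `alloc_rx_buffers()` and `poll_once()` are hypothetical names and no real NAPI or timer API is used.

```c
#include <stdbool.h>

/* Hypothetical model of a receive queue; not the real netfront structures. */
struct fake_queue {
    int free_buffers;      /* buffers the allocator can still hand out */
    bool timer_armed;      /* set when a delayed refill has been requested */
    bool poll_rescheduled; /* set when poll asks to run again immediately */
};

/* Try to allocate `want` buffers; return 0 on success, -1 on failure. */
static int alloc_rx_buffers(struct fake_queue *q, int want)
{
    if (q->free_buffers < want)
        return -1;
    q->free_buffers -= want;
    return 0;
}

/* One poll pass: reschedule only for real work, defer refill to the timer. */
static void poll_once(struct fake_queue *q, bool more_to_do, int want)
{
    int err = alloc_rx_buffers(q, want);

    q->poll_rescheduled = more_to_do;
    if (err != 0)
        q->timer_armed = true; /* retry later instead of spinning in poll */
}
```

With this shape, an allocation failure costs one timer expiry rather than a tight loop of poll invocations.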



Re: [Xen-devel] [PATCH] x86, locking/spinlocks: Remove paravirt_ticketlocks_enabled

2017-01-12 Thread Boris Ostrovsky

> diff --git a/arch/x86/xen/spinlock.c b/arch/x86/xen/spinlock.c
> index e8a9ea7..a822606 100644
> --- a/arch/x86/xen/spinlock.c
> +++ b/arch/x86/xen/spinlock.c
> @@ -155,7 +155,6 @@ static __init int xen_init_spinlocks_jump(void)
>   if (!xen_domain())
>   return 0;
>  
> - static_key_slow_inc(&paravirt_ticketlocks_enabled);
>   return 0;
>  }
>  early_initcall(xen_init_spinlocks_jump);


It looks like, with this change, there is not much left of
xen_init_spinlocks_jump(), so it can be removed.

-boris




[Xen-devel] [PATCH] x86/cpuid: Fix feature flags reported to dom0

2017-01-12 Thread Andrew Cooper
c/s a11e8c9 "x86/pv: Use per-domain policy information in pv_cpuid()" switched
PV domains from using a (hardware for dom0, toolstack-chosen for domU) value
masked against pv_featureset[], to actually using the value calculated by
recalculate_cpuid_policy().

For domU, this is no practical change as the content is still chosen by the
toolstack.  For dom0 however, we no longer have two sources of information
potentially clearing bits.  Modern Linux seems to care about having CMP_LEGACY
set in its view of CPUID on an Intel box.

The deliberate setting of HTT, X2APIC and CMP_LEGACY in {pv,hvm}_featureset[]
is necessary for domUs, as the toolstack may have (tried to) set up topology
information in a different representation than the hardware uses.  The bits
therefore needed to be set in the masks used in the older logic, to avoid
clobbering the toolstack's information.

Move the HTT/X2APIC/CMP_LEGACY logic from calculate_{pv,hvm}_max_policy()
(where the meaning of {pv,hvm}_featureset[] has changed subtly) to
recalculate_cpuid_policy() where the masking logic now lives.

This will cause {pv,hvm}_max_policy to actually contain real hardware values
(so dom0 sees real hardware values), but still allows the toolstack to set
bits not present in real hardware for domUs.

Reported-by: Boris Ostrovsky 
Signed-off-by: Andrew Cooper 
Tested-by: Boris Ostrovsky 
---
CC: Jan Beulich 
---
 xen/arch/x86/cpuid.c | 24 
 1 file changed, 8 insertions(+), 16 deletions(-)

diff --git a/xen/arch/x86/cpuid.c b/xen/arch/x86/cpuid.c
index b685874..1e5013d 100644
--- a/xen/arch/x86/cpuid.c
+++ b/xen/arch/x86/cpuid.c
@@ -164,14 +164,6 @@ static void __init calculate_pv_max_policy(void)
 /* Unconditionally claim to be able to set the hypervisor bit. */
 __set_bit(X86_FEATURE_HYPERVISOR, pv_featureset);
 
-/*
- * Allow the toolstack to set HTT, X2APIC and CMP_LEGACY.  These bits
- * affect how to interpret topology information in other cpuid leaves.
- */
-__set_bit(X86_FEATURE_HTT, pv_featureset);
-__set_bit(X86_FEATURE_X2APIC, pv_featureset);
-__set_bit(X86_FEATURE_CMP_LEGACY, pv_featureset);
-
 sanitise_featureset(pv_featureset);
 cpuid_featureset_to_policy(pv_featureset, p);
 }
@@ -199,14 +191,6 @@ static void __init calculate_hvm_max_policy(void)
 __set_bit(X86_FEATURE_HYPERVISOR, hvm_featureset);
 
 /*
- * Allow the toolstack to set HTT, X2APIC and CMP_LEGACY.  These bits
- * affect how to interpret topology information in other cpuid leaves.
- */
-__set_bit(X86_FEATURE_HTT, hvm_featureset);
-__set_bit(X86_FEATURE_X2APIC, hvm_featureset);
-__set_bit(X86_FEATURE_CMP_LEGACY, hvm_featureset);
-
-/*
  * Xen can provide an APIC emulation to HVM guests even if the host's APIC
  * isn't enabled.
  */
@@ -301,6 +285,14 @@ void recalculate_cpuid_policy(struct domain *d)
 }
 
 /*
+ * Allow the toolstack to set HTT, X2APIC and CMP_LEGACY.  These bits
+ * affect how to interpret topology information in other cpuid leaves.
+ */
+__set_bit(X86_FEATURE_HTT, max_fs);
+__set_bit(X86_FEATURE_X2APIC, max_fs);
+__set_bit(X86_FEATURE_CMP_LEGACY, max_fs);
+
+/*
  * 32bit PV domains can't use any Long Mode features, and cannot use
  * SYSCALL on non-AMD hardware.
  */
-- 
2.1.4




Re: [Xen-devel] [PATCH v11 07/13] x86: add multiboot2 protocol support for EFI platforms

2017-01-12 Thread Daniel Kiper
On Thu, Jan 12, 2017 at 09:44:59AM -0600, Doug Goldstein wrote:
> On 1/12/17 6:18 AM, Daniel Kiper wrote:
> >>>> So as an aside, IMHO this is where the series should end and the next
> >>>> set of patches should be a follow on.
> >>>
> >>> Hmmm... Why? If you do not apply rest of patches then MB2 does not
> >>> work on all EFI platforms.
> >>>
> >>> Daniel
> >>
> >> So I should have expanded more in my other email. I've got this series
> >> pulled in on top of 4.8 along with different fixes as discussed on this
> >> thread:
> >>
> >> https://github.com/cardoe/xen/tree/48-and-daniel
> >>
> >> This boots up on my NUC but reports the other CPUs as stuck and the
> >> error is -5. This starts to come up on the Lenovo and it gets to near
> >> where it starts the dom0 kernel and then blanks the screen and hard
> >> hangs. This causes cr0 crashes on the other boards I've got access to.
> >>
> >> I've also got the series only to this point with the fixes.
> >>
> >> https://github.com/cardoe/xen/tree/48-and-daniel-sans-relocate
> >>
> >> The latter version boots up on my NUC with all CPUs. It still hangs on
> >> the Lenovo. It works on the other boards. It also appears to work under QEMU.
> >
> > AIUI, you are trying to add full (legacy BIOS and EFI) MB2 support to iPXE. Great!
> > Though I think that you should do this in steps. First of all you should have MB2
> > fully running on legacy BIOS platforms. It is much simpler. If it works move to EFI
> > platforms. OVMF is good choice for start but of course finally tests should be done
> > on real hardware. You can do tests on legacy BIOS with just patch #01. If everything
> > works then apply whole patch series to Xen and add MB2 reloc functionality. If it
> > works move to EFI platform tests. It is important that you do EFI platform tests with
> > whole patch series. This way you avoid issues related to overwriting BS/RS code/data.
> >
> > Daniel
>
> Daniel,
>
> I appreciate your input. I do like the approach of splitting things up
> into small incremental pieces, that's the way all this work should be
> happening. You should also be aware that iPXE takes the approach of
> least amount of functionality/code to make things work. So from their

Nice and appreciated but look below...

> view there's no reason for adding MB2 support for BIOS since it provides
> no advantage over MB1 when booting from the BIOS. Now MB2 solves a

From your point of view maybe it does not. However, from the user's point of view it may.
If you have support for MB2 on legacy BIOS and EFI platforms then you can boot Xen
on both platforms without changing anything in boot config files. Otherwise you have
to prepare separate configuration for different platforms.

> problem with booting over EFI vs MB1 so they'll be willing to take a
> change there. I'll also disagree that BIOS is easier than EFI since with
> EFI its just load the ELF into memory and set a few pointers in tags.
> With BIOS it requires me to build up the memory map into a MB2 structure.

Xen uses only these tags on legacy BIOS platforms: MULTIBOOT2_TAG_TYPE_BASIC_MEMINFO
(well, nice to have but it can be also not provided), MULTIBOOT2_TAG_TYPE_MMAP (same
as MULTIBOOT2_TAG_TYPE_BASIC_MEMINFO), MULTIBOOT2_TAG_TYPE_BOOT_LOADER_NAME
(same as MULTIBOOT2_TAG_TYPE_BASIC_MEMINFO), MULTIBOOT2_TAG_TYPE_CMDLINE,
MULTIBOOT2_TAG_TYPE_MODULE. I do not mention MULTIBOOT2_TAG_TYPE_END which
is obvious. So, if you are a real hardcore minimalist then you have to provide
MULTIBOOT2_TAG_TYPE_CMDLINE and MULTIBOOT2_TAG_TYPE_MODULE. All of them
are provided also on EFI. So, I do not see any reason to not provide MB2
for legacy BIOS. And I do not think that it is very difficult to provide
all optional tags mentioned above.

> As far as it goes I've got iPXE booting MB2 EFI payloads just fine. The
> issues I've explained here happen when I use Grub or iPXE to boot Xen so
> its not implementation specific to my iPXE code.

It looks like we have found the reason and solution for this problem.
I will fix it.

Daniel



Re: [Xen-devel] [PATCH v2 16/25] x86/pv: Use per-domain policy information in pv_cpuid()

2017-01-12 Thread Andrew Cooper
On 12/01/17 18:00, Boris Ostrovsky wrote:
>> Ahh! found it.  This is a side effect of starting to generate the dom0
>> policy in Xen.
>>
>> Can you try this patch?
>
> Intel/AMD HVM/PV 64/32bit all look good. So
>
> Tested-by: Boris Ostrovsky 

Does this mean that newer versions of Linux are more picky about what they
tolerate in cpuid?

This bug highlights a hole in my testing strategy, which I will attempt
to plug.

~Andrew



Re: [Xen-devel] [RFC PATCH v2 09/26] ARM: GICv3: introduce separate pending_irq structs for LPIs

2017-01-12 Thread Andre Przywara
Hi,

On 12/01/17 19:15, Stefano Stabellini wrote:
> On Thu, 12 Jan 2017, Andre Przywara wrote:
>> Hi Stefano,
>>
>> On 05/01/17 21:36, Stefano Stabellini wrote:
>>> On Thu, 22 Dec 2016, Andre Przywara wrote:
>>>> For the same reason that allocating a struct irq_desc for each
>>>> possible LPI is not an option, having a struct pending_irq for each LPI
>>>> is also not feasible. However we actually only need those when an
>>>> interrupt is on a vCPU (or is about to be injected).
>>>> Maintain a list of those structs that we can use for the lifecycle of
>>>> a guest LPI. We allocate new entries if necessary, however reuse
>>>> pre-owned entries whenever possible.
>>>> I added some locking around this list here, however my gut feeling is
>>>> that we don't need one because this is a per-VCPU structure anyway.
>>>> If someone could confirm this, I'd be grateful.
>>>
>>> I don't think the list should be per-VCPU, because the underlying LPIs
>>> are global. 
>>
>> But _pending_ IRQs (regardless of their type) are definitely a per-VCPU
>> thing. I consider struct pending_irq something like an LR precursor or a
>> software representation of it.
>> Also the pending bitmap is a per-redistributor table.
>>
>> The problem is that struct pending_irq is pretty big, 56 Bytes on arm64
>> if I did the math correctly. So the structs for the 32 private
>> interrupts per VCPU alone account for 1792 Byte. Actually I tried to add
>> another list head to it to be able to reuse the structure, but that
>> broke the build because struct vcpu got bigger than 4K.
>>
>> So the main reason I went for a dynamic pending_irq approach was that
>> the potentially big number of LPIs could lead to a lot of memory to be
>> allocated by Xen. And the ITS architecture does not provide any memory
>> table (provided by the guest) to be used for storing this information.
>> Also ...
> 
> Dynamic pending_irqs are good, but why one list per vcpu, rather than
> one list per domain, given that in our current design they can only be
> targeting one vcpu at any given time?

I believe the spec demands that: one LPI can only be pending at one
redistributor at any given point in time, but I will ask Marc tomorrow.
I believe it isn't spelled out in the spec, but can be deduced.
But as the pending table is per-redistributor, I was assuming that
modelling this per VCPU is the right thing.
I will think about it, your rationale of the LPIs being global makes
some sense ...

> 
>>> Similarly, the struct pending_irq array is per-domain, only
>>> the first 32 (PPIs) are per vcpu. Besides, it shouldn't be a list :-)
>>
>> In reality the number of interrupts which are on a VCPU at any given
>> point in time is expected to be very low, in some previous experiments I
>> found never more than four. This is even more true for LPIs, which, due
>> to the lack of an active state, fall out of the system as soon as the
>> VCPU reads the ICC_IAR register.
>> So the list will be very short, usually, which made it very appealing to
>> just go with a list, especially for an RFC.
>>
>> Also having potentially thousands of those structures lying around
>> mostly unused doesn't sound very clever to me. Actually I was thinking
>> about using the same approach for the other interrupts (SPI/PPI/SGI) as
>> well, but I guess that doesn't give us much, apart from breaking
>> everything ;-)
>>
>> But that being said:
>> If we can prove that the number of LPIs actually is limited (because it
>> is bounded by the number of devices, which Dom0 controls), I am willing
>> to investigate if we can switch over to using one struct pending_irq per
>> LPI.
>> Or do you want me to just use a more advanced data structure for that?
> 
> I think we misunderstood each other. I am definitely not suggesting to
> have one struct pending_irq per LPI. Your idea of dynamically allocating
> them is good.

Sorry for the misunderstanding - I am relieved that I don't have to
change that ;-)

> The things I am concerned about are:
> 
> 1) the choice of a list as a data structure, instead of a hashtable or
>a tree (see alpine.DEB.2.10.1610271725280.9978@sstabellini-ThinkPad-X260)
> 2) the choice of having one data structure (list or whatever) per vcpu,
>rather than one per domain
> 
> In both cases, it's not a matter of opinion but a matter of numbers and
> performance. I would like to see some numbers to prove our choices right
> or wrong.

Got it. I will see what I can do. At the moment my
number-and-performance gathering capabilities are severely limited due
to me running everything on the model.

Cheers,
Andre.



Re: [Xen-devel] [RFC PATCH v2 09/26] ARM: GICv3: introduce separate pending_irq structs for LPIs

2017-01-12 Thread Stefano Stabellini
On Thu, 12 Jan 2017, Andre Przywara wrote:
> Hi Stefano,
> 
> On 05/01/17 21:36, Stefano Stabellini wrote:
> > On Thu, 22 Dec 2016, Andre Przywara wrote:
> >> For the same reason that allocating a struct irq_desc for each
> >> possible LPI is not an option, having a struct pending_irq for each LPI
> >> is also not feasible. However we actually only need those when an
> >> interrupt is on a vCPU (or is about to be injected).
> >> Maintain a list of those structs that we can use for the lifecycle of
> >> a guest LPI. We allocate new entries if necessary, however reuse
> >> pre-owned entries whenever possible.
> >> I added some locking around this list here, however my gut feeling is
> >> that we don't need one because this is a per-VCPU structure anyway.
> >> If someone could confirm this, I'd be grateful.
> > 
> > I don't think the list should be per-VCPU, because the underlying LPIs
> > are global. 
> 
> But _pending_ IRQs (regardless of their type) are definitely a per-VCPU
> thing. I consider struct pending_irq something like an LR precursor or a
> software representation of it.
> Also the pending bitmap is a per-redistributor table.
> 
> The problem is that struct pending_irq is pretty big, 56 Bytes on arm64
> if I did the math correctly. So the structs for the 32 private
> interrupts per VCPU alone account for 1792 Byte. Actually I tried to add
> another list head to it to be able to reuse the structure, but that
> broke the build because struct vcpu got bigger than 4K.
> 
> So the main reason I went for a dynamic pending_irq approach was that
> the potentially big number of LPIs could lead to a lot of memory to be
> allocated by Xen. And the ITS architecture does not provide any memory
> table (provided by the guest) to be used for storing this information.
> Also ...

Dynamic pending_irqs are good, but why one list per vcpu, rather than
one list per domain, given that in our current design they can only be
targeting one vcpu at any given time?


> > Similarly, the struct pending_irq array is per-domain, only
> > the first 32 (PPIs) are per vcpu. Besides, it shouldn't be a list :-)
> 
> In reality the number of interrupts which are on a VCPU at any given
> point in time is expected to be very low, in some previous experiments I
> found never more than four. This is even more true for LPIs, which, due
> to the lack of an active state, fall out of the system as soon as the
> VCPU reads the ICC_IAR register.
> So the list will be very short, usually, which made it very appealing to
> just go with a list, especially for an RFC.
> 
> Also having potentially thousands of those structures lying around
> mostly unused doesn't sound very clever to me. Actually I was thinking
> about using the same approach for the other interrupts (SPI/PPI/SGI) as
> well, but I guess that doesn't give us much, apart from breaking
> everything ;-)
> 
> But that being said:
> If we can prove that the number of LPIs actually is limited (because it
> is bounded by the number of devices, which Dom0 controls), I am willing
> to investigate if we can switch over to using one struct pending_irq per
> LPI.
> Or do you want me to just use a more advanced data structure for that?

I think we misunderstood each other. I am definitely not suggesting to
have one struct pending_irq per LPI. Your idea of dynamically allocating
them is good.

The things I am concerned about are:

1) the choice of a list as a data structure, instead of a hashtable or
   a tree (see alpine.DEB.2.10.1610271725280.9978@sstabellini-ThinkPad-X260)
2) the choice of having one data structure (list or whatever) per vcpu,
   rather than one per domain

In both cases, it's not a matter of opinion but a matter of numbers and
performance. I would like to see some numbers to prove our choices right
or wrong.
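
To make the alternative in point 1 concrete, here is a minimal sketch of a chained hash table keyed by LPI number, with the same "look up, optionally allocate" shape as the lookup discussed in this thread. Everything in it is an assumption made for illustration: `lpi_entry` is a trimmed-down stand-in for struct pending_irq, the bucket count is arbitrary, and it uses malloc() rather than any hypervisor allocator. It says nothing about which structure performs better; that still needs the numbers asked for above.

```c
#include <stddef.h>
#include <stdlib.h>

#define LPI_HASH_BUCKETS 16U /* hypothetical size, must be a power of two */

/* Hypothetical, trimmed-down stand-in for struct pending_irq. */
struct lpi_entry {
    unsigned int irq;
    struct lpi_entry *next;
};

struct lpi_table {
    struct lpi_entry *bucket[LPI_HASH_BUCKETS];
};

/* Map an LPI number to a bucket index. */
static unsigned int lpi_hash(unsigned int lpi)
{
    return lpi & (LPI_HASH_BUCKETS - 1);
}

/* Find the entry for an LPI; optionally allocate one if it is missing. */
static struct lpi_entry *lpi_lookup(struct lpi_table *t, unsigned int lpi,
                                    int allocate)
{
    struct lpi_entry **head = &t->bucket[lpi_hash(lpi)];
    struct lpi_entry *e;

    for (e = *head; e; e = e->next)
        if (e->irq == lpi)
            return e;

    if (!allocate)
        return NULL;

    e = malloc(sizeof(*e));
    if (!e)
        return NULL;

    /* Link the new entry at the head of its bucket chain. */
    e->irq = lpi;
    e->next = *head;
    *head = e;
    return e;
}
```

Whether such a table hangs off the vcpu or the domain (point 2) only changes where `struct lpi_table` lives and which lock protects it.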



Re: [Xen-devel] [RFC PATCH 08/24] ARM: GICv3: introduce separate pending_irq structs for LPIs

2017-01-12 Thread Andre Przywara
Hi Stefano,

as just mentioned in my last reply, I missed that email last time. Sorry
for that.

Replying to the comments that still apply to the new drop ...

On 28/10/16 02:04, Stefano Stabellini wrote:
> On Wed, 28 Sep 2016, Andre Przywara wrote:
>> For the same reason that allocating a struct irq_desc for each
>> possible LPI is not an option, having a struct pending_irq for each LPI
>> is also not feasible. However we actually only need those when an
>> interrupt is on a vCPU (or is about to be injected).
>> Maintain a list of those structs that we can use for the lifecycle of
>> a guest LPI. We allocate new entries if necessary, however reuse
>> pre-owned entries whenever possible.
>> Teach the existing VGIC functions to find the right pointer when being
>> given a virtual LPI number.
>>
>> Signed-off-by: Andre Przywara 
>> ---
>>  xen/arch/arm/gic.c|  3 +++
>>  xen/arch/arm/vgic-v3.c|  2 ++
>>  xen/arch/arm/vgic.c   | 56 ---
>>  xen/include/asm-arm/domain.h  |  1 +
>>  xen/include/asm-arm/gic-its.h | 10 
>>  xen/include/asm-arm/vgic.h|  9 +++
>>  6 files changed, 78 insertions(+), 3 deletions(-)
>>
>> diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
>> index 63c744a..ebe4035 100644
>> --- a/xen/arch/arm/gic.c
>> +++ b/xen/arch/arm/gic.c
>> @@ -506,6 +506,9 @@ static void gic_update_one_lr(struct vcpu *v, int i)
>>  struct vcpu *v_target = vgic_get_target_vcpu(v, irq);
>>  irq_set_affinity(p->desc, cpumask_of(v_target->processor));
>>  }
>> +/* If this was an LPI, mark this struct as available again. */
>> +if ( p->irq >= 8192 )
>> +p->irq = 0;
> 
> I believe that 0 is a valid irq number, we need to come up with a
> different invalid_irq value, and we should #define it. We could also
> consider checking if the irq is inflight (linked to the inflight list)
> instead of using irq == 0 to understand if it is reusable.

But those pending_irqs here are used by LPIs only, where everything
below 8192 is invalid. So that seemed like an easy and straightforward
value to use. The other, statically allocated pending_irqs would never
read an IRQ number above 8192. When searching for an empty pending_irq
for a new LPI, we would never touch any of the statically allocated
structs, so this is safe, isn't it?

>>  }
>>  }
>>  }
>> diff --git a/xen/arch/arm/vgic-v3.c b/xen/arch/arm/vgic-v3.c
>> index ec038a3..e9b6490 100644
>> --- a/xen/arch/arm/vgic-v3.c
>> +++ b/xen/arch/arm/vgic-v3.c
>> @@ -1388,6 +1388,8 @@ static int vgic_v3_vcpu_init(struct vcpu *v)
>>  if ( v->vcpu_id == last_cpu || (v->vcpu_id == (d->max_vcpus - 1)) )
>>  v->arch.vgic.flags |= VGIC_V3_RDIST_LAST;
>>  
>> +INIT_LIST_HEAD(&v->arch.vgic.pending_lpi_list);
>> +
>>  return 0;
>>  }
>>  
>> diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c
>> index 0965119..b961551 100644
>> --- a/xen/arch/arm/vgic.c
>> +++ b/xen/arch/arm/vgic.c
>> @@ -31,6 +31,8 @@
>>  #include 
>>  #include 
>>  #include 
>> +#include 
>> +#include 
>>  
>>  static inline struct vgic_irq_rank *vgic_get_rank(struct vcpu *v, int rank)
>>  {
>> @@ -61,7 +63,7 @@ struct vgic_irq_rank *vgic_rank_irq(struct vcpu *v, unsigned int irq)
>>  return vgic_get_rank(v, rank);
>>  }
>>  
>> -static void vgic_init_pending_irq(struct pending_irq *p, unsigned int virq)
>> +void vgic_init_pending_irq(struct pending_irq *p, unsigned int virq)
>>  {
>>  INIT_LIST_HEAD(&p->inflight);
>>  INIT_LIST_HEAD(&p->lr_queue);
>> @@ -244,10 +246,14 @@ struct vcpu *vgic_get_target_vcpu(struct vcpu *v, unsigned int virq)
>>  
>>  static int vgic_get_virq_priority(struct vcpu *v, unsigned int virq)
>>  {
>> -struct vgic_irq_rank *rank = vgic_rank_irq(v, virq);
>> +struct vgic_irq_rank *rank;
>>  unsigned long flags;
>>  int priority;
>>  
>> +if ( virq >= 8192 )
> 
> Please introduce a convenience static inline function such as:
> 
>   bool is_lpi(unsigned int irq)

Sure.
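
For illustration, the helper requested above could look roughly like this. It is a sketch, not the code that was eventually committed; `LPI_OFFSET` is a hypothetical name for the 8192 boundary used in the quoted patch.

```c
#include <stdbool.h>

/* First LPI number; the thread treats everything below 8192 as non-LPI. */
#define LPI_OFFSET 8192U

/* Return true if the interrupt number lies in the LPI range. */
static inline bool is_lpi(unsigned int irq)
{
    return irq >= LPI_OFFSET;
}
```

The open-coded `virq >= 8192` and `p->irq >= 8192` checks in the patch would then read `is_lpi(virq)` and `is_lpi(p->irq)`.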

>> +return gicv3_lpi_get_priority(v->domain, virq);
>> +
>> +rank = vgic_rank_irq(v, virq);
>>  vgic_lock_rank(v, rank, flags);
>>  priority = rank->priority[virq & INTERRUPT_RANK_MASK];
>>  vgic_unlock_rank(v, rank, flags);
>> @@ -446,13 +452,55 @@ int vgic_to_sgi(struct vcpu *v, register_t sgir, enum gic_sgi_mode irqmode, int
>>  return 1;
>>  }
>>  
>> +/*
>> + * Holding struct pending_irq's for each possible virtual LPI in each domain
>> + * requires too much Xen memory, also a malicious guest could potentially
>> + * spam Xen with LPI map requests. We cannot cover those with (guest allocated)
>> + * ITS memory, so we use a dynamic scheme of allocating struct pending_irq's
>> + * on demand.
>> + */
>> +struct pending_irq *lpi_to_pending(struct vcpu *v, unsigned int lpi,
>> +   bool allocate)
>> +{
>> +

[Xen-devel] libxl to json return string

2017-01-12 Thread Ronald Rojas
Hi,
I have an example attached below where I believe that libxl_physinfo_to_json()
does not return all of the correct output. physinfo.c is the C program that
outputs the json string. physc.out is the output that I received. physgo.out
is the output I get using the golang bindings I'm creating for xenlight.
The missing fields are:
scrub_pages, outstanding_pages, sharing_freed_pages, sharing_used_frames,
the last 4 indices of hw_cap, cap_hvm, and cap_hvm_directio

I had a similar problem with libxl_dominfo_to_json() where some data fields
were not added to the string. 

It was said on the IRC channel that default values are not converted to json,
but is there a way to tell what is the default value for a data field?

Thanks!
Ronald Rojas
{
"threads_per_core": 1,
"cores_per_socket": 4,
"max_cpu_id": 3,
"nr_cpus": 4,
"cpu_khz": 3198157,
"total_pages": 4179453,
"free_pages": 32702,
"nr_nodes": 1,
"hw_cap": [
3085695999,
2012935103,
739248128,
33
]
}

{
"Threads_per_core": 1,
"Cores_per_socket": 4,
"Max_cpu_id": 3,
"Nr_cpus": 4,
"Cpu_khz": 3198157,
"Total_pages": 4179453,
"Free_pages": 32702,
"Scrub_pages": 0,
"Outstanding_pages": 0,
"Sharing_freed_pages": 0,
"Sharing_used_frames": 0,
"Nr_nodes": 1,
"Hw_cap": [
3085695999,
2012935103,
739248128,
33,
1,
10155,
0,
256
],
"Cap_hvm": false,
"Cap_hvm_directio": false
}
#include 
#include 
#include 

char * main(){

libxl_ctx *context;
libxl_ctx_alloc(&context, LIBXL_VERSION, 0, NULL);
libxl_physinfo info ;
int err= libxl_get_physinfo(context, &info);
if(err != 0){
 return NULL;
}



char * json= libxl_physinfo_to_json(context, &info);

libxl_ctx_free(context);
return json;

}


Re: [Xen-devel] PVH CPU hotplug design document

2017-01-12 Thread Andrew Cooper
On 12/01/17 12:13, Roger Pau Monné wrote:
> Hello,
>
> Below is a draft of a design document for PVHv2 CPU hotplug. It should cover
> both vCPU and pCPU hotplug. It's mainly centered around the hardware domain,
> since for unprivileged PVH guests the vCPU hotplug mechanism is already
> described in Boris series [0], and it's shared with HVM.
>
> The aim here is to find a way to use ACPI vCPU hotplug for the hardware domain,
> while still being able to properly detect and notify Xen of pCPU hotplug.
>
> Thanks, Roger.
>
> [0] https://lists.xenproject.org/archives/html/xen-devel/2017-01/msg00060.html
>
> ---8<---
> % CPU hotplug support for PVH
> % Roger Pau Monné 
> % Draft B
>
> # Revision History
>
> | Version | Date        | Changes                                           |
> |---------|-------------|---------------------------------------------------|
> | Draft A | 5 Jan 2017  | Initial draft.                                    |
> |---------|-------------|---------------------------------------------------|
> | Draft B | 12 Jan 2017 | Removed the XXX comments and clarify some         |
> |         |             | sections.                                         |
> |         |             |                                                   |
> |         |             | Added a sample of the SSDT ASL code that would be |
> |         |             | appended to the hardware domain.                  |
>
> # Preface
>
> This document aims to describe the interface to use in order to implement CPU
> hotplug for PVH guests, this applies to hotplug of both physical and virtual
> CPUs.
>
> # Introduction
>
> One of the design goals of PVH is to be able to remove as much Xen PV specific
> code as possible, thus limiting the number of Xen PV interfaces used by guests,
> and tending to use native interfaces (as used by bare metal) as much as
> possible. This is in line with the efforts also done by Xen on ARM and helps
> reduce the burden of maintaining huge amounts of Xen PV code inside of guests
> kernels.
>
> This however presents some challenges due to the model used by the Xen
> Hypervisor, where some devices are handled by Xen while others are left for the
> hardware domain to manage. The fact that Xen lacks an AML parser also makes it
> harder, since it cannot get the full hardware description from dynamic ACPI
> tables (DSDT, SSDT) without the hardware domain collaboration.
>
> One of such issues is CPU enumeration and hotplug, for both the hardware and
> unprivileged domains. The aim is to be able to use the same enumeration and
> hotplug interface for all PVH guests, regardless of their privilege.
>
> This document aims to describe the interface used in order to fulfill the
> following actions:
>
>  * Virtual CPU (vCPU) enumeration at boot time.
>  * Hotplug of vCPUs.
>  * Hotplug of physical CPUs (pCPUs) to Xen.
>
> # Prior work
>
> ## PV CPU hotplug
>
> CPU hotplug for Xen PV guests is implemented using xenstore and hypercalls. The
> guest has to setup a watch event on the "cpu/" xenstore node, and react to
> changes in this directory. CPUs are added creating a new node and setting its
> "availability" to online:
>
> cpu/X/availability = "online"
>
> Where X is the vCPU ID. This is an out-of-band method, that relies on Xen
> specific interfaces in order to perform CPU hotplug.

It is also worth pointing out the shortcomings of this model, i.e. that
there is no mechanism to prevent a guest onlining more processors if it
ignores the xenstore values.

>
> ## QEMU CPU hotplug using ACPI
>
> The ACPI tables provided to HVM guests contain processor objects, as created by
> libacpi. The number of processor objects in the ACPI namespace matches the
> maximum number of processors supported by HVM guests (up to 128 at the time of
> writing). Processors currently disabled are marked as so in the MADT and in
> their \_MAT and \_STA methods.
>
> A PRST operation region in I/O space is also defined, with a size of 128bits,
> that's used as a bitmap of enabled vCPUs on the system. A PRSC method is
> provided in order to check for updates to the PRST region and trigger
> notifications on the affected processor objects. The execution of the PRSC
> method is done by a GPE event. Then OSPM checks the value returned by \_STA for
> the ACPI\_STA\_DEVICE\_PRESENT flag in order to check if the vCPU has been
> enabled.

Is it worth describing the toolstack side of hotplug? It is equally
relevant IMO.
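
As a rough picture of the PRST region quoted above, the 128-bit bitmap of enabled vCPUs can be modelled as 16 bytes with one bit per vCPU. This is only a sketch: the helper names are hypothetical, and the bit layout (vCPU n in bit n % 8 of byte n / 8) is an assumption, not something the quoted document states.

```c
#include <stdbool.h>
#include <stdint.h>

#define PRST_BITS 128U /* region size quoted in the design document */

/* Assumed layout: vCPU n is bit (n % 8) of byte (n / 8). */
static bool prst_vcpu_enabled(const uint8_t *prst, unsigned int vcpu)
{
    if (vcpu >= PRST_BITS)
        return false;
    return (prst[vcpu / 8] >> (vcpu % 8)) & 1U;
}

/* Mark a vCPU as enabled or disabled in the bitmap. */
static void prst_set_vcpu(uint8_t *prst, unsigned int vcpu, bool enabled)
{
    if (vcpu >= PRST_BITS)
        return;
    if (enabled)
        prst[vcpu / 8] |= (uint8_t)(1U << (vcpu % 8));
    else
        prst[vcpu / 8] &= (uint8_t)~(1U << (vcpu % 8));
}
```

In the flow described above, the toolstack side would flip a bit and raise the GPE, and the PRSC method would compare the bitmap against its previous view to decide which processor objects to notify.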

>
> ## Native CPU hotplug
>
> OSPM waits for a notification from ACPI on the processor object and when an
> event is received the return value from _STA is checked in order to see if
> ACPI\_STA\_DEVICE\_PRESENT has been enabled. This notification is triggered
> from the method of a GPE block.
>
> # PVH CPU hotplug
>
> The aim as stated in the introduction is to use a method as similar as possible
> to bare metal CPU hotplug for PVH, this is feasible for unprivileged domains,

Re: [Xen-devel] [RFC PATCH v2 10/26] ARM: GICv3: forward pending LPIs to guests

2017-01-12 Thread Stefano Stabellini
On Thu, 12 Jan 2017, Andre Przywara wrote:
> On 05/01/17 22:10, Stefano Stabellini wrote:
> > On Thu, 22 Dec 2016, Andre Przywara wrote:
> >> Upon receiving an LPI, we need to find the right VCPU and virtual IRQ
> >> number to get this IRQ injected.
> >> Iterate our two-level LPI table to find this information quickly when
> >> the host takes an LPI. Call the existing injection function to let the
> >> GIC emulation deal with this interrupt.
> >>
> >> Signed-off-by: Andre Przywara 
> >> ---
> >>  xen/arch/arm/gic-its.c| 35 +++
> >>  xen/arch/arm/gic.c|  6 --
> >>  xen/include/asm-arm/irq.h |  8 
> >>  3 files changed, 47 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/xen/arch/arm/gic-its.c b/xen/arch/arm/gic-its.c
> >> index e7ddd90..0d4ca1b 100644
> >> --- a/xen/arch/arm/gic-its.c
> >> +++ b/xen/arch/arm/gic-its.c
> >> @@ -72,6 +72,41 @@ static union host_lpi *gic_get_host_lpi(uint32_t plpi)
> >>  return &lpi_data.host_lpis[plpi / HOST_LPIS_PER_PAGE][plpi % HOST_LPIS_PER_PAGE];
> >>  }
> >>  
> >> +/* Handle incoming LPIs, which are a bit special, because they are potentially
> >> + * numerous and also only get injected into guests. Treat them specially here,
> >> + * by just looking up their target vCPU and virtual LPI number and hand it
> >> + * over to the injection function.
> >> + */
> >> +void do_LPI(unsigned int lpi)
> >> +{
> >> +struct domain *d;
> >> +union host_lpi *hlpip, hlpi;
> >> +struct vcpu *vcpu;
> >> +
> >> +WRITE_SYSREG32(lpi, ICC_EOIR1_EL1);
> >> +
> >> +hlpip = gic_get_host_lpi(lpi);
> >> +if ( !hlpip )
> >> +return;
> >> +
> >> +hlpi.data = hlpip->data;
> > 
> > Why can't we just reference hlpip directly throughout this function? Is
> > it for atomicity reasons?
> 
> Yes. We have to make sure that the LPI entry is consistent, but we don't
> want to (and probably can't) take a lock here and everywhere else where
> we touch it. We are fine with reading an "outdated" entry (which is just
> about to change), this is a benign race which can happen on real
> hardware as well.

In that case, it's best to use atomic functions.
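
To make the "read one consistent snapshot" idea concrete, here is a minimal sketch using C11 atomics as a stand-in for the hypervisor's own atomic accessors. The union layout, the field widths and the helper names (host_lpi_write, host_lpi_read) are assumptions for illustration, not the patch's actual code.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Assumed layout: field names follow the quoted patch, widths are guesses. */
union host_lpi {
    uint64_t data;
    struct {
        uint32_t virt_lpi;
        uint16_t dom_id;
        uint16_t vcpu_id;
    };
};

/* One table slot, shared between the mapping path and the IRQ path. */
static _Atomic uint64_t host_lpi_slot;

/* Writer: build the entry locally, then publish it with a single 64-bit store. */
static void host_lpi_write(uint32_t virt_lpi, uint16_t dom_id, uint16_t vcpu_id)
{
    union host_lpi tmp = { .data = 0 };

    tmp.virt_lpi = virt_lpi;
    tmp.dom_id = dom_id;
    tmp.vcpu_id = vcpu_id;
    atomic_store_explicit(&host_lpi_slot, tmp.data, memory_order_relaxed);
}

/* Reader: one 64-bit load, so all three fields come from the same version. */
static union host_lpi host_lpi_read(void)
{
    union host_lpi snap;

    snap.data = atomic_load_explicit(&host_lpi_slot, memory_order_relaxed);
    return snap;
}
```

The reader may still see an entry that is about to be replaced, which matches the benign race described above, but it can never see a mix of old and new fields.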


> >> +/* We may have mapped more host LPIs than the guest actually asked for. */
> >> +if ( !hlpi.virt_lpi )
> >> +return;
> >> +
> >> +d = get_domain_by_id(hlpi.dom_id);
> >> +if ( !d )
> >> +return;
> >> +
> >> +if ( hlpi.vcpu_id >= d->max_vcpus )
> 
> I am just seeing that I miss a put_domain(d) here and 
> 
> >> +return;
> >> +
> >> +vcpu = d->vcpu[hlpi.vcpu_id];
> 
> ... here.

Oops, you are right.


> Which makes me wonder if it is legal to use a VCPU reference even though
> I "put back" the domain pointer?

I don't think so.


> Is there a get_vcpu()/put_vcpu() equivalent? Or is this supposed to be
> covered by the domain pointer as well?
> 
> Or shall I use get_domain() the moment I enter the domain ID into the
> host LPI array and only "put" it when an entry gets changed or the LPI
> gets somehow else invalid (VCPU destroyed, domain destroyed)?

The pattern is to get_domain at the beginning of the implementation of
the function in the hypervisor and put_domain at the end of it. In this
case, I think you should replace the returns in this function with goto
out, and have an out label with put_domain.
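
A minimal sketch of that get/put pattern, with stub helpers standing in for get_domain_by_id() and put_domain(). The stubs, the struct contents and the function name do_lpi_sketch are hypothetical and exist only so the reference counting can be followed in isolation.

```c
#include <stddef.h>

/* Toy domain with a visible reference count (assumption for illustration). */
struct domain {
    int refcnt;
    unsigned int max_vcpus;
};

static struct domain demo_domain = { .refcnt = 0, .max_vcpus = 2 };

/* Stub for get_domain_by_id(): takes a reference on success. */
static struct domain *get_domain_by_id_stub(int id)
{
    if (id != 1)
        return NULL;
    demo_domain.refcnt++;
    return &demo_domain;
}

/* Stub for put_domain(): drops the reference again. */
static void put_domain_stub(struct domain *d)
{
    d->refcnt--;
}

/* Returns 0 if the LPI would be injected, -1 if it is dropped. */
static int do_lpi_sketch(int dom_id, unsigned int vcpu_id)
{
    struct domain *d = get_domain_by_id_stub(dom_id);
    int ret = -1;

    if (!d)
        return ret;            /* no reference taken, nothing to release */

    if (vcpu_id >= d->max_vcpus)
        goto out;              /* error path still drops the reference */

    /* ... use d->vcpu[vcpu_id] here, while the reference is held ... */
    ret = 0;

out:
    put_domain_stub(d);
    return ret;
}
```

Every exit taken after the lookup succeeds goes through the out label, so the reference count is balanced on both the error and the success path.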

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] Xen ARM community call - meeting minutes and date for the next one

2017-01-12 Thread Stefano Stabellini
On Thu, 12 Jan 2017, Pooya.Keshavarzi wrote:
> > The other issue I heard about was some root file system corruptions after 
> > two or three re-boots we haven't observed in the native Linux case. The 
> > plan was to do some further analysis, first, before we blame Xen regarding 
> > this, though.
> > 
> > As mentioned, Pooya will have the details and correct me if I'm totally 
> > wrong here ;)
> > 
> 
> Firstly sorry for the late reply on this.
> 
> Regarding the problem with swiotlb-xen here are some more details:
> 
> If we limit Dom0's memory such that only low-memory (up to 32-bit addressable 
> memory) is available to Dom0, then swiotlb-xen does not have to use bounce 
> buffers and the devices (e.g. USB, ethernet) would work.
> 
> But when there is some high memory also available to Dom0, the following
> happens:
>  - If the device address happens to be in the device's DMA window (see
> xen_swiotlb_map_page()), then the device would work.
>  - Otherwise if it has to allocate and map a bounce buffer, then the device
> would not work.

From what you wrote it looks like the xen_swiotlb_map_page path: 

    if (dma_capable(dev, dev_addr, size) &&
        !range_straddles_page_boundary(phys, size) &&
        !xen_arch_need_swiotlb(dev, phys, dev_addr) &&
        !swiotlb_force) {
        /* we are not interested in the dma_addr returned by
         * xen_dma_map_page, only in the potential cache flushes executed
         * by the function. */
        xen_dma_map_page(dev, page, dev_addr, offset, size, dir, attrs);
        return dev_addr;
    }

works, but the other does not. Does it match your understanding? Have
you done any digging to find the reason why the bounce buffer code path
is broken on your platform?
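
For readers following along, the decision between the two paths can be modelled roughly as below. This is a simplified sketch, not the kernel code: it keeps only the DMA-window test and a force flag, leaves out range_straddles_page_boundary() and xen_arch_need_swiotlb(), and the names dma_capable_sketch and choose_map_path are made up for the example.

```c
#include <stdbool.h>
#include <stdint.h>

enum map_path { MAP_DIRECT, MAP_BOUNCE };

/* True if [dev_addr, dev_addr + size - 1] fits below the device's DMA mask. */
static bool dma_capable_sketch(uint64_t dma_mask, uint64_t dev_addr, uint64_t size)
{
    return size != 0 && dev_addr <= dma_mask && size - 1 <= dma_mask - dev_addr;
}

/* Direct mapping when the device can reach the buffer, bounce buffer otherwise. */
static enum map_path choose_map_path(uint64_t dma_mask, uint64_t dev_addr,
                                     uint64_t size, bool force_bounce)
{
    if (dma_capable_sketch(dma_mask, dev_addr, size) && !force_bounce)
        return MAP_DIRECT;
    return MAP_BOUNCE;
}
```

With a 32-bit mask, a buffer below 4GB takes the direct path, while anything in high memory falls through to the bounce path, which is the path reported as failing.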



Re: [Xen-devel] Xen 4.8 + Linux 4.9 + Credit2 = can't bootup

2017-01-12 Thread Ian Jackson
Dario Faggioli writes ("Re: [Xen-devel] Xen 4.8 + Linux 4.9 + Credit2 = can't bootup"):
> Anyway, we should have some multi-socket boxes on OSSTest, AFAICR.

I think we do but I haven't got a systematic way of answering that
question other than by manual eyeballing of the spec sheets.

If there were something easy to look for in the dmesg output (say) I
could probably grep historical logs.

Ian.



Re: [Xen-devel] Xen Community Call on new PV protocols, Tuesday 10th Jan 9AM PST

2017-01-12 Thread Stefano Stabellini
On Thu, 12 Jan 2017, Roger Pau Monné wrote:
> On Tue, Jan 10, 2017 at 11:29:44AM -0800, Stefano Stabellini wrote:
> > These are the minutes I took during the call:
> > 
> > Xen PV Drivers Lifecycle document ready to be committed.
> > 
> > Common Pitfalls for new PV protocols:
> > - 32 vs 64 fields
> > - not Linux centric
> > - missing version fields and feature flags
> > 
> > Full list of outstanding PV protocols:
> > pvcalls, xen-9pfs, multitouch events, sound, display, netfront/netback 
> > extension
> > 
> > PVCalls: should be OK to move forward, Konrad will review next
> > Sound: should be OK to move forward, Konrad will review next
> > Display: concerns about whether it will replace xenfb. Konrad will take
> > a look after
> > Multitouch: new, but simple, Stefano will review
> > Netfront/netback extension: new, still many comments outstanding,
> > probably not for the next release
> > 
> > Things to do
> > - use pahole 32 bit and 64 bit to check the structs, publish the output
> >   on xendevel
> 
> Sorry, I wasn't able to attend the call, but didn't we agree that all new
> protocols would be described using binary layouts instead of C structs?

Yes, this is in addition to it, not instead, and it is just a little
courtesy for reviewers.
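
As an illustration of the "32 vs 64 fields" pitfall from the minutes, the sketch below contrasts a struct whose layout depends on the ABI with one that uses fixed-width fields and explicit padding. Both struct names are hypothetical and do not come from any of the listed protocols.

```c
#include <stddef.h>
#include <stdint.h>

/* Fragile: 'unsigned long' is 4 bytes on 32-bit ABIs and 8 bytes on LP64,
 * so frontend and backend may disagree about size and offsets. */
struct fragile_req {
    uint32_t id;
    unsigned long offset;
};

/* Portable: fixed-width fields plus explicit padding, so the 64-bit field
 * sits at offset 8 even on ABIs that only align uint64_t to 4 bytes. */
struct portable_req {
    uint32_t id;
    uint32_t pad;
    uint64_t offset;
};

/* Compile-time checks that play the role of a pahole comparison. */
_Static_assert(sizeof(struct portable_req) == 16, "unexpected size");
_Static_assert(offsetof(struct portable_req, offset) == 8, "unexpected offset");
```

Running pahole on 32-bit and 64-bit builds should show identical output for the second struct and, on most ABIs, differing output for the first.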


Re: [Xen-devel] [PATCH v11 00/13] x86: multiboot2 protocol support

2017-01-12 Thread Daniel Kiper
On Thu, Jan 12, 2017 at 11:46:04AM -0600, Doug Goldstein wrote:
> On 12/5/16 4:25 PM, Daniel Kiper wrote:
> > Hi,
> >
> > I am sending eleventh version of multiboot2 protocol support for
> > legacy BIOS and EFI platforms. This patch series release contains
> > fixes for all known issues.
> >
> > The final goal is xen.efi binary file which could be loaded by EFI
> > loader, multiboot (v1) protocol (only on legacy BIOS platforms) and
> > multiboot2 protocol. This way we will have:
>
> So another issue I've found in the series is that xen/xen.gz is loadable
> with MB2 but xen.efi is not but includes the MB2 header so I detect it
> as a valid MB2 module. There's no entry point advertised in the xen.efi
> case.
>
> I think we'd probably just leave off the MB2 header for xen.efi and
> leave that as a plain EFI loader case.

This is a known issue. xen.efi contains MB1 and MB2 headers because it is
built from almost the same source/object files. I tried to fix this once
but it is not easy. I am going to do that when the current patch series is
applied. After that xen.efi will be loadable by three boot protocols.
Additionally, there is a chance that this way we will drop the dependency
on a specific binutils version.

Daniel



Re: [Xen-devel] [PATCH] partially revert "xen: Remove event channel notification through Xen PCI platform device"

2017-01-12 Thread Boris Ostrovsky
On 01/11/2017 06:36 PM, Stefano Stabellini wrote:
> The following commit:
>
> commit 72a9b186292d98494f26cfd24a1621796209
> Author: KarimAllah Ahmed 
> Date:   Fri Aug 26 23:55:36 2016 +0200
>
> xen: Remove event channel notification through Xen PCI platform device
>
> broke Linux when booting as Dom0 on Xen in a nested Xen environment (Xen
> installed inside a Xen VM). In this scenario, Linux is a PV guest, but
> at the same time it uses the platform-pci driver to receive
> notifications from L0 Xen. vector callbacks are not available because L1
> Xen doesn't allow them.

(+Konrad who has been running nested)

>
> Partially revert the offending commit, by restoring IRQ based
> notifications for PV guests only. I restored only the code which is
> strictly needed and replaced the xen_have_vector_callback checks within
> it with xen_pv_domain() checks.
>
> Signed-off-by: Stefano Stabellini 
>
> ---
> Alternatively, I could also restore the xen_have_vector_callback
> checks. In general, it's best to have feature flag checks than umbrella
> xen_pv/hvm_domain() checks.

I don't think it's worth doing given that we know HVM can't run
without this feature (we have
BUG_ON(!xen_feature(XENFEAT_hvm_callback_vector)) in xen_hvm_guest_init()).


> ---
>
> diff --git a/drivers/xen/platform-pci.c b/drivers/xen/platform-pci.c
> index b59c9455..1477f1d 100644
> --- a/drivers/xen/platform-pci.c
> +++ b/drivers/xen/platform-pci.c
> @@ -42,6 +42,7 @@
>  static unsigned long platform_mmio;
>  static unsigned long platform_mmio_alloc;
>  static unsigned long platform_mmiolen;
> +static uint64_t callback_via;
>  
>  static unsigned long alloc_xen_mmio(unsigned long len)
>  {
> @@ -54,6 +55,51 @@ static unsigned long alloc_xen_mmio(unsigned long len)
>   return addr;
>  }
>  
> +static uint64_t get_callback_via(struct pci_dev *pdev)
> +{
> + u8 pin;
> + int irq;
> +
> + irq = pdev->irq;
> + if (irq < 16)
> + return irq; /* ISA IRQ */
> +
> + pin = pdev->pin;
> +
> + /* We don't know the GSI. Specify the PCI INTx line instead. */
> + return ((uint64_t)0x01 << 56) | /* PCI INTx identifier */

You can use HVM_CALLBACK_VIA_TYPE_SHIFT here.

-boris


> + ((uint64_t)pci_domain_nr(pdev->bus) << 32) |
> + ((uint64_t)pdev->bus->number << 16) |
> + ((uint64_t)(pdev->devfn & 0xff) << 8) |
> + ((uint64_t)(pin - 1) & 3);
> +}
> +
>
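
For reference, the encoding built by the quoted function can be sketched stand-alone as follows, using a named shift as suggested in the review. The two macros below are local stand-ins whose values (56 and 0x01) are taken from the quoted code; the function takes plain integers instead of a struct pci_dev, which is an assumption made to keep the example self-contained.

```c
#include <stdint.h>

/* Local stand-ins; values copied from the shift and identifier quoted above. */
#define CALLBACK_VIA_TYPE_SHIFT     56
#define CALLBACK_VIA_TYPE_PCI_INTX  0x01ULL

/* Encode the callback "via" value: ISA IRQ number, or PCI INTx coordinates. */
static uint64_t callback_via_sketch(int irq, uint16_t domain, uint8_t bus,
                                    uint8_t devfn, uint8_t pin)
{
    if (irq < 16)
        return (uint64_t)irq;                      /* ISA IRQ */

    return (CALLBACK_VIA_TYPE_PCI_INTX << CALLBACK_VIA_TYPE_SHIFT) |
           ((uint64_t)domain << 32) |              /* PCI segment */
           ((uint64_t)bus << 16) |
           ((uint64_t)devfn << 8) |
           ((uint64_t)(pin - 1) & 3);              /* INTA..INTD as 0..3 */
}
```

For example, device 03.0 on bus 0 (devfn 0x18) with INTA and a non-ISA IRQ encodes to 0x0100000000001800.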





Re: [Xen-devel] Xen 4.8 + Linux 4.9 + Credit2 = can't bootup

2017-01-12 Thread Ian Jackson
Dario Faggioli writes ("Re: [Xen-devel] Xen 4.8 + Linux 4.9 + Credit2 = can't bootup"):
> Maybe it's me misremembering/saying stupid things, but I recall that at
> some point we were testing some of the recent and in development Linux
> branches in OSSTest.

We used to, but no-one fixed any of the bugs it discovered so I turned
that off in April.

Possibly now that osstest is performing better I should turn these on
again.  Do we have anyone who will take care of chasing down the bugs
and getting them fixed upstream ?

Ian.

From: Ian Jackson 
To: 
CC: Ian Jackson , Ian Jackson
, Konrad Rzeszutek Wilk 
,
Boris Ostrovsky , David Vrabel
, Stefano Stabellini
, Wei Liu , 
"Roger Pau
 Monne" , Juergen Gross , Anshul Makkar

Subject: [OSSTEST PATCH] crontab: Drop linux-mingo-tip-master linux-next 
linux-linus
Date: Fri, 22 Apr 2016 15:54:43 +0100

It appears that no-one is looking at the output.  These have not had a
push to the tested output branch for at least 250 days (742 days in
the case of linux-linus!) and the reports don't seem to be generating
any bugfixing activity.

There is a plan to do some Xen testing in Zero-day but even if that
doesn't lead to anything we would still be just where we are now.

So drop these to save our test bandwidth for more useful work.

Signed-off-by: Ian Jackson 
CC: Konrad Rzeszutek Wilk 
CC: Boris Ostrovsky 
CC: David Vrabel 
CC: Stefano Stabellini 
CC: Wei Liu 
CC: Roger Pau Monne 
CC: Juergen Gross 
CC: Anshul Makkar 
---
 crontab |5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/crontab b/crontab
index 2cfad74..6ddc2b8 100755
--- a/crontab
+++ b/crontab
@@ -7,10 +7,9 @@ MAILTO=ian.jack...@citrix.com,ian.campb...@eu.citrix.com
 49 1   * * *   cd testing.git && BRANCHES_ALWAYS=xen-unstable  ./cr-for-branches branches -w "./cr-daily-branch --real"
 0  *   * * *   cd testing.git && BRANCHES=xen-unstable-smoke   ./cr-for-branches branches -q "./cr-daily-branch --real"
 4-59/30 *   * * *   cd testing.git &&   ./cr-for-branches branches -q "./cr-daily-branch --real"
-18 9   * * 1,3,5   cd testing.git && BRANCHES=linux-next   ./cr-for-branches branches -w "./cr-daily-branch --real"
 18 9   * * 3,7 cd testing.git && BRANCHES=xen-unstable-coverity ./cr-for-branches branches -w "./cr-daily-branch --real"
-18 4   * * *   cd testing.git && BRANCHES='linux-linus linux-mingo-tip-master linux-3.0 libvirt rumpuserxen' ./cr-for-branches branches -w "./cr-daily-branch --real"
-6-59/15 *   * * *   cd testing.git && EXTRA_BRANCHES='linux-linus linux-3.0 rumpuserxen libvirt' ./cr-for-branches bisects -w "./cr-try-bisect --real"
+18 4   * * *   cd testing.git && BRANCHES='linux-3.0 libvirt rumpuserxen' ./cr-for-branches branches -w "./cr-daily-branch --real"
+6-59/15 *   * * *   cd testing.git && EXTRA_BRANCHES='linux-3.0 rumpuserxen libvirt' ./cr-for-branches bisects -w "./cr-try-bisect --real"
 #8-59/5 *   * * *   cd bisects/adhoc.git && with-lock-ex -q data-tree-lock bash -c "./cr-try-bisect-adhoc; exit $?"
 22 8   * * *   cd testing.git && BRANCHES=maintjobs ./cr-for-branches . -w ./cr-all-branch-statuses ''
 3  4   * * *   savelog -c28 testing.git/tmp/cr-for-branches.log >/dev/null
-- 
1.7.10.4




Re: [Xen-devel] [RFC PATCH v2 09/26] ARM: GICv3: introduce separate pending_irq structs for LPIs

2017-01-12 Thread Andre Przywara
Hi Stefano,

On 05/01/17 21:36, Stefano Stabellini wrote:
> On Thu, 22 Dec 2016, Andre Przywara wrote:
>> For the same reason that allocating a struct irq_desc for each
>> possible LPI is not an option, having a struct pending_irq for each LPI
>> is also not feasible. However we actually only need those when an
>> interrupt is on a vCPU (or is about to be injected).
>> Maintain a list of those structs that we can use for the lifecycle of
>> a guest LPI. We allocate new entries if necessary, however reuse
>> pre-owned entries whenever possible.
>> I added some locking around this list here, however my gut feeling is
>> that we don't need one because this is a per-VCPU structure anyway.
>> If someone could confirm this, I'd be grateful.
> 
> I don't think the list should be per-VCPU, because the underlying LPIs
> are global. 

But _pending_ IRQs (regardless of their type) are definitely a per-VCPU
thing. I consider struct pending_irq something like an LR precursor or a
software representation of it.
Also the pending bitmap is a per-redistributor table.

The problem is that struct pending_irq is pretty big, 56 Bytes on arm64
if I did the math correctly. So the structs for the 32 private
interrupts per VCPU alone account for 1792 Byte. Actually I tried to add
another list head to it to be able to reuse the structure, but that
broke the build because struct vcpu got bigger than 4K.

So the main reason I went for a dynamic pending_irq approach was that
the potentially big number of LPIs could lead to a lot of memory to be
allocated by Xen. And the ITS architecture does not provide any memory
table (provided by the guest) to be used for storing this information.
Also ...

> Similarly, the struct pending_irq array is per-domain, only
> the first 32 (PPIs) are per vcpu. Besides, it shouldn't be a list :-)

In reality the number of interrupts which are on a VCPU at any given
point in time is expected to be very low, in some previous experiments I
found never more than four. This is even more true for LPIs, which, due
to the lack of an active state, fall out of the system as soon as the
VCPU reads the ICC_IAR register.
So the list will be very short, usually, which made it very appealing to
just go with a list, especially for an RFC.

Also having potentially thousands of those structures lying around
mostly unused doesn't sound very clever to me. Actually I was thinking
about using the same approach for the other interrupts (SPI/PPI/SGI) as
well, but I guess that doesn't give us much, apart from breaking
everything ;-)

But that being said:
If we can prove that the number of LPIs actually is limited (because it
is bounded by the number of devices, which Dom0 controls), I am willing
to investigate if we can switch over to using one struct pending_irq per
LPI.
Or do you want me to just use a more advanced data structure for that?
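
To make the trade-off easier to discuss, here is a small stand-alone sketch of the "short list with reuse" approach described above: an entry with irq == 0 counts as free, a lookup reuses a free entry if one exists, and a new entry is allocated only when the list has none. The type and function names (lpi_pending, lpi_pool_get, lpi_pool_put) are invented for this example and the real struct pending_irq carries far more state.

```c
#include <stdint.h>
#include <stdlib.h>

/* Reduced stand-in for struct pending_irq; irq == 0 marks a free entry. */
struct lpi_pending {
    uint32_t irq;
    struct lpi_pending *next;
};

struct lpi_pool {
    struct lpi_pending *head;
    unsigned int allocated;     /* number of entries ever allocated */
};

/* Find the entry for vlpi, else reuse a free one, else allocate a new one. */
static struct lpi_pending *lpi_pool_get(struct lpi_pool *pool, uint32_t vlpi)
{
    struct lpi_pending *p, *free_slot = NULL;

    for (p = pool->head; p; p = p->next) {
        if (p->irq == vlpi)
            return p;                      /* already tracked */
        if (p->irq == 0 && !free_slot)
            free_slot = p;                 /* remember first free entry */
    }

    if (!free_slot) {
        free_slot = calloc(1, sizeof(*free_slot));
        if (!free_slot)
            return NULL;
        free_slot->next = pool->head;
        pool->head = free_slot;
        pool->allocated++;
    }

    free_slot->irq = vlpi;
    return free_slot;
}

/* Mark the entry as available again once the LPI has left the VCPU. */
static void lpi_pool_put(struct lpi_pending *p)
{
    p->irq = 0;
}
```

With only a handful of LPIs in flight per VCPU, the list stays at a few entries regardless of how many LPIs the guest has mapped, at the cost of a linear walk on every lookup.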

>> Teach the existing VGIC functions to find the right pointer when being
>> given a virtual LPI number.
>>
>> Signed-off-by: Andre Przywara 
> 
> Most of my comments on the previous version of the patch are still
> unaddressed.

Indeed, I just found that your reply wasn't tagged in my mailer, so I
missed it. Sorry about that! Will look at it now.

Cheers,
Andre.

> 
> 
>> ---
>>  xen/arch/arm/gic.c   |  3 +++
>>  xen/arch/arm/vgic-v3.c   | 11 
>>  xen/arch/arm/vgic.c  | 64 +---
>>  xen/include/asm-arm/domain.h |  2 ++
>>  xen/include/asm-arm/vgic.h   | 10 +++
>>  5 files changed, 87 insertions(+), 3 deletions(-)
>>
>> diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
>> index a5348f2..6f25501 100644
>> --- a/xen/arch/arm/gic.c
>> +++ b/xen/arch/arm/gic.c
>> @@ -509,6 +509,9 @@ static void gic_update_one_lr(struct vcpu *v, int i)
>>  struct vcpu *v_target = vgic_get_target_vcpu(v, irq);
>>  irq_set_affinity(p->desc, cpumask_of(v_target->processor));
>>  }
>> +/* If this was an LPI, mark this struct as available again. */
>> +if ( p->irq >= 8192 )
>> +p->irq = 0;
>>  }
>>  }
>>  }
>> diff --git a/xen/arch/arm/vgic-v3.c b/xen/arch/arm/vgic-v3.c
>> index d61479d..0ffde74 100644
>> --- a/xen/arch/arm/vgic-v3.c
>> +++ b/xen/arch/arm/vgic-v3.c
>> @@ -331,6 +331,14 @@ read_unknown:
>>  return 1;
>>  }
>>  
>> +int vgic_lpi_get_priority(struct domain *d, uint32_t vlpi)
>> +{
>> +if ( vlpi >= d->arch.vgic.nr_lpis )
>> +return GIC_PRI_IRQ;
>> +
>> +return d->arch.vgic.proptable[vlpi - 8192] & 0xfc;
>> +}
>> +
>>  static int __vgic_v3_rdistr_rd_mmio_write(struct vcpu *v, mmio_info_t *info,
>>uint32_t gicr_reg,
>>register_t r)
>> @@ -1426,6 +1434,9 @@ static int vgic_v3_vcpu_init(struct vcpu *v)
>>  if ( v->vcpu_id == last_cpu || (v->vcpu_id == (d->max_vcpus - 1)) )
>>  v->arch.vgic.flags |= 

Re: [Xen-devel] [PATCH v2 5/5] libxl: Add explicit cast to libxl_psr_cat_set_cbm

2017-01-12 Thread George Dunlap
On Tue, Jan 19, 2016 at 2:35 PM, Ian Jackson  wrote:
> Ian Campbell writes ("Re: [PATCH v2 5/5] libxl: Add explicit cast to libxl_psr_cat_set_cbm"):
>> On Tue, 2016-01-19 at 14:06 +, Ian Jackson wrote:
>> >  * XEN_DOMCTL_PSR_CAT_OP_SET_L3_* (public/domctl.h)
>> >  * enum xc_psr_cat_type (xenctrl.h)
>> >  * Enumeration("psr_cbm_type",...) (libxl_types.idl)
>>
>> Forgot to say in my other reply, but we could try and abolish at least the
>> xc one and have libxl internally use the domctl values.
>
> Yes.
>
> I like George's IDL suggestion.

Out of curiosity, did anything ever come of this?  If not it seems
like we should write it down somewhere.

 -George



Re: [Xen-devel] [PATCH] xen-netback: fix memory leaks on XenBus disconnect

2017-01-12 Thread Igor Druzhinin
On 12/01/17 17:51, Igor Druzhinin wrote:
> Eliminate memory leaks introduced several years ago by cleaning the queue
> resources which are allocated on XenBus connection event. Namely, queue
> structure array and pages used for IO rings.
> vif->lock is used to protect statistics gathering agents from using the
> queue structure during cleaning.
> 
> Signed-off-by: Igor Druzhinin 
> ---
>  drivers/net/xen-netback/interface.c |  6 --
>  drivers/net/xen-netback/xenbus.c| 13 +
>  2 files changed, 17 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-netback/interface.c
> index e30ffd2..5795213 100644
> --- a/drivers/net/xen-netback/interface.c
> +++ b/drivers/net/xen-netback/interface.c
> @@ -221,18 +221,18 @@ static struct net_device_stats *xenvif_get_stats(struct net_device *dev)
>  {
>   struct xenvif *vif = netdev_priv(dev);
>   struct xenvif_queue *queue = NULL;
> - unsigned int num_queues = vif->num_queues;
>   unsigned long rx_bytes = 0;
>   unsigned long rx_packets = 0;
>   unsigned long tx_bytes = 0;
>   unsigned long tx_packets = 0;
>   unsigned int index;
>  
> + spin_lock(&vif->lock);
>   if (vif->queues == NULL)
>   goto out;
>  
>   /* Aggregate tx and rx stats from each queue */
> - for (index = 0; index < num_queues; ++index) {
> + for (index = 0; index < vif->num_queues; ++index) {
>   queue = &vif->queues[index];
>   rx_bytes += queue->stats.rx_bytes;
>   rx_packets += queue->stats.rx_packets;
> @@ -241,6 +241,8 @@ static struct net_device_stats *xenvif_get_stats(struct net_device *dev)
>   }
>  
>  out:
> + spin_unlock(&vif->lock);
> +
>   vif->dev->stats.rx_bytes = rx_bytes;
>   vif->dev->stats.rx_packets = rx_packets;
>   vif->dev->stats.tx_bytes = tx_bytes;
> diff --git a/drivers/net/xen-netback/xenbus.c b/drivers/net/xen-netback/xenbus.c
> index 3124eae..85b742e 100644
> --- a/drivers/net/xen-netback/xenbus.c
> +++ b/drivers/net/xen-netback/xenbus.c
> @@ -493,11 +493,22 @@ static int backend_create_xenvif(struct backend_info *be)
>  static void backend_disconnect(struct backend_info *be)
>  {
>   if (be->vif) {
> + unsigned int queue_index;
> +
>   xen_unregister_watchers(be->vif);
>  #ifdef CONFIG_DEBUG_FS
>   xenvif_debugfs_delif(be->vif);
>  #endif /* CONFIG_DEBUG_FS */
>   xenvif_disconnect_data(be->vif);
> + for (queue_index = 0; queue_index < be->vif->num_queues; ++queue_index)
> + xenvif_deinit_queue(&be->vif->queues[queue_index]);
> +
> + spin_lock(&be->vif->lock);
> + vfree(be->vif->queues);
> + be->vif->num_queues = 0;
> + be->vif->queues = NULL;
> + spin_unlock(&be->vif->lock);
> +
>   xenvif_disconnect_ctrl(be->vif);
>   }
>  }
> @@ -1034,6 +1045,8 @@ static void connect(struct backend_info *be)
>  err:
>   if (be->vif->num_queues > 0)
>   xenvif_disconnect_data(be->vif); /* Clean up existing queues */
> + for (queue_index = 0; queue_index < be->vif->num_queues; ++queue_index)
> + xenvif_deinit_queue(&be->vif->queues[queue_index]);
>   vfree(be->vif->queues);
>   be->vif->queues = NULL;
>   be->vif->num_queues = 0;
> 

Add Juergen Gross to CC.

Igor



Re: [Xen-devel] [PATCH v2 16/25] x86/pv: Use per-domain policy information in pv_cpuid()

2017-01-12 Thread Boris Ostrovsky

> Ahh! found it.  This is a side effect of starting to generate the dom0
> policy in Xen.
>
> Can you try this patch?


Intel/AMD HVM/PV 64/32bit all look good. So

Tested-by: Boris Ostrovsky 





[Xen-devel] [PATCH] xen-netback: fix memory leaks on XenBus disconnect

2017-01-12 Thread Igor Druzhinin
Eliminate memory leaks introduced several years ago by cleaning the queue
resources which are allocated on XenBus connection event. Namely, queue
structure array and pages used for IO rings.
vif->lock is used to protect statistics gathering agents from using the
queue structure during cleaning.

Signed-off-by: Igor Druzhinin 
---
 drivers/net/xen-netback/interface.c |  6 --
 drivers/net/xen-netback/xenbus.c| 13 +
 2 files changed, 17 insertions(+), 2 deletions(-)

diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-netback/interface.c
index e30ffd2..5795213 100644
--- a/drivers/net/xen-netback/interface.c
+++ b/drivers/net/xen-netback/interface.c
@@ -221,18 +221,18 @@ static struct net_device_stats *xenvif_get_stats(struct net_device *dev)
 {
struct xenvif *vif = netdev_priv(dev);
struct xenvif_queue *queue = NULL;
-   unsigned int num_queues = vif->num_queues;
unsigned long rx_bytes = 0;
unsigned long rx_packets = 0;
unsigned long tx_bytes = 0;
unsigned long tx_packets = 0;
unsigned int index;
 
+   spin_lock(&vif->lock);
if (vif->queues == NULL)
goto out;
 
/* Aggregate tx and rx stats from each queue */
-   for (index = 0; index < num_queues; ++index) {
+   for (index = 0; index < vif->num_queues; ++index) {
queue = &vif->queues[index];
rx_bytes += queue->stats.rx_bytes;
rx_packets += queue->stats.rx_packets;
@@ -241,6 +241,8 @@ static struct net_device_stats *xenvif_get_stats(struct net_device *dev)
}
 
 out:
+   spin_unlock(&vif->lock);
+
vif->dev->stats.rx_bytes = rx_bytes;
vif->dev->stats.rx_packets = rx_packets;
vif->dev->stats.tx_bytes = tx_bytes;
diff --git a/drivers/net/xen-netback/xenbus.c b/drivers/net/xen-netback/xenbus.c
index 3124eae..85b742e 100644
--- a/drivers/net/xen-netback/xenbus.c
+++ b/drivers/net/xen-netback/xenbus.c
@@ -493,11 +493,22 @@ static int backend_create_xenvif(struct backend_info *be)
 static void backend_disconnect(struct backend_info *be)
 {
if (be->vif) {
+   unsigned int queue_index;
+
xen_unregister_watchers(be->vif);
 #ifdef CONFIG_DEBUG_FS
xenvif_debugfs_delif(be->vif);
 #endif /* CONFIG_DEBUG_FS */
xenvif_disconnect_data(be->vif);
+   for (queue_index = 0; queue_index < be->vif->num_queues; ++queue_index)
+   xenvif_deinit_queue(&be->vif->queues[queue_index]);
+
+   spin_lock(&be->vif->lock);
+   vfree(be->vif->queues);
+   be->vif->num_queues = 0;
+   be->vif->queues = NULL;
+   spin_unlock(&be->vif->lock);
+
xenvif_disconnect_ctrl(be->vif);
}
 }
@@ -1034,6 +1045,8 @@ static void connect(struct backend_info *be)
 err:
if (be->vif->num_queues > 0)
xenvif_disconnect_data(be->vif); /* Clean up existing queues */
+   for (queue_index = 0; queue_index < be->vif->num_queues; ++queue_index)
+   xenvif_deinit_queue(&be->vif->queues[queue_index]);
vfree(be->vif->queues);
be->vif->queues = NULL;
be->vif->num_queues = 0;
-- 
1.8.3.1
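
The locking idea in this patch can be modelled outside the kernel roughly as below: the statistics reader and the teardown path take the same lock, so the reader sees either the full queue array or none at all. This is only a sketch under stated assumptions. It uses a C11 atomic_flag as a toy spinlock, free() instead of vfree(), and it detaches the array under the lock and frees it afterwards, whereas the patch frees it while holding vif->lock. All names ending in _sketch or starting with vif_ are hypothetical.

```c
#include <stdatomic.h>
#include <stdlib.h>

struct queue_stats {
    unsigned long rx_bytes;
};

struct vif_sketch {
    atomic_flag lock;              /* toy spinlock standing in for vif->lock */
    struct queue_stats *queues;
    unsigned int num_queues;
};

static void vif_init(struct vif_sketch *v)
{
    atomic_flag_clear(&v->lock);
    v->queues = NULL;
    v->num_queues = 0;
}

static void vif_lock(struct vif_sketch *v)
{
    while (atomic_flag_test_and_set_explicit(&v->lock, memory_order_acquire))
        ;                          /* spin until the holder releases */
}

static void vif_unlock(struct vif_sketch *v)
{
    atomic_flag_clear_explicit(&v->lock, memory_order_release);
}

/* Reader: num_queues is read under the lock, never from a stale local copy. */
static unsigned long vif_total_rx(struct vif_sketch *v)
{
    unsigned long total = 0;
    unsigned int i;

    vif_lock(v);
    for (i = 0; i < v->num_queues; i++)
        total += v->queues[i].rx_bytes;
    vif_unlock(v);
    return total;
}

/* Teardown: pointer and count change together, so readers never see one without the other. */
static void vif_teardown(struct vif_sketch *v)
{
    struct queue_stats *old;

    vif_lock(v);
    old = v->queues;
    v->queues = NULL;
    v->num_queues = 0;
    vif_unlock(v);
    free(old);
}
```

This also shows why the patch drops the cached num_queues local: the count has to be read under the same lock as the pointer it describes.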




[Xen-devel] [qemu-mainline test] 104142: regressions - FAIL

2017-01-12 Thread osstest service owner
flight 104142 qemu-mainline real [real]
http://logs.test-lab.xenproject.org/osstest/logs/104142/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-i386-xl-xsm   11 guest-start  fail REGR. vs. 104106

Regressions which are regarded as allowable (not blocking):
 test-armhf-armhf-libvirt 13 saverestore-support-check fail like 104106
 test-armhf-armhf-libvirt-xsm 13 saverestore-support-check fail like 104106
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop fail like 104106
 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop fail like 104106
 test-armhf-armhf-libvirt-raw 12 saverestore-support-check fail like 104106
 test-armhf-armhf-xl-rtds 15 guest-start/debian.repeat fail like 104106
 test-armhf-armhf-libvirt-qcow2 12 saverestore-support-check fail like 104106
 test-amd64-amd64-xl-rtds 9 debian-install fail like 104106

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-pvh-amd 11 guest-start fail never pass
 test-amd64-amd64-xl-pvh-intel 11 guest-start fail never pass
 test-amd64-amd64-libvirt 12 migrate-support-check fail never pass
 test-amd64-amd64-libvirt-xsm 12 migrate-support-check fail never pass
 test-amd64-i386-libvirt 12 migrate-support-check fail never pass
 test-amd64-i386-libvirt-xsm 12 migrate-support-check fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass
 test-amd64-amd64-libvirt-vhd 11 migrate-support-check fail never pass
 test-amd64-amd64-qemuu-nested-amd 16 debian-hvm-install/l1/l2 fail never pass
 test-armhf-armhf-xl-xsm 12 migrate-support-check fail never pass
 test-armhf-armhf-xl-xsm 13 saverestore-support-check fail never pass
 test-armhf-armhf-xl 12 migrate-support-check fail never pass
 test-armhf-armhf-xl 13 saverestore-support-check fail never pass
 test-armhf-armhf-libvirt 12 migrate-support-check fail never pass
 test-armhf-armhf-libvirt-xsm 12 migrate-support-check fail never pass
 test-armhf-armhf-xl-cubietruck 12 migrate-support-check fail never pass
 test-armhf-armhf-xl-cubietruck 13 saverestore-support-check fail never pass
 test-armhf-armhf-xl-arndale 12 migrate-support-check fail never pass
 test-armhf-armhf-xl-arndale 13 saverestore-support-check fail never pass
 test-armhf-armhf-libvirt-raw 11 migrate-support-check fail never pass
 test-armhf-armhf-xl-rtds 12 migrate-support-check fail never pass
 test-armhf-armhf-xl-rtds 13 saverestore-support-check fail never pass
 test-armhf-armhf-xl-multivcpu 12 migrate-support-check fail never pass
 test-armhf-armhf-xl-multivcpu 13 saverestore-support-check fail never pass
 test-armhf-armhf-xl-credit2 12 migrate-support-check fail never pass
 test-armhf-armhf-xl-credit2 13 saverestore-support-check fail never pass
 test-armhf-armhf-libvirt-qcow2 11 migrate-support-check fail never pass
 test-armhf-armhf-xl-vhd 11 migrate-support-check fail never pass
 test-armhf-armhf-xl-vhd 12 saverestore-support-check fail never pass

version targeted for testing:
 qemuu                204febd17f9ebb9e94b1980b42c7f2c2307851c1
baseline version:
 qemuu                b44486dfb9447c88e4b216e730adcc780190852c

Last test of basis   104106  2017-01-11 01:46:06 Z    1 days
Testing same since   104142  2017-01-12 12:12:27 Z    0 days    1 attempts


People who touched revisions under test:
  Greg Kurz 
  Peter Maydell 

jobs:
 build-amd64-xsm  pass
 build-armhf-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-armhf  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-armhf-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvops                                            pass
 build-armhf-pvops                                            pass
 build-i386-pvops pass
 test-amd64-amd64-xl  pass
 test-armhf-armhf-xl  pass
 test-amd64-i386-xl   

Re: [Xen-devel] [PATCH v11 00/13] x86: multiboot2 protocol support

2017-01-12 Thread Doug Goldstein
On 12/5/16 4:25 PM, Daniel Kiper wrote:
> Hi,
> 
> I am sending eleventh version of multiboot2 protocol support for
> legacy BIOS and EFI platforms. This patch series release contains
> fixes for all known issues.
> 
> The final goal is xen.efi binary file which could be loaded by EFI
> loader, multiboot (v1) protocol (only on legacy BIOS platforms) and
> multiboot2 protocol. This way we will have:

So another issue I've found in the series is that xen/xen.gz is loadable
with MB2 but xen.efi is not but includes the MB2 header so I detect it
as a valid MB2 module. There's no entry point advertised in the xen.efi
case.

I think we'd probably just leave off the MB2 header for xen.efi and
leave that as a plain EFI loader case.

-- 
Doug Goldstein
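
For context on the detection problem described above, here is a minimal sketch of how a loader might look for a Multiboot2 header: scan the first 32768 bytes at 8-byte alignment for the magic value and check that the four header fields sum to zero modulo 2^32. The function names are hypothetical, the reader assumes a little-endian host, and the sketch deliberately stops at header presence without walking the tags.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MB2_HEADER_MAGIC  0xE85250D6u
#define MB2_SEARCH        32768u      /* header must start within this range */
#define MB2_ALIGN         8u          /* header is 64-bit aligned */

/* Read a 32-bit field in host byte order (assumes a little-endian host). */
static uint32_t rd32(const uint8_t *p)
{
    uint32_t v;

    memcpy(&v, p, sizeof(v));
    return v;
}

/* Return the offset of a valid Multiboot2 header, or -1 if none is found. */
static long mb2_find_header(const uint8_t *img, size_t len)
{
    size_t off, limit = len < MB2_SEARCH ? len : MB2_SEARCH;

    for (off = 0; off + 16 <= limit; off += MB2_ALIGN) {
        uint32_t magic = rd32(img + off);
        uint32_t arch = rd32(img + off + 4);
        uint32_t hlen = rd32(img + off + 8);
        uint32_t csum = rd32(img + off + 12);

        if (magic != MB2_HEADER_MAGIC)
            continue;
        if ((uint32_t)(magic + arch + hlen + csum) == 0)
            return (long)off;
    }
    return -1;
}
```

A check like this accepts any image that carries the header, which is how a binary with the header but without a usable entry point can still be detected as a valid MB2 module.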





Re: [Xen-devel] [PATCHv7 00/11] CONFIG_DEBUG_VIRTUAL for arm64

2017-01-12 Thread Will Deacon
On Tue, Jan 10, 2017 at 01:35:39PM -0800, Laura Abbott wrote:
> This is v7 of the patches to add CONFIG_DEBUG_VIRTUAL for arm64. This is
> a simple reordering of patches from v6 per request of Will Deacon for ease
> of merging support for arm which depends on this series.
> 
> Laura Abbott (11):
>   lib/Kconfig.debug: Add ARCH_HAS_DEBUG_VIRTUAL
>   mm/cma: Cleanup highmem check
>   mm: Introduce lm_alias
>   kexec: Switch to __pa_symbol
>   mm/kasan: Switch to using __pa_symbol and lm_alias
>   mm/usercopy: Switch to using lm_alias
>   drivers: firmware: psci: Use __pa_symbol for kernel symbol
>   arm64: Move some macros under #ifndef __ASSEMBLY__
>   arm64: Add cast for virt_to_pfn
>   arm64: Use __pa_symbol for kernel symbols
>   arm64: Add support for CONFIG_DEBUG_VIRTUAL

I've pushed this into linux-next and, assuming it survives the
autobuilders etc I'll co-ordinate with Russell to get the common parts
pulled into the ARM tree too (so he can take Florian's series). They're
currently split out on the arm64 for-next/debug-virtual branch.

Thanks!

Will



Re: [Xen-devel] [PATCH] x86emul: VEX.B is ignored in compatibility mode

2017-01-12 Thread Andrew Cooper
On 12/01/17 16:37, Jan Beulich wrote:
> While VEX.R and VEX.X are guaranteed to be 1 in compatibility mode,
> VEX.B can be encoded as zero, but would be ignored by the processor.

Really?  That is unfortunate.

It would have been far more helpful for this to raise #UD, like the
other prohibited VEX encodings.

> @@ -2235,7 +2241,7 @@ x86_decode(
>  break;
>  }
>  }
> -if ( mode_64bit() && !vex.r )
> +if ( !vex.r )
>  rex_prefix |= REX_R;
>  
>  ext = vex.opcx;
>

What is the purpose of this change? It doesn't appear to be related to
the rest of the patch.

~Andrew



Re: [Xen-devel] Read Performance issue when Xen Hypervisor is activated

2017-01-12 Thread Dario Faggioli
On Mon, 2017-01-02 at 07:15 +0000, Michael Schinzel wrote:
> Good Morning,
>
I'm back, although, as anticipated, I can't be terribly useful, I'm
afraid...

> You can see, in default Xen configuration, the most important thing
> at read performance test -> 2414.92 MB/sec <- the used cache is half
> of the cache when the same host is booted without hypervisor. We now
> searched and searched and searched and found the cause:
> xen_acpi_processor
> 
> Xen is managing the CPU performance by default at 1200 MHz. It is
> like you are driving a Ferrari all the time at 30 miles/h :) So we
> changed the performance parameter to
> 
>  xenpm set-scaling-governor all performance
> 
Well, yes, this will have an impact, but it's unlikely what you're
looking for. In fact, something similar would apply also to baremetal
Linux.

> After a little bit of searching around, I also found a parameter for
> the scheduler.
> 
> root@v7:~# cat /sys/block/sda/queue/scheduler
> noop deadline [cfq]
> 
> I changed the scheduler to deadline.  After this Change
> 
Well, ISTR [noop] could be even better. But I don't think this will make
much difference either, in this case.

> We have already tried to remove the CPU reservation, memory limit and
> so on but this doesn't change anything. Also upgrading the Hypervisor
> doesn't change anything about this performance issue.
>  
Well, these are all sequential benchmarks, so it indeed could have been
expected that adding more vCPUs wouldn't have changed things much.

I decided to re-run some of your tests on my test hardware (which is
way lower end than yours, especially as far as storage is concerned).

These are my results:

 hdparm -Tt /dev/sda
                              Without Xen (baremetal Linux)                 With Xen (from within dom0)
 Timing cached reads          14074 MB in 2.00 seconds = 7043.05 MB/sec     14694 MB in 1.99 seconds = 7382.22 MB/sec
 Timing buffered disk reads     364 MB in 3.01 seconds =  120.78 MB/sec       364 MB in 3.00 seconds =  121.22 MB/sec


 dd_obs_test.sh datei (transfer rate)
 block size   Without Xen (baremetal Linux)   With Xen (from within dom0)
        512   279 MB/s                        123 MB/s
       1024   454 MB/s                        217 MB/s
       2048   275 MB/s                        359 MB/s
       4096   888 MB/s                        532 MB/s
       8192   987 MB/s                        659 MB/s
      16384   1.0 GB/s                        685 MB/s
      32768   1.1 GB/s                        773 MB/s
      65536   1.1 GB/s                        846 MB/s
     131072   1.1 GB/s                        749 MB/s
     262144   327 MB/s                        844 MB/s
     524288   1.1 GB/s                        783 MB/s
    1048576   420 MB/s                        823 MB/s
    2097152   485 MB/s                        305 MB/s
    4194304   409 MB/s                        783 MB/s
    8388608   380 MB/s                        776 MB/s
   16777216   950 MB/s                        703 MB/s
   33554432   916 MB/s                        297 MB/s
   67108864   856 MB/s                        492 MB/s


time dd if=/dev/zero of=datei bs=1M count=10240
        Without Xen (baremetal Linux)   With Xen (from within dom0)
        73.7224 s, 146 MB/s             97.6948 s, 110 MB/s
 real   1m13.724s                       1m37.700s
 user   0m0.000s                        0m0.068s
 sys    0m9.364s                        0m15.180s


root@Zhaman:~# time dd if=datei of=/dev/null
        Without Xen (baremetal Linux)   With Xen (from within dom0)
        9.92787 s, 1.1 GB/s             95.1827 s, 113 MB/s
 real   0m9.953s                        1m35.194s
 user   0m2.096s                        0m10.632s
 sys    0m7.300s                        0m51.820s

Which confirms that, when running the tests inside a Xen Dom0, things
are indeed slower.
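
As a sanity check on the dd figures, the reported rates can be recomputed from the transfer size and the elapsed time. The small sketch below assumes that dd reports decimal megabytes (1 MB = 1,000,000 bytes) and that the file is the 10240 x 1 MiB written by the dd command above; the helper names are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* dd prints rates in decimal units: 1 MB = 1,000,000 bytes (assumption). */
static double dd_rate_mb_per_s(uint64_t bytes, double seconds)
{
    return (double)bytes / seconds / 1e6;
}

/* How many times longer the same transfer took under Xen. */
static double slowdown(double seconds_baremetal, double seconds_xen)
{
    return seconds_xen / seconds_baremetal;
}
```

Under these assumptions, 10240 MiB in 73.7224 s comes out at roughly 146 MB/s and in 97.6948 s at roughly 110 MB/s, matching what dd printed. The write test is therefore about 1.3 times slower in dom0, while the read test (95.1827 s against 9.92787 s) is about 9.6 times slower.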

Let me say something, though: the purpose of Xen is not to achieve the
best possible performance in Dom0. In fact, it is to achieve the best
possible aggregated performance of a number of guest domains.

The fact that virtualization has an overhead and that Dom0 pays quite a
high price are well known. Have you tried, for instance, running some
of the tests in a DomU?

Now, whether what both you and I are seeing is to be considered
"normal", I can't tell. Maybe Roger can (or he can tell us who to
bother for that).

In general, I don't think updating random system and firmware
components is useful at all... This is not a BIOS issue, IMO.

Regards,
Dario
-- 
<> (Raistlin Majere)
-
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R Ltd., Cambridge (UK)


Re: [Xen-devel] Xen 4.8 + Linux 4.9 + Credit2 = can't bootup

2017-01-12 Thread Dario Faggioli
On Thu, 2017-01-12 at 11:22 -0500, Boris Ostrovsky wrote:
> On 01/12/2017 07:50 AM, Dario Faggioli wrote:
> > I don't think we do that any longer, and that may be part of the
> > reason
> > why we missed this one?
> 
> I believe you needed to be on a multi-socket system to catch this
> bug.
> That's why, for example, my tests missed it --- the boxes that I use
> are
> all single-node.
>
Yeah. I do test on NUMA, but I mostly do Xen development, so I test
the latest Xen but (most of the time) with whatever distro kernel is
easiest to use (although usually fairly recent ones, like 4.8).

Anyway, we should have some multi-socket boxes on OSSTest, AFAICR.

Dario
-- 
<> (Raistlin Majere)
-
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R Ltd., Cambridge (UK)



Re: [Xen-devel] [PATCH] x86emul: suppress memory writes after faulting FPU insns

2017-01-12 Thread Andrew Cooper
On 12/01/17 16:12, Jan Beulich wrote:
> >>> On 12.01.17 at 16:04,  wrote:
>> On 12/01/17 14:02, Jan Beulich wrote:
>>> Furthermore I think we have another issue with writes: If the write
>>> faults, the FSW (or MXCSR, albeit there only for instructions we don't
>>> emulate yet) register may have been updated already, so we'd need to
>>> undo that update.
>> Do you mean restore the value before we sample it, or before the guest
>> gets to see it?
> Read it, run the stub, call ->write(), and upon failure restore the
> value read in the first step.
>
>> (I can't see what the problem is here.)
> The stub execution may modify FSW/MXCSR, if the operation causes
> an exception to be latched (for MXCSR this would need to be a
> masked exception), but if ->write() fails architecturally the update to
> FSW/MXCSR should not be committed.

Ok - I see now.  Yes - this is an ugly corner case.  Short of doing a
pre-emptive fpu save before emulation, I don't see an alternative.  This
at least makes us no worse than taking a context switch.

>
>>> --- a/xen/arch/x86/x86_emulate/x86_emulate.c
>>> +++ b/xen/arch/x86/x86_emulate/x86_emulate.c
>>> @@ -3723,6 +3735,8 @@ x86_emulate(
>>>  default:
>>>  generate_exception(EXC_UD);
>>>  }
>>> +if ( dst.type == OP_MEM && dst.bytes == 4 && !fpu_check_write() )
>>> +dst.type = OP_NONE;
>> This dst.bytes check is rather suspicious, as the size of the operand
>> has nothing to do with whether the write should be suppressed.
>>
>> I presume you actually mean (modrm_reg & 7) < 6 to exclude fnstenv and
>> fnstcw from triggering the fpu_check_write() logic?
> I had it this way first, and then thought it's better the way it is now:
> The cases we want to exclude are the non-register-data stores,
> and in both groups all register stores are respectively uniform in size.
> Plus this way the conditional is slightly shorter (i.e. doesn't require
> splitting across lines). Yet if you strongly prefer the other variant, I
> can of course switch back. Just let me know.

As with everything here, clarity of code is the most important.

I'd prefer the modrm_reg check over dst.bytes, although would settle for
a comment describing the situations when we shouldn't suppress a write
despite an exception occurring.

~Andrew
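
The sequence described above (read the status word, run the stub, call ->write(), restore on failure) can be sketched roughly as follows. This is only an illustration of the ordering, not emulator code: struct fpu_state, run_stub(), emulate_store() and the two write hooks are hypothetical stand-ins, and the toy stub simply latches the precision exception flag (FSW bit 5).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the emulator's FPU state and write hook. */
struct fpu_state {
    uint16_t fsw;
};
typedef bool (*write_fn)(void *ctxt, uint16_t data);

/* Toy stub: produces a value to store and latches PE (bit 5) in FSW. */
static uint16_t run_stub(struct fpu_state *fpu)
{
    fpu->fsw |= 0x0020;
    return 0x1234;
}

static bool emulate_store(struct fpu_state *fpu, write_fn write, void *ctxt)
{
    uint16_t saved_fsw = fpu->fsw;   /* 1. read the status word */
    uint16_t data = run_stub(fpu);   /* 2. run the stub (may modify FSW) */

    if ( !write(ctxt, data) )        /* 3. attempt the memory write */
    {
        fpu->fsw = saved_fsw;        /* 4. failure: undo the FSW update */
        return false;
    }
    return true;                     /* success: FSW update is committed */
}

/* Example hooks: one that succeeds, one that models a faulting write. */
static bool write_ok(void *ctxt, uint16_t data)
{
    *(uint16_t *)ctxt = data;
    return true;
}

static bool write_fault(void *ctxt, uint16_t data)
{
    (void)ctxt;
    (void)data;
    return false;
}
```

The point of the ordering is that the saved value is taken before the stub runs, so a failed write leaves FSW exactly as the guest last saw it.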



Re: [Xen-devel] [PATCH v3 8/8] x86/hvm: serialize trap injecting producer and consumer

2017-01-12 Thread Jan Beulich
>>> On 12.01.17 at 17:28,  wrote:
> On 12/01/17 14:58, Paul Durrant wrote:
>> Since injection works on a remote vCPU, and since there's no
>> enforcement of the subject vCPU being paused, there's a potential race
>> between the producing and consuming sides. Fix this by leveraging the
>> vector field as synchronization variable.
>>
>> Signed-off-by: Jan Beulich 
>> [re-based]
>> Signed-off-by: Paul Durrant 
>> ---
>> Cc: Andrew Cooper 
> 
> Reviewed-by: Andrew Cooper 
> 
> This looks fairly unrelated to the other dm changes.  Given that it is a
> backport candidate, should it be pulled ahead of the move to dm.c?  (I
> can make this happen on commit if we are in agreement).

This indeed was the case from the very beginning of the HVMOP
series, and I've never done the re-order because I had never
expected it to take so long for the earlier patches to go in. And
admittedly I was also too lazy (or should I say too busy, to make
it sound better) to do that re-ordering, as the fix didn't seem
important enough to bother (I'm not sure why you think this
would be a backporting candidate, as I don't think this hypercall
is used by anything that's in fully supported state).

Jan




[Xen-devel] [PATCH] x86emul: VEX.B is ignored in compatibility mode

2017-01-12 Thread Jan Beulich
While VEX.R and VEX.X are guaranteed to be 1 in compatibility mode,
VEX.B can be encoded as zero, but would be ignored by the processor.
Since we emulate instructions in 64-bit mode, we need to force the
bit to 1 in order to not act on the wrong {X,Y,Z}MM register.

We must not, however, fiddle with the high bit of VEX. in the
decode phase, as that would undermine the checking of instructions
requiring the field to be all ones independent of mode. This is
being enforced in copy_REX_VEX() instead.

Signed-off-by: Jan Beulich 

--- a/xen/arch/x86/x86_emulate/x86_emulate.c
+++ b/xen/arch/x86/x86_emulate/x86_emulate.c
@@ -331,7 +331,11 @@ union vex {
 
 #define copy_REX_VEX(ptr, rex, vex) do { \
 if ( (vex).opcx != vex_none ) \
+{ \
+if ( !mode_64bit() ) \
+vex.reg |= 8; \
 ptr[0] = 0xc4, ptr[1] = (vex).raw[0], ptr[2] = (vex).raw[1]; \
+} \
 else if ( mode_64bit() ) \
 ptr[1] = rex | REX_PREFIX; \
 } while (0)
@@ -2217,6 +2221,8 @@ x86_decode(
 op_bytes = 8;
 }
 }
+else
+vex.b = 1;
 switch ( b )
 {
 case 0x62:
@@ -2235,7 +2241,7 @@ x86_decode(
 break;
 }
 }
-if ( mode_64bit() && !vex.r )
+if ( !vex.r )
 rex_prefix |= REX_R;
 
 ext = vex.opcx;
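
For readers unfamiliar with the encoding, the following is a rough, standalone illustration of the idea in the description above. It is not the patch itself: the helper name vex_force_low_regs() and the guest_is_64bit parameter are invented for this example, and it folds the two places the patch touches (the decode phase and copy_REX_VEX()) into one function. It relies on the layout of the 3-byte (0xC4) VEX prefix, where R, X, B and the register-specifier field of the second payload byte are stored inverted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Bit positions in the two payload bytes of a 3-byte (0xC4) VEX prefix.
 * The fields are stored inverted, so a set bit selects registers 0..7.
 */
#define VEX_B_INV      0x20u  /* payload byte 0, bit 5: inverted VEX.B */
#define VEX_REG_HIGH   0x40u  /* payload byte 1, bit 6: high bit of the
                                 inverted register-specifier field */

/*
 * Hypothetical helper: adjust the payload bytes before the instruction
 * is replayed by a stub that always executes in 64-bit mode.
 */
static void vex_force_low_regs(uint8_t payload[2], bool guest_is_64bit)
{
    if ( guest_is_64bit )
        return;                    /* registers 8..15 are legitimate */

    payload[0] |= VEX_B_INV;       /* hardware ignores VEX.B here */
    payload[1] |= VEX_REG_HIGH;    /* keep the specifier within 0..7 */
}
```

Without such an adjustment, a compatibility-mode instruction encoded with VEX.B = 0 would, when re-executed in 64-bit mode, reference one of the upper eight vector registers instead of the one the guest intended.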





Re: [Xen-devel] [PATCH v2 7/7] uapi: export all headers under uapi directories

2017-01-12 Thread Jan Engelhardt
On Thursday 2017-01-12 16:52, Nicolas Dichtel wrote:

>Le 09/01/2017 à 13:56, Christoph Hellwig a écrit :
>> On Fri, Jan 06, 2017 at 10:43:59AM +0100, Nicolas Dichtel wrote:
>>> Regularly, when a new header is created in include/uapi/, the developer
>>> forgets to add it in the corresponding Kbuild file. This error is usually
>>> detected after the release is out.
>>>
>>> In fact, all headers under uapi directories should be exported, thus it's
>>> useless to have an exhaustive list.
>>>
>>> After this patch, the following files, which were not exported, are now
>>> exported (with make headers_install_all):
>> 
>> ... snip ...
>> 
>>> linux/genwqe/.install
>>> linux/genwqe/..install.cmd
>>> linux/cifs/.install
>>> linux/cifs/..install.cmd
>> 
>> I'm pretty sure these should not be exported!
>> 
>Those files are created in every directory:
>$ find usr/include/ -name '\.\.install.cmd' | wc -l
>71

That still does not mean they should be exported.

Anything but headers (and directories as a skeleton structure) is maximally 
suspicious.



Re: [Xen-devel] [PATCH v2 7/7] uapi: export all headers under uapi directories

2017-01-12 Thread Nicolas Dichtel
Le 12/01/2017 à 17:28, Jan Engelhardt a écrit :
> On Thursday 2017-01-12 16:52, Nicolas Dichtel wrote:
> 
>> Le 09/01/2017 à 13:56, Christoph Hellwig a écrit :
>>> On Fri, Jan 06, 2017 at 10:43:59AM +0100, Nicolas Dichtel wrote:
>>>> Regularly, when a new header is created in include/uapi/, the developer
>>>> forgets to add it in the corresponding Kbuild file. This error is usually
>>>> detected after the release is out.
>>>>
>>>> In fact, all headers under uapi directories should be exported, thus it's
>>>> useless to have an exhaustive list.
>>>>
>>>> After this patch, the following files, which were not exported, are now
>>>> exported (with make headers_install_all):
>>>
>>> ... snip ...
>>>
>>>> linux/genwqe/.install
>>>> linux/genwqe/..install.cmd
>>>> linux/cifs/.install
>>>> linux/cifs/..install.cmd
>>>
>>> I'm pretty sure these should not be exported!
>>>
>> Those files are created in every directory:
>> $ find usr/include/ -name '\.\.install.cmd' | wc -l
>> 71
> 
> That still does not mean they should be exported.
> 
> Anything but headers (and directories as a skeleton structure) is maximally 
> suspicious.
> 
What I was trying to say is that I export those directories the same way
the others are exported. Removing those files is not related to this series.


Regards,
Nicolas



Re: [Xen-devel] [PATCH v3 8/8] x86/hvm: serialize trap injecting producer and consumer

2017-01-12 Thread Andrew Cooper
On 12/01/17 14:58, Paul Durrant wrote:
> Since injection works on a remote vCPU, and since there's no
> enforcement of the subject vCPU being paused, there's a potential race
> between the producing and consuming sides. Fix this by leveraging the
> vector field as synchronization variable.
>
> Signed-off-by: Jan Beulich 
> [re-based]
> Signed-off-by: Paul Durrant 
> ---
> Cc: Andrew Cooper 

Reviewed-by: Andrew Cooper 

This looks fairly unrelated to the other dm changes.  Given that it is a
backport candidate, should it be pulled ahead of the move to dm.c?  (I
can make this happen on commit if we are in agreement).
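
The patch body is not quoted in this message, so the following is only a generic sketch of what "using the vector field as a synchronization variable" could look like, written with C11 atomics. All names here (struct inject_trap, VECTOR_UNSET, VECTOR_UPDATING, produce(), consume()) are assumptions made for the example and are not claimed to match the actual change.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define VECTOR_UNSET     (-1)  /* slot is free */
#define VECTOR_UPDATING  (-2)  /* a producer is filling in the slot */

struct inject_trap {
    _Atomic int vector;        /* doubles as the synchronization variable */
    uint32_t error_code;
};

/* Producer (remote vCPU side): claim the slot, fill it, then publish. */
static bool produce(struct inject_trap *t, uint8_t vector, uint32_t ec)
{
    int expected = VECTOR_UNSET;

    if ( !atomic_compare_exchange_strong(&t->vector, &expected,
                                         VECTOR_UPDATING) )
        return false;          /* an injection is already pending */

    t->error_code = ec;
    /* Release: the payload above becomes visible before the vector. */
    atomic_store_explicit(&t->vector, vector, memory_order_release);
    return true;
}

/* Consumer (subject vCPU side): act only on a fully published slot. */
static bool consume(struct inject_trap *t, uint8_t *vector, uint32_t *ec)
{
    int v = atomic_load_explicit(&t->vector, memory_order_acquire);

    if ( v < 0 )
        return false;          /* nothing pending, or still being written */

    *vector = (uint8_t)v;
    *ec = t->error_code;
    atomic_store_explicit(&t->vector, VECTOR_UNSET, memory_order_release);
    return true;
}
```

The design choice in this sketch is that the consumer never sees a half-written request: the payload is only read after a non-negative vector has been observed, and a second producer is turned away until the slot has been consumed.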



Re: [Xen-devel] [PATCH v3 2/8] dm_op: convert HVMOP_*ioreq_server*

2017-01-12 Thread Jan Beulich
>>> On 12.01.17 at 15:58,  wrote:
> The definitions of HVM_IOREQSRV_BUFIOREQ_* have to persist as they are
> already in use by callers of the libxc interface.
> 
> Suggested-by: Jan Beulich 
> Signed-off-by: Paul Durrant 
> --
> Reviewed-by: Jan Beulich 

That's an odd placement ...

Jan




Re: [Xen-devel] Xen 4.8 + Linux 4.9 + Credit2 = can't bootup

2017-01-12 Thread Boris Ostrovsky
On 01/12/2017 07:50 AM, Dario Faggioli wrote:
> On Wed, 2017-01-04 at 22:13 -0500, Boris Ostrovsky wrote:
>> On 01/04/2017 09:10 PM, Konrad Rzeszutek Wilk wrote:
>>> On Wed, Jan 04, 2017 at 08:52:03PM -0500, Konrad Rzeszutek Wilk
>>> wrote:
>>>> I was trying to bootup on a 30 CPU machine (15 core, SMT).
>>>>
>>>> It works just fine with credit1 (see further down the log)
>>>> but if I try credit2 it ends up hanging during bootup.
>>>>
>>>> I am going to naively assume it is due to how the vCPUs are
>>>> exposed (Where they match the physical CPUs under credit1),
>>>> but under credit2 they are different.
>>> It seems now that I took dom0_max_vcpus out of the picture I can
>>> reproduce this with credit1 scheduler. So it looks like a Linux
>>> issue.
>>>
>>> Boris, any ideas? This is 4.9.
>>>
>> I think 4.9 is broken. There were changes in topology initialization 
>> that broke Xen in early 4.9 RCs. 
>>
> Maybe it's me misremembering/saying stupid things, but I recall that at
> some point we were testing some of the recent and in development Linux
> branches in OSSTest.
>
> I don't think we do that any longer, and that may be part of the reason
> why we missed this one?

I believe you needed to be on a multi-socket system to catch this bug.
That's why, for example, my tests missed it --- the boxes that I use are
all single-node.

-boris

>
> Ian, Wei, thoughts?
>
> Regards,
> Dario






Re: [Xen-devel] [PATCH v2 16/25] x86/pv: Use per-domain policy information in pv_cpuid()

2017-01-12 Thread Andrew Cooper
On 12/01/17 15:50, Boris Ostrovsky wrote:
> On 01/12/2017 10:31 AM, Andrew Cooper wrote:
>> On 12/01/17 15:22, Boris Ostrovsky wrote:
>>>>   case 0x80000001:
>>>>  -c &= pv_featureset[FEATURESET_e1c];
>>>>  -d &= pv_featureset[FEATURESET_e1d];
>>>>  +c = p->extd.e1c;
>>> This appears to crash guests on Intel, at least for dom0.
>> Is this a PVH dom0?  I can't see from this snippet which function you
>> are in.
> No, this is normal PV dom0.
>
> I may have gone too far trimming the patch. It's this chunk:
>
>
> @@ -1291,15 +1281,15 @@ void pv_cpuid(struct cpu_user_regs *regs)
>  }
>  
>  case 1:
> -a &= pv_featureset[FEATURESET_Da1];
> +a = p->xstate.Da1;
>  b = c = d = 0;
>  break;
>  }
>  break;
>  
>  case 0x80000001:
> -c &= pv_featureset[FEATURESET_e1c];
> -d &= pv_featureset[FEATURESET_e1d];
> +c = p->extd.e1c;
> +d = p->extd.e1d;
>
>
>>> p->extd.e1c is 0x3 and bit 1 is reserved on Intel.
>>> I haven't traced it yet to the exact place that causes dom0 to crash but
>>> clearing this bit makes dom0 boot.
>> The logic immediately below the snippet should clean out the common bits
>> if vendor != AMD.  Do we perhaps have a bad vendor setting?
>>
> -bash-4.1# ./cpuid 0
> CPUID 0x: eax = 0x000d ebx = 0x756e6547 ecx = 0x6c65746e edx
> = 0x49656e69
> -bash-4.1#
>
> This is the machine that I run my nightly tests on and it failed this
> morning so it's not new HW.
>
> As far as adjusting the bits based on vendor --- don't you only do this
> for edx:
>
> arch/x86/cpuid.c: pv_cpuid():
>
>case 0x80000001:
> res->c = p->extd.e1c;
> res->c &= ~2U; // My workaround
> res->d = p->extd.e1d;
>
> /* If not emulating AMD, clear the duplicated features in e1d. */
> if ( currd->arch.x86_vendor != X86_VENDOR_AMD )
> res->d &= ~CPUID_COMMON_1D_FEATURES;

Ahh! found it.  This is a side effect of starting to generate the dom0
policy in Xen.

Can you try this patch?

diff --git a/xen/arch/x86/cpuid.c b/xen/arch/x86/cpuid.c
index b685874..1e5013d 100644
--- a/xen/arch/x86/cpuid.c
+++ b/xen/arch/x86/cpuid.c
@@ -164,14 +164,6 @@ static void __init calculate_pv_max_policy(void)
 /* Unconditionally claim to be able to set the hypervisor bit. */
 __set_bit(X86_FEATURE_HYPERVISOR, pv_featureset);
 
-/*
- * Allow the toolstack to set HTT, X2APIC and CMP_LEGACY.  These bits
- * affect how to interpret topology information in other cpuid leaves.
- */
-__set_bit(X86_FEATURE_HTT, pv_featureset);
-__set_bit(X86_FEATURE_X2APIC, pv_featureset);
-__set_bit(X86_FEATURE_CMP_LEGACY, pv_featureset);
-
 sanitise_featureset(pv_featureset);
 cpuid_featureset_to_policy(pv_featureset, p);
 }
@@ -199,14 +191,6 @@ static void __init calculate_hvm_max_policy(void)
 __set_bit(X86_FEATURE_HYPERVISOR, hvm_featureset);
 
 /*
- * Allow the toolstack to set HTT, X2APIC and CMP_LEGACY.  These bits
- * affect how to interpret topology information in other cpuid leaves.
- */
-__set_bit(X86_FEATURE_HTT, hvm_featureset);
-__set_bit(X86_FEATURE_X2APIC, hvm_featureset);
-__set_bit(X86_FEATURE_CMP_LEGACY, hvm_featureset);
-
-/*
  * Xen can provide an APIC emulation to HVM guests even if the host's APIC
  * isn't enabled.
  */
@@ -301,6 +285,14 @@ void recalculate_cpuid_policy(struct domain *d)
 }
 
 /*
+ * Allow the toolstack to set HTT, X2APIC and CMP_LEGACY.  These bits
+ * affect how to interpret topology information in other cpuid leaves.
+ */
+__set_bit(X86_FEATURE_HTT, max_fs);
+__set_bit(X86_FEATURE_X2APIC, max_fs);
+__set_bit(X86_FEATURE_CMP_LEGACY, max_fs);
+
+/*
  * 32bit PV domains can't use any Long Mode features, and cannot use
  * SYSCALL on non-AMD hardware.
  */


The toolstack fudge is still necessary for PV guests (where faulting
isn't in use), and still necessary for HVM guests until I fix topology
representation, but we shouldn't be exposing them by default on hardware
which lacks the appropriate bits.

~Andrew



Re: [Xen-devel] [PATCH] x86emul: suppress memory writes after faulting FPU insns

2017-01-12 Thread Jan Beulich
>>> On 12.01.17 at 16:04,  wrote:
> On 12/01/17 14:02, Jan Beulich wrote:
>> Furthermore I think we have another issue with writes: If the write
>> faults, the FSW (or MXCSR, albeit there only for instructions we don't
>> emulate yet) register may have been updated already, so we'd need to
>> undo that update.
> 
> Do you mean restore the value before we sample it, or before the guest
> gets to see it?

Read it, run the stub, call ->write(), and upon failure restore the
value read in the first step.

> (I can't see what the problem is here.)

The stub execution may modify FSW/MXCSR, if the operation causes
an exception to be latched (for MXCSR this would need to be a
masked exception), but if ->write() fails architecturally the update to
FSW/MXCSR should not be committed.

>> --- a/xen/arch/x86/x86_emulate/x86_emulate.c
>> +++ b/xen/arch/x86/x86_emulate/x86_emulate.c
>> @@ -3723,6 +3735,8 @@ x86_emulate(
>>  default:
>>  generate_exception(EXC_UD);
>>  }
>> +if ( dst.type == OP_MEM && dst.bytes == 4 && !fpu_check_write() )
>> +dst.type = OP_NONE;
> 
> This dst.bytes check is rather suspicious, as the size of the operand
> has nothing to do with whether the write should be suppressed.
> 
> I presume you actually mean (modrm_reg & 7) < 6 to exclude fnstenv and
> fnstcw from triggering the fpu_check_write() logic?

I had it this way first, and then thought it's better the way it is now:
The cases we want to exclude are the non-register-data stores,
and in both groups all register stores are respectively uniform in size.
Plus this way the conditional is slightly shorter (i.e. doesn't require
splitting across lines). Yet if you strongly prefer the other variant, I
can of course switch back. Just let me know.

Jan




Re: [Xen-devel] [xen-unstable test] 104131: regressions - FAIL

2017-01-12 Thread Chao Gao
According to the code around the assert:
movzbl %r14b, %esi    41 0f b6 f6
cmp %esi, %eax        39 f0
jle ...               7e 02
ud2                   <0f> 0b
mov %rbx, %rdi        48 89 df
callq ...             e8 51 20 00 00
mov $0x810, %eax      b8 10 08 00 00

so I think one value is 0x38 (%eax) and the other is 0x30 (%esi).
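
To make that reading concrete, here is a small sketch of the same comparison in C. Which variable lives in which register is an inference from the disassembly, not something the log states: movzbl zero-extends a byte into %esi, which fits a uint8_t operand such as intack.vector, and ud2 is reached only when %eax is greater than %esi, which would make %eax the pt_vector side. The function name vector_check() is made up for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Same shape as the failing assertion, with the operand types guessed
 * at in the thread: a uint8_t compared against an int.
 */
static bool vector_check(uint8_t intack_vector, int pt_vector)
{
    /*
     * The uint8_t operand is promoted to int before the comparison,
     * which is what the zero-extending movzbl corresponds to.
     */
    return intack_vector >= pt_vector;
}
```

Under that inference the assertion was evaluated as 0x30 >= 0x38, which is false regardless of the type difference. The integer promotion only changes the outcome if the int operand is negative, in which case the check always passes.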

On Thu, Jan 12, 2017 at 12:07:53PM +0000, Xuquan (Quan Xu) wrote:
>On January 12, 2017 5:14 PM, Andrew Cooper wrote:
>>On 12/01/2017 06:46, osstest service owner wrote:
>>> flight 104131 xen-unstable real [real]
>>> http://logs.test-lab.xenproject.org/osstest/logs/104131/
>>>
>>> Regressions :-(
>>>
>>> Tests which did not succeed and are blocking, including tests which
>>> could not be run:
>>>  test-amd64-i386-xl-qemuu-debianhvm-amd64 16 guest-stop   fail
>>REGR. vs. 104119
>>
>>Jan 12 01:25:17.397607 (XEN) Assertion 'intack.vector >= pt_vector' failed at
>>intr.c:321
>>Jan 12 01:25:37.133596 (XEN) [ Xen-4.9-unstable  x86_64  debug=y
>>Not tainted ]
>>Jan 12 01:25:37.141577 (XEN) CPU:14
>>Jan 12 01:25:37.141607 (XEN) RIP:e008:[]
>>vmx_intr_assist+0x35e/0x51d
>>Jan 12 01:25:37.149617 (XEN) RFLAGS: 00010202   CONTEXT:
>>hypervisor (d15v0)
>>Jan 12 01:25:37.149655 (XEN) rax: 0038   rbx:
>>830079e1e000   rcx: 0030
>>Jan 12 01:25:37.157582 (XEN) rdx:    rsi:
>>0030   rdi: 830079e1e000
>>Jan 12 01:25:37.165584 (XEN) rbp: 83047de2ff08   rsp: 83047de2fea8
>>r8:  82c00022f000
>>Jan 12 01:25:37.173579 (XEN) r9:  8301b63ede80   r10:
>>830176386560   r11: 01955ee79bd0
>>Jan 12 01:25:37.181582 (XEN) r12: 3002   r13:
>>3002   r14: 0030
>>Jan 12 01:25:37.189584 (XEN) r15: 83023fec2000   cr0:
>>80050033   cr4: 003526e0
>>Jan 12 01:25:37.197572 (XEN) cr3: 000232edb000   cr2:
>>02487034
>>Jan 12 01:25:37.205569 (XEN) ds:    es:    fs:    gs: 
>>ss:    cs: e008
>>Jan 12 01:25:37.205606 (XEN) Xen code around 
>>(vmx_intr_assist+0x35e/0x51d):
>>Jan 12 01:25:37.213575 (XEN)  41 0f b6 f6 39 f0 7e 02 <0f> 0b 48 89 df e8 51
>>20 00 00 b8 10 08 00 00 0f Jan 12 01:25:37.221561 (XEN) Xen stack trace
>>from rsp=83047de2fea8:
>>Jan 12 01:25:37.229600 (XEN)82d08031aa80 0038
>>83047de2 83023fec2000
>>Jan 12 01:25:37.237594 (XEN)83047de2fef8 82d080130cb6
>>830079e1e000 830079e1e000
>>Jan 12 01:25:37.245588 (XEN)83007bae2000 000e
>>830233117000 83023fec2000
>>Jan 12 01:25:37.253594 (XEN)83047de2fdc0 82d0801fdeb1
>>0004 00c2
>>Jan 12 01:25:37.261584 (XEN)0020 0007
>>8800e8d28000 81add0a0
>>Jan 12 01:25:37.269607 (XEN)0246 
>>88014248 0004
>>Jan 12 01:25:37.277580 (XEN)0036 
>>03f8 03f8
>>Jan 12 01:25:37.285584 (XEN)81add0a0 beefbeef
>>813899a4 00bfbeef
>>Jan 12 01:25:37.293567 (XEN)0002 880147c03e08
>>beef 1cec835356e5beef
>>Jan 12 01:25:37.293606 (XEN)085d8b002674beef 01dcb38b000cbeef
>>8914458d3174beef 2444c71e
>>Jan 12 01:25:37.301586 (XEN)830079e1e000 0031bfc37600
>>003526e0
>>Jan 12 01:25:37.309607 (XEN) Xen call trace:
>>Jan 12 01:25:37.309639 (XEN)[]
>>vmx_intr_assist+0x35e/0x51d
>>Jan 12 01:25:37.317591 (XEN)[]
>>vmx_asm_vmexit_handler+0x41/0x120
>>Jan 12 01:25:37.325598 (XEN)
>>Jan 12 01:25:37.325624 (XEN)
>>Jan 12 01:25:37.325647 (XEN)
>>
>>Jan 12 01:25:37.333653 (XEN) Panic on CPU 14:
>>Jan 12 01:25:37.333684 (XEN) Assertion 'intack.vector >= pt_vector' failed at
>>intr.c:321 Jan 12 01:25:37.341571 (XEN)
>>
>>Jan 12 01:25:37.341603 (XEN)
>>Jan 12 01:25:37.341626 (XEN) Reboot in five seconds...
>>Jan 12 01:25:37.349566 (XEN) Resetting with ACPI MEMORY or I/O
>>RESET_REG.
>>
>>This is caused by "x86/apicv: fix RTC periodic timer and apicv issue".  It is
>>not a deterministic issue, as it appears to have survived a week of testing
>>already, but there is clearly something still problematic with the code.
>>
>
>
>Andrew,
>If you have, could you give more information? Such as the value of 
>intack.vector / pt_vector..
>I guess, the reason may be that the intack.vector is ' uint8_t ' and the 
>pt_vector is 'int'..
>
>Or there is a corner case that intack.vector is __not__ the highest priority 
>vector..
>
>Kevin / Jan,  any thoughts?
>
>Quan
>



[Xen-devel] [ovmf test] 104144: all pass - PUSHED

2017-01-12 Thread osstest service owner
flight 104144 ovmf real [real]
http://logs.test-lab.xenproject.org/osstest/logs/104144/

Perfect :-)
All tests in this flight passed as required
version targeted for testing:
 ovmf b494cf96e70f8640acd9288951be39a0f714f2be
baseline version:
 ovmf 12233c19177d3971b657200778b681c6132e598b

Last test of basis   104141  2017-01-12 11:15:13 Z0 days
Testing same since   104144  2017-01-12 13:46:32 Z0 days1 attempts


People who touched revisions under test:
  Hao Wu 

jobs:
 build-amd64-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvopspass
 build-i386-pvops pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64 pass
 test-amd64-i386-xl-qemuu-ovmf-amd64  pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

+ branch=ovmf
+ revision=b494cf96e70f8640acd9288951be39a0f714f2be
+ . ./cri-lock-repos
++ . ./cri-common
+++ . ./cri-getconfig
+++ umask 002
+++ getrepos
++++ getconfig Repos
++++ perl -e '
use Osstest;
readglobalconfig();
print $c{"Repos"} or die $!;
'
+++ local repos=/home/osstest/repos
+++ '[' -z /home/osstest/repos ']'
+++ '[' '!' -d /home/osstest/repos ']'
+++ echo /home/osstest/repos
++ repos=/home/osstest/repos
++ repos_lock=/home/osstest/repos/lock
++ '[' x '!=' x/home/osstest/repos/lock ']'
++ OSSTEST_REPOS_LOCK_LOCKED=/home/osstest/repos/lock
++ exec with-lock-ex -w /home/osstest/repos/lock ./ap-push ovmf 
b494cf96e70f8640acd9288951be39a0f714f2be
+ branch=ovmf
+ revision=b494cf96e70f8640acd9288951be39a0f714f2be
+ . ./cri-lock-repos
++ . ./cri-common
+++ . ./cri-getconfig
+++ umask 002
+++ getrepos
++++ getconfig Repos
++++ perl -e '
use Osstest;
readglobalconfig();
print $c{"Repos"} or die $!;
'
+++ local repos=/home/osstest/repos
+++ '[' -z /home/osstest/repos ']'
+++ '[' '!' -d /home/osstest/repos ']'
+++ echo /home/osstest/repos
++ repos=/home/osstest/repos
++ repos_lock=/home/osstest/repos/lock
++ '[' x/home/osstest/repos/lock '!=' x/home/osstest/repos/lock ']'
+ . ./cri-common
++ . ./cri-getconfig
++ umask 002
+ select_xenbranch
+ case "$branch" in
+ tree=ovmf
+ xenbranch=xen-unstable
+ '[' xovmf = xlinux ']'
+ linuxbranch=
+ '[' x = x ']'
+ qemuubranch=qemu-upstream-unstable
+ select_prevxenbranch
++ ./cri-getprevxenbranch xen-unstable
+ prevxenbranch=xen-4.8-testing
+ '[' xb494cf96e70f8640acd9288951be39a0f714f2be = x ']'
+ : tested/2.6.39.x
+ . ./ap-common
++ : osst...@xenbits.xen.org
+++ getconfig OsstestUpstream
+++ perl -e '
use Osstest;
readglobalconfig();
print $c{"OsstestUpstream"} or die $!;
'
++ :
++ : git://xenbits.xen.org/xen.git
++ : osst...@xenbits.xen.org:/home/xen/git/xen.git
++ : git://xenbits.xen.org/qemu-xen-traditional.git
++ : git://git.kernel.org
++ : git://git.kernel.org/pub/scm/linux/kernel/git
++ : git
++ : git://xenbits.xen.org/xtf.git
++ : osst...@xenbits.xen.org:/home/xen/git/xtf.git
++ : git://xenbits.xen.org/xtf.git
++ : git://xenbits.xen.org/libvirt.git
++ : osst...@xenbits.xen.org:/home/xen/git/libvirt.git
++ : git://xenbits.xen.org/libvirt.git
++ : git://xenbits.xen.org/osstest/rumprun.git
++ : git
++ : git://xenbits.xen.org/osstest/rumprun.git
++ : osst...@xenbits.xen.org:/home/xen/git/osstest/rumprun.git
++ : git://git.seabios.org/seabios.git
++ : osst...@xenbits.xen.org:/home/xen/git/osstest/seabios.git
++ : git://xenbits.xen.org/osstest/seabios.git
++ : https://github.com/tianocore/edk2.git
++ : osst...@xenbits.xen.org:/home/xen/git/osstest/ovmf.git
++ : git://xenbits.xen.org/osstest/ovmf.git
++ : git://xenbits.xen.org/osstest/linux-firmware.git
++ : osst...@xenbits.xen.org:/home/osstest/ext/linux-firmware.git
++ : git://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git
++ : 

Re: [Xen-devel] [PATCH v2 7/7] uapi: export all headers under uapi directories

2017-01-12 Thread Nicolas Dichtel
On 09/01/2017 at 13:56, Christoph Hellwig wrote:
> On Fri, Jan 06, 2017 at 10:43:59AM +0100, Nicolas Dichtel wrote:
>> Regularly, when a new header is created in include/uapi/, the developer
>> forgets to add it in the corresponding Kbuild file. This error is usually
>> detected after the release is out.
>>
>> In fact, all headers under uapi directories should be exported, thus it's
>> useless to have an exhaustive list.
>>
>> After this patch, the following files, which were not exported, are now
>> exported (with make headers_install_all):
> 
> ... snip ...
> 
>> linux/genwqe/.install
>> linux/genwqe/..install.cmd
>> linux/cifs/.install
>> linux/cifs/..install.cmd
> 
> I'm pretty sure these should not be exported!
> 
Those files are created in every directory:
$ find usr/include/ -name '\.\.install.cmd' | wc -l
71
$ find usr/include/ -name '\.install' | wc -l
71

See also
http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/scripts/Makefile.headersinst#n32


Thank you,
Nicolas

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v11 07/13] x86: add multiboot2 protocol support for EFI platforms

2017-01-12 Thread Doug Goldstein
On 1/12/17 6:50 AM, Daniel Kiper wrote:
> On Wed, Jan 11, 2017 at 02:20:15PM -0600, Doug Goldstein wrote:
>> On 1/11/17 1:47 PM, Daniel Kiper wrote:
>>> On Tue, Jan 10, 2017 at 02:51:27PM -0600, Doug Goldstein wrote:
 On 1/9/17 7:37 PM, Doug Goldstein wrote:
> On 12/5/16 4:25 PM, Daniel Kiper wrote:

>> diff --git a/xen/arch/x86/efi/efi-boot.h b/xen/arch/x86/efi/efi-boot.h
>> index 62c010e..dc857d8 100644
>> --- a/xen/arch/x86/efi/efi-boot.h
>> +++ b/xen/arch/x86/efi/efi-boot.h
>> @@ -146,6 +146,8 @@ static void __init 
>> efi_arch_process_memory_map(EFI_SYSTEM_TABLE *SystemTable,
>>  {
>>  struct e820entry *e;
>>  unsigned int i;
>> +/* Check for extra mem for mbi data if Xen is loaded via multiboot2 
>> protocol. */
>> +UINTN extra_mem = efi_enabled(EFI_LOADER) ? 0 : (64 << 10);
>
> Just wondering where the constant came from? And if there should be a
> little bit of information about it. To me it's just weird to shift 64.

 It's the size of the stack used in the assembly code.
>>>
>>> No, it is trampoline region size.
>>
>> trampoline + stack in head.S We take the address where we're going to
>> copy the trampoline and set the stack to 0x10000 past it.
> 
> I suppose that you think about this:
> 
> /* Switch to low-memory stack.  */
> mov sym_fs(trampoline_phys),%edi
> lea 0x10000(%edi),%esp
> 
> However, trampoline region size is (should be) 64 KiB. No way. Please
> look below for more details.

The trampoline + stack are 64kb together. The stack grows down and the
trampoline grows up. The stack starts at 64kb past the start of the
trampoline. %edi is the start of the trampoline.
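
The layout described above can be sketched in C. This is only an illustration under stated assumptions, not the Xen code: the helper names, the 4 KiB page size and the "return 0 on failure" convention are hypothetical. It picks a page-aligned base for a 64 KiB trampoline + stack region at the top of a memory range below 1 MiB, and computes the initial stack pointer as the base plus 64 KiB.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_MASK      (~(uint64_t)0xfff)  /* assumes 4 KiB pages */
#define SKETCH_LOW_MEM_LIMIT  ((uint64_t)1 << 20) /* 1 MiB */
#define SKETCH_REGION_SIZE    ((uint64_t)64 << 10) /* trampoline + stack */

/*
 * Return a page-aligned base for a region of `size` bytes placed at the
 * top of [start, start + len), or 0 if the range is unsuitable.
 */
static uint64_t pick_trampoline_base(uint64_t start, uint64_t len,
                                     uint64_t size)
{
    uint64_t base;

    assert(size > 0);

    /* The whole range must lie below 1 MiB and be large enough. */
    if (start > SKETCH_LOW_MEM_LIMIT || len > SKETCH_LOW_MEM_LIMIT - start)
        return 0;
    if (len < size)
        return 0;

    base = (start + len - size) & SKETCH_PAGE_MASK;

    /* Rounding down to a page boundary must not leave the range. */
    return base >= start ? base : 0;
}

/* The stack starts 64 KiB past the base and grows down towards it. */
static uint64_t initial_stack_top(uint64_t base)
{
    return base + SKETCH_REGION_SIZE;
}
```

For example, a conventional-memory range starting at 0x80000 with length 0x20000 would give a base of 0x90000 and an initial stack top of 0xa0000.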

> 
>>  /* Populate E820 table and check trampoline area availability. */
>>  e = e820map - 1;
>> @@ -168,7 +170,8 @@ static void __init 
>> efi_arch_process_memory_map(EFI_SYSTEM_TABLE *SystemTable,
>>  /* fall through */
>>  case EfiConventionalMemory:
>>  if ( !trampoline_phys && desc->PhysicalStart + len <= 
>> 0x100000 &&
>> - len >= cfg.size && desc->PhysicalStart + len > 
>> cfg.addr )
>> + len >= cfg.size + extra_mem &&
>> + desc->PhysicalStart + len > cfg.addr )
>>  cfg.addr = (desc->PhysicalStart + len - cfg.size) & 
>> PAGE_MASK;
>
> So this is where the current series blows up and fails on real hardware.

 Honestly this was my misunderstanding and this shouldn't ever be used to
 get memory for the trampoline. This also has the bug in it that it needs
 to be:

 ASSERT(cfg.size > 0);
 cfg.addr = (desc->PhysicalStart + len - (cfg.size + extra_mem)) & PAGE_MASK;
>>>
>>> As I said earlier. This extra_mem stuff is (maybe) wrong and should be fixed
>>> in one way or another. Hmmm... It looks OK. I will double check it because
>>> I have not looked at this code for a long time and maybe I am missing something.
>>
>> cfg.size needs to be the size of the trampolines + stack.
> 
> It looks that during some code rearrangement I moved one instruction too
> much to trampoline_bios_setup. So, I can agree that right now cfg.size
> should be properly initialized. Though it should be cfg.size = 64 << 10.
> Then extra_mem should be dropped.

That's fine as long as it's clear that 64kb is for the trampoline + the
stack.

> 
> Nowhere in the EFI + MB2 code path is cfg.size ever initialized. It's
> only initialized in the straight EFI case. The result is that cfg.addr
> is set to the section immediately following this. Took a bit to
> track down because I checked for memory overlaps with where Xen was
> loaded and where it was relocated to but forgot to check for overlaps
> with the trampoline code. This is the address where the trampoline jumps
> are copied.
>
> Personally I'd like to see an ASSERT added or the code swizzled around
> in such a way that it's not possible to get into a bad state. But this is
> probably another patch series.
>
>>  /* fall through */
>>  case EfiLoaderCode:
>> @@ -210,12 +213,14 @@ static void *__init 
>> efi_arch_allocate_mmap_buffer(UINTN map_size)
>>
>>  static void __init efi_arch_pre_exit_boot(void)
>>  {
>> -if ( !trampoline_phys )
>> -{
>> -if ( !cfg.addr )
>> -blexit(L"No memory for trampoline");
>> +if ( trampoline_phys )
>> +return;
>> +
>> +if ( !cfg.addr )
>> +blexit(L"No memory for trampoline");
>> +
>> +if ( efi_enabled(EFI_LOADER) )
>>  relocate_trampoline(cfg.addr);

 Why is this call even here anymore? It's called in
 efi_arch_memory_setup() already. If it was unable to allocate memory
 under the 1mb region it's just going to trample over ANY conventional
 memory 

[Xen-devel] [PATCH] x86, locking/spinlocks: Remove paravirt_ticketlocks_enabled

2017-01-12 Thread Waiman Long
This is a follow-up of commit cfd8983f03c7b2 ("x86, locking/spinlocks:
Remove ticket (spin)lock implementation"). The static_key structure
paravirt_ticketlocks_enabled is now removed as it is no longer used.

A simple build and boot test was done to verify it.

Signed-off-by: Waiman Long 
---
 arch/x86/include/asm/spinlock.h  | 3 ---
 arch/x86/kernel/kvm.c| 1 -
 arch/x86/kernel/paravirt-spinlocks.c | 3 ---
 arch/x86/xen/spinlock.c  | 1 -
 4 files changed, 8 deletions(-)

diff --git a/arch/x86/include/asm/spinlock.h b/arch/x86/include/asm/spinlock.h
index 921bea7..6d39190 100644
--- a/arch/x86/include/asm/spinlock.h
+++ b/arch/x86/include/asm/spinlock.h
@@ -23,9 +23,6 @@
 /* How long a lock should spin before we consider blocking */
 #define SPIN_THRESHOLD (1 << 15)
 
-extern struct static_key paravirt_ticketlocks_enabled;
-static __always_inline bool static_key_false(struct static_key *key);
-
 #include 
 
 /*
diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index 36bc664..6750fdc 100644
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -627,7 +627,6 @@ static __init int kvm_spinlock_init_jump(void)
if (!kvm_para_has_feature(KVM_FEATURE_PV_UNHALT))
return 0;
 
-   static_key_slow_inc(&paravirt_ticketlocks_enabled);
printk(KERN_INFO "KVM setup paravirtual spinlock\n");
 
return 0;
diff --git a/arch/x86/kernel/paravirt-spinlocks.c 
b/arch/x86/kernel/paravirt-spinlocks.c
index 6d4bf81..6259327 100644
--- a/arch/x86/kernel/paravirt-spinlocks.c
+++ b/arch/x86/kernel/paravirt-spinlocks.c
@@ -42,6 +42,3 @@ struct pv_lock_ops pv_lock_ops = {
 #endif /* SMP */
 };
 EXPORT_SYMBOL(pv_lock_ops);
-
-struct static_key paravirt_ticketlocks_enabled = STATIC_KEY_INIT_FALSE;
-EXPORT_SYMBOL(paravirt_ticketlocks_enabled);
diff --git a/arch/x86/xen/spinlock.c b/arch/x86/xen/spinlock.c
index e8a9ea7..a822606 100644
--- a/arch/x86/xen/spinlock.c
+++ b/arch/x86/xen/spinlock.c
@@ -155,7 +155,6 @@ static __init int xen_init_spinlocks_jump(void)
if (!xen_domain())
return 0;
 
-   static_key_slow_inc(&paravirt_ticketlocks_enabled);
return 0;
 }
 early_initcall(xen_init_spinlocks_jump);
-- 
1.8.3.1




Re: [Xen-devel] [PATCH v2 16/25] x86/pv: Use per-domain policy information in pv_cpuid()

2017-01-12 Thread Boris Ostrovsky
On 01/12/2017 10:31 AM, Andrew Cooper wrote:
> On 12/01/17 15:22, Boris Ostrovsky wrote:
>>>  case 0x80000001:
>>> -c &= pv_featureset[FEATURESET_e1c];
>>> -d &= pv_featureset[FEATURESET_e1d];
>>> +c = p->extd.e1c;
>> This appears to crash guests on Intel, at least for dom0.
> Is this a PVH dom0?  I can't see from this snippet which function you
> are in.

No, this is normal PV dom0.

I may have gone too far trimming the patch. It's this chunk:


@@ -1291,15 +1281,15 @@ void pv_cpuid(struct cpu_user_regs *regs)
 }
 
 case 1:
-a &= pv_featureset[FEATURESET_Da1];
+a = p->xstate.Da1;
 b = c = d = 0;
 break;
 }
 break;
 
 case 0x80000001:
-c &= pv_featureset[FEATURESET_e1c];
-d &= pv_featureset[FEATURESET_e1d];
+c = p->extd.e1c;
+d = p->extd.e1d;


>
>> p->extd.e1c is 0x3 and bit 1 is reserved on Intel.
>> I haven't traced it yet to exact place that causes dom0 to crash but
>> clearing this bit makes dom0 boot.
> The logic immediately below the snippet should clean out the common bits
> if vendor != AMD.  Do we perhaps have a bad vendor setting?
>

-bash-4.1# ./cpuid 0
CPUID 0x00000000: eax = 0x0000000d ebx = 0x756e6547 ecx = 0x6c65746e edx = 0x49656e69
-bash-4.1#

This is the machine that I run my nightly tests on and it failed this
morning, so it's not new HW.
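
For readers checking the vendor setting from the leaf 0 output above, here is a small C sketch that assembles the 12-character vendor string from the EBX, EDX and ECX values. The helper names are hypothetical and the code only decodes register values passed in; it does not execute CPUID itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Store the four bytes of v into dst, least significant byte first. */
static void put_le32(char *dst, uint32_t v)
{
    for (int i = 0; i < 4; i++)
        dst[i] = (char)((v >> (8 * i)) & 0xff);
}

/* CPUID leaf 0 returns the vendor string in EBX, EDX, ECX order. */
static void cpuid_vendor_string(uint32_t ebx, uint32_t ecx, uint32_t edx,
                                char out[13])
{
    put_le32(out, ebx);
    put_le32(out + 4, edx);
    put_le32(out + 8, ecx);
    out[12] = '\0';
}

static bool vendor_is_intel(uint32_t ebx, uint32_t ecx, uint32_t edx)
{
    char vendor[13];

    cpuid_vendor_string(ebx, ecx, edx, vendor);
    return strcmp(vendor, "GenuineIntel") == 0;
}
```

With the values printed above (ebx = 0x756e6547, ecx = 0x6c65746e, edx = 0x49656e69) the string decodes to "GenuineIntel".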

As far as adjusting the bits based on vendor --- don't you only do this
for edx:

arch/x86/cpuid.c: pv_cpuid():

   case 0x80000001:
res->c = p->extd.e1c;
res->c &= ~2U; // My workaround
res->d = p->extd.e1d;

/* If not emulating AMD, clear the duplicated features in e1d. */
if ( currd->arch.x86_vendor != X86_VENDOR_AMD )
res->d &= ~CPUID_COMMON_1D_FEATURES;
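
The point being made can be sketched as a standalone C function. This is a simplified model, not the Xen implementation: the struct, the function name and the `apply_workaround` flag are hypothetical, and the mask value used for the duplicated e1d features is an assumption made for illustration. It shows that the vendor check only touches the EDX word, so a reserved bit in ECX passes through unless it is cleared separately.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed value, for illustration only. */
#define SKETCH_COMMON_1D_FEATURES 0x0183f3ffu
/* Bit 1 of leaf 0x80000001 ECX, reported as reserved on Intel. */
#define SKETCH_E1C_BIT1           (1u << 1)

struct leaf_e1 {
    uint32_t c;
    uint32_t d;
};

static struct leaf_e1 filter_leaf_e1(struct leaf_e1 policy,
                                     bool vendor_is_amd,
                                     bool apply_workaround)
{
    struct leaf_e1 res = policy;

    /* The workaround from the mail: clear bit 1 in ECX. */
    if (apply_workaround)
        res.c &= ~SKETCH_E1C_BIT1;

    /* The vendor-dependent masking only affects EDX. */
    if (!vendor_is_amd)
        res.d &= ~SKETCH_COMMON_1D_FEATURES;

    return res;
}
```

With a policy value of 0x3 in ECX, the non-AMD path without the workaround still returns 0x3, which matches the observation that only EDX is adjusted by vendor.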


-boris





Re: [Xen-devel] [PATCH v11 07/13] x86: add multiboot2 protocol support for EFI platforms

2017-01-12 Thread Doug Goldstein
On 1/12/17 6:18 AM, Daniel Kiper wrote:


>>>> So as an aside, IMHO this is where the series should end and the next
>>>> set of patches should be a follow on.
>>>
>>> Hmmm... Why? If you do not apply rest of patches then MB2 does not
>>> work on all EFI platforms.
>>>
>>> Daniel
>>
>> So I should have expanded more in my other email. I've got this series
>> pulled in on top of 4.8 along with different fixes as discussed on this
>> thread:
>>
>> https://github.com/cardoe/xen/tree/48-and-daniel
>>
>> This boots up on my NUC but reports the other CPUs as stuck and the
>> error is -5. This starts to come up on the Lenovo and it gets to near
>> where it starts the dom0 kernel and then blanks the screen and hard
>> hangs. This causes cr0 crashes on the other boards I've got access to.
>>
>> I've also got the series only to this point with the fixes.
>>
>> https://github.com/cardoe/xen/tree/48-and-daniel-sans-relocate
>>
>> The latter version boots up on my NUC with all CPUs. It still hangs on
>> the Lenovo. It works on the other boards. It also appears to work under QEMU.
> 
> AIUI, you are trying to add full (legacy BIOS and EFI) MB2 support to iPXE.
> Great! Though I think that you should do this in steps. First of all, you
> should have MB2 fully running on legacy BIOS platforms. It is much simpler.
> If it works, move to EFI platforms. OVMF is a good choice to start with, but
> of course the final tests should be done on real hardware. You can do tests
> on legacy BIOS with just patch #01. If everything works, then apply the whole
> patch series to Xen and add MB2 reloc functionality. If it works, move to EFI
> platform tests. It is important that you do EFI platform tests with the whole
> patch series. This way you avoid issues related to overwriting BS/RS
> code/data.
> 
> Daniel
> 

Daniel,

I appreciate your input. I do like the approach of splitting things up
into small incremental pieces, that's the way all this work should be
happening. You should also be aware that iPXE takes the approach of
least amount of functionality/code to make things work. So from their
view there's no reason for adding MB2 support for BIOS since it provides
no advantage over MB1 when booting from the BIOS. Now MB2 solves a
problem with booting over EFI vs MB1 so they'll be willing to take a
change there. I'll also disagree that BIOS is easier than EFI since with
EFI it's just load the ELF into memory and set a few pointers in tags.
With BIOS it requires me to build up the memory map into a MB2 structure.

As far as it goes I've got iPXE booting MB2 EFI payloads just fine. The
issues I've explained here happen when I use Grub or iPXE to boot Xen so
it's not implementation specific to my iPXE code.

-- 
Doug Goldstein





Re: [Xen-devel] Xen ARM community call - meeting minutes and date for the next one

2017-01-12 Thread Pooya . Keshavarzi
Hi,

On 01/03/2017 12:33 PM, Dirk Behme wrote:
> On 20.12.2016 19:01, Julien Grall wrote:
>> Hi Andrii,
>>
>> On 20/12/2016 19:00, Andrii Anisov wrote:
>>> Sorry for the mess,
>>>
>>> I mean the xen-swiotlb issue on renesas board:
>>>
>>>> Bosch: problem with xen-swiotlb. It does not work properly on renesas
>>> board.
>>>> Stefano: please report the error on the ML
>>>>
>>>> ACTION: Bosch to send a bug report regarding xen-swiotlb
>>
>> No news so far. Dirk, do you have any update on this?
> 
> 
> We will try to update as soon as the vacation season is over, beginning of
> next year. Together with Pooya (the Bosch guy who attended the first call)
> I'll try to summarize the issues we see, then.
> 
> I don't have the details atm. But if I remember correctly, I've heard about
> two (?) issues (haven't done the tests myself).
> 
> The first one was related to (e.g. USB) DMA depending how we configure the
> memory assignments on the Renesas Salvator-X board. Depending on how much
> memory we assign to Xen and Linux, e.g. USB wasn't working (trying to use an
> USB stick as rootfs).
> 
> The other issue I heard about was some root file system corruptions after two
> or three re-boots we haven't observed in the native Linux case. The plan was
> to do some further analysis, first, before we blame Xen regarding this,
> though.
> 
> As mentioned, Pooya will have the details and correct me if I'm totally wrong
> here ;)
> 

Firstly sorry for the late reply on this.

Regarding the problem with swiotlb-xen here are some more details:

If we limit Dom0's memory such that only low memory (up to 32-bit addressable
memory) is available to Dom0, then swiotlb-xen does not have to use bounce
buffers and the devices (e.g. USB, ethernet) work.

But when some high memory is also available to Dom0, the following happens:
 - If the device address happens to be in the device's DMA window (see
   xen_swiotlb_map_page()), then the device works.
 - Otherwise, if it has to allocate and map a bounce buffer, then the device
   does not work.
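
The two cases above can be modelled with a short C sketch. It is a simplified illustration, not the kernel's xen_swiotlb_map_page(): the function names and the enum are hypothetical, and the "DMA window" is assumed to be described by a single DMA mask. A buffer is mapped directly when it lies entirely within the mask, otherwise it takes the bounce-buffer path, which is the path reported as failing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum dma_path {
    DMA_DIRECT, /* device can reach the buffer as-is */
    DMA_BOUNCE  /* data must be copied through a bounce buffer */
};

/* True if [dev_addr, dev_addr + size) lies entirely within dma_mask. */
static bool dma_capable(uint64_t dev_addr, uint64_t size, uint64_t dma_mask)
{
    if (size == 0 || dev_addr > dma_mask)
        return false;

    /* Written this way to avoid overflow in dev_addr + size. */
    return size - 1 <= dma_mask - dev_addr;
}

static enum dma_path choose_dma_path(uint64_t dev_addr, uint64_t size,
                                     uint64_t dma_mask)
{
    return dma_capable(dev_addr, size, dma_mask) ? DMA_DIRECT : DMA_BOUNCE;
}
```

With a 32-bit mask, a page at 0x80000000 would be mapped directly, while a page at 0x100000000 would need a bounce buffer.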

Kind regards,
Pooya



Re: [Xen-devel] [PATCH v3 1/8] public / x86: Introduce __HYPERCALL_dm_op...

2017-01-12 Thread Andrew Cooper
On 12/01/17 14:58, Paul Durrant wrote:
> ...as a set of hypercalls to be used by a device model.
>
> As stated in the new docs/designs/dm_op.markdown:
>
> "The aim of DMOP is to prevent a compromised device model from
> compromising domains other than the one it is associated with. (And is
> therefore likely already compromised)."
>
> See that file for further information.
>
> This patch simply adds the boilerplate for the hypercall.
>
> Signed-off-by: Paul Durrant 
> Suggested-by: Ian Jackson 
> Suggested-by: Jennifer Herbert 
> ---
> Cc: Ian Jackson 
> Cc: Jennifer Herbert 
> Cc: Daniel De Graaf 
> Cc: Wei Liu 
> Reviewed-by: Jan Beulich 
> Cc: Andrew Cooper 
>
> v3:
> - Re-written large portions of dmop.markdown to remove references to
>   previous proposals and make it a standalone design doc.
>
> v2:
> - Addressed several comments from Jan.
> - Removed modification of __XEN_LATEST_INTERFACE_VERSION__ as it is not
>   needed in this patch.
> ---
>  docs/designs/dmop.markdown        | 158 ++
>  tools/flask/policy/modules/xen.if |   2 +-
>  tools/libxc/include/xenctrl.h     |   1 +
>  tools/libxc/xc_private.c          |  70 +
>  tools/libxc/xc_private.h          |   2 +
>  xen/arch/x86/hvm/Makefile         |   1 +
>  xen/arch/x86/hvm/dm.c             | 118
>  xen/arch/x86/hvm/hvm.c            |   1 +
>  xen/arch/x86/hypercall.c          |   2 +
>  xen/include/public/hvm/dm_op.h    |  71 +
>  xen/include/public/xen.h          |   1 +
>  xen/include/xen/hypercall.h       |   7 ++
>  xen/include/xsm/dummy.h           |   6 ++
>  xen/include/xsm/xsm.h             |   6 ++
>  xen/xsm/flask/hooks.c             |   7 ++
>  15 files changed, 452 insertions(+), 1 deletion(-)
>  create mode 100644 docs/designs/dmop.markdown
>  create mode 100644 xen/arch/x86/hvm/dm.c
>  create mode 100644 xen/include/public/hvm/dm_op.h
>
> diff --git a/docs/designs/dmop.markdown b/docs/designs/dmop.markdown
> new file mode 100644
> index 0000000..2a4bd16
> --- /dev/null
> +++ b/docs/designs/dmop.markdown
> @@ -0,0 +1,158 @@
> +DMOP
> +====
> +
> +Introduction
> +------------
> +
> +The aim of DMOP is to prevent a compromised device model from compromising
> +domains other than the one it is associated with. (And is therefore likely
> +already compromised).
> +
> +The problem occurs when a device model issues a hypercall that
> +includes references to user memory other than the operation structure
> +itself, such as with Track dirty VRAM (as used in VGA emulation).
> +In this case, the address of this other user memory needs to be vetted,
> +to ensure it is not within restricted address ranges, such as kernel
> +memory. The real problem comes down to how you would vet this address -
> +the ideal place to do this is within the privcmd driver, without privcmd
> +having to have specific knowledge of the hypercall's semantics.
> +
> +The Design
> +----------
> +
> +The privcmd driver implements a new restriction ioctl, which takes a domid
> +parameter.  After that restriction ioctl is issued, the privcmd driver will
> +permit only DMOP hypercalls, and only with the specified target domid.
> +
> +A DMOP hypercall consists of an array of buffers and lengths, with the
> +first one containing the specific DMOP parameters. These can then reference
> +further buffers from within the array. Since the only user buffers
> +passed are those found within that array, they can all be audited by
> +privcmd.
> +
> +The following code illustrates this idea:
> +
> +struct xen_dm_op {
> +uint32_t op;
> +};
> +
> +struct xen_dm_op_buf {
> +XEN_GUEST_HANDLE_64(void) h;
> +uint32_t size;
> +};

Sorry to quibble, but there is a problem here which has only just
occurred to me.  This ABI isn't futureproof, and has padding at the end
which affects how the array is laid out.

The userspace side should be

struct xen_dm_op_buf {
void *h;
size_t size;
};

which will work sensibly for 32bit and 64bit userspace, and futureproof
(for when 128bit turns up).  Its size is also a power of two which
avoids alignment issues in the array.

The kernel already has to parse this structure anyway, and will know the
bitness of its userspace process.  We could easily (at this point)
require the kernel to turn it into the kernel's bitness for forwarding on
to Xen, which covers the 32bit userspace under a 64bit kernel problem,
in a way which won't break the hypercall ABI when 128bit comes along.
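
The padding concern can be checked with a small C sketch. The struct names here are hypothetical stand-ins: `hypercall_buf` uses a plain uint64_t in place of XEN_GUEST_HANDLE_64(void), and `userspace_buf` follows the pointer-plus-size_t layout suggested above. The helpers report how many padding bytes the compiler adds after the last member; the result depends on the ABI, so no specific value is claimed beyond what the comments state.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the proposed hypercall buffer: 64-bit handle + 32-bit size. */
struct hypercall_buf {
    uint64_t h;
    uint32_t size;
};

/* The userspace-side layout suggested in the reply. */
struct userspace_buf {
    void *h;
    size_t size;
};

/* The size member always follows the 8-byte handle directly. */
static_assert(offsetof(struct hypercall_buf, size) == 8,
              "size follows the 64-bit handle");

/* Bytes of padding after the last member of struct hypercall_buf. */
static size_t tail_padding_hypercall_buf(void)
{
    return sizeof(struct hypercall_buf) -
           (offsetof(struct hypercall_buf, size) + sizeof(uint32_t));
}

/* Bytes of padding after the last member of struct userspace_buf. */
static size_t tail_padding_userspace_buf(void)
{
    return sizeof(struct userspace_buf) -
           (offsetof(struct userspace_buf, size) + sizeof(size_t));
}
```

On ABIs where uint64_t is 8-byte aligned, the first struct is expected to carry tail padding that changes the array stride, while the pointer-plus-size_t form should have none on the usual 32-bit and 64-bit ABIs.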

~Andrew



Re: [Xen-devel] [PATCH v2 16/25] x86/pv: Use per-domain policy information in pv_cpuid()

2017-01-12 Thread Andrew Cooper
On 12/01/17 15:22, Boris Ostrovsky wrote:
>>  case 0x80000001:
>> -c &= pv_featureset[FEATURESET_e1c];
>> -d &= pv_featureset[FEATURESET_e1d];
>> +c = p->extd.e1c;
> This appears to crash guests on Intel, at least for dom0.

Is this a PVH dom0?  I can't see from this snippet which function you
are in.

>
> p->extd.e1c is 0x3 and bit 1 is reserved on Intel.

>
> I haven't traced it yet to exact place that causes dom0 to crash but
> clearing this bit makes dom0 boot.

The logic immediately below the snippet should clean out the common bits
if vendor != AMD.  Do we perhaps have a bad vendor setting?

~Andrew



Re: [Xen-devel] [PATCH v2 16/25] x86/pv: Use per-domain policy information in pv_cpuid()

2017-01-12 Thread Boris Ostrovsky

>  case 0x80000001:
> -c &= pv_featureset[FEATURESET_e1c];
> -d &= pv_featureset[FEATURESET_e1d];
> +c = p->extd.e1c;

This appears to crash guests on Intel, at least for dom0.

p->extd.e1c is 0x3 and bit 1 is reserved on Intel.

I haven't traced it yet to exact place that causes dom0 to crash but
clearing this bit makes dom0 boot.


-boris




Re: [Xen-devel] [PATCH] x86emul: suppress memory writes after faulting FPU insns

2017-01-12 Thread Andrew Cooper
On 12/01/17 14:02, Jan Beulich wrote:
> FPU insns writing to memory must not touch memory if they latch #MF (to
> be delivered on the next waiting FPU insn). Note that inspecting FSW.ES
> needs to be avoided for all FNST* insns, as they don't raise exceptions
> themselves, but may instead be invoked with the bit already set.
>
> Signed-off-by: Jan Beulich 
> ---
> While #MF and memory access faults are all listed in the same priority
> group, it is not entirely clear how FPU insns reading memory operate
> when an exception to be delivered doesn't depend on the memory operand
> (which would namely be FPU register stack overflows). It is therefore
> possible that memory reads would need to be suppressed in some
> situations, too. Of course this only matters for reads which have side
> effects. Otoh SNaN operand detection and stack overflow/underflow are
> listed as having the same priority, so the memory read may well be
> performed unconditionally.

This can be checked by using pagetable access bits, although I can't see
any sensible case where one would point the x87 FPU at MMIO (at all, let
alone) with read side effects.

I think it is reasonable to leave reads as they currently are.

> Furthermore I think we have another issue with writes: If the write
> faults, the FSW (or MXCSR, albeit there only for instructions we don't
> emulate yet) register may have been updated already, so we'd need to
> undo that update.

Do you mean restore the value before we sample it, or before the guest
gets to see it?

(I can't see what the problem is here.)

>  For MXCSR that will be possible by saving the initial
> value and re-loading it in case ->write() fails (and iirc there's
> exactly one affected insn - VCVTPS2PH). There's no suitable way to load
> FSW, though - all existing mechanisms have further effects we don't
> really want (albeit arguably the side effects of going through a
> {F,}XSAVE/{F,}XRSTOR cycle could occur at any time as a scheduling side
> effect).
>
> --- a/xen/arch/x86/x86_emulate/x86_emulate.c
> +++ b/xen/arch/x86/x86_emulate/x86_emulate.c
> @@ -3723,6 +3735,8 @@ x86_emulate(
>  default:
>  generate_exception(EXC_UD);
>  }
> +if ( dst.type == OP_MEM && dst.bytes == 4 && !fpu_check_write() )
> +dst.type = OP_NONE;

This dst.bytes check is rather suspicious, as the size of the operand
has nothing to do with whether the write should be suppressed.

I presume you actually mean (modrm_reg & 7) < 6 to exclude fnstenv and
fnstcw from triggering the fpu_check_write() logic?
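
The suggested condition can be sketched in C as follows. This is a model under assumptions, not the emulator code: it assumes the hunk belongs to the 0xd9 opcode group, where ModRM.reg values 6 and 7 select FNSTENV and FNSTCW, and it takes the FPU status word as a parameter instead of reading it with FNSTSW. The helper names are hypothetical; FSW.ES is bit 7 of the x87 status word.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_FSW_ES (1u << 7) /* x87 status word: error summary bit */

/* A memory write is allowed only if no unmasked exception is pending. */
static bool fpu_write_allowed(uint16_t fsw)
{
    return !(fsw & SKETCH_FSW_ES);
}

/*
 * Assumed 0xd9 group: reg 6 (FNSTENV) and reg 7 (FNSTCW) never raise
 * exceptions themselves, so they are excluded from the FSW.ES check.
 */
static bool d9_store_checks_fsw(uint8_t modrm_reg)
{
    return (modrm_reg & 7) < 6;
}

/* True if the emulator should drop the memory write for this insn. */
static bool should_suppress_write(uint8_t modrm_reg, uint16_t fsw)
{
    return d9_store_checks_fsw(modrm_reg) && !fpu_write_allowed(fsw);
}
```

Keying the check on the ModRM.reg field rather than on the operand size keeps the decision tied to which instruction is being emulated.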

>  }
>  put_fpu();
>  break;
> @@ -3946,6 +3963,8 @@ x86_emulate(
>  default:
>  generate_exception(EXC_UD);
>  }
> +if ( dst.type == OP_MEM && dst.bytes == 8 && !fpu_check_write() )
> +dst.type = OP_NONE;

Same again here.

Otherwise, it looks fine.

~Andrew

>  }
>  put_fpu();
>  break;



[Xen-devel] [PATCH v3 6/8] dm_op: convert HVMOP_set_mem_type

2017-01-12 Thread Paul Durrant
This patch removes the need for handling HVMOP restarts, so that
infrastructure is removed.

NOTE: This patch also modifies the type of the 'nr' argument of
  xc_hvm_set_mem_type() from uint64_t to uint32_t. In practice the
  value passed was always truncated to 32 bits.
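
To illustrate the truncation mentioned in the NOTE, here is a minimal C sketch. The helper names are hypothetical and are not part of libxc. It shows what a silent conversion from a 64-bit count to a 32-bit field does, and a check a caller could perform before passing a 64-bit value to an interface that now takes uint32_t.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* What an implicit conversion to a 32-bit field keeps: the low 32 bits. */
static uint32_t truncate_nr(uint64_t nr)
{
    return (uint32_t)nr;
}

/* True if nr can be passed as a uint32_t without losing information. */
static bool nr_fits_u32(uint64_t nr)
{
    return nr <= UINT32_MAX;
}
```

A caller holding a 64-bit page count could test nr_fits_u32() first and split the request or fail explicitly instead of relying on truncation.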

Suggested-by: Jan Beulich 
Signed-off-by: Paul Durrant 
---
Cc: Jan Beulich 
Cc: Ian Jackson 
Acked-by: Wei Liu 
Cc: Andrew Cooper 
Cc: Daniel De Graaf 

v3:
- Addressed more comments from Jan.

v2:
- Addressed several comments from Jan.
---
 tools/libxc/include/xenctrl.h       |   2 +-
 tools/libxc/xc_misc.c               |  29 +++-
 xen/arch/x86/hvm/dm.c               |  92
 xen/arch/x86/hvm/hvm.c              | 136 +---
 xen/include/public/hvm/dm_op.h      |  22 ++
 xen/include/public/hvm/hvm_op.h     |  20 --
 xen/xsm/flask/policy/access_vectors |   2 +-
 7 files changed, 127 insertions(+), 176 deletions(-)

diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h
index a5c234f..13431bb 100644
--- a/tools/libxc/include/xenctrl.h
+++ b/tools/libxc/include/xenctrl.h
@@ -1634,7 +1634,7 @@ int xc_hvm_modified_memory(
  * Allowed types are HVMMEM_ram_rw, HVMMEM_ram_ro, HVMMEM_mmio_dm
  */
 int xc_hvm_set_mem_type(
-xc_interface *xch, domid_t dom, hvmmem_type_t memtype, uint64_t first_pfn, 
uint64_t nr);
+xc_interface *xch, domid_t dom, hvmmem_type_t memtype, uint64_t first_pfn, 
uint32_t nr);
 
 /*
  * Injects a hardware/software CPU trap, to take effect the next time the HVM 
diff --git a/tools/libxc/xc_misc.c b/tools/libxc/xc_misc.c
index 597df99..5b06d6b 100644
--- a/tools/libxc/xc_misc.c
+++ b/tools/libxc/xc_misc.c
@@ -590,30 +590,21 @@ int xc_hvm_modified_memory(
 }
 
 int xc_hvm_set_mem_type(
-xc_interface *xch, domid_t dom, hvmmem_type_t mem_type, uint64_t 
first_pfn, uint64_t nr)
+xc_interface *xch, domid_t dom, hvmmem_type_t mem_type, uint64_t 
first_pfn, uint32_t nr)
 {
-DECLARE_HYPERCALL_BUFFER(struct xen_hvm_set_mem_type, arg);
-int rc;
-
-arg = xc_hypercall_buffer_alloc(xch, arg, sizeof(*arg));
-if ( arg == NULL )
-{
-PERROR("Could not allocate memory for xc_hvm_set_mem_type hypercall");
-return -1;
-}
+struct xen_dm_op op;
+struct xen_dm_op_set_mem_type *data;
 
-arg->domid= dom;
-arg->hvmmem_type  = mem_type;
-arg->first_pfn= first_pfn;
-arg->nr   = nr;
+memset(&op, 0, sizeof(op));
 
-rc = xencall2(xch->xcall, __HYPERVISOR_hvm_op,
-  HVMOP_set_mem_type,
-  HYPERCALL_BUFFER_AS_ARG(arg));
+op.op = XEN_DMOP_set_mem_type;
+data = &op.u.set_mem_type;
 
-xc_hypercall_buffer_free(xch, arg);
+data->mem_type = mem_type;
+data->first_pfn = first_pfn;
+data->nr = nr;
 
-return rc;
+return do_dm_op(xch, dom, 1, &op, sizeof(op));
 }
 
 int xc_hvm_inject_trap(
diff --git a/xen/arch/x86/hvm/dm.c b/xen/arch/x86/hvm/dm.c
index 1a7c913..97c1f07 100644
--- a/xen/arch/x86/hvm/dm.c
+++ b/xen/arch/x86/hvm/dm.c
@@ -198,6 +198,84 @@ static int modified_memory(struct domain *d, xen_pfn_t 
*first_pfn,
 return rc;
 }
 
+static bool allow_p2m_type_change(p2m_type_t old, p2m_type_t new)
+{
+return p2m_is_ram(old) ||
+   (p2m_is_hole(old) && new == p2m_mmio_dm) ||
+   (old == p2m_ioreq_server && new == p2m_ram_rw);
+}
+
+static int set_mem_type(struct domain *d, hvmmem_type_t mem_type,
+xen_pfn_t *first_pfn, unsigned int *nr)
+{
+xen_pfn_t last_pfn = *first_pfn + *nr - 1;
+unsigned int iter;
+int rc;
+
+/* Interface types to internal p2m types */
+static const p2m_type_t memtype[] = {
+[HVMMEM_ram_rw]  = p2m_ram_rw,
+[HVMMEM_ram_ro]  = p2m_ram_ro,
+[HVMMEM_mmio_dm] = p2m_mmio_dm,
+[HVMMEM_unused] = p2m_invalid,
+[HVMMEM_ioreq_server] = p2m_ioreq_server
+};
+
+if ( (*first_pfn > last_pfn) ||
+ (last_pfn > domain_get_maximum_gpfn(d)) )
+return -EINVAL;
+
+if ( mem_type >= ARRAY_SIZE(memtype) ||
+ unlikely(mem_type == HVMMEM_unused) )
+return -EINVAL;
+
+iter = 0;
+rc = 0;
+while ( iter < *nr )
+{
+unsigned long pfn = *first_pfn + iter;
+p2m_type_t t;
+
+get_gfn_unshare(d, pfn, &t);
+if ( p2m_is_paging(t) )
+{
+put_gfn(d, pfn);
+p2m_mem_paging_populate(d, pfn);
+return -EAGAIN;
+}
+
+if ( p2m_is_shared(t) )
+rc = -EAGAIN;
+else if ( !allow_p2m_type_change(t, memtype[mem_type]) )
+rc = -EINVAL;
+else
+rc = p2m_change_type_one(d, pfn, t, memtype[mem_type]);
+
+put_gfn(d, pfn);
+
+if ( rc )
+break;
+
+iter++;
+
+   

[Xen-devel] [PATCH v3 3/8] dm_op: convert HVMOP_track_dirty_vram

2017-01-12 Thread Paul Durrant
The handle type passed to the underlying shadow and hap functions is
changed for compatibility with the new hypercall buffer.

NOTE: This patch also modifies the type of the 'nr' parameter of
  xc_hvm_track_dirty_vram() from uint64_t to uint32_t. In practice
  the value passed was always truncated to 32 bits.

Suggested-by: Jan Beulich 
Signed-off-by: Paul Durrant 
---
Cc: Jan Beulich 
Cc: Daniel De Graaf 
Cc: Ian Jackson 
Acked-by: Wei Liu 
Cc: Andrew Cooper 
Cc: George Dunlap 
Cc: Tim Deegan 

v3:
- Check d->max_vcpus rather than d->vcpu, as requested by Jan.
- The handle type changes (from uint8 to void) are still necessary, hence
  omitting Jan's R-b until this is confirmed to be acceptable.

v2:
- Addressed several comments from Jan.
---
 tools/flask/policy/modules/xen.if   |  4 ++--
 tools/libxc/include/xenctrl.h       |  2 +-
 tools/libxc/xc_misc.c               | 32 +-
 xen/arch/x86/hvm/dm.c               | 45 +
 xen/arch/x86/hvm/hvm.c              | 41 -
 xen/arch/x86/mm/hap/hap.c           |  2 +-
 xen/arch/x86/mm/shadow/common.c     |  2 +-
 xen/include/asm-x86/hap.h           |  2 +-
 xen/include/asm-x86/shadow.h        |  2 +-
 xen/include/public/hvm/dm_op.h      | 18 +++
 xen/include/public/hvm/hvm_op.h     | 16 -
 xen/xsm/flask/hooks.c               |  3 ---
 xen/xsm/flask/policy/access_vectors |  2 --
 13 files changed, 80 insertions(+), 91 deletions(-)

diff --git a/tools/flask/policy/modules/xen.if 
b/tools/flask/policy/modules/xen.if
index f9254c2..45e5b5f 100644
--- a/tools/flask/policy/modules/xen.if
+++ b/tools/flask/policy/modules/xen.if
@@ -58,7 +58,7 @@ define(`create_domain_common', `
allow $1 $2:mmu { map_read map_write adjust memorymap physmap pinpage 
mmuext_op updatemp };
allow $1 $2:grant setup;
allow $1 $2:hvm { cacheattr getparam hvmctl irqlevel pciroute sethvmc
-   setparam pcilevel trackdirtyvram nested altp2mhvm 
altp2mhvm_op send_irq };
+   setparam pcilevel nested altp2mhvm altp2mhvm_op 
send_irq };
 ')
 
 # create_domain(priv, target)
@@ -151,7 +151,7 @@ define(`device_model', `
 
allow $1 $2_target:domain { getdomaininfo shutdown };
allow $1 $2_target:mmu { map_read map_write adjust physmap target_hack 
};
-   allow $1 $2_target:hvm { getparam setparam trackdirtyvram hvmctl 
irqlevel pciroute pcilevel cacheattr send_irq dm };
+   allow $1 $2_target:hvm { getparam setparam hvmctl irqlevel pciroute 
pcilevel cacheattr send_irq dm };
 ')
 
 # make_device_model(priv, dm_dom, hvm_dom)
diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h
index 2ba46d7..c7ee412 100644
--- a/tools/libxc/include/xenctrl.h
+++ b/tools/libxc/include/xenctrl.h
@@ -1620,7 +1620,7 @@ int xc_hvm_inject_msi(
  */
 int xc_hvm_track_dirty_vram(
 xc_interface *xch, domid_t dom,
-uint64_t first_pfn, uint64_t nr,
+uint64_t first_pfn, uint32_t nr,
 unsigned long *bitmap);
 
 /*
diff --git a/tools/libxc/xc_misc.c b/tools/libxc/xc_misc.c
index 06e90de..4c41d41 100644
--- a/tools/libxc/xc_misc.c
+++ b/tools/libxc/xc_misc.c
@@ -581,34 +581,22 @@ int xc_hvm_inject_msi(
 
 int xc_hvm_track_dirty_vram(
 xc_interface *xch, domid_t dom,
-uint64_t first_pfn, uint64_t nr,
+uint64_t first_pfn, uint32_t nr,
 unsigned long *dirty_bitmap)
 {
-DECLARE_HYPERCALL_BOUNCE(dirty_bitmap, (nr+7) / 8, 
XC_HYPERCALL_BUFFER_BOUNCE_OUT);
-DECLARE_HYPERCALL_BUFFER(struct xen_hvm_track_dirty_vram, arg);
-int rc;
+struct xen_dm_op op;
+struct xen_dm_op_track_dirty_vram *data;
 
-arg = xc_hypercall_buffer_alloc(xch, arg, sizeof(*arg));
-if ( arg == NULL || xc_hypercall_bounce_pre(xch, dirty_bitmap) )
-{
-PERROR("Could not bounce memory for xc_hvm_track_dirty_vram 
hypercall");
-rc = -1;
-goto out;
-}
+memset(&op, 0, sizeof(op));
 
-arg->domid = dom;
-arg->first_pfn = first_pfn;
-arg->nr= nr;
-set_xen_guest_handle(arg->dirty_bitmap, dirty_bitmap);
+op.op = XEN_DMOP_track_dirty_vram;
+data = &op.u.track_dirty_vram;
 
-rc = xencall2(xch->xcall, __HYPERVISOR_hvm_op,
-  HVMOP_track_dirty_vram,
-  HYPERCALL_BUFFER_AS_ARG(arg));
+data->first_pfn = first_pfn;
+data->nr = nr;
 
-out:
-xc_hypercall_buffer_free(xch, arg);
-xc_hypercall_bounce_post(xch, dirty_bitmap);
-return rc;
+return do_dm_op(xch, dom, 2, &op, sizeof(op),
+dirty_bitmap, (nr + 7) / 8);
 }
 
 int xc_hvm_modified_memory(
diff --git a/xen/arch/x86/hvm/dm.c b/xen/arch/x86/hvm/dm.c
index 4b94d85..d501d56 100644
--- a/xen/arch/x86/hvm/dm.c
+++ 

[Xen-devel] [PATCH v3 8/8] x86/hvm: serialize trap injecting producer and consumer

2017-01-12 Thread Paul Durrant
Since injection works on a remote vCPU, and since there's no
enforcement of the subject vCPU being paused, there's a potential race
between the producing and consuming sides. Fix this by leveraging the
vector field as synchronization variable.
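
The handshake described above can be sketched in portable C11 roughly as
follows. This is an illustration only, not the Xen code: struct pending_trap,
trap_produce() and trap_consume() are hypothetical names, and C11
acquire/release operations stand in for Xen's cmpxchg(), smp_wmb() and
smp_rmb().

```c
#include <errno.h>
#include <stdatomic.h>
#include <stdint.h>

#define TRAP_VECTOR_UNSET    (-1)
#define TRAP_VECTOR_UPDATING (-2)

/* Hypothetical stand-in for the per-vCPU pending trap record. */
struct pending_trap {
    _Atomic int vector;   /* doubles as the synchronization variable */
    uint32_t error_code;
    uint64_t cr2;
};

/* Producer: claim the slot, fill the payload, then publish the vector. */
static int trap_produce(struct pending_trap *t, int vector,
                        uint32_t error_code, uint64_t cr2)
{
    int expected = TRAP_VECTOR_UNSET;

    if ( !atomic_compare_exchange_strong(&t->vector, &expected,
                                         TRAP_VECTOR_UPDATING) )
        return -EBUSY;

    t->error_code = error_code;
    t->cr2 = cr2;
    /* Release: the payload stores become visible before the vector. */
    atomic_store_explicit(&t->vector, vector, memory_order_release);

    return 0;
}

/* Consumer: only a non-negative vector denotes a fully written record. */
static int trap_consume(struct pending_trap *t, uint32_t *error_code,
                        uint64_t *cr2)
{
    int vector = atomic_load_explicit(&t->vector, memory_order_acquire);

    if ( vector < 0 )
        return -1; /* unset, or a producer is mid-update */

    *error_code = t->error_code;
    *cr2 = t->cr2;
    atomic_store_explicit(&t->vector, TRAP_VECTOR_UNSET,
                          memory_order_release);

    return vector;
}
```

A second producer arriving while the slot is claimed or still unconsumed gets
-EBUSY, and the consumer never reads a half-written record because it ignores
every negative vector value.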

Signed-off-by: Jan Beulich 
[re-based]
Signed-off-by: Paul Durrant 
---
Cc: Andrew Cooper 

v3:
- Re-re-re-based after more changes.

v2:
- Re-re-based after Andrew's recent changes.
---
 xen/arch/x86/hvm/dm.c | 5 -
 xen/arch/x86/hvm/hvm.c| 8 +---
 xen/include/asm-x86/hvm/hvm.h | 3 +++
 3 files changed, 12 insertions(+), 4 deletions(-)

diff --git a/xen/arch/x86/hvm/dm.c b/xen/arch/x86/hvm/dm.c
index 0a7e50a..e09c8e7 100644
--- a/xen/arch/x86/hvm/dm.c
+++ b/xen/arch/x86/hvm/dm.c
@@ -286,13 +286,16 @@ static int inject_trap(struct domain *d, unsigned int 
vcpuid,
 if ( vcpuid >= d->max_vcpus || !(v = d->vcpu[vcpuid]) )
 return -EINVAL;
 
-if ( v->arch.hvm_vcpu.inject_trap.vector != -1 )
+if ( cmpxchg(&v->arch.hvm_vcpu.inject_trap.vector,
+ HVM_TRAP_VECTOR_UNSET, HVM_TRAP_VECTOR_UPDATING) !=
+ HVM_TRAP_VECTOR_UNSET )
 return -EBUSY;
 
 v->arch.hvm_vcpu.inject_trap.type = type;
 v->arch.hvm_vcpu.inject_trap.insn_len = insn_len;
 v->arch.hvm_vcpu.inject_trap.error_code = error_code;
 v->arch.hvm_vcpu.inject_trap.cr2 = cr2;
+smp_wmb();
 v->arch.hvm_vcpu.inject_trap.vector = vector;
 
 return 0;
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 8d42adc..44b 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -539,12 +539,14 @@ void hvm_do_resume(struct vcpu *v)
 }
 
 /* Inject pending hw/sw trap */
-if ( v->arch.hvm_vcpu.inject_trap.vector != -1 )
+if ( v->arch.hvm_vcpu.inject_trap.vector >= 0 )
 {
+smp_rmb();
+
 if ( !hvm_event_pending(v) )
 hvm_inject_event(&v->arch.hvm_vcpu.inject_trap);
 
-v->arch.hvm_vcpu.inject_trap.vector = -1;
+v->arch.hvm_vcpu.inject_trap.vector = HVM_TRAP_VECTOR_UNSET;
 }
 
 if ( unlikely(v->arch.vm_event) && v->arch.monitor.next_interrupt_enabled )
@@ -1563,7 +1565,7 @@ int hvm_vcpu_initialise(struct vcpu *v)
 (void(*)(unsigned long))hvm_assert_evtchn_irq,
 (unsigned long)v);
 
-v->arch.hvm_vcpu.inject_trap.vector = -1;
+v->arch.hvm_vcpu.inject_trap.vector = HVM_TRAP_VECTOR_UNSET;
 
 if ( is_pvh_domain(d) )
 {
diff --git a/xen/include/asm-x86/hvm/hvm.h b/xen/include/asm-x86/hvm/hvm.h
index 8c95c08..bcacee3 100644
--- a/xen/include/asm-x86/hvm/hvm.h
+++ b/xen/include/asm-x86/hvm/hvm.h
@@ -77,6 +77,9 @@ enum hvm_intblk {
 #define HVM_HAP_SUPERPAGE_2MB   0x0001
 #define HVM_HAP_SUPERPAGE_1GB   0x0002
 
+#define HVM_TRAP_VECTOR_UNSET(-1)
+#define HVM_TRAP_VECTOR_UPDATING (-2)
+
 /*
  * The hardware virtual machine (HVM) interface abstracts away from the
  * x86/x86_64 CPU virtualization assist specifics. Currently this interface
-- 
2.1.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [PATCH v3 5/8] dm_op: convert HVMOP_modified_memory

2017-01-12 Thread Paul Durrant
This patch introduces code to handle DMOP continuations.

NOTE: This patch also modifies the type of the 'nr' argument of
  xc_hvm_modified_memory() from uint64_t to uint32_t. In practice the
  value passed was always truncated to 32 bits.
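
The continuation scheme can be sketched as a standalone C program along the
following lines. It is a simplified model under stated assumptions, not the
hypervisor code: process_range(), mark_dirty(), run_to_completion() and the
preempt_check hook are hypothetical names, and the fallback ERESTART value is
only there for C libraries that do not define it.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#ifndef ERESTART
#define ERESTART 85 /* not defined by every libc; value is illustrative */
#endif

/* Hypothetical preemption hook; Xen would use hypercall_preempt_check(). */
static bool (*preempt_check)(void);

/* Hypothetical per-page work, counted so the result can be inspected. */
static unsigned long pages_done;
static void mark_dirty(uint64_t pfn) { (void)pfn; pages_done++; }

/*
 * Process up to *nr pages starting at *first_pfn.  On preemption the
 * range is advanced past the completed pages and -ERESTART is returned,
 * so the caller can re-issue the call with the updated arguments.
 */
static int process_range(uint64_t *first_pfn, uint32_t *nr)
{
    uint32_t iter = 0;

    while ( iter < *nr )
    {
        mark_dirty(*first_pfn + iter);
        iter++;

        /* Poll every 256th iteration, but never after the last one. */
        if ( (iter < *nr) && ((iter & 0xff) == 0) &&
             preempt_check && preempt_check() )
        {
            *first_pfn += iter;
            *nr -= iter;
            return -ERESTART;
        }
    }

    return 0;
}

/* Caller-side loop: retry until the whole range has been handled. */
static unsigned int run_to_completion(uint64_t first_pfn, uint32_t nr)
{
    unsigned int calls = 1;

    while ( process_range(&first_pfn, &nr) == -ERESTART )
        calls++;

    return calls;
}

/* Worst case for the sketch: preemption is requested at every poll. */
static bool always_preempt(void) { return true; }
```

Because the remaining range is written back into the arguments, a restarted
call needs no state beyond the operation buffer itself.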

Suggested-by: Jan Beulich 
Signed-off-by: Paul Durrant 
---
Cc: Jan Beulich 
Cc: Ian Jackson 
Acked-by: Wei Liu 
Cc: Andrew Cooper 
Cc: Daniel De Graaf 

v3:
- Addressed more comments from Jan.

v2:
- Addressed several comments from Jan, including...
- Added explanatory note on continuation handling
---
 tools/libxc/include/xenctrl.h   |  2 +-
 tools/libxc/xc_misc.c   | 27 +
 xen/arch/x86/hvm/dm.c   | 78 -
 xen/arch/x86/hvm/hvm.c  | 60 
 xen/include/public/hvm/dm_op.h  | 19 +
 xen/include/public/hvm/hvm_op.h | 13 ---
 xen/xsm/flask/policy/access_vectors |  2 +-
 7 files changed, 107 insertions(+), 94 deletions(-)

diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h
index f819bf2..a5c234f 100644
--- a/tools/libxc/include/xenctrl.h
+++ b/tools/libxc/include/xenctrl.h
@@ -1627,7 +1627,7 @@ int xc_hvm_track_dirty_vram(
  * Notify that some pages got modified by the Device Model
  */
 int xc_hvm_modified_memory(
-xc_interface *xch, domid_t dom, uint64_t first_pfn, uint64_t nr);
+xc_interface *xch, domid_t dom, uint64_t first_pfn, uint32_t nr);
 
 /*
  * Set a range of memory to a specific type.
diff --git a/tools/libxc/xc_misc.c b/tools/libxc/xc_misc.c
index ddea2bb..597df99 100644
--- a/tools/libxc/xc_misc.c
+++ b/tools/libxc/xc_misc.c
@@ -573,29 +573,20 @@ int xc_hvm_track_dirty_vram(
 }
 
 int xc_hvm_modified_memory(
-xc_interface *xch, domid_t dom, uint64_t first_pfn, uint64_t nr)
+xc_interface *xch, domid_t dom, uint64_t first_pfn, uint32_t nr)
 {
-DECLARE_HYPERCALL_BUFFER(struct xen_hvm_modified_memory, arg);
-int rc;
-
-arg = xc_hypercall_buffer_alloc(xch, arg, sizeof(*arg));
-if ( arg == NULL )
-{
-PERROR("Could not allocate memory for xc_hvm_modified_memory 
hypercall");
-return -1;
-}
+struct xen_dm_op op;
+struct xen_dm_op_modified_memory *data;
 
-arg->domid = dom;
-arg->first_pfn = first_pfn;
-arg->nr= nr;
+memset(&op, 0, sizeof(op));
 
-rc = xencall2(xch->xcall, __HYPERVISOR_hvm_op,
-  HVMOP_modified_memory,
-  HYPERCALL_BUFFER_AS_ARG(arg));
+op.op = XEN_DMOP_modified_memory;
+data = &op.u.modified_memory;
 
-xc_hypercall_buffer_free(xch, arg);
+data->first_pfn = first_pfn;
+data->nr = nr;
 
-return rc;
+return do_dm_op(xch, dom, 1, &op, sizeof(op));
 }
 
 int xc_hvm_set_mem_type(
diff --git a/xen/arch/x86/hvm/dm.c b/xen/arch/x86/hvm/dm.c
index bcd9ea6..1a7c913 100644
--- a/xen/arch/x86/hvm/dm.c
+++ b/xen/arch/x86/hvm/dm.c
@@ -14,6 +14,7 @@
  * this program; If not, see <http://www.gnu.org/licenses/>.
  */
 
+#include 
 #include 
 #include 
 #include 
@@ -142,12 +143,68 @@ static int set_isa_irq_level(struct domain *d, uint8_t 
isa_irq,
 return 0;
 }
 
+static int modified_memory(struct domain *d, xen_pfn_t *first_pfn,
+   unsigned int *nr)
+{
+xen_pfn_t last_pfn = *first_pfn + *nr - 1;
+unsigned int iter;
+int rc;
+
+if ( (*first_pfn > last_pfn) ||
+ (last_pfn > domain_get_maximum_gpfn(d)) )
+return -EINVAL;
+
+if ( !paging_mode_log_dirty(d) )
+return 0;
+
+iter = 0;
+rc = 0;
+while ( iter < *nr )
+{
+unsigned long pfn = *first_pfn + iter;
+struct page_info *page;
+
+page = get_page_from_gfn(d, pfn, NULL, P2M_UNSHARE);
+if ( page )
+{
+mfn_t gmfn = _mfn(page_to_mfn(page));
+
+paging_mark_dirty(d, gmfn);
+/*
+ * These are most probably not page tables any more
+ * don't take a long time and don't die either.
+ */
+sh_remove_shadows(d, gmfn, 1, 0);
+put_page(page);
+}
+
+iter++;
+
+/*
+ * Check for continuation every 256th iteration and if the
+ * iteration is not the last.
+ */
+if ( (iter < *nr) && ((iter & 0xff) == 0) &&
+ hypercall_preempt_check() )
+{
+*first_pfn += iter;
+*nr -= iter;
+
+rc = -ERESTART;
+break;
+}
+}
+
+return rc;
+}
+
 long do_dm_op(domid_t domid,
   unsigned int nr_bufs,
   XEN_GUEST_HANDLE_PARAM(xen_dm_op_buf_t) bufs)
 {
 struct domain *d;
 struct xen_dm_op op;
+bool restart = false;
 long rc;
 
 rc = rcu_lock_remote_domain_by_id(domid, &d);
@@ -299,17 +356,36 @@ long 

[Xen-devel] [PATCH v3 0/8] New hypercall for device models

2017-01-12 Thread Paul Durrant
Following on from the design submitted by Jennifer Herbert to the list [1]
this series provides an implementation of __HYPERCALL_dm_op followed by
patches based on Jan Beulich's previous HVMCTL series [2] to convert
tools-only HVMOPs used by device models to DMOPs.

[1] https://lists.xenproject.org/archives/html/xen-devel/2016-09/msg01052.html
[2] https://lists.xenproject.org/archives/html/xen-devel/2016-06/msg02433.html

Paul Durrant (8):
  public / x86: Introduce __HYPERCALL_dm_op...
  dm_op: convert HVMOP_*ioreq_server*
  dm_op: convert HVMOP_track_dirty_vram
  dm_op: convert HVMOP_set_pci_intx_level, HVMOP_set_isa_irq_level,
and...
  dm_op: convert HVMOP_modified_memory
  dm_op: convert HVMOP_set_mem_type
  dm_op: convert HVMOP_inject_trap and HVMOP_inject_msi
  x86/hvm: serialize trap injecting producer and consumer

 docs/designs/dmop.markdown  | 158 +
 tools/flask/policy/modules/xen.if   |   8 +-
 tools/libxc/include/xenctrl.h   |  13 +-
 tools/libxc/xc_domain.c | 212 +--
 tools/libxc/xc_misc.c   | 235 +
 tools/libxc/xc_private.c|  70 
 tools/libxc/xc_private.h|   2 +
 xen/arch/x86/hvm/Makefile   |   1 +
 xen/arch/x86/hvm/dm.c   | 544 +
 xen/arch/x86/hvm/hvm.c  | 677 +---
 xen/arch/x86/hvm/ioreq.c|  36 +-
 xen/arch/x86/hvm/irq.c  |   7 +-
 xen/arch/x86/hypercall.c|   2 +
 xen/arch/x86/mm/hap/hap.c   |   2 +-
 xen/arch/x86/mm/shadow/common.c |   2 +-
 xen/include/asm-x86/hap.h   |   2 +-
 xen/include/asm-x86/hvm/domain.h|   3 +-
 xen/include/asm-x86/hvm/hvm.h   |   3 +
 xen/include/asm-x86/shadow.h|   2 +-
 xen/include/public/hvm/dm_op.h  | 373 
 xen/include/public/hvm/hvm_op.h | 230 +---
 xen/include/public/xen-compat.h |   2 +-
 xen/include/public/xen.h|   1 +
 xen/include/xen/hvm/irq.h   |   2 +-
 xen/include/xen/hypercall.h |   7 +
 xen/include/xsm/dummy.h |  36 +-
 xen/include/xsm/xsm.h   |  36 +-
 xen/xsm/dummy.c |   5 -
 xen/xsm/flask/hooks.c   |  37 +-
 xen/xsm/flask/policy/access_vectors |  15 +-
 30 files changed, 1409 insertions(+), 1314 deletions(-)
 create mode 100644 docs/designs/dmop.markdown
 create mode 100644 xen/arch/x86/hvm/dm.c
 create mode 100644 xen/include/public/hvm/dm_op.h

-- 
2.1.4




[Xen-devel] [PATCH v3 4/8] dm_op: convert HVMOP_set_pci_intx_level, HVMOP_set_isa_irq_level, and...

2017-01-12 Thread Paul Durrant
... HVMOP_set_pci_link_route

These HVMOPs were exposed to guests so their definitions need to be
preserved for compatibility. This patch therefore updates
__XEN_LATEST_INTERFACE_VERSION__ to 0x00040900 and makes the HVMOP
definitions conditional on __XEN_INTERFACE_VERSION__ less than that value.
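
As a rough illustration of that gating, the fragment below shows how a
definition can be kept visible only to code built against an older interface
version. It is a sketch, not a copy of the public headers:
LEGACY_HVMOP_VISIBLE and legacy_hvmop_visible() are hypothetical names, and
the value 0x00040800 is merely assumed as an example of an older consumer.

```c
/*
 * Assumption: a consumer pins an older interface version before including
 * the public headers; 0x00040800 is chosen purely for illustration.
 */
#ifndef __XEN_INTERFACE_VERSION__
#define __XEN_INTERFACE_VERSION__ 0x00040800
#endif

#if __XEN_INTERFACE_VERSION__ < 0x00040900
/* Legacy definitions stay visible to code built against the old interface. */
#define LEGACY_HVMOP_VISIBLE 1
#else
/* Newer consumers are expected to use the DMOP equivalents instead. */
#define LEGACY_HVMOP_VISIBLE 0
#endif

/* Report whether this translation unit sees the legacy definitions. */
int legacy_hvmop_visible(void)
{
    return LEGACY_HVMOP_VISIBLE;
}
```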

NOTE: This patch also widens the 'domain' parameter of
  xc_hvm_set_pci_intx_level() from a uint8_t to a uint16_t.

Suggested-by: Jan Beulich 
Signed-off-by: Paul Durrant 
---
Reviewed-by: Jan Beulich 
Cc: Daniel De Graaf 
Cc: Ian Jackson 
Acked-by: Wei Liu 
Cc: Andrew Cooper 

v3:
- Remove unnecessary padding.

v2:
- Interface version modification moved to this patch, where it is needed.
- Addressed several comments from Jan.
---
 tools/flask/policy/modules/xen.if   |   8 +--
 tools/libxc/include/xenctrl.h   |   2 +-
 tools/libxc/xc_misc.c   |  83 --
 xen/arch/x86/hvm/dm.c   |  72 +++
 xen/arch/x86/hvm/hvm.c  | 136 
 xen/arch/x86/hvm/irq.c  |   7 +-
 xen/include/public/hvm/dm_op.h  |  42 +++
 xen/include/public/hvm/hvm_op.h |   4 ++
 xen/include/public/xen-compat.h |   2 +-
 xen/include/xen/hvm/irq.h   |   2 +-
 xen/include/xsm/dummy.h |  18 -
 xen/include/xsm/xsm.h   |  18 -
 xen/xsm/dummy.c |   3 -
 xen/xsm/flask/hooks.c   |  15 
 xen/xsm/flask/policy/access_vectors |   6 --
 15 files changed, 158 insertions(+), 260 deletions(-)

diff --git a/tools/flask/policy/modules/xen.if 
b/tools/flask/policy/modules/xen.if
index 45e5b5f..092a6c5 100644
--- a/tools/flask/policy/modules/xen.if
+++ b/tools/flask/policy/modules/xen.if
@@ -57,8 +57,8 @@ define(`create_domain_common', `
allow $1 $2:shadow enable;
allow $1 $2:mmu { map_read map_write adjust memorymap physmap pinpage 
mmuext_op updatemp };
allow $1 $2:grant setup;
-   allow $1 $2:hvm { cacheattr getparam hvmctl irqlevel pciroute sethvmc
-   setparam pcilevel nested altp2mhvm altp2mhvm_op 
send_irq };
+   allow $1 $2:hvm { cacheattr getparam hvmctl sethvmc
+   setparam nested altp2mhvm altp2mhvm_op send_irq };
 ')
 
 # create_domain(priv, target)
@@ -93,7 +93,7 @@ define(`manage_domain', `
 #   (inbound migration is the same as domain creation)
 define(`migrate_domain_out', `
allow $1 domxen_t:mmu map_read;
-   allow $1 $2:hvm { gethvmc getparam irqlevel };
+   allow $1 $2:hvm { gethvmc getparam };
allow $1 $2:mmu { stat pageinfo map_read };
allow $1 $2:domain { getaddrsize getvcpucontext pause destroy };
allow $1 $2:domain2 gettsc;
@@ -151,7 +151,7 @@ define(`device_model', `
 
allow $1 $2_target:domain { getdomaininfo shutdown };
allow $1 $2_target:mmu { map_read map_write adjust physmap target_hack 
};
-   allow $1 $2_target:hvm { getparam setparam hvmctl irqlevel pciroute 
pcilevel cacheattr send_irq dm };
+   allow $1 $2_target:hvm { getparam setparam hvmctl cacheattr send_irq dm 
};
 ')
 
 # make_device_model(priv, dm_dom, hvm_dom)
diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h
index c7ee412..f819bf2 100644
--- a/tools/libxc/include/xenctrl.h
+++ b/tools/libxc/include/xenctrl.h
@@ -1594,7 +1594,7 @@ int xc_physdev_unmap_pirq(xc_interface *xch,
 
 int xc_hvm_set_pci_intx_level(
 xc_interface *xch, domid_t dom,
-uint8_t domain, uint8_t bus, uint8_t device, uint8_t intx,
+uint16_t domain, uint8_t bus, uint8_t device, uint8_t intx,
 unsigned int level);
 int xc_hvm_set_isa_irq_level(
 xc_interface *xch, domid_t dom,
diff --git a/tools/libxc/xc_misc.c b/tools/libxc/xc_misc.c
index 4c41d41..ddea2bb 100644
--- a/tools/libxc/xc_misc.c
+++ b/tools/libxc/xc_misc.c
@@ -470,33 +470,24 @@ int xc_getcpuinfo(xc_interface *xch, int max_cpus,
 
 int xc_hvm_set_pci_intx_level(
 xc_interface *xch, domid_t dom,
-uint8_t domain, uint8_t bus, uint8_t device, uint8_t intx,
+uint16_t domain, uint8_t bus, uint8_t device, uint8_t intx,
 unsigned int level)
 {
-DECLARE_HYPERCALL_BUFFER(struct xen_hvm_set_pci_intx_level, arg);
-int rc;
-
-arg = xc_hypercall_buffer_alloc(xch, arg, sizeof(*arg));
-if ( arg == NULL )
-{
-PERROR("Could not allocate memory for xc_hvm_set_pci_intx_level 
hypercall");
-return -1;
-}
+struct xen_dm_op op;
+struct xen_dm_op_set_pci_intx_level *data;
 
-arg->domid  = dom;
-arg->domain = domain;
-arg->bus= bus;
-arg->device = device;
-arg->intx   = intx;
-arg->level  = level;
+memset(&op, 0, sizeof(op));
 
-rc = xencall2(xch->xcall, __HYPERVISOR_hvm_op,
-  HVMOP_set_pci_intx_level,
-   

[Xen-devel] [PATCH v3 1/8] public / x86: Introduce __HYPERCALL_dm_op...

2017-01-12 Thread Paul Durrant
...as a set of hypercalls to be used by a device model.

As stated in the new docs/designs/dmop.markdown:

"The aim of DMOP is to prevent a compromised device model from
compromising domains other than the one it is associated with. (And is
therefore likely already compromised)."

See that file for further information.

This patch simply adds the boilerplate for the hypercall.
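
To make the buffer auditing idea from the design document concrete, here is a
minimal sketch of the kind of check a restricted privcmd driver might perform.
Everything in it is an assumption for illustration: struct dm_buf is a
simplified stand-in for xen_dm_op_buf, USER_SPACE_END is a made-up address
space split, and a real Linux driver would rely on access_ok() rather than
this arithmetic.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical user/kernel split; a real kernel would use access_ok(). */
#define USER_SPACE_END UINT64_C(0x0000800000000000)

/* Simplified stand-in for xen_dm_op_buf: an address plus a length. */
struct dm_buf {
    uint64_t addr;
    uint32_t size;
};

/* True if [addr, addr + size) lies wholly below USER_SPACE_END. */
static bool buf_in_user_space(const struct dm_buf *b)
{
    return b->addr < USER_SPACE_END &&
           b->size <= USER_SPACE_END - b->addr;
}

/*
 * Audit a DMOP-style request: the target domain must match the
 * restriction, and every buffer in the array must be a user-space range.
 * No knowledge of the individual operation's semantics is needed.
 */
static bool audit_dm_op(uint16_t restricted_domid, uint16_t domid,
                        const struct dm_buf *bufs, unsigned int nr_bufs)
{
    unsigned int i;

    if ( domid != restricted_domid )
        return false;

    for ( i = 0; i < nr_bufs; i++ )
        if ( !buf_in_user_space(&bufs[i]) )
            return false;

    return true;
}
```

The point of the design is visible in audit_dm_op(): since all user memory is
passed through the buffer array, the check is identical for every operation.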

Signed-off-by: Paul Durrant 
Suggested-by: Ian Jackson 
Suggested-by: Jennifer Herbert 
---
Cc: Ian Jackson 
Cc: Jennifer Herbert 
Cc: Daniel De Graaf 
Cc: Wei Liu 
Reviewed-by: Jan Beulich 
Cc: Andrew Cooper 

v3:
- Re-written large portions of dmop.markdown to remove references to
  previous proposals and make it a standalone design doc.

v2:
- Addressed several comments from Jan.
- Removed modification of __XEN_LATEST_INTERFACE_VERSION__ as it is not
  needed in this patch.
---
 docs/designs/dmop.markdown| 158 ++
 tools/flask/policy/modules/xen.if |   2 +-
 tools/libxc/include/xenctrl.h |   1 +
 tools/libxc/xc_private.c  |  70 +
 tools/libxc/xc_private.h  |   2 +
 xen/arch/x86/hvm/Makefile |   1 +
 xen/arch/x86/hvm/dm.c | 118 
 xen/arch/x86/hvm/hvm.c|   1 +
 xen/arch/x86/hypercall.c  |   2 +
 xen/include/public/hvm/dm_op.h|  71 +
 xen/include/public/xen.h  |   1 +
 xen/include/xen/hypercall.h   |   7 ++
 xen/include/xsm/dummy.h   |   6 ++
 xen/include/xsm/xsm.h |   6 ++
 xen/xsm/flask/hooks.c |   7 ++
 15 files changed, 452 insertions(+), 1 deletion(-)
 create mode 100644 docs/designs/dmop.markdown
 create mode 100644 xen/arch/x86/hvm/dm.c
 create mode 100644 xen/include/public/hvm/dm_op.h

diff --git a/docs/designs/dmop.markdown b/docs/designs/dmop.markdown
new file mode 100644
index 000..2a4bd16
--- /dev/null
+++ b/docs/designs/dmop.markdown
@@ -0,0 +1,158 @@
+DMOP
+
+
+Introduction
+
+
+The aim of DMOP is to prevent a compromised device model from compromising
+domains other than the one it is associated with. (And is therefore likely
+already compromised).
+
+The problem occurs when a device model issues a hypercall that
+includes references to user memory other than the operation structure
+itself, such as with Track dirty VRAM (as used in VGA emulation).
+In this case, the address of this other user memory needs to be vetted,
+to ensure it is not within restricted address ranges, such as kernel
+memory. The real problem comes down to how you would vet this address -
+the ideal place to do this is within the privcmd driver, without privcmd
+having to have specific knowledge of the hypercall's semantics.
+
+The Design
+--
+
+The privcmd driver implements a new restriction ioctl, which takes a domid
+parameter.  After that restriction ioctl is issued, the privcmd driver will
+permit only DMOP hypercalls, and only with the specified target domid.
+
+A DMOP hypercall consists of an array of buffers and lengths, with the
+first one containing the specific DMOP parameters. These can then reference
+further buffers from within the array. Since the only user buffers
+passed are those found within that array, they can all be audited by
+privcmd.
+
+The following code illustrates this idea:
+
+struct xen_dm_op {
+uint32_t op;
+};
+
+struct xen_dm_op_buf {
+XEN_GUEST_HANDLE_64(void) h;
+uint32_t size;
+};
+typedef struct xen_dm_op_buf xen_dm_op_buf_t;
+
+enum neg_errnoval
+HYPERVISOR_dm_op(domid_t domid,
+ xen_dm_op_buf_t bufs[],
+ unsigned int nr_bufs)
+
+@domid is the domain the hypercall operates on.
+@bufs points to an array of buffers where @bufs[0] contains a struct
+dm_op, describing the specific device model operation and its parameters.
+@bufs[1..] may be referenced in the parameters for the purposes of
+passing extra information to or from the domain.
+@nr_bufs is the number of buffers in the @bufs array.
+
+It is forbidden for the above struct (xen_dm_op) to contain any guest
+handles. If they are needed, they should instead be in
+HYPERVISOR_dm_op->bufs.
+
+Validation by privcmd driver
+
+
+If the privcmd driver has been restricted to a specific domain (using a
+ new ioctl), then when it receives an op, it will:
+
+1. Check hypercall is DMOP.
+
+2. Check domid == restricted domid.
+
+3. For each @nr_bufs in @bufs: Check @h and @size give a buffer
+   wholly in the user space part of the virtual address space. (e.g.
+   Linux will use access_ok()).
+
+
+Xen Implementation
+--
+
+Since DMOP buffers need to be copied from or to the guest, functions for
+doing this would be written as 

[Xen-devel] [PATCH v3 2/8] dm_op: convert HVMOP_*ioreq_server*

2017-01-12 Thread Paul Durrant
The definitions of HVM_IOREQSRV_BUFIOREQ_* have to persist as they are
already in use by callers of the libxc interface.

Suggested-by: Jan Beulich 
Signed-off-by: Paul Durrant 
--
Reviewed-by: Jan Beulich 
Cc: Ian Jackson 
Acked-by: Wei Liu 
Cc: Andrew Cooper 
Cc: Daniel De Graaf 

v3:
- Fix pad check.

v2:
- Addressed several comments from Jan.
---
 tools/libxc/xc_domain.c  | 212 -
 xen/arch/x86/hvm/dm.c|  89 
 xen/arch/x86/hvm/hvm.c   | 219 ---
 xen/arch/x86/hvm/ioreq.c |  36 +++
 xen/include/asm-x86/hvm/domain.h |   3 +-
 xen/include/public/hvm/dm_op.h   | 153 +++
 xen/include/public/hvm/hvm_op.h  | 132 +--
 xen/include/xsm/dummy.h  |   6 --
 xen/include/xsm/xsm.h|   6 --
 xen/xsm/dummy.c  |   1 -
 xen/xsm/flask/hooks.c|   6 --
 11 files changed, 356 insertions(+), 507 deletions(-)

diff --git a/tools/libxc/xc_domain.c b/tools/libxc/xc_domain.c
index 296b852..419a897 100644
--- a/tools/libxc/xc_domain.c
+++ b/tools/libxc/xc_domain.c
@@ -1417,24 +1417,24 @@ int xc_hvm_create_ioreq_server(xc_interface *xch,
int handle_bufioreq,
ioservid_t *id)
 {
-DECLARE_HYPERCALL_BUFFER(xen_hvm_create_ioreq_server_t, arg);
+struct xen_dm_op op;
+struct xen_dm_op_create_ioreq_server *data;
 int rc;
 
-arg = xc_hypercall_buffer_alloc(xch, arg, sizeof(*arg));
-if ( arg == NULL )
-return -1;
+memset(&op, 0, sizeof(op));
 
-arg->domid = domid;
-arg->handle_bufioreq = handle_bufioreq;
+op.op = XEN_DMOP_create_ioreq_server;
+data = &op.u.create_ioreq_server;
 
-rc = xencall2(xch->xcall, __HYPERVISOR_hvm_op,
-  HVMOP_create_ioreq_server,
-  HYPERCALL_BUFFER_AS_ARG(arg));
+data->handle_bufioreq = handle_bufioreq;
+
+rc = do_dm_op(xch, domid, 1, &op, sizeof(op));
+if ( rc )
+return rc;
 
-*id = arg->id;
+*id = data->id;
 
-xc_hypercall_buffer_free(xch, arg);
-return rc;
+return 0;
 }
 
 int xc_hvm_get_ioreq_server_info(xc_interface *xch,
@@ -1444,84 +1444,71 @@ int xc_hvm_get_ioreq_server_info(xc_interface *xch,
  xen_pfn_t *bufioreq_pfn,
  evtchn_port_t *bufioreq_port)
 {
-DECLARE_HYPERCALL_BUFFER(xen_hvm_get_ioreq_server_info_t, arg);
+struct xen_dm_op op;
+struct xen_dm_op_get_ioreq_server_info *data;
 int rc;
 
-arg = xc_hypercall_buffer_alloc(xch, arg, sizeof(*arg));
-if ( arg == NULL )
-return -1;
+memset(&op, 0, sizeof(op));
 
-arg->domid = domid;
-arg->id = id;
+op.op = XEN_DMOP_get_ioreq_server_info;
+data = &op.u.get_ioreq_server_info;
 
-rc = xencall2(xch->xcall, __HYPERVISOR_hvm_op,
-  HVMOP_get_ioreq_server_info,
-  HYPERCALL_BUFFER_AS_ARG(arg));
-if ( rc != 0 )
-goto done;
+data->id = id;
+
+rc = do_dm_op(xch, domid, 1, &op, sizeof(op));
+if ( rc )
+return rc;
 
 if ( ioreq_pfn )
-*ioreq_pfn = arg->ioreq_pfn;
+*ioreq_pfn = data->ioreq_pfn;
 
 if ( bufioreq_pfn )
-*bufioreq_pfn = arg->bufioreq_pfn;
+*bufioreq_pfn = data->bufioreq_pfn;
 
 if ( bufioreq_port )
-*bufioreq_port = arg->bufioreq_port;
+*bufioreq_port = data->bufioreq_port;
 
-done:
-xc_hypercall_buffer_free(xch, arg);
-return rc;
+return 0;
 }
 
 int xc_hvm_map_io_range_to_ioreq_server(xc_interface *xch, domid_t domid,
 ioservid_t id, int is_mmio,
 uint64_t start, uint64_t end)
 {
-DECLARE_HYPERCALL_BUFFER(xen_hvm_io_range_t, arg);
-int rc;
+struct xen_dm_op op;
+struct xen_dm_op_ioreq_server_range *data;
 
-arg = xc_hypercall_buffer_alloc(xch, arg, sizeof(*arg));
-if ( arg == NULL )
-return -1;
+memset(&op, 0, sizeof(op));
 
-arg->domid = domid;
-arg->id = id;
-arg->type = is_mmio ? HVMOP_IO_RANGE_MEMORY : HVMOP_IO_RANGE_PORT;
-arg->start = start;
-arg->end = end;
+op.op = XEN_DMOP_map_io_range_to_ioreq_server;
+data = &op.u.map_io_range_to_ioreq_server;
 
-rc = xencall2(xch->xcall, __HYPERVISOR_hvm_op,
-  HVMOP_map_io_range_to_ioreq_server,
-  HYPERCALL_BUFFER_AS_ARG(arg));
+data->id = id;
+data->type = is_mmio ? XEN_DMOP_IO_RANGE_MEMORY : XEN_DMOP_IO_RANGE_PORT;
+data->start = start;
+data->end = end;
 
-xc_hypercall_buffer_free(xch, arg);
-return rc;
+return do_dm_op(xch, domid, 1, &op, sizeof(op));
 }
 
 int xc_hvm_unmap_io_range_from_ioreq_server(xc_interface *xch, domid_t 

Re: [Xen-devel] [PATCH v5 2/4] x86emul: support VME and PVI

2017-01-12 Thread Andrew Cooper
On 12/01/17 14:15, Jan Beulich wrote:
> ... affecting PUSHF, POPF, CLI, and STI.
>
> Signed-off-by: Jan Beulich 

Reviewed-by: Andrew Cooper 



Re: [Xen-devel] [PATCH v5 1/4] x86emul: conditionally clear BNDn for branches

2017-01-12 Thread Andrew Cooper
On 12/01/17 14:15, Jan Beulich wrote:
> Considering that we surface MPX to HVM guests, instructions we emulate
> should also correctly deal with MPX state. While for now BND*
> instructions don't get emulated, the effect of branches (which we do
> emulate) without BND prefix should be taken care of.
>
> No need to alter XABORT behavior: While not mentioned in the SDM so
> far, this restores BNDn as they were at the XBEGIN, and since we make
> XBEGIN abort right away, XABORT in the emulator is only a no-op.
>
> Signed-off-by: Jan Beulich 

Reviewed-by: Andrew Cooper 



[Xen-devel] [PATCH v5 3/4] x86emul: use switch()-wide local variable 'cr4'

2017-01-12 Thread Jan Beulich
... rather than various smaller scope ones.

Signed-off-by: Jan Beulich 
Reviewed-by: Andrew Cooper 
---
v2: Re-base over PUSHF adjustment in earlier patch.

--- a/xen/arch/x86/x86_emulate/x86_emulate.c
+++ b/xen/arch/x86/x86_emulate/x86_emulate.c
@@ -862,13 +862,10 @@ do {
 #define put_fpu(_fic)   \
 do {\
 _put_fpu(); \
-if( (_fic)->exn_raised == EXC_XM && ops->read_cr )  \
-{   \
-unsigned long cr4;  \
-if ( (ops->read_cr(4, &cr4, ctxt) == X86EMUL_OKAY) &&   \
- !(cr4 & CR4_OSXMMEXCPT) )  \
-(_fic)->exn_raised = EXC_UD;\
-}   \
+if ( (_fic)->exn_raised == EXC_XM && ops->read_cr &&\
+ ops->read_cr(4, &cr4, ctxt) == X86EMUL_OKAY && \
+ !(cr4 & CR4_OSXMMEXCPT) )  \
+(_fic)->exn_raised = EXC_UD;\
 generate_exception_if((_fic)->exn_raised >= 0,  \
   (_fic)->exn_raised);  \
 } while (0)
@@ -1183,7 +1180,7 @@ _mode_iopl(
 _iopl;  \
 })
 #define mode_vif() ({\
-unsigned long cr4 = 0;   \
+cr4 = 0; \
 if ( ops->read_cr && get_cpl(ctxt, ops) == 3 )   \
 {\
 rc = ops->read_cr(4, &cr4, ctxt);\
@@ -2783,6 +2780,7 @@ x86_emulate(
 {
 enum x86_segment seg;
 struct segment_register cs, sreg;
+unsigned long cr4;
 
 case 0x00 ... 0x05: add: /* add */
 emulate_2op_SrcV("add", src, dst, _regs._eflags);
@@ -3281,8 +3279,7 @@ x86_emulate(
 if ( (_regs._eflags & EFLG_VM) &&
  MASK_EXTR(_regs._eflags, EFLG_IOPL) != 3 )
 {
-unsigned long cr4 = 0;
-
+cr4 = 0;
 if ( op_bytes == 2 && ops->read_cr )
 {
 rc = ops->read_cr(4, &cr4, ctxt);
@@ -3300,8 +3297,8 @@ x86_emulate(
 
 case 0x9d: /* popf */ {
 uint32_t mask = EFLG_VIP | EFLG_VIF | EFLG_VM;
-unsigned long cr4 = 0;
 
+cr4 = 0;
 if ( !mode_ring0() )
 {
 if ( _regs._eflags & EFLG_VM )
@@ -4586,9 +4583,6 @@ x86_emulate(
 
 #ifdef __XEN__
 case 0xd1: /* xsetbv */
-{
-unsigned long cr4;
-
 generate_exception_if(vex.pfx, EXC_UD);
 if ( !ops->read_cr || ops->read_cr(4, &cr4, ctxt) != X86EMUL_OKAY )
 cr4 = 0;
@@ -4598,7 +4592,6 @@ x86_emulate(
 _regs._eax | (_regs.rdx << 
32)),
   EXC_GP, 0);
 goto no_writeback;
-}
 #endif
 
 case 0xd4: /* vmfunc */
@@ -5126,8 +5119,8 @@ x86_emulate(
 break;
 
 case X86EMUL_OPC(0x0f, 0x31): rdtsc: /* rdtsc */ {
-unsigned long cr4;
 uint64_t val;
+
 if ( !mode_ring0() )
 {
 fail_if(ops->read_cr == NULL);
@@ -5499,9 +5492,6 @@ x86_emulate(
 break;
 
 case X86EMUL_OPC_F3(0x0f, 0xae): /* Grp15 */
-{
-unsigned long cr4;
-
 fail_if(modrm_mod != 3);
 generate_exception_if((modrm_reg & 4) || !mode_64bit(), EXC_UD);
 fail_if(!ops->read_cr);
@@ -5536,7 +5526,6 @@ x86_emulate(
 goto done;
 }
 break;
-}
 
 case X86EMUL_OPC(0x0f, 0xaf): /* imul */
 emulate_2op_SrcV_srcmem("imul", src, dst, _regs._eflags);



[Xen-devel] [PATCH v5 4/4] x86emul: improve CR/DR access handling

2017-01-12 Thread Jan Beulich
- don't accept LOCK for DR accesses (it's undefined in the manuals)
- only accept LOCK for CR accesses when the respective feature flag is
  set (which would not normally be the case for Intel)
- add (rather than or) 8 when LOCK is present; real hardware #UDs
  when both REX.W and LOCK are present, implying that these would
  rather access hypothetical CR16...23
- eliminate explicit decode_register() calls
- streamline remaining read/write code

No further functional change, i.e. not addressing the missing exception
generation (#UD for invalid CR/DR encodings, #GP(0) for invalid write
values, #DB for DR accesses with DR7.GD set).
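
The difference between adding and or-ing 8 can be shown with a small
standalone sketch. The two helpers below are hypothetical and exist only to
contrast the arithmetic; they assume modrm_reg already holds the ModRM.reg
field as extended by any REX prefix, i.e. a value in the range 0..15.

```c
#include <stdbool.h>

/* New behaviour: LOCK adds 8 to the register number. */
static unsigned int cr_index_add(unsigned int modrm_reg, bool lock_prefix)
{
    return lock_prefix ? modrm_reg + 8 : modrm_reg;
}

/* Old behaviour: LOCK merely sets bit 3 of the register number. */
static unsigned int cr_index_or(unsigned int modrm_reg, bool lock_prefix)
{
    return modrm_reg | ((unsigned int)lock_prefix << 3);
}
```

For register numbers 0..7 both variants agree. Once the number is already in
the 8..15 range, or-ing leaves it unchanged, whereas adding yields 16..23,
which matches the hypothetical CR16...23 mentioned above.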

Signed-off-by: Jan Beulich 
Reviewed-by: Andrew Cooper 

--- a/xen/arch/x86/x86_emulate/x86_emulate.c
+++ b/xen/arch/x86/x86_emulate/x86_emulate.c
@@ -194,7 +194,8 @@ static const opcode_desc_t twobyte_table
 ImplicitOps|ModRM, ImplicitOps|ModRM, ImplicitOps|ModRM, ImplicitOps|ModRM,
 ImplicitOps|ModRM, ImplicitOps|ModRM, ImplicitOps|ModRM, ImplicitOps|ModRM,
 /* 0x20 - 0x27 */
-ImplicitOps|ModRM, ImplicitOps|ModRM, ImplicitOps|ModRM, ImplicitOps|ModRM,
+DstMem|SrcImplicit|ModRM, DstMem|SrcImplicit|ModRM,
+DstImplicit|SrcMem|ModRM, DstImplicit|SrcMem|ModRM,
 0, 0, 0, 0,
 /* 0x28 - 0x2F */
 ImplicitOps|ModRM, ImplicitOps|ModRM, ImplicitOps|ModRM, ImplicitOps|ModRM,
@@ -1320,6 +1321,7 @@ static bool vcpu_has(
 #define vcpu_has_movbe()   vcpu_has( 1, ECX, 22, ctxt, ops)
 #define vcpu_has_avx() vcpu_has( 1, ECX, 28, ctxt, ops)
 #define vcpu_has_lahf_lm() vcpu_has(0x8001, ECX,  0, ctxt, ops)
+#define vcpu_has_cr8_legacy()  vcpu_has(0x8001, ECX,  4, ctxt, ops)
 #define vcpu_has_lzcnt()   vcpu_has(0x8001, ECX,  5, ctxt, ops)
 #define vcpu_has_misalignsse() vcpu_has(0x8001, ECX,  7, ctxt, ops)
 #define vcpu_has_bmi1()vcpu_has( 7, EBX,  3, ctxt, ops)
@@ -2047,6 +2049,19 @@ x86_decode_twobyte(
 case 0xd0 ... 0xfe:
 ctxt->opcode |= MASK_INSR(vex.pfx, X86EMUL_OPC_PFX_MASK);
 break;
+
+case 0x20: case 0x22: /* mov to/from cr */
+if ( lock_prefix && vcpu_has_cr8_legacy() )
+{
+modrm_reg += 8;
+lock_prefix = false;
+}
+/* fall through */
+case 0x21: case 0x23: /* mov to/from dr */
+generate_exception_if(lock_prefix || ea.type != OP_REG, EXC_UD);
+op_bytes = mode_64bit() ? 8 : 4;
+break;
+
 /* Intentionally not handling here despite being modified by F3:
 case 0xb8: jmpe / popcnt
 case 0xbc: bsf / tzcnt
@@ -2683,14 +2698,10 @@ x86_emulate(
 case DstNone: /* case DstImplicit: */
 /*
  * The only implicit-operands instructions allowed a LOCK prefix are
- * CMPXCHG{8,16}B, MOV CRn, MOV DRn.
+ * CMPXCHG{8,16}B (MOV CRn is being handled elsewhere).
  */
-generate_exception_if(
-lock_prefix &&
-(ext != ext_0f ||
- (((b < 0x20) || (b > 0x23)) && /* MOV CRn/DRn */
-  (b != 0xc7))),/* CMPXCHG{8,16}B */
-EXC_UD);
+generate_exception_if(lock_prefix && (ext != ext_0f || b != 0xc7),
+  EXC_UD);
 dst.type = OP_NONE;
 break;
 
@@ -5074,38 +5085,25 @@ x86_emulate(
 case X86EMUL_OPC(0x0f, 0x21): /* mov dr,reg */
 case X86EMUL_OPC(0x0f, 0x22): /* mov reg,cr */
 case X86EMUL_OPC(0x0f, 0x23): /* mov reg,dr */
-generate_exception_if(ea.type != OP_REG, EXC_UD);
 generate_exception_if(!mode_ring0(), EXC_GP, 0);
-modrm_reg |= lock_prefix << 3;
 if ( b & 2 )
 {
 /* Write to CR/DR. */
-src.val = *(unsigned long *)decode_register(modrm_rm, &_regs, 0);
-if ( !mode_64bit() )
-src.val = (uint32_t)src.val;
-rc = ((b & 1)
-  ? (ops->write_dr
- ? ops->write_dr(modrm_reg, src.val, ctxt)
- : X86EMUL_UNHANDLEABLE)
-  : (ops->write_cr
- ? ops->write_cr(modrm_reg, src.val, ctxt)
- : X86EMUL_UNHANDLEABLE));
+typeof(ops->write_cr) write = (b & 1) ? ops->write_dr
+  : ops->write_cr;
+
+fail_if(!write);
+rc = write(modrm_reg, src.val, ctxt);
 }
 else
 {
 /* Read from CR/DR. */
-dst.type  = OP_REG;
-dst.bytes = mode_64bit() ? 8 : 4;
-dst.reg   = decode_register(modrm_rm, &_regs, 0);
-rc = ((b & 1)
-  ? (ops->read_dr
- ? ops->read_dr(modrm_reg, &dst.val, ctxt)
- : X86EMUL_UNHANDLEABLE)
-  : (ops->read_cr
- ? ops->read_cr(modrm_reg, &dst.val, ctxt)
- : X86EMUL_UNHANDLEABLE));
+typeof(ops->read_cr) 

Re: [Xen-devel] [PATCH] x86/cpuid: Move vendor/family/model information from arch_domain to cpuid_policy

2017-01-12 Thread Andrew Cooper
On 12/01/17 14:13, Jan Beulich wrote:
 On 12.01.17 at 15:02,  wrote:
>> I did make get_cpu_vendor() quite a lot better than it was previously,
>> but it is still searching a loop.  For the extra 3 bytes of data, I
>> still think pre-calculating the values is worth it.
> Well, as said, the question isn't the extra amount of data, but the
> redundancy (and hence risk of going out of sync). Could we meet
> in the middle and agree on just caching vendor, but not the other
> two items?

I was considering that.  Family is only used when writing to cf8, and is
far less overhead to recalculate than vendor, as it is straight
arithmetic on a uint32_t.
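
For illustration, the "straight arithmetic" on CPUID leaf 1 EAX can be sketched as below. This is a minimal sketch, not Xen's get_cpu_family(): the name decode_family() is hypothetical, and applying the extended model field for any family >= 6 is an assumption about the convention in use.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not Xen's get_cpu_family()): decode family and
 * model from CPUID leaf 1 EAX using only shifts and masks.
 */
static unsigned int decode_family(uint32_t eax, unsigned int *model)
{
    unsigned int fam = (eax >> 8) & 0xf;   /* base family, bits 11:8 */
    unsigned int mod = (eax >> 4) & 0xf;   /* base model, bits 7:4 */

    if ( fam == 0xf )
        fam += (eax >> 20) & 0xff;         /* extended family, bits 27:20 */
    if ( fam >= 6 )                        /* assumed convention */
        mod |= ((eax >> 16) & 0xf) << 4;   /* extended model, bits 19:16 */

    if ( model )
        *model = mod;
    return fam;
}
```

No table walk or string comparison is involved, which is why recomputing it on demand costs little compared with the vendor lookup.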

I will split the patch in two and go down this route.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [PATCH v5 2/4] x86emul: support VME and PVI

2017-01-12 Thread Jan Beulich
... affecting PUSHF, POPF, CLI, and STI.

Signed-off-by: Jan Beulich 
---
v5: Add PUSHF adjustment. mode_pvi() -> mode_vif().

--- a/xen/arch/x86/x86_emulate/x86_emulate.c
+++ b/xen/arch/x86/x86_emulate/x86_emulate.c
@@ -433,6 +433,8 @@ typedef union {
 #define CR0_EM    (1<<2)
 #define CR0_TS    (1<<3)
 
+#define CR4_VME        (1<<0)
+#define CR4_PVI        (1<<1)
 #define CR4_TSD        (1<<2)
 #define CR4_OSFXSR     (1<<9)
 #define CR4_OSXMMEXCPT (1<<10)
@@ -1180,6 +1182,15 @@ _mode_iopl(
 fail_if(_iopl < 0); \
 _iopl;  \
 })
+#define mode_vif() ({\
+unsigned long cr4 = 0;   \
+if ( ops->read_cr && get_cpl(ctxt, ops) == 3 )   \
+{\
+rc = ops->read_cr(4, &cr4, ctxt);\
+if ( rc != X86EMUL_OKAY ) goto done; \
+}\
+!!(cr4 & (_regs._eflags & EFLG_VM ? CR4_VME : CR4_PVI)); \
+})
 
 static int ioport_access_check(
 unsigned int first_port,
@@ -3267,20 +3278,44 @@ x86_emulate(
 break;
 
 case 0x9c: /* pushf */
-generate_exception_if((_regs._eflags & EFLG_VM) &&
-  MASK_EXTR(_regs._eflags, EFLG_IOPL) != 3,
-  EXC_GP, 0);
-src.val = _regs.r(flags) & ~(EFLG_VM | EFLG_RF);
+if ( (_regs._eflags & EFLG_VM) &&
+ MASK_EXTR(_regs._eflags, EFLG_IOPL) != 3 )
+{
+unsigned long cr4 = 0;
+
+if ( op_bytes == 2 && ops->read_cr )
+{
+rc = ops->read_cr(4, &cr4, ctxt);
+if ( rc != X86EMUL_OKAY )
+goto done;
+}
+generate_exception_if(!(cr4 & CR4_VME), EXC_GP, 0);
+src.val = (_regs.flags & ~EFLG_IF) | EFLG_IOPL;
+if ( _regs._eflags & EFLG_VIF )
+src.val |= EFLG_IF;
+}
+else
+src.val = _regs.r(flags) & ~(EFLG_VM | EFLG_RF);
 goto push;
 
 case 0x9d: /* popf */ {
 uint32_t mask = EFLG_VIP | EFLG_VIF | EFLG_VM;
+unsigned long cr4 = 0;
 
 if ( !mode_ring0() )
 {
-generate_exception_if((_regs._eflags & EFLG_VM) &&
-  MASK_EXTR(_regs._eflags, EFLG_IOPL) != 3,
-  EXC_GP, 0);
+if ( _regs._eflags & EFLG_VM )
+{
+if ( op_bytes == 2 && ops->read_cr )
+{
+rc = ops->read_cr(4, &cr4, ctxt);
+if ( rc != X86EMUL_OKAY )
+goto done;
+}
+generate_exception_if(!(cr4 & CR4_VME) &&
+  MASK_EXTR(_regs._eflags, EFLG_IOPL) != 3,
+  EXC_GP, 0);
+}
 mask |= EFLG_IOPL;
 if ( !mode_iopl() )
 mask |= EFLG_IF;
@@ -3292,7 +3327,20 @@ x86_emulate(
   &dst.val, op_bytes, ctxt, ops)) != 0 )
 goto done;
 if ( op_bytes == 2 )
+{
 dst.val = (uint16_t)dst.val | (_regs._eflags & 0xffff0000u);
+if ( cr4 & CR4_VME )
+{
+if ( dst.val & EFLG_IF )
+{
+generate_exception_if(_regs._eflags & EFLG_VIP, EXC_GP, 0);
+dst.val |= EFLG_VIF;
+}
+else
+dst.val &= ~EFLG_VIF;
+mask &= ~EFLG_VIF;
+}
+}
 dst.val &= EFLAGS_MODIFIABLE;
 _regs._eflags &= mask;
 _regs._eflags |= (dst.val & ~mask) | EFLG_MBS;
@@ -4399,16 +4447,29 @@ x86_emulate(
 break;
 
 case 0xfa: /* cli */
-generate_exception_if(!mode_iopl(), EXC_GP, 0);
-_regs._eflags &= ~EFLG_IF;
+if ( mode_iopl() )
+_regs._eflags &= ~EFLG_IF;
+else
+{
+generate_exception_if(!mode_vif(), EXC_GP, 0);
+_regs._eflags &= ~EFLG_VIF;
+}
 break;
 
 case 0xfb: /* sti */
-generate_exception_if(!mode_iopl(), EXC_GP, 0);
-if ( !(_regs._eflags & EFLG_IF) )
+if ( mode_iopl() )
 {
+if ( !(_regs._eflags & EFLG_IF) )
+ctxt->retire.sti = true;
 _regs._eflags |= EFLG_IF;
-ctxt->retire.sti = true;
+}
+else
+{
+generate_exception_if((_regs._eflags & EFLG_VIP) || !mode_vif(),
+  EXC_GP, 0);
+if ( !(_regs._eflags & EFLG_VIF) )
+ctxt->retire.sti = true;
+_regs._eflags |= EFLG_VIF;
 }
 break;
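
The CLI/STI behaviour introduced above can be summarised with a small standalone model. This is only a sketch under stated assumptions: emul_cli(), emul_sti(), the outcome enum and the two boolean inputs (standing in for mode_iopl() and mode_vif()) are hypothetical names, and the interrupt-shadow tracking done via ctxt->retire.sti is left out.

```c
#include <stdbool.h>
#include <stdint.h>

#define FLG_IF   (1u << 9)   /* EFLAGS.IF */
#define FLG_VIF  (1u << 19)  /* EFLAGS.VIF */
#define FLG_VIP  (1u << 20)  /* EFLAGS.VIP */

enum outcome { OUT_OK, OUT_GP };

/* iopl_ok models mode_iopl(); vif_mode models mode_vif(). */
static enum outcome emul_cli(uint32_t *eflags, bool iopl_ok, bool vif_mode)
{
    if ( iopl_ok )
        *eflags &= ~FLG_IF;      /* sufficient privilege: real IF */
    else if ( vif_mode )
        *eflags &= ~FLG_VIF;     /* VME/PVI: only the virtual flag */
    else
        return OUT_GP;
    return OUT_OK;
}

static enum outcome emul_sti(uint32_t *eflags, bool iopl_ok, bool vif_mode)
{
    if ( iopl_ok )
        *eflags |= FLG_IF;
    else if ( !vif_mode || (*eflags & FLG_VIP) )
        return OUT_GP;           /* pending virtual interrupt faults */
    else
        *eflags |= FLG_VIF;
    return OUT_OK;
}
```

The VIP check on the STI path mirrors the generate_exception_if() in the hunk above: setting VIF while a virtual interrupt is pending has to fault so that the monitor can deliver it.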
 




[Xen-devel] [PATCH v5 1/4] x86emul: conditionally clear BNDn for branches

2017-01-12 Thread Jan Beulich
Considering that we surface MPX to HVM guests, instructions we emulate
should also correctly deal with MPX state. While for now BND*
instructions don't get emulated, the effect of branches (which we do
emulate) without BND prefix should be taken care of.

No need to alter XABORT behavior: While not mentioned in the SDM so
far, this restores BNDn as they were at the XBEGIN, and since we make
XBEGIN abort right away, XABORT in the emulator is only a no-op.

Signed-off-by: Jan Beulich 
---
v5: Add cpu_has_mpx check to adjust_bnd().
v4: Re-base. Rename clear_bnd() to adjust_bnd(). Add
ASSERT_UNREACHABLE() to xstate_set_init(), and consistently use
set_xcr0() instead of xsetbv() there. Drop use of XSAVEOPT in
read_bndcfgu().
v3: Re-base.
v2: Re-base. Address all RFC reasons based on feedback from Intel.
Re-work the actual clearing of BNDn.

--- a/tools/tests/x86_emulator/x86_emulate.c
+++ b/tools/tests/x86_emulator/x86_emulate.c
@@ -7,6 +7,9 @@
 
 #define cpu_has_amd_erratum(nr) 0
 #define mark_regs_dirty(r) ((void)(r))
+#define cpu_has_mpx false
+#define read_bndcfgu() 0
+#define xstate_set_init(what)
 
 /* For generic assembly code: use macros to define operation/operand sizes. */
 #ifdef __i386__
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -391,6 +391,8 @@ int vcpu_initialise(struct vcpu *v)
 
 vmce_init_vcpu(v);
 }
+else if ( (rc = xstate_alloc_save_area(v)) != 0 )
+return rc;
 
 spin_lock_init(&v->arch.vpmu.vpmu_lock);
 
--- a/xen/arch/x86/x86_emulate/x86_emulate.c
+++ b/xen/arch/x86/x86_emulate/x86_emulate.c
@@ -417,6 +417,9 @@ typedef union {
 #define MSR_SYSENTER_EIP 0x00000176
 #define MSR_DEBUGCTL     0x000001d9
 #define DEBUGCTL_BTF         (1 << 1)
+#define MSR_BNDCFGS      0x00000d90
+#define BNDCFG_ENABLE        (1 << 0)
+#define BNDCFG_PRESERVE      (1 << 1)
 #define MSR_EFER         0xc0000080
 #define MSR_STAR         0xc0000081
 #define MSR_LSTAR        0xc0000082
@@ -1314,6 +1317,7 @@ static bool vcpu_has(
 #define vcpu_has_bmi1()        vcpu_has( 7, EBX,  3, ctxt, ops)
 #define vcpu_has_hle()         vcpu_has( 7, EBX,  4, ctxt, ops)
 #define vcpu_has_rtm()         vcpu_has( 7, EBX, 11, ctxt, ops)
+#define vcpu_has_mpx()         vcpu_has( 7, EBX, 14, ctxt, ops)
 #define vcpu_has_smap()        vcpu_has( 7, EBX, 20, ctxt, ops)
 #define vcpu_has_clflushopt()  vcpu_has( 7, EBX, 23, ctxt, ops)
 #define vcpu_has_clwb()        vcpu_has( 7, EBX, 24, ctxt, ops)
@@ -1836,6 +1840,34 @@ static int inject_swint(enum x86_swint_t
 generate_exception(fault_type, error_code);
 }
 
+static void adjust_bnd(struct x86_emulate_ctxt *ctxt,
+   const struct x86_emulate_ops *ops, enum vex_pfx pfx)
+{
+uint64_t bndcfg;
+int rc;
+
+if ( pfx == vex_f2 || !cpu_has_mpx || !vcpu_has_mpx() )
+return;
+
+if ( !mode_ring0() )
+bndcfg = read_bndcfgu();
+else if ( !ops->read_msr ||
+  ops->read_msr(MSR_BNDCFGS, &bndcfg, ctxt) != X86EMUL_OKAY )
+return;
+if ( (bndcfg & BNDCFG_ENABLE) && !(bndcfg & BNDCFG_PRESERVE) )
+{
+/*
+ * Using BNDMK or any other MPX instruction here is pointless, as
+ * we run with MPX disabled ourselves, and hence they're all no-ops.
+ * Therefore we have two ways to clear BNDn: Enable MPX temporarily
+ * (in which case executing any suitable non-prefixed branch
+ * instruction would do), or use XRSTOR.
+ */
+xstate_set_init(XSTATE_BNDREGS);
+}
+ done:;
+}
+
 int x86emul_unhandleable_rw(
 enum x86_segment seg,
 unsigned long offset,
@@ -3072,6 +3104,7 @@ x86_emulate(
 case 0x70 ... 0x7f: /* jcc (short) */
 if ( test_cc(b, _regs._eflags) )
 jmp_rel((int32_t)src.val);
+adjust_bnd(ctxt, ops, vex.pfx);
 break;
 
 case 0x82: /* Grp1 (x86/32 only) */
@@ -3424,6 +3457,7 @@ x86_emulate(
  (rc = ops->insn_fetch(x86_seg_cs, dst.val, NULL, 0, ctxt)) )
 goto done;
 _regs.r(ip) = dst.val;
+adjust_bnd(ctxt, ops, vex.pfx);
 break;
 
 case 0xc4: /* les */
@@ -4137,12 +4171,15 @@ x86_emulate(
 op_bytes = ((op_bytes == 4) && mode_64bit()) ? 8 : op_bytes;
 src.val = _regs.r(ip);
 jmp_rel(rel);
+adjust_bnd(ctxt, ops, vex.pfx);
 goto push;
 }
 
 case 0xe9: /* jmp (near) */
 case 0xeb: /* jmp (short) */
 jmp_rel((int32_t)src.val);
+if ( !(b & 2) )
+adjust_bnd(ctxt, ops, vex.pfx);
 break;
 
 case 0xea: /* jmp (far, absolute) */
@@ -4402,12 +4439,14 @@ x86_emulate(
 goto done;
 _regs.r(ip) = src.val;
 src.val = dst.val;
+adjust_bnd(ctxt, ops, vex.pfx);
 goto push;
 case 4: /* jmp (near) */
 if ( (rc = ops->insn_fetch(x86_seg_cs, src.val, NULL, 0, ctxt)) )
 goto done;
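
The condition under which adjust_bnd() resets the bound registers can be written as a pure predicate. This is a sketch only: branch_clears_bnd() and its parameters are hypothetical names, and the caller is assumed to have already picked the right configuration value (BNDCFGU for user mode, the BNDCFGS MSR for ring 0).

```c
#include <stdbool.h>
#include <stdint.h>

#define BNDCFG_ENABLE   (1u << 0)
#define BNDCFG_PRESERVE (1u << 1)

/*
 * Hypothetical predicate: does an emulated branch have to put BND0-3
 * back into their init state?
 */
static bool branch_clears_bnd(bool has_bnd_prefix, bool mpx_available,
                              uint64_t bndcfg)
{
    /* A BND (F2) prefix keeps the registers; without MPX nothing to do. */
    if ( has_bnd_prefix || !mpx_available )
        return false;

    /* Clear only if MPX is enabled and BNDPRESERVE is not set. */
    return (bndcfg & BNDCFG_ENABLE) && !(bndcfg & BNDCFG_PRESERVE);
}
```

Keeping the decision separate from the actual clearing (done through xstate_set_init() in the patch) makes the four input combinations easy to check in isolation.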
   

Re: [Xen-devel] [PATCH] x86/cpuid: Move vendor/family/model information from arch_domain to cpuid_policy

2017-01-12 Thread Jan Beulich
>>> On 12.01.17 at 15:02,  wrote:
> I did make get_cpu_vendor() quite a lot better than it was previously,
> but it is still searching a loop.  For the extra 3 bytes of data, I
> still think pre-calculating the values is worth it.

Well, as said, the question isn't the extra amount of data, but the
redundancy (and hence risk of going out of sync). Could we meet
in the middle and agree on just caching vendor, but not the other
two items?

Jan




[Xen-devel] [PATCH v5 0/4] x86emul: further misc improvements

2017-01-12 Thread Jan Beulich
1: conditionally clear BNDn for branches
2: support VME and PVI
3: use switch()-wide local variable 'cr4'
4: improve CR/DR access handling

Signed-off-by: Jan Beulich 





Re: [Xen-devel] [PATCH] x86/cpuid: Move vendor/family/model information from arch_domain to cpuid_policy

2017-01-12 Thread Andrew Cooper
On 12/01/17 13:40, Jan Beulich wrote:
 On 12.01.17 at 13:32,  wrote:
>> --- a/xen/arch/x86/domctl.c
>> +++ b/xen/arch/x86/domctl.c
>> @@ -78,12 +78,11 @@ static void update_domain_cpuid_info(struct domain *d,
>>  switch ( ctl->input[0] )
>>  {
>>  case 0: {
>> -int old_vendor = d->arch.x86_vendor;
>> +int old_vendor = p->x86_vendor;
>>  
>> -d->arch.x86_vendor = get_cpu_vendor(
>> -ctl->ebx, ctl->ecx, ctl->edx, gcv_guest);
>> +p->x86_vendor = get_cpu_vendor(ctl->ebx, ctl->ecx, ctl->edx, gcv_guest);
>>  
>> -if ( is_hvm_domain(d) && (d->arch.x86_vendor != old_vendor) )
>> +if ( is_hvm_domain(d) && (p->x86_vendor != old_vendor) )
>>  {
>>  struct vcpu *v;
>>  
>> @@ -95,7 +94,7 @@ static void update_domain_cpuid_info(struct domain *d,
>>  }
>>  
>>  case 1:
>> -d->arch.x86 = get_cpu_family(ctl->eax, &d->arch.x86_model, NULL);
>> +p->x86_family = get_cpu_family(ctl->eax, &p->x86_model, NULL);
>>  
>>  if ( is_pv_domain(d) && ((levelling_caps & LCAP_1cd) == LCAP_1cd) )
>>  {
> Considering that the three fields can be calculated from other
> CPUID data, is it really worthwhile to store these redundant pieces
> of information, instead of having consumers simply call
> get_cpu_{vendor,family}()? All we "gain" by storing them is the risk
> of them going out of sync.

x86_model isn't actually used anywhere.  x86_family is only used in
hvm_select_ioreq_server() when handling AMD extended config space.

x86_vendor however is used quite a lot (all paths into x86_emulate(),
and the IOREQ infrastructure underneath).

I did make get_cpu_vendor() quite a lot better than it was previously,
but it is still searching a loop.  For the extra 3 bytes of data, I
still think pre-calculating the values is worth it.
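
To make the cost being discussed concrete, a vendor lookup of this kind is essentially a table walk over register triples. The sketch below is an assumption-laden illustration, not Xen's get_cpu_vendor(): lookup_vendor(), the enum and the two-entry table are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

enum vendor { VENDOR_INTEL, VENDOR_AMD, VENDOR_UNKNOWN };

/* CPUID leaf 0 returns the 12-byte vendor string in EBX, EDX, ECX. */
static const struct {
    uint32_t ebx, ecx, edx;
    enum vendor id;
} vendor_table[] = {
    { 0x756e6547, 0x6c65746e, 0x49656e69, VENDOR_INTEL }, /* "GenuineIntel" */
    { 0x68747541, 0x444d4163, 0x69746e65, VENDOR_AMD },   /* "AuthenticAMD" */
};

/* Hypothetical lookup: linear search, three compares per entry. */
static enum vendor lookup_vendor(uint32_t ebx, uint32_t ecx, uint32_t edx)
{
    size_t i;

    for ( i = 0; i < sizeof(vendor_table) / sizeof(vendor_table[0]); i++ )
        if ( vendor_table[i].ebx == ebx &&
             vendor_table[i].ecx == ecx &&
             vendor_table[i].edx == edx )
            return vendor_table[i].id;

    return VENDOR_UNKNOWN;
}
```

Running such a search on every path into the emulator is the overhead that caching the result avoids.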

~Andrew


