date:20150325

[Xen-devel] Redhat 6 VM crash on Xen when cpu number reaches 64

2015-03-25 Thread Ouyang Zhaowei (Charles)

Hi all:

Now a days, we tested Redhat 6.2(6.4) on Xen(version 4.1.2).
If we config the cpu number more than 32, it'll show 32 in
the VM, and if we config it 64 cpus, the VM will crash and
the log is list below.

Can someone tell us why is this happening?

thanks

===crash log=
CPU: CPU feature rdtscp disabled on xen guest
CPU: CPU feature constant_tsc disabled on xen guest
mce: CPU supports 0 MCE banks
 #32
CPU32: Stuck ??
 #33
CPU33: Stuck ??
 #34
CPU34: Stuck ??
 #35
CPU35: Stuck ??
 #36
CPU36: Stuck ??
 #37
CPU37: Stuck ??
 #38
CPU38: Stuck ??
 #39
CPU39: Stuck ??
 #40
CPU40: Stuck ??
 #41
CPU41: Stuck ??
 #42
CPU42: Stuck ??
 #43
CPU43: Stuck ??
 #44
CPU44: Stuck ??
 #45
CPU45: Stuck ??
 #46
CPU46: Stuck ??
 #47
CPU47: Stuck ??
 #48
CPU48: Stuck ??
 #49
CPU49: Stuck ??
 #50
CPU50: Stuck ??
 #51
CPU51: Stuck ??
 #52
CPU52: Stuck ??
 #53
CPU53: Stuck ??
 #54
CPU54: Stuck ??
 #55
CPU55: Stuck ??
 #56
CPU56: Stuck ??
 #57
CPU57: Stuck ??
 #58
CPU58: Stuck ??
 #59
CPU59: Stuck ??
 #60
CPU60: Stuck ??
 #61
CPU61: Stuck ??
 #62
CPU62: Stuck ??
 #63
CPU63: Stuck ??
Brought up 32 CPUs
Total of 32 processors activated (153583.72 BogoMIPS).
devtmpfs: initialized
regulator: core version 0.5
NET: Registered protocol family 16
ACPI: bus type pci registered
PCI: Using configuration type 1 for base access
bio: create slab  at 0
ACPI: Interpreter enabled
ACPI: (supports S0 S5)
ACPI: Using IOAPIC for interrupt routing
ACPI: No dock devices found.
HEST: Table not found.
ACPI: PCI Root Bridge [PCI0] (:00)
pci :00:01.3: quirk: region b000-b03f claimed by PIIX4 ACPI
pci :00:01.3: quirk: region b100-b10f claimed by PIIX4 SMB
ACPI: PCI Interrupt Link [LNKA] (IRQs *5 10 11)
ACPI: PCI Interrupt Link [LNKB] (IRQs 5 *10 11)
ACPI: PCI Interrupt Link [LNKC] (IRQs 5 10 *11)
ACPI: PCI Interrupt Link [LNKD] (IRQs *5 10 11)
vgaarb: device added: PCI::00:02.0,decodes=io+mem,owns=io+mem,locks=none
vgaarb: loaded
vgaarb: bridge control possible :00:02.0
SCSI subsystem initialized
usbcore: registered new interface driver usbfs
usbcore: registered new interface driver hub
usbcore: registered new device driver usb
PCI: Using ACPI for IRQ routing
NetLabel: Initializing
NetLabel:  domain hash size = 128
NetLabel:  protocols = UNLABELED CIPSOv4
NetLabel:  unlabeled traffic allowed by default
Switching to clocksource jiffies
pnp: PnP ACPI init
ACPI: bus type pnp registered
pnp: PnP ACPI: found 11 devices
ACPI: ACPI bus type pnp unregistered
system 00:00: iomem range 0x0-0x9 could not be reserved
system 00:02: ioport range 0x8a0-0x8a3 has been reserved
system 00:02: ioport range 0xcc0-0xccf has been reserved
system 00:02: ioport range 0x4d0-0x4d1 has been reserved
system 00:0a: ioport range 0xae00-0xae0f has been reserved
system 00:0a: ioport range 0xb044-0xb047 has been reserved
NET: Registered protocol family 2
IP route cache hash table entries: 65536 (order: 7, 524288 bytes)
TCP established hash table entries: 262144 (order: 10, 4194304 bytes)
TCP bind hash table entries: 65536 (order: 8, 1048576 bytes)
TCP: Hash tables configured (established 262144 bind 65536)
TCP reno registered
NET: Registered protocol family 1
pci :00:00.0: Limiting direct PCI/PCI transfers
pci :00:01.0: PIIX3: Enabling Passive Release
pci :00:01.0: Activating ISA DMA hang workarounds
Trying to unpack rootfs image as initramfs...
Freeing initrd memory: 15508k freed
audit: initializing netlink socket (disabled)
type=2000 audit(1427337937.473:1): initialized
HugeTLB registered 2 MB page size, pre-allocated 0 pages
VFS: Disk quotas dquot_6.5.2
Dquot-cache hash table entries: 512 (order 0, 4096 bytes)
msgmni has been set to 2912
alg: No test for stdrng (krng)
ksign: Installing public key data
Loading keyring
- Added public key 4BB1E63D18B42658
- User ID: Red Hat, Inc. (Kernel Module GPG key)
- Added public key D4A26C9CCD09BEDA
- User ID: Red Hat Enterprise Linux Driver Update Program 
Block layer SCSI generic (bsg) driver version 0.4 loaded (major 252)
io scheduler noop registered
io scheduler anticipatory registered
io scheduler deadline registered
io scheduler cfq registered (default)
pci_hotplug: PCI Hot Plug PCI Core version: 0.5
pciehp: PCI Express Hot Plug Controller Driver version: 0.4
acpiphp: ACPI Hot Plug PCI Controller Driver version: 0.5
acpiphp: Slot [4] registered
acpiphp: Slot [5] registered
acpiphp: Slot [6] registered
acpiphp: Slot [7] registered
acpiphp: Slot [8] registered
acpiphp: Slot [9] registered
acpiphp: Slot [10] registered
acpiphp: Slot [11] registered
acpiphp: Slot [12] registered
acpiphp: Slot [13] registered
acpiphp: Slot [14] registered
acpiphp: Slot [15] registered
acpiphp: Slot [16] registered
acpiphp: Slot [17] registered
acpiphp: Slot [18] registered
acpiphp: Slot [19] registered
acpiphp: Slot [20] registered
acpiphp: Slot [21] registered
acpiphp: Slot [22] registered
acpiphp: Slot [23] registered
acpiphp: Slot [24] registered
acpiphp: Slot [25] r

[Xen-devel] linux-next: manual merge of the xen-tip tree with the arm64-acpi tree

2015-03-25 Thread Stephen Rothwell

Hi all,

Today's linux-next merge of the xen-tip tree got a conflict in
drivers/xen/Kconfig between commit 94ccae47e02d ("XEN / ACPI: Make XEN
ACPI depend on X86") from the arm64-acpi tree and commit 628c28eefd6f
("xen: unify foreign GFN map/unmap for auto-xlated physmap guests")
from the xen-tip tree.

I fixed it up (see below) and can carry the fix as necessary (no action
is required).

-- 
Cheers,
Stephen Rothwells...@canb.auug.org.au

diff --cc drivers/xen/Kconfig
index a31cd29b68a8,afc39ca5cc4f..
--- a/drivers/xen/Kconfig
+++ b/drivers/xen/Kconfig
@@@ -253,8 -253,10 +253,14 @@@ config XEN_EF
def_bool y
depends on X86_64 && EFI
  
 +config XEN_ACPI
 +  def_bool y
 +  depends on X86 && ACPI
 +
+ config XEN_AUTO_XLATE
+   def_bool y
+   depends on ARM || ARM64 || XEN_PVHVM
+   help
+ Support for auto-translated physmap guests.
+ 
  endmenu


pgpySq275iekB.pgp
Description: OpenPGP digital signature
___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

[Xen-devel] [libvirt test] 36729: tolerable FAIL - PUSHED

2015-03-25 Thread xen . org

flight 36729 libvirt real [real]
http://www.chiark.greenend.org.uk/~xensrcts/logs/36729/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-i386-libvirt-xsm   9 guest-start  fail   never pass
 test-amd64-amd64-libvirt-xsm  9 guest-start  fail   never pass
 test-amd64-amd64-libvirt 10 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt 10 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt  10 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt-xsm  5 xen-boot fail   never pass

version targeted for testing:
 libvirt  6cf1e11cc09212f71991f1061681ea0b1ee2bc1b
baseline version:
 libvirt  3b289a81eaa49fd96b829cf47ea5c6d84e4f65c2


People who touched revisions under test:
  JÃ¡n Tomko 
  Laine Stump 
  Luyao Huang 
  Pavel Hrdina 


jobs:
 build-amd64-xsm  pass
 build-armhf-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-armhf  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-armhf-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvopspass
 build-armhf-pvopspass
 build-i386-pvops pass
 test-amd64-amd64-libvirt-xsm fail
 test-armhf-armhf-libvirt-xsm fail
 test-amd64-i386-libvirt-xsm  fail
 test-amd64-amd64-libvirt pass
 test-armhf-armhf-libvirt pass
 test-amd64-i386-libvirt  pass



sg-report-flight on osstest.cam.xci-test.com
logs: /home/xc_osstest/logs
images: /home/xc_osstest/images

Logs, config files, etc. are available at
http://www.chiark.greenend.org.uk/~xensrcts/logs

Test harness code can be found at
http://xenbits.xensource.com/gitweb?p=osstest.git;a=summary


Pushing revision :

+ branch=libvirt
+ revision=6cf1e11cc09212f71991f1061681ea0b1ee2bc1b
+ . cri-lock-repos
++ . cri-common
+++ . cri-getconfig
+++ umask 002
+++ getconfig Repos
+++ perl -e '
use Osstest;
readglobalconfig();
print $c{"Repos"} or die $!;
'
++ repos=/export/home/osstest/repos
++ repos_lock=/export/home/osstest/repos/lock
++ '[' x '!=' x/export/home/osstest/repos/lock ']'
++ OSSTEST_REPOS_LOCK_LOCKED=/export/home/osstest/repos/lock
++ exec with-lock-ex -w /export/home/osstest/repos/lock ./ap-push libvirt 
6cf1e11cc09212f71991f1061681ea0b1ee2bc1b
+ branch=libvirt
+ revision=6cf1e11cc09212f71991f1061681ea0b1ee2bc1b
+ . cri-lock-repos
++ . cri-common
+++ . cri-getconfig
+++ umask 002
+++ getconfig Repos
+++ perl -e '
use Osstest;
readglobalconfig();
print $c{"Repos"} or die $!;
'
++ repos=/export/home/osstest/repos
++ repos_lock=/export/home/osstest/repos/lock
++ '[' x/export/home/osstest/repos/lock '!=' x/export/home/osstest/repos/lock 
']'
+ . cri-common
++ . cri-getconfig
++ umask 002
+ select_xenbranch
+ case "$branch" in
+ tree=libvirt
+ xenbranch=xen-unstable
+ '[' xlibvirt = xlinux ']'
+ linuxbranch=
+ '[' x = x ']'
+ qemuubranch=qemu-upstream-unstable
+ : tested/2.6.39.x
+ . ap-common
++ : osst...@xenbits.xensource.com
++ : git://xenbits.xen.org/xen.git
++ : osst...@xenbits.xensource.com:/home/xen/git/xen.git
++ : git://xenbits.xen.org/staging/qemu-xen-unstable.git
++ : git://git.kernel.org
++ : git://git.kernel.org/pub/scm/linux/kernel/git
++ : git
++ : git://libvirt.org/libvirt.git
++ : osst...@xenbits.xensource.com:/home/xen/git/libvirt.git
++ : git://xenbits.xen.org/libvirt.git
++ : git://xenbits.xen.org/rumpuser-xen.git
++ : git
++ : git://xenbits.xen.org/rumpuser-xen.git
++ : osst...@xenbits.xensource.com:/home/xen/git/rumpuser-xen.git
+++ besteffort_repo https://github.com/rumpkernel/rumpkernel-netbsd-src
+++ local repo=https://github.com/rumpkernel/rumpkernel-netbsd-src
+++ cached_repo https://github.com/rumpkernel/rumpkernel-netbsd-src 
'[fetch=try]'
+++ local repo=https://github.com/rumpkernel/rumpkernel-netbsd-src
+++ local 'options=[fetch=try]'
 getconfig GitCacheProxy
 perl -e '
use Osstest;
readglobalconfig();
print $c{"GitCache

Re: [Xen-devel] [PATCH v1 02/47] x86: mtrr: generalize run time disabling of MTRR

2015-03-25 Thread Juergen Gross


On 03/25/2015 08:59 PM, Konrad Rzeszutek Wilk wrote:

On Fri, Mar 20, 2015 at 04:17:52PM -0700, Luis R. Rodriguez wrote:

From: "Luis R. Rodriguez" 

It is possible to enable CONFIG_MTRR and up with it
disabled at run time and yet CONFIG_X86_PAT continues
to kick through fully functionally. This can happen


s/fully/full/ ?



for instance on Xen where MTRR is not supported but
PAT is, this can happen now on Linux as of commit
47591df50 by Juergen introduced as of v3.19.


s/3.19/4.0/


No, 3.19 is correct.

Juergen



Technically we should assume the proper CPU
bits would be set to disable MTRR but we can't
always rely on this. At least on the Xen Hypervisor
for instance only X86_FEATURE_MTRR was disabled
as of Xen 4.4 through Xen commit 586ab6a [0],
but not X86_FEATURE_K6_MTRR, X86_FEATURE_CENTAUR_MCR,
or X86_FEATURE_CYRIX_ARR for instance.


Oh, could you send an patch for that to Xen please?


x86 mtrr code relies on quite a bit of checks for
mtrr_if being set to check to see if MTRR did get
set up, instead of using that lets provide a generic
setter which when set we know MTRR is enabled. This


s/we know MTRR is enabled/will let us know that MTRR is enabled/


also adds a few checks where they were not before
which could potentially safeguard ourselves against
incorrect usage of MTRR where this was not desirable.

Where possible match error codes as if MTRR was
disabled on arch/x86/include/asm/mtrr.h.

Lastly, since disabling MTRR can happen at run time
and we could end up with PAT enabled best record now
on our logs when MTRR is disabled.

[0] ~/devel/xen (git::stable-4.5)$ git describe --contains 586ab6a
4.4.0-rc1~18

Cc: Andy Lutomirski 
Cc: Suresh Siddha 
Cc: Venkatesh Pallipadi 
Cc: Ingo Molnar 
Cc: Thomas Gleixner 
Cc: Juergen Gross 
Cc: Daniel Vetter 
Cc: Dave Airlie 
Cc: Antonino Daplas 
Cc: Jean-Christophe Plagniol-Villard 
Cc: Tomi Valkeinen 
Cc: Dave Hansen 
Cc: venkatesh.pallip...@intel.com
Cc: Stefan Bader 
Cc: konrad.w...@oracle.com
Cc: ville.syrj...@linux.intel.com
Cc: david.vra...@citrix.com
Cc: jbeul...@suse.com
Cc: toshi.k...@hp.com
Cc: bhelg...@google.com
Cc: Roger Pau Monné 
Cc: linux-fb...@vger.kernel.org
Cc: linux-ker...@vger.kernel.org
Cc: xen-de...@lists.xensource.com
Signed-off-by: Luis R. Rodriguez 
---
  arch/x86/include/asm/mtrr.h|  2 ++
  arch/x86/kernel/cpu/mtrr/cleanup.c |  2 +-
  arch/x86/kernel/cpu/mtrr/generic.c |  5 +++--
  arch/x86/kernel/cpu/mtrr/if.c  |  3 +++
  arch/x86/kernel/cpu/mtrr/main.c| 31 ++-
  5 files changed, 31 insertions(+), 12 deletions(-)

diff --git a/arch/x86/include/asm/mtrr.h b/arch/x86/include/asm/mtrr.h
index f768f62..cade917 100644
--- a/arch/x86/include/asm/mtrr.h
+++ b/arch/x86/include/asm/mtrr.h
@@ -31,6 +31,7 @@
   * arch_phys_wc_add and arch_phys_wc_del.
   */
  # ifdef CONFIG_MTRR
+extern int mtrr_enabled;
  extern u8 mtrr_type_lookup(u64 addr, u64 end);
  extern void mtrr_save_fixed_ranges(void *);
  extern void mtrr_save_state(void);
@@ -50,6 +51,7 @@ extern int mtrr_trim_uncached_memory(unsigned long end_pfn);
  extern int amd_special_default_mtrr(void);
  extern int phys_wc_to_mtrr_index(int handle);
  #  else
+static const int mtrr_enabled;
  static inline u8 mtrr_type_lookup(u64 addr, u64 end)
  {
/*
diff --git a/arch/x86/kernel/cpu/mtrr/cleanup.c 
b/arch/x86/kernel/cpu/mtrr/cleanup.c
index 5f90b85..784dc55 100644
--- a/arch/x86/kernel/cpu/mtrr/cleanup.c
+++ b/arch/x86/kernel/cpu/mtrr/cleanup.c
@@ -880,7 +880,7 @@ int __init mtrr_trim_uncached_memory(unsigned long end_pfn)
 * Make sure we only trim uncachable memory on machines that
 * support the Intel MTRR architecture:
 */
-   if (!is_cpu(INTEL) || disable_mtrr_trim)
+   if (!is_cpu(INTEL) || disable_mtrr_trim || !mtrr_enabled)
return 0;

rdmsr(MSR_MTRRdefType, def, dummy);
diff --git a/arch/x86/kernel/cpu/mtrr/generic.c 
b/arch/x86/kernel/cpu/mtrr/generic.c
index 09c82de..df321b2 100644
--- a/arch/x86/kernel/cpu/mtrr/generic.c
+++ b/arch/x86/kernel/cpu/mtrr/generic.c
@@ -116,7 +116,8 @@ static u8 __mtrr_type_lookup(u64 start, u64 end, u64 
*partial_end, int *repeat)
u8 prev_match, curr_match;

*repeat = 0;
-   if (!mtrr_state_set)
+   /* generic_mtrr_ops is only set for generic_mtrr_ops */
+   if (!mtrr_state_set || !mtrr_enabled)
return 0xFF;

if (!mtrr_state.enabled)
@@ -290,7 +291,7 @@ static void get_fixed_ranges(mtrr_type *frs)

  void mtrr_save_fixed_ranges(void *info)
  {
-   if (cpu_has_mtrr)
+   if (mtrr_enabled && cpu_has_mtrr)
get_fixed_ranges(mtrr_state.fixed_ranges);
  }

diff --git a/arch/x86/kernel/cpu/mtrr/if.c b/arch/x86/kernel/cpu/mtrr/if.c
index d76f13d..e9e001a 100644
--- a/arch/x86/kernel/cpu/mtrr/if.c
+++ b/arch/x86/kernel/cpu/mtrr/if.c
@@ -436,6 +436,9 @@ static int __init mtrr_if_init(void)
  {
struct cpuinfo_x86 *c = &boot_cpu_data;

+   if (!mtrr_enabled)
+

[Xen-devel] [xen-unstable test] 36728: regressions - FAIL

2015-03-25 Thread xen . org

flight 36728 xen-unstable real [real]
http://www.chiark.greenend.org.uk/~xensrcts/logs/36728/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-xl-credit2  12 guest-localmigratefail REGR. vs. 36514
 test-amd64-amd64-xl  12 guest-localmigratefail REGR. vs. 36514
 test-amd64-amd64-xl-multivcpu 12 guest-localmigrate   fail REGR. vs. 36514
 test-amd64-amd64-pair  17 guest-migrate/src_host/dst_host fail REGR. vs. 36514

Regressions which are regarded as allowable (not blocking):
 test-amd64-amd64-xl-sedf 12 guest-localmigratefail REGR. vs. 36514
 test-amd64-amd64-xl-sedf-pin 12 guest-localmigratefail REGR. vs. 36514
 test-amd64-i386-pair17 guest-migrate/src_host/dst_host fail like 36514

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-xsm   1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-xsm1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-xsm  1 build-check(1)blocked n/a
 test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm  1 build-check(1) blocked n/a
 test-amd64-i386-xl-qemut-debianhvm-amd64-xsm  1 build-check(1) blocked n/a
 test-amd64-i386-libvirt-xsm   1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-xsm   1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemut-debianhvm-amd64-xsm  1 build-check(1)blocked n/a
 build-armhf-xsm   5 xen-buildfail   never pass
 test-amd64-amd64-rumpuserxen-amd64 13 
rumpuserxen-demo-xenstorels/xenstorels.repeat fail never pass
 test-armhf-armhf-xl-sedf 10 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-sedf-pin 10 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  10 migrate-support-checkfail   never pass
 test-amd64-amd64-xl-pvh-intel  9 guest-start  fail  never pass
 test-amd64-i386-libvirt  10 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 10 migrate-support-checkfail  never pass
 test-amd64-amd64-xl-pvh-amd   9 guest-start  fail   never pass
 test-amd64-amd64-xl-pcipt-intel  9 guest-start fail never pass
 test-amd64-amd64-libvirt 10 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  10 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt 10 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-midway   10 migrate-support-checkfail   never pass
 test-amd64-i386-xl-qemut-winxpsp3 14 guest-stopfail never pass
 test-amd64-i386-xl-qemuu-winxpsp3-vcpus1 14 guest-stop fail never pass
 test-amd64-i386-xl-qemuu-winxpsp3 14 guest-stopfail never pass
 build-amd64-xsm   5 xen-buildfail   never pass
 test-amd64-i386-xl-qemuu-win7-amd64 14 guest-stop  fail never pass
 test-amd64-i386-xl-win7-amd64 14 guest-stop   fail  never pass
 test-amd64-amd64-xl-winxpsp3 14 guest-stop   fail   never pass
 test-amd64-amd64-xl-qemuu-win7-amd64 14 guest-stop fail never pass
 test-amd64-amd64-xl-win7-amd64 14 guest-stop   fail never pass
 test-amd64-amd64-xl-qemut-win7-amd64 14 guest-stop fail never pass
 test-amd64-i386-xl-qemut-win7-amd64 14 guest-stop  fail never pass
 test-amd64-i386-xl-qemut-winxpsp3-vcpus1 14 guest-stop fail never pass
 test-amd64-i386-xl-winxpsp3-vcpus1 14 guest-stop   fail never pass
 test-amd64-i386-xl-winxpsp3  14 guest-stop   fail   never pass
 test-amd64-amd64-xl-qemut-winxpsp3 14 guest-stop   fail never pass
 test-amd64-amd64-xl-qemuu-winxpsp3 14 guest-stop   fail never pass

version targeted for testing:
 xen  84066dd4ef4bb5983e246c629a26ef4f3394e5d5
baseline version:
 xen  3a28f760508fb35c430edac17a9efde5aff6d1d5


People who touched revisions under test:
  Andrew Cooper 
  Boris Ostrovsky 
  Daniel De Graaf 
  Dario Faggioli 
  Don Slutz 
  George Dunlap 
  Ian Campbell 
  Ian Jackson 
  Jan Beulich 
  Jim Fehlig 
  Juergen Gross 
  Kevin Tian 
  Konrad Rzeszutek Wilk 
  Koushik Chakravarty 
  Olaf Hering 
  Pramod Devendra 
  Quan Xu 
  Riku Voipio 
  Ross Lagerwall 
  Vijaya Kumar K 
  Vijaya Kumar K
  Wei Liu 
  Wen Congyang 


jobs:
 build-amd64-xsm  fail
 build-armhf-xsm  fail
 build-i386-xsm   pass

[Xen-devel] [rumpuserxen test] 36748: regressions - FAIL

2015-03-25 Thread xen . org

flight 36748 rumpuserxen real [real]
http://www.chiark.greenend.org.uk/~xensrcts/logs/36748/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-amd64-rumpuserxen   5 rumpuserxen-build fail REGR. vs. 33866
 build-i386-rumpuserxen5 rumpuserxen-build fail REGR. vs. 33866

Tests which did not succeed, but are not blocking:
 test-amd64-i386-rumpuserxen-i386  1 build-check(1)   blocked  n/a
 test-amd64-amd64-rumpuserxen-amd64  1 build-check(1)   blocked n/a

version targeted for testing:
 rumpuserxen  80294081448af74270ca2dd2ba758f0ed308f46c
baseline version:
 rumpuserxen  30d72f3fc5e35cd53afd82c8179cc0e0b11146ad


People who touched revisions under test:
  Antti Kantee 
  Ian Jackson 
  Martin Lucina 
  Wei Liu 


jobs:
 build-amd64-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-i386   pass
 build-amd64-pvopspass
 build-i386-pvops pass
 build-amd64-rumpuserxen  fail
 build-i386-rumpuserxen   fail
 test-amd64-amd64-rumpuserxen-amd64   blocked 
 test-amd64-i386-rumpuserxen-i386 blocked 



sg-report-flight on osstest.cam.xci-test.com
logs: /home/xc_osstest/logs
images: /home/xc_osstest/images

Logs, config files, etc. are available at
http://www.chiark.greenend.org.uk/~xensrcts/logs

Test harness code can be found at
http://xenbits.xensource.com/gitweb?p=osstest.git;a=summary


Not pushing.

(No revision log; it would be 527 lines long.)

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v1 05/47] pci: add pci_iomap_wc() variants

2015-03-25 Thread Luis R. Rodriguez

On Mon, Mar 23, 2015 at 12:20:47PM -0500, Bjorn Helgaas wrote:
> Hi Luis,
> 
> This seems OK to me, 

Great.

> but I'm curious about a few things.
> 
> On Fri, Mar 20, 2015 at 6:17 PM, Luis R. Rodriguez
>  wrote:
> > From: "Luis R. Rodriguez" 
> >
> > This allows drivers to take advantage of write-combining
> > when possible. Ideally we'd have pci_read_bases() just
> > peg an IORESOURCE_WC flag for us
> 
> We do set IORESOURCE_PREFETCH.  Do you mean something different?

I did not think we had a WC IORESOURCE flag. Are you saying that we can use
IORESOURCE_PREFETCH for that purpose? If so then great.  As I read a PCI BAR
can have PCI_BASE_ADDRESS_MEM_PREFETCH and when that's the case we peg
IORESOURCE_PREFETCH. That seems to be what I want indeed. Questions below.

> >  but where exactly
> > video devices memory lie varies *largely* and at times things
> > are mixed with MMIO registers, sometimes we can address
> > the changes in drivers, other times the change requires
> > intrusive changes.
> 
> What does a video device address have to do with this?  I do see that
> if a BAR maps only a frame buffer, the device might be able to mark it
> prefetchable, while if the BAR mapped both a frame buffer and some
> registers, it might not be able to make it prefetchable.  But that
> doesn't seem like it depends on the *address*.

I meant the offsets for each of those, either registers or framebuffer,
and that typically they are mixed (primarily on older devices), so indeed your
summary of the problem is what I meant. Let's remember that we are trying to
take advantage of PAT here when available and avoid MTRR in that case, do we
know that the same PCI BARs that have always historically used MTRRs had
IORESOURCE_PREFETCH set, is that a fair assumption ? I realize they are
different things -- but its precisely why I ask.

> pci_iomap_range() already makes a cacheable mapping if
> IORESOURCE_CACHEABLE; I'm guessing that you would like it to
> automatically use WC if the BAR if IORESOURCE_PREFETCH, e.g.,
> 
>   if (flags & IORESOURCE_CACHEABLE)
> return ioremap(start, len);
>   if (flags & IORESOURCE_PREFETCH)
> return ioremap_wc(start, len);
>   return ioremap_nocache(start, len);

Indeed, that's exactly what I think we should strive towards.

> Is there a reason not to do that?

This depends on the exact defintion of IORESOURCE_PREFETCH and
PCI_BASE_ADDRESS_MEM_PREFETCH and how they are used all over and
accross *all devices*. This didn't look promising for starters:

include/uapi/linux/pci_regs.h:#define  PCI_BASE_ADDRESS_MEM_PREFETCH0x08
/* prefetchable? */

PCI_BASE_ADDRESS_MEM_PREFETCH seems to be BAR specific, so a few questions:

1) Can we rest assured for instance that if we check for
PCI_BASE_ADDRESS_MEM_PREFETCH and if set that it will *only* be set on a full
PCI BAR if the full PCI BAR does want WC? If not this can regress
functionality. That seems risky. It however would not be risky if we used
another API that did look for IORESOURCE_PREFETCH and if so use ioremap_wc() --
that way only drivers we know that do use the full PCI bar would use this API.
There's a bit of a problem with this though:

2) Do we know that if a *full PCI BAR* is used for WC that
PCI_BASE_ADDRESS_MEM_PREFETCH *was* definitely set for the PCI BAR? If so then
the API usage would be restricted only to devices that we know *do* adhere to
this. That reduces the possible uses for older drivers and can create
regressions if used loosely without verification... but..

3) If from now on we get folks to commit to uset PCI_BASE_ADDRESS_MEM_PREFETCH
for full PCI BARs that do want WC perhaps newer devices / drivers will use
this very consistently ? Can we bank on that and is it worth it ?

4) If a PCI BAR *does not* have PCI_BASE_ADDRESS_MEM_PREFETCH do we know it
must not never want WC ?

If we don't have certainty on any of the above I'm afraid we can't do much
right now but perhaps we can push towards better use of 
PCI_BASE_ADDRESS_MEM_PREFETCH
and hope folks will only use this for the full PCI BAR only if WC is desired.

Thoughts?

> > Although there is also arch_phys_wc_add() that makes use of
> > architecture specific write-combinging alternatives (MTRR on
> > x86 when a system does not have PAT) we void polluting
> > pci_iomap() space with it and force drivers and subsystems
> > that want to use it to be explicit.
> >
> > There are a few motivations for this:
> >
> > a) Take advantage of PAT when available
> >
> > b) Help bury MTRR code away, MTRR is architecture specific and on
> >x86 its replaced by PAT
> >
> > c) Help with the goal of eventually using _PAGE_CACHE_UC over
> >_PAGE_CACHE_UC_MINUS on x86 on ioremap_nocache() (de33c442e)
> > ...
> 
> > +void __iomem *pci_iomap_wc_range(struct pci_dev *dev,
> > +int bar,
> > +unsigned long offset,
> > +unsigned long maxlen)
> > +{
> > +   resource_size_t start = pci

[Xen-devel] [PATCH] libxl: Fix memory leak if pthread_create fails.

2015-03-25 Thread Konrad Rzeszutek Wilk

If we fail to create the thread we leak the shutdown_info
structure.

Signed-off-by: Konrad Rzeszutek Wilk 
---
 src/libxl/libxl_domain.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/libxl/libxl_domain.c b/src/libxl/libxl_domain.c
index 774b070..0ac5c62 100644
--- a/src/libxl/libxl_domain.c
+++ b/src/libxl/libxl_domain.c
@@ -482,7 +482,7 @@ libxlDomainEventHandler(void *data, VIR_LIBXL_EVENT_CONST 
libxl_event *event)
 libxlDriverPrivatePtr driver = data;
 virDomainObjPtr vm = NULL;
 libxl_shutdown_reason xl_reason = event->u.domain_shutdown.shutdown_reason;
-struct libxlShutdownThreadInfo *shutdown_info;
+struct libxlShutdownThreadInfo *shutdown_info = NULL;
 virThread thread;
 libxlDriverConfigPtr cfg;
 
@@ -535,6 +535,7 @@ libxlDomainEventHandler(void *data, VIR_LIBXL_EVENT_CONST 
libxl_event *event)
 virObjectUnref(cfg);
 if (vm)
 virObjectUnlock(vm);
+VIR_FREE(shutdown_info);
 }
 
 void
-- 
1.8.4.2


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH 2/3] libxl: acquire a job when destroying a domain

2015-03-25 Thread Konrad Rzeszutek Wilk

On Wed, Mar 25, 2015 at 02:08:35PM -0600, Jim Fehlig wrote:
> A job should be acquired at the beginning of a domain destroy operation,
> not at the end when cleaning up the domain.  Fix two occurances of this
> late job acquisition in the libxl driver.  Doing so renders
> libxlDomainCleanup unused, so it is removed.
> 
> Signed-off-by: Jim Fehlig 
> ---
>  src/libxl/libxl_domain.c | 49 
> +---
>  src/libxl/libxl_domain.h |  4 
>  src/libxl/libxl_driver.c | 20 
>  3 files changed, 29 insertions(+), 44 deletions(-)
> 
> diff --git a/src/libxl/libxl_domain.c b/src/libxl/libxl_domain.c
> index eca1ff0..e240eae 100644
> --- a/src/libxl/libxl_domain.c
> +++ b/src/libxl/libxl_domain.c
> @@ -371,6 +371,9 @@ libxlDomainShutdownThread(void *opaque)
>  
>  cfg = libxlDriverConfigGet(driver);
>  
> +if (libxlDomainObjBeginJob(driver, vm, LIBXL_JOB_MODIFY) < 0)
> +goto cleanup;
> +

Won't this a deadlock with 'libxlDomainAutoCoreDump'  (line 410) (which also
calls :

 727 if (libxlDomainObjBeginJob(driver, vm, LIBXL_JOB_MODIFY) < 0)
 728 goto cleanup;

well, not deadlock - but spin up to 30 seconds? and then give up?

Perhaps this patch should be folded in?

>From 9f2bac0c28815fc51850290c4b1962881691bfca Mon Sep 17 00:00:00 2001
From: Konrad Rzeszutek Wilk 
Date: Wed, 25 Mar 2015 22:34:51 -0400
Subject: [PATCH] squash

---
 src/libxl/libxl_domain.c | 4 
 1 file changed, 4 deletions(-)

diff --git a/src/libxl/libxl_domain.c b/src/libxl/libxl_domain.c
index aef0157..774b070 100644
--- a/src/libxl/libxl_domain.c
+++ b/src/libxl/libxl_domain.c
@@ -723,15 +723,11 @@ libxlDomainAutoCoreDump(libxlDriverPrivatePtr driver,
 timestr) < 0)
 goto cleanup;
 
-if (libxlDomainObjBeginJob(driver, vm, LIBXL_JOB_MODIFY) < 0)
-goto cleanup;
-
 /* Unlock virDomainObj while dumping core */
 virObjectUnlock(vm);
 libxl_domain_core_dump(cfg->ctx, vm->def->id, dumpfile, NULL);
 virObjectLock(vm);
 
-ignore_value(libxlDomainObjEndJob(driver, vm));
 ret = 0;
 
  cleanup:
-- 
1.8.4.2

>  if (xl_reason == LIBXL_SHUTDOWN_REASON_POWEROFF) {
>  dom_event = virDomainEventLifecycleNewFromObj(vm,
> VIR_DOMAIN_EVENT_STOPPED,
> @@ -384,7 +387,7 @@ libxlDomainShutdownThread(void *opaque)
>  goto restart;
>  case VIR_DOMAIN_LIFECYCLE_PRESERVE:
>  case VIR_DOMAIN_LIFECYCLE_LAST:
> -goto cleanup;
> +goto endjob;
>  }
>  } else if (xl_reason == LIBXL_SHUTDOWN_REASON_CRASH) {
>  dom_event = virDomainEventLifecycleNewFromObj(vm,
> @@ -399,7 +402,7 @@ libxlDomainShutdownThread(void *opaque)
>  goto restart;
>  case VIR_DOMAIN_LIFECYCLE_CRASH_PRESERVE:
>  case VIR_DOMAIN_LIFECYCLE_CRASH_LAST:
> -goto cleanup;
> +goto endjob;
>  case VIR_DOMAIN_LIFECYCLE_CRASH_COREDUMP_DESTROY:
>  libxlDomainAutoCoreDump(driver, vm);
>  goto destroy;
> @@ -420,11 +423,11 @@ libxlDomainShutdownThread(void *opaque)
>  goto restart;
>  case VIR_DOMAIN_LIFECYCLE_PRESERVE:
>  case VIR_DOMAIN_LIFECYCLE_LAST:
> -goto cleanup;
> +goto endjob;
>  }
>  } else {
>  VIR_INFO("Unhandled shutdown_reason %d", xl_reason);
> -goto cleanup;
> +goto endjob;
>  }
>  
>   destroy:
> @@ -433,13 +436,11 @@ libxlDomainShutdownThread(void *opaque)
>  dom_event = NULL;
>  }
>  libxl_domain_destroy(cfg->ctx, vm->def->id, NULL);
> -if (libxlDomainCleanupJob(driver, vm, reason)) {
> -if (!vm->persistent) {
> -virDomainObjListRemove(driver->domains, vm);
> -vm = NULL;
> -}
> -}
> -goto cleanup;
> +libxlDomainCleanup(driver, vm, reason);
> +if (!vm->persistent)
> +virDomainObjListRemove(driver->domains, vm);
> +
> +goto endjob;
>  
>   restart:
>  if (dom_event) {
> @@ -447,13 +448,17 @@ libxlDomainShutdownThread(void *opaque)
>  dom_event = NULL;
>  }
>  libxl_domain_destroy(cfg->ctx, vm->def->id, NULL);
> -libxlDomainCleanupJob(driver, vm, VIR_DOMAIN_SHUTOFF_SHUTDOWN);
> +libxlDomainCleanup(driver, vm, VIR_DOMAIN_SHUTOFF_SHUTDOWN);
>  if (libxlDomainStart(driver, vm, false, -1) < 0) {
>  virErrorPtr err = virGetLastError();
>  VIR_ERROR(_("Failed to restart VM '%s': %s"),
>vm->def->name, err ? err->message : _("unknown error"));
>  }
>  
> + endjob:
> +if (!libxlDomainObjEndJob(driver, vm))
> +vm = NULL;
> +
>   cleanup:
>  if (vm)
>  virObjectUnlock(vm);
> @@ -690,26 +695,6 @@ libxlDomainCleanup(libxlDriverPrivatePtr driver,
>  }
>  
>  /*
> - * Cleanup function for domain that has reached shutoff state.
> - * Executed in the context of a job.
> - *
> - * virDomainObjP

Re: [Xen-devel] [PATCH v6 04/30] xen/PCI: Don't use deprecated function pci_scan_bus_parented()

2015-03-25 Thread Yijing Wang

 That seems OK to me.  Probably still wrong, but no worse than it was 
 before.
>>>
>>> Interesting. The mechanism for PCI passthrough can either synthesize
>>> and PCI bus number starting at zero (so first device is always 0:0:0.0)
>>> or it can replicate the backend PCI topology. That means you
>>> could have segment values passed in, so: ab:ff:00.1). I've to admin
>>> I hadn't tried the 'physical' replication on an machine with
>>> domains (err, segments).
>>>
>>> Is there an git tree with this so I can just try it out?
>>
>> git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci.git
>> pci/enumeration-yw6 has similar code (it exports the single
> 
> I presume now it is bjorn/pci/enumeration-yw8 ? Going to test this out
> this week.

Yes, it's the latest version. thanks!

>> busn_resource and makes xen use it).  That should be functionally
>> identical to what v4.0-rc1 does.
>>
>> Yijing hasn't posted the static busn_res proposal above yet, so I
>> don't have a branch with that in it.
>>
>> Bjorn
> 
> .
> 


-- 
Thanks!
Yijing


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [v3][PATCH 2/2] libxl: introduce gfx_passthru_kind

2015-03-25 Thread Chen, Tiejun


On 2015/3/25 18:32, Ian Campbell wrote:

On Wed, 2015-03-25 at 09:10 +0800, Chen, Tiejun wrote:


+But when given as a string the B option describes the type
+of device to enable. Not this behavior is only supported with upstream


"Note" and "the upstream..."


Fixed.




+=item "igd"
+
+Enables graphics device PCI passthrough but force set the type of device
+with the Intel Graphics Device.


"but force set the type" doesn't scan very well. "... forcing the type
of device to Intel Graphics Device" works I think.


Fine to me as well.




I understand what you mean but that table just includes IGDs existed on
BDW and HSW. Because in the case of qemu upstream we're just covering
these platforms, and with our discussion we don't have any plan to add
those legacy platforms in the future. But qemu-xen-traditional still
covers those platforms. So I'm afraid its not good to check this with
that table as well.


Hrm, OK. I suppose we can live with autodetect and igd both meaning igd
and whoever adds a new type will have to remember to add a check for
qemu-trad then.



When we really have to introduce a new type, this means we probably need 
to change something inside qemu codes. So a new type should just go into 
that table to support qemu upstream since now we shouldn't refactor 
anything in qemu-xen-traditional, right?


Thanks
Tiejun


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] One question to lowlevel/xl/xl.c and lowlevel/xc/xc.c

2015-03-25 Thread Chen, Tiejun


On 2015/3/25 18:26, Ian Campbell wrote:

On Wed, 2015-03-25 at 09:18 +0800, Chen, Tiejun wrote:

Actually my problem is that, I need to add a new parameter, 'flag', like
this, xc_assign_device(xxx,xxx,flag). So if I don't refine xc.c, tools
can't be compiled successfully. Or maybe you're suggesting I may isolate
this file while building tools, right?


I answered this in my original reply:
  it is acceptable for the new
 parameter to not be plumbed up to the Python bindings until someone
 comes along with a requirement to use it from Python. IOW you can just
 pass whatever the nop value is for the new argument.



Yes, I knew this but I'm just getting a little confusion again after 
we're starting to talk if xc.c is compiled...


And I will try to do this.


The "nop value" is whatever value should be passed to retain the current
behaviour.


Sure.

Thanks
Tiejun


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

[Xen-devel] [qemu-upstream-4.4-testing test] 36726: regressions - FAIL

2015-03-25 Thread xen . org

flight 36726 qemu-upstream-4.4-testing real [real]
http://www.chiark.greenend.org.uk/~xensrcts/logs/36726/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-xl-qemuu-winxpsp3  7 windows-install fail REGR. vs. 31663

Tests which did not succeed, but are not blocking:
 test-amd64-i386-libvirt  10 migrate-support-checkfail   never pass
 test-amd64-amd64-xl-pcipt-intel  9 guest-start fail never pass
 test-amd64-amd64-libvirt 10 migrate-support-checkfail   never pass
 test-amd64-amd64-xl-qemut-winxpsp3 14 guest-stop   fail never pass
 test-amd64-i386-xl-qemuu-winxpsp3-vcpus1 14 guest-stop fail never pass
 test-amd64-amd64-xl-winxpsp3 14 guest-stop   fail   never pass
 test-amd64-i386-xl-qemuu-win7-amd64 14 guest-stop  fail never pass
 test-amd64-i386-xend-winxpsp3 17 leak-check/check fail  never pass
 test-amd64-i386-xl-qemut-win7-amd64 14 guest-stop  fail never pass
 test-amd64-i386-xl-win7-amd64 14 guest-stop   fail  never pass
 test-amd64-amd64-xl-win7-amd64 14 guest-stop   fail never pass
 test-amd64-amd64-xl-qemuu-win7-amd64 14 guest-stop fail never pass
 test-amd64-amd64-xl-qemut-win7-amd64 14 guest-stop fail never pass
 test-amd64-i386-xl-winxpsp3-vcpus1 14 guest-stop   fail never pass
 test-amd64-i386-xl-qemut-winxpsp3-vcpus1 14 guest-stop fail never pass
 test-amd64-i386-xend-qemut-winxpsp3 17 leak-check/checkfail never pass

version targeted for testing:
 qemuud173a0c20d7970c17fa593cf86abc1791a8a4a3a
baseline version:
 qemuub04df88d41f64fc6b56d193b6e90fb840cedb1d3


People who touched revisions under test:
  Benoit Canet 
  BenoÃ®t Canet 
  Dmitry Fleytman 
  Gerd Hoffmann 
  Jason Wang 
  Jeff Cody 
  Juan Quintela 
  Kevin Wolf 
  Laszlo Ersek 
  Michael Roth 
  Michael S. Tsirkin 
  Peter Maydell 
  Petr Matousek 
  Stefan Hajnoczi 
  Stefano Stabellini 


jobs:
 build-amd64-xend pass
 build-i386-xend  pass
 build-amd64  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvopspass
 build-i386-pvops pass
 test-amd64-amd64-xl  pass
 test-amd64-i386-xl   pass
 test-amd64-i386-rhel6hvm-amd pass
 test-amd64-i386-qemut-rhel6hvm-amd   pass
 test-amd64-i386-qemuu-rhel6hvm-amd   pass
 test-amd64-amd64-xl-qemut-debianhvm-amd64pass
 test-amd64-i386-xl-qemut-debianhvm-amd64 pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64pass
 test-amd64-i386-xl-qemuu-debianhvm-amd64 pass
 test-amd64-i386-freebsd10-amd64  pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64 pass
 test-amd64-i386-xl-qemuu-ovmf-amd64  pass
 test-amd64-amd64-xl-qemut-win7-amd64 fail
 test-amd64-i386-xl-qemut-win7-amd64  fail
 test-amd64-amd64-xl-qemuu-win7-amd64 fail
 test-amd64-i386-xl-qemuu-win7-amd64  fail
 test-amd64-amd64-xl-win7-amd64   fail
 test-amd64-i386-xl-win7-amd64fail
 test-amd64-amd64-xl-credit2  pass
 test-amd64-i386-freebsd10-i386   pass
 test-amd64-amd64-xl-pcipt-intel  fail
 test-amd64-i386-rhel6hvm-intel   pass
 test-amd64-i386-qemut-rhel6hvm-intel pass
 test-amd64-i386-qemuu-rhel6hvm-intel pass
 test-amd64-amd64-libvirt pass
 test-amd64-i386-libvirt  pass
 test-amd64-amd64-xl-multivcpupass
 test-amd64-amd64-pairpass
 test-amd64-i386-pair pass
 test-amd64-amd64-xl-sedf-pin pass
 test-amd64-amd64-pv  pass
 test-amd64-i386-pv   pass
 test-amd64-amd64-xl-sedf pass
 test-amd64-i386-xl-qemut-w

Re: [Xen-devel] [PATCH 3/3] xen: arm: handle thumb mode guest in continuations

2015-03-25 Thread Julien Grall


Hi Ian,

On 25/03/2015 15:34, Ian Campbell wrote:

PC only needs adjusting by 2, otherwise we rerun the instruction prior
to the hvc as well.


I don't understand why you have to adjust PC by 2 for thumb.
The spec encodes the HVC thumb instruction on 32 bits (i.e 4 bytes).

Regards,


--
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH 2/3] xen: arm: correctly handle continuations for 64-bit guests

2015-03-25 Thread Julien Grall


Hi Ian,

On 25/03/2015 15:34, Ian Campbell wrote:

The 64-bit ABI is different to 32-bit:

  - uses x16 as the op register rather than r12.
  - arguments in x0..x5 and not r0..r5. Using rN here potentially
truncates.
  - return value goes in x0, not r0.

Hypercalls can only be made directly from kernel space, so checking
the domain's size is sufficient.

The update of regs->pc is duplicated in both halves because the 32-bit
case is going to need fixing to handle Thumb mode (next patch).

Spotted due to spurious -EFAULT when destroying a domain, due to the
hypercall's pointer argument being truncated. I'm unclear why I am
only seeing this now.


Good catch!

x16 would still contain the valid operation, because we are (most of the 
time?) continuing on the same hypercall.


So the only issue would be argument truncation. I guess that we don't 
have big value (i.e > 32 bits) to store.



Signed-off-by: Ian Campbell 
---
I imagine this needs backporting everywhere...


Agree for Xen 4.4 and Xen 4.5.

Reviewed-by: Julien Grall 

Regards,

--
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

[Xen-devel] [linux-3.10 test] 36723: regressions - FAIL

2015-03-25 Thread xen . org

flight 36723 linux-3.10 real [real]
http://www.chiark.greenend.org.uk/~xensrcts/logs/36723/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-xl-qemut-winxpsp3  7 windows-install fail REGR. vs. 26303

Tests which are failing intermittently (not blocking):
 test-amd64-amd64-xl-qemuu-ovmf-amd64  7 debian-hvm-install  fail pass in 36615

Regressions which are regarded as allowable (not blocking):
 test-amd64-amd64-rumpuserxen-amd64 11 rumpuserxen-demo-xenstorels/xenstorels 
fail blocked in 26303
 test-amd64-i386-pair17 guest-migrate/src_host/dst_host fail like 26303
 test-amd64-amd64-xl-winxpsp3  7 windows-install  fail   like 26303
 test-amd64-amd64-rumpuserxen-amd64 8 guest-start fail in 36615 blocked in 26303
 test-amd64-amd64-xl-qemuu-debianhvm-amd64 7 debian-hvm-install fail in 36615 
blocked in 26303

Tests which did not succeed, but are not blocking:
 test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm 7 debian-hvm-install fail never 
pass
 test-amd64-amd64-xl-pvh-intel  9 guest-start  fail  never pass
 test-amd64-amd64-xl-xsm   9 guest-start  fail   never pass
 test-amd64-amd64-xl-qemut-debianhvm-amd64-xsm 7 debian-hvm-install fail never 
pass
 test-amd64-i386-xl-qemut-debianhvm-amd64-xsm 7 debian-hvm-install fail never 
pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-xsm 7 debian-hvm-install fail never 
pass
 test-amd64-i386-libvirt-xsm   9 guest-start  fail   never pass
 test-amd64-i386-xl-xsm9 guest-start  fail   never pass
 test-amd64-amd64-libvirt-xsm  9 guest-start  fail   never pass
 test-amd64-i386-libvirt  10 migrate-support-checkfail   never pass
 test-amd64-amd64-xl-pcipt-intel  9 guest-start fail never pass
 test-amd64-amd64-xl-pvh-amd   9 guest-start  fail   never pass
 test-amd64-amd64-libvirt 10 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt-xsm  5 xen-boot fail   never pass
 test-armhf-armhf-xl-multivcpu  5 xen-boot fail  never pass
 test-armhf-armhf-xl-sedf-pin  5 xen-boot fail   never pass
 test-armhf-armhf-xl-midway5 xen-boot fail   never pass
 test-armhf-armhf-xl-credit2   5 xen-boot fail   never pass
 test-armhf-armhf-libvirt  5 xen-boot fail   never pass
 test-armhf-armhf-xl-xsm   5 xen-boot fail   never pass
 test-armhf-armhf-xl   5 xen-boot fail   never pass
 test-armhf-armhf-xl-sedf  5 xen-boot fail   never pass
 test-amd64-i386-xl-qemut-win7-amd64 14 guest-stop  fail never pass
 test-amd64-i386-xl-qemut-winxpsp3 14 guest-stopfail never pass
 test-amd64-i386-xl-qemuu-winxpsp3-vcpus1 14 guest-stop fail never pass
 test-amd64-amd64-xl-qemuu-winxpsp3 14 guest-stop   fail never pass
 test-amd64-i386-xl-winxpsp3-vcpus1 14 guest-stop   fail never pass
 test-amd64-i386-xl-qemuu-win7-amd64 14 guest-stop  fail never pass
 test-amd64-i386-xl-win7-amd64 14 guest-stop   fail  never pass
 test-amd64-amd64-xl-win7-amd64 14 guest-stop   fail never pass
 test-amd64-i386-xl-winxpsp3  14 guest-stop   fail   never pass
 test-amd64-amd64-xl-qemuu-win7-amd64 14 guest-stop fail never pass
 test-amd64-amd64-xl-qemut-win7-amd64 14 guest-stop fail never pass
 test-amd64-i386-xl-qemut-winxpsp3-vcpus1 14 guest-stop fail never pass
 test-amd64-i386-xl-qemuu-winxpsp3 14 guest-stopfail never pass

version targeted for testing:
 linux7f4e64246049cef5ae1eca37eec1701a9477799e
baseline version:
 linuxbe67db109090b17b56eb8eb2190cd70700f107aa


997 people touched revisions under test,
not listing them all


jobs:
 build-amd64-xsm  pass
 build-armhf-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-armhf  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-armhf-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvopspass
 build-armhf-pvopspass
 build-i386-pvops pass

[Xen-devel] [linux-3.16 test] 36722: regressions - FAIL

2015-03-25 Thread xen . org

flight 36722 linux-3.16 real [real]
http://www.chiark.greenend.org.uk/~xensrcts/logs/36722/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-amd64-pvops 5 kernel-build  fail REGR. vs. 34167
 test-amd64-i386-pair   17 guest-migrate/src_host/dst_host fail REGR. vs. 34167

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-xsm  1 build-check(1)blocked n/a
 test-amd64-amd64-xl-xsm   1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemut-debianhvm-amd64-xsm  1 build-check(1)blocked n/a
 test-amd64-amd64-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-qemut-debianhvm-amd64-xsm 7 debian-hvm-install fail never 
pass
 test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm 7 debian-hvm-install fail never 
pass
 test-amd64-amd64-xl-pvh-intel  1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt-xsm   9 guest-start  fail   never pass
 test-amd64-amd64-rumpuserxen-amd64  1 build-check(1)   blocked n/a
 test-armhf-armhf-xl-sedf 10 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 10 migrate-support-checkfail  never pass
 test-armhf-armhf-xl  10 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt 10 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  10 migrate-support-checkfail   never pass
 test-amd64-i386-xl-xsm9 guest-start  fail   never pass
 test-armhf-armhf-xl-sedf-pin 10 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt  1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-midway   10 migrate-support-checkfail   never pass
 test-amd64-amd64-xl-pvh-amd   1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-pcipt-intel  1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt  10 migrate-support-checkfail   never pass
 test-amd64-amd64-xl-multivcpu  1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-credit2   1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-xsm   5 xen-boot fail   never pass
 test-amd64-amd64-xl   1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt-xsm  5 xen-boot fail   never pass
 test-amd64-i386-freebsd10-i386  7 freebsd-install  fail never pass
 test-amd64-amd64-xl-sedf  1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemut-debianhvm-amd64  1 build-check(1)blocked n/a
 test-amd64-amd64-xl-qemuu-debianhvm-amd64  1 build-check(1)blocked n/a
 test-amd64-i386-freebsd10-amd64  7 freebsd-install fail never pass
 test-amd64-i386-xl-qemut-winxpsp3 14 guest-stopfail never pass
 test-amd64-i386-xl-qemut-win7-amd64 14 guest-stop  fail never pass
 test-amd64-amd64-pair 1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemut-winxpsp3  1 build-check(1)   blocked n/a
 test-amd64-amd64-xl-qemuu-ovmf-amd64  1 build-check(1) blocked n/a
 test-amd64-amd64-xl-win7-amd64  1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-qemuu-winxpsp3-vcpus1 14 guest-stop fail never pass
 test-amd64-i386-xl-win7-amd64 14 guest-stop   fail  never pass
 test-amd64-i386-xl-qemuu-winxpsp3 14 guest-stopfail never pass
 test-amd64-amd64-xl-qemuu-win7-amd64  1 build-check(1) blocked n/a
 test-amd64-i386-xl-qemuu-win7-amd64 14 guest-stop  fail never pass
 test-amd64-amd64-xl-sedf-pin  1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemut-win7-amd64  1 build-check(1) blocked n/a
 test-amd64-i386-xl-winxpsp3  14 guest-stop   fail   never pass
 test-amd64-i386-xl-winxpsp3-vcpus1 14 guest-stop   fail never pass
 test-amd64-i386-xl-qemut-winxpsp3-vcpus1 14 guest-stop fail never pass
 test-amd64-amd64-xl-winxpsp3  1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemuu-winxpsp3  1 build-check(1)   blocked n/a

version targeted for testing:
 linux4923505b93e073f70380557cb360997e5b84b4f9
baseline version:
 linux19583ca584d6f574384e17fe7613dfaeadcdc4a6


1129 people touched revisions under test,
not listing them all


jobs:
 build-amd64-xsm  pass
 build-armhf-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-armhf  pass
 bu

Re: [Xen-devel] [PATCH OSSTEST 04/12] toolstack: distinguish local and remote migration support

2015-03-25 Thread Jim Fehlig

Ian Campbell wrote:
> On Mon, 2015-02-09 at 11:09 +, Wei Liu wrote:
>   
>> Libvirt supports migrating a guest to remote host but not local host.
>> 
>
> Jim, is that right?
>   

Opps, I missed this mail.  Sorry for the delay.

Yes, that is correct

# virsh migrate --live test-pv xen+ssh://localhost
error: Requested operation is not valid: domain 'test-pv' is already active

I suppose the code could be fixed up to allow localhost migration, but
doing so would be quite low on my todo list.

Regards,
Jim


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] 3.18 xen-pcifront regression?

2015-03-25 Thread Michael D Labriola

Konrad Rzeszutek Wilk  wrote on 03/25/2015 
04:27:00 PM:

> From: Konrad Rzeszutek Wilk 
> To: Bjorn Helgaas , 
> Cc: Michael D Labriola , xen-
> de...@lists.xenproject.org, Stuart Wehrly , 
> michael.d.labri...@gmail.com, Jayson A Dyke , "Rafael 
> J. Wysocki" , linux-...@vger.kernel.org, linux-
> a...@vger.kernel.org
> Date: 03/25/2015 04:27 PM
> Subject: Re: [Xen-devel] 3.18 xen-pcifront regression?
> 
> On Tue, Mar 24, 2015 at 12:27:02PM -0500, Bjorn Helgaas wrote:
> > [+cc Rafael, linux-pci, linux-acpi]
> > 
> > On Tue, Mar 24, 2015 at 11:28:06AM -0400, Konrad Rzeszutek Wilk wrote:
> > > On Tue, Mar 24, 2015 at 11:14:29AM -0400, Michael D Labriola wrote:
> > > > I'm having problems booting a 3.18 or newer domU w/ PCI devices 
passed 
> > > > through.  It only seems to be the domU kernel that's upset 
> (i.e., Behavior 
> > > > is identical whether I'm running 3.19 or 3.13 dom0).  I'm running 
a 32bit 
> > > > dom0 (3.13.11) w/ 64bit 4.4.0 hypervisor and 32bit domU.  I get 
the 
> > > > following Oops when trying to boot my domU with a couple PCI cards 
passed 
> > > > through:
> > > > 
> > > > BUG: unable to handle kernel paging request at 0030303e
> > > > IP: [] acpi_ns_validate_handle+0x12/0x1a
> > > > *pdpt = 019f1027 *pde =  
> > > > Oops:  [#1] PREEMPT SMP 
> > > > Modules linked in: xen_pcifront(+) pcspkr xen_blkfront loop
> > > > CPU: 0 PID: 18 Comm: xenwatch Not tainted 3.17.0-test+ #6
> > > > task: cb869950 ti: cb8ae000 task.ti: cb8ae000
> > > > EIP: 0061:[] EFLAGS: 00010246 CPU: 0
> > > > EIP is at acpi_ns_validate_handle+0x12/0x1a
> > > > EAX:  EBX: cb895dc0 ECX:  EDX: 0030303a
> > > > ESI: c0a6bccd EDI:  EBP: 0004 ESP: cb8afd00
> > > >  DS: 007b ES: 007b FS: 00d8 GS:  SS: 0069
> > > > CR0: 8005003b CR2: 0030303e CR3: 0a68e000 CR4: 00040660
> > > > Stack:
> > > >  c06eda4d  c096a21d   6462  
c0c102c0
> > > >  0030303a 00040004 0030303a cb8afd94 cb8afdec cb8afd60 c06b78e1 
cb8afd60
> > > >  0061 0246 c0407bc7 c0c102c0  cb8afda0 cb8afda8 
cb8afdb0
> > > > Call Trace:
> > > >  [] ? acpi_evaluate_object+0x31/0x1fc
> > > 
> > > We should not be calling in any acpi code in PV domU guests.
> > > 
> > > WE actually disable it (acpi=0) to make sure we don't call it - as
> > > there is no ACPI AML data at all in the guest.
> > > 
> > > CC-ing Bjorn.
> > > >  [] ? resume_kernel+0x5/0x7
> > > >  [] ? pci_get_hp_params+0x111/0x4e0
> > > >  [] ? xen_force_evtchn_callback+0x17/0x30
> > > >  [] ? xen_restore_fl_direct_reloc+0x4/0x4
> > > >  [] ? pci_device_add+0x24/0x450
> > > >  [] ? pci_bus_read_config_word+0x6e/0x80
> > > >  [] ? pci_scan_single_device+0x8d/0xb0
> > > >  [] ? pci_scan_slot+0x3c/0xf0
> > > >  [] ? pci_scan_child_bus+0x1c/0x90
> > > >  [] ? pci_scan_bus_parented+0x6a/0x90
> > > >  [] ? pcifront_scan_root+0x91/0x130 [xen_pcifront]
> > > >  [] ? pcifront_backend_changed+0x4af/0x654 
[xen_pcifront]
> > > >  [] ? xenbus_gather+0x5f/0x90
> > > >  [] ? xenbus_gather+0x5f/0x90
> > > >  [] ? xenbus_read_driver_state+0x33/0x50
> > > >  [] ? xenbus_otherend_changed+0x95/0xa0
> > > >  [] ? backend_changed+0xf/0x20
> > > >  [] ? xenwatch_thread+0x72/0x110
> > > >  [] ? bit_waitqueue+0x50/0x50
> > > >  [] ? join+0x70/0x70
> > > >  [] ? kthread+0xab/0xd0
> > > >  [] ? ret_from_kernel_thread+0x21/0x30
> > > >  [] ? flush_kthread_worker+0xa0/0xa0
> > > > Code: 03 10 00 00 eb 0e 46 83 c2 04 4b 85 db 75 b9 c6 02 00 31 
> c0 5b 5e 5f 
> > > > 5d c3 89 c2 8d 40 ff 83 f8 fd 76 06 a1 2c 32 c1 c0 c3 31 c0 <80>7a 
04 0f 
> > > > 0f 44 c2 c3 83 ec 10 83 f8 1d 76 24 89 44 24 0c c7
> > > > EIP: [] acpi_ns_validate_handle+0x12/0x1a SS:ESP 
0069:cb8afd00
> > > > CR2: 0030303e
> > > > ---[ end trace d4ddeb038cbcbdf7 ]---
> > > > 
> > > > 
> > > > I've bisected down to the following commit in 3.18, which breaks 
my 
> > > > system.
> > > > 
> > > > 6cd33649fa83d97ba7b66f1d871a360e867c5220 is the first bad commit
> > > > commit 6cd33649fa83d97ba7b66f1d871a360e867c5220
> > > > Author: Bjorn Helgaas 
> > > > Date:   Wed Aug 27 14:29:47 2014 -0600
> > > > 
> > > > PCI: Add pci_configure_device() during enumeration
> > > > 
> > > > Some platforms can tell the OS how to configure PCI 
devices,e.g., how 
> > > > to
> > > > set cache line size, error reporting enables, etc.  ACPI 
defines _HPP 
> > > > and
> > > > _HPX methods for this purpose.
> > > > 
> > > > This configuration was previously done by some of the hotplug 
drivers 
> > > > using
> > > > pci_configure_slot().  But not all hotplug drivers did this, 
and per 
> > > > the
> > > > spec (ACPI rev 5.0, sec 6.2.7), we can also do it for "devices 
not
> > > > configured by the BIOS at system boot."
> > > > 
> > > > Move this configuration into the PCI core by adding 
> > > > pci_configure_device()
> > > > and calling it from pci_device_add(), so we do this for all 
> devices as 
> > > > we
> > > > enumerate

Re: [Xen-devel] [PATCH 05/11] x86/altp2m: basic data structures and support routines.

2015-03-25 Thread Ed White

>>> diff --git a/xen/arch/x86/mm/hap/hap.c b/xen/arch/x86/mm/hap/hap.c
>>> index abf3d7a..8fe0650 100644
>>> --- a/xen/arch/x86/mm/hap/hap.c
>>> +++ b/xen/arch/x86/mm/hap/hap.c
>>> @@ -439,7 +439,7 @@ void hap_domain_init(struct domain *d)
>>>  int hap_enable(struct domain *d, u32 mode)
>>>  {
>>>  unsigned int old_pages;
>>> -uint8_t i;
>>> +uint16_t i;
>>>  int rv = 0;
>>>  
>>>  domain_pause(d);
>>> @@ -485,6 +485,23 @@ int hap_enable(struct domain *d, u32 mode)
>>> goto out;
>>>  }
>>>  
>>> +/* Init alternate p2m data */
>>> +if ( (d->arch.altp2m_eptp = alloc_xenheap_page()) == NULL )
>>
>> This memory should be allocated from some domain-accounted pool,
>> probably the paging pool (d->->arch.paging.alloc_page()).  You can use
>> map_domain_page_global() to get a safe pointer to anchor in
>> d->arch.altp2m_eptp for hardware.

I tried this but could not get it to work due to panics in Xen.
Looking at the current VMX code, all the existing structures
shared with hardware (VMCS, exception bitmap, etc.) are allocated
using alloc_xenheap_page(), which is what induced me to write the
code this way.

Ed

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] 3.18 xen-pcifront regression?

2015-03-25 Thread Konrad Rzeszutek Wilk

> > Thanks for the report, Michael, and sorry for the inconvenience.  I think
> > the patch below will fix it, but I don't think it's the right fix either
> > because it seems a little ad hoc to sprinkle "acpi_pci_disabled" tests
> > around like fairy dust.  I wonder if we can set things up so ACPI methods
> > would fail gracefully like they do when ACPI is disabled at compile-time.
> > 
> > I can boot with "acpi=off" on qemu just fine, and when we look up the ACPI
> > device handles, we just get NULL pointers, so everything works out even
> > without a fix like the one below.
> 
> 
> > 
> > There must be something different about the way things get set up in that
> > domU kernel.  I'll try to look into that some more, but I'm going on
> > vacation for the next week, so if you learn anything before then, let me
> > know.
> 
> And I don't see where 'resume_kernel' gets called.
> 
> 
> And under my PV guest I see:
> # lspci
> 00:00.0 USB Controller: NEC Corporation Device 0194 (rev 04)
> 00:01.0 USB Controller: NEC Corporation Device 0194 (rev 03)
> 
> when the guest boots up. Hm, what kind of cards are these?
> Could you also provide the config file and the guest config please?


It helps when I boot an proper kernel (I had booted 3.16 by mistake).

With the 4.0 kernel I see this as well:

[4.059387] pci :00:00.0: reg 0x10: [mem 0xfbd0-0xfbd01fff 64bit]
[4.114184] BUG: unable to handle kernel paging request at 0030303e
[4.114199] IP: [] acpi_ns_validate_handle+0x1c/0x26
[4.114216] *pdpt =  *pde = c2c2c2c2c2c2c2c2 
[4.114230] Oops:  [#1] SMP 
[4.114241] Modules linked in:
[4.114252] CPU: 0 PID: 22 Comm: xenwatch Not tainted 
4.0.0-rc5upstream-00070-g3a88f16 #1
[4.114268] task: dd557370 ti: dd55a000 task.ti: dd55a000 
[4.114278] EIP: e019:[] EFLAGS: 00010246 CPU: 0
[4.114289] EIP is at acpi_ns_validate_handle+0x1c/0x26
[4.114299] EAX:  EBX: 0030303a ECX:  EDX: 
[4.114310] ESI: dd66e7c0 EDI: 0030303a EBP: dd55bcd8 ESP: dd55bcd4
[4.114322]  DS: 007b ES: 007b FS: 00d8 GS:  SS: e021
[4.114334] CR0: 80050033 CR2: 0030303e CR3: 019a4000 CR4: 00042660
[4.114347] Stack:
[4.114354]  c17ae853 dd55bd14 c137ae18 0010 c1a602c0 dd55bcf0 c109ae9a 
dd55bcfc
[4.114376]  c13a9c57 deadbeef dd55bd50  deadbeef 0030303a dd55bd78 
dd55bdd0
[4.114397]  dd55bd58 c1336d90 dd55bd40 007b 007b 00d8 dd55bd94 
dd55bd8c
[4.114419] Call Trace:
[4.114429]  [] acpi_evaluate_object+0x6f/0x366
[4.114443]  [] ? irq_exit+0x4a/0xc0
[4.114456]  [] ? xen_evtchn_do_upcall+0x27/0x40
[4.114468]  [] pci_get_hp_params+0x110/0x4b0
[4.114480]  [] pci_device_add+0x26/0x450
[4.114494]  [] ? xen_restore_fl_direct_reloc+0x4/0x4  
[4.115132]  [] ? _raw_spin_unlock_irqrestore+0x16/0x40
[4.115132]  [] pci_scan_single_device+0x8b/0xb0
[4.115132]  [] pci_scan_slot+0x43/0x100
[4.115132]  [] pci_scan_child_bus+0x1c/0xa0   
[4.115132]  [] pci_scan_bus_parented+0x64/0x90
[4.115132]  [] pcifront_scan_root+0x90/0x120
[4.115132]  [] pcifront_backend_changed+0x475/0x63c
[4.115132]  [] ? kmemleak_free+0x20/0x50
[4.115132]  [] ? kfree+0x7d/0x100
[4.115132]  [] ? __cleanup+0xfd/0x180   
[4.115132]  [] ? xenbus_gather+0x5d/0x90
[4.115132]  [] ? xenbus_read_driver_state+0x35/0x50
[4.115132]  [] xenbus_otherend_changed+0x7d/0x80
[4.115132]  [] backend_changed+0x12/0x20 
[4.115132]  [] xenwatch_thread+0x72/0x120
[4.115132]  [] ? woken_wake_function+0x20/0x20
[4.115132]  [] kthread+0xac/0xd0
[4.115132]  [] ? xenbus_transaction_start+0x60/0x60
[4.115132]  [] ret_from_kernel_thread+0x21/0x30
[4.115132]  [] ? kthread_freezable_should_stop+0x60/0x60
[4.115132] Code: 4f b2 00 00 31 c0 83 c4 2c 5b 5e 5f 5d c3 90 55 89 e5 53 
89 c3 e8 45 b0 00 00 8d 43 ff 83 f8 fd 76 07 a1 20 80 a6 c1 eb 09 31 c0 <80> 7b 
04 0f 0f 44 c3 5b 5d c3 55 89 e5 53 89 c3 e8 1f b0 00 00
[4.115132] EIP: [] acpi_ns_validate_handle+0x1c/0x26 SS:ESP 
e021:dd55bcd4
[4.115132] CR2: 0030303e
[4.115132] ---[ end trace 21d8bfe52b77b825 ]---
[4.115132] Kernel panic - not syncing: Fatal exception
[4.115132] Kernel Offset: 0x0 from 0xc100 (relocation range: 
0xc000-0xedbfdfff)

The interesting thing is that under 64-bit kernels I see this:

[3.304732] pci_bus :00: root bus resource [io  0x-0x]^M
[3.304748] pci_bus :00: root bus resource [mem 0x-0xf]^M
[3.304764] pci_bus :00: root bus resource [bus 00-ff]^M
[3.305023] pci :00:00.0: [1033:0194] type 00 class 0x0c0330^M
[3.322181] pci :00:00.0: reg 0x10: [mem 0xfbd0-0xfbd01fff 64bit]^M
[3.377359] ACPI Exception: AE_BAD_PARAMETER, Thread 520623344 could not 
acquire Mutex [0x1] (20150204/utmutex-285)^M
[3.377379] ACPI Exception: AE_BAD_PARAMETER, Thread 520623344 could not 
acquire Mutex [0x1] (20150204/utmutex

Re: [Xen-devel] [PATCH v1 04/47] pci: add pci_ioremap_wc_bar()

2015-03-25 Thread Luis R. Rodriguez

On Wed, Mar 25, 2015 at 04:03:46PM -0400, Konrad Rzeszutek Wilk wrote:
> On Fri, Mar 20, 2015 at 04:17:54PM -0700, Luis R. Rodriguez wrote:
> > From: "Luis R. Rodriguez" 
> > 
> > This lets drivers take advanate of PAT when available. This
> 
> s/advanate/advantage/

Amended.

> > should help with the transition of converting video drivers over
> > to ioremap_wc() to help with the goal of eventually using
> > _PAGE_CACHE_UC over _PAGE_CACHE_UC_MINUS on x86 on
> > ioremap_nocache() (de33c442e)
> 
> Please mention the title of the patch too:
> 
> "x86 PAT: fix performance drop for glx, use UC minus for ioremap(), 
> ioremap_nocache() and pci_mmap_page_range()"

Added.

> > 
> > Cc: Suresh Siddha 
> > Cc: Venkatesh Pallipadi 
> > Cc: Ingo Molnar 
> > Cc: Thomas Gleixner 
> > Cc: Juergen Gross 
> > Cc: Daniel Vetter 
> > Cc: Andy Lutomirski 
> > Cc: Dave Airlie 
> > Cc: Antonino Daplas 
> > Cc: Jean-Christophe Plagniol-Villard 
> > Cc: Tomi Valkeinen 
> > Cc: linux-fb...@vger.kernel.org
> > Cc: linux-ker...@vger.kernel.org
> > Signed-off-by: Luis R. Rodriguez 
> > ---
> >  drivers/pci/pci.c   | 14 ++
> >  include/linux/pci.h |  1 +
> >  2 files changed, 15 insertions(+)
> > 
> > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> > index 81f06e8..6afd507 100644
> > --- a/drivers/pci/pci.c
> > +++ b/drivers/pci/pci.c
> > @@ -137,6 +137,20 @@ void __iomem *pci_ioremap_bar(struct pci_dev *pdev, 
> > int bar)
> >  pci_resource_len(pdev, bar));
> >  }
> >  EXPORT_SYMBOL_GPL(pci_ioremap_bar);
> > +
> > +void __iomem *pci_ioremap_wc_bar(struct pci_dev *pdev, int bar)
> > +{
> > +   /*
> > +* Make sure the BAR is actually a memory resource, not an IO resource
> > +*/
> > +   if (!(pci_resource_flags(pdev, bar) & IORESOURCE_MEM)) {
> > +   WARN_ON(1);
> 
> Would it be better to use dev_warn ? That way you can see which BDF it is?
> 
> Thought WARN will give a nice stack-trace that should easily point to the
> driver so perhaps not.. Either way - up to you.

I'm sticking to the style and use as with pci_ioremap_bar(). Whatever we pick
we should make both use the same. More information is always better and
since we do have dev_warn(), it would be nice to use that however within
its use on both pci_ioremap_wc_bar() and pci_ioremap_bar() we have
a use of the pdev with pci_resource_flags() and I believe if pdev is NULL
we'd get a NULL dereference (dev_driver_string() is used), so it would
seem it might be best to stick with a simple WARN_ON(). Arjan, any
preference? Obviously if pdev is NULL your driver is dumb but as folks
develop drivers this should be expected.

 Luis

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] 3.18 xen-pcifront regression?

2015-03-25 Thread Konrad Rzeszutek Wilk

On Tue, Mar 24, 2015 at 12:27:02PM -0500, Bjorn Helgaas wrote:
> [+cc Rafael, linux-pci, linux-acpi]
> 
> On Tue, Mar 24, 2015 at 11:28:06AM -0400, Konrad Rzeszutek Wilk wrote:
> > On Tue, Mar 24, 2015 at 11:14:29AM -0400, Michael D Labriola wrote:
> > > I'm having problems booting a 3.18 or newer domU w/ PCI devices passed 
> > > through.  It only seems to be the domU kernel that's upset (i.e., 
> > > Behavior 
> > > is identical whether I'm running 3.19 or 3.13 dom0).  I'm running a 32bit 
> > > dom0 (3.13.11) w/ 64bit 4.4.0 hypervisor and 32bit domU.  I get the 
> > > following Oops when trying to boot my domU with a couple PCI cards passed 
> > > through:
> > > 
> > > BUG: unable to handle kernel paging request at 0030303e
> > > IP: [] acpi_ns_validate_handle+0x12/0x1a
> > > *pdpt = 019f1027 *pde =  
> > > Oops:  [#1] PREEMPT SMP 
> > > Modules linked in: xen_pcifront(+) pcspkr xen_blkfront loop
> > > CPU: 0 PID: 18 Comm: xenwatch Not tainted 3.17.0-test+ #6
> > > task: cb869950 ti: cb8ae000 task.ti: cb8ae000
> > > EIP: 0061:[] EFLAGS: 00010246 CPU: 0
> > > EIP is at acpi_ns_validate_handle+0x12/0x1a
> > > EAX:  EBX: cb895dc0 ECX:  EDX: 0030303a
> > > ESI: c0a6bccd EDI:  EBP: 0004 ESP: cb8afd00
> > >  DS: 007b ES: 007b FS: 00d8 GS:  SS: 0069
> > > CR0: 8005003b CR2: 0030303e CR3: 0a68e000 CR4: 00040660
> > > Stack:
> > >  c06eda4d  c096a21d   6462  c0c102c0
> > >  0030303a 00040004 0030303a cb8afd94 cb8afdec cb8afd60 c06b78e1 cb8afd60
> > >  0061 0246 c0407bc7 c0c102c0  cb8afda0 cb8afda8 cb8afdb0
> > > Call Trace:
> > >  [] ? acpi_evaluate_object+0x31/0x1fc
> > 
> > We should not be calling in any acpi code in PV domU guests.
> > 
> > WE actually disable it (acpi=0) to make sure we don't call it - as
> > there is no ACPI AML data at all in the guest.
> > 
> > CC-ing Bjorn.
> > >  [] ? resume_kernel+0x5/0x7
> > >  [] ? pci_get_hp_params+0x111/0x4e0
> > >  [] ? xen_force_evtchn_callback+0x17/0x30
> > >  [] ? xen_restore_fl_direct_reloc+0x4/0x4
> > >  [] ? pci_device_add+0x24/0x450
> > >  [] ? pci_bus_read_config_word+0x6e/0x80
> > >  [] ? pci_scan_single_device+0x8d/0xb0
> > >  [] ? pci_scan_slot+0x3c/0xf0
> > >  [] ? pci_scan_child_bus+0x1c/0x90
> > >  [] ? pci_scan_bus_parented+0x6a/0x90
> > >  [] ? pcifront_scan_root+0x91/0x130 [xen_pcifront]
> > >  [] ? pcifront_backend_changed+0x4af/0x654 [xen_pcifront]
> > >  [] ? xenbus_gather+0x5f/0x90
> > >  [] ? xenbus_gather+0x5f/0x90
> > >  [] ? xenbus_read_driver_state+0x33/0x50
> > >  [] ? xenbus_otherend_changed+0x95/0xa0
> > >  [] ? backend_changed+0xf/0x20
> > >  [] ? xenwatch_thread+0x72/0x110
> > >  [] ? bit_waitqueue+0x50/0x50
> > >  [] ? join+0x70/0x70
> > >  [] ? kthread+0xab/0xd0
> > >  [] ? ret_from_kernel_thread+0x21/0x30
> > >  [] ? flush_kthread_worker+0xa0/0xa0
> > > Code: 03 10 00 00 eb 0e 46 83 c2 04 4b 85 db 75 b9 c6 02 00 31 c0 5b 5e 
> > > 5f 
> > > 5d c3 89 c2 8d 40 ff 83 f8 fd 76 06 a1 2c 32 c1 c0 c3 31 c0 <80> 7a 04 0f 
> > > 0f 44 c2 c3 83 ec 10 83 f8 1d 76 24 89 44 24 0c c7
> > > EIP: [] acpi_ns_validate_handle+0x12/0x1a SS:ESP 0069:cb8afd00
> > > CR2: 0030303e
> > > ---[ end trace d4ddeb038cbcbdf7 ]---
> > > 
> > > 
> > > I've bisected down to the following commit in 3.18, which breaks my 
> > > system.
> > > 
> > > 6cd33649fa83d97ba7b66f1d871a360e867c5220 is the first bad commit
> > > commit 6cd33649fa83d97ba7b66f1d871a360e867c5220
> > > Author: Bjorn Helgaas 
> > > Date:   Wed Aug 27 14:29:47 2014 -0600
> > > 
> > > PCI: Add pci_configure_device() during enumeration
> > >  
> > > Some platforms can tell the OS how to configure PCI devices, e.g., 
> > > how 
> > > to
> > > set cache line size, error reporting enables, etc.  ACPI defines _HPP 
> > > and
> > > _HPX methods for this purpose.
> > >  
> > > This configuration was previously done by some of the hotplug drivers 
> > > using
> > > pci_configure_slot().  But not all hotplug drivers did this, and per 
> > > the
> > > spec (ACPI rev 5.0, sec 6.2.7), we can also do it for "devices not
> > > configured by the BIOS at system boot."
> > >  
> > > Move this configuration into the PCI core by adding 
> > > pci_configure_device()
> > > and calling it from pci_device_add(), so we do this for all devices 
> > > as 
> > > we
> > > enumerate them.
> > >  
> > > This is based on pci_configure_slot(), which is used by hotplug 
> > > drivers.
> > > I omitted:
> > >  
> > >   - pcie_bus_configure_settings() because it configures MPS and MRRS, 
> > > which
> > > requires global knowledge of the fabric and must be done later, 
> > > and
> > >  
> > >   - configuration of subordinate devices; that will happen when we 
> > > call
> > > pci_device_add() for those devices.
> > >  
> > > Because pci_configure_slot() was only done by hotplug drivers, this 
> > > initial
> > > version of pci_configu

Re: [Xen-devel] [PATCH v1 05/47] pci: add pci_iomap_wc() variants

2015-03-25 Thread Konrad Rzeszutek Wilk

On Fri, Mar 20, 2015 at 04:17:55PM -0700, Luis R. Rodriguez wrote:
> From: "Luis R. Rodriguez" 
> 
> This allows drivers to take advantage of write-combining
> when possible. Ideally we'd have pci_read_bases() just
> peg an IORESOURCE_WC flag for us but where exactly
> video devices memory lie varies *largely* and at times things
> are mixed with MMIO registers, sometimes we can address
> the changes in drivers, other times the change requires
> intrusive changes.
> 
> Although there is also arch_phys_wc_add() that makes use of
> architecture specific write-combinging alternatives (MTRR on

combinging?

> x86 when a system does not have PAT) we void polluting
> pci_iomap() space with it and force drivers and subsystems
> that want to use it to be explicit.
> 
> There are a few motivations for this:
> 
> a) Take advantage of PAT when available
> 
> b) Help bury MTRR code away, MTRR is architecture specific and on
>x86 its replaced by PAT
> 
> c) Help with the goal of eventually using _PAGE_CACHE_UC over
>_PAGE_CACHE_UC_MINUS on x86 on ioremap_nocache() (de33c442e)
> 
> Cc: Andy Lutomirski 
> Cc: Suresh Siddha 
> Cc: Venkatesh Pallipadi 
> Cc: Ingo Molnar 
> Cc: Thomas Gleixner 
> Cc: Juergen Gross 
> Cc: Daniel Vetter 
> Cc: Dave Airlie 
> Cc: Bjorn Helgaas 
> Cc: Antonino Daplas 
> Cc: Jean-Christophe Plagniol-Villard 
> Cc: Tomi Valkeinen 
> Cc: Dave Hansen 
> Cc: Arnd Bergmann 
> Cc: Michael S. Tsirkin 
> Cc: venkatesh.pallip...@intel.com
> Cc: Stefan Bader 
> Cc: konrad.w...@oracle.com
> Cc: ville.syrj...@linux.intel.com
> Cc: david.vra...@citrix.com
> Cc: jbeul...@suse.com
> Cc: toshi.k...@hp.com
> Cc: Roger Pau Monné 
> Cc: linux-fb...@vger.kernel.org
> Cc: linux-ker...@vger.kernel.org
> Cc: xen-de...@lists.xensource.com
> Signed-off-by: Luis R. Rodriguez 
> ---
>  include/asm-generic/pci_iomap.h | 14 ++
>  lib/pci_iomap.c | 61 
> +
>  2 files changed, 75 insertions(+)
> 
> diff --git a/include/asm-generic/pci_iomap.h b/include/asm-generic/pci_iomap.h
> index 7389c87..b1e17fc 100644
> --- a/include/asm-generic/pci_iomap.h
> +++ b/include/asm-generic/pci_iomap.h
> @@ -15,9 +15,13 @@ struct pci_dev;
>  #ifdef CONFIG_PCI
>  /* Create a virtual mapping cookie for a PCI BAR (memory or IO) */
>  extern void __iomem *pci_iomap(struct pci_dev *dev, int bar, unsigned long 
> max);
> +extern void __iomem *pci_iomap_wc(struct pci_dev *dev, int bar, unsigned 
> long max);
>  extern void __iomem *pci_iomap_range(struct pci_dev *dev, int bar,
>unsigned long offset,
>unsigned long maxlen);
> +extern void __iomem *pci_iomap_wc_range(struct pci_dev *dev, int bar,
> + unsigned long offset,
> + unsigned long maxlen);
>  /* Create a virtual mapping cookie for a port on a given PCI device.
>   * Do not call this directly, it exists to make it easier for architectures
>   * to override */
> @@ -34,12 +38,22 @@ static inline void __iomem *pci_iomap(struct pci_dev 
> *dev, int bar, unsigned lon
>   return NULL;
>  }
>  
> +static inline void __iomem *pci_iomap_wc(struct pci_dev *dev, int bar, 
> unsigned long max)
> +{
> + return NULL;
> +}
>  static inline void __iomem *pci_iomap_range(struct pci_dev *dev, int bar,
>   unsigned long offset,
>   unsigned long maxlen)
>  {
>   return NULL;
>  }
> +static inline void __iomem *pci_iomap_wc_range(struct pci_dev *dev, int bar,
> +unsigned long offset,
> +unsigned long maxlen)
> +{
> + return NULL;
> +}
>  #endif
>  
>  #endif /* __ASM_GENERIC_IO_H */
> diff --git a/lib/pci_iomap.c b/lib/pci_iomap.c
> index bcce5f1..30b65ae 100644
> --- a/lib/pci_iomap.c
> +++ b/lib/pci_iomap.c
> @@ -52,6 +52,46 @@ void __iomem *pci_iomap_range(struct pci_dev *dev,
>  EXPORT_SYMBOL(pci_iomap_range);
>  
>  /**
> + * pci_iomap_wc_range - create a virtual WC mapping cookie for a PCI BAR
> + * @dev: PCI device that owns the BAR
> + * @bar: BAR number
> + * @offset: map memory at the given offset in BAR
> + * @maxlen: max length of the memory to map
> + *
> + * Using this function you will get a __iomem address to your device BAR.
> + * You can access it using ioread*() and iowrite*(). These functions hide
> + * the details if this is a MMIO or PIO address space and will just do what
> + * you expect from them in the correct way. When possible write combining
> + * is used.
> + *
> + * @maxlen specifies the maximum length to map. If you want to get access to
> + * the complete BAR from offset to the end, pass %0 here.

s/%0/0 ? Or is that some special syntax?

> + * */
> +void __iomem *pci_iomap_wc_range(struct pci_dev *dev,
> +  int bar,
> +  unsigned l

[Xen-devel] [PATCH 2/3] libxl: acquire a job when destroying a domain

2015-03-25 Thread Jim Fehlig

A job should be acquired at the beginning of a domain destroy operation,
not at the end when cleaning up the domain.  Fix two occurances of this
late job acquisition in the libxl driver.  Doing so renders
libxlDomainCleanup unused, so it is removed.

Signed-off-by: Jim Fehlig 
---
 src/libxl/libxl_domain.c | 49 +---
 src/libxl/libxl_domain.h |  4 
 src/libxl/libxl_driver.c | 20 
 3 files changed, 29 insertions(+), 44 deletions(-)

diff --git a/src/libxl/libxl_domain.c b/src/libxl/libxl_domain.c
index eca1ff0..e240eae 100644
--- a/src/libxl/libxl_domain.c
+++ b/src/libxl/libxl_domain.c
@@ -371,6 +371,9 @@ libxlDomainShutdownThread(void *opaque)
 
 cfg = libxlDriverConfigGet(driver);
 
+if (libxlDomainObjBeginJob(driver, vm, LIBXL_JOB_MODIFY) < 0)
+goto cleanup;
+
 if (xl_reason == LIBXL_SHUTDOWN_REASON_POWEROFF) {
 dom_event = virDomainEventLifecycleNewFromObj(vm,
VIR_DOMAIN_EVENT_STOPPED,
@@ -384,7 +387,7 @@ libxlDomainShutdownThread(void *opaque)
 goto restart;
 case VIR_DOMAIN_LIFECYCLE_PRESERVE:
 case VIR_DOMAIN_LIFECYCLE_LAST:
-goto cleanup;
+goto endjob;
 }
 } else if (xl_reason == LIBXL_SHUTDOWN_REASON_CRASH) {
 dom_event = virDomainEventLifecycleNewFromObj(vm,
@@ -399,7 +402,7 @@ libxlDomainShutdownThread(void *opaque)
 goto restart;
 case VIR_DOMAIN_LIFECYCLE_CRASH_PRESERVE:
 case VIR_DOMAIN_LIFECYCLE_CRASH_LAST:
-goto cleanup;
+goto endjob;
 case VIR_DOMAIN_LIFECYCLE_CRASH_COREDUMP_DESTROY:
 libxlDomainAutoCoreDump(driver, vm);
 goto destroy;
@@ -420,11 +423,11 @@ libxlDomainShutdownThread(void *opaque)
 goto restart;
 case VIR_DOMAIN_LIFECYCLE_PRESERVE:
 case VIR_DOMAIN_LIFECYCLE_LAST:
-goto cleanup;
+goto endjob;
 }
 } else {
 VIR_INFO("Unhandled shutdown_reason %d", xl_reason);
-goto cleanup;
+goto endjob;
 }
 
  destroy:
@@ -433,13 +436,11 @@ libxlDomainShutdownThread(void *opaque)
 dom_event = NULL;
 }
 libxl_domain_destroy(cfg->ctx, vm->def->id, NULL);
-if (libxlDomainCleanupJob(driver, vm, reason)) {
-if (!vm->persistent) {
-virDomainObjListRemove(driver->domains, vm);
-vm = NULL;
-}
-}
-goto cleanup;
+libxlDomainCleanup(driver, vm, reason);
+if (!vm->persistent)
+virDomainObjListRemove(driver->domains, vm);
+
+goto endjob;
 
  restart:
 if (dom_event) {
@@ -447,13 +448,17 @@ libxlDomainShutdownThread(void *opaque)
 dom_event = NULL;
 }
 libxl_domain_destroy(cfg->ctx, vm->def->id, NULL);
-libxlDomainCleanupJob(driver, vm, VIR_DOMAIN_SHUTOFF_SHUTDOWN);
+libxlDomainCleanup(driver, vm, VIR_DOMAIN_SHUTOFF_SHUTDOWN);
 if (libxlDomainStart(driver, vm, false, -1) < 0) {
 virErrorPtr err = virGetLastError();
 VIR_ERROR(_("Failed to restart VM '%s': %s"),
   vm->def->name, err ? err->message : _("unknown error"));
 }
 
+ endjob:
+if (!libxlDomainObjEndJob(driver, vm))
+vm = NULL;
+
  cleanup:
 if (vm)
 virObjectUnlock(vm);
@@ -690,26 +695,6 @@ libxlDomainCleanup(libxlDriverPrivatePtr driver,
 }
 
 /*
- * Cleanup function for domain that has reached shutoff state.
- * Executed in the context of a job.
- *
- * virDomainObjPtr should be locked on invocation
- * Returns true if references remain on virDomainObjPtr, false otherwise.
- */
-bool
-libxlDomainCleanupJob(libxlDriverPrivatePtr driver,
-  virDomainObjPtr vm,
-  virDomainShutoffReason reason)
-{
-if (libxlDomainObjBeginJob(driver, vm, LIBXL_JOB_DESTROY) < 0)
-return true;
-
-libxlDomainCleanup(driver, vm, reason);
-
-return libxlDomainObjEndJob(driver, vm);
-}
-
-/*
  * Core dump domain to default dump path.
  *
  * virDomainObjPtr must be locked on invocation
diff --git a/src/libxl/libxl_domain.h b/src/libxl/libxl_domain.h
index a032e46..30855a2 100644
--- a/src/libxl/libxl_domain.h
+++ b/src/libxl/libxl_domain.h
@@ -108,10 +108,6 @@ libxlDomainCleanup(libxlDriverPrivatePtr driver,
virDomainObjPtr vm,
virDomainShutoffReason reason);
 
-bool
-libxlDomainCleanupJob(libxlDriverPrivatePtr driver,
-  virDomainObjPtr vm,
-  virDomainShutoffReason reason);
 /*
  * Note: Xen 4.3 removed the const from the event handler signature.
  * Detect which signature to use based on
diff --git a/src/libxl/libxl_driver.c b/src/libxl/libxl_driver.c
index da4c1c7..a1977c2 100644
--- a/src/libxl/libxl_driver.c
+++ b/src/libxl/libxl_driver.c
@@ -1238,10 +1238,13 @@ libxlDomainDestroyFlags(virDomainPtr dom,
 if (virDomainDestroyFlagsEnsureACL(dom->conn, vm->def) < 0)

[Xen-devel] [PATCH 1/3] libxl: Move job acquisition in libxlDomainStart to callers

2015-03-25 Thread Jim Fehlig

Let callers of libxlDomainStart decide when it is appropriate to
acquire a job on the associated virDomainObj.

Signed-off-by: Jim Fehlig 
---
 src/libxl/libxl_domain.c | 24 --
 src/libxl/libxl_driver.c | 53 +++-
 2 files changed, 52 insertions(+), 25 deletions(-)

diff --git a/src/libxl/libxl_domain.c b/src/libxl/libxl_domain.c
index 8feb4dc..eca1ff0 100644
--- a/src/libxl/libxl_domain.c
+++ b/src/libxl/libxl_domain.c
@@ -907,16 +907,13 @@ libxlDomainStart(libxlDriverPrivatePtr driver, 
virDomainObjPtr vm,
 
 libxl_domain_config_init(&d_config);
 
-if (libxlDomainObjBeginJob(driver, vm, LIBXL_JOB_MODIFY) < 0)
-return ret;
-
 cfg = libxlDriverConfigGet(driver);
 /* If there is a managed saved state restore it instead of starting
  * from scratch. The old state is removed once the restoring succeeded. */
 if (restore_fd < 0) {
 managed_save_path = libxlDomainManagedSavePath(driver, vm);
 if (managed_save_path == NULL)
-goto endjob;
+goto cleanup;
 
 if (virFileExists(managed_save_path)) {
 
@@ -924,7 +921,7 @@ libxlDomainStart(libxlDriverPrivatePtr driver, 
virDomainObjPtr vm,
managed_save_path,
&def, &hdr);
 if (managed_save_fd < 0)
-goto endjob;
+goto cleanup;
 
 restore_fd = managed_save_fd;
 
@@ -938,7 +935,7 @@ libxlDomainStart(libxlDriverPrivatePtr driver, 
virDomainObjPtr vm,
_("cannot restore domain '%s' uuid %s from a 
file"
  " which belongs to domain '%s' uuid %s"),
vm->def->name, vm_uuidstr, def->name, 
def_uuidstr);
-goto endjob;
+goto cleanup;
 }
 
 virDomainObjAssignDef(vm, def, true, NULL);
@@ -955,18 +952,18 @@ libxlDomainStart(libxlDriverPrivatePtr driver, 
virDomainObjPtr vm,
 
 if (libxlBuildDomainConfig(driver->reservedVNCPorts, vm->def,
cfg->ctx, &d_config) < 0)
-goto endjob;
+goto cleanup;
 
 if (cfg->autoballoon && libxlDomainFreeMem(cfg->ctx, &d_config) < 0) {
 virReportError(VIR_ERR_INTERNAL_ERROR,
_("libxenlight failed to get free memory for domain 
'%s'"),
d_config.c_info.name);
-goto endjob;
+goto cleanup;
 }
 
 if (virHostdevPrepareDomainDevices(hostdev_mgr, LIBXL_DRIVER_NAME,
vm->def, VIR_HOSTDEV_SP_PCI) < 0)
-goto endjob;
+goto cleanup;
 
 /* Unlock virDomainObj while creating the domain */
 virObjectUnlock(vm);
@@ -998,7 +995,7 @@ libxlDomainStart(libxlDriverPrivatePtr driver, 
virDomainObjPtr vm,
 virReportError(VIR_ERR_INTERNAL_ERROR,
_("libxenlight failed to restore domain '%s'"),
d_config.c_info.name);
-goto endjob;
+goto cleanup;
 }
 
 /*
@@ -1045,7 +1042,7 @@ libxlDomainStart(libxlDriverPrivatePtr driver, 
virDomainObjPtr vm,
 libxlDomainEventQueue(driver, event);
 
 ret = 0;
-goto endjob;
+goto cleanup;
 
  cleanup_dom:
 if (priv->deathW) {
@@ -1056,10 +1053,7 @@ libxlDomainStart(libxlDriverPrivatePtr driver, 
virDomainObjPtr vm,
 vm->def->id = -1;
 virDomainObjSetState(vm, VIR_DOMAIN_SHUTOFF, VIR_DOMAIN_SHUTOFF_FAILED);
 
- endjob:
-if (!libxlDomainObjEndJob(driver, vm))
-vm = NULL;
-
+ cleanup:
 libxl_domain_config_dispose(&d_config);
 VIR_FREE(dom_xml);
 VIR_FREE(managed_save_path);
diff --git a/src/libxl/libxl_driver.c b/src/libxl/libxl_driver.c
index 05f6eb1..da4c1c7 100644
--- a/src/libxl/libxl_driver.c
+++ b/src/libxl/libxl_driver.c
@@ -303,18 +303,26 @@ libxlAutostartDomain(virDomainObjPtr vm,
 virObjectLock(vm);
 virResetLastError();
 
+if (libxlDomainObjBeginJob(driver, vm, LIBXL_JOB_MODIFY) < 0) {
+virObjectUnlock(vm);
+return ret;
+}
+
 if (vm->autostart && !virDomainObjIsActive(vm) &&
 libxlDomainStart(driver, vm, false, -1) < 0) {
 err = virGetLastError();
 VIR_ERROR(_("Failed to autostart VM '%s': %s"),
   vm->def->name,
   err ? err->message : _("unknown error"));
-goto cleanup;
+goto endjob;
 }
 
 ret = 0;
- cleanup:
-virObjectUnlock(vm);
+
+ endjob:
+if (libxlDomainObjEndJob(driver, vm))
+virObjectUnlock(vm);
+
 return ret;
 }
 
@@ -885,17 +893,26 @@ libxlDomainCreateXML(virConnectPtr conn, const char *xml,
 goto cleanup;
 def = NULL;
 
+if (libxlDomainObjBeginJob(driver, vm, LIBXL_JOB_MODIFY) < 0) {
+virDomainObjListRemove(driver->domains, vm);
+vm = NULL;
+goto cleanup;
+}
+

[Xen-devel] [PATCH 0/3] libxl: domain destroy fixes

2015-03-25 Thread Jim Fehlig

This small series of patches fixes some issues wrt domain destroy in
the libxl driver.  The primary motivation for this work is to
prevent locking the virDomainObj during long running destroy operations
on large memory domains.

Patch 1 moves job acquisition from libxlDomainStart to it's callers so
they have more control over when the job is acquired.  Patch 2 fixes a
few spots where we never acquired a job during domain destroy.  Patch 3
contains the interesting change, where the virDomainObj is unlocked
during the long-running destroy operation.

This series wraps up my work to improve parallel OpenStack Tempest runs
against the libxl driver.  With libvirt.git master + this series + a
patched libxl [1], I've successfully run a reproducer that was hitting
the same issues encountered by Tempest.

[1] libxl commits from xen.git: 93699882d, f1335f0d, 4783c99a, 1c91d6fba,
and 188e9c54.  I'll contact the stable branch maintainers and ask them
to include these commits in the next Xen 4.4.x and 4.5.x releases.

Jim Fehlig (3):
  libxl: Move job acquisition in libxlDomainStart to callers
  libxl: acquire a job when destroying a domain
  libxl: drop virDomainObj lock when destroying a domain

 src/libxl/libxl_domain.c | 77 +++
 src/libxl/libxl_domain.h |  4 ---
 src/libxl/libxl_driver.c | 78 
 3 files changed, 89 insertions(+), 70 deletions(-)

-- 
1.8.4.5


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

[Xen-devel] [PATCH 3/3] libxl: drop virDomainObj lock when destroying a domain

2015-03-25 Thread Jim Fehlig

A destroy operation can take considerable time on large memory
domains due to scrubbing the domain' memory.  The operation is
running in the context of a job, so unlocking the domain and
allowing query operations is safe.

Signed-off-by: Jim Fehlig 
---
 src/libxl/libxl_domain.c | 4 
 src/libxl/libxl_driver.c | 5 -
 2 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/src/libxl/libxl_domain.c b/src/libxl/libxl_domain.c
index e240eae..aef0157 100644
--- a/src/libxl/libxl_domain.c
+++ b/src/libxl/libxl_domain.c
@@ -435,7 +435,9 @@ libxlDomainShutdownThread(void *opaque)
 libxlDomainEventQueue(driver, dom_event);
 dom_event = NULL;
 }
+virObjectUnlock(vm);
 libxl_domain_destroy(cfg->ctx, vm->def->id, NULL);
+virObjectLock(vm);
 libxlDomainCleanup(driver, vm, reason);
 if (!vm->persistent)
 virDomainObjListRemove(driver->domains, vm);
@@ -447,7 +449,9 @@ libxlDomainShutdownThread(void *opaque)
 libxlDomainEventQueue(driver, dom_event);
 dom_event = NULL;
 }
+virObjectUnlock(vm);
 libxl_domain_destroy(cfg->ctx, vm->def->id, NULL);
+virObjectLock(vm);
 libxlDomainCleanup(driver, vm, VIR_DOMAIN_SHUTOFF_SHUTDOWN);
 if (libxlDomainStart(driver, vm, false, -1) < 0) {
 virErrorPtr err = virGetLastError();
diff --git a/src/libxl/libxl_driver.c b/src/libxl/libxl_driver.c
index a1977c2..21e20bb 100644
--- a/src/libxl/libxl_driver.c
+++ b/src/libxl/libxl_driver.c
@@ -1250,7 +1250,10 @@ libxlDomainDestroyFlags(virDomainPtr dom,
 event = virDomainEventLifecycleNewFromObj(vm, VIR_DOMAIN_EVENT_STOPPED,
  VIR_DOMAIN_EVENT_STOPPED_DESTROYED);
 
-if (libxl_domain_destroy(cfg->ctx, vm->def->id, NULL) < 0) {
+virObjectUnlock(vm);
+ret = libxl_domain_destroy(cfg->ctx, vm->def->id, NULL);
+virObjectLock(vm);
+if (ret < 0) {
 virReportError(VIR_ERR_INTERNAL_ERROR,
_("Failed to destroy domain '%d'"), vm->def->id);
 goto endjob;
-- 
1.8.4.5


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v1 04/47] pci: add pci_ioremap_wc_bar()

2015-03-25 Thread Luis R. Rodriguez

On Fri, Mar 20, 2015 at 04:50:32PM -0700, Andy Lutomirski wrote:
> On Fri, Mar 20, 2015 at 4:17 PM, Luis R. Rodriguez
>  wrote:
> > From: "Luis R. Rodriguez" 
> >
> > This lets drivers take advanate of PAT when available. This
> > should help with the transition of converting video drivers over
> > to ioremap_wc() to help with the goal of eventually using
> > _PAGE_CACHE_UC over _PAGE_CACHE_UC_MINUS on x86 on
> > ioremap_nocache() (de33c442e)
> >
> > Cc: Suresh Siddha 
> > Cc: Venkatesh Pallipadi 
> > Cc: Ingo Molnar 
> > Cc: Thomas Gleixner 
> > Cc: Juergen Gross 
> > Cc: Daniel Vetter 
> > Cc: Andy Lutomirski 
> > Cc: Dave Airlie 
> > Cc: Antonino Daplas 
> > Cc: Jean-Christophe Plagniol-Villard 
> > Cc: Tomi Valkeinen 
> > Cc: linux-fb...@vger.kernel.org
> > Cc: linux-ker...@vger.kernel.org
> > Signed-off-by: Luis R. Rodriguez 
> > ---
> >  drivers/pci/pci.c   | 14 ++
> >  include/linux/pci.h |  1 +
> >  2 files changed, 15 insertions(+)
> >
> > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> > index 81f06e8..6afd507 100644
> > --- a/drivers/pci/pci.c
> > +++ b/drivers/pci/pci.c
> > @@ -137,6 +137,20 @@ void __iomem *pci_ioremap_bar(struct pci_dev *pdev, 
> > int bar)
> >  pci_resource_len(pdev, bar));
> >  }
> >  EXPORT_SYMBOL_GPL(pci_ioremap_bar);
> > +
> > +void __iomem *pci_ioremap_wc_bar(struct pci_dev *pdev, int bar)
> > +{
> > +   /*
> > +* Make sure the BAR is actually a memory resource, not an IO 
> > resource
> > +*/
> > +   if (!(pci_resource_flags(pdev, bar) & IORESOURCE_MEM)) {
> > +   WARN_ON(1);
> > +   return NULL;
> > +   }
> 
> if (WARN_ON(...))?

Sure, they are equivalent however this follows the same exact style as
pci_ioremap_bar() so if we change this one might as well change the style of
pci_ioremap_bar() as well. Let me know if there is any preference. I personally
don't mind the extra line as it shortens the check.

 Luis

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v1 04/47] pci: add pci_ioremap_wc_bar()

2015-03-25 Thread Konrad Rzeszutek Wilk

On Fri, Mar 20, 2015 at 04:17:54PM -0700, Luis R. Rodriguez wrote:
> From: "Luis R. Rodriguez" 
> 
> This lets drivers take advanate of PAT when available. This

s/advanate/advantage/
> should help with the transition of converting video drivers over
> to ioremap_wc() to help with the goal of eventually using
> _PAGE_CACHE_UC over _PAGE_CACHE_UC_MINUS on x86 on
> ioremap_nocache() (de33c442e)

Please mention the title of the patch too:

"x86 PAT: fix performance drop for glx, use UC minus for ioremap(), 
ioremap_nocache() and pci_mmap_page_range()"
> 
> Cc: Suresh Siddha 
> Cc: Venkatesh Pallipadi 
> Cc: Ingo Molnar 
> Cc: Thomas Gleixner 
> Cc: Juergen Gross 
> Cc: Daniel Vetter 
> Cc: Andy Lutomirski 
> Cc: Dave Airlie 
> Cc: Antonino Daplas 
> Cc: Jean-Christophe Plagniol-Villard 
> Cc: Tomi Valkeinen 
> Cc: linux-fb...@vger.kernel.org
> Cc: linux-ker...@vger.kernel.org
> Signed-off-by: Luis R. Rodriguez 
> ---
>  drivers/pci/pci.c   | 14 ++
>  include/linux/pci.h |  1 +
>  2 files changed, 15 insertions(+)
> 
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index 81f06e8..6afd507 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -137,6 +137,20 @@ void __iomem *pci_ioremap_bar(struct pci_dev *pdev, int 
> bar)
>pci_resource_len(pdev, bar));
>  }
>  EXPORT_SYMBOL_GPL(pci_ioremap_bar);
> +
> +void __iomem *pci_ioremap_wc_bar(struct pci_dev *pdev, int bar)
> +{
> + /*
> +  * Make sure the BAR is actually a memory resource, not an IO resource
> +  */
> + if (!(pci_resource_flags(pdev, bar) & IORESOURCE_MEM)) {
> + WARN_ON(1);

Would it be better to use dev_warn ? That way you can see which BDF it is?

Thought WARN will give a nice stack-trace that should easily point to the
driver so perhaps not.. Either way - up to you.

> + return NULL;
> + }
> + return ioremap_wc(pci_resource_start(pdev, bar),
> +   pci_resource_len(pdev, bar));
> +}
> +EXPORT_SYMBOL_GPL(pci_ioremap_wc_bar);
>  #endif
>  
>  #define PCI_FIND_CAP_TTL 48
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index 211e9da..c235b09 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -1667,6 +1667,7 @@ static inline void pci_mmcfg_late_init(void) { }
>  int pci_ext_cfg_avail(void);
>  
>  void __iomem *pci_ioremap_bar(struct pci_dev *pdev, int bar);
> +void __iomem *pci_ioremap_wc_bar(struct pci_dev *pdev, int bar);
>  
>  #ifdef CONFIG_PCI_IOV
>  int pci_enable_sriov(struct pci_dev *dev, int nr_virtfn);
> -- 
> 2.3.2.209.gd67f9d5.dirty
> 
> 
> ___
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH 1/1] hvm.c: Prevent gcc uninitialised var warning

2015-03-25 Thread Don Slutz

On 03/25/15 11:48, Jan Beulich wrote:
 On 25.03.15 at 16:02,  wrote:
>> On 23/03/15 15:01, Don Slutz wrote:
>>> gcc (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3 reports:
>>> --
>>> hvm.c: In function `hvm_create_ioreq_server':
>>> hvm.c:487:18: error: `bufioreq_pfn' may be used uninitialised in this 
>> function
>>> [-Werror=uninitialized]
>>> hvm.c:718:30: note: `bufioreq_pfn' was declared here
>>> --
>>>
>>> My code analysis says that gcc is wrong, but initilize the variable
>>> to prevent the gcc warning.
>>>
>>> Reported-by: Ian Murray 
>>> Signed-off-by: Don Slutz 
>>
>> GCC is correct and there is path through this function where
>> bufioreq_pfn is used while uninitialised (non-debug build, is_default =
>> 1, handle_bufioreq = 0).
> 

Looks like you are talking about the ASSERT(handle_bufioreq).  But
that does not leave bufioreq_pfn uninitialised.

Currently you cannot get here in that state.  The only way to
have is_default=1 is:


rc = hvm_create_ioreq_server(d, domid, 1, 1, NULL);

Which should prevent "handle_bufioreq == 0"

> I'm not seeing it: When is_default is set, bufioreq_pfn gets
> initialized in the first if()'s body.
> 
>> As an aside, the compiler is in a very easy position to spot this.  The
>> error means that GCC has positively identified a basic block which does
>> use bufioreq_pfn before it has been initialised.
>>

If the compiler is right, how come the messages says:

may be used

which to me says that it's determination is not 100%


>> I observe that the patch has been committed, but it merely hides the
>> problem and doesn't fix it; hvm_free_ioreq_gmfn(d, bufioreq_pfn) will
>> still clobber arbitrary memory.
> 

hvm_free_ioreq_gmfn() does have issues.  And since it is
possible to call it with

d->arch.hvm_domain.params[HVM_PARAM_BUFIOREQ_PFN]

which can be anything.

> I committed the patch based on me not being able to find a path
> where the variable would actually be used uninitialized (and it is
> known that various gcc versions have varying levels of problems
> with correctly identifying such issues). If indeed there is a path
> that I'm not seeing, then I'd indeed favor reverting and putting
> in a proper, backportable fix.
> 

I still cannot find a path where this fails.  However I can
provide a patch that does 2 things:

1) Change the "ASSERT(handle_bufioreq)" to "BUG_ON(handle_bufioreq);"
   (not sure it is required).

2) Add checking to hvm_free_ioreq_gmfn() to prevent memory clobber
   since it's arg may come from an HVM_PARAM.

   -Don Slutz

> Jan
> 

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v1 02/47] x86: mtrr: generalize run time disabling of MTRR

2015-03-25 Thread Konrad Rzeszutek Wilk

On Fri, Mar 20, 2015 at 04:17:52PM -0700, Luis R. Rodriguez wrote:
> From: "Luis R. Rodriguez" 
> 
> It is possible to enable CONFIG_MTRR and up with it
> disabled at run time and yet CONFIG_X86_PAT continues
> to kick through fully functionally. This can happen

s/fully/full/ ?


> for instance on Xen where MTRR is not supported but
> PAT is, this can happen now on Linux as of commit
> 47591df50 by Juergen introduced as of v3.19.

s/3.19/4.0/
> 
> Technically we should assume the proper CPU
> bits would be set to disable MTRR but we can't
> always rely on this. At least on the Xen Hypervisor
> for instance only X86_FEATURE_MTRR was disabled
> as of Xen 4.4 through Xen commit 586ab6a [0],
> but not X86_FEATURE_K6_MTRR, X86_FEATURE_CENTAUR_MCR,
> or X86_FEATURE_CYRIX_ARR for instance.

Oh, could you send an patch for that to Xen please?
> 
> x86 mtrr code relies on quite a bit of checks for
> mtrr_if being set to check to see if MTRR did get
> set up, instead of using that lets provide a generic
> setter which when set we know MTRR is enabled. This

s/we know MTRR is enabled/will let us know that MTRR is enabled/

> also adds a few checks where they were not before
> which could potentially safeguard ourselves against
> incorrect usage of MTRR where this was not desirable.
> 
> Where possible match error codes as if MTRR was
> disabled on arch/x86/include/asm/mtrr.h.
> 
> Lastly, since disabling MTRR can happen at run time
> and we could end up with PAT enabled best record now
> on our logs when MTRR is disabled.
> 
> [0] ~/devel/xen (git::stable-4.5)$ git describe --contains 586ab6a
> 4.4.0-rc1~18
> 
> Cc: Andy Lutomirski 
> Cc: Suresh Siddha 
> Cc: Venkatesh Pallipadi 
> Cc: Ingo Molnar 
> Cc: Thomas Gleixner 
> Cc: Juergen Gross 
> Cc: Daniel Vetter 
> Cc: Dave Airlie 
> Cc: Antonino Daplas 
> Cc: Jean-Christophe Plagniol-Villard 
> Cc: Tomi Valkeinen 
> Cc: Dave Hansen 
> Cc: venkatesh.pallip...@intel.com
> Cc: Stefan Bader 
> Cc: konrad.w...@oracle.com
> Cc: ville.syrj...@linux.intel.com
> Cc: david.vra...@citrix.com
> Cc: jbeul...@suse.com
> Cc: toshi.k...@hp.com
> Cc: bhelg...@google.com
> Cc: Roger Pau Monné 
> Cc: linux-fb...@vger.kernel.org
> Cc: linux-ker...@vger.kernel.org
> Cc: xen-de...@lists.xensource.com
> Signed-off-by: Luis R. Rodriguez 
> ---
>  arch/x86/include/asm/mtrr.h|  2 ++
>  arch/x86/kernel/cpu/mtrr/cleanup.c |  2 +-
>  arch/x86/kernel/cpu/mtrr/generic.c |  5 +++--
>  arch/x86/kernel/cpu/mtrr/if.c  |  3 +++
>  arch/x86/kernel/cpu/mtrr/main.c| 31 ++-
>  5 files changed, 31 insertions(+), 12 deletions(-)
> 
> diff --git a/arch/x86/include/asm/mtrr.h b/arch/x86/include/asm/mtrr.h
> index f768f62..cade917 100644
> --- a/arch/x86/include/asm/mtrr.h
> +++ b/arch/x86/include/asm/mtrr.h
> @@ -31,6 +31,7 @@
>   * arch_phys_wc_add and arch_phys_wc_del.
>   */
>  # ifdef CONFIG_MTRR
> +extern int mtrr_enabled;
>  extern u8 mtrr_type_lookup(u64 addr, u64 end);
>  extern void mtrr_save_fixed_ranges(void *);
>  extern void mtrr_save_state(void);
> @@ -50,6 +51,7 @@ extern int mtrr_trim_uncached_memory(unsigned long end_pfn);
>  extern int amd_special_default_mtrr(void);
>  extern int phys_wc_to_mtrr_index(int handle);
>  #  else
> +static const int mtrr_enabled;
>  static inline u8 mtrr_type_lookup(u64 addr, u64 end)
>  {
>   /*
> diff --git a/arch/x86/kernel/cpu/mtrr/cleanup.c 
> b/arch/x86/kernel/cpu/mtrr/cleanup.c
> index 5f90b85..784dc55 100644
> --- a/arch/x86/kernel/cpu/mtrr/cleanup.c
> +++ b/arch/x86/kernel/cpu/mtrr/cleanup.c
> @@ -880,7 +880,7 @@ int __init mtrr_trim_uncached_memory(unsigned long 
> end_pfn)
>* Make sure we only trim uncachable memory on machines that
>* support the Intel MTRR architecture:
>*/
> - if (!is_cpu(INTEL) || disable_mtrr_trim)
> + if (!is_cpu(INTEL) || disable_mtrr_trim || !mtrr_enabled)
>   return 0;
>  
>   rdmsr(MSR_MTRRdefType, def, dummy);
> diff --git a/arch/x86/kernel/cpu/mtrr/generic.c 
> b/arch/x86/kernel/cpu/mtrr/generic.c
> index 09c82de..df321b2 100644
> --- a/arch/x86/kernel/cpu/mtrr/generic.c
> +++ b/arch/x86/kernel/cpu/mtrr/generic.c
> @@ -116,7 +116,8 @@ static u8 __mtrr_type_lookup(u64 start, u64 end, u64 
> *partial_end, int *repeat)
>   u8 prev_match, curr_match;
>  
>   *repeat = 0;
> - if (!mtrr_state_set)
> + /* generic_mtrr_ops is only set for generic_mtrr_ops */
> + if (!mtrr_state_set || !mtrr_enabled)
>   return 0xFF;
>  
>   if (!mtrr_state.enabled)
> @@ -290,7 +291,7 @@ static void get_fixed_ranges(mtrr_type *frs)
>  
>  void mtrr_save_fixed_ranges(void *info)
>  {
> - if (cpu_has_mtrr)
> + if (mtrr_enabled && cpu_has_mtrr)
>   get_fixed_ranges(mtrr_state.fixed_ranges);
>  }
>  
> diff --git a/arch/x86/kernel/cpu/mtrr/if.c b/arch/x86/kernel/cpu/mtrr/if.c
> index d76f13d..e9e001a 100644
> --- a/arch/x86/kernel/cpu/mtrr/if.c
> +++ b/arch/x86/kernel/cpu/mtrr/if.c
> @@ -436,6 +436,9 @@

Re: [Xen-devel] [PATCH v1 03/47] devres: add devm_ioremap_wc()

2015-03-25 Thread Luis R. Rodriguez

On Fri, Mar 20, 2015 at 04:49:51PM -0700, Andy Lutomirski wrote:
> On Fri, Mar 20, 2015 at 4:17 PM, Luis R. Rodriguez
>  wrote:
> > From: "Luis R. Rodriguez" 
> >
> > We have devm_ioremap_nocache() but no devm_ioremap_wc()
> > so add that. This will be used later.
> >
> > Cc: Suresh Siddha 
> > Cc: Venkatesh Pallipadi 
> > Cc: Ingo Molnar 
> > Cc: Thomas Gleixner 
> > Cc: Juergen Gross 
> > Cc: Daniel Vetter 
> > Cc: Andy Lutomirski 
> > Cc: Dave Airlie 
> > Cc: Antonino Daplas 
> > Cc: Jean-Christophe Plagniol-Villard 
> > Cc: Tomi Valkeinen 
> > Cc: linux-fb...@vger.kernel.org
> > Cc: linux-ker...@vger.kernel.org
> > Signed-off-by: Luis R. Rodriguez 
> 
> Looks good to me.

Thanks, I'll peg a Reviewed-by.

 Luis

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH 0/9] qspinlock stuff -v15

2015-03-25 Thread Konrad Rzeszutek Wilk

On Mon, Mar 16, 2015 at 02:16:13PM +0100, Peter Zijlstra wrote:
> Hi Waiman,
> 
> As promised; here is the paravirt stuff I did during the trip to BOS last 
> week.
> 
> All the !paravirt patches are more or less the same as before (the only real
> change is the copyright lines in the first patch).
> 
> The paravirt stuff is 'simple' and KVM only -- the Xen code was a little more
> convoluted and I've no real way to test that but it should be stright fwd to
> make work.
> 
> I ran this using the virtme tool (thanks Andy) on my laptop with a 4x
> overcommit on vcpus (16 vcpus as compared to the 4 my laptop actually has) and
> it both booted and survived a hackbench run (perf bench sched messaging -g 20
> -l 5000).
> 
> So while the paravirt code isn't the most optimal code ever conceived it does 
> work.
> 
> Also, the paravirt patching includes replacing the call with "movb $0, %arg1"
> for the native case, which should greatly reduce the cost of having
> CONFIG_PARAVIRT_SPINLOCKS enabled on actual hardware.

Ah nice. That could be spun out as a seperate patch to optimize the existing
ticket locks I presume.

Now with the old pv ticketlock code an vCPU would only go to sleep once and
be woken up when it was its turn. With this new code it is woken up twice 
(and twice it goes to sleep). With an overcommit scenario this would imply
that we will have at least twice as many VMEXIT as with the previous code.

I presume when you did benchmarking this did not even register? Thought
I wonder if it would if you ran the benchmark for a week or so.

> 
> I feel that if someone were to do a Xen patch we can go ahead and merge this
> stuff (finally!).
> 
> These patches do not implement the paravirt spinlock debug stats currently
> implemented (separately) by KVM and Xen, but that should not be too hard to do
> on top and in the 'generic' code -- no reason to duplicate all that.
> 
> Of course; once this lands people can look at improving the paravirt nonsense.
> 

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH 1/3] xen: arm: Fix typo "Falltrough"

2015-03-25 Thread Julien Grall

Hi Ian,

On 25/03/15 15:34, Ian Campbell wrote:
> Signed-off-by: Ian Campbell 

Reviewed-by: Julien Grall 

Regards,

> ---
>  xen/arch/arm/domain.c |2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
> index fdba081..939d8cd 100644
> --- a/xen/arch/arm/domain.c
> +++ b/xen/arch/arm/domain.c
> @@ -742,7 +742,7 @@ int domain_relinquish_resources(struct domain *d)
>  {
>  case RELMEM_not_started:
>  d->arch.relmem = RELMEM_xen;
> -/* Falltrough */
> +/* Fallthrough */
>  
>  case RELMEM_xen:
>  ret = relinquish_memory(d, &d->xenpage_list);
> 


-- 
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH 12/12] xen: arm: Dump guest state when invalid trap state is detected

2015-03-25 Thread Julien Grall

Hi Ian,

On 25/03/15 14:22, Ian Campbell wrote:
> By adding GUEST_BUG_ON locally to traps.c.
> 
> Signed-off-by: Ian Campbell 
Reviewed-by: Julien Grall 

Regards,

-- 
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH 10/12] xen: arm: handle remaining traps from userspace

2015-03-25 Thread Julien Grall

Hi Ian,

On 25/03/15 14:22, Ian Campbell wrote:
> CP14 dbg and general CP register access are both handled with
> unconditional injection of #undef from their respective handlers, so
> allow these even from 32-bit userspace on a 64-bit kernel.
> 
> SMC32 and HVC32 should only come from a guest in AArch32 mode and
> SMC64 and HVC64 should only come from a guest in AArch64 mode. Add
> appropriate BUG_ONs to all cases.
> 
> After this bad_trap is no longer used.
> 
> Signed-off-by: Ian Campbell 

Reviewed-by: Julien Grall 

Regards,

-- 
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v6 04/30] xen/PCI: Don't use deprecated function pci_scan_bus_parented()

2015-03-25 Thread Konrad Rzeszutek Wilk

On Fri, Mar 13, 2015 at 09:26:08AM -0500, Bjorn Helgaas wrote:
> On Fri, Mar 13, 2015 at 9:01 AM, Konrad Rzeszutek Wilk
>  wrote:
> > On Fri, Mar 13, 2015 at 08:24:58AM -0500, Bjorn Helgaas wrote:
> >> On Thu, Mar 12, 2015 at 9:36 PM, Yijing Wang  wrote:
> >> > +  pci_add_resource(&resources, &ioport_resource);
> >> > +  pci_add_resource(&resources, &iomem_resource);
> >> > +  pci_add_resource(&resources, &busn_resource);
> >> 
> >>  Since I don't want to export busn_resource, you might have to 
> >>  allocate your
> >>  own struct resource for it here.  And, of course, figure out the 
> >>  details of
> >>  which PCI domain you're in and whether you need to share one struct
> >>  resource across several host bridges in the same domain.
> >> >>>
> >> >>> Allocate its own resource here is ok for me, as I mentioned in 
> >> >>> previous reply,
> >> >>> so do we still need to add additional info to figure out which domain 
> >> >>> own the bus resource ?
> >> >>
> >> >> That's up to the caller.  Only the platform knows which bridges it 
> >> >> wants to
> >> >> have in the same domain.  In principle, every host bridge could be in 
> >> >> its
> >> >> own domain, since each bridge is the root of a unique PCI hierarchy.  
> >> >> But
> >> >> some platforms have firmware that assumes otherwise.  I have no idea 
> >> >> what
> >> >> xen assumes.
> >> >
> >> > I'm not xen guy, so I don't know much about it, but because it call 
> >> > pci_scan_bus_parented()
> >> > before, and in which busn_resource is always shared for different host 
> >> > bridges(same domain or not),
> >> > I think add a static bus resource(0,255) should be safe, at least, it 
> >> > would not introduce new risk.
> >> >
> >> > Something like:
> >> >
> >> > diff --git a/drivers/pci/xen-pcifront.c b/drivers/pci/xen-pcifront.c
> >> > index b1ffebe..a69e529 100644
> >> > --- a/drivers/pci/xen-pcifront.c
> >> > +++ b/drivers/pci/xen-pcifront.c
> >> > @@ -446,9 +446,15 @@ static int pcifront_scan_root(struct 
> >> > pcifront_device *pdev,
> >> >  unsigned int domain, unsigned int bus)
> >> >  {
> >> > struct pci_bus *b;
> >> > +   LIST_HEAD(resources);
> >> > struct pcifront_sd *sd = NULL;
> >> > struct pci_bus_entry *bus_entry = NULL;
> >> > int err = 0;
> >> > +   static struct resource busn_res = {
> >> > +   .start = 0,
> >> > +   .end = 255,
> >> > +   .flags = IORESOURCE_BUS,
> >> > +   };
> >> >
> >> >  #ifndef CONFIG_PCI_DOMAINS
> >> > if (domain != 0) {
> >> > @@ -470,17 +476,21 @@ static int pcifront_scan_root(struct 
> >> > pcifront_device *pdev,
> >> > err = -ENOMEM;
> >> > goto err_out;
> >> > }
> >> > +   pci_add_resource(&resources, &ioport_resource);
> >> > +   pci_add_resource(&resources, &iomem_resource);
> >> > +   pci_add_resource(&resources, &busn_res);
> >> > pcifront_init_sd(sd, domain, bus, pdev);
> >> >
> >> > pci_lock_rescan_remove();
> >> >
> >> > -   b = pci_scan_bus_parented(&pdev->xdev->dev, bus,
> >> > - &pcifront_bus_ops, sd);
> >> > +   b = pci_scan_root_bus(&pdev->xdev->dev, bus,
> >> > + &pcifront_bus_ops, sd, &resources);
> >> > if (!b) {
> >> >
> >> > Bjorn, what do you think about ?
> >>
> >> That seems OK to me.  Probably still wrong, but no worse than it was 
> >> before.
> >
> > Interesting. The mechanism for PCI passthrough can either synthesize
> > and PCI bus number starting at zero (so first device is always 0:0:0.0)
> > or it can replicate the backend PCI topology. That means you
> > could have segment values passed in, so: ab:ff:00.1). I've to admin
> > I hadn't tried the 'physical' replication on an machine with
> > domains (err, segments).
> >
> > Is there an git tree with this so I can just try it out?
> 
> git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci.git
> pci/enumeration-yw6 has similar code (it exports the single

I presume now it is bjorn/pci/enumeration-yw8 ? Going to test this out
this week.
> busn_resource and makes xen use it).  That should be functionally
> identical to what v4.0-rc1 does.
> 
> Yijing hasn't posted the static busn_res proposal above yet, so I
> don't have a branch with that in it.
> 
> Bjorn

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH 09/12] xen: arm: correctly handle sysreg accesses from userspace

2015-03-25 Thread Julien Grall

Hi Ian,

On 25/03/15 14:22, Ian Campbell wrote:
> Previously we implemented all registers as RAZ/WI even if they
> shouldn't be accessible to userspace.
> 
> Accesses to the *_EL1 registers from EL0 are trapped to EL1 by the
> hardware, so add a BUG_ON. Likewise accesses from 32-bit EL1 cannot
> happen.

It's not true on all *_EL1 registers. See PMINTENSET_EL1 for instance.

> PMUSERENR_EL0 and MDCCSR_EL0 are R/O to EL0.

Might be worth to explain that we forgot to implement MDCCSR_EL0 before.

> 
> Other PM*_EL0 registers are accessible at EL0 only if PMUSERENR_EL0.EN
> is set, since we emulate that as RAZ/WI we know that bit cannot be
> set.

The ARMv8 spec is confusing on this point

Let's take PMCR_EL0 for instance. The PMUSERENR_EL0.EN trapping is not
explained.

Although, PMCR is trapped when PMUSERENR_EL0.EN == 0.

Do you have a paragraph on the spec which clearly explain the behavior?

Regards,

-- 
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH 08/12] xen: arm: Handle CP14 32-bit register accesses from userspace

2015-03-25 Thread Julien Grall

Hi Ian,

On 25/03/15 14:22, Ian Campbell wrote:
> Accesses to these from 32-bit userspace would cause a hypervisor
> exception (host crash) when running a 64-bit kernel, which is worked
> around by the fix to XSA-102. On 32-bit kernels they would be
> implemented as RAZ/WI which is incorrect but harmless.
> 
> Update as follows:
>  - DBGDSCRINT should be R/O.
>  - DBGDSCREXT should be EL1 only.
>  - DBGOSLAR is RO and EL1 only.

This register is WO. Actually the implementation is right, just the
commit message.

Other than that:

Reviewed-by: Julien Grall 

Regards,

-- 
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH 07/12] xen: arm: Handle CP15 register traps from userspace

2015-03-25 Thread Julien Grall

Hi Ian,

On 25/03/15 14:22, Ian Campbell wrote:
> Previously userspace access to PM* would have been incorrectly (but
> benignly) implemented as RAZ/WI when running on a 32-bit kernel and
> would cause a hypervisor exception (host crash) when running a 64-bit
> kernel (this was already solved via the fix to XSA-102).
> 
> CLIDR, CCSIDR, DCCISW, ACTLR, PMINTENSET, PMINTENCLR are EL1 only,
> attempts to access from EL0 will trap to EL1 not to us, hence BUG_ON
> is appropriate now.

For PMINTENSET and PMINTENCLR the spec (ARMv8 DDI0487A rev d) says:

"If MDCR_EL2.TPM==1, Non-secure accesses to this register will trap from
EL1 and EL0 to EL2."

As we set to 1 MDCR_EL1.TPM, EL0 access will trap to Xen. So I think we
should replace the BUG_ON to injected a exception.

Reading more the spec only ACTLR access from EL0 will trap to EL1. All
access from EL0 to the others registers in the list above will trap to EL2.

Although, the ARMv7 spec seems to say to only valid access will be trapped.

Regards,

-- 
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] Design and Question: Eliminate Xen (RTDS) scheduler overhead on dedicated CPU

2015-03-25 Thread Konrad Rzeszutek Wilk

On Tue, Mar 24, 2015 at 11:27:35AM -0400, Meng Xu wrote:
> 2015-03-24 7:54 GMT-04:00 George Dunlap :
> 
> > On Tue, Mar 24, 2015 at 3:50 AM, Meng Xu  wrote:
> > > Hi Dario and George,
> > >
> > > I'm exploring the design choice of eliminating the Xen scheduler
> > overhead on
> > > the dedicated CPU. A dedicated CPU is a PCPU that has a full capacity
> > VCPU
> > > pinned onto it and no other VCPUs will run on that PCPU.
> >
> > Hey Meng!  This sounds awesome, thanks for looking into it.
> >
> 
> :-) I think it is a useful feature for extreme low latency applications.
> 
> 
> >
> >
> > > [Problems]
> > > The issue I'm encountering is as follows:
> > > After I implemented the dedicated cpu feature, I compared the latency of
> > a
> > > cpu-intensive task in domU on dedicated CPU (denoted as R_dedcpu) and the
> > > latency on non-dedicated CPU (denoted as R_nodedcpu). The expected result
> > > should be R_dedcpu < R_nodedcpu since we avoid the scheduler overhead.
> > > However, the actual result is R_dedcpu > R_nodedcpu, and R_dedcpu -
> > > R_nodedcpu ~= 1000 cycles.
> > >
> > > After adding some trace to every function that may raise the
> > > SCHEDULE_SOFTIRQ, I found:
> > > When a cpu is not marked as dedicated cpu and the scheduler on it is not
> > > disabled, the vcpu_block() is triggered 2896 times during 58280322928ns
> > > (i.e., triggered once every 20,124,421ns in average) on the dedicated
> > cpu.
> > > However,
> > > When I disable the scheduler on a dedicated cpu, the function
> > > vcpu_block(void) @schedule.c will be triggered very frequently; the
> > > vcpu_block(void) is triggered 644824 times during 8,918,636,761ns (i.e.,
> > > once every 13831ns in average) on the dedicated cpu.
> > >
> > > To sum up the problem I'm facing, the vcpu_block(void) is trigger much
> > > faster and more frequently when the scheduler is disabled on a cpu than
> > when
> > > the scheduled is enabled.
> > >
> > > [My question]
> > > I'm very confused at the reason why vcpu_block(void) is triggered so
> > > frequently when the scheduler is disabled.  The vcpu_block(void) is
> > called
> > > by the SCHEDOP_block hypercall, but why this hypercall will be triggered
> > so
> > > frequently?
> > >
> > > It will be great if you know the answer directly. (This is just a pure
> > hope
> > > and I cannot really expect it. :-) )
> > > But I really appreciate it if you could give me some directions on how I
> > > should figure it out. I grepped vcpu_block(void) and SCHEDOP_block  in
> > the
> > > xen code base, but didn't found much call to them.
> > >
> > > What confused me most is that  the dedicated VCPU should be blocked less
> > > frequently instead of more frequently when the scheduler is disabled on
> > the
> > > dedicated CPU, because the dedicated VCPU is always running on the CPU
> > now
> > > without the hypervisor scheduler's interference.
> >
> > So if I had to guess, I would guess that you're not actually blocking
> > when the guest tries to block.  Normally if the guest blocks, it
> > blocks in a loop like this:
> >
> > do {
> >   enable_irqs();
> >   hlt;
> >   disable_irqs;
> > } while (!interrup_pending);
> >
> > For a PV guest, the hlt() would be replaced with a PV block() hypercall.
> >
> > Normally, when a guest calls block(), then it's taken off the
> > runqueue; and if there's nothing on the runqueue, then the scheduler
> > will run the idle domain; it's the idle domain that actually does the
> > blocking.
> >
> > If you've hardwired it always to return the vcpu in question rather
> > than the idle domain, then it will never block -- it will busy-wait,
> > calling block millions of times.
> >
> > The simplest way to get your prototype working, in that case, would be
> > to return the idle vcpu for that pcpu if the guest is blocked.
> >
> 
> Exactly! Thank you so much for pointing this out!  I did hardwired it
> always to return the vcpu that is supposed to be blocked. Now I totally
> understand what happened. :-) 

Or you could change the kernel to do an idle poll instead of using 'hlt'.

And idle poll is just:
 do {
nop;pause;
 } while  (!interrupt_pending);

On Linux I believe you do 'idle=poll' to make it do that.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v3 3/7] libxl: In libxl_set_vcpuonline check for maximum number of VCPUs against the cpumap.

2015-03-25 Thread Konrad Rzeszutek Wilk

On Tue, Mar 24, 2015 at 03:22:55PM +, Ian Campbell wrote:
> On Mon, 2015-03-23 at 14:20 -0400, Konrad Rzeszutek Wilk wrote:
> > There is no sense in trying to online (or offline) CPUs when the size of
> > cpumap is greater than the maximum number of VCPUs the guest can go to.
> > 
> > As such fail the operation if the count of CPUs to online is greater
> > than what the guest started with. For the offline case we do not
> > check (as the bits are unset in the cpumap) and let it go through.
> > 
> > Signed-off-by: Konrad Rzeszutek Wilk 
> > ---
> >  tools/libxl/libxl.c | 27 ---
> >  1 file changed, 24 insertions(+), 3 deletions(-)
> > 
> > diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
> > index 4152ee4..d2b5ff3 100644
> > --- a/tools/libxl/libxl.c
> > +++ b/tools/libxl/libxl.c
> > @@ -5449,6 +5449,20 @@ int libxl_domain_get_nodeaffinity(libxl_ctx *ctx, 
> > uint32_t domid,
> >  return 0;
> >  }
> >  
> > +static int libxl__check_max(libxl__gc *gc, libxl_dominfo *info,
> > +libxl_bitmap *cpumap)
> > +{
> > +int maxcpus = libxl_bitmap_count_set(cpumap);
> > +
> > +if (maxcpus > info->vcpu_max_id + 1)
> > +{
> > +LOGE(ERROR, "You have a max of %d vCPUs and you want %d vCPUs!",
> > + info->vcpu_max_id + 1, maxcpus);
> 
> Please avoid personal pronouns in libxl logs (I don't have a max of any
> number of vcpus, the domain does). Here "setting %d vcpus, max vcpus is
> %d" perhaps.
> 
> Both libxl__set_vcpuonline_qmp and libxl__set_vcpuonline_xenstore start
> with the same code to get info. Please could you move that (and the
> associated dispose) into libxl_set_vcpuonline and check the limit once
> up front (no need for helper then) and pass the info as a pointer to the
> specific functions.

Would this be OK ?

>From a160e64842f36f96f5da610c2498cf1770665dcb Mon Sep 17 00:00:00 2001
From: Konrad Rzeszutek Wilk 
Date: Fri, 20 Mar 2015 16:29:16 -0400
Subject: [PATCH v3 3/8] libxl: In libxl_set_vcpuonline check for maximum
 number of VCPUs against the cpumap.

There is no sense in trying to online (or offline) CPUs when the size of
cpumap is greater than the maximum number of VCPUs the guest can go to.

As such fail the operation if the count of CPUs to online is greater
than what the guest started with. For the offline case we do not
check (as the bits are unset in the cpumap) and let it go through.

We coalesce some of the underlaying libxl_set_vcpuonline code
together to take advantage for the QMP and XenStore ways.

Signed-off-by: Konrad Rzeszutek Wilk 
---
 tools/libxl/libxl.c | 75 -
 1 file changed, 45 insertions(+), 30 deletions(-)

diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index f957713..c37d057 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -5447,28 +5447,34 @@ int libxl_domain_get_nodeaffinity(libxl_ctx *ctx, 
uint32_t domid,
 return 0;
 }
 
+static int libxl__check_max(libxl__gc *gc, libxl_dominfo *info,
+libxl_bitmap *cpumap)
+{
+int maxcpus = libxl_bitmap_count_set(cpumap);
+
+if (maxcpus > info->vcpu_max_id + 1)
+{
+LOGE(ERROR, "Requested %d VCPUs, however maxcpus is %d!",
+ maxcpus, info->vcpu_max_id + 1);
+return ERROR_FAIL;
+}
+return 0;
+}
+
 static int libxl__set_vcpuonline_xenstore(libxl__gc *gc, uint32_t domid,
- libxl_bitmap *cpumap)
+ libxl_bitmap *cpumap,
+ libxl_dominfo *info)
 {
-libxl_dominfo info;
 char *dompath;
 xs_transaction_t t;
-int i, rc;
-
-libxl_dominfo_init(&info);
+int i, rc = ERROR_FAIL;
 
-rc = libxl_domain_info(CTX, &info, domid);
-if (rc < 0) {
-LOGE(ERROR, "getting domain info list");
-goto out;
-}
-rc = ERROR_FAIL;
 if (!(dompath = libxl__xs_get_dompath(gc, domid)))
 goto out;
 
 retry_transaction:
 t = xs_transaction_start(CTX->xsh);
-for (i = 0; i <= info.vcpu_max_id; i++)
+for (i = 0; i <= info->vcpu_max_id; i++)
 libxl__xs_write(gc, t,
libxl__sprintf(gc, "%s/cpu/%u/availability", dompath, 
i),
"%s", libxl_bitmap_test(cpumap, i) ? "online" : 
"offline");
@@ -5478,25 +5484,15 @@ retry_transaction:
 } else
 rc = 0;
 out:
-libxl_dominfo_dispose(&info);
 return rc;
 }
 
 static int libxl__set_vcpuonline_qmp(libxl__gc *gc, uint32_t domid,
- libxl_bitmap *cpumap)
+ libxl_bitmap *cpumap, libxl_dominfo *info)
 {
-libxl_dominfo info;
-int i, rc;
-
-libxl_dominfo_init(&info);
+int i;
 
-rc = libxl_domain_info(CTX, &info, domid);
-if (rc < 0) {
-LOGE(ERROR, "getting domain info list");
-libxl_dominfo_dispose(&info);
-return rc;
-}

Re: [Xen-devel] [PATCH v3 7/7] libxl/vcpu-set - allow to decrease vcpu count on overcommitted guests (v3)

2015-03-25 Thread Konrad Rzeszutek Wilk

On Tue, Mar 24, 2015 at 03:41:46PM +, Ian Campbell wrote:
> On Mon, 2015-03-23 at 14:21 -0400, Konrad Rzeszutek Wilk wrote:
> > We have a check to warn the user if they are overcommitting.
> > But the check only checks the hosts CPU amount and does
> > not take into account the case when the user is trying to fix
> > the overcommit. That is - they want to limit the amount of
> > online VCPUs.
> > 
> > This fix allows the user to offline vCPUs without any
> > warnings when they are running an overcommitted guest.
> > 
> > Also fix the extra space in the message.
> > 
> > Signed-off-by: Konrad Rzeszutek Wilk 
> > 
> > ---
> > [v2: Remove crud code as spotted by Boris]
> > [v3: Moved crud code removal to another patch]
> > ---
> >  tools/libxl/xl_cmdimpl.c | 15 ---
> >  1 file changed, 12 insertions(+), 3 deletions(-)
> > 
> > diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
> > index b121d75..d7bd5d5 100644
> > --- a/tools/libxl/xl_cmdimpl.c
> > +++ b/tools/libxl/xl_cmdimpl.c
> > @@ -5238,9 +5238,18 @@ static int vcpuset(uint32_t domid, const char* 
> > nr_vcpus, int check_host)
> >   */
> >  if (check_host) {
> >  unsigned int host_cpu = libxl_get_max_cpus(ctx);
> > -if (max_vcpus > host_cpu) {
> > -fprintf(stderr, "You are overcommmitting! You have %d physical 
> > " \
> > -" CPUs and want %d vCPUs! Aborting, use --ignore-host 
> > to " \
> > +libxl_dominfo dominfo;
> > +
> > +rc = libxl_domain_info(ctx, &dominfo, domid);
> > +if (rc)
> > +{
> > + if (rc == ERROR_DOMAIN_NOTFOUND)
> > +fprintf(stderr, "Domain %u does not exist.\n", domid);
> > +return 1;
> 
> Indentation is wrong on the if.
> 
> More generally all of these rc == ERROR_DOMAIN_NOTFOUND check and prints
> are going to get rather tiresome, especially if/when there are other
> interesting error codes. Perhaps we could arrange for something further
> down the stack on libxl to log this sort of thing, such that xl can rely
> on it already having been mentioned?
> 
> e.g. libxl_domain_info could do it perhaps?

Would this patch work?

>From 9a0a0e581b29d9aa893d80962053383c235e9ad9 Mon Sep 17 00:00:00 2001
From: Konrad Rzeszutek Wilk 
Date: Wed, 25 Mar 2015 13:36:58 -0400
Subject: [PATCH v3 4/8] libxl/libxl_domain_info: Log if domain not found.

If we cannot find the domain - log an error (and still
continue returning an error).

Signed-off-by: Konrad Rzeszutek Wilk 
---
 tools/libxl/libxl.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index c37d057..181b5bd 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -707,8 +707,10 @@ int libxl_domain_info(libxl_ctx *ctx, libxl_dominfo 
*info_r,
 LIBXL__LOG_ERRNO(ctx, LIBXL__LOG_ERROR, "getting domain info list");
 return ERROR_FAIL;
 }
-if (ret==0 || xcinfo.domain != domid) return ERROR_DOMAIN_NOTFOUND;
-
+if (ret==0 || xcinfo.domain != domid) {
+LIBXL__LOG_ERRNO(ctx, LIBXL__LOG_ERROR, "Domain %d not found!", domid);
+return ERROR_DOMAIN_NOTFOUND;
+}
 if (info_r)
 xcinfo2xlinfo(ctx, &xcinfo, info_r);
 return 0;
-- 
2.1.0



And then this patch would become:

>From 37d530f04a266e8d707b811bc7583f9d7b6fb18d Mon Sep 17 00:00:00 2001
From: Konrad Rzeszutek Wilk 
Date: Mon, 2 Feb 2015 16:18:39 -0500
Subject: [PATCH] libxl/vcpu-set - allow to decrease vcpu count on
 overcommitted guests (v3)

We have a check to warn the user if they are overcommitting.
But the check only checks the hosts CPU amount and does
not take into account the case when the user is trying to fix
the overcommit. That is - they want to limit the amount of
online VCPUs.

This fix allows the user to offline vCPUs without any
warnings when they are running an overcommitted guest.

Also fix the extra space in the message.

Signed-off-by: Konrad Rzeszutek Wilk 

---
[v2: Remove crud code as spotted by Boris]
[v3: Moved crud code removal to another patch]
[v4: Remove the fprintf as it is done in libxl_domain_info]
[v5: Fix memory leak]
---
 tools/libxl/xl_cmdimpl.c | 17 +
 1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
index b121d75..c77c691 100644
--- a/tools/libxl/xl_cmdimpl.c
+++ b/tools/libxl/xl_cmdimpl.c
@@ -5238,12 +5238,21 @@ static int vcpuset(uint32_t domid, const char* 
nr_vcpus, int check_host)
  */
 if (check_host) {
 unsigned int host_cpu = libxl_get_max_cpus(ctx);
-if (max_vcpus > host_cpu) {
-fprintf(stderr, "You are overcommmitting! You have %d physical " \
-" CPUs and want %d vCPUs! Aborting, use --ignore-host to " 
\
-" continue\n", host_cpu, max_vcpus);
+libxl_dominfo dominfo;
+
+rc = libxl_domain_info(ctx, &dominfo, domid);
+if (rc)
 ret

Re: [Xen-devel] [PATCH 06/12] xen: arm: correctly handle vtimer traps from userspace

2015-03-25 Thread Julien Grall

Hi Ian,

On 25/03/15 14:22, Ian Campbell wrote:
> -static void vtimer_cntp_tval(struct cpu_user_regs *regs, uint32_t *r, int 
> read)
> +static int vtimer_cntp_tval(struct cpu_user_regs *regs, uint32_t *r, int 
> read)
>  {
>  struct vcpu *v = current;
>  s_time_t now;
>  
> +if ( psr_mode_is_user(regs) &&
> + !(READ_SYSREG(CNTKCTL_EL1) & CNTKCTL_EL1_EL0PTEN) )
> +return 0;
> +

Would it make sense to create a macro for this check? The code is pretty
much the same on every function except the field to check
(EL0PTEN/EL0PCTEN).

Either way:

Reviewed-by:  Julien Grall 

Regards,

-- 
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH] x86/MSI: fix error handling

2015-03-25 Thread Andrew Cooper


On 25/03/15 16:23, Jan Beulich wrote:

__setup_msi_irq() needs to undo what it did before calling
write_msi_msg() in case that returned an error.

map_domain_pirq() needs to get rid of the MSI descriptor it
(implicitly) allocated. The case of a setup_msi_irq() failure on a
non-initial multi-vector-MSI interrupt needs special handling: While
the initial IRQ will get freed by the caller (who also passed it to
us), we need to undo the effect setup_msi_irq() had on it. (As a
benefit from the added call to msi_free_irq() we no longer need to
explicitly call destroy_irq() on the non-initial slots.)

Signed-off-by: Jan Beulich 


Reviewed-by: Andrew Cooper 

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH 02/12] xen: arm: handle accesses to CNTP_CVAL_EL0

2015-03-25 Thread Julien Grall

On 25/03/15 14:22, Ian Campbell wrote:
> +static int vtimer_cntp_cval(struct cpu_user_regs *regs, uint64_t *r, int 
> read)
> +{
> +struct vcpu *v = current;
> +
> +if ( psr_mode_is_user(regs) &&
> + !(READ_SYSREG(CNTKCTL_EL1) & CNTKCTL_EL1_EL0PTEN) )

Sorry, I didn't notice it on my previous review.

CNTKCTL_EL1_EL0PTEN and psr_mode_is_user are only defined in
respectively patch #6 and #4.

Can you invert the patches to avoid build breakage during bisection?

Regards,

-- 
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH 03/12] xen: arm: Use ARMv8 names for CNTHCTL_EL2 bits

2015-03-25 Thread Julien Grall

Hi Ian,

On 25/03/15 14:22, Ian Campbell wrote:
> Rather than using the v8 register names and the v7 bit names, which
> makes things needlessly difficult when reading.
> 
> Signed-off-by: Ian Campbell 

Reviewed-by: Julien Grall 

Regards,

-- 
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH 02/12] xen: arm: handle accesses to CNTP_CVAL_EL0

2015-03-25 Thread Julien Grall

Hi Ian,

On 25/03/15 14:22, Ian Campbell wrote:
> All OSes we have run on top of Xen use CNTP_TVAL_EL0 but for
> completeness we really should handle CVAL too.
> 
> In vtimer_emulate_cp64 pull the propagation of the 64-bit result into
> two 32-bit registers out of the switch to avoid duplicating for every
> register. We also need to initialise x now since previously the only
> register implemented register was R/O.
> 
> While adding HSR_SYSREG_CNTP_CVAL_EL0 also move
> HSR_SYSREG_CNTP_CTL_EL0 so it is sorted correctly.
> 
> Signed-off-by: Ian Campbell 

Reviewed-by: Julien Grall 

Regards,

-- 
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH] x86: don't change affinity with interrupt unmasked

2015-03-25 Thread Andrew Cooper


On 20/03/15 16:40, Jan Beulich wrote:

With ->startup unmasking the IRQ, setting the affinity afterwards
without masking the IRQ again is invalid namely for MSI (which can't
have their affinity updated atomically).

Signed-off-by: Jan Beulich


Reviewed-by: Andrew Cooper 


---
Changing the affinity of non-maskable MSI IRQs seems bogus too


Agreed.  Their affinity can clearly only be changed safely by a device 
driver which can guarantee that an interrupt will not be generated 
during the vulnerable period.


This further implies that Xen can't even safely set an affinity to start 
with...



, but I
can't immediately see what we can do about this (better than disabling
affinity changes for them).


If we used interrupt remapping properly (which we don't), we could 
update the effective affinity without changing any device configuration, 
but this still doesn't provide a solution for the many systems out there 
without (functional) interrupt remapping.


~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v3 1/2] x86: simplify non‑atomic bitops

2015-03-25 Thread Andrew Cooper


On 20/03/15 14:53, Jan Beulich wrote:

- being non-atomic, their pointer arguments shouldn't be volatile-
   qualified
- their (half fake) memory operands can be a single "+m" instead of
   being both an output and an input

Signed-off-by: Jan Beulich 


After further consideration, would it not be better to change the 
non-atomic variants to being straight C.


e.g.

static inline void __set_bit(int nr, void *_addr)
{
int *addr = _addr;
addr[nr / sizeof(int)] |= (1U << (nr % sizeof(int)));
}

This would drop the memory clobber from the asm statement and allow the 
compiler to optimise repeated __set_bit() calls to the same word into a 
single action.


~Andrew

(On a separate note, I feel that all of these operations should be 
acting on unsigned rather than signed ints, but that applies to all of 
these operations, not just the non-atomic ones)


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH 00/11] Alternate p2m: support multiple copies of host p2m

2015-03-25 Thread Ed White

>>
>> The second thing is how similar some of this is to nested p2m code,
>> making me wonder whether it could share more code with that.  It's not
>> as much duplication as I had feared, but e.g. altp2m_write_p2m_entry()
>> is _identical_ to nestedp2m_write_p2m_entry(), (making the
>> copyright claim at the top of the file quite dubious, BTW).
>>
> 
> I did initially use nestedp2m_write_p2m_entry directly, but I knew
> that wouldn't be acceptable! On this specific point, I would be more
> inclined to refactor the normal write entry routine so you can call
> it everywhere, since both the nested and alternate ones are simply
> a copy of a part of the normal one.
> 

I started to look into this again. What I described as the normal write
routine is the one for HAP, and that is actually a domain-level routine
(akin to the paging mode-specific ones), and takes a domain as its first
parameter. The nestedp2m routine is a p2m-level routine that takes a p2m
as its first parameter, and that's what I had copied.

However, looking at the code, the p2m-level write routine is only ever
called from the domain-level one, but the HAP routine doesn't make such
calls, and nestedp2m and altp2m are only available in HAP mode.

Therefore, I have dropped the altp2m write routine with no ill effects.
I don't think the nestedp2m one can ever be called.

Ed


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v2] xen/passthrough: Support a single iommu_domain per xen domain per SMMU

2015-03-25 Thread Stefano Stabellini

On Wed, 25 Mar 2015, Manish Jaggi wrote:
> On Tuesday 24 March 2015 07:34 PM, Robert VanVossen wrote:
> > 
> > On 3/24/2015 7:07 AM, Manish Jaggi wrote:
> > > On Monday 23 March 2015 07:16 PM, Robbie VanVossen wrote:
> > > > If multiple devices are being passed through to the same domain and they
> > > > share a single SMMU, then they only require a single iommu_domain.
> > > > 
> > > > In arm_smmu_assign_dev, before a new iommu_domain is created, the
> > > > xen_domain->contexts is checked for any iommu_domains that are already
> > > > assigned to device that uses the same SMMU as the current device. If one
> > > > is found, attach the device to that iommu_domain. If a new one isn't
> > > > found, create a new iommu_domain just like before.
> > > > 
> > > > The arm_smmu_deassign_dev function assumes that there is a single
> > > > device per iommu_domain. This meant that when the first device was
> > > > deassigned, the iommu_domain was freed and when another device was
> > > > deassigned a crash occured in xen.
> > > > 
> > > > To fix this, a reference counter was added to the iommu_domain struct.
> > > > When an arm_smmu_xen_device references an iommu_domain, the
> > > > iommu_domains ref is incremented. When that reference is removed, the
> > > > iommu_domains ref is decremented. The iommu_domain will only be freed
> > > > when the ref is 0.
> > > > 
> > > > Signed-off-by: Robbie VanVossen 
> > > Hi,
> > > Are you adding a PCI passthrough support to Xen ?. I am in process of
> > > sending smmu driver patches based on juliens latest code.
> > > 
> > Nope, I am just working on what this patch describes
> Your patch is doing similar thing what am working on. As per the Xen 4.6
> release I took smmu for pci-passthorugh. I didnt send smmu patches so far as
> there are other dependency patches I need to send first like pci_host_bridge
> register hypercall patch.
> 
> Julien/Ian could you please hold the merge of this patch.

Sorry Manish, but the Xen community works on a first come first served
basis:  if/when this patch is considered in good condition and ready to
be merged, is likely to be merged.
Unless you have a specific concern about the code of course.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v2 1/2] iommu VT-d: separate rmrr addition function

2015-03-25 Thread Elena Ufimtseva

On Wed, Mar 25, 2015 at 05:04:38PM +, Jan Beulich wrote:
> >>> On 24.03.15 at 17:08,  wrote:
> > +static int register_one_rmrr(struct acpi_rmrr_unit *rmrru)
> > +{
> > +bool_t ignore = 0;
> > +unsigned int i = 0;
> > +int ret = 0;
> > +
> > +/* Skip checking if segment is not accessible yet. */
> > +if ( !pci_known_segment(rmrru->segment) )
> > +i = UINT_MAX;
> > +
> > +for ( ; i < rmrru->scope.devices_cnt; i++ )
> > +{
> > +u8 b, d, f;
> > +
> > +b = d = f = 0;
> 
> ???
> 
> > +b = PCI_BUS(rmrru->scope.devices[i]);
> > +d = PCI_SLOT(rmrru->scope.devices[i]);
> > +f = PCI_FUNC(rmrru->scope.devices[i]);
> 
> Quoting my reply on v1: "Please make this the declarations (with
> initializers) of these variables; they don't appear to be used outside
> this scope." I.e.
> 
> u8 b = PCI_BUS(...);

Got it, I did read it differently first time. Example helps for sure!

> 
> > +if ( !ret && (rmrru->scope.devices_cnt != 0) )
> > +ret = register_one_rmrr(rmrru);
> > +if ( ret )
> >  xfree(rmrru);
> 
> Indentation is screwed up here.

Ok, will fix.

> 
> Jan
> 

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH] x86: properly parenthesize {read, write}_atomic()

2015-03-25 Thread Andrew Cooper


On 20/03/15 15:54, Jan Beulich wrote:

... at once eliminating some redundancy from the read variant (the cast
to the destination type can be done once outside the switch).

Signed-off-by: Jan Beulich


Reviewed-by: Andrew Cooper 

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH 02/11] VMX: implement suppress #VE.

2015-03-25 Thread Ed White

On 01/15/2015 10:46 AM, Ed White wrote:
> On 01/15/2015 08:25 AM, Tim Deegan wrote:
>> Hi,
>>
>> At 13:26 -0800 on 09 Jan (1420806392), Ed White wrote:
>>>  static inline bool_t is_epte_valid(ept_entry_t *e)
>>>  {
>>> -return (e->epte != 0 && e->sa_p2mt != p2m_invalid);
>>> +return (e->valid != 0 && e->sa_p2mt != p2m_invalid);
>>
>> This test for 0 is just catching uninitialised entries in freshly
>> allocated pages.  Rather than changing it to ignore bit 63, this loop...
>>
>>>  }
>>>  
>>>  /* returns : 0 for success, -errno otherwise */
>>> @@ -194,6 +194,19 @@ static int ept_set_middle_entry(struct p2m_domain 
>>> *p2m, ept_entry_t *ept_entry)
>>>  
>>>  ept_entry->r = ept_entry->w = ept_entry->x = 1;
>>>  
>>> +/* Disable #VE on all entries */ 
>>> +if ( cpu_has_vmx_virt_exceptions )
>>> +{
>>> +ept_entry_t *table = __map_domain_page(pg);
>>> +
>>> +for ( int i = 0; i < EPT_PAGETABLE_ENTRIES; i++ )
>>> +table[i].suppress_ve = 1;
>>
>> ...should set the type of the empty entries to p2m_invalid as it goes.
>>
>>> +/* Disable #VE on all entries */
>>> +if ( cpu_has_vmx_virt_exceptions )
>>> +{
>>> +ept_entry_t *table =
>>> +map_domain_page(pagetable_get_pfn(p2m_get_pagetable(p2m)));
>>> +
>>> +for ( int i = 0; i < EPT_PAGETABLE_ENTRIES; i++ )
>>> +table[i].suppress_ve = 1;
>>
>> And the same here.

I have some time to work on this patch series again, and although I tried
this it doesn't eliminate all the instances of epte being zero with the
possible exception of suppress_ve. I spent some time trying to find all
cases where that happens without success, so I've used Andrew's suggestion
of using a mask.

Ed

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH 0/2] printk adjustments

2015-03-25 Thread Andrew Cooper


On 20/03/15 16:04, Jan Beulich wrote:

1: introduce gprintk()
2: make {,g}dprintk() a no-op in non-debug builds

Signed-off-by: Jan Beulich


Both Reviewed-by: Andrew Cooper 

It is likely that some other shifting of messages will happen, but lets 
see what this looks like as a first pass.


~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] ARM: KVM/XEN: how should we support virt-what?

2015-03-25 Thread Stefano Stabellini

On Wed, 25 Mar 2015, Ian Campbell wrote:
> On Wed, 2015-03-25 at 10:44 +0100, Andrew Jones wrote:
> > Hello ARM virt maintainers,
> > 
> > I'd like to start a discussion about supporting virt-what[1]. virt-what
> > allows userspace to determine if the system it's running on is running
> > in a guest, and of what type (KVM, Xen, etc.). Despite it being a best
> > effort tool, see the Caveat emptor in [1], it has become quite a useful
> > tool, and is showing up in different places, such as OpenStack. If you
> > look at the code[2], specifically [3], then you'll see how it works on
> > x86, which is to use the dedicated hypervisor cpuid leaves. I'm
> > wondering what equivalent we have, or can develop, for arm.
> > Here are some thoughts;
> > 0) there's already something we can use, and I just need to be told
> >about it.
> > 1) be as similar as possible to x86 by dedicating some currently
> >undefined sysreg bits. This would take buy-in from lots of parties,
> >so is not likely the way to go.
> > 2) create a specific DT node that will get exposed through sysfs, or
> >somewhere.
> > 3) same as (2), but just use the nodes currently in mach-virt's DT
> >as the indication we're a guest. This would just be a heuristic,
> >i.e. "have virtio mmio" && psci.method == hvc, or something,
> >and we'd still need a way to know if we're kvm vs. xen vs. ??.
> 
> FWIW Xen has a specific node,
> https://git.kernel.org/cgit/linux/kernel/git/devicetree/devicetree-rebasing.git/tree/Bindings/arm/xen.txt
> 
> Doesn't help you with ACPI systems though. IIRC there will be a Xen
> table of some sort.

This is the Xen specific ACPI table:

http://wiki.xenproject.org/mediawiki/images/c/c4/Xen-environment-table.pdf

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v3 1/3] libxl/cpumap: Add xc_cpumap_[setcpu, clearcpu, testcpu] to complement xc_cpumap_alloc.

2015-03-25 Thread Konrad Rzeszutek Wilk

On Wed, Mar 25, 2015 at 08:53:01AM +, Dario Faggioli wrote:
> On Tue, 2015-03-24 at 16:29 -0400, Konrad Rzeszutek Wilk wrote:
> > On Tue, Mar 24, 2015 at 05:46:04PM +, Ian Campbell wrote:
> > > Is it necessary to worry about alignment here, since xc_cpumap_t is
> > > actually a uint8_t*.
> >
> > We can also do and not worry about it:
> >
> > diff --git a/tools/libxc/xc_misc.c b/tools/libxc/xc_misc.c
> > index 7514b84..19a1b18 100644
> > --- a/tools/libxc/xc_misc.c
> > +++ b/tools/libxc/xc_misc.c
> > @@ -94,19 +94,22 @@ xc_cpumap_t xc_cpumap_alloc(xc_interface *xch)
> >  return calloc(1, sz);
> >  }
> >  
> > +#define BITS_PER_CPUMAP(map) (sizeof(*map) * 8)
> > +#define CPUMAP_ENTRY(cpu, map) ((map))[(cpu) / BITS_PER_CPUMAP(map)]
> > +#define CPUMAP_SHIFT(cpu, map) ((cpu) % BITS_PER_CPUMAP(map))
> >
> > [...]
> >
> Maybe it's only me, but I really find it a bit hard to figure out what
> the differences between this and what's in xc_bitops.h are.
> 
> If going for this, I'd say that the reasons why we need these, and such
> differences between these and BITMAP_* should be made evident somehow
> (changelog, comments, etc.).

Here is an updated patch:

>From dbb1bf2d12b24a6a91ad95abef0d4310e7141b79 Mon Sep 17 00:00:00 2001
From: Konrad Rzeszutek Wilk 
Date: Mon, 23 Mar 2015 11:52:54 -0400
Subject: [PATCH] libxl/cpumap: Add xc_cpumap_[setcpu, clearcpu, testcpu] to
 complement xc_cpumap_alloc.

We export the xc_cpumap_alloc but not the bit operations.
One could include 'xc_bitops.h' but that is naughty - so instead
we just export the proper functions to do it on the xc_cpumap_t
typedef.

Signed-off-by: Konrad Rzeszutek Wilk 

v2: Use our own macro to make sure ARM is not affected negatively
---
 tools/libxc/include/xenctrl.h |  9 +
 tools/libxc/xc_misc.c | 30 ++
 2 files changed, 39 insertions(+)

diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h
index 4e9537e..565f098 100644
--- a/tools/libxc/include/xenctrl.h
+++ b/tools/libxc/include/xenctrl.h
@@ -394,6 +394,15 @@ int xc_get_cpumap_size(xc_interface *xch);
 /* allocate a cpumap */
 xc_cpumap_t xc_cpumap_alloc(xc_interface *xch);
 
+/* clear an CPU from the cpumap. */
+void xc_cpumap_clearcpu(int cpu, xc_cpumap_t map);
+
+/* set an CPU in the cpumap. */
+void xc_cpumap_setcpu(int cpu, xc_cpumap_t map);
+
+/* Test whether the CPU in cpumap is set. */
+int xc_cpumap_testcpu(int cpu, xc_cpumap_t map);
+
 /*
  * NODEMAP handling
  */
diff --git a/tools/libxc/xc_misc.c b/tools/libxc/xc_misc.c
index be68291..e113385 100644
--- a/tools/libxc/xc_misc.c
+++ b/tools/libxc/xc_misc.c
@@ -18,6 +18,7 @@
  * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301  
USA
  */
 
+#include "xc_bitops.h"
 #include "xc_private.h"
 #include 
 
@@ -93,6 +94,35 @@ xc_cpumap_t xc_cpumap_alloc(xc_interface *xch)
 return calloc(1, sz);
 }
 
+/*
+ * xc_bitops.h has macros that do this as well - however xc_cpumap_t
+ * is an array of uint8 (1byte) while said macros expect an array
+ * of unsigned long (8 bytes). This is not a problem when setting
+ * an bit (as the computation will come with the same offset) - the
+ * issue is when clearing an bit. The aligment on ARM could affect
+ * other structures that are close in memory causing memory corruption.
+ * Hence our own macros that differ only in that we compute the
+ * size of the array elements (sizeof(*map)) as opposed to assuming
+ * it is unsigned long.
+ */
+#define BITS_PER_CPUMAP(map) (sizeof(*map) * 8)
+#define CPUMAP_ENTRY(cpu, map) ((map))[(cpu) / BITS_PER_CPUMAP(map)]
+#define CPUMAP_SHIFT(cpu, map) ((cpu) % BITS_PER_CPUMAP(map))
+void xc_cpumap_clearcpu(int cpu, xc_cpumap_t map)
+{
+CPUMAP_ENTRY(cpu, map) &= ~(1U << CPUMAP_SHIFT(cpu, map));
+}
+
+void xc_cpumap_setcpu(int cpu, xc_cpumap_t map)
+{
+CPUMAP_ENTRY(cpu, map) |= (1U << CPUMAP_SHIFT(cpu, map));
+}
+
+int xc_cpumap_testcpu(int cpu, xc_cpumap_t map)
+{
+return (CPUMAP_ENTRY(cpu, map) >> CPUMAP_SHIFT(cpu, map)) & 1;
+}
+
 xc_nodemap_t xc_nodemap_alloc(xc_interface *xch)
 {
 int sz;
-- 
2.1.0


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH] sysctl: Don't overwrite array size variable when it is set on error earlier

2015-03-25 Thread Andrew Cooper


On 25/03/15 17:09, Boris Ostrovsky wrote:

When querying CPU topology, if caller-provided array size is smaller than
number of online CPUs then, in addition to returning -ENOBUFS, sysctl is
expected to provide back this number. However, this value, stored in 'i',
is overwritten in the subsequent loop's control statement.

Make sure we don't do this by converting the loop to 'while'.

Signed-off-by: Boris Ostrovsky 
Reported-by: Andrew Cooper 
---
  xen/common/sysctl.c |6 +-
  1 files changed, 5 insertions(+), 1 deletions(-)

diff --git a/xen/common/sysctl.c b/xen/common/sysctl.c
index a8c629f..b83d230 100644
--- a/xen/common/sysctl.c
+++ b/xen/common/sysctl.c
@@ -338,8 +338,10 @@ long do_sysctl(XEN_GUEST_HANDLE_PARAM(xen_sysctl_t) 
u_sysctl)
  ret = -ENOBUFS;
  i = num_cpus;
  }
+else
+i = 0;
  
-for ( i = 0; i < num_cpus; i++ )

+while ( i < num_cpus )


This would be fine to keep as "for ( ; i < num_cpus; i++)", and helps 
avoid an issue if someone introduces a continue; in the future.


As for the fix itself, Reviewed-by: Andrew Cooper 




  {
  xen_sysctl_cputopo_t cputopo;
  
@@ -363,6 +365,8 @@ long do_sysctl(XEN_GUEST_HANDLE_PARAM(xen_sysctl_t) u_sysctl)

  ret = -EFAULT;
  break;
  }
+
+i++;
  }
  }
  else



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v2 1/2] iommu VT-d: separate rmrr addition function

2015-03-25 Thread Jan Beulich

>>> On 24.03.15 at 17:08,  wrote:
> +static int register_one_rmrr(struct acpi_rmrr_unit *rmrru)
> +{
> +bool_t ignore = 0;
> +unsigned int i = 0;
> +int ret = 0;
> +
> +/* Skip checking if segment is not accessible yet. */
> +if ( !pci_known_segment(rmrru->segment) )
> +i = UINT_MAX;
> +
> +for ( ; i < rmrru->scope.devices_cnt; i++ )
> +{
> +u8 b, d, f;
> +
> +b = d = f = 0;

???

> +b = PCI_BUS(rmrru->scope.devices[i]);
> +d = PCI_SLOT(rmrru->scope.devices[i]);
> +f = PCI_FUNC(rmrru->scope.devices[i]);

Quoting my reply on v1: "Please make this the declarations (with
initializers) of these variables; they don't appear to be used outside
this scope." I.e.

u8 b = PCI_BUS(...);

> +if ( !ret && (rmrru->scope.devices_cnt != 0) )
> +ret = register_one_rmrr(rmrru);
> +if ( ret )
>  xfree(rmrru);

Indentation is screwed up here.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

[Xen-devel] [PATCH] sysctl: Don't overwrite array size variable when it is set on error earlier

2015-03-25 Thread Boris Ostrovsky

When querying CPU topology, if caller-provided array size is smaller than
number of online CPUs then, in addition to returning -ENOBUFS, sysctl is
expected to provide back this number. However, this value, stored in 'i',
is overwritten in the subsequent loop's control statement.

Make sure we don't do this by converting the loop to 'while'.

Signed-off-by: Boris Ostrovsky 
Reported-by: Andrew Cooper 
---
 xen/common/sysctl.c |6 +-
 1 files changed, 5 insertions(+), 1 deletions(-)

diff --git a/xen/common/sysctl.c b/xen/common/sysctl.c
index a8c629f..b83d230 100644
--- a/xen/common/sysctl.c
+++ b/xen/common/sysctl.c
@@ -338,8 +338,10 @@ long do_sysctl(XEN_GUEST_HANDLE_PARAM(xen_sysctl_t) 
u_sysctl)
 ret = -ENOBUFS;
 i = num_cpus;
 }
+else
+i = 0;
 
-for ( i = 0; i < num_cpus; i++ )
+while ( i < num_cpus )
 {
 xen_sysctl_cputopo_t cputopo;
 
@@ -363,6 +365,8 @@ long do_sysctl(XEN_GUEST_HANDLE_PARAM(xen_sysctl_t) 
u_sysctl)
 ret = -EFAULT;
 break;
 }
+
+i++;
 }
 }
 else
-- 
1.7.1


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v19 13/14] x86/VPMU: Add privileged PMU mode

2015-03-25 Thread Jan Beulich

>>> On 17.03.15 at 15:54,  wrote:
> Add support for privileged PMU mode (XENPMU_MODE_ALL) which allows privileged
> domain (dom0) profile both itself (and the hypervisor) and the guests. While
> this mode is on profiling in guests is disabled.
> 
> Signed-off-by: Boris Ostrovsky 

Acked-by: Jan Beulich 
but ...

> @@ -177,17 +183,18 @@ void vpmu_do_interrupt(struct cpu_user_regs *regs)
>  sampling = sampled;
>  
>  vpmu = vcpu_vpmu(sampling);
> -if ( !is_hvm_vcpu(sampling) )
> +if ( !is_hvm_vcpu(sampling) || (vpmu_mode & XENPMU_MODE_ALL) )
>  {
>  /* PV(H) guest */
>  const struct cpu_user_regs *cur_regs;
>  uint64_t *flags = &vpmu->xenpmu_data->pmu.pmu_flags;
> -uint32_t domid = DOMID_SELF;
> +uint32_t domid;

... why is this not domid_t? Probably needs adjustment in an earlier
patch.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v19 12/14] x86/VPMU: Merge vpmu_rdmsr and vpmu_wrmsr

2015-03-25 Thread Jan Beulich

>>> On 17.03.15 at 15:54,  wrote:
> ---
> Changes in v19:
> * const-ified arch_vpmu_ops in vpmu_do_wrmsr
> * non-changes:
>- kept 'current' as a non-initializer to avoid unnecessary initialization
>  in the (common) non-VPMU case

Afaict there's nothing at all keeping the compiler to still schedule part
or all of the involved insns ahead of that first goto. And if it was an
initializer, there would equally be nothing keeping the compiler from
scheduling part or all of those insns after the if(). So I don't see why
not making it the initializer is better (other than increasing line count
and patch size). If you really wanted to do something to help the
compiler, flagging the if() condition with likely() would perhaps be the
thing to do.

Jan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

[Xen-devel] [ovmf test] 36719: regressions - FAIL

2015-03-25 Thread xen . org

flight 36719 ovmf real [real]
http://www.chiark.greenend.org.uk/~xensrcts/logs/36719/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-i386-xl-qemuu-win7-amd64  7 windows-installfail REGR. vs. 36590
 test-amd64-amd64-xl-win7-amd64  7 windows-install fail REGR. vs. 36590

Regressions which are regarded as allowable (not blocking):
 test-amd64-i386-pair17 guest-migrate/src_host/dst_host fail like 36590

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-qemut-debianhvm-amd64-xsm 7 debian-hvm-install fail never 
pass
 test-amd64-i386-xl-qemut-debianhvm-amd64-xsm 7 debian-hvm-install fail never 
pass
 test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm 7 debian-hvm-install fail never 
pass
 test-amd64-i386-xl-xsm9 guest-start  fail   never pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-xsm 7 debian-hvm-install fail never 
pass
 test-amd64-amd64-libvirt-xsm  9 guest-start  fail   never pass
 test-amd64-amd64-xl-pvh-intel  9 guest-start  fail  never pass
 test-amd64-amd64-xl-xsm   9 guest-start  fail   never pass
 test-amd64-i386-libvirt-xsm   9 guest-start  fail   never pass
 test-amd64-amd64-xl-pvh-amd   9 guest-start  fail   never pass
 test-amd64-i386-libvirt  10 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt 10 migrate-support-checkfail   never pass
 test-amd64-amd64-xl-pcipt-intel  9 guest-start fail never pass
 test-amd64-i386-xl-qemut-win7-amd64 14 guest-stop  fail never pass
 test-amd64-amd64-xl-qemut-win7-amd64 14 guest-stop fail never pass
 test-amd64-i386-xl-winxpsp3  14 guest-stop   fail   never pass
 test-amd64-i386-xl-win7-amd64 14 guest-stop   fail  never pass
 test-amd64-amd64-xl-qemuu-win7-amd64 14 guest-stop fail never pass
 test-amd64-i386-xl-qemut-winxpsp3-vcpus1 14 guest-stop fail never pass
 test-amd64-amd64-xl-winxpsp3 14 guest-stop   fail   never pass
 test-amd64-i386-xl-qemuu-winxpsp3 14 guest-stopfail never pass
 test-amd64-i386-xl-winxpsp3-vcpus1 14 guest-stop   fail never pass
 test-amd64-i386-xl-qemuu-winxpsp3-vcpus1 14 guest-stop fail never pass
 test-amd64-i386-xl-qemut-winxpsp3 14 guest-stopfail never pass
 test-amd64-amd64-xl-qemuu-winxpsp3  7 windows-install  fail never pass
 test-amd64-amd64-xl-qemut-winxpsp3 14 guest-stop   fail never pass

version targeted for testing:
 ovmf e29fc5022042f7c86b07673741749cf3de5f3816
baseline version:
 ovmf f61762d0c86b42c7922ed2d4326c9647252752ae


People who touched revisions under test:
  Cinnamon Shia 
  Fu Siyuan 


jobs:
 build-amd64-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvopspass
 build-i386-pvops pass
 test-amd64-amd64-xl  pass
 test-amd64-i386-xl   pass
 test-amd64-amd64-xl-qemut-debianhvm-amd64-xsmfail
 test-amd64-i386-xl-qemut-debianhvm-amd64-xsm fail
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-xsmfail
 test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm fail
 test-amd64-amd64-libvirt-xsm fail
 test-amd64-i386-libvirt-xsm  fail
 test-amd64-amd64-xl-xsm  fail
 test-amd64-i386-xl-xsm   fail
 test-amd64-amd64-xl-pvh-amd  fail
 test-amd64-i386-rhel6hvm-amd pass
 test-amd64-i386-qemut-rhel6hvm-amd   pass
 test-amd64-i386-qemuu-rhel6hvm-amd   pass
 test-amd64-amd64-xl-qemut-debianhvm-amd64pass
 test-amd64-i386-xl-qemut-debianhvm-amd64 pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64pass
 test-amd64-i386-xl-qemuu-debianhvm-amd64 pass
 test-amd64-i386-freebsd10-amd64  pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64 pass
 test-amd64-i386-xl-qemuu-ovmf-amd64  pass
 test-amd64-amd64-xl-qemut-win7-amd64

[Xen-devel] [PATCH v2 1/4] x86/MSI-X: be more careful during teardown

2015-03-25 Thread Jan Beulich

When a device gets detached from a guest, pciback will clear its
command register, thus disabling both memory and I/O decoding. The
disabled memory decoding, however, has an effect on the MSI-X table
accesses the hypervisor does: These won't have the intended effect
anymore. Even worse, for PCIe devices (but not SR-IOV virtual
functions) such accesses may (will?) be treated as Unsupported
Requests, causing respective errors to be surfaced, potentially in the
form of NMIs that may be fatal to the hypervisor or Dom0 is different
ways. Hence rather than carrying out these accesses, we should avoid
them where we can, and use alternative (e.g. PCI config space based)
mechanisms to achieve at least the same effect.

Signed-off-by: Jan Beulich 
---
v2: mask_msi_irq()'s BUG() needs to be conditional upon IRQ_DISABLED
not already being set, as ->shutdown() may be called on error paths
when the IRQ never got enabled. This in turn required the changes
to irq.c.

Backporting note (largely to myself):
   Depends on (not yet backported) commit 061eebe0e "x86/MSI: drop
   workaround for insecure Dom0 kernels" (due to re-use of struct
   arch_msix's warned field).

--- a/xen/arch/x86/irq.c
+++ b/xen/arch/x86/irq.c
@@ -217,9 +217,9 @@ void destroy_irq(unsigned int irq)
 }
 
 spin_lock_irqsave(&desc->lock, flags);
-desc->status  |= IRQ_DISABLED;
 desc->status  &= ~IRQ_GUEST;
 desc->handler->shutdown(desc);
+desc->status |= IRQ_DISABLED;
 action = desc->action;
 desc->action  = NULL;
 desc->msi_desc = NULL;
@@ -995,8 +995,8 @@ void __init release_irq(unsigned int irq
 spin_lock_irqsave(&desc->lock,flags);
 action = desc->action;
 desc->action  = NULL;
-desc->status |= IRQ_DISABLED;
 desc->handler->shutdown(desc);
+desc->status |= IRQ_DISABLED;
 spin_unlock_irqrestore(&desc->lock,flags);
 
 /* Wait to make sure it's not being used on another CPU */
@@ -1725,8 +1725,8 @@ static irq_guest_action_t *__pirq_guest_
 BUG_ON(action->in_flight != 0);
 
 /* Disabling IRQ before releasing the desc_lock avoids an IRQ storm. */
-desc->status |= IRQ_DISABLED;
 desc->handler->disable(desc);
+desc->status |= IRQ_DISABLED;
 
 /*
  * Mark any remaining pending EOIs as ready to flush.
--- a/xen/arch/x86/msi.c
+++ b/xen/arch/x86/msi.c
@@ -121,6 +121,27 @@ static void msix_put_fixmap(struct arch_
 spin_unlock(&msix->table_lock);
 }
 
+static bool_t memory_decoded(const struct pci_dev *dev)
+{
+u8 bus, slot, func;
+
+if ( !dev->info.is_virtfn )
+{
+bus = dev->bus;
+slot = PCI_SLOT(dev->devfn);
+func = PCI_FUNC(dev->devfn);
+}
+else
+{
+bus = dev->info.physfn.bus;
+slot = PCI_SLOT(dev->info.physfn.devfn);
+func = PCI_FUNC(dev->info.physfn.devfn);
+}
+
+return !!(pci_conf_read16(dev->seg, bus, slot, func, PCI_COMMAND) &
+  PCI_COMMAND_MEMORY);
+}
+
 /*
  * MSI message composition
  */
@@ -162,7 +183,7 @@ void msi_compose_msg(unsigned vector, co
 }
 }
 
-static void read_msi_msg(struct msi_desc *entry, struct msi_msg *msg)
+static bool_t read_msi_msg(struct msi_desc *entry, struct msi_msg *msg)
 {
 switch ( entry->msi_attrib.type )
 {
@@ -198,6 +219,8 @@ static void read_msi_msg(struct msi_desc
 void __iomem *base;
 base = entry->mask_base;
 
+if ( unlikely(!memory_decoded(entry->dev)) )
+return 0;
 msg->address_lo = readl(base + PCI_MSIX_ENTRY_LOWER_ADDR_OFFSET);
 msg->address_hi = readl(base + PCI_MSIX_ENTRY_UPPER_ADDR_OFFSET);
 msg->data = readl(base + PCI_MSIX_ENTRY_DATA_OFFSET);
@@ -209,6 +232,8 @@ static void read_msi_msg(struct msi_desc
 
 if ( iommu_intremap )
 iommu_read_msi_from_ire(entry, msg);
+
+return 1;
 }
 
 static int write_msi_msg(struct msi_desc *entry, struct msi_msg *msg)
@@ -260,6 +285,8 @@ static int write_msi_msg(struct msi_desc
 void __iomem *base;
 base = entry->mask_base;
 
+if ( unlikely(!memory_decoded(entry->dev)) )
+return -ENXIO;
 writel(msg->address_lo,
base + PCI_MSIX_ENTRY_LOWER_ADDR_OFFSET);
 writel(msg->address_hi,
@@ -287,7 +314,8 @@ void set_msi_affinity(struct irq_desc *d
 ASSERT(spin_is_locked(&desc->lock));
 
 memset(&msg, 0, sizeof(msg));
-read_msi_msg(msi_desc, &msg);
+if ( !read_msi_msg(msi_desc, &msg) )
+return;
 
 msg.data &= ~MSI_DATA_VECTOR_MASK;
 msg.data |= MSI_DATA_VECTOR(desc->arch.vector);
@@ -347,20 +375,24 @@ int msi_maskable_irq(const struct msi_de
|| entry->msi_attrib.maskbit;
 }
 
-static void msi_set_mask_bit(struct irq_desc *desc, int flag)
+static bool_t msi_set_mask_bit(struct irq_desc *desc, int flag)
 {
 struct msi_desc *entry = desc->msi_desc;
+struct pci_dev *pdev;
+u16 seg;
+u8 bus, slot, func;
 
 ASSERT(spin_is_locked(&desc->lock));
 BUG_ON(!entry || !entry->dev

Re: [Xen-devel] [PATCH] LZ4 : fix the data abort issue

2015-03-25 Thread Ian Campbell

On Wed, 2015-03-25 at 16:14 +, Jan Beulich wrote:
> If the part of the compression data are corrupted, or the compression
> data is totally fake, the memory access over the limit is possible.
> 
> This is the log from my system usning lz4 decompression.
>[6502]data abort, halting
>[6503]r0  0x r1  0x r2  0xdcea0ffc r3  0xdcea0ffc
>[6509]r4  0xb9ab0bfd r5  0xdcea0ffc r6  0xdcea0ff8 r7  0xdce8
>[6515]r8  0x r9  0x r10 0x r11 0xb9a98000
>[6522]r12 0xdcea1000 usp 0x ulr 0x pc  0x820149bc
>[6528]spsr 0x41f3
> and the memory addresses of some variables at the moment are
> ref:0xdcea0ffc, op:0xdcea0ffc, oend:0xdcea1000
> 
> As you can see, COPYLENGH is 8bytes, so @ref and @op can access the momory
> over @oend.
> 
> Signed-off-by: JeHyeon Yeon 
> Reviewed-by: David Sterba 
> [Linux commit d5e7cafd69da24e6d6cc988fab6ea313a2577efc]
> Signed-off-by: Jan Beulich 

Acked-by: Ian Campbell 

> 
> --- a/xen/common/lz4/decompress.c
> +++ b/xen/common/lz4/decompress.c
> @@ -132,6 +132,9 @@ static int INIT lz4_uncompress(const uns
>   /* Error: request to write beyond destination buffer */
>   if (cpy > oend)
>   goto _output_error;
> + if ((ref + COPYLENGTH) > oend ||
> + (op + COPYLENGTH) > oend)
> + goto _output_error;
>   LZ4_SECURECOPY(ref, op, (oend - COPYLENGTH));
>   while (op < cpy)
>   *op++ = *ref++;
> 
> 
> 



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

[Xen-devel] [PATCH v2 4/4] x86/MSI-X: cleanup

2015-03-25 Thread Jan Beulich

- __pci_enable_msix() now checks that an MSI-X capability was actually
  found
- pass "pos" to msix_capability_init() as both callers already know it
  (and hence there's no need to re-obtain it)
- call __pci_disable_msi{,x}() directly instead of via
  pci_disable_msi() from __pci_enable_msi{x,}() state validation paths
- use msix_control_reg() instead of open coding it
- log message adjustments
- coding style corrections

Signed-off-by: Jan Beulich 
Reviewed-by: Andrew Cooper 

--- a/xen/arch/x86/msi.c
+++ b/xen/arch/x86/msi.c
@@ -35,6 +35,8 @@
 static s8 __read_mostly use_msi = -1;
 boolean_param("msi", use_msi);
 
+static void __pci_disable_msix(struct msi_desc *);
+
 /* bitmap indicate which fixed map is free */
 static DEFINE_SPINLOCK(msix_fixmap_lock);
 static DECLARE_BITMAP(msix_fixmap_pages, FIX_MSIX_MAX_PAGES);
@@ -167,12 +169,14 @@ void msi_compose_msg(unsigned vector, co
 unsigned dest;
 
 memset(msg, 0, sizeof(*msg));
-if ( !cpumask_intersects(cpu_mask, &cpu_online_map) ) {
+if ( !cpumask_intersects(cpu_mask, &cpu_online_map) )
+{
 dprintk(XENLOG_ERR,"%s, compose msi message error!!\n", __func__);
 return;
 }
 
-if ( vector ) {
+if ( vector )
+{
 cpumask_t *mask = this_cpu(scratch_mask);
 
 cpumask_and(mask, cpu_mask, &cpu_online_map);
@@ -233,8 +237,7 @@ static bool_t read_msi_msg(struct msi_de
 }
 case PCI_CAP_ID_MSIX:
 {
-void __iomem *base;
-base = entry->mask_base;
+void __iomem *base = entry->mask_base;
 
 if ( unlikely(!msix_memory_decoded(entry->dev,
entry->msi_attrib.pos)) )
@@ -300,8 +303,7 @@ static int write_msi_msg(struct msi_desc
 }
 case PCI_CAP_ID_MSIX:
 {
-void __iomem *base;
-base = entry->mask_base;
+void __iomem *base = entry->mask_base;
 
 if ( unlikely(!msix_memory_decoded(entry->dev,
entry->msi_attrib.pos)) )
@@ -327,7 +329,7 @@ void set_msi_affinity(struct irq_desc *d
 struct msi_desc *msi_desc = desc->msi_desc;
 
 dest = set_desc_affinity(desc, mask);
-if (dest == BAD_APICID || !msi_desc)
+if ( dest == BAD_APICID || !msi_desc )
 return;
 
 ASSERT(spin_is_locked(&desc->lock));
@@ -379,11 +381,11 @@ static void msix_set_enable(struct pci_d
 pos = pci_find_cap_offset(seg, bus, slot, func, PCI_CAP_ID_MSIX);
 if ( pos )
 {
-control = pci_conf_read16(seg, bus, slot, func, pos + PCI_MSIX_FLAGS);
+control = pci_conf_read16(seg, bus, slot, func, msix_control_reg(pos));
 control &= ~PCI_MSIX_FLAGS_ENABLE;
 if ( enable )
 control |= PCI_MSIX_FLAGS_ENABLE;
-pci_conf_write16(seg, bus, slot, func, pos + PCI_MSIX_FLAGS, control);
+pci_conf_write16(seg, bus, slot, func, msix_control_reg(pos), control);
 }
 }
 
@@ -408,9 +410,11 @@ static bool_t msi_set_mask_bit(struct ir
 bus = pdev->bus;
 slot = PCI_SLOT(pdev->devfn);
 func = PCI_FUNC(pdev->devfn);
-switch (entry->msi_attrib.type) {
+switch ( entry->msi_attrib.type )
+{
 case PCI_CAP_ID_MSI:
-if (entry->msi_attrib.maskbit) {
+if ( entry->msi_attrib.maskbit )
+{
 u32 mask_bits;
 
 mask_bits = pci_conf_read32(seg, bus, slot, func, entry->msi.mpos);
@@ -778,13 +782,14 @@ static u64 read_pci_mem_bar(u16 seg, u8 
  * requested MSI-X entries with allocated irqs or non-zero for otherwise.
  **/
 static int msix_capability_init(struct pci_dev *dev,
+unsigned int pos,
 struct msi_info *msi,
 struct msi_desc **desc,
 unsigned int nr_entries)
 {
 struct arch_msix *msix = dev->msix;
 struct msi_desc *entry = NULL;
-int pos, vf;
+int vf;
 u16 control;
 u64 table_paddr;
 u32 table_offset;
@@ -796,7 +801,6 @@ static int msix_capability_init(struct p
 
 ASSERT(spin_is_locked(&pcidevs_lock));
 
-pos = pci_find_cap_offset(seg, bus, slot, func, PCI_CAP_ID_MSIX);
 control = pci_conf_read16(seg, bus, slot, func, msix_control_reg(pos));
 /*
  * Ensure MSI-X interrupts are masked during setup. Some devices require
@@ -984,10 +988,9 @@ static int __pci_enable_msi(struct msi_i
 old_desc = find_msi_entry(pdev, msi->irq, PCI_CAP_ID_MSI);
 if ( old_desc )
 {
-dprintk(XENLOG_WARNING, "irq %d has already mapped to MSI on "
-"device %04x:%02x:%02x.%01x\n",
-msi->irq, msi->seg, msi->bus,
-PCI_SLOT(msi->devfn), PCI_FUNC(msi->devfn));
+printk(XENLOG_WARNING "irq %d already mapped to MSI on 
%04x:%02x:%02x.%u\n",
+   msi->irq, msi->seg, msi->bus,
+   PCI_SLOT(msi->devfn), PCI_FUNC(msi->devfn));
 *desc = old_desc;
 return 0;
 }
@@ -995,10 +998,10 @@ static int __p

[Xen-devel] [PATCH v2 3/4] x86/MSI-X: reduce fiddling with control register during restore

2015-03-25 Thread Jan Beulich

Rather than disabling and enabling MSI-X once per vector, do it just
once per device.

Signed-off-by: Jan Beulich 
Reviewed-by: Andrew Cooper 

--- a/xen/arch/x86/msi.c
+++ b/xen/arch/x86/msi.c
@@ -1217,6 +1217,9 @@ int pci_restore_msi_state(struct pci_dev
 struct msi_desc *entry, *tmp;
 struct irq_desc *desc;
 struct msi_msg msg;
+u8 slot = PCI_SLOT(pdev->devfn), func = PCI_FUNC(pdev->devfn);
+unsigned int type = 0, pos = 0;
+u16 control = 0;
 
 ASSERT(spin_is_locked(&pcidevs_lock));
 
@@ -1233,8 +1236,6 @@ int pci_restore_msi_state(struct pci_dev
 list_for_each_entry_safe( entry, tmp, &pdev->msi_list, list )
 {
 unsigned int i = 0, nr = 1;
-u16 control = 0;
-u8 slot = PCI_SLOT(pdev->devfn), func = PCI_FUNC(pdev->devfn);
 
 irq = entry->irq;
 desc = &irq_desc[irq];
@@ -1251,31 +1252,38 @@ int pci_restore_msi_state(struct pci_dev
 pdev->seg, pdev->bus, PCI_SLOT(pdev->devfn),
 PCI_FUNC(pdev->devfn), i);
 spin_unlock_irqrestore(&desc->lock, flags);
+if ( type == PCI_CAP_ID_MSIX )
+pci_conf_write16(pdev->seg, pdev->bus, slot, func,
+ msix_control_reg(pos),
+ control & ~PCI_MSIX_FLAGS_ENABLE);
 return -EINVAL;
 }
 
+ASSERT(!type || type == entry->msi_attrib.type);
+pos = entry->msi_attrib.pos;
 if ( entry->msi_attrib.type == PCI_CAP_ID_MSI )
 {
 msi_set_enable(pdev, 0);
 nr = entry->msi.nvec;
 }
-else if ( entry->msi_attrib.type == PCI_CAP_ID_MSIX )
+else if ( !type && entry->msi_attrib.type == PCI_CAP_ID_MSIX )
 {
 control = pci_conf_read16(pdev->seg, pdev->bus, slot, func,
-  msix_control_reg(entry->msi_attrib.pos));
+  msix_control_reg(pos));
 pci_conf_write16(pdev->seg, pdev->bus, slot, func,
- msix_control_reg(entry->msi_attrib.pos),
+ msix_control_reg(pos),
  control | (PCI_MSIX_FLAGS_ENABLE |
 PCI_MSIX_FLAGS_MASKALL));
 if ( unlikely(!memory_decoded(pdev)) )
 {
 spin_unlock_irqrestore(&desc->lock, flags);
 pci_conf_write16(pdev->seg, pdev->bus, slot, func,
- msix_control_reg(entry->msi_attrib.pos),
+ msix_control_reg(pos),
  control & ~PCI_MSIX_FLAGS_ENABLE);
 return -ENXIO;
 }
 }
+type = entry->msi_attrib.type;
 
 msg = entry->msg;
 write_msi_msg(entry, &msg);
@@ -1298,9 +1306,9 @@ int pci_restore_msi_state(struct pci_dev
 
 spin_unlock_irqrestore(&desc->lock, flags);
 
-if ( entry->msi_attrib.type == PCI_CAP_ID_MSI )
+if ( type == PCI_CAP_ID_MSI )
 {
-unsigned int cpos = msi_control_reg(entry->msi_attrib.pos);
+unsigned int cpos = msi_control_reg(pos);
 
 control = pci_conf_read16(pdev->seg, pdev->bus, slot, func, cpos) &
   ~PCI_MSI_FLAGS_QSIZE;
@@ -1310,12 +1318,13 @@ int pci_restore_msi_state(struct pci_dev
 
 msi_set_enable(pdev, 1);
 }
-else if ( entry->msi_attrib.type == PCI_CAP_ID_MSIX )
-pci_conf_write16(pdev->seg, pdev->bus, slot, func,
- msix_control_reg(entry->msi_attrib.pos),
- control | PCI_MSIX_FLAGS_ENABLE);
 }
 
+if ( type == PCI_CAP_ID_MSIX )
+pci_conf_write16(pdev->seg, pdev->bus, slot, func,
+ msix_control_reg(pos),
+ control | PCI_MSIX_FLAGS_ENABLE);
+
 return 0;
 }
 


x86/MSI-X: reduce fiddling with control register during restore

Rather than disabling and enabling MSI-X once per vector, do it just
once per device.

Signed-off-by: Jan Beulich 
Reviewed-by: Andrew Cooper 

--- a/xen/arch/x86/msi.c
+++ b/xen/arch/x86/msi.c
@@ -1217,6 +1217,9 @@ int pci_restore_msi_state(struct pci_dev
 struct msi_desc *entry, *tmp;
 struct irq_desc *desc;
 struct msi_msg msg;
+u8 slot = PCI_SLOT(pdev->devfn), func = PCI_FUNC(pdev->devfn);
+unsigned int type = 0, pos = 0;
+u16 control = 0;
 
 ASSERT(spin_is_locked(&pcidevs_lock));
 
@@ -1233,8 +1236,6 @@ int pci_restore_msi_state(struct pci_dev
 list_for_each_entry_safe( entry, tmp, &pdev->msi_list, list )
 {
 unsigned int i = 0, nr = 1;
-u16 control = 0;
-u8 slot = PCI_SLOT(pdev->devfn), func = PCI_FUNC(pdev->devfn);
 
 irq = entry->irq;
 desc = &irq_desc[irq];
@@ -1251,31 +1252,38 @@ int pci_restore_msi_state(struct pci_dev
 pdev->seg, pdev->bus, PCI_SLOT(pdev

Re: [Xen-devel] [PATCH] hvmloader: don't treat ROM BAR like other BARs

2015-03-25 Thread Andrew Cooper


On 20/03/15 16:20, Jan Beulich wrote:

Its low 11 bits have different meaning.

Signed-off-by: Jan Beulich


Reviewed-by: Andrew Cooper 


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

[Xen-devel] [PATCH v2 2/4] x86/MSI-X: access MSI-X table only after having enabled MSI-X

2015-03-25 Thread Jan Beulich

As done in Linux by f598282f51 ("PCI: Fix the NIU MSI-X problem in a
better way") and its broken predecessor, make sure we don't access the
MSI-X table without having enabled MSI-X first, using the mask-all flag
instead to prevent interrupts from occurring.

Signed-off-by: Jan Beulich 
Reviewed-by: Andrew Cooper 

--- a/xen/arch/x86/msi.c
+++ b/xen/arch/x86/msi.c
@@ -142,6 +142,19 @@ static bool_t memory_decoded(const struc
   PCI_COMMAND_MEMORY);
 }
 
+static bool_t msix_memory_decoded(const struct pci_dev *dev, unsigned int pos)
+{
+u16 control = pci_conf_read16(dev->seg, dev->bus,
+  PCI_SLOT(dev->devfn),
+  PCI_FUNC(dev->devfn),
+  msix_control_reg(pos));
+
+if ( !(control & PCI_MSIX_FLAGS_ENABLE) )
+return 0;
+
+return memory_decoded(dev);
+}
+
 /*
  * MSI message composition
  */
@@ -219,7 +236,8 @@ static bool_t read_msi_msg(struct msi_de
 void __iomem *base;
 base = entry->mask_base;
 
-if ( unlikely(!memory_decoded(entry->dev)) )
+if ( unlikely(!msix_memory_decoded(entry->dev,
+   entry->msi_attrib.pos)) )
 return 0;
 msg->address_lo = readl(base + PCI_MSIX_ENTRY_LOWER_ADDR_OFFSET);
 msg->address_hi = readl(base + PCI_MSIX_ENTRY_UPPER_ADDR_OFFSET);
@@ -285,7 +303,8 @@ static int write_msi_msg(struct msi_desc
 void __iomem *base;
 base = entry->mask_base;
 
-if ( unlikely(!memory_decoded(entry->dev)) )
+if ( unlikely(!msix_memory_decoded(entry->dev,
+   entry->msi_attrib.pos)) )
 return -ENXIO;
 writel(msg->address_lo,
base + PCI_MSIX_ENTRY_LOWER_ADDR_OFFSET);
@@ -379,7 +398,7 @@ static bool_t msi_set_mask_bit(struct ir
 {
 struct msi_desc *entry = desc->msi_desc;
 struct pci_dev *pdev;
-u16 seg;
+u16 seg, control;
 u8 bus, slot, func;
 
 ASSERT(spin_is_locked(&desc->lock));
@@ -401,35 +420,38 @@ static bool_t msi_set_mask_bit(struct ir
 }
 break;
 case PCI_CAP_ID_MSIX:
+control = pci_conf_read16(seg, bus, slot, func,
+  msix_control_reg(entry->msi_attrib.pos));
+if ( unlikely(!(control & PCI_MSIX_FLAGS_ENABLE)) )
+pci_conf_write16(seg, bus, slot, func,
+ msix_control_reg(entry->msi_attrib.pos),
+ control | (PCI_MSIX_FLAGS_ENABLE |
+PCI_MSIX_FLAGS_MASKALL));
 if ( likely(memory_decoded(pdev)) )
 {
 writel(flag, entry->mask_base + PCI_MSIX_ENTRY_VECTOR_CTRL_OFFSET);
 readl(entry->mask_base + PCI_MSIX_ENTRY_VECTOR_CTRL_OFFSET);
-break;
+if ( likely(control & PCI_MSIX_FLAGS_ENABLE) )
+break;
+flag = 1;
 }
-if ( flag )
+else if ( flag && !(control & PCI_MSIX_FLAGS_MASKALL) )
 {
-u16 control;
 domid_t domid = pdev->domain->domain_id;
 
-control = pci_conf_read16(seg, bus, slot, func,
-  msix_control_reg(entry->msi_attrib.pos));
-if ( control & PCI_MSIX_FLAGS_MASKALL )
-break;
-pci_conf_write16(seg, bus, slot, func,
- msix_control_reg(entry->msi_attrib.pos),
- control | PCI_MSIX_FLAGS_MASKALL);
+control |= PCI_MSIX_FLAGS_MASKALL;
 if ( pdev->msix->warned != domid )
 {
 pdev->msix->warned = domid;
 printk(XENLOG_G_WARNING
-   "cannot mask IRQ %d: masked MSI-X on Dom%d's 
%04x:%02x:%02x.%u\n",
+   "cannot mask IRQ %d: masking MSI-X on Dom%d's 
%04x:%02x:%02x.%u\n",
desc->irq, domid, pdev->seg, pdev->bus,
PCI_SLOT(pdev->devfn), PCI_FUNC(pdev->devfn));
 }
-break;
 }
-/* fall through */
+pci_conf_write16(seg, bus, slot, func,
+ msix_control_reg(entry->msi_attrib.pos), control);
+return flag;
 default:
 return 0;
 }
@@ -454,7 +476,8 @@ static int msi_get_mask_bit(const struct
 entry->msi.mpos) >>
 entry->msi_attrib.entry_nr) & 1;
 case PCI_CAP_ID_MSIX:
-if ( unlikely(!memory_decoded(entry->dev)) )
+if ( unlikely(!msix_memory_decoded(entry->dev,
+   entry->msi_attrib.pos)) )
 break;
 return readl(entry->mask_base + PCI_MSIX_ENTRY_VECTOR_CTRL_OFFSET) & 1;
 }
@@ -775,16 +798,32 @@ static int msix_capability_init(struct p
 
 pos = pci_find_cap_offset(seg, bus, slot, func, PCI_CAP_ID_MSIX);
 control = pci_conf_read16(s

Re: [Xen-devel] [PATCH OSSTEST] Arrange for core dumps to be placed in /var/core and collect them

2015-03-25 Thread Ian Campbell

On Wed, 2015-03-25 at 16:18 +, Ian Jackson wrote:
> Ian Campbell writes ("Re: [PATCH OSSTEST] Arrange for core dumps to be placed 
> in /var/core and collect them"):
> > Do you have any thoughts on how to best universally arrange for the
> > rlimits to be increased? Probably inserting a ulimit call into the
> > libvirt initscript would be an easy first step, and since that's the
> > thing we most often see crashing it seems. I'll send such a patch
> > shortly.
> 
> Maybe pam ?

Yes, I think that's what my subsequent patch to configure
via /etc/security/limits.d has achived.

Ian.


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

[Xen-devel] [PATCH v2 0/4] x86/MSI-X: XSA-120 follow-up

2015-03-25 Thread Jan Beulich

The problem requiring the first patch here is actually what lead to
XSA-120.

1: be more careful during teardown
2: access MSI-X table only after having enabled MSI-X
3: reduce fiddling with control register during restore
4: cleanup

Signed-off-by: Jan Beulich 

(Only patch 1 really changed - see there for details.)


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [OSSTEST PATCH 29/32] ts-kernel-build: enable CONFIG_SCSI_HPSA

2015-03-25 Thread Ian Jackson

Ian Campbell writes ("Re: [OSSTEST PATCH 29/32] ts-kernel-build: enable 
CONFIG_SCSI_HPSA"):
> On Tue, 2015-03-24 at 19:02 +, Ian Jackson wrote:
> > +setopt CONFIG_SCSI_HPSA n
> 
> This is disabling the option, not enabling it per the commit message.

Yes.  I noticed that when I executed it :-/.  I intended to enable it
(setting it to `m').

Ian.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v5 3/8] sysctl: Make XEN_SYSCTL_topologyinfo sysctl a little more efficient

2015-03-25 Thread Andrew Cooper


On 19/03/15 22:53, Boris Ostrovsky wrote:

--- a/xen/common/sysctl.c
+++ b/xen/common/sysctl.c
@@ -324,39 +324,63 @@ long do_sysctl(XEN_GUEST_HANDLE_PARAM(xen_sysctl_t) 
u_sysctl)
  }
  break;
  
-case XEN_SYSCTL_topologyinfo:

+case XEN_SYSCTL_cputopoinfo:
  {
-uint32_t i, max_cpu_index, last_online_cpu;
-xen_sysctl_topologyinfo_t *ti = &op->u.topologyinfo;
+uint32_t i, num_cpus;
+xen_sysctl_cputopoinfo_t *ti = &op->u.cputopoinfo;
  
-last_online_cpu = cpumask_last(&cpu_online_map);

-max_cpu_index = min_t(uint32_t, ti->max_cpu_index, last_online_cpu);
-ti->max_cpu_index = last_online_cpu;
-
-for ( i = 0; i <= max_cpu_index; i++ )
+num_cpus = cpumask_last(&cpu_online_map) + 1;
+if ( !guest_handle_is_null(ti->cputopo) )
  {
-if ( !guest_handle_is_null(ti->cpu_to_core) )
+if ( ti->num_cpus < num_cpus )
  {
-uint32_t core = cpu_online(i) ? cpu_to_core(i) : ~0u;
-if ( copy_to_guest_offset(ti->cpu_to_core, i, &core, 1) )
-break;
+ret = -ENOBUFS;
+i = num_cpus;
  }
-if ( !guest_handle_is_null(ti->cpu_to_socket) )
+
+for ( i = 0; i < num_cpus; i++ )


Observe that the "i = 0" clobbers the -ENOBUFS detection, meaning that 
Xen will always write num_cpus into the userspace array, writing past 
the end of the array if it is too short.


As this patch has already been committed, please fix as a matter of 
priority (or I can if you are overly busy).


~Andrew

(Also, you have introduced a mixed tab/space into 
tools/python/xen/lowlevel/xc/xc.c on the "goto out;" line)


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

[Xen-devel] [PATCH] x86/MSI: fix error handling

2015-03-25 Thread Jan Beulich

__setup_msi_irq() needs to undo what it did before calling
write_msi_msg() in case that returned an error.

map_domain_pirq() needs to get rid of the MSI descriptor it
(implicitly) allocated. The case of a setup_msi_irq() failure on a
non-initial multi-vector-MSI interrupt needs special handling: While
the initial IRQ will get freed by the caller (who also passed it to
us), we need to undo the effect setup_msi_irq() had on it. (As a
benefit from the added call to msi_free_irq() we no longer need to
explicitly call destroy_irq() on the non-initial slots.)

Signed-off-by: Jan Beulich 

--- a/xen/arch/x86/irq.c
+++ b/xen/arch/x86/irq.c
@@ -1974,6 +1974,8 @@ int map_domain_pirq(
 dprintk(XENLOG_G_ERR, "dom%d: irq %d in use\n",
 d->domain_id, irq);
 pci_disable_msi(msi_desc);
+msi_desc->irq = -1;
+msi_free_irq(msi_desc);
 ret = -EBUSY;
 goto done;
 }
@@ -2028,22 +2030,29 @@ int map_domain_pirq(
 if ( ret )
 {
 spin_unlock_irqrestore(&desc->lock, flags);
+pci_disable_msi(msi_desc);
+if ( nr )
+{
+ASSERT(msi_desc->irq >= 0);
+desc = irq_to_desc(msi_desc->irq);
+spin_lock_irqsave(&desc->lock, flags);
+desc->handler = &no_irq_type;
+desc->msi_desc = NULL;
+spin_unlock_irqrestore(&desc->lock, flags);
+}
 while ( nr-- )
 {
-if ( irq >= 0 )
-{
-if ( irq_deny_access(d, irq) )
-printk(XENLOG_G_ERR
-   "dom%d: could not revoke access to IRQ%d (pirq 
%d)\n",
-   d->domain_id, irq, pirq);
-destroy_irq(irq);
-}
+if ( irq >= 0 && irq_deny_access(d, irq) )
+printk(XENLOG_G_ERR
+   "dom%d: could not revoke access to IRQ%d (pirq 
%d)\n",
+   d->domain_id, irq, pirq);
 if ( info )
 cleanup_domain_irq_pirq(d, irq, info);
 info = pirq_info(d, pirq + nr);
 irq = info->arch.irq;
 }
-pci_disable_msi(msi_desc);
+msi_desc->irq = -1;
+msi_free_irq(msi_desc);
 goto done;
 }
 
--- a/xen/arch/x86/msi.c
+++ b/xen/arch/x86/msi.c
@@ -561,6 +561,7 @@ static struct msi_desc *alloc_msi_entry(
 while ( nr-- )
 {
 entry[nr].dev = NULL;
+entry[nr].irq = -1;
 entry[nr].remap_index = -1;
 }
 
@@ -578,11 +579,19 @@ int __setup_msi_irq(struct irq_desc *des
 hw_irq_controller *handler)
 {
 struct msi_msg msg;
+int ret;
 
 desc->msi_desc = msidesc;
 desc->handler = handler;
 msi_compose_msg(desc->arch.vector, desc->arch.cpu_mask, &msg);
-return write_msi_msg(msidesc, &msg);
+ret = write_msi_msg(msidesc, &msg);
+if ( unlikely(ret) )
+{
+desc->handler = &no_irq_type;
+desc->msi_desc = NULL;
+}
+
+return ret;
 }
 
 int msi_free_irq(struct msi_desc *entry)
@@ -592,7 +601,8 @@ int msi_free_irq(struct msi_desc *entry)
 
 while ( nr-- )
 {
-destroy_irq(entry[nr].irq);
+if ( entry[nr].irq >= 0 )
+destroy_irq(entry[nr].irq);
 
 /* Free the unused IRTE if intr remap enabled */
 if ( iommu_intremap )



x86/MSI: fix error handling

__setup_msi_irq() needs to undo what it did before calling
write_msi_msg() in case that returned an error.

map_domain_pirq() needs to get rid of the MSI descriptor it
(implicitly) allocated. The case of a setup_msi_irq() failure on a
non-initial multi-vector-MSI interrupt needs special handling: While
the initial IRQ will get freed by the caller (who also passed it to
us), we need to undo the effect setup_msi_irq() had on it. (As a
benefit from the added call to msi_free_irq() we no longer need to
explicitly call destroy_irq() on the non-initial slots.)

Signed-off-by: Jan Beulich 

--- a/xen/arch/x86/irq.c
+++ b/xen/arch/x86/irq.c
@@ -1974,6 +1974,8 @@ int map_domain_pirq(
 dprintk(XENLOG_G_ERR, "dom%d: irq %d in use\n",
 d->domain_id, irq);
 pci_disable_msi(msi_desc);
+msi_desc->irq = -1;
+msi_free_irq(msi_desc);
 ret = -EBUSY;
 goto done;
 }
@@ -2028,22 +2030,29 @@ int map_domain_pirq(
 if ( ret )
 {
 spin_unlock_irqrestore(&desc->lock, flags);
+pci_disable_msi(msi_desc);
+if ( nr )
+{
+ASSERT(msi_desc->irq >= 0);
+desc = irq_to_desc(msi_desc->irq);
+spin_lock_irqsave(&desc->lock, flags);
+desc->handler = &no_irq_type;
+desc->msi_desc = NULL;
+

Re: [Xen-devel] [PATCH OSSTEST] Arrange for core dumps to be placed in /var/core and collect them

2015-03-25 Thread Ian Jackson

Ian Campbell writes ("Re: [PATCH OSSTEST] Arrange for core dumps to be placed 
in /var/core and collect them"):
> Do you have any thoughts on how to best universally arrange for the
> rlimits to be increased? Probably inserting a ulimit call into the
> libvirt initscript would be an easy first step, and since that's the
> thing we most often see crashing it seems. I'll send such a patch
> shortly.

Maybe pam ?

Ian.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

[Xen-devel] [PATCH] LZ4 : fix the data abort issue

2015-03-25 Thread Jan Beulich

If the part of the compression data are corrupted, or the compression
data is totally fake, the memory access over the limit is possible.

This is the log from my system usning lz4 decompression.
   [6502]data abort, halting
   [6503]r0  0x r1  0x r2  0xdcea0ffc r3  0xdcea0ffc
   [6509]r4  0xb9ab0bfd r5  0xdcea0ffc r6  0xdcea0ff8 r7  0xdce8
   [6515]r8  0x r9  0x r10 0x r11 0xb9a98000
   [6522]r12 0xdcea1000 usp 0x ulr 0x pc  0x820149bc
   [6528]spsr 0x41f3
and the memory addresses of some variables at the moment are
ref:0xdcea0ffc, op:0xdcea0ffc, oend:0xdcea1000

As you can see, COPYLENGH is 8bytes, so @ref and @op can access the momory
over @oend.

Signed-off-by: JeHyeon Yeon 
Reviewed-by: David Sterba 
[Linux commit d5e7cafd69da24e6d6cc988fab6ea313a2577efc]
Signed-off-by: Jan Beulich 

--- a/xen/common/lz4/decompress.c
+++ b/xen/common/lz4/decompress.c
@@ -132,6 +132,9 @@ static int INIT lz4_uncompress(const uns
/* Error: request to write beyond destination buffer */
if (cpy > oend)
goto _output_error;
+   if ((ref + COPYLENGTH) > oend ||
+   (op + COPYLENGTH) > oend)
+   goto _output_error;
LZ4_SECURECOPY(ref, op, (oend - COPYLENGTH));
while (op < cpy)
*op++ = *ref++;



LZ4 : fix the data abort issue

If the part of the compression data are corrupted, or the compression
data is totally fake, the memory access over the limit is possible.

This is the log from my system usning lz4 decompression.
   [6502]data abort, halting
   [6503]r0  0x r1  0x r2  0xdcea0ffc r3  0xdcea0ffc
   [6509]r4  0xb9ab0bfd r5  0xdcea0ffc r6  0xdcea0ff8 r7  0xdce8
   [6515]r8  0x r9  0x r10 0x r11 0xb9a98000
   [6522]r12 0xdcea1000 usp 0x ulr 0x pc  0x820149bc
   [6528]spsr 0x41f3
and the memory addresses of some variables at the moment are
ref:0xdcea0ffc, op:0xdcea0ffc, oend:0xdcea1000

As you can see, COPYLENGH is 8bytes, so @ref and @op can access the momory
over @oend.

Signed-off-by: JeHyeon Yeon 
Reviewed-by: David Sterba 
[Linux commit d5e7cafd69da24e6d6cc988fab6ea313a2577efc]
Signed-off-by: Jan Beulich 

--- a/xen/common/lz4/decompress.c
+++ b/xen/common/lz4/decompress.c
@@ -132,6 +132,9 @@ static int INIT lz4_uncompress(const uns
/* Error: request to write beyond destination buffer */
if (cpy > oend)
goto _output_error;
+   if ((ref + COPYLENGTH) > oend ||
+   (op + COPYLENGTH) > oend)
+   goto _output_error;
LZ4_SECURECOPY(ref, op, (oend - COPYLENGTH));
while (op < cpy)
*op++ = *ref++;
___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] VM Migration on ARM

2015-03-25 Thread Andrew Cooper


On 20/03/15 13:17, Vijay Kilari wrote:

Hi Andrew,

On Wed, Mar 11, 2015 at 4:21 PM, Andrew Cooper
  wrote:

On 11/03/15 05:32, Vijay Kilari wrote:

Hi Andrew,

 From Ian & Stefano, I came to know that you have introduced
Migration framework v2
under below patch series

http://marc.info/?l=xen-devel&m=141036915311145

I have few queries:

1) Is there any plans/timeline to support for ARM32/ARM64?.

I will not personally be doing any ARM migration support, not even as
part of getting migration v2 accepted upstream.


2) Also does the Domain Image format specification for ARM is complete?

The spec has certain sentinel values for ARM already where it made sense
to just include them while writing.

It has no specific thought to ARM migration, but an ARM virtual machine
is exactly like an x86 HVM domain (from the toolstacks point of view),
other than the Qemu record.  With any luck, no new additions to the spec
will be needed, but the spec is trivially extendible if required.

IMO, ARM uses PVH domain.
Can you please let me know how to use your patches, I mean xl/xc commands
to do migration (non-live/live) using your patches.
Any debugging and testing support information would be helpful.


Hi - apologises for the late reply.  I have been on holiday.

The patches series as available does make `xl save/restore/migrate` 
work, although the error handling leaves a lot to be desired.


We (XenServer) are in the process of attempting to get everything 
upstream.  The libxc side of things is definitely working and already 
shipped in XenServer 6.5 (We absolutely needed to support migrating 
between 32 and 64bit toolstacks, which was the primary reason for the 
rewrite).


What sort of timescale are you looking for on this?  I hope to have a 
new series posted soon, now that the datacopier subseries has been accepted.


~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [xen-4.3-testing test] 36695: regressions - FAIL

2015-03-25 Thread Jan Beulich

>>> On 25.03.15 at 10:27,  wrote:
> On Wed, 2015-03-25 at 08:33 +, xen.org wrote:
>> flight 36695 xen-4.3-testing real [real]
>> http://www.chiark.greenend.org.uk/~xensrcts/logs/36695/ 
>> 
>> Regressions :-(
>> 
>> Tests which did not succeed and are blocking,
>> including tests which could not be run:
>>  test-amd64-i386-pair   17 guest-migrate/src_host/dst_host fail REGR. vs. 
> 36483
> 
> This looks like the swiotlb thing. Given that the only commit under test
> here is:
> 
> commit c58b16ef1572176cf2f6a424b527b5ed4bb73f17
> Author: Jan Beulich 
> Date:   Thu Mar 19 16:08:36 2015 +0100
> 
> update Xen version to 4.3.4
> 
> Shall we just force push it?

Yes, I think so.

Thanks, Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH 2/3] xen: arm: correctly handle continuations for 64-bit guests

2015-03-25 Thread Ian Campbell

On Wed, 2015-03-25 at 15:41 +, Andrew Cooper wrote:
> On 25/03/15 15:34, Ian Campbell wrote:
> > The 64-bit ABI is different to 32-bit:
> >
> >   - uses x16 as the op register rather than r12.
> >   - arguments in x0..x5 and not r0..r5. Using rN here potentially
> > truncates.
> >   - return value goes in x0, not r0.
> >
> > Hypercalls can only be made directly from kernel space, so checking
> > the domain's size is sufficient.
> >
> > The update of regs->pc is duplicated in both halves because the 32-bit
> > case is going to need fixing to handle Thumb mode (next patch).
> >
> > Spotted due to spurious -EFAULT when destroying a domain, due to the
> > hypercall's pointer argument being truncated. I'm unclear why I am
> > only seeing this now.
> >
> > Signed-off-by: Ian Campbell 
> 
> Almost certainly 15e0aac6fe76be6a710a8e6d3da610d437903266 which changed 
> XEN_DOMCTL_destroydomain to use continuations rather than repeated 
> hypercalls.

That sounds like why domaindestroy now exhibits it, but not why it
hasn't been plaguing us on different hypercalls for ever. I suppose we
don't actually hit preempt very often in practice.

Ian.


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH 1/1] hvm.c: Prevent gcc uninitialised var warning

2015-03-25 Thread Jan Beulich

>>> On 25.03.15 at 16:02,  wrote:
> On 23/03/15 15:01, Don Slutz wrote:
>> gcc (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3 reports:
>> --
>> hvm.c: In function `hvm_create_ioreq_server':
>> hvm.c:487:18: error: `bufioreq_pfn' may be used uninitialised in this 
> function
>> [-Werror=uninitialized]
>> hvm.c:718:30: note: `bufioreq_pfn' was declared here
>> --
>>
>> My code analysis says that gcc is wrong, but initilize the variable
>> to prevent the gcc warning.
>>
>> Reported-by: Ian Murray 
>> Signed-off-by: Don Slutz 
> 
> GCC is correct and there is path through this function where
> bufioreq_pfn is used while uninitialised (non-debug build, is_default =
> 1, handle_bufioreq = 0).

I'm not seeing it: When is_default is set, bufioreq_pfn gets
initialized in the first if()'s body.

> As an aside, the compiler is in a very easy position to spot this.  The
> error means that GCC has positively identified a basic block which does
> use bufioreq_pfn before it has been initialised.
> 
> I observe that the patch has been committed, but it merely hides the
> problem and doesn't fix it; hvm_free_ioreq_gmfn(d, bufioreq_pfn) will
> still clobber arbitrary memory.

I committed the patch based on me not being able to find a path
where the variable would actually be used uninitialized (and it is
known that various gcc versions have varying levels of problems
with correctly identifying such issues). If indeed there is a path
that I'm not seeing, then I'd indeed favor reverting and putting
in a proper, backportable fix.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH 2/3] xen: arm: correctly handle continuations for 64-bit guests

2015-03-25 Thread Andrew Cooper


On 25/03/15 15:34, Ian Campbell wrote:

The 64-bit ABI is different to 32-bit:

  - uses x16 as the op register rather than r12.
  - arguments in x0..x5 and not r0..r5. Using rN here potentially
truncates.
  - return value goes in x0, not r0.

Hypercalls can only be made directly from kernel space, so checking
the domain's size is sufficient.

The update of regs->pc is duplicated in both halves because the 32-bit
case is going to need fixing to handle Thumb mode (next patch).

Spotted due to spurious -EFAULT when destroying a domain, due to the
hypercall's pointer argument being truncated. I'm unclear why I am
only seeing this now.

Signed-off-by: Ian Campbell 


Almost certainly 15e0aac6fe76be6a710a8e6d3da610d437903266 which changed 
XEN_DOMCTL_destroydomain to use continuations rather than repeated 
hypercalls.


~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH 1/3] xen: arm: Fix typo "Falltrough"

2015-03-25 Thread Ian Campbell

On Wed, 2015-03-25 at 15:34 +, Ian Campbell wrote:
Sigh, I remembered just too late that without a cover letter git doesn't
chain things, sorry. I should stop being so lazy...

> Signed-off-by: Ian Campbell 
> ---
>  xen/arch/arm/domain.c |2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
> index fdba081..939d8cd 100644
> --- a/xen/arch/arm/domain.c
> +++ b/xen/arch/arm/domain.c
> @@ -742,7 +742,7 @@ int domain_relinquish_resources(struct domain *d)
>  {
>  case RELMEM_not_started:
>  d->arch.relmem = RELMEM_xen;
> -/* Falltrough */
> +/* Fallthrough */
>  
>  case RELMEM_xen:
>  ret = relinquish_memory(d, &d->xenpage_list);



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

[Xen-devel] [PATCH 2/3] xen: arm: correctly handle continuations for 64-bit guests

2015-03-25 Thread Ian Campbell

The 64-bit ABI is different to 32-bit:

 - uses x16 as the op register rather than r12.
 - arguments in x0..x5 and not r0..r5. Using rN here potentially
   truncates.
 - return value goes in x0, not r0.

Hypercalls can only be made directly from kernel space, so checking
the domain's size is sufficient.

The update of regs->pc is duplicated in both halves because the 32-bit
case is going to need fixing to handle Thumb mode (next patch).

Spotted due to spurious -EFAULT when destroying a domain, due to the
hypercall's pointer argument being truncated. I'm unclear why I am
only seeing this now.

Signed-off-by: Ian Campbell 
---
I imagine this needs backporting everywhere...

For reference the git diff -b version of this is:
@@ -357,6 +357,36 @@ unsigned long hypercall_create_continuation(
 else
 {
 regs = guest_cpu_user_regs();
+
+#ifdef CONFIG_ARM_64
+if ( !is_32bit_domain(current->domain) )
+{
+regs->x16 = op;
+
+/* Ensure the hypercall trap instruction is re-executed. */
+regs->pc -= 4;  /* re-execute 'hvc #XEN_HYPERCALL_TAG' */
+
+for ( i = 0; *p != '\0'; i++ )
+{
+arg = next_arg(p, args);
+
+switch ( i )
+{
+case 0: regs->x0 = arg; break;
+case 1: regs->x1 = arg; break;
+case 2: regs->x2 = arg; break;
+case 3: regs->x3 = arg; break;
+case 4: regs->x4 = arg; break;
+case 5: regs->x5 = arg; break;
+}
+}
+
+/* Return value gets written back to x0 */
+rc = regs->x0;
+}
+else
+#endif
+{
 regs->r12 = op;

 /* Ensure the hypercall trap instruction is re-executed.
 * */
@@ -380,6 +410,7 @@ unsigned long hypercall_create_continuation(
 /* Return value gets written back to r0 */
 rc = regs->r0;
 }
+}

 va_end(args);
---
 xen/arch/arm/domain.c |   63 -
 1 file changed, 47 insertions(+), 16 deletions(-)

diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
index 939d8cd..10f13e4 100644
--- a/xen/arch/arm/domain.c
+++ b/xen/arch/arm/domain.c
@@ -356,29 +356,60 @@ unsigned long hypercall_create_continuation(
 }
 else
 {
-regs  = guest_cpu_user_regs();
-regs->r12 = op;
+regs = guest_cpu_user_regs();
 
-/* Ensure the hypercall trap instruction is re-executed. */
-regs->pc -= 4;  /* re-execute 'hvc #XEN_HYPERCALL_TAG' */
-
-for ( i = 0; *p != '\0'; i++ )
+#ifdef CONFIG_ARM_64
+if ( !is_32bit_domain(current->domain) )
 {
-arg = next_arg(p, args);
+regs->x16 = op;
 
-switch ( i )
+/* Ensure the hypercall trap instruction is re-executed. */
+regs->pc -= 4;  /* re-execute 'hvc #XEN_HYPERCALL_TAG' */
+
+for ( i = 0; *p != '\0'; i++ )
 {
-case 0: regs->r0 = arg; break;
-case 1: regs->r1 = arg; break;
-case 2: regs->r2 = arg; break;
-case 3: regs->r3 = arg; break;
-case 4: regs->r4 = arg; break;
-case 5: regs->r5 = arg; break;
+arg = next_arg(p, args);
+
+switch ( i )
+{
+case 0: regs->x0 = arg; break;
+case 1: regs->x1 = arg; break;
+case 2: regs->x2 = arg; break;
+case 3: regs->x3 = arg; break;
+case 4: regs->x4 = arg; break;
+case 5: regs->x5 = arg; break;
+}
 }
+
+/* Return value gets written back to x0 */
+rc = regs->x0;
 }
+else
+#endif
+{
+regs->r12 = op;
+
+/* Ensure the hypercall trap instruction is re-executed. */
+regs->pc -= 4;  /* re-execute 'hvc #XEN_HYPERCALL_TAG' */
+
+for ( i = 0; *p != '\0'; i++ )
+{
+arg = next_arg(p, args);
+
+switch ( i )
+{
+case 0: regs->r0 = arg; break;
+case 1: regs->r1 = arg; break;
+case 2: regs->r2 = arg; break;
+case 3: regs->r3 = arg; break;
+case 4: regs->r4 = arg; break;
+case 5: regs->r5 = arg; break;
+}
+}
 
-/* Return value gets written back to r0 */
-rc = regs->r0;
+/* Return value gets written back to r0 */
+rc = regs->r0;
+

[Xen-devel] [PATCH 3/3] xen: arm: handle thumb mode guest in continuations

2015-03-25 Thread Ian Campbell

PC only needs adjusting by 2, otherwise we rerun the instruction prior
to the hvc as well.

hvc is unpredictable if used within a Thumb IT (conditional execution)
block, so we don't need to worry about rewinding any of that state.

Signed-off-by: Ian Campbell 
---
Should be backported
---
 xen/arch/arm/domain.c |3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
index 10f13e4..2d2197e 100644
--- a/xen/arch/arm/domain.c
+++ b/xen/arch/arm/domain.c
@@ -387,10 +387,11 @@ unsigned long hypercall_create_continuation(
 else
 #endif
 {
+int is_thumb = (regs->cpsr & PSR_THUMB);
 regs->r12 = op;
 
 /* Ensure the hypercall trap instruction is re-executed. */
-regs->pc -= 4;  /* re-execute 'hvc #XEN_HYPERCALL_TAG' */
+regs->pc -= is_thumb?2:4; /* re-execute 'hvc #XEN_HYPERCALL_TAG' */
 
 for ( i = 0; *p != '\0'; i++ )
 {
-- 
1.7.10.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

[Xen-devel] [PATCH 1/3] xen: arm: Fix typo "Falltrough"

2015-03-25 Thread Ian Campbell

Signed-off-by: Ian Campbell 
---
 xen/arch/arm/domain.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
index fdba081..939d8cd 100644
--- a/xen/arch/arm/domain.c
+++ b/xen/arch/arm/domain.c
@@ -742,7 +742,7 @@ int domain_relinquish_resources(struct domain *d)
 {
 case RELMEM_not_started:
 d->arch.relmem = RELMEM_xen;
-/* Falltrough */
+/* Fallthrough */
 
 case RELMEM_xen:
 ret = relinquish_memory(d, &d->xenpage_list);
-- 
1.7.10.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [RFC] Add the panic info when disable VT-d

2015-03-25 Thread Jan Beulich

>>> On 25.03.15 at 03:32,  wrote:
>> >>> On 19.01.15 at 10:00,  wrote:
>> > --- a/xen/arch/x86/apic.c
>> > +++ b/xen/arch/x86/apic.c
>> > @@ -915,6 +915,11 @@ void __init x2apic_bsp_setup(void)
>> >  return;
>> >  }
>> >  printk("x2APIC: Already enabled by BIOS: Ignoring cmdline
>> > disable.\n");
>> > +} else {
>> > +if ( !iommu_enable) {
>> > +panic("x2APIC should be disabled while IOMMU is disabled,"
>> > + "try to set x2apic=0 in cmdline and disable x2apic in BIOS");
>> > +}
>> 
>> Putting aside the coding style violations (figure braces on their own lines, 
> no
>> hard tabs), you tie this to the wrong thing: You need interrupt remapping to
>> be enabled, whereas iommu_enable may only mean DMA remapping. And
>> I'm afraid you'd run into an ordering problem (iommu_intremap getting set
>> to its final value vs. the code above being run) if you tried to correct 
> this.
> 
> I don't understand the ordering problem you referred to, the patch is just 
> change the panic info, 
> and tell the user what they should do to avoid the panic, I didn't change 
> the logic of the current code.
> Without my patch, the current code will also print the panic info like this:
> 
> (XEN) 
> (XEN) Panic on CPU 0:
> (XEN) Couldn't enable IOMMU and iommu=required/force
> (XEN) 
>  
> It's odd because the user have set ' iommu=0', anyway, it should be fixed.

Not sure what you consider odd here - x2apic_bsp_setup() is
where force_iommu gets set to 1, i.e. the user specifying
"iommu=0" gets intentionally overridden in this case.

> How about change the panic info to this.
>  +} else {
>  +if ( !iommu_enable) {
>  +panic("x2APIC should be disabled while iommu=0 is set,"
>  +  "try to set x2apic=0 option and disable x2apic in BIOS to 
> avoid this");
>  +}
> 
> or could you give some suggestion?

You only altered the text, not the condition. As said before, I think
this really should be keyed off of iommu_intremap. I can't exclude
that the existing logic isn't really correct either, but if that's the
case the solution is not to add another random panic() invocation,
but to fix that logic.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH 1/1] hvm.c: Prevent gcc uninitialised var warning

2015-03-25 Thread Andrew Cooper

On 23/03/15 15:01, Don Slutz wrote:
> gcc (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3 reports:
> --
> hvm.c: In function `hvm_create_ioreq_server':
> hvm.c:487:18: error: `bufioreq_pfn' may be used uninitialised in this function
> [-Werror=uninitialized]
> hvm.c:718:30: note: `bufioreq_pfn' was declared here
> --
>
> My code analysis says that gcc is wrong, but initilize the variable
> to prevent the gcc warning.
>
> Reported-by: Ian Murray 
> Signed-off-by: Don Slutz 

GCC is correct and there is path through this function where
bufioreq_pfn is used while uninitialised (non-debug build, is_default =
1, handle_bufioreq = 0).

As an aside, the compiler is in a very easy position to spot this.  The
error means that GCC has positively identified a basic block which does
use bufioreq_pfn before it has been initialised.

I observe that the patch has been committed, but it merely hides the
problem and doesn't fix it; hvm_free_ioreq_gmfn(d, bufioreq_pfn) will
still clobber arbitrary memory.

No combination of parameters should be able to cause UB like this.  In
this case, the error handling is incredibly fragile.  All the unmapping
should be based on information in s->{,buf}ioreq which is set upon
successful map, rather than the booleans indicating whether the function
should have set up the mappings.  This allows all the fail$N labels to
collapse into a single, more simple, exit path.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH OSSTEST] Arrange for core dumps to be placed in /var/core and collect them

2015-03-25 Thread Ian Campbell

On Wed, 2015-03-25 at 09:40 +, Ian Campbell wrote:
> Do you have any thoughts on how to best universally arrange for the
> rlimits to be increased? Probably inserting a ulimit call into the
> libvirt initscript would be an easy first step, and since that's the
> thing we most often see crashing it seems. I'll send such a patch
> shortly.
> 
> Adding it to xencommons would be OK too, which just leaves catching e.g.
> daemonised xl processes launched via ssh. Hacking /etc/profile or
> something in TestSupport::cmdex perhaps?

I handled this by writing an /etc/security/limits.d/ file during host
install, which seems to have done the trick. I just sent a patch which
along with the libvirt one I sent earlier today covers everything I
think we care about except our daemons which are launched from
xencommons.

The initscript in xen.git doesn't offer a useful hook for this, we could
add something to the end of /etc/default/xencommons, which is sourced by
the script, but that seems like a bit of an abuse.

Ian.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

[Xen-devel] [PATCH 06/12] xen: arm: correctly handle vtimer traps from userspace

2015-03-25 Thread Ian Campbell

Previously 32-bit userspace on 32-bit kernel and 64-bit userspace on
64-bit kernel could access these registers irrespective of whether the
kernel had configured them to be allowed to. To fix this:

 - Userspace access to CNTP_CTL_EL0 and CNTP_TVAL_EL0 should be gated
   on CNTKCTL_EL1.EL0PTEN.
 - Userspace access to CNTPCT_EL0 should be gated on
   CNTKCTL_EL1.EL0PCTEN.

When we do not handle an access we now silently inject an undef even
in debug mode since the debugging messages are not helpful (we have
handled the access, by explicitly choosing not to).

Since HSR_EC_CP15_64 cannot be taken from a guest in AArch64 mode
except due to a hardware bug switch the associated check to a BUG_ON
(which will be switched to something more appropriate in a subsequent
patch)

Fix a coding style issue in HSR_CPREG64(CNTPCT) while touching similar
code.

Signed-off-by: Ian Campbell 
---
v2: Replaces "xen: arm: turn vtimer traps for cp32/64 and sysreg into #undef"
---
 xen/arch/arm/traps.c|   28 +---
 xen/arch/arm/vtimer.c   |   38 +++---
 xen/include/asm-arm/processor.h |6 ++
 3 files changed, 38 insertions(+), 34 deletions(-)

diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
index 4eda561..c5ffcc5 100644
--- a/xen/arch/arm/traps.c
+++ b/xen/arch/arm/traps.c
@@ -1598,11 +1598,7 @@ static void do_cp15_32(struct cpu_user_regs *regs,
 case HSR_CPREG32(CNTP_CTL):
 case HSR_CPREG32(CNTP_TVAL):
 if ( !vtimer_emulate(regs, hsr) )
-{
-dprintk(XENLOG_ERR,
-"failed emulation of 32-bit vtimer CP register access\n");
-domain_crash_synchronous();
-}
+goto undef_cp15_32;
 break;
 case HSR_CPREG32(ACTLR):
 if ( cp32.read )
@@ -1645,6 +1641,7 @@ static void do_cp15_32(struct cpu_user_regs *regs,
 gdprintk(XENLOG_ERR, "unhandled 32-bit CP15 access %#x\n",
  hsr.bits & HSR_CP32_REGS_MASK);
 #endif
+ undef_cp15_32:
 inject_undef_exception(regs, hsr.len);
 return;
 }
@@ -1665,11 +1662,7 @@ static void do_cp15_64(struct cpu_user_regs *regs,
 case HSR_CPREG64(CNTPCT):
 case HSR_CPREG64(CNTP_CVAL):
 if ( !vtimer_emulate(regs, hsr) )
-{
-dprintk(XENLOG_ERR,
-"failed emulation of 64-bit vtimer CP register access\n");
-domain_crash_synchronous();
-}
+goto undef_cp15_64;
 break;
 default:
 {
@@ -1683,6 +1676,7 @@ static void do_cp15_64(struct cpu_user_regs *regs,
 gdprintk(XENLOG_ERR, "unhandled 64-bit CP15 access %#x\n",
  hsr.bits & HSR_CP64_REGS_MASK);
 #endif
+ undef_cp15_64:
 inject_undef_exception(regs, hsr.len);
 return;
 }
@@ -1851,11 +1845,7 @@ static void do_sysreg(struct cpu_user_regs *regs,
 case HSR_SYSREG_CNTP_TVAL_EL0:
 case HSR_SYSREG_CNTP_CVAL_EL0:
 if ( !vtimer_emulate(regs, hsr) )
-{
-dprintk(XENLOG_ERR,
-"failed emulation of 64-bit vtimer sysreg access\n");
-domain_crash_synchronous();
-}
+goto undef_sysreg;
 break;
 case HSR_SYSREG_ICC_SGI1R_EL1:
 if ( !vgic_emulate(regs, hsr) )
@@ -1874,8 +1864,8 @@ static void do_sysreg(struct cpu_user_regs *regs,
 default:
  bad_sysreg:
 {
-struct hsr_sysreg sysreg = hsr.sysreg;
 #ifndef NDEBUG
+struct hsr_sysreg sysreg = hsr.sysreg;
 
 gdprintk(XENLOG_ERR,
  "%s %d, %d, c%d, c%d, %d %s x%d @ 0x%"PRIregister"\n",
@@ -1888,7 +1878,8 @@ static void do_sysreg(struct cpu_user_regs *regs,
 gdprintk(XENLOG_ERR, "unhandled 64-bit sysreg access %#x\n",
  hsr.bits & HSR_SYSREG_REGS_MASK);
 #endif
-inject_undef_exception(regs, sysreg.len);
+ undef_sysreg:
+inject_undef_exception(regs, hsr.len);
 return;
 }
 }
@@ -2064,8 +2055,7 @@ asmlinkage void do_trap_hypervisor(struct cpu_user_regs 
*regs)
 do_cp15_32(regs, hsr);
 break;
 case HSR_EC_CP15_64:
-if ( !is_32bit_domain(current->domain) )
-goto bad_trap;
+BUG_ON(!psr_mode_is_32bit(regs->cpsr));
 perfc_incr(trap_cp15_64);
 do_cp15_64(regs, hsr);
 break;
diff --git a/xen/arch/arm/vtimer.c b/xen/arch/arm/vtimer.c
index 0c67f20..5aca65e 100644
--- a/xen/arch/arm/vtimer.c
+++ b/xen/arch/arm/vtimer.c
@@ -154,9 +154,14 @@ int virt_timer_restore(struct vcpu *v)
 return 0;
 }
 
-static void vtimer_cntp_ctl(struct cpu_user_regs *regs, uint32_t *r, int read)
+static int vtimer_cntp_ctl(struct cpu_user_regs *regs, uint32_t *r, int read)
 {
 struct vcpu *v = current;
+
+if ( psr_mode_is_user(regs) &&
+ !(READ_SYSREG(CNTKCTL_EL1) & CNTKCTL_EL1_EL0PTEN) )
+return 0;
+
 if ( read )
 {
 *r =

[Xen-devel] [PATCH 03/12] xen: arm: Use ARMv8 names for CNTHCTL_EL2 bits

2015-03-25 Thread Ian Campbell

Rather than using the v8 register names and the v7 bit names, which
makes things needlessly difficult when reading.

Signed-off-by: Ian Campbell 
---
v3: New patch.
---
 xen/arch/arm/time.c |2 +-
 xen/include/asm-arm/processor.h |4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/xen/arch/arm/time.c b/xen/arch/arm/time.c
index 352e25e..ce6d3fd 100644
--- a/xen/arch/arm/time.c
+++ b/xen/arch/arm/time.c
@@ -230,7 +230,7 @@ void __cpuinit init_timer_interrupt(void)
 /* Sensible defaults */
 WRITE_SYSREG64(0, CNTVOFF_EL2); /* No VM-specific offset */
 /* Do not let the VMs program the physical timer, only read the physical 
counter */
-WRITE_SYSREG32(CNTHCTL_PA, CNTHCTL_EL2);
+WRITE_SYSREG32(CNTHCTL_EL2_EL1PCTEN, CNTHCTL_EL2);
 WRITE_SYSREG32(0, CNTP_CTL_EL0);/* Physical timer disabled */
 WRITE_SYSREG32(0, CNTHP_CTL_EL2);   /* Hypervisor's timer disabled */
 isb();
diff --git a/xen/include/asm-arm/processor.h b/xen/include/asm-arm/processor.h
index fcd26fb..570d2a1 100644
--- a/xen/include/asm-arm/processor.h
+++ b/xen/include/asm-arm/processor.h
@@ -555,8 +555,8 @@ union hsr {
 #define FSC_LL_MASK(_AC(0x03,U)<<0)
 
 /* Time counter hypervisor control register */
-#define CNTHCTL_PA  (1u<<0)  /* Kernel/user access to physical counter */
-#define CNTHCTL_TA  (1u<<1)  /* Kernel/user access to CNTP timer */
+#define CNTHCTL_EL2_EL1PCTEN (1u<<0) /* Kernel/user access to physical counter 
*/
+#define CNTHCTL_EL2_EL1PCEN  (1u<<1) /* Kernel/user access to CNTP timer regs 
*/
 
 /* Timer control registers */
 #define CNTx_CTL_ENABLE   (1u<<0)  /* Enable timer */
-- 
1.7.10.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

[Xen-devel] [PATCH 09/12] xen: arm: correctly handle sysreg accesses from userspace

2015-03-25 Thread Ian Campbell

Previously we implemented all registers as RAZ/WI even if they
shouldn't be accessible to userspace.

Accesses to the *_EL1 registers from EL0 are trapped to EL1 by the
hardware, so add a BUG_ON. Likewise accesses from 32-bit EL1 cannot
happen.

PMUSERENR_EL0 and MDCCSR_EL0 are R/O to EL0.

Other PM*_EL0 registers are accessible at EL0 only if PMUSERENR_EL0.EN
is set, since we emulate that as RAZ/WI we know that bit cannot be
set.

Signed-off-by: Ian Campbell 
---
 xen/arch/arm/traps.c  |   54 +++--
 xen/include/asm-arm/sysregs.h |1 +
 2 files changed, 42 insertions(+), 13 deletions(-)

diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
index 909e880..a65b32a 100644
--- a/xen/arch/arm/traps.c
+++ b/xen/arch/arm/traps.c
@@ -1852,11 +1852,40 @@ static void do_sysreg(struct cpu_user_regs *regs,
 switch ( hsr.bits & HSR_SYSREG_REGS_MASK )
 {
 /* RAZ/WI registers: */
+
 /*  - Debug */
 case HSR_SYSREG_MDSCR_EL1:
 /*  - Perf monitors */
 case HSR_SYSREG_PMINTENSET_EL1:
 case HSR_SYSREG_PMINTENCLR_EL1:
+/* - Breakpoints */
+HSR_SYSREG_DBG_CASES(DBGBVR):
+HSR_SYSREG_DBG_CASES(DBGBCR):
+/* - Watchpoints */
+HSR_SYSREG_DBG_CASES(DBGWVR):
+HSR_SYSREG_DBG_CASES(DBGWCR):
+/* - Double Lock Register */
+case HSR_SYSREG_OSDLR_EL1:
+/* EL1 only */
+BUG_ON(psr_mode_is_user(regs));
+goto sysreg_raz_wi;
+
+case HSR_SYSREG_PMUSERENR_EL0:
+/* RO at EL0. RAZ/WI at EL1 */
+if ( psr_mode_is_user(regs) && !hsr.sysreg.read )
+goto undef_sysreg;
+goto sysreg_raz_wi;
+
+case HSR_SYSREG_MDCCSR_EL0:
+/*
+ * Accessible at EL0 only if MDSCR_EL1.TDCC is set to 0. We emulate 
that
+ * register as RAZ/WI above. So RO at both EL0 and EL1.
+ */
+if ( !hsr.sysreg.read )
+goto undef_sysreg;
+
+*x = 0;
+break;
 case HSR_SYSREG_PMCR_EL0:
 case HSR_SYSREG_PMCNTENSET_EL0:
 case HSR_SYSREG_PMCNTENCLR_EL0:
@@ -1868,16 +1897,16 @@ static void do_sysreg(struct cpu_user_regs *regs,
 case HSR_SYSREG_PMCCNTR_EL0:
 case HSR_SYSREG_PMXEVTYPER_EL0:
 case HSR_SYSREG_PMXEVCNTR_EL0:
-case HSR_SYSREG_PMUSERENR_EL0:
 case HSR_SYSREG_PMOVSSET_EL0:
-/* - Breakpoints */
-HSR_SYSREG_DBG_CASES(DBGBVR):
-HSR_SYSREG_DBG_CASES(DBGBCR):
-/* - Watchpoints */
-HSR_SYSREG_DBG_CASES(DBGWVR):
-HSR_SYSREG_DBG_CASES(DBGWCR):
-/* - Double Lock Register */
-case HSR_SYSREG_OSDLR_EL1:
+/*
+ * Accessible at EL0 only if PMUSERENR_EL0.EN is set. We
+ * emulate that register as 0 above.
+ */
+if ( psr_mode_is_user(regs) )
+goto undef_sysreg;
+/* Fall thru */
+
+ sysreg_raz_wi:
 if ( hsr.sysreg.read )
 *x = 0;
 /* else: write ignored */
@@ -1885,8 +1914,9 @@ static void do_sysreg(struct cpu_user_regs *regs,
 
 /* Write only, Write ignore registers: */
 case HSR_SYSREG_OSLAR_EL1:
+BUG_ON(psr_mode_is_user(regs));
 if ( hsr.sysreg.read )
-goto bad_sysreg;
+goto undef_sysreg;
 /* else: write ignored */
 break;
 case HSR_SYSREG_CNTP_CTL_EL0:
@@ -1910,7 +1940,6 @@ static void do_sysreg(struct cpu_user_regs *regs,
 "Emulation of sysreg ICC_SGI0R_EL1/ASGI1R_EL1 not 
supported\n");
 inject_undef64_exception(regs, hsr.len);
 default:
- bad_sysreg:
 {
 #ifndef NDEBUG
 struct hsr_sysreg sysreg = hsr.sysreg;
@@ -2153,8 +2182,7 @@ asmlinkage void do_trap_hypervisor(struct cpu_user_regs 
*regs)
 inject_undef64_exception(regs, hsr.len);
 break;
 case HSR_EC_SYSREG:
-if ( is_32bit_domain(current->domain) )
-goto bad_trap;
+BUG_ON(psr_mode_is_32bit(regs->cpsr));
 perfc_incr(trap_sysreg);
 do_sysreg(regs, hsr);
 break;
diff --git a/xen/include/asm-arm/sysregs.h b/xen/include/asm-arm/sysregs.h
index df8e070..3b9ddd3 100644
--- a/xen/include/asm-arm/sysregs.h
+++ b/xen/include/asm-arm/sysregs.h
@@ -43,6 +43,7 @@
 #define HSR_SYSREG_MDSCR_EL1  HSR_SYSREG(2,0,c0,c2,2)
 #define HSR_SYSREG_OSLAR_EL1  HSR_SYSREG(2,0,c1,c0,4)
 #define HSR_SYSREG_OSDLR_EL1  HSR_SYSREG(2,0,c1,c3,4)
+#define HSR_SYSREG_MDCCSR_EL0 HSR_SYSREG(2,3,c0,c1,0)
 
 #define HSR_SYSREG_DBGBVRn_EL1(n) HSR_SYSREG(2,0,c0,c##n,4)
 #define HSR_SYSREG_DBGBCRn_EL1(n) HSR_SYSREG(2,0,c0,c##n,5)
-- 
1.7.10.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

[Xen-devel] [PATCH 01/12] xen: arm: Correct PMXEV cp register definitions

2015-03-25 Thread Ian Campbell

p15,0,c9,c13,1 is PMXEVTYPER not PMXEVCNTR.
p15,0,c9,c13,2 is PMXEVCNTR not PMXEVCNR.

Signed-off-by: Ian Campbell 
Reviewed-by: Julien Grall 
---
 xen/arch/arm/traps.c |2 +-
 xen/include/asm-arm/cpregs.h |4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
index ad046e8..50d67aa 100644
--- a/xen/arch/arm/traps.c
+++ b/xen/arch/arm/traps.c
@@ -1633,8 +1633,8 @@ static void do_cp15_32(struct cpu_user_regs *regs,
 case HSR_CPREG32(PMCEID0):
 case HSR_CPREG32(PMCEID1):
 case HSR_CPREG32(PMCCNTR):
+case HSR_CPREG32(PMXEVTYPER):
 case HSR_CPREG32(PMXEVCNTR):
-case HSR_CPREG32(PMXEVCNR):
 case HSR_CPREG32(PMUSERENR):
 case HSR_CPREG32(PMINTENSET):
 case HSR_CPREG32(PMINTENCLR):
diff --git a/xen/include/asm-arm/cpregs.h b/xen/include/asm-arm/cpregs.h
index f1100c8..afe9148 100644
--- a/xen/include/asm-arm/cpregs.h
+++ b/xen/include/asm-arm/cpregs.h
@@ -222,8 +222,8 @@
 #define PMCEID0 p15,0,c9,c12,6  /* Perf. Mon. Common Event 
Identification register 0 */
 #define PMCEID1 p15,0,c9,c12,7  /* Perf. Mon. Common Event 
Identification register 1 */
 #define PMCCNTR p15,0,c9,c13,0  /* Perf. Mon. Cycle Count Register */
-#define PMXEVCNTR   p15,0,c9,c13,1  /* Perf. Mon. Event Type Select 
Register */
-#define PMXEVCNRp15,0,c9,c13,2  /* Perf. Mon. Event Count Register */
+#define PMXEVTYPER  p15,0,c9,c13,1  /* Perf. Mon. Event Type Select 
Register */
+#define PMXEVCNTR   p15,0,c9,c13,2  /* Perf. Mon. Event Count Register */
 #define PMUSERENR   p15,0,c9,c14,0  /* Perf. Mon. User Enable Register */
 #define PMINTENSET  p15,0,c9,c14,1  /* Perf. Mon. Interrupt Enable Set 
Register */
 #define PMINTENCLR  p15,0,c9,c14,2  /* Perf. Mon. Interrupt Enable Clear 
Register */
-- 
1.7.10.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

[Xen-devel] [PATCH 05/12] xen: arm: Handle 32-bit EL0 on 64-bit EL1 when advancing PC after trap

2015-03-25 Thread Ian Campbell

Fix a coding style issue while in the area.

Signed-off-by: Ian Campbell 
Reviewed-by: Julien Grall 
---
 xen/arch/arm/traps.c |8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
index 83abafa..4eda561 100644
--- a/xen/arch/arm/traps.c
+++ b/xen/arch/arm/traps.c
@@ -1487,7 +1487,7 @@ static int check_conditional_instr(struct cpu_user_regs 
*regs, union hsr hsr)
 {
 unsigned long it;
 
-BUG_ON( !is_32bit_domain(current->domain) || !(cpsr&PSR_THUMB) );
+BUG_ON( !psr_mode_is_32bit(regs->cpsr) || !(cpsr&PSR_THUMB) );
 
 it = ( (cpsr >> (10-2)) & 0xfc) | ((cpsr >> 25) & 0x3 );
 
@@ -1496,7 +1496,7 @@ static int check_conditional_instr(struct cpu_user_regs 
*regs, union hsr hsr)
 return 1;
 
 /* The cond for this instruction works out as the top 4 bits. */
-cond =  ( it >> 4 );
+cond = ( it >> 4 );
 }
 
 cpsr_cond = cpsr >> 28;
@@ -1514,10 +1514,10 @@ static void advance_pc(struct cpu_user_regs *regs, 
union hsr hsr)
 unsigned long itbits, cond, cpsr = regs->cpsr;
 
 /* PSR_IT_MASK bits can only be set for 32-bit processors in Thumb mode. */
-BUG_ON( (!is_32bit_domain(current->domain)||!(cpsr&PSR_THUMB))
+BUG_ON( (!psr_mode_is_32bit(cpsr)||!(cpsr&PSR_THUMB))
 && (cpsr&PSR_IT_MASK) );
 
-if ( is_32bit_domain(current->domain) && (cpsr&PSR_IT_MASK) )
+if ( cpsr&PSR_IT_MASK )
 {
 /* The ITSTATE[7:0] block is contained in CPSR[15:10],CPSR[26:25]
  *
-- 
1.7.10.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

[Xen-devel] [PATCH v3 0/12] xen: arm: reenable support for 32-bit userspace running in 64-bit guest.

2015-03-25 Thread Ian Campbell

XSA-102/CVE-2014-5147[0] concerned a crash when trapping from 32-bit
userspace in a 64-bit guest. Part of that security patch was c0020e09970
"xen: arm: Handle traps from 32-bit userspace on 64-bit kernel as undef
fix" which turned the exploitable crash into a #undef to the guest (so
as to kill the process but not the host) as a workaround for the issue.

However while this prevented the exploit it did not make 32-bit
userspaces which were prone to triggering the issue actually work.

This series consists of some patches which I originally wrote for
XSA-102 to fix the issue properly before it was determined that those
fixes were too invasive by far for a security update. At the end of the
series is a new patch which removes the XSA-102 workaround since all
problematic traps should now be handled.

Since these were originally intended to be the security fix they have
had a fair bit of scrutiny already in private . However since there is
now a risk of reintroducing XSA-102 I would appreciate a pretty thorough
second pair of eyes on it this time around.

I've tested this with a local utility which tries to access the various
cp and system registers from both 32- and 64-bit processes and checks
that they either work or give the expected traps. Since this tool is
effectively an exploit for XSA-102 I'm not sharing here but if you ask
nicely and appear to be wearing the correct colour hat I might share it
with you (it's not terribly impressive, so don't get too excited).

Since last time I've implemented Julien's review feedback including:
  * added the GUEST_BUG_ON patch to the end to replace the BUG_ONs
due to invalid h/w state, which gets more useful debug if that
occurs.
  * handled CNTP_CVAL_EL0.

Ian.

[0] http://xenbits.xen.org/xsa/advisory-102.html




___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

[Xen-devel] [PATCH 07/12] xen: arm: Handle CP15 register traps from userspace

2015-03-25 Thread Ian Campbell

Previously userspace access to PM* would have been incorrectly (but
benignly) implemented as RAZ/WI when running on a 32-bit kernel and
would cause a hypervisor exception (host crash) when running a 64-bit
kernel (this was already solved via the fix to XSA-102).

CLIDR, CCSIDR, DCCISW, ACTLR, PMINTENSET, PMINTENCLR are EL1 only,
attempts to access from EL0 will trap to EL1 not to us, hence BUG_ON
is appropriate now.

PMUSERENR is R/O at EL0 and we implement as RAZ/WI at EL1 as before.

The remaining PM* registers are accessible to EL0 only if
PMUSERENR_EL0.EN is set, since we emulate this as RAZ/WI the bit is
never set so we inject a trap on attempted access. We weren't
previously handling PMCCNTR.

Signed-off-by: Ian Campbell 
---
 xen/arch/arm/traps.c |   32 +++-
 1 file changed, 27 insertions(+), 5 deletions(-)

diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
index c5ffcc5..e49ae79 100644
--- a/xen/arch/arm/traps.c
+++ b/xen/arch/arm/traps.c
@@ -1565,6 +1565,7 @@ static void do_cp15_32(struct cpu_user_regs *regs,
 switch ( hsr.bits & HSR_CP32_REGS_MASK )
 {
 case HSR_CPREG32(CLIDR):
+BUG_ON(psr_mode_is_user(regs));
 if ( !cp32.read )
 {
 dprintk(XENLOG_ERR,
@@ -1574,6 +1575,7 @@ static void do_cp15_32(struct cpu_user_regs *regs,
 *r = READ_SYSREG32(CLIDR_EL1);
 break;
 case HSR_CPREG32(CCSIDR):
+BUG_ON(psr_mode_is_user(regs));
 if ( !cp32.read )
 {
 dprintk(XENLOG_ERR,
@@ -1583,6 +1585,7 @@ static void do_cp15_32(struct cpu_user_regs *regs,
 *r = READ_SYSREG32(CCSIDR_EL1);
 break;
 case HSR_CPREG32(DCCISW):
+BUG_ON(psr_mode_is_user(regs));
 if ( cp32.read )
 {
 dprintk(XENLOG_ERR,
@@ -1601,6 +1604,7 @@ static void do_cp15_32(struct cpu_user_regs *regs,
 goto undef_cp15_32;
 break;
 case HSR_CPREG32(ACTLR):
+BUG_ON(psr_mode_is_user(regs));
 if ( cp32.read )
*r = v->arch.actlr;
 break;
@@ -1613,6 +1617,18 @@ static void do_cp15_32(struct cpu_user_regs *regs,
  * always support PMCCNTR (the cyle counter): we just RAZ/WI for all
  * PM register, which doesn't crash the kernel at least
  */
+case HSR_CPREG32(PMUSERENR):
+/* RO at EL0. RAZ/WI at EL1 */
+if ( psr_mode_is_user(regs) && !hsr.cp32.read )
+goto undef_cp15_32;
+goto cp15_32_raz_wi;
+
+case HSR_CPREG32(PMINTENSET):
+case HSR_CPREG32(PMINTENCLR):
+/* EL1 only */
+BUG_ON(psr_mode_is_user(regs));
+goto cp15_32_raz_wi;
+
 case HSR_CPREG32(PMCR):
 case HSR_CPREG32(PMCNTENSET):
 case HSR_CPREG32(PMCNTENCLR):
@@ -1624,12 +1640,19 @@ static void do_cp15_32(struct cpu_user_regs *regs,
 case HSR_CPREG32(PMCCNTR):
 case HSR_CPREG32(PMXEVTYPER):
 case HSR_CPREG32(PMXEVCNTR):
-case HSR_CPREG32(PMUSERENR):
-case HSR_CPREG32(PMINTENSET):
-case HSR_CPREG32(PMINTENCLR):
 case HSR_CPREG32(PMOVSSET):
+/*
+ * Accessible at EL0 only if PMUSERENR_EL0.EN is set. We
+ * emulate that register as 0 above.
+ */
+if ( psr_mode_is_user(regs) )
+goto undef_cp15_32;
+/* Fall thru */
+
+ cp15_32_raz_wi:
 if ( cp32.read )
 *r = 0;
+/* else: write ignored */
 break;
 
 default:
@@ -2049,8 +2072,7 @@ asmlinkage void do_trap_hypervisor(struct cpu_user_regs 
*regs)
 advance_pc(regs, hsr);
 break;
 case HSR_EC_CP15_32:
-if ( !is_32bit_domain(current->domain) )
-goto bad_trap;
+BUG_ON(!psr_mode_is_32bit(regs->cpsr));
 perfc_incr(trap_cp15_32);
 do_cp15_32(regs, hsr);
 break;
-- 
1.7.10.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

[Xen-devel] [PATCH 08/12] xen: arm: Handle CP14 32-bit register accesses from userspace

2015-03-25 Thread Ian Campbell

Accesses to these from 32-bit userspace would cause a hypervisor
exception (host crash) when running a 64-bit kernel, which is worked
around by the fix to XSA-102. On 32-bit kernels they would be
implemented as RAZ/WI which is incorrect but harmless.

Update as follows:
 - DBGDSCRINT should be R/O.
 - DBGDSCREXT should be EL1 only.
 - DBGOSLAR is RO and EL1 only.
 - DBGVCR, DBGB[VC]R*, DBGW[VC]R*, and DBGOSDLR are EL1 only.

DBGDIDR and DBGDSCRINT are accessible from EL0 if DBGDSCRext.UDCCdis.
Since we emulate that as RAZ/WI we allow access.

When we do not allow an access we now silently inject an undef even in
debug mode since the debugging messages are not helpful (we have
handled the access, by explicitly choosing not to).

Signed-off-by: Ian Campbell 
---
v3: Add comment to DBGDSCRINT case.
---
 xen/arch/arm/traps.c |   38 +++---
 1 file changed, 31 insertions(+), 7 deletions(-)

diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
index e49ae79..909e880 100644
--- a/xen/arch/arm/traps.c
+++ b/xen/arch/arm/traps.c
@@ -1722,10 +1722,12 @@ static void do_cp14_32(struct cpu_user_regs *regs, 
union hsr hsr)
 switch ( hsr.bits & HSR_CP32_REGS_MASK )
 {
 case HSR_CPREG32(DBGDIDR):
-
-/* Read-only register */
+/*
+ * Read-only register. Accessible by EL0 if DBGDSCRext.UDCCdis
+ * is set to 0, which we emulated below.
+ */
 if ( !cp32.read )
-goto bad_cp;
+goto undef_cp14_32;
 
 /* Implement the minimum requirements:
  *  - Number of watchpoints: 1
@@ -1738,15 +1740,28 @@ static void do_cp14_32(struct cpu_user_regs *regs, 
union hsr hsr)
 break;
 
 case HSR_CPREG32(DBGDSCRINT):
+/*
+ * Read-only register. Accessible by EL0 if DBGDSCRext.UDCCdis
+ * is set to 0, which we emulated below.
+ */
+if ( !cp32.read )
+goto undef_cp14_32;
+
+*r = 0;
+break;
+
 case HSR_CPREG32(DBGDSCREXT):
+if ( usr_mode(regs) )
+goto undef_cp14_32;
+
 /* Implement debug status and control register as RAZ/WI.
  * The OS won't use Hardware debug if MDBGen not set
  */
 if ( cp32.read )
*r = 0;
 break;
+
 case HSR_CPREG32(DBGVCR):
-case HSR_CPREG32(DBGOSLAR):
 case HSR_CPREG32(DBGBVR0):
 case HSR_CPREG32(DBGBCR0):
 case HSR_CPREG32(DBGWVR0):
@@ -1754,13 +1769,22 @@ static void do_cp14_32(struct cpu_user_regs *regs, 
union hsr hsr)
 case HSR_CPREG32(DBGBVR1):
 case HSR_CPREG32(DBGBCR1):
 case HSR_CPREG32(DBGOSDLR):
+if ( usr_mode(regs) )
+goto undef_cp14_32;
 /* RAZ/WI */
 if ( cp32.read )
 *r = 0;
 break;
 
+case HSR_CPREG32(DBGOSLAR):
+if ( usr_mode(regs) )
+goto undef_cp14_32;
+/* WO */
+if ( cp32.read )
+goto undef_cp14_32;
+/* else: ignore */
+break;
 default:
-bad_cp:
 #ifndef NDEBUG
 gdprintk(XENLOG_ERR,
  "%s p14, %d, r%d, cr%d, cr%d, %d @ 0x%"PRIregister"\n",
@@ -1769,6 +1793,7 @@ bad_cp:
 gdprintk(XENLOG_ERR, "unhandled 32-bit cp14 access %#x\n",
  hsr.bits & HSR_CP32_REGS_MASK);
 #endif
+undef_cp14_32:
 inject_undef_exception(regs, hsr.len);
 return;
 }
@@ -2082,8 +2107,7 @@ asmlinkage void do_trap_hypervisor(struct cpu_user_regs 
*regs)
 do_cp15_64(regs, hsr);
 break;
 case HSR_EC_CP14_32:
-if ( !is_32bit_domain(current->domain) )
-goto bad_trap;
+BUG_ON(!psr_mode_is_32bit(regs->cpsr));
 perfc_incr(trap_cp14_32);
 do_cp14_32(regs, hsr);
 break;
-- 
1.7.10.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

[Xen-devel] [PATCH 10/12] xen: arm: handle remaining traps from userspace

2015-03-25 Thread Ian Campbell

CP14 dbg and general CP register access are both handled with
unconditional injection of #undef from their respective handlers, so
allow these even from 32-bit userspace on a 64-bit kernel.

SMC32 and HVC32 should only come from a guest in AArch32 mode and
SMC64 and HVC64 should only come from a guest in AArch64 mode. Add
appropriate BUG_ONs to all cases.

After this bad_trap is no longer used.

Signed-off-by: Ian Campbell 
---
v3: Reintroduce accidentally dropped undef injection from smc64 case.
---
 xen/arch/arm/traps.c |   14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
index a65b32a..9e7337f 100644
--- a/xen/arch/arm/traps.c
+++ b/xen/arch/arm/traps.c
@@ -2141,23 +2141,22 @@ asmlinkage void do_trap_hypervisor(struct cpu_user_regs 
*regs)
 do_cp14_32(regs, hsr);
 break;
 case HSR_EC_CP14_DBG:
-if ( !is_32bit_domain(current->domain) )
-goto bad_trap;
+BUG_ON(!psr_mode_is_32bit(regs->cpsr));
 perfc_incr(trap_cp14_dbg);
 do_cp14_dbg(regs, hsr);
 break;
 case HSR_EC_CP:
-if ( !is_32bit_domain(current->domain) )
-goto bad_trap;
+BUG_ON(!psr_mode_is_32bit(regs->cpsr));
 perfc_incr(trap_cp);
 do_cp(regs, hsr);
 break;
 case HSR_EC_SMC32:
+BUG_ON(!psr_mode_is_32bit(regs->cpsr));
 perfc_incr(trap_smc32);
-inject_undef32_exception(regs);
+inject_undef_exception(regs, hsr.len);
 break;
 case HSR_EC_HVC32:
-perfc_incr(trap_hvc32);
+BUG_ON(!psr_mode_is_32bit(regs->cpsr));
 #ifndef NDEBUG
 if ( (hsr.iss & 0xff00) == 0xff00 )
 return do_debug_trap(regs, hsr.iss & 0x00ff);
@@ -2168,6 +2167,7 @@ asmlinkage void do_trap_hypervisor(struct cpu_user_regs 
*regs)
 break;
 #ifdef CONFIG_ARM_64
 case HSR_EC_HVC64:
+BUG_ON(psr_mode_is_32bit(regs->cpsr));
 perfc_incr(trap_hvc64);
 #ifndef NDEBUG
 if ( (hsr.iss & 0xff00) == 0xff00 )
@@ -2178,6 +2178,7 @@ asmlinkage void do_trap_hypervisor(struct cpu_user_regs 
*regs)
 do_trap_hypercall(regs, ®s->x16, hsr.iss);
 break;
 case HSR_EC_SMC64:
+BUG_ON(psr_mode_is_32bit(regs->cpsr));
 perfc_incr(trap_smc64);
 inject_undef64_exception(regs, hsr.len);
 break;
@@ -2204,7 +2205,6 @@ asmlinkage void do_trap_hypervisor(struct cpu_user_regs 
*regs)
 #endif
 
 default:
- bad_trap:
 printk("Hypervisor Trap. HSR=0x%x EC=0x%x IL=%x 
Syndrome=0x%"PRIx32"\n",
hsr.bits, hsr.ec, hsr.len, hsr.iss);
 do_unexpected_trap("Hypervisor", regs);
-- 
1.7.10.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

[Xen-devel] [PATCH 04/12] xen: arm: Factor out psr_mode_is_user

2015-03-25 Thread Ian Campbell

This embodies the logic on arm64 that userspace can be either 32-bit
or 64-bit. It will be used in other places shortly.

Note that the logic differs slightly because the original (in
inject_abt64_exception) knew that the kernel was 64-bit and could
therefore assume that any 32-bit mode was userspace. Instead the
refactored code explicitly checks for usr mode.

Signed-off-by: Ian Campbell 
Reviewed-by: Julien Grall 
---
 xen/arch/arm/traps.c   |9 +
 xen/include/asm-arm/regs.h |8 
 2 files changed, 9 insertions(+), 8 deletions(-)

diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
index 9e4a60f..83abafa 100644
--- a/xen/arch/arm/traps.c
+++ b/xen/arch/arm/traps.c
@@ -455,14 +455,7 @@ static void inject_abt64_exception(struct cpu_user_regs 
*regs,
 .len = instr_len,
 };
 
-/*
- * Trap may have been taken from EL0, which might be in AArch32
- * mode (PSR_MODE_BIT set), or in AArch64 mode (PSR_MODE_EL0t).
- *
- * Since we know the kernel must be 64-bit any trap from a 32-bit
- * mode must have been from EL0.
- */
-if ( psr_mode_is_32bit(regs->cpsr) || psr_mode(regs->cpsr,PSR_MODE_EL0t) )
+if ( psr_mode_is_user(regs) )
 esr.ec = prefetch
 ? HSR_EC_INSTR_ABORT_LOWER_EL : HSR_EC_DATA_ABORT_LOWER_EL;
 else
diff --git a/xen/include/asm-arm/regs.h b/xen/include/asm-arm/regs.h
index 0951857..56d53d6 100644
--- a/xen/include/asm-arm/regs.h
+++ b/xen/include/asm-arm/regs.h
@@ -24,9 +24,17 @@
 
 #ifdef CONFIG_ARM_32
 #define hyp_mode(r) psr_mode((r)->cpsr,PSR_MODE_HYP)
+#define psr_mode_is_user(r) usr_mode(r)
 #else
 #define hyp_mode(r) (psr_mode((r)->cpsr,PSR_MODE_EL2h) || \
  psr_mode((r)->cpsr,PSR_MODE_EL2t))
+
+/*
+ * Trap may have been taken from EL0, which might be in AArch32 usr
+ * mode, or in AArch64 mode (PSR_MODE_EL0t).
+ */
+#define psr_mode_is_user(r) \
+(psr_mode((r)->cpsr,PSR_MODE_EL0t) || usr_mode(r))
 #endif
 
 #define guest_mode(r) \
-- 
1.7.10.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

1 2 >

1 - 100 of 174 matches

Mail list logo