date:20141203

Re: [Xen-devel] [PATCH] tools/hotplug: update systemd dependency to use service instead of socket

2014-12-03 Thread Olaf Hering

On Wed, Dec 03, M A Young wrote:

> On Wed, 3 Dec 2014, Konrad Rzeszutek Wilk wrote:
> >Options=mode=755,context="$XENSTORED_MOUNT_CTX"
> 
> Yes, that was on my probable bug list, as context="none" isn't a valid mount
> option (on Fedora at least), presumably because context has to be followed
> by a valid selinux context.

Is that something the sysadmin has to adjust, or should the xen source
provide proper values?

Olaf

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

[Xen-devel] [linux-3.10 test] 32058: regressions - FAIL

2014-12-03 Thread xen . org

flight 32058 linux-3.10 real [real]
http://www.chiark.greenend.org.uk/~xensrcts/logs/32058/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-xl-qemut-winxpsp3  7 windows-install fail REGR. vs. 26303

Regressions which are regarded as allowable (not blocking):
 test-amd64-amd64-xl-qemuu-debianhvm-amd64 7 debian-hvm-install fail blocked in 26303
 test-amd64-i386-pair 17 guest-migrate/src_host/dst_host fail like 26303
 test-amd64-amd64-xl-winxpsp3  7 windows-install  fail   like 26303

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-libvirt  5 xen-boot fail   never pass
 test-armhf-armhf-xl   5 xen-boot fail   never pass
 test-amd64-amd64-xl-qemuu-win7-amd64 14 guest-stop fail never pass
 test-amd64-i386-xl-qemut-winxpsp3 14 guest-stop fail never pass
 test-amd64-i386-xl-winxpsp3  14 guest-stop   fail   never pass
 test-amd64-amd64-xl-win7-amd64 14 guest-stop   fail never pass
 test-amd64-i386-libvirt   9 guest-start  fail   never pass
 test-amd64-i386-xl-qemut-win7-amd64 14 guest-stop  fail never pass
 test-amd64-i386-xl-winxpsp3-vcpus1 14 guest-stop   fail never pass
 test-amd64-amd64-libvirt  9 guest-start  fail   never pass
 test-amd64-i386-xl-win7-amd64 14 guest-stop   fail  never pass
 test-amd64-amd64-xl-qemut-win7-amd64 14 guest-stop fail never pass
 test-amd64-amd64-xl-pcipt-intel  9 guest-start fail never pass
 test-amd64-i386-xl-qemut-winxpsp3-vcpus1 14 guest-stop fail never pass
 test-amd64-i386-xl-qemuu-winxpsp3-vcpus1 14 guest-stop fail never pass
 test-amd64-amd64-xl-qemuu-winxpsp3 14 guest-stop   fail never pass
 test-amd64-i386-xl-qemuu-win7-amd64 14 guest-stop  fail never pass
 test-amd64-i386-xl-qemuu-winxpsp3 14 guest-stop fail never pass

version targeted for testing:
 linux  252f23ea5987a4730e3399ef1ad5d78efcc786c9
baseline version:
 linux  be67db109090b17b56eb8eb2190cd70700f107aa


774 people touched revisions under test,
not listing them all


jobs:
 build-amd64  pass
 build-armhf  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-armhf-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvops pass
 build-armhf-pvops pass
 build-i386-pvops pass
 build-amd64-rumpuserxen  pass
 build-i386-rumpuserxen   pass
 test-amd64-amd64-xl  pass
 test-armhf-armhf-xl  fail
 test-amd64-i386-xl   pass
 test-amd64-i386-rhel6hvm-amd pass
 test-amd64-i386-qemut-rhel6hvm-amd   pass
 test-amd64-i386-qemuu-rhel6hvm-amd   pass
 test-amd64-amd64-xl-qemut-debianhvm-amd64 pass
 test-amd64-i386-xl-qemut-debianhvm-amd64 pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64 fail
 test-amd64-i386-xl-qemuu-debianhvm-amd64 pass
 test-amd64-i386-freebsd10-amd64  pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64 pass
 test-amd64-i386-xl-qemuu-ovmf-amd64  pass
 test-amd64-amd64-rumpuserxen-amd64   pass
 test-amd64-amd64-xl-qemut-win7-amd64 fail
 test-amd64-i386-xl-qemut-win7-amd64  fail
 test-amd64-amd64-xl-qemuu-win7-amd64 fail
 test-amd64-i386-xl-qemuu-win7-amd64  fail
 test-amd64-amd64-xl-win7-amd64   fail
 test-amd64-i386-xl-win7-amd64 fail
 test-amd64-i386-xl-credit2   pass
 test-amd64-i386-freebsd10-i386   pass
 test-amd64-i386-rumpuserxen-i386 pass
 test-amd64-amd64-xl-pcipt-intel  fail
 test-amd64-i386-rhel6hvm-intel   pass

[Xen-devel] xen/arm: uart interrupts handling

2014-12-03 Thread Vijay Kilari

Hi Tim,

I see that on a UART interrupt, ICR is written to clear all
interrupts except TX, RX and RX timeout. With this, the CPU always finds
TX/RX active and never comes out of the loop.

With the changes below, TX, RX and RTI are cleared before these
interrupts are handled.

Is my observation correct? If so, I wonder how it is working on
platforms that are using pl011. Without this, my CPU just keeps looping here.

  index fba0a55..d21bce3 100644
--- a/xen/drivers/char/pl011.c
+++ b/xen/drivers/char/pl011.c
@@ -63,7 +63,7 @@ static void pl011_interrupt(int irq, void *data,
struct cpu_user_regs *regs)
 {
 do
 {
-pl011_write(uart, ICR, status & ~(TXI|RTI|RXI));
+pl011_write(uart, ICR, status & (TXI|RTI|RXI));

 if ( status & (RTI|RXI) )
 serial_rx_interrupt(port, regs);
@@ -157,7 +157,7 @@ static void pl011_resume(struct serial_port *port)
 {
 BUG(); // XXX
 }
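
The difference between the two ICR writes in the hunk above can be illustrated with a small standalone sketch. The macro names RXI, TXI and RTI come from the patch; the bit positions used here are an assumption based on the PL011 interrupt register layout, and the two helper functions are hypothetical names introduced only for this illustration. The sketch shows which bits each variant would write, and does not settle which behaviour is right for a given platform.

```c
#include <stdint.h>

/* Assumed PL011 interrupt bit positions (not taken from the patch). */
#define RXI (1u << 4)   /* receive interrupt */
#define TXI (1u << 5)   /* transmit interrupt */
#define RTI (1u << 6)   /* receive timeout interrupt */

/* Hypothetical helper: value written to ICR by the existing code,
 * i.e. every pending bit except TX, RX and RX timeout. */
uint32_t icr_value_current(uint32_t status)
{
    return status & ~(TXI | RTI | RXI);
}

/* Hypothetical helper: value written to ICR with the proposed change,
 * i.e. only the pending TX, RX and RX timeout bits. */
uint32_t icr_value_proposed(uint32_t status)
{
    return status & (TXI | RTI | RXI);
}
```

With a pending status of 0x410 (RXI plus one other bit under the assumed layout), the existing code would write 0x400 and leave RXI untouched, while the proposed code would write 0x10.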


Regards
Vijay


Re: [Xen-devel] [Intel-gfx] [Announcement] 2014-Q3 release of XenGT - a Mediated Graphics Passthrough Solution from Intel

2014-12-03 Thread Jike Song


Hi all,

We're pleased to announce a public release to Intel Graphics Virtualization 
Technology (Intel GVT-g, formerly known as XenGT). Intel GVT-g is a complete 
vGPU solution with mediated pass-through, supported today on 4th generation 
Intel Core(TM) processors with Intel Graphics processors. A virtual GPU 
instance is maintained for each VM, with part of performance critical resources 
directly assigned. The capability of running native graphics driver inside a 
VM, without hypervisor intervention in performance critical paths, achieves a 
good balance among performance, feature, and sharing capability. Though we only 
support Xen on Intel Processor Graphics so far, the core logic can be easily 
ported to other hypervisors.


The news of this update:


- kernel update from 3.11.6 to 3.14.1

- We plan to integrate Intel GVT-g as a feature in i915 driver. That 
effort is still under review, not included in this update yet

- Next update will be around early Jan, 2015


This update consists of:

- Windows HVM support with driver version 15.33.3910

- Stability fixes, e.g. a more stable GPU; GPU hangs are now rare

- Hardware Media Acceleration for Decoding/Encoding/Transcoding; VC1, 
H264 and other formats are supported

- Display enhancements, e.g. the DP type is supported for virtual ports

- Display port capability virtualization: with this feature, the dom0 
manager can freely assign virtual DDI ports to a VM without checking 
whether the corresponding physical DDI ports are available



Please refer to the new setup guide, which provides step-by-step details about 
building/configuring/running Intel GVT-g:



https://github.com/01org/XenGT-Preview-kernel/blob/master/XenGT_Setup_Guide.pdf



The new source code is available at the updated github repos:


Linux: https://github.com/01org/XenGT-Preview-kernel.git

Xen: https://github.com/01org/XenGT-Preview-xen.git

Qemu: https://github.com/01org/XenGT-Preview-qemu.git


More information about Intel GVT-g background, architecture, etc can be found 
at:



https://www.usenix.org/conference/atc14/technical-sessions/presentation/tian


http://events.linuxfoundation.org/sites/events/files/slides/XenGT-Xen%20Summit-v7_0.pdf

https://01.org/xen/blogs/srclarkx/2013/graphics-virtualization-xengt


The previous update can be found here:


http://lists.xen.org/archives/html/xen-devel/2014-07/msg03248.html


Appreciate your comments!


--
Thanks,
Jike

On 07/25/2014 04:31 PM, Jike Song wrote:

Hi all,

We're pleased to announce an update to Intel Graphics Virtualization Technology 
(Intel GVT-g, formerly known as XenGT). Intel GVT-g is a complete vGPU solution 
with mediated pass-through, supported today on 4th generation Intel Core(TM) 
processors with Intel Graphics processors. A virtual GPU instance is maintained 
for each VM, with part of performance critical resources directly assigned. The 
capability of running native graphics driver inside a VM, without hypervisor 
intervention in performance critical paths, achieves a good balance among 
performance, feature, and sharing capability. Though we only support Xen on 
Intel Processor Graphics so far, the core logic can be easily ported to other 
hypervisors.

The news of this update:

- Project code name is "XenGT", now official name is Intel Graphics 
Virtualization Technology (Intel GVT-g)
- Currently Intel GVT-g supports Intel Processor Graphics built into 
4th generation Intel Core processors - Haswell platform
- Moving forward, XenGT will change to quarterly release cadence. Next 
update will be around early October, 2014.

This update consists of:

- Stability fixes, e.g. stable DP support
- Display enhancements, e.g. virtual monitor support. Users can define 
a virtual monitor type with customized EDID for virtual machines, not 
necessarily the same as physical monitors.
- Improved support for GPU recovery
- Experimental Windows HVM support. To download the experimental 
version, see setup guide for details
- Experimental Hardware Media Acceleration for decoding.


Please refer to the new setup guide, which provides step-by-step details about 
building/configuring/running Intel GVT-g:


https://github.com/01org/XenGT-Preview-kernel/blob/master/XenGT_Setup_Guide.pdf


The new source code is available at the updated github repos:

Linux: https://github.com/01org/XenGT-Preview-kernel.git
Xen: https://github.com/01org/XenGT-Preview-xen.git
Qemu: https://github.com/01org/XenGT-Preview-qemu.git


More information about Intel GVT-g background, architecture, etc can be found 
at:


https://www.usenix.org/conference/atc14/technical-sessions/presentation/tian

http://events.linuxfoundation.org/sites/events/files/slides/XenGT-Xen%20Summit-v7_0.pdf
https://

Re: [Xen-devel] [v4] libxc: Expose the 1GB pages cpuid flag to guest

2014-12-03 Thread Zhang, Yang Z

Konrad Rzeszutek Wilk wrote on 2014-12-03:
> On Wed, Dec 03, 2014 at 09:38:49AM +0000, Ian Campbell wrote:
>> On Tue, 2014-12-02 at 16:09 -0500, Konrad Rzeszutek Wilk wrote:
>>> On Fri, Nov 28, 2014 at 11:50:43AM +0000, Ian Campbell wrote:
>>>> On Fri, 2014-11-28 at 18:52 +0800, Liang Li wrote:
>>>>> If hardware support the 1GB pages, expose the feature to guest by
>>>>> default. Users don't have to use a 'cpuid= ' option in config file
>>>>> to turn it on.
>>>>>
>>>>> If guest use shadow mode, the 1GB pages feature will be hidden from
>>>>> guest, this is done in the function hvm_cpuid(). So the change is
>>>>> okay for shadow mode case.
>>>>>
>>>>> Signed-off-by: Liang Li 
>>>>> Signed-off-by: Yang Zhang 
>>>>
>>>> FTR although this is strictly speaking a toolstack patch I think the
>>>> main ack required should be from the x86 hypervisor guys...
>>>
>>> Jan acked it.
>>
>> For 4.5?
>
> Probably not.
>>
>> Have you release acked it?
>
> No.
>>
>> This seemed like 4.6 material to me, or at least I've not seen any
>> mention/argument to the contrary.
>
> Correct. 4.6 please.
I think this is more like a bug fix than a feature. See our previous discussion.

>>
>> Ian.
>>
>>>>
>>>>> ---
>>>>>  tools/libxc/xc_cpuid_x86.c | 3 +++
>>>>>  1 file changed, 3 insertions(+)
>>>>> diff --git a/tools/libxc/xc_cpuid_x86.c b/tools/libxc/xc_cpuid_x86.c
>>>>> index a18b1ff..c97f91a 100644
>>>>> --- a/tools/libxc/xc_cpuid_x86.c
>>>>> +++ b/tools/libxc/xc_cpuid_x86.c
>>>>> @@ -109,6 +109,7 @@ static void amd_xc_cpuid_policy(
>>>>>          regs[3] &= (0x0183f3ff | /* features shared with 0x00000001:EDX */
>>>>>                      bitmaskof(X86_FEATURE_NX) |
>>>>>                      bitmaskof(X86_FEATURE_LM) |
>>>>> +                    bitmaskof(X86_FEATURE_PAGE1GB) |
>>>>>                      bitmaskof(X86_FEATURE_SYSCALL) |
>>>>>                      bitmaskof(X86_FEATURE_MP) |
>>>>>                      bitmaskof(X86_FEATURE_MMXEXT) |
>>>>> @@ -192,6 +193,7 @@ static void intel_xc_cpuid_policy(
>>>>>                      bitmaskof(X86_FEATURE_ABM));
>>>>>          regs[3] &= (bitmaskof(X86_FEATURE_NX) |
>>>>>                      bitmaskof(X86_FEATURE_LM) |
>>>>> +                    bitmaskof(X86_FEATURE_PAGE1GB) |
>>>>>                      bitmaskof(X86_FEATURE_SYSCALL) |
>>>>>                      bitmaskof(X86_FEATURE_RDTSCP));
>>>>>          break;
>>>>> @@ -386,6 +388,7 @@ static void xc_cpuid_hvm_policy(
>>>>>              clear_bit(X86_FEATURE_LM, regs[3]);
>>>>>              clear_bit(X86_FEATURE_NX, regs[3]);
>>>>>              clear_bit(X86_FEATURE_PSE36, regs[3]);
>>>>> +            clear_bit(X86_FEATURE_PAGE1GB, regs[3]);
>>>>>          }
>>>>>          break;
 
 
 


Best regards,
Yang
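
The whitelist behaviour this thread revolves around can be sketched in isolation. The snippet below is a minimal illustration, not the libxc code: the FEAT_* macros and both function names are hypothetical, and the bit positions are the architecturally documented ones for CPUID leaf 0x80000001 EDX. It shows that a feature bit absent from the AND mask is hidden from the guest even when the hardware reports it, which is why the patch has to add PAGE1GB to the mask explicitly.

```c
#include <stdint.h>

/* Bit positions in CPUID leaf 0x80000001, register EDX.
 * Macro names are hypothetical; libxc uses bitmaskof(X86_FEATURE_*). */
#define FEAT_SYSCALL  (1u << 11)
#define FEAT_NX       (1u << 20)
#define FEAT_PAGE1GB  (1u << 26)
#define FEAT_RDTSCP   (1u << 27)
#define FEAT_LM       (1u << 29)

/* Whitelist without PAGE1GB: the bit is dropped regardless of hardware. */
uint32_t filter_edx_without_1gb(uint32_t hw_edx)
{
    return hw_edx & (FEAT_NX | FEAT_LM | FEAT_SYSCALL | FEAT_RDTSCP);
}

/* Whitelist with PAGE1GB: the bit passes through only if hardware sets it. */
uint32_t filter_edx_with_1gb(uint32_t hw_edx)
{
    return hw_edx & (FEAT_NX | FEAT_LM | FEAT_PAGE1GB |
                     FEAT_SYSCALL | FEAT_RDTSCP);
}
```

Because the mask is ANDed with the hardware value, adding a bit to the whitelist never advertises a feature the host CPU lacks.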


--- Begin Message ---
On Mon, Jan 13, 2014 at 11:51:28AM +0000, Jan Beulich wrote:
> >>> On 13.01.14 at 12:38, Ian Campbell  wrote:
> > On Mon, 2014-01-13 at 11:30 +0000, Jan Beulich wrote:
> >> In fact I can't see where this would be forced off: xc_cpuid_x86.c
> >> only does so in the PV case, and all hvm_pse1gb_supported() is
> >> that the CPU supports it and the domain uses HAP.
> >
> > Took me a while to spot it too:
> > static void intel_xc_cpuid_policy(
> > [...]
> >     case 0x80000001: {
> >         int is_64bit = hypervisor_is_64bit(xch) && is_pae;
> >
> >         /* Only a few features are advertised in Intel's 0x80000001. */
> >         regs[2] &= (is_64bit ? bitmaskof(X86_FEATURE_LAHF_LM) : 0) |
> >                    bitmaskof(X86_FEATURE_ABM);
> >         regs[3] &= ((is_pae ? bitmaskof(X86_FEATURE_NX) : 0) |
> >                     (is_64bit ? bitmaskof(X86_FEATURE_LM) : 0) |
> >                     (is_64bit ? bitmaskof(X86_FEATURE_SYSCALL) : 0) |
> >                     (is_64bit ? bitmaskof(X86_FEATURE_RDTSCP) : 0));
> >         break;
> >     }
> >
> >
> > Which masks anything which is not explicitly mentioned. (PAGE1GB is in
> > regs[3], I think).
>
> Ah, okay. The funs of white listing on HVM vs black listing on PV
> again.
>
> > The AMD version is more permissive:
> >
> >         regs[3] &= (0x0183f3ff | /* features shared with 0x00000001:EDX */
> >                     (is_pae ? bitmaskof(X86_FEATURE_NX) : 0) |
> >                     (is_64bit ? bitmaskof(X86_FEATURE_LM) : 0) |
> >                     bitmaskof(X86_FEATURE_SYSCALL) |
> >                     bitmaskof(X86_FEATURE_MP) |
> >                     bitmaskof(X86_FEATURE_MMXEXT) |
> >                     bitmaskof(X86_FEATURE_FFXSR) |
> >

Re: [Xen-devel] xl pci-attach silently fails the first time

2014-12-03 Thread Konrad Rzeszutek Wilk

On Mon, Dec 01, 2014 at 12:01:51PM -0500, Konrad Rzeszutek Wilk wrote:
> On Mon, Dec 01, 2014 at 02:32:44PM +0100, Olaf Hering wrote:
> > On Mon, Dec 01, Olaf Hering wrote:
> > 
> > > # xl pci-assignable-add 01:10.0
> > > # xl pci-assignable-list
> > > 0000:01:10.0
> > > # xl create -f domU.cfg
> > > # xl console domU
> > > ## lspci gives just emulated PCI devices
> > 
> > ttyS0:Rescue:~ # lspci 
> > 00:00.0 Host bridge: Intel Corporation 440FX - 82441FX PMC [Natoma] (rev 02)
> > 00:01.0 ISA bridge: Intel Corporation 82371SB PIIX3 ISA [Natoma/Triton II]
> > 00:01.1 IDE interface: Intel Corporation 82371SB PIIX3 IDE [Natoma/Triton II]
> > 00:01.3 Bridge: Intel Corporation 82371AB/EB/MB PIIX4 ACPI (rev 03)
> > 00:02.0 Unassigned class [ff80]: XenSource, Inc. Xen Platform Device (rev 01)
> > 00:03.0 VGA compatible controller: Cirrus Logic GD 5446
> > 00:04.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8100/8101L/8139 PCI Fast Ethernet Adapter (rev ff)
> > 
> > > ## detach from domU console
> > > # xl pci-attach domU 0000:01:10.0
> > > # xl pci-list domU
> > > Vdev Device
> > > 04.0 0000:01:10.0
> > > 
> > > # xl console domU
> > > ## lspci gives just emulated PCI devices
> > 
> > ttyS0:Rescue:~ # lspci 
> > 00:00.0 Host bridge: Intel Corporation 440FX - 82441FX PMC [Natoma] (rev 02)
> > 00:01.0 ISA bridge: Intel Corporation 82371SB PIIX3 ISA [Natoma/Triton II]
> > 00:01.1 IDE interface: Intel Corporation 82371SB PIIX3 IDE [Natoma/Triton II]
> > 00:01.3 Bridge: Intel Corporation 82371AB/EB/MB PIIX4 ACPI (rev 03)
> > 00:02.0 Unassigned class [ff80]: XenSource, Inc. Xen Platform Device (rev 01)
> > 00:03.0 VGA compatible controller: Cirrus Logic GD 5446
> > 00:04.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8100/8101L/8139 PCI Fast Ethernet Adapter (rev 01)
> > 
> > > ## detach from domU console
> > > # xl pci-detach domU 0000:01:10.0
> > Now lspci shows that the emulated network card is gone.
> 
> > ttyS0:Rescue:~ # lspci 
> > 00:00.0 Host bridge: Intel Corporation 440FX - 82441FX PMC [Natoma] (rev 02)
> > 00:01.0 ISA bridge: Intel Corporation 82371SB PIIX3 ISA [Natoma/Triton II]
> > 00:01.1 IDE interface: Intel Corporation 82371SB PIIX3 IDE [Natoma/Triton II]
> > 00:01.3 Bridge: Intel Corporation 82371AB/EB/MB PIIX4 ACPI (rev 03)
> > 00:02.0 Unassigned class [ff80]: XenSource, Inc. Xen Platform Device (rev 01)
> > 00:03.0 VGA compatible controller: Cirrus Logic GD 5446
> > 
> > > # xl pci-attach domU 0000:01:10.0
> > > # xl pci-list domU
> > > Vdev Device
> > > 04.0 0000:01:10.0
> > > # xl console domU
> > > ## lspci shows now also the assigned host device
> > 
> > 
> > So the actual bug is that the very first time after pci-attach the guests
> > "00:04.0" PCI device is (most likely) replaced with the host PCI device. Just
> > the guest does not notice that "00:04.0" was actually already gone after unplug.
> 
> That is odd - I see any device 'hot-plugged' being added at 00:05 and further.

I have to apologize. The reason it works for me is because the emulated
device gets unplugged quite early:

# lspci
00:00.0 Host bridge: Intel Corporation 440FX - 82441FX PMC [Natoma] (rev 02)
00:01.0 ISA bridge: Intel Corporation 82371SB PIIX3 ISA [Natoma/Triton II]
00:01.1 IDE interface: Intel Corporation 82371SB PIIX3 IDE [Natoma/Triton II]
00:01.3 Bridge: Intel Corporation 82371AB/EB/MB PIIX4 ACPI (rev 03)
00:02.0 Class ff80: XenSource, Inc. Xen Platform Device (rev 01)
00:03.0 VGA compatible controller: Cirrus Logic GD 5446

[and here I run 'xl pci-attach USB 00:1a.0']

# [   30.609802] hpet1: lost 1589 rtc interrupts
[   30.672030] hpet1: lost 2200 rtc interrupts
[   30.672030] hpet1: lost 2201 rtc interrupts
[   30.672030] hpet1: lost 2200 rtc interrupts
[   30.899341] pci 0000:00:04.0: [8086:1c2d] type 00 class 0x0c0320

And the other test I have been running is when the guest is booted
with a PCI device (and then unplugged):

# lspci
00:00.0 Host bridge: Intel Corporation 440FX - 82441FX PMC [Natoma] (rev 02)
00:01.0 ISA bridge: Intel Corporation 82371SB PIIX3 ISA [Natoma/Triton II]
00:01.1 IDE interface: Intel Corporation 82371SB PIIX3 IDE [Natoma/Triton II]
00:01.3 Bridge: Intel Corporation 82371AB/EB/MB PIIX4 ACPI (rev 01)
00:02.0 VGA compatible controller: Cirrus Logic GD 5446
00:03.0 Class ff80: XenSource, Inc. Xen Platform Device (rev 01)
00:05.0 USB Controller: Intel Corporation Cougar Point USB Enhanced Host Controller #2 (rev 04)

But oddly enough when I do 'pci-detach' and then 'pci-attach' I get:
lspci
00:00.0 Host bridge: Intel Corporation 440FX - 82441FX PMC [Natoma] (rev 02)
00:01.0 ISA bridge: Intel Corporation 82371SB PIIX3 ISA [Natoma/Triton II]
00:01.1 IDE interface: Intel Corporation 82371SB PIIX3 IDE [Natoma/Triton II]
00:01.3 Bridge: Intel Corporation 82371AB/EB/MB PIIX4 ACPI (rev 01)
00:02.0 VGA compatible controller: Cirrus Logic GD 5446
00:03.0 Class ff80: XenSource, Inc. Xen Platform Device (rev 01)
00:04.0 USB Con

Re: [Xen-devel] [PATCH v4 4/9] xen: introduce XEN_DOMCTL_devour

2014-12-03 Thread Julien Grall


Hi Vitaly,

On 03/12/2014 17:16, Vitaly Kuznetsov wrote:

> New operation sets the 'recipient' domain which will recieve all

s/recieve/receive/

> memory pages from a particular domain and kills the original domain.
>
> Signed-off-by: Vitaly Kuznetsov 
> ---
> @@ -1764,13 +1765,32 @@ void free_domheap_pages(struct page_info *pg, unsigned int order)

[..]

> +else
> +{
> +    mfn = page_to_mfn(pg);
> +    gmfn = mfn_to_gmfn(d, mfn);
> +
> +    page_set_owner(pg, NULL);
> +    if ( assign_pages(d->recipient, pg, order, 0) )
> +        /* assign_pages reports the error by itself */
> +        goto out;
> +
> +    if ( guest_physmap_add_page(d->recipient, gmfn, mfn, order) )


On ARM, mfn_to_gmfn will always return the mfn. This would result in adding 
a 1:1 mapping in the recipient domain.

But ... only DOM0 has its memory mapped 1:1. So this code may blow up 
the P2M of the recipient domain.

I'm not an x86 expert, but this may also happen when the recipient 
domain is using translated page mode (i.e. HVM/PVHM).


Regards,

--
Julien Grall


Re: [Xen-devel] [PATCH v5 6/9] PCI: Expose pci_load_saved_state for public consumption.

2014-12-03 Thread Bjorn Helgaas

On Wed, Dec 3, 2014 at 2:40 PM, Konrad Rzeszutek Wilk
 wrote:
> We have the pci_load_and_free_saved_state, and pci_store_saved_state
> but are missing the functionality to just load the state
> multiple times in the PCI device without having to free/save
> the state.
>
> This patch makes it possible to use this function.
>
> CC: Bjorn Helgaas 
> Signed-off-by: Konrad Rzeszutek Wilk 

Acked-by: Bjorn Helgaas 

I assume you'll merge this whole series through your tree.  Let me
know if you want me to do anything else.

> ---
>  drivers/pci/pci.c   | 5 +++--
>  include/linux/pci.h | 2 ++
>  2 files changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index 625a4ac..f00a9d6 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -1142,8 +1142,8 @@ EXPORT_SYMBOL_GPL(pci_store_saved_state);
>   * @dev: PCI device that we're dealing with
>   * @state: Saved state returned from pci_store_saved_state()
>   */
> -static int pci_load_saved_state(struct pci_dev *dev,
> -   struct pci_saved_state *state)
> +int pci_load_saved_state(struct pci_dev *dev,
> +struct pci_saved_state *state)
>  {
> struct pci_cap_saved_data *cap;
>
> @@ -1171,6 +1171,7 @@ static int pci_load_saved_state(struct pci_dev *dev,
> dev->state_saved = true;
> return 0;
>  }
> +EXPORT_SYMBOL_GPL(pci_load_saved_state);
>
>  /**
>   * pci_load_and_free_saved_state - Reload the save state pointed to by state,
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index 5be8db4..08088cb1 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -1003,6 +1003,8 @@ void __iomem __must_check *pci_platform_rom(struct pci_dev *pdev, size_t *size);
>  int pci_save_state(struct pci_dev *dev);
>  void pci_restore_state(struct pci_dev *dev);
>  struct pci_saved_state *pci_store_saved_state(struct pci_dev *dev);
> +int pci_load_saved_state(struct pci_dev *dev,
> +struct pci_saved_state *state);
>  int pci_load_and_free_saved_state(struct pci_dev *dev,
>   struct pci_saved_state **state);
>  struct pci_cap_saved_state *pci_find_saved_cap(struct pci_dev *dev, char cap);
> --
> 1.9.3
>


[Xen-devel] [qemu-mainline test] 32029: tolerable FAIL - PUSHED

2014-12-03 Thread xen . org

flight 32029 qemu-mainline real [real]
http://www.chiark.greenend.org.uk/~xensrcts/logs/32029/

Failures :-/ but no regressions.

Regressions which are regarded as allowable (not blocking):
 test-amd64-i386-pair 17 guest-migrate/src_host/dst_host fail like 31947

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-libvirt  9 guest-start  fail   never pass
 test-armhf-armhf-xl  10 migrate-support-check fail   never pass
 test-amd64-i386-libvirt   9 guest-start  fail   never pass
 test-amd64-amd64-xl-win7-amd64 14 guest-stop   fail never pass
 test-amd64-i386-xl-win7-amd64 14 guest-stop   fail  never pass
 test-amd64-amd64-xl-winxpsp3 14 guest-stop   fail   never pass
 test-amd64-i386-xl-qemut-winxpsp3-vcpus1 14 guest-stop fail never pass
 test-amd64-amd64-xl-qemut-winxpsp3 14 guest-stop   fail never pass
 test-amd64-i386-xl-qemuu-win7-amd64 14 guest-stop  fail never pass
 test-amd64-amd64-xl-qemut-win7-amd64 14 guest-stop fail never pass
 test-amd64-i386-xl-qemut-winxpsp3 14 guest-stop fail never pass
 test-amd64-i386-xl-qemut-win7-amd64 14 guest-stop  fail never pass
 test-amd64-i386-xl-qemuu-winxpsp3-vcpus1 14 guest-stop fail never pass
 test-amd64-amd64-libvirt  9 guest-start  fail   never pass
 test-amd64-amd64-xl-pcipt-intel  9 guest-start fail never pass
 test-amd64-amd64-xl-qemuu-winxpsp3 14 guest-stop   fail never pass
 test-amd64-amd64-xl-qemuu-win7-amd64 14 guest-stop fail never pass
 test-amd64-i386-xl-winxpsp3-vcpus1 14 guest-stop   fail never pass
 test-amd64-i386-xl-qemuu-winxpsp3 14 guest-stop fail never pass
 test-amd64-i386-xl-winxpsp3  14 guest-stop   fail   never pass

version targeted for testing:
 qemuu  0d7954c288e91b8a457f15a0a8e8244facf6594b
baseline version:
 qemuu  db12451decf7dfe0f083564183e135f2095228b9


People who touched revisions under test:
  Gonglei 
  Peter Maydell 


jobs:
 build-amd64  pass
 build-armhf  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-armhf-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvops pass
 build-armhf-pvops pass
 build-i386-pvops pass
 test-amd64-amd64-xl  pass
 test-armhf-armhf-xl  pass
 test-amd64-i386-xl   pass
 test-amd64-i386-rhel6hvm-amd pass
 test-amd64-i386-qemut-rhel6hvm-amd   pass
 test-amd64-i386-qemuu-rhel6hvm-amd   pass
 test-amd64-amd64-xl-qemut-debianhvm-amd64 pass
 test-amd64-i386-xl-qemut-debianhvm-amd64 pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64 pass
 test-amd64-i386-xl-qemuu-debianhvm-amd64 pass
 test-amd64-i386-freebsd10-amd64  pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64 pass
 test-amd64-i386-xl-qemuu-ovmf-amd64  pass
 test-amd64-amd64-xl-qemut-win7-amd64 fail
 test-amd64-i386-xl-qemut-win7-amd64  fail
 test-amd64-amd64-xl-qemuu-win7-amd64 fail
 test-amd64-i386-xl-qemuu-win7-amd64  fail
 test-amd64-amd64-xl-win7-amd64   fail
 test-amd64-i386-xl-win7-amd64 fail
 test-amd64-i386-xl-credit2   pass
 test-amd64-i386-freebsd10-i386   pass
 test-amd64-amd64-xl-pcipt-intel  fail
 test-amd64-i386-rhel6hvm-intel   pass
 test-amd64-i386-qemut-rhel6hvm-intel pass
 test-amd64-i386-qemuu-rhel6hvm-intel pass
 test-amd64-amd64-libvirt fail
 test-armhf-armhf-libvirt fail
 test-amd64-i386-libvirt  fail
 test-amd64-i386-xl-multivcpu pass
 test

Re: [Xen-devel] [PATCH for-4.5] libxl_set_memory_target: only remove videoram from absolute targets

2014-12-03 Thread Don Slutz


On 12/03/14 13:20, Stefano Stabellini wrote:

> If the new target is relative to the current target, do not remove
> videoram again: it has already been removed from the current target.
>
> Signed-off-by: Stefano Stabellini 
>
> diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
> index de23fec..2aa83bd 100644
> --- a/tools/libxl/libxl.c
> +++ b/tools/libxl/libxl.c
> @@ -4741,13 +4741,17 @@ retry_transaction:
>          goto out;
>      }
>  
> +    videoram_s = libxl__xs_read(gc, t, libxl__sprintf(gc,
> +                "%s/memory/videoram", dompath));
> +    videoram = videoram_s ? atoi(videoram_s) : 0;
> +
>      if (relative) {
>          if (target_memkb < 0 && abs(target_memkb) > current_target_memkb)
>              new_target_memkb = 0;
>          else
>              new_target_memkb = current_target_memkb + target_memkb;
>      } else
> -        new_target_memkb = target_memkb;
> +        new_target_memkb = target_memkb - videoram;
>      if (new_target_memkb > memorykb) {
>          LIBXL__LOG(ctx, LIBXL__LOG_ERROR,
>                  "memory_dynamic_max must be less than or equal to"
> @@ -4763,9 +4767,6 @@ retry_transaction:
>          abort_transaction = 1;
>          goto out;
>      }
> -    videoram_s = libxl__xs_read(gc, t, libxl__sprintf(gc,
> -                "%s/memory/videoram", dompath));
> -    videoram = videoram_s ? atoi(videoram_s) : 0;
>  
>      if (enforce) {
>          memorykb = new_target_memkb;


Since new_target_memkb is now adjusted before this line, you
need to change this to:

 memorykb = new_target_memkb + videoram;

   -Don Slutz


@@ -4780,7 +4781,6 @@ retry_transaction:
  }
  }
  
-new_target_memkb -= videoram;

  rc = xc_domain_set_pod_target(ctx->xch, domid,
  new_target_memkb / 4, NULL, NULL, NULL);
  if (rc != 0) {

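For illustration only, the accounting discussed in this thread can be modelled as a small standalone C function. This is a sketch of the arithmetic, not the libxl code: `struct mem_target` and `compute_target_kb` are hypothetical names made up for the example, and the xenstore reads, the check against the static maximum, and the transaction retry are all left out. It assumes, as the patch description states, that the current target already has videoram removed, and it applies the correction suggested in the reply (add videoram back when enforcing the maximum).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result type for this sketch; not a libxl structure. */
struct mem_target {
    int64_t new_target_kb; /* balloon target, videoram excluded */
    int64_t memory_kb;     /* maximum to enforce, videoram included */
};

/*
 * Model of the target arithmetic only. The real function also reads
 * xenstore, validates against the static maximum and retries the
 * transaction; none of that is shown here.
 */
static struct mem_target compute_target_kb(int64_t current_target_kb,
                                           int64_t target_kb,
                                           int64_t videoram_kb,
                                           bool relative)
{
    struct mem_target r;

    if (relative) {
        /* current_target_kb already excludes videoram: do not subtract again. */
        if (target_kb < 0 && -target_kb > current_target_kb)
            r.new_target_kb = 0;
        else
            r.new_target_kb = current_target_kb + target_kb;
    } else {
        /* Absolute targets include videoram, so remove it exactly once. */
        r.new_target_kb = target_kb - videoram_kb;
    }

    /* The enforced maximum has to put videoram back on top of the target. */
    r.memory_kb = r.new_target_kb + videoram_kb;
    return r;
}
```

With made-up numbers, an absolute target of 2048 kB and 16 kB of videoram gives a balloon target of 2032 kB and an enforced maximum of 2048 kB, while a relative change of -200 kB on a current target of 1000 kB gives 800 kB without touching videoram a second time.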



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

[Xen-devel] [libvirt test] 32064: regressions - FAIL

2014-12-03 Thread xen . org

flight 32064 libvirt real [real]
http://www.chiark.greenend.org.uk/~xensrcts/logs/32064/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-i386-libvirt    5 libvirt-build fail REGR. vs. 32005
 build-amd64-libvirt   5 libvirt-build fail REGR. vs. 32005
 build-armhf-libvirt   5 libvirt-build fail REGR. vs. 32005

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-libvirt  1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt   1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt  1 build-check(1)   blocked  n/a

version targeted for testing:
 libvirt  86a15a258283850ec5563e2a66bf389aa2dfa318
baseline version:
 libvirt  ff018e686a8a412255bc34d3dc558a1bcf74fac5


People who touched revisions under test:
  Daniel Hansel 
  Dmitry Guryanov 
  John Ferlan 
  Ján Tomko 
  Laine Stump 
  Martin Kletzander 
  Michal Privoznik 
  Pavel Hrdina 
  Wang Rui 


jobs:
 build-amd64  pass
 build-armhf  pass
 build-i386   pass
 build-amd64-libvirt  fail
 build-armhf-libvirt  fail
 build-i386-libvirt   fail
 build-amd64-pvops pass
 build-armhf-pvops pass
 build-i386-pvops pass
 test-amd64-amd64-libvirt blocked
 test-armhf-armhf-libvirt blocked
 test-amd64-i386-libvirt  blocked



sg-report-flight on osstest.cam.xci-test.com
logs: /home/xc_osstest/logs
images: /home/xc_osstest/images

Logs, config files, etc. are available at
http://www.chiark.greenend.org.uk/~xensrcts/logs

Test harness code can be found at
http://xenbits.xensource.com/gitweb?p=osstest.git;a=summary


Not pushing.

(No revision log; it would be 333 lines long.)

[Xen-devel] [PATCH v5 9/9] xen/pciback: Implement PCI reset slot or bus with 'do_flr' SysFS attribute

2014-12-03 Thread Konrad Rzeszutek Wilk

The life-cycle of a PCI device in Xen pciback is complex
and is constrained by the PCI generic locking mechanism.

It starts with the device being bound to us - for which
we do a device function reset (done via SysFS,
so the PCI lock is held).

If the device is unbound from us - we also do a function
reset (also done via SysFS, so the PCI lock is held).

If the device is un-assigned from a guest - we do a function
reset (no PCI lock).

All on the individual PCI function level (so bus:device:function).

Unfortunately a function reset is not adequate for certain
PCIe devices. The reset for an individual PCI function "means
device must support FLR (PCIe or AF), PM reset on D3hot->D0
transition, device specific reset, or be a singleton device on a bus
for a secondary bus reset.  FLR does not have widespread support,
PM reset is not very reliable, and bus topology is dictated by the
system and device design.  We need to provide a means for a user to
induce a bus reset in cases where the existing mechanisms are not
available or not reliable." (Adam Williamson, 'vfio-pci: PCI hot reset
interface' commit 8b27ee60bfd6bbb84d2df28fa706c5c5081066ca).

As such, to do a slot or a bus reset we need another mechanism.
This is not exposed in SysFS as there is no good way of exposing
a bus topology there.

This is due to the complexity - we MUST know that the different
functions off a PCIe device are not in use by other drivers, or
if they are in use (say one of them is assigned to a guest
and the other is idle) - it is still OK to reset the slot
(assuming both of them are owned by Xen pciback).

This patch does that by doing a slot or bus reset (if
slot reset is not supported) if all of the functions of a PCIe
device belong to Xen PCIback. We do not care if the device is
in-use as we depend on the toolstack to be aware of this -
however if it is we will WARN the user.

Due to the complexity with the PCI lock we cannot do
the reset when a device is bound ('echo $BDF > bind')
or when unbound ('echo $BDF > unbind') as the pci_[slot|bus]_reset
also take the same lock, resulting in a dead-lock.

Putting the reset function in a workqueue or thread
won't work either - as we have to do the reset function
outside the 'unbind' context (it holds the PCI lock).
But once you 'unbind' a device the device is no longer
under the ownership of Xen pciback and the pci_set_drvdata
has been reset so we cannot use a thread for this.

Instead of doing all this complex dance, we depend on the toolstack
doing the right thing. As such implement the 'do_flr' SysFS attribute
which 'xl' uses when a device is detached or attached from/to a guest.
It bypasses the need to worry about the PCI lock.

To not inadvertently do a bus reset that would affect devices that
are in use by other drivers (other than Xen pciback) prior
to the reset we check that all of the devices under the bridge
are owned by Xen pciback. If they are not we do not do
the bus (or slot) reset.

We also warn the user if the device is in use - but still
continue with the reset. This should not happen as the toolstack
also does the check.

Signed-off-by: Konrad Rzeszutek Wilk 
---
 Documentation/ABI/testing/sysfs-driver-pciback |  12 +++
 drivers/xen/xen-pciback/pci_stub.c | 124 ++---
 2 files changed, 125 insertions(+), 11 deletions(-)

diff --git a/Documentation/ABI/testing/sysfs-driver-pciback b/Documentation/ABI/testing/sysfs-driver-pciback
index 6a733bf..2d4e32f 100644
--- a/Documentation/ABI/testing/sysfs-driver-pciback
+++ b/Documentation/ABI/testing/sysfs-driver-pciback
@@ -11,3 +11,15 @@ Description:
 #echo 00:19.0-E0:2:FF > /sys/bus/pci/drivers/pciback/quirks
 will allow the guest to read and write to the configuration
 register 0x0E.
+
+
+What:   /sys/bus/pci/drivers/pciback/do_flr
+Date:   December 2014
+KernelVersion:  3.19
+Contact:xen-de...@lists.xenproject.org
+Description:
+An option to slot or bus reset an PCI device owned by
+Xen PCI backend. Writing a string of :BB:DD.F will cause
+the driver to perform an slot or bus reset if the device
+supports. It also checks to make sure that all of the devices
+under the bridge are owned by Xen PCI backend.
diff --git a/drivers/xen/xen-pciback/pci_stub.c b/drivers/xen/xen-pciback/pci_stub.c
index cc3cbb4..f830bf4 100644
--- a/drivers/xen/xen-pciback/pci_stub.c
+++ b/drivers/xen/xen-pciback/pci_stub.c
@@ -100,14 +100,9 @@ static void pcistub_device_release(struct kref *kref)
 
xen_unregister_device_domain_owner(dev);
 
-   /* Call the reset function which does not take lock as this
-* is called from "unbind" which takes a device_lock mutex.
-*/
-   __pci_reset_function_locked(dev);
+   /* Reset is done by the toolstack by using 'do_flr' on the SysFS. */
if (pci_load_and_free_saved_state(dev, &dev_data->pci_saved_state))
dev_info(&dev-

[Xen-devel] [PATCH v5] Fixes for PCI backend for 3.19.

2014-12-03 Thread Konrad Rzeszutek Wilk


Since v4 (http://lists.xen.org/archives/html/xen-devel/2014-11/msg02130.html):
 - Per David's review altered one of the patches.
v3 (https://lkml.org/lkml/2014/7/8/533):
 - Epic discussion.

These patches fix some issues with PCI back and also add proper
bus/slot reset.


 Documentation/ABI/testing/sysfs-driver-pciback |  12 ++
 drivers/pci/pci.c  |   5 +-
 drivers/xen/xen-pciback/passthrough.c  |  14 +-
 drivers/xen/xen-pciback/pci_stub.c | 236 +
 drivers/xen/xen-pciback/pciback.h  |   7 +-
 drivers/xen/xen-pciback/vpci.c |  14 +-
 drivers/xen/xen-pciback/xenbus.c   |   4 +-
 include/linux/device.h |   5 +
 include/linux/pci.h|   2 +
 9 files changed, 254 insertions(+), 45 deletions(-)


Jan Beulich (1):
  xen-pciback: drop SR-IOV VFs when PF driver unloads

Konrad Rzeszutek Wilk (8):
  xen/pciback: Don't deadlock when unbinding.
  driver core: Provide an wrapper around the mutex to do lockdep warnings
  xen/pciback: Include the domain id if removing the device whilst still in use
  xen/pciback: Print out the domain owning the device.
  xen/pciback: Remove tons of dereferences
  PCI: Expose pci_load_saved_state for public consumption.
  xen/pciback: Restore configuration space when detaching from a guest.
  xen/pciback: Implement PCI reset slot or bus with 'do_flr' SysFS attribute
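
To make the policy described for the 'do_flr' patch (9/9) a little more concrete, here is a user-space sketch of the decision it describes: refuse the slot/bus reset unless every function under the bridge is owned by Xen pciback, and warn (but still reset) if one of them is in use by a guest. This is only a model under stated assumptions, not the driver code: `struct fn_model`, `enum reset_decision` and `decide_bus_reset` are hypothetical names, and the BDF strings in any usage are made up.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of one PCI function for this sketch. */
struct fn_model {
    const char *bdf;          /* e.g. "0000:04:00.0" (illustrative only) */
    bool owned_by_pciback;    /* bound to Xen pciback? */
    bool in_use_by_guest;     /* currently assigned to a guest? */
};

enum reset_decision {
    RESET_REFUSED,            /* a sibling belongs to another driver */
    RESET_OK,                 /* all siblings owned, none in use */
    RESET_OK_WITH_WARNING     /* all siblings owned, at least one in use */
};

/*
 * Walk all functions that would be hit by the slot/bus reset and decide
 * whether the reset may go ahead. An empty list is treated as RESET_OK.
 */
static enum reset_decision decide_bus_reset(const struct fn_model *fns,
                                            size_t n)
{
    bool in_use = false;

    for (size_t i = 0; i < n; i++) {
        if (!fns[i].owned_by_pciback)
            return RESET_REFUSED;
        if (fns[i].in_use_by_guest)
            in_use = true;
    }
    return in_use ? RESET_OK_WITH_WARNING : RESET_OK;
}
```

The point of separating "refused" from "warned" is that ownership is something the driver can verify itself, whereas whether a guest is still using the device is left to the toolstack, as the patch description says.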


[Xen-devel] [PATCH v5 7/9] xen/pciback: Restore configuration space when detaching from a guest.

2014-12-03 Thread Konrad Rzeszutek Wilk

The commit "xen/pciback: Don't deadlock when unbinding." was using
the version of pci_reset_function which would lock the device lock.
That is no good as we can dead-lock. As such we swapped to using
the lock-less version and requiring that the callers
of 'pcistub_put_pci_dev' take the device lock. And as such
this bug got exposed.

Using the lock-less version is OK, except that we tried to
use 'pci_restore_state' after the lock-less
__pci_reset_function_locked - which won't work as 'state_saved'
is set to false. Said 'state_saved' is a toggle boolean that
is to be used by the sequence of a) pci_save_state/pci_restore_state
or b) pci_load_and_free_saved_state/pci_restore_state. We don't
want to use a) as the guest might have messed up the PCI
configuration space and we want it to revert to the state
when the PCI device was bound to us. Therefore we pick
b) to restore the configuration space.

We restore from our 'golden' version of PCI configuration space when a:
 - Device is unbound from pciback
 - Device is detached from a guest.

Reported-by:  Sander Eikelenboom 
Signed-off-by: Konrad Rzeszutek Wilk 
---
v2: Always FLR reset
---
 drivers/xen/xen-pciback/pci_stub.c | 16 ++--
 1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/drivers/xen/xen-pciback/pci_stub.c b/drivers/xen/xen-pciback/pci_stub.c
index 843a2ba..8580e53 100644
--- a/drivers/xen/xen-pciback/pci_stub.c
+++ b/drivers/xen/xen-pciback/pci_stub.c
@@ -105,7 +105,7 @@ static void pcistub_device_release(struct kref *kref)
 */
__pci_reset_function_locked(dev);
if (pci_load_and_free_saved_state(dev, &dev_data->pci_saved_state))
-   dev_dbg(&dev->dev, "Could not reload PCI state\n");
+   dev_info(&dev->dev, "Could not reload PCI state\n");
else
pci_restore_state(dev);
 
@@ -257,6 +257,8 @@ void pcistub_put_pci_dev(struct pci_dev *dev)
 {
struct pcistub_device *psdev, *found_psdev = NULL;
unsigned long flags;
+   struct xen_pcibk_dev_data *dev_data;
+   int ret;
 
spin_lock_irqsave(&pcistub_devices_lock, flags);
 
@@ -280,8 +282,18 @@ void pcistub_put_pci_dev(struct pci_dev *dev)
 */
device_lock_assert(&dev->dev);
__pci_reset_function_locked(dev);
-   pci_restore_state(dev);
 
+   dev_data = pci_get_drvdata(dev);
+   ret = pci_load_saved_state(dev, dev_data->pci_saved_state);
+   if (!ret) {
+   /*
+* The usual sequence is pci_save_state & pci_restore_state
+* but the guest might have messed the configuration space up.
+* Use the initial version (when device was bound to us).
+*/
+   pci_restore_state(dev);
+   } else
+   dev_info(&dev->dev, "Could not reload PCI state\n");
/* This disables the device. */
xen_pcibk_reset_device(dev);
 
-- 
1.9.3
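
As an aside for readers, the 'state_saved' toggle described in the commit message above can be modelled in a few lines of user-space C. This is a deliberately simplified sketch, not the kernel implementation: `struct cfg_model`, `model_load_saved_state` and `model_restore_state` are hypothetical stand-ins for the PCI core behaviour, and config space is reduced to four words. It only shows why a restore without a preceding load (or save) does nothing, and why loading the 'golden' copy can be repeated.

```c
#include <stdbool.h>
#include <string.h>

#define CFG_WORDS 4

/* Hypothetical stand-in for the relevant bits of a PCI device. */
struct cfg_model {
    unsigned int config[CFG_WORDS]; /* live configuration space */
    unsigned int saved[CFG_WORDS];  /* per-device save buffer */
    bool state_saved;               /* armed by save/load, cleared by restore */
};

/* Model of loading a stored ('golden') snapshot without freeing it. */
static void model_load_saved_state(struct cfg_model *d,
                                   const unsigned int golden[CFG_WORDS])
{
    memcpy(d->saved, golden, sizeof(d->saved));
    d->state_saved = true;
}

/* Model of restore: a no-op unless a save or load armed the flag. */
static void model_restore_state(struct cfg_model *d)
{
    if (!d->state_saved)
        return;
    memcpy(d->config, d->saved, sizeof(d->config));
    d->state_saved = false; /* one-shot: consumed by this restore */
}
```

In this model, calling only the restore after a reset leaves whatever the guest wrote in place, which matches the problem the patch describes; loading the golden copy first and then restoring brings the device back to the state it had when it was bound.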


[Xen-devel] [PATCH v5 3/9] xen/pciback: Include the domain id if removing the device whilst still in use

2014-12-03 Thread Konrad Rzeszutek Wilk

Clean up the function a bit - also include the id of the
domain that is using the device.

Signed-off-by: Konrad Rzeszutek Wilk 
Reviewed-by: David Vrabel 
---
 drivers/xen/xen-pciback/pci_stub.c | 10 ++
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/drivers/xen/xen-pciback/pci_stub.c b/drivers/xen/xen-pciback/pci_stub.c
index 8b77089..e5ff09d 100644
--- a/drivers/xen/xen-pciback/pci_stub.c
+++ b/drivers/xen/xen-pciback/pci_stub.c
@@ -553,12 +553,14 @@ static void pcistub_remove(struct pci_dev *dev)
spin_unlock_irqrestore(&pcistub_devices_lock, flags);
 
if (found_psdev) {
-   dev_dbg(&dev->dev, "found device to remove - in use? %p\n",
-   found_psdev->pdev);
+   dev_dbg(&dev->dev, "found device to remove %s\n",
+   found_psdev->pdev ? "- in-use" : "");
 
if (found_psdev->pdev) {
-   pr_warn("** removing device %s while still in-use! **\n",
-  pci_name(found_psdev->dev));
+   int domid = xen_find_device_domain_owner(dev);
+
+   pr_warn("** removing device %s while still in-use by domain %d! **\n",
+  pci_name(found_psdev->dev), domid);
pr_warn("** driver domain may still access this device's i/o resources!\n");
pr_warn("** shutdown driver domain before binding device\n");
pr_warn("** to other drivers or domains\n");
-- 
1.9.3


[Xen-devel] [PATCH v5 6/9] PCI: Expose pci_load_saved_state for public consumption.

2014-12-03 Thread Konrad Rzeszutek Wilk

We have pci_load_and_free_saved_state and pci_store_saved_state,
but are missing the functionality to just load the state
multiple times in the PCI device without having to free/save
the state.

This patch makes that possible by exporting pci_load_saved_state.

CC: Bjorn Helgaas 
Signed-off-by: Konrad Rzeszutek Wilk 
---
 drivers/pci/pci.c   | 5 +++--
 include/linux/pci.h | 2 ++
 2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 625a4ac..f00a9d6 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -1142,8 +1142,8 @@ EXPORT_SYMBOL_GPL(pci_store_saved_state);
  * @dev: PCI device that we're dealing with
  * @state: Saved state returned from pci_store_saved_state()
  */
-static int pci_load_saved_state(struct pci_dev *dev,
-   struct pci_saved_state *state)
+int pci_load_saved_state(struct pci_dev *dev,
+struct pci_saved_state *state)
 {
struct pci_cap_saved_data *cap;
 
@@ -1171,6 +1171,7 @@ static int pci_load_saved_state(struct pci_dev *dev,
dev->state_saved = true;
return 0;
 }
+EXPORT_SYMBOL_GPL(pci_load_saved_state);
 
 /**
  * pci_load_and_free_saved_state - Reload the save state pointed to by state,
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 5be8db4..08088cb1 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -1003,6 +1003,8 @@ void __iomem __must_check *pci_platform_rom(struct pci_dev *pdev, size_t *size);
 int pci_save_state(struct pci_dev *dev);
 void pci_restore_state(struct pci_dev *dev);
 struct pci_saved_state *pci_store_saved_state(struct pci_dev *dev);
+int pci_load_saved_state(struct pci_dev *dev,
+struct pci_saved_state *state);
 int pci_load_and_free_saved_state(struct pci_dev *dev,
  struct pci_saved_state **state);
 struct pci_cap_saved_state *pci_find_saved_cap(struct pci_dev *dev, char cap);
-- 
1.9.3


[Xen-devel] [PATCH v5 8/9] xen-pciback: drop SR-IOV VFs when PF driver unloads

2014-12-03 Thread Konrad Rzeszutek Wilk

From: Jan Beulich 

When a PF driver unloads, it may find it necessary to leave the VFs
around simply because of pciback having marked them as assigned to a
guest. Utilize a suitable notification to let go of the VFs, thus
allowing the PF to go back into the state it was before its driver
loaded (which in particular allows the driver to be loaded again with
it being able to create the VFs anew, but which also allows to then
pass through the PF instead of the VFs).

Don't do this however for any VFs currently in active use by a guest.

Signed-off-by: Jan Beulich 
[v2: Removed the switch statement, moved it about]
[v3: Redid it a bit differently]
Signed-off-by: Konrad Rzeszutek Wilk 
---
 drivers/xen/xen-pciback/pci_stub.c | 54 ++
 1 file changed, 54 insertions(+)

diff --git a/drivers/xen/xen-pciback/pci_stub.c b/drivers/xen/xen-pciback/pci_stub.c
index 8580e53..cc3cbb4 100644
--- a/drivers/xen/xen-pciback/pci_stub.c
+++ b/drivers/xen/xen-pciback/pci_stub.c
@@ -1518,6 +1518,53 @@ parse_error:
 fs_initcall(pcistub_init);
 #endif
 
+#ifdef CONFIG_PCI_IOV
+static struct pcistub_device *find_vfs(const struct pci_dev *pdev)
+{
+   struct pcistub_device *psdev = NULL;
+   unsigned long flags;
+   bool found = false;
+
+   spin_lock_irqsave(&pcistub_devices_lock, flags);
+   list_for_each_entry(psdev, &pcistub_devices, dev_list) {
+   if (!psdev->pdev && psdev->dev != pdev
+   && pci_physfn(psdev->dev) == pdev) {
+   found = true;
+   break;
+   }
+   }
+   spin_unlock_irqrestore(&pcistub_devices_lock, flags);
+   if (found)
+   return psdev;
+   return NULL;
+}
+
+static int pci_stub_notifier(struct notifier_block *nb,
+unsigned long action, void *data)
+{
+   struct device *dev = data;
+   const struct pci_dev *pdev = to_pci_dev(dev);
+
+   if (action != BUS_NOTIFY_UNBIND_DRIVER)
+   return NOTIFY_DONE;
+
+   if (!pdev->is_physfn)
+   return NOTIFY_DONE;
+
+   for (;;) {
+   struct pcistub_device *psdev = find_vfs(pdev);
+   if (!psdev)
+   break;
+   device_release_driver(&psdev->dev->dev);
+   }
+   return NOTIFY_DONE;
+}
+
+static struct notifier_block pci_stub_nb = {
+   .notifier_call = pci_stub_notifier,
+};
+#endif
+
 static int __init xen_pcibk_init(void)
 {
int err;
@@ -1539,12 +1586,19 @@ static int __init xen_pcibk_init(void)
err = xen_pcibk_xenbus_register();
if (err)
pcistub_exit();
+#ifdef CONFIG_PCI_IOV
+   else
+   bus_register_notifier(&pci_bus_type, &pci_stub_nb);
+#endif
 
return err;
 }
 
 static void __exit xen_pcibk_cleanup(void)
 {
+#ifdef CONFIG_PCI_IOV
+   bus_unregister_notifier(&pci_bus_type, &pci_stub_nb);
+#endif
xen_pcibk_xenbus_unregister();
pcistub_exit();
 }
-- 
1.9.3


[Xen-devel] [PATCH v5 5/9] xen/pciback: Remove tons of dereferences

2014-12-03 Thread Konrad Rzeszutek Wilk

A little cleanup. No functional difference.

Reviewed-by: Boris Ostrovsky 
Signed-off-by: Konrad Rzeszutek Wilk 
---
 drivers/xen/xen-pciback/pci_stub.c | 20 +++-
 1 file changed, 11 insertions(+), 9 deletions(-)

diff --git a/drivers/xen/xen-pciback/pci_stub.c b/drivers/xen/xen-pciback/pci_stub.c
index e5ff09d..843a2ba 100644
--- a/drivers/xen/xen-pciback/pci_stub.c
+++ b/drivers/xen/xen-pciback/pci_stub.c
@@ -631,10 +631,12 @@ static pci_ers_result_t common_process(struct pcistub_device *psdev,
 {
pci_ers_result_t res = result;
struct xen_pcie_aer_op *aer_op;
+   struct xen_pcibk_device *pdev = psdev->pdev;
+   struct xen_pci_sharedinfo *sh_info = pdev->sh_info;
int ret;
 
/*with PV AER drivers*/
-   aer_op = &(psdev->pdev->sh_info->aer_op);
+   aer_op = &(sh_info->aer_op);
aer_op->cmd = aer_cmd ;
/*useful for error_detected callback*/
aer_op->err = state;
@@ -655,36 +657,36 @@ static pci_ers_result_t common_process(struct pcistub_device *psdev,
* this flag to judge whether we need to check pci-front give aer
* service ack signal
*/
-   set_bit(_PCIB_op_pending, (unsigned long *)&psdev->pdev->flags);
+   set_bit(_PCIB_op_pending, (unsigned long *)&pdev->flags);
 
/*It is possible that a pcifront conf_read_write ops request invokes
* the callback which cause the spurious execution of wake_up.
* Yet it is harmless and better than a spinlock here
*/
set_bit(_XEN_PCIB_active,
-   (unsigned long *)&psdev->pdev->sh_info->flags);
+   (unsigned long *)&sh_info->flags);
wmb();
-   notify_remote_via_irq(psdev->pdev->evtchn_irq);
+   notify_remote_via_irq(pdev->evtchn_irq);
 
ret = wait_event_timeout(xen_pcibk_aer_wait_queue,
 !(test_bit(_XEN_PCIB_active, (unsigned long *)
-&psdev->pdev->sh_info->flags)), 300*HZ);
+&sh_info->flags)), 300*HZ);
 
if (!ret) {
if (test_bit(_XEN_PCIB_active,
-   (unsigned long *)&psdev->pdev->sh_info->flags)) {
+   (unsigned long *)&sh_info->flags)) {
dev_err(&psdev->dev->dev,
"pcifront aer process not responding!\n");
clear_bit(_XEN_PCIB_active,
- (unsigned long *)&psdev->pdev->sh_info->flags);
+ (unsigned long *)&sh_info->flags);
aer_op->err = PCI_ERS_RESULT_NONE;
return res;
}
}
-   clear_bit(_PCIB_op_pending, (unsigned long *)&psdev->pdev->flags);
+   clear_bit(_PCIB_op_pending, (unsigned long *)&pdev->flags);
 
if (test_bit(_XEN_PCIF_active,
-   (unsigned long *)&psdev->pdev->sh_info->flags)) {
+   (unsigned long *)&sh_info->flags)) {
dev_dbg(&psdev->dev->dev,
"schedule pci_conf service in " DRV_NAME "\n");
xen_pcibk_test_and_schedule_op(psdev->pdev);
-- 
1.9.3


[Xen-devel] [PATCH v5 4/9] xen/pciback: Print out the domain owning the device.

2014-12-03 Thread Konrad Rzeszutek Wilk

We had been printing it only if the driver was built with
debug enabled. But this information is useful in the field
for troubleshooting.

Signed-off-by: Konrad Rzeszutek Wilk 
Reviewed-by: David Vrabel 
---
 drivers/xen/xen-pciback/xenbus.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/xen/xen-pciback/xenbus.c b/drivers/xen/xen-pciback/xenbus.c
index 499..fe17c80 100644
--- a/drivers/xen/xen-pciback/xenbus.c
+++ b/drivers/xen/xen-pciback/xenbus.c
@@ -247,7 +247,7 @@ static int xen_pcibk_export_device(struct xen_pcibk_device *pdev,
if (err)
goto out;
 
-   dev_dbg(&dev->dev, "registering for %d\n", pdev->xdev->otherend_id);
+   dev_info(&dev->dev, "registering for %d\n", pdev->xdev->otherend_id);
if (xen_register_device_domain_owner(dev,
 pdev->xdev->otherend_id) != 0) {
dev_err(&dev->dev, "Stealing ownership from dom%d.\n",
-- 
1.9.3


[Xen-devel] [PATCH v5 1/9] xen/pciback: Don't deadlock when unbinding.

2014-12-03 Thread Konrad Rzeszutek Wilk

As commit 0a9fd0152929db372ff61b0d6c280fdd34ae8bdb
'xen/pciback: Document the entry points for 'pcistub_put_pci_dev''
explained, there are four entry points in this function.
Two of them are when the user fiddles in the SysFS to
unbind a device which might be in use by a guest or not.

Both 'unbind' states will cause a deadlock as the PCI lock has
already been taken, which pci_reset_function then tries to take.

We can simplify this by requiring that all callers of
pcistub_put_pci_dev MUST hold the device lock. And then
we can just call the lockless version, __pci_reset_function_locked.

To make it even simpler we will modify xen_pcibk_release_pci_dev
to qualify whether it should take a lock or not - as it ends
up calling pcistub_put_pci_dev and needs to hold the lock.

Reviewed-by: Boris Ostrovsky 
Signed-off-by: Konrad Rzeszutek Wilk 
---
 drivers/xen/xen-pciback/passthrough.c | 14 +++---
 drivers/xen/xen-pciback/pci_stub.c| 12 ++--
 drivers/xen/xen-pciback/pciback.h |  7 ---
 drivers/xen/xen-pciback/vpci.c| 14 +++---
 drivers/xen/xen-pciback/xenbus.c  |  2 +-
 5 files changed, 33 insertions(+), 16 deletions(-)

diff --git a/drivers/xen/xen-pciback/passthrough.c b/drivers/xen/xen-pciback/passthrough.c
index 828dddc..f16a30e 100644
--- a/drivers/xen/xen-pciback/passthrough.c
+++ b/drivers/xen/xen-pciback/passthrough.c
@@ -69,7 +69,7 @@ static int __xen_pcibk_add_pci_dev(struct xen_pcibk_device *pdev,
 }
 
 static void __xen_pcibk_release_pci_dev(struct xen_pcibk_device *pdev,
-   struct pci_dev *dev)
+   struct pci_dev *dev, bool lock)
 {
struct passthrough_dev_data *dev_data = pdev->pci_dev_data;
struct pci_dev_entry *dev_entry, *t;
@@ -87,8 +87,13 @@ static void __xen_pcibk_release_pci_dev(struct xen_pcibk_device *pdev,
 
mutex_unlock(&dev_data->lock);
 
-   if (found_dev)
+   if (found_dev) {
+   if (lock)
+   device_lock(&found_dev->dev);
pcistub_put_pci_dev(found_dev);
+   if (lock)
+   device_unlock(&found_dev->dev);
+   }
 }
 
 static int __xen_pcibk_init_devices(struct xen_pcibk_device *pdev)
@@ -156,8 +161,11 @@ static void __xen_pcibk_release_devices(struct xen_pcibk_device *pdev)
struct pci_dev_entry *dev_entry, *t;
 
list_for_each_entry_safe(dev_entry, t, &dev_data->dev_list, list) {
+   struct pci_dev *dev = dev_entry->dev;
list_del(&dev_entry->list);
-   pcistub_put_pci_dev(dev_entry->dev);
+   device_lock(&dev->dev);
+   pcistub_put_pci_dev(dev);
+   device_unlock(&dev->dev);
kfree(dev_entry);
}
 
diff --git a/drivers/xen/xen-pciback/pci_stub.c b/drivers/xen/xen-pciback/pci_stub.c
index 017069a..9cbe1a3 100644
--- a/drivers/xen/xen-pciback/pci_stub.c
+++ b/drivers/xen/xen-pciback/pci_stub.c
@@ -250,6 +250,8 @@ struct pci_dev *pcistub_get_pci_dev(struct xen_pcibk_device *pdev,
  *  - 'echo BDF > unbind' with a guest still using it. See pcistub_remove
  *
  *  As such we have to be careful.
+ *
+ *  To make this easier, the caller has to hold the device lock.
  */
 void pcistub_put_pci_dev(struct pci_dev *dev)
 {
@@ -276,11 +278,8 @@ void pcistub_put_pci_dev(struct pci_dev *dev)
/* Cleanup our device
 * (so it's ready for the next domain)
 */
-
-   /* This is OK - we are running from workqueue context
-* and want to inhibit the user from fiddling with 'reset'
-*/
-   pci_reset_function(dev);
+   lockdep_assert_held(&dev->dev.mutex);
+   __pci_reset_function_locked(dev);
pci_restore_state(dev);
 
/* This disables the device. */
@@ -567,7 +566,8 @@ static void pcistub_remove(struct pci_dev *dev)
/* N.B. This ends up calling pcistub_put_pci_dev which ends up
 * doing the FLR. */
xen_pcibk_release_pci_dev(found_psdev->pdev,
-   found_psdev->dev);
+   found_psdev->dev,
+   false /* caller holds the lock. */);
}
 
spin_lock_irqsave(&pcistub_devices_lock, flags);
diff --git a/drivers/xen/xen-pciback/pciback.h b/drivers/xen/xen-pciback/pciback.h
index f72af87..58e38d5 100644
--- a/drivers/xen/xen-pciback/pciback.h
+++ b/drivers/xen/xen-pciback/pciback.h
@@ -99,7 +99,8 @@ struct xen_pcibk_backend {
unsigned int *domain, unsigned int *bus,
unsigned int *devfn);
int (*publish)(struct xen_pcibk_device *pdev, publish_pci_root_cb cb);
-   void (*release)(struct xen_pcibk_device *pdev, struct pci_dev *dev);
+   void (*release)(struct xen_pcibk_device *pdev, struct pci_dev *dev,
+

[Xen-devel] [PATCH v5 2/9] driver core: Provide an wrapper around the mutex to do lockdep warnings

2014-12-03 Thread Konrad Rzeszutek Wilk

Provide device_lock_assert() so that drivers which want to double check
that their functions are indeed holding the device lock do not have to
open-code the lockdep check.

Signed-off-by: Konrad Rzeszutek Wilk 
Suggested-by: David Vrabel 
Acked-by: Greg Kroah-Hartman 
---
 drivers/xen/xen-pciback/pci_stub.c | 2 +-
 include/linux/device.h | 5 +
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/xen/xen-pciback/pci_stub.c b/drivers/xen/xen-pciback/pci_stub.c
index 9cbe1a3..8b77089 100644
--- a/drivers/xen/xen-pciback/pci_stub.c
+++ b/drivers/xen/xen-pciback/pci_stub.c
@@ -278,7 +278,7 @@ void pcistub_put_pci_dev(struct pci_dev *dev)
/* Cleanup our device
 * (so it's ready for the next domain)
 */
-   lockdep_assert_held(&dev->dev.mutex);
+   device_lock_assert(&dev->dev);
__pci_reset_function_locked(dev);
pci_restore_state(dev);
 
diff --git a/include/linux/device.h b/include/linux/device.h
index ce1f2160..41d6a75 100644
--- a/include/linux/device.h
+++ b/include/linux/device.h
@@ -911,6 +911,11 @@ static inline void device_unlock(struct device *dev)
mutex_unlock(&dev->mutex);
 }
 
+static inline void device_lock_assert(struct device *dev)
+{
+   lockdep_assert_held(&dev->mutex);
+}
+
 void driver_init(void);
 
 /*
-- 
1.9.3
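
For readers following patches 1/9 and 2/9 together, the "caller must hold the device lock" convention can be sketched in user space as below. This is a toy model, not kernel code: `struct locked_dev` and the `model_*` functions are hypothetical, the lock is a plain flag rather than a mutex, and `assert()` stands in for the lockdep check (which in the kernel only fires when lock debugging is enabled). The `lock` parameter mirrors the boolean that patch 1/9 adds to the release path.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical device model: a flag instead of a real mutex. */
struct locked_dev {
    bool locked;
    int resets;   /* counts how often the "reset" ran */
};

static void model_device_lock(struct locked_dev *d)
{
    assert(!d->locked);   /* taking it twice would be the deadlock */
    d->locked = true;
}

static void model_device_unlock(struct locked_dev *d)
{
    assert(d->locked);
    d->locked = false;
}

/* Stand-in for a lock assertion helper such as device_lock_assert(). */
static void model_device_lock_assert(const struct locked_dev *d)
{
    assert(d->locked);
}

/* Stand-in for pcistub_put_pci_dev(): the caller must hold the lock. */
static void model_put_dev(struct locked_dev *d)
{
    model_device_lock_assert(d);
    d->resets++;          /* the lock-less reset would happen here */
}

/* Stand-in for the release path: 'lock' says whether to take the lock. */
static void model_release_dev(struct locked_dev *d, bool lock)
{
    if (lock)
        model_device_lock(d);
    model_put_dev(d);
    if (lock)
        model_device_unlock(d);
}
```

A caller that already holds the lock (the 'unbind' case) passes false, and every other caller passes true; either way the callee sees the lock held exactly once.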


Re: [Xen-devel] [PATCH for-4.5] systemd: use pkg-config to determine systemd library availability

2014-12-03 Thread M A Young




On Wed, 3 Dec 2014, Konrad Rzeszutek Wilk wrote:

> On Wed, Dec 03, 2014 at 11:55:22AM +0100, Olaf Hering wrote:
> > On Wed, Dec 03, Ian Campbell wrote:
> >
> > > On Wed, 2014-12-03 at 11:49 +0100, Olaf Hering wrote:
> > > > On Wed, Dec 03, Ian Campbell wrote:
> > > >
> > > > > Ah I didn't know about the sd_listen_fds thing, so I think that what we
> > > > > need then is to use pkg-config first to determine if systemd-daemon is
> > > > > present at all, and then check for specific symbols we require using the
> > > > > pkg-config supplied CFLAGS and LDFLAGS rather than assuming
> > > > > -lsystemd-daemon.
> > > >
> > > > Correction: sd_listen_fds is available since at least v1.
> > > >  git describe --contains abbbea81a8fa70badb7a05e518d6b07c360fc09d
> > > >  v1~390
> > >
> > > In that case I don't think we realistically need to check for it?
> >
> > Yes. Anything before 208 is stale. At least I dont have anything older
> > around for testing.
>
> And for Fedora it is Fedora 21 or later. F20 has 208 so we are OK there.

Fedora 21 is going through its final release candidates now so it will
stick to Xen 4.4. The first Fedora release that might use this code (if I
decide to use it - at the moment I have reservations) would be Fedora 22.


Michael Young

Re: [Xen-devel] [PATCH] tools/hotplug: update systemd dependency to use service instead of socket

2014-12-03 Thread M A Young




On Wed, 3 Dec 2014, Konrad Rzeszutek Wilk wrote:

> On Tue, Dec 02, 2014 at 06:51:50PM +, M A Young wrote:
> > On Tue, 2 Dec 2014, Konrad Rzeszutek Wilk wrote:
> >
> > > On Tue, Dec 02, 2014 at 03:44:55PM +, Ian Campbell wrote:
> > > > On Tue, 2014-12-02 at 16:39 +0100, Olaf Hering wrote:
> > > > > Since commit 4542ae340d75bd6319e3fcd94e6c9336e210aeef ("tools/hotplug:
> > > > > systemd xenstored dependencies") all service files use the .socket unit
> > > > > as startup dependency. While this happens to work for boot it fails for
> > > > > shutdown because a .socket does not seem to enforce ordering. When
> > > > > xendomains.service runs during shutdown then systemd will stop
> > > > > xenstored.service at the same time.
> > > > >
> > > > > Change all "xenstored.socket" to "xenstored.service" to let systemd know
> > > > > that xenstored has to be shutdown after everything else.
> > > > >
> > > > > Reported-by: Mark Pryor 
> > > > > Signed-off-by: Olaf Hering 
> > > > > Cc: Ian Jackson 
> > > > > Cc: Stefano Stabellini 
> > > >
> > > > Acked-by: Ian Campbell 
> > > >
> > > > > Cc: Wei Liu 
> > > > > ---
> > > > >
> > > > > This should go into 4.5 to fix xendomains.service.
> > > >
> > > > CCing Konrad...
> > >
> > > CC-ing Michael.
> > >
> > > Michael, since Fedora is using systemd, did you observe this bug as well?
> > > (I think I did, but I might have blamed it on my wacky setup).
> >
> > I only tried the xen systemd on xen 4.5-rc2 and didn't have a lot of success
> > even when I reverted to Fedora's systemd for xen, so I can't really comment.
> > I did have issues with xen systemd which I shall report if they are still
> > there in -rc3.
>
> It seems that the issue I am having is:
>
> ELinux: security_context_to_sid($XENSTORED_MOUNT_CTX) failed for (dev tmpfs, type tmpfs) er
> Dec 03 11:46:07 laptop.dumpdata.com systemd[1]: var-lib-xenstored.mount mount process exited, code=exited status=32
> Dec 03 11:46:07 laptop.dumpdata.com systemd[1]: Failed to mount mount xenstore file system.
>
> Which looks like so:
>
> [root@laptop system]# more var-lib-xenstored.mount
> [Unit]
> Description=mount xenstore file system
> Requires=proc-xen.mount
> After=proc-xen.mount
> ConditionPathExists=/proc/xen/capabilities
> RefuseManualStop=true
>
> [Mount]
> Environment=XENSTORED_MOUNT_CTX=none
> EnvironmentFile=-/etc/sysconfig/xenstored
> What=xenstore
> Where=/var/lib/xenstored
> Type=tmpfs
> Options=mode=755,context="$XENSTORED_MOUNT_CTX"

Yes, that was on my probable bug list, as context="none" isn't a valid
mount option (on Fedora at least), presumably because context has to be
followed by a valid selinux context.


Michael Young

Re: [Xen-devel] [PATCH for-4.5] systemd: use pkg-config to determine systemd library availability

2014-12-03 Thread Konrad Rzeszutek Wilk

On Wed, Dec 03, 2014 at 11:55:22AM +0100, Olaf Hering wrote:
> On Wed, Dec 03, Ian Campbell wrote:
> 
> > On Wed, 2014-12-03 at 11:49 +0100, Olaf Hering wrote:
> > > On Wed, Dec 03, Ian Campbell wrote:
> > > 
> > > > Ah I didn't know about the sd_listen_fds thing, so I think that what we
> > > > need then is to use pkg-config first to determine if systemd-daemon is
> > > > present at all, and then check for specific symbols we require using the
> > > > pkg-config supplied CFLAGS and LDFLAGS rather than assuming
> > > > -lsystemd-daemon.
> > > 
> > > Correction: sd_listen_fds is available since at least v1.
> > >  git describe --contains abbbea81a8fa70badb7a05e518d6b07c360fc09d
> > >  v1~390
> > 
> > In that case I don't think we realistically need to check for it?
> 
> Yes. Anything before 208 is stale. At least I don't have anything older
> around for testing.

And for Fedora it is Fedora 21 or later. F20 has 208 so we are OK there.

> 
> Olaf


[Xen-devel] A few EFI code questions

2014-12-03 Thread Daniel Kiper

Hey,

1) Why does the EFI code have so many functions (e.g. efi_start(),
   efi_arch_edd(), ...) with local variables declared static?
   Some of them also have regular local variables. I do not know
   why it was decided that some must be static and others not.
   It is a bit confusing. As far as I can see, there is only one
   place which must have a static local (place_string()). The
   others look to me like an attempt to save stack space, but I
   do not think we need that. According to the UEFI spec there will
   be "128 KiB or more of available stack space" when the system
   runs in boot services mode. That is a lot of space. So I think
   we can safely convert most of the static locals to normal local
   variables. Am I right?

2) I am going to add EDID support to the EFI code. Should it be
   x86-specific code or common code? As far as I can see, EDID is
   defined as part of GOP, so I think the EDID code should be
   placed in xen/common/efi/boot.c.

3) Should we not rename xen/arch/*/efi/efi-boot.h to
   xen/arch/*/efi/efi-boot.c? efi-boot.h contains more
   code than definitions, declarations and short static
   functions. So I think it is more a regular *.c file
   than a header file.
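
For question 1, the difference between a static local and an ordinary
(automatic) local can be sketched in plain C. This is only an illustration:
the function names are hypothetical and nothing here is taken from the Xen
EFI code.

```c
/*
 * Hypothetical example, not Xen code.
 *
 * A static local is stored outside the stack, is initialised once and
 * keeps its value between calls. It costs no stack space, but the
 * function is no longer reentrant.
 */
unsigned int count_calls_static(void)
{
    static unsigned int calls; /* zero-initialised, persists across calls */

    return ++calls;
}

/*
 * An automatic local lives on the stack (or in a register) and is
 * initialised again on every call.
 */
unsigned int count_calls_auto(void)
{
    unsigned int calls = 0;

    return ++calls;
}
```

Only code that relies on the value surviving between calls needs the static
form; a local that is merely large can become automatic whenever the stack
has room for it.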

Daniel


Re: [Xen-devel] [PATCH for-4.5] libxl_set_memory_target: only remove videoram from absolute targets

2014-12-03 Thread Konrad Rzeszutek Wilk

On Wed, Dec 03, 2014 at 06:20:41PM +, Stefano Stabellini wrote:
> If the new target is relative to the current target, do not remove
> videoram again: it has already been removed from the current target.

Please explain:
 - Is this a regression?
 - How often does it occur?
 - Is it fatal?
 - Are there work-arounds?

Thanks!
> 
> Signed-off-by: Stefano Stabellini 
> 
> diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
> index de23fec..2aa83bd 100644
> --- a/tools/libxl/libxl.c
> +++ b/tools/libxl/libxl.c
> @@ -4741,13 +4741,17 @@ retry_transaction:
>  goto out;
>  }
>  
> +videoram_s = libxl__xs_read(gc, t, libxl__sprintf(gc,
> +"%s/memory/videoram", dompath));
> +videoram = videoram_s ? atoi(videoram_s) : 0;
> +
>  if (relative) {
>  if (target_memkb < 0 && abs(target_memkb) > current_target_memkb)
>  new_target_memkb = 0;
>  else
>  new_target_memkb = current_target_memkb + target_memkb;
>  } else
> -new_target_memkb = target_memkb;
> +new_target_memkb = target_memkb - videoram;
>  if (new_target_memkb > memorykb) {
>  LIBXL__LOG(ctx, LIBXL__LOG_ERROR,
>  "memory_dynamic_max must be less than or equal to"
> @@ -4763,9 +4767,6 @@ retry_transaction:
>  abort_transaction = 1;
>  goto out;
>  }
> -videoram_s = libxl__xs_read(gc, t, libxl__sprintf(gc,
> -"%s/memory/videoram", dompath));
> -videoram = videoram_s ? atoi(videoram_s) : 0;
>  
>  if (enforce) {
>  memorykb = new_target_memkb;
> @@ -4780,7 +4781,6 @@ retry_transaction:
>  }
>  }
>  
> -new_target_memkb -= videoram;
>  rc = xc_domain_set_pod_target(ctx->xch, domid,
>  new_target_memkb / 4, NULL, NULL, NULL);
>  if (rc != 0) {


Re: [Xen-devel] [PATCHv1] xen: increase default number of PIRQs for hardware domains

2014-12-03 Thread Konrad Rzeszutek Wilk

On Wed, Dec 03, 2014 at 04:04:20PM +, David Vrabel wrote:
> The default limit for the number of PIRQs for hardware domains (dom0)
> is not sufficient for some (x86) systems.
> 
> Since the pirq structures are individually and dynamically allocated,
> the limit for hardware domains may be increased to the number of
> possible IRQs.

Why not also expand the number for the guest?
> 
> The extra_guest_irqs command line option now only allows changes to
> the domU value.  Any argument for dom0 is ignored.
> 
> Signed-off-by: David Vrabel 
> ---
>  docs/misc/xen-command-line.markdown |   11 ---
>  xen/common/domain.c |7 +--
>  2 files changed, 5 insertions(+), 13 deletions(-)
> 
> diff --git a/docs/misc/xen-command-line.markdown b/docs/misc/xen-command-line.markdown
> index 0866df2..d352031 100644
> --- a/docs/misc/xen-command-line.markdown
> +++ b/docs/misc/xen-command-line.markdown
> @@ -594,15 +594,12 @@ except for debugging purposes.
>  Force or disable use of EFI runtime services.
>  
>  ### extra\_guest\_irqs
> -> `= [][,]`
> +> `= []`
>  
> -> Default: `32,256`
> +> Default: `32`
>  
> -Change the number of PIRQs available for guests.  The optional first number is
> -common for all domUs, while the optional second number (preceded by a comma)
> -is for dom0.  Changing the setting for domU has no impact on dom0 and vice
> -versa.  For example to change dom0 without changing domU, use
> -`extra_guest_irqs=,512`
> +Change the number of PIRQs available for guests. This limit does not
> +apply to hardware domains (dom0).
>  
>  ### flask\_enabled
>  > `= `
> diff --git a/xen/common/domain.c b/xen/common/domain.c
> index 4a62c1d..a88d829 100644
> --- a/xen/common/domain.c
> +++ b/xen/common/domain.c
> @@ -231,14 +231,11 @@ static int late_hwdom_init(struct domain *d)
>  #endif
>  }
>  
> -static unsigned int __read_mostly extra_dom0_irqs = 256;
>  static unsigned int __read_mostly extra_domU_irqs = 32;
>  static void __init parse_extra_guest_irqs(const char *s)
>  {
>  if ( isdigit(*s) )
>  extra_domU_irqs = simple_strtoul(s, &s, 0);
> -if ( *s == ',' && isdigit(*++s) )
> -extra_dom0_irqs = simple_strtoul(s, &s, 0);
>  }
>  custom_param("extra_guest_irqs", parse_extra_guest_irqs);
>  
> @@ -324,10 +321,8 @@ struct domain *domain_create(
>  atomic_inc(&d->pause_count);
>  
>  if ( !is_hardware_domain(d) )
> -d->nr_pirqs = nr_static_irqs + extra_domU_irqs;
> +d->nr_pirqs = min(nr_static_irqs + extra_domU_irqs, nr_irqs);
>  else
> -d->nr_pirqs = nr_static_irqs + extra_dom0_irqs;
> -if ( d->nr_pirqs > nr_irqs )
>  d->nr_pirqs = nr_irqs;
>  
>  radix_tree_init(&d->pirq_tree);
> -- 
> 1.7.10.4
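
The nr_pirqs selection in the last hunk above can be modelled as a small
stand-alone function. This is a sketch for illustration only: pick_nr_pirqs()
and min_uint() are hypothetical names, and the domain is reduced to a single
flag instead of Xen's struct domain.

```c
/* Hypothetical stand-alone model of the nr_pirqs selection in the patch. */
static unsigned int min_uint(unsigned int a, unsigned int b)
{
    return a < b ? a : b;
}

unsigned int pick_nr_pirqs(int is_hardware_domain,
                           unsigned int nr_static_irqs,
                           unsigned int extra_domU_irqs,
                           unsigned int nr_irqs)
{
    /* domU: static IRQs plus the extra_guest_irqs value, capped at nr_irqs. */
    if (!is_hardware_domain)
        return min_uint(nr_static_irqs + extra_domU_irqs, nr_irqs);

    /* Hardware domain (dom0): always the number of possible IRQs. */
    return nr_irqs;
}
```

With 16 static IRQs, the default of 32 extra IRQs and 1024 possible IRQs, a
domU would get 48 PIRQs while the hardware domain would get 1024.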


Re: [Xen-devel] [PATCH for-xen-4.5 1/3] tools/hotplug: distclean target should remove files generated by configure

2014-12-03 Thread Konrad Rzeszutek Wilk

On Wed, Dec 03, 2014 at 04:53:53PM +0100, Daniel Kiper wrote:
> On Tue, Dec 02, 2014 at 01:36:20PM -0500, Konrad Rzeszutek Wilk wrote:
> > On Tue, Dec 02, 2014 at 04:16:28PM +0100, Daniel Kiper wrote:
> > > Signed-off-by: Daniel Kiper 
> >
> > The usage scenario in which I can see this being useful (and
> > I've tripped over this) is when you rebuild a new version
> > from the same repo. As in, this affects developers, but
> > not end-users and not distros. But perhaps I am missing
> > one scenario?
> >
> > As such I would lean towards deferring this (and the other
> > two) to Xen 4.6.
> 
> As far as I know, the Debian build system sometimes complains if make
> distclean does not leave the build tree in a distclean state (read "state
> before configure" != "state after distclean"). It means that from the
> distros' point of view we should apply this patch. However, the
> other two are not required and we can defer them to Xen 4.6.

Cc-ing Axel and Debian Xen Team.
> 
> Daniel


Re: [Xen-devel] [PATCH] tools/hotplug: update systemd dependency to use service instead of socket

2014-12-03 Thread Konrad Rzeszutek Wilk

On Tue, Dec 02, 2014 at 06:51:50PM +, M A Young wrote:
> On Tue, 2 Dec 2014, Konrad Rzeszutek Wilk wrote:
> 
> >On Tue, Dec 02, 2014 at 03:44:55PM +, Ian Campbell wrote:
> >>On Tue, 2014-12-02 at 16:39 +0100, Olaf Hering wrote:
> >>>Since commit 4542ae340d75bd6319e3fcd94e6c9336e210aeef ("tools/hotplug:
> >>>systemd xenstored dependencies") all service files use the .socket unit
> >>>as startup dependency. While this happens to work for boot it fails for
> >>>shutdown because a .socket does not seem to enforce ordering. When
> >>>xendomains.service runs during shutdown then systemd will stop
> >>>xenstored.service at the same time.
> >>>
> >>>Change all "xenstored.socket" to "xenstored.service" to let systemd know
> >>>that xenstored has to be shutdown after everything else.
> >>>
> >>>Reported-by: Mark Pryor 
> >>>Signed-off-by: Olaf Hering 
> >>>Cc: Ian Jackson 
> >>>Cc: Stefano Stabellini 
> >>
> >>Acked-by: Ian Campbell 
> >>
> >>>Cc: Wei Liu 
> >>>---
> >>>
> >>>This should go into 4.5 to fix xendomains.service.
> >>
> >>CCing Konrad...
> >
> >CC-ing Michael.
> >
> >Michael, since Fedora is using systemd, did you observe this bug as well?
> >(I think I did, but I might have blamed it on my wacky setup).
> 
> I only tried the xen systemd on xen 4.5-rc2 and didn't have a lot of success
> even when I reverted to Fedora's systemd for xen, so I can't really comment.
> I did have issues with xen systemd which I shall report if they are still
> there in -rc3.

It seems that the issue I am having is:

SELinux: security_context_to_sid($XENSTORED_MOUNT_CTX) failed for (dev tmpfs, type tmpfs) er
Dec 03 11:46:07 laptop.dumpdata.com systemd[1]: var-lib-xenstored.mount mount process exited, code=exited status=32
Dec 03 11:46:07 laptop.dumpdata.com systemd[1]: Failed to mount mount xenstore file system.

Which looks like so:

[root@laptop system]# more var-lib-xenstored.mount 
[Unit]
Description=mount xenstore file system
Requires=proc-xen.mount
After=proc-xen.mount
ConditionPathExists=/proc/xen/capabilities
RefuseManualStop=true

[Mount]
Environment=XENSTORED_MOUNT_CTX=none
EnvironmentFile=-/etc/sysconfig/xenstored
What=xenstore
Where=/var/lib/xenstored
Type=tmpfs
Options=mode=755,context="$XENSTORED_MOUNT_CTX"


There is no /etc/sysconfig/xenstored (there is an oxenstored.conf)

If I alter it:

Options=mode=755
#,context="$XENSTORED_MOUNT_CTX"

It starts.
> 
>   Michael Young


Re: [Xen-devel] [PATCH for-4.5] libxl: expose #define to 4.5 and above

2014-12-03 Thread Konrad Rzeszutek Wilk

On Wed, Dec 03, 2014 at 10:50:34AM +, Ian Campbell wrote:
> On Wed, 2014-12-03 at 10:41 +, Wei Liu wrote:
> > In e3abab74 (libxl: un-constify return value of libxl_basename), the
> > macro was exposed to releases < 4.5. However only new code is able to
> > make use of that macro so it should be exposed to releases >= 4.5.
> > 
> > Signed-off-by: Wei Liu 
> > Cc: Ian Campbell 
> > Cc: Ian Jackson 
> > Cc: Andrew Cooper 
> 
> Acked-by: Ian Campbell 
> 
> Konrad, given that the original patch is in 4.5 (as of yesterday) we
> should obviously take this one too.

Right. Release-Acked-by: Konrad Rzeszutek Wilk 
> 
> > ---
> >  tools/libxl/libxl.h   |6 +++---
> >  tools/libxl/libxl_utils.c |2 +-
> >  tools/libxl/libxl_utils.h |2 +-
> >  3 files changed, 5 insertions(+), 5 deletions(-)
> > 
> > diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
> > index 291c190..0a123f1 100644
> > --- a/tools/libxl/libxl.h
> > +++ b/tools/libxl/libxl.h
> > @@ -478,13 +478,13 @@ typedef struct libxl__ctx libxl_ctx;
> >  #endif
> >  
> >  /*
> > - * LIBXL_HAVE_CONST_LIBXL_BASENAME_RETURN_VALUE
> > + * LIBXL_HAVE_NONCONST_LIBXL_BASENAME_RETURN_VALUE
> >   *
> >   * The return value of libxl_basename is malloc'ed but the erroneously
> >   * marked as "const" in releases before 4.5.
> >   */
> > -#if defined(LIBXL_API_VERSION) && LIBXL_API_VERSION < 0x040500
> > -#define LIBXL_HAVE_CONST_LIBXL_BASENAME_RETURN_VALUE 1
> > +#if !defined(LIBXL_API_VERSION) || LIBXL_API_VERSION >= 0x040500
> > +#define LIBXL_HAVE_NONCONST_LIBXL_BASENAME_RETURN_VALUE 1
> >  #endif
> >  
> >  /*
> > diff --git a/tools/libxl/libxl_utils.c b/tools/libxl/libxl_utils.c
> > index 22119fc..7095b58 100644
> > --- a/tools/libxl/libxl_utils.c
> > +++ b/tools/libxl/libxl_utils.c
> > @@ -19,7 +19,7 @@
> >  
> >  #include "libxl_internal.h"
> >  
> > -#ifdef LIBXL_HAVE_CONST_LIBXL_BASENAME_RETURN_VALUE
> > +#ifndef LIBXL_HAVE_NONCONST_LIBXL_BASENAME_RETURN_VALUE
> >  const
> >  #endif
> >  char *libxl_basename(const char *name)
> > diff --git a/tools/libxl/libxl_utils.h b/tools/libxl/libxl_utils.h
> > index 8277eb9..acacdd9 100644
> > --- a/tools/libxl/libxl_utils.h
> > +++ b/tools/libxl/libxl_utils.h
> > @@ -18,7 +18,7 @@
> >  
> >  #include "libxl.h"
> >  
> > -#ifdef LIBXL_HAVE_CONST_LIBXL_BASENAME_RETURN_VALUE
> > +#ifndef LIBXL_HAVE_NONCONST_LIBXL_BASENAME_RETURN_VALUE
> >  const
> >  #endif
> >  char *libxl_basename(const char *name); /* returns string from strdup */
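
How a caller might use the renamed macro can be sketched as below. This is
an assumption-laden illustration, not libxl code: demo_basename() is a
hypothetical stand-in so that the sketch compiles without libxl, and the
macro is defined by hand here, whereas a real application would get both
from the libxl headers.

```c
#include <stdlib.h>
#include <string.h>

/* Stand-in for what libxl.h would define for API version >= 4.5. */
#define LIBXL_HAVE_NONCONST_LIBXL_BASENAME_RETURN_VALUE 1

/* Hypothetical stand-in for libxl_basename(): returns a malloc'ed copy. */
#ifndef LIBXL_HAVE_NONCONST_LIBXL_BASENAME_RETURN_VALUE
const
#endif
char *demo_basename(const char *name)
{
    const char *slash = strrchr(name, '/');
    const char *base = slash ? slash + 1 : name;
    char *copy = malloc(strlen(base) + 1);

    if (copy)
        strcpy(copy, base);
    return copy;
}

/* Caller that compiles against either declaration and frees the result. */
size_t basename_length(const char *path)
{
#ifdef LIBXL_HAVE_NONCONST_LIBXL_BASENAME_RETURN_VALUE
    char *base = demo_basename(path);
#else
    const char *base = demo_basename(path);
#endif
    size_t len = base ? strlen(base) : 0;

    free((void *)base); /* the cast is only needed for the const variant */
    return len;
}
```

The point of the feature macro is that code written for the new prototype can
test for it, while code built against an older API version still sees the
"const" declaration it was written for.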
> 
> 


Re: [Xen-devel] [PATCH v4 7/7] xen/pciback: Restore configuration space when detaching from a guest.

2014-12-03 Thread Konrad Rzeszutek Wilk

On Tue, Dec 02, 2014 at 06:11:50PM -0500, Boris Ostrovsky wrote:
> On 11/21/2014 05:17 PM, Konrad Rzeszutek Wilk wrote:
> >The commit "xen/pciback: Don't deadlock when unbinding." was using
> >the version of pci_reset_function which would lock the device lock.
> >That is no good as we can dead-lock. As such we swapped to using
> >the lock-less version and requiring that the callers
> >of 'pcistub_put_pci_dev' take the device lock. And as such
> >this bug got exposed.
> >
> >Using the lock-less version is OK, except that we tried to
> >use 'pci_restore_state' after the lock-less version of
> >__pci_reset_function_locked - which won't work as 'state_saved'
> >is set to false. Said 'state_saved' is a toggle boolean that
> >is to be used by the sequence of a) pci_save_state/pci_restore_state
> >or b) pci_load_and_free_saved_state/pci_restore_state. We don't
> >want to use a) as the guest might have messed up the PCI
> >configuration space and we want it to revert to the state
> >when the PCI device was bound to us. Therefore we pick
> >b) to restore the configuration space.
> 
> 
> Doesn't this all mean that patch 1/7 broke pcistub_put_pci_dev()?

It fixed it (there was a deadlock there).

But the fix to the dead-lock exposed this bug.

One could say that 1/7 broke it, but it never worked in the first
place; now that it runs at all (thanks to #1), it turns out not to
work right.

Squashing the patches together is a bit too much I fear.

> 
> -boris
> 
> 
> >
> >We restore from our 'golden' version of PCI configuration space when a:
> >  - Device is unbound from pciback
> >  - Device is detached from a guest.
> >
> >Reported-by:  Sander Eikelenboom 
> >Signed-off-by: Konrad Rzeszutek Wilk 
> >---
> >  drivers/xen/xen-pciback/pci_stub.c | 20 
> >  1 file changed, 16 insertions(+), 4 deletions(-)
> >
> >diff --git a/drivers/xen/xen-pciback/pci_stub.c 
> >b/drivers/xen/xen-pciback/pci_stub.c
> >index 843a2ba..eb8b58e 100644
> >--- a/drivers/xen/xen-pciback/pci_stub.c
> >+++ b/drivers/xen/xen-pciback/pci_stub.c
> >@@ -105,7 +105,7 @@ static void pcistub_device_release(struct kref *kref)
> >  */
> > __pci_reset_function_locked(dev);
> > if (pci_load_and_free_saved_state(dev, &dev_data->pci_saved_state))
> >-dev_dbg(&dev->dev, "Could not reload PCI state\n");
> >+dev_info(&dev->dev, "Could not reload PCI state\n");
> > else
> > pci_restore_state(dev);
> >@@ -257,6 +257,8 @@ void pcistub_put_pci_dev(struct pci_dev *dev)
> >  {
> > struct pcistub_device *psdev, *found_psdev = NULL;
> > unsigned long flags;
> >+struct xen_pcibk_dev_data *dev_data;
> >+int ret;
> > spin_lock_irqsave(&pcistub_devices_lock, flags);
> >@@ -279,9 +281,19 @@ void pcistub_put_pci_dev(struct pci_dev *dev)
> >  * (so it's ready for the next domain)
> >  */
> > device_lock_assert(&dev->dev);
> >-__pci_reset_function_locked(dev);
> >-pci_restore_state(dev);
> >-
> >+dev_data = pci_get_drvdata(dev);
> >+ret = pci_load_saved_state(dev, dev_data->pci_saved_state);
> >+if (ret < 0)
> >+dev_warn(&dev->dev, "Could not reload PCI state\n");
> >+else {
> >+__pci_reset_function_locked(dev);
> >+/*
> >+ * The usual sequence is pci_save_state & pci_restore_state
> >+ * but the guest might have messed the configuration space up.
> >+ * Use the initial version (when device was bound to us).
> >+ */
> >+pci_restore_state(dev);
> >+}
> > /* This disables the device. */
> > xen_pcibk_reset_device(dev);
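
The 'state_saved' toggle described in the commit message can be modelled in
a few lines of plain C. This is a toy sketch, not kernel code: the toy_*
names and the single-integer "config space" are made up for illustration,
and the only behaviour assumed from the description above is that a restore
does nothing unless a save or load has set the flag first.

```c
#include <stdbool.h>

/* Toy model: one integer stands in for the PCI configuration space. */
struct toy_dev {
    int config;       /* live configuration */
    int saved;        /* buffer that a restore copies from */
    bool state_saved; /* restore is a no-op while this is false */
};

/* Models loading the 'golden' copy taken when the device was bound. */
void toy_load_saved_state(struct toy_dev *d, int golden)
{
    d->saved = golden;
    d->state_saved = true;
}

/* Models the lock-less reset: wipes the config, saves nothing. */
void toy_reset_locked(struct toy_dev *d)
{
    d->config = 0;
}

/* Models the restore: only acts if a save or load happened before. */
void toy_restore_state(struct toy_dev *d)
{
    if (!d->state_saved)
        return;
    d->config = d->saved;
    d->state_saved = false;
}
```

In this model a reset followed by a restore leaves the wiped configuration
in place, while load, reset, restore brings back the golden value, which is
the order the patch uses.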
> 


Re: [Xen-devel] [v4] libxc: Expose the 1GB pages cpuid flag to guest

2014-12-03 Thread Konrad Rzeszutek Wilk

On Wed, Dec 03, 2014 at 09:38:49AM +, Ian Campbell wrote:
> On Tue, 2014-12-02 at 16:09 -0500, Konrad Rzeszutek Wilk wrote:
> > On Fri, Nov 28, 2014 at 11:50:43AM +, Ian Campbell wrote:
> > > On Fri, 2014-11-28 at 18:52 +0800, Liang Li wrote:
> > > > If the hardware supports 1GB pages, expose the feature to the guest
> > > > by default. Users don't have to use a 'cpuid= ' option in the config
> > > > file to turn it on.
> > > > 
> > > > If the guest uses shadow mode, the 1GB pages feature will be hidden
> > > > from the guest; this is done in the function hvm_cpuid(). So the
> > > > change is okay for the shadow mode case.
> > > > 
> > > > Signed-off-by: Liang Li 
> > > > Signed-off-by: Yang Zhang 
> > > 
> > > FTR although this is strictly speaking a toolstack patch I think the
> > > main ack required should be from the x86 hypervisor guys...
> > 
> > Jan acked it.
> 
> For 4.5?

Probably not.
> 
> Have you release acked it?

No.
> 
> This seemed like 4.6 material to me, or at least I've not seen any
> mention/argument to the contrary.

Correct. 4.6 please.
> 
> Ian.
> 
> > > 
> > > > ---
> > > >  tools/libxc/xc_cpuid_x86.c | 3 +++
> > > >  1 file changed, 3 insertions(+)
> > > > 
> > > > diff --git a/tools/libxc/xc_cpuid_x86.c b/tools/libxc/xc_cpuid_x86.c
> > > > index a18b1ff..c97f91a 100644
> > > > --- a/tools/libxc/xc_cpuid_x86.c
> > > > +++ b/tools/libxc/xc_cpuid_x86.c
> > > > @@ -109,6 +109,7 @@ static void amd_xc_cpuid_policy(
> > > >  regs[3] &= (0x0183f3ff | /* features shared with 0x0001:EDX */
> > > >  bitmaskof(X86_FEATURE_NX) |
> > > >  bitmaskof(X86_FEATURE_LM) |
> > > > +bitmaskof(X86_FEATURE_PAGE1GB) |
> > > >  bitmaskof(X86_FEATURE_SYSCALL) |
> > > >  bitmaskof(X86_FEATURE_MP) |
> > > >  bitmaskof(X86_FEATURE_MMXEXT) |
> > > > @@ -192,6 +193,7 @@ static void intel_xc_cpuid_policy(
> > > >  bitmaskof(X86_FEATURE_ABM));
> > > >  regs[3] &= (bitmaskof(X86_FEATURE_NX) |
> > > >  bitmaskof(X86_FEATURE_LM) |
> > > > +bitmaskof(X86_FEATURE_PAGE1GB) |
> > > >  bitmaskof(X86_FEATURE_SYSCALL) |
> > > >  bitmaskof(X86_FEATURE_RDTSCP));
> > > >  break;
> > > > @@ -386,6 +388,7 @@ static void xc_cpuid_hvm_policy(
> > > >  clear_bit(X86_FEATURE_LM, regs[3]);
> > > >  clear_bit(X86_FEATURE_NX, regs[3]);
> > > >  clear_bit(X86_FEATURE_PSE36, regs[3]);
> > > > +clear_bit(X86_FEATURE_PAGE1GB, regs[3]);
> > > >  }
> > > >  break;
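
The masking done in the Intel hunk above can be sketched in stand-alone C.
The DEMO_* names and both helper functions are hypothetical; Xen's own
bitmaskof() and X86_FEATURE_* definitions are not reproduced here. The bit
positions are those of CPUID leaf 0x80000001, register EDX.

```c
#include <stdint.h>

/* Bit positions in CPUID leaf 0x80000001, EDX. */
#define DEMO_BIT_SYSCALL 11
#define DEMO_BIT_NX      20
#define DEMO_BIT_PAGE1GB 26
#define DEMO_BIT_RDTSCP  27
#define DEMO_BIT_LM      29

#define DEMO_MASK(bit) (UINT32_C(1) << (bit))

/* Keep only the whitelisted feature bits of the host's EDX value. */
uint32_t demo_intel_ext_edx_policy(uint32_t host_edx)
{
    return host_edx & (DEMO_MASK(DEMO_BIT_NX) |
                       DEMO_MASK(DEMO_BIT_LM) |
                       DEMO_MASK(DEMO_BIT_PAGE1GB) |
                       DEMO_MASK(DEMO_BIT_SYSCALL) |
                       DEMO_MASK(DEMO_BIT_RDTSCP));
}

/* Hide the 1GB page feature again, as the last hunk does with clear_bit(). */
uint32_t demo_hide_1gb(uint32_t edx)
{
    return edx & ~DEMO_MASK(DEMO_BIT_PAGE1GB);
}
```

Because the policy is an AND with the host value, adding the bit to the mask
only exposes 1GB pages when the hardware reports them in the first place.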
> > > >  
> > > 
> > > 
> > > 
> 
> 


Re: [Xen-devel] [PATCH] INSTALL: fix typo in xendomains.service name

2014-12-03 Thread Konrad Rzeszutek Wilk

On Wed, Dec 03, 2014 at 09:52:34AM +0100, Olaf Hering wrote:
> Signed-off-by: Olaf Hering 
> Cc: Ian Campbell 
> Cc: Ian Jackson 
> ---
>  INSTALL | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

This being a doc it can go in anytime. thank you!
> 
> diff --git a/INSTALL b/INSTALL
> index 0bc67ea..71dd0eb 100644
> --- a/INSTALL
> +++ b/INSTALL
> @@ -284,7 +284,7 @@ systemctl enable xen-init-dom0.service
>  systemctl enable xenconsoled.service
>  
>  Other optional services are:
> -systemctl enable xen-domains.service
> +systemctl enable xendomains.service
>  systemctl enable xen-watchdog.service
>  
>  


Re: [Xen-devel] [PATCH v15 17/21] x86/VPMU: Handle PMU interrupts for PV guests

2014-12-03 Thread Boris Ostrovsky


On 11/27/2014 03:59 AM, Jan Beulich wrote:

On 26.11.14 at 15:39,  wrote:

On 11/25/2014 09:28 AM, Jan Beulich wrote:

+else
+{
+struct segment_register seg;
+
+hvm_get_segment_register(sampled, x86_seg_cs, &seg);
+r->cs = seg.sel;
+hvm_get_segment_register(sampled, x86_seg_ss, &seg);
+r->ss = seg.sel;
+if ( seg.attr.fields.dpl != 0 )
+*flags |= PMU_SAMPLE_USER;

Is that how hardware treats it (CPL != 0 meaning user, rather
than CPL == 3)?

You mean how *software* (e.g. Linux kernel) treats it? If yes, then for
32-bit user_mode() checks for (CS == 3) and for 64-bit it's !!(CS & 3).

No, I meant hardware. There are CPL-qualified PMU aspects, and it was
those I had in mind to use as reference here.


Maybe you should surface CPL instead of a
boolean flag?



Yes, I think it may be better. Let the caller sort out how to interpret it.

-boris



Am I not already doing it by passing SS and CS to the guest?

No, neither SS.RPL nor CS.RPL formally represent CPL.

Jan
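
The question in this exchange is whether "user" should mean CPL != 0 or
CPL == 3. The two readings only differ for rings 1 and 2, which a small
sketch makes concrete. The helper names are hypothetical and the inputs are
plain integers rather than Xen's struct segment_register.

```c
#include <stdint.h>

/* Low two bits of a segment selector: the requested privilege level. */
unsigned int demo_selector_rpl(uint16_t sel)
{
    return sel & 3u;
}

/* Reading used in the patch hunk: anything but ring 0 counts as user. */
int demo_user_if_nonzero(unsigned int dpl)
{
    return dpl != 0;
}

/* Stricter reading raised in the review: only ring 3 counts as user. */
int demo_user_if_ring3(unsigned int dpl)
{
    return dpl == 3;
}
```

Passing the privilege level itself to the caller, as suggested, would avoid
committing to either reading inside the hypervisor.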





Re: [Xen-devel] [PATCH v15 11/21] x86/VPMU: Interface for setting PMU mode and flags

2014-12-03 Thread Boris Ostrovsky


On 11/27/2014 03:57 AM, Jan Beulich wrote:

On 26.11.14 at 15:32,  wrote:

On 11/25/2014 08:49 AM, Jan Beulich wrote:

On 17.11.14 at 00:07,  wrote:

@@ -244,19 +256,19 @@ void vpmu_initialise(struct vcpu *v)
   switch ( vendor )
   {
   case X86_VENDOR_AMD:
-if ( svm_vpmu_initialise(v, opt_vpmu_enabled) != 0 )
-opt_vpmu_enabled = 0;
+if ( svm_vpmu_initialise(v) != 0 )
+vpmu_mode = XENPMU_MODE_OFF;
   break;
   
   case X86_VENDOR_INTEL:

-if ( vmx_vpmu_initialise(v, opt_vpmu_enabled) != 0 )
-opt_vpmu_enabled = 0;
+if ( vmx_vpmu_initialise(v) != 0 )
+vpmu_mode = XENPMU_MODE_OFF;
   break;

So this turns off the vPMU globally upon failure of initializing
some random vCPU. Why is that? I see this was the case even
before your entire series, but shouldn't this be fixed _before_
enhancing the whole thing to support PV/PVH?

Yes, that's probably too strong. Do you want to fix this as an early
patch (before PV(H)) support gets in? I'd rather do it in the patch that
moves things into initcalls.

Yes, I think this should be fixed in a prereq patch, thus allowing it
to be easily backported if so desired.


I started to make this change and then I realized that the next patch 
(12/21) is actually already taking care of this problem: most of the 
*_vpmu_initialise() will be executed as initcalls during host boot and 
if any of those fail then we do want to disable VPMU globally (those 
failures would not be VCPU-specific).


-boris





Re: [Xen-devel] [PATCH] xen: privcmd: schedule() after private hypercall when non CONFIG_PREEMPT

2014-12-03 Thread Luis R. Rodriguez

On Wed, Dec 03, 2014 at 05:37:51AM +0100, Juergen Gross wrote:
> On 12/03/2014 03:28 AM, Luis R. Rodriguez wrote:
>> On Tue, Dec 02, 2014 at 11:11:18AM +, David Vrabel wrote:
>>> On 01/12/14 22:36, Luis R. Rodriguez wrote:

>>>> Then I do agree it's a fair analogy (and find it odd how widespread
>>>> cond_resched() is), we just don't have an equivalent for IRQ
>>>> context, why not avoid the special check then and use this all the time in
>>>> the middle of a hypercall on the return from an interrupt (e.g., the timer
>>>> interrupt)?
>>>
>>> http://lists.xen.org/archives/html/xen-devel/2014-02/msg01101.html
>>
>> OK thanks! That explains why we need some asm code but in that submission you
>> still also had used is_preemptible_hypercall(regs) and in the new
>> implementation you use a CPU variable xen_in_preemptible_hcall prior to
>> calling preempt_schedule_irq(). I believe you added the CPU variable because
>> preempt_schedule_irq() will preempt first without any checks if it should,
>> I'm asking why not do something like cond_resched_irq() where we check with
>> should_resched() prior to preempting and that way we can avoid having to use
>> the CPU variable?
>
> Because that could preempt at any asynchronous interrupt making the
> no-preempt kernel fully preemptive. 

OK yeah I see. That still doesn't negate the value of using something
like cond_resched_irq() with a should_resched() on only critical hypercalls.
The current implementation (patch by David) forces preemption without
checking for should_resched() so it would preempt unnecessarily at least
once.

> How would you know you are just
> doing a critical hypercall which should be preempted?

You would not, you're right. I was just trying to see if we could generalize
an API for this, so that users do not have to create their own CPU variables,
but this all seems very specialized as we want to use this on the timer,
so if we do generalize a cond_resched_irq() perhaps the documentation can
warn about this type of case or abuse.

  Luis


Re: [Xen-devel] [PATCH] xsm/flask: improve unknown permission handling

2014-12-03 Thread Andrew Cooper

On 03/12/14 18:37, Daniel De Graaf wrote:
> On 11/27/2014 10:33 AM, Andrew Cooper wrote:
>> On 27/11/14 15:23, George Dunlap wrote:
>>> On Tue, Nov 25, 2014 at 6:05 PM, Daniel De Graaf  wrote:
>>>> When an unknown domctl, sysctl, or other operation is encountered in the
>>>> FLASK security server, use the allow_unknown bit in the security policy
>>>> (set by running checkpolicy -U allow) to decide if the permission should
>>>> be allowed or denied.  This allows new operations to be tested without
>>>> needing to immediately add security checks; however, it is not flexible
>>>> enough to avoid adding the actual permission checks.  An error message
>>>> is printed to the hypervisor console when this fallback is encountered.
>>> Thanks -- I do think as Konrad said however, that when building with
>>> debug=y, we want the failure to be more obvious.  A crash is probably
>>> the best thing.
>>>
>>> I guess we want something like the following after the printk in
>>> avc_unknown_permission()?
>>>
>>> #ifndef NDEBUG
>>>  BUG();
>>> #endif
>>
>> ASSERT(!"Flask default policy error");
>>
>> provides rather more information in the panic message, and avoids the
>> #ifdefs.
>>
>> ~Andrew
>
> This allows any (privileged or unprivileged) guest to trigger the ASSERT
> and cause a hypervisor crash on a debug build.  Given that XSA-37 was
> considered a security vulnerability due to this type of behavior, I am
> hesitant to deliberately add a path to trigger a hypervisor crash, even
> if it makes testing easier.
>

XSA-37 was only an XSA because the rules at the time were unclear as to
whether it was an issue or not.  At the same time, the rules were
clarified to state that issues in a debug build only are not security
issues.

~Andrew
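
The ASSERT(!"...") idiom suggested earlier in this thread works because a
string literal is a non-null pointer, so negating it always yields 0. A
minimal sketch using the standard C assert() as a stand-in for Xen's ASSERT()
(the demo_* names are hypothetical):

```c
#include <assert.h>

/* A string literal is a non-null pointer, so !"..." is always 0. */
int demo_always_false(void)
{
    return !"Flask default policy error";
}

/*
 * Stand-in for the fallback path: on a debug build the assertion always
 * fires, and the failure report quotes the asserted expression, which
 * carries the message text. With NDEBUG defined, assert() compiles away.
 */
void demo_unknown_permission(int debug_build)
{
    if (debug_build)
        assert(!"Flask default policy error");
}
```

Calling demo_unknown_permission(1) without NDEBUG aborts the program, which
is the behaviour under discussion for debug=y builds.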


Re: [Xen-devel] [PATCH] xsm/flask: improve unknown permission handling

2014-12-03 Thread Daniel De Graaf


On 11/27/2014 10:33 AM, Andrew Cooper wrote:
> On 27/11/14 15:23, George Dunlap wrote:
>> On Tue, Nov 25, 2014 at 6:05 PM, Daniel De Graaf  wrote:
>>> When an unknown domctl, sysctl, or other operation is encountered in the
>>> FLASK security server, use the allow_unknown bit in the security policy
>>> (set by running checkpolicy -U allow) to decide if the permission should
>>> be allowed or denied.  This allows new operations to be tested without
>>> needing to immediately add security checks; however, it is not flexible
>>> enough to avoid adding the actual permission checks.  An error message
>>> is printed to the hypervisor console when this fallback is encountered.
>>
>> Thanks -- I do think as Konrad said however, that when building with
>> debug=y, we want the failure to be more obvious.  A crash is probably
>> the best thing.
>>
>> I guess we want something like the following after the printk in
>> avc_unknown_permission()?
>>
>> #ifndef NDEBUG
>>  BUG();
>> #endif
>
> ASSERT(!"Flask default policy error");
>
> provides rather more information in the panic message, and avoids the
> #ifdefs.
>
> ~Andrew


This allows any (privileged or unprivileged) guest to trigger the ASSERT
and cause a hypervisor crash on a debug build.  Given that XSA-37 was
considered a security vulnerability due to this type of behavior, I am
hesitant to deliberately add a path to trigger a hypervisor crash, even
if it makes testing easier.

--
Daniel De Graaf
National Security Agency


[Xen-devel] [PATCH for-4.5] libxl_set_memory_target: only remove videoram from absolute targets

2014-12-03 Thread Stefano Stabellini

If the new target is relative to the current target, do not remove
videoram again: it has already been removed from the current target.

Signed-off-by: Stefano Stabellini 

diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index de23fec..2aa83bd 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -4741,13 +4741,17 @@ retry_transaction:
 goto out;
 }
 
+videoram_s = libxl__xs_read(gc, t, libxl__sprintf(gc,
+"%s/memory/videoram", dompath));
+videoram = videoram_s ? atoi(videoram_s) : 0;
+
 if (relative) {
 if (target_memkb < 0 && abs(target_memkb) > current_target_memkb)
 new_target_memkb = 0;
 else
 new_target_memkb = current_target_memkb + target_memkb;
 } else
-new_target_memkb = target_memkb;
+new_target_memkb = target_memkb - videoram;
 if (new_target_memkb > memorykb) {
 LIBXL__LOG(ctx, LIBXL__LOG_ERROR,
 "memory_dynamic_max must be less than or equal to"
@@ -4763,9 +4767,6 @@ retry_transaction:
 abort_transaction = 1;
 goto out;
 }
-videoram_s = libxl__xs_read(gc, t, libxl__sprintf(gc,
-"%s/memory/videoram", dompath));
-videoram = videoram_s ? atoi(videoram_s) : 0;
 
 if (enforce) {
 memorykb = new_target_memkb;
@@ -4780,7 +4781,6 @@ retry_transaction:
 }
 }
 
-new_target_memkb -= videoram;
 rc = xc_domain_set_pod_target(ctx->xch, domid,
 new_target_memkb / 4, NULL, NULL, NULL);
 if (rc != 0) {
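
The target computation after this patch can be modelled as a small pure
function. This is a sketch under assumptions: demo_new_target_kb() is a
hypothetical helper, all values are in kB, and 64-bit integers are used for
simplicity rather than the types libxl uses.

```c
#include <stdint.h>

/*
 * Hypothetical model of the new target computation.
 * relative != 0: target_memkb is a delta on top of the current target,
 *                which already has videoram removed.
 * relative == 0: target_memkb is absolute, so videoram is removed once.
 */
int64_t demo_new_target_kb(int relative, int64_t target_memkb,
                           int64_t current_target_memkb, int64_t videoram_kb)
{
    if (relative) {
        /* A negative delta larger than the current target clamps to 0. */
        if (target_memkb < 0 && -target_memkb > current_target_memkb)
            return 0;
        return current_target_memkb + target_memkb;
    }

    return target_memkb - videoram_kb;
}
```

For example, an absolute target of 1048576 kB with 8192 kB of videoram
gives 1040384 kB, and a later relative change of -1024 kB gives 1039360 kB
without subtracting videoram a second time.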


Re: [Xen-devel] [PATCH v2 for-4.6] libxl_set_memory_target: retain the same maxmem offset on top of the current target

2014-12-03 Thread Stefano Stabellini

On Wed, 3 Dec 2014, Don Slutz wrote:
> On 12/03/14 12:31, Stefano Stabellini wrote:
> > On Tue, 2 Dec 2014, Don Slutz wrote:
> > > On 12/02/14 09:59, Don Slutz wrote:
> > > > On 12/02/14 09:26, Stefano Stabellini wrote:
> > > > > On Tue, 2 Dec 2014, Don Slutz wrote:
> > > > > > On 12/02/14 06:53, Stefano Stabellini wrote:
> > > > > > > In libxl_set_memory_target when setting the new maxmem, retain the
> > > > > > > same offset on top of the current target. The offset includes memory
> > > > > > > allocated by QEMU for rom files.
> > > > > > > 
> > > > > > > Signed-off-by: Stefano Stabellini
> > > > > > > 
> > > > > > > ---
> > > > > > > 
> > > > > > > Changes in v2:
> > > > > > > - call libxl_domain_info instead of libxl_dominfo_init;
> > > > > > > - call libxl_domain_info before retry_transaction.
> > > > > > > 
> > > > > > > diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
> > > > > > > index de23fec..569a32a 100644
> > > > > > > --- a/tools/libxl/libxl.c
> > > > > > > +++ b/tools/libxl/libxl.c
> > > > > > > @@ -4694,6 +4694,9 @@ int libxl_set_memory_target(libxl_ctx *ctx,
> > > > > > > uint32_t
> > > > > > > domid,
> > > > > > > char *uuid;
> > > > > > > xs_transaction_t t;
> > > > > > > +if (libxl_domain_info(ctx, &ptr, domid) < 0)
> > > > > > > +goto out_no_transaction;
> > > > > > > +
> > > > > > > retry_transaction:
> > > > > > > t = xs_transaction_start(ctx->xsh);
> > > > > > > @@ -4767,10 +4770,9 @@ retry_transaction:
> > > > > > > "%s/memory/videoram", dompath));
> > > > > > > videoram = videoram_s ? atoi(videoram_s) : 0;
> > > > > > > -if (enforce) {
> > > > > > > -memorykb = new_target_memkb;
> > > > > > > -rc = xc_domain_setmaxmem(ctx->xch, domid, memorykb +
> > > > > > > -LIBXL_MAXMEM_CONSTANT);
> > > > > > > +if (enforce && new_target_memkb > 0) {
> > > > > > > +memorykb = ptr.max_memkb - current_target_memkb +
> > > > > > > new_target_memkb;
> > > My testing shows that this should be:
> > > 
> > >  memorykb = ptr.max_memkb - (current_target_memkb + videoram) +
> > >  new_target_memkb;
> > > 
> > > As far as I can tell the reason for this is that memory/target (aka
> > > current_target_memkb) was set based on:
> > > 
> > >  new_target_memkb -= videoram;
> > Thank you very much for testing and the suggestion!
> > 
> > I think that the right fix for this is to remove videoram from
> > new_target_memkb earlier and only when the new target is absolute,
> > otherwise we risk removing videoram twice (in case the new target is
> > relative). I wonder why we didn't notice this before.
> 
> Sounds like a good idea.  No clue; I have been looking really closely at
> this stuff, and that may be why I tripped over it.
> 
> > 
> > diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
> > index d5d5204..4803cc4 100644
> > --- a/tools/libxl/libxl.c
> > +++ b/tools/libxl/libxl.c
> > @@ -4744,13 +4744,17 @@ retry_transaction:
> >   goto out;
> >   }
> >   +videoram_s = libxl__xs_read(gc, t, libxl__sprintf(gc,
> > +"%s/memory/videoram", dompath));
> > +videoram = videoram_s ? atoi(videoram_s) : 0;
> > +
> >   if (relative) {
> >   if (target_memkb < 0 && abs(target_memkb) > current_target_memkb)
> >   new_target_memkb = 0;
> >   else
> >   new_target_memkb = current_target_memkb + target_memkb;
> >   } else
> > -new_target_memkb = target_memkb;
> > +new_target_memkb = target_memkb - videoram;
> >   if (new_target_memkb > memorykb) {
> >   LIBXL__LOG(ctx, LIBXL__LOG_ERROR,
> >   "memory_dynamic_max must be less than or equal to"
> > @@ -4766,9 +4770,6 @@ retry_transaction:
> >   abort_transaction = 1;
> >   goto out;
> >   }
> > -videoram_s = libxl__xs_read(gc, t, libxl__sprintf(gc,
> > -"%s/memory/videoram", dompath));
> > -videoram = videoram_s ? atoi(videoram_s) : 0;
> > if (enforce && new_target_memkb > 0) {
> >   memorykb = ptr.max_memkb - current_target_memkb +
> > new_target_memkb;
> > @@ -4782,7 +4783,6 @@ retry_transaction:
> >   }
> >   }
> >   -new_target_memkb -= videoram;
> >   rc = xc_domain_set_pod_target(ctx->xch, domid,
> >   new_target_memkb / 4, NULL, NULL, NULL);
> >   if (rc != 0) {
> 
> This does look like a bugfix for just the videoram issue.  Not sure why
> you made this v2 since I do not see the original change.
> 
> Anyway if you want to post this change for 4.5 (?) I would be happy to review
> it.

Thanks, I'll do that.
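
The offset arithmetic discussed above can be sketched as follows. This is a minimal illustration, not the libxl code: the helper name retain_maxmem_offset() is hypothetical, and it assumes both targets are already net of videoram, which is what the follow-up fix arranges.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of libxl): keep the gap between the
 * domain's maxmem and its balloon target constant when the target moves.
 * Both targets are assumed to be net of videoram, in KiB.
 */
static uint64_t retain_maxmem_offset(uint64_t max_memkb,
                                     uint64_t current_target_memkb,
                                     uint64_t new_target_memkb)
{
    /* Memory sitting on top of the current target, e.g. QEMU rom files. */
    uint64_t offset = 0;

    if (max_memkb > current_target_memkb)
        offset = max_memkb - current_target_memkb;

    return new_target_memkb + offset;
}
```

If the new target still included videoram while the stored target did not, the result would be too large by exactly videoram, which matches the discrepancy reported in testing.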


Re: [Xen-devel] [PATCH v2 for-4.6] libxl_set_memory_target: retain the same maxmem offset on top of the current target

2014-12-03 Thread Don Slutz


On 12/03/14 12:31, Stefano Stabellini wrote:
> On Tue, 2 Dec 2014, Don Slutz wrote:
> > On 12/02/14 09:59, Don Slutz wrote:
> > > On 12/02/14 09:26, Stefano Stabellini wrote:
> > > > On Tue, 2 Dec 2014, Don Slutz wrote:
> > > > > On 12/02/14 06:53, Stefano Stabellini wrote:
> > > > > > In libxl_set_memory_target when setting the new maxmem, retain the
> > > > > > same
> > > > > > offset on top of the current target. The offset includes memory
> > > > > > allocated by QEMU for rom files.
> > > > > >
> > > > > > Signed-off-by: Stefano Stabellini
> > > > > >
> > > > > > ---
> > > > > >
> > > > > > Changes in v2:
> > > > > > - call libxl_domain_info instead of libxl_dominfo_init;
> > > > > > - call libxl_domain_info before retry_transaction.
> > > > > >
> > > > > > diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
> > > > > > index de23fec..569a32a 100644
> > > > > > --- a/tools/libxl/libxl.c
> > > > > > +++ b/tools/libxl/libxl.c
> > > > > > @@ -4694,6 +4694,9 @@ int libxl_set_memory_target(libxl_ctx *ctx,
> > > > > > uint32_t
> > > > > > domid,
> > > > > > char *uuid;
> > > > > > xs_transaction_t t;
> > > > > > +if (libxl_domain_info(ctx, &ptr, domid) < 0)
> > > > > > +goto out_no_transaction;
> > > > > > +
> > > > > > retry_transaction:
> > > > > > t = xs_transaction_start(ctx->xsh);
> > > > > > @@ -4767,10 +4770,9 @@ retry_transaction:
> > > > > > "%s/memory/videoram", dompath));
> > > > > > videoram = videoram_s ? atoi(videoram_s) : 0;
> > > > > > -if (enforce) {
> > > > > > -memorykb = new_target_memkb;
> > > > > > -rc = xc_domain_setmaxmem(ctx->xch, domid, memorykb +
> > > > > > -LIBXL_MAXMEM_CONSTANT);
> > > > > > +if (enforce && new_target_memkb > 0) {
> > > > > > +memorykb = ptr.max_memkb - current_target_memkb +
> > > > > > new_target_memkb;
> > My testing shows that this should be:
> >
> >  memorykb = ptr.max_memkb - (current_target_memkb + videoram) +
> >  new_target_memkb;
> >
> > As far as I can tell the reason for this is that memory/target (aka
> > current_target_memkb) was set based on:
> >
> >  new_target_memkb -= videoram;
> Thank you very much for testing and the suggestion!
>
> I think that the right fix for this is to remove videoram from
> new_target_memkb earlier and only when the new target is absolute,
> otherwise we risk removing videoram twice (in case the new target is
> relative). I wonder why we didn't notice this before.

Sounds like a good idea.  No clue; I have been looking really closely at
this stuff, and that may be why I tripped over it.

> diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
> index d5d5204..4803cc4 100644
> --- a/tools/libxl/libxl.c
> +++ b/tools/libxl/libxl.c
> @@ -4744,13 +4744,17 @@ retry_transaction:
>   goto out;
>   }
>
> +videoram_s = libxl__xs_read(gc, t, libxl__sprintf(gc,
> +"%s/memory/videoram", dompath));
> +videoram = videoram_s ? atoi(videoram_s) : 0;
> +
>   if (relative) {
>   if (target_memkb < 0 && abs(target_memkb) > current_target_memkb)
>   new_target_memkb = 0;
>   else
>   new_target_memkb = current_target_memkb + target_memkb;
>   } else
> -new_target_memkb = target_memkb;
> +new_target_memkb = target_memkb - videoram;
>   if (new_target_memkb > memorykb) {
>   LIBXL__LOG(ctx, LIBXL__LOG_ERROR,
>   "memory_dynamic_max must be less than or equal to"
> @@ -4766,9 +4770,6 @@ retry_transaction:
>   abort_transaction = 1;
>   goto out;
>   }
> -videoram_s = libxl__xs_read(gc, t, libxl__sprintf(gc,
> -"%s/memory/videoram", dompath));
> -videoram = videoram_s ? atoi(videoram_s) : 0;
>
>   if (enforce && new_target_memkb > 0) {
>   memorykb = ptr.max_memkb - current_target_memkb + new_target_memkb;
> @@ -4782,7 +4783,6 @@ retry_transaction:
>   }
>   }
>
> -new_target_memkb -= videoram;
>   rc = xc_domain_set_pod_target(ctx->xch, domid,
>   new_target_memkb / 4, NULL, NULL, NULL);
>   if (rc != 0) {

This does look like a bugfix for just the videoram issue.  Not sure why
you made this v2 since I do not see the original change.

Anyway if you want to post this change for 4.5 (?) I would be happy to
review it.

-Don Slutz


Re: [Xen-devel] Regression, host crash with 4.5rc1

2014-12-03 Thread Dugger, Donald D

Jan-

No, I have no knowledge of any unpublished errata related to C-state issues.

--
Don Dugger
"Censeo Toto nos in Kansa esse decisse." - D. Gale
Ph: 303/443-3786

-----Original Message-----
From: Jan Beulich [mailto:jbeul...@suse.com] 
Sent: Thursday, November 27, 2014 2:28 AM
To: Steve Freitas; Dugger, Donald D; Nakajima, Jun
Cc: xen-devel@lists.xen.org; Don Slutz
Subject: Re: [Xen-devel] Regression, host crash with 4.5rc1

>>> On 27.11.14 at 06:29,  wrote:
> On 11/25/2014 03:00 AM, Jan Beulich wrote:
>> Okay, so it's not really the mwait-idle driver causing the 
>> regression, but it is C-state related. Hence we're now down to seeing 
>> whether all or just the deeper C states are affected, i.e. I now need 
>> to ask you to play with "max_cstate=". For that you'll have to 
>> remember that the option's effect differs between the ACPI and the MWAIT 
>> idle drivers.
>> In the spirit of bisection I'd suggest using "max_cstate=2" first no 
>> matter which of the two scenarios you pick. If that still hangs, 
>> "max_cstate=1" obviously is the only other thing to try. Should that 
>> not hang (and you left out "mwait-idle=0"), trying "max_cstate=3"
>> in that same scenario would be the other case to check.
>>
>> No need for 'd' and 'a' output for the time being, but 'c' output 
>> would be much appreciated for all cases where you observe hangs.
>>
> 
> Okay, working through that now. I tried max_cstate=2 and got no hangs, 
> whether with or without mwait-idle=0. However, I was puzzled by this:
> 
> (XEN) 'c' pressed -> printing ACPI Cx structures
> (XEN) ==cpu0==
> (XEN) active state: C0
> (XEN) max_cstate:   C2
> (XEN) states:
> (XEN) C1:   type[C1] latency[003] usage[12219860] method[  FFH] 
> duration[1190961948551]
> (XEN) C2:   type[C1] latency[010] usage[10205554] method[  FFH] 
> duration[2015393965907]
> (XEN) C3:   type[C2] latency[020] usage[50926286] method[  FFH] 
> duration[30527997858148]
> (XEN)*C0:   usage[73351700] duration[9974627547595]
> (XEN) max=0 pwr=0 urg=0 nxt=0
> (XEN) PC2[0] PC3[8589642315848] PC6[0] PC7[0]
> (XEN) CC3[28794734145697] CC6[0] CC7[0]
> (XEN) ==cpu1==
> (XEN) active state: C3
> (XEN) max_cstate:   C2
> (XEN) states:
> (XEN) C1:   type[C1] latency[003] usage[10699950] method[  FFH] 
> duration[1141422044112]
> (XEN) C2:   type[C1] latency[010] usage[06382904] method[  FFH] 
> duration[1329739264322]
> (XEN)*C3:   type[C2] latency[020] usage[44630764] method[  FFH] 
> duration[31676618425954]
> (XEN) C0:   usage[61713618] duration[9561201640320]
> (XEN) max=0 pwr=0 urg=0 nxt=0
> (XEN) PC2[0] PC3[8589642315848] PC6[0] PC7[0]
> (XEN) CC3[30066495105056] CC6[0] CC7[0] [...]
> 
> Why would some of the cores be in C3 even though they list max_cstate as C2?

This was precisely the reason why I told you that the numbering differs
(and is confusing and has nothing to do with actual C state numbers):
What max_cstate refers to in the mwait-idle driver is what above is
listed as type[Cx], i.e. the state at index 1 is C1, at 2 we've got C1E,
and at 3 we've got C2. And those still aren't in line with the numbering
the CPU documentation uses, it's rather kind of meant to refer to the
ACPI numbering (but probably also not fully matching up).
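
A minimal sketch of that numbering mismatch, assuming the limit is compared against the type[] field rather than the list index. The struct and function names are hypothetical and the table simply mirrors the three states in the 'c' output quoted above; this is not the Xen cpuidle code.

```c
#include <assert.h>
#include <stddef.h>

/* One row of the 'c' key output: list index and the type[Cx] number. */
struct cx_state {
    int index; /* position in the printed list (C1, C2, C3, ...) */
    int type;  /* value shown as type[Cx] */
};

/* Mirrors the quoted output: C1 -> type C1, C2 -> type C1, C3 -> type C2. */
static const struct cx_state quoted_states[] = {
    { 1, 1 },
    { 2, 1 },
    { 3, 2 },
};

/* Deepest list index still permitted when the limit applies to the type. */
static int deepest_allowed_index(const struct cx_state *s, size_t n,
                                 int max_cstate)
{
    int deepest = 0;

    for (size_t i = 0; i < n; i++)
        if (s[i].type <= max_cstate && s[i].index > deepest)
            deepest = s[i].index;

    return deepest;
}
```

Under that assumption max_cstate=2 still admits the entry printed as C3, which is why a CPU can show "active state: C3" next to "max_cstate: C2".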

So max_cstate=2 working suggests a problem with what the CPU calls C6,
which presumably isn't all that surprising considering the many errata
(BD35, BD38, BD40, BD59, BD87, and BD104). Not sure how to proceed from
here - I suppose you already made sure you run with the latest available
BIOS. And with 6 errata documented it's not all that unlikely that
there's a 7th one with MONITOR/MWAIT behavior. The commit you bisected
to (and which you had verified to be the culprit by just forcing
arch_skip_send_event_check() to always return false) could be reasonably
assumed to be broken only when MWAIT use for all C states didn't work.

Don, Jun - is there anything known but not yet publicly documented for Family 6 
Model 44 Xeons?

Jan


[Xen-devel] [PATCH v4 2/9] xen: introduce SHUTDOWN_soft_reset shutdown reason

2014-12-03 Thread Vitaly Kuznetsov

Signed-off-by: Vitaly Kuznetsov 
---
 xen/common/shutdown.c  | 7 +++
 xen/include/public/sched.h | 3 ++-
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/xen/common/shutdown.c b/xen/common/shutdown.c
index 94d4c53..5c3a158 100644
--- a/xen/common/shutdown.c
+++ b/xen/common/shutdown.c
@@ -71,6 +71,13 @@ void hwdom_shutdown(u8 reason)
 break; /* not reached */
 }
 
+case SHUTDOWN_soft_reset:
+{
+printk("Domain 0 did soft reset but it is unsupported, rebooting.\n");
+machine_restart(0);
+break; /* not reached */
+}
+
 default:
 {
 printk("Domain 0 shutdown (unknown reason %u): ", reason);
diff --git a/xen/include/public/sched.h b/xen/include/public/sched.h
index 4000ac9..800c808 100644
--- a/xen/include/public/sched.h
+++ b/xen/include/public/sched.h
@@ -159,7 +159,8 @@ DEFINE_XEN_GUEST_HANDLE(sched_watchdog_t);
 #define SHUTDOWN_suspend    2  /* Clean up, save suspend info, kill. */
 #define SHUTDOWN_crash      3  /* Tell controller we've crashed. */
 #define SHUTDOWN_watchdog   4  /* Restart because watchdog time expired. */
-#define SHUTDOWN_MAX        4  /* Maximum valid shutdown reason. */
+#define SHUTDOWN_soft_reset 5  /* Soft reset, rebuild keeping memory content */
+#define SHUTDOWN_MAX        5  /* Maximum valid shutdown reason. */
 /* ` } */
 
 #endif /* __XEN_PUBLIC_SCHED_H__ */
-- 
1.9.3



[Xen-devel] [PATCH v4 6/9] libxl: add libxl__domain_soft_reset_destroy_old()

2014-12-03 Thread Vitaly Kuznetsov

The new libxl__domain_soft_reset_destroy_old() is an internal-only
version of libxl_domain_destroy(). It follows the same domain destroy
path with one difference: xc_domain_destroy() is skipped, so the
domain itself is not actually destroyed.

Add a soft_reset flag to the libxl__domain_destroy_state structure
to support the change.

The original libxl_domain_destroy() function could easily be
modified to support the new flag, but I'm trying to avoid that as
it is part of the public API.

Signed-off-by: Vitaly Kuznetsov 
---
 tools/libxl/libxl.c  | 32 +++-
 tools/libxl/libxl_internal.h |  4 
 2 files changed, 31 insertions(+), 5 deletions(-)

diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index f84f7c2..c2bd730 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -1437,6 +1437,23 @@ int libxl_domain_destroy(libxl_ctx *ctx, uint32_t domid,
 return AO_INPROGRESS;
 }
 
+int libxl__domain_soft_reset_destroy_old(libxl_ctx *ctx, uint32_t domid,
+ const libxl_asyncop_how *ao_how)
+{
+AO_CREATE(ctx, domid, ao_how);
+libxl__domain_destroy_state *dds;
+
+GCNEW(dds);
+dds->ao = ao;
+dds->domid = domid;
+dds->callback = domain_destroy_cb;
+dds->soft_reset = 1;
+libxl__domain_destroy(egc, dds);
+
+return AO_INPROGRESS;
+}
+
+
 static void domain_destroy_cb(libxl__egc *egc, libxl__domain_destroy_state *dds,
   int rc)
 {
@@ -1612,6 +1629,7 @@ static void devices_destroy_cb(libxl__egc *egc,
 {
 STATE_AO_GC(drs->ao);
 libxl__destroy_domid_state *dis = CONTAINER_OF(drs, *dis, drs);
+libxl__domain_destroy_state *dds = CONTAINER_OF(dis, *dds, domain);
 libxl_ctx *ctx = CTX;
 uint32_t domid = dis->domid;
 char *dom_path;
@@ -1650,11 +1668,15 @@ static void devices_destroy_cb(libxl__egc *egc,
 }
 libxl__userdata_destroyall(gc, domid);
 
-rc = xc_domain_destroy(ctx->xch, domid);
-if (rc < 0) {
-LIBXL__LOG_ERRNOVAL(ctx, LIBXL__LOG_ERROR, rc, "xc_domain_destroy failed for %d", domid);
-rc = ERROR_FAIL;
-goto out;
+if (!dds->soft_reset)
+{
+rc = xc_domain_destroy(ctx->xch, domid);
+if (rc < 0) {
+LIBXL__LOG_ERRNOVAL(ctx, LIBXL__LOG_ERROR, rc,
+"xc_domain_destroy failed for %d", domid);
+rc = ERROR_FAIL;
+goto out;
+}
 }
 rc = 0;
 
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index a38f695..f29ed83 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -2969,6 +2969,7 @@ struct libxl__domain_destroy_state {
 int stubdom_finished;
 libxl__destroy_domid_state domain;
 int domain_finished;
+int soft_reset;
 };
 
 /*
@@ -3132,6 +3133,9 @@ _hidden void libxl__domain_save_device_model(libxl__egc *egc,
 
 _hidden const char *libxl__device_model_savefile(libxl__gc *gc, uint32_t domid);
 
+_hidden int libxl__domain_soft_reset_destroy_old(libxl_ctx *ctx, uint32_t domid,
+ const libxl_asyncop_how *ao_how);
+
 
 /*
  * Convenience macros.
-- 
1.9.3
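
The flag-controlled teardown described in the commit message can be sketched like this. It is a toy model under stated assumptions: struct toy_destroy_state and toy_teardown() are hypothetical stand-ins, and the two boolean fields only record which steps ran. It does not call any libxl or libxc function.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the destroy state carrying the new flag. */
struct toy_destroy_state {
    int domid;
    bool soft_reset;       /* set by the internal soft-reset entry point */
    bool devices_removed;  /* records that the common cleanup ran */
    bool domain_destroyed; /* records that the final destroy step ran */
};

/* Shared teardown path: only the last step depends on soft_reset. */
static int toy_teardown(struct toy_destroy_state *s)
{
    /* Stands in for device, xenstore and userdata cleanup. */
    s->devices_removed = true;

    /* Stands in for the xc_domain_destroy() call, skipped on soft reset. */
    if (!s->soft_reset)
        s->domain_destroyed = true;

    return 0;
}
```

Keeping the flag inside the state structure means the public entry point keeps its signature while the internal variant only differs in how it fills the structure.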



Re: [Xen-devel] [PATCH v2 for-4.6] libxl_set_memory_target: retain the same maxmem offset on top of the current target

2014-12-03 Thread Stefano Stabellini

On Tue, 2 Dec 2014, Don Slutz wrote:
> On 12/02/14 09:59, Don Slutz wrote:
> > On 12/02/14 09:26, Stefano Stabellini wrote:
> > > On Tue, 2 Dec 2014, Don Slutz wrote:
> > > > On 12/02/14 06:53, Stefano Stabellini wrote:
> > > > > In libxl_set_memory_target when setting the new maxmem, retain the
> > > > > same
> > > > > offset on top of the current target. The offset includes memory
> > > > > allocated by QEMU for rom files.
> > > > > 
> > > > > Signed-off-by: Stefano Stabellini
> > > > > 
> > > > > ---
> > > > > 
> > > > > Changes in v2:
> > > > > - call libxl_domain_info instead of libxl_dominfo_init;
> > > > > - call libxl_domain_info before retry_transaction.
> > > > > 
> > > > > diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
> > > > > index de23fec..569a32a 100644
> > > > > --- a/tools/libxl/libxl.c
> > > > > +++ b/tools/libxl/libxl.c
> > > > > @@ -4694,6 +4694,9 @@ int libxl_set_memory_target(libxl_ctx *ctx,
> > > > > uint32_t
> > > > > domid,
> > > > >char *uuid;
> > > > >xs_transaction_t t;
> > > > >+if (libxl_domain_info(ctx, &ptr, domid) < 0)
> > > > > +goto out_no_transaction;
> > > > > +
> > > > >retry_transaction:
> > > > >t = xs_transaction_start(ctx->xsh);
> > > > >@@ -4767,10 +4770,9 @@ retry_transaction:
> > > > >"%s/memory/videoram", dompath));
> > > > >videoram = videoram_s ? atoi(videoram_s) : 0;
> > > > >-if (enforce) {
> > > > > -memorykb = new_target_memkb;
> > > > > -rc = xc_domain_setmaxmem(ctx->xch, domid, memorykb +
> > > > > -LIBXL_MAXMEM_CONSTANT);
> > > > > +if (enforce && new_target_memkb > 0) {
> > > > > +memorykb = ptr.max_memkb - current_target_memkb +
> > > > > new_target_memkb;
> 
> My testing shows that this should be:
> 
> memorykb = ptr.max_memkb - (current_target_memkb + videoram) +
> new_target_memkb;
> 
> As far as I can tell the reason for this is that memory/target (aka
> current_target_memkb) was set based on:
> 
> new_target_memkb -= videoram;

Thank you very much for testing and the suggestion!

I think that the right fix for this is to remove videoram from
new_target_memkb earlier and only when the new target is absolute,
otherwise we risk removing videoram twice (in case the new target is
relative). I wonder why we didn't notice this before.


diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index d5d5204..4803cc4 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -4744,13 +4744,17 @@ retry_transaction:
 goto out;
 }
 
+videoram_s = libxl__xs_read(gc, t, libxl__sprintf(gc,
+"%s/memory/videoram", dompath));
+videoram = videoram_s ? atoi(videoram_s) : 0;
+
 if (relative) {
 if (target_memkb < 0 && abs(target_memkb) > current_target_memkb)
 new_target_memkb = 0;
 else
 new_target_memkb = current_target_memkb + target_memkb;
 } else
-new_target_memkb = target_memkb;
+new_target_memkb = target_memkb - videoram;
 if (new_target_memkb > memorykb) {
 LIBXL__LOG(ctx, LIBXL__LOG_ERROR,
 "memory_dynamic_max must be less than or equal to"
@@ -4766,9 +4770,6 @@ retry_transaction:
 abort_transaction = 1;
 goto out;
 }
-videoram_s = libxl__xs_read(gc, t, libxl__sprintf(gc,
-"%s/memory/videoram", dompath));
-videoram = videoram_s ? atoi(videoram_s) : 0;
 
 if (enforce && new_target_memkb > 0) {
 memorykb = ptr.max_memkb - current_target_memkb + new_target_memkb;
@@ -4782,7 +4783,6 @@ retry_transaction:
 }
 }
 
-new_target_memkb -= videoram;
 rc = xc_domain_set_pod_target(ctx->xch, domid,
 new_target_memkb / 4, NULL, NULL, NULL);
 if (rc != 0) {
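
The effect of the patch above on the target value can be sketched in isolation. This is a minimal illustration: compute_new_target() is a hypothetical helper, not a libxl function, and it assumes that the stored current target is already net of videoram, so only an absolute request needs the subtraction.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: derive the new balloon target in KiB.
 * current_target_memkb is assumed to be net of videoram already.
 */
static int64_t compute_new_target(int64_t current_target_memkb,
                                  int64_t target_memkb,
                                  int64_t videoram,
                                  bool relative)
{
    if (relative) {
        /* Clamp at zero when shrinking by more than is currently set. */
        if (target_memkb < 0 && -target_memkb > current_target_memkb)
            return 0;
        /* No videoram adjustment: it was removed when the target was set. */
        return current_target_memkb + target_memkb;
    }

    /* Absolute request: remove videoram exactly once. */
    return target_memkb - videoram;
}
```

Subtracting videoram after this point for both cases, as the old code did, would remove it a second time for relative requests.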


[Xen-devel] 4.5.0-rc3

2014-12-03 Thread Ian Jackson

Konrad has tagged 4.5.0-rc3.

The tag has been pushed to xen.git, along with the force push of the
qemu tag to master.  I have merged the result into staging.  The
tarball is in the expected place.

Thanks, Konrad.

Ian.


[Xen-devel] [PATCH v4 5/9] libxc: support XEN_DOMCTL_devour

2014-12-03 Thread Vitaly Kuznetsov

Introduce new xc_domain_devour() function to support XEN_DOMCTL_devour.

Signed-off-by: Vitaly Kuznetsov 
---
 tools/libxc/include/xenctrl.h | 14 ++
 tools/libxc/xc_domain.c   | 13 +
 2 files changed, 27 insertions(+)

diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h
index 0ad8b8d..a789de3 100644
--- a/tools/libxc/include/xenctrl.h
+++ b/tools/libxc/include/xenctrl.h
@@ -558,6 +558,20 @@ int xc_domain_unpause(xc_interface *xch,
 int xc_domain_destroy(xc_interface *xch,
   uint32_t domid);
 
+/**
+ * This function sets a 'recipient' domain for a domain (when the source domain
+ * releases memory it is reassigned to the recipient domain instead of
+ * being freed) and kills the original domain. The destination domain is supposed
+ * to have enough max_mem and no pages assigned.
+ *
+ * @parm xch a handle to an open hypervisor interface
+ * @parm domid the source domain id
+ * @parm recipient the destination domain id
+ * @return 0 on success, -1 on failure
+ */
+int xc_domain_devour(xc_interface *xch,
+ uint32_t domid, uint32_t recipient);
+
 
 /**
  * This function resumes a suspended domain. The domain should have
diff --git a/tools/libxc/xc_domain.c b/tools/libxc/xc_domain.c
index b864872..5949725 100644
--- a/tools/libxc/xc_domain.c
+++ b/tools/libxc/xc_domain.c
@@ -122,6 +122,19 @@ int xc_domain_destroy(xc_interface *xch,
 return ret;
 }
 
+int xc_domain_devour(xc_interface *xch, uint32_t domid, uint32_t recipient)
+{
+int ret;
+DECLARE_DOMCTL;
+domctl.cmd = XEN_DOMCTL_devour;
+domctl.domain = (domid_t)domid;
+domctl.u.devour.recipient = (domid_t)recipient;
+do {
+ret = do_domctl(xch, &domctl);
+} while ( ret && (errno == EAGAIN) );
+return ret;
+}
+
 int xc_domain_shutdown(xc_interface *xch,
uint32_t domid,
int reason)
-- 
1.9.3
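
The retry loop used by xc_domain_devour() can be sketched as a standalone pattern. The names retry_on_eagain() and flaky_op() are hypothetical; flaky_op() merely simulates an operation that reports EAGAIN a few times before succeeding and does not model the hypercall itself.

```c
#include <assert.h>
#include <errno.h>

typedef int (*op_fn)(void *arg);

/* Repeat an operation for as long as it fails with errno == EAGAIN. */
static int retry_on_eagain(op_fn op, void *arg)
{
    int ret;

    do {
        ret = op(arg);
    } while (ret && errno == EAGAIN);

    return ret;
}

/* Hypothetical operation: fails with EAGAIN until the counter hits zero. */
static int flaky_op(void *arg)
{
    int *remaining = arg;

    if (*remaining > 0) {
        (*remaining)--;
        errno = EAGAIN;
        return -1;
    }
    return 0;
}
```

Any failure other than EAGAIN ends the loop at once and is returned to the caller unchanged.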



[Xen-devel] [PATCH v4 9/9] xsm: add XEN_DOMCTL_devour support

2014-12-03 Thread Vitaly Kuznetsov

Signed-off-by: Vitaly Kuznetsov 
---
 xen/common/domctl.c |  6 ++
 xen/include/xsm/dummy.h |  6 ++
 xen/include/xsm/xsm.h   |  6 ++
 xen/xsm/dummy.c |  1 +
 xen/xsm/flask/hooks.c   | 17 +
 xen/xsm/flask/policy/access_vectors | 10 ++
 6 files changed, 46 insertions(+)

diff --git a/xen/common/domctl.c b/xen/common/domctl.c
index 7e7fb47..7c22e35 100644
--- a/xen/common/domctl.c
+++ b/xen/common/domctl.c
@@ -1190,6 +1190,12 @@ long do_domctl(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl)
 break;
 }
 
+ret = xsm_devour(XSM_HOOK, d, recipient_dom);
+if ( ret ) {
+put_domain(recipient_dom);
+break;
+}
+
 if ( recipient_dom->tot_pages != 0 )
 {
 put_domain(recipient_dom);
diff --git a/xen/include/xsm/dummy.h b/xen/include/xsm/dummy.h
index f20e89c..6e9e38b 100644
--- a/xen/include/xsm/dummy.h
+++ b/xen/include/xsm/dummy.h
@@ -113,6 +113,12 @@ static XSM_INLINE int xsm_set_target(XSM_DEFAULT_ARG struct domain *d, struct do
 return xsm_default_action(action, current->domain, NULL);
 }
 
+static XSM_INLINE int xsm_devour(XSM_DEFAULT_ARG struct domain *d, struct domain *e)
+{
+XSM_ASSERT_ACTION(XSM_HOOK);
+return xsm_default_action(action, current->domain, NULL);
+}
+
 static XSM_INLINE int xsm_domctl(XSM_DEFAULT_ARG struct domain *d, int cmd)
 {
 XSM_ASSERT_ACTION(XSM_OTHER);
diff --git a/xen/include/xsm/xsm.h b/xen/include/xsm/xsm.h
index 4ce089f..7db7433 100644
--- a/xen/include/xsm/xsm.h
+++ b/xen/include/xsm/xsm.h
@@ -58,6 +58,7 @@ struct xsm_operations {
 int (*domctl_scheduler_op) (struct domain *d, int op);
 int (*sysctl_scheduler_op) (int op);
 int (*set_target) (struct domain *d, struct domain *e);
+int (*devour) (struct domain *d, struct domain *e);
 int (*domctl) (struct domain *d, int cmd);
 int (*sysctl) (int cmd);
 int (*readconsole) (uint32_t clear);
@@ -213,6 +214,11 @@ static inline int xsm_set_target (xsm_default_t def, struct domain *d, struct do
 return xsm_ops->set_target(d, e);
 }
 
+static inline int xsm_devour (xsm_default_t def, struct domain *d, struct domain *r)
+{
+return xsm_ops->devour(d, r);
+}
+
 static inline int xsm_domctl (xsm_default_t def, struct domain *d, int cmd)
 {
 return xsm_ops->domctl(d, cmd);
diff --git a/xen/xsm/dummy.c b/xen/xsm/dummy.c
index 8eb3050..f3c2f9e 100644
--- a/xen/xsm/dummy.c
+++ b/xen/xsm/dummy.c
@@ -35,6 +35,7 @@ void xsm_fixup_ops (struct xsm_operations *ops)
 set_to_dummy_if_null(ops, domctl_scheduler_op);
 set_to_dummy_if_null(ops, sysctl_scheduler_op);
 set_to_dummy_if_null(ops, set_target);
+set_to_dummy_if_null(ops, devour);
 set_to_dummy_if_null(ops, domctl);
 set_to_dummy_if_null(ops, sysctl);
 set_to_dummy_if_null(ops, readconsole);
diff --git a/xen/xsm/flask/hooks.c b/xen/xsm/flask/hooks.c
index d48463f..097c8c2 100644
--- a/xen/xsm/flask/hooks.c
+++ b/xen/xsm/flask/hooks.c
@@ -565,6 +565,21 @@ static int flask_set_target(struct domain *d, struct domain *t)
 return rc;
 }
 
+static int flask_devour(struct domain *d, struct domain *r)
+{
+int rc;
+struct domain_security_struct *dsec, *rsec;
+dsec = d->ssid;
+rsec = r->ssid;
+
+rc = current_has_perm(d, SECCLASS_DOMAIN2, DOMAIN2__SET_AS_SOURCE);
+if ( rc )
+return rc;
+if ( r )
+rc = current_has_perm(r, SECCLASS_DOMAIN2, DOMAIN2__SET_AS_RECIPIENT);
+return rc;
+}
+
 static int flask_domctl(struct domain *d, int cmd)
 {
 switch ( cmd )
@@ -580,6 +595,7 @@ static int flask_domctl(struct domain *d, int cmd)
 #ifdef HAS_MEM_ACCESS
 case XEN_DOMCTL_mem_event_op:
 #endif
+case XEN_DOMCTL_devour:
 #ifdef CONFIG_X86
 /* These have individual XSM hooks (arch/x86/domctl.c) */
 case XEN_DOMCTL_shadow_op:
@@ -1512,6 +1528,7 @@ static struct xsm_operations flask_ops = {
 .domctl_scheduler_op = flask_domctl_scheduler_op,
 .sysctl_scheduler_op = flask_sysctl_scheduler_op,
 .set_target = flask_set_target,
+.devour = flask_devour,
 .domctl = flask_domctl,
 .sysctl = flask_sysctl,
 .readconsole = flask_readconsole,
diff --git a/xen/xsm/flask/policy/access_vectors b/xen/xsm/flask/policy/access_vectors
index 1da9f63..64c3424 100644
--- a/xen/xsm/flask/policy/access_vectors
+++ b/xen/xsm/flask/policy/access_vectors
@@ -142,6 +142,8 @@ class domain
 #  target = the new target domain
 # see also the domain2 make_priv_for and set_as_target checks
 set_target
+# XEN_DOMCTL_devour
+devour
 # SCHEDOP_remote_shutdown
 shutdown
 # XEN_DOMCTL_set{,_machine}_address_size
@@ -196,6 +198,14 @@ class domain2
 #  source = the domain making the hypercall
 #  target = the new target domain
 set_as_target
+# checked in XEN_DOMCTL_devour:
+#  source = the domain making the hypercall
+#  target

[Xen-devel] [PATCH v4 4/9] xen: introduce XEN_DOMCTL_devour

2014-12-03 Thread Vitaly Kuznetsov

The new operation sets the 'recipient' domain, which will receive all
memory pages from a particular domain, and kills the original domain.

Signed-off-by: Vitaly Kuznetsov 
---
 xen/common/domain.c |  3 +++
 xen/common/domctl.c | 33 +
 xen/common/page_alloc.c | 28 
 xen/include/public/domctl.h | 15 +++
 xen/include/xen/sched.h |  2 ++
 5 files changed, 77 insertions(+), 4 deletions(-)

diff --git a/xen/common/domain.c b/xen/common/domain.c
index c13a7cf..f26267a 100644
--- a/xen/common/domain.c
+++ b/xen/common/domain.c
@@ -825,6 +825,9 @@ static void complete_domain_destroy(struct rcu_head *head)
 if ( d->target != NULL )
 put_domain(d->target);
 
+if ( d->recipient != NULL )
+put_domain(d->recipient);
+
 evtchn_destroy_final(d);
 
 radix_tree_destroy(&d->pirq_tree, free_pirq_struct);
diff --git a/xen/common/domctl.c b/xen/common/domctl.c
index f15dcfe..7e7fb47 100644
--- a/xen/common/domctl.c
+++ b/xen/common/domctl.c
@@ -1177,6 +1177,39 @@ long do_domctl(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl)
 }
 break;
 
+case XEN_DOMCTL_devour:
+{
+struct domain *recipient_dom;
+
+if ( !d->recipient )
+{
+recipient_dom = get_domain_by_id(op->u.devour.recipient);
+if ( recipient_dom == NULL )
+{
+ret = -ESRCH;
+break;
+}
+
+if ( recipient_dom->tot_pages != 0 )
+{
+put_domain(recipient_dom);
+ret = -EINVAL;
+break;
+}
+/*
+ * Make sure no allocation/remapping is ongoing and set is_dying
+ * flag to prevent such actions in future.
+ */
+spin_lock(&d->page_alloc_lock);
+d->is_dying = DOMDYING_locked;
+d->recipient = recipient_dom;
+smp_wmb(); /* make sure recipient was set before domain_kill() */
+spin_unlock(&d->page_alloc_lock);
+}
+ret = domain_kill(d);
+}
+break;
+
 default:
 ret = arch_do_domctl(op, d, u_domctl);
 break;
diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
index 7b4092d..7eb4404 100644
--- a/xen/common/page_alloc.c
+++ b/xen/common/page_alloc.c
@@ -1707,6 +1707,7 @@ void free_domheap_pages(struct page_info *pg, unsigned int order)
 {
 struct domain *d = page_get_owner(pg);
 unsigned int i;
+unsigned long mfn, gmfn;
 bool_t drop_dom_ref;
 
 ASSERT(!in_irq());
@@ -1764,13 +1765,32 @@ void free_domheap_pages(struct page_info *pg, unsigned int order)
 scrub = 1;
 }
 
-if ( unlikely(scrub) )
-for ( i = 0; i < (1 << order); i++ )
-scrub_one_page(&pg[i]);
+if ( !d || !d->recipient || d->recipient->is_dying )
+{
+if ( unlikely(scrub) )
+for ( i = 0; i < (1 << order); i++ )
+scrub_one_page(&pg[i]);
 
-free_heap_pages(pg, order);
+free_heap_pages(pg, order);
+}
+else
+{
+mfn = page_to_mfn(pg);
+gmfn = mfn_to_gmfn(d, mfn);
+
+page_set_owner(pg, NULL);
+if ( assign_pages(d->recipient, pg, order, 0) )
+/* assign_pages reports the error by itself */
+goto out;
+
+if ( guest_physmap_add_page(d->recipient, gmfn, mfn, order) )
+printk(XENLOG_G_INFO
+   "Failed to add MFN %lx (GFN %lx) to Dom%d's physmap\n",
+   mfn, gmfn, d->recipient->domain_id);
+}
 }
 
+out:
 if ( drop_dom_ref )
 put_domain(d);
 }
diff --git a/xen/include/public/domctl.h b/xen/include/public/domctl.h
index 57e2ed7..871fa5e 100644
--- a/xen/include/public/domctl.h
+++ b/xen/include/public/domctl.h
@@ -995,6 +995,19 @@ struct xen_domctl_psr_cmt_op {
 typedef struct xen_domctl_psr_cmt_op xen_domctl_psr_cmt_op_t;
 DEFINE_XEN_GUEST_HANDLE(xen_domctl_psr_cmt_op_t);
 
+/*
+ * XEN_DOMCTL_devour - kills the domain reassigning all of its domheap pages
+ * to the 'recipient' domain. Pages from xen heap belonging to the domain
+ * are not copied. Reassigned pages are mapped to the same GMFNs in the
+ * recipient domain as they were mapped in the original. The recipient domain
+ * is supposed to not have any domheap pages to avoid MFN-GMFN collisions.
+ */
+struct xen_domctl_devour {
+domid_t recipient;
+};
+typedef struct xen_domctl_devour xen_domctl_devour_t;
+DEFINE_XEN_GUEST_HANDLE(xen_domctl_devour_t);
+
 struct xen_domctl {
 uint32_t cmd;
 #define XEN_DOMCTL_createdomain   1
@@ -1070,6 +1083,7 @@ struct xen_domctl {
 #define XEN_DOMCTL_setvnumainfo  74
 #define XEN_DOMCTL_psr_cmt_op75
 #define XEN_DOMCTL_arm_configure_domain  76
+#define XEN_DOMCTL
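
The page handover performed in free_domheap_pages() can be modelled with a toy sketch. Everything here is hypothetical (struct toy_domain, release_page(), the heap counter); it only mirrors the decision in the patch: a released page goes back to the heap unless a live recipient is set, in which case it is reassigned.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of a domain; field names loosely follow the patch. */
struct toy_domain {
    int id;
    bool is_dying;
    unsigned long tot_pages;
    struct toy_domain *recipient;
};

/*
 * Release one page owned by d. Returns true if the page was handed to the
 * recipient, false if it was returned to the (counted) free heap.
 */
static bool release_page(struct toy_domain *d, unsigned long *heap_free)
{
    if (d->tot_pages > 0)
        d->tot_pages--;

    /* No recipient, or the recipient is itself going away: free normally. */
    if (d->recipient == NULL || d->recipient->is_dying) {
        (*heap_free)++;
        return false;
    }

    /* Otherwise the page changes owner instead of being freed. */
    d->recipient->tot_pages++;
    return true;
}
```

The real patch additionally re-adds the page to the recipient's physmap at the same GMFN, which this model leaves out.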

[Xen-devel] [PATCH v4 0/9] toolstack-based approach to pvhvm guest kexec

2014-12-03 Thread Vitaly Kuznetsov

Changes from RFCv3:
This is the first non-RFC series as no major concerns were expressed. I'm trying
to address Jan's comments. Changes are:
- Move from XEN_DOMCTL_set_recipient to XEN_DOMCTL_devour (I don't really like
  the name but nothing more appropriate came to my mind) which incorporates
  former XEN_DOMCTL_set_recipient and XEN_DOMCTL_destroydomain to prevent
  original domain from changing its allocations during transfer procedure.
- Check in free_domheap_pages() that assign_pages() succeeded.
- Change printk() in free_domheap_pages().
- DOMDYING_locked state was introduced to support XEN_DOMCTL_devour.
- xc_domain_soft_reset() got simplified a bit. Now we just wait for the original
  domain to die or lose all its pages.
- rebased on top of current master branch.

Changes from RFC/WIPv2:

Here is a slightly different approach to memory reassignment. Instead of
introducing a new (and very doubtful) XENMEM_transfer operation, introduce a
simple XEN_DOMCTL_set_recipient operation and do everything in the
free_domheap_pages() handler, utilizing the normal domain destroy path. This is
better because:
- The approach is general enough
- All memory pages are usually being freed when the domain is destroyed
- No special grants handling required
- Better supportability

With regards to PV:
Though XEN_DOMCTL_set_recipient works for both PV and HVM, this patchset does not
bring PV kexec/kdump support. xc_domain_soft_reset() is limited to work with HVM
domains only. The main reason for that is: it is (in theory) possible to save the
p2m and rebuild it with the new domain, but that would only allow us to resume
execution from where we stopped. If we want to execute a new kernel we need to
build the same kernel/initrd/bootstrap_pagetables/... structure we build to boot
a PV domain initially. That, however, would destroy the original domain's memory,
thus making kdump impossible. To make everything work, additional support from
kexec userspace/linux kernel is required and I'm not sure it makes sense to
implement all this stuff in the light of PVH.

Original description:

When a PVHVM linux guest performs kexec there are lots of things which
require taking care of:
- shared info, vcpu_info
- grants
- event channels
- ...
Instead of taking care of all these things, we can rebuild the domain
performing kexec from scratch, doing a so-called soft reboot.

The idea was suggested by David Vrabel, Jan Beulich, and Konrad Rzeszutek Wilk.

P.S. The patch series can be tested with PVHVM Linux guest with the following
modifications:

diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
index c0cb11f..33c5cdd 100644
--- a/arch/x86/xen/enlighten.c
+++ b/arch/x86/xen/enlighten.c
@@ -33,6 +33,10 @@
 #include 
 #include 

+#ifdef CONFIG_KEXEC
+#include 
+#endif
+
 #include 
 #include 
 #include 
@@ -1810,6 +1814,22 @@ static struct notifier_block xen_hvm_cpu_notifier = {
   .notifier_call   = xen_hvm_cpu_notify,
 };

+#ifdef CONFIG_KEXEC
+static void xen_pvhvm_kexec_shutdown(void)
+{
+   native_machine_shutdown();
+   if (kexec_in_progress) {
+  xen_reboot(SHUTDOWN_soft_reset);
+  }
+}
+
+static void xen_pvhvm_crash_shutdown(struct pt_regs *regs)
+{
+   native_machine_crash_shutdown(regs);
+   xen_reboot(SHUTDOWN_soft_reset);
+}
+#endif
+
 static void __init xen_hvm_guest_init(void)
 {
init_hvm_pv_info();
@@ -1826,6 +1846,10 @@ static void __init xen_hvm_guest_init(void)
   x86_init.irqs.intr_init = xen_init_IRQ;
   xen_hvm_init_time_ops();
   xen_hvm_init_mmu_ops();
+#ifdef CONFIG_KEXEC
+   machine_ops.shutdown = xen_pvhvm_kexec_shutdown;
+   machine_ops.crash_shutdown = xen_pvhvm_crash_shutdown;
+#endif
 }

 static bool xen_nopv = false;
diff --git a/include/xen/interface/sched.h b/include/xen/interface/sched.h
index 9ce0839..b5942a8 100644
--- a/include/xen/interface/sched.h
+++ b/include/xen/interface/sched.h
@@ -107,5 +107,6 @@ struct sched_watchdog {
 #define SHUTDOWN_suspend    2  /* Clean up, save suspend info, kill. */
 #define SHUTDOWN_crash  3  /* Tell controller we've crashed. */
 #define SHUTDOWN_watchdog   4  /* Restart because watchdog time expired. */
+#define SHUTDOWN_soft_reset 5  /* Soft-reset for kexec.  */

 #endif /* __XEN_PUBLIC_SCHED_H__ */

Vitaly Kuznetsov (9):
  xen: introduce DOMDYING_locked state
  xen: introduce SHUTDOWN_soft_reset shutdown reason
  libxl: support SHUTDOWN_soft_reset shutdown reason
  xen: introduce XEN_DOMCTL_devour
  libxc: support XEN_DOMCTL_devour
  libxl: add libxl__domain_soft_reset_destroy_old()
  libxc: introduce soft reset for HVM domains
  libxl: soft reset support
  xsm: add XEN_DOMCTL_devour support

 tools/libxc/Makefile|   1 +
 tools/libxc/include/xenctrl.h   |  14 ++
 tools/libxc/include/xenguest.h  |  20 +++
 tools/libxc/xc_domain.c |  13 ++
 tools/libxc/xc_domain_soft_reset.c  | 282 
 tools/libxl/libxl.c

[Xen-devel] [PATCH v4 1/9] xen: introduce DOMDYING_locked state

2014-12-03 Thread Vitaly Kuznetsov

A new dying state is required to indicate that a particular domain
is dying but the cleanup procedure has not started yet. This state can be
set from outside of domain_kill().
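
The state ordering described above can be illustrated with a small standalone model. This is only a sketch under stated assumptions: `toy_domain`, `toy_domain_lock()` and `toy_domain_kill()` are hypothetical names used for illustration and are not Xen functions, and the real domain_kill() also pauses the domain and relinquishes its resources.

```c
#include <stdbool.h>

/* Mirrors the order of the enum values introduced by the patch. */
enum toy_dying { TOY_alive, TOY_locked, TOY_dying, TOY_dead };

struct toy_domain {
    enum toy_dying is_dying;
};

/*
 * Hypothetical helper: mark a domain as dying before any cleanup has
 * started. Only a domain that is still alive can be locked.
 */
static bool toy_domain_lock(struct toy_domain *d)
{
    if ( d->is_dying != TOY_alive )
        return false;
    d->is_dying = TOY_locked;
    return true;
}

/*
 * Simplified kill path: both the alive and the locked state enter the
 * dying state, after which cleanup completes and the domain is dead.
 */
static void toy_domain_kill(struct toy_domain *d)
{
    switch ( d->is_dying )
    {
    case TOY_alive:
    case TOY_locked:
        d->is_dying = TOY_dying;
        /* fall through */
    case TOY_dying:
        d->is_dying = TOY_dead;
        break;
    case TOY_dead:
        break;
    }
}
```

The point of the extra case label is that a domain locked from outside can still be torn down through the normal kill path.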

Signed-off-by: Vitaly Kuznetsov 
---
 xen/common/domain.c | 1 +
 xen/include/xen/sched.h | 3 ++-
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/xen/common/domain.c b/xen/common/domain.c
index 4a62c1d..c13a7cf 100644
--- a/xen/common/domain.c
+++ b/xen/common/domain.c
@@ -603,6 +603,7 @@ int domain_kill(struct domain *d)
 switch ( d->is_dying )
 {
 case DOMDYING_alive:
+case DOMDYING_locked:
 domain_pause(d);
 d->is_dying = DOMDYING_dying;
 spin_barrier(&d->domain_lock);
diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index 46fc6e3..a42d0b8 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -369,7 +369,8 @@ struct domain
 /* Is this guest being debugged by dom0? */
 bool_t   debugger_attached;
 /* Is this guest dying (i.e., a zombie)? */
-enum { DOMDYING_alive, DOMDYING_dying, DOMDYING_dead } is_dying;
+enum { DOMDYING_alive, DOMDYING_locked, DOMDYING_dying, DOMDYING_dead }
+is_dying;
 /* Domain is paused by controller software? */
 int  controller_pause_count;
 /* Domain's VCPUs are pinned 1:1 to physical CPUs? */
-- 
1.9.3



[Xen-devel] [PATCH v4 7/9] libxc: introduce soft reset for HVM domains

2014-12-03 Thread Vitaly Kuznetsov

Add new xc_domain_soft_reset() function which performs so-called 'soft reset'
for an HVM domain. It is being performed in the following way:
- Save HVM context and all HVM params;
- Devour original domain with XEN_DOMCTL_devour;
- Wait till original domain dies or has no pages left;
- Restore HVM context, HVM params, seed grant table.

After that the domain resumes execution from where SHUTDOWN_soft_reset was
called.
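
The four steps listed above can be sketched as a toy model. This is not the libxc implementation: `toy_dom`, `toy_devour()` and `toy_soft_reset()` are hypothetical stand-ins, the "HVM context" is just a byte array, and the devour step is modelled as synchronous, whereas the real function would have to poll the hypervisor while waiting.

```c
#include <stdbool.h>
#include <string.h>

#define TOY_CTX_SIZE 16

struct toy_dom {
    bool alive;
    unsigned long nr_pages;
    unsigned char hvm_ctx[TOY_CTX_SIZE];
};

/* Stand-in for XEN_DOMCTL_devour: hand all pages over, source dies. */
static void toy_devour(struct toy_dom *src, struct toy_dom *dst)
{
    dst->nr_pages += src->nr_pages;
    src->nr_pages = 0;
    src->alive = false;
}

static int toy_soft_reset(struct toy_dom *src, struct toy_dom *dst)
{
    unsigned char saved[TOY_CTX_SIZE];

    /* The recipient must not own pages yet, the source must be alive. */
    if ( !src->alive || dst->nr_pages != 0 )
        return -1;

    /* 1. Save the HVM context (params are omitted in this model). */
    memcpy(saved, src->hvm_ctx, sizeof(saved));

    /* 2. Devour the original domain. */
    toy_devour(src, dst);

    /* 3. Wait until the original domain died or has no pages left. */
    while ( src->alive && src->nr_pages != 0 )
        ;

    /* 4. Restore the saved context into the new domain. */
    memcpy(dst->hvm_ctx, saved, sizeof(saved));
    return 0;
}
```

Saving the context before the devour step matters in this model because the source domain is no longer usable afterwards.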

Signed-off-by: Vitaly Kuznetsov 
---
 tools/libxc/Makefile   |   1 +
 tools/libxc/include/xenguest.h |  20 +++
 tools/libxc/xc_domain_soft_reset.c | 282 +
 3 files changed, 303 insertions(+)
 create mode 100644 tools/libxc/xc_domain_soft_reset.c

diff --git a/tools/libxc/Makefile b/tools/libxc/Makefile
index bd2ca6c..8f8abd6 100644
--- a/tools/libxc/Makefile
+++ b/tools/libxc/Makefile
@@ -52,6 +52,7 @@ GUEST_SRCS-y += xc_offline_page.c xc_compression.c
 else
 GUEST_SRCS-y += xc_nomigrate.c
 endif
+GUEST_SRCS-y += xc_domain_soft_reset.c
 
 vpath %.c ../../xen/common/libelf
 CFLAGS += -I../../xen/common/libelf
diff --git a/tools/libxc/include/xenguest.h b/tools/libxc/include/xenguest.h
index 40bbac8..770cd10 100644
--- a/tools/libxc/include/xenguest.h
+++ b/tools/libxc/include/xenguest.h
@@ -131,6 +131,26 @@ int xc_domain_restore(xc_interface *xch, int io_fd, uint32_t dom,
  * of the new domain is automatically appended to the filename,
  * separated by a ".".
  */
+
+/**
+ * This function does soft reset for a domain. During soft reset all
+ * source domain's memory is being reassigned to the destination domain,
+ * HVM context and HVM params are being copied.
+ *
+ * @parm xch a handle to an open hypervisor interface
+ * @parm source_dom the id of the source domain
+ * @parm dest_dom the id of the destination domain
+ * @parm console_domid the id of the domain handling console
+ * @parm console_mfn returned with the mfn of the console page
+ * @parm store_domid the id of the domain handling store
+ * @parm store_mfn returned with the mfn of the store page
+ * @return 0 on success, -1 on failure
+ */
+int xc_domain_soft_reset(xc_interface *xch, uint32_t source_dom,
+ uint32_t dest_dom, domid_t console_domid,
+ unsigned long *console_mfn, domid_t store_domid,
+ unsigned long *store_mfn);
+
 #define XC_DEVICE_MODEL_RESTORE_FILE "/var/lib/xen/qemu-resume"
 
 /**
diff --git a/tools/libxc/xc_domain_soft_reset.c b/tools/libxc/xc_domain_soft_reset.c
new file mode 100644
index 0000000..24d0b48
--- /dev/null
+++ b/tools/libxc/xc_domain_soft_reset.c
@@ -0,0 +1,282 @@
+/**
+ * xc_domain_soft_reset.c
+ *
+ * Do soft reset.
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation;
+ * version 2.1 of the License.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301  USA
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "xc_private.h"
+#include "xc_core.h"
+#include "xc_bitops.h"
+#include "xc_dom.h"
+#include "xg_private.h"
+#include "xg_save_restore.h"
+
+#include 
+
+#define SLEEP_INT 1
+
+int xc_domain_soft_reset(xc_interface *xch, uint32_t source_dom,
+ uint32_t dest_dom, domid_t console_domid,
+ unsigned long *console_mfn, domid_t store_domid,
+ unsigned long *store_mfn)
+{
+xc_dominfo_t old_info, new_info;
+int rc = 1;
+
+uint32_t hvm_buf_size = 0;
+uint8_t *hvm_buf = NULL;
+unsigned long console_pfn, store_pfn, io_pfn, buffio_pfn;
+unsigned long max_gpfn;
+uint64_t hvm_params[HVM_NR_PARAMS];
+xen_pfn_t sharedinfo_pfn;
+
+DPRINTF("%s: soft reset domid %u -> %u", __func__, source_dom, dest_dom);
+
+if ( xc_domain_getinfo(xch, source_dom, 1, &old_info) != 1 )
+{
+PERROR("Could not get old domain info");
+return 1;
+}
+
+if ( xc_domain_getinfo(xch, dest_dom, 1, &new_info) != 1 )
+{
+PERROR("Could not get new domain info");
+return 1;
+}
+
+if ( !old_info.hvm || !new_info.hvm )
+{
+PERROR("Soft reset is supported for HVM only");
+return 1;
+}
+
+max_gpfn = xc_domain_maximum_gpfn(xch, source_dom);
+
+sharedinfo_pfn = old_info.shared_info_frame;
+if ( xc_get_pfn_type_batch(xch, source_dom, 1, &sharedinfo_pfn) )
+{
+PERR

[Xen-devel] [PATCH v4 3/9] libxl: support SHUTDOWN_soft_reset shutdown reason

2014-12-03 Thread Vitaly Kuznetsov

Use letter 't' to indicate a domain in such state.
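
As a minimal sketch of how such a letter table is typically indexed: the table string below is the one from this patch, while `shutdown_reason_letter()` is a hypothetical helper written for illustration, including its '?' fallback for codes outside the table.

```c
/* Letter table from the patch: the index is the shutdown reason code. */
static const char shutdown_reason_letters[] = "-rscwt";

/*
 * Hypothetical helper: map a shutdown reason code to its one-letter
 * flag, returning '?' for codes that have no entry in the table.
 */
static char shutdown_reason_letter(unsigned int reason)
{
    /* sizeof counts the trailing NUL, so valid indexes end at size - 2. */
    if ( reason >= sizeof(shutdown_reason_letters) - 1 )
        return '?';
    return shutdown_reason_letters[reason];
}
```

With the extended table, reason code 5 (soft_reset) maps to 't', while any larger code still falls back to '?'.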

Signed-off-by: Vitaly Kuznetsov 
---
 tools/libxl/libxl_types.idl   | 1 +
 tools/libxl/xl_cmdimpl.c  | 2 +-
 tools/python/xen/lowlevel/xl/xl.c | 1 +
 3 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index f7fc695..4a0e2be 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -175,6 +175,7 @@ libxl_shutdown_reason = Enumeration("shutdown_reason", [
 (2, "suspend"),
 (3, "crash"),
 (4, "watchdog"),
+(5, "soft_reset"),
 ], init_val = "LIBXL_SHUTDOWN_REASON_UNKNOWN")
 
 libxl_vga_interface_type = Enumeration("vga_interface_type", [
diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
index 0e754e7..53611dc 100644
--- a/tools/libxl/xl_cmdimpl.c
+++ b/tools/libxl/xl_cmdimpl.c
@@ -3497,7 +3497,7 @@ static void list_domains(int verbose, int context, int claim, int numa,
  const libxl_dominfo *info, int nb_domain)
 {
 int i;
-static const char shutdown_reason_letters[]= "-rscw";
+static const char shutdown_reason_letters[]= "-rscwt";
 libxl_bitmap nodemap;
 libxl_physinfo physinfo;
 
diff --git a/tools/python/xen/lowlevel/xl/xl.c b/tools/python/xen/lowlevel/xl/xl.c
index 32f982a..7c61160 100644
--- a/tools/python/xen/lowlevel/xl/xl.c
+++ b/tools/python/xen/lowlevel/xl/xl.c
@@ -784,6 +784,7 @@ PyMODINIT_FUNC initxl(void)
 _INT_CONST_LIBXL(m, SHUTDOWN_REASON_SUSPEND);
 _INT_CONST_LIBXL(m, SHUTDOWN_REASON_CRASH);
 _INT_CONST_LIBXL(m, SHUTDOWN_REASON_WATCHDOG);
+_INT_CONST_LIBXL(m, SHUTDOWN_REASON_SOFT_RESET);
 
 genwrap__init(m);
 }
-- 
1.9.3



[Xen-devel] [PATCH v4 8/9] libxl: soft reset support

2014-12-03 Thread Vitaly Kuznetsov

Perform a soft reset when a domain has shut down with SHUTDOWN_soft_reset.
Migrate the content with xc_domain_soft_reset(), then reload the device model
and toolstack state.

Signed-off-by: Vitaly Kuznetsov 
---
 tools/libxl/libxl.h  |   6 +++
 tools/libxl/libxl_create.c   | 103 +++
 tools/libxl/libxl_internal.h |   4 ++
 tools/libxl/xl_cmdimpl.c |  22 -
 4 files changed, 124 insertions(+), 11 deletions(-)

diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
index 41d6e8d..c802635 100644
--- a/tools/libxl/libxl.h
+++ b/tools/libxl/libxl.h
@@ -919,6 +919,12 @@ int static inline libxl_domain_create_restore_0x040200(
 
 #endif
 
+int libxl_domain_soft_reset(libxl_ctx *ctx, libxl_domain_config *d_config,
+uint32_t *domid, uint32_t domid_old,
+const libxl_asyncop_how *ao_how,
+const libxl_asyncprogress_how *aop_console_how)
+LIBXL_EXTERNAL_CALLERS_ONLY;
+
   /* A progress report will be made via ao_console_how, of type
* domain_create_console_available, when the domain's primary
* console is available and can be connected to.
diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
index 1198225..b1e809b 100644
--- a/tools/libxl/libxl_create.c
+++ b/tools/libxl/libxl_create.c
@@ -25,6 +25,8 @@
 #include 
 #include 
 
+#define INVALID_DOMID ~0
+
 int libxl__domain_create_info_setdefault(libxl__gc *gc,
  libxl_domain_create_info *c_info)
 {
@@ -903,6 +905,9 @@ static void initiate_domain_create(libxl__egc *egc,
 if (restore_fd >= 0) {
 LOG(DEBUG, "restoring, not running bootloader");
 domcreate_bootloader_done(egc, &dcs->bl, 0);
+} else if (dcs->domid_soft_reset != INVALID_DOMID) {
+LOG(DEBUG, "soft reset, not running bootloader\n");
+domcreate_bootloader_done(egc, &dcs->bl, 0);
 } else  {
 LOG(DEBUG, "running bootloader");
 dcs->bl.callback = domcreate_bootloader_done;
@@ -951,6 +956,7 @@ static void domcreate_bootloader_done(libxl__egc *egc,
 libxl_domain_config *const d_config = dcs->guest_config;
 libxl_domain_build_info *const info = &d_config->b_info;
 const int restore_fd = dcs->restore_fd;
+const uint32_t domid_soft_reset = dcs->domid_soft_reset;
 libxl__domain_build_state *const state = &dcs->build_state;
 libxl__srm_restore_autogen_callbacks *const callbacks =
 &dcs->shs.callbacks.restore.a;
@@ -974,7 +980,7 @@ static void domcreate_bootloader_done(libxl__egc *egc,
 dcs->dmss.dm.callback = domcreate_devmodel_started;
 dcs->dmss.callback = domcreate_devmodel_started;
 
-if ( restore_fd < 0 ) {
+if ( (restore_fd < 0) && (domid_soft_reset == INVALID_DOMID) ) {
 rc = libxl__domain_build(gc, d_config, domid, state);
 domcreate_rebuild_done(egc, dcs, rc);
 return;
@@ -1004,14 +1010,74 @@ static void domcreate_bootloader_done(libxl__egc *egc,
 rc = ERROR_INVAL;
 goto out;
 }
-libxl__xc_domain_restore(egc, dcs,
- hvm, pae, superpages);
+if ( restore_fd >= 0 ) {
+libxl__xc_domain_restore(egc, dcs,
+ hvm, pae, superpages);
+} else {
+libxl__xc_domain_soft_reset(egc, dcs);
+}
+
 return;
 
  out:
 libxl__xc_domain_restore_done(egc, dcs, rc, 0, 0);
 }
 
+void libxl__xc_domain_soft_reset(libxl__egc *egc,
+ libxl__domain_create_state *dcs)
+{
+STATE_AO_GC(dcs->ao);
+libxl_ctx *ctx = libxl__gc_owner(gc);
+const uint32_t domid_soft_reset = dcs->domid_soft_reset;
+const uint32_t domid = dcs->guest_domid;
+libxl_domain_config *const d_config = dcs->guest_config;
+libxl_domain_build_info *const info = &d_config->b_info;
+uint8_t *buf;
+uint32_t len;
+uint32_t console_domid, store_domid;
+unsigned long store_mfn, console_mfn;
+int rc;
+struct libxl__domain_suspend_state *dss;
+
+GCNEW(dss);
+
+dss->ao = ao;
+dss->domid = domid_soft_reset;
+dss->dm_savefile = GCSPRINTF("/var/lib/xen/qemu-save.%d",
+ domid_soft_reset);
+
+if (info->type == LIBXL_DOMAIN_TYPE_HVM) {
+rc = libxl__domain_suspend_device_model(gc, dss);
+if (rc) goto out;
+}
+
+console_domid = dcs->build_state.console_domid;
+store_domid = dcs->build_state.store_domid;
+
+libxl__domain_soft_reset_destroy_old(ctx, domid_soft_reset, 0);
+
+rc = xc_domain_soft_reset(ctx->xch, domid_soft_reset, domid, console_domid,
+  &console_mfn, store_domid, &store_mfn);
+if (rc) goto out;
+
+libxl__qmp_cleanup(gc, domid_soft_reset);
+
+dcs->build_state.store_mfn = store_mfn;
+dcs->build_state.console_mfn = console_mfn;
+
+rc = libxl__toolstack_save(domid_soft_reset, &buf, &len, dss);
+if (rc) goto out;
+
+rc = libxl__toolsta

Re: [Xen-devel] [PATCH RESEND] xen/blkfront: improve protection against issuing unsupported REQ_FUA

2014-12-03 Thread Vitaly Kuznetsov

Boris Ostrovsky  writes:

> On 12/01/2014 08:01 AM, Vitaly Kuznetsov wrote:
>> Guard against issuing unsupported REQ_FUA and REQ_FLUSH was introduced
>> in d11e61583 and was factored out into blkif_request_flush_valid() in
>> 0f1ca65ee. However:
>> 1) This check is incomplete. In case we negotiated feature_flush = REQ_FLUSH
>> and flush_op = BLKIF_OP_FLUSH_DISKCACHE (so FUA is unsupported), a FUA request
>> will still pass the check.
>> 2) blkif_request_flush_valid() is misnamed. It is bool but returns true when
>> the request is invalid.
>> 3) When blkif_request_flush_valid() fails, -EIO is returned. It seems that
>> -EOPNOTSUPP is more appropriate here.
>> Fix all of the above issues.
>>
>> This patch is based on the original patch by Laszlo Ersek and a comment by
>> Jeff Moyer.
>>
>> Signed-off-by: Vitaly Kuznetsov 
>> Reviewed-by: Laszlo Ersek 
>
> Reviewed-by: Boris Ostrovsky 
>
> (although, as I mentioned last time, a companion patch to remove
> flush_op would be a good thing to have)
>

Thanks, it is on my todo list but I'm trying to separate this
(potential) bugfix from straight cleanup.

> -boris
>
>> ---
>>   drivers/block/xen-blkfront.c | 14 --
>>   1 file changed, 8 insertions(+), 6 deletions(-)
>>
>> diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c
>> index 5ac312f..2e6c103 100644
>> --- a/drivers/block/xen-blkfront.c
>> +++ b/drivers/block/xen-blkfront.c
>> @@ -582,12 +582,14 @@ static inline void flush_requests(struct blkfront_info *info)
>>  notify_remote_via_irq(info->irq);
>>   }
>>   -static inline bool blkif_request_flush_valid(struct request *req,
>> - struct blkfront_info *info)
>> +static inline bool blkif_request_flush_invalid(struct request *req,
>> +   struct blkfront_info *info)
>>   {
>>  return ((req->cmd_type != REQ_TYPE_FS) ||
>> -((req->cmd_flags & (REQ_FLUSH | REQ_FUA)) &&
>> -!info->flush_op));
>> +((req->cmd_flags & REQ_FLUSH) &&
>> + !(info->feature_flush & REQ_FLUSH)) ||
>> +((req->cmd_flags & REQ_FUA) &&
>> + !(info->feature_flush & REQ_FUA)));
>>   }
>> /*
>> @@ -612,8 +614,8 @@ static void do_blkif_request(struct request_queue *rq)
>>  blk_start_request(req);
>>   -  if (blkif_request_flush_valid(req, info)) {
>> -__blk_end_request_all(req, -EIO);
>> +if (blkif_request_flush_invalid(req, info)) {
>> +__blk_end_request_all(req, -EOPNOTSUPP);
>>  continue;
>>  }
>>   

-- 
  Vitaly


Re: [Xen-devel] [PATCH RESEND] xen/blkfront: improve protection against issuing unsupported REQ_FUA

2014-12-03 Thread Boris Ostrovsky


On 12/01/2014 08:01 AM, Vitaly Kuznetsov wrote:

Guard against issuing unsupported REQ_FUA and REQ_FLUSH was introduced
in d11e61583 and was factored out into blkif_request_flush_valid() in
0f1ca65ee. However:
1) This check is incomplete. In case we negotiated feature_flush = REQ_FLUSH
and flush_op = BLKIF_OP_FLUSH_DISKCACHE (so FUA is unsupported), a FUA request
will still pass the check.
2) blkif_request_flush_valid() is misnamed. It is bool but returns true when
the request is invalid.
3) When blkif_request_flush_valid() fails, -EIO is returned. It seems that
-EOPNOTSUPP is more appropriate here.
Fix all of the above issues.
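
Issue 1) can be reproduced outside the kernel with a small model of the two checks. This is a sketch under assumptions: the `TOY_REQ_*` bit values are made up for illustration (the kernel defines REQ_FLUSH and REQ_FUA itself), the function names are hypothetical, and the cmd_type test is left out because it is unchanged by the patch.

```c
#include <stdbool.h>

/* Illustrative bit values only; not the kernel's definitions. */
#define TOY_REQ_FLUSH (1u << 0)
#define TOY_REQ_FUA   (1u << 1)

/* Old check: any flush-type flag passes as long as some flush_op is set. */
static bool toy_flush_invalid_old(unsigned int cmd_flags,
                                  unsigned int flush_op)
{
    return (cmd_flags & (TOY_REQ_FLUSH | TOY_REQ_FUA)) && !flush_op;
}

/* New check: every requested flag must be in the negotiated feature set. */
static bool toy_flush_invalid_new(unsigned int cmd_flags,
                                  unsigned int feature_flush)
{
    return ((cmd_flags & TOY_REQ_FLUSH) &&
            !(feature_flush & TOY_REQ_FLUSH)) ||
           ((cmd_flags & TOY_REQ_FUA) &&
            !(feature_flush & TOY_REQ_FUA));
}
```

With feature_flush set to only the flush bit and a non-zero flush_op, a FUA request is accepted by the old check and rejected by the new one.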

This patch is based on the original patch by Laszlo Ersek and a comment by
Jeff Moyer.

Signed-off-by: Vitaly Kuznetsov 
Reviewed-by: Laszlo Ersek 


Reviewed-by: Boris Ostrovsky 

(although, as I mentioned last time, a companion patch to remove 
flush_op would be a good thing to have)



-boris


---
  drivers/block/xen-blkfront.c | 14 --
  1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c
index 5ac312f..2e6c103 100644
--- a/drivers/block/xen-blkfront.c
+++ b/drivers/block/xen-blkfront.c
@@ -582,12 +582,14 @@ static inline void flush_requests(struct blkfront_info *info)
notify_remote_via_irq(info->irq);
  }
  
-static inline bool blkif_request_flush_valid(struct request *req,
-struct blkfront_info *info)
+static inline bool blkif_request_flush_invalid(struct request *req,
+  struct blkfront_info *info)
  {
return ((req->cmd_type != REQ_TYPE_FS) ||
-   ((req->cmd_flags & (REQ_FLUSH | REQ_FUA)) &&
-   !info->flush_op));
+   ((req->cmd_flags & REQ_FLUSH) &&
+!(info->feature_flush & REQ_FLUSH)) ||
+   ((req->cmd_flags & REQ_FUA) &&
+!(info->feature_flush & REQ_FUA)));
  }
  
  /*

@@ -612,8 +614,8 @@ static void do_blkif_request(struct request_queue *rq)
  
  		blk_start_request(req);
  
-		if (blkif_request_flush_valid(req, info)) {
-   __blk_end_request_all(req, -EIO);
+   if (blkif_request_flush_invalid(req, info)) {
+   __blk_end_request_all(req, -EOPNOTSUPP);
continue;
}
  




Re: [Xen-devel] [PATCHv1] xen: increase default number of PIRQs for hardware domains

2014-12-03 Thread Andrew Cooper

On 03/12/14 16:04, David Vrabel wrote:
> The default limit for the number of PIRQs for hardware domains (dom0)
> is not sufficient for some (x86) systems.
>
> Since the pirq structures are individually and dynamically allocated,
> the limit for hardware domains may be increased to the number of
> possible IRQs.
>
> The extra_guest_irqs command line option now only allows changes to
> the domU value.  Any argument for dom0 is ignored.
>
> Signed-off-by: David Vrabel 

Reviewed-by: Andrew Cooper 

> ---
>  docs/misc/xen-command-line.markdown |   11 ---
>  xen/common/domain.c |7 +--
>  2 files changed, 5 insertions(+), 13 deletions(-)
>
> diff --git a/docs/misc/xen-command-line.markdown b/docs/misc/xen-command-line.markdown
> index 0866df2..d352031 100644
> --- a/docs/misc/xen-command-line.markdown
> +++ b/docs/misc/xen-command-line.markdown
> @@ -594,15 +594,12 @@ except for debugging purposes.
>  Force or disable use of EFI runtime services.
>  
>  ### extra\_guest\_irqs
> -> `= [][,]`
> +> `= []`
>  
> -> Default: `32,256`
> +> Default: `32`
>  
> -Change the number of PIRQs available for guests.  The optional first number is
> -common for all domUs, while the optional second number (preceded by a comma)
> -is for dom0.  Changing the setting for domU has no impact on dom0 and vice
> -versa.  For example to change dom0 without changing domU, use
> -`extra_guest_irqs=,512`
> +Change the number of PIRQs available for guests. This limit does not
> +apply to hardware domains (dom0).
>  
>  ### flask\_enabled
>  > `= `
> diff --git a/xen/common/domain.c b/xen/common/domain.c
> index 4a62c1d..a88d829 100644
> --- a/xen/common/domain.c
> +++ b/xen/common/domain.c
> @@ -231,14 +231,11 @@ static int late_hwdom_init(struct domain *d)
>  #endif
>  }
>  
> -static unsigned int __read_mostly extra_dom0_irqs = 256;
>  static unsigned int __read_mostly extra_domU_irqs = 32;
>  static void __init parse_extra_guest_irqs(const char *s)
>  {
>  if ( isdigit(*s) )
>  extra_domU_irqs = simple_strtoul(s, &s, 0);
> -if ( *s == ',' && isdigit(*++s) )
> -extra_dom0_irqs = simple_strtoul(s, &s, 0);
>  }
>  custom_param("extra_guest_irqs", parse_extra_guest_irqs);
>  
> @@ -324,10 +321,8 @@ struct domain *domain_create(
>  atomic_inc(&d->pause_count);
>  
>  if ( !is_hardware_domain(d) )
> -d->nr_pirqs = nr_static_irqs + extra_domU_irqs;
> +d->nr_pirqs = min(nr_static_irqs + extra_domU_irqs, nr_irqs);
>  else
> -d->nr_pirqs = nr_static_irqs + extra_dom0_irqs;
> -if ( d->nr_pirqs > nr_irqs )
>  d->nr_pirqs = nr_irqs;
>  
>  radix_tree_init(&d->pirq_tree);



[Xen-devel] [PATCHv1] xen: increase default number of PIRQs for hardware domains

2014-12-03 Thread David Vrabel

The default limit for the number of PIRQs for hardware domains (dom0)
is not sufficient for some (x86) systems.

Since the pirq structures are individually and dynamically allocated,
the limit for hardware domains may be increased to the number of
possible IRQs.

The extra_guest_irqs command line option now only allows changes to
the domU value.  Any argument for dom0 is ignored.
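
The resulting limit can be sketched as a pure function. The parameter names follow the variables in the diff below, but `toy_nr_pirqs()` itself is a hypothetical helper for illustration and not part of Xen.

```c
#include <stdbool.h>

/*
 * Hypothetical model of the new limit: hardware domains get every
 * possible IRQ, other domains get the static IRQs plus the configured
 * extra amount, capped at the number of possible IRQs.
 */
static unsigned int toy_nr_pirqs(bool is_hardware_domain,
                                 unsigned int nr_static_irqs,
                                 unsigned int extra_domU_irqs,
                                 unsigned int nr_irqs)
{
    unsigned int wanted;

    if ( is_hardware_domain )
        return nr_irqs;

    wanted = nr_static_irqs + extra_domU_irqs;
    return wanted < nr_irqs ? wanted : nr_irqs;
}
```

For example, with 16 static IRQs, the default of 32 extra IRQs and 1024 possible IRQs, a domU gets 48 PIRQs in this model while the hardware domain gets 1024.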

Signed-off-by: David Vrabel 
---
 docs/misc/xen-command-line.markdown |   11 ---
 xen/common/domain.c |7 +--
 2 files changed, 5 insertions(+), 13 deletions(-)

diff --git a/docs/misc/xen-command-line.markdown b/docs/misc/xen-command-line.markdown
index 0866df2..d352031 100644
--- a/docs/misc/xen-command-line.markdown
+++ b/docs/misc/xen-command-line.markdown
@@ -594,15 +594,12 @@ except for debugging purposes.
 Force or disable use of EFI runtime services.
 
 ### extra\_guest\_irqs
-> `= [][,]`
+> `= []`
 
-> Default: `32,256`
+> Default: `32`
 
-Change the number of PIRQs available for guests.  The optional first number is
-common for all domUs, while the optional second number (preceded by a comma)
-is for dom0.  Changing the setting for domU has no impact on dom0 and vice
-versa.  For example to change dom0 without changing domU, use
-`extra_guest_irqs=,512`
+Change the number of PIRQs available for guests. This limit does not
+apply to hardware domains (dom0).
 
 ### flask\_enabled
 > `= `
diff --git a/xen/common/domain.c b/xen/common/domain.c
index 4a62c1d..a88d829 100644
--- a/xen/common/domain.c
+++ b/xen/common/domain.c
@@ -231,14 +231,11 @@ static int late_hwdom_init(struct domain *d)
 #endif
 }
 
-static unsigned int __read_mostly extra_dom0_irqs = 256;
 static unsigned int __read_mostly extra_domU_irqs = 32;
 static void __init parse_extra_guest_irqs(const char *s)
 {
 if ( isdigit(*s) )
 extra_domU_irqs = simple_strtoul(s, &s, 0);
-if ( *s == ',' && isdigit(*++s) )
-extra_dom0_irqs = simple_strtoul(s, &s, 0);
 }
 custom_param("extra_guest_irqs", parse_extra_guest_irqs);
 
@@ -324,10 +321,8 @@ struct domain *domain_create(
 atomic_inc(&d->pause_count);
 
 if ( !is_hardware_domain(d) )
-d->nr_pirqs = nr_static_irqs + extra_domU_irqs;
+d->nr_pirqs = min(nr_static_irqs + extra_domU_irqs, nr_irqs);
 else
-d->nr_pirqs = nr_static_irqs + extra_dom0_irqs;
-if ( d->nr_pirqs > nr_irqs )
 d->nr_pirqs = nr_irqs;
 
 radix_tree_init(&d->pirq_tree);
-- 
1.7.10.4



Re: [Xen-devel] [PATCH 2/2] xen-detect: check for XEN_PV before XEN_HVM

2014-12-03 Thread Andrew Cooper

On 01/12/14 14:37, John Haxby wrote:
> At some stage, the cpuid instruction used to detect a xen hvm domain
> also started working in a pv domain so pv domains were being identified
> as hvm (dom0 excepted).  Change the order so that pv is tested for
> first.
>
> Signed-off-by: John Haxby 

This will have happened as a side effect of Intel's CPUID-faulting
ability, present in IvyBridge servers and later, which lets Xen
intercept regular cpuid instructions.

On the other hand, the forced emulation prefix is now valid in HVM
guests in debug Xens with the "hvm_fep" command line option.  In that
case, the HVM domain would be erroneously identified as PV.

Perhaps it is worth having an explicit guest type available in the cpuid
leaves themselves, so guest userspace need not guess at all?

~Andrew


Re: [Xen-devel] [PATCH for-xen-4.5 1/3] tools/hotplug: distclean target should remove files generated by configure

2014-12-03 Thread Daniel Kiper

On Tue, Dec 02, 2014 at 01:36:20PM -0500, Konrad Rzeszutek Wilk wrote:
> On Tue, Dec 02, 2014 at 04:16:28PM +0100, Daniel Kiper wrote:
> > Signed-off-by: Daniel Kiper 
>
> The usage scenario in which I can see this being useful (and
> I've tripped over this) is when you rebuild a new version
> from the same repo. As in, this affects developers, but
> not end-users and not distros. But perhaps I am missing
> one scenario?
>
> As such I would lean towards deferring this (and the other
> two) to Xen 4.6.

As far as I know, the Debian build system sometimes complains if make distclean
does not leave the build tree in a distclean state (read "state before
configure" != "state after distclean"). This means that from the
distros' point of view we should apply this patch. However, the
other two are not required and we can defer them to Xen 4.6.

Daniel


Re: [Xen-devel] [PATCH 1/2] xen-detect: fix strict-aliasing compilation warning.

2014-12-03 Thread Andrew Cooper

On 03/12/14 15:39, John Haxby wrote:
> On 01/12/14 17:15, Andrew Cooper wrote:
>> On 01/12/14 14:37, John Haxby wrote:
>>> With gcc 4.8.3, compiling xen-detect gives a compilation warning if
>>> you're optimising:
>>>
>>> $ cc -Wall -Os xen-detect.c
>>> xen-detect.c: In function ‘check_for_xen’:
>>> xen-detect.c:65:9: warning: dereferencing type-punned pointer will break
>>> strict-aliasing rules [-Wstrict-aliasing]
>>>  *(uint32_t *)(signature + 0) = regs[1];
>>>  ^
>>>
>>> Signed-off-by: John Haxby 
>> Why are you compiling without the CFLAGS from the Xen build system?
>>
>> We explicitly disable strict alias optimisations, because optimisations
>> based upon the aliasing rules in C are mad.  Even when you eliminate all
>> the warnings, there are still subtle bugs because the compiler is free
>> to assume a lot more than a programmer would typically deem reasonable.
> Do you want me to repost the second patch (the actual bug fix one) so
> that it doesn't assume the line number changes and whatnot for this one?
>
> jch

With a pragmatic hat on, making more stuff "-Wall" safe is probably
better, although production code should use the surrounding infrastructure.
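
For reference, one common way to make such code "-Wall" safe without a union is to copy the register values with memcpy() instead of storing through a cast pointer. This is a sketch of that general technique, not the patch under discussion; `toy_build_signature()` is a hypothetical name.

```c
#include <stdint.h>
#include <string.h>

/*
 * Build the 12-byte signature string from EBX, ECX and EDX (regs[1..3])
 * without type-punned stores. memcpy() copies the object representation,
 * which does not run into the strict-aliasing rules.
 */
static void toy_build_signature(const uint32_t regs[4], char signature[13])
{
    memcpy(signature + 0, &regs[1], sizeof(uint32_t));
    memcpy(signature + 4, &regs[2], sizeof(uint32_t));
    memcpy(signature + 8, &regs[3], sizeof(uint32_t));
    signature[12] = '\0';
}
```

Compilers generally turn these fixed-size copies into plain register stores, so the result should be no slower than the cast version.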

With all of these patches, you must CC the toolstack maintainers.  If
you believe it should make the cut for 4.5, you must also CC Konrad and
argue for a release ack.

~Andrew



Re: [Xen-devel] [PATCH 1/2] xen-detect: fix strict-aliasing compilation warning.

2014-12-03 Thread Andrew Cooper

On 01/12/14 18:45, John Haxby wrote:
>
>> On 1 Dec 2014, at 17:15, Andrew Cooper wrote:
>>
>> On 01/12/14 14:37, John Haxby wrote:
>>> With gcc 4.8.3, compiling xen-detect gives a compilation warning if
>>> you're optimising:
>>>
>>> $ cc -Wall -Os xen-detect.c
>>> xen-detect.c: In function ‘check_for_xen’:
>>> xen-detect.c:65:9: warning: dereferencing type-punned pointer will break
>>> strict-aliasing rules [-Wstrict-aliasing]
>>> *(uint32_t *)(signature + 0) = regs[1];
>>> ^
>>>
>>> Signed-off-by: John Haxby
>>
>> Why are you compiling without the CFLAGS from the Xen build system?
>>
>> We explicitly disable strict alias optimisations, because optimisations
>> based upon the aliasing rules in C are mad.  Even when you eliminate all
>> the warnings, there are still subtle bugs because the compiler is free
>> to assume a lot more than a programmer would typically deem reasonable.
>
>
> I wasn't building the whole system, I just wanted xen-detect so I
> pulled it out and compiled it; I usually use "-Wall -Os" because the
> combination finds problems I might otherwise overlook.   The patch
> also removes three lines of code :) but you can take it or leave it as
> you choose.   The other patch (reversing the order of pv and hvm
> checking) was the real problem.
>
> jch

I feel it would be neater to fix this by using the
XEN_CPUID_SIGNATURE_E{B,C,D}X constants from the API.  This fixes the
strict aliasing, and does away with the string handling completely.
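
A minimal sketch of that suggestion follows. The three constant values are the ones from Xen's public cpuid header as far as I recall them (they are the little-endian encodings of "XenV", "MMXe" and "nVMM"); in the real tool the header would be included rather than the values repeated. `toy_is_xen_leaf()` is a hypothetical helper name.

```c
#include <stdbool.h>
#include <stdint.h>

/* Repeated here only to keep the sketch self-contained. */
#define XEN_CPUID_SIGNATURE_EBX 0x566e6558 /* "XenV" */
#define XEN_CPUID_SIGNATURE_ECX 0x65584d4d /* "MMXe" */
#define XEN_CPUID_SIGNATURE_EDX 0x4d4d566e /* "nVMM" */

/*
 * regs[] holds EAX, EBX, ECX, EDX as returned for the leaf at 'base'.
 * The leaf matches if the signature registers are right and at least
 * two further leaves are advertised in EAX.
 */
static bool toy_is_xen_leaf(const uint32_t regs[4], uint32_t base)
{
    return regs[1] == XEN_CPUID_SIGNATURE_EBX &&
           regs[2] == XEN_CPUID_SIGNATURE_ECX &&
           regs[3] == XEN_CPUID_SIGNATURE_EDX &&
           regs[0] >= base + 2;
}
```

Comparing integers directly means no signature buffer, no terminator and no strcmp() are needed.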

~Andrew

Re: [Xen-devel] [PATCH 1/2] xen-detect: fix strict-aliasing compilation warning.

2014-12-03 Thread John Haxby

On 01/12/14 17:15, Andrew Cooper wrote:
> On 01/12/14 14:37, John Haxby wrote:
>> With gcc 4.8.3, compiling xen-detect gives a compilation warning if
>> you're optimising:
>>
>> $ cc -Wall -Os xen-detect.c
>> xen-detect.c: In function ‘check_for_xen’:
>> xen-detect.c:65:9: warning: dereferencing type-punned pointer will break
>> strict-aliasing rules [-Wstrict-aliasing]
>>  *(uint32_t *)(signature + 0) = regs[1];
>>  ^
>>
>> Signed-off-by: John Haxby 
> 
> Why are you compiling without the CFLAGS from the Xen build system?
> 
> We explicitly disable strict alias optimisations, because optimisations
> based upon the aliasing rules in C is mad.  Even when you eliminate all
> the warnings, there are still subtle bugs because the compiler is free
> to assume a lot more than a programmer would typically deem reasonable.

Do you want me to repost the second patch (the actual bug fix one) so
that it doesn't assume the line number changes and whatnot for this one?

jch


> 
> ~Andrew
> 
>> ---
>>  tools/misc/xen-detect.c | 21 ++++++++++-----------
>>  1 file changed, 10 insertions(+), 11 deletions(-)
>>
>> diff --git a/tools/misc/xen-detect.c b/tools/misc/xen-detect.c
>> index 787b5da..19c66d1 100644
>> --- a/tools/misc/xen-detect.c
>> +++ b/tools/misc/xen-detect.c
>> @@ -54,28 +54,27 @@ static void cpuid(uint32_t idx, uint32_t *regs, int pv_context)
>>  
>>  static int check_for_xen(int pv_context)
>>  {
>> -uint32_t regs[4];
>> -char signature[13];
>> +union
>> +{
>> +uint32_t regs[4];
>> +char signature[17];
>> +} u;
>>  uint32_t base;
>>  
>>  for ( base = 0x40000000; base < 0x40010000; base += 0x100 )
>>  {
>> -cpuid(base, regs, pv_context);
>> -
>> -*(uint32_t *)(signature + 0) = regs[1];
>> -*(uint32_t *)(signature + 4) = regs[2];
>> -*(uint32_t *)(signature + 8) = regs[3];
>> -signature[12] = '\0';
>> +cpuid(base, u.regs, pv_context);
>> +u.signature[16] = '\0';
>>  
>> -if ( !strcmp("XenVMMXenVMM", signature) && (regs[0] >= (base + 2)) )
>> +if ( !strcmp("XenVMMXenVMM", u.signature+4) && (u.regs[0] >= (base + 2)) )
>>  goto found;
>>  }
>>  
>>  return 0;
>>  
>>   found:
>> -cpuid(base + 1, regs, pv_context);
>> -return regs[0];
>> +cpuid(base + 1, u.regs, pv_context);
>> +return u.regs[0];
>>  }
>>  
>>  static jmp_buf sigill_jmp;
> 
> 
> 



Re: [Xen-devel] [PATCH for-xen-4.5] console: increase initial conring size

2014-12-03 Thread Daniel Kiper

On Tue, Dec 02, 2014 at 03:58:53PM +0000, Jan Beulich wrote:
> >>> Daniel Kiper  12/02/14 3:58 PM >>>
> >In general initial conring size is sufficient. However, if log
> >level is increased on platforms which have e.g. huge number
> >of memory regions (I have an IBM System x3550 M2 with 8 GiB RAM
> >which has more than 200 entries in EFI memory map) then some
> >of earlier messages in console ring are overwritten. It means
> >that in case of issues deeper analysis can be hindered. Sadly
> >conring_size argument does not help because new console buffer
> >is allocated late on heap. It means that it is not possible to
> >allocate larger ring earlier. So, in this situation initial
> >conring size should be increased. My experiments showed that
> >even on not so big machines more than 26 KiB of free space are
> >needed for initial messages. In theory we could increase conring
> >size buffer to 32 KiB. However, I think that this value could be
> >too small for huge machines with large number of ACPI tables and
> >EFI memory regions. Hence, this patch increases initial conring
> >size to 64 KiB.
> >
> >Signed-off-by: Daniel Kiper 
>
> I think it was made clear before that just saying "for-xen-4.5" without any
> further rationale is insufficient. Please explain what makes this so important
> a change that it needs to go in now.

What is not clear in the commit message? It describes a bug, how to fix it,
and why it is fixed that way. Do you need anything else?

> Apart from that, as Olaf validly replied, setting up a dynamically allocated
> buffer earlier would seem like a much better course of action.

I thought about that before posting this patch (I did not know beforehand about
the work Olaf did in 2011). However, I concluded that it is too late to make such
intrusive changes. I think we should (sadly) apply this "band-aid" right
now because, as you can see, this bug hits more and more people. On the
other hand, I agree that we should finally fix this issue in a better way.
I can even add this to my TODO list, but the list is quite long,
so I do not know when it will happen.
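
To illustrate why the early messages are lost, here is a minimal sketch of a fixed-size console ring with a free-running producer index. It is not Xen's console code: toy_ring, toy_ring_put, toy_ring_read and TOY_RING_SIZE are hypothetical names, and the ring is deliberately tiny so the wrap-around is visible.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_RING_SIZE 8u  /* must be a power of two; tiny on purpose */

struct toy_ring {
    char buf[TOY_RING_SIZE];
    uint32_t prod;        /* free-running count of bytes ever written */
};

/* Append a string; once the ring is full, the oldest bytes are overwritten. */
static void toy_ring_put(struct toy_ring *r, const char *s)
{
    for ( ; *s != '\0'; s++ )
        r->buf[r->prod++ & (TOY_RING_SIZE - 1)] = *s;
}

/*
 * Copy the surviving bytes, oldest first, into out (which must hold at
 * least TOY_RING_SIZE bytes).  Returns the number of bytes copied.
 */
static size_t toy_ring_read(const struct toy_ring *r, char *out)
{
    uint32_t n = r->prod < TOY_RING_SIZE ? r->prod : TOY_RING_SIZE;
    uint32_t start = r->prod - n;

    for ( uint32_t i = 0; i < n; i++ )
        out[i] = r->buf[(start + i) & (TOY_RING_SIZE - 1)];

    return n;
}
```

Writing "early " followed by "late" into this 8-byte ring leaves only the last 8 bytes readable; the first two bytes of "early" are gone. A larger initial ring simply moves the point at which that starts to happen.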

Daniel


Re: [Xen-devel] [PATCH 2/4] sysctl/libxl: Add interface for returning IO topology data

2014-12-03 Thread Boris Ostrovsky


On 12/03/2014 10:20 AM, Andrew Cooper wrote:
> On 02/12/14 21:34, Boris Ostrovsky wrote:
>>  /* XEN_SYSCTL_topologyinfo */
>>  #define INVALID_TOPOLOGY_ID  (~0U)
>> +
>> +struct xen_sysctl_cputopo {
>> +uint32_t core;
>> +uint32_t socket;
>> +uint32_t node;
>> +};
>> +typedef struct xen_sysctl_cputopo xen_sysctl_cputopo_t;
>> +DEFINE_XEN_GUEST_HANDLE(xen_sysctl_cputopo_t);
>> +
>> +struct xen_sysctl_iotopo {
>> +uint16_t seg;
>> +uint8_t bus;
>> +uint8_t devfn;
>> +uint32_t node;
>> +};
>> +typedef struct xen_sysctl_iotopo xen_sysctl_iotopo_t;
>> +DEFINE_XEN_GUEST_HANDLE(xen_sysctl_iotopo_t);
>> +
>>  struct xen_sysctl_topologyinfo {
>>  /*
>>   * IN: maximum addressable entry in the caller-provided arrays.
>> - * OUT: largest cpu identifier in the system.
>> + * OUT: largest cpu identifier or max number of devices in the system.
>>   * If OUT is greater than IN then the arrays are truncated!
>>   * If OUT is leass than IN then the array tails are not written by sysctl.
>>   */
>>  uint32_t max_cpu_index;
>> +uint32_t max_devs;
>>
>>  /*
>>   * If not NULL, these arrays are filled with core/socket/node identifier
>> - * for each cpu.
>> - * If a cpu has no core/socket/node information (e.g., cpu not present)
>> - * then the sentinel value ~0u is written to each array.
>> - * The number of array elements written by the sysctl is:
>> + * for each cpu and/or node for each PCI device.
>> + * If information for a particular entry is not avalable it is set to
>> + * INVALID_TOPOLOGY_ID.
>> + * The number of array elements for CPU topology written by the sysctl is:
>>   *   min(@max_cpu_index_IN,@max_cpu_index_OUT)+1
>>   */
>> -XEN_GUEST_HANDLE_64(uint32) cpu_to_core;
>> -XEN_GUEST_HANDLE_64(uint32) cpu_to_socket;
>> -XEN_GUEST_HANDLE_64(uint32) cpu_to_node;
>> +XEN_GUEST_HANDLE_64(xen_sysctl_cputopo_t) cputopo;
>> +XEN_GUEST_HANDLE_64(xen_sysctl_iotopo_t) iotopo;
>
> These are inherently lists with different indices.  They should not be
> conglomerated like this.

I don't follow this. These are indeed lists with different indices, but
why can't they both be part of this struct?

-boris

> I would suggest introducing a new hypercall (xen_sysctl_iotopologyinfo
> ?) and leave this one alone.
>
> ~Andrew



Re: [Xen-devel] [PATCH 2/4] sysctl/libxl: Add interface for returning IO topology data

2014-12-03 Thread Andrew Cooper

On 02/12/14 21:34, Boris Ostrovsky wrote:
>  /* XEN_SYSCTL_topologyinfo */
>  #define INVALID_TOPOLOGY_ID  (~0U)
> +
> +struct xen_sysctl_cputopo {
> +uint32_t core;
> +uint32_t socket;
> +uint32_t node;
> +};
> +typedef struct xen_sysctl_cputopo xen_sysctl_cputopo_t;
> +DEFINE_XEN_GUEST_HANDLE(xen_sysctl_cputopo_t);
> +
> +struct xen_sysctl_iotopo {
> +uint16_t seg;
> +uint8_t bus;
> +uint8_t devfn;
> +uint32_t node;
> +};
> +typedef struct xen_sysctl_iotopo xen_sysctl_iotopo_t;
> +DEFINE_XEN_GUEST_HANDLE(xen_sysctl_iotopo_t);
> +
>  struct xen_sysctl_topologyinfo {
>  /*
>   * IN: maximum addressable entry in the caller-provided arrays.
> - * OUT: largest cpu identifier in the system.
> + * OUT: largest cpu identifier or max number of devices in the system.
>   * If OUT is greater than IN then the arrays are truncated!
>   * If OUT is leass than IN then the array tails are not written by sysctl.
>   */
>  uint32_t max_cpu_index;
> +uint32_t max_devs;
>  
>  /*
>   * If not NULL, these arrays are filled with core/socket/node identifier
> - * for each cpu.
> - * If a cpu has no core/socket/node information (e.g., cpu not present) 
> - * then the sentinel value ~0u is written to each array.
> - * The number of array elements written by the sysctl is:
> + * for each cpu and/or node for each PCI device.
> + * If information for a particular entry is not avalable it is set to
> + * INVALID_TOPOLOGY_ID.
> + * The number of array elements for CPU topology written by the sysctl is:
>   *   min(@max_cpu_index_IN,@max_cpu_index_OUT)+1
>   */
> -XEN_GUEST_HANDLE_64(uint32) cpu_to_core;
> -XEN_GUEST_HANDLE_64(uint32) cpu_to_socket;
> -XEN_GUEST_HANDLE_64(uint32) cpu_to_node;
> +XEN_GUEST_HANDLE_64(xen_sysctl_cputopo_t) cputopo;
> +XEN_GUEST_HANDLE_64(xen_sysctl_iotopo_t) iotopo;

These are inherently lists with different indices.  They should not be
conglomerated like this.

I would suggest introducing a new hypercall (xen_sysctl_iotopologyinfo
?) and leave this one alone.
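
The "different indices" point can be made concrete with a small sketch. The CPU rule below is the one quoted in the header comment, min(IN, OUT) + 1, because max_cpu_index is a largest index. Treating max_devs as a plain count is an assumption read from the patch's wording ("max number of devices"); both helper names are hypothetical.

```c
#include <stdint.h>

static uint32_t min_u32(uint32_t a, uint32_t b)
{
    return a < b ? a : b;
}

/* CPU array: IN and OUT are largest indices, so the count is index + 1. */
static uint32_t cpu_entries_written(uint32_t max_cpu_index_in,
                                    uint32_t max_cpu_index_out)
{
    return min_u32(max_cpu_index_in, max_cpu_index_out) + 1;
}

/* Device array: ASSUMPTION, max_devs is a count rather than an index. */
static uint32_t dev_entries_written(uint32_t max_devs_in,
                                    uint32_t max_devs_out)
{
    return min_u32(max_devs_in, max_devs_out);
}
```

With one IN/OUT field per list, a caller has to remember which convention applies to which array, which is one argument for keeping the two queries in separate interfaces.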

~Andrew


Re: [Xen-devel] [PATCH 1/4] pci: Do not ignore device's PXM information

2014-12-03 Thread Boris Ostrovsky


On 12/03/2014 10:01 AM, Andrew Cooper wrote:
> On 02/12/14 21:34, Boris Ostrovsky wrote:
>> If ACPI provides PXM data for IO devices then dom0 will pass it to
>> hypervisor during PHYSDEVOP_pci_device_add call. This information,
>> however, is currently ignored.
>>
>> We should remember it (in the form of nodeID). We will also print it
>> when user requests device information dump.
>>
>> Signed-off-by: Boris Ostrovsky 
>> ---
>>  xen/arch/x86/physdev.c| 20 +---
>>  xen/drivers/passthrough/pci.c | 13 +
>>  xen/include/xen/pci.h |  5 -
>>  3 files changed, 30 insertions(+), 8 deletions(-)
>>
>> diff --git a/xen/arch/x86/physdev.c b/xen/arch/x86/physdev.c
>> index 6b3201b..7775f80 100644
>> --- a/xen/arch/x86/physdev.c
>> +++ b/xen/arch/x86/physdev.c
>> @@ -565,7 +565,8 @@ ret_t do_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>>  if ( copy_from_guest(&manage_pci, arg, 1) != 0 )
>>  break;
>>
>> -ret = pci_add_device(0, manage_pci.bus, manage_pci.devfn, NULL);
>> +ret = pci_add_device(0, manage_pci.bus, manage_pci.devfn,
>> + NULL, NUMA_NO_NODE);
>>  break;
>>  }
>>
>> @@ -597,13 +598,14 @@ ret_t do_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>>  pdev_info.physfn.devfn = manage_pci_ext.physfn.devfn;
>>  ret = pci_add_device(0, manage_pci_ext.bus,
>>   manage_pci_ext.devfn,
>> - &pdev_info);
>> + &pdev_info, NUMA_NO_NODE);
>>  break;
>>  }
>>
>>  case PHYSDEVOP_pci_device_add: {
>>  struct physdev_pci_device_add add;
>>  struct pci_dev_info pdev_info;
>> +int node;
>>
>>  ret = -EFAULT;
>>  if ( copy_from_guest(&add, arg, 1) != 0 )
>> @@ -618,7 +620,19 @@ ret_t do_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>>  }
>>  else
>>  pdev_info.is_virtfn = 0;
>> -ret = pci_add_device(add.seg, add.bus, add.devfn, &pdev_info);
>> +
>> +if ( add.flags & XEN_PCI_DEV_PXM ) {
>> +int optarr_off = offsetof(struct physdev_pci_device_add, optarr) /
>> + sizeof(add.optarr[0]);
>> +
>> +if ( copy_from_guest_offset(&add.optarr[0], arg, optarr_off, 1) )
>> +break;
>
> This will clobber the hypervisor stack, attempting to put the PXM
> information into what is probably pdev_info.

Sigh... I actually fixed this on the Linux side and have now introduced the
same bug here. (I guess I was lucky here that the compiler must have inserted
a word between add and pdev_info for alignment.)

> How is one expected to use XEN_PCI_DEV_PXM ? There is currently no
> specification of how to use the variable length structure.

I don't think this is explicitly specified anywhere, but if this flag is
set then optarr[0] is supposed to store uint32_t pxm. I will add a
comment to this effect to physdev.h.

Thanks.
-boris



Re: [Xen-devel] [PATCH 1/4] pci: Do not ignore device's PXM information

2014-12-03 Thread Andrew Cooper

On 02/12/14 21:34, Boris Ostrovsky wrote:
> If ACPI provides PXM data for IO devices then dom0 will pass it to
> hypervisor during PHYSDEVOP_pci_device_add call. This information,
> however, is currently ignored.
>
> We should remember it (in the form of nodeID). We will also print it
> when user requests device information dump.
>
> Signed-off-by: Boris Ostrovsky 
> ---
>  xen/arch/x86/physdev.c| 20 +---
>  xen/drivers/passthrough/pci.c | 13 +
>  xen/include/xen/pci.h |  5 -
>  3 files changed, 30 insertions(+), 8 deletions(-)
>
> diff --git a/xen/arch/x86/physdev.c b/xen/arch/x86/physdev.c
> index 6b3201b..7775f80 100644
> --- a/xen/arch/x86/physdev.c
> +++ b/xen/arch/x86/physdev.c
> @@ -565,7 +565,8 @@ ret_t do_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>  if ( copy_from_guest(&manage_pci, arg, 1) != 0 )
>  break;
>  
> -ret = pci_add_device(0, manage_pci.bus, manage_pci.devfn, NULL);
> +ret = pci_add_device(0, manage_pci.bus, manage_pci.devfn,
> + NULL, NUMA_NO_NODE);
>  break;
>  }
>  
> @@ -597,13 +598,14 @@ ret_t do_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>  pdev_info.physfn.devfn = manage_pci_ext.physfn.devfn;
>  ret = pci_add_device(0, manage_pci_ext.bus,
>   manage_pci_ext.devfn,
> - &pdev_info);
> + &pdev_info, NUMA_NO_NODE);
>  break;
>  }
>  
>  case PHYSDEVOP_pci_device_add: {
>  struct physdev_pci_device_add add;
>  struct pci_dev_info pdev_info;
> +int node;
>  
>  ret = -EFAULT;
>  if ( copy_from_guest(&add, arg, 1) != 0 )
> @@ -618,7 +620,19 @@ ret_t do_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>  }
>  else
>  pdev_info.is_virtfn = 0;
> -ret = pci_add_device(add.seg, add.bus, add.devfn, &pdev_info);
> +
> +if ( add.flags & XEN_PCI_DEV_PXM ) {
> +int optarr_off = offsetof(struct physdev_pci_device_add, optarr) /
> + sizeof(add.optarr[0]);
> +
> +if ( copy_from_guest_offset(&add.optarr[0], arg, optarr_off, 1) )
> +break;

This will clobber the hypervisor stack, attempting to put the PXM
information into what is probably pdev_info.

How is one expected to use XEN_PCI_DEV_PXM ? There is currently no
specification of how to use the variable length structure.
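
The clobbering can be seen from how a trailing variable-length array behaves in C: it contributes no storage to a stack instance of the struct, so writing optarr[0] lands on whatever follows. Below is a minimal sketch of a safer pattern under stated assumptions. toy_pci_device_add is a simplified stand-in for struct physdev_pci_device_add, TOY_PCI_DEV_PXM is a hypothetical flag value, and memcpy() from a byte buffer stands in for copy_from_guest() and copy_from_guest_offset().

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-in for struct physdev_pci_device_add (assumed layout). */
struct toy_pci_device_add {
    uint16_t seg;
    uint8_t bus;
    uint8_t devfn;
    uint32_t flags;
    struct { uint8_t bus; uint8_t devfn; } physfn;
    uint32_t optarr[];    /* flexible array: no storage inside the struct */
};

#define TOY_PCI_DEV_PXM 0x4u  /* hypothetical flag value for this sketch */

/*
 * guest points at guest_len bytes laid out as the fixed struct followed
 * by optional uint32_t entries.  Returns 0 and stores the PXM value in
 * *pxm, or -1 if the flag is clear or the buffer is too short.
 */
static int toy_read_pxm(const void *guest, size_t guest_len, uint32_t *pxm)
{
    struct toy_pci_device_add add;  /* on the stack: no room for optarr[0] */
    size_t off = offsetof(struct toy_pci_device_add, optarr);

    if ( guest_len < sizeof(add) )
        return -1;
    memcpy(&add, guest, sizeof(add));   /* copies the fixed part only */

    if ( !(add.flags & TOY_PCI_DEV_PXM) || guest_len < off + sizeof(*pxm) )
        return -1;

    /* Copy into a separate local object, never into add.optarr[0]. */
    memcpy(pxm, (const unsigned char *)guest + off, sizeof(*pxm));
    return 0;
}
```

The design choice is simply to give the optional word its own destination object, so the size of the on-stack struct never matters.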

~Andrew



Re: [Xen-devel] [Qemu-devel] [PATCH] increase maxmem before calling xc_domain_populate_physmap

2014-12-03 Thread Stefano Stabellini

On Wed, 3 Dec 2014, Don Slutz wrote:
> On 12/03/14 07:20, Stefano Stabellini wrote:
> > On Wed, 3 Dec 2014, Wei Liu wrote:
> > > On Tue, Dec 02, 2014 at 03:23:29PM -0500, Don Slutz wrote:
> > > [...]
> > > > > > > >hw_error("xc_domain_getinfo failed");
> > > > > > > >}
> > > > > > > > -if (xc_domain_setmaxmem(xen_xc, xen_domid, info.max_memkb +
> > > > > > > > -(nr_pfn * XC_PAGE_SIZE / 1024)) <
> > > > > > > > 0) {
> > > > > > > > +max_pages = info.max_memkb * 1024 / XC_PAGE_SIZE;
> > > > > > > > +free_pages = max_pages - info.nr_pages;
> > > > > > > > +real_free = free_pages;
> > > > > > > > +if (free_pages > VGA_HOLE_SIZE) {
> > > > > > > > +free_pages -= VGA_HOLE_SIZE;
> > > > > > > > +} else {
> > > > > > > > +free_pages = 0;
> > > > > > > > +}
> > > > > I don't think we need to subtract VGA_HOLE_SIZE.
> > > > If you do not use some slack (like 32 here), xen does report:
> > > > 
> > > > 
> > > > (d5) [2014-11-29 17:07:21] Loaded SeaBIOS
> > > > (d5) [2014-11-29 17:07:21] Creating MP tables ...
> > > > (d5) [2014-11-29 17:07:21] Loading ACPI ...
> > > > (XEN) [2014-11-29 17:07:21] page_alloc.c:1568:d5 Over-allocation for
> > > > domain
> > > > 5: 1057417 > 1057416
> > > > (XEN) [2014-11-29 17:07:21] memory.c:158:d5 Could not allocate order=0
> > > This message is a bit of a red herring.
> > > 
> > > It's hvmloader trying to populate ram for firmware data. The actual
> > > amount of extra pages needed depends on the firmware.
> > > 
> > > In any case it's safe to disallow hvmloader from doing so, it will just
> > > relocate some pages from ram (hence shrinking *mem_end).
> > That looks like a better solution
> > 
> 
> I went with a "leave some slack" so that the error message above is not
> output.
> 
> When a change to hvmloader is done so that the message does not appear during
> normal usage, the extra pages in QEMU can be dropped.

Although those messages look like fatal errors, they are not. This is the
normal way for hvmloader to operate: first it tries to allocate extra
memory until it gets an error, then it continues with normal memory,
see:

void mem_hole_populate_ram(xen_pfn_t mfn, uint32_t nr_mfns)
{
    static int over_allocated;
    struct xen_add_to_physmap xatp;
    struct xen_memory_reservation xmr;

    for ( ; nr_mfns-- != 0; mfn++ )
    {
        /* Try to allocate a brand new page in the reserved area. */
        if ( !over_allocated )
        {
            xmr.domid = DOMID_SELF;
            xmr.mem_flags = 0;
            xmr.extent_order = 0;
            xmr.nr_extents = 1;
            set_xen_guest_handle(xmr.extent_start, &mfn);
            if ( hypercall_memory_op(XENMEM_populate_physmap, &xmr) == 1 )
                continue;
            over_allocated = 1;
        }

        /* Otherwise, relocate a page from the ordinary RAM map. */

I think there is really nothing to change there. Maybe we want to make
those warnings less scary.
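
The control flow of the excerpt above can be sketched in a self-contained form. This is a simulation, not hvmloader code: toy_domain, toy_populate_one and toy_mem_hole_populate are hypothetical names, and a simple page budget stands in for the XENMEM_populate_physmap hypercall.

```c
#include <stdint.h>

struct toy_domain {
    uint32_t free_pages;   /* pages the allocator will still grant */
    uint32_t populated;    /* pages obtained by fresh allocation */
    uint32_t relocated;    /* pages taken from ordinary RAM instead */
    int over_allocated;    /* latched after the first failed allocation */
};

/* Stand-in for the populate hypercall: returns 1 on success, 0 on failure. */
static int toy_populate_one(struct toy_domain *d)
{
    if ( d->free_pages == 0 )
        return 0;          /* the point where Xen logs "Over-allocation" */
    d->free_pages--;
    return 1;
}

static void toy_mem_hole_populate(struct toy_domain *d, uint32_t nr)
{
    while ( nr-- != 0 )
    {
        if ( !d->over_allocated )
        {
            if ( toy_populate_one(d) )
            {
                d->populated++;
                continue;
            }
            d->over_allocated = 1;   /* never retry the allocation path */
        }
        d->relocated++;              /* fall back to relocating a RAM page */
    }
}
```

With a budget of 2 pages and a request for 5, the sketch allocates 2, fails exactly once, and relocates the remaining 3. That single expected failure corresponds to the one log message per domain.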


Re: [Xen-devel] Poor network performance between DomU with multiqueue support

2014-12-03 Thread Zhangleiqiang (Trump)

> -----Original Message-----
> From: Wei Liu [mailto:wei.l...@citrix.com]
> Sent: Tuesday, December 02, 2014 11:59 PM
> To: Zhangleiqiang (Trump)
> Cc: Wei Liu; zhangleiqiang; xen-devel@lists.xen.org; Luohao (brian); Xiaoding
> (B); Yuzhou (C); Zhuangyuxin
> Subject: Re: [Xen-devel] Poor network performance between DomU with
> multiqueue support
> 
> On Tue, Dec 02, 2014 at 02:46:36PM +0000, Zhangleiqiang (Trump) wrote:
> > Thanks for your reply, Wei.
> >
> > I did the following test just now and found the results below:
> >
> > There are three DomUs (4U4G) running on Host A (6U6G) and one DomU
> > (4U4G) running on Host B (6U6G). I send packets from the three DomUs to
> > the DomU on Host B simultaneously.
> >
> > 1. The "top" output of Host B as follows:
> >
> > top - 09:42:11 up  1:07,  2 users,  load average: 2.46, 1.90, 1.47
> > Tasks: 173 total,   4 running, 169 sleeping,   0 stopped,   0 zombie
> > %Cpu0  :  0.0 us,  0.0 sy,  0.0 ni, 97.3 id,  0.0 wa,  0.0 hi,  0.8 si,  1.9 st
> > %Cpu1  :  0.0 us, 27.0 sy,  0.0 ni, 63.1 id,  0.0 wa,  0.0 hi,  9.5 si,  0.4 st
> > %Cpu2  :  0.0 us, 90.0 sy,  0.0 ni,  8.3 id,  0.0 wa,  0.0 hi,  1.7 si,  0.0 st
> > %Cpu3  :  0.4 us,  1.4 sy,  0.0 ni, 95.4 id,  0.0 wa,  0.0 hi,  1.4 si,  1.4 st
> > %Cpu4  :  0.0 us, 60.2 sy,  0.0 ni, 39.5 id,  0.0 wa,  0.0 hi,  0.3 si,  0.0 st
> > %Cpu5  :  0.0 us,  2.8 sy,  0.0 ni, 89.4 id,  0.0 wa,  0.0 hi,  6.9 si,  0.9 st
> > KiB Mem:   4517144 total,  3116480 used,  1400664 free,      876 buffers
> > KiB Swap:  2103292 total,        0 used,  2103292 free.  2374656 cached Mem
> >
> >   PID USER  PR  NI  VIRT  RES  SHR  S   %CPU   %MEM    TIME+  COMMAND
> >  7440 root  20   0     0    0    0  R  71.10  0.000  8:15.38  vif4.0-q3-guest
> >  7434 root  20   0     0    0    0  R  59.14  0.000  9:00.58  vif4.0-q0-guest
> >    18 root  20   0     0    0    0  R  33.89  0.000  2:35.06  ksoftirqd/2
> >    28 root  20   0     0    0    0  S  20.93  0.000  3:01.81  ksoftirqd/4
> >
> >
> > As shown above, only two netback related processes (vif4.0-*) are running
> > with high cpu usage, and the other 2 netback processes are idle. The "ps"
> > result of vif4.0-* processes as follows:
> >
> > root  7434  50.5  0.0     0     0  ?      R   09:23  11:29  [vif4.0-q0-guest]
> > root  7435   0.0  0.0     0     0  ?      S   09:23   0:00  [vif4.0-q0-deall]
> > root  7436   0.0  0.0     0     0  ?      S   09:23   0:00  [vif4.0-q1-guest]
> > root  7437   0.0  0.0     0     0  ?      S   09:23   0:00  [vif4.0-q1-deall]
> > root  7438   0.0  0.0     0     0  ?      S   09:23   0:00  [vif4.0-q2-guest]
> > root  7439   0.0  0.0     0     0  ?      S   09:23   0:00  [vif4.0-q2-deall]
> > root  7440  48.1  0.0     0     0  ?      R   09:23  10:55  [vif4.0-q3-guest]
> > root  7441   0.0  0.0     0     0  ?      S   09:23   0:00  [vif4.0-q3-deall]
> > root  9724   0.0  0.0  9244  1520  pts/0  S+  09:46   0:00  grep --color=auto
> >
> >
> > 2. The "rx" related content in /proc/interrupts in the receiver DomU (on Host B):
> >
> > 73: 2   0   2925405 0   xen-dyn-event  eth0-q0-rx
> > 75: 43  93  0   118   xen-dyn-event  eth0-q1-rx
> > 77: 2   337614  1983   xen-dyn-event  eth0-q2-rx
> > 79: 2414666 0   9   0   xen-dyn-event  eth0-q3-rx
> >
> > As shown above, it seems that only q0 and q3 handle the interrupts
> > triggered by packet receiving.
> >
> > Any advice? Thanks.
> 
> Netback selects queue based on the return value of skb_get_queue_mapping.
> The queue mapping is set by core driver or ndo_select_queue (if specified by
> individual driver). In this case netback doesn't have its implementation of
> ndo_select_queue, so it's up to core driver to decide which queue to dispatch
> the packet to.  I think you need to inspect why Dom0 only steers traffic to
> these two queues but not all of them.
> 
> Don't know which utility is handy for this job. Probably tc(8) is useful?

Thanks Wei.

I think the queue selection method is why only two netback/netfront
processes work hard in the results above. I have tried sending packets
from multiple hosts/VMs to one VM, and a few times all of the
netback/netfront processes ran with high CPU usage.
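
The effect can be illustrated with a minimal sketch, assuming the core stack derives the queue from a per-flow hash taken modulo the number of queues. The hash below is a toy mixer, not the kernel's flow hash, and every name in the block is hypothetical.

```c
#include <stdint.h>

struct toy_flow {
    uint32_t saddr, daddr;
    uint16_t sport, dport;
};

/* Toy flow hash; ASSUMPTION: the real stack uses its own flow hash. */
static uint32_t toy_flow_hash(const struct toy_flow *f)
{
    uint32_t h = f->saddr ^ (f->daddr * 0x9e3779b1u) ^
                 (((uint32_t)f->sport << 16) | f->dport);

    h ^= h >> 16;
    h *= 0x85ebca6bu;
    h ^= h >> 13;
    return h;
}

/* All packets of one flow map to the same queue. */
static unsigned int toy_select_queue(uint32_t hash, unsigned int num_queues)
{
    return hash % num_queues;
}

/* Count the distinct queues hit by a set of flows (num_queues <= 32). */
static unsigned int toy_queues_used(const struct toy_flow *flows,
                                    unsigned int nflows,
                                    unsigned int num_queues)
{
    uint32_t seen = 0;
    unsigned int i, count = 0;

    for ( i = 0; i < nflows; i++ )
    {
        unsigned int q = toy_select_queue(toy_flow_hash(&flows[i]), num_queues);

        if ( !(seen & (1u << q)) )
        {
            seen |= 1u << q;
            count++;
        }
    }
    return count;
}
```

Under this model, three senders with one connection each can occupy at most three of four queues, and hash collisions can reduce that further. More concurrent flows are needed before every queue sees traffic.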

However, I found another issue. Even with 6 queues, and with all 6
netback processes running at high CPU usage (each of them at 87%),
the total VM receive throughput is not much higher than with 4 queues.
The results go from 4.5 Gbps to 5.04 Gbps using TCP with a length of
512 bytes, and from 4.3 Gbps to 5.78 Gbps using TCP with a length of
1460 bytes.

According

[Xen-devel] [PATCH v3 0/2] gnttab: Improve scaleability

2014-12-03 Thread Christoph Egger

This patch series changes the grant table locking to
a more fine-grained locking protocol. The result is
a performance boost measured with blkfront/blkback.
The series also documents the locking protocol.

v3:
  * Addressed gnttab_swap_grant_ref() comment from Andrew Cooper
v2:
  * Add arm part per request from Julien Grall

Christoph Egger (1):
  gnttab: Introduce rwlock to protect updates to grant table state

Matt Wilson (1):
  gnttab: refactor locking for scalability

 docs/misc/grant-tables.txt|   49 ++-
 xen/arch/arm/mm.c |4 +-
 xen/arch/x86/mm.c |4 +-
 xen/common/grant_table.c  |  321 +
 xen/include/xen/grant_table.h |9 +-
 5 files changed, 258 insertions(+), 129 deletions(-)

-- 
1.7.9.5



[Xen-devel] [PATCH v3 2/2] gnttab: refactor locking for scalability

2014-12-03 Thread Christoph Egger

From: Matt Wilson 

This patch refactors grant table locking. It splits the previous
single spinlock per grant table into multiple locks. The heavily
modified components of the grant table (the maptrack state and the
active entries) are now protected by their own spinlocks. The
remaining elements of the grant table are read-mostly, so the main
grant table lock is modified to be a rwlock to improve concurrency.

Workloads with high grant table operation rates, especially map/unmap
operations as used by blkback/blkfront when persistent grants are not
supported, show marked improvement with these changes. A domU with 24
VBDs in a streaming 2M write workload achieved 1,400 MB/sec before
this change. Performance more than doubles with this patch, reaching
3,000 MB/sec before tuning and 3,600 MB/sec after adjusting event
channel vCPU bindings.
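
The lock split described above can be sketched in a few lines. This is an illustration under stated assumptions, not the patch itself: the toy_* locks are simple C11-atomics stand-ins for Xen's rwlock_t and spinlock_t, and toy_grant_table, toy_pin and toy_grow are hypothetical names.

```c
#include <stdatomic.h>
#include <stdint.h>

#define TOY_NR_ENTRIES 4

typedef struct { atomic_int state; } toy_rwlock;    /* -1 writer, n >= 0 readers */
typedef struct { atomic_int held; } toy_spinlock;

static void toy_read_lock(toy_rwlock *l)
{
    for ( ;; )
    {
        int s = atomic_load(&l->state);

        if ( s >= 0 && atomic_compare_exchange_weak(&l->state, &s, s + 1) )
            return;
    }
}

static void toy_read_unlock(toy_rwlock *l) { atomic_fetch_sub(&l->state, 1); }

static void toy_write_lock(toy_rwlock *l)
{
    int expected = 0;

    while ( !atomic_compare_exchange_weak(&l->state, &expected, -1) )
        expected = 0;
}

static void toy_write_unlock(toy_rwlock *l) { atomic_store(&l->state, 0); }

static void toy_spin_lock(toy_spinlock *l)
{
    while ( atomic_exchange(&l->held, 1) )
        ;
}

static void toy_spin_unlock(toy_spinlock *l) { atomic_store(&l->held, 0); }

struct toy_active_entry { toy_spinlock lock; uint32_t pin; };

struct toy_grant_table {
    toy_rwlock lock;                 /* read-mostly table state */
    unsigned int nr_entries;
    struct toy_active_entry active[TOY_NR_ENTRIES];
};

/* Hot path: shared table lock plus the lock of the one entry touched. */
static int toy_pin(struct toy_grant_table *gt, unsigned int ref)
{
    int rc = -1;

    toy_read_lock(&gt->lock);
    if ( ref < gt->nr_entries )
    {
        struct toy_active_entry *act = &gt->active[ref];

        toy_spin_lock(&act->lock);
        act->pin++;
        toy_spin_unlock(&act->lock);
        rc = 0;
    }
    toy_read_unlock(&gt->lock);
    return rc;
}

/* Rare path: exclusive lock while the shape of the table changes. */
static void toy_grow(struct toy_grant_table *gt, unsigned int nr)
{
    toy_write_lock(&gt->lock);
    if ( nr <= TOY_NR_ENTRIES && nr > gt->nr_entries )
        gt->nr_entries = nr;
    toy_write_unlock(&gt->lock);
}
```

Operations on different entries share the table lock in read mode and contend only on their own entry lock, which is where the extra concurrency comes from. Only table growth takes the lock exclusively.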

Signed-off-by: Matt Wilson 
[chegger: ported to xen-staging, split into multiple commits]

v3:
  * Addressed gnttab_swap_grant_ref() comment from Andrew Cooper

Signed-off-by: Christoph Egger 
Cc: Anthony Liguori 
Cc: Jan Beulich 
Cc: Keir Fraser 
---
 docs/misc/grant-tables.txt |   21 +
 xen/common/grant_table.c   |  219 
 2 files changed, 163 insertions(+), 77 deletions(-)

diff --git a/docs/misc/grant-tables.txt b/docs/misc/grant-tables.txt
index c9ae8f2..1ada018 100644
--- a/docs/misc/grant-tables.txt
+++ b/docs/misc/grant-tables.txt
@@ -63,6 +63,7 @@ is complete.
   act->domid : remote domain being granted rights
   act->frame : machine frame being granted
   act->pin   : used to hold reference counts
+  act->lock  : spinlock used to serialize access to active entry state
 
  Map tracking
  
@@ -87,6 +88,8 @@ is complete.
                                version, partially initialized active table pages,
                                etc.
   grant_table->maptrack_lock : spinlock used to protect the maptrack state
+  active_grant_entry->lock   : spinlock used to serialize modifications to
+   active entries
 
  The primary lock for the grant table is a read/write spinlock. All
  functions that access members of struct grant_table must acquire a
@@ -102,6 +105,24 @@ is complete.
  state can be rapidly modified under some workloads, and the critical
  sections are very small, thus we use a spinlock to protect them.
 
+ Active entries are obtained by calling active_entry_acquire(gt, ref).
+ This function returns a pointer to the active entry after locking its
+ spinlock. The caller must hold the rwlock for the gt in question
+ before calling active_entry_acquire(). This is because the grant
+ table can be dynamically extended via gnttab_grow_table() while a
+ domain is running and must be fully initialized. Once all access to
+ the active entry is complete, release the lock by calling
+ active_entry_release(act).
+
+ Summary of rules for locking:
+  active_entry_acquire() and active_entry_release() can only be
+  called when holding the relevant grant table's lock. I.e.:
+    read_lock(&gt->lock);
+    act = active_entry_acquire(gt, ref);
+    ...
+    active_entry_release(act);
+    read_unlock(&gt->lock);
+
 

 
  Granting a foreign domain access to frames
diff --git a/xen/common/grant_table.c b/xen/common/grant_table.c
index 24feb65..5601863 100644
--- a/xen/common/grant_table.c
+++ b/xen/common/grant_table.c
@@ -151,10 +151,13 @@ struct active_grant_entry {
in the page.   */
 unsigned  length:16; /* For sub-page grants, the length of the
 grant.*/
+    spinlock_t lock;      /* lock to protect access of this entry.
+                             see docs/misc/grant-tables.txt for
+                             locking protocol                      */
 };
 
 #define ACGNT_PER_PAGE (PAGE_SIZE / sizeof(struct active_grant_entry))
-#define active_entry(t, e) \
+#define _active_entry(t, e) \
 ((t)->active[(e)/ACGNT_PER_PAGE][(e)%ACGNT_PER_PAGE])
 
 static inline void gnttab_flush_tlb(const struct domain *d)
@@ -182,6 +185,29 @@ nr_active_grant_frames(struct grant_table *gt)
 return num_act_frames_from_sha_frames(nr_grant_frames(gt));
 }
 
+static inline struct active_grant_entry *
+active_entry_acquire(struct grant_table *t, grant_ref_t e)
+{
+struct active_grant_entry *act;
+
+#ifndef NDEBUG
+/* not perfect, but better than nothing for a debug build
+ * sanity check
+ */
+BUG_ON(!rw_is_locked(&t->lock));
+#endif
+
+act = &_active_entry(t, e);
+spin_lock(&act->lock);
+
+return act;
+}
+
+static inline void active_entry_release(struct active_grant_entry *act)
+{
+spin_unlock(&act->lock);
+}
+
 /* Check if the page has been paged out, or needs unsharing. 
If rc == GNTST_okay, *page contains the page struct with a ref taken.

[Xen-devel] [PATCH v3 1/2] gnttab: Introduce rwlock to protect updates to grant table state

2014-12-03 Thread Christoph Egger

Split the grant table lock into two separate locks: a new spinlock
protects the maptrack state, and the existing lock becomes a rwlock.

The rwlock is used to prevent readers from accessing
inconsistent grant table state such as current
version, partially initialized active table pages,
etc.

Signed-off-by: Matt Wilson 
[chegger: ported to xen-staging, split into multiple commits]

v3:
  * Addressed gnttab_swap_grant_ref() comment from Andrew Cooper
v2:
  * Add arm part per request from Julien Grall

Signed-off-by: Christoph Egger 
Cc: Jan Beulich 
Cc: Keir Fraser 
Cc: Julien Grall 
---
 docs/misc/grant-tables.txt|   28 +-
 xen/arch/arm/mm.c |4 +-
 xen/arch/x86/mm.c |4 +-
 xen/common/grant_table.c  |  120 +++--
 xen/include/xen/grant_table.h |9 ++--
 5 files changed, 104 insertions(+), 61 deletions(-)

diff --git a/docs/misc/grant-tables.txt b/docs/misc/grant-tables.txt
index 19db4ec..c9ae8f2 100644
--- a/docs/misc/grant-tables.txt
+++ b/docs/misc/grant-tables.txt
@@ -74,7 +74,33 @@ is complete.
  matching map track entry is then removed, as if unmap had been invoked.
  These are not used by the transfer mechanism.
   map->domid : owner of the mapped frame
-  map->ref_and_flags : grant reference, ro/rw, mapped for host or device access
+  map->ref   : grant reference
+  map->flags : ro/rw, mapped for host or device access
+
+
+ Locking
+ ~~~~~~~
+ Xen uses several locks to serialize access to the internal grant table state.
+
+  grant_table->lock  : rwlock used to prevent readers from accessing
+   inconsistent grant table state such as current
+   version, partially initialized active table pages,
+   etc.
+  grant_table->maptrack_lock : spinlock used to protect the maptrack state
+
+ The primary lock for the grant table is a read/write spinlock. All
+ functions that access members of struct grant_table must acquire a
+ read lock around critical sections. Any modification to the members
+ of struct grant_table (e.g., nr_status_frames, nr_grant_frames,
+ active frames, etc.) must only be made if the write lock is
+ held. These elements are read-mostly, and read critical sections can
+ be large, which makes a rwlock a good choice.
+
+ The maptrack state is protected by its own spinlock. Any access (read
+ or write) of struct grant_table members that have a "maptrack_"
+ prefix must be made while holding the maptrack lock. The maptrack
+ state can be rapidly modified under some workloads, and the critical
+ sections are very small, thus we use a spinlock to protect them.
 
 

 
diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
index 7d4ba0c..2765683 100644
--- a/xen/arch/arm/mm.c
+++ b/xen/arch/arm/mm.c
@@ -1037,7 +1037,7 @@ int xenmem_add_to_physmap_one(
 switch ( space )
 {
 case XENMAPSPACE_grant_table:
-spin_lock(&d->grant_table->lock);
+write_lock(&d->grant_table->lock);
 
 if ( d->grant_table->gt_version == 0 )
 d->grant_table->gt_version = 1;
@@ -1067,7 +1067,7 @@ int xenmem_add_to_physmap_one(
 
 t = p2m_ram_rw;
 
-spin_unlock(&d->grant_table->lock);
+write_unlock(&d->grant_table->lock);
 break;
 case XENMAPSPACE_shared_info:
 if ( idx != 0 )
diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 522c43d..37c13b1 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -4565,7 +4565,7 @@ int xenmem_add_to_physmap_one(
 mfn = virt_to_mfn(d->shared_info);
 break;
 case XENMAPSPACE_grant_table:
-spin_lock(&d->grant_table->lock);
+write_lock(&d->grant_table->lock);
 
 if ( d->grant_table->gt_version == 0 )
 d->grant_table->gt_version = 1;
@@ -4587,7 +4587,7 @@ int xenmem_add_to_physmap_one(
 mfn = virt_to_mfn(d->grant_table->shared_raw[idx]);
 }
 
-spin_unlock(&d->grant_table->lock);
+write_unlock(&d->grant_table->lock);
 break;
 case XENMAPSPACE_gmfn_range:
 case XENMAPSPACE_gmfn:
diff --git a/xen/common/grant_table.c b/xen/common/grant_table.c
index 8fba923..24feb65 100644
--- a/xen/common/grant_table.c
+++ b/xen/common/grant_table.c
@@ -227,23 +227,23 @@ double_gt_lock(struct grant_table *lgt, struct grant_table *rgt)
 {
 if ( lgt < rgt )
 {
-spin_lock(&lgt->lock);
-spin_lock(&rgt->lock);
+spin_lock(&lgt->maptrack_lock);
+spin_lock(&rgt->maptrack_lock);
 }
 else
 {
 if ( lgt != rgt )
-spin_lock(&rgt->lock);
-spin_lock(&lgt->lock);
+spin_lock(&rgt->maptrack_lock);
+spin_lock(&lgt->maptrack_lock);
 }

[Xen-devel] [qemu-upstream-unstable test] 32024: regressions - FAIL

2014-12-03 Thread xen . org

flight 32024 qemu-upstream-unstable real [real]
http://www.chiark.greenend.org.uk/~xensrcts/logs/32024/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-armhf-armhf-libvirt      4 xen-install               fail REGR. vs. 31848

Regressions which are regarded as allowable (not blocking):
 test-amd64-i386-pair         17 guest-migrate/src_host/dst_host fail like 31848
 test-amd64-amd64-xl-qemut-winxpsp3  7 windows-install     fail like 31848

Tests which did not succeed, but are not blocking:
 test-amd64-i386-libvirt       9 guest-start               fail never pass
 test-armhf-armhf-xl          10 migrate-support-check     fail never pass
 test-amd64-amd64-libvirt      9 guest-start               fail never pass
 test-amd64-amd64-xl-pcipt-intel  9 guest-start            fail never pass
 test-amd64-i386-xl-qemut-winxpsp3-vcpus1 14 guest-stop    fail never pass
 test-amd64-amd64-xl-qemuu-winxpsp3 14 guest-stop          fail never pass
 test-amd64-i386-xl-winxpsp3  14 guest-stop                fail never pass
 test-amd64-i386-xl-qemut-winxpsp3 14 guest-stop           fail never pass
 test-amd64-i386-xl-qemuu-winxpsp3-vcpus1 14 guest-stop    fail never pass
 test-amd64-i386-xl-qemuu-winxpsp3 14 guest-stop           fail never pass
 test-amd64-i386-xl-qemut-win7-amd64 14 guest-stop         fail never pass
 test-amd64-amd64-xl-qemuu-win7-amd64 14 guest-stop        fail never pass
 test-amd64-i386-xl-win7-amd64 14 guest-stop               fail never pass
 test-amd64-i386-xl-qemuu-win7-amd64 14 guest-stop         fail never pass
 test-amd64-amd64-xl-qemut-win7-amd64 14 guest-stop        fail never pass
 test-amd64-amd64-xl-win7-amd64 14 guest-stop              fail never pass
 test-amd64-amd64-xl-winxpsp3 14 guest-stop                fail never pass
 test-amd64-i386-xl-winxpsp3-vcpus1 14 guest-stop          fail never pass

version targeted for testing:
 qemuu  1ebb75b1fee779621b63e84fefa7b07354c43a99
baseline version:
 qemuu  a230ec3101ddda868252c036ea960af2b2d6cd5a


People who touched revisions under test:
  Jason Wang 
  Peter Maydell 


jobs:
 build-amd64  pass
 build-armhf  pass
 build-i386  pass
 build-amd64-libvirt  pass
 build-armhf-libvirt  pass
 build-i386-libvirt  pass
 build-amd64-pvops  pass
 build-armhf-pvops  pass
 build-i386-pvops  pass
 test-amd64-amd64-xl  pass
 test-armhf-armhf-xl  pass
 test-amd64-i386-xl  pass
 test-amd64-i386-rhel6hvm-amd  pass
 test-amd64-i386-qemut-rhel6hvm-amd  pass
 test-amd64-i386-qemuu-rhel6hvm-amd  pass
 test-amd64-amd64-xl-qemut-debianhvm-amd64  pass
 test-amd64-i386-xl-qemut-debianhvm-amd64  pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64  pass
 test-amd64-i386-xl-qemuu-debianhvm-amd64  pass
 test-amd64-i386-freebsd10-amd64  pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64  pass
 test-amd64-i386-xl-qemuu-ovmf-amd64  pass
 test-amd64-amd64-xl-qemut-win7-amd64  fail
 test-amd64-i386-xl-qemut-win7-amd64  fail
 test-amd64-amd64-xl-qemuu-win7-amd64  fail
 test-amd64-i386-xl-qemuu-win7-amd64  fail
 test-amd64-amd64-xl-win7-amd64  fail
 test-amd64-i386-xl-win7-amd64  fail
 test-amd64-i386-xl-credit2  pass
 test-amd64-i386-freebsd10-i386  pass
 test-amd64-amd64-xl-pcipt-intel  fail
 test-amd64-i386-rhel6hvm-intel  pass
 test-amd64-i386-qemut-rhel6hvm-intel  pass
 test-amd64-i386-qemuu-rhel6hvm-intel  pass
 test-amd64-amd64-libvirt  fail
 test-armhf-armhf-libvirt  fail
 test-amd64-i386-libvirt  fai

[Xen-devel] Announcing Xen Project Test Day for 4.5 RC3 on December 4

2014-12-03 Thread Russell Pavlicek

Folks,

This Thursday, December 4, is our third Test Day for the 4.5 release
cycle. Release Candidate 3 will be available for assessment on 
Wednesday.  Now is the time to see if the upcoming release of the 
Xen Project Hypervisor will work in your environment.

Information about testing this release can be found here:
http://wiki.xenproject.org/wiki/Xen_4.5_RC3_test_instructions

To learn more about Test Days, including the proposed dates 
for the RC4 Test Day and final release, check out:
http://wiki.xenproject.org/wiki/Xen_Project_Test_Days

See you in #xentest on IRC this Thursday for Test Day!


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [Qemu-devel] [PATCH] increase maxmem before calling xc_domain_populate_physmap

2014-12-03 Thread Don Slutz


On 12/03/14 07:20, Stefano Stabellini wrote:
> On Wed, 3 Dec 2014, Wei Liu wrote:
> > On Tue, Dec 02, 2014 at 03:23:29PM -0500, Don Slutz wrote:
> > [...]
> > > >    hw_error("xc_domain_getinfo failed");
> > > >    }
> > > > -if (xc_domain_setmaxmem(xen_xc, xen_domid, info.max_memkb +
> > > > -(nr_pfn * XC_PAGE_SIZE / 1024)) < 0) {
> > > > +max_pages = info.max_memkb * 1024 / XC_PAGE_SIZE;
> > > > +free_pages = max_pages - info.nr_pages;
> > > > +real_free = free_pages;
> > > > +if (free_pages > VGA_HOLE_SIZE) {
> > > > +free_pages -= VGA_HOLE_SIZE;
> > > > +} else {
> > > > +free_pages = 0;
> > > > +}
> > > > > I don't think we need to subtract VGA_HOLE_SIZE.
> > >
> > > If you do not use some slack (like 32 here), xen does report:
> > >
> > > (d5) [2014-11-29 17:07:21] Loaded SeaBIOS
> > > (d5) [2014-11-29 17:07:21] Creating MP tables ...
> > > (d5) [2014-11-29 17:07:21] Loading ACPI ...
> > > (XEN) [2014-11-29 17:07:21] page_alloc.c:1568:d5 Over-allocation for domain
> > > 5: 1057417 > 1057416
> > > (XEN) [2014-11-29 17:07:21] memory.c:158:d5 Could not allocate order=0
> >
> > This message is a bit of a red herring.
> >
> > It's hvmloader trying to populate ram for firmware data. The actual
> > amount of extra pages needed depends on the firmware.
> >
> > In any case it's safe to disallow hvmloader from doing so, it will just
> > relocate some pages from ram (hence shrinking *mem_end).
>
> That looks like a better solution

I went with a "leave some slack" approach so that the error message above
is not output.

When a change to hvmloader is done so that the message does not appear
during normal usage, the extra pages in QEMU can be dropped.

> > > extent: id=5 memflags=0 (0 of 1)
> > > (d5) [2014-11-29 17:07:21] vm86 TSS at 00098c00
> > > (d5) [2014-11-29 17:07:21] BIOS map:
> > >
> > > However while QEMU startup ends with 32 "free" pages (maxmem - curmem)
> > > this quickly changes to 23.  I.e. there are 7 more places that act like a
> > > call to xc_domain_populate_physmap_exact() but do not log errors if there
> > > are no free pages.  And just to make sure I do not fully understand what is
> > > happening here, after the message above, the domain appears to work
> > > fine and ends up with 8 "free" pages (code I do not know about ends up
> > > releasing populated pages).
> > >
> > > So my current thinking is to have QEMU leave a few (9 to 32 (64?)) pages
> > > "free".
> >
> > Unless we know beforehand how many pages hvmloader needs, this number is
> > an estimate at best.
>
> Right. It would be nice to get rid of any estimations by:
>
> - making hvmloader use normal ram
> - making qemu increase maxmem
> - removing all the estimation from libxl

Sounds like a plan for 4.6

-Don Slutz


[Xen-devel] [PATCH for 2.3 v2 1/1] xen-hvm: increase maxmem before calling xc_domain_populate_physmap

2014-12-03 Thread Don Slutz

From: Stefano Stabellini 

Increase maxmem before calling xc_domain_populate_physmap_exact to
avoid the risk of running out of guest memory. This way we can also
avoid complex memory calculations in libxl at domain construction
time.

This patch fixes an abort() when assigning more than 4 NICs to a VM.

Signed-off-by: Stefano Stabellini 
Signed-off-by: Don Slutz 
---
v2: Changes by Don Slutz
  Switch from xc_domain_getinfo to xc_domain_getinfolist
  Fix error check for xc_domain_getinfolist
  Limit increase of maxmem to only do when needed:
Add QEMU_SPARE_PAGES (How many pages to leave free)
Add free_pages calculation

 xen-hvm.c | 19 +++
 1 file changed, 19 insertions(+)

diff --git a/xen-hvm.c b/xen-hvm.c
index 7548794..d30e77e 100644
--- a/xen-hvm.c
+++ b/xen-hvm.c
@@ -90,6 +90,7 @@ static inline ioreq_t *xen_vcpu_ioreq(shared_iopage_t *shared_page, int vcpu)
 #endif
 
 #define BUFFER_IO_MAX_DELAY  100
+#define QEMU_SPARE_PAGES 16
 
 typedef struct XenPhysmap {
     hwaddr start_addr;
@@ -244,6 +245,8 @@ void xen_ram_alloc(ram_addr_t ram_addr, ram_addr_t size, MemoryRegion *mr)
     unsigned long nr_pfn;
     xen_pfn_t *pfn_list;
     int i;
+    xc_domaininfo_t info;
+    unsigned long free_pages;
 
     if (runstate_check(RUN_STATE_INMIGRATE)) {
         /* RAM already populated in Xen */
@@ -266,6 +269,22 @@ void xen_ram_alloc(ram_addr_t ram_addr, ram_addr_t size, MemoryRegion *mr)
         pfn_list[i] = (ram_addr >> TARGET_PAGE_BITS) + i;
     }
 
+    if ((xc_domain_getinfolist(xen_xc, xen_domid, 1, &info) != 1) ||
+        (info.domain != xen_domid)) {
+        hw_error("xc_domain_getinfolist failed");
+    }
+    free_pages = info.max_pages - info.tot_pages;
+    if (free_pages > QEMU_SPARE_PAGES) {
+        free_pages -= QEMU_SPARE_PAGES;
+    } else {
+        free_pages = 0;
+    }
+    if ((free_pages < nr_pfn) &&
+        (xc_domain_setmaxmem(xen_xc, xen_domid,
+                             ((info.max_pages + nr_pfn - free_pages)
+                              << (XC_PAGE_SHIFT - 10))) < 0)) {
+        hw_error("xc_domain_setmaxmem failed");
+    }
     if (xc_domain_populate_physmap_exact(xen_xc, xen_domid, nr_pfn, 0, 0, pfn_list)) {
         hw_error("xen: failed to populate ram at " RAM_ADDR_FMT, ram_addr);
     }
-- 
1.8.4
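
The arithmetic in the patch above can be tried in isolation. The sketch below
is only an illustration of the calculation, assuming 4 KiB pages and assuming
tot_pages never exceeds max_pages; maxmem_kb_needed(), SPARE_PAGES and
PAGE_SHIFT_4K are hypothetical names, and no libxc call is made. It returns
the value that would be passed to xc_domain_setmaxmem() (in KiB), or 0 when
the existing limit already leaves enough room.

```c
#include <assert.h>
#include <stdint.h>

#define SPARE_PAGES   16u   /* mirrors QEMU_SPARE_PAGES in the patch */
#define PAGE_SHIFT_4K 12u   /* assumption: 4 KiB pages */

/*
 * Return the new maxmem in KiB, or 0 if the current limit already leaves
 * room for nr_pfn more pages after setting the spare pages aside.
 * Assumes tot_pages <= max_pages.
 */
static uint64_t maxmem_kb_needed(uint64_t max_pages, uint64_t tot_pages,
                                 uint64_t nr_pfn)
{
    uint64_t free_pages = max_pages - tot_pages;

    /* Keep SPARE_PAGES out of the budget, clamping at zero. */
    free_pages = (free_pages > SPARE_PAGES) ? free_pages - SPARE_PAGES : 0;

    if (free_pages >= nr_pfn)
        return 0;   /* enough headroom, leave maxmem alone */

    /* Raise the limit by the shortfall only; pages -> KiB is << (12 - 10). */
    return (max_pages + nr_pfn - free_pages) << (PAGE_SHIFT_4K - 10);
}
```

For example, with max_pages = 1000, tot_pages = 990 and nr_pfn = 50, the ten
free pages are all treated as spare, so the limit is raised to 1050 pages,
i.e. 4200 KiB.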



Re: [Xen-devel] [PATCH for-4.5] libxl: expose #define to 4.5 and above

2014-12-03 Thread Andrew Cooper

On 03/12/14 10:41, Wei Liu wrote:
> In e3abab74 (libxl: un-constify return value of libxl_basename), the
> macro was exposed to releases < 4.5. However only new code is able to
> make use of that macro so it should be exposed to releases >= 4.5.
>
> Signed-off-by: Wei Liu 
> Cc: Ian Campbell 
> Cc: Ian Jackson 

Reviewed-by: Andrew Cooper 

> ---
>  tools/libxl/libxl.h   |6 +++---
>  tools/libxl/libxl_utils.c |2 +-
>  tools/libxl/libxl_utils.h |2 +-
>  3 files changed, 5 insertions(+), 5 deletions(-)
>
> diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
> index 291c190..0a123f1 100644
> --- a/tools/libxl/libxl.h
> +++ b/tools/libxl/libxl.h
> @@ -478,13 +478,13 @@ typedef struct libxl__ctx libxl_ctx;
>  #endif
>  
>  /*
> - * LIBXL_HAVE_CONST_LIBXL_BASENAME_RETURN_VALUE
> + * LIBXL_HAVE_NONCONST_LIBXL_BASENAME_RETURN_VALUE
>   *
>   * The return value of libxl_basename is malloc'ed but the erroneously
>   * marked as "const" in releases before 4.5.
>   */
> -#if defined(LIBXL_API_VERSION) && LIBXL_API_VERSION < 0x040500
> -#define LIBXL_HAVE_CONST_LIBXL_BASENAME_RETURN_VALUE 1
> +#if !defined(LIBXL_API_VERSION) || LIBXL_API_VERSION >= 0x040500
> +#define LIBXL_HAVE_NONCONST_LIBXL_BASENAME_RETURN_VALUE 1
>  #endif
>  
>  /*
> diff --git a/tools/libxl/libxl_utils.c b/tools/libxl/libxl_utils.c
> index 22119fc..7095b58 100644
> --- a/tools/libxl/libxl_utils.c
> +++ b/tools/libxl/libxl_utils.c
> @@ -19,7 +19,7 @@
>  
>  #include "libxl_internal.h"
>  
> -#ifdef LIBXL_HAVE_CONST_LIBXL_BASENAME_RETURN_VALUE
> +#ifndef LIBXL_HAVE_NONCONST_LIBXL_BASENAME_RETURN_VALUE
>  const
>  #endif
>  char *libxl_basename(const char *name)
> diff --git a/tools/libxl/libxl_utils.h b/tools/libxl/libxl_utils.h
> index 8277eb9..acacdd9 100644
> --- a/tools/libxl/libxl_utils.h
> +++ b/tools/libxl/libxl_utils.h
> @@ -18,7 +18,7 @@
>  
>  #include "libxl.h"
>  
> -#ifdef LIBXL_HAVE_CONST_LIBXL_BASENAME_RETURN_VALUE
> +#ifndef LIBXL_HAVE_NONCONST_LIBXL_BASENAME_RETURN_VALUE
>  const
>  #endif
>  char *libxl_basename(const char *name); /* returns string from strdup */



Re: [Xen-devel] [Qemu-devel] [PATCH] increase maxmem before calling xc_domain_populate_physmap

2014-12-03 Thread Stefano Stabellini

On Wed, 3 Dec 2014, Wei Liu wrote:
> On Tue, Dec 02, 2014 at 03:23:29PM -0500, Don Slutz wrote:
> [...]
> >    hw_error("xc_domain_getinfo failed");
> >    }
> > -if (xc_domain_setmaxmem(xen_xc, xen_domid, info.max_memkb +
> > -(nr_pfn * XC_PAGE_SIZE / 1024)) < 0) {
> > +max_pages = info.max_memkb * 1024 / XC_PAGE_SIZE;
> > +free_pages = max_pages - info.nr_pages;
> > +real_free = free_pages;
> > +if (free_pages > VGA_HOLE_SIZE) {
> > +free_pages -= VGA_HOLE_SIZE;
> > +} else {
> > +free_pages = 0;
> > +}
> > >I don't think we need to subtract VGA_HOLE_SIZE.
> > 
> > If you do not use some slack (like 32 here), xen does report:
> > 
> > 
> > (d5) [2014-11-29 17:07:21] Loaded SeaBIOS
> > (d5) [2014-11-29 17:07:21] Creating MP tables ...
> > (d5) [2014-11-29 17:07:21] Loading ACPI ...
> > (XEN) [2014-11-29 17:07:21] page_alloc.c:1568:d5 Over-allocation for domain
> > 5: 1057417 > 1057416
> > (XEN) [2014-11-29 17:07:21] memory.c:158:d5 Could not allocate order=0
> 
> This message is a bit of a red herring.
> 
> It's hvmloader trying to populate ram for firmware data. The actual
> amount of extra pages needed depends on the firmware.
> 
> In any case it's safe to disallow hvmloader from doing so, it will just
> relocate some pages from ram (hence shrinking *mem_end).

That looks like a better solution


> > extent: id=5 memflags=0 (0 of 1)
> > (d5) [2014-11-29 17:07:21] vm86 TSS at 00098c00
> > (d5) [2014-11-29 17:07:21] BIOS map:
> > 
> > 
> > However while QEMU startup ends with 32 "free" pages (maxmem - curmem)
> > this quickly changes to 23.  I.e. there are 7 more places that act like a
> > call to xc_domain_populate_physmap_exact() but do not log errors if there
> > are no free pages.  And just to make sure I do not fully understand what is
> > happening here, after the message above, the domain appears to work
> > fine and ends up with 8 "free" pages (code I do not know about ends up
> > releasing populated pages).
> > 
> > So my current thinking is to have QEMU leave a few (9 to 32 (64?)) pages
> > "free".
> > 
> 
> Unless we know beforehand how many pages hvmloader needs, this number is
> an estimate at best.
 
Right. It would be nice to get rid of any estimations by:
- making hvmloader use normal ram
- making qemu increase maxmem
- removing all the estimation from libxl


[Xen-devel] [linux-linus test] 32019: regressions - trouble: blocked/broken/fail/pass

2014-12-03 Thread xen . org

flight 32019 linux-linus real [real]
http://www.chiark.greenend.org.uk/~xensrcts/logs/32019/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-i386-rumpuserxen-i386  8 guest-start  fail  REGR. vs. 31241
 test-amd64-i386-xl  1 STARTING  running  [st=running!]
 test-amd64-i386-xl-multivcpu  11 guest-saverestore  fail  REGR. vs. 31241
 test-amd64-i386-xl-credit2  broken
 test-amd64-i386-rhel6hvm-intel  5 xen-boot  fail  REGR. vs. 31241
 test-amd64-i386-qemuu-rhel6hvm-intel  1 STARTING  running  [st=running!]
 test-amd64-i386-xl-win7-amd64  blocked
 test-amd64-amd64-xl-qemut-winxpsp3  2 STARTING  running  [st=running!]
 test-amd64-i386-xl-qemut-debianhvm-amd64  broken
 test-amd64-amd64-xl-qemut-debianhvm-amd64  2 hosts-allocate  broken  REGR. vs. 31241
 test-amd64-amd64-xl-sedf-pin  1 build-check(1)  running  [st=running!]
 test-amd64-amd64-rumpuserxen-amd64  8 guest-start  fail  REGR. vs. 31241
 test-amd64-amd64-xl-sedf  1 build-check(1)  running  [st=running!]
 test-amd64-i386-xl-qemut-winxpsp3-vcpus1  broken
 test-amd64-i386-xl-qemut-winxpsp3  blocked
 test-amd64-amd64-xl-winxpsp3  1 build-check(1)  running  [st=running!]

Regressions which are regarded as allowable (not blocking):
 test-armhf-armhf-xl  9 guest-start  fail  like 31241
 test-amd64-i386-freebsd10-i386  7 freebsd-install  fail  like 31241
 test-amd64-amd64-xl-pcipt-intel  2 hosts-allocate  broken  REGR. vs. 31241
 test-amd64-i386-pair  17 guest-migrate/src_host/dst_host  fail  like 31241
 test-amd64-amd64-xl-qemuu-winxpsp3  7 windows-install  fail  like 31241

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-libvirt  9 guest-start  fail  never pass
 test-amd64-i386-libvirt  1 build-check(1)  blocked  n/a
 test-amd64-i386-freebsd10-amd64  1 build-check(1)  blocked  n/a
 test-amd64-i386-qemut-rhel6hvm-intel  1 build-check(1)  blocked  n/a
 test-amd64-i386-xl-qemuu-winxpsp3-vcpus1  14 guest-stop  fail  never pass
 test-amd64-i386-xl-qemuu-winxpsp3  14 guest-stop  fail  never pass
 test-amd64-i386-xl-qemut-win7-amd64  14 guest-stop  fail  never pass
 test-amd64-i386-xl-qemuu-win7-amd64  14 guest-stop  fail  never pass
 test-amd64-amd64-xl-win7-amd64  14 guest-stop  fail  never pass
 test-amd64-amd64-libvirt  9 guest-start  fail  never pass
 test-amd64-amd64-xl-qemuu-win7-amd64  14 guest-stop  fail  never pass
 test-amd64-i386-xl-winxpsp3  14 guest-stop  fail  never pass
 test-amd64-amd64-xl-qemut-win7-amd64  14 guest-stop  fail  never pass
 test-amd64-i386-xl-winxpsp3-vcpus1  14 guest-stop  fail  never pass

version targeted for testing:
 linux  3a18ca061311f2f1ee9c44012f89c7436d392117
baseline version:
 linux  9f76628da20f96a179ca62b504886f99ecc29223


700 people touched revisions under test,
not listing them all


jobs:
 build-amd64  pass
 build-armhf  pass
 build-i386  pass
 build-amd64-libvirt  pass
 build-armhf-libvirt  pass
 build-i386-libvirt  pass
 build-amd64-pvops  pass
 build-armhf-pvops  pass
 build-i386-pvops  pass
 build-amd64-rumpuserxen  pass
 build-i386-rumpuserxen  pass
 test-amd64-amd64-xl  pass
 test-armhf-armhf-xl  fail
 test-amd64-i386-xl  blocked
 test-amd64-i386-rhel6hvm-amd  pass
 test-amd64-i386-qemut-rhel6hvm-amd  pass
 test-amd64-i386-qemuu-rhel6hvm-amd  pass
 test-amd64-amd64-xl-qemut-debianhvm-amd64  broken
 test-amd64-i386-xl-qemut-debianhvm-amd64  broken
 test-amd64-amd64-xl-qemuu-debianhvm-amd64  pass
 test-amd64-i386-xl-qemuu-debianhvm-amd64  pass
 test-amd64-i386-freebsd10-amd64  blocked
 test-amd64-amd64-xl-qemuu-ovmf-amd64 p

Re: [Xen-devel] [RFC v2] Add support for Xen ARM guest on FreeBSD

2014-12-03 Thread Julien Grall


On 02/12/2014 18:30, Warner Losh wrote:
> Hey Julien,

Hi Warner,

> Have you rebased your patch train after Andrew’s commits?


I just pushed a new branch rebased on the latest master:

git://xenbits.xen.org/people/julieng/freebsd.git branch xen-arm-v2.2

I can re-export the patch into files if necessary.

Regards,

--
Julien Grall


Re: [Xen-devel] xl pci-attach silently fails the first time

2014-12-03 Thread Olaf Hering

On Wed, Dec 03, Olaf Hering wrote:

> On Tue, Dec 02, Konrad Rzeszutek Wilk wrote:
> > On Tue, Dec 02, 2014 at 04:46:52PM +0100, Olaf Hering wrote:
> > > On Mon, Dec 01, Konrad Rzeszutek Wilk wrote:
> > ACPI hotplug. And it does work after PCI discovery.
> In a pvops kernel, is the emulated but unplugged PCI hardware still listed 
> with lspci?

It is not. So that's why it happens to work.

So how would I trigger an ACPI hotplug event within QEMU's unplug code?

Olaf


Re: [Xen-devel] [Qemu-devel] [PATCH] increase maxmem before calling xc_domain_populate_physmap

2014-12-03 Thread Wei Liu

On Tue, Dec 02, 2014 at 03:23:29PM -0500, Don Slutz wrote:
[...]
>    hw_error("xc_domain_getinfo failed");
>    }
> -if (xc_domain_setmaxmem(xen_xc, xen_domid, info.max_memkb +
> -(nr_pfn * XC_PAGE_SIZE / 1024)) < 0) {
> +max_pages = info.max_memkb * 1024 / XC_PAGE_SIZE;
> +free_pages = max_pages - info.nr_pages;
> +real_free = free_pages;
> +if (free_pages > VGA_HOLE_SIZE) {
> +free_pages -= VGA_HOLE_SIZE;
> +} else {
> +free_pages = 0;
> +}
> >I don't think we need to subtract VGA_HOLE_SIZE.
> 
> If you do not use some slack (like 32 here), xen does report:
> 
> 
> (d5) [2014-11-29 17:07:21] Loaded SeaBIOS
> (d5) [2014-11-29 17:07:21] Creating MP tables ...
> (d5) [2014-11-29 17:07:21] Loading ACPI ...
> (XEN) [2014-11-29 17:07:21] page_alloc.c:1568:d5 Over-allocation for domain
> 5: 1057417 > 1057416
> (XEN) [2014-11-29 17:07:21] memory.c:158:d5 Could not allocate order=0

This message is a bit of a red herring.

It's hvmloader trying to populate ram for firmware data. The actual
amount of extra pages needed depends on the firmware.

In any case it's safe to disallow hvmloader from doing so, it will just
relocate some pages from ram (hence shrinking *mem_end).

> extent: id=5 memflags=0 (0 of 1)
> (d5) [2014-11-29 17:07:21] vm86 TSS at 00098c00
> (d5) [2014-11-29 17:07:21] BIOS map:
> 
> 
> However while QEMU startup ends with 32 "free" pages (maxmem - curmem)
> this quickly changes to 23.  I.e. there are 7 more places that act like a
> call to xc_domain_populate_physmap_exact() but do not log errors if there
> are no free pages.  And just to make sure I do not fully understand what is
> happening here, after the message above, the domain appears to work
> fine and ends up with 8 "free" pages (code I do not know about ends up
> releasing populated pages).
> 
> So my current thinking is to have QEMU leave a few (9 to 32 (64?)) pages
> "free".
> 

Unless we know beforehand how many pages hvmloader needs, this number is
an estimate at best.

Wei.


Re: [Xen-devel] [PATCH for-4.5] systemd: use pkg-config to determine systemd library availability

2014-12-03 Thread Olaf Hering

On Wed, Dec 03, Ian Campbell wrote:

> On Wed, 2014-12-03 at 11:49 +0100, Olaf Hering wrote:
> > On Wed, Dec 03, Ian Campbell wrote:
> > 
> > > Ah I didn't know about the sd_listen_fds thing, so I think that what we
> > > need then is to use pkg-config first to determine if systemd-daemon is
> > > present at all, and then check for specific symbols we require using the
> > > pkg-config supplied CFLAGS and LDFLAGS rather than assuming
> > > -lsystemd-daemon.
> > 
> > Correction: sd_listen_fds is available since at least v1.
> >  git describe --contains abbbea81a8fa70badb7a05e518d6b07c360fc09d
> >  v1~390
> 
> In that case I don't think we realistically need to check for it?

Yes. Anything before 208 is stale. At least I don't have anything older
around for testing.

Olaf


Re: [Xen-devel] [PATCH for-4.5] libxl: expose #define to 4.5 and above

2014-12-03 Thread Ian Campbell

On Wed, 2014-12-03 at 10:41 +, Wei Liu wrote:
> In e3abab74 (libxl: un-constify return value of libxl_basename), the
> macro was exposed to releases < 4.5. However only new code is able to
> make use of that macro so it should be exposed to releases >= 4.5.
> 
> Signed-off-by: Wei Liu 
> Cc: Ian Campbell 
> Cc: Ian Jackson 
> Cc: Andrew Cooper 

Acked-by: Ian Campbell 

Konrad, given that the original patch is in 4.5 (as of yesterday) we
should obviously take this one too.

> ---
>  tools/libxl/libxl.h   |6 +++---
>  tools/libxl/libxl_utils.c |2 +-
>  tools/libxl/libxl_utils.h |2 +-
>  3 files changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
> index 291c190..0a123f1 100644
> --- a/tools/libxl/libxl.h
> +++ b/tools/libxl/libxl.h
> @@ -478,13 +478,13 @@ typedef struct libxl__ctx libxl_ctx;
>  #endif
>  
>  /*
> - * LIBXL_HAVE_CONST_LIBXL_BASENAME_RETURN_VALUE
> + * LIBXL_HAVE_NONCONST_LIBXL_BASENAME_RETURN_VALUE
>   *
>   * The return value of libxl_basename is malloc'ed but the erroneously
>   * marked as "const" in releases before 4.5.
>   */
> -#if defined(LIBXL_API_VERSION) && LIBXL_API_VERSION < 0x040500
> -#define LIBXL_HAVE_CONST_LIBXL_BASENAME_RETURN_VALUE 1
> +#if !defined(LIBXL_API_VERSION) || LIBXL_API_VERSION >= 0x040500
> +#define LIBXL_HAVE_NONCONST_LIBXL_BASENAME_RETURN_VALUE 1
>  #endif
>  
>  /*
> diff --git a/tools/libxl/libxl_utils.c b/tools/libxl/libxl_utils.c
> index 22119fc..7095b58 100644
> --- a/tools/libxl/libxl_utils.c
> +++ b/tools/libxl/libxl_utils.c
> @@ -19,7 +19,7 @@
>  
>  #include "libxl_internal.h"
>  
> -#ifdef LIBXL_HAVE_CONST_LIBXL_BASENAME_RETURN_VALUE
> +#ifndef LIBXL_HAVE_NONCONST_LIBXL_BASENAME_RETURN_VALUE
>  const
>  #endif
>  char *libxl_basename(const char *name)
> diff --git a/tools/libxl/libxl_utils.h b/tools/libxl/libxl_utils.h
> index 8277eb9..acacdd9 100644
> --- a/tools/libxl/libxl_utils.h
> +++ b/tools/libxl/libxl_utils.h
> @@ -18,7 +18,7 @@
>  
>  #include "libxl.h"
>  
> -#ifdef LIBXL_HAVE_CONST_LIBXL_BASENAME_RETURN_VALUE
> +#ifndef LIBXL_HAVE_NONCONST_LIBXL_BASENAME_RETURN_VALUE
>  const
>  #endif
>  char *libxl_basename(const char *name); /* returns string from strdup */




Re: [Xen-devel] [PATCH for-4.5] systemd: use pkg-config to determine systemd library availability

2014-12-03 Thread Ian Campbell

On Wed, 2014-12-03 at 11:49 +0100, Olaf Hering wrote:
> On Wed, Dec 03, Ian Campbell wrote:
> 
> > Ah I didn't know about the sd_listen_fds thing, so I think that what we
> > need then is to use pkg-config first to determine if systemd-daemon is
> > present at all, and then check for specific symbols we require using the
> > pkg-config supplied CFLAGS and LDFLAGS rather than assuming
> > -lsystemd-daemon.
> 
> Correction: sd_listen_fds is available since at least v1.
>  git describe --contains abbbea81a8fa70badb7a05e518d6b07c360fc09d
>  v1~390

In that case I don't think we realistically need to check for it?

Ian.




Re: [Xen-devel] [PATCH for-4.5] systemd: use pkg-config to determine systemd library availability

2014-12-03 Thread Olaf Hering

On Wed, Dec 03, Ian Campbell wrote:

> Ah I didn't know about the sd_listen_fds thing, so I think that what we
> need then is to use pkg-config first to determine if systemd-daemon is
> present at all, and then check for specific symbols we require using the
> pkg-config supplied CFLAGS and LDFLAGS rather than assuming
> -lsystemd-daemon.

Correction: sd_listen_fds is available since at least v1.
 git describe --contains abbbea81a8fa70badb7a05e518d6b07c360fc09d
 v1~390

Olaf


[Xen-devel] [PATCH for-4.5] libxl: expose #define to 4.5 and above

2014-12-03 Thread Wei Liu

In e3abab74 (libxl: un-constify return value of libxl_basename), the
macro was exposed to releases < 4.5. However only new code is able to
make use of that macro so it should be exposed to releases >= 4.5.

Signed-off-by: Wei Liu 
Cc: Ian Campbell 
Cc: Ian Jackson 
Cc: Andrew Cooper 
---
 tools/libxl/libxl.h   |6 +++---
 tools/libxl/libxl_utils.c |2 +-
 tools/libxl/libxl_utils.h |2 +-
 3 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
index 291c190..0a123f1 100644
--- a/tools/libxl/libxl.h
+++ b/tools/libxl/libxl.h
@@ -478,13 +478,13 @@ typedef struct libxl__ctx libxl_ctx;
 #endif
 
 /*
- * LIBXL_HAVE_CONST_LIBXL_BASENAME_RETURN_VALUE
+ * LIBXL_HAVE_NONCONST_LIBXL_BASENAME_RETURN_VALUE
  *
  * The return value of libxl_basename is malloc'ed but the erroneously
  * marked as "const" in releases before 4.5.
  */
-#if defined(LIBXL_API_VERSION) && LIBXL_API_VERSION < 0x040500
-#define LIBXL_HAVE_CONST_LIBXL_BASENAME_RETURN_VALUE 1
+#if !defined(LIBXL_API_VERSION) || LIBXL_API_VERSION >= 0x040500
+#define LIBXL_HAVE_NONCONST_LIBXL_BASENAME_RETURN_VALUE 1
 #endif
 
 /*
diff --git a/tools/libxl/libxl_utils.c b/tools/libxl/libxl_utils.c
index 22119fc..7095b58 100644
--- a/tools/libxl/libxl_utils.c
+++ b/tools/libxl/libxl_utils.c
@@ -19,7 +19,7 @@
 
 #include "libxl_internal.h"
 
-#ifdef LIBXL_HAVE_CONST_LIBXL_BASENAME_RETURN_VALUE
+#ifndef LIBXL_HAVE_NONCONST_LIBXL_BASENAME_RETURN_VALUE
 const
 #endif
 char *libxl_basename(const char *name)
diff --git a/tools/libxl/libxl_utils.h b/tools/libxl/libxl_utils.h
index 8277eb9..acacdd9 100644
--- a/tools/libxl/libxl_utils.h
+++ b/tools/libxl/libxl_utils.h
@@ -18,7 +18,7 @@
 
 #include "libxl.h"
 
-#ifdef LIBXL_HAVE_CONST_LIBXL_BASENAME_RETURN_VALUE
+#ifndef LIBXL_HAVE_NONCONST_LIBXL_BASENAME_RETURN_VALUE
 const
 #endif
 char *libxl_basename(const char *name); /* returns string from strdup */
-- 
1.7.10.4
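
As a rough illustration of how calling code might consume the renamed macro,
the sketch below repeats the #if condition from the patched libxl.h and uses
it to pick the pointer type that receives the result. demo_basename() and
demo_basename_equals() are hypothetical stand-ins so that the example builds
without libxl; the real libxl_basename() may treat edge cases (such as a NULL
or empty name) differently.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Same condition as the patched libxl.h: defined for API version >= 4.5,
 * or when the application does not pin LIBXL_API_VERSION at all. */
#if !defined(LIBXL_API_VERSION) || LIBXL_API_VERSION >= 0x040500
#define LIBXL_HAVE_NONCONST_LIBXL_BASENAME_RETURN_VALUE 1
#endif

/* Hypothetical stand-in for libxl_basename(): returns a malloc'ed copy of
 * the last path component, or NULL on allocation failure. */
#ifndef LIBXL_HAVE_NONCONST_LIBXL_BASENAME_RETURN_VALUE
const
#endif
char *demo_basename(const char *name)
{
    const char *slash = strrchr(name, '/');
    const char *base = slash ? slash + 1 : name;
    size_t len = strlen(base) + 1;
    char *copy = malloc(len);

    if (copy)
        memcpy(copy, base, len);
    return copy;
}

/* Caller side: the macro tells us which pointer type to declare. */
static int demo_basename_equals(const char *path, const char *expected)
{
#ifdef LIBXL_HAVE_NONCONST_LIBXL_BASENAME_RETURN_VALUE
    char *base = demo_basename(path);
#else
    const char *base = demo_basename(path);
#endif
    int same = base != NULL && strcmp(base, expected) == 0;

    free((void *)base);   /* the cast only matters for the "const" variant */
    return same;
}
```

Building with -DLIBXL_API_VERSION=0x040400 selects the old "const" signature,
which is the case the caller-side #else branch is meant to cover.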



Re: [Xen-devel] Time dilation in XEN 4.2

2014-12-03 Thread George Dunlap

On Mon, Dec 1, 2014 at 8:10 PM, leon zawodowiec  wrote:

> Hello, is it possible to implement time dilation in XEN 4.2 without
> modifying the kernel (hints are also welcome)? Thank you for replies.
>

I don't think anyone knows what you're talking about.

It might be helpful if you read this document and then try asking your
question again with a bit more detail:

wiki.xenproject.org/wiki/Asking_Developer_Questions

 -George

Re: [Xen-devel] [PATCH for-4.5] systemd: use pkg-config to determine systemd library availability

2014-12-03 Thread Ian Campbell

On Wed, 2014-12-03 at 11:26 +0100, Olaf Hering wrote:
> On Tue, Dec 02, Konrad Rzeszutek Wilk wrote:
> 
> > On Tue, Dec 02, 2014 at 03:11:30PM +, Wei Liu wrote:
> > > AC_CHECK_LIB fails on Debian Jessie since the ld flag it generates is
> > > incorrect, even in the event systemd library is available.  Use
> > > PKG_CHECK_MODULES instead.
> > > 
> > > Tested on Debian Jessie and Arch Linux.
> > 
> > And Fedora and SuSE? CC-ing the other distro maintainers
> > for their input.
> 
> I'm fine with that. But:
> 
> It seems that sd_listen_fds() is new in v209. It was backported to
> v208 in openSUSE 13.1. So there should be some detection if
> sd_listen_fds() is really available. Looks like this patch removes the
> check.

Ah I didn't know about the sd_listen_fds thing, so I think that what we
need then is to use pkg-config first to determine if systemd-daemon is
present at all, and then check for specific symbols we require using the
pkg-config supplied CFLAGS and LDFLAGS rather than assuming
-lsystemd-daemon.

Ian.

> 
> I get this from pkg-config:
> 
> root@optiplex:/work/olaf/13.1/github/olafhering/xen.git # pkg-config --cflags libsystemd-daemon ; echo $?
> 
> 0
> root@optiplex:/work/olaf/13.1/github/olafhering/xen.git # pkg-config --libs libsystemd-daemon ; echo $?
> -lsystemd-daemon 
> 0
> 
> Olaf




Re: [Xen-devel] [PATCH for-4.5] systemd: use pkg-config to determine systemd library availability

2014-12-03 Thread Olaf Hering

On Tue, Dec 02, Konrad Rzeszutek Wilk wrote:

> On Tue, Dec 02, 2014 at 03:11:30PM +, Wei Liu wrote:
> > AC_CHECK_LIB fails on Debian Jessie since the ld flag it generates is
> > incorrect, even in the event systemd library is available.  Use
> > PKG_CHECK_MODULES instead.
> > 
> > Tested on Debian Jessie and Arch Linux.
> 
> And Fedora and SuSE? CC-ing the other distro maintainers
> for their input.

I'm fine with that. But:

It seems that sd_listen_fds() is new in v209. It was backported to
v208 in openSUSE 13.1. So there should be some detection if
sd_listen_fds() is really available. Looks like this patch removes the
check.

I get this from pkg-config:

root@optiplex:/work/olaf/13.1/github/olafhering/xen.git # pkg-config --cflags libsystemd-daemon ; echo $?

0
root@optiplex:/work/olaf/13.1/github/olafhering/xen.git # pkg-config --libs libsystemd-daemon ; echo $?
-lsystemd-daemon 
0

Olaf


Re: [Xen-devel] [PATCH] libxl: Fix building libxlu_cfg_y.y with bison 3.0

2014-12-03 Thread Olaf Hering

On Wed, Dec 03, Ian Campbell wrote:

> There was a point in time where the prevailing version of bison (or
> maybe flex) in stable distro releases had a bug which meant these files
> could not be regenerated easily on common distros. I don't recall the
> details well enough to know if that time has now passed. Perhaps Ian J
> does.

It's very easy to build a private bison/flex/whatever and use this
instead of the one from the system. Our configure can easily detect a broken
version and refuse to compile.  After all, we don't ship .i or .s files
because some compilers happen to optimize better.

Olaf


Re: [Xen-devel] [PATCH] libxl: Fix building libxlu_cfg_y.y with bison 3.0

2014-12-03 Thread Ian Campbell

On Tue, 2014-12-02 at 09:49 -0800, Ed Swierk wrote:
> On Tue, Dec 2, 2014 at 6:00 AM, Andrew Cooper wrote:
> > The automatic generation doesn't actually work.  Depending on the
> > relative timestamps produced by an SCM checkout or a tarball extraction,
> > the build will attempt to regenerate the files.
> >
> > These files are regenerated in the XenServer build, simply because of
> > their order in the archived tarball.
> 
> When I clone the xen tree from git, the timestamps match about 95% of
> the time, but the 5% failure rate was annoying enough that I finally
> dug in to fix the parser build.
> 
> IMHO the generated files should be omitted from the source tree; as
> long as the source files are actually buildable, there's no reason not
> to treat them like any other source file.

There was a point in time where the prevailing version of bison (or
maybe flex) in stable distro releases had a bug which meant these files
could not be regenerated easily on common distros. I don't recall the
details well enough to know if that time has now passed. Perhaps Ian J
does.

Ian.
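
For illustration, the timestamp problem described above follows from the
make-style rule that a generated file is rebuilt whenever it is missing or
older than its source. A minimal sketch in C, assuming nothing about the
actual Xen makefiles; demo_needs_regen() and its parameters are hypothetical
names chosen for this example:

```c
#include <assert.h>
#include <time.h>

/*
 * Hypothetical helper, not part of the Xen build system.
 * Mirrors the make-style rule: regenerate when the generated file is
 * missing or strictly older than the source it was built from.
 */
int demo_needs_regen(int generated_exists,
                     time_t source_mtime, time_t generated_mtime)
{
    if (!generated_exists)
        return 1;

    /* (time_t)-1 is the usual "unknown time" sentinel; reject it. */
    assert(source_mtime != (time_t)-1 && generated_mtime != (time_t)-1);

    /* Equal timestamps count as up to date; only a newer source triggers. */
    return difftime(source_mtime, generated_mtime) > 0.0;
}
```

A checkout or tarball extraction that happens to write the .y source after
the generated parser files gives the source the newer mtime, so the rule
fires even though nothing in the grammar changed.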


Re: [Xen-devel] [PATCH] tools/hotplug: update systemd dependency to use service instead of socket

2014-12-03 Thread Olaf Hering

On Tue, Dec 02, Olaf Hering wrote:

> Since commit 4542ae340d75bd6319e3fcd94e6c9336e210aeef ("tools/hotplug:
> systemd xenstored dependencies") all service files use the .socket unit
> as startup dependency. While this happens to work for boot it fails for
> shutdown because a .socket does not seem to enforce ordering. When
> xendomains.service runs during shutdown then systemd will stop
> xenstored.service at the same time.
> 
> Change all "xenstored.socket" to "xenstored.service" to let systemd know
> that xenstored has to be shut down after everything else.
> 
> Reported-by: Mark Pryor 
> Signed-off-by: Olaf Hering 
> Cc: Ian Jackson 
> Cc: Stefano Stabellini 
> Cc: Ian Campbell 
> Cc: Wei Liu 

Tested-by: Olaf Hering 

I was able to reproduce the hang on shutdown with openSUSE 13.1. This
patch fixes the hang.

Olaf

Re: [Xen-devel] [PATCH v2 for-4.5 1/2] libxl: un-constify return value of libxl_basename

2014-12-03 Thread Ian Campbell

On Tue, 2014-12-02 at 18:44 +, Andrew Cooper wrote:
> On 01/12/14 11:31, Wei Liu wrote:
> > The string returned is malloc'ed but marked as "const".
> >
> > Signed-off-by: Wei Liu 
> > Cc: Ian Campbell 
> > Cc: Ian Jackson 
> > ---
> >  tools/libxl/libxl.h   |   10 ++
> >  tools/libxl/libxl_utils.c |5 -
> >  tools/libxl/libxl_utils.h |6 +-
> >  3 files changed, 19 insertions(+), 2 deletions(-)
> >
> > diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
> > index 41d6e8d..291c190 100644
> > --- a/tools/libxl/libxl.h
> > +++ b/tools/libxl/libxl.h
> > @@ -478,6 +478,16 @@ typedef struct libxl__ctx libxl_ctx;
> >  #endif
> >  
> >  /*
> > + * LIBXL_HAVE_CONST_LIBXL_BASENAME_RETURN_VALUE
> > + *
> > + * The return value of libxl_basename is malloc'ed but the erroneously
> > + * marked as "const" in releases before 4.5.
> > + */
> > +#if defined(LIBXL_API_VERSION) && LIBXL_API_VERSION < 0x040500
> > +#define LIBXL_HAVE_CONST_LIBXL_BASENAME_RETURN_VALUE 1
> > +#endif
> 
> This define is currently useless.  Only newer code is capable of making
> use of newly introduced LIBXL_HAVE_$FOO flags, and with its current
> arrangement, this flag is only exposed to code requesting an older API
> version.
> 
> This instead needs to be LIBXL_HAVE_NONCONST_LIBXL_BASENAME_RETURN_VALUE
> which should be 1 for any API version >= 4.5

Oops, yes. Wei, can you send an incremental fixup please?

Ian.
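
For reference, a sketch of the arrangement Andrew suggests: the flag is
defined for API versions >= 4.5, so newer callers can test for it.
demo_basename() and demo_basename_matches() are hypothetical stand-ins, not
the real libxl functions, and defaulting LIBXL_API_VERSION to 0x040500 when
it is unset is an assumption made only to keep the example self-contained:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Assumption for this sketch: no older API requested means 4.5. */
#ifndef LIBXL_API_VERSION
#define LIBXL_API_VERSION 0x040500
#endif

/* Advertise the corrected prototype to code built against API >= 4.5. */
#if LIBXL_API_VERSION >= 0x040500
#define LIBXL_HAVE_NONCONST_LIBXL_BASENAME_RETURN_VALUE 1
#endif

/* Hypothetical stand-in for libxl_basename(): returns a malloc'ed copy
 * of the last path component, or NULL if allocation fails. */
#ifdef LIBXL_HAVE_NONCONST_LIBXL_BASENAME_RETURN_VALUE
char *demo_basename(const char *path)
#else
const char *demo_basename(const char *path)
#endif
{
    const char *slash, *base;
    char *copy;

    assert(path != NULL);
    slash = strrchr(path, '/');
    base = slash ? slash + 1 : path;
    copy = malloc(strlen(base) + 1);
    if (copy)
        strcpy(copy, base);
    return copy;
}

/* Caller side: the cast is only needed when the flag is absent. */
int demo_basename_matches(const char *path, const char *expected)
{
#ifdef LIBXL_HAVE_NONCONST_LIBXL_BASENAME_RETURN_VALUE
    char *name = demo_basename(path);
#else
    char *name = (char *)demo_basename(path); /* drop the bogus const */
#endif
    int same = name && strcmp(name, expected) == 0;

    free(name);
    return same;
}
```

With the flag tied to the newer API version, a caller that must build
against both old and new headers can key its cast on the flag's presence
rather than on a version number.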


Re: [Xen-devel] [v4] libxc: Expose the 1GB pages cpuid flag to guest

2014-12-03 Thread Ian Campbell

On Tue, 2014-12-02 at 16:09 -0500, Konrad Rzeszutek Wilk wrote:
> On Fri, Nov 28, 2014 at 11:50:43AM +, Ian Campbell wrote:
> > On Fri, 2014-11-28 at 18:52 +0800, Liang Li wrote:
> > > If hardware support the 1GB pages, expose the feature to guest by
> > > > default. Users don't have to use a 'cpuid= ' option in config file
> > > > to turn it on.
> > > 
> > > If guest use shadow mode, the 1GB pages feature will be hidden from
> > > guest, this is done in the function hvm_cpuid(). So the change is
> > > okay for shadow mode case.
> > > 
> > > Signed-off-by: Liang Li 
> > > Signed-off-by: Yang Zhang 
> > 
> > FTR although this is strictly speaking a toolstack patch I think the
> > main ack required should be from the x86 hypervisor guys...
> 
> Jan acked it.

For 4.5?

Have you release acked it?

This seemed like 4.6 material to me, or at least I've not seen any
mention/argument to the contrary.

Ian.

> > 
> > > ---
> > >  tools/libxc/xc_cpuid_x86.c | 3 +++
> > >  1 file changed, 3 insertions(+)
> > > 
> > > diff --git a/tools/libxc/xc_cpuid_x86.c b/tools/libxc/xc_cpuid_x86.c
> > > index a18b1ff..c97f91a 100644
> > > --- a/tools/libxc/xc_cpuid_x86.c
> > > +++ b/tools/libxc/xc_cpuid_x86.c
> > > @@ -109,6 +109,7 @@ static void amd_xc_cpuid_policy(
> > > >  regs[3] &= (0x0183f3ff | /* features shared with 0x0001:EDX */
> > >  bitmaskof(X86_FEATURE_NX) |
> > >  bitmaskof(X86_FEATURE_LM) |
> > > +bitmaskof(X86_FEATURE_PAGE1GB) |
> > >  bitmaskof(X86_FEATURE_SYSCALL) |
> > >  bitmaskof(X86_FEATURE_MP) |
> > >  bitmaskof(X86_FEATURE_MMXEXT) |
> > > @@ -192,6 +193,7 @@ static void intel_xc_cpuid_policy(
> > >  bitmaskof(X86_FEATURE_ABM));
> > >  regs[3] &= (bitmaskof(X86_FEATURE_NX) |
> > >  bitmaskof(X86_FEATURE_LM) |
> > > +bitmaskof(X86_FEATURE_PAGE1GB) |
> > >  bitmaskof(X86_FEATURE_SYSCALL) |
> > >  bitmaskof(X86_FEATURE_RDTSCP));
> > >  break;
> > > @@ -386,6 +388,7 @@ static void xc_cpuid_hvm_policy(
> > >  clear_bit(X86_FEATURE_LM, regs[3]);
> > >  clear_bit(X86_FEATURE_NX, regs[3]);
> > >  clear_bit(X86_FEATURE_PSE36, regs[3]);
> > > +clear_bit(X86_FEATURE_PAGE1GB, regs[3]);
> > >  }
> > >  break;
> > >  
> > 
> > 
> > 
> > ___
> > Xen-devel mailing list
> > Xen-devel@lists.xen.org
> > http://lists.xen.org/xen-devel
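
To illustrate the masking in the quoted patch: the policy code ANDs the
host's EDX value with a whitelist, so adding PAGE1GB to the whitelist only
exposes the bit when the hardware already reports it. The sketch below is
not the libxc code. The demo_ names are simplified stand-ins for the libxc
macros, the feature numbers assume a word * 32 + bit encoding with CPUID
leaf 0x80000001 EDX as word 1 (NX is bit 20, Page1GB bit 26, LM bit 29),
and the strip_all flag is hypothetical, since the quoted hunk does not show
the condition under which the bits are cleared:

```c
#include <assert.h>
#include <stdint.h>

/* Assumed encoding: word * 32 + bit, word 1 = leaf 0x80000001 EDX. */
#define DEMO_FEATURE_NX      (1 * 32 + 20)
#define DEMO_FEATURE_PAGE1GB (1 * 32 + 26)
#define DEMO_FEATURE_LM      (1 * 32 + 29)

/* Simplified stand-ins for the bitmaskof()/clear_bit() helpers. */
#define demo_bitmaskof(idx)      (UINT32_C(1) << ((idx) & 31))
#define demo_clear_bit(idx, dst) ((dst) &= ~demo_bitmaskof(idx))

/* Filter the host's extended-feature EDX down to what a guest may see. */
uint32_t demo_filter_ext_edx(uint32_t host_edx, int strip_all)
{
    uint32_t edx = host_edx;

    /* Sanity check: the feature belongs to the register filtered here. */
    assert(DEMO_FEATURE_PAGE1GB / 32 == 1);

    /* Whitelist: a bit survives only if the host has it AND it is listed. */
    edx &= (demo_bitmaskof(DEMO_FEATURE_NX) |
            demo_bitmaskof(DEMO_FEATURE_LM) |
            demo_bitmaskof(DEMO_FEATURE_PAGE1GB));

    /* Hypothetical condition under which the bits are hidden again. */
    if (strip_all) {
        demo_clear_bit(DEMO_FEATURE_LM, edx);
        demo_clear_bit(DEMO_FEATURE_NX, edx);
        demo_clear_bit(DEMO_FEATURE_PAGE1GB, edx);
    }
    return edx;
}
```

Because the whitelist is applied with AND, a host without 1GB page support
still yields a guest EDX with bit 26 clear, which matches the "if hardware
supports it" wording of the commit message.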



[Xen-devel] [PATCH] INSTALL: fix typo in xendomains.service name

2014-12-03 Thread Olaf Hering

Signed-off-by: Olaf Hering 
Cc: Ian Campbell 
Cc: Ian Jackson 
---
 INSTALL | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/INSTALL b/INSTALL
index 0bc67ea..71dd0eb 100644
--- a/INSTALL
+++ b/INSTALL
@@ -284,7 +284,7 @@ systemctl enable xen-init-dom0.service
 systemctl enable xenconsoled.service
 
 Other optional services are:
-systemctl enable xen-domains.service
+systemctl enable xendomains.service
 systemctl enable xen-watchdog.service
 
 

Re: [Xen-devel] xl pci-attach silently fails the first time

2014-12-03 Thread Olaf Hering

On Tue, Dec 02, Konrad Rzeszutek Wilk wrote:

> On Tue, Dec 02, 2014 at 04:46:52PM +0100, Olaf Hering wrote:
> > On Mon, Dec 01, Konrad Rzeszutek Wilk wrote:
> > 
> > > That is odd - I see any device 'hot-plugged' being added at 00:05 and 
> > > further.
> > 
> > Does this by any chance depend on the guest?! I mean, how is the guest
> 
> I doubt it.

Something is different here, likely just the non-pvops guest kernel.

> > notified that a PCI device is gone (by unplug)? Maybe the pvops case
> > just happens to work because the unplug happens early, perhaps before
> > PCI discovery?!
> 
> ACPI hotplug. And it does work after PCI discovery.

In a pvops kernel, is the emulated but unplugged PCI hardware still listed
with lspci?

Olaf
