[PATCH 11/12] ARM: KVM: trap VM system registers until MMU and caches are ON

2014-03-03 Thread Marc Zyngier
In order to be able to detect the point where the guest enables its MMU and caches, trap all the VM related system registers. Once we see the guest enabling both the MMU and the caches, we can go back to a saner mode of operation, which is to leave these registers in complete control of the guest.
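
The trap-then-release idea described above can be sketched in a few lines of C. This is only an illustration, not the code from the patch: the `sketch_` names and the vcpu structure are hypothetical, and the bit positions (SCTLR.M = bit 0, SCTLR.C = bit 2, HCR.TVM = bit 26) are taken from the ARM architecture rather than from this message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit positions per the ARM architecture; the macro names are hypothetical. */
#define SKETCH_SCTLR_M  (UINT32_C(1) << 0)   /* stage-1 MMU enable */
#define SKETCH_SCTLR_C  (UINT32_C(1) << 2)   /* data cache enable */
#define SKETCH_HCR_TVM  (UINT32_C(1) << 26)  /* trap guest writes to VM control registers */

/* Hypothetical, minimal stand-in for per-vcpu state. */
struct sketch_vcpu {
    uint32_t hcr;    /* hypervisor configuration for this vcpu */
    uint32_t sctlr;  /* guest view of the system control register */
};

/* True only when the guest has enabled both the MMU and the data cache. */
static bool sketch_mmu_and_caches_on(uint32_t sctlr)
{
    uint32_t both = SKETCH_SCTLR_M | SKETCH_SCTLR_C;
    return (sctlr & both) == both;
}

/*
 * Emulate a trapped guest write to SCTLR. Once both bits are set,
 * clear TVM so later writes go straight to the hardware register.
 */
static void sketch_handle_sctlr_write(struct sketch_vcpu *vcpu, uint32_t val)
{
    vcpu->sctlr = val;
    if (sketch_mmu_and_caches_on(val))
        vcpu->hcr &= ~SKETCH_HCR_TVM;
}
```

In this sketch, enabling only the MMU leaves trapping in place; trapping stops at the first write that has both M and C set.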

[PATCH 10/12] ARM: KVM: add world-switch for AMAIR{0,1}

2014-03-03 Thread Marc Zyngier
HCR.TVM traps (among other things) accesses to AMAIR0 and AMAIR1. In order to minimise the amount of surprise a guest could generate by trying to access these registers with caches off, add them to the list of registers we switch/handle. Signed-off-by: Marc Zyngier Reviewed-by: Christoffer Dall

[PATCH 12/12] ARM: KVM: fix warning in mmu.c

2014-03-03 Thread Marc Zyngier
Compiling with THP enabled leads to the following warning: arch/arm/kvm/mmu.c: In function ‘unmap_range’: arch/arm/kvm/mmu.c:177:39: warning: ‘pte’ may be used uninitialized in this function [-Wmaybe-uninitialized] if (kvm_pmd_huge(*pmd) || page_empty(pte)) {

[PATCH 06/12] ARM: KVM: force cache clean on page fault when caches are off

2014-03-03 Thread Marc Zyngier
In order for a guest with caches disabled to observe data written to a given page, we need to make sure that page is committed to memory, and not just hanging in the cache (as guest accesses completely bypass the cache until the guest decides to enable it). For this purpose, hook into th

[PATCH 02/12] arm64: KVM: allows discrimination of AArch32 sysreg access

2014-03-03 Thread Marc Zyngier
The current handling of AArch32 trapping is slightly less than perfect, as it is not possible (from a handler point of view) to distinguish it from an AArch64 access, nor to tell a 32bit from a 64bit access either. Fix this by introducing two additional flags: - is_aarch32: true if the access was

[PATCH 08/12] ARM: KVM: fix ordering of 64bit coprocessor accesses

2014-03-03 Thread Marc Zyngier
Commit 240e99cbd00a (ARM: KVM: Fix 64-bit coprocessor handling) added an ordering dependency for the 64bit registers. The order described is: CRn, CRm, Op1, Op2, 64bit-first. Unfortunately, the implementation is: CRn, 64bit-first, CRm... Move the 64bit test to be last in order to match the docum
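
The documented ordering (CRn, CRm, Op1, Op2, then 64bit-first as the final tie-break) can be written as a comparator. The following is a minimal sketch under assumptions: the structure and function names are hypothetical and it is not the kernel's actual comparison routine.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical descriptor for a coprocessor register encoding. */
struct sketch_reg {
    unsigned crn, crm, op1, op2;
    bool is_64;  /* true for a 64-bit access */
};

/* Three-way compare of two unsigned fields: -1, 0 or 1. */
static int sketch_cmp_field(unsigned a, unsigned b)
{
    return (a > b) - (a < b);
}

/*
 * Order by CRn, CRm, Op1, Op2; only when all four match does the
 * 64-bit flag decide, with the 64-bit entry sorting first.
 */
static int sketch_cmp_reg(const struct sketch_reg *a, const struct sketch_reg *b)
{
    int r;

    if ((r = sketch_cmp_field(a->crn, b->crn)) != 0) return r;
    if ((r = sketch_cmp_field(a->crm, b->crm)) != 0) return r;
    if ((r = sketch_cmp_field(a->op1, b->op1)) != 0) return r;
    if ((r = sketch_cmp_field(a->op2, b->op2)) != 0) return r;
    if (a->is_64 != b->is_64)
        return a->is_64 ? -1 : 1;
    return 0;
}
```

With the 64-bit test last, a 32-bit entry with CRm = 2 sorts before a 64-bit entry with CRm = 3 under the same CRn. If the 64-bit test ran right after CRn, the 64-bit entry would jump ahead.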

[PATCH 04/12] ARM: KVM: introduce kvm_p*d_addr_end

2014-03-03 Thread Marc Zyngier
The use of p*d_addr_end with stage-2 translation is slightly dodgy, as the IPA is 40bits, while all the p*d_addr_end helpers are taking an unsigned long (arm64 is fine with that as unsigned long is 64bit). The fix is to introduce 64bit clean versions of the same helpers, and use them in the stage-
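
To illustrate what a 64bit clean helper might look like, here is a minimal sketch that computes the next table boundary with an explicit 64-bit type, so a 40-bit IPA survives on a platform where unsigned long is 32 bits wide. The `sketch_` names and the 1 GiB region size are assumptions chosen for the example, not values from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical geometry: each top-level entry covers 1 GiB. */
#define SKETCH_PGDIR_SHIFT 30
#define SKETCH_PGDIR_SIZE  (UINT64_C(1) << SKETCH_PGDIR_SHIFT)
#define SKETCH_PGDIR_MASK  (~(SKETCH_PGDIR_SIZE - 1))

/*
 * Return the next top-level boundary after addr, clamped to end.
 * Comparing "boundary - 1" with "end - 1" keeps the clamp correct
 * even if the boundary computation wraps around to zero.
 */
static inline uint64_t sketch_pgd_addr_end(uint64_t addr, uint64_t end)
{
    uint64_t boundary = (addr + SKETCH_PGDIR_SIZE) & SKETCH_PGDIR_MASK;

    return (boundary - 1 < end - 1) ? boundary : end;
}
```

An address above 4 GiB, such as 0x123456000, would lose its top bits if passed through a 32-bit unsigned long; with uint64_t the helper returns the boundary 0x140000000 as expected.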

[PATCH 07/12] ARM: KVM: fix handling of trapped 64bit coprocessor accesses

2014-03-03 Thread Marc Zyngier
Commit 240e99cbd00a (ARM: KVM: Fix 64-bit coprocessor handling) changed the way we match the 64bit coprocessor access from user space, but didn't update the trap handler for the same set of registers. The effect is that a trapped 64bit access is never matched, leading to a fault being injected int

[PATCH 01/12] arm64: KVM: force cache clean on page fault when caches are off

2014-03-03 Thread Marc Zyngier
In order for the guest with caches off to observe data written to a given page, we need to make sure that page is committed to memory, and not just hanging in the cache (as guest accesses completely bypass the cache until the guest decides to enable it). For this purpose, hook into the c

[GIT PULL] KVM/ARM for 3.15

2014-03-03 Thread Marc Zyngier
Paolo, Gleb, Please pull the following tag to get what we currently have queued for 3.15. This series fixes a number of issues we have when the guest runs with caches off. Thanks, M. The following changes since commit 1b385cbdd74aa803e966e01e5fe49490d6044e30: kvm, vmx: Really fix

[PATCH 05/12] arm64: KVM: flush VM pages before letting the guest enable caches

2014-03-03 Thread Marc Zyngier
When the guest runs with caches disabled (like in an early boot sequence, for example), all writes go directly to RAM, bypassing the caches altogether. Once the MMU and caches are enabled, whatever sits in the cache suddenly becomes visible, which isn't what the guest expects. A way to

[PATCH 09/12] ARM: KVM: introduce per-vcpu HYP Configuration Register

2014-03-03 Thread Marc Zyngier
So far, KVM/ARM used a fixed HCR configuration per guest, except for the VI/VF/VA bits to control the interrupt in absence of VGIC. With the upcoming need to dynamically reconfigure trapping, it becomes necessary to allow the HCR to be changed on a per-vcpu basis. The fix here is to mimic what KV

[PATCH 03/12] arm64: KVM: trap VM system registers until MMU and caches are ON

2014-03-03 Thread Marc Zyngier
In order to be able to detect the point where the guest enables its MMU and caches, trap all the VM related system registers. Once we see the guest enabling both the MMU and the caches, we can go back to a saner mode of operation, which is to leave these registers in complete control of the guest.

Re: [RFC v3 0/6] networking: address root block upon initialization

2014-03-03 Thread Stephen Hemminger
On Mon, 3 Mar 2014 17:05:18 -0800 "Luis R. Rodriguez" wrote: > On Mon, Mar 3, 2014 at 2:46 PM, Luis R. Rodriguez > wrote: > > From: "Luis R. Rodriguez" > > <-- snip --> > > > As I tested using the root block preference I noticed that if a net_device > > slave under the bridge gets the designa

RE: [RFC]VM live snapshot proposal

2014-03-03 Thread Huangpeng (Peter)
> Hi Paolo, > > On Mon, Mar 03, 2014 at 02:47:31PM +0100, Paolo Bonzini wrote: > > I'm not sure what's the status of the kernel infrastructure for > > post-copy. Andrea? > > sys_userfaultfd is still work in progress but it shouldn't be much work left > to > completion. madvise(MADV_USERFAUL

RE: [RFC]VM live snapshot proposal

2014-03-03 Thread Huangpeng (Peter)
> > I think this is different in the same way that block-backup and > > block-mirror are different. Huangpeng's proposal would let you make a > > consistent snapshot of disks and RAM. > > Right. Though the point isn't about consistency (doing the disk snapshot when > memory has converged would be

RE: [RFC]VM live snapshot proposal

2014-03-03 Thread Huangpeng (Peter)
> > > Here I have another proposal, based on the live-migration scheme, > > > add consistent memory state tracking and saving. > > > The idea is simple: > > > 1.First round use live-migration to save all memory to a snapshot file. > > > 2.intercept the action of memory-modify, save old pages to a

Re: [RFC v3 0/6] networking: address root block upon initialization

2014-03-03 Thread Luis R. Rodriguez
On Mon, Mar 3, 2014 at 2:46 PM, Luis R. Rodriguez wrote: > From: "Luis R. Rodriguez" <-- snip --> > As I tested using the root block preference I noticed that if a net_device > slave under the bridge gets the designated root port prior to setting in > userspace the root_block feature enabling t

RE: [RFC]VM live snapshot proposal

2014-03-03 Thread Huangpeng (Peter)
> Yes, this is the tricky part. To be honest, I think this is the reason no > one has > submitted patches - it's a hard task and the win isn't that great (you can > already migrate to file). > Yes, lots of places have to be considered. Though scenarios are limited, users like library experimen

Re: [RFC v3 4/6] bridge: enable root block during device registration

2014-03-03 Thread Luis R. Rodriguez
On Mon, Mar 3, 2014 at 4:31 PM, Stephen Hemminger wrote: > On Mon, 3 Mar 2014 15:58:50 -0800 > "Luis R. Rodriguez" wrote: > >> On Mon, Mar 3, 2014 at 3:43 PM, Stephen Hemminger >> wrote: >> > Doing this in priv flags bloats what is a limited resource (# of bits). >> >> Agreed. I tried to avoid i

Re: [RFC v3 4/6] bridge: enable root block during device registration

2014-03-03 Thread Stephen Hemminger
On Mon, 3 Mar 2014 15:58:50 -0800 "Luis R. Rodriguez" wrote: > On Mon, Mar 3, 2014 at 3:43 PM, Stephen Hemminger > wrote: > > Doing this in priv flags bloats what is a limited resource (# of bits). > > Agreed. I tried to avoid it but saw no other option for addressing > this during initializa

Re: [RFC v3 4/6] bridge: enable root block during device registration

2014-03-03 Thread Luis R. Rodriguez
On Mon, Mar 3, 2014 at 3:43 PM, Stephen Hemminger wrote: > Doing this in priv flags bloats what is a limited resource (# of bits). Agreed. I tried to avoid it but saw no other option for addressing this during initialization properly without requiring a userspace upgrade. > Plus there are issues

Re: [RFC v3 4/6] bridge: enable root block during device registration

2014-03-03 Thread Stephen Hemminger
On Mon, 3 Mar 2014 14:47:03 -0800 "Luis R. Rodriguez" wrote: > From: "Luis R. Rodriguez" > > root block support was added via 1007dd1a on v3.8 but toggling > this flag is only allowed after a device has been registered and > added to a bridge as its a bridge *port* primitive, not a *net_device

[RFC v3 1/6] bridge: preserve random init MAC address

2014-03-03 Thread Luis R. Rodriguez
From: "Luis R. Rodriguez" As it is now, if you create a bridge it gets started with a random MAC address, and if you then add a net_device as a slave but later kick it out you end up with a zero MAC address. Instead preserve the original random MAC address and use it. If you manually set the b

[RFC v3 0/6] networking: address root block upon initialization

2014-03-03 Thread Luis R. Rodriguez
From: "Luis R. Rodriguez" This is my third series on removing the xen-netback hack of using a high MAC address for a root block preference, after feedback and testing of the bridge feature Stephen mentioned. We want to remove that hack as it's possible to end up with IPv6 conflicts upon

[RFC v3 2/6] bridge: trigger a bridge calculation upon port changes

2014-03-03 Thread Luis R. Rodriguez
From: "Luis R. Rodriguez" If netlink is used to tune a port we currently don't trigger a new recalculation of the bridge id, ensure that happens just as if we're adding a new net_device onto the bridge. Cc: Stephen Hemminger Cc: bri...@lists.linux-foundation.org Cc: net...@vger.kernel.org Cc: l

[RFC v3 4/6] bridge: enable root block during device registration

2014-03-03 Thread Luis R. Rodriguez
From: "Luis R. Rodriguez" root block support was added via 1007dd1a on v3.8 but toggling this flag is only allowed after a device has been registered and added to a bridge as it's a bridge *port* primitive, not a *net_device* feature. There are workarounds possible to account for the lack of netl

[RFC v3 3/6] bridge: fix bridge root block on designated port

2014-03-03 Thread Luis R. Rodriguez
From: "Luis R. Rodriguez" Root port blocking was designed so that a bridge port can opt out of becoming the designated root port for a bridge. If a port however first becomes the designated root port and we then toggle the root port block on it we currently don't kick that port out of the designa

[RFC v3 5/6] xen-netback: use a random MAC address and force bridge root block

2014-03-03 Thread Luis R. Rodriguez
From: "Luis R. Rodriguez" The purpose of using a static MAC address of FE:FF:FF:FF:FF:FF was to prevent our backend interfaces from being used by the bridge and nominating our interface as a root port on the bridge. This was possible given that the bridge code will use the lowest MAC address for

[RFC v3 6/6] tun: add initialization root block support

2014-03-03 Thread Luis R. Rodriguez
From: "Luis R. Rodriguez" The networking bridge module allows us to specify a root block preference on net_devices but this feature is a bridge port primitive. The bridge module assumes that once a device is added as a slave to the bridge that it can be considered for the root port. Furthermor

Re: [PATCH 2/7] KVM: s390: virtio-ccw adapter interrupt support.

2014-03-03 Thread Christian Borntraeger
On 25/02/14 18:24, Cornelia Huck wrote: > Implement the new CCW_CMD_SET_IND_ADAPTER command and try to enable > adapter interrupts for every device on the first startup. If the host > does not support adapter interrupts, fall back to normal I/O interrupts. > > virtio-ccw adapter interrupts use the

Re: [RFC]VM live snapshot proposal

2014-03-03 Thread Andrea Arcangeli
Hi Paolo, On Mon, Mar 03, 2014 at 02:47:31PM +0100, Paolo Bonzini wrote: > I'm not sure what's the status of the kernel infrastructure for > post-copy. Andrea? sys_userfaultfd is still work in progress but it shouldn't be much work left to completion. madvise(MADV_USERFAULT) and remap_anon_pa

Re: macvtap performance regression (bisected) between 3.13 and 3.14-rc1

2014-03-03 Thread Vlad Yasevich
On 03/03/2014 04:13 AM, Christian Borntraeger wrote: > On 02/03/14 02:21, Vlad Yasevich wrote: >> On 03/01/2014 02:27 PM, Vlad Yasevich wrote: >>> On 03/01/2014 06:15 AM, Christian Borntraeger wrote: On 28/02/14 23:14, Vlad Yasevich wrote: > On 02/27/2014 03:52 PM, Christian Borntraeger wr

Re: [PATCH 3/3] drivers/vfio/pci: Fix MSIx message lost

2014-03-03 Thread Alex Williamson
On Mon, 2014-03-03 at 14:10 +0800, Gavin Shan wrote: > On Sun, Mar 02, 2014 at 09:49:49PM -0700, Alex Williamson wrote: > >On Mon, 2014-03-03 at 14:51 +1100, Benjamin Herrenschmidt wrote: > >> On Mon, 2014-03-03 at 11:24 +0800, Gavin Shan wrote: > > .../... > > >> > >> > Reported-by: Wen Xiong

Re: Enhancement for PLE handler in KVM

2014-03-03 Thread Paolo Bonzini
Il 03/03/2014 19:24, Li, Bin (Bin) ha scritto: Hello, all. The PLE handler attempts to determine an alternate vCPU to schedule. In some cases the wrong vCPU is scheduled and performance suffers. This patch allows for the guest OS to signal, using a hypercall, that it's starting/ending a critic

Enhancement for PLE handler in KVM

2014-03-03 Thread Li, Bin (Bin)
Hello, all. The PLE handler attempts to determine an alternate vCPU to schedule. In some cases the wrong vCPU is scheduled and performance suffers. This patch allows for the guest OS to signal, using a hypercall, that it's starting/ending a critical section. Using this information in the P

Re: 3.10.X kernel/jump_label kvm

2014-03-03 Thread Paolo Bonzini
Il 03/03/2014 19:17, Stefan Priebe ha scritto: Am 03.03.2014 17:36, schrieb Paolo Bonzini: Il 28/02/2014 20:47, Stefan Priebe ha scritto: Hello, i got this stack trace multiple times while using a vanilla 3.10.32 kernel and already sent it to the list in december but got no replies. Please

Re: 3.10.X kernel/jump_label kvm

2014-03-03 Thread Stefan Priebe
Am 03.03.2014 17:36, schrieb Paolo Bonzini: Il 28/02/2014 20:47, Stefan Priebe ha scritto: Hello, i got this stack trace multiple times while using a vanilla 3.10.32 kernel and already sent it to the list in december but got no replies. Please try the patch of commit 0dce7cd67fd9055c4a2ff278

Re: 3.10.X kernel/jump_label kvm

2014-03-03 Thread Paolo Bonzini
Il 28/02/2014 20:47, Stefan Priebe ha scritto: Hello, i got this stack trace multiple times while using a vanilla 3.10.32 kernel and already sent it to the list in december but got no replies. Please try the patch of commit 0dce7cd67fd9055c4a2ff278f8af1431e646d346: diff --git a/arch/x86/kvm/l

Re: [RFC PATCH] vfio-pci: avoid deadlock between unbind and VFIO_DEVICE_RESET

2014-03-03 Thread Alex Williamson
On Mon, 2014-03-03 at 12:28 -0300, Thadeu Lima de Souza Cascardo wrote: > On Mon, Mar 03, 2014 at 08:09:22AM -0700, Alex Williamson wrote: > > On Mon, 2014-03-03 at 11:33 -0300, Thadeu Lima de Souza Cascardo wrote: > > > When we unbind vfio-pci from a device, while running a guest, we might > > > h

Re: [RFC PATCH] vfio-pci: avoid deadlock between unbind and VFIO_DEVICE_RESET

2014-03-03 Thread Thadeu Lima de Souza Cascardo
On Mon, Mar 03, 2014 at 08:09:22AM -0700, Alex Williamson wrote: > On Mon, 2014-03-03 at 11:33 -0300, Thadeu Lima de Souza Cascardo wrote: > > When we unbind vfio-pci from a device, while running a guest, we might > > have a deadlock when such a guest reboots. > > > > Unbind takes device_lock at d

Re: [RFC PATCH] vfio-pci: avoid deadlock between unbind and VFIO_DEVICE_RESET

2014-03-03 Thread Alex Williamson
On Mon, 2014-03-03 at 11:33 -0300, Thadeu Lima de Souza Cascardo wrote: > When we unbind vfio-pci from a device, while running a guest, we might > have a deadlock when such a guest reboots. > > Unbind takes device_lock at device_release_driver, and waits for > release_q at vfio_del_group_dev. > >

Re: [Qemu-devel] [RFC]VM live snapshot proposal

2014-03-03 Thread Dr. David Alan Gilbert
* Paolo Bonzini (pbonz...@redhat.com) wrote: > Il 03/03/2014 14:30, Kevin Wolf ha scritto: > >> > So why don't we simply reuse the existing migration code? > >> I think this is different in the same way that block-backup and > >> block-mirror are different. Huangpeng's proposal would let you make

[RFC PATCH] vfio-pci: avoid deadlock between unbind and VFIO_DEVICE_RESET

2014-03-03 Thread Thadeu Lima de Souza Cascardo
When we unbind vfio-pci from a device, while running a guest, we might have a deadlock when such a guest reboots. Unbind takes device_lock at device_release_driver, and waits for release_q at vfio_del_group_dev. release_q will only be woken up when all references to vfio_device are gone, and that

Re: [RFC]VM live snapshot proposal

2014-03-03 Thread Kevin Wolf
Am 03.03.2014 um 14:47 hat Paolo Bonzini geschrieben: > Il 03/03/2014 14:30, Kevin Wolf ha scritto: > >> > So why don't we simply reuse the existing migration code? > >> I think this is different in the same way that block-backup and > >> block-mirror are different. Huangpeng's proposal would let

Re: [RFC]VM live snapshot proposal

2014-03-03 Thread Paolo Bonzini
Il 03/03/2014 14:30, Kevin Wolf ha scritto: > > So why don't we simply reuse the existing migration code? > I think this is different in the same way that block-backup and > block-mirror are different. Huangpeng's proposal would let you make > a consistent snapshot of disks and RAM. Right. Thoug

Re: [RFC]VM live snapshot proposal

2014-03-03 Thread Kevin Wolf
Am 03.03.2014 um 14:19 hat Paolo Bonzini geschrieben: > Il 03/03/2014 13:55, Kevin Wolf ha scritto: > > Due to memory-modifications may happen in kvm, qemu, or vhost, the > > key-part is how we > > can provide common page-modify-tracking-and-saving api, we completed a > > prot

Re: [RFC]VM live snapshot proposal

2014-03-03 Thread Paolo Bonzini
Il 03/03/2014 13:55, Kevin Wolf ha scritto: > > Due to memory-modifications may happen in kvm, qemu, or vhost, the key-part is how we > > can provide common page-modify-tracking-and-saving api, we completed a prototype by > > simply add modified-page tracking/saving function in qemu, and it see

Re: [RFC]VM live snapshot proposal

2014-03-03 Thread Paolo Bonzini
Il 03/03/2014 13:32, Stefan Hajnoczi ha scritto: If there is not enough memory to fork, then a synchronous approach to catching guest memory writes is needed. I'm not sure if a good mechanism for that exists but the simplest would be mprotect(2) and a signal handler (which will make the guest ru

Re: [RFC]VM live snapshot proposal

2014-03-03 Thread Kevin Wolf
Am 03.03.2014 um 13:32 hat Stefan Hajnoczi geschrieben: > On Mon, Mar 03, 2014 at 01:13:41AM +, Huangpeng (Peter) wrote: > > Just to summarize the idea of live savevm for people joining the > discussion: > > It should be possible to save a snapshot of the guest (including memory, > devices, a

Re: [RFC]VM live snapshot proposal

2014-03-03 Thread Stefan Hajnoczi
On Mon, Mar 03, 2014 at 01:13:41AM +, Huangpeng (Peter) wrote: Just to summarize the idea of live savevm for people joining the discussion: It should be possible to save a snapshot of the guest (including memory, devices, and disk) without noticeable downtime. The 'savevm' command pauses the

Re: [PATCH] virtio: make udp more efficient by avoiding indirect desc

2014-03-03 Thread Qin Chuanyu
On 2014/2/11 23:43, Michael S. Tsirkin wrote: On Tue, Feb 11, 2014 at 10:58:52PM +0800, Qin Chuanyu wrote: udp packet use 2 buffers at least, one for vnet_hdr and one for skb->data. Not really, we use 1 buffer now with vnet_hdr inline with data. I have found that there are related patch in Q

Re: macvtap performance regression (bisected) between 3.13 and 3.14-rc1

2014-03-03 Thread Christian Borntraeger
On 02/03/14 02:21, Vlad Yasevich wrote: > On 03/01/2014 02:27 PM, Vlad Yasevich wrote: >> On 03/01/2014 06:15 AM, Christian Borntraeger wrote: >>> On 28/02/14 23:14, Vlad Yasevich wrote: On 02/27/2014 03:52 PM, Christian Borntraeger wrote: > Vlad, > > commit 6acf54f1cf0a6747bac9fea

Re: macvtap performance regression (bisected) between 3.13 and 3.14-rc1

2014-03-03 Thread Christian Borntraeger
On 01/03/14 20:27, Vlad Yasevich wrote: > On 03/01/2014 06:15 AM, Christian Borntraeger wrote: >> On 28/02/14 23:14, Vlad Yasevich wrote: >>> On 02/27/2014 03:52 PM, Christian Borntraeger wrote: Vlad, commit 6acf54f1cf0a6747bac9fea26f34cfc5a9029523 macvtap: Add support of pa

Re: [PATCH] target-i386: bugfix of Intel MPX

2014-03-03 Thread Paolo Bonzini
Il 03/03/2014 06:24, Liu, Jinsong ha scritto: From 3a7783cd9a0556787809d3d5ecb5f2b85dd9fc02 Mon Sep 17 00:00:00 2001 From: Liu Jinsong Date: Mon, 3 Mar 2014 18:56:39 +0800 Subject: [PATCH] target-i386: bugfix of Intel MPX The correct size of cpuid 0x0d sub-leaf 4 is 0x40, not 0x10. This is conf