Re: [PATCH] kvm-vmx: add module parameter to avoid trapping HLT instructions (v2)

2010-12-03 Thread Avi Kivity
On 12/02/2010 10:51 PM, Anthony Liguori wrote: VCPU in HLT state only allows injection of certain events that would be delivered on HLT. #PF is not one of them. But you can't inject an exception into a guest while the VMCS is active, can you? No, but this is irrelevant. So the guest takes

Re: [PATCH] kvm-vmx: add module parameter to avoid trapping HLT instructions (v2)

2010-12-03 Thread Avi Kivity
On 12/02/2010 05:23 PM, Anthony Liguori wrote: On 12/02/2010 08:39 AM, lidong chen wrote: In certain use-cases, we want to allocate guests fixed time slices where idle guest cycles leave the machine idling. i could not understand why need this? can you tell more detailedly? If you run 4

Re: [PATCH] kvm-vmx: add module parameter to avoid trapping HLT instructions (v2)

2010-12-03 Thread Avi Kivity
On 12/02/2010 09:14 PM, Chris Wright wrote: Perhaps it should be a VM level option. And then invert the notion. Create one idle domain w/out hlt trap. Give that VM a vcpu per pcpu (pin in place probably). And have that VM do nothing other than hlt. Then it's always runnable according to

Re: [PATCH 2/6] KVM: SVM: Add manipulation functions for CRx intercepts

2010-12-03 Thread Roedel, Joerg
On Thu, Dec 02, 2010 at 11:43:50AM -0500, Marcelo Tosatti wrote: On Tue, Nov 30, 2010 at 06:03:57PM +0100, Joerg Roedel wrote: - control->intercept_cr_read = INTERCEPT_CR0_MASK | - INTERCEPT_CR3_MASK | -

Re: Problems on qemu-kvm unittests

2010-12-03 Thread Avi Kivity
On 12/03/2010 12:59 AM, Lucas Meneghel Rodrigues wrote: We are getting failures when executing apic.flat on our periodic upstream tests: 12/02 18:40:59 DEBUG|kvm_vm:0664| Running qemu command: /usr/local/autotest/tests/kvm/qemu -name 'vm1' -monitor

[PATCH] do not try to register kvmclock if the host does not support it.

2010-12-03 Thread Glauber Costa
With the new Async PF code, some of our initiation code was moved to kvm.c. With that, code that was being issued conditional to kvmclock successful registration (primary cpu registering), started being issued unconditionally. This patch proposes that we protect all registrations inside

[PATCH 02/12] KVM: SVM: Add clean-bit for intercetps, tsc-offset and pause filter count

2010-12-03 Thread Joerg Roedel
This patch adds the clean-bit for intercepts-vectors, the TSC offset and the pause-filter count to the appropriate places. The IO and MSR permission bitmaps are not subject to this bit. Signed-off-by: Joerg Roedel joerg.roe...@amd.com --- arch/x86/kvm/svm.c |7 +++ 1 files changed, 7

[PATCH 05/12] KVM: SVM: Add clean-bit for interrupt state

2010-12-03 Thread Joerg Roedel
This patch implements the clean-bit for all interrupt related state in the vmcb. This corresponds to vmcb offset 0x60-0x67. Signed-off-by: Joerg Roedel joerg.roe...@amd.com --- arch/x86/kvm/svm.c |8 +++- 1 files changed, 7 insertions(+), 1 deletions(-) diff --git a/arch/x86/kvm/svm.c

[PATCH 04/12] KVM: SVM: Add clean-bit for the ASID

2010-12-03 Thread Joerg Roedel
This patch implements the clean-bit for the asid in the vmcb. Signed-off-by: Joerg Roedel joerg.roe...@amd.com --- arch/x86/kvm/svm.c |3 +++ 1 files changed, 3 insertions(+), 0 deletions(-) diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c index 3ab42be..8675048 100644 ---

[PATCH 03/12] KVM: SVM: Add clean-bit for IOPM_BASE and MSRPM_BASE

2010-12-03 Thread Joerg Roedel
This patch adds the clean bit for the physical addresses of the MSRPM and the IOPM. It does not need to be set in the code because the only place where these values are changed is the nested-svm vmrun and vmexit path. These functions already mark the complete VMCB as dirty. Signed-off-by: Joerg

[PATCH 12/12] KVM: SVM: Add clean-bit for LBR state

2010-12-03 Thread Joerg Roedel
This patch implements the clean-bit for all LBR related state. This includes the debugctl, br_from, br_to, last_excp_from, and last_excp_to msrs. Signed-off-by: Joerg Roedel joerg.roe...@amd.com --- arch/x86/kvm/svm.c |2 ++ 1 files changed, 2 insertions(+), 0 deletions(-) diff --git

[PATCH 11/12] KVM: SVM: Add clean-bit for CR2 register

2010-12-03 Thread Joerg Roedel
This patch implements the clean-bit for the cr2 register in the vmcb. Signed-off-by: Joerg Roedel joerg.roe...@amd.com --- arch/x86/kvm/svm.c |5 +++-- 1 files changed, 3 insertions(+), 2 deletions(-) diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c index 4c366fe..7643f83 100644 ---

[PATCH 07/12] KVM: SVM: Add clean-bit for control registers

2010-12-03 Thread Joerg Roedel
This patch implements the CRx clean-bit for the vmcb. This bit covers cr0, cr3, cr4, and efer. Signed-off-by: Joerg Roedel joerg.roe...@amd.com --- arch/x86/kvm/svm.c |7 +++ 1 files changed, 7 insertions(+), 0 deletions(-) diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c index

[PATCH 10/12] KVM: SVM: Add clean-bit for Segements and CPL

2010-12-03 Thread Joerg Roedel
This patch implements the clean-bit defined for the cs, ds, ss, and es segments and the current cpl saved in the vmcb. Signed-off-by: Joerg Roedel joerg.roe...@amd.com --- arch/x86/kvm/svm.c |2 ++ 1 files changed, 2 insertions(+), 0 deletions(-) diff --git a/arch/x86/kvm/svm.c

[PATCH 06/12] KVM: SVM: Add clean-bit for NPT state

2010-12-03 Thread Joerg Roedel
This patch implements the clean-bit for all nested paging related state in the vmcb. Signed-off-by: Joerg Roedel joerg.roe...@amd.com --- arch/x86/kvm/svm.c |3 +++ 1 files changed, 3 insertions(+), 0 deletions(-) diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c index b81d31a..3b5d894

[PATCH 09/12] KVM: SVM: Add clean-bit for GDT and IDT

2010-12-03 Thread Joerg Roedel
This patch implements the clean-bit for the base and limit of the gdt and idt in the vmcb. Signed-off-by: Joerg Roedel joerg.roe...@amd.com --- arch/x86/kvm/svm.c |3 +++ 1 files changed, 3 insertions(+), 0 deletions(-) diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c index

[PATCH 08/12] KVM: SVM: Add clean-bit for DR6 and DR7

2010-12-03 Thread Joerg Roedel
This patch implements the clean-bit for the dr6 and dr7 debug registers in the vmcb. Signed-off-by: Joerg Roedel joerg.roe...@amd.com --- arch/x86/kvm/svm.c |4 1 files changed, 4 insertions(+), 0 deletions(-) diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c index 1b35969..0517505

[PATCH 0/12] KVM: SVM: Add support for VMCB state caching

2010-12-03 Thread Joerg Roedel
Hi Avi, Hi Marcelo, here is a patch-set which adds support for VMCB state caching to KVM. This is a new CPU feature where software can mark certain parts of the VMCB as unchanged since the last vmexit and the hardware can then avoid reloading these parts from memory. The feature is implemented

[PATCH 01/12] KVM: SVM: Add clean-bits infrastructure code

2010-12-03 Thread Joerg Roedel
This patch adds the infrastructure for the implementation of the individual clean-bits. Signed-off-by: Joerg Roedel joerg.roe...@amd.com --- arch/x86/include/asm/svm.h |6 +- arch/x86/kvm/svm.c | 31 +++ 2 files changed, 36 insertions(+), 1

[PATCH 1/2] make kvmclock value idempotent for stopped machine

2010-12-03 Thread Glauber Costa
Although we never made such commitment clear (well, to the best of my knowledge), some people expect that two savevm issued in sequence in a stopped machine will yield the same results. This is not a crazy requirement, since we don't expect a stopped machine to be updating its state, for any

[PATCH 0/2] Fix savevm odness related to kvmclock

2010-12-03 Thread Glauber Costa
Some users told me that the savevm path is behaving oddly wrt kvmclock. The first oddness is that a guarantee we never made (AFAIK) is being broken: two consecutive savevm operations, with the machine stopped in between, produce different results, due to the call to the KVM_GET_CLOCK ioctl. I believe the

[PATCH 2/2] Do not register kvmclock savevm section if kvmclock is disabled.

2010-12-03 Thread Glauber Costa
Usually nobody thinks about that scenario (me included, especially), but kvmclock can actually be disabled in the host. It happens in two scenarios: 1. host too old. 2. we passed -kvmclock to our -cpu parameter. In both cases, we should not register the kvmclock savevm section. This

Re: [PATCH] kvm-vmx: add module parameter to avoid trapping HLT instructions (v2)

2010-12-03 Thread Srivatsa Vaddagiri
On Fri, Dec 03, 2010 at 11:38:33AM +0200, Avi Kivity wrote: What if one of the guests crashes qemu or invokes a powerdown? Suddenly the others get 33% each (with 1% going to my secret round-up account). Doesn't seem like a reliable way to limit cpu. Some monitoring tool will need to catch that

Re: [PATCH] kvm-vmx: add module parameter to avoid trapping HLT instructions (v2)

2010-12-03 Thread Srivatsa Vaddagiri
On Fri, Dec 03, 2010 at 11:40:27AM +0200, Avi Kivity wrote: On 12/02/2010 09:14 PM, Chris Wright wrote: Perhaps it should be a VM level option. And then invert the notion. Create one idle domain w/out hlt trap. Give that VM a vcpu per pcpu (pin in place probably). And have that VM do

Re: [PATCH] kvm-vmx: add module parameter to avoid trapping HLT instructions (v2)

2010-12-03 Thread Srivatsa Vaddagiri
On Thu, Dec 02, 2010 at 11:14:16AM -0800, Chris Wright wrote: Perhaps it should be a VM level option. And then invert the notion. Create one idle domain w/out hlt trap. Give that VM a vcpu per pcpu (pin in place probably). And have that VM do nothing other than hlt. Then it's always

Re: [PATCH 01/12] KVM: SVM: Add clean-bits infrastructure code

2010-12-03 Thread Roedel, Joerg
On Fri, Dec 03, 2010 at 05:45:48AM -0500, Joerg Roedel wrote: +enum { + VMCB_CLEAN_MAX, +}; This is a left-over from an earlier version. I forgot to remove it. Here is an updated patch. Sorry. From 7e3f4f175561429d0054daac94763e67d12424ba Mon Sep 17 00:00:00 2001 From: Joerg Roedel

Re: [RFC PATCH 1/3] kvm: keep track of which task is running a KVM vcpu

2010-12-03 Thread Srivatsa Vaddagiri
On Thu, Dec 02, 2010 at 02:43:24PM -0500, Rik van Riel wrote: mutex_lock(&vcpu->mutex); + vcpu->task = current; Shouldn't we grab reference to current task_struct before storing a pointer to it? - vatsa

Re: [PATCH] kvm-vmx: add module parameter to avoid trapping HLT instructions (v2)

2010-12-03 Thread Gleb Natapov
On Thu, Dec 02, 2010 at 02:51:51PM -0600, Anthony Liguori wrote: On 12/02/2010 02:12 PM, Marcelo Tosatti wrote: opt = CPU_BASED_TPR_SHADOW | CPU_BASED_USE_MSR_BITMAPS | CPU_BASED_ACTIVATE_SECONDARY_CONTROLS; -- 1.7.0.4 Breaks async PF (see checks

[PATCH] Correcting timeout interruption of virtio_console test.

2010-12-03 Thread Jiří Župka
Catch new exception from kvm_subprocess to avoid killing of tests. --- client/tests/kvm/tests/virtio_console.py | 10 +++--- 1 files changed, 7 insertions(+), 3 deletions(-) diff --git a/client/tests/kvm/tests/virtio_console.py b/client/tests/kvm/tests/virtio_console.py index

[KVM-Autotest][PATCH][virtio-console] Correcting timeout interruption of virtio_console test.

2010-12-03 Thread Jiří Župka
Catch new exception from kvm_subprocess to avoid killing of tests. --- client/tests/kvm/tests/virtio_console.py | 10 +++--- 1 files changed, 7 insertions(+), 3 deletions(-) diff --git a/client/tests/kvm/tests/virtio_console.py b/client/tests/kvm/tests/virtio_console.py index

Re: [RFC PATCH 2/3] sched: add yield_to function

2010-12-03 Thread Peter Zijlstra
On Thu, 2010-12-02 at 14:44 -0500, Rik van Riel wrote: unsigned long clone_flags); + +#ifdef CONFIG_SCHED_HRTICK +extern u64 slice_remain(struct task_struct *); +extern void yield_to(struct task_struct *); +#else +static inline void yield_to(struct task_struct

Re: [RFC PATCH 2/3] sched: add yield_to function

2010-12-03 Thread Srivatsa Vaddagiri
On Fri, Dec 03, 2010 at 02:23:39PM +0100, Peter Zijlstra wrote: Right, so another approach might be to simply swap the vruntime between curr and p. Can't that cause others to starve? For ex: consider a cpu p0 having these tasks: p0 - A0 B0 A1 A0/A1 have entered some sort of AB-BA

Re: [RFC PATCH 2/3] sched: add yield_to function

2010-12-03 Thread Srivatsa Vaddagiri
On Fri, Dec 03, 2010 at 06:54:16AM +0100, Mike Galbraith wrote: +void yield_to(struct task_struct *p) +{ + unsigned long flags; + struct sched_entity *se = &p->se; + struct rq *rq; + struct cfs_rq *cfs_rq; + u64 remain = slice_remain(current); That slice remaining only

Re: [RFC PATCH 2/3] sched: add yield_to function

2010-12-03 Thread Peter Zijlstra
On Fri, 2010-12-03 at 19:00 +0530, Srivatsa Vaddagiri wrote: On Fri, Dec 03, 2010 at 02:23:39PM +0100, Peter Zijlstra wrote: Right, so another approach might be to simply swap the vruntime between curr and p. Can't that cause others to starve? For ex: consider a cpu p0 having these tasks:

Re: [RFC PATCH 2/3] sched: add yield_to function

2010-12-03 Thread Srivatsa Vaddagiri
On Fri, Dec 03, 2010 at 03:03:30PM +0100, Peter Zijlstra wrote: No, because they do receive service (they spend some time spinning before being interrupted), so the respective vruntimes will increase, at some point they'll pass B0 and it'll get scheduled. Is that sufficient to ensure that B0

Re: [RFC PATCH 2/3] sched: add yield_to function

2010-12-03 Thread Srivatsa Vaddagiri
On Fri, Dec 03, 2010 at 07:36:07PM +0530, Srivatsa Vaddagiri wrote: On Fri, Dec 03, 2010 at 03:03:30PM +0100, Peter Zijlstra wrote: No, because they do receive service (they spend some time spinning before being interrupted), so the respective vruntimes will increase, at some point they'll

Re: [RFC PATCH 1/3] kvm: keep track of which task is running a KVM vcpu

2010-12-03 Thread Rik van Riel
On 12/03/2010 07:17 AM, Srivatsa Vaddagiri wrote: On Thu, Dec 02, 2010 at 02:43:24PM -0500, Rik van Riel wrote: mutex_lock(&vcpu->mutex); + vcpu->task = current; Shouldn't we grab reference to current task_struct before storing a pointer to it? That should not be needed, since

Re: [PATCH] kvm-vmx: add module parameter to avoid trapping HLT instructions (v2)

2010-12-03 Thread Anthony Liguori
On 12/02/2010 09:44 PM, Chris Wright wrote: Yes. There's definitely a use-case to have a hard cap. OK, good, just wanted to be clear. Because this started as a discussion of hard caps, and it began to sound as if you were no longer advocating for them. But I think another common

[PATCH 1/3] KVM: SVM: Remove flush_guest_tlb function

2010-12-03 Thread Joerg Roedel
This function is unused and there is svm_flush_tlb which does the same. So this function can be removed. Signed-off-by: Joerg Roedel joerg.roe...@amd.com --- arch/x86/kvm/svm.c |5 - 1 files changed, 0 insertions(+), 5 deletions(-) diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c

[PATCH 0/3] KVM: SVM: Add support Flush-By-ASID feature

2010-12-03 Thread Joerg Roedel
Hi Avi, Hi Marcelo, here is the patch-set to add support for the Flush-By-ASID feature to KVM on AMD. Patches 1 and 2 clean up the code a little bit and patch 3 implements the feature itself. Regards, Joerg arch/x86/include/asm/svm.h |2 ++ arch/x86/kvm/svm.c | 32

[PATCH 3/3] KVM: SVM: Implement Flush-By-Asid feature

2010-12-03 Thread Joerg Roedel
This patch adds the new flush-by-asid of upcoming AMD processors to the KVM-AMD module. Signed-off-by: Joerg Roedel joerg.roe...@amd.com --- arch/x86/include/asm/svm.h |2 ++ arch/x86/kvm/svm.c | 10 -- 2 files changed, 10 insertions(+), 2 deletions(-) diff --git

[PATCH 2/3] KVM: SVM: Use svm_flush_tlb instead of force_new_asid

2010-12-03 Thread Joerg Roedel
This patch replaces all calls to force_new_asid which are intended to flush the guest-tlb by the more appropriate function svm_flush_tlb. As a side-effect the force_new_asid function is removed. Signed-off-by: Joerg Roedel joerg.roe...@amd.com --- arch/x86/kvm/svm.c | 19 +++ 1

Re: [RFC PATCH 2/3] sched: add yield_to function

2010-12-03 Thread Mike Galbraith
On Fri, 2010-12-03 at 19:16 +0530, Srivatsa Vaddagiri wrote: On Fri, Dec 03, 2010 at 06:54:16AM +0100, Mike Galbraith wrote: +void yield_to(struct task_struct *p) +{ + unsigned long flags; + struct sched_entity *se = &p->se; + struct rq *rq; + struct cfs_rq *cfs_rq; + u64

Re: [RFC PATCH 2/3] sched: add yield_to function

2010-12-03 Thread Rik van Riel
On 12/03/2010 09:45 AM, Mike Galbraith wrote: I'll have to go back and re-read that. Off the top of my head, I see no way it could matter which container the numbers live in as long as they keep advancing, and stay in the same runqueue. (hm, task weights would have to be the same too or

Re: [RFC PATCH 1/3] kvm: keep track of which task is running a KVM vcpu

2010-12-03 Thread Rik van Riel
On 12/02/2010 08:18 PM, Chris Wright wrote: * Rik van Riel (r...@redhat.com) wrote: Keep track of which task is running a KVM vcpu. This helps us figure out later what task to wake up if we want to boost a vcpu that got preempted. Unfortunately there are no guarantees that the same task

Re: [RFC PATCH 2/3] sched: add yield_to function

2010-12-03 Thread Mike Galbraith
On Fri, 2010-12-03 at 09:48 -0500, Rik van Riel wrote: On 12/03/2010 09:45 AM, Mike Galbraith wrote: I'll have to go back and re-read that. Off the top of my head, I see no way it could matter which container the numbers live in as long as they keep advancing, and stay in the same

Re: [RFC PATCH 2/3] sched: add yield_to function

2010-12-03 Thread Rik van Riel
On 12/03/2010 10:09 AM, Mike Galbraith wrote: On Fri, 2010-12-03 at 09:48 -0500, Rik van Riel wrote: On 12/03/2010 09:45 AM, Mike Galbraith wrote: I'll have to go back and re-read that. Off the top of my head, I see no way it could matter which container the numbers live in as long as they

Re: [stable] [PATCH 3/3][STABLE] KVM: add schedule check to napi_enable call

2010-12-03 Thread Peter Lieven
On 04.06.2010 at 02:02, Bruce Rogers wrote: On 6/3/2010 at 04:51 PM, Greg KH g...@kroah.com wrote: On Thu, Jun 03, 2010 at 04:17:34PM -0600, Bruce Rogers wrote: On 6/3/2010 at 03:03 PM, Greg KH g...@kroah.com wrote: On Thu, Jun 03, 2010 at 01:38:31PM -0600, Bruce Rogers wrote:

Re: [RFC PATCH 1/3] kvm: keep track of which task is running a KVM vcpu

2010-12-03 Thread Chris Wright
* Rik van Riel (r...@redhat.com) wrote: On 12/02/2010 08:18 PM, Chris Wright wrote: * Rik van Riel (r...@redhat.com) wrote: Keep track of which task is running a KVM vcpu. This helps us figure out later what task to wake up if we want to boost a vcpu that got preempted. Unfortunately

Re: [RFC PATCH 2/3] sched: add yield_to function

2010-12-03 Thread Srivatsa Vaddagiri
On Fri, Dec 03, 2010 at 10:35:25AM -0500, Rik van Riel wrote: Do you have suggestions on what I should do to make this yield_to functionality work? Keeping in mind the complications of yield_to, I had suggested we do something suggested below: http://marc.info/?l=kvm&m=129122645006996&w=2

Re: [PATCH] kvm-vmx: add module parameter to avoid trapping HLT instructions (v2)

2010-12-03 Thread Srivatsa Vaddagiri
On Fri, Dec 03, 2010 at 05:27:52PM +0530, Srivatsa Vaddagiri wrote: On Thu, Dec 02, 2010 at 11:14:16AM -0800, Chris Wright wrote: Perhaps it should be a VM level option. And then invert the notion. Create one idle domain w/out hlt trap. Give that VM a vcpu per pcpu (pin in place

[PATCH] KVM: SVM: Add xsetbv intercept

2010-12-03 Thread Joerg Roedel
This patch implements the xsetbv intercept in the AMD part of KVM. This makes AVX usable in a safe way for the guest on AVX capable AMD hardware. The patch is tested by using AVX in the guest and host in parallel and checking for data corruption. I also used the KVM xsave unit-tests and they all

Re: [RFC PATCH 2/3] sched: add yield_to function

2010-12-03 Thread Rik van Riel
On 12/03/2010 11:20 AM, Srivatsa Vaddagiri wrote: On Fri, Dec 03, 2010 at 10:35:25AM -0500, Rik van Riel wrote: Do you have suggestions on what I should do to make this yield_to functionality work? Keeping in mind the complications of yield_to, I had suggested we do something suggested below:

Re: [PATCH] kvm-vmx: add module parameter to avoid trapping HLT instructions (v2)

2010-12-03 Thread Chris Wright
* Srivatsa Vaddagiri (va...@linux.vnet.ibm.com) wrote: On Thu, Dec 02, 2010 at 11:14:16AM -0800, Chris Wright wrote: Perhaps it should be a VM level option. And then invert the notion. Create one idle domain w/out hlt trap. Give that VM a vcpu per pcpu (pin in place probably). And have

Re: [PATCH] kvm-vmx: add module parameter to avoid trapping HLT instructions (v2)

2010-12-03 Thread Chris Wright
* Srivatsa Vaddagiri (va...@linux.vnet.ibm.com) wrote: On Fri, Dec 03, 2010 at 05:27:52PM +0530, Srivatsa Vaddagiri wrote: On Thu, Dec 02, 2010 at 11:14:16AM -0800, Chris Wright wrote: Perhaps it should be a VM level option. And then invert the notion. Create one idle domain w/out hlt

Re: [RFC PATCH 2/3] sched: add yield_to function

2010-12-03 Thread Srivatsa Vaddagiri
On Fri, Dec 03, 2010 at 12:09:01PM -0500, Rik van Riel wrote: I don't see how that is going to help get the lock released, when the VCPU holding the lock is on another CPU. Even the directed yield() is not guaranteed to get the lock released, given it's shooting in the dark? Anyway, the

Re: [RFC PATCH 2/3] sched: add yield_to function

2010-12-03 Thread Rik van Riel
On 12/03/2010 12:29 PM, Srivatsa Vaddagiri wrote: On Fri, Dec 03, 2010 at 12:09:01PM -0500, Rik van Riel wrote: I don't see how that is going to help get the lock released, when the VCPU holding the lock is on another CPU. Even the directed yield() is not guaranteed to get the lock released,

Re: [PATCH] kvm-vmx: add module parameter to avoid trapping HLT instructions (v2)

2010-12-03 Thread Srivatsa Vaddagiri
On Fri, Dec 03, 2010 at 09:29:06AM -0800, Chris Wright wrote: That's what Marcelo's suggestion does w/out a fill thread. Are we willing to add that to KVM sources? I was working under the constraints of not modifying the kernel (especially avoid adding short term hacks that become unnecessary

Re: [PATCH] kvm-vmx: add module parameter to avoid trapping HLT instructions (v2)

2010-12-03 Thread Srivatsa Vaddagiri
On Fri, Dec 03, 2010 at 09:28:25AM -0800, Chris Wright wrote: * Srivatsa Vaddagiri (va...@linux.vnet.ibm.com) wrote: On Thu, Dec 02, 2010 at 11:14:16AM -0800, Chris Wright wrote: Perhaps it should be a VM level option. And then invert the notion. Create one idle domain w/out hlt trap.

Re: [PATCH] kvm-vmx: add module parameter to avoid trapping HLT instructions (v2)

2010-12-03 Thread Chris Wright
* Srivatsa Vaddagiri (va...@linux.vnet.ibm.com) wrote: On Fri, Dec 03, 2010 at 09:28:25AM -0800, Chris Wright wrote: * Srivatsa Vaddagiri (va...@linux.vnet.ibm.com) wrote: On Thu, Dec 02, 2010 at 11:14:16AM -0800, Chris Wright wrote: Perhaps it should be a VM level option. And then

Re: [PATCH] kvm-vmx: add module parameter to avoid trapping HLT instructions (v2)

2010-12-03 Thread Srivatsa Vaddagiri
On Fri, Dec 03, 2010 at 09:38:05AM -0800, Chris Wright wrote: All guests are of equal priority in this case (that's how we are able to divide time into 25% chunks), so unless we dynamically boost D's priority based on how idle other VMs are, it's not going to be easy! Right, I think

Re: [RFC PATCH 2/3] sched: add yield_to function

2010-12-03 Thread Srivatsa Vaddagiri
On Fri, Dec 03, 2010 at 12:33:29PM -0500, Rik van Riel wrote: Anyway, the intention of yield() proposed was not to get lock released immediately (which will happen eventually), but rather to avoid inefficiency associated with (long) spinning and at the same time make sure we are not leaking

Re: [PATCH] kvm-vmx: add module parameter to avoid trapping HLT instructions (v2)

2010-12-03 Thread Anthony Liguori
On 12/03/2010 11:38 AM, Chris Wright wrote: * Srivatsa Vaddagiri (va...@linux.vnet.ibm.com) wrote: On Fri, Dec 03, 2010 at 09:28:25AM -0800, Chris Wright wrote: * Srivatsa Vaddagiri (va...@linux.vnet.ibm.com) wrote: On Thu, Dec 02, 2010 at 11:14:16AM -0800, Chris Wright

Re: [PATCH] kvm-vmx: add module parameter to avoid trapping HLT instructions (v2)

2010-12-03 Thread Srivatsa Vaddagiri
On Fri, Dec 03, 2010 at 09:29:06AM -0800, Chris Wright wrote: That's what Marcelo's suggestion does w/out a fill thread. There's one complication though even with that. How do we compute the real utilization of VM (given that it will appear to be burning 100% cycles)? We need to have scheduler

Re: [PATCH] kvm-vmx: add module parameter to avoid trapping HLT instructions (v2)

2010-12-03 Thread Chris Wright
* Srivatsa Vaddagiri (va...@linux.vnet.ibm.com) wrote: On Fri, Dec 03, 2010 at 09:29:06AM -0800, Chris Wright wrote: That's what Marcelo's suggestion does w/out a fill thread. There's one complication though even with that. How do we compute the real utilization of VM (given that it will

Re: [PATCH] kvm-vmx: add module parameter to avoid trapping HLT instructions (v2)

2010-12-03 Thread Anthony Liguori
On 12/03/2010 11:58 AM, Chris Wright wrote: * Srivatsa Vaddagiri (va...@linux.vnet.ibm.com) wrote: On Fri, Dec 03, 2010 at 09:29:06AM -0800, Chris Wright wrote: That's what Marcelo's suggestion does w/out a fill thread. There's one complication though even with that. How do

Re: [PATCH] kvm-vmx: add module parameter to avoid trapping HLT instructions (v2)

2010-12-03 Thread Srivatsa Vaddagiri
On Fri, Dec 03, 2010 at 12:07:15PM -0600, Anthony Liguori wrote: My first reaction is that it's not terribly important to account the non-idle time in the guest because of the use-case for this model. Agreed ...but I was considering the larger user-base who may be surprised to see their VMs

Re: [PATCH] kvm-vmx: add module parameter to avoid trapping HLT instructions (v2)

2010-12-03 Thread Marcelo Tosatti
On Fri, Dec 03, 2010 at 09:58:54AM -0800, Chris Wright wrote: * Srivatsa Vaddagiri (va...@linux.vnet.ibm.com) wrote: On Fri, Dec 03, 2010 at 09:29:06AM -0800, Chris Wright wrote: That's what Marcelo's suggestion does w/out a fill thread. There's one complication though even with that.

Re: [PATCH] kvm-vmx: add module parameter to avoid trapping HLT instructions (v2)

2010-12-03 Thread Chris Wright
* Anthony Liguori (anth...@codemonkey.ws) wrote: On 12/03/2010 11:58 AM, Chris Wright wrote: * Srivatsa Vaddagiri (va...@linux.vnet.ibm.com) wrote: On Fri, Dec 03, 2010 at 09:29:06AM -0800, Chris Wright wrote: That's what Marcelo's suggestion does w/out a fill thread. There's one complication

Re: [PATCH] kvm-vmx: add module parameter to avoid trapping HLT instructions (v2)

2010-12-03 Thread Marcelo Tosatti
On Fri, Dec 03, 2010 at 04:10:43PM -0200, Marcelo Tosatti wrote: On Fri, Dec 03, 2010 at 09:58:54AM -0800, Chris Wright wrote: * Srivatsa Vaddagiri (va...@linux.vnet.ibm.com) wrote: On Fri, Dec 03, 2010 at 09:29:06AM -0800, Chris Wright wrote: That's what Marcelo's suggestion does w/out

Re: [RFC PATCH 2/3] sched: add yield_to function

2010-12-03 Thread Rik van Riel
On 12/02/2010 07:50 PM, Chris Wright wrote: +void requeue_task(struct rq *rq, struct task_struct *p) +{ + assert_spin_locked(&rq->lock); + + if (!p->se.on_rq || task_running(rq, p) || task_has_rt_policy(p)) + return; already checked task_running(rq, p) ||

Re: [PATCH] kvm-vmx: add module parameter to avoid trapping HLT instructions (v2)

2010-12-03 Thread Anthony Liguori
On 12/03/2010 12:20 PM, Chris Wright wrote: * Anthony Liguori (anth...@codemonkey.ws) wrote: On 12/03/2010 11:58 AM, Chris Wright wrote: * Srivatsa Vaddagiri (va...@linux.vnet.ibm.com) wrote: On Fri, Dec 03, 2010 at 09:29:06AM -0800, Chris Wright wrote: That's

Re: [RFC PATCH 2/3] sched: add yield_to function

2010-12-03 Thread Chris Wright
* Rik van Riel (r...@redhat.com) wrote: On 12/02/2010 07:50 PM, Chris Wright wrote: +/* + * Yield the CPU, giving the remainder of our time slice to task p. + * Typically used to hand CPU time to another thread inside the same + * process, eg. when p holds a resource other threads are waiting

[PATCH 0/5] Extra capabilities for device assignment

2010-12-03 Thread Alex Williamson
Now that we've got PCI capabilities cleaned up and device assignment using them, we can add more capabilities to be guest visible. This adds minimal PCI Express, PCI-X, and Power Management, along with direct passthrough Vital Product Data and Vendor Specific capabilities. With this, devices like

[PATCH 1/5] device-assignment: Fix off-by-one in header check

2010-12-03 Thread Alex Williamson
Include the first byte at 40h or else access might go to the hardware instead of the emulated config space, resulting in capability loops, since the ordering is different. Signed-off-by: Alex Williamson alex.william...@redhat.com --- hw/device-assignment.c |4 ++-- 1 files changed, 2

[PATCH 2/5] pci: MSI-X capability is 12 bytes, not 16, MSI is 10 bytes

2010-12-03 Thread Alex Williamson
Signed-off-by: Alex Williamson alex.william...@redhat.com --- hw/pci.h |4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) diff --git a/hw/pci.h b/hw/pci.h index 34955d8..7c52637 100644 --- a/hw/pci.h +++ b/hw/pci.h @@ -124,8 +124,8 @@ enum { #define

[PATCH 3/5] pci: Error on PCI capability collisions

2010-12-03 Thread Alex Williamson
Nothing good can happen when we overlap capabilities Signed-off-by: Alex Williamson alex.william...@redhat.com --- hw/pci.c | 14 ++ 1 files changed, 14 insertions(+), 0 deletions(-) diff --git a/hw/pci.c b/hw/pci.c index b08113d..288d6fd 100644 --- a/hw/pci.c +++ b/hw/pci.c @@

[PATCH 4/5] device-assignment: Error checking when adding capabilities

2010-12-03 Thread Alex Williamson
Signed-off-by: Alex Williamson alex.william...@redhat.com --- hw/device-assignment.c | 14 +- 1 files changed, 9 insertions(+), 5 deletions(-) diff --git a/hw/device-assignment.c b/hw/device-assignment.c index 6d6e657..838bf89 100644 --- a/hw/device-assignment.c +++

[PATCH 5/5] device-assignment: pass through and stub more PCI caps

2010-12-03 Thread Alex Williamson
Some drivers depend on finding capabilities like power management, PCI express/X, vital product data, or vendor specific fields. Now that we have better capability support, we can pass more of these tables through to the guest. Note that VPD and VNDR are direct pass through capabilities, the rest

Re: [PATCH 2/5] pci: MSI-X capability is 12 bytes, not 16, MSI is 10 bytes

2010-12-03 Thread Chris Wright
* Alex Williamson (alex.william...@redhat.com) wrote: Signed-off-by: Alex Williamson alex.william...@redhat.com --- hw/pci.h |4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) diff --git a/hw/pci.h b/hw/pci.h index 34955d8..7c52637 100644 --- a/hw/pci.h +++ b/hw/pci.h @@

Re: [PATCH 2/5] pci: MSI-X capability is 12 bytes, not 16, MSI is 10 bytes

2010-12-03 Thread Alex Williamson
On Fri, 2010-12-03 at 11:37 -0800, Chris Wright wrote: * Alex Williamson (alex.william...@redhat.com) wrote: Signed-off-by: Alex Williamson alex.william...@redhat.com --- hw/pci.h |4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) diff --git a/hw/pci.h b/hw/pci.h

Re: [PATCH 2/5] pci: MSI-X capability is 12 bytes, not 16, MSI is 10 bytes

2010-12-03 Thread Chris Wright
* Alex Williamson (alex.william...@redhat.com) wrote: On Fri, 2010-12-03 at 11:37 -0800, Chris Wright wrote: * Alex Williamson (alex.william...@redhat.com) wrote: Signed-off-by: Alex Williamson alex.william...@redhat.com --- hw/pci.h |4 ++-- 1 files changed, 2

Re: [RFC PATCH 2/3] sched: add yield_to function

2010-12-03 Thread Mike Galbraith
On Fri, 2010-12-03 at 10:35 -0500, Rik van Riel wrote: On 12/03/2010 10:09 AM, Mike Galbraith wrote: On Fri, 2010-12-03 at 09:48 -0500, Rik van Riel wrote: On 12/03/2010 09:45 AM, Mike Galbraith wrote: I'll have to go back and re-read that. Off the top of my head, I see no way it could

Re: [PATCH 0/6] KVM: SVM: Wrap access to intercept masks into functions

2010-12-03 Thread Marcelo Tosatti
On Tue, Nov 30, 2010 at 06:03:55PM +0100, Joerg Roedel wrote: Hi Avi, Hi Marcelo, this patchset wraps the access to the intercept vectors in the VMCB into specific functions. There are two reasons for this: 1) In the nested-svm code the effective intercept masks are

Re: [RFC PATCH 2/3] sched: add yield_to function

2010-12-03 Thread Peter Zijlstra
On Fri, 2010-12-03 at 16:09 +0100, Mike Galbraith wrote: On Fri, 2010-12-03 at 09:48 -0500, Rik van Riel wrote: On 12/03/2010 09:45 AM, Mike Galbraith wrote: I'll have to go back and re-read that. Off the top of my head, I see no way it could matter which container the numbers live in

Re: [RFC PATCH 2/3] sched: add yield_to function

2010-12-03 Thread Peter Zijlstra
On Fri, 2010-12-03 at 19:40 +0530, Srivatsa Vaddagiri wrote: On Fri, Dec 03, 2010 at 07:36:07PM +0530, Srivatsa Vaddagiri wrote: On Fri, Dec 03, 2010 at 03:03:30PM +0100, Peter Zijlstra wrote: No, because they do receive service (they spend some time spinning before being interrupted), so

Re: [RFC PATCH 2/3] sched: add yield_to function

2010-12-03 Thread Peter Zijlstra
On Fri, 2010-12-03 at 13:27 -0500, Rik van Riel wrote: Should these details all be in sched_fair? Seems like the wrong layer here. And would that condition go the other way? If new vruntime is smaller than min, then it becomes new cfs_rq->min_vruntime? That would be nice.

[PATCH] kvm-vmx: add module parameter to avoid trapping HLT instructions (v3)

2010-12-03 Thread Anthony Liguori
In certain use-cases, we want to allocate guests fixed time slices where idle guest cycles leave the machine idling. There are many approaches to achieve this but the most direct is to simply avoid trapping the HLT instruction which lets the guest directly execute the instruction putting the

Re: [PATCH] kvm-vmx: add module parameter to avoid trapping HLT instructions (v2)

2010-12-03 Thread Anthony Liguori
On 12/03/2010 03:36 AM, Avi Kivity wrote: On 12/02/2010 10:51 PM, Anthony Liguori wrote: VCPU in HLT state only allows injection of certain events that would be delivered on HLT. #PF is not one of them. But you can't inject an exception into a guest while the VMCS is active, can you? No,

Re: [PATCH] kvm-vmx: add module parameter to avoid trapping HLT instructions (v2)

2010-12-03 Thread Anthony Liguori
On 12/02/2010 02:40 PM, Marcelo Tosatti wrote: On Thu, Dec 02, 2010 at 11:14:16AM -0800, Chris Wright wrote: * Anthony Liguori (aligu...@us.ibm.com) wrote: In certain use-cases, we want to allocate guests fixed time slices where idle guest cycles leave the machine idling. There are

Re: [PATCH] kvm-vmx: add module parameter to avoid trapping HLT instructions (v2)

2010-12-03 Thread Anthony Liguori
On 12/02/2010 11:37 AM, Marcelo Tosatti wrote: On Thu, Dec 02, 2010 at 07:59:17AM -0600, Anthony Liguori wrote: In certain use-cases, we want to allocate guests fixed time slices where idle guest cycles leave the machine idling. There are many approaches to achieve this but the most direct

Re: [PATCH] kvm-vmx: add module parameter to avoid trapping HLT instructions (v2)

2010-12-03 Thread Anthony Liguori
On 12/03/2010 03:38 AM, Avi Kivity wrote: On 12/02/2010 05:23 PM, Anthony Liguori wrote: On 12/02/2010 08:39 AM, lidong chen wrote: In certain use-cases, we want to allocate guests fixed time slices where idle guest cycles leave the machine idling. i could not understand why need this? can

Re: [PATCH] kvm-vmx: add module parameter to avoid trapping HLT instructions (v2)

2010-12-03 Thread Anthony Liguori
On 12/02/2010 02:12 PM, Marcelo Tosatti wrote: It should be possible to achieve determinism with a scheduler policy? If the ultimate desire is to have the guests be scheduled in a non-work conserving fashion, I can't see a more direct approach than to simply not have the

Re: [PATCH] kvm-vmx: add module parameter to avoid trapping HLT instructions (v3)

2010-12-03 Thread Joerg Roedel
On Fri, Dec 03, 2010 at 04:39:22PM -0600, Anthony Liguori wrote: + if (yield_on_hlt) + min |= CPU_BASED_HLT_EXITING; This approach won't work out on AMD because in HLT the CPU may enter C1e. In C1e the local apic timer interrupt is not delivered anymore and when this is the
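
The quoted hunk makes HLT exiting a required control only when the module parameter is set. As a rough, self-contained sketch of that logic (the helper name `required_exec_controls` is hypothetical and not from the patch; `yield_on_hlt`, `min` and `CPU_BASED_HLT_EXITING` are the names quoted above):

```c
#include <stdbool.h>
#include <stdint.h>

/* "HLT exiting" is bit 7 of the primary processor-based
 * VM-execution controls on VMX. */
#define CPU_BASED_HLT_EXITING 0x00000080u

/* Hypothetical helper, for illustration only: return the mask of
 * required execution controls, adding HLT exiting when the guest's
 * HLT should trap to the host (yield_on_hlt set). */
static inline uint32_t required_exec_controls(uint32_t min, bool yield_on_hlt)
{
    if (yield_on_hlt)
        min |= CPU_BASED_HLT_EXITING;
    return min;
}
```

With `yield_on_hlt` false the bit is left clear, so the guest executes HLT directly and its idle cycles leave the physical CPU idle.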

[PATCH 0/5] KVM&genirq: Enable adaptive IRQ sharing for passed-through devices

2010-12-03 Thread Jan Kiszka
Besides 3 cleanup patches, this series consists of two major changes. The first introduces an interrupt sharing notifier to the genirq subsystem. It fires when an interrupt line is about to be used by more than one driver or the last but one user called free_irq. The second major change makes use

[PATCH 1/5] genirq: Pass descriptor to __free_irq

2010-12-03 Thread Jan Kiszka
From: Jan Kiszka jan.kis...@siemens.com One caller of __free_irq already has to resolve and check the irq descriptor. So this small cleanup consistently pushes irq_to_desc and the NULL check to the call site to avoid redundant work. Signed-off-by: Jan Kiszka jan.kis...@siemens.com ---

[PATCH 5/5] KVM: Allow host IRQ sharing for passed-through PCI 2.3 devices

2010-12-03 Thread Jan Kiszka
From: Jan Kiszka jan.kis...@siemens.com PCI 2.3 allows IRQ sources to be disabled generically at the device level. This enables us to share IRQs of such devices on the host side when passing them to a guest. However, IRQ disabling via the PCI config space is more costly than masking the line via
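
The device-level masking mentioned here rests on two config-space bits defined since PCI 2.3: Interrupt Disable in the command register and Interrupt Status in the status register. A minimal sketch of the bit manipulation, assuming the register values have already been read from config space; the helper names are hypothetical and this is not the code from the patch:

```c
#include <stdbool.h>
#include <stdint.h>

#define PCI_COMMAND_INTX_DISABLE 0x0400u /* command register, bit 10 */
#define PCI_STATUS_INTERRUPT     0x0008u /* status register, bit 3 */

/* Hypothetical helper: return the command register value with INTx
 * masked or unmasked at the device, leaving other bits untouched. */
static inline uint16_t intx_set_masked(uint16_t command, bool masked)
{
    if (masked)
        return (uint16_t)(command | PCI_COMMAND_INTX_DISABLE);
    return (uint16_t)(command & ~PCI_COMMAND_INTX_DISABLE);
}

/* Hypothetical helper: does the status register report that this
 * device is asserting its INTx line? */
static inline bool intx_pending(uint16_t status)
{
    return (status & PCI_STATUS_INTERRUPT) != 0;
}
```

On a shared line, a handler could use a check like `intx_pending` to tell whether the passed-through device raised the interrupt before masking it at the device instead of at the interrupt controller.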

Re: [PATCH] kvm-vmx: add module parameter to avoid trapping HLT instructions (v3)

2010-12-03 Thread Anthony Liguori
On 12/03/2010 05:32 PM, Joerg Roedel wrote: On Fri, Dec 03, 2010 at 04:39:22PM -0600, Anthony Liguori wrote: + if (yield_on_hlt) + min |= CPU_BASED_HLT_EXITING; This approach won't work out on AMD because in HLT the CPU may enter C1e. In C1e the local apic timer

[PATCH 4/5] KVM: Clean up unneeded void pointer casts

2010-12-03 Thread Jan Kiszka
From: Jan Kiszka jan.kis...@siemens.com Signed-off-by: Jan Kiszka jan.kis...@siemens.com --- virt/kvm/assigned-dev.c | 12 ++-- 1 files changed, 6 insertions(+), 6 deletions(-) diff --git a/virt/kvm/assigned-dev.c b/virt/kvm/assigned-dev.c index 3f8a745..c6114d3 100644 ---

[PATCH 3/5] KVM: Split up MSI-X assigned device IRQ handler

2010-12-03 Thread Jan Kiszka
From: Jan Kiszka jan.kis...@siemens.com The threaded IRQ handler for MSI-X has almost nothing in common with the INTx/MSI handler. Move its code into a dedicated handler. Signed-off-by: Jan Kiszka jan.kis...@siemens.com --- virt/kvm/assigned-dev.c | 32 +++- 1
