[PATCH 8/8] KVM: PPC: Book3S HV: Fix thinko in try_lock_hpte()

2012-10-15 Thread Paul Mackerras
This fixes an error in the inline asm in try_lock_hpte() where we
were erroneously using a register number as an immediate operand.
The bug only affects an error path, and in fact the code will still
work as long as the compiler chooses some register other than r0
for the "bits" variable.  Nevertheless it should still be fixed.

Signed-off-by: Paul Mackerras 
---
 arch/powerpc/include/asm/kvm_book3s_64.h |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/kvm_book3s_64.h 
b/arch/powerpc/include/asm/kvm_book3s_64.h
index 0dd1d86..1472a5b 100644
--- a/arch/powerpc/include/asm/kvm_book3s_64.h
+++ b/arch/powerpc/include/asm/kvm_book3s_64.h
@@ -60,7 +60,7 @@ static inline long try_lock_hpte(unsigned long *hpte, 
unsigned long bits)
 "  ori %0,%0,%4\n"
 "  stdcx.  %0,0,%2\n"
 "  beq+2f\n"
-"  li  %1,%3\n"
+"  mr  %1,%3\n"
 "2:isync"
 : "=&r" (tmp), "=&r" (old)
 : "r" (hpte), "r" (bits), "i" (HPTE_V_HVLOCK)
-- 
1.7.10.4



[PATCH 0/8] Various Book3s HV fixes that haven't been picked up yet

2012-10-15 Thread Paul Mackerras
This is a set of 8 patches of which the first 7 have been posted
previously and have had no comments.  The 8th is new, but is quite
trivial.  They fix a series of issues with HV-style KVM on ppc.
They only touch code that is specific to Book3S HV KVM.

Please apply.

Thanks,
Paul.


[PATCH 5/8] KVM: PPC: Book3S HV: Run virtual core whenever any vcpus in it can run

2012-10-15 Thread Paul Mackerras
Currently the Book3S HV code implements a policy on multi-threaded
processors (i.e. POWER7) that requires all of the active vcpus in a
virtual core to be ready to run before we run the virtual core.
However, that causes problems on reset, because reset stops all vcpus
except vcpu 0, and can also reduce throughput since all four threads
in a virtual core have to wait whenever any one of them hits a
hypervisor page fault.

This relaxes the policy, allowing the virtual core to run as soon as
any vcpu in it is runnable.  With this, the KVMPPC_VCPU_STOPPED state
and the KVMPPC_VCPU_BUSY_IN_HOST state have been combined into a single
KVMPPC_VCPU_NOTREADY state, since we no longer need to distinguish
between them.
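
The effect on the readiness test, sketched (illustrative, not the
literal code):

	/* old policy: all active vcpus must be runnable;
	 * new policy: one runnable vcpu is enough to run the vcore */
	static int vcore_ready(struct kvmppc_vcore *vc)
	{
		return vc->n_runnable > 0;
	}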

Signed-off-by: Paul Mackerras 
---
 arch/powerpc/include/asm/kvm_host.h |5 +--
 arch/powerpc/kvm/book3s_hv.c|   74 ++-
 2 files changed, 40 insertions(+), 39 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_host.h 
b/arch/powerpc/include/asm/kvm_host.h
index 218534d..1e8cbd1 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -563,9 +563,8 @@ struct kvm_vcpu_arch {
 };
 
 /* Values for vcpu->arch.state */
-#define KVMPPC_VCPU_STOPPED    0
-#define KVMPPC_VCPU_BUSY_IN_HOST   1
-#define KVMPPC_VCPU_RUNNABLE   2
+#define KVMPPC_VCPU_NOTREADY   0
+#define KVMPPC_VCPU_RUNNABLE   1
 
 /* Values for vcpu->arch.io_gpr */
 #define KVM_MMIO_REG_MASK  0x001f
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 89995fa..61d2934 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -776,10 +776,7 @@ struct kvm_vcpu *kvmppc_core_vcpu_create(struct kvm *kvm, 
unsigned int id)
 
kvmppc_mmu_book3s_hv_init(vcpu);
 
-   /*
-* We consider the vcpu stopped until we see the first run ioctl for it.
-*/
-   vcpu->arch.state = KVMPPC_VCPU_STOPPED;
+   vcpu->arch.state = KVMPPC_VCPU_NOTREADY;
 
init_waitqueue_head(&vcpu->arch.cpu_run);
 
@@ -866,9 +863,8 @@ static void kvmppc_remove_runnable(struct kvmppc_vcore *vc,
 {
if (vcpu->arch.state != KVMPPC_VCPU_RUNNABLE)
return;
-   vcpu->arch.state = KVMPPC_VCPU_BUSY_IN_HOST;
+   vcpu->arch.state = KVMPPC_VCPU_NOTREADY;
--vc->n_runnable;
-   ++vc->n_busy;
list_del(&vcpu->arch.run_list);
 }
 
@@ -1169,7 +1165,6 @@ static void kvmppc_vcore_blocked(struct kvmppc_vcore *vc)
 static int kvmppc_run_vcpu(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu)
 {
int n_ceded;
-   int prev_state;
struct kvmppc_vcore *vc;
struct kvm_vcpu *v, *vn;
 
@@ -1186,7 +1181,6 @@ static int kvmppc_run_vcpu(struct kvm_run *kvm_run, 
struct kvm_vcpu *vcpu)
vcpu->arch.ceded = 0;
vcpu->arch.run_task = current;
vcpu->arch.kvm_run = kvm_run;
-   prev_state = vcpu->arch.state;
vcpu->arch.state = KVMPPC_VCPU_RUNNABLE;
list_add_tail(&vcpu->arch.run_list, &vc->runnable_threads);
++vc->n_runnable;
@@ -1196,35 +1190,26 @@ static int kvmppc_run_vcpu(struct kvm_run *kvm_run, 
struct kvm_vcpu *vcpu)
 * If the vcore is already running, we may be able to start
 * this thread straight away and have it join in.
 */
-   if (prev_state == KVMPPC_VCPU_STOPPED) {
+   if (!signal_pending(current)) {
if (vc->vcore_state == VCORE_RUNNING &&
VCORE_EXIT_COUNT(vc) == 0) {
vcpu->arch.ptid = vc->n_runnable - 1;
kvmppc_create_dtl_entry(vcpu, vc);
kvmppc_start_thread(vcpu);
+   } else if (vc->vcore_state == VCORE_SLEEPING) {
+   wake_up(&vc->wq);
}
 
-   } else if (prev_state == KVMPPC_VCPU_BUSY_IN_HOST)
-   --vc->n_busy;
+   }
 
while (vcpu->arch.state == KVMPPC_VCPU_RUNNABLE &&
   !signal_pending(current)) {
-   if (vc->n_busy || vc->vcore_state != VCORE_INACTIVE) {
+   if (vc->vcore_state != VCORE_INACTIVE) {
spin_unlock(&vc->lock);
kvmppc_wait_for_exec(vcpu, TASK_INTERRUPTIBLE);
spin_lock(&vc->lock);
continue;
}
-   vc->runner = vcpu;
-   n_ceded = 0;
-   list_for_each_entry(v, &vc->runnable_threads, arch.run_list)
-   if (!v->arch.pending_exceptions)
-   n_ceded += v->arch.ceded;
-   if (n_ceded == vc->n_runnable)
-   kvmppc_vcore_blocked(vc);
-   else
-   kvmppc_run_core(vc);
-
list_for_each_entry_safe(v, vn, &vc->runnable_threads,
 arch.run_list) {
kvmppc_core_prepare_to_enter(v);
@@ -1236,23 +1221,40 

[PATCH 6/8] KVM: PPC: Book3S HV: Fix accounting of stolen time

2012-10-15 Thread Paul Mackerras
Currently the code that accounts stolen time tends to overestimate the
stolen time, and will sometimes report more stolen time in a DTL
(dispatch trace log) entry than has elapsed since the last DTL entry.
This can cause guests to underflow the user or system time measured
for some tasks, leading to ridiculous CPU percentages and total runtimes
being reported by top and other utilities.

In addition, the current code was designed for the previous policy where
a vcore would only run when all the vcpus in it were runnable, and so
only counted stolen time on a per-vcore basis.  Now that a vcore can
run while some of the vcpus in it are doing other things in the kernel
(e.g. handling a page fault), we also need to count as stolen the time
when a vcpu task is preempted while it is not running as part of a vcore.

To do this, we bring back the BUSY_IN_HOST vcpu state and extend the
vcpu_load/put functions to count preemption time while the vcpu is
in that state.  Handling the transitions between the RUNNING and
BUSY_IN_HOST states requires checking and updating two variables
(accumulated time stolen and time last preempted), so we add a new
spinlock, vcpu->arch.tbacct_lock.  This protects both the per-vcpu
stolen/preempt-time variables, and the per-vcore variables while this
vcpu is running the vcore.

Finally, we now don't count time spent in userspace as stolen time.
The task could be executing in userspace on behalf of the vcpu, or
it could be preempted, or the vcpu could be genuinely stopped.  Since
we have no way of dividing up the time between these cases, we don't
count any of it as stolen.
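
A sketch of the vcpu_load/put accounting described above (lock and
field names as introduced by this patch; details condensed):

	/* vcpu_put: preemption of a busy-in-host vcpu starts here */
	spin_lock_irq(&vcpu->arch.tbacct_lock);
	if (vcpu->arch.state == KVMPPC_VCPU_BUSY_IN_HOST)
		vcpu->arch.busy_preempt = mftb();
	spin_unlock_irq(&vcpu->arch.tbacct_lock);

	/* vcpu_load: accumulate the stolen interval, if any */
	spin_lock_irq(&vcpu->arch.tbacct_lock);
	if (vcpu->arch.state == KVMPPC_VCPU_BUSY_IN_HOST &&
	    vcpu->arch.busy_preempt != TB_NIL) {
		vcpu->arch.busy_stolen += mftb() - vcpu->arch.busy_preempt;
		vcpu->arch.busy_preempt = TB_NIL;
	}
	spin_unlock_irq(&vcpu->arch.tbacct_lock);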

Signed-off-by: Paul Mackerras 
---
 arch/powerpc/include/asm/kvm_host.h |5 ++
 arch/powerpc/kvm/book3s_hv.c|  127 ++-
 2 files changed, 117 insertions(+), 15 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_host.h 
b/arch/powerpc/include/asm/kvm_host.h
index 1e8cbd1..3093896 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -559,12 +559,17 @@ struct kvm_vcpu_arch {
unsigned long dtl_index;
u64 stolen_logged;
struct kvmppc_vpa slb_shadow;
+
+   spinlock_t tbacct_lock;
+   u64 busy_stolen;
+   u64 busy_preempt;
 #endif
 };
 
 /* Values for vcpu->arch.state */
 #define KVMPPC_VCPU_NOTREADY   0
 #define KVMPPC_VCPU_RUNNABLE   1
+#define KVMPPC_VCPU_BUSY_IN_HOST   2
 
 /* Values for vcpu->arch.io_gpr */
 #define KVM_MMIO_REG_MASK  0x001f
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 61d2934..8b3c470 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -60,23 +60,74 @@
 /* Used to indicate that a guest page fault needs to be handled */
 #define RESUME_PAGE_FAULT  (RESUME_GUEST | RESUME_FLAG_ARCH1)
 
+/* Used as a "null" value for timebase values */
+#define TB_NIL (~(u64)0)
+
 static void kvmppc_end_cede(struct kvm_vcpu *vcpu);
 static int kvmppc_hv_setup_htab_rma(struct kvm_vcpu *vcpu);
 
+/*
+ * We use the vcpu_load/put functions to measure stolen time.
+ * Stolen time is counted as time when either the vcpu is able to
+ * run as part of a virtual core, but the task running the vcore
+ * is preempted or sleeping, or when the vcpu needs something done
+ * in the kernel by the task running the vcpu, but that task is
+ * preempted or sleeping.  Those two things have to be counted
+ * separately, since one of the vcpu tasks will take on the job
+ * of running the core, and the other vcpu tasks in the vcore will
+ * sleep waiting for it to do that, but that sleep shouldn't count
+ * as stolen time.
+ *
+ * Hence we accumulate stolen time when the vcpu can run as part of
+ * a vcore using vc->stolen_tb, and the stolen time when the vcpu
+ * needs its task to do other things in the kernel (for example,
+ * service a page fault) in busy_stolen.  We don't accumulate
+ * stolen time for a vcore when it is inactive, or for a vcpu
+ * when it is in state RUNNING or NOTREADY.  NOTREADY is a bit of
+ * a misnomer; it means that the vcpu task is not executing in
+ * the KVM_VCPU_RUN ioctl, i.e. it is in userspace or elsewhere in
+ * the kernel.  We don't have any way of dividing up that time
+ * between time that the vcpu is genuinely stopped, time that
+ * the task is actively working on behalf of the vcpu, and time
+ * that the task is preempted, so we don't count any of it as
+ * stolen.
+ *
+ * Updates to busy_stolen are protected by arch.tbacct_lock;
+ * updates to vc->stolen_tb are protected by the arch.tbacct_lock
+ * of the vcpu that has taken responsibility for running the vcore
+ * (i.e. vc->runner).  The stolen times are measured in units of
+ * timebase ticks.  (Note that the != TB_NIL checks below are
+ * purely defensive; they should never fail.)
+ */
+
 void kvmppc_core_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
 {
struct kvmppc_vcore *vc = vcpu->arch.vcore;
 
-   if (vc->runner == vcpu && vc->vc

[PATCH 2/8] KVM: PPC: Book3S HV: Fix some races in starting secondary threads

2012-10-15 Thread Paul Mackerras
Subsequent patches implementing in-kernel XICS emulation will make it
possible for IPIs to arrive at secondary threads at arbitrary times.
This fixes some races in how we start the secondary threads, which
if not fixed could lead to occasional crashes of the host kernel.

This makes sure that (a) we have grabbed all the secondary threads,
and verified that they are no longer in the kernel, before we start
any thread, (b) that the secondary thread loads its vcpu pointer
after clearing the IPI that woke it up (so we don't miss a wakeup),
and (c) that the secondary thread clears its vcpu pointer before
incrementing the nap count.  It also removes unnecessary setting
of the vcpu and vcore pointers in the paca in kvmppc_core_vcpu_load.
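
The required ordering on the secondary thread, sketched in C for
clarity (the real code is in the assembly below; the helpers here are
hypothetical stand-ins):

	clear_ipi();		/* hypothetical: ack the wakeup IPI */
	smp_mb();		/* clear the IPI before loading the vcpu pointer */
	vcpu = local_paca->kvm_hstate.kvm_vcpu;
	if (!vcpu)
		nap_again();	/* spurious wakeup: no work was posted */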

Signed-off-by: Paul Mackerras 
---
 arch/powerpc/kvm/book3s_hv.c|   41 ++-
 arch/powerpc/kvm/book3s_hv_rmhandlers.S |   11 ++---
 2 files changed, 32 insertions(+), 20 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index c5ddf04..77dec0f 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -64,8 +64,6 @@ void kvmppc_core_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
 {
struct kvmppc_vcore *vc = vcpu->arch.vcore;
 
-   local_paca->kvm_hstate.kvm_vcpu = vcpu;
-   local_paca->kvm_hstate.kvm_vcore = vc;
if (vc->runner == vcpu && vc->vcore_state != VCORE_INACTIVE)
vc->stolen_tb += mftb() - vc->preempt_tb;
 }
@@ -880,6 +878,7 @@ static int kvmppc_grab_hwthread(int cpu)
 
/* Ensure the thread won't go into the kernel if it wakes */
tpaca->kvm_hstate.hwthread_req = 1;
+   tpaca->kvm_hstate.kvm_vcpu = NULL;
 
/*
 * If the thread is already executing in the kernel (e.g. handling
@@ -929,7 +928,6 @@ static void kvmppc_start_thread(struct kvm_vcpu *vcpu)
smp_wmb();
 #if defined(CONFIG_PPC_ICP_NATIVE) && defined(CONFIG_SMP)
if (vcpu->arch.ptid) {
-   kvmppc_grab_hwthread(cpu);
xics_wake_cpu(cpu);
++vc->n_woken;
}
@@ -955,7 +953,8 @@ static void kvmppc_wait_for_nap(struct kvmppc_vcore *vc)
 
 /*
  * Check that we are on thread 0 and that any other threads in
- * this core are off-line.
+ * this core are off-line.  Then grab the threads so they can't
+ * enter the kernel.
  */
 static int on_primary_thread(void)
 {
@@ -967,6 +966,17 @@ static int on_primary_thread(void)
while (++thr < threads_per_core)
if (cpu_online(cpu + thr))
return 0;
+
+   /* Grab all hw threads so they can't go into the kernel */
+   for (thr = 1; thr < threads_per_core; ++thr) {
+   if (kvmppc_grab_hwthread(cpu + thr)) {
+   /* Couldn't grab one; let the others go */
+   do {
+   kvmppc_release_hwthread(cpu + thr);
+   } while (--thr > 0);
+   return 0;
+   }
+   }
return 1;
 }
 
@@ -1015,16 +1025,6 @@ static int kvmppc_run_core(struct kvmppc_vcore *vc)
}
 
/*
-* Make sure we are running on thread 0, and that
-* secondary threads are offline.
-*/
-   if (threads_per_core > 1 && !on_primary_thread()) {
-   list_for_each_entry(vcpu, &vc->runnable_threads, arch.run_list)
-   vcpu->arch.ret = -EBUSY;
-   goto out;
-   }
-
-   /*
 * Assign physical thread IDs, first to non-ceded vcpus
 * and then to ceded ones.
 */
@@ -1043,15 +1043,22 @@ static int kvmppc_run_core(struct kvmppc_vcore *vc)
if (vcpu->arch.ceded)
vcpu->arch.ptid = ptid++;
 
+   /*
+* Make sure we are running on thread 0, and that
+* secondary threads are offline.
+*/
+   if (threads_per_core > 1 && !on_primary_thread()) {
+   list_for_each_entry(vcpu, &vc->runnable_threads, arch.run_list)
+   vcpu->arch.ret = -EBUSY;
+   goto out;
+   }
+
vc->stolen_tb += mftb() - vc->preempt_tb;
vc->pcpu = smp_processor_id();
list_for_each_entry(vcpu, &vc->runnable_threads, arch.run_list) {
kvmppc_start_thread(vcpu);
kvmppc_create_dtl_entry(vcpu, vc);
}
-   /* Grab any remaining hw threads so they can't go into the kernel */
-   for (i = ptid; i < threads_per_core; ++i)
-   kvmppc_grab_hwthread(vc->pcpu + i);
 
preempt_disable();
spin_unlock(&vc->lock);
diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S 
b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
index 44b72fe..1e90ef6 100644
--- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
+++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
@@ -134,8 +134,11 @@ kvm_start_guest:
 
 27:/* XXX should handle hypervisor maintenance interrupts etc. here */
 
+   /* reload

[PATCH 7/8] KVM: PPC: Book3S HV: Allow DTL to be set to address 0, length 0

2012-10-15 Thread Paul Mackerras
Commit 55b665b026 ("KVM: PPC: Book3S HV: Provide a way for userspace
to get/set per-vCPU areas") includes a check on the length of the
dispatch trace log (DTL) to make sure the buffer is at least one entry
long.  This is appropriate when registering a buffer, but the
interface also allows for any existing buffer to be unregistered by
specifying a zero address.  In this case the length check is not
appropriate.  This makes the check conditional on the address being
non-zero.
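
Illustrative userspace usage of the unregister case (struct layout per
the one_reg interface; the exact calling code is assumed):

	struct { __u64 addr; __u64 length; } dtl = { 0, 0 };
	struct kvm_one_reg reg = {
		.id   = KVM_REG_PPC_VPA_DTL,
		.addr = (__u64)&dtl,
	};
	/* addr == 0 unregisters the DTL; the length check is skipped */
	ioctl(vcpu_fd, KVM_SET_ONE_REG, &reg);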

Signed-off-by: Paul Mackerras 
---
 arch/powerpc/kvm/book3s_hv.c |5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 8b3c470..812764c 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -811,9 +811,8 @@ int kvmppc_set_one_reg(struct kvm_vcpu *vcpu, u64 id, union 
kvmppc_one_reg *val)
addr = val->vpaval.addr;
len = val->vpaval.length;
r = -EINVAL;
-   if (len < sizeof(struct dtl_entry))
-   break;
-   if (addr && !vcpu->arch.vpa.next_gpa)
+   if (addr && (len < sizeof(struct dtl_entry) ||
+!vcpu->arch.vpa.next_gpa))
break;
len -= len % sizeof(struct dtl_entry);
r = set_vpa(vcpu, &vcpu->arch.dtl, addr, len);
-- 
1.7.10.4



[PATCH 4/8] KVM: PPC: Book3S HV: Fixes for late-joining threads

2012-10-15 Thread Paul Mackerras
If a thread in a virtual core becomes runnable while other threads
in the same virtual core are already running in the guest, it is
possible for the latecomer to join the others on the core without
first pulling them all out of the guest.  Currently this only happens
rarely, when a vcpu is first started.  This fixes some bugs and
omissions in the code in this case.

First, we need to check for VPA updates for the latecomer and make
a DTL entry for it.  Secondly, if it comes along while the master
vcpu is doing a VPA update, we don't need to do anything since the
master will pick it up in kvmppc_run_core.  To handle this correctly
we introduce a new vcore state, VCORE_STARTING.  Thirdly, there is
a race because we currently clear the hardware thread's hwthread_req
before waiting to see it get to nap.  A latecomer thread could have
its hwthread_req cleared before it gets to test it, and therefore
never increment the nap_count, leading to messages about wait_for_nap
timeouts.
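
The latecomer path after these fixes, condensed from the diff below:

	if (vc->vcore_state == VCORE_RUNNING && VCORE_EXIT_COUNT(vc) == 0) {
		/* core is in the guest: join it directly */
		vcpu->arch.ptid = vc->n_runnable - 1;
		kvmppc_create_dtl_entry(vcpu, vc);	/* new */
		kvmppc_start_thread(vcpu);
	}
	/* in VCORE_STARTING we deliberately do nothing: the master
	 * will pick this vcpu up in kvmppc_run_core */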

Signed-off-by: Paul Mackerras 
---
 arch/powerpc/include/asm/kvm_host.h |7 ---
 arch/powerpc/kvm/book3s_hv.c|   14 +++---
 2 files changed, 15 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_host.h 
b/arch/powerpc/include/asm/kvm_host.h
index 68f5a30..218534d 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -289,9 +289,10 @@ struct kvmppc_vcore {
 
 /* Values for vcore_state */
 #define VCORE_INACTIVE 0
-#define VCORE_RUNNING  1
-#define VCORE_EXITING  2
-#define VCORE_SLEEPING 3
+#define VCORE_SLEEPING 1
+#define VCORE_STARTING 2
+#define VCORE_RUNNING  3
+#define VCORE_EXITING  4
 
 /*
  * Struct used to manage memory for a virtual processor area
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 3a737a4..89995fa 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -336,6 +336,11 @@ static void kvmppc_update_vpa(struct kvm_vcpu *vcpu, 
struct kvmppc_vpa *vpap)
 
 static void kvmppc_update_vpas(struct kvm_vcpu *vcpu)
 {
+   if (!(vcpu->arch.vpa.update_pending ||
+ vcpu->arch.slb_shadow.update_pending ||
+ vcpu->arch.dtl.update_pending))
+   return;
+
spin_lock(&vcpu->arch.vpa_update_lock);
if (vcpu->arch.vpa.update_pending) {
kvmppc_update_vpa(vcpu, &vcpu->arch.vpa);
@@ -1009,7 +1014,7 @@ static void kvmppc_run_core(struct kvmppc_vcore *vc)
vc->n_woken = 0;
vc->nap_count = 0;
vc->entry_exit_count = 0;
-   vc->vcore_state = VCORE_RUNNING;
+   vc->vcore_state = VCORE_STARTING;
vc->in_guest = 0;
vc->napping_threads = 0;
 
@@ -1062,6 +1067,7 @@ static void kvmppc_run_core(struct kvmppc_vcore *vc)
kvmppc_create_dtl_entry(vcpu, vc);
}
 
+   vc->vcore_state = VCORE_RUNNING;
preempt_disable();
spin_unlock(&vc->lock);
 
@@ -1070,8 +1076,6 @@ static void kvmppc_run_core(struct kvmppc_vcore *vc)
srcu_idx = srcu_read_lock(&vcpu0->kvm->srcu);
 
__kvmppc_vcore_entry(NULL, vcpu0);
-   for (i = 0; i < threads_per_core; ++i)
-   kvmppc_release_hwthread(vc->pcpu + i);
 
spin_lock(&vc->lock);
/* disable sending of IPIs on virtual external irqs */
@@ -1080,6 +1084,8 @@ static void kvmppc_run_core(struct kvmppc_vcore *vc)
/* wait for secondary threads to finish writing their state to memory */
if (vc->nap_count < vc->n_woken)
kvmppc_wait_for_nap(vc);
+   for (i = 0; i < threads_per_core; ++i)
+   kvmppc_release_hwthread(vc->pcpu + i);
/* prevent other vcpu threads from doing kvmppc_start_thread() now */
vc->vcore_state = VCORE_EXITING;
spin_unlock(&vc->lock);
@@ -1170,6 +1176,7 @@ static int kvmppc_run_vcpu(struct kvm_run *kvm_run, 
struct kvm_vcpu *vcpu)
kvm_run->exit_reason = 0;
vcpu->arch.ret = RESUME_GUEST;
vcpu->arch.trap = 0;
+   kvmppc_update_vpas(vcpu);
 
/*
 * Synchronize with other threads in this virtual core
@@ -1193,6 +1200,7 @@ static int kvmppc_run_vcpu(struct kvm_run *kvm_run, 
struct kvm_vcpu *vcpu)
if (vc->vcore_state == VCORE_RUNNING &&
VCORE_EXIT_COUNT(vc) == 0) {
vcpu->arch.ptid = vc->n_runnable - 1;
+   kvmppc_create_dtl_entry(vcpu, vc);
kvmppc_start_thread(vcpu);
}
 
-- 
1.7.10.4



[PATCH 1/8] KVM: PPC: Book3S HV: Allow KVM guests to stop secondary threads coming online

2012-10-15 Thread Paul Mackerras
When a Book3S HV KVM guest is running, we need the host to be in
single-thread mode, that is, all of the cores (or at least all of
the cores where the KVM guest could run) to be running only one
active hardware thread.  This is because of the hardware restriction
in POWER processors that all of the hardware threads in the core
must be in the same logical partition.  Complying with this restriction
is much easier if, from the host kernel's point of view, only one
hardware thread is active.

This adds two hooks in the SMP hotplug code to allow the KVM code to
make sure that secondary threads (i.e. hardware threads other than
thread 0) cannot come online while any KVM guest exists.  The KVM
code still has to check that any core where it runs a guest has the
secondary threads offline, but having done that check it can now be
sure that they will not come online while the guest is running.
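
Usage, condensed from the book3s_hv.c hunks below:

	/* kvmppc_core_init_vm(): block secondary onlining while VMs exist */
	inhibit_secondary_onlining();

	/* kvmppc_core_destroy_vm(): allow secondaries online again */
	uninhibit_secondary_onlining();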

Signed-off-by: Paul Mackerras 
Acked-by: Benjamin Herrenschmidt 
---
 arch/powerpc/include/asm/smp.h |8 +++
 arch/powerpc/kernel/smp.c  |   46 
 arch/powerpc/kvm/book3s_hv.c   |   12 +--
 3 files changed, 64 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/include/asm/smp.h b/arch/powerpc/include/asm/smp.h
index ebc24dc..b625a1a 100644
--- a/arch/powerpc/include/asm/smp.h
+++ b/arch/powerpc/include/asm/smp.h
@@ -66,6 +66,14 @@ void generic_cpu_die(unsigned int cpu);
 void generic_mach_cpu_die(void);
 void generic_set_cpu_dead(unsigned int cpu);
 int generic_check_cpu_restart(unsigned int cpu);
+
+extern void inhibit_secondary_onlining(void);
+extern void uninhibit_secondary_onlining(void);
+
+#else /* HOTPLUG_CPU */
+static inline void inhibit_secondary_onlining(void) {}
+static inline void uninhibit_secondary_onlining(void) {}
+
 #endif
 
 #ifdef CONFIG_PPC64
diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index 8d4214a..c4f420c 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -417,6 +417,45 @@ int generic_check_cpu_restart(unsigned int cpu)
 {
return per_cpu(cpu_state, cpu) == CPU_UP_PREPARE;
 }
+
+static atomic_t secondary_inhibit_count;
+
+/*
+ * Don't allow secondary CPU threads to come online
+ */
+void inhibit_secondary_onlining(void)
+{
+   /*
+* This makes secondary_inhibit_count stable during cpu
+* online/offline operations.
+*/
+   get_online_cpus();
+
+   atomic_inc(&secondary_inhibit_count);
+   put_online_cpus();
+}
+EXPORT_SYMBOL_GPL(inhibit_secondary_onlining);
+
+/*
+ * Allow secondary CPU threads to come online again
+ */
+void uninhibit_secondary_onlining(void)
+{
+   get_online_cpus();
+   atomic_dec(&secondary_inhibit_count);
+   put_online_cpus();
+}
+EXPORT_SYMBOL_GPL(uninhibit_secondary_onlining);
+
+static int secondaries_inhibited(void)
+{
+   return atomic_read(&secondary_inhibit_count);
+}
+
+#else /* HOTPLUG_CPU */
+
+#define secondaries_inhibited()    0
+
 #endif
 
 static void cpu_idle_thread_init(unsigned int cpu, struct task_struct *idle)
@@ -435,6 +474,13 @@ int __cpuinit __cpu_up(unsigned int cpu, struct 
task_struct *tidle)
 {
int rc, c;
 
+   /*
+* Don't allow secondary threads to come online if inhibited
+*/
+   if (threads_per_core > 1 && secondaries_inhibited() &&
+   cpu % threads_per_core != 0)
+   return -EBUSY;
+
if (smp_ops == NULL ||
(smp_ops->cpu_bootable && !smp_ops->cpu_bootable(cpu)))
return -EINVAL;
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 9a15da7..c5ddf04 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -47,6 +47,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -1016,8 +1017,6 @@ static int kvmppc_run_core(struct kvmppc_vcore *vc)
/*
 * Make sure we are running on thread 0, and that
 * secondary threads are offline.
-* XXX we should also block attempts to bring any
-* secondary threads online.
 */
if (threads_per_core > 1 && !on_primary_thread()) {
list_for_each_entry(vcpu, &vc->runnable_threads, arch.run_list)
@@ -1730,11 +1729,20 @@ int kvmppc_core_init_vm(struct kvm *kvm)
 
kvm->arch.using_mmu_notifiers = !!cpu_has_feature(CPU_FTR_ARCH_206);
spin_lock_init(&kvm->arch.slot_phys_lock);
+
+   /*
+* Don't allow secondary CPU threads to come online
+* while any KVM VMs exist.
+*/
+   inhibit_secondary_onlining();
+
return 0;
 }
 
 void kvmppc_core_destroy_vm(struct kvm *kvm)
 {
+   uninhibit_secondary_onlining();
+
if (kvm->arch.rma) {
kvm_release_rma(kvm->arch.rma);
kvm->arch.rma = NULL;
-- 
1.7.10.4


[PATCH 3/8] KVM: PPC: Book3s HV: Don't access runnable threads list without vcore lock

2012-10-15 Thread Paul Mackerras
There were a few places where we were traversing the list of runnable
threads in a virtual core, i.e. vc->runnable_threads, without holding
the vcore spinlock.  This extends the places where we hold the vcore
spinlock to cover everywhere that we traverse that list.

Since we possibly need to sleep inside kvmppc_book3s_hv_page_fault,
this moves the call of it from kvmppc_handle_exit out to
kvmppc_vcpu_run, where we don't hold the vcore lock.

In kvmppc_vcore_blocked, we don't actually need to check whether
all vcpus are ceded and don't have any pending exceptions, since the
caller has already done that.  The caller (kvmppc_run_vcpu) wasn't
actually checking for pending exceptions, so we add that.

The change of if to while in kvmppc_run_vcpu is to make sure that we
never call kvmppc_remove_runnable() when the vcore state is RUNNING or
EXITING.
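
The locking rule this patch enforces, sketched:

	spin_lock(&vc->lock);
	list_for_each_entry(v, &vc->runnable_threads, arch.run_list)
		kvmppc_core_prepare_to_enter(v);
	spin_unlock(&vc->lock);	/* the list is only walked under vc->lock */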

Signed-off-by: Paul Mackerras 
---
 arch/powerpc/include/asm/kvm_asm.h |1 +
 arch/powerpc/kvm/book3s_hv.c   |   67 ++--
 2 files changed, 34 insertions(+), 34 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_asm.h 
b/arch/powerpc/include/asm/kvm_asm.h
index 76fdcfe..aabcdba 100644
--- a/arch/powerpc/include/asm/kvm_asm.h
+++ b/arch/powerpc/include/asm/kvm_asm.h
@@ -118,6 +118,7 @@
 
 #define RESUME_FLAG_NV  (1<<0)  /* Reload guest nonvolatile state? */
 #define RESUME_FLAG_HOST(1<<1)  /* Resume host? */
+#define RESUME_FLAG_ARCH1  (1<<2)
 
#define RESUME_GUEST    0
 #define RESUME_GUEST_NV RESUME_FLAG_NV
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 77dec0f..3a737a4 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -57,6 +57,9 @@
 /* #define EXIT_DEBUG_SIMPLE */
 /* #define EXIT_DEBUG_INT */
 
+/* Used to indicate that a guest page fault needs to be handled */
+#define RESUME_PAGE_FAULT  (RESUME_GUEST | RESUME_FLAG_ARCH1)
+
 static void kvmppc_end_cede(struct kvm_vcpu *vcpu);
 static int kvmppc_hv_setup_htab_rma(struct kvm_vcpu *vcpu);
 
@@ -431,7 +434,6 @@ static int kvmppc_handle_exit(struct kvm_run *run, struct 
kvm_vcpu *vcpu,
  struct task_struct *tsk)
 {
int r = RESUME_HOST;
-   int srcu_idx;
 
vcpu->stat.sum_exits++;
 
@@ -491,16 +493,12 @@ static int kvmppc_handle_exit(struct kvm_run *run, struct 
kvm_vcpu *vcpu,
 * have been handled already.
 */
case BOOK3S_INTERRUPT_H_DATA_STORAGE:
-   srcu_idx = srcu_read_lock(&vcpu->kvm->srcu);
-   r = kvmppc_book3s_hv_page_fault(run, vcpu,
-   vcpu->arch.fault_dar, vcpu->arch.fault_dsisr);
-   srcu_read_unlock(&vcpu->kvm->srcu, srcu_idx);
+   r = RESUME_PAGE_FAULT;
break;
case BOOK3S_INTERRUPT_H_INST_STORAGE:
-   srcu_idx = srcu_read_lock(&vcpu->kvm->srcu);
-   r = kvmppc_book3s_hv_page_fault(run, vcpu,
-   kvmppc_get_pc(vcpu), 0);
-   srcu_read_unlock(&vcpu->kvm->srcu, srcu_idx);
+   vcpu->arch.fault_dar = kvmppc_get_pc(vcpu);
+   vcpu->arch.fault_dsisr = 0;
+   r = RESUME_PAGE_FAULT;
break;
/*
 * This occurs if the guest executes an illegal instruction.
@@ -984,22 +982,24 @@ static int on_primary_thread(void)
  * Run a set of guest threads on a physical core.
  * Called with vc->lock held.
  */
-static int kvmppc_run_core(struct kvmppc_vcore *vc)
+static void kvmppc_run_core(struct kvmppc_vcore *vc)
 {
struct kvm_vcpu *vcpu, *vcpu0, *vnext;
long ret;
u64 now;
int ptid, i, need_vpa_update;
int srcu_idx;
+   struct kvm_vcpu *vcpus_to_update[threads_per_core];
 
/* don't start if any threads have a signal pending */
need_vpa_update = 0;
list_for_each_entry(vcpu, &vc->runnable_threads, arch.run_list) {
if (signal_pending(vcpu->arch.run_task))
-   return 0;
-   need_vpa_update |= vcpu->arch.vpa.update_pending |
-   vcpu->arch.slb_shadow.update_pending |
-   vcpu->arch.dtl.update_pending;
+   return;
+   if (vcpu->arch.vpa.update_pending ||
+   vcpu->arch.slb_shadow.update_pending ||
+   vcpu->arch.dtl.update_pending)
+   vcpus_to_update[need_vpa_update++] = vcpu;
}
 
/*
@@ -1019,8 +1019,8 @@ static int kvmppc_run_core(struct kvmppc_vcore *vc)
 */
if (need_vpa_update) {
spin_unlock(&vc->lock);
-   list_for_each_entry(vcpu, &vc->runnable_threads, arch.run_list)
-   kvmppc_update_vpas(vcpu);
+   for (i = 0; i < need_vpa_update; ++i)
+   kvmppc_update_vpas(vcpus_to_update[i]);
spin_lock(&vc->lock);
}
 
@@ -1037,8

Re: [PATCH 0/8] Various Book3s HV fixes that haven't been picked up yet

2012-10-15 Thread Alexander Graf

On 15.10.2012, at 13:14, Paul Mackerras wrote:

> This is a set of 8 patches of which the first 7 have been posted
> previously and have had no comments.  The 8th is new, but is quite
> trivial.  They fix a series of issues with HV-style KVM on ppc.
> They only touch code that is specific to Book3S HV KVM.
> 
> Please apply.

Sorry, I can't accept patches that haven't shown up on kvm@vger. Please send 
this patch set again with CC to kvm@vger.


Alex



[PATCH 0/2] KVM: PPC: Support ioeventfd

2012-10-15 Thread Alexander Graf
In order to support vhost, we need to be able to support ioeventfd.

This patch set adds support for ioeventfd to PPC and makes it possible to
do so without implementing irqfd along the way, since irqfd requires an
in-kernel irqchip, which we don't have yet.

Alex

Alexander Graf (2):
  KVM: Disentangle eventfd code from irqchip
  KVM: PPC: Support eventfd

 arch/powerpc/kvm/Kconfig   |1 +
 arch/powerpc/kvm/Makefile  |4 +++-
 arch/powerpc/kvm/powerpc.c |   17 -
 include/linux/kvm_host.h   |   12 +++-
 virt/kvm/eventfd.c |6 ++
 5 files changed, 37 insertions(+), 3 deletions(-)



[PATCH 2/2] KVM: PPC: Support eventfd

2012-10-15 Thread Alexander Graf
In order to support the generic eventfd infrastructure on PPC, we need
to call into the generic KVM in-kernel device mmio code.
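
For context, an illustrative userspace registration that these
kvm_io_bus hooks will service (address and length are assumed values):

	int efd = eventfd(0, 0);
	struct kvm_ioeventfd ioefd = {
		.addr  = 0x10000000,	/* hypothetical MMIO doorbell */
		.len   = 4,
		.fd    = efd,
		.flags = 0,		/* MMIO bus, no datamatch */
	};
	/* guest stores to the doorbell now signal efd without a
	   userspace exit */
	ioctl(vm_fd, KVM_IOEVENTFD, &ioefd);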

Signed-off-by: Alexander Graf 
---
 arch/powerpc/kvm/Kconfig   |1 +
 arch/powerpc/kvm/Makefile  |4 +++-
 arch/powerpc/kvm/powerpc.c |   17 -
 3 files changed, 20 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kvm/Kconfig b/arch/powerpc/kvm/Kconfig
index 71f0cd9..4730c95 100644
--- a/arch/powerpc/kvm/Kconfig
+++ b/arch/powerpc/kvm/Kconfig
@@ -20,6 +20,7 @@ config KVM
bool
select PREEMPT_NOTIFIERS
select ANON_INODES
+   select HAVE_KVM_EVENTFD
 
 config KVM_BOOK3S_HANDLER
bool
diff --git a/arch/powerpc/kvm/Makefile b/arch/powerpc/kvm/Makefile
index c2a0863..cd89658 100644
--- a/arch/powerpc/kvm/Makefile
+++ b/arch/powerpc/kvm/Makefile
@@ -6,7 +6,8 @@ subdir-ccflags-$(CONFIG_PPC_WERROR) := -Werror
 
 ccflags-y := -Ivirt/kvm -Iarch/powerpc/kvm
 
-common-objs-y = $(addprefix ../../../virt/kvm/, kvm_main.o coalesced_mmio.o)
+common-objs-y = $(addprefix ../../../virt/kvm/, kvm_main.o coalesced_mmio.o \
+   eventfd.o)
 
 CFLAGS_44x_tlb.o  := -I.
 CFLAGS_e500_tlb.o := -I.
@@ -76,6 +77,7 @@ kvm-book3s_64-builtin-objs-$(CONFIG_KVM_BOOK3S_64_HV) := \
 
 kvm-book3s_64-module-objs := \
../../../virt/kvm/kvm_main.o \
+   ../../../virt/kvm/eventfd.o \
powerpc.o \
emulate.o \
book3s.o \
diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index deb0d59..900d8fc 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -314,6 +314,7 @@ int kvm_dev_ioctl_check_extension(long ext)
case KVM_CAP_PPC_IRQ_LEVEL:
case KVM_CAP_ENABLE_CAP:
case KVM_CAP_ONE_REG:
+   case KVM_CAP_IOEVENTFD:
r = 1;
break;
 #ifndef CONFIG_KVM_BOOK3S_64_HV
@@ -613,6 +614,13 @@ int kvmppc_handle_load(struct kvm_run *run, struct 
kvm_vcpu *vcpu,
vcpu->mmio_is_write = 0;
vcpu->arch.mmio_sign_extend = 0;
 
+   if (!kvm_io_bus_read(vcpu->kvm, KVM_MMIO_BUS, run->mmio.phys_addr,
+bytes, &run->mmio.data)) {
+   kvmppc_complete_mmio_load(vcpu, run);
+   vcpu->mmio_needed = 0;
+   return EMULATE_DONE;
+   }
+
return EMULATE_DO_MMIO;
 }
 
@@ -622,8 +630,8 @@ int kvmppc_handle_loads(struct kvm_run *run, struct 
kvm_vcpu *vcpu,
 {
int r;
 
-   r = kvmppc_handle_load(run, vcpu, rt, bytes, is_bigendian);
vcpu->arch.mmio_sign_extend = 1;
+   r = kvmppc_handle_load(run, vcpu, rt, bytes, is_bigendian);
 
return r;
 }
@@ -661,6 +669,13 @@ int kvmppc_handle_store(struct kvm_run *run, struct 
kvm_vcpu *vcpu,
}
}
 
+   if (!kvm_io_bus_write(vcpu->kvm, KVM_MMIO_BUS, run->mmio.phys_addr,
+ bytes, &run->mmio.data)) {
+   kvmppc_complete_mmio_load(vcpu, run);
+   vcpu->mmio_needed = 0;
+   return EMULATE_DONE;
+   }
+
return EMULATE_DO_MMIO;
 }
 
-- 
1.6.0.2



[PATCH 1/2] KVM: Disentangle eventfd code from irqchip

2012-10-15 Thread Alexander Graf
The current eventfd code assumes that when we have eventfd, we also have
irqfd for in-kernel interrupt delivery. This is not necessarily true. On
PPC we don't have an in-kernel irqchip yet, but we can still easily
support eventfd.

Signed-off-by: Alexander Graf 
---
 include/linux/kvm_host.h |   12 +++-
 virt/kvm/eventfd.c   |6 ++
 2 files changed, 17 insertions(+), 1 deletions(-)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 6afc5be..f2f5880 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -884,10 +884,20 @@ static inline void kvm_free_irq_routing(struct kvm *kvm) 
{}
 #ifdef CONFIG_HAVE_KVM_EVENTFD
 
 void kvm_eventfd_init(struct kvm *kvm);
+int kvm_ioeventfd(struct kvm *kvm, struct kvm_ioeventfd *args);
+
+#ifdef CONFIG_HAVE_KVM_IRQCHIP
 int kvm_irqfd(struct kvm *kvm, struct kvm_irqfd *args);
 void kvm_irqfd_release(struct kvm *kvm);
 void kvm_irq_routing_update(struct kvm *, struct kvm_irq_routing_table *);
-int kvm_ioeventfd(struct kvm *kvm, struct kvm_ioeventfd *args);
+#else
+static inline int kvm_irqfd(struct kvm *kvm, struct kvm_irqfd *args)
+{
+   return -EINVAL;
+}
+
+static inline void kvm_irqfd_release(struct kvm *kvm) {}
+#endif
 
 #else
 
diff --git a/virt/kvm/eventfd.c b/virt/kvm/eventfd.c
index 9718e98..d7424c8 100644
--- a/virt/kvm/eventfd.c
+++ b/virt/kvm/eventfd.c
@@ -35,6 +35,7 @@
 
 #include "iodev.h"
 
+#ifdef __KVM_HAVE_IOAPIC
 /*
  * 
  * irqfd: Allows an fd to be used to inject an interrupt to the guest
@@ -425,17 +426,21 @@ fail:
kfree(irqfd);
return ret;
 }
+#endif
 
 void
 kvm_eventfd_init(struct kvm *kvm)
 {
+#ifdef __KVM_HAVE_IOAPIC
spin_lock_init(&kvm->irqfds.lock);
INIT_LIST_HEAD(&kvm->irqfds.items);
INIT_LIST_HEAD(&kvm->irqfds.resampler_list);
mutex_init(&kvm->irqfds.resampler_lock);
+#endif
INIT_LIST_HEAD(&kvm->ioeventfds);
 }
 
+#ifdef __KVM_HAVE_IOAPIC
 /*
  * shutdown any irqfd's that match fd+gsi
  */
@@ -555,6 +560,7 @@ static void __exit irqfd_module_exit(void)
 
 module_init(irqfd_module_init);
 module_exit(irqfd_module_exit);
+#endif
 
 /*
  * 
-- 
1.6.0.2



[PATCH] kvm/powerpc: Handle errors in secondary thread grabbing

2012-10-15 Thread Michael Ellerman
In the Book3s HV code, kvmppc_run_core() has logic to grab the secondary
threads of the physical core.

If for some reason a thread is stuck, kvmppc_grab_hwthread() can fail,
but currently we ignore the failure and continue into the guest. If the
stuck thread is in the kernel, badness ensues.

Instead we should check for failure and bail out.

I've moved the grabbing prior to the startup of runnable threads, to simplify
the error case. AFAICS this is harmless, but I could be missing something
subtle.

Signed-off-by: Michael Ellerman 
---

Or we could just BUG_ON() ?
---
 arch/powerpc/kvm/book3s_hv.c |   22 ++
 1 file changed, 18 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 721d460..55925cd 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -884,16 +884,30 @@ static int kvmppc_run_core(struct kvmppc_vcore *vc)
if (vcpu->arch.ceded)
vcpu->arch.ptid = ptid++;
 
+   /*
+* Grab any remaining hw threads so they can't go into the kernel.
+* Do this early to simplify the cleanup path if it fails.
+*/
+   for (i = ptid; i < threads_per_core; ++i) {
+   int j, rc = kvmppc_grab_hwthread(vc->pcpu + i);
+   if (rc) {
+   for (j = i - 1; j ; j--)
+   kvmppc_release_hwthread(vc->pcpu + j);
+
+   list_for_each_entry(vcpu, &vc->runnable_threads,
+   arch.run_list)
+   vcpu->arch.ret = -EBUSY;
+
+   goto out;
+   }
+   }
+
vc->stolen_tb += mftb() - vc->preempt_tb;
vc->pcpu = smp_processor_id();
list_for_each_entry(vcpu, &vc->runnable_threads, arch.run_list) {
kvmppc_start_thread(vcpu);
kvmppc_create_dtl_entry(vcpu, vc);
}
-   /* Grab any remaining hw threads so they can't go into the kernel */
-   for (i = ptid; i < threads_per_core; ++i)
-   kvmppc_grab_hwthread(vc->pcpu + i);
-
preempt_disable();
spin_unlock(&vc->lock);
 
-- 
1.7.9.5



Re: [PATCH 0/8] Various Book3s HV fixes that haven't been picked up yet

2012-10-15 Thread Paul Mackerras
On Mon, Oct 15, 2012 at 02:00:54PM +0200, Alexander Graf wrote:
> 
> Sorry, I can't accept patches that haven't shown up on kvm@vger. Please send 
> this patch set again with CC to kvm@vger.

Done; I didn't cc kvm-ppc this time since the patches haven't changed.

By the way, what is the purpose of kvm-ppc@vger.kernel.org?

Paul.


Re: [PATCH] kvm/powerpc: Handle errors in secondary thread grabbing

2012-10-15 Thread Paul Mackerras
Michael,

On Tue, Oct 16, 2012 at 11:15:50AM +1100, Michael Ellerman wrote:
> In the Book3s HV code, kvmppc_run_core() has logic to grab the secondary
> threads of the physical core.
> 
> If for some reason a thread is stuck, kvmppc_grab_hwthread() can fail,
> but currently we ignore the failure and continue into the guest. If the
> stuck thread is in the kernel badness ensues.
> 
> Instead we should check for failure and bail out.
> 
> I've moved the grabbing prior to the startup of runnable threads, to simplify
> the error case. AFAICS this is harmless, but I could be missing something
> subtle.

Thanks for looking at this - but in fact this is fixed by my patch
entitled "KVM: PPC: Book3S HV: Fix some races in starting secondary
threads" submitted back on August 28.

Regards,
Paul.


[PATCH 1/5] KVM: Provide mmu notifier retry test based on struct kvm

2012-10-15 Thread Paul Mackerras
The mmu_notifier_retry() function, used to test whether any page
invalidations are in progress, currently takes a vcpu pointer, though
the code only needs the VM's struct kvm pointer.  Forthcoming patches
to the powerpc Book3S HV code will need to test for retry within a VM
ioctl, where a struct kvm pointer is available but a struct vcpu
pointer isn't.  Therefore this creates a variant of mmu_notifier_retry
called kvm_mmu_notifier_retry that takes a struct kvm pointer, and
implements mmu_notifier_retry in terms of it.
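
The intended call pattern from a VM ioctl, where only a struct kvm is
at hand (sketch):

	unsigned long mmu_seq = kvm->mmu_notifier_seq;
	smp_rmb();	/* read the seq before doing the translation */
	/* ... look up the page and prepare the translation ... */
	if (kvm_mmu_notifier_retry(kvm, mmu_seq))
		goto retry;	/* an invalidation raced; redo the lookup */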

Signed-off-by: Paul Mackerras 
---
 include/linux/kvm_host.h |   11 ---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 6afc5be..1cc1e1d 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -841,9 +841,9 @@ extern struct kvm_stats_debugfs_item debugfs_entries[];
 extern struct dentry *kvm_debugfs_dir;
 
 #if defined(CONFIG_MMU_NOTIFIER) && defined(KVM_ARCH_WANT_MMU_NOTIFIER)
-static inline int mmu_notifier_retry(struct kvm_vcpu *vcpu, unsigned long 
mmu_seq)
+static inline int kvm_mmu_notifier_retry(struct kvm *kvm, unsigned long 
mmu_seq)
 {
-   if (unlikely(vcpu->kvm->mmu_notifier_count))
+   if (unlikely(kvm->mmu_notifier_count))
return 1;
/*
 * Ensure the read of mmu_notifier_count happens before the read
@@ -856,10 +856,15 @@ static inline int mmu_notifier_retry(struct kvm_vcpu 
*vcpu, unsigned long mmu_se
 * can't rely on kvm->mmu_lock to keep things ordered.
 */
smp_rmb();
-   if (vcpu->kvm->mmu_notifier_seq != mmu_seq)
+   if (kvm->mmu_notifier_seq != mmu_seq)
return 1;
return 0;
 }
+
+static inline int mmu_notifier_retry(struct kvm_vcpu *vcpu, unsigned long 
mmu_seq)
+{
+   return kvm_mmu_notifier_retry(vcpu->kvm, mmu_seq);
+}
 #endif
 
 #ifdef KVM_CAP_IRQ_ROUTING
-- 
1.7.10.4



[PATCH 3/5] KVM: PPC: Book3S HV: Add a mechanism for recording modified HPTEs

2012-10-15 Thread Paul Mackerras
This uses a bit in our record of the guest view of the HPTE to record
when the HPTE gets modified.  We use a reserved bit for this, and ensure
that this bit is always cleared in HPTE values returned to the guest.
The recording of modified HPTEs is only done if other code indicates
its interest by setting kvm->arch.hpte_mod_interest to a non-zero value.
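
A consumer declares its interest around a scan, sketched:

	atomic_inc(&kvm->arch.hpte_mod_interest);
	/* ... HPTE updates from here on set HPTE_GR_MODIFIED ... */
	atomic_dec(&kvm->arch.hpte_mod_interest);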

Signed-off-by: Paul Mackerras 
---
 arch/powerpc/include/asm/kvm_book3s_64.h |6 ++
 arch/powerpc/include/asm/kvm_host.h  |1 +
 arch/powerpc/kvm/book3s_hv_rm_mmu.c  |   25 ++---
 3 files changed, 29 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_book3s_64.h 
b/arch/powerpc/include/asm/kvm_book3s_64.h
index 1472a5b..4ca4f25 100644
--- a/arch/powerpc/include/asm/kvm_book3s_64.h
+++ b/arch/powerpc/include/asm/kvm_book3s_64.h
@@ -50,6 +50,12 @@ extern int kvm_hpt_order;/* order of 
preallocated HPTs */
 #define HPTE_V_HVLOCK  0x40UL
 #define HPTE_V_ABSENT  0x20UL
 
+/*
+ * We use this bit in the guest_rpte field of the revmap entry
+ * to indicate a modified HPTE.
+ */
+#define HPTE_GR_MODIFIED   (1ul << 62)
+
 static inline long try_lock_hpte(unsigned long *hpte, unsigned long bits)
 {
unsigned long tmp, old;
diff --git a/arch/powerpc/include/asm/kvm_host.h 
b/arch/powerpc/include/asm/kvm_host.h
index 3093896..58c7264 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -248,6 +248,7 @@ struct kvm_arch {
atomic_t vcpus_running;
unsigned long hpt_npte;
unsigned long hpt_mask;
+   atomic_t hpte_mod_interest;
spinlock_t slot_phys_lock;
unsigned short last_vcpu[NR_CPUS];
struct kvmppc_vcore *vcores[KVM_MAX_VCORES];
diff --git a/arch/powerpc/kvm/book3s_hv_rm_mmu.c 
b/arch/powerpc/kvm/book3s_hv_rm_mmu.c
index 3233587..c83c0ca 100644
--- a/arch/powerpc/kvm/book3s_hv_rm_mmu.c
+++ b/arch/powerpc/kvm/book3s_hv_rm_mmu.c
@@ -66,6 +66,18 @@ void kvmppc_add_revmap_chain(struct kvm *kvm, struct 
revmap_entry *rev,
 }
 EXPORT_SYMBOL_GPL(kvmppc_add_revmap_chain);
 
+/*
+ * Note modification of an HPTE; set the HPTE modified bit
+ * if it wasn't modified before and anyone is interested.
+ */
+static inline void note_hpte_modification(struct kvm *kvm,
+ struct revmap_entry *rev)
+{
+   if (!(rev->guest_rpte & HPTE_GR_MODIFIED) &&
+   atomic_read(&kvm->arch.hpte_mod_interest))
+   rev->guest_rpte |= HPTE_GR_MODIFIED;
+}
+
 /* Remove this HPTE from the chain for a real page */
 static void remove_revmap_chain(struct kvm *kvm, long pte_index,
struct revmap_entry *rev,
@@ -287,8 +299,10 @@ long kvmppc_do_h_enter(struct kvm *kvm, unsigned long 
flags,
rev = &kvm->arch.revmap[pte_index];
if (realmode)
rev = real_vmalloc_addr(rev);
-   if (rev)
+   if (rev) {
rev->guest_rpte = g_ptel;
+   note_hpte_modification(kvm, rev);
+   }
 
/* Link HPTE into reverse-map chain */
if (pteh & HPTE_V_VALID) {
@@ -392,7 +406,8 @@ long kvmppc_h_remove(struct kvm_vcpu *vcpu, unsigned long 
flags,
/* Read PTE low word after tlbie to get final R/C values */
remove_revmap_chain(kvm, pte_index, rev, v, hpte[1]);
}
-   r = rev->guest_rpte;
+   r = rev->guest_rpte & ~HPTE_GR_MODIFIED;
+   note_hpte_modification(kvm, rev);
unlock_hpte(hpte, 0);
 
vcpu->arch.gpr[4] = v;
@@ -466,6 +481,7 @@ long kvmppc_h_bulk_remove(struct kvm_vcpu *vcpu)
 
args[j] = ((0x80 | flags) << 56) + pte_index;
rev = real_vmalloc_addr(&kvm->arch.revmap[pte_index]);
+   note_hpte_modification(kvm, rev);
 
if (!(hp[0] & HPTE_V_VALID)) {
/* insert R and C bits from PTE */
@@ -555,6 +571,7 @@ long kvmppc_h_protect(struct kvm_vcpu *vcpu, unsigned long 
flags,
if (rev) {
r = (rev->guest_rpte & ~mask) | bits;
rev->guest_rpte = r;
+   note_hpte_modification(kvm, rev);
}
r = (hpte[1] & ~mask) | bits;
 
@@ -606,8 +623,10 @@ long kvmppc_h_read(struct kvm_vcpu *vcpu, unsigned long 
flags,
v &= ~HPTE_V_ABSENT;
v |= HPTE_V_VALID;
}
-   if (v & HPTE_V_VALID)
+   if (v & HPTE_V_VALID) {
r = rev[i].guest_rpte | (r & (HPTE_R_R | HPTE_R_C));
+   r &= ~HPTE_GR_MODIFIED;
+   }
vcpu->arch.gpr[4 + i * 2] = v;
vcpu->arch.gpr[5 + i * 2] = r;
}
-- 
1.7.10.4



[PATCH 2/5] KVM: PPC: Book3S HV: Restructure HPT entry creation code

2012-10-15 Thread Paul Mackerras
This restructures the code that creates HPT (hashed page table)
entries so that it can be called in situations where we don't have a
struct vcpu pointer, only a struct kvm pointer.  It also fixes a bug
where kvmppc_map_vrma() would corrupt the guest R4 value.

Now, most of the work of kvmppc_virtmode_h_enter is done by a new
function, kvmppc_virtmode_do_h_enter, which itself calls another new
function, kvmppc_do_h_enter, which contains most of the old
kvmppc_h_enter.  The new kvmppc_do_h_enter takes explicit arguments
for the place to return the HPTE index, the Linux page tables to use,
and whether it is being called in real mode, thus removing the need
for it to have the vcpu as an argument.

Currently kvmppc_map_vrma creates the VRMA (virtual real mode area)
HPTEs by calling kvmppc_virtmode_h_enter, which is designed primarily
to handle H_ENTER hcalls from the guest that need to pin a page of
memory.  Since H_ENTER returns the index of the created HPTE in R4,
kvmppc_virtmode_h_enter updates the guest R4, corrupting the guest R4
in the case when it gets called from kvmppc_map_vrma on the first
VCPU_RUN ioctl.  With this, kvmppc_map_vrma instead calls
kvmppc_virtmode_do_h_enter with the address of a dummy word as the
place to store the HPTE index, thus avoiding corrupting the guest R4.
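
The VRMA call after the change, condensed from the diff below:

	unsigned long idx_ret;	/* dummy; guest R4 is no longer written */
	ret = kvmppc_virtmode_do_h_enter(kvm, H_EXACT, hash, hp_v, hp_r,
					 &idx_ret);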

Signed-off-by: Paul Mackerras 
---
 arch/powerpc/include/asm/kvm_book3s.h |5 +++--
 arch/powerpc/kvm/book3s_64_mmu_hv.c   |   36 +++--
 arch/powerpc/kvm/book3s_hv_rm_mmu.c   |   27 -
 3 files changed, 45 insertions(+), 23 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_book3s.h 
b/arch/powerpc/include/asm/kvm_book3s.h
index ab73800..199b7fd 100644
--- a/arch/powerpc/include/asm/kvm_book3s.h
+++ b/arch/powerpc/include/asm/kvm_book3s.h
@@ -157,8 +157,9 @@ extern void *kvmppc_pin_guest_page(struct kvm *kvm, 
unsigned long addr,
 extern void kvmppc_unpin_guest_page(struct kvm *kvm, void *addr);
 extern long kvmppc_virtmode_h_enter(struct kvm_vcpu *vcpu, unsigned long flags,
long pte_index, unsigned long pteh, unsigned long ptel);
-extern long kvmppc_h_enter(struct kvm_vcpu *vcpu, unsigned long flags,
-   long pte_index, unsigned long pteh, unsigned long ptel);
+extern long kvmppc_do_h_enter(struct kvm *kvm, unsigned long flags,
+   long pte_index, unsigned long pteh, unsigned long ptel,
+   pgd_t *pgdir, bool realmode, unsigned long *idx_ret);
 extern long kvmppc_hv_get_dirty_log(struct kvm *kvm,
struct kvm_memory_slot *memslot, unsigned long *map);
 
diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c 
b/arch/powerpc/kvm/book3s_64_mmu_hv.c
index 7a4aae9..351f2ac 100644
--- a/arch/powerpc/kvm/book3s_64_mmu_hv.c
+++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c
@@ -41,6 +41,10 @@
 /* Power architecture requires HPT is at least 256kB */
 #define PPC_MIN_HPT_ORDER  18
 
+static long kvmppc_virtmode_do_h_enter(struct kvm *kvm, unsigned long flags,
+   long pte_index, unsigned long pteh,
+   unsigned long ptel, unsigned long *pte_idx_ret);
+
 long kvmppc_alloc_hpt(struct kvm *kvm, u32 *htab_orderp)
 {
unsigned long hpt;
@@ -185,6 +189,7 @@ void kvmppc_map_vrma(struct kvm_vcpu *vcpu, struct 
kvm_memory_slot *memslot,
unsigned long addr, hash;
unsigned long psize;
unsigned long hp0, hp1;
+   unsigned long idx_ret;
long ret;
struct kvm *kvm = vcpu->kvm;
 
@@ -216,7 +221,8 @@ void kvmppc_map_vrma(struct kvm_vcpu *vcpu, struct 
kvm_memory_slot *memslot,
hash = (hash << 3) + 7;
hp_v = hp0 | ((addr >> 16) & ~0x7fUL);
hp_r = hp1 | addr;
-   ret = kvmppc_virtmode_h_enter(vcpu, H_EXACT, hash, hp_v, hp_r);
+   ret = kvmppc_virtmode_do_h_enter(kvm, H_EXACT, hash, hp_v, hp_r,
+&idx_ret);
if (ret != H_SUCCESS) {
pr_err("KVM: map_vrma at %lx failed, ret=%ld\n",
   addr, ret);
@@ -354,15 +360,10 @@ static long kvmppc_get_guest_page(struct kvm *kvm, 
unsigned long gfn,
return err;
 }
 
-/*
- * We come here on a H_ENTER call from the guest when we are not
- * using mmu notifiers and we don't have the requested page pinned
- * already.
- */
-long kvmppc_virtmode_h_enter(struct kvm_vcpu *vcpu, unsigned long flags,
-   long pte_index, unsigned long pteh, unsigned long ptel)
+long kvmppc_virtmode_do_h_enter(struct kvm *kvm, unsigned long flags,
+   long pte_index, unsigned long pteh,
+   unsigned long ptel, unsigned long *pte_idx_ret)
 {
-   struct kvm *kvm = vcpu->kvm;
unsigned long psize, gpa, gfn;
struct kvm_memory_slot *memslot;
long ret;
@@ -390,8 +391,8 @@ long kvmppc_virtmode_h_enter(s

[PATCH 0/5] KVM: PPC: Book3S HV: HPT read/write functions for userspace

2012-10-15 Thread Paul Mackerras
This series of patches provides an interface by which userspace can
read and write the hashed page table (HPT) of a Book3S HV guest.
The interface is an ioctl which provides a file descriptor which can
be accessed with the read() and write() system calls.  The data read
and written is the guest view of the HPT, in which the second
doubleword of each HPTE (HPT entry) contains a guest physical address,
as distinct from the real HPT that the hardware accesses, where the
second doubleword of each HPTE contains a real address.

Because the HPT is divided into groups (HPTEGs) of 8 entries each,
where each HPTEG usually only contains a few valid entries, or none,
the data format that we use does run-length encoding of the invalid
entries, so in fact the invalid entries take up no space in the
stream.
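
A worked example of the encoding: an HPTEG holding 2 valid entries
followed by 6 invalid ones is streamed as a single 8-byte header with
n_valid = 2 and n_invalid = 6, followed by 2 * 16 bytes of HPTE data;
the 6 invalid entries occupy no bytes in the stream.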

The interface also provides for doing multiple passes over the HPT,
where the first pass provides information on all HPTEs, and subsequent
passes only return the HPTEs that have changed since the previous pass.

I have implemented a read/write interface rather than an mmap-based
interface because the data is not stored contiguously anywhere in
kernel memory.  Of each 16-byte HPTE, the first 8 bytes come from the
real HPT and the second 8 bytes come from the parallel vmalloc'd array
where we store the guest view of the guest physical address,
permissions, accessed/dirty bits etc.  Thus an mmap-based interface
would not be practicable (not without doubling the size of the
parallel array, typically requiring an extra 8MB of kernel memory per
guest).  This is also why I have not used the memslot interface for
this.

This implements the interface for HV-style KVM but not for PR-style
KVM.  Userspace does not need any additional interface with PR-style
KVM because userspace maintains the guest HPT already in that case,
and has an image of the guest view of the HPT in its address space.

This series is against the next branch of the kvm tree plus my
recently-posted set of 8 patches ("Various Book3s HV fixes that
haven't been picked up yet").  The overall diffstat is:

 Documentation/virtual/kvm/api.txt|   53 +
 arch/powerpc/include/asm/kvm.h   |   24 ++
 arch/powerpc/include/asm/kvm_book3s.h|8 +-
 arch/powerpc/include/asm/kvm_book3s_64.h |   24 ++
 arch/powerpc/include/asm/kvm_host.h  |1 +
 arch/powerpc/include/asm/kvm_ppc.h   |2 +
 arch/powerpc/kvm/book3s_64_mmu_hv.c  |  380 +-
 arch/powerpc/kvm/book3s_hv.c |   12 -
 arch/powerpc/kvm/book3s_hv_rm_mmu.c  |   71 --
 arch/powerpc/kvm/powerpc.c   |   17 ++
 include/linux/kvm.h  |3 +
 include/linux/kvm_host.h |   11 +-
 12 files changed, 559 insertions(+), 47 deletions(-)


[PATCH 5/5] KVM: PPC: Book3S HV: Provide a method for userspace to read and write the HPT

2012-10-15 Thread Paul Mackerras
A new ioctl, KVM_PPC_GET_HTAB_FD, returns a file descriptor.  Reads on
this fd return the contents of the HPT (hashed page table), writes
create and/or remove entries in the HPT.  There is a new capability,
KVM_CAP_PPC_HTAB_FD, to indicate the presence of the ioctl.  The ioctl
takes an argument structure with the index of the first HPT entry to
read out and a set of flags.  The flags indicate whether the user is
intending to read or write the HPT, and whether to return all entries
or only the "bolted" entries (those with the bolted bit, 0x10, set in
the first doubleword).

This is intended for use in implementing qemu's savevm/loadvm and for
live migration.  Therefore, on reads, the first pass returns information
about all HPTEs (or all bolted HPTEs).  When the first pass reaches the
end of the HPT, it returns from the read.  Subsequent reads only return
information about HPTEs that have changed since they were last read.
A read that finds no changed HPTEs in the HPT following where the last
read finished will return 0 bytes.
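
Illustrative userspace usage for a savevm-style dump (buffer size and
error handling are assumed):

	struct kvm_get_htab_fd ghf = {
		.flags = 0,		/* read; all entries */
		.start_index = 0,
	};
	int fd = ioctl(vm_fd, KVM_PPC_GET_HTAB_FD, &ghf);
	char buf[16384];
	ssize_t n;
	/* the first pass returns all (or all bolted) entries; later
	   reads return only entries changed since last read */
	while ((n = read(fd, buf, sizeof(buf))) > 0)
		;	/* append buf[0..n) to the migration stream */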

Signed-off-by: Paul Mackerras 
---
 Documentation/virtual/kvm/api.txt|   53 +
 arch/powerpc/include/asm/kvm.h   |   24 +++
 arch/powerpc/include/asm/kvm_book3s_64.h |   18 ++
 arch/powerpc/include/asm/kvm_ppc.h   |2 +
 arch/powerpc/kvm/book3s_64_mmu_hv.c  |  344 ++
 arch/powerpc/kvm/book3s_hv.c |   12 --
 arch/powerpc/kvm/powerpc.c   |   17 ++
 include/linux/kvm.h  |3 +
 8 files changed, 461 insertions(+), 12 deletions(-)

diff --git a/Documentation/virtual/kvm/api.txt 
b/Documentation/virtual/kvm/api.txt
index 4258180..8df3e53 100644
--- a/Documentation/virtual/kvm/api.txt
+++ b/Documentation/virtual/kvm/api.txt
@@ -2071,6 +2071,59 @@ KVM_S390_INT_EXTERNAL_CALL (vcpu) - sigp external call; source cpu in parm
 
 Note that the vcpu ioctl is asynchronous to vcpu execution.
 
+4.78 KVM_PPC_GET_HTAB_FD
+
+Capability: KVM_CAP_PPC_HTAB_FD
+Architectures: powerpc
+Type: vm ioctl
+Parameters: Pointer to struct kvm_get_htab_fd (in)
+Returns: file descriptor number (>= 0) on success, -1 on error
+
+This returns a file descriptor that can be used either to read out the
+entries in the guest's hashed page table (HPT), or to write entries to
+initialize the HPT.  The returned fd can only be written to if the
+KVM_GET_HTAB_WRITE bit is set in the flags field of the argument, and
+can only be read if that bit is clear.  The argument struct looks like
+this:
+
+/* For KVM_PPC_GET_HTAB_FD */
+struct kvm_get_htab_fd {
+   __u64   flags;
+   __u64   start_index;
+};
+
+/* Values for kvm_get_htab_fd.flags */
+#define KVM_GET_HTAB_BOLTED_ONLY   ((__u64)0x1)
+#define KVM_GET_HTAB_WRITE ((__u64)0x2)
+
+The `start_index' field gives the index in the HPT of the entry at
+which to start reading.  It is ignored when writing.
+
+Reads on the fd will initially supply information about all
+"interesting" HPT entries.  Interesting entries are those with the
+bolted bit set, if the KVM_GET_HTAB_BOLTED_ONLY bit is set, otherwise
+all entries.  When the end of the HPT is reached, the read() will
+return.  If read() is called again on the fd, it will start again from
+the beginning of the HPT, but will only return HPT entries that have
+changed since they were last read.
+
+Data read or written is structured as a header (8 bytes) followed by a
+series of valid HPT entries (16 bytes each).  The header indicates how
+many valid HPT entries there are and how many invalid entries follow
+the valid entries.  The invalid entries are not represented explicitly
+in the stream.  The header format is:
+
+struct kvm_get_htab_header {
+   __u32   index;
+   __u16   n_valid;
+   __u16   n_invalid;
+};
+
+Writes to the fd create HPT entries starting at the index given in the
+header: first `n_valid' valid entries, with contents taken from the
+data written, then `n_invalid' invalid entries, which invalidate any
+previously valid entries found.
+
 
 5. The kvm_run structure
 
diff --git a/arch/powerpc/include/asm/kvm.h b/arch/powerpc/include/asm/kvm.h
index b89ae4d..6518e38 100644
--- a/arch/powerpc/include/asm/kvm.h
+++ b/arch/powerpc/include/asm/kvm.h
@@ -331,6 +331,30 @@ struct kvm_book3e_206_tlb_params {
__u32 reserved[8];
 };
 
+/* For KVM_PPC_GET_HTAB_FD */
+struct kvm_get_htab_fd {
+   __u64   flags;
+   __u64   start_index;
+};
+
+/* Values for kvm_get_htab_fd.flags */
+#define KVM_GET_HTAB_BOLTED_ONLY   ((__u64)0x1)
+#define KVM_GET_HTAB_WRITE ((__u64)0x2)
+
+/*
+ * Data read on the file descriptor is formatted as a series of
+ * records, each consisting of a header followed by a series of
+ * `n_valid' HPTEs (16 bytes each), which are all valid.  Following 
+ * those valid HPTEs there are `n_invalid' invalid HPTEs, which
+ * are not represented explicitly in the stream.  The same format
+ * is used for writing.
+ */
+struct kvm_get_htab_header {
+   __u32   index;
+   __u16   n_valid;
+   __u16   n_invalid;
+};
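
A minimal sketch of a consumer that walks one buffer returned by
read(), assuming the header definition above is available via
<linux/kvm.h> (the helper name is illustrative):

    #include <stddef.h>
    #include <linux/kvm.h>

    /* Sketch only: walk one buffer of read() data, record by record.
     * Each kvm_get_htab_header is followed by n_valid 16-byte HPTEs;
     * the n_invalid entries that follow them in the guest HPT carry
     * no data in the stream, so there is nothing further to skip.
     */
    static void walk_htab_buf(const char *buf, size_t len)
    {
            const char *p = buf, *end = buf + len;

            while (p + sizeof(struct kvm_get_htab_header) <= end) {
                    const struct kvm_get_htab_header *h =
                            (const struct kvm_get_htab_header *)p;

                    p += sizeof(*h);
                    /* h->n_valid explicit entries, 16 bytes each */
                    p += (size_t)h->n_valid * 16;
            }
    }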

[PATCH 4/5] KVM: PPC: Book3S HV: Make a HPTE removal function available

2012-10-15 Thread Paul Mackerras
This makes an HPTE removal function, kvmppc_do_h_remove(), available
outside book3s_hv_rm_mmu.c.  This will be used by the HPT writing
code.
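
For example (a sketch only, not code from this series), the HPT
writing code could invalidate a single stale entry like this:

    /* Sketch only, not from this series.  With flags == 0 no AVPN
     * check is applied; hpret receives the old V and R doublewords
     * that kvmppc_h_remove would have returned in gprs 4 and 5.
     */
    static long clear_stale_hpte(struct kvm *kvm, unsigned long pte_index)
    {
            unsigned long hpret[2];

            return kvmppc_do_h_remove(kvm, 0, pte_index, 0, hpret);
    }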

Signed-off-by: Paul Mackerras 
---
 arch/powerpc/include/asm/kvm_book3s.h |3 +++
 arch/powerpc/kvm/book3s_hv_rm_mmu.c   |   19 +--
 2 files changed, 16 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_book3s.h b/arch/powerpc/include/asm/kvm_book3s.h
index 199b7fd..4ac1c67 100644
--- a/arch/powerpc/include/asm/kvm_book3s.h
+++ b/arch/powerpc/include/asm/kvm_book3s.h
@@ -160,6 +160,9 @@ extern long kvmppc_virtmode_h_enter(struct kvm_vcpu *vcpu, unsigned long flags,
 extern long kvmppc_do_h_enter(struct kvm *kvm, unsigned long flags,
long pte_index, unsigned long pteh, unsigned long ptel,
pgd_t *pgdir, bool realmode, unsigned long *idx_ret);
+extern long kvmppc_do_h_remove(struct kvm *kvm, unsigned long flags,
+   unsigned long pte_index, unsigned long avpn,
+   unsigned long *hpret);
 extern long kvmppc_hv_get_dirty_log(struct kvm *kvm,
struct kvm_memory_slot *memslot, unsigned long *map);
 
diff --git a/arch/powerpc/kvm/book3s_hv_rm_mmu.c b/arch/powerpc/kvm/book3s_hv_rm_mmu.c
index c83c0ca..505548a 100644
--- a/arch/powerpc/kvm/book3s_hv_rm_mmu.c
+++ b/arch/powerpc/kvm/book3s_hv_rm_mmu.c
@@ -364,11 +364,10 @@ static inline int try_lock_tlbie(unsigned int *lock)
return old == 0;
 }
 
-long kvmppc_h_remove(struct kvm_vcpu *vcpu, unsigned long flags,
-unsigned long pte_index, unsigned long avpn,
-unsigned long va)
+long kvmppc_do_h_remove(struct kvm *kvm, unsigned long flags,
+   unsigned long pte_index, unsigned long avpn,
+   unsigned long *hpret)
 {
-   struct kvm *kvm = vcpu->kvm;
unsigned long *hpte;
unsigned long v, r, rb;
struct revmap_entry *rev;
@@ -410,10 +409,18 @@ long kvmppc_h_remove(struct kvm_vcpu *vcpu, unsigned long flags,
note_hpte_modification(kvm, rev);
unlock_hpte(hpte, 0);
 
-   vcpu->arch.gpr[4] = v;
-   vcpu->arch.gpr[5] = r;
+   hpret[0] = v;
+   hpret[1] = r;
return H_SUCCESS;
 }
+EXPORT_SYMBOL_GPL(kvmppc_do_h_remove);
+
+long kvmppc_h_remove(struct kvm_vcpu *vcpu, unsigned long flags,
+unsigned long pte_index, unsigned long avpn)
+{
+   return kvmppc_do_h_remove(vcpu->kvm, flags, pte_index, avpn,
+ &vcpu->arch.gpr[4]);
+}
 
 long kvmppc_h_bulk_remove(struct kvm_vcpu *vcpu)
 {
-- 
1.7.10.4



[PATCH] MAINTAINERS: Add git tree link for PPC KVM

2012-10-15 Thread Michael Ellerman
Signed-off-by: Michael Ellerman 
---
 MAINTAINERS |1 +
 1 file changed, 1 insertion(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index e73060f..32dc107 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -4244,6 +4244,7 @@ KERNEL VIRTUAL MACHINE (KVM) FOR POWERPC
 M: Alexander Graf 
 L: kvm-ppc@vger.kernel.org
 W: http://kvm.qumranet.com
+T: git git://github.com/agraf/linux-2.6.git
 S: Supported
 F: arch/powerpc/include/asm/kvm*
 F: arch/powerpc/kvm/
-- 
1.7.9.5



Re: [PATCH] kvm/powerpc: Handle errors in secondary thread grabbing

2012-10-15 Thread Michael Ellerman
On Tue, 2012-10-16 at 14:13 +1100, Paul Mackerras wrote:
> Michael,
> 
> On Tue, Oct 16, 2012 at 11:15:50AM +1100, Michael Ellerman wrote:
> > In the Book3s HV code, kvmppc_run_core() has logic to grab the secondary
> > threads of the physical core.
> > 
> > If for some reason a thread is stuck, kvmppc_grab_hwthread() can fail,
> > but currently we ignore the failure and continue into the guest. If the
> > stuck thread is in the kernel, badness ensues.
> > 
> > Instead we should check for failure and bail out.
> > 
> > I've moved the grabbing prior to the startup of runnable threads, to
> > simplify the error case.  AFAICS this is harmless, but I could be
> > missing something subtle.
> 
> Thanks for looking at this - but in fact this is fixed by my patch
> entitled "KVM: PPC: Book3S HV: Fix some races in starting secondary
> threads" submitted back on August 28.

OK, thanks.  It seems that patch didn't make 3.7?

I don't see it in kvm-ppc-next either.

cheers
