Re: Reset problem vs. MMIO emulation, hypercalls, etc...

2012-08-07 Thread Avi Kivity
On 08/06/2012 11:25 PM, Scott Wood wrote:
 On 08/05/2012 04:00 AM, Avi Kivity wrote:
 On 08/04/2012 01:32 AM, Benjamin Herrenschmidt wrote:
 On Fri, 2012-08-03 at 15:05 -0300, Marcelo Tosatti wrote:

 See kvm_arch_process_async_events() call to qemu_system_reset_request()
 in target-i386/kvm.c.

 The whole thing is fragile, though: we rely on the order events
 are processed inside KVM_RUN, in x86:

 1) If there is pending MMIO, process it.
 2) If not, return with -EINTR (and KVM_EXIT_INTR) in case
 there is a signal pending.

 That way, the vcpu will not process the stop event from the main loop
 (ie not exit from the kvm_cpu_exec() loop), until MMIO is finished.

 Right, it is fragile, thankfully we appear to adhere to the same
 ordering on powerpc so far :-)

 So we'll need to test but it looks like we might be able to fix our
 problem without a kernel or API change, just by changing qemu to
 do the same exit_request trick for our reboot hypercall.

 Long run however, I wonder whether we should consider an explicit ioctl
 to complete those pending operations instead...
 
 It's pointless.  We have to support the old method forever.
 
 Not in new architectures (even PPC has yet to start using this) or new
 userspaces -- and forever is a long time.  People down the road may very
 well decide that it's time to clean out the deprecated stuff that hasn't
 been used in over a decade.  IMHO this shouldn't be a reason to
 not improve the API, as long as compatibility is possible for as long as
 it is deemed worthwhile.

For qemu this is in common code (kvm-all.c); some architectures might
need wiring up, but the code is there.

For the kernel we need to handle signals, and we need to check for
signals in the atomic guest entry path.  That leads naturally to the
completion-before-signal order.

In fact there's no other way to do it.  If we check for signals before
handling completions, there's no way for userspace to know whether the
completion was completely handled, and whether more completions are
necessary (mmio completions are limited to 8 bytes but some x86
instructions generate many more accesses).

 
 There's no
 material different between sigqueue() + KVM_RUN and KVM_COMPLETE, or a
 KVM_RUN with a flag that tells it to exit immediately.
 
 The latter is less fragile and easier to use.


   if (need_exit)
queue signal

 vs

   run-need_exit = need_exit


-- 
error compiling committee.c: too many arguments to function
--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Reset problem vs. MMIO emulation, hypercalls, etc...

2012-08-07 Thread Avi Kivity
On 08/07/2012 04:32 AM, David Gibson wrote:
 On Tue, Aug 07, 2012 at 06:57:57AM +1000, Benjamin Herrenschmidt wrote:
 On Mon, 2012-08-06 at 13:13 +1000, David Gibson wrote:
  So, I'm still trying to nut out the implications for H_CEDE, and think
  if there are any other hypercalls that might want to block the guest
  for a time.  We were considering blocking H_PUT_TCE if qemu devices
  had active dma maps on the previously mapped iovas.  I'm not sure if
  the discussions that led to the inclusion of the qemu IOMMU code
  decided that was wholly unnnecessary or just not necessary for the
  time being.
 
 For sleeping hcalls they will simply have to set exit_request to
 complete the hcall from the kernel perspective, leaving us in a state
 where the kernel is about to restart at srr0 + 4, along with some other
 flag (stop or halt) to actually freeze the vcpu.
 
 If such an async hcall decides to return an error, it can then set
 gpr3 directly using ioctls before restarting the vcpu.
 
 Yeah, I'd pretty much convinced myself of that by the end of
 yesterday.  I hope to send patches implementing these fixes today.
 
 There are also some questions about why our in-kernel H_CEDE works
 kind of differently from x86's hlt instruction implementation (which
 comes out to qemu unless the irqchip is in-kernel as well).  I don't
 think we have an urgent problem there though.

It's the other way round, hlt sleeps in the kernel unless the irqchip is
not in the kernel.  Meaning the normal state of things is to sleep in
the kernel (whether or not you have an emulated interrupt controller in
the kernel -- the term irqchip in kernel is overloaded for x86).


-- 
error compiling committee.c: too many arguments to function
--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/2] KVM: PPC: booke: Add watchdog emulation

2012-08-07 Thread Alexander Graf

On 03.08.2012, at 07:30, Bharat Bhushan wrote:

 This patch adds the watchdog emulation in KVM. The watchdog
 emulation is enabled by KVM_ENABLE_CAP(KVM_CAP_PPC_BOOKE_WATCHDOG) ioctl.
 The kernel timer are used for watchdog emulation and emulates
 h/w watchdog state machine. On watchdog timer expiry, it exit to QEMU
 if TCR.WRC is non ZERO. QEMU can reset/shutdown etc depending upon how
 it is configured.
 
 Signed-off-by: Liu Yu yu@freescale.com
 Signed-off-by: Scott Wood scottw...@freescale.com
 [bharat.bhus...@freescale.com: reworked patch]
 Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
 ---
 v7:
 - kvmppc_core_dequeue_watchdog() and kvmppc_core_enque_watchdog()
   are made staic
 - Clear the KVM_REQ_WATCHDOG request when TSR_ENW | ENW_WIS cleared
   rather than checking these bits on handling the request.
   Below changes (pasted for quick understanding)
 diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
 index 7eedaeb..a0f5922 100644
 --- a/arch/powerpc/kvm/booke.c
 +++ b/arch/powerpc/kvm/booke.c
 @@ -629,8 +629,7 @@ int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct 
 kvm_vcpu *vcpu)
goto out;
}
 
 -   if (kvm_check_request(KVM_REQ_WATCHDOG, vcpu) 
 -   (vcpu-arch.tsr  (TSR_ENW | TSR_WIS))) {
 +   if (kvm_check_request(KVM_REQ_WATCHDOG, vcpu)) {
kvm_run-exit_reason = KVM_EXIT_WATCHDOG;
ret = 0;
goto out;
 @@ -1098,8 +1097,7 @@ int kvmppc_handle_exit(struct kvm_run *run, struct 
 kvm_vcpu *vcpu,
}
}
 
 -   if (kvm_check_request(KVM_REQ_WATCHDOG, vcpu) 
 -   (vcpu-arch.tsr  (TSR_ENW | TSR_WIS))) {
 +   if (kvm_check_request(KVM_REQ_WATCHDOG, vcpu)) {
run-exit_reason = KVM_EXIT_WATCHDOG;
r = RESUME_HOST | (r  RESUME_FLAG_NV);
}
 @@ -1251,8 +1249,10 @@ static int set_sregs_base(struct kvm_vcpu *vcpu,
 
vcpu-arch.tsr = sregs-u.e.tsr;
 
 -   if ((old_tsr ^ vcpu-arch.tsr)  (TSR_ENW | TSR_WIS))
 +   if ((old_tsr ^ vcpu-arch.tsr)  (TSR_ENW | TSR_WIS)) {
 +   clear_bit(KVM_REQ_WATCHDOG, vcpu-requests);
arm_next_watchdog(vcpu);
 +   }
 
update_timer_ints(vcpu);
}
 @@ -1434,8 +1434,10 @@ void kvmppc_clr_tsr_bits(struct kvm_vcpu *vcpu, u32 
 tsr_bits)
 * We may have stopped the watchdog due to
 * being stuck on final expiration.
 */
 -   if (tsr_bits  (TSR_ENW | TSR_WIS))
 +   if (tsr_bits  (TSR_ENW | TSR_WIS)) {
 +   clear_bit(KVM_REQ_WATCHDOG, vcpu-requests);
arm_next_watchdog(vcpu);
 +   }
 --
 
 v6:
 - Added kvmppc_subarch_vcpu_unit/uninit() for subarch specific initialization
 - Using spin_lock_irqsave()
 - Using del_timer_sync().
 
 v5:
 - Checking that TSR_ENW/TSR_WIS are still set on KVM_EXIT_WATCHDOG.
 - Moved watchdog_next_timeout() in lock.
 
 arch/powerpc/include/asm/kvm_host.h  |3 +
 arch/powerpc/include/asm/kvm_ppc.h   |2 +
 arch/powerpc/include/asm/reg_booke.h |7 ++
 arch/powerpc/kvm/book3s.c|9 ++
 arch/powerpc/kvm/booke.c |  158 ++
 arch/powerpc/kvm/booke_emulate.c |8 ++
 arch/powerpc/kvm/powerpc.c   |   14 +++-
 include/linux/kvm.h  |2 +
 include/linux/kvm_host.h |1 +
 9 files changed, 202 insertions(+), 2 deletions(-)
 
 diff --git a/arch/powerpc/include/asm/kvm_host.h 
 b/arch/powerpc/include/asm/kvm_host.h
 index 572ad01..873ec11 100644
 --- a/arch/powerpc/include/asm/kvm_host.h
 +++ b/arch/powerpc/include/asm/kvm_host.h
 @@ -469,6 +469,8 @@ struct kvm_vcpu_arch {
   ulong fault_esr;
   ulong queued_dear;
   ulong queued_esr;
 + spinlock_t wdt_lock;
 + struct timer_list wdt_timer;
   u32 tlbcfg[4];
   u32 mmucfg;
   u32 epr;
 @@ -484,6 +486,7 @@ struct kvm_vcpu_arch {
   u8 osi_needed;
   u8 osi_enabled;
   u8 papr_enabled;
 + u8 watchdog_enabled;
   u8 sane;
   u8 cpu_type;
   u8 hcall_needed;
 diff --git a/arch/powerpc/include/asm/kvm_ppc.h 
 b/arch/powerpc/include/asm/kvm_ppc.h
 index 0124937..1438a5e 100644
 --- a/arch/powerpc/include/asm/kvm_ppc.h
 +++ b/arch/powerpc/include/asm/kvm_ppc.h
 @@ -68,6 +68,8 @@ extern void kvmppc_emulate_dec(struct kvm_vcpu *vcpu);
 extern u32 kvmppc_get_dec(struct kvm_vcpu *vcpu, u64 tb);
 extern void kvmppc_decrementer_func(unsigned long data);
 extern int kvmppc_sanity_check(struct kvm_vcpu *vcpu);
 +extern int kvmppc_subarch_vcpu_init(struct kvm_vcpu *vcpu);
 +extern void kvmppc_subarch_vcpu_uninit(struct kvm_vcpu *vcpu);
 
 /* Core-specific hooks */
 
 diff --git a/arch/powerpc/include/asm/reg_booke.h 
 b/arch/powerpc/include/asm/reg_booke.h
 index 2d916c4..e07e6af 100644
 --- a/arch/powerpc/include/asm/reg_booke.h
 +++ b/arch/powerpc/include/asm/reg_booke.h
 @@ -539,6 +539,13 @@
 

Re: [PATCH 2/2] Unified the kvm requests in one function

2012-08-07 Thread Alexander Graf

On 03.08.2012, at 07:30, Bharat Bhushan wrote:

 I am sending this as a separate patch for easiness on review.
 Once reviewed I will merge this with watchdog patch.

Please don't merge, but rather make 2 different patches out of it:

  1) Introduce kvmppc_handle_requests()
  2) Implement watchdog, including request bits

 
 Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
 ---
 arch/powerpc/kvm/booke.c |   63 +++---
 1 files changed, 37 insertions(+), 26 deletions(-)
 
 diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
 index a0f5922..68ed863 100644
 --- a/arch/powerpc/kvm/booke.c
 +++ b/arch/powerpc/kvm/booke.c
 @@ -525,18 +525,31 @@ static void update_timer_ints(struct kvm_vcpu *vcpu)
   kvmppc_core_dequeue_watchdog(vcpu);
 }
 
 +static int kvmppc_handle_requests(struct kvm_run *run, struct kvm_vcpu *vcpu)
 +{
 + int ret = 0;
 +
 + if (!vcpu-requests)
 + return 0;
 +
 + if (kvm_check_request(KVM_REQ_PENDING_TIMER, vcpu)) {
 + smp_mb();
 + update_timer_ints(vcpu);
 + }
 +
 + if (kvm_check_request(KVM_REQ_WATCHDOG, vcpu)) {
 + run-exit_reason = KVM_EXIT_WATCHDOG;
 + ret = 1;
 + }
 +
 + return ret;
 +}
 +
 static void kvmppc_core_check_exceptions(struct kvm_vcpu *vcpu)
 {
   unsigned long *pending = vcpu-arch.pending_exceptions;
   unsigned int priority;
 
 - if (vcpu-requests) {
 - if (kvm_check_request(KVM_REQ_PENDING_TIMER, vcpu)) {
 - smp_mb();
 - update_timer_ints(vcpu);
 - }
 - }
 -
   priority = __ffs(*pending);
   while (priority  BOOKE_IRQPRIO_MAX) {
   if (kvmppc_booke_irqprio_deliver(vcpu, priority))
 @@ -578,7 +591,7 @@ int kvmppc_core_prepare_to_enter(struct kvm_vcpu *vcpu)
  *
  * returns !0 if a signal is pending and check_signal is true
  */
 -static int kvmppc_prepare_to_enter(struct kvm_vcpu *vcpu)
 +static int kvmppc_prepare_to_enter(struct kvm_run *run, struct kvm_vcpu 
 *vcpu)

You can always get to run via vcpu-run. Every time we pass in run as 
parameter, we're using a deprecated kvm calling convention.

Otherwise looks good to me :)


Alex

--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/3] booke: Added ONE_REG interface for IAC/DAC debug registers

2012-08-07 Thread Alexander Graf

On 03.08.2012, at 09:08, Bharat Bhushan wrote:

 From: Bharat Bhushan bharat.bhus...@freescale.com
 
 IAC/DAC are defined as 32 bit while they are 64 bit wide. So ONE_REG
 interface is added to set/get them.
 
 Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
 ---
 arch/powerpc/include/asm/kvm.h  |   12 ++
 arch/powerpc/include/asm/kvm_host.h |   24 +++-
 arch/powerpc/kvm/booke.c|   66 +-
 arch/powerpc/kvm/booke_emulate.c|8 ++--
 4 files changed, 102 insertions(+), 8 deletions(-)
 
 diff --git a/arch/powerpc/include/asm/kvm.h b/arch/powerpc/include/asm/kvm.h
 index 1bea4d8..3c14202 100644
 --- a/arch/powerpc/include/asm/kvm.h
 +++ b/arch/powerpc/include/asm/kvm.h
 @@ -221,6 +221,12 @@ struct kvm_sregs {
 
   __u32 dbsr; /* KVM_SREGS_E_UPDATE_DBSR */
   __u32 dbcr[3];
 + /*
 +  * iac/dac registers are 64bit wide, while this API
 +  * interface provides only lower 32 bits on 64 bit
 +  * processors. ONE_REG interface is added for 64bit
 +  * iac/dac registers.
 +  */
   __u32 iac[4];
   __u32 dac[2];
   __u32 dvc[2];
 @@ -326,5 +332,11 @@ struct kvm_book3e_206_tlb_params {
 };
 
 #define KVM_REG_PPC_HIOR  (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0x1)
 +#define KVM_REG_PPC_IAC1 (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0x2)
 +#define KVM_REG_PPC_IAC2 (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0x3)
 +#define KVM_REG_PPC_IAC3 (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0x4)
 +#define KVM_REG_PPC_IAC4 (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0x5)
 +#define KVM_REG_PPC_DAC1 (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0x6)
 +#define KVM_REG_PPC_DAC2 (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0x7)
 
 #endif /* __LINUX_KVM_POWERPC_H */
 diff --git a/arch/powerpc/include/asm/kvm_host.h 
 b/arch/powerpc/include/asm/kvm_host.h
 index 873ec11..dcee499 100644
 --- a/arch/powerpc/include/asm/kvm_host.h
 +++ b/arch/powerpc/include/asm/kvm_host.h
 @@ -344,6 +344,27 @@ struct kvmppc_slb {
   bool class  : 1;
 };
 
 +# ifdef CONFIG_PPC_FSL_BOOK3E
 +#define KVMPPC_BOOKE_IAC_NUM 2
 +#define KVMPPC_BOOKE_DAC_NUM 2
 +# else
 +#define KVMPPC_BOOKE_IAC_NUM 4
 +#define KVMPPC_BOOKE_DAC_NUM 2
 +# endif
 +#define KVMPPC_BOOKE_MAX_IAC 4
 +#define KVMPPC_BOOKE_MAX_DAC 2
 +
 +struct kvmppc_booke_debug_reg {
 + u32 dbcr0;
 + u32 dbcr1;
 + u32 dbcr2;
 +#ifdef CONFIG_KVM_E500MC
 + u32 dbcr4;
 +#endif
 + u64 iac[KVMPPC_BOOKE_MAX_IAC];
 + u64 dac[KVMPPC_BOOKE_MAX_DAC];
 +};
 +
 struct kvm_vcpu_arch {
   ulong host_stack;
   u32 host_pid;
 @@ -438,9 +459,8 @@ struct kvm_vcpu_arch {
 
   u32 ccr0;
   u32 ccr1;
 - u32 dbcr0;
 - u32 dbcr1;
   u32 dbsr;
 + struct kvmppc_booke_debug_reg dbg_reg;

Please put an #ifdef CONFIG_BOOKE around the above block.

 
   u64 mmcr[3];
   u32 pmc[8];
 diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
 index 68ed863..aebbb8b 100644
 --- a/arch/powerpc/kvm/booke.c
 +++ b/arch/powerpc/kvm/booke.c
 @@ -1379,12 +1379,74 @@ int kvm_arch_vcpu_ioctl_set_sregs(struct kvm_vcpu 
 *vcpu,
 
 int kvm_vcpu_ioctl_get_one_reg(struct kvm_vcpu *vcpu, struct kvm_one_reg *reg)
 {
 - return -EINVAL;
 + int r = -EINVAL;
 +
 + switch(reg-id) {
 + case KVM_REG_PPC_IAC1:
 + r = copy_to_user((u64 __user *)(long)reg-addr,
 +  vcpu-arch.dbg_reg.iac[0], sizeof(u64));
 + break;
 + case KVM_REG_PPC_IAC2:
 + r = copy_to_user((u64 __user *)(long)reg-addr,
 +  vcpu-arch.dbg_reg.iac[1], sizeof(u64));
 + break;
 +#ifndef CONFIG_PPC_FSL_BOOK3E

No need for the #ifdef. Also, please do something like

case KVM_REG_PPC_IAC1:
case KVM_REG_PPC_IAC2:
case KVM_REG_PPC_IAC3:
case KVM_REG_PPC_IAC4: {
int iac = reg-id - KVM_REG_PPC_IAC1;
r = copy_to_user(..., ...iac[iac], ...);
break;
}

Alex

 + case KVM_REG_PPC_IAC3:
 + r = copy_to_user((u64 __user *)(long)reg-addr,
 +  vcpu-arch.dbg_reg.iac[2], sizeof(u64));
 + break;
 + case KVM_REG_PPC_IAC4:
 + r = copy_to_user((u64 __user *)(long)reg-addr,
 +  vcpu-arch.dbg_reg.iac[3], sizeof(u64));
 + break;
 +#endif
 + case KVM_REG_PPC_DAC1:
 + r = copy_to_user((u64 __user *)(long)reg-addr,
 +  vcpu-arch.dbg_reg.dac[0], sizeof(u64));
 + break;
 + case KVM_REG_PPC_DAC2:
 + r = copy_to_user((u64 __user *)(long)reg-addr,
 +  vcpu-arch.dbg_reg.dac[1], sizeof(u64));
 + break;
 + default:
 + break;
 + }
 + return r;
 }
 
 int kvm_vcpu_ioctl_set_one_reg(struct kvm_vcpu *vcpu, struct 

Re: [PATCH 2/3] KVM: PPC: booke: Allow multiple exception types

2012-08-07 Thread Alexander Graf

On 03.08.2012, at 09:08, Bharat Bhushan wrote:

 Current kvmppc_booke_handlers uses the same macro (KVM_HANDLER) and
 all handlers are considered to be the same size. This will not be
 the case if we want to use different macros for different handlers.
 
 This patch improves the kvmppc_booke_handler so that it can
 support different macros for different handlers.
 
 Signed-off-by: Liu Yu yu@freescale.com
 [bharat.bhus...@freescale.com: Substantial changes]
 Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
 ---
 arch/powerpc/include/asm/kvm_ppc.h  |2 -
 arch/powerpc/kvm/booke.c|9 ---
 arch/powerpc/kvm/booke.h|1 +
 arch/powerpc/kvm/booke_interrupts.S |   37 --
 arch/powerpc/kvm/e500.c |   13 +++
 5 files changed, 48 insertions(+), 14 deletions(-)
 
 diff --git a/arch/powerpc/include/asm/kvm_ppc.h 
 b/arch/powerpc/include/asm/kvm_ppc.h
 index 1438a5e..deb55bd 100644
 --- a/arch/powerpc/include/asm/kvm_ppc.h
 +++ b/arch/powerpc/include/asm/kvm_ppc.h
 @@ -47,8 +47,6 @@ enum emulation_result {
 
 extern int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu);
 extern int __kvmppc_vcpu_run(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu);
 -extern char kvmppc_handlers_start[];
 -extern unsigned long kvmppc_handler_len;
 extern void kvmppc_handler_highmem(void);
 
 extern void kvmppc_dump_vcpu(struct kvm_vcpu *vcpu);
 diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
 index aebbb8b..1338233 100644
 --- a/arch/powerpc/kvm/booke.c
 +++ b/arch/powerpc/kvm/booke.c
 @@ -1541,6 +1541,7 @@ int __init kvmppc_booke_init(void)
 {
 #ifndef CONFIG_KVM_BOOKE_HV
   unsigned long ivor[16];
 + unsigned long *handler = kvmppc_booke_handler_addr;
   unsigned long max_ivor = 0;
   int i;
 
 @@ -1574,14 +1575,14 @@ int __init kvmppc_booke_init(void)
 
   for (i = 0; i  16; i++) {
   if (ivor[i]  max_ivor)
 - max_ivor = ivor[i];
 + max_ivor = i;
 
   memcpy((void *)kvmppc_booke_handlers + ivor[i],
 -kvmppc_handlers_start + i * kvmppc_handler_len,
 -kvmppc_handler_len);
 +(void *)handler[i], handler[i + 1] - handler[i]);
   }
   flush_icache_range(kvmppc_booke_handlers,
 -kvmppc_booke_handlers + max_ivor + 
 kvmppc_handler_len);
 +kvmppc_booke_handlers + ivor[max_ivor] +
 +   handler[max_ivor + 1] - handler[max_ivor]);
 #endif /* !BOOKE_HV */
   return 0;
 }
 diff --git a/arch/powerpc/kvm/booke.h b/arch/powerpc/kvm/booke.h
 index ba61974..de9e526 100644
 --- a/arch/powerpc/kvm/booke.h
 +++ b/arch/powerpc/kvm/booke.h
 @@ -65,6 +65,7 @@
 (1  BOOKE_IRQPRIO_CRITICAL))
 
 extern unsigned long kvmppc_booke_handlers;
 +extern unsigned long kvmppc_booke_handler_addr[];
 
 void kvmppc_set_msr(struct kvm_vcpu *vcpu, u32 new_msr);
 void kvmppc_mmu_msr_notify(struct kvm_vcpu *vcpu, u32 old_msr);
 diff --git a/arch/powerpc/kvm/booke_interrupts.S 
 b/arch/powerpc/kvm/booke_interrupts.S
 index bb46b32..3539805 100644
 --- a/arch/powerpc/kvm/booke_interrupts.S
 +++ b/arch/powerpc/kvm/booke_interrupts.S
 @@ -73,6 +73,14 @@ _GLOBAL(kvmppc_handler_\ivor_nr)
   bctr
 .endm
 
 +.macro KVM_HANDLER_ADDR ivor_nr
 + .long   kvmppc_handler_\ivor_nr
 +.endm
 +
 +.macro KVM_HANDLER_END
 + .long   kvmppc_handlers_end
 +.endm
 +
 _GLOBAL(kvmppc_handlers_start)
 KVM_HANDLER BOOKE_INTERRUPT_CRITICAL SPRN_SPRG_RSCRATCH_CRIT SPRN_CSRR0
 KVM_HANDLER BOOKE_INTERRUPT_MACHINE_CHECK  SPRN_SPRG_RSCRATCH_MC SPRN_MCSRR0
 @@ -93,9 +101,7 @@ KVM_HANDLER BOOKE_INTERRUPT_DEBUG SPRN_SPRG_RSCRATCH_CRIT 
 SPRN_CSRR0
 KVM_HANDLER BOOKE_INTERRUPT_SPE_UNAVAIL SPRN_SPRG_RSCRATCH0 SPRN_SRR0
 KVM_HANDLER BOOKE_INTERRUPT_SPE_FP_DATA SPRN_SPRG_RSCRATCH0 SPRN_SRR0
 KVM_HANDLER BOOKE_INTERRUPT_SPE_FP_ROUND SPRN_SPRG_RSCRATCH0 SPRN_SRR0
 -
 -_GLOBAL(kvmppc_handler_len)
 - .long kvmppc_handler_1 - kvmppc_handler_0
 +_GLOBAL(kvmppc_handlers_end)
 
 /* Registers:
  *  SPRG_SCRATCH0: guest r4
 @@ -463,6 +469,31 @@ lightweight_exit:
   lwz r4, VCPU_GPR(R4)(r4)
   rfi
 
 + .data
 + .align  4
 + .globl  kvmppc_booke_handler_addr
 +kvmppc_booke_handler_addr:
 +KVM_HANDLER_ADDR BOOKE_INTERRUPT_CRITICAL
 +KVM_HANDLER_ADDR BOOKE_INTERRUPT_MACHINE_CHECK
 +KVM_HANDLER_ADDR BOOKE_INTERRUPT_DATA_STORAGE
 +KVM_HANDLER_ADDR BOOKE_INTERRUPT_INST_STORAGE
 +KVM_HANDLER_ADDR BOOKE_INTERRUPT_EXTERNAL
 +KVM_HANDLER_ADDR BOOKE_INTERRUPT_ALIGNMENT
 +KVM_HANDLER_ADDR BOOKE_INTERRUPT_PROGRAM
 +KVM_HANDLER_ADDR BOOKE_INTERRUPT_FP_UNAVAIL
 +KVM_HANDLER_ADDR BOOKE_INTERRUPT_SYSCALL
 +KVM_HANDLER_ADDR BOOKE_INTERRUPT_AP_UNAVAIL
 +KVM_HANDLER_ADDR BOOKE_INTERRUPT_DECREMENTER
 +KVM_HANDLER_ADDR BOOKE_INTERRUPT_FIT
 +KVM_HANDLER_ADDR BOOKE_INTERRUPT_WATCHDOG
 +KVM_HANDLER_ADDR BOOKE_INTERRUPT_DTLB_MISS
 

Re: [PATCH 3/3] KVM: PPC: booke: Added debug handler

2012-08-07 Thread Alexander Graf

On 03.08.2012, at 09:08, Bharat Bhushan wrote:

 Installed debug handler will be used for guest debug support and
 debug facility emulation features (patches for these features
 will follow this patch).
 
 Signed-off-by: Liu Yu yu@freescale.com
 [bharat.bhus...@freescale.com: Substantial changes]
 Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
 
 Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
 ---
 arch/powerpc/include/asm/kvm_host.h |1 +
 arch/powerpc/kernel/asm-offsets.c   |1 +
 arch/powerpc/kvm/booke_interrupts.S |   45 +++
 3 files changed, 47 insertions(+), 0 deletions(-)
 
 diff --git a/arch/powerpc/include/asm/kvm_host.h 
 b/arch/powerpc/include/asm/kvm_host.h
 index dcee499..bd78523 100644
 --- a/arch/powerpc/include/asm/kvm_host.h
 +++ b/arch/powerpc/include/asm/kvm_host.h
 @@ -494,6 +494,7 @@ struct kvm_vcpu_arch {
   u32 tlbcfg[4];
   u32 mmucfg;
   u32 epr;
 + u32 crit_save;
 #endif
   gpa_t paddr_accessed;
   gva_t vaddr_accessed;
 diff --git a/arch/powerpc/kernel/asm-offsets.c 
 b/arch/powerpc/kernel/asm-offsets.c
 index 85b05c4..92f149b 100644
 --- a/arch/powerpc/kernel/asm-offsets.c
 +++ b/arch/powerpc/kernel/asm-offsets.c
 @@ -563,6 +563,7 @@ int main(void)
   DEFINE(VCPU_LAST_INST, offsetof(struct kvm_vcpu, arch.last_inst));
   DEFINE(VCPU_FAULT_DEAR, offsetof(struct kvm_vcpu, arch.fault_dear));
   DEFINE(VCPU_FAULT_ESR, offsetof(struct kvm_vcpu, arch.fault_esr));
 + DEFINE(VCPU_CRIT_SAVE, offsetof(struct kvm_vcpu, arch.crit_save));
 #endif /* CONFIG_PPC_BOOK3S */
 #endif /* CONFIG_KVM */
 
 diff --git a/arch/powerpc/kvm/booke_interrupts.S 
 b/arch/powerpc/kvm/booke_interrupts.S
 index 3539805..890673c 100644
 --- a/arch/powerpc/kvm/booke_interrupts.S
 +++ b/arch/powerpc/kvm/booke_interrupts.S
 @@ -73,6 +73,51 @@ _GLOBAL(kvmppc_handler_\ivor_nr)
   bctr
 .endm
 
 +.macro KVM_DBG_HANDLER ivor_nr scratch srr0

This is a lot of asm code. Any chance to share a good share of it with the 
generic handler?


Alex

 +_GLOBAL(kvmppc_handler_\ivor_nr)
 + mtspr   \scratch, r4
 + mfspr   r4, SPRN_SPRG_THREAD
 + lwz r4, THREAD_KVM_VCPU(r4)
 + stw r3, VCPU_CRIT_SAVE(r4)
 + mfcrr3
 + mfspr   r4, SPRN_CSRR1
 + andi.   r4, r4, MSR_PR
 + bne 1f
 + /* debug interrupt happened in enter/exit path */
 + mfspr   r4, SPRN_CSRR1
 + rlwinm  r4, r4, 0, ~MSR_DE
 + mtspr   SPRN_CSRR1, r4
 + lis r4, 0x
 + ori r4, r4, 0x
 + mtspr   SPRN_DBSR, r4
 + mfspr   r4, SPRN_SPRG_THREAD
 + lwz r4, THREAD_KVM_VCPU(r4)
 + mtcrr3
 + lwz r3, VCPU_CRIT_SAVE(r4)
 + mfspr   r4, \scratch
 + rfci
 +1:   /* debug interrupt happened in guest */
 + mfspr   r4, \scratch
 + mtcrr3
 + mr  r3, r4
 + mfspr   r4, SPRN_SPRG_THREAD
 + lwz r4, THREAD_KVM_VCPU(r4)
 + stw r3, VCPU_GPR(R4)(r4)
 + stw r5, VCPU_GPR(R5)(r4)
 + stw r6, VCPU_GPR(R6)(r4)
 + lwz r3, VCPU_CRIT_SAVE(r4)
 + mfspr   r5, \srr0
 + stw r3, VCPU_GPR(R3)(r4)
 + stw r5, VCPU_PC(r4)
 + mfctr   r5
 + lis r6, kvmppc_resume_host@h
 + stw r5, VCPU_CTR(r4)
 + li  r5, \ivor_nr
 + ori r6, r6, kvmppc_resume_host@l
 + mtctr   r6
 + bctr
 +.endm
 +
 .macro KVM_HANDLER_ADDR ivor_nr
   .long   kvmppc_handler_\ivor_nr
 .endm
 -- 
 1.7.0.4
 
 

--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 8/8] KVM: PPC: Add cache flush on page map

2012-08-07 Thread Alexander Graf
When we map a page that wasn't icache cleared before, do so when first
mapping it in KVM using the same information bits as the Linux mapping
logic. That way we are 100% sure that any page we map does not have stale
entries in the icache.

What we really would need is a method to flush the icache only when we
actually map a page executable for the guest. But with the current
abstraction that is hard to do, and this way we don't have a huge performance
penalty, so better be safe now and micro optimize things later.

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/include/asm/kvm_host.h |   10 ++
 arch/powerpc/mm/mem.c   |1 +
 2 files changed, 11 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_host.h 
b/arch/powerpc/include/asm/kvm_host.h
index ed75bc9..c0a2fc1 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -33,6 +33,7 @@
 #include asm/kvm_asm.h
 #include asm/processor.h
 #include asm/page.h
+#include asm/cacheflush.h
 
 #define KVM_MAX_VCPUS  NR_CPUS
 #define KVM_MAX_VCORES NR_CPUS
@@ -550,5 +551,14 @@ struct kvm_vcpu_arch {
 #define KVM_MMIO_REG_FQPR  0x0060
 
 #define __KVM_HAVE_ARCH_WQP
+#define __KVM_HAVE_ARCH_MAP_PAGE
+static inline void kvm_arch_map_page(struct page *page)
+{
+   /* Need to invalidate the icache for new pages */
+   if (!test_bit(PG_arch_1, page-flags)) {
+   flush_dcache_icache_page(page);
+   set_bit(PG_arch_1, page-flags);
+   }
+}
 
 #endif /* __POWERPC_KVM_HOST_H__ */
diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c
index baaafde..fbdad0e 100644
--- a/arch/powerpc/mm/mem.c
+++ b/arch/powerpc/mm/mem.c
@@ -469,6 +469,7 @@ void flush_dcache_icache_page(struct page *page)
__flush_dcache_icache_phys(page_to_pfn(page)  PAGE_SHIFT);
 #endif
 }
+EXPORT_SYMBOL(flush_dcache_icache_page);
 
 void clear_user_page(void *page, unsigned long vaddr, struct page *pg)
 {
-- 
1.6.0.2

--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 7/8] KVM: Add page map arch callback

2012-08-07 Thread Alexander Graf
Some archs need to ensure that their icache is flushed when mapping a new
page. Add a callback to the generic code for an arch to implement any cache
flush logic it may need.

Signed-off-by: Alexander Graf ag...@suse.de
---
 virt/kvm/kvm_main.c |6 +-
 1 files changed, 5 insertions(+), 1 deletions(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index d42591d..4e0882d 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1161,8 +1161,12 @@ static pfn_t hva_to_pfn(unsigned long addr, bool atomic, 
bool *async,
pfn = get_fault_pfn();
}
up_read(current-mm-mmap_sem);
-   } else
+   } else {
pfn = page_to_pfn(page[0]);
+#ifdef __KVM_HAVE_ARCH_MAP_PAGE
+   kvm_arch_map_page(page[0]);
+#endif
+   }
 
return pfn;
 }
-- 
1.6.0.2

--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 6/8] KVM: PPC: E500: Implement MMU notifiers

2012-08-07 Thread Alexander Graf
The e500 target has lived without mmu notifiers ever since it got
introduced, but fails for the user space check on them with hugetlbfs.

So in order to get that one working, implement mmu notifiers in a
reasonably dumb fashion and be happy. On embedded hardware, we almost
never end up with mmu notifier calls, since most people don't overcommit.

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/include/asm/kvm_host.h |3 +-
 arch/powerpc/include/asm/kvm_ppc.h  |1 +
 arch/powerpc/kvm/Kconfig|2 +
 arch/powerpc/kvm/booke.c|   23 +++
 arch/powerpc/kvm/e500_tlb.c |   52 +++
 5 files changed, 80 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_host.h 
b/arch/powerpc/include/asm/kvm_host.h
index 572ad01..ed75bc9 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -45,7 +45,8 @@
 #define KVM_COALESCED_MMIO_PAGE_OFFSET 1
 #endif
 
-#ifdef CONFIG_KVM_BOOK3S_64_HV
+#if defined(CONFIG_KVM_BOOK3S_64_HV) || defined(CONFIG_KVM_E500V2) || \
+defined(CONFIG_KVM_E500MC)
 #include linux/mmu_notifier.h
 
 #define KVM_ARCH_WANT_MMU_NOTIFIER
diff --git a/arch/powerpc/include/asm/kvm_ppc.h 
b/arch/powerpc/include/asm/kvm_ppc.h
index 0124937..c38e824 100644
--- a/arch/powerpc/include/asm/kvm_ppc.h
+++ b/arch/powerpc/include/asm/kvm_ppc.h
@@ -104,6 +104,7 @@ extern void kvmppc_core_queue_external(struct kvm_vcpu 
*vcpu,
struct kvm_interrupt *irq);
 extern void kvmppc_core_dequeue_external(struct kvm_vcpu *vcpu,
  struct kvm_interrupt *irq);
+extern void kvmppc_core_flush_tlb(struct kvm_vcpu *vcpu);
 
 extern int kvmppc_core_emulate_op(struct kvm_run *run, struct kvm_vcpu *vcpu,
   unsigned int op, int *advance);
diff --git a/arch/powerpc/kvm/Kconfig b/arch/powerpc/kvm/Kconfig
index f4dacb9..40cad8c 100644
--- a/arch/powerpc/kvm/Kconfig
+++ b/arch/powerpc/kvm/Kconfig
@@ -123,6 +123,7 @@ config KVM_E500V2
depends on EXPERIMENTAL  E500  !PPC_E500MC
select KVM
select KVM_MMIO
+   select MMU_NOTIFIER
---help---
  Support running unmodified E500 guest kernels in virtual machines on
  E500v2 host processors.
@@ -138,6 +139,7 @@ config KVM_E500MC
select KVM
select KVM_MMIO
select KVM_BOOKE_HV
+   select MMU_NOTIFIER
---help---
  Support running unmodified E500MC/E5500 (32-bit) guest kernels in
  virtual machines on E500MC/E5500 host processors.
diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index 1d4ce9a..e794c3c 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -461,6 +461,15 @@ int kvmppc_core_prepare_to_enter(struct kvm_vcpu *vcpu)
return r;
 }
 
+static void kvmppc_check_requests(struct kvm_vcpu *vcpu)
+{
+#if defined(CONFIG_KVM_E500V2) || defined(CONFIG_KVM_E500MC)
+   if (vcpu-requests)
+   if (kvm_check_request(KVM_REQ_TLB_FLUSH, vcpu))
+   kvmppc_core_flush_tlb(vcpu);
+#endif
+}
+
 /*
  * Common checks before entering the guest world.  Call with interrupts
  * disabled.
@@ -485,12 +494,24 @@ static int kvmppc_prepare_to_enter(struct kvm_vcpu *vcpu)
break;
}
 
+   smp_mb();
+   kvmppc_check_requests(vcpu);
+
if (kvmppc_core_prepare_to_enter(vcpu)) {
/* interrupts got enabled in between, so we
   are back at square 1 */
continue;
}
 
+   if (vcpu-mode == EXITING_GUEST_MODE) {
+   r = 1;
+   break;
+   }
+
+   /* Going into guest context! Yay! */
+   vcpu-mode = IN_GUEST_MODE;
+   smp_wmb();
+
break;
}
 
@@ -560,6 +581,8 @@ int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct 
kvm_vcpu *vcpu)
 #endif
 
kvm_guest_exit();
+   vcpu-mode = OUTSIDE_GUEST_MODE;
+   smp_wmb();
 
 out:
local_irq_enable();
diff --git a/arch/powerpc/kvm/e500_tlb.c b/arch/powerpc/kvm/e500_tlb.c
index d26e705..3f78756 100644
--- a/arch/powerpc/kvm/e500_tlb.c
+++ b/arch/powerpc/kvm/e500_tlb.c
@@ -357,6 +357,13 @@ static void clear_tlb_refs(struct kvmppc_vcpu_e500 
*vcpu_e500)
clear_tlb_privs(vcpu_e500);
 }
 
+void kvmppc_core_flush_tlb(struct kvm_vcpu *vcpu)
+{
+   struct kvmppc_vcpu_e500 *vcpu_e500 = to_e500(vcpu);
+   clear_tlb_refs(vcpu_e500);
+   clear_tlb1_bitmap(vcpu_e500);
+}
+
 static inline void kvmppc_e500_deliver_tlb_miss(struct kvm_vcpu *vcpu,
unsigned int eaddr, int as)
 {
@@ -1062,6 +1069,51 @@ void kvmppc_mmu_map(struct kvm_vcpu *vcpu, u64 eaddr, 
gpa_t gpaddr,
write_stlbe(vcpu_e500, gtlbe, stlbe, stlbsel, sesel);
 }
 
+/* 

[PATCH 1/8] KVM: PPC: BookE: Expose remote TLB flushes in debugfs

2012-08-07 Thread Alexander Graf
We're already counting remote TLB flushes in a variable, but don't export
it to user space yet. Do so, so we know what's going on.

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/kvm/booke.c |1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index d25a097..91b9662 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -62,6 +62,7 @@ struct kvm_stats_debugfs_item debugfs_entries[] = {
{ halt_wakeup, VCPU_STAT(halt_wakeup) },
{ doorbell, VCPU_STAT(dbell_exits) },
{ guest doorbell, VCPU_STAT(gdbell_exits) },
+   { remote_tlb_flush, VM_STAT(remote_tlb_flush) },
{ NULL }
 };
 
-- 
1.6.0.2

--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 5/8] KVM: Add hva_to_memslot

2012-08-07 Thread Alexander Graf
Architecture code might want to figure out if an hva that it gets for example
via the mmu notifier callbacks actually is in guest mapped memory, and if so,
in which memory slot.

This patch introduces a helper function to enable it to do so. It is a
prerequisite for the e500 mmu notifier implementation.

Signed-off-by: Alexander Graf ag...@suse.de
---
 include/linux/kvm_host.h |1 +
 virt/kvm/kvm_main.c  |   14 ++
 2 files changed, 15 insertions(+), 0 deletions(-)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index dbc65f9..2b92928 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -464,6 +464,7 @@ int kvm_gfn_to_hva_cache_init(struct kvm *kvm, struct 
gfn_to_hva_cache *ghc,
 int kvm_clear_guest_page(struct kvm *kvm, gfn_t gfn, int offset, int len);
 int kvm_clear_guest(struct kvm *kvm, gpa_t gpa, unsigned long len);
 struct kvm_memory_slot *gfn_to_memslot(struct kvm *kvm, gfn_t gfn);
+struct kvm_memory_slot *hva_to_memslot(struct kvm *kvm, hva_t hva);
 int kvm_is_visible_gfn(struct kvm *kvm, gfn_t gfn);
 unsigned long kvm_host_page_size(struct kvm *kvm, gfn_t gfn);
 void mark_page_dirty(struct kvm *kvm, gfn_t gfn);
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index bcf973e..d42591d 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -999,6 +999,20 @@ struct kvm_memory_slot *gfn_to_memslot(struct kvm *kvm, 
gfn_t gfn)
 }
 EXPORT_SYMBOL_GPL(gfn_to_memslot);
 
+struct kvm_memory_slot *hva_to_memslot(struct kvm *kvm, hva_t hva)
+{
+   struct kvm_memslots *slots = kvm_memslots(kvm);
+   struct kvm_memory_slot *memslot;
+
+   kvm_for_each_memslot(memslot, slots)
+   if (hva = memslot-userspace_addr 
+ hva  memslot-userspace_addr + memslot-npages)
+   return memslot;
+
+   return NULL;
+}
+EXPORT_SYMBOL_GPL(hva_to_memslot);
+
 int kvm_is_visible_gfn(struct kvm *kvm, gfn_t gfn)
 {
struct kvm_memory_slot *memslot = gfn_to_memslot(kvm, gfn);
-- 
1.6.0.2

--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 4/8] KVM: PPC: Expose SYNC cap based on mmu notifiers

2012-08-07 Thread Alexander Graf
Semantically, the SYNC cap means that we have mmu notifiers available.
Express this in our #ifdef'ery around the feature, so that we can be sure
we don't miss out on ppc targets when they get their implementation.

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/kvm/powerpc.c |8 +++-
 1 files changed, 7 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index a69ca4a..e2511de 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -264,10 +264,16 @@ int kvm_dev_ioctl_check_extension(long ext)
if (cpu_has_feature(CPU_FTR_ARCH_201))
r = 2;
break;
+#endif
case KVM_CAP_SYNC_MMU:
+#ifdef CONFIG_KVM_BOOK3S_64_HV
r = cpu_has_feature(CPU_FTR_ARCH_206) ? 1 : 0;
-   break;
+#elif defined(KVM_ARCH_WANT_MMU_NOTIFIER)
+   r = 1;
+#else
+   r = 0;
 #endif
+   break;
case KVM_CAP_NR_VCPUS:
/*
 * Recommending a number of CPUs is somewhat arbitrary; we
-- 
1.6.0.2

--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 3/8] KVM: PPC: PR: Use generic tracepoint for guest exit

2012-08-07 Thread Alexander Graf
We want to have tracing information on guest exits for booke as well
as book3s. Since most information is identical, use a common trace point.

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/kvm/book3s_pr.c |2 +-
 arch/powerpc/kvm/booke.c |3 ++
 arch/powerpc/kvm/trace.h |   79 +++---
 3 files changed, 55 insertions(+), 29 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_pr.c b/arch/powerpc/kvm/book3s_pr.c
index a1baec3..0fe4cd4 100644
--- a/arch/powerpc/kvm/book3s_pr.c
+++ b/arch/powerpc/kvm/book3s_pr.c
@@ -551,7 +551,7 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu 
*vcpu,
/* We get here with MSR.EE=0, so enable it to be a nice citizen */
__hard_irq_enable();
 
-   trace_kvm_book3s_exit(exit_nr, vcpu);
+   trace_kvm_exit(exit_nr, vcpu);
preempt_enable();
kvm_resched(vcpu);
switch (exit_nr) {
diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index 91b9662..1d4ce9a 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -39,6 +39,7 @@
 
 #include timing.h
 #include booke.h
+#include trace.h
 
 unsigned long kvmppc_booke_handlers;
 
@@ -678,6 +679,8 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu 
*vcpu,
 
local_irq_enable();
 
+   trace_kvm_exit(exit_nr, vcpu);
+
run-exit_reason = KVM_EXIT_UNKNOWN;
run-ready_for_interrupt_injection = 1;
 
diff --git a/arch/powerpc/kvm/trace.h b/arch/powerpc/kvm/trace.h
index 877186b..9fab6ed 100644
--- a/arch/powerpc/kvm/trace.h
+++ b/arch/powerpc/kvm/trace.h
@@ -31,6 +31,57 @@ TRACE_EVENT(kvm_ppc_instr,
  __entry-inst, __entry-pc, __entry-emulate)
 );
 
+TRACE_EVENT(kvm_exit,
+   TP_PROTO(unsigned int exit_nr, struct kvm_vcpu *vcpu),
+   TP_ARGS(exit_nr, vcpu),
+
+   TP_STRUCT__entry(
+   __field(unsigned int,   exit_nr )
+   __field(unsigned long,  pc  )
+   __field(unsigned long,  msr )
+   __field(unsigned long,  dar )
+#ifdef CONFIG_KVM_BOOK3S_PR
+   __field(unsigned long,  srr1)
+#endif
+   __field(unsigned long,  last_inst   )
+   ),
+
+   TP_fast_assign(
+#ifdef CONFIG_KVM_BOOK3S_PR
+   struct kvmppc_book3s_shadow_vcpu *svcpu;
+#endif
+   __entry-exit_nr= exit_nr;
+   __entry-pc = kvmppc_get_pc(vcpu);
+   __entry-dar= kvmppc_get_fault_dar(vcpu);
+   __entry-msr= vcpu-arch.shared-msr;
+#ifdef CONFIG_KVM_BOOK3S_PR
+   svcpu = svcpu_get(vcpu);
+   __entry-srr1   = svcpu-shadow_srr1;
+   svcpu_put(svcpu);
+#endif
+   __entry-last_inst  = vcpu-arch.last_inst;
+   ),
+
+   TP_printk(exit=0x%x
+| pc=0x%lx
+| msr=0x%lx
+| dar=0x%lx
+#ifdef CONFIG_KVM_BOOK3S_PR
+| srr1=0x%lx
+#endif
+| last_inst=0x%lx
+   ,
+   __entry-exit_nr,
+   __entry-pc,
+   __entry-msr,
+   __entry-dar,
+#ifdef CONFIG_KVM_BOOK3S_PR
+   __entry-srr1,
+#endif
+   __entry-last_inst
+   )
+);
+
 TRACE_EVENT(kvm_stlb_inval,
TP_PROTO(unsigned int stlb_index),
TP_ARGS(stlb_index),
@@ -105,34 +156,6 @@ TRACE_EVENT(kvm_gtlb_write,
 
 #ifdef CONFIG_KVM_BOOK3S_PR
 
-TRACE_EVENT(kvm_book3s_exit,
-   TP_PROTO(unsigned int exit_nr, struct kvm_vcpu *vcpu),
-   TP_ARGS(exit_nr, vcpu),
-
-   TP_STRUCT__entry(
-   __field(unsigned int,   exit_nr )
-   __field(unsigned long,  pc  )
-   __field(unsigned long,  msr )
-   __field(unsigned long,  dar )
-   __field(unsigned long,  srr1)
-   ),
-
-   TP_fast_assign(
-   struct kvmppc_book3s_shadow_vcpu *svcpu;
-   __entry-exit_nr= exit_nr;
-   __entry-pc = kvmppc_get_pc(vcpu);
-   __entry-dar= kvmppc_get_fault_dar(vcpu);
-   __entry-msr= vcpu-arch.shared-msr;
-   svcpu = svcpu_get(vcpu);
-   __entry-srr1   = svcpu-shadow_srr1;
-   svcpu_put(svcpu);
-   ),
-
-   TP_printk(exit=0x%x | pc=0x%lx | msr=0x%lx | dar=0x%lx | srr1=0x%lx,
- __entry-exit_nr, __entry-pc, __entry-msr, __entry-dar,
- __entry-srr1)
-);
-
 TRACE_EVENT(kvm_book3s_reenter,
TP_PROTO(int r, struct kvm_vcpu *vcpu),
TP_ARGS(r, vcpu),
-- 
1.6.0.2

--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org

Re: Reset problem vs. MMIO emulation, hypercalls, etc...

2012-08-07 Thread David Gibson
On Tue, Aug 07, 2012 at 11:46:35AM +0300, Avi Kivity wrote:
 On 08/07/2012 04:32 AM, David Gibson wrote:
  On Tue, Aug 07, 2012 at 06:57:57AM +1000, Benjamin Herrenschmidt wrote:
  On Mon, 2012-08-06 at 13:13 +1000, David Gibson wrote:
   So, I'm still trying to nut out the implications for H_CEDE, and think
   if there are any other hypercalls that might want to block the guest
   for a time.  We were considering blocking H_PUT_TCE if qemu devices
   had active dma maps on the previously mapped iovas.  I'm not sure if
   the discussions that led to the inclusion of the qemu IOMMU code
   decided that was wholly unnnecessary or just not necessary for the
   time being.
  
  For sleeping hcalls they will simply have to set exit_request to
  complete the hcall from the kernel perspective, leaving us in a state
  where the kernel is about to restart at srr0 + 4, along with some other
  flag (stop or halt) to actually freeze the vcpu.
  
  If such an async hcall decides to return an error, it can then set
  gpr3 directly using ioctls before restarting the vcpu.
  
  Yeah, I'd pretty much convinced myself of that by the end of
  yesterday.  I hope to send patches implementing these fixes today.
  
  There are also some questions about why our in-kernel H_CEDE works
  kind of differently from x86's hlt instruction implementation (which
  comes out to qemu unless the irqchip is in-kernel as well).  I don't
  think we have an urgent problem there though.
 
 It's the other way round, hlt sleeps in the kernel unless the irqchip is
 not in the kernel.

That's the same as what I said.

We never have irqchip in kernel (because we haven't written that yet)
but we still sleep in-kernel for CEDE.  I haven't spotted any problem
with that, but now I'm wondering if there is one, since x86 don't do
it in what seems like the analogous situation.

It's possible this works because our decrementer (timer) interrupts
are different at the core level from external interrupts coming from
the PIC, and *are* handled in kernel, but I haven't actually followed
the logic to work out if this is the case.

  Meaning the normal state of things is to sleep in
 the kernel (whether or not you have an emulated interrupt controller in
 the kernel -- the term irqchip in kernel is overloaded for x86).

Uh.. overloaded in what way.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson

--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Reset problem vs. MMIO emulation, hypercalls, etc...

2012-08-07 Thread Avi Kivity
On 08/07/2012 03:14 PM, David Gibson wrote:
 On Tue, Aug 07, 2012 at 11:46:35AM +0300, Avi Kivity wrote:
 On 08/07/2012 04:32 AM, David Gibson wrote:
  On Tue, Aug 07, 2012 at 06:57:57AM +1000, Benjamin Herrenschmidt wrote:
  On Mon, 2012-08-06 at 13:13 +1000, David Gibson wrote:
   So, I'm still trying to nut out the implications for H_CEDE, and think
   if there are any other hypercalls that might want to block the guest
   for a time.  We were considering blocking H_PUT_TCE if qemu devices
   had active dma maps on the previously mapped iovas.  I'm not sure if
   the discussions that led to the inclusion of the qemu IOMMU code
   decided that was wholly unnnecessary or just not necessary for the
   time being.
  
  For sleeping hcalls they will simply have to set exit_request to
  complete the hcall from the kernel perspective, leaving us in a state
  where the kernel is about to restart at srr0 + 4, along with some other
  flag (stop or halt) to actually freeze the vcpu.
  
  If such an async hcall decides to return an error, it can then set
  gpr3 directly using ioctls before restarting the vcpu.
  
  Yeah, I'd pretty much convinced myself of that by the end of
  yesterday.  I hope to send patches implementing these fixes today.
  
  There are also some questions about why our in-kernel H_CEDE works
  kind of differently from x86's hlt instruction implementation (which
  comes out to qemu unless the irqchip is in-kernel as well).  I don't
  think we have an urgent problem there though.
 
 It's the other way round, hlt sleeps in the kernel unless the irqchip is
 not in the kernel.
 
 That's the same as what I said.

I meant to stress that the normal way which other archs should emulate
is sleep-in-kernel.

 
 We never have irqchip in kernel (because we haven't written that yet)
 but we still sleep in-kernel for CEDE.  I haven't spotted any problem
 with that, but now I'm wondering if there is one, since x86 don't do
 it in what seems like the analogous situation.
 
 It's possible this works because our decrementer (timer) interrupts
 are different at the core level from external interrupts coming from
 the PIC, and *are* handled in kernel, but I haven't actually followed
 the logic to work out if this is the case.
 
  Meaning the normal state of things is to sleep in
 the kernel (whether or not you have an emulated interrupt controller in
 the kernel -- the term irqchip in kernel is overloaded for x86).
 
 Uh.. overloaded in what way.

On x86, irqchip-in-kernel means that the local APICs, the IOAPIC, and
the two PICs are emulated in the kernel.  Now the IOAPIC and the PICs
correspond to non-x86 interrupt controllers, but the local APIC is more
tightly coupled to the core.  Interrupt acceptance by the core is an
operation that involved synchronous communication with the local APIC:
the APIC presents the interrupt, the core accepts it based on the value
of the interrupt enable flag and possible a register (CR8), then the
APIC updates the ISR and IRR.

The upshot is that if the local APIC is in userspace, interrupts must be
synchronous with vcpu exection, so that KVM_INTERRUPT is a vcpu ioctl
and HLT is emulated in userspace (so that local APIC emulation can check
if an interrupt wakes it up or not).  As soon as the local APIC is
emulated in the kernel, HLT can be emulated there as well, and
interrupts become asynchronous (KVM_IRQ_LINE, a vm ioctl).

So irqchip_in_kernel, for most discussions, really means whether
interrupt queuing is synchronous or asynchronous.  It has nothing to do
with the interrupt controllers per se.  All non-x86 archs always have
irqchip_in_kernel() in this sense.

Peter has started to fix up this naming mess in qemu.  I guess we should
do the same for the kernel (except for ABIs) and document it, because it
keeps generating confusion.

-- 
error compiling committee.c: too many arguments to function
--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 6/8] KVM: PPC: E500: Implement MMU notifiers

2012-08-07 Thread Avi Kivity
On 08/07/2012 01:57 PM, Alexander Graf wrote:
 The e500 target has lived without mmu notifiers ever since it got
 introduced, but fails for the user space check on them with hugetlbfs.
 
 So in order to get that one working, implement mmu notifiers in a
 reasonably dumb fashion and be happy. On embedded hardware, we almost
 never end up with mmu notifier calls, since most people don't overcommit.
 
  
 +static void kvmppc_check_requests(struct kvm_vcpu *vcpu)
 +{
 +#if defined(CONFIG_KVM_E500V2) || defined(CONFIG_KVM_E500MC)
 + if (vcpu-requests)
 + if (kvm_check_request(KVM_REQ_TLB_FLUSH, vcpu))
 + kvmppc_core_flush_tlb(vcpu);
 +#endif
 +}
 +
  /*
   * Common checks before entering the guest world.  Call with interrupts
   * disabled.
 @@ -485,12 +494,24 @@ static int kvmppc_prepare_to_enter(struct kvm_vcpu 
 *vcpu)
   break;
   }
  
 + smp_mb();
 + kvmppc_check_requests(vcpu);
 +

On x86 we do the requests processing while in normal preemptible
context, then do an additional check for requests != 0 during guest
entry.  This allows us to do sleepy things in request processing, and
reduces the amount of work we do with interrupts disabled.

   if (kvmppc_core_prepare_to_enter(vcpu)) {
   /* interrupts got enabled in between, so we
  are back at square 1 */
   continue;
   }
  
 + if (vcpu-mode == EXITING_GUEST_MODE) {
 + r = 1;
 + break;
 + }
 +
 + /* Going into guest context! Yay! */
 + vcpu-mode = IN_GUEST_MODE;
 + smp_wmb();
 +
   break;
   }
  
 @@ -560,6 +581,8 @@ int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct 
 kvm_vcpu *vcpu)
  #endif
  
   kvm_guest_exit();
 + vcpu-mode = OUTSIDE_GUEST_MODE;
 + smp_wmb();
  
 +/* MMU Notifiers */
 +
 +int kvm_unmap_hva(struct kvm *kvm, unsigned long hva)
 +{
 + /* Is this a guest page? */
 + if (!hva_to_memslot(kvm, hva))
 + return 0;
 +
 + /*
 +  * Flush all shadow tlb entries everywhere. This is slow, but
 +  * we are 100% sure that we catch the to be unmapped page
 +  */
 + kvm_flush_remote_tlbs(kvm);

Wow.

 +
 + return 0;
 +}
 +

Where do you drop the reference count when installing a page in a shadow
tlb entry?


-- 
error compiling committee.c: too many arguments to function
--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 7/8] KVM: Add page map arch callback

2012-08-07 Thread Avi Kivity
On 08/07/2012 01:57 PM, Alexander Graf wrote:
 Some archs need to ensure that their icache is flushed when mapping a new
 page. Add a callback to the generic code for an arch to implement any cache
 flush logic it may need.
 
 Signed-off-by: Alexander Graf ag...@suse.de
 ---
  virt/kvm/kvm_main.c |6 +-
  1 files changed, 5 insertions(+), 1 deletions(-)
 
 diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
 index d42591d..4e0882d 100644
 --- a/virt/kvm/kvm_main.c
 +++ b/virt/kvm/kvm_main.c
 @@ -1161,8 +1161,12 @@ static pfn_t hva_to_pfn(unsigned long addr, bool 
 atomic, bool *async,
   pfn = get_fault_pfn();
   }
   up_read(current-mm-mmap_sem);
 - } else
 + } else {
   pfn = page_to_pfn(page[0]);
 +#ifdef __KVM_HAVE_ARCH_MAP_PAGE
 + kvm_arch_map_page(page[0]);
 +#endif
 + }
  

Please call it unconditionally, and have a stub inline ifndef
__KVM_HAVE_ARCH_MAP_PAGE.

Is this the correct place?  Who says the caller of hva_to_pfn() is going
to map it?


-- 
error compiling committee.c: too many arguments to function
--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 7/8] KVM: Add page map arch callback

2012-08-07 Thread Alexander Graf


On 07.08.2012, at 15:32, Avi Kivity a...@redhat.com wrote:

 On 08/07/2012 01:57 PM, Alexander Graf wrote:
 Some archs need to ensure that their icache is flushed when mapping a new
 page. Add a callback to the generic code for an arch to implement any cache
 flush logic it may need.
 
 Signed-off-by: Alexander Graf ag...@suse.de
 ---
 virt/kvm/kvm_main.c |6 +-
 1 files changed, 5 insertions(+), 1 deletions(-)
 
 diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
 index d42591d..4e0882d 100644
 --- a/virt/kvm/kvm_main.c
 +++ b/virt/kvm/kvm_main.c
 @@ -1161,8 +1161,12 @@ static pfn_t hva_to_pfn(unsigned long addr, bool 
 atomic, bool *async,
pfn = get_fault_pfn();
}
up_read(current-mm-mmap_sem);
 -} else
 +} else {
pfn = page_to_pfn(page[0]);
 +#ifdef __KVM_HAVE_ARCH_MAP_PAGE
 +kvm_arch_map_page(page[0]);
 +#endif
 +}
 
 
 Please call it unconditionally, and have a stub inline ifndef
 __KVM_HAVE_ARCH_MAP_PAGE.

Sure, works fine with me.

 
 Is this the correct place?  Who says the caller of hva_to_pfn() is going
 to map it?

I don't think anyone is. However, we need the struct page, and all the generic 
kvm mm code tries hard to hide it from its users. The alternative would be to 
expose all those details, and I'm not sure that's a good idea.

Essentially, we don't care if we're overly cautious. Clearing one page too much 
is way better than clearing one too few.


Alex

--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 6/8] KVM: PPC: E500: Implement MMU notifiers

2012-08-07 Thread Alexander Graf


On 07.08.2012, at 15:30, Avi Kivity a...@redhat.com wrote:

 On 08/07/2012 01:57 PM, Alexander Graf wrote:
 The e500 target has lived without mmu notifiers ever since it got
 introduced, but fails for the user space check on them with hugetlbfs.
 
 So in order to get that one working, implement mmu notifiers in a
 reasonably dumb fashion and be happy. On embedded hardware, we almost
 never end up with mmu notifier calls, since most people don't overcommit.
 
 
 +static void kvmppc_check_requests(struct kvm_vcpu *vcpu)
 +{
 +#if defined(CONFIG_KVM_E500V2) || defined(CONFIG_KVM_E500MC)
 +if (vcpu-requests)
 +if (kvm_check_request(KVM_REQ_TLB_FLUSH, vcpu))
 +kvmppc_core_flush_tlb(vcpu);
 +#endif
 +}
 +
 /*
  * Common checks before entering the guest world.  Call with interrupts
  * disabled.
 @@ -485,12 +494,24 @@ static int kvmppc_prepare_to_enter(struct kvm_vcpu 
 *vcpu)
break;
}
 
 +smp_mb();
 +kvmppc_check_requests(vcpu);
 +
 
 On x86 we do the requests processing while in normal preemptible
 context, then do an additional check for requests != 0 during guest
 entry.  This allows us to do sleepy things in request processing, and
 reduces the amount of work we do with interrupts disabled.

Hrm. We could do the same I guess. Let me give it a try.

 
if (kvmppc_core_prepare_to_enter(vcpu)) {
/* interrupts got enabled in between, so we
   are back at square 1 */
continue;
}
 
 +if (vcpu-mode == EXITING_GUEST_MODE) {
 +r = 1;
 +break;
 +}
 +
 +/* Going into guest context! Yay! */
 +vcpu-mode = IN_GUEST_MODE;
 +smp_wmb();
 +
break;
}
 
 @@ -560,6 +581,8 @@ int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct 
 kvm_vcpu *vcpu)
 #endif
 
kvm_guest_exit();
 +vcpu-mode = OUTSIDE_GUEST_MODE;
 +smp_wmb();
 
 +/* MMU Notifiers */
 +
 +int kvm_unmap_hva(struct kvm *kvm, unsigned long hva)
 +{
 +/* Is this a guest page? */
 +if (!hva_to_memslot(kvm, hva))
 +return 0;
 +
 +/*
 + * Flush all shadow tlb entries everywhere. This is slow, but
 + * we are 100% sure that we catch the to be unmapped page
 + */
 +kvm_flush_remote_tlbs(kvm);
 
 Wow.

Yeah, cool, eh? It sounds worse than it is. Usually when we need to page out, 
we're under memory pressure. So we would get called multiple times to unmap 
different pages. If we just drop all shadow tlb entries, we also freed a lot of 
memory that can now be paged out without callbacks.

 
 +
 +return 0;
 +}
 +
 
 Where do you drop the reference count when installing a page in a shadow
 tlb entry?

Which reference count? Essentially the remote tlb flush calls 
kvmppc_e500_prov_release() on all currently mapped shadow tlb entries. Are we 
missing out on something more?


Alex

 
 
 -- 
 error compiling committee.c: too many arguments to function
--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 7/8] KVM: Add page map arch callback

2012-08-07 Thread Avi Kivity
On 08/07/2012 04:44 PM, Alexander Graf wrote:
 
 
 Is this the correct place?  Who says the caller of hva_to_pfn() is going
 to map it?
 
 I don't think anyone is. However, we need the struct page, and all the 
 generic kvm mm code tries hard to hide it from its users. The alternative 
 would be to expose all those details, and I'm not sure that's a good idea.
 
 Essentially, we don't care if we're overly cautious. Clearing one page too 
 much is way better than clearing one too few.

Are you sure everyone uses hva_to_pfn()?  x86 uses gfn_to_hva_many(), in
one place.



-- 
error compiling committee.c: too many arguments to function
--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 7/8] KVM: Add page map arch callback

2012-08-07 Thread Alexander Graf


On 07.08.2012, at 15:58, Avi Kivity a...@redhat.com wrote:

 On 08/07/2012 04:44 PM, Alexander Graf wrote:
 
 
 Is this the correct place?  Who says the caller of hva_to_pfn() is going
 to map it?
 
 I don't think anyone is. However, we need the struct page, and all the 
 generic kvm mm code tries hard to hide it from its users. The alternative 
 would be to expose all those details, and I'm not sure that's a good idea.
 
 Essentially, we don't care if we're overly cautious. Clearing one page too 
 much is way better than clearing one too few.
 
 Are you sure everyone uses hva_to_pfn()?  x86 uses gfn_to_hva_many(), in
 one place.

Nope, I only checked that e500 adheres to that flow so far. I'm not even 100% 
sure that book3s is always happy yet.

But I figured this is a step in the right direction. If we missed out on one, 
we can always add it later. The many function is a good spot. Maybe I'll just 
ckeck up all of kvm_main.c again for potential users.


Alex

--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 7/8] KVM: Add page map arch callback

2012-08-07 Thread Avi Kivity
On 08/07/2012 05:08 PM, Alexander Graf wrote:
 
 
 On 07.08.2012, at 15:58, Avi Kivity a...@redhat.com wrote:
 
 On 08/07/2012 04:44 PM, Alexander Graf wrote:
 
 
 Is this the correct place?  Who says the caller of hva_to_pfn() is going
 to map it?
 
 I don't think anyone is. However, we need the struct page, and all the 
 generic kvm mm code tries hard to hide it from its users. The alternative 
 would be to expose all those details, and I'm not sure that's a good idea.
 
 Essentially, we don't care if we're overly cautious. Clearing one page too 
 much is way better than clearing one too few.
 
 Are you sure everyone uses hva_to_pfn()?  x86 uses gfn_to_hva_many(), in
 one place.
 
 Nope, I only checked that e500 adheres to that flow so far. I'm not even 100% 
 sure that book3s is always happy yet.
 
 But I figured this is a step in the right direction. If we missed out on one, 
 we can always add it later. The many function is a good spot. Maybe I'll just 
 ckeck up all of kvm_main.c again for potential users.

I'm not sure.  We have lots of functions of this sort, and their number
keeps increasing.  Maybe a better place is pre-map.


-- 
error compiling committee.c: too many arguments to function
--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 7/8] KVM: Add page map arch callback

2012-08-07 Thread Alexander Graf


On 07.08.2012, at 16:10, Avi Kivity a...@redhat.com wrote:

 On 08/07/2012 05:08 PM, Alexander Graf wrote:
 
 
 On 07.08.2012, at 15:58, Avi Kivity a...@redhat.com wrote:
 
 On 08/07/2012 04:44 PM, Alexander Graf wrote:
 
 
 Is this the correct place?  Who says the caller of hva_to_pfn() is going
 to map it?
 
 I don't think anyone is. However, we need the struct page, and all the 
 generic kvm mm code tries hard to hide it from its users. The alternative 
 would be to expose all those details, and I'm not sure that's a good idea.
 
 Essentially, we don't care if we're overly cautious. Clearing one page too 
 much is way better than clearing one too few.
 
 Are you sure everyone uses hva_to_pfn()?  x86 uses gfn_to_hva_many(), in
 one place.
 
 Nope, I only checked that e500 adheres to that flow so far. I'm not even 
 100% sure that book3s is always happy yet.
 
 But I figured this is a step in the right direction. If we missed out on 
 one, we can always add it later. The many function is a good spot. Maybe 
 I'll just ckeck up all of kvm_main.c again for potential users.
 
 I'm not sure.  We have lots of functions of this sort, and their number
 keeps increasing.  Maybe a better place is pre-map.

Pre-map? How?

Alex

 
 
 -- 
 error compiling committee.c: too many arguments to function
--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 6/8] KVM: PPC: E500: Implement MMU notifiers

2012-08-07 Thread Avi Kivity
On 08/07/2012 04:52 PM, Alexander Graf wrote:
 
 +/* MMU Notifiers */
 +
 +int kvm_unmap_hva(struct kvm *kvm, unsigned long hva)
 +{
 +/* Is this a guest page? */
 +if (!hva_to_memslot(kvm, hva))
 +return 0;
 +
 +/*
 + * Flush all shadow tlb entries everywhere. This is slow, but
 + * we are 100% sure that we catch the to be unmapped page
 + */
 +kvm_flush_remote_tlbs(kvm);
 
 Wow.
 
 Yeah, cool, eh? It sounds worse than it is. Usually when we need to page out, 
 we're under memory pressure. So we would get called multiple times to unmap 
 different pages. If we just drop all shadow tlb entries, we also freed a lot 
 of memory that can now be paged out without callbacks.

And it's just a shadow tlb yes?  So there's a limited amount of stuff
there.  But it'd be hell on x86.

 
 
 +
 +return 0;
 +}
 +
 
 Where do you drop the reference count when installing a page in a shadow
 tlb entry?
 
 Which reference count? Essentially the remote tlb flush calls 
 kvmppc_e500_prov_release() on all currently mapped shadow tlb entries. Are we 
 missing out on something more?
 

With mmu notifiers mapped pages are kept without elevated reference
counts; the mmu notifier protects them, not the refcount.  This allows
core mm code that looks at refcounts to work.


-- 
error compiling committee.c: too many arguments to function
--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 7/8] KVM: Add page map arch callback

2012-08-07 Thread Avi Kivity
On 08/07/2012 05:14 PM, Alexander Graf wrote:
 
 
 On 07.08.2012, at 16:10, Avi Kivity a...@redhat.com wrote:
 
 On 08/07/2012 05:08 PM, Alexander Graf wrote:
 
 
 On 07.08.2012, at 15:58, Avi Kivity a...@redhat.com wrote:
 
 On 08/07/2012 04:44 PM, Alexander Graf wrote:
 
 
 Is this the correct place?  Who says the caller of hva_to_pfn() is going
 to map it?
 
 I don't think anyone is. However, we need the struct page, and all the 
 generic kvm mm code tries hard to hide it from its users. The alternative 
 would be to expose all those details, and I'm not sure that's a good idea.
 
 Essentially, we don't care if we're overly cautious. Clearing one page 
 too much is way better than clearing one too few.
 
 Are you sure everyone uses hva_to_pfn()?  x86 uses gfn_to_hva_many(), in
 one place.
 
 Nope, I only checked that e500 adheres to that flow so far. I'm not even 
 100% sure that book3s is always happy yet.
 
 But I figured this is a step in the right direction. If we missed out on 
 one, we can always add it later. The many function is a good spot. Maybe 
 I'll just ckeck up all of kvm_main.c again for potential users.
 
 I'm not sure.  We have lots of functions of this sort, and their number
 keeps increasing.  Maybe a better place is pre-map.
 
 Pre-map? How?

In arch code before you install the page in a pte/tlbe.

We don't have a single point we can hook unfortunately.


-- 
error compiling committee.c: too many arguments to function
--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 6/8] KVM: PPC: E500: Implement MMU notifiers

2012-08-07 Thread Alexander Graf


On 07.08.2012, at 16:14, Avi Kivity a...@redhat.com wrote:

 On 08/07/2012 04:52 PM, Alexander Graf wrote:
 
 +/* MMU Notifiers */
 +
 +int kvm_unmap_hva(struct kvm *kvm, unsigned long hva)
 +{
 +/* Is this a guest page? */
 +if (!hva_to_memslot(kvm, hva))
 +return 0;
 +
 +/*
 + * Flush all shadow tlb entries everywhere. This is slow, but
 + * we are 100% sure that we catch the to be unmapped page
 + */
 +kvm_flush_remote_tlbs(kvm);
 
 Wow.
 
 Yeah, cool, eh? It sounds worse than it is. Usually when we need to page 
 out, we're under memory pressure. So we would get called multiple times to 
 unmap different pages. If we just drop all shadow tlb entries, we also freed 
 a lot of memory that can now be paged out without callbacks.
 
 And it's just a shadow tlb yes?  So there's a limited amount of stuff
 there.  But it'd be hell on x86.
 
 
 
 +
 +return 0;
 +}
 +
 
 Where do you drop the reference count when installing a page in a shadow
 tlb entry?
 
 Which reference count? Essentially the remote tlb flush calls 
 kvmppc_e500_prov_release() on all currently mapped shadow tlb entries. Are 
 we missing out on something more?
 
 
 With mmu notifiers mapped pages are kept without elevated reference
 counts; the mmu notifier protects them, not the refcount.  This allows
 core mm code that looks at refcounts to work.

Hrm. I wonder why it works then. We only drop the refcount after we got an mmu 
notifier callback. Maybe we get a callback on an unmapped page, but then happen 
to clear all the shadow entries as well, hence unpinning them along the way?

That explains why it works, but sure isn't exactly working as intended. Thanks 
for the hint!


Alex

--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 7/8] KVM: Add page map arch callback

2012-08-07 Thread Avi Kivity
On 08/07/2012 05:24 PM, Alexander Graf wrote:
 
 Pre-map? How?
 
 In arch code before you install the page in a pte/tlbe.
 
 So how do I get to the struct page in there?
 

pfn_to_page()


-- 
error compiling committee.c: too many arguments to function
--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH 1/2] KVM: PPC: booke: Add watchdog emulation

2012-08-07 Thread Bhushan Bharat-R65777


 -Original Message-
 From: Alexander Graf [mailto:ag...@suse.de]
 Sent: Tuesday, August 07, 2012 3:38 PM
 To: Bhushan Bharat-R65777
 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; Bhushan Bharat-R65777
 Subject: Re: [PATCH 1/2] KVM: PPC: booke: Add watchdog emulation
 
 
 On 03.08.2012, at 07:30, Bharat Bhushan wrote:
 
  This patch adds the watchdog emulation in KVM. The watchdog
  emulation is enabled by KVM_ENABLE_CAP(KVM_CAP_PPC_BOOKE_WATCHDOG) ioctl.
  The kernel timer are used for watchdog emulation and emulates
  h/w watchdog state machine. On watchdog timer expiry, it exit to QEMU
  if TCR.WRC is non ZERO. QEMU can reset/shutdown etc depending upon how
  it is configured.
 
  Signed-off-by: Liu Yu yu@freescale.com
  Signed-off-by: Scott Wood scottw...@freescale.com
  [bharat.bhus...@freescale.com: reworked patch]
  Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
  ---
  v7:
  - kvmppc_core_dequeue_watchdog() and kvmppc_core_enque_watchdog()
are made staic
  - Clear the KVM_REQ_WATCHDOG request when TSR_ENW | ENW_WIS cleared
rather than checking these bits on handling the request.
Below changes (pasted for quick understanding)
  diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
  index 7eedaeb..a0f5922 100644
  --- a/arch/powerpc/kvm/booke.c
  +++ b/arch/powerpc/kvm/booke.c
  @@ -629,8 +629,7 @@ int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct
 kvm_vcpu *vcpu)
 goto out;
 }
 
  -   if (kvm_check_request(KVM_REQ_WATCHDOG, vcpu) 
  -   (vcpu-arch.tsr  (TSR_ENW | TSR_WIS))) {
  +   if (kvm_check_request(KVM_REQ_WATCHDOG, vcpu)) {
 kvm_run-exit_reason = KVM_EXIT_WATCHDOG;
 ret = 0;
 goto out;
  @@ -1098,8 +1097,7 @@ int kvmppc_handle_exit(struct kvm_run *run, struct
 kvm_vcpu *vcpu,
 }
 }
 
  -   if (kvm_check_request(KVM_REQ_WATCHDOG, vcpu) 
  -   (vcpu-arch.tsr  (TSR_ENW | TSR_WIS))) {
  +   if (kvm_check_request(KVM_REQ_WATCHDOG, vcpu)) {
 run-exit_reason = KVM_EXIT_WATCHDOG;
 r = RESUME_HOST | (r  RESUME_FLAG_NV);
 }
  @@ -1251,8 +1249,10 @@ static int set_sregs_base(struct kvm_vcpu *vcpu,
 
 vcpu-arch.tsr = sregs-u.e.tsr;
 
  -   if ((old_tsr ^ vcpu-arch.tsr)  (TSR_ENW | TSR_WIS))
  +   if ((old_tsr ^ vcpu-arch.tsr)  (TSR_ENW | TSR_WIS)) {
  +   clear_bit(KVM_REQ_WATCHDOG, vcpu-requests);
 arm_next_watchdog(vcpu);
  +   }
 
 update_timer_ints(vcpu);
 }
  @@ -1434,8 +1434,10 @@ void kvmppc_clr_tsr_bits(struct kvm_vcpu *vcpu, u32
 tsr_bits)
  * We may have stopped the watchdog due to
  * being stuck on final expiration.
  */
  -   if (tsr_bits  (TSR_ENW | TSR_WIS))
  +   if (tsr_bits  (TSR_ENW | TSR_WIS)) {
  +   clear_bit(KVM_REQ_WATCHDOG, vcpu-requests);
 arm_next_watchdog(vcpu);
  +   }
  --
 
  v6:
  - Added kvmppc_subarch_vcpu_unit/uninit() for subarch specific 
  initialization
  - Using spin_lock_irqsave()
  - Using del_timer_sync().
 
  v5:
  - Checking that TSR_ENW/TSR_WIS are still set on KVM_EXIT_WATCHDOG.
  - Moved watchdog_next_timeout() in lock.
 
  arch/powerpc/include/asm/kvm_host.h  |3 +
  arch/powerpc/include/asm/kvm_ppc.h   |2 +
  arch/powerpc/include/asm/reg_booke.h |7 ++
  arch/powerpc/kvm/book3s.c|9 ++
  arch/powerpc/kvm/booke.c |  158 
  ++
  arch/powerpc/kvm/booke_emulate.c |8 ++
  arch/powerpc/kvm/powerpc.c   |   14 +++-
  include/linux/kvm.h  |2 +
  include/linux/kvm_host.h |1 +
  9 files changed, 202 insertions(+), 2 deletions(-)
 
  diff --git a/arch/powerpc/include/asm/kvm_host.h
 b/arch/powerpc/include/asm/kvm_host.h
  index 572ad01..873ec11 100644
  --- a/arch/powerpc/include/asm/kvm_host.h
  +++ b/arch/powerpc/include/asm/kvm_host.h
  @@ -469,6 +469,8 @@ struct kvm_vcpu_arch {
  ulong fault_esr;
  ulong queued_dear;
  ulong queued_esr;
  +   spinlock_t wdt_lock;
  +   struct timer_list wdt_timer;
  u32 tlbcfg[4];
  u32 mmucfg;
  u32 epr;
  @@ -484,6 +486,7 @@ struct kvm_vcpu_arch {
  u8 osi_needed;
  u8 osi_enabled;
  u8 papr_enabled;
  +   u8 watchdog_enabled;
  u8 sane;
  u8 cpu_type;
  u8 hcall_needed;
  diff --git a/arch/powerpc/include/asm/kvm_ppc.h
 b/arch/powerpc/include/asm/kvm_ppc.h
  index 0124937..1438a5e 100644
  --- a/arch/powerpc/include/asm/kvm_ppc.h
  +++ b/arch/powerpc/include/asm/kvm_ppc.h
  @@ -68,6 +68,8 @@ extern void kvmppc_emulate_dec(struct kvm_vcpu *vcpu);
  extern u32 kvmppc_get_dec(struct kvm_vcpu *vcpu, u64 tb);
  extern void kvmppc_decrementer_func(unsigned long data);
  extern int kvmppc_sanity_check(struct kvm_vcpu *vcpu);
  +extern int 

RE: [PATCH 3/3] KVM: PPC: booke: Added debug handler

2012-08-07 Thread Bhushan Bharat-R65777


 -Original Message-
 From: kvm-ppc-ow...@vger.kernel.org [mailto:kvm-ppc-ow...@vger.kernel.org] On
 Behalf Of Alexander Graf
 Sent: Tuesday, August 07, 2012 4:18 PM
 To: Bhushan Bharat-R65777
 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; Bhushan Bharat-R65777
 Subject: Re: [PATCH 3/3] KVM: PPC: booke: Added debug handler
 
 
 On 03.08.2012, at 09:08, Bharat Bhushan wrote:
 
  Installed debug handler will be used for guest debug support and debug
  facility emulation features (patches for these features will follow
  this patch).
 
  Signed-off-by: Liu Yu yu@freescale.com
  [bharat.bhus...@freescale.com: Substantial changes]
  Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
 
  Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
  ---
  arch/powerpc/include/asm/kvm_host.h |1 +
  arch/powerpc/kernel/asm-offsets.c   |1 +
  arch/powerpc/kvm/booke_interrupts.S |   45 
  +++
  3 files changed, 47 insertions(+), 0 deletions(-)
 
  diff --git a/arch/powerpc/include/asm/kvm_host.h
  b/arch/powerpc/include/asm/kvm_host.h
  index dcee499..bd78523 100644
  --- a/arch/powerpc/include/asm/kvm_host.h
  +++ b/arch/powerpc/include/asm/kvm_host.h
  @@ -494,6 +494,7 @@ struct kvm_vcpu_arch {
  u32 tlbcfg[4];
  u32 mmucfg;
  u32 epr;
  +   u32 crit_save;
  #endif
  gpa_t paddr_accessed;
  gva_t vaddr_accessed;
  diff --git a/arch/powerpc/kernel/asm-offsets.c
  b/arch/powerpc/kernel/asm-offsets.c
  index 85b05c4..92f149b 100644
  --- a/arch/powerpc/kernel/asm-offsets.c
  +++ b/arch/powerpc/kernel/asm-offsets.c
  @@ -563,6 +563,7 @@ int main(void)
  DEFINE(VCPU_LAST_INST, offsetof(struct kvm_vcpu, arch.last_inst));
  DEFINE(VCPU_FAULT_DEAR, offsetof(struct kvm_vcpu, arch.fault_dear));
  DEFINE(VCPU_FAULT_ESR, offsetof(struct kvm_vcpu, arch.fault_esr));
  +   DEFINE(VCPU_CRIT_SAVE, offsetof(struct kvm_vcpu, arch.crit_save));
  #endif /* CONFIG_PPC_BOOK3S */
  #endif /* CONFIG_KVM */
 
  diff --git a/arch/powerpc/kvm/booke_interrupts.S
  b/arch/powerpc/kvm/booke_interrupts.S
  index 3539805..890673c 100644
  --- a/arch/powerpc/kvm/booke_interrupts.S
  +++ b/arch/powerpc/kvm/booke_interrupts.S
  @@ -73,6 +73,51 @@ _GLOBAL(kvmppc_handler_\ivor_nr)
  bctr
  .endm
 
  +.macro KVM_DBG_HANDLER ivor_nr scratch srr0
 
 This is a lot of asm code. Any chance to share a good share of it with the
 generic handler?

Yes it is a lot of code but I am finding it difficult.

Thanks
-Bharat

 
 
 Alex
 
  +_GLOBAL(kvmppc_handler_\ivor_nr)
  +   mtspr   \scratch, r4
  +   mfspr   r4, SPRN_SPRG_THREAD
  +   lwz r4, THREAD_KVM_VCPU(r4)
  +   stw r3, VCPU_CRIT_SAVE(r4)
  +   mfcrr3
  +   mfspr   r4, SPRN_CSRR1
  +   andi.   r4, r4, MSR_PR
  +   bne 1f
  +   /* debug interrupt happened in enter/exit path */
  +   mfspr   r4, SPRN_CSRR1
  +   rlwinm  r4, r4, 0, ~MSR_DE
  +   mtspr   SPRN_CSRR1, r4
  +   lis r4, 0x
  +   ori r4, r4, 0x
  +   mtspr   SPRN_DBSR, r4
  +   mfspr   r4, SPRN_SPRG_THREAD
  +   lwz r4, THREAD_KVM_VCPU(r4)
  +   mtcrr3
  +   lwz r3, VCPU_CRIT_SAVE(r4)
  +   mfspr   r4, \scratch
  +   rfci
  +1: /* debug interrupt happened in guest */
  +   mfspr   r4, \scratch
  +   mtcrr3
  +   mr  r3, r4
  +   mfspr   r4, SPRN_SPRG_THREAD
  +   lwz r4, THREAD_KVM_VCPU(r4)
  +   stw r3, VCPU_GPR(R4)(r4)
  +   stw r5, VCPU_GPR(R5)(r4)
  +   stw r6, VCPU_GPR(R6)(r4)
  +   lwz r3, VCPU_CRIT_SAVE(r4)
  +   mfspr   r5, \srr0
  +   stw r3, VCPU_GPR(R3)(r4)
  +   stw r5, VCPU_PC(r4)
  +   mfctr   r5
  +   lis r6, kvmppc_resume_host@h
  +   stw r5, VCPU_CTR(r4)
  +   li  r5, \ivor_nr
  +   ori r6, r6, kvmppc_resume_host@l
  +   mtctr   r6
  +   bctr
  +.endm
  +
  .macro KVM_HANDLER_ADDR ivor_nr
  .long   kvmppc_handler_\ivor_nr
  .endm
  --
  1.7.0.4
 
 
 
 --
 To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body
 of a message to majord...@vger.kernel.org More majordomo info at
 http://vger.kernel.org/majordomo-info.html


--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 3/3] KVM: PPC: booke: Added debug handler

2012-08-07 Thread Scott Wood
On 08/07/2012 05:47 AM, Alexander Graf wrote:
 diff --git a/arch/powerpc/kvm/booke_interrupts.S 
 b/arch/powerpc/kvm/booke_interrupts.S
 index 3539805..890673c 100644
 --- a/arch/powerpc/kvm/booke_interrupts.S
 +++ b/arch/powerpc/kvm/booke_interrupts.S
 @@ -73,6 +73,51 @@ _GLOBAL(kvmppc_handler_\ivor_nr)
  bctr
 .endm

 +.macro KVM_DBG_HANDLER ivor_nr scratch srr0
 
 This is a lot of asm code. Any chance to share a good share of it with the 
 generic handler?

That entire file could use an update to lok more like
bookehv_interrupts.S and its use of asm macros.

-Scott


--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 8/8] KVM: PPC: Add cache flush on page map

2012-08-07 Thread Scott Wood
On 08/07/2012 05:57 AM, Alexander Graf wrote:
 When we map a page that wasn't icache cleared before, do so when first
 mapping it in KVM using the same information bits as the Linux mapping
 logic. That way we are 100% sure that any page we map does not have stale
 entries in the icache.
 
 What we really would need is a method to flush the icache only when we
 actually map a page executable for the guest. But with the current
 abstraction that is hard to do, and this way we don't have a huge performance
 penalty, so better be safe now and micro optimize things later.
 
 Signed-off-by: Alexander Graf ag...@suse.de
 ---
  arch/powerpc/include/asm/kvm_host.h |   10 ++
  arch/powerpc/mm/mem.c   |1 +
  2 files changed, 11 insertions(+), 0 deletions(-)
 
 diff --git a/arch/powerpc/include/asm/kvm_host.h 
 b/arch/powerpc/include/asm/kvm_host.h
 index ed75bc9..c0a2fc1 100644
 --- a/arch/powerpc/include/asm/kvm_host.h
 +++ b/arch/powerpc/include/asm/kvm_host.h
 @@ -33,6 +33,7 @@
  #include asm/kvm_asm.h
  #include asm/processor.h
  #include asm/page.h
 +#include asm/cacheflush.h
  
  #define KVM_MAX_VCPUSNR_CPUS
  #define KVM_MAX_VCORES   NR_CPUS
 @@ -550,5 +551,14 @@ struct kvm_vcpu_arch {
  #define KVM_MMIO_REG_FQPR0x0060
  
  #define __KVM_HAVE_ARCH_WQP
 +#define __KVM_HAVE_ARCH_MAP_PAGE
 +static inline void kvm_arch_map_page(struct page *page)
 +{
 + /* Need to invalidate the icache for new pages */
 + if (!test_bit(PG_arch_1, page-flags)) {
 + flush_dcache_icache_page(page);
 + set_bit(PG_arch_1, page-flags);
 + }
 +}

Shouldn't this test CPU_FTR_COHERENT_ICACHE?

-Scott


--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Reset problem vs. MMIO emulation, hypercalls, etc...

2012-08-07 Thread Benjamin Herrenschmidt
On Tue, 2012-08-07 at 16:13 +0300, Avi Kivity wrote:
 

 Peter has started to fix up this naming mess in qemu.  I guess we should
 do the same for the kernel (except for ABIs) and document it, because it
 keeps generating confusion. 

Ok so our current situation is that the XICS and MPIC are emulated in
userspace entirely but the link between them and the VCPU is the
asynchronous EE line so we are fine.

I'm currently working on moving the XICS into the kernel for performance
reasons, however, just like ARM VGIC, I can't seem to find a way to
fit in the generic irqchip code in there. It's just not generic at
all and quite x86 centric :-)

So for now I'm just doing my own version of CREATE_IRQCHIP to create it
and KVM_INTERRUPT to trigger the various interrupts. None of the mapping
stuff (which we really don't need).

That's a bit of a problem vs. some of the code qemu-side such as in
virtio-pci which does seem to be written around the model exposed by the
x86 stuff and relies on doing such mappings so I think we'll have to
butcher some of that.

Cheers,
Ben.


--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Reset problem vs. MMIO emulation, hypercalls, etc...

2012-08-07 Thread David Gibson
On Tue, Aug 07, 2012 at 04:13:49PM +0300, Avi Kivity wrote:
 On 08/07/2012 03:14 PM, David Gibson wrote:
  On Tue, Aug 07, 2012 at 11:46:35AM +0300, Avi Kivity wrote:
  On 08/07/2012 04:32 AM, David Gibson wrote:
   On Tue, Aug 07, 2012 at 06:57:57AM +1000, Benjamin Herrenschmidt wrote:
   On Mon, 2012-08-06 at 13:13 +1000, David Gibson wrote:
So, I'm still trying to nut out the implications for H_CEDE, and think
if there are any other hypercalls that might want to block the guest
for a time.  We were considering blocking H_PUT_TCE if qemu devices
had active dma maps on the previously mapped iovas.  I'm not sure if
the discussions that led to the inclusion of the qemu IOMMU code
decided that was wholly unnnecessary or just not necessary for the
time being.
   
   For sleeping hcalls they will simply have to set exit_request to
   complete the hcall from the kernel perspective, leaving us in a state
   where the kernel is about to restart at srr0 + 4, along with some other
   flag (stop or halt) to actually freeze the vcpu.
   
   If such an async hcall decides to return an error, it can then set
   gpr3 directly using ioctls before restarting the vcpu.
   
   Yeah, I'd pretty much convinced myself of that by the end of
   yesterday.  I hope to send patches implementing these fixes today.
   
   There are also some questions about why our in-kernel H_CEDE works
   kind of differently from x86's hlt instruction implementation (which
   comes out to qemu unless the irqchip is in-kernel as well).  I don't
   think we have an urgent problem there though.
  
  It's the other way round, hlt sleeps in the kernel unless the irqchip is
  not in the kernel.
  
  That's the same as what I said.
 
 I meant to stress that the normal way which other archs should emulate
 is sleep-in-kernel.

Ok.

  We never have irqchip in kernel (because we haven't written that yet)
  but we still sleep in-kernel for CEDE.  I haven't spotted any problem
  with that, but now I'm wondering if there is one, since x86 don't do
  it in what seems like the analogous situation.
  
  It's possible this works because our decrementer (timer) interrupts
  are different at the core level from external interrupts coming from
  the PIC, and *are* handled in kernel, but I haven't actually followed
  the logic to work out if this is the case.
  
   Meaning the normal state of things is to sleep in
  the kernel (whether or not you have an emulated interrupt controller in
  the kernel -- the term irqchip in kernel is overloaded for x86).
  
  Uh.. overloaded in what way.
 
 On x86, irqchip-in-kernel means that the local APICs, the IOAPIC, and
 the two PICs are emulated in the kernel.  Now the IOAPIC and the PICs
 correspond to non-x86 interrupt controllers, but the local APIC is more
 tightly coupled to the core.  Interrupt acceptance by the core is an
 operation that involved synchronous communication with the local APIC:
 the APIC presents the interrupt, the core accepts it based on the value
 of the interrupt enable flag and possible a register (CR8), then the
 APIC updates the ISR and IRR.
 
 The upshot is that if the local APIC is in userspace, interrupts must be
 synchronous with vcpu exection, so that KVM_INTERRUPT is a vcpu ioctl
 and HLT is emulated in userspace (so that local APIC emulation can check
 if an interrupt wakes it up or not).

Sorry, still not 100% getting it.  When the vcpu is actually running
code, that synchronous communication must still be accomplished via
the KVM_INTERRUPT ioctl, yes?  So what makes HLT different, that the
communication can't be accomplished in that case.

 As soon as the local APIC is
 emulated in the kernel, HLT can be emulated there as well, and
 interrupts become asynchronous (KVM_IRQ_LINE, a vm ioctl).
 
 So irqchip_in_kernel, for most discussions, really means whether
 interrupt queuing is synchronous or asynchronous.  It has nothing to do
 with the interrupt controllers per se.  All non-x86 archs always have
 irqchip_in_kernel() in this sense.
 
 Peter has started to fix up this naming mess in qemu.  I guess we should
 do the same for the kernel (except for ABIs) and document it, because it
 keeps generating confusion.

Ok.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson

--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH 3/3] KVM: PPC: booke: Added debug handler

2012-08-07 Thread Bhushan Bharat-R65777


 -Original Message-
 From: Wood Scott-B07421
 Sent: Wednesday, August 08, 2012 2:15 AM
 To: Alexander Graf
 Cc: Bhushan Bharat-R65777; kvm-ppc@vger.kernel.org; k...@vger.kernel.org; 
 Bhushan
 Bharat-R65777
 Subject: Re: [PATCH 3/3] KVM: PPC: booke: Added debug handler
 
 On 08/07/2012 05:47 AM, Alexander Graf wrote:
  diff --git a/arch/powerpc/kvm/booke_interrupts.S
  b/arch/powerpc/kvm/booke_interrupts.S
  index 3539805..890673c 100644
  --- a/arch/powerpc/kvm/booke_interrupts.S
  +++ b/arch/powerpc/kvm/booke_interrupts.S
  @@ -73,6 +73,51 @@ _GLOBAL(kvmppc_handler_\ivor_nr)
 bctr
  .endm
 
  +.macro KVM_DBG_HANDLER ivor_nr scratch srr0
 
  This is a lot of asm code. Any chance to share a good share of it with the
 generic handler?
 
 That entire file could use an update to lok more like bookehv_interrupts.S and
 its use of asm macros.

In booke there is assumption that size of KVM IVORs will not me more than host 
IVORs size so that only IVPR is changed. 

I tried to give it that shape of bookehv_interrupts.S  and found that size of 
some IVORs become more than host IVORs.

Thanks
-Bharat
N�r��yb�X��ǧv�^�)޺{.n�+jir)w*jg����ݢj/���z�ޖ��2�ޙ�)ߡ�a�����G���h��j:+v���w��٥

Re: [kvmarm] [PATCH 5/8] KVM: Add hva_to_memslot

2012-08-07 Thread Christoffer Dall
On Tue, Aug 7, 2012 at 6:57 AM, Alexander Graf ag...@suse.de wrote:
 Architecture code might want to figure out if an hva that it gets for example
 via the mmu notifier callbacks actually is in guest mapped memory, and if so,
 in which memory slot.

 This patch introduces a helper function to enable it to do so. It is a
 prerequisite for the e500 mmu notifier implementation.

 Signed-off-by: Alexander Graf ag...@suse.de
 ---
  include/linux/kvm_host.h |1 +
  virt/kvm/kvm_main.c  |   14 ++
  2 files changed, 15 insertions(+), 0 deletions(-)

 diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
 index dbc65f9..2b92928 100644
 --- a/include/linux/kvm_host.h
 +++ b/include/linux/kvm_host.h
 @@ -464,6 +464,7 @@ int kvm_gfn_to_hva_cache_init(struct kvm *kvm, struct 
 gfn_to_hva_cache *ghc,
  int kvm_clear_guest_page(struct kvm *kvm, gfn_t gfn, int offset, int len);
  int kvm_clear_guest(struct kvm *kvm, gpa_t gpa, unsigned long len);
  struct kvm_memory_slot *gfn_to_memslot(struct kvm *kvm, gfn_t gfn);
 +struct kvm_memory_slot *hva_to_memslot(struct kvm *kvm, hva_t hva);
  int kvm_is_visible_gfn(struct kvm *kvm, gfn_t gfn);
  unsigned long kvm_host_page_size(struct kvm *kvm, gfn_t gfn);
  void mark_page_dirty(struct kvm *kvm, gfn_t gfn);
 diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
 index bcf973e..d42591d 100644
 --- a/virt/kvm/kvm_main.c
 +++ b/virt/kvm/kvm_main.c
 @@ -999,6 +999,20 @@ struct kvm_memory_slot *gfn_to_memslot(struct kvm *kvm, 
 gfn_t gfn)
  }
  EXPORT_SYMBOL_GPL(gfn_to_memslot);

 +struct kvm_memory_slot *hva_to_memslot(struct kvm *kvm, hva_t hva)
 +{
 +   struct kvm_memslots *slots = kvm_memslots(kvm);
 +   struct kvm_memory_slot *memslot;
 +
 +   kvm_for_each_memslot(memslot, slots)
 +   if (hva = memslot-userspace_addr 
 + hva  memslot-userspace_addr + memslot-npages)

addr + npages, this doesn't look right

 +   return memslot;
 +
 +   return NULL;
 +}
 +EXPORT_SYMBOL_GPL(hva_to_memslot);

consider also adding a hva_to_gpa wrapper now when you're add it, then
ARM code will be so happy:

bool hva_to_gpa(struct kvm *kvm, unsigned long hva, gpa_t *gpa)
{
struct kvm_memory_slot *memslot;

memslot = hva_to_memslot(kvm, hva);
if (!memslot)
return false;

gpa_t gpa_offset = hva - memslot-userspace_addr;
*gpa = (memslot-base_gfn  PAGE_SHIFT) + gpa_offset;
return true;
}

 +
  int kvm_is_visible_gfn(struct kvm *kvm, gfn_t gfn)
  {
 struct kvm_memory_slot *memslot = gfn_to_memslot(kvm, gfn);
 --
 1.6.0.2

 ___
 kvmarm mailing list
 kvm...@lists.cs.columbia.edu
 https://lists.cs.columbia.edu/cucslists/listinfo/kvmarm
--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html