RE: [PATCH 4/6] KVM: PPC: debug stub interface parameter defined
-Original Message- From: Alexander Graf [mailto:ag...@suse.de] Sent: Monday, September 24, 2012 9:09 PM To: Bhushan Bharat-R65777 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; Bhushan Bharat-R65777 Subject: Re: [PATCH 4/6] KVM: PPC: debug stub interface parameter defined On 21.08.2012, at 15:51, Bharat Bhushan wrote: This patch defines the interface parameter for KVM_SET_GUEST_DEBUG ioctl support. Follow up patches will use this for setting up hardware breakpoints, watchpoints and software breakpoints. Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com --- arch/powerpc/include/asm/kvm.h | 33 + arch/powerpc/kvm/book3s.c |6 ++ arch/powerpc/kvm/booke.c |6 ++ arch/powerpc/kvm/powerpc.c |6 -- 4 files changed, 45 insertions(+), 6 deletions(-) diff --git a/arch/powerpc/include/asm/kvm.h b/arch/powerpc/include/asm/kvm.h index 3c14202..61b197e 100644 --- a/arch/powerpc/include/asm/kvm.h +++ b/arch/powerpc/include/asm/kvm.h @@ -269,8 +269,41 @@ struct kvm_debug_exit_arch { /* for KVM_SET_GUEST_DEBUG */ struct kvm_guest_debug_arch { + struct { + /* H/W breakpoint/watchpoint address */ + __u64 addr; + /* +* Type denotes h/w breakpoint, read watchpoint, write +* watchpoint or watchpoint (both read and write). +*/ +#define KVMPPC_DEBUG_NOTYPE0x0 +#define KVMPPC_DEBUG_BREAKPOINT(1UL 1) +#define KVMPPC_DEBUG_WATCH_WRITE (1UL 2) +#define KVMPPC_DEBUG_WATCH_READ(1UL 3) + __u32 type; + __u32 pad1; Why the padding? Not sure why, I will remove this. + __u64 pad2; + } bp[16]; Why 16? I think for now 6 (4 iac + 2 dac) is sufficient for BOOKE. We kept 16 to have some room for future and other platforms. Thanks -Bharat }; +/* Debug related defines */ +/* + * kvm_guest_debug-control is a 32 bit field. The lower 16 bits are +generic + * and upper 16 bits are architecture specific. Architecture specific +defines + * that ioctl is for setting hardware breakpoint or software breakpoint. + */ +#define KVM_GUESTDBG_USE_SW_BP 0x0001 +#define KVM_GUESTDBG_USE_HW_BP 0x0002 + +/* When setting software breakpoint, Change the software breakpoint + * instruction to special trap instruction and set +KVM_GUESTDBG_USE_SW_BP + * flag in kvm_guest_debug-control. KVM does keep track of software + * breakpoints. So when KVM_GUESTDBG_USE_SW_BP flag is set and +special trap + * instruction is executed by guest then exit to userspace. + * NOTE: A Nice interface can be added to get the special trap instruction. + */ +#define KVMPPC_INST_GUEST_GDB 0x7C00021C /* ehpriv OC=0 */ This definitely has to be passed to user space (which writes that instruction into guest phys memory). Other PPC subarchs will use different instructions. Just model it as a read-only ONE_REG. Alex -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [PATCH 4/6] KVM: PPC: debug stub interface parameter defined
-Original Message- From: Alexander Graf [mailto:ag...@suse.de] Sent: Monday, September 24, 2012 9:09 PM To: Bhushan Bharat-R65777 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; Bhushan Bharat-R65777 Subject: Re: [PATCH 4/6] KVM: PPC: debug stub interface parameter defined On 21.08.2012, at 15:51, Bharat Bhushan wrote: This patch defines the interface parameter for KVM_SET_GUEST_DEBUG ioctl support. Follow up patches will use this for setting up hardware breakpoints, watchpoints and software breakpoints. Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com --- arch/powerpc/include/asm/kvm.h | 33 + arch/powerpc/kvm/book3s.c |6 ++ arch/powerpc/kvm/booke.c |6 ++ arch/powerpc/kvm/powerpc.c |6 -- 4 files changed, 45 insertions(+), 6 deletions(-) diff --git a/arch/powerpc/include/asm/kvm.h b/arch/powerpc/include/asm/kvm.h index 3c14202..61b197e 100644 --- a/arch/powerpc/include/asm/kvm.h +++ b/arch/powerpc/include/asm/kvm.h @@ -269,8 +269,41 @@ struct kvm_debug_exit_arch { /* for KVM_SET_GUEST_DEBUG */ struct kvm_guest_debug_arch { + struct { + /* H/W breakpoint/watchpoint address */ + __u64 addr; + /* +* Type denotes h/w breakpoint, read watchpoint, write +* watchpoint or watchpoint (both read and write). +*/ +#define KVMPPC_DEBUG_NOTYPE0x0 +#define KVMPPC_DEBUG_BREAKPOINT(1UL 1) +#define KVMPPC_DEBUG_WATCH_WRITE (1UL 2) +#define KVMPPC_DEBUG_WATCH_READ(1UL 3) + __u32 type; + __u32 pad1; Why the padding? + __u64 pad2; + } bp[16]; Why 16? }; +/* Debug related defines */ +/* + * kvm_guest_debug-control is a 32 bit field. The lower 16 bits are +generic + * and upper 16 bits are architecture specific. Architecture specific +defines + * that ioctl is for setting hardware breakpoint or software breakpoint. + */ +#define KVM_GUESTDBG_USE_SW_BP 0x0001 +#define KVM_GUESTDBG_USE_HW_BP 0x0002 + +/* When setting software breakpoint, Change the software breakpoint + * instruction to special trap instruction and set +KVM_GUESTDBG_USE_SW_BP + * flag in kvm_guest_debug-control. KVM does keep track of software + * breakpoints. So when KVM_GUESTDBG_USE_SW_BP flag is set and +special trap + * instruction is executed by guest then exit to userspace. + * NOTE: A Nice interface can be added to get the special trap instruction. + */ +#define KVMPPC_INST_GUEST_GDB 0x7C00021C /* ehpriv OC=0 */ This definitely has to be passed to user space (which writes that instruction into guest phys memory). Other PPC subarchs will use different instructions. Just model it as a read-only ONE_REG. Ok. Thanks -Bharat -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [PATCH 6/6] KVM: booke/bookehv: Add debug stub support
-Original Message- From: Alexander Graf [mailto:ag...@suse.de] Sent: Monday, September 24, 2012 9:50 PM To: Bhushan Bharat-R65777 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; Bhushan Bharat-R65777 Subject: Re: [PATCH 6/6] KVM: booke/bookehv: Add debug stub support On 21.08.2012, at 15:52, Bharat Bhushan wrote: This patch adds the debug stub support on booke/bookehv. Now QEMU debug stub can use hw breakpoint, watchpoint and software breakpoint to debug guest. Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com --- arch/powerpc/include/asm/kvm.h| 29 ++- arch/powerpc/include/asm/kvm_host.h |5 + arch/powerpc/kernel/asm-offsets.c | 26 ++ arch/powerpc/kvm/booke.c | 144 + arch/powerpc/kvm/booke_interrupts.S | 110 + arch/powerpc/kvm/bookehv_interrupts.S | 141 +++- arch/powerpc/kvm/e500mc.c |3 +- 7 files changed, 435 insertions(+), 23 deletions(-) diff --git a/arch/powerpc/include/asm/kvm.h b/arch/powerpc/include/asm/kvm.h index 61b197e..53479ea 100644 --- a/arch/powerpc/include/asm/kvm.h +++ b/arch/powerpc/include/asm/kvm.h @@ -25,6 +25,7 @@ /* Select powerpc specific features in linux/kvm.h */ #define __KVM_HAVE_SPAPR_TCE #define __KVM_HAVE_PPC_SMT +#define __KVM_HAVE_GUEST_DEBUG struct kvm_regs { __u64 pc; @@ -264,7 +265,31 @@ struct kvm_fpu { __u64 fpr[32]; }; + +/* + * Defines for h/w breakpoint, watchpoint (read, write or both) and + * software breakpoint. + * These are used as type in KVM_SET_GUEST_DEBUG ioctl and status + * for KVM_DEBUG_EXIT. + */ +#define KVMPPC_DEBUG_NONE 0x0 +#define KVMPPC_DEBUG_BREAKPOINT(1UL 1) +#define KVMPPC_DEBUG_WATCH_WRITE (1UL 2) +#define KVMPPC_DEBUG_WATCH_READ(1UL 3) struct kvm_debug_exit_arch { + __u64 pc; + /* +* exception - returns the exception number. If the KVM_DEBUG_EXIT +* exit is not handled (say not h/w breakpoint or software breakpoint +* set for this address) by qemu then it is supposed to inject this +* exception to guest. +*/ + __u32 exception; + /* +* exiting to userspace because of h/w breakpoint, watchpoint +* (read, write or both) and software breakpoint. +*/ + __u32 status; }; /* for KVM_SET_GUEST_DEBUG */ @@ -276,10 +301,6 @@ struct kvm_guest_debug_arch { * Type denotes h/w breakpoint, read watchpoint, write * watchpoint or watchpoint (both read and write). */ -#define KVMPPC_DEBUG_NOTYPE0x0 -#define KVMPPC_DEBUG_BREAKPOINT(1UL 1) -#define KVMPPC_DEBUG_WATCH_WRITE (1UL 2) -#define KVMPPC_DEBUG_WATCH_READ(1UL 3) __u32 type; __u32 pad1; __u64 pad2; diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h index c7219c1..3ba465a 100644 --- a/arch/powerpc/include/asm/kvm_host.h +++ b/arch/powerpc/include/asm/kvm_host.h @@ -496,7 +496,12 @@ struct kvm_vcpu_arch { u32 mmucfg; u32 epr; u32 crit_save; + /* guest debug registers*/ struct kvmppc_booke_debug_reg dbg_reg; + /* shadow debug registers */ + struct kvmppc_booke_debug_reg shadow_dbg_reg; + /* host debug registers*/ + struct kvmppc_booke_debug_reg host_dbg_reg; #endif gpa_t paddr_accessed; gva_t vaddr_accessed; diff --git a/arch/powerpc/kernel/asm-offsets.c b/arch/powerpc/kernel/asm- offsets.c index 555448e..6987821 100644 --- a/arch/powerpc/kernel/asm-offsets.c +++ b/arch/powerpc/kernel/asm-offsets.c @@ -564,6 +564,32 @@ int main(void) DEFINE(VCPU_FAULT_DEAR, offsetof(struct kvm_vcpu, arch.fault_dear)); DEFINE(VCPU_FAULT_ESR, offsetof(struct kvm_vcpu, arch.fault_esr)); DEFINE(VCPU_CRIT_SAVE, offsetof(struct kvm_vcpu, arch.crit_save)); + DEFINE(VCPU_DBSR, offsetof(struct kvm_vcpu, arch.dbsr)); + DEFINE(VCPU_SHADOW_DBG, offsetof(struct kvm_vcpu, arch.shadow_dbg_reg)); + DEFINE(VCPU_HOST_DBG, offsetof(struct kvm_vcpu, arch.host_dbg_reg)); + DEFINE(KVMPPC_DBG_DBCR0, offsetof(struct kvmppc_booke_debug_reg, + dbcr0)); + DEFINE(KVMPPC_DBG_DBCR1, offsetof(struct kvmppc_booke_debug_reg, + dbcr1)); + DEFINE(KVMPPC_DBG_DBCR2, offsetof(struct kvmppc_booke_debug_reg, + dbcr2)); +#ifdef CONFIG_KVM_E500MC + DEFINE(KVMPPC_DBG_DBCR4, offsetof(struct kvmppc_booke_debug_reg, + dbcr4)); +#endif + DEFINE(KVMPPC_DBG_IAC1, offsetof(struct kvmppc_booke_debug_reg, +iac[0])); + DEFINE(KVMPPC_DBG_IAC2, offsetof(struct kvmppc_booke_debug_reg
RE: [PATCH 6/6] KVM: booke/bookehv: Add debug stub support
-Original Message- From: Alexander Graf [mailto:ag...@suse.de] Sent: Thursday, October 04, 2012 4:56 PM To: Bhushan Bharat-R65777 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org Subject: Re: [PATCH 6/6] KVM: booke/bookehv: Add debug stub support On 04.10.2012, at 13:06, Bhushan Bharat-R65777 wrote: -Original Message- From: Alexander Graf [mailto:ag...@suse.de] Sent: Monday, September 24, 2012 9:50 PM To: Bhushan Bharat-R65777 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; Bhushan Bharat-R65777 Subject: Re: [PATCH 6/6] KVM: booke/bookehv: Add debug stub support On 21.08.2012, at 15:52, Bharat Bhushan wrote: This patch adds the debug stub support on booke/bookehv. Now QEMU debug stub can use hw breakpoint, watchpoint and software breakpoint to debug guest. Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com --- arch/powerpc/include/asm/kvm.h| 29 ++- arch/powerpc/include/asm/kvm_host.h |5 + arch/powerpc/kernel/asm-offsets.c | 26 ++ arch/powerpc/kvm/booke.c | 144 +-- -- arch/powerpc/kvm/booke_interrupts.S | 110 + arch/powerpc/kvm/bookehv_interrupts.S | 141 +++- arch/powerpc/kvm/e500mc.c |3 +- 7 files changed, 435 insertions(+), 23 deletions(-) diff --git a/arch/powerpc/include/asm/kvm.h b/arch/powerpc/include/asm/kvm.h index 61b197e..53479ea 100644 --- a/arch/powerpc/include/asm/kvm.h +++ b/arch/powerpc/include/asm/kvm.h @@ -25,6 +25,7 @@ /* Select powerpc specific features in linux/kvm.h */ #define __KVM_HAVE_SPAPR_TCE #define __KVM_HAVE_PPC_SMT +#define __KVM_HAVE_GUEST_DEBUG struct kvm_regs { __u64 pc; @@ -264,7 +265,31 @@ struct kvm_fpu { __u64 fpr[32]; }; + +/* + * Defines for h/w breakpoint, watchpoint (read, write or both) and + * software breakpoint. + * These are used as type in KVM_SET_GUEST_DEBUG ioctl and status + * for KVM_DEBUG_EXIT. + */ +#define KVMPPC_DEBUG_NONE0x0 +#define KVMPPC_DEBUG_BREAKPOINT (1UL 1) +#define KVMPPC_DEBUG_WATCH_WRITE (1UL 2) +#define KVMPPC_DEBUG_WATCH_READ (1UL 3) struct kvm_debug_exit_arch { + __u64 pc; + /* + * exception - returns the exception number. If the KVM_DEBUG_EXIT + * exit is not handled (say not h/w breakpoint or software breakpoint + * set for this address) by qemu then it is supposed to inject this + * exception to guest. + */ + __u32 exception; + /* + * exiting to userspace because of h/w breakpoint, watchpoint + * (read, write or both) and software breakpoint. + */ + __u32 status; }; /* for KVM_SET_GUEST_DEBUG */ @@ -276,10 +301,6 @@ struct kvm_guest_debug_arch { * Type denotes h/w breakpoint, read watchpoint, write * watchpoint or watchpoint (both read and write). */ -#define KVMPPC_DEBUG_NOTYPE 0x0 -#define KVMPPC_DEBUG_BREAKPOINT (1UL 1) -#define KVMPPC_DEBUG_WATCH_WRITE (1UL 2) -#define KVMPPC_DEBUG_WATCH_READ (1UL 3) __u32 type; __u32 pad1; __u64 pad2; diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h index c7219c1..3ba465a 100644 --- a/arch/powerpc/include/asm/kvm_host.h +++ b/arch/powerpc/include/asm/kvm_host.h @@ -496,7 +496,12 @@ struct kvm_vcpu_arch { u32 mmucfg; u32 epr; u32 crit_save; + /* guest debug registers*/ struct kvmppc_booke_debug_reg dbg_reg; + /* shadow debug registers */ + struct kvmppc_booke_debug_reg shadow_dbg_reg; + /* host debug registers*/ + struct kvmppc_booke_debug_reg host_dbg_reg; #endif gpa_t paddr_accessed; gva_t vaddr_accessed; diff --git a/arch/powerpc/kernel/asm-offsets.c b/arch/powerpc/kernel/asm- offsets.c index 555448e..6987821 100644 --- a/arch/powerpc/kernel/asm-offsets.c +++ b/arch/powerpc/kernel/asm-offsets.c @@ -564,6 +564,32 @@ int main(void) DEFINE(VCPU_FAULT_DEAR, offsetof(struct kvm_vcpu, arch.fault_dear)); DEFINE(VCPU_FAULT_ESR, offsetof(struct kvm_vcpu, arch.fault_esr)); DEFINE(VCPU_CRIT_SAVE, offsetof(struct kvm_vcpu, arch.crit_save)); + DEFINE(VCPU_DBSR, offsetof(struct kvm_vcpu, arch.dbsr)); + DEFINE(VCPU_SHADOW_DBG, offsetof(struct kvm_vcpu, arch.shadow_dbg_reg)); + DEFINE(VCPU_HOST_DBG, offsetof(struct kvm_vcpu, arch.host_dbg_reg)); + DEFINE(KVMPPC_DBG_DBCR0, offsetof(struct kvmppc_booke_debug_reg, + dbcr0)); + DEFINE(KVMPPC_DBG_DBCR1, offsetof(struct kvmppc_booke_debug_reg, + dbcr1)); + DEFINE(KVMPPC_DBG_DBCR2, offsetof(struct kvmppc_booke_debug_reg, + dbcr2)); +#ifdef CONFIG_KVM_E500MC + DEFINE(KVMPPC_DBG_DBCR4, offsetof(struct kvmppc_booke_debug_reg
RE: [PATCH 6/6] KVM: booke/bookehv: Add debug stub support
-static int emulation_exit(struct kvm_run *run, struct kvm_vcpu *vcpu) +static int emulation_exit(struct kvm_run *run, struct kvm_vcpu *vcpu, + int exit_nr) { enum emulation_result er; + if (unlikely(vcpu-guest_debug KVM_GUESTDBG_USE_SW_BP) +vcpu-arch.last_inst == KVMPPC_INST_GUEST_GDB) { This belongs into the normal emulation code path, behind the same switch() that everything else goes through. I am not sure I understood correctly. Below is the reason why I placed this code here. Instruction where software breakpoint is to be set is replaced by ehpriv instruction. On e500v2, this is not a valid instruction can causes program interrupt. On e500mc, ehpriv is a valid instruction. Both the exit path calls emulation_exit(), so we have placed the code in this function. Do you want this code to be moved in program interrupt exit path for e500v2 and BOOKE_INTERRUPT_HV_PRIV for e500mc? Ok, in this patch you do (basically): int emulation_exit() { if (inst == DEBUG_INST) { debug_stuff(); return; } switch (inst) { case INST_A: foo(); } } Are not we doing something like this: int emulation_exit() { if (inst == DEBUG_INST) { debug_stuff(); return; } status = kvmppc_emulate_instruction() switch (status) { case FAIL: foo(); case DONE: foo1(); } } Do you want something like this: int emulation_exit() { status = kvmppc_emulate_instruction() switch (status) { case FAIL: if (inst == DEBUG_INST) { debug_stuff(); return; } foo(); case DONE: foo1(); } } No, I want the DEBUG_INST be handled the same as any other instruction we emulate. I would like to understand what you are thinking: What I derived is , add the instruction in kvmppc_emulate_instruction() (or its child function) which, 1) fill the relevant information in run- , kvmppc_account_exit(vcpu, DEBUG_EXITS); and returns EMULATION_DONE And in emulation_exit() status = kvmppc_emulate_instruction() switch (status) { case EMULATION_DONE: if (inst == DEBUG) return RESUME_HOST; } Or 2) kvmppc_account_exit(vcpu, DEBUG_EXITS); returns EMULATION_DONE; And in emulation_exit() status = kvmppc_emulate_instruction() switch (status) { case EMULATION_DONE: if (inst == DEBUG) { fill run- return RESUME_HOST; } } Or 3) kvmppc_account_exit(vcpu, DEBUG_EXITS); returns a new status type (EMULATION_DEBUG_INST) And in emulation_exit() status = kvmppc_emulate_instruction() switch (status) { case EMULATION_DEBUG_INST: fill run- return RESUME_HOST; } what I want is: int emulation_exit() { switch (inst) { case INST_A: foo(); break; case DEBUG_INST: debug_stuff(); break; } } + run-exit_reason = KVM_EXIT_DEBUG; + run-debug.arch.pc = vcpu-arch.pc; + run-debug.arch.exception = exit_nr; + run-debug.arch.status = 0; + kvmppc_account_exit(vcpu, DEBUG_EXITS); + return RESUME_HOST; + } + er = kvmppc_emulate_instruction(run, vcpu); switch (er) { case EMULATE_DONE: @@ -697,6 +711,44 @@ static int emulation_exit(struct kvm_run *run, struct kvm_vcpu *vcpu) default: BUG(); } + + if (unlikely(vcpu-guest_debug KVM_GUESTDBG_ENABLE) + (vcpu-guest_debug KVM_GUESTDBG_SINGLESTEP)) { I don't understand how this is supposed to work. When we enable singlestep, why would we end up in emulation_exit()? When singlestep is enabled then we set DBCR0[ICMP] and the debug handler should be able to handle this. I think you are right. + run-exit_reason = KVM_EXIT_DEBUG; + return RESUME_HOST; + } +} + +static int kvmppc_handle_debug(struct kvm_run *run, struct +kvm_vcpu +*vcpu) { + u32 dbsr; + +#ifndef CONFIG_KVM_BOOKE_HV + if (cpu_has_feature(CPU_FTR_DEBUG_LVL_EXC)) + vcpu-arch.pc = mfspr(SPRN_DSRR0); + else + vcpu-arch.pc = mfspr(SPRN_CSRR0); #endif Why doesn't this get handled in the asm code that recovers from the respective exceptions? Yes. I will remove this. + dbsr = vcpu-arch.dbsr; + + run-debug.arch.pc = vcpu-arch.pc; +
RE: [PATCH 6/6] KVM: booke/bookehv: Add debug stub support
-Original Message- From: Wood Scott-B07421 Sent: Thursday, September 06, 2012 4:57 AM To: Bhushan Bharat-R65777 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de; Bhushan Bharat- R65777 Subject: Re: [PATCH 6/6] KVM: booke/bookehv: Add debug stub support On 09/05/2012 06:23 PM, Scott Wood wrote: On 08/21/2012 08:52 AM, Bharat Bhushan wrote: This patch adds the debug stub support on booke/bookehv. Now QEMU debug stub can use hw breakpoint, watchpoint and software breakpoint to debug guest. Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com --- arch/powerpc/include/asm/kvm.h| 29 ++- arch/powerpc/include/asm/kvm_host.h |5 + arch/powerpc/kernel/asm-offsets.c | 26 ++ arch/powerpc/kvm/booke.c | 144 +-- -- arch/powerpc/kvm/booke_interrupts.S | 110 + arch/powerpc/kvm/bookehv_interrupts.S | 141 +++- arch/powerpc/kvm/e500mc.c |3 +- 7 files changed, 435 insertions(+), 23 deletions(-) diff --git a/arch/powerpc/include/asm/kvm.h b/arch/powerpc/include/asm/kvm.h index 61b197e..53479ea 100644 --- a/arch/powerpc/include/asm/kvm.h +++ b/arch/powerpc/include/asm/kvm.h @@ -25,6 +25,7 @@ /* Select powerpc specific features in linux/kvm.h */ #define __KVM_HAVE_SPAPR_TCE #define __KVM_HAVE_PPC_SMT +#define __KVM_HAVE_GUEST_DEBUG struct kvm_regs { __u64 pc; @@ -264,7 +265,31 @@ struct kvm_fpu { __u64 fpr[32]; }; + +/* + * Defines for h/w breakpoint, watchpoint (read, write or both) and + * software breakpoint. + * These are used as type in KVM_SET_GUEST_DEBUG ioctl and status + * for KVM_DEBUG_EXIT. + */ +#define KVMPPC_DEBUG_NONE 0x0 +#define KVMPPC_DEBUG_BREAKPOINT (1UL 1) +#define KVMPPC_DEBUG_WATCH_WRITE (1UL 2) +#define KVMPPC_DEBUG_WATCH_READ (1UL 3) struct kvm_debug_exit_arch { That says arch, but it's not in an arch-specific file. Sigh, I can't read today apparently. + __u64 pc; + /* + * exception - returns the exception number. If the KVM_DEBUG_EXIT + * exit is not handled (say not h/w breakpoint or software breakpoint + * set for this address) by qemu then it is supposed to inject this + * exception to guest. + */ + __u32 exception; + /* + * exiting to userspace because of h/w breakpoint, watchpoint + * (read, write or both) and software breakpoint. + */ + __u32 status; }; What does exception number mean in a generic API? Still, exception number is not a well-defined concept powerpc-wide. Just for background why we added is that, on x86 this exception number is used to inject the exception to guest if QEMU is not able to handle the debug exception. Should we just through a print with clearing the exception condition? Or something else you would like to suggest? Thanks -Bharat
RE: [PATCH 2/2] KVM: PPC: booke/bookehv: Add guest debug support
diff --git a/arch/powerpc/include/asm/kvm.h b/arch/powerpc/include/asm/kvm.h index 3c14202..da71c84 100644 --- a/arch/powerpc/include/asm/kvm.h +++ b/arch/powerpc/include/asm/kvm.h @@ -25,6 +25,7 @@ /* Select powerpc specific features in linux/kvm.h */ #define __KVM_HAVE_SPAPR_TCE #define __KVM_HAVE_PPC_SMT +#define __KVM_HAVE_GUEST_DEBUG struct kvm_regs { __u64 pc; @@ -265,10 +266,19 @@ struct kvm_fpu { }; struct kvm_debug_exit_arch { + __u32 exception; + __u32 pc; + __u32 status; }; PC must be 64-bit. What goes in status and exception? status - exit because of h/w breakpoint, watchpoint (read, write or both) and software breakpoint. exception - returns the exception number. If the exit is not handled (say not h/w breakpoint or software breakpoint set for this address) by qemu then it is supposed to inject the exception to guest. This is how it is implemented for x86. ok /* for KVM_SET_GUEST_DEBUG */ struct kvm_guest_debug_arch { + struct { + __u64 addr; + __u32 type; + __u32 pad1; + __u64 pad2; + } bp[16]; }; What goes in type? Type denote breakpoint, read watchpoint, write watchpoint or watchpoint (both read and write). Will adding a comment to describe this is ok? Yes, please make sure all of this is well documented. /* definition of registers in kvm_run */ @@ -285,6 +295,17 @@ struct kvm_sync_regs { #define KVM_CPU_3S_644 #define KVM_CPU_E500MC 5 +/* Debug related defines */ +#define KVM_INST_GUESTGDB 0x7C00021C /* ehpriv OC=0 */ Will this work on all PPC? It certainly won't work on other architectures, so at a minimum it's KVM_PPC_INST_GUEST_GDB, but maybe it needs to be determined at runtime. How to determine at run time? adding another ioctl ? Or extend an existing one. Is there any other information about debug capabilities that you expose -- number of hardware breakpoints supported, etc +#define KVM_GUESTDBG_USE_SW_BP 0x0001 +#define KVM_GUESTDBG_USE_HW_BP 0x0002 Where do these get used? Any reason for these particular values? If you're trying to create a partition where the upper half is generic and the lower half is arch-specific, say so. KVM_SET_GUEST_DEBUG ioctl used to set/unset debug interrupts, which have a u32 control element. We have inherited this mechanism from x86 implementation and it looks like lower 16 bits are generic (like KVM_GUESTDBG_ENBLE, KVM_GUESTDBG_SINGLESTEP etc and upper 16 bits are Architecture specific. I will add a comment to describe this. I don't think the sw/hw distinction belongs here -- it should be per breakpoint. KVM does not track the software breakpoint, so it is not per breakpoint. In KVM, when KVM_GUESTDBG_USE_SW_BP flag is set and special trap instruction is executed by guest then exit to userspace. + run-exit_reason = KVM_EXIT_DEBUG; + run-debug.arch.pc = vcpu-arch.pc; + run-debug.arch.exception = exit_nr; + run-debug.arch.status = 0; + kvmppc_account_exit(vcpu, DEBUG_EXITS); + return RESUME_HOST; The interface isn't (clearly labelled as) booke specific, but you return booke- specific exception numbers. How's userspace supposed to know what to do with them? What do you plan on doing with them in QEMU? This is booke specific. Then put booke in the name, Which data structure name should have booke? but what about it really needs to be booke specific? Why does QEMU care about the exception type? Explained above. Thanks -Bharat +#ifndef CONFIG_PPC_FSL_BOOK3E + PPC_LD(r7, VCPU_HOST_DBG+KVMPPC_DBG_IAC3, r4) + PPC_LD(r8, VCPU_HOST_DBG+KVMPPC_DBG_IAC4, r4) + mtspr SPRN_IAC3, r7 + mtspr SPRN_IAC4, r8 +#endif Can you handle this at runtime with a feature section? Why you want this to make run time? Removing config_ ? Currently KVM hardcodes the target hardware in a way that is unacceptable in much of the rest of the kernel. We have a long term goal to stop doing that, and we should avoid making it worse by adding random ifdefs for specific CPUs. -Scott
RE: [PATCH 2/3] KVM: PPC: booke: Allow multiple exception types
-Original Message- From: kvm-ppc-ow...@vger.kernel.org [mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of Alexander Graf Sent: Tuesday, August 07, 2012 4:16 PM To: Bhushan Bharat-R65777 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; Bhushan Bharat-R65777 Subject: Re: [PATCH 2/3] KVM: PPC: booke: Allow multiple exception types On 03.08.2012, at 09:08, Bharat Bhushan wrote: Current kvmppc_booke_handlers uses the same macro (KVM_HANDLER) and all handlers are considered to be the same size. This will not be the case if we want to use different macros for different handlers. This patch improves the kvmppc_booke_handler so that it can support different macros for different handlers. Signed-off-by: Liu Yu yu@freescale.com [bharat.bhus...@freescale.com: Substantial changes] Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com --- arch/powerpc/include/asm/kvm_ppc.h |2 - arch/powerpc/kvm/booke.c|9 --- arch/powerpc/kvm/booke.h|1 + arch/powerpc/kvm/booke_interrupts.S | 37 -- arch/powerpc/kvm/e500.c | 13 +++ 5 files changed, 48 insertions(+), 14 deletions(-) diff --git a/arch/powerpc/include/asm/kvm_ppc.h b/arch/powerpc/include/asm/kvm_ppc.h index 1438a5e..deb55bd 100644 --- a/arch/powerpc/include/asm/kvm_ppc.h +++ b/arch/powerpc/include/asm/kvm_ppc.h @@ -47,8 +47,6 @@ enum emulation_result { extern int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu); extern int __kvmppc_vcpu_run(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu); -extern char kvmppc_handlers_start[]; -extern unsigned long kvmppc_handler_len; extern void kvmppc_handler_highmem(void); extern void kvmppc_dump_vcpu(struct kvm_vcpu *vcpu); diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c index aebbb8b..1338233 100644 --- a/arch/powerpc/kvm/booke.c +++ b/arch/powerpc/kvm/booke.c @@ -1541,6 +1541,7 @@ int __init kvmppc_booke_init(void) { #ifndef CONFIG_KVM_BOOKE_HV unsigned long ivor[16]; + unsigned long *handler = kvmppc_booke_handler_addr; unsigned long max_ivor = 0; int i; @@ -1574,14 +1575,14 @@ int __init kvmppc_booke_init(void) for (i = 0; i 16; i++) { if (ivor[i] max_ivor) - max_ivor = ivor[i]; + max_ivor = i; memcpy((void *)kvmppc_booke_handlers + ivor[i], - kvmppc_handlers_start + i * kvmppc_handler_len, - kvmppc_handler_len); + (void *)handler[i], handler[i + 1] - handler[i]); } flush_icache_range(kvmppc_booke_handlers, - kvmppc_booke_handlers + max_ivor + kvmppc_handler_len); + kvmppc_booke_handlers + ivor[max_ivor] + + handler[max_ivor + 1] - handler[max_ivor]); #endif /* !BOOKE_HV */ return 0; } diff --git a/arch/powerpc/kvm/booke.h b/arch/powerpc/kvm/booke.h index ba61974..de9e526 100644 --- a/arch/powerpc/kvm/booke.h +++ b/arch/powerpc/kvm/booke.h @@ -65,6 +65,7 @@ (1 BOOKE_IRQPRIO_CRITICAL)) extern unsigned long kvmppc_booke_handlers; +extern unsigned long kvmppc_booke_handler_addr[]; void kvmppc_set_msr(struct kvm_vcpu *vcpu, u32 new_msr); void kvmppc_mmu_msr_notify(struct kvm_vcpu *vcpu, u32 old_msr); diff --git a/arch/powerpc/kvm/booke_interrupts.S b/arch/powerpc/kvm/booke_interrupts.S index bb46b32..3539805 100644 --- a/arch/powerpc/kvm/booke_interrupts.S +++ b/arch/powerpc/kvm/booke_interrupts.S @@ -73,6 +73,14 @@ _GLOBAL(kvmppc_handler_\ivor_nr) bctr .endm +.macro KVM_HANDLER_ADDR ivor_nr + .long kvmppc_handler_\ivor_nr +.endm + +.macro KVM_HANDLER_END + .long kvmppc_handlers_end +.endm + _GLOBAL(kvmppc_handlers_start) KVM_HANDLER BOOKE_INTERRUPT_CRITICAL SPRN_SPRG_RSCRATCH_CRIT SPRN_CSRR0 KVM_HANDLER BOOKE_INTERRUPT_MACHINE_CHECK SPRN_SPRG_RSCRATCH_MC SPRN_MCSRR0 @@ -93,9 +101,7 @@ KVM_HANDLER BOOKE_INTERRUPT_DEBUG SPRN_SPRG_RSCRATCH_CRIT SPRN_CSRR0 KVM_HANDLER BOOKE_INTERRUPT_SPE_UNAVAIL SPRN_SPRG_RSCRATCH0 SPRN_SRR0 KVM_HANDLER BOOKE_INTERRUPT_SPE_FP_DATA SPRN_SPRG_RSCRATCH0 SPRN_SRR0 KVM_HANDLER BOOKE_INTERRUPT_SPE_FP_ROUND SPRN_SPRG_RSCRATCH0 SPRN_SRR0 - -_GLOBAL(kvmppc_handler_len) - .long kvmppc_handler_1 - kvmppc_handler_0 +_GLOBAL(kvmppc_handlers_end) /* Registers: * SPRG_SCRATCH0: guest r4 @@ -463,6 +469,31 @@ lightweight_exit: lwz r4, VCPU_GPR(R4)(r4) rfi + .data + .align 4 + .globl kvmppc_booke_handler_addr +kvmppc_booke_handler_addr: +KVM_HANDLER_ADDR BOOKE_INTERRUPT_CRITICAL +KVM_HANDLER_ADDR BOOKE_INTERRUPT_MACHINE_CHECK +KVM_HANDLER_ADDR BOOKE_INTERRUPT_DATA_STORAGE +KVM_HANDLER_ADDR BOOKE_INTERRUPT_INST_STORAGE +KVM_HANDLER_ADDR
RE: [PATCH 3/3] KVM: PPC: booke: Added debug handler
-Original Message- From: kvm-ppc-ow...@vger.kernel.org [mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of Alexander Graf Sent: Wednesday, August 08, 2012 4:41 PM To: Bhushan Bharat-R65777 Cc: Wood Scott-B07421; kvm-ppc@vger.kernel.org; k...@vger.kernel.org Subject: Re: [PATCH 3/3] KVM: PPC: booke: Added debug handler On 08.08.2012, at 03:02, Bhushan Bharat-R65777 wrote: -Original Message- From: Wood Scott-B07421 Sent: Wednesday, August 08, 2012 2:15 AM To: Alexander Graf Cc: Bhushan Bharat-R65777; kvm-ppc@vger.kernel.org; k...@vger.kernel.org; Bhushan Bharat-R65777 Subject: Re: [PATCH 3/3] KVM: PPC: booke: Added debug handler On 08/07/2012 05:47 AM, Alexander Graf wrote: diff --git a/arch/powerpc/kvm/booke_interrupts.S b/arch/powerpc/kvm/booke_interrupts.S index 3539805..890673c 100644 --- a/arch/powerpc/kvm/booke_interrupts.S +++ b/arch/powerpc/kvm/booke_interrupts.S @@ -73,6 +73,51 @@ _GLOBAL(kvmppc_handler_\ivor_nr) bctr .endm +.macro KVM_DBG_HANDLER ivor_nr scratch srr0 This is a lot of asm code. Any chance to share a good share of it with the generic handler? That entire file could use an update to lok more like bookehv_interrupts.S and its use of asm macros. In booke there is assumption that size of KVM IVORs will not me more than host IVORs size so that only IVPR is changed. I tried to give it that shape of bookehv_interrupts.S and found that size of some IVORs become more than host IVORs. We can always jump off to another (more generic?) function and only have a small stub in the IVOR referenced code. What extra KVM_DBG_HANDLER have from KVM_HANDLER is the handing of debug single step (which is similar to host). So do you want a jump in assembly for handling the debug single step? Or you really think of moving something from the KVM_HANDLER to more generic? Thanks -Bharat -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [PATCH 2/3] KVM: PPC: booke: Allow multiple exception types
-Original Message- From: kvm-ow...@vger.kernel.org [mailto:kvm-ow...@vger.kernel.org] On Behalf Of Alexander Graf Sent: Friday, August 10, 2012 2:49 PM To: Bhushan Bharat-R65777 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org Subject: Re: [PATCH 2/3] KVM: PPC: booke: Allow multiple exception types On 10.08.2012, at 08:38, Bhushan Bharat-R65777 wrote: -Original Message- From: kvm-ppc-ow...@vger.kernel.org [mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of Alexander Graf Sent: Tuesday, August 07, 2012 4:16 PM To: Bhushan Bharat-R65777 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; Bhushan Bharat-R65777 Subject: Re: [PATCH 2/3] KVM: PPC: booke: Allow multiple exception types On 03.08.2012, at 09:08, Bharat Bhushan wrote: Current kvmppc_booke_handlers uses the same macro (KVM_HANDLER) and all handlers are considered to be the same size. This will not be the case if we want to use different macros for different handlers. This patch improves the kvmppc_booke_handler so that it can support different macros for different handlers. Signed-off-by: Liu Yu yu@freescale.com [bharat.bhus...@freescale.com: Substantial changes] Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com --- arch/powerpc/include/asm/kvm_ppc.h |2 - arch/powerpc/kvm/booke.c|9 --- arch/powerpc/kvm/booke.h|1 + arch/powerpc/kvm/booke_interrupts.S | 37 - - arch/powerpc/kvm/e500.c | 13 +++ 5 files changed, 48 insertions(+), 14 deletions(-) diff --git a/arch/powerpc/include/asm/kvm_ppc.h b/arch/powerpc/include/asm/kvm_ppc.h index 1438a5e..deb55bd 100644 --- a/arch/powerpc/include/asm/kvm_ppc.h +++ b/arch/powerpc/include/asm/kvm_ppc.h @@ -47,8 +47,6 @@ enum emulation_result { extern int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu); extern int __kvmppc_vcpu_run(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu); -extern char kvmppc_handlers_start[]; -extern unsigned long kvmppc_handler_len; extern void kvmppc_handler_highmem(void); extern void kvmppc_dump_vcpu(struct kvm_vcpu *vcpu); diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c index aebbb8b..1338233 100644 --- a/arch/powerpc/kvm/booke.c +++ b/arch/powerpc/kvm/booke.c @@ -1541,6 +1541,7 @@ int __init kvmppc_booke_init(void) { #ifndef CONFIG_KVM_BOOKE_HV unsigned long ivor[16]; + unsigned long *handler = kvmppc_booke_handler_addr; unsigned long max_ivor = 0; int i; @@ -1574,14 +1575,14 @@ int __init kvmppc_booke_init(void) for (i = 0; i 16; i++) { if (ivor[i] max_ivor) - max_ivor = ivor[i]; + max_ivor = i; memcpy((void *)kvmppc_booke_handlers + ivor[i], -kvmppc_handlers_start + i * kvmppc_handler_len, -kvmppc_handler_len); +(void *)handler[i], handler[i + 1] - handler[i]); } flush_icache_range(kvmppc_booke_handlers, -kvmppc_booke_handlers + max_ivor + kvmppc_handler_len); +kvmppc_booke_handlers + ivor[max_ivor] + + handler[max_ivor + 1] - +handler[max_ivor]); #endif /* !BOOKE_HV */ return 0; } diff --git a/arch/powerpc/kvm/booke.h b/arch/powerpc/kvm/booke.h index ba61974..de9e526 100644 --- a/arch/powerpc/kvm/booke.h +++ b/arch/powerpc/kvm/booke.h @@ -65,6 +65,7 @@ (1 BOOKE_IRQPRIO_CRITICAL)) extern unsigned long kvmppc_booke_handlers; +extern unsigned long kvmppc_booke_handler_addr[]; void kvmppc_set_msr(struct kvm_vcpu *vcpu, u32 new_msr); void kvmppc_mmu_msr_notify(struct kvm_vcpu *vcpu, u32 old_msr); diff --git a/arch/powerpc/kvm/booke_interrupts.S b/arch/powerpc/kvm/booke_interrupts.S index bb46b32..3539805 100644 --- a/arch/powerpc/kvm/booke_interrupts.S +++ b/arch/powerpc/kvm/booke_interrupts.S @@ -73,6 +73,14 @@ _GLOBAL(kvmppc_handler_\ivor_nr) bctr .endm +.macro KVM_HANDLER_ADDR ivor_nr + .long kvmppc_handler_\ivor_nr +.endm + +.macro KVM_HANDLER_END + .long kvmppc_handlers_end +.endm + _GLOBAL(kvmppc_handlers_start) KVM_HANDLER BOOKE_INTERRUPT_CRITICAL SPRN_SPRG_RSCRATCH_CRIT SPRN_CSRR0 KVM_HANDLER BOOKE_INTERRUPT_MACHINE_CHECK SPRN_SPRG_RSCRATCH_MC SPRN_MCSRR0 @@ -93,9 +101,7 @@ KVM_HANDLER BOOKE_INTERRUPT_DEBUG SPRN_SPRG_RSCRATCH_CRIT SPRN_CSRR0 KVM_HANDLER BOOKE_INTERRUPT_SPE_UNAVAIL SPRN_SPRG_RSCRATCH0 SPRN_SRR0 KVM_HANDLER BOOKE_INTERRUPT_SPE_FP_DATA SPRN_SPRG_RSCRATCH0 SPRN_SRR0 KVM_HANDLER BOOKE_INTERRUPT_SPE_FP_ROUND SPRN_SPRG_RSCRATCH0 SPRN_SRR0 - -_GLOBAL(kvmppc_handler_len) - .long kvmppc_handler_1 - kvmppc_handler_0 +_GLOBAL(kvmppc_handlers_end) /* Registers: * SPRG_SCRATCH0: guest r4 @@ -463,6 +469,31 @@ lightweight_exit: lwz r4
RE: [PATCH 2/3] KVM: PPC: booke: Allow multiple exception types
-Original Message- From: Alexander Graf [mailto:ag...@suse.de] Sent: Friday, August 10, 2012 4:27 PM To: Bhushan Bharat-R65777 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org Subject: Re: [PATCH 2/3] KVM: PPC: booke: Allow multiple exception types On 10.08.2012, at 12:41, Bhushan Bharat-R65777 wrote: -Original Message- From: kvm-ow...@vger.kernel.org [mailto:kvm-ow...@vger.kernel.org] On Behalf Of Alexander Graf Sent: Friday, August 10, 2012 2:49 PM To: Bhushan Bharat-R65777 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org Subject: Re: [PATCH 2/3] KVM: PPC: booke: Allow multiple exception types On 10.08.2012, at 08:38, Bhushan Bharat-R65777 wrote: -Original Message- From: kvm-ppc-ow...@vger.kernel.org [mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of Alexander Graf Sent: Tuesday, August 07, 2012 4:16 PM To: Bhushan Bharat-R65777 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; Bhushan Bharat-R65777 Subject: Re: [PATCH 2/3] KVM: PPC: booke: Allow multiple exception types On 03.08.2012, at 09:08, Bharat Bhushan wrote: Current kvmppc_booke_handlers uses the same macro (KVM_HANDLER) and all handlers are considered to be the same size. This will not be the case if we want to use different macros for different handlers. This patch improves the kvmppc_booke_handler so that it can support different macros for different handlers. Signed-off-by: Liu Yu yu@freescale.com [bharat.bhus...@freescale.com: Substantial changes] Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com --- arch/powerpc/include/asm/kvm_ppc.h |2 - arch/powerpc/kvm/booke.c|9 --- arch/powerpc/kvm/booke.h|1 + arch/powerpc/kvm/booke_interrupts.S | 37 - - arch/powerpc/kvm/e500.c | 13 +++ 5 files changed, 48 insertions(+), 14 deletions(-) diff --git a/arch/powerpc/include/asm/kvm_ppc.h b/arch/powerpc/include/asm/kvm_ppc.h index 1438a5e..deb55bd 100644 --- a/arch/powerpc/include/asm/kvm_ppc.h +++ b/arch/powerpc/include/asm/kvm_ppc.h @@ -47,8 +47,6 @@ enum emulation_result { extern int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu); extern int __kvmppc_vcpu_run(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu); -extern char kvmppc_handlers_start[]; -extern unsigned long kvmppc_handler_len; extern void kvmppc_handler_highmem(void); extern void kvmppc_dump_vcpu(struct kvm_vcpu *vcpu); diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c index aebbb8b..1338233 100644 --- a/arch/powerpc/kvm/booke.c +++ b/arch/powerpc/kvm/booke.c @@ -1541,6 +1541,7 @@ int __init kvmppc_booke_init(void) { #ifndef CONFIG_KVM_BOOKE_HV unsigned long ivor[16]; + unsigned long *handler = kvmppc_booke_handler_addr; unsigned long max_ivor = 0; int i; @@ -1574,14 +1575,14 @@ int __init kvmppc_booke_init(void) for (i = 0; i 16; i++) { if (ivor[i] max_ivor) - max_ivor = ivor[i]; + max_ivor = i; memcpy((void *)kvmppc_booke_handlers + ivor[i], - kvmppc_handlers_start + i * kvmppc_handler_len, - kvmppc_handler_len); + (void *)handler[i], handler[i + 1] - handler[i]); } flush_icache_range(kvmppc_booke_handlers, - kvmppc_booke_handlers + max_ivor + kvmppc_handler_len); + kvmppc_booke_handlers + ivor[max_ivor] + + handler[max_ivor + 1] - +handler[max_ivor]); #endif /* !BOOKE_HV */ return 0; } diff --git a/arch/powerpc/kvm/booke.h b/arch/powerpc/kvm/booke.h index ba61974..de9e526 100644 --- a/arch/powerpc/kvm/booke.h +++ b/arch/powerpc/kvm/booke.h @@ -65,6 +65,7 @@ (1 BOOKE_IRQPRIO_CRITICAL)) extern unsigned long kvmppc_booke_handlers; +extern unsigned long kvmppc_booke_handler_addr[]; void kvmppc_set_msr(struct kvm_vcpu *vcpu, u32 new_msr); void kvmppc_mmu_msr_notify(struct kvm_vcpu *vcpu, u32 old_msr); diff --git a/arch/powerpc/kvm/booke_interrupts.S b/arch/powerpc/kvm/booke_interrupts.S index bb46b32..3539805 100644 --- a/arch/powerpc/kvm/booke_interrupts.S +++ b/arch/powerpc/kvm/booke_interrupts.S @@ -73,6 +73,14 @@ _GLOBAL(kvmppc_handler_\ivor_nr) bctr .endm +.macro KVM_HANDLER_ADDR ivor_nr + .long kvmppc_handler_\ivor_nr +.endm + +.macro KVM_HANDLER_END + .long kvmppc_handlers_end +.endm + _GLOBAL(kvmppc_handlers_start) KVM_HANDLER BOOKE_INTERRUPT_CRITICAL SPRN_SPRG_RSCRATCH_CRIT SPRN_CSRR0 KVM_HANDLER BOOKE_INTERRUPT_MACHINE_CHECK SPRN_SPRG_RSCRATCH_MC SPRN_MCSRR0 @@ -93,9 +101,7 @@ KVM_HANDLER BOOKE_INTERRUPT_DEBUG
RE: [PATCH 1/2] KVM: PPC: booke: Add watchdog emulation
-Original Message- From: Alexander Graf [mailto:ag...@suse.de] Sent: Tuesday, August 07, 2012 3:38 PM To: Bhushan Bharat-R65777 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; Bhushan Bharat-R65777 Subject: Re: [PATCH 1/2] KVM: PPC: booke: Add watchdog emulation On 03.08.2012, at 07:30, Bharat Bhushan wrote: This patch adds the watchdog emulation in KVM. The watchdog emulation is enabled by KVM_ENABLE_CAP(KVM_CAP_PPC_BOOKE_WATCHDOG) ioctl. The kernel timer are used for watchdog emulation and emulates h/w watchdog state machine. On watchdog timer expiry, it exit to QEMU if TCR.WRC is non ZERO. QEMU can reset/shutdown etc depending upon how it is configured. Signed-off-by: Liu Yu yu@freescale.com Signed-off-by: Scott Wood scottw...@freescale.com [bharat.bhus...@freescale.com: reworked patch] Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com --- v7: - kvmppc_core_dequeue_watchdog() and kvmppc_core_enque_watchdog() are made staic - Clear the KVM_REQ_WATCHDOG request when TSR_ENW | ENW_WIS cleared rather than checking these bits on handling the request. Below changes (pasted for quick understanding) diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c index 7eedaeb..a0f5922 100644 --- a/arch/powerpc/kvm/booke.c +++ b/arch/powerpc/kvm/booke.c @@ -629,8 +629,7 @@ int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu) goto out; } - if (kvm_check_request(KVM_REQ_WATCHDOG, vcpu) - (vcpu-arch.tsr (TSR_ENW | TSR_WIS))) { + if (kvm_check_request(KVM_REQ_WATCHDOG, vcpu)) { kvm_run-exit_reason = KVM_EXIT_WATCHDOG; ret = 0; goto out; @@ -1098,8 +1097,7 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu, } } - if (kvm_check_request(KVM_REQ_WATCHDOG, vcpu) - (vcpu-arch.tsr (TSR_ENW | TSR_WIS))) { + if (kvm_check_request(KVM_REQ_WATCHDOG, vcpu)) { run-exit_reason = KVM_EXIT_WATCHDOG; r = RESUME_HOST | (r RESUME_FLAG_NV); } @@ -1251,8 +1249,10 @@ static int set_sregs_base(struct kvm_vcpu *vcpu, vcpu-arch.tsr = sregs-u.e.tsr; - if ((old_tsr ^ vcpu-arch.tsr) (TSR_ENW | TSR_WIS)) + if ((old_tsr ^ vcpu-arch.tsr) (TSR_ENW | TSR_WIS)) { + clear_bit(KVM_REQ_WATCHDOG, vcpu-requests); arm_next_watchdog(vcpu); + } update_timer_ints(vcpu); } @@ -1434,8 +1434,10 @@ void kvmppc_clr_tsr_bits(struct kvm_vcpu *vcpu, u32 tsr_bits) * We may have stopped the watchdog due to * being stuck on final expiration. */ - if (tsr_bits (TSR_ENW | TSR_WIS)) + if (tsr_bits (TSR_ENW | TSR_WIS)) { + clear_bit(KVM_REQ_WATCHDOG, vcpu-requests); arm_next_watchdog(vcpu); + } -- v6: - Added kvmppc_subarch_vcpu_unit/uninit() for subarch specific initialization - Using spin_lock_irqsave() - Using del_timer_sync(). v5: - Checking that TSR_ENW/TSR_WIS are still set on KVM_EXIT_WATCHDOG. - Moved watchdog_next_timeout() in lock. arch/powerpc/include/asm/kvm_host.h |3 + arch/powerpc/include/asm/kvm_ppc.h |2 + arch/powerpc/include/asm/reg_booke.h |7 ++ arch/powerpc/kvm/book3s.c|9 ++ arch/powerpc/kvm/booke.c | 158 ++ arch/powerpc/kvm/booke_emulate.c |8 ++ arch/powerpc/kvm/powerpc.c | 14 +++- include/linux/kvm.h |2 + include/linux/kvm_host.h |1 + 9 files changed, 202 insertions(+), 2 deletions(-) diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h index 572ad01..873ec11 100644 --- a/arch/powerpc/include/asm/kvm_host.h +++ b/arch/powerpc/include/asm/kvm_host.h @@ -469,6 +469,8 @@ struct kvm_vcpu_arch { ulong fault_esr; ulong queued_dear; ulong queued_esr; + spinlock_t wdt_lock; + struct timer_list wdt_timer; u32 tlbcfg[4]; u32 mmucfg; u32 epr; @@ -484,6 +486,7 @@ struct kvm_vcpu_arch { u8 osi_needed; u8 osi_enabled; u8 papr_enabled; + u8 watchdog_enabled; u8 sane; u8 cpu_type; u8 hcall_needed; diff --git a/arch/powerpc/include/asm/kvm_ppc.h b/arch/powerpc/include/asm/kvm_ppc.h index 0124937..1438a5e 100644 --- a/arch/powerpc/include/asm/kvm_ppc.h +++ b/arch/powerpc/include/asm/kvm_ppc.h @@ -68,6 +68,8 @@ extern void kvmppc_emulate_dec(struct kvm_vcpu *vcpu); extern u32 kvmppc_get_dec(struct kvm_vcpu *vcpu, u64 tb); extern void kvmppc_decrementer_func(unsigned long data); extern int kvmppc_sanity_check(struct kvm_vcpu *vcpu); +extern int
RE: [PATCH 3/3] KVM: PPC: booke: Added debug handler
-Original Message- From: kvm-ppc-ow...@vger.kernel.org [mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of Alexander Graf Sent: Tuesday, August 07, 2012 4:18 PM To: Bhushan Bharat-R65777 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; Bhushan Bharat-R65777 Subject: Re: [PATCH 3/3] KVM: PPC: booke: Added debug handler On 03.08.2012, at 09:08, Bharat Bhushan wrote: Installed debug handler will be used for guest debug support and debug facility emulation features (patches for these features will follow this patch). Signed-off-by: Liu Yu yu@freescale.com [bharat.bhus...@freescale.com: Substantial changes] Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com --- arch/powerpc/include/asm/kvm_host.h |1 + arch/powerpc/kernel/asm-offsets.c |1 + arch/powerpc/kvm/booke_interrupts.S | 45 +++ 3 files changed, 47 insertions(+), 0 deletions(-) diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h index dcee499..bd78523 100644 --- a/arch/powerpc/include/asm/kvm_host.h +++ b/arch/powerpc/include/asm/kvm_host.h @@ -494,6 +494,7 @@ struct kvm_vcpu_arch { u32 tlbcfg[4]; u32 mmucfg; u32 epr; + u32 crit_save; #endif gpa_t paddr_accessed; gva_t vaddr_accessed; diff --git a/arch/powerpc/kernel/asm-offsets.c b/arch/powerpc/kernel/asm-offsets.c index 85b05c4..92f149b 100644 --- a/arch/powerpc/kernel/asm-offsets.c +++ b/arch/powerpc/kernel/asm-offsets.c @@ -563,6 +563,7 @@ int main(void) DEFINE(VCPU_LAST_INST, offsetof(struct kvm_vcpu, arch.last_inst)); DEFINE(VCPU_FAULT_DEAR, offsetof(struct kvm_vcpu, arch.fault_dear)); DEFINE(VCPU_FAULT_ESR, offsetof(struct kvm_vcpu, arch.fault_esr)); + DEFINE(VCPU_CRIT_SAVE, offsetof(struct kvm_vcpu, arch.crit_save)); #endif /* CONFIG_PPC_BOOK3S */ #endif /* CONFIG_KVM */ diff --git a/arch/powerpc/kvm/booke_interrupts.S b/arch/powerpc/kvm/booke_interrupts.S index 3539805..890673c 100644 --- a/arch/powerpc/kvm/booke_interrupts.S +++ b/arch/powerpc/kvm/booke_interrupts.S @@ -73,6 +73,51 @@ _GLOBAL(kvmppc_handler_\ivor_nr) bctr .endm +.macro KVM_DBG_HANDLER ivor_nr scratch srr0 This is a lot of asm code. Any chance to share a good share of it with the generic handler? Yes it is a lot of code but I am finding it difficult. Thanks -Bharat Alex +_GLOBAL(kvmppc_handler_\ivor_nr) + mtspr \scratch, r4 + mfspr r4, SPRN_SPRG_THREAD + lwz r4, THREAD_KVM_VCPU(r4) + stw r3, VCPU_CRIT_SAVE(r4) + mfcrr3 + mfspr r4, SPRN_CSRR1 + andi. r4, r4, MSR_PR + bne 1f + /* debug interrupt happened in enter/exit path */ + mfspr r4, SPRN_CSRR1 + rlwinm r4, r4, 0, ~MSR_DE + mtspr SPRN_CSRR1, r4 + lis r4, 0x + ori r4, r4, 0x + mtspr SPRN_DBSR, r4 + mfspr r4, SPRN_SPRG_THREAD + lwz r4, THREAD_KVM_VCPU(r4) + mtcrr3 + lwz r3, VCPU_CRIT_SAVE(r4) + mfspr r4, \scratch + rfci +1: /* debug interrupt happened in guest */ + mfspr r4, \scratch + mtcrr3 + mr r3, r4 + mfspr r4, SPRN_SPRG_THREAD + lwz r4, THREAD_KVM_VCPU(r4) + stw r3, VCPU_GPR(R4)(r4) + stw r5, VCPU_GPR(R5)(r4) + stw r6, VCPU_GPR(R6)(r4) + lwz r3, VCPU_CRIT_SAVE(r4) + mfspr r5, \srr0 + stw r3, VCPU_GPR(R3)(r4) + stw r5, VCPU_PC(r4) + mfctr r5 + lis r6, kvmppc_resume_host@h + stw r5, VCPU_CTR(r4) + li r5, \ivor_nr + ori r6, r6, kvmppc_resume_host@l + mtctr r6 + bctr +.endm + .macro KVM_HANDLER_ADDR ivor_nr .long kvmppc_handler_\ivor_nr .endm -- 1.7.0.4 -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [PATCH 3/3] KVM: PPC: booke: Added debug handler
-Original Message- From: Wood Scott-B07421 Sent: Wednesday, August 08, 2012 2:15 AM To: Alexander Graf Cc: Bhushan Bharat-R65777; kvm-ppc@vger.kernel.org; k...@vger.kernel.org; Bhushan Bharat-R65777 Subject: Re: [PATCH 3/3] KVM: PPC: booke: Added debug handler On 08/07/2012 05:47 AM, Alexander Graf wrote: diff --git a/arch/powerpc/kvm/booke_interrupts.S b/arch/powerpc/kvm/booke_interrupts.S index 3539805..890673c 100644 --- a/arch/powerpc/kvm/booke_interrupts.S +++ b/arch/powerpc/kvm/booke_interrupts.S @@ -73,6 +73,51 @@ _GLOBAL(kvmppc_handler_\ivor_nr) bctr .endm +.macro KVM_DBG_HANDLER ivor_nr scratch srr0 This is a lot of asm code. Any chance to share a good share of it with the generic handler? That entire file could use an update to lok more like bookehv_interrupts.S and its use of asm macros. In booke there is assumption that size of KVM IVORs will not me more than host IVORs size so that only IVPR is changed. I tried to give it that shape of bookehv_interrupts.S and found that size of some IVORs become more than host IVORs. Thanks -Bharat N�r��yb�X��ǧv�^�){.n�+jir)w*jg����ݢj/���z�ޖ��2�ޙ�)ߡ�a�����G���h��j:+v���w��٥
RE: [PATCH 4/4] Enable kvm emulated watchdog
-Original Message- From: Wood Scott-B07421 Sent: Wednesday, August 01, 2012 11:31 PM To: Bhushan Bharat-R65777 Cc: Alexander Graf; qemu-...@nongnu.org List; kvm-ppc@vger.kernel.org; qemu- devel qemu-devel; KVM list Subject: Re: [PATCH 4/4] Enable kvm emulated watchdog On 08/01/2012 12:27 PM, Bhushan Bharat-R65777 wrote: -Original Message- From: Alexander Graf [mailto:ag...@suse.de] Sent: Wednesday, August 01, 2012 7:57 AM To: Bhushan Bharat-R65777 Cc: qemu-...@nongnu.org List; kvm-ppc@vger.kernel.org; Bhushan Bharat-R65777; qemu-devel qemu-devel; KVM list Subject: Re: [PATCH 4/4] Enable kvm emulated watchdog On 20.07.2012, at 07:23, Bharat Bhushan wrote: @@ -384,6 +437,15 @@ int kvm_arch_init_vcpu(CPUPPCState *cenv) return ret; } +if (enable_watchdog_support) { +ret = kvm_watchdog_enable(cenv); Do you think this is a good idea? Why would real hardware not implement a watchdog just because the user didn't select an action? If there is no watchdog action then why we want to run watchdog timer? On real hardware, if software sets WRC to a non-zero value, the watchdog action is a system reset. The user doesn't have to do anything special. Scott, maybe I misunderstood you comment, did not you commented to enable kvm watchdog if there is watchdog action provided by use. Thanks -Bharat
RE: [PATCH 4/4] Enable kvm emulated watchdog
-Original Message- From: Wood Scott-B07421 Sent: Friday, August 03, 2012 2:02 AM To: Bhushan Bharat-R65777 Cc: Wood Scott-B07421; Alexander Graf; qemu-...@nongnu.org List; kvm- p...@vger.kernel.org; qemu-devel qemu-devel; KVM list Subject: Re: [PATCH 4/4] Enable kvm emulated watchdog On 08/02/2012 10:56 AM, Bhushan Bharat-R65777 wrote: -Original Message- From: Wood Scott-B07421 Sent: Wednesday, August 01, 2012 11:31 PM To: Bhushan Bharat-R65777 Cc: Alexander Graf; qemu-...@nongnu.org List; kvm-ppc@vger.kernel.org; qemu- devel qemu-devel; KVM list Subject: Re: [PATCH 4/4] Enable kvm emulated watchdog On 08/01/2012 12:27 PM, Bhushan Bharat-R65777 wrote: -Original Message- From: Alexander Graf [mailto:ag...@suse.de] Sent: Wednesday, August 01, 2012 7:57 AM To: Bhushan Bharat-R65777 Cc: qemu-...@nongnu.org List; kvm-ppc@vger.kernel.org; Bhushan Bharat-R65777; qemu-devel qemu-devel; KVM list Subject: Re: [PATCH 4/4] Enable kvm emulated watchdog On 20.07.2012, at 07:23, Bharat Bhushan wrote: @@ -384,6 +437,15 @@ int kvm_arch_init_vcpu(CPUPPCState *cenv) return ret; } +if (enable_watchdog_support) { +ret = kvm_watchdog_enable(cenv); Do you think this is a good idea? Why would real hardware not implement a watchdog just because the user didn't select an action? If there is no watchdog action then why we want to run watchdog timer? On real hardware, if software sets WRC to a non-zero value, the watchdog action is a system reset. The user doesn't have to do anything special. Scott, maybe I misunderstood you comment, did not you commented to enable kvm watchdog if there is watchdog action provided by use. I changed my mind. :-) The main difference between the user specifically asking for an action of system reset, and the user not asking for anything, is that only in the former case should we error out if watchdog functionality isn't available -- but if it's a pain to distinguish, don't error out in either case. :-) .. ok -Bharat -Scott
RE: [PATCH 4/4] Enable kvm emulated watchdog
-Original Message- From: Alexander Graf [mailto:ag...@suse.de] Sent: Wednesday, August 01, 2012 7:57 AM To: Bhushan Bharat-R65777 Cc: qemu-...@nongnu.org List; kvm-ppc@vger.kernel.org; Bhushan Bharat-R65777; qemu-devel qemu-devel; KVM list Subject: Re: [PATCH 4/4] Enable kvm emulated watchdog On 20.07.2012, at 07:23, Bharat Bhushan wrote: Enable the KVM emulated watchdog if KVM supports (use the capability enablement in watchdog handler). Also watchdog exit (KVM_EXIT_WATCHDOG) handling is added. Watchdog state machine is cleared whenever VM state changes to running. This is to handle the cases like return from debug halt etc. Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com --- v4: Enbale watchdog support only when user specifies watchdog-action Earlier this was [3/3] of the patch series. Now because i separated linux-header updation in separate patch, so this become [4/4] v3: - TSR clearing is removed in whatchdog exit handling as this is no more needed. v2: - Merged ([PATCH 3/4] Watchdog exit handling support) and ([PATCH 4/4] Enable to use kvm emulated watchdog) - Clear watchdog state machine when VM state changes to running. hw/ppc_booke.c |5 sysemu.h |1 + target-ppc/cpu.h |1 + target-ppc/kvm.c | 69 ++ vl.c |2 + 5 files changed, 78 insertions(+), 0 deletions(-) diff --git a/hw/ppc_booke.c b/hw/ppc_booke.c index 837a5b6..478dbcd 100644 --- a/hw/ppc_booke.c +++ b/hw/ppc_booke.c @@ -203,6 +203,11 @@ static void booke_wdt_cb(void *opaque) booke_timer-wdt_timer); } +void ppc_booke_watchdog_clear_tsr(CPUPPCState *env, target_ulong tsr) +{ +env-spr[SPR_BOOKE_TSR] = tsr ~(TSR_ENW | TSR_WIS | +TSR_WRS_MASK); } + void store_booke_tsr(CPUPPCState *env, target_ulong val) { env-spr[SPR_BOOKE_TSR] = ~val; diff --git a/sysemu.h b/sysemu.h index bc2c788..fc388b7 100644 --- a/sysemu.h +++ b/sysemu.h @@ -131,6 +131,7 @@ extern uint8_t *boot_splash_filedata; extern int boot_splash_filedata_size; extern uint8_t qemu_extra_params_fw[2]; extern QEMUClock *rtc_clock; +extern int enable_watchdog_support; #define MAX_NODES 64 extern int nb_numa_nodes; diff --git a/target-ppc/cpu.h b/target-ppc/cpu.h index ca2fc21..163389a 100644 --- a/target-ppc/cpu.h +++ b/target-ppc/cpu.h @@ -1191,6 +1191,7 @@ void store_40x_dbcr0 (CPUPPCState *env, uint32_t val); void store_40x_sler (CPUPPCState *env, uint32_t val); void store_booke_tcr (CPUPPCState *env, target_ulong val); void store_booke_tsr (CPUPPCState *env, target_ulong val); +void ppc_booke_watchdog_clear_tsr(CPUPPCState *env, target_ulong +tsr); void booke206_flush_tlb(CPUPPCState *env, int flags, const int check_iprot); target_phys_addr_t booke206_tlb_to_page_size(CPUPPCState *env, ppcmas_tlb_t *tlb); int ppcmas_tlb_check(CPUPPCState *env, ppcmas_tlb_t *tlb, diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c index b6ef72d..0226b5e 100644 --- a/target-ppc/kvm.c +++ b/target-ppc/kvm.c @@ -32,6 +32,7 @@ #include device_tree.h #include hw/sysbus.h #include hw/spapr.h +#include hw/watchdog.h #include hw/sysbus.h #include hw/spapr.h @@ -60,6 +61,7 @@ static int cap_booke_sregs; static int cap_ppc_smt; static int cap_ppc_rma; static int cap_spapr_tce; +static int cap_ppc_watchdog; /* XXX We have a race condition where we actually have a level triggered * interrupt, but the infrastructure can't expose that yet, so the guest @@ -86,6 +88,7 @@ int kvm_arch_init(KVMState *s) cap_ppc_smt = kvm_check_extension(s, KVM_CAP_PPC_SMT); cap_ppc_rma = kvm_check_extension(s, KVM_CAP_PPC_RMA); cap_spapr_tce = kvm_check_extension(s, KVM_CAP_SPAPR_TCE); +cap_ppc_watchdog = kvm_check_extension(s, KVM_CAP_PPC_BOOKE_WATCHDOG); if (!cap_interrupt_level) { fprintf(stderr, KVM: Couldn't find level irq capability. Expect the @@ -168,6 +171,30 @@ static int kvm_booke206_tlb_init(CPUPPCState *env) return 0; } +static int kvm_watchdog_enable(CPUPPCState *env) +{ +int ret; +struct kvm_enable_cap encap = {}; + +if (!kvm_enabled()) { +return 0; +} + +if (!cap_ppc_watchdog) { +printf(warning: KVM does not support watchdog); +return 0; +} + +encap.cap = KVM_CAP_PPC_BOOKE_WATCHDOG; +ret = kvm_vcpu_ioctl(env, KVM_ENABLE_CAP, encap); +if (ret 0) { +fprintf(stderr, %s: couldn't enable KVM_CAP_PPC_BOOKE_WATCHDOG: %s\n, +__func__, strerror(-ret)); +return ret; +} + +return ret; +} #if defined(TARGET_PPC64) static void kvm_get_fallback_smmu_info(CPUPPCState *env, @@ -371,6 +398,32 @@ static inline void kvm_fixup_page_sizes(CPUPPCState *env) #endif /* !defined
RE: [PATCH 4/4] Enable kvm emulated watchdog
-Original Message- From: Alexander Graf [mailto:ag...@suse.de] Sent: Wednesday, August 01, 2012 7:57 AM To: Bhushan Bharat-R65777 Cc: qemu-...@nongnu.org List; kvm-ppc@vger.kernel.org; Bhushan Bharat-R65777; qemu-devel qemu-devel; KVM list Subject: Re: [PATCH 4/4] Enable kvm emulated watchdog On 20.07.2012, at 07:23, Bharat Bhushan wrote: Enable the KVM emulated watchdog if KVM supports (use the capability enablement in watchdog handler). Also watchdog exit (KVM_EXIT_WATCHDOG) handling is added. Watchdog state machine is cleared whenever VM state changes to running. This is to handle the cases like return from debug halt etc. Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com --- v4: Enbale watchdog support only when user specifies watchdog-action Earlier this was [3/3] of the patch series. Now because i separated linux-header updation in separate patch, so this become [4/4] v3: - TSR clearing is removed in whatchdog exit handling as this is no more needed. v2: - Merged ([PATCH 3/4] Watchdog exit handling support) and ([PATCH 4/4] Enable to use kvm emulated watchdog) - Clear watchdog state machine when VM state changes to running. hw/ppc_booke.c |5 sysemu.h |1 + target-ppc/cpu.h |1 + target-ppc/kvm.c | 69 ++ vl.c |2 + 5 files changed, 78 insertions(+), 0 deletions(-) diff --git a/hw/ppc_booke.c b/hw/ppc_booke.c index 837a5b6..478dbcd 100644 --- a/hw/ppc_booke.c +++ b/hw/ppc_booke.c @@ -203,6 +203,11 @@ static void booke_wdt_cb(void *opaque) booke_timer-wdt_timer); } +void ppc_booke_watchdog_clear_tsr(CPUPPCState *env, target_ulong tsr) +{ +env-spr[SPR_BOOKE_TSR] = tsr ~(TSR_ENW | TSR_WIS | +TSR_WRS_MASK); } + void store_booke_tsr(CPUPPCState *env, target_ulong val) { env-spr[SPR_BOOKE_TSR] = ~val; diff --git a/sysemu.h b/sysemu.h index bc2c788..fc388b7 100644 --- a/sysemu.h +++ b/sysemu.h @@ -131,6 +131,7 @@ extern uint8_t *boot_splash_filedata; extern int boot_splash_filedata_size; extern uint8_t qemu_extra_params_fw[2]; extern QEMUClock *rtc_clock; +extern int enable_watchdog_support; #define MAX_NODES 64 extern int nb_numa_nodes; diff --git a/target-ppc/cpu.h b/target-ppc/cpu.h index ca2fc21..163389a 100644 --- a/target-ppc/cpu.h +++ b/target-ppc/cpu.h @@ -1191,6 +1191,7 @@ void store_40x_dbcr0 (CPUPPCState *env, uint32_t val); void store_40x_sler (CPUPPCState *env, uint32_t val); void store_booke_tcr (CPUPPCState *env, target_ulong val); void store_booke_tsr (CPUPPCState *env, target_ulong val); +void ppc_booke_watchdog_clear_tsr(CPUPPCState *env, target_ulong +tsr); void booke206_flush_tlb(CPUPPCState *env, int flags, const int check_iprot); target_phys_addr_t booke206_tlb_to_page_size(CPUPPCState *env, ppcmas_tlb_t *tlb); int ppcmas_tlb_check(CPUPPCState *env, ppcmas_tlb_t *tlb, diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c index b6ef72d..0226b5e 100644 --- a/target-ppc/kvm.c +++ b/target-ppc/kvm.c @@ -32,6 +32,7 @@ #include device_tree.h #include hw/sysbus.h #include hw/spapr.h +#include hw/watchdog.h #include hw/sysbus.h #include hw/spapr.h @@ -60,6 +61,7 @@ static int cap_booke_sregs; static int cap_ppc_smt; static int cap_ppc_rma; static int cap_spapr_tce; +static int cap_ppc_watchdog; /* XXX We have a race condition where we actually have a level triggered * interrupt, but the infrastructure can't expose that yet, so the guest @@ -86,6 +88,7 @@ int kvm_arch_init(KVMState *s) cap_ppc_smt = kvm_check_extension(s, KVM_CAP_PPC_SMT); cap_ppc_rma = kvm_check_extension(s, KVM_CAP_PPC_RMA); cap_spapr_tce = kvm_check_extension(s, KVM_CAP_SPAPR_TCE); +cap_ppc_watchdog = kvm_check_extension(s, KVM_CAP_PPC_BOOKE_WATCHDOG); if (!cap_interrupt_level) { fprintf(stderr, KVM: Couldn't find level irq capability. Expect the @@ -168,6 +171,30 @@ static int kvm_booke206_tlb_init(CPUPPCState *env) return 0; } +static int kvm_watchdog_enable(CPUPPCState *env) +{ +int ret; +struct kvm_enable_cap encap = {}; + +if (!kvm_enabled()) { +return 0; +} + +if (!cap_ppc_watchdog) { +printf(warning: KVM does not support watchdog); +return 0; +} + +encap.cap = KVM_CAP_PPC_BOOKE_WATCHDOG; +ret = kvm_vcpu_ioctl(env, KVM_ENABLE_CAP, encap); +if (ret 0) { +fprintf(stderr, %s: couldn't enable KVM_CAP_PPC_BOOKE_WATCHDOG: %s\n, +__func__, strerror(-ret)); +return ret; +} + +return ret; +} #if defined(TARGET_PPC64) static void kvm_get_fallback_smmu_info(CPUPPCState *env, @@ -371,6 +398,32 @@ static inline void kvm_fixup_page_sizes(CPUPPCState *env) #endif /* !defined
RE: [PATCH v2] booke: Added ONE_REG interface for IAC/DAC debug registers
+struct kvmppc_debug_reg { +#ifdef CONFIG_BOOKE + u32 dbcr0; + u32 dbcr1; + u32 dbcr2; +#ifdef CONFIG_KVM_E500MC + u32 dbcr4; +#endif + u64 iac[KVMPPC_MAX_IAC]; + u64 dac[KVMPPC_MAX_DAC]; +#endif +}; Is there any reason for this to be a separate struct? No specific reason, The rest of debug ( which will follow sometime soon) uses this structure and looks to make code look easy. So why not use an fsl / booke specific struct for the debug patches then? Debug registers are really nothing common between book3s and booke, so we shouldn't treat them as such by using the same struct definition. All elements of struct are under #ifdef CONFIG_BOOKE? So for book3s it is as good as void, only struct type if declared. Do you want the struct type also under config_booke ? struct kvmppc_booke_debug_reg { lots of defines }; struct kvmppc_book3s_debug_reg { lots of other defines }; void booke_foo() { struct kvmppc_booke_debug_reg r; kvmppc_booke_debug_reg or kvmppc_book3s_debug_reg ? Thanks -Bharat r.dbcr0 = 0; } vs struct kvmppc_debug_reg { #ifdef CONFIG_BOOKE lots of defines #else lots of other defines #endif }; void booke_foo() { struct kvmppc_debug_reg r; r.dbcr0 = 0; } Which one has less #ifdefs? Keep in mind that the less #ifdefs you have, the more readable and maintainable your code becomes, because config options have less effect on your syntax. Alex -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [PATCH 2/2 v5] KVM: PPC: booke: Add watchdog emulation
-Original Message- From: Wood Scott-B07421 Sent: Monday, July 23, 2012 9:02 PM To: Bhushan Bharat-R65777 Cc: Wood Scott-B07421; kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de Subject: Re: [PATCH 2/2 v5] KVM: PPC: booke: Add watchdog emulation On 07/22/2012 11:10 PM, Bhushan Bharat-R65777 wrote: @@ -386,13 +387,23 @@ int kvm_arch_vcpu_init(struct kvm_vcpu *vcpu) #ifdef CONFIG_KVM_EXIT_TIMING mutex_init(vcpu-arch.exit_timing_lock); #endif - +#ifdef CONFIG_BOOKE + spin_lock_init(vcpu-arch.wdt_lock); + /* setup watchdog timer once */ + setup_timer(vcpu-arch.wdt_timer, kvmppc_watchdog_func, + (unsigned long)vcpu); +#endif return 0; } Can you do this in kvmppc_booke_init()? I do not think we can do this in kvmppc_booke_init(). Watchdog have association with vcpu, while there is no vcpu at kvmppc_booke_init(). Sorry, I meant kvm_arch_vcpu_setup() in booke.c. Any specific reason to move this in above mentioned function? Want to avoid #ifdef config_booke ? Thanks -Bharat
RE: [PATCH v2] booke: Added ONE_REG interface for IAC/DAC debug registers
-Original Message- From: Wood Scott-B07421 Sent: Monday, July 23, 2012 9:12 PM To: Bhushan Bharat-R65777 Cc: kvm-ppc@vger.kernel.org; ag...@suse.de; k...@vger.kernel.org; Bhushan Bharat- R65777 Subject: Re: [PATCH v2] booke: Added ONE_REG interface for IAC/DAC debug registers On 07/23/2012 06:19 AM, Bharat Bhushan wrote: IAC/DAC are defined as 32 bit while they are 64 bit wide. So ONE_REG interface is added to set/get them. Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com --- v2: - Using copy_to/from_user() apis. arch/powerpc/include/asm/kvm.h | 12 ++ arch/powerpc/include/asm/kvm_host.h | 28 ++- arch/powerpc/kvm/booke.c| 64 +- arch/powerpc/kvm/booke_emulate.c|8 ++-- 4 files changed, 104 insertions(+), 8 deletions(-) diff --git a/arch/powerpc/include/asm/kvm.h b/arch/powerpc/include/asm/kvm.h index 1bea4d8..3c14202 100644 --- a/arch/powerpc/include/asm/kvm.h +++ b/arch/powerpc/include/asm/kvm.h @@ -221,6 +221,12 @@ struct kvm_sregs { __u32 dbsr; /* KVM_SREGS_E_UPDATE_DBSR */ __u32 dbcr[3]; + /* +* iac/dac registers are 64bit wide, while this API +* interface provides only lower 32 bits on 64 bit +* processors. ONE_REG interface is added for 64bit +* iac/dac registers. +*/ __u32 iac[4]; __u32 dac[2]; __u32 dvc[2]; @@ -326,5 +332,11 @@ struct kvm_book3e_206_tlb_params { }; #define KVM_REG_PPC_HIOR (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0x1) +#define KVM_REG_PPC_IAC1 (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0x2) +#define KVM_REG_PPC_IAC2 (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0x3) +#define KVM_REG_PPC_IAC3 (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0x4) +#define KVM_REG_PPC_IAC4 (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0x5) +#define KVM_REG_PPC_DAC1 (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0x6) +#define KVM_REG_PPC_DAC2 (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0x7) #endif /* __LINUX_KVM_POWERPC_H */ diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h index 43cac48..2c0f015 100644 --- a/arch/powerpc/include/asm/kvm_host.h +++ b/arch/powerpc/include/asm/kvm_host.h @@ -342,6 +342,31 @@ struct kvmppc_slb { bool class : 1; }; +#ifdef CONFIG_BOOKE +# ifdef CONFIG_PPC_FSL_BOOK3E +#define KVMPPC_IAC_NUM 2 +#define KVMPPC_DAC_NUM 2 +# else +#define KVMPPC_IAC_NUM 4 +#define KVMPPC_DAC_NUM 2 +# endif +#define KVMPPC_MAX_IAC 4 +#define KVMPPC_MAX_DAC 2 +#endif + +struct kvmppc_debug_reg { +#ifdef CONFIG_BOOKE + u32 dbcr0; + u32 dbcr1; + u32 dbcr2; +#ifdef CONFIG_KVM_E500MC + u32 dbcr4; +#endif + u64 iac[KVMPPC_MAX_IAC]; + u64 dac[KVMPPC_MAX_DAC]; +#endif +}; Is there any reason for this to be a separate struct? No specific reason, The rest of debug ( which will follow sometime soon) uses this structure and looks to make code look easy. + struct kvm_vcpu_arch { ulong host_stack; u32 host_pid; @@ -436,9 +461,8 @@ struct kvm_vcpu_arch { u32 ccr0; u32 ccr1; - u32 dbcr0; - u32 dbcr1; u32 dbsr; + struct kvmppc_debug_reg dbg_reg; u64 mmcr[3]; u32 pmc[8]; diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c index be71b8e..fa23158 100644 --- a/arch/powerpc/kvm/booke.c +++ b/arch/powerpc/kvm/booke.c @@ -1353,12 +1353,72 @@ int kvm_arch_vcpu_ioctl_set_sregs(struct kvm_vcpu *vcpu, int kvm_vcpu_ioctl_get_one_reg(struct kvm_vcpu *vcpu, struct kvm_one_reg *reg) { - return -EINVAL; + int r = -EINVAL; + + switch(reg-id) { + case KVM_REG_PPC_IAC1: + r = copy_to_user((u64 __user *)(long)reg-addr, +vcpu-arch.dbg_reg.iac[0], sizeof(u64)); + break; + case KVM_REG_PPC_IAC2: + r = copy_to_user((u64 __user *)(long)reg-addr, +vcpu-arch.dbg_reg.iac[1], sizeof(u64)); + break; + case KVM_REG_PPC_IAC3: + r = copy_to_user((u64 __user *)(long)reg-addr, +vcpu-arch.dbg_reg.iac[2], sizeof(u64)); + break; + case KVM_REG_PPC_IAC4: + r = copy_to_user((u64 __user *)(long)reg-addr, +vcpu-arch.dbg_reg.iac[3], sizeof(u64)); + break; This will exceed the array size if userspace asks to access IAC3/4 on an e500- family chip. No , is not the array size already set to maximum. But yes we should not let IAC3/4 being accessed for e500 (FSL_BOOKE). Why not just set the array at the max size unconditionally? This is already coded this way. Please see the struct is this patch
RE: [PATCH 2/2 v5] KVM: PPC: booke: Add watchdog emulation
-Original Message- From: Wood Scott-B07421 Sent: Monday, July 23, 2012 9:31 PM To: Bhushan Bharat-R65777 Cc: Wood Scott-B07421; kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de Subject: Re: [PATCH 2/2 v5] KVM: PPC: booke: Add watchdog emulation On 07/23/2012 10:43 AM, Bhushan Bharat-R65777 wrote: -Original Message- From: Wood Scott-B07421 Sent: Monday, July 23, 2012 9:02 PM To: Bhushan Bharat-R65777 Cc: Wood Scott-B07421; kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de Subject: Re: [PATCH 2/2 v5] KVM: PPC: booke: Add watchdog emulation On 07/22/2012 11:10 PM, Bhushan Bharat-R65777 wrote: @@ -386,13 +387,23 @@ int kvm_arch_vcpu_init(struct kvm_vcpu *vcpu) #ifdef CONFIG_KVM_EXIT_TIMING mutex_init(vcpu-arch.exit_timing_lock); #endif - +#ifdef CONFIG_BOOKE + spin_lock_init(vcpu-arch.wdt_lock); + /* setup watchdog timer once */ + setup_timer(vcpu-arch.wdt_timer, kvmppc_watchdog_func, + (unsigned long)vcpu); +#endif return 0; } Can you do this in kvmppc_booke_init()? I do not think we can do this in kvmppc_booke_init(). Watchdog have association with vcpu, while there is no vcpu at kvmppc_booke_init(). Sorry, I meant kvm_arch_vcpu_setup() in booke.c. Any specific reason to move this in above mentioned function? Want to avoid #ifdef config_booke ? Yes, to avoid the ifdef. We already have a (poorly named) place for booke- specific vcpu init. Where we want to delete watchdog timer? Kvm_arch_vcpu_setup() is in flow of CREATE_VCPU ioctl, I do not see DESTROY_VCPU type of code? So is it ok to let del_timer_sync() as is with #ifdef config_booke ? Thanks -Bharat N�r��yb�X��ǧv�^�){.n�+jir)w*jg����ݢj/���z�ޖ��2�ޙ�)ߡ�a�����G���h��j:+v���w��٥
RE: [PATCH 2/2 v5] KVM: PPC: booke: Add watchdog emulation
-Original Message- From: Wood Scott-B07421 Sent: Saturday, July 21, 2012 2:59 AM To: Bhushan Bharat-R65777 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de; Bhushan Bharat- R65777 Subject: Re: [PATCH 2/2 v5] KVM: PPC: booke: Add watchdog emulation On 07/20/2012 12:00 AM, Bharat Bhushan wrote: This patch adds the watchdog emulation in KVM. The watchdog emulation is enabled by KVM_ENABLE_CAP(KVM_CAP_PPC_WDT) ioctl. The kernel timer are used for watchdog emulation and emulates h/w watchdog state machine. On watchdog timer expiry, it exit to QEMU if TCR.WRC is non ZERO. QEMU can reset/shutdown etc depending upon how it is configured. Signed-off-by: Liu Yu yu@freescale.com Signed-off-by: Scott Wood scottw...@freescale.com Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com [bharat.bhus...@freescale.com: reworked patch] Typically the [] note goes immediately before your signoff (but after the others). +static void arm_next_watchdog(struct kvm_vcpu *vcpu) { + unsigned long nr_jiffies; + + spin_lock(vcpu-arch.wdt_lock); + nr_jiffies = watchdog_next_timeout(vcpu); + /* +* If the number of jiffies of watchdog timer = NEXT_TIMER_MAX_DELTA +* then do not run the watchdog timer as this can break timer APIs. +*/ + if (nr_jiffies NEXT_TIMER_MAX_DELTA) + mod_timer(vcpu-arch.wdt_timer, jiffies + nr_jiffies); + else + del_timer(vcpu-arch.wdt_timer); + spin_unlock(vcpu-arch.wdt_lock); +} This needs to be an irqsave lock. @@ -386,13 +387,23 @@ int kvm_arch_vcpu_init(struct kvm_vcpu *vcpu) #ifdef CONFIG_KVM_EXIT_TIMING mutex_init(vcpu-arch.exit_timing_lock); #endif - +#ifdef CONFIG_BOOKE + spin_lock_init(vcpu-arch.wdt_lock); + /* setup watchdog timer once */ + setup_timer(vcpu-arch.wdt_timer, kvmppc_watchdog_func, + (unsigned long)vcpu); +#endif return 0; } Can you do this in kvmppc_booke_init()? I do not think we can do this in kvmppc_booke_init(). Watchdog have association with vcpu, while there is no vcpu at kvmppc_booke_init(). Thanks -Bharat void kvm_arch_vcpu_uninit(struct kvm_vcpu *vcpu) { kvmppc_mmu_destroy(vcpu); +#ifdef CONFIG_BOOKE + spin_lock(vcpu-arch.wdt_lock); + del_timer(vcpu-arch.wdt_timer); + spin_unlock(vcpu-arch.wdt_lock); +#endif } Don't acquire the lock here, but use del_timer_sync(). -Scott N�r��yb�X��ǧv�^�){.n�+jir)w*jg����ݢj/���z�ޖ��2�ޙ�)ߡ�a�����G���h��j:+v���w��٥
RE: [PATCH 2/2 v5] KVM: PPC: booke: Add watchdog emulation
-Original Message- From: Wood Scott-B07421 Sent: Saturday, July 21, 2012 2:59 AM To: Bhushan Bharat-R65777 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de; Bhushan Bharat- R65777 Subject: Re: [PATCH 2/2 v5] KVM: PPC: booke: Add watchdog emulation On 07/20/2012 12:00 AM, Bharat Bhushan wrote: This patch adds the watchdog emulation in KVM. The watchdog emulation is enabled by KVM_ENABLE_CAP(KVM_CAP_PPC_WDT) ioctl. The kernel timer are used for watchdog emulation and emulates h/w watchdog state machine. On watchdog timer expiry, it exit to QEMU if TCR.WRC is non ZERO. QEMU can reset/shutdown etc depending upon how it is configured. Signed-off-by: Liu Yu yu@freescale.com Signed-off-by: Scott Wood scottw...@freescale.com Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com [bharat.bhus...@freescale.com: reworked patch] Typically the [] note goes immediately before your signoff (but after the others). ok +static void arm_next_watchdog(struct kvm_vcpu *vcpu) { + unsigned long nr_jiffies; + + spin_lock(vcpu-arch.wdt_lock); + nr_jiffies = watchdog_next_timeout(vcpu); + /* +* If the number of jiffies of watchdog timer = NEXT_TIMER_MAX_DELTA +* then do not run the watchdog timer as this can break timer APIs. +*/ + if (nr_jiffies NEXT_TIMER_MAX_DELTA) + mod_timer(vcpu-arch.wdt_timer, jiffies + nr_jiffies); + else + del_timer(vcpu-arch.wdt_timer); + spin_unlock(vcpu-arch.wdt_lock); +} This needs to be an irqsave lock. Ok, I want to understood why irqsave lock should be used here? irqsave_lock is used when the critical section within lock can be referenced from interrupt context also. What part of critical section above can get affected from local irq? Is it jiffies here? @@ -386,13 +387,23 @@ int kvm_arch_vcpu_init(struct kvm_vcpu *vcpu) #ifdef CONFIG_KVM_EXIT_TIMING mutex_init(vcpu-arch.exit_timing_lock); #endif - +#ifdef CONFIG_BOOKE + spin_lock_init(vcpu-arch.wdt_lock); + /* setup watchdog timer once */ + setup_timer(vcpu-arch.wdt_timer, kvmppc_watchdog_func, + (unsigned long)vcpu); +#endif return 0; } Can you do this in kvmppc_booke_init()? Ok, so the *_uninit part of code also needed to be moved in respective function. void kvm_arch_vcpu_uninit(struct kvm_vcpu *vcpu) { kvmppc_mmu_destroy(vcpu); +#ifdef CONFIG_BOOKE + spin_lock(vcpu-arch.wdt_lock); + del_timer(vcpu-arch.wdt_timer); + spin_unlock(vcpu-arch.wdt_lock); +#endif } Don't acquire the lock here, but use del_timer_sync(). Ok. Thanks -Bharat -Scott
RE: [PATCH 2/2 v4] KVM: PPC: booke: Add watchdog emulation
-Original Message- From: Wood Scott-B07421 Sent: Thursday, July 19, 2012 7:56 AM To: Bhushan Bharat-R65777 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de; Bhushan Bharat- R65777 Subject: Re: [PATCH 2/2 v4] KVM: PPC: booke: Add watchdog emulation On 07/18/2012 04:39 AM, Bharat Bhushan wrote: This patch adds the watchdog emulation in KVM. The watchdog emulation is enabled by KVM_ENABLE_CAP(KVM_CAP_PPC_WDT) ioctl. The kernel timer are used for watchdog emulation and emulates h/w watchdog state machine. On watchdog timer expiry, it exit to QEMU if TCR.WRC is non ZERO. QEMU can reset/shutdown etc depending upon how it is configured. Signed-off-by: Liu Yu yu@freescale.com Signed-off-by: Scott Wood scottw...@freescale.com Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com Please put a note before your signoff stating that it's been modified since the previous signoffs, such as: [bharat.bhus...@freescale.com: reworked patch] @@ -482,6 +484,7 @@ struct kvm_vcpu_arch { u8 osi_needed; u8 osi_enabled; u8 papr_enabled; + u8 watchdog_enable; s/enable;/enabled;/ +static void arm_next_watchdog(struct kvm_vcpu *vcpu) +{ + unsigned long nr_jiffies; + + nr_jiffies = watchdog_next_timeout(vcpu); + spin_lock(vcpu-arch.wdt_lock); Do watchdog_next_timeout() from within the lock. Why you think so? Is the next timeout calculation needed to be protected? +void kvmppc_watchdog_func(unsigned long data) +{ + struct kvm_vcpu *vcpu = (struct kvm_vcpu *)data; + u32 tsr, new_tsr; + int final; + + do { + new_tsr = tsr = vcpu-arch.tsr; + final = 0; + + /* Time out event */ + if (tsr TSR_ENW) { + if (tsr TSR_WIS) { + final = 1; + } else { + new_tsr = tsr | TSR_WIS; + } Unnecessary braces on the inner if/else. + } else { + new_tsr = tsr | TSR_ENW; + } + } while (cmpxchg(vcpu-arch.tsr, tsr, new_tsr) != tsr); + + if (new_tsr TSR_WIS) { + smp_wmb(); + kvm_make_request(KVM_REQ_PENDING_TIMER, vcpu); + kvm_vcpu_kick(vcpu); + } + + /* +* If this is final watchdog expiry and some action is required +* then exit to userspace. +*/ + if (final (vcpu-arch.tcr TCR_WRC_MASK)) { + smp_wmb(); + kvm_make_request(KVM_REQ_WATCHDOG, vcpu); + kvm_vcpu_kick(vcpu); + } + + + /* +* Stop running the watchdog timer after final expiration to +* prevent the host from being flooded with timers if the +* guest sets a short period. +* Timers will resume when TSR/TCR is updated next time. +*/ + if (!final) + arm_next_watchdog(vcpu); +} static void update_timer_ints(struct kvm_vcpu *vcpu) Blank line between functions static void kvmppc_core_check_exceptions(struct kvm_vcpu *vcpu) @@ -516,6 +627,13 @@ int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu) goto out; } + if (kvm_check_request(KVM_REQ_WATCHDOG, vcpu) + vcpu-arch.watchdog_enable) { Check .watchdog_enabled when you make the request, not here. + kvm_run-exit_reason = KVM_EXIT_WATCHDOG; + ret = 0; + goto out; + } I think we should check that ENW/WIS are still set. Now that we don't use TSR[WRS], this is the only way QEMU can preemptively clear a pending final expiration after leaving debug halt. If QEMU doesn't do this, and KVM comes back with a watchdog exit, QEMU won't know if it's a new legitimate one, or if it expired during the debug halt. This would be less ugly than QEMU doing a reset-TSR[WIS,ENW]-and-ignore on the first watchdog exit after a debug halt, and letting the expiration happen again if the guest is really stuck. ahh, What about clearing KVM_REQ_WATCHDOG request when QEMU clears ENW|WIS (or whenever KVM or QEMU clears ENW|WIS)? Thanks -Bharat if (sregs-u.e.update_special KVM_SREGS_E_UPDATE_TSR) { + u32 old_tsr = vcpu-arch.tsr; + vcpu-arch.tsr = sregs-u.e.tsr; + + if ((old_tsr ^ vcpu-arch.tsr) + (TSR_ENW | TSR_WIS | TCR_WRC_MASK)) + arm_next_watchdog(vcpu); Do we still do anything with TSR[WRS] at all? diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c index 87f4dc8..3c50b81 100644 --- a/arch/powerpc/kvm/powerpc.c +++ b/arch/powerpc/kvm/powerpc.c @@ -38,9 +38,11 @@ int kvm_arch_vcpu_runnable(struct kvm_vcpu *v) { - return !(v-arch.shared-msr MSR_WE) || - !!(v-arch.pending_exceptions) || - v-requests; + bool ret = !(v-arch.shared-msr MSR_WE) || + !!(v
RE: [PATCH 2/2 v2] KVM: PPC: booke: Add watchdog emulation
-Original Message- From: kvm-ppc-ow...@vger.kernel.org [mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of Alexander Graf Sent: Tuesday, July 17, 2012 12:50 PM To: Wood Scott-B07421 Cc: Bhushan Bharat-R65777; kvm-ppc@vger.kernel.org; k...@vger.kernel.org; bharatb.ya...@gmail.com; Bhushan Bharat-R65777; Benjamin Herrenschmidt; Kumar Gala Subject: Re: [PATCH 2/2 v2] KVM: PPC: booke: Add watchdog emulation On 17.07.2012, at 03:02, Scott Wood scottw...@freescale.com wrote: On 07/16/2012 12:18 PM, Alexander Graf wrote: +/* + * Return the number of jiffies until the next timeout. If the timeout is + * longer than the NEXT_TIMER_MAX_DELTA, that then? return NEXT_TIMER_MAX_DELTA + * instead. I can read code. Come on, it's not exactly x++; /* add one to x */ It's faster to read code (as well as know the constraints within which you can modify it without having to spend a lot of time digesting all the callers' use cases) when you have a high level description of its interface contract, and can be selective about when to zoom in to the details. Linux kernel code tends to be bad about this. Yeah, not opposed to leave that part in :). The important piece of information in the comment is missing: The reason. The reason for what? Why you want to know the next timeout? That's the caller's business. Or why we use NEXT_TIMER_MAX_DELTA as the limit? Why we use the limit. IIRC it was explained in the last thread, just didn't make its way into the comment. Earlier we have a comment on the #define MAX_TIMEOUT (new define added for a purpose, so the comment described the puspose). Now we uses the generic #define NEX_TIMER_MAX_DELTA (include/linux/timer.h), so removed the comment. +void kvmppc_watchdog_func(unsigned long data) { +struct kvm_vcpu *vcpu = (struct kvm_vcpu *)data; +u32 tsr, new_tsr; +int final; + +do { +new_tsr = tsr = vcpu-arch.tsr; +final = 0; + +/* Time out event */ +if (tsr TSR_ENW) { +if (tsr TSR_WIS) { +new_tsr = (tsr ~TCR_WRC_MASK) | + (vcpu-arch.tcr TCR_WRC_MASK); +vcpu-arch.tcr = ~TCR_WRC_MASK; Can't we just poke the vcpu to exit the VM and do the above on its own? We've discussed this before. TSR updates are done via atomics, and we send a request for the vcpu to act on the result. This is how the decrementer works. http://www.spinics.net/lists/kvm-ppc/msg03169.html Yeah, the major difference to the dec is the atomicity of the whole thing. Dec changes one bit to enable the interrupt line. The final expiration is more complex. Is not setting the TSR.WRS atomic here (cmpxchg() will handle this)? This is the watdog expired case, right? Final expiration, yes. I'd also prefer to have an explicit event for the expiry than a special TSR check in the main loop. So check TSR[WRS] in update_timer_ints(), and have it queue a pseudoexception? Or here. Do we mean define a sudo IROPRIO for final expiry. That would eliminate the need to change the runnable function. Also call me sceptic on the reset of tcr. If our user space watchdog event is write a message, then we essentially want to hide the fact that the watchdog expired from the guest, right? In that case, the second time-out wouldn't do anything guest visible. This was probably copied straight out of the hardware documentation, which explicitly says TCR[WRC] gets set to zero on final expiration (as part of reset). We should leave that part up to userspace. It definitely shouldn't be done inside the cmpxchg loop (or from interrupt context -- only TSR gets the atomic treatment). I don't think the read of TCR outside vcpu context is a problem, though. Yeah, but it'd just make me less wary if only the vcpu thread itself accesses vcpu internal registers that aren't irq state and thus designed for it (TSR). But yes, the most flexible way would probably be to do it from user space. Since it'd happen from within the vcpu context of user space, we can also guarantee that the TCR access is atomic. Yes, will move the tcr.wrc clearing to userspace. int kvm_arch_vcpu_runnable(struct kvm_vcpu *v) { -return !(v-arch.shared-msr MSR_WE) || - !!(v-arch.pending_exceptions) || - v-requests; +bool ret = !(v-arch.shared-msr MSR_WE) || + !!(v-arch.pending_exceptions) || + v-requests; + +ret = ret || kvmppc_get_tsr_wrc(v); Why do you need to declare the cpu as non-runnable when a watchdog event occured? It's the other way around -- it's always runnable when a watchdog exit is pending. It's like a pending exception. Ah, so yes, we should just shove it into pending_exceptions then. Pending_exception? You mean sudo again here as said earlier. Thanks
RE: [PATCH 2/2 v2] KVM: PPC: booke: Add watchdog emulation
-Original Message- From: kvm-ppc-ow...@vger.kernel.org [mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of Alexander Graf Sent: Tuesday, July 17, 2012 6:22 PM To: Bhushan Bharat-R65777 Cc: Wood Scott-B07421; kvm-ppc@vger.kernel.org; k...@vger.kernel.org; bharatb.ya...@gmail.com; Benjamin Herrenschmidt; Kumar Gala; Avi Kivity Subject: Re: [PATCH 2/2 v2] KVM: PPC: booke: Add watchdog emulation On 07/17/2012 11:57 AM, Bhushan Bharat-R65777 wrote: -Original Message- From: kvm-ppc-ow...@vger.kernel.org [mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of Alexander Graf Sent: Tuesday, July 17, 2012 12:50 PM To: Wood Scott-B07421 Cc: Bhushan Bharat-R65777; kvm-ppc@vger.kernel.org; k...@vger.kernel.org; bharatb.ya...@gmail.com; Bhushan Bharat-R65777; Benjamin Herrenschmidt; Kumar Gala Subject: Re: [PATCH 2/2 v2] KVM: PPC: booke: Add watchdog emulation On 17.07.2012, at 03:02, Scott Wood scottw...@freescale.com wrote: On 07/16/2012 12:18 PM, Alexander Graf wrote: +/* + * Return the number of jiffies until the next timeout. If the timeout is + * longer than the NEXT_TIMER_MAX_DELTA, that then? return NEXT_TIMER_MAX_DELTA + * instead. I can read code. Come on, it's not exactly x++; /* add one to x */ It's faster to read code (as well as know the constraints within which you can modify it without having to spend a lot of time digesting all the callers' use cases) when you have a high level description of its interface contract, and can be selective about when to zoom in to the details. Linux kernel code tends to be bad about this. Yeah, not opposed to leave that part in :). The important piece of information in the comment is missing: The reason. The reason for what? Why you want to know the next timeout? That's the caller's business. Or why we use NEXT_TIMER_MAX_DELTA as the limit? Why we use the limit. IIRC it was explained in the last thread, just didn't make its way into the comment. Earlier we have a comment on the #define MAX_TIMEOUT (new define added for a purpose, so the comment described the puspose). Now we uses the generic #define NEX_TIMER_MAX_DELTA (include/linux/timer.h), so removed the comment. Ah, ok. Just saying, if you comment on some mechanism, like you did here, please also include the reasoning behind it. For example Do foo if x is true. isn't particularly helpful. However Do foo if x is true because the bar API will break with high values is very helpful. It includes the action and reason of the code :). Alternatively, to me the same as above would be /* bar API will break with high values */ if (x) do(foo) because in this case the code is the action description. Either variant works fine for me. Ok :) +void kvmppc_watchdog_func(unsigned long data) { +struct kvm_vcpu *vcpu = (struct kvm_vcpu *)data; +u32 tsr, new_tsr; +int final; + +do { +new_tsr = tsr = vcpu-arch.tsr; +final = 0; + +/* Time out event */ +if (tsr TSR_ENW) { +if (tsr TSR_WIS) { +new_tsr = (tsr ~TCR_WRC_MASK) | + (vcpu-arch.tcr TCR_WRC_MASK); +vcpu-arch.tcr = ~TCR_WRC_MASK; Can't we just poke the vcpu to exit the VM and do the above on its own? We've discussed this before. TSR updates are done via atomics, and we send a request for the vcpu to act on the result. This is how the decrementer works. http://www.spinics.net/lists/kvm-ppc/msg03169.html Yeah, the major difference to the dec is the atomicity of the whole thing. Dec changes one bit to enable the interrupt line. The final expiration is more complex. Is not setting the TSR.WRS atomic here (cmpxchg() will handle this)? Final expiration sets TCR. TSR should be ok. It is clearing some TCR bits :) Let us move the TCR clearing to userspace (please see below response ^^). Then it is just setting TSR. Right? This is the watdog expired case, right? Final expiration, yes. I'd also prefer to have an explicit event for the expiry than a special TSR check in the main loop. So check TSR[WRS] in update_timer_ints(), and have it queue a pseudoexception? Or here. Do we mean define a sudo IROPRIO for final expiry. We can also define an event that is sent through kvm_make_request. But yeah, IRQPRIO is probably easier. Not 100% sure which way is better though. Avi, any preferences? That would eliminate the need to change the runnable function. Also call me sceptic on the reset of tcr. If our user space watchdog event is write a message, then we essentially want to hide the fact that the watchdog expired from the guest, right? In that case, the second time-out wouldn't do anything guest visible. This was probably copied straight out of the hardware documentation, which explicitly says TCR[WRC] gets set
RE: [PATCH 2/2 v2] KVM: PPC: booke: Add watchdog emulation
-Original Message- From: kvm-ppc-ow...@vger.kernel.org [mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of Alexander Graf Sent: Tuesday, July 17, 2012 7:31 PM To: Bhushan Bharat-R65777 Cc: Wood Scott-B07421; kvm-ppc@vger.kernel.org; k...@vger.kernel.org; bharatb.ya...@gmail.com; Benjamin Herrenschmidt; Kumar Gala; Avi Kivity Subject: Re: [PATCH 2/2 v2] KVM: PPC: booke: Add watchdog emulation On 07/17/2012 03:15 PM, Bhushan Bharat-R65777 wrote: -Original Message- From: kvm-ppc-ow...@vger.kernel.org [mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of Alexander Graf Sent: Tuesday, July 17, 2012 6:22 PM To: Bhushan Bharat-R65777 Cc: Wood Scott-B07421; kvm-ppc@vger.kernel.org; k...@vger.kernel.org; bharatb.ya...@gmail.com; Benjamin Herrenschmidt; Kumar Gala; Avi Kivity Subject: Re: [PATCH 2/2 v2] KVM: PPC: booke: Add watchdog emulation On 07/17/2012 11:57 AM, Bhushan Bharat-R65777 wrote: -Original Message- From: kvm-ppc-ow...@vger.kernel.org [mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of Alexander Graf Sent: Tuesday, July 17, 2012 12:50 PM To: Wood Scott-B07421 Cc: Bhushan Bharat-R65777; kvm-ppc@vger.kernel.org; k...@vger.kernel.org; bharatb.ya...@gmail.com; Bhushan Bharat-R65777; Benjamin Herrenschmidt; Kumar Gala Subject: Re: [PATCH 2/2 v2] KVM: PPC: booke: Add watchdog emulation On 17.07.2012, at 03:02, Scott Wood scottw...@freescale.com wrote: On 07/16/2012 12:18 PM, Alexander Graf wrote: +/* + * Return the number of jiffies until the next timeout. If the timeout is + * longer than the NEXT_TIMER_MAX_DELTA, that then? return NEXT_TIMER_MAX_DELTA + * instead. I can read code. Come on, it's not exactly x++; /* add one to x */ It's faster to read code (as well as know the constraints within which you can modify it without having to spend a lot of time digesting all the callers' use cases) when you have a high level description of its interface contract, and can be selective about when to zoom in to the details. Linux kernel code tends to be bad about this. Yeah, not opposed to leave that part in :). The important piece of information in the comment is missing: The reason. The reason for what? Why you want to know the next timeout? That's the caller's business. Or why we use NEXT_TIMER_MAX_DELTA as the limit? Why we use the limit. IIRC it was explained in the last thread, just didn't make its way into the comment. Earlier we have a comment on the #define MAX_TIMEOUT (new define added for a purpose, so the comment described the puspose). Now we uses the generic #define NEX_TIMER_MAX_DELTA (include/linux/timer.h), so removed the comment. Ah, ok. Just saying, if you comment on some mechanism, like you did here, please also include the reasoning behind it. For example Do foo if x is true. isn't particularly helpful. However Do foo if x is true because the bar API will break with high values is very helpful. It includes the action and reason of the code :). Alternatively, to me the same as above would be /* bar API will break with high values */ if (x) do(foo) because in this case the code is the action description. Either variant works fine for me. Ok :) +void kvmppc_watchdog_func(unsigned long data) { +struct kvm_vcpu *vcpu = (struct kvm_vcpu *)data; +u32 tsr, new_tsr; +int final; + +do { +new_tsr = tsr = vcpu-arch.tsr; +final = 0; + +/* Time out event */ +if (tsr TSR_ENW) { +if (tsr TSR_WIS) { +new_tsr = (tsr ~TCR_WRC_MASK) | + (vcpu-arch.tcr TCR_WRC_MASK); +vcpu-arch.tcr = ~TCR_WRC_MASK; Can't we just poke the vcpu to exit the VM and do the above on its own? We've discussed this before. TSR updates are done via atomics, and we send a request for the vcpu to act on the result. This is how the decrementer works. http://www.spinics.net/lists/kvm-ppc/msg03169.html Yeah, the major difference to the dec is the atomicity of the whole thing. Dec changes one bit to enable the interrupt line. The final expiration is more complex. Is not setting the TSR.WRS atomic here (cmpxchg() will handle this)? Final expiration sets TCR. TSR should be ok. It is clearing some TCR bits :) Let us move the TCR clearing to userspace (please see below response ^^). Then it is just setting TSR. Right? Then TSR is still set, right? So we don't set it, it just keeps on being 1. We need to trigger another expired event in that if branch now. I am sorry Alex but I did not get you. What I meant was that you were having concern that dec is atomic while watchdog final expiration (complex) is not atomic, but if we move the TCR.WRC clearing to userspace on final expiration then are you ok with current code? Thanks -Bharat Alex
RE: [PATCH 2/2 v2] KVM: PPC: booke: Add watchdog emulation
-Original Message- From: Wood Scott-B07421 Sent: Tuesday, July 17, 2012 10:08 PM To: Bhushan Bharat-R65777 Cc: Wood Scott-B07421; Alexander Graf; kvm-ppc@vger.kernel.org; k...@vger.kernel.org; bharatb.ya...@gmail.com; Benjamin Herrenschmidt; Kumar Gala Subject: Re: [PATCH 2/2 v2] KVM: PPC: booke: Add watchdog emulation On 07/17/2012 06:31 AM, Bhushan Bharat-R65777 wrote: int kvm_arch_vcpu_runnable(struct kvm_vcpu *v) { -return !(v-arch.shared-msr MSR_WE) || - !!(v-arch.pending_exceptions) || - v-requests; +bool ret = !(v-arch.shared-msr MSR_WE) || + !!(v-arch.pending_exceptions) || + v-requests; + +ret = ret || kvmppc_get_tsr_wrc(v); Why do you need to declare the cpu as non-runnable when a watchdog event occured? It's the other way around -- it's always runnable when a watchdog exit is pending. It's like a pending exception. With the above check, Are we trying to handle the case where watchdog interrupt bit in pending_exception is cleared by guest after final expiry but before the qemu exit? No, we're just trying to test the actual condition we want to exit on. The watchdog interrupt might be masked (either with WIE or CE). If the interrupt is masked then still the pending_exception will be set. But interrupt will not be delivered to guest and Final expiry can still cause reset etc !! Right? And we want that if TSR.WRS update wins the race with clearing of watchdog interrupt condition from guest then anyways let QEMU exit with reason KVM_EXIT_WDT? What race? If ENW and WIS are both set when the watchdog timer fires, it's a final expiration. It's irrelevant what happens to WIS after that point, before enforcement kicks in. What I was thinking of was if we can remove this check in vcpu_runnable() and use watchdog bit in pending_exception to qualify vcpu_runnable() What if we do not allow guest clear watchdog interrupt condition if final expiry already happened? What would that solve? Removing ret || kvmppc_get_tsr_wrc(v); this from vcpu_runnable(). Thanks -Bharat -Scott
RE: [PATCH 2/2 v2] KVM: PPC: booke: Add watchdog emulation
-Original Message- From: Wood Scott-B07421 Sent: Tuesday, July 17, 2012 10:31 PM To: Bhushan Bharat-R65777 Cc: Wood Scott-B07421; Alexander Graf; kvm-ppc@vger.kernel.org; k...@vger.kernel.org; bharatb.ya...@gmail.com; Benjamin Herrenschmidt; Kumar Gala Subject: Re: [PATCH 2/2 v2] KVM: PPC: booke: Add watchdog emulation On 07/17/2012 11:56 AM, Bhushan Bharat-R65777 wrote: -Original Message- From: Wood Scott-B07421 Sent: Tuesday, July 17, 2012 10:08 PM To: Bhushan Bharat-R65777 Cc: Wood Scott-B07421; Alexander Graf; kvm-ppc@vger.kernel.org; k...@vger.kernel.org; bharatb.ya...@gmail.com; Benjamin Herrenschmidt; Kumar Gala Subject: Re: [PATCH 2/2 v2] KVM: PPC: booke: Add watchdog emulation On 07/17/2012 06:31 AM, Bhushan Bharat-R65777 wrote: int kvm_arch_vcpu_runnable(struct kvm_vcpu *v) { -return !(v-arch.shared-msr MSR_WE) || - !!(v-arch.pending_exceptions) || - v-requests; +bool ret = !(v-arch.shared-msr MSR_WE) || + !!(v-arch.pending_exceptions) || + v-requests; + +ret = ret || kvmppc_get_tsr_wrc(v); Why do you need to declare the cpu as non-runnable when a watchdog event occured? It's the other way around -- it's always runnable when a watchdog exit is pending. It's like a pending exception. With the above check, Are we trying to handle the case where watchdog interrupt bit in pending_exception is cleared by guest after final expiry but before the qemu exit? No, we're just trying to test the actual condition we want to exit on. The watchdog interrupt might be masked (either with WIE or CE). If the interrupt is masked then still the pending_exception will be set. Not if it's masked by WIE -- and even when masked by CE, it's a bug that we currently consider the vcpu runnable. We shouldn't depend on that bug. Scott can you please describe what is bug? What I remember is that if vcpu is not run-able then we halt vcpu and cannot cause qemu exit also. Thanks -Bharat
RE: [PATCH] openpic: Added BRR1 register
-Original Message- From: Wood Scott-B07421 Sent: Monday, July 16, 2012 10:34 PM To: Bhushan Bharat-R65777 Cc: Alexander Graf; qemu-...@nongnu.org; kvm-ppc@vger.kernel.org Subject: Re: [PATCH] openpic: Added BRR1 register On 07/16/2012 11:21 AM, Bhushan Bharat-R65777 wrote: +case 0x00: /* BRR1 */ +retval = 0x00400200; Please unmagicify this one :) /* BRR1 ( Block revision register ) */ #define IPID 0x0040 /* IP-block ID */ #define IPMJ 0x0200 /* IP major number */ #define IPMN 0x0200 /* IP minor number */ IPMN looks wrong. Opps : copy paste error here :) Btw, I am not aware of all MPIC IP versions, differences in them and what is the latest/best version this emulated code supports. I drive this value from mpc8544 Reference Manual. I requested Alex to suggest the most updated version. Thanks -Bharat -Scott N�r��yb�X��ǧv�^�){.n�+jir)w*jg����ݢj/���z�ޖ��2�ޙ�)ߡ�a�����G���h��j:+v���w��٥
RE: [PATCH 2/2] KVM: PPC: booke: Add watchdog emulation
-Original Message- From: Alexander Graf [mailto:ag...@suse.de] Sent: Saturday, July 07, 2012 1:21 PM To: Wood Scott-B07421 Cc: Bhushan Bharat-R65777; kvm-ppc@vger.kernel.org; k...@vger.kernel.org; Bhushan Bharat-R65777 Subject: Re: [PATCH 2/2] KVM: PPC: booke: Add watchdog emulation On 07.07.2012, at 01:37, Scott Wood wrote: On 07/06/2012 08:17 AM, Alexander Graf wrote: On 28.06.2012, at 08:17, Bharat Bhushan wrote: +/* + * The timer system can almost deal with LONG_MAX timeouts, except that + * when you get very close to LONG_MAX, the slack added can cause overflow. + * + * LONG_MAX/2 is a conservative threshold, but it should be adequate for + * any realistic use. + */ +#define MAX_TIMEOUT (LONG_MAX/2) Should this really be in kvm code? It looks like we can use NEXT_TIMER_MAX_DELTA for this. + mask = 1ULL (63 - period); + tb = get_tb(); + if (tb mask) + nr_jiffies += mask; To be honest, you lost me here. nr_jiffies is jiffies, right? While mask is basically in timebase granularity. I suppose you're just reusing the variable here for no good reason - the compiler will gladly optimize things for you if you write things a bit more verbose :). Probably due to the way do_div() works, but yeah, it's confusing. Maybe something generic like ticks, interval, remaining, etc. would be better, with a comment on the do_div saying it's converting timebase ticks into jiffies. Well, you could start off with a variable delta_tb, then do nr_jiffies = delta_tb; x = do_div(...); and things would suddenly become readable :). Of course I don't object to comments along the code either :). Ok +static void arm_next_watchdog(struct kvm_vcpu *vcpu) +{ + unsigned long nr_jiffies; + + nr_jiffies = watchdog_next_timeout(vcpu); + if (nr_jiffies MAX_TIMEOUT) + mod_timer(vcpu-arch.wdt_timer, jiffies + nr_jiffies); + else + del_timer(vcpu-arch.wdt_timer); Can you del a timer that's not armed? Could that ever happen in this case? del_timer() deactivates a timer - this works on both active and inactive timers. Ah, good :). Also, could the timer possibly be running somewhere, so do we need del_timer_sync? Or don't we need to care? This can be called in the context of the timer, so del_timer_sync() would hang. As for what would happen if a caller from a different context were to race with a timer, I think you could end up with the timer armed based on an old TCR. del_timer_sync() won't help though, unless you replace mod_timer() with del_timer_sync() plus add_timer() (with a check to see whether it's running in timer context). A better solution is probably to use a spinlock in arm_next_watchdog(). Yup. Either way, we have a race that the guest might not expect. Ok, will use spinlock in arm_next_watchdog(). +void kvmppc_watchdog_func(unsigned long data) +{ + struct kvm_vcpu *vcpu = (struct kvm_vcpu *)data; + u32 tsr, new_tsr; + int final; + + do { + new_tsr = tsr = vcpu-arch.tsr; + final = 0; + + /* Time out event */ + if (tsr TSR_ENW) { + if (tsr TSR_WIS) { + new_tsr = (tsr ~TCR_WRC_MASK) | + (vcpu-arch.tcr TCR_WRC_MASK); + vcpu-arch.tcr = ~TCR_WRC_MASK; + final = 1; + } else { + new_tsr = tsr | TSR_WIS; + } + } else { + new_tsr = tsr | TSR_ENW; + } + } while (cmpxchg(vcpu-arch.tsr, tsr, new_tsr) != tsr); + + if (new_tsr (TSR_WIS | TCR_WRC_MASK)) { + smp_wmb(); + kvm_make_request(KVM_REQ_PENDING_TIMER, vcpu); + kvm_vcpu_kick(vcpu); + } + + /* + * Avoid getting a storm of timers if the guest sets + * the period very short. We'll restart it if anything + * changes. + */ + if (!final) + arm_next_watchdog(vcpu); Mind to explain this part a bit further? The whole function, or some subset near the end? The if (!final) check means that we stop running the timer after final expiration, to prevent the host from being flooded with timers if the guest sets a short period but does not have TCR set to exit to QEMU. Timers will resume the next time TSR/TCR is updated. Ah. The semantics make sense. The comment however is slightly too short. Please explain this in a more verbose way, so someone who didn't write the code knows what's going on :). Ok. @@ -1106,7 +1213,14 @@ static int set_sregs_base(struct kvm_vcpu *vcpu, } if (sregs-u.e.update_special KVM_SREGS_E_UPDATE_TSR) { + u32 old_tsr = vcpu-arch.tsr; + vcpu-arch.tsr = sregs-u.e.tsr; + + if ((old_tsr ^ vcpu-arch.tsr) + (TSR_ENW | TSR_WIS | TCR_WRC_MASK
RE: [PATCH 3/4] Watchdog exit handling support
-Original Message- From: Wood Scott-B07421 Sent: Friday, July 06, 2012 1:57 AM To: Bhushan Bharat-R65777 Cc: Wood Scott-B07421; qemu-...@nongnu.org; kvm-ppc@vger.kernel.org; ag...@suse.de Subject: Re: [PATCH 3/4] Watchdog exit handling support On 07/04/2012 06:13 AM, Bhushan Bharat-R65777 wrote: -Original Message- From: Wood Scott-B07421 Sent: Friday, June 29, 2012 3:57 AM To: Bhushan Bharat-R65777 Cc: qemu-...@nongnu.org; kvm-ppc@vger.kernel.org; ag...@suse.de; Bhushan Bharat- R65777 Subject: Re: [PATCH 3/4] Watchdog exit handling support On 06/28/2012 12:39 AM, Bharat Bhushan wrote: This patch adds the support to handle the exit caused by watchdog (KVM_EXIT_WDT). In the handling we clear the TSR register. I'm not sure what the logical split is between this patch and 4/4... Ok I will merge 3/4 and 4/4. diff --git a/hw/ppc_booke.c b/hw/ppc_booke.c index 837a5b6..a9fba15 100644 --- a/hw/ppc_booke.c +++ b/hw/ppc_booke.c @@ -203,6 +203,11 @@ static void booke_wdt_cb(void *opaque) booke_timer-wdt_timer); } +void ppc_booke_wdt_clear_tsr(CPUPPCState *env, target_ulong tsr) { +env-spr[SPR_BOOKE_TSR] = tsr ~(TSR_ENW | TSR_WIS | +TSR_WRS_MASK); } + We should probably call this function before returning to KVM, at least after we halt for debug, possibly other times. Why clearing watchdog related bits in TSR? This will just reset the watchdog state machine. Do we actually want to reset the state machine or want watchdog interrupt to not occur when exiting to KVM for debug. We want to prevent long delays in QEMU (especially for debug halt) from causing watchdog actions (interrupt, reset, etc). Resetting the state machine seems like the way to do that. Does possibly other times mean except KVM_RUN? Not sure what you mean here. Don't we always use KVM_RUN to enter the guest? My point was we definitely want this after we had the guest halted, and we may want to consider doing it for some other heavyweight exits. Your initial comment was We should probably call this function before returning to KVM. Which suggest that ioctls calls made by QEMU to enter KVM. I am a bit confused, now you want to avoid long delays in QEMU? This suggest whenever we exit to qemu, right? Thanks =Bharat -Scott
RE: [PATCH 3/4] Watchdog exit handling support
-Original Message- From: Wood Scott-B07421 Sent: Friday, July 06, 2012 6:30 AM To: Bhushan Bharat-R65777 Cc: Wood Scott-B07421; qemu-...@nongnu.org; kvm-ppc@vger.kernel.org; ag...@suse.de Subject: Re: [PATCH 3/4] Watchdog exit handling support On 07/05/2012 07:43 PM, Bhushan Bharat-R65777 wrote: -Original Message- From: Wood Scott-B07421 Sent: Friday, July 06, 2012 1:57 AM To: Bhushan Bharat-R65777 Cc: Wood Scott-B07421; qemu-...@nongnu.org; kvm-ppc@vger.kernel.org; ag...@suse.de Subject: Re: [PATCH 3/4] Watchdog exit handling support On 07/04/2012 06:13 AM, Bhushan Bharat-R65777 wrote: -Original Message- From: Wood Scott-B07421 Sent: Friday, June 29, 2012 3:57 AM To: Bhushan Bharat-R65777 Cc: qemu-...@nongnu.org; kvm-ppc@vger.kernel.org; ag...@suse.de; Bhushan Bharat- R65777 Subject: Re: [PATCH 3/4] Watchdog exit handling support On 06/28/2012 12:39 AM, Bharat Bhushan wrote: This patch adds the support to handle the exit caused by watchdog (KVM_EXIT_WDT). In the handling we clear the TSR register. I'm not sure what the logical split is between this patch and 4/4... Ok I will merge 3/4 and 4/4. diff --git a/hw/ppc_booke.c b/hw/ppc_booke.c index 837a5b6..a9fba15 100644 --- a/hw/ppc_booke.c +++ b/hw/ppc_booke.c @@ -203,6 +203,11 @@ static void booke_wdt_cb(void *opaque) booke_timer-wdt_timer); } +void ppc_booke_wdt_clear_tsr(CPUPPCState *env, target_ulong tsr) { +env-spr[SPR_BOOKE_TSR] = tsr ~(TSR_ENW | TSR_WIS | +TSR_WRS_MASK); } + We should probably call this function before returning to KVM, at least after we halt for debug, possibly other times. Why clearing watchdog related bits in TSR? This will just reset the watchdog state machine. Do we actually want to reset the state machine or want watchdog interrupt to not occur when exiting to KVM for debug. We want to prevent long delays in QEMU (especially for debug halt) from causing watchdog actions (interrupt, reset, etc). Resetting the state machine seems like the way to do that. Does possibly other times mean except KVM_RUN? Not sure what you mean here. Don't we always use KVM_RUN to enter the guest? My point was we definitely want this after we had the guest halted, and we may want to consider doing it for some other heavyweight exits. Your initial comment was We should probably call this function before returning to KVM. Which suggest that ioctls calls made by QEMU to enter KVM. I am a bit confused, now you want to avoid long delays in QEMU? This suggest whenever we exit to qemu, right? Whenever we exit to QEMU is one possibility, though it's probably too aggressive (if the guest gets stuck in a loop that contains a QEMU exit, the watchdog won't catch it). Another option is to reset the state machine only when the guest resumes from a debug halt. Ok. Resetting the state machine does not disable watchdog, it will just ensure the longest time for debug halt etc. Is that ok ? Thanks -Bharat -Scott N�r��yb�X��ǧv�^�){.n�+jir)w*jg����ݢj/���z�ޖ��2�ޙ�)ߡ�a�����G���h��j:+v���w��٥
RE: [PATCH 3/4] Watchdog exit handling support
-Original Message- From: Wood Scott-B07421 Sent: Friday, June 29, 2012 3:57 AM To: Bhushan Bharat-R65777 Cc: qemu-...@nongnu.org; kvm-ppc@vger.kernel.org; ag...@suse.de; Bhushan Bharat- R65777 Subject: Re: [PATCH 3/4] Watchdog exit handling support On 06/28/2012 12:39 AM, Bharat Bhushan wrote: This patch adds the support to handle the exit caused by watchdog (KVM_EXIT_WDT). In the handling we clear the TSR register. I'm not sure what the logical split is between this patch and 4/4... Ok I will merge 3/4 and 4/4. diff --git a/hw/ppc_booke.c b/hw/ppc_booke.c index 837a5b6..a9fba15 100644 --- a/hw/ppc_booke.c +++ b/hw/ppc_booke.c @@ -203,6 +203,11 @@ static void booke_wdt_cb(void *opaque) booke_timer-wdt_timer); } +void ppc_booke_wdt_clear_tsr(CPUPPCState *env, target_ulong tsr) { +env-spr[SPR_BOOKE_TSR] = tsr ~(TSR_ENW | TSR_WIS | +TSR_WRS_MASK); } + We should probably call this function before returning to KVM, at least after we halt for debug, possibly other times. Why clearing watchdog related bits in TSR? This will just reset the watchdog state machine. Do we actually want to reset the state machine or want watchdog interrupt to not occur when exiting to KVM for debug. Does possibly other times mean except KVM_RUN? Thanks -Bharat -Scott N�r��yb�X��ǧv�^�){.n�+jir)w*jg����ݢj/���z�ޖ��2�ޙ�)ߡ�a�����G���h��j:+v���w��٥
RE: Mapping magic page from host side
-Original Message- From: kvm-ppc-ow...@vger.kernel.org [mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of cs5070214 Sent: Tuesday, May 29, 2012 8:12 PM To: kvm-ppc@vger.kernel.org Subject: Mapping magic page from host side Hi, We are trying to do the patching of the privileged instructions of guest from host side (instead of guest kernel patching itself). For this we first need to map the magic page which is currently being done via hypercall from guest. We tried a few approaches. When we map the magic page in the emulation code for the first exit due to MTMSR, it works and guest boots fine. But if we try to map the page on the first exit due to any privileged exits, the guest does not boot and it gives an error. = Oops: Kernel access of bad area, sig: 11 [#1] SMP NR_CPUS=8 P2020 RDB Modules linked in: NIP: c005811c LR: c0052284 CTR: c02e59cc REGS: d81dd9b0 TRAP: 0300 Not tainted (3.0.0-rc4-g832abe3-dirty) MSR: 00029200 EE,ME,CE,DE CR: 2224 XER: 2000 DEAR: e11cec14, ESR: TASK = dc902a00[1267] 'qemu-system-ppc' THREAD: d81dc000 CPU: 0 GPR00: e104a7d0 d81dda60 dc902a00 d809 e11cec10 c051 c02e68f8 277420 GPR08: c06f5290 d80cfffc c0720558 d81dda70 8224 d809 00 GPR16: 00 GPR24: c072 c051 d99c d809 f000 d81dda NIP [c005811c] kvmppc_mmu_xlate+0x40/0xc4 LR [c0052284] kvmppc_read_guest+0x48/0xb0 = My question is are there any prerequisites that needs to be satisfied before we map the magic page and what would be the proper place to do it? Can we try mapping before we does first lightweight exit? Thanks -Bharat Thanks and Regards, Dushyant Bansal -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [PATCH] KVM: PPC: booke: Added DECAR support
-Original Message- From: Alexander Graf [mailto:ag...@suse.de] Sent: Tuesday, May 15, 2012 7:59 PM To: Bhushan Bharat-R65777 Cc: kvm-ppc@vger.kernel.org; Bhushan Bharat-R65777 Subject: Re: [PATCH] KVM: PPC: booke: Added DECAR support On 05/15/2012 09:33 AM, Bharat Bhushan wrote: Added the decrementer auto-reload support. Signed-off-by: Bharat Bhushanbharat.bhus...@freescale.com --- arch/powerpc/include/asm/kvm_host.h |2 ++ arch/powerpc/kvm/booke.c|5 + arch/powerpc/kvm/booke_emulate.c|7 ++- 3 files changed, 13 insertions(+), 1 deletions(-) diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h index d848cdc..1d6f89e 100644 --- a/arch/powerpc/include/asm/kvm_host.h +++ b/arch/powerpc/include/asm/kvm_host.h @@ -414,7 +414,9 @@ struct kvm_vcpu_arch { ulong mcsrr1; ulong mcsr; u32 dec; +#ifdef CONFIG_BOOKE u32 decar; +#endif u32 tbl; u32 tbu; u32 tcr; diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c index 72f13f4..86681ee 100644 --- a/arch/powerpc/kvm/booke.c +++ b/arch/powerpc/kvm/booke.c @@ -1267,6 +1267,11 @@ void kvmppc_decrementer_func(unsigned long data) { struct kvm_vcpu *vcpu = (struct kvm_vcpu *)data; + if (vcpu-arch.tcr TCR_ARE) { + vcpu-arch.dec = vcpu-arch.decar; + kvmppc_emulate_dec(vcpu); + } + kvmppc_set_tsr_bits(vcpu, TSR_DIS); } diff --git a/arch/powerpc/kvm/booke_emulate.c b/arch/powerpc/kvm/booke_emulate.c index 6c76397..83c3796 100644 --- a/arch/powerpc/kvm/booke_emulate.c +++ b/arch/powerpc/kvm/booke_emulate.c @@ -129,6 +129,9 @@ int kvmppc_booke_emulate_mtspr(struct kvm_vcpu *vcpu, int sprn, ulong spr_val) kvmppc_set_tcr(vcpu, spr_val); break; + case SPRN_DECAR: + vcpu-arch.decar = spr_val; + break; /* * Note: SPRG4-7 are user-readable. * These values are loaded into the real SPRGs when resuming the @@ -244,7 +247,9 @@ int kvmppc_booke_emulate_mfspr(struct kvm_vcpu *vcpu, int sprn, ulong *spr_val) case SPRN_TCR: *spr_val = vcpu-arch.tcr; break; - + case SPRN_DECAR: + *spr_val = vcpu-arch.decar; + break; DECAR can't be read. Otherwise looks good to me. DECAR can be read on e500mc cores. So I will make this under CONFIG_KVM_E500MC. What happens if DECAR is read on non e500mc? Is it treated as NOP or illegal instruction exception? Thanks -Bharat -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Not emulated registers on BOOKE_HV (GS-mode)
Hi Alex, There is below comment in arch/powerpc/kvm/booke_emulate.c /* * NOTE: some of these registers are not emulated on BOOKE_HV (GS-mode). * Their backing store is in real registers, and these functions * will return the wrong result if called for them in another context * (such as debugging). */ some of these registers are not emulated on BOOKE_HV (GS-mode) 1) Is not that mtspr()/mfspr() for not emulated registers should follow EMULATE_FAIL path? So should be ifdef out for BOOKE_HV? Otherwise the emulation code execute. 2) Or These are not emulated because the GS mode have direct access to these registers, Right? So no trap? and these functions will return the wrong result if called for them in another context (such as debugging). 1) So do you mean that guest is not supposed to access these registers in normal scenario but the debugger (some command on gdb in guest) can access these register? then does it make sense to treat mtspr() as nop and mfspr returns 0/undefined? In our local repository Scott Wood removed this comment by ifdef out those registers for BOOKE_HV. Below is the change (extracted - not the exact patch which does this) diff --git a/arch/powerpc/kvm/booke_emulate.c b/arch/powerpc/kvm/booke_emulate.c index 83c3796..6d78906 100644 --- a/arch/powerpc/kvm/booke_emulate.c +++ b/arch/powerpc/kvm/booke_emulate.c @@ -46,18 +46,21 @@ int kvmppc_booke_emulate_op(struct kvm_run *run, struct kvm_vcpu *vcpu, switch (get_op(inst)) { case 19: switch (get_xop(inst)) { +#ifndef CONFIG_KVM_BOOKE_HV case OP_19_XOP_RFI: kvmppc_emul_rfi(vcpu); kvmppc_set_exit_type(vcpu, EMULATED_RFI_EXITS); *advance = 0; break; +#endif default: emulated = EMULATE_FAIL; break; } break; +#ifndef CONFIG_KVM_BOOKE_HV case 31: switch (get_xop(inst)) { @@ -89,6 +92,7 @@ int kvmppc_booke_emulate_op(struct kvm_run *run, struct kvm_vcpu *vcpu, break; +#endif default: emulated = EMULATE_FAIL; } @@ -96,23 +100,19 @@ int kvmppc_booke_emulate_op(struct kvm_run *run, struct kvm_vcpu *vcpu, return emulated; } -/* - * NOTE: some of these registers are not emulated on BOOKE_HV (GS-mode). - * Their backing store is in real registers, and these functions - * will return the wrong result if called for them in another context - * (such as debugging). - */ int kvmppc_booke_emulate_mtspr(struct kvm_vcpu *vcpu, int sprn, ulong spr_val) { int emulated = EMULATE_DONE; switch (sprn) { +#ifndef CONFIG_KVM_BOOKE_HV case SPRN_DEAR: vcpu-arch.shared-dar = spr_val; break; case SPRN_ESR: vcpu-arch.shared-esr = spr_val; break; +#endif case SPRN_DBCR0: vcpu-arch.dbcr0 = spr_val; break; @@ -223,6 +223,7 @@ int kvmppc_booke_emulate_mfspr(struct kvm_vcpu *vcpu, int sprn, ulong *spr_val) int emulated = EMULATE_DONE; switch (sprn) { +#ifndef CONFIG_KVM_BOOKE_HV case SPRN_IVPR: *spr_val = vcpu-arch.ivpr; break; @@ -232,6 +233,7 @@ int kvmppc_booke_emulate_mfspr(struct kvm_vcpu *vcpu, int sprn, ulong *spr_val) case SPRN_ESR: *spr_val = vcpu-arch.shared-esr; break; +#endif case SPRN_DBCR0: *spr_val = vcpu-arch.dbcr0; break; -- 1.7.0.4 Thanks -Bharat -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [PATCH] KVM: PPC: Not optimizing MSR_CE and MSR_DE with paravirt.
-Original Message- From: kvm-ppc-ow...@vger.kernel.org [mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of Alexander Graf Sent: Tuesday, May 15, 2012 8:01 PM To: Bhushan Bharat-R65777 Cc: kvm-ppc@vger.kernel.org; Bhushan Bharat-R65777 Subject: Re: [PATCH] KVM: PPC: Not optimizing MSR_CE and MSR_DE with paravirt. On 05/15/2012 09:37 AM, Bharat Bhushan wrote: From: Bhushan Bharat-R65777r65...@freescale.com If there is pending critical or machine check interrupt then guest would like to capture it when guest enable MSR.CE and MSR_ME respectively. Also as mostly MSR_CE and MSR_ME are updated with rfi/rfci/rfmii which anyway traps so removing the the paravirt optimization for MSR.CE and MSR.ME. It's only not safe for e500mc and above, right? For e500mc and above the paravirt emulation code will not come into picture, And critical and machine check interrupt will happen only if MSR have corresponding bits set. So they are already safe. Is not it? E500v2 and book3s should be fine. And with this patch e500v2 will be fine? Not sure of book3s :). Thanks -Bharat Alex -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [PATCH] KVM: PPC: booke: Added DECAR support
-Original Message- From: Alexander Graf [mailto:ag...@suse.de] Sent: Wednesday, May 16, 2012 2:36 PM To: Bhushan Bharat-R65777 Cc: kvm-ppc@vger.kernel.org Subject: Re: [PATCH] KVM: PPC: booke: Added DECAR support On 05/16/2012 08:57 AM, Bhushan Bharat-R65777 wrote: -Original Message- From: Alexander Graf [mailto:ag...@suse.de] Sent: Tuesday, May 15, 2012 7:59 PM To: Bhushan Bharat-R65777 Cc: kvm-ppc@vger.kernel.org; Bhushan Bharat-R65777 Subject: Re: [PATCH] KVM: PPC: booke: Added DECAR support On 05/15/2012 09:33 AM, Bharat Bhushan wrote: Added the decrementer auto-reload support. Signed-off-by: Bharat Bhushanbharat.bhus...@freescale.com --- arch/powerpc/include/asm/kvm_host.h |2 ++ arch/powerpc/kvm/booke.c|5 + arch/powerpc/kvm/booke_emulate.c|7 ++- 3 files changed, 13 insertions(+), 1 deletions(-) diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h index d848cdc..1d6f89e 100644 --- a/arch/powerpc/include/asm/kvm_host.h +++ b/arch/powerpc/include/asm/kvm_host.h @@ -414,7 +414,9 @@ struct kvm_vcpu_arch { ulong mcsrr1; ulong mcsr; u32 dec; +#ifdef CONFIG_BOOKE u32 decar; +#endif u32 tbl; u32 tbu; u32 tcr; diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c index 72f13f4..86681ee 100644 --- a/arch/powerpc/kvm/booke.c +++ b/arch/powerpc/kvm/booke.c @@ -1267,6 +1267,11 @@ void kvmppc_decrementer_func(unsigned long data) { struct kvm_vcpu *vcpu = (struct kvm_vcpu *)data; + if (vcpu-arch.tcr TCR_ARE) { + vcpu-arch.dec = vcpu-arch.decar; + kvmppc_emulate_dec(vcpu); + } + kvmppc_set_tsr_bits(vcpu, TSR_DIS); } diff --git a/arch/powerpc/kvm/booke_emulate.c b/arch/powerpc/kvm/booke_emulate.c index 6c76397..83c3796 100644 --- a/arch/powerpc/kvm/booke_emulate.c +++ b/arch/powerpc/kvm/booke_emulate.c @@ -129,6 +129,9 @@ int kvmppc_booke_emulate_mtspr(struct kvm_vcpu *vcpu, int sprn, ulong spr_val) kvmppc_set_tcr(vcpu, spr_val); break; + case SPRN_DECAR: + vcpu-arch.decar = spr_val; + break; /* * Note: SPRG4-7 are user-readable. * These values are loaded into the real SPRGs when resuming the @@ -244,7 +247,9 @@ int kvmppc_booke_emulate_mfspr(struct kvm_vcpu *vcpu, int sprn, ulong *spr_val) case SPRN_TCR: *spr_val = vcpu-arch.tcr; break; - + case SPRN_DECAR: + *spr_val = vcpu-arch.decar; + break; DECAR can't be read. Otherwise looks good to me. DECAR can be read on e500mc cores. So I will make this under CONFIG_KVM_E500MC. What happens if DECAR is read on non e500mc? Is it treated as NOP or illegal instruction exception? I would assume the latter. NOPs usually don't happen. If anything, it would return 0. See section 9.4 in the PowerISA: The contents of the Decrementer Auto-Reload Register cannot be read. The contents of bits 32:63 of register RS can be written to the Decrementer Auto- Reload Register using the mtspr instruction. Could you please paste the respective passage of the e500mc manual that declares DECAR as readable? I have booke -4 version 1.05. There are remarks at 3 place 1) Table 2-1: There is remark DECAR is defined by the architecture to be write-only, however the e500mc allows it to be read. 2) 2.8.5 Decrementer Auto-Reload Register (DECAR) For e500V4, the DECAR can be read in hypervisor state, although the architecture defines it as a write-only register. -- 3) Table 4.37 Hypervisor Privilege on Read - Yes Hypervisor Privilege on Write - Yes Remark: e500mc allows reading of DECAR although Power ISA does not define it. I verified with my unit test that reading DECAR traps in KVM on e500mc. Thanks -Bharat -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [PATCH 24/37] KVM: PPC: booke: rework rescheduling checks
-Original Message- From: kvm-ow...@vger.kernel.org [mailto:kvm-ow...@vger.kernel.org] On Behalf Of Alexander Graf Sent: Friday, February 24, 2012 7:56 PM To: kvm-ppc@vger.kernel.org Cc: k...@vger.kernel.org; linuxppc-...@lists.ozlabs.org; Wood Scott-B07421 Subject: [PATCH 24/37] KVM: PPC: booke: rework rescheduling checks Instead of checking whether we should reschedule only when we exited due to an interrupt, let's always check before entering the guest back again. This gets the target more in line with the other archs. Also while at it, generalize the whole thing so that eventually we could have a single kvmppc_prepare_to_enter function for all ppc targets that does signal and reschedule checking for us. Signed-off-by: Alexander Graf ag...@suse.de --- arch/powerpc/include/asm/kvm_ppc.h |2 +- arch/powerpc/kvm/book3s.c |4 ++- arch/powerpc/kvm/booke.c | 70 --- 3 files changed, 52 insertions(+), 24 deletions(-) diff --git a/arch/powerpc/include/asm/kvm_ppc.h b/arch/powerpc/include/asm/kvm_ppc.h index e709975..7f0a3da 100644 --- a/arch/powerpc/include/asm/kvm_ppc.h +++ b/arch/powerpc/include/asm/kvm_ppc.h @@ -95,7 +95,7 @@ extern int kvmppc_core_vcpu_translate(struct kvm_vcpu *vcpu, extern void kvmppc_core_vcpu_load(struct kvm_vcpu *vcpu, int cpu); extern void kvmppc_core_vcpu_put(struct kvm_vcpu *vcpu); -extern void kvmppc_core_prepare_to_enter(struct kvm_vcpu *vcpu); +extern int kvmppc_core_prepare_to_enter(struct kvm_vcpu *vcpu); extern int kvmppc_core_pending_dec(struct kvm_vcpu *vcpu); extern void kvmppc_core_queue_program(struct kvm_vcpu *vcpu, ulong flags); extern void kvmppc_core_queue_dec(struct kvm_vcpu *vcpu); diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c index 7d54f4e..c8ead7b 100644 --- a/arch/powerpc/kvm/book3s.c +++ b/arch/powerpc/kvm/book3s.c @@ -258,7 +258,7 @@ static bool clear_irqprio(struct kvm_vcpu *vcpu, unsigned int priority) return true; } -void kvmppc_core_prepare_to_enter(struct kvm_vcpu *vcpu) +int kvmppc_core_prepare_to_enter(struct kvm_vcpu *vcpu) { unsigned long *pending = vcpu-arch.pending_exceptions; unsigned long old_pending = vcpu-arch.pending_exceptions; @@ -283,6 +283,8 @@ void kvmppc_core_prepare_to_enter(struct kvm_vcpu *vcpu) /* Tell the guest about our interrupt status */ kvmppc_update_int_pending(vcpu, *pending, old_pending); + + return 0; } pfn_t kvmppc_gfn_to_pfn(struct kvm_vcpu *vcpu, gfn_t gfn) diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c index 9979be1..3fcec2c 100644 --- a/arch/powerpc/kvm/booke.c +++ b/arch/powerpc/kvm/booke.c @@ -439,8 +439,9 @@ static void kvmppc_core_check_exceptions(struct kvm_vcpu *vcpu) } /* Check pending exceptions and deliver one, if possible. */ -void kvmppc_core_prepare_to_enter(struct kvm_vcpu *vcpu) +int kvmppc_core_prepare_to_enter(struct kvm_vcpu *vcpu) { + int r = 0; WARN_ON_ONCE(!irqs_disabled()); kvmppc_core_check_exceptions(vcpu); @@ -451,8 +452,44 @@ void kvmppc_core_prepare_to_enter(struct kvm_vcpu *vcpu) local_irq_disable(); kvmppc_set_exit_type(vcpu, EMULATED_MTMSRWE_EXITS); - kvmppc_core_check_exceptions(vcpu); + r = 1; }; + + return r; +} + +/* + * Common checks before entering the guest world. Call with interrupts + * disabled. + * + * returns !0 if a signal is pending and check_signal is true */ +static int kvmppc_prepare_to_enter(struct kvm_vcpu *vcpu, bool +check_signal) { + int r = 0; + + WARN_ON_ONCE(!irqs_disabled()); + while (true) { + if (need_resched()) { + local_irq_enable(); + cond_resched(); + local_irq_disable(); + continue; + } + + if (kvmppc_core_prepare_to_enter(vcpu)) { kvmppc_prepare_to_enter() is called even on heavyweight_exit. Should not this be called only on lightweight_exit? Thanks -Bharat + /* interrupts got enabled in between, so we +are back at square 1 */ + continue; + } + + if (check_signal signal_pending(current)) + r = 1; + + break; + } + + return r; } int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu) @@ -470,10 +507,7 @@ int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu) } local_irq_disable(); - - kvmppc_core_prepare_to_enter(vcpu); - - if (signal_pending(current)) { + if (kvmppc_prepare_to_enter(vcpu, true)) { kvm_run-exit_reason = KVM_EXIT_INTR; ret = -EINTR; goto out; @@ -598,25 +632,21 @@ int kvmppc_handle_exit(struct kvm_run *run, struct
RE: [PATCH 24/37] KVM: PPC: booke: rework rescheduling checks
-Original Message- From: Alexander Graf [mailto:ag...@suse.de] Sent: Monday, February 27, 2012 11:53 PM To: Bhushan Bharat-R65777 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; linuxppc-...@lists.ozlabs.org; Wood Scott-B07421 Subject: Re: [PATCH 24/37] KVM: PPC: booke: rework rescheduling checks On 02/27/2012 06:33 PM, Alexander Graf wrote: On 02/27/2012 05:34 PM, Bhushan Bharat-R65777 wrote: +} + +/* + * Common checks before entering the guest world. Call with interrupts + * disabled. + * + * returns !0 if a signal is pending and check_signal is true */ +static int kvmppc_prepare_to_enter(struct kvm_vcpu *vcpu, bool +check_signal) { +int r = 0; + +WARN_ON_ONCE(!irqs_disabled()); +while (true) { +if (need_resched()) { +local_irq_enable(); +cond_resched(); +local_irq_disable(); +continue; +} + +if (kvmppc_core_prepare_to_enter(vcpu)) { kvmppc_prepare_to_enter() is called even on heavyweight_exit. Should not this be called only on lightweight_exit? Yeah, we don't need to call it when exiting anyways. That's a functional change though, which this patch is trying not to introduce. So we should rather do that as a patch on top. So how about this (warning! broken whitespace)? diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c index 7a16b56..616aa2d 100644 --- a/arch/powerpc/kvm/booke.c +++ b/arch/powerpc/kvm/booke.c @@ -464,7 +464,7 @@ int kvmppc_core_prepare_to_enter(struct kvm_vcpu *vcpu) * * returns !0 if a signal is pending and check_signal is true */ -static int kvmppc_prepare_to_enter(struct kvm_vcpu *vcpu, bool check_signal) +static int kvmppc_prepare_to_enter(struct kvm_vcpu *vcpu) { int r = 0; @@ -483,7 +483,7 @@ static int kvmppc_prepare_to_enter(struct kvm_vcpu *vcpu, bool check_signal) continue; } - if (check_signal signal_pending(current)) + if (signal_pending(current)) r = 1; break; @@ -507,7 +507,7 @@ int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu) } local_irq_disable(); - if (kvmppc_prepare_to_enter(vcpu, true)) { + if (kvmppc_prepare_to_enter(vcpu)) { kvm_run-exit_reason = KVM_EXIT_INTR; ret = -EINTR; goto out; @@ -941,13 +941,16 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu, * To avoid clobbering exit_reason, only check for signals if we * aren't already exiting to userspace for some other reason. */ - local_irq_disable(); - if (kvmppc_prepare_to_enter(vcpu, !(r RESUME_HOST))) { - run-exit_reason = KVM_EXIT_INTR; - r = (-EINTR 2) | RESUME_HOST | (r RESUME_FLAG_NV); - kvmppc_account_exit(vcpu, SIGNAL_EXITS); + if (!(r RESUME_HOST)) { + local_irq_disable(); + if (kvmppc_prepare_to_enter(vcpu)) { + run-exit_reason = KVM_EXIT_INTR; + r = (-EINTR 2) | RESUME_HOST | (r RESUME_FLAG_NV); + kvmppc_account_exit(vcpu, SIGNAL_EXITS); + } } +out: Why? Otherwise looks ok to me. Thanks -Bharat return r; } -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [PATCH v2] Fix DEC truncation for greater than 0xffff_ffff/1000
-Original Message- From: Liu Yu-B13201 Sent: Wednesday, October 19, 2011 4:28 PM To: Bhushan Bharat-R65777; ag...@suse.de Cc: kvm-ppc@vger.kernel.org; bharatb.ya...@gmail.com; Bhushan Bharat- R65777 Subject: RE: [PATCH v2] Fix DEC truncation for greater than 0x_/1000 -Original Message- From: kvm-ppc-ow...@vger.kernel.org [mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of Bharat Bhushan Sent: Wednesday, October 19, 2011 12:16 PM To: ag...@suse.de Cc: kvm-ppc@vger.kernel.org; bharatb.ya...@gmail.com; Bhushan Bharat-R65777 Subject: [PATCH v2] Fix DEC truncation for greater than 0x_/1000 kvmppc_emulate_dec() uses dec_nsec of type unsigned long and does below calculation: dec_nsec = vcpu-arch.dec; dec_nsec *= 1000; This will truncate if DEC value vcpu-arch.dec is greater than 0x_/1000. For example : For tb_ticks_per_usec = 4a, we can not set decrementer more than ~58ms. Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com --- arch/powerpc/kvm/emulate.c | 12 +++- 1 files changed, 7 insertions(+), 5 deletions(-) diff --git a/arch/powerpc/kvm/emulate.c b/arch/powerpc/kvm/emulate.c index 8af3bad..e7f3da4 100644 --- a/arch/powerpc/kvm/emulate.c +++ b/arch/powerpc/kvm/emulate.c @@ -84,6 +84,7 @@ static bool kvmppc_dec_enabled(struct kvm_vcpu *vcpu) void kvmppc_emulate_dec(struct kvm_vcpu *vcpu) { unsigned long dec_nsec; + unsigned long long dec_time; pr_debug(mtDEC: %x\n, vcpu-arch.dec); #ifdef CONFIG_PPC_BOOK3S @@ -103,11 +104,12 @@ void kvmppc_emulate_dec(struct kvm_vcpu *vcpu) * host ticks. */ hrtimer_try_to_cancel(vcpu-arch.dec_timer); - dec_nsec = vcpu-arch.dec; - dec_nsec *= 1000; - dec_nsec /= tb_ticks_per_usec; - hrtimer_start(vcpu-arch.dec_timer, ktime_set(0, dec_nsec), - HRTIMER_MODE_REL); + dec_time = vcpu-arch.dec; + dec_time *= 1000; + do_div(dec_time, tb_ticks_per_usec); + dec_nsec = do_div(dec_time, NSEC_PER_SEC); + hrtimer_start(vcpu-arch.dec_timer, + ktime_set(dec_time, dec_nsec), HRTIMER_MODE_REL); vcpu-arch.dec_jiffies = get_tb(); } else { hrtimer_try_to_cancel(vcpu-arch.dec_timer); -- 1.7.0.4 How does this impact performance? 64bits multiplication and division looks slow. I tried running below test as guest, with and without this patch and tried to find latency added by this patch. Also I run this for a list of timeouts (1, 2 , 4, 8, 16, 32ms) one by one. - get TB (say a). - set decrementer in auto reload mode. - wait for 1000 timebase interrupts. - Get timebase delta (b = get_tb - a). (b1 - b2) = b1 with this patch and b2 without this patch. And roughly I found any impact. For example: For 1ms = ( 48a19d8 - 48a1459) = 0x57f = .0018% For 32ms = (90fdfa23 - 90fdfe79) = -(0x456) Above values are not consistent but always in a delta of ~+-.002%. Thanks -Bharat -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [PATCH] KVM: booke: Do Not start decrementer when SPRN_DEC set 0
-Original Message- From: kvm-ppc-ow...@vger.kernel.org [mailto:kvm-ppc- ow...@vger.kernel.org] On Behalf Of Alexander Graf Sent: Thursday, October 20, 2011 10:40 PM To: Bhushan Bharat-R65777 Cc: kvm-ppc@vger.kernel.org; bharatb.ya...@gmail.com; Bhushan Bharat- R65777 Subject: Re: [PATCH] KVM: booke: Do Not start decrementer when SPRN_DEC set 0 On 17.10.2011, at 20:37, Bharat Bhushan wrote: As per specification the decrementer interrupt not happen when DEC is written with 0. So we should not start hrtimer with timeout = 0 as well. Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com --- arch/powerpc/kvm/emulate.c |8 1 files changed, 4 insertions(+), 4 deletions(-) diff --git a/arch/powerpc/kvm/emulate.c b/arch/powerpc/kvm/emulate.c index 141dce3..8af3bad 100644 --- a/arch/powerpc/kvm/emulate.c +++ b/arch/powerpc/kvm/emulate.c @@ -70,14 +70,14 @@ #define OP_STHU 45 #ifdef CONFIG_PPC_BOOK3S -static int kvmppc_dec_enabled(struct kvm_vcpu *vcpu) +static bool kvmppc_dec_enabled(struct kvm_vcpu *vcpu) { - return 1; + return true; It's not necessary to change it to bool. The function will be inlined anyways. } #else -static int kvmppc_dec_enabled(struct kvm_vcpu *vcpu) +static bool kvmppc_dec_enabled(struct kvm_vcpu *vcpu) { - return vcpu-arch.tcr TCR_DIE; + return (vcpu-arch.tcr TCR_DIE) !vcpu-arch.dec; This means we declare the dec as enabled only when dec==0, no? I think we can change the name of this function as kvmppc_dec_runnable() ? Thanks -Bharat -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [PATCH] PPC: Fix race in mtmsr paravirt implementation
-Original Message- From: Alexander Graf [mailto:ag...@suse.de] Sent: Thursday, October 13, 2011 2:36 PM To: Bhushan Bharat-R65777 Cc: kvm-ppc@vger.kernel.org; bharatb.ya...@gmail.com; Bhushan Bharat- R65777 Subject: Re: [PATCH] PPC: Fix race in mtmsr paravirt implementation Am 13.10.2011 um 07:40 schrieb Bharat Bhushan r65...@freescale.com: The current implementation of mtmsr and mtmsrd are racy in that it does: * check (int_pending == 0) --- host sets int_pending = 1 --- * write shared page * done while instead we should check for int_pending after the shared page is written. Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com --- arch/powerpc/kernel/kvm_emul.S | 22 ++ 1 files changed, 10 insertions(+), 12 deletions(-) diff --git a/arch/powerpc/kernel/kvm_emul.S b/arch/powerpc/kernel/kvm_emul.S index f2b1b25..65f853b 100644 --- a/arch/powerpc/kernel/kvm_emul.S +++ b/arch/powerpc/kernel/kvm_emul.S @@ -85,15 +85,15 @@ kvm_emulate_mtmsrd_reg: /* Put MSR back into magic page */ STL64(r31, KVM_MAGIC_PAGE + KVM_MAGIC_MSR, 0) +/* Check if we may trigger an interrupt */ +andi.r30, r30, MSR_EE +beqno_check + /* Check if we have to fetch an interrupt */ lwzr31, (KVM_MAGIC_PAGE + KVM_MAGIC_INT)(0) cmpwir31, 0 beq+no_check -/* Check if we may trigger an interrupt */ -andi.r30, r30, MSR_EE -beqno_check - This chunk should actually be ok already. We're basically doing: if(likely(!int_pending) !(new_msr MSR_EE)) goto no_check; Since we wrote shared.msr before the check, we're good, no? Actually I borrowed this from wrtee implementation and keep consistant across mtmsr/ mtmsrd and wrtee/i. I thought of it like: If we are qualified to take interrupt (EE set) then only check for pending interrupt rather than check for pending interrupt and then check whether we are qualified to take it or not. If you think earlier is better way of understanding then I will change the patch. Thanks -Bharat SCRATCH_RESTORE /* Nag hypervisor */ @@ -167,22 +167,20 @@ maybe_stay_in_guest: kvm_emulate_mtmsr_reg2: orir30, r0, 0 -/* Check if we have to fetch an interrupt */ -lwzr31, (KVM_MAGIC_PAGE + KVM_MAGIC_INT)(0) -cmpwir31, 0 -beq+no_mtmsr +/* Put MSR into magic page because we don't call mtmsr */ +STL64(r30, KVM_MAGIC_PAGE + KVM_MAGIC_MSR, 0) /* Check if we may trigger an interrupt */ andi.r31, r30, MSR_EE beqno_mtmsr -bdo_mtmsr +/* Check if we have to fetch an interrupt */ +lwzr31, (KVM_MAGIC_PAGE + KVM_MAGIC_INT)(0) +cmpwir31, 0 +bne-do_mtmsr no_mtmsr: -/* Put MSR into magic page because we don't call mtmsr */ -STL64(r30, KVM_MAGIC_PAGE + KVM_MAGIC_MSR, 0) - This one looks good. Alex SCRATCH_RESTORE /* Go back to caller */ -- 1.7.0.4 -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [PATCH 5/5] KVM: PPC: booke: Improve timer register emulation
-Original Message- From: kvm-ppc-ow...@vger.kernel.org [mailto:kvm-ppc- ow...@vger.kernel.org] On Behalf Of Alexander Graf Sent: Tuesday, September 27, 2011 9:39 PM To: Bhushan Bharat-R65777 Cc: Wood Scott-B07421; Liu Yu-B13201; kvm-ppc@vger.kernel.org Subject: Re: [PATCH 5/5] KVM: PPC: booke: Improve timer register emulation On 27.09.2011, at 18:01, Bhushan Bharat-R65777 wrote: -Original Message- From: kvm-ppc-ow...@vger.kernel.org [mailto:kvm-ppc- ow...@vger.kernel.org] On Behalf Of Alexander Graf Sent: Tuesday, September 27, 2011 1:45 PM To: Wood Scott-B07421 Cc: Liu Yu-B13201; Wood Scott-B07421; kvm-ppc@vger.kernel.org Subject: Re: [PATCH 5/5] KVM: PPC: booke: Improve timer register emulation On 27.09.2011, at 02:44, Scott Wood wrote: On 09/24/2011 02:27 AM, Alexander Graf wrote: I think I'm getting your point. So what we want is: in timer handler: set_bit(TSR_DIS, vcpu-arch.tsr); kvm_make_request(KVM_REQ_PPC_TSR_UPDATE, vcpu); kvm_vcpu_kick(vcpu); Still there can be a case where first request not handled and another event happens? Or we would like to pause till first request is handled by vcpu? If the first DIS request is not handled and another event happens, the interrupts get coalesced (like on real hardware). If there is another TSR bit set still, int_pending should still be active and the guest exits the next time it sets MSR_EE, making us inject the next interrupt. This may not work for watchdog, where every event causes a state change. Thanks -Bharat -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [PATCH] Issue in exit timing clearance
-Original Message- From: Alexander Graf [mailto:ag...@suse.de] Sent: Tuesday, March 15, 2011 8:11 PM To: Bhushan Bharat-R65777 Cc: kvm-ppc@vger.kernel.org; bharatb.ya...@gmail.com; Bhushan Bharat- R65777 Subject: Re: [PATCH] Issue in exit timing clearance On 15.03.2011, at 06:29, Bharat Bhushan wrote: Following dump is observed on host when clearing the exit timing counters [root@p1021mds kvm]# echo -n 'c' vm1200_vcpu0_timing INFO: task echo:1276 blocked for more than 120 seconds. echo 0 /proc/sys/kernel/hung_task_timeout_secs disables this message. echo D 0ff5bf94 0 1276 1190 0x Call Trace: [c2157e40] [c0007908] __switch_to+0x9c/0xc4 [c2157e50] [c040293c] schedule+0x1b4/0x3bc [c2157e90] [c04032dc] __mutex_lock_slowpath+0x74/0xc0 [c2157ec0] [c00369e4] kvmppc_init_timing_stats+0x20/0xb8 [c2157ed0] [c0036b00] kvmppc_exit_timing_write+0x84/0x98 [c2157ef0] [c00b9f90] vfs_write+0xc0/0x16c [c2157f10] [c00ba284] sys_write+0x4c/0x90 [c2157f40] [c000e320] ret_from_syscall+0x0/0x3c The vcpu-mutex is used by kvm_ioctl_* (KVM_RUN etc) and same was used when clearing the stats (in kvmppc_init_timing_stats()). What happens is that when the guest is idle then it held the vcpu-mutx. While the exiting timing process waits for guest to release the vcpu-mutex and a hang state is reached. Now using seprate lock for exit timing stats. Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com Looks good from a 1 feet perspective. My BookE box is currently down, so I can't test it, but I assume you have verified it works just fine? If so, please CC k...@vger.kernel.org, so Avi can take it in the tree with my signed-off (after I tested it ;)). Hi Alex, I have tested this and works fine for me. Please let me know when you complete your testing so that I send to k...@vger.kernel.org. Thanks -Bharat -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html