Re: [PATCH 18/18] KVM: PPC: Implement Paired Single emulation
On 02/04/2010 05:55 PM, Alexander Graf wrote:
> The one big thing about the Gekko is paired singles.
>
> Paired singles are an extension to the instruction set that adds 32
> single-precision floating point registers (QPRs), some SPRs to modify the
> behavior of paired single operations, and instructions to deal with QPRs.
>
> Unfortunately, it also changes the semantics of existing operations that
> affect single values in FPRs. In most cases they get mirrored to the
> corresponding QPR.
>
> Because of that, we need to emulate all FPU operations and all the new
> paired single operations too.
>
> In order to achieve that, we take the guest's instruction, rip out the
> parameters, put in our own and execute the very same instruction, but also
> fix up the QPR values along the way.
>
> That way we can execute paired single FPU operations without implementing
> a soft FPU.

A little frightening. How many instructions are there? Maybe we can just
have an array of all of them followed by a return instruction, so we don't
JIT code.

> +static void call_fpu_inst(u32 inst, u64 *out, u64 *in1, u64 *in2, u64 *in3,
> +			  u32 *cr, u32 *fpscr)
> +{
> +	u32 cr_val = 0;
> +	u32 *call_stack;
> +	u64 inout[5] = { 0, 0, 0, 0, 0 };
> +
> +	if (fpscr)
> +		inout[0] = *fpscr;
> +	if (in1)
> +		inout[1] = *in1;
> +	if (in2)
> +		inout[2] = *in2;
> +	if (in3)
> +		inout[3] = *in3;
> +	if (cr)
> +		cr_val = *cr;
> +
> +	dprintk(KERN_INFO "FPU Emulator 0x%x ( 0x%llx, 0x%llx, 0x%llx )", inst,
> +		inout[1], inout[2], inout[3]);
> +
> +	call_stack = &kvmppc_call_stack[(smp_processor_id() * 2)];
> +	call_stack[0] = inst;
> +	/* call_stack[1] is INS_BLR */
> +

Would be easier on the cache to do this per-cpu?

-- 
error compiling committee.c: too many arguments to function
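The "array of all of them followed by a return instruction" idea could look roughly like the sketch below. It is only an illustration under stated assumptions: the names `stub_encode_59`, `struct fpu_stub`, `stubs_init` and `stub_lookup` are hypothetical and do not come from the patch, the fixed scratch registers f0/f1/f2 are an arbitrary choice, and only two opcode-59 instructions (fadds and fsubs, extended opcodes 21 and 20 as in the patch's define list) are covered. It only builds the instruction words; making the table executable and calling into it is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define INS_BLR 0x4e800020u /* PowerPC "blr" (return to caller) */

/* Hypothetical fixed scratch registers shared by every stub. */
enum { STUB_FRT = 0, STUB_FRA = 1, STUB_FRB = 2 };

/* One prebuilt stub: the FPU instruction followed by blr. */
struct fpu_stub {
	uint32_t code[2];
};

/* Extended opcodes (primary opcode 59) covered by this sketch. */
static const uint32_t stub_xo[] = { 21 /* fadds */, 20 /* fsubs */ };
#define NSTUBS (sizeof(stub_xo) / sizeof(stub_xo[0]))

static struct fpu_stub stubs[NSTUBS];

/* Encode an opcode-59 A-form instruction on the fixed registers. */
static uint32_t stub_encode_59(uint32_t xo)
{
	assert(xo < 32);
	return (59u << 26) | (STUB_FRT << 21) | (STUB_FRA << 16) |
	       (STUB_FRB << 11) | (xo << 1);
}

/* Fill the table once, so nothing is generated at emulation time. */
static void stubs_init(void)
{
	for (size_t i = 0; i < NSTUBS; i++) {
		stubs[i].code[0] = stub_encode_59(stub_xo[i]);
		stubs[i].code[1] = INS_BLR;
	}
}

/* Map a guest instruction to its prebuilt stub, or NULL if not covered. */
static const struct fpu_stub *stub_lookup(uint32_t guest_inst)
{
	if ((guest_inst >> 26) != 59)
		return NULL;

	uint32_t xo = (guest_inst >> 1) & 0x1f;

	for (size_t i = 0; i < NSTUBS; i++)
		if (stub_xo[i] == xo)
			return &stubs[i];
	return NULL;
}
```

With a table like this the emulator would copy the guest's operands into the fixed registers before the call and read the result back afterwards, instead of patching register numbers into the instruction word.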
Re: [PATCH 18/18] KVM: PPC: Implement Paired Single emulation
On 07.02.2010 at 13:50, Avi Kivity a...@redhat.com wrote:
> On 02/04/2010 05:55 PM, Alexander Graf wrote:
>> The one big thing about the Gekko is paired singles.
>>
>> Paired singles are an extension to the instruction set that adds 32
>> single-precision floating point registers (QPRs), some SPRs to modify the
>> behavior of paired single operations, and instructions to deal with QPRs.
>>
>> Unfortunately, it also changes the semantics of existing operations that
>> affect single values in FPRs. In most cases they get mirrored to the
>> corresponding QPR.
>>
>> Because of that, we need to emulate all FPU operations and all the new
>> paired single operations too.
>>
>> In order to achieve that, we take the guest's instruction, rip out the
>> parameters, put in our own and execute the very same instruction, but also
>> fix up the QPR values along the way.
>>
>> That way we can execute paired single FPU operations without implementing
>> a soft FPU.
>
> A little frightening. How many instructions are there? Maybe we can just
> have an array of all of them followed by a return instruction, so we don't
> JIT code.

There are all the instructions in the list; most can have the Rc (compare)
bit set to modify CC, and IIRC there were a couple with immediate values.

But maybe you're right. I probably could just always set Rc and either
ignore the result or use it. I could maybe find alternatives to the
instructions that use immediates. Let me check this on the bus trip back
from Brussels.

>> +static void call_fpu_inst(u32 inst, u64 *out, u64 *in1, u64 *in2, u64 *in3,
>> +			  u32 *cr, u32 *fpscr)
>> +{
>> +	u32 cr_val = 0;
>> +	u32 *call_stack;
>> +	u64 inout[5] = { 0, 0, 0, 0, 0 };
>> +
>> +	if (fpscr)
>> +		inout[0] = *fpscr;
>> +	if (in1)
>> +		inout[1] = *in1;
>> +	if (in2)
>> +		inout[2] = *in2;
>> +	if (in3)
>> +		inout[3] = *in3;
>> +	if (cr)
>> +		cr_val = *cr;
>> +
>> +	dprintk(KERN_INFO "FPU Emulator 0x%x ( 0x%llx, 0x%llx, 0x%llx )", inst,
>> +		inout[1], inout[2], inout[3]);
>> +
>> +	call_stack = &kvmppc_call_stack[(smp_processor_id() * 2)];
>> +	call_stack[0] = inst;
>> +	/* call_stack[1] is INS_BLR */
>> +
>
> Would be easier on the cache to do this per-cpu?

It is per-cpu. Or do you mean to actually use the PER_CPU definition? Is
that guaranteed to be executable?

Alex
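The "always set Rc and either ignore the result or use it" idea can be sketched as follows. This is an assumption-laden illustration, not code from the patch: the helper names are hypothetical, it assumes the instruction is of a form where the lowest bit really is Rc (that does not hold for loads, stores or compares), and it assumes a record-form FPU instruction only touches CR field 1.

```c
#include <assert.h>
#include <stdint.h>

/* CR field 1 (the one FPU record forms update) within the 32-bit CR. */
#define CR1_MASK 0x0f000000u

/* Always execute the record form, so one stub serves both variants. */
static inline uint32_t inst_force_rc(uint32_t inst)
{
	return inst | 1u;
}

/* Did the guest's original instruction ask for a CR update? */
static inline int inst_wants_rc(uint32_t guest_inst)
{
	return (int)(guest_inst & 1u);
}

/*
 * Merge the CR produced by the executed (Rc=1) instruction back into the
 * guest's CR, but only if the guest instruction had Rc set. Otherwise the
 * host-side result is ignored and the guest CR is returned unchanged.
 */
static uint32_t merge_cr1(uint32_t guest_cr, uint32_t host_cr,
			  uint32_t guest_inst)
{
	assert((CR1_MASK & 0xf0ffffffu) == 0);

	if (!inst_wants_rc(guest_inst))
		return guest_cr;
	return (guest_cr & ~CR1_MASK) | (host_cr & CR1_MASK);
}
```

This halves the number of distinct instruction words that have to be kept around, at the cost of one extra mask operation after each emulated instruction.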
Re: [PATCH 18/18] KVM: PPC: Implement Paired Single emulation
On 02/07/2010 05:57 PM, Alexander Graf wrote:
>>> +
>>> +	dprintk(KERN_INFO "FPU Emulator 0x%x ( 0x%llx, 0x%llx, 0x%llx )", inst,
>>> +		inout[1], inout[2], inout[3]);
>>> +
>>> +	call_stack = &kvmppc_call_stack[(smp_processor_id() * 2)];
>>> +	call_stack[0] = inst;
>>> +	/* call_stack[1] is INS_BLR */
>>> +
>>
>> Would be easier on the cache to do this per-cpu?
>
> It is per-cpu. Or do you mean to actually use the PER_CPU definition? Is
> that guaranteed to be executable?

I meant, per-cpu vmalloc area, but it should be enough to have a per-cpu
cache line.

-- 
error compiling committee.c: too many arguments to function
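A "per-cpu cache line" layout could be sketched like this. It is a userspace illustration only: `CACHE_LINE`, `MAX_CPUS`, `struct call_slot` and `slot_prepare` are hypothetical names, the 128-byte line size is an assumption, and the sketch deliberately ignores the two kernel-side questions raised in the thread (whether the memory is executable, and flushing the instruction cache after writing the slot).

```c
#include <assert.h>
#include <stdint.h>

#define CACHE_LINE 128         /* assumed cache line size in bytes */
#define MAX_CPUS   4           /* arbitrary limit for the sketch */
#define INS_BLR    0x4e800020u /* PowerPC "blr" (return to caller) */

/*
 * One slot per CPU, padded and aligned to a full cache line so that two
 * CPUs writing their own instruction never share a line.
 */
struct call_slot {
	_Alignas(CACHE_LINE) uint32_t code[2]; /* instruction + blr */
};

static struct call_slot call_slots[MAX_CPUS];

/* Write the instruction to emulate into this CPU's private slot. */
static uint32_t *slot_prepare(unsigned int cpu, uint32_t inst)
{
	assert(cpu < MAX_CPUS);

	uint32_t *code = call_slots[cpu].code;

	code[0] = inst;
	code[1] = INS_BLR;
	return code;
}
```

Compared with indexing a packed array by `smp_processor_id() * 2`, where sixteen CPUs' slots would fit into one 128-byte line, this trades a little memory for avoiding cache-line bouncing between CPUs.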
[PATCH 18/18] KVM: PPC: Implement Paired Single emulation
The one big thing about the Gekko is paired singles.

Paired singles are an extension to the instruction set that adds 32
single-precision floating point registers (QPRs), some SPRs to modify the
behavior of paired single operations, and instructions to deal with QPRs.

Unfortunately, it also changes the semantics of existing operations that
affect single values in FPRs. In most cases they get mirrored to the
corresponding QPR.

Because of that, we need to emulate all FPU operations and all the new
paired single operations too.

In order to achieve that, we take the guest's instruction, rip out the
parameters, put in our own and execute the very same instruction, but also
fix up the QPR values along the way.

That way we can execute paired single FPU operations without implementing
a soft FPU.

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/include/asm/kvm_book3s.h    |    1 +
 arch/powerpc/kvm/Makefile                |    1 +
 arch/powerpc/kvm/book3s_64_emulate.c     |    3 +
 arch/powerpc/kvm/book3s_paired_singles.c | 1356 ++
 4 files changed, 1361 insertions(+), 0 deletions(-)
 create mode 100644 arch/powerpc/kvm/book3s_paired_singles.c

diff --git a/arch/powerpc/include/asm/kvm_book3s.h b/arch/powerpc/include/asm/kvm_book3s.h
index f74d1db..e32a749 100644
--- a/arch/powerpc/include/asm/kvm_book3s.h
+++ b/arch/powerpc/include/asm/kvm_book3s.h
@@ -121,6 +121,7 @@ extern void kvmppc_book3s_queue_irqprio(struct kvm_vcpu *vcpu, unsigned int vec)
 extern void kvmppc_set_bat(struct kvm_vcpu *vcpu, struct kvmppc_bat *bat,
 			   bool upper, u32 val);
 extern void kvmppc_giveup_ext(struct kvm_vcpu *vcpu, ulong msr);
+extern int kvmppc_emulate_paired_single(struct kvm_run *run, struct kvm_vcpu *vcpu);
 
 extern u32 kvmppc_trampoline_lowmem;
 extern u32 kvmppc_trampoline_enter;
diff --git a/arch/powerpc/kvm/Makefile b/arch/powerpc/kvm/Makefile
index e575cfd..eba721e 100644
--- a/arch/powerpc/kvm/Makefile
+++ b/arch/powerpc/kvm/Makefile
@@ -41,6 +41,7 @@ kvm-objs-$(CONFIG_KVM_E500) := $(kvm-e500-objs)
 kvm-book3s_64-objs := \
 	$(common-objs-y) \
 	fpu.o \
+	book3s_paired_singles.o \
 	book3s.o \
 	book3s_64_emulate.o \
 	book3s_64_interrupts.o \
diff --git a/arch/powerpc/kvm/book3s_64_emulate.c b/arch/powerpc/kvm/book3s_64_emulate.c
index 1d1b952..c989214 100644
--- a/arch/powerpc/kvm/book3s_64_emulate.c
+++ b/arch/powerpc/kvm/book3s_64_emulate.c
@@ -200,6 +200,9 @@ int kvmppc_core_emulate_op(struct kvm_run *run, struct kvm_vcpu *vcpu,
 		emulated = EMULATE_FAIL;
 	}
 
+	if (emulated == EMULATE_FAIL)
+		emulated = kvmppc_emulate_paired_single(run, vcpu);
+
 	return emulated;
 }
 
diff --git a/arch/powerpc/kvm/book3s_paired_singles.c b/arch/powerpc/kvm/book3s_paired_singles.c
new file mode 100644
index 000..cb258a3
--- /dev/null
+++ b/arch/powerpc/kvm/book3s_paired_singles.c
@@ -0,0 +1,1356 @@
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301, USA.
+ *
+ * Copyright Novell Inc 2010
+ *
+ * Authors: Alexander Graf ag...@suse.de
+ */
+
+#include <asm/kvm.h>
+#include <asm/kvm_ppc.h>
+#include <asm/disassemble.h>
+#include <asm/kvm_book3s.h>
+#include <asm/kvm_fpu.h>
+#include <asm/reg.h>
+#include <asm/cacheflush.h>
+#include <linux/vmalloc.h>
+
+/* #define DEBUG */
+
+#ifdef DEBUG
+#define dprintk printk
+#else
+#define dprintk(...) do { } while(0);
+#endif
+
+#define OP_LFS			48
+#define OP_LFSU			49
+#define OP_LFD			50
+#define OP_LFDU			51
+#define OP_STFS			52
+#define OP_STFSU		53
+#define OP_STFD			54
+#define OP_STFDU		55
+#define OP_PSQ_L		56
+#define OP_PSQ_LU		57
+#define OP_PSQ_ST		60
+#define OP_PSQ_STU		61
+
+#define OP_31_LFSX		535
+#define OP_31_LFSUX		567
+#define OP_31_LFDX		599
+#define OP_31_LFDUX		631
+#define OP_31_STFSX		663
+#define OP_31_STFSUX		695
+#define OP_31_STFX		727
+#define OP_31_STFUX		759
+#define OP_31_LWIZX		887
+#define OP_31_STFIWX		983
+
+#define OP_59_FADDS		21
+#define OP_59_FSUBS		20
+#define OP_59_FSQRTS		22