Re: [PATCH] KVM: PPC: elide struct thread_struct instances from stack

2010-06-02 Thread Marcelo Tosatti
On Mon, May 31, 2010 at 09:59:13PM +0200, Andreas Schwab wrote:
> Instead of instantiating a whole thread_struct on the stack use only the
> required parts of it.
>
> Signed-off-by: Andreas Schwab sch...@linux-m68k.org
> Tested-by: Alexander Graf ag...@suse.de
> ---
>  arch/powerpc/include/asm/kvm_fpu.h       |   27 +
>  arch/powerpc/kernel/ppc_ksyms.c          |    4 -
>  arch/powerpc/kvm/book3s.c                |   49 +---
>  arch/powerpc/kvm/book3s_paired_singles.c |   94 --
>  arch/powerpc/kvm/fpu.S                   |   18 ++
>  5 files changed, 97 insertions(+), 95 deletions(-)

Applied, thanks.

--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] KVM: PPC: elide struct thread_struct instances from stack

2010-06-01 Thread Alexander Graf

On 01.06.2010, at 10:36, Andreas Schwab wrote:

> Paul Mackerras pau...@samba.org writes:
>
>> I re-read the relevant part of the PowerPC architecture spec
>> yesterday, and it seems pretty clear that the FPSCR doesn't affect the
>> behaviour of lfs and stfs, and is not affected by them.  So in fact 4
>> out of the 7 instructions in each of those procedures are unnecessary
>> (and similarly for the cvt_fd/df used in the alignment fixup code).
>
> I'd prefer to have this deferred to a separate patch.

I agree. Andreas' patch takes the current logic and moves it to be
KVM-contained, so we don't clutter the stack. The fact that the old code was
inefficient is a separate story.

Avi / Marcelo, please apply the patch nevertheless.


Alex



[PATCH] KVM: PPC: elide struct thread_struct instances from stack

2010-05-31 Thread Andreas Schwab
Instead of instantiating a whole thread_struct on the stack use only the
required parts of it.

Signed-off-by: Andreas Schwab sch...@linux-m68k.org
Tested-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/include/asm/kvm_fpu.h       |   27 +
 arch/powerpc/kernel/ppc_ksyms.c          |    4 -
 arch/powerpc/kvm/book3s.c                |   49 +---
 arch/powerpc/kvm/book3s_paired_singles.c |   94 --
 arch/powerpc/kvm/fpu.S                   |   18 ++
 5 files changed, 97 insertions(+), 95 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_fpu.h b/arch/powerpc/include/asm/kvm_fpu.h
index 94f05de..c3d4f05 100644
--- a/arch/powerpc/include/asm/kvm_fpu.h
+++ b/arch/powerpc/include/asm/kvm_fpu.h
@@ -22,24 +22,24 @@
 
 #include <linux/types.h>
 
-extern void fps_fres(struct thread_struct *t, u32 *dst, u32 *src1);
-extern void fps_frsqrte(struct thread_struct *t, u32 *dst, u32 *src1);
-extern void fps_fsqrts(struct thread_struct *t, u32 *dst, u32 *src1);
+extern void fps_fres(u64 *fpscr, u32 *dst, u32 *src1);
+extern void fps_frsqrte(u64 *fpscr, u32 *dst, u32 *src1);
+extern void fps_fsqrts(u64 *fpscr, u32 *dst, u32 *src1);
 
-extern void fps_fadds(struct thread_struct *t, u32 *dst, u32 *src1, u32 *src2);
-extern void fps_fdivs(struct thread_struct *t, u32 *dst, u32 *src1, u32 *src2);
-extern void fps_fmuls(struct thread_struct *t, u32 *dst, u32 *src1, u32 *src2);
-extern void fps_fsubs(struct thread_struct *t, u32 *dst, u32 *src1, u32 *src2);
+extern void fps_fadds(u64 *fpscr, u32 *dst, u32 *src1, u32 *src2);
+extern void fps_fdivs(u64 *fpscr, u32 *dst, u32 *src1, u32 *src2);
+extern void fps_fmuls(u64 *fpscr, u32 *dst, u32 *src1, u32 *src2);
+extern void fps_fsubs(u64 *fpscr, u32 *dst, u32 *src1, u32 *src2);
 
-extern void fps_fmadds(struct thread_struct *t, u32 *dst, u32 *src1, u32 *src2,
+extern void fps_fmadds(u64 *fpscr, u32 *dst, u32 *src1, u32 *src2,
   u32 *src3);
-extern void fps_fmsubs(struct thread_struct *t, u32 *dst, u32 *src1, u32 *src2,
+extern void fps_fmsubs(u64 *fpscr, u32 *dst, u32 *src1, u32 *src2,
   u32 *src3);
-extern void fps_fnmadds(struct thread_struct *t, u32 *dst, u32 *src1, u32 *src2,
+extern void fps_fnmadds(u64 *fpscr, u32 *dst, u32 *src1, u32 *src2,
    u32 *src3);
-extern void fps_fnmsubs(struct thread_struct *t, u32 *dst, u32 *src1, u32 *src2,
+extern void fps_fnmsubs(u64 *fpscr, u32 *dst, u32 *src1, u32 *src2,
    u32 *src3);
-extern void fps_fsel(struct thread_struct *t, u32 *dst, u32 *src1, u32 *src2,
+extern void fps_fsel(u64 *fpscr, u32 *dst, u32 *src1, u32 *src2,
 u32 *src3);
 
 #define FPD_ONE_IN(name) extern void fpd_ ## name(u64 *fpscr, u32 *cr, \
@@ -82,4 +82,7 @@ FPD_THREE_IN(fmadd)
 FPD_THREE_IN(fnmsub)
 FPD_THREE_IN(fnmadd)
 
+extern void kvm_cvt_fd(u32 *from, u64 *to, u64 *fpscr);
+extern void kvm_cvt_df(u64 *from, u32 *to, u64 *fpscr);
+
 #endif
diff --git a/arch/powerpc/kernel/ppc_ksyms.c b/arch/powerpc/kernel/ppc_ksyms.c
index bc9f39d..ab3e392 100644
--- a/arch/powerpc/kernel/ppc_ksyms.c
+++ b/arch/powerpc/kernel/ppc_ksyms.c
@@ -101,10 +101,6 @@ EXPORT_SYMBOL(pci_dram_offset);
 EXPORT_SYMBOL(start_thread);
 EXPORT_SYMBOL(kernel_thread);
 
-#ifndef CONFIG_BOOKE
-EXPORT_SYMBOL_GPL(cvt_df);
-EXPORT_SYMBOL_GPL(cvt_fd);
-#endif
 EXPORT_SYMBOL(giveup_fpu);
 #ifdef CONFIG_ALTIVEC
 EXPORT_SYMBOL(giveup_altivec);
diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c
index b998abf..3fea19d 100644
--- a/arch/powerpc/kvm/book3s.c
+++ b/arch/powerpc/kvm/book3s.c
@@ -1309,12 +1309,17 @@ extern int __kvmppc_vcpu_entry(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu);
 int __kvmppc_vcpu_run(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu)
 {
 	int ret;
-	struct thread_struct ext_bkp;
+	double fpr[32][TS_FPRWIDTH];
+	unsigned int fpscr;
+	int fpexc_mode;
 #ifdef CONFIG_ALTIVEC
-	bool save_vec = current->thread.used_vr;
+	vector128 vr[32];
+	vector128 vscr;
+	unsigned long uninitialized_var(vrsave);
+	int used_vr;
 #endif
 #ifdef CONFIG_VSX
-	bool save_vsx = current->thread.used_vsr;
+	int used_vsr;
 #endif
 	ulong ext_msr;
 
@@ -1327,27 +1332,27 @@ int __kvmppc_vcpu_run(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu)
 	/* Save FPU state in stack */
 	if (current->thread.regs->msr & MSR_FP)
 		giveup_fpu(current);
-	memcpy(ext_bkp.fpr, current->thread.fpr, sizeof(current->thread.fpr));
-	ext_bkp.fpscr = current->thread.fpscr;
-	ext_bkp.fpexc_mode = current->thread.fpexc_mode;
+	memcpy(fpr, current->thread.fpr, sizeof(current->thread.fpr));
+	fpscr = current->thread.fpscr.val;
+	fpexc_mode = current->thread.fpexc_mode;
 
 #ifdef CONFIG_ALTIVEC
 	/* Save Altivec state in stack */
-	if (save_vec) {
+	used_vr = current->thread.used_vr;
+	if (used_vr) {
if 

Re: [PATCH] KVM: PPC: elide struct thread_struct instances from stack

2010-05-31 Thread Alexander Graf

On 31.05.2010, at 21:59, Andreas Schwab wrote:

> Instead of instantiating a whole thread_struct on the stack use only the
> required parts of it.
>
> Signed-off-by: Andreas Schwab sch...@linux-m68k.org
> Tested-by: Alexander Graf ag...@suse.de

Avi or Marcelo, please pull this in through kvm.git.

Signed-off-by: Alexander Graf ag...@suse.de


Alex



Re: [PATCH] KVM: PPC: elide struct thread_struct instances from stack

2010-05-31 Thread Paul Mackerras
On Mon, May 31, 2010 at 09:59:13PM +0200, Andreas Schwab wrote:

> Instead of instantiating a whole thread_struct on the stack use only the
> required parts of it.

...

> +_GLOBAL(kvm_cvt_fd)
> +	lfd	0,0(r5)		/* load up fpscr value */
> +	MTFSF_L(0)
> +	lfs	0,0(r3)
> +	stfd	0,0(r4)
> +	mffs	0
> +	stfd	0,0(r5)		/* save new fpscr value */
> +	blr
> +
> +_GLOBAL(kvm_cvt_df)
> +	lfd	0,0(r5)		/* load up fpscr value */
> +	MTFSF_L(0)
> +	lfd	0,0(r3)
> +	stfs	0,0(r4)
> +	mffs	0
> +	stfd	0,0(r5)		/* save new fpscr value */
> +	blr

I re-read the relevant part of the PowerPC architecture spec
yesterday, and it seems pretty clear that the FPSCR doesn't affect the
behaviour of lfs and stfs, and is not affected by them.  So in fact 4
out of the 7 instructions in each of those procedures are unnecessary
(and similarly for the cvt_fd/df used in the alignment fixup code).

Paul.


Re: [PATCH] KVM: PPC: elide struct thread_struct instances from stack

2010-05-31 Thread Alexander Graf

On 01.06.2010, at 00:40, Paul Mackerras wrote:

> On Mon, May 31, 2010 at 09:59:13PM +0200, Andreas Schwab wrote:
>
>> Instead of instantiating a whole thread_struct on the stack use only the
>> required parts of it.
>
> ...
>
>> +_GLOBAL(kvm_cvt_fd)
>> +	lfd	0,0(r5)		/* load up fpscr value */
>> +	MTFSF_L(0)
>> +	lfs	0,0(r3)
>> +	stfd	0,0(r4)
>> +	mffs	0
>> +	stfd	0,0(r5)		/* save new fpscr value */
>> +	blr
>> +
>> +_GLOBAL(kvm_cvt_df)
>> +	lfd	0,0(r5)		/* load up fpscr value */
>> +	MTFSF_L(0)
>> +	lfd	0,0(r3)
>> +	stfs	0,0(r4)
>> +	mffs	0
>> +	stfd	0,0(r5)		/* save new fpscr value */
>> +	blr
>
> I re-read the relevant part of the PowerPC architecture spec
> yesterday, and it seems pretty clear that the FPSCR doesn't affect the
> behaviour of lfs and stfs, and is not affected by them.  So in fact 4
> out of the 7 instructions in each of those procedures are unnecessary
> (and similarly for the cvt_fd/df used in the alignment fixup code).

So the rounding control field is not used on lfs? Interesting. I couldn't find 
a reference to it being used or modified either though.

Alex
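
One way to see why the rounding control field would not matter for the
single-to-double direction: every single-precision value is exactly
representable as a double, so widening never has anything to round. The
following is only a minimal sketch in portable C, assuming IEEE 754 binary32
and binary64 formats for float and double. The helper names are hypothetical
and not part of the patch, and the code does not model lfs/stfd themselves, the
stfs direction, or any FPSCR status bits. It derives the double bit pattern
from the single's bits with integer operations alone (so no rounding mode is
involved) and compares that with the host FPU's conversion.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (not part of the patch): compute the double-precision
 * bit pattern of a single-precision value using integer operations only.
 * No rounding mode is consulted because nothing is ever rounded.
 */
static uint64_t single_to_double_bits(uint32_t w)
{
	uint64_t sign = (uint64_t)(w >> 31) << 63;
	uint32_t exp = (w >> 23) & 0xff;
	uint32_t frac = w & 0x7fffff;
	uint64_t dexp;

	if (exp == 0xff) {
		dexp = 0x7ff;			/* infinity or NaN */
	} else if (exp != 0) {
		dexp = (uint64_t)exp + (1023 - 127);	/* normal: rebias exponent */
	} else if (frac == 0) {
		return sign;			/* signed zero */
	} else {
		int e = -126;			/* denormal: normalise first */

		while (!(frac & 0x800000)) {
			frac <<= 1;
			e--;
		}
		frac &= 0x7fffff;		/* drop the now-implicit leading 1 */
		dexp = (uint64_t)(e + 1023);
	}
	/* The 23 fraction bits become the top of the 52-bit fraction. */
	return sign | (dexp << 52) | ((uint64_t)frac << 29);
}

/* What the compiler (and thus the host FPU) produces for the same widening. */
static uint64_t fpu_widen_bits(uint32_t w)
{
	float f;
	double d;
	uint64_t out;

	assert(sizeof(float) == 4 && sizeof(double) == 8);
	memcpy(&f, &w, sizeof(f));
	d = f;
	memcpy(&out, &d, sizeof(out));
	return out;
}

/* Returns 1 if both methods agree on a few non-NaN sample patterns. */
static int widening_is_exact(void)
{
	static const uint32_t samples[] = {
		0x00000000, 0x80000000,	/* +0, -0 */
		0x3f800000, 0xbf800000,	/* +1, -1 */
		0x3eaaaaab,		/* roughly 1/3 */
		0x00000001, 0x007fffff,	/* smallest, largest denormal */
		0x7f7fffff, 0x7f800000,	/* FLT_MAX, +infinity */
	};
	size_t i;

	for (i = 0; i < sizeof(samples) / sizeof(samples[0]); i++)
		if (single_to_double_bits(samples[i]) != fpu_widen_bits(samples[i]))
			return 0;
	return 1;
}
```

NaN patterns are left out of the samples on purpose, since some hosts quiet a
signalling NaN during conversion. Calling widening_is_exact() from a small test
program should return 1 on hosts that follow the assumptions above.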
