Re: [PATCH v4 1/9] selftests/powerpc: Test the preservation of FPU and VMX regs across syscall

2016-02-15 Thread Michael Ellerman
On Tue, 2016-02-16 at 11:06 +1100, Cyril Bur wrote:
> On Mon, 15 Feb 2016 22:29:17 +0530
> "Naveen N. Rao"  wrote:
> 
> > On 2016/02/15 04:07PM, Cyril Bur wrote:
> > > Test that the non volatile floating point and Altivec registers get
> > > correctly preserved across the fork() syscall.
> > > 
> > > fork() works nicely for this purpose, the registers should be the same for
> > > both parent and child
> > > 
> > > diff --git a/tools/testing/selftests/powerpc/basic_asm.h b/tools/testing/selftests/powerpc/basic_asm.h
> > > new file mode 100644
> > > index 000..f243da0
> > > --- /dev/null
> > > +++ b/tools/testing/selftests/powerpc/basic_asm.h
> > > @@ -0,0 +1,30 @@
> > > +#include 
> > > +#include 
> > > +
> > > +#define LOAD_REG_IMMEDIATE(reg,expr) \
> > > + lis reg,(expr)@highest; \
> > > + ori reg,reg,(expr)@higher;  \
> > > + rldicr  reg,reg,32,31;  \
> > > + oris reg,reg,(expr)@high; \
> > > + ori reg,reg,(expr)@l;
> > > +
> > > +/* It is very important to note here that _extra is the extra amount of
> > > + * stack space needed.
> > > + * This space must be accessed at sp + 32!  
> > 
> 
> Hi Naveen,
> 
> Thanks for the review.

> > This looks to be specific to ABIv2. Is this series limited to ppc64le?  
> > If so, you might want to ensure this only builds there.
> > 
> 
> Is ABIv1 still in use? Can we still compile for v1? 

YES! >:E

> This series is for 64bit only, I've not really got any reason to believe this
> is LE only, shouldn't this work BE? The makefile enforces 64bit, I believe it
> is ok for kernel selftests to fail to compile if they aren't going to be able
> to run.

> > Also:
> > #define PPC_ABIV2_MIN_STACK_SIZE 32
> > 
> > or just:
> > #define PPC_MIN_STACK   32
> > 
> > ... is helpful. And, you might want to base the rest of your code that 
> > use PUSH_BASIC_STACK() on that. If we ever want to have these tests run 
> > anywhere else, that'll help a lot. (See further below)
> > 
> 
> So I thought about it. I agree that it would be nice, I just worry that I
> might get rabbitholed, I can see it going further and then providing stack
> accessors to abstract out even PPC_MIN_STACK except in a bunch of macros, and
> that's when I know I've gone too far.
>
> Perhaps I could look at adding this when I write more tests, I have grand
> plans to push way more tests.

You definitely need a #define for the minimum stack frame size, based on the
ABI version. You can basically do what the kernel does for STACK_FRAME_MIN_SIZE.

You also need to cope with the TOC save slot moving between ABIv1 & 2, which
shouldn't be hard with a macro for it.
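
A minimal sketch of such defines, written as a plain C header fragment so it can be compiled anywhere. The SELFTEST_* names are hypothetical (the patch defines none of them), and the sketch assumes the compiler predefines _CALL_ELF as 2 when targeting ABIv2, which is what GCC does on powerpc64 as far as I know. On any other target it simply falls through to the ABIv1 numbers.

```c
#include <assert.h>

/*
 * Hypothetical names, not part of the patch. The values are my reading
 * of the two 64-bit ELF ABIs and should be checked against the specs.
 */
#if defined(_CALL_ELF) && _CALL_ELF == 2
#define SELFTEST_STACK_FRAME_MIN_SIZE 32   /* ABIv2 minimum frame */
#define SELFTEST_STACK_FRAME_TOC_POS  24   /* TOC save doubleword */
#else
#define SELFTEST_STACK_FRAME_MIN_SIZE 112  /* ABIv1 minimum frame */
#define SELFTEST_STACK_FRAME_TOC_POS  40   /* TOC save doubleword */
#endif

/* Offsets the posted macros already use for the CR and LR save slots. */
#define SELFTEST_STACK_FRAME_CR_POS   8
#define SELFTEST_STACK_FRAME_LR_POS   16

/* The TOC save doubleword has to fit inside the minimum frame. */
static_assert(SELFTEST_STACK_FRAME_TOC_POS + 8 <= SELFTEST_STACK_FRAME_MIN_SIZE,
              "TOC save slot must lie inside the minimum frame");
```

With something like this in basic_asm.h, PUSH_BASIC_STACK() could use the two defines in place of the literal 32 and 24.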

> > > + */
> > > +#define PUSH_BASIC_STACK(_extra) \
> > > + mflr r0; \
> > > + std r0,16(sp); \
> > > + stdu sp,-(_extra + 32)(sp); \
> > > + mfcr r0; \
> > > + stw r0,8(sp); \
> > > + std 2,24(sp);  
> > ^^
> > Better to use r2 here and below.
> > 
> 
> I think the reason I used '2' is that 'r2' isn't actually defined in ppc-asm.h
> for userspace, due to conventions, like 'sp', 'toc' has been used. So I could
> have used 'toc' but then there was an issue with toc NOT being defined, or
> getting undefined in some situations.

That's true, ppc-asm.h doesn't define r2, instead it defines toc.

But you can always use %r2, which is preferable to 2 IMHO.

Personally I'd rather you use %r1 than sp, but I won't make you. To someone who
has read lots of powerpc assembler, %r1 translates as "stack pointer" whereas
"sp" translates as "huh?".

> > > +FUNC_START(test_fpu)
> > > + #r3 holds pointer to where to put the result of fork
> > > + #r4 holds pointer to the pid
> > > + #f14-f31 are non volatiles
> > > + PUSH_BASIC_STACK(256)
> > > + std r3,40(sp) #Address of darray  
> > 
> > So, this could be:
> > PUSH_BASIC_STACK(256)
> > std r3,PPC_MIN_STACK+8(sp)
> > 
> > ... though I wonder why there is +8 here?
> 
> I think the +8 is left over from my using +0 for something else and then not
> going back and being all neat about stack usage. Admittedly I didn't look
> over that too hard, it being a selftest and all; I'm not sure optimal stack
> usage is super important here.

The first free slot is at PPC_MIN_STACK(%r1), so that's what you should use.
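
The arithmetic behind that can be sketched in C. The helper names frame_size() and local_slot() are made up for illustration, and the minimum frame size is passed in as a parameter so the sketch takes no position on which ABI is in use.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes by which the stdu in PUSH_BASIC_STACK(_extra) moves the stack pointer. */
static inline size_t frame_size(size_t min_frame, size_t extra)
{
    return min_frame + extra;
}

/*
 * Offset from the new stack pointer of the n-th 8-byte slot in the
 * extra area. Slot 0 is the first free slot and sits at min_frame.
 */
static inline size_t local_slot(size_t min_frame, size_t extra, size_t n)
{
    assert(8 * (n + 1) <= extra); /* the slot must lie inside the extra area */
    return min_frame + 8 * n;
}
```

For PUSH_BASIC_STACK(256) with a 32-byte minimum frame, local_slot(32, 256, 0) is 32, so the 40(sp) used in the patch is slot 1 and slot 0 is left unused.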

cheers

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH v4 1/9] selftests/powerpc: Test the preservation of FPU and VMX regs across syscall

2016-02-15 Thread Naveen N. Rao
On 2016/02/16 11:06AM, Cyril Bur wrote:
> On Mon, 15 Feb 2016 22:29:17 +0530
> "Naveen N. Rao"  wrote:
> 
> > On 2016/02/15 04:07PM, Cyril Bur wrote:
> > > Test that the non volatile floating point and Altivec registers get
> > > correctly preserved across the fork() syscall.
> > > 
> > > fork() works nicely for this purpose, the registers should be the same for
> > > both parent and child
> > > 
> > > Signed-off-by: Cyril Bur 
> > > ---



> > > +
> > > +/* It is very important to note here that _extra is the extra amount of
> > > + * stack space needed.
> > > + * This space must be accessed at sp + 32!  
> > 
> 
> Hi Naveen,
> 
> Thanks for the review.
> 
> > This looks to be specific to ABIv2. Is this series limited to ppc64le?  
> > If so, you might want to ensure this only builds there.
> > 
> 
> Is ABIv1 still in use? Can we still compile for v1? 

Yes, that's the earlier ppc64 BE (I'm assuming these tests can be run 
when booted in LPARs as well)

> 
> This series is for 64bit only, I've not really got any reason to believe this
> is LE only, shouldn't this work BE? The makefile enforces 64bit, I believe it is

This won't work for ABIv1 BE since the stack setup is a bit different. I 
think your patches assume that 32 bytes is the minimum stack size, but 
that's only for ABIv2. Also, the locations of CR and TOC save areas on 
the stack are quite different:

http://refspecs.linuxfoundation.org/ELF/ppc64/PPC-elf64abi-1.9.html#STACK
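
To make the size difference concrete, here is a rough C model of the two minimum frames. The struct and field names are invented for illustration, and the layouts are my reading of the ABI documents rather than anything taken from the patch, so please check the numbers against the spec linked above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the ABIv1 minimum frame (offsets in bytes). */
struct abiv1_frame {
    uint64_t back_chain;     /*   0 */
    uint64_t cr_save_area;   /*   8 */
    uint64_t lr_save;        /*  16 */
    uint64_t compiler_dword; /*  24: reserved for compilers */
    uint64_t linker_dword;   /*  32: reserved for linkers */
    uint64_t toc_save;       /*  40 */
    uint64_t param_save[8];  /*  48..111: parameter save area */
};

/* Hypothetical model of the ABIv2 minimum frame (offsets in bytes). */
struct abiv2_frame {
    uint64_t back_chain;     /*   0 */
    uint64_t cr_save_area;   /*   8 */
    uint64_t lr_save;        /*  16 */
    uint64_t toc_save;       /*  24 */
};

static_assert(sizeof(struct abiv1_frame) == 112, "ABIv1 minimum frame");
static_assert(offsetof(struct abiv1_frame, toc_save) == 40, "ABIv1 TOC slot");
static_assert(sizeof(struct abiv2_frame) == 32, "ABIv2 minimum frame");
static_assert(offsetof(struct abiv2_frame, toc_save) == 24, "ABIv2 TOC slot");
```

In this model a hard-coded 32 for the frame and 24(sp) for the TOC save only fit the ABIv2 struct.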

> ok for kernel selftests to fail to compile if they aren't going to be able to
> run.
> 
> > Also:
> > #define PPC_ABIV2_MIN_STACK_SIZE 32
> > 
> > or just:
> > #define PPC_MIN_STACK   32
> > 
> > ... is helpful. And, you might want to base the rest of your code that 
> > use PUSH_BASIC_STACK() on that. If we ever want to have these tests run 
> > anywhere else, that'll help a lot. (See further below)
> > 
> 
> So I thought about it. I agree that it would be nice, I just worry that I
> might get rabbitholed, I can see it going further and then providing stack
> accessors to abstract out even PPC_MIN_STACK except in a bunch of macros, and
> that's when I know I've gone too far.
>
> Perhaps I could look at adding this when I write more tests, I have grand
> plans to push way more tests.

Sure - just that if you ever intend to have these for ABIv1, it will be 
way easier to put together macros now rather than later.

- Naveen


Re: [PATCH v4 1/9] selftests/powerpc: Test the preservation of FPU and VMX regs across syscall

2016-02-15 Thread Cyril Bur
On Mon, 15 Feb 2016 22:29:17 +0530
"Naveen N. Rao"  wrote:

> On 2016/02/15 04:07PM, Cyril Bur wrote:
> > Test that the non volatile floating point and Altivec registers get
> > correctly preserved across the fork() syscall.
> > 
> > fork() works nicely for this purpose, the registers should be the same for
> > both parent and child
> > 
> > Signed-off-by: Cyril Bur 
> > ---
> >  tools/testing/selftests/powerpc/Makefile   |   3 +-
> >  tools/testing/selftests/powerpc/basic_asm.h|  30 
> >  tools/testing/selftests/powerpc/math/.gitignore|   2 +
> >  tools/testing/selftests/powerpc/math/Makefile  |  14 ++
> >  tools/testing/selftests/powerpc/math/fpu_asm.S | 161 +
> >  tools/testing/selftests/powerpc/math/fpu_syscall.c |  90 ++
> >  tools/testing/selftests/powerpc/math/vmx_asm.S | 193 +
> >  tools/testing/selftests/powerpc/math/vmx_syscall.c |  92 ++
> >  8 files changed, 584 insertions(+), 1 deletion(-)
> >  create mode 100644 tools/testing/selftests/powerpc/basic_asm.h
> >  create mode 100644 tools/testing/selftests/powerpc/math/.gitignore
> >  create mode 100644 tools/testing/selftests/powerpc/math/Makefile
> >  create mode 100644 tools/testing/selftests/powerpc/math/fpu_asm.S
> >  create mode 100644 tools/testing/selftests/powerpc/math/fpu_syscall.c
> >  create mode 100644 tools/testing/selftests/powerpc/math/vmx_asm.S
> >  create mode 100644 tools/testing/selftests/powerpc/math/vmx_syscall.c
> > 
> > diff --git a/tools/testing/selftests/powerpc/Makefile b/tools/testing/selftests/powerpc/Makefile
> > index 0c2706b..19e8191 100644
> > --- a/tools/testing/selftests/powerpc/Makefile
> > +++ b/tools/testing/selftests/powerpc/Makefile
> > @@ -22,7 +22,8 @@ SUB_DIRS = benchmarks \
> >switch_endian\
> >syscalls \
> >tm   \
> > -  vphn
> > +  vphn \
> > +  math
> >  
> >  endif
> >  
> > diff --git a/tools/testing/selftests/powerpc/basic_asm.h b/tools/testing/selftests/powerpc/basic_asm.h
> > new file mode 100644
> > index 000..f243da0
> > --- /dev/null
> > +++ b/tools/testing/selftests/powerpc/basic_asm.h
> > @@ -0,0 +1,30 @@
> > +#include 
> > +#include 
> > +
> > +#define LOAD_REG_IMMEDIATE(reg,expr) \
> > +   lis reg,(expr)@highest; \
> > +   ori reg,reg,(expr)@higher;  \
> > +   rldicr  reg,reg,32,31;  \
> > +   oris reg,reg,(expr)@high; \
> > +   ori reg,reg,(expr)@l;
> > +
> > +/* It is very important to note here that _extra is the extra amount of
> > + * stack space needed.
> > + * This space must be accessed at sp + 32!  
> 

Hi Naveen,

Thanks for the review.

> This looks to be specific to ABIv2. Is this series limited to ppc64le?  
> If so, you might want to ensure this only builds there.
> 

Is ABIv1 still in use? Can we still compile for v1? 

This series is for 64bit only, I've not really got any reason to believe this
is LE only, shouldn't this work BE? The makefile enforces 64bit, I believe it is
ok for kernel selftests to fail to compile if they aren't going to be able to
run.

> Also:
> #define PPC_ABIV2_MIN_STACK_SIZE 32
> 
> or just:
> #define PPC_MIN_STACK 32
> 
> ... is helpful. And, you might want to base the rest of your code that 
> use PUSH_BASIC_STACK() on that. If we ever want to have these tests run 
> anywhere else, that'll help a lot. (See further below)
> 

So I thought about it. I agree that it would be nice, I just worry that I might
get rabbitholed, I can see it going further and then providing stack accessors
to abstract out even PPC_MIN_STACK except in a bunch of macros, and that's when
I know I've gone too far.

Perhaps I could look at adding this when I write more tests, I have grand plans
to push way more tests.

> > + */
> > +#define PUSH_BASIC_STACK(_extra) \
> > +   mflr r0; \
> > +   std r0,16(sp); \
> > +   stdu sp,-(_extra + 32)(sp); \
> > +   mfcr r0; \
> > +   stw r0,8(sp); \
> > +   std 2,24(sp);  
>   ^^
> Better to use r2 here and below.
> 

I think the reason I used '2' is that 'r2' isn't actually defined in ppc-asm.h
for userspace, due to conventions, like 'sp', 'toc' has been used. So I could
have used 'toc' but then there was an issue with toc NOT being defined, or
getting undefined in some situations.

> > +
> > +#define POP_BASIC_STACK(_extra) \
> > +   ld  2,24(sp); \
> > +   lwz r0,8(sp); \
> > +   mtcr r0; \
> > +   addi sp,sp,(_extra + 32); \
> > +   ld  r0,16(sp); \
> > +   mtlr r0;
> > +
> > diff --git a/tools/testing/selftests/powerpc/math/.gitignore b/tools/testing/selftests/powerpc/math/.gitignore
> > new file mode 100644
> > index 000..b19b269
> > --- /dev/null
> > +++ b/tools/testing/selftests/powerpc/math/.gitignore
> > @@ -0,0 +1,2 @@
> > +fpu_syscall
> > +vmx_syscall
> > diff --git 

Re: [PATCH v4 1/9] selftests/powerpc: Test the preservation of FPU and VMX regs across syscall

2016-02-15 Thread Naveen N. Rao
On 2016/02/15 04:07PM, Cyril Bur wrote:
> Test that the non volatile floating point and Altivec registers get
> correctly preserved across the fork() syscall.
> 
> fork() works nicely for this purpose, the registers should be the same for
> both parent and child
> 
> Signed-off-by: Cyril Bur 
> ---
>  tools/testing/selftests/powerpc/Makefile   |   3 +-
>  tools/testing/selftests/powerpc/basic_asm.h|  30 
>  tools/testing/selftests/powerpc/math/.gitignore|   2 +
>  tools/testing/selftests/powerpc/math/Makefile  |  14 ++
>  tools/testing/selftests/powerpc/math/fpu_asm.S | 161 +
>  tools/testing/selftests/powerpc/math/fpu_syscall.c |  90 ++
>  tools/testing/selftests/powerpc/math/vmx_asm.S | 193 +
>  tools/testing/selftests/powerpc/math/vmx_syscall.c |  92 ++
>  8 files changed, 584 insertions(+), 1 deletion(-)
>  create mode 100644 tools/testing/selftests/powerpc/basic_asm.h
>  create mode 100644 tools/testing/selftests/powerpc/math/.gitignore
>  create mode 100644 tools/testing/selftests/powerpc/math/Makefile
>  create mode 100644 tools/testing/selftests/powerpc/math/fpu_asm.S
>  create mode 100644 tools/testing/selftests/powerpc/math/fpu_syscall.c
>  create mode 100644 tools/testing/selftests/powerpc/math/vmx_asm.S
>  create mode 100644 tools/testing/selftests/powerpc/math/vmx_syscall.c
> 
> diff --git a/tools/testing/selftests/powerpc/Makefile b/tools/testing/selftests/powerpc/Makefile
> index 0c2706b..19e8191 100644
> --- a/tools/testing/selftests/powerpc/Makefile
> +++ b/tools/testing/selftests/powerpc/Makefile
> @@ -22,7 +22,8 @@ SUB_DIRS = benchmarks   \
>  switch_endian\
>  syscalls \
>  tm   \
> -vphn
> +vphn \
> +math
>  
>  endif
>  
> diff --git a/tools/testing/selftests/powerpc/basic_asm.h b/tools/testing/selftests/powerpc/basic_asm.h
> new file mode 100644
> index 000..f243da0
> --- /dev/null
> +++ b/tools/testing/selftests/powerpc/basic_asm.h
> @@ -0,0 +1,30 @@
> +#include 
> +#include 
> +
> +#define LOAD_REG_IMMEDIATE(reg,expr) \
> + lis reg,(expr)@highest; \
> + ori reg,reg,(expr)@higher;  \
> + rldicr  reg,reg,32,31;  \
> + oris reg,reg,(expr)@high; \
> + ori reg,reg,(expr)@l;
> +
> +/* It is very important to note here that _extra is the extra amount of
> + * stack space needed.
> + * This space must be accessed at sp + 32!

This looks to be specific to ABIv2. Is this series limited to ppc64le?  
If so, you might want to ensure this only builds there.

Also:
#define PPC_ABIV2_MIN_STACK_SIZE 32

or just:
#define PPC_MIN_STACK   32

... is helpful. And, you might want to base the rest of your code that 
use PUSH_BASIC_STACK() on that. If we ever want to have these tests run 
anywhere else, that'll help a lot. (See further below)

> + */
> +#define PUSH_BASIC_STACK(_extra) \
> + mflr r0; \
> + std r0,16(sp); \
> + stdu sp,-(_extra + 32)(sp); \
> + mfcr r0; \
> + stw r0,8(sp); \
> + std 2,24(sp);
^^
Better to use r2 here and below.

> +
> +#define POP_BASIC_STACK(_extra) \
> + ld  2,24(sp); \
> + lwz r0,8(sp); \
> + mtcr r0; \
> + addi sp,sp,(_extra + 32); \
> + ld  r0,16(sp); \
> + mtlr r0;
> +
> diff --git a/tools/testing/selftests/powerpc/math/.gitignore b/tools/testing/selftests/powerpc/math/.gitignore
> new file mode 100644
> index 000..b19b269
> --- /dev/null
> +++ b/tools/testing/selftests/powerpc/math/.gitignore
> @@ -0,0 +1,2 @@
> +fpu_syscall
> +vmx_syscall
> diff --git a/tools/testing/selftests/powerpc/math/Makefile b/tools/testing/selftests/powerpc/math/Makefile
> new file mode 100644
> index 000..418bef1
> --- /dev/null
> +++ b/tools/testing/selftests/powerpc/math/Makefile
> @@ -0,0 +1,14 @@
> +TEST_PROGS := fpu_syscall vmx_syscall
> +
> +all: $(TEST_PROGS)
> +
> +$(TEST_PROGS): ../harness.c
> +$(TEST_PROGS): CFLAGS += -O2 -g -pthread -m64 -maltivec
> +
> +fpu_syscall: fpu_asm.S
> +vmx_syscall: vmx_asm.S
> +
> +include ../../lib.mk
> +
> +clean:
> + rm -f $(TEST_PROGS) *.o
> diff --git a/tools/testing/selftests/powerpc/math/fpu_asm.S b/tools/testing/selftests/powerpc/math/fpu_asm.S
> new file mode 100644
> index 000..8733874
> --- /dev/null
> +++ b/tools/testing/selftests/powerpc/math/fpu_asm.S
> @@ -0,0 +1,161 @@
> +/*
> + * Copyright 2015, Cyril Bur, IBM Corp.
> + *
> + * This program is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU General Public License
> + * as published by the Free Software Foundation; either version
> + * 2 of the License, or (at your option) any later version.
> + */
> +
> +#include "../basic_asm.h"
> +
> +#define PUSH_FPU(pos) \
> + stfd f14,pos(sp); \
> + stfd f15,pos+8(sp); \
> + stfd

[PATCH v4 1/9] selftests/powerpc: Test the preservation of FPU and VMX regs across syscall

2016-02-14 Thread Cyril Bur
Test that the non volatile floating point and Altivec registers get
correctly preserved across the fork() syscall.

fork() works nicely for this purpose, the registers should be the same for
both parent and child

Signed-off-by: Cyril Bur 
---
 tools/testing/selftests/powerpc/Makefile   |   3 +-
 tools/testing/selftests/powerpc/basic_asm.h|  30 
 tools/testing/selftests/powerpc/math/.gitignore|   2 +
 tools/testing/selftests/powerpc/math/Makefile  |  14 ++
 tools/testing/selftests/powerpc/math/fpu_asm.S | 161 +
 tools/testing/selftests/powerpc/math/fpu_syscall.c |  90 ++
 tools/testing/selftests/powerpc/math/vmx_asm.S | 193 +
 tools/testing/selftests/powerpc/math/vmx_syscall.c |  92 ++
 8 files changed, 584 insertions(+), 1 deletion(-)
 create mode 100644 tools/testing/selftests/powerpc/basic_asm.h
 create mode 100644 tools/testing/selftests/powerpc/math/.gitignore
 create mode 100644 tools/testing/selftests/powerpc/math/Makefile
 create mode 100644 tools/testing/selftests/powerpc/math/fpu_asm.S
 create mode 100644 tools/testing/selftests/powerpc/math/fpu_syscall.c
 create mode 100644 tools/testing/selftests/powerpc/math/vmx_asm.S
 create mode 100644 tools/testing/selftests/powerpc/math/vmx_syscall.c

diff --git a/tools/testing/selftests/powerpc/Makefile b/tools/testing/selftests/powerpc/Makefile
index 0c2706b..19e8191 100644
--- a/tools/testing/selftests/powerpc/Makefile
+++ b/tools/testing/selftests/powerpc/Makefile
@@ -22,7 +22,8 @@ SUB_DIRS = benchmarks \
   switch_endian\
   syscalls \
   tm   \
-  vphn
+  vphn \
+  math
 
 endif
 
diff --git a/tools/testing/selftests/powerpc/basic_asm.h b/tools/testing/selftests/powerpc/basic_asm.h
new file mode 100644
index 000..f243da0
--- /dev/null
+++ b/tools/testing/selftests/powerpc/basic_asm.h
@@ -0,0 +1,30 @@
+#include 
+#include 
+
+#define LOAD_REG_IMMEDIATE(reg,expr) \
+   lis reg,(expr)@highest; \
+   ori reg,reg,(expr)@higher;  \
+   rldicr  reg,reg,32,31;  \
+   oris reg,reg,(expr)@high; \
+   ori reg,reg,(expr)@l;
+
+/* It is very important to note here that _extra is the extra amount of
+ * stack space needed.
+ * This space must be accessed at sp + 32!
+ */
+#define PUSH_BASIC_STACK(_extra) \
+   mflr r0; \
+   std r0,16(sp); \
+   stdu sp,-(_extra + 32)(sp); \
+   mfcr r0; \
+   stw r0,8(sp); \
+   std 2,24(sp);
+
+#define POP_BASIC_STACK(_extra) \
+   ld  2,24(sp); \
+   lwz r0,8(sp); \
+   mtcr r0; \
+   addi sp,sp,(_extra + 32); \
+   ld  r0,16(sp); \
+   mtlr r0;
+
diff --git a/tools/testing/selftests/powerpc/math/.gitignore b/tools/testing/selftests/powerpc/math/.gitignore
new file mode 100644
index 000..b19b269
--- /dev/null
+++ b/tools/testing/selftests/powerpc/math/.gitignore
@@ -0,0 +1,2 @@
+fpu_syscall
+vmx_syscall
diff --git a/tools/testing/selftests/powerpc/math/Makefile b/tools/testing/selftests/powerpc/math/Makefile
new file mode 100644
index 000..418bef1
--- /dev/null
+++ b/tools/testing/selftests/powerpc/math/Makefile
@@ -0,0 +1,14 @@
+TEST_PROGS := fpu_syscall vmx_syscall
+
+all: $(TEST_PROGS)
+
+$(TEST_PROGS): ../harness.c
+$(TEST_PROGS): CFLAGS += -O2 -g -pthread -m64 -maltivec
+
+fpu_syscall: fpu_asm.S
+vmx_syscall: vmx_asm.S
+
+include ../../lib.mk
+
+clean:
+   rm -f $(TEST_PROGS) *.o
diff --git a/tools/testing/selftests/powerpc/math/fpu_asm.S b/tools/testing/selftests/powerpc/math/fpu_asm.S
new file mode 100644
index 000..8733874
--- /dev/null
+++ b/tools/testing/selftests/powerpc/math/fpu_asm.S
@@ -0,0 +1,161 @@
+/*
+ * Copyright 2015, Cyril Bur, IBM Corp.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ */
+
+#include "../basic_asm.h"
+
+#define PUSH_FPU(pos) \
+   stfd f14,pos(sp); \
+   stfd f15,pos+8(sp); \
+   stfd f16,pos+16(sp); \
+   stfd f17,pos+24(sp); \
+   stfd f18,pos+32(sp); \
+   stfd f19,pos+40(sp); \
+   stfd f20,pos+48(sp); \
+   stfd f21,pos+56(sp); \
+   stfd f22,pos+64(sp); \
+   stfd f23,pos+72(sp); \
+   stfd f24,pos+80(sp); \
+   stfd f25,pos+88(sp); \
+   stfd f26,pos+96(sp); \
+   stfd f27,pos+104(sp); \
+   stfd f28,pos+112(sp); \
+   stfd f29,pos+120(sp); \
+   stfd f30,pos+128(sp); \
+   stfd f31,pos+136(sp);
+
+#define POP_FPU(pos) \
+   lfd f14,pos(sp); \
+   lfd f15,pos+8(sp); \
+   lfd f16,pos+16(sp); \
+   lfd f17,pos+24(sp); \
+   lfd f18,pos+32(sp); \