Re: [PATCH 1/6] powerpc: fix exception clearing in e500 SPE float emulation

2013-12-06 Thread Scott Wood
On Sat, 2013-11-23 at 01:22 +, Joseph S. Myers wrote:
> On Fri, 22 Nov 2013, Scott Wood wrote:
> 
> > This sounds like an incompatible change to userspace API.  What about
> > older glibc?  What about user code that directly manipulates these bits
> > rather than going through libc, or uses a libc other than glibc?  Where
> > is this API requirement documented?
> 
> The previous EGLIBC port, and the uClibc code copied from it, is 
> fundamentally broken as regards any use of prctl for floating-point 
> exceptions because it didn't use the PR_FP_EXC_SW_ENABLE bit in its prctl 
> calls (and did various worse things, such as passing a pointer when prctl 
> expected an integer).  If you avoid anything where prctl is used, the 
> clearing of sticky bits still means it will never give anything 
> approximating correct exception semantics with existing kernels.  I don't 
> believe the patch makes things any worse for existing code that doesn't 
> try to inform the kernel of changes to sticky bits - such code may get 
> incorrect exceptions in some cases, but it would have done so anyway in 
> other cases.

OK -- please mention this in the changelog.

> This is the best API I could come up with to fix the fundamentally broken 
> nature of what came before, taking into account that in many cases a prctl 
> call is already needed along with userspace manipulation of exception 
> bits.  I'm not aware of any kernel documentation where this sort of 
> subarchitecture-specific API detail is documented.  (The API also includes 
> such things as needing to leave the spefscr trap-enable bits set and use 
> prctl to control whether SIGFPE results from exceptions.)

I don't know of a formal place for it, but there should at least be a
code comment somewhere.

> > I think the impact of this could be reduced by using this mechanism only
> > to clear bits, rather than set them.  That is, if the exception bit is
> > unset, don't set it just because it's set in spefscr_last -- but if it's
> > not set in spefscr_last, and the emulation code doesn't want to set it,
> > then clear it.
> 
> It should already be the case in this patch that if a bit is clear in 
> spefscr, and set in spefscr_last (i.e. userspace did not inform the kernel 
> of clearing the bit, and no traps since then have resulted in the kernel 
> noticing it was cleared), it won't get set unless the emulation code wants 
> to set it.  The sole place spefscr_last is read is in the statement 
> "__FPU_FPSCR &= ~(FP_EX_INVALID | FP_EX_UNDERFLOW) | 
> current->thread.spefscr_last;" - if the bit is already clear in spefscr, 
> this statement has no effect on it.

OK -- I must have misread it before.

-Scott



--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/6] powerpc: fix exception clearing in e500 SPE float emulation

2013-12-06 Thread Scott Wood
On Sat, 2013-11-23 at 01:22 +, Joseph S. Myers wrote:
 On Fri, 22 Nov 2013, Scott Wood wrote:
 
  This sounds like an incompatible change to userspace API.  What about
  older glibc?  What about user code that directly manipulates these bits
  rather than going through libc, or uses a libc other than glibc?  Where
  is this API requirement documented?
 
 The previous EGLIBC port, and the uClibc code copied from it, is 
 fundamentally broken as regards any use of prctl for floating-point 
 exceptions because it didn't use the PR_FP_EXC_SW_ENABLE bit in its prctl 
 calls (and did various worse things, such as passing a pointer when prctl 
 expected an integer).  If you avoid anything where prctl is used, the 
 clearing of sticky bits still means it will never give anything 
 approximating correct exception semantics with existing kernels.  I don't 
 believe the patch makes things any worse for existing code that doesn't 
 try to inform the kernel of changes to sticky bits - such code may get 
 incorrect exceptions in some cases, but it would have done so anyway in 
 other cases.

OK -- please mention this in the changelog.

 This is the best API I could come up with to fix the fundamentally broken 
 nature of what came before, taking into account that in many cases a prctl 
 call is already needed along with userspace manipulation of exception 
 bits.  I'm not aware of any kernel documentation where this sort of 
 subarchitecture-specific API detail is documented.  (The API also includes 
 such things as needing to leave the spefscr trap-enable bits set and use 
 prctl to control whether SIGFPE results from exceptions.)

I don't know of a formal place for it, but there should at least be a
code comment somewhere.

  I think the impact of this could be reduced by using this mechanism only
  to clear bits, rather than set them.  That is, if the exception bit is
  unset, don't set it just because it's set in spefscr_last -- but if it's
  not set in spefscr_last, and the emulation code doesn't want to set it,
  then clear it.
 
 It should already be the case in this patch that if a bit is clear in 
 spefscr, and set in spefscr_last (i.e. userspace did not inform the kernel 
 of clearing the bit, and no traps since then have resulted in the kernel 
 noticing it was cleared), it won't get set unless the emulation code wants 
 to set it.  The sole place spefscr_last is read is in the statement 
 __FPU_FPSCR = ~(FP_EX_INVALID | FP_EX_UNDERFLOW) | 
 current-thread.spefscr_last; - if the bit is already clear in spefscr, 
 this statement has no effect on it.

OK -- I must have misread it before.

-Scott



--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/6] powerpc: fix exception clearing in e500 SPE float emulation

2013-11-22 Thread Joseph S. Myers
On Fri, 22 Nov 2013, Scott Wood wrote:

> This sounds like an incompatible change to userspace API.  What about
> older glibc?  What about user code that directly manipulates these bits
> rather than going through libc, or uses a libc other than glibc?  Where
> is this API requirement documented?

The previous EGLIBC port, and the uClibc code copied from it, is 
fundamentally broken as regards any use of prctl for floating-point 
exceptions because it didn't use the PR_FP_EXC_SW_ENABLE bit in its prctl 
calls (and did various worse things, such as passing a pointer when prctl 
expected an integer).  If you avoid anything where prctl is used, the 
clearing of sticky bits still means it will never give anything 
approximating correct exception semantics with existing kernels.  I don't 
believe the patch makes things any worse for existing code that doesn't 
try to inform the kernel of changes to sticky bits - such code may get 
incorrect exceptions in some cases, but it would have done so anyway in 
other cases.

This is the best API I could come up with to fix the fundamentally broken 
nature of what came before, taking into account that in many cases a prctl 
call is already needed along with userspace manipulation of exception 
bits.  I'm not aware of any kernel documentation where this sort of 
subarchitecture-specific API detail is documented.  (The API also includes 
such things as needing to leave the spefscr trap-enable bits set and use 
prctl to control whether SIGFPE results from exceptions.)

> I think the impact of this could be reduced by using this mechanism only
> to clear bits, rather than set them.  That is, if the exception bit is
> unset, don't set it just because it's set in spefscr_last -- but if it's
> not set in spefscr_last, and the emulation code doesn't want to set it,
> then clear it.

It should already be the case in this patch that if a bit is clear in 
spefscr, and set in spefscr_last (i.e. userspace did not inform the kernel 
of clearing the bit, and no traps since then have resulted in the kernel 
noticing it was cleared), it won't get set unless the emulation code wants 
to set it.  The sole place spefscr_last is read is in the statement 
"__FPU_FPSCR &= ~(FP_EX_INVALID | FP_EX_UNDERFLOW) | 
current->thread.spefscr_last;" - if the bit is already clear in spefscr, 
this statement has no effect on it.

> Are there any cases where the exception bit can be set without the
> kernel taking a trap, or is userspace manipulation limited to clearing
> the bits?

Userspace can both set and clear the bits without a trap.  For example, 
fesetenv restores a saved value of spefscr which may both set and clear 
bits (and then it calls prctl because it needs to do so anyway to restore 
the saved state for which exceptions were enabled).  fesetexceptflag 
restores saved state of particular exceptions without a trap (so needs to 
call prctl specially to inform the kernel of a change).

-- 
Joseph S. Myers
jos...@codesourcery.com
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/6] powerpc: fix exception clearing in e500 SPE float emulation

2013-11-22 Thread Scott Wood
On Mon, 2013-11-04 at 16:52 +, Joseph S. Myers wrote:
> From: Joseph Myers 
> 
> The e500 SPE floating-point emulation code clears existing exceptions
> (__FPU_FPSCR &= ~FP_EX_MASK;) before ORing in the exceptions from the
> emulated operation.  However, these exception bits are the "sticky",
> cumulative exception bits, and should only be cleared by the user
> program setting SPEFSCR, not implicitly by any floating-point
> instruction (whether executed purely by the hardware or emulated).
> The spurious clearing of these bits shows up as missing exceptions in
> glibc testing.
> 
> Fixing this, however, is not as simple as just not clearing the bits,
> because while the bits may be from previous floating-point operations
> (in which case they should not be cleared), the processor can also set
> the sticky bits itself before the interrupt for an exception occurs,
> and this can happen in cases when IEEE 754 semantics are that the
> sticky bit should not be set.  Specifically, the "invalid" sticky bit
> is set in various cases with non-finite operands, where IEEE 754
> semantics do not involve raising such an exception, and the
> "underflow" sticky bit is set in cases of exact underflow, whereas
> IEEE 754 semantics are that this flag is set only for inexact
> underflow.  Thus, for correct emulation the kernel needs to know the
> setting of these two sticky bits before the instruction being
> emulated.
> 
> When a floating-point operation raises an exception, the kernel can
> note the state of the sticky bits immediately afterwards.  Some
>  functions that affect the state of these bits, such as
> fesetenv and feholdexcept, need to use prctl with PR_GET_FPEXC and
> PR_SET_FPEXC anyway, and so it is natural to record the state of those
> bits during that call into the kernel and so avoid any need for a
> separate call into the kernel to inform it of a change to those bits.
> Thus, the interface I chose to use (in this patch and the glibc port)
> is that one of those prctl calls must be made after any userspace
> change to those sticky bits, other than through a floating-point
> operation that traps into the kernel anyway.

This sounds like an incompatible change to userspace API.  What about
older glibc?  What about user code that directly manipulates these bits
rather than going through libc, or uses a libc other than glibc?  Where
is this API requirement documented?

I think the impact of this could be reduced by using this mechanism only
to clear bits, rather than set them.  That is, if the exception bit is
unset, don't set it just because it's set in spefscr_last -- but if it's
not set in spefscr_last, and the emulation code doesn't want to set it,
then clear it.

Are there any cases where the exception bit can be set without the
kernel taking a trap, or is userspace manipulation limited to clearing
the bits?

-Scott



--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/6] powerpc: fix exception clearing in e500 SPE float emulation

2013-11-22 Thread Scott Wood
On Mon, 2013-11-04 at 16:52 +, Joseph S. Myers wrote:
 From: Joseph Myers jos...@codesourcery.com
 
 The e500 SPE floating-point emulation code clears existing exceptions
 (__FPU_FPSCR = ~FP_EX_MASK;) before ORing in the exceptions from the
 emulated operation.  However, these exception bits are the sticky,
 cumulative exception bits, and should only be cleared by the user
 program setting SPEFSCR, not implicitly by any floating-point
 instruction (whether executed purely by the hardware or emulated).
 The spurious clearing of these bits shows up as missing exceptions in
 glibc testing.
 
 Fixing this, however, is not as simple as just not clearing the bits,
 because while the bits may be from previous floating-point operations
 (in which case they should not be cleared), the processor can also set
 the sticky bits itself before the interrupt for an exception occurs,
 and this can happen in cases when IEEE 754 semantics are that the
 sticky bit should not be set.  Specifically, the invalid sticky bit
 is set in various cases with non-finite operands, where IEEE 754
 semantics do not involve raising such an exception, and the
 underflow sticky bit is set in cases of exact underflow, whereas
 IEEE 754 semantics are that this flag is set only for inexact
 underflow.  Thus, for correct emulation the kernel needs to know the
 setting of these two sticky bits before the instruction being
 emulated.
 
 When a floating-point operation raises an exception, the kernel can
 note the state of the sticky bits immediately afterwards.  Some
 fenv.h functions that affect the state of these bits, such as
 fesetenv and feholdexcept, need to use prctl with PR_GET_FPEXC and
 PR_SET_FPEXC anyway, and so it is natural to record the state of those
 bits during that call into the kernel and so avoid any need for a
 separate call into the kernel to inform it of a change to those bits.
 Thus, the interface I chose to use (in this patch and the glibc port)
 is that one of those prctl calls must be made after any userspace
 change to those sticky bits, other than through a floating-point
 operation that traps into the kernel anyway.

This sounds like an incompatible change to userspace API.  What about
older glibc?  What about user code that directly manipulates these bits
rather than going through libc, or uses a libc other than glibc?  Where
is this API requirement documented?

I think the impact of this could be reduced by using this mechanism only
to clear bits, rather than set them.  That is, if the exception bit is
unset, don't set it just because it's set in spefscr_last -- but if it's
not set in spefscr_last, and the emulation code doesn't want to set it,
then clear it.

Are there any cases where the exception bit can be set without the
kernel taking a trap, or is userspace manipulation limited to clearing
the bits?

-Scott



--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/6] powerpc: fix exception clearing in e500 SPE float emulation

2013-11-22 Thread Joseph S. Myers
On Fri, 22 Nov 2013, Scott Wood wrote:

 This sounds like an incompatible change to userspace API.  What about
 older glibc?  What about user code that directly manipulates these bits
 rather than going through libc, or uses a libc other than glibc?  Where
 is this API requirement documented?

The previous EGLIBC port, and the uClibc code copied from it, is 
fundamentally broken as regards any use of prctl for floating-point 
exceptions because it didn't use the PR_FP_EXC_SW_ENABLE bit in its prctl 
calls (and did various worse things, such as passing a pointer when prctl 
expected an integer).  If you avoid anything where prctl is used, the 
clearing of sticky bits still means it will never give anything 
approximating correct exception semantics with existing kernels.  I don't 
believe the patch makes things any worse for existing code that doesn't 
try to inform the kernel of changes to sticky bits - such code may get 
incorrect exceptions in some cases, but it would have done so anyway in 
other cases.

This is the best API I could come up with to fix the fundamentally broken 
nature of what came before, taking into account that in many cases a prctl 
call is already needed along with userspace manipulation of exception 
bits.  I'm not aware of any kernel documentation where this sort of 
subarchitecture-specific API detail is documented.  (The API also includes 
such things as needing to leave the spefscr trap-enable bits set and use 
prctl to control whether SIGFPE results from exceptions.)

 I think the impact of this could be reduced by using this mechanism only
 to clear bits, rather than set them.  That is, if the exception bit is
 unset, don't set it just because it's set in spefscr_last -- but if it's
 not set in spefscr_last, and the emulation code doesn't want to set it,
 then clear it.

It should already be the case in this patch that if a bit is clear in 
spefscr, and set in spefscr_last (i.e. userspace did not inform the kernel 
of clearing the bit, and no traps since then have resulted in the kernel 
noticing it was cleared), it won't get set unless the emulation code wants 
to set it.  The sole place spefscr_last is read is in the statement 
__FPU_FPSCR = ~(FP_EX_INVALID | FP_EX_UNDERFLOW) | 
current-thread.spefscr_last; - if the bit is already clear in spefscr, 
this statement has no effect on it.

 Are there any cases where the exception bit can be set without the
 kernel taking a trap, or is userspace manipulation limited to clearing
 the bits?

Userspace can both set and clear the bits without a trap.  For example, 
fesetenv restores a saved value of spefscr which may both set and clear 
bits (and then it calls prctl because it needs to do so anyway to restore 
the saved state for which exceptions were enabled).  fesetexceptflag 
restores saved state of particular exceptions without a trap (so needs to 
call prctl specially to inform the kernel of a change).

-- 
Joseph S. Myers
jos...@codesourcery.com
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 1/6] powerpc: fix exception clearing in e500 SPE float emulation

2013-11-04 Thread Joseph S. Myers
From: Joseph Myers 

The e500 SPE floating-point emulation code clears existing exceptions
(__FPU_FPSCR &= ~FP_EX_MASK;) before ORing in the exceptions from the
emulated operation.  However, these exception bits are the "sticky",
cumulative exception bits, and should only be cleared by the user
program setting SPEFSCR, not implicitly by any floating-point
instruction (whether executed purely by the hardware or emulated).
The spurious clearing of these bits shows up as missing exceptions in
glibc testing.

Fixing this, however, is not as simple as just not clearing the bits,
because while the bits may be from previous floating-point operations
(in which case they should not be cleared), the processor can also set
the sticky bits itself before the interrupt for an exception occurs,
and this can happen in cases when IEEE 754 semantics are that the
sticky bit should not be set.  Specifically, the "invalid" sticky bit
is set in various cases with non-finite operands, where IEEE 754
semantics do not involve raising such an exception, and the
"underflow" sticky bit is set in cases of exact underflow, whereas
IEEE 754 semantics are that this flag is set only for inexact
underflow.  Thus, for correct emulation the kernel needs to know the
setting of these two sticky bits before the instruction being
emulated.

When a floating-point operation raises an exception, the kernel can
note the state of the sticky bits immediately afterwards.  Some
 functions that affect the state of these bits, such as
fesetenv and feholdexcept, need to use prctl with PR_GET_FPEXC and
PR_SET_FPEXC anyway, and so it is natural to record the state of those
bits during that call into the kernel and so avoid any need for a
separate call into the kernel to inform it of a change to those bits.
Thus, the interface I chose to use (in this patch and the glibc port)
is that one of those prctl calls must be made after any userspace
change to those sticky bits, other than through a floating-point
operation that traps into the kernel anyway.  feclearexcept and
fesetexceptflag duly make those calls, which would not be required
were it not for this issue.

Signed-off-by: Joseph Myers 

---

Previous submission: .

diff --git a/arch/powerpc/include/asm/processor.h 
b/arch/powerpc/include/asm/processor.h
index ce4de5a..0b02e23 100644
--- a/arch/powerpc/include/asm/processor.h
+++ b/arch/powerpc/include/asm/processor.h
@@ -237,6 +237,8 @@ struct thread_struct {
unsigned long   evr[32];/* upper 32-bits of SPE regs */
u64 acc;/* Accumulator */
unsigned long   spefscr;/* SPE & eFP status */
+   unsigned long   spefscr_last;   /* SPEFSCR value on last prctl
+  call or trap return */
int used_spe;   /* set if process has used spe */
 #endif /* CONFIG_SPE */
 #ifdef CONFIG_PPC_TRANSACTIONAL_MEM
@@ -303,7 +305,9 @@ struct thread_struct {
(_ALIGN_UP(sizeof(init_thread_info), 16) + (unsigned long) _stack)
 
 #ifdef CONFIG_SPE
-#define SPEFSCR_INIT .spefscr = SPEFSCR_FINVE | SPEFSCR_FDBZE | SPEFSCR_FUNFE 
| SPEFSCR_FOVFE,
+#define SPEFSCR_INIT \
+   .spefscr = SPEFSCR_FINVE | SPEFSCR_FDBZE | SPEFSCR_FUNFE | 
SPEFSCR_FOVFE, \
+   .spefscr_last = SPEFSCR_FINVE | SPEFSCR_FDBZE | SPEFSCR_FUNFE | 
SPEFSCR_FOVFE,
 #else
 #define SPEFSCR_INIT
 #endif
diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
index 96d2fdf..e3b91f1 100644
--- a/arch/powerpc/kernel/process.c
+++ b/arch/powerpc/kernel/process.c
@@ -1151,6 +1151,7 @@ int set_fpexc_mode(struct task_struct *tsk, unsigned int 
val)
if (val & PR_FP_EXC_SW_ENABLE) {
 #ifdef CONFIG_SPE
if (cpu_has_feature(CPU_FTR_SPE)) {
+   tsk->thread.spefscr_last = mfspr(SPRN_SPEFSCR);
tsk->thread.fpexc_mode = val &
(PR_FP_EXC_SW_ENABLE | PR_FP_ALL_EXCEPT);
return 0;
@@ -1182,9 +1183,10 @@ int get_fpexc_mode(struct task_struct *tsk, unsigned 
long adr)
 
if (tsk->thread.fpexc_mode & PR_FP_EXC_SW_ENABLE)
 #ifdef CONFIG_SPE
-   if (cpu_has_feature(CPU_FTR_SPE))
+   if (cpu_has_feature(CPU_FTR_SPE)) {
+   tsk->thread.spefscr_last = mfspr(SPRN_SPEFSCR);
val = tsk->thread.fpexc_mode;
-   else
+   } else
return -EINVAL;
 #else
return -EINVAL;
diff --git a/arch/powerpc/math-emu/math_efp.c b/arch/powerpc/math-emu/math_efp.c
index a73f088..59835c6 100644
--- a/arch/powerpc/math-emu/math_efp.c
+++ b/arch/powerpc/math-emu/math_efp.c
@@ -630,9 +630,27 @@ update_ccr:
regs->ccr |= (IR << ((7 - ((speinsn >> 23) & 0x7)) << 2));
 
 update_regs:
-   __FPU_FPSCR &= ~FP_EX_MASK;
+   /*
+* If the "invalid" exception sticky bit was set by the
+* processor for 

[PATCH 1/6] powerpc: fix exception clearing in e500 SPE float emulation

2013-11-04 Thread Joseph S. Myers
From: Joseph Myers jos...@codesourcery.com

The e500 SPE floating-point emulation code clears existing exceptions
(__FPU_FPSCR = ~FP_EX_MASK;) before ORing in the exceptions from the
emulated operation.  However, these exception bits are the sticky,
cumulative exception bits, and should only be cleared by the user
program setting SPEFSCR, not implicitly by any floating-point
instruction (whether executed purely by the hardware or emulated).
The spurious clearing of these bits shows up as missing exceptions in
glibc testing.

Fixing this, however, is not as simple as just not clearing the bits,
because while the bits may be from previous floating-point operations
(in which case they should not be cleared), the processor can also set
the sticky bits itself before the interrupt for an exception occurs,
and this can happen in cases when IEEE 754 semantics are that the
sticky bit should not be set.  Specifically, the invalid sticky bit
is set in various cases with non-finite operands, where IEEE 754
semantics do not involve raising such an exception, and the
underflow sticky bit is set in cases of exact underflow, whereas
IEEE 754 semantics are that this flag is set only for inexact
underflow.  Thus, for correct emulation the kernel needs to know the
setting of these two sticky bits before the instruction being
emulated.

When a floating-point operation raises an exception, the kernel can
note the state of the sticky bits immediately afterwards.  Some
fenv.h functions that affect the state of these bits, such as
fesetenv and feholdexcept, need to use prctl with PR_GET_FPEXC and
PR_SET_FPEXC anyway, and so it is natural to record the state of those
bits during that call into the kernel and so avoid any need for a
separate call into the kernel to inform it of a change to those bits.
Thus, the interface I chose to use (in this patch and the glibc port)
is that one of those prctl calls must be made after any userspace
change to those sticky bits, other than through a floating-point
operation that traps into the kernel anyway.  feclearexcept and
fesetexceptflag duly make those calls, which would not be required
were it not for this issue.

Signed-off-by: Joseph Myers jos...@codesourcery.com

---

Previous submission: http://lkml.org/lkml/2013/10/4/495.

diff --git a/arch/powerpc/include/asm/processor.h 
b/arch/powerpc/include/asm/processor.h
index ce4de5a..0b02e23 100644
--- a/arch/powerpc/include/asm/processor.h
+++ b/arch/powerpc/include/asm/processor.h
@@ -237,6 +237,8 @@ struct thread_struct {
unsigned long   evr[32];/* upper 32-bits of SPE regs */
u64 acc;/* Accumulator */
unsigned long   spefscr;/* SPE  eFP status */
+   unsigned long   spefscr_last;   /* SPEFSCR value on last prctl
+  call or trap return */
int used_spe;   /* set if process has used spe */
 #endif /* CONFIG_SPE */
 #ifdef CONFIG_PPC_TRANSACTIONAL_MEM
@@ -303,7 +305,9 @@ struct thread_struct {
(_ALIGN_UP(sizeof(init_thread_info), 16) + (unsigned long) init_stack)
 
 #ifdef CONFIG_SPE
-#define SPEFSCR_INIT .spefscr = SPEFSCR_FINVE | SPEFSCR_FDBZE | SPEFSCR_FUNFE 
| SPEFSCR_FOVFE,
+#define SPEFSCR_INIT \
+   .spefscr = SPEFSCR_FINVE | SPEFSCR_FDBZE | SPEFSCR_FUNFE | 
SPEFSCR_FOVFE, \
+   .spefscr_last = SPEFSCR_FINVE | SPEFSCR_FDBZE | SPEFSCR_FUNFE | 
SPEFSCR_FOVFE,
 #else
 #define SPEFSCR_INIT
 #endif
diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
index 96d2fdf..e3b91f1 100644
--- a/arch/powerpc/kernel/process.c
+++ b/arch/powerpc/kernel/process.c
@@ -1151,6 +1151,7 @@ int set_fpexc_mode(struct task_struct *tsk, unsigned int 
val)
if (val  PR_FP_EXC_SW_ENABLE) {
 #ifdef CONFIG_SPE
if (cpu_has_feature(CPU_FTR_SPE)) {
+   tsk-thread.spefscr_last = mfspr(SPRN_SPEFSCR);
tsk-thread.fpexc_mode = val 
(PR_FP_EXC_SW_ENABLE | PR_FP_ALL_EXCEPT);
return 0;
@@ -1182,9 +1183,10 @@ int get_fpexc_mode(struct task_struct *tsk, unsigned 
long adr)
 
if (tsk-thread.fpexc_mode  PR_FP_EXC_SW_ENABLE)
 #ifdef CONFIG_SPE
-   if (cpu_has_feature(CPU_FTR_SPE))
+   if (cpu_has_feature(CPU_FTR_SPE)) {
+   tsk-thread.spefscr_last = mfspr(SPRN_SPEFSCR);
val = tsk-thread.fpexc_mode;
-   else
+   } else
return -EINVAL;
 #else
return -EINVAL;
diff --git a/arch/powerpc/math-emu/math_efp.c b/arch/powerpc/math-emu/math_efp.c
index a73f088..59835c6 100644
--- a/arch/powerpc/math-emu/math_efp.c
+++ b/arch/powerpc/math-emu/math_efp.c
@@ -630,9 +630,27 @@ update_ccr:
regs-ccr |= (IR  ((7 - ((speinsn  23)  0x7))  2));
 
 update_regs:
-   __FPU_FPSCR = ~FP_EX_MASK;
+   /*
+* If the invalid exception sticky bit was set by the
+