This patch makes restore_fpu() an inline.  When L1/L2 cache are saturated
it makes a measurable difference.

  Results from profiling Volanomark follow.  Sample rate was 2000 samples/sec
(HZ = 250, profile multiplier = 8) on a dual-processor Pentium II Xeon.


Before:

 10680 restore_fpu                              333.7500
  8351 device_not_available                     203.6829
  3823 math_state_restore                        59.7344
 -----
 22854


After:

 12534 math_state_restore                       130.5625
  8354 device_not_available                     203.7561
 -----
 20888


Patch is "obviously correct" and cuts 9% of the overhead.  Please apply.

Next step should be to physically place math_state_restore() after
device_not_available().  Would such a patch be accepted?  (Yes it
would be ugly and require linker script changes.)

Signed-off-by: Chuck Ebbert <[EMAIL PROTECTED]>

Index: 2.6.13-rc3a/arch/i386/kernel/i387.c
===================================================================
--- 2.6.13-rc3a.orig/arch/i386/kernel/i387.c
+++ 2.6.13-rc3a/arch/i386/kernel/i387.c
@@ -82,17 +82,6 @@
 }
 EXPORT_SYMBOL_GPL(kernel_fpu_begin);
 
-void restore_fpu( struct task_struct *tsk )
-{
-       if ( cpu_has_fxsr ) {
-               asm volatile( "fxrstor %0"
-                             : : "m" (tsk->thread.i387.fxsave) );
-       } else {
-               asm volatile( "frstor %0"
-                             : : "m" (tsk->thread.i387.fsave) );
-       }
-}
-
 /*
  * FPU tag word conversions.
  */
Index: 2.6.13-rc3a/include/asm-i386/i387.h
===================================================================
--- 2.6.13-rc3a.orig/include/asm-i386/i387.h
+++ 2.6.13-rc3a/include/asm-i386/i387.h
@@ -22,11 +22,20 @@
 /*
  * FPU lazy state save handling...
  */
-extern void restore_fpu( struct task_struct *tsk );
-
 extern void kernel_fpu_begin(void);
 #define kernel_fpu_end() do { stts(); preempt_enable(); } while(0)
 
+static inline void restore_fpu( struct task_struct *tsk )
+{
+       if ( cpu_has_fxsr ) {
+               asm volatile( "fxrstor %0"
+                             : : "m" (tsk->thread.i387.fxsave) );
+       } else {
+               asm volatile( "frstor %0"
+                             : : "m" (tsk->thread.i387.fsave) );
+       }
+}
+
 /*
  * These must be called with preempt disabled
  */
__
Chuck
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Reply via email to