Re: [patch 1/2] x86,fpu: split FPU state from task struct

2008-02-24 Thread Siddha, Suresh B
On Sun, Feb 24, 2008 at 08:22:02AM +0100, Ingo Molnar wrote:
> 
> * Suresh Siddha <[EMAIL PROTECTED]> wrote:
> 
> > Split the FPU save area from the task struct. This allows easy 
> > migration of FPU context, and it's generally cleaner. It also allows 
> > the following two optimizations:
> > 
> > 1) only allocate when the application actually uses FPU, so in the 
> > first lazy FPU trap. This could save memory for non-fpu using apps. 
> > Next patch does this lazy allocation.
> > 
> > 2) allocate the right size for the actual cpu rather than 512 bytes 
> > always. Patches enabling xsave/xrstor support (coming shortly) will 
> > take advantage of this.
> 
> i like the concept. Please clean up the issues found by Christoph and 
> please also base it against x86.git#testing [this is clear 2.6.26 
> material and there are already some changes in this area]:
> 
>http://people.redhat.com/mingo/x86.git/README

Sure. Will do that in a day.

thanks,
suresh
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch 1/2] x86,fpu: split FPU state from task struct

2008-02-24 Thread Siddha, Suresh B
On Sun, Feb 24, 2008 at 08:27:30AM +0100, Roger While wrote:
> 
> On Sat, Feb 23, 2008 at 06:34:38PM -0800, Suresh Siddha wrote:
> > Split the FPU save area from the task struct. This allows easy migration
> > of FPU context, and it's generally cleaner. It also allows the following
> > two optimizations:
> >
> > 1) only allocate when the application actually uses FPU, so in the first
> > lazy FPU trap. This could save memory for non-fpu using apps. Next patch
> > does this lazy allocation.
> >
> > 2) allocate the right size for the actual cpu rather than 512 bytes 
> always.
> > Patches enabling xsave/xrstor support (coming shortly) will take advantage
> > of this.
> 
> > if (next_p->fpu_counter>5)
> > -   prefetch(>i387.fxsave);
> > +   prefetch(FXSAVE(next_p));
> 
> Shouldn't that be  prefetch(FXSAVE(next));  ?

No. 'next_p' which is the task_struct is what FXSAVE macro takes.

thanks,
suresh
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch 1/2] x86,fpu: split FPU state from task struct

2008-02-24 Thread Siddha, Suresh B
On Sun, Feb 24, 2008 at 08:27:30AM +0100, Roger While wrote:
 
 On Sat, Feb 23, 2008 at 06:34:38PM -0800, Suresh Siddha wrote:
  Split the FPU save area from the task struct. This allows easy migration
  of FPU context, and it's generally cleaner. It also allows the following
  two optimizations:
 
  1) only allocate when the application actually uses FPU, so in the first
  lazy FPU trap. This could save memory for non-fpu using apps. Next patch
  does this lazy allocation.
 
  2) allocate the right size for the actual cpu rather than 512 bytes 
 always.
  Patches enabling xsave/xrstor support (coming shortly) will take advantage
  of this.
 
  if (next_p-fpu_counter5)
  -   prefetch(next-i387.fxsave);
  +   prefetch(FXSAVE(next_p));
 
 Shouldn't that be  prefetch(FXSAVE(next));  ?

No. 'next_p' which is the task_struct is what FXSAVE macro takes.

thanks,
suresh
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch 1/2] x86,fpu: split FPU state from task struct

2008-02-24 Thread Siddha, Suresh B
On Sun, Feb 24, 2008 at 08:22:02AM +0100, Ingo Molnar wrote:
 
 * Suresh Siddha [EMAIL PROTECTED] wrote:
 
  Split the FPU save area from the task struct. This allows easy 
  migration of FPU context, and it's generally cleaner. It also allows 
  the following two optimizations:
  
  1) only allocate when the application actually uses FPU, so in the 
  first lazy FPU trap. This could save memory for non-fpu using apps. 
  Next patch does this lazy allocation.
  
  2) allocate the right size for the actual cpu rather than 512 bytes 
  always. Patches enabling xsave/xrstor support (coming shortly) will 
  take advantage of this.
 
 i like the concept. Please clean up the issues found by Christoph and 
 please also base it against x86.git#testing [this is clear 2.6.26 
 material and there are already some changes in this area]:
 
http://people.redhat.com/mingo/x86.git/README

Sure. Will do that in a day.

thanks,
suresh
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch 1/2] x86,fpu: split FPU state from task struct

2008-02-23 Thread Roger While


On Sat, Feb 23, 2008 at 06:34:38PM -0800, Suresh Siddha wrote:
> Split the FPU save area from the task struct. This allows easy migration
> of FPU context, and it's generally cleaner. It also allows the following
> two optimizations:
>
> 1) only allocate when the application actually uses FPU, so in the first
> lazy FPU trap. This could save memory for non-fpu using apps. Next patch
> does this lazy allocation.
>
> 2) allocate the right size for the actual cpu rather than 512 bytes always.
> Patches enabling xsave/xrstor support (coming shortly) will take advantage
> of this.

>if (next_p->fpu_counter>5)
> -  prefetch(>i387.fxsave);
> +  prefetch(FXSAVE(next_p));

Shouldn't that be  prefetch(FXSAVE(next));  ?

Roger


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch 1/2] x86,fpu: split FPU state from task struct

2008-02-23 Thread Ingo Molnar

* Suresh Siddha <[EMAIL PROTECTED]> wrote:

> Split the FPU save area from the task struct. This allows easy 
> migration of FPU context, and it's generally cleaner. It also allows 
> the following two optimizations:
> 
> 1) only allocate when the application actually uses FPU, so in the 
> first lazy FPU trap. This could save memory for non-fpu using apps. 
> Next patch does this lazy allocation.
> 
> 2) allocate the right size for the actual cpu rather than 512 bytes 
> always. Patches enabling xsave/xrstor support (coming shortly) will 
> take advantage of this.

i like the concept. Please clean up the issues found by Christoph and 
please also base it against x86.git#testing [this is clear 2.6.26 
material and there are already some changes in this area]:

   http://people.redhat.com/mingo/x86.git/README

Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch 1/2] x86,fpu: split FPU state from task struct

2008-02-23 Thread Christoph Hellwig
On Sat, Feb 23, 2008 at 06:34:38PM -0800, Suresh Siddha wrote:
> Split the FPU save area from the task struct. This allows easy migration
> of FPU context, and it's generally cleaner. It also allows the following
> two optimizations:
> 
> 1) only allocate when the application actually uses FPU, so in the first
> lazy FPU trap. This could save memory for non-fpu using apps. Next patch
> does this lazy allocation.
> 
> 2) allocate the right size for the actual cpu rather than 512 bytes always.
> Patches enabling xsave/xrstor support (coming shortly) will take advantage
> of this.

This sounds like a wonderful idea.  But I'm a little unhappy with
some of the rather cosmetic things in this patch:

>   if (next_p->fpu_counter>5)
> - prefetch(>i387.fxsave);
> + prefetch(FXSAVE(next_p));

These macros are rather ugly.  If you really want them please

a) make them inlines and lowercase with a descriptive name
b) introduce them in a separate patch before the first real
   path in the series.

> +++ linux-2.6-x86/kernel/fork.c   2008-02-23 15:08:53.0 -0800
> @@ -87,6 +87,7 @@
>  #ifndef __HAVE_ARCH_TASK_STRUCT_ALLOCATOR
>  # define alloc_task_struct() kmem_cache_alloc(task_struct_cachep, GFP_KERNEL)
>  # define free_task_struct(tsk)   kmem_cache_free(task_struct_cachep, 
> (tsk))
> +# define memcpy_task_struct(dst, src) do { *dst = *src; } while (0)
>  static struct kmem_cache *task_struct_cachep;
>  #endif
>  
> @@ -142,6 +143,8 @@
>   task_struct_cachep =
>   kmem_cache_create("task_struct", sizeof(struct task_struct),
>   ARCH_MIN_TASKALIGN, SLAB_PANIC, NULL);
> +#else
> + task_struct_slab_init();
>  #endif
>  
>   /*
> @@ -181,7 +184,8 @@
>   return NULL;
>   }
>  
> - *tsk = *orig;
> + memcpy_task_struct(tsk, orig);

I think this is a bad name for this helper, arch_dup_task_struct
would be more descriptive.

But we actually have an arch hook for this kind of thing called
setup_thread_stack which is used by ia64 and m68k just a few lines
later, so it'd be better to look into having a single hook.
(And possibly rename it to arch_dup_task_struct because the name
is a lot more descriptive)
setup_thread_stack

> + memset(FSAVE(tsk), 0, math_cntxt_size);
> + FSAVE(tsk)->cwd = 0x037fu;
> + FSAVE(tsk)->swd = 0xu;
> + FSAVE(tsk)->twd = 0xu;
> + FSAVE(tsk)->fos = 0xu;

Also if you reference the save area so often it'd be better to just
have a local variable for it.  Much better readable.

> +struct task_struct * alloc_task_struct(void)

this should be struct task_struct *alloc_task_struct(void)

> +void free_task_struct(struct task_struct *tsk)
> +{
> + kmem_cache_free(task_cntxt_cachep, tsk->thread.cntxt);
> + tsk->thread.cntxt=NULL;

missing spaces around the '='

> -#define I387 (current->thread.i387)
> -#define FPU_info (I387.soft.info)
> +#define I387 (current->thread.cntxt)
> +#define FPU_info (I387->soft.info)
> +#define SOFT(t)  (&(t->thread.cntxt->soft))

This is quite butt ugly.  But then again it's fpemu, so there's
probably no point touching it until a bored janitor comes around.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[patch 1/2] x86,fpu: split FPU state from task struct

2008-02-23 Thread Suresh Siddha
Split the FPU save area from the task struct. This allows easy migration
of FPU context, and it's generally cleaner. It also allows the following
two optimizations:

1) only allocate when the application actually uses FPU, so in the first
lazy FPU trap. This could save memory for non-fpu using apps. Next patch
does this lazy allocation.

2) allocate the right size for the actual cpu rather than 512 bytes always.
Patches enabling xsave/xrstor support (coming shortly) will take advantage
of this.

Signed-off-by: Suresh Siddha <[EMAIL PROTECTED]>
Signed-off-by: Arjan van de Ven <[EMAIL PROTECTED]>
---

Index: linux-2.6-x86/arch/x86/kernel/process_64.c
===
--- linux-2.6-x86.orig/arch/x86/kernel/process_64.c 2008-02-22 
18:33:17.0 -0800
+++ linux-2.6-x86/arch/x86/kernel/process_64.c  2008-02-23 15:04:30.0 
-0800
@@ -630,7 +630,7 @@
 
/* we're going to use this soon, after a few expensive things */
if (next_p->fpu_counter>5)
-   prefetch(>i387.fxsave);
+   prefetch(FXSAVE(next_p));
 
/*
 * Reload esp0, LDT and the page table pointer:
Index: linux-2.6-x86/arch/x86/kernel/traps_64.c
===
--- linux-2.6-x86.orig/arch/x86/kernel/traps_64.c   2008-02-22 
18:33:17.0 -0800
+++ linux-2.6-x86/arch/x86/kernel/traps_64.c2008-02-23 15:04:30.0 
-0800
@@ -1118,7 +1118,7 @@
 
if (!used_math())
init_fpu(me);
-   restore_fpu_checking(>thread.i387.fxsave);
+   restore_fpu_checking(FXSAVE(me));
task_thread_info(me)->status |= TS_USEDFPU;
me->fpu_counter++;
 }
@@ -1154,6 +1154,10 @@
 #endif

/*
+* initialize the per thread FP context:
+*/
+init_thread_context();
+   /*
 * Should be a barrier for any external CPU state.
 */
cpu_init();
Index: linux-2.6-x86/kernel/fork.c
===
--- linux-2.6-x86.orig/kernel/fork.c2008-02-23 15:03:35.0 -0800
+++ linux-2.6-x86/kernel/fork.c 2008-02-23 15:08:53.0 -0800
@@ -87,6 +87,7 @@
 #ifndef __HAVE_ARCH_TASK_STRUCT_ALLOCATOR
 # define alloc_task_struct()   kmem_cache_alloc(task_struct_cachep, GFP_KERNEL)
 # define free_task_struct(tsk) kmem_cache_free(task_struct_cachep, (tsk))
+# define memcpy_task_struct(dst, src) do { *dst = *src; } while (0)
 static struct kmem_cache *task_struct_cachep;
 #endif
 
@@ -142,6 +143,8 @@
task_struct_cachep =
kmem_cache_create("task_struct", sizeof(struct task_struct),
ARCH_MIN_TASKALIGN, SLAB_PANIC, NULL);
+#else
+   task_struct_slab_init();
 #endif
 
/*
@@ -181,7 +184,8 @@
return NULL;
}
 
-   *tsk = *orig;
+   memcpy_task_struct(tsk, orig);
+
tsk->stack = ti;
 
err = prop_local_init_single(>dirties);
Index: linux-2.6-x86/arch/x86/kernel/i387.c
===
--- linux-2.6-x86.orig/arch/x86/kernel/i387.c   2008-02-23 15:03:35.0 
-0800
+++ linux-2.6-x86/arch/x86/kernel/i387.c2008-02-23 15:08:53.0 
-0800
@@ -9,6 +9,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -40,16 +41,17 @@
 #endif
 
 static unsigned int mxcsr_feature_mask __read_mostly = 0xu;
+unsigned int math_cntxt_size;
+static struct i387_fxsave_struct fx_scratch __cpuinitdata;
 
-void mxcsr_feature_mask_init(void)
+void __cpuinit mxcsr_feature_mask_init(void)
 {
unsigned long mask = 0;
clts();
if (cpu_has_fxsr) {
-   memset(>thread.i387.fxsave, 0,
-  sizeof(struct i387_fxsave_struct));
-   asm volatile("fxsave %0" : : "m" (current->thread.i387.fxsave));
-   mask = current->thread.i387.fxsave.mxcsr_mask;
+   memset(_scratch, 0, sizeof(struct i387_fxsave_struct));
+   asm volatile("fxsave %0" : : "m" (fx_scratch));
+   mask = fx_scratch.mxcsr_mask;
if (mask == 0)
mask = 0xffbf;
}
@@ -57,6 +59,17 @@
stts();
 }
 
+void __init init_thread_context(void)
+{
+   if (cpu_has_fxsr)
+   math_cntxt_size = sizeof(struct i387_fxsave_struct);
+#ifdef CONFIG_X86_32
+   else
+   math_cntxt_size = sizeof(struct i387_fsave_struct);
+#endif
+   init_task.thread.cntxt = alloc_bootmem(math_cntxt_size);
+}
+
 #ifdef CONFIG_X86_64
 /*
  * Called at bootup to set up the initial FPU state that is later cloned
@@ -65,10 +78,7 @@
 void __cpuinit fpu_init(void)
 {
unsigned long oldcr0 = read_cr0();
-   extern void __bad_fxsave_alignment(void);
 
-   if (offsetof(struct task_struct, thread.i387.fxsave) & 15)
-   __bad_fxsave_alignment();

[patch 1/2] x86,fpu: split FPU state from task struct

2008-02-23 Thread Suresh Siddha
Split the FPU save area from the task struct. This allows easy migration
of FPU context, and it's generally cleaner. It also allows the following
two optimizations:

1) only allocate when the application actually uses FPU, so in the first
lazy FPU trap. This could save memory for non-fpu using apps. Next patch
does this lazy allocation.

2) allocate the right size for the actual cpu rather than 512 bytes always.
Patches enabling xsave/xrstor support (coming shortly) will take advantage
of this.

Signed-off-by: Suresh Siddha [EMAIL PROTECTED]
Signed-off-by: Arjan van de Ven [EMAIL PROTECTED]
---

Index: linux-2.6-x86/arch/x86/kernel/process_64.c
===
--- linux-2.6-x86.orig/arch/x86/kernel/process_64.c 2008-02-22 
18:33:17.0 -0800
+++ linux-2.6-x86/arch/x86/kernel/process_64.c  2008-02-23 15:04:30.0 
-0800
@@ -630,7 +630,7 @@
 
/* we're going to use this soon, after a few expensive things */
if (next_p-fpu_counter5)
-   prefetch(next-i387.fxsave);
+   prefetch(FXSAVE(next_p));
 
/*
 * Reload esp0, LDT and the page table pointer:
Index: linux-2.6-x86/arch/x86/kernel/traps_64.c
===
--- linux-2.6-x86.orig/arch/x86/kernel/traps_64.c   2008-02-22 
18:33:17.0 -0800
+++ linux-2.6-x86/arch/x86/kernel/traps_64.c2008-02-23 15:04:30.0 
-0800
@@ -1118,7 +1118,7 @@
 
if (!used_math())
init_fpu(me);
-   restore_fpu_checking(me-thread.i387.fxsave);
+   restore_fpu_checking(FXSAVE(me));
task_thread_info(me)-status |= TS_USEDFPU;
me-fpu_counter++;
 }
@@ -1154,6 +1154,10 @@
 #endif

/*
+* initialize the per thread FP context:
+*/
+init_thread_context();
+   /*
 * Should be a barrier for any external CPU state.
 */
cpu_init();
Index: linux-2.6-x86/kernel/fork.c
===
--- linux-2.6-x86.orig/kernel/fork.c2008-02-23 15:03:35.0 -0800
+++ linux-2.6-x86/kernel/fork.c 2008-02-23 15:08:53.0 -0800
@@ -87,6 +87,7 @@
 #ifndef __HAVE_ARCH_TASK_STRUCT_ALLOCATOR
 # define alloc_task_struct()   kmem_cache_alloc(task_struct_cachep, GFP_KERNEL)
 # define free_task_struct(tsk) kmem_cache_free(task_struct_cachep, (tsk))
+# define memcpy_task_struct(dst, src) do { *dst = *src; } while (0)
 static struct kmem_cache *task_struct_cachep;
 #endif
 
@@ -142,6 +143,8 @@
task_struct_cachep =
kmem_cache_create(task_struct, sizeof(struct task_struct),
ARCH_MIN_TASKALIGN, SLAB_PANIC, NULL);
+#else
+   task_struct_slab_init();
 #endif
 
/*
@@ -181,7 +184,8 @@
return NULL;
}
 
-   *tsk = *orig;
+   memcpy_task_struct(tsk, orig);
+
tsk-stack = ti;
 
err = prop_local_init_single(tsk-dirties);
Index: linux-2.6-x86/arch/x86/kernel/i387.c
===
--- linux-2.6-x86.orig/arch/x86/kernel/i387.c   2008-02-23 15:03:35.0 
-0800
+++ linux-2.6-x86/arch/x86/kernel/i387.c2008-02-23 15:08:53.0 
-0800
@@ -9,6 +9,7 @@
 #include linux/sched.h
 #include linux/module.h
 #include linux/regset.h
+#include linux/bootmem.h
 #include asm/processor.h
 #include asm/i387.h
 #include asm/math_emu.h
@@ -40,16 +41,17 @@
 #endif
 
 static unsigned int mxcsr_feature_mask __read_mostly = 0xu;
+unsigned int math_cntxt_size;
+static struct i387_fxsave_struct fx_scratch __cpuinitdata;
 
-void mxcsr_feature_mask_init(void)
+void __cpuinit mxcsr_feature_mask_init(void)
 {
unsigned long mask = 0;
clts();
if (cpu_has_fxsr) {
-   memset(current-thread.i387.fxsave, 0,
-  sizeof(struct i387_fxsave_struct));
-   asm volatile(fxsave %0 : : m (current-thread.i387.fxsave));
-   mask = current-thread.i387.fxsave.mxcsr_mask;
+   memset(fx_scratch, 0, sizeof(struct i387_fxsave_struct));
+   asm volatile(fxsave %0 : : m (fx_scratch));
+   mask = fx_scratch.mxcsr_mask;
if (mask == 0)
mask = 0xffbf;
}
@@ -57,6 +59,17 @@
stts();
 }
 
+void __init init_thread_context(void)
+{
+   if (cpu_has_fxsr)
+   math_cntxt_size = sizeof(struct i387_fxsave_struct);
+#ifdef CONFIG_X86_32
+   else
+   math_cntxt_size = sizeof(struct i387_fsave_struct);
+#endif
+   init_task.thread.cntxt = alloc_bootmem(math_cntxt_size);
+}
+
 #ifdef CONFIG_X86_64
 /*
  * Called at bootup to set up the initial FPU state that is later cloned
@@ -65,10 +78,7 @@
 void __cpuinit fpu_init(void)
 {
unsigned long oldcr0 = read_cr0();
-   extern void __bad_fxsave_alignment(void);
 
-   if (offsetof(struct 

Re: [patch 1/2] x86,fpu: split FPU state from task struct

2008-02-23 Thread Christoph Hellwig
On Sat, Feb 23, 2008 at 06:34:38PM -0800, Suresh Siddha wrote:
 Split the FPU save area from the task struct. This allows easy migration
 of FPU context, and it's generally cleaner. It also allows the following
 two optimizations:
 
 1) only allocate when the application actually uses FPU, so in the first
 lazy FPU trap. This could save memory for non-fpu using apps. Next patch
 does this lazy allocation.
 
 2) allocate the right size for the actual cpu rather than 512 bytes always.
 Patches enabling xsave/xrstor support (coming shortly) will take advantage
 of this.

This sounds like a wonderful idea.  But I'm a little unhappy with
some of the rather cosmetic things in this patch:

   if (next_p-fpu_counter5)
 - prefetch(next-i387.fxsave);
 + prefetch(FXSAVE(next_p));

These macros are rather ugly.  If you really want them please

a) make them inlines and lowercase with a descriptive name
b) introduce them in a separate patch before the first real
   path in the series.

 +++ linux-2.6-x86/kernel/fork.c   2008-02-23 15:08:53.0 -0800
 @@ -87,6 +87,7 @@
  #ifndef __HAVE_ARCH_TASK_STRUCT_ALLOCATOR
  # define alloc_task_struct() kmem_cache_alloc(task_struct_cachep, GFP_KERNEL)
  # define free_task_struct(tsk)   kmem_cache_free(task_struct_cachep, 
 (tsk))
 +# define memcpy_task_struct(dst, src) do { *dst = *src; } while (0)
  static struct kmem_cache *task_struct_cachep;
  #endif
  
 @@ -142,6 +143,8 @@
   task_struct_cachep =
   kmem_cache_create(task_struct, sizeof(struct task_struct),
   ARCH_MIN_TASKALIGN, SLAB_PANIC, NULL);
 +#else
 + task_struct_slab_init();
  #endif
  
   /*
 @@ -181,7 +184,8 @@
   return NULL;
   }
  
 - *tsk = *orig;
 + memcpy_task_struct(tsk, orig);

I think this is a bad name for this helper, arch_dup_task_struct
would be more descriptive.

But we actually have an arch hook for this kind of thing called
setup_thread_stack which is used by ia64 and m68k just a few lines
later, so it'd be better to look into having a single hook.
(And possibly rename it to arch_dup_task_struct because the name
is a lot more descriptive)
setup_thread_stack

 + memset(FSAVE(tsk), 0, math_cntxt_size);
 + FSAVE(tsk)-cwd = 0x037fu;
 + FSAVE(tsk)-swd = 0xu;
 + FSAVE(tsk)-twd = 0xu;
 + FSAVE(tsk)-fos = 0xu;

Also if you reference the save area so often it'd be better to just
have a local variable for it.  Much better readable.

 +struct task_struct * alloc_task_struct(void)

this should be struct task_struct *alloc_task_struct(void)

 +void free_task_struct(struct task_struct *tsk)
 +{
 + kmem_cache_free(task_cntxt_cachep, tsk-thread.cntxt);
 + tsk-thread.cntxt=NULL;

missing spaces around the '='

 -#define I387 (current-thread.i387)
 -#define FPU_info (I387.soft.info)
 +#define I387 (current-thread.cntxt)
 +#define FPU_info (I387-soft.info)
 +#define SOFT(t)  ((t-thread.cntxt-soft))

This is quite butt ugly.  But then again it's fpemu, so there's
probably no point touching it until a bored janitor comes around.

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch 1/2] x86,fpu: split FPU state from task struct

2008-02-23 Thread Ingo Molnar

* Suresh Siddha [EMAIL PROTECTED] wrote:

 Split the FPU save area from the task struct. This allows easy 
 migration of FPU context, and it's generally cleaner. It also allows 
 the following two optimizations:
 
 1) only allocate when the application actually uses FPU, so in the 
 first lazy FPU trap. This could save memory for non-fpu using apps. 
 Next patch does this lazy allocation.
 
 2) allocate the right size for the actual cpu rather than 512 bytes 
 always. Patches enabling xsave/xrstor support (coming shortly) will 
 take advantage of this.

i like the concept. Please clean up the issues found by Christoph and 
please also base it against x86.git#testing [this is clear 2.6.26 
material and there are already some changes in this area]:

   http://people.redhat.com/mingo/x86.git/README

Ingo
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch 1/2] x86,fpu: split FPU state from task struct

2008-02-23 Thread Roger While


On Sat, Feb 23, 2008 at 06:34:38PM -0800, Suresh Siddha wrote:
 Split the FPU save area from the task struct. This allows easy migration
 of FPU context, and it's generally cleaner. It also allows the following
 two optimizations:

 1) only allocate when the application actually uses FPU, so in the first
 lazy FPU trap. This could save memory for non-fpu using apps. Next patch
 does this lazy allocation.

 2) allocate the right size for the actual cpu rather than 512 bytes always.
 Patches enabling xsave/xrstor support (coming shortly) will take advantage
 of this.

if (next_p-fpu_counter5)
 -  prefetch(next-i387.fxsave);
 +  prefetch(FXSAVE(next_p));

Shouldn't that be  prefetch(FXSAVE(next));  ?

Roger


--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/