[patch 3/3] powerpc: Preload application text segment instead of TASK_UNMAPPED_BASE

2009-07-14 Thread Anton Blanchard
TASK_UNMAPPED_BASE is not used with the new top down mmap layout. We can
reuse this preload slot by loading in the segment at 0x10000000, where almost
all PowerPC binaries are linked.

On a microbenchmark that bounces a token between two 64bit processes over pipes
and calls gettimeofday each iteration (to access the VDSO), both the 32bit and
64bit context switch rates improve (tested on a 4GHz POWER6):

32bit: 273k/sec -> 283k/sec
64bit: 277k/sec -> 284k/sec

Signed-off-by: Anton Blanchard an...@samba.org
---

Index: linux.trees.git/arch/powerpc/mm/slb.c
===
--- linux.trees.git.orig/arch/powerpc/mm/slb.c  2009-07-14 15:09:39.0 +1000
+++ linux.trees.git/arch/powerpc/mm/slb.c   2009-07-14 15:12:42.0 +1000
@@ -191,7 +191,7 @@
 	unsigned long slbie_data = 0;
 	unsigned long pc = KSTK_EIP(tsk);
 	unsigned long stack = KSTK_ESP(tsk);
-	unsigned long unmapped_base;
+	unsigned long exec_base;
 
 	if (!cpu_has_feature(CPU_FTR_NO_SLBIE_B) &&
 	    offset <= SLB_CACHE_ENTRIES) {
@@ -219,14 +219,13 @@
 
 	/*
 	 * preload some userspace segments into the SLB.
+	 * Almost all 32 and 64bit PowerPC executables are linked at
+	 * 0x10000000 so it makes sense to preload this segment.
 	 */
-	if (test_tsk_thread_flag(tsk, TIF_32BIT))
-		unmapped_base = TASK_UNMAPPED_BASE_USER32;
-	else
-		unmapped_base = TASK_UNMAPPED_BASE_USER64;
+	exec_base = 0x10000000;
 
 	if (is_kernel_addr(pc) || is_kernel_addr(stack) ||
-	    is_kernel_addr(unmapped_base))
+	    is_kernel_addr(exec_base))
 		return;
 
 	slb_allocate(pc);
@@ -234,9 +233,9 @@
 	if (!esids_match(pc, stack))
 		slb_allocate(stack);
 
-	if (!esids_match(pc, unmapped_base) &&
-	    !esids_match(stack, unmapped_base))
-		slb_allocate(unmapped_base);
+	if (!esids_match(pc, exec_base) &&
+	    !esids_match(stack, exec_base))
+		slb_allocate(exec_base);
 }
 
 static inline void patch_slb_encoding(unsigned int *insn_addr,

-- 

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [patch 3/3] powerpc: Preload application text segment instead of TASK_UNMAPPED_BASE

2009-07-14 Thread Benjamin Herrenschmidt
On Tue, 2009-07-14 at 16:53 +1000, Anton Blanchard wrote:
> plain text document attachment (preload_0x1000)
> TASK_UNMAPPED_BASE is not used with the new top down mmap layout. We can
> reuse this preload slot by loading in the segment at 0x10000000, where almost
> all PowerPC binaries are linked.
>
> On a microbenchmark that bounces a token between two 64bit processes over pipes
> and calls gettimeofday each iteration (to access the VDSO), both the 32bit and
> 64bit context switch rates improve (tested on a 4GHz POWER6):
>
> 32bit: 273k/sec -> 283k/sec
> 64bit: 277k/sec -> 284k/sec

Any chance you can put that little test program online somewhere ?

Cheers,
Ben.



Re: [patch 3/3] powerpc: Preload application text segment instead of TASK_UNMAPPED_BASE

2009-07-14 Thread Anton Blanchard

Hi Ben,

> Any chance you can put that little test program online somewhere ?

Sure, it's here:

http://ozlabs.org/~anton/junkcode/context_switch.c

Anton