On Tue, Nov 20, 2012 at 01:31:56PM +0100, Ingo Molnar wrote:
> 
> * Ingo Molnar <mi...@kernel.org> wrote:
> 
> > * Ingo Molnar <mi...@kernel.org> wrote:
> > 
> > > numa/core profile:
> > > 
> > >     95.66%  perf-1201.map     [.] 0x00007fe4ad1c8fc7                 
> > >      1.70%  libjvm.so         [.] 0x0000000000381581                 
> > >      0.59%  [vdso]            [.] 0x0000000000000607                 
> > >      0.19%  [kernel]          [k] do_raw_spin_lock                   
> > >      0.11%  [kernel]          [k] generic_smp_call_function_interrupt
> > >      0.11%  [kernel]          [k] timekeeping_get_ns.constprop.7     
> > >      0.08%  [kernel]          [k] ktime_get                          
> > >      0.06%  [kernel]          [k] get_cycles                         
> > >      0.05%  [kernel]          [k] __native_flush_tlb                 
> > >      0.05%  [kernel]          [k] rep_nop                            
> > >      0.04%  perf              [.] add_hist_entry.isra.9              
> > >      0.04%  [kernel]          [k] rcu_check_callbacks                
> > >      0.04%  [kernel]          [k] ktime_get_update_offsets           
> > >      0.04%  libc-2.15.so      [.] __strcmp_sse2                      
> > > 
> > > No page fault overhead (see the page fault rate further below) 
> > > - the NUMA scanning overhead shows up only through some mild 
> > > TLB flush activity (which I'll fix btw).
> > 
> > The patch attached below should get rid of that mild TLB 
> > flushing activity as well.
> 
> This has further increased SPECjbb from 203k/sec to 207k/sec, 
> i.e. it's now 5% faster than mainline - THP enabled.
> 
> The profile is now totally flat even during a full 32-WH SPECjbb 
> run, with the highest overhead entries left all related to timer 
> IRQ processing or profiling. That is on a system that should be 
> very close to yours.
> 

This is a stab in the dark but are you always running with profiling enabled?

I have not checked this with perf, but a number of years ago I found that
oprofile could distort results really badly (7-30% depending on the workload
at the time) when I was evaluating hugetlbfs and THP. In some cases,
profiling would show that a patch series improved performance when the same
series showed a regression with profiling disabled. The sampling rate had
to be reduced quite a bit to avoid this effect.
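
One way to check for this is to run baseline and patched kernels both with
and without profiling and see whether the sign of the change flips. What
follows is only a minimal sketch of that comparison: the function names are
hypothetical, and the throughput values are whatever the benchmark reports
(nothing here is measured data from this thread).

```python
def relative_change(baseline, patched):
    """Relative throughput change of patched vs. baseline, as a fraction."""
    if baseline <= 0:
        raise ValueError("baseline throughput must be positive")
    return (patched - baseline) / baseline


def profiling_flips_verdict(base_prof, patched_prof, base_noprof, patched_noprof):
    """Return True if the patch looks like a gain in one mode but not the other.

    The *_prof arguments are throughputs measured with the profiler running,
    the *_noprof arguments are the same runs with profiling disabled.
    """
    with_prof = relative_change(base_prof, patched_prof)
    without_prof = relative_change(base_noprof, patched_noprof)
    return (with_prof > 0) != (without_prof > 0)
```

If this returns True for a workload, the profiled numbers should not be
trusted for that comparison until the sampling rate has been lowered and
the runs repeated.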

-- 
Mel Gorman
SUSE Labs