On Thu, Oct 31, 2013 at 11:22:00AM +0100, Ingo Molnar wrote:
> 
> * Neil Horman <nhor...@tuxdriver.com> wrote:
> 
> > > etc. For such short runtimes make sure the last column displays 
> > > close to 100%, so that the PMU results become trustable.
> > > 
> > > A nehalem+ PMU will allow 2-4 events to be measured in parallel, 
> > > plus generics like 'cycles', 'instructions' can be added 'for free' 
> > > because they get counted in a separate (fixed purpose) PMU register.
> > > 
> > > The last column tells you what percentage of the runtime that 
> > > particular event was actually active. 100% (or empty last column) 
> > > means it was active all the time.
> > > 
> > > Thanks,
> > > 
> > >   Ingo
> > > 
> > 
> > Hmm, 
> > 
> > I ran this test:
> > 
> > for i in `seq 0 1 3`
> > do
> > echo $i > /sys/module/csum_test/parameters/module_test_mode
> > taskset -c 0 perf stat --repeat 20 -C 0 -e L1-dcache-load-misses \
> >     -e L1-dcache-prefetches -e cycles -e instructions -ddd ./test.sh
> > done
> 
> You need to remove '-ddd' which is a shortcut for a ton of useful 
> events, but here you want to use fewer events, to increase the 
> precision of the measurement.
> 
> Thanks,
> 
>       Ingo
> 

Thank you, Ingo, that fixed it.  I'm trying some other variants of the csum
algorithm that Doug and I discussed last night, but FWIW, the relative
performance of the 4 test cases (base/prefetch/parallel/both) remains unchanged.
At this point I'm starting to feel that there's very little point in doing
parallel ALU operations (unless we can find a way to break the dependency on the
carry flag, which is what I'm tinkering with now).
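
To make the carry-flag dependency concrete, here is a minimal userspace sketch
in C.  It is illustrative only: it is not the csum_test module code and not the
kernel's checksum routine, and the names fold64, csum_carry_chain and
csum_deferred_carry are made up for this example.  The first routine emulates
an add-with-carry chain, where every add has to wait for the carry of the
previous one.  The second adds 32-bit words into two independent 64-bit
accumulators, so the carries simply pile up in the upper halves and are folded
back in once at the end.  Both assume the length is a multiple of 4 bytes and
ignore any tail; whether the second form is actually faster on a given CPU is
exactly the kind of thing that needs measuring.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Fold a 64-bit one's-complement accumulator down to 16 bits. */
static uint16_t fold64(uint64_t sum)
{
    sum = (sum & 0xffffffffu) + (sum >> 32);  /* at most 33 bits */
    sum = (sum & 0xffffffffu) + (sum >> 32);  /* at most 32 bits */
    sum = (sum & 0xffffu) + (sum >> 16);      /* at most 17 bits */
    sum = (sum & 0xffffu) + (sum >> 16);      /* at most 16 bits */
    return (uint16_t)sum;
}

/*
 * Serial variant: emulates an add-with-carry chain.  Each iteration
 * needs the carry produced by the previous add, so the adds cannot
 * overlap.  Trailing bytes (len % 4) are ignored in this sketch.
 */
static uint16_t csum_carry_chain(const void *buf, size_t len)
{
    const unsigned char *p = buf;
    uint32_t sum = 0;

    for (size_t i = 0; i + 4 <= len; i += 4) {
        uint32_t w;
        memcpy(&w, p + i, sizeof(w));  /* alignment-safe load */
        sum += w;
        sum += (sum < w);              /* end-around carry */
    }
    return fold64(sum);
}

/*
 * Deferred-carry variant: 32-bit words are added into two independent
 * 64-bit accumulators.  Carries collect in the upper 32 bits, so no
 * add depends on a carry flag; everything is folded once at the end.
 * Assumes the buffer is far smaller than 2^32 words, so the
 * accumulators cannot overflow.
 */
static uint16_t csum_deferred_carry(const void *buf, size_t len)
{
    const unsigned char *p = buf;
    uint64_t a = 0, b = 0;
    size_t i = 0;

    for (; i + 8 <= len; i += 8) {
        uint32_t w0, w1;
        memcpy(&w0, p + i, sizeof(w0));
        memcpy(&w1, p + i + 4, sizeof(w1));
        a += w0;                       /* independent of b */
        b += w1;                       /* independent of a */
    }
    if (i + 4 <= len) {                /* one leftover 32-bit word */
        uint32_t w;
        memcpy(&w, p + i, sizeof(w));
        a += w;
    }
    return fold64(a + b);
}
```

Both routines return the folded, uninverted one's-complement sum in native
byte order, so their results can be compared directly against each other.
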
Neil
