* Neil Horman <nhor...@tuxdriver.com> wrote:

> Base:
>        0.093269042 seconds time elapsed    ( +-  2.24% )
> Prefetch (5x64):
>        0.079440009 seconds time elapsed    ( +-  2.29% )
> Parallel ALU:
>        0.087666677 seconds time elapsed    ( +-  4.01% )
> Prefetch + Parallel ALU:
>        0.080758702 seconds time elapsed    ( +-  2.34% )
>
> So we can see here that we get about a 1% speedup between the base
> and the both (Prefetch + Parallel ALU) case, with prefetch
> accounting for most of that speedup.
Hm, there's still something strange about these results. So the
range of the results is 790-930 nsecs. The noise of the
measurements is 2%-4%, i.e. 20-40 nsecs.

The prefetch-only result itself is the fastest of all -
statistically equivalent to the prefetch+parallel-ALU result,
within the noise range.

So if prefetch is enabled, turning on parallel-ALU has no
measurable effect - which is counter-intuitive. Do you have a
theory/explanation for that?

Thanks,

	Ingo
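To make the "within the noise range" comparison concrete, here is a rough sketch in Python. It takes the four elapsed times and +- percentages exactly as quoted above, turns each into a simple mean +/- noise interval, and checks which intervals overlap. Treating the +- figure as a plain symmetric interval is an assumption made for illustration, as are the helper names `interval` and `overlaps`; this is a crude check, not a proper significance test.

```python
# Elapsed times (seconds) and relative noise (percent), as quoted above.
results = [
    ("Base", 0.093269042, 2.24),
    ("Prefetch (5x64)", 0.079440009, 2.29),
    ("Parallel ALU", 0.087666677, 4.01),
    ("Prefetch + Parallel ALU", 0.080758702, 2.34),
]


def interval(mean, pct):
    """Return (low, high) for mean +/- pct percent.

    Assumption: the +- figure is treated as a simple symmetric band.
    """
    half_width = mean * pct / 100.0
    return (mean - half_width, mean + half_width)


def overlaps(a, b):
    """True if the closed intervals a and b share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


intervals = {label: interval(mean, pct) for label, mean, pct in results}

# Print each band, then every pairwise comparison.
for label, (low, high) in intervals.items():
    print(f"{label:26s} {low:.6f} .. {high:.6f} s")

labels = list(intervals)
for i, first in enumerate(labels):
    for second in labels[i + 1:]:
        verdict = "overlap" if overlaps(intervals[first], intervals[second]) else "distinct"
        print(f"{first} vs {second}: {verdict}")
```

Under that assumption the prefetch-only and prefetch+parallel-ALU bands overlap, while the base band sits clear of the prefetch-only band, which matches the observation above.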