On Tue, 2013-10-15 at 09:21 -0700, Joe Perches wrote:
> Ingo, Eric _showed_ that the prefetch is good here.
> How about looking at a little optimization to the minimal
> prefetch that gives that level of performance.

Wait a minute, my point was to remind everyone that the main cost is the memory fetching.

It's nice to optimize cpu cycles if we are short of them, but in the csum_partial() case, the bottleneck is the memory.

Also, I was wondering about the implications of changing the order of the reads, as it might fool cpu predictions.

I do not particularly care about finding the right prefetch stride; I think the Intel guys know better than me.