On Thu, 28 Jul 2005, Steven Rostedt wrote:
>
> OK, I guess when I get some time, I'll start testing all the i386 bitop
> functions, comparing the asm with the gcc versions. Now could someone
> explain to me what's wrong with testing hot cache code. Can one
> instruction retrieve from memory better than others?

There's a few issues:

 - trivially: code/data size. Being smaller automatically means faster if
   you're cold-cache. If you do cycle tweaking of something that is
   possibly commonly in the L2 cache or further away, you might as well
   consider one byte of code-space to be equivalent to one cycle (an L1 I$
   miss can easily take 50+ cycles - the L1 fill cost may be just a small
   part of that, but the pipeline problem it causes can be deadly).

 - branch prediction: cold-cache is _different_ from hot-cache. Hot-cache
   predicts the stuff dynamically, cold-cache has different rules (and it
   is _usually_ "forward predicts not-taken, backwards predicts taken",
   although you can add static hints if you want to on most
   architectures).

   So hot-cache may look very different indeed - the "normal" case might
   be that you mispredict all the time because the static prediction is
   wrong, but then a hot-cache benchmark will predict perfectly.

 - access patterns. This only matters if you look at algorithmic changes.
   Hashes have atrocious locality, but on the other hand, if you know that
   the access pattern is cold, a hash will often have a minimum number of
   accesses.

But no, you don't have "some instructions are better at reading from
memory" for regular integer code (FP often has other issues, like reading
directly from L2 without polluting L1, and then there are obviously
prefetch hints).

Now, in the case of your "rep scas" conversion, the reason I applied it
was that it was obviously a clear win (rep scas is known bad, and has
register allocation issues too), so I'm _not_ claiming that the above
issues were true in that case. I just wanted to say that in general it's
nice (but often quite hard) if you can give cold-cache numbers too (for
example, using the cycle counter and being clever can actually give that).
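
As a rough illustration of what "using the cycle counter and being clever" could look like in user space, here is a minimal sketch. It is not code from this thread: the names read_cycles(), evict(), find_first_zero() and time_cold_and_hot() are hypothetical, the 64-byte cache line is an assumption, and it assumes GCC or clang. On x86-64 it flushes the bitmap with clflush and reads the TSC; elsewhere it falls back to clock_gettime() and cannot flush, so the "cold" number is only meaningful on x86-64. It also only makes the _data_ cold. The code itself and the branch predictor state stay warm, which are exactly the effects listed above, so treat the result as a lower bound on the cold-cache cost.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

#if defined(__x86_64__)
#include <x86intrin.h>
#define HAVE_X86_TSC 1
#else
#define HAVE_X86_TSC 0
#endif

#define CACHE_LINE 64                       /* assumed cache line size */
#define BITS_PER_LONG (sizeof(unsigned long) * 8)

/* Timestamp: TSC ticks on x86-64, nanoseconds elsewhere. */
static inline uint64_t read_cycles(void)
{
	uint64_t t;
	__asm__ __volatile__("" ::: "memory");  /* compiler barrier */
#if HAVE_X86_TSC
	_mm_lfence();                           /* wait for earlier loads */
	t = __rdtsc();
	_mm_lfence();
#else
	struct timespec ts;
	clock_gettime(CLOCK_MONOTONIC, &ts);
	t = (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
#endif
	__asm__ __volatile__("" ::: "memory");
	return t;
}

/* Push [p, p+len) out of the data caches (x86-64 only, no-op elsewhere). */
static void evict(const void *p, size_t len)
{
#if HAVE_X86_TSC
	const char *c = p;
	for (size_t i = 0; i < len; i += CACHE_LINE)
		_mm_clflush(c + i);
	_mm_clflush(c + len - 1);               /* cover an unaligned tail */
	_mm_mfence();
#else
	(void)p;
	(void)len;
#endif
}

/* Plain C stand-in for the bitop under test; returns nbits if none found. */
static size_t find_first_zero(const unsigned long *map, size_t nbits)
{
	for (size_t i = 0; i < nbits; i += BITS_PER_LONG) {
		unsigned long inv = ~map[i / BITS_PER_LONG];
		if (inv) {
			size_t bit = i + (size_t)__builtin_ctzl(inv);
			return bit < nbits ? bit : nbits;
		}
	}
	return nbits;
}

struct timing {
	uint64_t cold;  /* first call, right after the flush */
	uint64_t hot;   /* second call, data now cached */
};

static struct timing time_cold_and_hot(const unsigned long *map,
				       size_t nbits, size_t *result)
{
	struct timing t;
	uint64_t t0;

	assert(nbits > 0);
	evict(map, (nbits + 7) / 8);

	t0 = read_cycles();
	*result = find_first_zero(map, nbits);
	t.cold = read_cycles() - t0;

	t0 = read_cycles();
	*result = find_first_zero(map, nbits);
	t.hot = read_cycles() - t0;
	return t;
}
```

A single sample like this is noisy (interrupts, frequency scaling, TSC overhead), so in practice you would repeat the flush-and-measure step many times and look at the minimum or median of the cold samples rather than at one run.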

		Linus