Arjan van de Ven <[EMAIL PROTECTED]> writes:

> On Wed, 24 Oct 2007 21:29:56 -0700
> "David Schwartz" <[EMAIL PROTECTED]> wrote:
>
>>
>> > Well that's exactly right. For threaded programs (and maybe even
>> > real-world non-threaded ones in general), you don't want to be
>> > even _reading_ global variables if you don't need to. Cache misses
>> > and cacheline bouncing could easily cause performance to completely
>> > tank in some cases while only gaining a cycle or two in
>> > microbenchmarks for doing these funny x86 predication things.
>>
>> For some CPUs, replacing an conditional branch with a conditional
>> move is a *huge* win because it cannot be mispredicted.
>
> please name one...
> Hint: It's not one made by either Intel or AMD in the last 4 years...

ARM.  On ARM1136 (used in the Nokia N800) a mispredicted branch takes
5-7 cycles (a correctly predicted branch takes 0-4 cycles), while a
conditional load, store or arithmetic instruction always takes one
cycle.

-- 
Måns Rullgård
[EMAIL PROTECTED]
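
To make the comparison concrete, here is a minimal sketch in C of the two
forms under discussion. The function names max_branch and max_select are
made up for illustration and appear nowhere in the thread. Whether a
compiler turns either one into a conditional branch or into conditionally
executed instructions depends on the compiler, the target and the
optimization level, so the mask-based version is only one way to write a
select that has no branch in the source.

```c
/*
 * Hypothetical example: two ways to pick the larger of two ints.
 * Neither function comes from the thread; both are illustrative only.
 */

/* Branchy form: the source contains an explicit conditional branch. */
static int max_branch(int a, int b)
{
	if (a > b)
		return a;
	return b;
}

/*
 * Branch-free form: build an all-ones or all-zeroes mask from the
 * comparison and use it to select one operand.  (a > b) evaluates to
 * 1 or 0, so -(a > b) is -1 (all bits set) or 0.
 */
static int max_select(int a, int b)
{
	int mask = -(a > b);

	return (a & mask) | (b & ~mask);
}
```

Both functions return the same value for any pair of ints. Any difference
is in the generated code only, which is best checked by reading the
assembly the compiler emits (for example with gcc -S) for the target in
question.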