On Thu, Feb 27, 2014 at 11:22 PM, Stefani Seibold <stef...@seibold.net> wrote:
> On Wednesday, 26.02.2014 at 16:55 -0800, Andy Lutomirski wrote:
>>
>> Once I patch it to work, your 32-bit code is considerably faster than
>> the 64-bit case.  It's enough faster that I suspect a bug.  Dumping
>> the in-memory vdso shows some rather suspicious nops before the rdtsc
>> instruction.  I suspect that you've forgotten to run the 32-bit vdso
>> through the alternatives code.  This is a nasty bug: it will appear to
>> work, but you'll see non-monotonic times on some SMP systems.
>>
>
> I didn't know this.  My basic test case is a KVM guest, which defaults
> to 1 CPU.  Thanks for discovering the issue.

This leads to a potentially interesting question: is rdtsc_barrier()
actually necessary on UP?

IIRC the point is that, if an "rdtsc_barrier(); rdtsc" sequence in one
thread is "before" (in the sense of being synchronized by some memory
operation) an "rdtsc_barrier(); rdtsc" sequence in another thread, then
the first rdtsc needs to return a time earlier than or equal to the
second one.

I assume that no UP CPU is silly enough to execute two rdtsc
instructions out of order relative to each other in the absence of
barriers.  So this is a nonissue on UP.

On the other hand, suppose that some code does:

    volatile long x = *(something that's not in cache);
    clock_gettime

I can imagine a modern CPU speculating far enough ahead that the rdtsc
happens *before* the cache miss completes.  This won't cause visible
non-monotonicity as far as I can see, but it might annoy people who
try to benchmark their code.

Note: actually making this change (dropping the barrier on UP) might be
a bit tricky.  I don't know if the alternatives code is smart enough to
handle it.

--Andy
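
To illustrate the benchmarking concern above, here is a minimal
userspace sketch.  It is not the vdso code; the names timed_load(),
now_ns() and load_fence() are made up for this example.  It assumes
that on x86 an lfence between the load and the clock read is enough to
keep the timestamp from being taken before the load completes (Intel
documents lfence that way; other CPUs may differ), and it falls back to
a compiler-provided full fence elsewhere.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical helper: pick a fence that orders earlier loads. */
#if defined(__x86_64__) || defined(__SSE2__)
#include <emmintrin.h>                 /* _mm_lfence() */
#define load_fence() _mm_lfence()
#else
/* Assumption: a full fence is an acceptable stand-in off x86. */
#define load_fence() __atomic_thread_fence(__ATOMIC_SEQ_CST)
#endif

/* Read CLOCK_MONOTONIC as a nanosecond count. */
static uint64_t now_ns(void)
{
    struct timespec ts;
    int rc = clock_gettime(CLOCK_MONOTONIC, &ts);

    assert(rc == 0);
    (void)rc;                          /* silence unused warning under NDEBUG */
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/*
 * Time a single (possibly cache-missing) load.  The fence sits between
 * the load and the second clock read, so the second timestamp should
 * not be sampled before the load has completed.
 */
static long timed_load(const volatile long *p, uint64_t *elapsed_ns)
{
    uint64_t t0, t1;
    long x;

    assert(p != NULL && elapsed_ns != NULL);

    t0 = now_ns();
    x = *p;                            /* the load being measured */
    load_fence();                      /* wait for the load before timing */
    t1 = now_ns();

    *elapsed_ns = t1 - t0;
    return x;
}
```

A caller would pass a pointer to the data of interest and read back the
elapsed time, e.g. "long v = timed_load(&data, &ns);".  If
clock_gettime already issues its own barrier before rdtsc, the explicit
fence is redundant; it only matters in the case where that barrier is
absent.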