> > i wonder if that is uniformly faster.  consider that
> > making reads of that page coherent enough on a
> > big multiprocessor and making sure there's not too
> > much interprocessor skew might be slower than a
> > system call.
>
> Real world tests show that it is consistently faster. It's probably
> cached anyway.

as andrey points out, that's exactly the problem if you want
an accurate time: a cached copy may be stale.

also, while making gettimeofday() faster, you potentially
invalidate BY2PG/BY2CL cachelines on *every* processor in the
system each time the page is updated.  this has the real
potential for being a problem on, say, a 256 processor system
and making everything else on the system slower.

linux optimization is a rat race.  you are only judged on the
immediate effect on your subsystem, not the system as a whole.
so unless you play the game, your system will appear to regress
over time as other optimizers take resources away from you.

there aren't many many-processor arm systems yet, but on such
a system, the os will need to do the equivalent of cachedinvse()
and l2cacheuwbinvse() often enough to make the timing look
reasonable.

- erik
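
for concreteness, here is a minimal sketch of the kind of shared time page being discussed, assuming a seqlock-style protocol. the names Timepage, timeupdate and timeread are made up for illustration; this is not how linux or any other kernel actually lays the page out. the point is that every call to timeupdate() dirties the cache line holding the page, so the next timeread() on each other processor has to fetch the line again.

```c
#include <stdatomic.h>
#include <stdint.h>

/* hypothetical shared time page; layout is an assumption */
typedef struct Timepage Timepage;
struct Timepage {
	atomic_uint	seq;	/* odd while the writer is mid-update */
	_Atomic int64_t	sec;
	_Atomic int64_t	nsec;
};

/*
 * writer: run by one processor on each clock update.
 * every store here invalidates the line in all other caches.
 */
void
timeupdate(Timepage *p, int64_t sec, int64_t nsec)
{
	atomic_fetch_add(&p->seq, 1);	/* seq becomes odd: update in progress */
	atomic_store(&p->sec, sec);
	atomic_store(&p->nsec, nsec);
	atomic_fetch_add(&p->seq, 1);	/* seq becomes even: update complete */
}

/*
 * reader: no system call, but it retries if it raced with the
 * writer, and it takes a cache miss after every update.
 */
void
timeread(Timepage *p, int64_t *sec, int64_t *nsec)
{
	unsigned s0, s1;

	do{
		s0 = atomic_load(&p->seq);
		*sec = atomic_load(&p->sec);
		*nsec = atomic_load(&p->nsec);
		s1 = atomic_load(&p->seq);
	}while((s0 & 1) || s0 != s1);
}
```

the sketch uses the default sequentially consistent atomics to keep the ordering argument simple. a real implementation would likely use weaker orderings, and would still pay the coherence traffic described above.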