On Tuesday 22 November 2016 14:00, Steve D'Aprano wrote:

> Running a whole lot of loops can, sometimes, mitigate some of that
> variation, but not always. Even when running in a loop, you can easily get
> variation of 10% or more just at random.

I think that needs to be emphasised: there's a lot of random noise in these measurements.

For big, heavyweight functions that do a lot of work, the noise is generally a tiny proportion, and you can safely ignore it. (At least for CPU-bound tasks; for I/O-bound tasks, the noise in the I/O itself is potentially very high.)

For really tiny operations, the noise *may* be small, depending on the operation. But small is not insignificant. Consider a simple operation like addition:

    # Python 3.5
    import statistics
    from timeit import Timer

    t = Timer("x + 1", setup="x = 0")
    # ten trials, of one million loops each (timeit's default loop count)
    results = t.repeat(repeat=10)
    best = min(results)
    average = statistics.mean(results)
    # strictly, the relative standard deviation: stdev divided by the mean
    std_error = statistics.stdev(results)/average
    print("Best:", best)
    print("Average:", average)
    print("Std error:", std_error)

which prints:

    Best: 0.09761243686079979
    Average: 0.0988507878035307
    Std error: 0.02260956789268462

So this suggests that on my machine, doing no expensive virus scans or streaming video, the random noise in something as simple as integer addition is around two percent.

So that's your baseline: even simple operations repeated a million times will show random noise of a few percent. Consequently, if you're doing one trial (one loop of, say, a million operations):

    import time

    x = 0
    start = time.time()
    for i in range(1000000):
        x + 1
    elapsed = time.time() - start

and compare the time taken with another trial, and the difference is of the order of a few percentage points, then you have *no* reason to believe the result is real. You ought to repeat your test multiple times -- the more the better.

timeit makes it easy to repeat your tests. It automatically picks the best timer for your platform and avoids serious gotchas from using the wrong timer. When called from the command line, it will automatically select a suitable number of loops to ensure reliable timing, without wasting time doing more loops than needed.

timeit isn't magic.
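
As a rough sketch of what the by-hand version might look like (this is not timeit's actual implementation; time_by_hand, summarise and add_one are names made up for illustration, and the loop and trial counts are arbitrary):

```python
import statistics
import time


def time_by_hand(func, loops=100000, trials=5):
    """Run `trials` trials of `loops` calls to func(); return the timings."""
    results = []
    for _ in range(trials):
        start = time.perf_counter()  # high-resolution clock meant for timing
        for _ in range(loops):
            func()
        results.append(time.perf_counter() - start)
    return results


def summarise(results):
    """Return (best, average, relative spread) for a list of timings."""
    best = min(results)
    average = statistics.mean(results)
    # spread is the standard deviation as a fraction of the mean
    spread = statistics.stdev(results) / average
    return best, average, spread


def add_one(x=0):
    # the operation under test; the function call overhead is timed too
    return x + 1


results = time_by_hand(add_one)
best, average, spread = summarise(results)
print("Best:", best)
print("Average:", average)
print("Relative spread:", spread)
```

The numbers will differ from run to run and from machine to machine, which is the whole point: two measurements that differ by less than the relative spread can't be told apart. timeit wraps up the same idea, with less per-call overhead and a few extra precautions.
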
It's not doing anything that you or I couldn't do by hand, if we knew we should be doing it, and if we could be bothered to run multiple trials and gather statistics and keep a close eye on the deviation between measurements. But who wants to do that by hand?


-- 
Steven
299792.458 km/s - not just a good idea, it's the law!