On Thu, 06 Sep 2012 18:30:48 -0700, jimbo1qaz wrote:

> OK, I decided to change my code. Which raises a similar question: Which
> one is better for setting a bit of a byte: |= or +=, assuming each will
> only be run once? Intuitively, I think |=
Python (like most languages) doesn't have a "set this bit" operator, so the closest match is a bitwise-or. So to set a bit of a byte, the operation which most closely matches the programmer's intention is to use the bitwise operator. Even better would be to write a function called "setBit", and use that.

> but some timeits are inconclusive,

Timing results are usually inconclusive because the differences between the results are much smaller than the average random noise on any particular result.

All modern computers, say for the last 20 or 30 years, have used multitasking operating systems. This means that at any time you could have dozens, even hundreds, of programs running at once, with the operating system switching between them faster than you can blink.

In addition, the time taken by an operation can depend on dozens of external factors, such as whether the data is already in a CPU cache, whether the CPU's branch prediction has pre-fetched the instructions needed, pipelines, memory usage, latency when reading from disks, and many others.

Consequently, timing results are very *noisy* -- the *exact* same operation can take different amounts of time from one run to the next. Sometimes *large* differences.

So any time you time a piece of code, what you are *actually* getting is not the amount of time that code takes to execute, but something slightly more. (And, occasionally, something a lot more.) Note that it is always slightly more -- by definition, it will never be less.

So if you want a better estimate of the actual time taken to execute the code, you should repeat the measurement as many times as you can bear, and pick the smallest value. *Not* the average, since the errors are always positive. An average just gives you the "true" time plus some unknown average error, which may not be small. The minimum gives you the "true" time plus some unknown but hopefully small error.
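To make the |= vs += point concrete, here is a minimal sketch. The post suggests a "setBit" function; the spelling set_bit and the sample values below are just illustrative:

```python
def set_bit(value, bit):
    """Return `value` with bit number `bit` set to 1."""
    return value | (1 << bit)

# |= is idempotent: setting an already-set bit changes nothing.
x = 0b1010
x |= 0b0010
assert x == 0b1010

# += is NOT a "set bit" operation: if the bit is already set,
# the addition carries into the next bit instead.
y = 0b1010
y += 0b0010
assert y == 0b1100   # wrong answer for "set bit 1"

assert set_bit(0b1000, 1) == 0b1010
```

This is exactly why |= matches the programmer's intention: it gives the right answer whether or not the bit was already set, while += only happens to work when the bit starts out clear (and the original poster's "assuming each will only be run once" is precisely that fragile assumption).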
The smaller the amount of time you measure, the more likely it is to be disrupted by some external factor. So timeit takes a code snippet and runs it many times (by default, one million times), and returns the total time used. Even if one or two of those runs were blown out significantly, the total probably won't be. (Unless of course your anti-virus decided to start running, and *everything* slows down for 10 minutes, or something like that.)

But even that total time returned by timeit is almost certainly wrong. So you should call the repeat method, with as many iterations as you can bear to wait for, and take the minimum, which will still be wrong, but it will be less wrong.

And remember, the result you get is only valid for *your* computer, running the specific version of Python you have, under the specific operating system. On another computer with a different CPU or a different OS, the results may be *completely* different.

Are you still sure you care about shaving off every last nanosecond?

> mainly because I don't know how it works.

The internal details of how timeit works are complicated, but it is worth reading the comments and documentation, both in the Fine Manual and in the source code:

http://docs.python.org/library/timeit.html
http://hg.python.org/cpython/file/2.7/Lib/timeit.py

-- 
Steven

-- 
http://mail.python.org/mailman/listinfo/python-list
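Putting the advice above together, a typical measurement looks something like this sketch (the statement being timed is just an example, and I use fewer iterations than timeit's one-million default so it finishes quickly):

```python
import timeit

# Run the statement 100,000 times per trial, and repeat the
# whole trial 5 times.
trials = timeit.repeat("x |= 1 << 3", setup="x = 0",
                       repeat=5, number=100000)

# Take the *minimum*, not the average: timing noise only ever
# adds to the true time, so the smallest result is least wrong.
best = min(trials)
print("best of 5 trials: %.6f seconds" % best)
```

(timeit.repeat is the module-level convenience wrapper around Timer.repeat; each element of trials is the *total* time for one trial of 100,000 runs, not a per-run time.)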