[Elliot Gorokhovsky]
> Warning: the contents of this message may be dangerous for readers with
> heart conditions.

It appears some people are offended by your exuberance.  Ignore them ;-)

> ...
> And it turns out that if you do that, PyFloat_RichCompare becomes ONE LINE
> (after we set op=Py_LT)!!!

Just Right.  And this is why floats will likely be the best case for your
approach: all the overheads of finding "the right" bottom-level comparison
are weighed against what's basically a single machine instruction to do the
actual work of comparing two floats.
> return (PyFloat_AS_DOUBLE(v) < PyFloat_AS_DOUBLE(w)) == 0 ? Py_False :
> Py_True;

In real life, this is better written as:

    return (PyFloat_AS_DOUBLE(v) < PyFloat_AS_DOUBLE(w)) ? Py_True : Py_False;

That is, the comparison to 0 was an artificial complication left over from
simplifying the code.

However, that code is buggy, in a way that may not show up for a long time.
Keeping track of reference counts is crucial.  It needs to be more like:

    PyObject *res;

    res = (PyFloat_AS_DOUBLE(v) < PyFloat_AS_DOUBLE(w)) ? Py_True : Py_False;
    Py_INCREF(res);
    return res;

> So I thought, wow, this will give some nice numbers! But I underestimated
> the power of this optimization. You have no idea. It's crazy.

It's in the ballpark of what I expected :-)

> Since we're dealing with floats, I used Tim Peter's benchmark:
> Lib/test/sortperf.py, just modifying one function:

It's actually Guido's file, although I probably changed it much more
recently than he did.  As I explained in a different msg, it's _not_ a good
benchmark.  It's good at distinguishing among "really really good", "not a
huge change", and "really really bad", but that's all.  At the start, such
gross distinctions are all anyone should care about.

One thing I'd like to see: do exactly the same, but comment out your float
specialization.  It's important to see whether sortperf.py says "not a huge
change" then (which I expect, but it needs confirmation).  That is, it's
not enough that some cases get 4x faster if other cases get much slower.

> ...
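The reason the Py_INCREF matters: Py_True and Py_False are shared singleton
objects in CPython, so every reference a C function hands out must be
counted.  The singleton behavior itself is visible from pure Python - a
side observation, not the C fix:

```python
# In CPython, True and False are singletons: every comparison that
# "produces" a boolean really hands back a reference to the same shared
# object.  That's why the C code must Py_INCREF the one it returns -
# each caller holds its own counted reference to the shared singleton.
a = (1.0 < 2.0)
b = (3.0 < 4.0)
assert a is b           # the very same object, not merely an equal one
assert a is True
```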
> #My function:
> def doit(L):
>     F = FastList(L)
>     f0 = time.perf_counter()
>     F.fastsort()
>     f1 = time.perf_counter()
>     F = FastList(L)
>     t0 = time.perf_counter()
>     F.sort()
>     t1 = time.perf_counter()
>     print("%6.2f%%" % (100*(1-(f1-f0)/(t1-t0))), end=' ')
>     flush()

Numbers in benchmarks always need to be explained, because they're only
clear to the person who wrote the code.  From your code, you're essentially
computing

    (old_time - new_time) / old_time * 100.0

but in an obfuscated way.  So long as the new way is at least as fast, the
only possible values are between 0% and 100%.  That's important to note.
Other people measure these things by, e.g., putting "new_time" in the
denominator instead, or just compute a ratio, like old_time / new_time.

The way you're doing it, e.g., "75%" means the new way took only a quarter
of the time of the old way - new_time is 75% smaller than old_time.  I'm
comfortable with that, but it's not the _clearest_ way to display timing
differences so dramatic; old_time / new_time would be 4.0 in this case,
which is easier to grasp at a glance.

> So the benchmarking here is valid.

No, it sucks ;-)  But it's perfectly adequate for what I wanted to see
from it :-)

> I didn't write it. All I did was modify
> it to print percent improvement instead of sort time.
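The two ways of reporting the same measurement can be sketched as follows
(the function names here are mine, for illustration - they're not from
sortperf.py):

```python
def percent_improvement(old_time, new_time):
    """Elliot's metric: how much smaller new_time is, as a % of old_time."""
    return (old_time - new_time) / old_time * 100.0

def speedup_ratio(old_time, new_time):
    """The alternative: how many times faster the new way is."""
    return old_time / new_time

# A "75%" improvement and a 4.0x speedup describe the same measurement:
assert percent_improvement(4.0, 1.0) == 75.0
assert speedup_ratio(4.0, 1.0) == 4.0
```

Note how the ratio spreads dramatic differences out (4.0, 10.0, ...) while
the percentage squeezes them all into the narrow band just below 100%.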
> The numbers below are
> percent improvement of my sort vs default sort (just clone my repo and run
> python sortperf.py to verify):
>
>  i    2**i   *sort  \sort  /sort  3sort  +sort  %sort  ~sort  =sort  !sort
> 15   32768  44.11% 54.69% 47.41% 57.61% 50.17% 75.24% 68.03% 65.16% 82.16%
> 16   65536  14.14% 75.38% 63.44% 56.56% 67.99% 66.19% 50.72% 61.55% 61.87%
> 17  131072  73.54% 60.52% 60.97% 61.63% 52.55% 49.84% 68.68% 84.12% 65.90%
> 18  262144  54.19% 55.34% 54.67% 54.13% 55.62% 52.88% 69.30% 74.86% 72.66%
> 19  524288  55.12% 53.32% 53.77% 54.27% 52.97% 53.53% 67.55% 76.60% 78.56%
> 20 1048576  55.05% 51.09% 60.05% 50.69% 62.98% 50.20% 66.24% 71.47% 61.40%

If it _didn't_ suck, all the numbers in a column would be about the same :-)

A meta-note: when Guido first wrote sortperf.py, machines - and Python! -
were much slower.  Now sorting 2**15 elements goes so fast that this gross
timing approach is only good for making the grossest distinctions.  On my
box today, even the slowest case (*sort on 2**20 elements) takes under half
a second.

That's "why" the numbers in each column are much more stable across the
last 3 rows than across the first 3 rows - the cases in the first 3 rows
take so little time that any timing glitch can skew them badly.

Note that you can pass arguments to sortperf.py to check any range of
power-of-2 values you like, but so long as you're aware of the pitfalls,
the output as-is is fine for me.

> This is just insane. This is crazy.

Yet nevertheless wholly expected ;-)

> ...
> This is so cool.

It is!  Congratulations :-)

_______________________________________________
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/
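A standard way to tame the glitches on small inputs is to time several runs
and keep the minimum - glitches only ever *add* time, so the minimum is the
least-noisy estimate.  A minimal stdlib-only sketch (this is my sketch of
the idea, not sortperf.py's actual code):

```python
import random
import time

def best_sort_time(data, runs=5):
    """Time list.sort() on a fresh copy each run; return the minimum.
    Noise (cache effects, scheduler preemption) only inflates timings,
    so the smallest observed time is the most trustworthy one."""
    best = float("inf")
    for _ in range(runs):
        copy = list(data)          # sort a fresh, unsorted copy each time
        t0 = time.perf_counter()
        copy.sort()
        t1 = time.perf_counter()
        best = min(best, t1 - t0)
    return best

# Usage: a small random-float case, where per-run noise matters most.
data = [random.random() for _ in range(2**15)]
elapsed = best_sort_time(data)
assert elapsed > 0.0
```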