OK - here's what the current binsort does, except that this model ignores the already-sorted prefix (if any) and builds a new list instead of modifying in place:
    def oldsort(a):
        from bisect import bisect_right as br
        assert a  # requires a non-empty sequence
        result = [a[0]]
        for i in range(1, len(a)):
            x = a[i]
            index = br(result, x)
            result.insert(index, x)
        return result

And here's my best guess as to what you _intend_ the new version to do. Please confirm that, or, if I'm guessing wrong, please give a Python function that _does_ implement your intent:

    def newsort(a):
        from bisect import bisect_right as br
        assert a
        oldx = a[0]
        result = [oldx]
        index = 0
        for i in range(1, len(a)):
            x = a[i]
            if x < oldx:
                # smaller than the previous element: search only to
                # the left of where that one landed
                index = br(result, x, 0, index)
            else:
                # else search only to its right
                index = br(result, x, index + 1)
            result.insert(index, x)
            oldx = x
        return result

Now assuming that's right, I don't care about timing it ;-) The only basic question to me is whether it in fact reduces the number of comparisons. So here's an integer wrapper that bumps a global counter whenever it's asked to compare:

    class IntWrap(object):
        def __init__(self, i):
            self.i = i

        def __cmp__(a, b):
            global gncmp
            gncmp += 1
            return cmp(a.i, b.i)

        def __repr__(self):
            return repr(self.i)

Now we can build lists of those, and get exact comparison counts. To start, for a given length `n`, this counts the total number of comparisons needed to sort all possible permutations of a list of length `n`, under both the old and new ways:

    def drive(n):
        import itertools
        global gncmp
        base = [IntWrap(i) for i in range(n)]
        oldcount = newcount = 0
        numperms = 0
        for p in itertools.permutations(base):
            numperms += 1

            gncmp = 0
            oldresult = oldsort(p)
            oldcount += gncmp

            gncmp = 0
            newresult = newsort(p)
            newcount += gncmp

            assert oldresult == newresult == base

        print 'n', n, 'perms', numperms
        print 'old compares', oldcount
        print 'new compares', newcount
        print 'diff %', (newcount - oldcount) * 100.0 / max(oldcount, 1)

And, finally, a tiny loop to drive it:

    for n in range(1, 13):
        drive(n)
        print

It's still running as I type this, but the results aren't promising: as soon as the list length gets non-trivial, the new way requires more comparisons than the old way:

    n 1 perms 1
    old compares 0
    new compares 0
    diff % 0.0

    n 2 perms 2
    old compares 2
    new compares 2
    diff % 0.0

    n 3 perms 6
    old compares 16
    new compares 16
    diff % 0.0

    n 4 perms 24
    old compares 112
    new compares 116
    diff % 3.57142857143

    n 5 perms 120
    old compares 848
    new compares 880
    diff % 3.77358490566

    n 6 perms 720
    old compares 7008
    new compares 7296
    diff % 4.1095890411

    n 7 perms 5040
    old compares 63456
    new compares 66432
    diff % 4.68986384266

    n 8 perms 40320
    old compares 628608
    new compares 662496
    diff % 5.39095907147

    n 9 perms 362880
    old compares 6826752
    new compares 7202304
    diff % 5.50118123523

    n 10 perms 3628800
    old compares 80605440
    new compares 85006080
    diff % 5.45948263542

I believe it would be very difficult to analyze this rigorously - and even if I saw an analysis it would be hard to trust it. Raw counts from simple code are hard to argue with ;-)

FYI, here are two ideas I had way back when, but didn't pursue:

1. Merge "2 at a time" instead of just 1. That is, first "sort" the next 2 elements to be merged (1 compare and a possible swap). Then binary search to find where the smaller belongs, and a shorter binary search to find where the larger belongs. Then shift both into place (a rough sketch follows this list). This can win on two counts:

   A. Less data movement, since the already-sorted values after the larger element get moved only once instead of twice.

   B. A possible cut in the number of compares.

   Merging a sorted list of N elements with a sorted list of 2 elements has been studied a lot (e.g., search for "optimal merging of 2 elements" and find the paper by Hwang and Lin). The 2 new elements can land among the N old ones in (N+2)*(N+1)/2 distinct ways, so the theoretical minimum number of compares needed (in the worst case) is ceiling(log2((N+2)*(N+1)/2)) - e.g., 7 compares for N=10, since ceiling(log2(66)) = 7.

2. Instead of binary insertion sort, do an ordinary (but optimized) bottom-up merge sort (a bare-bones sketch is in the P.S.). That may not cut the number of compares, but would slash the worst-case data-movement cost from O(n**2) to O(n*log(n)).
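For concreteness, here's a minimal sketch of idea 1. The `pairsort` name is made up, and the list-based `insert` calls still move the tail twice, so this shows the compare pattern in B but not the data-movement win in A (a real version would slide the array slots by hand):

    def pairsort(a):
        # Sketch only: binary insertion sort consuming two elements
        # per outer step.
        from bisect import bisect_right as br, insort_right
        result = list(a[:1])
        i = 1
        while i + 1 < len(a):
            x, y = a[i], a[i + 1]
            if y < x:  # 1 compare (and a possible swap) to order the pair
                x, y = y, x
            # Full-width binary search for the smaller element ...
            k = br(result, x)
            # ... then a shorter search, over result[k:], for the larger.
            j = br(result, y, k)
            # Insert the larger first so k still points at the right
            # slot for the smaller.
            result.insert(j, y)
            result.insert(k, x)
            i += 2
        if i < len(a):  # odd length: one leftover element
            insort_right(result, a[i])
        return result

It accepts the same permutation tuples `drive()` feeds `oldsort()` and `newsort()`, so wiring it in as a third contender is an easy way to find out whether B actually materializes.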
As to why your code is sometimes faster: the Python code in your timing harness didn't actually sort anything, so it wasn't measuring anything interesting (or even explainable ;-) ). For the Java code, I have no guess - I don't know enough about Java internals. Maybe "lucky" data, maybe cache effects, maybe a mistake - I don't know, and can't guess. Or maybe my guess (above) at the intent of your code is all wrong.

It _is_ an interesting idea, though! Thanks for bringing it up :-)
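P.S. For completeness, here's the bare-bones shape of idea 2, minus all the optimizations (the names are made up; nothing here is tuned):

    def merge(left, right):
        # Standard two-way merge; stable because ties take from `left`.
        out = []
        i = j = 0
        while i < len(left) and j < len(right):
            if right[j] < left[i]:
                out.append(right[j])
                j += 1
            else:
                out.append(left[i])
                i += 1
        out.extend(left[i:])
        out.extend(right[j:])
        return out

    def mergesort(a):
        # Bottom-up: merge adjacent runs of width 1, 2, 4, ... until
        # a single run remains.
        a = list(a)
        n = len(a)
        width = 1
        while width < n:
            for lo in range(0, n, 2 * width):
                mid = min(lo + width, n)
                hi = min(lo + 2 * width, n)
                a[lo:hi] = merge(a[lo:mid], a[mid:hi])
            width *= 2
        return a

The data movement here is the O(n*log(n)) slice copying; as noted above, the compare count may not improve on binary insertion's.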