On Wed, Jul 21, 2010 at 5:27 PM, Terry Reedy <tjre...@udel.edu> wrote:
> These tests use random numbers with a constant, relatively high density of > 25%, which is favorable to radix sort. What if you do the same test with a > constant range of, say, 1000000000 (1 billion) or even a trillion or > quadrillion. Or how about sorting a thousand 128bit ip6 addresses? Which > wins for that? > blist switches to radix sort only if the keys contain only floats or only integers that fit into a C long. If the integers get too big, it reverts to timsort. When using a sort key, the decorate-sort-undecorate pattern requires touching every object once before the sort. The check for a consistent type is done inexpensively during the decorate phase. Here's an example result with a density of only 1%, where the radix sort is around 4 times as fast: otto:~/src/blist$ python3.1 -m timeit -s 'import random' -s 'x = [random.randrange(10000*100) for i in range(10000)]' 'y = list(x)' 'y.sort(key=float)' 100 loops, best of 3: 12.4 msec per loop otto:~/src/blist$ python3.1 -m timeit -s 'from blist import blist' -s 'import random' -s 'x = [random.randrange(10000*100) for i in range(10000)]' 'y = blist(x)' 'y.sort(key=float)' 100 loops, best of 3: 3.4 msec per loop And a density of 0.01%: otto:~/src/blist$ python3.1 -m timeit -s 'import random' -s 'x = [random.randrange(10000*10000) for i in range(10000)]' 'y = list(x)' 'y.sort(key=float)' 100 loops, best of 3: 12 msec per loop otto:~/src/blist$ python3.1 -m timeit -s 'from blist import blist' -s 'import random' -s 'x = [random.randrange(10000*10000) for i in range(10000)]' 'y = blist(x)' 'y.sort(key=float)' 100 loops, best of 3: 3.52 msec per loop > list.sort is (near) linear for lists that are mostly ordered. I think Tim's > writeup about this in in the source. For instance, if one extends a sorted > with 1% random additions and resorts, list.sort will skip the sorted part, > sort the additions, and merge them back in. Radix sort ignores pre-existing > order. So either a lot of comparitive profiling would be needed for > auto-selection of sort method, or it should be user selectable. > I've tested exactly that scenario. For already-ordered lists, radix sort and timsort tie. > Does your radix sort meet the stability guarantee of list.sort? > Yes. -- Daniel Stutzbach, Ph.D. President, Stutzbach Enterprises, LLC <http://stutzbachenterprises.com>
-- http://mail.python.org/mailman/listinfo/python-list