[Python-Dev] Drastically improving list.sort() for lists of strings/ints

2016-09-11 Thread Elliot Gorokhovsky
Hello all, I am interested in making a non-trivial improvement to list.sort(), but before I put in the work, I want to test the waters and see if this is something the community would accept. Basically, I want to implement radix sort for lists of strings. So list.sort() would detect if it is sorti

Re: [Python-Dev] Drastically improving list.sort() for lists of strings/ints

2016-09-11 Thread Chris Angelico
On Sun, Sep 11, 2016 at 5:45 PM, Elliot Gorokhovsky wrote: > I am interested in making a non-trivial improvement to list.sort(), but > before I put in the work, I want to test the waters and see if this is > something the community would accept. Basically, I want to implement radix > sort for list

Re: [Python-Dev] Drastically improving list.sort() for lists of strings/ints

2016-09-11 Thread Raymond Hettinger
> On Sep 11, 2016, at 12:45 AM, Elliot Gorokhovsky > wrote: > > I am interested in making a non-trivial improvement to list.sort(), but > before I put in the work, I want to test the waters and see if this is > something the community would accept. Basically, I want to implement radix > sort

Re: [Python-Dev] Drastically improving list.sort() for lists of strings/ints

2016-09-11 Thread Elliot Gorokhovsky
Thanks for the link. If you look at the conclusion it says "We recommend American flag sort as an all-round algorithm for sorting strings." On Sun, Sep 11, 2016, 11:30 AM Raymond Hettinger < raymond.hettin...@gmail.com> wrote: > > > On Sep 11, 2016, at 12:45 AM, Elliot Gorokhovsky > wrote: > > >

Re: [Python-Dev] Drastically improving list.sort() for lists of strings/ints

2016-09-11 Thread Mark Dickinson
> I am interested in making a non-trivial improvement to list.sort() [...] Would your proposed new sorting algorithm be stable? The language currently guarantees stability for `list.sort` and `sorted`. -- Mark ___ Python-Dev mailing list Python-Dev@pyt

Re: [Python-Dev] Drastically improving list.sort() for lists of strings/ints

2016-09-11 Thread Elliot Gorokhovsky
The sort can be made stable, but that requires the allocation of an equal-sized auxiliary array. To quote from the paper: "Both list-based and two-array sorts entail Θ(n) space overhead. That overhead shrinks to Θ(logn) in American flag sort, which, like quicksort, trades off stability for space ef

Re: [Python-Dev] Drastically improving list.sort() for lists of strings/ints

2016-09-11 Thread Mark Dickinson
On Sun, Sep 11, 2016 at 7:43 PM, Elliot Gorokhovsky wrote: > So I suppose the thing to do is to benchmark stable radix sort against > timsort and see if it's still worth it. Agreed; it would definitely be interesting to see benchmarks for the two-array stable sort as well as the American Flag un

Re: [Python-Dev] Drastically improving list.sort() for lists of strings/ints

2016-09-11 Thread Raymond Hettinger
> On Sep 11, 2016, at 11:58 AM, Mark Dickinson wrote: > >> So I suppose the thing to do is to benchmark stable radix sort against >> timsort and see if it's still worth it. > > Agreed; it would definitely be interesting to see benchmarks for the > two-array stable sort as well as the American

Re: [Python-Dev] Drastically improving list.sort() for lists of strings/ints

2016-09-11 Thread Terry Reedy
On 9/11/2016 3:45 AM, Elliot Gorokhovsky wrote: Hello all, I am interested in making a non-trivial improvement to list.sort(), This is non-trivial, and harder than you seem to think it is. > but before I put in the work, I want to test the waters and see if this is something the community wo

Re: [Python-Dev] Drastically improving list.sort() for lists of strings/ints

2016-09-11 Thread Tim Peters
[redirected from python-dev, to python-ideas; please send followups only to python-ideas] [Elliot Gorokhovsky ] > ... > TL;DR: Should I spend time making list.sort() detect if it is sorting > lexicographically and, if so, use a much faster algorithm? It will be fun to find out ;-) As Mark, and

Re: [Python-Dev] Drastically improving list.sort() for lists of strings/ints

2016-09-11 Thread Petr Viktorin
On 09/11/2016 10:48 PM, Terry Reedy wrote: [...] Second, with respect to timsort in particular: timsort is designed to exploit structure and run faster than O(n*logn) in special cases. If a list is already sorted, timsort will do one O(n) scan and stop. Any radix sort will take several times lo

Re: [Python-Dev] Drastically improving list.sort() for lists of strings/ints

2016-09-11 Thread Elliot Gorokhovsky
Wow, Tim himself! Regarding performance on semi-ordered data: we'll have to benchmark to see, but intuitively I imagine radix would meet Timsort because verifying that a list of strings is sorted takes Omega(nw) (which gives a lower bound on Timsort), where w is the word length. Radix sort is Theta

Re: [Python-Dev] Drastically improving list.sort() for lists of strings/ints

2016-09-13 Thread Nikolaus Rath
On Sep 11 2016, Terry Reedy wrote: > Tim Peters investigated and empirically determined that an > O(n*n) binary insort, as he optimized it on real machines, is faster > than O(n*logn) sorting for up to around 64 items. Out of curiosity: is this test repeated periodically on different architecture

Re: [Python-Dev] Drastically improving list.sort() for lists of strings/ints

2016-09-13 Thread Random832
On Tue, Sep 13, 2016, at 13:24, Nikolaus Rath wrote: > On Sep 11 2016, Terry Reedy wrote: > > Tim Peters investigated and empirically determined that an > > O(n*n) binary insort, as he optimized it on real machines, is faster > > than O(n*logn) sorting for up to around 64 items. > > Out of curios

Re: [Python-Dev] Drastically improving list.sort() for lists of strings/ints

2016-09-13 Thread Tim Peters
[Terry Reedy ] >> Tim Peters investigated and empirically determined that an >> O(n*n) binary insort, as he optimized it on real machines, is faster >> than O(n*logn) sorting for up to around 64 items. [Nikolaus Rath ] > Out of curiosity: is this test repeated periodically on different > architect

Re: [Python-Dev] Drastically improving list.sort() for lists of strings/ints

2016-09-13 Thread Stephen J. Turnbull
Nikolaus Rath writes: > Out of curiosity: is this test repeated periodically on different > architectures? Or could it be that it only ever was true 10 years > ago on Tim's Power Mac G5 (or whatever he used)? This is the same Tim Peters of the eponymous test for readability of syntax: "Syntax

Re: [Python-Dev] Drastically improving list.sort() for lists of strings/ints

2016-09-15 Thread Nikolaus Rath
On Sep 13 2016, Tim Peters wrote: > [Terry Reedy ] >>> Tim Peters investigated and empirically determined that an >>> O(n*n) binary insort, as he optimized it on real machines, is faster >>> than O(n*logn) sorting for up to around 64 items. > > [Nikolaus Rath ] >> Out of curiosity: is this test re