On Feb 11, 2008 2:58 AM, Francesc Altet <[EMAIL PROTECTED]> wrote: > A Monday 11 February 2008, Charles R Harris escrigué: > > On Feb 8, 2008 5:29 AM, Francesc Altet <[EMAIL PROTECTED]> wrote: > > > Hi, > > > > > > I'm a bit confused that the sort method of a string character > > > doesn't > > > > > > allow a mergesort: > > > >>> s = numpy.empty(10, "S10") > > > >>> s.sort(kind="merge") > > > > > > TypeError: desired sort not supported for this type > > > > > > However, by looking at the numpy sources, it seems that the only > > > implemented method for sorting array strings is "merge" (I presume > > > because it is stable). So, perhaps the message above should be > > > fixed. > > > > The only available method is the default quicksort. The mergesort is > > for argsort and was put in for lexsort to use. > > That's good to know. However, I'm curious about where it is the > specific quicksort implementation for strings/unicode. I've had a look > at _sortmodule.c.src, but I can only find a quicksort implementation > for: >
The default is the C qsort, it is called from PyArray_Sort in multiarraymodule.c. The type specific sorts, when they exist, are also called from there through _new_sort. To see what type specific sorts are registered, look at the end of _sortmodule.c.src. You can write a new sort, and if it isn't registered it won't be used. So commenting out the registration is a good way to get back to the default. > > > The version you are testing is your own or the one that I provided? > Here are the timings for my laptop: > They turned out to be essentially identical, except I used len instead of ss ;) I also used inlined functions for the copy and swap as they are safer with the arguments. Comparison with the macro versions showed no difference there. > > In [55]: a = np.random.rand(10000).astype('S8') > > In [56]: %timeit a.copy().sort() > 100 loops, best of 3: 3.82 ms per loop > > In [57]: %timeit newqsort(a.copy()) > 100 loops, best of 3: 3.29 ms per loop > > Here, the difference in performance has been reduced to a mere 15% > (still favouring newqsort). So, with this, it seems like the > performance of the original sorting in NumPy only suffers a lot when > running in old processors (eg. Pentium 4), while the performance is > reasonable with newer ones (Opteron). It could also depend on the C library that comes with the compiler. I'm running on a E6600, but numpy compiles everything for the i386, which might make a difference also. Chuck
_______________________________________________ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion