On Feb 8, 2008 5:29 AM, Francesc Altet <[EMAIL PROTECTED]> wrote: > Hi, > > I'm a bit confused that the sort method of a string character doesn't > allow a mergesort: > > >>> s = numpy.empty(10, "S10") > >>> s.sort(kind="merge") > TypeError: desired sort not supported for this type > > However, by looking at the numpy sources, it seems that the only > implemented method for sorting array strings is "merge" (I presume > because it is stable). So, perhaps the message above should be fixed. >
The only available method is the default quicksort. The mergesort is for argsort and was put in for lexsort to use. > > Also, in the context of my work in indexing, and because of the slowness > of the current implementation in NumPy, I've ended with an > implementation of the quicksort method for 1-D array strings. For > moderately large arrays, it is about 2.5x-3x faster than the > (supposedly) mergesort version in NumPy, not only due to the quicksort, > but also because I've implemented a couple of macros for efficient > string swapping and copy. If this is of interest for NumPy developers, > tell me and I will provide the code. > I've now got a string/ucs4 specific argsort(kind='q'), the string version of which is about 40% faster than the old default and about 10% faster than the mergesort version, but the string/ucs4 specific versions of sort aren't yet fairing as well. I'm timing things with In [1]: import timeit In [2]: t = timeit.Timer("np.fromstring(np.empty(10000).tostring(),dtype='|S8').sort(kind='q')","import numpy as np") In [3]: t.repeat(3,100) Out[3]: [0.22127485275268555, 0.21282196044921875, 0.21273088455200195] That's with the current sort(kind='q') in svn, which uses the new string compare function but is otherwise the old default quicksort. The new string specific version of quicksort I'm testing is actually a bit slower than that. Both versions correctly sort the array. So I'm going to continue to experiment a bit until I see what is going on. Chuck
_______________________________________________ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion