A Dimarts 19 Setembre 2006 21:41, Bill Baxter va escriure: > I think he meant do an argsort first, then use fancy indexing to get > the sorted array. > For a 1-d array that's just > > ind = A.argsort() > Asorted = A[ind] > > That should be O(N lg N + N), aka O(N lg N)
I see. Thanks. OTOH, maybe your estimations are right, but the effect of the constants in these O(whatever) estimations can be surprisingly high: In [3]: from timeit import Timer In [4]: Timer("b=numpy.argsort(a);c=a[b]", "import numpy; a=numpy.arange(100000,-1,-1)").repeat(3,100) Out[4]: [1.6653108596801758, 1.670341968536377, 1.6632120609283447] In [5]: Timer("b=numpy.argsort(a);c=numpy.sort(a)", "import numpy; a=numpy.arange(100000,-1,-1)").repeat(3,100) Out[5]: [1.6533238887786865, 1.6272940635681152, 1.6253311634063721] In [6]: Timer("b=numpy.argsort(a);a.sort();c=a", "import numpy; a=numpy.arange(100000,-1,-1)").repeat(3,100) Out[6]: [0.95492100715637207, 0.90312504768371582, 0.90426898002624512] so, it seems that argsorting first and fancy indexing later on is the most expensive procedure for relatively large arrays (100000). Interestingly, figures above seems to indicate that in-place sort is stunningly fast: In [7]: Timer("a.sort()","import numpy; a=numpy.arange(100000,-1,-1)").repeat(3,100) Out[7]: [0.32840394973754883, 0.2746579647064209, 0.2770991325378418] and much faster indeed than fancy indexing In [8]: Timer("b[a]","import numpy; a=numpy.arange(100000,-1,-1);b=a.copy()").repeat(3,100) Out[8]: [0.79876089096069336, 0.74172186851501465, 0.74209499359130859] i.e. in-place sort seems 2.7x faster than fancy indexing (at least for these datasets). Mmmm, with this, I really ponder if a combo that makes the argsort() and sort() in one shot really makes any sense, at least from the point of view of speed: In [10]: Timer("b=numpy.argsort(a);a.sort();c=a","import numpy; a=numpy.arange(100000,-1,-1)").repeat(3,100) Out[10]: [0.98506593704223633, 0.89880609512329102, 0.89982390403747559] In [11]: Timer("b=numpy.argsort(a)","import numpy; a=numpy.arange(100000,-1,-1)").repeat(3,100) Out[11]: [0.92959284782409668, 0.85385990142822266, 0.87773990631103516] So, it seems that doing an in-place sort() immediately after an argsort() operation is very efficient (cache effects here?), and would avoid the need of the combo function (from the point of view of efficiency, I repeat). Cheers, -- >0,0< Francesc Altet http://www.carabos.com/ V V Cárabos Coop. V. Enjoy Data "-" ------------------------------------------------------------------------- Take Surveys. Earn Cash. Influence the Future of IT Join SourceForge.net's Techsay panel and you'll get the chance to share your opinions on IT & business topics through brief surveys -- and earn cash http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV _______________________________________________ Numpy-discussion mailing list Numpy-discussion@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/numpy-discussion