2009/4/30 David Warde-Farley <d...@cs.toronto.edu>: > Have you considered coding up a looped version in Cython? If this is > going to be a bottleneck then it would be very worthwhile. Stéfan's > code is clever, although as he points out, it will create an > intermediate array of size (len(I))**2, which may end up being as much > of a problem as a Python loop if you're allocating and garbage > collecting an N**2 array every time.
Here is a version that does not create such large intermediate arrays. It's 0200 here, so I'd be surprised if this works for all the corner cases :) [I attach a file with the 3 different approaches given so far, so that you can benchmark them. My second attempt is about 30 times faster than my first attempt, and about 7 times faster than David's for-loop]. import numpy as np def count1(x): """stefan-2: Improved memory footprint. """ sort_idx = np.argsort(x) x_sorted = x[sort_idx] jumps = np.hstack([[0], np.where(np.diff(x_sorted) > 0)[0] + 1]) count = np.ones(len(x)) count[jumps] -= np.hstack([jumps[0] + 1, np.diff(jumps)]) out = np.empty(len(x)) out[sort_idx] = np.cumsum(count) return out Cheers Stéfan
crazysort.py
Description: Binary data
_______________________________________________ Numpy-discussion mailing list Numpy-discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion