2009/4/30 David Warde-Farley <d...@cs.toronto.edu>:
> Have you considered coding up a looped version in Cython? If this is
> going to be a bottleneck then it would be very worthwhile. Stéfan's
> code is clever, although as he points out, it will create an
> intermediate array of size (len(I))**2, which may end up being as much
> of a problem as a Python loop if you're allocating and garbage
> collecting an N**2 array every time.
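
For reference, a broadcasting version along those lines might look like
the sketch below.  This is a guess at the shape of that code, not the
original: it assumes the goal is, for each element, to count how many
earlier elements share its value, and it builds the kind of N-by-N
boolean intermediate David mentions.

import numpy as np

def count_broadcast(x):
    """O(N**2)-memory sketch (hypothetical reconstruction): for each
    element, count the earlier elements with the same value."""
    x = np.asarray(x)
    idx = np.arange(len(x))
    # earlier_equal[i, j] is True where x[j] == x[i] and j < i.
    earlier_equal = (x[:, None] == x[None, :]) & (idx[None, :] < idx[:, None])
    return earlier_equal.sum(axis=1)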

Here is a version that does not create such large intermediate arrays
(code below, after a quick plain-Python reference for comparison).
It's 02:00 here, so I'd be surprised if it handles all the corner
cases :)  [I attach a file with the three approaches given so far, so
that you can benchmark them.  My second attempt is about 30 times
faster than my first, and about 7 times faster than David's for-loop.]
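
The attached file has the actual implementations; as a rough stand-in
for the looped approach (assuming what is being counted is the running
number of prior occurrences of each value), a plain-Python reference
could be:

import numpy as np
from collections import defaultdict

def count_loop(x):
    """Plain-Python reference (hypothetical stand-in, for benchmarking
    and checking only): out[i] is how many of x[:i] equal x[i]."""
    seen = defaultdict(int)
    out = np.empty(len(x))
    for i, value in enumerate(x):
        out[i] = seen[value]    # occurrences of x[i] seen before index i
        seen[value] += 1
    return out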

import numpy as np

def count1(x):
    """stefan-2: Improved memory footprint.

    For each element of x, return how many equal elements precede it in
    sorted order (a zero-based occurrence index).
    """
    # Sort so that equal values end up next to each other.
    sort_idx = np.argsort(x)
    x_sorted = x[sort_idx]

    # Positions in x_sorted where a new value starts (including 0).
    jumps = np.hstack([[0], np.where(np.diff(x_sorted) > 0)[0] + 1])

    # Start from all ones, then subtract the length of the previous run
    # at each jump so that the cumulative sum restarts at 0 there.
    count = np.ones(len(x))
    count[jumps] -= np.hstack([jumps[0] + 1, np.diff(jumps)])

    # Scatter the running counts back to the original element positions.
    out = np.empty(len(x))
    out[sort_idx] = np.cumsum(count)
    return out
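
A quick check of the fast version (the exact numbers assigned to
repeated values can differ from the plain loop above, since argsort's
default sort is not guaranteed to be stable):

x = np.array([3, 1, 3, 2, 1])
print(count1(x))   # typically [0. 0. 1. 0. 1.]: zero-based occurrence numbers

# The sorted counts must agree with the plain-loop reference above,
# even if ties are numbered in a different order.
y = np.random.randint(0, 5, size=1000)
assert np.all(np.sort(count1(y)) == np.sort(count_loop(y)))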

Cheers
Stéfan

Attachment: crazysort.py
