Hi all, Here's the problem I want to write vectorised code for. I start with an array of indices, say I=array([0,1,0,2,0,1,4]), and I want to come up with an array C that counts how many times each index has been seen so far if you were counting through the array from the beginning to the end, e.g. for that I, C=array([0,0,1,0,2,1,0]). This is easy enough to do with a C loop, but trickier to do just with numpy. Any ideas?
I came up with a solution, but it's not particularly elegant. I bet it can be done better. But here's my solution FWIW: First, sort I (we can always undo this at the end so no loss of generality there). Now we have I=[0,0,0,1,1,2,4]. Next we construct the array J=I[1:]!=I[:-1], i.e. marking the points where the index changes, and the array K=-I. Write A, B for the cumsum of J and K (but with an extra 0 on the beginning of each), so: A=[0,0,0,1,1,2,3] B=[0,1,2,2,3,3,3] Now A is a sort of index of the indices, and B is sort of close to the C we're looking for - it counts upwards at the right times but doesn't reset to 0 when the index changes. So we fix this by subtracting the value at the point where it should reset. The array of these values that need to be subtracted is B[J], although this misses the first one, so we write BJ for the array B[J] with 0 prepended. Now A indexes the indices, so BJ[A] is an array of the same length as the initial array with value the initial value for each segment: BJ=[0,2,3,3] BJ[A]=[0,0,0,2,2,3,3] Finally, C is then B-BJ[A]: C=[0,1,2,0,1,0,0] Complete code: J = I[1:]!=I[:-1] K = I[1:]==I[:-1] A = hstack((0, cumsum(array(J, dtype=int)))) B = hstack((0, cumsum(array(K, dtype=int)))) BJ = hstack((0, B[J])) C = B-BJ[A] OK, so that's my version - can you improve on it? I don't mind so much if the code is unreadable as long as it's fast! :-) Dan _______________________________________________ Numpy-discussion mailing list Numpy-discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion