2008/5/18 Matt Crane <[EMAIL PROTECTED]>: > On Sun, May 18, 2008 at 8:52 PM, Robert Kern <[EMAIL PROTECTED]> wrote: >> Are there repeats? > No, no repeats in the first column. > > I'm going to go get a cup of coffee before I forget to leave out any > potentially vital information again. It's going to be a long day.
It can be done, though I had to be kind of devious. My solution might not even be O(n log n), depending on how mergesort is implemented: def find_matches(A,B): """Find positions A_idx and B_idx so that A[A_idx]==B[B_idx] A and B should be sorted arrays with no repeats; A_idx and B_idx will be all the places they agree. >>> import numpy as np >>> A = np.array([1,2,4,5,7]) >>> B = np.array([1,3,5,9]) >>> find_matches(A,B) array([0,3]), array([0,2]) """ AB = np.concatenate((A,B)) idx = np.argsort(np.concatenate((A,B)),kind="mergesort") sorted = AB[idx] pairs = np.where(np.diff(sorted)==0)[0] A_pairs = idx[pairs] B_pairs = idx[pairs+1]-len(A) if np.any(A_pairs>=len(A)): raise ValueError, "first array contains a repeated element" if np.any(B_pairs<0): raise ValueError, "second array contains a repeated element" return A_pairs, B_pairs The idea is that diff is not a bad way to find repeats in a sequence; so concatenate the two sequences and sort them (if you're lucky mergesort will be fast because they're individually sorted, but in any case we need stability). Of course, you have to take the pairs in the sorted sequence and figure out where they came from, but fortunately you don't even need to invert the permutation returned by argsort (do we have a tool to invert permutations, or should one just use argsort again?). Anne _______________________________________________ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion