[Numpy-discussion] using numpy.argmax to index into another array
Hello Everybody,

I have the following problem that I would be interested in finding an
easy/elegant solution to. I've got it working, but my solution is exceedingly
clunky and I'm sure there must be a better way.

Here is the (boiled-down) problem: I have two different 3-d arrays, both
shaped (4,4,4): a and b. I can find the max values and the locations of those
max values in a along the 0-th dimension using max and argmax, resulting in a
4x4 matrix. So far, very easy.

I then want to find the values in b that correspond to the maximum values in
a. This is where I got stuck. Below is the sample code I used (pretty clumsy
stuff ...). Can somebody show a better (less clumsy) way of finding bmax in
the code excerpt below?

    a = numpy.random.randint(0, high=15, size=(4,4,4))
    b = numpy.random.randint(0, high=15, size=(4,4,4))
    amax = a.argmax(axis=0)
    idx2 = [0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3]
    idx1 = [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3]
    idx_all = (amax.flatten(), numpy.array(idx1), numpy.array(idx2))
    bmax = b[idx_all].reshape(4,4)

    >>> a
    array([[[ 7,  1,  7, 10],
            [11,  5,  9,  0],
            [13,  1, 13, 14],
            [ 4, 13,  4,  7]],
           [[ 2, 11,  4,  5],
            [ 6, 12, 14,  9],
            [ 0,  4,  5, 14],
            [13,  5,  3,  4]],
           [[ 1,  4,  5,  4],
            [ 7, 13,  1, 11],
            [ 9,  2, 10,  4],
            [ 3,  4, 10, 13]],
           [[14,  6,  7,  1],
            [ 8,  0,  8,  2],
            [ 6,  8,  7, 12],
            [ 4,  9,  0, 12]]])

    >>> amax
    array([[3, 1, 0, 0],
           [0, 2, 1, 2],
           [0, 3, 0, 0],
           [1, 0, 2, 2]])

    >>> b
    array([[[ 4,  2,  3,  2],
            [ 9,  6,  1,  4],
            [ 1, 10, 13, 11],
            [10,  8,  1, 10]],
           [[ 3, 13,  7,  7],
            [ 4,  7,  9,  7],
            [ 3, 13, 10, 10],
            [13,  8, 14,  6]],
           [[14, 10, 13,  6],
            [13,  2, 12, 12],
            [ 9,  6,  3,  8],
            [ 0,  7,  8, 11]],
           [[12, 10,  3,  0],
            [11,  7,  0,  4],
            [ 9,  7, 11, 12],
            [ 7, 12,  1,  1]]])

    >>> bmax = b[idx_all].reshape(4,4)
    >>> bmax
    array([[12, 13,  3,  2],
           [ 9,  2,  9, 12],
           [ 1,  7, 13, 11],
           [13,  8,  8, 11]])

Catherine

_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
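One less clumsy approach (a sketch, not from the thread) is to let numpy build the row/column index grids with numpy.indices and use them directly in a fancy-indexing expression, which avoids the hand-written idx1/idx2 lists and the flatten/reshape round trip:

```python
import numpy

# Hypothetical stand-ins for the two (4,4,4) arrays in the post.
a = numpy.random.randint(0, high=15, size=(4, 4, 4))
b = numpy.random.randint(0, high=15, size=(4, 4, 4))

amax = a.argmax(axis=0)            # (4,4) argmax indices along axis 0

# Index grids for the remaining two axes, generated instead of hand-typed.
i, j = numpy.indices(amax.shape)
bmax = b[amax, i, j]               # value of b at a's per-(i,j) argmax
```

This generalizes to any trailing shape, since numpy.indices builds the grids from amax.shape rather than from hard-coded lists.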
Re: [Numpy-discussion] record arrays initialization
-- Message: 6
Date: Thu, 3 May 2012 10:00:11 -0700
From: Keith Goodman <kwgood...@gmail.com>
Subject: Re: [Numpy-discussion] record arrays initialization
To: Discussion of Numerical Python <numpy-discussion@scipy.org>

On Wed, May 2, 2012 at 4:46 PM, Kevin Jacobs <jac...@bioinformed.com> wrote:

> The cKDTree implementation is more than 4 times faster than the
> brute-force approach:
>
>     T = scipy.spatial.cKDTree(targets)
>
>     In [11]: %timeit foo1(element, targets)   # Brute force
>     1000 loops, best of 3: 385 us per loop
>
>     In [12]: %timeit foo2(element, T)         # cKDTree
>     1 loops, best of 3: 83.5 us per loop
>
>     In [13]: 385/83.5
>     Out[13]: 4.610778443113772

Brute force plus cython beats cKDTree for knn with k=1 and Euclidean
distance:

    I[5] targets = np.random.uniform(0, 8, (5000, 7))
    I[6] element = np.arange(1, 8, dtype=np.float64)
    I[7] T = scipy.spatial.cKDTree(targets)
    I[8] timeit T.query(element)
    1 loops, best of 3: 36.1 us per loop
    I[9] timeit nn(targets, element)
    1 loops, best of 3: 28.5 us per loop

What about lower dimensions (2 instead of 7), where cKDTree gets faster?

    I[18] element = np.arange(1, 3, dtype=np.float64)
    I[19] targets = np.random.uniform(0, 8, (5000, 2))
    I[20] T = scipy.spatial.cKDTree(targets)
    I[21] timeit T.query(element)
    1 loops, best of 3: 27.5 us per loop
    I[22] timeit nn(targets, element)
    10 loops, best of 3: 11.6 us per loop

I could add nn to the bottleneck package. Is the k=1, Euclidean distance
case too specialized? Prototype (messy) code is here:
https://github.com/kwgoodman/bottleneck/issues/45

For a smaller speed-up (~2x), foo1 could use bottleneck's sum-of-squares
function, bn.ss().

Thank you everybody for your advice. I will actually be clustering the
dataset for use in the production code (which must be in Fortran), and using
scipy.cluster.vq.kmeans2 to generate and populate the clusters.
Even though a brute-force search in Fortran is fast, it will still be too
slow for the final code. But the kmeans2 clustering isn't perfect, in that it
doesn't always give results identical to the brute-force search, so I want to
do an analysis of the differences on a test data-set.

Catherine
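To make the comparison concrete, here is a sketch (array sizes and the cluster count of 64 are assumptions, not from the thread) of the exact cKDTree lookup next to a kmeans2-based approximation like the one described above:

```python
import numpy as np
from scipy.spatial import cKDTree
from scipy.cluster.vq import kmeans2, vq

np.random.seed(0)
targets = np.random.uniform(0, 8, (5000, 7))   # training set, as in the thread
element = np.arange(1, 8, dtype=np.float64)

# Exact nearest neighbour with a k-d tree (Euclidean, k=1).
tree = cKDTree(targets)
dist, exact_idx = tree.query(element)

# Approximate: cluster targets once, then match queries against the
# 64 centroids only (far fewer comparisons than the full 5000).
centroids, labels = kmeans2(targets, 64, minit='points')
approx_cluster = vq(element[None, :], centroids)[0][0]
```

The exact and approximate answers can then be compared on a test data-set, which is the analysis of differences Catherine describes.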
[Numpy-discussion] timing results (was: record arrays initialization)
A quick recap of the problem: a 128x512 array of 7-element vectors (element),
and a 5000-vector training dataset (targets). For each vector in element, I
want to find the best match in targets, defined as minimizing the Euclidean
distance.

I coded it up three ways: (a) looping through each vector in element
individually, (b) vectorizing the function in the previous step, and (c)
coding it up in Fortran. The heart of the find-best-match code in Python
looks like this, so I'm not doing an individual loop through all 5000 vectors
in targets:

    nlen = xelement.shape[0]
    nvec = targets.data.shape[0]
    x = xelement.reshape(1, nlen).repeat(nvec, axis=0)
    diffs = ((x - targets.data)**2).sum(axis=1)
    diffs = numpy.sqrt(diffs)
    return int(numpy.argmin(diffs, axis=0))

Here are the results:

    (a) looping through each vector: 68 seconds
    (b) vectorizing this:            58 seconds
    (c) raw Fortran with loops:      26 seconds

I was surprised to see that vectorizing didn't gain me much time, and that
the Fortran was so much faster than both Python alternatives. So there's a
lot that I don't know about how the internals of numpy and Python work. Why
does the loop through 128x512 elements in Python take only an additional 10
seconds? What is the main purpose of vectorizing? Is it optimization by
taking the looping step out of Python and into the C base, or something
different? And why is the Fortran so much faster (even without optimization)?

It looks like I'll be switching to Fortran after all.

Catherine
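One likely reason vectorizing gained so little: numpy.vectorize is a convenience wrapper that still calls the Python function once per element, so the 128x512 loop survives in disguise. Broadcasting removes it entirely, at the cost of an (n_elements, n_targets) temporary (chunk the elements if that is too large). A sketch with small stand-in sizes, not the thread's production code:

```python
import numpy as np

np.random.seed(0)
elements = np.random.rand(100, 7)   # stand-in for the 65536 x 7 pixel array
targets = np.random.rand(500, 7)    # stand-in for the 5000 x 7 training set

# (100, 1, 7) - (1, 500, 7) broadcasts to (100, 500, 7); summing the last
# axis gives all pairwise squared distances at once.  sqrt is monotonic,
# so it can be skipped before argmin.
d2 = ((elements[:, None, :] - targets[None, :, :]) ** 2).sum(axis=-1)
matches = d2.argmin(axis=1)         # best-match target index per element
```

This runs the whole search in compiled loops with no per-element Python call at all.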
Re: [Numpy-discussion] timing results (was: record arrays initialization)
On May 3, 2012, at 10:33 AM, Moroney, Catherine M (388D) wrote:

> A quick recap of the problem: a 128x512 array of 7-element vectors
> (element), and a 5000-vector training dataset (targets). For each vector
> in element, I want to find the best match in targets, defined as
> minimizing the Euclidean distance. [...] Here are the results:
>
>     (a) looping through each vector: 68 seconds
>     (b) vectorizing this:            58 seconds
>     (c) raw Fortran with loops:      26 seconds
>
> [...] It looks like I'll be switching to Fortran after all.

Actually, Fortran with correct array ordering: 13 seconds! What horrible
python/numpy mistake am I making to cause such a slowdown?

Catherine
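The "correct array ordering" jump (26 s to 13 s) is about memory layout: Fortran is column-major, so the inner loop should walk the first index, which is contiguous. numpy exposes the same choice, and checking it matters when handing arrays to Fortran (e.g. via f2py), where a mismatched layout forces a copy or strided access. A small illustration (sizes are the thread's, otherwise an assumption-free demo of the flags):

```python
import numpy as np

vectors_c = np.zeros((65536, 7))          # C order: each 7-vector contiguous
vectors_f = np.asfortranarray(vectors_c)  # Fortran order: columns contiguous

# The flags record which layout an array actually has in memory.
c_ok = vectors_c.flags['C_CONTIGUOUS']
f_ok = vectors_f.flags['F_CONTIGUOUS']
```

Arrays can also be created directly in Fortran order with np.zeros(shape, order='F'), avoiding the conversion step.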
Re: [Numpy-discussion] timing results (was: record arrays initialization)
On May 3, 2012, at 1:00 PM, numpy-discussion-requ...@scipy.org wrote:

> A quick recap of the problem: a 128x512 array of 7-element vectors
> (element), and a 5000-vector training dataset (targets). For each vector
> in element, I want to find the best match in targets, defined as
> minimizing the Euclidean distance. [...]
>
> I was surprised to see that vectorizing didn't gain me much time, and
> that the Fortran was so much faster than both Python alternatives. [...]
> What is the main purpose of vectorizing? Is it optimization by taking
> the looping step out of Python and into the C base, or something
> different?

If you're doing loops in Python, Python does all sorts of magic for you.
Type checking is one thing, and one of the simplest things to optimize away
if you use cython. If you're writing an expression like

    (x - targets.data)**2

where x and targets.data are arrays, the elements are subtracted and squared
elementwise in C instead of in Python. So yes, you've got the idea.

> And why is the Fortran so much faster (even without optimization)?

Could you show us the code? It's hard to tell otherwise.
As Keith Goodman pointed out, if he gets 7.5x with cython, it could be that
the Fortran code could be improved as well. Fortran has a reputation of being
the gold standard for performance in numerical computation, although one can
often be surprised. Picking good algorithms is always more important than the
language.

Paul

I'd be very curious to know if the Fortran can be improved on a bit further.
The full scale of the problem dwarfs what I describe here, so any additional
speed I can wring out of this would be very welcome.

Can somebody give me the numpy-for-dummies explanation of what the main
source of the timing differences is? What portion of the calculation is being
done in slow Python that's the determining factor for the slowdown?

Here is the Python code:

    def single(element, targets):
        if isinstance(element, tuple):
            xelement = element[0]
        elif isinstance(element, numpy.ndarray):
            xelement = element
        else:
            return FILL

        nlen = xelement.shape[0]
        nvec = targets.data.shape[0]
        x = xelement.reshape(1, nlen).repeat(nvec, axis=0)
        diffs = ((x - targets.data)**2).sum(axis=1)
        diffs = numpy.sqrt(diffs)
        return int(numpy.argmin(diffs, axis=0))

    multiple = numpy.vectorize(single)
    python_results = multiple(vectors, Target)

where vectors is a 65536x7 array and Target is 5000x7, so I'm evaluating
5000 potential choices for each of 65536 vectors.

And here is the Fortran:

    matches(:) = -1

    do iv = 1, nvectors
       min_dist = 9.0e9
       min_idx = -1
       do it = 1, ntargets
          dvector = targets(:,it) - vectors(:,iv)
          dist = sqrt(sum(dvector*dvector))
          if (dist < min_dist) then
             min_dist = dist
             min_idx = it
          endif
       end do
       matches(iv) = min_idx
    end do

Catherine
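For what it's worth, the whole Python function above can also be collapsed into one call to scipy's cdist, which computes the full distance matrix in C. This is a sketch with small stand-in sizes, not the production code from the thread:

```python
import numpy as np
from scipy.spatial.distance import cdist

np.random.seed(0)
vectors = np.random.rand(200, 7)    # stand-in for the 65536 x 7 array
targets = np.random.rand(300, 7)    # stand-in for the 5000 x 7 array

# cdist builds the (200, 300) Euclidean distance matrix; argmin along
# axis 1 picks the best-match target for every vector at once.
python_results = cdist(vectors, targets).argmin(axis=1)
```

At full scale the 65536 x 5000 matrix is large (~2.6 GB as float64), so the vectors would be processed in chunks; the per-chunk call is unchanged.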
[Numpy-discussion] record arrays and vectorizing
Hello,

Can somebody give me some hints as to how to code up this function in pure
Python, rather than dropping down to Fortran? I want to compare a 7-element
vector (called element) to a large list of similarly-dimensioned vectors
(called target), and pick out the vector in target that is the closest to
element (determined by minimizing the Euclidean distance).

For instance, in (slow) brute-force form it would look like:

    element = numpy.array([1, 2, 3, 4, 5, 6, 7])
    target = numpy.array(range(0, 49)).reshape(7,7)*0.1

    min_length = 1.0e9
    min_index = -1
    for i in xrange(0, 7):
        distance = (element - target[i])**2
        distance = numpy.sqrt(distance.sum())
        if (distance < min_length):
            min_length = distance
            min_index = i

Now of course the actual problem will be of a much larger scale. I will have
an array of elements and a large number of potential targets. I was thinking
of having element be an array where each entry is itself a numpy.ndarray, and
then vectorizing the code above so that as output I would have arrays of the
min_index and min_length values.

I can get the following simple test to work, so I may be on the right track:

    import numpy

    dtype = [('x', numpy.ndarray)]

    def single(data):
        return data[0].min()

    multiple = numpy.vectorize(single)

    if __name__ == '__main__':
        a = numpy.arange(0, 16).reshape(4,4)
        b = numpy.recarray((4), dtype=dtype)
        for i in xrange(0, b.shape[0]):
            b[i]['x'] = a[i,:]
        print a
        print b
        x = multiple(b)
        print x

What is the best way of constructing b from a? I tried

    b = numpy.recarray((4), dtype=dtype, buf=a)

but I get a segmentation fault when I try to print b.

Is there a way to perform this larger task efficiently with record arrays
and vectorization, or am I off on the wrong track completely? How can I do
this efficiently without dropping down to Fortran?

Thanks for any advice,
Catherine
[Numpy-discussion] record arrays initialization
Thanks to Perry for some very useful off-list conversation. I realize that I
wasn't being clear at all in my earlier description of the problem, so here
it is in a nutshell:

Find the best match in an array t(5000, 7) for a single vector e(7). Now
scale it up so e is (128, 512, 7), and I want to return a (128, 512) array
of the t-identifiers that are the best match for e. Best match is defined as
the minimum Euclidean distance.

I'm going to try three ways: (a) brute force and lots of looping in Python,
(b) constructing a function to find the match for a single instance of e and
vectorizing it, and (c) coding it in Fortran. I'll be curious to see the
performance figures.

Two smaller questions:

A) How do I most efficiently construct a record array from a single array?
I want to do the following, but it segfaults on me when I try to print b:

    vtype = [('x', numpy.ndarray)]
    a = numpy.arange(0, 16).reshape(4,4)
    b = numpy.recarray((4), dtype=vtype, buf=a)
    print a
    print b

What is the most efficient way of constructing b from the values of a? In
real life a is (128, 512, 7) and I want b to be (128, 512), with the x
component being a 7-value numpy array.

B) If I'm vectorizing a function (single) to find the best match for a
single element of e within t, how do I pass the entire array t into the
function without having it parcelled down to its individual elements? i.e.

    def single(element, targets):
        nlen = element.shape[0]
        nvec = targets.data.shape[0]
        x = element.reshape(1, nlen).repeat(nvec, axis=0)
        diffs = ((x - targets.data)**2).sum(axis=1)
        diffs = numpy.sqrt(diffs)
        return numpy.argmin(diffs, axis=0)

    multiple = numpy.vectorize(single)
    x = multiple(all_elements, target)

where all_elements is similar to b in my first example and target is a 2-d
array. The above code doesn't work because target gets reduced to a single
element by the time it gets down to single, and I need to see the whole
array when I'm down in single.
I found a work-around by encapsulating target into a single object and
passing in the object, but I'm curious whether there's a better way of doing
this.

I hope I've explained myself better this time around,

Catherine
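One answer to question B that postdates this thread: numpy.vectorize accepts a gufunc-style signature argument that declares core dimensions, so targets arrives in single as the whole 2-d array instead of being parcelled into scalars, with no wrapper object needed. A sketch with small assumed sizes in place of (128, 512, 7) and (5000, 7):

```python
import numpy as np

def single(element, targets):
    # element is a whole 7-vector; targets is the whole (m, 7) array.
    d2 = ((targets - element) ** 2).sum(axis=1)
    return int(d2.argmin())

# '(n),(m,n)->()': loop over the leading dims of the first argument only;
# the second argument has no loop dims, so it is passed intact each call.
multiple = np.vectorize(single, signature='(n),(m,n)->()')

np.random.seed(0)
all_elements = np.random.rand(4, 5, 7)   # small stand-in for (128, 512, 7)
target = np.random.rand(50, 7)           # small stand-in for (5000, 7)
best = multiple(all_elements, target)    # (4, 5) array of target indices
```

Note that this is still a Python-level loop per 7-vector, so it answers the calling-convention question, not the performance one.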
Re: [Numpy-discussion] record arrays initialization
On May 2, 2012, at 3:23 PM, numpy-discussion-requ...@scipy.org wrote:

> A) How do I most efficiently construct a record array from a single
> array? I want to do the following, but it segfaults on me when I try to
> print b:
>
>     vtype = [('x', numpy.ndarray)]
>     a = numpy.arange(0, 16).reshape(4,4)
>     b = numpy.recarray((4), dtype=vtype, buf=a)

I prefer not to use record arrays, and stick to structured arrays:

    In [11]: vtype = np.dtype([('x', (np.float, 4))])
    In [12]: a = np.arange(16.).reshape((4,4))
    In [13]: a.view(vtype)
    Out[13]:
    array([[([0.0, 1.0, 2.0, 3.0],)],
           [([4.0, 5.0, 6.0, 7.0],)],
           [([8.0, 9.0, 10.0, 11.0],)],
           [([12.0, 13.0, 14.0, 15.0],)]],
          dtype=[('x', '<f8', (4,))])

Using structured arrays is making my code complex when I try to call the
vectorized function. If I stick to the original record arrays, what's the
best way of initializing b from a without doing a row-by-row copy?

Catherine
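Following up the structured-array suggestion in the quoted session: the view copies nothing, and a second view gives recarray-style attribute access if that is what the rest of the code expects. A sketch (np.float64 replaces the session's np.float, which is a deprecated alias in modern numpy):

```python
import numpy as np

vtype = np.dtype([('x', np.float64, (4,))])
a = np.arange(16.).reshape((4, 4))

b = a.view(vtype).reshape(4)    # no copy: b['x'][i] is row i of a
b_rec = b.view(np.recarray)     # attribute access: b_rec.x

# Both views share memory with a, so a write to a is visible through them.
a[0, 0] = 99.0
```

This initializes b from a with no row-by-row copy at all; for the real-life case, a (128, 512, 7) array views to a (128, 512) structured array with a 7-value 'x' field the same way.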