Re: [Numpy-discussion] record arrays initialization
On Wed, May 2, 2012 at 4:46 PM, Kevin Jacobs <jac...@bioinformed.com> wrote:

> The cKDTree implementation is more than 4 times faster than the
> brute-force approach:
>
> T = scipy.spatial.cKDTree(targets)
>
> In [11]: %timeit foo1(element, targets)  # Brute force
> 1000 loops, best of 3: 385 us per loop
>
> In [12]: %timeit foo2(element, T)        # cKDTree
> 1 loops, best of 3: 83.5 us per loop
>
> In [13]: 385/83.5
> Out[13]: 4.610778443113772

Brute force plus Cython beats cKDTree for knn with k=1 and Euclidean distance:

I[5] targets = np.random.uniform(0, 8, (5000, 7))
I[6] element = np.arange(1, 8, dtype=np.float64)
I[7] T = scipy.spatial.cKDTree(targets)
I[8] timeit T.query(element)
1 loops, best of 3: 36.1 us per loop
I[9] timeit nn(targets, element)
1 loops, best of 3: 28.5 us per loop

What about lower dimensions (2 instead of 7), where cKDTree gets faster?

I[18] element = np.arange(1, 3, dtype=np.float64)
I[19] targets = np.random.uniform(0, 8, (5000, 2))
I[20] T = scipy.spatial.cKDTree(targets)
I[21] timeit T.query(element)
1 loops, best of 3: 27.5 us per loop
I[22] timeit nn(targets, element)
10 loops, best of 3: 11.6 us per loop

I could add nn to the bottleneck package. Is the k=1, Euclidean-distance case too specialized? Prototype (messy) code is here:

https://github.com/kwgoodman/bottleneck/issues/45

For a smaller speedup (~2x), foo1 could use bottleneck's sum-of-squares function, bn.ss().

_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
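The `nn` being timed above is the Cython prototype linked at the bottleneck issue; a pure-NumPy sketch of the same k=1, Euclidean-distance search (helper name mine, array sizes taken from the timings above) might look like this:

```python
import numpy as np

def nn_bruteforce(targets, element):
    """Return (index, distance) of the row of `targets` nearest to the
    1-D vector `element`. Works on squared distances throughout and
    takes a single square root at the end."""
    d2 = ((targets - element) ** 2).sum(axis=1)  # squared Euclidean distances
    i = int(d2.argmin())
    return i, float(np.sqrt(d2[i]))

np.random.seed(0)
targets = np.random.uniform(0, 8, (5000, 7))
element = np.arange(1, 8, dtype=np.float64)
i, d = nn_bruteforce(targets, element)
```

The Cython version wins by fusing the subtract/square/sum loop and short-circuiting the argmin scan, but the result is the same.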
Re: [Numpy-discussion] record arrays initialization
On Thu, 3 May 2012, Keith Goodman <kwgood...@gmail.com> wrote:

> The cKDTree implementation is more than 4 times faster than the
> brute-force approach [...] For a smaller speedup (~2x), foo1 could
> use bottleneck's sum-of-squares function, bn.ss().

Thank you everybody for your advice. I will actually be clustering the dataset for use in the production code (which must be in Fortran), and using scipy.cluster.vq.kmeans2 to generate and populate the clusters. Even though a brute-force search in Fortran is fast, it will still be too slow for the final code. But the kmeans2 clustering isn't perfect, in that it doesn't always give identical results to the brute-force search, so I want to do an analysis of the differences on a test data set.

Catherine
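The clustered search described here might be sketched as below. The cluster count (50), the `minit='points'` seeding, and the helper names are assumptions, not Catherine's actual code; and, as she notes, the answer can differ from the exact brute-force result whenever the true nearest neighbour lives in a different cluster than the nearest centroid:

```python
import numpy as np
from scipy.cluster.vq import kmeans2

np.random.seed(0)
targets = np.random.uniform(0, 8, (5000, 7))

# Cluster the targets once, up front: `centroids` is (50, 7) and
# labels[i] is the cluster that targets[i] was assigned to.
centroids, labels = kmeans2(targets, 50, minit='points')

def nn_clustered(element):
    """Approximate nearest neighbour: search only the members of the
    cluster whose centroid is closest to `element`."""
    c = ((centroids - element) ** 2).sum(axis=1).argmin()
    members = np.flatnonzero(labels == c)
    if members.size == 0:  # empty cluster: fall back to a full scan
        members = np.arange(len(targets))
    best = ((targets[members] - element) ** 2).sum(axis=1).argmin()
    return int(members[best])

idx = nn_clustered(np.arange(1, 8, dtype=np.float64))
```

Comparing `nn_clustered` against a full argmin over `targets` on a test set is exactly the differences analysis described above.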
[Numpy-discussion] record arrays initialization
Thanks to Perry for some very useful off-list conversation. I realize that I wasn't being clear at all in my earlier description of the problem, so here it is in a nutshell:

Find the best match in an array t(5000, 7) for a single vector e(7). Now scale it up so e is (128, 512, 7) and I want to return a (128, 512) array of the t-identifiers that are the best match for e. Best match is defined as the minimum Euclidean distance.

I'm going to try three ways: (a) brute force and lots of looping in python, (b) constructing a function to find the match for a single instance of e and vectorizing it, and (c) coding it in Fortran. I'll be curious to see the performance figures.

Two smaller questions:

A) How do I most efficiently construct a record array from a single array? I want to do the following, but it segfaults on me when I try to print b:

vtype = [('x', numpy.ndarray)]
a = numpy.arange(0, 16).reshape(4,4)
b = numpy.recarray((4), dtype=vtype, buf=a)
print a
print b

What is the most efficient way of constructing b from the values of a? In real life, a is (128*512*7) and I want b to be (128, 512) with the x component being a 7-value numpy array.

and B) If I'm vectorizing a function (single) to find the best match for a single element of e within t, how do I pass the entire array t into the function without having it parcelled down to its individual elements? i.e.

def single(element, targets):
    nlen = element.shape[0]
    nvec = targets.data.shape[0]
    x = element.reshape(1, nlen).repeat(nvec, axis=0)
    diffs = ((x - targets.data)**2).sum(axis=1)
    diffs = numpy.sqrt(diffs)
    return numpy.argmin(diffs, axis=0)

multiple = numpy.vectorize(single)
x = multiple(all_elements, target)

where all_elements is similar to b in my first example, and target is a 2-d array. The above code doesn't work because target gets reduced to a single element when it gets down to single, and I need to see the whole array when I'm down in single.

I found a work-around by encapsulating target into a single object and passing in the object, but I'm curious if there's a better way of doing this.

I hope I've explained myself better this time around,
Catherine
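For question B, newer NumPy versions offer a direct alternative to the wrapper object: `np.vectorize` accepts a gufunc-style `signature` that tells it which trailing axes are "core" dimensions to pass through whole, rather than parcelling everything down to scalars. It still loops in Python (so no speedup over plain looping), but it solves exactly the whole-array-passing problem. A sketch, with plain float arrays rather than the record arrays from question A:

```python
import numpy as np

def single(element, targets):
    """Index of the row of `targets` nearest to the 1-D `element`."""
    diffs = np.sqrt(((element - targets) ** 2).sum(axis=1))
    return np.argmin(diffs)

# '(d),(n,d)->()' means: loop over all leading axes, but hand `single`
# one whole (d,) element and the whole (n, d) target array per call.
multiple = np.vectorize(single, signature='(d),(n,d)->()')

np.random.seed(0)
targets = np.random.uniform(0, 8, (10, 7))
all_elements = np.random.uniform(0, 8, (2, 3, 7))
matches = multiple(all_elements, targets)  # shape (2, 3) of target indices
```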
Re: [Numpy-discussion] record arrays initialization
On Wed, May 2, 2012 at 2:45 PM, Moroney, Catherine M (388D) <catherine.m.moro...@jpl.nasa.gov> wrote:

> Find the best match in an array t(5000, 7) for a single vector e(7).
> Now scale it up so e is (128, 512, 7) and I want to return a
> (128, 512) array of the t-identifiers that are the best match for e.
> Best match is defined as the minimum Euclidean distance.
>
> I'm going to try three ways: (a) brute force and lots of looping in
> python, (b) constructing a function to find the match for a single
> instance of e and vectorizing it, and (c) coding it in Fortran. I'll
> be curious to see the performance figures.

I'd use a mixture of (a) and (b): break the t(N, 7) up into blocks of, say, (1000, 7), compute the best match in each using broadcasting, and then combine your results to find the best of the best. This strategy should be best for very large N. For moderate N, where broadcasting easily fits into memory, the answer already given in reply to your original email would do the trick.

> A) How do I most efficiently construct a record array from a single
> array? I want to do the following, but it segfaults on me when I try
> to print b.
>
> vtype = [('x', numpy.ndarray)]
> a = numpy.arange(0, 16).reshape(4,4)
> b = numpy.recarray((4), dtype=vtype, buf=a)

I prefer not to use record arrays, and stick to structured arrays:

In [11]: vtype = np.dtype([('x', (np.float, 4))])

In [12]: a = np.arange(16.).reshape((4,4))

In [13]: a.view(vtype)
Out[13]:
array([[([0.0, 1.0, 2.0, 3.0],)],
       [([4.0, 5.0, 6.0, 7.0],)],
       [([8.0, 9.0, 10.0, 11.0],)],
       [([12.0, 13.0, 14.0, 15.0],)]],
      dtype=[('x', 'f8', (4,))])

> B) If I'm vectorizing a function (single) to find the best match for
> a single element of e within t, how do I pass the entire array t into
> the function without having it parcelled down to its individual
> elements?

I think the new dtype just makes your life more difficult here. Simply do:

In [49]: np.sum(a - elements.T, axis=1)
Out[49]: array([  0.,  16.,  32.,  48.])

Stéfan
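The block strategy above can be sketched for plain 2-D arrays as follows; the block size of 1000 is the one suggested in the message, and the helper name is an assumption:

```python
import numpy as np

def nearest_blocked(elements, targets, block=1000):
    """For each row of `elements` (m, d), return the index of the
    nearest row of `targets` (n, d), scanning `targets` block by
    block so the (m, block) distance matrix stays small."""
    m = elements.shape[0]
    best_d2 = np.full(m, np.inf)
    best_i = np.zeros(m, dtype=np.intp)
    for start in range(0, targets.shape[0], block):
        t = targets[start:start + block]
        # (m, len(t)) squared distances via broadcasting
        d2 = ((elements[:, None, :] - t[None, :, :]) ** 2).sum(axis=2)
        i = d2.argmin(axis=1)
        d = d2[np.arange(m), i]
        better = d < best_d2          # keep the best of the best
        best_d2[better] = d[better]
        best_i[better] = i[better] + start
    return best_i

np.random.seed(0)
targets = np.random.uniform(0, 8, (5000, 7))
elements = np.random.uniform(0, 8, (64, 7))
idx = nearest_blocked(elements, targets)
```

Peak memory is roughly m * block floats instead of m * N, which is what makes this viable for a (128*512, 7) element set.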
Re: [Numpy-discussion] record arrays initialization
On Wed, May 2, 2012 at 5:45 PM, Moroney, Catherine M (388D) <catherine.m.moro...@jpl.nasa.gov> wrote:

> Thanks to Perry for some very useful off-list conversation. I realize
> that I wasn't being clear at all in my earlier description of the
> problem, so here it is in a nutshell:
>
> Find the best match in an array t(5000, 7) for a single vector e(7).
> Now scale it up so e is (128, 512, 7) and I want to return a
> (128, 512) array of the t-identifiers that are the best match for e.
> Best match is defined as the minimum Euclidean distance.

It sounds like you want to find the nearest neighbor to a point in a high-dimensional space. This sounds like a job for a spatial data structure like a KD-tree. See:

http://docs.scipy.org/doc/scipy/reference/spatial.html
http://mloss.org/software/view/143/
http://www.mrzv.org/software/pyANN/

etc.

-Kevin
Re: [Numpy-discussion] record arrays initialization
On Wed, May 2, 2012 at 5:27 PM, Kevin Jacobs <jac...@bioinformed.com> wrote:

> On Wed, May 2, 2012 at 5:45 PM, Moroney, Catherine M (388D)
> <catherine.m.moro...@jpl.nasa.gov> wrote:
>> Find the best match in an array t(5000, 7) for a single vector e(7).
>> Now scale it up so e is (128, 512, 7) and I want to return a
>> (128, 512) array of the t-identifiers that are the best match for e.
>> Best match is defined as the minimum Euclidean distance.
>
> It sounds like you want to find the nearest neighbor to a point in a
> high-dimensional space. This sounds like a job for a spatial data
> structure like a KD-tree.

In general this is a good suggestion - I was going to mention it earlier - but I think for this particular problem it is not better than the brute-force and argmin() NumPy approach. On my laptop, the KDTree query is about a factor of 7 slower (ignoring the time cost to create the KDTree):

In [45]: def foo1(element, targets):
   ....:     distance_squared = ((element - targets)**2).sum(1)
   ....:     min_index = distance_squared.argmin()
   ....:     return sqrt(distance_squared[min_index]), min_index

In [46]: def foo2(element, T):
   ....:     return T.query(element)

In [47]: element = np.arange(1,8)

In [48]: targets = np.random.uniform(0,8,(5000,7))

In [49]: T = scipy.spatial.KDTree(targets)

In [50]: %timeit foo1(element, targets)
1000 loops, best of 3: 401 us per loop

In [51]: %timeit foo2(element, T)
100 loops, best of 3: 2.92 ms per loop

Just to make sure they say the same thing:

In [53]: foo1(element, targets)
Out[53]: (1.8173671152898632, 570)

In [54]: foo2(element, T)
Out[54]: (1.8173671152898632, 570)

I think KDTree is better suited to larger searches (more than 5000 elements) and fewer dimensions. For example, with 500,000 elements and 2 dimensions, I get 34 ms for NumPy and 2 ms for the KDTree.

Back to the original question: at 400 us per search, even over 128x512 elements that would be 26 seconds total, I think? That might not be too bad.

Cheers,
Aronne
[Numpy-discussion] record arrays initialization
On May 2, 2012, at 3:23 PM, numpy-discussion-requ...@scipy.org wrote:

>> A) How do I most efficiently construct a record array from a single
>> array? I want to do the following, but it segfaults on me when I try
>> to print b.
>>
>> vtype = [('x', numpy.ndarray)]
>> a = numpy.arange(0, 16).reshape(4,4)
>> b = numpy.recarray((4), dtype=vtype, buf=a)
>
> I prefer not to use record arrays, and stick to structured arrays:
>
> In [11]: vtype = np.dtype([('x', (np.float, 4))])
> In [12]: a = np.arange(16.).reshape((4,4))
> In [13]: a.view(vtype)
> Out[13]:
> array([[([0.0, 1.0, 2.0, 3.0],)],
>        [([4.0, 5.0, 6.0, 7.0],)],
>        [([8.0, 9.0, 10.0, 11.0],)],
>        [([12.0, 13.0, 14.0, 15.0],)]],
>       dtype=[('x', 'f8', (4,))])

Using structured arrays is making my code complex when I try to call the vectorized function. If I stick to the original record arrays, what's the best way of initializing b from a without doing a row-by-row copy?

Catherine
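A copy-free answer along the lines of Stéfan's suggestion: a `.view` re-interprets the trailing axis of `a` as one record without touching the data, and a further `.view(np.recarray)` restores record-array attribute access if that is what the rest of the code expects. The shapes below are the real-life (128, 512, 7) case from the original post; the variable names are illustrative:

```python
import numpy as np

vtype = np.dtype([('x', np.float64, (7,))])
a = np.random.uniform(0, 8, (128, 512, 7))

# The view consumes the last axis (7 float64s become one 'x' record);
# no element is copied, and `a` must be C-contiguous for this to work.
b = a.view(vtype).reshape(128, 512)
rec = b.view(np.recarray)   # optional: rec.x instead of b['x']
```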
Re: [Numpy-discussion] record arrays initialization
On Wed, May 2, 2012 at 7:25 PM, Aronne Merrelli <aronne.merre...@gmail.com> wrote:

> In general this is a good suggestion - I was going to mention it
> earlier - but I think for this particular problem it is not better
> than the brute-force and argmin() NumPy approach. On my laptop, the
> KDTree query is about a factor of 7 slower (ignoring the time cost
> to create the KDTree)

The cKDTree implementation is more than 4 times faster than the brute-force approach:

T = scipy.spatial.cKDTree(targets)

In [11]: %timeit foo1(element, targets)  # Brute force
1000 loops, best of 3: 385 us per loop

In [12]: %timeit foo2(element, T)        # cKDTree
1 loops, best of 3: 83.5 us per loop

In [13]: 385/83.5
Out[13]: 4.610778443113772

A FLANN implementation should be even faster--perhaps by as much as another factor of two.

-Kevin
Re: [Numpy-discussion] record arrays initialization
On Wed, May 2, 2012 at 4:26 PM, Moroney, Catherine M (388D) <catherine.m.moro...@jpl.nasa.gov> wrote:

> Using structured arrays is making my code complex when I try to call
> the vectorized function. If I stick to the original record arrays,
> what's the best way of initializing b from a without doing a
> row-by-row copy?

What does your original data look like? It seems like `a` is already what you need after the reshape?

Stéfan
Re: [Numpy-discussion] record arrays initialization
On Wed, May 2, 2012 at 4:46 PM, Kevin Jacobs <jac...@bioinformed.com> wrote:

> A FLANN implementation should be even faster--perhaps by as much as
> another factor of two.

I guess it depends on whether you care about the Approximate in Fast Library for Approximate Nearest Neighbors.

Stéfan
Re: [Numpy-discussion] record arrays initialization
On Wednesday, May 2, 2012, Stéfan van der Walt wrote:

> On Wed, May 2, 2012 at 4:46 PM, Kevin Jacobs <jac...@bioinformed.com>
> wrote:
>> A FLANN implementation should be even faster--perhaps by as much as
>> another factor of two.
>
> I guess it depends on whether you care about the Approximate in Fast
> Library for Approximate Nearest Neighbors.

This is why I love following these lists! I don't think I ever would have come across this method on my own. Nifty!

Ben Root
Re: [Numpy-discussion] record arrays initialization
On Wed, May 2, 2012 at 6:46 PM, Kevin Jacobs <jac...@bioinformed.com> wrote:

> On Wed, May 2, 2012 at 7:25 PM, Aronne Merrelli
> <aronne.merre...@gmail.com> wrote:
>> In general this is a good suggestion - I was going to mention it
>> earlier - but I think for this particular problem it is not better
>> than the brute-force and argmin() NumPy approach. On my laptop, the
>> KDTree query is about a factor of 7 slower (ignoring the time cost
>> to create the KDTree)
>
> The cKDTree implementation is more than 4 times faster than the
> brute-force approach:
>
> T = scipy.spatial.cKDTree(targets)
>
> In [11]: %timeit foo1(element, targets)  # Brute force
> 1000 loops, best of 3: 385 us per loop
>
> In [12]: %timeit foo2(element, T)        # cKDTree
> 1 loops, best of 3: 83.5 us per loop
>
> In [13]: 385/83.5
> Out[13]: 4.610778443113772

Wow, not sure how I missed that! It even seems to scale better than linearly (some of that 83.5 us is call overhead, I guess):

In [35]: %timeit foo2(element, T)
1 loops, best of 3: 115 us per loop

In [36]: elements = np.random.uniform(0,8,[128,512,7])

In [37]: %timeit foo2(elements.reshape((128*512,7)), T)
1 loops, best of 3: 2.66 s per loop

So only 2.7 seconds to search the whole set. Not bad!

Cheers,
Aronne
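The whole-set query above, written out as a sketch: cKDTree.query accepts a 2-D array of query points, so the (128, 512, 7) element array is flattened for the query and the results reshaped back to (128, 512), which is exactly the array of target identifiers the original post asked for. Array contents are random stand-ins:

```python
import numpy as np
from scipy.spatial import cKDTree

np.random.seed(0)
targets = np.random.uniform(0, 8, (5000, 7))
T = cKDTree(targets)

elements = np.random.uniform(0, 8, (128, 512, 7))
# Query all 128*512 points in one call, then restore the 2-D layout.
dist, idx = T.query(elements.reshape(128 * 512, 7))
dist = dist.reshape(128, 512)   # distance to the nearest target
idx = idx.reshape(128, 512)     # index (identifier) of that target
```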