Re: [Numpy-discussion] Proposal: scipy.spatial

2008-10-12 Thread Anne Archibald
2008/10/9 David Bolme [EMAIL PROTECTED]: I have written up a basic nearest-neighbor algorithm. It does a brute-force search, so it will be slower than kd-trees as the number of points gets large. It should, however, work well for high-dimensional data. I have also added the option for user

Re: [Numpy-discussion] Proposal: scipy.spatial

2008-10-09 Thread David Bolme
I have written up a basic nearest-neighbor algorithm. It does a brute-force search, so it will be slower than kd-trees as the number of points gets large. It should, however, work well for high-dimensional data. I have also added the option for user-defined distance measures. The user can
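
For readers who want a concrete picture of the approach described in this message, here is a minimal sketch of a brute-force nearest-neighbor search with a pluggable distance function. It is not the code posted to the list; the names brute_force_knn, euclidean and manhattan are hypothetical and chosen only for illustration.

```python
import numpy as np


def euclidean(a, b):
    """Euclidean distance from every row of `a` to the vector `b`."""
    return np.sqrt(((a - b) ** 2).sum(axis=1))


def manhattan(a, b):
    """City-block distance, as an example of a user-defined measure."""
    return np.abs(a - b).sum(axis=1)


def brute_force_knn(data, x, k=1, distance=euclidean):
    """Return (distances, indices) of the k rows of `data` closest to `x`.

    `distance` is any callable mapping (data, x) to one value per row.
    Every point is examined, so the cost is O(N) per query regardless
    of the dimension (hypothetical helper, not the list's code).
    """
    data = np.asarray(data, dtype=float)
    x = np.asarray(x, dtype=float)
    d = distance(data, x)
    order = np.argsort(d, kind="stable")[:k]  # smallest distances first
    return d[order], order


data = np.array([[0.0, 0.0], [1.0, 1.0], [3.0, 0.0]])
dist_e, idx_e = brute_force_knn(data, [2.0, 0.0], k=2)
dist_m, idx_m = brute_force_knn(data, [2.0, 0.0], k=2, distance=manhattan)
```

Because the search never assumes anything about the metric, swapping in a different callable is all that is needed to change the distance measure.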

Re: [Numpy-discussion] Proposal: scipy.spatial

2008-10-08 Thread Rob Hetland
My version of a wrapper of ANN is attached. I wrote it when I had some issues installing the scikits.ann package. It uses ctypes, and might be useful in deciding on an API. Please feel free to take what you like, -Rob [Attachment: ann.tar.gz, GNU Zip compressed data] Rob

Re: [Numpy-discussion] Proposal: scipy.spatial

2008-10-08 Thread Rob Hetland
On Oct 8, 2008, at 1:36 PM, Anne Archibald wrote: How important did you find the ability to select priority versus non-priority search? How important did you find the ability to select other splitting rules? In both cases, I included those things just because they were options in ANN,

Re: [Numpy-discussion] Proposal: scipy.spatial

2008-10-03 Thread David Bolme
I remember reading a paper or book that stated that, for data that has been normalized, correlation and Euclidean distance are equivalent and will produce the same knn results. To this end I spent a couple of hours this afternoon doing the math. This document is the result.
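
The attached document is not reproduced in this archive, but the standard identity behind the claim is easy to check numerically. Assuming "normalized" means each vector has zero mean and unit Euclidean norm (an assumption, since the message does not say), the squared Euclidean distance is 2 * (1 - r), where r is the Pearson correlation, so ranking neighbors by smallest distance or by largest correlation gives the same order. The helper name standardize is hypothetical.

```python
import numpy as np


def standardize(v):
    """Subtract the mean and scale to unit Euclidean norm (assumed normalization)."""
    v = np.asarray(v, dtype=float)
    v = v - v.mean()
    return v / np.linalg.norm(v)


rng = np.random.default_rng(0)
x = standardize(rng.normal(size=50))
y = standardize(rng.normal(size=50))

r = float(x @ y)                       # Pearson correlation of standardized vectors
sq_dist = float(((x - y) ** 2).sum())  # squared Euclidean distance

# |x - y|^2 = |x|^2 + |y|^2 - 2 x.y = 2 - 2 r for unit-norm x and y
identity_holds = abs(sq_dist - 2.0 * (1.0 - r)) < 1e-12
```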

Re: [Numpy-discussion] Proposal: scipy.spatial

2008-10-03 Thread Anne Archibald
2008/10/3 David Bolme [EMAIL PROTECTED]: I remember reading a paper or book that stated that, for data that has been normalized, correlation and Euclidean distance are equivalent and will produce the same knn results. To this end I spent a couple of hours this afternoon doing the math. This document is

Re: [Numpy-discussion] Proposal: scipy.spatial

2008-10-02 Thread David Bolme
I also like the idea of a scipy.spatial library. For the research I do in machine learning and computer vision we are often interested in specifying different distance measures. It would be nice to have a way to specify the distance measure. I would like to see a standard set included:

Re: [Numpy-discussion] Proposal: scipy.spatial

2008-10-02 Thread Matthieu Brucher
2008/10/2 David Bolme [EMAIL PROTECTED]: I also like the idea of a scipy.spatial library. For the research I do in machine learning and computer vision we are often interested in specifying different distance measures. It would be nice to have a way to specify the distance measure. I would

Re: [Numpy-discussion] Proposal: scipy.spatial

2008-10-02 Thread David Bolme
It may be useful to have an interface that handles both cases: similarity and dissimilarity. Often I have seen Nearest Neighbor algorithms that look for maximum similarity instead of minimum distance. In my field (biometrics) we often deal with very specialized distance or similarity

Re: [Numpy-discussion] Proposal: scipy.spatial

2008-10-02 Thread Anne Archibald
2008/10/2 David Bolme [EMAIL PROTECTED]: It may be useful to have an interface that handles both cases: similarity and dissimilarity. Often I have seen Nearest Neighbor algorithms that look for maximum similarity instead of minimum distance. In my field (biometrics) we often deal with very

Re: [Numpy-discussion] Proposal: scipy.spatial

2008-10-01 Thread Barry Wark
On Mon, Sep 29, 2008 at 8:24 PM, Anne Archibald [EMAIL PROTECTED] wrote: Hi, Once again there has been a thread on the numpy/scipy mailing lists requesting (essentially) some form of spatial data structure. Pointers have been posted to ANN (sadly LGPLed and in C++) as well as a handful of

Re: [Numpy-discussion] Proposal: scipy.spatial

2008-10-01 Thread Nathan Bell
On Wed, Oct 1, 2008 at 1:26 AM, Gael Varoquaux [EMAIL PROTECTED] wrote: Absolutely. I just think k should default to None when distance_upper_bound is specified. k=None could be interpreted as k=1 when distance_upper_bound is not specified. Why not expose the various possibilities through

Re: [Numpy-discussion] Proposal: scipy.spatial

2008-10-01 Thread Anne Archibald
2008/10/1 Gael Varoquaux [EMAIL PROTECTED]: On Tue, Sep 30, 2008 at 06:10:46PM -0400, Anne Archibald wrote: k=None in the third call to T.query seems redundant. It should be possible to put in some logic so that the call is simply distances, indices = T.query(xs, distance_upper_bound=1.0)

Re: [Numpy-discussion] Proposal: scipy.spatial

2008-10-01 Thread Anne Archibald
2008/10/1 Barry Wark [EMAIL PROTECTED]: Thanks for taking this on. The scikits.ann has licensing issues (as noted above), so it would be nice to have a clean-room implementation in scipy. I am happy to port the scikits.ann API to the final API that you choose, however, if you think that would

Re: [Numpy-discussion] Proposal: scipy.spatial

2008-10-01 Thread James Philbin
distances, indices = T.query(xs) # single nearest neighbor
I'm not sure if it's implied, but can xs be an NxD matrix here, i.e. query for all N points rather than just one? This will reduce the Python call overhead for large queries. Also, I have some C++ code for locality sensitive hashing which

Re: [Numpy-discussion] Proposal: scipy.spatial

2008-09-30 Thread Peter
On Tue, Sep 30, 2008 at 5:10 AM, Christopher Barker [EMAIL PROTECTED] wrote: Anne Archibald wrote: I suggest the creation of a new submodule of scipy, scipy.spatial, +1 Here's one to consider: http://pypi.python.org/pypi/Rtree and perhaps other stuff from:

Re: [Numpy-discussion] Proposal: scipy.spatial

2008-09-30 Thread Anne Archibald
2008/9/30 Peter [EMAIL PROTECTED]: On Tue, Sep 30, 2008 at 5:10 AM, Christopher Barker [EMAIL PROTECTED] wrote: Anne Archibald wrote: I suggest the creation of a new submodule of scipy, scipy.spatial, +1 Here's one to consider: http://pypi.python.org/pypi/Rtree and perhaps other stuff

Re: [Numpy-discussion] Proposal: scipy.spatial

2008-09-30 Thread Gael Varoquaux
On Tue, Sep 30, 2008 at 05:31:17PM -0400, Anne Archibald wrote:
    T = KDTree(data)
    distances, indices = T.query(xs) # single nearest neighbor
    distances, indices = T.query(xs, k=10) # ten nearest neighbors
    distances, indices = T.query(xs, k=None, distance_upper_bound=1.0) # all within 1 of x
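
For comparison, the sketch below runs similar calls against scipy.spatial.KDTree as it exists in current SciPy releases, which may differ from the interface under discussion in this thread. In particular, it does not pass k=None; it uses query_ball_point for the "all within 1 of x" case and shows how a bounded k query marks missing neighbors. The sample points are made up for illustration.

```python
import numpy as np
from scipy.spatial import KDTree

# Four made-up points in 2-D and two query points (an N x D array).
data = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [5.0, 5.0]])
xs = np.array([[0.1, 0.0], [4.0, 4.0]])

T = KDTree(data)

# Single nearest neighbor for each row of xs.
dist1, idx1 = T.query(xs)

# Two nearest neighbors; results have shape (2, 2).
dist2, idx2 = T.query(xs, k=2)

# At most three neighbors, none farther than 1.0. Slots with no
# neighbor get an infinite distance and an index equal to len(data).
dist3, idx3 = T.query(xs, k=3, distance_upper_bound=1.0)

# All neighbors within radius 1.0, one list of indices per query point.
within = T.query_ball_point(xs, r=1.0)
```

Passing the whole xs array in one call also covers the batched case raised elsewhere in the thread, since each row is treated as a separate query point.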

Re: [Numpy-discussion] Proposal: scipy.spatial

2008-09-30 Thread Anne Archibald
2008/9/30 Gael Varoquaux [EMAIL PROTECTED]: On Tue, Sep 30, 2008 at 05:31:17PM -0400, Anne Archibald wrote:
    T = KDTree(data)
    distances, indices = T.query(xs) # single nearest neighbor
    distances, indices = T.query(xs, k=10) # ten nearest neighbors
    distances, indices = T.query(xs, k=None,

Re: [Numpy-discussion] Proposal: scipy.spatial

2008-09-30 Thread Nathan Bell
On Tue, Sep 30, 2008 at 6:10 PM, Anne Archibald [EMAIL PROTECTED] wrote: Well, the problem with this is that you often want to provide a distance upper bound as well as a number of nearest neighbors. For This use case is also important in scattered data interpolation, so we definitely want to

Re: [Numpy-discussion] Proposal: scipy.spatial

2008-09-30 Thread Gael Varoquaux
On Tue, Sep 30, 2008 at 06:10:46PM -0400, Anne Archibald wrote: k=None in the third call to T.query seems redundant. It should be possible to put in some logic so that the call is simply distances, indices = T.query(xs, distance_upper_bound=1.0) Well, the problem with this is that you often

[Numpy-discussion] Proposal: scipy.spatial

2008-09-29 Thread Anne Archibald
Hi, Once again there has been a thread on the numpy/scipy mailing lists requesting (essentially) some form of spatial data structure. Pointers have been posted to ANN (sadly LGPLed and in C++) as well as a handful of pure-python implementations of kd-trees. I suggest the creation of a new

Re: [Numpy-discussion] Proposal: scipy.spatial

2008-09-29 Thread Christopher Barker
Anne Archibald wrote: I suggest the creation of a new submodule of scipy, scipy.spatial, +1 Here's one to consider: http://pypi.python.org/pypi/Rtree and perhaps other stuff from: http://trac.gispython.org/spatialindex/wiki which I think is LGPL -- can scipy use that? By the way, a

Re: [Numpy-discussion] Proposal: scipy.spatial

2008-09-29 Thread Charles R Harris
On Mon, Sep 29, 2008 at 9:24 PM, Anne Archibald [EMAIL PROTECTED] wrote: Hi, Once again there has been a thread on the numpy/scipy mailing lists requesting (essentially) some form of spatial data structure. Pointers have been posted to ANN (sadly LGPLed and in C++) as well as a handful of