2008/10/9 David Bolme [EMAIL PROTECTED]:
I have written up a basic nearest neighbor algorithm. It does a brute
force search so it will be slower than kdtrees as the number of points
gets large. It should however work well for high dimensional data. I
have also added the option for user
I have written up a basic nearest neighbor algorithm. It does a brute
force search so it will be slower than kdtrees as the number of points
gets large. It should however work well for high dimensional data. I
have also added the option for user defined distance measures. The
user can
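
As a rough illustration of the idea described above (a brute-force search that accepts a user-defined distance measure), here is a minimal sketch. It is not the code from the post, which is not shown here; the names `brute_force_knn`, `euclidean`, and `manhattan` are hypothetical.

```python
import heapq
import math


def euclidean(a, b):
    """Straight-line (L2) distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """City-block (L1) distance between two equal-length sequences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def brute_force_knn(data, query, k=1, distance=euclidean):
    """Return the k closest (distance, index) pairs, nearest first.

    Hypothetical helper: every point is compared against the query,
    so the cost grows linearly with the number of points.
    """
    scored = ((distance(point, query), i) for i, point in enumerate(data))
    return heapq.nsmallest(k, scored)


points = [(3.0, 0.0), (2.0, 2.0)]
origin = (0.0, 0.0)

# The choice of distance measure can change which point is "nearest".
nearest_l2 = brute_force_knn(points, origin, k=1)
nearest_l1 = brute_force_knn(points, origin, k=1, distance=manhattan)
```

With these two points the Euclidean search picks (2, 2) while the city-block search picks (3, 0), which is why a pluggable distance argument matters.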
My version of a wrapper of ANN is attached. I wrote it when I had
some issues installing the scikits.ann package. It uses ctypes, and
might be useful in deciding on an API.
Please feel free to take what you like,
-Rob
ann.tar.gz
Description: GNU Zip compressed data
Rob
On Oct 8, 2008, at 1:36 PM, Anne Archibald wrote:
How important did you find the ability to select priority versus
non-priority search?
How important did you find the ability to select other splitting
rules?
In both cases, I included those things just because they were options
in ANN,
I remember reading a paper or book that stated that for data that has
been normalized, correlation and Euclidean distance are equivalent and will
produce the same knn results. To this end I spent a couple hours this
afternoon doing the math. This document is the result.
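
The attached document is not reproduced here, so the following is only a numerical check of the claim, not the author's derivation. It assumes "normalized" means each vector is shifted to zero mean and scaled to unit length; under that assumption the squared Euclidean distance equals 2(1 - r), where r is the Pearson correlation, so ranking by smallest distance and ranking by largest correlation agree. The function names are hypothetical.

```python
import math


def standardize(v):
    """Subtract the mean, then scale to unit Euclidean length (assumed normalization)."""
    mean = sum(v) / len(v)
    centered = [x - mean for x in v]
    norm = math.sqrt(sum(x * x for x in centered))
    return [x / norm for x in centered]


def correlation(a, b):
    """Pearson correlation as the dot product of the standardized vectors."""
    return sum(x * y for x, y in zip(standardize(a), standardize(b)))


def squared_euclidean(a, b):
    """Squared L2 distance between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


a = [1.0, 2.0, 4.0, 7.0]
b = [2.0, 1.0, 5.0, 3.0]

r = correlation(a, b)
d2 = squared_euclidean(standardize(a), standardize(b))

# For standardized vectors: |a - b|^2 = |a|^2 + |b|^2 - 2 a.b = 2 - 2r
identity_holds = math.isclose(d2, 2.0 * (1.0 - r), abs_tol=1e-12)
```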
2008/10/3 David Bolme [EMAIL PROTECTED]:
I remember reading a paper or book that stated that for data that has
been normalized, correlation and Euclidean distance are equivalent and will
produce the same knn results. To this end I spent a couple hours this
afternoon doing the math. This document is
I also like the idea of a scipy.spatial library. For the research I
do in machine learning and computer vision we are often interested in
specifying different distance measures. It would be nice to have a
way to specify the distance measure. I would like to see a standard
set included:
2008/10/2 David Bolme [EMAIL PROTECTED]:
I also like the idea of a scipy.spatial library. For the research I
do in machine learning and computer vision we are often interested in
specifying different distance measures. It would be nice to have a
way to specify the distance measure. I would
It may be useful to have an interface that handles both cases:
similarity and dissimilarity. Often I have seen Nearest Neighbor
algorithms that look for maximum similarity instead of minimum
distance. In my field (biometrics) we often deal with very
specialized distance or similarity
2008/10/2 David Bolme [EMAIL PROTECTED]:
It may be useful to have an interface that handles both cases:
similarity and dissimilarity. Often I have seen Nearest Neighbor
algorithms that look for maximum similarity instead of minimum
distance. In my field (biometrics) we often deal with very
On Mon, Sep 29, 2008 at 8:24 PM, Anne Archibald
[EMAIL PROTECTED] wrote:
Hi,
Once again there has been a thread on the numpy/scipy mailing lists
requesting (essentially) some form of spatial data structure. Pointers
have been posted to ANN (sadly LGPLed and in C++) as well as a handful
of
On Wed, Oct 1, 2008 at 1:26 AM, Gael Varoquaux
[EMAIL PROTECTED] wrote:
Absolutely. I just think k should default to None, when
distance_upper_bound is specified. k=None could be interpreted as k=1
when distance_upper_bound is not specified.
Why not expose the various possibilities through
2008/10/1 Gael Varoquaux [EMAIL PROTECTED]:
On Tue, Sep 30, 2008 at 06:10:46PM -0400, Anne Archibald wrote:
k=None in the third call to T.query seems redundant. It should be
possible to add some logic so that the call is simply
distances, indices = T.query(xs, distance_upper_bound=1.0)
2008/10/1 Barry Wark [EMAIL PROTECTED]:
Thanks for taking this on. The scikits.ann has licensing issues (as
noted above), so it would be nice to have a clean-room implementation
in scipy. I am happy to port the scikits.ann API to the final API that
you choose, however, if you think that would
distances, indices = T.query(xs) # single nearest neighbor
I'm not sure if it's implied, but can xs be an NxD matrix here, i.e. a
query for all N points rather than just one? This would reduce the
Python call overhead for large queries.
Also, I have some C++ code for locality sensitive hashing which
On Tue, Sep 30, 2008 at 5:10 AM, Christopher Barker
[EMAIL PROTECTED] wrote:
Anne Archibald wrote:
I suggest the creation of
a new submodule of scipy, scipy.spatial,
+1
Here's one to consider:
http://pypi.python.org/pypi/Rtree
and perhaps other stuff from:
2008/9/30 Peter [EMAIL PROTECTED]:
On Tue, Sep 30, 2008 at 5:10 AM, Christopher Barker
[EMAIL PROTECTED] wrote:
Anne Archibald wrote:
I suggest the creation of
a new submodule of scipy, scipy.spatial,
+1
Here's one to consider:
http://pypi.python.org/pypi/Rtree
and perhaps other stuff
On Tue, Sep 30, 2008 at 05:31:17PM -0400, Anne Archibald wrote:
T = KDTree(data)
distances, indices = T.query(xs) # single nearest neighbor
distances, indices = T.query(xs, k=10) # ten nearest neighbors
distances, indices = T.query(xs, k=None, distance_upper_bound=1.0) # all within 1 of x
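
To make the semantics of the three proposed calls concrete, here is a small sketch with the same keyword arguments. It is not a kd-tree and not the implementation under discussion: `BruteForceIndex` is a hypothetical stand-in that scans every point, takes a single query point rather than an array `xs`, treats the bound as inclusive, and returns shorter lists when fewer than k points fall within the bound. All of those choices are assumptions made for illustration.

```python
import heapq
import math


class BruteForceIndex:
    """Hypothetical stand-in exposing the proposed query signature."""

    def __init__(self, data):
        self.data = [tuple(p) for p in data]

    def query(self, x, k=1, distance_upper_bound=math.inf):
        """Return (distances, indices), nearest first.

        k=None means "every point within distance_upper_bound".
        """
        hits = []
        for i, p in enumerate(self.data):
            d = math.dist(p, x)  # Euclidean distance (Python 3.8+)
            if d <= distance_upper_bound:
                hits.append((d, i))
        if k is None:
            hits.sort()
        else:
            hits = heapq.nsmallest(k, hits)
        return [d for d, _ in hits], [i for _, i in hits]


T = BruteForceIndex([(0.0, 0.0), (0.5, 0.0), (2.0, 0.0), (0.0, 3.0)])
x = (0.0, 0.0)

single = T.query(x)                                  # single nearest neighbor
three = T.query(x, k=3)                              # three nearest neighbors
within = T.query(x, k=None, distance_upper_bound=1.0)  # all within 1 of x
capped = T.query(x, k=3, distance_upper_bound=1.0)     # at most 3, all within 1
```

The last call shows why k and distance_upper_bound are useful together: the bound limits how far the search looks, while k limits how many results come back.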
2008/9/30 Gael Varoquaux [EMAIL PROTECTED]:
On Tue, Sep 30, 2008 at 05:31:17PM -0400, Anne Archibald wrote:
T = KDTree(data)
distances, indices = T.query(xs) # single nearest neighbor
distances, indices = T.query(xs, k=10) # ten nearest neighbors
distances, indices = T.query(xs, k=None,
On Tue, Sep 30, 2008 at 6:10 PM, Anne Archibald
[EMAIL PROTECTED] wrote:
Well, the problem with this is that you often want to provide a
distance upper bound as well as a number of nearest neighbors. For
This use case is also important in scattered data interpolation, so we
definitely want to
On Tue, Sep 30, 2008 at 06:10:46PM -0400, Anne Archibald wrote:
k=None in the third call to T.query seems redundant. It should be
possible to add some logic so that the call is simply
distances, indices = T.query(xs, distance_upper_bound=1.0)
Well, the problem with this is that you often
Hi,
Once again there has been a thread on the numpy/scipy mailing lists
requesting (essentially) some form of spatial data structure. Pointers
have been posted to ANN (sadly LGPLed and in C++) as well as a handful
of pure-python implementations of kd-trees. I suggest the creation of
a new
Anne Archibald wrote:
I suggest the creation of
a new submodule of scipy, scipy.spatial,
+1
Here's one to consider:
http://pypi.python.org/pypi/Rtree
and perhaps other stuff from:
http://trac.gispython.org/spatialindex/wiki
which I think is LGPL -- can scipy use that?
By the way, a
On Mon, Sep 29, 2008 at 9:24 PM, Anne Archibald
[EMAIL PROTECTED] wrote:
Hi,
Once again there has been a thread on the numpy/scipy mailing lists
requesting (essentially) some form of spatial data structure. Pointers
have been posted to ANN (sadly LGPLed and in C++) as well as a handful
of
24 matches