----- Original Message ----
From: Christopher Barker <chris.bar...@noaa.gov>
To: Discussion of Numerical Python <numpy-discussion@scipy.org>
Sent: Thu, November 12, 2009 12:37:37 PM
Subject: Re: [Numpy-discussion] finding close together points.

Lou Pecora wrote:
> a KD tree for 2D nearest neighbor seems like over kill.  You
> might want to try the simple approach of using boxes of points to
> narrow things down by sorting on the first component.

yeah, we'll probably do something like that if we have to write the code 
ourselves. At the moment, we're using geohash:

http://pypi.python.org/pypi/Geohash/

(this is for points on the earth)

and it's working OK. I was just hoping kdtree would work out of the box!

> where for a
> static data set it can match KD trees in speed

Why does it have to be static -- it doesn't look hard to insert/remove 
points.

--- Lou answers ----------------------


It doesn't have to be static, but if I remember correctly, when you add points 
you have to resort or do a "smart" insert (which may require a lot of data 
"shifting").  Not hard, but the original paper claimed that if there are a lot 
of changes to the data set this becomes more expensive than the kd-tree.  If 
you can get the original paper I mentioned, it will have more info.  It's 
pretty clearly written and contains some nice comparisons to kd-trees for 
finding near neighbors.  

Bottom line is that you can still try it with changing data.  See what the run 
times are like.  I've used the approach for up to eight dimensions with about 
10^5 data points.  It worked nicely.  I remember looking for a kd-tree library 
that would work out of the box (C or C++), but everything I found was not as 
simple as I would have liked.  The slice searching was a nice solution.  But 
maybe I just didn't know where to look or I'm not understanding some of the 
libraries I did find.

Let us know how it goes.

-- Lou Pecora


      
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

Reply via email to