Hey Marc

LocalLucene has been rewritten since then to use a Cartesian grid for its bounding box lookups:
http://www.nsshutdown.com/projects/lucene/whitepaper/locallucene_v2.html

GeoHash is a method of hashing a coordinate to produce an id where the length of the id
determines the precision of the point, as in 123ab6789 might be (42.12345, -73.12345)
and 123ab would be (42.12, -73.12).
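
To make the prefix idea concrete, here is a minimal sketch of the publicly documented geohash encoding (interleaved longitude/latitude bits, base 32). The function name geohash_encode is my own choice for illustration and is not part of locallucene.

```python
# Geohash base-32 alphabet (digits plus lowercase letters, without a, i, l, o).
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash_encode(lat, lon, length):
    """Encode a point as a geohash of the given length (illustrative sketch)."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    bits = 0        # number of bits collected for the current character
    value = 0       # the 5-bit value being built
    use_lon = True  # bits alternate, starting with longitude
    while len(chars) < length:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                value = (value << 1) | 1
                lon_lo = mid
            else:
                value <<= 1
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                value = (value << 1) | 1
                lat_lo = mid
            else:
                value <<= 1
                lat_hi = mid
        use_lon = not use_lon
        bits += 1
        if bits == 5:
            chars.append(BASE32[value])
            bits = 0
            value = 0
    return "".join(chars)


full = geohash_encode(42.6, -5.6, 9)
short = geohash_encode(42.6, -5.6, 5)
# The shorter hash is a prefix of the longer one: fewer characters, bigger cell.
print(full, short, full.startswith(short))
```

Dropping characters from the end of the id never moves the point to a different cell, it only widens the cell around it.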

It's a great way to store individual points or areas in a compressed format, kind of like a tiny url to a particular point on the globe.

Locallucene works differently by placing points within boxes at different zoom levels.
At minimum zoom level 0 (_localTier0) everything exists within 1 box,
zoom level 1 it's 4 boxes
zoom level 2 it's 16 boxes
.....
zoom level 15 it's 1,073,741,824 boxes
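
The box counts above are just 4 to the power of the zoom level. As a rough sketch, assuming a plain split of the lat/long range into 2^level rows and columns (the real locallucene projection and box id notation are different, and both function names here are hypothetical):

```python
def boxes_at_level(level):
    """Number of boxes at a zoom level: each level splits every box in 4."""
    return 4 ** level


def box_for_point(lat, lon, level):
    """Return the (row, col) of the box holding a point.

    Assumption: a simple equal-angle grid over lat/long, used only to
    illustrate the idea. It is not the locallucene box id scheme.
    """
    cells = 2 ** level  # boxes per side at this level
    col = min(int((lon + 180.0) / 360.0 * cells), cells - 1)
    row = min(int((lat + 90.0) / 180.0 * cells), cells - 1)
    return row, col


for level in (0, 1, 2, 15):
    print(level, boxes_at_level(level), box_for_point(42.12345, -73.12345, level))
```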

Obviously the index will only contain box ids for the boxes that have points inside them (thus if you're indexing only
the land mass of the planet, you're only going to use at most 30% of those boxes).

Based on the radius of your search, locallucene will select the appropriate zoom level to find your results in.
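
One way to picture the level selection, as a sketch only: pick the deepest zoom level whose boxes are still at least as wide as the search diameter. The rule, the circumference constant and the name pick_level are my assumptions for illustration, not the exact logic in the locallucene code.

```python
# Assumption: approximate equatorial circumference of the Earth in miles.
EARTH_CIRCUMFERENCE_MILES = 24901.0


def pick_level(radius_miles, max_level=15):
    """Pick the deepest level whose box width covers the search diameter.

    Hypothetical selection rule for illustration; box width is taken
    as the circumference divided by the number of boxes per side.
    """
    for level in range(max_level, -1, -1):
        box_width = EARTH_CIRCUMFERENCE_MILES / (2 ** level)
        if box_width >= 2 * radius_miles:
            return level
    return 0  # search area wider than any box: fall back to the single box


for radius in (0.1, 5, 100, 20000):
    print(radius, pick_level(radius))
```

A small radius lands on a deep level with small boxes, so only a handful of box ids need to be looked up; a large radius falls back to the coarse levels.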

So while locallucene could benefit from changing our notation for box ids to something similar to geohash to reduce index size,
the concept for search is different. A couple of us are looking at including geohash in the locallucene code base. It would make
our distance calculation less memory intensive, since we would have to load only one field cache for a point rather than the 2 lat & long
fields we currently use, but I have to test the decoding speed to see if it slows us down.
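
For a feel of the decoding work involved, here is a minimal sketch of the standard geohash decode, which turns the single stored value back into a lat/long pair. This is the step whose speed would need testing; geohash_decode is a name I picked for illustration and not locallucene code.

```python
# Geohash base-32 alphabet (digits plus lowercase letters, without a, i, l, o).
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash_decode(geohash):
    """Decode a geohash to the (lat, lon) centre of its cell (sketch)."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    use_lon = True  # bits alternate, starting with longitude
    for ch in geohash:
        value = BASE32.index(ch)
        # Walk the 5 bits of each character, most significant first.
        for shift in range(4, -1, -1):
            bit = (value >> shift) & 1
            if use_lon:
                mid = (lon_lo + lon_hi) / 2
                if bit:
                    lon_lo = mid
                else:
                    lon_hi = mid
            else:
                mid = (lat_lo + lat_hi) / 2
                if bit:
                    lat_lo = mid
                else:
                    lat_hi = mid
            use_lon = not use_lon
    return (lat_lo + lat_hi) / 2, (lon_lo + lon_hi) / 2


lat, lon = geohash_decode("ezs42")
print(round(lat, 3), round(lon, 3))
```

Each character costs five halving steps, so decoding a 9 character id for every candidate document is the overhead to weigh against loading two separate field caches.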

GeoHash's main benefit comes in the form of lookup by id, say for an image or tile map at a point or for geocoding.
It probably has more benefits than that, and I'm sure someone will correct me on that.

I should also warn you that I'm the guy who wrote locallucene, so I have a natural bias towards it, but to be honest this is how I see
most geo searches working.

- P

squaro wrote:
Hello everybody

I would like to have your opinion about spatial search techniques using Lucene.

In your view, is it better to use LocalLucene
(http://www.nsshutdown.com/projects/lucene/whitepaper/locallucene.htm)
or to encode lat and long with Geohash (http://geohash.org/)
and then use a RangeFilter between the two boundary hashes?

In my mind I think using geohash should be better because the comparison is
done on one field only.

What is your opinion about it?

Best regards

Marc
  

--
Patrick O'Leary

AOL Local Search Technologies
Phone: + 1 703 265 8763

You see, wire telegraph is a kind of a very, very long cat. You pull his tail in New York and his head is meowing in Los Angeles.
 Do you understand this? 
And radio operates exactly the same way: you send signals here, they receive them there. The only difference is that there is no cat.
  - Albert Einstein