[ https://issues.apache.org/jira/browse/LUCENE-6450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14527338#comment-14527338 ]
Michael McCandless commented on LUCENE-6450:
--------------------------------------------

Here's the OSM subset I'm using for the benchmarks:

    http://people.apache.org/~mikemccand/latlon.subsetPlusAllLondon.txt.lzma

It's a random 1/50th of the latest OSM export (as of last week), but includes all points within London, UK. The search benchmark then runs a fixed set (225 total) of axis-aligned rectangle intersects queries around London. Look for Index/SearchOSM/GeoPoint.java/py in luceneutil...

I ran the same benchmarks (except for Packed/QuadPrefixTree):

*Geopoint*

    Index time: 157.3 sec (incl. forceMerge)
    Index size: 1.8 GB
    Mean query time: .077 sec
    221,119,062 total hits

*GeoHashPrefixTree*

    Index time: 628.5 sec (incl. forceMerge)
    Index size: 4.2 GB
    Mean query time: .039 sec
    221,120,027 total hits

*libspatialindex* (using Python Rtree wrapper)

    Index time: 469.6 sec
    Index size: 2.6 GB
    Mean query time: .158 sec
    221,118,844 total hits

The first geopoint patch here got exactly the same total hit count as libspatialindex, but now it's different, I think because of the precision control to control how deep the ranges recurse.

I think it's also expected geohash won't get the same hit count since it's doing a bit of quantizing (level 11 ... not sure what that equates to in meters).

I'm surprised the Rtree impl is so slow ...

> Add simple encoded GeoPointField type to core
> ---------------------------------------------
>
>                 Key: LUCENE-6450
>                 URL: https://issues.apache.org/jira/browse/LUCENE-6450
>             Project: Lucene - Core
>          Issue Type: New Feature
>    Affects Versions: Trunk, 5.x
>            Reporter: Nicholas Knize
>            Priority: Minor
>         Attachments: LUCENE-6450-5x.patch, LUCENE-6450-TRUNK.patch,
> LUCENE-6450.patch, LUCENE-6450.patch, LUCENE-6450.patch, LUCENE-6450.patch
>
>
> At the moment all spatial capabilities, including basic point based indexing
> and querying, require the lucene-spatial module.
> The spatial module, designed
> to handle all things geo, requires dependency overhead (s4j, jts) to provide
> spatial rigor for even the most simplistic spatial search use-cases (e.g.,
> lat/lon bounding box, point in poly, distance search). This feature trims the
> overhead by adding a new GeoPointField type to core along with
> GeoBoundingBoxQuery and GeoPolygonQuery classes to the .search package. This
> field is intended as a straightforward lightweight type for the most basic
> geo point use-cases without the overhead.
> The field uses simple bit twiddling operations (currently morton hashing) to
> encode lat/lon into a single long term. The queries leverage simple
> multi-phase filtering that starts by leveraging NumericRangeQuery to reduce
> candidate terms deferring the more expensive mathematics to the smaller
> candidate sets.