[
https://issues.apache.org/jira/browse/LUCENE-6450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14510431#comment-14510431
]
David Smiley commented on LUCENE-6450:
--------------------------------------
*This is really nice Nick; thanks for contributing!*
I looked over the patch. I like the light-weight-ness and ease of use; and I
suspect this will perform admirably. It should address many use-cases,
particularly once a point-radius (circle) query is added. Did you forget or
are you planning to add that in the future? The other super-common use-case is
distance sorting, and that's not here. In pointing this out I don't mean to
take away from the great stuff you have here. Further comments:
* I can see that GeoPolygonQuery extends GeoBBoxQuery for re-use, but I think
it's confusing. Instead, how about a base AbstractGeoTermRangeQuery (or
something like that) with GeoBBoxQuery being a fairly simple subclass?
* Somehow, I think the user should understand that these queries only work with
a GeoPointFieldType. Javadocs are minimally sufficient, but might the queries
be named accordingly, such as GeoPointInPolygonQuery and GeoPointInBBoxQuery?
Those names aren't even that long yet I think it's an important clarification
to help differentiate geo/spatial stuff generally.
* GeoPointFieldType has DocValues enabled yet I see that these queries don't
use that; or did I miss something? Even if they don't, I can only surmise the
intention to use them for distance sorting, although that's not here.
* If you derived those Morton number magic constants and related code yourself
then you are a better man than I and most any other coder passerby to read
this. Otherwise, please reference your sources.
** I would love to see some randomized testing of round-trip encode-decode of
the morton numbers. I understand if there needs to be an error tolerance.
* I'd like to see more javadocs on spatial matters like:
** does the GeoBBoxQuery support dateline wrap? Ditto for GeoPolygonQuery.
** are "lat" and "lon" in degrees or radians?
** What mathematical model does the polygon query operate in? (Answer is
Cartesian despite lat-lon surface of sphere coordinates: warn the user of the
implications)
* Have you thought about a way to use GeoPointFieldType with pre-projected data
(x,y with a pre-configured range boundary)?
* I'm wondering what [~mikemccand]'s opinion on what changes will be necessary
if any to work with the experimental auto-prefix term stuff.
RE contributing to core:
_I really wonder what other people think_. If others think it's great then I'm
definitely not going to stand in the way. But I am concerned about confusion
this may introduce about where spatial stuff is and how it's related. Javadocs
could help some. I think it's a slippery slope as to identifying what the
scope of spatial that you think should go in Lucene-core versus Lucene-spatial
is. Perhaps you might think it's due to the dependencies? I think that's not
a great differentiator, especially if one considers that the old spatial module
(Lucene 3.x and prior by Patrick O'Leary) had no dependencies and I think you'd
be hard pressed to think that belonged in Lucene-core if you were to see it.
This reminds me a little of some perceptual confusion of Solr's "LatLonType"
(internally comprised of two double fields) versus a Solr field type for RPT.
A new user is easily led to believe that they should use LatLonType if they
have point data, especially because of it's name (hey yeah I have lat's and
lon's), and say want to do simple bounding-box or point-radius queries. Sure
it works for that, and it may very well be the best choice, but depending on
the scale and various factors it _may_ be less performant compared to RPT which
perceptually appears as a more "advanced" choice to the user even though it's
almost as easy to use for the simple cases. By the same token, that could
occur here if it's in core, but wouldn't if all this was wrapped up in a
Lucene-spatial SpatialStrategy facade. Not a big deal, I think; but a concern.
Thanks again for contributing this Nick.
> Add simple encoded GeoPointField type to core
> ---------------------------------------------
>
> Key: LUCENE-6450
> URL: https://issues.apache.org/jira/browse/LUCENE-6450
> Project: Lucene - Core
> Issue Type: New Feature
> Affects Versions: 5.x
> Reporter: Nicholas Knize
> Priority: Minor
> Attachments: LUCENE-6450.patch
>
>
> At the moment all spatial capabilities, including basic point based indexing
> and querying, require the lucene-spatial module. The spatial module, designed
> to handle all things geo, requires dependency overhead (s4j, jts) to provide
> spatial rigor for even the most simplistic spatial search use-cases (e.g.,
> lat/lon bounding box, point in poly, distance search). This feature trims the
> overhead by adding a new GeoPointField type to core along with
> GeoBoundingBoxQuery, GeoPolygonQuery, and GeoDistanceQuery classes to the
> .search package. This field is intended as a straightforward lightweight type
> for the most basic geo point use-cases without the overhead.
> The field uses simple bit twiddling operations (currently morton hashing) to
> encode lat/lon into a single long term. The queries leverage simple
> multi-phase filtering that starts by leveraging NumericRangeQuery to reduce
> candidate terms deferring the more expensive mathematics to the smaller
> candidate sets.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]