[
https://issues.apache.org/jira/browse/LUCENE-8396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16542761#comment-16542761
]
Adrien Grand commented on LUCENE-8396:
--------------------------------------
This looks awesome. Indexing two additional dimensions sounds worth it to me if
it helps index way fewer fields. It's also exciting we're exercising high
numbers of dimensions with the points API. Since this is targeting sandbox and
the patch is a bit large, I think it's easier to get it in soon and iterate
from there? It looks pretty clean to me already. Let's maybe just make
windingOrder final on the Polygon class first.
Some other comments I had while skimming through the patch:
- LatLonShape's javadocs say that "Finding all shapes within a range at search
time is efficient." but I think it really means "intersect" rather than "within"
- TriangleField should be private and maybe renamed to LatLonTriangle for
consistency with LatLonBoundingBox and LatLonPoint?
- LatLonShapeBoundingBoxQuery should be pkg-private and its constructor should
validate that the box doesn't cross the dateline?
- Tessellator would probably be a bit easier to test and debug if it didn't
handle both quantization and tessellation at once. Can we e.g. do quantization
first (e.g. with a clone of the Polygon class that has ints instead of doubles)
and then make tessellation work directly in the quantized space? (Probably best
done after merging to ease reviewing)
- Tessellator.Node uses encodeLatCeil and encodeLonCeil, but it should really
use encodeLat and encodeLon?
- Some tests like TestLatLonShape.testSVG and TestTessellator.testBug write
files and don't assert anything, looks like left-overs?
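
To illustrate the "quantize first, then tessellate" suggestion above, here is a
minimal sketch of an int-based polygon clone. It is not code from the patch:
the class name QuantizedPolygonSketch, its methods, and the 32-bit scaling
constant are assumptions made for the example. It also shows how a floor-based
and a ceil-based encoding disagree for a coordinate that does not fall exactly
on the grid, which is why mixing the two inside Tessellator.Node would matter.

```java
/** Hypothetical sketch: quantize polygon vertices once, before any tessellation runs. */
final class QuantizedPolygonSketch {
  // Assumption: each coordinate is mapped onto the full signed 32-bit range.
  private static final double TWO_POW_32 = 4294967296.0;

  final int[] lats;
  final int[] lons;

  QuantizedPolygonSketch(double[] polyLats, double[] polyLons) {
    if (polyLats.length != polyLons.length) {
      throw new IllegalArgumentException("lat/lon arrays must have the same length");
    }
    lats = new int[polyLats.length];
    lons = new int[polyLons.length];
    for (int i = 0; i < polyLats.length; i++) {
      lats[i] = quantizeLat(polyLats[i]);
      lons[i] = quantizeLon(polyLons[i]);
    }
  }

  private static void checkRange(double value, double max, String name) {
    if (Double.isNaN(value) || value < -max || value > max) {
      throw new IllegalArgumentException("invalid " + name + ": " + value);
    }
  }

  /** Floor-based quantization: rounds toward negative infinity. */
  static int quantizeLat(double lat) {
    checkRange(lat, 90.0, "latitude");
    if (lat == 90.0) {
      lat = Math.nextDown(lat); // the maximum value would otherwise overflow an int
    }
    return (int) Math.floor(lat / 180.0 * TWO_POW_32);
  }

  /** Floor-based quantization for longitudes. */
  static int quantizeLon(double lon) {
    checkRange(lon, 180.0, "longitude");
    if (lon == 180.0) {
      lon = Math.nextDown(lon);
    }
    return (int) Math.floor(lon / 360.0 * TWO_POW_32);
  }

  /** Ceil-based variant, included only to show how it differs from the floor-based one. */
  static int quantizeLatCeil(double lat) {
    checkRange(lat, 90.0, "latitude");
    if (lat == 90.0) {
      lat = Math.nextDown(lat);
    }
    return (int) Math.ceil(lat / 180.0 * TWO_POW_32);
  }

  public static void main(String[] args) {
    QuantizedPolygonSketch poly = new QuantizedPolygonSketch(
        new double[] {0.0, 45.0, -90.0, 0.0},
        new double[] {0.0, 90.0, -180.0, 0.0});
    System.out.println(java.util.Arrays.toString(poly.lats));
    System.out.println(java.util.Arrays.toString(poly.lons));
    // A tiny off-grid latitude lands on different cells depending on the rounding mode.
    System.out.println(quantizeLat(1e-9) + " vs " + quantizeLatCeil(1e-9));
  }
}
```

A tessellator written against such a class would only ever see ints, so it
could be tested with hand-picked integer coordinates and no encoding concerns.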
> Add Points Based Shape Indexing
> -------------------------------
>
> Key: LUCENE-8396
> URL: https://issues.apache.org/jira/browse/LUCENE-8396
> Project: Lucene - Core
> Issue Type: New Feature
> Reporter: Nicholas Knize
> Priority: Major
> Attachments: LUCENE-8396.patch, polyWHole.png, tessellatedPoly.png
>
>
> I've been tinkering with this for a while and would like to solicit some
> feedback. I'd like to introduce a new shape field based on the BKD/Points
> codec to bring much of the Points based performance improvements to the shape
> indexing and search use case. Much like the existing shape indexing in
> {{spatial-extras}} the shape will be decomposed into smaller parts, but
> instead of decomposing into quad cells (which have the drawbacks of limited
> precision and a sheer volume of terms) I'd like to explore decomposing the
> shapes into a triangular mesh, similar to gaming and computer graphics. Not
> only does this approach reduce the number of terms, but it has the added
> benefit of better accuracy (precision is based on the index encoding
> technique instead of the spatial resolution of the quad cell).
> For better clarity, consider the following illustrations (of a polygon in a 1
> degree x 1 degree spatial area). The first is using the quad tree technique
> applied in the existing inverted index. The second is using a triangular mesh
> decomposition as used by popular OpenGL and JavaScript rendering systems
> (such as those used by Mapbox).
> !polyWHole.png!
> Decomposing this shape using a quad tree results in 1,105,889 quad terms at 3
> meter spatial resolution.
> !tessellatedPoly.png!
>
> Decomposing using a triangular mesh results in 8 triangles at the same
> resolution as {{encodeLat/Lon}}.
> The decomposed triangles can then be encoded as a 6-dimensional POINT and
> queries are implemented using the computed relations against these triangles
> (similar to how it's done with the inverted index today).
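
As a rough illustration of the "triangle as a 6-dimensional point" idea in the
description above, the sketch below packs three already-quantized vertices
into one 24-byte value, 4 bytes per dimension. It is not taken from the patch:
the class TrianglePointSketch, the dimension order (aLat, aLon, bLat, bLon,
cLat, cLon), and the sign-bit-flipped big-endian byte layout are all
assumptions chosen so that an unsigned byte-wise comparison of each dimension
matches the signed int order.

```java
/** Hypothetical sketch: pack one triangle into a single 6-dimensional point value. */
final class TrianglePointSketch {
  static final int DIMENSIONS = 6;
  static final int BYTES = DIMENSIONS * Integer.BYTES; // 24 bytes per triangle

  /** Packs three quantized vertices; the dimension order is an assumption of this sketch. */
  static byte[] encode(int aLat, int aLon, int bLat, int bLon, int cLat, int cLon) {
    int[] dims = {aLat, aLon, bLat, bLon, cLat, cLon};
    byte[] packed = new byte[BYTES];
    for (int i = 0; i < DIMENSIONS; i++) {
      writeSortableInt(dims[i], packed, i * Integer.BYTES);
    }
    return packed;
  }

  /** Unpacks the six dimensions in the same order that encode() wrote them. */
  static int[] decode(byte[] packed) {
    if (packed.length != BYTES) {
      throw new IllegalArgumentException("expected " + BYTES + " bytes, got " + packed.length);
    }
    int[] dims = new int[DIMENSIONS];
    for (int i = 0; i < DIMENSIONS; i++) {
      dims[i] = readSortableInt(packed, i * Integer.BYTES);
    }
    return dims;
  }

  /** Flips the sign bit and writes big-endian so unsigned byte order equals signed int order. */
  static void writeSortableInt(int value, byte[] dest, int offset) {
    int flipped = value ^ 0x80000000;
    dest[offset] = (byte) (flipped >>> 24);
    dest[offset + 1] = (byte) (flipped >>> 16);
    dest[offset + 2] = (byte) (flipped >>> 8);
    dest[offset + 3] = (byte) flipped;
  }

  /** Inverse of writeSortableInt. */
  static int readSortableInt(byte[] src, int offset) {
    int flipped = ((src[offset] & 0xFF) << 24)
        | ((src[offset + 1] & 0xFF) << 16)
        | ((src[offset + 2] & 0xFF) << 8)
        | (src[offset + 3] & 0xFF);
    return flipped ^ 0x80000000;
  }

  public static void main(String[] args) {
    byte[] packed = encode(10, -20, 30, -40, 50, -60);
    System.out.println(packed.length + " bytes: " + java.util.Arrays.toString(decode(packed)));
  }
}
```

With one such value per triangle, a query would compare its own geometry
against the decoded vertices, along the lines of the "computed relations" the
description mentions.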