[ https://issues.apache.org/jira/browse/LUCENE-8396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16543644#comment-16543644 ]
David Smiley commented on LUCENE-8396:
--------------------------------------
+1, super cool, Nick! I like how a tessellation technique can represent
polygons with a number of triangles on the order of the number of edge
vertices, giving you perfect accuracy and scalability. There are also some
off-the-shelf computational geometry libraries that can simplify polygons
(JTS can), and some users may want to do that when indexing shapes.
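
To make the "on the order of the number of edge vertices" point concrete,
here is a minimal ear-clipping sketch in plain Java. It is not the tessellator
from the patch; the class and method names (EarClipSketch, tessellate) are
made up for illustration, and it assumes a simple polygon without holes whose
vertices are listed counter-clockwise with no repeated closing vertex.

```java
import java.util.ArrayList;
import java.util.List;

public class EarClipSketch {

    /** Twice the signed area of triangle (a, b, c); positive when counter-clockwise. */
    static double cross(double[] a, double[] b, double[] c) {
        return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0]);
    }

    /** True if p lies inside or on the boundary of the counter-clockwise triangle (a, b, c). */
    static boolean inTriangle(double[] p, double[] a, double[] b, double[] c) {
        return cross(a, b, p) >= 0 && cross(b, c, p) >= 0 && cross(c, a, p) >= 0;
    }

    /**
     * Assumption: pts is a simple polygon without holes, counter-clockwise,
     * at least 3 vertices. Returns triangles as triples of vertex indices.
     */
    static List<int[]> tessellate(double[][] pts) {
        List<Integer> ring = new ArrayList<>();
        for (int i = 0; i < pts.length; i++) ring.add(i);
        List<int[]> triangles = new ArrayList<>();

        while (ring.size() > 3) {
            boolean clipped = false;
            for (int i = 0; i < ring.size(); i++) {
                int prev = ring.get((i + ring.size() - 1) % ring.size());
                int curr = ring.get(i);
                int next = ring.get((i + 1) % ring.size());
                // An ear needs a convex corner...
                if (cross(pts[prev], pts[curr], pts[next]) <= 0) continue;
                // ...and no other remaining vertex inside or on it.
                boolean empty = true;
                for (int v : ring) {
                    if (v != prev && v != curr && v != next
                            && inTriangle(pts[v], pts[prev], pts[curr], pts[next])) {
                        empty = false;
                        break;
                    }
                }
                if (!empty) continue;
                triangles.add(new int[] {prev, curr, next});
                ring.remove(i); // int argument: removes by position, clipping the ear tip
                clipped = true;
                break;
            }
            if (!clipped) throw new IllegalArgumentException("not a simple CCW polygon");
        }
        triangles.add(new int[] {ring.get(0), ring.get(1), ring.get(2)});
        return triangles;
    }
}
```

Each clipped ear removes one vertex and emits one triangle, so a simple
polygon with n vertices yields n - 2 triangles; a 6-vertex L-shape, for
example, comes out as 4 triangles. Polygons with holes need extra handling
that this sketch leaves out.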
In your example you show today's quad grid index decomposing a shape into a
million terms. Why would someone do this? Anyone who wants to index shapes
today with spatial-extras ought to use SerializedDVStrategy for accuracy
combined with RecursivePrefixTreeStrategy for a grid index (perhaps 20%
distErrPct), with CompositeSpatialStrategy wrapping both. The number of terms
for any shape is effectively capped and controlled indirectly via distErrPct:
perhaps 100 terms for distErrPct=0.2? (Not sure without trying.)
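
As a rough illustration of why distErrPct caps the term count, the sketch
below models the idea in plain Java: the allowed error is a fraction of the
shape's bounding-box size, and the grid stops at the shallowest level whose
cells are at least that fine. This is a simplified planar model, not the
spatial-extras code; the class, method names, and the 360-degree-wide level 0
cell are assumptions made for the example.

```java
public class DistErrPctSketch {

    /** Allowed error: distErrPct times the bounding box's center-to-corner distance (planar degrees). */
    static double allowedError(double widthDeg, double heightDeg, double distErrPct) {
        double halfDiagonal = Math.hypot(widthDeg / 2.0, heightDeg / 2.0);
        return halfDiagonal * distErrPct;
    }

    /** Shallowest quad tree level whose cell width is no larger than maxErrorDeg. */
    static int levelFor(double maxErrorDeg, int maxLevels) {
        double cellWidth = 360.0; // assumption: level 0 is one cell spanning the world
        for (int level = 0; level < maxLevels; level++) {
            if (cellWidth <= maxErrorDeg) return level;
            cellWidth /= 2.0; // each level halves the cell width
        }
        return maxLevels;
    }

    public static void main(String[] args) {
        System.out.println(levelFor(allowedError(1.0, 1.0, 0.2), 30));
        System.out.println(levelFor(allowedError(10.0, 10.0, 0.2), 30));
    }
}
```

Under this model a 1 x 1 degree box lands on level 12 and a 10 x 10 degree
box on level 8. In both cases the box spans only about 7 to 11 cells per
side, so the number of cells stays in the same range regardless of how large
the shape is. The actual term count for a given shape would still need to be
measured.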
> Add Points Based Shape Indexing
> -------------------------------
>
> Key: LUCENE-8396
> URL: https://issues.apache.org/jira/browse/LUCENE-8396
> Project: Lucene - Core
> Issue Type: New Feature
> Reporter: Nicholas Knize
> Priority: Major
> Attachments: LUCENE-8396.patch, polyWHole.png, tessellatedPoly.png
>
>
> I've been tinkering with this for a while and would like to solicit some
> feedback. I'd like to introduce a new shape field based on the BKD/Points
> codec to bring many of the Points-based performance improvements to the shape
> indexing and search use case. Much like the existing shape indexing in
> {{spatial-extras}}, the shape will be decomposed into smaller parts, but
> instead of decomposing into quad cells (which have the drawbacks of limited
> precision and sheer volume of terms) I'd like to explore decomposing the
> shapes into a triangular mesh, similar to gaming and computer graphics. Not
> only does this approach reduce the number of terms, but it has the added
> benefit of better accuracy (precision is based on the index encoding
> technique instead of the spatial resolution of the quad cell).
> For better clarity, consider the following illustrations (of a polygon in a 1
> degree x 1 degree spatial area). The first is using the quad tree technique
> applied in the existing inverted index. The second is using a triangular mesh
> decomposition as used by popular OpenGL and JavaScript rendering systems
> (such as those used by Mapbox).
> !polyWHole.png!
> Decomposing this shape using a quad tree results in 1,105,889 quad terms at 3
> meter spatial resolution.
> !tessellatedPoly.png!
>
> Decomposing using a triangular mesh results in 8 triangles at the same
> resolution as {{encodeLat/Lon}}.
> The decomposed triangles can then be encoded as a 6-dimensional POINT, and
> queries are implemented using the computed relations against these triangles
> (similar to how it's done with the inverted index today).
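
One way to picture the 6-dimensional point is as three quantized lat/lon
pairs packed into 24 bytes. The sketch below is a guess at such a layout in
plain Java, not the encoding used by the patch: the class and method names,
the 32-bit quantization, and the sign-bit flip (so that byte-wise order
matches numeric order) are all assumptions for illustration.

```java
import java.nio.ByteBuffer;

public class TrianglePointSketch {

    // Assumption: 32-bit fixed-point quantization of degrees.
    static final double LAT_SCALE = (1L << 32) / 180.0;
    static final double LON_SCALE = (1L << 32) / 360.0;

    static int quantizeLat(double lat) {
        if (lat == 90.0) lat = Math.nextDown(lat); // keep the upper bound inside the int range
        return (int) Math.floor(lat * LAT_SCALE);
    }

    static int quantizeLon(double lon) {
        if (lon == 180.0) lon = Math.nextDown(lon);
        return (int) Math.floor(lon * LON_SCALE);
    }

    /** Packs triangle (a, b, c) as 6 dimensions of 4 bytes each: aLat, aLon, bLat, bLon, cLat, cLon. */
    static byte[] encodeTriangle(double aLat, double aLon, double bLat, double bLon,
                                 double cLat, double cLon) {
        int[] dims = {
            quantizeLat(aLat), quantizeLon(aLon),
            quantizeLat(bLat), quantizeLon(bLon),
            quantizeLat(cLat), quantizeLon(cLon)
        };
        ByteBuffer buf = ByteBuffer.allocate(6 * Integer.BYTES); // big-endian by default
        for (int d : dims) {
            buf.putInt(d ^ 0x80000000); // flip the sign bit so unsigned byte order matches int order
        }
        return buf.array();
    }

    /** Reverses encodeTriangle; each value is the lower edge of its quantization step. */
    static double[] decodeTriangle(byte[] packed) {
        ByteBuffer buf = ByteBuffer.wrap(packed);
        double[] out = new double[6];
        for (int i = 0; i < 6; i++) {
            int q = buf.getInt() ^ 0x80000000;
            out[i] = q / (i % 2 == 0 ? LAT_SCALE : LON_SCALE);
        }
        return out;
    }
}
```

With this quantization the precision is fixed by the encoding (a latitude
step of 180 / 2^32 degrees, well under a centimeter on the ground) rather
than by a grid cell size, which is the accuracy argument made in the
description above.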