[
https://issues.apache.org/jira/browse/LUCENE-7110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16326497#comment-16326497
]
Nicholas Knize commented on LUCENE-7110:
----------------------------------------
After getting away from this issue for a bit to investigate using Range fields
with doc values for the same purpose I wanted to raise attention back here to
solicit some quick feedback.
What are the general thoughts on building on/extending the BKD/Points codec
logic to create a complementary RTree/Ranges codec designed for fast bounding
box index/search? The bones are already there, its just a matter of changing:
# the \{min/max}PackedValue logic in the reader/writer to represent bounding
boxes/ranges
# Add logic to handle coordinate system wrapping.
Overly simplified, of course, but this codec would effectively support spatial
data only (not just spherical for lat/lon/alt, but also cartesian coordinates
for 2/3D gaming). It would also likely be better suited for the existing range
fields and would enable ranges to wrap the min/max boundary.
An alternative would be to add something like a rangeType property to
{{IndexableFieldType}} and modify the existing {{Point}} codec to handle range
encoding and coordinate system wrapping but I think that's too big of a risky
hack.
Any thoughts?
> Add Shape Support to BKD (extend to an R*/X-Tree data structure)
> ----------------------------------------------------------------
>
> Key: LUCENE-7110
> URL: https://issues.apache.org/jira/browse/LUCENE-7110
> Project: Lucene - Core
> Issue Type: New Feature
> Reporter: Nicholas Knize
> Priority: Major
>
> I've been tinkering with this off and on for a while and its showing some
> promise so I'm going to open an issue to (eventually) add this feature to
> either a 6.x or (more likely) a 7.x release.
> R*/X-Tree is a data structure designed to support Shapes (2D, 3D, nD) where,
> like the internal node, the key for each leaf node is the Minimum Bounding
> Range (MBR - sometimes "incorrectly" referred to as Minimum Bounding
> Rectangle) of the shape. Inserting a shape then boils down to the best way of
> optimizing the tree structure. This optimization is driven by a set of
> criteria for choosing the appropriate internal key (e.g., minimizing overlap
> between siblings, maximizing "squareness", minimizing area, maximizing space
> usage). Query is then (a bit oversimplified) a two-phase process:
> * recurse each branch that overlaps with the MBR of the query shape
> * compute the relation with the leaf node(s) - in higher dimensions (3+) this
> becomes an increasingly difficult computational geometry problem
> The current BKD implementation is a special simplified case of an R*/X tree
> where, for Point data, it is always guaranteed there will never be overlap
> between sibling nodes (because you're using the point data as the keys). By
> exploiting this property the tree algorithms (split, merge, etc) are
> relatively cheap (hence their performance boost over postings based
> numerics). By modifying the key data, and extending the tree generation
> algorithms BKD logic can be extended to support Shape data using the MBR as
> the Key and modifying split and merge based on the criteria needed for
> optimizing a shape-based data structure.
> The initial implementation (based on limitations of the GeoAPI) will support
> 2D shapes only. Once the GeoAPI can performantly handle 3D shapes the change
> is relatively trivial to add the third dimension to the tree generation code.
> Like everything else, this feature will be created in sandbox and, once
> mature, will graduate (likely to lucene-spatial or lucene-spatial-extras
> depending on the library needs).
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]