[ 
https://issues.apache.org/jira/browse/LUCENE-8126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16320976#comment-16320976
 ] 

David Smiley commented on LUCENE-8126:
--------------------------------------

_(an aside: I wish the JIRA GitHub integration didn't put so much code context 
around the feedback text!)_

It's nice to see a new RPT SpatialPrefixTree implementation :-)  The API is a 
little crusty; perhaps sometime we could kick around some ideas to make it 
nicer.

It'll be interesting to see how well this performs.  This appears to be a 6-ary 
tree, as opposed to 4-ary (quad) or 32-ary (geohash).  One could build a 
variable arity prefixTree by the way (i.e. first level has 256, next 128, 
etc.), and I recently tweaked one of ours to do that (not contributed back yet).

For point data, the higher the arity, the smaller the index but slower search 
as it must scan more.

For non-point data, it's not clear since distErrPct caps the resolution of a 
shape relative to its size, and I believe (though not 100% sure) that it yields 
a roughly normal distribution around a certain number of cells (given fixed 
distErrPct, random shape type & size, near equator, random tree arity).  It'd 
be neat to empirically validate my theory.  If I'm right, then the optimal 
arity is probably 4 for non-point shapes, and we have two of those 
implementations. RE "near equator" above, see LUCENE-5056 though it has an easy 
fix in my last comment.

Given the way S2 divides a world into 6 sides recursively, it seems it would 
place shapes at a balanced depth in the tree no matter where in the world the 
data is.  That's a nice benefit... making the cell depth for a shape a bit more 
shallow than the probable depth in the other tree implementations (assuming a 
target precision for a given shape).  That's a bonus.

CC [~nknize] you may find this issue interesting

> Spatial prefix tree based on S2 geometry
> ----------------------------------------
>
>                 Key: LUCENE-8126
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8126
>             Project: Lucene - Core
>          Issue Type: New Feature
>          Components: modules/spatial-extras
>            Reporter: Ignacio Vera
>
> Hi [~dsmiley],
> I have been working on a prefix tree based on goggle S2 geometry 
> (https://s2geometry.io/) to be used mainly with Geo3d shapes with very 
> promising results, in particular for complex shapes (e.g polygons). Using 
> this pixelization scheme reduces the size of the index, improves the 
> performance of the queries and reduces the loading time for non-point shapes. 
> If you are ok with this contribution and before providing any code I would 
> like to understand what is the correct/prefered approach:
> 1) Add new depency to the S2 library 
> (https://mvnrepository.com/artifact/io.sgr/s2-geometry-library-java). It has 
> Apache 2.0 license so it should be ok.
> 2) Create a utility class with all methods necessary to navigate the S2 tree 
> and create shapes from S2 cells (basically port what we need from the library 
> into Lucene).
> What do you think?



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

Reply via email to