[ 
https://issues.apache.org/jira/browse/LUCENE-7270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15268797#comment-15268797
 ] 

Karl Wright commented on LUCENE-7270:
-------------------------------------

Implemented this and discovered that actually my former implementation 
performed much better for the London borough case.  In fact, the 2D balanced 
tree is 4x slower:

{code}
ITER 0: 0.05 M hits/sec, 0.61 QPS (53.91 sec for 33 queries), totHits=2693539
{code}

I believe this may be because it is visiting far more nodes (but not more 
edges), because of the way the ranging is done:

{code}
    private static boolean traverse(final Node node, final EdgeIterator 
edgeIterator, final double minValue, final double maxValue) {
      if (node == null) {
        return true;
      }
      if (minValue <= node.max) {
        
        // Does this node overlap?
        if (minValue <= node.high && maxValue >= node.low) {
          if (edgeIterator.matches(node.edge) == false) {
            return false;
          }
        }
        
        if (traverse(node.left, edgeIterator, minValue, maxValue) == false) {
          return false;
        }
        if (traverse(node.right, edgeIterator, minValue, maxValue) == false) {
          return false;
        }
      }
      return true;
    }
{code}

I'm a bit puzzled as to why this doesn't totally sink the 2D code. Maybe it's 
because the first edge it finds causes it to exit?  Or maybe I just have a 
silly error somewhere?  Don't know, maybe [~rcmuir] has an idea?  I'll attach a 
patch if requested.


> Use better balanced trees for Geo3d complex polygons
> ----------------------------------------------------
>
>                 Key: LUCENE-7270
>                 URL: https://issues.apache.org/jira/browse/LUCENE-7270
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: modules/spatial3d
>            Reporter: Karl Wright
>            Assignee: Karl Wright
>
> The current tree data structure in GeoComplexPolygon has a lot of weaknesses. 
>  A better algorithm maybe can be taken from Polygon2D, which basically does 
> the following:
> Each node has:
> - low value (which is for that edge alone)
> - max value (which is for that edge and all children)
> Balanced tree building:
> - sort by low value (which is for the individual edge), and use max value as 
> tie breaker (which is max for edge and all children)
> - build tree after sorting, picking midpoint and recursively building lesser 
> and greater children that way
> Balanced tree traversal (looking for range minValue -> maxValue):
> - Descend the entire tree until the node fails this test:
>       if (minValue <= max) { ... }
>   So if the minimum value being sought is greater than the max for this edge 
> and all children, we stop and don't look at children.
>   (Q: does this represent a good split and a targeted range?  Maybe...  We 
> can certainly try it.)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to