[ https://issues.apache.org/jira/browse/LUCENE-7241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15272827#comment-15272827 ]
ASF subversion and git services commented on LUCENE-7241: --------------------------------------------------------- Commit 2a3549a25766be556577d4ccc443e4de0358f7a8 in lucene-solr's branch refs/heads/branch_6x from [~kwri...@metacarta.com] [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=2a3549a ] LUCENE-7241: Don't allocate GeoPoints we aren't going to return. > Improve performance of geo3d for polygons with very large numbers of points > --------------------------------------------------------------------------- > > Key: LUCENE-7241 > URL: https://issues.apache.org/jira/browse/LUCENE-7241 > Project: Lucene - Core > Issue Type: Improvement > Components: modules/spatial3d > Affects Versions: master > Reporter: Karl Wright > Assignee: Karl Wright > Fix For: master, 6.x > > Attachments: LUCENE-7241.patch, LUCENE-7241.patch, LUCENE-7241.patch, > LUCENE-7241.patch, LUCENE-7241.patch > > > This ticket corresponds to LUCENE-7239, except it's for geo3d polygons. > The trick here is to organize edges by some criteria, e.g. z value range, and > use that to avoid needing to go through all edges and/or tile large irregular > polygons. Then we use the ability to quickly determine intersections to > figure out whether a point is within the polygon, or not. > The current way geo3d polygons are constructed involves finding a single > point, or "pole", which all polygon points circle. This point is known to be > either "in" or "out" based on the direction of the points. So we have one > place of "truth" on the globe that is known at polygon setup time. > If edges are organized by z value, where the z values for an edge are > computed by the standard way of computing bounds for a plane, then we can > readily organize edges into a tree structure such that it is easy to find all > edges we need to check for a given z value. Then, we merely need to compute > how many intersections to consider as we navigate from the "truth" point to > the point being tested. In practice, this means both having a tree that is > organized by z, and a tree organized by (x,y), since we need to navigate in > both directions. But then we can cheaply count the number of intersections, > and once we do that, we know whether our point is "in" or "out". > The other performance improvement we need is whether a given plane intersects > the polygon within provided bounds. This can be done using the same two > trees (z and (x,y)), by virtue of picking which tree to use based on the > plane's minimum bounds in z or (x,y). And, in practice, we might well use > three trees: one in x, one in y, and one in z, which would mean we didn't > have to compute longitudes ever. > An implementation like this trades off the cost of finding point membership > in near O\(log\(n)) time vs. the extra expense per step of finding that > membership. Setup of the query is O\(n) in this scheme, rather than O\(n^2) > in the current implementation, but once again each individual step is more > expensive. Therefore I would expect we'd want to use the current > implementation for simpler polygons and this sort of implementation for > tougher polygons. Choosing which to use is a topic for another ticket. -- This message was sent by Atlassian JIRA (v6.3.4#6332) --------------------------------------------------------------------- To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org