[ 
https://issues.apache.org/jira/browse/LUCENE-7153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated LUCENE-7153:
--------------------------------
    Attachment: LUCENE-7153.patch

Here is a patch. I hate adding a public class, but I think it's needed here. It's 
completely immutable, like String, and has some helper methods for multipolygon 
logic. I added "hole" support to the existing traversal operations, along with 
some simple tests.
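
To illustrate the idea (this is only a rough sketch, not the class from the patch), an immutable polygon with holes might look something like the following. The name {{PolygonSketch}}, its constructor, and {{containsAny}} are all made up for this example, the point test is a plain even-odd ray cast, and dateline/pole handling is ignored:

```java
/** Hypothetical sketch of an immutable polygon with holes; NOT the class in the patch. */
final class PolygonSketch {
  private final double[] lats;
  private final double[] lons;
  private final PolygonSketch[] holes;

  /** Builds one closed outer ring with zero or more holes (assumed signature). */
  PolygonSketch(double[] lats, double[] lons, PolygonSketch... holes) {
    if (lats.length != lons.length) {
      throw new IllegalArgumentException("lats and lons must have equal length");
    }
    if (lats.length < 4) {
      throw new IllegalArgumentException("a closed ring needs at least 4 points");
    }
    int last = lats.length - 1;
    if (lats[0] != lats[last] || lons[0] != lons[last]) {
      throw new IllegalArgumentException("ring must be closed: first point must equal last");
    }
    for (PolygonSketch hole : holes) {
      if (hole.holes.length > 0) {
        throw new IllegalArgumentException("no nesting: holes may not contain holes");
      }
    }
    // Defensive copies keep the instance immutable even if the caller mutates its arrays.
    this.lats = lats.clone();
    this.lons = lons.clone();
    this.holes = holes.clone();
  }

  /** True if the point is inside the outer ring and not inside any hole. */
  boolean contains(double lat, double lon) {
    if (!ringContains(lat, lon)) {
      return false;
    }
    for (PolygonSketch hole : holes) {
      if (hole.contains(lat, lon)) {
        return false;
      }
    }
    return true;
  }

  /** Even-odd ray casting against the outer ring only. */
  private boolean ringContains(double lat, double lon) {
    boolean inside = false;
    for (int i = 0, j = lats.length - 1; i < lats.length; j = i++) {
      // Does edge (j -> i) straddle the point's latitude?
      if ((lats[i] > lat) != (lats[j] > lat)) {
        double crossingLon = (lons[j] - lons[i]) * (lat - lats[i]) / (lats[j] - lats[i]) + lons[i];
        if (lon < crossingLon) {
          inside = !inside;
        }
      }
    }
    return inside;
  }

  /** Multipolygon helper: true if any of the polygons contains the point. */
  static boolean containsAny(PolygonSketch[] polygons, double lat, double lon) {
    for (PolygonSketch polygon : polygons) {
      if (polygon.contains(lat, lon)) {
        return true;
      }
    }
    return false;
  }
}
```

With this shape, an island sitting inside a hole is not nested; it is simply passed as a second polygon to {{containsAny}}.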

Also, GeoPoint moves from the approximate to the precise implementation. Today 
we have two different implementations, and for some reason it uses the 
approximate one. So this solves LUCENE-7145 too. Since GeoPoint is not in the 
sandbox, I deprecated its current two constructors and added two new ones.

I benchmarked single-polygon performance against master using Mike's London 
benchmark (the polygons are circle-like approximations with a varying number of 
vertices {{n}}):

{noformat}
LatLonPoint
        master      patch
n=5     25.7 QPS -> 26.2 QPS
n=50    17.8 QPS -> 18.3 QPS
n=500    7.5 QPS ->  8.0 QPS

GeoPointField
        master      patch
n=5     19.3 QPS -> 19.0 QPS
n=50    11.3 QPS -> 12.0 QPS
n=500    4.0 QPS ->  3.8 QPS
{noformat}

For LatLonPoint, the patch is an improvement as the polygon complexity grows. 
That's because the bounding box was not always used before.
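
As a rough illustration of why that helps (again only a sketch, not code from the patch): a bounding box lets you reject a point or a whole cell in constant time, without looking at any of the {{n}} edges. The class name {{BoundingBoxSketch}} and its methods are made up here, and dateline crossing is not handled:

```java
/** Hypothetical helper: axis-aligned bounds of a ring; NOT code from the patch. */
final class BoundingBoxSketch {
  final double minLat;
  final double maxLat;
  final double minLon;
  final double maxLon;

  /** Computes the bounds once, up front, in a single pass over the vertices. */
  BoundingBoxSketch(double[] lats, double[] lons) {
    if (lats.length == 0 || lats.length != lons.length) {
      throw new IllegalArgumentException("lats and lons must be non-empty and of equal length");
    }
    double loLat = Double.POSITIVE_INFINITY;
    double hiLat = Double.NEGATIVE_INFINITY;
    double loLon = Double.POSITIVE_INFINITY;
    double hiLon = Double.NEGATIVE_INFINITY;
    for (int i = 0; i < lats.length; i++) {
      loLat = Math.min(loLat, lats[i]);
      hiLat = Math.max(hiLat, lats[i]);
      loLon = Math.min(loLon, lons[i]);
      hiLon = Math.max(hiLon, lons[i]);
    }
    this.minLat = loLat;
    this.maxLat = hiLat;
    this.minLon = loLon;
    this.maxLon = hiLon;
  }

  /** Cheap pre-check: a point outside the box cannot be inside the polygon. */
  boolean contains(double lat, double lon) {
    return lat >= minLat && lat <= maxLat && lon >= minLon && lon <= maxLon;
  }

  /** True if the cell cannot touch the polygon at all, so a traversal can skip it. */
  boolean disjoint(double cellMinLat, double cellMaxLat, double cellMinLon, double cellMaxLon) {
    return cellMaxLat < minLat || cellMinLat > maxLat
        || cellMaxLon < minLon || cellMinLon > maxLon;
  }
}
```

Only points and cells that pass these checks need the per-edge polygon test, whose cost grows with {{n}}.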

For GeoPoint, it's faster in some cases and slower in others, but still very 
close to the same performance. LatLonPoint is the faster of the two, and the 
patch makes it even more so, so I think things are OK.

> give GeoPoint and LatLonPoint full polygon support
> --------------------------------------------------
>
>                 Key: LUCENE-7153
>                 URL: https://issues.apache.org/jira/browse/LUCENE-7153
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Robert Muir
>         Attachments: LUCENE-7153.patch
>
>
> These two geo impls have very limited polygon support that does not handle 
> inner rings (holes) or multiple outer rings efficiently.
> Basically, if you want to do this, you are left building crazy logic with 
> BooleanQuery, which will needlessly send memory into the gigabytes for a 
> single query. 
> For example, the Russia polygon from geonames is 250KB of GeoJSON and has over 
> a thousand outer rings.
> We should instead support this stuff in the queries themselves, especially 
> since it will allow us to implement things more efficiently in the future.
> I think instead of {{newPolygonQuery(double[], double[])}} it should look 
> like {{newPolygonQuery(Polygon...)}}. A polygon can be a single outer ring 
> (shape) with 0 or more inner rings (holes). No nesting; you just use multiple 
> polygons if you have, e.g., an island. 
> See http://esri.github.io/geometry-api-java/doc/Polygon.html for visuals and 
> examples. I indented their GeoJSON example:
> {noformat}
> {
>   "type":"MultiPolygon",
>   "coordinates": [
>      // first polygon (order does not matter could have been last instead)
>      [
>        // clockwise => outer ring
>        [[0.0,0.0],[-0.5,0.5],[0.0,1.0],[0.5,1.0],[1.0,0.5],[0.5,0.0],[0.0,0.0]],
>        // hole
>        [[0.5,0.2],[0.6,0.5],[0.2,0.9],[-0.2,0.5],[0.1,0.2],[0.2,0.3],[0.5,0.2]]
>      ],
>      // second polygon (order does not matter, could have been first instead)
>      [ 
>        // island
>        [[0.1,0.7],[0.3,0.7],[0.3,0.4],[0.1,0.4],[0.1,0.7]]
>      ]
>   ]
> }
> {noformat}


