[
https://issues.apache.org/jira/browse/LUCENE-6481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14559269#comment-14559269
]
Michael McCandless commented on LUCENE-6481:
--------------------------------------------
Thanks [~nknize], tests pass for me now, and I ran the same "bboxes around
London, UK" perf test and this is much faster than before (= LUCENE-6450): I
now see ~17.8 msec avg per query (~2X faster than GeoHashPrefixTree I think).
I'll have a closer look at the patch ... I really want to make a visualization
here just like the BKD tree ones
(https://plus.google.com/+MichaelMcCandless/posts/Daj9FgYPdtv) to see how the
morton-shapes "work" in filling in a polygon...
I think there are fun things we can explore (future!) to speed things up
further, e.g. if we also index lat/lon into doc values, then we can use those
for filtering, freeing us to also use prefix terms on boundary shapes, and
maybe also freeing us to use encodings like Hilbert curves, which should give
better locality and let us use prefix terms more often, since we would no
longer need a fast decode from term -> lat/lon. But it would use more disk
space... We could also integrate with geo3d so we get shape intersection for
faster polygon querying (now it must filter every point?). Later!
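
To make the "fast decode from term -> lat/lon" point concrete, here is a
minimal Python sketch of Morton (Z-order) encoding. It is not the
GeoPointField code: the 31 bits per dimension, the linear quantization and
all function names are assumptions chosen for illustration.

```python
# Illustrative Morton (Z-order) encoding of a lat/lon pair.
# ASSUMPTIONS: 31 bits per dimension, linear quantization, hypothetical names.
BITS = 31


def quantize(value, lo, hi, bits=BITS):
    """Map a value in [lo, hi] to an integer in [0, 2**bits - 1] (floor)."""
    scale = ((1 << bits) - 1) / (hi - lo)
    return int((value - lo) * scale)


def dequantize(q, lo, hi, bits=BITS):
    """Inverse of quantize, up to one quantization step of error."""
    scale = ((1 << bits) - 1) / (hi - lo)
    return lo + q / scale


def interleave(x, y, bits=BITS):
    """Interleave bits: x goes to even positions, y to odd positions."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)
        code |= ((y >> i) & 1) << (2 * i + 1)
    return code


def deinterleave(code, bits=BITS):
    """Recover (x, y) from an interleaved code."""
    x = y = 0
    for i in range(bits):
        x |= ((code >> (2 * i)) & 1) << i
        y |= ((code >> (2 * i + 1)) & 1) << i
    return x, y


def encode(lat, lon):
    return interleave(quantize(lon, -180.0, 180.0), quantize(lat, -90.0, 90.0))


def decode(code):
    x, y = deinterleave(code)
    return dequantize(y, -90.0, 90.0), dequantize(x, -180.0, 180.0)
```

Decoding here is just bit de-interleaving plus a multiply, which is why a
Morton term can be filtered cheaply at query time. A Hilbert curve decode
needs per-level rotation state, so it is costlier, hence the idea of moving
the filtering to doc values.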
> Improve GeoPointField type to only visit high precision boundary terms
> -----------------------------------------------------------------------
>
> Key: LUCENE-6481
> URL: https://issues.apache.org/jira/browse/LUCENE-6481
> Project: Lucene - Core
> Issue Type: Improvement
> Components: core/index
> Reporter: Nicholas Knize
> Attachments: LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch,
> LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481_WIP.patch
>
>
> Current GeoPointField [LUCENE-6450 |
> https://issues.apache.org/jira/browse/LUCENE-6450] computes a set of ranges
> along the space-filling curve that represent a provided bounding box. This
> determines which terms to visit in the terms dictionary and which to skip.
> This is suboptimal for large bounding boxes as we may end up visiting all
> terms (of which there could be a great many).
> This incremental improvement changes GeoPointField to visit only high
> precision terms in boundary ranges and to use the postings list for ranges
> that are completely within the target bounding box.
> A separate improvement is to switch over to auto-prefix and build an
> Automaton representing the bounding box. That can be tracked in a separate
> issue.
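
The interior/boundary split described in the issue can be sketched as a
quadtree walk over Morton cells. This is only an illustration under stated
assumptions, not the patch's algorithm: the grid works on already quantized
integer coordinates, and `cover`, `bits` and `max_level` are hypothetical
names. Cells fully inside the box become ranges that need no per-point
check; cells that still straddle the box edge at `max_level` are flagged as
boundary ranges whose points must be filtered individually.

```python
# Illustrative decomposition of a bounding box into Morton-code ranges.
# ASSUMPTIONS: integer grid of 2**bits x 2**bits cells, x bit in the low
# position of each 2-bit quadrant; names are hypothetical, not Lucene API.
def cover(box, bits=4, max_level=None):
    """Return a list of (start, end, is_boundary) Morton ranges for box.

    box is (xmin, ymin, xmax, ymax) in inclusive integer grid coordinates.
    """
    xmin, ymin, xmax, ymax = box
    if max_level is None:
        max_level = bits
    ranges = []

    def visit(prefix, level, cx, cy):
        size = 1 << (bits - level)           # cell edge length at this level
        x1, y1 = cx + size - 1, cy + size - 1
        if cx > xmax or x1 < xmin or cy > ymax or y1 < ymin:
            return                           # cell is disjoint from the box
        inside = xmin <= cx and x1 <= xmax and ymin <= cy and y1 <= ymax
        if inside or level == max_level:
            shift = 2 * (bits - level)       # bits below this prefix
            start = prefix << shift
            end = ((prefix + 1) << shift) - 1
            ranges.append((start, end, not inside))
            return
        half = size >> 1
        for quad in range(4):                # children in Morton order
            dx, dy = quad & 1, quad >> 1
            visit((prefix << 2) | quad, level + 1,
                  cx + dx * half, cy + dy * half)

    visit(0, 0, 0, 0)
    return ranges
```

For example, on a 4x4 grid (`bits=2`) with `max_level=1`, the box
`(0, 0, 2, 2)` yields one interior range and three boundary ranges. Raising
`max_level` shrinks the boundary ranges at the cost of emitting more of
them, which is the trade-off between term visits and per-point filtering.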