[jira] [Commented] (LUCENE-7656) Implement geo box and distance queries using doc values.

2017-01-27 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15843148#comment-15843148
 ] 

Michael McCandless commented on LUCENE-7656:


bq. For the record, the nightly benchmarks confirm the speedup.

Nice!

> Implement geo box and distance queries using doc values.
>
> Key: LUCENE-7656
> URL: https://issues.apache.org/jira/browse/LUCENE-7656
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Adrien Grand
> Priority: Minor
> Fix For: master (7.0), 6.5
>
> Attachments: LUCENE-7656.patch, LUCENE-7656.patch, LUCENE-7656.patch
>
> Having geo box and distance queries available as both point and
> doc-values-based queries means we could use them with
> {{IndexOrDocValuesQuery}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-7656) Implement geo box and distance queries using doc values.

2017-01-27 Thread Adrien Grand (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15842905#comment-15842905
 ] 

Adrien Grand commented on LUCENE-7656:
--

For the record, the nightly benchmarks confirm the speedup. 
http://people.apache.org/~mikemccand/geobench.html#search-distance



[jira] [Commented] (LUCENE-7656) Implement geo box and distance queries using doc values.

2017-01-26 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15839666#comment-15839666
 ] 

ASF subversion and git services commented on LUCENE-7656:
-

Commit cd1be78e2cb9a9bc2e65d5adcc7cecca997330b4 in lucene-solr's branch 
refs/heads/branch_6x from [~jpountz]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=cd1be78 ]

LUCENE-7656: Implement geo box/distance queries using doc values.



[jira] [Commented] (LUCENE-7656) Implement geo box and distance queries using doc values.

2017-01-26 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15839667#comment-15839667
 ] 

ASF subversion and git services commented on LUCENE-7656:
-

Commit cf943c545478e01a2c76013f1c31b96786cdd165 in lucene-solr's branch 
refs/heads/master from [~jpountz]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=cf943c5 ]

LUCENE-7656: Implement geo box/distance queries using doc values.



[jira] [Commented] (LUCENE-7656) Implement geo box and distance queries using doc values.

2017-01-26 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15839578#comment-15839578
 ] 

Michael McCandless commented on LUCENE-7656:


bq. I wanted to look into initializing the distance predicate lazily but 
remembered that IndexSearcher might call Weight.scorer from multiple threads so 
this has some complexity that I'd like to delay to another issue.

Ahh, yes, hairy ... that's fine to postpone!  The amortized cost of that 
initialization approaches zero as the index grows ...
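
To illustrate the complexity mentioned above: if Weight.scorer can be called from several threads, a lazily built predicate needs some form of safe publication. The following is only a minimal sketch of one common approach (double-checked locking on a volatile field). The class name LazySketch is hypothetical and is not part of Lucene or of the patch.

```java
import java.util.function.Supplier;

// Hypothetical helper, not Lucene code: builds a value at most once,
// even when get() is called concurrently from several threads.
final class LazySketch<T> implements Supplier<T> {
    private final Supplier<T> factory; // must return a non-null value
    private volatile T value;          // volatile gives safe publication

    LazySketch(Supplier<T> factory) {
        this.factory = factory;
    }

    @Override
    public T get() {
        T result = value;              // single volatile read on the fast path
        if (result == null) {
            synchronized (this) {
                result = value;        // re-check under the lock
                if (result == null) {
                    result = factory.get();
                    value = result;
                }
            }
        }
        return result;
    }
}
```

With such a wrapper, the expensive initialization would only run when the first caller asks for the value; callers that never need it pay nothing.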

+1, patch looks great!



[jira] [Commented] (LUCENE-7656) Implement geo box and distance queries using doc values.

2017-01-25 Thread Adrien Grand (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15838949#comment-15838949
 ] 

Adrien Grand commented on LUCENE-7656:
--

bq. I suppose it would help "big" distance queries more and maybe hurt "tiny" 
distance queries, since it does the up front work

I think the only scenario that gets worse is when the distance is so tiny that 
the distance range is always contained in a single BKD cell. As soon as you 
start having crossing cells, that cost is quickly amortized. For instance, if 
your index has 30 segments with one crossing cell each (which is still a 
best-case scenario), we already need to perform 30*1024~=30k distance 
computations. This change, on the other hand, needs to do 4096*4~=16k up-front 
distance computations (regardless of the number of segments, since they are 
computed once for the whole query), so if it saves about half of those 30k 
distance computations, its cost is already amortized.
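
The arithmetic above can be written out as a small sketch. The numbers 30, 1024, 4096 and 4 come from the comment; reading 1024 as the number of points checked per crossing cell and 4 as the number of distance computations per grid cell is an interpretation, and the class and method names are hypothetical.

```java
// Back-of-the-envelope break-even estimate for the up-front work.
final class AmortizationSketch {
    // Assumption: points that must be checked individually per crossing cell.
    static final int POINTS_PER_CROSSING_CELL = 1024;
    // Figures quoted in the comment: 4096 grid cells, 4 computations each.
    static final int GRID_CELLS = 4096;
    static final int COMPUTATIONS_PER_GRID_CELL = 4;

    // Distance computations needed without the precomputed grid.
    static long perDocCost(int segments, int crossingCellsPerSegment) {
        return (long) segments * crossingCellsPerSegment * POINTS_PER_CROSSING_CELL;
    }

    // Distance computations spent once per query to build the grid.
    static long upFrontCost() {
        return (long) GRID_CELLS * COMPUTATIONS_PER_GRID_CELL;
    }

    // Fraction of per-document computations the grid must save to pay for itself.
    static double breakEvenFraction(int segments, int crossingCellsPerSegment) {
        return (double) upFrontCost() / perDocCost(segments, crossingCellsPerSegment);
    }

    public static void main(String[] args) {
        System.out.println("per-doc cost: " + perDocCost(30, 1));
        System.out.println("up-front cost: " + upFrontCost());
        System.out.println("break-even fraction: " + breakEvenFraction(30, 1));
    }
}
```

With more segments or more crossing cells per segment, the per-document cost grows while the up-front cost stays fixed, so the break-even fraction shrinks.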

bq. the same up front work is done twice, and one of them won't be used

True, this should be easy to fix!

bq. Since you use bit shifting, it looks like the number of effective cells may 
be anywhere between 1024 and 4096 right? Do you think two straight integer 
divisions instead, which could get us usually to 4096 cells, is too costly per 
hit?

You are right that some cells are lost. Avoiding integer divisions was one 
reason to prefer bit shifting, but there was another one: bit shifts never 
create boxes that cross the dateline.

That said, you make a good point that we should not have to both store and 
compute relations for those lost cells, let me look into fixing that.
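
To make the "lost cells" point concrete, here is a minimal sketch, not the code from the patch, of how a shift-based grid can end up with fewer cells per axis than a division-based one. It assumes coordinates already mapped to non-negative longs and a target of at most 64 cells per axis (64*64 = 4096); all names are hypothetical.

```java
// Compares shift-based and division-based grid sizing along one axis.
final class GridShiftSketch {
    static final int MAX_CELLS_PER_AXIS = 64; // 64 * 64 = 4096 cells in total

    // Smallest shift such that [min, max] spans at most 64 cells.
    // Requires 0 <= min <= max.
    static int computeShift(long min, long max) {
        for (int shift = 0; ; ++shift) {
            long delta = (max >>> shift) - (min >>> shift);
            if (delta < MAX_CELLS_PER_AXIS) {
                return shift;
            }
        }
    }

    // Cells actually used along one axis with power-of-two cell sizes.
    static int cellsPerAxisWithShift(long min, long max) {
        int shift = computeShift(min, max);
        return (int) ((max >>> shift) - (min >>> shift)) + 1;
    }

    // Cells along one axis when the cell size is chosen by integer division.
    static int cellsPerAxisWithDivision(long min, long max) {
        long range = max - min + 1;
        long cellSize = (range + MAX_CELLS_PER_AXIS - 1) / MAX_CELLS_PER_AXIS; // ceil
        return (int) ((range + cellSize - 1) / cellSize);                      // ceil
    }

    public static void main(String[] args) {
        long min = 0, max = 1024;
        System.out.println("shift:    " + cellsPerAxisWithShift(min, max) + " cells per axis");
        System.out.println("division: " + cellsPerAxisWithDivision(min, max) + " cells per axis");
    }
}
```

Because the cell size can only double from one shift to the next, a range that just misses fitting into 64 cells falls back to roughly half as many per axis, which is where the range of roughly 1024 to 4096 total cells comes from. Division gets closer to 64 per axis at the price of a division per hit.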



[jira] [Commented] (LUCENE-7656) Implement geo box and distance queries using doc values.

2017-01-25 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15838767#comment-15838767
 ] 

Michael McCandless commented on LUCENE-7656:


I like this change!  It's nice that you see a perf gain on the OSM benchmarks.  I 
suppose it would help "big" distance queries more and maybe hurt "tiny" 
distance queries, since it does the up front work (the {{DistancePredicate}}), 
but that's the right tradeoff.

It's a bit annoying that, if you use the {{IndexOrDocValuesQuery}}, all the 
same up front work is done twice, and one of them won't be used; maybe we could 
make it lazy?  But that can wait, it's just an opto.

Since you use bit shifting, it looks like the number of effective cells may be 
anywhere between 1024 and 4096 right?  Do you think two straight integer 
divisions instead, which could get us usually to 4096 cells, is too costly per 
hit?

bq. maybe the way LatLonPointDistanceQuery computes relations between a box and 
a circle relies on assumptions that are not met in this new code

I believe you are using it in essentially the same way as before, just 
different sized cells, so this should be fine.



[jira] [Commented] (LUCENE-7656) Implement geo box and distance queries using doc values.

2017-01-25 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15837594#comment-15837594
 ] 

Michael McCandless commented on LUCENE-7656:


Very cool!   I will have a look at the patch ...



[jira] [Commented] (LUCENE-7656) Implement geo box and distance queries using doc values.

2017-01-25 Thread Adrien Grand (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15837575#comment-15837575
 ] 

Adrien Grand commented on LUCENE-7656:
--

For the record, I also made sure that the factory methods for these new 
dv-based queries refer to {{IndexOrDocValuesQuery}}.
