[jira] [Commented] (LUCENE-6930) Decouple GeoPointField from NumericType

2016-02-06 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15136136#comment-15136136
 ] 

ASF subversion and git services commented on LUCENE-6930:
-

Commit 74a08c08006941b74eda585b86b57fbe0ff341b2 in lucene-solr's branch 
refs/heads/branch_5x from nknize
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=74a08c0 ]

LUCENE-6930: Decouples GeoPointField from NumericType by using a custom 
GeoPointTokenStream and TermEnum designed for GeoPoint prefix terms


> Decouple GeoPointField from NumericType
> ---
>
> Key: LUCENE-6930
> URL: https://issues.apache.org/jira/browse/LUCENE-6930
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/spatial
>Reporter: Nicholas Knize
> Fix For: 5.5, trunk
>
> Attachments: LUCENE-6930.patch, LUCENE-6930.patch, LUCENE-6930.patch, 
> LUCENE-6930.patch, LUCENE-6930.patch, LUCENE-6930.patch
>
>
> {{GeoPointField}} currently relies on {{NumericTokenStream}} to create prefix 
> terms for a GeoPoint using the precision step defined in {{GeoPointField}}. 
> At search time {{GeoPointTermsEnum}} recurses to a max precision that is 
> computed by the Query parameters. This max precision is never the full 
> precision, so creating and indexing the full precision terms is useless and 
> wasteful (it was always a side effect of just using indexing logic from the 
> Numeric type). 
> Furthermore, since the numerical logic always stored high precision terms 
> first, the recursion in {{GeoPointTermsEnum}} required transient memory for 
> storing ranges. By moving the trie logic to its own {{GeoPointTokenStream}} 
> and reversing the term order (such that lower resolution terms are first), 
> the GeoPointTermsEnum can naturally traverse, enabling on-demand creation of 
> PrefixTerms. This will be done in a separate issue.
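
The term ordering described above can be illustrated with a minimal sketch. It is not the code from the patch: the class name {{PrefixTermSketch}}, the method {{prefixTerms}}, and the shift values used in {{main}} (a precision step of 9, a coarsest shift of 45, a finest shift of 9) are assumptions chosen for illustration. It only shows the idea of emitting prefix terms for a 64-bit encoded point from lowest resolution to highest, stopping before full precision.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; not the GeoPointTokenStream from the patch.
final class PrefixTermSketch {

  /** One prefix term: the encoded value with its low `shift` bits cleared. */
  static final class Term {
    final long prefix;
    final int shift;

    Term(long prefix, int shift) {
      this.prefix = prefix;
      this.shift = shift;
    }
  }

  /**
   * Emits prefix terms from coarsest (largest shift) to finest (smallest shift).
   * Iteration stops at minShift, so full-precision terms are never produced.
   * All parameters are hypothetical knobs for this sketch.
   */
  static List<Term> prefixTerms(long encoded, int maxShift, int minShift, int precisionStep) {
    if (precisionStep <= 0 || minShift < 0 || maxShift > 63 || maxShift < minShift) {
      throw new IllegalArgumentException("invalid shift range or precision step");
    }
    List<Term> terms = new ArrayList<>();
    for (int shift = maxShift; shift >= minShift; shift -= precisionStep) {
      long mask = ~((1L << shift) - 1); // keeps the high (64 - shift) bits
      terms.add(new Term(encoded & mask, shift));
    }
    return terms;
  }

  public static void main(String[] args) {
    // Arbitrary 64-bit value standing in for an encoded point.
    for (Term t : prefixTerms(0x0123456789ABCDEFL, 45, 9, 9)) {
      System.out.println("shift=" + t.shift + " prefix=" + Long.toHexString(t.prefix));
    }
  }
}
```

Because the coarse terms come first, a consumer that walks the terms in order sees low-resolution cells before the finer cells they contain, which is the property the description relies on.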



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-6930) Decouple GeoPointField from NumericType

2016-02-05 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15135558#comment-15135558
 ] 

ASF subversion and git services commented on LUCENE-6930:
-

Commit ae3b388e974960091594aee7e1b39d3d3a090520 in lucene-solr's branch 
refs/heads/master from nknize
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=ae3b388 ]

LUCENE-6930: Decouples GeoPointField from NumericType by using a custom 
GeoPointTokenStream and TermEnum designed for GeoPoint prefix terms





[jira] [Commented] (LUCENE-6930) Decouple GeoPointField from NumericType

2016-02-04 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15133089#comment-15133089
 ] 

Michael McCandless commented on LUCENE-6930:


Patch looks good!

If a user accidentally indexes with the legacy ({{NUMERIC}}) encoding
but searches with {{PREFIX}}, it won't be detected, right?  They
will just get 0 results?  I don't think we must fix this... it seems
not easy, since there is no "schema" for this.

Hmm I didn't see this added?

{quote}
Can you update GeoPointDistanceQuery javadocs explaining the max
radius limit? I.e. that the circle projected on the earth's surface
cannot wrap around and touch itself again (if I understand that
right!)?
{quote}

+1 to commit!  Thanks [~nknize] ... I'm curious to see how this improves
the metrics (indexing time, index size, heap used by {{IndexReader}},
search time).





[jira] [Commented] (LUCENE-6930) Decouple GeoPointField from NumericType

2016-01-27 Thread Nicholas Knize (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15120345#comment-15120345
 ] 

Nicholas Knize commented on LUCENE-6930:


Thanks for the feedback [~mikemccand]! I'll post an updated patch shortly. In 
the meantime I think this will be blocked by LUCENE-6997 so I may put this in a 
feature branch.




[jira] [Commented] (LUCENE-6930) Decouple GeoPointField from NumericType

2016-01-27 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15119397#comment-15119397
 ] 

Michael McCandless commented on LUCENE-6930:


Thanks [~nknize], this is a nice (and large!) change.

The sole "R" in this javadoc left me hanging a bit ;)

{noformat}

  /**
   * GeoTerms are coded as the following:
   *
   * R
   */

{noformat}

Can you update {{GeoPointDistanceQuery}} javadocs explaining the max
radius limit?  I.e. that the circle projected on the earth's surface
cannot wrap around and touch itself again (if I understand that
right!)?

Can we move {{GeoPointTokenStream}} under {{o.a.l.document}} and make
it package private?  (And make {{TermEncoding}} public elsewhere.)

Can all other ctors of {{GeoPointField}} just forward to the primary
("takes everything") ctor call, i.e. call {{this(...)}} instead of
{{super(...)}}?  Also, can we break this compound ternary operator
into a static helper method?:

{noformat}
super(name, stored == Store.YES ?
    termEncoding == GeoPointTokenStream.TermEncoding.PREFIX ? PREFIX_TYPE_STORED : NUMERIC_TYPE_STORED :
    termEncoding == GeoPointTokenStream.TermEncoding.PREFIX ? PREFIX_TYPE_NOT_STORED : NUMERIC_TYPE_NOT_STORED);
{noformat}

E.g. maybe {{getFieldType}}.
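
A minimal sketch of what such a helper could look like follows. It is not taken from the patch: the nested enums and {{FieldTypeStub}} are stand-ins so the example compiles on its own, and only the names {{getFieldType}}, {{Store}}, {{TermEncoding}} and the four type constants come from the comment above.

```java
// Illustrative sketch only; the real GeoPointField uses Lucene's FieldType constants.
final class FieldTypeSelectionSketch {
  enum Store { YES, NO }

  enum TermEncoding { NUMERIC, PREFIX }

  // Stand-ins for the four FieldType constants named in the quoted constructor.
  enum FieldTypeStub {
    PREFIX_TYPE_STORED, PREFIX_TYPE_NOT_STORED, NUMERIC_TYPE_STORED, NUMERIC_TYPE_NOT_STORED
  }

  /** Replaces the nested ternary with one explicit branch per encoding. */
  static FieldTypeStub getFieldType(Store stored, TermEncoding termEncoding) {
    if (termEncoding == TermEncoding.PREFIX) {
      return stored == Store.YES
          ? FieldTypeStub.PREFIX_TYPE_STORED
          : FieldTypeStub.PREFIX_TYPE_NOT_STORED;
    } else if (termEncoding == TermEncoding.NUMERIC) {
      return stored == Store.YES
          ? FieldTypeStub.NUMERIC_TYPE_STORED
          : FieldTypeStub.NUMERIC_TYPE_NOT_STORED;
    }
    throw new IllegalArgumentException("unknown term encoding: " + termEncoding);
  }

  public static void main(String[] args) {
    System.out.println(getFieldType(Store.YES, TermEncoding.PREFIX));
    System.out.println(getFieldType(Store.NO, TermEncoding.NUMERIC));
  }
}
```

With a helper of this shape, each constructor could forward to the primary one with {{this(...)}} and the type selection would live in a single place.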

Should it be an error if you pass a custom {{FieldType}} to
{{GeoPointField}} that disabled indexing?  I.e. catch that up front,
where we check DV type and numeric type, and then remove this:

{noformat}
if (fieldType().indexOptions() == IndexOptions.NONE) {
  // Not indexed
  return null;
}
{noformat}

from the {{tokenStream}} method?
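
The up-front check suggested here could be sketched as below. This is an assumption-laden illustration, not the patch: {{IndexOptions}} and {{FieldTypeStub}} are simplified stand-ins for the Lucene classes, and {{requireIndexed}} is a hypothetical helper name.

```java
// Illustrative sketch only; stand-in types replace Lucene's FieldType and IndexOptions.
final class IndexedCheckSketch {
  enum IndexOptions { NONE, DOCS }

  static final class FieldTypeStub {
    private final IndexOptions indexOptions;

    FieldTypeStub(IndexOptions indexOptions) {
      this.indexOptions = indexOptions;
    }

    IndexOptions indexOptions() {
      return indexOptions;
    }
  }

  /** Rejects a non-indexed type at construction time instead of returning null later. */
  static FieldTypeStub requireIndexed(FieldTypeStub type) {
    if (type.indexOptions() == IndexOptions.NONE) {
      throw new IllegalArgumentException("type.indexOptions() must not be NONE");
    }
    return type;
  }

  public static void main(String[] args) {
    requireIndexed(new FieldTypeStub(IndexOptions.DOCS)); // passes
    try {
      requireIndexed(new FieldTypeStub(IndexOptions.NONE));
    } catch (IllegalArgumentException e) {
      System.out.println("rejected: " + e.getMessage());
    }
  }
}
```

Failing in the constructor means the token stream code can assume the field is indexed and the null-returning branch becomes unnecessary.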

Can we deprecate the {{GeoPointField}} ctors that take
{{TermEncoding}}?  (You should use/migrate to the default ctor that
uses the better {{PREFIX}} encoding).

{{GeoUtils.longToByteArray}} and {{.fromByteArray}} and
{{GeoEncodingUtils.geoTermToString}} look dead?

This comment confuses me:

{noformat}
  // start shift at 61
  private short shift;
{noformat}

Does it really start at 61?  Seems like ({{computeMaxShift}}) it's
either 45 (for a large bbox) or 36 (for a not-large bbox)?  Can we
move the comment down to where we actually do assign to shift?






[jira] [Commented] (LUCENE-6930) Decouple GeoPointField from NumericType

2016-01-26 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118217#comment-15118217
 ] 

Michael McCandless commented on LUCENE-6930:


Thanks [~nknize] ... the last patch is a bit odd, e.g. removing the entire 
GeoUtils.java and adding it back again ... can you fix it?  Maybe there is some 
magical git option to git diff?




[jira] [Commented] (LUCENE-6930) Decouple GeoPointField from NumericType

2016-01-26 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118227#comment-15118227
 ] 

Michael McCandless commented on LUCENE-6930:


Hmm, also a few compilation errors, e.g.:

{noformat}
[javac] /l/nick/lucene/sandbox/src/test/org/apache/lucene/search/TestGeoPointQuery.java:373: error: cannot find symbol
[javac]   long enc = GeoUtils.mortonHash(lon, lat);
[javac]  ^
[javac]   symbol:   method mortonHash(double,double)
[javac]   location: class GeoUtils
[javac] /l/nick/lucene/sandbox/src/test/org/apache/lucene/search/TestGeoPointQuery.java:374: error: cannot find symbol
[javac]   double latEnc = GeoUtils.mortonUnhashLat(enc);
[javac]   ^
[javac]   symbol:   method mortonUnhashLat(long)
[javac]   location: class GeoUtils
{noformat}




[jira] [Commented] (LUCENE-6930) Decouple GeoPointField from NumericType

2016-01-26 Thread Nicholas Knize (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118233#comment-15118233
 ] 

Nicholas Knize commented on LUCENE-6930:


argghh.. guess the fancy {{--show-copies-as-adds}} didn't work this time. 
I'll prep a new patch shortly. Thx [~mikemccand]!




[jira] [Commented] (LUCENE-6930) Decouple GeoPointField from NumericType

2016-01-22 Thread Nicholas Knize (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15112630#comment-15112630
 ] 

Nicholas Knize commented on LUCENE-6930:


That's unrelated to this patch. It's from nuking the pre-quantization in the 
tests and is being fixed by LUCENE-6956.




[jira] [Commented] (LUCENE-6930) Decouple GeoPointField from NumericType

2016-01-22 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15112468#comment-15112468
 ] 

Michael McCandless commented on LUCENE-6930:


Hmm I'm seeing this test failure with this patch:

{noformat}
[junit4:pickseed] Seed property 'tests.seed' already defined: 3B2C0D9EBF6EC99D
   [junit4] <JUnit4> says hello! Master seed: 3B2C0D9EBF6EC99D
   [junit4] Executing 1 suite with 1 JVM.
   [junit4] 
   [junit4] Started J0 PID(30297@localhost).
   [junit4] Suite: org.apache.lucene.search.TestGeoPointQuery
   [junit4]   1> T4: id=7224 should not match but did
   [junit4]   1>   small=true query=GeoPointInBBoxQuery: field=point: Lower 
Left: [85.35664315745854,-41.59146759172397] Upper Right: 
[86.66116425340478,-40.74649261518726] docID=7060
   [junit4]   1>   lat=-40.74649160581509 lon=86.53170426878272
   [junit4]   1>   deleted?=false
   [junit4]   2> jan 22, 2016 2:35:24 PM 
com.carrotsearch.randomizedtesting.RandomizedRunner$QueueUncaughtExceptionsHandler
 uncaughtException
   [junit4]   2> WARNING: Uncaught exception in thread: 
Thread[T4,5,TGRP-TestGeoPointQuery]
   [junit4]   2> java.lang.AssertionError: some hits were wrong
   [junit4]   2>at 
__randomizedtesting.SeedInfo.seed([3B2C0D9EBF6EC99D]:0)
   [junit4]   2>at org.junit.Assert.fail(Assert.java:93)
   [junit4]   2>at 
org.apache.lucene.util.BaseGeoPointTestCase$VerifyHits.test(BaseGeoPointTestCase.java:552)
   [junit4]   2>at 
org.apache.lucene.util.BaseGeoPointTestCase$2._run(BaseGeoPointTestCase.java:756)
   [junit4]   2>at 
org.apache.lucene.util.BaseGeoPointTestCase$2.run(BaseGeoPointTestCase.java:623)
   [junit4]   2> 
   [junit4]   2> NOTE: reproduce with: ant test  -Dtestcase=TestGeoPointQuery 
-Dtests.method=testAllLonEqual -Dtests.seed=3B2C0D9EBF6EC99D 
-Dtests.multiplier=2 -Dtests.slow=true 
-Dtests.linedocsfile=/lucenedata/hudson.enwiki.random.lines.txt.fixed 
-Dtests.locale=no -Dtests.timezone=Europe/Jersey -Dtests.asserts=true 
-Dtests.file.encoding=UTF-8
   [junit4] ERROR   2.46s | TestGeoPointQuery.testAllLonEqual <<<
   [junit4]> Throwable #1: java.lang.AssertionError
   [junit4]>at 
org.apache.lucene.util.BaseGeoPointTestCase.verify(BaseGeoPointTestCase.java:770)
   [junit4]>at 
org.apache.lucene.util.BaseGeoPointTestCase.testAllLonEqual(BaseGeoPointTestCase.java:203)
   [junit4]>at java.lang.Thread.run(Thread.java:745)Throwable #2: 
com.carrotsearch.randomizedtesting.UncaughtExceptionError: Captured an uncaught 
exception in thread: Thread[id=19, name=T4, state=RUNNABLE, 
group=TGRP-TestGeoPointQuery]
   [junit4]> Caused by: java.lang.AssertionError: some hits were wrong
   [junit4]>at 
__randomizedtesting.SeedInfo.seed([3B2C0D9EBF6EC99D]:0)
   [junit4]>at 
org.apache.lucene.util.BaseGeoPointTestCase$VerifyHits.test(BaseGeoPointTestCase.java:552)
   [junit4]>at 
org.apache.lucene.util.BaseGeoPointTestCase$2._run(BaseGeoPointTestCase.java:756)
   [junit4]>at 
org.apache.lucene.util.BaseGeoPointTestCase$2.run(BaseGeoPointTestCase.java:623)
   [junit4]   2> NOTE: test params are: codec=Asserting(Lucene60): 
{id=FSTOrd50, point=PostingsFormat(name=MockRandom)}, 
docValues:{id=DocValuesFormat(name=Direct), 
point=DocValuesFormat(name=Memory)}, sim=ClassicSimilarity, locale=no, 
timezone=Europe/Jersey
   [junit4]   2> NOTE: Linux 3.13.0-71-generic amd64/Oracle Corporation 
1.8.0_60 (64-bit)/cpus=8,threads=1,free=417685072,total=493355008
   [junit4]   2> NOTE: All tests run in this JVM: [TestGeoPointQuery]
   [junit4] Completed [1/1 (1!)] in 2.87s, 1 test, 1 error <<< FAILURES!
   [junit4] 
   [junit4] 
   [junit4] Tests with failures [seed: 3B2C0D9EBF6EC99D]:
   [junit4]   - org.apache.lucene.search.TestGeoPointQuery.testAllLonEqual
{noformat}


[jira] [Commented] (LUCENE-6930) Decouple GeoPointField from NumericType

2016-01-22 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15112478#comment-15112478
 ] 

Michael McCandless commented on LUCENE-6930:


I think you can attach javadoc to each enum constant value
({{PREFIX}}, {{NUMERIC}})?  And then add {{@deprecated}} for
{{NUMERIC}}'s javadoc.
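
A minimal sketch of per-constant javadoc with a deprecated legacy value follows. The wording of the comments and the wrapper class {{TermEncodingSketch}} with its {{defaultEncoding}} helper are assumptions for illustration; only the names {{TermEncoding}}, {{NUMERIC}} and {{PREFIX}} come from this thread.

```java
// Illustrative sketch only; javadoc wording is hypothetical.
final class TermEncodingSketch {

  /** How a geo point is turned into indexed terms. */
  enum TermEncoding {
    /**
     * Legacy encoding that reuses the numeric trie terms.
     *
     * @deprecated use {@link #PREFIX}, the newer encoding.
     */
    @Deprecated
    NUMERIC,

    /** Newer encoding that indexes geo-specific prefix terms. */
    PREFIX
  }

  /** Hypothetical helper: the encoding a default constructor would pick. */
  static TermEncoding defaultEncoding() {
    return TermEncoding.PREFIX;
  }

  public static void main(String[] args) {
    System.out.println("default encoding: " + defaultEncoding());
  }
}
```

Pairing the {{@deprecated}} javadoc tag with the {{@Deprecated}} annotation lets both the generated documentation and the compiler flag uses of the legacy constant.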

Can you add braces even if the body is a single line?  E.g.:

{noformat}
+if (isInit == false)
+  throw new IllegalStateException("call setGeoCode() before usage");
{noformat}





[jira] [Commented] (LUCENE-6930) Decouple GeoPointField from NumericType

2016-01-22 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15112774#comment-15112774
 ] 

Michael McCandless commented on LUCENE-6930:


OK, phew!




[jira] [Commented] (LUCENE-6930) Decouple GeoPointField from NumericType

2016-01-21 Thread Nicholas Knize (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15111345#comment-15111345
 ] 

Nicholas Knize commented on LUCENE-6930:


It's related to {{TOLERANCE}} and to the fact that {{computeMaxShift}} in this 
patch is reused for all query types, so the depth of traversal for distance 
queries may vary slightly. The effect is that trunk has a handful of false 
negatives that this patch has picked up. The tests don't report them as 
failures because the deltas are within the {{TOLERANCE}}.




[jira] [Commented] (LUCENE-6930) Decouple GeoPointField from NumericType

2016-01-20 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15109247#comment-15109247
 ] 

Michael McCandless commented on LUCENE-6930:


It's odd that we pass {{GeoPointTermQuery}} down to {{GeoPointTermsEnum}}
ctor which then goes and secretly sets a field:

{noformat}
  query.cellComparator.termEnum = this;
{noformat}

Can we make {{GeoPointTermQuery}} package private again, and do this
"up above"?  And put {{TermEncoding}} somewhere else to be public?

Can we rename {{GeoPointTermQuery}} to {{GeoPointMultiTermQuery}}?

In the javadocs for {{TermEncoding}}'s two options can you explain
that one is newer and smaller/faster than the other, which is now
"legacy" ({{NUMERIC}})?  Can we deprecate the {{NUMERIC}} one?

In {{GeoPointTermsEnum.newInstance}} can we change:

{noformat}
if (query.termEncoding == GeoPointTermQuery.TermEncoding.PREFIX) {
  return new GeoPointPrefixTermsEnum(terms, query);
}
return new GeoPointNumericTermsEnum(terms, query);
{noformat}

to e.g.:
{noformat}
if (query.termEncoding == GeoPointTermQuery.TermEncoding.PREFIX) {
  return new GeoPointPrefixTermsEnum(terms, query);
} else if (query.termEncoding == GeoPointTermQuery.TermEncoding.NUMERIC) {
  return new GeoPointNumericTermsEnum(terms, query);
} else {
  throw new IllegalArgumentException(...);
}
{noformat}
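
A self-contained sketch of why the explicit branch helps, using stand-in types rather than the real Lucene classes. Only the names {{TermEncoding}}, {{PREFIX}}, {{NUMERIC}} and the two enum class names come from this thread; the {{GeoTermsEnum}} interface and the exception message are hypothetical. In the two-branch form, any value that is not {{PREFIX}} (including {{null}}) silently gets the numeric enum; here it fails fast instead.

```java
final class TermsEnumFactorySketch {
  enum TermEncoding { NUMERIC, PREFIX }

  // Hypothetical stand-in for the real terms enum classes.
  interface GeoTermsEnum {
    String name();
  }

  static GeoTermsEnum newInstance(TermEncoding encoding) {
    if (encoding == TermEncoding.PREFIX) {
      return () -> "GeoPointPrefixTermsEnum";
    } else if (encoding == TermEncoding.NUMERIC) {
      return () -> "GeoPointNumericTermsEnum";
    }
    // Reached for null or for any encoding added later without a branch here.
    throw new IllegalArgumentException("unknown term encoding: " + encoding);
  }

  public static void main(String[] args) {
    System.out.println(newInstance(TermEncoding.PREFIX).name());
    System.out.println(newInstance(TermEncoding.NUMERIC).name());
  }
}
```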





[jira] [Commented] (LUCENE-6930) Decouple GeoPointField from NumericType

2016-01-19 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15106856#comment-15106856
 ] 

Michael McCandless commented on LUCENE-6930:


Hmm, the tests pass for me with this patch, but when I went to benchmark it, 
I'm seeing a different number of hits on trunk:

{noformat}
ITER: 5 5.861972507 sec; totHits=221120357; 225 queries
{noformat}

vs with the patch:

{noformat}
ITER: 5 5.754698705 sec; totHits=221120418; 225 queries
{noformat}

The results should not have changed, because we use doc values for precise 
matching, right?  Why are tests not catching this :)
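
The expectation above can be illustrated with a toy, one-dimensional sketch. It is not Lucene code: the class and method names are hypothetical, and it assumes the indexed terms act only as a coarse cell filter while an exact check against the stored full-precision value decides the final hit. Under that assumption the hit count does not depend on how coarse the cells are.

```java
final class TwoPhaseSketch {
  /**
   * Counts values inside [lo, hi]. A value is first selected by its coarse
   * cell (value >> cellShift), then verified against its exact value.
   */
  static int countHits(int[] docValues, int lo, int hi, int cellShift) {
    int minCell = lo >> cellShift;
    int maxCell = hi >> cellShift;
    int hits = 0;
    for (int value : docValues) {
      int cell = value >> cellShift;
      // Phase 1 (approximate): the value's cell overlaps the query range.
      boolean candidate = cell >= minCell && cell <= maxCell;
      // Phase 2 (exact): confirm with the full-precision value.
      if (candidate && value >= lo && value <= hi) {
        hits++;
      }
    }
    return hits;
  }

  public static void main(String[] args) {
    int[] values = {1, 5, 8, 12, 15, 16, 31};
    System.out.println(countHits(values, 5, 15, 2));
    System.out.println(countHits(values, 5, 15, 4));
  }
}
```

If two encodings still disagree on the total, one of them is either dropping candidates in the first phase or accepting some without the exact check.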




[jira] [Commented] (LUCENE-6930) Decouple GeoPointField from NumericType

2016-01-18 Thread Nicholas Knize (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15106018#comment-15106018
 ] 

Nicholas Knize commented on LUCENE-6930:


Try again with the new patch. I guess diff didn't like the way I refactored 
{{GeoUtils}} and added another new {{GeoUtils}} class.




[jira] [Commented] (LUCENE-6930) Decouple GeoPointField from NumericType

2016-01-18 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15105986#comment-15105986
 ] 

Michael McCandless commented on LUCENE-6930:


Hmm I get compilation errors because things still reference GeoUtils but it was 
moved?  Maybe run "ant clean" first and then you should see the errors?




[jira] [Commented] (LUCENE-6930) Decouple GeoPointField from NumericType

2016-01-18 Thread Nicholas Knize (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15106127#comment-15106127
 ] 

Nicholas Knize commented on LUCENE-6930:


Nice!!! Thanks for that protip! That certainly would have come in handy in this 
situation. 




[jira] [Commented] (LUCENE-6930) Decouple GeoPointField from NumericType

2016-01-18 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15106031#comment-15106031
 ] 

Michael McCandless commented on LUCENE-6930:


Thanks [~nknize]

If you pass {{--show-copies-as-adds}} to svn it will make a more easily applied 
patch ...




[jira] [Commented] (LUCENE-6930) Decouple GeoPointField from NumericType

2015-12-15 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15057865#comment-15057865
 ] 

Michael McCandless commented on LUCENE-6930:


+1, {{LegacyNumericType}} is now deprecated in trunk (to be removed in 7.0), so 
we should migrate away from it ...

But we should maybe take this further, once we get all dimensional values based 
geo queries working well in trunk (e.g. at least {{DimensionalDistanceQuery}} 
and {{DimensionalDistanceRangeQuery}} are still missing?) and deprecate the 
postings based geo queries as well?
