[jira] [Commented] (LUCENE-10288) Are 1-dimensional kd trees in pre-86 indices always unbalanced trees?

Ignacio Vera (Jira) Mon, 06 Dec 2021 01:30:06 -0800


    [ 
https://issues.apache.org/jira/browse/LUCENE-10288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17453877#comment-17453877
 ]


Ignacio Vera commented on LUCENE-10288:
---------------------------------------

A first inspection of the code shows that {{SortingCodecReader}} is not used 
when adding an index sort, which is good. Therefore this error is probably an 
artifact of the test wrapping the current codecs. In the test, I see that we 
use real writers only when there are many points:
{code:java}
boolean useRealWriter = docValues.length > 10000; {code}
If it is set to true, the test does not fail. Maybe for backward codecs we 
should always use real writers, i.e. writers that are not wrapped?
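
To make the wrapping effect from the quoted description concrete, here is a 
minimal, self-contained sketch. It does not use Lucene classes: 
{{PointsReader}}, {{NativeReader}}, {{WrappedReader}} and {{canBulkMerge}} are 
hypothetical stand-ins that only mirror the shape of the {{instanceof}} check 
in the merge code, under the assumption that a wrapper hides the concrete 
reader type.

```java
import java.util.List;

public class BulkMergeSketch {
  // Hypothetical stand-in for a points reader abstraction (not the Lucene class).
  interface PointsReader {}

  // Hypothetical stand-in for the codec's own reader type.
  static final class NativeReader implements PointsReader {}

  // Hypothetical wrapper, playing the role of a test or sorting wrapper.
  static final class WrappedReader implements PointsReader {
    final PointsReader in;

    WrappedReader(PointsReader in) {
      this.in = in;
    }
  }

  // Same shape as the quoted merge check: bulk merge only if every
  // reader is exactly the codec's own reader type.
  static boolean canBulkMerge(List<? extends PointsReader> readers) {
    for (PointsReader reader : readers) {
      if (reader instanceof NativeReader == false) {
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    PointsReader plain = new NativeReader();
    PointsReader wrapped = new WrappedReader(new NativeReader());

    System.out.println(canBulkMerge(List.of(plain)));          // prints "true"
    System.out.println(canBulkMerge(List.of(plain, wrapped))); // prints "false"
  }
}
```

In this sketch a single wrapped reader is enough to send the merge down the 
generic path, which matches what the description says happens in the test.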

 

 

> Are 1-dimensional kd trees in pre-86 indices always unbalanced trees?
> ---------------------------------------------------------------------
>
>                 Key: LUCENE-10288
>                 URL: https://issues.apache.org/jira/browse/LUCENE-10288
>             Project: Lucene - Core
>          Issue Type: Bug
>            Reporter: Ignacio Vera
>            Priority: Major
>
> I am looking into a test error. It can be reproduced with the following 
> command in branch 9x:
> {code}
> ./gradlew :lucene:backward-codecs:test --tests 
> "org.apache.lucene.backward_codecs.lucene60.TestLucene60PointsFormat.testOneDimTwoValues"
>   -Dtests.seed=A70882387D2AAFC2 -Dtests.multiplier=3 
> {code}
> The actual error looks like:
> {code:java}
> org.apache.lucene.backward_codecs.lucene60.TestLucene60PointsFormat > test 
> suite's output saved to 
> /Users/ivera/projects/lucene_prod/lucene/backward-codecs/build/test-results/test/outputs/OUTPUT-org.apache.lucene.backward_codecs.lucene60.TestLucene60PointsFormat.txt,
>  copied below:
>    >     java.lang.AssertionError: expected:<1137> but was:<1138>
>    >         at 
> __randomizedtesting.SeedInfo.seed([A70882387D2AAFC2:1B737C7FDE6454F3]:0)
>    >         at org.junit.Assert.fail(Assert.java:89)
>    >         at org.junit.Assert.failNotEquals(Assert.java:835)
>    >         at org.junit.Assert.assertEquals(Assert.java:647)
>    >         at org.junit.Assert.assertEquals(Assert.java:633)
>  {code}
> For indices created with this codec we assume that for the 1D case the 
> kd-trees are unbalanced, but for the ND case we assume that they are always 
> fully balanced. This is true for the generic case, but this failure suggests 
> that it might not always be the case.
> During this test a merge is going on, and during the merge we have the 
> following code:
> {code:java}
> for (PointsReader reader : mergeState.pointsReaders) {
>   if (reader instanceof Lucene60PointsReader == false) {
>     // We can only bulk merge when all to-be-merged segments use our format:
>     super.merge(mergeState);
>     return;
>   }
> } {code}
> So we only bulk merge segments that use {{Lucene60PointsReader}}. Note that 
> if we do not bulk merge a 1D index, then it will be created as a fully 
> balanced tree!
> In this case the test is wrapping the readers with the 
> {{SlowCodecReaderWrapper}} and therefore tricking our logic.
> But I am wondering if this is also the case for index sorting, where our 
> readers might be wrapped with the {{SortingCodecReader}}.
>  
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org
