[jira] [Commented] (LUCENE-10288) Are 1-dimensional kd trees in pre-86 indices always unbalanced trees?

Adrien Grand (Jira) Thu, 13 Jan 2022 07:00:05 -0800


    [ 
https://issues.apache.org/jira/browse/LUCENE-10288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17475423#comment-17475423
 ]


Adrien Grand commented on LUCENE-10288:
---------------------------------------

This problem would also happen if you used a codec (e.g. SimpleText) and later 
moved to a different codec that uses a different PointsFormat (e.g. the default 
codec), so I think we should make sure everything works correctly in that case. 
Do you know if it would be possible to detect, at open time, whether the tree 
is fully balanced or not, and adjust the read logic accordingly?

> Are 1-dimensional kd trees in pre-86 indices always unbalanced trees?
> ---------------------------------------------------------------------
>
>                 Key: LUCENE-10288
>                 URL: https://issues.apache.org/jira/browse/LUCENE-10288
>             Project: Lucene - Core
>          Issue Type: Bug
>            Reporter: Ignacio Vera
>            Priority: Blocker
>             Fix For: 9.1
>
>
> I am looking into a test error; it can be reproduced with the following 
> command in branch 9x:
> {code}
> ./gradlew :lucene:backward-codecs:test \
>   --tests "org.apache.lucene.backward_codecs.lucene60.TestLucene60PointsFormat.testOneDimTwoValues" \
>   -Dtests.seed=A70882387D2AAFC2 -Dtests.multiplier=3
> {code}
> The actual error looks like:
> {code:java}
> org.apache.lucene.backward_codecs.lucene60.TestLucene60PointsFormat > test suite's output saved to /Users/ivera/projects/lucene_prod/lucene/backward-codecs/build/test-results/test/outputs/OUTPUT-org.apache.lucene.backward_codecs.lucene60.TestLucene60PointsFormat.txt, copied below:
>    >     java.lang.AssertionError: expected:<1137> but was:<1138>
>    >         at __randomizedtesting.SeedInfo.seed([A70882387D2AAFC2:1B737C7FDE6454F3]:0)
>    >         at org.junit.Assert.fail(Assert.java:89)
>    >         at org.junit.Assert.failNotEquals(Assert.java:835)
>    >         at org.junit.Assert.assertEquals(Assert.java:647)
>    >         at org.junit.Assert.assertEquals(Assert.java:633)
>  {code}
> For indices created with this codec we assume that in the 1D case the 
> kd-trees are unbalanced, but in the ND case we assume that they are always 
> fully balanced. This is true for the generic case, but this failure suggests 
> that it might not always be the case.
> During this test a merge is going on, and during the merge we have the 
> following code:
> {code:java}
> for (PointsReader reader : mergeState.pointsReaders) {
>   if (reader instanceof Lucene60PointsReader == false) {
>     // We can only bulk merge when all to-be-merged segments use our format:
>     super.merge(mergeState);
>     return;
>   }
> }
> {code}
> So we only bulk merge segments that use {{Lucene60PointsReader}}. Note that 
> if we do not bulk merge a 1D index, then it will be created as a fully 
> balanced tree!
> In this case the test is wrapping the readers with the 
> {{SlowCodecReaderWrapper}} and therefore tricking our logic.
> But I am wondering if this is also the case for index sorting, where our 
> readers might be wrapped with the {{SortingCodecReader}}.
>  
>  
>  
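
The wrapping problem described in the quoted report can be illustrated with a 
minimal, self-contained sketch. This is not Lucene code: NativeReader, 
WrappingReader, MergePath and choosePath are hypothetical stand-ins for 
Lucene60PointsReader, a wrapper such as SlowCodecReaderWrapper, and the 
instanceof check in the merge code quoted above. It only shows, under those 
assumptions, that a single wrapped reader is enough to send the merge down the 
generic (non-bulk) path.

```java
import java.util.List;

public class BulkMergeCheckSketch {
  // Hypothetical stand-in for PointsReader.
  interface PointsReader {}

  // Hypothetical stand-in for the format's own reader (e.g. Lucene60PointsReader).
  static final class NativeReader implements PointsReader {}

  // Hypothetical stand-in for a wrapper that hides the concrete reader type.
  static final class WrappingReader implements PointsReader {
    final PointsReader in;

    WrappingReader(PointsReader in) {
      this.in = in;
    }
  }

  enum MergePath { BULK, GENERIC }

  // Mirrors the shape of the quoted check: bulk merge only if every reader
  // is exactly the native type; otherwise fall back to the generic merge.
  static MergePath choosePath(List<PointsReader> readers) {
    for (PointsReader reader : readers) {
      if (reader instanceof NativeReader == false) {
        return MergePath.GENERIC;
      }
    }
    return MergePath.BULK;
  }

  public static void main(String[] args) {
    PointsReader a = new NativeReader();
    PointsReader b = new NativeReader();
    // All readers are native: the bulk path is chosen.
    System.out.println(choosePath(List.of(a, b)));
    // One reader is wrapped: the check no longer sees the native type.
    System.out.println(choosePath(List.of(a, new WrappingReader(b))));
  }
}
```

In this sketch the second call takes the generic path even though the 
underlying data was written by the native format, which is the situation the 
report suspects for wrapped readers.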



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org
