[jira] [Commented] (LUCENE-6529) NumericFields + SlowCompositeReaderWrapper + UninvertedReader + -Dtests.codec=random can results in incorrect SortedSetDocValues

Robert Muir (JIRA) Mon, 08 Jun 2015 15:54:17 -0700

    [ 
https://issues.apache.org/jira/browse/LUCENE-6529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14578008#comment-14578008
 ]


Robert Muir commented on LUCENE-6529:
-------------------------------------

{quote}
FWIW, as far as I understand BasePostingsFormatTestCase and 
RandomPostingsTester based on skimming them this morning, they may never 
reproduce this bug since (AFAICT) they only ever operate on single-segment 
indexes?
{quote}

The problem has nothing to do with single-segment indexes. Now we know: it's 
that this DocTermOrds optimization is conceptually broken with precisionStep. 
It causes problems downstream, but the root cause is that it is not filtering 
out the "range terms". It cannot return the terms dict directly; it needs to 
wrap it with something that filters those out. Methods like 
NumericUtils.intTerms()/longTerms() are close, but they do not yet support 
ord() and seek(ord), which are needed here.
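
To make the "filter out the range terms" idea concrete, here is a rough, 
self-contained sketch. It does not use Lucene at all: NumericTerm, 
buildDictionary and FullPrecisionView are hypothetical names invented for 
illustration, and the term encoding is simplified to a (shift, value >>> shift) 
pair for non-negative longs with a positive precisionStep. The only things it 
assumes about the real encoding are that each value is indexed once per shift 
and that terms sort by shift first, so the full-precision (shift == 0) terms 
form a prefix of the dictionary.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.TreeSet;

public class RangeTermFilterSketch {

    /** Hypothetical stand-in for one prefix-coded term: the value with its low `shift` bits dropped. */
    public static final class NumericTerm implements Comparable<NumericTerm> {
        public final int shift;
        public final long prefix;

        public NumericTerm(int shift, long prefix) {
            this.shift = shift;
            this.prefix = prefix;
        }

        // Assumption: shift is the primary sort key, so all shift == 0 terms come first.
        @Override
        public int compareTo(NumericTerm o) {
            int c = Integer.compare(shift, o.shift);
            return c != 0 ? c : Long.compare(prefix, o.prefix);
        }
    }

    /** Raw terms dict: every value contributes one term per shift 0, step, 2*step, ... below 64. */
    public static TreeSet<NumericTerm> buildDictionary(long[] values, int precisionStep) {
        TreeSet<NumericTerm> dict = new TreeSet<>();
        for (long v : values) {
            for (int shift = 0; shift < 64; shift += precisionStep) {
                dict.add(new NumericTerm(shift, v >>> shift));
            }
        }
        return dict;
    }

    /** Filtered view that hides range terms but still answers size, ord lookup and seek by ord. */
    public static final class FullPrecisionView {
        private final List<NumericTerm> terms = new ArrayList<>();

        public FullPrecisionView(TreeSet<NumericTerm> dict) {
            for (NumericTerm t : dict) {
                if (t.shift != 0) {
                    break; // sorted by shift first, so only range terms remain
                }
                terms.add(t);
            }
        }

        /** Number of real values, i.e. what a value count should report. */
        public long size() {
            return terms.size();
        }

        /** Value stored at the given ordinal. */
        public long seekByOrd(long ord) {
            return terms.get((int) ord).prefix;
        }

        /** Ordinal of the value, or a negative number if it is absent. */
        public long ordOf(long value) {
            return Collections.binarySearch(terms, new NumericTerm(0, value));
        }
    }

    public static void main(String[] args) {
        TreeSet<NumericTerm> dict = buildDictionary(new long[] {5L, 6L, 300L}, 4);
        FullPrecisionView view = new FullPrecisionView(dict);
        System.out.println("raw term count:      " + dict.size());
        System.out.println("filtered term count: " + view.size());
        System.out.println("ord of 300:          " + view.ordOf(300L));
    }
}
```

In this toy model the raw dictionary size is inflated by the range terms, 
which is the kind of mismatch a value count would show if the terms dict is 
returned unfiltered. The filtered view reports only the real values and keeps 
ord lookups working.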

> NumericFields + SlowCompositeReaderWrapper + UninvertedReader + 
> -Dtests.codec=random can results in incorrect SortedSetDocValues 
> ---------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: LUCENE-6529
>                 URL: https://issues.apache.org/jira/browse/LUCENE-6529
>             Project: Lucene - Core
>          Issue Type: Bug
>            Reporter: Hoss Man
>         Attachments: LUCENE-6529.patch, LUCENE-6529.patch
>
>
> Digging into SOLR-7631 and SOLR-7605 I became fairly confident that the only 
> explanation of the behavior I was seeing was some sort of bug in either the 
> randomized codec/postings-format or the UninvertedReader, that was only 
> evident when the two were combined and used on a multivalued Numeric Field 
> using precision steps.  But since I couldn't find any -Dtests.codec or 
> -Dtests.postings.format options that would trigger the bug 100% of the time 
> regardless of seed, I switched tactics and focused on reproducing the problem 
> using UninvertedReader directly and checking the 
> SortedSetDocValues.getValueCount().
> I now have a test that fails frequently (and consistently for any seed I 
> find), but only with -Dtests.codec=random -- override it with 
> -Dtests.codec=default and everything works fine (based on the exhaustive 
> testing I did in the linked issues, I suspect every named codec works fine - 
> but I didn't re-do that testing here).
> The failures only seem to happen when checking the 
> SortedSetDocValues.getValueCount() of a SlowCompositeReaderWrapper around the 
> UninvertedReader -- which suggests the root bug may actually be in 
> SlowCompositeReaderWrapper? (but still has some dependency on the random 
> codec)



