[ 
https://issues.apache.org/jira/browse/LUCENE-4403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated LUCENE-4403:
--------------------------------

    Attachment: LUCENE-4403.patch

Start of a patch.

As I said, the javadocs are ambiguous: an unpositioned DocsEnum can return
NO_MORE_DOCS, but you also shouldn't call nextDoc() after you see this.

Because of the ambiguity I'm sure we have test bugs. It would be good to have a
test postings format that "caches" (reads ahead or something) so that it
sometimes does this and tickles them out.
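
To make the proposed reading concrete, here is a minimal sketch. It is not taken from the attached patch. SimpleDocIterator, ArrayDocIterator, CheckedDocIterator and the readAhead flag are hypothetical stand-ins written for illustration and are not Lucene classes; only the sentinel value (NO_MORE_DOCS == Integer.MAX_VALUE) and the -1 convention come from DocIdSetIterator. The read-ahead iterator reports NO_MORE_DOCS before nextDoc() only when it already knows it is empty, and the checking wrapper can then treat any nextDoc() after NO_MORE_DOCS as a consumer bug.

```java
// Hypothetical stand-in for Lucene's DocIdSetIterator, reduced to what this sketch needs.
interface SimpleDocIterator {
    int NO_MORE_DOCS = Integer.MAX_VALUE; // same sentinel value Lucene uses

    int docID();   // -1 if unpositioned, NO_MORE_DOCS if exhausted, else the current doc
    int nextDoc(); // advances and returns the new docID()
}

// Iterates over a sorted array of doc IDs. With readAhead set, it peeks at
// construction time, so an empty iterator reports NO_MORE_DOCS before nextDoc().
final class ArrayDocIterator implements SimpleDocIterator {
    private final int[] docs;
    private int pos = -1;
    private int doc;

    ArrayDocIterator(int[] docs, boolean readAhead) {
        this.docs = docs.clone();
        // Only claim exhaustion up front when it is actually known.
        this.doc = (readAhead && docs.length == 0) ? NO_MORE_DOCS : -1;
    }

    @Override public int docID() { return doc; }

    @Override public int nextDoc() {
        pos++;
        doc = pos < docs.length ? docs[pos] : NO_MORE_DOCS;
        return doc;
    }
}

// Wrapper that enforces "NO_MORE_DOCS means NO_MORE_DOCS": once the wrapped
// iterator reports NO_MORE_DOCS, calling nextDoc() again is an error.
final class CheckedDocIterator implements SimpleDocIterator {
    private final SimpleDocIterator in;

    CheckedDocIterator(SimpleDocIterator in) { this.in = in; }

    @Override public int docID() { return in.docID(); }

    @Override public int nextDoc() {
        if (in.docID() == NO_MORE_DOCS) {
            throw new IllegalStateException("nextDoc() called on an exhausted iterator");
        }
        return in.nextDoc();
    }
}

final class UnpositionedDocIdDemo {
    public static void main(String[] args) {
        SimpleDocIterator it =
            new CheckedDocIterator(new ArrayDocIterator(new int[] {3, 7}, true));
        System.out.println("before nextDoc(): " + it.docID());
        while (it.nextDoc() != SimpleDocIterator.NO_MORE_DOCS) {
            System.out.println("doc: " + it.docID());
        }
        // One more it.nextDoc() here would throw IllegalStateException.
    }
}
```

Under the current wording, where an unpositioned iterator may return NO_MORE_DOCS even though it still has docs, the check in CheckedDocIterator could not be written, because 'exhausted' and 'uninitialized' would look the same.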

                
> sharpen javadocs for DISI.docID() when unpositioned
> ---------------------------------------------------
>
>                 Key: LUCENE-4403
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4403
>             Project: Lucene - Core
>          Issue Type: Bug
>            Reporter: Robert Muir
>         Attachments: LUCENE-4403.patch
>
>
> Spinoff from LUCENE-4401.
> Currently DISI requires an unpositioned iterator to be -1 or NO_MORE_DOCS. 
> But I think we should refine this: in my opinion NO_MORE_DOCS should mean 
> NO_MORE_DOCS.
> So it's OK for it to return NO_MORE_DOCS when it's unpositioned, but only if it
> can already determine that it's exhausted.
> This makes life easier on consumers.
> {quote}
> Separately, we can't really test this situation very well as long as the
> javadocs for docID() say: Returns the following:
>     -1 or NO_MORE_DOCS if nextDoc() or
>     advance(int) were not called yet.
>     NO_MORE_DOCS if the iterator has exhausted.
>     Otherwise it should return the doc ID it is currently on.
> This prevents us from being able to easily assert that nobody is calling 
> nextDoc()/advance() after the enum is exhausted, since we cannot 
> differentiate 'exhausted' from 'uninitialized'.
> I think we should clarify the javadocs, such that if nextDoc()/advance() are 
> not called yet, you can still return NO_MORE_DOCS, but only if you somehow 
> know you are exhausted-before-you-start. NO_MORE_DOCS should mean 
> NO_MORE_DOCS.
> It could also be that everyone reads it this way already, and I'm just being
> super-anal.
> {quote}
> {quote}
> +1 to sharpen when a DocsEnum can return NO_MORE_DOCS before nextDoc: it 
> should only be if the enum knows it has zero docs. But I'm not even sure we 
> should allow that ... why not always make it -1 ...? We can do that 
> separately...
> {quote}
