[ https://issues.apache.org/jira/browse/LUCENE-4403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Robert Muir updated LUCENE-4403:
--------------------------------
    Attachment: LUCENE-4403.patch

Start of a patch.

As I said, the javadocs are ambiguous: an unpositioned DocsEnum can return NO_MORE_DOCS, but you also shouldn't call nextDoc() after you see this.

Because of the ambiguity I'm sure we have test bugs: it would be good to have a test postings-format that "caches" (reads ahead or something) to sometimes do this and tickle them out.

> sharpen javadocs for DISI.docID() when unpositioned
> ---------------------------------------------------
>
>                 Key: LUCENE-4403
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4403
>             Project: Lucene - Core
>          Issue Type: Bug
>            Reporter: Robert Muir
>         Attachments: LUCENE-4403.patch
>
>
> Spinoff from LUCENE-4401.
> Currently DISI requires an unpositioned iterator to be -1 or NO_MORE_DOCS.
> But I think we should refine this: in my opinion NO_MORE_DOCS should mean
> NO_MORE_DOCS.
> So it's OK for it to return NO_MORE_DOCS when it's unpositioned, but only if it
> can already determine that it's exhausted.
> This makes life easier on consumers.
> {quote}
> Separately we can't really test this situation very well as long as the
> javadocs for nextDoc say, Returns the following:
> -1 or NO_MORE_DOCS if nextDoc() or advance(int) were not called yet.
> NO_MORE_DOCS if the iterator has exhausted.
> Otherwise it should return the doc ID it is currently on.
> This prevents us from being able to easily assert that nobody is calling
> nextDoc()/advance() after the enum is exhausted, since we cannot
> differentiate 'exhausted' from 'uninitialized'.
> I think we should clarify the javadocs, such that if nextDoc()/advance() are
> not called yet, you can still return NO_MORE_DOCS, but only if you somehow
> know you are exhausted-before-you-start. NO_MORE_DOCS should mean
> NO_MORE_DOCS.
> It could also be everyone reads it this way already, and I'm just being
> super-anal.
> {quote}
> {quote}
> +1 to sharpen when a DocsEnum can return NO_MORE_DOCS before nextDoc: it
> should only be if the enum knows it has zero docs. But I'm not even sure we
> should allow that ... why not always make it -1 ...? We can do that
> separately...
> {quote}
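
To make the proposed reading concrete, here is a minimal sketch of the sharpened contract and of the kind of check it would allow in tests. It is not the attached patch and does not use the Lucene classes: SimpleDisi, ArrayDisi and AssertingDisi are hypothetical stand-ins invented for illustration. The only detail borrowed from Lucene is that NO_MORE_DOCS is Integer.MAX_VALUE. The sketch assumes that an unpositioned iterator reports NO_MORE_DOCS only when it already knows it is empty, so a wrapper can treat any NO_MORE_DOCS as "exhausted" and reject later nextDoc() calls.

```java
// Hypothetical stand-in for a doc ID iterator; NOT Lucene's DocIdSetIterator.
abstract class SimpleDisi {
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    abstract int docID();

    abstract int nextDoc();
}

// Iterates a sorted array of doc IDs. An empty array means the iterator
// knows up front that it is exhausted, so it reports NO_MORE_DOCS even
// before nextDoc() is called; otherwise it reports -1 while unpositioned.
final class ArrayDisi extends SimpleDisi {
    private final int[] docs;
    private int idx = -1;
    private int doc;

    ArrayDisi(int... docs) {
        this.docs = docs;
        this.doc = docs.length == 0 ? NO_MORE_DOCS : -1;
    }

    @Override
    int docID() {
        return doc;
    }

    @Override
    int nextDoc() {
        idx++;
        doc = idx < docs.length ? docs[idx] : NO_MORE_DOCS;
        return doc;
    }
}

// Wrapper that enforces the sharpened reading: NO_MORE_DOCS always means
// exhausted, so calling nextDoc() after seeing it is a consumer bug.
final class AssertingDisi extends SimpleDisi {
    private final SimpleDisi in;
    private boolean exhausted;

    AssertingDisi(SimpleDisi in) {
        this.in = in;
        int doc = in.docID();
        if (doc != -1 && doc != NO_MORE_DOCS) {
            throw new IllegalStateException(
                "unpositioned docID() must be -1 or NO_MORE_DOCS, got " + doc);
        }
        // Assumption of this sketch: NO_MORE_DOCS before the first
        // nextDoc() means "exhausted before you start".
        this.exhausted = (doc == NO_MORE_DOCS);
    }

    @Override
    int docID() {
        return in.docID();
    }

    @Override
    int nextDoc() {
        if (exhausted) {
            throw new IllegalStateException("nextDoc() called after NO_MORE_DOCS");
        }
        int doc = in.nextDoc();
        exhausted = (doc == NO_MORE_DOCS);
        return doc;
    }
}
```

Wrapping new ArrayDisi(3, 7) yields -1, then 3, 7 and NO_MORE_DOCS, and a further nextDoc() throws. Wrapping new ArrayDisi() reports NO_MORE_DOCS immediately and rejects the first nextDoc(). Under the current, looser wording the wrapper could not make that second check, because NO_MORE_DOCS might just mean "uninitialized".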