[ https://issues.apache.org/jira/browse/LUCENE-1652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12712745#action_12712745 ]
Michael McCandless commented on LUCENE-1652:
--------------------------------------------

bq. I'm not sure about it. In 3.0, we'll make nextDoc() abstract (for sure, since the default impl calls next()) and probably advance() also. So when you upgrade to 2.9, you can switch to calling nextDoc() and advance(), but if you implemented DISI, you won't be required to implement nextDoc() and/or advance(), so when you upgrade to 3.0 your code won't compile.

You're right -- on making nextDoc & advance abstract in 3.0, your code won't compile on upgrading to 3.0 and you'd have to go fix any custom DISIs you have.

But: if we leave doc() as is, you wouldn't be forced to do anything on that. You just implement nextDoc/advance and think you're done...

bq. When upgrading, I think we should assume (or even require) users reading CHANGES. When they notice that DISI has changed and that they need to implement two new methods, they should also notice the change in semantics of doc().

Relying only on this (seeing CHANGES.txt) is what makes me nervous.

bq. I take it that by "catastrophic" you mean that you're ok with people upgrading to 3.0 and don't compile, since that will force them to read CHANGES or javadocs and understand what they are now supposed to implement. Therefore if document() documents the new semantics, it is ok for us to rely on that, and if something fails, it's the user's problem.

Right that's what I mean by "catastrophic" (note: Marvin used it first, but I like it ;) )

But: I want the catastrophe specifically to apply to doc() as well, so that you are forced to make that a new method. Ie, I'm hoping that the extra step of having a newly named method is enough to get you to go and understand that we subtly changed its semantics.

bq. If we add document() (note the longer method name, compared to doc()) we can implement it following the new semantics and take advantage of that in 2.9 already (I think?).

Exactly, another benefit of this approach (besides bringing catastrophe) is that we can do all of this in 2.9, including taking advantage of the new semantics. Which is great.

bq. If this indeed should work, where should I do it - in this issue (I need 1614 to be committed first) or in 1614?

I think do this as another iteration of the patch on LUCENE-1614?

> Enhancements to Scorers following the changes to DocIdSetIterator
> -----------------------------------------------------------------
>
>                 Key: LUCENE-1652
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1652
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Shai Erera
>             Fix For: 3.0
>
>
> In LUCENE-1614, we changed the semantics of DocIdSetIterator's methods to
> return a sentinel NO_MORE_DOCS (= Integer.MAX_VALUE) when the iterator has
> exhausted. Due to backward compatibility issues, we couldn't implement that
> semantics in doc(). Therefore this issue, which can be introduced in 3.0 only
> will:
> # Implement the new semantics in all extending classes, such that doc() will
> return NO_MORE_DOCS when the iterator has exhausted.
> # Change BooleanScorer to take advantage of that by removing sub.done from
> SubScorer and operate under the assumption that NO_MORE_DOCS is larger than
> any doc ID (Integer.MAX_VALUE).
> # Change ConjunctionScorer to operate under the same assumptions and remove
> 'more'.
> # Change ReqExclScorer to not rely on reqScorer in doc(), since the latter
> may be null.
> # Make more changes to ConjunctionScorer's init() and remove 'firstTime' to
> improve the performance of nextDoc(), score(), advance().
> # Add start()/finish() to DISI?
>
> A snippet from LUCENE-1614 regarding the change in BooleanScorer
> {code}
> int doc = sub.done ? -1 : scorer.doc();
> while (!sub.done && doc < end) {
>   sub.collector.collect(doc);
>   doc = scorer.nextDoc();
>   sub.done = doc < 0;
> }
> {code}
> To this:
> {code}
> int doc = scorer.doc();
> while (doc < end) {
>   sub.collector.collect(doc);
>   doc = scorer.nextDoc();
> }
> {code}
> And in ConjunctionScorer, change this:
> {code}
> while (more && (firstScorer=scorers[first]).doc() < (lastDoc=lastScorer.doc())) {
>   more = firstScorer.advance(lastDoc) >= 0;
>   lastScorer = firstScorer;
>   first = (first == (scorers.length-1)) ? 0 : first+1;
> }
> return more;
> {code}
> To this:
> {code}
> while ((firstScorer=scorers[first]).doc() < (lastDoc=lastScorer.doc())) {
>   firstScorer.advance(lastDoc);
>   lastScorer = firstScorer;
>   first = (first == (scorers.length-1)) ? 0 : first+1;
> }
> return lastDoc != DOC_SENTINEL;
> {code}

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org
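
For anyone following the BooleanScorer snippet in the issue description, here is a minimal, self-contained sketch of why the sentinel makes the sub.done flag unnecessary. It is not Lucene code: SentinelDemo, ArrayDocIterator and collectWindow are hypothetical names made up for illustration, and the iterator is just a sorted int array. The only thing taken from the issue is the idea that doc() returns NO_MORE_DOCS (= Integer.MAX_VALUE) once the iterator is exhausted, so a plain "doc < end" test also covers the exhausted case.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; none of these names are Lucene classes.
final class SentinelDemo {
    // Same value the issue gives for the sentinel.
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    // Toy iterator over a sorted array of doc IDs.
    static final class ArrayDocIterator {
        private final int[] docs;
        private int idx = -1;
        private int doc = -1; // not positioned until nextDoc()/advance() is called

        ArrayDocIterator(int[] sortedDocs) {
            this.docs = sortedDocs;
        }

        // Current doc ID, or NO_MORE_DOCS once exhausted (the "new semantics").
        int doc() {
            return doc;
        }

        int nextDoc() {
            if (idx < docs.length) {
                idx++;
            }
            doc = idx < docs.length ? docs[idx] : NO_MORE_DOCS;
            return doc;
        }

        // Simplified: moves forward until doc >= target; stops at the sentinel.
        int advance(int target) {
            while (doc < target) {
                nextDoc();
            }
            return doc;
        }
    }

    // Mirrors the shape of the "To this:" loop: no done flag, because the
    // sentinel is never smaller than any window end.
    static List<Integer> collectWindow(ArrayDocIterator it, int end) {
        List<Integer> collected = new ArrayList<>();
        int doc = it.doc();
        while (doc < end) {
            collected.add(doc);
            doc = it.nextDoc();
        }
        return collected;
    }
}
```

Calling collectWindow again on an exhausted iterator simply collects nothing, which is the behavior the old code needed sub.done to get. The sketch assumes the caller has already positioned the iterator with nextDoc() or advance() before the first window.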