Shai Erera <ser...@gmail.com> wrote: >> The difference is for the new code, it's an upcast, which catches any >> errors at compile time, not run time. The compiler determines that >> the class implements the required interface. > > I still don't understand how a compiler can detect at compilation time that > a HitCollector instance that is used in method A, and is casted to a > TopDocsOutput instance by calling method B (from A) is actually ok ... I may > be missing some Java information here, but I simply don't see how that > happens at compilation time instead of at runtime ...
I may have lost the context here... but here's what I thought we were talking about. If we choose the interface option (adding a "ProvidesTopDocsResults" (say) interface), then you would create method renderResults(ProvidesTopDocResults). Then, any collector implementing that interface could be safely passed in, as the upcast is done at compile time not runtime. > So in fact the internal Lucene code expects only MRHC from a certain point, > and so even if I wrote a HC and passed it on Searcher, it's still converted > to MRHC, with an empty setNextReader() method implementation. That's why I > meant that HC is already deprecated, whether it's marked as deprecated or > not. The setNextReader() impl is not empty; it does the re-basing of docID on behalf of the HC. > What you say about deprecating HC to me is unnecessary. Simply pull > setNextReader up with an empty implementation, get rid of all the > instanceof, casting and wrapping code and you're fine. Nothing is broken. > All works well and better (instanceof, casting and wrapping have their > overheads). > Isn't that right? I think we need to deprecate HC, in favor of MRHC (or if we can think of a better name... ResultCollector?). > Regarding interfaces .. I don't think they're that bad. Perhaps a different > viewing angle might help. Lucene processes a query and fires events. Today > it fires an event every time a doc's score has been computed, and recently > every time it starts to process a different reader. HitCollector is a > listener implementation on the "doc-score event", while MRHC is a listener > on both. > To me, interfaces play just nicely here. Assume that you have the following > interfaces: > - DocScoreEvent > - ChangeReaderEvent > - EndProcessingEvent (thrown when the query has finished processing - > perhaps to aid collectors to free resources) > - <any other events you foresee?> > The Lucene code receives a HitCollector which listens on all events. In the > future, Lucene might throw other events, but still get a HitCollector. Those > methods will check for instanceof, and you as a user will know that if you > want to catch those events, you pass in a collector implementation which > does. Those events cannot of course be main-stream events (like > DocScoreEvent), but new ones, perhaps even experimental. > Since HitCollector is a concrete class, we can always add new interfaces to > it in the future with empty implementations? I agree interfaces clearly have useful properties, but their achilles heel for Lucene in all but the most trivial needs is the non-back-compatibility when you want to add a method to the interface. There have been a number of java-dev discussons on this problem. So, I think something like this: * Deprecate HitCollector in favor of MultiReaderHitCollector (any better name here?) abstract class. If you want to make a fully custom HitCollector, you subclass this calss. Let's change MRHC's collect to take only an [unbased] docID, and expose a setScorer(Scorer) method. Then if the collector needs score it can call Scorer.score(). * Subclass that to create an abstract "tracks top N results collector" (TopDocsCollector? TopHitsCollector?) * Subclass TopDocsCollector to a final, fast "top N sorted by score" collector (exists already: TopScoreDocCollector) * Subclass TopDocsCollector to a final, fast "top N sorted by field" collector (exists already: TopFieldCollector) * Subclass TopDocsCollector to a "you provide your own pqueue and we collect top N according to it" collector (does not yet exist -- name?). This is the way forward for existing subclasses of TopDocCollector. Shai do you want to take a first cut at making a patch? Can you open an issue? Thanks. Mike --------------------------------------------------------------------- To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org For additional commands, e-mail: java-dev-h...@lucene.apache.org