This stuff is surprisingly hard to think about!

>> Actually, it was Nadav who first proposed the "read interface", to
>> solve the "there's no common way for reading its output" problem.
>> With an interface (say TopDocsOutput), then you could have some
>> method somewhere:
>>
>>   renderResults(TopDocsOutput results)
>>
>> and then any collector, independent of how it *collects* results,
>> could implement TopDocsOutput if appropriate.
>
> You'd still need to cast the collector to TopDocsOutput, won't you?
> How's that different than the code snippet I showed above?

The difference is that with the new code it's an upcast, which catches
any errors at compile time, not run time.  The compiler verifies that
the class implements the required interface.

> The current situation introduces a bug, that's true. However, unless
> something better pops up, shouldn't we just make it final?

But that leaves no way forward for current users subclassing
TopDocCollector (for the freedom of providing your own pqueue).

> May I suggest something else? What if MRHC was actually an
> interface?

I think an interface is too dangerous in this case (the future back
compatibility problem).  E.g., here we are wanting to explore a way to
not pre-compute the score; had we released MRHC as an interface we'd
be in trouble.  (We may still be in trouble, anyway!)

>> Would TopDocsCollector subclass HitCollector or
>> MultiReaderHitCollector?
>
> Well ... we've been there as well already :). I don't think there's
> an easy answer here. I guess if MRHC is the better approach, and we
> think all Top***DocCollector would want to have the MRHC
> functionality, then I'd say let's extend MRHC. Otherwise, I don't
> have a good answer. When I started this thread, I only knew of
> HitCollector, so things were simpler at the time.

We have challenging goals here:

  * The "collect top N by score" collector should be final, use
    ScorerDocQueue, and be specialized to sorting by score/docID:
    performance is important.

  * Likewise for the "collect top N by sorted field" collector,
    though it does provide extensibility by letting you make a custom
    comparator (FieldComparatorSource).  Ideally this would work both
    with and without computing the score (it does not today).

  * A "top N by my own pqueue" collector (this is what
    TopDocCollector/TopScoreDocsCollector allow today, but it has the
    bug).

  * Allow fully custom collection, with and without score.

Maybe we should in fact simply deprecate HitCollector (in favor of
MultiReaderHitCollector)?
After all, making your own HitCollector is an advanced thing;
expecting you to properly implement setNextReader may be fine.

And then we can subclass MultiReaderHitCollector to TopDocsCollector
(which adds the totalHits/topDocs "results delivery" API).  And then
the "collect top docs by score" and "collect top docs by fields"
collectors subclass TopDocsCollector?  Finally, we add a "collect top
docs according to my own pqueue" class.

Then we wouldn't need an interface; this works because all core
collectors deliver top N results in the end.

All that's missing is a way to NOT compute score if it's not needed.

Mike
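
For illustration, the hierarchy proposed in this message could be sketched roughly as below. This is only a sketch under assumptions, not the actual Lucene code: the classes are simplified stand-ins, setNextReader is reduced to taking just a docBase, java.util.PriorityQueue stands in for the specialized queue, and ByScoreCollector and Demo are hypothetical names. It also shows the upcast point: renderResults accepts any TopDocsCollector, so the compiler checks the argument type and no cast is needed.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

// Simplified stand-in for MultiReaderHitCollector (signatures are assumptions).
abstract class MultiReaderHitCollector {
    // Called once per sub-reader; docBase maps local doc IDs to global ones.
    public abstract void setNextReader(int docBase);
    public abstract void collect(int doc, float score);
}

// Adds the totalHits/topDocs "results delivery" API on top of collection.
abstract class TopDocsCollector extends MultiReaderHitCollector {
    protected int totalHits;
    public int getTotalHits() { return totalHits; }
    // Global doc IDs of the top hits, best hit first.
    public abstract List<Integer> topDocs();
}

// Hypothetical "collect top N by score" collector; final, as suggested above.
final class ByScoreCollector extends TopDocsCollector {
    private static final class Hit {
        final int doc;
        final float score;
        Hit(int doc, float score) { this.doc = doc; this.score = score; }
    }

    private final int n;
    private int docBase;
    // Min-heap: the weakest hit (lowest score, then highest doc ID) is the head.
    private final PriorityQueue<Hit> pq = new PriorityQueue<>((a, b) ->
        a.score != b.score ? Float.compare(a.score, b.score)
                           : Integer.compare(b.doc, a.doc));

    ByScoreCollector(int n) { this.n = n; }

    @Override public void setNextReader(int docBase) { this.docBase = docBase; }

    @Override public void collect(int doc, float score) {
        totalHits++;
        pq.add(new Hit(docBase + doc, score));
        if (pq.size() > n) {
            pq.poll(); // drop the weakest hit once more than n are held
        }
    }

    @Override public List<Integer> topDocs() {
        PriorityQueue<Hit> copy = new PriorityQueue<>(pq); // keeps the ordering
        List<Integer> docs = new ArrayList<>();
        while (!copy.isEmpty()) {
            docs.add(copy.poll().doc); // weakest first
        }
        Collections.reverse(docs);     // best first
        return docs;
    }
}

final class Demo {
    // Accepts any TopDocsCollector; the argument type is checked at compile time.
    static String renderResults(TopDocsCollector results) {
        return results.getTotalHits() + " hits, top docs " + results.topDocs();
    }

    public static void main(String[] args) {
        ByScoreCollector c = new ByScoreCollector(2);
        c.setNextReader(0);
        c.collect(0, 1.0f);
        c.collect(1, 3.0f);
        c.setNextReader(10);
        c.collect(0, 2.0f);
        System.out.println(renderResults(c)); // prints "3 hits, top docs [1, 10]"
    }
}
```

A "top N by my own pqueue" collector would be another TopDocsCollector subclass that takes the queue as a constructor argument, and renderResults would accept it unchanged.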