On Monday 18 October 2004 23:04, Doug Cutting wrote:
> Christoph Goller wrote:
> > With the current scorer API one could get rid of buckettable and
> > advance all subscores only by one document each time. I am not sure
> > whether the bucketable implementation is really much more efficient.
> > I only see the advantage of inlining some of the scorer.next and
> > score.score code.
>
> Indeed, sub-scorers could be, e.g., kept in a priority queue. This is
> done in ConjunctionScorer, PhraseScorer, etc. However this adds a
> priority queue update to the inner search loop. With long queries and
> with common terms this overhead can be significant. With short queries
> and/or with rare terms the current ScoreTable-based implementation may
> indeed be slower, but I believe with longer queries containing common
> terms it is substantially faster.
>
> This algorithm is described in:
>
> http://lucene.sourceforge.net/papers/riao97.ps
>
> If we had a priority-queue-based implementation then we could benchmark
> these. If we found that one were faster than the other for particular
> classes of queries then we could have a query optimizer which
> automatically selects the most efficient implementation...
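
For readers following the thread, the priority-queue variant discussed in the quoted text could look roughly like the sketch below. It is only an illustration under stated assumptions: SubScorer, Hit, DisjunctionSketch and scoreAll are hypothetical names, the per-clause score is a made-up constant, and java.util.PriorityQueue stands in for Lucene's own queue class. It is not the Lucene scorer API and not an existing implementation.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

/** Hypothetical stand-in for a sub-scorer: walks its matching doc ids in increasing order. */
final class SubScorer {
    private final int[] docs;    // sorted, distinct doc ids matched by this clause
    private final float weight;  // constant per-match score, for illustration only
    private int pos = -1;        // not positioned until next() is called

    SubScorer(int[] docs, float weight) {
        this.docs = docs;
        this.weight = weight;
    }

    /** Advances to the next matching document; returns false once exhausted. */
    boolean next() {
        return ++pos < docs.length;
    }

    int doc() {
        return docs[pos];
    }

    float score() {
        return weight;
    }
}

/** Hypothetical result holder: one matching document and its summed score. */
final class Hit {
    final int doc;
    final float score;

    Hit(int doc, float score) {
        this.doc = doc;
        this.score = score;
    }
}

/** Sketch of a disjunction (OR) whose sub-scorers sit in a queue ordered by current doc. */
final class DisjunctionSketch {
    /** Returns one Hit per document matched by at least one sub-scorer, in doc order. */
    static List<Hit> scoreAll(List<SubScorer> subScorers) {
        Comparator<SubScorer> byDoc = Comparator.comparingInt(SubScorer::doc);
        PriorityQueue<SubScorer> queue = new PriorityQueue<>(byDoc);
        for (SubScorer s : subScorers) {
            if (s.next()) {
                queue.add(s);  // position on the first doc; empty clauses are dropped
            }
        }
        List<Hit> hits = new ArrayList<>();
        while (!queue.isEmpty()) {
            int doc = queue.peek().doc();
            float sum = 0f;
            // Drain every sub-scorer currently positioned on the smallest doc id.
            while (!queue.isEmpty() && queue.peek().doc() == doc) {
                SubScorer s = queue.poll();
                sum += s.score();
                if (s.next()) {
                    queue.add(s);  // the queue update that lands in the inner loop
                }
            }
            hits.add(new Hit(doc, sum));
        }
        return hits;
    }
}
```

Each matching sub-scorer costs one poll and (usually) one add per document, which is the overhead the quoted text expects to grow with long queries and common terms; a benchmark against the table-based scorer would be needed to say where the crossover lies.
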

I have a DisjunctionScorer based on a PriorityQueue lying around,
but I can't benchmark it myself at the moment.
If there is interest, I'll gladly adapt it to
org.apache.lucene.search and add it to Bugzilla.

Regards,
Paul Elschot