DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG� RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT <http://issues.apache.org/bugzilla/show_bug.cgi?id=34407>. ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND� INSERTED IN THE BUG DATABASE.
http://issues.apache.org/bugzilla/show_bug.cgi?id=34407 ------- Additional Comments From [EMAIL PROTECTED] 2005-04-12 20:46 ------- (In reply to comment #2) ... > > > During development of a new scorer one can temporarily use the code > > shown in the javadocs of Scorer that implements skipTo with > > next() and doc(). > > I'm not sure I can do that. > What I'm trying to do is develop a replacement for RangeQuery that is fast and > *always* works (no expanding to BooleanQuery). I don't even care about scoring > since it almost never makes sense for a RangeQuery. Using the same techniques > as RangeFilter, I think it should be pretty easy to do, except for implementing > skipTo. A scorer that is only used in isolation or on the top level will only have next() called. For these it is safe to throw an UnsupportedOperationException from skipTo(). For an OR like query one could use the same technique as RangeFilter by using the scorers of the clauses separately. You might call this a DisjunctionFilter. Each of the separate scorers could implement skipTo(), but it wouldn't normally be used. Using skipTo would be only useful for very dense results to skip to the doc corresponding to the next unset bit in the BitSet. However, such dense results are not normal for text searching. Once a BitSet filter is available bug 32965 can also be useful. This uses skipTo on the scorer of the filtered query to skip over documents not present in the filter. > > It seems like skipTo for UnscoredRangeQuery would require either enumerating > *all* docs beforehand (store in a BitSet or whatever), or keeping a termdoc > enumerator open for every term in the range. Neither option seems attractive. skipTo() requires that the documents are accessed in order. That means some form of sorting is needed: for example a PriorityQueue with all termdoc enumerators open, or a BitSet for distribution sort. BooleanScorer with skipTo allowed uses a form of distribution sort (the buckets) combined with local sorting. Other methods are also possible, eg. merge sort, but these are currently not used in Lucene scorers. BooleanScorer without skipTo uses distribution and an incomplete form of sorting by working over intervals of document numbers. I don't think scoring disjunctions can be made faster than that. Since the sorting is not complete, the next() method does not guarantee that documents are accessed in order. BooleanScorers without skipTo using the same interval can be nested nicely, though. Regards, Paul Elschot -- Configure bugmail: http://issues.apache.org/bugzilla/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
