http://issues.apache.org/bugzilla/show_bug.cgi?id=34407

------- Additional Comments From [EMAIL PROTECTED]  2005-04-13 09:03 -------
(In reply to comment #4)
..
>
> My motivation for a RangeQuery is not making it faster for the average case,
> it's making it possible in any scenario (any place in a query, any number of
> terms, etc).
>
> We have some search collections with over 100M documents. Now imagine a range
> query on a unique id field... I don't think any method utilizing 100M termdoc
> enumerators is really feasible (am I understanding correctly?)

This is very similar to a date range. Try searching for this on the web:
yyyy yyyymm yyyymmdd lucene

The results are getting dense in this way, and for performance you
might consider caching (intermediate) results in (BitSet) filters.
Lucene itself is meant for smaller numbers of results.

100M docs means about 12 Mbyte per BitSet filter.
When your filters contain fewer docs than 12M and you need many filters
you might consider the sparse filters of bug 32921.
However, these filters require skipTo on all their filtered scorers,
meaning that they require the development version of BooleanQuery
at the moment.

Regards,
Paul Elschot

P.S. Perhaps someone is interested in writing a story about Lucene
and the ordered document skippers. It's getting a bit involved.
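
The size figures in the reply can be checked with a little arithmetic. The sketch below uses only java.util.BitSet from the JDK, not any Lucene API. The class name FilterSizeSketch and its helper methods are made up for illustration, and the cost of roughly one byte per matching document for a sparse filter is an assumption chosen because it reproduces the 12M break-even mentioned above; it is not taken from the code attached to bug 32921.

```java
import java.util.BitSet;

class FilterSizeSketch {
    // ASSUMPTION: a sparse filter costs about one byte per matching document.
    static final long ASSUMED_BYTES_PER_SPARSE_DOC = 1;

    // A dense BitSet-style filter needs one bit per document in the index,
    // regardless of how many documents actually match.
    static long denseBytes(long numDocs) {
        return (numDocs + 7) / 8;
    }

    // Hypothetical sparse filter: its cost grows with the number of matches only.
    static long sparseBytes(long matchingDocs) {
        return matchingDocs * ASSUMED_BYTES_PER_SPARSE_DOC;
    }

    // True when the assumed sparse representation is smaller than the dense one.
    static boolean sparseIsSmaller(long numDocs, long matchingDocs) {
        return sparseBytes(matchingDocs) < denseBytes(numDocs);
    }

    // Cache the result of a range over document numbers [from, to) as a BitSet.
    static BitSet cacheRange(int numDocs, int from, int to) {
        BitSet bits = new BitSet(numDocs);
        bits.set(from, to); // the upper bound is exclusive
        return bits;
    }

    public static void main(String[] args) {
        long numDocs = 100_000_000L;
        System.out.println("dense filter bytes: " + denseBytes(numDocs));
        System.out.println("sparse smaller at 1M matches: " + sparseIsSmaller(numDocs, 1_000_000L));
        System.out.println("sparse smaller at 20M matches: " + sparseIsSmaller(numDocs, 20_000_000L));

        BitSet cached = cacheRange(1000, 100, 200);
        System.out.println("cached matches: " + cached.cardinality());
    }
}
```

With 100M documents the dense filter comes to 12,500,000 bytes, which matches the "about 12 Mbyte" estimate. Under the one-byte-per-document assumption, a sparse filter only pays off while it holds fewer than about 12.5M documents.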
