On Wed, Jun 10, 2009 at 20:17, Uwe Schindler <u...@thetaphi.de> wrote:
> You are right, you can, but if you just want to retrieve all hits, this is
> ineffective. A HitCollector is the correct way to do this (especially
> because the order of hits is mostly not interesting when retrieving all
> hits). Hits and TopDocs are intended for paged results lists.

As a related note, one thing I have noticed about using HitCollector on its own is that the calling code effectively loses control of the loop. (You get the same problem with any API where you hand over a callback and let it do all the work, e.g. SAX.) A callback is fine if you have a relatively small number of results and/or a relatively fast operation to perform on each one, but if the process as a whole takes a long time and the user wants to be able to cancel it, it isn't great. It also isn't great if you want to wrap an Iterator or some other existing API around it.

Our workaround is a HitCollector which populates a BitSet (relatively fast); we then do the slow operation while iterating over the BitSet. This has a drawback in terms of memory usage, but that doesn't become a huge problem until you have a very large number of documents in the index.

It's a shame we don't have an inverted kind of HitCollector where we can say "give me the next hit", so that we could get the best of both worlds (like what StAX gives us in the XML world).

Daniel

-- 
Daniel Noll
Senior Developer, Nuix
http://nuix.com/
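
The BitSet workaround described above can be sketched roughly as follows. This is a minimal illustration, not the code used at Nuix. To keep it runnable without Lucene on the classpath, `HitCallback` is a hypothetical stand-in for Lucene's `HitCollector` and its `collect(int doc, float score)` method. `BitSetCollector`, `BitSetDocIterator` and `CancellableProcessing` are likewise names invented for this example.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in for Lucene's HitCollector callback, so that the
// sketch compiles on its own.
interface HitCallback {
    void collect(int doc, float score);
}

// Fast phase: the callback only records which document ids matched.
final class BitSetCollector implements HitCallback {
    private final BitSet bits;

    BitSetCollector(int maxDoc) {
        this.bits = new BitSet(maxDoc);
    }

    public void collect(int doc, float score) {
        bits.set(doc); // the score is discarded; hits come back in id order
    }

    BitSet bits() {
        return bits;
    }
}

// Pull-style view over the collected ids: the caller owns the loop again.
final class BitSetDocIterator implements Iterator<Integer> {
    private final BitSet bits;
    private int next;

    BitSetDocIterator(BitSet bits) {
        this.bits = bits;
        this.next = bits.nextSetBit(0); // -1 if there are no hits at all
    }

    public boolean hasNext() {
        return next >= 0;
    }

    public Integer next() {
        if (next < 0) {
            throw new NoSuchElementException();
        }
        int doc = next;
        next = bits.nextSetBit(doc + 1);
        return doc;
    }

    public void remove() {
        throw new UnsupportedOperationException();
    }
}

// Slow phase: per-document work with a cancellation check between documents.
final class CancellableProcessing {
    // Returns the ids that were processed before the flag was set.
    static List<Integer> run(BitSet hits, AtomicBoolean cancelled) {
        List<Integer> processed = new ArrayList<Integer>();
        Iterator<Integer> it = new BitSetDocIterator(hits);
        while (it.hasNext() && !cancelled.get()) {
            int doc = it.next();
            processed.add(doc); // placeholder for the slow operation
        }
        return processed;
    }
}
```

With Lucene 2.x itself, the collector would presumably extend `HitCollector` and be passed to `Searcher.search(Query, HitCollector)`, and the slow operation would load each document by id inside the loop. The memory cost mentioned above is roughly one bit per document id up to the highest hit.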