A HitCollector's collect method is invoked for every document that matches the query/filter submitted to Searcher.search. I think all you need to do is pass the page number and results per page to your HitCollector constructor, and then do the bookkeeping in collect to keep track of where you are in the result set. A simple version of your HitCollector might look like this:

    import java.io.IOException;
    import java.util.ArrayList;
    import java.util.List;

    import org.apache.lucene.document.Document;
    import org.apache.lucene.search.HitCollector;
    import org.apache.lucene.search.Searcher;

    public class MyHitCollector extends HitCollector {
        private int docCount = 0;       // matches seen so far
        private final List<Document> documents = new ArrayList<Document>();
        private final int startDoc;     // first match on the page (inclusive)
        private final int endDoc;       // end of the page (exclusive)
        private final Searcher searcher;

        public MyHitCollector(Searcher searcher, int requestedPageNumber,
                              int resultsPerPage) {
            startDoc = (requestedPageNumber - 1) * resultsPerPage;
            endDoc = requestedPageNumber * resultsPerPage;
            this.searcher = searcher;
        }

        public void collect(int id, float score) {
            // Only load documents that fall inside the requested page.
            if (docCount >= startDoc && docCount < endDoc) {
                try {
                    documents.add(searcher.doc(id));
                } catch (IOException e) {
                    // collect() cannot throw a checked exception
                    throw new RuntimeException(e);
                }
            }
            docCount++;
        }

        public List<Document> getDocuments() {
            return documents;
        }
    }

Then in your application, wherever you're executing your search, you just need:

    // Declare the variable as MyHitCollector so getDocuments() is visible.
    MyHitCollector hc = new MyHitCollector(yourSearcher, 500, 20);
    yourSearcher.search(yourQuery, hc);
    List<Document> results = hc.getDocuments();

Now results should contain the 9,981st through 10,000th results (page 500 at 20 results per page).

JAMES

--- "heritrix.lucene" <[EMAIL PROTECTED]> wrote:

> I am using Hits object to collect all documents.
> Let me tell you my problem. I am creating a web application. Every
> time a user looks for something, it goes and searches the index and
> returns the results. Results may be in millions. So for displaying
> results, I am doing pagination.
> Here the problem is that I cannot keep the hits with the session. So
> if a user is looking for, say, the 500th page and there are 20 results
> per page, it again goes to look into the index, finds the results,
> simply gets the id of the 500*20th document, and displays 20 results
> after that.
>
> I don't know how I can solve this problem with HitCollector....
>
> Thanks & Regards
>
>
> On 6/28/06, Erick Erickson <[EMAIL PROTECTED]> wrote:
> >
> > I hope you're not using the Hits object to assemble all 14M results.
> > A recurring theme is that a Hits object should NOT be used for
> > collecting more than a few (100 I think) objects since it re-executes
> > the query every 100 or so terms it returns.
> > Its intent is to efficiently return the first few hits.
> >
> > Look at HitCollector if you want to examine lots of results.
> >
> > Of course this doesn't apply if you are just using the Hits object
> > to see how many documents matched and NOT looping through all of
> > them.
> >
> > Why does the second query take so little time? Two things suggest
> > themselves.
> > 1> If your first query is the first query after opening the index,
> > there's significant overhead involved that you pay when opening the
> > index, and the second time you won't pay it.
> > 2> If you're issuing the same query twice, there is probably some
> > cache involved.
> >
> > It would probably be instructive to issue a second query that is NOT
> > the same as the first (using the same searcher) and see the response
> > time. That way you could gain some insight into whether the time
> > differential is due to opening the index.
> >
> > Best
> > Erick
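
For what it's worth, the page-window arithmetic in the collector can be checked without an index at all. The following is only a rough sketch under my own assumptions: PageWindow and its methods are names made up for illustration (they are not part of Lucene), and it stores document ids instead of calling searcher.doc(id), so it runs without the Lucene jar.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the collector's bookkeeping; not a Lucene class.
final class PageWindow {
    private final int startDoc; // zero-based position of the first hit on the page
    private final int endDoc;   // zero-based position one past the last hit
    private int docCount = 0;   // number of hits seen so far
    private final List<Integer> ids = new ArrayList<>();

    PageWindow(int requestedPageNumber, int resultsPerPage) {
        if (requestedPageNumber < 1 || resultsPerPage < 1) {
            throw new IllegalArgumentException("page and page size must be >= 1");
        }
        startDoc = (requestedPageNumber - 1) * resultsPerPage;
        endDoc = startDoc + resultsPerPage;
    }

    // Same parameter shape as HitCollector.collect(int, float); score is ignored.
    void collect(int id, float score) {
        if (docCount >= startDoc && docCount < endDoc) {
            ids.add(id);
        }
        docCount++;
    }

    List<Integer> getIds() { return ids; }

    int getTotalHits() { return docCount; }

    public static void main(String[] args) {
        // Simulate 12,000 matching documents whose ids are 0..11999.
        PageWindow page = new PageWindow(500, 20);
        for (int id = 0; id < 12000; id++) {
            page.collect(id, 1.0f);
        }
        // Page 500 at 20 per page covers zero-based positions 9980..9999.
        System.out.println(page.getIds().get(0) + " .. " + page.getIds().get(19));
    }
}
```

Keep in mind that, as far as I know, a HitCollector sees matches as the scorer finds them rather than sorted by score, so "page 500" here means the 500th block of matches in collection order, not the 500th page of a relevance ranking.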