Mark, how can I get the Document information? I think it has to happen in the implementation of the abstract collect method. How can I get at it there?
Below is the example from the Lucene javadoc:

  Searcher searcher = new IndexSearcher(indexReader);
  final BitSet bits = new BitSet(indexReader.maxDoc());
  searcher.search(query, new HitCollector() {
    public void collect(int doc, float score) {
      bits.set(doc);
    }
  });

Thanks

On Nov 18, 2007 8:09 PM, Mark Miller <[EMAIL PROTECTED]> wrote:
> Hey Haroldo.
>
> First thing you need to do is *stop* using Hits in your searches. Hits
> is optimized for some pretty specific use cases and you will get along
> much better by using a HitCollector.
>
> Hits has three main functions:
>
> It caches documents, normalizes scores, and stores ids associated with
> scores (a HitDoc). If you attempt to retrieve a HitDoc past the first
> 100 from Hits, a new search will be issued to grab double the required
> HitDocs needed to satisfy your HitDoc retrieval attempt. This will be
> repeated every time you ask for a HitDoc beyond the current cache (which
> began at 100). This means that if you need to get a HitDoc beyond 100,
> Hits is not a great choice for you. You will want to use the
> HitCollector instead...but remember that you are losing the normalized
> scores (the code is simple to copy if you still want them) and the
> document caching (I rarely want that anyway).
>
> An issue to watch out for: with Hits, you do not have to say how
> many docs to get back, but with a HitCollector solution you will need
> to. This is a minor dilemma if you want to go over all of the hits no
> matter what. You can pass a huge number to ensure you get everything,
> but you will be creating large data structures if you do this, as
> structure sizes may be initialized by the number you pass. Also, passing
> the maximum integer will cause an error (negative init size) as Lucene
> initializes a data structure to hold the hits as n+1.
>
> - Mark
>
>
> Haroldo Nascimento wrote:
> > I have a performance problem when I need to group the search results.
> >
> > I have the code below:
> >
> > for (int i = 0; i < hits.length(); i++) {
> >   doc = hits.doc(i);
> >
> >   obj1 = doc.get(Constants.STATE_DESC_FIELD_LABEL);
> >   obj2 = doc.get(xxx);
> >   ...
> > }
> >
> > I work with a very large volume of data. The search itself runs in
> > 0.300 seconds, but when the hits object holds many results, the time
> > to get all the objects is very long. The command hits.doc(i) takes
> > 2 seconds to process.
> >
> > For example: for hits.length() equal to 25,000 results, the
> > "post-search" time is 7 seconds.
> >
> > I fetch all the results because I need to group them (remove the
> > duplicate results).
> >
> > Is there any way in Lucene to group the results? I need something
> > like the "group by" command in SQL.
> >
> > Thanks.
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: [EMAIL PROTECTED]
> > For additional commands, e-mail: [EMAIL PROTECTED]
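
A minimal sketch of one way to approach the grouping question in this thread, assuming the doc ids have already been collected into a BitSet as in the javadoc example. HitGrouping, FieldLookup and countByField are hypothetical names used for illustration only; they are not part of Lucene. The idea is to keep collect() cheap (it records the id and nothing else) and to look up the grouping field only after the search has finished:

```java
import java.io.IOException;
import java.util.BitSet;
import java.util.LinkedHashMap;
import java.util.Map;

/** Counts collected hits per field value, roughly like SQL GROUP BY with COUNT(*). */
final class HitGrouping {

    /** Hypothetical callback: returns the grouping field's value for one doc id. */
    interface FieldLookup {
        String valueOf(int docId) throws IOException;
    }

    /**
     * Walks the doc ids that collect() recorded in the BitSet and counts the
     * hits per field value. A LinkedHashMap keeps the groups in the order in
     * which their first (lowest) doc id was seen.
     */
    static Map<String, Integer> countByField(BitSet bits, FieldLookup lookup)
            throws IOException {
        Map<String, Integer> counts = new LinkedHashMap<String, Integer>();
        // Standard BitSet idiom for visiting every set bit in ascending order.
        for (int docId = bits.nextSetBit(0); docId >= 0; docId = bits.nextSetBit(docId + 1)) {
            String value = lookup.valueOf(docId);
            if (value == null) {
                continue; // this document has no value for the grouping field
            }
            Integer seen = counts.get(value);
            counts.put(value, seen == null ? 1 : seen + 1);
        }
        return counts;
    }
}
```

With Lucene, the lookup could be backed by searcher.doc(docId).get(Constants.STATE_DESC_FIELD_LABEL). That still loads every stored document, which is the slow step reported above, so if the field is indexed untokenized with a single value per document, reading the values from FieldCache.DEFAULT.getStrings(indexReader, field) may be cheaper. Neither variant has been measured here.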