Re: Fastest Method for Searching (need all results)

Mark Miller Fri, 21 Jul 2006 11:59:25 -0700

Ryan O'Hara wrote:

My index contains approximately 5 million documents. During a search, I need to grab the value of a field for every document in the result set. I am currently using a HitCollector to search. Below is my code:
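searcher.search(query, new HitCollector() {
    public void collect(int doc, float score) {
        try {
            // Load the stored document once per hit and reuse the field value.
            String sym = searcher.doc(doc).get("SYM");
            if (sym != null) {
                addSymbolsToHash(sym.split("ENDOFSYM"));
            }
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }
});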
This is fairly fast for small and medium-sized result sets. However, it gets slow as the result set grows. I read this on HitCollector's API page:
"For good search performance, implementations of this method should not call Searcher.doc(int) or Reader.document(int) on every document number encountered. Doing so can slow searches by an order of magnitude or more."
Along with this implementation, I've also tried using FieldCache. This fared better with large result sets, but worse with small and medium-sized result sets. Anyone have any ideas of what the best approach might be?
Thanks a lot,
Ryan
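
The FieldCache approach mentioned above comes down to holding the field's values in an array indexed by document number, so that each hit costs an array read instead of a stored-document load. Below is a rough plain-Java sketch of that idea only. It makes no Lucene calls; the class CachedFieldLookup and its methods are hypothetical names used for illustration, and the int[] of matching document numbers stands in for whatever the collector sees.

```java
import java.util.HashSet;
import java.util.Set;

public class CachedFieldLookup {
    // Hypothetical stand-in for a per-field cache: values[doc] holds the
    // "SYM" value of document number doc, or null if the field is absent.
    private final String[] values;

    public CachedFieldLookup(String[] values) {
        this.values = values;
    }

    // Collects the symbols of every matching document with one array read
    // per hit instead of one stored-document load per hit.
    public Set<String> collectSymbols(int[] matchingDocs) {
        Set<String> symbols = new HashSet<String>();
        for (int doc : matchingDocs) {
            String sym = values[doc];
            if (sym != null) {
                for (String s : sym.split("ENDOFSYM")) {
                    symbols.add(s);
                }
            }
        }
        return symbols;
    }
}
```

In this sketch the cost of filling the array is paid once, up front, no matter how many hits a given query has, while the per-hit cost stays small.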

Perhaps I am speaking too quickly, but I would try not grabbing the value of the field for every document in the result set. Will someone really see or use that value for a couple million hits? Could be, I suppose... but if not, then axe it. Grab the first few thousand (or MUCH fewer), and if they need more, head back in and grab more.



- mark
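
The suggestion above (load field values for the first batch of hits, then go back for more only on demand) might look roughly like the following. This is a plain-Java sketch under assumptions, not Lucene code: PagedFieldFetch, FieldLoader, and fetchPage are hypothetical names, the int[] holds document numbers already collected from a search, and FieldLoader stands in for the expensive per-document field load.

```java
import java.util.ArrayList;
import java.util.List;

public class PagedFieldFetch {
    // Hypothetical stand-in for an expensive per-document field load.
    public interface FieldLoader {
        String load(int doc);
    }

    // Loads field values only for hits in [offset, offset + pageSize),
    // leaving the rest of the result set untouched until it is asked for.
    public static List<String> fetchPage(int[] hits, int offset, int pageSize, FieldLoader loader) {
        List<String> page = new ArrayList<String>();
        int end = Math.min(hits.length, offset + pageSize);
        for (int i = offset; i < end; i++) {
            page.add(loader.load(hits[i]));
        }
        return page;
    }
}
```

A caller would request fetchPage(hits, 0, pageSize, loader) first and only ask for later offsets if the user actually needs them, so the number of field loads tracks what is consumed, not the size of the result set.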

