Ryan O'Hara wrote:
My index contains approximately 5 millions documents. During a search, I need to grab the value of a field for every document in the result set. I am currently using a HitCollector to search. Below is my code:

searcher.search(query, new HitCollector(){
                        public void collect(int doc, float score){
                                if(searcher.doc(doc).get("SYM") != null){
addSymbolsToHash(searcher.doc(doc).get("SYM").split("ENDOFSYM"));
                                }
                        }
                    });

This is fairly fast for small and medium-sized result sets. However, it gets slow as the result set grows. I read this on HitCollector's API page:

"For good search performance, implementations of this method should not call Searcher.doc(int) or Reader.document(int) on every document number encountered. Doing so can slow searches by an order of magnitude or more."

Along with this implementation, I've also tried using FieldCache. This faired better with large-sized result sets, but worse with small and medium-sized result sets. Anyone have any ideas of what the best approach might be?

Thanks a lot,
Ryan
Perhaps I am speaking too quickly, but I would try by not grabbing the value of the field for every document in the results set. Someone will see that value or use it for a couple million hits? Could be I suppose...but if not than axe it. Grab the first few thousand (or MUCH less) and if they need more head back in and grab more.


- mark

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Reply via email to