I'd ask for more details. You say that you've narrowed it down to Lucene doing the searching.... But which part of the search? Here're two places people have run into problems before (sorry if you already know this...).
1> Iterating through the entire returned set with Hits.doc(#).
2> Opening and closing your IndexReader between queries.

The first thing I'd do is insert some timing logging into your search code. For instance, log the time after you've assembled your query and before you execute the search. Log the time it takes to do the raw search. Log the time you spend spinning through the returned hits preparing to return the results. I'm not talking anything fancy here, just System.currentTimeMillis() (rough sketch below).

I can't emphasize strongly enough that you simply *cannot* jump to a solution before you *know* where you're spending your time. I've spent waaaaay more time than I want to admit fixing code that I was *sure* was slow, only to find out that the *real* problem was somewhere else.

Finally, what times are you seeing? And what was the index size before and after? Without some numbers, nobody else can guess at any solutions.
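Something along these lines is all I mean by timing logging. It's just a rough sketch against the Hits API you're already using; the page size of 10 and the "title" field are placeholders, and you'd put the same kind of timestamps around your query-assembly code before calling it:

import java.io.IOException;

import org.apache.lucene.document.Document;
import org.apache.lucene.search.Hits;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;

public class SearchTiming {
    // Prints how long the raw search takes versus how long you spend
    // pulling documents back out of the Hits object.
    static Hits timedSearch(IndexSearcher searcher, Query query) throws IOException {
        long start = System.currentTimeMillis();
        Hits hits = searcher.search(query);              // the raw search
        long searched = System.currentTimeMillis();

        // Only load the documents you need for one page of results;
        // hits.doc(i) fetches the stored fields of every document it touches.
        int pageSize = Math.min(10, hits.length());
        for (int i = 0; i < pageSize; i++) {
            Document doc = hits.doc(i);
            doc.get("title");                            // placeholder field name
        }
        long collected = System.currentTimeMillis();

        System.out.println("search: " + (searched - start) + " ms, "
                + "first " + pageSize + " hits: " + (collected - searched) + " ms");
        return hits;
    }
}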
Best
Erick

On 9/27/06, Rob Young <[EMAIL PROTECTED]> wrote:

Hi,

I'm using Lucene to search a product database (CDs, DVDs, games and now books). Recently that index has grown to over a million items (we added the books). I have been performance testing our search server and the throughput of requests has dropped significantly; profiling the server, it all seems to be in the Lucene searching. So, now that I've narrowed it down to the searching itself rather than the rest of the application, what can I do about it?

I am running a TermQuery, falling back to a FuzzyQuery when no results are found (each combined in a boolean query with the product type restrictions).

One solution I had in mind was to split the index into four (one per product type). Would this provide any gains? It would require a lot of refactoring, so I don't want to commit myself if there's no chance it will help. Another solution along the same train of thought was to use a caching filter to cut the index into parts. How would this compare to the previous idea? Does anyone have any other ideas / suggestions?

Thanks
Rob
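For reference, the cached-filter idea Rob describes, with the term query falling back to a fuzzy query, might look roughly like the sketch below. The field names "type" and "title" and the value "book" are made up, and it assumes QueryFilter and CachingWrapperFilter are available in the Lucene version being used:

import java.io.IOException;

import org.apache.lucene.index.Term;
import org.apache.lucene.search.CachingWrapperFilter;
import org.apache.lucene.search.Filter;
import org.apache.lucene.search.FuzzyQuery;
import org.apache.lucene.search.Hits;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.QueryFilter;
import org.apache.lucene.search.TermQuery;

public class ProductSearch {
    // One cached filter per product type, built once and reused for every
    // query, so the product-type restriction is only computed the first time
    // rather than being re-run as a boolean clause on each search.
    private final Filter bookFilter = new CachingWrapperFilter(
            new QueryFilter(new TermQuery(new Term("type", "book"))));

    Hits search(IndexSearcher searcher, String term) throws IOException {
        // Exact term first, restricted to one product type by the filter.
        Hits hits = searcher.search(new TermQuery(new Term("title", term)), bookFilter);
        if (hits.length() == 0) {
            // Fall back to the fuzzy query only when the exact term finds nothing.
            hits = searcher.search(new FuzzyQuery(new Term("title", term)), bookFilter);
        }
        return hits;
    }
}

Note that the cached filter only pays off if the same IndexSearcher / IndexReader stays open across queries, which ties back to point 2> above.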