You should build your own performance test cases to see what works for your 
data.  That being said, here are some numbers from a similar test I ran:

I did the following:
1) run a single term query which resulted in about half of the total set of 
documents being returned.  (~36,000)
2) built a Boolean query (tree of them actually) with a Boolean clause for each 
of the 36,000 documents.
3) run this new secondary query to get 'final' results.

The time it took to run each step was:
1) 31 ms
2) 1610 ms
3) 1047 ms

Without the giant Boolean query attached to the final query, stem three takes 
about 15 ms.  So... Yes, a large query structure will increase query execution 
time.  In researching how to reduce this, I found filters to be the fastest.  
Using filters:

Steps required:
1) run the initial single term query.
2) Build a query object model based on results from step 1.
3) Build bitset filter using query from step 2.  CACHE THE RESULTS!
4) Execute main query, with cached bitset filter.

Execution time:
1) 16 ms
2) 1609 ms
3) 969 ms
4) 15 ms

As you can see, the total time is longer, but, in our specific case, the 
initial query rarely changes.  Thus, by caching the filter results, the vast 
majority of queries are done in 15 ms.  This only works because we are not 
updating our indexes that often, and the initial query is fairly static too.

This is a classic trade.  I am trading space for time.  Each cached filter 
costs around 2k (reflection of index size) of memory to keep around.  In turn, 
that saves me a little over a second from the final query.

I hope these number help, and perhaps spark an idea.

-Russell

-----Original Message-----
From: zhongyi yuan [mailto:[EMAIL PROTECTED] 
Sent: Friday, July 28, 2006 9:47 PM
To: java-user@lucene.apache.org
Subject: About search performance

Hi,How about implement multi-key search use lucene, for example use boolean 
search exceed 1000 clauses,it will affect the performance greatly. If use 
filter or custom sorter to select the result, because the result is extremely 
large in amount,so the performance is lower.
Please give me some advice,Thanks.

Reply via email to