We also use similar kind of technique, breaking indexes in to smaller and 
search using ParallelMultiSearcher. We have to do incremental indexing and the 
records older than 6 months or 1 year (based on ageout setting) should be 
deleted. Having multiple small indexes is really fast in terms of indexing.  

Since you guys mentioned about keeping single large index. Search time woule be 
faster but the indexing and index optimization will take more time. How you are 
handling it in case of incremental indexing. If we keep the indexes size to 
100+ GB then each file size (fdt, fdx etc) would in GB's. Small addition or 
deletion to the file will not cause more IO as it has to skip those bytes and 
write it at the end of file.   

Regards
Ganesh

----- Original Message ----- 
From: "Burton-West, Tom" <tburt...@umich.edu>
To: <java-user@lucene.apache.org>
Sent: Tuesday, May 10, 2011 9:46 PM
Subject: RE: Sharding Techniques


Hi Samar,

>>Normal queries go fine under 500 ms but when people start searching
>>"anything" some queries take up to > 100 seconds. Don't you think
>>distributing smaller indexes on different machines would reduce the average
>>.search time. (Although I have a feeling that search time for smaller queries
>>may be slightly increased)

What are the characteristics of your slow queries?  Can you give examples?   
Are the slow queries always slow or only under heavy load?   What the 
bottleneck is and whether splitting into smaller indexes would help depends on 
just what your bottleneck is. It's not clear that your index is large enough 
that the size of the index is causing your bottleneck.

We run indexes of about 350GB with average response times under 200ms and 99th 
percentile reponse times of under 2 seconds. (We have a very low qps rate 
however).


Tom




---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org

Reply via email to