Ganesh

Nobody is saying that sharding is never a good idea - it just doesn't
seem to be applicable in the case being discussed.  On my indexes I
care much more about speed of searching rather than speed of indexing.
 The latter typically happens in the background in the dead of night
and within reason I don't really care how long it takes.  Your
application and requirements will be different and you may come to
different conclusions.


--
Ian.


On Wed, May 11, 2011 at 6:09 AM, Ganesh <emailg...@yahoo.co.in> wrote:
>
> We also use similar kind of technique, breaking indexes in to smaller and 
> search using ParallelMultiSearcher. We have to do incremental indexing and 
> the records older than 6 months or 1 year (based on ageout setting) should be 
> deleted. Having multiple small indexes is really fast in terms of indexing.
>
> Since you guys mentioned about keeping single large index. Search time woule 
> be faster but the indexing and index optimization will take more time. How 
> you are handling it in case of incremental indexing. If we keep the indexes 
> size to 100+ GB then each file size (fdt, fdx etc) would in GB's. Small 
> addition or deletion to the file will not cause more IO as it has to skip 
> those bytes and write it at the end of file.
>
> Regards
> Ganesh
>
> ----- Original Message -----
> From: "Burton-West, Tom" <tburt...@umich.edu>
> To: <java-user@lucene.apache.org>
> Sent: Tuesday, May 10, 2011 9:46 PM
> Subject: RE: Sharding Techniques
>
>
> Hi Samar,
>
>>>Normal queries go fine under 500 ms but when people start searching
>>>"anything" some queries take up to > 100 seconds. Don't you think
>>>distributing smaller indexes on different machines would reduce the average
>>>.search time. (Although I have a feeling that search time for smaller queries
>>>may be slightly increased)
>
> What are the characteristics of your slow queries?  Can you give examples?   
> Are the slow queries always slow or only under heavy load?   What the 
> bottleneck is and whether splitting into smaller indexes would help depends 
> on just what your bottleneck is. It's not clear that your index is large 
> enough that the size of the index is causing your bottleneck.
>
> We run indexes of about 350GB with average response times under 200ms and 
> 99th percentile reponse times of under 2 seconds. (We have a very low qps 
> rate however).
>
>
> Tom
>
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-user-h...@lucene.apache.org
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-user-h...@lucene.apache.org
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org

Reply via email to