On 4/15/2014 10:44 PM, Mukesh Jha wrote:
> In my solr cluster I've multiple shards and each shard containing
> ~500,000,000 documents total index size being ~1 TB.
>
> I was just wondering how much more can I keep on adding to the shard before
> we reach a tipping point and the performance starts to degrade?
>
> Also as best practice what is the recomended no of docs / size of shards .
Vinay has given you my performance problems wiki page, which I created because I was always telling people the same things about why their Solr performance sucked. Erick has given you additional good information. I think Erick intended to give you this link instead of the wiki page about XML updates:

http://searchhub.org/2012/07/23/sizing-hardware-in-the-abstract-why-we-dont-have-a-definitive-answer/

The performance tipping point is highly variable, mostly depending on RAM. The index details and the nature of the queries will affect this ... but honestly, that mostly comes down to RAM as well. These details affect how much heap is required, as well as how much of the full index must be loaded into the OS disk cache for good performance. The "best practices" recommendation is to have enough RAM to cache the entire index ... but getting 1TB of total RAM for your case is *really* expensive. It is also probably not a strict requirement.

RAM is the most precious resource for large indexes. CPU speed and capability are important, but generally only for scaling query load once you've gotten performance to an acceptable place. (Must remember to put that in the SolrPerformanceProblems wiki page!)

I've got a dev Solr server that reached the tipping point yesterday. It's only got 16GB of RAM, and the RAM slots are maxed out. The Solr instance has a 7GB heap, and I have a few other Java programs on it that each take a few hundred MB. It's handling nearly 150GB of index, with only about 7GB left for the OS disk cache. I have never expected its performance to be stellar.

The dev server normally runs fine, if a bit slow ... but I had done an index rebuild earlier in the day, so the amount of index data on the machine was up near 200GB instead of below 150GB. Suddenly it was taking minutes to do basic update operations instead of a few hundred milliseconds. Once I deleted the data in the build cores (which had just been swapped with the live cores) and restarted Solr, everything started working OK again.

The tipping point would have been reached much sooner if this were the hardware I had for production. The dev server barely sees any query load. I doubt it would survive production queries, even if the indexes were spread across two such servers instead of just the one.

In production, the full distributed index is served by two machines that each have 64GB of RAM. Right now they only need to deal with about 100GB of total index data (with a 6GB Solr heap on each one), so there's plenty of RAM between them to cache the entire index. When I add the new indexes that are under development, there won't be quite enough RAM to cache everything, but it will be close enough that it won't matter. My current production index is 94 million docs, and the new index that I am adding is currently at 13 million, expected to grow to 20 million in the near future.

Thanks,
Shawn
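
For anyone who wants to run the numbers themselves, here is a minimal Python sketch of the RAM arithmetic described above. The RAM totals, heap sizes, and index sizes are taken from the message; the "other programs" figures are rough guesses, and the whole thing is deliberately back-of-the-envelope.

# Back-of-the-envelope sketch of the RAM math described above.  The RAM
# totals, heap sizes, and index sizes come from the message; the "other
# programs" figures are rough guesses.

def page_cache_left(total_ram_gb, solr_heap_gb, other_gb):
    """RAM left for the OS disk cache after the Solr heap and other
    resident programs are subtracted (very approximate)."""
    return total_ram_gb - solr_heap_gb - other_gb

def report(name, index_gb, cache_gb):
    pct = min(cache_gb / index_gb, 1.0)
    print(f"{name}: {cache_gb} GB of cache for {index_gb} GB of index "
          f"(~{pct:.0%} of the index can stay in the disk cache)")

# Dev box: 16GB RAM, 7GB Solr heap, a couple of GB for other Java programs.
dev_cache = page_cache_left(16, 7, 2)            # roughly 7 GB
report("dev (normal)", 150, dev_cache)           # ~5% of the index cached
report("dev (after rebuild)", 200, dev_cache)    # ~3-4% cached -- tipping point

# Production: 64GB RAM and a 6GB heap per machine, ~100GB of index split
# between two machines, so roughly 50GB per machine.
prod_cache = page_cache_left(64, 6, 1)           # roughly 57 GB
report("production (per machine)", 50, prod_cache)   # entire index fits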
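
The cleanup described in the dev-server story maps onto Solr's CoreAdmin API (SWAP, then UNLOAD). Below is a minimal sketch of one way to do that sequence; the host URL and the core names ("live" and "build") are placeholders, not the actual cores from the message, and it only approximates the steps taken there.

# Minimal sketch of a core swap followed by cleanup via CoreAdmin.
# Host and core names are placeholders.
from urllib.request import urlopen
from urllib.parse import urlencode

SOLR = "http://localhost:8983/solr"

def core_admin(**params):
    """Issue a CoreAdmin request and return the raw response body."""
    url = f"{SOLR}/admin/cores?" + urlencode({"wt": "json", **params})
    with urlopen(url) as resp:
        return resp.read()

# Swap the freshly built core into the "live" name...
core_admin(action="SWAP", core="live", other="build")

# ...then unload the old index (now registered under "build") and
# delete its index data to reclaim the disk space.
core_admin(action="UNLOAD", core="build", deleteIndex="true")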