On 4/27/2015 9:15 AM, Gopal Jee wrote:
> We have a 26 node solr cloud cluster. During heavy re-indexing, some of
> nodes go into recovering state.
> as per current config, soft commit is set to 15 minute and hard commit to
> 30 sec. Moreover, zkClientTimeout is set to 30 sec in solr nodes.
> Please advise.

The most common reason for this is general performance issues that make
some operations take longer than the zkClientTimeout.

My first suspect would be long garbage collection pauses. That assumes
you're not running a very recent version (4.10.x or 5.x) with the new
bin/solr script, and that your Java command line does not have any
garbage collection tuning. The bin/solr script does a lot of GC tuning.

The second suspect would be that you don't have enough RAM left for your
operating system to cache your index effectively.

It's possible to have both of these problems at the same time. These
problems, and a few others, are outlined here:

http://wiki.apache.org/solr/SolrPerformanceProblems

Thanks,
Shawn