Dear all, we are currently using Solr 4.3.1 in production (with SolrCloud).
We are running into much the same problem as the one described in this older post: http://lucene.472066.n3.nabble.com/SolrCloud-CloudSolrServer-Zookeeper-disconnects-and-re-connects-with-heavy-memory-usage-consumption-td4026421.html

From time to time, some nodes are disconnected from ZooKeeper and then try to reconnect. Reconnection takes a long time because our warming process is quite long. Because of that, the node is disconnected again right after the recovery process finishes, and the cycle repeats, sometimes until the node runs out of memory (OOM).

We have already increased the ZooKeeper timeout, but it is not enough.

We are thinking of migrating to at least Solr 4.6.1 (perhaps 4.7 will be out before the end of the migration :) ).

I know that a lot of SolrCloud bugs have been fixed since Solr 4.3.1. But can we be sure that this problem will be resolved? Or can it still occur with the latest Solr version? (I know this is not an easy question ;) )

It seems that this fix, "Deadlock while trying to recover after a ZK session expiry" (https://issues.apache.org/jira/browse/SOLR-5615), is a good step toward addressing our current problem. But do you think it will be enough?

One last thing. I don't know whether a fix already covers this, but if there are no updates between the disconnection and the reconnection, the recovery process should do nothing more than reconnect, I mean: no replication, no tLog replay and no warming. Is that the case?

Ludovic.

-----
Jouve
France.
--
View this message in context: http://lucene.472066.n3.nabble.com/SolrCloud-Zookeeper-disconnection-reconnection-tp4117101.html
Sent from the Solr - User mailing list archive at Nabble.com.
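
A side note on the ZooKeeper timeout mentioned above, in case it helps anyone hitting the same loop. Assuming the timeout in question is Solr's zkClientTimeout, the value a client asks for is only a request: the ZooKeeper server clamps it between minSessionTimeout and maxSessionTimeout, which default to 2 and 20 times tickTime. The sketch below only illustrates that clamping rule. The function name is hypothetical, and the tickTime of 2000 ms and the requested values are illustrative, not the settings from this cluster.

```python
def negotiated_session_timeout(requested_ms, tick_time_ms=2000,
                               min_session_timeout_ms=None,
                               max_session_timeout_ms=None):
    """Return the session timeout a ZooKeeper server would grant.

    Hypothetical helper for illustration only. It mirrors the documented
    server-side defaults: minSessionTimeout = 2 * tickTime and
    maxSessionTimeout = 20 * tickTime unless set explicitly.
    """
    lower = (2 * tick_time_ms if min_session_timeout_ms is None
             else min_session_timeout_ms)
    upper = (20 * tick_time_ms if max_session_timeout_ms is None
             else max_session_timeout_ms)
    # Clamp the requested value into the [lower, upper] window.
    return max(lower, min(requested_ms, upper))


if __name__ == "__main__":
    # Illustrative requests: the last one exceeds the default ceiling.
    for requested in (15000, 30000, 60000):
        granted = negotiated_session_timeout(requested)
        print(requested, "->", granted)
```

If the granted value comes out lower than the requested one, raising the timeout on the Solr side alone would have no further effect until maxSessionTimeout (or tickTime) is also raised on the ZooKeeper servers.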