Hi Metin,

I think removing the softCommit=true parameter on the client side will definitely help, as NRT wasn't designed to re-open searchers after every document. Try an auto soft commit every 1 second (or even every few seconds); I doubt your users will notice. To get an idea of what threads are running in your JVM process, you can use jstack.
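If attaching jstack to the process is not convenient, a rough in-process alternative is to count live threads grouped by name. The sketch below is not what jstack does internally; it only uses the standard Thread.getAllStackTraces() call, and the class name ThreadSummary, the helper groupKey, and the example thread name "http-8080-12" are hypothetical names chosen for illustration. It avoids newer language features so that it should also compile on an older JDK such as the 1.6 mentioned in this thread.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper: summarizes the live threads of the current JVM by name prefix.
public class ThreadSummary {

    // Strip a trailing numeric suffix so that pool threads such as
    // "http-8080-12" and "http-8080-13" (example names) share one group.
    public static String groupKey(String threadName) {
        return threadName.replaceAll("[-#\\s]*\\d+$", "");
    }

    // Count the live threads of this JVM per group key, sorted by key.
    public static Map<String, Integer> countByGroup() {
        Map<String, Integer> counts = new TreeMap<String, Integer>();
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            String key = groupKey(t.getName());
            Integer current = counts.get(key);
            counts.put(key, current == null ? 1 : current + 1);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Print one line per group: thread count, then the group name.
        for (Map.Entry<String, Integer> e : countByGroup().entrySet()) {
            System.out.println(e.getValue() + "\t" + e.getKey());
        }
    }
}
```

A group whose count keeps growing during the morning bulk update would be a reasonable first place to look.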
Cheers,
Timothy Potter
Sr. Software Engineer, LucidWorks
www.lucidworks.com

________________________________________
From: OSMAN Metin <metin.os...@canal-plus.com>
Sent: Wednesday, December 04, 2013 7:36 AM
To: solr-user@lucene.apache.org
Subject: Questions about commits and OOE

Hi all,

let me first explain our situation:

We have
- two virtual servers, each with:
  o 4x Solr 4.4.0 on Tomcat 6 (+ mod_cluster 1.2.0), each JVM has -Xms2048m -Xmx2048m -XX:MaxPermSize=384m
  o 1x ZooKeeper 3.4.5 (only one of the two ZooKeeper instances is active)
  o CentOS 6.4
  o Sun JDK 1.6.0-31
  o 16 GB of RAM
  o 4 vCPU
- only one core and one shard
- ~250000 docs and 50-100 MB of index size
- two load balancers (apache + mod_cluster), both connected to the 8 Solr nodes
- 1 VIP pointing to these two LBs

The commit configuration is
- every update request does a soft commit (i.e. param softCommit=true in the http request)
- autosoftcommit disabled
- autocommit enabled every 15 seconds

The client application is a Java app with a SolrJ client using the previous VIP as an endpoint. We need near-real-time modifications to be visible to the end users. During the day, the client uses Solr with about 80% select requests and 20% update requests. Every morning, the client sends a massive batch of updates (about 10000 in a few minutes).

During this massive update, we sometimes see a peak of active threads exceeding the limit of 8192 processes allowed for the user running the Tomcat and ZooKeeper processes. When this happens, every hard commit fails with an "OutOfMemory : unable to create native thread" message.

Now, I have some questions:
- Why are so many threads created? Is it the softCommit on every update that opens a new thread?
- Once an OOE occurs, every hard commit is broken, even if the number of threads open on the system is low. Is there any way to "free" the JVM? The only solution we have found is to restart all the JVMs.
- When the OOE occurs, the Solr cloud console shows the leader node as active and the others as recovering
  o is the replication working at that moment?
  o as all the hard commits are failing but the soft commits are not, can I be sure that I will not lose any updates when restarting all the nodes?

By the way, we are planning to
- disable the softCommit parameter on the client side and enable autosoftcommit instead.
- create another server and set up a 3-node ZooKeeper quorum instead of a single ZooKeeper master.
- skip the use of load balancers and let ZooKeeper decide which node will respond to the requests

Any help would be appreciated!

Metin OSMAN