Hi Otis,

The setup is as follows:
6 individual Solr servers (VMs) running on Jetty. 3 shards, each with a leader and a replica.

*Solr version*: Solr 4.0 (with a patch from SOLR-2592).
*OS*: CentOS release 5.8 (Final)
*Java*:
    java version "1.6.0_32"
    Java(TM) SE Runtime Environment (build 1.6.0_32-b05)
    Java HotSpot(TM) 64-Bit Server VM (build 20.7-b02, mixed mode)
*Memory*: 4 servers have 32 GB, 2 have 30 GB.
*Disk space*: 500 GB on each server.
*Queries*: The usual select queries with up to 6 filters, faceting on around 8 fields (returning only the top 20).

*Java options while starting the server:*

    JAVA_OPTIONS="-Xms15360m -Xmx15360m -DSTOP.PORT=1234 -DSTOP.KEY=XXXX -XX:NewRatio=1 -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:+UseCompressedOops -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/ABC/LOGFOLDER -XX:-TraceClassUnloading -Dbootstrap_confdir=./solr/collection123/conf -Dcollection.configName=123conf -DzkHost=ZooKeeper001:1111,ZooKeeper002:1111,SGAZZooKeeper003:1111 -DnumShards=3 -jar start.jar"
    LOG_FILE="/ABC/LOGFOLDER/solrlogfile.log"

I run a *commit* with a curl command every 30 minutes from a cron job:

    curl --silent http://11.111.111.111:1234/solr/collection123/update/?commit=true&openSearcher=false

In my solrconfig.xml I have these *commit settings*:

    <updateHandler class="solr.DirectUpdateHandler2">
      <autoCommit>
        <maxDocs>0</maxDocs>
        <maxTime>0</maxTime>
      </autoCommit>
      <autoSoftCommit>
        <maxTime>0</maxTime>
      </autoSoftCommit>
      <openSearcher>false</openSearcher>
      <waitSearcher>false</waitSearcher>
      <updateLog>
        <str name="dir">${solr.data.dir:}</str>
      </updateLog>
    </updateHandler>

Please let me know if you would like more information.

I am not indexing any documents right now, and I got another OOM about an hour ago on one of the nodes. Let's call it Node1. The node is in "recovery" right now and keeps erroring with this message:

    SEVERE: Error while trying to recover:org.apache.solr.common.SolrException: Server at http://NODE2:8983/solr/collection1 returned non ok status:500, message:Server Error

Although it's still showing as "recovering", it is serving queries according to the log file. The other instance in this shard became the leader and is up and running properly (serving queries).

--
View this message in context: http://lucene.472066.n3.nabble.com/Frequent-OOM-Unknown-source-in-logs-tp4029361p4029459.html
Sent from the Solr - User mailing list archive at Nabble.com.