It has typically happened when query traffic was lowest! We are at a 12 GB heap, so I will try bumping it to 14 GB. We have 64 GB of main memory installed now. Here are our settings; do these look OK?

export JAVA_OPTS="-Xmx12228m -Xms12228m -XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode"

-----Original Message-----
From: ysee...@gmail.com [mailto:ysee...@gmail.com] On Behalf Of Yonik Seeley
Sent: Tuesday, November 30, 2010 6:44 PM
To: solr-user@lucene.apache.org
Subject: Re: entire farm fails at the same time with OOM issues

On Tue, Nov 30, 2010 at 6:04 PM, Robert Petersen <rober...@buy.com> wrote:
> My question is this. Why in the world would all of my slaves, after
> running fine for some days, suddenly all at the exact same minute
> experience OOM heap errors and go dead?

If there is no change in query traffic when this happens, then it's due to what the index looks like.

My guess is a large index merge happened, which means that when the searchers re-open on the new index, it requires more memory than normal (much less can be shared with the previous index).

I'd try bumping the heap a little bit, and then optimizing once a day during off-peak hours. If you still get OOM errors, bump the heap a little more.

-Yonik
http://www.lucidimagination.com
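
For illustration only (this was not posted in the thread): a minimal sketch of the 14 GB heap bump discussed above, keeping the same GC flags. HEAP_GB and HEAP_MB are hypothetical variable names, and the cron line in the comments assumes a default Solr URL and a 3 AM off-peak window, neither of which comes from the thread.

```shell
# Hypothetical helper variables; only JAVA_OPTS is read by the JVM launcher.
HEAP_GB=14
HEAP_MB=$((HEAP_GB * 1024))   # 14 GB expressed in megabytes

# Same collector flags as the current settings, with min and max heap kept equal.
export JAVA_OPTS="-Xmx${HEAP_MB}m -Xms${HEAP_MB}m -XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode"

echo "$JAVA_OPTS"

# Assumed crontab entry for a daily off-peak optimize (host, port, path and
# hour are placeholders; run it against whichever node does the indexing):
# 0 3 * * * curl -s 'http://localhost:8983/solr/update?optimize=true'
```

Setting -Xms equal to -Xmx, as the current settings already do, avoids heap resizing at runtime. With 64 GB installed, a 14 GB heap still leaves most of the memory to the OS file cache.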