On 6/2/2013 8:16 AM, Yoni Amir wrote:
> Hello,
> I am receiving OutOfMemoryError during indexing, and after investigating the
> heap dump, I am still missing some information, and I thought this might be a
> good place for help.
>
> I am using Solr 4.0 beta, and I have 5 threads that send update requests to
> Solr. Each request is a bulk of 100 SolrInputDocuments (using solrj), and my
> goal is to index around 2.5 million documents.
> Solr is configured to do a hard-commit every 10 seconds, so initially I
> thought that it can only accumulate in memory 10 seconds worth of updates,
> but that's not the case. I can see in a profiler how it accumulates memory
> over time, even with 4 to 6 GB of memory. It is also configured to optimize
> with mergeFactor=10.

4.0-BETA came out several months ago. Even at the time, support for the alpha and beta releases was limited. Now it has been superseded by 4.0.0, 4.1.0, 4.2.0, 4.2.1, and 4.3.0, all of which are full releases. There is a 4.3.1 release currently in the works. Please upgrade.

Ten seconds is a very short interval for hard commits, even if you have openSearcher=false. Frequent hard commits can cause a whole host of problems. It's better to have an interval of several minutes, and I wouldn't go less than a minute. Soft commits can be much more frequent, but if you are frequently opening new searchers, you'll probably want to disable cache warming.

On optimization: don't do it unless you absolutely must. Most of the time, optimization is only needed if you delete a lot of documents and you need to get them removed from your index. If you must optimize to get rid of deleted documents, do it on a very long interval (once a day, once a week) and pause indexing during optimization.

You haven't said anything about your index size, java heap size, total RAM, etc. With those numbers I could offer some guesses about what you need, but I'll warn you that they would only be guesses - watching a system with real data under load is the only way to get concrete information.

Here are some basic guidelines on performance problems and RAM information:

http://wiki.apache.org/solr/SolrPerformanceProblems

Thanks,
Shawn
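
P.S. For illustration, the commit advice above might look something like the following in solrconfig.xml. This is only a rough sketch: the intervals shown (five minutes for hard commits, fifteen seconds for soft commits) are assumed example values, not numbers tuned for this index, so adjust them after watching the system under real load.

```xml
<!-- Sketch only: the interval values below are assumptions for illustration. -->
<updateHandler class="solr.DirectUpdateHandler2">
  <!-- Hard commit: flushes to stable storage. Several minutes apart,
       and without opening a new searcher. -->
  <autoCommit>
    <maxTime>300000</maxTime>          <!-- 5 minutes, in milliseconds -->
    <openSearcher>false</openSearcher>
  </autoCommit>

  <!-- Soft commit: makes new documents visible to searches.
       This can run much more frequently than the hard commit. -->
  <autoSoftCommit>
    <maxTime>15000</maxTime>           <!-- 15 seconds, in milliseconds -->
  </autoSoftCommit>
</updateHandler>
```

If soft commits open new searchers this often, setting autowarmCount="0" on the caches defined in the <query> section is one way to disable cache warming.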