On 6/2/2013 8:16 AM, Yoni Amir wrote:
> Hello,
> I am receiving OutOfMemoryError during indexing, and after investigating the 
> heap dump, I am still missing some information, and I thought this might be a 
> good place for help.
> 
> I am using Solr 4.0 beta, and I have 5 threads that send update requests to 
> Solr. Each request is a bulk of 100 SolrInputDocuments (using solrj), and my 
> goal is to index around 2.5 million documents.
> Solr is configured to do a hard-commit every 10 seconds, so initially I 
> thought that it can only accumulate in memory 10 seconds worth of updates, 
> but that's not the case. I can see in a profiler how it accumulates memory 
> over time, even with 4 to 6 GB of memory. It is also configured to optimize 
> with mergeFactor=10.

4.0-BETA came out several months ago.  Even at the time, support for the
alpha and beta releases was limited.  Now it has been superseded by
4.0.0, 4.1.0, 4.2.0, 4.2.1, and 4.3.0, all of which are full releases.
There is a 4.3.1 release currently in the works.  Please upgrade.

Ten seconds is a very short interval for hard commits, even if you have
openSearcher=false.  Frequent hard commits can cause a whole host of
problems.  It's better to have an interval of several minutes, and I
wouldn't go less than a minute.  Soft commits can be much more frequent,
but if you are frequently opening new searchers, you'll probably want to
disable cache warming.
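
As a rough sketch of the above, the commit settings in solrconfig.xml
could look something like the fragment below.  The intervals (five
minutes for hard commits, 15 seconds for soft commits) are illustrative
assumptions only, not values tuned for this index:

```xml
<updateHandler class="solr.DirectUpdateHandler2">
  <!-- Hard commit: flushes to disk, but does not open a new searcher.
       300000 ms (5 minutes) is an assumed example value. -->
  <autoCommit>
    <maxTime>300000</maxTime>
    <openSearcher>false</openSearcher>
  </autoCommit>

  <!-- Soft commit: makes new documents visible to searches.
       15000 ms (15 seconds) is an assumed example value. -->
  <autoSoftCommit>
    <maxTime>15000</maxTime>
  </autoSoftCommit>
</updateHandler>
```

Cache warming is configured separately, through the autowarmCount
attribute on the cache definitions in the <query> section; setting it
to 0 disables autowarming for that cache.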

On optimization: don't do it unless you absolutely must.  Most of the
time, optimization is only needed if you delete a lot of documents and
you need to get them removed from your index.  If you must optimize to
get rid of deleted documents, do it on a very long interval (once a day,
once a week) and pause indexing during optimization.

You haven't said anything about your index size, java heap size, total
RAM, etc.  With those numbers I could offer some guesses about what you
need, but I'll warn you that they would only be guesses - watching a
system with real data under load is the only way to get concrete
information.  The wiki page below has some basic guidelines on
performance problems and RAM requirements:

http://wiki.apache.org/solr/SolrPerformanceProblems

Thanks,
Shawn
