A few things:

1) Can you give some more details about your setup? For example, whether it's 
SolrCloud or a single instance, how many nodes if it's SolrCloud, and the 
hardware: memory per machine, JVM options, etc.

2) Any specific reason for using 4.0 beta? The latest version is 4.3. I used 
4.0 for a few weeks and there were a lot of bugs related to memory and to 
communication between nodes (ZooKeeper).
3) If you haven't seen it already, please go through this wiki page. It's an 
excellent starting point for troubleshooting memory and indexing issues, 
especially sections 3 to 7: 
http://wiki.apache.org/solr/SolrPerformanceFactors#Optimization_Considerations


-- 
Shreejay


On Sunday, June 2, 2013 at 7:16, Yoni Amir wrote:

> Hello,
> I am receiving an OutOfMemoryError during indexing, and after investigating 
> the heap dump I am still missing some information, so I thought this might be 
> a good place to ask for help.
> 
> I am using Solr 4.0 beta, and I have 5 threads that send update requests to 
> Solr. Each request is a bulk of 100 SolrInputDocuments (using solrj), and my 
> goal is to index around 2.5 million documents.
> Solr is configured to do a hard commit every 10 seconds, so initially I 
> thought it could only accumulate 10 seconds' worth of updates in memory, 
> but that's not the case. I can see in a profiler how it accumulates memory 
> over time, even with 4 to 6 GB of memory. It is also configured to optimize 
> with mergeFactor=10.
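> 
> For reference, each indexing thread runs roughly the following loop (a 
> simplified sketch: the core URL, field names, and document contents are 
> placeholders, not my real schema):
> 
>   import java.io.IOException;
>   import java.util.ArrayList;
>   import java.util.List;
>   import org.apache.solr.client.solrj.SolrServerException;
>   import org.apache.solr.client.solrj.impl.HttpSolrServer;
>   import org.apache.solr.common.SolrInputDocument;
> 
>   // Sketch of one indexing worker; five of these run in parallel.
>   public class IndexWorker implements Runnable {
>     private static final int BATCH_SIZE = 100;
>     private final HttpSolrServer server =
>         new HttpSolrServer("http://localhost:8983/solr/collection1"); // placeholder URL
> 
>     public void run() {
>       List<SolrInputDocument> batch = new ArrayList<SolrInputDocument>(BATCH_SIZE);
>       for (int i = 0; i < 500000; i++) { // ~2.5 million docs across 5 threads
>         SolrInputDocument doc = new SolrInputDocument();
>         doc.addField("id", Thread.currentThread().getName() + "-" + i);
>         doc.addField("text", "document body goes here"); // placeholder field/value
>         batch.add(doc);
>         if (batch.size() == BATCH_SIZE) {
>           try {
>             server.add(batch); // returns as soon as Solr accepts the request
>           } catch (SolrServerException e) {
>             throw new RuntimeException(e);
>           } catch (IOException e) {
>             throw new RuntimeException(e);
>           }
>           batch.clear();
>         }
>       }
>       // No explicit commit here; the server-side autoCommit (every 10 seconds)
>       // and mergeFactor=10 in solrconfig.xml handle commits and merging.
>     }
>   }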
> 
> At first I thought that optimization is a blocking, synchronous operation. It 
> is, in the sense that the index can't be updated during optimization. 
> However, it is not synchronous in the sense that the update requests coming 
> from my code are not blocked: Solr just returns an OK response, even while 
> the index is optimizing.
> This suggests that Solr has an internal queue of inbound requests, and that 
> the OK response just means that the request is in the queue. I got 
> confirmation of this from a friend who is a Solr expert (or so I hope).
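> 
> A small test that reproduces this (a sketch against a local test core; the 
> URL and id values are placeholders): start an optimize in one thread and keep 
> adding documents from another; the add() calls come back with status 0 long 
> before the optimize finishes.
> 
>   import org.apache.solr.client.solrj.impl.HttpSolrServer;
>   import org.apache.solr.client.solrj.response.UpdateResponse;
>   import org.apache.solr.common.SolrInputDocument;
> 
>   public class OptimizeQueueTest {
>     public static void main(String[] args) throws Exception {
>       final HttpSolrServer server =
>           new HttpSolrServer("http://localhost:8983/solr/collection1"); // placeholder URL
> 
>       // Run the optimize in the background; the optimize() call itself does
>       // not return until the optimize has completed on the server.
>       Thread optimizer = new Thread(new Runnable() {
>         public void run() {
>           try {
>             server.optimize();
>           } catch (Exception e) {
>             e.printStackTrace();
>           }
>         }
>       });
>       optimizer.start();
> 
>       // Meanwhile, updates are still acknowledged almost immediately.
>       for (int i = 0; i < 10; i++) {
>         SolrInputDocument doc = new SolrInputDocument();
>         doc.addField("id", "during-optimize-" + i);
>         long start = System.currentTimeMillis();
>         UpdateResponse rsp = server.add(doc);
>         System.out.println("add() status " + rsp.getStatus() + " after "
>             + (System.currentTimeMillis() - start) + " ms");
>       }
>       optimizer.join();
>     }
>   }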
> 
> My main question is: how can I put a bound on this internal queue and make 
> update requests synchronous when the queue is full? Put another way, I need 
> to know whether Solr is really ready to receive more requests, so that I 
> don't overload it and cause an OOME.
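> 
> (The only client-side workaround I can think of is to pace the senders with 
> an occasional blocking commit, roughly as in the sketch below; the class 
> name, batch count, and URL are purely illustrative. I would much rather have 
> a real back-pressure signal from Solr itself.)
> 
>   import java.util.List;
>   import org.apache.solr.client.solrj.impl.HttpSolrServer;
>   import org.apache.solr.common.SolrInputDocument;
> 
>   // Hypothetical client-side throttle: every N batches, issue a blocking
>   // commit so the sender waits until Solr has actually flushed, instead of
>   // racing ahead of it.
>   public class ThrottledSender {
>     private static final int BATCHES_PER_SYNC = 50; // illustrative value
>     private final HttpSolrServer server =
>         new HttpSolrServer("http://localhost:8983/solr/collection1"); // placeholder URL
>     private int sentSinceSync = 0;
> 
>     public synchronized void send(List<SolrInputDocument> batch) throws Exception {
>       server.add(batch); // "OK" only means the update was accepted
>       if (++sentSinceSync >= BATCHES_PER_SYNC) {
>         server.commit(true, true); // blocks until flushed and a new searcher is open
>         sentSinceSync = 0;
>       }
>     }
>   }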
> 
> I performed several tests, with slow and fast disks, and on the really fast 
> disk the problem didn't occur. However, I can't demand such a fast disk from 
> all the clients, and even with a fast disk the problem will eventually occur 
> when I try to index 10 million documents.
> I also tried to perform the indexing with optimization disabled, but it 
> didn't help.
> 
> Thanks,
> Yoni
> 

