On Jul 15, 2012, at 2:45 PM, Nick Koton wrote:

> I converted my program to use
> the SolrServer::add(Collection<SolrInputDocument> docs) method with 100
> documents in each add batch.  Unfortunately, the out of memory errors still
> occur without client side commits.
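
(For reference, the batched add being described looks roughly like this on the
client side - a minimal sketch; the URL, field names, doc count, and document
contents below are placeholders, not taken from Nick's actual program:)

    import java.util.ArrayList;
    import java.util.List;
    import org.apache.solr.client.solrj.SolrServer;
    import org.apache.solr.client.solrj.impl.HttpSolrServer;
    import org.apache.solr.common.SolrInputDocument;

    public class BatchIndexer {
      public static void main(String[] args) throws Exception {
        SolrServer server = new HttpSolrServer("http://localhost:8983/solr");
        List<SolrInputDocument> batch = new ArrayList<SolrInputDocument>();
        for (int i = 0; i < 1000000; i++) {
          SolrInputDocument doc = new SolrInputDocument();
          doc.addField("id", Integer.toString(i));
          doc.addField("text_t", "document body " + i);
          batch.add(doc);
          if (batch.size() == 100) {   // 100 docs per add batch, as described above
            server.add(batch);         // SolrServer#add(Collection) - no client-side commits
            batch.clear();
          }
        }
        if (!batch.isEmpty()) server.add(batch);
      }
    }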

This won't change much, unfortunately - currently, each host has 10 adds and 10 
deletes buffered for it before it will flush. There are some recovery 
implications that have kept that buffer size low so far - but what it ends up 
meaning is that when you stream docs, every 10 docs is sent off on a thread. 
Generally, you might be able to keep up with this - but the commit cost appears 
to cause a small resource dip that backs things up a bit - some of those 
threads take a little longer to finish while new threads fire off to keep 
servicing the constantly arriving documents. What appears to happen is large 
momentary spikes in the number of threads. Each thread needs a chunk of memory 
for its stack (plus a small footprint on the heap), and it would seem that with 
a high enough spike you could get an OOM. In my testing I have not triggered 
that yet, but I have seen large thread count spikes.
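
To put rough, purely illustrative numbers on it: the default thread stack on a 
64-bit JVM is typically around 512k-1m, so a momentary spike to 2,000 threads 
can tie up on the order of 1-2GB for stacks alone, on top of whatever the heap 
is doing. (Those figures are just to show the scale - they are not measured 
from this case.)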

Raising the add doc buffer to 100 docs makes those thread bursts much, much 
less severe. I can't remember all of the implications of that buffer size 
though - need to talk to Yonik about it.

We could limit the number of threads for that executor, but I think that comes 
with some negatives as well.

You could try lowering -Xss so that each thread uses less RAM (if possible), as 
a shorter-term workaround.
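For example, something like

    java -Xss256k <rest of your usual JVM args>

on whichever JVM is seeing the thread spikes - 256k is just an illustration; 
if you go too low you'll start hitting StackOverflowErrors.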

You could also use multiple threads with the std HttpSolrServer - it probably 
won't be quite as fast, but it can get close(ish).
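
Something along these lines - again a minimal sketch, with the thread count, 
URL, and batch source all made up for illustration (an HttpSolrServer instance 
can be shared across threads):

    import java.util.ArrayList;
    import java.util.List;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.TimeUnit;
    import org.apache.solr.client.solrj.SolrServer;
    import org.apache.solr.client.solrj.impl.HttpSolrServer;
    import org.apache.solr.common.SolrInputDocument;

    public class ThreadedIndexer {
      public static void main(String[] args) throws Exception {
        final SolrServer server = new HttpSolrServer("http://localhost:8983/solr");
        ExecutorService pool = Executors.newFixedThreadPool(8);  // bounded pool - no thread spikes
        for (final List<SolrInputDocument> batch : buildBatches()) {
          pool.submit(new Runnable() {
            public void run() {
              try {
                server.add(batch);       // each worker sends its own batch
              } catch (Exception e) {
                e.printStackTrace();     // log/retry as appropriate
              }
            }
          });
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.HOURS);
        server.commit();
      }

      // stand-in for however you actually build your 100-doc batches
      static List<List<SolrInputDocument>> buildBatches() {
        List<List<SolrInputDocument>> batches = new ArrayList<List<SolrInputDocument>>();
        List<SolrInputDocument> batch = new ArrayList<SolrInputDocument>();
        for (int i = 0; i < 10000; i++) {
          SolrInputDocument doc = new SolrInputDocument();
          doc.addField("id", Integer.toString(i));
          batch.add(doc);
          if (batch.size() == 100) {
            batches.add(batch);
            batch = new ArrayList<SolrInputDocument>();
          }
        }
        if (!batch.isEmpty()) batches.add(batch);
        return batches;
      }
    }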

My guess is that your client commits help because a commit causes a wait on all 
outstanding requests, so that the commit happens in logical order. That 
probably acts like releasing a pressure valve - the system gets a chance to 
catch up and reclaim a lot of threads.
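
In code terms, that is presumably something close to the following fragment 
inside the indexing loop (the commit interval is arbitrary and exception 
handling is omitted for brevity):

    int batchesSinceCommit = 0;
    // ... after each server.add(batch) ...
    if (++batchesSinceCommit >= 50) {
      server.commit();   // waits on outstanding requests - the "pressure valve" above
      batchesSinceCommit = 0;
    }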

We will keep looking into what the best improvement is.

- Mark Miller
lucidimagination.com