On 1/15/2014 2:43 PM, cwhi wrote:
I have a SolrCloud installation with about 2 million documents indexed in it.
It's been buzzing along without issue for the past 8 days, but today started
throwing errors on document adds that eventually resulted in out-of-memory
exceptions.  There is nothing funny going on.  There are a few infrequent
searches on the index every few minutes, and documents are being added in
batch (batches of 1000-5000) every few minutes as well.
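
For reference, the adds are roughly equivalent to the following SolrJ
sketch (the ZooKeeper hosts and collection name are placeholders, not my
real values):

    import java.util.ArrayList;
    import java.util.List;
    import org.apache.solr.client.solrj.impl.CloudSolrServer;
    import org.apache.solr.common.SolrInputDocument;

    // Placeholder ZooKeeper ensemble and collection name.
    CloudSolrServer server = new CloudSolrServer("zk1:2181,zk2:2181,zk3:2181");
    server.setDefaultCollection("collection1");

    // Build one batch and send it in a single update request.
    List<SolrInputDocument> batch = new ArrayList<SolrInputDocument>();
    for (int i = 0; i < 1000; i++) {
      SolrInputDocument doc = new SolrInputDocument();
      doc.addField("id", "doc-" + i);
      batch.add(doc);
    }
    server.add(batch);  // one round trip for the whole batch
    server.commit();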

The exceptions I'm receiving don't seem very informative.  The first
exception looks like this:

org.apache.solr.common.SolrException
        at org.apache.solr.handler.loader.XMLLoader.load(XMLLoader.java:176)
        at
org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:92)
        at
org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
        at
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
-- snip --

I've now experienced this with two SolrCloud instances in a row.  The
SolrCloud instance has 3 shards, each on a separate machine (each machine is
also running Zookeeper).  Each of the machines has 4 GB of RAM, with ~1.5
GB allocated to Solr.  Solr seems to max out the CPU during indexing, so I
don't know whether that's related.
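
For what it's worth, the heap cap is set with the standard JVM flags at
startup; the command below is just an illustration of that setting:

    java -Xms1536m -Xmx1536m -jar start.jar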

If anybody could help me in sorting out these issues, it would be greatly
appreciated.  I pulled the Solr log file and have uploaded it at
https://www.dropbox.com/s/co3r4esjnsas0tl/solr.log

Also, a short snippet of the first exception is available on pastebin at
http://pastebin.com/pWZrkGEr

I think the relevant part of your exception is this:

Caused by: org.eclipse.jetty.io.EofException
<snip>
Caused by: java.net.SocketException: Connection reset

When Jetty throws the EofException, it's almost always caused by the client disconnecting the TCP connection before the HTTP transaction is complete. The "Connection reset" message pretty much confirms it, IMHO.

What I think *might* be happening here is that you have a low SO_TIMEOUT configured on whatever is making your HTTP connections, and the update requests are not completing before that timeout expires, so the client closes the TCP connection before transfer is done. Most of the time, SO_TIMEOUT should either be left at infinity or configured with an insanely high value measured in minutes, not seconds.
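
If the client is SolrJ, for example, both timeouts can be set directly on the client object. A minimal sketch with illustrative values (the URL is a placeholder):

    import org.apache.solr.client.solrj.impl.HttpSolrServer;

    // Placeholder URL; the important part is the generous socket timeout.
    HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr/collection1");
    server.setConnectionTimeout(5000);  // connect timeout: fail fast if the server is unreachable
    server.setSoTimeout(600000);        // SO_TIMEOUT: allow ten minutes for slow updates to complete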

A potential underlying problem is that your index has gotten too big and the OS disk cache is no longer able to cache it effectively. When this happens, Solr performance will drop significantly. It's very common for Solr to be completely fine up to a certain threshold and then suffer horrible performance problems once that threshold is crossed.

Thanks,
Shawn
