Sorry, but you need to resend this message to the Solr user list - this is
the Lucene user list.
-- Jack Krupansky
-----Original Message-----
From: Beale, Jim (US-KOP)
Sent: Thursday, July 18, 2013 12:34 PM
To: [email protected]
Subject: Indexing into SolrCloud
Hey folks,
I've been migrating an application which indexes about 15M documents from
straight-up Lucene into SolrCloud. We've set up 5 Solr instances with a
3-node ZooKeeper ensemble, using HAProxy for load balancing. The documents
are processed on a quad-core machine with 6 threads and indexed into
SolrCloud through HAProxy using ConcurrentUpdateSolrServer in order to batch
the updates. The indexing box is heavily loaded.
I've been accepting the default HttpClient with 50K buffered docs and 2
threads, i.e.,
int solrMaxBufferedDocs = 50000;
int solrThreadCount = 2;
solrServer = new ConcurrentUpdateSolrServer(solrHttpIPAddress,
        solrMaxBufferedDocs, solrThreadCount);
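For anyone not familiar with the class: the second constructor argument is
the size of an internal queue and the third is the number of runner threads
that drain it. The pattern can be pictured roughly as below. This is only a
sketch under stated assumptions, not SolrJ code: BufferedIndexerSketch, run
and STOP are hypothetical names, and nothing is sent over HTTP.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Illustrative only: a bounded buffer drained by a fixed number of runner threads. */
public class BufferedIndexerSketch {
    // Hypothetical end-of-input marker; each runner stops when it takes one.
    private static final String STOP = "__STOP__";

    /** Pushes docCount fake documents through the buffer and returns how many were "sent". */
    public static int run(int docCount, int queueSize, int threadCount)
            throws InterruptedException {
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(queueSize);
        AtomicInteger sent = new AtomicInteger();
        ExecutorService runners = Executors.newFixedThreadPool(threadCount);

        for (int i = 0; i < threadCount; i++) {
            runners.execute(() -> {
                try {
                    while (true) {
                        String doc = queue.take();
                        if (STOP.equals(doc)) {
                            break;
                        }
                        // A real client would stream the document to the server here.
                        sent.incrementAndGet();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }

        for (int i = 0; i < docCount; i++) {
            queue.put("doc-" + i); // blocks the producer while the buffer is full
        }
        for (int i = 0; i < threadCount; i++) {
            queue.put(STOP); // one marker per runner
        }

        runners.shutdown();
        if (!runners.awaitTermination(30, TimeUnit.SECONDS)) {
            throw new IllegalStateException("runners did not finish in time");
        }
        return sent.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(run(1000, 50, 2));
    }
}
```

The point of the sketch is that the producer only blocks once the buffer is
full, so with a large buffer and few runners a lot of documents can be
waiting behind a small number of long-lived connections.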
autoCommit is configured in the solrconfig as follows:
<autoCommit>
  <maxTime>600000</maxTime>
  <maxDocs>500000</maxDocs>
  <openSearcher>false</openSearcher>
</autoCommit>
I'm getting the following errors on the client and server sides
respectively:
Client side:
2013-07-16 19:02:47,002 [concurrentUpdateScheduler-1-thread-4] INFO
SystemDefaultHttpClient - I/O exception (java.net.SocketException) caught
when processing request: Software caused connection abort: socket write
error
2013-07-16 19:02:47,002 [concurrentUpdateScheduler-1-thread-4] INFO
SystemDefaultHttpClient - Retrying request
2013-07-16 19:02:47,002 [concurrentUpdateScheduler-1-thread-5] INFO
SystemDefaultHttpClient - I/O exception (java.net.SocketException) caught
when processing request: Software caused connection abort: socket write
error
2013-07-16 19:02:47,002 [concurrentUpdateScheduler-1-thread-5] INFO
SystemDefaultHttpClient - Retrying request
Server side:
7988753 [qtp1956653918-23] ERROR org.apache.solr.core.SolrCore -
java.lang.RuntimeException: [was class org.eclipse.jetty.io.EofException]
early EOF
    at com.ctc.wstx.util.ExceptionUtil.throwRuntimeException(ExceptionUtil.java:18)
    at com.ctc.wstx.sr.StreamScanner.throwLazyError(StreamScanner.java:731)
    at com.ctc.wstx.sr.BasicStreamReader.safeFinishToken(BasicStreamReader.java:3657)
    at com.ctc.wstx.sr.BasicStreamReader.getText(BasicStreamReader.java:809)
    at org.apache.solr.handler.loader.XMLLoader.readDoc(XMLLoader.java:393)
When I disabled autoCommit on the server side, the server-side errors went
away, but I still get the client-side errors after about 2 million
documents, which is about 45 minutes into the run.
Has anyone seen this issue before? I couldn't find anything useful in the
usual places.
I suppose I could set up Wireshark to see what is happening, but I'm hoping
that someone has a better suggestion.
Thanks in advance for any help!
Best regards,
Jim Beale
hibu.com
2201 Renaissance Boulevard, King of Prussia, PA, 19406
Office: 610-879-3864
Mobile: 610-220-3067
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]