I do this all the time with batches of 1,000 and don't see this problem.

One thing that sometimes bites people is failing to clear the doc list
after every call to add, so you end up sending ever-increasing batches
to Solr. Assuming that by batch size you mean the size of the
solrDocumentList, increasing it would make the broken-pipe problem
worse if anything...
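
For reference, here's a minimal sketch of the pattern I mean
(moreRowsToIndex and fillBatch are hypothetical placeholders for
your own fetch-from-queue code):

    List<SolrInputDocument> solrDocumentList = new ArrayList<>();
    while (moreRowsToIndex()) {
        // fill the list with up to 1,000 docs from the queue
        fillBatch(solrDocumentList, 1000);
        solrClient.add(solrDocumentList);
        // the critical step: without this, every add re-sends all
        // previously indexed docs plus the new batch
        solrDocumentList.clear();
    }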

Also, it's generally bad practice to commit after every batch. That's
not your problem here, just something to note. Let your autocommit
settings in solrconfig.xml handle it, or specify commitWithin in your
add call.
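
For example, with SolrJ you can pass commitWithin (in milliseconds)
on the add itself and drop the explicit commit; the 30-second value
here is just illustrative:

    // ask Solr to make these docs searchable within 30 seconds
    // rather than forcing a hard commit after every batch
    solrClient.add(solrDocumentList, 30000);
    solrDocumentList.clear();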

I'd also look in your Solr logs and see if there's a problem there.

Net-net: this is a perfectly reasonable pattern, so I suspect some
innocent-seeming problem in your indexing code.

Best,
Erick



On Fri, Jul 20, 2018 at 9:32 AM, Arunan Sugunakumar
<arunans...@cse.mrt.ac.lk> wrote:
> Hi,
>
> I have around 12 millions objects in my PostgreSQL database to be indexed.
> I'm running a thread to fetch the rows from the database. The thread will
> also create the documents and put it in an indexing queue. While this is
> happening my main process will retrieve the documents from the queue and
> will index it in the size of 1000. For some time the process is running as
> expected, but after some time, I get an exception.
>
> [corePostProcess] org.apache.solr.client.solrj.SolrServerException:
> IOException occured when talking to server at:
> http://localhost:8983/solr/mine-search
> …
> [corePostProcess] Caused by: java.net.SocketException: Broken pipe
> (Write failed)
> [corePostProcess]    at java.net.SocketOutputStream.socketWrite0(Native Method)
>
>
> I tried increasing the batch size up to 30000. Then I got a different
> exception.
>
> [corePostProcess] org.apache.solr.client.solrj.SolrServerException:
> IOException occured when talking to server at:
> http://localhost:8983/solr/mine-search
> …
> [corePostProcess] Caused by: org.apache.http.NoHttpResponseException:
> localhost:8983 failed to respond
>
>
> I would like to know whether there are any good practices for handling
> such situations, such as the maximum number of documents to index in
> one attempt.
>
> My environment:
>
> Version: Solr 7.2, SolrJ 7.2
> Ubuntu 16.04
> RAM: 20GB
> I started Solr in standalone mode.
> Number of replicas and shards: 1
>
> The method I used:
>                 UpdateResponse response = solrClient.add(solrDocumentList);
>                 solrClient.commit();
>
>
> Thanks in advance.
>
> Arunan
