Just a suggestion, Markus: sending 50k documents worked in your case, but
you may want to benchmark batches of 5k, 10k, or 20k and compare them with
50k batches. It may turn out that a smaller batch size is faster than a
very large one...
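The client-side partitioning Markus describes can be sketched with a small, generic batching helper; this is illustrative code, not SolrJ API (in real indexing code each batch would be handed to `client.add(batch)`), and the batch size is a parameter so the 5k/10k/20k/50k comparison above is easy to benchmark:

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Minimal sketch, assuming the documents arrive as an Iterator (as in
// client.add(Iterator<SolrInputDocument>)). The helper is generic so it can
// be tried with different batch sizes when benchmarking.
public class BatchPartitioner {

    // Consume the iterator in fixed-size chunks; the last chunk may be smaller.
    static <T> List<List<T>> partition(Iterator<T> docs, int batchSize) {
        List<List<T>> batches = new ArrayList<>();
        List<T> current = new ArrayList<>(batchSize);
        while (docs.hasNext()) {
            current.add(docs.next());
            if (current.size() == batchSize) {
                batches.add(current);
                current = new ArrayList<>(batchSize);
            }
        }
        if (!current.isEmpty()) {
            batches.add(current);
        }
        return batches;
    }

    public static void main(String[] args) {
        List<Integer> ids = new ArrayList<>();
        for (int i = 0; i < 50_000; i++) ids.add(i);

        // Each batch would be sent via client.add(batch) in real code.
        List<List<Integer>> batches = partition(ids.iterator(), 5_000);
        System.out.println(batches.size());        // 10
        System.out.println(batches.get(9).size()); // 5000
    }
}
```

Sending each batch as its own request also keeps every request under any servlet-container packet-size limit, one of the failure modes raised below.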

On Fri, Oct 30, 2015 at 7:59 AM, Markus Jelsma <markus.jel...@openindex.io>
wrote:

> Hi - Solr doesn't seem to receive anything, it certainly doesn't log
> anything, and nothing is running out of memory. Indeed, I was clearly
> misunderstanding ConcurrentUpdateSolrClient.
>
> I had hoped, without reading its code, that it would partition the input,
> which it clearly doesn't. I changed my code to partition the input into
> batches of up to 50k documents and everything is running fine.
>
> Markus
>
>
>
> -----Original message-----
> > From:Erick Erickson <erickerick...@gmail.com>
> > Sent: Thursday 29th October 2015 22:28
> > To: solr-user <solr-user@lucene.apache.org>
> > Subject: Re: SolrJ stalls/hangs on client.add(); and doesn't return
> >
> > You're sending 100K docs in a single packet? It's vaguely possible that
> > you're getting a timeout, although that doesn't square with no docs
> > being indexed...
> >
> > Hmmm, to check you could do a manual commit. Or watch the Solr log to
> > see if update requests ever go there.
> >
> > Or you're running out of memory on the client.
> >
> > Or even exceeding the packet size that the servlet container will accept?
> >
> > But I think at root you're misunderstanding ConcurrentUpdateSolrClient.
> > It doesn't partition up a huge array and send the pieces in parallel;
> > it parallelizes sending the packet each call is given. So it's trying
> > to send all 100K docs at once. Probably not what you were aiming for.
> >
> > Try making batches of 1,000 docs and sending them through instead.
> >
> > So the parameters are a bit of magic. You can have up to the number of
> > threads you specify sending their entire packet to Solr in parallel,
> > and up to queueSize requests. Note this is the _request_, not the docs
> > in the list, if I'm reading the code correctly.....
> >
> > Best,
> > Erick
> >
> > On Thu, Oct 29, 2015 at 1:52 AM, Markus Jelsma
> > <markus.jel...@openindex.io> wrote:
> > > Hello - we have some processes periodically sending documents to Solr
> > > 5.3.0 in local mode using ConcurrentUpdateSolrClient 5.3.0, with
> > > queueSize 10 and threadCount 4, chosen arbitrarily since we had no
> > > idea what is right.
> > >
> > > Usually it's a few thousand up to some tens of thousands of rather
> > > small documents. Now, when the number of documents is around a hundred
> > > thousand, client.add(Iterator<SolrInputDocument> docIterator) stalls
> > > and never returns. It also doesn't index any of the documents. Upon
> > > calling, it quickly eats CPU and a load of heap, but shortly after it
> > > goes idle, with no CPU use, and the memory is released.
> > >
> > > I am puzzled, any ideas to share?
> > > Markus
> >
>
