Kalyan,

150-200 ms to index a single document seems too long, but it really depends on
how much analysis is going on and on the size of the docs.  32 threads seems too
high, unless your Solr server really has 32 cores.

 Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
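
The advice in this thread (send batches, use about one indexing thread per Solr
server core, commit once at the end) can be sketched roughly as below. This is
only an illustration under stated assumptions, not the indexer discussed here:
the class BatchIndexSketch, its methods, and the `sender` callback are
hypothetical names, and `sender` stands in for whatever actually submits a
batch to Solr (for example a SolrJ add call). No Solr API is used in the sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;

// Hypothetical sketch: batching plus a fixed-size pool of indexing threads.
class BatchIndexSketch {

    // Split docs into consecutive batches of at most batchSize elements.
    static <T> List<List<T>> partition(List<T> docs, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int i = 0; i < docs.size(); i += batchSize) {
            int end = Math.min(i + batchSize, docs.size());
            batches.add(new ArrayList<>(docs.subList(i, end)));
        }
        return batches;
    }

    // Hand every batch to `sender` using `threads` workers and return the
    // number of documents sent. `sender` is an assumed callback that would
    // wrap the real submission to Solr.
    static <T> int indexAll(List<T> docs, int batchSize, int threads,
                            Consumer<List<T>> sender) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        AtomicInteger sent = new AtomicInteger();
        try {
            List<Future<?>> pending = new ArrayList<>();
            for (List<T> batch : partition(docs, batchSize)) {
                pending.add(pool.submit(() -> {
                    sender.accept(batch);
                    sent.addAndGet(batch.size());
                }));
            }
            // Wait for every batch so that a failure is not silently lost.
            for (Future<?> f : pending) {
                try {
                    f.get();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException(e);
                } catch (ExecutionException e) {
                    throw new IllegalStateException(e.getCause());
                }
            }
        } finally {
            pool.shutdown();
        }
        // A single commit (and optimize, if wanted) would go here,
        // after all batches have been sent.
        return sent.get();
    }
}
```

The thread count is passed in explicitly because the number that matters is the
core count of the Solr server, which the indexing client cannot detect on its
own. A call such as indexAll(docs, 1000, 8, sender) would assume an 8-core
server and 1000-document batches.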



----- Original Message ----
> From: "Manepalli, Kalyan" <kalyan.manepa...@orbitz.com>
> To: "solr-user@lucene.apache.org" <solr-user@lucene.apache.org>
> Sent: Wednesday, July 1, 2009 4:21:30 PM
> Subject: RE: Tips on speeding the indexing process
> 
> Here are some specs for my indexer.
> The indexer is custom Java code that reads data from the DB and other
> services, builds the Solr document, and submits it using SolrJ over HTTP. The
> indexer does a bit of work to build the documents; that overhead is around
> 30 to 40 ms. For every document added, Solr takes around 150 to 200 ms.
> I tried the bulk addition approach with 1000 documents at a time, but found
> that Solr takes the same amount of time. I commit and optimize only once at
> the end. I currently use 32 threads in the production environment to get that
> time of 2 hrs.
> 
> 
> Thanks,
> Kalyan Manepalli
> 
> -----Original Message-----
> From: Otis Gospodnetic [mailto:otis_gospodne...@yahoo.com] 
> Sent: Wednesday, July 01, 2009 3:11 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Tips on speeding the indexing process
> 
> 
> Kalyan,
> 
> Using SolrJ?  Use the StreamingServer (StreamingUpdateSolrServer), it's nice and fast.
> Alternatively, start multiple indexing threads (match the number of Solr
> server CPU cores) and index from there.
> Send batches of docs, not one by one.
> Don't commit or optimize until you are done.
> 
> Otis
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
> 
> 
> 
> ----- Original Message ----
> > From: "Manepalli, Kalyan" 
> > To: "solr-user@lucene.apache.org" 
> > Sent: Wednesday, July 1, 2009 3:42:45 PM
> > Subject: Tips on speeding the indexing process
> > 
> > Hi,
> > I have a very generic question regarding indexing. In my current app, I
> > have about 450,000 docs, each around 2k in size. The total indexing time is
> > around 2 hrs.
> > Now, due to multi-language support, the number of documents is increasing
> > to 2.0 million. The total indexing time is exceeding 6 hrs.
> > I wanted to know if there are any general tips to speed up the indexing
> > process.
> > 
> > Thanks,
> > Kalyan Manepalli
