Charlie,

How's this:

* -Xmx2g
* ramBufferSizeMB 512
* mergeFactor 10 (the default, but you could raise it to 20 or 30 if ulimit -n allows)
* ignore/delete maxBufferedDocs - it is not used once you set ramBufferSizeMB
* use StreamingUpdateSolrServer (with a thread count matching your number of CPU cores), or send batches of, say, 1000 docs with the other SolrServer impl using N threads (N = # of your CPU cores)
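In case it helps, here's a rough sketch of how the index-side settings above could look in solrconfig.xml (Solr 3.x-style indexDefaults; the values are just the suggestions above, not something tuned to your data):

```xml
<!-- solrconfig.xml sketch: flush segments by RAM usage, not doc count -->
<indexDefaults>
  <!-- Flush the in-memory buffer to disk once it reaches 512 MB.
       maxBufferedDocs is deliberately omitted so only the RAM
       threshold triggers flushes. -->
  <ramBufferSizeMB>512</ramBufferSizeMB>
  <!-- 10 is the default; 20-30 means fewer merges during bulk
       indexing but more open files, so check `ulimit -n` first. -->
  <mergeFactor>10</mergeFactor>
</indexDefaults>
```

The -Xmx2g goes on the JVM that runs Solr itself, not in solrconfig.xml.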
Otis
----
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/

----- Original Message ----
> From: Charles Wardell <charles.ward...@bcsolution.com>
> To: solr-user@lucene.apache.org
> Sent: Tue, April 26, 2011 2:32:29 PM
> Subject: Question on Batch process
>
> I am sure that this question has been asked a few times, but I can't seem to
> find the sweet spot for indexing.
>
> I have about 100,000 files, each containing 1,000 XML documents, ready to be
> posted to Solr. My desire is to have it index as quickly as possible; once
> that completes, the daily stream of ADDs will be small in comparison.
>
> The individual documents are small - essentially web postings from the net:
> title, postPostContent, date.
>
> What would be the ideal configuration for ramBufferSizeMB, mergeFactor,
> maxBufferedDocs, etc.?
>
> My machine is a quad-core with hyper-threading, so it shows up as 8 CPUs in top.
> I have 16GB of available RAM.
>
> Thanks in advance.
> Charlie