Hi Tom,

32MB is very low and 320MB is medium; I think you could go higher still.  Just 
pick whichever garbage collector is good for throughput.  I know Java 1.6 
update 18 also has some HotSpot and possibly GC fixes, so I'd use that.  Finally, 
this sounds like a good use case for reindexing with Hadoop!
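
In case it's useful, here's a rough sketch of what that might look like.  The 
values are just placeholders and the exact element location depends on your 
Solr version, so treat it as an illustration rather than a recipe:

  <!-- solrconfig.xml (indexDefaults section in Solr 1.4) -->
  <indexDefaults>
    <ramBufferSizeMB>320</ramBufferSizeMB>  <!-- buffer docs in RAM before flushing a segment -->
    <mergeFactor>10</mergeFactor>           <!-- segments that accumulate before a merge -->
  </indexDefaults>

  # JVM started with a throughput-oriented collector; the heap size is just an example
  java -Xmx4g -XX:+UseParallelGC -XX:+UseParallelOldGC -jar start.jar

The bigger the RAM buffer, the more heap you need on top of whatever the rest 
of Solr uses, so leave yourself headroom when picking -Xmx.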

 Otis
----
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Hadoop ecosystem search :: http://search-hadoop.com/



----- Original Message ----
> From: "Burton-West, Tom" <tburt...@umich.edu>
> To: "solr-user@lucene.apache.org" <solr-user@lucene.apache.org>
> Sent: Wed, February 17, 2010 5:16:26 PM
> Subject: What is largest reasonable setting for ramBufferSizeMB?
> 
> Hello all,
> 
> At some point we will need to re-build an index that totals about 2 terabytes 
> in size (split over 10 shards).  At our current indexing speed we estimate that 
> this will take about 3 weeks.  We would like to reduce that time.  It appears 
> that our main bottleneck is disk I/O.
> We currently have ramBufferSizeMB set to 32 and our merge factor is 10.  If we 
> increase ramBufferSizeMB to 320, we avoid a merge and the 9 disk writes and 
> reads to merge 9+1 32MB segments into a 320MB segment.
> 
> Assuming we allocate enough memory to the JVM, would it make sense to increase 
> ramBufferSizeMB to 3200MB?  What are people's experiences with very large 
> ramBufferSizeMB sizes?
> 
> Tom Burton-West
> University of Michigan Library
> www.hathitrust.org
