Hi Tom,

I don't know of an easy way to understand the relationship between the max RAM buffer size and the JVM heap size. I ran the test w/ an 8GB heap and a 2048MB RAM buffer. Indexing 16M documents (roughly 288GB of data) took 7400 seconds (using 8 threads). I will post the full benchmark output when I finish indexing 25M documents w/ different RAM buffer sizes.
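
As a quick check on the figures above, here is a minimal sketch of the throughput arithmetic. The class and method names are hypothetical and only illustrate the calculation; the inputs are the numbers quoted in this message (288GB, 7400 seconds, 16M documents).

```java
import java.util.Locale;

// Hypothetical helper: converts the benchmark figures into rates.
class IndexingThroughput {

    /** Data volume indexed per hour, in GB. */
    static double gbPerHour(double gigabytes, double seconds) {
        return gigabytes / seconds * 3600.0;
    }

    /** Documents indexed per second, summed over all indexing threads. */
    static double docsPerSecond(long docs, double seconds) {
        return docs / seconds;
    }

    public static void main(String[] args) {
        double gigabytes = 288.0;   // roughly 288GB of input data
        double seconds = 7400.0;    // wall-clock time of the run
        long docs = 16_000_000L;    // 16M documents

        System.out.println(String.format(Locale.ROOT,
                "%.1f GB/hour, %.0f docs/sec",
                gbPerHour(gigabytes, seconds),
                docsPerSecond(docs, seconds)));
    }
}
```

With these inputs the rate comes out at about 140GB per hour, or a little over 2,000 documents per second across the 8 threads.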

My gut feeling (and after reading this: http://www.ibm.com/developerworks/java/library/j-jtp09275.html) tells me that if I need N MB of RAM, I should allocate at least 2*N on the heap. But that only takes the RAM buffer into consideration. Since other memory is allocated as well, GC might kick in, so to avoid that (as much as possible) I allocate at least 3*N, if N is large enough.

In the current example, I need 2GB for the RAM buffer, so I'll allocate at least 4GB on the heap. Then, if I assume that the rest of the app won't allocate more than 2GB in total, I'll set the heap size to 6GB. Since I have lots of RAM and cannot use it w/ Lucene, I set the heap size to 8GB.

I haven't turned on any flags to determine if and when GC ran, though, so I don't know if I've hit any nasty GC issues. But given the total indexing throughput (~140GB/hour), I think these are good settings.

BTW, I think that w/ parallel arrays (https://issues.apache.org/jira/browse/LUCENE-2329), the performance should be better if you use a lower heap size. You can also read there that Michael B. ran the test w/ a 200MB RAM buffer and a 2GB heap (and also a 256MB heap), which might give you another indication of the RAM buffer / heap size ratio.

Hope this helps,
Shai

On Mon, Apr 26, 2010 at 8:26 PM, Tom Burton-West <tburtonw...@gmail.com> wrote:

>
> I'm looking forward to your results Shai.
>
> Once we get our new test server we will be running tests with different RAM
> buffer sizes. We have 10 300GB indexes to re-index, so we need to minimize
> any merging/disk I/O.
>
> See also this related thread on the Solr list:
>
> http://lucene.472066.n3.nabble.com/What-is-largest-reasonable-setting-for-ramBufferSizeMB-tc505964.html#a505964
>
> Is there any easy way to understand the relationship between the max RAM
> buffer size and the total amount of memory you need to give the JVM?
>
> Tom Burton-West
> www.hathitrust.org
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Lucene-RAM-buffer-size-limit-tp756752p757354.html
> Sent from the Lucene - Java Developer mailing list archive at Nabble.com.
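
The heap-sizing rule of thumb from the reply (2*N for the RAM buffer, plus a budget for whatever else the application allocates) can be sketched as below. This is only an illustration of that heuristic, not a Lucene API: the class name, method names, and the 2GB "other allocations" budget are assumptions taken from the example in the message. Since the reply notes that no GC flags were turned on, the sketch also reads collection counts through the standard `java.lang.management` beans as one possible way to see whether GC has run.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper illustrating the rule of thumb; not part of Lucene.
class HeapSizing {

    /** 2*N for the RAM buffer plus an assumed budget for the rest of the app. */
    static long suggestedHeapMb(long ramBufferMb, long otherAppMb) {
        return 2 * ramBufferMb + otherAppMb;
    }

    /** Sum of collection counts over all collectors in this JVM. */
    static long totalGcCount() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount(); // -1 when the collector does not report a count
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        long ramBufferMb = 2048; // RAM buffer from the example
        long otherAppMb = 2048;  // assumed ceiling for all other allocations

        long suggestedMb = suggestedHeapMb(ramBufferMb, otherAppMb);
        long maxHeapMb = Runtime.getRuntime().maxMemory() / (1024L * 1024L);

        System.out.println("Suggested heap (MB): " + suggestedMb);
        System.out.println("Configured max heap (MB): " + maxHeapMb);
        System.out.println("GC runs so far: " + totalGcCount());
    }
}
```

For a 2048MB buffer and a 2048MB budget for everything else, the suggestion is 6144MB, which matches the 6GB figure in the reply. Setting `otherAppMb` equal to the buffer size reproduces the 3*N variant.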