A 30ms pause is not better or worse than a 1s pause until you look at the service requirements. For many applications, it is worth dedicating 10% of your processing time to GC if that makes the worst-case pause short.
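To put numbers on that tradeoff, here is a rough sketch comparing the two GC profiles from the quoted message: 30ms once per second versus 1s once per hour. The 100ms latency budget is a made-up service requirement for illustration, and the helper names are hypothetical.

```python
def gc_profile(pause_s, interval_s):
    """Return (fraction of wall-clock time spent in GC pauses, worst-case pause in seconds)."""
    return pause_s / interval_s, pause_s


def fits_budget(profile, max_pause_s):
    """True if the worst-case pause stays within the latency budget."""
    _, worst_pause_s = profile
    return worst_pause_s <= max_pause_s


short_frequent = gc_profile(0.030, 1.0)    # 30 ms pause, once per second
long_rare = gc_profile(1.0, 3600.0)        # 1 s pause, once per hour

# Hypothetical requirement: no request may stall longer than 100 ms.
LATENCY_BUDGET_S = 0.100

for name, profile in (("30ms/sec", short_frequent), ("1s/hour", long_rare)):
    overhead, worst = profile
    print(f"{name}: overhead={overhead:.4%} worst_pause={worst}s "
          f"fits_budget={fits_budget(profile, LATENCY_BUDGET_S)}")
```

The long, rare pause costs far less total time, but it is the only one of the two that breaks a 100ms budget. Which one is "better" depends entirely on which number the service cares about.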
On the other hand, my experience with the IBM JVM was that the maximum query rate was 2-3X better with the concurrent generational GC compared to any of their other GC algorithms, so we got the best throughput along with the shortest pauses.

Solr garbage generation (for queries) seems to have two major components: per-request garbage and cache evictions. With a generational collector, these two are handled by separate parts of the collector.

Per-request garbage should completely fit in the short-term heap (nursery), so that it can be collected rapidly and returned to use for further requests. If the nursery is too small, the per-request allocations will be made in tenured space and sit there until the next major GC. Cache evictions are almost always in long-term storage (tenured space) because an LRU algorithm guarantees that the garbage will be old.

Check the growth rate of tenured space (under constant load, of course) while increasing the size of the nursery. That rate should drop when the nursery gets big enough, then not drop much further as it is increased more.

After that, reduce the size of tenured space until major GCs start happening "too often" (a judgment call). A bigger tenured space means longer major GCs and thus longer pauses, so you don't want it oversized by too much.

Also check the hit rates of your caches. If the hit rate is low, say 20% or less, make that cache much bigger or set it to zero. Either one will reduce the number of cache evictions. If you have an HTTP cache in front of Solr, zero may be the right choice, since the HTTP cache is cherry-picking the easily cacheable requests.

Note that a commit nearly doubles the memory required, because you have two live Searcher objects with all their caches. Make sure you have headroom for a commit.

If you want to test the tenured space usage, you must test with real world queries. Those are the only way to get accurate cache eviction rates.
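The nursery-sizing step above (grow the nursery until the tenured growth rate stops dropping) can be sketched as a small helper. This is only an illustration: the function name, the 10% "not much further" threshold, and the sample measurements are all assumptions, not numbers from a real Solr install.

```python
def pick_nursery_size(samples, min_improvement=0.10):
    """Pick the nursery size where tenured-space growth levels off.

    samples: list of (nursery_mb, tenured_growth_mb_per_min) pairs,
             each measured under the same constant load.
    min_improvement: assumed threshold; if the next larger nursery cuts the
             growth rate by less than this fraction, stop growing.
    """
    if not samples:
        raise ValueError("need at least one measurement")
    ordered = sorted(samples)
    for (size, rate), (_, next_rate) in zip(ordered, ordered[1:]):
        # Relative drop in tenured growth gained by moving to the next size.
        if rate <= 0 or (rate - next_rate) / rate < min_improvement:
            return size
    # Growth was still dropping at the largest size tested.
    return ordered[-1][0]


# Hypothetical measurements: (nursery MB, tenured growth in MB/min).
measurements = [(64, 40.0), (128, 22.0), (256, 6.0), (512, 5.6), (1024, 5.5)]
chosen = pick_nursery_size(measurements)
print(chosen)
```

With these made-up numbers the rate falls sharply up to 256 MB and barely moves after that, so 256 MB is where per-request garbage stops spilling into tenured space. What remains is mostly cache evictions, which is why the measurements need real-world queries.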
wunder

-----Original Message-----
From: Jonathan Ariel [mailto:ionat...@gmail.com]
Sent: Friday, September 25, 2009 9:34 AM
To: solr-user@lucene.apache.org
Subject: Re: Solr and Garbage Collection

BTW why making them equal will lower the frequency of GC?

On 9/25/09, Fuad Efendi <f...@efendi.ca> wrote:
>> Bigger heaps lead to bigger GC pauses in general.
>
> Opposite viewpoint:
> 1sec GC happening once an hour is MUCH BETTER than 30ms GC once-per-second.
>
> To lower frequency of GC: -Xms4096m -Xmx4096m (make it equal!)
>
> Use -server option.
>
> -server option of JVM is 'native CPU code', I remember WebLogic 7 console
> with SUN JVM 1.3 not showing any GC (just horizontal line).
>
> -Fuad
> http://www.linkedin.com/in/liferay