As I said, I was using the IBM JVM, not the Sun JVM. The "concurrent low
pause" collector is only in the Sun JVM.

I just found this excellent article about the various IBM GC options for a
Lucene application with a 100GB heap:

http://www.nearinfinity.com/blogs/aaron_mccurry/tuning_the_ibm_jvm_for_large_h.html

wunder

-----Original Message-----
From: Mark Miller [mailto:markrmil...@gmail.com] 
Sent: Friday, September 25, 2009 10:03 AM
To: solr-user@lucene.apache.org
Subject: Re: Solr and Garbage Collection

Walter Underwood wrote:
> 30ms is not better or worse than 1s until you look at the service
> requirements. For many applications, it is worth dedicating 10% of your
> processing time to GC if that makes the worst-case pause short.
>
> On the other hand, my experience with the IBM JVM was that the maximum query
> rate was 2-3X better with the concurrent generational GC compared to any of
> their other GC algorithms, so we got the best throughput along with the
> shortest pauses.
>   
With which collector? Since the very early JVMs, all GC has been
generational. Most of the collectors (other than the Serial Collector)
also work concurrently. By default, each is concurrent on a different
generation, but you can now add concurrency to the "other" generation
for each as well.
> Solr garbage generation (for queries) seems to have two major components:
> per-request garbage and cache evictions. With a generational collector,
> these two are handled by separate parts of the collector.
Different parts of the collector? It's a different collector depending
on the generation. The young generation is collected with a copy
collector. This is because almost all the objects in the young
generation are likely dead, and a copy collector only needs to visit
live objects, so it's very efficient. The tenured generation uses
something more along the lines of mark and sweep or mark and compact.
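
To see how a given JVM splits the heap into generations, something like
the following may help. It is only a minimal sketch using the standard
java.lang.management API; the class name HeapPools is made up for
illustration, and the pool names it prints differ between JVMs and
collectors, so which pool is the nursery and which is tenured is an
assumption you have to check for your own setup.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, not part of Solr or any JVM distribution.
public class HeapPools {

    // Returns one line per heap memory pool: name, bytes used, bytes committed.
    static List<String> describeHeapPools() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.HEAP) {
                continue; // skip non-heap pools such as code caches
            }
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // getUsage() returns null if the pool is no longer valid
            }
            lines.add(pool.getName()
                    + ": used=" + usage.getUsed()
                    + " committed=" + usage.getCommitted());
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : describeHeapPools()) {
            System.out.println(line);
        }
    }
}
```

Sampling the tenured pool's "used" value at intervals under constant
load gives a rough growth rate.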
>  Per-request
> garbage should completely fit in the short-term heap (nursery), so that it
> can be collected rapidly and returned to use for further requests. If the
> nursery is too small, the per-request allocations will be made in tenured
> space and sit there until the next major GC. Cache evictions are almost
> always in long-term storage (tenured space) because an LRU algorithm
> guarantees that the garbage will be old.
>
> Check the growth rate of tenured space (under constant load, of course)
> while increasing the size of the nursery. That rate should drop when the
> nursery gets big enough, then not drop much further as it is increased more.
>
> After that, reduce the size of tenured space until major GCs start happening
> "too often" (a judgment call). A bigger tenured space means longer major GCs
> and thus longer pauses, so you don't want it oversized by too much.
>   
With the concurrent low pause collector, the goal is to avoid "major"
collections by collecting *before* the tenured space is filled. If you
are getting "major" collections, you need to tune your settings: the
whole point of that collector is to avoid "major" collections and do
almost all of the work while your application is not paused. There are
still 2 brief pauses during the collection, but they should not be
significant at all.
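
One rough way to watch for this from inside the JVM is to read the
per-collector counters. This is only a sketch using the standard
java.lang.management API; the class name GcCounts is made up for
illustration. Collector names vary by JVM and by the collector you
selected, so which entry corresponds to tenured collections is an
assumption to verify against your verbose GC log.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper class, not part of Solr or any JVM distribution.
public class GcCounts {

    // Maps each collector name to the number of collections it has run so far.
    // A value of -1 means the JVM does not report a count for that collector.
    static Map<String, Long> collectionCounts() {
        Map<String, Long> counts = new LinkedHashMap<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            counts.put(gc.getName(), gc.getCollectionCount());
        }
        return counts;
    }

    public static void main(String[] args) {
        for (Map.Entry<String, Long> e : collectionCounts().entrySet()) {
            System.out.println(e.getKey() + ": " + e.getValue() + " collections");
        }
    }
}
```

Taking two snapshots some minutes apart under load and comparing them
shows how often each collector ran in between.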
> Also check the hit rates of your caches. If the hit rate is low, say 20% or
> less, make that cache much bigger or set it to zero. Either one will reduce
> the number of cache evictions. If you have an HTTP cache in front of Solr,
> zero may be the right choice, since the HTTP cache is cherry-picking the
> easily cacheable requests.
>
> Note that a commit nearly doubles the memory required, because you have two
> live Searcher objects with all their caches. Make sure you have headroom for
> a commit.
>
> If you want to test the tenured space usage, you must test with real world
> queries. Those are the only way to get accurate cache eviction rates.
>
> wunder
>
> -----Original Message-----
> From: Jonathan Ariel [mailto:ionat...@gmail.com] 
> Sent: Friday, September 25, 2009 9:34 AM
> To: solr-user@lucene.apache.org
> Subject: Re: Solr and Garbage Collection
>
> BTW, why will making them equal lower the frequency of GC?
>
> On 9/25/09, Fuad Efendi <f...@efendi.ca> wrote:
>   
>>> Bigger heaps lead to bigger GC pauses in general.
>>>       
>> Opposite viewpoint:
>> 1sec GC happening once an hour is MUCH BETTER than 30ms GC once-per-second.
>>
>> To lower frequency of GC: -Xms4096m -Xmx4096m (make it equal!)
>>
>> Use -server option.
>>
>> -server option of JVM is 'native CPU code', I remember WebLogic 7 console
>> with SUN JVM 1.3 not showing any GC (just horizontal line).
>>
>> -Fuad
>> http://www.linkedin.com/in/liferay
>>
>>
>>
>>
>>     
>
>
>   


-- 
- Mark

http://www.lucidimagination.com



