Re: Solr and Garbage Collection

Mark Miller Sat, 26 Sep 2009 14:11:43 -0700

Sorry Walter. Half the time I type faster than I think. I was mixing up
concurrent with parallel.
I do agree with you on the concurrent part for batch processing (and
likely other things).
It would likely be far better to use as many CPUs as you can (as many
as make sense) collecting in parallel while the world is stopped, rather
than paying to do it concurrently. My fault on the confusion.


Parallel: super important for large heaps.
Concurrent: super important for systems that always need low response
times.

Hence the Parallel collector being named the throughput collector :)
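
If you want to check which collectors your own JVM is actually running,
here is a minimal sketch using the standard java.lang.management API.
The class name GcInfo is just something I made up for illustration. On
the Sun JVM, -XX:+UseParallelGC selects the throughput collector and
-XX:+UseConcMarkSweepGC the concurrent low pause collector; I'm assuming
a Sun JVM of this vintage, and other vendors or versions may not accept
these flags. The collector names printed vary by JVM and by the flags
you pass.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Illustrative helper (not part of Solr): reports which collectors this JVM runs.
final class GcInfo {
    // Names of the active collectors, as reported by the JVM itself.
    static List<String> collectorNames() {
        List<String> names = new ArrayList<String>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Count and time are -1 when the JVM does not define them for a collector.
            System.out.println(gc.getName()
                + ": collections=" + gc.getCollectionCount()
                + ", time(ms)=" + gc.getCollectionTime());
        }
    }
}
```

Run it once with each flag and compare the names and counts; typically
there is one entry per generation.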

Sorry for the confusion - wouldn't be the first time ;)

I'll stick to my generational argument though - as I said, if most of
your objects are long lived (*extremely* rare from what I know), skipping
generational collection makes sense, but in almost all cases, generational
is super helpful. Which is why Sun doesn't even offer non-generational
anymore.
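
To see the generational layout on your own JVM, a rough sketch along
these lines lists the heap memory pools. The class name HeapPools is
hypothetical. The pools usually correspond to the young generation
(eden and survivor spaces) and the tenured space, but the exact pool
names depend on the JVM and the collector, so treat the output as
something to inspect rather than something to rely on.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

// Illustrative helper (not part of Solr): lists the heap pools behind the generations.
final class HeapPools {
    // Names of the pools the JVM classifies as heap (as opposed to non-heap) memory.
    static List<String> heapPoolNames() {
        List<String> names = new ArrayList<String>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() == MemoryType.HEAP) {
                names.add(pool.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.HEAP) {
                continue;
            }
            // getUsage() returns null if the pool is no longer valid.
            MemoryUsage usage = pool.getUsage();
            if (usage != null) {
                System.out.println(pool.getName() + ": used=" + usage.getUsed() + " bytes");
            }
        }
    }
}
```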

- Mark

Mark Miller wrote:
> Walter Underwood wrote:
>   
>> For batch-oriented computing, like Hadoop, the most efficient GC is probably
>> a non-concurrent, non-generational GC. 
>>     
> Okay - for batch we somewhat agree I guess - if you can stand any length
> of pausing, non concurrent can be nice, because you don't pay for thread
> sync communication. Only with a small heap size though (less than 100MB
> is what I've seen). You would pause the batch job while GC takes place.
> If you have 8 processors, and you are pausing all of them to collect a
> large heap using only 1 processor, that doesn't make much sense to me.
> The thread communication pain will be far outweighed by using more
> processors to do the collection faster, and not "stop the world" for
> your batch job so long. Stopping your application dead in its tracks,
> and then only using one of the available processors to collect a large
> heap, while the rest sit idle, doesn't make much sense.
>
> I also don't agree it ever really makes sense not to do generational
> collection. What is your argument here? Generational collection is
> **way** more efficient for short lived objects, which tend to be up to
> 98% of the objects in most applications. The only way I see that making
> sense is if you have almost no short lived objects (which occurs in
> what, .0001% of apps if at all?). The Sun JVM doesn't even offer a non
> generational approach anymore. It's just standard GC practice.
>   
>> I doubt that there are many
>> batch-oriented applications of Solr, though.
>>
>> The rest of the advice is intended to be general and it sounds like we agree
>> about sizing. If the nursery is not big enough, the tenured space will be
>> used for allocations that have a short lifetime and that will increase the
>> length and/or frequency of major collections.
>>   
>>     
> Yes - I wasn't arguing with every point - I was picking and choosing :)
> After the heap size, the size of the young generation is the most
> important factor.
>   
>> Cache evictions are the interesting part, because they cause a constant rate
>> of tenured space garbage. In many servers, you can get a big enough
>> nursery that major collections are very rare. That won't happen in Solr
>> because of cache evictions.
>>
>> The IBM JVM is excellent. Their concurrent generational GC policy is
>> "gencon".
>>   
>>     
> Yeah, I actually know very little about the IBM JVM, so I wasn't really
> commenting. But from the info I gleaned here and on a couple quick web
> searches, I'm not too impressed by its GC.
>   
>> wunder
>>
>> -----Original Message-----
>> From: Mark Miller [mailto:markrmil...@gmail.com] 
>> Sent: Friday, September 25, 2009 10:31 AM
>> To: solr-user@lucene.apache.org
>> Subject: Re: Solr and Garbage Collection
>>
>> My bad - later, it looks as if you're giving general advice, and that's
>> what I took issue with.
>>
>> Any Collector that is not doing generational collection is essentially
>> from the dark ages and shouldn't be used.
>>
>> Any Collector that doesn't have concurrent options, unless possibly you're
>> running a tiny app (under 100MB of RAM), or only have a single CPU, is
>> also dark ages, and not fit for a server environment.
>>
>> I haven't kept up with IBM's JVM, but it sounds like they are well behind
>> Sun in GC then.
>>
>> - Mark
>>
>> Walter Underwood wrote:
>>   
>>     
>>> As I said, I was using the IBM JVM, not the Sun JVM. The "concurrent low
>>> pause" collector is only in the Sun JVM.
>>>
>>> I just found this excellent article about the various IBM GC options for a
>>> Lucene application with a 100GB heap:
>>>
>>> http://www.nearinfinity.com/blogs/aaron_mccurry/tuning_the_ibm_jvm_for_large_h.html
>>>
>>> wunder
>>>
>>> -----Original Message-----
>>> From: Mark Miller [mailto:markrmil...@gmail.com] 
>>> Sent: Friday, September 25, 2009 10:03 AM
>>> To: solr-user@lucene.apache.org
>>> Subject: Re: Solr and Garbage Collection
>>>
>>> Walter Underwood wrote:
>>>
>>>> 30ms is not better or worse than 1s until you look at the service
>>>> requirements. For many applications, it is worth dedicating 10% of your
>>>> processing time to GC if that makes the worst-case pause short.
>>>>
>>>> On the other hand, my experience with the IBM JVM was that the maximum
>>>> query rate was 2-3X better with the concurrent generational GC compared
>>>> to any of their other GC algorithms, so we got the best throughput along
>>>> with the shortest pauses.
>>>>
>>> With which collector? Since the very early JVMs, all GC is generational.
>>> Most of the collectors (other than the Serial Collector) also work
>>> concurrently. By default, they are concurrent on different generations,
>>> but you can add concurrency to the "other" generation with each now too.
>>>
>>>> Solr garbage generation (for queries) seems to have two major components:
>>>> per-request garbage and cache evictions. With a generational collector,
>>>> these two are handled by separate parts of the collector.
>>>>
>>> Different parts of the collector? It's a different collector depending on
>>> the generation. The young generation is collected with a copy collector.
>>> This is because almost all the objects in the young generation are likely
>>> dead, and a copy collector only needs to visit live objects, so it's very
>>> efficient. The tenured generation uses something more along the lines of
>>> mark and sweep or mark and compact.
>>>
>>>> Per-request garbage should completely fit in the short-term heap
>>>> (nursery), so that it can be collected rapidly and returned to use for
>>>> further requests. If the nursery is too small, the per-request
>>>> allocations will be made in tenured space and sit there until the next
>>>> major GC. Cache evictions are almost always in long-term storage (tenured
>>>> space) because an LRU algorithm guarantees that the garbage will be old.
>>>>
>>>> Check the growth rate of tenured space (under constant load, of course)
>>>> while increasing the size of the nursery. That rate should drop when the
>>>> nursery gets big enough, then not drop much further as it is increased
>>>> more. After that, reduce the size of tenured space until major GCs start
>>>> happening "too often" (a judgment call). A bigger tenured space means
>>>> longer major GCs and thus longer pauses, so you don't want it oversized
>>>> by too much.
>>>>
>>> With the concurrent low pause collector, the goal is to avoid "major"
>>> collections by collecting *before* the tenured space is filled. If you
>>> are getting "major" collections, you need to tune your settings - the
>>> whole point of that collector is to avoid "major" collections, and do
>>> almost all of the work while your application is not paused. There are
>>> still 2 brief pauses during the collection, but they should not be
>>> significant at all.
>>>
>>>> Also check the hit rates of your caches. If the hit rate is low, say 20%
>>>> or less, make that cache much bigger or set it to zero. Either one will
>>>> reduce the number of cache evictions. If you have an HTTP cache in front
>>>> of Solr, zero may be the right choice, since the HTTP cache is
>>>> cherry-picking the easily cacheable requests.
>>>>
>>>> Note that a commit nearly doubles the memory required, because you have
>>>> two live Searcher objects with all their caches. Make sure you have
>>>> headroom for a commit.
>>>>
>>>> If you want to test the tenured space usage, you must test with real
>>>> world queries. Those are the only way to get accurate cache eviction
>>>> rates.
>>>>
>>>> wunder
>>>>
>>>> -----Original Message-----
>>>> From: Jonathan Ariel [mailto:ionat...@gmail.com] 
>>>> Sent: Friday, September 25, 2009 9:34 AM
>>>> To: solr-user@lucene.apache.org
>>>> Subject: Re: Solr and Garbage Collection
>>>>
>>>> BTW, why would making them equal lower the frequency of GC?
>>>>
>>>> On 9/25/09, Fuad Efendi <f...@efendi.ca> wrote:
>>>>
>>>>>> Bigger heaps lead to bigger GC pauses in general.
>>>>>>
>>>>> Opposite viewpoint:
>>>>> 1sec GC happening once an hour is MUCH BETTER than 30ms GC
>>>>> once-per-second.
>>>>>
>>>>> To lower frequency of GC: -Xms4096m -Xmx4096m (make it equal!)
>>>>>
>>>>> Use -server option.
>>>>>
>>>>> -server option of JVM is 'native CPU code', I remember WebLogic 7
>>>>> console with SUN JVM 1.3 not showing any GC (just horizontal line).
>>>>>
>>>>> -Fuad
>>>>> http://www.linkedin.com/in/liferay


-- 
- Mark

http://www.lucidimagination.com
