One more point and I'll stop - I've hit my email quota for the day ;)

While it's a pain to have to juggle GC params and tune - when you require
a heap that's more than a gig or two, I personally believe it's essential
to do so for good performance. The default settings (ergonomics with the
throughput collector) just don't cut it. Sad fact of life :) Luckily, you
don't generally have to do that much to get things nice - the number of
options is not that staggering, and you don't usually need to get into
most of them. Choosing the right collector and tweaking a setting or
two can often be enough.

The most important thing to do with a large heap and the throughput
collector is to turn on parallel tenured collection. I've said it before,
but it really is key. At least if you have more than a processor or two -
which, for your sake, I hope you do :)
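
To see which collectors a JVM is actually running and how much time they
take, something like the sketch below may help. It is only an illustration:
the class name GcReport and the helper gcFraction are made up for this
example, and it reports on the JVM it runs in, so for Solr the same beans
would have to be read inside the Solr JVM or over JMX (JConsole shows them
too). On the Sun HotSpot JVMs of this era, parallel tenured collection is
usually enabled with -XX:+UseParallelOldGC; check your own JVM's
documentation before relying on that flag.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcReport {

    // Fraction of elapsed time spent in GC, e.g. 0.11 for an 11% share.
    // Returns 0.0 when uptime is not positive, to avoid dividing by zero.
    public static double gcFraction(long gcMillis, long uptimeMillis) {
        if (uptimeMillis <= 0) {
            return 0.0;
        }
        return (double) gcMillis / uptimeMillis;
    }

    public static void main(String[] args) {
        long totalGcMillis = 0;
        // One bean per collector; the names reveal which collectors are active.
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount();
            long millis = gc.getCollectionTime(); // -1 if the JVM does not report it
            System.out.println(gc.getName() + ": " + count + " collections, " + millis + " ms");
            if (millis > 0) {
                totalGcMillis += millis;
            }
        }
        long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
        double percent = 100.0 * gcFraction(totalGcMillis, uptime);
        System.out.println("GC share of uptime: " + percent + "%");
    }
}
```

The collection time is the accumulated elapsed time the JVM reports per
collector, so treat the percentage as a rough indicator rather than an
exact pause measurement.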

- Mark

Mark Miller wrote:
> That's a good point too - if you can reduce your need for such a large
> heap, by all means, do so.
>
> However, considering you already need at least 10GB or you get OOM, you
> have a long way to go with that approach. Good luck :)
>
> How many docs do you have? I'm guessing it's mostly FieldCache type
> stuff, and that's the type of thing you can't really sidestep, unless
> you give up the functionality that's using it.
>
> Grant Ingersoll wrote:
>   
>> On Sep 25, 2009, at 9:30 AM, Jonathan Ariel wrote:
>>
>>     
>>> Hi to all!
>>> Lately my solr servers seem to stop responding once in a while. I'm
>>> using solr 1.3.
>>> Of course I'm having more traffic on the servers.
>>> So I logged the Garbage Collection activity to check if it's because of
>>> that. It seems like 11% of the time the application runs, it is stopped
>>> because of GC. And sometimes the GC takes up to 10 seconds!
>>> Is this normal? My instances run on 16GB RAM, Dual Quad Core Intel Xeon
>>> servers. My index is around 10GB and I'm giving the instances 10GB of
>>> RAM.
>>>
>>> How can I check which GC is being used? If I'm right, JVM
>>> Ergonomics should use the Throughput GC, but I'm not 100% sure. Do
>>> you have any recommendation on this?
>>>       
>> As I said in Eteve's thread on JVM settings, some extra time spent on
>> application design/debugging will save a whole lot of headache in
>> Garbage Collection and trying to tune the gazillion different options
>> available.  Ask yourself:  What is on the heap and does it need to be
>> there?  For instance, do you, if you have them, really need sortable
>> ints?   If your servers seem to come to a stop, I'm going to bet you
>> have major collections going on.  Major collections in a production
>> system are very bad.  They tend to happen right after commits in
>> poorly tuned systems, but can also happen in other places if you let
>> things build up due to really large heaps and/or things like really
>> large cache settings.  I would pull up jConsole and have a look at
>> what is happening when the pauses occur.  Is it a major collection? 
>> If so, then hook up a heap analyzer or a profiler and see what is on
>> the heap around those times.  Then have a look at your schema/config,
>> etc. and see if there are things that are memory intensive (sorting,
>> faceting, excessively large filter caches).
>>
>> --------------------------
>> Grant Ingersoll
>> http://www.lucidimagination.com/
>>
>> Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids)
>> using Solr/Lucene:
>> http://www.lucidimagination.com/search
>>
>>     
>
>
>   


-- 
- Mark

http://www.lucidimagination.com


