> On Jan 25, 2017, at 5:19 PM, Shawn Heisey <apa...@elyograg.org> wrote:
> 
>  It seems that Lucene/Solr
> creates a lot of references as it runs, and collecting those in parallel
> offers a significant performance advantage.

This is critical for any tuning. Most of the query-time allocations in Solr 
have the lifetime of a single request. Query parsing, result scoring, all that 
is garbage after the HTTP response is sent. So the GC must be configured with a 
large young generation (Eden, Nursery, whatever). If that generation cannot 
hold all the short-lived allocations under heavy load, the survivors get 
promoted into tenured space early, where only a major GC can reclaim them.

Right now, we run with an 8G heap and 2G of young generation space with 
CMS/ParNew. We see a major GC every 30-60 minutes, depending on load.
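
For anyone who wants to try similar sizing, here is a rough sketch of how 
those numbers could be expressed in solr.in.sh. Only the 8G heap, the 2G young 
generation, and the CMS/ParNew choice come from the description above. The 
exact flag set and the use of GC_TUNE are assumptions, not a copy of our 
production file, and the CMS flags apply to Java 8 era JVMs.

```shell
# Sketch only: sizes are from this message, everything else is assumed.
# Fixed 8G heap so the JVM never resizes it under load.
SOLR_JAVA_MEM="-Xms8g -Xmx8g"

# 2G young generation, collected in parallel by ParNew, with CMS
# handling the tenured generation concurrently.
GC_TUNE="-Xmn2g \
  -XX:+UseConcMarkSweepGC \
  -XX:+UseParNewGC"
```

Treat the values as a starting point and check your own GC logs before 
settling on a young generation size.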

Cache evictions will always be garbage in tenured space, so we cannot avoid 
major GCs. The oldest non-accessed objects are evicted, and those will almost 
certainly be tenured.

All this means that Solr puts a heavy burden on the GC: a combination of many 
short-lived allocations plus a steady flow of tenured garbage.

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)
