> On Jan 25, 2017, at 5:19 PM, Shawn Heisey <apa...@elyograg.org> wrote:
>
> It seems that Lucene/Solr creates a lot of references as it runs, and
> collecting those in parallel offers a significant performance advantage.

This is critical for any tuning. Most of the query time allocations in Solr have the lifetime of a single request. Query parsing, result scoring, all that is garbage after the HTTP response is sent. So the GC must be configured with a large young generation (Eden, Nursery, whatever). If that generation cannot handle all the short-lived allocations under heavy load, they will be allocated from tenured space.

Right now, we run with an 8G heap and 2G of young generation space with CMS/ParNew. We see a major GC every 30-60 minutes, depending on load.

Cache evictions will always be garbage in tenured space, so we cannot avoid major GCs. The oldest non-accessed objects are evicted, and those will almost certainly be tenured.

All this means that Solr puts a heavy burden on the GC. A combination of many short-lived allocations plus a steady flow of tenured garbage.

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)
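
For anyone who wants to check the young/tenured split described above on their own JVM, here is a minimal sketch using the standard java.lang.management API. It prints the heap pools (eden, survivor, old gen; the exact names depend on the collector) and the per-collector counts, so you can see how often minor and major collections run. The class name GcSnapshot and its helper methods are made up for illustration. The flags in the comment are only one plausible way to express an 8G heap with a 2G young generation on a JVM that still ships CMS; the exact settings used above are not given in this message.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Solr. Assumed launch flags for a layout
// like the one described (an assumption, not the poster's actual settings):
//   java -Xms8g -Xmx8g -Xmn2g -XX:+UseConcMarkSweepGC GcSnapshot
public class GcSnapshot {

    /** Convert bytes to whole MiB; a negative (undefined) value is reported as -1. */
    static long toMiB(long bytes) {
        return bytes < 0 ? -1 : bytes / (1024L * 1024L);
    }

    /** One line per heap memory pool, with current use and configured maximum. */
    static List<String> heapPools() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.HEAP) {
                continue; // skip metaspace, code cache, and other non-heap pools
            }
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // the pool is no longer valid
            }
            lines.add(String.format("%s: used=%d MiB, max=%d MiB",
                    pool.getName(), toMiB(usage.getUsed()), toMiB(usage.getMax())));
        }
        return lines;
    }

    /** One line per collector, with cumulative collection count and time since JVM start. */
    static List<String> collectors() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            lines.add(String.format("%s: collections=%d, time=%d ms",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime()));
        }
        return lines;
    }

    public static void main(String[] args) {
        heapPools().forEach(System.out::println);
        collectors().forEach(System.out::println);
    }
}
```

Run inside the process you care about (or adapt it to a remote JMX connection), then compare the young and old collector counts over time. If the old-generation count climbs much faster than the cache eviction rate would explain, the young generation is probably too small for the request load.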