On 11/8/2016 12:49 PM, Susheel Kumar wrote:
> Ran into an OOM error again after only two weeks. Below is the GC log
> viewer graph. The first time we ran into this was after 3 months, and
> the second time after two weeks. After the first incident we reduced
> the cache size and increased the heap from 8 to 10G. Interestingly,
> query and ingestion load is like other normal days; heap utilisation
> remains stable and then suddenly jumps to x2.

It looks like something happened at about 9:12:30 on that graph.  Do you
know what that was?  Starting at about that time, GC times went through
the roof and the allocated heap began a steady rise.  At about 9:15, a
lot of garbage was freed up and GC times dropped way down again.  At
about 9:18, the GC once again started taking a long time, and the used
heap was still going up steadily. At about 9:21, the full GCs started --
the wide black bars.  I assume that the end of the graph is the OOM.

> We are looking to reproduce this in a test environment by producing
> similar queries/ingestion, but are wondering if we are running into a
> memory leak or a bug like "SOLR-8922 - DocSetCollector can allocate
> massive garbage on large indexes" which can cause this issue. Also, we
> have frequent updates and are wondering if not optimizing the index
> can result in this situation.

It looks more like a problem with allocated memory that's NOT garbage
than a problem with garbage, but I can't really rule anything out, and
even what I've said below could be wrong.
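
As a rough illustration of that distinction (a sketch only, not something
derived from your logs): heap that is still occupied right after a
collection is live data, while heap that a collection frees was garbage.
The function name, the sample numbers, and the 0.8 and 0.5 thresholds
below are all made up for the example.

```python
def classify_heap_trend(samples, heap_max_mb, live_fraction=0.8):
    """Classify hypothetical GC samples, oldest first.

    samples: list of (used_before_gc_mb, used_after_gc_mb) tuples.
    heap_max_mb: maximum heap size in MB.
    live_fraction: arbitrary threshold for "close to the maximum heap".
    """
    if not samples:
        raise ValueError("need at least one GC sample")

    # Heap occupancy measured right after each collection.
    after = [used_after for _, used_after in samples]

    # Memory that survives collections is reachable data, not garbage.
    # A post-GC floor that never drops and ends near the maximum heap
    # suggests the application really holds that much memory.
    floor_rising = (
        all(later >= earlier for earlier, later in zip(after, after[1:]))
        and after[-1] > after[0]
    )
    near_max = after[-1] >= live_fraction * heap_max_mb
    if floor_rising and near_max:
        return "live data growing"

    # If each collection frees a large share of the heap, the pressure
    # comes from short-lived garbage instead (0.5 is arbitrary too).
    reclaimed = [before - used_after for before, used_after in samples]
    if sum(reclaimed) / len(reclaimed) >= 0.5 * heap_max_mb:
        return "mostly garbage"

    return "inconclusive"


# Hypothetical numbers for a 10G heap, not values from the graph.
rising = [(9000, 5000), (9500, 7000), (9900, 8800), (10000, 9700)]
churning = [(9000, 2000), (9200, 2100), (9100, 1900)]
print(classify_heap_trend(rising, 10240))
print(classify_heap_trend(churning, 10240))
```

The first series is the pattern I think I'm seeing on your graph: each
collection frees less than the one before it, until almost nothing is
freed at all.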

Most of the allocated heap is in the old generation.  If there's a bug
in Solr causing this problem, it would probably be a memory leak, but
SOLR-8922 doesn't talk about a leak.  A memory leak is always possible,
but those have been rare in Solr.  The most likely problem is that
something changed in your indexing or query patterns and required a
lot more memory than before that point.

Thanks,
Shawn
