On 11/8/2016 12:49 PM, Susheel Kumar wrote:
> Ran into OOM Error again right after two weeks. Below is the GC log
> viewer graph. The first time we run into this was after 3 months and
> then second time in two weeks. After first incident reduced the cache
> size and increase heap from 8 to 10G. Interestingly query and
> ingestion load is like normal other days and heap utilisation remains
> stable and suddenly jumps to x2.

It looks like something happened at about 9:12:30 on that graph. Do you
know what that was? Starting at about that time, GC times went through
the roof and the allocated heap began a steady rise. At about 9:15, a
lot of garbage was freed up and GC times dropped way down again. At
about 9:18, the GC once again started taking a long time, and the used
heap was still going up steadily. At about 9:21, the full GCs started --
the wide black bars. I assume that the end of the graph is the OOM.

> We are looking to reproduce this in test environment by producing
> similar queries/ingestion but wondering if running into some memory
> leak or bug like "SOLR-8922 - DocSetCollector can allocate massive
> garbage on large indexes" which can cause this issue. Also we have
> frequent updates and wondering if not optimizing the index can result
> into this situation

It looks more like a problem with allocated memory that's NOT garbage
than a problem with garbage, but I can't really rule anything out, and
even what I've said below could be wrong.

Most of the allocated heap is in the old generation. If there's a bug
in Solr causing this problem, it would probably be a memory leak, but
SOLR-8922 doesn't talk about a leak. A memory leak is always possible,
but those have been rare in Solr. The most likely problem is that
something changed in your indexing or query patterns which required a
lot more memory than what happened before that point.

Thanks,
Shawn