Hello,

We ran into an OOM error again, right after two weeks. Below is the GC log
viewer graph.  The first time we ran into this was after 3 months, and the
second time after two weeks. After the first incident we reduced the cache
size and increased the heap from 8G to 10G.  Interestingly, the query and
ingestion load looks like any other day: heap utilisation remains stable
and then suddenly jumps to 2x.
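
To make "stable and then suddenly jumps" concrete for the reproduction
attempt, here is a minimal sketch of how such jumps could be flagged. It
assumes the heap-after-GC values (in MB) have already been extracted from
the GC log into a list; the function name, the window size, the 1.8x
threshold and the sample numbers are all made up for illustration and are
not taken from our actual logs.

```python
from statistics import median


def find_heap_jumps(samples_mb, window=5, factor=1.8):
    """Return the indices of samples that are at least `factor` times the
    median of the preceding `window` samples (hypothetical helper)."""
    jumps = []
    for i in range(window, len(samples_mb)):
        # Baseline is the median of the previous `window` samples, so a
        # single earlier spike does not shift it much.
        baseline = median(samples_mb[i - window:i])
        if baseline > 0 and samples_mb[i] >= factor * baseline:
            jumps.append(i)
    return jumps


# Made-up heap-after-GC samples in MB: stable, a doubling, then a
# slightly higher stable level.
samples = [4000, 4100, 4050, 4000, 4100, 8200, 8300, 4600, 4650, 4600]
print(find_heap_jumps(samples))
```

Using the median rather than the mean as the baseline is a deliberate
choice in this sketch, so that one spike does not hide the next one.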

We are trying to reproduce this in a test environment by generating similar
queries/ingestion, but are wondering whether we are running into a memory
leak or a bug such as "SOLR-8922 - DocSetCollector can allocate massive
garbage on large indexes", which could cause this issue.  We also have
frequent updates, and are wondering whether not optimizing the index could
lead to this situation.

Any thoughts?

GC Viewer
====
https://www.dropbox.com/s/bb29ub5q2naljdl/gc_log_snapshot.png?dl=0




On Wed, Oct 26, 2016 at 10:47 AM, Susheel Kumar <susheel2...@gmail.com>
wrote:

> Hi Toke,
>
> I think your guess is right.  We have ingestion running in batches.  We
> have 6 shards & 6 replicas on 12 VMs, with around 40+ million docs on
> each shard.
>
> Thanks everyone for the suggestions/pointers.
>
> Thanks,
> Susheel
>
> On Wed, Oct 26, 2016 at 1:52 AM, Toke Eskildsen <t...@statsbiblioteket.dk>
> wrote:
>
>> On Tue, 2016-10-25 at 15:04 -0400, Susheel Kumar wrote:
>> > Thanks, Toke.  Analyzing GC logs helped to determine that it was a
>> > sudden death.
>>
>> > The peaks in last 20 mins... See   http://tinypic.com/r/n2zonb/9
>>
>> Peaks yes, but there is a pattern of
>>
>> 1) Stable memory use
>> 2) Temporary doubling of the memory used and a lot of GC
>> 3) Increased (relative to last stable period) but stable memory use
>> 4) Goto 2
>>
>> If I were to guess, I would say that you are running ingests in batches,
>> which temporarily causes 2 searchers to be open at the same time. That
>> is 2 in the list above. After the batch ingest, the baseline moves up,
>> presumably because you have added quite a lot of documents, relative to
>> the overall number of documents.
>>
>>
>> The temporary doubling of the baseline is hard to avoid, but I am
>> surprised by the amount of heap that you need in the stable periods.
>> Just to be clear: This is from a Solr with 8GB of heap handling only 1
>> shard of 20GB and you are using DocValues? How many documents do you
>> have in such a shard?
>>
>> - Toke Eskildsen, State and University Library, Denmark
>>
>
>
