Thanks, that sort of feedback is invaluable.

We send JSON documents representing API calls to logstash, which forwards 
them to elasticsearch. Users then use Kibana to run queries like "what are 
the most common values passed to this function?" or "how has the time taken 
by this function varied over time?". Users typically look at time ranges of 
one to seven days.
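
For illustration, a single event looks roughly like this (the field names 
here are made up, but this is the shape of the documents):

{
  "@timestamp" : "2014-12-17T15:03:00Z",
  "function" : "create_order",
  "arguments" : { "currency" : "GBP" },
  "duration_ms" : 12.4
}

The "most common values" question is then a terms facet on a field like 
arguments.currency, and the timing question a date histogram over 
duration_ms.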

I'm happy to provide more details if that doesn't answer your question.

On Wednesday, 17 December 2014 15:04:17 UTC, Mark Walkom wrote:
>
> Then you're quite possibly at the limits for your heap/nodes.
>
> You can try adding more nodes (recommended), increasing your heap to a max 
> of 31GB, or removing or closing old indexes. If you are using time based 
> indexes, you can also try disabling the bloom filter to get a little bit 
> of memory back from older indexes, but it won't be much.
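>
> Closing an index or disabling its bloom filter is a single curl call per 
> index, something like this (the index name is an example):
>
> # stop loading bloom filters for an old index (a dynamic setting)
> curl -XPUT 'http://localhost:9200/logstash-2014.11.01/_settings' -d '
> { "index.codec.bloom.load" : false }'
>
> # or close the index entirely
> curl -XPOST 'http://localhost:9200/logstash-2014.11.01/_close'
>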
> It should also be noted that every shard comes at a cost, so having 5 
> shards per index on two data nodes may be overkill.
>
> What sort of queries are you running?
>
> On 17 December 2014 at 15:03, Wilfred Hughes <yowi...@gmail.com> wrote:
>>
>> We're running three nodes (two data and one dataless) on ES 1.2.4, 
>> storing logstash data: 500 GiB of data in total, 49 indexes, 5 shards per 
>> index.
>>
>> On Wednesday, 17 December 2014 11:39:29 UTC, Mark Walkom wrote:
>>>
>>> How many nodes, how much data and in how many indexes? What ES version?
>>>
>>> On 17 December 2014 at 11:47, Wilfred Hughes <yowi...@gmail.com> wrote:
>>>>
>>>> Hi folks
>>>>
>>>> After a few hours/days of uptime, our elasticsearch cluster is spending 
>>>> all its time in GC. We're forced to restart nodes to bring response times 
>>>> back to what they should be. We're using G1GC with a 25 GiB heap on Java 8.
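>>>>
>>>> For reference, our JVM settings amount to roughly the following (where 
>>>> exactly they live depends on how elasticsearch is launched, and the log 
>>>> path is illustrative):
>>>>
>>>> export ES_HEAP_SIZE=25g
>>>> export ES_JAVA_OPTS="-XX:+UseG1GC -XX:+PrintGCDetails \
>>>>     -XX:+PrintGCApplicationStoppedTime -Xloggc:/var/log/elasticsearch/gc.log"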
>>>>
>>>> In the GC logs, we just see lots of stop-the-world collections:
>>>>
>>>> 426011.398: [Full GC (Allocation Failure)  23G->22G(25G), 9.8222680 secs]
>>>>    [Eden: 0.0B(1280.0M)->0.0B(1280.0M) Survivors: 0.0B->0.0B Heap: 23.2G(25.0G)->22.6G(25.0G)], [Metaspace: 42661K->42661K(1087488K)]
>>>>  [Times: user=16.97 sys=0.01, real=9.82 secs]
>>>> 426021.221: Total time for which application threads were stopped: 9.8237600 seconds
>>>> 426021.221: [GC concurrent-mark-abort]
>>>> 426022.226: Total time for which application threads were stopped: 0.0015720 seconds
>>>> 426026.342: [GC pause (G1 Evacuation Pause) (young)
>>>> Desired survivor size 83886080 bytes, new threshold 15 (max 15)
>>>>  (to-space exhausted), 0.2428630 secs]
>>>>    [Parallel Time: 177.6 ms, GC Workers: 13]
>>>>       [GC Worker Start (ms): Min: 426026344.4, Avg: 426026344.7, Max: 426026344.9, Diff: 0.5]
>>>>       [Ext Root Scanning (ms): Min: 0.7, Avg: 0.9, Max: 1.0, Diff: 0.3, Sum: 11.4]
>>>>       [Update RS (ms): Min: 0.0, Avg: 3.1, Max: 5.5, Diff: 5.5, Sum: 40.1]
>>>>          [Processed Buffers: Min: 0, Avg: 10.5, Max: 28, Diff: 28, Sum: 136]
>>>>       [Scan RS (ms): Min: 0.0, Avg: 0.0, Max: 0.1, Diff: 0.1, Sum: 0.5]
>>>>       [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1]
>>>>       [Object Copy (ms): Min: 170.5, Avg: 172.9, Max: 176.3, Diff: 5.7, Sum: 2248.3]
>>>>       [Termination (ms): Min: 0.0, Avg: 0.1, Max: 0.3, Diff: 0.3, Sum: 1.7]
>>>>       [GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.1, Diff: 0.1, Sum: 0.4]
>>>>       [GC Worker Total (ms): Min: 176.9, Avg: 177.1, Max: 177.4, Diff: 0.6, Sum: 2302.3]
>>>>       [GC Worker End (ms): Min: 426026521.8, Avg: 426026521.8, Max: 426026521.8, Diff: 0.0]
>>>>    [Code Root Fixup: 0.2 ms]
>>>>    [Code Root Migration: 0.0 ms]
>>>>    [Code Root Purge: 0.0 ms]
>>>>    [Clear CT: 0.2 ms]
>>>>    [Other: 64.8 ms]
>>>>       [Evacuation Failure: 60.9 ms]
>>>>       [Choose CSet: 0.0 ms]
>>>>       [Ref Proc: 0.3 ms]
>>>>       [Ref Enq: 0.0 ms]
>>>>       [Redirty Cards: 0.7 ms]
>>>>       [Free CSet: 0.3 ms]
>>>>    [Eden: 624.0M(1280.0M)->0.0B(1280.0M) Survivors: 0.0B->0.0B Heap: 23.2G(25.0G)->23.1G(25.0G)]
>>>>  [Times: user=0.81 sys=0.02, real=0.25 secs]
>>>>
>>>> I've tried lowering the fielddata circuit breaker limit on the cluster, 
>>>> but the heap usage does not change:
>>>>
>>>> $ curl http://my-host:9200/_cluster/settings?pretty
>>>> {
>>>>   "persistent" : { },
>>>>   "transient" : {
>>>>     "indices" : {
>>>>       "fielddata" : {
>>>>         "breaker" : {
>>>>           "limit" : "40%",
>>>>           "overhead" : "1.2"
>>>>         }
>>>>       }
>>>>     }
>>>>   }
>>>> }
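>>>>
>>>> For reference, transient settings like these are set via the cluster 
>>>> settings API, along these lines:
>>>>
>>>> curl -XPUT 'http://my-host:9200/_cluster/settings' -d '{
>>>>   "transient" : {
>>>>     "indices.fielddata.breaker.limit" : "40%"
>>>>   }
>>>> }'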
>>>>
>>>> I'm going to look at indices.fielddata.cache.size and 
>>>> indices.fielddata.cache.expire, but I can't set these dynamically (see 
>>>> the sketch after the stats output below). Querying the node stats, only 
>>>> around 12 GiB seems to be from field data:
>>>>
>>>> $ curl "http://my-host:9200/_nodes/stats?pretty"
>>>>       ...
>>>>       "indices" : {
>>>>         ...
>>>>         "fielddata" : {
>>>>           "memory_size_in_bytes" : 12984041509,
>>>>           "evictions" : 0,
>>>>           "fields" : { }
>>>>         },
>>>>       },
>>>>       ...
>>>>       "fielddata_breaker" : {
>>>>         "maximum_size_in_bytes" : 10737418240,
>>>>         "maximum_size" : "10gb",
>>>>         "estimated_size_in_bytes" : 12984041509,
>>>>         "estimated_size" : "12gb",
>>>>         "overhead" : 1.2,
>>>>         "tripped" : 0
>>>>
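>>>> If I do try those static settings, I expect elasticsearch.yml to look 
>>>> something like this (the values here are just for illustration), 
>>>> followed by a rolling restart:
>>>>
>>>> indices.fielddata.cache.size: 20%
>>>> indices.fielddata.cache.expire: 10m
>>>>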
>>>> Where should I look to see what elasticsearch is doing with all this 
>>>> heap data?
>>>>
