Grrr, I just wanted to ask whether you have the latest Java, after reading that ES demands the latest version.
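For anyone else who lands on this thread: a quick way to verify which JVM a node is actually running (a minimal sketch; the exact paths depend on how Java and Elasticsearch are installed on your hosts):

    # Version of the default java on the PATH (should show e.g. "1.7.0_71")
    java -version

    # Confirm which java binary a running Elasticsearch node was started with
    ps -ef | grep [e]lasticsearch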
Nice you found it.

On Tuesday, February 17, 2015 at 12:08:16 PM UTC+1, Christoph Fürstaller wrote:
>
> I found a solution for my problem with the GC. I was using an old Java version, 1.7.0_21; with the newer version 1.7.0_71 the GC warnings are gone and everything is running fine!
>
> On Wednesday, February 11, 2015 at 9:54:59 AM UTC+1, Christoph Fürstaller wrote:
>>
>> That's not good :/ because graylog2 is behaving strangely. It's been running for approx. 17 hours now and these warnings appear every few seconds:
>>
>> 2015-02-11 09:40:59,393 WARN : org.graylog2.periodical.GarbageCollectionWarningThread - Last GC run with PS Scavenge took longer than 1 second (last duration=3015 milliseconds)
>> 2015-02-11 09:41:03,190 WARN : org.graylog2.periodical.GarbageCollectionWarningThread - Last GC run with PS Scavenge took longer than 1 second (last duration=3024 milliseconds)
>> 2015-02-11 09:41:06,986 WARN : org.graylog2.periodical.GarbageCollectionWarningThread - Last GC run with PS Scavenge took longer than 1 second (last duration=3181 milliseconds)
>> 2015-02-11 09:41:11,303 WARN : org.graylog2.periodical.GarbageCollectionWarningThread - Last GC run with PS Scavenge took longer than 1 second (last duration=3048 milliseconds)
>> 2015-02-11 09:41:11,304 WARN : org.graylog2.periodical.GarbageCollectionWarningThread - Last GC run with PS MarkSweep took longer than 1 second (last duration=159306 milliseconds)
>> 2015-02-11 09:41:15,169 WARN : org.graylog2.periodical.GarbageCollectionWarningThread - Last GC run with PS Scavenge took longer than 1 second (last duration=2575 milliseconds)
>> 2015-02-11 09:41:19,652 WARN : org.graylog2.periodical.GarbageCollectionWarningThread - Last GC run with PS Scavenge took longer than 1 second (last duration=2838 milliseconds)
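If the JVM update had not fixed it, the next step would have been to see what the collector is actually doing. A minimal sketch of HotSpot GC logging options that could be added to the JVM running graylog2-server (where exactly they go depends on how your start script or service wrapper sets JVM options, and the log path below is just a placeholder):

    # Example HotSpot (Java 7) GC diagnostic options, appended to the java command line
    -verbose:gc
    -XX:+PrintGCDetails
    -XX:+PrintGCDateStamps
    -Xloggc:/var/log/graylog2-server/gc.log

The resulting gc.log records each pause with its cause and duration, which makes it easy to tell whether the long runs are young-generation (PS Scavenge) or full (PS MarkSweep) collections.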
>>
>> As seen in the HQ plugin for ES, the cluster is fine. The Search Query/Fetch could be faster ...
>>
>> Summary
>>   Node Name:                        host1.test.local | host3.test.local | host2.test.local | graylog.test.local
>>   IP Address:                       192.168.0.1:9300 | 192.168.0.3:9300 | 192.168.0.2:9300 | 192.168.0.3:9350
>>   Node ID:                          Fbiyz9krQq-KkhxxxI5NQQ | N-GgJy4aR1ecMxxxJPIE2Q | aWH0zgNSRJuoQZAxxxbUSQ | DJdD6KJtSM6uoxxxCGTIoQ
>>   ES Uptime:                        0.69 days | 0.69 days | 0.69 days | 0.69 days
>> File System
>>   Store Size:                       12.4GB | 12.4GB | 12.4GB | 0.0
>>   # Documents:                      52,278,572 | 52,278,572 | 52,278,572 | 0
>>   Documents Deleted:                0% | 0% | 0% | 0%
>>   Merge Size:                       13.1GB | 12.8GB | 13.1GB | 0.0
>>   Merge Time:                       00:18:35 | 00:17:08 | 00:18:15 | 00:00:00
>>   Merge Rate:                       12.6 MB/s | 13.4 MB/s | 12.9 MB/s | 0 MB/s
>>   File Descriptors:                 565 | 561 | 555 | 366
>> Index Activity
>>   Indexing - Index:                 0.71ms | 1.06ms | 0.71ms | 0ms
>>   Indexing - Delete:                0ms | 0ms | 0ms | 0ms
>>   Search - Query:                   1031.5ms | 1076.11ms | 965.14ms | 0ms
>>   Search - Fetch:                   47.5ms | 61ms | 101ms | 0ms
>>   Get - Total:                      0ms | 0ms | 0ms | 0ms
>>   Get - Exists:                     0ms | 0ms | 0ms | 0ms
>>   Get - Missing:                    0ms | 0ms | 0ms | 0ms
>>   Refresh:                          3.97ms | 3.53ms | 3.99ms | 0ms
>>   Flush:                            31.64ms | 54.56ms | 33.04ms | 0ms
>> Cache Activity
>>   Field Size:                       92.9MB | 93.4MB | 93.2MB | 0.0
>>   Field Evictions:                  0 | 0 | 0 | 0
>>   Filter Cache Size:                1.4KB | 1.4KB | 144.0B | 0.0
>>   Filter Evictions:                 0 per query | 0 per query | 0 per query | 0 per query
>>   ID Cache Size:
>>   % ID Cache:                       0% | 0% | 0% | 0%
>> Memory
>>   Total Memory:                     16 gb | 16 gb | 16 gb | 0 gb
>>   Heap Size:                        5.9 gb | 5.9 gb | 5.9 gb | 0.1 gb
>>   Heap % of RAM:                    38.1% | 38.1% | 38.1% | 0%
>>   % Heap Used:                      8% | 13.1% | 10.7% | 80.4%
>>   GC MarkSweep Frequency:           0 s | 0 s | 0 s | 0 s
>>   GC MarkSweep Duration:            0ms | 0ms | 0ms | 0ms
>>   GC ParNew Frequency:              0 s | 0 s | 0 s | 0 s
>>   GC ParNew Duration:               0ms | 0ms | 0ms | 0ms
>>   G1 GC Young Generation Freq:      0 s | 0 s | 0 s | 0 s
>>   G1 GC Young Generation Duration:  0ms | 0ms | 0ms | 0ms
>>   G1 GC Old Generation Freq:        0 s | 0 s | 0 s | 0 s
>>   G1 GC Old Generation Duration:    0ms | 0ms | 0ms | 0ms
>>   Swap Space:                       0.0000 mb | 0.0000 mb | 0.0000 mb | undefined mb
>> Network
>>   HTTP Connection Rate:             0 /second | 0 /second | 0 /second | 0 /second
>>
>> Any ideas where the problems with the GC come from?
>>
>> On Tuesday, February 10, 2015 at 3:57:25 PM UTC+1, Arie wrote:
>>>
>>> Not 100% sure; I read about it and it looks fine.
>>> We are running with a master node explicitly.
>>>
>>> I now see I was confused by your question, because it seems more Graylog related.
>>> Looking at your config I am not seeing anything strange.
>>>
>>> On Tuesday, February 10, 2015 at 3:17:49 PM UTC+1, Christoph Fürstaller wrote:
>>>>
>>>> Correct! But the other two could take over this role if the master goes down. Am I right? So my setup is fine. Or do I misunderstand something?
>>>>
>>>> On Tuesday, February 10, 2015 at 2:51:20 PM UTC+1, Arie wrote:
>>>>>
>>>>> When running, only one server can be master. This server regulates all the logic of your ES cluster, and it is the one that Graylog is talking to.
>>>>>
>>>>> On Tuesday, February 10, 2015 at 2:04:52 PM UTC+1, Christoph Fürstaller wrote:
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> Thanks for the configuration doc.
>>>>>>
>>>>>> Can I really run into split brain? I have 3 nodes, and they are all equal. Every one of them can be a master and will store data. With discovery.zen.minimum_master_nodes: 2 I can't get a split brain. Or am I wrong? Or is this setup not ideal?
>>>>>>
>>>>>> Chris...
>>>>>>
>>>>>> On Tuesday, February 10, 2015 at 1:38:06 PM UTC+1, Arie wrote:
>>>>>>>
>>>>>>> You could bump into a split-brain situation running all ES nodes as master.
>>>>>>>
>>>>>>> Check out this to configure your cluster:
>>>>>>> http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/_important_configuration_changes.html#_minimum_master_nodes
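For readers following along, this is roughly what the linked advice boils down to for a setup like the one described above. A minimal elasticsearch.yml sketch, assuming three master-eligible data nodes (the cluster name is a placeholder):

    # elasticsearch.yml on each of the three nodes
    cluster.name: graylog2
    node.master: true          # every node may be elected master
    node.data: true            # every node stores data

    # With 3 master-eligible nodes, a quorum of 2 prevents split brain:
    # a node that cannot see at least 2 master-eligible nodes (itself included)
    # will not elect a master on its own.
    discovery.zen.minimum_master_nodes: 2

With this in place the cluster survives the loss of any single node; if two nodes fail, the remaining one refuses to act as master rather than risk a split brain.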
>>>>>>>
>>>>>>> On Tuesday, February 10, 2015 at 12:09:33 AM UTC+1, Christoph Fürstaller wrote:
>>>>>>>>
>>>>>>>> Thanks for your answer!
>>>>>>>>
>>>>>>>> About the master/data nodes: what happens when the master goes down? Will one of the 'slaves' become a master? I configured all 3 as master for redundancy, so the cluster still survives if only one node is present. Is this assumption wrong?
>>>>>>>>
>>>>>>>> I had already increased ES_HEAP_SIZE to 6G before, with the same results.
>>>>>>>>
>>>>>>>> Chris...
>>>>>>>>
>>>>>>>> On Monday, February 9, 2015 at 8:30:28 PM UTC+1, Arie wrote:
>>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> Looking at your config in elasticsearch.yml, the following comes to mind.
>>>>>>>>>
>>>>>>>>> One node should be:
>>>>>>>>> node.master: true
>>>>>>>>> node.data: true
>>>>>>>>>
>>>>>>>>> and the other two nodes:
>>>>>>>>> node.master: false
>>>>>>>>> node.data: false
>>>>>>>>>
>>>>>>>>> In elasticsearch.conf: ES_HEAP_SIZE. You can easily take this up to 8G (50% of your memory) and check that it is really running that way. In my case, on CentOS 6, I put this in /etc/conf.d/elasticsearch.
>>>>>>>>>
>>>>>>>>> Good luck.
>>>>>>>>>
>>>>>>>>> On Friday, February 6, 2015 at 12:58:27 PM UTC+1, Christoph Fürstaller wrote:
>>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> Yesterday we updated our Graylog2/Elasticsearch cluster. The Elasticsearch cluster consists of 3 physical machines (DL380 G7, E5620, 16GB RAM) on RHEL 6.6. Each ES node gets 4GB RAM. On one host the graylog2 server/interface is also installed. Until yesterday we used Elasticsearch 0.90.10-1 and graylog2-0.20.3. Yesterday we updated graylog2 to 0.90.0, started everything, and everything was running fine. Then we stopped graylog2 and the Elasticsearch cluster, upgraded ES to 1.3.4 and graylog2 to 0.92.4. The ES upgrade was successful; after that we started graylog2, which connected to the cluster and showed everything.
>>>>>>>>>>
>>>>>>>>>> In the ES cluster there are 7 indices with about 20 million messages each. The last 3 indices are open, the others closed. Graylog2 sees approx. 50 million messages. New messages arrive at approx. 5 msg/sec.
>>>>>>>>>>
>>>>>>>>>> In the logs from graylog2-server there are messages like this every couple of minutes:
>>>>>>>>>> org.graylog2.periodical.GarbageCollectionWarningThread - Last GC run with PS Scavenge took longer than 1 second
>>>>>>>>>>
>>>>>>>>>> It seems Graylog is running fine, a bit slow on searches, but fine.
>>>>>>>>>>
>>>>>>>>>> Attached are the config files for graylog2 and elasticsearch.
>>>>>>>>>>
>>>>>>>>>> Can someone give us a hint where these warnings come from? What can we tweak? That would be very helpful!
>>>>>>>>>>
>>>>>>>>>> Thanks!
>>>>>>>>>> Chris...
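To make the heap advice concrete, a sketch of how ES_HEAP_SIZE is typically set on an RPM-based install (the file location is an assumption and varies by packaging; Arie used /etc/conf.d/elasticsearch on CentOS 6, RPM installs usually read /etc/sysconfig/elasticsearch):

    # /etc/sysconfig/elasticsearch (or wherever your init script sources its environment)
    # Roughly 50% of the 16GB machines, leaving the rest for the OS file system cache
    ES_HEAP_SIZE=8g

    # After restarting the node, verify the setting actually reached the JVM:
    ps -ef | grep [e]lasticsearch    # the command line should contain -Xms8g -Xmx8g

ES_HEAP_SIZE sets both -Xms and -Xmx to the same value, which avoids heap-resizing pauses, and staying at or below half of the physical RAM leaves memory for Lucene's file system cache.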
