Antonio, I have heard many people talking about setting the heap size in Elasticsearch, but I can't seem to figure out where to do this. I have tried multiple ways, none of which seem to change performance or throughput, so I assume I have implemented them incorrectly. If you could point me in the right direction, that would be great. I am using a Windows machine with 4GB RAM, if that makes any difference.

Erica
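For reference: on Elasticsearch 1.x the heap is controlled by the ES_HEAP_SIZE environment variable, which must be set before the JVM starts; putting it in elasticsearch.yml has no effect, which may be why the changes seemed to do nothing. A minimal sketch for a zip install on Windows (the install path is illustrative), following the common rule of thumb of giving ES no more than half of physical RAM:

    rem set the heap before launching; elasticsearch.bat reads ES_HEAP_SIZE
    set ES_HEAP_SIZE=2g
    C:\elasticsearch\bin\elasticsearch.bat

If Elasticsearch runs as a Windows service instead, the heap setting is typically captured when the service is installed, so set the variable and reinstall the service. Either way, you can check what the node actually received with:

    curl http://localhost:9200/_nodes/jvm?pretty

and look at heap_max_in_bytes under jvm.mem.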
On Thursday, June 19, 2014 9:43:36 PM UTC-4, Antonio Augusto Santos wrote:
>
> Thanks for your response, Mark.
>
> I think I've finally fine-tuned my scenario.
>
> For starters, it helped me a LOT to set Xms on Logstash to the same value
> as LS_HEAP_SIZE. It really reduced the GC.
>
> Second, I followed some tips from
> http://jablonskis.org/2013/elasticsearch-and-logstash-tuning/index.html and
> https://blog.codecentric.de/en/2014/05/elasticsearch-indexing-performance-cheatsheet/
> for increasing my indexing speed (search is secondary here).
>
> After that I increased the number of workers on LS (I had to change
> /etc/init.d/logstash, since it was not respecting LS_WORKERS in
> /etc/sysconfig/logstash). This made a big difference, and I could finally
> see that the workers were my bottleneck (with 3 workers, my 4 cores were
> hitting 100% usage all the time). So I increased my VM cores to 8, set
> LS_WORKERS to 6, and set workers to 3 on the elasticsearch output. The
> major boost came from these changes, and I could see that LS is heavily
> CPU-bound.
>
> Last, but not least, I changed my log strategy. Instead of saving the logs
> to disk with syslog and reading them back with LS, I set up a scenario
> like http://cookbook.logstash.net/recipes/central-syslog/ and got myself
> a redis server as temporary storage (for these logs I don't need files on
> disk; ES will do just fine).
>
> After that, my indexing speed went from about 500 events/s to about
> 4,000 events/s.
>
> Not bad ;)
>
> On Thursday, June 19, 2014 5:32:56 AM UTC-3, Mark Walkom wrote:
>>
>> Lots of GC isn't bad; you want to see a lot of small GCs rather than the
>> stop-the-world sort, which can bring your cluster down.
>>
>> You can try increasing the index refresh interval
>> (index.refresh_interval). If you don't require "live" access, increasing
>> it to 60 seconds or more will help.
>>
>> If you can gist/pastebin a bit more info on your cluster (node specs,
>> versions, total indexes and size, etc.) it may help.
>>
>> Regards,
>> Mark Walkom
>>
>> Infrastructure Engineer
>> Campaign Monitor
>> email: ma...@campaignmonitor.com
>> web: www.campaignmonitor.com
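To make Mark's suggestion concrete: on Elasticsearch 1.x, index.refresh_interval can be changed on a live index through the index settings API. A minimal sketch (the host and index name are illustrative):

    curl -XPUT 'http://localhost:9200/logstash-2014.06.19/_settings' -d '{
      "index": { "refresh_interval": "60s" }
    }'

Setting it to "-1" disables refresh entirely, which helps further during bulk loads, but newly indexed documents won't be visible to search until it is turned back on.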
>> On 18 June 2014 22:53, Antonio Augusto Santos <mkh...@gmail.com> wrote:
>>
>>> Hello,
>>>
>>> I think I'm hitting some kind of wall here. I'm running Logstash on a
>>> syslog server. It receives logs from about 150 machines, plus a LOT of
>>> iptables logs, and sends them to Elasticsearch. But I don't think I'm
>>> getting all the speed that I should: my Logstash throughput tops out
>>> at about 1,000 events/s, and the load on my ES servers (I have 2)
>>> looks really light.
>>>
>>> On Logstash I have three configs (syslog, ossec, and iptables), so I
>>> get three new nodes in my cluster. I've set the LS heap size to 2G,
>>> but according to bigdesk the ES module is getting only about 150MB,
>>> and it's generating a LOT of GC.
>>>
>>> Below is the bigdesk screenshot:
>>>
>>> [image: bigdesk]
>>> <https://cloud.githubusercontent.com/assets/6423413/3261580/5bc4faf2-f25b-11e3-8529-df0eee61b1e5.png>
>>>
>>> And here is the Logstash process I'm running:
>>>
>>> # ps -ef | grep logstash
>>> logstash 13371 1 99 14:42 pts/0 00:29:37 /usr/bin/java
>>> -Djava.io.tmpdir=/opt/logstash/tmp -Xmx2g -XX:+UseParNewGC
>>> -XX:+UseConcMarkSweepGC -Djava.awt.headless=true
>>> -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly
>>> -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintClassHistogram
>>> -XX:+PrintTenuringDistribution -XX:+PrintGCApplicationStoppedTime
>>> -Xloggc:./logstash-gc.log -jar
>>> /opt/logstash/vendor/jar/jruby-complete-1.7.11.jar -I/opt/logstash/lib
>>> /opt/logstash/lib/logstash/runner.rb agent -f /etc/logstash/conf.d -l
>>> /var/log/logstash/logstash.log
>>>
>>> Memory usage on my syslog/LS box seems very light as well (it's a
>>> 4-core VM), but the logstash process always tops out at about
>>> 150%-200% CPU:
>>>
>>> # free -m
>>>              total       used       free     shared    buffers     cached
>>> Mem:          7872       2076       5795          0         39       1502
>>> -/+ buffers/cache:        534       7337
>>> Swap:         1023          8       1015
>>> # uptime
>>>  15:02:04 up 23:52,  1 user,  load average: 1.39, 1.12, 0.96
>>>
>>> Any ideas what I can do to increase the indexing performance?
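For anyone trying to reproduce the changes Antonio describes upthread, here is a rough sketch of the settings involved, assuming Logstash 1.4 with the RPM-style init script (the paths, hostname, and the LS_JAVA_OPTS mechanism are illustrative; as Antonio notes, some init scripts need patching before LS_WORKERS is actually honored):

    # /etc/sysconfig/logstash -- illustrative values
    LS_HEAP_SIZE="2g"         # becomes -Xmx2g
    LS_JAVA_OPTS="-Xms2g"     # pin -Xms to the same value to reduce GC churn
    LS_WORKERS=6              # filter workers, passed to the agent as -w 6

    # elasticsearch output in /etc/logstash/conf.d/*.conf
    output {
      elasticsearch {
        host    => "es-node-1"   # hypothetical ES hostname
        workers => 3             # parallel output worker threads
      }
    }

The redis step follows the central-syslog recipe linked upthread: shippers push events into a redis list and a separate indexer pulls from it, so disk I/O on the syslog box drops out of the hot path.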