Thanks for your response, Mark.

I think I've finally fine-tuned my scenario...
For starters, it helped me A LOT to set Xms on Logstash to the same value 
as LS_HEAP_SIZE (which sets Xmx). It really reduced the GC.
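
Roughly what I mean is below (a sketch only; LS_JAVA_OPTS is an assumption 
here, check which variables your init script actually reads from 
/etc/sysconfig/logstash):

# /etc/sysconfig/logstash
LS_HEAP_SIZE="2g"         # the init script turns this into -Xmx2g
LS_JAVA_OPTS="-Xms2g"     # assumption: extra JVM flag so the heap starts at its max size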

Second, I followed some tips from 
http://jablonskis.org/2013/elasticsearch-and-logstash-tuning/index.html and 
https://blog.codecentric.de/en/2014/05/elasticsearch-indexing-performance-cheatsheet/ 
to increase my indexing speed (search comes second here). 

After that I increased the number of workers on LS (I had to change 
/etc/init.d/logstash, since it was not respecting LS_WORKERS in 
/etc/sysconfig/logstash). This made a big difference, and I could finally 
see that the workers were my bottleneck (with 3 workers my 4 cores were 
hitting 100% usage all the time). So I increased my VM to 8 cores, set 
LS_WORKERS to 6, and set workers to 3 on the elasticsearch output; see the 
snippet below. The major boost came from these changes, and I could see LS 
is heavily CPU-bound.
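
Roughly what that looks like (a sketch; the -w flag is the filter worker 
count the init script should pass, and the output options assume the 
standard elasticsearch output, host name is a placeholder):

# in /etc/init.d/logstash, make sure the agent gets the worker flag
/opt/logstash/bin/logstash agent -f /etc/logstash/conf.d -w 6 ...

# in the output config
output {
  elasticsearch {
    host    => "my-es-node"   # placeholder host
    workers => 3              # three output threads pushing to ES
  }
}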

Last, but not least, I changed my log strategy. Instead of saving the logs 
to disk with syslog and reading them back with LS, I set up a scenario like 
http://cookbook.logstash.net/recipes/central-syslog/ and got myself a Redis 
server as temporary storage (for these logs I don't need a copy on file, 
ES will do just fine). A sketch of the two configs is below.
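
A rough sketch of the two sides (host names, the port and the Redis key are 
placeholders; the options assume the stock syslog, redis and elasticsearch 
plugins):

# shipper (on the syslog box): receive syslog and push raw events to Redis
input  { syslog { port => 5514 type => "syslog" } }
output { redis { host => "redis-host" data_type => "list" key => "logstash" } }

# indexer: pull from Redis, filter, and ship to Elasticsearch
input  { redis { host => "redis-host" data_type => "list" key => "logstash" } }
output { elasticsearch { host => "my-es-node" workers => 3 } }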

With all that, I've bumped my indexing speed from about 500 events/s to 
about 4,000 events/s. 

Not bad ;)

On Thursday, June 19, 2014 5:32:56 AM UTC-3, Mark Walkom wrote:
>
> Lots of GC isn't bad; you want to see a lot of small GCs rather than the 
> stop-the-world sort, which can bring your cluster down.
>
> You can try increasing the index refresh interval 
> - index.refresh_interval. If you don't require "live" access, then 
> increasing it to 60 seconds or more will help.
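> For example (a sketch; the index name is a placeholder, and this assumes 
> the per-index settings update API):
>
>   curl -XPUT 'http://localhost:9200/logstash-2014.06.19/_settings' -d '
>   { "index" : { "refresh_interval" : "60s" } }'
>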
> If you can gist/pastebin a bit more info on your cluster, node specs, 
> versions, total indexes and size etc it may help.
>
> Regards,
> Mark Walkom
>
> Infrastructure Engineer
> Campaign Monitor
> email: ma...@campaignmonitor.com
> web: www.campaignmonitor.com
>  
>
> On 18 June 2014 22:53, Antonio Augusto Santos <mkh...@gmail.com> wrote:
>
>> Hello,
>>
>> I think I'm hitting some kind of wall here... I'm running logstash on a 
>> syslog server. It receives logs from about 150 machines, plus a LOT of 
>> iptables logs, and sends them to Elasticsearch. But I don't think I'm 
>> getting all the speed that I should. My Logstash throughput tops out at 
>> about 1,000 events/s, and it looks like my ES servers (I have 2) are 
>> barely loaded. 
>>
>> On logstash I have three configs (syslog, ossec and iptables), so I get 
>> three new nodes on my cluster. I've set LS_HEAP_SIZE to 2G, but according 
>> to bigdesk, the ES module is getting only about 150MB, and it's generating 
>> a LOT of GC.
>>
>> Below is the bigdesk screenshot:
>>
>> [image: bigdesk] 
>> <https://cloud.githubusercontent.com/assets/6423413/3261580/5bc4faf2-f25b-11e3-8529-df0eee61b1e5.png>
>>
>> And here the logstash process I'm running:
>>
>> # ps -ef | grep logstash
>> logstash 13371     1 99 14:42 pts/0    00:29:37 /usr/bin/java 
>> -Djava.io.tmpdir=/opt/logstash/tmp -Xmx2g -XX:+UseParNewGC 
>> -XX:+UseConcMarkSweepGC -Djava.awt.headless=true 
>> -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly 
>> -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintClassHistogram 
>> -XX:+PrintTenuringDistribution -XX:+PrintGCApplicationStoppedTime 
>> -Xloggc:./logstash-gc.log -jar 
>> /opt/logstash/vendor/jar/jruby-complete-1.7.11.jar -I/opt/logstash/lib 
>> /opt/logstash/lib/logstash/runner.rb agent -f /etc/logstash/conf.d -l 
>> /var/log/logstash/logstash.log
>>
>>
>>  
>>
>> My syslog/LS memory usage seems very light as well (it's a 4-core VM), but 
>> the logstash process CPU is always topping out at about 150%-200%:
>>
>>
>> # free -m
>>              total       used       free     shared    buffers     cached
>> Mem:          7872       2076       5795          0         39       1502
>> -/+ buffers/cache:        534       7337
>> Swap:         1023          8       1015
>> # uptime
>>  15:02:04 up 23:52,  1 user,  load average: 1.39, 1.12, 0.96
>>
>>  
>>
>> Any ideas what I can do to increase the indexing performance?
>>
>>
>
>

