Antonio, I have heard many people talking about setting the heap size in Elasticsearch, but I can't seem to figure out where to do this. I have tried multiple ways, none of which seem to change performance or throughput, so I assume I have implemented them incorrectly. If you could point me in the right direction, that would be great. I am using a Windows machine with 4GB RAM, if that makes any difference.

Erica
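For reference: on Elasticsearch 1.x the heap is controlled by the ES_HEAP_SIZE environment variable, which must be set before the JVM starts; putting it in elasticsearch.yml has no effect, which may be why the changes seemed to do nothing. A minimal sketch for a zip install on Windows (the install path is illustrative), following the common rule of thumb of giving ES no more than half of physical RAM:

    rem set the heap before launching; elasticsearch.bat reads ES_HEAP_SIZE
    set ES_HEAP_SIZE=2g
    C:\elasticsearch\bin\elasticsearch.bat

If Elasticsearch runs as a Windows service instead, the heap setting is typically captured when the service is installed, so set the variable and reinstall the service. Either way, you can check what the node actually received with:

    curl http://localhost:9200/_nodes/jvm?pretty

and look at heap_max_in_bytes under jvm.mem.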
On Thursday, June 19, 2014 9:43:36 PM UTC-4, Antonio Augusto Santos wrote:
>
> Thanks for your response, Mark.
>
> I think I've finally fine-tuned my scenario.
>
> For starters, it helped me a LOT to set Xms on Logstash to the same value
> as LS_HEAP_SIZE. It really reduced the GC.
>
> Second, I followed some tips from
> http://jablonskis.org/2013/elasticsearch-and-logstash-tuning/index.html and
> https://blog.codecentric.de/en/2014/05/elasticsearch-indexing-performance-cheatsheet/
> for increasing my indexing speed (search is secondary here).
>
> After that I increased the number of workers on LS (I had to change
> /etc/init.d/logstash, since it was not respecting LS_WORKERS in
> /etc/sysconfig/logstash). This made a big difference, and I could finally
> see that the workers were my bottleneck (with 3 workers, my 4 cores were
> hitting 100% usage all the time). So I increased my VM cores to 8, set
> LS_WORKERS to 6, and set workers to 3 on the elasticsearch output. The
> major boost came from these changes, and I could see that LS is heavily
> CPU-bound.
>
> Last, but not least, I changed my log strategy. Instead of saving the logs
> to disk with syslog and reading them back with LS, I set up a scenario
> like http://cookbook.logstash.net/recipes/central-syslog/ and got myself
> a redis server as temporary storage (for these logs I don't need files on
> disk; ES will do just fine).
>
> After that, my indexing speed went from about 500 events/s to about
> 4,000 events/s.
>
> Not bad ;)
>
> On Thursday, June 19, 2014 5:32:56 AM UTC-3, Mark Walkom wrote:
>>
>> Lots of GC isn't bad; you want to see a lot of small GCs rather than the
>> stop-the-world sort, which can bring your cluster down.
>>
>> You can try increasing the index refresh interval
>> (index.refresh_interval). If you don't require "live" access, increasing
>> it to 60 seconds or more will help.
>>
>> If you can gist/pastebin a bit more info on your cluster (node specs,
>> versions, total indexes and size, etc.) it may help.
>>
>> Regards,
>> Mark Walkom
>>
>> Infrastructure Engineer
>> Campaign Monitor
>> email: ma...@campaignmonitor.com
>> web: www.campaignmonitor.com
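To make Mark's suggestion concrete: on Elasticsearch 1.x, index.refresh_interval can be changed on a live index through the index settings API. A minimal sketch (the host and index name are illustrative):

    curl -XPUT 'http://localhost:9200/logstash-2014.06.19/_settings' -d '{
      "index": { "refresh_interval": "60s" }
    }'

Setting it to "-1" disables refresh entirely, which helps further during bulk loads, but newly indexed documents won't be visible to search until it is turned back on.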
>> On 18 June 2014 22:53, Antonio Augusto Santos <mkh...@gmail.com> wrote:
>>
>>> Hello,
>>>
>>> I think I'm hitting some kind of wall here. I'm running Logstash on a
>>> syslog server. It receives logs from about 150 machines, plus a LOT of
>>> iptables logs, and sends them to Elasticsearch. But I don't think I'm
>>> getting all the speed that I should: my Logstash throughput tops out
>>> at about 1,000 events/s, and the load on my ES servers (I have 2)
>>> looks really light.
>>>
>>> On Logstash I have three configs (syslog, ossec, and iptables), so I
>>> get three new nodes in my cluster. I've set the LS heap size to 2G,
>>> but according to bigdesk the ES module is getting only about 150MB,
>>> and it's generating a LOT of GC.
>>>
>>> Below is the bigdesk screenshot:
>>>
>>> [image: bigdesk]
>>> <https://cloud.githubusercontent.com/assets/6423413/3261580/5bc4faf2-f25b-11e3-8529-df0eee61b1e5.png>
>>>
>>> And here is the Logstash process I'm running:
>>>
>>> # ps -ef | grep logstash
>>> logstash 13371 1 99 14:42 pts/0 00:29:37 /usr/bin/java
>>> -Djava.io.tmpdir=/opt/logstash/tmp -Xmx2g -XX:+UseParNewGC
>>> -XX:+UseConcMarkSweepGC -Djava.awt.headless=true
>>> -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly
>>> -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintClassHistogram
>>> -XX:+PrintTenuringDistribution -XX:+PrintGCApplicationStoppedTime
>>> -Xloggc:./logstash-gc.log -jar
>>> /opt/logstash/vendor/jar/jruby-complete-1.7.11.jar -I/opt/logstash/lib
>>> /opt/logstash/lib/logstash/runner.rb agent -f /etc/logstash/conf.d -l
>>> /var/log/logstash/logstash.log
>>>
>>> Memory usage on my syslog/LS box seems very light as well (it's a
>>> 4-core VM), but the logstash process always tops out at about
>>> 150%-200% CPU:
>>>
>>> # free -m
>>>              total       used       free     shared    buffers     cached
>>> Mem:          7872       2076       5795          0         39       1502
>>> -/+ buffers/cache:        534       7337
>>> Swap:         1023          8       1015
>>> # uptime
>>>  15:02:04 up 23:52,  1 user,  load average: 1.39, 1.12, 0.96
>>>
>>> Any ideas what I can do to increase the indexing performance?
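For anyone trying to reproduce the changes Antonio describes upthread, here is a rough sketch of the settings involved, assuming Logstash 1.4 with the RPM-style init script (the paths, hostname, and the LS_JAVA_OPTS mechanism are illustrative; as Antonio notes, some init scripts need patching before LS_WORKERS is actually honored):

    # /etc/sysconfig/logstash -- illustrative values
    LS_HEAP_SIZE="2g"         # becomes -Xmx2g
    LS_JAVA_OPTS="-Xms2g"     # pin -Xms to the same value to reduce GC churn
    LS_WORKERS=6              # filter workers, passed to the agent as -w 6

    # elasticsearch output in /etc/logstash/conf.d/*.conf
    output {
      elasticsearch {
        host    => "es-node-1"   # hypothetical ES hostname
        workers => 3             # parallel output worker threads
      }
    }

The redis step follows the central-syslog recipe linked upthread: shippers push events into a redis list and a separate indexer pulls from it, so disk I/O on the syslog box drops out of the hot path.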