3,000 events/sec Architecture

Eric Luellen Tue, 04 Mar 2014 07:12:50 -0800

Hello,

I've been working on a POC for Logstash/ElasticSearch/Kibana for about 2
months now. Everything has worked out well, and we are ready to
move it to production. Before building out the infrastructure, I want to
make sure my shard/node/index setup is correct, as that is the main part
that I'm still a bit fuzzy on. Overall, my setup is this:

Servers          --+                     +--> syslog-ng server --+
Networking Gear  --+                     |                       |
End Points       --+--> Load Balancer ---+--> syslog-ng server --+--> Logs stored in 5 flat
Security Devices --+                     |                       |    files on SAN storage
Etc.             --+                     +--> syslog-ng server --+

I have Logstash running on one of the syslog-ng servers. It reads the 5
files as input and sends the events to ElasticSearch. Within ElasticSearch,
I am creating 5 different indexes a day so I can do granular user access
control within Kibana:

unix-$date
windows-$date
networking-$date
security-$date
endpoint-$date
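
To put a number on what this naming scheme adds up to, here is a rough sketch that builds the daily index names and counts the shards the cluster would hold over the retention window. The prefixes come from the list above. The date format, the 5 primary shards per index (the ElasticSearch default as far as I know), the single replica, and the 30-day retention are assumptions, and the function names are made up for illustration.

```python
from datetime import date

# Index prefixes taken from the list above.
PREFIXES = ["unix", "windows", "networking", "security", "endpoint"]

def daily_indices(day, prefixes=PREFIXES):
    """Return one index name per log type for the given day."""
    # Assumed date format: YYYY.MM.DD (the post only says "$date").
    return [f"{p}-{day:%Y.%m.%d}" for p in prefixes]

def total_shards(days, prefixes=PREFIXES, primaries=5, replicas=1):
    """Total shard copies (primaries plus replicas) held by the cluster."""
    return days * len(prefixes) * primaries * (1 + replicas)

if __name__ == "__main__":
    print(daily_indices(date(2014, 3, 4)))
    shards = total_shards(days=30)   # 30 days * 5 indexes * 5 primaries * 2 copies
    print(shards, "shards in total,", shards / 3, "per node on 3 nodes")
```

Lowering `primaries` for the smaller log types is the obvious knob if the per-node shard count turns out to be too high.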

My plan is to have 3 ElasticSearch servers with ~10 GB of RAM each. For my
POC I have 2, and it's working fine for 2,000 events/second. My main
concern is how I set up the ElasticSearch servers so they are as efficient
as possible. With 5 different indexes a day and ~1 month of logs kept
within ES, are 3 servers enough? Should I have 1 master node and the other
2 be just basic setups that handle data and searching? Also, will 1 replica
be sufficient for this setup, or should I do 2 to be safe? In my POC, I've
had a few issues where I ran out of memory or something weird happened and
I lost data for a while, so I want to limit that as much as possible. We'll
also potentially have quite a few users querying the system, so I didn't
know if I should set up a dedicated search node for them.
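
For the replica question, a back-of-the-envelope sketch of retained volume may help frame the trade-off. It uses the 3,000 events/sec from the subject line and ~1 month of retention. The 500-byte average indexed event size is purely an assumed placeholder, not something I have measured, and the function names are hypothetical.

```python
SECONDS_PER_DAY = 86_400

def events_retained(events_per_sec, retention_days):
    """Number of events held in the cluster at steady state."""
    return events_per_sec * SECONDS_PER_DAY * retention_days

def storage_gb(events_per_sec, retention_days, avg_event_bytes, replicas):
    """Approximate on-disk size in GB, counting every replica copy."""
    copies = 1 + replicas
    total_bytes = (events_retained(events_per_sec, retention_days)
                   * avg_event_bytes * copies)
    return total_bytes / 1e9

def node_losses_tolerated(nodes, replicas):
    """Nodes that can fail at once while every shard keeps at least one copy."""
    return min(replicas, nodes - 1)

if __name__ == "__main__":
    # avg_event_bytes=500 is an assumption; measure real index sizes instead.
    for replicas in (1, 2):
        print(replicas, "replica(s):",
              storage_gb(3000, 30, 500, replicas), "GB,",
              "survives", node_losses_tolerated(3, replicas), "node loss(es)")
```

Under these assumptions, going from 1 replica to 2 adds half again as much disk in exchange for tolerating a second simultaneous node failure.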

Besides the ES cluster, I think everything else should be fine. I have had
a few concerns about Logstash keeping up with the number of entries coming
into syslog-ng, but I haven't seen much in the way of load balancing
Logstash or verifying whether it's able to keep up. I've spot-checked the
files quite a bit and everything seems to be correct, but if there is a
better way to do this, I'm all ears.
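
One way to check this more systematically than spot-checking might be to compare each flat file's size with the read offset that Logstash's file input records in its sincedb file. The sketch below assumes the sincedb lines have four whitespace-separated columns (inode, major device, minor device, byte offset); that layout should be verified against the installed Logstash version, and the function names are hypothetical.

```python
import os

def read_sincedb(path):
    """Map inode -> byte offset from a sincedb file.

    Assumed line layout: inode, major device, minor device, offset.
    """
    offsets = {}
    with open(path) as fh:
        for line in fh:
            fields = line.split()
            if len(fields) >= 4:
                offsets[int(fields[0])] = int(fields[3])
    return offsets

def backlog_bytes(log_path, offsets):
    """Bytes written to the log file that have not been read yet."""
    st = os.stat(log_path)
    # A file with no sincedb entry is treated as completely unread.
    return max(st.st_size - offsets.get(st.st_ino, 0), 0)
```

Calling `backlog_bytes` for each of the 5 files every minute or so would show whether the backlog stays near zero or keeps growing, which is the sign that Logstash is falling behind.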

I'm going to have my Kibana instance installed on the master ES node, which
shouldn't be a big deal. I've played with the idea of putting the ES
servers on the syslog-ng servers and having a separate NIC for the ES
traffic, but I didn't want to bog down the servers a whole lot.

Any thoughts or recommendations would be greatly appreciated.

Thanks,
Eric

