Re: 3,000 events/sec Architecture

2014-03-14 Thread Zachary Lammers
Eric, as an update, I hit OOM with a couple of nodes in my cluster today, with 
16GB of RAM for ES alone (each data node has 24GB). I was running fine, but 
then users kicked off regular searches to watch performance, and my indexing 
rate went from 35k/sec down to almost nothing (it ran at a reduced rate for a 
while). As more searches were performed, Marvel showed my JVM usage getting 
dangerously high on a few nodes (almost entirely my VM data nodes, which I 
thought odd).

I had roughly 27 billion total docs (log events) in the cluster, in daily 
indexes from 3/1 through 3/14 (today). I ended up trying to give the node with 
the worst JVM pressure a few more GB of RAM, and it kinda hosed things, so I 
wiped it and started over with a slightly modified config. I may wipe it again 
this weekend and try to get LS 1.4b and ES 1.0.1 going, to see if I can keep 
my indexing rates high enough.
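
In the meantime I've been spot-checking heap from the command line as a 
cross-check on Marvel; on ES 1.x, something like this shows per-node heap 
usage (host is a placeholder):

  curl -s 'http://localhost:9200/_nodes/stats/jvm?pretty' | grep -E '"name"|heap_used_percent'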

-z



Re: 3,000 events/sec Architecture

2014-03-12 Thread Otis Gospodnetic
Apache Flume has the necessary pieces.

Otis
--
Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/


On Wednesday, March 12, 2014 5:01:37 AM UTC-4, Jörg Prante wrote:

 It would also be possible to write a custom Java syslog protocol socket 
 receiver and index the log messages into ES, for example by reusing syslog4j, 
 similar to the UDP bulk indexing.
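
 A rough sketch of that idea (0.90-era Java transport client; the index name, 
 port, and batch size are just examples, and real PRI/facility parsing with 
 syslog4j is left out):

   import java.net.DatagramPacket;
   import java.net.DatagramSocket;

   import org.elasticsearch.action.bulk.BulkRequestBuilder;
   import org.elasticsearch.client.Client;
   import org.elasticsearch.client.transport.TransportClient;
   import org.elasticsearch.common.transport.InetSocketTransportAddress;
   import org.elasticsearch.common.xcontent.XContentFactory;

   // Minimal UDP syslog receiver that bulk-indexes raw lines into ES.
   public class SyslogUdpIndexer {
       public static void main(String[] args) throws Exception {
           Client client = new TransportClient()
               .addTransportAddress(new InetSocketTransportAddress("localhost", 9300));
           DatagramSocket socket = new DatagramSocket(5514); // 514 needs root
           byte[] buf = new byte[8192];
           BulkRequestBuilder bulk = client.prepareBulk();
           while (true) {
               DatagramPacket pkt = new DatagramPacket(buf, buf.length);
               socket.receive(pkt);
               String line = new String(pkt.getData(), 0, pkt.getLength(), "UTF-8");
               // A real receiver would parse PRI/facility/severity here (syslog4j).
               bulk.add(client.prepareIndex("syslog-test", "syslog").setSource(
                   XContentFactory.jsonBuilder().startObject()
                       .field("message", line).endObject()));
               if (bulk.numberOfActions() >= 1000) { // flush in batches
                   bulk.execute().actionGet();
                   bulk = client.prepareBulk();
               }
           }
       }
   }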

 Jörg




Re: 3,000 events/sec Architecture

2014-03-12 Thread Eric
Yes, currently logstash is reading files that syslog-ng created. We already 
had the syslog-ng architecture in place, so we just kept rolling with that.




Re: 3,000 events/sec Architecture

2014-03-11 Thread Otis Gospodnetic
Hi,

Is that Logstash instance reading files that are produced by the syslog-ng 
servers? Maybe not, but if yes, have you considered using Rsyslog with 
omelasticsearch instead, to simplify the architecture?
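
Roughly, with rsyslog v7+ that is a single action (server name and index 
pattern here are placeholders; omelasticsearch is a separate package):

  module(load="omelasticsearch")

  # one index per day, e.g. syslog-2014.03.11
  template(name="es-index" type="string" string="syslog-%$YEAR%.%$MONTH%.%$DAY%")

  *.* action(type="omelasticsearch"
             server="es-host" serverport="9200"
             searchIndex="es-index" dynSearchIndex="on"
             bulkmode="on")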

Otis
--
Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/




3,000 events/sec Architecture

2014-03-04 Thread Eric Luellen
Hello,

I've been working on a POC of Logstash/ElasticSearch/Kibana for about 2 months 
now. Everything has worked out pretty well, and we are ready to move it to 
production. Before building out the infrastructure, I want to make sure my 
shard/node/index setup is correct, as that is the main part I'm still a bit 
fuzzy on. Overall my setup is this:

Servers          \
Networking Gear   \                      syslog-ng server
End Points         >--  Load Balancer -- syslog-ng server --  Logs stored in 5 flat
Security Devices  /                      syslog-ng server     files on SAN storage
Etc.             /

I have logstash running on one of the syslog-ng servers; it is basically 
reading the 5 different files as inputs and sending them to ElasticSearch. 
Within ElasticSearch, I am creating 5 different indexes a day so I can do 
granular user access control within Kibana (config sketch below):

unix-$date
windows-$date
networking-$date
security-$date
endpoint-$date
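
The Logstash side of this is just five file inputs tagged by type and an 
elasticsearch output with a sprintf'd index name. A sketch (1.3-style config; 
paths are placeholders):

  input {
    file { path => "/san/logs/unix.log"       type => "unix" }
    file { path => "/san/logs/windows.log"    type => "windows" }
    file { path => "/san/logs/networking.log" type => "networking" }
    file { path => "/san/logs/security.log"   type => "security" }
    file { path => "/san/logs/endpoint.log"   type => "endpoint" }
  }
  output {
    elasticsearch {
      cluster => "logging"                  # joins the ES cluster directly
      index   => "%{type}-%{+YYYY.MM.dd}"   # one index per type per day
    }
  }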

My plan is to have 3 ElasticSearch servers with ~10GB of RAM each. For my POC 
I have 2, and it's working fine at 2,000 events/second. My main concern is 
setting up the ElasticSearch servers so they are as efficient as possible. 
With my 5 indexes a day, and a plan to keep ~1 month of logs within ES, are 3 
servers enough? Should I have 1 master node and have the other 2 be basic 
setups that do data and searching? Also, will 1 replica be sufficient for this 
setup, or should I do 2 to be safe? In my POC I've had a few issues where I 
ran out of memory or something weird happened and I lost data for a while, so 
I want to limit that as much as possible. We'll also have quite a few users 
potentially querying the system, so I didn't know if I should set up a 
dedicated search node.

Besides the ES cluster, I think everything else should be fine. I have had a 
few concerns about logstash keeping up with the volume of entries coming into 
syslog-ng, but I haven't seen much in the way of load balancing logstash or 
verifying whether it's able to keep up. I've spot-checked the files quite a 
bit and everything seems to be correct, but if there is a better way to do 
this, I'm all ears.

I'm going to have my Kibana instance installed on the master ES node, which 
shouldn't be a big deal. I've played with the idea of putting the ES servers 
on the syslog-ng servers and just having a separate NIC for the ES traffic, 
but I didn't want to bog down those servers a whole lot.

Any thoughts or recommendations would be greatly appreciated.

Thanks,
Eric



Re: 3,000 events/sec Architecture

2014-03-04 Thread Zachary Lammers
Based on my experience, I think you may run into OOM trying to keep a month of 
logs with ~10GB of RAM per server.

Say, for instance, 5 indexes a day for 30 days = 150 indexes.  How many shards 
per index, and how many replicas?

I ran some tests with 8GB assigned to each of my 20 ES data nodes, and after 
~7 days of a single index per day of all log data, my cluster would crash due 
to data nodes going OOM.  I know I can't perfectly compare, and I'm somewhat 
new to ES myself, but as soon as I removed the 'older' servers with smaller 
RAM from the cluster and gave ES 16GB on each data node, I haven't gone OOM 
since.  I was working with higher data rates, but I'm not sure the volume 
mattered as much as my shard count per index per node.

For reference, my current lab config is 36 data nodes running a single index 
per day (18 shards/1 replica), and I can index near 40,000 events per second 
at the beginning of the day, closer to 30,000 per second near the end of the 
day when the index is much larger.  I used to run 36 shards/1 replica, but I 
wanted the shards per index per node to be minimal, as I'd really like to keep 
60 days (except I'm running out of disk space on my old servers first!).  To 
pipe the data in, I'm running 45 separate logstash instances, each monitoring 
a single FIFO that I have scripts simply catting data into; a sketch is below. 
Each LS instance joins the ES cluster directly (no redis/etc.; I've had too 
many issues not going direct to ES).  I recently started over after holding 
steady at 25B log events over ~12 days (but ran out of disk, so I had to 
delete old indexes).  I tried updating to LS 1.4b2/ES 1.0.1, but it failed 
miserably; LS 1.4b2 was extremely, extremely slow at indexing, so I'm still on 
LS 1.3.3 and ES 0.90.9.
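
The FIFO plumbing is nothing exotic; roughly (paths are placeholders, and a 
pipe input is one way to have LS read a FIFO):

  # one FIFO per logstash instance
  mkfifo /data/ls0.fifo

  # feeder scripts just cat data into it
  cat /san/feeds/chunk-00.log > /data/ls0.fifo

  # matching logstash 1.3 input:
  #   input { pipe { command => "cat /data/ls0.fifo" } }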

As for the master question, I can't answer that.  I'm only running one right 
now for this lab cluster, which I know is not recommended, but I have zero 
idea how many I should truly have.  Like I said, I'm new to this :)

-Zachary


Re: 3,000 events/sec Architecture

2014-03-04 Thread Eric Luellen
Zach,

Thanks for the information. With my POC, I have two 10GB VMs and I'm keeping 7 
days of logs with no issues, but a month is a fairly large jump, and I can see 
where it may pose an issue.

As far as the 150 indexes, I'm not sure on the shards per index or replicas. 
That is the part I'm weakest on in the ES setup. I'm not exactly sure how I 
should set up the ES cluster as far as shards, replicas, master node, data 
node, search node, etc.

I fully agree on going logstash directly to ES. I have 1 logstash instance 
right now tailing 5 files and feeding directly into ES, and I've enjoyed not 
having another application to worry about.

Eric



Re: 3,000 events/sec Architecture

2014-03-04 Thread Zachary Lammers
My initial suggestion would be to set your templates to 3 shards, 1 replica. 
Each index is then 6 shards (3 primaries + 3 replicas), so with three data 
nodes you'd have 2 shards per index per node; at 5 indexes/day, that's 10 
shards per node per day.  Over 30 days that's 300 shards per node, or 900 
across the cluster.  I don't know of any hard cutoff per se, and 900 may be a 
bit much for ~10GB instances, but I've run 1500+ shards on 16GB instances.

I set my shards/replicas via a template that matches my automatic index 
naming, which starts with the year (hence the 20* pattern below), though you 
can do it via your YML config as well.

{
  "template" : "20*",
  "settings" : {
    "index.number_of_shards" : 18,
    "index.number_of_replicas" : 1,
    "index.auto_expand_replicas" : false
  },
  "mappings" : {
    "_default_" : {
      "_source" : { "compress" : false },
      "properties" : {
        "priority" : { "type" : "string", "index" : "not_analyzed" },
        "facility" : { "type" : "string", "index" : "not_analyzed" },
        ...
      }
    }
  }
}

...and so on.
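
Once the template is PUT, every new daily index that matches 20* picks those 
settings up automatically; something like this (template name and file are 
placeholders):

  curl -XPUT 'http://localhost:9200/_template/daily_logs' -d @template.json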


The default without any settings is 5 shards/1 replica per index, which 
wouldn't distribute evenly across 3 data nodes; it will balance out over 
multiple days, though.  That's not necessarily a bad thing, as more CPUs can 
search faster, but the more shards, the more RAM used, etc.

I currently have one dedicated master node and one dedicated search node.  In 
a prod environment I'd have a small group of virtual masters (3-5?), but 
probably only the one virtual search node (we do *far* more indexing than 
searching).  Depending on how much searching you do, you may not need a 
dedicated search node: you can just hit any node on 9200, or do a dedicated 
search/master combo, or... really, there are lots of ways.  This is where I'm 
weak, though; I'm not sure how to estimate needs, as I don't have my 
environment mapped out!  A sketch of the role settings is below.
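
For reference, the roles are just a couple of flags in elasticsearch.yml 
(0.90/1.x style; a sketch, not my exact config):

  # dedicated master: master-eligible, holds no data
  node.master: true
  node.data: false

  # dedicated search/client node: not master-eligible, holds no data
  #   node.master: false
  #   node.data: false

  # with 3 master-eligible nodes, require a majority to avoid split-brain
  discovery.zen.minimum_master_nodes: 2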

Are some of your indexes much larger than others per day?  If so, note that I 
believe nodes are balanced by shard count, not by shard disk usage -- so a 
much smaller shard counts the same for ES 'capacity planning' as a larger one. 
Unless this changed recently in 1.0.x?

-Zachary
