Re: A strange behavior we've encountered on our ELK

2015-02-14 Thread Yuval Khalifa
Hi,

I just wanted to let you all know that I think I solved it... I found
out that one of the programs we built that sends logs to the ELK was
opening a new TCP connection for each event, which exhausted the TCP
buffers on the server (much like a DoS attack). When I modified that
program to re-use the same connection, things returned to normal.
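
For anyone who runs into the same thing, here is a minimal sketch of the difference (Python; the host name, the port -- 5558, the elkconnector tcp input from the attached config -- and the event format are only illustrative):

import json
import socket
import time

HOST, PORT = "elk.example.local", 5558  # placeholders -- point at your Logstash tcp input


def send_per_event_connection(events):
    # The problematic pattern: a brand new TCP connection per event.
    # Under load this churns through server-side connection/buffer resources.
    for event in events:
        with socket.create_connection((HOST, PORT)) as sock:
            sock.sendall((json.dumps(event) + "\n").encode("utf-8"))


def send_reusing_connection(events):
    # The fix: open one connection and stream newline-delimited events over it.
    with socket.create_connection((HOST, PORT)) as sock:
        for event in events:
            sock.sendall((json.dumps(event) + "\n").encode("utf-8"))


if __name__ == "__main__":
    sample = [{"msg": "test event", "seq": i, "ts": time.time()} for i in range(1000)]
    send_reusing_connection(sample)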

Thanks for all your help,
Yuval.

On Thursday, February 12, 2015, Yuval Khalifa  wrote:

> Well SSD would also fix all the pains for my bank too... (-;
>
> Are you sure it's caused by disk latency and not some sort of mis-tuned
> TCP driver? I've read some blogs that recommended increasing some of the
> buffers in sysctl.conf. Do you think so too?
>
> On Thursday, February 12, 2015, Itamar Syn-Hershko  > wrote:
>
>> Yes, make sure the disk is local and low-latency, not a shared one (e.g. a
>> SAN). Also, SSD will probably fix all your pains.
>>
>> --
>>
>> Itamar Syn-Hershko
>> http://code972.com | @synhershko <https://twitter.com/synhershko>
>> Freelance Developer & Consultant
>> Lucene.NET committer and PMC member
>>
>> On Thu, Feb 12, 2015 at 3:28 PM, Yuval Khalifa  wrote:
>>
>>> Sort of... The ELK is running as a VM on a dedicated ESXi. Are there
>>> special configurations I should do in such a case?
>>>
>>> Thanks,
>>> Yuval.
>>>
>>> On Thursday, February 12, 2015, Itamar Syn-Hershko 
>>> wrote:
>>>
>>>> Yes - can you try using the bulk API? Also, are you running on a cloud
>>>> server?
>>>>
>>>> --
>>>>
>>>> Itamar Syn-Hershko
>>>> http://code972.com | @synhershko <https://twitter.com/synhershko>
>>>> Freelance Developer & Consultant
>>>> Lucene.NET committer and PMC member
>>>>
>>>> On Thu, Feb 12, 2015 at 11:28 AM, Yuval Khalifa 
>>>> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I wrote that program and ran it, and it did manage to keep a steady
>>>>> rate of about 1,000 events per minute even when Kibana's total events
>>>>> per minute dropped from 60,000 to 6,000. However, when Kibana's total
>>>>> events per minute dropped to zero, my program got a "connection refused"
>>>>> exception. I ran netstat -s and found that every time Kibana's line hit
>>>>> zero, the number of RX-DRP increased. At that point I realized I had
>>>>> forgotten to mention that this server has a 10GbE NIC. Is it possible
>>>>> that packets are being dropped because some buffer is filling up? If so,
>>>>> how can I test and verify that this is actually the case? If it is, how
>>>>> can I solve it?
>>>>>
>>>>> Thanks,
>>>>> Yuval.
>>>>> On Wednesday, February 11, 2015, Yuval Khalifa 
>>>>> wrote:
>>>>>
>>>>>> Hi.
>>>>>>
>>>>>> When you say "see how the file behaves" I'm not quite sure what you
>>>>>> mean by that... As I mentioned earlier, it's not that events do not 
>>>>>> appear
>>>>>> at all but instead, the RATE at which they come decreases, so how can I
>>>>>> measure the events rate in a file? I thought that there's another way 
>>>>>> that
>>>>>> I can test this: I'll write a quick-and-dirty program that will send an
>>>>>> event to the ELK via TCP every 12ms which should result in events rate of
>>>>>> about 5,000 events per minute and I'll let you know if the events rate
>>>>>> continues to drop or not...
>>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>> Yuval.
>>>>>>
>>>>>> On Tuesday, February 10, 2015, Itamar Syn-Hershko 
>>>>>> wrote:
>>>>>>
>>>>>>> I'd start by using logstash with input tcp and output fs and see how
>>>>>>> the file behaves. Same for the fs inputs - see how their files behave. 
>>>>>>> And
>>>>>>> take it from there.
>>>>>>>
>>>>>>> --
>>>>>>>
>>>>>>> Itamar Syn-Hershko
>>>>>>> http://code972.com | @synhershko <https://twitter.com/synhershko>
>>>>>>> Freelance Developer & Consultant

Re: A strange behavior we've encountered on our ELK

2015-02-12 Thread Yuval Khalifa
Well SSD would also fix all the pains for my bank too... (-;

Are you sure it's caused by disk latency and not some sort of mis-tuned TCP
driver? I've read some blogs that recommended increasing some of the
buffers in sysctl.conf. Do you think so too?
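
For reference, the settings those blogs usually point at look roughly like the sketch below (an /etc/sysctl.conf fragment; the values are illustrative starting points for a busy 10GbE receiver, not a recommendation, and should be measured before and after):

# Network buffer / backlog tuning commonly suggested for busy receivers
# (example values only -- apply with "sysctl -p" and measure the effect)
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.core.netdev_max_backlog = 30000
net.core.somaxconn = 4096
net.ipv4.tcp_max_syn_backlog = 8192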

On Thursday, February 12, 2015, Itamar Syn-Hershko 
wrote:

> Yes, make sure the disk is local and low-latency, not a shared one (e.g. a
> SAN). Also, SSD will probably fix all your pains.
>
> --
>
> Itamar Syn-Hershko
> http://code972.com | @synhershko <https://twitter.com/synhershko>
> Freelance Developer & Consultant
> Lucene.NET committer and PMC member
>
> On Thu, Feb 12, 2015 at 3:28 PM, Yuval Khalifa  > wrote:
>
>> Sort of... The ELK is running as a VM on a dedicated ESXi. Are there
>> special configurations I should do in such a case?
>>
>> Thanks,
>> Yuval.
>>
>> On Thursday, February 12, 2015, Itamar Syn-Hershko > > wrote:
>>
>>> Yes - can you try using the bulk API? Also, are you running on a cloud
>>> server?
>>>
>>> --
>>>
>>> Itamar Syn-Hershko
>>> http://code972.com | @synhershko <https://twitter.com/synhershko>
>>> Freelance Developer & Consultant
>>> Lucene.NET committer and PMC member
>>>
>>> On Thu, Feb 12, 2015 at 11:28 AM, Yuval Khalifa 
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>> I wrote that program and ran it, and it did manage to keep a steady
>>>> rate of about 1,000 events per minute even when Kibana's total events
>>>> per minute dropped from 60,000 to 6,000. However, when Kibana's total
>>>> events per minute dropped to zero, my program got a "connection refused"
>>>> exception. I ran netstat -s and found that every time Kibana's line hit
>>>> zero, the number of RX-DRP increased. At that point I realized I had
>>>> forgotten to mention that this server has a 10GbE NIC. Is it possible
>>>> that packets are being dropped because some buffer is filling up? If so,
>>>> how can I test and verify that this is actually the case? If it is, how
>>>> can I solve it?
>>>>
>>>> Thanks,
>>>> Yuval.
>>>> On Wednesday, February 11, 2015, Yuval Khalifa 
>>>> wrote:
>>>>
>>>>> Hi.
>>>>>
>>>>> When you say "see how the file behaves" I'm not quite sure what you
>>>>> mean by that... As I mentioned earlier, it's not that events do not appear
>>>>> at all but instead, the RATE at which they come decreases, so how can I
>>>>> measure the events rate in a file? I thought that there's another way that
>>>>> I can test this: I'll write a quick-and-dirty program that will send an
>>>>> event to the ELK via TCP every 12ms which should result in events rate of
>>>>> about 5,000 events per minute and I'll let you know if the events rate
>>>>> continues to drop or not...
>>>>>
>>>>>
>>>>> Thanks,
>>>>> Yuval.
>>>>>
>>>>> On Tuesday, February 10, 2015, Itamar Syn-Hershko 
>>>>> wrote:
>>>>>
>>>>>> I'd start by using logstash with input tcp and output fs and see how
>>>>>> the file behaves. Same for the fs inputs - see how their files behave. 
>>>>>> And
>>>>>> take it from there.
>>>>>>
>>>>>> --
>>>>>>
>>>>>> Itamar Syn-Hershko
>>>>>> http://code972.com | @synhershko <https://twitter.com/synhershko>
>>>>>> Freelance Developer & Consultant
>>>>>> Lucene.NET committer and PMC member
>>>>>>
>>>>>> On Tue, Feb 10, 2015 at 7:47 PM, Yuval Khalifa 
>>>>>> wrote:
>>>>>>
>>>>>>> Great! How can I check that?
>>>>>>>
>>>>>>>
>>>>>>> On Tuesday, February 10, 2015, Itamar Syn-Hershko <
>>>>>>> ita...@code972.com> wrote:
>>>>>>>
>>>>>>>> The graphic you sent suggests the issue is with logstash - since
>>>>>>>> the @timestamp field is being populated by logstash and is the one 
>>>>>>>> that is
>>>>>>>> used to display the date histogram graphics in Kibana. I would start 
>>>

Re: A strange behavior we've encountered on our ELK

2015-02-12 Thread Yuval Khalifa
Sort of... The ELK is running as a VM on a dedicated ESXi host. Is there any
special configuration I should apply in such a case?

Thanks,
Yuval.

On Thursday, February 12, 2015, Itamar Syn-Hershko 
wrote:

> Yes - can you try using the bulk API? Also, are you running on a cloud
> server?
>
> --
>
> Itamar Syn-Hershko
> http://code972.com | @synhershko <https://twitter.com/synhershko>
> Freelance Developer & Consultant
> Lucene.NET committer and PMC member
>
> On Thu, Feb 12, 2015 at 11:28 AM, Yuval Khalifa  > wrote:
>
>> Hi,
>>
>> I wrote that program and ran it, and it did manage to keep a steady rate
>> of about 1,000 events per minute even when Kibana's total events per
>> minute dropped from 60,000 to 6,000. However, when Kibana's total events
>> per minute dropped to zero, my program got a "connection refused"
>> exception. I ran netstat -s and found that every time Kibana's line hit
>> zero, the number of RX-DRP increased. At that point I realized I had
>> forgotten to mention that this server has a 10GbE NIC. Is it possible that
>> packets are being dropped because some buffer is filling up? If so, how
>> can I test and verify that this is actually the case? If it is, how can I
>> solve it?
>>
>> Thanks,
>> Yuval.
>> On Wednesday, February 11, 2015, Yuval Khalifa > > wrote:
>>
>>> Hi.
>>>
>>> When you say "see how the file behaves" I'm not quite sure what you
>>> mean by that... As I mentioned earlier, it's not that events do not appear
>>> at all but instead, the RATE at which they come decreases, so how can I
>>> measure the events rate in a file? I thought that there's another way that
>>> I can test this: I'll write a quick-and-dirty program that will send an
>>> event to the ELK via TCP every 12ms which should result in events rate of
>>> about 5,000 events per minute and I'll let you know if the events rate
>>> continues to drop or not...
>>>
>>>
>>> Thanks,
>>> Yuval.
>>>
>>> On Tuesday, February 10, 2015, Itamar Syn-Hershko 
>>> wrote:
>>>
>>>> I'd start by using logstash with input tcp and output fs and see how
>>>> the file behaves. Same for the fs inputs - see how their files behave. And
>>>> take it from there.
>>>>
>>>> --
>>>>
>>>> Itamar Syn-Hershko
>>>> http://code972.com | @synhershko <https://twitter.com/synhershko>
>>>> Freelance Developer & Consultant
>>>> Lucene.NET committer and PMC member
>>>>
>>>> On Tue, Feb 10, 2015 at 7:47 PM, Yuval Khalifa 
>>>> wrote:
>>>>
>>>>> Great! How can I check that?
>>>>>
>>>>>
>>>>> On Tuesday, February 10, 2015, Itamar Syn-Hershko 
>>>>> wrote:
>>>>>
>>>>>> The graphic you sent suggests the issue is with logstash - since the
>>>>>> @timestamp field is being populated by logstash and is the one that is 
>>>>>> used
>>>>>> to display the date histogram graphics in Kibana. I would start there. 
>>>>>> I.e.
>>>>>> maybe SecurityOnion buffers writes etc, and then to check the logstash
>>>>>> shipper process stats.
>>>>>>
>>>>>> --
>>>>>>
>>>>>> Itamar Syn-Hershko
>>>>>> http://code972.com | @synhershko <https://twitter.com/synhershko>
>>>>>> Freelance Developer & Consultant
>>>>>> Lucene.NET committer and PMC member
>>>>>>
>>>>>> On Tue, Feb 10, 2015 at 7:07 PM, Yuval Khalifa 
>>>>>> wrote:
>>>>>>
>>>>>>> Hi.
>>>>>>>
>>>>>>> Absolutely (but since I also worked at the helpdesk dept. in the past,
>>>>>>> I certainly understand why it is important to ask those "Are you sure
>>>>>>> it's plugged in?" questions...). One of the logs is coming from
>>>>>>> SecurityOnion, which logs (via bro-conn) all the connections, so it
>>>>>>> must be sending data 24x7x365.
>>>>>>>
>>>>>>> Thanks for the quick reply,
>>>>>>> Yuval.
>>>>>>>
>>>>>>> O

Re: A strange behavior we've encountered on our ELK

2015-02-12 Thread Yuval Khalifa
Hi,

I wrote that program and ran it, and it did manage to keep a steady rate of
about 1,000 events per minute even when Kibana's total events per minute
dropped from 60,000 to 6,000. However, when Kibana's total events per minute
dropped to zero, my program got a "connection refused" exception. I ran
netstat -s and found that every time Kibana's line hit zero, the number of
RX-DRP increased. At that point I realized I had forgotten to mention that
this server has a 10GbE NIC. Is it possible that packets are being dropped
because some buffer is filling up? If so, how can I test and verify that this
is actually the case? If it is, how can I solve it?
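
For what it's worth, a minimal sketch of how one could line up those drops with the dips in Kibana (Python, Linux only; the interface name eth0 and the 10-second interval are placeholders):

import time
from pathlib import Path

IFACE = "eth0"  # placeholder -- use the actual 10GbE interface name
STATS = Path("/sys/class/net") / IFACE / "statistics"


def read_counter(name):
    # Interface counters exposed by the kernel, e.g. rx_dropped, rx_errors.
    return int((STATS / name).read_text())


def watch(interval=10.0):
    # Print the per-interval increase of rx_dropped so it can be correlated
    # with the times the Kibana histogram drops to zero.
    last = read_counter("rx_dropped")
    while True:
        time.sleep(interval)
        now = read_counter("rx_dropped")
        print(time.strftime("%H:%M:%S"), "rx_dropped +%d" % (now - last))
        last = now


if __name__ == "__main__":
    watch()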

Thanks,
Yuval.
On Wednesday, February 11, 2015, Yuval Khalifa  wrote:

> Hi.
>
> When you say "see how the file behaves" I'm not quite sure what you mean
> by that... As I mentioned earlier, it's not that events do not appear at
> all but instead, the RATE at which they come decreases, so how can I
> measure the events rate in a file? I thought that there's another way that
> I can test this: I'll write a quick-and-dirty program that will send an
> event to the ELK via TCP every 12ms which should result in events rate of
> about 5,000 events per minute and I'll let you know if the events rate
> continues to drop or not...
>
>
> Thanks,
> Yuval.
>
> On Tuesday, February 10, 2015, Itamar Syn-Hershko  > wrote:
>
>> I'd start by using logstash with input tcp and output fs and see how the
>> file behaves. Same for the fs inputs - see how their files behave. And take
>> it from there.
>>
>> --
>>
>> Itamar Syn-Hershko
>> http://code972.com | @synhershko <https://twitter.com/synhershko>
>> Freelance Developer & Consultant
>> Lucene.NET committer and PMC member
>>
>> On Tue, Feb 10, 2015 at 7:47 PM, Yuval Khalifa  wrote:
>>
>>> Great! How can I check that?
>>>
>>>
>>> On Tuesday, February 10, 2015, Itamar Syn-Hershko 
>>> wrote:
>>>
>>>> The graphic you sent suggests the issue is with logstash - since the
>>>> @timestamp field is being populated by logstash and is the one that is used
>>>> to display the date histogram graphics in Kibana. I would start there. I.e.
>>>> maybe SecurityOnion buffers writes etc, and then to check the logstash
>>>> shipper process stats.
>>>>
>>>> --
>>>>
>>>> Itamar Syn-Hershko
>>>> http://code972.com | @synhershko <https://twitter.com/synhershko>
>>>> Freelance Developer & Consultant
>>>> Lucene.NET committer and PMC member
>>>>
>>>> On Tue, Feb 10, 2015 at 7:07 PM, Yuval Khalifa 
>>>> wrote:
>>>>
>>>>> Hi.
>>>>>
>>>>> Absolutely (but since I also worked at the helpdesk dept. in the past,
>>>>> I certainly understand why it is important to ask those "Are you sure
>>>>> it's plugged in?" questions...). One of the logs is coming from
>>>>> SecurityOnion, which logs (via bro-conn) all the connections, so it must
>>>>> be sending data 24x7x365.
>>>>>
>>>>> Thanks for the quick reply,
>>>>> Yuval.
>>>>>
>>>>> On Tuesday, February 10, 2015, Itamar Syn-Hershko 
>>>>> wrote:
>>>>>
>>>>>> Are you sure your logs are generated linearly without bursts?
>>>>>>
>>>>>> --
>>>>>>
>>>>>> Itamar Syn-Hershko
>>>>>> http://code972.com | @synhershko <https://twitter.com/synhershko>
>>>>>> Freelance Developer & Consultant
>>>>>> Lucene.NET committer and PMC member
>>>>>>
>>>>>> On Tue, Feb 10, 2015 at 6:29 PM, Yuval Khalifa 
>>>>>> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> We just installed an ELK server and configured logstash to match the
>>>>>>> data that we send to it. Until last month it seemed to be working
>>>>>>> fine, but since then we have seen very strange behavior in Kibana: the
>>>>>>> events-over-time histogram shows the event rate at the normal level
>>>>>>> for about half an hour, then drops to about 20% of the normal rate

Re: A strange behavior we've encountered on our ELK

2015-02-10 Thread Yuval Khalifa
Hi.

When you say "see how the file behaves" I'm not quite sure what you mean by
that... As I mentioned earlier, it's not that events do not appear at all;
rather, the RATE at which they come in decreases, so how can I measure the
event rate in a file? I thought of another way I can test this: I'll write a
quick-and-dirty program that sends an event to the ELK via TCP every 12ms,
which should result in an event rate of about 5,000 events per minute, and
I'll let you know whether the event rate continues to drop or not...
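
A minimal sketch of that sender, in case it is useful to anyone (Python; the host and port are placeholders -- e.g. the syslog_test tcp input on 5557 from our config -- and it keeps a single long-lived connection):

import json
import socket
import time


def run(host="elk.example.local", port=5557, interval=0.012):
    # One small newline-delimited JSON event every 12 ms over a single
    # TCP connection, i.e. roughly 5,000 events per minute.
    with socket.create_connection((host, port)) as sock:
        seq = 0
        while True:
            event = {"type": "rate_test", "seq": seq, "ts": time.time()}
            sock.sendall((json.dumps(event) + "\n").encode("utf-8"))
            seq += 1
            time.sleep(interval)


if __name__ == "__main__":
    run()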


Thanks,
Yuval.
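
For completeness, the file-based check as I understood it would be something like the fragment below (a logstash output added next to the elasticsearch one; the path is a placeholder) -- counting the lines appended to that file per minute gives the ingest rate independently of Kibana:

output {
  file {
    # Placeholder path -- one file per input type, for eyeballing the rate.
    path => "/tmp/rate_check-%{type}.log"
  }
}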

On Tuesday, February 10, 2015, Itamar Syn-Hershko 
wrote:

> I'd start by using logstash with input tcp and output fs and see how the
> file behaves. Same for the fs inputs - see how their files behave. And take
> it from there.
>
> --
>
> Itamar Syn-Hershko
> http://code972.com | @synhershko <https://twitter.com/synhershko>
> Freelance Developer & Consultant
> Lucene.NET committer and PMC member
>
> On Tue, Feb 10, 2015 at 7:47 PM, Yuval Khalifa  > wrote:
>
>> Great! How can I check that?
>>
>>
>> On Tuesday, February 10, 2015, Itamar Syn-Hershko > > wrote:
>>
>>> The graphic you sent suggests the issue is with logstash - since the
>>> @timestamp field is being populated by logstash and is the one that is used
>>> to display the date histogram graphics in Kibana. I would start there. I.e.
>>> maybe SecurityOnion buffers writes etc, and then to check the logstash
>>> shipper process stats.
>>>
>>> --
>>>
>>> Itamar Syn-Hershko
>>> http://code972.com | @synhershko <https://twitter.com/synhershko>
>>> Freelance Developer & Consultant
>>> Lucene.NET committer and PMC member
>>>
>>> On Tue, Feb 10, 2015 at 7:07 PM, Yuval Khalifa 
>>> wrote:
>>>
>>>> Hi.
>>>>
>>>> Absolutely (but since I also worked at the helpdesk dept. in the past,
>>>> I certainly understand why it is important to ask those "Are you sure
>>>> it's plugged in?" questions...). One of the logs is coming from
>>>> SecurityOnion, which logs (via bro-conn) all the connections, so it must
>>>> be sending data 24x7x365.
>>>>
>>>> Thanks for the quick reply,
>>>> Yuval.
>>>>
>>>> On Tuesday, February 10, 2015, Itamar Syn-Hershko 
>>>> wrote:
>>>>
>>>>> Are you sure your logs are generated linearly without bursts?
>>>>>
>>>>> --
>>>>>
>>>>> Itamar Syn-Hershko
>>>>> http://code972.com | @synhershko <https://twitter.com/synhershko>
>>>>> Freelance Developer & Consultant
>>>>> Lucene.NET committer and PMC member
>>>>>
>>>>> On Tue, Feb 10, 2015 at 6:29 PM, Yuval Khalifa 
>>>>> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> We just installed an ELK server and configured logstash to match the
>>>>>> data that we send to it. Until last month it seemed to be working fine,
>>>>>> but since then we have seen very strange behavior in Kibana: the
>>>>>> events-over-time histogram shows the event rate at the normal level for
>>>>>> about half an hour, then it drops to about 20% of the normal rate,
>>>>>> continues to drop slowly for about two hours, stops, and after a minute
>>>>>> or two it returns to normal for the next half hour or so, and the same
>>>>>> behavior repeats. Needless to say, both /var/log/logstash and
>>>>>> /var/log/elasticsearch show nothing since the service started, and
>>>>>> using tcpdump we can verify that events keep coming in at the same rate
>>>>>> the whole time. I attached our logstash configuration, the
>>>>>> /var/logstash/logstash.log, the /var/log/elasticsearch/clustername.log
>>>>>> and a screenshot of our Kibana with no filter applied so that you can
>>>>>> see the weird behavior.
>>>>>>
>>>>>> Is there someone/somewhere that we can turn to to get some help on
>>>>>> the subject?
>>>>>>
>>>>>>
>>>>>> 

Re: A strange behavior we've encountered on our ELK

2015-02-10 Thread Yuval Khalifa
Great! How can I check that?

On Tuesday, February 10, 2015, Itamar Syn-Hershko 
wrote:

> The graphic you sent suggests the issue is with logstash - since the
> @timestamp field is being populated by logstash and is the one that is used
> to display the date histogram graphics in Kibana. I would start there. I.e.
> maybe SecurityOnion buffers writes etc, and then to check the logstash
> shipper process stats.
>
> --
>
> Itamar Syn-Hershko
> http://code972.com | @synhershko <https://twitter.com/synhershko>
> Freelance Developer & Consultant
> Lucene.NET committer and PMC member
>
> On Tue, Feb 10, 2015 at 7:07 PM, Yuval Khalifa  > wrote:
>
>> Hi.
>>
>> Absolutely (but since I also worked at the helpdesk dept. in the past, I
>> certainly understand why it is important to ask those "Are you sure it's
>> plugged in?" questions...). One of the logs is coming from SecurityOnion,
>> which logs (via bro-conn) all the connections, so it must be sending data
>> 24x7x365.
>>
>> Thanks for the quick reply,
>> Yuval.
>>
>> On Tuesday, February 10, 2015, Itamar Syn-Hershko > > wrote:
>>
>>> Are you sure your logs are generated linearly without bursts?
>>>
>>> --
>>>
>>> Itamar Syn-Hershko
>>> http://code972.com | @synhershko <https://twitter.com/synhershko>
>>> Freelance Developer & Consultant
>>> Lucene.NET committer and PMC member
>>>
>>> On Tue, Feb 10, 2015 at 6:29 PM, Yuval Khalifa 
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>> We just installed an ELK server and configured logstash to match the
>>>> data that we send to it. Until last month it seemed to be working fine,
>>>> but since then we have seen very strange behavior in Kibana: the
>>>> events-over-time histogram shows the event rate at the normal level for
>>>> about half an hour, then it drops to about 20% of the normal rate,
>>>> continues to drop slowly for about two hours, stops, and after a minute
>>>> or two it returns to normal for the next half hour or so, and the same
>>>> behavior repeats. Needless to say, both /var/log/logstash and
>>>> /var/log/elasticsearch show nothing since the service started, and using
>>>> tcpdump we can verify that events keep coming in at the same rate the
>>>> whole time. I attached our logstash configuration, the
>>>> /var/logstash/logstash.log, the /var/log/elasticsearch/clustername.log
>>>> and a screenshot of our Kibana with no filter applied so that you can
>>>> see the weird behavior.
>>>>
>>>> Is there someone/somewhere that we can turn to to get some help on the
>>>> subject?
>>>>
>>>>
>>>> Thanks a lot,
>>>> Yuval.
>>>>
>>>
>>
>>
>> --
>>
>> Best regards,
>>
>> *Yuval Khalifa*
>>
>> CTO
>> Systems Division

Re: A strange behavior we've encountered on our ELK

2015-02-10 Thread Yuval Khalifa
Hi.

Absolutely (but since I also worked at the helpdesk dept. in the past, I
certainly understand why it is important to ask those "Are you sure it's
plugged in?" questions...). One of the logs is coming from SecurityOnion,
which logs (via bro-conn) all the connections, so it must be sending data
24x7x365.

Thanks for the quick reply,
Yuval.

On Tuesday, February 10, 2015, Itamar Syn-Hershko 
wrote:

> Are you sure your logs are generated linearly without bursts?
>
> --
>
> Itamar Syn-Hershko
> http://code972.com | @synhershko <https://twitter.com/synhershko>
> Freelance Developer & Consultant
> Lucene.NET committer and PMC member
>
> On Tue, Feb 10, 2015 at 6:29 PM, Yuval Khalifa  > wrote:
>
>> Hi,
>>
>> We just installed an ELK server and configured logstash to match the data
>> that we send to it. Until last month it seemed to be working fine, but
>> since then we have seen very strange behavior in Kibana: the
>> events-over-time histogram shows the event rate at the normal level for
>> about half an hour, then it drops to about 20% of the normal rate,
>> continues to drop slowly for about two hours, stops, and after a minute or
>> two it returns to normal for the next half hour or so, and the same
>> behavior repeats. Needless to say, both /var/log/logstash and
>> /var/log/elasticsearch show nothing since the service started, and using
>> tcpdump we can verify that events keep coming in at the same rate the
>> whole time. I attached our logstash configuration, the
>> /var/logstash/logstash.log, the /var/log/elasticsearch/clustername.log and
>> a screenshot of our Kibana with no filter applied so that you can see the
>> weird behavior.
>>
>> Is there someone/somewhere that we can turn to to get some help on the
>> subject?
>>
>>
>> Thanks a lot,
>> Yuval.
>>
>


-- 

Best regards,

*Yuval Khalifa*

CTO
Information Systems Division | Migdal Agencies
Mobile: 052-3336098
Office: 03-7966565
Fax: 03-7976565
Blog: http://www.artifex.co.il



A strange behavior we've encountered on our ELK

2015-02-10 Thread Yuval Khalifa
Hi,

We just installed an ELK server and configured logstash to match the data
that we send to it. Until last month it seemed to be working fine, but since
then we have seen very strange behavior in Kibana: the events-over-time
histogram shows the event rate at the normal level for about half an hour,
then it drops to about 20% of the normal rate, continues to drop slowly for
about two hours, stops, and after a minute or two it returns to normal for
the next half hour or so, and the same behavior repeats. Needless to say,
both /var/log/logstash and /var/log/elasticsearch show nothing since the
service started, and using tcpdump we can verify that events keep coming in
at the same rate the whole time. I attached our logstash configuration, the
/var/logstash/logstash.log, the /var/log/elasticsearch/clustername.log and a
screenshot of our Kibana with no filter applied so that you can see the
weird behavior.

Is there someone/somewhere that we can turn to to get some help on the 
subject?


Thanks a lot,
Yuval.

# This file was created by Yuval Khalifa - Mivtach Simon to handle inputs to the
# ElasticSearch/Kibana analysis at 2014-07-13T17:20
#
#
input {
  tcp {
port => 
type => "syslog_onion"
  }
}
input {
  tcp {
port => 5551
type => "syslog_f5"
codec => plain {
  charset => "CP1252"
}
  }
}
input {
  tcp {
port => 5552
type => "syslog_vault"
  }
}
input {
  tcp {
port => 5553
type => "syslog_fortigate"
codec => plain {
  charset => "CP1252"
}
  }
}
input {
  tcp {
port => 5554
type => "syslog_eventlogs"
  }
}
input {
  tcp {
port => 5556
type => "syslog_mailalerts"
  }
}
input {
  tcp {
port => 5557
type => "syslog_test"
  }
}
input {
  tcp {
port => 5558
type => "syslog_elkconnector"
  }
}
input {
  tcp {
port => 1514
type => "syslog_vmware_esxi"
  }
}
input {
  file {
type => "snmptrap"
path => [ "/srv/snmptraps/snmptrapd.log" ]
codec => plain {
  charset => "CP1252"
}
  }
}
input {
  file {
type => "f5_certs"
path => [ "/srv/f5/certs_*" ]
  }
}
#input {
#  file {
#type => "iis"
#path => ["/srv/iis/**/*.log"]
#codec => plain {
#  charset => "ISO-8859-1"
#}
#  }
#}
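# Drop noise before any further processing: blank lines, lone-quote lines,
# comment lines (# ...), "- - #" entries, "default send string" messages and
# NET-SNMP/snmptrapd startup/shutdown notices.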
filter {
if ([message] =~ /^\s*$/) or
   ([message] == "\"") or
   ([message] =~ /^#/) or 
   ([message] =~ /.* - - #.*/) or
   ([message] == "default send string") or 
   ([message] =~ /^NET-SNMP version.*/) or
   ([message] =~ /^AgentX master disconnected.*/) or
   ([message] =~ /^Stopping snmptrapd.*/) or
   ([message] =~ /^.*NET-SNMP version.*Stopped./) {
drop{}
}
}
filter {
  if ([type] == "iis") {
    grok {
      add_field => { "sotool" => "iis" }
      match => [
        "message", "%{DATESTAMP:log_timestamp} %{WORD:s_sitename} %{NOTSPACE:s_computername} %{IP:dstip} %{WORD:cs_method} %{URIPATH:cs_uri_stem} %{NOTSPACE:cs_uri_query} %{NUMBER:dstport} %{NOTSPACE:cs_username} %{IPORHOST:srcip} %{NOTSPACE:cs_version} %{NOTSPACE:httpUserAgent} %{NOTSPACE:cs_cookie} %{NOTSPACE:cs_referer} %{NOTSPACE:cs_host} %{NOTSPACE:sc_status} %{NOTSPACE:sc_substatus} %{NOTSPACE:sc_win32status} %{INT:sc_bytes} %{INT:cs_bytes} %{INT:timeTaken}",
        "message", "%{DATESTAMP:log_timestamp} %{WORD:s_sitename} %{NOTSPACE:s_computername} %{IP:dstip} %{WORD:cs_method} %{URIPATH:cs_uri_stem} %{NOTSPACE:cs_uri_query} %{NUMBER:dstport} %{NOTSPACE:cs_username} %{IPORHOST:srcip} %{NOTSPACE:cs_version} %{NOTSPACE:httpUserAgent} %{NOTSPACE:cs_cookie} %{NOTSPACE:cs_referer} %{NOTSPACE:cs_host} %{NOTSPACE:sc_status} %{NOTSPACE:sc_substatus} %{NOTSPACE:sc_win32status} %{INT:sc_bytes} %{INT:timeTaken}"
      ]
    }
  }
}
#filter {
#  if ([type] == "syslog_vmware_esxi") {
#grok {
#  match => [ 
#   "message", "(?\<[0-9].*\>)%{INT:prefixInt} 
%{TIMESTAMP_ISO8601:@timestamp} %{NOTSPACE:host} 
%{NOTSPACE:service}:(?.*)"
#  ]
#}
#  }
#}
filter {
  if([type] == "syslog_elkconnector") {
mutate {
  g