from:"sunner"

Re: [graylog2] Graylog 1.0 UDP process buffer performance

2015-03-02 Thread sunner

I installed unbound locally and used this, and it seems to have resolved 
the issue. It's odd that the old server didn't show this behavior, but I'm 
happy enough that it's resolved anyway. :)

Regards
Johan

On Friday, February 27, 2015 at 2:02:08 PM UTC+1, Bernd Ahlers wrote:

 Johan, Henrik, 

 I tried to track this problem down.The problem is that the JVM does 
 not cache reverse DNS lookups. The available JVM DNS cache settings 
 like networkaddress.cache.ttl only affect forward DNS lookups. 

 The code for doing the reverse lookups in Graylog did not change in a 
 long time, so this problem is not new in 1.0. 

 I my test setup enabling force_rdns for a syslog input reduced the 
 throughput from around 7000 msg/s to 300 msg/s. This was without a 
 local DNS cache. Once I installed a DNS cache on the Graylog server, 
 the throughput went up to around 3000 msg/s. 

 We will investigate if there is a sane way to cache the reverse 
 lookups ourselves. In the meantime I suggest to test with a DNS cache 
 installed on the Graylog server nodes to see if that helps or to 
 disable the force_rdns setting. 

 Regards, 
 Bernd 

 On 25 February 2015 at 18:00, Bernd Ahlers be...@graylog.com 
 javascript: wrote: 
  Johan, Henrik, 
  
  thanks for the details. I created an issue on GitHub and will 
 investigate. 
  
  https://github.com/Graylog2/graylog2-server/issues/999 
  
  Regards, 
  Bernd 
  
  On 25 February 2015 at 17:48, Henrik Johansen h...@myunix.dk 
 javascript: wrote: 
  Bernd, 
  
  Correct - that issue started after 0.92.x. 
  
  We are still seeing evaluated CPU utilisation but we are attributing 
 that 
  to the fact that 0.92 was loosing messages in our setup. 
  
  
  On 25 Feb 2015, at 17:37, Bernd Ahlers be...@graylog.com 
 javascript: wrote: 
  
  Henrik, 
  
  uh, okay. I suppose it worked for you in 0.92 as well? 
  
  I will create an issue on GitHub for that. 
  
  Bernd 
  
  On 25 February 2015 at 17:14, Henrik Johansen h...@myunix.dk 
 javascript: wrote: 
  Bernd, 
  
  We saw the exact same issue - here is a graph over the CPU idle 
  percentage across a few of the cluster nodes during the upgrade : 
  
  http://5.9.37.177/graylog_cluster_cpu_idle.png 
  
  We went from ~20% CPU utilisation to ~100% CPU utilisation across 
  ~200 cores and things only settled down after disabling force_rdns. 
  
  
  On 25 Feb 2015, at 11:55, Bernd Ahlers be...@graylog.com 
 javascript: wrote: 
  
  Johan, 
  
  the only thing that changed from 0.92 to 1.0 is that the DNS lookup 
 is 
  now done when the messages are read from the journal and not in the 
  input path where the messages are received. Otherwise, nothing has 
  changed in that regard. 
  
  We do not do any manual caching of the DNS lookups, but the JVM 
 caches 
  them by default. Check 
  
 http://docs.oracle.com/javase/7/docs/technotes/guides/net/properties.html 
  for networkaddress.cache.ttl and networkaddress.cache.negative.ttl. 
  
  Regards, 
  Bernd 
  
  On 25 February 2015 at 08:56,  sun...@sunner.com javascript: 
 wrote: 
  
  This is strange, I went through all of the settings for my reply, and 
 we are 
  indeed using rdns, and it seems to be the culprit. The strangeness is 
 that 
  it works fine on the old servers even though they're on the same 
 networks, 
  and using the same DNS's and resolver settings. 
  Did something regarding reverse DNS change between 0.92 and 1.0? I'm 
  thinking perhaps the server is trying to do one lookup per message 
 instead 
  of caching reverse lookups, seeing as the latter would result in very 
 little 
  DNS traffic since most of the logs will be coming from a small number 
 of 
  hosts. 
  
  Regards 
  Johan 
  
  On Tuesday, February 24, 2015 at 5:08:54 PM UTC+1, Bernd Ahlers 
 wrote: 
  
  
  Johan, 
  
  this sounds very strange indeed. Can you provide us with some more 
  details? 
  
  - What kind of messages are you pouring into Graylog via UDP? (GELF, 
  raw, syslog?) 
  - Do you have any extractors or grok filters running for the messages 
  coming in via UDP? 
  - Any other differences between the TCP and UDP messages? 
  - Can you show us your input configuration? 
  - Are you using reverse DNS lookups? 
  
  Thank you! 
  
  Regards, 
  Bernd 
  
  On 24 February 2015 at 16:45,  sun...@sunner.com wrote: 
  
  Well that could be a suspect if it wasn't for the fact that the old 
  nodes 
  running on old hardware handle it just fine, along with the fact that 
  the 
  traffic seems to reach the nodes just fine(i.e it actually fills the 
  journal 
  up just fine, and the input buffer never breaks a sweat). And it's 
  really 
  not that much traffic, even spread across four nodes those ~1000 
  messages 
  per second will cause this whereas the old nodes are just two and can 
  handle 
  it just fine. 
  
  About disk tuning, I haven't done much of that, and I realize I 
 forgot 
  to 
  mention that the Elasticsearch cluster is on separate physical 
 hardware 
  so

[graylog2] Rewrite timestamp on incoming logs?

2014-10-14 Thread sunner

Hello,

Is it possible to rewrite the timestamp on incoming logs to the local 
system time of the Graylog server? I'm asking because we have some machines 
all across the world where the timezones don't match up so the timestamps 
get fairly large offsets.
I've looked into using a Drools rule for this, but from what I can gather 
there's no built in setTimestamp or similar method? And I'm unsure what 
kind of performance hit this would cause for a system handling ~2.000 
messages per second.

Anyone got any insights into this?

Regards
Johan

-- 
You received this message because you are subscribed to the Google Groups 
graylog2 group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to graylog2+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

[graylog2] Re: Graylog node almost dying

2014-08-20 Thread sunner

Oh and I noticed I left out the versions of the software, doh.

Graylog 0.20.6(web and server)
ES 0.90.10

Regards
Johan

On Tuesday, August 19, 2014 2:29:48 PM UTC+2, sun...@sunner.com wrote:

 Hello,

 We have a pair of GL nodes with a cluster of three ES servers at the 
 backend. The occasional capacity problem aside this has been working fine 
 for the most part.
 Today one of the GL nodes decided to act up though, it's behavior is 
 pretty strange:

 The process is up, I can connect to it via JMX.
 It doesn't reply to any API calls, so as far as the web interface is 
 concerned it's dead. It still listens on the relevant ports, it just reply 
 to curl for example, or the web interface.
 Checking with JMX, the process keeps eating and GC'ing memory, very slowly 
 increasing the memory usage pattern. Negligible CPU use. 
 I can still see GL as a client node in the ES cluster.

 I can see nothing suspicious about the process or the machine it's running 
 on, it's a physical server and there doesn't appear to be any hardware 
 errors or such. One of the ES nodes is running on the same server without 
 issue.
 The config is exactly the same on the other GL server, same hardware and 
 running both GL and ES without issue.

 The graylog2-server.log file has quite a few of these lines(along with a 
 stack trace):
 2014-08-19 13:03:13,628 ERROR: 
 org.graylog2.jersey.container.netty.NettyContainer - Uncaught exception 
 during jersey resource handling
 java.nio.channels.ClosedChannelException
 2014-08-19 13:03:13,628 INFO : 
 org.graylog2.jersey.container.netty.NettyContainer - Not writing any 
 response, channel is already closed.
 java.nio.channels.ClosedChannelException

 I've looked through the other boxes, and aside from the other node being 
 pretty heavily loaded since it has to deal with the entire load, I can't 
 find anything apparently wrong.

 The setup is all Debian 7/amd64, running Sun JDK 7u67, and Graylog has a 
 12 GB heap size configured.

 Any tips?

 Regards
 Johan


-- 
You received this message because you are subscribed to the Google Groups 
graylog2 group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to graylog2+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Re: [graylog2] Graylog 1.0 UDP process buffer performance

[graylog2] Rewrite timestamp on incoming logs?

[graylog2] Re: Graylog node almost dying

3 matches

Site Navigation

Mail list logo

Footer information