I am encountering a situation on my 2-node cluster (2 Graylog nodes, 3 Elasticsearch nodes) in which the Process Buffer fills up, or its usage begins to ramp up quickly, while messages are being written to the disk journal but not read from it. The journal can grow to hundreds of thousands or millions of messages in fairly short order, and I have not been able to work out why this is happening. It resembles what happens when I manually pause message processing, except that Process Buffer usage also ramps up quickly, which is not the case when processing is paused manually. Processing of messages appears to restart as suddenly as it halted, and when it does the processing rate can be as high as 20k messages per second, so I'd like to think I'm not running into a load issue.
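To make the symptom concrete, here is a minimal Python sketch of the condition I am describing: a run of samples in which the journal is still being appended to, nothing is being read from it, and the Process Buffer is close to full. The `Sample` fields, the `find_stalls` helper, and the threshold values are all hypothetical names chosen for illustration. They are not Graylog metric names, and the samples would have to come from whatever monitoring is already polling the nodes.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Sample:
    """One polling snapshot. Field names are illustrative, not Graylog's."""
    append_rate: float  # messages written to the journal per second
    read_rate: float    # messages read from the journal per second
    buffer_pct: float   # Process Buffer utilisation, 0-100


def find_stalls(
    samples: List[Sample],
    min_len: int = 3,                # assumed: shortest run worth reporting
    buffer_threshold: float = 90.0,  # assumed: "near full" cut-off
) -> List[Tuple[int, int]]:
    """Return inclusive (start, end) index pairs of stalled runs.

    A sample counts as stalled when the journal is written to but not
    read from AND the Process Buffer is at or above the threshold. A
    manual pause would not match, because the buffer stays low.
    """
    stalls: List[Tuple[int, int]] = []
    start = None
    for i, s in enumerate(samples):
        stalled = (
            s.append_rate > 0
            and s.read_rate == 0
            and s.buffer_pct >= buffer_threshold
        )
        if stalled:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_len:
                stalls.append((start, i - 1))
            start = None
    # Close a run that is still open at the end of the data.
    if start is not None and len(samples) - start >= min_len:
        stalls.append((start, len(samples) - 1))
    return stalls
```

Flagging the first index of each run would give a timestamp at which Debug logging could be switched on briefly, rather than left running all the time.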
We are collecting the usual vitals via SolarWinds, and nothing appears out of the ordinary there. The systems are all physical HP servers purchased in the spring of this year. The OS is CentOS 6.8, reasonably up to date patch-wise. The default Info log level does not appear to catch anything useful at the onset of this anomaly, and leaving my nodes in Debug chews up storage space very quickly.

So, has anyone ever run into this? Process Buffer usage goes from almost zero to max very quickly, the journal continues to be written to but is not read from, and processing starts back up as quickly as it halted.

Thanks much,
John