Re: UI SocketTimeoutException - heavy IO

2023-03-22 Thread James Srinivasan
Apologies in advance if I've got this completely wrong, but I recall seeing that error when I forget to increase the open-file limit on a heavily loaded install. It is more obvious via the UI, but the logs will have error messages about too many open files. On Wed, 22 Mar 2023, 16:49 Mark Payne, wrote
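
A quick way to check the open-file limit mentioned above is sketched below. It reports the limits of the process that runs the script, so it would need to be run as the same user (and with the same shell limits) as NiFi to be meaningful. The helper names and the 50000 threshold are assumptions for illustration, not values taken from this thread.

```python
import resource


def open_file_limits():
    """Return the (soft, hard) open-file limits for the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def is_limit_low(soft_limit, recommended=50000):
    """Return True when the soft limit is below `recommended`.

    `recommended` is an assumed threshold; adjust it for your install.
    resource.RLIM_INFINITY means "unlimited" and is never treated as low.
    """
    if soft_limit == resource.RLIM_INFINITY:
        return False
    return soft_limit < recommended


if __name__ == "__main__":
    soft, hard = open_file_limits()
    print(f"soft={soft} hard={hard} low={is_limit_low(soft)}")
```

If the soft limit turns out to be low, the logs should also show the "too many open files" errors described in the message.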

Re: UI SocketTimeoutException - heavy IO

2023-03-22 Thread Mark Payne
OK. So changing the checkpoint interval to 300 seconds might help reduce IO a bit. But it will cause the repo to become much larger, and it will take much longer to start up whenever you restart NiFi. The variance in size between nodes is likely due to how recently it’s checkpointed. If it stays

Re: UI SocketTimeoutException - heavy IO

2023-03-22 Thread Joe Obernberger
Thanks for this Mark.  I'm not seeing any large attributes at the moment but will go through this and verify - but I did have one queue that was set to 100k instead of 10k. I set the nifi.cluster.node.connection.timeout to 30 seconds (up from 5) and the nifi.flowfile.repository.checkpoint.interv

Re: UI SocketTimeoutException - heavy IO

2023-03-22 Thread Mark Payne
Joe, The errors noted are indicating that NiFi cannot communicate with registry. Either the registry is offline, NiFi’s Registry Client is not configured properly, there’s a firewall in the way, etc. A FlowFile repo of 35 GB is rather huge. This would imply one of 3 things: - You have a huge nu

Re: UI SocketTimeoutException - heavy IO

2023-03-22 Thread Joe Obernberger
Thank you Mark.  These are SATA drives - but there's no way for the flowfile repo to be on multiple spindles.  It's not huge - maybe 35G per node. I do see a lot of messages like this in the log: 2023-03-22 10:52:13,960 ERROR [Timer-Driven Process Thread-62] o.a.nifi.groups.StandardProcessGrou

Re: UI SocketTimeoutException - heavy IO

2023-03-22 Thread Joe Obernberger
I've since brought the node back up - no change. Looks like IO is all related to the flowfile repository. When it's running, CPU is pretty high - usually ~12 cores (i.e. top will show 1200%) per node. I'm using the XFS filesystem; maybe some FS parameters would help? The big change is that I was

Re: UI SocketTimeoutException - heavy IO

2023-03-22 Thread Mark Payne
Joe, 1.8 million FlowFiles is not a concern. But when you say “Should I reduce the queue sizes?” it makes me wonder if they’re all in a single queue? Generally, you should leave the backpressure threshold at the default 10,000 FlowFile max. Increasing this can lead to huge amounts of swapping, w

Re: UI SocketTimeoutException - heavy IO

2023-03-22 Thread Joe Obernberger
Thank you. Was able to get in. Currently there are 1.8 million flow files and 3.2G. Is this too much for a 3-node cluster with multiple spindles each (SATA drives)? Should I reduce the queue sizes? -Joe On 3/22/2023 10:23 AM, Phillip Lord wrote: Joe, If you need the UI to come back up, try

Re: UI SocketTimeoutException - heavy IO

2023-03-22 Thread Phillip Lord
Joe, If you need the UI to come back up, try setting the autoresume setting in nifi.properties to false and restart node(s). This will bring every component/controllerService up stopped/disabled and may provide some breathing room for the UI to become available again. Phil On Mar 22, 2023 at
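
The autoresume change suggested above could be scripted roughly as follows. This is a minimal sketch: the property name `nifi.flowcontroller.autoResumeState` is an assumption (the message only says "the autoresume setting"), so check it against your own nifi.properties, and `set_property` and the sample text are hypothetical helpers for illustration.

```python
def set_property(text, key, value):
    """Return properties text with `key` set to `value`.

    Replaces the first uncommented `key=...` line, or appends the
    property if no such line exists. Other lines are left untouched.
    """
    lines = text.splitlines()
    for i, line in enumerate(lines):
        stripped = line.strip()
        if stripped.startswith("#"):
            continue  # leave commented-out entries alone
        name, sep, _ = stripped.partition("=")
        if sep and name.strip() == key:
            lines[i] = f"{key}={value}"
            break
    else:
        # No uncommented entry found, so add one at the end.
        lines.append(f"{key}={value}")
    return "\n".join(lines) + "\n"


# Assumed property name; verify it in your own nifi.properties.
AUTO_RESUME_KEY = "nifi.flowcontroller.autoResumeState"

# Hypothetical file contents used only to demonstrate the helper.
sample = "example.other.property=1\nnifi.flowcontroller.autoResumeState=true\n"

if __name__ == "__main__":
    print(set_property(sample, AUTO_RESUME_KEY, "false"), end="")
```

The function works on text rather than on the file itself, so the result can be reviewed before it is written back and the node(s) restarted.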

Re: UI SocketTimeoutException - heavy IO

2023-03-22 Thread Joe Obernberger
atop shows the disk as being all red with IO - 100% utilization. There are a lot of flowfiles currently trying to run through, but I can't monitor it because the UI won't load. -Joe On 3/22/2023 10:16 AM, Mark Payne wrote: Joe, I’d recommend taking a look at garbage collection. It is far more

Re: UI SocketTimeoutException - heavy IO

2023-03-22 Thread Mark Payne
Joe, I’d recommend taking a look at garbage collection. It is far more likely the culprit than disk I/O. Thanks -Mark > On Mar 22, 2023, at 10:12 AM, Joe Obernberger > wrote: > > I'm getting "java.net.SocketTimeoutException: timeout" from the user > interface of NiFi when load is heavy. Th

UI SocketTimeoutException - heavy IO

2023-03-22 Thread Joe Obernberger
I'm getting "java.net.SocketTimeoutException: timeout" from the user interface of NiFi when load is heavy.  This is 1.18.0 running on a 3 node cluster.  Disk IO is high and when that happens, I can't get into the UI to stop any of the processors. Any ideas? I have put the flowfile repository a