Re: Round robin load balancing eventually stops using all nodes

2022-04-01 Thread Mike Thomsen
I think I figured out how to get around this: partition-by-attribute using UUID. About 10 minutes ago, I was down to 3/5 nodes on my cluster. Switched the queues to that strategy, and the 3 full nodes started sending work to the other two nodes without a restart. On Fri, Apr 1, 2022 at 7:44 AM
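For readers wondering why partitioning on a UUID attribute would spread work across all five nodes, here is a minimal sketch of the general idea. It is not NiFi's implementation: the hash function, the `pick_node` and `simulate` helpers, and the flowfile count are all assumptions made for illustration. It only shows that hashing a random, unique value and taking it modulo the node count gives a roughly even spread, independent of any per-node state.

```python
import hashlib
import uuid
from collections import Counter

NODE_COUNT = 5  # cluster size mentioned in the thread


def pick_node(attribute_value: str, node_count: int = NODE_COUNT) -> int:
    """Map an attribute value to a node index with a stable hash.

    Hypothetical helper: SHA-256 is used here only because it is stable
    across runs; the hash NiFi actually uses is not stated in the thread.
    """
    digest = hashlib.sha256(attribute_value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % node_count


def simulate(flowfile_count: int = 10_000) -> Counter:
    """Assign a fresh random UUID to each simulated flowfile and count per node."""
    counts = Counter()
    for _ in range(flowfile_count):
        counts[pick_node(str(uuid.uuid4()))] += 1
    return counts


if __name__ == "__main__":
    for node, count in sorted(simulate().items()):
        print(f"node {node}: {count} flowfiles")
```

Because every flowfile gets a distinct UUID, no node is favored by the key itself; whether this matches what the cluster does internally is something the thread does not confirm.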

Re: Round robin load balancing eventually stops using all nodes

2022-04-01 Thread Mike Thomsen
I think I forgot to mention early on that we're using embedded ZooKeeper. Could that be a factor in this behavior? Thanks, Mike On Fri, Apr 1, 2022 at 7:28 AM Mike Thomsen wrote: > > When we talk about "slower nodes" here, are we referring to nodes that > are bogged down by data but of the

Re: Round robin load balancing eventually stops using all nodes

2022-04-01 Thread Mike Thomsen
When we talk about "slower nodes" here, are we referring to nodes that are bogged down by data but of the same size as the rest of the cluster or are we talking about a heterogeneous cluster? On Mon, Sep 27, 2021 at 12:07 PM Joe Witt wrote: > > Ryan, > > Regarding NIFI-9236 the JIRA captures it

Re: Round robin load balancing eventually stops using all nodes

2021-09-27 Thread Joe Witt
Ryan, Regarding NIFI-9236 the JIRA captures it well but sounds like there is now a better understanding of how it works and what options exist to better view details. Regarding Load Balancing: NIFI-7081 is largely about the scenario whereby in load balancing cases nodes which are slower

Re: Round robin load balancing eventually stops using all nodes

2021-09-21 Thread Ryan Hendrickson
Joe - We're testing some scenarios. Andrew captured some confusing behavior in the UI when enabling and disabling load balancing on a relationship: "Update UI for Clustered Connections" -- https://issues.apache.org/jira/projects/NIFI/issues/NIFI-9236 Question - When a FlowFile is Load Balanced

Re: Round robin load balancing eventually stops using all nodes

2021-09-18 Thread Mike Thomsen
> there is a ticket to overcome this (there is no ETA), Do you know what the Jira # is? On Mon, Sep 6, 2021 at 7:14 AM Simon Bence wrote: > > Hi Mike, > > I did a quick check on the round robin balancing and based on what I found > the reason for the issue must lie somewhere else, not directly

Re: Round robin load balancing eventually stops using all nodes

2021-09-17 Thread Ryan Hendrickson
Joe, we were asked to open a ticket and add diagnostics there, we've summarized the diagnostics (https://issues.apache.org/jira/browse/NIFI-9056). Unfortunately, we can't export logs en-masse, but if there is anything specific or a series of lines we should be looking for, we can summarize and

Re: Round robin load balancing eventually stops using all nodes

2021-09-10 Thread Mike Thomsen
The use case where we most often run into this problem involves extracting content from fairly large tarballs. These tarballs vary in size from 80GB to the better part of 500GB and contain a ton of 250k-1MB files; about 1.5M files per tarball is the norm. (I am

Re: Round robin load balancing eventually stops using all nodes

2021-09-07 Thread Joe Witt
Ryan, if this is so easily replicated for you, it should most likely be trivial to find and fix. Please share, for each node in your cluster, both a thread dump and heap dump within 30 mins of startup and again after 24 hours. This will allow us to see the delta and if there appears to be any

Re: Round robin load balancing eventually stops using all nodes

2021-09-07 Thread Ryan Hendrickson
We have a daily cron job that restarts our NiFi cluster to keep it in a good state. On Mon, Sep 6, 2021 at 6:11 PM Mike Thomsen wrote: > > there is a ticket to overcome this (there is no ETA), but other details > might shed light to a different root cause. > > Good to know I'm not crazy, and

Re: Round robin load balancing eventually stops using all nodes

2021-09-06 Thread Mike Thomsen
> there is a ticket to overcome this (there is no ETA), but other details > might shed light to a different root cause. Good to know I'm not crazy, and it's in the TODO. Until then, it seems fixable by bouncing the box. On Mon, Sep 6, 2021 at 7:14 AM Simon Bence wrote: > > Hi Mike, > > I did

Re: Round robin load balancing eventually stops using all nodes

2021-09-06 Thread Simon Bence
Hi Mike, I did a quick check on the round robin balancing and based on what I found the reason for the issue must lie somewhere else, not directly within it. The one thing I can think of is the scenario where one (or more) nodes are significantly slower than the other ones. In these cases it

Re: Round robin load balancing eventually stops using all nodes

2021-09-06 Thread Jorge Machado
Are you using a dedicated port for transferring the data, or the HTTP protocol? I had similar issues with a remote port connection that were solved by not using the HTTP protocol. > On 3. Sep 2021, at 14:13, Mike Thomsen wrote: > > We have a 5 node cluster, and sometimes I've noticed that

Round robin load balancing eventually stops using all nodes

2021-09-03 Thread Mike Thomsen
We have a 5 node cluster, and sometimes I've noticed that round robin load balancing stops sending flowfiles to two of them, and sometimes, toward the end of the data processing, it can get down to a single node. Has anyone seen similar behavior? Thanks, Mike