Hi all,

Our 7-node cluster had one node underperforming today.  That node's queue
exceeded the object back pressure threshold, which caused the other 6 nodes
to stop processing data while they waited for that node to complete its work.
Question 1: Is this expected behavior?

To remedy the situation, we enabled round-robin load balancing on that
relationship.  We observed FlowFiles moving to other nodes, but this
unnecessarily saturates the network.
Question 2: When a FlowFile is Load Balanced from one node to another, is
the entire Content Claim load balanced?  Or just the small portion
necessary?
Question 3: When the Load Balancing begins, how many threads can it take
up?  And how many "in-parallel" files can be moved?
Question 4: When the Load Balancing attempts a Load Balance using Round
Robin, will it "skip" a node if a node's Relationship has already exceeded
the object backpressure limit?  Similar to how DistributeLoad uses "Next
Available"?
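
To make Question 4 concrete, here is a minimal sketch of the two behaviors
I am trying to distinguish.  This is not NiFi's implementation; the function
names, the queue sizes, and the threshold are hypothetical and only
illustrate strict round robin versus a "next available" selection that
skips full nodes.

```python
def pick_strict(queue_sizes, start):
    """Strict round robin: return the next node index, full or not."""
    return start % len(queue_sizes)


def pick_next_available(queue_sizes, threshold, start):
    """Return the first node index at or after `start` (wrapping around)
    whose queue is below `threshold`, or None if every node is full."""
    n = len(queue_sizes)
    for offset in range(n):
        i = (start + offset) % n
        if queue_sizes[i] < threshold:
            return i
    return None


# Hypothetical per-node queue counts with an object threshold of 100.
queue_sizes = [100, 20, 100, 5]
strict_choice = pick_strict(queue_sizes, 2)                  # node 2, even though it is full
available_choice = pick_next_available(queue_sizes, 100, 2)  # skips node 2
```

With strict selection a full node still receives its turn; with the
"next available" variant the turn passes to the next node that has room.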

Thanks,
Ryan

On Mon, Aug 9, 2021 at 4:59 PM Ryan Hendrickson <
[email protected]> wrote:

> Hi all,
> To confirm, when using a NiFi Cluster, are the Relationship settings
> "Back Pressure Object Threshold" and "Size Threshold" per node or
> cluster-wide?
>
> For example, if we have a 10-node cluster and set the Back Pressure Object
> Threshold to 100, would we then expect the Relationship to queue up to
> 1000 FlowFiles prior to exceeding the threshold?
>
> We have the following setup:
> Update Attribute -----Relationship----> JoltTransform
>
> In our case, we set a 70,000 object threshold and have 7 servers in the
> cluster.
>
> When hovering on the Relationship's status bar, it says:   "Queue: 100%
> full (based on 70,000 object threshold)"
>
> There are two things that don't make sense about that message:
> 1. The Relationship only has ~350,000 FlowFiles in it; for it to be 100%
> full, it would need 490,000.
> 2. There are 7 nodes in the cluster, so should the "based on xx object
> threshold" say "based on 490,000 object threshold"?
>
> We also have a 2GB "Size Threshold" set on the Relationship.  The
> Relationship hover text reads: "Queue 36% full (based on 2GB data size
> threshold)".
>
> What doesn't make sense here is the math.  With 7 nodes at a 2GB limit
> each, the cluster-wide limit would be 14GB.
>
> Taking the reported value of 3.45 GB in the queue and dividing it by 14 GB
> gives ~25%.  That's 11 percentage points off from the 36% noted in the
> hover text.
>
> We are running 1.13.2 on these servers.  All servers appear to be
> communicating and processing data, per the UI's NiFi Cluster overview
> (thread count, queue size, status, etc.).
>
> Any thoughts on this would be appreciated.
>
> Thanks,
> Ryan
>
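
The percentage checks in the quoted message can be reproduced with a short
sketch.  It assumes the thresholds apply per node and are simply multiplied
by the node count, which is exactly the open question; `percent_full` is a
hypothetical helper, not anything from NiFi.

```python
def percent_full(queued, threshold_per_node, nodes=1):
    """Queue fill as a percentage of (threshold_per_node * nodes)."""
    return 100.0 * queued / (threshold_per_node * nodes)


# Object threshold: ~350,000 FlowFiles queued, 70,000 threshold, 7 nodes.
objects_pct = percent_full(350_000, 70_000, nodes=7)

# Size threshold: 3.45 GB queued, 2 GB threshold, 7 nodes.
size_pct = percent_full(3.45, 2.0, nodes=7)

print(f"objects: {objects_pct:.1f}% full, size: {size_pct:.1f}% full")
```

Under that assumption the object queue works out to roughly 71% and the size
queue to roughly 25%, neither of which matches the 100% and 36% shown in the
hover text.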
