[jira] [Commented] (NIFI-9598) Load Balancing on labeled nodes and/or fixed amount of usable nodes in process groups

2022-12-03 Thread Denis Jakupovic (Jira)


[ 
https://issues.apache.org/jira/browse/NIFI-9598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17642883#comment-17642883
 ] 

Denis Jakupovic commented on NIFI-9598:
---

Any news on this?

> Load Balancing on labeled nodes and/or fixed amount of usable nodes in 
> process groups
> -
>
> Key: NIFI-9598
> URL: https://issues.apache.org/jira/browse/NIFI-9598
> Project: Apache NiFi
> Issue Type: Improvement
> Affects Versions: 1.15.3
> Reporter: Denis Jakupovic
> Priority: Trivial
>
> One of NiFi's great features is its linear scalability: you can simply add 
> more nodes. However, the only options today are the DistributeLoad processor 
> and the load balancing strategies on a connection (round robin, partition by 
> attribute, or single node), so we need a more granular way of distributing 
> flowfiles (FF) through the cluster. 
> Let's assume we have a 10-node NiFi cluster. 
> Round Robin: Each node would get 1/10 of the flowfiles.
> Single Node: Only one node would process all FF. The chance that another 
> process group distributes to the same node is 1/10.
> By Attribute: 1-10 nodes could get the data, not evenly partitioned
> Distribute Load Processor: Manual and fixed process, cannot scale with adding 
> more nodes to the cluster and needs 
> When several dataflows have use cases with enormous variance in 
> computation, one or a few dataflows can slow down all the others. 
> Therefore, a solution could be to partition the data to labeled nodes, or to 
> set the maximum number of nodes that may be used for FF partitioning/load 
> balancing on a process group or a connection.
> In the cluster configuration, each node could be labeled. Round robin 
> distribution would then send FF only to the nodes with the matching label. 
> Doing the same with distribution by attribute name would mean building the 
> attribute accordingly, and it cannot be built dynamically. 
> Another great feature would be a maximum number of nodes that a process 
> group can use to distribute flowfiles.
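
To make the proposal above concrete, here is a minimal sketch of round robin
restricted to labeled nodes, with an optional cap on the number of nodes. It
is not NiFi code and does not use any NiFi API: `Node`, `eligible_nodes`,
`round_robin`, the labels "heavy"/"light" and the node names are all
hypothetical and exist only for illustration.

```python
from dataclasses import dataclass, field
from itertools import cycle


@dataclass(frozen=True)
class Node:
    """Hypothetical cluster node; 'labels' is the proposed feature, not a NiFi field."""
    name: str
    labels: frozenset = field(default_factory=frozenset)


def eligible_nodes(nodes, label=None, max_nodes=None):
    """Return nodes carrying `label` (all nodes if None), capped at `max_nodes`."""
    selected = [n for n in nodes if label is None or label in n.labels]
    # Sort by name so that every caller derives the same subset in the same order.
    selected.sort(key=lambda n: n.name)
    if max_nodes is not None:
        selected = selected[:max_nodes]
    if not selected:
        raise ValueError("no eligible node for this label / node limit")
    return selected


def round_robin(flowfiles, targets):
    """Pair each flowfile with the next target node in turn."""
    return zip(flowfiles, cycle(targets))


# Assumed 10-node cluster: nodes 1-3 are labeled "heavy", the rest "light".
cluster = [
    Node(f"node-{i:02d}", frozenset({"heavy"} if i <= 3 else {"light"}))
    for i in range(1, 11)
]

heavy = eligible_nodes(cluster, label="heavy")   # label-restricted target set
capped = eligible_nodes(cluster, max_nodes=4)    # any label, at most 4 nodes

flowfiles = ["ff-1", "ff-2", "ff-3", "ff-4"]
assignment = [(ff, node.name) for ff, node in round_robin(flowfiles, heavy)]
```

With these assumed labels, the four flowfiles rotate over the three "heavy"
nodes only, and the other seven nodes stay free for other dataflows.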



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (NIFI-9598) Load Balancing on labeled nodes and/or fixed amount of usable nodes in process groups

2022-01-19 Thread Denis Jakupovic (Jira)


[ 
https://issues.apache.org/jira/browse/NIFI-9598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17478887#comment-17478887
 ] 

Denis Jakupovic commented on NIFI-9598:
---

Definitely agree. Labeling would be a game changer for NiFi development, also 
in AWS with different EC2 instances. With labeling, the primary node, for 
example, could be used only to handle critical processes, with the relevant 
data afterwards partitioned to other nodes by label.

It's always hard to explain to customers why they should provision another 
cluster to partition the dataflows instead of just adding more nodes...




--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (NIFI-9598) Load Balancing on labeled nodes and/or fixed amount of usable nodes in process groups

2022-01-19 Thread Joe Witt (Jira)


[ 
https://issues.apache.org/jira/browse/NIFI-9598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17478878#comment-17478878
 ] 

Joe Witt commented on NIFI-9598:


Definitely a concept we've discussed in the past.  This sort of partitioned 
load balancing would be pretty slick.  You could do it by other means too, as 
at times users have heterogeneous cluster setups whereby some nodes are more 
powerful, and thus we could ensure certain data is biased to the more powerful 
systems, etc.
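
One way to picture biasing data toward more powerful nodes is a weighted
round robin. The sketch below uses the "smooth weighted round robin" scheme
as a stand-in; it is not something NiFi provides, and the function name, node
names and weights are assumptions made up for the example.

```python
from collections import Counter


def weighted_round_robin(weights):
    """Yield node names forever, in proportion to their integer weights.

    Smooth weighted round robin: on every pick each node's running score grows
    by its weight, the highest score wins, and the winner's score is reduced
    by the sum of all weights.
    """
    if not weights or any(w <= 0 for w in weights.values()):
        raise ValueError("weights must be a non-empty mapping of positive integers")
    total = sum(weights.values())
    current = {name: 0 for name in weights}
    while True:
        for name, weight in weights.items():
            current[name] += weight
        chosen = max(current, key=current.get)  # ties go to the first node listed
        current[chosen] -= total
        yield chosen


# Hypothetical heterogeneous cluster: one node is assumed to be 3x as powerful.
weights = {"big-node": 3, "small-node": 1}
picker = weighted_round_robin(weights)

first_cycle = [next(picker) for _ in range(4)]
counts = Counter(next(picker) for _ in range(400))
```

Over any full cycle the stronger node receives three of every four flowfiles,
and its picks are interleaved with the weaker node's rather than sent in one
burst.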




--
This message was sent by Atlassian Jira
(v8.20.1#820001)