[jira] [Commented] (STORM-1766) A better algorithm server rack selection for RAS
[ https://issues.apache.org/jira/browse/STORM-1766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15420355#comment-15420355 ] ASF GitHub Bot commented on STORM-1766:
---
Github user asfgit closed the pull request at: https://github.com/apache/storm/pull/1621

> A better algorithm server rack selection for RAS
>
> Key: STORM-1766
> URL: https://issues.apache.org/jira/browse/STORM-1766
> Project: Apache Storm
> Issue Type: Improvement
> Reporter: Boyang Jerry Peng
> Assignee: Boyang Jerry Peng
> Fix For: 2.0.0
>
> Currently the getBestClustering algorithm for RAS finds the "best" cluster/rack based on which rack has the most available resources. This may be insufficient and may cause topologies to fail to be scheduled even though there are enough resources in the cluster to schedule them. We attempt to find the rack with the most resources by finding the rack with the biggest sum of available memory + available CPU. This method is not effective since it does not consider the number of available slots, and it also fails to identify racks that are unschedulable because one of the resources (memory, CPU, or slots) is exhausted. The current implementation also attempts the initial scheduling on a single rack instead of trying all the racks before giving up, which, combined with the shortcomings above, may cause topologies to fail to be scheduled. The current method also does not consider worker failures: when executors of a topology become unassigned and need to be scheduled again, the logic in getBestClustering may be inadequate, if not completely wrong, since it will likely return a rack different from the one where the majority of the topology's executors were originally scheduled.
>
> Thus, I propose a different strategy/algorithm to find the "best" rack. I have come up with an ordering strategy I dub subordinate resource availability ordering (inspired by Dominant Resource Fairness) that sorts racks by the subordinate (not dominant) resource availability.
>
> For example, given five racks with the following resource availabilities:
> {code}
> // generate a rack that has a lot of memory but little CPU
> rack-3 Avail [ CPU 100.0 MEM 20.0 Slots 40 ] Total [ CPU 100.0 MEM 20.0 Slots 40 ]
> // generate a rack of supervisors that is depleted of one resource
> rack-2 Avail [ CPU 0.0 MEM 8.0 Slots 40 ] Total [ CPU 0.0 MEM 8.0 Slots 40 ]
> // generate a rack that has a lot of CPU but little memory
> rack-4 Avail [ CPU 6100.0 MEM 1.0 Slots 40 ] Total [ CPU 6100.0 MEM 1.0 Slots 40 ]
> // generate another rack of supervisors with fewer resources than rack-0
> rack-1 Avail [ CPU 2000.0 MEM 4.0 Slots 40 ] Total [ CPU 2000.0 MEM 4.0 Slots 40 ]
> rack-0 Avail [ CPU 4000.0 MEM 8.0 Slots 40 ] Total [ CPU 4000.0 MEM 8.0 Slots 40 ]
> Cluster Overall Avail [ CPU 12200.0 MEM 41.0 Slots 200 ] Total [ CPU 12200.0 MEM 41.0 Slots 200 ]
> {code}
> It is clear that rack-0 is the best rack since it is the most balanced and can potentially schedule the most executors, while rack-2 is the worst rack since it is depleted of CPU, which renders it unschedulable even though it still has other resources available.
> We first calculate the resource availability percentage of each rack for each resource by computing:
> {code}
> (resource available on rack) / (resource available in cluster)
> {code}
> We do this calculation to normalize the values; otherwise the raw resource amounts would not be comparable across resource types.
> So for our example:
> {code}
> rack-3 Avail [ CPU 0.819672131147541% MEM 48.78048780487805% Slots 20.0% ] effective resources: 0.00819672131147541
> rack-2 Avail [ CPU 0.0% MEM 19.51219512195122% Slots 20.0% ] effective resources: 0.0
> rack-4 Avail [ CPU 50.0% MEM 2.4390243902439024% Slots 20.0% ] effective resources: 0.024390243902439025
> rack-1 Avail [ CPU 16.39344262295082% MEM 9.75609756097561% Slots 20.0% ] effective resources: 0.0975609756097561
> rack-0 Avail [ CPU 32.78688524590164% MEM 19.51219512195122% Slots 20.0% ] effective resources: 0.1951219512195122
> {code}
> The effective resource of a rack, which is also its subordinate resource, is computed by:
> {code}
> MIN(resource availability percentage of {CPU, Memory, # of free Slots})
> {code}
> Then we order the racks by their effective resource.
> Thus for our example:
> {code}
> Sorted racks: [rack-0, rack-1, rack-4, rack-3, rack-2]
> {code}
> Also, to deal with the presence of failures, if a topology is partially scheduled, we find the rack with the most scheduled executors for the topology
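Below is a minimal, self-contained Java sketch of the subordinate resource availability ordering described above. It is an illustration only, not the scheduler code from the pull request: the RackResources class, its fields, and the driver are assumed names introduced for this example.

{code}
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical holder for a rack's available resources (not a Storm class).
class RackResources {
    final String id;
    final double availCpu;
    final double availMem;
    final double availSlots;

    RackResources(String id, double availCpu, double availMem, double availSlots) {
        this.id = id;
        this.availCpu = availCpu;
        this.availMem = availMem;
        this.availSlots = availSlots;
    }

    // Effective (subordinate) resource: the minimum availability percentage
    // across CPU, memory, and free slots, relative to the whole cluster.
    double effectiveResource(double clusterCpu, double clusterMem, double clusterSlots) {
        double cpuPct = clusterCpu > 0 ? availCpu / clusterCpu : 0.0;
        double memPct = clusterMem > 0 ? availMem / clusterMem : 0.0;
        double slotPct = clusterSlots > 0 ? availSlots / clusterSlots : 0.0;
        return Math.min(cpuPct, Math.min(memPct, slotPct));
    }
}

public class SubordinateResourceOrdering {
    public static void main(String[] args) {
        List<RackResources> racks = new ArrayList<>();
        racks.add(new RackResources("rack-0", 4000.0, 8.0, 40));
        racks.add(new RackResources("rack-1", 2000.0, 4.0, 40));
        racks.add(new RackResources("rack-2", 0.0, 8.0, 40));
        racks.add(new RackResources("rack-3", 100.0, 20.0, 40));
        racks.add(new RackResources("rack-4", 6100.0, 1.0, 40));

        // Cluster-wide totals of *available* resources, used to normalize.
        double clusterCpu = racks.stream().mapToDouble(r -> r.availCpu).sum();     // 12200.0
        double clusterMem = racks.stream().mapToDouble(r -> r.availMem).sum();     // 41.0
        double clusterSlots = racks.stream().mapToDouble(r -> r.availSlots).sum(); // 200

        // Sort racks by effective resource, descending: best rack first.
        racks.sort(Comparator.comparingDouble(
                (RackResources r) -> r.effectiveResource(clusterCpu, clusterMem, clusterSlots)).reversed());

        // Prints rack-0, rack-1, rack-4, rack-3, rack-2 for the example above.
        racks.forEach(r -> System.out.println(r.id));
    }
}
{code}

Running this sketch against the example availabilities reproduces the ordering [rack-0, rack-1, rack-4, rack-3, rack-2] given in the description.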
[jira] [Commented] (STORM-1766) A better algorithm server rack selection for RAS
[ https://issues.apache.org/jira/browse/STORM-1766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15420354#comment-15420354 ] ASF GitHub Bot commented on STORM-1766:
---
Github user HeartSaVioR commented on the issue: https://github.com/apache/storm/pull/1621

+1
[jira] [Commented] (STORM-1766) A better algorithm server rack selection for RAS
[ https://issues.apache.org/jira/browse/STORM-1766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15418155#comment-15418155 ] ASF GitHub Bot commented on STORM-1766:
---
Github user harshach commented on a diff in the pull request: https://github.com/apache/storm/pull/1621#discussion_r74524961

--- Diff: storm-core/src/jvm/org/apache/storm/scheduler/Cluster.java ---
{code}
@@ -103,6 +103,9 @@ public Cluster(Cluster src) {
         this.status.putAll(src.status);
         this.topologyResources.putAll(src.topologyResources);
         this.blackListedHosts.addAll(src.blackListedHosts);
+        if (src.networkTopography != null) {
{code}
--- End diff --

Is this supposed to be == null? Why are we creating a new Map if there is one already?
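For context on the review question above, the following is a hedged sketch of the pattern the != null guard suggests; it is not the actual Cluster.java code from the pull request, and the field and type names are assumptions for illustration. The idea is that the copy constructor makes a defensive copy of the source's network topography map only when the source has one, so the two instances do not share mutable state.

{code}
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only (not the real org.apache.storm.scheduler.Cluster).
class ClusterSketch {
    private Map<String, List<String>> networkTopography; // rackId -> hostnames

    ClusterSketch(ClusterSketch src) {
        if (src.networkTopography != null) {
            // Copy the map and its value lists instead of aliasing src's instance,
            // so later mutations of one Cluster do not leak into the other.
            this.networkTopography = new HashMap<>();
            for (Map.Entry<String, List<String>> e : src.networkTopography.entrySet()) {
                this.networkTopography.put(e.getKey(), new ArrayList<>(e.getValue()));
            }
        }
    }
}
{code}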
[jira] [Commented] (STORM-1766) A better algorithm server rack selection for RAS
[ https://issues.apache.org/jira/browse/STORM-1766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15417728#comment-15417728 ] ASF GitHub Bot commented on STORM-1766:
---
Github user knusbaum commented on the issue: https://github.com/apache/storm/pull/1621

+1
[jira] [Commented] (STORM-1766) A better algorithm server rack selection for RAS
[ https://issues.apache.org/jira/browse/STORM-1766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15417494#comment-15417494 ] ASF GitHub Bot commented on STORM-1766:
---
GitHub user jerrypeng opened a pull request: https://github.com/apache/storm/pull/1621

[STORM-1766] - A better algorithm server rack selection for RAS

Backport of #1398 to the 1.x branch. I'm not sure this actually needs a PR, but since it's been a while since #1500 was merged, I'll put one up anyway, since the code went into 2.x a while ago.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/jerrypeng/storm 1.x-STORM-1766

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/storm/pull/1621.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #1621

commit d50a2437923f2919fbee130b8e9e86a62a2d9f48
Author: Boyang Jerry Peng
Date: 2016-05-04T22:08:57Z

    [STORM-1766] - A better algorithm server rack selection for RAS
[jira] [Commented] (STORM-1766) A better algorithm server rack selection for RAS
[ https://issues.apache.org/jira/browse/STORM-1766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15345535#comment-15345535 ] ASF GitHub Bot commented on STORM-1766:
---
Github user HeartSaVioR commented on the issue: https://github.com/apache/storm/pull/1398

@jerrypeng Since the master branch targets 2.0.0 and we don't have a timeframe for it, it may be better to add this to 1.1.0 if you don't consider it an experimental feature.
[jira] [Commented] (STORM-1766) A better algorithm server rack selection for RAS
[ https://issues.apache.org/jira/browse/STORM-1766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15300735#comment-15300735 ] ASF GitHub Bot commented on STORM-1766:
---
Github user jerrypeng commented on the pull request: https://github.com/apache/storm/pull/1398#issuecomment-221686962

@ptgoetz Oh I see, thanks for letting me know! I will remember next time to put a comment in the JIRA noting which branches I merged the corresponding PR to.
[jira] [Commented] (STORM-1766) A better algorithm server rack selection for RAS
[ https://issues.apache.org/jira/browse/STORM-1766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15300726#comment-15300726 ] ASF GitHub Bot commented on STORM-1766:
---
Github user ptgoetz commented on the pull request: https://github.com/apache/storm/pull/1398#issuecomment-221686373

@jerrypeng For tracking what goes into each branch/release. GitHub only gives us merge notifications for the branch a pull request targeted. If you had merged this to other branches, we wouldn't know unless we looked for it in those branches. That's why most of the time we add a comment noting which branches a patch was applied to. It saves a little time for other committers.
[jira] [Commented] (STORM-1766) A better algorithm server rack selection for RAS
[ https://issues.apache.org/jira/browse/STORM-1766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15300682#comment-15300682 ] ASF GitHub Bot commented on STORM-1766:
---
Github user jerrypeng commented on the pull request: https://github.com/apache/storm/pull/1398#issuecomment-221678273

@ptgoetz I just merged it into master. Why?
[jira] [Commented] (STORM-1766) A better algorithm server rack selection for RAS
[ https://issues.apache.org/jira/browse/STORM-1766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15300676#comment-15300676 ] ASF GitHub Bot commented on STORM-1766:
---
Github user ptgoetz commented on the pull request: https://github.com/apache/storm/pull/1398#issuecomment-221677136

@jerrypeng Did you merge this to any other branches, or just master?
[jira] [Commented] (STORM-1766) A better algorithm server rack selection for RAS
[ https://issues.apache.org/jira/browse/STORM-1766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15300634#comment-15300634 ] ASF GitHub Bot commented on STORM-1766:
---
Github user asfgit closed the pull request at: https://github.com/apache/storm/pull/1398
[jira] [Commented] (STORM-1766) A better algorithm server rack selection for RAS
[ https://issues.apache.org/jira/browse/STORM-1766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15300608#comment-15300608 ] ASF GitHub Bot commented on STORM-1766:
---
Github user jerrypeng commented on the pull request: https://github.com/apache/storm/pull/1398#issuecomment-221665618

@ptgoetz I have created a jira: https://issues.apache.org/jira/browse/STORM-1866
[jira] [Commented] (STORM-1766) A better algorithm server rack selection for RAS
[ https://issues.apache.org/jira/browse/STORM-1766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15298790#comment-15298790 ] ASF GitHub Bot commented on STORM-1766: --- Github user ptgoetz commented on the pull request: https://github.com/apache/storm/pull/1398#issuecomment-221383374 +1 @jerrypeng Can you file a jira for updating the documentation if necessary?
[jira] [Commented] (STORM-1766) A better algorithm server rack selection for RAS
[ https://issues.apache.org/jira/browse/STORM-1766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15298386#comment-15298386 ] ASF GitHub Bot commented on STORM-1766: --- Github user jerrypeng commented on the pull request: https://github.com/apache/storm/pull/1398#issuecomment-221314974 @redsanket thanks for the review. Do you have any other comments?
[jira] [Commented] (STORM-1766) A better algorithm server rack selection for RAS
[ https://issues.apache.org/jira/browse/STORM-1766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15298278#comment-15298278 ] ASF GitHub Bot commented on STORM-1766: --- Github user jerrypeng commented on a diff in the pull request: https://github.com/apache/storm/pull/1398#discussion_r64404386 --- Diff: storm-core/src/jvm/org/apache/storm/scheduler/resource/strategies/scheduling/DefaultResourceAwareStrategy.java --- @@ -45,6 +47,7 @@ import org.apache.storm.scheduler.WorkerSlot; import org.apache.storm.scheduler.resource.Component; + --- End diff -- will remove
[jira] [Commented] (STORM-1766) A better algorithm server rack selection for RAS
[ https://issues.apache.org/jira/browse/STORM-1766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15296566#comment-15296566 ] ASF GitHub Bot commented on STORM-1766: --- Github user redsanket commented on a diff in the pull request: https://github.com/apache/storm/pull/1398#discussion_r64245422 --- Diff: storm-core/src/jvm/org/apache/storm/scheduler/resource/strategies/scheduling/DefaultResourceAwareStrategy.java --- @@ -45,6 +47,7 @@ import org.apache.storm.scheduler.WorkerSlot; import org.apache.storm.scheduler.resource.Component; + --- End diff -- extra line?
[jira] [Commented] (STORM-1766) A better algorithm server rack selection for RAS
[ https://issues.apache.org/jira/browse/STORM-1766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15285121#comment-15285121 ] ASF GitHub Bot commented on STORM-1766: --- Github user knusbaum commented on the pull request: https://github.com/apache/storm/pull/1398#issuecomment-219518790 +1
[jira] [Commented] (STORM-1766) A better algorithm server rack selection for RAS
[ https://issues.apache.org/jira/browse/STORM-1766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15271526#comment-15271526 ] ASF GitHub Bot commented on STORM-1766: --- GitHub user jerrypeng opened a pull request: https://github.com/apache/storm/pull/1398 [STORM-1766] - A better algorithm server rack selection for RAS You can merge this pull request into a Git repository by running: $ git pull https://github.com/jerrypeng/storm STORM-1766 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/storm/pull/1398.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1398 commit 848ef21df184d046ac6617ea2bce1efc00e13a13 Author: Boyang Jerry Peng Date: 2016-05-04T22:08:57Z [STORM-1766] - A better algorithm server rack selection for RAS
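The commands quoted in the PR description above can be strung together for local review; the following is a minimal sketch only, where the clone location and the review branch name are illustrative assumptions and not part of the PR instructions.
{code}
# Check out the contributor's branch on top of master for local review
# (the branch name "STORM-1766-review" is illustrative).
git clone https://github.com/apache/storm.git
cd storm
git checkout -b STORM-1766-review master
git pull https://github.com/jerrypeng/storm STORM-1766

# Or, instead of pulling the branch, apply the patch file quoted above directly:
#   curl -L https://github.com/apache/storm/pull/1398.patch | git am
{code}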