GitHub user revans2 commented on the issue:

https://github.com/apache/storm/pull/1674

Overall I like the concept here, and if a supervisor is appearing/disappearing a lot we probably do want to blacklist that supervisor. That said, I have a few system concerns.

1) I would like to see this feature made a part of nimbus rather than a separate scheduler. The algorithm is generic enough that we could easily wrap all of the schedulers. If you let nimbus hand it the scheduler that it is supposed to wrap through the constructor, then you can make the internals agnostic to the scheduler underneath. This would also fix some of the build dependency issues you were having with needing to build blacklisting after building clojure.

2) I like having the reporting plugin, but I really want to see blacklisted nodes show up on the storm UI. We have the supervisor pages now, and the supervisor table on the main page. If I am an administrator I would much rather look at a UI to see what is happening with a supervisor instead of parsing a lot of logs.

3) Cluster-wide failures. Blacklisting is a good feature until something odd happens and the entire cluster is blacklisted. (Completely theoretical:) Let's say that we have nimbus HA and it is the primary nimbus node that gets lots of network loss. After a while it blacklists the entire cluster, when it is just nimbus that is bad. I want to be sure that we have something in place that can detect and appropriately handle a situation where the majority of the nodes appear to be bad.

4) master. This patch is just for the 1.x branch. That is fine, but before we can merge it in we need a patch for master as well.
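To make points 1) and 3) concrete, here is a rough sketch of what a wrapping scheduler with a cluster-wide guard could look like. Everything in it is hypothetical and for illustration only: `SimpleScheduler` is a stand-in interface, not Storm's actual `IScheduler`, and `BlacklistingScheduler`, `blacklist(...)`, and `maxBlacklistedFraction` are made-up names, not anything from the patch. The guard policy (ignore the blacklist once too large a fraction of the cluster is on it) is just one possible way to handle the "nimbus is the bad node" case.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for a scheduler; NOT Storm's real IScheduler interface.
interface SimpleScheduler {
    // Returns the supervisors the scheduler was allowed to place work on.
    Set<String> schedule(Set<String> availableSupervisors);
}

// Wraps any scheduler handed in through the constructor, so the blacklisting
// logic stays agnostic to the scheduler underneath.
class BlacklistingScheduler implements SimpleScheduler {
    private final SimpleScheduler underlying;
    private final Set<String> blacklist = new HashSet<>();
    // Assumed policy knob: above this fraction, distrust the blacklist itself.
    private final double maxBlacklistedFraction;

    BlacklistingScheduler(SimpleScheduler underlying, double maxBlacklistedFraction) {
        this.underlying = underlying;
        this.maxBlacklistedFraction = maxBlacklistedFraction;
    }

    void blacklist(String supervisorId) {
        blacklist.add(supervisorId);
    }

    @Override
    public Set<String> schedule(Set<String> availableSupervisors) {
        // Blacklisted supervisors that are actually part of this cluster view.
        Set<String> blacklistedHere = new HashSet<>(availableSupervisors);
        blacklistedHere.retainAll(blacklist);

        // Guard: if most of the cluster looks bad, the observer is the more
        // likely culprit, so skip filtering instead of starving the scheduler.
        boolean suspicious = !availableSupervisors.isEmpty()
                && (double) blacklistedHere.size() / availableSupervisors.size()
                        > maxBlacklistedFraction;

        Set<String> usable = new HashSet<>(availableSupervisors);
        if (!suspicious) {
            usable.removeAll(blacklist);
        }
        return underlying.schedule(usable);
    }
}
```

With a threshold of 0.5 and four supervisors, blacklisting one of them hides it from the wrapped scheduler, while blacklisting three trips the guard and all four are passed through again. A real implementation would presumably also want to surface that condition (log, UI) rather than silently ignoring the blacklist.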