GitHub user revans2 commented on the issue: https://github.com/apache/storm/pull/838

@knusbaum can you please take a look at what we have internally and be sure that the failover/load balancing features all got merged into open source? If it is all in, then we probably need some better documentation on how to set it up and use it properly.

@danny0405 I agree, not having HA is not acceptable. The code is written so that if you have more than one pacemaker server, the workers will load balance the heartbeats between them, and if one of them goes down, the worker will fail over to one of the remaining servers. Nimbus will then read from all of the pacemaker servers, and if there is more than one heartbeat for a given worker, the one with the newest timestamp wins.

You should be able to get HA just by having more than one of them, but I want to be sure that everything for that, including bug fixes, has been merged into open source. It should be, but if you are having issues I want to be sure.
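To make the behavior described above concrete, here is a minimal Python sketch. It is not Storm's actual code: the `Heartbeat` type, `merge_heartbeats`, and `pick_server` are hypothetical names made up for illustration. The merge follows the "newest timestamp wins" rule from the comment; the hash-based server choice is only an assumed way a worker might spread load and fail over, since the comment does not say how that is done.

```python
import zlib
from typing import Dict, FrozenSet, Iterable, NamedTuple, Optional, Sequence


class Heartbeat(NamedTuple):
    """Hypothetical heartbeat record; field names are illustrative only."""
    worker_id: str
    timestamp: int  # assumed: time at which the worker wrote the heartbeat
    payload: bytes


def merge_heartbeats(per_server: Iterable[Iterable[Heartbeat]]) -> Dict[str, Heartbeat]:
    """Combine heartbeats read from every pacemaker server.

    If a worker appears on more than one server, keep the heartbeat
    with the newest timestamp.
    """
    newest: Dict[str, Heartbeat] = {}
    for heartbeats in per_server:
        for hb in heartbeats:
            current = newest.get(hb.worker_id)
            if current is None or hb.timestamp > current.timestamp:
                newest[hb.worker_id] = hb
    return newest


def pick_server(
    servers: Sequence[str],
    worker_id: str,
    down: FrozenSet[str] = frozenset(),
) -> Optional[str]:
    """Assumed selection scheme: hash the worker id to a preferred server,
    then walk forward to the next server that is not known to be down.
    Returns None if every server is down.
    """
    if not servers:
        return None
    start = zlib.crc32(worker_id.encode("utf-8")) % len(servers)
    for offset in range(len(servers)):
        candidate = servers[(start + offset) % len(servers)]
        if candidate not in down:
            return candidate
    return None
```

With something like this, a worker that failed over from one server to another leaves a stale heartbeat on the first server, and the merge on the Nimbus side simply ignores it in favor of the newer one.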