Hi Jungtaek,

We are running Storm 0.9.4, but we are planning to migrate to version 1.0.1.
We deploy our topologies to move messages between RabbitMQ brokers. Indeed, we have tested forcing a worker to die, and once the nimbus timeout expired, a new worker appeared on another node, but the system did not behave as well as it should. It was necessary to kill some other workers and rebalance a couple of times in order to get everything OK (a constant message flow inside our brokers).

Is it possible to kill all the workers inside a topology and rebalance (a kind of graceful shutdown)? Or once you kill all of them, must you redeploy the whole topology? Is version 1.0.1 a possible solution?

Thanks again.

*JULIÁN BERMEJO FERREIRO*
*Departamento de Tecnología*
*[email protected] <[email protected]>*
<http://www.beeva.com/>

2016-05-26 15:34 GMT+02:00 Jungtaek Lim <[email protected]>:

> Hi Julián,
>
> Which version of Storm do you use?
> I remember some of the Storm 0.9.x versions have issues when workers are
> failing, so I'd like to know about it.
>
> Thanks,
> Jungtaek Lim (HeartSaVioR)
>
> On Thu, May 26, 2016 at 5:53 PM, Julián Bermejo Ferreiro | BEEVA
> <[email protected]> wrote:
>
>> Hello,
>>
>> We have a multiple-node Storm cluster running in a production
>> environment. We have had some issues with a couple of machines, which
>> have been out of service for a few hours.
>>
>> Because some workers of the deployed topologies were running on the
>> failed machines, the cluster's behaviour has been unusual (it has kept
>> running, but not as it should).
>>
>> Once we recovered the failed nodes and rebalanced the topologies, the
>> cluster returned to working properly.
>>
>> We would like to know if there is any way to alert nimbus when a node
>> falls down, in order to rebalance the affected topologies and create new
>> workers on the healthy nodes of the cluster to replace those that were
>> running on the failed ones.
>>
>> This would have helped us a lot, because we could have kept our service
>> consistent in spite of the failed nodes.
>>
>> Any advice?
>>
>> Thanks in advance!
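[For reference, the reassignment asked about above can be triggered from the Storm CLI without redeploying the topology. A minimal sketch; the topology name `my-topology` and the wait/worker values are placeholders to substitute with your own:

```shell
# Rebalance the topology: Storm deactivates it, waits the given number of
# seconds so in-flight tuples can drain, then redistributes workers/executors
# across the currently healthy supervisor nodes.
storm rebalance my-topology -w 30

# Optionally change the worker count during the rebalance (-n) instead of
# keeping the one the topology was submitted with:
storm rebalance my-topology -w 30 -n 4

# If you do want to take the topology down entirely, "storm kill" with a wait
# time is the graceful path; after this, a redeploy (storm jar ...) is needed:
storm kill my-topology -w 30
```

The wait time (`-w`) is what makes the shutdown graceful: the topology is deactivated first, so spouts stop emitting and pending tuples can be processed before workers are torn down.]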
