Hi Jungtaek,

We are running Storm 0.9.4, but we are planning to migrate to version
1.0.1.

We deploy our topologies to move messages inside RabbitMQ brokers.

Indeed, we have tested forcing a worker to die, and once the nimbus
timeout expired, a new worker appeared on another node, but the system
did not behave as well as it should. It was necessary to kill some other
workers and rebalance a couple of times in order to get everything back to
normal (a constant message flow inside our brokers).

Is it possible to kill all the workers inside a topology and rebalance
(like a kind of graceful shutdown)? Or once you kill all of them, must you
redeploy the whole topology?
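In case it clarifies the question, this is roughly the sequence we have in mind using the Storm CLI (the topology name and wait times below are just placeholders):

```shell
# Stop spouts from emitting new tuples so in-flight messages can drain:
storm deactivate my-topology

# Redistribute workers/executors, waiting 30 seconds before restarting:
storm rebalance my-topology -w 30 -n 4

# Or take the topology down completely, waiting 60 seconds first:
storm kill my-topology -w 60
```

We are unsure whether this achieves the "kill all workers and rebalance" effect we want, or whether a full redeploy is unavoidable.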

Could upgrading to version 1.0.1 be a solution?

Thanks again.




*JULIÁN BERMEJO FERREIRO*
*Departamento de Tecnología *
*[email protected] <[email protected]>*
<http://www.beeva.com/>




2016-05-26 15:34 GMT+02:00 Jungtaek Lim <[email protected]>:

> Hi Julián,
>
> Which version of Storm do you use?
> I remember some Storm 0.9.x versions have issues when workers are
> failing, so I'd like to know about it.
>
> Thanks,
> Jungtaek Lim (HeartSaVioR)
>
> On Thu, May 26, 2016 at 5:53 PM, Julián Bermejo Ferreiro | BEEVA <
> [email protected]> wrote:
>
>> Hello,
>>
>> We have a multiple-node storm cluster running on a Production
>> environment. We have had some issues with a couple of machines, which have
>> been out of service for a few hours.
>>
>> Because some workers of the deployed topologies were running on the
>> failed machines, the cluster's behaviour was unusual (it kept running,
>> but not as it should).
>>
>> Once we recovered the failed nodes and rebalanced the topologies, the
>> cluster returned to working properly.
>>
>> We would like to know if there is any way to alert nimbus when a node
>> falls down, so that it rebalances the affected topologies and creates new
>> workers on the healthy nodes of the cluster to replace those that were
>> running on the failed ones.
>>
>> This would have helped us a lot, because we could have kept our service
>> consistent in spite of the failed nodes.
>>
>> Any advice?
>>
>> Thanks in advance!
>>
>>
>>
>>
>>
>>
>> *JULIÁN BERMEJO FERREIRO*
>> *Departamento de Tecnología *
>> *[email protected] <[email protected]>*
>> <http://www.beeva.com/>
>>
>>
>>
>>
