Nimbus dies

Krzysztof Sadowski Fri, 16 May 2014 07:18:00 -0700

Let's imagine the following scenario:

   - machine with one supervisor goes down
   - machine with nimbus goes down


Right now, because some workers go down as well, a few queues are not
drained properly, which causes these queues to keep growing in size.

To avoid this situation we should rebalance the topology to distribute the
load across all of the remaining supervisors, but to do this I need Nimbus
to be up and running. Moreover, basic monitoring information is not
available because the Storm UI is not working either.
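
To make the rebalance step concrete, here is a rough sketch of the command I
would like to run once Nimbus is reachable again. It only builds the argument
list for the `storm rebalance` CLI and does not execute anything. The helper
name `build_rebalance_command`, the topology name `my-topology`, and the
numbers used are placeholders I made up for illustration.

```python
def build_rebalance_command(topology, wait_secs=None, num_workers=None,
                            executors=None):
    """Build the argv for `storm rebalance`; nothing is executed here."""
    cmd = ["storm", "rebalance", topology]
    if wait_secs is not None:
        # -w: seconds to wait before the topology is redistributed
        cmd += ["-w", str(wait_secs)]
    if num_workers is not None:
        # -n: new number of worker processes
        cmd += ["-n", str(num_workers)]
    # -e component=parallelism: new executor count per component
    for component, parallelism in (executors or {}).items():
        cmd += ["-e", f"{component}={parallelism}"]
    return cmd


if __name__ == "__main__":
    # Hypothetical topology name and values, for illustration only.
    argv = build_rebalance_command("my-topology", wait_secs=10, num_workers=4)
    print(" ".join(argv))
```

The resulting list could be passed to `subprocess.run`, but the command can
only succeed while Nimbus is up, which is exactly the problem described here.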

My question is: what is the recommended operational procedure when the
machine running Nimbus dies, and what can be done to minimize the downtime?
Should we install Nimbus on a second machine and start it after the first
machine dies, similar to a failover service? Can we run more than one
Nimbus? Or is there a better option?

Thanks for your help
