Dear Apache Storm Community,

I am currently managing an Apache Storm cluster with 38 nodes: 3 dedicated
to ZooKeeper, 1 to Nimbus and the UI, and 34 nodes running Supervisor and
Logviewer processes. Each Supervisor node runs 2 Workers.

At present, our topology update process involves the following steps:

   1. Killing the existing topology.
   2. Replacing the dependency JARs under the external-lib dir on the
   Nimbus node and restarting Nimbus.
   3. Replacing the dependency JARs under the external-lib dir on each
   Supervisor node and restarting the Supervisors.
   4. Submitting the new topology.

Each operation takes about 2–3 minutes. As the number of Supervisor nodes
increases, the overall time for topology updates is becoming a concern.
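
For concreteness, the Supervisor restart in step 3 could be scripted roughly
as follows. This is only a minimal Python sketch, not what we run today: the
host names, the "storm-supervisor" systemd unit name, and passwordless SSH
are all assumptions, and batches, ssh_restart and restart_in_batches are
hypothetical helper names.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def batches(hosts, size):
    """Split hosts into consecutive groups of at most `size` hosts."""
    return [hosts[i:i + size] for i in range(0, len(hosts), size)]


def ssh_restart(host):
    """Restart the Supervisor on one host; return True on success.

    Assumption: the Supervisor runs as a systemd unit named
    "storm-supervisor" and passwordless SSH/sudo is available.
    """
    cmd = ["ssh", host, "sudo", "systemctl", "restart", "storm-supervisor"]
    return subprocess.run(cmd).returncode == 0


def restart_in_batches(hosts, size, restart=ssh_restart):
    """Restart hosts one batch at a time, in parallel within a batch.

    Stops after the first batch that contains a failure and returns the
    failed hosts of that batch; returns an empty list if all succeeded.
    """
    for group in batches(hosts, size):
        with ThreadPoolExecutor(max_workers=len(group)) as pool:
            results = list(pool.map(restart, group))
        failed = [h for h, ok in zip(group, results) if not ok]
        if failed:
            return failed
    return []
```

For example, restart_in_batches(hosts, 17) would cover 34 Supervisors in two
rounds, and a failed round stops the rollout before the remaining hosts are
touched.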

I am reaching out to seek advice on how to optimize this process, as I
believe there are more efficient ways to handle topology updates in
large-scale Storm deployments. Specifically:

   - Is there a more efficient process to handle code changes without
   having to manually restart Nimbus and Supervisors?
   - How can I reduce the overall time for topology updates, especially as
   our cluster continues to grow?
   - Are there industry-standard practices for implementing rolling updates
   or automating the deployment process?

Any insights, recommendations, or best practices that could help streamline
our update process would be greatly appreciated.

Thank you for your time, and I look forward to your suggestions!
