Thank you for the responses, guys. I’ve done a little experimenting and discovered that restarting the daemons (nimbus, supervisor) is not such a big deal for our topologies.

We already have a dedicated ZooKeeper server in our environment, so we had no need to use the included zk library. It was easy to define the zk server in storm.yaml. We dedicate one node (i.e. a virtual server) to Nimbus and the Storm UI; nothing else runs on that server. We have three nodes dedicated to the supervisors, and nothing else runs on those servers either. They all communicate with ZooKeeper without issues: the supervisors register themselves when a new daemon starts, and Nimbus finds them without issue.

One issue we had was that the supervisors were trying to communicate with each other using the Netty client. When the supervisors registered in ZooKeeper, they would provide their hostname. Well, our Storm cluster is built in AWS (Amazon Web Services) and we spin servers up and down all the time, so we can’t use hostnames and instead rely on IP addresses. After some searching, a coworker found a property that you can define in storm.yaml called storm.local.hostname. You can set that value to the IP address, and the supervisor will register its IP address instead of the hostname. After we made that change, we no longer got netty-client errors in the supervisor.log file. That was a huge headache and I’m glad it is finally resolved.

It would be nice if there were a list somewhere of ALL the values allowed in the storm.yaml file. I have a feeling mine is missing some.

Richard Gunderson

From: Nathan Leung [mailto:ncle...@gmail.com]
Sent: Wednesday, September 24, 2014 8:09 AM
To: user
Subject: Re: Configuration changes and storm cluster

In my experience storm.yaml changes require daemon restarts, while cluster.xml changes get picked up dynamically by running topologies.

On Wed, Sep 24, 2014 at 2:44 AM, Richards Peter <hbkricha...@gmail.com> wrote:

Answers inline.

Regards,
Richards Peter.

On Tue, Sep 23, 2014 at 8:55 PM, Gunderson, Richard-CW <richard.gunder...@bestbuy.com> wrote:

> Hi Storm users.
>
> I’ve recently started managing a Storm cluster. I’m very new to Storm and have a lot to learn. I’ve been tasked to create Chef scripts to automate two things:
> 1) Updating our Storm cluster if we have changes to our Storm configuration (this will primarily be changes to properties in the storm.yaml file)
> 2) Updating our topologies.
>
> My question: Does Storm notice if the properties file (storm.yaml, or cluster.xml) has been updated and automatically “incorporate” the changes while it’s running? Or must I shut down the entire cluster (nimbus and all supervisors) and restart everything? I didn’t find anything in the official Apache Storm documentation about what to do after updating the configuration file.

From my experience, the storm.yaml file is read only once, when the Storm daemons (nimbus, supervisor, ui) are started. If you are only adding new supervisor machines to the cluster, you need not restart any of the running services. You just have to start the supervisor on the newly added machine. In the storm.yaml file you do not specify the IP address of any of the supervisor machines. The supervisors automatically talk to the nimbus server configured in the storm.yaml file. The Storm daemons are stateless and all state information is stored in ZooKeeper, so you need not worry about adding new supervisor machines.

I normally restart all the Storm daemons if there is a change in my storm.yaml file. You should take expert opinion from others in this forum.

But let me bring one more thing to your attention. If you add more supervisor ports in your storm.yaml file, you should also increase the maxClientCnxns parameter in ZooKeeper’s zoo.cfg file.
http://zookeeper.apache.org/doc/r3.3.3/zookeeperAdmin.html#sc_advancedConfiguration

> Also: What happens if Nimbus has a different storm.yaml file than the supervisors? Does a supervisor process ever read that file? Or does Nimbus control everything?

I am not sure about your question. If the nimbus machine runs only nimbus, then that file is read only by nimbus. However, if you have nimbus and a supervisor running on the same machine, the file will be read by both processes.

> We have recently created two new topologies and will need to start tuning their performance, so we’ll need to update our configuration a lot in the coming days.
>
> Thank you,
> Richard Gunderson
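
For reference, the storm.yaml settings discussed in this thread might look roughly like the sketch below on a supervisor node. This is only an illustration, not the poster's actual file: every IP address is a made-up placeholder, the slot ports are simply the commonly seen 6700-range values, and nimbus.host is the key name used by Storm releases of that era (0.9.x). The point is the storm.local.hostname line, which makes the daemon register its IP address in ZooKeeper instead of its hostname.

```yaml
# Hypothetical storm.yaml for one supervisor node (all addresses are placeholders).

# External ZooKeeper server(s) the Storm daemons connect to.
storm.zookeeper.servers:
  - "10.0.0.10"

# Address of the dedicated Nimbus node (key name as used in Storm 0.9.x).
nimbus.host: "10.0.0.20"

# Register this node in ZooKeeper under its IP address rather than its
# hostname, so other workers can reach it when hostnames do not resolve.
# Set this to the node's own address on each machine.
storm.local.hostname: "10.0.0.31"

# One worker slot per port. If more ports are added, the ZooKeeper
# maxClientCnxns setting in zoo.cfg may need to be raised as well.
supervisor.slots.ports:
  - 6700
  - 6701
  - 6702
  - 6703
```

As noted in the replies above, a change to any of these values takes effect only after the daemon on that node is restarted. The defaults.yaml file shipped with Storm is the usual place to look for the full set of keys that storm.yaml can override.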