Stephen,

Thank you for the response!  Helps out a lot.

So a further question.  And forgive my lack of knowledge here, I am not the one 
using Storm, only deploying and running it, so I don’t understand all the 
reasoning behind why something is done a certain way in Storm.

Let’s say I have deactivated all the topologies.  Is it necessary to then kill 
the topology?  Could I not just wait a set amount of time to ensure the tuples 
have cleared, say 5 minutes, and then bring down the nodes?

I ask because it is a lot easier to reactivate the topologies with a 
non-interactive script once the nodes are back up.  I would like to avoid 
using “storm jar” to resubmit the topologies, because that means hard-coding 
values into my scripts or maintaining a separate conf file for them.  
See my current code below:

function deactivate_topos {
  # Keep the rows of "storm list" below the dashed separator and turn each
  # one into a name:status pair.
  STORM_TOPO_STATUS=$(storm list |
    sed -n -e '/^-------------------------------------------------------------------/,$p' |
    sed -e '/^-------------------------------------------------------------------/d' |
    awk '{print $1 ":" $2}')

  for i in $STORM_TOPO_STATUS
  do
    IFS=':' read -r TOPO_NAME TOPO_STATUS <<< "$i"
    echo "$TOPO_NAME $TOPO_STATUS"
    if [ "$TOPO_STATUS" = 'ACTIVE' ]; then
      storm deactivate "${TOPO_NAME}"
    fi
    # Show the current state after handling each topology.
    storm list | sed -n -e '/^-------------------------------------------------------------------/,$p'
  done
}

function activate_topos {
  # Same parsing as deactivate_topos: rows below the dashed separator,
  # emitted as name:status pairs.
  STORM_TOPO_STATUS=$(storm list |
    sed -n -e '/^-------------------------------------------------------------------/,$p' |
    sed -e '/^-------------------------------------------------------------------/d' |
    awk '{print $1 ":" $2}')

  for i in $STORM_TOPO_STATUS
  do
    IFS=':' read -r TOPO_NAME TOPO_STATUS <<< "$i"
    echo "$TOPO_NAME $TOPO_STATUS"
    if [ "$TOPO_STATUS" = 'INACTIVE' ]; then
      storm activate "${TOPO_NAME}"
    fi
    # Show the current state after handling each topology.
    storm list | sed -n -e '/^-------------------------------------------------------------------/,$p'
  done
}
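
The two functions differ only in the status they look for and the command 
they run, so the parsing could live in one helper.  A minimal sketch, 
assuming "storm list" prints a header, a dashed separator line, and then one 
row per topology with the name in the first column and the status in the 
second (the same layout the pipeline above relies on).  The names 
parse_topos and set_topos are made up for illustration:

```shell
#!/usr/bin/env bash

# Read "storm list" output on stdin and print "name status" for every row
# below the dashed separator. Assumes name is column 1 and status column 2.
parse_topos() {
  awk 'seen && NF >= 2 { print $1, $2 } /^-----/ { seen = 1 }'
}

# Run "storm <action>" on every topology whose status equals <from_status>.
# $1 = status to match (e.g. ACTIVE), $2 = storm subcommand (e.g. deactivate)
set_topos() {
  storm list | parse_topos | while read -r name status; do
    if [ "$status" = "$1" ]; then
      echo "$2 $name"
      # Redirect stdin so storm cannot swallow the remaining rows.
      storm "$2" "$name" < /dev/null
    fi
  done
}
```

With these, deactivating would be "set_topos ACTIVE deactivate" and 
reactivating "set_topos INACTIVE activate".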

From: Stephen Powis [mailto:spo...@salesforce.com]
Sent: Tuesday, September 29, 2015 12:45 PM
To: user@storm.apache.org
Subject: Re: Starting and stopping storm

I would imagine the safest way would be to elect to deactivate each running 
topology, which should make your spouts stop emitting tuples.  You'd wait for 
all of the currently processing tuples to finish processing, and then kill the 
topology.

If tuples get processed quickly in your topologies, you can effectively do this 
in one step by issuing the kill with a long enough wait time.  For example, 
telling Storm to kill your topology after 30 seconds means it will deactivate 
your spouts for 30 seconds, wait for existing tuples to finish processing, and 
then kill off the topology.
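
As a sketch of that second approach: "storm kill" accepts a -w option giving 
the number of seconds to wait between deactivating the spouts and shutting 
the topology down.  The helper name kill_topo and the 30-second default are 
illustrative only; pick a wait longer than your topologies need to drain.

```shell
# Deactivate a topology's spouts, give in-flight tuples time to finish,
# then kill it. $1 = topology name, $2 = wait in seconds (default 30).
kill_topo() {
  if [ -z "$1" ]; then
    echo "usage: kill_topo <topology-name> [wait-secs]" >&2
    return 1
  fi
  storm kill "$1" -w "${2:-30}"
}
```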

Then bring down each node, upgrade it, bring it back online and resubmit your 
topologies.

On Tue, Sep 29, 2015 at 10:02 AM, Garcia-Contractor, Joseph (CORP) 
<joseph.garcia-contrac...@adp.com> wrote:
I don't think I got my question across right or I am confused.

Let me break this down in a more simple fashion.

I have a Storm Cluster named "The Quiet Storm" ;) here is what it consists of:

******
Server ZK1: Running Zookeeper
Server ZK2: Running Zookeeper
Server ZK3: Running Zookeeper

Server N1: SupervisorD running Storm Nimbus

Server S1: SupervisorD running Storm Supervisor with 4 workers.
Server S2: SupervisorD running Storm Supervisor with 4 workers.
Server S3: SupervisorD running Storm Supervisor with 4 workers.
******

Now "The Quiet Storm" can have 1 to n topologies running on it.

I need to shut down all the servers in the cluster for maintenance.  What is 
the procedure to do this without doing harm to the currently running topologies?

Thank you,

Joe

-----Original Message-----
From: Matthias J. Sax [mailto:mj...@apache.org]
Sent: Monday, September 28, 2015 12:15 PM
To: user@storm.apache.org
Subject: Re: Starting and stopping storm

Hi,

as always: it depends. ;)

Storm itself cleans up its own resources just fine. However, if the running 
topology needs to clean up or release resources before it is shut down, Storm 
is not of any help. Even if there is a Spout/Bolt cleanup() method, Storm does 
not guarantee that it will be called.

Thus, using "storm deactivate" is a good way to achieve proper cleanup.
However, the topology must provide some code for it, too. On the call to 
Spout.deactivate(), it must emit a special "clean-up" message (that you have to 
design yourself) that must propagate through the whole topology, i.e., each 
bolt must forward this message to all its output streams. Furthermore, bolts 
must do the clean-up when they receive this message.

Long story short: "storm deactivate" before "storm kill" only makes sense if 
the topology requires proper cleanup and if the topology itself can 
react/clean up properly on Spout.deactivate().

Using "storm activate" is not necessary in any case.

-Matthias


On 09/28/2015 05:08 PM, Garcia-Contractor, Joseph (CORP) wrote:
> Hi all,
>
>
>
> I am a DevOps guy and I need to implement a Storm cluster with the
> proper start and stop init scripts on a Linux server.  I already went
> through the documentation and it seems simple enough.  I am using
> supervisor as my process manager.  I am, however, having a debate with
> one of the developers using Storm on the proper way to shut down Storm,
> and I am hoping that you fine folks can help us out in this regard.
>
>
>
> The developer believes that before you tell supervisor to kill
> (SIGTERM) the Storm workers, supervisor, and nimbus, you must first
> issue a "storm deactivate topology-name", then tell supervisor to kill
> all the various processes.  He believes this because he doesn't know
> whether Storm will do an orderly shutdown on SIGTERM and thinks there
> is a chance that something will get screwed up.  This also means that
> when you start Storm, after nimbus is up, you need to issue a
> "storm activate topology-name".
>
>
>
> I am of the belief that because of Storm's fail-fast design and
> because it guarantees data processing, none of that is necessary and
> you can just tell supervisor to stop the process.
>
>
>
>                So who is right here?
>
> ----------------------------------------------------------------------
> -- This message and any attachments are intended only for the use of
> the addressee and may contain information that is privileged and
> confidential. If the reader of the message is not the intended
> recipient or an authorized representative of the intended recipient,
> you are hereby notified that any dissemination of this communication
> is strictly prohibited. If you have received this communication in
> error, notify the sender immediately by return email and delete the
> message and any attachments from your system.

