You can deactivate the topology, which shuts off the spouts. Then, after enough time has passed for all of your bolts to drain, kill the topology. I believe this is what kill with a non-zero timeout does as well. Kill with a zero timeout kills the worker processes without letting them drain, which is why you are left with tuples that were never acked or failed.
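
The deactivate, wait, kill sequence above could be scripted roughly as follows. This is only a sketch, assuming the `storm` command-line client is on the PATH; the helper names, the topology name "my-topology", and the 60 second drain time are made up for illustration. A drain time at least as long as your topology's message timeout is probably a sensible starting point, but that is a guess about your setup.

```python
import subprocess
import time


def drain_then_kill_commands(topology, kill_wait_secs=0):
    """Build the two Storm CLI invocations: deactivate first, kill second."""
    return [
        ["storm", "deactivate", topology],
        ["storm", "kill", topology, "-w", str(kill_wait_secs)],
    ]


def drain_then_kill(topology, drain_secs, kill_wait_secs=0,
                    run=subprocess.check_call, sleep=time.sleep):
    """Deactivate the spouts, wait for the bolts to drain, then kill.

    `run` and `sleep` are injectable (hypothetical parameters) so the
    sequence can be dry-run without a cluster.
    """
    deactivate, kill = drain_then_kill_commands(topology, kill_wait_secs)
    run(deactivate)    # spouts stop emitting new tuples
    sleep(drain_secs)  # give in-flight tuples time to be acked or failed
    run(kill)          # now shut the workers down


if __name__ == "__main__":
    # Assumed values; adjust the name and drain time for your topology.
    drain_then_kill("my-topology", drain_secs=60)
```

Since the spouts are already idle by the time the kill is issued, a wait of 0 on the kill itself should be fine here; the draining happens during the sleep.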

On Thu, May 1, 2014 at 3:16 PM, P Ghosh <javadevgh...@gmail.com> wrote:

> I have a few topologies running. The spout puts the ID of the object it is
> emitting into a WIP list in Redis. When the spout's ack or fail method is
> called, it takes the ID out of the WIP list.
>
> The environment and application are undergoing a lot of changes, and as a
> result I'm required to occasionally restart the topology or the Storm
> cluster itself.
>
> The problem is that when I restart, I see quite a few messages left in WIP,
> which means that for these messages the spout didn't receive any ack or fail.
>
> My restart process has been:
> 1. Kill the topology from the UI. (I find killing from the UI is more
>    responsive than from the command line: the killed topology goes away
>    very quickly. If I do it from the command line, the "killed" topology
>    remains in the list for a long time, hindering my ability to relaunch
>    the topology.) I typically kill it with a 0 sec. wait time (maybe this
>    is where I'm going wrong).
> 2. Go to each VM and stop the
>    a> supervisor
>    b> logviewer
> 3. Go to nimbus, shut down
>    a> ui/nimbus/logviewer
> 4. Go to zookeeper and shut down zookeeper
>
> I thought this was the proper flow, but I doubt that given the leftover
> messages I see in WIP.
>
> Any thoughts will be helpful.
>
> Thanks,
> Prasun