[ 
https://issues.apache.org/jira/browse/STORM-526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rick Kellogg updated STORM-526:
-------------------------------
    Component/s: storm-core

> Nimbus triggered complete removal of all topologies due to maintenance in 2 
> out of 3 zookeeper servers
> ------------------------------------------------------------------------------------------------------
>
>                 Key: STORM-526
>                 URL: https://issues.apache.org/jira/browse/STORM-526
>             Project: Apache Storm
>          Issue Type: Bug
>          Components: storm-core
>    Affects Versions: 0.9.2-incubating
>         Environment: AWS EC2 Ubuntu
>            Reporter: Itai Frenkel
>
> We use a cluster of 3 ZooKeeper servers; all 3 IP addresses are in the 
> storm.yaml file. We restarted one ZooKeeper server and, once it was ready, 
> restarted the second. The third ZooKeeper server was "green" the whole time 
> (as monitored by Netflix Exhibitor).
> At the same time, Nimbus "decided" to remove all topologies (the log entry is 
> "Corrupt topology my-topology-xxx has state on zookeeper but doesn't have a 
> local dir on Nimbus. Cleaning up...").
> I looked at the relevant code and I am not entirely sure the log message 
> accurately describes what the code does.
> Could anyone please read nimbus.clj#cleanup-corrupt-topologies and 
> explain under what conditions Nimbus acts this way?
> https://github.com/apache/storm/blob/v0.9.2-incubating/storm-core/src/clj/backtype/storm/daemon/nimbus.clj#L854
> Log file:
> 2014-10-01 10:47:19 b.s.d.nimbus [INFO] Corrupt topology 
> my-topology-1-2-1412151059 has state on zookeeper but doesn't have a local 
> dir on Nimbus. Cleaning up...
> 2014-10-01 10:47:19 b.s.d.nimbus [INFO] Corrupt topology 
> my-topology-0-1-1412151059 has state on zookeeper but doesn't have a local 
> dir on Nimbus. Cleaning up...
> 2014-10-01 10:47:19 b.s.d.nimbus [INFO] Corrupt topology 
> my-topology-3-4-1412151062 has state on zookeeper but doesn't have a local 
> dir on Nimbus. Cleaning up...
> 2014-10-01 10:47:19 b.s.d.nimbus [INFO] Corrupt topology 
> my-topology-2-3-1412151060 has state on zookeeper but doesn't have a local 
> dir on Nimbus. Cleaning up...
> 2014-10-01 10:47:19 b.s.d.nimbus [INFO] Starting Nimbus server...
> 2014-10-01 10:47:20 b.s.d.nimbus [INFO] Cleaning up my-topology-1-2-1412151059
> 2014-10-01 10:47:20 b.s.d.nimbus [INFO] Cleaning up my-topology-0-1-1412151059
> 2014-10-01 10:47:20 b.s.d.nimbus [INFO] Cleaning up my-topology-3-4-1412151062
> 2014-10-01 10:47:20 b.s.d.nimbus [INFO] Cleaning up my-topology-2-3-1412151060
> 2014-10-01 10:52:16 b.s.d.nimbus [INFO] Shutting down master
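
For readers following the question about cleanup-corrupt-topologies: the log
message above describes a set-difference check, that is, topology ids that
have state in ZooKeeper minus topology ids that have a local directory on
Nimbus. The Python below is only a minimal sketch of that description, not
the Clojure code in nimbus.clj. The function name find_corrupt_topologies
and its parameters are hypothetical; the ids are copied from the log excerpt.

```python
def find_corrupt_topologies(zookeeper_topology_ids, local_topology_dirs):
    """Return ids that have state in ZooKeeper but no local directory.

    Hypothetical helper: mirrors the wording of the log message only,
    not the actual implementation in nimbus.clj.
    """
    return sorted(set(zookeeper_topology_ids) - set(local_topology_dirs))


# Topology ids taken from the log excerpt in the report.
in_zookeeper = [
    "my-topology-1-2-1412151059",
    "my-topology-0-1-1412151059",
    "my-topology-3-4-1412151062",
    "my-topology-2-3-1412151060",
]

# Assumed case 1: every topology still has its local directory.
all_present = find_corrupt_topologies(in_zookeeper, in_zookeeper)

# Assumed case 2: the local directory listing comes back empty.
none_present = find_corrupt_topologies(in_zookeeper, [])

print("flagged when local dirs exist:", all_present)
print("flagged when local dirs are missing:", none_present)
```

If the real check works the way the message describes, an empty or missing
local directory listing would flag every topology at once, which would be
consistent with the four entries logged just before "Starting Nimbus
server...". The report does not establish whether that is what happened here.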



