[ https://issues.apache.org/jira/browse/STORM-526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Rick Kellogg updated STORM-526:
-------------------------------
Component/s: storm-core
> Nimbus triggered complete removal of all topologies due to maintenance in 2
> out of 3 zookeeper servers
> ------------------------------------------------------------------------------------------------------
>
> Key: STORM-526
> URL: https://issues.apache.org/jira/browse/STORM-526
> Project: Apache Storm
> Issue Type: Bug
> Components: storm-core
> Affects Versions: 0.9.2-incubating
> Environment: AWS EC2 ubuntu
> Reporter: Itai Frenkel
>
> We use a cluster of 3 ZooKeeper servers; all 3 IP addresses are in the
> storm.yaml file. We restarted one ZooKeeper server and, once it was ready,
> restarted the second. Throughout, the third ZooKeeper server was "green" (as
> monitored by Netflix Exhibitor).
> At the same time, Nimbus "decided" to remove all topologies (the log entry is
> "Corrupt topology my-topology-xxx has state on zookeeper but doesn't have a
> local dir on Nimbus. Cleaning up...").
> I looked at the relevant code, and I am not entirely sure the log message
> accurately describes what the code does.
> Could anyone please read nimbus.clj#cleanup-corrupt-topologies and
> explain under what conditions Nimbus acts this way?
> https://github.com/apache/storm/blob/v0.9.2-incubating/storm-core/src/clj/backtype/storm/daemon/nimbus.clj#L854
> Log file:
> 2014-10-01 10:47:19 b.s.d.nimbus [INFO] Corrupt topology my-topology-1-2-1412151059 has state on zookeeper but doesn't have a local dir on Nimbus. Cleaning up...
> 2014-10-01 10:47:19 b.s.d.nimbus [INFO] Corrupt topology my-topology-0-1-1412151059 has state on zookeeper but doesn't have a local dir on Nimbus. Cleaning up...
> 2014-10-01 10:47:19 b.s.d.nimbus [INFO] Corrupt topology my-topology-3-4-1412151062 has state on zookeeper but doesn't have a local dir on Nimbus. Cleaning up...
> 2014-10-01 10:47:19 b.s.d.nimbus [INFO] Corrupt topology my-topology-2-3-1412151060 has state on zookeeper but doesn't have a local dir on Nimbus. Cleaning up...
> 2014-10-01 10:47:19 b.s.d.nimbus [INFO] Starting Nimbus server...
> 2014-10-01 10:47:20 b.s.d.nimbus [INFO] Cleaning up my-topology-1-2-1412151059
> 2014-10-01 10:47:20 b.s.d.nimbus [INFO] Cleaning up my-topology-0-1-1412151059
> 2014-10-01 10:47:20 b.s.d.nimbus [INFO] Cleaning up my-topology-3-4-1412151062
> 2014-10-01 10:47:20 b.s.d.nimbus [INFO] Cleaning up my-topology-2-3-1412151060
> 2014-10-01 10:52:16 b.s.d.nimbus [INFO] Shutting down master
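
Going only by the wording of the log message quoted above, the condition
appears to be a set difference: every topology ID that has state in ZooKeeper
but no matching local directory on Nimbus is flagged as corrupt and removed.
The following is a minimal sketch of that reading, written in Python rather
than Clojure. It is not taken from nimbus.clj, and the function names and the
way the two ID lists are obtained are assumptions made for illustration.

```python
def find_corrupt_topologies(zk_topology_ids, local_topology_ids):
    """Return IDs present in ZooKeeper state but missing a local dir.

    Hypothetical helper: both arguments are plain iterables of topology
    ID strings; how Nimbus actually collects them is not shown here.
    """
    return sorted(set(zk_topology_ids) - set(local_topology_ids))


def cleanup_messages(zk_topology_ids, local_topology_ids):
    """Build the log lines that would be emitted for each flagged topology."""
    return [
        f"Corrupt topology {tid} has state on zookeeper but doesn't have "
        "a local dir on Nimbus. Cleaning up..."
        for tid in find_corrupt_topologies(zk_topology_ids, local_topology_ids)
    ]


# Topology IDs copied from the log excerpt in the report.
zk_ids = [
    "my-topology-0-1-1412151059",
    "my-topology-1-2-1412151059",
    "my-topology-2-3-1412151060",
    "my-topology-3-4-1412151062",
]

# Assumed scenario: Nimbus sees no local topology directories at startup.
local_ids = []

for line in cleanup_messages(zk_ids, local_ids):
    print(line)
```

Under this reading, a Nimbus that starts up and finds no local topology
directories at all would flag every topology listed in ZooKeeper, which would
be consistent with all four topologies being removed at once.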
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)