[ 
https://issues.apache.org/jira/browse/STORM-1940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jungtaek Lim updated STORM-1940:
--------------------------------

Removing this from the EPIC, since the issue does not seem to be clearly stated.
Once it is clearly stated and easy to fix, I'll re-include it in the EPIC.

> Storm Topo is auto re-balance after ZK RECONNECTED
> --------------------------------------------------
>
>                 Key: STORM-1940
>                 URL: https://issues.apache.org/jira/browse/STORM-1940
>             Project: Apache Storm
>          Issue Type: Bug
>    Affects Versions: 1.0.1
>            Reporter: happylu
>            Priority: Critical
>
> I have a topology with 2 workers on 2 VMs. When the ZooKeeper connection 
> state changes to RECONNECTED, the topology is automatically rebalanced.
> The log shows NodeExists for /meta/712285. I guess the cause is that, after 
> reconnecting successfully, TridentSpoutCoordinator creates this node again, 
> but the node was already created before the reconnect.
> Can we check whether the node exists first? Or avoid throwing this exception, 
> which makes the whole topology rebalance?
> {code}
> 06-29 05:54:37.515 
> [Thread-151-$spoutcoord-spout-DataKafkaSpout1466801942228-executor[4 
> 4]-SendThread(ip-10-9-255-26.us-west-2.compute.internal:2181)] 
> shade.org.apache.zookeeper.ClientCnxn [INFO] Session establishment complete 
> on server ip-10-9-255-26.us-west-2.compute.internal/10.9.255.26:2181, 
> sessionid = 0x7a556eeee8c70ae1, negotiated timeout = 10000
> 06-29 05:54:37.515 
> [Thread-151-$spoutcoord-spout-DataKafkaSpout1466801942228-executor[4 
> 4]-EventThread] apache.curator.framework.state.ConnectionStateManager [INFO] 
> State change: RECONNECTED
> 06-29 05:54:37.519 [Thread-133-spout-DataKafkaSpout1466801942228-executor[154 
> 154]-SendThread(ip-10-9-255-26.us-west-2.compute.internal:2181)] 
> org.apache.zookeeper.ClientCnxn [INFO] Session establishment complete on 
> server ip-10-9-255-26.us-west-2.compute.internal/10.9.255.26:2181, sessionid 
> = 0x7a556eeee8c70ae5, negotiated timeout = 10000
> 06-29 05:54:37.519 [Thread-133-spout-DataKafkaSpout1466801942228-executor[154 
> 154]-EventThread] org.I0Itec.zkclient.ZkClient [INFO] zookeeper state changed 
> (SyncConnected)
> 06-29 05:54:37.524 [Thread-25-spout-DataKafkaSpout1466801942228-executor[156 
> 156]-SendThread(ip-10-9-255-26.us-west-2.compute.internal:2181)] 
> org.apache.zookeeper.ClientCnxn [INFO] Session establishment complete on 
> server ip-10-9-255-26.us-west-2.compute.internal/10.9.255.26:2181, sessionid 
> = 0x7a556eeee8c70ae4, negotiated timeout = 10000
> 06-29 05:54:37.524 [Thread-25-spout-DataKafkaSpout1466801942228-executor[156 
> 156]-EventThread] org.I0Itec.zkclient.ZkClient [INFO] zookeeper state changed 
> (SyncConnected)
> 06-29 05:54:37.528 
> [main-SendThread(ip-10-9-255-26.us-west-2.compute.internal:2181)] 
> shade.org.apache.zookeeper.ClientCnxn [INFO] Session establishment complete 
> on server ip-10-9-255-26.us-west-2.compute.internal/10.9.255.26:2181, 
> sessionid = 0x7b556f0cc3a40896, negotiated timeout = 10000
> 06-29 05:54:37.528 [main-EventThread] 
> apache.curator.framework.state.ConnectionStateManager [INFO] State change: 
> RECONNECTED
> 06-29 05:54:37.528 [Thread-149-spout-DataKafkaSpout1466801942228-executor[160 
> 160]-SendThread(ip-10-9-255-26.us-west-2.compute.internal:2181)] 
> org.apache.zookeeper.ClientCnxn [INFO] Session establishment complete on 
> server ip-10-9-255-26.us-west-2.compute.internal/10.9.255.26:2181, sessionid 
> = 0x7a556eeee8c70ae3, negotiated timeout = 10000
> 06-29 05:54:37.528 [Thread-149-spout-DataKafkaSpout1466801942228-executor[160 
> 160]-EventThread] org.I0Itec.zkclient.ZkClient [INFO] zookeeper state changed 
> (SyncConnected)
> 06-29 05:54:37.536 
> [Thread-151-$spoutcoord-spout-DataKafkaSpout1466801942228-executor[4 4]] 
> org.apache.storm.util [ERROR] Async loop died!
> java.lang.RuntimeException: java.lang.RuntimeException: 
> org.apache.storm.shade.org.apache.zookeeper.KeeperException$NodeExistsException:
>  KeeperErrorCode = NodeExists for /meta/712285
>       at 
> org.apache.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:452)
>  ~[storm-core-1.0.1.jar:1.0.1]
>       at 
> org.apache.storm.utils.DisruptorQueue.consumeBatchWhenAvailable(DisruptorQueue.java:418)
>  ~[storm-core-1.0.1.jar:1.0.1]
>       at 
> org.apache.storm.disruptor$consume_batch_when_available.invoke(disruptor.clj:73)
>  ~[storm-core-1.0.1.jar:1.0.1]
>       at 
> org.apache.storm.daemon.executor$fn__7953$fn__7966$fn__8019.invoke(executor.clj:847)
>  ~[storm-core-1.0.1.jar:1.0.1]
>       at org.apache.storm.util$async_loop$fn__625.invoke(util.clj:484) 
> [storm-core-1.0.1.jar:1.0.1]
>       at clojure.lang.AFn.run(AFn.java:22) [clojure-1.7.0.jar:?]
>       at java.lang.Thread.run(Thread.java:745) [?:1.7.0_80]
> Caused by: java.lang.RuntimeException: 
> org.apache.storm.shade.org.apache.zookeeper.KeeperException$NodeExistsException:
>  KeeperErrorCode = NodeExists for /meta/712285
>       at 
> org.apache.storm.trident.topology.state.TransactionalState.setData(TransactionalState.java:119)
>  ~[storm-core-1.0.1.jar:1.0.1]
>       at 
> org.apache.storm.trident.topology.state.RotatingTransactionalState.overrideState(RotatingTransactionalState.java:52)
>  ~[storm-core-1.0.1.jar:1.0.1]
>       at 
> org.apache.storm.trident.spout.TridentSpoutCoordinator.execute(TridentSpoutCoordinator.java:71)
>  ~[storm-core-1.0.1.jar:1.0.1]
>       at 
> org.apache.storm.topology.BasicBoltExecutor.execute(BasicBoltExecutor.java:50)
>  ~[storm-core-1.0.1.jar:1.0.1]
>       at 
> org.apache.storm.daemon.executor$fn__7953$tuple_action_fn__7955.invoke(executor.clj:728)
>  ~[storm-core-1.0.1.jar:1.0.1]
>       at 
> org.apache.storm.daemon.executor$mk_task_receiver$fn__7874.invoke(executor.clj:461)
>  ~[storm-core-1.0.1.jar:1.0.1]
>       at 
> org.apache.storm.disruptor$clojure_handler$reify__7390.onEvent(disruptor.clj:40)
>  ~[storm-core-1.0.1.jar:1.0.1]
>       at 
> org.apache.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:439)
>  ~[storm-core-1.0.1.jar:1.0.1]
>       ... 6 more
> Caused by: 
> org.apache.storm.shade.org.apache.zookeeper.KeeperException$NodeExistsException:
>  KeeperErrorCode = NodeExists for /meta/712285
>       at 
> org.apache.storm.shade.org.apache.zookeeper.KeeperException.create(KeeperException.java:119)
>  ~[storm-core-1.0.1.jar:1.0.1]
>       at 
> org.apache.storm.shade.org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
>  ~[storm-core-1.0.1.jar:1.0.1]
>       at 
> org.apache.storm.shade.org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:783)
>  ~[storm-core-1.0.1.jar:1.0.1]
>       at 
> org.apache.storm.shade.org.apache.curator.framework.imps.CreateBuilderImpl$11.call(CreateBuilderImpl.java:721)
>  ~[storm-core-1.0.1.jar:1.0.1]
>       at 
> org.apache.storm.shade.org.apache.curator.framework.imps.CreateBuilderImpl$11.call(CreateBuilderImpl.java:704)
>  ~[storm-core-1.0.1.jar:1.0.1]
>       at 
> org.apache.storm.shade.org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:108)
>  ~[storm-core-1.0.1.jar:1.0.1]
>       at 
> org.apache.storm.shade.org.apache.curator.framework.imps.CreateBuilderImpl.pathInForeground(CreateBuilderImpl.java:701)
>  ~[storm-core-1.0.1.jar:1.0.1]
>       at 
> org.apache.storm.shade.org.apache.curator.framework.imps.CreateBuilderImpl.protectedPathInForeground(CreateBuilderImpl.java:477)
>  ~[storm-core-1.0.1.jar:1.0.1]
>       at 
> org.apache.storm.shade.org.apache.curator.framework.imps.CreateBuilderImpl.forPath(CreateBuilderImpl.java:467)
>  ~[storm-core-1.0.1.jar:1.0.1]
>       at 
> org.apache.storm.shade.org.apache.curator.framework.imps.CreateBuilderImpl.forPath(CreateBuilderImpl.java:44)
>  ~[storm-core-1.0.1.jar:1.0.1]
>       at 
> org.apache.storm.trident.topology.state.TransactionalState.forPath(TransactionalState.java:83)
>  ~[storm-core-1.0.1.jar:1.0.1]
>       at 
> org.apache.storm.trident.topology.state.TransactionalState.createNode(TransactionalState.java:95)
>  ~[storm-core-1.0.1.jar:1.0.1]
>       at 
> org.apache.storm.trident.topology.state.TransactionalState.setData(TransactionalState.java:115)
>  ~[storm-core-1.0.1.jar:1.0.1]
>       at 
> org.apache.storm.trident.topology.state.RotatingTransactionalState.overrideState(RotatingTransactionalState.java:52)
>  ~[storm-core-1.0.1.jar:1.0.1]
>       at 
> org.apache.storm.trident.spout.TridentSpoutCoordinator.execute(TridentSpoutCoordinator.java:71)
>  ~[storm-core-1.0.1.jar:1.0.1]
>       at 
> org.apache.storm.topology.BasicBoltExecutor.execute(BasicBoltExecutor.java:50)
>  ~[storm-core-1.0.1.jar:1.0.1]
>       at 
> org.apache.storm.daemon.executor$fn__7953$tuple_action_fn__7955.invoke(executor.clj:728)
>  ~[storm-core-1.0.1.jar:1.0.1]
>       at 
> org.apache.storm.daemon.executor$mk_task_receiver$fn__7874.invoke(executor.clj:461)
>  ~[storm-core-1.0.1.jar:1.0.1]
>       at 
> org.apache.storm.disruptor$clojure_handler$reify__7390.onEvent(disruptor.clj:40)
>  ~[storm-core-1.0.1.jar:1.0.1]
>       at 
> org.apache.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:439)
>  ~[storm-core-1.0.1.jar:1.0.1]
>       ... 6 more
> {code}
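
For illustration only, here is a minimal sketch of the second suggestion in the
description: treat NodeExists on create as non-fatal and fall back to
overwriting the data. It is not Storm's TransactionalState code and does not use
the Curator or ZooKeeper APIs. InMemoryStore, createOrUpdate and both exception
classes are hypothetical stand-ins, so that the example runs on its own.

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class IdempotentNodeWrite {

    /** Hypothetical stand-in for ZooKeeper's NodeExists error. */
    static final class NodeExistsException extends RuntimeException {
        NodeExistsException(String path) { super("NodeExists for " + path); }
    }

    /** Hypothetical stand-in for ZooKeeper's NoNode error. */
    static final class NoNodeException extends RuntimeException {
        NoNodeException(String path) { super("NoNode for " + path); }
    }

    /** Hypothetical in-memory store that mimics create/setData semantics. */
    static final class InMemoryStore {
        private final Map<String, byte[]> nodes = new ConcurrentHashMap<>();

        /** Fails if the path already exists, like a plain create. */
        void create(String path, byte[] data) {
            if (nodes.putIfAbsent(path, data) != null) {
                throw new NodeExistsException(path);
            }
        }

        /** Fails if the path does not exist, like a plain setData. */
        void setData(String path, byte[] data) {
            if (nodes.replace(path, data) == null) {
                throw new NoNodeException(path);
            }
        }

        byte[] getData(String path) {
            return nodes.get(path);
        }
    }

    /**
     * Writes data to path whether or not the node already exists.
     * A NodeExists failure is handled by overwriting instead of
     * propagating, so repeating the same write is harmless.
     */
    static void createOrUpdate(InMemoryStore store, String path, byte[] data) {
        try {
            store.create(path, data);
        } catch (NodeExistsException e) {
            store.setData(path, data);
        }
    }

    public static void main(String[] args) {
        InMemoryStore store = new InMemoryStore();
        byte[] first = "txid-1".getBytes(StandardCharsets.UTF_8);
        byte[] second = "txid-2".getBytes(StandardCharsets.UTF_8);

        createOrUpdate(store, "/meta/712285", first);
        // A second write to the same path no longer throws.
        createOrUpdate(store, "/meta/712285", second);

        System.out.println(
            new String(store.getData("/meta/712285"), StandardCharsets.UTF_8));
    }
}
```

Catching the failure on create, rather than only checking for existence
beforehand, also covers the case where the node appears between the check and
the create. Whether that is what happens in this report is an assumption; the
log above only shows that the create failed with NodeExists.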



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
