[
https://issues.apache.org/jira/browse/STORM-1940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15367450#comment-15367450
]
happylu commented on STORM-1940:
--------------------------------
[~kabhwan]
I debugged the code and found that "if(_curator.checkExists().forPath(path)!=null) {"
only checks whether "/meta" exists; it does not check whether "path+ser" exists.
That is what causes this exception.
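To make the idea concrete, here is a minimal sketch of a "set or create" guard that checks the full node path and tolerates a node that already exists. It is not the Storm code and does not use the real Curator API: `InMemoryStore`, `setOrCreate` and the local `NodeExistsException` are hypothetical stand-ins I made up so the example can run without a ZooKeeper server.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; none of these names come from Storm or Curator.
class SetOrCreateSketch {

    /** Local stand-in for ZooKeeper's NodeExistsException. */
    static class NodeExistsException extends Exception {
        NodeExistsException(String path) {
            super("NodeExists for " + path);
        }
    }

    /** Minimal in-memory stand-in for a ZooKeeper client. */
    static class InMemoryStore {
        private final Map<String, byte[]> nodes = new HashMap<>();

        boolean exists(String path) {
            return nodes.containsKey(path);
        }

        void create(String path, byte[] data) throws NodeExistsException {
            if (nodes.containsKey(path)) {
                throw new NodeExistsException(path);
            }
            nodes.put(path, data);
        }

        void setData(String path, byte[] data) {
            if (!nodes.containsKey(path)) {
                throw new IllegalStateException("no node at " + path);
            }
            nodes.put(path, data);
        }

        byte[] get(String path) {
            return nodes.get(path);
        }
    }

    /** Writes data to the full node path, creating the node only if it is absent. */
    static void setOrCreate(InMemoryStore store, String path, byte[] data) {
        // Check the full node path (e.g. /meta/712285), not only its parent.
        if (store.exists(path)) {
            store.setData(path, data);
            return;
        }
        try {
            store.create(path, data);
        } catch (NodeExistsException e) {
            // The node was already there (for example, the existence check was
            // stale after a reconnect): overwrite it instead of failing.
            store.setData(path, data);
        }
    }
}
```

With a guard like this, a second write to the same path overwrites the existing node rather than throwing, so the executor would not die on NodeExists. Whether silently overwriting is acceptable for the transactional state is a separate question.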
> Storm Topo is auto re-balance after ZK RECONNECTED
> --------------------------------------------------
>
> Key: STORM-1940
> URL: https://issues.apache.org/jira/browse/STORM-1940
> Project: Apache Storm
> Issue Type: Bug
> Affects Versions: 1.0.1
> Reporter: happylu
> Priority: Critical
> Attachments: others.zip, worker1.zip, worker2.zip
>
>
> I have a topology with 2 workers on 2 VMs. When the ZK connection state
> changes to RECONNECTED, the topology is automatically rebalanced.
> The log shows NodeExists for /meta/712285. I guess the cause is that, after
> reconnecting successfully, TridentSpoutCoordinator creates this node again,
> although the node was already created before the reconnect.
> Can we check whether the node exists first? Or avoid throwing this
> exception, which makes the whole topology rebalance?
> {code}
> 06-29 05:54:37.515
> [Thread-151-$spoutcoord-spout-DataKafkaSpout1466801942228-executor[4
> 4]-SendThread(ip-10-9-255-26.us-west-2.compute.internal:2181)]
> shade.org.apache.zookeeper.ClientCnxn [INFO] Session establishment complete
> on server ip-10-9-255-26.us-west-2.compute.internal/10.9.255.26:2181,
> sessionid = 0x7a556eeee8c70ae1, negotiated timeout = 10000
> 06-29 05:54:37.515
> [Thread-151-$spoutcoord-spout-DataKafkaSpout1466801942228-executor[4
> 4]-EventThread] apache.curator.framework.state.ConnectionStateManager [INFO]
> State change: RECONNECTED
> 06-29 05:54:37.519 [Thread-133-spout-DataKafkaSpout1466801942228-executor[154
> 154]-SendThread(ip-10-9-255-26.us-west-2.compute.internal:2181)]
> org.apache.zookeeper.ClientCnxn [INFO] Session establishment complete on
> server ip-10-9-255-26.us-west-2.compute.internal/10.9.255.26:2181, sessionid
> = 0x7a556eeee8c70ae5, negotiated timeout = 10000
> 06-29 05:54:37.519 [Thread-133-spout-DataKafkaSpout1466801942228-executor[154
> 154]-EventThread] org.I0Itec.zkclient.ZkClient [INFO] zookeeper state changed
> (SyncConnected)
> 06-29 05:54:37.524 [Thread-25-spout-DataKafkaSpout1466801942228-executor[156
> 156]-SendThread(ip-10-9-255-26.us-west-2.compute.internal:2181)]
> org.apache.zookeeper.ClientCnxn [INFO] Session establishment complete on
> server ip-10-9-255-26.us-west-2.compute.internal/10.9.255.26:2181, sessionid
> = 0x7a556eeee8c70ae4, negotiated timeout = 10000
> 06-29 05:54:37.524 [Thread-25-spout-DataKafkaSpout1466801942228-executor[156
> 156]-EventThread] org.I0Itec.zkclient.ZkClient [INFO] zookeeper state changed
> (SyncConnected)
> 06-29 05:54:37.528
> [main-SendThread(ip-10-9-255-26.us-west-2.compute.internal:2181)]
> shade.org.apache.zookeeper.ClientCnxn [INFO] Session establishment complete
> on server ip-10-9-255-26.us-west-2.compute.internal/10.9.255.26:2181,
> sessionid = 0x7b556f0cc3a40896, negotiated timeout = 10000
> 06-29 05:54:37.528 [main-EventThread]
> apache.curator.framework.state.ConnectionStateManager [INFO] State change:
> RECONNECTED
> 06-29 05:54:37.528 [Thread-149-spout-DataKafkaSpout1466801942228-executor[160
> 160]-SendThread(ip-10-9-255-26.us-west-2.compute.internal:2181)]
> org.apache.zookeeper.ClientCnxn [INFO] Session establishment complete on
> server ip-10-9-255-26.us-west-2.compute.internal/10.9.255.26:2181, sessionid
> = 0x7a556eeee8c70ae3, negotiated timeout = 10000
> 06-29 05:54:37.528 [Thread-149-spout-DataKafkaSpout1466801942228-executor[160
> 160]-EventThread] org.I0Itec.zkclient.ZkClient [INFO] zookeeper state changed
> (SyncConnected)
> 06-29 05:54:37.536
> [Thread-151-$spoutcoord-spout-DataKafkaSpout1466801942228-executor[4 4]]
> org.apache.storm.util [ERROR] Async loop died!
> java.lang.RuntimeException: java.lang.RuntimeException:
> org.apache.storm.shade.org.apache.zookeeper.KeeperException$NodeExistsException:
> KeeperErrorCode = NodeExists for /meta/712285
> at
> org.apache.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:452)
> ~[storm-core-1.0.1.jar:1.0.1]
> at
> org.apache.storm.utils.DisruptorQueue.consumeBatchWhenAvailable(DisruptorQueue.java:418)
> ~[storm-core-1.0.1.jar:1.0.1]
> at
> org.apache.storm.disruptor$consume_batch_when_available.invoke(disruptor.clj:73)
> ~[storm-core-1.0.1.jar:1.0.1]
> at
> org.apache.storm.daemon.executor$fn__7953$fn__7966$fn__8019.invoke(executor.clj:847)
> ~[storm-core-1.0.1.jar:1.0.1]
> at org.apache.storm.util$async_loop$fn__625.invoke(util.clj:484)
> [storm-core-1.0.1.jar:1.0.1]
> at clojure.lang.AFn.run(AFn.java:22) [clojure-1.7.0.jar:?]
> at java.lang.Thread.run(Thread.java:745) [?:1.7.0_80]
> Caused by: java.lang.RuntimeException:
> org.apache.storm.shade.org.apache.zookeeper.KeeperException$NodeExistsException:
> KeeperErrorCode = NodeExists for /meta/712285
> at
> org.apache.storm.trident.topology.state.TransactionalState.setData(TransactionalState.java:119)
> ~[storm-core-1.0.1.jar:1.0.1]
> at
> org.apache.storm.trident.topology.state.RotatingTransactionalState.overrideState(RotatingTransactionalState.java:52)
> ~[storm-core-1.0.1.jar:1.0.1]
> at
> org.apache.storm.trident.spout.TridentSpoutCoordinator.execute(TridentSpoutCoordinator.java:71)
> ~[storm-core-1.0.1.jar:1.0.1]
> at
> org.apache.storm.topology.BasicBoltExecutor.execute(BasicBoltExecutor.java:50)
> ~[storm-core-1.0.1.jar:1.0.1]
> at
> org.apache.storm.daemon.executor$fn__7953$tuple_action_fn__7955.invoke(executor.clj:728)
> ~[storm-core-1.0.1.jar:1.0.1]
> at
> org.apache.storm.daemon.executor$mk_task_receiver$fn__7874.invoke(executor.clj:461)
> ~[storm-core-1.0.1.jar:1.0.1]
> at
> org.apache.storm.disruptor$clojure_handler$reify__7390.onEvent(disruptor.clj:40)
> ~[storm-core-1.0.1.jar:1.0.1]
> at
> org.apache.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:439)
> ~[storm-core-1.0.1.jar:1.0.1]
> ... 6 more
> Caused by:
> org.apache.storm.shade.org.apache.zookeeper.KeeperException$NodeExistsException:
> KeeperErrorCode = NodeExists for /meta/712285
> at
> org.apache.storm.shade.org.apache.zookeeper.KeeperException.create(KeeperException.java:119)
> ~[storm-core-1.0.1.jar:1.0.1]
> at
> org.apache.storm.shade.org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
> ~[storm-core-1.0.1.jar:1.0.1]
> at
> org.apache.storm.shade.org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:783)
> ~[storm-core-1.0.1.jar:1.0.1]
> at
> org.apache.storm.shade.org.apache.curator.framework.imps.CreateBuilderImpl$11.call(CreateBuilderImpl.java:721)
> ~[storm-core-1.0.1.jar:1.0.1]
> at
> org.apache.storm.shade.org.apache.curator.framework.imps.CreateBuilderImpl$11.call(CreateBuilderImpl.java:704)
> ~[storm-core-1.0.1.jar:1.0.1]
> at
> org.apache.storm.shade.org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:108)
> ~[storm-core-1.0.1.jar:1.0.1]
> at
> org.apache.storm.shade.org.apache.curator.framework.imps.CreateBuilderImpl.pathInForeground(CreateBuilderImpl.java:701)
> ~[storm-core-1.0.1.jar:1.0.1]
> at
> org.apache.storm.shade.org.apache.curator.framework.imps.CreateBuilderImpl.protectedPathInForeground(CreateBuilderImpl.java:477)
> ~[storm-core-1.0.1.jar:1.0.1]
> at
> org.apache.storm.shade.org.apache.curator.framework.imps.CreateBuilderImpl.forPath(CreateBuilderImpl.java:467)
> ~[storm-core-1.0.1.jar:1.0.1]
> at
> org.apache.storm.shade.org.apache.curator.framework.imps.CreateBuilderImpl.forPath(CreateBuilderImpl.java:44)
> ~[storm-core-1.0.1.jar:1.0.1]
> at
> org.apache.storm.trident.topology.state.TransactionalState.forPath(TransactionalState.java:83)
> ~[storm-core-1.0.1.jar:1.0.1]
> at
> org.apache.storm.trident.topology.state.TransactionalState.createNode(TransactionalState.java:95)
> ~[storm-core-1.0.1.jar:1.0.1]
> at
> org.apache.storm.trident.topology.state.TransactionalState.setData(TransactionalState.java:115)
> ~[storm-core-1.0.1.jar:1.0.1]
> at
> org.apache.storm.trident.topology.state.RotatingTransactionalState.overrideState(RotatingTransactionalState.java:52)
> ~[storm-core-1.0.1.jar:1.0.1]
> at
> org.apache.storm.trident.spout.TridentSpoutCoordinator.execute(TridentSpoutCoordinator.java:71)
> ~[storm-core-1.0.1.jar:1.0.1]
> at
> org.apache.storm.topology.BasicBoltExecutor.execute(BasicBoltExecutor.java:50)
> ~[storm-core-1.0.1.jar:1.0.1]
> at
> org.apache.storm.daemon.executor$fn__7953$tuple_action_fn__7955.invoke(executor.clj:728)
> ~[storm-core-1.0.1.jar:1.0.1]
> at
> org.apache.storm.daemon.executor$mk_task_receiver$fn__7874.invoke(executor.clj:461)
> ~[storm-core-1.0.1.jar:1.0.1]
> at
> org.apache.storm.disruptor$clojure_handler$reify__7390.onEvent(disruptor.clj:40)
> ~[storm-core-1.0.1.jar:1.0.1]
> at
> org.apache.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:439)
> ~[storm-core-1.0.1.jar:1.0.1]
> ... 6 more
> {code}