Hi all,

We are testing a failure recovery scenario in which many resources
span many participants. We shut down all participants and Helix
admins, wait a while, then add each participant back into the cluster.
(ZooKeeper runs on a separate cluster and is not affected by the
shutdown.) During the recovery, the controller seems to generate too
many messages, and there are many exceptions. Below are some examples.

Are these exceptions expected? Any comments would be highly appreciated. Thanks.


[ERROR 2017-04-05 14:26:17,734 org.apache.helix.manager.zk.ZkBaseDataAccessor:303] Exception while updating path: /yy_cluster_name/INSTANCES/P60505029461/ERRORS/1002d87a25a0589/USER_DEFINE_MSG/15ae0bd8-101f-4af3-acc3-36a486af4f4c
org.I0Itec.zkclient.exception.ZkInterruptedException: java.lang.InterruptedException
       at org.I0Itec.zkclient.ZkClient.retryUntilConnected(ZkClient.java:687)
       at org.apache.helix.manager.zk.ZkClient.readData(ZkClient.java:240)
       at org.I0Itec.zkclient.ZkClient.readData(ZkClient.java:761)
       at org.apache.helix.manager.zk.ZkBaseDataAccessor.doUpdate(ZkBaseDataAccessor.java:273)
       at org.apache.helix.manager.zk.ZkBaseDataAccessor.update(ZkBaseDataAccessor.java:245)
       at org.apache.helix.manager.zk.ZKHelixDataAccessor.updateProperty(ZKHelixDataAccessor.java:150)
       at org.apache.helix.util.StatusUpdateUtil.publishErrorRecord(StatusUpdateUtil.java:501)
       at org.apache.helix.util.StatusUpdateUtil.publishStatusUpdateRecord(StatusUpdateUtil.java:435)
       at org.apache.helix.util.StatusUpdateUtil.logMessageStatusUpdateRecord(StatusUpdateUtil.java:334)
       at org.apache.helix.util.StatusUpdateUtil.logError(StatusUpdateUtil.java:342)
       at org.apache.helix.messaging.handling.HelixTask.call(HelixTask.java:163)
       at org.apache.helix.messaging.handling.HelixTask.call(HelixTask.java:42)
       at java.util.concurrent.FutureTask.run(FutureTask.java:266)
       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
       at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.InterruptedException
       at java.lang.Object.wait(Native Method)
       at java.lang.Object.wait(Object.java:502)
       at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1344)
       at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:925)
       at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:956)
       at org.I0Itec.zkclient.ZkConnection.readData(ZkConnection.java:103)
       at org.apache.helix.manager.zk.ZkClient$4.call(ZkClient.java:244)
       at org.apache.helix.manager.zk.ZkClient$4.call(ZkClient.java:240)
       at org.I0Itec.zkclient.ZkClient.retryUntilConnected(ZkClient.java:675)
       ... 15 more


[ERROR 2017-04-05 14:26:17,676 org.apache.helix.messaging.handling.HelixTask:162] Exception after executing a message, msgId: 35e73c64-8fd3-4fb8-b0b8-419eacfa91a0org.I0Itec.zkclient.exception.ZkInterruptedException: java.lang.InterruptedException
org.I0Itec.zkclient.exception.ZkInterruptedException: java.lang.InterruptedException
       at org.I0Itec.zkclient.ZkClient.retryUntilConnected(ZkClient.java:687)
       at org.apache.helix.manager.zk.ZkClient.getChildren(ZkClient.java:212)
       at org.I0Itec.zkclient.ZkClient.deleteRecursive(ZkClient.java:505)
       at org.apache.helix.manager.zk.ZkBaseDataAccessor.remove(ZkBaseDataAccessor.java:537)
       at org.apache.helix.manager.zk.ZKHelixDataAccessor.removeProperty(ZKHelixDataAccessor.java:271)
       at org.apache.helix.messaging.handling.HelixTask.removeMessageFromZk(HelixTask.java:187)
       at org.apache.helix.messaging.handling.HelixTask.call(HelixTask.java:150)
       at org.apache.helix.messaging.handling.HelixTask.call(HelixTask.java:42)
       at java.util.concurrent.FutureTask.run(FutureTask.java:266)
       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
       at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.InterruptedException
       at java.lang.Object.wait(Native Method)
       at java.lang.Object.wait(Object.java:502)
       at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1344)
       at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:1247)
       at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:1277)
       at org.I0Itec.zkclient.ZkConnection.getChildren(ZkConnection.java:99)
       at org.apache.helix.manager.zk.ZkClient$3.call(ZkClient.java:215)
       at org.apache.helix.manager.zk.ZkClient$3.call(ZkClient.java:212)
       at org.I0Itec.zkclient.ZkClient.retryUntilConnected(ZkClient.java:675)
       ... 11 more

[ERROR 2017-04-05 14:26:18,321 org.apache.helix.messaging.handling.HelixTask:102] Exception while executing a message. java.lang.NullPointerException msgId: eb79d5b0-4210-479e-b010-076764199059 type: USER_DEFINE_MSG
java.lang.NullPointerException
       at com.hcd.hcdadmin.CustomMessageHandlerFactory$CustomMessageHandler.handleMessage(CustomMessageHandlerFactory.java:149)
       at org.apache.helix.messaging.handling.HelixTask.call(HelixTask.java:85)
       at org.apache.helix.messaging.handling.HelixTask.call(HelixTask.java:42)
       at java.util.concurrent.FutureTask.run(FutureTask.java:266)
       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
       at java.lang.Thread.run(Thread.java:745)

[ERROR 2017-04-05 14:26:15,958 org.apache.helix.messaging.handling.HelixTask:102] Exception while executing a message. java.lang.NullPointerException msgId: 5b05cd82-c596-4329-8484-28ac3ef40e80 type: USER_DEFINE_MSG
java.lang.NullPointerException
       at com.hcd.hcdadmin.CustomMessageHandlerFactory$CustomMessageHandler.handleMessage(CustomMessageHandlerFactory.java:149)
       at org.apache.helix.messaging.handling.HelixTask.call(HelixTask.java:85)
       at org.apache.helix.messaging.handling.HelixTask.call(HelixTask.java:42)
       at java.util.concurrent.FutureTask.run(FutureTask.java:266)
       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
       at java.lang.Thread.run(Thread.java:745)
