Hi Lei,

We use 0.8.2. We initially had 0.8.4, but it has an issue with the task retry logic, so we downgraded to 0.8.2. We plan to go into production with 0.8.2 next week, so can you please advise on a better way to solve this without upgrading to 0.8.4?

Thanks
Dimuthu

On Fri, May 31, 2019 at 1:04 PM Lei Xia <[email protected]> wrote:

> Which Helix version do you use? This may be caused by this ZooKeeper bug
> (https://issues.apache.org/jira/browse/ZOOKEEPER-706). We have upgraded
> ZkClient in later Helix versions.
>
> Lei
>
> On Fri, May 31, 2019 at 7:52 AM DImuthu Upeksha <[email protected]> wrote:
>
>> Hi Folks,
>>
>> I'm getting the following error in the controller log, and it seems the
>> controller is not moving forward after that point:
>>
>> 2019-05-31 10:47:37,084 [main] INFO o.a.a.h.i.c.HelixController - Starting helix controller
>> 2019-05-31 10:47:37,089 [main] INFO o.a.a.c.u.ApplicationSettings - Settings loaded from file:/home/airavata/staging-deployment/airavata-helix/apache-airavata-controller-0.18-SNAPSHOT/conf/airavata-server.properties
>> 2019-05-31 10:47:37,091 [Thread-0] INFO o.a.a.h.i.c.HelixController - Connection to helix cluster : AiravataDemoCluster with name : helixcontroller2
>> 2019-05-31 10:47:37,092 [Thread-0] INFO o.a.a.h.i.c.HelixController - Zookeeper connection string localhost:2181
>> 2019-05-31 10:47:42,907 [GenericHelixController-event_process] ERROR o.a.h.c.GenericHelixController - Exception while executing DEFAULTpipeline: org.apache.helix.controller.pipeline.Pipeline@408d6d26for cluster .AiravataDemoCluster. Will not continue to next pipeline
>> org.apache.helix.api.exceptions.HelixMetaDataAccessException: Failed to get full list of /AiravataDemoCluster/CONFIGS/PARTICIPANT
>>     at org.apache.helix.manager.zk.ZkBaseDataAccessor.getChildren(ZkBaseDataAccessor.java:446)
>>     at org.apache.helix.manager.zk.ZKHelixDataAccessor.getChildValues(ZKHelixDataAccessor.java:406)
>>     at org.apache.helix.manager.zk.ZKHelixDataAccessor.getChildValuesMap(ZKHelixDataAccessor.java:467)
>>     at org.apache.helix.controller.stages.ClusterDataCache.refresh(ClusterDataCache.java:176)
>>     at org.apache.helix.controller.stages.ReadClusterDataStage.process(ReadClusterDataStage.java:62)
>>     at org.apache.helix.controller.pipeline.Pipeline.handle(Pipeline.java:63)
>>     at org.apache.helix.controller.GenericHelixController.handleEvent(GenericHelixController.java:432)
>>     at org.apache.helix.controller.GenericHelixController$ClusterEventProcessor.run(GenericHelixController.java:928)
>> Caused by: org.apache.helix.api.exceptions.HelixMetaDataAccessException: Fail to read nodes for [/AiravataDemoCluster/CONFIGS/PARTICIPANT/helixparticipant]
>>     at org.apache.helix.manager.zk.ZkBaseDataAccessor.get(ZkBaseDataAccessor.java:414)
>>     at org.apache.helix.manager.zk.ZkBaseDataAccessor.getChildren(ZkBaseDataAccessor.java:479)
>>     at org.apache.helix.manager.zk.ZkBaseDataAccessor.getChildren(ZkBaseDataAccessor.java:442)
>>     ... 7 common frames omitted
>>
>> In the ZooKeeper log I can see the following warning being printed
>> continuously. What could be the reason for that? I'm using Helix 0.8.2 and
>> ZooKeeper 3.4.8.
>>
>> 2019-05-31 10:49:37,621 [myid:] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1008] - Closed socket connection for client /0:0:0:0:0:0:0:1:59056 which had sessionid 0x16b0e59877f0000
>> 2019-05-31 10:49:37,773 [myid:] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@192] - Accepted socket connection from /127.0.0.1:57984
>> 2019-05-31 10:49:37,774 [myid:] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@893] - Client attempting to renew session 0x16b0e59877f0000 at /127.0.0.1:57984
>> 2019-05-31 10:49:37,774 [myid:] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@645] - Established session 0x16b0e59877f0000 with negotiated timeout 30000 for client /127.0.0.1:57984
>> 2019-05-31 10:49:37,790 [myid:] - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@357] - caught end of stream exception
>> EndOfStreamException: Unable to read additional data from client sessionid 0x16b0e59877f0000, likely client has closed socket
>>     at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:230)
>>     at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:203)
>>     at java.lang.Thread.run(Thread.java:748)
>>
>> Thanks
>> Dimuthu
>
> --
> Lei Xia
