Jay Sen created GOBBLIN-1205:
--------------------------------

             Summary: restarting gobblin on yarn fails with error
                 Key: GOBBLIN-1205
                 URL: https://issues.apache.org/jira/browse/GOBBLIN-1205
             Project: Apache Gobblin
          Issue Type: Bug
    Affects Versions: 0.15.0
            Reporter: Jay Sen
             Fix For: 0.15.0

Restarting Gobblin deployed in YARN mode occasionally fails at startup with the following error. Possibly the path is still held by the previous process, in which case a short delay between stop and start may be enough.

{code:java}
WARN [ZKHelixAdmin] Root directory exists.Cleaning the root directory:/GobblinYarnHelixApp
WARN [ZKHelixAdmin] Root directory exists.Cleaning the root directory:/GobblinYarnHelixApp
WARN [ZkClient] Failed to delete path /GobblinYarnHelixApp/CONTROLLER! org.I0Itec.zkclient.exception.ZkException: org.apache.zookeeper.KeeperException$NotEmptyException: KeeperErrorCode = Directory not empty for /GobblinYarnHelixApp/CONTROLLER
ERROR [ZkClient] Failed to delete /GobblinYarnHelixApp/CONTROLLER
org.I0Itec.zkclient.exception.ZkException: org.apache.zookeeper.KeeperException$NotEmptyException: KeeperErrorCode = Directory not empty for /GobblinYarnHelixApp/CONTROLLER
    at org.I0Itec.zkclient.exception.ZkException.create(ZkException.java:68)
    at org.apache.helix.manager.zk.zookeeper.ZkClient.retryUntilConnected(ZkClient.java:1160)
    at org.apache.helix.manager.zk.zookeeper.ZkClient.delete(ZkClient.java:1215)
    at org.apache.helix.manager.zk.zookeeper.ZkClient.deleteRecursively(ZkClient.java:949)
    at org.apache.helix.manager.zk.zookeeper.ZkClient.deleteRecursively(ZkClient.java:942)
    at org.apache.helix.manager.zk.ZKHelixAdmin.addCluster(ZKHelixAdmin.java:698)
    at org.apache.helix.tools.ClusterSetup.addCluster(ClusterSetup.java:162)
    at org.apache.gobblin.cluster.HelixUtils.createGobblinHelixCluster(HelixUtils.java:96)
    at org.apache.gobblin.yarn.GobblinYarnAppLauncher.launch(GobblinYarnAppLauncher.java:337)
    at org.apache.gobblin.yarn.GobblinYarnAppLauncher.main(GobblinYarnAppLauncher.java:1067)
Caused by: org.apache.zookeeper.KeeperException$NotEmptyException: KeeperErrorCode = Directory not empty for /GobblinYarnHelixApp/CONTROLLER
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:128)
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:54)
    at org.apache.zookeeper.ZooKeeper.delete(ZooKeeper.java:882)
    at org.apache.helix.manager.zk.zookeeper.ZkConnection.delete(ZkConnection.java:119)
    at org.apache.helix.manager.zk.zookeeper.ZkClient$9.call(ZkClient.java:1219)
    at org.apache.helix.manager.zk.zookeeper.ZkClient.retryUntilConnected(ZkClient.java:1150)
    ... 8 more

==> logs/yarn.err <==
Exception in thread "main" org.apache.helix.HelixException: Failed to delete /GobblinYarnHelixApp/CONTROLLER
    at org.apache.helix.manager.zk.zookeeper.ZkClient.deleteRecursively(ZkClient.java:952)
    at org.apache.helix.manager.zk.zookeeper.ZkClient.deleteRecursively(ZkClient.java:942)
    at org.apache.helix.manager.zk.ZKHelixAdmin.addCluster(ZKHelixAdmin.java:698)
    at org.apache.helix.tools.ClusterSetup.addCluster(ClusterSetup.java:162)
    at org.apache.gobblin.cluster.HelixUtils.createGobblinHelixCluster(HelixUtils.java:96)
    at org.apache.gobblin.yarn.GobblinYarnAppLauncher.launch(GobblinYarnAppLauncher.java:337)
    at org.apache.gobblin.yarn.GobblinYarnAppLauncher.main(GobblinYarnAppLauncher.java:1067)
Caused by: org.I0Itec.zkclient.exception.ZkException: org.apache.zookeeper.KeeperException$NotEmptyException: KeeperErrorCode = Directory not empty for /GobblinYarnHelixApp/CONTROLLER
    at org.I0Itec.zkclient.exception.ZkException.create(ZkException.java:68)
    at org.apache.helix.manager.zk.zookeeper.ZkClient.retryUntilConnected(ZkClient.java:1160)
    at org.apache.helix.manager.zk.zookeeper.ZkClient.delete(ZkClient.java:1215)
    at org.apache.helix.manager.zk.zookeeper.ZkClient.deleteRecursively(ZkClient.java:949)
    ... 6 more
Caused by: org.apache.zookeeper.KeeperException$NotEmptyException: KeeperErrorCode = Directory not empty for /GobblinYarnHelixApp/CONTROLLER
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:128)
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:54)
    at org.apache.zookeeper.ZooKeeper.delete(ZooKeeper.java:882)
    at org.apache.helix.manager.zk.zookeeper.ZkConnection.delete(ZkConnection.java:119)
    at org.apache.helix.manager.zk.zookeeper.ZkClient$9.call(ZkClient.java:1219)
    at org.apache.helix.manager.zk.zookeeper.ZkClient.retryUntilConnected(ZkClient.java:1150)
    ... 8 more
Exception in thread "Thread-6" org.apache.helix.HelixException: HelixManager (ZkClient) is not connected. Call HelixManager#connect()
    at org.apache.helix.manager.zk.ZKHelixManager.checkConnected(ZKHelixManager.java:363)
    at org.apache.helix.manager.zk.ZKHelixManager.getClusterManagmentTool(ZKHelixManager.java:908)
    at org.apache.gobblin.yarn.GobblinYarnAppLauncher.disableLiveHelixInstances(GobblinYarnAppLauncher.java:533)
    at org.apache.gobblin.yarn.GobblinYarnAppLauncher.stop(GobblinYarnAppLauncher.java:436)
    at org.apache.gobblin.yarn.GobblinYarnAppLauncher$2.run(GobblinYarnAppLauncher.java:1054)
{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
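
The reporter's guess that a restart only needs some time between stop and start suggests retrying the cluster setup with a growing delay. The following is a minimal sketch of that idea, not the fix adopted for this issue. `RetryWithBackoff`, `retry`, `maxAttempts` and `initialDelayMs` are hypothetical names that exist in neither Gobblin nor Helix, and the sketch assumes the failure surfaces as a `RuntimeException`.

```java
import java.util.function.Supplier;

/** Hypothetical helper for illustration; not part of Gobblin or Helix. */
final class RetryWithBackoff {

    private RetryWithBackoff() {}

    /**
     * Runs the action and retries it when it throws a RuntimeException.
     * The wait starts at initialDelayMs and doubles after each failed attempt.
     * After maxAttempts failures the last exception is rethrown.
     */
    static <T> T retry(Supplier<T> action, int maxAttempts, long initialDelayMs) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        long delayMs = initialDelayMs;
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.get();
            } catch (RuntimeException e) {
                last = e;
                if (attempt == maxAttempts) {
                    break; // no point in sleeping after the final attempt
                }
                try {
                    Thread.sleep(delayMs);
                } catch (InterruptedException ie) {
                    // Restore the interrupt flag and stop retrying.
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("Interrupted while waiting to retry", ie);
                }
                delayMs *= 2;
            }
        }
        throw last;
    }
}
```

A launcher could, for example, wrap the `HelixUtils.createGobblinHelixCluster` call seen in the trace in a supplier passed to `RetryWithBackoff.retry`. Whether waiting is sufficient, or whether the leftover `CONTROLLER` node has to be cleaned up explicitly, is not established by this report.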