Eric Newton created ACCUMULO-4038:
-------------------------------------

             Summary: recovery failure (no node for WAL)
                 Key: ACCUMULO-4038
                 URL: https://issues.apache.org/jira/browse/ACCUMULO-4038
             Project: Accumulo
          Issue Type: Bug
          Components: master
         Environment: 20 node AWS cluster
            Reporter: Eric Newton
            Assignee: Eric Newton
             Fix For: 1.8.0

I was testing the new agitator, and the system failed to make progress after this error in the master log:

{noformat}
2015-10-23 18:17:41,212 [master.Master] ERROR: Error processing table state for store Normal Tablets
org.apache.accumulo.server.log.WalStateManager$WalMarkerException: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /accumulo/046bd887-d524-4fcb-b70f-3bdbdbcb3bab/wals/worker06:9997[35095adcb800018]
	at org.apache.accumulo.server.log.WalStateManager.getWalsInUse(WalStateManager.java:150)
	at org.apache.accumulo.master.TabletGroupWatcher.run(TabletGroupWatcher.java:253)
Caused by: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /accumulo/046bd887-d524-4fcb-b70f-3bdbdbcb3bab/wals/worker06:9997[35095adcb800018]
	at org.apache.zookeeper.KeeperException.create(KeeperException.java:111)
	at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
	at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:1472)
	at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:1500)
	at org.apache.accumulo.fate.zookeeper.ZooReader.getChildren(ZooReader.java:151)
	at org.apache.accumulo.server.log.WalStateManager.getWalsInUse(WalStateManager.java:143)
	... 1 more
{noformat}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)