Jean-Daniel Cryans created HBASE-8561:
-----------------------------------------

Summary: [replication] Don't instantiate a ReplicationSource if the passed implementation isn't found
Key: HBASE-8561
URL: https://issues.apache.org/jira/browse/HBASE-8561
Project: HBase
Issue Type: Improvement
Affects Versions: 0.94.6.1
Reporter: Jean-Daniel Cryans
Fix For: 0.98.0, 0.94.8, 0.95.2

I was debugging a case where the region servers were dying with:

{noformat}
ABORTING region server someserver.com,60020,1368123702806: Writing replication status
org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /hbase/replication/rs/someserver.com,60020,1368123702806/etcetcetc/somserver.com%2C60020%2C1368123702740.1368123705091
        at org.apache.zookeeper.KeeperException.create(KeeperException.java:111)
        at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
        at org.apache.zookeeper.ZooKeeper.setData(ZooKeeper.java:1266)
        at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.setData(RecoverableZooKeeper.java:354)
        at org.apache.hadoop.hbase.zookeeper.ZKUtil.setData(ZKUtil.java:846)
        at org.apache.hadoop.hbase.zookeeper.ZKUtil.setData(ZKUtil.java:898)
        at org.apache.hadoop.hbase.zookeeper.ZKUtil.setData(ZKUtil.java:892)
        at org.apache.hadoop.hbase.replication.ReplicationZookeeper.writeReplicationStatus(ReplicationZookeeper.java:558)
        at org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager.logPositionAndCleanOldLogs(ReplicationSourceManager.java:154)
        at org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.shipEdits(ReplicationSource.java:638)
        at org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.run(ReplicationSource.java:387)
{noformat}

Turns out the problem really was:

{noformat}
2013-05-09 11:21:45,625 WARN org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager: Passed replication source implementation throws errors, defaulting to ReplicationSource
java.lang.ClassNotFoundException: Some.Other.ReplicationSource.Implementation
        at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
        at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:186)
        at org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager.getReplicationSource(ReplicationSourceManager.java:324)
        at org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager.addSource(ReplicationSourceManager.java:202)
        at org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager.init(ReplicationSourceManager.java:174)
        at org.apache.hadoop.hbase.replication.regionserver.Replication.startReplicationService(Replication.java:171)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.startServiceThreads(HRegionServer.java:1583)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.handleReportForDutyResponse(HRegionServer.java:1042)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:698)
        at java.lang.Thread.run(Thread.java:722)
{noformat}

So I think instantiating a ReplicationSource here is wrong and makes it harder to debug.
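
For illustration only, here is a minimal sketch of the fail-fast behavior this issue argues for: if the configured implementation cannot be loaded, surface the error instead of silently falling back to a default. This is not the HBase code or the actual patch; `SourceLoader`, `Source`, `DefaultSource`, and `load` are hypothetical names invented for this example, and only standard Java reflection calls are used.

```java
// Hypothetical sketch, not HBase code: every name here is invented for illustration.
final class SourceLoader {

    /** Stand-in for a pluggable replication source interface. */
    interface Source {
        String describe();
    }

    /** Stand-in for the default implementation. */
    public static class DefaultSource implements Source {
        public DefaultSource() {}

        @Override
        public String describe() {
            return "default";
        }
    }

    /**
     * Loads the named implementation reflectively. If the class is missing,
     * is not a Source, or cannot be constructed, this throws instead of
     * falling back to DefaultSource, so the misconfiguration is reported
     * where it happens rather than showing up later as an unrelated failure.
     */
    static Source load(String implName) {
        try {
            Class<?> clazz = Class.forName(implName);
            return clazz.asSubclass(Source.class)
                        .getDeclaredConstructor()
                        .newInstance();
        } catch (ReflectiveOperationException | ClassCastException e) {
            throw new IllegalStateException(
                "Cannot instantiate replication source implementation: " + implName, e);
        }
    }
}
```

With this shape, a class name like the one in the WARN message above would cause an immediate, clearly labeled exception at startup, with the `ClassNotFoundException` attached as the cause.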