[ https://issues.apache.org/jira/browse/HBASE-8561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Andrew Purtell updated HBASE-8561:
----------------------------------
    Fix Version/s:     (was: 0.98.0)

No patch, moving out of 0.98

> [replication] Don't instantiate a ReplicationSource if the passed implementation isn't found
> --------------------------------------------------------------------------------------------
>
>                 Key: HBASE-8561
>                 URL: https://issues.apache.org/jira/browse/HBASE-8561
>             Project: HBase
>          Issue Type: Improvement
>    Affects Versions: 0.94.6.1
>            Reporter: Jean-Daniel Cryans
>
> I was debugging a case where the region servers were dying with:
> {noformat}
> ABORTING region server someserver.com,60020,1368123702806: Writing replication status
> org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /hbase/replication/rs/someserver.com,60020,1368123702806/etcetcetc/somserver.com%2C60020%2C1368123702740.1368123705091
>         at org.apache.zookeeper.KeeperException.create(KeeperException.java:111)
>         at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
>         at org.apache.zookeeper.ZooKeeper.setData(ZooKeeper.java:1266)
>         at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.setData(RecoverableZooKeeper.java:354)
>         at org.apache.hadoop.hbase.zookeeper.ZKUtil.setData(ZKUtil.java:846)
>         at org.apache.hadoop.hbase.zookeeper.ZKUtil.setData(ZKUtil.java:898)
>         at org.apache.hadoop.hbase.zookeeper.ZKUtil.setData(ZKUtil.java:892)
>         at org.apache.hadoop.hbase.replication.ReplicationZookeeper.writeReplicationStatus(ReplicationZookeeper.java:558)
>         at org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager.logPositionAndCleanOldLogs(ReplicationSourceManager.java:154)
>         at org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.shipEdits(ReplicationSource.java:638)
>         at org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.run(ReplicationSource.java:387)
> {noformat}
> Turns out the problem really was:
> {noformat}
> 2013-05-09 11:21:45,625 WARN org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager: Passed replication source implementation throws errors, defaulting to ReplicationSource
> java.lang.ClassNotFoundException: Some.Other.ReplicationSource.Implementation
>         at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>         at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>         at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
>         at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
>         at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
>         at java.lang.Class.forName0(Native Method)
>         at java.lang.Class.forName(Class.java:186)
>         at org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager.getReplicationSource(ReplicationSourceManager.java:324)
>         at org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager.addSource(ReplicationSourceManager.java:202)
>         at org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager.init(ReplicationSourceManager.java:174)
>         at org.apache.hadoop.hbase.replication.regionserver.Replication.startReplicationService(Replication.java:171)
>         at org.apache.hadoop.hbase.regionserver.HRegionServer.startServiceThreads(HRegionServer.java:1583)
>         at org.apache.hadoop.hbase.regionserver.HRegionServer.handleReportForDutyResponse(HRegionServer.java:1042)
>         at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:698)
>         at java.lang.Thread.run(Thread.java:722)
> {noformat}
> So I think instantiating a ReplicationSource here is wrong and makes it harder to debug.

--
This message was sent by Atlassian JIRA
(v6.1#6144)
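
For illustration only, the fail-fast behavior the report asks for might look roughly like the sketch below. It is not HBase code: SourceLoader, Source, and DefaultSource are hypothetical stand-ins for the source manager, the replication source interface, and the default ReplicationSource. The idea is to surface the ClassNotFoundException together with the configured class name, rather than log a warning and substitute the default.

```java
// Hypothetical sketch; these names are illustrative and are not HBase's API.
final class SourceLoader {

    /** Stand-in for the replication source interface (assumption). */
    interface Source {}

    /** Stand-in for the default implementation (assumption). */
    static final class DefaultSource implements Source {
        public DefaultSource() {}
    }

    private SourceLoader() {}

    /**
     * Loads the configured implementation by class name. If the class is
     * missing, is not a Source, or cannot be constructed, this throws with
     * the class name in the message instead of falling back to DefaultSource.
     */
    static Source load(String className) {
        try {
            Class<?> clazz = Class.forName(className);
            return clazz.asSubclass(Source.class)
                        .getDeclaredConstructor()
                        .newInstance();
        } catch (ReflectiveOperationException | ClassCastException e) {
            // Keep the original exception as the cause so the real problem
            // appears at the point of failure, not in a later, unrelated abort.
            throw new IllegalStateException(
                "Cannot load replication source implementation '" + className + "'", e);
        }
    }
}
```

With this shape, a misconfigured class name fails at startup with the offending name in the exception, instead of showing up later as an apparently unrelated ZooKeeper NoNode abort. Whether a region server should refuse to start or only skip the affected peer is a separate design decision that the report does not settle.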