[ https://issues.apache.org/jira/browse/HBASE-3515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13135570#comment-13135570 ]
Ted Yu commented on HBASE-3515:
-------------------------------

+1 to stopping region server if this situation happens.

> [replication] ReplicationSource can miss a log after RS comes out of GC
> -----------------------------------------------------------------------
>
>                 Key: HBASE-3515
>                 URL: https://issues.apache.org/jira/browse/HBASE-3515
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.90.0
>            Reporter: Jean-Daniel Cryans
>            Assignee: Jean-Daniel Cryans
>            Priority: Critical
>             Fix For: 0.92.0
>
>         Attachments: HBASE-3515.patch
>
>
> This is from Hudson build 1738, if a log is about to be rolled and the ZK
> connection is already closed then the replication code will fail at adding
> the new log in ZK but the log will still be rolled and it's possible that
> some edits will make it in.
> From the log:
> {quote}
> 2011-02-08 10:21:20,618 FATAL
> [RegionServer:0;vesta.apache.org,46117,1297160399378.logRoller]
> regionserver.HRegionServer(1383):
> ABORTING region server serverName=vesta.apache.org,46117,1297160399378,
> load=(requests=1525, regions=12,
> usedHeap=273, maxHeap=1244): Failed add log to list
> org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode
> = ConnectionLoss for
>
> /1/replication/rs/vesta.apache.org,46117,1297160399378/2/vesta.apache.org%3A46117.1297160480509
> ...
> 2011-02-08 10:21:22,444 DEBUG
> [MASTER_META_SERVER_OPERATIONS-vesta.apache.org:56008-0]
> wal.HLogSplitter(258):
> Splitting hlog 8 of 8:
> hdfs://localhost:55474/user/hudson/.logs/vesta.apache.org,46117,1297160399378/vesta.apache.org%3A46117.1297160480509,
> length=0
> 2011-02-08 10:21:22,862 DEBUG
> [MASTER_META_SERVER_OPERATIONS-vesta.apache.org:56008-0]
> wal.HLogSplitter(436):
> Pushed=31 entries from
> hdfs://localhost:55474/user/hudson/.logs/vesta.apache.org,46117,1297160399378/vesta.apache.org%3A46117.1297160480509
> {quote}
> The easiest thing to do would be let the exception out and cancel the log
> roll.
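
For illustration only, here is a minimal sketch of the "let the exception out and cancel the log roll" idea from the description. It is not the attached patch and does not use HBase or ZooKeeper classes: `LogRollSketch`, `LogRegistry`, `Wal`, and `rollTo` are hypothetical names, and the failing registry just simulates a closed ZK connection. The point it tries to show is the ordering: record the new log first, and switch logs only if that succeeded.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class LogRollSketch {

    /** Hypothetical stand-in for the replication code that records a log in ZK. */
    interface LogRegistry {
        void addLog(String logName) throws IOException;
    }

    /** Hypothetical stand-in for the write-ahead log; it only tracks log names. */
    static class Wal {
        private final LogRegistry registry;
        private final List<String> closedLogs = new ArrayList<>();
        private String currentLog;

        Wal(LogRegistry registry, String initialLog) {
            this.registry = registry;
            this.currentLog = initialLog;
        }

        /**
         * Registers the new log before switching to it. If registration
         * throws, the exception propagates and the roll never happens,
         * so no edits can land in an unregistered log.
         */
        void rollTo(String newLog) throws IOException {
            registry.addLog(newLog);    // may throw; no state has changed yet
            closedLogs.add(currentLog); // reached only after successful registration
            currentLog = newLog;
        }

        String currentLog() {
            return currentLog;
        }
    }

    public static void main(String[] args) {
        // Simulated registry whose connection is already closed.
        LogRegistry closedConnection = name -> {
            throw new IOException("ConnectionLoss for " + name);
        };
        Wal wal = new Wal(closedConnection, "log.1");
        try {
            wal.rollTo("log.2");
        } catch (IOException e) {
            System.out.println("roll cancelled: " + e.getMessage());
        }
        System.out.println("current log: " + wal.currentLog());
    }
}
```

Whether the caller then retries the roll or stops the region server, as suggested above, is a separate decision that this sketch leaves to the code catching the exception.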