[ https://issues.apache.org/jira/browse/HBASE-13724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Purtell resolved HBASE-13724.
------------------------------------
    Resolution: Not A Problem

Resolving as Not A Problem because there hasn't been any progress and we can't recommend running in production with -ea. Never mind HBase code, what else is out there waiting to be tripped. No problem to reopen when and if there's a patch available for review.

> ReplicationSource dies under certain conditions reading a sequence file
> -----------------------------------------------------------------------
>
>                 Key: HBASE-13724
>                 URL: https://issues.apache.org/jira/browse/HBASE-13724
>             Project: HBase
>          Issue Type: Bug
>            Reporter: churro morales
>
> A little background,
> We run our server in -ea mode and have seen quite a few replication sources
> silently die over the past few months.
> Note: the stacktrace I posted below comes from a regionserver running 0.94
> but quickly looking at this issue, I believe this will happen in 98 too.
> Should we harden replication source to deal with these types of assertion
> errors by catching throwables, should we be dealing with this at the sequence
> file reader level? Still looking into the root cause of this issue but when
> manually shutdown our regionservers the regionserver that recovered its queue
> replicated that log just fine. So in our case a simple retry would've worked
> just fine.
> {code}
> 2015-05-08 11:04:23,348 ERROR org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: Unexpected exception in ReplicationSource, currentPath=hdfs://hm6.xxx.flurry.com:9000/hbase/.logs/xxxxx.yy.flurry.com,60020,1426792702998/xxxxx.atl.flurry.com%2C60020%2C1426792702998.1431107922449
> java.lang.AssertionError
>         at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader$WALReader$WALReaderFSDataInputStream.getPos(SequenceFileLogReader.java:121)
>         at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1489)
>         at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1479)
>         at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1474)
>         at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader$WALReader.<init>(SequenceFileLogReader.java:55)
>         at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader.init(SequenceFileLogReader.java:178)
>         at org.apache.hadoop.hbase.regionserver.wal.HLog.getReader(HLog.java:734)
>         at org.apache.hadoop.hbase.replication.regionserver.ReplicationHLogReaderManager.openReader(ReplicationHLogReaderManager.java:69)
>         at org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.openReader(ReplicationSource.java:583)
>         at org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.run(ReplicationSource.java:373)
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
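
For anyone picking this up later, the "catch throwables and retry" idea from the description could look roughly like the sketch below. This is not HBase code: `RetryingOpen`, `openWithRetry`, and the `Callable` standing in for the reader-opening call are hypothetical names used only for illustration. What it shows is that `AssertionError` is a `java.lang.Error`, not an `Exception`, so a `catch (Exception e)` block alone would not intercept it; it has to be caught explicitly if a retry is wanted.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

// Hypothetical illustration only; none of these names exist in HBase.
final class RetryingOpen {

    /**
     * Calls opener up to maxAttempts times. An AssertionError (an Error,
     * not an Exception) or an IOException triggers a retry; any other
     * exception propagates unchanged.
     */
    static <T> T openWithRetry(Callable<T> opener, int maxAttempts) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        Throwable last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return opener.call();
            } catch (AssertionError | IOException e) {
                // Remember the failure and try again instead of letting the thread die.
                last = e;
                System.err.println("open attempt " + attempt + " failed: " + e);
            }
        }
        throw new IOException("giving up after " + maxAttempts + " attempts", last);
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Stand-in for opening a log reader: fails once the way a tripped
        // assertion would, then succeeds on the second call.
        String reader = openWithRetry(() -> {
            if (calls[0]++ == 0) {
                throw new AssertionError("simulated failed assertion");
            }
            return "reader";
        }, 3);
        System.out.println("opened: " + reader + " after " + calls[0] + " calls");
    }
}
```

Whether catching an `Error` this way is a good idea is a separate question; as the resolution above notes, running production with -ea is not something we can recommend in the first place.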