[ https://issues.apache.org/jira/browse/HBASE-28076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17765761#comment-17765761 ]
Karthik Palanisamy edited comment on HBASE-28076 at 9/15/23 6:09 PM:
---------------------------------------------------------------------

Thanks [~stoty].

[~zhangduo] Aside from this particular issue, there is another race condition that results in a NullPointerException and brings down the RegionServer. The exact cause of this NPE is currently unknown. It appears to be attempting to access a queue (or an entry in it) that either no longer exists or was removed before it could be accessed.

{code:java}
2023-09-10 20:02:35,365 ERROR org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: Unexpected exception in ReplicationExecutor .. .. ..
java.lang.NullPointerException
    at org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager.cleanOldLogs(ReplicationSourceManager.java:563)
    at org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager.logPositionAndCleanOldLogs(ReplicationSourceManager.java:549)
    at org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceInterface.logPositionAndCleanOldLogs(ReplicationSourceInterface.java:202)
    at org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceShipper.updateLogPosition(ReplicationSourceShipper.java:269)
    at org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceShipper.shipEdits(ReplicationSourceShipper.java:163)
    at org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceShipper.run(ReplicationSourceShipper.java:119)
{code}


was (Author: kpalanisamy):
Thanks [~stoty].

[~zhangduo] Aside from this particular issue, there is another race condition occurring that is resulting in a NullPointerException. The exact cause of this NPE is currently unknown. It appears to be attempting to access a queue that either no longer exists or has been removed from the queue before it can be accessed.
{code:java}
2023-09-10 20:02:35,365 ERROR org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: Unexpected exception in ReplicationExecutor .. .. ..
java.lang.NullPointerException
    at org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager.cleanOldLogs(ReplicationSourceManager.java:563)
    at org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager.logPositionAndCleanOldLogs(ReplicationSourceManager.java:549)
    at org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceInterface.logPositionAndCleanOldLogs(ReplicationSourceInterface.java:202)
    at org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceShipper.updateLogPosition(ReplicationSourceShipper.java:269)
    at org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceShipper.shipEdits(ReplicationSourceShipper.java:163)
    at org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceShipper.run(ReplicationSourceShipper.java:119)
{code}

> NPE on initialization error in RecoveredReplicationSourceShipper
> ----------------------------------------------------------------
>
>                 Key: HBASE-28076
>                 URL: https://issues.apache.org/jira/browse/HBASE-28076
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 2.6.0, 2.4.17, 2.5.5
>            Reporter: Istvan Toth
>            Assignee: Istvan Toth
>            Priority: Minor
>             Fix For: 2.6.0, 2.4.18, 2.5.6
>
>
> When we run into problems starting RecoveredReplicationSourceShipper, we try
> to stop the reader thread which we haven't initialized yet, resulting in an
> NPE.
> {noformat}
> ERROR org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: Unexpected exception in redacted currentPath=hdfs://redacted
> java.lang.NullPointerException
>     at org.apache.hadoop.hbase.replication.regionserver.RecoveredReplicationSourceShipper.terminate(RecoveredReplicationSourceShipper.java:100)
>     at org.apache.hadoop.hbase.replication.regionserver.RecoveredReplicationSourceShipper.getRecoveredQueueStartPos(RecoveredReplicationSourceShipper.java:87)
>     at org.apache.hadoop.hbase.replication.regionserver.RecoveredReplicationSourceShipper.getStartPosition(RecoveredReplicationSourceShipper.java:62)
>     at org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.lambda$tryStartNewShipper$3(ReplicationSource.java:349)
>     at java.util.concurrent.ConcurrentHashMap.compute(ConcurrentHashMap.java:1853)
>     at org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.tryStartNewShipper(ReplicationSource.java:341)
>     at org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.initialize(ReplicationSource.java:601)
>     at java.lang.Thread.run(Thread.java:750)
> {noformat}
> A simple null check should fix this.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
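
For illustration only, the null check proposed in the issue description might look roughly like the sketch below. This is not the actual HBase patch: SketchReader, SketchShipper, and their methods are hypothetical, simplified stand-ins, and the only assumption carried over from the report is that terminate() can run before the reader has been initialized.

```java
// Hypothetical, simplified stand-ins; these are NOT the real HBase classes.
class SketchReader {
    private volatile boolean running = true;

    void stopReading() {
        running = false;
    }

    boolean isRunning() {
        return running;
    }
}

class SketchShipper {
    // Stays null until startReader() succeeds, which is the state the
    // report describes when initialization fails early.
    private volatile SketchReader reader;
    private volatile boolean terminated = false;

    void startReader() {
        reader = new SketchReader();
    }

    void terminate(String reason) {
        terminated = true;
        // Copy to a local so the check and the call see the same reference.
        SketchReader r = reader;
        if (r != null) {
            // Only stop the reader if it was actually created.
            r.stopReading();
        }
    }

    boolean isTerminated() {
        return terminated;
    }

    SketchReader getReader() {
        return reader;
    }
}

class NullGuardDemo {
    public static void main(String[] args) {
        SketchShipper shipper = new SketchShipper();
        // Simulates a failure before the reader exists; without the guard
        // this call would throw a NullPointerException.
        shipper.terminate("initialization failed");
        System.out.println("terminated=" + shipper.isTerminated());
    }
}
```

Without the guard, terminate() would dereference the uninitialized reader, which matches the top frame of the quoted trace. The second NPE in cleanOldLogs is a separate problem and is not addressed by this sketch.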