[ 
https://issues.apache.org/jira/browse/HBASE-28076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17765761#comment-17765761
 ] 

Karthik Palanisamy edited comment on HBASE-28076 at 9/15/23 6:09 PM:
---------------------------------------------------------------------

Thanks [~stoty]. 

[~zhangduo] Aside from this particular issue, there is another race condition 
that results in a NullPointerException and brings down the RegionServer. The 
exact cause of this NPE is currently unknown. The code appears to be accessing 
a queue that either no longer exists or was removed before it could be 
accessed.
{code:java}
2023-09-10 20:02:35,365 ERROR org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: Unexpected exception in ReplicationExecutor
..
..
..
java.lang.NullPointerException
        at org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager.cleanOldLogs(ReplicationSourceManager.java:563)
        at org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager.logPositionAndCleanOldLogs(ReplicationSourceManager.java:549)
        at org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceInterface.logPositionAndCleanOldLogs(ReplicationSourceInterface.java:202)
        at org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceShipper.updateLogPosition(ReplicationSourceShipper.java:269)
        at org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceShipper.shipEdits(ReplicationSourceShipper.java:163)
        at org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceShipper.run(ReplicationSourceShipper.java:119)
 {code}
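
To illustrate the suspected failure mode, here is a minimal sketch of how a cleanup call can hit an NPE when its queue has already been removed by another thread, and how a null guard avoids it. This is not the actual ReplicationSourceManager code: the class QueueCleanupSketch, the field walsByQueue, and all method names are hypothetical stand-ins, and the real cause of the NPE at ReplicationSourceManager.java:563 is still unknown.

```java
import java.util.Map;
import java.util.NavigableSet;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentSkipListSet;

// Hypothetical names throughout; this is NOT the HBase implementation.
final class QueueCleanupSketch {
    // queueId -> WAL names still tracked for that queue
    private final Map<String, NavigableSet<String>> walsByQueue = new ConcurrentHashMap<>();

    void addWal(String queueId, String wal) {
        walsByQueue.computeIfAbsent(queueId, k -> new ConcurrentSkipListSet<>()).add(wal);
    }

    // Simulates another thread dropping the whole queue (e.g. peer removal).
    void removeQueue(String queueId) {
        walsByQueue.remove(queueId);
    }

    // Unguarded: throws NullPointerException if the queue is already gone.
    int cleanOldLogsUnguarded(String queueId, String currentWal) {
        NavigableSet<String> wals = walsByQueue.get(queueId);
        NavigableSet<String> older = wals.headSet(currentWal, false); // NPE when wals == null
        int removed = older.size();
        older.clear();
        return removed;
    }

    // Guarded: treats a missing queue as "nothing to clean".
    int cleanOldLogs(String queueId, String currentWal) {
        NavigableSet<String> wals = walsByQueue.get(queueId);
        if (wals == null) {
            return 0; // queue was removed concurrently; skip cleanup
        }
        NavigableSet<String> older = wals.headSet(currentWal, false);
        int removed = older.size();
        older.clear(); // clearing the view removes the entries from the backing set
        return removed;
    }
}
```

Whether silently skipping the cleanup is the right behaviour depends on why the queue disappeared, so a guard like this would only mask the race rather than explain it.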


> NPE on initialization error in RecoveredReplicationSourceShipper
> ----------------------------------------------------------------
>
>                 Key: HBASE-28076
>                 URL: https://issues.apache.org/jira/browse/HBASE-28076
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 2.6.0, 2.4.17, 2.5.5
>            Reporter: Istvan Toth
>            Assignee: Istvan Toth
>            Priority: Minor
>             Fix For: 2.6.0, 2.4.18, 2.5.6
>
>
> When we run into problems starting RecoveredReplicationSourceShipper, we try 
> to stop the reader thread which we haven't initialized yet, resulting in an 
> NPE.
> {noformat}
> ERROR org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: Unexpected exception in redacted currentPath=hdfs://redacted
> java.lang.NullPointerException
>         at org.apache.hadoop.hbase.replication.regionserver.RecoveredReplicationSourceShipper.terminate(RecoveredReplicationSourceShipper.java:100)
>         at org.apache.hadoop.hbase.replication.regionserver.RecoveredReplicationSourceShipper.getRecoveredQueueStartPos(RecoveredReplicationSourceShipper.java:87)
>         at org.apache.hadoop.hbase.replication.regionserver.RecoveredReplicationSourceShipper.getStartPosition(RecoveredReplicationSourceShipper.java:62)
>         at org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.lambda$tryStartNewShipper$3(ReplicationSource.java:349)
>         at java.util.concurrent.ConcurrentHashMap.compute(ConcurrentHashMap.java:1853)
>         at org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.tryStartNewShipper(ReplicationSource.java:341)
>         at org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.initialize(ReplicationSource.java:601)
>         at java.lang.Thread.run(Thread.java:750)
> {noformat}
> A simple null check should fix this.
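
The null check described in the issue could look roughly like the sketch below. It is an illustration only, not the committed patch: ShipperSketch, Reader, startReader and both terminate variants are hypothetical stand-ins for the shipper and its reader thread.

```java
// Hypothetical stand-ins for illustration; NOT the actual HBase classes.
final class ShipperSketch {

    // Minimal stand-in for the WAL reader thread.
    static final class Reader {
        private volatile boolean running = true;

        void stop() {
            running = false;
        }

        boolean isRunning() {
            return running;
        }
    }

    // Remains null until startup has completed successfully.
    private Reader reader;

    void startReader() {
        reader = new Reader();
    }

    // Unguarded: throws NullPointerException when startup failed before
    // the reader was created.
    void terminateUnguarded() {
        reader.stop();
    }

    // Guarded: stops the reader only if it exists.
    // Returns true if a reader was actually stopped.
    boolean terminate() {
        Reader r = reader; // read the field once
        if (r == null) {
            return false; // initialization never got far enough to create it
        }
        r.stop();
        return true;
    }
}
```

With the guard in place, calling terminate() on a shipper whose reader was never created simply returns false instead of throwing.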



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
