[ https://issues.apache.org/jira/browse/HBASE-20028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Gavin updated HBASE-20028:
--------------------------
    Comment: was deleted

(was: A comment with security level 'jira-users' was removed.)

> NPE when comparing versions in AM after RS ZK expiration
> --------------------------------------------------------
>
>                 Key: HBASE-20028
>                 URL: https://issues.apache.org/jira/browse/HBASE-20028
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>            Reporter: Josh Elser
>            Assignee: Josh Elser
>            Priority: Major
>
> {noformat}
> 2018-02-20 16:36:41,794 ERROR [Thread-85] assignment.AssignmentManager:
> java.lang.NullPointerException
> java.lang.NullPointerException
>         at org.apache.hadoop.hbase.util.VersionInfo.compareVersion(VersionInfo.java:122)
>         at org.apache.hadoop.hbase.master.assignment.AssignmentManager.lambda$getExcludedServersForSystemTable$5(AssignmentManager.java:1860)
>         at java.util.Collections.max(Collections.java:712)
>         at org.apache.hadoop.hbase.master.assignment.AssignmentManager.getExcludedServersForSystemTable(AssignmentManager.java:1859)
>         at org.apache.hadoop.hbase.master.assignment.AssignmentManager.lambda$checkIfShouldMoveSystemRegionAsync$0(AssignmentManager.java:464)
> {noformat}
> Looks like a race condition around an RS losing its ZK lock. If AM tries to
> see if it should move a Region to a server whose ZK lock we've seen was lost,
> but the RS hasn't yet been processed as "dead", we can get into a situation
> where {{HMaster.getRegionServerVersion()}} returns null and causes this to
> fail.
> Looks like a simple filter on the servers to preclude null versions would fix
> the problem.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
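
The filter proposed in the description could look roughly like the sketch below. This is an illustration only, not the HBase patch: the class name, the server-to-version map, and the simplified compareVersion are all hypothetical stand-ins, and the real VersionInfo.compareVersion may handle more version formats than plain dotted integers. The point is that null versions are dropped before the comparator ever sees them.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Objects;
import java.util.Optional;

// Hypothetical stand-in for the master's view of region servers; not HBase code.
final class NullVersionFilterSketch {

    // Simplified comparison of dotted versions such as "2.0.1".
    // Assumption: every component is a plain integer.
    static int compareVersion(String a, String b) {
        String[] x = a.split("\\.");
        String[] y = b.split("\\.");
        int n = Math.max(x.length, y.length);
        for (int i = 0; i < n; i++) {
            int xi = i < x.length ? Integer.parseInt(x[i]) : 0;
            int yi = i < y.length ? Integer.parseInt(y[i]) : 0;
            if (xi != yi) {
                return Integer.compare(xi, yi);
            }
        }
        return 0;
    }

    // Returns the highest known version, skipping servers whose version is
    // null (for example an RS whose ZK node expired but which has not yet
    // been processed as dead). Empty if no server reports a version.
    static Optional<String> highestVersion(Map<String, String> versionByServer) {
        return versionByServer.values().stream()
            .filter(Objects::nonNull)
            .max(NullVersionFilterSketch::compareVersion);
    }

    public static void main(String[] args) {
        Map<String, String> versions = new LinkedHashMap<>();
        versions.put("rs1", "2.0.0");
        versions.put("rs2", null); // version unknown: would trigger the NPE unfiltered
        versions.put("rs3", "2.0.1");
        System.out.println(highestVersion(versions).orElse("unknown")); // prints 2.0.1
    }
}
```

Returning an Optional also covers the case where every server has a null version, so the caller has to decide explicitly what to do when no highest version exists.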