[ https://issues.apache.org/jira/browse/HBASE-16201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15368823#comment-15368823 ]
Hudson commented on HBASE-16201:
--------------------------------

SUCCESS: Integrated in HBase-1.1-JDK7 #1743 (See [https://builds.apache.org/job/HBase-1.1-JDK7/1743/])
HBASE-16201 fix a NPE issue in RpcServer (liyu: rev 73189eb801f1c49e738e8a79838b1cd17b1fcff5)
* hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/RpcServer.java


> NPE in RpcServer causing intermittent UT failure of TestMasterReplication#testHFileCyclicReplication
> ----------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-16201
>                 URL: https://issues.apache.org/jira/browse/HBASE-16201
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Yu Li
>            Assignee: Yu Li
>             Fix For: 2.0.0, 1.3.0, 1.2.2, 1.1.6
>
>         Attachments: HBASE-16201.patch
>
>
> Once every several runs of {{TestMasterReplication#testHFileCyclicReplication}},
> the following NPE shows up in the UT log:
> {noformat}
> java.lang.NullPointerException
>         at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2257)
>         at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:118)
>         at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:189)
>         at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:169)
> {noformat}
> The related code around RpcServer line 2257 is:
> {code}
>       if (e instanceof ServiceException) {
>         e = e.getCause();
>       }
>       // increment the number of requests that were exceptions.
>       metrics.exception(e);
>       if (e instanceof LinkageError) throw new DoNotRetryIOException(e);
>       if (e instanceof IOException) throw (IOException)e;
> {code}
> After some debugging, we found several places that construct a
> ServiceException with no cause, such as in
> {{RsRpcServices#replicateWALEntry}}:
> {code}
>       if (regionServer.replicationSinkHandler != null) {
>         ...
>       } else {
>         throw new ServiceException("Replication services are not initialized yet");
>       }
> {code}
> With no cause, {{e.getCause()}} returns null and {{e}} is reset to null.
> So we should first check the cause and only reset {{e = e.getCause()}} when
> it is not null.
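
For illustration only, the null check proposed in the description could look roughly like the sketch below. This is not the committed patch. The nested ServiceException class is a hypothetical stand-in for the protobuf one so that the example compiles on its own, and the class name CauseUnwrapSketch and the helper unwrap are made up for this example.

```java
import java.io.IOException;

public class CauseUnwrapSketch {

    // Hypothetical stand-in for the protobuf ServiceException, declared here
    // only so the sketch is self-contained.
    static class ServiceException extends Exception {
        ServiceException(String message) { super(message); }
        ServiceException(Throwable cause) { super(cause); }
    }

    // Unwrap a ServiceException only when it actually carries a cause;
    // otherwise keep the original exception so later code never sees null.
    static Throwable unwrap(Throwable e) {
        if (e instanceof ServiceException && e.getCause() != null) {
            return e.getCause();
        }
        return e;
    }

    public static void main(String[] args) {
        // Constructed with a message only, as in the replicateWALEntry case.
        Throwable noCause =
            new ServiceException("Replication services are not initialized yet");
        // Constructed around an underlying IOException.
        Throwable withCause = new ServiceException(new IOException("simulated failure"));

        System.out.println(unwrap(noCause) == noCause);
        System.out.println(unwrap(withCause) instanceof IOException);
    }
}
```

With this guard, a ServiceException built from a message alone is passed on unchanged, while one that wraps another exception is still replaced by its cause.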