[ 
https://issues.apache.org/jira/browse/HDFS-12415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16200228#comment-16200228
 ] 

Mukul Kumar Singh commented on HDFS-12415:
------------------------------------------

Hi [~cheersyang], I looked into these common failures, and I think this error 
happens because nodes is not declared as a ConcurrentHashMap.

The only way this issue can happen is if one of the datanode entries was 
deleted. However, there are no deletes in the code, so I think the issue 
happens because, during registration, one of the datanode entries is not 
updated correctly.

{code}
  private final Map<String, DatanodeID> nodes;
.....
....
    nodes = new HashMap<>();
{code}
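
To illustrate the suggestion, here is a minimal sketch, not the actual SCMNodeManager code. It assumes the map is only written through a register-style method called from several IPC handler threads at once, as in the logs below. NodeRegistrySketch, register, lookup and registerConcurrently are hypothetical names, and String stands in for DatanodeID so the example is self-contained.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class NodeRegistrySketch {
  // Hypothetical stand-in for the "nodes" field. A ConcurrentHashMap keeps
  // every entry when several threads call put() at the same time, which a
  // plain HashMap does not guarantee.
  private final Map<String, String> nodes = new ConcurrentHashMap<>();

  void register(String datanodeUuid, String address) {
    nodes.put(datanodeUuid, address);
  }

  String lookup(String datanodeUuid) {
    return nodes.get(datanodeUuid);
  }

  // Registers handlers * perHandler distinct IDs from `handlers` threads that
  // all start together, then returns how many entries the map holds.
  static int registerConcurrently(int handlers, int perHandler) {
    NodeRegistrySketch registry = new NodeRegistrySketch();
    ExecutorService pool = Executors.newFixedThreadPool(handlers);
    CountDownLatch start = new CountDownLatch(1);
    for (int h = 0; h < handlers; h++) {
      final int handler = h;
      pool.execute(() -> {
        try {
          start.await(); // wait so that all handlers register at once
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
          return;
        }
        for (int i = 0; i < perHandler; i++) {
          registry.register("dn-" + handler + "-" + i, "host-" + handler);
        }
      });
    }
    start.countDown();
    pool.shutdown();
    try {
      if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
        throw new IllegalStateException("registration did not finish in time");
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
    return registry.nodes.size();
  }

  public static void main(String[] args) {
    // With a ConcurrentHashMap every registration is retained.
    System.out.println(registerConcurrently(8, 1000));
  }
}
```

If the field were a plain HashMap, the same run could lose entries or leave the table in an inconsistent state, so a later lookup might return null even though the node logged "Registered". Whether that is what triggers the NullPointerException in getNodeStat still needs to be confirmed.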

Error Logs:
{code}
2017-10-08 12:31:34,943 [IPC Server handler 0 on 35383] INFO  node.SCMNodeManager (SCMNodeManager.java:register(760))      - Data node with ID: 13a17735-2d91-43f4-8d09-4e3d8e08c5fd Registered.
2017-10-08 12:31:34,943 [IPC Server handler 1 on 35383] INFO  node.SCMNodeManager (SCMNodeManager.java:register(760))      - Data node with ID: ff586889-3956-4e51-8b5a-bca32557d85e Registered.
2017-10-08 12:31:34,944 [IPC Server handler 4 on 35383] INFO  node.SCMNodeManager (SCMNodeManager.java:register(760))      - Data node with ID: 017eada2-a5c8-492f-9cf4-e6ca46e8c954 Registered.
{code}

> Ozone: TestXceiverClientManager and TestAllocateContainer occasionally fails
> ----------------------------------------------------------------------------
>
>                 Key: HDFS-12415
>                 URL: https://issues.apache.org/jira/browse/HDFS-12415
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>    Affects Versions: HDFS-7240
>            Reporter: Weiwei Yang
>            Assignee: Weiwei Yang
>         Attachments: HDFS-12415-HDFS-7240.001.patch, 
> HDFS-12415-HDFS-7240.002.patch, HDFS-12415-HDFS-7240.003.patch
>
>
> TestXceiverClientManager seems to be occasionally failing in some jenkins 
> jobs,
> {noformat}
> java.lang.NullPointerException
>  at 
> org.apache.hadoop.ozone.scm.node.SCMNodeManager.getNodeStat(SCMNodeManager.java:828)
>  at 
> org.apache.hadoop.ozone.scm.container.placement.algorithms.SCMCommonPolicy.hasEnoughSpace(SCMCommonPolicy.java:147)
>  at 
> org.apache.hadoop.ozone.scm.container.placement.algorithms.SCMCommonPolicy.lambda$chooseDatanodes$0(SCMCommonPolicy.java:125)
> {noformat}
> see more from [this 
> report|https://builds.apache.org/job/PreCommit-HDFS-Build/21065/testReport/]



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

