[ https://issues.apache.org/jira/browse/HDFS-9684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15119871#comment-15119871 ]
Surendra Singh Lilhore commented on HDFS-9684:
----------------------------------------------

Yes, the DN should have a health check that monitors all of its service threads. OutOfMemoryError handling was discussed in HDFS-2911, and I think the conclusion there was to kill the DN in case of OOM.

> DataNode stopped sending heartbeat after getting OutOfMemoryError form DataTransfer thread.
> -------------------------------------------------------------------------------------------
>
>                 Key: HDFS-9684
>                 URL: https://issues.apache.org/jira/browse/HDFS-9684
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: datanode
>    Affects Versions: 2.7.1
>            Reporter: Surendra Singh Lilhore
>            Assignee: Surendra Singh Lilhore
>            Priority: Blocker
>         Attachments: HDFS-9684.01.patch
>
>
> {noformat}
> java.lang.OutOfMemoryError: unable to create new native thread
>         at java.lang.Thread.start0(Native Method)
>         at java.lang.Thread.start(Thread.java:714)
>         at org.apache.hadoop.hdfs.server.datanode.DataNode.transferBlock(DataNode.java:1999)
>         at org.apache.hadoop.hdfs.server.datanode.DataNode.transferBlocks(DataNode.java:2008)
>         at org.apache.hadoop.hdfs.server.datanode.BPOfferService.processCommandFromActive(BPOfferService.java:657)
>         at org.apache.hadoop.hdfs.server.datanode.BPOfferService.processCommandFromActor(BPOfferService.java:615)
>         at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.processCommand(BPServiceActor.java:857)
>         at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService(BPServiceActor.java:671)
>         at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:823)
>         at java.lang.Thread.run(Thread.java:745)
> {noformat}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
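
To illustrate the health-check idea from the comment above, here is a minimal sketch in plain Java. It is not DataNode code and not the attached patch: the class name ServiceThreadMonitor and its methods are hypothetical, and the OutOfMemoryError is simulated by throwing it directly. The sketch only shows one way a daemon could notice that a service thread has died or hit an OOM, so that it can then decide to shut itself down.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

/** Hypothetical sketch of a service-thread health check; not part of Hadoop. */
public class ServiceThreadMonitor {
    private final List<Thread> watched = new CopyOnWriteArrayList<>();
    private volatile boolean fatalErrorSeen = false;

    /** Registers a service thread and records any OutOfMemoryError that kills it. */
    public void watch(Thread t) {
        t.setUncaughtExceptionHandler((thread, err) -> {
            if (err instanceof OutOfMemoryError) {
                fatalErrorSeen = true;
            }
        });
        watched.add(t);
    }

    /** Returns the names of watched threads that have terminated. */
    public List<String> deadThreads() {
        List<String> dead = new ArrayList<>();
        for (Thread t : watched) {
            if (t.getState() == Thread.State.TERMINATED) {
                dead.add(t.getName());
            }
        }
        return dead;
    }

    /** Healthy means no OOM was recorded and no watched thread has terminated. */
    public boolean isHealthy() {
        return !fatalErrorSeen && deadThreads().isEmpty();
    }

    public static void main(String[] args) throws InterruptedException {
        ServiceThreadMonitor monitor = new ServiceThreadMonitor();
        // Simulated failure: the "service" thread dies with an OutOfMemoryError.
        Thread actor = new Thread(() -> {
            throw new OutOfMemoryError("simulated");
        }, "heartbeat-actor");
        monitor.watch(actor);
        actor.start();
        actor.join();
        if (!monitor.isHealthy()) {
            System.err.println("Unhealthy, dead threads: " + monitor.deadThreads());
            // A real daemon could terminate the process at this point.
        }
    }
}
```

In a real daemon the isHealthy() check would run periodically from a separate thread. Whether to exit the process or only report the failure is the policy question discussed in HDFS-2911.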