[ https://issues.apache.org/jira/browse/HDFS-16115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17375401#comment-17375401 ]
Daniel Ma edited comment on HDFS-16115 at 7/6/21, 9:33 AM:
-----------------------------------------------------------

Hello [~brahmareddy], [~ayush]

Please help review this patch. Thanks.


was (Author: daniel ma):
    [~brahmareddy] [~ayush]
Pls help to review this patch.

> Asynchronously handle BPServiceActor command mechanism may result in
> BPServiceActor never fails even CommandProcessingThread is closed with fatal
> error.
> ----------------------------------------------------------------------------
>
>                 Key: HDFS-16115
>                 URL: https://issues.apache.org/jira/browse/HDFS-16115
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: datanode
>    Affects Versions: 3.3.1
>            Reporter: Daniel Ma
>            Priority: Critical
>             Fix For: 3.3.1
>
>         Attachments: 0001-HDFS-16115.patch
>
>
> This is an improvement issue. It has two sub-issues:
> 1- The BPServiceActor thread handles commands from the NameNode
> asynchronously (CommandProcessingThread processes the commands). If an
> exception or error in CommandProcessingThread causes that thread to fail and
> stop, BPServiceActor is not aware of it and keeps putting commands from the
> NameNode into the queue to be handled by CommandProcessingThread, even though
> CommandProcessingThread is already dead.
> 2- The second sub-issue builds on the first. If CommandProcessingThread
> fails owing to a non-fatal error such as "can not create native thread",
> which is caused by too many threads existing on the node, the failure should
> be tolerated instead of simply shutting down the thread with no automatic
> recovery, because a non-fatal error like this may soon clear by itself.
> Currently, the DataNode BPServiceActor cannot return to normal even after
> the non-fatal error has been eliminated.
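
The failure mode described in the two sub-issues can be reproduced outside Hadoop with a small producer/consumer sketch. The code below is not the attached patch and not Hadoop code: the class CommandWorkerSketch and all of its methods are hypothetical names chosen for illustration. It assumes a single consumer thread draining a queue, lets an unchecked exception kill that thread (sub-issue 1), and has the producer check liveness and restart the consumer before enqueueing (one possible reading of the tolerance asked for in sub-issue 2).

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BooleanSupplier;

/** Hypothetical stand-in for an actor plus its command-processing thread. */
public class CommandWorkerSketch {
    private final BlockingQueue<Runnable> queue = new LinkedBlockingQueue<>();
    private final AtomicInteger processed = new AtomicInteger();
    private volatile Thread worker;
    private volatile boolean running = true;

    /** Starts (or restarts) the single consumer thread. */
    public synchronized void startWorker() {
        Thread t = new Thread(this::processLoop, "command-processing-sketch");
        t.setDaemon(true);
        // Keeps the demo quiet; a real system would log the failure here.
        t.setUncaughtExceptionHandler((thread, error) -> { });
        worker = t;
        t.start();
    }

    private void processLoop() {
        while (running) {
            Runnable cmd;
            try {
                cmd = queue.poll(50, TimeUnit.MILLISECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            if (cmd != null) {
                cmd.run(); // an unchecked exception here ends the thread
                processed.incrementAndGet();
            }
        }
    }

    public boolean isWorkerAlive() {
        Thread t = worker;
        return t != null && t.isAlive();
    }

    /**
     * Producer side. Instead of queueing blindly, check whether the consumer
     * is still alive and restart it if not. Returns true if a restart happened.
     */
    public synchronized boolean enqueue(Runnable cmd) {
        boolean restarted = false;
        if (running && !isWorkerAlive()) {
            startWorker();
            restarted = true;
        }
        queue.add(cmd);
        return restarted;
    }

    public int processedCount() {
        return processed.get();
    }

    public void shutdown() {
        running = false;
    }

    /** Polls a condition until it holds or the timeout elapses. */
    public static boolean waitUntil(BooleanSupplier cond, long timeoutMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        while (!cond.getAsBoolean()) {
            if (System.nanoTime() > deadline) {
                return false;
            }
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        CommandWorkerSketch s = new CommandWorkerSketch();
        s.startWorker();
        // Simulated failure: this command kills the consumer thread.
        s.enqueue(() -> { throw new IllegalStateException("simulated failure"); });
        waitUntil(() -> !s.isWorkerAlive(), 2000);
        // Without the liveness check this command would sit in the queue forever.
        boolean restarted = s.enqueue(() -> { });
        waitUntil(() -> s.processedCount() == 1, 2000);
        System.out.println("restarted=" + restarted + " processed=" + s.processedCount());
        s.shutdown();
    }
}
```

An immediate restart is the simplest possible policy. If the underlying cause is thread exhaustion on the node, creating the replacement thread may fail for the same reason, so a real fix would likely need a retry delay and a way to separate fatal errors from recoverable ones.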