Zilong Zhu created HDFS-17504:
---------------------------------

             Summary: DN process should exit when BPServiceActor exit
                 Key: HDFS-17504
                 URL: https://issues.apache.org/jira/browse/HDFS-17504
             Project: Hadoop HDFS
          Issue Type: Bug
            Reporter: Zilong Zhu


BPServiceActor is a critical thread. In a non-HA cluster, the exit of the 
BPServiceActor thread causes the DN process to exit. In an HA cluster, 
however, it does not.
I found that HDFS-15651 causes the BPServiceActor thread to exit and changes 
"runningState" from "RunningState.FAILED" to "RunningState.EXITED", which can 
be confusing during troubleshooting.
I believe the DN process should exit when the BPServiceActor's state is set 
to RunningState.FAILED, because at that point the DN cannot recover and 
re-establish a heartbeat connection with the ANN on its own.
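
To make the difference concrete, here is a minimal sketch of the two exit 
policies. It is not the Hadoop code: the class name, the method names, and the 
rule for which states count as "alive" are assumptions made for illustration. 
Only the state names FAILED and EXITED come from the description above; the 
other enum values are placeholders.

```java
import java.util.List;

class ActorExitPolicySketch {

    // Local stand-in for the states named in the report; not the Hadoop class itself.
    enum RunningState { CONNECTING, INIT_FAILED, RUNNING, EXITED, FAILED }

    // Assumption: an actor thread is still alive only in these two states.
    static boolean isAlive(RunningState state) {
        return state == RunningState.CONNECTING || state == RunningState.RUNNING;
    }

    // Simplified model of the behaviour described above: the process stops
    // only once no actor thread is left alive.
    static boolean exitsWhenAllActorsStopped(List<RunningState> actors) {
        return actors.stream().noneMatch(ActorExitPolicySketch::isAlive);
    }

    // Policy proposed in the report: a single FAILED actor stops the process.
    static boolean exitsOnAnyFailedActor(List<RunningState> actors) {
        return actors.contains(RunningState.FAILED);
    }

    // Hypothetical cleanup rule: keep FAILED visible instead of overwriting
    // it with EXITED, so the failure is still readable when troubleshooting.
    static RunningState stateAfterCleanup(RunningState current) {
        return current == RunningState.FAILED ? RunningState.FAILED : RunningState.EXITED;
    }

    public static void main(String[] args) {
        // Non-HA: one actor per block pool. HA: one actor per NameNode.
        List<RunningState> nonHa = List.of(RunningState.FAILED);
        List<RunningState> ha = List.of(RunningState.FAILED, RunningState.RUNNING);

        System.out.println("non-HA, all-stopped policy: " + exitsWhenAllActorsStopped(nonHa));
        System.out.println("HA,     all-stopped policy: " + exitsWhenAllActorsStopped(ha));
        System.out.println("HA,     any-failed policy:  " + exitsOnAnyFailedActor(ha));
        System.out.println("state kept after cleanup:   " + stateAfterCleanup(RunningState.FAILED));
    }
}
```

Under these assumptions, the two policies agree in the non-HA case and differ 
only in the HA case, where one actor has failed while the actor for the other 
NameNode is still running.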



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
