[HOD] Cleanup idle HOD clusters whose ringmaster nodes might have gone down
---------------------------------------------------------------------------

                 Key: HADOOP-4938
                 URL: https://issues.apache.org/jira/browse/HADOOP-4938
             Project: Hadoop Core
          Issue Type: Improvement
          Components: contrib/hod
            Reporter: Hemanth Yamijala


As mentioned in HADOOP-4937, in large cluster deployments the ringmaster process 
sometimes comes up on a faulty node that goes down after the cluster has been 
successfully allocated. Such clusters are not deallocated automatically even when 
their idleness limit is exceeded, because idleness is tracked by the ringmaster 
process, which has itself gone down.

Because a large number of nodes can be held up this way, such clusters should be 
detected and deallocated in some manner.
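
One possible shape for the detection step is sketched below. It is only an 
illustration under stated assumptions, not a proposed patch: it assumes some 
component outside the cluster records when each ringmaster was last heard from. 
The names ClusterRecord, find_orphaned_clusters and heartbeat_timeout_secs are 
hypothetical and are not part of HOD.

```python
import time


class ClusterRecord:
    """Hypothetical external view of one allocated cluster (not a HOD class)."""

    def __init__(self, cluster_id, last_ringmaster_heartbeat, idle_limit_secs):
        self.cluster_id = cluster_id
        # Epoch seconds of the last contact with the cluster's ringmaster.
        self.last_ringmaster_heartbeat = last_ringmaster_heartbeat
        # Idleness limit configured for the cluster, in seconds.
        self.idle_limit_secs = idle_limit_secs


def find_orphaned_clusters(clusters, now=None, heartbeat_timeout_secs=300.0):
    """Return the ids of clusters whose ringmaster has been silent too long.

    A ringmaster is assumed to be down when nothing has been heard from it
    for more than heartbeat_timeout_secs. Such a cluster can no longer
    enforce its own idleness limit, so it is flagged once the silence also
    exceeds that limit.
    """
    if now is None:
        now = time.time()
    orphaned = []
    for cluster in clusters:
        silence = now - cluster.last_ringmaster_heartbeat
        if silence > heartbeat_timeout_secs and silence > cluster.idle_limit_secs:
            orphaned.append(cluster.cluster_id)
    return orphaned
```

The flagged ids could then be handed to whatever deallocation mechanism is 
chosen. Treating ringmaster silence as idleness is a conservative assumption: 
a cluster might still be running jobs while its ringmaster is down, so a real 
check would likely need to confirm that as well.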

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
