[ https://issues.apache.org/jira/browse/HDDS-609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16647225#comment-16647225 ]
Arpit Agarwal commented on HDDS-609:
------------------------------------

+1 pending Jenkins.

> On restart, SCM does not exit chill mode as it expects DNs to report containers in ALLOCATED state
> --------------------------------------------------------------------------------------------------
>
>                 Key: HDDS-609
>                 URL: https://issues.apache.org/jira/browse/HDDS-609
>             Project: Hadoop Distributed Data Store
>          Issue Type: Bug
>            Reporter: Namit Maheshwari
>            Assignee: Hanisha Koneru
>            Priority: Major
>         Attachments: HDDS-609.001.patch, HDDS-609.002.patch, HDDS-609.003.patch
>
>
> Note: Updated the description to describe the root cause of the bug and moved
> the error logs to comments.
> On restart, SCM can exit chill mode only if it receives reports for 99%
> (default) of the containers from the DNs.
> SCM includes containers in ALLOCATED state when calculating the total number of
> containers. But since ALLOCATED containers are not reported by DNs, the
> calculated percentage of reported containers is skewed.
> {code:java}
> For example, say we have 1 DN in the cluster and we restart SCM.
> Total number of containers in SCM ContainerMap = 20
> Containers in OPEN state = 2
> Containers in ALLOCATED state = 18
> Containers reported by DN on SCM restart = 2
> Fraction of reported containers as calculated by SCMChillNodeManager = (2/20) = 0.10
> {code}
> We should not include the ALLOCATED containers while calculating the total
> number of containers for the chill mode exit rule. Otherwise, for scenarios such
> as the one above, SCM can never come out of chill mode.
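
To make the arithmetic in the description concrete, here is a minimal, self-contained sketch of the two ways of computing the reported fraction. It is not the code from the attached patches. The class name, the enum, the method names, and the choice to return 1.0 when no containers are expected are all assumptions made for illustration. Only the numbers (20 containers, 2 OPEN, 18 ALLOCATED, 2 reported, 99% threshold) come from the description.

```java
import java.util.ArrayList;
import java.util.List;

public class ChillModeFractionSketch {

    // Hypothetical stand-in for the two container states named in the description.
    enum State { ALLOCATED, OPEN }

    // Default exit threshold quoted in the description (99%).
    static final double THRESHOLD = 0.99;

    /** Counts every container in the denominator, as in the behaviour the bug describes. */
    static double fractionCountingAll(List<State> containers, int reported) {
        // Assumption: with no containers at all there is nothing to wait for.
        return containers.isEmpty() ? 1.0 : (double) reported / containers.size();
    }

    /** Leaves ALLOCATED containers out of the denominator, as the description proposes. */
    static double fractionExcludingAllocated(List<State> containers, int reported) {
        long expected = containers.stream()
                .filter(state -> state != State.ALLOCATED)
                .count();
        // Assumption: with no reportable containers there is nothing to wait for.
        return expected == 0 ? 1.0 : (double) reported / expected;
    }

    /** Builds the example from the description: 2 OPEN and 18 ALLOCATED containers. */
    static List<State> exampleCluster() {
        List<State> containers = new ArrayList<>();
        for (int i = 0; i < 2; i++) {
            containers.add(State.OPEN);
        }
        for (int i = 0; i < 18; i++) {
            containers.add(State.ALLOCATED);
        }
        return containers;
    }

    public static void main(String[] args) {
        List<State> containers = exampleCluster();
        int reported = 2; // the single DN reports only its 2 OPEN containers

        double all = fractionCountingAll(containers, reported);
        double filtered = fractionExcludingAllocated(containers, reported);

        System.out.println("counting all containers: " + all
                + ", can exit: " + (all >= THRESHOLD));
        System.out.println("excluding ALLOCATED:     " + filtered
                + ", can exit: " + (filtered >= THRESHOLD));
    }
}
```

With the numbers from the description, the first computation gives 2/20 = 0.10 and stays below the threshold however long SCM waits. The second gives 2/2 = 1.0 and clears it.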