[ https://issues.apache.org/jira/browse/HDDS-1310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sandeep Nemuri updated HDDS-1310:
---------------------------------
    Attachment:     (was: HDDS-1310.001.patch)

> In datanode once a container becomes unhealthy, datanode restart fails.
> -----------------------------------------------------------------------
>
>                 Key: HDDS-1310
>                 URL: https://issues.apache.org/jira/browse/HDDS-1310
>             Project: Hadoop Distributed Data Store
>          Issue Type: Bug
>          Components: Ozone Datanode
>    Affects Versions: 0.3.0
>            Reporter: Sandeep Nemuri
>            Assignee: Sandeep Nemuri
>            Priority: Blocker
>         Attachments: HDDS-1310.001.patch
>
>
> When a container is marked as {{UNHEALTHY}} in a datanode, subsequent restart
> of that datanode fails as it cannot generate ContainerReports anymore.
> Unhealthy state of a container is not handled in ContainerReport generation
> inside a datanode.
> We get the below exception when a datanode tries to generate the
> ContainerReport which contains unhealthy container(s)
> {noformat}
> 2019-03-19 13:51:13,646 [Datanode State Machine Thread - 0] ERROR - Unable to communicate to SCM server at xxxxx.xxxxx.xxx:9861 for past 3300 seconds.
> org.apache.hadoop.hdds.scm.container.common.helpers.StorageContainerException: Invalid Container state found: 86
>         at org.apache.hadoop.ozone.container.keyvalue.KeyValueContainer.getHddsState(KeyValueContainer.java:623)
>         at org.apache.hadoop.ozone.container.keyvalue.KeyValueContainer.getContainerReport(KeyValueContainer.java:593)
>         at org.apache.hadoop.ozone.container.common.impl.ContainerSet.getContainerReport(ContainerSet.java:204)
>         at org.apache.hadoop.ozone.container.ozoneimpl.ContainerController.getContainerReport(ContainerController.java:82)
>         at org.apache.hadoop.ozone.container.common.states.endpoint.RegisterEndpointTask.call(RegisterEndpointTask.java:114)
>         at org.apache.hadoop.ozone.container.common.states.endpoint.RegisterEndpointTask.call(RegisterEndpointTask.java:47)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>         at java.lang.Thread.run(Thread.java:745)
> {noformat}
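
The failure described in the issue comes down to a state-mapping step that has no branch for the unhealthy state, so building the report throws. Below is a minimal Java sketch of that pattern. The enums and method names (LocalState, ReportState, toReportStateMissingCase, toReportState) are hypothetical stand-ins for illustration only; this is not the actual Ozone code and not the attached patch.

```java
public class ContainerStateMapping {

    // Hypothetical stand-in for the state stored on the datanode.
    enum LocalState { OPEN, CLOSING, CLOSED, UNHEALTHY }

    // Hypothetical stand-in for the state sent in a container report.
    enum ReportState { OPEN, CLOSING, CLOSED, UNHEALTHY }

    // Mapping with no branch for UNHEALTHY: it falls through to the
    // default case and throws, which aborts report generation.
    static ReportState toReportStateMissingCase(LocalState state) {
        switch (state) {
            case OPEN:
                return ReportState.OPEN;
            case CLOSING:
                return ReportState.CLOSING;
            case CLOSED:
                return ReportState.CLOSED;
            default:
                throw new IllegalStateException(
                    "Invalid container state: " + state);
        }
    }

    // Same mapping with UNHEALTHY handled explicitly, so a report can
    // still be built when an unhealthy container is present.
    static ReportState toReportState(LocalState state) {
        switch (state) {
            case OPEN:
                return ReportState.OPEN;
            case CLOSING:
                return ReportState.CLOSING;
            case CLOSED:
                return ReportState.CLOSED;
            case UNHEALTHY:
                return ReportState.UNHEALTHY;
            default:
                throw new IllegalStateException(
                    "Invalid container state: " + state);
        }
    }

    public static void main(String[] args) {
        // The complete mapping succeeds for every local state.
        for (LocalState s : LocalState.values()) {
            System.out.println(s + " -> " + toReportState(s));
        }
        // The incomplete mapping fails only on UNHEALTHY.
        try {
            toReportStateMissingCase(LocalState.UNHEALTHY);
        } catch (IllegalStateException e) {
            System.out.println("Report generation failed: " + e.getMessage());
        }
    }
}
```

Keeping the default branch that throws is a deliberate choice in this sketch: any state added later without a mapping still fails loudly instead of being reported incorrectly.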