[ https://issues.apache.org/jira/browse/HDDS-3066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Arpit Agarwal resolved HDDS-3066.
---------------------------------
    Fix Version/s:     (was: 0.6.0)
                   0.5.0
       Resolution: Fixed

Cherry-picked to ozone-0.5.0.

> SCM startup failed during loading containers from DB
> -----------------------------------------------------
>
>                 Key: HDDS-3066
>                 URL: https://issues.apache.org/jira/browse/HDDS-3066
>             Project: Hadoop Distributed Data Store
>          Issue Type: Bug
>          Components: SCM
>            Reporter: Bharat Viswanadham
>            Assignee: Bharat Viswanadham
>            Priority: Blocker
>              Labels: OMHATest, pull-request-available
>             Fix For: 0.5.0
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> This happens because the pipeline scrubber closed the pipeline, removed it
> from the DB, and triggered close container commands to move its containers
> to CLOSING. If SCM is restarted before the close container command is
> handled and the container state is changed to CLOSING, the issue below can
> occur.
>
> This can also happen in other scenarios, for example when safeModeHandler
> calls finalizeAndDestroyPipeline and SCM is then restarted.
>
> The root cause is that the pipeline has been removed from the DB while the
> container is still in the OPEN state. When SCM tries to look up the pipeline
> while loading containers, it crashes with a {{PipelineNotFoundException}}.
>
> {code:java}
> 2020-02-21 13:57:34,888 [main] ERROR org.apache.hadoop.hdds.scm.server.StorageContainerManagerStarter: SCM start failed with exception
> org.apache.hadoop.hdds.scm.pipeline.PipelineNotFoundException: PipelineID=35dff62d-9bfa-449b-b6e8-6f00cc8c1b6e not found
>     at org.apache.hadoop.hdds.scm.pipeline.PipelineStateMap.getPipeline(PipelineStateMap.java:133)
>     at org.apache.hadoop.hdds.scm.pipeline.PipelineStateMap.addContainerToPipeline(PipelineStateMap.java:110)
>     at org.apache.hadoop.hdds.scm.pipeline.PipelineStateManager.addContainerToPipeline(PipelineStateManager.java:59)
>     at org.apache.hadoop.hdds.scm.pipeline.SCMPipelineManager.addContainerToPipeline(SCMPipelineManager.java:309)
>     at org.apache.hadoop.hdds.scm.container.SCMContainerManager.loadExistingContainers(SCMContainerManager.java:121)
>     at org.apache.hadoop.hdds.scm.container.SCMContainerManager.<init>(SCMContainerManager.java:107)
>     at org.apache.hadoop.hdds.scm.server.StorageContainerManager.initializeSystemManagers(StorageContainerManager.java:412)
>     at org.apache.hadoop.hdds.scm.server.StorageContainerManager.<init>(StorageContainerManager.java:283)
>     at org.apache.hadoop.hdds.scm.server.StorageContainerManager.<init>(StorageContainerManager.java:215)
>     at org.apache.hadoop.hdds.scm.server.StorageContainerManager.createSCM(StorageContainerManager.java:612)
>     at org.apache.hadoop.hdds.scm.server.StorageContainerManagerStarter$SCMStarterHelper.start(StorageContainerManagerStarter.java:142)
>     at org.apache.hadoop.hdds.scm.server.StorageContainerManagerStarter.startScm(StorageContainerManagerStarter.java:117)
>     at org.apache.hadoop.hdds.scm.server.StorageContainerManagerStarter.call(StorageContainerManagerStarter.java:66)
>     at org.apache.hadoop.hdds.scm.server.StorageContainerManagerStarter.call(StorageContainerManagerStarter.java:42)
>     at picocli.CommandLine.execute(CommandLine.java:1173)
>     at picocli.CommandLine.access$800(CommandLine.java:141)
>     at picocli.CommandLine$RunLast.handle(CommandLine.java:1367)
>     at picocli.CommandLine$RunLast.handle(CommandLine.java:1335)
>     at picocli.CommandLine$AbstractParseResultHandler.handleParseResult(CommandLine.java:1243)
>     at picocli.CommandLine.parseWithHandlers(CommandLine.java:1526)
>     at picocli.CommandLine.parseWithHandler(CommandLine.java:1465)
>     at org.apache.hadoop.hdds.cli.GenericCli.execute(GenericCli.java:65)
>     at org.apache.hadoop.hdds.cli.GenericCli.run(GenericCli.java:56)
>     at org.apache.hadoop.hdds.scm.server.StorageContainerManagerStarter.main(StorageContainerManagerStarter.java:55)
> 2020-02-21 13:57:34,892 [shutdown-hook-0] INFO org.apache.hadoop.hdds.scm.server.StorageContainerManagerStarter: SHUTDOWN_MSG:
> /************************************************************
> SHUTDOWN_MSG: Shutting down StorageContainerManager at om-ha-1.vpc.cloudera.com/10.65.51.49
> ************************************************************/
> {code}
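
The failure mode described in this issue (an OPEN container in the DB that still references a pipeline already removed from the DB) can be reproduced in a small toy model. The sketch below is illustrative only: `ContainerLoadSketch`, `PipelineMap`, `ContainerRecord`, `loadStrict` and `loadTolerant` are hypothetical names and do not belong to the Ozone code base, and the tolerant variant shows one possible way to handle the condition, not necessarily the change that was merged for HDDS-3066. It assumes that moving the orphaned container to CLOSING is an acceptable recovery, since that is the state the pending close container command would have produced.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy model only: every type here is a hypothetical stand-in, not a real Ozone/HDDS class.
final class ContainerLoadSketch {

  enum State { OPEN, CLOSING, CLOSED }

  static class PipelineNotFoundException extends Exception {
    PipelineNotFoundException(String pipelineId) {
      super("PipelineID=" + pipelineId + " not found");
    }
  }

  // One persisted container row: its ID, the pipeline it was created on, and its state.
  static class ContainerRecord {
    final long id;
    final String pipelineId;
    State state;

    ContainerRecord(long id, String pipelineId, State state) {
      this.id = id;
      this.pipelineId = pipelineId;
      this.state = state;
    }
  }

  // In-memory pipeline-to-containers map, rebuilt on every startup.
  static class PipelineMap {
    private final Map<String, Set<Long>> containersByPipeline = new HashMap<>();

    void addPipeline(String pipelineId) {
      containersByPipeline.put(pipelineId, new HashSet<>());
    }

    void addContainer(String pipelineId, long containerId) throws PipelineNotFoundException {
      Set<Long> containers = containersByPipeline.get(pipelineId);
      if (containers == null) {
        throw new PipelineNotFoundException(pipelineId);
      }
      containers.add(containerId);
    }
  }

  // Strict load: a single orphaned OPEN container aborts startup, as in the stack trace.
  static void loadStrict(List<ContainerRecord> db, PipelineMap pipelines)
      throws PipelineNotFoundException {
    for (ContainerRecord c : db) {
      if (c.state == State.OPEN) {
        pipelines.addContainer(c.pipelineId, c.id);
      }
    }
  }

  // Tolerant load (assumption): an orphaned OPEN container is moved to CLOSING
  // instead of failing startup. Returns the IDs of the containers that were moved.
  static List<Long> loadTolerant(List<ContainerRecord> db, PipelineMap pipelines) {
    List<Long> movedToClosing = new ArrayList<>();
    for (ContainerRecord c : db) {
      if (c.state != State.OPEN) {
        continue;
      }
      try {
        pipelines.addContainer(c.pipelineId, c.id);
      } catch (PipelineNotFoundException e) {
        c.state = State.CLOSING;
        movedToClosing.add(c.id);
      }
    }
    return movedToClosing;
  }

  public static void main(String[] args) {
    List<ContainerRecord> db = new ArrayList<>();
    db.add(new ContainerRecord(1L, "pipeline-a", State.OPEN));
    // Container 2 references a pipeline that was removed before the restart.
    db.add(new ContainerRecord(2L, "pipeline-removed", State.OPEN));

    PipelineMap strictMap = new PipelineMap();
    strictMap.addPipeline("pipeline-a");
    try {
      loadStrict(db, strictMap);
    } catch (PipelineNotFoundException e) {
      System.out.println("strict load failed: " + e.getMessage());
    }

    PipelineMap tolerantMap = new PipelineMap();
    tolerantMap.addPipeline("pipeline-a");
    System.out.println("moved to CLOSING: " + loadTolerant(db, tolerantMap));
  }
}
```

Running `main` first shows the strict loader failing on the orphaned container, then the tolerant loader completing and reporting which container IDs it moved to CLOSING. Whether a real SCM should close such containers, skip them, or fail fast is a design decision for the project; the sketch only makes the race in the description concrete.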