[ https://issues.apache.org/jira/browse/HDDS-1765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16879243#comment-16879243 ]
Supratim Deka commented on HDDS-1765:
-------------------------------------

Similar symptom, but not the same problem. Linking for reference.

> destroyPipeline scheduled from finalizeAndDestroyPipeline fails for short dead node interval
> --------------------------------------------------------------------------------------------
>
>                 Key: HDDS-1765
>                 URL: https://issues.apache.org/jira/browse/HDDS-1765
>             Project: Hadoop Distributed Data Store
>          Issue Type: Bug
>          Components: SCM
>            Reporter: Supratim Deka
>            Priority: Major
>
> This happens when OZONE_SCM_PIPELINE_DESTROY_TIMEOUT exceeds the value of OZONE_SCM_DEADNODE_INTERVAL, which is the case for start-chaos.sh.
> When a Datanode is shut down, the SCM stale node handler calls finalizeAndDestroyPipeline(), which schedules the destroyPipeline() operation with a delay of OZONE_SCM_PIPELINE_DESTROY_TIMEOUT. By the time the scheduled operation runs, the dead node handler has already destroyed the pipeline.
>
> {code:java}
> 2019-07-05 14:45:16,358 INFO pipeline.SCMPipelineManager (SCMPipelineManager.java:finalizeAndDestroyPipeline(307)) - destroying pipeline:Pipeline[ Id: ef60537a-0a82-4fea-a574-109c881fa140, Nodes: 7947bf32-faaa-4b34-bf1e-2752a929938c{ip: 192.168.1.6, host: 192.168.1.6, networkLocation: /default-rack, certSerialId: null}, Type:RATIS, Factor:ONE, State:CLOSED]
> 2019-07-05 14:45:16,363 INFO pipeline.PipelineStateManager (PipelineStateManager.java:removePipeline(108)) - Pipeline Pipeline[ Id: ef60537a-0a82-4fea-a574-109c881fa140, Nodes: 7947bf32-faaa-4b34-bf1e-2752a929938c{ip: 192.168.1.6, host: 192.168.1.6, networkLocation: /default-rack, certSerialId: null}, Type:RATIS, Factor:ONE, State:CLOSED] removed from db
> ...
> 2019-07-05 14:46:12,400 WARN pipeline.RatisPipelineUtils (RatisPipelineUtils.java:destroyPipeline(66)) - Pipeline destroy failed for pipeline=PipelineID=ef60537a-0a82-4fea-a574-109c881fa140 dn=7947bf32-faaa-4b34-bf1e-2752a929938c{ip: 192.168.1.6, host: 192.168.1.6, networkLocation: /default-rack, certSerialId: null}
> 2019-07-05 14:46:12,401 ERROR pipeline.SCMPipelineManager (Scheduler.java:lambda$schedule$1(70)) - Destroy pipeline failed for pipeline:Pipeline[ Id: ef60537a-0a82-4fea-a574-109c881fa140, Nodes: 7947bf32-faaa-4b34-bf1e-2752a929938c{ip: 192.168.1.6, host: 192.168.1.6, networkLocation: /default-rack, certSerialId: null}, Type:RATIS, Factor:ONE, State:OPEN]
> org.apache.hadoop.hdds.scm.pipeline.PipelineNotFoundException: PipelineID=ef60537a-0a82-4fea-a574-109c881fa140 not found
>     at org.apache.hadoop.hdds.scm.pipeline.PipelineStateMap.getPipeline(PipelineStateMap.java:132)
>     at org.apache.hadoop.hdds.scm.pipeline.PipelineStateMap.removePipeline(PipelineStateMap.java:322)
>     at org.apache.hadoop.hdds.scm.pipeline.PipelineStateManager.removePipeline(PipelineStateManager.java:107)
>     at org.apache.hadoop.hdds.scm.pipeline.SCMPipelineManager.removePipeline(SCMPipelineManager.java:401)
>     at org.apache.hadoop.hdds.scm.pipeline.SCMPipelineManager.destroyPipeline(SCMPipelineManager.java:387)
>     at org.apache.hadoop.hdds.scm.pipeline.SCMPipelineManager.lambda$finalizeAndDestroyPipeline$0(SCMPipelineManager.java:321)
>     at org.apache.hadoop.utils.Scheduler.lambda$schedule$1(Scheduler.java:68)
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:748)
> {code}
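
For reference, the race described in the quoted issue can be reproduced in isolation. The sketch below is not Ozone code: PipelineRaceSketch, removeStrict, removeIfPresent and scheduleDestroy are hypothetical names, a plain map stands in for the SCM pipeline state, and IllegalStateException stands in for PipelineNotFoundException. A delayed destroy is scheduled (the stale node path), the entry is removed immediately (the dead node path), and the delayed task then either fails on the missing entry or tolerates it. Tolerating the missing entry is only one possible mitigation, not necessarily the fix adopted for HDDS-1765.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for SCM pipeline bookkeeping; not the Ozone implementation.
class PipelineRaceSketch {
  private final Map<String, String> pipelines = new ConcurrentHashMap<>();
  private final ScheduledExecutorService scheduler =
      Executors.newSingleThreadScheduledExecutor();

  void addPipeline(String id) {
    pipelines.put(id, "OPEN");
  }

  // Strict removal: throws when the pipeline is already gone
  // (IllegalStateException stands in for PipelineNotFoundException).
  void removeStrict(String id) {
    if (pipelines.remove(id) == null) {
      throw new IllegalStateException("PipelineID=" + id + " not found");
    }
  }

  // Tolerant removal: reports whether this call actually removed the entry.
  boolean removeIfPresent(String id) {
    return pipelines.remove(id) != null;
  }

  // Stale node path: schedule the destroy after a delay (the destroy timeout).
  ScheduledFuture<Boolean> scheduleDestroy(String id, long delayMs, boolean tolerant) {
    return scheduler.schedule(() -> {
      if (tolerant) {
        return removeIfPresent(id);
      }
      removeStrict(id);
      return true;
    }, delayMs, TimeUnit.MILLISECONDS);
  }

  // Runs the scenario; returns true if the delayed destroy finished without error.
  static boolean run(boolean tolerant) {
    PipelineRaceSketch scm = new PipelineRaceSketch();
    scm.addPipeline("p1");
    ScheduledFuture<Boolean> delayed = scm.scheduleDestroy("p1", 200, tolerant);
    scm.removeStrict("p1"); // dead node path wins: the interval is shorter than the delay
    try {
      delayed.get();
      return true;
    } catch (ExecutionException e) {
      return false; // the delayed task hit the "not found" case
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    } finally {
      scm.scheduler.shutdownNow();
    }
  }

  public static void main(String[] args) {
    System.out.println("strict delayed destroy succeeded: " + run(false));
    System.out.println("tolerant delayed destroy succeeded: " + run(true));
  }
}
```

With the strict variant the delayed task fails in the same way as the trace in the issue; with the tolerant variant it completes and simply reports that there was nothing left to remove.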