[jira] [Updated] (HDDS-3358) Intermittent test failure related to a race conditon during PipelineManager close
[ https://issues.apache.org/jira/browse/HDDS-3358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sammi Chen updated HDDS-3358:
    Parent: HDDS-1127
    Issue Type: Sub-task  (was: Improvement)

> Intermittent test failure related to a race conditon during PipelineManager close
>
>                 Key: HDDS-3358
>                 URL: https://issues.apache.org/jira/browse/HDDS-3358
>             Project: Hadoop Distributed Data Store
>          Issue Type: Sub-task
>          Components: test
>            Reporter: Marton Elek
>            Assignee: Marton Elek
>            Priority: Major
>              Labels: TriagePending, flaky-test, ozone-flaky-test
>         Attachments: org.apache.hadoop.hdds.scm.node.TestSCMNodeManager-output.txt
>
> The failing test is:
> TestSCMNodeManager
> The end of the log is:
> {code}
> 2020-04-08 10:49:44,544 ERROR events.SingleThreadExecutor (SingleThreadExecutor.java:lambda$onMessage$1(84)) - Error on execution message 19844615-0d70-4172-8c34-96e5b7295ef2{ip: 196.189.243.187, host: localhost-196.189.243.187, networkLocation: /default-rack, certSerialId: null}
> java.lang.NullPointerException
>     at org.apache.hadoop.hdds.scm.pipeline.SCMPipelineManager.finalizeAndDestroyPipeline(SCMPipelineManager.java:380)
>     at org.apache.hadoop.hdds.scm.node.StaleNodeHandler.onMessage(StaleNodeHandler.java:63)
>     at org.apache.hadoop.hdds.scm.node.StaleNodeHandler.onMessage(StaleNodeHandler.java:38)
>     at org.apache.hadoop.hdds.server.events.SingleThreadExecutor.lambda$onMessage$1(SingleThreadExecutor.java:81)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:748)
> 2020-04-08 10:49:44,544 INFO node.StaleNodeHandler (StaleNodeHandler.java:onMessage(58)) - Datanode 0914e56d-c7f8-4e0a-8fd1-845a9806172b{ip: 57.46.156.17, host: localhost-57.46.156.17, networkLocation: /default-rack, certSerialId: null} moved to stale state. Finalizing its pipelines [PipelineID=fd1f9e92-2f90-43e7-8406-94ba6ac356b0, PipelineID=8d380e3c-b632-4bda-aa7a-554774fba09d]
> 2020-04-08 10:49:44,544 INFO pipeline.SCMPipelineManager (SCMPipelineManager.java:finalizeAndDestroyPipeline(373)) - Destroying pipeline:Pipeline[ Id: fd1f9e92-2f90-43e7-8406-94ba6ac356b0, Nodes: 0914e56d-c7f8-4e0a-8fd1-845a9806172b{ip: 57.46.156.17, host: localhost-57.46.156.17, networkLocation: /default-rack, certSerialId: null}, Type:RATIS, Factor:ONE, State:ALLOCATED, leaderId:null, CreationTimestamp2020-04-08T10:49:37.441Z]
> 2020-04-08 10:49:44,544 INFO pipeline.PipelineStateManager (PipelineStateManager.java:finalizePipeline(120)) - Pipeline Pipeline[ Id: fd1f9e92-2f90-43e7-8406-94ba6ac356b0, Nodes: 0914e56d-c7f8-4e0a-8fd1-845a9806172b{ip: 57.46.156.17, host: localhost-57.46.156.17, networkLocation: /default-rack, certSerialId: null}, Type:RATIS, Factor:ONE, State:CLOSED, leaderId:null, CreationTimestamp2020-04-08T10:49:37.441Z] moved to CLOSED state
> 2020-04-08 10:49:44,544 ERROR events.SingleThreadExecutor (SingleThreadExecutor.java:lambda$onMessage$1(84)) - Error on execution message 0914e56d-c7f8-4e0a-8fd1-845a9806172b{ip: 57.46.156.17, host: localhost-57.46.156.17, networkLocation: /default-rack, certSerialId: null}
> java.lang.NullPointerException
>     at org.apache.hadoop.hdds.scm.pipeline.SCMPipelineManager.finalizeAndDestroyPipeline(SCMPipelineManager.java:380)
>     at org.apache.hadoop.hdds.scm.node.StaleNodeHandler.onMessage(StaleNodeHandler.java:63)
>     at org.apache.hadoop.hdds.scm.node.StaleNodeHandler.onMessage(StaleNodeHandler.java:38)
>     at org.apache.hadoop.hdds.server.events.SingleThreadExecutor.lambda$onMessage$1(SingleThreadExecutor.java:81)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:748)
> 2020-04-08 10:49:44,544 INFO pipeline.RatisPipelineProvider (RatisPipelineProvider.java:lambda$close$4(208)) - Send pipeline:PipelineID=e0e155c6-9fbe-46a7-b742-e805ea9baacf close command to datanode 30a24b04-1289-4c30-a28a-034edfe29e3d
> 2020-04-08 10:49:44,545 WARN events.EventQueue (EventQueue.java:fireEvent(151)) - Processing of TypedEvent{payloadType=CommandForDatanode, name='Datanode_Command'} is skipped, EventQueue is not running
> 2020-04-08 10:49:44,544 INFO node.StaleNodeHandler (StaleNodeHandler.java:onMessage(58)) - Datanode 59bdd26b-05da-47d1-8c3f-8350d55d7299{ip: 248.147.58.17, host: localhost-248.147.58.17,
[jira] [Updated] (HDDS-3358) Intermittent test failure related to a race conditon during PipelineManager close
[ https://issues.apache.org/jira/browse/HDDS-3358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arpit Agarwal updated HDDS-3358:
    Component/s: test

> Intermittent test failure related to a race conditon during PipelineManager close
>
>                 Key: HDDS-3358
>                 URL: https://issues.apache.org/jira/browse/HDDS-3358
>             Project: Hadoop Distributed Data Store
>          Issue Type: Improvement
>          Components: test
>            Reporter: Marton Elek
>            Assignee: Marton Elek
>            Priority: Major
>              Labels: TriagePending, flaky-test, ozone-flaky-test
>         Attachments: org.apache.hadoop.hdds.scm.node.TestSCMNodeManager-output.txt
[jira] [Updated] (HDDS-3358) Intermittent test failure related to a race conditon during PipelineManager close
[ https://issues.apache.org/jira/browse/HDDS-3358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arpit Agarwal updated HDDS-3358:
    Target Version/s: 0.6.0
    Labels: TriagePending flaky-test  (was: )

> Intermittent test failure related to a race conditon during PipelineManager close
>
>                 Key: HDDS-3358
>                 URL: https://issues.apache.org/jira/browse/HDDS-3358
>             Project: Hadoop Distributed Data Store
>          Issue Type: Improvement
>            Reporter: Marton Elek
>            Assignee: Marton Elek
>            Priority: Major
>              Labels: TriagePending, flaky-test
>         Attachments: org.apache.hadoop.hdds.scm.node.TestSCMNodeManager-output.txt
[jira] [Updated] (HDDS-3358) Intermittent test failure related to a race conditon during PipelineManager close
[ https://issues.apache.org/jira/browse/HDDS-3358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Attila Doroszlai updated HDDS-3358:
    Summary: Intermittent test failure related to a race conditon during PipelineManager close  (was: Intermittent test failure related to a reca conditon during PipelineManager close)

> Intermittent test failure related to a race conditon during PipelineManager close
>
>                 Key: HDDS-3358
>                 URL: https://issues.apache.org/jira/browse/HDDS-3358
>             Project: Hadoop Distributed Data Store
>          Issue Type: Improvement
>            Reporter: Marton Elek
>            Assignee: Marton Elek
>            Priority: Major
>         Attachments: org.apache.hadoop.hdds.scm.node.TestSCMNodeManager-output.txt