sodonnel opened a new pull request #1471:
URL: https://github.com/apache/hadoop-ozone/pull/1471


   ## What changes were proposed in this pull request?
   
   If you call `pipelineManager.finalizeAndDestroyPipeline()` with 
onTimeout=false, the finalizePipeline call fires a closeContainer event for 
every container on the pipeline. These events are handled asynchronously.
   
   However, immediately after that, the `destroyPipeline(...)` call is made. 
This will remove the pipeline details from the various maps / stores.
   
   When the closeContainer events are then processed, each one attempts to 
remove its container from the pipeline. Because the pipeline has already been 
destroyed, this throws a PipelineNotFoundException and the close container 
events never get sent to the DNs:
   
   ```
   2020-10-01 15:44:18,838 [EventQueue-CloseContainerForCloseContainerEventHandler] INFO container.CloseContainerEventHandler: Close container Event triggered for container : #2
   2020-10-01 15:44:18,842 [EventQueue-CloseContainerForCloseContainerEventHandler] ERROR container.CloseContainerEventHandler: Failed to close the container #2.
   org.apache.hadoop.hdds.scm.pipeline.PipelineNotFoundException: PipelineID=59e5ae16-f1fe-45ff-9044-dd237b0e91c6 not found
        at org.apache.hadoop.hdds.scm.pipeline.PipelineStateMap.removeContainerFromPipeline(PipelineStateMap.java:372)
        at org.apache.hadoop.hdds.scm.pipeline.PipelineStateManager.removeContainerFromPipeline(PipelineStateManager.java:111)
        at org.apache.hadoop.hdds.scm.pipeline.SCMPipelineManager.removeContainerFromPipeline(SCMPipelineManager.java:413)
        at org.apache.hadoop.hdds.scm.container.SCMContainerManager.updateContainerState(SCMContainerManager.java:352)
        at org.apache.hadoop.hdds.scm.container.SCMContainerManager.updateContainerState(SCMContainerManager.java:331)
        at org.apache.hadoop.hdds.scm.container.CloseContainerEventHandler.onMessage(CloseContainerEventHandler.java:66)
        at org.apache.hadoop.hdds.scm.container.CloseContainerEventHandler.onMessage(CloseContainerEventHandler.java:45)
        at org.apache.hadoop.hdds.server.events.SingleThreadExecutor.lambda$onMessage$1(SingleThreadExecutor.java:81)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
        at java.base/java.util.concurrent.ThreadPoolExecutor
   ```
   
   The simple solution is to catch the exception and ignore it.
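
As a rough illustration of that idea (not the actual patch), the sketch below models the ordering problem with a simplified, hypothetical `PipelineMap` and `handleClose` method. None of these names are the real SCM classes; the `PipelineNotFoundException` here is a local stand-in, and the `ignoreMissingPipeline` flag exists only to compare the behaviour before and after the change.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical, simplified model; these are NOT the real SCM classes.
class PipelineRaceSketch {

  /** Local stand-in for the real PipelineNotFoundException. */
  static class PipelineNotFoundException extends Exception {
    PipelineNotFoundException(String pipelineId) {
      super("PipelineID=" + pipelineId + " not found");
    }
  }

  /** Stand-in for the pipeline-to-containers bookkeeping. */
  static class PipelineMap {
    private final Map<String, Set<Long>> containers = new HashMap<>();

    void addContainer(String pipelineId, long containerId) {
      containers.computeIfAbsent(pipelineId, k -> new HashSet<>())
          .add(containerId);
    }

    void destroyPipeline(String pipelineId) {
      containers.remove(pipelineId);
    }

    void removeContainer(String pipelineId, long containerId)
        throws PipelineNotFoundException {
      Set<Long> ids = containers.get(pipelineId);
      if (ids == null) {
        throw new PipelineNotFoundException(pipelineId);
      }
      ids.remove(containerId);
    }
  }

  /**
   * Handles one close-container event. Returns true if processing gets far
   * enough that the close would be sent on to the datanodes.
   */
  static boolean handleClose(PipelineMap map, String pipelineId,
      long containerId, boolean ignoreMissingPipeline) {
    try {
      try {
        map.removeContainer(pipelineId, containerId);
      } catch (PipelineNotFoundException e) {
        if (!ignoreMissingPipeline) {
          throw e; // old behaviour: the exception aborts the handler
        }
        // Pipeline already destroyed, so there is nothing left to detach.
      }
      return true; // the close would be sent to the datanodes here
    } catch (PipelineNotFoundException e) {
      return false; // corresponds to "Failed to close the container"
    }
  }

  public static void main(String[] args) {
    PipelineMap map = new PipelineMap();
    map.addContainer("p1", 2L);
    map.destroyPipeline("p1"); // destroy runs before the async close event

    System.out.println("without fix: sent=" + handleClose(map, "p1", 2L, false));
    System.out.println("with fix: sent=" + handleClose(map, "p1", 2L, true));
  }
}
```

In this model the first call reports `sent=false` and the second `sent=true`: once the missing pipeline is tolerated, the close still goes out even though the pipeline was destroyed first.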
   
   ## What is the link to the Apache JIRA?
   
   https://issues.apache.org/jira/browse/HDDS-4304
   
   ## How was this patch tested?
   
   Validated manually in a docker environment.
   

