[ 
https://issues.apache.org/jira/browse/HDDS-9600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18076664#comment-18076664
 ] 

Ethan Rose commented on HDDS-9600:
----------------------------------

I converted the linked Google Doc to a PDF so that it remains readable to all. We 
would need to re-review that document to check for completeness.

> Clear out empty containers that are never created
> -------------------------------------------------
>
>                 Key: HDDS-9600
>                 URL: https://issues.apache.org/jira/browse/HDDS-9600
>             Project: Apache Ozone
>          Issue Type: Bug
>            Reporter: Ethan Rose
>            Priority: Major
>         Attachments: Handling empty missing containers in ozone.pdf
>
>
> HDDS-9550 documented a case where containers can be created on SCM, but 
> replicas are never created on datanodes. The containers are then tracked as 
> missing in the system even though there is no data in them. Since it is tricky to 
> determine whether or not these containers are actually empty from SCM's point 
> of view, pull request 5523 implemented a solution that keeps tracking the 
> containers in SCM, but reports them as empty instead of missing.
> In this Jira, I propose a solution that is a bit more involved, but should 
> provide a path for these containers to be cleared from the system safely:
> - When SCM first creates the container, it knows the datanode replicas that 
> are supposed to have the container. It should track this information until it 
> gets reports that the container is created, even after the pipeline is closed.
> - When the pipeline is either closed gracefully by SCM or fails on the 
> datanode, SCM should send close commands for all affected containers, 
> including these empty ones.
> - When a datanode gets a close container command for a container it does not 
> have, it can ack back to the SCM that the container is closed with BCSID=0, 
> block count=0, empty, etc. If the container has data then the normal 
> container flow still applies.
> - If the container was never created, SCM will now see it as empty and can 
> then move this container through the regular close and delete flow. A 
> datanode getting a delete command for a container it does not have should be 
> ok.
> With this approach, we can re-use the normal delete flow and safely clean the 
> containers out of the system. It requires one round of back and forth between 
> SCM and datanodes.
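
For illustration only, the proposed steps could be sketched roughly as below. This is a minimal, self-contained model and not Ozone code: the class and method names (EmptyContainerSketch, ReplicaReport, Datanode, Scm, handleClose, onCloseAck, canDelete) are all hypothetical, and using the block count as a stand-in for the BCSID of a non-empty container is a simplifying assumption.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical model of the proposed flow; none of these types exist in Ozone. */
public class EmptyContainerSketch {

  /** Close ack sent from a datanode to SCM (assumed shape). */
  public static final class ReplicaReport {
    public final long containerId;
    public final String datanode;
    public final long bcsId;
    public final long blockCount;

    public ReplicaReport(long containerId, String datanode, long bcsId, long blockCount) {
      this.containerId = containerId;
      this.datanode = datanode;
      this.bcsId = bcsId;
      this.blockCount = blockCount;
    }

    public boolean isEmpty() {
      return bcsId == 0 && blockCount == 0;
    }
  }

  /** Datanode side: maps container ID to block count for containers it actually has. */
  public static final class Datanode {
    public final String name;
    public final Map<Long, Long> blockCounts = new HashMap<>();

    public Datanode(String name) {
      this.name = name;
    }

    public ReplicaReport handleClose(long containerId) {
      Long blocks = blockCounts.get(containerId);
      if (blocks == null) {
        // Container was never created here: ack as closed with BCSID=0, block count=0.
        return new ReplicaReport(containerId, name, 0L, 0L);
      }
      // Container exists: normal flow. Block count doubles as BCSID (assumption).
      return new ReplicaReport(containerId, name, blocks, blocks);
    }

    public void handleDelete(long containerId) {
      // Deleting a container the datanode does not have is a harmless no-op.
      blockCounts.remove(containerId);
    }
  }

  /** SCM side: remembers expected replicas until every one of them has reported. */
  public static final class Scm {
    private final Map<Long, Set<String>> awaiting = new HashMap<>();
    private final Set<Long> nonEmpty = new HashSet<>();

    public void createContainer(long containerId, Set<String> datanodes) {
      // Tracked independently of the pipeline, so it survives pipeline close.
      awaiting.put(containerId, new HashSet<>(datanodes));
    }

    public void onCloseAck(ReplicaReport report) {
      Set<String> waiting = awaiting.get(report.containerId);
      if (waiting == null) {
        return; // unknown container, ignore in this sketch
      }
      waiting.remove(report.datanode);
      if (!report.isEmpty()) {
        nonEmpty.add(report.containerId);
      }
    }

    /** True once all expected replicas acked and every ack reported empty. */
    public boolean canDelete(long containerId) {
      Set<String> waiting = awaiting.get(containerId);
      return waiting != null && waiting.isEmpty() && !nonEmpty.contains(containerId);
    }
  }
}
```

In this sketch, SCM only treats a container as safe to delete after every expected datanode has acknowledged the close command and all acks reported it as empty, which corresponds to the single round of back and forth described above.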



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

