[ 
https://issues.apache.org/jira/browse/HDDS-4589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nanda kumar resolved HDDS-4589.
-------------------------------
    Resolution: Fixed

> Handle potential data loss during 
> ReplicationManager.handleOverReplicatedContainer()
> ------------------------------------------------------------------------------------
>
>                 Key: HDDS-4589
>                 URL: https://issues.apache.org/jira/browse/HDDS-4589
>             Project: Hadoop Distributed Data Store
>          Issue Type: Sub-task
>          Components: SCM HA
>    Affects Versions: 1.1.0
>            Reporter: Glen Geng
>            Assignee: Glen Geng
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.1.0
>
>
> h4. Problem: 
> ReplicationManager maintains the in-flight replication and deletion 
> in-memory, which is not replicated using Ratis. So, theoretically it’s 
> possible that we might run into issues if we immediately start 
> ReplicationManager after a failover.
> Scenario: There are 6 replicas of the container C1 namely CR1, CR2, CR3, CR4, 
> CR5 and CR6. The container is over replicated, so the current SCM S1 decides 
> to delete the excess replicas. SCM S1 picks CR1, CR2 and CR3 for deletion, 
> this information is updated in the in-flight deletion list and deletion 
> commands are sent to the datanodes. If there is a failover at this point and 
> SCM S2 becomes leader, it doesn’t have the in-flight deletion list from SCM 
> S1 and it finds the container C1 to be over replicated. Theoretically it’s 
> possible that SCM S2 picks CR4, CR5 and CR6 for deletion. If this happens, we 
> will end up in data loss.
> To address this issue we will make the logic to select a replica for deletion 
> deterministic. This will make sure that the new leader after failover will 
> pick the same replica for deletion which was picked by the old leader. 
> h4. Approach: 
> Sort the candidate replicas. Delete excess replicas from small to large. 
> There will not be any DeleteContainerCommand for the largest 3 Replica sent 
> by any SCM.
> h4. Example:
> Assume there are 6 replicas of the container C1, the factor of C1 is 3, the 
> names of the replicas are CR1, CR2, CR3, CR4, CR5 and CR6. There will not be 
> any DeleteContainerCommand for CR4, CR5, CR6 sent by any SCM:
> If SCM sees less than or equal to 3 replicas, it won’t send any 
> DeleteContainerCommand.
> If SCM sees 4 replicas, It deletes the smallest replica, which means it won’t 
> send DeleteContainerCommand for CR4 CR5, CR6, otherwise there will be a 
> contradiction.
> If SCM sees 5 replicas, It deletes the smallest 2 replicas, which means it 
> won’t send DeleteContainerCommand for CR4 CR5, CR6, otherwise there will be a 
> contradiction.
> If SCM sees 6 replicas, It deletes the smallest 3 replicas, which means it 
> won’t send DeleteContainerCommand for CR4 CR5, CR6, otherwise there will be a 
> contradiction.
> h4. P.S.:
> Since this issue exists in master as well, e.g., a quickly restart of SCM, we 
> decide to fix this problem in master. 
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to