[
https://issues.apache.org/jira/browse/HDDS-10239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Wei-Chiu Chuang updated HDDS-10239:
-----------------------------------
Fix Version/s: 2.1.0
> Storage Container Reconciliation
> --------------------------------
>
> Key: HDDS-10239
> URL: https://issues.apache.org/jira/browse/HDDS-10239
> Project: Apache Ozone
> Issue Type: New Feature
> Components: Ozone Datanode, SCM
> Reporter: Ethan Rose
> Assignee: Ethan Rose
> Priority: Major
> Labels: pull-request-available
> Fix For: 2.1.0
>
>
> Ideally, a healthy Ozone cluster would contain only open and closed
> containers. However, container replicas commonly end up with a mix of states
> including quasi-closed and unhealthy that the current system is not able to
> resolve to cleanly closed replicas. The cause of these states is often bugs
> or broad failure handling on the write path. While we should fix these
> causes, they raise the problem that Ozone is not able to reconcile these
> mismatched container states on its own, regardless of their cause. This has
> led to significant complexity in the replication manager for how to handle
> cases where only quasi-closed and unhealthy replicas are available,
> especially in the case of decommissioning.
> Even when all replicas are closed, the system assumes that these closed
> container replicas are equal with no way to verify this. Checksumming is done
> for individual chunks within each container, but if two container replicas
> somehow end up with chunks that differ in length or content despite being
> marked closed with local checksums matching, the system has no way to detect
> or resolve this anomaly.
> This Jira proposes a container reconciliation protocol to solve these
> problems. After implementing the proposal:
> 1. It should be possible for a cluster to progress to a state where it has
> only properly replicated closed and open containers.
> 2. We can verify the equality and integrity of all closed containers.
> The design doc is linked here as a markdown pull request for inline comments.
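
As an illustration of the gap the description points out (closed replicas whose chunks differ in length or content even though each replica's local chunk checksums pass), here is a minimal Python sketch of a cross-replica comparison. It is not taken from the design doc or from Ozone's code: the names `container_digest` and `find_divergent_replicas`, the `(block_id, offset, data)` chunk tuples, and the use of a majority digest as the reference are all assumptions made for this example.

```python
import hashlib
from collections import Counter


def container_digest(chunks):
    """Return one digest covering every chunk in a container replica.

    `chunks` is an iterable of (block_id, offset, data) tuples; this layout
    is hypothetical and only serves the example.
    """
    h = hashlib.sha256()
    # Sort so the digest does not depend on the order chunks are listed in.
    for block_id, offset, data in sorted(chunks, key=lambda c: (c[0], c[1])):
        # Include position and length so a truncated chunk changes the digest.
        h.update(f"{block_id}:{offset}:{len(data)}:".encode())
        h.update(hashlib.sha256(data).digest())
    return h.hexdigest()


def find_divergent_replicas(replicas):
    """Return the sorted names of replicas whose digest differs from the most common one.

    `replicas` maps a datanode name to that replica's chunk list. Treating
    the most common digest as the reference is an assumption of this sketch;
    a majority is not guaranteed to be the correct copy.
    """
    digests = {node: container_digest(chunks) for node, chunks in replicas.items()}
    reference, _ = Counter(digests.values()).most_common(1)[0]
    return sorted(node for node, d in digests.items() if d != reference)


replicas = {
    "dn1": [(1, 0, b"abc"), (1, 3, b"def")],
    "dn2": [(1, 3, b"def"), (1, 0, b"abc")],  # same chunks, different order
    "dn3": [(1, 0, b"abc"), (1, 3, b"de")],   # second chunk is one byte short
}
divergent = find_divergent_replicas(replicas)
```

Each replica here would pass a purely local check, since every chunk matches its own checksum. Only comparing a container-level digest across replicas exposes that `dn3` differs.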