[
https://issues.apache.org/jira/browse/HDDS-14908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
ASF GitHub Bot updated HDDS-14908:
----------------------------------
Labels: pull-request-available (was: )
> Container scanner should check for updates before persisting merkle tree
> ------------------------------------------------------------------------
>
> Key: HDDS-14908
> URL: https://issues.apache.org/jira/browse/HDDS-14908
> Project: Apache Ozone
> Issue Type: Sub-task
> Reporter: Ethan Rose
> Assignee: Ethan Rose
> Priority: Major
> Labels: pull-request-available
>
> By design, there is no coordination between reconciliation and the container
> scanner, since both are long-running background processes. This design has a
> known corner case in which the checksum can deviate before eventually
> converging if the scanner and reconciliation run at the same time:
> The container scanner builds the merkle tree in memory as it is running, and
> then flushes it to disk when it finishes scanning the container. If the
> container is updated (via reconciliation) while the scanner is running, the
> scanner may not see these updates and as a result persist a stale tree at the
> end of its scan. A subsequent container scan would then see the fully updated
> container contents and write the correct tree, restoring the system to a
> stable state.
> We can prevent this by having the container scanner record the data checksum
> when it starts and check it again before it persists the tree. If the
> checksum has changed, the scanner knows its work is now invalid and it must
> rescan the container to build a new tree.
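
The check described above could look roughly like the following. This is only a sketch under stated assumptions, not the change in the pull request: the class StaleTreeGuardSketch, the method scanAndPersist, and the maxScans retry limit are all hypothetical names introduced for illustration, and none of them are Ozone APIs. The data checksum, the scan, and the persist step are passed in as plain functions so the example stays self-contained.

```java
import java.util.function.Consumer;
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/** Hypothetical sketch; these names are illustrative and are not Ozone APIs. */
final class StaleTreeGuardSketch {

  private StaleTreeGuardSketch() { }

  /**
   * Scans until the data checksum read before the scan matches the one read
   * after it, then persists the tree built by that scan.
   *
   * @param dataChecksum reads the container's current data checksum
   * @param scan         runs a full scan and returns the in-memory tree
   * @param persist      writes a tree to disk
   * @param maxScans     assumed retry limit so the loop cannot spin forever
   * @return the number of scans performed, or -1 if every scan was invalidated
   */
  static <T> int scanAndPersist(LongSupplier dataChecksum,
                                Supplier<T> scan,
                                Consumer<T> persist,
                                int maxScans) {
    for (int attempt = 1; attempt <= maxScans; attempt++) {
      long before = dataChecksum.getAsLong(); // checksum when the scan starts
      T tree = scan.get();                    // long-running scan
      long after = dataChecksum.getAsLong();  // checksum just before persisting
      if (before == after) {
        persist.accept(tree);                 // no update was observed during the scan
        return attempt;
      }
      // The container changed mid-scan (e.g. via reconciliation):
      // discard the stale tree and rescan.
    }
    return -1;
  }
}
```

A caller would pass something like the container's checksum getter, the scan routine, and the tree writer. Note that this sketch leaves a small window between the second checksum read and the write. How the real scanner closes that window (if it needs to) is not stated in this issue.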