[
https://issues.apache.org/jira/browse/HDDS-5338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17368592#comment-17368592
]
Bharat Viswanadham commented on HDDS-5338:
------------------------------------------
We need to download a checkpoint when converting a non-HA, Ratis-based cluster
to an HA-enabled cluster, for example when we add 2 more nodes to make it HA.
In this case, the old single-node OM is first converted to Ratis-enabled, and
when we then add 2 more nodes, only the older one can become leader, so we can
download the checkpoint from it.
{quote}Let's say there are 3 existing OMs - om1, om2 and om3. om1 is network
partitioned from the other 2 and assumes itself to be the leader. We try to
bootstrap a new OM om4 and it contacts om1 first and downloads a checkpoint
from it (since om1 replies that it is the leader). But since om1 was network
partitioned, it does not have the correct DB snapshot. After this, om4 contacts
the OM ring again to do a SetConfiguration. This request now goes to the
correct leader OM - om2. om2 assumes that the bootstrapping OM has already got
the non-ratis transactions through the DB checkpoint and sends it only the
ratis logs. This will lead to inconsistent state in om4.{quote}
If it has more than 1 node, that means it is already a Ratis-enabled cluster,
so why do we need to download a checkpoint at all in this scenario?
> Handle Bootstrap when original OM has non-ratis transactions
> ------------------------------------------------------------
>
> Key: HDDS-5338
> URL: https://issues.apache.org/jira/browse/HDDS-5338
> Project: Apache Ozone
> Issue Type: Sub-task
> Affects Versions: 1.2.0
> Reporter: Hanisha Koneru
> Assignee: Hanisha Koneru
> Priority: Major
>
> When a non-Ratis OM is converted to a Ratis-enabled OM, there could be
> transactions in the RocksDB which are not part of the Ratis logs. If the
> Ratis logs are not purged when a new OM is bootstrapped, it will just get all
> the Ratis logs from the old OM. The non-Ratis transactions in the RocksDB
> will not be transferred to the new OM, as Ratis will not know that there are
> transactions in the DB that are not present in the logs.
> So when a new OM is bootstrapping, we should check the DB for non-Ratis
> transactions and, if any are present, the new OM should download the DB from
> the existing OM before the setConf request is sent out.
> Thanks [~bharat] for identifying this scenario
> [here|https://github.com/apache/ozone/pull/1494#issuecomment-859329558] .
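
The ordering proposed in the description (check for non-Ratis transactions,
download the DB if any exist, and only then send setConf) could be sketched
roughly as below. This is an illustrative sketch only, not Ozone code: the
`ExistingOm` class, its `non_ratis_txn_count` field, and the step names are
hypothetical, and it assumes the existing OM can report how many transactions
in its RocksDB never went through Ratis.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ExistingOm:
    """Hypothetical view of the existing OM (not an Ozone API)."""
    node_id: str
    # Assumption: number of RocksDB transactions that are absent from the
    # Ratis logs, e.g. written before the OM was converted to Ratis.
    non_ratis_txn_count: int


def bootstrap_steps(existing: ExistingOm) -> List[str]:
    """Return the ordered steps a bootstrapping OM would take."""
    steps = []
    if existing.non_ratis_txn_count > 0:
        # Ratis logs alone cannot reproduce these transactions, so the
        # DB has to be copied before the new OM joins the ring.
        steps.append("download_db_checkpoint_from:" + existing.node_id)
    # The configuration change is always sent last.
    steps.append("send_set_configuration")
    return steps


if __name__ == "__main__":
    print(bootstrap_steps(ExistingOm("om1", non_ratis_txn_count=5)))
    print(bootstrap_steps(ExistingOm("om1", non_ratis_txn_count=0)))
```

The sketch deliberately says nothing about which OM the checkpoint should come
from when several OMs exist; that is the open question raised in the comment
above.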