Sammi Chen created HDDS-15133:
---------------------------------
Summary: Fail OM start if OM install snapshot was not finished prior to
start
Key: HDDS-15133
URL: https://issues.apache.org/jira/browse/HDDS-15133
Project: Apache Ozone
Issue Type: Bug
Reporter: Sammi Chen
Assignee: Sammi Chen
Based on the HDDS-15068 analysis, if OM is terminated during snapshot
installation, it is left with the new DB from the leader but its own old
Raft log files. When OM starts, it fails to apply any entry from the leader
because the new entry's index does not equal the last index in the Raft log
file plus 1. The new entry is nevertheless saved to the Raft log file
(RATIS-2507), so every subsequent restart of this follower will fail as well.
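
To illustrate the index mismatch described above, here is a minimal Java
sketch of the contiguity rule (a new log entry must have index = last local
index + 1). The class name, method name and index values are hypothetical
and are not taken from the Ozone or Ratis code base.

```java
// Hypothetical illustration only; not the actual Ozone/Ratis implementation.
class LogIndexCheck {

    // Returns true only when the incoming entry directly follows the
    // last entry in the local Raft log.
    static boolean isContiguous(long lastLocalIndex, long incomingIndex) {
        return incomingIndex == lastLocalIndex + 1;
    }

    public static void main(String[] args) {
        long lastLocalIndex = 100; // made-up: last index in the old Raft log
        long incomingIndex = 501;  // made-up: first entry after the new DB's state

        // The old log ends at 100, but the leader continues from the newer
        // state, so the entry cannot be applied.
        System.out.println(isContiguous(lastLocalIndex, incomingIndex));

        // An entry with index 101 would directly follow the old log.
        System.out.println(isContiguous(lastLocalIndex, 101));
    }
}
```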
If OM is terminated and leaves the new DB and the old Raft log files on disk,
we should not allow OM to start the next time. We can recover the OM state by
switching back to the old DB, so that the DB and the Raft log match. After
that, restarting this OM will succeed, and it can continue to accept install
snapshot requests from the leader.
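
One possible shape for such a startup guard is sketched below in Java. This
is only an assumption about how the check could look: the issue does not say
how the inconsistent state is detected, and all names here are hypothetical.
The sketch also simplifies by assuming that a DB index ahead of the Raft log
always means an unfinished snapshot installation; it does not model log
purging after a completed install.

```java
// Hypothetical startup guard; not the actual Ozone implementation.
class StartupGuard {

    // Assumption: in a consistent state the DB's last applied index never
    // exceeds the last index in the local Raft log.
    static boolean isStartAllowed(long dbLastAppliedIndex, long raftLogLastIndex) {
        return dbLastAppliedIndex <= raftLogLastIndex;
    }

    // Fails fast with an explanatory message instead of letting OM start
    // with a DB and Raft log that do not match.
    static void checkOrFail(long dbLastAppliedIndex, long raftLogLastIndex) {
        if (!isStartAllowed(dbLastAppliedIndex, raftLogLastIndex)) {
            throw new IllegalStateException(
                "DB index " + dbLastAppliedIndex
                + " is ahead of Raft log index " + raftLogLastIndex
                + "; switch back to the old DB before restarting OM");
        }
    }

    public static void main(String[] args) {
        checkOrFail(100, 100); // old DB with old log: start is allowed
        try {
            checkOrFail(500, 100); // new DB with old log: start is refused
        } catch (IllegalStateException e) {
            System.out.println("Refusing to start: " + e.getMessage());
        }
    }
}
```

Failing fast here keeps the mismatched follower from persisting further
entries to its old Raft log before an operator has restored the old DB.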