Sammi Chen created HDDS-15133:
---------------------------------

             Summary: Fail OM start if OM installsnapshot is not finished prior 
start
                 Key: HDDS-15133
                 URL: https://issues.apache.org/jira/browse/HDDS-15133
             Project: Apache Ozone
          Issue Type: Bug
            Reporter: Sammi Chen
            Assignee: Sammi Chen


Based on the analysis in HDDS-15068, when an OM is terminated during snapshot 
installation, it can be left with the new DB from the leader but its own old 
Raft log files.  When the OM starts, it fails to apply any entry from the 
leader because the new entry's index does not equal the last index in the Raft 
log file plus 1.  The new entry is still saved to the Raft log file 
(RATIS-2507), so every subsequent restart of this follower fails as well. 
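
To make the failing condition concrete, here is a minimal sketch of the 
contiguity rule described above.  The class name, method name and index values 
are hypothetical and only illustrate the arithmetic; this is not the actual 
Ratis or Ozone code.

```java
// Illustrative only: hypothetical names, not the Ratis or Ozone implementation.
public final class LogIndexCheck {
    private LogIndexCheck() {}

    /**
     * An incoming entry can be appended directly only when its index is
     * exactly one greater than the last index already in the local Raft log.
     */
    public static boolean isContiguous(long lastLocalLogIndex, long incomingEntryIndex) {
        return incomingEntryIndex == lastLocalLogIndex + 1;
    }

    public static void main(String[] args) {
        // Made-up numbers: the old local log ends at 100, while the leader,
        // whose snapshot was already copied into the DB, sends entry 501.
        System.out.println(isContiguous(100L, 501L)); // prints "false"
        // A healthy follower receives the entry right after its last one.
        System.out.println(isContiguous(100L, 101L)); // prints "true"
    }
}
```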

If an OM is terminated and leaves the new DB and the old Raft log files on 
disk, we should not allow the OM to start the next time.  The OM state can be 
recovered by switching back to the old DB, so that the DB and the Raft log 
match.  After that, restarting this OM will succeed, and it can continue to 
accept install snapshot requests from the leader. 
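
One hypothetical way to detect the state described above at startup is to 
compare the index recorded in the DB with the last index in the local Raft 
log.  The sketch below assumes that a DB index ahead of the log means the 
snapshot install was interrupted; that rule, and every name in it, is an 
assumption made for illustration and not the fix proposed in this issue.

```java
// Illustrative only: hypothetical names and detection rule, not Ozone code.
public final class StartupConsistencyGuard {
    private StartupConsistencyGuard() {}

    /** Outcome of the startup check. */
    public enum Decision { START, FAIL_START }

    /**
     * Assumption: a DB copied from the leader carries an applied index that
     * is ahead of the follower's own old Raft log, so the two do not match.
     */
    public static Decision check(long dbLastAppliedIndex, long raftLogLastIndex) {
        return dbLastAppliedIndex > raftLogLastIndex
            ? Decision.FAIL_START
            : Decision.START;
    }

    public static void main(String[] args) {
        // New DB from the leader (index 500) with the old log (ends at 100).
        System.out.println(check(500L, 100L)); // prints "FAIL_START"
        // After switching back to the old DB, the DB and the log match again.
        System.out.println(check(100L, 100L)); // prints "START"
    }
}
```

Failing fast here keeps the mismatched entry from being written to the Raft 
log, so the OM can still be recovered by switching back to the old DB.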



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
