Devesh Kumar Singh created HDDS-12377:
-----------------------------------------
Summary: Ozone Recon - Improve error handling of OM background
tasks processing in case of abrupt crash of Recon
Key: HDDS-12377
URL: https://issues.apache.org/jira/browse/HDDS-12377
Project: Apache Ozone
Issue Type: Task
Components: Ozone Recon
Reporter: Devesh Kumar Singh
Assignee: Devesh Kumar Singh
If Recon has applied incremental OM DB updates but crashes due to an unexpected
error, or is restarted by the customer, just before the corresponding events
are consumed, then after the restart Recon does not try to consume those events
again. In this edge case the OM DB updates are missed. There are 2 possible
solutions to close this gap:
* On restart, check whether the lastSequence number of the incremental DB update
task matches the lastUpdatedSeq number of each underlying task; if it does not,
just run reprocess for such tasks.
* Alternatively, maintain a lastUpdatedSequence number with each event
consumption and resume applying from there on restart, but the complex handling
this requires may not be worth implementing for this edge case.
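
A minimal sketch of the first option, to make the restart check concrete. It is
not Recon's actual code: the class name, method name, and task names are
hypothetical, and it assumes each task persists the last sequence number it
processed so that the value can be read back on restart.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Recon: decides which tasks need a full
// reprocess after a restart.
class RestartSequenceCheck {

    /**
     * @param lastAppliedSeq     last sequence number recorded by the incremental
     *                           DB update task (assumed to be persisted)
     * @param taskLastUpdatedSeq last sequence number recorded by each
     *                           underlying task, keyed by task name
     * @return names of tasks whose recorded sequence number does not match
     *         and that should therefore run reprocess
     */
    static List<String> tasksNeedingReprocess(
            long lastAppliedSeq, Map<String, Long> taskLastUpdatedSeq) {
        List<String> stale = new ArrayList<>();
        for (Map.Entry<String, Long> entry : taskLastUpdatedSeq.entrySet()) {
            // A mismatch means events were applied to the DB but never
            // consumed by this task before the crash or restart.
            if (entry.getValue().longValue() != lastAppliedSeq) {
                stale.add(entry.getKey());
            }
        }
        return stale;
    }
}
```

With a recorded sequence number of 100 for the incremental update task, a task
that stored 100 is left alone, while a task that stored 90 is returned and
would be reprocessed from scratch.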