devmadhuu commented on PR #5645: URL: https://github.com/apache/ozone/pull/5645#issuecomment-1822886873
> @devmadhuu Thanks for working on this, testing the fix repeatedly, and providing explanation.
>
> Can you please run all of `TestOzoneManagerHAWithStoppedNodes` instead of just the failing test case? Tests may affect each other (ideally we try to avoid / get rid of that, but it's not perfect). We need to match how it is run in regular CI.
>
> I also have a question regarding the explanation and the fix:
>
> * `lastAppliedTermIndex` is from state machine (memory)
> * `ratisSnapshotIndex` is from DB
> * proposed fix waits for transactions to be flushed to DB, which increases `ratisSnapshotIndex`
> * test asserts `lastAppliedTermIndex >= ratisSnapshotIndex` but sometimes finds this to be false, i.e. `lastAppliedTermIndex < ratisSnapshotIndex`
>
> I don't see how waiting for increase in `ratisSnapshotIndex` would help in the failure case.
>
> Am I missing something?

Thanks @adoroszlai for the review. In the current OM state machine and OMDoubleBuffer implementation, the Ratis `lastAppliedTermIndex` is updated in `org.apache.hadoop.ozone.om.ratis.OzoneManagerStateMachine#computeAndUpdateLastAppliedIndex`, which has two callers: the Ratis `StateMachineUpdater` thread and the `OMDoubleBufferFlushThread`. The `StateMachineUpdater` thread updates `lastAppliedTermIndex` only when `notifyIndexUpdate` is called and the new index is one higher than the current `lastAppliedTermIndex`. There seems to be a synchronization and timing issue in `computeAndUpdateLastAppliedIndex`: `lastAppliedTermIndex` on the OM state machine is updated only after `OMDoubleBufferFlushThread` has flushed the transactions. So we may need to call `awaitDoubleBufferFlush` in the test case.
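
To illustrate the ordering described above, here is a toy model in plain Java. It is not Ozone code: the class `FlushOrderingSketch`, its fields, and its methods are hypothetical stand-ins, and it assumes (as the explanation above suggests) that the flush thread writes the DB index first and updates the in-memory index afterwards. Latches are used only to make the window between the two updates observable in a deterministic way.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

/** Toy model only; none of these names are Ozone or Ratis APIs. */
class FlushOrderingSketch {
    // Stand-in for the in-memory lastAppliedTermIndex (index part only).
    private final AtomicLong lastAppliedIndex = new AtomicLong();
    // Stand-in for the ratisSnapshotIndex persisted in the DB.
    private final AtomicLong snapshotIndex = new AtomicLong();
    // Signals that the "DB write" has happened.
    private final CountDownLatch dbWritten = new CountDownLatch(1);
    // Holds the flush thread inside the window until the caller releases it.
    private final CountDownLatch resume = new CountDownLatch(1);
    // Signals that the flush thread has also updated the in-memory index.
    private final CountDownLatch flushDone = new CountDownLatch(1);

    /** Starts a thread that mimics the assumed flush ordering. */
    void startFlush(long index) {
        Thread t = new Thread(() -> {
            snapshotIndex.set(index);        // step 1: index reaches the "DB"
            dbWritten.countDown();
            try {
                resume.await();              // keep the window open
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            lastAppliedIndex.set(index);     // step 2: in-memory index catches up
            flushDone.countDown();
        }, "toy-flush-thread");
        t.setDaemon(true);
        t.start();
    }

    /** The condition the test asserts: lastApplied >= snapshot index. */
    boolean invariantHolds() {
        return lastAppliedIndex.get() >= snapshotIndex.get();
    }

    /** Returns {invariant inside the window, invariant after awaiting the flush}. */
    static boolean[] run() {
        FlushOrderingSketch s = new FlushOrderingSketch();
        s.startFlush(5);
        try {
            s.dbWritten.await();
            boolean duringWindow = s.invariantHolds();
            s.resume.countDown();
            // Analogous in spirit to waiting for the double buffer flush to finish.
            if (!s.flushDone.await(5, TimeUnit.SECONDS)) {
                throw new IllegalStateException("toy flush did not finish");
            }
            boolean afterAwait = s.invariantHolds();
            return new boolean[] {duringWindow, afterAwait};
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        boolean[] r = run();
        System.out.println("invariant inside the window: " + r[0]);
        System.out.println("invariant after awaiting flush: " + r[1]);
    }
}
```

In this model the assertion fails if it is evaluated between the two updates and passes once the caller has waited for the flush thread to complete both. Whether the real `computeAndUpdateLastAppliedIndex` path behaves the same way still needs to be confirmed against the OM code.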
