devmadhuu commented on PR #5645:
URL: https://github.com/apache/ozone/pull/5645#issuecomment-1822886873

   > @devmadhuu Thanks for working on this, testing the fix repeatedly, and 
providing explanation.
   > 
   > Can you please run all of `TestOzoneManagerHAWithStoppedNodes` instead of 
just the failing test case? Tests may affect each other (ideally we try to 
avoid / get rid of that, but it's not perfect). We need to match how it is run 
in regular CI.
   > 
   > I also have a question regarding the explanation and the fix:
   > 
   > * `lastAppliedTermIndex` is from state machine (memory)
   > * `ratisSnapshotIndex` is from DB
   > * proposed fix waits for transactions to be flushed to DB, which increases 
`ratisSnapshotIndex`
   > * test asserts `lastAppliedTermIndex >= ratisSnapshotIndex` but sometimes 
finds this to be false, i.e. `lastAppliedTermIndex < ratisSnapshotIndex`
   > 
   > I don't see how waiting for increase in `ratisSnapshotIndex` would help in 
the failure case.
   > 
   > Am I missing something?
   
   Thanks @adoroszlai for the review. In the current OM state machine and 
OMDoubleBuffer implementation, the Ratis `lastAppliedTermIndex` is updated in 
`org.apache.hadoop.ozone.om.ratis.OzoneManagerStateMachine#computeAndUpdateLastAppliedIndex`, 
which has two callers: the Ratis `StateMachineUpdater` thread and the 
`OMDoubleBufferFlushThread`. The `StateMachineUpdater` thread updates 
`lastAppliedTermIndex` only when `notifyIndexUpdate` is called and the new 
index is exactly one higher than the current `lastAppliedTermIndex`. So there 
seems to be a synchronization and timing issue in 
`computeAndUpdateLastAppliedIndex`: `lastAppliedTermIndex` on the OM state 
machine is updated only after `OMDoubleBufferFlushThread` has flushed the 
transactions. We may therefore need to call `awaitDoubleBufferFlush` in the 
test case.
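
   To make the suspected ordering concrete, here is a minimal, self-contained 
sketch. It is not the Ozone code: the class, fields, and latches are 
hypothetical stand-ins. It assumes, as described above, that the flush thread 
persists the snapshot index first and advances the in-memory applied index 
afterwards, and it shows a reader checking the invariant in between versus 
after waiting for the flush to finish.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical model of the ordering discussed above; none of these names
// come from Ozone or Ratis.
public class FlushOrderingSketch {
    // Stand-in for the index persisted to the DB by the flush thread.
    private final AtomicLong snapshotIndex = new AtomicLong(0);
    // Stand-in for the in-memory last applied index of the state machine.
    private final AtomicLong lastApplied = new AtomicLong(0);

    // Latches make the interleaving deterministic for the demo.
    private final CountDownLatch dbWritten = new CountDownLatch(1);
    private final CountDownLatch resume = new CountDownLatch(1);
    private final CountDownLatch flushDone = new CountDownLatch(1);

    // The invariant the test asserts: applied index >= persisted index.
    boolean invariantHolds() {
        return lastApplied.get() >= snapshotIndex.get();
    }

    // Simulated flush: persist first, update the in-memory index second.
    Thread startFlush(long index) {
        Thread t = new Thread(() -> {
            snapshotIndex.set(index);          // step 1: "DB" write
            dbWritten.countDown();
            try {
                resume.await();                // window in which a reader can observe the gap
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            lastApplied.accumulateAndGet(index, Math::max); // step 2: in-memory update
            flushDone.countDown();
        }, "flush-sketch");
        t.start();
        return t;
    }

    // Returns {invariant observed mid-flush, invariant observed after awaiting the flush}.
    static boolean[] demo() throws InterruptedException {
        FlushOrderingSketch s = new FlushOrderingSketch();
        Thread t = s.startFlush(5);
        s.dbWritten.await();
        boolean midFlush = s.invariantHolds();   // reads 0 >= 5
        s.resume.countDown();
        s.flushDone.await();                     // analogous to waiting for the flush in the test
        boolean afterAwait = s.invariantHolds(); // reads 5 >= 5
        t.join();
        return new boolean[] {midFlush, afterAwait};
    }

    public static void main(String[] args) throws InterruptedException {
        boolean[] r = demo();
        System.out.println("mid-flush: " + r[0] + ", after await: " + r[1]);
    }
}
```

   In this model the check fails only inside the window between the two steps, 
which is the window that waiting for the flush is meant to close. Whether the 
real code paths interleave this way is the assumption under discussion.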


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

