Tsz-wo Sze created HDDS-11955:
---------------------------------

             Summary: ContainerStateMachine.readStateMachineData may throw 
NoSuchFileException
                 Key: HDDS-11955
                 URL: https://issues.apache.org/jira/browse/HDDS-11955
             Project: Apache Ozone
          Issue Type: Bug
          Components: Ozone Datanode
            Reporter: Tsz-wo Sze


Suppose we have the following Raft log entries and follower state:
 - index 110 is a writeChunk,
 - index 120 is a deleteBlock for the chunk above, and
 - one of the datanode followers has nextIndex 100.

Then the datanode leader has to send log entries to that follower starting 
from index 100.  If the leader has already applied log entry 120, 
ContainerStateMachine.readStateMachineData has to read the chunk of a block 
that has already been deleted, which leads to NoSuchFileException.

Also, since the Ozone datanode does not support Ratis snapshots, this problem 
cannot be worked around by sending a snapshot.

Potential fix: The datanode leader has to return something instead of throwing 
NoSuchFileException. The datanode leader might return a proto indicating that 
the block is already deleted. When the datanode follower receives the proto, it 
would just mark the block as deleted. We need a design for this problem.
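
As a rough illustration of the idea only, and not a proposed patch, the 
leader-side read could catch the missing file and return an "already deleted" 
marker instead of failing. Everything in the sketch below is hypothetical: 
DeletedBlockReadSketch, ReadResult and readChunk are made-up names, the marker 
stands in for the proto mentioned above, and a plain temp file stands in for 
the chunk file. It does not use any Ozone or Ratis API.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;

/** Hypothetical sketch; none of these names are Ozone or Ratis APIs. */
final class DeletedBlockReadSketch {

  /** Hypothetical reply: either the chunk bytes or an "already deleted" marker. */
  static final class ReadResult {
    final boolean deleted;
    final byte[] data;

    private ReadResult(boolean deleted, byte[] data) {
      this.deleted = deleted;
      this.data = data;
    }

    static ReadResult ofData(byte[] data) {
      return new ReadResult(false, data);
    }

    static ReadResult ofDeleted() {
      return new ReadResult(true, new byte[0]);
    }
  }

  /** Leader side: read the data of a writeChunk entry, tolerating a deleted file. */
  static ReadResult readChunk(Path chunkFile) throws IOException {
    try {
      return ReadResult.ofData(Files.readAllBytes(chunkFile));
    } catch (NoSuchFileException e) {
      // A later deleteBlock entry was already applied on the leader,
      // so report the deletion to the follower instead of failing the read.
      return ReadResult.ofDeleted();
    }
  }

  public static void main(String[] args) throws IOException {
    Path chunk = Files.createTempFile("chunk", ".data");
    Files.write(chunk, new byte[] {1, 2, 3});       // index 110: writeChunk
    System.out.println(readChunk(chunk).deleted);   // prints false
    Files.delete(chunk);                            // index 120: deleteBlock applied
    System.out.println(readChunk(chunk).deleted);   // prints true
  }
}
```

In this sketch, a follower that receives a result with deleted == true would 
skip writing the chunk and only record that the block is deleted. How that 
record interacts with the later deleteBlock entry at index 120 is exactly the 
part that still needs a design.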

Thanks [~sammichen] for pointing out the problem.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
