Tsz-wo Sze created HDDS-11955:
---------------------------------
Summary: ContainerStateMachine.readStateMachineData may throw
NoSuchFileException
Key: HDDS-11955
URL: https://issues.apache.org/jira/browse/HDDS-11955
Project: Apache Ozone
Issue Type: Bug
Components: Ozone Datanode
Reporter: Tsz-wo Sze
Suppose we have the following Raft log entries:
- index 110 is a writeChunk,
- index 120 is a deleteBlock for the chunk above, and
- one of the datanode followers has nextIndex 100.
Then the datanode leader has to send log entries to that follower starting
from index 100. If the leader has already applied log entry 120, then to send
entry 110 it has to read a block that is already deleted in
ContainerStateMachine.readStateMachineData, which leads to
NoSuchFileException.
Also, since the Ozone datanode does not support Ratis snapshots, this problem
cannot be worked around by sending a snapshot.
Potential fix: The datanode leader has to return something instead of throwing
NoSuchFileException. The datanode leader might return a proto indicating that
the block is already deleted. When the datanode follower receives the proto, it
will just mark the block as deleted. We need a design for this problem.
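
As a rough illustration of the idea only, the sketch below replays the scenario
on a local temporary file and turns the missing-file case into a "deleted"
marker. DeletedBlockSketch, ReadResult and readChunk are hypothetical names
made up for this example; they are not the Ozone or Ratis API, and the actual
proto and follower handling still need the design mentioned above.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;

public class DeletedBlockSketch {

  /** Hypothetical reply: chunk bytes, or a marker that the block is gone. */
  public static final class ReadResult {
    public final boolean deleted;
    public final byte[] data;

    private ReadResult(boolean deleted, byte[] data) {
      this.deleted = deleted;
      this.data = data;
    }
  }

  /** Hypothetical stand-in for the leader-side read; not the Ozone API. */
  public static ReadResult readChunk(Path chunkFile) throws IOException {
    try {
      return new ReadResult(false, Files.readAllBytes(chunkFile));
    } catch (NoSuchFileException e) {
      // The delete was applied before this read: report "deleted"
      // to the caller instead of propagating the exception.
      return new ReadResult(true, new byte[0]);
    }
  }

  public static void main(String[] args) throws IOException {
    Path dir = Files.createTempDirectory("hdds-sketch");
    Path chunk = dir.resolve("chunk-110");

    Files.write(chunk, new byte[] {1, 2, 3});   // index 110: writeChunk
    System.out.println("deleted=" + readChunk(chunk).deleted);

    Files.delete(chunk);                        // index 120: deleteBlock applied
    System.out.println("deleted=" + readChunk(chunk).deleted);

    Files.delete(dir);                          // clean up the temp directory
  }
}
```

Running main prints deleted=false and then deleted=true. A follower receiving
such a marker could skip writing the chunk and record the block as deleted,
which is the behavior proposed above.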
Thanks [~sammichen] for pointing out the problem.