[
https://issues.apache.org/jira/browse/HDDS-13487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ethan Rose resolved HDDS-13487.
-------------------------------
Fix Version/s: 2.1.0
Resolution: Fixed
> Checksum is updated after reaching CLOSED state (should remain immutable)
> -------------------------------------------------------------------------
>
> Key: HDDS-13487
> URL: https://issues.apache.org/jira/browse/HDDS-13487
> Project: Apache Ozone
> Issue Type: Bug
> Components: Ozone Datanode
> Affects Versions: 2.0.0
> Reporter: Bablu Raul
> Priority: Major
> Fix For: 2.1.0
>
>
> Multiple containers show unexpected checksum changes after transitioning to
> the *CLOSED* state, which violates expected behavior. Once a container
> reaches the *CLOSED* state, its checksum should remain stable.
> {code:java}
> for cid in 5030 5027 5028 5026 5019; do
> echo "Checksums for container $cid:"
> ozone admin container info --json $cid | jq -r '.replicas[] |
> "\(.datanodeDetails.hostName) \(.dataChecksum)"'
> echo "-----"
> done
> Checksums for container 5030:
> data6 a0b5e024
> data5 a0b5e024
> data3 572ab97a
> -----
> Checksums for container 5027:
> data4 20120d7e
> data8 20120d7e
> data9 904f5997
> -----
> Checksums for container 5028:
> data6 1e8c0039
> data5 1e8c0039
> data3 1e8c0039
> -----
> Checksums for container 5026:
> data9 e7c3f2cf
> data4 e7c3f2cf
> data8 e7c3f2cf
> -----
> Checksums for container 5019:
> data6 38e8d3d8
> data5 38e8d3d8
> data3 4e9c35b1
> -----
> {code}
> Only containers in the CLOSED state are affected, and several of them show
> the problem:
> {code:java}
> grep 5028 /var/log/hadoop-ozone/dn-container.log
> 2025-06-12 17:18:17,268 | INFO | ID=5028 | Index=0 | BCSID=0 | State=OPEN |
> DataChecksum=0 |
> 2025-06-12 19:32:50,015 | WARN | ID=5028 | Index=0 | BCSID=991 |
> State=CLOSING | DataChecksum=1e8c0039 | Container data checksum updated from
> 0 to 1e8c0039 |
> 2025-06-12 19:32:50,015 | INFO | ID=5028 | Index=0 | BCSID=991 |
> State=CLOSING | DataChecksum=1e8c0039 |
> 2025-06-12 19:32:50,017 | INFO | ID=5028 | Index=0 | BCSID=991 |
> State=CLOSING | DataChecksum=1e8c0039 |
> 2025-06-12 19:32:50,018 | INFO | ID=5028 | Index=0 | BCSID=991 |
> State=CLOSED | DataChecksum=1e8c0039 |
> 2025-06-12 19:50:04,341 | WARN | ID=5028 | Index=0 | BCSID=991 |
> State=CLOSED | DataChecksum=2571a4df | Container data checksum updated from
> 1e8c0039 to 2571a4df |
> {code}
> {code:java}
> grep 5030 /var/log/hadoop-ozone/dn-container.log
> 2025-06-12 17:23:27,289 | INFO | ID=5030 | Index=0 | BCSID=0 | State=OPEN |
> DataChecksum=0 |
> 2025-06-12 19:32:49,561 | WARN | ID=5030 | Index=0 | BCSID=1091 |
> State=CLOSING | DataChecksum=a0b5e024 | Container data checksum updated from
> 0 to a0b5e024 |
> 2025-06-12 19:32:49,562 | INFO | ID=5030 | Index=0 | BCSID=1091 |
> State=CLOSING | DataChecksum=a0b5e024 |
> 2025-06-12 19:32:49,563 | INFO | ID=5030 | Index=0 | BCSID=1091 |
> State=CLOSED | DataChecksum=a0b5e024 |
> 2025-06-12 19:34:56,818 | WARN | ID=5030 | Index=0 | BCSID=1091 |
> State=CLOSED | DataChecksum=572ab97a | Container data checksum updated from
> a0b5e024 to 572ab97a |
> 2025-06-12 19:50:55,837 | WARN | ID=5030 | Index=0 | BCSID=1091 |
> State=CLOSED | DataChecksum=7cd417ef | Container data checksum updated from
> 572ab97a to 7cd417ef |
> {code}
> In the case of container ID 4028, the checksum is updated when transitioning
> from the CLOSING to the CLOSED state, which is the expected behavior.
> For container ID 4030, however, the checksum is updated after the container
> has already reached the CLOSED state. This is unexpected and potentially
> problematic: once a container is CLOSED, its checksum must not change.
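
The check described in the report (no checksum update once a container is
CLOSED) can be sketched as a small shell function that scans the datanode
container log. This is only an illustration, not part of the fix: the function
name find_closed_checksum_updates is hypothetical, and it assumes that each
entry in the real log file is a single line with the same " | "-separated
field order as the excerpts above (the wrapping in this email is not present
in the file).

```shell
#!/bin/sh
# Hypothetical helper: list checksum updates that were logged while the
# container was already in the CLOSED state.
# Assumed field order (taken from the excerpts in the report):
#   timestamp | level | ID | Index | BCSID | State | DataChecksum | message |
find_closed_checksum_updates() {
  awk -F ' *[|] *' '
    # Match entries in state CLOSED that carry a checksum-update message.
    $6 == "State=CLOSED" && $8 ~ /^Container data checksum updated from / {
      sub(/^ID=/, "", $3)   # keep only the numeric container ID
      print $3, $1, $8      # container ID, timestamp, update message
    }
  ' "$@"
}
```

For example, find_closed_checksum_updates /var/log/hadoop-ozone/dn-container.log
prints one line per offending update. Updates logged in the CLOSING state are
not reported, since the report treats those as expected.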