sreejasahithi commented on code in PR #10547:
URL: https://github.com/apache/ozone/pull/10547#discussion_r3459487267
##########
hadoop-ozone/cli-debug/src/main/java/org/apache/hadoop/ozone/debug/datanode/container/analyze/AnalyzeSubcommand.java:
##########
@@ -34,65 +41,123 @@
*/
@Command(
name = "analyze",
- description = "Analyze container consistency between on-disk container " +
- "directories on this DataNode and SCM metadata. Must be run locally on a DataNode.")
+ description = {
+ "Analyze container consistency between on-disk container directories on this DataNode and SCM metadata.",
+ "Must be run locally on a DataNode.",
+ "",
+ "Each reported container occurrence includes a status:",
+ " MISSING_METADATA: metadata/{containerId}.container does not exist.",
+ " INVALID_METADATA: metadata file exists but cannot be parsed, or the container ID in the metadata",
+ " does not match the directory name.",
+ " VALID: metadata file is present and consistent with the directory."
Review Comment:
Thanks for the review. I think there may be a misunderstanding about when we
compute status/size and what VALID means here.
How the command works:
This is a single command, `ozone debug datanode container analyze`, which, as
we discussed, reports all three inconsistency types: duplicate container
directories, orphan containers, and deleted-but-present containers.
Part 1 - DN scan (done in https://github.com/apache/ozone/pull/10414)
We scan each volume under hdds.datanode.dir and find all container
directories on that DataNode. For each container we only store:
- container ID
- container dir path(s)
We build two maps:
singles: container ID -> one path (seen on only one volume/path)
duplicates: container ID -> list of paths (same ID seen in more than one
place)
A container is in either singles or duplicates, not both.
At this stage we do not compute status or size for every container on the DN.
We enrich only the duplicates. For each path in the duplicates map we compute:
- on-disk metadata status (VALID / MISSING_METADATA / INVALID_METADATA)
- directory size
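
A minimal sketch of the Part 1 bookkeeping, assuming container IDs are longs and paths are plain strings. The class and method names (`ContainerDirIndex`, `record`) are hypothetical and are not taken from the patch:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: keeps every container ID in exactly one of the two maps.
final class ContainerDirIndex {
  // container ID -> the only path seen so far
  private final Map<Long, String> singles = new LinkedHashMap<>();
  // container ID -> all paths, once the ID has been seen more than once
  private final Map<Long, List<String>> duplicates = new LinkedHashMap<>();

  /** Records one container directory found while scanning a volume. */
  void record(long containerId, String path) {
    List<String> paths = duplicates.get(containerId);
    if (paths != null) {
      // Already known to be a duplicate: just add another occurrence.
      paths.add(path);
      return;
    }
    String first = singles.remove(containerId);
    if (first == null) {
      // First sighting of this ID.
      singles.put(containerId, path);
    } else {
      // Second sighting: move the ID from singles to duplicates.
      List<String> both = new ArrayList<>();
      both.add(first);
      both.add(path);
      duplicates.put(containerId, both);
    }
  }

  Map<Long, String> getSingles() {
    return singles;
  }

  Map<Long, List<String>> getDuplicates() {
    return duplicates;
  }
}
```

Removing the ID from singles at the moment it is promoted is what keeps the two maps disjoint.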
Part 2 - SCM lookup (only when --scm-db is provided) - current patch
We take the union of container IDs from singles and duplicates, and for each
ID check scm.db:
- not present in SCM -> orphan
- present but state is DELETED -> deleted-but-present
- present with any other state -> ignore (not reported)
For orphan / deleted-but-present containers:
- if the container is already in duplicates, we reuse the enriched
occurrences from Part 1 (no second enrichment)
- if it is only in singles, we enrich (i.e. compute status and size for) that
one path on demand
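
A rough sketch of the Part 2 decision, under the assumption that the scm.db lookup can be modeled as a map from container ID to state name. The names `ScmLookupSketch`, `classify`, and `occurrences` are hypothetical, and enriched occurrences are represented as plain strings only for illustration:

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

final class ScmLookupSketch {
  enum Finding { ORPHAN, DELETED_BUT_PRESENT }

  /**
   * scmStates stands in for the scm.db lookup (container ID -> state name).
   * An empty result means the container is not reported.
   */
  static Optional<Finding> classify(long id, Map<Long, String> scmStates) {
    String state = scmStates.get(id);
    if (state == null) {
      return Optional.of(Finding.ORPHAN);
    }
    if ("DELETED".equals(state)) {
      return Optional.of(Finding.DELETED_BUT_PRESENT);
    }
    return Optional.empty();
  }

  /**
   * Reuses the occurrences already enriched in Part 1 when the ID is a
   * duplicate; otherwise enriches the single path on demand.
   */
  static List<String> occurrences(long id,
      Map<Long, List<String>> enrichedDuplicates,
      Map<Long, String> singles,
      Function<String, String> enrich) {
    List<String> reused = enrichedDuplicates.get(id);
    if (reused != null) {
      return reused;
    }
    return Collections.singletonList(enrich.apply(singles.get(id)));
  }
}
```

`occurrences` would only be called for IDs where `classify` returns a finding, so containers with any other SCM state are never enriched.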
So status and size are computed only for duplicate containers and for
orphan / deleted-but-present containers.
We do not compute or print status/size for every container found on the DN.
Normal containers that exist in SCM with a non-DELETED state are never enriched
and never printed - so we will not return millions of VALID containers.
Here each status is assigned as follows:
- MISSING_METADATA if metadata/{containerId}.container does not exist.
- INVALID_METADATA if the metadata file exists but cannot be parsed, or if
the container ID stored in the metadata does not match the container ID in
the directory name.
- VALID otherwise
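
The three rules above could be sketched as follows. This is only an illustration: `MetadataStatusCheck` and the `IdReader` hook are hypothetical, and `IdReader` stands in for whatever parser actually reads the `.container` file:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

final class MetadataStatusCheck {
  enum Status { VALID, MISSING_METADATA, INVALID_METADATA }

  /** Hypothetical hook standing in for the real .container file parser. */
  interface IdReader {
    long readContainerId(Path metadataFile) throws IOException;
  }

  /**
   * dirContainerId is the container ID taken from the directory name.
   */
  static Status check(Path containerDir, long dirContainerId, IdReader reader) {
    Path file = containerDir.resolve("metadata")
        .resolve(dirContainerId + ".container");
    if (!Files.exists(file)) {
      return Status.MISSING_METADATA;
    }
    try {
      long storedId = reader.readContainerId(file);
      // A parsed file is still invalid if it names a different container.
      return storedId == dirContainerId
          ? Status.VALID : Status.INVALID_METADATA;
    } catch (IOException | RuntimeException e) {
      // Any parse failure counts as invalid metadata.
      return Status.INVALID_METADATA;
    }
  }
}
```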
I will update the command description so that this is clearer.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]