Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/24092 )

Change subject: [fs] add metrics for untracked orphaned blocks
......................................................................


Patch Set 2:

(15 comments)

A quick first pass.

http://gerrit.cloudera.org:8080/#/c/24092/2/src/kudu/fs/log_block_manager.cc
File src/kudu/fs/log_block_manager.cc:

http://gerrit.cloudera.org:8080/#/c/24092/2/src/kudu/fs/log_block_manager.cc@2528
PS2, Line 2528:   if (PREDICT_FALSE(!s.ok())) {
Why not to move this report/warning piece closer to the source of the non-OK 
result status, so there is less noise and uncertainty whether block may or may 
not be left orphaned?  E.g., some where in LogBlockManager::RemoveLogBlocks()?


http://gerrit.cloudera.org:8080/#/c/24092/2/src/kudu/fs/log_block_manager.cc@2533
PS2, Line 2533: LOG(WARNING)
There might be many blocks failing to delete if they are on a failed data 
directory, and this might flood the logs.  I don't think there is a lot of 
demand for having information on exact block IDs, length, etc. to report right 
away.  If necessary, it might be found using offline CLI tools.

Also, such detailed block information doesn't provide much value for the 
suggested `kudu fs check --repair' run since the tool doesn't take any piece of 
that info as an input.

With that, consider rate/count limiting when outputting this warning to avoid 
flooding the log with useless details.

As for exact count of orphaned blocks and estimate for the disk space behind, 
that should be taken care in metrics.  I don't think log is the place to keep 
track of and account for that.


http://gerrit.cloudera.org:8080/#/c/24092/2/src/kudu/fs/log_block_manager.cc@2534
PS2, Line 2534: due to failure writing deletion metadata: $3.
I'm not sure this is always due to a failure writing metadata: e.g., what about 
an attempt to delete blocks in a failed data directory?


http://gerrit.cloudera.org:8080/#/c/24092/2/src/kudu/tablet/tablet.cc
File src/kudu/tablet/tablet.cc:

http://gerrit.cloudera.org:8080/#/c/24092/2/src/kudu/tablet/tablet.cc@1741
PS2, Line 1741:     TabletMetadata::OrphanBlockCleanupStats orphan_stats;
              :     RETURN_NOT_OK(metadata_->Flush(&orphan_stats));
              :     UpdateOrphanBlockMetrics(orphan_stats);
Here and elsewhere: why not to encapsulate this into  TabletMetadata::Flush() 
method instead of spreading this in many places at the upper level?


http://gerrit.cloudera.org:8080/#/c/24092/2/src/kudu/tablet/tablet.cc@1986
PS2, Line 1986: if (!metrics_) return;
style nit: add braces scope even for one-line 'if' scope, similar to one-line 
'for' cycle scope


http://gerrit.cloudera.org:8080/#/c/24092/2/src/kudu/tablet/tablet_metadata.h
File src/kudu/tablet/tablet_metadata.h:

http://gerrit.cloudera.org:8080/#/c/24092/2/src/kudu/tablet/tablet_metadata.h@189
PS2, Line 189:   // Callers may pass a pointer to Flush() or UpdateAndFlush() 
to receive this information, and
             :   // can then update tablet-level metrics and log the results.
nit: why to mention this here at all when this belongs rather to docs on 
particular methods/functions?  Such mentions tend to become outdated and 
confusing on updates on the related methods/functions.


http://gerrit.cloudera.org:8080/#/c/24092/2/src/kudu/tablet/tablet_metadata.h@192
PS2, Line 192:     // Number of orphaned blocks successfully deleted from disk.
             :     int blocks_cleaned = 0;
             :     // Number of orphaned blocks that could not be deleted (e.g. 
I/O error).
             :     // These are retained in the superblock and retried on the 
next flush.
             :     int blocks_failed = 0;
What's the reason behind using signed integers for counters when using unsigned 
for 'bytes_failed'?  Is there any reserved/special semantics for negative 
values for these two fields?


http://gerrit.cloudera.org:8080/#/c/24092/2/src/kudu/tablet/tablet_metadata.cc
File src/kudu/tablet/tablet_metadata.cc:

http://gerrit.cloudera.org:8080/#/c/24092/2/src/kudu/tablet/tablet_metadata.cc@565
PS2, Line 565: (orphaned_blocks_.size() + block_ids.size())
nit: remove the enclosing parentheses?


http://gerrit.cloudera.org:8080/#/c/24092/2/src/kudu/tablet/tablet_metadata.cc@572
PS2, Line 572: if (blocks.empty()) return stats;
style nit: always use scope braces for 'if', similar to 'for' cycle

Also, make this block evaluated before allocating the 'stats' variable on the 
stack.


http://gerrit.cloudera.org:8080/#/c/24092/2/src/kudu/tablet/tablet_metrics.h
File src/kudu/tablet/tablet_metrics.h:

http://gerrit.cloudera.org:8080/#/c/24092/2/src/kudu/tablet/tablet_metrics.h@87
PS2, Line 87: I/O or hardware errors
nit: does the related code even capable of differentiating between I/O and 
hardware errors?


http://gerrit.cloudera.org:8080/#/c/24092/2/src/kudu/tablet/tablet_metrics.h@88
PS2, Line 88: A sustained non-zero value may indicate disk issues
This is confusing.  Sustained by which means?  These counters are reset to 0 
upon restarting tablet servers, aren't they?  Also, they stay non-zero in the 
lifecycle of a tablet on particular tablet server if any blocks found orphaned 
during bootstrap.


http://gerrit.cloudera.org:8080/#/c/24092/2/src/kudu/tablet/tablet_metrics.h@92
PS2, Line 92: orphaned_block_cleanup_failures_bytes
If introducing this pairing metric for orphaned_block_cleanup_failures, why not 
to venture introducing similar metric for orphaned_blocks_cleaned?  If somebody 
in interested in knowing how much space is occupied by orphaned blocks, they 
might be curious how much was reclaimed to see if the orphaned block cleanup 
caused drop in disk space utilization since last restart of a tablet server.


http://gerrit.cloudera.org:8080/#/c/24092/2/src/kudu/tablet/tablet_metrics.cc
File src/kudu/tablet/tablet_metrics.cc:

http://gerrit.cloudera.org:8080/#/c/24092/2/src/kudu/tablet/tablet_metrics.cc@234
PS2, Line 234: Failed blocks are retried on the next metadata flush.
If this means blocks can be eventually cleaned up during lifecycle of a tablet, 
then this metric cannot be of 'counter' type, nope.  Switch to the 'gauge' type 
if the value of this metric isn't monotonically increasing during the lifecycle 
of a tablet within one continuous run (i.e. no restart) of a tablet server.

Same for the pairing orphaned_block_cleanup_failures_bytes metric below.


http://gerrit.cloudera.org:8080/#/c/24092/2/src/kudu/tablet/tablet_metrics.cc@241
PS2, Line 241:  This is disk space that is tracked in the tablet superblock "
             :                       "but has not been reclaimed. A 
persistently non-zero value indicates "
             :                       "orphaned data that may be accumulating 
and causing disk pressure. "
             :                       "Run 'kudu fs check --repair' to manually 
clean up orphaned blocks.
Move this out and merge into the description of the 
orphaned_block_cleanup_failures metric.


http://gerrit.cloudera.org:8080/#/c/24092/2/src/kudu/tablet/tablet_metrics.cc@245
PS2, Line 245: kWarn
FWIW, it's already enough to have orphaned_block_cleanup_failures as a metric 
with kWarn severity to spot an issue and build notifications.  I'd think of 
moving this metric under the 'kInfo' domain to reduce unnecessary load on 
metric collection engines which is a well known issue for larg Kudu clusters.



--
To view, visit http://gerrit.cloudera.org:8080/24092
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id386d9fc8d0900839e229e66772f35299b3ef2e9
Gerrit-Change-Number: 24092
Gerrit-PatchSet: 2
Gerrit-Owner: Ashwani Raina <[email protected]>
Gerrit-Reviewer: Abhishek Chennaka <[email protected]>
Gerrit-Reviewer: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Ashwani Raina <[email protected]>
Gerrit-Reviewer: Attila Bukor <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Marton Greber <[email protected]>
Gerrit-Reviewer: Zoltan Martonka <[email protected]>
Gerrit-Comment-Date: Tue, 24 Mar 2026 20:28:20 +0000
Gerrit-HasComments: Yes

Reply via email to