Copilot commented on code in PR #10110:
URL: https://github.com/apache/ozone/pull/10110#discussion_r3186043010
##########
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/service/DirectoryDeletingService.java:
##########
@@ -214,6 +221,9 @@ private synchronized void updateAndRestart(OzoneConfiguration conf) {
@Override
public DeletingServiceTaskQueue getTasks() {
+ resetDdsRoundStats();
+ ddsRunStartMs = System.currentTimeMillis();
+ getMetrics().setDdsCurRunTimestamp(ddsRunStartMs);
Review Comment:
These new DDS run timestamps are updated before the task checks
`shouldRun()`. On follower OMs (or while the service is suspended),
`DirDeletingTask.call()` exits immediately, but the next scheduler tick will
still publish a fresh `DdsLastRunTimestamp` and zeroed last-run counters. That
makes the new JMX fields report runs that never actually executed.
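One possible shape for a fix, as a minimal sketch rather than the actual Ozone code: publish the run statistics inside the task, after the `shouldRun()` check, instead of in `getTasks()`. The classes `DdsGauges` and `GuardedDirDeletingTask` below are hypothetical stand-ins, and the real service may split a run across several tasks, which this sketch ignores.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for the DDS gauges; not the real DeletingServiceMetrics API.
class DdsGauges {
  final AtomicLong curRunTimestamp = new AtomicLong();
  final AtomicLong dirsSentForPurgeLast = new AtomicLong();
}

// Hypothetical stand-in for DirDeletingTask; it only models the ordering of the checks.
class GuardedDirDeletingTask {
  private final DdsGauges gauges;
  private final boolean leaderAndNotSuspended; // models the outcome of shouldRun()

  GuardedDirDeletingTask(DdsGauges gauges, boolean leaderAndNotSuspended) {
    this.gauges = gauges;
    this.leaderAndNotSuspended = leaderAndNotSuspended;
  }

  boolean shouldRun() {
    return leaderAndNotSuspended;
  }

  // Returns true only when the run actually executed and published its stats.
  boolean call(long nowMs, long dirsPurged) {
    if (!shouldRun()) {
      return false; // follower or suspended: leave every gauge untouched
    }
    gauges.curRunTimestamp.set(nowMs);          // start is published only for a real run
    gauges.dirsSentForPurgeLast.set(dirsPurged); // counters are replaced only by real work
    return true;
  }
}
```

With this ordering, a scheduler tick on a follower leaves the previous values in place instead of replacing them with a fresh timestamp and zeroed counters.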
##########
hadoop-ozone/ozone-manager/src/main/resources/webapps/ozoneManager/om-deletion.html:
##########
@@ -0,0 +1,186 @@
+<!--
+ Licensed to the Apache Software Foundation (ASF) under one or more
+ contributor license agreements. See the NOTICE file distributed with
+ this work for additional information regarding copyright ownership.
+ The ASF licenses this file to You under the Apache License, Version 2.0
+ (the "License"); you may not use this file except in compliance with
+ the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+-->
+
+<h1>Deletion services</h1>
+
+<div class="alert alert-info" ng-show="$ctrl.role && ($ctrl.role.Role || '').trim() !== 'LEADER'">
+ <p class="m-0">
+ <strong>This node is not the OM leader.</strong>
+ Key and directory deletion services run on the leader only; deletion settings and the metrics
+ on this page reflect the process where you are connected, not live leader work.
+ Open the <strong>Ozone Manager</strong> web UI on the <strong>leader</strong> OM to view
+ current deletion configuration and metrics.
+ </p>
+</div>
+
+<div ng-show="!$ctrl.role || ($ctrl.role.Role || '').trim() === 'LEADER'">
+<h2>Effective configuration (DELETION tag)</h2>
+<p class="text-muted small">Properties tagged <code>DELETION</code> in <code>ozone-default.xml</code>, loaded via <code>conf?cmd=getPropertyByTag&tags=DELETION</code>.</p>
Review Comment:
This section is presented as the OM's "effective" deletion configuration,
but it now pulls every property tagged `DELETION` from the local OM process.
Because the tag was also added to DN/SCM-only settings (for example
`ozone.block.deleting.service.*` and
`ozone.scm.keyvalue.container.deletion-choosing.policy`), the page can show
values that do not reflect the actual datanode/SCM configuration in the cluster.
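One way to narrow the table, sketched under assumptions: filter the tagged properties by key prefix before rendering. The helper `OmDeletionConfigFilter` is hypothetical, and its prefix list is built only from the two examples named above, so a real fix would need a vetted list (or a separate OM-specific tag instead).

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper; the prefixes are an assumption derived from the examples
// in this comment, not an authoritative list of DN/SCM-only settings.
class OmDeletionConfigFilter {
  private static final List<String> NON_OM_PREFIXES = List.of(
      "ozone.block.deleting.service.",
      "ozone.scm.");

  // Returns the tagged properties minus those whose key starts with a non-OM prefix.
  static Map<String, String> omOnly(Map<String, String> taggedProps) {
    Map<String, String> result = new LinkedHashMap<>(); // keeps the input order
    for (Map.Entry<String, String> entry : taggedProps.entrySet()) {
      String key = entry.getKey();
      boolean nonOm = NON_OM_PREFIXES.stream().anyMatch(prefix -> key.startsWith(prefix));
      if (!nonOm) {
        result.put(key, entry.getValue());
      }
    }
    return result;
  }
}
```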
##########
hadoop-ozone/ozone-manager/src/main/resources/webapps/ozoneManager/om-deletion.html:
##########
@@ -0,0 +1,186 @@
+
+<h1>Deletion services</h1>
+
+<div class="alert alert-info" ng-show="$ctrl.role && ($ctrl.role.Role || '').trim() !== 'LEADER'">
+ <p class="m-0">
+ <strong>This node is not the OM leader.</strong>
+ Key and directory deletion services run on the leader only; deletion settings and the metrics
+ on this page reflect the process where you are connected, not live leader work.
+ Open the <strong>Ozone Manager</strong> web UI on the <strong>leader</strong> OM to view
+ current deletion configuration and metrics.
+ </p>
+</div>
+
+<div ng-show="!$ctrl.role || ($ctrl.role.Role || '').trim() === 'LEADER'">
+<h2>Effective configuration (DELETION tag)</h2>
+<p class="text-muted small">Properties tagged <code>DELETION</code> in <code>ozone-default.xml</code>, loaded via <code>conf?cmd=getPropertyByTag&tags=DELETION</code>.</p>
+<p class="text-muted" ng-show="!$ctrl.deletionConfigs.length">No matching deletion-related properties found, or configuration is still loading.</p>
+<table class="table table-bordered table-striped" ng-show="$ctrl.deletionConfigs.length">
+ <thead>
+ <tr>
+ <th class="col-md-3">Property</th>
+ <th class="col-md-2">Value</th>
+ <th class="col-md-7">Description</th>
+ </tr>
+ </thead>
+ <tbody>
+ <tr ng-repeat="c in $ctrl.deletionConfigs">
+ <td style="word-break: break-all;">{{c.name}}</td>
+ <td style="word-break: break-all;">{{c.value}}</td>
+ <td>{{c.description}}</td>
+ </tr>
+ </tbody>
+</table>
+
+<h2>Service iteration latency (last run)</h2>
+<p class="text-muted" ng-show="!$ctrl.perf">OMPerformanceMetrics JMX bean not available.</p>
+<table class="table table-bordered table-striped" ng-show="$ctrl.perf">
+ <tbody>
+ <tr>
+ <td>KeyDeletingService (ms)</td>
+ <td>{{$ctrl.perf.KeyDeletingServiceLatencyMs != null ? $ctrl.perf.KeyDeletingServiceLatencyMs : 'N/A'}}</td>
+ </tr>
+ <tr>
+ <td>DirectoryDeletingService (ms)</td>
+ <td>{{$ctrl.perf.DirectoryDeletingServiceLatencyMs != null ? $ctrl.perf.DirectoryDeletingServiceLatencyMs : 'N/A'}}</td>
+ </tr>
+ </tbody>
+</table>
+
+<div ng-show="!$ctrl.del" class="text-muted">DeletingServiceMetrics JMX bean not available.</div>
+<div ng-show="$ctrl.del">
+ <h2>Deletion Progress [{{$ctrl.del.MetricsResetTimeStamp ? 'since ' + ($ctrl.del.MetricsResetTimeStamp * 1000 | date:'yyyy-MM-dd HH:mm:ss') : 'Initializing'}}]
+ •
+ <b>Size Reclaimed:</b> {{$ctrl.formatBytes($ctrl.del.ReclaimedSizeInInterval)}}
+ •
+ <b>Keys Reclaimed:</b> {{$ctrl.del.KeysReclaimedInInterval || 0}}
+ </h2>
+ <h3>KeyDeletingService (last run)</h3>
+ <div class="mt-3">
+ <div class="row mb-2" ng-if="$ctrl.del.KdsCurRunTimestamp">
+ <div class="col-md-3"><b>Current Run Started:</b></div>
+ <div class="col-md-9">{{$ctrl.convertMsToTime($ctrl.Date.now() - $ctrl.del.KdsCurRunTimestamp)}} ago</div>
+ </div>
+ <div class="row mb-2" ng-if="$ctrl.del.KdsLastRunTimestamp">
+ <div class="col-md-3"><b>Last Run:</b></div>
+ <div class="col-md-9">{{$ctrl.convertMsToTime($ctrl.Date.now() - $ctrl.del.KdsLastRunTimestamp)}} ago</div>
+ </div>
+ </div>
+ <div style="margin-bottom: 2px;"></div>
+ <table class="table table-sm table-bordered mt-2">
+ <thead>
+ <tr>
+ <th>Store</th>
+ <th>Reclaimed Size</th>
+ <th>#Reclaimed Keys</th>
+ <th>#Iterated Keys</th>
+ <th>#NotReclaimable Keys (Referred by Snapshots)</th>
+ </tr>
+ </thead>
+ <tbody>
+ <tr>
+ <td>Active Object Store</td>
+ <td>{{$ctrl.formatBytes($ctrl.del.AosReclaimedSizeLast)}}</td>
+ <td>{{$ctrl.del.AosKeysReclaimedLast || 0}}</td>
+ <td>{{$ctrl.del.AosKeysIteratedLast || 0}}</td>
+ <td>{{$ctrl.del.AosKeysNotReclaimableLast || 0}}</td>
+ </tr>
+ <tr>
+ <td>Snapshots</td>
+ <td>{{$ctrl.formatBytes($ctrl.del.SnapReclaimedSizeLast)}}</td>
+ <td>{{$ctrl.del.SnapKeysReclaimedLast || 0}}</td>
+ <td>{{$ctrl.del.SnapKeysIteratedLast || 0}}</td>
+ <td>{{$ctrl.del.SnapKeysNotReclaimableLast || 0}}</td>
+ </tr>
+ </tbody>
+ </table>
+
+ <h3>DirectoryDeletingService (last run)</h3>
+ <div class="mt-3">
+ <div class="row mb-2" ng-if="$ctrl.del.DdsCurRunTimestamp">
Review Comment:
This row is driven by `DdsCurRunTimestamp`, but the service never clears
that gauge when a run finishes. After the first successful run, the UI will
keep showing a "Current Run Started" value even when DirectoryDeletingService
is idle, so operators can't tell whether a run is actually in progress.
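A minimal sketch of one way to address this, assuming a value of 0 is acceptable as "no run in progress" (the `ng-if` on this row already hides it for a falsy value): reset the current-run gauge in a `finally` block once the run ends. `DdsRunTracker` and its fields are hypothetical stand-ins, not the service's real members.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for the two DDS timestamp gauges.
class DdsRunTracker {
  final AtomicLong curRunTimestamp = new AtomicLong();  // 0 means "no run in progress"
  final AtomicLong lastRunTimestamp = new AtomicLong(); // start time of the last completed run

  void runOnce(long startMs, Runnable work) {
    curRunTimestamp.set(startMs);
    try {
      work.run();
      lastRunTimestamp.set(startMs); // recorded only when the work finished normally
    } finally {
      curRunTimestamp.set(0L);       // cleared on success and on failure alike
    }
  }
}
```

Clearing in `finally` means a run that throws does not leave a stale "Current Run Started" value behind.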
##########
hadoop-ozone/ozone-manager/src/main/resources/webapps/ozoneManager/om-deletion.html:
##########
@@ -0,0 +1,186 @@
+
+<h1>Deletion services</h1>
+
+<div class="alert alert-info" ng-show="$ctrl.role && ($ctrl.role.Role || '').trim() !== 'LEADER'">
+ <p class="m-0">
+ <strong>This node is not the OM leader.</strong>
+ Key and directory deletion services run on the leader only; deletion settings and the metrics
+ on this page reflect the process where you are connected, not live leader work.
+ Open the <strong>Ozone Manager</strong> web UI on the <strong>leader</strong> OM to view
+ current deletion configuration and metrics.
+ </p>
+</div>
+
+<div ng-show="!$ctrl.role || ($ctrl.role.Role || '').trim() === 'LEADER'">
Review Comment:
This leader-only section is shown while the role request is still in flight
because `!$ctrl.role` evaluates to true on first render. On follower OMs the
page will briefly display the leader dashboard and then swap to the warning
banner, which is the opposite of the intended "shown if leader" behavior.
##########
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/DeletingServiceMetrics.java:
##########
@@ -110,6 +114,19 @@ public final class DeletingServiceMetrics {
@Metric("Snapshot: No. of not reclaimable keys the last run")
private MutableGaugeLong snapKeysNotReclaimableLast;
+ @Metric("AOS: deleted directories sent for purge in the last DirectoryDeletingService run")
+ private MutableGaugeLong ddsAosDirsSentForPurgeLast;
+ @Metric("AOS: sub-directories in the last DirectoryDeletingService run (mark/purge as applicable)")
+ private MutableGaugeLong ddsAosSubDirsLast;
+ @Metric("AOS: sub-files in the last DirectoryDeletingService run")
+ private MutableGaugeLong ddsAosSubFilesLast;
+ @Metric("Snapshot: deleted directories sent for purge in the last DirectoryDeletingService run")
+ private MutableGaugeLong ddsSnapDirsSentForPurgeLast;
+ @Metric("Snapshot: sub-directories in the last DirectoryDeletingService run (mark/purge as applicable)")
+ private MutableGaugeLong ddsSnapSubDirsLast;
+ @Metric("Snapshot: sub-files in the last DirectoryDeletingService run")
+ private MutableGaugeLong ddsSnapSubFilesLast;
Review Comment:
There are existing unit/integration tests for `DirectoryDeletingService`,
but this change adds new per-run DDS metrics and timestamps without any
corresponding assertions. A regression here would be easy to miss because
nothing currently verifies that the new JMX fields are populated only after
real work and that the AOS/snapshot counters match a completed run.
##########
hadoop-ozone/ozone-manager/src/main/resources/webapps/ozoneManager/om-deletion.html:
##########
@@ -0,0 +1,186 @@
+
+<h1>Deletion services</h1>
+
+<div class="alert alert-info" ng-show="$ctrl.role && ($ctrl.role.Role || '').trim() !== 'LEADER'">
+ <p class="m-0">
+ <strong>This node is not the OM leader.</strong>
+ Key and directory deletion services run on the leader only; deletion settings and the metrics
+ on this page reflect the process where you are connected, not live leader work.
+ Open the <strong>Ozone Manager</strong> web UI on the <strong>leader</strong> OM to view
+ current deletion configuration and metrics.
+ </p>
+</div>
+
+<div ng-show="!$ctrl.role || ($ctrl.role.Role || '').trim() === 'LEADER'">
+<h2>Effective configuration (DELETION tag)</h2>
+<p class="text-muted small">Properties tagged <code>DELETION</code> in <code>ozone-default.xml</code>, loaded via <code>conf?cmd=getPropertyByTag&tags=DELETION</code>.</p>
+<p class="text-muted" ng-show="!$ctrl.deletionConfigs.length">No matching deletion-related properties found, or configuration is still loading.</p>
+<table class="table table-bordered table-striped" ng-show="$ctrl.deletionConfigs.length">
+ <thead>
+ <tr>
+ <th class="col-md-3">Property</th>
+ <th class="col-md-2">Value</th>
+ <th class="col-md-7">Description</th>
+ </tr>
+ </thead>
+ <tbody>
+ <tr ng-repeat="c in $ctrl.deletionConfigs">
+ <td style="word-break: break-all;">{{c.name}}</td>
+ <td style="word-break: break-all;">{{c.value}}</td>
+ <td>{{c.description}}</td>
+ </tr>
+ </tbody>
+</table>
+
+<h2>Service iteration latency (last run)</h2>
+<p class="text-muted" ng-show="!$ctrl.perf">OMPerformanceMetrics JMX bean not available.</p>
+<table class="table table-bordered table-striped" ng-show="$ctrl.perf">
+ <tbody>
+ <tr>
+ <td>KeyDeletingService (ms)</td>
+ <td>{{$ctrl.perf.KeyDeletingServiceLatencyMs != null ? $ctrl.perf.KeyDeletingServiceLatencyMs : 'N/A'}}</td>
+ </tr>
+ <tr>
+ <td>DirectoryDeletingService (ms)</td>
Review Comment:
This table is labeled as the latency of the last service run, but these JMX
gauges are overwritten once per store/snapshot processed inside a run. When
deep-cleaning snapshots is enabled, the value shown here is only the latency of
the final store processed, not the full
KeyDeletingService/DirectoryDeletingService iteration, so the page presents
inaccurate numbers.
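A sketch of one possible approach, assuming the stores within a run are processed one after another: accumulate the per-store latencies and write the gauge once when the iteration ends. `IterationLatencyRecorder` is a hypothetical helper, and its `latencyMsGauge` field only stands in for the real JMX gauge.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical helper that publishes one latency value per full service iteration.
class IterationLatencyRecorder {
  final AtomicLong latencyMsGauge = new AtomicLong(); // stand-in for the JMX gauge
  private long accumulatedMs;                         // running total for the current iteration

  // Called after each store (AOS or one snapshot) has been processed.
  void addStoreLatency(long storeLatencyMs) {
    accumulatedMs += storeLatencyMs;
  }

  // Called once when the whole iteration is done; publishes the total and resets it.
  void finishIteration() {
    latencyMsGauge.set(accumulatedMs);
    accumulatedMs = 0L;
  }
}
```

If stores can be processed concurrently, the sum would overstate the elapsed time; measuring wall-clock time from the start of the iteration to its end would be the simpler alternative in that case.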
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]