[
https://issues.apache.org/jira/browse/HADOOP-19737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18040763#comment-18040763
]
ASF GitHub Bot commented on HADOOP-19737:
-----------------------------------------
bhattmanish98 commented on code in PR #8056:
URL: https://github.com/apache/hadoop/pull/8056#discussion_r2558633843
##########
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/constants/FileSystemConfigurations.java:
##########
@@ -389,6 +384,12 @@ public final class FileSystemConfigurations {
*/
public static final int DEFAULT_WRITE_HIGH_TIER_MEMORY_MULTIPLIER = 16;
+ /** Percentage threshold of heap usage at which memory pressure is
considered high. */
+ public static final int DEFAULT_WRITE_HIGH_MEMORY_USAGE_THRESHOLD_PERCENTAGE
= 60;
Review Comment:
Since we have used PERCENT in the FileSystemConfiguration
FS_AZURE_WRITE_HIGH_MEMORY_USAGE_THRESHOLD_PERCENT, should be better to rename
this to DEFAULT_WRITE_HIGH_MEMORY_USAGE_THRESHOLD_PERCENT
##########
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/constants/FileSystemConfigurations.java:
##########
@@ -389,6 +384,12 @@ public final class FileSystemConfigurations {
*/
public static final int DEFAULT_WRITE_HIGH_TIER_MEMORY_MULTIPLIER = 16;
+ /** Percentage threshold of heap usage at which memory pressure is
considered high. */
+ public static final int DEFAULT_WRITE_HIGH_MEMORY_USAGE_THRESHOLD_PERCENTAGE
= 60;
+
+ /** Percentage threshold of heap usage at which memory pressure is
considered low. */
+ public static final int DEFAULT_WRITE_LOW_MEMORY_USAGE_THRESHOLD_PERCENTAGE
= 30;
Review Comment:
same as above
##########
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsDfsClient.java:
##########
@@ -1080,7 +1080,12 @@ public AbfsRestOperation read(final String path,
String sasTokenForReuse = appendSASTokenToQuery(path,
SASTokenProvider.READ_OPERATION,
abfsUriQueryBuilder, cachedSasToken);
-
+ // Retrieve the read thread pool metrics from the ABFS counters.
+ AbfsReadResourceUtilizationMetrics readResourceUtilizationMetrics =
retrieveReadResourceUtilizationMetrics();
Review Comment:
This part of code is common in both DFS and blob client, we can define a
method in abfs client class which will add the metrics to tracing Context and
that method will be called from both the places.
##########
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/enums/AbfsWriteResourceUtilizationMetricsEnum.java:
##########
@@ -0,0 +1,99 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.fs.azurebfs.enums;
+
+/**
+ * Enum representing the set of metrics tracked for the ABFS write thread pool.
+ * Each metric entry defines a short name identifier and its corresponding
+ * {@link StatisticTypeEnum}, which specifies the type of measurement (e.g.,
gauge).
+ * These metrics are used for monitoring and analyzing the performance and
+ * resource utilization of the write thread pool.
+ */
+public enum AbfsWriteResourceUtilizationMetricsEnum implements
+ AbfsResourceUtilizationMetricsEnum {
+
+ /** Current number of threads in the write thread pool. */
+ CURRENT_POOL_SIZE("CP", StatisticTypeEnum.TYPE_GAUGE),
+
+ /** Maximum configured size of the write thread pool. */
+ MAX_POOL_SIZE("MP", StatisticTypeEnum.TYPE_GAUGE),
+
+ /** Number of threads currently executing write operations. */
+ ACTIVE_THREADS("AT", StatisticTypeEnum.TYPE_GAUGE),
+
+ /** Number of threads currently idle. */
+ IDLE_THREADS("IT", StatisticTypeEnum.TYPE_GAUGE),
+
+ /** Recent JVM CPU load value as reported by the JVM (0.0 to 1.0). */
+ JVM_CPU_UTILIZATION("JC", StatisticTypeEnum.TYPE_GAUGE),
+
+ /** Overall system-wide CPU utilization percentage during write operations.
*/
+ SYSTEM_CPU_UTILIZATION("SC", StatisticTypeEnum.TYPE_GAUGE),
+
+ /** Available heap memory (in GB) measured during write operations. */
+ AVAILABLE_MEMORY("AM", StatisticTypeEnum.TYPE_GAUGE),
+
+ /** Committed heap memory (in GB) measured during write operations. */
+ COMMITTED_MEMORY("CM", StatisticTypeEnum.TYPE_GAUGE),
+
+ /** Available heap memory (in GB) measured during write operations. */
+ MEMORY_LOAD("ML", StatisticTypeEnum.TYPE_GAUGE),
+
+ /** Direction of the last scaling decision (e.g., scale-up or scale-down). */
+ LAST_SCALE_DIRECTION("SD", StatisticTypeEnum.TYPE_GAUGE),
+
+ /** Maximum CPU utilization recorded during the monitoring interval. */
+ MAX_CPU_UTILIZATION("MC", StatisticTypeEnum.TYPE_GAUGE),
+
+ /** The process ID (PID) of the running JVM, useful for correlating metrics
with system-level process information. */
+ JVM_PROCESS_ID("JI", StatisticTypeEnum.TYPE_GAUGE);
+
+ private final String name;
+ private final StatisticTypeEnum statisticType;
+
+ /**
+ * Constructs a metric definition for the ABFS write thread pool.
+ *
+ * @param name the short name identifier for the metric.
+ * @param type the {@link StatisticTypeEnum} describing the metric type.
+ */
+ AbfsWriteResourceUtilizationMetricsEnum(String name, StatisticTypeEnum type)
{
+ this.name = name;
+ this.statisticType = type;
+ }
+
+ /**
+ * Returns the short name identifier of the metric.
+ *
+ * @return the metric name.
+ */
+ public String getName() {
Review Comment:
Same as above
##########
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/utils/TracingContext.java:
##########
@@ -226,28 +236,22 @@ public void constructHeader(AbfsHttpOperation
httpOperation, String previousFail
+ position + COLON
+ operatedBlobCount + COLON
+ getOperationSpecificHeader(opType) + COLON
- + httpOperation.getTracingContextSuffix();
-
- metricHeader += !(metricResults.trim().isEmpty()) ? metricResults :
EMPTY_STRING;
+ + httpOperation.getTracingContextSuffix() + COLON
+ + metricResults + COLON + resourceUtilizationMetricResults;
break;
case TWO_ID_FORMAT:
header = TracingHeaderVersion.getCurrentVersion() + COLON
+ clientCorrelationID + COLON + clientRequestId;
- metricHeader += !(metricResults.trim().isEmpty()) ? metricResults :
EMPTY_STRING;
Review Comment:
Why have we removed this?
##########
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/enums/AbfsReadResourceUtilizationMetricsEnum.java:
##########
@@ -0,0 +1,96 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.fs.azurebfs.enums;
+
+/**
+ * Enum representing the set of metrics tracked for the ABFS read thread pool.
+ * Each metric includes a short name used for reporting and its corresponding
+ * {@link StatisticTypeEnum}, which defines how the metric is measured (e.g.,
gauge).
+ */
+public enum AbfsReadResourceUtilizationMetricsEnum implements
+ AbfsResourceUtilizationMetricsEnum {
+
+ /** Current number of threads in the read thread pool. */
+ CURRENT_POOL_SIZE("CP", StatisticTypeEnum.TYPE_GAUGE),
+
+ /** Maximum configured size of the read thread pool. */
+ MAX_POOL_SIZE("MP", StatisticTypeEnum.TYPE_GAUGE),
+
+ /** Number of threads currently executing read operations. */
+ ACTIVE_THREADS("AT", StatisticTypeEnum.TYPE_GAUGE),
+
+ /** Number of threads currently idle. */
+ IDLE_THREADS("IT", StatisticTypeEnum.TYPE_GAUGE),
+
+ /** Recent JVM CPU load value as reported by the JVM (0.0 to 1.0). */
+ JVM_CPU_UTILIZATION("JC", StatisticTypeEnum.TYPE_GAUGE),
+
+ /** Overall system-wide CPU utilization percentage during write operations.
*/
+ SYSTEM_CPU_UTILIZATION("SC", StatisticTypeEnum.TYPE_GAUGE),
+
+ /** Available heap memory (in GB) measured during write operations. */
+ AVAILABLE_MEMORY("AM", StatisticTypeEnum.TYPE_GAUGE),
+
+ /** Committed heap memory (in GB) measured during write operations. */
+ COMMITTED_MEMORY("CM", StatisticTypeEnum.TYPE_GAUGE),
+
+ /** Available heap memory (in GB) measured during write operations. */
+ MEMORY_LOAD("ML", StatisticTypeEnum.TYPE_GAUGE),
+
+ /** Direction of the last scaling decision (e.g., scale-up or scale-down). */
+ LAST_SCALE_DIRECTION("SD", StatisticTypeEnum.TYPE_GAUGE),
+
+ /** Maximum CPU utilization recorded during the monitoring interval. */
+ MAX_CPU_UTILIZATION("MC", StatisticTypeEnum.TYPE_GAUGE),
+
+ /** The process ID (PID) of the running JVM, useful for correlating metrics
with system-level process information. */
+ JVM_PROCESS_ID("JI", StatisticTypeEnum.TYPE_GAUGE);
+
+ private final String name;
+ private final StatisticTypeEnum statisticType;
+
+ /**
+ * Constructs a metric enum constant with its short name and type.
+ *
+ * @param name the short name or label for the metric.
+ * @param type the {@link StatisticTypeEnum} indicating the metric type.
+ */
+ AbfsReadResourceUtilizationMetricsEnum(String name, StatisticTypeEnum type) {
+ this.name = name;
+ this.statisticType = type;
+ }
+
+ /**
+ * Returns the short name of the metric.
+ *
+ * @return the metric name.
+ */
+ public String getName() {
Review Comment:
This method and below one should be annotated with @override
> ABFS: Add metrics to identify improvements with read and write aggressiveness
> -----------------------------------------------------------------------------
>
> Key: HADOOP-19737
> URL: https://issues.apache.org/jira/browse/HADOOP-19737
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/azure
> Affects Versions: 3.5.0, 3.4.2
> Reporter: Anmol Asrani
> Assignee: Anmol Asrani
> Priority: Major
> Labels: pull-request-available
>
> Introduces new performance metrics in the ABFS driver to monitor and evaluate
> the effectiveness of read and write aggressiveness tuning. These metrics help
> in understanding how thread pool behavior, CPU utilization, and heap
> availability impact overall I/O throughput and latency. By capturing detailed
> statistics such as active thread count, pool size, and system resource
> utilization, this enhancement enables data-driven analysis of optimizations
> made to improve ABFS read and write performance under varying workloads.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]