[ 
https://issues.apache.org/jira/browse/HDFS-17042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17732160#comment-17732160
 ] 

ASF GitHub Bot commented on HDFS-17042:
---------------------------------------

goiri commented on code in PR #5730:
URL: https://github.com/apache/hadoop/pull/5730#discussion_r1228393934


##########
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/Server.java:
##########
@@ -600,17 +600,18 @@ void logSlowRpcCalls(String methodName, Call call,
     }
   }
 
-  void updateMetrics(Call call, long startTime, boolean connDropped) {
+  void updateMetrics(Call call, long processingStartTime, boolean connDropped) 
{

Review Comment:
   Add units to the name.



##########
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/Server.java:
##########
@@ -600,17 +600,18 @@ void logSlowRpcCalls(String methodName, Call call,
     }
   }
 
-  void updateMetrics(Call call, long startTime, boolean connDropped) {
+  void updateMetrics(Call call, long processingStartTime, boolean connDropped) 
{
     totalRequests.increment();
     // delta = handler + processing + response
-    long deltaNanos = Time.monotonicNowNanos() - startTime;
-    long timestampNanos = call.timestampNanos;
+    long completionTime = Time.monotonicNowNanos();

Review Comment:
   Add units.



##########
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/Server.java:
##########
@@ -636,10 +637,17 @@ void updateMetrics(Call call, long startTime, boolean 
connDropped) {
     processingTime -= waitTime;
     String name = call.getDetailedMetricsName();
     rpcDetailedMetrics.addProcessingTime(name, processingTime);
+    // Overall processing time is from arrival to completion.
+    rpcDetailedMetrics.addOverallProcessingTime(name,
+        rpcMetrics.getMetricsTimeUnit().convert(completionTime - arrivalTime,

Review Comment:
   Let's extract this a little.



##########
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/Server.java:
##########
@@ -600,17 +600,18 @@ void logSlowRpcCalls(String methodName, Call call,
     }
   }
 
-  void updateMetrics(Call call, long startTime, boolean connDropped) {
+  void updateMetrics(Call call, long processingStartTime, boolean connDropped) 
{
     totalRequests.increment();
     // delta = handler + processing + response
-    long deltaNanos = Time.monotonicNowNanos() - startTime;
-    long timestampNanos = call.timestampNanos;
+    long completionTime = Time.monotonicNowNanos();
+    long deltaNanos = completionTime - processingStartTime;
+    long arrivalTime = call.timestampNanos;
 
     ProcessingDetails details = call.getProcessingDetails();
     // queue time is the delta between when the call first arrived and when it
     // began being serviced, minus the time it took to be put into the queue
     details.set(Timing.QUEUE,

Review Comment:
   This now fits in one line.



##########
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/test/MetricsAsserts.java:
##########
@@ -358,6 +358,18 @@ public static void assertGaugeGt(String name, double 
greater,
         getDoubleGauge(name, rb) > greater);
   }
 
+  /**
+   * Assert that a double gauge metric is greater than or equal to a value.
+   * @param name  of the metric
+   * @param greater value of the metric should be greater than or equal to this
+   * @param rb  the record builder mock used to getMetrics
+   */
+  public static void assertGaugeGte(String name, double greater,
+      MetricsRecordBuilder rb) {
+    Assert.assertTrue("Bad value for metric " + name,

Review Comment:
   This assert message could report the greatervalue and the current value.
   We should extract:
   ```
   double curValue = getDoubleGauge(name, rb);
   ```





> Add rpcCallSuccesses and OverallRpcProcessingTime to RpcMetrics for Namenode
> ----------------------------------------------------------------------------
>
>                 Key: HDFS-17042
>                 URL: https://issues.apache.org/jira/browse/HDFS-17042
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs
>    Affects Versions: 3.4.0, 3.3.9
>            Reporter: Xing Lin
>            Assignee: Xing Lin
>            Priority: Major
>              Labels: pull-request-available
>
> We'd like to add two new types of metrics to the existing NN 
> RpcMetrics/RpcDetailedMetrics. These two metrics can then be used as part of 
> SLA/SLO for the HDFS service.
>  * {_}RpcCallSuccesses{_}: it measures the number of RPC requests where they 
> are successfully processed by a NN (e.g., with a response with an RpcStatus 
> {_}RpcStatusProto.SUCCESS){_}{_}.{_} Then, together with {_}RpcQueueNumOps 
> ({_}which refers the total number of RPC requests{_}){_}, we can derive the 
> RpcErrorRate for our NN, as (RpcQueueNumOps - RpcCallSuccesses) / 
> RpcQueueNumOps. 
>  * OverallRpcProcessingTime for each RPC method: this metric measures the 
> overall RPC processing time for each RPC method at the NN. It covers the time 
> from when a request arrives at the NN to when a response is sent back. We are 
> already emitting processingTime for each RPC method today in 
> RpcDetailedMetrics. We want to extend it to emit overallRpcProcessingTime for 
> each RPC method, which includes enqueueTime, queueTime, processingTime, 
> responseTime, and handlerTime.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

Reply via email to