[jira] [Work logged] (HIVE-21823) New metrics to get the average queue length / free executor number for a given time window

ASF GitHub Bot (JIRA) Wed, 05 Jun 2019 01:05:28 -0700


     [ 
https://issues.apache.org/jira/browse/HIVE-21823?focusedWorklogId=254247&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-254247
 ]


ASF GitHub Bot logged work on HIVE-21823:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 05/Jun/19 08:04
            Start Date: 05/Jun/19 08:04
    Worklog Time Spent: 10m 
      Work Description: pvary commented on pull request #660: HIVE-21823: New 
metrics to get the average queue length / free executor number for a given time 
window
URL: https://github.com/apache/hive/pull/660#discussion_r290621435
 
 

 ##########
 File path: llap-server/src/java/org/apache/hadoop/hive/llap/metrics/LlapDaemonExecutorMetrics.java
 ##########
 @@ -397,4 +425,106 @@ public JvmMetrics getJvmMetrics() {
   public String getName() {
     return name;
   }
+
+  /**
+   * Generate time aware average for data points.
+   * For example if we have 3s when the queue size is 1, and 1s when the queue size is 2 then the
+   * calculated average should be (3*1+1*2)/4 = 1.25.
+   */
+  @VisibleForTesting
+  static class TimedAverageMetrics {
+    private final int windowDataSize;
+    private final long windowTimeSize;
+    private final Data[] data;
+    private int nextPos = 0;
+
+    /**
+     * Creates and initializes the metrics object.
+     * @param windowDataSize The maximum number of samples stored
+     * @param windowTimeSize The time window used to generate the average in nanoseconds
+     */
+    TimedAverageMetrics(int windowDataSize, long windowTimeSize) {
+      this(windowDataSize, windowTimeSize, System.nanoTime());
+    }
+
+    @VisibleForTesting
+    TimedAverageMetrics(int windowDataSize, long windowTimeSize,
+        long defaultTime) {
+      assert(windowDataSize > 0);
+      this.windowDataSize = windowDataSize;
+      this.windowTimeSize = windowTimeSize;
+      this.data = new Data[windowDataSize];
+      Arrays.setAll(data, i -> new Data(defaultTime, 0L));
+    }
+
+    /**
+     * Adds a new sample value to the metrics.
+     * @param value The new sample value
+     */
+    public synchronized void add(long value) {
+      add(System.nanoTime(), value);
+    }
+
+    /**
+     * Calculates the average for the last windowTimeSize window.
+     * @return The average
+     */
+    public synchronized long value() {
+      return value(System.nanoTime());
+    }
+
+    @VisibleForTesting
+    void add(long time, long value) {
+      data[nextPos].nanoTime = time;
+      data[nextPos].value = value;
+      nextPos++;
+      if (nextPos == windowDataSize) {
+        nextPos = 0;
+      }
+    }
+
+    @VisibleForTesting
+    long value(long time) {
+      // We expect that the data time positions are strictly increasing and the time is greater than
+      // any of the data position time. This is ensured by using System.nanoTime().
+      long sum = 0L;
+      long lastTime = time;
+      long minTime = lastTime - windowTimeSize;
+      int pos = nextPos - 1;
+      boolean firstRound = true;
+      while (firstRound || pos != nextPos - 1) {
+        // Loop the window
+        if (pos < 0) {
+          pos = windowDataSize - 1;
+        }
+        // If we are at the end of the window
+        if (data[pos].nanoTime < minTime) {
+          sum += (lastTime - minTime) * data[pos].value;
+          break;
+        }
+        sum += (lastTime - data[pos].nanoTime) * data[pos].value;
+        lastTime = data[pos].nanoTime;
+        firstRound = false;
+        pos--;
+      }
+      // If we exited the loop and we did not have enough data point estimate the data with the last
+      // known point
+      if (!firstRound && pos == nextPos - 1) {
+        sum += (lastTime - minTime) * data[nextPos].value;
+      }
+      return Math.round((double)sum / (double)windowTimeSize);
+    }
+  }
+
+  /**
+   * Single sample data.
+   */
+  static private class Data {
+    private long nanoTime;
 
 Review comment:
   We are reusing the Data objects so that we do not cause unnecessary GC. That is 
why I did not make the fields final here.
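
A minimal sketch of the reuse pattern described in this comment, assuming a fixed-size ring of preallocated mutable slots. The names `ReusedSlotRing`, `Slot` and `slotAt` are hypothetical and are not taken from the pull request; the point is only that overwriting fields in place allocates nothing per sample, which is why the fields cannot be final.

```java
/** Hypothetical illustration only, not the code from the pull request. */
final class ReusedSlotRing {
  /** Mutable sample holder; the fields are non-final so a slot can be overwritten. */
  static final class Slot {
    long nanoTime;
    long value;
  }

  private final Slot[] slots;
  private int nextPos = 0;

  ReusedSlotRing(int capacity) {
    // All slots are allocated once, up front.
    slots = new Slot[capacity];
    for (int i = 0; i < capacity; i++) {
      slots[i] = new Slot();
    }
  }

  /** Overwrites the oldest slot in place instead of allocating a new object. */
  void add(long nanoTime, long value) {
    Slot slot = slots[nextPos];
    slot.nanoTime = nanoTime;
    slot.value = value;
    nextPos = (nextPos + 1) % slots.length;
  }

  /** Exposed only so the sketch can show that slot identity is preserved. */
  Slot slotAt(int index) {
    return slots[index];
  }
}
```

With a capacity of 2, the third call to `add` writes into the same object that held the first sample, so steady-state sampling creates no garbage.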
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 254247)
    Time Spent: 2h  (was: 1h 50m)

> New metrics to get the average queue length / free executor number for a 
> given time window
> ------------------------------------------------------------------------------------------
>
>                 Key: HIVE-21823
>                 URL: https://issues.apache.org/jira/browse/HIVE-21823
>             Project: Hive
>          Issue Type: Sub-task
>          Components: llap
>            Reporter: Peter Vary
>            Assignee: Peter Vary
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: HIVE-21823.patch
>
>          Time Spent: 2h
>  Remaining Estimate: 0h
>
> We need to calculate the average queue size / free executor size for a window 
> to have good data for making routing decisions.
> Interesting things to consider:
>  * The time gap between arriving requests can vary, so a simple average 
> is not enough to produce correct data
>  * We need 2 parameters
>  ** Time window length
>  ** Maximum number of data points - so we will not collect an "infinite" amount of 
> data under high load
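
The two parameters above can be illustrated with a small time-weighted average. This is a simplified sketch under stated assumptions, not the implementation in the patch: the class `TimeWeightedAverageSketch` and its methods are hypothetical, timestamps are plain `long` values supplied by the caller, and the average is taken over the time actually covered by samples rather than extrapolating the oldest known value over the rest of the window.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Iterator;

/** Hypothetical illustration only, not the code from the patch. */
final class TimeWeightedAverageSketch {
  private final int maxSamples;      // upper bound on stored data points
  private final long windowLength;   // window length, same unit as the timestamps
  private final Deque<long[]> samples = new ArrayDeque<>(); // {time, value}, oldest first

  TimeWeightedAverageSketch(int maxSamples, long windowLength) {
    this.maxSamples = maxSamples;
    this.windowLength = windowLength;
  }

  /** Records a sample; timestamps are assumed to be non-decreasing. */
  void add(long time, long value) {
    if (samples.size() == maxSamples) {
      samples.removeFirst(); // drop the oldest point so memory stays bounded
    }
    samples.addLast(new long[] {time, value});
  }

  /** Weights each value by how long it was in effect inside the window. */
  double average(long now) {
    long windowStart = now - windowLength;
    long intervalEnd = now;
    double weighted = 0.0;
    for (Iterator<long[]> it = samples.descendingIterator(); it.hasNext();) {
      long[] sample = it.next();
      long start = Math.max(sample[0], windowStart);
      weighted += (double) (intervalEnd - start) * sample[1];
      intervalEnd = start;
      if (sample[0] <= windowStart) {
        break; // older samples lie entirely outside the window
      }
    }
    long covered = now - intervalEnd;
    return covered == 0 ? 0.0 : weighted / covered;
  }
}
```

For example, with a window of 4 time units, a sample of 1 at time 0 and a sample of 2 at time 3, `average(4)` weights the value 1 for 3 units and the value 2 for 1 unit, giving (3*1+1*2)/4 = 1.25. A plain mean of the two samples would give 1.5, which is why the gap between requests has to be taken into account.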



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
