[ 
https://issues.apache.org/jira/browse/GOBBLIN-762?focusedWorklogId=239461&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-239461
 ]

ASF GitHub Bot logged work on GOBBLIN-762:
------------------------------------------

                Author: ASF GitHub Bot
            Created on: 08/May/19 20:18
            Start Date: 08/May/19 20:18
    Worklog Time Spent: 10m 
      Work Description: htran1 commented on pull request #2626: [GOBBLIN-762] Add automatic scaling for Gobblin on YARN
URL: https://github.com/apache/incubator-gobblin/pull/2626#discussion_r281850457
 
 

 ##########
 File path: gobblin-yarn/src/main/java/org/apache/gobblin/yarn/YarnService.java
 ##########
 @@ -310,10 +344,70 @@ private EventSubmitter buildEventSubmitter() {
         .build();
   }
 
-  private void requestInitialContainers(int containersRequested) {
-    for (int i = 0; i < containersRequested; i++) {
+  /**
+   * Request an allocation of containers. If numTargetContainers is larger than the max of current and expected number
+   * of containers then additional containers are requested.
+   *
+   * If numTargetContainers is less than the current number of allocated containers then release free containers.
+   * Shrinking is relative to the number of currently allocated containers since it takes time for containers
+   * to be allocated and assigned work and we want to avoid releasing a container prematurely before it is assigned
+   * work. This means that a container may not be released even though numTargetContainers is less than the requested
+   * number of containers. The intended usage is for the caller of this method to make periodic calls to attempt to
+   * adjust the cluster towards the desired number of containers.
+   *
+   * @param numTargetContainers the desired number of containers
+   * @param inUseInstances  a set of in use instances
+   */
+  public synchronized void requestTargetNumberOfContainers(int numTargetContainers, Set<String> inUseInstances) {
+    LOGGER.debug("Requesting numTargetContainers {} current numRequestedContainers {} in use instances {} map size {}",
+        numTargetContainers, this.numRequestedContainers, inUseInstances, this.containerMap.size());
+
+    // YARN can allocate more than the requested number of containers, compute additional allocations and deallocations
+    // based on the max of the requested and actual allocated counts
+    int numAllocatedContainers = this.containerMap.size();
+
+    // The number of allocated containers may be higher than the previously requested amount
+    // and there may be outstanding allocation requests, so the max of both counts is computed here
+    // and used to decide whether to allocate containers.
+    int numContainers = Math.max(numRequestedContainers, numAllocatedContainers);
+
+    // Request additional containers if the desired count is higher than the max of the current allocation or previously
+    // requested amount.
+    for (int i = numContainers; i < numTargetContainers; i++) {
 
 Review comment:
   Yes, the true allocation count can be higher than the requested count. If
YARN happens to grant more containers after we compute `numContainers` here,
the containers we request on top of that can take us above the target. This
should stabilize over time as auto-scaling brings the count down.
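
The overshoot and the later correction described in this comment can be illustrated with a minimal, self-contained sketch. It is not the actual `YarnService` code: the class `ContainerScalingSketch`, both method names, and the rule of releasing at most `allocated - target` idle containers are assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch; names and release policy are assumptions, not the PR's code.
final class ContainerScalingSketch {

  /** New requests needed to reach the target, based on max(requested, allocated). */
  static int additionalRequests(int numRequested, int numAllocated, int numTarget) {
    int numContainers = Math.max(numRequested, numAllocated);
    return Math.max(0, numTarget - numContainers);
  }

  /** Picks idle containers to release, at most (allocated - target) of them. */
  static List<String> containersToRelease(List<String> allocated, Set<String> inUse, int numTarget) {
    int surplus = allocated.size() - numTarget;
    List<String> release = new ArrayList<>();
    for (String id : allocated) {
      if (release.size() >= surplus) {
        break; // also exits immediately when there is no surplus
      }
      if (!inUse.contains(id)) {
        release.add(id); // only containers without assigned work are candidates
      }
    }
    return release;
  }

  public static void main(String[] args) {
    // 4 requested so far, 5 actually allocated, target 8: ask for 3 more.
    System.out.println("additional requests: " + additionalRequests(4, 5, 8));

    // A later pass with target 4: five containers allocated, three busy.
    List<String> allocated = Arrays.asList("c1", "c2", "c3", "c4", "c5");
    Set<String> inUse = new HashSet<>(Arrays.asList("c1", "c2", "c3"));
    System.out.println("release: " + containersToRelease(allocated, inUse, 4));
  }
}
```

In this sketch a surplus is only ever trimmed from idle containers, so a busy cluster can sit above the target until work drains, which matches the "stabilize over time" behavior described above.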
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 239461)
    Time Spent: 3h 10m  (was: 3h)

> Add automatic scaling for Gobblin on YARN
> -----------------------------------------
>
>                 Key: GOBBLIN-762
>                 URL: https://issues.apache.org/jira/browse/GOBBLIN-762
>             Project: Apache Gobblin
>          Issue Type: Task
>            Reporter: Hung Tran
>            Priority: Major
>          Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> Gobblin on YARN needs a way to scale the number of containers up and down
> based on the workload.
> Added `YarnAutoScalingManager`, which can be started by the
> `GobblinApplicationMaster` by setting the
> `gobblin.yarn.app.master.serviceClasses` configuration. This class runs a
> scheduled task, with a default interval of 60 seconds, to detect the number
> of required partitions for the workflows submitted to Helix. It then asks
> the `YarnService` to scale to a computed number of containers. If the
> requested number of containers is higher than the number the `YarnService`
> has previously requested, it requests more containers. If the requested
> count is less than the current number of allocated containers, it frees any
> unused containers.
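
The periodic behavior described above could be sketched roughly as follows. This is an illustration under stated assumptions, not the `YarnAutoScalingManager` implementation: the class `AutoScalingLoopSketch`, the `IntSupplier` standing in for the Helix workload lookup, and the `IntConsumer` standing in for the `YarnService` call are all hypothetical.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.IntConsumer;
import java.util.function.IntSupplier;

// Hypothetical sketch of a fixed-interval scaling loop; all names are assumptions.
final class AutoScalingLoopSketch {
  private final IntSupplier requiredContainers; // stand-in for the workload lookup
  private final IntConsumer scaler;             // stand-in for the scale request

  AutoScalingLoopSketch(IntSupplier requiredContainers, IntConsumer scaler) {
    this.requiredContainers = requiredContainers;
    this.scaler = scaler;
  }

  /** One scaling pass: compute the target and hand it to the scaler. */
  void runOnce() {
    scaler.accept(requiredContainers.getAsInt());
  }

  /** Schedules runOnce() at a fixed interval and returns the executor for shutdown. */
  ScheduledExecutorService start(long intervalSeconds) {
    ScheduledExecutorService executor = Executors.newSingleThreadScheduledExecutor();
    executor.scheduleAtFixedRate(() -> {
      try {
        runOnce();
      } catch (RuntimeException e) {
        // An uncaught exception would cancel all future runs, so log and continue.
        System.err.println("scaling pass failed: " + e);
      }
    }, 0, intervalSeconds, TimeUnit.SECONDS);
    return executor;
  }

  public static void main(String[] args) throws InterruptedException {
    AutoScalingLoopSketch loop =
        new AutoScalingLoopSketch(() -> 6, target -> System.out.println("scale to " + target));
    ScheduledExecutorService executor = loop.start(60); // 60 seconds, per the description
    Thread.sleep(200);                                  // let the first pass run
    executor.shutdownNow();
  }
}
```

Keeping the pass in `runOnce()` separate from the scheduling makes a single scaling decision easy to exercise without waiting on the timer.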



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
