[jira] [Created] (YARN-8850) Make certain aspects of the NM pluggable to support a DynoYARN cluster
Arun Suresh created YARN-8850:
----------------------------------

Summary: Make certain aspects of the NM pluggable to support a DynoYARN cluster
Key: YARN-8850
URL: https://issues.apache.org/jira/browse/YARN-8850
Project: Hadoop YARN
Issue Type: Sub-task
Reporter: Arun Suresh
[jira] [Created] (YARN-8849) DynoYARN: A simulation and testing infrastructure for YARN clusters
Arun Suresh created YARN-8849:
----------------------------------

Summary: DynoYARN: A simulation and testing infrastructure for YARN clusters
Key: YARN-8849
URL: https://issues.apache.org/jira/browse/YARN-8849
Project: Hadoop YARN
Issue Type: New Feature
Reporter: Arun Suresh

Traditionally, YARN workload simulation is performed using the SLS (Scheduler Load Simulator), which is packaged with YARN. It essentially starts a full-fledged *ResourceManager* but runs simulators for the *NodeManager* and the *ApplicationMaster* containers. These simulators are lightweight and run in a threadpool. The NM simulators do not open any external ports and send (in-process) heartbeats to the ResourceManager.

There are a couple of drawbacks to using the SLS:
* It might be difficult to simulate really large clusters without access to a very beefy box, since the NMs are launched as tasks in a threadpool and each NM has to send periodic heartbeats to the RM.
* Certain features (like YARN-1011) require changes to the NodeManager: aspects such as queuing and selectively killing containers have to be incorporated into the existing NM simulator, which might make the simulator rather heavyweight - there is a need for locking and synchronization.
* Since the NM and AM are simulations, only the Scheduler is faithfully tested - it does not really perform an end-to-end test of a cluster.

Therefore, drawing inspiration from [Dynamometer|https://github.com/linkedin/dynamometer], we propose a framework for a YARN-deployable YARN cluster - *DynoYARN* - for testing, with the following features:
* The NM already has hooks to plug in a custom *ContainerExecutor* and *NodeResourceMonitor*. If we can also plug in a custom *ContainersMonitorImpl* monitoring thread (and other modules like the LocalizationService), we can probably inject an Executor that does not actually launch containers, and Node and Container resource monitors that report pre-specified synthetic utilization metrics back to the RM.
* Since we are launching fake containers, we cannot run normal AM containers. We can therefore use *Unmanaged AMs* to launch synthetic jobs.

Essentially, a test workflow would look like this:
* Launch a DynoYARN cluster.
* Use the Unmanaged AM feature to directly negotiate with the DynoYARN ResourceManager for container tokens.
* Use the container tokens from the RM to directly ask the DynoYARN NodeManagers to start fake containers.
* The DynoYARN NodeManagers will start the fake containers and report synthetically generated resource utilization for the containers to the DynoYARN ResourceManager (the utilization will be injected via the *ContainerLaunchContext* and parsed by the plugged-in Container Executor).
* The Scheduler will use the utilization report to schedule containers - we will be able to test allocation of {{Opportunistic}} containers based on resource utilization.
* Since the DynoYARN NodeManagers run the actual code paths, all preemption and queuing logic will be faithfully executed.
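To make the pluggability idea concrete, a minimal sketch of how a DynoYARN NM might be wired up. The {{yarn.nodemanager.container-executor.class}} hook exists today; the monitor key and both implementation classes below are hypothetical, since (as noted above) the ContainersMonitor is not yet pluggable.

{code:java}
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class DynoNodeManagerConfig {
  public static YarnConfiguration create() {
    YarnConfiguration conf = new YarnConfiguration();
    // Existing hook: swap in an executor that only pretends to launch
    // containers. NoOpContainerExecutor is a hypothetical class.
    conf.set(YarnConfiguration.NM_CONTAINER_EXECUTOR,
        "org.example.dyno.NoOpContainerExecutor");
    // Hypothetical hook proposed in this JIRA: a ContainersMonitor whose
    // monitoring thread reports pre-specified synthetic utilization to the RM.
    conf.set("yarn.nodemanager.containers-monitor.class",
        "org.example.dyno.SyntheticContainersMonitor");
    return conf;
  }
}
{code}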
[jira] [Created] (YARN-8848) Improvements to YARN over-allocation (YARN-1011)
Arun Suresh created YARN-8848:
----------------------------------

Summary: Improvements to YARN over-allocation (YARN-1011)
Key: YARN-8848
URL: https://issues.apache.org/jira/browse/YARN-8848
Project: Hadoop YARN
Issue Type: Bug
Reporter: Arun Suresh

Consolidating work to be done in the next phase of YARN over-allocation (YARN-1011).
[jira] [Created] (YARN-8846) Allow Applications to demand Guaranteed Containers
Arun Suresh created YARN-8846:
----------------------------------

Summary: Allow Applications to demand Guaranteed Containers
Key: YARN-8846
URL: https://issues.apache.org/jira/browse/YARN-8846
Project: Hadoop YARN
Issue Type: Sub-task
Components: capacity scheduler
Reporter: Arun Suresh

The Capacity Scheduler should ensure that if the {{enforceExecutionType}} flag in the resource request is {{true}} and the requested container is of {{GUARANTEED}} type, it does not return over-allocated containers.
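For illustration, a minimal sketch of the AM-side request this applies to, assuming the {{ResourceRequest#newInstance}} overload that takes an {{ExecutionTypeRequest}}; the priority and resource sizes are arbitrary.

{code:java}
import org.apache.hadoop.yarn.api.records.*;

public class GuaranteedRequestExample {
  public static ResourceRequest build() {
    // enforceExecutionType=true with GUARANTEED means the scheduler may not
    // satisfy this request with an over-allocated (OPPORTUNISTIC) container.
    ExecutionTypeRequest execType =
        ExecutionTypeRequest.newInstance(ExecutionType.GUARANTEED, true);
    return ResourceRequest.newInstance(
        Priority.newInstance(1),          // request priority
        ResourceRequest.ANY,              // no locality restriction
        Resource.newInstance(1024, 1),    // 1 GB, 1 vcore
        1,                                // number of containers
        true,                             // relaxLocality
        null,                             // no node-label expression
        execType);
  }
}
{code}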
[jira] [Created] (YARN-8827) Plumb per app, per user and per queue resource utilization from the NM to RM
Arun Suresh created YARN-8827:
----------------------------------

Summary: Plumb per app, per user and per queue resource utilization from the NM to RM
Key: YARN-8827
URL: https://issues.apache.org/jira/browse/YARN-8827
Project: Hadoop YARN
Issue Type: Sub-task
Reporter: Arun Suresh
Assignee: Arun Suresh

Opportunistic containers for over-allocation need to be allocated to pending applications in some fair manner. Rather than evaluating queue and user resource usage (allocated resource usage) and comparing against queue and user limits to decide the allocation, it might make more sense to use a snapshot of the actual resource utilization of the queue and user.

To facilitate this, this JIRA proposes to aggregate per-user and per-app (and maybe per-queue) resource utilization, in addition to the aggregated Container and Node utilization, and send it along with the NM heartbeat. The aggregation should be fairly inexpensive, since it can be performed in the same loop as the {{ContainersMonitorImpl}}'s monitoring thread. A snapshot aggregate can be made every couple of seconds in the RM. This instantaneous resource utilization should be used to decide if Opportunistic containers can be allocated to an app, queue or user.
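A rough sketch of the proposed aggregation, reusing the existing {{ResourceUtilization}} record; the aggregator class itself and the heartbeat field it would feed are hypothetical.

{code:java}
import java.util.HashMap;
import java.util.Map;
import org.apache.hadoop.yarn.api.records.ApplicationId;
import org.apache.hadoop.yarn.api.records.ResourceUtilization;

public class PerAppUtilizationAggregator {
  // appId -> aggregated utilization for the current monitoring interval
  private final Map<ApplicationId, ResourceUtilization> perApp = new HashMap<>();

  // Called once per container per monitoring pass; pmem/vmem in MB, cpu as a
  // fraction of vcores. Values would come from the ResourceCalculatorProcessTree
  // lookups that ContainersMonitorImpl's monitoring thread already performs.
  public void accumulate(ApplicationId appId, int pmemMB, int vmemMB, float cpu) {
    perApp.computeIfAbsent(appId, id -> ResourceUtilization.newInstance(0, 0, 0f))
        .addTo(pmemMB, vmemMB, cpu);
  }

  // Snapshot to attach to the NM heartbeat (a hypothetical heartbeat field),
  // resetting the running totals for the next interval.
  public Map<ApplicationId, ResourceUtilization> snapshotAndReset() {
    Map<ApplicationId, ResourceUtilization> snap = new HashMap<>(perApp);
    perApp.clear();
    return snap;
  }
}
{code}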
[jira] [Closed] (YARN-7792) Merge work for YARN-6592
[ https://issues.apache.org/jira/browse/YARN-7792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arun Suresh closed YARN-7792.
-----------------------------
Assignee: Sunil G

> Merge work for YARN-6592
> -------------------------
>
> Key: YARN-7792
> URL: https://issues.apache.org/jira/browse/YARN-7792
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Sunil G
> Assignee: Sunil G
> Priority: Blocker
> Fix For: 3.1.0
>
> Attachments: YARN-6592.001.patch, YARN-7792.002.patch, YARN-7792.003.patch, YARN-7792.004.patch
>
> This JIRA is to run the aggregated YARN-6592 branch patch against trunk and check for any Jenkins issues.
[jira] [Resolved] (YARN-6592) Rich placement constraints in YARN
[ https://issues.apache.org/jira/browse/YARN-6592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arun Suresh resolved YARN-6592.
-------------------------------
Resolution: Fixed
Target Version/s: 3.1.0

All tasks have been completed. Merged the branch with trunk.
Thanks [~kkaranasos], [~leftnoteasy], [~pgaref], [~cheersyang] and [~sunilg] for all the effort here. Thanks also to [~chris.douglas], [~subru], [~curino] and [~vinodkv] for the discussions.

> Rich placement constraints in YARN
> ----------------------------------
>
> Key: YARN-6592
> URL: https://issues.apache.org/jira/browse/YARN-6592
> Project: Hadoop YARN
> Issue Type: New Feature
> Reporter: Konstantinos Karanasos
> Assignee: Arun Suresh
> Priority: Major
> Fix For: 3.1.0
>
> Attachments: YARN-6592-Rich-Placement-Constraints-Design-V1.pdf
>
> This JIRA consolidates the efforts of YARN-5468 and YARN-4902.
> It adds support for rich placement constraints to YARN, such as affinity and anti-affinity between allocations within the same or across applications.
[jira] [Created] (YARN-7858) Support special Node Attribute scopes in addition to NODE and RACK
Arun Suresh created YARN-7858:
----------------------------------

Summary: Support special Node Attribute scopes in addition to NODE and RACK
Key: YARN-7858
URL: https://issues.apache.org/jira/browse/YARN-7858
Project: Hadoop YARN
Issue Type: Sub-task
Reporter: Arun Suresh

Currently, we have only two scopes defined, NODE and RACK, against which we check the cardinality of a placement. This idea should be extended to support node-attribute scopes, for example, placement of containers across *upgrade domains* and *failure domains*.
[jira] [Created] (YARN-7839) Check node capacity before placing in the Algorithm
Arun Suresh created YARN-7839:
----------------------------------

Summary: Check node capacity before placing in the Algorithm
Key: YARN-7839
URL: https://issues.apache.org/jira/browse/YARN-7839
Project: Hadoop YARN
Issue Type: Sub-task
Reporter: Arun Suresh

Currently, the Algorithm assigns a node to a request purely based on whether the constraints are met. It is later, in the scheduling phase, that the Queue capacity and Node capacity are checked. If the request cannot be placed because of unavailable Queue/Node capacity, the request is retried by the Algorithm.

For clusters that are running at high utilization, we can reduce the retries if we perform the Node capacity check in the Algorithm as well. The Queue capacity check can still be handled by the scheduler (since queues are tied to the scheduler).
[jira] [Created] (YARN-7822) Fix constraint satisfaction checker to handle composite OR and AND constraints
Arun Suresh created YARN-7822:
----------------------------------

Summary: Fix constraint satisfaction checker to handle composite OR and AND constraints
Key: YARN-7822
URL: https://issues.apache.org/jira/browse/YARN-7822
Project: Hadoop YARN
Issue Type: Sub-task
Reporter: Arun Suresh

JIRA to track changes to {{PlacementConstraintsUtil#canSatisfyConstraints}} to handle OR and AND composite constraints.
[jira] [Created] (YARN-7821) Fix constraint satisfaction checker to handle inter-app constraints
Arun Suresh created YARN-7821:
----------------------------------

Summary: Fix constraint satisfaction checker to handle inter-app constraints
Key: YARN-7821
URL: https://issues.apache.org/jira/browse/YARN-7821
Project: Hadoop YARN
Issue Type: Sub-task
Reporter: Arun Suresh

JIRA to track changes to {{PlacementConstraintsUtil#canSatisfyConstraints}} to handle inter-app constraints.
[jira] [Created] (YARN-7819) Allow PlacementProcessor to be used with the FairScheduler
Arun Suresh created YARN-7819:
----------------------------------

Summary: Allow PlacementProcessor to be used with the FairScheduler
Key: YARN-7819
URL: https://issues.apache.org/jira/browse/YARN-7819
Project: Hadoop YARN
Issue Type: Sub-task
Reporter: Arun Suresh
Assignee: Arun Suresh

The FairScheduler needs to implement the {{ResourceScheduler#attemptAllocationOnNode}} function for the PlacementProcessor to be able to work with it.
[jira] [Created] (YARN-7812) Improvements to Rich Placement Constraints in YARN
Arun Suresh created YARN-7812:
----------------------------------

Summary: Improvements to Rich Placement Constraints in YARN
Key: YARN-7812
URL: https://issues.apache.org/jira/browse/YARN-7812
Project: Hadoop YARN
Issue Type: Bug
Reporter: Arun Suresh
[jira] [Closed] (YARN-6942) Add examples for placement constraints usage in applications
[ https://issues.apache.org/jira/browse/YARN-6942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arun Suresh closed YARN-6942.
-----------------------------

> Add examples for placement constraints usage in applications
> -------------------------------------------------------------
>
> Key: YARN-6942
> URL: https://issues.apache.org/jira/browse/YARN-6942
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Konstantinos Karanasos
> Assignee: Panagiotis Garefalakis
> Priority: Major
>
> This JIRA will include examples of how the new {{PlacementConstraints}} API can be used by various applications.
[jira] [Resolved] (YARN-6942) Add examples for placement constraints usage in applications
[ https://issues.apache.org/jira/browse/YARN-6942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arun Suresh resolved YARN-6942.
-------------------------------
Resolution: Resolved

> Add examples for placement constraints usage in applications
> -------------------------------------------------------------
>
> Key: YARN-6942
> URL: https://issues.apache.org/jira/browse/YARN-6942
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Konstantinos Karanasos
> Assignee: Panagiotis Garefalakis
> Priority: Major
>
> This JIRA will include examples of how the new {{PlacementConstraints}} API can be used by various applications.
[jira] [Created] (YARN-7783) Add validation step to ensure constraints are not violated due to order in which a request is processed
Arun Suresh created YARN-7783:
----------------------------------

Summary: Add validation step to ensure constraints are not violated due to order in which a request is processed
Key: YARN-7783
URL: https://issues.apache.org/jira/browse/YARN-7783
Project: Hadoop YARN
Issue Type: Sub-task
Reporter: Arun Suresh
Assignee: Arun Suresh

When the algorithm has placed a container on a node, allocation tags are added to the node if the constraint is satisfied. But depending on the order in which the algorithm sees the requests, it is possible that a constraint that happened to be valid during placement of an earlier-seen request is no longer valid after all subsequent requests have been placed.

For example, assume nodes n1, n2, n3, n4 and n5, and consider the two constraints:
# *foo* -> anti-affinity with *foo*
# *bar* -> anti-affinity with *foo*

And two requests:
# req1: NumAllocations = 4, allocTags = [foo]
# req2: NumAllocations = 1, allocTags = [bar]

If *req1* is seen first, the algorithm can place the 4 containers on n1, n2, n3 and n4. When it gets to *req2*, it will see that 4 nodes have the *foo* tag and will place it on n5. But if *req2* is seen first, the *bar* tag can be placed on any node, since no node will at that point have *foo*. Then, when the algorithm gets to *req1*, since *foo* has no anti-affinity with *bar*, it can end up placing *foo* on the node with *bar*, violating the second constraint.

To prevent the above, we need a validation step: after the placements for a batch of requests are made, for each request we remove its tags from the node and check whether the constraints would still be satisfied if the tags were added back to the node. Applied to the example above, after the algorithm has run through *req2* and then *req1*, we remove the *bar* tag from its node and try to add it back. This time, constraint satisfaction will fail, since there is now a *foo* tag on the node and *bar* cannot be added. The algorithm will then retry placing *req2* on another node.
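A sketch of the proposed validation pass; all types below are hypothetical stand-ins, and the real check would reuse {{PlacementConstraintsUtil#canSatisfyConstraints}}.

{code:java}
import java.util.List;

public class PlacementValidator {
  // Hypothetical request/node abstractions for illustration.
  interface PlacedRequest {
    List<String> allocationTags();
    Node placedNode();
  }
  interface Node {
    void removeTags(List<String> tags);
    void addTags(List<String> tags);
    boolean canSatisfyConstraints(List<String> tags);
  }

  // After a batch is placed: for each request, take its tags off the node and
  // check whether adding them back would still satisfy all constraints, now
  // that every other request in the batch has been placed.
  public static boolean validate(List<PlacedRequest> batch) {
    for (PlacedRequest req : batch) {
      Node node = req.placedNode();
      node.removeTags(req.allocationTags());
      boolean stillValid = node.canSatisfyConstraints(req.allocationTags());
      node.addTags(req.allocationTags());
      if (!stillValid) {
        return false; // caller retries placement of this request elsewhere
      }
    }
    return true;
  }
}
{code}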
[jira] [Created] (YARN-7780) Documentation for Placement Constraints
Arun Suresh created YARN-7780:
----------------------------------

Summary: Documentation for Placement Constraints
Key: YARN-7780
URL: https://issues.apache.org/jira/browse/YARN-7780
Project: Hadoop YARN
Issue Type: Sub-task
Reporter: Arun Suresh
Assignee: Konstantinos Karanasos

JIRA to track documentation for the feature.
[jira] [Created] (YARN-7752) Handle AllocationTags for Opportunistic containers.
Arun Suresh created YARN-7752:
----------------------------------

Summary: Handle AllocationTags for Opportunistic containers.
Key: YARN-7752
URL: https://issues.apache.org/jira/browse/YARN-7752
Project: Hadoop YARN
Issue Type: Sub-task
Reporter: Arun Suresh

JIRA to track how opportunistic containers are handled w.r.t. the AllocationTagsManager's creation and removal of tags.
[jira] [Created] (YARN-7746) Minor bug fixes to PlacementConstraintUtils and PlacementProcessor to support app priority
Arun Suresh created YARN-7746:
----------------------------------

Summary: Minor bug fixes to PlacementConstraintUtils and PlacementProcessor to support app priority
Key: YARN-7746
URL: https://issues.apache.org/jira/browse/YARN-7746
Project: Hadoop YARN
Issue Type: Sub-task
Reporter: Arun Suresh
Assignee: Arun Suresh

JIRA opened to track two minor fixes:
# The PlacementConstraintsUtil does a scope check using object equality rather than string equality, which lets some tests pass but fails in an actual deployment.
# The threadpools used in the Processor should be modified to take a priority blocking queue that respects application priority (see the sketch below).
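A sketch of the second fix, assuming a plain {{ThreadPoolExecutor}} backed by a {{PriorityBlockingQueue}}; all names are illustrative.

{code:java}
import java.util.concurrent.*;

public class PriorityPlacementPool {
  // Runnable that carries the submitting application's priority.
  static class PrioritizedTask implements Runnable, Comparable<PrioritizedTask> {
    final int appPriority;
    final Runnable task;
    PrioritizedTask(int appPriority, Runnable task) {
      this.appPriority = appPriority;
      this.task = task;
    }
    @Override public void run() { task.run(); }
    @Override public int compareTo(PrioritizedTask o) {
      return Integer.compare(o.appPriority, this.appPriority); // higher first
    }
  }

  public static ThreadPoolExecutor create(int threads) {
    return new ThreadPoolExecutor(threads, threads, 0L, TimeUnit.MILLISECONDS,
        new PriorityBlockingQueue<>());
  }
}
{code}

Note that tasks must be handed to the pool via {{execute()}} with a {{PrioritizedTask}}, not {{submit()}}, since {{submit()}} wraps them in {{FutureTask}}s that are not Comparable and would break the priority ordering.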
[jira] [Created] (YARN-7745) Allow DistributedShell to take a placement specification for containers it wants to launch
Arun Suresh created YARN-7745:
----------------------------------

Summary: Allow DistributedShell to take a placement specification for containers it wants to launch
Key: YARN-7745
URL: https://issues.apache.org/jira/browse/YARN-7745
Project: Hadoop YARN
Issue Type: Sub-task
Reporter: Arun Suresh
Assignee: Arun Suresh

This is to add a '-placement_spec' option to the distributed shell client, where the user can specify a stringified specification for how they want containers to be placed. For example:

{noformat}
$ yarn org.apache.hadoop.yarn.applications.distributedshell.Client -jar \
  $YARN_DS/hadoop-yarn-applications-distributedshell-$YARN_VERSION.jar \
  -shell_command sleep -shell_args 10 -placement_spec 
{noformat}
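For illustration only - the exact spec grammar is left open in this JIRA - a spec asking for 3 'zk' containers with anti-affinity to each other might, under the syntax that DistributedShell eventually documented, look like:

{noformat}
-placement_spec zk=3,NOTIN,NODE,zk
{noformat}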
[jira] [Created] (YARN-7715) Update CPU and Memory cgroups params on container update as well.
Arun Suresh created YARN-7715:
----------------------------------

Summary: Update CPU and Memory cgroups params on container update as well.
Key: YARN-7715
URL: https://issues.apache.org/jira/browse/YARN-7715
Project: Hadoop YARN
Issue Type: Bug
Reporter: Arun Suresh

In YARN-6673 and YARN-6674, the cgroups resource handlers update the cgroups params for containers, based on whether they are opportunistic or guaranteed, in the *preStart* method.

Now that YARN-5085 is in, a container's executionType (as well as its cpu, memory and any other resources) can be updated after the container has started. This means we need the ability to change cgroups params after container start.
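A sketch of what re-applying limits after an update amounts to at the cgroups (v1) level; the paths and values are illustrative, and the real change would go through the existing resource handlers rather than raw file writes.

{code:java}
import java.io.IOException;
import java.nio.file.*;

public class CgroupsUpdateSketch {
  private static final Path CGROUP_ROOT = Paths.get("/sys/fs/cgroup");

  // Rewrite one cgroup parameter file under YARN's per-container hierarchy
  // (the /hadoop-yarn prefix is the YARN default; adjust to deployment).
  static void write(String controller, String containerId,
                    String param, String value) throws IOException {
    Path p = CGROUP_ROOT.resolve(controller)
        .resolve("hadoop-yarn").resolve(containerId).resolve(param);
    Files.write(p, value.getBytes());
  }

  // E.g. on promotion from OPPORTUNISTIC to GUARANTEED, restore full CPU
  // shares and the container's memory limit instead of the throttled values.
  public static void onPromotion(String containerId, long memBytes)
      throws IOException {
    write("cpu", containerId, "cpu.shares", "1024");
    write("memory", containerId, "memory.limit_in_bytes",
        Long.toString(memBytes));
  }
}
{code}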
[jira] [Created] (YARN-7696) Add container tags to ContainerTokenIdentifier, api.Container and NMContainerStatus to handle all recovery cases
Arun Suresh created YARN-7696:
----------------------------------

Summary: Add container tags to ContainerTokenIdentifier, api.Container and NMContainerStatus to handle all recovery cases
Key: YARN-7696
URL: https://issues.apache.org/jira/browse/YARN-7696
Project: Hadoop YARN
Issue Type: Sub-task
Reporter: Arun Suresh

The NM needs to persist the container tags so that on RM recovery, they are sent back to the RM via the NMContainerStatus. The RM would then recover the AllocationTagsManager using this information. The api.Container also requires the allocationTags, since after AM recovery we need to provide the AM with previously allocated containers.
[jira] [Created] (YARN-7670) Modifications to the ResourceScheduler to support SchedulingRequests
Arun Suresh created YARN-7670:
----------------------------------

Summary: Modifications to the ResourceScheduler to support SchedulingRequests
Key: YARN-7670
URL: https://issues.apache.org/jira/browse/YARN-7670
Project: Hadoop YARN
Issue Type: Sub-task
Reporter: Arun Suresh
Assignee: Arun Suresh

As per discussions in YARN-7612, this JIRA tracks the changes to the ResourceScheduler interface and its implementation in the CapacityScheduler to support SchedulingRequests.
[jira] [Created] (YARN-7669) [API] Introduce interfaces for placement constraint processing
Arun Suresh created YARN-7669:
----------------------------------

Summary: [API] Introduce interfaces for placement constraint processing
Key: YARN-7669
URL: https://issues.apache.org/jira/browse/YARN-7669
Project: Hadoop YARN
Issue Type: Sub-task
Reporter: Arun Suresh
Assignee: Arun Suresh

As per discussions in YARN-7612, this JIRA will introduce the generic interfaces, which will be implemented in YARN-7612.
[jira] [Created] (YARN-7623) Fix the CapacityScheduler Queue configuration documentation
Arun Suresh created YARN-7623:
----------------------------------

Summary: Fix the CapacityScheduler Queue configuration documentation
Key: YARN-7623
URL: https://issues.apache.org/jira/browse/YARN-7623
Project: Hadoop YARN
Issue Type: Improvement
Reporter: Arun Suresh

It looks like the [Changing Queue Configuration|https://hadoop.apache.org/docs/r2.9.0/hadoop-yarn/hadoop-yarn-site/CapacityScheduler.html#Changing_queue_configuration_via_API] section is mis-formatted.
[jira] [Created] (YARN-7613) Implement Planning algorithms for rich placement
Arun Suresh created YARN-7613:
----------------------------------

Summary: Implement Planning algorithms for rich placement
Key: YARN-7613
URL: https://issues.apache.org/jira/browse/YARN-7613
Project: Hadoop YARN
Issue Type: Sub-task
Reporter: Arun Suresh
Assignee: Panagiotis Garefalakis
[jira] [Created] (YARN-7612) Add Placement Processor and planner framework
Arun Suresh created YARN-7612:
----------------------------------

Summary: Add Placement Processor and planner framework
Key: YARN-7612
URL: https://issues.apache.org/jira/browse/YARN-7612
Project: Hadoop YARN
Issue Type: Sub-task
Reporter: Arun Suresh
Assignee: Arun Suresh

This introduces a Placement Processor and a planning algorithm framework to handle placement constraints and scheduling requests from an app and place them on nodes. The actual planning algorithm(s) will be handled in a separate JIRA.
[jira] [Created] (YARN-7559) TestNodeLabelContainerAllocation failing intermittently
Arun Suresh created YARN-7559:
----------------------------------

Summary: TestNodeLabelContainerAllocation failing intermittently
Key: YARN-7559
URL: https://issues.apache.org/jira/browse/YARN-7559
Project: Hadoop YARN
Issue Type: Bug
Reporter: Arun Suresh
[jira] [Created] (YARN-7547) Throttle Localization for Opportunistic Containers in the NM
Arun Suresh created YARN-7547:
----------------------------------

Summary: Throttle Localization for Opportunistic Containers in the NM
Key: YARN-7547
URL: https://issues.apache.org/jira/browse/YARN-7547
Project: Hadoop YARN
Issue Type: Bug
Reporter: Arun Suresh
Assignee: kartheek muthyala

Currently, localization is performed before the container is queued on the NM. It is possible that a barrage of Opportunistic containers can prevent Guaranteed containers from starting. This can be avoided by throttling localization requests for opportunistic containers - for example, if the number of queued containers is > x, then don't start localization for new opportunistic containers.
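A minimal sketch of the proposed guard; the class, names and the source of the threshold are hypothetical.

{code:java}
public class OpportunisticLocalizationThrottle {
  private final int maxQueuedForLocalization; // the "x" in the proposal

  public OpportunisticLocalizationThrottle(int maxQueuedForLocalization) {
    this.maxQueuedForLocalization = maxQueuedForLocalization;
  }

  /** Returns true if localization for a new container may start now. */
  public boolean mayLocalize(boolean isOpportunistic, int queuedContainers) {
    if (!isOpportunistic) {
      return true; // guaranteed containers always localize immediately
    }
    return queuedContainers <= maxQueuedForLocalization;
  }
}
{code}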
[jira] [Created] (YARN-7542) NM recovers some Running Opportunistic Containers as SUSPEND
Arun Suresh created YARN-7542:
----------------------------------

Summary: NM recovers some Running Opportunistic Containers as SUSPEND
Key: YARN-7542
URL: https://issues.apache.org/jira/browse/YARN-7542
Project: Hadoop YARN
Issue Type: Bug
Reporter: Arun Suresh
Assignee: Sampada Dehankar

Steps to reproduce:
* Start a YARN cluster. Enable opportunistic containers and set the NM queue length to something > 10. Also enable work-preserving restart.
* Start an MR job (without opportunistic containers).
* Kill the NM and restart it again.
* The logs show that some of the containers are in SUSPENDED state, even though they are still running.

[~sampada15] / [~kartheek], can you take a look at this?
[jira] [Created] (YARN-7448) [API] Add SchedulingRequest to the AllocateRequest
Arun Suresh created YARN-7448:
----------------------------------

Summary: [API] Add SchedulingRequest to the AllocateRequest
Key: YARN-7448
URL: https://issues.apache.org/jira/browse/YARN-7448
Project: Hadoop YARN
Issue Type: Sub-task
Reporter: Arun Suresh
Assignee: Arun Suresh

YARN-6594 introduces the {{SchedulingRequest}}. This JIRA tracks the inclusion of the SchedulingRequest into the AllocateRequest.
[jira] [Resolved] (YARN-5220) Scheduling of OPPORTUNISTIC containers through YARN RM
[ https://issues.apache.org/jira/browse/YARN-5220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arun Suresh resolved YARN-5220.
-------------------------------
Resolution: Fixed
Fix Version/s: 2.9.0
Release Note: This extends the centralized YARN RM to enable the scheduling of OPPORTUNISTIC containers in a centralized fashion. This way, users can use OPPORTUNISTIC containers to improve the cluster's utilization, without needing to enable distributed scheduling.

> Scheduling of OPPORTUNISTIC containers through YARN RM
> ------------------------------------------------------
>
> Key: YARN-5220
> URL: https://issues.apache.org/jira/browse/YARN-5220
> Project: Hadoop YARN
> Issue Type: New Feature
> Components: resourcemanager
> Reporter: Konstantinos Karanasos
> Assignee: Konstantinos Karanasos
> Priority: Major
> Fix For: 2.9.0
>
> In YARN-2882, we introduced the notion of OPPORTUNISTIC containers, along with the existing GUARANTEED containers of YARN.
> OPPORTUNISTIC containers are allowed to be queued at the NMs (YARN-2883), and are executed as long as there are available resources at the NM. Moreover, they are of lower priority than the GUARANTEED containers; that is, they can be preempted for a GUARANTEED container to start its execution.
> In YARN-2877, we introduced distributed scheduling in YARN, and enabled OPPORTUNISTIC containers to be scheduled exclusively by distributed schedulers.
> In this JIRA, we propose to extend the centralized YARN RM in order to enable the scheduling of OPPORTUNISTIC containers in a centralized fashion. This way, users can use OPPORTUNISTIC containers to improve the cluster's utilization, without the need to enable distributed scheduling.
> This JIRA is also related to YARN-1011, which introduces the over-commitment of resources, scheduling additional OPPORTUNISTIC containers to the NMs based on the currently used resources and not based only on the allocated resources.
[jira] [Resolved] (YARN-5687) Refactor TestOpportunisticContainerAllocation to extend TestAMRMClient
[ https://issues.apache.org/jira/browse/YARN-5687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arun Suresh resolved YARN-5687.
-------------------------------
Resolution: Implemented
Fix Version/s: 2.9.0

This is already done.

> Refactor TestOpportunisticContainerAllocation to extend TestAMRMClient
> -----------------------------------------------------------------------
>
> Key: YARN-5687
> URL: https://issues.apache.org/jira/browse/YARN-5687
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: resourcemanager
> Reporter: Konstantinos Karanasos
> Priority: Major
> Fix For: 2.9.0
>
> Attachments: YARN-5687.001.patch
>
> Since {{TestOpportunisticContainerAllocation}} shares a lot of code with {{TestAMRMClient}}, we should refactor the former, making it a subclass of the latter.
[jira] [Resolved] (YARN-4631) Add specialized Token support for DistributedSchedulingProtocol
[ https://issues.apache.org/jira/browse/YARN-4631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arun Suresh resolved YARN-4631.
-------------------------------
Resolution: Won't Fix

Closing this as it is not required.

> Add specialized Token support for DistributedSchedulingProtocol
> ----------------------------------------------------------------
>
> Key: YARN-4631
> URL: https://issues.apache.org/jira/browse/YARN-4631
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: nodemanager, resourcemanager
> Reporter: Arun Suresh
> Assignee: Arun Suresh
> Priority: Major
>
> The {{DistributedSchedulingProtocol}}, introduced in YARN-2885, extends the {{ApplicationMasterProtocol}}. This protocol should support its own Token type, and not just reuse the AMRMToken.
[jira] [Resolved] (YARN-4742) [Umbrella] Enhancements to Distributed Scheduling
[ https://issues.apache.org/jira/browse/YARN-4742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arun Suresh resolved YARN-4742.
-------------------------------
Resolution: Fixed
Fix Version/s: 3.0.0-beta1, 2.9.0

> [Umbrella] Enhancements to Distributed Scheduling
> --------------------------------------------------
>
> Key: YARN-4742
> URL: https://issues.apache.org/jira/browse/YARN-4742
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: Arun Suresh
> Assignee: Arun Suresh
> Priority: Major
> Fix For: 2.9.0, 3.0.0-beta1
>
> This is an Umbrella JIRA to track enhancements / improvements that can be made to the core Distributed Scheduling framework: YARN-2877
[jira] [Resolved] (YARN-2877) Extend YARN to support distributed scheduling
[ https://issues.apache.org/jira/browse/YARN-2877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arun Suresh resolved YARN-2877.
-------------------------------
Resolution: Fixed

> Extend YARN to support distributed scheduling
> ----------------------------------------------
>
> Key: YARN-2877
> URL: https://issues.apache.org/jira/browse/YARN-2877
> Project: Hadoop YARN
> Issue Type: New Feature
> Components: nodemanager, resourcemanager
> Reporter: Sriram Rao
> Assignee: Konstantinos Karanasos
> Priority: Major
> Fix For: 2.9.0, 3.0.0-alpha2
>
> Attachments: distributed-scheduling-design-doc_v1.pdf
>
> This is an umbrella JIRA that proposes to extend YARN to support distributed scheduling. Briefly, some of the motivations for distributed scheduling are the following:
> 1. Improve cluster utilization by opportunistically executing tasks on otherwise idle resources on individual machines.
> 2. Reduce allocation latency for tasks where the scheduling time dominates (i.e., task execution time is much less than the time required to obtain a container from the RM).
[jira] [Resolved] (YARN-5447) Consider including allocationRequestId in NMContainerStatus to allow recovery in case of RM failover
[ https://issues.apache.org/jira/browse/YARN-5447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arun Suresh resolved YARN-5447.
-------------------------------
Resolution: Won't Fix

Today, this is a problem even when a user does not specify an allocationRequestId. Since the AMRMClient sends all outstanding requests to the RM after a failover, it should not be that big of an issue.

> Consider including allocationRequestId in NMContainerStatus to allow recovery in case of RM failover
> -----------------------------------------------------------------------------------------------------
>
> Key: YARN-5447
> URL: https://issues.apache.org/jira/browse/YARN-5447
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: applications, resourcemanager
> Reporter: Subru Krishnan
> Assignee: Subru Krishnan
>
> We have added a mapping of the allocated container to the original request through YARN-4887/YARN-4888. There is a corner case in which the mapping will be lost, i.e. if the RM fails over before notifying the AM about newly allocated container(s). This JIRA tracks the changes required to include the allocationRequestId in the NMContainerStatus to allow recovery in case of RM failover.
[jira] [Resolved] (YARN-5861) Add support for recovery of queued opportunistic containers in the NM.
[ https://issues.apache.org/jira/browse/YARN-5861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arun Suresh resolved YARN-5861.
-------------------------------
Resolution: Duplicate

> Add support for recovery of queued opportunistic containers in the NM.
> -----------------------------------------------------------------------
>
> Key: YARN-5861
> URL: https://issues.apache.org/jira/browse/YARN-5861
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: Arun Suresh
> Assignee: Arun Suresh
>
> Currently, the NM stateStore marks containers as QUEUED, but they are ignored (deemed lost) if they had not started before the NM went down. These containers should ideally be re-queued when the NM restarts.
[jira] [Resolved] (YARN-5860) Add support for increase and decrease of container resources to NM Container Queuing
[ https://issues.apache.org/jira/browse/YARN-5860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arun Suresh resolved YARN-5860.
-------------------------------
Resolution: Implemented
Fix Version/s: 3.0.0, 2.9.0

> Add support for increase and decrease of container resources to NM Container Queuing
> -------------------------------------------------------------------------------------
>
> Key: YARN-5860
> URL: https://issues.apache.org/jira/browse/YARN-5860
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: Arun Suresh
> Assignee: Arun Suresh
> Fix For: 2.9.0, 3.0.0
>
> Currently, the queuing framework in the NM (introduced in YARN-2877) that handles opportunistic containers preempts them only when resources are needed to start guaranteed containers.
> It currently does not handle situations where a guaranteed container's resources have been increased. Conversely, if a guaranteed (or opportunistic) container's resources have been decreased, the NM must start queued opportunistic containers waiting on the newly available resources.
[jira] [Created] (YARN-7275) NM Statestore cleanup for Container updates
Arun Suresh created YARN-7275:
----------------------------------

Summary: NM Statestore cleanup for Container updates
Key: YARN-7275
URL: https://issues.apache.org/jira/browse/YARN-7275
Project: Hadoop YARN
Issue Type: Sub-task
Reporter: Arun Suresh
Assignee: kartheek muthyala
Priority: Blocker

Currently, only resource updates are recorded in the NM state store; we need to add ExecutionType updates as well.
[jira] [Created] (YARN-7258) Add Node and Rack Hints to Opportunistic Scheduler
Arun Suresh created YARN-7258:
----------------------------------

Summary: Add Node and Rack Hints to Opportunistic Scheduler
Key: YARN-7258
URL: https://issues.apache.org/jira/browse/YARN-7258
Project: Hadoop YARN
Issue Type: Bug
Reporter: Arun Suresh
Assignee: kartheek muthyala

Currently, the Opportunistic Scheduler ignores the node and rack information and allocates strictly on the least loaded node (based on queue length) at the time it receives the request.

This JIRA is to track changes needed to allow the OpportunisticContainerAllocator to take the node/rack name as hints. The flow would be:
# If the requested node is found in the top K least-loaded nodes, allocate on that node.
# Else, allocate on the least loaded node on the same rack from the top K least-loaded nodes.
# Else, allocate on the least loaded node.
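A sketch of this selection order; the types are hypothetical, and it assumes the top-K list is non-empty and sorted by ascending queue length.

{code:java}
import java.util.Comparator;
import java.util.List;

public class HintedNodeSelector {
  interface NodeInfo {
    String nodeId();
    String rack();
    int queueLength();
  }

  public static NodeInfo select(List<NodeInfo> topKLeastLoaded,
                                String hintedNode, String hintedRack) {
    // 1. Exact node match within the top-K least-loaded nodes.
    for (NodeInfo n : topKLeastLoaded) {
      if (n.nodeId().equals(hintedNode)) {
        return n;
      }
    }
    // 2. Least loaded node on the hinted rack.
    return topKLeastLoaded.stream()
        .filter(n -> n.rack().equals(hintedRack))
        .min(Comparator.comparingInt(NodeInfo::queueLength))
        // 3. Fall back to the globally least loaded node.
        .orElse(topKLeastLoaded.get(0));
  }
}
{code}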
[jira] [Created] (YARN-7240) Add more states and transitions to stabilize the NM Container state machine
Arun Suresh created YARN-7240:
----------------------------------

Summary: Add more states and transitions to stabilize the NM Container state machine
Key: YARN-7240
URL: https://issues.apache.org/jira/browse/YARN-7240
Project: Hadoop YARN
Issue Type: Improvement
Reporter: Arun Suresh
Assignee: kartheek muthyala

There seem to be a few intermediate states that can be added to improve the stability of the NM container state machine. For example:
* The REINITIALIZING state should probably be split into REINITIALIZING and REINITIALIZING_AWAITING_KILL.
* Container updates are currently handled in the ContainerScheduler, but it would probably be better to have them plumbed through the container state machine as a new state, say UPDATING, with a new container event.

The plan is to also add some extra tests to try to exercise every transition.
[jira] [Created] (YARN-7192) Add a pluggable StateMachine Listener that is notified of NM Container State changes
Arun Suresh created YARN-7192:
----------------------------------

Summary: Add a pluggable StateMachine Listener that is notified of NM Container State changes
Key: YARN-7192
URL: https://issues.apache.org/jira/browse/YARN-7192
Project: Hadoop YARN
Issue Type: Bug
Reporter: Arun Suresh
Assignee: Arun Suresh

This JIRA is to add support for a pluggable class in the NodeManager that is notified of changes to the Container StateMachine state and of the events that caused each change.

The proposal is to modify the basic StateMachine class to add support for a hook that is called before and after a transition.
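A sketch of what such a hook could look like; the interface and method names are illustrative, not the final API.

{code:java}
public interface StateTransitionListener<STATE extends Enum<STATE>,
                                         EVENT extends Enum<EVENT>> {
  /** Invoked before the transition for 'event' is applied. */
  void preTransition(STATE currentState, EVENT event);

  /** Invoked after the transition, with the state that resulted. */
  void postTransition(STATE previousState, STATE newState, EVENT event);
}
{code}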
[jira] [Closed] (YARN-6692) Delay pause when container is localizing
[ https://issues.apache.org/jira/browse/YARN-6692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arun Suresh closed YARN-6692.
-----------------------------

> Delay pause when container is localizing
> -----------------------------------------
>
> Key: YARN-6692
> URL: https://issues.apache.org/jira/browse/YARN-6692
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: nodemanager
> Reporter: Jose Miguel Arreola
> Assignee: Jose Miguel Arreola
> Original Estimate: 96h
> Remaining Estimate: 96h
>
> If a container receives a Pause event while localizing, allow the container to finish localizing and then pause it.
[jira] [Resolved] (YARN-6692) Delay pause when container is localizing
[ https://issues.apache.org/jira/browse/YARN-6692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arun Suresh resolved YARN-6692.
-------------------------------
Resolution: Invalid

Closing this, since it is not a valid scenario currently.

> Delay pause when container is localizing
> -----------------------------------------
>
> Key: YARN-6692
> URL: https://issues.apache.org/jira/browse/YARN-6692
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: nodemanager
> Reporter: Jose Miguel Arreola
> Assignee: Jose Miguel Arreola
> Original Estimate: 96h
> Remaining Estimate: 96h
>
> If a container receives a Pause event while localizing, allow the container to finish localizing and then pause it.
[jira] [Created] (YARN-7178) Add documentation for Container Update API
Arun Suresh created YARN-7178:
----------------------------------

Summary: Add documentation for Container Update API
Key: YARN-7178
URL: https://issues.apache.org/jira/browse/YARN-7178
Project: Hadoop YARN
Issue Type: Sub-task
Reporter: Arun Suresh
Assignee: Arun Suresh
[jira] [Resolved] (YARN-4509) Promote containers from OPPORTUNISTIC to GUARANTEED
[ https://issues.apache.org/jira/browse/YARN-4509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arun Suresh resolved YARN-4509.
-------------------------------
Resolution: Duplicate
Target Version/s: (was: )

> Promote containers from OPPORTUNISTIC to GUARANTEED
> ----------------------------------------------------
>
> Key: YARN-4509
> URL: https://issues.apache.org/jira/browse/YARN-4509
> Project: Hadoop YARN
> Issue Type: Bug
> Affects Versions: 3.0.0-alpha1
> Reporter: Karthik Kambatla
> Assignee: Karthik Kambatla
>
> YARN-2882 adds the notion of OPPORTUNISTIC containers. We should define the protocol for promoting these containers to GUARANTEED.
[jira] [Created] (YARN-7173) Container Update Backward compatibility fix for upgrades from 2.8.x
Arun Suresh created YARN-7173:
----------------------------------

Summary: Container Update Backward compatibility fix for upgrades from 2.8.x
Key: YARN-7173
URL: https://issues.apache.org/jira/browse/YARN-7173
Project: Hadoop YARN
Issue Type: Sub-task
Reporter: Arun Suresh
Assignee: Arun Suresh

This is based on discussions with [~leftnoteasy] in YARN-6979.

In YARN-6979, the {{getContainersToDecrease()}} and {{addAllContainersToDecrease()}} methods were removed from the NodeHeartbeatResponse (although the actual protobuf fields were retained). We need to ensure that for clusters that upgrade from 2.8.x to 2.9.0, the decreased containers are still sent to the NM.
[jira] [Created] (YARN-7086) Release all containers asynchronously
Arun Suresh created YARN-7086:
----------------------------------

Summary: Release all containers asynchronously
Key: YARN-7086
URL: https://issues.apache.org/jira/browse/YARN-7086
Project: Hadoop YARN
Issue Type: Bug
Components: resourcemanager
Reporter: Arun Suresh
Assignee: Arun Suresh

We have noticed in production two situations that can cause deadlocks and bring the scheduling of new containers to a halt, especially with regard to applications that have a lot of live containers:
# When these applications release their containers in bulk.
# When these applications terminate abruptly due to some failure and the scheduler releases all their live containers in a loop.

To handle the issues mentioned above, we have a patch in production to make sure ALL container releases happen asynchronously - and it has served us well. Opening this JIRA to gather feedback on whether this is a good idea generally (cc [~leftnoteasy], [~jlowe], [~curino], [~kasha], [~subru], [~roniburd]).

BTW, in YARN-6251 we already have an asyncReleaseContainer() in the AbstractYarnScheduler and a corresponding scheduler event, which is currently used specifically for the container-update code paths (where the scheduler releases temporary containers which it creates for the update).
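A sketch of the asynchronous-release idea, with hypothetical names, modeled loosely on the asyncReleaseContainer() path mentioned above: enqueue an event per container instead of releasing under the scheduler lock, and let a dispatcher thread drain them.

{code:java}
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class AsyncContainerReleaser {
  private final BlockingQueue<String> releaseQueue = new LinkedBlockingQueue<>();

  // Called from allocate() / app-completion paths: O(1) per container, with
  // no scheduler-wide lock held while the actual release work happens.
  public void releaseAll(List<String> containerIds) {
    releaseQueue.addAll(containerIds);
  }

  // Run by a single dispatcher thread.
  public void drainLoop() throws InterruptedException {
    while (!Thread.currentThread().isInterrupted()) {
      String containerId = releaseQueue.take();
      doRelease(containerId); // the existing synchronous release logic
    }
  }

  private void doRelease(String containerId) {
    // placeholder for the scheduler's completed-container handling
  }
}
{code}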
[jira] [Created] (YARN-7015) Handle Container ExecType update (Promotion/Demotion) in cgroups resource handlers
Arun Suresh created YARN-7015:
----------------------------------

Summary: Handle Container ExecType update (Promotion/Demotion) in cgroups resource handlers
Key: YARN-7015
URL: https://issues.apache.org/jira/browse/YARN-7015
Project: Hadoop YARN
Issue Type: Sub-task
Reporter: Arun Suresh

YARN-5085 adds support for changing a container's execution type (promotion/demotion). Modifications to the ContainerManagementProtocol, ContainerManager and ContainerScheduler to handle this change are now in trunk.

Opening this JIRA to track changes (if any) required in the cgroups resource handlers to accommodate this in the context of YARN-1011.

(cc [~kasha], [~kkaranasos], [~haibochen], [~miklos.szeg...@cloudera.com])
[jira] [Created] (YARN-6979) Add flag to allow all container updates to be initiated via NodeHeartbeatResponse
Arun Suresh created YARN-6979:
----------------------------------

Summary: Add flag to allow all container updates to be initiated via NodeHeartbeatResponse
Key: YARN-6979
URL: https://issues.apache.org/jira/browse/YARN-6979
Project: Hadoop YARN
Issue Type: Sub-task
Reporter: Arun Suresh
Assignee: kartheek muthyala

Currently, only the Container Resource increase command is sent to the NM via the NodeHeartbeat response.

This JIRA proposes to add a flag in the RM to allow ALL container updates (increase, decrease, promote and demote) to be initiated via the node HB.

The AM is still free to use the ContainerManagementProtocol's {{updateContainer}} API in cases where, for instance, the node HB frequency is very low and the AM needs to update the container as soon as possible. In these situations, if the node HB arrives before the updateContainer API call, the call will error out due to a version mismatch, and the AM is required to handle it.
[jira] [Created] (YARN-6978) Add updateContainer API to NMClient.
Arun Suresh created YARN-6978:
----------------------------------

Summary: Add updateContainer API to NMClient.
Key: YARN-6978
URL: https://issues.apache.org/jira/browse/YARN-6978
Project: Hadoop YARN
Issue Type: Sub-task
Reporter: Arun Suresh
Assignee: kartheek muthyala

This is to track the addition of the updateContainer API to the {{NMClient}} and {{NMClientAsync}}.
[jira] [Created] (YARN-6963) Prevent other containers from starting when a container is re-initializing
Arun Suresh created YARN-6963:
----------------------------------

Summary: Prevent other containers from starting when a container is re-initializing
Key: YARN-6963
URL: https://issues.apache.org/jira/browse/YARN-6963
Project: Hadoop YARN
Issue Type: Bug
Reporter: Arun Suresh
Assignee: Arun Suresh

Further to discussions in YARN-6920: container re-initialization leads to momentary relinquishing of NM resources when the container is brought down, followed by reclaiming of the same resources when it is re-launched. If there are Opportunistic containers in the queue, this can lead to unnecessary churn if one of those opportunistic containers is started and immediately killed.

This JIRA tracks changes required to prevent the above by ensuring the resources for a container are 'locked' for the duration of the container's lifetime, including the time it takes for a re-initialization.
[jira] [Resolved] (YARN-6180) Clean unused SchedulerRequestKeys once ExecutionType updates are completed
[ https://issues.apache.org/jira/browse/YARN-6180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arun Suresh resolved YARN-6180.
-------------------------------
Resolution: Not A Problem

Resolving this - verified it is not a problem.

> Clean unused SchedulerRequestKeys once ExecutionType updates are completed
> ---------------------------------------------------------------------------
>
> Key: YARN-6180
> URL: https://issues.apache.org/jira/browse/YARN-6180
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Arun Suresh
> Assignee: Arun Suresh
>
> The SchedulerRequestKeys generated for ExecutionType updates tend to accumulate in the AppSchedulingInfo and over time lead to the situation outlined in YARN-5540.
> These keys must be removed once the container update completes.
[jira] [Created] (YARN-6940) Enable Container Resize testcase for FairScheduler
Arun Suresh created YARN-6940:
----------------------------------

Summary: Enable Container Resize testcase for FairScheduler
Key: YARN-6940
URL: https://issues.apache.org/jira/browse/YARN-6940
Project: Hadoop YARN
Issue Type: Bug
Components: fairscheduler
Reporter: Arun Suresh
Assignee: Arun Suresh

After YARN-6216, the Container Update (which includes resource increase and decrease) code paths are mostly scheduler-agnostic. This JIRA tracks the final minor change needed in the FairScheduler. It also re-enables the {{TestAMRMClient#testAMRMClientWithContainerResourceChange}} test case for the FairScheduler, verifying that container resizing works there as well.
[jira] [Created] (YARN-6932) Fix TestFederationRMFailoverProxyProvider test case
Arun Suresh created YARN-6932:
----------------------------------

Summary: Fix TestFederationRMFailoverProxyProvider test case
Key: YARN-6932
URL: https://issues.apache.org/jira/browse/YARN-6932
Project: Hadoop YARN
Issue Type: Bug
Reporter: Arun Suresh
Assignee: Subru Krishnan

Noticed that {{TestFederationRMFailoverProxyProvider}} fails after the YARN-2915 merge (cc [~subru]).
[jira] [Created] (YARN-6920) Fix TestNMClient failure
Arun Suresh created YARN-6920:
----------------------------------

Summary: Fix TestNMClient failure
Key: YARN-6920
URL: https://issues.apache.org/jira/browse/YARN-6920
Project: Hadoop YARN
Issue Type: Bug
Reporter: Arun Suresh
Assignee: Haibo Chen

Looks like {{TestNMClient}} has been failing for a while. Opening this JIRA to track the fix.
[jira] [Created] (YARN-6849) NMContainerStatus should have the Container ExecutionType.
Arun Suresh created YARN-6849:
----------------------------------

Summary: NMContainerStatus should have the Container ExecutionType.
Key: YARN-6849
URL: https://issues.apache.org/jira/browse/YARN-6849
Project: Hadoop YARN
Issue Type: Bug
Reporter: Arun Suresh

Currently, only the ContainerState is sent to the RM in the NMContainerStatus. This lets the restarted RM know whether the container is queued or not, but it won't know the ExecutionType.
[jira] [Created] (YARN-6848) Move Router ClientRMServices Interceptor and chain into yarn api and common package
Arun Suresh created YARN-6848:
----------------------------------

Summary: Move Router ClientRMServices Interceptor and chain into yarn api and common package
Key: YARN-6848
URL: https://issues.apache.org/jira/browse/YARN-6848
Project: Hadoop YARN
Issue Type: Sub-task
Reporter: Arun Suresh
Assignee: Arun Suresh

Once YARN-2915 is merged, the ClientRMServices interceptor and proxy classes should be moved into the yarn common and api packages, so that they can be used not just in the Router but also for specifying interceptors that run in the RM.
[jira] [Created] (YARN-6838) Add support to linux container-executor to support container freezing
Arun Suresh created YARN-6838:
----------------------------------

Summary: Add support to linux container-executor to support container freezing
Key: YARN-6838
URL: https://issues.apache.org/jira/browse/YARN-6838
Project: Hadoop YARN
Issue Type: Sub-task
Reporter: Arun Suresh
[jira] [Created] (YARN-6835) Remove runningContainers from ContainerScheduler
Arun Suresh created YARN-6835:
----------------------------------

Summary: Remove runningContainers from ContainerScheduler
Key: YARN-6835
URL: https://issues.apache.org/jira/browse/YARN-6835
Project: Hadoop YARN
Issue Type: Sub-task
Reporter: Arun Suresh
Assignee: Arun Suresh

The *runningContainers* collection contains both running containers and containers that are scheduled but not yet started. We can remove this data structure completely by introducing a *LAUNCHING* container state.
[jira] [Created] (YARN-6829) Promote Opportunistic Containers to Guaranteed containers when Guaranteed containers complete
Arun Suresh created YARN-6829: - Summary: Promote Opportunistic Containers to Guaranteed containers when Guaranteed containers complete Key: YARN-6829 URL: https://issues.apache.org/jira/browse/YARN-6829 Project: Hadoop YARN Issue Type: Sub-task Reporter: Arun Suresh Assignee: Arun Suresh Once the Guaranteed containers of an app complete, it is possible that the queue/app goes below its configured capacity, in which case the app's existing Opportunistic containers can be promoted to ensure they are not preempted. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
[jira] [Created] (YARN-6828) [Umbrella] Container preemption using OPPORTUNISTIC containers
Arun Suresh created YARN-6828: - Summary: [Umbrella] Container preemption using OPPORTUNISTIC containers Key: YARN-6828 URL: https://issues.apache.org/jira/browse/YARN-6828 Project: Hadoop YARN Issue Type: Bug Reporter: Arun Suresh Assignee: Arun Suresh Currently, the YARN schedulers select containers for preemption only in response to a starved queue or app's request. We propose to allow the Schedulers to mark containers that are allocated over queue capacity/fair-share as Opportunistic containers. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
[jira] [Resolved] (YARN-5049) Extend NMStateStore to save queued container information
[ https://issues.apache.org/jira/browse/YARN-5049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh resolved YARN-5049. --- Resolution: Fixed Committed to branch-2 as well > Extend NMStateStore to save queued container information > > > Key: YARN-5049 > URL: https://issues.apache.org/jira/browse/YARN-5049 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager, resourcemanager >Reporter: Konstantinos Karanasos >Assignee: Konstantinos Karanasos > Fix For: 3.0.0-alpha1 > > Attachments: YARN-5049.001.patch, YARN-5049.002.patch, > YARN-5049.003.patch > > > This JIRA is about extending the NMStateStore to save queued container > information whenever a new container is added to the NM queue. > It also removes the information from the state store when the queued > container starts its execution. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
[jira] [Created] (YARN-6826) SLS NMSimulator support for Opportunistic Container Queuing.
Arun Suresh created YARN-6826: - Summary: SLS NMSimulator support for Opportunistic Container Queuing. Key: YARN-6826 URL: https://issues.apache.org/jira/browse/YARN-6826 Project: Hadoop YARN Issue Type: Bug Components: scheduler-load-simulator Reporter: Arun Suresh Assignee: Arun Suresh -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
[jira] [Created] (YARN-6808) Allow Schedulers to return OPPORTUNISTIC containers when queues go over configured capacity
Arun Suresh created YARN-6808: - Summary: Allow Schedulers to return OPPORTUNISTIC containers when queues go over configured capacity Key: YARN-6808 URL: https://issues.apache.org/jira/browse/YARN-6808 Project: Hadoop YARN Issue Type: New Feature Reporter: Arun Suresh This is based on discussions with [~kasha] and [~kkaranasos]. Currently, when a queue goes over capacity, apps on starved queues must wait either for containers to complete or for them to be preempted by the scheduler to get resources. This JIRA proposes to allow Schedulers to: # Allocate all containers over the configured queue capacity/weight as OPPORTUNISTIC. # Auto-promote running OPPORTUNISTIC containers of apps as and when their GUARANTEED containers complete. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
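A minimal sketch of both behaviors, assuming a single memory dimension and hypothetical helper names (the real scheduler integration is considerably more involved): allocations that would push the queue over its configured capacity are tagged OPPORTUNISTIC, and each GUARANTEED completion is a chance to promote:
{code:java}
// Illustrative only; not the actual CapacityScheduler/FairScheduler code.
enum ExecType { GUARANTEED, OPPORTUNISTIC }

final class QueueCapacityPolicy {
    private final long configuredCapacityMb;
    private long guaranteedUsageMb; // usage held by GUARANTEED containers

    QueueCapacityPolicy(long configuredCapacityMb) {
        this.configuredCapacityMb = configuredCapacityMb;
    }

    /** Allocations over the configured capacity come back OPPORTUNISTIC. */
    ExecType allocate(long requestMb) {
        if (guaranteedUsageMb + requestMb <= configuredCapacityMb) {
            guaranteedUsageMb += requestMb;
            return ExecType.GUARANTEED;
        }
        return ExecType.OPPORTUNISTIC;
    }

    /** When a GUARANTEED container completes, try to promote an OPPORTUNISTIC one. */
    boolean onGuaranteedComplete(long completedMb, long opportunisticMb) {
        guaranteedUsageMb -= completedMb;
        if (guaranteedUsageMb + opportunisticMb <= configuredCapacityMb) {
            guaranteedUsageMb += opportunisticMb; // promoted: counted as guaranteed now
            return true;
        }
        return false;
    }
}
{code}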
[jira] [Created] (YARN-6777) Support for ApplicationMasterService processing chain of interceptors
Arun Suresh created YARN-6777: - Summary: Support for ApplicationMasterService processing chain of interceptors Key: YARN-6777 URL: https://issues.apache.org/jira/browse/YARN-6777 Project: Hadoop YARN Issue Type: Sub-task Reporter: Arun Suresh Assignee: Arun Suresh This JIRA extends the Processor introduced in YARN-6776 with a configurable processing chain of interceptors. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
[jira] [Created] (YARN-6776) Refactor ApplicationMasterService to move actual processing logic to a separate class
Arun Suresh created YARN-6776: - Summary: Refactor ApplicationMasterService to move actual processing logic to a separate class Key: YARN-6776 URL: https://issues.apache.org/jira/browse/YARN-6776 Project: Hadoop YARN Issue Type: Sub-task Reporter: Arun Suresh Assignee: Arun Suresh Priority: Minor Minor refactoring to move the processing logic of the {{ApplicationMasterService}} into a separate class. The per-app-attempt locking, as well as the extraction of the appAttemptId etc., will remain in the ApplicationMasterService. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
[jira] [Created] (YARN-6619) AMRMClient Changes to use the PlacementConstraint and SchedulingRequest objects
Arun Suresh created YARN-6619: - Summary: AMRMClient Changes to use the PlacementConstraint and SchedulingRequest objects Key: YARN-6619 URL: https://issues.apache.org/jira/browse/YARN-6619 Project: Hadoop YARN Issue Type: Sub-task Reporter: Arun Suresh Assignee: Panagiotis Garefalakis Opening this JIRA to track changes needed in the AMRMClient to incorporate the PlacementConstraint and SchedulingRequest objects. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
[jira] [Created] (YARN-6614) Deprecate DistributedSchedulingProtocol and add required fields directly to ApplicationMasterProtocol
Arun Suresh created YARN-6614: - Summary: Deprecate DistributedSchedulingProtocol and add required fields directly to ApplicationMasterProtocol Key: YARN-6614 URL: https://issues.apache.org/jira/browse/YARN-6614 Project: Hadoop YARN Issue Type: Bug Reporter: Arun Suresh Assignee: Arun Suresh The {{DistributedSchedulingProtocol}} was initially designed as a wrapper protocol over the {{ApplicationMasterProtocol}}. This JIRA proposes to deprecate the protocol itself and move the extra fields of the {{RegisterDistributedSchedulingAMResponse}} and {{DistributedSchedulingAllocateResponse}} to the {{RegisterApplicationMasterResponse}} and {{AllocateResponse}} respectively. This will simplify the code quite a bit and make it possible to reimplement the feature as a preprocessor. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
[jira] [Created] (YARN-6443) Allow for Priority order relaxing in favor of node locality
Arun Suresh created YARN-6443: - Summary: Allow for Priority order relaxing in favor of node locality Key: YARN-6443 URL: https://issues.apache.org/jira/browse/YARN-6443 Project: Hadoop YARN Issue Type: Improvement Components: capacity scheduler, fairscheduler Reporter: Arun Suresh Assignee: Arun Suresh Currently the Schedulers examine an application's pending Requests in Priority order. This JIRA proposes to introduce a flag (either via the ApplicationMasterService::registerApplication() or via some Scheduler configuration) to favor an ordering that is biased toward the node that is currently heartbeating by relaxing the priority constraint. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
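A rough sketch of what the relaxed ordering could look like, using a hypothetical PendingAsk type rather than the real pending-request structures: with the flag set, asks that are node-local to the heartbeating node are examined first, and priority only orders asks within each locality group:
{code:java}
import java.util.Comparator;
import java.util.List;

// Illustrative only; not the actual scheduler data structures.
record PendingAsk(int priority, String requestedNode) {}

final class AskOrdering {
    /** Orders an app's asks for the node that is currently heartbeating. */
    static List<PendingAsk> order(List<PendingAsk> asks, String heartbeatNode,
                                  boolean relaxPriorityForLocality) {
        Comparator<PendingAsk> byPriority = Comparator.comparingInt(PendingAsk::priority);
        if (!relaxPriorityForLocality) {
            return asks.stream().sorted(byPriority).toList(); // today's behavior
        }
        // Node-local asks first; priority only breaks ties within each group.
        Comparator<PendingAsk> localFirst = Comparator
            .comparing((PendingAsk a) -> !heartbeatNode.equals(a.requestedNode()))
            .thenComparing(byPriority);
        return asks.stream().sorted(localFirst).toList();
    }
}
{code}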
[jira] [Created] (YARN-6406) Garbage Collect unused SchedulerRequestKeys
Arun Suresh created YARN-6406: - Summary: Garbage Collect unused SchedulerRequestKeys Key: YARN-6406 URL: https://issues.apache.org/jira/browse/YARN-6406 Project: Hadoop YARN Issue Type: Improvement Reporter: Arun Suresh Assignee: Arun Suresh YARN-5540 introduced some optimizations to remove satisfied SchedulerKeys from the AppSchedulingInfo. It looks like after YARN-6040, SchedulerRequestKeys are removed only if the Application sends a request with 0 numContainers, whereas earlier, outstanding schedulerKeys were also removed as soon as a container was allocated. An additional optimization we were hoping to include is to remove the ResourceRequest itself once numContainers == 0, since we see in our clusters that RM heap space consumption increases drastically due to a large number of ResourceRequests with 0 numContainers. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
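The proposed cleanup can be sketched as follows, with hypothetical structures (AppSchedulingInfo itself is more complex): drop the ResourceRequest as soon as numContainers reaches zero, and drop the scheduler key once nothing remains under it:
{code:java}
import java.util.HashMap;
import java.util.Map;

// Illustrative only; not the real AppSchedulingInfo.
final class PendingRequests {
    // schedulerKey -> resourceName -> numContainers
    private final Map<String, Map<String, Integer>> requests = new HashMap<>();

    void update(String schedulerKey, String resourceName, int numContainers) {
        if (numContainers <= 0) {
            // Remove the request itself instead of keeping a 0-container
            // entry that bloats the RM heap.
            Map<String, Integer> byName = requests.get(schedulerKey);
            if (byName != null) {
                byName.remove(resourceName);
                if (byName.isEmpty()) {
                    requests.remove(schedulerKey); // GC the scheduler key too
                }
            }
            return;
        }
        requests.computeIfAbsent(schedulerKey, k -> new HashMap<>())
                .put(resourceName, numContainers);
    }

    int outstandingKeys() { return requests.size(); }
}
{code}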
[jira] [Created] (YARN-6355) Interceptor framework for the YARN ApplicationMasterService
Arun Suresh created YARN-6355: - Summary: Interceptor framework for the YARN ApplicationMasterService Key: YARN-6355 URL: https://issues.apache.org/jira/browse/YARN-6355 Project: Hadoop YARN Issue Type: Improvement Reporter: Arun Suresh Assignee: Arun Suresh Currently on the NM, we have the {{AMRMProxy}} framework to intercept the AM <-> RM communication and enforce policies. This is used both by YARN federation (YARN-2915) and Distributed Scheduling (YARN-2877). This JIRA proposes to introduce a similar framework on the RM side, so that pluggable policies can be enforced on the ApplicationMasterService centrally as well. This would be similar in spirit to a Java Servlet Filter Chain, where the order of the interceptors can be declared externally. One possible use case: the {{OpportunisticContainerAllocatorAMService}} is currently implemented as a wrapper over the {{ApplicationMasterService}}; it would probably be better to implement it as an Interceptor. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
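In the spirit of a servlet filter chain, a minimal sketch of what such a framework could look like; the interfaces below are hypothetical, not the AMRMProxy or any existing RM API, and a String stands in for the real allocate request/response types:
{code:java}
import java.util.List;

// Illustrative interfaces only; not an existing YARN API.
interface AllocateInterceptor {
    String intercept(String request, Chain chain);
}

interface Chain {
    String proceed(String request);
}

final class InterceptorChain implements Chain {
    private final List<AllocateInterceptor> interceptors; // order comes from configuration
    private final int index;

    InterceptorChain(List<AllocateInterceptor> interceptors, int index) {
        this.interceptors = interceptors;
        this.index = index;
    }

    @Override
    public String proceed(String request) {
        if (index == interceptors.size()) {
            return "allocated:" + request; // terminal step: the actual scheduler call
        }
        AllocateInterceptor next = interceptors.get(index);
        return next.intercept(request, new InterceptorChain(interceptors, index + 1));
    }
}
{code}
With this shape, the {{OpportunisticContainerAllocatorAMService}} could become one entry in the configured interceptor list rather than a subclass wrapper.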
[jira] [Resolved] (YARN-6181) SchedulerRequestKey compareTo method neglects to compare containerToUpdate
[ https://issues.apache.org/jira/browse/YARN-6181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh resolved YARN-6181. --- Resolution: Duplicate > SchedulerRequestKey compareTo method neglects to compare containerToUpdate > --- > > Key: YARN-6181 > URL: https://issues.apache.org/jira/browse/YARN-6181 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Arun Suresh >Assignee: Arun Suresh > > The SchedulerRequestKey's {{compareTo}} does not correctly compare the > {{containerToUpdate}} fields. Thus multiple update requests against the same > priority and allocationRequestId will be clobbered. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
[jira] [Created] (YARN-6251) Fix Scheduler locking issue introduced by YARN-6216
Arun Suresh created YARN-6251: - Summary: Fix Scheduler locking issue introduced by YARN-6216 Key: YARN-6251 URL: https://issues.apache.org/jira/browse/YARN-6251 Project: Hadoop YARN Issue Type: Bug Reporter: Arun Suresh Opening to track a locking issue that was uncovered when running a custom SLS AMSimulator. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
[jira] [Created] (YARN-6231) TestFairScheduler::testMoveWouldViolateMaxResourcesConstraints failing on branch-2
Arun Suresh created YARN-6231: - Summary: TestFairScheduler::testMoveWouldViolateMaxResourcesConstraints failing on branch-2 Key: YARN-6231 URL: https://issues.apache.org/jira/browse/YARN-6231 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.9.0 Reporter: Arun Suresh Assignee: Karthik Kambatla -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
[jira] [Created] (YARN-6216) Unify Container Resizing code paths with Container Updates making it scheduler agnostic
Arun Suresh created YARN-6216: - Summary: Unify Container Resizing code paths with Container Updates making it scheduler agnostic Key: YARN-6216 URL: https://issues.apache.org/jira/browse/YARN-6216 Project: Hadoop YARN Issue Type: Bug Components: capacity scheduler, fairscheduler, resourcemanager Affects Versions: 3.0.0-alpha2 Reporter: Arun Suresh Assignee: Arun Suresh Fix For: 3.0.0-alpha3 -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
[jira] [Created] (YARN-6181) SchedulerRequestKey compareTo method neglects to compare containerToUpdate
Arun Suresh created YARN-6181: - Summary: SchedulerRequestKey compareTo method neglects to compare containerToUpdate Key: YARN-6181 URL: https://issues.apache.org/jira/browse/YARN-6181 Project: Hadoop YARN Issue Type: Sub-task Reporter: Arun Suresh Assignee: Arun Suresh The SchedulerRequestKey's {{compareTo}} does not correctly compare the {{containerToUpdate}} fields. Thus multiple update requests against the same priority and allocationRequestId will be clobbered. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
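A sketch of the kind of fix implied, using a simplified stand-in for SchedulerRequestKey with hypothetical fields: {{containerToUpdate}} participates in {{compareTo}} as a final tie-breaker, so two update requests with the same priority and allocationRequestId but different containers no longer compare as equal:
{code:java}
import java.util.Comparator;

// Simplified stand-in for SchedulerRequestKey; fields are illustrative.
final class RequestKey implements Comparable<RequestKey> {
    private static final Comparator<String> NULLS_FIRST =
        Comparator.nullsFirst(Comparator.naturalOrder());
    // Stopping after priority and allocationRequestId was the bug: updates
    // for different containers compared as equal and clobbered each other.
    private static final Comparator<RequestKey> ORDER =
        Comparator.comparingInt((RequestKey k) -> k.priority)
            .thenComparingLong(k -> k.allocationRequestId)
            .thenComparing(k -> k.containerToUpdate, NULLS_FIRST);

    final int priority;
    final long allocationRequestId;
    final String containerToUpdate; // null for plain (non-update) requests

    RequestKey(int priority, long allocationRequestId, String containerToUpdate) {
        this.priority = priority;
        this.allocationRequestId = allocationRequestId;
        this.containerToUpdate = containerToUpdate;
    }

    @Override
    public int compareTo(RequestKey o) {
        return ORDER.compare(this, o);
    }
}
{code}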
[jira] [Created] (YARN-6180) Clean unused SchedulerRequestKeys once ExecutionType updates are completed
Arun Suresh created YARN-6180: - Summary: Clean unused SchedulerRequestKeys once ExecutionType updates are completed Key: YARN-6180 URL: https://issues.apache.org/jira/browse/YARN-6180 Project: Hadoop YARN Issue Type: Sub-task Reporter: Arun Suresh Assignee: Arun Suresh The SchedulerRequestKeys generated for ExecutionType updates tend to accumulate in the AppSchedulingInfo and over time lead to the situation outlined in YARN-5540. These keys must be removed once the container update completes. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
[jira] [Resolved] (YARN-5646) Add documentation and update config parameter names for scheduling of OPPORTUNISTIC containers
[ https://issues.apache.org/jira/browse/YARN-5646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh resolved YARN-5646. --- Resolution: Fixed Committed addendum patch to trunk and branch-2 > Add documentation and update config parameter names for scheduling of > OPPORTUNISTIC containers > -- > > Key: YARN-5646 > URL: https://issues.apache.org/jira/browse/YARN-5646 > Project: Hadoop YARN > Issue Type: Task >Reporter: Konstantinos Karanasos >Assignee: Konstantinos Karanasos >Priority: Blocker > Fix For: 2.9.0, 3.0.0-alpha2 > > Attachments: YARN-5646.001.patch, YARN-5646.002.patch, > YARN-5646.003.patch, YARN-5646.004.patch, YARN-5646.addendum.patch > > > This is for adding documentation regarding the scheduling of OPPORTUNISTIC > containers. > It includes both the centralized (YARN-5220) and the distributed (YARN-2877) > scheduling. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
[jira] [Created] (YARN-6066) Opportunistic containers: Minor fixes: API annotations and config parameter changes
Arun Suresh created YARN-6066: - Summary: Opportunistic containers: Minor fixes: API annotations and config parameter changes Key: YARN-6066 URL: https://issues.apache.org/jira/browse/YARN-6066 Project: Hadoop YARN Issue Type: Bug Reporter: Arun Suresh Priority: Minor Creating this to capture changes suggested by [~leftnoteasy] and [~kasha] in YARN-6041 in its own JIRA. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
[jira] [Created] (YARN-6041) Opportunistic containers : Combined patch for branch-2
Arun Suresh created YARN-6041: - Summary: Opportunistic containers : Combined patch for branch-2 Key: YARN-6041 URL: https://issues.apache.org/jira/browse/YARN-6041 Project: Hadoop YARN Issue Type: Bug Reporter: Arun Suresh Assignee: Arun Suresh Fix For: 2.9.0 This is a combined patch targeting branch-2 of the following JIRAs which have already been committed to trunk : YARN-5938. Refactoring OpportunisticContainerAllocator to use SchedulerRequestKey instead of Priority and other misc fixes YARN-5646. Add documentation and update config parameter names for scheduling of OPPORTUNISTIC containers. YARN-5982. Simplify opportunistic container parameters and metrics. YARN-5918. Handle Opportunistic scheduling allocate request failure when NM is lost. YARN-4597. Introduce ContainerScheduler and a SCHEDULED state to NodeManager container lifecycle. YARN-5823. Update NMTokens in case of requests with only opportunistic containers. YARN-5377. Fix TestQueuingContainerManager.testKillMultipleOpportunisticContainers. YARN-2995. Enhance UI to show cluster resource utilization of various container Execution types. YARN-5799. Fix Opportunistic Allocation to set the correct value of Node Http Address. YARN-5486. Update OpportunisticContainerAllocatorAMService::allocate method to handle OPPORTUNISTIC container requests. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
[jira] [Created] (YARN-5978) ContainerScheduler and Container state machine changes to support ExecType update
Arun Suresh created YARN-5978: - Summary: ContainerScheduler and Container state machine changes to support ExecType update Key: YARN-5978 URL: https://issues.apache.org/jira/browse/YARN-5978 Project: Hadoop YARN Issue Type: Sub-task Reporter: Arun Suresh -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
[jira] [Created] (YARN-5977) ContainerManagementProtocol changes to support change of container ExecutionType
Arun Suresh created YARN-5977: - Summary: ContainerManagementProtocol changes to support change of container ExecutionType Key: YARN-5977 URL: https://issues.apache.org/jira/browse/YARN-5977 Project: Hadoop YARN Issue Type: Sub-task Reporter: Arun Suresh -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
[jira] [Created] (YARN-5972) Add Support for Pausing/Freezing of containers
Arun Suresh created YARN-5972: - Summary: Add Support for Pausing/Freezing of containers Key: YARN-5972 URL: https://issues.apache.org/jira/browse/YARN-5972 Project: Hadoop YARN Issue Type: Bug Reporter: Arun Suresh -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
[jira] [Created] (YARN-5966) AMRMClient changes to support ExecutionType update
Arun Suresh created YARN-5966: - Summary: AMRMClient changes to support ExecutionType update Key: YARN-5966 URL: https://issues.apache.org/jira/browse/YARN-5966 Project: Hadoop YARN Issue Type: Sub-task Reporter: Arun Suresh -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
[jira] [Resolved] (YARN-5087) Expose API to allow AM to request for change of container ExecutionType
[ https://issues.apache.org/jira/browse/YARN-5087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh resolved YARN-5087. --- Resolution: Resolved Closing this, as YARN-5221 fixes it > Expose API to allow AM to request for change of container ExecutionType > --- > > Key: YARN-5087 > URL: https://issues.apache.org/jira/browse/YARN-5087 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Arun Suresh >Assignee: Arun Suresh > -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
[jira] [Created] (YARN-5959) Add support for ExecutionType change from OPPORTUNISTIC to GUARANTEED
Arun Suresh created YARN-5959: - Summary: Add support for ExecutionType change from OPPORTUNISTIC to GUARANTEED Key: YARN-5959 URL: https://issues.apache.org/jira/browse/YARN-5959 Project: Hadoop YARN Issue Type: Sub-task Reporter: Arun Suresh -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
[jira] [Created] (YARN-5938) Minor refactoring to OpportunisticContainerAllocatorAMService
Arun Suresh created YARN-5938: - Summary: Minor refactoring to OpportunisticContainerAllocatorAMService Key: YARN-5938 URL: https://issues.apache.org/jira/browse/YARN-5938 Project: Hadoop YARN Issue Type: Sub-task Reporter: Arun Suresh Assignee: Arun Suresh Minor code re-organization to do the following: # The OpportunisticContainerAllocatorAMService currently allocates outside the ApplicationAttempt lock maintained by the ApplicationMasterService. This should happen inside the lock. # Refactor out some code to simplify the allocate() method. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
[jira] [Created] (YARN-5861) Add support for recovery of queued opportunistic containers in the NM.
Arun Suresh created YARN-5861: - Summary: Add support for recovery of queued opportunistic containers in the NM. Key: YARN-5861 URL: https://issues.apache.org/jira/browse/YARN-5861 Project: Hadoop YARN Issue Type: Bug Reporter: Arun Suresh Assignee: Arun Suresh Currently, the NM stateStore marks containers as QUEUED, but they are ignored (deemed lost) if they had not started before the NM went down. These containers should ideally be re-queued when the NM restarts. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
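A rough sketch of the recovery path described, with hypothetical recovered-state records (the real NMStateStore and recovery flow differ): on restart, containers recovered in QUEUED state go back on the queue instead of being dropped:
{code:java}
import java.util.ArrayDeque;
import java.util.List;
import java.util.Queue;

// Illustrative only; hypothetical records, not the real NM recovery code.
record RecoveredContainer(String id, String state) {}

final class QueuedContainerRecovery {
    private final Queue<String> queued = new ArrayDeque<>();

    /** On NM restart: re-queue QUEUED containers instead of deeming them lost. */
    void recover(List<RecoveredContainer> fromStateStore) {
        for (RecoveredContainer rc : fromStateStore) {
            switch (rc.state()) {
                case "QUEUED" -> queued.add(rc.id());  // the proposed behavior
                case "RUNNING" -> reacquire(rc.id());  // re-attach to the live process
                default -> { }                         // COMPLETED etc.: nothing to do
            }
        }
    }

    int queuedCount() { return queued.size(); }

    private void reacquire(String id) { /* elided */ }
}
{code}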
[jira] [Created] (YARN-5860) Add support for increase and decrease of container resources to NM Container Queuing
Arun Suresh created YARN-5860: - Summary: Add support for increase and decrease of container resources to NM Container Queuing Key: YARN-5860 URL: https://issues.apache.org/jira/browse/YARN-5860 Project: Hadoop YARN Issue Type: Bug Reporter: Arun Suresh Assignee: Arun Suresh Currently, the queuing framework in the NM that handles opportunistic containers (introduced in YARN-2877) preempts opportunistic containers only when resources are needed to start guaranteed containers. It does not handle situations where a guaranteed container's resources have been increased. Conversely, if a guaranteed (or opportunistic) container's resources have been decreased, the NM must start queued opportunistic containers waiting on the newly available resources. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
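The missing handling can be sketched as follows, assuming a single memory dimension and hypothetical names (the actual ContainerScheduler is richer): every resize recomputes headroom, and a decrease immediately tries to start queued opportunistic containers:
{code:java}
import java.util.ArrayDeque;
import java.util.Queue;

// Illustrative only; not the actual ContainerScheduler.
final class OpportunisticQueue {
    private final long nodeCapacityMb;
    private long usedMb;
    private final Queue<Long> queuedOpportunisticMb = new ArrayDeque<>();

    OpportunisticQueue(long nodeCapacityMb) { this.nodeCapacityMb = nodeCapacityMb; }

    void queueOpportunistic(long mb) {
        queuedOpportunisticMb.add(mb);
        startQueuedIfPossible();
    }

    /** Called when a running container's resources change; delta may be negative. */
    void onContainerResized(long deltaMb) {
        usedMb += deltaMb;
        if (deltaMb < 0) {
            startQueuedIfPossible(); // a decrease frees room for queued containers
        } else {
            // An increase may require preempting opportunistic containers (elided).
        }
    }

    private void startQueuedIfPossible() {
        while (!queuedOpportunisticMb.isEmpty()
                && usedMb + queuedOpportunisticMb.peek() <= nodeCapacityMb) {
            usedMb += queuedOpportunisticMb.poll(); // "start" the container
        }
    }
}
{code}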
[jira] [Created] (YARN-5799) Fix OpportunisticAllocation to set the correct value of Node Http Address
Arun Suresh created YARN-5799: - Summary: Fix OpportunisticAllocation to set the correct value of Node Http Address Key: YARN-5799 URL: https://issues.apache.org/jira/browse/YARN-5799 Project: Hadoop YARN Issue Type: Sub-task Reporter: Arun Suresh Assignee: Konstantinos Karanasos This proposes to fix the OpportunisticAllocator, used to allocate OPPORTUNISTIC containers (both centrally and in a distributed manner), to correctly populate the Node Http Address in the returned Container. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
[jira] [Created] (YARN-5651) Changes to NMStateStore to persist reinitialization and rollback state
Arun Suresh created YARN-5651: - Summary: Changes to NMStateStore to persist reinitialization and rollback state Key: YARN-5651 URL: https://issues.apache.org/jira/browse/YARN-5651 Project: Hadoop YARN Issue Type: Sub-task Reporter: Arun Suresh Assignee: Arun Suresh -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
[jira] [Created] (YARN-5637) Changes in NodeManager to support Container upgrade and rollback/commit
Arun Suresh created YARN-5637: - Summary: Changes in NodeManager to support Container upgrade and rollback/commit Key: YARN-5637 URL: https://issues.apache.org/jira/browse/YARN-5637 Project: Hadoop YARN Issue Type: Sub-task Reporter: Arun Suresh Assignee: Arun Suresh YARN-5620 added support for re-initialization of Containers using a new launch Context. This JIRA proposes to use the above feature to support upgrade and subsequent rollback or commit of the upgrade. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
[jira] [Resolved] (YARN-5633) Update Container Version in NMStateStore only if Resources have changed
[ https://issues.apache.org/jira/browse/YARN-5633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh resolved YARN-5633. --- Resolution: Duplicate > Update Container Version in NMStateStore only if Resources have changed > --- > > Key: YARN-5633 > URL: https://issues.apache.org/jira/browse/YARN-5633 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Arun Suresh >Assignee: Arun Suresh > Attachments: YARN-5633-branch-2.8-v1.patch > > > YARN-5221 introduced a containerVersion that is stored in the NMStateStore. > The version is stored when > # the container is first started in the NM > # when the AM requests the NM to increase the Container resources. > # when the RM notifies the NM to decrease the Container resources. > Unfortunately, this would result in the NM not being able to roll back after an > upgrade (e.g. from 2.8 to 2.7), as noticed by [~jlowe]. > This JIRA proposes to update the version in the NM state store only when 2 > or 3 above occurs; this way, rollback will be hampered only if the user > has used the new feature (resource increase / decrease) -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
[jira] [Created] (YARN-5633) Update Container Version in NMStateStore only if Resources have changed
Arun Suresh created YARN-5633: - Summary: Update Container Version in NMStateStore only if Resources have changed Key: YARN-5633 URL: https://issues.apache.org/jira/browse/YARN-5633 Project: Hadoop YARN Issue Type: Bug Reporter: Arun Suresh Assignee: Arun Suresh YARN-5221 introduced a containerVersion that is stored in the NMStateStore. The version is stored when # the container is first started in the NM # when the AM requests the NM to increase the Container resources. # when the RM notifies the NM to decrease the Container resources. Unfortunately, this would result in the NM not being able to roll back after an upgrade (e.g. from 2.8 to 2.7), as noticed by [~jlowe]. This JIRA proposes to update the version in the NM state store only when 2 or 3 above occurs; this way, rollback will be hampered only if the user has used the new feature (resource increase / decrease). -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
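The proposal can be sketched as follows, with a hypothetical store interface standing in for the NM state store: the container version is bumped and persisted only on a resource change (cases 2 and 3 above), so the initial-start record keeps the old, rollback-safe layout:
{code:java}
// Illustrative only; not the real NMStateStoreService API.
final class ContainerRecord {
    interface Store { void persist(ContainerRecord r, boolean writeVersion); }

    private int version; // only persisted when resources change
    private long resourceMb;

    /** Case 1: first start in the NM; keep the old, rollback-safe layout. */
    void onStart(long mb, Store store) {
        this.resourceMb = mb;
        store.persist(this, /* writeVersion= */ false);
    }

    /** Cases 2 and 3: resource increase or decrease; now the version is written. */
    void onResourceChange(long newMb, Store store) {
        if (newMb != resourceMb) {
            resourceMb = newMb;
            version++;
            store.persist(this, /* writeVersion= */ true);
        }
    }
}
{code}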
[jira] [Created] (YARN-5620) Core changes in NodeManager to support upgrade and rollback of Containers
Arun Suresh created YARN-5620: - Summary: Core changes in NodeManager to support upgrade and rollback of Containers Key: YARN-5620 URL: https://issues.apache.org/jira/browse/YARN-5620 Project: Hadoop YARN Issue Type: Sub-task Reporter: Arun Suresh Assignee: Arun Suresh This JIRA proposes to modify the ContainerManager (and other core classes) to support upgrading a running container with a new {{ContainerLaunchContext}}, as well as the ability to roll back the upgrade if the container is not able to restart using the new launch context. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
[jira] [Created] (YARN-5609) Expose upgrade and restart API in ContainerManagementProtocol
Arun Suresh created YARN-5609: - Summary: Expose upgrade and restart API in ContainerManagementProtocol Key: YARN-5609 URL: https://issues.apache.org/jira/browse/YARN-5609 Project: Hadoop YARN Issue Type: Sub-task Reporter: Arun Suresh YARN-4876 allows an AM to explicitly *initialize*, *start*, *stop* and *destroy* a {{Container}}. This JIRA proposes to extend the ContainerManagementProtocol with the following API: # *upgrade* : which is a composition of *stop* + *(re)initialize* + *start* # *restart* : which is *stop* + *start* -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
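A minimal sketch of the proposed compositions, using a hypothetical lifecycle interface rather than the actual ContainerManagementProtocol: both new calls are expressed directly in terms of the YARN-4876 primitives:
{code:java}
// Illustrative only; a simplified stand-in for the container lifecycle primitives.
interface ContainerLifecycle {
    void initialize(String launchContext);
    void start();
    void stop();

    /** upgrade = stop + (re)initialize with a new context + start. */
    default void upgrade(String newLaunchContext) {
        stop();
        initialize(newLaunchContext);
        start();
    }

    /** restart = stop + start, reusing the existing context. */
    default void restart() {
        stop();
        start();
    }
}
{code}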
[jira] [Created] (YARN-5593) [Umbrella] Add support for YARN Allocation composed of multiple containers/processes
Arun Suresh created YARN-5593: - Summary: [Umbrella] Add support for YARN Allocation composed of multiple containers/processes Key: YARN-5593 URL: https://issues.apache.org/jira/browse/YARN-5593 Project: Hadoop YARN Issue Type: Improvement Reporter: Arun Suresh Assignee: Arun Suresh Opening this to explicitly call out and track some of the ideas that were discussed in YARN-1040. Specifically, the concept of an {{Allocation}} against which an AM can start multiple {{Containers}}, as long as the sum of resources used by all containers {{fitsIn()}} the Resources leased to the {{Allocation}}. This is especially useful for AMs that might want to target certain operations (like upgrade / restart) on specific containers / processes within an Allocation without fear of losing the allocation. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
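The invariant can be sketched as follows, reducing Resources to a single memory dimension and using hypothetical names: a container may start against the Allocation only if the running sum still fits in the lease, and stopping one container frees room without surrendering the lease:
{code:java}
import java.util.ArrayList;
import java.util.List;

// Illustrative only; a simplified one-dimensional "resource".
final class Allocation {
    private final long leasedMb;
    private final List<Long> containerMb = new ArrayList<>();

    Allocation(long leasedMb) { this.leasedMb = leasedMb; }

    /** The fitsIn() check: admit a container only if the sum stays in the lease. */
    boolean tryStartContainer(long mb) {
        long used = containerMb.stream().mapToLong(Long::longValue).sum();
        if (used + mb > leasedMb) {
            return false; // would exceed the Allocation's lease
        }
        containerMb.add(mb);
        return true;
    }

    /** Stopping (e.g. to upgrade or restart) one container does not lose the lease. */
    void stopContainer(long mb) { containerMb.remove(Long.valueOf(mb)); }
}
{code}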
[jira] [Created] (YARN-5580) Refactor Scheduler to take only a single list of container updates
Arun Suresh created YARN-5580: - Summary: Refactor Scheduler to take only a single list of container updates Key: YARN-5580 URL: https://issues.apache.org/jira/browse/YARN-5580 Project: Hadoop YARN Issue Type: Bug Reporter: Arun Suresh This is a follow up to YARN-5221. It proposes to fix the following: * The {{AbstractYarnScheduler::allocate}} method should take just a single list of container updates rather than a list for each type of update. * The Container version check is enforced only across updates within an allocate call. This must be extended to all outstanding updates for that container. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
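A rough sketch of the unified signature, with hypothetical types (the real update requests would carry more fields, such as the target Resource): a single list parameter, with the kind of update carried on each entry:
{code:java}
import java.util.List;

// Illustrative only; not the actual AbstractYarnScheduler::allocate signature.
enum UpdateType {
    INCREASE_RESOURCE, DECREASE_RESOURCE,
    PROMOTE_EXECUTION_TYPE, DEMOTE_EXECUTION_TYPE
}

record ContainerUpdate(String containerId, int containerVersion, UpdateType type) {}

final class SchedulerFacade {
    /** One list covering every kind of update, instead of one parameter per kind. */
    void allocate(List<ContainerUpdate> updates) {
        for (ContainerUpdate u : updates) {
            switch (u.type()) {
                case INCREASE_RESOURCE, DECREASE_RESOURCE -> resize(u);
                case PROMOTE_EXECUTION_TYPE, DEMOTE_EXECUTION_TYPE -> retype(u);
            }
        }
    }

    private void resize(ContainerUpdate u) { /* version check + resize, elided */ }
    private void retype(ContainerUpdate u) { /* version check + exec-type change, elided */ }
}
{code}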