[jira] [Created] (YARN-8850) Make certain aspects of the NM pluggable to support a DynoYARN cluster

2018-10-04 Thread Arun Suresh (JIRA)
Arun Suresh created YARN-8850:
-

 Summary: Make certain aspects of the NM pluggable to support a 
DynoYARN cluster
 Key: YARN-8850
 URL: https://issues.apache.org/jira/browse/YARN-8850
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Arun Suresh






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org



[jira] [Created] (YARN-8849) DynoYARN: A simulation and testing infrastructure for YARN clusters

2018-10-04 Thread Arun Suresh (JIRA)
Arun Suresh created YARN-8849:
-

 Summary: DynoYARN: A simulation and testing infrastructure for 
YARN clusters
 Key: YARN-8849
 URL: https://issues.apache.org/jira/browse/YARN-8849
 Project: Hadoop YARN
  Issue Type: New Feature
Reporter: Arun Suresh


Traditionally, YARN workload simulation is performed using the SLS (Scheduler Load 
Simulator), which is packaged with YARN. It essentially starts a full-fledged 
*ResourceManager* but runs simulators for the *NodeManager* and the 
*ApplicationMaster* containers. These simulators are lightweight and run in a 
thread pool. The NM simulators do not open any external ports and send 
(in-process) heartbeats to the ResourceManager.

There are a couple of drawbacks to using the SLS:
* It can be difficult to simulate really large clusters without access to a 
very beefy box, since the NMs are launched as tasks in a thread pool and each 
NM has to send periodic heartbeats to the RM.
* Certain features (like YARN-1011) require changes to the NodeManager. 
Aspects such as queuing and selectively killing containers have to be 
incorporated into the existing NM simulator, which might make the simulator 
heavyweight, since locking and synchronization become necessary.
* Since the NM and AM are simulations, only the Scheduler is faithfully tested; 
it is not really an end-to-end test of a cluster.

Therefore, drawing inspiration from 
[Dynamometer|https://github.com/linkedin/dynamometer], we propose a 
YARN-deployable YARN cluster for testing - *DynoYARN* - with the following 
features:
* The NM already has hooks to plug in a custom *ContainerExecutor* and 
*NodeResourceMonitor*. If we can also plug in a custom *ContainersMonitorImpl* 
monitoring thread (and other modules like the LocalizationService), we can 
inject an executor that does not actually launch containers, plus node and 
container resource monitors that report synthetic, pre-specified utilization 
metrics back to the RM.
* Since we are launching fake containers, we cannot run normal AM containers. 
We can therefore use *Unmanaged AM*s to launch synthetic jobs.

Essentially, a test workflow would look like this:
* Launch a DynoYARN cluster.
* Use the Unmanaged AM feature to directly negotiate with the DynoYARN Resource 
Manager for container tokens.
* Use the container tokens from the RM to directly ask the DynoYARN Node 
Managers to start fake containers.
* The DynoYARN Node Managers start the fake containers and report synthetically 
generated resource utilization for the containers (injected via the 
*ContainerLaunchContext* and parsed by the plugged-in Container Executor) to 
the DynoYARN Resource Manager.
* The Scheduler uses the utilization reports to schedule containers, which lets 
us test allocation of {{Opportunistic}} containers based on resource 
utilization.
* Since the DynoYARN Node Managers run the actual code paths, all preemption 
and queuing logic is faithfully executed.
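The pluggable-NM idea above can be sketched as follows. This is a toy model, not the actual DynoYARN code; the class and field names (FakeContainerExecutor, synthetic_utilization, etc.) are hypothetical:

```python
class FakeContainerExecutor:
    """Accepts launch requests but starts no real process."""
    def __init__(self):
        self.containers = {}

    def launch(self, container_id, launch_context):
        # Parse the synthetic utilization injected via the launch context
        # instead of exec-ing the container's command.
        self.containers[container_id] = launch_context.get(
            "synthetic_utilization", {"vmem": 0, "pmem": 0, "vcores": 0.0})
        return True


class FakeContainersMonitor:
    """Reports the pre-specified utilization instead of sampling /proc."""
    def __init__(self, executor):
        self.executor = executor

    def node_utilization(self):
        total = {"vmem": 0, "pmem": 0, "vcores": 0.0}
        for util in self.executor.containers.values():
            for key in total:
                total[key] += util[key]
        return total


executor = FakeContainerExecutor()
monitor = FakeContainersMonitor(executor)
executor.launch("container_01", {"synthetic_utilization":
                                 {"vmem": 512, "pmem": 256, "vcores": 0.5}})
executor.launch("container_02", {"synthetic_utilization":
                                 {"vmem": 1024, "pmem": 512, "vcores": 1.0}})
# The "heartbeat" to the RM would carry this synthetic node utilization:
print(monitor.node_utilization())  # {'vmem': 1536, 'pmem': 768, 'vcores': 1.5}
```

In the real NM the executor and monitor would be wired in through the existing ContainerExecutor/NodeResourceMonitor plug-in points; this sketch only shows the data flow.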







[jira] [Created] (YARN-8848) Improvements to YARN over-allocation (YARN-1011)

2018-10-04 Thread Arun Suresh (JIRA)
Arun Suresh created YARN-8848:
-

 Summary: Improvements to YARN over-allocation (YARN-1011)
 Key: YARN-8848
 URL: https://issues.apache.org/jira/browse/YARN-8848
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Arun Suresh


Consolidating work to be done in the next phase of YARN over-allocation 
(YARN-1011).






[jira] [Created] (YARN-8846) Allow Applications to demand Guaranteed Containers

2018-10-04 Thread Arun Suresh (JIRA)
Arun Suresh created YARN-8846:
-

 Summary: Allow Applications to demand Guaranteed Containers
 Key: YARN-8846
 URL: https://issues.apache.org/jira/browse/YARN-8846
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: capacity scheduler
Reporter: Arun Suresh


The Capacity Scheduler should ensure that if the {{enforceExecutionType}} flag 
in the resource request is {{true}} and the requested container is of 
{{GUARANTEED}} type, it does not return over-allocated containers.
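The requested behavior can be expressed as a small predicate. This is an illustrative sketch with hypothetical names, not the CapacityScheduler API:

```python
def may_over_allocate(request):
    """An over-allocated (OPPORTUNISTIC) container may only be returned
    when the request does not strictly demand GUARANTEED execution."""
    enforce = request.get("enforceExecutionType", False)
    wants_guaranteed = request.get("executionType") == "GUARANTEED"
    return not (enforce and wants_guaranteed)


# Strictly GUARANTEED: the scheduler must not hand back over-allocated space.
assert may_over_allocate({"executionType": "GUARANTEED",
                          "enforceExecutionType": True}) is False
# Non-strict or OPPORTUNISTIC requests can still be satisfied by over-allocation.
assert may_over_allocate({"executionType": "GUARANTEED",
                          "enforceExecutionType": False}) is True
assert may_over_allocate({"executionType": "OPPORTUNISTIC",
                          "enforceExecutionType": True}) is True
```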






[jira] [Created] (YARN-8827) Plumb per app, per user and per queue resource utilization from the NM to RM

2018-09-26 Thread Arun Suresh (JIRA)
Arun Suresh created YARN-8827:
-

 Summary: Plumb per app, per user and per queue resource 
utilization from the NM to RM
 Key: YARN-8827
 URL: https://issues.apache.org/jira/browse/YARN-8827
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Arun Suresh
Assignee: Arun Suresh


Opportunistic containers for over-allocation need to be allocated to pending 
applications in some fair manner. Rather than evaluating queue and user 
resource usage (allocated resources) and comparing against queue and user 
limits to decide the allocation, it might make more sense to use a snapshot 
of the actual resource utilization of the queue and user.

To facilitate this, this JIRA proposes to aggregate per-user, per-app (and 
maybe per-queue) resource utilization, in addition to the aggregated container 
and node utilization, and send it along with the NM heartbeat. This should be 
fairly inexpensive, since it can be performed in the same loop as the 
{{ContainersMonitorImpl}}'s monitoring thread.

The RM can take a snapshot aggregate every couple of seconds. This 
instantaneous resource utilization should then be used to decide whether 
Opportunistic containers can be allocated to an app, queue or user.
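The single-pass aggregation proposed above could look roughly like this. The data shapes are hypothetical, not the ContainersMonitorImpl API:

```python
from collections import defaultdict


def aggregate_utilization(containers):
    """One pass over the monitored containers yields node, per-app and
    per-user utilization for the same heartbeat.
    containers: list of dicts with 'app', 'user' and 'util' (e.g. pmem MB)."""
    node_util = 0
    per_app = defaultdict(int)
    per_user = defaultdict(int)
    for c in containers:  # same loop the monitoring thread already runs
        node_util += c["util"]
        per_app[c["app"]] += c["util"]
        per_user[c["user"]] += c["util"]
    return node_util, dict(per_app), dict(per_user)


containers = [
    {"app": "app_1", "user": "alice", "util": 300},
    {"app": "app_1", "user": "alice", "util": 200},
    {"app": "app_2", "user": "bob", "util": 500},
]
node, apps, users = aggregate_utilization(containers)
# node == 1000; apps == {'app_1': 500, 'app_2': 500};
# users == {'alice': 500, 'bob': 500}
```

All three aggregates come out of the loop the monitor already runs, which is why the cost is expected to be marginal.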








[jira] [Closed] (YARN-7792) Merge work for YARN-6592

2018-01-31 Thread Arun Suresh (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun Suresh closed YARN-7792.
-
Assignee: Sunil G

> Merge work for YARN-6592
> 
>
> Key: YARN-7792
> URL: https://issues.apache.org/jira/browse/YARN-7792
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Sunil G
>Assignee: Sunil G
>Priority: Blocker
> Fix For: 3.1.0
>
> Attachments: YARN-6592.001.patch, YARN-7792.002.patch, 
> YARN-7792.003.patch, YARN-7792.004.patch
>
>
> This JIRA is to run the aggregated YARN-6592 branch patch against trunk and 
> check for any Jenkins issues.






[jira] [Resolved] (YARN-6592) Rich placement constraints in YARN

2018-01-31 Thread Arun Suresh (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun Suresh resolved YARN-6592.
---
  Resolution: Fixed
Target Version/s: 3.1.0

All tasks have been completed. Merged the branch with trunk.
Thanks [~kkaranasos], [~leftnoteasy], [~pgaref], [~cheersyang] and [~sunilg] 
for all the effort here.
Thanks also to [~chris.douglas], [~subru], [~curino] and [~vinodkv] for the 
discussions.

> Rich placement constraints in YARN
> --
>
> Key: YARN-6592
> URL: https://issues.apache.org/jira/browse/YARN-6592
> Project: Hadoop YARN
>  Issue Type: New Feature
>Reporter: Konstantinos Karanasos
>Assignee: Arun Suresh
>Priority: Major
> Fix For: 3.1.0
>
> Attachments: YARN-6592-Rich-Placement-Constraints-Design-V1.pdf
>
>
> This JIRA consolidates the efforts of YARN-5468 and YARN-4902.
> It adds support for rich placement constraints to YARN, such as affinity and 
> anti-affinity between allocations within the same or across applications.






[jira] [Created] (YARN-7858) Support special Node Attribute scopes in addition to NODE and RACK

2018-01-30 Thread Arun Suresh (JIRA)
Arun Suresh created YARN-7858:
-

 Summary: Support special Node Attribute scopes in addition to NODE 
and RACK
 Key: YARN-7858
 URL: https://issues.apache.org/jira/browse/YARN-7858
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Arun Suresh


Currently, only two scopes are defined - NODE and RACK - against which we 
check the cardinality of the placement.

This should be extended to support node-attribute scopes, for example to place 
containers across *upgrade domains* and *failure domains*. 







[jira] [Created] (YARN-7839) Check node capacity before placing in the Algorithm

2018-01-28 Thread Arun Suresh (JIRA)
Arun Suresh created YARN-7839:
-

 Summary: Check node capacity before placing in the Algorithm
 Key: YARN-7839
 URL: https://issues.apache.org/jira/browse/YARN-7839
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Arun Suresh


Currently, the Algorithm assigns a node to a request purely based on whether 
the constraints are met. It is only later, in the scheduling phase, that the 
Queue capacity and Node capacity are checked. If the request cannot be placed 
because of unavailable Queue/Node capacity, the request is retried by the 
Algorithm.

For clusters running at high utilization, we can reduce these retries by 
performing the Node capacity check in the Algorithm as well. The Queue 
capacity check can still be handled by the scheduler (since queues are tied to 
the scheduler).
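The effect of moving the capacity check into the placement loop can be illustrated with a toy placement function (all names hypothetical):

```python
def place(request, nodes, check_capacity=True):
    """Return the first node satisfying the constraint (and, optionally,
    having enough free capacity), or None to signal a retry."""
    for node in nodes:
        if not request["constraint"](node):
            continue
        if check_capacity and node["free"] < request["resource"]:
            continue  # this placement would be rejected by the scheduler anyway
        return node["name"]
    return None


nodes = [{"name": "n1", "free": 1, "tags": set()},
         {"name": "n2", "free": 8, "tags": set()}]
req = {"resource": 4, "constraint": lambda n: "foo" not in n["tags"]}

# Without the capacity check, n1 is chosen and the scheduler must reject
# and retry; with it, the algorithm goes straight to n2.
assert place(req, nodes, check_capacity=False) == "n1"
assert place(req, nodes, check_capacity=True) == "n2"
```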






[jira] [Created] (YARN-7822) Fix constraint satisfaction checker to handle composite OR and AND constraints

2018-01-25 Thread Arun Suresh (JIRA)
Arun Suresh created YARN-7822:
-

 Summary: Fix constraint satisfaction checker to handle composite 
OR and AND constraints
 Key: YARN-7822
 URL: https://issues.apache.org/jira/browse/YARN-7822
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Arun Suresh


JIRA to track changes to {{PlacementConstraintsUtil#canSatisfyConstraints}} to 
handle OR and AND composite constraints.






[jira] [Created] (YARN-7821) Fix constraint satisfaction checker to handle inter-app constraints

2018-01-25 Thread Arun Suresh (JIRA)
Arun Suresh created YARN-7821:
-

 Summary: Fix constraint satisfaction checker to handle inter-app 
constraints
 Key: YARN-7821
 URL: https://issues.apache.org/jira/browse/YARN-7821
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Arun Suresh


JIRA to track changes to {{PlacementConstraintsUtil#canSatisfyConstraints}} to 
handle inter-app constraints.






[jira] [Created] (YARN-7819) Allow PlacementProcessor to be used with the FairScheduler

2018-01-25 Thread Arun Suresh (JIRA)
Arun Suresh created YARN-7819:
-

 Summary: Allow PlacementProcessor to be used with the FairScheduler
 Key: YARN-7819
 URL: https://issues.apache.org/jira/browse/YARN-7819
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Arun Suresh
Assignee: Arun Suresh


The FairScheduler needs to implement the 
{{ResourceScheduler#attemptAllocationOnNode}} function for the 
PlacementProcessor to support it.







[jira] [Created] (YARN-7812) Improvements to Rich Placement Constraints in YARN

2018-01-24 Thread Arun Suresh (JIRA)
Arun Suresh created YARN-7812:
-

 Summary: Improvements to Rich Placement Constraints in YARN
 Key: YARN-7812
 URL: https://issues.apache.org/jira/browse/YARN-7812
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Arun Suresh









[jira] [Closed] (YARN-6942) Add examples for placement constraints usage in applications

2018-01-22 Thread Arun Suresh (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun Suresh closed YARN-6942.
-

> Add examples for placement constraints usage in applications
> 
>
> Key: YARN-6942
> URL: https://issues.apache.org/jira/browse/YARN-6942
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Konstantinos Karanasos
>Assignee: Panagiotis Garefalakis
>Priority: Major
>
> This JIRA will include examples of how the new {{PlacementConstraints}} API 
> can be used by various applications.






[jira] [Resolved] (YARN-6942) Add examples for placement constraints usage in applications

2018-01-22 Thread Arun Suresh (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun Suresh resolved YARN-6942.
---
Resolution: Resolved

> Add examples for placement constraints usage in applications
> 
>
> Key: YARN-6942
> URL: https://issues.apache.org/jira/browse/YARN-6942
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Konstantinos Karanasos
>Assignee: Panagiotis Garefalakis
>Priority: Major
>
> This JIRA will include examples of how the new {{PlacementConstraints}} API 
> can be used by various applications.






[jira] [Created] (YARN-7783) Add validation step to ensure constraints are not violated due to order in which a request is processed

2018-01-21 Thread Arun Suresh (JIRA)
Arun Suresh created YARN-7783:
-

 Summary: Add validation step to ensure constraints are not 
violated due to order in which a request is processed
 Key: YARN-7783
 URL: https://issues.apache.org/jira/browse/YARN-7783
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Arun Suresh
Assignee: Arun Suresh


When the algorithm has placed a container on a node, allocation tags are added 
to the node if the constraint is satisfied. But depending on the order in which 
the algorithm sees the requests, it is possible that a constraint that happened 
to be valid when an earlier-seen request was placed is no longer valid after 
all subsequent requests have been placed.

For example:
Assume nodes n1, n2, n3, n4 and n5.
Consider the two constraints:
# *foo* -> anti-affinity with *foo*
# *bar* -> anti-affinity with *foo*


And 2 requests
# req1: NumAllocations = 4, allocTags = [foo]
# req2: NumAllocations = 1, allocTags = [bar]

If *req1* is seen first, the algorithm can place the 4 containers on n1, n2, n3 
and n4. When it gets to *req2*, it sees that 4 nodes have the *foo* tag and 
places it on n5. But if *req2* is seen first, the *bar* tag can be placed on 
any node, since at that point no node has *foo*; then, when the algorithm gets 
to *req1*, since *foo* has no anti-affinity with *bar*, it can end up placing 
*foo* on the node with *bar*, violating the second constraint.

To prevent the above, we need a validation step: after the placements for a 
batch of requests have been made, for each request we remove its tags from the 
node and check whether the constraints would still be satisfied if the tags 
were added back to the node.

When applied to the example above, after the algorithm has run through *req2* 
and then *req1*, we remove the *bar* tag from the node and try to add it back. 
This time, constraint satisfaction fails, since there is now a *foo* tag on the 
node and *bar* cannot be added. The algorithm will then retry placing *req2* on 
another node.
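The re-validation step above can be simulated with a toy model (simplified and hypothetical; anti-affinity is modeled as a map from a tag to the set of tags it must avoid):

```python
# Constraints from the example: foo anti-affine with foo, bar anti-affine with foo.
ANTI_AFFINITY = {"foo": {"foo"}, "bar": {"foo"}}


def satisfied(tag, node_tags):
    """True if placing `tag` on a node carrying `node_tags` violates nothing."""
    return not (ANTI_AFFINITY.get(tag, set()) & node_tags)


def validate(placements):
    """For each (node_tags, tag) placement, remove the tag and re-check it
    against the node's final tag set; return the tags that no longer fit."""
    rejected = []
    for node_tags, tag in placements:
        remaining = set(node_tags) - {tag}
        if not satisfied(tag, remaining):
            rejected.append(tag)
    return rejected


# req2 (bar) is placed first on n1, then req1 (foo) also lands on n1:
n1_final = {"bar", "foo"}
placements = [(n1_final, "bar"), (n1_final, "foo")]
# Re-validation catches that bar can no longer sit next to foo,
# so the algorithm would retry req2 on another node.
assert validate(placements) == ["bar"]
```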







[jira] [Created] (YARN-7780) Documentation for Placement Constraints

2018-01-19 Thread Arun Suresh (JIRA)
Arun Suresh created YARN-7780:
-

 Summary: Documentation for Placement Constraints
 Key: YARN-7780
 URL: https://issues.apache.org/jira/browse/YARN-7780
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Arun Suresh
Assignee: Konstantinos Karanasos


JIRA to track documentation for the feature.






[jira] [Created] (YARN-7752) Handle AllocationTags for Opportunistic containers.

2018-01-15 Thread Arun Suresh (JIRA)
Arun Suresh created YARN-7752:
-

 Summary: Handle AllocationTags for Opportunistic containers.
 Key: YARN-7752
 URL: https://issues.apache.org/jira/browse/YARN-7752
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Arun Suresh


JIRA to track how opportunistic containers are handled with respect to the 
AllocationTagsManager - creation and removal of tags.






[jira] [Created] (YARN-7746) Minor bug fixes to PlacementConstraintUtils and PlacementProcessor to support app priority

2018-01-12 Thread Arun Suresh (JIRA)
Arun Suresh created YARN-7746:
-

 Summary: Minor bug fixes to PlacementConstraintUtils and 
PlacementProcessor to support app priority
 Key: YARN-7746
 URL: https://issues.apache.org/jira/browse/YARN-7746
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Arun Suresh
Assignee: Arun Suresh


JIRA opened to track 2 minor fixes.

The PlacementConstraintsUtil does a scope check using object equality instead 
of string equality, which lets some tests pass even though it fails in an 
actual deployment.

The thread pools used in the Processor should be modified to take a priority 
blocking queue that respects application priority.






[jira] [Created] (YARN-7745) Allow DistributedShell to take a placement specification for containers it wants to launch

2018-01-12 Thread Arun Suresh (JIRA)
Arun Suresh created YARN-7745:
-

 Summary: Allow DistributedShell to take a placement specification 
for containers it wants to launch
 Key: YARN-7745
 URL: https://issues.apache.org/jira/browse/YARN-7745
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Arun Suresh
Assignee: Arun Suresh


This is to add a '-placement_spec' option to the distributed shell client, 
where the user can specify a stringified specification for how containers 
should be placed.

For example:
{noformat}
$ yarn org.apache.hadoop.yarn.applications.distributedshell.Client –jar \
$YARN_DS/hadoop-yarn-applications-distributedshell-$YARN_VERSION.jar \
 -shell_command sleep -shell_args 10 -placement_spec 
{noformat}








[jira] [Created] (YARN-7715) Update CPU and Memory cgroups params on container update as well.

2018-01-08 Thread Arun Suresh (JIRA)
Arun Suresh created YARN-7715:
-

 Summary: Update CPU and Memory cgroups params on container update 
as well.
 Key: YARN-7715
 URL: https://issues.apache.org/jira/browse/YARN-7715
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Arun Suresh


In YARN-6673 and YARN-6674, the cgroups resource handlers update the cgroups 
params for containers, based on whether they are opportunistic or guaranteed, 
in the *preStart* method.

Now that YARN-5085 is in, a container's executionType (as well as its cpu, 
memory and any other resources) can be updated after the container has started. 
This means we need the ability to change the cgroups params after container 
start.
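The change amounts to re-applying the resource-to-cgroup translation on update, not only at launch. A minimal sketch, with hypothetical values (one full CPU core per vcore; the real handlers compute quotas from the node's configured vcores):

```python
CFS_PERIOD_US = 100_000  # conventional cgroup-v1 CPU period


def cgroup_params(vcores, memory_mb):
    """Translate a container's resources into cgroup knob values."""
    return {
        "cpu.cfs_period_us": CFS_PERIOD_US,
        "cpu.cfs_quota_us": CFS_PERIOD_US * vcores,  # one core per vcore
        "memory.limit_in_bytes": memory_mb * 1024 * 1024,
    }


# Params written at preStart...
before = cgroup_params(vcores=1, memory_mb=1024)
# ...and the same translation re-applied after a container update
# (e.g. promotion to GUARANTEED with more resources):
after = cgroup_params(vcores=2, memory_mb=2048)
assert after["cpu.cfs_quota_us"] == 2 * before["cpu.cfs_quota_us"]
assert after["memory.limit_in_bytes"] == 2 * before["memory.limit_in_bytes"]
```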






[jira] [Created] (YARN-7696) Add container tags to ContainerTokenIdentifier, api.Container and NMContainerStatus to handle all recovery cases

2018-01-03 Thread Arun Suresh (JIRA)
Arun Suresh created YARN-7696:
-

 Summary: Add container tags to ContainerTokenIdentifier, 
api.Container and NMContainerStatus to handle all recovery cases
 Key: YARN-7696
 URL: https://issues.apache.org/jira/browse/YARN-7696
 Project: Hadoop YARN
  Issue Type: Sub-task
 Environment: The NM needs to persist the container tags so that on RM 
recovery they are sent back to the RM via the NMContainerStatus. The RM would 
then recover the AllocationTagsManager using this information.
The api.Container also requires the allocationTags since, after AM recovery, 
we need to provide the AM with the previously allocated containers.
Reporter: Arun Suresh









[jira] [Created] (YARN-7670) Modifications to the ResourceScheduler to support SchedulingRequests

2017-12-18 Thread Arun Suresh (JIRA)
Arun Suresh created YARN-7670:
-

 Summary: Modifications to the ResourceScheduler to support 
SchedulingRequests
 Key: YARN-7670
 URL: https://issues.apache.org/jira/browse/YARN-7670
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Arun Suresh
Assignee: Arun Suresh


As per discussions in YARN-7612, this JIRA tracks the changes to the 
ResourceScheduler interface and its implementation in the CapacityScheduler to 
support SchedulingRequests.






[jira] [Created] (YARN-7669) [API] Introduce interfaces for placement constraint processing

2017-12-18 Thread Arun Suresh (JIRA)
Arun Suresh created YARN-7669:
-

 Summary: [API] Introduce interfaces for placement constraint 
processing
 Key: YARN-7669
 URL: https://issues.apache.org/jira/browse/YARN-7669
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Arun Suresh
Assignee: Arun Suresh


As per discussions in YARN-7612, this JIRA introduces the generic interfaces 
that will be implemented in YARN-7612.






[jira] [Created] (YARN-7623) Fix the CapacityScheduler Queue configuration documentation

2017-12-07 Thread Arun Suresh (JIRA)
Arun Suresh created YARN-7623:
-

 Summary: Fix the CapacityScheduler Queue configuration 
documentation
 Key: YARN-7623
 URL: https://issues.apache.org/jira/browse/YARN-7623
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Arun Suresh


It looks like the [Changing Queue 
Configuration|https://hadoop.apache.org/docs/r2.9.0/hadoop-yarn/hadoop-yarn-site/CapacityScheduler.html#Changing_queue_configuration_via_API]
 section is mis-formatted.






[jira] [Created] (YARN-7613) Implement Planning algorithms for rich placement

2017-12-05 Thread Arun Suresh (JIRA)
Arun Suresh created YARN-7613:
-

 Summary: Implement Planning algorithms for rich placement
 Key: YARN-7613
 URL: https://issues.apache.org/jira/browse/YARN-7613
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Arun Suresh
Assignee: Panagiotis Garefalakis









[jira] [Created] (YARN-7612) Add Placement Processor and planner framework

2017-12-05 Thread Arun Suresh (JIRA)
Arun Suresh created YARN-7612:
-

 Summary: Add Placement Processor and planner framework
 Key: YARN-7612
 URL: https://issues.apache.org/jira/browse/YARN-7612
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Arun Suresh
Assignee: Arun Suresh


This introduces a Placement Processor and a planning algorithm framework that 
handle placement constraints and scheduling requests from an app and place 
them on nodes.

The actual planning algorithm(s) will be handled in a separate JIRA.






[jira] [Created] (YARN-7559) TestNodeLabelContainerAllocation failing intermittently

2017-11-22 Thread Arun Suresh (JIRA)
Arun Suresh created YARN-7559:
-

 Summary: TestNodeLabelContainerAllocation failing intermittently
 Key: YARN-7559
 URL: https://issues.apache.org/jira/browse/YARN-7559
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Arun Suresh









[jira] [Created] (YARN-7547) Throttle Localization for Opportunistic Containers in the NM

2017-11-21 Thread Arun Suresh (JIRA)
Arun Suresh created YARN-7547:
-

 Summary: Throttle Localization for Opportunistic Containers in the 
NM
 Key: YARN-7547
 URL: https://issues.apache.org/jira/browse/YARN-7547
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Arun Suresh
Assignee: kartheek muthyala


Currently, localization is performed before the container is queued on the NM. 
It is possible for a barrage of opportunistic containers to prevent guaranteed 
containers from starting. This can be avoided by throttling localization 
requests for opportunistic containers - for example, if the number of queued 
containers exceeds a threshold, do not start localization for new opportunistic 
containers.
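The proposed gate is a one-line check in front of localization. An illustrative sketch; the threshold name and value are hypothetical, not an actual NM configuration key:

```python
MAX_QUEUED_FOR_OPP_LOCALIZATION = 10  # hypothetical threshold "x"


def should_start_localization(exec_type, queued_containers):
    """GUARANTEED containers always localize; OPPORTUNISTIC ones wait
    when the NM queue is already long."""
    if exec_type == "GUARANTEED":
        return True
    return queued_containers <= MAX_QUEUED_FOR_OPP_LOCALIZATION


assert should_start_localization("GUARANTEED", 50) is True
assert should_start_localization("OPPORTUNISTIC", 5) is True
assert should_start_localization("OPPORTUNISTIC", 50) is False
```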
  






[jira] [Created] (YARN-7542) NM recovers some Running Opportunistic Containers as SUSPEND

2017-11-20 Thread Arun Suresh (JIRA)
Arun Suresh created YARN-7542:
-

 Summary: NM recovers some Running Opportunistic Containers as 
SUSPEND
 Key: YARN-7542
 URL: https://issues.apache.org/jira/browse/YARN-7542
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Arun Suresh
Assignee: Sampada Dehankar


Steps to reproduce:
* Start a YARN cluster - enable opportunistic containers and set the NM queue 
length to something > 10. Also enable work-preserving restart.
* Start an MR job (without opportunistic containers).
* Kill the NM and restart it.
* The logs show that some of the containers are in the SUSPENDED state, even 
though they are still running.

[~sampada15] / [~kartheek], can you take a look at this?







[jira] [Created] (YARN-7448) [API] Add SchedulingRequest to the AllocateRequest

2017-11-06 Thread Arun Suresh (JIRA)
Arun Suresh created YARN-7448:
-

 Summary: [API] Add SchedulingRequest to the AllocateRequest
 Key: YARN-7448
 URL: https://issues.apache.org/jira/browse/YARN-7448
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Arun Suresh
Assignee: Arun Suresh


YARN-6594 introduces the {{SchedulingRequest}}. This JIRA tracks the inclusion 
of the SchedulingRequest into the AllocateRequest.






[jira] [Resolved] (YARN-5220) Scheduling of OPPORTUNISTIC containers through YARN RM

2017-11-04 Thread Arun Suresh (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun Suresh resolved YARN-5220.
---
   Resolution: Fixed
Fix Version/s: 2.9.0
 Release Note: 
This extends the centralized YARN RM to enable the scheduling of OPPORTUNISTIC 
containers in a centralized fashion.
This way, users can use OPPORTUNISTIC containers to improve the cluster's 
utilization, without needing to enable distributed scheduling.

> Scheduling of OPPORTUNISTIC containers through YARN RM
> --
>
> Key: YARN-5220
> URL: https://issues.apache.org/jira/browse/YARN-5220
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: resourcemanager
>Reporter: Konstantinos Karanasos
>Assignee: Konstantinos Karanasos
>Priority: Major
> Fix For: 2.9.0
>
>
> In YARN-2882, we introduced the notion of OPPORTUNISTIC containers, along 
> with the existing GUARANTEED containers of YARN.
> OPPORTUNISTIC containers are allowed to be queued at the NMs (YARN-2883), and 
> are executed as long as there are available resources at the NM. Moreover, 
> they are of lower priority than the GUARANTEED containers, that is, they can 
> be preempted for a GUARANTEED container to start its execution.
> In YARN-2877, we introduced distributed scheduling in YARN, and enabled 
> OPPORTUNISTIC containers to be scheduled exclusively by distributed 
> schedulers.
> In this JIRA, we are proposing to extend the centralized YARN RM in order to 
> enable the scheduling of OPPORTUNISTIC containers in a centralized fashion.
> This way, users can use OPPORTUNISTIC containers to improve the cluster's 
> utilization, without the need to enable distributed scheduling.
> This JIRA is also related to YARN-1011, which introduces the over-commitment 
> of resources, scheduling additional OPPORTUNISTIC containers to the NMs based 
> on the currently used resources and not only on the allocated resources.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)




[jira] [Resolved] (YARN-5687) Refactor TestOpportunisticContainerAllocation to extend TestAMRMClient

2017-11-04 Thread Arun Suresh (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun Suresh resolved YARN-5687.
---
   Resolution: Implemented
Fix Version/s: 2.9.0

This is already done

> Refactor TestOpportunisticContainerAllocation to extend TestAMRMClient
> --
>
> Key: YARN-5687
> URL: https://issues.apache.org/jira/browse/YARN-5687
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Konstantinos Karanasos
>Priority: Major
> Fix For: 2.9.0
>
> Attachments: YARN-5687.001.patch
>
>
> Since {{TestOpportunisticContainerAllocation}} shares a lot of code with the 
> {{TestAMRMClient}}, we should refactor the former, making it a subclass of 
> the latter.






[jira] [Resolved] (YARN-4631) Add specialized Token support for DistributedSchedulingProtocol

2017-11-04 Thread Arun Suresh (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun Suresh resolved YARN-4631.
---
Resolution: Won't Fix

Closing this as it is not required.

> Add specialized Token support for DistributedSchedulingProtocol
> ---
>
> Key: YARN-4631
> URL: https://issues.apache.org/jira/browse/YARN-4631
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager, resourcemanager
>Reporter: Arun Suresh
>Assignee: Arun Suresh
>Priority: Major
>
> The {{DistributedSchedulingProtocol}}, introduced in YARN-2885, extends the 
> {{ApplicationMasterProtocol}}. This protocol should support its own Token 
> type, and not just reuse the AMRMToken.






[jira] [Resolved] (YARN-4742) [Umbrella] Enhancements to Distributed Scheduling

2017-11-04 Thread Arun Suresh (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun Suresh resolved YARN-4742.
---
   Resolution: Fixed
Fix Version/s: 3.0.0-beta1
   2.9.0

> [Umbrella] Enhancements to Distributed Scheduling
> -
>
> Key: YARN-4742
> URL: https://issues.apache.org/jira/browse/YARN-4742
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Arun Suresh
>Assignee: Arun Suresh
>Priority: Major
> Fix For: 2.9.0, 3.0.0-beta1
>
>
> This is an Umbrella JIRA to track enhancements / improvements that can be 
> made to the core Distributed Scheduling framework: YARN-2877.






[jira] [Resolved] (YARN-2877) Extend YARN to support distributed scheduling

2017-11-04 Thread Arun Suresh (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun Suresh resolved YARN-2877.
---
Resolution: Fixed

> Extend YARN to support distributed scheduling
> -
>
> Key: YARN-2877
> URL: https://issues.apache.org/jira/browse/YARN-2877
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: nodemanager, resourcemanager
>Reporter: Sriram Rao
>Assignee: Konstantinos Karanasos
>Priority: Major
> Fix For: 2.9.0, 3.0.0-alpha2
>
> Attachments: distributed-scheduling-design-doc_v1.pdf
>
>
> This is an umbrella JIRA that proposes to extend YARN to support distributed 
> scheduling.  Briefly, some of the motivations for distributed scheduling are 
> the following:
> 1. Improve cluster utilization by opportunistically executing tasks on 
> otherwise idle resources on individual machines.
> 2. Reduce allocation latency for tasks where the scheduling time dominates 
> (i.e., task execution time is much shorter than the time required to obtain 
> a container from the RM).
>  






[jira] [Resolved] (YARN-5447) Consider including allocationRequestId in NMContainerStatus to allow recovery in case of RM failover

2017-09-29 Thread Arun Suresh (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun Suresh resolved YARN-5447.
---
Resolution: Won't Fix

Today, this is a problem even when a user does not specify an 
allocationRequestId.
Since the AMRMClient re-sends all outstanding requests to the RM after a 
failover, it should not be that big of an issue.

> Consider including allocationRequestId in NMContainerStatus to allow recovery 
> in case of RM failover
> 
>
> Key: YARN-5447
> URL: https://issues.apache.org/jira/browse/YARN-5447
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: applications, resourcemanager
>Reporter: Subru Krishnan
>Assignee: Subru Krishnan
>
> We have added a mapping of the allocated container to the original request 
> through YARN-4887/YARN-4888. There is a corner case in which the mapping will 
> be lost, i.e. if RM fails over before notifying the AM about newly allocated 
> container(s). This JIRA tracks the changes required to include the 
> allocationRequestId in NMContainerStatus to allow recovery in case of RM 
> failover.






[jira] [Resolved] (YARN-5861) Add support for recovery of queued opportunistic containers in the NM.

2017-09-29 Thread Arun Suresh (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun Suresh resolved YARN-5861.
---
Resolution: Duplicate

> Add support for recovery of queued opportunistic containers in the NM.
> --
>
> Key: YARN-5861
> URL: https://issues.apache.org/jira/browse/YARN-5861
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Arun Suresh
>Assignee: Arun Suresh
>
> Currently, the NM state store marks a container as QUEUED, but such 
> containers are ignored (deemed lost) if they had not started before the NM 
> went down. These containers should ideally be re-queued when the NM restarts.






[jira] [Resolved] (YARN-5860) Add support for increase and decrease of container resources to NM Container Queuing

2017-09-29 Thread Arun Suresh (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun Suresh resolved YARN-5860.
---
   Resolution: Implemented
Fix Version/s: 3.0.0
   2.9.0

> Add support for increase and decrease of container resources to NM Container 
> Queuing 
> -
>
> Key: YARN-5860
> URL: https://issues.apache.org/jira/browse/YARN-5860
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Arun Suresh
>Assignee: Arun Suresh
> Fix For: 2.9.0, 3.0.0
>
>
> Currently, the queuing framework in the NM (introduced in YARN-2877) that 
> handles opportunistic containers preempts opportunistic containers only when 
> resources are needed to start guaranteed containers.
> It currently does not handle situations where a guaranteed container's 
> resources have been increased. Conversely, if a guaranteed (or opportunistic) 
> container's resources have been decreased, the NM must start queued 
> opportunistic containers waiting on the newly available resources.
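The decrease side of the behaviour described above can be sketched as a toy 
model. The class and method names here ({{QueuingSketch}}, 
{{onResourceDecrease}}) are hypothetical illustrations, not the real 
ContainerScheduler code:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Toy sketch: when a running container's resources shrink, use the freed
// capacity to start queued opportunistic containers. Names are hypothetical.
public class QueuingSketch {
    private int availableMem;                               // free memory on the NM
    private final Deque<int[]> queued = new ArrayDeque<>(); // [containerId, mem]
    private final List<Integer> started = new ArrayList<>();

    public QueuingSketch(int availableMem) { this.availableMem = availableMem; }

    public void queueOpportunistic(int id, int mem) {
        queued.add(new int[]{id, mem});
    }

    // Called when a container's allocation is decreased by freedMem.
    public void onResourceDecrease(int freedMem) {
        availableMem += freedMem;
        startQueuedIfPossible();
    }

    // Start queued opportunistic containers that now fit, in FIFO order.
    private void startQueuedIfPossible() {
        while (!queued.isEmpty() && queued.peek()[1] <= availableMem) {
            int[] c = queued.poll();
            availableMem -= c[1];
            started.add(c[0]);
        }
    }

    public List<Integer> getStarted() { return started; }
}
```

The symmetric increase side would do the reverse: subtract the grown 
allocation from availableMem and, if it goes negative, preempt opportunistic 
containers until it is non-negative again.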






[jira] [Created] (YARN-7275) NM Statestore cleanup for Container updates

2017-09-29 Thread Arun Suresh (JIRA)
Arun Suresh created YARN-7275:
-

 Summary: NM Statestore cleanup for Container updates
 Key: YARN-7275
 URL: https://issues.apache.org/jira/browse/YARN-7275
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Arun Suresh
Assignee: kartheek muthyala
Priority: Blocker


Currently, only resource updates are recorded in the NM state store; we need to 
add ExecutionType updates as well.






[jira] [Created] (YARN-7258) Add Node and Rack Hints to Opportunistic Scheduler

2017-09-26 Thread Arun Suresh (JIRA)
Arun Suresh created YARN-7258:
-

 Summary: Add Node and Rack Hints to Opportunistic Scheduler
 Key: YARN-7258
 URL: https://issues.apache.org/jira/browse/YARN-7258
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Arun Suresh
Assignee: kartheek muthyala


Currently, the Opportunistic Scheduler ignores the node and rack information 
and allocates strictly on the least loaded node (based on queue length) at the 
time it receives the request. This JIRA is to track the changes needed to allow 
the OpportunisticContainerAllocator to take the node/rack name as hints.

The flow would be:
# If the requested node is found in the top K least loaded nodes, allocate on 
that node.
# Else, allocate on the least loaded node on the same rack from the top K 
least loaded nodes.
# Else, allocate on the least loaded node.
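The three-step fallback above can be sketched in plain Java. This is a toy 
model under assumed names ({{OppAllocatorSketch}}, {{pickNode}}); it is not 
the actual OpportunisticContainerAllocator code:

```java
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the node/rack-hint fallback described above.
public class OppAllocatorSketch {
    // topK is the list of the K least loaded nodes, sorted by ascending
    // queue length; nodeToRack maps node name -> rack name.
    public static String pickNode(List<String> topK,
                                  Map<String, String> nodeToRack,
                                  String requestedNode,
                                  String requestedRack) {
        // 1. The requested node is among the K least loaded nodes: use it.
        if (requestedNode != null && topK.contains(requestedNode)) {
            return requestedNode;
        }
        // 2. Else, least loaded node on the requested rack within the top K.
        if (requestedRack != null) {
            for (String node : topK) {
                if (requestedRack.equals(nodeToRack.get(node))) {
                    return node;
                }
            }
        }
        // 3. Else, fall back to the globally least loaded node.
        return topK.get(0);
    }
}
```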






[jira] [Created] (YARN-7240) Add more states and transitions to stabilize the NM Container state machine

2017-09-21 Thread Arun Suresh (JIRA)
Arun Suresh created YARN-7240:
-

 Summary: Add more states and transitions to stabilize the NM 
Container state machine
 Key: YARN-7240
 URL: https://issues.apache.org/jira/browse/YARN-7240
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Arun Suresh
Assignee: kartheek muthyala


There seem to be a few intermediate states that can be added to improve the 
stability of the NM container state machine.

For example:
* The REINITIALIZING state should probably be split into REINITIALIZING and 
REINITIALIZING_AWAITING_KILL.
* Container updates are currently handled in the ContainerScheduler, but it 
would probably be better to plumb them through the container state machine as 
a new state, say UPDATING, with a new container event.

The plan is also to add extra tests to try to cover every transition.






[jira] [Created] (YARN-7192) Add a pluggable StateMachine Listener that is notified of NM Container State changes

2017-09-13 Thread Arun Suresh (JIRA)
Arun Suresh created YARN-7192:
-

 Summary: Add a pluggable StateMachine Listener that is notified of 
NM Container State changes
 Key: YARN-7192
 URL: https://issues.apache.org/jira/browse/YARN-7192
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Arun Suresh
Assignee: Arun Suresh


This JIRA is to add support for a pluggable class in the NodeManager that is 
notified of changes to the Container StateMachine state and the events that 
caused the change.

The proposal is to modify the basic StateMachine class to add support for a 
hook that is called before and after a transition.
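A minimal sketch of the pre/post-transition hook idea follows. The 
{{ListenableStateMachine}} and listener names here are hypothetical, not 
YARN's actual StateMachine classes:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a state machine that notifies pluggable listeners
// before and after each transition.
public class ListenableStateMachine {
    public interface TransitionListener {
        void preTransition(String from, String event);
        void postTransition(String from, String to, String event);
    }

    private String state;
    private final List<TransitionListener> listeners = new ArrayList<>();

    public ListenableStateMachine(String initialState) {
        this.state = initialState;
    }

    public void addListener(TransitionListener l) { listeners.add(l); }

    // Fire the hooks around the actual state change.
    public void doTransition(String event, String newState) {
        String from = state;
        for (TransitionListener l : listeners) {
            l.preTransition(from, event);
        }
        state = newState;  // the "real" transition logic would live here
        for (TransitionListener l : listeners) {
            l.postTransition(from, newState, event);
        }
    }

    public String getState() { return state; }
}
```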






[jira] [Closed] (YARN-6692) Delay pause when container is localizing

2017-09-10 Thread Arun Suresh (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun Suresh closed YARN-6692.
-

> Delay pause when container is localizing
> 
>
> Key: YARN-6692
> URL: https://issues.apache.org/jira/browse/YARN-6692
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager
>Reporter: Jose Miguel Arreola
>Assignee: Jose Miguel Arreola
>   Original Estimate: 96h
>  Remaining Estimate: 96h
>
> If a container receives a Pause event while localizing, allow container 
> finish localizing and then pause it






[jira] [Resolved] (YARN-6692) Delay pause when container is localizing

2017-09-10 Thread Arun Suresh (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun Suresh resolved YARN-6692.
---
Resolution: Invalid

Closing this, since it is not a valid scenario currently.

> Delay pause when container is localizing
> 
>
> Key: YARN-6692
> URL: https://issues.apache.org/jira/browse/YARN-6692
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager
>Reporter: Jose Miguel Arreola
>Assignee: Jose Miguel Arreola
>   Original Estimate: 96h
>  Remaining Estimate: 96h
>
> If a container receives a Pause event while localizing, allow container 
> finish localizing and then pause it






[jira] [Created] (YARN-7178) Add documentation for Container Update API

2017-09-08 Thread Arun Suresh (JIRA)
Arun Suresh created YARN-7178:
-

 Summary: Add documentation for Container Update API
 Key: YARN-7178
 URL: https://issues.apache.org/jira/browse/YARN-7178
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Arun Suresh
Assignee: Arun Suresh









[jira] [Resolved] (YARN-4509) Promote containers from OPPORTUNISTIC to GUARANTEED

2017-09-08 Thread Arun Suresh (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun Suresh resolved YARN-4509.
---
  Resolution: Duplicate
Target Version/s:   (was: )

> Promote containers from OPPORTUNISTIC to GUARANTEED
> ---
>
> Key: YARN-4509
> URL: https://issues.apache.org/jira/browse/YARN-4509
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.0.0-alpha1
>Reporter: Karthik Kambatla
>Assignee: Karthik Kambatla
>
> YARN-2882 adds the notion of OPPORTUNISTIC containers. We should define 
> the protocol for promoting these containers to GUARANTEED.






[jira] [Created] (YARN-7173) Container Update Backward compatibility fix for upgrades from 2.8.x

2017-09-07 Thread Arun Suresh (JIRA)
Arun Suresh created YARN-7173:
-

 Summary: Container Update Backward compatibility fix for upgrades 
from 2.8.x
 Key: YARN-7173
 URL: https://issues.apache.org/jira/browse/YARN-7173
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Arun Suresh
Assignee: Arun Suresh


This is based on discussions with [~leftnoteasy] in YARN-6979.

In YARN-6979, the {{getContainersToDecrease()}} and 
{{addAllContainersToDecrease()}} methods were removed from the 
NodeHeartbeatResponse (although the actual protobuf fields were still 
retained). We need to ensure that for clusters that upgrade from 2.8.x to 
2.9.0, the decreased containers should also be sent to the NM.






[jira] [Created] (YARN-7086) Release all containers asynchronously

2017-08-23 Thread Arun Suresh (JIRA)
Arun Suresh created YARN-7086:
-

 Summary: Release all containers asynchronously
 Key: YARN-7086
 URL: https://issues.apache.org/jira/browse/YARN-7086
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Reporter: Arun Suresh
Assignee: Arun Suresh


We have noticed in production two situations that can cause deadlocks and bring 
scheduling of new containers to a halt, especially with regard to applications 
that have a lot of live containers:
# When these applications release their containers in bulk.
# When these applications terminate abruptly due to some failure, and the 
scheduler releases all their live containers in a loop.

To handle the issues mentioned above, we have a patch in production to make 
sure ALL container releases happen asynchronously - and it has served us well.

Opening this JIRA to gather feedback on if this is a good idea generally (cc 
[~leftnoteasy], [~jlowe], [~curino], [~kasha], [~subru], [~roniburd])

BTW, in YARN-6251, we already have an asyncReleaseContainer() in the 
AbstractYarnScheduler and a corresponding scheduler event, which is currently 
used specifically for the container-update code paths (where the scheduler 
releases temp containers which it creates for the update).
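One way to sketch the asynchronous-release idea is a queue drained by a single 
worker thread, so bulk releases never block the caller. The class and method 
names ({{AsyncReleaser}}, {{releaseAsync}}) are hypothetical, not the actual 
AbstractYarnScheduler code:

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: container releases are enqueued and processed by a
// dedicated worker thread instead of inline under the scheduler lock.
public class AsyncReleaser {
    private final BlockingQueue<String> pending = new LinkedBlockingQueue<>();
    private final AtomicInteger released = new AtomicInteger();
    private final CountDownLatch done;

    public AsyncReleaser(int expected) {
        done = new CountDownLatch(expected);
        Thread worker = new Thread(() -> {
            try {
                while (true) {
                    String containerId = pending.take();
                    // ... actual resource clean-up would happen here ...
                    released.incrementAndGet();
                    done.countDown();
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        worker.setDaemon(true);
        worker.start();
    }

    // Called by the scheduler: returns immediately, never blocks.
    public void releaseAsync(String containerId) { pending.add(containerId); }

    // Test helper: wait for the expected number of releases to complete.
    public int awaitReleased() {
        try {
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return released.get();
    }
}
```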








[jira] [Created] (YARN-7015) Handle Container ExecType update (Promotion/Demotion) in cgroups resource handlers

2017-08-15 Thread Arun Suresh (JIRA)
Arun Suresh created YARN-7015:
-

 Summary: Handle Container ExecType update (Promotion/Demotion) in 
cgroups resource handlers
 Key: YARN-7015
 URL: https://issues.apache.org/jira/browse/YARN-7015
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Arun Suresh


YARN-5085 adds support for changing a container's execution type 
(Promotion/Demotion).
Modifications to the ContainerManagementProtocol, ContainerManager and 
ContainerScheduler to handle this change are now in trunk. Opening this JIRA to 
track changes (if any) required in the cgroups resource handlers to accommodate 
this in the context of YARN-1011. (cc [~kasha], [~kkaranasos], [~haibochen], 
[~miklos.szeg...@cloudera.com])






[jira] [Created] (YARN-6979) Add flag to allow all container updates to be initiated via NodeHeartbeatResponse

2017-08-09 Thread Arun Suresh (JIRA)
Arun Suresh created YARN-6979:
-

 Summary: Add flag to allow all container updates to be initiated 
via NodeHeartbeatResponse
 Key: YARN-6979
 URL: https://issues.apache.org/jira/browse/YARN-6979
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Arun Suresh
Assignee: kartheek muthyala


Currently, only the Container Resource increase command is sent to the NM via 
the NodeHeartbeat response. This JIRA proposes to add a flag in the RM to allow 
ALL container updates (increase, decrease, promote and demote) to be initiated 
via the node HB.

The AM is still free to use the ContainerManagementProtocol's 
{{updateContainer}} API in cases where, for instance, the Node HB frequency is 
very low and the AM needs to update the container as soon as possible. In 
these situations, if the Node HB arrives before the updateContainer API call, 
the call would error out due to a version mismatch, and the AM is required to 
handle it.






[jira] [Created] (YARN-6978) Add updateContainer API to NMClient.

2017-08-09 Thread Arun Suresh (JIRA)
Arun Suresh created YARN-6978:
-

 Summary: Add updateContainer API to NMClient.
 Key: YARN-6978
 URL: https://issues.apache.org/jira/browse/YARN-6978
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Arun Suresh
Assignee: kartheek muthyala


This is to track the addition of updateContainer API to the {{NMClient}} and 
{{NMClientAsync}}






[jira] [Created] (YARN-6963) Prevent other containers from starting when a container is re-initializing

2017-08-07 Thread Arun Suresh (JIRA)
Arun Suresh created YARN-6963:
-

 Summary: Prevent other containers from starting when a container is 
re-initializing
 Key: YARN-6963
 URL: https://issues.apache.org/jira/browse/YARN-6963
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Arun Suresh
Assignee: Arun Suresh


Further to discussions in YARN-6920.
Container re-initialization leads to momentarily relinquishing NM resources 
when the container is brought down, followed by re-claiming the same resources 
when it is re-launched. If there are Opportunistic containers in the queue, 
this can lead to unnecessary churn if one of those opportunistic containers is 
started and immediately killed.
This JIRA tracks changes required to prevent the above by ensuring the 
resources for a container are 'locked' for the duration of the container 
lifetime - including the time it takes for a re-initialization.






[jira] [Resolved] (YARN-6180) Clean unused SchedulerRequestKeys once ExecutionType updates are completed

2017-08-05 Thread Arun Suresh (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun Suresh resolved YARN-6180.
---
Resolution: Not A Problem

Resolving this - verified it is not a problem

> Clean unused SchedulerRequestKeys once ExecutionType updates are completed
> --
>
> Key: YARN-6180
> URL: https://issues.apache.org/jira/browse/YARN-6180
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Arun Suresh
>Assignee: Arun Suresh
>
> The SchedulerRequestKeys used for ExecutionType updates, that are generated, 
> tend to accumulate in the AppSchedulingInfo and over time lead to a situation 
> outlined in YARN-5540.
> These keys must be removed once the container update completes.






[jira] [Created] (YARN-6940) Enable Container Resize testcase for FairScheduler

2017-08-03 Thread Arun Suresh (JIRA)
Arun Suresh created YARN-6940:
-

 Summary: Enable Container Resize testcase for FairScheduler
 Key: YARN-6940
 URL: https://issues.apache.org/jira/browse/YARN-6940
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Reporter: Arun Suresh
Assignee: Arun Suresh


After YARN-6216, the Container Update (which includes Resource increase and 
decrease) code-paths are mostly scheduler agnostic.

This JIRA tracks the final minor change needed in the FairScheduler. It also 
re-enables the {{TestAMRMClient#testAMRMClientWithContainerResourceChange}} 
test for the FairScheduler, verifying that container updates work there.








[jira] [Created] (YARN-6932) Fix TestFederationRMFailoverProxyProvider test case

2017-08-02 Thread Arun Suresh (JIRA)
Arun Suresh created YARN-6932:
-

 Summary: Fix TestFederationRMFailoverProxyProvider test case
 Key: YARN-6932
 URL: https://issues.apache.org/jira/browse/YARN-6932
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Arun Suresh
Assignee: Subru Krishnan


Noticed that {{TestFederationRMFailoverProxyProvider}} is failing after the 
YARN-2915 merge (cc [~subru])






[jira] [Created] (YARN-6920) fix TestNMClient failure

2017-08-01 Thread Arun Suresh (JIRA)
Arun Suresh created YARN-6920:
-

 Summary: fix TestNMClient failure
 Key: YARN-6920
 URL: https://issues.apache.org/jira/browse/YARN-6920
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Arun Suresh
Assignee: Haibo Chen


Looks like {{TestNMClient}} has been failing for a while. Opening this JIRA to 
track the fix.








[jira] [Created] (YARN-6849) NMContainerStatus should have the Container ExecutionType.

2017-07-19 Thread Arun Suresh (JIRA)
Arun Suresh created YARN-6849:
-

 Summary: NMContainerStatus should have the Container ExecutionType.
 Key: YARN-6849
 URL: https://issues.apache.org/jira/browse/YARN-6849
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Arun Suresh


Currently, only the ContainerState is sent to the RM in the NMContainerStatus. 
This lets the restarted RM know whether the container is queued or not, but it 
won't know the ExecutionType.






[jira] [Created] (YARN-6848) Move Router ClientRMServices Interceptor and chain into yarn api and common package

2017-07-19 Thread Arun Suresh (JIRA)
Arun Suresh created YARN-6848:
-

 Summary: Move Router ClientRMServices Interceptor and chain into 
yarn api and common package
 Key: YARN-6848
 URL: https://issues.apache.org/jira/browse/YARN-6848
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Arun Suresh
Assignee: Arun Suresh


Once YARN-2915 is merged, the ClientRMServices interceptor and Proxy classes 
should be moved into the yarn common and api packages, so that they can be 
used not just in the Router, but also for specifying Interceptors that run in 
the RM.






[jira] [Created] (YARN-6838) Add support to linux container-executor to support container freezing

2017-07-18 Thread Arun Suresh (JIRA)
Arun Suresh created YARN-6838:
-

 Summary: Add support to linux container-executor to support 
container freezing 
 Key: YARN-6838
 URL: https://issues.apache.org/jira/browse/YARN-6838
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Arun Suresh









[jira] [Created] (YARN-6835) Remove runningContainers from ContainerScheduler

2017-07-17 Thread Arun Suresh (JIRA)
Arun Suresh created YARN-6835:
-

 Summary: Remove runningContainers from ContainerScheduler
 Key: YARN-6835
 URL: https://issues.apache.org/jira/browse/YARN-6835
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Arun Suresh
Assignee: Arun Suresh


The *runningContainers* collection contains both running containers as well as 
containers that are scheduled but not yet started.

We can remove this data structure completely by introducing a *LAUNCHING* 
container state.
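The idea of replacing the extra collection with an explicit state can be 
sketched as follows. The enum values and class name ({{ContainerStateSketch}}, 
{{LAUNCHING}}) are hypothetical illustrations of the proposal, not the actual 
NM container states:

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical sketch: instead of a separate runningContainers map that
// mixes scheduled and running containers, give each container an explicit
// state and derive the sets by filtering.
public class ContainerStateSketch {
    public enum State { QUEUED, LAUNCHING, RUNNING, DONE }

    // With an explicit LAUNCHING state, "running" is just a filter.
    public static List<String> running(Map<String, State> containers) {
        return containers.entrySet().stream()
                .filter(e -> e.getValue() == State.RUNNING)
                .map(Map.Entry::getKey)
                .sorted()
                .collect(Collectors.toList());
    }

    // The old runningContainers collection conflated these two states.
    public static List<String> launchingOrRunning(Map<String, State> containers) {
        return containers.entrySet().stream()
                .filter(e -> e.getValue() == State.RUNNING
                          || e.getValue() == State.LAUNCHING)
                .map(Map.Entry::getKey)
                .sorted()
                .collect(Collectors.toList());
    }
}
```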






[jira] [Created] (YARN-6829) Promote Opportunistic Containers to Guaranteed containers when Guaranteed containers complete

2017-07-15 Thread Arun Suresh (JIRA)
Arun Suresh created YARN-6829:
-

 Summary: Promote Opportunistic Containers to Guaranteed containers 
when Guaranteed containers complete 
 Key: YARN-6829
 URL: https://issues.apache.org/jira/browse/YARN-6829
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Arun Suresh
Assignee: Arun Suresh


Once Guaranteed containers of apps complete, it is possible that the queue/app 
might go below configured capacity, in which case existing Opportunistic 
containers of an app can be promoted to ensure they are not preempted.






[jira] [Created] (YARN-6828) [Umbrella] Container preemption using OPPORTUNISTIC containers

2017-07-15 Thread Arun Suresh (JIRA)
Arun Suresh created YARN-6828:
-

 Summary: [Umbrella] Container preemption using OPPORTUNISTIC 
containers
 Key: YARN-6828
 URL: https://issues.apache.org/jira/browse/YARN-6828
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Arun Suresh
Assignee: Arun Suresh


Currently, the YARN schedulers select containers for preemption only in 
response to a starved queue / app's request. We propose to allow the Schedulers 
to mark containers that are allocated over queue capacity/fair-share as 
Opportunistic containers.







[jira] [Resolved] (YARN-5049) Extend NMStateStore to save queued container information

2017-07-14 Thread Arun Suresh (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun Suresh resolved YARN-5049.
---
Resolution: Fixed

Committed to branch-2 as well

> Extend NMStateStore to save queued container information
> 
>
> Key: YARN-5049
> URL: https://issues.apache.org/jira/browse/YARN-5049
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager, resourcemanager
>Reporter: Konstantinos Karanasos
>Assignee: Konstantinos Karanasos
> Fix For: 3.0.0-alpha1
>
> Attachments: YARN-5049.001.patch, YARN-5049.002.patch, 
> YARN-5049.003.patch
>
>
> This JIRA is about extending the NMStateStore to save queued container 
> information whenever a new container is added to the NM queue. 
> It also removes the information from the state store when the queued 
> container starts its execution.






[jira] [Created] (YARN-6826) SLS NMSimulator support for Opportunistic Container Queuing.

2017-07-14 Thread Arun Suresh (JIRA)
Arun Suresh created YARN-6826:
-

 Summary: SLS NMSimulator support for Opportunistic Container 
Queuing.
 Key: YARN-6826
 URL: https://issues.apache.org/jira/browse/YARN-6826
 Project: Hadoop YARN
  Issue Type: Bug
  Components: scheduler-load-simulator
Reporter: Arun Suresh
Assignee: Arun Suresh









[jira] [Created] (YARN-6808) Allow Schedulers to return OPPORTUNISTIC containers when queues go over configured capacity

2017-07-11 Thread Arun Suresh (JIRA)
Arun Suresh created YARN-6808:
-

 Summary: Allow Schedulers to return OPPORTUNISTIC containers when 
queues go over configured capacity
 Key: YARN-6808
 URL: https://issues.apache.org/jira/browse/YARN-6808
 Project: Hadoop YARN
  Issue Type: New Feature
Reporter: Arun Suresh


This is based on discussions with [~kasha] and [~kkaranasos].
Currently, when a queue goes over capacity, apps on starved queues must wait 
either for containers to complete or for the scheduler to preempt them in 
order to get resources.
This JIRA proposes to allow Schedulers to:
# Allocate all containers over the configured queue capacity/weight as 
OPPORTUNISTIC.
# Auto-promote running OPPORTUNISTIC containers of apps as and when their 
GUARANTEED containers complete.
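To make the two-step proposal concrete, here is a minimal Java sketch; the enum, class, and method names are illustrative assumptions, not the actual scheduler code:

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sketch of the YARN-6808 idea: containers granted while the
// queue is over its configured capacity are tagged OPPORTUNISTIC, and the
// longest-waiting one is promoted when a GUARANTEED container completes.
public class OverCapacitySketch {
    public enum ExecutionType { GUARANTEED, OPPORTUNISTIC }

    // Tag the new container based on where queue usage would land.
    public static ExecutionType typeFor(long usedAfterAllocation, long queueCapacity) {
        return usedAfterAllocation > queueCapacity
                ? ExecutionType.OPPORTUNISTIC : ExecutionType.GUARANTEED;
    }

    private final Deque<String> opportunistic = new ArrayDeque<>();

    public void track(String containerId, ExecutionType type) {
        if (type == ExecutionType.OPPORTUNISTIC) {
            opportunistic.addLast(containerId);
        }
    }

    // When a GUARANTEED container of the app completes, auto-promote the
    // longest-waiting OPPORTUNISTIC container (if any).
    public String onGuaranteedComplete() {
        return opportunistic.pollFirst(); // null when nothing to promote
    }

    public static void main(String[] args) {
        assert typeFor(8, 10) == ExecutionType.GUARANTEED;
        assert typeFor(12, 10) == ExecutionType.OPPORTUNISTIC;
        OverCapacitySketch s = new OverCapacitySketch();
        s.track("c1", typeFor(12, 10));
        System.out.println("promoted: " + s.onGuaranteedComplete());
    }
}
```

The real scheduler would key promotion off the app's guaranteed share relative to the queue's capacity; the sketch only shows the tagging and FIFO-promotion idea.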






[jira] [Created] (YARN-6777) Support for ApplicationMasterService processing chain of interceptors

2017-07-08 Thread Arun Suresh (JIRA)
Arun Suresh created YARN-6777:
-

 Summary: Support for ApplicationMasterService processing chain of 
interceptors
 Key: YARN-6777
 URL: https://issues.apache.org/jira/browse/YARN-6777
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Arun Suresh
Assignee: Arun Suresh


This JIRA extends the Processor introduced in YARN-6776 with a configurable 
processing chain of interceptors.






[jira] [Created] (YARN-6776) Refactor ApplicationMasterService to move actual processing logic to a separate class

2017-07-07 Thread Arun Suresh (JIRA)
Arun Suresh created YARN-6776:
-

 Summary: Refactor ApplicationMasterService to move actual 
processing logic to a separate class
 Key: YARN-6776
 URL: https://issues.apache.org/jira/browse/YARN-6776
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Arun Suresh
Assignee: Arun Suresh
Priority: Minor


Minor refactoring to move the processing logic of the 
{{ApplicationMasterService}} into a separate class.

The per-appAttempt locking, as well as the extraction of the appAttemptId 
etc., will remain in the ApplicationMasterService.






[jira] [Created] (YARN-6619) AMRMClient Changes to use the PlacementConstraint and SchedulingRequest objects

2017-05-17 Thread Arun Suresh (JIRA)
Arun Suresh created YARN-6619:
-

 Summary: AMRMClient Changes to use the PlacementConstraint and 
SchedulingRequest objects
 Key: YARN-6619
 URL: https://issues.apache.org/jira/browse/YARN-6619
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Arun Suresh
Assignee: Panagiotis Garefalakis


Opening this JIRA to track changes needed in the AMRMClient to incorporate the 
PlacementConstraint and SchedulingRequest objects






[jira] [Created] (YARN-6614) Deprecate DistributedSchedulingProtocol and add required fields directly to ApplicationMasterProtocol

2017-05-16 Thread Arun Suresh (JIRA)
Arun Suresh created YARN-6614:
-

 Summary: Deprecate DistributedSchedulingProtocol and add required 
fields directly to ApplicationMasterProtocol
 Key: YARN-6614
 URL: https://issues.apache.org/jira/browse/YARN-6614
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Arun Suresh
Assignee: Arun Suresh


The {{DistributedSchedulingProtocol}} was initially designed as a wrapper 
protocol over the {{ApplicationMasterProtocol}}.

This JIRA proposes to deprecate the protocol itself and move the extra fields 
of the {{RegisterDistributedSchedulingAMResponse}} and 
{{DistributedSchedulingAllocateResponse}} to the 
{{RegisterApplicationMasterResponse}} and {{AllocateResponse}} respectively.

This will simplify the code considerably and allow the feature to be 
reimplemented as a preprocessor.






[jira] [Created] (YARN-6443) Allow for Priority order relaxing in favor of node locality

2017-04-04 Thread Arun Suresh (JIRA)
Arun Suresh created YARN-6443:
-

 Summary: Allow for Priority order relaxing in favor of node 
locality 
 Key: YARN-6443
 URL: https://issues.apache.org/jira/browse/YARN-6443
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: capacity scheduler, fairscheduler
Reporter: Arun Suresh
Assignee: Arun Suresh


Currently, the Schedulers examine an application's pending requests in 
priority order. This JIRA proposes to introduce a flag (either via 
ApplicationMasterService::registerApplication() or via some Scheduler 
configuration) to favor an ordering that is biased toward the node that is 
currently heartbeating, by relaxing the priority constraint.
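A small sketch of the proposed ordering, using a hypothetical Request shape rather than the real YARN scheduler classes: with the flag off, pending requests are examined in strict priority order; with it on, requests local to the heartbeating node jump ahead.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Illustrative only: compares strict priority ordering against an ordering
// biased toward the node that is currently heartbeating.
public class LocalityBiasedOrdering {
    static final class Request {
        final int priority;   // lower value = higher priority, as in YARN
        final String node;
        Request(int priority, String node) { this.priority = priority; this.node = node; }
    }

    public static List<Request> order(List<Request> pending, String heartbeatingNode,
                                      boolean relaxPriority) {
        List<Request> out = new ArrayList<>(pending);
        Comparator<Request> byPriority = Comparator.comparingInt(r -> r.priority);
        if (relaxPriority) {
            // Local requests first (false sorts before true), then by priority.
            out.sort(Comparator
                    .comparing((Request r) -> !r.node.equals(heartbeatingNode))
                    .thenComparing(byPriority));
        } else {
            out.sort(byPriority);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Request> pending = Arrays.asList(
                new Request(0, "nodeA"), new Request(1, "nodeB"));
        assert order(pending, "nodeB", false).get(0).node.equals("nodeA");
        assert order(pending, "nodeB", true).get(0).node.equals("nodeB");
        System.out.println("ok");
    }
}
```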






[jira] [Created] (YARN-6406) Garbage Collect unused SchedulerRequestKeys

2017-03-28 Thread Arun Suresh (JIRA)
Arun Suresh created YARN-6406:
-

 Summary: Garbage Collect unused SchedulerRequestKeys
 Key: YARN-6406
 URL: https://issues.apache.org/jira/browse/YARN-6406
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Arun Suresh
Assignee: Arun Suresh


YARN-5540 introduced some optimizations to remove satisfied SchedulerKeys from 
the AppSchedulingInfo. It looks like after YARN-6040, SchedulerRequestKeys are 
removed only if the Application sends a request with 0 numContainers, whereas 
earlier the outstanding schedulerKeys were also removed as soon as a container 
was allocated.

An additional optimization we were hoping to include is to remove the 
ResourceRequests themselves once numContainers == 0, since we see in our 
clusters that RM heap space consumption increases drastically due to a large 
number of ResourceRequests with 0 numContainers.
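The additional optimization can be sketched as follows; the map shape and method names are hypothetical, standing in for AppSchedulingInfo's bookkeeping:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the proposed optimization: once a ResourceRequest's
// numContainers drops to 0, drop the entry entirely instead of retaining a
// zero-container request on the RM heap.
public class RequestGc {
    private final Map<String, Integer> numContainersByKey = new HashMap<>();

    public void update(String schedulerKey, int numContainers) {
        if (numContainers == 0) {
            numContainersByKey.remove(schedulerKey); // reclaim the entry
        } else {
            numContainersByKey.put(schedulerKey, numContainers);
        }
    }

    public int outstandingKeys() {
        return numContainersByKey.size();
    }

    public static void main(String[] args) {
        RequestGc info = new RequestGc();
        info.update("priority=1,allocId=0", 3);
        assert info.outstandingKeys() == 1;
        info.update("priority=1,allocId=0", 0); // satisfied: entry is dropped
        assert info.outstandingKeys() == 0;
        System.out.println("ok");
    }
}
```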






[jira] [Created] (YARN-6355) Interceptor framework for the YARN ApplicationMasterService

2017-03-16 Thread Arun Suresh (JIRA)
Arun Suresh created YARN-6355:
-

 Summary: Interceptor framework for the YARN 
ApplicationMasterService
 Key: YARN-6355
 URL: https://issues.apache.org/jira/browse/YARN-6355
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Arun Suresh
Assignee: Arun Suresh


Currently on the NM, we have the {{AMRMProxy}} framework to intercept the AM 
<-> RM communication and enforce policies. This is used both by YARN federation 
(YARN-2915) as well as Distributed Scheduling (YARN-2877).

This JIRA proposes to introduce a similar framework on the RM side, so that 
pluggable policies can be enforced on the ApplicationMasterService centrally 
as well.

This would be similar in spirit to a Java Servlet filter chain, where the 
order of the interceptors can be declared externally.

One possible use case: the {{OpportunisticContainerAllocatorAMService}} is 
currently implemented as a wrapper over the {{ApplicationMasterService}}; it 
would probably be better to implement it as an interceptor.
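A minimal sketch of such a filter-chain-style interceptor framework; all names here are illustrative, not a real YARN API:

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical interceptor chain for the ApplicationMasterService, in the
// spirit of a servlet filter chain: each interceptor may transform the
// request before delegating to the rest of the chain.
public class AmsInterceptorChain {
    public interface Interceptor {
        String allocate(String request, Chain chain);
    }

    public static final class Chain {
        private final Iterator<Interceptor> rest;
        Chain(Iterator<Interceptor> rest) { this.rest = rest; }
        public String proceed(String request) {
            // The last element in the ordered list acts as the terminal
            // processor that actually handles the request.
            return rest.next().allocate(request, this);
        }
    }

    public static String run(List<Interceptor> ordered, String request) {
        return new Chain(ordered.iterator()).proceed(request);
    }

    public static void main(String[] args) {
        Interceptor tagging = (req, chain) -> chain.proceed(req + "+tagged");
        Interceptor terminal = (req, chain) -> "allocated:" + req;
        String result = run(Arrays.asList(tagging, terminal), "amRequest");
        assert result.equals("allocated:amRequest+tagged");
        System.out.println(result);
    }
}
```

The externally declared order is just the order of the list, mirroring how servlet filter chains are configured.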






[jira] [Resolved] (YARN-6181) SchedulerRequestKey compareTo method neglects to compare containerToUpdate

2017-03-03 Thread Arun Suresh (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun Suresh resolved YARN-6181.
---
Resolution: Duplicate

> SchedulerRequestKey compareTo method neglects to compare containerToUpdate 
> ---
>
> Key: YARN-6181
> URL: https://issues.apache.org/jira/browse/YARN-6181
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Arun Suresh
>Assignee: Arun Suresh
>
> The SchedulerRequestKey's {{compareTo}} does not correctly compare the 
> {{containerToUpdate}} fields. Thus multiple update requests against the same 
> priority and allocationRequestId will be clobbered.  






[jira] [Created] (YARN-6251) Fix Scheduler locking issue introduced by YARN-6216

2017-02-28 Thread Arun Suresh (JIRA)
Arun Suresh created YARN-6251:
-

 Summary: Fix Scheduler locking issue introduced by YARN-6216
 Key: YARN-6251
 URL: https://issues.apache.org/jira/browse/YARN-6251
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Arun Suresh


Opening to track a locking issue that was uncovered when running a custom SLS 
AMSimulator.






[jira] [Created] (YARN-6231) TestFairScheduler::testMoveWouldViolateMaxResourcesConstraints failing on branch-2

2017-02-23 Thread Arun Suresh (JIRA)
Arun Suresh created YARN-6231:
-

 Summary: 
TestFairScheduler::testMoveWouldViolateMaxResourcesConstraints failing on 
branch-2 
 Key: YARN-6231
 URL: https://issues.apache.org/jira/browse/YARN-6231
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.9.0
Reporter: Arun Suresh
Assignee: Karthik Kambatla









[jira] [Created] (YARN-6216) Unify Container Resizing code paths with Container Updates making it scheduler agnostic

2017-02-22 Thread Arun Suresh (JIRA)
Arun Suresh created YARN-6216:
-

 Summary: Unify Container Resizing code paths with Container 
Updates making it scheduler agnostic
 Key: YARN-6216
 URL: https://issues.apache.org/jira/browse/YARN-6216
 Project: Hadoop YARN
  Issue Type: Bug
  Components: capacity scheduler, fairscheduler, resourcemanager
Affects Versions: 3.0.0-alpha2
Reporter: Arun Suresh
Assignee: Arun Suresh
 Fix For: 3.0.0-alpha3









[jira] [Created] (YARN-6181) SchedulerRequestKey compareTo method neglects to compare containerToUpdate

2017-02-12 Thread Arun Suresh (JIRA)
Arun Suresh created YARN-6181:
-

 Summary: SchedulerRequestKey compareTo method neglects to compare 
containerToUpdate 
 Key: YARN-6181
 URL: https://issues.apache.org/jira/browse/YARN-6181
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Arun Suresh
Assignee: Arun Suresh


The SchedulerRequestKey's {{compareTo}} does not correctly compare the 
{{containerToUpdate}} fields. Thus multiple update requests against the same 
priority and allocationRequestId will be clobbered.  
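A simplified illustration of the bug and the fix; the class shape is hypothetical (the real SchedulerRequestKey has more fields), but it shows why ignoring containerToUpdate clobbers updates:

```java
import java.util.Comparator;
import java.util.Objects;
import java.util.TreeSet;

// A compareTo that ignores containerToUpdate makes two distinct update
// requests compare as equal, so only one survives in sorted structures.
public class SchedulerRequestKeySketch implements Comparable<SchedulerRequestKeySketch> {
    private static final Comparator<String> NULLS_FIRST =
            Comparator.nullsFirst(Comparator.naturalOrder());

    final int priority;
    final long allocationRequestId;
    final String containerToUpdate; // null for ordinary (non-update) requests

    SchedulerRequestKeySketch(int priority, long allocationRequestId,
                              String containerToUpdate) {
        this.priority = priority;
        this.allocationRequestId = allocationRequestId;
        this.containerToUpdate = containerToUpdate;
    }

    @Override
    public int compareTo(SchedulerRequestKeySketch o) {
        int c = Integer.compare(priority, o.priority);
        if (c != 0) return c;
        c = Long.compare(allocationRequestId, o.allocationRequestId);
        if (c != 0) return c;
        // The fix: also break ties on containerToUpdate.
        return Objects.compare(containerToUpdate, o.containerToUpdate, NULLS_FIRST);
    }

    public static void main(String[] args) {
        TreeSet<SchedulerRequestKeySketch> keys = new TreeSet<>();
        keys.add(new SchedulerRequestKeySketch(1, 7, "container_01"));
        keys.add(new SchedulerRequestKeySketch(1, 7, "container_02"));
        // Without the containerToUpdate tie-break, the second add would be
        // dropped and keys.size() would be 1.
        assert keys.size() == 2;
        System.out.println("distinct keys: " + keys.size());
    }
}
```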






[jira] [Created] (YARN-6180) Clean unused SchedulerRequestKeys once ExecutionType updates are completed

2017-02-12 Thread Arun Suresh (JIRA)
Arun Suresh created YARN-6180:
-

 Summary: Clean unused SchedulerRequestKeys once ExecutionType 
updates are completed
 Key: YARN-6180
 URL: https://issues.apache.org/jira/browse/YARN-6180
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Arun Suresh
Assignee: Arun Suresh


The SchedulerRequestKeys generated for ExecutionType updates tend to 
accumulate in the AppSchedulingInfo and, over time, lead to the situation 
outlined in YARN-5540.

These keys must be removed once the container update completes.






[jira] [Resolved] (YARN-5646) Add documentation and update config parameter names for scheduling of OPPORTUNISTIC containers

2017-01-13 Thread Arun Suresh (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun Suresh resolved YARN-5646.
---
Resolution: Fixed

Committed addendum patch to trunk and branch-2

> Add documentation and update config parameter names for scheduling of 
> OPPORTUNISTIC containers
> --
>
> Key: YARN-5646
> URL: https://issues.apache.org/jira/browse/YARN-5646
> Project: Hadoop YARN
>  Issue Type: Task
>Reporter: Konstantinos Karanasos
>Assignee: Konstantinos Karanasos
>Priority: Blocker
> Fix For: 2.9.0, 3.0.0-alpha2
>
> Attachments: YARN-5646.001.patch, YARN-5646.002.patch, 
> YARN-5646.003.patch, YARN-5646.004.patch, YARN-5646.addendum.patch
>
>
> This is for adding documentation regarding the scheduling of OPPORTUNISTIC 
> containers.
> It includes both the centralized (YARN-5220) and the distributed (YARN-2877) 
> scheduling.






[jira] [Created] (YARN-6066) Opportunistic containers: Minor fixes: API annotations and config parameter changes

2017-01-06 Thread Arun Suresh (JIRA)
Arun Suresh created YARN-6066:
-

 Summary: Opportunistic containers: Minor fixes: API annotations 
and config parameter changes
 Key: YARN-6066
 URL: https://issues.apache.org/jira/browse/YARN-6066
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Arun Suresh
Priority: Minor


Creating this to capture changes suggested by [~leftnoteasy] and [~kasha] in 
YARN-6041 in its own JIRA.






[jira] [Created] (YARN-6041) Opportunistic containers : Combined patch for branch-2

2016-12-29 Thread Arun Suresh (JIRA)
Arun Suresh created YARN-6041:
-

 Summary: Opportunistic containers : Combined patch for branch-2 
 Key: YARN-6041
 URL: https://issues.apache.org/jira/browse/YARN-6041
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Arun Suresh
Assignee: Arun Suresh
 Fix For: 2.9.0


This is a combined patch for branch-2 of the following JIRAs, which have 
already been committed to trunk:

YARN-5938. Refactoring OpportunisticContainerAllocator to use 
SchedulerRequestKey instead of Priority and other misc fixes
YARN-5646. Add documentation and update config parameter names for scheduling 
of OPPORTUNISTIC containers.
YARN-5982. Simplify opportunistic container parameters and metrics.
YARN-5918. Handle Opportunistic scheduling allocate request failure when NM is 
lost.
YARN-4597. Introduce ContainerScheduler and a SCHEDULED state to NodeManager 
container lifecycle.
YARN-5823. Update NMTokens in case of requests with only opportunistic 
containers.
YARN-5377. Fix 
TestQueuingContainerManager.testKillMultipleOpportunisticContainers.
YARN-2995. Enhance UI to show cluster resource utilization of various container 
Execution types.
YARN-5799. Fix Opportunistic Allocation to set the correct value of Node Http 
Address.
YARN-5486. Update OpportunisticContainerAllocatorAMService::allocate method to 
handle OPPORTUNISTIC container requests.






[jira] [Created] (YARN-5978) ContainerScheduler and Container state machine changes to support ExecType update

2016-12-06 Thread Arun Suresh (JIRA)
Arun Suresh created YARN-5978:
-

 Summary: ContainerScheduler and Container state machine changes to 
support ExecType update
 Key: YARN-5978
 URL: https://issues.apache.org/jira/browse/YARN-5978
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Arun Suresh









[jira] [Created] (YARN-5977) ContainerManagementProtocol changes to support change of container ExecutionType

2016-12-06 Thread Arun Suresh (JIRA)
Arun Suresh created YARN-5977:
-

 Summary: ContainerManagementProtocol changes to support change of 
container ExecutionType
 Key: YARN-5977
 URL: https://issues.apache.org/jira/browse/YARN-5977
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Arun Suresh









[jira] [Created] (YARN-5972) Add Support for Pausing/Freezing of containers

2016-12-06 Thread Arun Suresh (JIRA)
Arun Suresh created YARN-5972:
-

 Summary: Add Support for Pausing/Freezing of containers
 Key: YARN-5972
 URL: https://issues.apache.org/jira/browse/YARN-5972
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Arun Suresh









[jira] [Created] (YARN-5966) AMRMClient changes to support ExecutionType update

2016-12-05 Thread Arun Suresh (JIRA)
Arun Suresh created YARN-5966:
-

 Summary: AMRMClient changes to support ExecutionType update
 Key: YARN-5966
 URL: https://issues.apache.org/jira/browse/YARN-5966
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Arun Suresh









[jira] [Resolved] (YARN-5087) Expose API to allow AM to request for change of container ExecutionType

2016-12-03 Thread Arun Suresh (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun Suresh resolved YARN-5087.
---
Resolution: Resolved

Closing this, as YARN-5221 fixes it.

> Expose API to allow AM to request for change of container ExecutionType
> ---
>
> Key: YARN-5087
> URL: https://issues.apache.org/jira/browse/YARN-5087
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Arun Suresh
>Assignee: Arun Suresh
>







[jira] [Created] (YARN-5959) Add support for ExecutionType change from OPPORTUNISTIC to GUARANTEED

2016-12-01 Thread Arun Suresh (JIRA)
Arun Suresh created YARN-5959:
-

 Summary: Add support for ExecutionType change from OPPORTUNISTIC 
to GUARANTEED
 Key: YARN-5959
 URL: https://issues.apache.org/jira/browse/YARN-5959
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Arun Suresh









[jira] [Created] (YARN-5938) Minor refactoring to OpportunisticContainerAllocatorAMService

2016-11-28 Thread Arun Suresh (JIRA)
Arun Suresh created YARN-5938:
-

 Summary: Minor refactoring to 
OpportunisticContainerAllocatorAMService
 Key: YARN-5938
 URL: https://issues.apache.org/jira/browse/YARN-5938
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Arun Suresh
Assignee: Arun Suresh


Minor code re-organization to do the following:

# The OpportunisticContainerAllocatorAMService currently allocates outside the 
ApplicationAttempt lock maintained by the ApplicationMasterService. This should 
happen inside the lock.
# Refactor out some code to simplify the allocate() method.






[jira] [Created] (YARN-5861) Add support for recovery of queued opportunistic containers in the NM.

2016-11-08 Thread Arun Suresh (JIRA)
Arun Suresh created YARN-5861:
-

 Summary: Add support for recovery of queued opportunistic 
containers in the NM.
 Key: YARN-5861
 URL: https://issues.apache.org/jira/browse/YARN-5861
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Arun Suresh
Assignee: Arun Suresh


Currently, the NM state store marks containers as QUEUED, but they are ignored 
(deemed lost) if they had not started before the NM went down. These 
containers should ideally be re-queued when the NM restarts.






[jira] [Created] (YARN-5860) Add support for increase and decrease of container resources to NM Container Queuing

2016-11-08 Thread Arun Suresh (JIRA)
Arun Suresh created YARN-5860:
-

 Summary: Add support for increase and decrease of container 
resources to NM Container Queuing 
 Key: YARN-5860
 URL: https://issues.apache.org/jira/browse/YARN-5860
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Arun Suresh
Assignee: Arun Suresh


Currently, the queuing framework in the NM (introduced in YARN-2877) that 
handles opportunistic containers preempts opportunistic containers only when 
resources are needed to start guaranteed containers.
It does not currently handle situations where a guaranteed container's 
resources have been increased. Conversely, if a guaranteed (or opportunistic) 
container's resources have been decreased, the NM must start queued 
opportunistic containers waiting on the newly available resources.
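A toy sketch of the missing decrease path, tracking only memory; the names are hypothetical and stand in for the NM's actual container-queuing logic:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Queue;

// When a running container's allocation shrinks, launch as many queued
// opportunistic containers as now fit in the freed headroom (FIFO).
public class QueuingOnDecrease {
    private long availableMemMb;
    private final Queue<long[]> queued = new ArrayDeque<>(); // {id, memMb}

    public QueuingOnDecrease(long availableMemMb) {
        this.availableMemMb = availableMemMb;
    }

    public void queueOpportunistic(long id, long memMb) {
        queued.add(new long[]{id, memMb});
    }

    // Called when a running container's allocation shrinks by deltaMb.
    public List<Long> onResourceDecrease(long deltaMb) {
        availableMemMb += deltaMb;
        List<Long> started = new ArrayList<>();
        while (!queued.isEmpty() && queued.peek()[1] <= availableMemMb) {
            long[] c = queued.poll();
            availableMemMb -= c[1];
            started.add(c[0]);
        }
        return started;
    }

    public static void main(String[] args) {
        QueuingOnDecrease nm = new QueuingOnDecrease(0);
        nm.queueOpportunistic(1, 512);
        nm.queueOpportunistic(2, 1024);
        // A running container shrinks by 600 MB; only container 1 now fits.
        List<Long> started = nm.onResourceDecrease(600);
        assert started.equals(Arrays.asList(1L));
        System.out.println("started: " + started);
    }
}
```

The increase case is the mirror image: the NM would have to preempt or re-queue opportunistic containers to make room for the grown guaranteed container.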







[jira] [Created] (YARN-5799) Fix OpportunisticAllocation to set the correct value of Node Http Address

2016-10-28 Thread Arun Suresh (JIRA)
Arun Suresh created YARN-5799:
-

 Summary: Fix OpportunisticAllocation to set the correct value of 
Node Http Address 
 Key: YARN-5799
 URL: https://issues.apache.org/jira/browse/YARN-5799
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Arun Suresh
Assignee: Konstantinos Karanasos


This proposes to fix the OpportunisticAllocator, used to allocate 
OPPORTUNISTIC containers (both centrally as well as in a distributed manner), 
to correctly populate the Node Http Address in the returned Container.






[jira] [Created] (YARN-5651) Changes to NMStateStore to persist reinitialization and rollback state

2016-09-14 Thread Arun Suresh (JIRA)
Arun Suresh created YARN-5651:
-

 Summary: Changes to NMStateStore to persist reinitialization and 
rollback state
 Key: YARN-5651
 URL: https://issues.apache.org/jira/browse/YARN-5651
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Arun Suresh
Assignee: Arun Suresh









[jira] [Created] (YARN-5637) Changes in NodeManager to support Container upgrade and rollback/commit

2016-09-12 Thread Arun Suresh (JIRA)
Arun Suresh created YARN-5637:
-

 Summary: Changes in NodeManager to support Container upgrade and 
rollback/commit
 Key: YARN-5637
 URL: https://issues.apache.org/jira/browse/YARN-5637
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Arun Suresh
Assignee: Arun Suresh


YARN-5620 added support for re-initialization of Containers using a new launch 
Context.
This JIRA proposes to use the above feature to support upgrade and subsequent 
rollback or commit of the upgrade.






[jira] [Resolved] (YARN-5633) Update Container Version in NMStateStore only if Resources have changed

2016-09-09 Thread Arun Suresh (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun Suresh resolved YARN-5633.
---
Resolution: Duplicate

> Update Container Version in NMStateStore only if Resources have changed
> ---
>
> Key: YARN-5633
> URL: https://issues.apache.org/jira/browse/YARN-5633
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Arun Suresh
>Assignee: Arun Suresh
> Attachments: YARN-5633-branch-2.8-v1.patch
>
>
> YARN-5221 introduced a containerVersion that is stored in the NMStateStore. 
> The version is stored when
> # the container is first started in the NM
> # when the AM requests the NM to increase the Container resources.
> # when the RM notifies the NM to decrease the Container resources. 
> Unfortunately, this results in the NM not being able to roll back after an 
> upgrade (e.g., from 2.8 to 2.7), as noticed by [~jlowe].
> This JIRA proposes to update the version in the NM state store only when 2 
> and 3 above occur; this way, rollback will be hampered only if the user has 
> used the new feature (resource increase/decrease)






[jira] [Created] (YARN-5633) Update Container Version in NMStateStore only if Resources have changed

2016-09-09 Thread Arun Suresh (JIRA)
Arun Suresh created YARN-5633:
-

 Summary: Update Container Version in NMStateStore only if 
Resources have changed
 Key: YARN-5633
 URL: https://issues.apache.org/jira/browse/YARN-5633
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Arun Suresh
Assignee: Arun Suresh


YARN-5221 introduced a containerVersion that is stored in the NMStateStore. The 
version is stored when
# the container is first started in the NM
# when the AM requests the NM to increase the Container resources.
# when the RM notifies the NM to decrease the Container resources. 

Unfortunately, this results in the NM not being able to roll back after an 
upgrade (e.g., from 2.8 to 2.7), as noticed by [~jlowe].

This JIRA proposes to update the version in the NM state store only when 2 
and 3 above occur; this way, rollback will be hampered only if the user has 
used the new feature (resource increase/decrease)
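The proposed change can be sketched as follows; the class is hypothetical (the real state store records far more than a counter), but it shows the rollback-friendly bump rule:

```java
// Persist a new container version only when resources actually change, so an
// older NM can still read the state store after a rollback unless the
// resize feature was really used.
public class ContainerVersionSketch {
    private int storedVersion = 0;

    public int onContainerStart() {
        // Case 1 no longer bumps the version; a plain start keeps version 0.
        return storedVersion;
    }

    public int onResourceChange() {
        // Cases 2 and 3 (increase/decrease) are the only events that bump it.
        return ++storedVersion;
    }

    public static void main(String[] args) {
        ContainerVersionSketch c = new ContainerVersionSketch();
        assert c.onContainerStart() == 0;  // rollback-safe: no version bump
        assert c.onResourceChange() == 1;  // feature used: version recorded
        System.out.println("ok");
    }
}
```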








[jira] [Created] (YARN-5620) Core changes in NodeManager to support for upgrade and rollback of Containers

2016-09-06 Thread Arun Suresh (JIRA)
Arun Suresh created YARN-5620:
-

 Summary: Core changes in NodeManager to support for upgrade and 
rollback of Containers
 Key: YARN-5620
 URL: https://issues.apache.org/jira/browse/YARN-5620
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Arun Suresh
Assignee: Arun Suresh


This JIRA proposes to modify the ContainerManager (and other core classes) to 
support upgrading a running container with a new {{ContainerLaunchContext}}, 
as well as the ability to roll back the upgrade if the container is not able 
to restart using the new launch context.






[jira] [Created] (YARN-5609) Expose upgrade and restart API in ContainerManagementProtocol

2016-08-31 Thread Arun Suresh (JIRA)
Arun Suresh created YARN-5609:
-

 Summary: Expose upgrade and restart API in 
ContainerManagementProtocol
 Key: YARN-5609
 URL: https://issues.apache.org/jira/browse/YARN-5609
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Arun Suresh


YARN-4876 allows an AM to explicitly *initialize*, *start*, *stop* and 
*destroy* a {{Container}}.

This JIRA proposes to extend the ContainerManagementProtocol with the following 
API:
# *upgrade* : which is a composition of *stop* + *(re)initialize* + *start*
# *restart* : which is *stop* + *start*
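The composition can be sketched directly; method names here are illustrative, not the real ContainerManagementProtocol signatures:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// upgrade == stop + (re)initialize + start; restart == stop + start.
public class ContainerLifecycleSketch {
    final List<String> log = new ArrayList<>();

    void stop()         { log.add("stop"); }
    void reinitialize() { log.add("reinitialize"); }
    void start()        { log.add("start"); }

    public void upgrade() { stop(); reinitialize(); start(); }

    public void restart() { stop(); start(); }

    public static void main(String[] args) {
        ContainerLifecycleSketch upgraded = new ContainerLifecycleSketch();
        upgraded.upgrade();
        assert upgraded.log.equals(Arrays.asList("stop", "reinitialize", "start"));

        ContainerLifecycleSketch restarted = new ContainerLifecycleSketch();
        restarted.restart();
        assert restarted.log.equals(Arrays.asList("stop", "start"));
        System.out.println("ok");
    }
}
```

Composing the new calls from the YARN-4876 primitives keeps the protocol surface small while still giving AMs the two higher-level operations.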







[jira] [Created] (YARN-5593) [Umbrella] Add support for YARN Allocation composed of multiple containers/processes

2016-08-30 Thread Arun Suresh (JIRA)
Arun Suresh created YARN-5593:
-

 Summary: [Umbrella] Add support for YARN Allocation composed of 
multiple containers/processes
 Key: YARN-5593
 URL: https://issues.apache.org/jira/browse/YARN-5593
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Arun Suresh
Assignee: Arun Suresh


Opening this to explicitly call out and track some of the ideas that were 
discussed in YARN-1040, specifically the concept of an {{Allocation}} against 
which an AM can start multiple {{Containers}}, as long as the sum of resources 
used by all containers {{fitsIn()}} the Resources leased to the {{Allocation}}.
This is especially useful for AMs that might want to target certain operations 
(like upgrade/restart) at specific containers/processes within an Allocation 
without fear of losing the allocation.
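A toy model of the idea, tracking only memory against the lease; the names are hypothetical:

```java
// An Allocation lease: an AM may start any number of containers against it
// as long as their resource sum fitsIn() the lease; stopping a container
// frees its share without giving up the allocation itself.
public class AllocationSketch {
    private final long leasedMemMb;
    private long usedMemMb;

    public AllocationSketch(long leasedMemMb) {
        this.leasedMemMb = leasedMemMb;
    }

    public boolean fitsIn(long memMb) {
        return usedMemMb + memMb <= leasedMemMb;
    }

    public boolean startContainer(long memMb) {
        if (!fitsIn(memMb)) return false;
        usedMemMb += memMb;
        return true;
    }

    // Stopping (e.g. to upgrade) a container frees its share; the lease
    // itself is never lost.
    public void stopContainer(long memMb) {
        usedMemMb -= memMb;
    }

    public static void main(String[] args) {
        AllocationSketch lease = new AllocationSketch(1024);
        assert lease.startContainer(512);
        assert !lease.startContainer(768);  // would exceed the lease
        lease.stopContainer(512);           // e.g. stopped for an upgrade
        assert lease.startContainer(768);   // the allocation was never lost
        System.out.println("ok");
    }
}
```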






[jira] [Created] (YARN-5580) Refactor Scheduler to take only a single list of container updates

2016-08-29 Thread Arun Suresh (JIRA)
Arun Suresh created YARN-5580:
-

 Summary: Refactor Scheduler to take only a single list of 
container updates
 Key: YARN-5580
 URL: https://issues.apache.org/jira/browse/YARN-5580
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Arun Suresh


This is a follow-up to YARN-5221. It proposes to fix the following:

* The {{AbstractYarnScheduler::allocate}} method should take just a single list 
of container updates rather than a list for each type of update.
* The Container version check is enforced only across updates within an 
allocate call. This must be extended to all outstanding updates for that 
container.
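A sketch of the first bullet, with illustrative names: a single typed list replaces the per-type lists, and the scheduler can still split by type internally when it needs to.

```java
import java.util.Arrays;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

// One list of typed update requests instead of a list per update type.
public class UnifiedUpdates {
    public enum UpdateType { INCREASE, DECREASE, PROMOTE, DEMOTE }

    public static final class UpdateRequest {
        final String containerId;
        final UpdateType type;
        public UpdateRequest(String containerId, UpdateType type) {
            this.containerId = containerId;
            this.type = type;
        }
    }

    // The scheduler can regroup the single list by type internally.
    public static Map<UpdateType, Integer> countByType(List<UpdateRequest> updates) {
        Map<UpdateType, Integer> counts = new EnumMap<>(UpdateType.class);
        for (UpdateRequest u : updates) {
            counts.merge(u.type, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<UpdateRequest> updates = Arrays.asList(
                new UpdateRequest("c1", UpdateType.INCREASE),
                new UpdateRequest("c2", UpdateType.PROMOTE),
                new UpdateRequest("c3", UpdateType.PROMOTE));
        Map<UpdateType, Integer> counts = countByType(updates);
        assert counts.get(UpdateType.PROMOTE) == 2;
        assert counts.get(UpdateType.INCREASE) == 1;
        System.out.println("ok");
    }
}
```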





