[jira] [Commented] (YARN-45) Scheduler feedback to AM to release containers

2013-04-08 Thread Carlo Curino (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-45?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13626139#comment-13626139
 ] 

Carlo Curino commented on YARN-45:
--


High level idea:

The philosophy behind preemption is that we give the AM a heads up about 
resources that are likely to be taken away, and give it the opportunity to save 
the state of its tasks. A separate kill-based mechanism already exists 
(leveraged by the FairScheduler for its preemption) to forcibly recover 
containers.
This is our fallback if the AM does not release containers in a certain amount 
of time (note that due to the quickly evolving conditions of the cluster we 
might not kill a container if at a later time we realize this is not strictly 
needed to achieve fairness/capacity). This means an AM can be written 
completely ignoring preemption hints, and would work correctly (although it 
might waste useful work).

The goal is to allow for smart local policies in the AM, which leverage 
application-level understanding of the ongoing computation to face the imminent 
reduction of resources (e.g., by saving the state of the computation to a 
checkpoint, by promoting partial output, by migrating competencies to other 
tasks, or by trying to complete the work quickly). The goal is to spare the RM 
from understanding application-level optimization concerns, so that it can 
focus on resource management issues. As a consequence we envision (among 
others) preemption requests that are not fully bound to specific containers, 
allowing the AM to leverage some flexibility. Note that the significant "lag" 
imposed by the heartbeat protocols between RM-AM, AM-Tasks and NM-RM forces us 
to consider in most cases preemption actions to be limited to a rather long 
time horizon. We can't expect to operate in a tight sub-second control loop, 
but rather trigger changes in the cluster allocation on the order of tens of 
seconds. As a consequence preemption should be used to correct macroscopic 
issues that are likely to be somewhat stable over time, rather than 
micro-managing container allocations.

We consider the following use cases for preemption: 
# Scheduling policies aimed at rebalancing some global property such as 
capacity or fairness. This allows a queue, for example, to go over capacity 
and give resources back as the cluster conditions change. 
# Scheduling policies that make point decisions about individual 
containers (e.g., preempting a container on a machine and restarting it 
elsewhere to improve data locality, or preempting containers on a box that is 
observing excessive IOs).
# Administrative actions that are aimed at modifying the cluster allocations 
without wasting work (e.g., draining a machine or a rack before taking it 
offline for maintenance, manually reducing allocations for a job, etc.).

Use cases 1 and 3 can be implemented by picking containers at the RM, or by 
expressing a "broad" request for a certain amount of resources (we reuse the 
ResourceRequest for this, in a way that is symmetric to the AM request) and 
letting the AM bind this to specific containers. Use case 2 is more likely to 
be implemented using ContainerIDs.



Protocol change proposal:

Our proposal consists in extending the ResourceResponse with a PreemptRequest 
message (further extensible in the future) that contains a Set<ContainerID> 
and a Set<ResourceRequest>. The current semantics is that these two sets are 
non-overlapping (i.e., if I ask for a specific container and a ResourceRequest 
the AM is supposed to satisfy both). Once again, as we never rely on the AM to 
"enforce" preemption but we have a kill-based fallback, the AM implementation 
is not required to understand the preemption requests (nor even to acknowledge 
receiving them). This makes for a simple upgrade story, and one could run mixed 
preemption-aware and not-preemption-aware AMs on the same cluster. 
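
To make the shape of the message concrete, here is a minimal sketch, not the 
code in the attached patch. All class and field names are hypothetical 
stand-ins (container IDs are plain strings, AskSketch plays the role of a 
ResourceRequest), and memory is the only resource modeled.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical stand-in for a YARN ResourceRequest; only the fields used here.
final class AskSketch {
    final String location;    // "*", a rack name, or a host name
    final int memoryMb;       // size of each container being asked back
    final int numContainers;  // how many containers of that size

    AskSketch(String location, int memoryMb, int numContainers) {
        this.location = location;
        this.memoryMb = memoryMb;
        this.numContainers = numContainers;
    }
}

// Hypothetical shape of the PreemptRequest message: both sets may be present.
final class PreemptRequestSketch {
    private final Set<String> containerIds;    // specific containers the RM wants back
    private final Set<AskSketch> resourceAsks; // "broad" asks the AM binds to containers

    PreemptRequestSketch(Set<String> containerIds, Set<AskSketch> resourceAsks) {
        // Defensive copies: a message should not change after it is built.
        this.containerIds = Collections.unmodifiableSet(new LinkedHashSet<>(containerIds));
        this.resourceAsks = Collections.unmodifiableSet(new LinkedHashSet<>(resourceAsks));
    }

    Set<String> containerIds() { return containerIds; }
    Set<AskSketch> resourceAsks() { return resourceAsks; }

    // Memory the AM is asked to give back through the broad asks alone.
    int broadMemoryMb() {
        int total = 0;
        for (AskSketch ask : resourceAsks) {
            total += ask.memoryMb * ask.numContainers;
        }
        return total;
    }
}
```

An AM that is not preemption-aware would simply never read this field, which is 
what keeps the upgrade story simple.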

A current open question we would like input on is whether the PreemptRequest 
should be a union type where we have either set (but not both together), or 
whether to allow, as we do in the attached patch, for both to co-exist in the 
same PreemptRequest. We do not have a current need for the "both" use case, 
but maybe others do. Thoughts?



Coming up next:

We are cleaning up further patches to the FairScheduler, CapacityScheduler and 
ApplicationMasterService leveraging this AM-RM protocol, and changes to the 
mapreduce AM that implement work-saving preemption via checkpointing for 
Shuffle and Reducers (while for Mappers we are currently "making a run for it" 
given the commonly short runtime of maps). The other patches will be posted 
soon. 


> Scheduler feedback to AM to release containers
> --
>
> Key: YARN-45
> URL: https://issues.apache.org/jira/browse/YARN-45
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Com

[jira] [Commented] (YARN-45) Scheduler feedback to AM to release containers

2013-04-08 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-45?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13626162#comment-13626162
 ] 

Hadoop QA commented on YARN-45:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12577679/YARN-45.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:red}-1 findbugs{color}.  The patch appears to introduce 3 new 
Findbugs (version 1.3.9) warnings.

{color:red}-1 release audit{color}.  The applied patch generated 2 
release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/691//testReport/
Release audit warnings: 
https://builds.apache.org/job/PreCommit-YARN-Build/691//artifact/trunk/patchprocess/patchReleaseAuditProblems.txt
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-YARN-Build/691//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-yarn-api.html
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/691//console

This message is automatically generated.

> Scheduler feedback to AM to release containers
> --
>
> Key: YARN-45
> URL: https://issues.apache.org/jira/browse/YARN-45
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: resourcemanager
>Reporter: Chris Douglas
>Assignee: Carlo Curino
> Attachments: YARN-45.patch
>
>
> The ResourceManager strikes a balance between cluster utilization and strict 
> enforcement of resource invariants in the cluster. Individual allocations of 
> containers must be reclaimed- or reserved- to restore the global invariants 
> when cluster load shifts. In some cases, the ApplicationMaster can respond to 
> fluctuations in resource availability without losing the work already 
> completed by that task (MAPREDUCE-4584). Supplying it with this information 
> would be helpful for overall cluster utilization [1]. To this end, we want to 
> establish a protocol for the RM to ask the AM to release containers.
> [1] http://research.yahoo.com/files/yl-2012-003.pdf

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-45) Scheduler feedback to AM to release containers

2013-04-10 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-45?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13628247#comment-13628247
 ] 

Alejandro Abdelnur commented on YARN-45:


Nice, the proposed functionality comes in quite handy for some stuff I'm 
working on.

Regarding the question on how to model the PreemptRequest, have you thought 
about the following alternative?

* The PreemptRequest would contain only a Set<PreemptResource>.
* PreemptResource has the following properties: a String location and a 
Resource capability.
* The PreemptResource location can be ANY, a rack or a node.
* The PreemptResource capability is the total capacity that should be released.
* The AM, if taking the hint, would release containers that match the location 
and add up to the PreemptResource capability.

By doing this, you give full control to the AM to decide what to release by 
grouping any containers that match the location.
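
A rough sketch of how an AM might act on such a hint, purely for illustration. 
The class names, the "*" spelling of ANY, the per-container cost estimate, and 
the cheapest-first policy are all assumptions; only memory is modeled.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical view of a container the AM currently holds.
final class HeldContainer {
    final String id;
    final String node;
    final String rack;
    final int memoryMb;
    final int costToPreempt; // AM-local estimate of the work lost if released

    HeldContainer(String id, String node, String rack, int memoryMb, int costToPreempt) {
        this.id = id;
        this.node = node;
        this.rack = rack;
        this.memoryMb = memoryMb;
        this.costToPreempt = costToPreempt;
    }
}

// Hypothetical PreemptResource: a location plus the total capability to release.
final class PreemptResourceSketch {
    static final String ANY = "*";
    final String location; // ANY, a rack, or a node
    final int memoryMb;    // total memory that should be released

    PreemptResourceSketch(String location, int memoryMb) {
        this.location = location;
        this.memoryMb = memoryMb;
    }

    boolean matches(HeldContainer c) {
        return ANY.equals(location) || location.equals(c.node) || location.equals(c.rack);
    }
}

final class ReleasePlanner {
    // Picks the cheapest matching containers until the requested capability is covered.
    static List<HeldContainer> pick(PreemptResourceSketch req, List<HeldContainer> held) {
        List<HeldContainer> candidates = new ArrayList<>();
        for (HeldContainer c : held) {
            if (req.matches(c)) {
                candidates.add(c);
            }
        }
        candidates.sort(Comparator.comparingInt((HeldContainer c) -> c.costToPreempt));
        List<HeldContainer> toRelease = new ArrayList<>();
        int freed = 0;
        for (HeldContainer c : candidates) {
            if (freed >= req.memoryMb) {
                break;
            }
            toRelease.add(c);
            freed += c.memoryMb;
        }
        return toRelease;
    }
}
```

If the matching containers do not add up to the requested capability, this 
sketch returns all of them; what the RM does about the remainder is outside 
its scope.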

Thoughts?




[jira] [Commented] (YARN-45) Scheduler feedback to AM to release containers

2013-04-10 Thread Carlo Curino (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-45?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13628305#comment-13628305
 ] 

Carlo Curino commented on YARN-45:
--

Alejandro, thanks for the feedback, and yes you are spot on. I think what you 
propose is akin to the Set<ResourceRequest> we have (which is similar, if I 
understand correctly, to the PreemptResource thing you describe). We plan to 
support this, and it does cover one set of use cases very well, i.e., when we 
have a "broad" request and we are ok with the AM resolving this as it sees fit. 
As you point out this is good because it allows the AM to be smart about what 
to return and thus more likely to save expensive preemptions in favor of cheap 
ones, or even return a container which is not data-local in place of one that 
is data-local, etc. 

However, this feels contrived when we know precisely what we want back from a 
certain AM (e.g., we want to preempt a specific container). For this purpose 
the Set<ContainerID>-based preemption is easier to use, and also simplifies the 
bookkeeping done in the RM (in our preemption policy) to decide when to "kill" 
a container if the AM does not preempt it within a certain timeout. This is a 
good match with the FairScheduler internals and we adapted CapacityScheduler to 
leverage this too by means of a preemption monitor. This will be clearer 
when we release the actual monitor (in the next few days), but the idea is that 
if we talk to the AM in terms of a Set<ContainerID> there is no ambiguity in 
detecting when the AM is ignoring us, and thus we have to move on with 
container killing (e.g., to enforce capacity/fairness). On the contrary, using 
ResourceRequest or something like that, we might not know whether the resource 
I want back now is the same I wanted in some previous iteration (hence I am 
being ignored by the AM) or they just happen to be the same/similar. 
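
The container-based bookkeeping described above can be sketched roughly as 
follows. This is an illustration under assumptions, not the preemption monitor 
itself: the class name, the string container IDs, and the single fixed grace 
period are all hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Tracks when each container was first asked back and reports the ones
// whose grace period has expired.
final class PreemptionLedger {
    private final long graceMillis;
    private final Map<String, Long> firstAsked = new HashMap<>();

    PreemptionLedger(long graceMillis) {
        this.graceMillis = graceMillis;
    }

    // One policy pass. stillWanted holds the containers the policy wants back
    // right now and that are still running. Returns the containers to kill.
    List<String> update(Set<String> stillWanted, long nowMillis) {
        // Forget containers that were released or are no longer needed back.
        firstAsked.keySet().retainAll(stillWanted);
        List<String> toKill = new ArrayList<>();
        for (String id : stillWanted) {
            Long since = firstAsked.putIfAbsent(id, nowMillis);
            if (since != null && nowMillis - since >= graceMillis) {
                toKill.add(id);
            }
        }
        // Killed containers need no further tracking.
        firstAsked.keySet().removeAll(toKill);
        return toKill;
    }
}
```

Because the key is a container ID, a container that drops out of the wanted 
set and later reappears starts a fresh grace period. With a resource-based 
request there is no such stable key, which is the ambiguity noted above.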

If we can devise a simple way to leverage a single resource-based 
representation for both scenarios I would be happy to drop the 
Set<ContainerID>, but so far we haven't found a clean way to do it, so we 
provisioned for both Set<ContainerID> and/or Set<ResourceRequest> to be 
optionally part of a PreemptRequest. The current semantics is that these are 
disjoint sets of resources we want (some called out as containers, and some 
expressed as resources), but we don't have a strong reason for this not to be a 
tagged union.

Do you think the above covers the use case you have in mind or am I missing 
something? (BTW I am very curious to hear what your use case is.)






[jira] [Commented] (YARN-45) Scheduler feedback to AM to release containers

2013-04-10 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-45?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13628358#comment-13628358
 ] 

Alejandro Abdelnur commented on YARN-45:


Carlo, I may be missing something then.

From your description I'm understanding that a PreemptRequest could contain 
either Set<ContainerID> or Set<ResourceRequest> but not both.

If I'm correct with this assumption, then if the RM chooses to send 
Set<ContainerID> then we are back to square one, where the RM is deciding what 
to kill and is just giving a heads up.

If the idea is that the RM will send a PreemptRequest containing both 
Set<ContainerID> and Set<ResourceRequest> which are equivalent (just 2 ways of 
expressing the same amount of resources), then it seems OK. In this case, the 
Set<ContainerID> is just a convenience for the AM so it does not have to dig 
through its internal data structures. But you seem to indicate this is not the 
case in your second-to-last paragraph.

I'd argue that the Set<ContainerID> is just an early warning; it does not 
delegate the choice to the AM. The fact that the AM could decide to get rid of 
another container in the same location and make this preemption go away 
seems twisted.

Regardless of the convenience because of the implementation, I don't think the 
RM should care which containers the AM chooses to release, only the amount of 
resources.

My specific use case is that the AM should get an amount of resources to 
preempt and decide which containers are best to release.




[jira] [Commented] (YARN-45) Scheduler feedback to AM to release containers

2013-04-10 Thread Carlo Curino (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-45?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13628477#comment-13628477
 ] 

Carlo Curino commented on YARN-45:
--

This is still a point we are discussing and it is not fully settled, which is 
why it comes out confusing and why we were soliciting opinions. 

Your observations I think are helping us frame this a bit better. We can see 
three possible uses of preemption:

1) A preemption policy that does not necessarily trust the AM, picks containers 
and lists them as a Set<ContainerID>, and gives the AM a heads up on who is 
going to die soon if it is not preempted. Note that if the AM is mapreduce this 
is not too bad, as we know how containers are used (maps before reducers) and 
so we can pick containers in a reasonable order. We have been testing a policy 
that does this, and it works well in our tests. Also this is a perfect match 
with how the FairScheduler thinks about preemption.

2) A preemption policy that trusts the AM and specifies preemption as a 
Set<ResourceRequest>. This works well for known AMs that we know try to honor 
the preemption requests, and/or if we do not care about force-killing anyway 
and preemption requests are best-effort. We have played around with a version 
of this too. If I am not mistaken this is also the case you care the most 
about, right?

3) A version of 2 which also enforces its preemption requests via killing if 
they are not satisfied within a certain period of time. This is non-trivial to 
build, as there is inherent ambiguity in how ResourceRequests are mapped to 
containers over time, so the enforcement part is hard to get right / prove 
correctness for. 

We believe that 3 might be the ideal point to tend toward, but proving its 
correctness is non-trivial and would require deeper surgery to the 
RM/Schedulers. For example, if at a subsequent moment in time I want the same 
amount of resources out of an AM, it is hard to unambiguously decide whether 
this is due to an AM not preempting as I asked (so just forcibly killing its 
containers is fine), or whether these are subsequent and independent requests 
for resources (so I should not kill but wait).

The proposed protocol, with the change that makes it a tagged union of 
Set<ContainerID> and Set<ResourceRequest>, seems to allow for all of the 
above, and to be easy to explain. I will update the patch to reflect this if 
you agree.
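
For reference, one way a tagged union of the two sets could look. This is only 
a sketch with hypothetical names, and both payloads are reduced to sets of 
strings to keep it short; the updated patch may well model it differently.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;

// A PreemptRequest that carries exactly one of the two kinds of payload.
final class PreemptUnionSketch {
    enum Kind { CONTAINERS, RESOURCES }

    private final Kind kind;
    private final Set<String> payload;

    // Private constructor: instances are only built through the two factories,
    // so a request can never carry both kinds at once.
    private PreemptUnionSketch(Kind kind, Set<String> payload) {
        this.kind = kind;
        this.payload = Collections.unmodifiableSet(new LinkedHashSet<>(payload));
    }

    static PreemptUnionSketch ofContainers(Set<String> containerIds) {
        return new PreemptUnionSketch(Kind.CONTAINERS, containerIds);
    }

    static PreemptUnionSketch ofResources(Set<String> resourceAsks) {
        return new PreemptUnionSketch(Kind.RESOURCES, resourceAsks);
    }

    Kind kind() { return kind; }

    Set<String> containers() {
        require(Kind.CONTAINERS);
        return payload;
    }

    Set<String> resources() {
        require(Kind.RESOURCES);
        return payload;
    }

    private void require(Kind expected) {
        if (kind != expected) {
            throw new IllegalStateException("request is " + kind + ", not " + expected);
        }
    }
}
```

A receiver would switch on kind() first and then read the matching accessor; 
reading the other one fails fast instead of silently returning an empty set.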

 




[jira] [Commented] (YARN-45) Scheduler feedback to AM to release containers

2013-04-10 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-45?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13628503#comment-13628503
 ] 

Alejandro Abdelnur commented on YARN-45:


My soft objection to #1 is that it is just telling me 'you are over capacity, 
better get rid of stuff; this is our choice to kill, but you could release 
others and you are good'. So why not just tell me the amount of capacity I 
should release to be safe?

IMO, if an AM will deal with the complexity of this functionality it should be 
able to map  to containers locally and then decide what to release.
 
IMO #2 is the one I care about, as it truly gives the AM the flexibility to 
decide what containers to get rid of based on the specified resources. Yes, I 
prefer this one.




[jira] [Commented] (YARN-45) Scheduler feedback to AM to release containers

2013-04-10 Thread Carlo Curino (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-45?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13628625#comment-13628625
 ] 

Carlo Curino commented on YARN-45:
--

Agreed.

As for #1, your previous comments made us indeed "simplify" #1 as follows:
We inform the AM that a Set<ContainerID> will be killed unless it preempts 
them (the exact same containers). We dropped the "trading these containers for 
equivalent ones" as we agreed with your comments that it would be too funky. 
The rationale behind including this simple container-based preemption is 
twofold: 
 a) it matches very well with what the FairScheduler does today (we simply 
provide a cheaper form of preemption w.r.t. the straight-up kill it used to 
do), and 
 b) it allows for compact bookkeeping for "kill if no preemption happens" in a 
policy we wrote to add preemption to the CapacityScheduler, which seems to 
behave well.

As for #2 I totally agree this is important to have, and it has lots of 
potential since it empowers the AM to make smart local decisions (it is well 
aligned with the overall spirit of Yarn I think). 
We will handle this both in the RM and AM in future patches. Where "future" = 
we have the code, but need a polish before posting.

Cheers,
Carlo



[jira] [Commented] (YARN-45) Scheduler feedback to AM to release containers

2013-04-11 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-45?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13628889#comment-13628889
 ] 

Alejandro Abdelnur commented on YARN-45:


Carlo, what about a small twist?

A preempt message (instead of request, as there is no preempt response) would 
contain:

* Resources (number of CPUs and amount of memory): total amount of resources 
that may be preempted if no action is taken by the AM.
* Set<ContainerID>: list of containers that would be killed by the RM to claim 
the resources if no action is taken by the AM.

Computing the resources is straightforward, just aggregating the resources of 
the Set<ContainerID>.

An AM can take action using either piece of information.

If an AM releases the requested amount of resources, even if they don't match 
the received container IDs, then the AM will not be over threshold anymore, 
thus getting rid of the preemption pressure fully or partially. If the AM 
fulfills the preemption only partially, then the RM will still kill some 
containers from the set.

As the set is not ordered, the AM still does not know exactly which containers 
will be killed. So the set is just the list of containers in danger of being 
preempted.

I may be backtracking a bit on my previous comments: 'trading these containers 
for equivalent ones' seems acceptable and gives the scheduler some freedom on 
how to best take care of things if an AM is over limit. If an AM releases the 
requested amount of resources, regardless of which containers it releases, the 
AM won't be preempted for this preemption message. We just need to clearly 
spell out the behavior.
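
To spell out one possible reading of that behavior, here is a small sketch. 
Everything in it is hypothetical: the class names, the string container IDs, 
the choice to measure the shortfall in memory only, and the list-order kill 
policy (which is arbitrary, consistent with the set being unordered).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// A container listed in the preempt message as being in danger.
final class AtRiskContainer {
    final String id;
    final int memoryMb;
    final int vcores;

    AtRiskContainer(String id, int memoryMb, int vcores) {
        this.id = id;
        this.memoryMb = memoryMb;
        this.vcores = vcores;
    }
}

final class PreemptMessageSketch {
    final List<AtRiskContainer> atRisk;
    final int totalMemoryMb; // aggregated from the at-risk containers
    final int totalVcores;   // aggregated from the at-risk containers

    PreemptMessageSketch(List<AtRiskContainer> atRisk) {
        this.atRisk = new ArrayList<>(atRisk);
        int mem = 0;
        int cores = 0;
        for (AtRiskContainer c : atRisk) {
            mem += c.memoryMb;
            cores += c.vcores;
        }
        this.totalMemoryMb = mem;
        this.totalVcores = cores;
    }

    // Containers the RM would still kill after the AM released releasedMemoryMb
    // in total. releasedIds lists containers the AM already gave back, so they
    // are not killed again.
    List<AtRiskContainer> stillToKill(Set<String> releasedIds, int releasedMemoryMb) {
        int shortfall = totalMemoryMb - releasedMemoryMb;
        List<AtRiskContainer> kills = new ArrayList<>();
        for (AtRiskContainer c : atRisk) {
            if (shortfall <= 0) {
                break;
            }
            if (releasedIds.contains(c.id)) {
                continue;
            }
            kills.add(c);
            shortfall -= c.memoryMb;
        }
        return kills;
    }
}
```

With three at-risk containers of 1024, 2048 and 1024 MB, an AM that frees 
4096 MB through any containers of its choosing sees no kills, while one that 
frees only 2048 MB elsewhere still loses containers from the set until the 
remaining 2048 MB is covered.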

With this approach I think we don't need #1 and #2?

Thoughts?





[jira] [Commented] (YARN-45) Scheduler feedback to AM to release containers

2013-04-11 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-45?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13628893#comment-13628893
 ] 

Alejandro Abdelnur commented on YARN-45:


Forgot to add, unless I'm missing something location of the preemption is not 
important, just capacity, right?



[jira] [Commented] (YARN-45) Scheduler feedback to AM to release containers

2013-04-11 Thread Carlo Curino (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-45?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13628922#comment-13628922
 ] 

Carlo Curino commented on YARN-45:
--

Our main focus for now is to rebalance capacity; in this sense yes, location is 
not important. 

However, one can envision the use of preemption also for other things, e.g., to 
build a monitor that tries to improve data-locality by issuing (a moderate 
amount of) "relocations" of a container (probably riding the same checkpointing 
mechanics we are building for MR). 

This is another case where container-based preemption can turn out to be 
useful. (This is at the moment just speculation.)




[jira] [Commented] (YARN-45) Scheduler feedback to AM to release containers

2013-04-11 Thread Carlo Curino (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-45?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13628930#comment-13628930
 ] 

Carlo Curino commented on YARN-45:
--

Sorry, I read only your last comment and answered that... 

Regarding your previous "larger" comment:
- what you propose is somewhat of a combination of 1 and 2 above, where we give 
the AM a hint about what would happen at the container level if the pressure 
remains. I don't have strong feelings about it; I agree it is easy to do, and 
it may be a good compromise.
- however, I want to be able to maintain the tighter semantics of 1 (in case 
the ResourceRequest is not specified in the message), which forces the AM to 
preempt exactly the set of containers I am specifying. (With a very 
"targeted" ResourceRequest you can in practice do something similar.) This 
covers use cases like the one I mentioned above.

We are posting more code in YARN-567, YARN-568, and YARN-569; check it out, it 
might provide context for this conversation.  




[jira] [Commented] (YARN-45) Scheduler feedback to AM to release containers

2013-04-11 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-45?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13628938#comment-13628938
 ] 

Alejandro Abdelnur commented on YARN-45:


I'm just trying to see if we can have (at least for now) a single message type 
instead of two that satisfies the use cases. Regarding keeping the tighter 
semantics, if it is not difficult/complex, I'm OK with it. Thanks.



[jira] [Commented] (YARN-45) Scheduler feedback to AM to release containers

2013-04-11 Thread Carlo Curino (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-45?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13628950#comment-13628950
 ] 

Carlo Curino commented on YARN-45:
--

Agreed on a single message, where the semantics are:
1) if both the Set (of containers) and the ResourceRequest are specified, then 
it is what was said (they overlap, and you have to give me back at least the 
resources I ask for, otherwise these containers are at risk of getting killed)
2) if only the Set is specified, it is the "stricter" semantics of "I want 
these containers back and nothing else".
3) if only the ResourceRequest is specified, the semantics are "please give me 
back this many resources" without binding which containers are at risk (this 
might be good for policies that do not want to think about containers unless it 
is really time to kill them).

Does this work for you? It seems to capture the combination of what we have 
proposed so far.
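
To make the three cases concrete, here is a minimal Java sketch of how an AM 
might classify an incoming message by which parts are present. It is only an 
illustration: the class, enum and parameter names are hypothetical and do not 
come from the YARN-45 patch, containers are stood in for by plain string IDs, 
and the ResourceRequest is reduced to a single memory figure.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch: none of these names come from the YARN-45 patch.
final class PreemptionHintSketch {

    enum Semantics { NONE, NEGOTIABLE, STRICT, RESOURCES_ONLY }

    /**
     * Classifies a preemption message by which parts are present.
     * containers:  IDs of the containers named by the RM (empty if unspecified).
     * requestedMb: stand-in for the ResourceRequest, in MB (0 if unspecified).
     */
    static Semantics classify(Set<String> containers, int requestedMb) {
        boolean hasSet = !containers.isEmpty();
        boolean hasRequest = requestedMb > 0;
        if (hasSet && hasRequest) {
            // Case 1: return at least the requested resources, from any
            // containers, or the named containers are at risk of being killed.
            return Semantics.NEGOTIABLE;
        }
        if (hasSet) {
            // Case 2: the RM wants exactly these containers back.
            return Semantics.STRICT;
        }
        if (hasRequest) {
            // Case 3: return this many resources; no containers are named.
            return Semantics.RESOURCES_ONLY;
        }
        return Semantics.NONE;
    }

    public static void main(String[] args) {
        Set<String> atRisk = new HashSet<>(Arrays.asList("container_1", "container_2"));
        Set<String> none = Collections.emptySet();
        System.out.println(classify(atRisk, 2048));
        System.out.println(classify(atRisk, 0));
        System.out.println(classify(none, 2048));
    }
}
```

An AM that ignores preemption hints entirely would simply never call anything 
like this, which is still correct under the fallback kill mechanism.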



[jira] [Commented] (YARN-45) Scheduler feedback to AM to release containers

2013-04-11 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-45?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13629070#comment-13629070
 ] 

Alejandro Abdelnur commented on YARN-45:


sounds good



[jira] [Commented] (YARN-45) Scheduler feedback to AM to release containers

2013-04-11 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-45?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13629620#comment-13629620
 ] 

Bikas Saha commented on YARN-45:


All API changes at this point are being tracked under YARN-386

> Scheduler feedback to AM to release containers
> --
>
> Key: YARN-45
> URL: https://issues.apache.org/jira/browse/YARN-45
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Chris Douglas
>Assignee: Carlo Curino
> Attachments: YARN-45.patch
>
>
> The ResourceManager strikes a balance between cluster utilization and strict 
> enforcement of resource invariants in the cluster. Individual allocations of 
> containers must be reclaimed- or reserved- to restore the global invariants 
> when cluster load shifts. In some cases, the ApplicationMaster can respond to 
> fluctuations in resource availability without losing the work already 
> completed by that task (MAPREDUCE-4584). Supplying it with this information 
> would be helpful for overall cluster utilization [1]. To this end, we want to 
> establish a protocol for the RM to ask the AM to release containers.
> [1] http://research.yahoo.com/files/yl-2012-003.pdf

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-45) Scheduler feedback to AM to release containers

2013-04-11 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-45?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13629635#comment-13629635
 ] 

Karthik Kambatla commented on YARN-45:
--

Great discussion, glad to see this coming along well. Carlo's latest comment 
makes sense to me.

Let me know if I understand it right: ResourceRequest part of the message can 
capture locality, the AM will try to give back Resources on each node as per 
this locality information?



[jira] [Commented] (YARN-45) Scheduler feedback to AM to release containers

2013-04-11 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-45?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13629638#comment-13629638
 ] 

Karthik Kambatla commented on YARN-45:
--

[~bikassaha], shouldn't this be under YARN-397?



[jira] [Commented] (YARN-45) Scheduler feedback to AM to release containers

2013-04-11 Thread Carlo Curino (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-45?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13629660#comment-13629660
 ] 

Carlo Curino commented on YARN-45:
--

[~kkambatl], yes ResourceRequests can be used to capture locality preferences. 
In our first use we focus on capacity, so the RM policies are not very 
picky/aware of location, but we think it is good to build this into the 
protocol for later use (as commented above somewhere). 

(As for the last comment: we moved YARN-567, YARN-568, YARN-569 that will use 
this protocol into YARN-397, while this one is probably part of YARN-386).



[jira] [Commented] (YARN-45) Scheduler feedback to AM to release containers

2013-04-11 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-45?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13629662#comment-13629662
 ] 

Bikas Saha commented on YARN-45:


Moved to sub-task of YARN-397 for scheduler API changes.



[jira] [Commented] (YARN-45) Scheduler feedback to AM to release containers

2013-04-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-45?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13629691#comment-13629691
 ] 

Hadoop QA commented on YARN-45:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12578337/YARN-45.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:red}-1 javac{color:red}.  The patch appears to cause the build to 
fail.

Console output: https://builds.apache.org/job/PreCommit-YARN-Build/722//console

This message is automatically generated.

> Scheduler feedback to AM to release containers
> --
>
> Key: YARN-45
> URL: https://issues.apache.org/jira/browse/YARN-45
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Chris Douglas
>Assignee: Carlo Curino
> Attachments: YARN-45.patch, YARN-45.patch
>
>
> The ResourceManager strikes a balance between cluster utilization and strict 
> enforcement of resource invariants in the cluster. Individual allocations of 
> containers must be reclaimed- or reserved- to restore the global invariants 
> when cluster load shifts. In some cases, the ApplicationMaster can respond to 
> fluctuations in resource availability without losing the work already 
> completed by that task (MAPREDUCE-4584). Supplying it with this information 
> would be helpful for overall cluster utilization [1]. To this end, we want to 
> establish a protocol for the RM to ask the AM to release containers.
> [1] http://research.yahoo.com/files/yl-2012-003.pdf

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-45) Scheduler feedback to AM to release containers

2013-04-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-45?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13629707#comment-13629707
 ] 

Hadoop QA commented on YARN-45:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12578339/YARN-45.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/723//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/723//console

This message is automatically generated.



[jira] [Commented] (YARN-45) Scheduler feedback to AM to release containers

2013-04-11 Thread Carlo Curino (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-45?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13629806#comment-13629806
 ] 

Carlo Curino commented on YARN-45:
--

Note: we don't have tests because there are no tests for the rest of the 
protocol buffer messages either (they would mostly be validating 
auto-generated code).  



[jira] [Commented] (YARN-45) Scheduler feedback to AM to release containers

2013-04-12 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-45?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13630734#comment-13630734
 ] 

Sandy Ryza commented on YARN-45:


Carlo,
I'm glad that this is being proposed.  Have you considered including how long 
the grace period is in the response?



[jira] [Commented] (YARN-45) Scheduler feedback to AM to release containers

2013-04-12 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-45?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13630769#comment-13630769
 ] 

Bikas Saha commented on YARN-45:


I like the idea of the RM giving information to the AM about actions that it 
might take which will affect the AM. However, I am wary of having the action 
taken in different places, e.g., the KILL to the containers should come from 
the RM or the AM exclusively, but not from both. Otherwise we open ourselves up 
to race conditions, unnecessary kills and complex logic in the RM.

Preemption is something that, IMO, the RM needs to do at the very last moment, 
when there is no other way for resources to be freed up. If we decide to 
preempt at time T1 and then actually preempt at time T2, the cluster 
conditions may have changed between T1 and T2, which may invalidate the 
decisions taken at T1. New resources may have freed up that reduce the number 
of containers to be killed. This sub-optimality is directly proportional to 
the length of time between T1 and T2, so ideally we want to keep T1=T2. One can 
argue that things can also change after the preemption in ways that would have 
made the preemption unnecessary, so the above argument for T1=T2 is fallacious. 
However, preemption policies are usually based on deadlines, such as "the 
allocation of queue1 must be met within X seconds". So the RM does not have the 
luxury of waiting for X+1 seconds. The best it can do is to wait up to X 
seconds in the hope that things will work out, and at X redistribute resources 
to meet the deficit.
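
The "wait up to X, then cover only what is still missing" rule can be written 
down in a few lines. The Java sketch below is an illustration under simplifying 
assumptions, not RM code: the names are hypothetical, resources are a single 
memory figure in MB, and the queue is assumed to have enough pending demand to 
use its full guarantee.

```java
// Hypothetical sketch of deadline-based preemption; not taken from any patch.
final class DeadlinePreemptionSketch {

    /** Resources a queue is still short of its guarantee, in MB; never negative. */
    static int deficitMb(int guaranteedMb, int allocatedMb) {
        return Math.max(0, guaranteedMb - allocatedMb);
    }

    /**
     * How much to preempt right now for one under-served queue.
     * Nothing is preempted before the deadline X; at or after X, only the
     * deficit measured at that moment is reclaimed, so resources that freed
     * up on their own during the wait reduce the number of kills.
     */
    static int toPreemptMb(long elapsedSec, long deadlineSec,
                           int guaranteedMb, int allocatedNowMb) {
        if (elapsedSec < deadlineSec) {
            return 0; // still hoping things work out without killing anything
        }
        return deficitMb(guaranteedMb, allocatedNowMb);
    }

    public static void main(String[] args) {
        // Queue guaranteed 4096 MB, deadline X = 10 seconds.
        System.out.println(toPreemptMb(5, 10, 4096, 1024));
        System.out.println(toPreemptMb(10, 10, 4096, 3072));
    }
}
```

The decision is taken with the allocation observed at time X rather than the 
one observed when the shortfall was first noticed, which is the T1=T2 point 
made above.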

At the same time, I can see that there is an argument that the AM knows best 
how to free up its resources. It will be good to remember that the AM has 
already informed the RM about the importance of all its containers when it made 
the requests at different priorities. So the RM knows the order of importance 
of the containers and the RM also knows the amount of time each container has 
been allocated. Assuming container runtime as a proxy for container work done, 
this data can be used by the RM to preempt in a work preserving manner without 
having to talk to the AM.
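
As a rough illustration of that idea, the Java sketch below orders candidate 
containers by importance and then by runtime, and picks victims until a target 
amount of memory is covered. Everything in it is an assumption made for the 
example: the class and field names are hypothetical, a larger priority number 
is taken to mean "less important", and runtime is used as the proxy for work 
done, as suggested above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch of RM-side victim selection; not actual scheduler code.
final class VictimOrderSketch {

    static final class Candidate {
        final String id;
        final int priority;     // assumption: larger number = less important
        final long runtimeSec;  // proxy for the amount of work already done
        final int memoryMb;

        Candidate(String id, int priority, long runtimeSec, int memoryMb) {
            this.id = id;
            this.priority = priority;
            this.runtimeSec = runtimeSec;
            this.memoryMb = memoryMb;
        }
    }

    /** Picks the least important, youngest containers until targetMb is covered. */
    static List<String> pick(List<Candidate> running, int targetMb) {
        List<Candidate> sorted = new ArrayList<>(running);
        sorted.sort(Comparator
                .comparingInt((Candidate c) -> c.priority).reversed() // least important first
                .thenComparingLong(c -> c.runtimeSec));               // then least work lost
        List<String> victims = new ArrayList<>();
        int freedMb = 0;
        for (Candidate c : sorted) {
            if (freedMb >= targetMb) {
                break;
            }
            victims.add(c.id);
            freedMb += c.memoryMb;
        }
        return victims;
    }

    public static void main(String[] args) {
        List<Candidate> running = Arrays.asList(
                new Candidate("a", 10, 300, 1024),
                new Candidate("b", 20, 500, 1024),
                new Candidate("c", 20, 30, 1024));
        System.out.println(pick(running, 2048));
    }
}
```

Runtime is only a stand-in here; an AM that knows a long-running container is 
about to finish has information this ordering cannot see.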

Notifying the AM has the usefulness of allowing the AM to take actions that 
preserve work such as checkpointing. However, IMO, the AM should only do 
checkpointing operations but not kill the containers. That should still happen 
at the RM as the very last option at the last moment. If the situation changes 
in the grace period and the containers do not need to be killed then there is 
no point in the AM killing them right now. This also lets us increase the grace 
period to a longer time because checkpointing and preserving work usually means 
persisting data in a stable store and may be slow in practical scenarios.

To summarize, I would propose an API in which the RM tells the AM about exactly 
which containers it might imminently preempt with the contract being that the 
AM could take actions to preserve the work done in those containers. The AM can 
continue to run those containers until the RM actually preempts them if needed. 
If we really think that the choice of containers needs to be made at the AM, 
then the AM needs to checkpoint those containers and inform the RM about the 
containers it has chosen. But the final decision to send the kill must be made 
by the RM.



[jira] [Commented] (YARN-45) Scheduler feedback to AM to release containers

2013-04-12 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-45?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13630893#comment-13630893
 ] 

Chris Douglas commented on YARN-45:
---

[~sandyr]: Yes, but the correct format/semantics for time are a complex 
discussion in themselves. To keep this easy to review and the discussion 
focused, we were going to file that separately. But I totally agree: for the AM 
to respond intelligently, the time before it's forced to give up the container 
is valuable input.

[~bikash]: Agree almost completely. In YARN-569, the hysteresis you cite 
motivated several design points, including multiple dampers on actions taken by 
the preemption policy, out-of-band observation/enforcement, and no effort to 
fine-tune particular allocations. The role of preemption (to summarize what 
[~curino] discussed in detail in the prenominate JIRA) is to make coarse 
corrections around the core scheduler invariants (e.g., capacity, fairness). 
Rather than introducing new races or complexity, one could argue that 
preemption is a dual of allocation in an inconsistent environment.

Your proposal matches case (1) in the above 
[comment|https://issues.apache.org/jira/browse/YARN-45?focusedCommentId=13628950&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13628950],
 where the RM specifies the set of containers in jeopardy and a contract (as 
{{ResourceRequest}}) for avoiding the kills, should the AM have cause to pick 
different containers. Further, your observation that the RM has enough 
information in priorities, etc. to make an educated guess at those containers 
is spot-on. IIRC, the policy uses allocation order when selecting containers, 
but that should be a secondary key after priority.

The disputed point, and I'm not sure we actually disagree, is the claim that 
the AM should never kill things in response to this message. To be fair, that 
can be implemented by just ignoring the requests, so it's orthogonal to this 
particular protocol, but it's certainly an important "best practice" to discuss 
to ensure we're capturing the right thing. Certainly there are many cases where 
ignoring the message is correct; most CDFs of map task execution time show that 
over 80% finish in less than a minute, so the AM has few reasons to 
pessimistically kill them.

There are a few scenarios where this isn't optimal. Take the case of YARN-415, 
where the AM is billed cumulatively for cluster time. Assume an AM knows (a) 
the container will not finish (reinforcing [~sandyr]'s point about including 
time in the preemption message) and (b) the work done is not worth 
checkpointing. It can conclude that killing the container is in its best 
interest, because squatting on the resource could affect its ability to get 
containers in the future (or simply cost more). Moreover, for long-lived 
services and speculative container allocation/retention, the AM may actually be 
holding the container only as an optimization or for a future execution, so it 
could release it at low cost to itself. Finally, the time allowed before the RM 
starts killing containers can be extended if AMs typically return resources 
before the deadline.

It's also a mechanism for the RM to advise the AM about constraints that 
prevent it from granting its pending requests. The AM currently kills reducers 
if it can't get containers to regenerate lost map output. If the scheduler 
values some containers more than others, the AM's response to starvation can be 
improved from random killing. This is a case where the current implementation 
acknowledges the fact that it already runs in an inconsistent environment.

> Scheduler feedback to AM to release containers
> --
>
> Key: YARN-45
> URL: https://issues.apache.org/jira/browse/YARN-45
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Chris Douglas
>Assignee: Carlo Curino
> Attachments: YARN-45.patch, YARN-45.patch
>
>
> The ResourceManager strikes a balance between cluster utilization and strict 
> enforcement of resource invariants in the cluster. Individual allocations of 
> containers must be reclaimed- or reserved- to restore the global invariants 
> when cluster load shifts. In some cases, the ApplicationMaster can respond to 
> fluctuations in resource availability without losing the work already 
> completed by that task (MAPREDUCE-4584). Supplying it with this information 
> would be helpful for overall cluster utilization [1]. To this end, we want to 
> establish a protocol for the RM to ask the AM to release containers.
> [1] http://research.yahoo.com/files/yl-2012-003.pdf

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators

[jira] [Commented] (YARN-45) Scheduler feedback to AM to release containers

2013-04-12 Thread Carlo Curino (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-45?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13630898#comment-13630898
 ] 

Carlo Curino commented on YARN-45:
--

As you pointed out, any decision made in the RM needs to deal with an 
inconsistent and evolving view of the world, and the preemption actions suffer 
from an inherent and significant lag. In designing policies around this, one 
must embrace such chaos, operate conservatively, and try to affect only 
macroscopic properties (hence the many built-in dampers Chris mentioned). 

As for what to do with the preemption requests, I think we are quite aligned 
with your comments in our current implementation for the mapreduce AM/Task. 

Here's what we do:
1) Maps are typically short-lived, so it is often worth ignoring the preemption 
request and trying to "make a run for it": checkpointing and completion times 
are likely to be comparable, and re-execution costs are low. 

2) For reducers, since the state is valuable and runtimes are often longer, the 
AM asks the task to checkpoint. In our current implementation, once the state of 
the reducer has been saved to a checkpoint we exit, as continuing execution is 
non-trivial (in particular, managing the partial output of reducers). I can 
envision a future version that tries to continue running after having taken a 
checkpoint. 
Note that this (the task exiting) does not introduce any new 
race condition or complexity in either the RM or the AM, as both already handle 
failing/killed tasks, and the AM even has logic to kill its own reducers to 
free up space for maps. 
More importantly, this setup (in which containers exit as soon as they are done 
checkpointing) allows us to set rather generous "wait-before-kill" parameters, 
since the containers will be reclaimed as soon as the task is done 
checkpointing anyway. 
The alternative would have the RM pick a static policy for waiting, which risks 
being either too long (delaying the rebalancing by too much) or too short 
(interrupting containers while they finish checkpointing, thus wasting work). I 
expect that no static solution would fare well for a broad range of AMs and job 
sizes. 

3) When the preemption takes the form of a ResourceRequest, we pick reducers 
over maps (having reducers running when the maps are killed would simply lead 
to wasted slot time). Looking forward in YARN's future, this is a key feature: 
other applications might have evolving priorities for containers which are not 
exposed to the RM, so we can't rely on the RM to guess which container is 
best to preempt, and delegating the choice to the AM could be invaluable.
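
The three cases above can be sketched roughly as follows. This is only an 
illustration under stated assumptions, not the code in the patch: the class 
{{AmPreemptionPolicy}}, its {{Task}} type, and both method names are 
hypothetical, and a container's resources are reduced to a single memory figure.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of an AM-side policy; none of these names are YARN or MapReduce APIs. */
final class AmPreemptionPolicy {
    enum TaskType { MAP, REDUCE }
    enum Action { IGNORE, CHECKPOINT_AND_EXIT }

    /** Minimal stand-in for a running task and the memory its container holds. */
    static final class Task {
        final String id;
        final TaskType type;
        final int memoryMb;

        Task(String id, TaskType type, int memoryMb) {
            this.id = id;
            this.type = type;
            this.memoryMb = memoryMb;
        }
    }

    /** Cases 1 and 2: the RM named this task's container explicitly. */
    static Action onContainerPreemption(Task task) {
        // Maps "make a run for it"; reducers save their state and then exit.
        return task.type == TaskType.MAP ? Action.IGNORE : Action.CHECKPOINT_AND_EXIT;
    }

    /** Case 3: the RM asked for an amount of memory; reducers are given up before maps. */
    static List<String> chooseForResourceRequest(List<Task> running, int memoryMbWanted) {
        List<Task> ordered = new ArrayList<>();
        for (Task t : running) {
            if (t.type == TaskType.REDUCE) ordered.add(t);
        }
        for (Task t : running) {
            if (t.type == TaskType.MAP) ordered.add(t);
        }
        List<String> chosen = new ArrayList<>();
        int freedMb = 0;
        for (Task t : ordered) {
            if (freedMb >= memoryMbWanted) break;
            chosen.add(t.id);
            freedMb += t.memoryMb;
        }
        return chosen;
    }

    public static void main(String[] args) {
        List<Task> running = new ArrayList<>();
        running.add(new Task("m1", TaskType.MAP, 1024));
        running.add(new Task("r1", TaskType.REDUCE, 2048));
        running.add(new Task("r2", TaskType.REDUCE, 2048));
        System.out.println(chooseForResourceRequest(running, 3072));
    }
}
```

The point of the sketch is that the choice in case 3 depends on knowledge (task 
type, value of accumulated state) that only the AM has.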




[jira] [Commented] (YARN-45) Scheduler feedback to AM to release containers

2013-04-12 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-45?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13630925#comment-13630925
 ] 

Alejandro Abdelnur commented on YARN-45:


Comments on the patch. 

* Reusing ResourceRequest means we have a bunch of properties that are not 
applicable to the preempt message. Wouldn't it be enough just to return the 
ContainerIds and a flag indicating whether the set is strict or not? The AM can 
reconstruct all the resource information if it needs to. 

* Do we need the get*Count() methods? You can get the size from the set itself, 
or am I missing something?



[jira] [Commented] (YARN-45) Scheduler feedback to AM to release containers

2013-04-12 Thread Carlo Curino (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-45?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13630964#comment-13630964
 ] 

Carlo Curino commented on YARN-45:
--

[~tucu00] Care to elaborate which properties we don't care about?  

In general, I like the symmetry of using ResourceRequest, because it allows the 
RM to compactly and precisely express what resources it wants back. In 
particular, it allows the RM to: 1) list a large number of containers compactly, 
2) express locality preferences, and 3) express priority among multiple 
requests. While we don't make use of 3 quite yet, it seems not bad to have. 
Arguably, at least a portion of this information can indeed be reconstructed 
from a Set<ContainerId> + a tag, but this might force the RM to do extra work, 
building a potentially large list of containers, just to have the AM undo all 
that. 

Symmetrically, one can imagine using only Set<ResourceRequest>: if the RM 
wants a specific container back, it can try to constrain the request so that 
only the desired container matches. In this case too, it is easy to provide 
examples in which this might be awkward to use. 

We discussed this a fair bit with Chris, and it seems that the set of use cases 
which are important to cover is not quite fully served by Set<ContainerId> 
alone nor by Set<ResourceRequest> alone, hence the proposal including both. 

I would say this comes down almost to a "style" choice: we could build a 
protocol that is likely to accommodate most of the future uses we foresee now 
(likely to be more stable), or define a minimal protocol that covers just the 
first use case we are targeting (Set<ContainerId> would be it in this case), 
and evolve it whenever needed. [~bikassaha], if I understand correctly you are 
driving the protocol overhaul; do you care to comment on this?

As for get*Count(), we included them to remain consistent with other messages in 
the YARN protocols, which had equivalent methods for each list/set in the 
message. I am happy to drop them if you guys think that is best.



[jira] [Commented] (YARN-45) Scheduler feedback to AM to release containers

2013-04-14 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-45?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13631400#comment-13631400
 ] 

Bikas Saha commented on YARN-45:


My personal preference would be to not have an API that is not actionable. If 
the RM does not have any support for ResourceRequest scenarios, then we can 
leave that out until such support does arise. Having something out 
there that does not work may lead to misunderstanding and confusion on the part 
of YARN app developers.



[jira] [Commented] (YARN-45) Scheduler feedback to AM to release containers

2013-04-15 Thread Carlo Curino (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-45?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13632511#comment-13632511
 ] 

Carlo Curino commented on YARN-45:
--

[~bikassaha]
Sounds good, I totally agree with the general spirit. We indeed have code that 
exercises the ResourceRequest version of it, thus it is actionable (detailed 
question later).

[~tucu00] 
bq. Wouldn't it be enough just to return the ContainerIds and a flag indicating 
whether the set is strict or not? The AM can reconstruct all the resource 
information if it needs to.

I think it is important to have the "resource-based" version because if the RM 
wants a large number of containers back (e.g., 1000) and does not care which 
ones, it would be very wasteful to resolve them on the RM (extra code, extra 
compute cost), send a long detailed list, and have the AM simply aggregate the 
resources, ignore the individual containers in the list, and return some other 
containers. 

[~bikassaha] so my previously very broad question can be rephrased, based on 
[~tucu00]'s review of the patch, more tightly as follows:

Which of the following options do you prefer?
# we reuse ResourceRequest, of which we use the number of containers and the 
Resource for each container (and for now do not use locality or priority, 
although we might in the future)
# we create a new type that carries *only* the number of containers and the 
Resource for each container

This comes down to the pros and cons of reusing existing types vs. the 
minimalistic approach which you were pointing out before. I don't have much of 
a preference (minor leaning towards 1, but either way is fine). 
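
For concreteness, option 2 might look something like the sketch below. The type 
{{PreemptionAsk}} and its fields are hypothetical (no such class is in the 
patch), YARN's Resource is reduced to a memory figure, and the 
{{satisfiedBy}} semantics are an assumption made only for illustration.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical "option 2" message: only a container count and a per-container size. */
final class PreemptionAsk {
    final int numContainers;
    final int memoryMbPerContainer; // stand-in for YARN's Resource, reduced to memory

    PreemptionAsk(int numContainers, int memoryMbPerContainer) {
        this.numContainers = numContainers;
        this.memoryMbPerContainer = memoryMbPerContainer;
    }

    /** Total memory the RM wants back, regardless of which containers supply it. */
    long totalMemoryMb() {
        return (long) numContainers * memoryMbPerContainer;
    }

    /** Assumed semantics: the ask is met once the released containers cover the total. */
    boolean satisfiedBy(List<Integer> releasedMemoryMb) {
        long released = 0;
        for (int mb : releasedMemoryMb) {
            released += mb;
        }
        return released >= totalMemoryMb();
    }

    public static void main(String[] args) {
        // One small message stands in for a list of 1000 container ids.
        PreemptionAsk ask = new PreemptionAsk(1000, 1024);
        System.out.println(ask.totalMemoryMb());
        System.out.println(ask.satisfiedBy(Arrays.asList(2048, 1024)));
    }
}
```

Either option keeps the message size independent of how many containers the RM 
wants back, which is the property argued for above.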







[jira] [Commented] (YARN-45) Scheduler feedback to AM to release containers

2013-04-15 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-45?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13632531#comment-13632531
 ] 

Alejandro Abdelnur commented on YARN-45:


Got it, makes sense.



[jira] [Commented] (YARN-45) Scheduler feedback to AM to release containers

2013-04-15 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-45?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13632542#comment-13632542
 ] 

Bikas Saha commented on YARN-45:


I took a quick look at this patch and the others and from what I see 
ResourceRequest is not actionable in the sense that neither of the schedulers 
can currently send a non-empty ResourceRequest to preempt. Both only do 
preemption by containers though they have some plumbing to send RR's if they 
want to do so. So I am not quite sure what you mean by "We indeed have code 
that exercises the ResourceRequest version of it". Of course, I may have missed 
something.

The following comment may change after a detailed review of the changes in this 
patch and other related patches. But as of now I agree with you that RR makes 
sense because essentially this request is symmetric. AM uses RR to RM for 
resources to schedule and RM uses RR to AM for resources to preempt. By not 
using location we are implicitly using the "*" location right? Might as well 
make it explicit. Non * locations will make sense when affinity based 
preemptions occur.





[jira] [Commented] (YARN-45) Scheduler feedback to AM to release containers

2013-04-15 Thread Arun C Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-45?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13632571#comment-13632571
 ] 

Arun C Murthy commented on YARN-45:
---

Sorry, I've been away for a couple of weeks due to family reasons and I'm just 
catching up.

The bare-minimum requirement seems:
# RM should notify the AM that a certain amount of resources will need to be 
reclaimed (ala SIGTERM).
# Thus, the AM gets an opportunity to *pick* which containers it will sacrifice 
to satisfy the RM's requirements.
# Iff the AM doesn't act, the RM will go ahead and terminate some containers 
(probably the most-recently allocated ones); ala SIGKILL.

Given the above, I feel that this is a set of changes we need to be 
conservative about - particularly since the really simple pre-emption i.e. 
SIGKILL alone on RM side is trivial (from an API perspective).

Thus, I'm concerned about jumping into a complex preemption API 
(ResourceRequest etc.) without having sufficient experience i.e. doing this in 
the first iteration itself.

I like [~tucu00]'s initial suggestion of: 
# Resource resourcesToReclaim
# Optionally, a Set<ContainerId> which the RM will preempt i.e. SIGKILL 

In fact, for the first iteration, Set<ContainerId> is something we can avoid if 
the semantics are clear i.e. RM will preempt the most-recently allocated 
containers.

Once we have sufficient experience with this, we can then dive deeper to think 
about further enhancements to the API by adding features (in a compatible 
manner for 2.x or 3.x).

Thoughts? 



[jira] [Commented] (YARN-45) Scheduler feedback to AM to release containers

2013-04-15 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-45?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13632630#comment-13632630
 ] 

Chris Douglas commented on YARN-45:
---

bq. ResourceRequest is not actionable in the sense that neither of the 
schedulers can currently send a non-empty ResourceRequest to preempt. Both only 
do preemption by containers though they have some plumbing to send RR's if they 
want to do so. So I am not quite sure what you mean by "We indeed have code 
that exercises the ResourceRequest version of it".

A prototype impl against MapReduce responds to {{ResourceRequest}} in the 
preempt message. We're currently polishing and splitting that up for review, 
but wanted to get consensus on the Yarn changes in case new requirements 
required reworking the rest.

An RM impl that includes killing for {{ResourceRequest}} (or {{Resource}}) is a 
more invasive change, particularly because (a) the AM needs to reason about 
which recently finished containers are included in the message (i.e., it needs 
to reason about what the RM knows, so the RM needs to be consistent in what it 
tells the AM) and (b) the RM needs to track its previous preemption requests, 
timing them out in the context of existing allocations and exited containers 
(i.e., decisions to preempt need to incorporate subsequent information).

To get experience before proposing anything drastic, we marked this API as 
experimental, wrote the enforcement policy against {{ContainerID}}, and tucked 
it behind a pluggable interface. This way, the AM can ignore stale requests for 
exited containers and the RM can time out particular containers it asked for 
easily; every computed preemption set is bound in a namespace that sidesteps 
the most disruptive impl issues on both sides.
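
To illustrate why binding requests to container ids keeps the bookkeeping 
simple, here is a minimal sketch of RM-side tracking. It is not the enforcement 
policy from the patch: {{PreemptionTracker}}, its method names, and the 
{{waitBeforeKillMs}} parameter are all hypothetical, and container ids are 
modeled as plain longs.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical RM-side bookkeeping for id-based preemption; not the YARN implementation. */
final class PreemptionTracker {
    private final long waitBeforeKillMs;
    // container id -> time of the first preemption request still outstanding
    private final Map<Long, Long> firstAskedAtMs = new LinkedHashMap<>();

    PreemptionTracker(long waitBeforeKillMs) {
        this.waitBeforeKillMs = waitBeforeKillMs;
    }

    /** Records a request; repeating it on later heartbeats keeps the original deadline. */
    void ask(long containerId, long nowMs) {
        firstAskedAtMs.putIfAbsent(containerId, nowMs);
    }

    /** The container exited for any reason, so there is nothing left to enforce. */
    void exited(long containerId) {
        firstAskedAtMs.remove(containerId);
    }

    /** Containers still running whose grace period has elapsed, in request order. */
    List<Long> dueForKill(long nowMs) {
        List<Long> due = new ArrayList<>();
        for (Map.Entry<Long, Long> e : firstAskedAtMs.entrySet()) {
            if (nowMs - e.getValue() >= waitBeforeKillMs) {
                due.add(e.getKey());
            }
        }
        return due;
    }

    public static void main(String[] args) {
        PreemptionTracker tracker = new PreemptionTracker(15_000);
        tracker.ask(1L, 0);
        tracker.ask(2L, 5_000);
        tracker.exited(2L); // released by the AM before the deadline
        System.out.println(tracker.dueForKill(15_000));
    }
}
```

Each entry times out independently, and an exited container simply drops out 
of the map; neither side has to reason about what the other knew when an 
aggregate request was computed.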

bq.  By not using location we are implicitly using the "*" location right? 
Might as well make it explicit. Non * locations will make sense when affinity 
based preemptions occur.

Yes, that's exactly the intent. The policy in YARN-569 doesn't attempt to bias 
the preemptions to match the requests in under-capacity queues, but that's a 
natural policy to implement against this protocol.

{quote}
The bare-minimum requirement seems:

# RM should notify the AM that a certain amount of resources will need to be 
reclaimed (ala SIGTERM).
# Thus, the AM gets an opportunity to *pick* which containers it will sacrifice 
to satisfy the RM's requirements.
# Iff the AM doesn't act, the RM will go ahead and terminate some containers 
(probably the most-recently allocated ones); ala SIGKILL.

Given the above, I feel that this is a set of changes we need to be 
conservative about - particularly since the really simple pre-emption i.e. 
SIGKILL alone on RM side is trivial (from an API perspective).
{quote}

Totally agreed. The symmetry of {{ResourceRequest}} in the ask-back is 
attractive, but it's not a sufficient condition. To it, I'd add all the 
familiar attributes of using them in allocation requests (economy, 
expressiveness, versatility). While {{Resource}} covers the current impl, it 
leaves little room for related improvements, or even refinements (e.g., 
preferring resources requested by under-capacity queues, prioritizing types of 
containers, and time).

The API isn't that complex, but a strict implementation would change the RM 
more, adding risk. To mitigate that, but still encourage applications to write 
against the richer type while we get experience with it, [~curino]'s 
formulation 
[above|https://issues.apache.org/jira/browse/YARN-45?focusedCommentId=13628950&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13628950]
 seems like a decent set of semantics...

We could add a new type that encodes a subset of the {{ResourceRequest}} type. 
It lacks symmetry, but it also allows them to evolve independently.


[jira] [Commented] (YARN-45) Scheduler feedback to AM to release containers

2013-04-22 Thread Carlo Curino (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-45?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13638783#comment-13638783
 ] 

Carlo Curino commented on YARN-45:
--

Updated the protocol patch (and the implementation for capacity scheduler), to 
reflect the discussion in the above comments (plus various offline 
conversations).  The current proposal is a minimal protocol change and compact 
policies, which capture the portion of our initial proposal on which we reached 
reasonable consensus. 

The key changes are the following:
# Simplified the protocol modification to include only Set<ContainerId> as the 
vehicle to express preemption requests.
# Modified ProportionalCapacityPreemptionPolicy to select containers by 
reversed priority, and within each priority by reversed container id (which 
reflects the order of allocation). 
# Simplified all the "pipes" in the RM that propagated decisions about 
preemption around (so that they no longer include resource-based preemptions). 
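
The selection order in the second item can be expressed as a comparator. The 
sketch below is illustrative only and is not the code in 
ProportionalCapacityPreemptionPolicy: {{PreemptionOrder}} and {{Candidate}} are 
hypothetical names, and it assumes that a larger numeric priority value means a 
less important container and that container ids grow with allocation order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

/** Hypothetical sketch of the victim ordering; not a YARN class. */
final class PreemptionOrder {
    /** Minimal stand-in for a running container. */
    static final class Candidate {
        final int priority;     // assumption: larger value = less important
        final long containerId; // assumption: ids grow with allocation order

        Candidate(int priority, long containerId) {
            this.priority = priority;
            this.containerId = containerId;
        }

        @Override
        public String toString() {
            return "p" + priority + "/c" + containerId;
        }
    }

    /** Least important priority first; within a priority, most recently allocated first. */
    static final Comparator<Candidate> VICTIM_ORDER =
        Comparator.comparingInt((Candidate c) -> c.priority).reversed()
            .thenComparing(Comparator.comparingLong((Candidate c) -> c.containerId).reversed());

    /** Returns up to {@code count} containers, taken from the front of the victim order. */
    static List<Candidate> pickVictims(List<Candidate> running, int count) {
        List<Candidate> sorted = new ArrayList<>(running);
        sorted.sort(VICTIM_ORDER);
        return new ArrayList<>(sorted.subList(0, Math.min(count, sorted.size())));
    }

    public static void main(String[] args) {
        List<Candidate> running = Arrays.asList(
            new Candidate(5, 1), new Candidate(10, 2),
            new Candidate(10, 3), new Candidate(5, 4));
        System.out.println(pickVictims(running, 3)); // [p10/c3, p10/c2, p5/c4]
    }
}
```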

The decision is based on the following rationale: 
There seems to be agreement on the fact that ResourceRequest-based preemption 
is appealing due to its symmetry, compactness, and the flexibility it provides 
to the AM. 
However, the declarative nature of the specification makes the "tracking" over 
time quite tricky. In particular, both the RM and the AMs must be capable of 
maintaining some form of history of the resources being requested: 
# for the RM, to consciously preempt containers only for the fraction of 
resources that have been consistently asked of the AM over time (a notion of 
ResourceRequest intersection should be defined),  
# for the AM, to track its own preemption actions, and know when they are 
received by the RM (this is needed to discount the RM requests while the tasks 
are being checkpointed).
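
To give a feel for the RM-side history in the first item, here is one 
conceivable (and deliberately simplistic) reading of "intersection": the amount 
that was asked for in every one of the last few rounds. This is purely an 
assumption for illustration, not the semantics we worked out; 
{{ConsistentAsk}} and its methods are hypothetical, and requests are reduced to 
a memory amount.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical sketch: treats the "intersection" of recent asks as their minimum. */
final class ConsistentAsk {
    private final int rounds;
    private final Deque<Long> recentAsksMb = new ArrayDeque<>();

    ConsistentAsk(int rounds) {
        if (rounds < 1) {
            throw new IllegalArgumentException("rounds must be at least 1");
        }
        this.rounds = rounds;
    }

    /** Records the amount asked in the latest round, dropping the oldest if needed. */
    void record(long askedMb) {
        recentAsksMb.addLast(askedMb);
        if (recentAsksMb.size() > rounds) {
            recentAsksMb.removeFirst();
        }
    }

    /** Amount asked in every one of the last {@code rounds} rounds; 0 until enough history exists. */
    long enforceableMb() {
        if (recentAsksMb.size() < rounds) {
            return 0;
        }
        long min = Long.MAX_VALUE;
        for (long mb : recentAsksMb) {
            min = Math.min(min, mb);
        }
        return min;
    }

    public static void main(String[] args) {
        ConsistentAsk history = new ConsistentAsk(3);
        history.record(4096);
        history.record(2048);
        history.record(3072);
        System.out.println(history.enforceableMb());
    }
}
```

Even this toy version needs per-application state and a window parameter, which 
hints at why the tracking is tricky.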

With [~chris.douglas] we worked out a possible set of semantics for the above 
and started to work on a version of the ProportionalCapacityPreemptionPolicy 
that reflects those. While they seem reasonable, they are likely to generate 
longer (speculative) discussions. 

So, following the spirit of [~acmurthy]'s last comment and after feedback from 
[~tucu00], [~bikassaha], [~vinodkv], [~sseth], [~hitesh], we propose 
Set<ContainerId> as an initial strategy that will allow us to: 
# observe most of the benefits of preemption, 
# gain experience in running schedulers leveraging preemption. 




[jira] [Commented] (YARN-45) Scheduler feedback to AM to release containers

2013-04-22 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-45?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13638814#comment-13638814
 ] 

Hadoop QA commented on YARN-45:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12579975/YARN-45.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/803//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/803//console

This message is automatically generated.



[jira] [Commented] (YARN-45) Scheduler feedback to AM to release containers

2013-04-23 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-45?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13639595#comment-13639595
 ] 

Karthik Kambatla commented on YARN-45:
--

I have only skimmed the patch, but it looks good. I noticed a few javadoc typos 
we might like to fix. Will try to get in a more detailed review "soon".



[jira] [Commented] (YARN-45) Scheduler feedback to AM to release containers

2013-04-25 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-45?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13642163#comment-13642163
 ] 

Chris Douglas commented on YARN-45:
---

If everyone's OK with the current patch as a base, I'll commit it in the next 
couple days.



[jira] [Commented] (YARN-45) Scheduler feedback to AM to release containers

2013-04-26 Thread Thomas Graves (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-45?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13642886#comment-13642886
 ] 

Thomas Graves commented on YARN-45:
---

A couple of minor nitpicks that we might fix before commit: the typos in the 
comments/javadoc that Karthik referred to.

AllocateResponse - comment still references resources - "description of 
resources and containers"

PreemptMessage - Grammar needs fixing - 

" * A PreemptMessage is part of the RM-AM protocol, and it is used 
by the RM 
+ * specify resources that are the RM wants to reclaim from the AM."



[jira] [Commented] (YARN-45) Scheduler feedback to AM to release containers

2013-04-26 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-45?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13642889#comment-13642889
 ] 

Karthik Kambatla commented on YARN-45:
--

Also, for the javadoc, do we prefer <code>blah</code> or {@link blah}?



[jira] [Commented] (YARN-45) Scheduler feedback to AM to release containers

2013-04-26 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-45?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13642939#comment-13642939
 ] 

Karthik Kambatla commented on YARN-45:
--

Thanks Thomas. Those are the only comments I have. Otherwise, the code part 
looks good to me.



[jira] [Commented] (YARN-45) Scheduler feedback to AM to release containers

2013-04-26 Thread Carlo Curino (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-45?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13643045#comment-13643045
 ] 

Carlo Curino commented on YARN-45:
--

Thanks for the feedback, we will make sure these comments get reflected in the 
final version.



[jira] [Commented] (YARN-45) Scheduler feedback to AM to release containers

2013-04-26 Thread Carlo Curino (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-45?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13643061#comment-13643061
 ] 

Carlo Curino commented on YARN-45:
--

About the choice between <code> and @link: almost the same number of files 
use each (about an 8% lead for @link in the count of files using each, with a 
stronger lead in yarn).

Any preference among the watchers? 



[jira] [Commented] (YARN-45) Scheduler feedback to AM to release containers

2013-04-26 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-45?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13643065#comment-13643065
 ] 

Karthik Kambatla commented on YARN-45:
--

IIUC, the compiler and IDEs understand @link: for instance, renaming the 
referenced class/method in Eclipse updates the link as well. I haven't used 
<code> before and am not sure whether the same applies to it.

Probably not important: while digging around, one piece of advice I came across 
was to use @link for the first occurrence to create a hyperlink and @code for 
other instances to format text without creating a hyperlink.
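
The "@link first, @code afterwards" advice can be illustrated with a small sketch. The class below is hypothetical and exists only so the javadoc has something to reference; it is not a type from the patch.

```java
/** Hypothetical type, used only to give the javadoc below something to reference. */
class PreemptionHint {
  private final String containerId;

  PreemptionHint(String containerId) {
    this.containerId = containerId;
  }

  /**
   * Returns the id of the container named by this {@link PreemptionHint}.
   * Later mentions such as {@code PreemptionHint} are formatted as code
   * but do not create another hyperlink.
   *
   * @return the container id passed to the constructor
   */
  String getContainerId() {
    return containerId;
  }
}
```

Only the first mention becomes a hyperlink in the generated page, which keeps longer comments readable.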



[jira] [Commented] (YARN-45) Scheduler feedback to AM to release containers

2013-04-26 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-45?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13643373#comment-13643373
 ] 

Bikas Saha commented on YARN-45:


I like PreemptionMessage or PreemptionNotification.

The patch mostly looks good and I agree with Vinod's comments on a get and set.

I am assuming that a container will show up repeatedly in AllocateResponse 
until it is either preempted or removed from the preemption list. The javadoc 
is not clear about this.
At this point, I wonder how the client/app figures out how much time is left 
before a container is preempted. How does it differentiate between containers 
that are new additions to that list and older ones? It could maintain its own 
cache of when it first saw a container. Or the time to preempt could be passed 
along by the RM. Does it matter? Is it needed in any scenario?
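
As a rough illustration of the "cache of when it first saw a container" option, here is a minimal sketch. The class name, the use of plain strings for container ids, and the millisecond clock passed in by the caller are all assumptions made for the example; none of them come from the patch.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical AM-side helper; not part of the YARN-45 patch. */
class FirstSeenTracker {
  // Container id -> time (ms) at which the AM first saw it in a preemption list.
  private final Map<String, Long> firstSeen = new HashMap<String, Long>();

  /** Records the latest preemption list, keeping the original timestamp for known ids. */
  void observe(Set<String> containersToPreempt, long nowMillis) {
    // Forget containers that dropped off the list (released, killed, or no longer needed).
    firstSeen.keySet().retainAll(containersToPreempt);
    for (String id : containersToPreempt) {
      if (!firstSeen.containsKey(id)) {
        firstSeen.put(id, nowMillis);
      }
    }
  }

  /** Returns how long the container has been on the list, or -1 if it is not on it. */
  long millisOnList(String containerId, long nowMillis) {
    Long seen = firstSeen.get(containerId);
    return seen == null ? -1L : nowMillis - seen;
  }
}
```

With such a cache the AM can tell new additions from older entries without any change to the protocol, at the cost of only knowing elapsed time, not the RM's actual deadline.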



[jira] [Commented] (YARN-45) Scheduler feedback to AM to release containers

2013-04-26 Thread Carlo Curino (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-45?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13643444#comment-13643444
 ] 

Carlo Curino commented on YARN-45:
--

We modified the patch to account for the most recent round of comments from 
[~tgraves], [~kkambatl], [~vinodkv] and [~bikassaha]. 

In particular:
# various javadoc fixes (including @link notation)
# PreemptMessage -> PreemptionMessage
# set/get version of PreemptionMessage (propagated through the dependent 
patches) 
# clarified in AllocateResponse javadoc that PreemptMessage could be repeated 
over time.

[~bikassaha], you are right: it is likely that there will be repetitions in the 
ask over time (4 above). In fact, by design the RM will "sustain" its asks 
until either: 1) the need for those resources is gone, 2) the containers are 
released (natural or AM-initiated completion), or 3) a timeout expires and the 
RM force-kills the containers. The possible overlap among subsequent messages 
is not a big concern on the AM side given our choice of a Set-based 
PreemptionMessage. Duplicates are trivial to detect, and/or the AM can simply 
implement preemption in an idempotent way (which is what we do in our mapreduce 
solution). 
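
To make the idempotency point concrete, here is a minimal sketch of how an AM could absorb repeated asks. It is not the mapreduce implementation: the class name, the string container ids, and the idea of returning the "newly seen" containers are assumptions chosen for illustration.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical AM-side handler; the names are illustrative, not the patch's API. */
class IdempotentPreemptionHandler {
  // Containers for which a checkpoint has already been requested.
  private final Set<String> alreadyHandled = new HashSet<String>();

  /**
   * Processes one preemption message and returns only the containers that
   * have not been acted on before, i.e. the ones that still need a checkpoint.
   */
  List<String> handle(Set<String> containersToPreempt) {
    List<String> newlyHandled = new ArrayList<String>();
    for (String id : containersToPreempt) {
      // Set.add returns false for ids seen earlier, so repeated asks are no-ops.
      if (alreadyHandled.add(id)) {
        newlyHandled.add(id);
      }
    }
    return newlyHandled;
  }
}
```

Because the RM may repeat the same set on every heartbeat, the handler only needs the membership check; no ordering or sequence numbers are required.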

Regarding time, in the basic implementation we have for mapreduce, the AM does 
not attempt complex speculations on when to preempt; it simply acts on the 
requests in an idempotent way as soon as they are received (this also maximizes 
the chance to complete a checkpoint before being killed). In our design we 
pushed this into an AMPreemptionPolicy, so you can easily imagine more advanced 
policies that track containers over time and speculate on when it is best to 
preempt. Adding more sophisticated "timing" information to the protocol is also 
something I can see being an interesting addition, but I would want to spend 
some more time (no pun intended) working with it before proposing a public 
protocol change. Again, we mention this in the attached summary document.

We go into a bit more detail in the attached document, which, as [~vinodkv] 
asked, summarizes the conversations around the various alternatives using 
resource-based specification, and about adding time. 





[jira] [Commented] (YARN-45) Scheduler feedback to AM to release containers

2013-04-26 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-45?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13643458#comment-13643458
 ] 

Hadoop QA commented on YARN-45:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12580805/YARN-45_summary_of_alternatives.pdf
  against trunk revision .

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-YARN-Build/830//console

This message is automatically generated.



[jira] [Commented] (YARN-45) Scheduler feedback to AM to release containers

2013-04-26 Thread Carlo Curino (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-45?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13643461#comment-13643461
 ] 

Carlo Curino commented on YARN-45:
--

Reposting the patch, now including the BuilderUtils changes per [~vinodkv]'s 
request (I missed them in the previous diff). 



[jira] [Commented] (YARN-45) Scheduler feedback to AM to release containers

2013-04-26 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-45?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13643473#comment-13643473
 ] 

Hadoop QA commented on YARN-45:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12580806/YARN-45.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:red}-1 javadoc{color}.  The javadoc tool appears to have generated 
24 warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/831//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/831//console

This message is automatically generated.



[jira] [Commented] (YARN-45) Scheduler feedback to AM to release containers

2013-04-29 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-45?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13644327#comment-13644327
 ] 

Bikas Saha commented on YARN-45:


My understanding is that the containers presented in PreemptionMessage are 
going to be preempted by the RM some time in the near future if the RM cannot 
find free resources elsewhere. The AMs are not supposed to preempt the 
containers, but they are encouraged to checkpoint and save work. The RM can 
always choose not to preempt these containers, so it would be sub-optimal 
for the AM to kill them.
If we want to add information besides the set of 
containers-to-be-preempted, then I would prefer ResourceRequest (like it was in 
the original patch) and not Resource. First, that is not only symmetric but 
also allows the RM to provide additional information about where to free 
containers. A smarter RM could potentially ask for resources to be preempted 
where the under-allocated job wants them, and a smart AM could help out by 
choosing containers close to the desired locations. Second, Resource is too 
amorphous by itself. Asking an AM to free 50GB does not tell it whether the RM 
needs 10*5 or 50*1. Without that information the AM can end up freeing 
containers in a manner that does not help the RM meet the request of the 
under-allocated job, thus failing to meet quota and wasting work at the same 
time.
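
The 10*5 versus 50*1 concern can be worked through with a small sketch. It assumes, purely for illustration, that every freed container sits on a different node, so freed memory cannot be combined across containers; the class and method names are hypothetical.

```java
/** Hypothetical helper illustrating why an aggregate amount alone is ambiguous. */
class PreemptionShape {

  /** Total memory freed, in GB. */
  static int totalGb(int[] freedContainerGb) {
    int total = 0;
    for (int gb : freedContainerGb) {
      total += gb;
    }
    return total;
  }

  /**
   * Number of containers of size neededGb that fit into the freed slots,
   * assuming each freed container is on its own node (no combining).
   */
  static int usableContainers(int[] freedContainerGb, int neededGb) {
    int usable = 0;
    for (int gb : freedContainerGb) {
      usable += gb / neededGb; // integer division: a 1GB slot holds no 5GB container
    }
    return usable;
  }
}
```

Under that assumption, freeing fifty 1GB containers and freeing ten 5GB containers both release 50GB, but only the second lets the RM place 5GB containers for the under-allocated job.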



[jira] [Commented] (YARN-45) Scheduler feedback to AM to release containers

2013-04-29 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-45?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13644626#comment-13644626
 ] 

Chris Douglas commented on YARN-45:
---

I'm also a fan of {{ResourceRequest}}, but we're not really using all its 
features, yet. Similarly, {{Resource}} bakes in the fungibility of resources, 
which could be awkward as the RM accommodates richer requests (as in YARN-392).

We could use {{ResourceRequest}}- so the API is there for extensions- but only 
populate the capability as an aggregate. With the convention that "-1 
containers" can mean "packed as you see fit," it expresses {{Resource}} (which 
we need in practice, since the priorities for requests don't always [match the 
preemption 
order|https://issues.apache.org/jira/browse/YARN-569?focusedCommentId=13638825&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13638825]),
 which is sufficient for the current schedulers.

If we're adding the contract back with the set of containers, the 
[semantics|https://issues.apache.org/jira/browse/YARN-45?focusedCommentId=13628950&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13628950]
 we discussed earlier still seem OK.

> Scheduler feedback to AM to release containers
> --
>
> Key: YARN-45
> URL: https://issues.apache.org/jira/browse/YARN-45
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Chris Douglas
>Assignee: Carlo Curino
> Attachments: YARN-45.patch, YARN-45.patch, YARN-45.patch, 
> YARN-45.patch, YARN-45.patch, YARN-45_summary_of_alternatives.pdf
>
>
> The ResourceManager strikes a balance between cluster utilization and strict 
> enforcement of resource invariants in the cluster. Individual allocations of 
> containers must be reclaimed- or reserved- to restore the global invariants 
> when cluster load shifts. In some cases, the ApplicationMaster can respond to 
> fluctuations in resource availability without losing the work already 
> completed by that task (MAPREDUCE-4584). Supplying it with this information 
> would be helpful for overall cluster utilization [1]. To this end, we want to 
> establish a protocol for the RM to ask the AM to release containers.
> [1] http://research.yahoo.com/files/yl-2012-003.pdf

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-45) Scheduler feedback to AM to release containers

2013-04-29 Thread Carlo Curino (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-45?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13644664#comment-13644664
 ] 

Carlo Curino commented on YARN-45:
--

[~acmurthy] I see your point, which was in fact reflected more clearly in our 
initial proposal. The only caveat is not to make this a capacity-only protocol 
(which you are not proposing, but I wanted to reiterate that there are other 
use cases).

I like [~bikassaha]'s and [~chris.douglas]'s spin on it (i.e., using 
ResourceRequest), as it gives us the immediate "capacity angle" but will 
eventually allow the implementations to evolve towards something richer (e.g., 
preempting on behalf of a specific request, which Bikas considered before) 
without impact on the protocols. 

I think there is a slightly cleaner version of Chris's proposal: use 
ResourceRequest and, to represent a request that only cares about overall 
capacity, express the ResourceRequest as a multiple of the minimum 
allocation (i.e., if we want 100GB of RAM back and the minimum container size 
is 1GB, we ask for 100 x 1GB containers). This achieves Chris's proposal with a 
slightly prettier use of ResourceRequest. Note that there are size-matching 
issues (e.g., you have 1.5GB containers and I ask for 1 x 1GB container), but 
we have very similar problems with Resource.
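
A minimal sketch of the arithmetic behind this idea, assuming memory is the only dimension and sizes are given in MB. The class and method names are hypothetical and are not part of the YARN API; the second method only illustrates the size-matching issue mentioned above.

```java
// Hypothetical helper, not part of YARN: expresses a capacity-only
// preemption ask as a count of minimum-allocation containers.
final class CapacityAsMinAllocations {

    /** Number of minimum-size containers that covers reclaimMb (rounds up). */
    static int containersToRequest(long reclaimMb, long minAllocationMb) {
        if (reclaimMb < 0 || minAllocationMb <= 0) {
            throw new IllegalArgumentException("sizes must be positive");
        }
        return (int) ((reclaimMb + minAllocationMb - 1) / minAllocationMb);
    }

    /**
     * Size-matching issue: if the AM only holds containers of heldContainerMb,
     * it has to release whole containers and may give back more than asked.
     * Returns the surplus in MB.
     */
    static long overshootMb(long askMb, long heldContainerMb) {
        long released = containersToRequest(askMb, heldContainerMb) * heldContainerMb;
        return released - askMb;
    }

    public static void main(String[] args) {
        // 100GB back with a 1GB minimum allocation: 100 x 1GB containers.
        System.out.println(containersToRequest(100L * 1024, 1024));
        // Asked for 1 x 1GB while holding 1.5GB containers: 512MB surplus.
        System.out.println(overshootMb(1024, 1536));
    }
}
```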

I would say that, as Chris pointed out, [these semantics | 
https://issues.apache.org/jira/browse/YARN-45?focusedCommentId=13628950&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13628950]
 plus the use of ResourceRequest I propose here (a minor variation on Chris's 
take) should cover Arun's and Bikas's comments (and, I believe, also the prior 
45+ messages). 

Thoughts?






[jira] [Commented] (YARN-45) Scheduler feedback to AM to release containers

2013-04-29 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-45?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13644675#comment-13644675
 ] 

Chris Douglas commented on YARN-45:
---

bq. we could express the ResourceRequest as a multiple of the minimum allocation

+1 This is better



[jira] [Commented] (YARN-45) Scheduler feedback to AM to release containers

2013-05-03 Thread Carlo Curino (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-45?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13648986#comment-13648986
 ] 

Carlo Curino commented on YARN-45:
--

Based on all the feedback here, including discussions with [~acmurthy], 
[~vinodkv], [~bikassaha], [~hitesh], [~sseth], and [~tucu00], we propose the 
following message be added to the
{{AllocateResponse}} (pseudo):

{noformat}
PreemptionMessage {
  StrictPreemptionContract {
    Set<PreemptionContainer> containers
  } strict;
  PreemptionContract {
    Set<PreemptionContainer> containers
    List<PreemptionResourceRequest> resources
  } contract;
} message
{noformat}

This has some advantages over the previous design:
# By adding {{PreemptionContainer}} and {{PreemptionResourceRequest}} (wrappers 
of {{ContainerId}} and {{ResourceRequest}} respectively) we can add attributes 
to each item later on, without breaking the protocol (e.g., [~sandyr]'s earlier 
suggestion of time).
# By separating strict and non-strict contracts, the RM can pull back specific 
containers or give the AM flexibility in satisfying the contract. It also 
allows the RM to simultaneously and unambiguously include requests with both 
constraints.
# By including the list of containers in the {{PreemptionContract}} together 
with the resources, AMs have a slightly more restricted search space when 
compared to "match all the resources that _might_ be killed, determine 
preferences among them". Thus, simpler AMs can mostly ignore the interpretation 
of {{ResourceRequest}} and just follow the RM hint.

We're updating YARN-567, YARN-568, and YARN-569 to accommodate these changes, 
in addition to the rest of the downstream patches.
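
As a rough illustration of the message proposed above, here is a sketch of how a simple AM might read it. The types below are simplified stand-ins written for this example (container IDs as strings, resource requests as MB values); they are not the actual YARN record classes, and the release policy is only one possible choice.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Illustrative stand-ins only; NOT the real YARN records.
final class PreemptionSketch {

    static final class StrictContract {
        final Set<String> containers;            // must be given back
        StrictContract(Set<String> containers) { this.containers = containers; }
    }

    static final class Contract {
        final Set<String> containers;            // RM hint: likely victims
        final List<Integer> resourcesMb;         // stand-in for resource requests
        Contract(Set<String> containers, List<Integer> resourcesMb) {
            this.containers = containers;
            this.resourcesMb = resourcesMb;
        }
    }

    static final class Message {
        final StrictContract strict;             // may be null
        final Contract contract;                 // may be null
        Message(StrictContract strict, Contract contract) {
            this.strict = strict;
            this.contract = contract;
        }
    }

    /**
     * Simple AM policy: release every strict container, and for the
     * negotiable contract just follow the RM hint instead of interpreting
     * the resource requests. IDs the AM does not hold are skipped, since
     * the RM's view may be stale.
     */
    static Set<String> containersToRelease(Message msg, Set<String> held) {
        Set<String> release = new LinkedHashSet<>();
        if (msg.strict != null) {
            for (String id : msg.strict.containers) {
                if (held.contains(id)) release.add(id);
            }
        }
        if (msg.contract != null) {
            for (String id : msg.contract.containers) {
                if (held.contains(id)) release.add(id);
            }
        }
        return release;
    }

    public static void main(String[] args) {
        Set<String> held = new LinkedHashSet<>(Arrays.asList("c1", "c2", "c3"));
        Message msg = new Message(
            new StrictContract(Collections.singleton("c1")),
            new Contract(new LinkedHashSet<>(Arrays.asList("c2", "c9")),
                         Arrays.asList(1024)));
        System.out.println(containersToRelease(msg, held));
    }
}
```

A smarter AM could instead use the resource list to pick different containers of equivalent size, which is the flexibility the non-strict contract is meant to allow.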



[jira] [Commented] (YARN-45) Scheduler feedback to AM to release containers

2013-05-06 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-45?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13650246#comment-13650246
 ] 

Bikas Saha commented on YARN-45:


Overall, the approach looks good.

Would be great if you could add a version number to your patches.

The javadoc is trying to help by giving more information. However, if I think 
from the perspective of someone who doesn't understand YARN, the RM, 
scheduling, and preemption, the javadoc would be hard to understand. Can we 
rewrite this from the perspective of the user of the API? How are they 
supposed to interpret this data? What do they need to do?
{code}
+  /**
+   * Get the description of containers owned by the AM, but requested back by
+   * the cluster. Note that the RM may have an inconsistent view of the
+   * resources owned by the AM. The AM may elect to ignore some or all requests.
+   *
+   * The message is a snapshot of the resources the RM wants back from the AM.
+   * While demand persists, the RM will sustain its ask. Resources requested
+   * consistently over some duration may be forcibly killed by the RM.
{code}

In general, the javadocs and class names are a little hard for me to 
understand, but it just might be me :)
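
One way to read the quoted javadoc (each message is a snapshot, the ask is sustained while demand persists, and resources asked for consistently over some duration may be killed) is sketched below. The class, its method, and the `maxWaitMs` parameter are hypothetical and are not taken from the patch or from the RM implementation.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical bookkeeping, not the RM's actual code: tracks how long each
// container has been asked for without interruption.
final class SustainedAskTracker {
    private final long maxWaitMs;
    private final Map<String, Long> firstAsked = new HashMap<>();

    SustainedAskTracker(long maxWaitMs) {
        this.maxWaitMs = maxWaitMs;
    }

    /**
     * Records one snapshot of the containers currently requested back and
     * returns those that have been requested continuously for at least
     * maxWaitMs, i.e. the candidates for a forcible kill.
     */
    Set<String> update(Set<String> askedNow, long nowMs) {
        // Demand that disappeared from the snapshot is forgotten.
        firstAsked.keySet().retainAll(askedNow);
        Set<String> overdue = new TreeSet<>();
        for (String id : askedNow) {
            long since = firstAsked.computeIfAbsent(id, k -> nowMs);
            if (nowMs - since >= maxWaitMs) {
                overdue.add(id);
            }
        }
        return overdue;
    }
}
```

From the AM's side, this reading means a request that keeps reappearing in successive responses should be treated as increasingly urgent, while one that disappears can be dropped.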



[jira] [Commented] (YARN-45) Scheduler feedback to AM to release containers

2013-05-06 Thread Carlo Curino (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-45?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13650527#comment-13650527
 ] 

Carlo Curino commented on YARN-45:
--

bq. Would be great if you could add a version number to your patches.

Sorry, we weren't sure of the current convention.

{quote}
 - PreemptionMessage.strict should perhaps be named strictContract
explicitly. You did name the setters and the getters verbosely which
is good.
 - You should mark all the api getters and setters to be synchronized.
There are similar locking bugs in other existing records too but we
are tracking them elsewhere.
 - PreemptionContainer.getId() - Javadoc should refer to containers
instead of Resource?
 - PreemptionContract.getContainers() - Javadoc referring to
"ResourceManager may also include a @link
PreemptionContract that, if satisfied, may replace these" doesn't make
sense to me.
{quote}

Fixed all of these; last one was a copy/paste of an older version of
the code. Thanks for catching these.

[~bikassaha]: we took another attempt at the javadoc, but it's
probably still not sufficient. We opened YARN-XXX to track
documentation of this feature in the AM how-to, which we'll address
presently.

(thanks everyone for the great feedback!)




[jira] [Commented] (YARN-45) Scheduler feedback to AM to release containers

2013-05-06 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-45?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13650532#comment-13650532
 ] 

Bikas Saha commented on YARN-45:


If you don't mind, I will try to take a pass tomorrow morning at making some 
inline edits to the patch. Don't stop for me; I can always do it after the 
initial commit.



[jira] [Commented] (YARN-45) Scheduler feedback to AM to release containers

2013-05-06 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-45?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13650545#comment-13650545
 ] 

Hadoop QA commented on YARN-45:
---

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12582045/YARN-45.1.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/881//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/881//console

This message is automatically generated.



[jira] [Commented] (YARN-45) Scheduler feedback to AM to release containers

2013-05-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-45?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13650568#comment-13650568
 ] 

Hudson commented on YARN-45:


Integrated in Hadoop-trunk-Commit #3713 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/3713/])
YARN-45. Add protocol for schedulers to request containers back from
ApplicationMasters. Contributed by Carlo Curino and Chris Douglas. (Revision 
1479771)

 Result = SUCCESS
cdouglas : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1479771
Files : 
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/AllocateResponse.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/PreemptionContainer.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/PreemptionContract.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/PreemptionMessage.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/PreemptionResourceRequest.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/StrictPreemptionContract.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/impl/pb/AllocateResponsePBImpl.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/impl/pb/PreemptionContainerPBImpl.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/impl/pb/PreemptionContractPBImpl.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/impl/pb/PreemptionMessagePBImpl.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/impl/pb/PreemptionResourceRequestPBImpl.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/impl/pb/StrictPreemptionContractPBImpl.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/yarn_service_protos.proto
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/TestAMRMClientAsync.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/BuilderUtils.java


> Scheduler feedback to AM to release containers
> --
>
> Key: YARN-45
> URL: https://issues.apache.org/jira/browse/YARN-45
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Chris Douglas
>Assignee: Carlo Curino
> Fix For: 2.0.5-beta
>
> Attachments: YARN-45.1.patch, YARN-45.patch, YARN-45.patch, 
> YARN-45.patch, YARN-45.patch, YARN-45.patch, YARN-45.patch, 
> YARN-45_summary_of_alternatives.pdf
>
>
> The ResourceManager strikes a balance between cluster utilization and strict 
> enforcement of resource invariants in the cluster. Individual allocations of 
> containers must be reclaimed- or reserved- to restore the global invariants 
> when cluster load shifts. In some cases, the ApplicationMaster can respond to 
> fluctuations in resource availability without losing the work already 
> completed by that task (MAPREDUCE-4584). Supplying it with this information 
> would be helpful for overall cluster utilization [1]. To this end, we want to 
> establish a protocol for the RM to ask the AM to release containers.
> [1] http://research.yahoo.com/files/yl-2012-003.pdf

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-45) Scheduler feedback to AM to release containers

2013-05-07 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-45?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13650698#comment-13650698
 ] 

Hudson commented on YARN-45:


Integrated in Hadoop-Yarn-trunk #202 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/202/])
YARN-45. Add protocol for schedulers to request containers back from
ApplicationMasters. Contributed by Carlo Curino and Chris Douglas. (Revision 
1479771)

 Result = SUCCESS
cdouglas : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1479771




[jira] [Commented] (YARN-45) Scheduler feedback to AM to release containers

2013-05-07 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-45?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13650791#comment-13650791
 ] 

Hudson commented on YARN-45:


Integrated in Hadoop-Hdfs-trunk #1391 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1391/])
YARN-45. Add protocol for schedulers to request containers back from
ApplicationMasters. Contributed by Carlo Curino and Chris Douglas. (Revision 
1479771)

 Result = FAILURE
cdouglas : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1479771


> Scheduler feedback to AM to release containers
> --
>
> Key: YARN-45
> URL: https://issues.apache.org/jira/browse/YARN-45
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Chris Douglas
>Assignee: Carlo Curino
> Fix For: 2.0.5-beta
>
> Attachments: YARN-45.1.patch, YARN-45.patch, YARN-45.patch, 
> YARN-45.patch, YARN-45.patch, YARN-45.patch, YARN-45.patch, 
> YARN-45_summary_of_alternatives.pdf
>
>
> The ResourceManager strikes a balance between cluster utilization and strict 
> enforcement of resource invariants in the cluster. Individual allocations of 
> containers must be reclaimed, or reserved, to restore the global invariants 
> when cluster load shifts. In some cases, the ApplicationMaster can respond to 
> fluctuations in resource availability without losing the work already 
> completed by that task (MAPREDUCE-4584). Supplying it with this information 
> would be helpful for overall cluster utilization [1]. To this end, we want to 
> establish a protocol for the RM to ask the AM to release containers.
> [1] http://research.yahoo.com/files/yl-2012-003.pdf

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-45) Scheduler feedback to AM to release containers

2013-05-07 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-45?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13650846#comment-13650846
 ] 

Hudson commented on YARN-45:


Integrated in Hadoop-Mapreduce-trunk #1418 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1418/])
YARN-45. Add protocol for schedulers to request containers back from
ApplicationMasters. Contributed by Carlo Curino and Chris Douglas. (Revision 
1479771)

 Result = SUCCESS
cdouglas : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1479771
Files : 
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/AllocateResponse.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/PreemptionContainer.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/PreemptionContract.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/PreemptionMessage.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/PreemptionResourceRequest.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/StrictPreemptionContract.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/impl/pb/AllocateResponsePBImpl.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/impl/pb/PreemptionContainerPBImpl.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/impl/pb/PreemptionContractPBImpl.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/impl/pb/PreemptionMessagePBImpl.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/impl/pb/PreemptionResourceRequestPBImpl.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/impl/pb/StrictPreemptionContractPBImpl.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/yarn_service_protos.proto
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/TestAMRMClientAsync.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/BuilderUtils.java

