[jira] [Updated] (YARN-3463) Integrate OrderingPolicy Framework with CapacityScheduler

2015-04-14 Thread Craig Welch (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Welch updated YARN-3463:
--
Attachment: (was: YARN-3643.61.patch)

 Integrate OrderingPolicy Framework with CapacityScheduler
 -

 Key: YARN-3463
 URL: https://issues.apache.org/jira/browse/YARN-3463
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: capacityscheduler
Reporter: Craig Welch
Assignee: Craig Welch
 Attachments: YARN-3463.50.patch, YARN-3463.61.patch


 Integrate the OrderingPolicy Framework with the CapacityScheduler



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3318) Create Initial OrderingPolicy Framework and FifoOrderingPolicy

2015-04-14 Thread Craig Welch (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Welch updated YARN-3318:
--
Attachment: YARN-3318.60.patch

 Create Initial OrderingPolicy Framework and FifoOrderingPolicy
 --

 Key: YARN-3318
 URL: https://issues.apache.org/jira/browse/YARN-3318
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: scheduler
Reporter: Craig Welch
Assignee: Craig Welch
 Attachments: YARN-3318.13.patch, YARN-3318.14.patch, 
 YARN-3318.17.patch, YARN-3318.34.patch, YARN-3318.35.patch, 
 YARN-3318.36.patch, YARN-3318.39.patch, YARN-3318.45.patch, 
 YARN-3318.47.patch, YARN-3318.48.patch, YARN-3318.52.patch, 
 YARN-3318.53.patch, YARN-3318.56.patch, YARN-3318.57.patch, 
 YARN-3318.58.patch, YARN-3318.59.patch, YARN-3318.60.patch


 Create the initial framework required for using OrderingPolicies and an 
 initial FifoOrderingPolicy





[jira] [Commented] (YARN-3318) Create Initial OrderingPolicy Framework and FifoOrderingPolicy

2015-04-14 Thread Craig Welch (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14494606#comment-14494606
 ] 

Craig Welch commented on YARN-3318:
---

bq. Beyond SchedulerApplicationAttempt which is pending YARN-3361, Few comments 
on latest patch:

I think you misunderstood: the patch doesn't depend on YARN-3361, but after 
YARN-3361 goes in some things should be removed from this patch.  In any case, 
I decided that it really belonged in the integration patch, [YARN-3463], so 
I've dropped it from here and it will be committed there.

bq. 1) CACHED_USED/CACHED_PENDING don't used by anybody, are they pending 
YARN-3361 as well? 

No, that was a miss during the ResourceUsage usage changes - something which 
could actually affect functionality!  Amazing; fixed.

bq. 2) AbstractComparatorOrderingPolicy doesn't handle locks, I suggest to add 
synchronized lock to all methods if you think it will only be used in 
single-thread scenario

Since the API returns iterators which must be externally synchronized, the 
OrderingPolicy documentation makes it clear that the burden of synchronization 
rests with the callers (the schedulers).  That's the threading model, so 
synchronizing here would be redundant.

bq. 3) FifoComparator, it will be used by FairOrderingPolicy as well? If so, 
better to make it to a separated class

sure, done

bq. 4) How about call getInfo to getStatusMessage, since the info is too 
generic. And add a comment to indicate it will be used for logger printing.

sure, done

bq. 5) getComparator of AbstractComparatorOrderingPolicy is @VisibleForTest?

sure, done


 Create Initial OrderingPolicy Framework and FifoOrderingPolicy
 --

 Key: YARN-3318
 URL: https://issues.apache.org/jira/browse/YARN-3318
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: scheduler
Reporter: Craig Welch
Assignee: Craig Welch
 Attachments: YARN-3318.13.patch, YARN-3318.14.patch, 
 YARN-3318.17.patch, YARN-3318.34.patch, YARN-3318.35.patch, 
 YARN-3318.36.patch, YARN-3318.39.patch, YARN-3318.45.patch, 
 YARN-3318.47.patch, YARN-3318.48.patch, YARN-3318.52.patch, 
 YARN-3318.53.patch, YARN-3318.56.patch, YARN-3318.57.patch, 
 YARN-3318.58.patch, YARN-3318.59.patch, YARN-3318.60.patch


 Create the initial framework required for using OrderingPolicies and an 
 initial FifoOrderingPolicy





[jira] [Updated] (YARN-3318) Create Initial OrderingPolicy Framework and FifoOrderingPolicy

2015-04-14 Thread Craig Welch (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Welch updated YARN-3318:
--
Attachment: YARN-3318.61.patch

Fix findbugs recurrence due to class name change

 Create Initial OrderingPolicy Framework and FifoOrderingPolicy
 --

 Key: YARN-3318
 URL: https://issues.apache.org/jira/browse/YARN-3318
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: scheduler
Reporter: Craig Welch
Assignee: Craig Welch
 Attachments: YARN-3318.13.patch, YARN-3318.14.patch, 
 YARN-3318.17.patch, YARN-3318.34.patch, YARN-3318.35.patch, 
 YARN-3318.36.patch, YARN-3318.39.patch, YARN-3318.45.patch, 
 YARN-3318.47.patch, YARN-3318.48.patch, YARN-3318.52.patch, 
 YARN-3318.53.patch, YARN-3318.56.patch, YARN-3318.57.patch, 
 YARN-3318.58.patch, YARN-3318.59.patch, YARN-3318.60.patch, YARN-3318.61.patch


 Create the initial framework required for using OrderingPolicies and an 
 initial FifoOrderingPolicy





[jira] [Commented] (YARN-3463) Integrate OrderingPolicy Framework with CapacityScheduler

2015-04-14 Thread Craig Welch (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14494892#comment-14494892
 ] 

Craig Welch commented on YARN-3463:
---

bq. 1) Preemption policy changes seems not correct to me...
So, I believe the behavior for FIFO should be exactly as it was before - and 
all of the preemption tests were passing with the combined patch, so I think 
this is the case.  Fairness preemption would be handled on [YARN-3319].  I 
don't mind moving the final integration for preemption into another jira, but I 
don't believe the concern is correct / that there is any behavioral change for 
FIFO.

bq. 2) WebUI, REST API and CLI changes are public APIs and related to core 
changes in CS...

There are no REST API or CLI changes in the patch anymore; we agreed on 
[YARN-3318] 
https://issues.apache.org/jira/browse/YARN-3318?focusedCommentId=14393347&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14393347
 that the WebUI changes should stay with the initial integration - they are 
very important/needed to be able to confirm that configuration was accomplished 
properly; without them there is no way to tell what policy is active.

bq. So I suggest only leave core changes for CS including configuration 

So, I think that given the WebUI point above, this is already the case, with 
the possible exception of preemption - which, again, I believe has seen no 
behavior change for FIFO, and FIFO is all we have at this time.



 Integrate OrderingPolicy Framework with CapacityScheduler
 -

 Key: YARN-3463
 URL: https://issues.apache.org/jira/browse/YARN-3463
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: capacityscheduler
Reporter: Craig Welch
Assignee: Craig Welch
 Attachments: YARN-3463.50.patch, YARN-3643.58.patch


 Integrate the OrderingPolicy Framework with the CapacityScheduler





[jira] [Updated] (YARN-3463) Integrate OrderingPolicy Framework with CapacityScheduler

2015-04-14 Thread Craig Welch (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Welch updated YARN-3463:
--
Attachment: YARN-3319.61.patch

Updated, matches/should apply and work with [YARN-3318] .61.patch

 Integrate OrderingPolicy Framework with CapacityScheduler
 -

 Key: YARN-3463
 URL: https://issues.apache.org/jira/browse/YARN-3463
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: capacityscheduler
Reporter: Craig Welch
Assignee: Craig Welch
 Attachments: YARN-3319.61.patch, YARN-3463.50.patch, 
 YARN-3643.58.patch


 Integrate the OrderingPolicy Framework with the CapacityScheduler





[jira] [Updated] (YARN-3319) Implement a FairOrderingPolicy

2015-04-13 Thread Craig Welch (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Welch updated YARN-3319:
--
Attachment: YARN-3319.58.patch

 Implement a FairOrderingPolicy
 --

 Key: YARN-3319
 URL: https://issues.apache.org/jira/browse/YARN-3319
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: scheduler
Reporter: Craig Welch
Assignee: Craig Welch
 Attachments: YARN-3319.13.patch, YARN-3319.14.patch, 
 YARN-3319.17.patch, YARN-3319.35.patch, YARN-3319.39.patch, 
 YARN-3319.45.patch, YARN-3319.47.patch, YARN-3319.53.patch, YARN-3319.58.patch


 Implement a FairOrderingPolicy which prefers to allocate to 
 SchedulerProcesses with the least current usage, very similar to the 
 FairScheduler's FairSharePolicy.
 The policy will offer allocations to applications in a queue in order of 
 least resources used, and preempt applications in reverse order (from most 
 resources used). This will include conditional support for sizeBasedWeight 
 style adjustment.
 Optionally, based on a configuration to enable sizeBasedWeight (default 
 false), an adjustment to boost larger applications (to offset the natural 
 preference for smaller applications) will adjust the resource usage value 
 based on demand, dividing it by:
 Math.log1p(app memory demand) / Math.log(2);
 In cases where the above is indeterminate (two applications are equal after 
 this comparison), behavior falls back to comparison based on the application 
 id, which is generally lexically FIFO.
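A quick sketch of the sizeBasedWeight arithmetic described above; the method name is illustrative, not the patch's actual code:

```java
public class SizeBasedWeightSketch {
    // Divide current usage by a log-scaled function of demand, per the
    // formula above: Math.log1p(demand) / Math.log(2).
    public static double adjustedUsage(double usage, double demand) {
        return usage / (Math.log1p(demand) / Math.log(2));
    }

    public static void main(String[] args) {
        // demand == 1: log1p(1) == ln 2, so the divisor is ~1 and usage is
        // essentially unchanged
        System.out.println(adjustedUsage(10.0, 1.0));
        // demand == 3: log1p(3) == 2 ln 2, so usage is roughly halved - the
        // larger application compares as if it were using less, boosting it
        System.out.println(adjustedUsage(10.0, 3.0));
    }
}
```

Larger demand lowers the adjusted usage, offsetting the policy's natural preference for small applications.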





[jira] [Updated] (YARN-3318) Create Initial OrderingPolicy Framework and FifoOrderingPolicy

2015-04-13 Thread Craig Welch (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Welch updated YARN-3318:
--
Attachment: YARN-3318.58.patch

Missed attaching ResourceUsage

 Create Initial OrderingPolicy Framework and FifoOrderingPolicy
 --

 Key: YARN-3318
 URL: https://issues.apache.org/jira/browse/YARN-3318
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: scheduler
Reporter: Craig Welch
Assignee: Craig Welch
 Attachments: YARN-3318.13.patch, YARN-3318.14.patch, 
 YARN-3318.17.patch, YARN-3318.34.patch, YARN-3318.35.patch, 
 YARN-3318.36.patch, YARN-3318.39.patch, YARN-3318.45.patch, 
 YARN-3318.47.patch, YARN-3318.48.patch, YARN-3318.52.patch, 
 YARN-3318.53.patch, YARN-3318.56.patch, YARN-3318.57.patch, YARN-3318.58.patch


 Create the initial framework required for using OrderingPolicies and an 
 initial FifoOrderingPolicy





[jira] [Updated] (YARN-3318) Create Initial OrderingPolicy Framework and FifoOrderingPolicy

2015-04-13 Thread Craig Welch (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Welch updated YARN-3318:
--
Attachment: YARN-3318.57.patch

Better ResourceUsage usage

 Create Initial OrderingPolicy Framework and FifoOrderingPolicy
 --

 Key: YARN-3318
 URL: https://issues.apache.org/jira/browse/YARN-3318
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: scheduler
Reporter: Craig Welch
Assignee: Craig Welch
 Attachments: YARN-3318.13.patch, YARN-3318.14.patch, 
 YARN-3318.17.patch, YARN-3318.34.patch, YARN-3318.35.patch, 
 YARN-3318.36.patch, YARN-3318.39.patch, YARN-3318.45.patch, 
 YARN-3318.47.patch, YARN-3318.48.patch, YARN-3318.52.patch, 
 YARN-3318.53.patch, YARN-3318.56.patch, YARN-3318.57.patch


 Create the initial framework required for using OrderingPolicies and an 
 initial FifoOrderingPolicy





[jira] [Updated] (YARN-3463) Integrate OrderingPolicy Framework with CapacityScheduler

2015-04-13 Thread Craig Welch (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Welch updated YARN-3463:
--
Attachment: YARN-3643.58.patch

 Integrate OrderingPolicy Framework with CapacityScheduler
 -

 Key: YARN-3463
 URL: https://issues.apache.org/jira/browse/YARN-3463
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: capacityscheduler
Reporter: Craig Welch
Assignee: Craig Welch
 Attachments: YARN-3463.50.patch, YARN-3643.58.patch


 Integrate the OrderingPolicy Framework with the CapacityScheduler





[jira] [Updated] (YARN-3318) Create Initial OrderingPolicy Framework and FifoOrderingPolicy

2015-04-13 Thread Craig Welch (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Welch updated YARN-3318:
--
Attachment: YARN-3318.59.patch

checkpatch fixes

 Create Initial OrderingPolicy Framework and FifoOrderingPolicy
 --

 Key: YARN-3318
 URL: https://issues.apache.org/jira/browse/YARN-3318
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: scheduler
Reporter: Craig Welch
Assignee: Craig Welch
 Attachments: YARN-3318.13.patch, YARN-3318.14.patch, 
 YARN-3318.17.patch, YARN-3318.34.patch, YARN-3318.35.patch, 
 YARN-3318.36.patch, YARN-3318.39.patch, YARN-3318.45.patch, 
 YARN-3318.47.patch, YARN-3318.48.patch, YARN-3318.52.patch, 
 YARN-3318.53.patch, YARN-3318.56.patch, YARN-3318.57.patch, 
 YARN-3318.58.patch, YARN-3318.59.patch


 Create the initial framework required for using OrderingPolicies and an 
 initial FifoOrderingPolicy





[jira] [Commented] (YARN-3318) Create Initial OrderingPolicy Framework and FifoOrderingPolicy

2015-04-13 Thread Craig Welch (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14491959#comment-14491959
 ] 

Craig Welch commented on YARN-3318:
---

Added comments to CompoundComparator, re-introduced getId (in lieu of 
getName), and switched to ResourceUsage to avoid an unnecessary dependency on 
[YARN-3361].  SchedulerApplicationAttempt manages pending in a way it won't 
have to long term; this doesn't affect the API and allows these to be 
committed in any order.  Sticking with Scheduling instead of Cached, as 
suggested earlier by [~vinodkv], to keep its purpose clear (Cached is too 
general) and because it can't be used as a generalized cache of the values - 
the lifecycle is tied to use by OrderingPolicies.

 Create Initial OrderingPolicy Framework and FifoOrderingPolicy
 --

 Key: YARN-3318
 URL: https://issues.apache.org/jira/browse/YARN-3318
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: scheduler
Reporter: Craig Welch
Assignee: Craig Welch
 Attachments: YARN-3318.13.patch, YARN-3318.14.patch, 
 YARN-3318.17.patch, YARN-3318.34.patch, YARN-3318.35.patch, 
 YARN-3318.36.patch, YARN-3318.39.patch, YARN-3318.45.patch, 
 YARN-3318.47.patch, YARN-3318.48.patch, YARN-3318.52.patch, 
 YARN-3318.53.patch, YARN-3318.56.patch


 Create the initial framework required for using OrderingPolicies and an 
 initial FifoOrderingPolicy





[jira] [Commented] (YARN-3318) Create Initial OrderingPolicy Framework and FifoOrderingPolicy

2015-04-13 Thread Craig Welch (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14492615#comment-14492615
 ] 

Craig Welch commented on YARN-3318:
---

Looking again at using ResourceUsage instead of the initial use of application 
demand and consumption: while it may be preferable for future cases like 
queues with node-label-aware policies, there are deficiencies which need to be 
addressed to use it for the initial case, and doing so adds complexity.  In 
fact, for the initial case, this approach is inferior.

ResourceUsage is still a bit rough and incomplete: get does not properly handle 
the ANY/ALL case, which is what we need for application fairness.  Otherwise, 
applications whose resource requests are labeled something other than NO_LABEL 
will be erroneously preferred for scheduling in the fair case.  The prior 
approach worked with full consumption and demand, did not have this issue, and 
did not require additional change to support fairness properly.

Even supporting ANY/ALL in ResourceUsage is a little tricky: I see no reason 
why someone could not set values on ResourceUsage using the ANY label 
definition, and then there is a question as to the proper behavior for an ANY 
get request - should it sum the values for all labels (which is, in some 
sense, correct), or just return the previously set ANY value?  Should we 
disallow setting ANY?  (That seems a bit arbitrary...)  My suggestion is that 
we introduce explicit getAll(Used, Pending, etc.) methods (not an ALL 
CommonNodeLabelsManager constant - I think that just moves/replicates the 
existing problem).  There would be no corresponding setAll.  getAll(XYZ) would 
iterate all labels in ResourceUsage for the passed ResourceType and return a 
total.

For OrderingPolicy, the values should be cached on ResourceUsage instead of in 
SchedulableEntity for cases where that is needed - cloning an entire 
ResourceUsage would be expensive, inefficient, and unnecessary.  We could add 
a separate cache collection in ResourceUsage, but I think it would actually be 
better to add values to the ResourceType enum: SCHEDULING_USED and 
SCHEDULING_PENDING.

When updating the cached value for used, OrderingPolicy would call 
getAllUsed() on ResourceUsage and set the resulting value with set(ANY node 
label expression, SCHEDULING_USED ResourceType); for demand, getAllPending() 
and then set(ANY node label expression, SCHEDULING_PENDING).

When getting the cached value, OrderingPolicy would call getUsed(ANY nl 
expression, SCHEDULING_USED ResourceType) and, for pending, getPending(ANY, 
SCHEDULING_PENDING).
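Under this proposal, the update/read flow might look roughly like the following sketch; `ResourceUsageSketch` and its members are illustrative stand-ins for the real ResourceUsage API, not the actual class:

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative stand-in for ResourceUsage: per-label, per-type long values.
class ResourceUsageSketch {
    static final String ANY = "*";

    enum Type { USED, PENDING, SCHEDULING_USED, SCHEDULING_PENDING }

    private final Map<String, Map<Type, Long>> byLabel = new HashMap<>();

    void set(String label, Type type, long value) {
        byLabel.computeIfAbsent(label, k -> new HashMap<>()).put(type, value);
    }

    long get(String label, Type type) {
        return byLabel.getOrDefault(label, Map.of()).getOrDefault(type, 0L);
    }

    // Proposed getAll(type): iterate every label and total the values,
    // rather than trusting a previously set ANY entry.
    long getAll(Type type) {
        long total = 0;
        for (Map<Type, Long> perType : byLabel.values()) {
            total += perType.getOrDefault(type, 0L);
        }
        return total;
    }
}

public class SchedulingCacheDemo {
    static long cachedAllUsed() {
        ResourceUsageSketch usage = new ResourceUsageSketch();
        usage.set("gpu", ResourceUsageSketch.Type.USED, 4096);
        usage.set("", ResourceUsageSketch.Type.USED, 1024);
        // Update step: total used across all labels, cached under ANY with
        // the proposed SCHEDULING_USED type.
        usage.set(ResourceUsageSketch.ANY,
                  ResourceUsageSketch.Type.SCHEDULING_USED,
                  usage.getAll(ResourceUsageSketch.Type.USED));
        // Read step: the ordering policy compares against the cached snapshot.
        return usage.get(ResourceUsageSketch.ANY,
                         ResourceUsageSketch.Type.SCHEDULING_USED);
    }

    public static void main(String[] args) {
        System.out.println(cachedAllUsed());
    }
}
```

The same shape applies to pending/SCHEDULING_PENDING; the cache lives in the usage object rather than being a clone of it.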

I'm inclined to roll forward with using ResourceUsage despite this additional 
scope, to ease future use cases, but we need to be very careful about 
continuing to pull in additional change and complexity which is not required 
right now, and should avoid doing so again this iteration.  It's good to aim 
for a stable API, but it's also good to complete the initial functionality, 
and to realize it's not possible to anticipate all future needs - it is highly 
likely there will be some change to APIs like this as the system evolves.

 Create Initial OrderingPolicy Framework and FifoOrderingPolicy
 --

 Key: YARN-3318
 URL: https://issues.apache.org/jira/browse/YARN-3318
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: scheduler
Reporter: Craig Welch
Assignee: Craig Welch
 Attachments: YARN-3318.13.patch, YARN-3318.14.patch, 
 YARN-3318.17.patch, YARN-3318.34.patch, YARN-3318.35.patch, 
 YARN-3318.36.patch, YARN-3318.39.patch, YARN-3318.45.patch, 
 YARN-3318.47.patch, YARN-3318.48.patch, YARN-3318.52.patch, 
 YARN-3318.53.patch, YARN-3318.56.patch


 Create the initial framework required for using OrderingPolicies and an 
 initial FifoOrderingPolicy





[jira] [Updated] (YARN-3318) Create Initial OrderingPolicy Framework and FifoOrderingPolicy

2015-04-12 Thread Craig Welch (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Welch updated YARN-3318:
--
Attachment: YARN-3318.56.patch

 Create Initial OrderingPolicy Framework and FifoOrderingPolicy
 --

 Key: YARN-3318
 URL: https://issues.apache.org/jira/browse/YARN-3318
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: scheduler
Reporter: Craig Welch
Assignee: Craig Welch
 Attachments: YARN-3318.13.patch, YARN-3318.14.patch, 
 YARN-3318.17.patch, YARN-3318.34.patch, YARN-3318.35.patch, 
 YARN-3318.36.patch, YARN-3318.39.patch, YARN-3318.45.patch, 
 YARN-3318.47.patch, YARN-3318.48.patch, YARN-3318.52.patch, 
 YARN-3318.53.patch, YARN-3318.56.patch


 Create the initial framework required for using OrderingPolicies and an 
 initial FifoOrderingPolicy





[jira] [Commented] (YARN-3318) Create Initial OrderingPolicy Framework and FifoOrderingPolicy

2015-04-09 Thread Craig Welch (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14487631#comment-14487631
 ] 

Craig Welch commented on YARN-3318:
---

bq. ...Do we really see non-comparator based ordering-policy. We are 
unnecessarily adding two abstractions - adding policies and comparators...

In the context of the code so far, the comparator-based approach is specific 
to compounding comparators to achieve functionality (priority + fifo, fair + 
fifo, etc).  This was the initial motivation for the two-level API & 
configuration: the broader surface of the policy, which would allow for 
different collection types, sorting on demand, etc. (the original policy), and 
the narrower one within that (comparator) for the cases where comparator logic 
was sufficient - e.g. where sharing a collection (for composition) and a 
collection type (a tree, for efficient resorting of individual elements when 
required) was possible.

The two-level API & configuration was not well received.  Offline, Wangda has 
indicated that he thinks there are policies coming up which will need the 
wider, initial API, with control over the collection, sorting, etc.  
Supporting policy composition for those cases would be very awkward & is not 
really worth pursuing.

The various competing tradeoffs - the aversion to a multilevel API, the need 
for the higher-level API, and the ability to compose policies - create 
something of a tension.  I don't think it's realistic to try to accomplish it 
all together; the result would be Frankensteinian at best.  Something has to 
go.  Originally, I chose the multilevel API to resolve the dilemma; I like 
that choice, but it seems unpopular with the crowd.  Given that, the other 
optional dynamic is the ability to compose policies (there's no requirement 
for either of these as far as I can tell - it is a bonus feature).  While I 
like the composition approach, it can't be maintained as such with the broader 
API and without the multilevel config/API.  As one of these has to go, and it 
appears it can't be the broader API or the multilevel API, I suppose it will 
have to be composition.  Internally there can be some composition, of course, 
but it won't be transparent/exposed/configurable as it was initially.
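For what it's worth, the comparator compounding discussed above amounts to plain comparator composition; a minimal sketch with made-up fields, not the actual YARN classes:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Minimal stand-in for a schedulable entity; the fields are illustrative.
class Proc {
    final String appId;   // ids increase over time, so lexical order ~ FIFO
    final int priority;   // higher priority should schedule first

    Proc(String appId, int priority) {
        this.appId = appId;
        this.priority = priority;
    }
}

public class CompoundComparatorDemo {
    // Compounding comparators: priority first, then FIFO on application id
    // as the tiebreaker - the same shape as priority + fifo or fair + fifo.
    static final Comparator<Proc> PRIORITY_THEN_FIFO =
        Comparator.<Proc>comparingInt(p -> -p.priority)
                  .thenComparing(p -> p.appId);

    static List<String> sortedIds(List<Proc> procs) {
        List<Proc> copy = new ArrayList<>(procs);
        copy.sort(PRIORITY_THEN_FIFO);
        List<String> ids = new ArrayList<>();
        for (Proc p : copy) {
            ids.add(p.appId);
        }
        return ids;
    }

    public static void main(String[] args) {
        List<Proc> procs = List.of(
            new Proc("app-003", 1),
            new Proc("app-001", 1),
            new Proc("app-002", 5));
        System.out.println(sortedIds(procs));
    }
}
```

Each narrow comparator stays independently testable while the compound defines the queue's overall ordering.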

I'll put out a patch with that in a bit.

 Create Initial OrderingPolicy Framework and FifoOrderingPolicy
 --

 Key: YARN-3318
 URL: https://issues.apache.org/jira/browse/YARN-3318
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: scheduler
Reporter: Craig Welch
Assignee: Craig Welch
 Attachments: YARN-3318.13.patch, YARN-3318.14.patch, 
 YARN-3318.17.patch, YARN-3318.34.patch, YARN-3318.35.patch, 
 YARN-3318.36.patch, YARN-3318.39.patch, YARN-3318.45.patch, 
 YARN-3318.47.patch, YARN-3318.48.patch


 Create the initial framework required for using OrderingPolicies and an 
 initial FifoOrderingPolicy





[jira] [Updated] (YARN-3318) Create Initial OrderingPolicy Framework and FifoOrderingPolicy

2015-04-09 Thread Craig Welch (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Welch updated YARN-3318:
--
Attachment: YARN-3318.52.patch

Update, removing composition in favor of broader interface

 Create Initial OrderingPolicy Framework and FifoOrderingPolicy
 --

 Key: YARN-3318
 URL: https://issues.apache.org/jira/browse/YARN-3318
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: scheduler
Reporter: Craig Welch
Assignee: Craig Welch
 Attachments: YARN-3318.13.patch, YARN-3318.14.patch, 
 YARN-3318.17.patch, YARN-3318.34.patch, YARN-3318.35.patch, 
 YARN-3318.36.patch, YARN-3318.39.patch, YARN-3318.45.patch, 
 YARN-3318.47.patch, YARN-3318.48.patch, YARN-3318.52.patch


 Create the initial framework required for using OrderingPolicies and an 
 initial FifoOrderingPolicy





[jira] [Commented] (YARN-3293) Track and display capacity scheduler health metrics in web UI

2015-04-08 Thread Craig Welch (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14485466#comment-14485466
 ] 

Craig Welch commented on YARN-3293:
---

Overall +1, looks good to me.  One additional thing occurred to me when looking 
it over again: I think that CapacitySchedulerHealthInfo in the web DAO is, for 
the most part, cross-scheduler.  Does it make sense to factor most of it up 
into a generalized SchedulerHealthInfo with all the common pieces and extend 
it (to CapacitySchedulerHealthInfo) just for the CS-specific constructor?

 Track and display capacity scheduler health metrics in web UI
 -

 Key: YARN-3293
 URL: https://issues.apache.org/jira/browse/YARN-3293
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: capacityscheduler
Reporter: Varun Vasudev
Assignee: Varun Vasudev
 Attachments: Screen Shot 2015-03-30 at 4.30.14 PM.png, 
 apache-yarn-3293.0.patch, apache-yarn-3293.1.patch, apache-yarn-3293.2.patch, 
 apache-yarn-3293.4.patch, apache-yarn-3293.5.patch, apache-yarn-3293.6.patch


 It would be good to display metrics that let users know about the health of 
 the capacity scheduler in the web UI. Today it is hard to get an idea if the 
 capacity scheduler is functioning correctly. Metrics such as the time for the 
 last allocation, etc.





[jira] [Commented] (YARN-3293) Track and display capacity scheduler health metrics in web UI

2015-04-08 Thread Craig Welch (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14485557#comment-14485557
 ] 

Craig Welch commented on YARN-3293:
---

Your call - I think it's also fine to wait to do this until we do FairScheduler 
integration, when we are clear on exactly what needs to happen (it may be 
premature to do it now, not entirely sure), but ultimately I think as much as 
can be shared should be.

 Track and display capacity scheduler health metrics in web UI
 -

 Key: YARN-3293
 URL: https://issues.apache.org/jira/browse/YARN-3293
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: capacityscheduler
Reporter: Varun Vasudev
Assignee: Varun Vasudev
 Attachments: Screen Shot 2015-03-30 at 4.30.14 PM.png, 
 apache-yarn-3293.0.patch, apache-yarn-3293.1.patch, apache-yarn-3293.2.patch, 
 apache-yarn-3293.4.patch, apache-yarn-3293.5.patch, apache-yarn-3293.6.patch


 It would be good to display metrics that let users know about the health of 
 the capacity scheduler in the web UI. Today it is hard to get an idea if the 
 capacity scheduler is functioning correctly. Metrics such as the time for the 
 last allocation, etc.





[jira] [Commented] (YARN-2696) Queue sorting in CapacityScheduler should consider node label

2015-04-08 Thread Craig Welch (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14485954#comment-14485954
 ] 

Craig Welch commented on YARN-2696:
---

Why?  And what kind of consideration, exactly?

 Queue sorting in CapacityScheduler should consider node label
 -

 Key: YARN-2696
 URL: https://issues.apache.org/jira/browse/YARN-2696
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: capacityscheduler, resourcemanager
Reporter: Wangda Tan
Assignee: Wangda Tan

 In the past, when trying to allocate containers under a parent queue in 
 CapacityScheduler, the parent queue would choose child queues by used 
 resource, from smallest to largest. 
 Now that we support node labels in CapacityScheduler, we should also consider 
 used resource in child queues by node label when allocating resources.
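As a sketch of the label-aware ordering being asked for; the names here are illustrative, not the CapacityScheduler classes:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative: a parent queue picking child queues by used resource for the
// node-label partition currently being allocated, not by total usage.
class ChildQueueSketch {
    final String name;
    final Map<String, Long> usedByLabel = new HashMap<>();

    ChildQueueSketch(String name) {
        this.name = name;
    }

    long used(String label) {
        return usedByLabel.getOrDefault(label, 0L);
    }
}

public class LabelAwareQueueSort {
    // Order children smallest-to-largest by usage under the given label.
    static List<String> orderFor(List<ChildQueueSketch> queues, String label) {
        List<ChildQueueSketch> copy = new ArrayList<>(queues);
        copy.sort(Comparator.comparingLong((ChildQueueSketch q) -> q.used(label)));
        List<String> names = new ArrayList<>();
        for (ChildQueueSketch q : copy) {
            names.add(q.name);
        }
        return names;
    }

    public static void main(String[] args) {
        ChildQueueSketch a = new ChildQueueSketch("a");
        a.usedByLabel.put("", 100L);
        a.usedByLabel.put("gpu", 10L);
        ChildQueueSketch b = new ChildQueueSketch("b");
        b.usedByLabel.put("", 50L);
        b.usedByLabel.put("gpu", 40L);
        List<ChildQueueSketch> queues = List.of(a, b);
        // For the default partition b has used less, but for the "gpu"
        // partition a has used less and should be offered resources first.
        System.out.println(orderFor(queues, ""));
        System.out.println(orderFor(queues, "gpu"));
    }
}
```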





[jira] [Commented] (YARN-3318) Create Initial OrderingPolicy Framework and FifoOrderingPolicy

2015-04-08 Thread Craig Welch (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14486312#comment-14486312
 ] 

Craig Welch commented on YARN-3318:
---

bq. 1) Regarding OrderingPolicy and SchedulingOrder responsibilities:

SchedulingOrder has multiple purposes, including:

1. Housing supporting code for using policies common across schedulers, e.g. a 
common implementation of behavior 
2. Allowing for the composition of multiple policies together to accomplish 
desired queue behavior - it is awkward to factor the functionality in 
SchedulingOrder down into the policies, as multiple policies are in play for 
one instance of the logic in SchedulingOrder

Although I mentioned that it could be made abstract some day if needed, that's 
not its current purpose; the above are.

bq. ...Looking at methods of OrderingPolicy, most of them are just pass through 
parameters to OrderingPolicy, and rest of them are instantiation 
OrderingPolicies...

Well, no - it has quite a lot of implementation logic around managing the 
SchedulerProcess collection and the interactions between it and multiple 
policies; it is certainly not limited to factory operations.

bq. OrderingPolicy should be a per-queue instance or global library

OrderingPolicies are per-queue and stateful in terms of configuration specific 
to that queue.  For the reasons mentioned above regarding the composition of 
policies, they do not (and should not) maintain queue state (scheduler 
processes, etc).

bq. Suggestion about OrderingPolicy interface design (if you agree with 1/2):

I don't agree, so I'm skipping the section.  The essential thing that I think 
is being missed here is that there is an intentional desire to compose 
ordering policies for a queue to achieve behavior - priority + fifo, or fair + 
fifo, etc. - and for that reason it is not appropriate to place the management 
of the collection of processes shared amongst policies into the policy 
implementation.  It belongs outside, as it is today, in SchedulingOrder.  
Mixing these together defeats composition and also mixes concerns, making the 
code more (not less) complex and certainly less clean in terms of separation 
of concerns and overall design and flow.

bq. ...CompoundOrderingPolicy is implemenation detail for FairOrderingPolicy, 
don't need put in the patch... 

No, it isn't - it's a feature of the generalized framework to support multiple 
policies being composed for a queue, and it's not specific to fairness at all 
(fairness may be the first user, but so might priority - in any case, any set 
of policies may use it; it's not specific to any one of them, and therefore is 
framework...)

bq.  ...About spliting SchedulableProcess to App and Queue...

I stand by my earlier explanation (and don't see anything here which alters 
it...): I anticipate that with the current factoring of SchedulerProcess we 
won't have to subtype it to support queues.  That said, the right time to do 
that is when we are adding such support - anticipatory complexity is the worst 
kind.  It is factored such that adding the subtyping should be additive if it 
needs to happen, so there is no need to anticipate it now (the room is there 
to add it, which is all we need; we should wait to add it until we know we 
need it).

bq. ...As I mentioned before, use ResourceUsage is much better...  

As I mentioned before, it doesn't presently supply the needed functionality; 
when it does, we can convert to it.









 Create Initial OrderingPolicy Framework and FifoOrderingPolicy
 --

 Key: YARN-3318
 URL: https://issues.apache.org/jira/browse/YARN-3318
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: scheduler
Reporter: Craig Welch
Assignee: Craig Welch
 Attachments: YARN-3318.13.patch, YARN-3318.14.patch, 
 YARN-3318.17.patch, YARN-3318.34.patch, YARN-3318.35.patch, 
 YARN-3318.36.patch, YARN-3318.39.patch, YARN-3318.45.patch, 
 YARN-3318.47.patch, YARN-3318.48.patch


 Create the initial framework required for using OrderingPolicies and an 
 initial FifoOrderingPolicy



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-3463) Integrate OrderingPolicy Framework with CapacityScheduler

2015-04-07 Thread Craig Welch (JIRA)
Craig Welch created YARN-3463:
-

 Summary: Integrate OrderingPolicy Framework with CapacityScheduler
 Key: YARN-3463
 URL: https://issues.apache.org/jira/browse/YARN-3463
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: capacityscheduler
Reporter: Craig Welch
Assignee: Craig Welch


Integrate the OrderingPolicy Framework with the CapacityScheduler





[jira] [Commented] (YARN-3319) Implement a FairOrderingPolicy

2015-04-07 Thread Craig Welch (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14484368#comment-14484368
 ] 

Craig Welch commented on YARN-3319:
---

Apply after applying YARN-3318 and YARN-3463

 Implement a FairOrderingPolicy
 --

 Key: YARN-3319
 URL: https://issues.apache.org/jira/browse/YARN-3319
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: scheduler
Reporter: Craig Welch
Assignee: Craig Welch
 Attachments: YARN-3319.13.patch, YARN-3319.14.patch, 
 YARN-3319.17.patch, YARN-3319.35.patch, YARN-3319.39.patch, 
 YARN-3319.45.patch, YARN-3319.47.patch


 Implement a FairOrderingPolicy which prefers to allocate to 
 SchedulerProcesses with least current usage, very similar to the 
 FairScheduler's FairSharePolicy.  
 The Policy will offer allocations to applications in a queue in order of 
 least resources used, and preempt applications in reverse order (from most 
 resources used). This will include conditional support for sizeBasedWeight 
 style adjustment
 Optionally, based on a conditional configuration to enable sizeBasedWeight 
 (default false), an adjustment to boost larger applications (to offset the 
 natural preference for smaller applications) will adjust the resource usage 
 value based on demand, dividing it by the below value:
 Math.log1p(app memory demand) / Math.log(2);
 In cases where the above is indeterminate (two applications are equal after 
 this comparison), behavior falls back to comparison based on the application 
 id, which is generally lexically FIFO for that comparison





[jira] [Updated] (YARN-3463) Integrate OrderingPolicy Framework with CapacityScheduler

2015-04-07 Thread Craig Welch (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Welch updated YARN-3463:
--
Attachment: YARN-3463.50.patch

Must apply YARN-3318 patch first

 Integrate OrderingPolicy Framework with CapacityScheduler
 -

 Key: YARN-3463
 URL: https://issues.apache.org/jira/browse/YARN-3463
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: capacityscheduler
Reporter: Craig Welch
Assignee: Craig Welch
 Attachments: YARN-3463.50.patch


 Integrate the OrderingPolicy Framework with the CapacityScheduler





[jira] [Updated] (YARN-3319) Implement a FairOrderingPolicy

2015-04-03 Thread Craig Welch (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Welch updated YARN-3319:
--
Summary: Implement a FairOrderingPolicy  (was: Implement a Fair 
SchedulerOrderingPolicy)

 Implement a FairOrderingPolicy
 --

 Key: YARN-3319
 URL: https://issues.apache.org/jira/browse/YARN-3319
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: scheduler
Reporter: Craig Welch
Assignee: Craig Welch
 Attachments: YARN-3319.13.patch, YARN-3319.14.patch, 
 YARN-3319.17.patch, YARN-3319.35.patch, YARN-3319.39.patch


 Implement a Fair Comparator for the Scheduler Comparator Ordering Policy 
 which prefers to allocate to SchedulerProcesses with least current usage, 
 very similar to the FairScheduler's FairSharePolicy.  
 The Policy will offer allocations to applications in a queue in order of 
 least resources used, and preempt applications in reverse order (from most 
 resources used). This will include conditional support for sizeBasedWeight 
 style adjustment
 An implementation of a Scheduler Comparator for use with the Scheduler 
 Comparator Ordering Policy will be built with the below comparison for 
 ordering applications for container assignment (ascending) and for preemption 
 (descending)
 Current resource usage - less usage is lesser
 Submission time - earlier is lesser
 Optionally, based on a conditional configuration to enable sizeBasedWeight 
 (default false), an adjustment to boost larger applications (to offset the 
 natural preference for smaller applications) will adjust the resource usage 
 value based on demand, dividing it by the below value:
 Math.log1p(app memory demand) / Math.log(2);
 In cases where the above is indeterminate (two applications are equal after 
 this comparison), behavior falls back to comparison based on the application 
 name, which is lexically FIFO for that comparison (first submitted is lesser)





[jira] [Updated] (YARN-3318) Create Initial OrderingPolicy Framework and FifoOrderingPolicy

2015-04-03 Thread Craig Welch (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Welch updated YARN-3318:
--
Summary: Create Initial OrderingPolicy Framework and FifoOrderingPolicy  
(was: Create Initial OrderingPolicy Framework)

 Create Initial OrderingPolicy Framework and FifoOrderingPolicy
 --

 Key: YARN-3318
 URL: https://issues.apache.org/jira/browse/YARN-3318
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: scheduler
Reporter: Craig Welch
Assignee: Craig Welch
 Attachments: YARN-3318.13.patch, YARN-3318.14.patch, 
 YARN-3318.17.patch, YARN-3318.34.patch, YARN-3318.35.patch, 
 YARN-3318.36.patch, YARN-3318.39.patch, YARN-3318.45.patch


 Create the initial framework required for using OrderingPolicies





[jira] [Updated] (YARN-3318) Create Initial OrderingPolicy Framework and FifoOrderingPolicy

2015-04-03 Thread Craig Welch (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Welch updated YARN-3318:
--
Description: Create the initial framework required for using 
OrderingPolicies and an initial FifoOrderingPolicy  (was: Create the initial 
framework required for using OrderingPolicies)

 Create Initial OrderingPolicy Framework and FifoOrderingPolicy
 --

 Key: YARN-3318
 URL: https://issues.apache.org/jira/browse/YARN-3318
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: scheduler
Reporter: Craig Welch
Assignee: Craig Welch
 Attachments: YARN-3318.13.patch, YARN-3318.14.patch, 
 YARN-3318.17.patch, YARN-3318.34.patch, YARN-3318.35.patch, 
 YARN-3318.36.patch, YARN-3318.39.patch, YARN-3318.45.patch


 Create the initial framework required for using OrderingPolicies and an 
 initial FifoOrderingPolicy





[jira] [Commented] (YARN-3318) Create Initial OrderingPolicy Framework and FifoOrderingPolicy

2015-04-03 Thread Craig Welch (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14395508#comment-14395508
 ] 

Craig Welch commented on YARN-3318:
---

[~vinodkv]

bq. ...We can strictly focus on the policy framework here...

Sure - limited the patch to the framework.

bq. ...You could also say SchedulableProcess...

SchedulableProcess it is, done

bq. I agree to this, but we are not in a position to support the APIs, CLI, 
config names in a supportable manner yet. They may or may not change depending 
on how parent queue policies, limit policies evolve. For that reason alone, I 
am saying that (1) Don't make the configurations public yet, or put a warning 
saying that they are unstable and (2) don't expose them in CLI , REST APIs yet. 
It's okay to put in the web UI, web UI scraping is not a contract.

You can't see it, because it's part of the CapacityScheduler integration, but I 
removed the CLI and proto-related changes.  There was no REST API change; the 
web UI change is still present.  I will mark the settings as unstable when they 
are added to the config files in the scheduler integration patch.

bq. SchedulerApplicationAttempt.getDemand() should be private

Done

bq. updateCaches() - updateState() / updateSchedulingState() as that is what 
it is doing?  getCachedConsumption() / getCachedDemand(): simply getCurrent*() 
? What is the need for reorderOnContainerAllocate () / 
reorderOnContainerRelease()?

These are now getSchedulingConsumption(), getSchedulingDemand(), and 
updateSchedulingState().

This is needed because mutable values which are used for ordering cannot be 
allowed to change for an item while it is in the tree; otherwise, in some 
cases, it will not be found during the delete-before-reinsert process which 
occurs when a schedulable's mutable comparison values change (for fairness: 
changes to consumption, and potentially demand).  Not all OrderingPolicies 
require reordering on these events; for efficiency, each gets to decide whether 
it does, hence the reorderOn methods.  These are now 
reorderForContainerAllocation and reorderForContainerRelease.
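To make the delete-before-reinsert constraint concrete, here is a small hypothetical sketch (the App/OrderedApps names are illustrative, not the patch's classes): a TreeSet can only locate an element while the values its comparator reads are unchanged, so mutation must happen between remove and re-add.

```java
import java.util.Comparator;
import java.util.TreeSet;

// Illustrative element with a mutable value that participates in ordering.
final class App {
  final String id;
  long consumption; // mutable value read by the comparator

  App(String id, long consumption) {
    this.id = id;
    this.consumption = consumption;
  }
}

final class OrderedApps {
  private final TreeSet<App> apps = new TreeSet<App>(
      new Comparator<App>() {
        @Override
        public int compare(App a, App b) {
          int byUsage = Long.compare(a.consumption, b.consumption);
          return byUsage != 0 ? byUsage : a.id.compareTo(b.id); // stable tie-break
        }
      });

  void add(App app) {
    apps.add(app);
  }

  // Mutation is only allowed while the element is outside the set.
  void containerAllocated(App app, long delta) {
    apps.remove(app);         // locate it while comparator inputs are stable
    app.consumption += delta; // now it is safe to change the ordering value
    apps.add(app);            // reinsert at its new position
  }

  App first() {
    return apps.first();
  }
}
```

If consumption were mutated in place instead, remove() could search the wrong branch of the tree and fail to find the element.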

bq. Move all the comparator related classed into their own package
No longer needed, as comparators are now just a property of policies; see below 
for details.

bq. This is really a ComparatorBasedOrderingPolicy. Do we really see 
non-comparator based ordering-policy. We are unnecessarily adding two 
abstractions - adding policies and comparators

Originally, there was a perceived need to support a more flexible interface 
than the comparator one, but also a desire to build up a simpler, composable 
abstraction to be used with an instance of the former, which had most of the 
hard stuff done.  Given that all of the policies we've contemplated building 
fit the latter abstraction, and the level of flexibility does not actually 
appear to be that different, I think it's fair to say that we only need what 
was previously the SchedulerComparator abstraction as a plugin point.  Given 
that, a slightly refactored version of the SchedulerComparator abstraction is 
now the only plugin point, and it now goes by the name of OrderingPolicy.  What 
was previously the OrderingPolicy is now a single concrete class implementing 
the surrounding logic, meant to be usable from any scheduler, named 
SchedulingOrder.  So: one abstraction, a comparator-based ordering policy.  If 
we really do find some day that we need flexibility we don't have, the 
SchedulingOrder class could be abstracted to provide that higher-level 
abstraction - but as we see no need for it now, and it appears we probably 
never will, there's no reason to do so at present.
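The shape being described - the policy provides a Comparator rather than being one - might look roughly like this sketch (the interface and class names here are hypothetical, not the committed API):

```java
import java.util.Comparator;

// Illustrative plugin point: a policy contributes only a Comparator, while a
// separate concrete class (SchedulingOrder, in the discussion above) owns the
// collection and iteration logic for any scheduler that uses it.
interface PolicySketch<T> {
  Comparator<T> getComparator();
}

// Trivial FIFO-like policy: a smaller submission serial orders first.
final class FifoPolicySketch implements PolicySketch<Long> {
  @Override
  public Comparator<Long> getComparator() {
    return new Comparator<Long>() {
      @Override
      public int compare(Long a, Long b) {
        return Long.compare(a, b);
      }
    };
  }
}
```

Because the comparator is obtained through a method call, a policy could later hand back different comparators (per entity subtype, say) without changing its callers - the room for flexibility the comment refers to.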

bq. ...Use className.getName()...

Done

[~leftnoteasy]

bq. ...I prefer what Vinod suggested, split SchedulerProcess to be 
QueueSchedulable and AppSchedulable ...

I don't see that he has suggested that.  In any case, with the removal of 
*Serial* and the move to compareInputOrderTo(), I don't at present see a need 
to have separate subtypes for app and queue to avoid dangling properties.  
And I think if we do it right we won't end up introducing them.  By splitting 
in the suggested way we commit ourselves to either multiple comparators (to use 
the differing functionality) or awkward testing of subtype logic in one 
comparator - so it basically moves the complexity/awkwardness rather than 
eliminating it.  I've refactored such that the Policy now provides a Comparator 
as opposed to extending it, so there is now room for it to provide multiple 
comparators and handle subtypes if need be, but I think we should wait until we 
see that we must do that before doing so, as I don't believe we will end up 
needing to (and if we do, existing code should need little change, and 
implementing what you suggest should be essentially additive...)

bq. ...About inherit relationships between interfaces/classes...

Policies will be composed to achieve combined capabilities yet the collection 
of 

[jira] [Updated] (YARN-3318) Create Initial OrderingPolicy Framework

2015-04-03 Thread Craig Welch (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Welch updated YARN-3318:
--
Summary: Create Initial OrderingPolicy Framework  (was: Create Initial 
OrderingPolicy Framework, integrate with CapacityScheduler LeafQueue supporting 
present behavior)

 Create Initial OrderingPolicy Framework
 ---

 Key: YARN-3318
 URL: https://issues.apache.org/jira/browse/YARN-3318
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: scheduler
Reporter: Craig Welch
Assignee: Craig Welch
 Attachments: YARN-3318.13.patch, YARN-3318.14.patch, 
 YARN-3318.17.patch, YARN-3318.34.patch, YARN-3318.35.patch, 
 YARN-3318.36.patch, YARN-3318.39.patch, YARN-3318.45.patch


 Create the initial framework required for using OrderingPolicies





[jira] [Updated] (YARN-3318) Create Initial OrderingPolicy Framework, integrate with CapacityScheduler LeafQueue supporting present behavior

2015-04-03 Thread Craig Welch (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Welch updated YARN-3318:
--
Description: Create the initial framework required for using 
OrderingPolicies  (was: Create the initial framework required for using 
OrderingPolicies with SchedulerApplicationAttempts and integrate with the 
CapacityScheduler.   This will include an implementation which is compatible 
with current FIFO behavior.)

 Create Initial OrderingPolicy Framework, integrate with CapacityScheduler 
 LeafQueue supporting present behavior
 ---

 Key: YARN-3318
 URL: https://issues.apache.org/jira/browse/YARN-3318
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: scheduler
Reporter: Craig Welch
Assignee: Craig Welch
 Attachments: YARN-3318.13.patch, YARN-3318.14.patch, 
 YARN-3318.17.patch, YARN-3318.34.patch, YARN-3318.35.patch, 
 YARN-3318.36.patch, YARN-3318.39.patch, YARN-3318.45.patch


 Create the initial framework required for using OrderingPolicies





[jira] [Updated] (YARN-3318) Create Initial OrderingPolicy Framework

2015-04-03 Thread Craig Welch (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Welch updated YARN-3318:
--
Attachment: YARN-3318.45.patch

 Create Initial OrderingPolicy Framework
 ---

 Key: YARN-3318
 URL: https://issues.apache.org/jira/browse/YARN-3318
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: scheduler
Reporter: Craig Welch
Assignee: Craig Welch
 Attachments: YARN-3318.13.patch, YARN-3318.14.patch, 
 YARN-3318.17.patch, YARN-3318.34.patch, YARN-3318.35.patch, 
 YARN-3318.36.patch, YARN-3318.39.patch, YARN-3318.45.patch


 Create the initial framework required for using OrderingPolicies





[jira] [Updated] (YARN-3319) Implement a FairOrderingPolicy

2015-04-03 Thread Craig Welch (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Welch updated YARN-3319:
--
Attachment: YARN-3319.45.patch

 Implement a FairOrderingPolicy
 --

 Key: YARN-3319
 URL: https://issues.apache.org/jira/browse/YARN-3319
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: scheduler
Reporter: Craig Welch
Assignee: Craig Welch
 Attachments: YARN-3319.13.patch, YARN-3319.14.patch, 
 YARN-3319.17.patch, YARN-3319.35.patch, YARN-3319.39.patch, YARN-3319.45.patch


 Implement a FairOrderingPolicy which prefers to allocate to 
 SchedulerProcesses with least current usage, very similar to the 
 FairScheduler's FairSharePolicy.  
 The Policy will offer allocations to applications in a queue in order of 
 least resources used, and preempt applications in reverse order (from most 
 resources used). This will include conditional support for sizeBasedWeight 
 style adjustment
 Optionally, based on a conditional configuration to enable sizeBasedWeight 
 (default false), an adjustment to boost larger applications (to offset the 
 natural preference for smaller applications) will adjust the resource usage 
 value based on demand, dividing it by the below value:
 Math.log1p(app memory demand) / Math.log(2);
 In cases where the above is indeterminate (two applications are equal after 
 this comparison), behavior falls back to comparison based on the application 
 id, which is generally lexically FIFO for that comparison





[jira] [Updated] (YARN-3319) Implement a FairOrderingPolicy

2015-04-03 Thread Craig Welch (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Welch updated YARN-3319:
--
Description: 
Implement a FairOrderingPolicy which prefers to allocate to SchedulerProcesses 
with least current usage, very similar to the FairScheduler's FairSharePolicy.  

The Policy will offer allocations to applications in a queue in order of least 
resources used, and preempt applications in reverse order (from most resources 
used). This will include conditional support for sizeBasedWeight style 
adjustment

Optionally, based on a conditional configuration to enable sizeBasedWeight 
(default false), an adjustment to boost larger applications (to offset the 
natural preference for smaller applications) will adjust the resource usage 
value based on demand, dividing it by the below value:

Math.log1p(app memory demand) / Math.log(2);

In cases where the above is indeterminate (two applications are equal after 
this comparison), behavior falls back to comparison based on the application 
id, which is generally lexically FIFO for that comparison
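The sizeBasedWeight adjustment above can be sketched directly (the class and method names below are illustrative; only the divisor formula comes from the description): dividing usage by log2(1 + demand) shrinks the effective usage of large-demand applications, offsetting the policy's natural preference for small ones.

```java
// Illustrative sketch of the sizeBasedWeight adjustment described above.
final class SizeBasedWeight {
  // Effective usage value used for ordering when sizeBasedWeight is enabled:
  // raw usage divided by log2(1 + memory demand).
  static double adjustedUsage(double resourceUsage, double memoryDemand) {
    double weight = Math.log1p(memoryDemand) / Math.log(2); // log2(1 + demand)
    return resourceUsage / weight;
  }
}
```

With equal raw usage, the application with the larger memory demand ends up with the smaller adjusted value, so it sorts earlier for allocation.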



  was:
Implement a Fair Comparator for the Scheduler Comparator Ordering Policy which 
prefers to allocate to SchedulerProcesses with least current usage, very 
similar to the FairScheduler's FairSharePolicy.  

The Policy will offer allocations to applications in a queue in order of least 
resources used, and preempt applications in reverse order (from most resources 
used). This will include conditional support for sizeBasedWeight style 
adjustment

An implementation of a Scheduler Comparator for use with the Scheduler 
Comparator Ordering Policy will be built with the below comparison for ordering 
applications for container assignment (ascending) and for preemption 
(descending)

Current resource usage - less usage is lesser
Submission time - earlier is lesser

Optionally, based on a conditional configuration to enable sizeBasedWeight 
(default false), an adjustment to boost larger applications (to offset the 
natural preference for smaller applications) will adjust the resource usage 
value based on demand, dividing it by the below value:

Math.log1p(app memory demand) / Math.log(2);

In cases where the above is indeterminate (two applications are equal after 
this comparison), behavior falls back to comparison based on the application 
name, which is lexically FIFO for that comparison (first submitted is lesser)




 Implement a FairOrderingPolicy
 --

 Key: YARN-3319
 URL: https://issues.apache.org/jira/browse/YARN-3319
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: scheduler
Reporter: Craig Welch
Assignee: Craig Welch
 Attachments: YARN-3319.13.patch, YARN-3319.14.patch, 
 YARN-3319.17.patch, YARN-3319.35.patch, YARN-3319.39.patch


 Implement a FairOrderingPolicy which prefers to allocate to 
 SchedulerProcesses with least current usage, very similar to the 
 FairScheduler's FairSharePolicy.  
 The Policy will offer allocations to applications in a queue in order of 
 least resources used, and preempt applications in reverse order (from most 
 resources used). This will include conditional support for sizeBasedWeight 
 style adjustment
 Optionally, based on a conditional configuration to enable sizeBasedWeight 
 (default false), an adjustment to boost larger applications (to offset the 
 natural preference for smaller applications) will adjust the resource usage 
 value based on demand, dividing it by the below value:
 Math.log1p(app memory demand) / Math.log(2);
 In cases where the above is indeterminate (two applications are equal after 
 this comparison), behavior falls back to comparison based on the application 
 id, which is generally lexically FIFO for that comparison





[jira] [Updated] (YARN-3319) Implement a FairOrderingPolicy

2015-04-03 Thread Craig Welch (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Welch updated YARN-3319:
--
Attachment: YARN-3319.47.patch

 Implement a FairOrderingPolicy
 --

 Key: YARN-3319
 URL: https://issues.apache.org/jira/browse/YARN-3319
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: scheduler
Reporter: Craig Welch
Assignee: Craig Welch
 Attachments: YARN-3319.13.patch, YARN-3319.14.patch, 
 YARN-3319.17.patch, YARN-3319.35.patch, YARN-3319.39.patch, 
 YARN-3319.45.patch, YARN-3319.47.patch


 Implement a FairOrderingPolicy which prefers to allocate to 
 SchedulerProcesses with least current usage, very similar to the 
 FairScheduler's FairSharePolicy.  
 The Policy will offer allocations to applications in a queue in order of 
 least resources used, and preempt applications in reverse order (from most 
 resources used). This will include conditional support for sizeBasedWeight 
 style adjustment
 Optionally, based on a conditional configuration to enable sizeBasedWeight 
 (default false), an adjustment to boost larger applications (to offset the 
 natural preference for smaller applications) will adjust the resource usage 
 value based on demand, dividing it by the below value:
 Math.log1p(app memory demand) / Math.log(2);
 In cases where the above is indeterminate (two applications are equal after 
 this comparison), behavior falls back to comparison based on the application 
 id, which is generally lexically FIFO for that comparison





[jira] [Updated] (YARN-3318) Create Initial OrderingPolicy Framework and FifoOrderingPolicy

2015-04-03 Thread Craig Welch (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Welch updated YARN-3318:
--
Attachment: YARN-3318.47.patch

 Create Initial OrderingPolicy Framework and FifoOrderingPolicy
 --

 Key: YARN-3318
 URL: https://issues.apache.org/jira/browse/YARN-3318
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: scheduler
Reporter: Craig Welch
Assignee: Craig Welch
 Attachments: YARN-3318.13.patch, YARN-3318.14.patch, 
 YARN-3318.17.patch, YARN-3318.34.patch, YARN-3318.35.patch, 
 YARN-3318.36.patch, YARN-3318.39.patch, YARN-3318.45.patch, YARN-3318.47.patch


 Create the initial framework required for using OrderingPolicies and an 
 initial FifoOrderingPolicy





[jira] [Updated] (YARN-3318) Create Initial OrderingPolicy Framework and FifoOrderingPolicy

2015-04-03 Thread Craig Welch (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Welch updated YARN-3318:
--
Attachment: YARN-3318.48.patch

The javac error looks bogus; the existing error has simply moved.
The findbugs warning looks bogus; the class it's complaining about is static. 
Uploading a new version to see if it notices now.
TestFairScheduler passes on my box with the patch, and I can't see any way it 
would be affected.  Tests will rerun with the new patch, so we'll see.

 Create Initial OrderingPolicy Framework and FifoOrderingPolicy
 --

 Key: YARN-3318
 URL: https://issues.apache.org/jira/browse/YARN-3318
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: scheduler
Reporter: Craig Welch
Assignee: Craig Welch
 Attachments: YARN-3318.13.patch, YARN-3318.14.patch, 
 YARN-3318.17.patch, YARN-3318.34.patch, YARN-3318.35.patch, 
 YARN-3318.36.patch, YARN-3318.39.patch, YARN-3318.45.patch, 
 YARN-3318.47.patch, YARN-3318.48.patch


 Create the initial framework required for using OrderingPolicies and an 
 initial FifoOrderingPolicy





[jira] [Updated] (YARN-3319) Implement a Fair SchedulerOrderingPolicy

2015-04-02 Thread Craig Welch (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Welch updated YARN-3319:
--
Attachment: YARN-3319.39.patch

 Implement a Fair SchedulerOrderingPolicy
 

 Key: YARN-3319
 URL: https://issues.apache.org/jira/browse/YARN-3319
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: scheduler
Reporter: Craig Welch
Assignee: Craig Welch
 Attachments: YARN-3319.13.patch, YARN-3319.14.patch, 
 YARN-3319.17.patch, YARN-3319.35.patch, YARN-3319.39.patch


 Implement a Fair Comparator for the Scheduler Comparator Ordering Policy 
 which prefers to allocate to SchedulerProcesses with least current usage, 
 very similar to the FairScheduler's FairSharePolicy.  
 The Policy will offer allocations to applications in a queue in order of 
 least resources used, and preempt applications in reverse order (from most 
 resources used). This will include conditional support for sizeBasedWeight 
 style adjustment
 An implementation of a Scheduler Comparator for use with the Scheduler 
 Comparator Ordering Policy will be built with the below comparison for 
 ordering applications for container assignment (ascending) and for preemption 
 (descending)
 Current resource usage - less usage is lesser
 Submission time - earlier is lesser
 Optionally, based on a conditional configuration to enable sizeBasedWeight 
 (default false), an adjustment to boost larger applications (to offset the 
 natural preference for smaller applications) will adjust the resource usage 
 value based on demand, dividing it by the below value:
 Math.log1p(app memory demand) / Math.log(2);
 In cases where the above is indeterminate (two applications are equal after 
 this comparison), behavior falls back to comparison based on the application 
 name, which is lexically FIFO for that comparison (first submitted is lesser)





[jira] [Updated] (YARN-3318) Create Initial OrderingPolicy Framework, integrate with CapacityScheduler LeafQueue supporting present behavior

2015-04-02 Thread Craig Welch (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Welch updated YARN-3318:
--
Attachment: YARN-3318.39.patch

 Create Initial OrderingPolicy Framework, integrate with CapacityScheduler 
 LeafQueue supporting present behavior
 ---

 Key: YARN-3318
 URL: https://issues.apache.org/jira/browse/YARN-3318
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: scheduler
Reporter: Craig Welch
Assignee: Craig Welch
 Attachments: YARN-3318.13.patch, YARN-3318.14.patch, 
 YARN-3318.17.patch, YARN-3318.34.patch, YARN-3318.35.patch, 
 YARN-3318.36.patch, YARN-3318.39.patch


 Create the initial framework required for using OrderingPolicies with 
 SchedulerApplicationAttempts and integrate with the CapacityScheduler.   This 
 will include an implementation which is compatible with current FIFO behavior.





[jira] [Commented] (YARN-3318) Create Initial OrderingPolicy Framework, integrate with CapacityScheduler LeafQueue supporting present behavior

2015-04-02 Thread Craig Welch (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14392211#comment-14392211
 ] 

Craig Welch commented on YARN-3318:
---

[~leftnoteasy]  

SchedulerProcessEvents replaced with containerAllocated and containerReleased.
Serial and SerialEpoch replaced with compareInputOrderTo(), which is option 2 
for addressing it, as we settled on offline.
Added addSchedulerProcess/removeSchedulerProcess/addAllSchedulerProcesses.
Changed the configuration so that 
yarn.scheduler.capacity.root.default.ordering-policy=fair
will set up the fair configuration, fifo will set up fifo, fair+fifo will 
set up compound fair + fifo, etc.  It is possible to set up a custom ordering 
policy class using a different configuration, but the base one will handle the 
friendly setup.
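A minimal sketch of how such a "fair+fifo" value could be decomposed into an ordered list of policy names (the parser below is hypothetical - it is not the patch's actual configuration code):

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative parser for an ordering-policy value such as "fair+fifo":
// splits on '+' and returns the policy names in the order configured.
final class OrderingPolicyConfig {
  static List<String> parse(String value) {
    List<String> names = new ArrayList<String>();
    for (String part : value.split("\\+")) {
      String name = part.trim();
      if (!name.isEmpty()) {
        names.add(name);
      }
    }
    return names;
  }
}
```

The resulting order matters: the first-listed policy dominates, and later ones act as tie-breakers.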

[~vinodkv]
bq. It is not entirely clear how the ordering and limits work together - as one 
policy with multiple facets or multiple policy types
This should be modeled as different types of policies, so that each can focus 
on its particular purpose and we avoid a repetition of the intermingling which 
has made it difficult to mix, match, and share capabilities.  Having multiple 
policy types is essential to making it easy to combine them as needed.
bq. let's split the patch that exposes this to the client side / web UI and in 
the API records into its own JIRA...premature to support this as a publicly 
supportable configuration...
The goal is to make this available quickly but iteratively, keeping the 
changes small while making them available for use and feedback.  Clearly we can 
mark things unstable and communicate that they are not fully mature, are 
subject to change, and should be used gently, but we will need to make it 
possible to activate the feature and use it in order to get that use and 
feedback.  We should grow it organically, gradually, iteratively - think of 
this as a facet of the policy framework hooked up and available, but with more 
to follow.
bq. ...SchedulableEntity better...
Well, I'd actually talked [~leftnoteasy] into SchedulerProcess :-)  So we can 
chew on this a bit more and see where we go.
bq. You add/remove applications to/from LeafQueue's policy but addition/removal 
of containers is an event...
This has been factored differently along the lines of [~leftnoteasy]'s 
suggestion; it should now be consistent.
bq. The notion of a comparator doesn't make sense to an admin. It is simply a 
policy...
I have modeled the policy configuration differently, so the comparator is out 
of sight (see above).
bq.  Depending on how ordering and limits come together, they may become 
properties of a policy
I expect them to be distinct: this is specifically an ordering policy, and
limits will be handled by other types of limit policies.

patch with these changes to follow in a few...

 Create Initial OrderingPolicy Framework, integrate with CapacityScheduler 
 LeafQueue supporting present behavior
 ---

 Key: YARN-3318
 URL: https://issues.apache.org/jira/browse/YARN-3318
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: scheduler
Reporter: Craig Welch
Assignee: Craig Welch
 Attachments: YARN-3318.13.patch, YARN-3318.14.patch, 
 YARN-3318.17.patch, YARN-3318.34.patch, YARN-3318.35.patch, YARN-3318.36.patch


 Create the initial framework required for using OrderingPolicies with 
 SchedulerApplicationAttempts and integrate with the CapacityScheduler.   This 
 will include an implementation which is compatible with current FIFO behavior.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3293) Track and display capacity scheduler health metrics in web UI

2015-04-02 Thread Craig Welch (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14393194#comment-14393194
 ] 

Craig Welch commented on YARN-3293:
---

Hey [~vvasudev], it seems that the patch doesn't apply cleanly, can you update 
to latest trunk?

 Track and display capacity scheduler health metrics in web UI
 -

 Key: YARN-3293
 URL: https://issues.apache.org/jira/browse/YARN-3293
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: capacityscheduler
Reporter: Varun Vasudev
Assignee: Varun Vasudev
 Attachments: Screen Shot 2015-03-30 at 4.30.14 PM.png, 
 apache-yarn-3293.0.patch, apache-yarn-3293.1.patch, apache-yarn-3293.2.patch


 It would be good to display metrics that let users know about the health of 
 the capacity scheduler in the web UI. Today it is hard to get an idea if the 
 capacity scheduler is functioning correctly. Metrics such as the time for the 
 last allocation, etc.





[jira] [Commented] (YARN-3293) Track and display capacity scheduler health metrics in web UI

2015-04-02 Thread Craig Welch (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14393587#comment-14393587
 ] 

Craig Welch commented on YARN-3293:
---

   General - it looks like the counters could overflow and produce negative 
values.  Perhaps this could not happen in the lifetime of a typical cluster, 
but on a large, long-running cluster is it a possibility/concern?
   This presently looks to be capacity-scheduler only; I had a suggestion below 
to make it slightly more general, and [~vinodkv] also mentioned it is not 
specific to the scheduler.  Perhaps it's fine to go capacity-scheduler only for 
the first iteration, but I wanted to verify (perhaps we need a follow-on jira 
for the other schedulers).  
   
on the web page
  It's a nit, but I don't like the look of the / between the counter and the 
resource expression where that occurs - maybe use - instead of / for those 
(allocations/reservations/releases)?
  
TestSchedulerHealth
  can we import NodeManager and get rid of the package references in the code
CapacitySchedulerHealthInfo
  looks like there is no need to keep a reference to the CapacityScheduler 
instance after construction, can we drop it 
  from being a member then?
  looks like line changes in info log are just whitespace, can you drop 
them?
LeafQueue
  L884 looks to be just whitespace, can you revert?
CSAssignment
  I think that there should be a new class, sharable between schedulers, which 
incorporates all the new assignment info, and that it should be a member of 
CSAssignment instead of adding all of the details directly to CSAssignment.  
You would still pack the info into CSAssignment (as an instance of that type), 
but it would now take a form that can be shared across schedulers

 Track and display capacity scheduler health metrics in web UI
 -

 Key: YARN-3293
 URL: https://issues.apache.org/jira/browse/YARN-3293
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: capacityscheduler
Reporter: Varun Vasudev
Assignee: Varun Vasudev
 Attachments: Screen Shot 2015-03-30 at 4.30.14 PM.png, 
 apache-yarn-3293.0.patch, apache-yarn-3293.1.patch, apache-yarn-3293.2.patch


 It would be good to display metrics that let users know about the health of 
 the capacity scheduler in the web UI. Today it is hard to get an idea if the 
 capacity scheduler is functioning correctly. Metrics such as the time for the 
 last allocation, etc.





[jira] [Commented] (YARN-3318) Create Initial OrderingPolicy Framework, integrate with CapacityScheduler LeafQueue supporting present behavior

2015-03-31 Thread Craig Welch (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14388275#comment-14388275
 ] 

Craig Welch commented on YARN-3318:
---

Hi Wangda,
I have changed the patch a bit in the background without updating it on the 
jira.  The changes are not major, but I think they render some of the comments 
obsolete.  I've uploaded the up-to-date patch just now; before doing so I took 
a pass through your comments - below I'll respond to each in turn:

bq. 1) SchedulerProcess
bq. 1.1. ...name seems not very straightforward to me...
Well, I'm certainly open to other name options, but I do prefer 
SchedulerProcess to SchedulableEntity - it's common to refer to the items 
which a scheduler schedules as processes, which is what these are in this 
case, and that is why I chose the name.  Entity is really very generic and 
empty of meaning.  I do wish to avoid confusion with Schedulable (and I wasn't 
enamored of that name either...); I expect that as integration progresses 
there will be a period where Schedulable is an extension of SchedulerProcess 
with the (remaining) FairScheduler-specific bits (which will, I think, 
ultimately be incorporated in some way into SchedulerProcess, but that's down 
the line / should be addressed in a further iteration).  In any case, I'm not 
in favor of adding Entity - I think when you consider the terminology as 
explained above, SchedulerProcess works.  Try it on and see, and feel free to 
suggest other options...
bq. 1.2. ...SchedulerProcessEvent...asynchronized
Not all event handling must be asynchronous.  I believe the details regarding 
this were spelled out reasonably well in the interface definition - if you 
take a peek at how these events are handled in the capacity scheduler 
configuration, you will see that they are handled synchronously/safely, within 
protection against mutation of the SchedulerProcess collection.  My goal was 
precisely to avoid requiring implementers to add a new method implementation 
every time a new event comes into play which may not be of interest to them; 
this makes maintenance of implementations easier - they can handle the events 
they understand and apply appropriate default logic otherwise.  I think this 
is a classic case for an enumerated set of events handled by the interface, so 
I think it should be modeled as it is, as opposed to adding a new method to 
the interface itself for each new event type...
bq. 1.3 ...SerialEpoch
Yeah, I don't like the names much either; I've gone through several versions 
and come to the conclusion that it's not the choice of names that's the 
problem.  This is an attempt to hide the application ids while also exposing 
them, which is compound in nature, and it is made stranger by the fact that 
this is totally irrelevant for other potential future implementors (such as 
queues).  I want to factor it differently, not just change the names; these 
are the courses I'm considering:
  1.  Have SchedulerProcess implement Comparable and provide a natural 
ordering, which is fifo for apps.  This seems to privilege fifo but, as a 
matter of fact, it's the fallback for fair, so I'm not sure that's really an 
inappropriate thing to do - it seems like it is the natural ordering for 
apps.  Other things can provide their own natural ordering (queues - the 
hierarchy...), so it should extend reasonably well without the current 
awkwardness.  This would remove all of getSerialEpoch, getSerial, and 
getId in favor of just implementing compareTo from Comparable.  The 
downsides I see are the privilege, and that if an implementor of 
SchedulerProcess implemented Comparable in an unworkable fashion it would be 
an issue - not the case for what we are presently looking at supporting, afaik.
  2.  Have an explicit compareCreationOrder(SchedulerProcess other) method 
which returns 0/+/- like compareTo.  This is much like 1, but removes the 
privilege and the possible Comparable collision... this also does away with 
getSerialEpoch, getSerial, and getId in favor of the comparison method.
What do you think of these options?  Preference?
BTW, FS has an actual startTime for FSAppAttempts, but looking through it I 
don't like that approach - it doesn't appear to do the right thing in some 
cases (like RM failover or recovery), and it can still be ambiguous in others 
(simultaneous starts within timestamp-millisecond granularity), where there's 
a fallback to the app id.  So it doesn't really add anything - you still have 
to be able to fall back to the app id for those cases, so you can't get away 
from the issue, and it adds a bit of complexity to boot.
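As a rough sketch of option 2 above (type and field names here are hypothetical illustrations, not taken from the patch), the explicit creation-order comparison might look like:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of option 2: an explicit creation-order comparison on
// the SchedulerProcess interface, instead of a privileged Comparable.
interface SchedulerProcess {
  // Returns <0, 0, or >0, like Comparable#compareTo, by creation order.
  int compareCreationOrder(SchedulerProcess other);
}

class AppAttempt implements SchedulerProcess {
  final long clusterTimestamp; // the "epoch" discussed above
  final int id;                // increases monotonically within an epoch

  AppAttempt(long clusterTimestamp, int id) {
    this.clusterTimestamp = clusterTimestamp;
    this.id = id;
  }

  @Override
  public int compareCreationOrder(SchedulerProcess other) {
    AppAttempt o = (AppAttempt) other;
    int byEpoch = Long.compare(clusterTimestamp, o.clusterTimestamp);
    return byEpoch != 0 ? byEpoch : Integer.compare(id, o.id);
  }
}

public class CreationOrderDemo {
  public static void main(String[] args) {
    List<AppAttempt> apps = new ArrayList<>();
    apps.add(new AppAttempt(100L, 2));
    apps.add(new AppAttempt(100L, 1));
    apps.add(new AppAttempt(50L, 9));
    apps.sort(AppAttempt::compareCreationOrder);
    // Earlier epoch first, then lower id within an epoch
    for (AppAttempt a : apps) {
      System.out.println(a.clusterTimestamp + "/" + a.id);
    }
  }
}
```

This avoids the Comparable collision: an implementor remains free to define its own natural ordering for other purposes.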
bq. 1.4 ...currentConsumption is not enough to make choice, 
demand(pending-resource)/used and priority/weight are basic fields of a 
Schedulable, do you think so...
Of those, only demand is required for the initial step of supporting 
application-level fairness when sizeBasedWeight is active; the others are only 

[jira] [Updated] (YARN-3318) Create Initial OrderingPolicy Framework, integrate with CapacityScheduler LeafQueue supporting present behavior

2015-03-31 Thread Craig Welch (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Welch updated YARN-3318:
--
Attachment: YARN-3318.35.patch

 Create Initial OrderingPolicy Framework, integrate with CapacityScheduler 
 LeafQueue supporting present behavior
 ---

 Key: YARN-3318
 URL: https://issues.apache.org/jira/browse/YARN-3318
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: scheduler
Reporter: Craig Welch
Assignee: Craig Welch
 Attachments: YARN-3318.13.patch, YARN-3318.14.patch, 
 YARN-3318.17.patch, YARN-3318.34.patch, YARN-3318.35.patch


 Create the initial framework required for using OrderingPolicies with 
 SchedulerApplicationAttempts and integrate with the CapacityScheduler.   This 
 will include an implementation which is compatible with current FIFO behavior.





[jira] [Updated] (YARN-3319) Implement a Fair SchedulerOrderingPolicy

2015-03-31 Thread Craig Welch (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Welch updated YARN-3319:
--
Attachment: YARN-3319.35.patch

Apply after applying YARN-3318.35.patch

 Implement a Fair SchedulerOrderingPolicy
 

 Key: YARN-3319
 URL: https://issues.apache.org/jira/browse/YARN-3319
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: scheduler
Reporter: Craig Welch
Assignee: Craig Welch
 Attachments: YARN-3319.13.patch, YARN-3319.14.patch, 
 YARN-3319.17.patch, YARN-3319.35.patch


 Implement a Fair Comparator for the Scheduler Comparator Ordering Policy 
 which prefers to allocate to SchedulerProcesses with least current usage, 
 very similar to the FairScheduler's FairSharePolicy.  
 The Policy will offer allocations to applications in a queue in order of 
 least resources used, and preempt applications in reverse order (from most 
 resources used). This will include conditional support for sizeBasedWeight 
 style adjustment
 An implementation of a Scheduler Comparator for use with the Scheduler 
 Comparator Ordering Policy will be built with the below comparison for 
 ordering applications for container assignment (ascending) and for preemption 
 (descending)
 Current resource usage - less usage is lesser
 Submission time - earlier is lesser
 Optionally, based on a conditional configuration to enable sizeBasedWeight 
 (default false), an adjustment to boost larger applications (to offset the 
 natural preference for smaller applications) will adjust the resource usage 
 value based on demand, dividing it by the below value:
 Math.log1p(app memory demand) / Math.log(2);
 In cases where the above is indeterminate (two applications are equal after 
 this comparison), behavior falls back to comparison based on the application 
 name, which is lexically FIFO for that comparison (first submitted is lesser)





[jira] [Updated] (YARN-3318) Create Initial OrderingPolicy Framework, integrate with CapacityScheduler LeafQueue supporting present behavior

2015-03-31 Thread Craig Welch (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Welch updated YARN-3318:
--
Attachment: YARN-3318.36.patch

Fixes for release audit warnings

 Create Initial OrderingPolicy Framework, integrate with CapacityScheduler 
 LeafQueue supporting present behavior
 ---

 Key: YARN-3318
 URL: https://issues.apache.org/jira/browse/YARN-3318
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: scheduler
Reporter: Craig Welch
Assignee: Craig Welch
 Attachments: YARN-3318.13.patch, YARN-3318.14.patch, 
 YARN-3318.17.patch, YARN-3318.34.patch, YARN-3318.35.patch, YARN-3318.36.patch


 Create the initial framework required for using OrderingPolicies with 
 SchedulerApplicationAttempts and integrate with the CapacityScheduler.   This 
 will include an implementation which is compatible with current FIFO behavior.





[jira] [Commented] (YARN-3318) Create Initial OrderingPolicy Framework, integrate with CapacityScheduler LeafQueue supporting present behavior

2015-03-31 Thread Craig Welch (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14389563#comment-14389563
 ] 

Craig Welch commented on YARN-3318:
---

The remaining javac error doesn't appear to be related to my changes, which is 
confusing.  The next patch will include a change to try to address it anyway.  
TestRM passes on my box, so I assume it's a transient issue.

 Create Initial OrderingPolicy Framework, integrate with CapacityScheduler 
 LeafQueue supporting present behavior
 ---

 Key: YARN-3318
 URL: https://issues.apache.org/jira/browse/YARN-3318
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: scheduler
Reporter: Craig Welch
Assignee: Craig Welch
 Attachments: YARN-3318.13.patch, YARN-3318.14.patch, 
 YARN-3318.17.patch, YARN-3318.34.patch, YARN-3318.35.patch, YARN-3318.36.patch


 Create the initial framework required for using OrderingPolicies with 
 SchedulerApplicationAttempts and integrate with the CapacityScheduler.   This 
 will include an implementation which is compatible with current FIFO behavior.





[jira] [Commented] (YARN-3318) Create Initial OrderingPolicy Framework, integrate with CapacityScheduler LeafQueue supporting present behavior

2015-03-31 Thread Craig Welch (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14389583#comment-14389583
 ] 

Craig Welch commented on YARN-3318:
---

BTW, we can't just do a lexical sort on the string version of the application 
id - one problem with using a lexical compare on the app id is that the format 
for the id component is a minimum of 4 digits, which means that going from 
9999 to 10000 will result in an incorrect lexical sort wrt the actual order
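The padding problem can be demonstrated directly (the ids below are made-up examples in the application_&lt;clusterTimestamp&gt;_&lt;sequence&gt; shape, not real ones):

```java
public class LexicalSortDemo {
  public static void main(String[] args) {
    // Sequence numbers are zero-padded to a minimum of 4 digits, so once a
    // cluster passes app 9999 the id string grows to 5 digits and lexical
    // order diverges from numeric (actual submission) order.
    String earlier = "application_1425000000000_9999";
    String later   = "application_1425000000000_10000";
    // Lexically, "9999" sorts AFTER "10000" (because '9' > '1'),
    // which is the wrong order:
    System.out.println(earlier.compareTo(later) > 0); // prints "true"
  }
}
```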

 Create Initial OrderingPolicy Framework, integrate with CapacityScheduler 
 LeafQueue supporting present behavior
 ---

 Key: YARN-3318
 URL: https://issues.apache.org/jira/browse/YARN-3318
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: scheduler
Reporter: Craig Welch
Assignee: Craig Welch
 Attachments: YARN-3318.13.patch, YARN-3318.14.patch, 
 YARN-3318.17.patch, YARN-3318.34.patch, YARN-3318.35.patch, YARN-3318.36.patch


 Create the initial framework required for using OrderingPolicies with 
 SchedulerApplicationAttempts and integrate with the CapacityScheduler.   This 
 will include an implementation which is compatible with current FIFO behavior.





[jira] [Updated] (YARN-3318) Create Initial OrderingPolicy Framework, integrate with CapacityScheduler LeafQueue supporting present behavior

2015-03-30 Thread Craig Welch (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Welch updated YARN-3318:
--
Attachment: YARN-3318.34.patch

 Create Initial OrderingPolicy Framework, integrate with CapacityScheduler 
 LeafQueue supporting present behavior
 ---

 Key: YARN-3318
 URL: https://issues.apache.org/jira/browse/YARN-3318
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: scheduler
Reporter: Craig Welch
Assignee: Craig Welch
 Attachments: YARN-3318.13.patch, YARN-3318.14.patch, 
 YARN-3318.17.patch, YARN-3318.34.patch


 Create the initial framework required for using OrderingPolicies with 
 SchedulerApplicationAttempts and integrate with the CapacityScheduler.   This 
 will include an implementation which is compatible with current FIFO behavior.





[jira] [Updated] (YARN-3319) Implement a Fair SchedulerOrderingPolicy

2015-03-18 Thread Craig Welch (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Welch updated YARN-3319:
--
Description: 
Implement a Fair Comparator for the Scheduler Comparator Ordering Policy which 
prefers to allocate to SchedulerProcesses with least current usage, very 
similar to the FairScheduler's FairSharePolicy.  

The Policy will offer allocations to applications in a queue in order of least 
resources used, and preempt applications in reverse order (from most resources 
used). This will include conditional support for sizeBasedWeight style 
adjustment

An implementation of a Scheduler Comparator for use with the Scheduler 
Comparator Ordering Policy will be built with the below comparison for ordering 
applications for container assignment (ascending) and for preemption 
(descending)

Current resource usage - less usage is lesser
Submission time - earlier is lesser

Optionally, based on a conditional configuration to enable sizeBasedWeight 
(default false), an adjustment to boost larger applications (to offset the 
natural preference for smaller applications) will adjust the resource usage 
value based on demand, dividing it by the below value:

Math.log1p(app memory demand) / Math.log(2);

In cases where the above is indeterminate (two applications are equal after 
this comparison), behavior falls back to comparison based on the application 
name, which is lexically FIFO for that comparison (first submitted is lesser)
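The ordering described above can be sketched as a comparator (a minimal illustration using memory-only usage; the class and field names are hypothetical, not the patch's API):

```java
import java.util.Comparator;

// Illustrative sketch of the fair comparison described above.
class App {
  final long usedMB;     // current resource usage (memory only, for brevity)
  final long demandMB;   // pending memory demand
  final long submitTime; // submission time
  final String name;     // lexical FIFO fallback

  App(long usedMB, long demandMB, long submitTime, String name) {
    this.usedMB = usedMB;
    this.demandMB = demandMB;
    this.submitTime = submitTime;
    this.name = name;
  }

  // With sizeBasedWeight enabled, boost larger apps by shrinking their
  // effective usage: usage / (log1p(demand) / log(2)).
  double effectiveUsage(boolean sizeBasedWeight) {
    if (!sizeBasedWeight || demandMB == 0) {
      return usedMB;
    }
    return usedMB / (Math.log1p(demandMB) / Math.log(2));
  }
}

class FairComparator implements Comparator<App> {
  private final boolean sizeBasedWeight;

  FairComparator(boolean sizeBasedWeight) {
    this.sizeBasedWeight = sizeBasedWeight;
  }

  @Override
  public int compare(App a, App b) {
    int c = Double.compare(a.effectiveUsage(sizeBasedWeight),
                           b.effectiveUsage(sizeBasedWeight));
    if (c != 0) return c;                         // less usage is lesser
    c = Long.compare(a.submitTime, b.submitTime); // earlier is lesser
    if (c != 0) return c;
    return a.name.compareTo(b.name);              // lexical FIFO fallback
  }
}

public class FairOrderingDemo {
  public static void main(String[] args) {
    FairComparator cmp = new FairComparator(true);
    App big = new App(100, 1 << 20, 1, "app-a");  // large pending demand
    App small = new App(100, 10, 1, "app-b");     // small pending demand
    // Equal raw usage, but the size-based weight favors the larger app:
    System.out.println(cmp.compare(big, small) < 0); // prints "true"
  }
}
```

Sorting ascending with this comparator gives assignment order; preemption iterates it in reverse.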



  was:Implement a Fair SchedulerOrderingPolicy which prefers to allocate to 
SchedulerProcesses with least current usage, very similar to the 
FairScheduler's FairSharePolicy.  


 Implement a Fair SchedulerOrderingPolicy
 

 Key: YARN-3319
 URL: https://issues.apache.org/jira/browse/YARN-3319
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: scheduler
Reporter: Craig Welch
Assignee: Craig Welch
 Attachments: YARN-3319.13.patch, YARN-3319.14.patch, 
 YARN-3319.17.patch


 Implement a Fair Comparator for the Scheduler Comparator Ordering Policy 
 which prefers to allocate to SchedulerProcesses with least current usage, 
 very similar to the FairScheduler's FairSharePolicy.  
 The Policy will offer allocations to applications in a queue in order of 
 least resources used, and preempt applications in reverse order (from most 
 resources used). This will include conditional support for sizeBasedWeight 
 style adjustment
 An implementation of a Scheduler Comparator for use with the Scheduler 
 Comparator Ordering Policy will be built with the below comparison for 
 ordering applications for container assignment (ascending) and for preemption 
 (descending)
 Current resource usage - less usage is lesser
 Submission time - earlier is lesser
 Optionally, based on a conditional configuration to enable sizeBasedWeight 
 (default false), an adjustment to boost larger applications (to offset the 
 natural preference for smaller applications) will adjust the resource usage 
 value based on demand, dividing it by the below value:
 Math.log1p(app memory demand) / Math.log(2);
 In cases where the above is indeterminate (two applications are equal after 
 this comparison), behavior falls back to comparison based on the application 
 name, which is lexically FIFO for that comparison (first submitted is lesser)





[jira] [Updated] (YARN-3319) Implement a Fair SchedulerOrderingPolicy

2015-03-13 Thread Craig Welch (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Welch updated YARN-3319:
--
Attachment: YARN-3319.17.patch

With support for configuration via the scheduler's config file

 Implement a Fair SchedulerOrderingPolicy
 

 Key: YARN-3319
 URL: https://issues.apache.org/jira/browse/YARN-3319
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: scheduler
Reporter: Craig Welch
Assignee: Craig Welch
 Attachments: YARN-3319.13.patch, YARN-3319.14.patch, 
 YARN-3319.17.patch


 Implement a Fair SchedulerOrderingPolicy which prefers to allocate to 
 SchedulerProcesses with least current usage, very similar to the 
 FairScheduler's FairSharePolicy.  





[jira] [Updated] (YARN-3318) Create Initial OrderingPolicy Framework, integrate with CapacityScheduler LeafQueue supporting present behavior

2015-03-13 Thread Craig Welch (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Welch updated YARN-3318:
--
Attachment: YARN-3318.17.patch

With support for configuration via the scheduler's config file

 Create Initial OrderingPolicy Framework, integrate with CapacityScheduler 
 LeafQueue supporting present behavior
 ---

 Key: YARN-3318
 URL: https://issues.apache.org/jira/browse/YARN-3318
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: scheduler
Reporter: Craig Welch
Assignee: Craig Welch
 Attachments: YARN-3318.13.patch, YARN-3318.14.patch, 
 YARN-3318.17.patch


 Create the initial framework required for using OrderingPolicies with 
 SchedulerApplicationAttempts and integrate with the CapacityScheduler.   This 
 will include an implementation which is compatible with current FIFO behavior.





[jira] [Commented] (YARN-3306) [Umbrella] Proposing per-queue Policy driven scheduling in YARN

2015-03-13 Thread Craig Welch (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14360652#comment-14360652
 ] 

Craig Welch commented on YARN-3306:
---

Thanks for your thoughts, [~kasha]

The immediate proposal is to begin adding new functionality in a fashion which 
can be easily shared across scheduler implementations and mixed together in a 
single cluster.  The first case is to support container assignment and 
preemption orderings in addition to fifo for applications in the capacity 
scheduler, and potentially the fair scheduler using the same code; this is 
expected to expand to cover queue relationships and potentially other 
behaviors (limits, etc.) over time.

The hope is that this allows us to iterate toward a state where the various 
behaviors of the schedulers can be mixed, matched, and shared across 
implementations rather than having to try and accomplish this all in one go, 
and allows us to achieve the benefit of mixing and matching some of the 
features earlier/along the way.

I suspect that at some point we'll hit a critical mass where enough of the 
functionality has been extracted into sharable components, and where we've 
established an understanding of how these can be made to compose well; then 
we'll take that as an inflection point and go down the path you are 
suggesting - introduce a new scheduler to house the policies, complete the 
picture that way, and deprecate the others.  That's by no means the only 
possible conclusion, but it seems a good and/or likely one.

 [Umbrella] Proposing per-queue Policy driven scheduling in YARN
 ---

 Key: YARN-3306
 URL: https://issues.apache.org/jira/browse/YARN-3306
 Project: Hadoop YARN
  Issue Type: Bug
  Components: scheduler
Reporter: Vinod Kumar Vavilapalli
Assignee: Vinod Kumar Vavilapalli
 Attachments: PerQueuePolicydrivenschedulinginYARN.pdf


 Scheduling layout in Apache Hadoop YARN today is very coarse grained. This 
 proposal aims at converting today's rigid scheduling in YARN to a per-queue 
 policy driven architecture.
 We propose the creation of a common policy framework and implement a common 
 set of policies that administrators can pick and choose per queue
  - Make scheduling policies configurable per queue
  - Initially, we limit ourselves to a new type of scheduling policy that 
 determines the ordering of applications within the leaf queue
  - In the near future, we will also pursue parent-queue level policies and 
 potential algorithm reuse through a separate type of policies that control 
 resource limits per queue, user, application etc.





[jira] [Commented] (YARN-2495) Allow admin specify labels from each NM (Distributed configuration)

2015-03-09 Thread Craig Welch (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14353397#comment-14353397
 ] 

Craig Welch commented on YARN-2495:
---

-re

bq. How about we simplify things? Instead of accepting labels on both 
registration and heartbeat, why not restrict it to be just during registration?

As I understand the requirements, it's necessary to handle the case where the 
derived set of labels changes during the lifetime of the nodemanager - e.g., 
external libraries might be installed or some other condition may change which 
affects the labels.  No nodemanager re-registration is involved, and yet the 
changed labels need to be reflected

 Allow admin specify labels from each NM (Distributed configuration)
 ---

 Key: YARN-2495
 URL: https://issues.apache.org/jira/browse/YARN-2495
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Wangda Tan
Assignee: Naganarasimha G R
 Attachments: YARN-2495.20141023-1.patch, YARN-2495.20141024-1.patch, 
 YARN-2495.20141030-1.patch, YARN-2495.20141031-1.patch, 
 YARN-2495.20141119-1.patch, YARN-2495.20141126-1.patch, 
 YARN-2495.20141204-1.patch, YARN-2495.20141208-1.patch, 
 YARN-2495.20150305-1.patch, YARN-2495.20150309-1.patch, 
 YARN-2495_20141022.1.patch


 Target of this JIRA is to allow admin specify labels in each NM, this covers
 - User can set labels in each NM (by setting yarn-site.xml (YARN-2923) or 
 using script suggested by [~aw] (YARN-2729) )
 - NM will send labels to RM via ResourceTracker API
 - RM will set labels in NodeLabelManager when NM register/update labels





[jira] [Created] (YARN-3318) Create Initial OrderingPolicy Framework, integrate with CapacityScheduler LeafQueue supporting present behavior

2015-03-09 Thread Craig Welch (JIRA)
Craig Welch created YARN-3318:
-

 Summary: Create Initial OrderingPolicy Framework, integrate with 
CapacityScheduler LeafQueue supporting present behavior
 Key: YARN-3318
 URL: https://issues.apache.org/jira/browse/YARN-3318
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: scheduler
Reporter: Craig Welch


Create the initial framework required for using OrderingPolicies with 
SchedulerApplicationAttempts and integrate with the CapacityScheduler.   This 
will include an implementation which is compatible with current FIFO behavior.





[jira] [Assigned] (YARN-3318) Create Initial OrderingPolicy Framework, integrate with CapacityScheduler LeafQueue supporting present behavior

2015-03-09 Thread Craig Welch (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Welch reassigned YARN-3318:
-

Assignee: Craig Welch

 Create Initial OrderingPolicy Framework, integrate with CapacityScheduler 
 LeafQueue supporting present behavior
 ---

 Key: YARN-3318
 URL: https://issues.apache.org/jira/browse/YARN-3318
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: scheduler
Reporter: Craig Welch
Assignee: Craig Welch

 Create the initial framework required for using OrderingPolicies with 
 SchedulerApplicationAttempts and integrate with the CapacityScheduler.   This 
 will include an implementation which is compatible with current FIFO behavior.





[jira] [Created] (YARN-3319) Implement a Fair SchedulerOrderingPolicy

2015-03-09 Thread Craig Welch (JIRA)
Craig Welch created YARN-3319:
-

 Summary: Implement a Fair SchedulerOrderingPolicy
 Key: YARN-3319
 URL: https://issues.apache.org/jira/browse/YARN-3319
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: scheduler
Reporter: Craig Welch
Assignee: Craig Welch


Implement a Fair SchedulerOrderingPolicy which prefers to allocate to 
SchedulerProcesses with least current usage, very similar to the 
FairScheduler's FairSharePolicy.  





[jira] [Updated] (YARN-3318) Create Initial OrderingPolicy Framework, integrate with CapacityScheduler LeafQueue supporting present behavior

2015-03-09 Thread Craig Welch (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Welch updated YARN-3318:
--
Attachment: YARN-3318.13.patch


Initial, incomplete patch with the overall framework and implementation of the 
SchedulerComparatorPolicy and FifoComparator; the major TODO is integrating 
with the capacity scheduler configuration.  Also includes a CompoundComparator 
for chaining comparator-based policies where desired.

 Create Initial OrderingPolicy Framework, integrate with CapacityScheduler 
 LeafQueue supporting present behavior
 ---

 Key: YARN-3318
 URL: https://issues.apache.org/jira/browse/YARN-3318
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: scheduler
Reporter: Craig Welch
Assignee: Craig Welch
 Attachments: YARN-3318.13.patch


 Create the initial framework required for using OrderingPolicies with 
 SchedulerApplicationAttempts and integrate with the CapacityScheduler.   This 
 will include an implementation which is compatible with current FIFO behavior.





[jira] [Commented] (YARN-3318) Create Initial OrderingPolicy Framework, integrate with CapacityScheduler LeafQueue supporting present behavior

2015-03-09 Thread Craig Welch (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353965#comment-14353965
 ] 

Craig Welch commented on YARN-3318:
---


The proposed initial implementation of the framework to support FIFO 
SchedulerApplicationAttempt ordering for the CapacityScheduler:

A SchedulerComparatorPolicy which implements OrderingPolicy above.  This 
implementation will take care of the common logic required for cases where the 
policy can be effectively implemented as a comparator (which is expected to be 
the case for several potential policies, including FIFO).  

A SchedulerComparator, which is used by the SchedulerComparatorPolicy above.  
This is an extension of the Java Comparator interface with additional logic 
required by the SchedulerComparatorPolicy: initially, a method to accept 
SchedulerProcessEvents and indicate whether they require re-ordering of the 
associated SchedulerProcess.
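As a hedged sketch of the shape described above (all names here are illustrative, not the committed YARN API), the comparator extension might look like:

```java
import java.util.Comparator;

// Illustrative sketch only: a process ordered by the policy, an event type,
// and a Comparator extended with the re-ordering hook described above.
interface SchedulerProcess {
    long getStartTime(); // ordering key used by the FIFO comparator
}

enum SchedulerProcessEvent { CONTAINER_ALLOCATED, CONTAINER_COMPLETED }

// Extension of Comparator with the extra hook the policy needs: does this
// event require re-ordering of the associated process?
interface SchedulerComparator extends Comparator<SchedulerProcess> {
    boolean reorderOnEvent(SchedulerProcess process, SchedulerProcessEvent event);
}

class FifoComparator implements SchedulerComparator {
    @Override
    public int compare(SchedulerProcess a, SchedulerProcess b) {
        return Long.compare(a.getStartTime(), b.getStartTime());
    }

    @Override
    public boolean reorderOnEvent(SchedulerProcess p, SchedulerProcessEvent e) {
        return false; // a start time never changes, so FIFO order is stable
    }
}
```

For FIFO no event forces a re-order, which is why the hook returns false; a usage-based comparator would return true after allocation or completion events.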

 Create Initial OrderingPolicy Framework, integrate with CapacityScheduler 
 LeafQueue supporting present behavior
 ---

 Key: YARN-3318
 URL: https://issues.apache.org/jira/browse/YARN-3318
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: scheduler
Reporter: Craig Welch
Assignee: Craig Welch

 Create the initial framework required for using OrderingPolicies with 
 SchedulerApplicationAttempts and integrate with the CapacityScheduler.  This 
 will include an implementation which is compatible with current FIFO behavior.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3318) Create Initial OrderingPolicy Framework, integrate with CapacityScheduler LeafQueue supporting present behavior

2015-03-09 Thread Craig Welch (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353953#comment-14353953
 ] 

Craig Welch commented on YARN-3318:
---


Proposed elements of the framework:

A SchedulerProcess interface which generalizes processes to be managed by the 
OrderingPolicy (and potentially, in the future, by other Policies as well).  
The initial implementer will be the SchedulerApplicationAttempt. 

An OrderingPolicy interface which exposes a collection of scheduler processes 
which will be ordered by the policy for container assignment and preemption.  
The ordering policy will provide one Iterator which presents processes in the 
policy specific order for container assignment and another Iterator which 
presents them in the proper order for preemption.  It will also accept 
SchedulerProcessEvents which may indicate a need to re-order the associated 
SchedulerProcess (for example, after container completion, preemption, or 
assignment).
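A minimal sketch of that interface shape, with illustrative names (not the committed API): a comparator-ordered collection exposing one iterator for container assignment and one, from the opposite end of the ordering, for preemption.

```java
import java.util.Comparator;
import java.util.Iterator;
import java.util.TreeSet;

// Hedged sketch of the OrderingPolicy described above; names and the use of
// TreeSet are assumptions for the example, not the actual implementation.
class ComparatorOrderingPolicy<P> {
    private final TreeSet<P> processes;

    ComparatorOrderingPolicy(Comparator<P> comparator) {
        this.processes = new TreeSet<>(comparator);
    }

    void addSchedulableEntity(P process) {
        processes.add(process);
    }

    // most-deserving process first, for container assignment
    Iterator<P> getAssignmentIterator() {
        return processes.iterator();
    }

    // least-deserving process first, for preemption
    Iterator<P> getPreemptionIterator() {
        return processes.descendingIterator();
    }
}
```

Keeping both iterators over the same comparator-ordered set is what lets one policy answer both the assignment and preemption questions consistently.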



 Create Initial OrderingPolicy Framework, integrate with CapacityScheduler 
 LeafQueue supporting present behavior
 ---

 Key: YARN-3318
 URL: https://issues.apache.org/jira/browse/YARN-3318
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: scheduler
Reporter: Craig Welch

 Create the initial framework required for using OrderingPolicies with 
 SchedulerApplicationAttempts and integrate with the CapacityScheduler.  This 
 will include an implementation which is compatible with current FIFO behavior.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3319) Implement a Fair SchedulerOrderingPolicy

2015-03-09 Thread Craig Welch (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14354006#comment-14354006
 ] 

Craig Welch commented on YARN-3319:
---

Initially this will be implemented for SchedulerApplicationAttempts in the 
CapacityScheduler LeafQueue (similar to the FIFO implementation in 
[YARN-3318]).  The expectation is that this will implement the 
SchedulerComparator interface and will be used as a comparator within the 
SchedulerComparatorPolicy implementation to achieve the intended behavior.
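A hedged sketch of the fair comparison described here: the process with the least current usage orders first. The UsageSnapshot type and its fields are assumptions for the example, not YARN classes.

```java
import java.util.Comparator;

// Illustrative stand-in for a SchedulerProcess's current resource usage.
class UsageSnapshot {
    final String id;
    final long usedMemoryMB;

    UsageSnapshot(String id, long usedMemoryMB) {
        this.id = id;
        this.usedMemoryMB = usedMemoryMB;
    }
}

// Prefers the process with the least current usage, similar in spirit to
// the FairScheduler's FairSharePolicy.
class FairComparator implements Comparator<UsageSnapshot> {
    @Override
    public int compare(UsageSnapshot a, UsageSnapshot b) {
        int byUsage = Long.compare(a.usedMemoryMB, b.usedMemoryMB);
        // break ties deterministically so the ordering stays total
        return byUsage != 0 ? byUsage : a.id.compareTo(b.id);
    }
}
```

Unlike FIFO, this ordering changes as usage changes, so the comparator's policy would need to re-order processes after allocation and completion events.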

 Implement a Fair SchedulerOrderingPolicy
 

 Key: YARN-3319
 URL: https://issues.apache.org/jira/browse/YARN-3319
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: scheduler
Reporter: Craig Welch
Assignee: Craig Welch

 Implement a Fair SchedulerOrderingPolicy which prefers to allocate to 
 SchedulerProcesses with least current usage, very similar to the 
 FairScheduler's FairSharePolicy.  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3319) Implement a Fair SchedulerOrderingPolicy

2015-03-09 Thread Craig Welch (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Welch updated YARN-3319:
--
Attachment: YARN-3319.13.patch

Attaching an initial/incomplete patch; it depends on the [YARN-3318] patch of 
the same index and contains just the additional logic specific to fairness.  
Major TODO: sizeBasedWeight.

 Implement a Fair SchedulerOrderingPolicy
 

 Key: YARN-3319
 URL: https://issues.apache.org/jira/browse/YARN-3319
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: scheduler
Reporter: Craig Welch
Assignee: Craig Welch
 Attachments: YARN-3319.13.patch


 Implement a Fair SchedulerOrderingPolicy which prefers to allocate to 
 SchedulerProcesses with least current usage, very similar to the 
 FairScheduler's FairSharePolicy.  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3320) Support a Priority OrderingPolicy

2015-03-09 Thread Craig Welch (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14354035#comment-14354035
 ] 

Craig Welch commented on YARN-3320:
---

The initial intent is to bring the appropriate parts of the implementation of 
ApplicationPriorities from [YARN-2004] into the OrderingPolicy framework as a 
SchedulerComparator which can be composed with Fair and Fifo comparators to 
achieve Fair and Fifo behavior WITHIN priority bands.
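The composition described above can be sketched as a chain of comparators tried in order: priority first, then FIFO to break ties within a priority band. App, its fields, and this CompoundComparator are illustrative assumptions, not the committed classes.

```java
import java.util.Comparator;
import java.util.List;

// Illustrative application handle; assume a higher priority value means
// more important (an assumption for this example only).
class App {
    final int priority;
    final long startTime;

    App(int priority, long startTime) {
        this.priority = priority;
        this.startTime = startTime;
    }
}

// Tries each comparator in turn; the first one that differentiates wins.
class CompoundComparator implements Comparator<App> {
    private final List<Comparator<App>> chain;

    CompoundComparator(List<Comparator<App>> chain) {
        this.chain = chain;
    }

    @Override
    public int compare(App a, App b) {
        for (Comparator<App> c : chain) {
            int result = c.compare(a, b);
            if (result != 0) {
                return result;
            }
        }
        return 0;
    }
}
```

Composed as, say, `new CompoundComparator(Arrays.asList(byPriorityDescending, fifo))`, priority dominates and FIFO only applies between apps of equal priority.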

 Support a Priority OrderingPolicy
 -

 Key: YARN-3320
 URL: https://issues.apache.org/jira/browse/YARN-3320
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: scheduler
Reporter: Craig Welch
Assignee: Craig Welch

 When [YARN-2004] is complete, bring relevant logic into the OrderingPolicy 
 framework



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3318) Create Initial OrderingPolicy Framework, integrate with CapacityScheduler LeafQueue supporting present behavior

2015-03-09 Thread Craig Welch (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Welch updated YARN-3318:
--
Attachment: YARN-3318.14.patch

Same as .13 except it should be possible to apply with [YARN-3319] 's .14 patch

 Create Initial OrderingPolicy Framework, integrate with CapacityScheduler 
 LeafQueue supporting present behavior
 ---

 Key: YARN-3318
 URL: https://issues.apache.org/jira/browse/YARN-3318
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: scheduler
Reporter: Craig Welch
Assignee: Craig Welch
 Attachments: YARN-3318.13.patch, YARN-3318.14.patch


 Create the initial framework required for using OrderingPolicies with 
 SchedulerApplicationAttempts and integrate with the CapacityScheduler.  This 
 will include an implementation which is compatible with current FIFO behavior.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3320) Support a Priority OrderingPolicy

2015-03-09 Thread Craig Welch (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Welch updated YARN-3320:
--
Summary: Support a Priority OrderingPolicy  (was: Support a Priority 
SchedulerOrderingPolicy composible with Fair and Fifo ordering)

 Support a Priority OrderingPolicy
 -

 Key: YARN-3320
 URL: https://issues.apache.org/jira/browse/YARN-3320
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: scheduler
Reporter: Craig Welch
Assignee: Craig Welch

 When [YARN-2004] is complete, bring relevant logic into the OrderingPolicy 
 framework



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-3320) Support a Priority SchedulerOrderingPolicy composible with Fair and Fifo ordering

2015-03-09 Thread Craig Welch (JIRA)
Craig Welch created YARN-3320:
-

 Summary: Support a Priority SchedulerOrderingPolicy composible 
with Fair and Fifo ordering
 Key: YARN-3320
 URL: https://issues.apache.org/jira/browse/YARN-3320
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: scheduler
Reporter: Craig Welch
Assignee: Craig Welch


When [YARN-2004] is complete, bring relevant logic into the OrderingPolicy 
framework



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3319) Implement a Fair SchedulerOrderingPolicy

2015-03-09 Thread Craig Welch (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Welch updated YARN-3319:
--
Attachment: YARN-3319.14.patch

Same as .13, except it should be possible to apply this patch after applying 
[YARN-3318] 's .14 patch

 Implement a Fair SchedulerOrderingPolicy
 

 Key: YARN-3319
 URL: https://issues.apache.org/jira/browse/YARN-3319
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: scheduler
Reporter: Craig Welch
Assignee: Craig Welch
 Attachments: YARN-3319.13.patch, YARN-3319.14.patch


 Implement a Fair SchedulerOrderingPolicy which prefers to allocate to 
 SchedulerProcesses with least current usage, very similar to the 
 FairScheduler's FairSharePolicy.  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3251) CapacityScheduler deadlock when computing absolute max avail capacity (short term fix for 2.6.1)

2015-02-26 Thread Craig Welch (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14338848#comment-14338848
 ] 

Craig Welch commented on YARN-3251:
---

Sorry if that wasn't clear; to reduce risk I removed the minor changes in 
CSQueueUtils.

 CapacityScheduler deadlock when computing absolute max avail capacity (short 
 term fix for 2.6.1)
 

 Key: YARN-3251
 URL: https://issues.apache.org/jira/browse/YARN-3251
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: Jason Lowe
Assignee: Craig Welch
Priority: Blocker
 Attachments: YARN-3251.1.patch, YARN-3251.2-6-0.2.patch, 
 YARN-3251.2-6-0.3.patch


 The ResourceManager can deadlock in the CapacityScheduler when computing the 
 absolute max available capacity for user limits and headroom.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3251) CapacityScheduler deadlock when computing absolute max avail capacity (short term fix for 2.6.1)

2015-02-26 Thread Craig Welch (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Welch updated YARN-3251:
--
Attachment: YARN-3251.2-6-0.3.patch

Removing the CSQueueUtils changes

 CapacityScheduler deadlock when computing absolute max avail capacity (short 
 term fix for 2.6.1)
 

 Key: YARN-3251
 URL: https://issues.apache.org/jira/browse/YARN-3251
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: Jason Lowe
Assignee: Craig Welch
Priority: Blocker
 Attachments: YARN-3251.1.patch, YARN-3251.2-6-0.2.patch, 
 YARN-3251.2-6-0.3.patch


 The ResourceManager can deadlock in the CapacityScheduler when computing the 
 absolute max available capacity for user limits and headroom.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3251) CapacityScheduler deadlock when computing absolute max avail capacity (short term fix for 2.6.1)

2015-02-26 Thread Craig Welch (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Welch updated YARN-3251:
--
Attachment: YARN-3251.2-6-0.4.patch

Minor: switch to the "Internal" naming, which seems to be more common in the codebase

 CapacityScheduler deadlock when computing absolute max avail capacity (short 
 term fix for 2.6.1)
 

 Key: YARN-3251
 URL: https://issues.apache.org/jira/browse/YARN-3251
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: Jason Lowe
Assignee: Craig Welch
Priority: Blocker
 Attachments: YARN-3251.1.patch, YARN-3251.2-6-0.2.patch, 
 YARN-3251.2-6-0.3.patch, YARN-3251.2-6-0.4.patch


 The ResourceManager can deadlock in the CapacityScheduler when computing the 
 absolute max available capacity for user limits and headroom.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3251) CapacityScheduler deadlock when computing absolute max avail capacity (short term fix for 2.6.1)

2015-02-26 Thread Craig Welch (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Welch updated YARN-3251:
--
Attachment: YARN-3251.2.patch

Attaching an analogue of the most recent patch against trunk.  I do not believe 
that we will be committing this at this point as [~leftnoteasy] is working on a 
more significant change which will remove the need for it, but I wanted to make 
it available just in case.  For clarity, patch against trunk is 
YARN-3251.2.patch and the patch to commit against 2.6 is 
YARN-3251.2-6-0.4.patch.

 CapacityScheduler deadlock when computing absolute max avail capacity (short 
 term fix for 2.6.1)
 

 Key: YARN-3251
 URL: https://issues.apache.org/jira/browse/YARN-3251
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: Jason Lowe
Assignee: Craig Welch
Priority: Blocker
 Attachments: YARN-3251.1.patch, YARN-3251.2-6-0.2.patch, 
 YARN-3251.2-6-0.3.patch, YARN-3251.2-6-0.4.patch, YARN-3251.2.patch


 The ResourceManager can deadlock in the CapacityScheduler when computing the 
 absolute max available capacity for user limits and headroom.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3251) CapacityScheduler deadlock when computing absolute max avail capacity (short term fix for 2.6.1)

2015-02-25 Thread Craig Welch (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Welch updated YARN-3251:
--
Attachment: YARN-3251.2-6-0.2.patch

Patch against branch-2.6.0

 CapacityScheduler deadlock when computing absolute max avail capacity (short 
 term fix for 2.6.1)
 

 Key: YARN-3251
 URL: https://issues.apache.org/jira/browse/YARN-3251
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: Jason Lowe
Assignee: Craig Welch
Priority: Blocker
 Attachments: YARN-3251.1.patch, YARN-3251.2-6-0.2.patch


 The ResourceManager can deadlock in the CapacityScheduler when computing the 
 absolute max available capacity for user limits and headroom.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3251) CapacityScheduler deadlock when computing absolute max avail capacity (short term fix for 2.6.1)

2015-02-25 Thread Craig Welch (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14337962#comment-14337962
 ] 

Craig Welch commented on YARN-3251:
---

bq. 1) Since the target of your patch is to make a quick fix for old version, 
it's better to create a patch in branch-2.6
done
bq. And patch I'm working on now will remove the 
CSQueueUtils.computeMaxAvailResource, so it's no need to add a intermediate fix 
in branch-2.
I suppose that depends on whether anyone needs a trunk version of the patch 
before the other changes land - if someone asks for it, I can quickly 
update the original patch to provide one
bq. 2) I think CSQueueUtils.getAbsoluteMaxAvailCapacity doesn't hold 
child/parent's lock together, maybe we don't need to change that, could you 
confirm?
It doesn't.  The change there was to ensure consistency for multiple values 
used from the queue; previously the access occurred inside a lock, which 
guaranteed that, and now it doesn't.  However, there's no need to lock on the 
parent, so I removed that 
bq. 3) Maybe we don't need getter/setter of absoluteMaxAvailCapacity in queue, 
a volatile float is enough?
Yes, that should be safe, done
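As a hedged sketch of the change agreed on in (3), with illustrative names (not the actual LeafQueue code): a single float published between threads as a volatile field instead of a lock-protected getter/setter.

```java
// Illustrative sketch only: volatile guarantees readers always see the
// latest written value without taking the queue lock. This is safe here
// because each write is a single independent assignment, not a
// read-modify-write sequence.
class LeafQueueSketch {
    private volatile float absoluteMaxAvailCapacity = 0f;

    void updateAbsoluteMaxAvailCapacity(float value) {
        absoluteMaxAvailCapacity = value;
    }

    float readAbsoluteMaxAvailCapacity() {
        return absoluteMaxAvailCapacity;
    }
}
```

The volatile-field approach only works because the value is written whole and never incremented in place; compound updates would still need a lock or an atomic.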


 CapacityScheduler deadlock when computing absolute max avail capacity (short 
 term fix for 2.6.1)
 

 Key: YARN-3251
 URL: https://issues.apache.org/jira/browse/YARN-3251
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: Jason Lowe
Assignee: Craig Welch
Priority: Blocker
 Attachments: YARN-3251.1.patch, YARN-3251.2-6-0.2.patch


 The ResourceManager can deadlock in the CapacityScheduler when computing the 
 absolute max available capacity for user limits and headroom.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3251) CapacityScheduler deadlock when computing absolute max avail capacity

2015-02-24 Thread Craig Welch (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14335765#comment-14335765
 ] 

Craig Welch commented on YARN-3251:
---

It looks like this can occur when a call which walks down the queue tree (in 
this case, getQueueInfo()) happens at the same time as an assignContainers call 
which does not start from the root queue, which is specifically one for a 
reservedContainer where scheduleAsynchronously is false.  Essentially, it isn't 
safe to hold a lock on a queue while locking on a parent queue (as I now see 
noted in other methods in LeafQueue :/).

[YARN-3243] is potentially a long term fix, but it would be nice to fix this 
right away as it clearly is already problematic.  Also, [YARN-3243] depends on 
a number of other sizable changes which have gone in recently, meaning it will 
be difficult to apply it as a fix to older codebases, for which it would be 
very nice to have a fix.

I've attached a patch somewhat along the lines suggested by [~sunilg]; it simply 
moves the acquisition of the absoluteMaxAvailCapacity outside the lock on the 
leaf queue - it will lock parent queues individually as it ascends, but it 
never holds a parent and child lock simultaneously, which is the unacceptable 
state.  It follows the pattern for other methods in LeafQueue like 
recoverContainer which access parent queues - they all are careful to make sure 
the parent queue access occurs outside any lock on themselves.  

Unfortunately it's not possible to just do this in root.assignContainers 
because of the reservedContainer case which will not invoke assignContainers on 
the root queue at any point.  Instead, absoluteMaxAvailCapacity is determined 
outside any lock on the leaf queue in assignContainers before entering the 
synchronized method which continues the logic as it is today.  

This looks to me to be the way to fix the issue with the smallest code change 
today pending other changes coming down the line.
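A minimal sketch of the lock-ordering pattern described above, with illustrative names (not the actual LeafQueue code): the parent-derived value is computed before entering the leaf queue's synchronized section, so a child lock is never held while ascending to parents.

```java
// Illustrative only. The unsafe ordering is: take the leaf's monitor, then
// ask parents for a value (which locks them) - that can deadlock against a
// thread walking the tree top-down. The fix computes the value first.
class QueueSketch {
    private final Object parentLock = new Object();

    private float computeCapacityFromParents() {
        // parent locks are taken alone here, never nested under a child lock
        synchronized (parentLock) {
            return 0.5f; // stand-in for walking up the queue tree
        }
    }

    float assignContainers() {
        // computed BEFORE taking this queue's monitor
        float absoluteMaxAvail = computeCapacityFromParents();
        return assignContainersInternal(absoluteMaxAvail);
    }

    private synchronized float assignContainersInternal(float absoluteMaxAvail) {
        return absoluteMaxAvail; // scheduling logic runs under the leaf lock
    }
}
```

Because every thread now acquires parent locks only while holding no child lock, the circular wait that caused the deadlock cannot form.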

 CapacityScheduler deadlock when computing absolute max avail capacity
 -

 Key: YARN-3251
 URL: https://issues.apache.org/jira/browse/YARN-3251
 Project: Hadoop YARN
  Issue Type: Sub-task
Affects Versions: 2.6.0
Reporter: Jason Lowe
Assignee: Wangda Tan
Priority: Blocker

 The ResourceManager can deadlock in the CapacityScheduler when computing the 
 absolute max available capacity for user limits and headroom.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3251) CapacityScheduler deadlock when computing absolute max avail capacity

2015-02-24 Thread Craig Welch (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Welch updated YARN-3251:
--
Attachment: YARN-3251.1.patch

 CapacityScheduler deadlock when computing absolute max avail capacity
 -

 Key: YARN-3251
 URL: https://issues.apache.org/jira/browse/YARN-3251
 Project: Hadoop YARN
  Issue Type: Sub-task
Affects Versions: 2.6.0
Reporter: Jason Lowe
Assignee: Wangda Tan
Priority: Blocker
 Attachments: YARN-3251.1.patch


 The ResourceManager can deadlock in the CapacityScheduler when computing the 
 absolute max available capacity for user limits and headroom.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2495) Allow admin specify labels from each NM (Distributed configuration)

2015-02-19 Thread Craig Welch (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14327716#comment-14327716
 ] 

Craig Welch commented on YARN-2495:
---

So, here's my proposal, [~Naganarasimha] [~leftnoteasy]: take a minute and 
consider whether DECENTRALIZED_CONFIGURATION_ENABLED is more likely to cause 
difficulty than prevent it, as I'm suggesting, and then you can decide to keep 
it or not as you wish - I don't want to hold up the way forward over something 
which is, on the whole, a detail...

 Allow admin specify labels from each NM (Distributed configuration)
 ---

 Key: YARN-2495
 URL: https://issues.apache.org/jira/browse/YARN-2495
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Wangda Tan
Assignee: Naganarasimha G R
 Attachments: YARN-2495.20141023-1.patch, YARN-2495.20141024-1.patch, 
 YARN-2495.20141030-1.patch, YARN-2495.20141031-1.patch, 
 YARN-2495.20141119-1.patch, YARN-2495.20141126-1.patch, 
 YARN-2495.20141204-1.patch, YARN-2495.20141208-1.patch, 
 YARN-2495_20141022.1.patch


 Target of this JIRA is to allow admin specify labels in each NM, this covers
 - User can set labels in each NM (by setting yarn-site.xml (YARN-2923) or 
 using script suggested by [~aw] (YARN-2729) )
 - NM will send labels to RM via ResourceTracker API
 - RM will set labels in NodeLabelManager when NM register/update labels



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2495) Allow admin specify labels from each NM (Distributed configuration)

2015-02-16 Thread Craig Welch (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14323076#comment-14323076
 ] 

Craig Welch commented on YARN-2495:
---

My point is that everything necessary to manage labels properly exists without 
DECENTRALIZED_CONFIGURATION_ENABLED; it is a duplication of existing 
functionality.  The user controls this by:

1. choosing to specify or not specify a way of managing the nodes at the node 
manager
2. choosing to set or not set node labels and associations using the 
centralized apis

ergo, DECENTRALIZED_CONFIGURATION_ENABLED is completely redundant; it provides 
no capabilities not already present.  Users will need to understand how the 
feature works to use it effectively anyway, and there is no value added by 
requiring that they repeat themselves (both by specifying a way of determining 
node labels at the node manager level and by having to set this switch).  My 
prediction is that, if the switch is present, its chief function will be to 
confuse and annoy users when they set up a configuration for the node managers 
to generate node labels and then the labels don't appear in the cluster as they 
expect.

 Allow admin specify labels from each NM (Distributed configuration)
 ---

 Key: YARN-2495
 URL: https://issues.apache.org/jira/browse/YARN-2495
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Wangda Tan
Assignee: Naganarasimha G R
 Attachments: YARN-2495.20141023-1.patch, YARN-2495.20141024-1.patch, 
 YARN-2495.20141030-1.patch, YARN-2495.20141031-1.patch, 
 YARN-2495.20141119-1.patch, YARN-2495.20141126-1.patch, 
 YARN-2495.20141204-1.patch, YARN-2495.20141208-1.patch, 
 YARN-2495_20141022.1.patch


 Target of this JIRA is to allow admin specify labels in each NM, this covers
 - User can set labels in each NM (by setting yarn-site.xml (YARN-2923) or 
 using script suggested by [~aw] (YARN-2729) )
 - NM will send labels to RM via ResourceTracker API
 - RM will set labels in NodeLabelManager when NM register/update labels



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1039) Add parameter for YARN resource requests to indicate long lived

2015-01-29 Thread Craig Welch (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14297783#comment-14297783
 ] 

Craig Welch commented on YARN-1039:
---

[~chris.douglas]

bq.  YARN shouldn't understand the lifecycle for a service or the 
progress/dependencies for task containers

That's not necessarily so; there are some cases where the type of lifecycle 
for an application is important - for example, when determining whether it 
is open-ended (service) or a batch process which entails a notion of 
progress (session), at least for purposes of display.

I think we need to rescope and clarify this jira a bit so that we can make 
progress - there are a number of items in the original problem statement and 
subsequent comments which have been taken on elsewhere and so really no longer 
make sense to pursue here.  Here's an attempt at a breakdown:

bq. This could be used by a scheduler that would know not to host the service 
on a transient (cloud: spot priced) node

I think this is now clearly covered by [YARN-796]; nodes having qualities 
(including operational qualities such as these) is one of the core purposes of 
that work.  It makes no sense to duplicate it here, and so it should be 
de-scoped from this jira.

bq. Schedulers could also decide whether or not to allocate multiple long-lived 
containers on the same node

As [~ste...@apache.org] mentioned in an earlier comment 
[https://issues.apache.org/jira/browse/YARN-1039?focusedCommentId=14038041&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14038041]
 affinity / anti-affinity is covered in a more general sense in [YARN-1042].  
The above component of this jira is really just such a case, and so it should 
be covered with that general solution and dropped from scope as well.  There 
may be some interest in informing that solution based on a generalized 
service setting, but to really understand that, the affinity approach needs 
to be worked out first - and I think the affinity approach will need to 
inform/integrate with this rather than the other way around, so integration 
should be approached as part of that effort.

That leaves nothing, so we can close the jira ;-)  Not quite, there were 
several things added in comments:

Token management - handled in [YARN-941]

Scheduler hints not related to node categories or anti-affinity (opportunistic 
scheduling, etc.) - these do strike me as something better handled via the 
duration route et al. [YARN-2877] [YARN-1051], and not something which needs to 
be replicated here

I think that really just leaves the progress bar (and potentially other 
display-related items).  This is covered by [YARN-1079].  I suggest, then, 
that we either rescope this jira to providing the lifecycle information as an 
application tag 
[https://issues.apache.org/jira/browse/YARN-1039?focusedCommentId=14039679&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14039679]
 as suggested by [~zjshen] early on, or close it and cover the work as part of 
[YARN-1079].  I originally objected to that approach on the basis that tags 
appeared to be a display-type feature which did not fit this effort, but if 
rescoped as I'm proposing, it becomes such a feature, and I think that 
approach is now a good fit.  

Thoughts?


 Add parameter for YARN resource requests to indicate long lived
 -

 Key: YARN-1039
 URL: https://issues.apache.org/jira/browse/YARN-1039
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Affects Versions: 3.0.0, 2.1.1-beta
Reporter: Steve Loughran
Assignee: Craig Welch
 Attachments: YARN-1039.1.patch, YARN-1039.2.patch, YARN-1039.3.patch


 A container request could support a new parameter long-lived. This could be 
 used by a scheduler that would know not to host the service on a transient 
 (cloud: spot priced) node.
 Schedulers could also decide whether or not to allocate multiple long-lived 
 containers on the same node



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1039) Add parameter for YARN resource requests to indicate long lived

2015-01-27 Thread Craig Welch (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14294557#comment-14294557
 ] 

Craig Welch commented on YARN-1039:
---

[~chris.douglas] what's the proper duration for a service which does not have a 
pre-defined lifetime?  

This distinction is not really about how long it will run but more about 
what the lifecycle of the app is - as [~ste...@apache.org] points out, is it 
session or batch oriented (something which has a defined set of work, so it has 
a notion of progress to completion), or is it a running process with an 
indeterminate/unknown lifetime which handles whatever work is sent its way (a 
service)?  This is really the distinction needed here - it's a qualitative 
difference regarding a lifecycle, and the notion of an enumeration of lifecycle 
types makes sense for this.  Users will often have no idea how long their 
application will run, but they will generally have a clear notion of its 
lifecycle.
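Purely as an illustration of the kind of lifecycle enumeration argued for above (names are hypothetical, not a proposed API):

```java
// Classifies applications by lifecycle rather than expected duration.
enum ApplicationLifecycle {
    BATCH,   // finite set of work; progress toward completion is meaningful
    SERVICE  // indeterminate lifetime; handles work as it arrives
}
```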

 Add parameter for YARN resource requests to indicate long lived
 -

 Key: YARN-1039
 URL: https://issues.apache.org/jira/browse/YARN-1039
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Affects Versions: 3.0.0, 2.1.1-beta
Reporter: Steve Loughran
Assignee: Craig Welch
 Attachments: YARN-1039.1.patch, YARN-1039.2.patch, YARN-1039.3.patch


 A container request could support a new parameter long-lived. This could be 
 used by a scheduler that would know not to host the service on a transient 
 (cloud: spot priced) node.
 Schedulers could also decide whether or not to allocate multiple long-lived 
 containers on the same node



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2495) Allow admin specify labels from each NM (Distributed configuration)

2015-01-27 Thread Craig Welch (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14294441#comment-14294441
 ] 

Craig Welch commented on YARN-2495:
---

[~Naganarasimha] I understand the desire to have the feature, it does seem more 
like a convenience / simplification measure than an introduction of something 
that can't be otherwise accomplished, but convenience and simplification can 
matter a great deal, so why not :-)

bq. There will be always confusion ... Do we need to And or OR 

What I'm getting at is that I think there are just too many switches and knobs 
in play once you consider the flag DECENTRALIZED_CONFIGURATION_ENABLED in 
addition to the other configuration relationships (defining the configuration 
script to do the update from nodes), I think that the act of configuring 
something to send node labels from the node manager is sufficient intent that 
it is the desired behavior, and the additional  
DECENTRALIZED_CONFIGURATION_ENABLED is just an extra ceiling for someone to 
bump their head against while setting this up.  wrt supporting add vs replace 
behavior, I think that as it's described now the idea is to just support 
replace from the node script, meaning that it will effectively be the only 
definition used when it is active (which is fine for many cases).  In the 
future, if there is a need for hybrid configuration of labels that can become 
an enhancement.  An option would be to use a different parameter for a script 
which will do add and remove instead of replace, and then say have it return + 
or - (for add and remove) with the label instead of a fixed set of labels for 
replacement.  From what I see above, the replacement approach, where the script 
determines the full label set, looks to be the immediate need - the other could 
be added in a compatible way later if it was needed.
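The two script-output styles contrasted above (replace vs. add/remove with +/- prefixes) can be sketched as follows. This is a minimal illustration of the idea, not actual YARN code; the class and method names are invented for the example.

```java
import java.util.HashSet;
import java.util.Set;

public class NodeLabelScriptOutput {

  // "Replace" style: the script's output *is* the node's full label set.
  public static Set<String> applyReplace(String scriptOutput) {
    Set<String> labels = new HashSet<>();
    for (String token : scriptOutput.trim().split("\\s+")) {
      if (!token.isEmpty()) {
        labels.add(token);
      }
    }
    return labels;
  }

  // "Add/remove" style: +label adds and -label removes, relative to the
  // node's current label set, instead of replacing it wholesale.
  public static Set<String> applyDeltas(Set<String> current, String scriptOutput) {
    Set<String> labels = new HashSet<>(current);
    for (String token : scriptOutput.trim().split("\\s+")) {
      if (token.startsWith("+")) {
        labels.add(token.substring(1));
      } else if (token.startsWith("-")) {
        labels.remove(token.substring(1));
      }
    }
    return labels;
  }
}
```

The replace form is the simpler contract (the script fully determines the label set); the delta form would allow hybrid central-plus-distributed configuration later.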

 Allow admin specify labels from each NM (Distributed configuration)
 ---

 Key: YARN-2495
 URL: https://issues.apache.org/jira/browse/YARN-2495
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Wangda Tan
Assignee: Naganarasimha G R
 Attachments: YARN-2495.20141023-1.patch, YARN-2495.20141024-1.patch, 
 YARN-2495.20141030-1.patch, YARN-2495.20141031-1.patch, 
 YARN-2495.20141119-1.patch, YARN-2495.20141126-1.patch, 
 YARN-2495.20141204-1.patch, YARN-2495.20141208-1.patch, 
 YARN-2495_20141022.1.patch


 Target of this JIRA is to allow admin specify labels in each NM, this covers
 - User can set labels in each NM (by setting yarn-site.xml (YARN-2923) or 
 using script suggested by [~aw] (YARN-2729) )
 - NM will send labels to RM via ResourceTracker API
 - RM will set labels in NodeLabelManager when NM register/update labels





[jira] [Commented] (YARN-1039) Add parameter for YARN resource requests to indicate long lived

2015-01-20 Thread Craig Welch (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14284325#comment-14284325
 ] 

Craig Welch commented on YARN-1039:
---

Another thought - if we do need this kind of flag, I think we should detach 
the notion from duration or long life as such - I think it's more about 
service vs batch - where a service's duration is not necessarily related to any 
preset notion of a work item it will start, work on, and complete - it will be 
started to handle work which is given to it, of unknown quantity ( potentially 
many different items) and stopped when no longer needed - it's not so much 
about the duration as the lifecycle (a batch operation may have a longer 
runtime than a service, for example).  So, I'd suggest dropping the temporal 
flavor and going with service vs batch, or something along those lines.
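The service-vs-batch distinction proposed above could be expressed as a simple enum rather than a duration. This is purely a hypothetical sketch of the suggestion, not an actual YARN API; the type name and its placement on a resource request are assumptions.

```java
// Hypothetical: lifecycle type attached to a container/resource request,
// replacing a temporal "long-lived" flag.
public enum ContainerLifecycleType {
  // Runs a bounded work item: starts, works on it, completes.
  BATCH,
  // Handles work handed to it for an unbounded period and is stopped when
  // no longer needed; duration is incidental, lifecycle is the distinction.
  SERVICE
}
```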

 Add parameter for YARN resource requests to indicate long lived
 -

 Key: YARN-1039
 URL: https://issues.apache.org/jira/browse/YARN-1039
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Affects Versions: 3.0.0, 2.1.1-beta
Reporter: Steve Loughran
Assignee: Craig Welch
 Attachments: YARN-1039.1.patch, YARN-1039.2.patch, YARN-1039.3.patch







[jira] [Commented] (YARN-1039) Add parameter for YARN resource requests to indicate long lived

2015-01-20 Thread Craig Welch (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14284304#comment-14284304
 ] 

Craig Welch commented on YARN-1039:
---

As I understand it (and, I may be wrong on this...) the original intent of this 
jira was to provide a boolean switch to control a set of behaviors expected 
to be important for a long running service - among other things, what sort of 
nodes to schedule on and how to handle logs.  This could be on a sliding scale 
based on duration, but I'm not sure that works so well - at what duration do we 
start to change how we handle logs and / or where we schedule things?  While 
related, I think that converting this from a boolean to a range will make it 
more difficult to use it for the intended usecase.  I also think that packing 
together all of these behaviors into one parameter might be a negative overall. 
 I do think, to [~john.jian.fang] 's point, as of now using this to determine 
where to schedule tasks to avoid spot instances and the like has really been 
superseded by Node Labels and I do not think we should add additional 
functionality for that here - Node Labels is really the way to handle that part 
of the usecase.  That leaves, potentially among other things, 
affinity/anti-affinity issues (not scheduling long running tasks 
together/scheduling them together) and log handling (how do we tell the system 
we want log handling for a long running service, if, in fact, the system needs 
to be told that).  I submit that it would be better to have separate solutions 
to each of these needs which can be bundled together to achieve the overall 
usecase, as I think that will provide better control without adding too much 
complexity for the end user.  Which means that we would break this out into 
affinity/anti-affinity and logging configuration.  We could always have a 
single parameter (like this one) which sets the others for convenience, I'm 
not sure we'll actually need it, but I do think that splitting out the bundled 
functionality into individual items (some of which may already be being worked 
on elsewhere) is the way to go.

 Add parameter for YARN resource requests to indicate long lived
 -

 Key: YARN-1039
 URL: https://issues.apache.org/jira/browse/YARN-1039
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Affects Versions: 3.0.0, 2.1.1-beta
Reporter: Steve Loughran
Assignee: Craig Welch
 Attachments: YARN-1039.1.patch, YARN-1039.2.patch, YARN-1039.3.patch







[jira] [Commented] (YARN-1680) availableResources sent to applicationMaster in heartbeat should exclude blacklistedNodes free memory.

2015-01-16 Thread Craig Welch (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14281058#comment-14281058
 ] 

Craig Welch commented on YARN-1680:
---

Thanks for the update, [~airbots], a couple thoughts:

I created [YARN-2848] in the hopes that it would help us to build a solution 
which could share functionality between various items with similar needs, so 
that the solution we come up with is built with that in mind.  That said, I 
think we will need to build the solutions independently, and there's no need to 
do them all at the same time.

-re Every time, App asks for blacklist addition, we check whether the nodes in 
addition are in cluster blacklist or not (O(m), m is the nodes in blacklist 
addition). If so, remove this node from addition. 

Unfortunately, I don't think that this can be solved with checks during 
addition and removal - I believe that we will need to keep a persistent picture 
of all blacklisted nodes for an application regardless of their cluster state 
because the two can vary independently and changes after a blacklist request 
may invalidate things (for example, cluster blacklists just before app 
blacklists, the app blacklist request is discarded, the cluster reinstates but 
the app still cannot use the node for reasons different from the nodes cluster 
availability - we will still include that node in headroom incorrectly...).  

I also think that, as suggested in [YARN-2848], the only approach I see working 
for all states is one where there is a last-change indicator of some sort 
active for the cluster in terms of its node composition which is held by the 
application and, when it has updated past the application's last calculation 
for app cluster resource (in this case, the one which omits blacklisted 
nodes), it re-evaluates state to determine a new app cluster resource which 
it then uses (until a reevaluation is required, again).  This should enable the 
application to have accurate headroom information regardless of the timing of 
changes and allows for the more complex evaluations which may be needed (rack 
blacklisting, etc) while minimizing the frequency of those evaluations.  I 
don't think it is necessarily required for blacklisting, but it's worth noting 
that this could include offloading some of the calculation to the application 
master (via more informational api's / library functions for calculation) to 
distribute the cost outward.  Again, not necessarily for this case, but I 
wanted to mention it as I think it is an option now or later on.
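The "last-change indicator" approach described above can be sketched as a version-stamped cache: the application remembers the cluster's node-set version at its last headroom calculation and only re-evaluates when the version has moved past it. All names here are illustrative, not actual YARN classes.

```java
import java.util.function.LongSupplier;

public class VersionedHeadroom {
  private long lastComputedVersion = -1;
  private long cachedAppClusterResource;
  private int recomputes = 0; // how often the expensive evaluation ran

  // clusterVersion is assumed to increment on any node add/remove/blacklist
  // change; recompute performs the full per-app cluster-resource evaluation
  // (e.g. omitting blacklisted nodes).
  public long getAppClusterResource(long clusterVersion, LongSupplier recompute) {
    if (clusterVersion != lastComputedVersion) {
      cachedAppClusterResource = recompute.getAsLong();
      lastComputedVersion = clusterVersion;
      recomputes++;
    }
    return cachedAppClusterResource;
  }

  public int getRecomputes() {
    return recomputes;
  }
}
```

This keeps headroom accurate regardless of the timing of cluster vs. app blacklist changes while paying the evaluation cost only when the composition actually changed.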

 availableResources sent to applicationMaster in heartbeat should exclude 
 blacklistedNodes free memory.
 --

 Key: YARN-1680
 URL: https://issues.apache.org/jira/browse/YARN-1680
 Project: Hadoop YARN
  Issue Type: Sub-task
Affects Versions: 2.2.0, 2.3.0
 Environment: SuSE 11 SP2 + Hadoop-2.3 
Reporter: Rohith
Assignee: Chen He
 Attachments: YARN-1680-WIP.patch, YARN-1680-v2.patch, 
 YARN-1680-v2.patch, YARN-1680.patch


 There are 4 NodeManagers with 8GB each. Total cluster capacity is 32GB. 
 Cluster slow start is set to 1.
 A job's reducer tasks occupy 29GB of the cluster. One NodeManager (NM-4) 
 becomes unstable (3 maps got killed), so the MRAppMaster blacklists the 
 unstable NodeManager (NM-4). All reducer tasks are now running in the cluster.
 The MRAppMaster does not preempt the reducers because, for the reducer 
 preemption calculation, headroom includes the blacklisted node's memory. This 
 makes jobs hang forever (the ResourceManager does not assign any new 
 containers on blacklisted nodes, but returns an availableResource that 
 considers the cluster's free memory). 





[jira] [Commented] (YARN-2637) maximum-am-resource-percent could be respected for both LeafQueue/User when trying to activate applications.

2015-01-13 Thread Craig Welch (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14275700#comment-14275700
 ] 

Craig Welch commented on YARN-2637:
---

Regarding the findbugs report for LeafQueue.lastClusterResource - access to 
lastClusterResource appears to be synchronized everywhere except 
getAbsActualCapacity, which I don't actually see being used anywhere - I'm 
going to add a findbugs exception and a comment on the method so that if it is 
used in the future synchronization can be addressed

-re [~leftnoteasy] 's latest:

-re 1 - actually, user limits are based on absolute queue capacity rather than 
max capacity - this is apparently intentional because, although a queue can 
exceed its absolute capacity, an individual user is not supposed to, hence my 
basing the user amlimit on the absolute capacity.  The approach I use fits with 
the original logic in CSQueueUtils which allows a user the greater of the 
userlimit share of the absolute capacity or 1/# active users (so if there are 
fewer users active than would reach the userlimit they can use the full queue 
absolute capacity), the only correction being that we are using the actual 
value of resources by application masters instead of one based on minalloc
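The rule described above (a user gets the greater of the user-limit share of the queue's absolute capacity or a 1/#activeUsers share, scaled by the AM resource percent) works out to a small arithmetic sketch. Method and parameter names are illustrative, not the actual CSQueueUtils code.

```java
public class UserAmLimit {
  // Per-user AM resource limit in MB, per the rule described in the comment:
  // max(userLimit share of absolute queue capacity, equal share among active
  // users) * maximum-am-resource-percent.
  public static long userAmLimitMb(long clusterMb, float absoluteCapacity,
      float userLimit, int activeUsers, float maxAmResourcePercent) {
    long queueMb = (long) (clusterMb * absoluteCapacity);
    long userShare = (long) (queueMb * userLimit);
    long equalShare = queueMb / Math.max(1, activeUsers);
    return (long) (Math.max(userShare, equalShare) * maxAmResourcePercent);
  }
}
```

With a 10GB cluster, a 50% absolute-capacity queue, a 25% user limit, two active users, and a 10% AM percent, the equal share (2560MB) exceeds the user-limit share (1280MB), so the per-user AM limit is 256MB.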

-re 2 - Actually, the snippet provided is not quite correct, some schedulers 
provide a cpu value as well.  In any case, for encapsulation reasons it's 
better to use the scheduler's value in case its means of determining this 
changes in the future. 

-re 3 - I can't see this making the slightest difference in understandability - 
since these tests' paths don't populate the rmapps I would simply be 
individually putting mocked ones into the map instead of the single mock + 
matcher for all the apps.  The way it is seems clearer to me as all of the 
mocking is together instead of distributing the (mock activity, if not mock 
framework...) process of putting mock rmapps into the collection throughout the 
test

-re 4 - interesting, those were already there, but I also couldn't see why.  
Test passes fine without them, so I removed them

-re 5 - removed

uploading updated patch in a few

 maximum-am-resource-percent could be respected for both LeafQueue/User when 
 trying to activate applications.
 

 Key: YARN-2637
 URL: https://issues.apache.org/jira/browse/YARN-2637
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.6.0
Reporter: Wangda Tan
Assignee: Craig Welch
Priority: Critical
 Attachments: YARN-2637.0.patch, YARN-2637.1.patch, 
 YARN-2637.12.patch, YARN-2637.13.patch, YARN-2637.15.patch, 
 YARN-2637.16.patch, YARN-2637.17.patch, YARN-2637.18.patch, 
 YARN-2637.19.patch, YARN-2637.2.patch, YARN-2637.20.patch, 
 YARN-2637.21.patch, YARN-2637.22.patch, YARN-2637.23.patch, 
 YARN-2637.25.patch, YARN-2637.26.patch, YARN-2637.27.patch, 
 YARN-2637.28.patch, YARN-2637.29.patch, YARN-2637.30.patch, 
 YARN-2637.31.patch, YARN-2637.32.patch, YARN-2637.36.patch, 
 YARN-2637.38.patch, YARN-2637.39.patch, YARN-2637.6.patch, YARN-2637.7.patch, 
 YARN-2637.9.patch


 Currently, number of AM in leaf queue will be calculated in following way:
 {code}
 max_am_resource = queue_max_capacity * maximum_am_resource_percent
 #max_am_number = max_am_resource / minimum_allocation
 #max_am_number_for_each_user = #max_am_number * userlimit * userlimit_factor
 {code}
 And when submit new application to RM, it will check if an app can be 
 activated in following way:
 {code}
 for (Iterator<FiCaSchedulerApp> i = pendingApplications.iterator();
      i.hasNext(); ) {
   FiCaSchedulerApp application = i.next();

   // Check queue limit
   if (getNumActiveApplications() >= getMaximumActiveApplications()) {
     break;
   }

   // Check user limit
   User user = getUser(application.getUser());
   if (user.getActiveApplications() <
       getMaximumActiveApplicationsPerUser()) {
     user.activateApplication();
     activeApplications.add(application);
     i.remove();
     LOG.info("Application " + application.getApplicationId() +
         " from user: " + application.getUser() +
         " activated in queue: " + getQueueName());
   }
 }
 {code}
 An example is,
 If a queue has capacity = 1G, max_am_resource_percent  = 0.2, the maximum 
 resource that AM can use is 200M, assuming minimum_allocation=1M, #am can be 
 launched is 200, and if user uses 5M for each AM (> minimum_allocation). All 
 apps can still be activated, and it will occupy all resource of a queue 
 instead of only a max_am_resource_percent of a queue.
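The example above can be worked through directly: deriving the AM count from minimum_allocation lets 200 AMs activate, but at 5M each they consume 1000M, five times the intended 200M cap. A minimal sketch of that arithmetic (illustrative names, not scheduler code):

```java
public class AmLimitExample {
  // MB actually consumed by AMs when the allowed count is derived from
  // minimum_allocation but each AM really uses actualAmSizeMb.
  public static int actualAmUsageMb(int queueCapacityMb,
      double maxAmResourcePercent, int minimumAllocationMb, int actualAmSizeMb) {
    int maxAmResourceMb = (int) (queueCapacityMb * maxAmResourcePercent);
    int maxAmNumber = maxAmResourceMb / minimumAllocationMb;
    return maxAmNumber * actualAmSizeMb;
  }

  public static void main(String[] args) {
    // 1G queue, 20% AM cap => 200M intended; 200 AMs at 5M each = 1000M used.
    System.out.println(actualAmUsageMb(1000, 0.2, 1, 5)); // prints 1000
  }
}
```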





[jira] [Updated] (YARN-2637) maximum-am-resource-percent could be respected for both LeafQueue/User when trying to activate applications.

2015-01-13 Thread Craig Welch (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Welch updated YARN-2637:
--
Attachment: YARN-2637.40.patch

 maximum-am-resource-percent could be respected for both LeafQueue/User when 
 trying to activate applications.
 

 Key: YARN-2637
 URL: https://issues.apache.org/jira/browse/YARN-2637
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.6.0
Reporter: Wangda Tan
Assignee: Craig Welch
Priority: Critical
 Attachments: YARN-2637.0.patch, YARN-2637.1.patch, 
 YARN-2637.12.patch, YARN-2637.13.patch, YARN-2637.15.patch, 
 YARN-2637.16.patch, YARN-2637.17.patch, YARN-2637.18.patch, 
 YARN-2637.19.patch, YARN-2637.2.patch, YARN-2637.20.patch, 
 YARN-2637.21.patch, YARN-2637.22.patch, YARN-2637.23.patch, 
 YARN-2637.25.patch, YARN-2637.26.patch, YARN-2637.27.patch, 
 YARN-2637.28.patch, YARN-2637.29.patch, YARN-2637.30.patch, 
 YARN-2637.31.patch, YARN-2637.32.patch, YARN-2637.36.patch, 
 YARN-2637.38.patch, YARN-2637.39.patch, YARN-2637.40.patch, 
 YARN-2637.6.patch, YARN-2637.7.patch, YARN-2637.9.patch







[jira] [Updated] (YARN-2637) maximum-am-resource-percent could be respected for both LeafQueue/User when trying to activate applications.

2015-01-09 Thread Craig Welch (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Welch updated YARN-2637:
--
Attachment: YARN-2637.39.patch

Now with web ui entries max am and max am user resource + application limit 
tests

 maximum-am-resource-percent could be respected for both LeafQueue/User when 
 trying to activate applications.
 

 Key: YARN-2637
 URL: https://issues.apache.org/jira/browse/YARN-2637
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.6.0
Reporter: Wangda Tan
Assignee: Craig Welch
Priority: Critical
 Attachments: YARN-2637.0.patch, YARN-2637.1.patch, 
 YARN-2637.12.patch, YARN-2637.13.patch, YARN-2637.15.patch, 
 YARN-2637.16.patch, YARN-2637.17.patch, YARN-2637.18.patch, 
 YARN-2637.19.patch, YARN-2637.2.patch, YARN-2637.20.patch, 
 YARN-2637.21.patch, YARN-2637.22.patch, YARN-2637.23.patch, 
 YARN-2637.25.patch, YARN-2637.26.patch, YARN-2637.27.patch, 
 YARN-2637.28.patch, YARN-2637.29.patch, YARN-2637.30.patch, 
 YARN-2637.31.patch, YARN-2637.32.patch, YARN-2637.36.patch, 
 YARN-2637.38.patch, YARN-2637.39.patch, YARN-2637.6.patch, YARN-2637.7.patch, 
 YARN-2637.9.patch







[jira] [Updated] (YARN-2637) maximum-am-resource-percent could be respected for both LeafQueue/User when trying to activate applications.

2015-01-08 Thread Craig Welch (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Welch updated YARN-2637:
--
Attachment: YARN-2637.36.patch

Should be down to one failing test, let's see

 maximum-am-resource-percent could be respected for both LeafQueue/User when 
 trying to activate applications.
 

 Key: YARN-2637
 URL: https://issues.apache.org/jira/browse/YARN-2637
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.6.0
Reporter: Wangda Tan
Assignee: Craig Welch
Priority: Critical
 Attachments: YARN-2637.0.patch, YARN-2637.1.patch, 
 YARN-2637.12.patch, YARN-2637.13.patch, YARN-2637.15.patch, 
 YARN-2637.16.patch, YARN-2637.17.patch, YARN-2637.18.patch, 
 YARN-2637.19.patch, YARN-2637.2.patch, YARN-2637.20.patch, 
 YARN-2637.21.patch, YARN-2637.22.patch, YARN-2637.23.patch, 
 YARN-2637.25.patch, YARN-2637.26.patch, YARN-2637.27.patch, 
 YARN-2637.28.patch, YARN-2637.29.patch, YARN-2637.30.patch, 
 YARN-2637.31.patch, YARN-2637.32.patch, YARN-2637.36.patch, 
 YARN-2637.6.patch, YARN-2637.7.patch, YARN-2637.9.patch







[jira] [Updated] (YARN-2637) maximum-am-resource-percent could be respected for both LeafQueue/User when trying to activate applications.

2015-01-07 Thread Craig Welch (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Welch updated YARN-2637:
--
Attachment: YARN-2637.32.patch

Check tests using absoluteCapacity for userAmLimit

 maximum-am-resource-percent could be respected for both LeafQueue/User when 
 trying to activate applications.
 

 Key: YARN-2637
 URL: https://issues.apache.org/jira/browse/YARN-2637
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.6.0
Reporter: Wangda Tan
Assignee: Craig Welch
Priority: Critical
 Attachments: YARN-2637.0.patch, YARN-2637.1.patch, 
 YARN-2637.12.patch, YARN-2637.13.patch, YARN-2637.15.patch, 
 YARN-2637.16.patch, YARN-2637.17.patch, YARN-2637.18.patch, 
 YARN-2637.19.patch, YARN-2637.2.patch, YARN-2637.20.patch, 
 YARN-2637.21.patch, YARN-2637.22.patch, YARN-2637.23.patch, 
 YARN-2637.25.patch, YARN-2637.26.patch, YARN-2637.27.patch, 
 YARN-2637.28.patch, YARN-2637.29.patch, YARN-2637.30.patch, 
 YARN-2637.31.patch, YARN-2637.32.patch, YARN-2637.6.patch, YARN-2637.7.patch, 
 YARN-2637.9.patch







[jira] [Updated] (YARN-2637) maximum-am-resource-percent could be respected for both LeafQueue/User when trying to activate applications.

2015-01-07 Thread Craig Welch (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Welch updated YARN-2637:
--
Attachment: YARN-2637.31.patch

See what happens when maxActiveApplications and maxActiveApplicationsPerUser 
are removed altogether

 maximum-am-resource-percent could be respected for both LeafQueue/User when 
 trying to activate applications.
 

 Key: YARN-2637
 URL: https://issues.apache.org/jira/browse/YARN-2637
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.6.0
Reporter: Wangda Tan
Assignee: Craig Welch
Priority: Critical
 Attachments: YARN-2637.0.patch, YARN-2637.1.patch, 
 YARN-2637.12.patch, YARN-2637.13.patch, YARN-2637.15.patch, 
 YARN-2637.16.patch, YARN-2637.17.patch, YARN-2637.18.patch, 
 YARN-2637.19.patch, YARN-2637.2.patch, YARN-2637.20.patch, 
 YARN-2637.21.patch, YARN-2637.22.patch, YARN-2637.23.patch, 
 YARN-2637.25.patch, YARN-2637.26.patch, YARN-2637.27.patch, 
 YARN-2637.28.patch, YARN-2637.29.patch, YARN-2637.30.patch, 
 YARN-2637.31.patch, YARN-2637.6.patch, YARN-2637.7.patch, YARN-2637.9.patch







[jira] [Updated] (YARN-2637) maximum-am-resource-percent could be violated when resource of AM is minimumAllocation

2015-01-07 Thread Craig Welch (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Welch updated YARN-2637:
--
Attachment: YARN-2637.30.patch

userAMLimit logic included as well, now with a test :-)

 maximum-am-resource-percent could be violated when resource of AM is  
 minimumAllocation
 

 Key: YARN-2637
 URL: https://issues.apache.org/jira/browse/YARN-2637
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.6.0
Reporter: Wangda Tan
Assignee: Craig Welch
Priority: Critical
 Attachments: YARN-2637.0.patch, YARN-2637.1.patch, 
 YARN-2637.12.patch, YARN-2637.13.patch, YARN-2637.15.patch, 
 YARN-2637.16.patch, YARN-2637.17.patch, YARN-2637.18.patch, 
 YARN-2637.19.patch, YARN-2637.2.patch, YARN-2637.20.patch, 
 YARN-2637.21.patch, YARN-2637.22.patch, YARN-2637.23.patch, 
 YARN-2637.25.patch, YARN-2637.26.patch, YARN-2637.27.patch, 
 YARN-2637.28.patch, YARN-2637.29.patch, YARN-2637.30.patch, 
 YARN-2637.6.patch, YARN-2637.7.patch, YARN-2637.9.patch







[jira] [Updated] (YARN-2637) maximum-am-resource-percent could be violated when resource of AM is minimumAllocation

2015-01-06 Thread Craig Welch (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Welch updated YARN-2637:
--
Attachment: YARN-2637.27.patch

Patch adds tests which fail when the null check for rmContext.getScheduler() 
is not present in FiCaSchedulerApp.

 maximum-am-resource-percent could be violated when resource of AM is  
 minimumAllocation
 

 Key: YARN-2637
 URL: https://issues.apache.org/jira/browse/YARN-2637
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.6.0
Reporter: Wangda Tan
Assignee: Craig Welch
Priority: Critical
 Attachments: YARN-2637.0.patch, YARN-2637.1.patch, 
 YARN-2637.12.patch, YARN-2637.13.patch, YARN-2637.15.patch, 
 YARN-2637.16.patch, YARN-2637.17.patch, YARN-2637.18.patch, 
 YARN-2637.19.patch, YARN-2637.2.patch, YARN-2637.20.patch, 
 YARN-2637.21.patch, YARN-2637.22.patch, YARN-2637.23.patch, 
 YARN-2637.25.patch, YARN-2637.26.patch, YARN-2637.27.patch, 
 YARN-2637.6.patch, YARN-2637.7.patch, YARN-2637.9.patch





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2637) maximum-am-resource-percent could be violated when resource of AM is minimumAllocation

2015-01-06 Thread Craig Welch (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Welch updated YARN-2637:
--
Attachment: YARN-2637.28.patch

 maximum-am-resource-percent could be violated when resource of AM is  
 minimumAllocation
 

 Key: YARN-2637
 URL: https://issues.apache.org/jira/browse/YARN-2637
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.6.0
Reporter: Wangda Tan
Assignee: Craig Welch
Priority: Critical
 Attachments: YARN-2637.0.patch, YARN-2637.1.patch, 
 YARN-2637.12.patch, YARN-2637.13.patch, YARN-2637.15.patch, 
 YARN-2637.16.patch, YARN-2637.17.patch, YARN-2637.18.patch, 
 YARN-2637.19.patch, YARN-2637.2.patch, YARN-2637.20.patch, 
 YARN-2637.21.patch, YARN-2637.22.patch, YARN-2637.23.patch, 
 YARN-2637.25.patch, YARN-2637.26.patch, YARN-2637.27.patch, 
 YARN-2637.28.patch, YARN-2637.6.patch, YARN-2637.7.patch, YARN-2637.9.patch





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2637) maximum-am-resource-percent could be violated when resource of AM is minimumAllocation

2015-01-06 Thread Craig Welch (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14267030#comment-14267030
 ] 

Craig Welch commented on YARN-2637:
---

The findbugs hit was the result of changing the ratio of synchronized to 
unsynchronized accesses, which tripped the findbugs limits, not the pattern 
itself, which looks fine, so I added a findbugs exclusion. TestFairScheduler 
passes on my box with the change, so that failure is build-server related / 
not a real issue.

I was not originally planning to address the max AM percent per user, as that 
wasn't the issue we kept encountering, but I forgot to mention this / edit the 
jira to reflect it. However, I'm going to see what the impact of adding it now 
would be, and then we can decide whether to include it or move it to its own 
jira.

 maximum-am-resource-percent could be violated when resource of AM is  
 minimumAllocation
 

 Key: YARN-2637
 URL: https://issues.apache.org/jira/browse/YARN-2637
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.6.0
Reporter: Wangda Tan
Assignee: Craig Welch
Priority: Critical
 Attachments: YARN-2637.0.patch, YARN-2637.1.patch, 
 YARN-2637.12.patch, YARN-2637.13.patch, YARN-2637.15.patch, 
 YARN-2637.16.patch, YARN-2637.17.patch, YARN-2637.18.patch, 
 YARN-2637.19.patch, YARN-2637.2.patch, YARN-2637.20.patch, 
 YARN-2637.21.patch, YARN-2637.22.patch, YARN-2637.23.patch, 
 YARN-2637.25.patch, YARN-2637.26.patch, YARN-2637.27.patch, 
 YARN-2637.28.patch, YARN-2637.6.patch, YARN-2637.7.patch, YARN-2637.9.patch





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2637) maximum-am-resource-percent could be violated when resource of AM is minimumAllocation

2015-01-06 Thread Craig Welch (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Welch updated YARN-2637:
--
Attachment: YARN-2637.29.patch

Taking a pass at also adding the per-user AM limit (needs further 
verification/testing), to see the test impact.

 maximum-am-resource-percent could be violated when resource of AM is  
 minimumAllocation
 

 Key: YARN-2637
 URL: https://issues.apache.org/jira/browse/YARN-2637
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.6.0
Reporter: Wangda Tan
Assignee: Craig Welch
Priority: Critical
 Attachments: YARN-2637.0.patch, YARN-2637.1.patch, 
 YARN-2637.12.patch, YARN-2637.13.patch, YARN-2637.15.patch, 
 YARN-2637.16.patch, YARN-2637.17.patch, YARN-2637.18.patch, 
 YARN-2637.19.patch, YARN-2637.2.patch, YARN-2637.20.patch, 
 YARN-2637.21.patch, YARN-2637.22.patch, YARN-2637.23.patch, 
 YARN-2637.25.patch, YARN-2637.26.patch, YARN-2637.27.patch, 
 YARN-2637.28.patch, YARN-2637.29.patch, YARN-2637.6.patch, YARN-2637.7.patch, 
 YARN-2637.9.patch





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2637) maximum-am-resource-percent could be violated when resource of AM is minimumAllocation

2015-01-05 Thread Craig Welch (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Welch updated YARN-2637:
--
Attachment: YARN-2637.26.patch

Reformatted some sections of TestLeafQueue; commented out the null check for 
rmContext.getScheduler() in FiCaSchedulerApp to see how widespread that 
condition is in the tests.

 maximum-am-resource-percent could be violated when resource of AM is  
 minimumAllocation
 

 Key: YARN-2637
 URL: https://issues.apache.org/jira/browse/YARN-2637
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.6.0
Reporter: Wangda Tan
Assignee: Craig Welch
Priority: Critical
 Attachments: YARN-2637.0.patch, YARN-2637.1.patch, 
 YARN-2637.12.patch, YARN-2637.13.patch, YARN-2637.15.patch, 
 YARN-2637.16.patch, YARN-2637.17.patch, YARN-2637.18.patch, 
 YARN-2637.19.patch, YARN-2637.2.patch, YARN-2637.20.patch, 
 YARN-2637.21.patch, YARN-2637.22.patch, YARN-2637.23.patch, 
 YARN-2637.25.patch, YARN-2637.26.patch, YARN-2637.6.patch, YARN-2637.7.patch, 
 YARN-2637.9.patch





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2637) maximum-am-resource-percent could be violated when resource of AM is minimumAllocation

2015-01-05 Thread Craig Welch (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14265480#comment-14265480
 ] 

Craig Welch commented on YARN-2637:
---


bq. Regarding null checks in FiCaSchedulerApp. Since scheduler assumes 
application is in running state when adding FiCaSchedulerApp. It is a big issue 
if RMApp cannot be found at that time. So comparing to just ignore such error, 
I think you need throw exception (if that exception will not cause RM shutdown) 
and log such error.

I'm not quite sure how to phrase this differently to get the point across: it 
is already the case, throughout the many mocking points which interact with 
this code, that the rmApp may be null at this point (if that were not the 
case, it would not be necessary to check for it). As I mentioned previously, 
the ResourceManager itself checks for this case. I am not introducing the 
mocking which resulted in this state, or even the existing checks for it in 
non-test code; I'm receiving this state and carrying it forward in the same 
way as has been done elsewhere (and, again, not only in tests). Changing this 
is not something which belongs in the scope of this jira: it represents a 
rationalization/overhaul of mocking throughout this area (resource manager, 
schedulers), and it is non-trivial and not specific to, or properly within the 
scope of, this change. Feel free to create a separate jira to improve the 
mocking throughout the code. The separate null check for the AM resource 
request is necessitated by the apparently intentional behavior of unmanaged 
AMs.

bq. And when this is possible?

+  if (rmContext.getScheduler() != null) 

Again: this occurs in existing test paths, and existing code is tolerant of 
it as well; I'm merely carrying it forward. It would belong in the new jira 
too, were one opened.

bq. \t in leafqueue

I've checked, and the spacing is consistent with the existing spacing in the 
file.
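
The null-tolerance discussed above can be sketched as a simple fallback 
guard: both a mocked-out scheduler and an unmanaged AM's missing resource 
request legitimately surface as null, so the consumer substitutes a default 
instead of throwing. This is an illustrative sketch with placeholder names, 
not the real FiCaSchedulerApp code:

```java
// Minimal sketch of a null-tolerant default: a null AM resource request
// (as with unmanaged AMs, or heavily mocked tests) falls back to the
// scheduler's minimum allocation. Names are placeholders, not the YARN API.
public class NullTolerantDefaults {

    /** Returns the AM resource in MB, falling back when the request is null. */
    static int amResourceMb(Integer amResourceRequestMb, int minimumAllocationMb) {
        if (amResourceRequestMb == null) {
            // Expected for unmanaged AMs: no AM resource request exists.
            return minimumAllocationMb;
        }
        return amResourceRequestMb;
    }

    public static void main(String[] args) {
        System.out.println(amResourceMb(null, 1)); // unmanaged AM -> 1
        System.out.println(amResourceMb(5, 1));    // managed AM   -> 5
    }
}
```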


 maximum-am-resource-percent could be violated when resource of AM is  
 minimumAllocation
 

 Key: YARN-2637
 URL: https://issues.apache.org/jira/browse/YARN-2637
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.6.0
Reporter: Wangda Tan
Assignee: Craig Welch
Priority: Critical
 Attachments: YARN-2637.0.patch, YARN-2637.1.patch, 
 YARN-2637.12.patch, YARN-2637.13.patch, YARN-2637.15.patch, 
 YARN-2637.16.patch, YARN-2637.17.patch, YARN-2637.18.patch, 
 YARN-2637.19.patch, YARN-2637.2.patch, YARN-2637.20.patch, 
 YARN-2637.21.patch, YARN-2637.22.patch, YARN-2637.23.patch, 
 YARN-2637.25.patch, YARN-2637.6.patch, YARN-2637.7.patch, YARN-2637.9.patch





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2637) maximum-am-resource-percent could be violated when resource of AM is minimumAllocation

2015-01-03 Thread Craig Welch (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Welch updated YARN-2637:
--
Attachment: YARN-2637.25.patch

 maximum-am-resource-percent could be violated when resource of AM is  
 minimumAllocation
 

 Key: YARN-2637
 URL: https://issues.apache.org/jira/browse/YARN-2637
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.6.0
Reporter: Wangda Tan
Assignee: Craig Welch
Priority: Critical
 Attachments: YARN-2637.0.patch, YARN-2637.1.patch, 
 YARN-2637.12.patch, YARN-2637.13.patch, YARN-2637.15.patch, 
 YARN-2637.16.patch, YARN-2637.17.patch, YARN-2637.18.patch, 
 YARN-2637.19.patch, YARN-2637.2.patch, YARN-2637.20.patch, 
 YARN-2637.21.patch, YARN-2637.22.patch, YARN-2637.23.patch, 
 YARN-2637.25.patch, YARN-2637.6.patch, YARN-2637.7.patch, YARN-2637.9.patch





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2637) maximum-am-resource-percent could be violated when resource of AM is minimumAllocation

2015-01-03 Thread Craig Welch (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14263608#comment-14263608
 ] 

Craig Welch commented on YARN-2637:
---

bq. I think there should at least one AM can be launched in each queue ... 
MockRM test config settings

That's been the case since switching to approach 2; some tests need to start 
> 1 app in a queue ;) In any case, I've removed the MockRM test config 
settings. They're only needed in a few tests now, so I'm setting them in 
those tests directly.
(done)

bq. -re maximumActiveApplications ... MAXIMUM_ACTIVE_APPLICATIONS_SUFFIX
I removed this new configuration point. It is no longer possible to directly 
control how many apps start in a queue, since the AMs are not all the same 
size; that could be controlled before, but not now, outside of testing. 
However, the cases I recall using it for were all workarounds for the max AM 
percent not working properly, so hopefully it won't be missed.
(done)

-re null checks in the FiCaSchedulerApp constructor
The ResourceManager itself checks for null rmApps (ResourceManager.java, 
~line 830); this is a pre-existing case which is tolerated, and I'm not going 
to address it. getAMResourceRequest() can also be null for unmanaged AMs. 
I've reduced the null checks for the app to just these two cases, but those 
checks should remain.
(partly done / the remainder should stay as-is)

All the build quality checks and tests are passing; not sure why the overall 
is red. I think it's a build server issue...

 maximum-am-resource-percent could be violated when resource of AM is  
 minimumAllocation
 

 Key: YARN-2637
 URL: https://issues.apache.org/jira/browse/YARN-2637
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.6.0
Reporter: Wangda Tan
Assignee: Craig Welch
Priority: Critical
 Attachments: YARN-2637.0.patch, YARN-2637.1.patch, 
 YARN-2637.12.patch, YARN-2637.13.patch, YARN-2637.15.patch, 
 YARN-2637.16.patch, YARN-2637.17.patch, YARN-2637.18.patch, 
 YARN-2637.19.patch, YARN-2637.2.patch, YARN-2637.20.patch, 
 YARN-2637.21.patch, YARN-2637.22.patch, YARN-2637.23.patch, 
 YARN-2637.25.patch, YARN-2637.6.patch, YARN-2637.7.patch, YARN-2637.9.patch





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2637) maximum-am-resource-percent could be violated when resource of AM is minimumAllocation

2015-01-02 Thread Craig Welch (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Welch updated YARN-2637:
--
Attachment: YARN-2637.23.patch

 maximum-am-resource-percent could be violated when resource of AM is  
 minimumAllocation
 

 Key: YARN-2637
 URL: https://issues.apache.org/jira/browse/YARN-2637
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.6.0
Reporter: Wangda Tan
Assignee: Craig Welch
Priority: Critical
 Attachments: YARN-2637.0.patch, YARN-2637.1.patch, 
 YARN-2637.12.patch, YARN-2637.13.patch, YARN-2637.15.patch, 
 YARN-2637.16.patch, YARN-2637.17.patch, YARN-2637.18.patch, 
 YARN-2637.19.patch, YARN-2637.2.patch, YARN-2637.20.patch, 
 YARN-2637.21.patch, YARN-2637.22.patch, YARN-2637.23.patch, 
 YARN-2637.6.patch, YARN-2637.7.patch, YARN-2637.9.patch





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2637) maximum-am-resource-percent could be violated when resource of AM is minimumAllocation

2015-01-02 Thread Craig Welch (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Welch updated YARN-2637:
--
Attachment: YARN-2637.22.patch

 maximum-am-resource-percent could be violated when resource of AM is  
 minimumAllocation
 

 Key: YARN-2637
 URL: https://issues.apache.org/jira/browse/YARN-2637
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.6.0
Reporter: Wangda Tan
Assignee: Craig Welch
Priority: Critical
 Attachments: YARN-2637.0.patch, YARN-2637.1.patch, 
 YARN-2637.12.patch, YARN-2637.13.patch, YARN-2637.15.patch, 
 YARN-2637.16.patch, YARN-2637.17.patch, YARN-2637.18.patch, 
 YARN-2637.19.patch, YARN-2637.2.patch, YARN-2637.20.patch, 
 YARN-2637.21.patch, YARN-2637.22.patch, YARN-2637.6.patch, YARN-2637.7.patch, 
 YARN-2637.9.patch





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2637) maximum-am-resource-percent could be violated when resource of AM is minimumAllocation

2015-01-02 Thread Craig Welch (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Welch updated YARN-2637:
--
Attachment: YARN-2637.21.patch

 maximum-am-resource-percent could be violated when resource of AM is  
 minimumAllocation
 

 Key: YARN-2637
 URL: https://issues.apache.org/jira/browse/YARN-2637
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.6.0
Reporter: Wangda Tan
Assignee: Craig Welch
Priority: Critical
 Attachments: YARN-2637.0.patch, YARN-2637.1.patch, 
 YARN-2637.12.patch, YARN-2637.13.patch, YARN-2637.15.patch, 
 YARN-2637.16.patch, YARN-2637.17.patch, YARN-2637.18.patch, 
 YARN-2637.19.patch, YARN-2637.2.patch, YARN-2637.20.patch, 
 YARN-2637.21.patch, YARN-2637.6.patch, YARN-2637.7.patch, YARN-2637.9.patch





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2637) maximum-am-resource-percent could be violated when resource of AM is minimumAllocation

2014-12-29 Thread Craig Welch (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Welch updated YARN-2637:
--
Attachment: YARN-2637.20.patch

 maximum-am-resource-percent could be violated when resource of AM is >  
 minimumAllocation
 

 Key: YARN-2637
 URL: https://issues.apache.org/jira/browse/YARN-2637
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.6.0
Reporter: Wangda Tan
Assignee: Craig Welch
Priority: Critical
 Attachments: YARN-2637.0.patch, YARN-2637.1.patch, 
 YARN-2637.12.patch, YARN-2637.13.patch, YARN-2637.15.patch, 
 YARN-2637.16.patch, YARN-2637.17.patch, YARN-2637.18.patch, 
 YARN-2637.19.patch, YARN-2637.2.patch, YARN-2637.20.patch, YARN-2637.6.patch, 
 YARN-2637.7.patch, YARN-2637.9.patch


 Currently, the number of AMs in a leaf queue is calculated as follows:
 {code}
 max_am_resource = queue_max_capacity * maximum_am_resource_percent
 #max_am_number = max_am_resource / minimum_allocation
 #max_am_number_for_each_user = #max_am_number * userlimit * userlimit_factor
 {code}
 And when a new application is submitted to the RM, it checks whether the app 
 can be activated as follows:
 {code}
 for (Iterator<FiCaSchedulerApp> i = pendingApplications.iterator();
      i.hasNext();) {
   FiCaSchedulerApp application = i.next();

   // Check queue limit
   if (getNumActiveApplications() >= getMaximumActiveApplications()) {
     break;
   }

   // Check user limit
   User user = getUser(application.getUser());
   if (user.getActiveApplications() <
       getMaximumActiveApplicationsPerUser()) {
     user.activateApplication();
     activeApplications.add(application);
     i.remove();
     LOG.info("Application " + application.getApplicationId() +
         " from user: " + application.getUser() +
         " activated in queue: " + getQueueName());
   }
 }
 {code}
 An example: if a queue has capacity = 1G and maximum_am_resource_percent = 0.2, 
 the maximum resource AMs may use is 200M. Assuming minimum_allocation = 1M, up 
 to 200 AMs can be launched. If each AM actually uses 5M (> minimum_allocation), 
 all 200 apps can still be activated, and their AMs will occupy 1000M, i.e. the 
 entire queue, instead of only maximum_am_resource_percent of it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

