[ 
https://issues.apache.org/jira/browse/YARN-4902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15231365#comment-15231365
 ] 

Subru Krishnan commented on YARN-4902:
--------------------------------------

Thanks [~vinodkv], [~leftnoteasy], [~jianhe], [~vvasudev] and others for putting 
up this proposal. I went through it, and it seems quite relevant given the 
increasing range of workloads we will have to support in YARN in the near future.

I have a few high-level comments below. Obviously this needs a lot more 
thought and discussion.

*GUTS API feedback*:
  - I want to echo [~asuresh]'s comment on consolidating _Allocation-ID_ with the 
_Request-ID_ proposed in YARN-4879. [~vinodkv] seems to agree, based on his 
[comments|https://issues.apache.org/jira/browse/YARN-4879?focusedCommentId=15220475].
  - Now that we are reworking the API from scratch, can we add a *cost 
function* for the _ResourceRequest_? I feel _Priority_ is being overloaded to 
express scheduling cost, preemption cost, container types etc.
  - I am not able to grok why we need both _maximum number of allocations_ and 
_maximum concurrency_, especially considering that these come on top of the 
existing _numContainers_. Won't they conflict?
  - Can we have a section at the end that explicitly lists the mandatory and 
optional attributes at the _Application_ and _ResourceRequest_ levels? The 
document is rather long, so a snapshot summary would be good.
  - Overall the proposed API seems quite powerful, but IMHO we should make sure 
that we don't end up trading simplicity for functionality (this is based on the 
feedback we received for YARN-1051). For instance, the typical MapReduce 
scenario feels denser than with the current APIs, but it should be easier to 
express if we give up some of the additional flexibility that the GUTS API 
provides. So it would also be good to have examples of what current constrained 
asks will look like when made through the GUTS API.
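
To make the cost-function suggestion above a little more concrete, here is a minimal sketch in Python. Everything in it is hypothetical: `RequestCost`, `SketchResourceRequest`, their field names and `preemption_order` are illustrative only and are not part of YARN or of the GUTS proposal. It only shows the idea of carrying scheduling and preemption cost as explicit fields, so that _Priority_ is left to express ordering.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class RequestCost:
    """Hypothetical explicit costs, kept separate from priority."""
    scheduling_cost: float = 1.0   # assumed: relative cost of delaying the request
    preemption_cost: float = 1.0   # assumed: relative cost of preempting its containers


@dataclass(frozen=True)
class SketchResourceRequest:
    """Illustrative stand-in, not the real YARN ResourceRequest."""
    priority: int                  # used only to order requests
    memory_mb: int
    vcores: int
    num_containers: int
    cost: RequestCost = field(default_factory=RequestCost)


def preemption_order(requests):
    """Return the requests sorted so the cheapest to preempt come first."""
    return sorted(requests, key=lambda r: r.cost.preemption_cost)


# Two requests with the same priority but different preemption costs.
requests = [
    SketchResourceRequest(1, 2048, 1, 10, RequestCost(preemption_cost=5.0)),
    SketchResourceRequest(1, 2048, 1, 4, RequestCost(preemption_cost=0.5)),
]
cheapest_first = preemption_order(requests)
```

With the costs explicit, two requests at the same priority can still be told apart when choosing what to preempt first.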

*Time aspects*: 
  - I agree that we should consolidate the time related placement conditions 
with the work done in YARN-1051. 
  - A big +1 on your observation that _The reservations feature proposed at 
YARN-1051 can pave a great way for implementing minimumconcurrency_ :).

*Scheduler enhancements*: 
  - The current _Schedulers_ will be extremely hard-pressed to handle GUTS API 
requests efficiently. I guess this should act as good motivation to consider an 
_application-centric_ approach as opposed to the current _node-centric_ one, as 
we have occasionally discussed with [~asuresh], [~kasha], [~curino] et al.
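
One possible way to picture the node-centric versus application-centric distinction is the toy Python sketch below. It is an assumption-laden illustration, not how any YARN scheduler is implemented: the dictionaries, the single `free`/`size` capacity number, and the tightest-fit rule are all made up for the example. The only point is that a loop driven by nodes decides with one node in view, while a loop driven by applications can compare all nodes for each container.

```python
import copy


def node_centric(nodes, apps):
    """Walk nodes one at a time (as on heartbeats) and give each node's free
    capacity to the first applications whose containers fit."""
    placements = []
    for node in nodes:
        for app in apps:
            while app["pending"] > 0 and node["free"] >= app["size"]:
                node["free"] -= app["size"]
                app["pending"] -= 1
                placements.append((app["name"], node["name"]))
    return placements


def application_centric(nodes, apps):
    """Walk applications one at a time and, for each pending container,
    pick the tightest-fitting node across the whole (toy) cluster."""
    placements = []
    for app in apps:
        while app["pending"] > 0:
            candidates = [n for n in nodes if n["free"] >= app["size"]]
            if not candidates:
                break  # nothing fits anywhere right now
            best = min(candidates, key=lambda n: n["free"])
            best["free"] -= app["size"]
            app["pending"] -= 1
            placements.append((app["name"], best["name"]))
    return placements


# Hypothetical cluster: capacities and sizes are in arbitrary units.
NODES = [{"name": "n1", "free": 8}, {"name": "n2", "free": 4}]
APPS = [{"name": "a", "size": 4, "pending": 1},
        {"name": "b", "size": 8, "pending": 1}]

# Deep copies keep the two runs independent, since both functions mutate input.
node_result = node_centric(copy.deepcopy(NODES), copy.deepcopy(APPS))
app_result = application_centric(copy.deepcopy(NODES), copy.deepcopy(APPS))
```

On this toy input the node-centric loop places only application `a` (on `n1`, which then no longer fits `b`), while the application-centric loop places `a` on `n2` and `b` on `n1`.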

> [Umbrella] Generalized and unified scheduling-strategies in YARN
> ----------------------------------------------------------------
>
>                 Key: YARN-4902
>                 URL: https://issues.apache.org/jira/browse/YARN-4902
>             Project: Hadoop YARN
>          Issue Type: New Feature
>            Reporter: Vinod Kumar Vavilapalli
>            Assignee: Wangda Tan
>         Attachments: Generalized and unified scheduling-strategies in YARN 
> -v0.pdf
>
>
> Apache Hadoop YARN's ResourceRequest mechanism is the core part of the YARN's 
> scheduling API for applications to use. The ResourceRequest mechanism is a 
> powerful API for applications (specifically ApplicationMasters) to indicate 
> to YARN what size of containers are needed, and where in the cluster etc.
> However a host of new feature requirements are making the API increasingly 
> more and more complex and difficult to understand by users and making it very 
> complicated to implement within the code-base.
> This JIRA aims to generalize and unify all such scheduling-strategies in YARN.



