[ https://issues.apache.org/jira/browse/YARN-4902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15231365#comment-15231365 ]
Subru Krishnan commented on YARN-4902:
--------------------------------------

Thanks [~vinodkv], [~leftnoteasy], [~jianhe], [~vvasudev] and others for putting up this proposal. I went through it and it seems quite relevant given the increasing range of workloads we will have to support in YARN in the near future. I have a few high-level comments below. Obviously this needs a lot more thought and discussion.

*GUTS API feedback*:
- I want to echo [~asuresh]'s comment on consolidating _Allocation-ID_ with the _Request-ID_ proposed in YARN-4879, and [~vinodkv] seems to agree based on his [comments|https://issues.apache.org/jira/browse/YARN-4879?focusedCommentId=15220475].
- Now that we are reworking the API from scratch, can we add a *cost function* for the _ResourceRequest_? I feel _Priority_ is being overloaded to express scheduling cost, preemption cost, container types, etc.
- I am not able to grok why we need both _maximum number of allocations_ and _maximum concurrency_, especially considering that these come on top of the existing _numContainers_. Won't they conflict?
- Can we have a section at the end that explicitly lists the mandatory and optional attributes at the _Application_ and _ResourceRequest_ levels? The document is rather long, so a snapshot summary would be good.
- Overall the proposed API seems quite powerful, but we should make sure that we don't end up trading simplicity for functionality, IMHO (this is based on the feedback we received for YARN-1051). For instance, the typical MapReduce scenario feels denser than with the current APIs, but should be more easily expressible if we sacrifice some of the additional flexibility that the GUTS API provides. So it would also be good to have examples of what current constrained asks will look like when made through the GUTS API.

*Time aspects*:
- I agree that we should consolidate the time-related placement conditions with the work done in YARN-1051.
- + capital 1 on your observation that _The reservations feature proposed at YARN-1051 can pave a great way for implementing minimumconcurrency_ :).

*Scheduler enhancements*:
- The current _Schedulers_ will be extremely hard-pressed to handle GUTS API requests efficiently. I guess this should act as good motivation to consider an _application centric_ approach as opposed to the current _node centric_ one, as we have occasionally discussed with [~asuresh], [~kasha], [~curino] and others.

> [Umbrella] Generalized and unified scheduling-strategies in YARN
> ----------------------------------------------------------------
>
>                 Key: YARN-4902
>                 URL: https://issues.apache.org/jira/browse/YARN-4902
>             Project: Hadoop YARN
>          Issue Type: New Feature
>            Reporter: Vinod Kumar Vavilapalli
>            Assignee: Wangda Tan
>         Attachments: Generalized and unified scheduling-strategies in YARN -v0.pdf
>
>
> Apache Hadoop YARN's ResourceRequest mechanism is the core part of the YARN's
> scheduling API for applications to use. The ResourceRequest mechanism is a
> powerful API for applications (specifically ApplicationMasters) to indicate
> to YARN what size of containers are needed, and where in the cluster etc.
> However a host of new feature requirements are making the API increasingly
> more and more complex and difficult to understand by users and making it very
> complicated to implement within the code-base.
> This JIRA aims to generalize and unify all such scheduling-strategies in YARN.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)