Re: [VOTE] Merge YARN-3926 (resource profile) to trunk

Daniel Templeton Sat, 26 Aug 2017 07:41:27 -0700

Quick question, Wangda. When you say that the feature can be turned off, do you mean resource types or resource profiles? I know there's an off-by-default property that governs resource profiles, but I didn't see any way to turn off resource types. Even if only CPU and memory are configured, i.e. no additional resource types, the code path is different than it was. Specifically, where CPU and memory were primitives before, they're now entries in an array whose indexes have to be looked up through the ResourceUtils class. Did I miss something?
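To make the code-path difference concrete, here is a minimal sketch of the two shapes. It is not the actual YARN code: every class and method name below (PrimitiveResource, ResourceIndex, ArrayResource) is hypothetical, and ResourceIndex only stands in for the kind of name-to-index lookup that ResourceUtils is described as providing. The resource names "memory-mb" and "vcores" are used for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical "before" shape: CPU and memory are plain primitive fields.
final class PrimitiveResource {
    private final long memoryMb;
    private final int vcores;

    PrimitiveResource(long memoryMb, int vcores) {
        this.memoryMb = memoryMb;
        this.vcores = vcores;
    }

    long getMemoryMb() { return memoryMb; }
    int getVcores() { return vcores; }
}

// Hypothetical stand-in for a name-to-index registry.
// This is an assumption for illustration, not the real ResourceUtils API.
final class ResourceIndex {
    private static final Map<String, Integer> INDEX = new HashMap<>();
    static {
        INDEX.put("memory-mb", 0);
        INDEX.put("vcores", 1);
    }

    static int indexOf(String name) {
        Integer i = INDEX.get(name);
        if (i == null) {
            throw new IllegalArgumentException("Unknown resource type: " + name);
        }
        return i;
    }

    static int size() { return INDEX.size(); }
}

// Hypothetical "after" shape: every resource, including CPU and memory,
// is a slot in an array, and the slot is found through the registry.
final class ArrayResource {
    private final long[] values = new long[ResourceIndex.size()];

    ArrayResource(long memoryMb, long vcores) {
        values[ResourceIndex.indexOf("memory-mb")] = memoryMb;
        values[ResourceIndex.indexOf("vcores")] = vcores;
    }

    long getValue(String name) {
        return values[ResourceIndex.indexOf(name)];
    }
}

class ResourceLookupDemo {
    public static void main(String[] args) {
        PrimitiveResource before = new PrimitiveResource(8192, 4);
        ArrayResource after = new ArrayResource(8192, 4);
        // Same values, but the second read goes through an index lookup.
        System.out.println(before.getMemoryMb() + " / " + after.getValue("memory-mb"));
    }
}
```

The point of the sketch is only that the array-backed shape pays for a lookup even when no additional resource types are configured, which is why an off switch for resource profiles alone would not restore the old path.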

For those who haven't followed the feature closely, there are really two features here. Resource types allows for declarative extension of the resource system in YARN. Resource profiles builds on top of resource types to allow a user to request a group of resources as a profile, much like EC2 instance types, e.g. "fast-compute" might mean 32GB RAM, 8 vcores, and 2 GPUs.
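As a rough illustration of how a profile is just a named group of resource-type values, here is a small sketch. ProfileRegistry and its methods are hypothetical and are not part of YARN; the resource names "memory-mb", "vcores", and "gpu" are assumptions, and the numbers come from the "fast-compute" example above.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical registry mapping a profile name to a group of resource values.
final class ProfileRegistry {
    private final Map<String, Map<String, Long>> profiles = new HashMap<>();

    // Store a defensive, read-only copy so callers cannot alter a profile later.
    void define(String profileName, Map<String, Long> resources) {
        profiles.put(profileName,
                Collections.unmodifiableMap(new LinkedHashMap<>(resources)));
    }

    // Expand a profile name into the concrete resources it stands for.
    Map<String, Long> resolve(String profileName) {
        Map<String, Long> resources = profiles.get(profileName);
        if (resources == null) {
            throw new IllegalArgumentException("Unknown profile: " + profileName);
        }
        return resources;
    }
}

class ProfileDemo {
    public static void main(String[] args) {
        ProfileRegistry registry = new ProfileRegistry();

        Map<String, Long> fastCompute = new LinkedHashMap<>();
        fastCompute.put("memory-mb", 32L * 1024); // 32GB RAM, expressed in MB
        fastCompute.put("vcores", 8L);
        fastCompute.put("gpu", 2L);               // "gpu" is an assumed type name
        registry.define("fast-compute", fastCompute);

        // A request names the profile instead of listing each value.
        System.out.println(registry.resolve("fast-compute"));
    }
}
```

In this sketch the profile layer only needs resource types to exist as named entries, which is the sense in which profiles build on top of types.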


Daniel

On 8/23/17 11:49 AM, Wangda Tan wrote:

  Hi folks,

Per earlier discussion [1], I'd like to start a formal vote to merge
feature branch YARN-3926 (Resource profile) to trunk. The vote will run for
7 days and will end August 30 10:00 AM PDT.

Briefly, YARN-3926 extends the resource model of YARN to support resource
types other than CPU and memory, so it will be a cornerstone of features
like GPU support (YARN-6223), disk scheduling/isolation (YARN-2139), FPGA
support (YARN-5983), and network IO scheduling/isolation (YARN-2140). In
addition, YARN-3926 allows admins to preconfigure resource profiles
in the cluster. For example, m3.large means <2 vcores, 8 GB memory, 64 GB
disk>, so applications can request the "m3.large" profile instead of specifying
a value for every resource type.

There are 32 subtasks that were completed as part of this effort.

This feature needs to be explicitly turned on before use. We paid close
attention to the compatibility, performance, and scalability of this feature.
As mentioned in [1], we saw no observable performance regression in
large-scale SLS (scheduler load simulator) runs and less than 5%
performance regression in the micro benchmark added by YARN-6775.

This feature works end to end (including UI/CLI/application/server).
We have set up a cluster with this feature turned on that has run for several
weeks, and we haven't seen any issues so far.

Merge JIRA: YARN-7013 (Jenkins gave +1 already).
Documentation: YARN-7056

Special thanks to a team of folks who worked hard and contributed towards
this effort including design discussion/development/reviews, etc.: Varun
Vasudev, Sunil Govind, Daniel Templeton, Vinod Vavilapalli, Yufei Gu,
Karthik Kambatla, Jason Lowe, Arun Suresh.

Regards,
Wangda Tan

[1]
http://mail-archives.apache.org/mod_mbox/hadoop-yarn-dev/201708.mbox/%3CCAD%2B%2BeCnjEHU%3D-M33QdjnND0ZL73eKwxRua4%3DBbp4G8inQZmaMg%40mail.gmail.com%3E



