[ 
https://issues.apache.org/jira/browse/YARN-8320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16491177#comment-16491177
 ] 

Wangda Tan commented on YARN-8320:
----------------------------------

Thanks [~cheersyang], 

To me the benefit of exclusive mode is obvious and straightforward. For 
SHARED/RESERVED mode, if we plan to do this in phases, could we do the 
EXCLUSIVE mode first, harden the API, refactoring, etc first. Instead of 
working on too many changes. What's ur opinion on this?

In addition to the JIRA I mentioned, there's an effort to add support to GPU 
affinity: YARN-7481. I don't quite like the proposed approach (bitmap to 
represent resources). But it might be good if we can build a common layer to 
support such use cases since for CPU exclusive usage, etc. we also want ACLs / 
resource accounting, etc.

> [Umbrella] Support CPU isolation for latency-sensitive (LS) service
> -------------------------------------------------------------------
>
>                 Key: YARN-8320
>                 URL: https://issues.apache.org/jira/browse/YARN-8320
>             Project: Hadoop YARN
>          Issue Type: New Feature
>          Components: nodemanager
>            Reporter: Jiandan Yang 
>            Priority: Major
>         Attachments: CPU-isolation-for-latency-sensitive-services-v1.pdf, 
> CPU-isolation-for-latency-sensitive-services-v2.pdf, YARN-8320.001.patch
>
>
> Currently NodeManager uses “cpu.cfs_period_us”, “cpu.cfs_quota_us” and 
> “cpu.shares” to isolate cpu resource. However,
>  * Linux Completely Fair Scheduling (CFS) is a throughput-oriented scheduler; 
> no support for differentiated latency
>  * Request latency of services running on container may be frequent shake 
> when all containers share cpus, and latency-sensitive services can not afford 
> in our production environment.
> So we need more fine-grained cpu isolation.
> Here we propose a solution using cgroup cpuset to binds containers to 
> different processors, this is inspired by the isolation technique in [Borg 
> system|http://schd.ws/hosted_files/lcccna2016/a7/CAT%20@%20Scale.pdf].



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org

Reply via email to