[ 
https://issues.apache.org/jira/browse/YARN-8320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16490165#comment-16490165
 ] 

Weiwei Yang commented on YARN-8320:
-----------------------------------

Hi [~miklos.szeg...@cloudera.com]

Thanks for the comments. Please see my points below.

bq. 1) and 3) it might make sense to use a separate resource type for this 
feature

Extending resource types might not be straightforward for cpuset. With your 
suggestion, how would we define how many cpuset resources an NM has, and how 
would an AM request them? It is not a numeric value. The problem is that cpuset 
works on physical cores (processors) while YARN manages vcores, and a processor 
can be shared by multiple containers. Hence we can hardly define "values" if we 
treat it as a resource. 
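
To illustrate the point, here is a minimal sketch in Python. The container 
names and numbers are made up for illustration and are not taken from the 
patch. It only shows that vcores add up as a count, while the processors behind 
a cpuset form overlapping sets that do not add up.

```python
# Hypothetical containers: "vcores" is what YARN accounts for, "cpus" is the
# set of physical processors a cpuset would bind the container to.
containers = {
    "c1": {"vcores": 2, "cpus": {0, 1}},
    "c2": {"vcores": 2, "cpus": {0, 1}},      # shares processors 0 and 1 with c1
    "c3": {"vcores": 4, "cpus": {1, 2, 3}},   # overlaps with both on processor 1
}

# vcores are additive: the total is a plain sum.
vcores_used = sum(c["vcores"] for c in containers.values())

# Processors are not additive: the total is the union of the sets.
processors_used = set().union(*(c["cpus"] for c in containers.values()))

print(vcores_used, sorted(processors_used))
```

Eight vcores are in use here, but only four distinct processors, so a single 
numeric "cpuset" value would not describe which processors are taken.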

bq. 2) users might not need the RESERVED/SHARED modes

This was my first impression too. But after I talked with [~yangjiandan] and 
some other folks who manage LS services, I changed my mind. RESERVE/SHARE helps 
to improve utilization and is key for mixed-workload environments, where batch 
tasks run alongside services. It also helps to resolve problems like the ones 
you mentioned in #6 and #7.

bq. 4) The design lets the AM do a delayed exclusive request directly to the NM 
avoiding the RM. I think it would be more robust to request from the RM in the 
container launch context and just forward this to the NM. The RM has the chance 
to decline or delay the request in this case in the future.

I agree. We have not yet figured out a way to let the RM play its role here; we 
will think harder about this.

bq. 6) Let me mention that this feature negatively affects YARN-1011 and 
oversubscription.

That's why we have the RESERVED/SHARED modes: they allow an LS service to share 
its CPU with other tasks, including O containers (O containers will use ANY 
mode). But if we set a container to EXCLUSIVE mode, then yes, it will occupy 
the CPU. This is the only way to ensure that such highly sensitive tasks run 
completely isolated. Most of our existing online services use RESERVED or SHARE 
mode in order to improve utilization (a typical mixed-workload scenario).

bq. 5)  how can you make sure a parent cgroup does not interfere with a cgroup 
marked as cpuset.cpu_exclusive=1? What if a system service wakes up?

We are not going to set cpuset.cpu_exclusive=1, at least not in this version of 
the design. We are trying to solve the problem of containers competing with 
each other for CPU resources, not competition with system services.

bq. 7) Also, latency sensitive applications get exclusive protection but can 
only be assigned to their cpuset disallowing bursts to other CPUs when needed. 
I do not know how to solve this though.

Use SHARE mode. We have a lot of online services running under this mode, which 
allows a service to use all processors except those assigned to EXCLUSIVE and 
RESERVED.
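
As a rough sketch of that rule, assuming the node agent keeps a map from 
container ID to the processors assigned to EXCLUSIVE or RESERVED containers 
(the function name and the data layout below are hypothetical, not from the 
patch):

```python
def share_pool(all_cpus, pinned):
    """Return the processors left over for SHARE-mode containers.

    all_cpus: iterable of processor IDs on the node.
    pinned:   dict of container ID -> set of processor IDs assigned to an
              EXCLUSIVE or RESERVED container (hypothetical layout).
    """
    taken = set()
    for cpus in pinned.values():
        taken |= set(cpus)
    # SHARE containers may run on everything that is not pinned.
    return sorted(set(all_cpus) - taken)


# Example: an 8-processor node with two pinned containers.
pool = share_pool(range(8), {"c1": {0, 1}, "c2": {4}})
print(pool)
```

With processors 0, 1 and 4 pinned, a SHARE container can still burst across the 
remaining five processors.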

bq. 8) mean that other container cgroups need to be changed and reduced every 
time a reserved container starts

Correct. When we assign a processor to a container using RESERVED or EXCLUSIVE, 
we need to remove it from the cgroups of the rest of the containers. This is 
briefly introduced in section 3.5 of the design doc.
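
A minimal sketch of that update step, not the logic from the design doc or the 
patch. The in-memory map of per-container processor sets and both function 
names are assumptions for illustration. The only external fact relied on is 
that cpuset.cpus takes a comma-separated list of processor IDs and ranges, 
such as "0-2,5".

```python
def format_cpu_list(cpus):
    """Render processor IDs in the list format of cpuset.cpus, e.g. "0-2,5"."""
    ids = sorted(set(cpus))
    parts = []
    i = 0
    while i < len(ids):
        j = i
        # Extend the current run while the IDs stay consecutive.
        while j + 1 < len(ids) and ids[j + 1] == ids[j] + 1:
            j += 1
        parts.append(str(ids[i]) if i == j else f"{ids[i]}-{ids[j]}")
        i = j + 1
    return ",".join(parts)


def assign_processors(cpusets, container_id, assigned):
    """Pin `assigned` to container_id and remove them from all other containers.

    cpusets: dict of container ID -> set of processor IDs (hypothetical
             in-memory view of the per-container cgroups).
    Returns a new dict; the input is left unchanged.
    """
    assigned = set(assigned)
    updated = {
        cid: set(cpus) - assigned
        for cid, cpus in cpusets.items()
        if cid != container_id
    }
    updated[container_id] = assigned
    return updated


# Two batch containers share processors 0-3, then an LS container takes 0-1.
before = {"batch_1": {0, 1, 2, 3}, "batch_2": {0, 1, 2, 3}}
after = assign_processors(before, "ls_1", {0, 1})
cpuset_files = {cid: format_cpu_list(cpus) for cid, cpus in after.items()}
print(cpuset_files)
```

The sketch does not handle a container that would be left with no processors 
at all; a real implementation would need a policy for that case.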

Hope this makes sense. Looking forward to hearing your feedback.
Thanks

> [Umbrella] Support CPU isolation for latency-sensitive (LS) service
> -------------------------------------------------------------------
>
>                 Key: YARN-8320
>                 URL: https://issues.apache.org/jira/browse/YARN-8320
>             Project: Hadoop YARN
>          Issue Type: New Feature
>          Components: nodemanager
>            Reporter: Jiandan Yang 
>            Priority: Major
>         Attachments: CPU-isolation-for-latency-sensitive-services-v1.pdf, 
> CPU-isolation-for-latency-sensitive-services-v2.pdf, YARN-8320.001.patch
>
>
> Currently NodeManager uses “cpu.cfs_period_us”, “cpu.cfs_quota_us” and 
> “cpu.shares” to isolate CPU resources. However,
>  * The Linux Completely Fair Scheduler (CFS) is a throughput-oriented 
> scheduler; it has no support for differentiated latency.
>  * The request latency of services running in containers may fluctuate 
> frequently when all containers share CPUs, which latency-sensitive services 
> in our production environment cannot afford.
> So we need more fine-grained CPU isolation.
> Here we propose a solution that uses cgroup cpuset to bind containers to 
> different processors. This is inspired by the isolation technique in the [Borg 
> system|http://schd.ws/hosted_files/lcccna2016/a7/CAT%20@%20Scale.pdf].



