[ https://issues.apache.org/jira/browse/YARN-8320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16488899#comment-16488899 ]

Weiwei Yang commented on YARN-8320:
-----------------------------------

Hi [~leftnoteasy]

Thanks for reviewing the design doc, very good comments. Please see my answers 
below.

bq. the #vcore must be divisible by #physical-core, otherwise it will cause 
rounding issue and containers will get less/more than requested resources

This is correct, we do need to detect #processors and do this check when the 
feature is enabled. Even when this holds on the NM, we still need the 
calculation introduced in section 3.3: there is a formula to transform the 
#vcore a container requested into #processor, and the result can still be a 
decimal, so we take the floor. Containers will still get what they requested 
(in terms of vcore), but when a container is bound to processors it may, for 
example, get 30% of the processors instead of the 33% of vcores it holds on 
the node (under the configuration #vcore = 10 * #processor). I think we should 
explain in a tutorial how to set these values up, to avoid such non-optimal 
situations.
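To make that mapping concrete, here is a minimal sketch of the section 3.3 transform (the function and parameter names are hypothetical, not from the patch): the requested vcores are scaled by the node's processor/vcore ratio and floored.

```python
import math

def processors_for_container(requested_vcores, node_vcores, node_processors):
    """Sketch of the section-3.3 transform: a container's vcore request
    is scaled to physical processors, and the (possibly decimal) result
    is floored. Names here are hypothetical, not from the patch."""
    return math.floor(requested_vcores * node_processors / node_vcores)

# Under the configuration #vcore = 10 * #processor (100 vcores, 10 processors),
# a container requesting 33 vcores (33% of the node) is bound to
# floor(3.3) = 3 processors, i.e. 30% of the processors.
```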

bq. I'm still trying to understand benefit of RESERVED / SHARED mode

*Same*:
Both RESERVED and SHARE modes give an LS (latency-sensitive) service the 
option to share some of its cpu with LT (latency-tolerant) tasks, aka offline 
tasks. If the cpu gets busy, the LS service can still get enough cpu time via 
its *cpu.share* (the LS service will have a much bigger cpu.share than LT 
tasks). So your comment "RESERVED can be affected by adhoc ANY container" is 
correct, but since they have different cpu.share values, it is still under 
control : ).

*Difference*:
SHARE mode additionally gives an LS task the option to share cpu with other LS 
tasks. This helps the case where somewhat less latency-sensitive tasks (less 
so than EXCLUSIVE and RESERVED ones) can run together to pull up cpu 
utilization, while such LS tasks still keep a higher weight than offline tasks 
(again via cpu.share).
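The cpu.share arbitration above can be illustrated with a small sketch (the weights are made-up example values, not recommendations): under full contention, CFS gives each cgroup cpu time proportional to its cpu.shares, so the LS service keeps the bulk of the cpu even when LT containers run on the same processors.

```python
def cpu_time_fraction(shares):
    """Fraction of cpu time each cgroup gets under full contention,
    proportional to its cpu.shares weight (how CFS arbitrates)."""
    total = sum(shares.values())
    return {name: weight / total for name, weight in shares.items()}

# Hypothetical weights: the LS service gets a much bigger cpu.share
# than the LT (offline) tasks sharing its processors.
split = cpu_time_fraction({"ls-service": 8192, "lt-task-1": 1024, "lt-task-2": 1024})
# The LS service keeps 80% of the cpu time; each LT task gets 10%.
```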

bq. Related to NUMA allocation on YARN YARN-5764

Thanks for pointing this out, I was not aware of it. I just looked into the 
design and implementation, and I agree that feature is related to this one, 
but I don't think they are tightly coupled. From a rough reading of the code, 
I think it is possible to: in a 1st phase, support only one or the other, so a 
user can enable either NUMA or cpuset but not both; in a 2nd phase, make them 
work together (by making sure we bind processors under the node chosen by the 
NUMA-side result). I will update the design doc later with more details.

bq. Related to GPU allocation/resource plugin

We are not going to treat cpuset as a resource type, so this is not 
applicable. Most of the work for this feature will be contained on the NM side 
as a cgroups cpuset handler; in that respect it is more like the NUMA code.
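As a rough sketch of what such an NM-side handler would do (the paths, names, and cgroup v1 layout below are assumptions for illustration, not the actual patch): it creates a per-container cpuset cgroup and writes the assigned processor list into it.

```python
from pathlib import Path

def bind_container_cpuset(container_id, processors,
                          cgroup_root="/sys/fs/cgroup/cpuset/hadoop-yarn"):
    """Pin a container to specific processors via a cgroup v1 cpuset.
    The cgroup_root default and layout are assumptions for illustration."""
    cg = Path(cgroup_root) / container_id
    cg.mkdir(parents=True, exist_ok=True)
    # cpuset requires both cpus and mems to be set before tasks can join.
    (cg / "cpuset.cpus").write_text(",".join(str(p) for p in processors))
    (cg / "cpuset.mems").write_text("0")  # single NUMA node assumed for now
```

Launching the container then only requires writing its PID into the cgroup's tasks file; a NUMA-aware phase 2 would pick cpuset.mems from the NUMA allocation instead of hardcoding node 0.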

bq. To me only privileged users and applications can request non-ANY CPU mode

Valid suggestion. We can add this as phase-2 work with a pre-check. I will 
update the design doc as well.

Thanks


> Support CPU isolation for latency-sensitive (LS) service
> --------------------------------------------------------
>
>                 Key: YARN-8320
>                 URL: https://issues.apache.org/jira/browse/YARN-8320
>             Project: Hadoop YARN
>          Issue Type: New Feature
>          Components: nodemanager
>            Reporter: Jiandan Yang 
>            Priority: Major
>         Attachments: CPU-isolation-for-latency-sensitive-services-v1.pdf, 
> CPU-isolation-for-latency-sensitive-services-v2.pdf, YARN-8320.001.patch
>
>
> Currently NodeManager uses “cpu.cfs_period_us”, “cpu.cfs_quota_us” and 
> “cpu.shares” to isolate cpu resource. However,
>  * Linux Completely Fair Scheduling (CFS) is a throughput-oriented scheduler 
> with no support for differentiated latency.
>  * The request latency of services running in containers may fluctuate 
> frequently when all containers share cpus, which latency-sensitive services 
> cannot afford in our production environment.
> So we need more fine-grained cpu isolation.
> Here we propose a solution that uses cgroup cpuset to bind containers to 
> different processors, inspired by the isolation technique in the [Borg 
> system|http://schd.ws/hosted_files/lcccna2016/a7/CAT%20@%20Scale.pdf].



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
