[ https://issues.apache.org/jira/browse/YARN-8320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16488899#comment-16488899 ]
Weiwei Yang commented on YARN-8320:
-----------------------------------

Hi [~leftnoteasy]

Thanks for reviewing the design doc, very good comments. Please see my answers below.

bq. the #vcore must be divisible by #physical-core, otherwise it will cause rounding issue and containers will get less/more than requested resources

This is correct. We do need to detect #processors and perform this check when the feature is enabled. Even when this is satisfied on the NM, we still need the calculation introduced in section 3.3: a formula transforms the #vcore a container requested into #processor, and the result can still be a decimal, so we take the floor. A container will still get what it requested in terms of vcores, but when it is bound to processors it may get slightly less. For example, with #vcore = 10 * #processor, a container that requests 33% of a node's vcores might be bound to only 30% of its processors. I think we should explain in a tutorial how to set this up so that people avoid such non-optimal situations.

bq. I'm still trying to understand benefit of RESERVED / SHARED mode

*Same*: Both RESERVED and SHARE modes give an LS (latency-sensitive) service the option to share some of its cpu with LT (latency-tolerant) tasks, aka offline tasks. If the cpu gets busy, the LS service can still get enough cpu time through its *cpu.shares* (an LS service will have a much bigger cpu.shares value than LT tasks). So your comment "RESERVED can be affected by adhoc ANY container" is correct, but since they have different cpu.shares values, it is still under control : ).

*Difference*: SHARE mode gives an LS task the option to share cpu with other LS tasks. This helps when some less latency-sensitive tasks (less so than EXCLUSIVE and RESERVED) can run together to raise cpu utilization. Such LS tasks still want a higher weight than offline tasks (again by using cpu.shares).

bq. Related to NUMA allocation on YARN YARN-5764

Thanks for pointing this out, I was not aware.
I just looked into its design and implementation. I agree that feature is related to this one, but I don't think they are highly coupled. From a rough reading of the code, I think it is possible to proceed in two phases: in the 1st phase, we only support one or the other, so a user can enable either NUMA or cpuset, but not both; in the 2nd phase, we make them work together (by making sure we bind processors under the node that the NUMA allocation selects). I will update the design doc later with some more details.

bq. Related to GPU allocation/resource plugin

We are not going to treat cpuset as any sort of resource, so this is not applicable. Most of the work for this feature will be contained on the NM side as a cgroups cpuset handler. This is more like the NUMA code.

bq. To me only privileged users and applications can request non-ANY CPU mode

Valid suggestion. We can add this to the phase-2 work with a pre-check. I will update the design doc as well.

Thanks

> Support CPU isolation for latency-sensitive (LS) service
> --------------------------------------------------------
>
>                 Key: YARN-8320
>                 URL: https://issues.apache.org/jira/browse/YARN-8320
>             Project: Hadoop YARN
>          Issue Type: New Feature
>          Components: nodemanager
>            Reporter: Jiandan Yang
>            Priority: Major
>         Attachments: CPU-isolation-for-latency-sensitive-services-v1.pdf, CPU-isolation-for-latency-sensitive-services-v2.pdf, YARN-8320.001.patch
>
>
> Currently NodeManager uses “cpu.cfs_period_us”, “cpu.cfs_quota_us” and “cpu.shares” to isolate cpu resource. However,
> * The Linux Completely Fair Scheduler (CFS) is a throughput-oriented scheduler; it has no support for differentiated latency
> * The request latency of services running in containers may fluctuate frequently when all containers share cpus, which latency-sensitive services cannot tolerate in our production environment.
> So we need more fine-grained cpu isolation.
> Here we propose a solution that uses cgroup cpuset to bind containers to different processors. This is inspired by the isolation technique in the [Borg system|http://schd.ws/hosted_files/lcccna2016/a7/CAT%20@%20Scale.pdf].
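
To make the rounding discussed in the comment above concrete, here is a minimal sketch of the vcore-to-processor conversion and of formatting the result for cpuset. It is only an illustration under assumptions: the scaling formula (requested vcores times processors, divided by node vcores, then floored) is a guess at what section 3.3 of the design doc describes, and the function names `processors_for_container` and `cpuset_list` are hypothetical and do not come from the patch.

```python
def processors_for_container(requested_vcores, node_vcores, node_processors):
    """Return how many whole processors a container would be bound to.

    Assumed formula (not taken from the design doc): scale the vcore
    request by the node's processors-per-vcore ratio and take the floor.
    """
    if node_processors <= 0 or node_vcores % node_processors != 0:
        # Mirrors the check discussed above: #vcore must be divisible by #processor.
        raise ValueError("node vcores must be a multiple of the processor count")
    # Integer floor division avoids floating-point rounding surprises.
    return (requested_vcores * node_processors) // node_vcores


def cpuset_list(processor_ids):
    """Format processor ids as a comma-separated list, as accepted by cpuset.cpus."""
    return ",".join(str(p) for p in sorted(processor_ids))


# Node with 10 processors and #vcore = 10 * #processor = 100 vcores.
# A container asking for 33 vcores (33% of the node) maps to 3.3 processors,
# which floors to 3, i.e. 30% of the node's processors.
count = processors_for_container(33, node_vcores=100, node_processors=10)
print(count)  # 3

# Hypothetical choice of processors: simply take the first `count` free ones.
free_processors = [4, 5, 6, 7]
print(cpuset_list(free_processors[:count]))  # 4,5,6
```

Requests that are a multiple of the vcores-per-processor ratio (10 in this example) convert without any loss, which is the kind of setup guidance a tutorial could give.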