[jira] [Commented] (YARN-5936) when cpu strict mode is closed, yarn couldn't assure scheduling fairness between containers

Miklos Szegedi (JIRA) Mon, 13 Nov 2017 11:14:25 -0800

    [ 
https://issues.apache.org/jira/browse/YARN-5936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16250021#comment-16250021
 ]


Miklos Szegedi commented on YARN-5936:
--------------------------------------

Another option for the future is to use the cgroup pids subsystem on newer 
kernels. The main reason fairness is not enforced in non-strict mode is that 
a container can run as many threads as it needs, all within the same cgroup 
and weight. You can limit the number of threads with the pids subsystem, so 
that the effective overall container weight becomes weight*pids_limit. The 
drawback of this approach is that it limits multitasking and the number of 
launcher processes. A possible ideal value of pids_limit is <number of 
cores>/<desired thread count>, so that we do not starve single-threaded 
containers.
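
To make the idea concrete, here is a minimal Java sketch, not a proposed patch. It assumes the pids controller is available and that the container's pids cgroup directory is passed in by the caller; the class name PidsLimitSketch and both helper methods are hypothetical and not part of YARN. The only kernel interface it relies on is the pids.max file of the pids controller. The ceiling helper just encodes the fact that one busy task can occupy at most one core.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper for illustration only; not part of the NodeManager.
public class PidsLimitSketch {

    // Upper bound on a container's CPU usage in top-style percent:
    // each task (thread or process) can keep at most one core busy.
    static int cpuCeilingPercent(int pidsLimit, int cores) {
        return Math.min(pidsLimit, cores) * 100;
    }

    // Writes the task limit to pids.max inside the given cgroup directory.
    // The directory is an assumption supplied by the caller.
    static void setPidsLimit(Path cgroupDir, int pidsLimit) throws IOException {
        Files.write(cgroupDir.resolve("pids.max"),
                Integer.toString(pidsLimit).getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.err.println("usage: PidsLimitSketch <cgroup dir> <pids limit>");
            return;
        }
        int limit = Integer.parseInt(args[1]);
        setPidsLimit(Paths.get(args[0]), limit);
        int cores = Runtime.getRuntime().availableProcessors();
        System.out.println("CPU ceiling: " + cpuCeilingPercent(limit, cores) + "%");
    }
}
```

Note that the limit counts every task in the cgroup, so the JVM's own service threads (GC, JIT compiler and so on) consume part of it. This is one form of the multitasking drawback mentioned above.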

> when cpu strict mode is closed, yarn couldn't assure scheduling fairness 
> between containers
> -------------------------------------------------------------------------------------------
>
>                 Key: YARN-5936
>                 URL: https://issues.apache.org/jira/browse/YARN-5936
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>    Affects Versions: 2.7.1
>         Environment: CentOS7.1
>            Reporter: zhengchenyu
>            Priority: Critical
>   Original Estimate: 1m
>  Remaining Estimate: 1m
>
> When using LinuxContainer, setting 
> "yarn.nodemanager.linux-container-executor.cgroups.strict-resource-usage" to 
> true can assure scheduling fairness through the cgroup CPU bandwidth limit. 
> But in our experience the cgroup CPU bandwidth limit leads to bad performance. 
>     Without the cgroup CPU bandwidth limit, the cgroup's cpu.shares is our 
> only way to assure scheduling fairness, but it is not completely effective. 
> For example, suppose two containers have the same vcores (and therefore the 
> same cpu.shares); one container is single-threaded and the other is 
> multi-threaded. The multi-threaded container gets more CPU time, which is 
> unreasonable.
>     Here is my test case. I submit two distributedshell applications with 
> the two commands below:
> {code}
> hadoop jar 
> share/hadoop/yarn/hadoop-yarn-applications-distributedshell-2.7.1.jar 
> org.apache.hadoop.yarn.applications.distributedshell.Client -jar 
> share/hadoop/yarn/hadoop-yarn-applications-distributedshell-2.7.1.jar 
> -shell_script ./run.sh  -shell_args 10 -num_containers 1 -container_memory 
> 1024 -container_vcores 1 -master_memory 1024 -master_vcores 1 -priority 10
> hadoop jar 
> share/hadoop/yarn/hadoop-yarn-applications-distributedshell-2.7.1.jar 
> org.apache.hadoop.yarn.applications.distributedshell.Client -jar 
> share/hadoop/yarn/hadoop-yarn-applications-distributedshell-2.7.1.jar 
> -shell_script ./run.sh  -shell_args 1  -num_containers 1 -container_memory 
> 1024 -container_vcores 1 -master_memory 1024 -master_vcores 1 -priority 10
> {code}
>      Here is the CPU time of the two containers:
> {code}
>   PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
> 15448 yarn      20   0 9059592  28336   9180 S 998.7  0.1  24:09.30 java
> 15026 yarn      20   0 9050340  27480   9188 S 100.0  0.1   3:33.97 java
> 13767 yarn      20   0 1799816 381208  18528 S   4.6  1.2   0:30.55 java
>    77 root      rt   0       0      0      0 S   0.3  0.0   0:00.74 migration/1
> {code}
>     We find that the CPU time of the multi-threaded container is ten times 
> that of the single-threaded container, though the two containers have the 
> same cpu.shares.
> Notes:
> run.sh
> {code} 
>       java -cp /home/yarn/loop.jar:$CLASSPATH loop.loop $1    
> {code} 
> loop.java
> {code} 
> package loop;
> public class loop {
>       public static void main(String[] args) {
>               // Number of busy threads to start; defaults to 1.
>               int loop = 1;
>               if(args.length>=1) {
>                       System.out.println(args[0]);
>                       loop = Integer.parseInt(args[0]);
>               }
>               for(int i=0;i<loop;i++){
>                       System.out.println("start thread " + i);
>                       new Thread(new Runnable() {
>                               @Override
>                               public void run() {
>                                       // Spin forever so the thread keeps one core busy.
>                                       int j=0;
>                                       while(true){j++;}
>                               }
>                       }).start();
>               }
>       }
> }
> {code}



