[ https://issues.apache.org/jira/browse/YARN-5936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15762998#comment-15762998 ]
Miklos Szegedi commented on YARN-5936:
--------------------------------------

Indeed, I see about a 6% performance loss in the second case above compared to the first. That is the case where I move away from the root cgroup and use the hierarchy but do not apply the limit yet. It happens when I run 10 processes in different cgroups with 100 threads each.

> when cpu strict mode is closed, yarn couldn't assure scheduling fairness between containers
> -------------------------------------------------------------------------------------------
>
>                 Key: YARN-5936
>                 URL: https://issues.apache.org/jira/browse/YARN-5936
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>    Affects Versions: 2.7.1
>         Environment: CentOS7.1
>            Reporter: zhengchenyu
>            Priority: Critical
>             Fix For: 2.7.1
>
>   Original Estimate: 1m
>  Remaining Estimate: 1m
>
> When using LinuxContainerExecutor, setting
> "yarn.nodemanager.linux-container-executor.cgroups.strict-resource-usage" to
> true ensures scheduling fairness by applying the cgroup CPU bandwidth limit.
> In our experience, however, the CPU bandwidth limit leads to poor performance.
> Without it, the cgroup's cpu.shares is our only way to ensure scheduling
> fairness, but it is not completely effective. For example, take two
> containers with the same vcores (and therefore the same cpu.shares), one
> single-threaded and the other multi-threaded. The multi-threaded container
> gets more CPU time, which is unreasonable!
> Here is my test case. I submitted two distributedshell applications.
> The two commands are below:
> {code}
> hadoop jar share/hadoop/yarn/hadoop-yarn-applications-distributedshell-2.7.1.jar \
>   org.apache.hadoop.yarn.applications.distributedshell.Client \
>   -jar share/hadoop/yarn/hadoop-yarn-applications-distributedshell-2.7.1.jar \
>   -shell_script ./run.sh -shell_args 10 -num_containers 1 \
>   -container_memory 1024 -container_vcores 1 \
>   -master_memory 1024 -master_vcores 1 -priority 10
> hadoop jar share/hadoop/yarn/hadoop-yarn-applications-distributedshell-2.7.1.jar \
>   org.apache.hadoop.yarn.applications.distributedshell.Client \
>   -jar share/hadoop/yarn/hadoop-yarn-applications-distributedshell-2.7.1.jar \
>   -shell_script ./run.sh -shell_args 1 -num_containers 1 \
>   -container_memory 1024 -container_vcores 1 \
>   -master_memory 1024 -master_vcores 1 -priority 10
> {code}
> Here is the CPU usage of the two containers, as shown by top:
> {code}
>   PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
> 15448 yarn      20   0 9059592  28336   9180 S 998.7  0.1  24:09.30 java
> 15026 yarn      20   0 9050340  27480   9188 S 100.0  0.1   3:33.97 java
> 13767 yarn      20   0 1799816 381208  18528 S   4.6  1.2   0:30.55 java
>    77 root      rt   0       0      0      0 S   0.3  0.0   0:00.74 migration/1
> {code}
> We find that the multi-threaded container gets about ten times the CPU of
> the single-threaded one, though the two containers have the same cpu.shares.
> Notes:
> run.sh
> {code}
> java -cp /home/yarn/loop.jar:$CLASSPATH loop.loop $1
> {code}
> loop.java
> {code}
> package loop;
>
> public class loop {
>     public static void main(String[] args) {
>         // Number of busy threads to start; defaults to 1.
>         int loop = 1;
>         if (args.length >= 1) {
>             System.out.println(args[0]);
>             loop = Integer.parseInt(args[0]);
>         }
>         for (int i = 0; i < loop; i++) {
>             System.out.println("start thread " + i);
>             new Thread(new Runnable() {
>                 @Override
>                 public void run() {
>                     // Spin forever so this thread keeps one core busy.
>                     int j = 0;
>                     while (true) { j++; }
>                 }
>             }).start();
>         }
>     }
> }
> {code}
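
The observation in the quoted report can be reproduced on paper with a simplified model of share-based scheduling. The sketch below is neither the kernel's CFS algorithm nor YARN code. It only assumes that CPU is divided in proportion to cpu.shares among the cgroups that still want more, and that a cgroup can never use more cores than it has runnable threads. The class name ShareModel, the 32-core, 4-core and 1-core hosts, and the value 1024 for each container's cpu.shares are assumptions made for illustration; the report does not state the core count of the test machine.

```java
import java.util.Arrays;

/** Simplified, hypothetical model of share-based CPU allocation (not the real CFS). */
final class ShareModel {

    /**
     * Splits `cores` CPUs among groups in proportion to `shares`, capping each
     * group at `demand` (its number of busy threads) and handing whatever a
     * capped group leaves unused to the groups that still want more.
     */
    static double[] allocate(double cores, double[] shares, double[] demand) {
        int n = shares.length;
        double[] alloc = new double[n];
        boolean[] done = new boolean[n];
        double remaining = cores;

        while (remaining > 0) {
            // Total shares of the groups that are still competing.
            double activeShares = 0;
            for (int i = 0; i < n; i++) {
                if (!done[i]) activeShares += shares[i];
            }
            if (activeShares == 0) break; // every group already has all it can use

            // Groups whose demand fits inside their proportional slice get
            // exactly their demand and stop competing.
            boolean anyCapped = false;
            double given = 0;
            for (int i = 0; i < n; i++) {
                if (done[i]) continue;
                double slice = remaining * shares[i] / activeShares;
                if (demand[i] <= slice) {
                    alloc[i] = demand[i];
                    done[i] = true;
                    given += demand[i];
                    anyCapped = true;
                }
            }

            if (!anyCapped) {
                // Everyone left wants more than its slice: split proportionally.
                for (int i = 0; i < n; i++) {
                    if (!done[i]) alloc[i] = remaining * shares[i] / activeShares;
                }
                break;
            }
            remaining -= given;
        }
        return alloc;
    }

    public static void main(String[] args) {
        double[] shares = {1024, 1024}; // assumed: equal cpu.shares for both containers
        double[] demand = {10, 1};      // busy threads: -shell_args 10 and -shell_args 1

        // Assumed 32-core host: no contention, so shares never come into play.
        System.out.println(Arrays.toString(allocate(32, shares, demand))); // [10.0, 1.0]
        // Assumed 4-core host: the single thread still cannot use more than one core.
        System.out.println(Arrays.toString(allocate(4, shares, demand)));  // [3.0, 1.0]
        // Assumed 1-core host: both groups are CPU-starved, so equal shares split evenly.
        System.out.println(Arrays.toString(allocate(1, shares, demand)));  // [0.5, 0.5]
    }
}
```

Under this model, equal shares equalize CPU only when every group could consume its full proportional slice. A single-threaded container is capped at one core however large its share is, which is consistent with the roughly 1000% versus 100% split in the top output above.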