Huangkaixuan created YARN-6289:
----------------------------------

             Summary: yarn got little data locality
                 Key: YARN-6289
                 URL: https://issues.apache.org/jira/browse/YARN-6289
             Project: Hadoop YARN
          Issue Type: Improvement
          Components: capacity scheduler
         Environment: Hardware configuration
CPU: 2 x Intel(R) Xeon(R) E5-2620 v2 @ 2.10GHz /15M Cache 6-Core 12-Thread
Memory: 128GB Memory (16x8GB) 1600MHz
Disk: 600GBx2 3.5-inch with RAID-1
Network bandwidth: 968Mb/s

Software configuration
Spark-1.6.2 Hadoop-2.7.1
            Reporter: Huangkaixuan
            Priority: Minor


When I ran this experiment with both Spark and MapReduce wordcount on the file, I noticed that the jobs did not achieve data locality every time. Task placement appeared random, even though no other job was running on the cluster. I expected the tasks to always be placed on the single machine that holds the data block, but that did not happen.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org