[ https://issues.apache.org/jira/browse/YARN-6289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Huangkaixuan updated YARN-6289:
-------------------------------
    Attachment: Results For Experiment One.docx

> yarn got little data locality
> -----------------------------
>
>                 Key: YARN-6289
>                 URL: https://issues.apache.org/jira/browse/YARN-6289
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: capacity scheduler
>         Environment: Hardware configuration
> CPU: 2 x Intel(R) Xeon(R) E5-2620 v2 @ 2.10GHz /15M Cache 6-Core 12-Thread 
> Memory: 128GB Memory (16x8GB) 1600MHz
> Disk: 600GBx2 3.5-inch with RAID-1
> Network bandwidth: 968Mb/s
> Software configuration
> Spark-1.6.2   Hadoop-2.7.1 
>            Reporter: Huangkaixuan
>            Priority: Minor
>         Attachments: Results For Experiment One.docx
>
>
> When I ran this experiment with both Spark and MapReduce wordcount on the
> file, I noticed that the jobs did not achieve data locality every time. Task
> placement appeared random, even though no other job was running on the
> cluster. I expected the tasks to always be placed on the single machine that
> holds the data block, but that did not happen.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

