Matei Zaharia created SPARK-10008:
-------------------------------------

             Summary: Shuffle locality can take precedence over narrow dependencies for RDDs with both
                 Key: SPARK-10008
                 URL: https://issues.apache.org/jira/browse/SPARK-10008
             Project: Spark
          Issue Type: Bug
          Components: Scheduler
            Reporter: Matei Zaharia


The shuffle locality patch made the DAGScheduler aware of shuffle data, but for 
RDDs that have both narrow and shuffle dependencies, it can cause the scheduler 
to place tasks based on the shuffle dependency instead of the narrow one. This 
case is common in iterative join-based algorithms like PageRank and ALS, where 
one RDD is hash-partitioned and one isn't.
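
To make the ordering problem concrete, here is a minimal toy model in plain 
Python. It is not Spark's code: the Dep class, the preferred_locations function, 
the narrow_first flag, and the host names are all hypothetical, invented only to 
illustrate how the order in which dependencies are consulted changes where a 
task is placed.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Dep:
    """Hypothetical stand-in for one dependency of an RDD."""
    kind: str  # "narrow" or "shuffle"
    # Hosts this dependency would prefer for each partition of the child RDD.
    preferred_hosts: Dict[int, List[str]] = field(default_factory=dict)


def preferred_locations(deps: List[Dep], partition: int,
                        narrow_first: bool = True) -> List[str]:
    """Return the hosts of the first dependency kind that has a preference."""
    order = ["narrow", "shuffle"] if narrow_first else ["shuffle", "narrow"]
    for kind in order:
        for dep in deps:
            if dep.kind == kind:
                hosts = dep.preferred_hosts.get(partition, [])
                if hosts:
                    return hosts
    return []


# PageRank-style join (assumed layout): `links` is hash-partitioned and cached,
# so the join reads it through a narrow dependency; `ranks` is not
# co-partitioned, so it arrives through a shuffle dependency.
links_dep = Dep("narrow", {0: ["host-a"], 1: ["host-b"]})
ranks_dep = Dep("shuffle", {0: ["host-c"], 1: ["host-c"]})
deps = [links_dep, ranks_dep]

# Consulting the shuffle dependency first models the behavior reported here.
buggy = preferred_locations(deps, 0, narrow_first=False)
# Consulting the narrow dependency first keeps the task next to the cached data.
intended = preferred_locations(deps, 0, narrow_first=True)
print(buggy, intended)
```

In this sketch, checking the shuffle dependency first sends partition 0 to 
host-c, away from the cached links partition on host-a. Checking the narrow 
dependency first keeps the task on host-a and only falls back to shuffle 
locality when no narrow preference exists.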



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
