[ https://issues.apache.org/jira/browse/SPARK-11460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kay Ousterhout resolved SPARK-11460.
------------------------------------
    Resolution: Duplicate

> Locality waits should be based on task set creation time, not last launch time
> ------------------------------------------------------------------------------
>
>                 Key: SPARK-11460
>                 URL: https://issues.apache.org/jira/browse/SPARK-11460
>             Project: Spark
>          Issue Type: Bug
>          Components: Scheduler
>    Affects Versions: 1.0.0, 1.0.1, 1.0.2, 1.1.0, 1.1.1, 1.2.0, 1.2.1, 1.2.2, 
> 1.3.0, 1.3.1, 1.4.0, 1.4.1, 1.5.0, 1.5.1
>         Environment: YARN
>            Reporter: Shengyue Ji
>   Original Estimate: 2h
>  Remaining Estimate: 2h
>
> Spark waits for the spark.locality.wait period before falling back from RACK_LOCAL to 
> ANY when selecting an executor for assignment. The timeout is essentially 
> reset each time a new assignment is made.
> We were running Spark Streaming on Kafka with a 10-second batch window, 32 
> Kafka partitions, and 16 executors. All executors were in the ANY group. At 
> one point a RACK_LOCAL executor was added and all tasks were assigned to 
> it. Each task took about 0.6 seconds to process, repeatedly resetting the 
> spark.locality.wait timeout (3000ms). This caused the whole process to 
> underutilize resources and created an increasing backlog.
> spark.locality.wait should be based on the task set's creation time, not the last 
> launch time, so that 3000ms after the task set is created, all executors can have 
> tasks assigned to them.
> We are specifying a zero timeout for now as a workaround to disable the locality 
> optimization.
> https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala#L556
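
A minimal, self-contained sketch of the behavior described above, assuming a simplified
model of the scheduler: it contrasts a locality wait measured from the last task launch
(which never expires while short tasks keep launching) with one measured from the task
set's creation time. This is not the actual TaskSetManager code; all names here
(LocalityWaitSketch, canFallBackToAnyCurrent, canFallBackToAnyProposed) are hypothetical.

{code:scala}
object LocalityWaitSketch {

  // spark.locality.wait default: 3000ms
  val localityWaitMs = 3000L

  // Current behavior: the wait is measured from the last task launch, so a
  // steady stream of RACK_LOCAL launches keeps pushing the deadline out.
  def canFallBackToAnyCurrent(lastLaunchTimeMs: Long, nowMs: Long): Boolean =
    nowMs - lastLaunchTimeMs >= localityWaitMs

  // Proposed behavior: the wait is measured from task set creation, so once
  // 3000ms have passed, tasks can go to ANY executor regardless of how
  // recently a task launched.
  def canFallBackToAnyProposed(taskSetCreationTimeMs: Long, nowMs: Long): Boolean =
    nowMs - taskSetCreationTimeMs >= localityWaitMs

  def main(args: Array[String]): Unit = {
    val creationMs   = 0L
    val lastLaunchMs = 9600L   // tasks finishing every ~600ms keep resetting this
    val nowMs        = 10000L
    println(s"current:  ${canFallBackToAnyCurrent(lastLaunchMs, nowMs)}")   // false: timer keeps resetting
    println(s"proposed: ${canFallBackToAnyProposed(creationMs, nowMs)}")    // true: 10s since creation
  }
}
{code}

The workaround mentioned by the reporter corresponds to setting spark.locality.wait to 0
(e.g. --conf spark.locality.wait=0 on spark-submit), which disables the delay-scheduling
wait entirely rather than fixing how the timer is anchored.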


