[ https://issues.apache.org/jira/browse/SPARK-4383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Kay Ousterhout resolved SPARK-4383.
-----------------------------------
       Resolution: Fixed
    Fix Version/s: 1.3.0

> Delay scheduling doesn't work right when jobs have tasks with different locality levels
> ---------------------------------------------------------------------------------------
>
>                 Key: SPARK-4383
>                 URL: https://issues.apache.org/jira/browse/SPARK-4383
>             Project: Spark
>          Issue Type: Bug
>          Components: Scheduler
>    Affects Versions: 1.0.2, 1.1.0
>            Reporter: Kay Ousterhout
>             Fix For: 1.3.0
>
>
> Copied from mailing list discussion:
> Our application loads data from HDFS in the same Spark cluster, so the
> loading stage gets NODE_LOCAL and RACK_LOCAL tasks. If all tasks in the
> loading stage have the same locality level, either NODE_LOCAL or
> RACK_LOCAL, it works fine.
> But if the tasks in the loading stage have mixed locality levels, such as
> 3 NODE_LOCAL tasks and 2 RACK_LOCAL tasks, then the TaskSetManager of the
> loading stage submits the 3 NODE_LOCAL tasks as soon as resources are
> offered and then waits for spark.locality.wait.node, which was set to 30
> minutes. The 2 RACK_LOCAL tasks wait 30 minutes even though resources are
> available.
> Fixing this is quite tricky: do we need to track the locality level
> individually for each task?
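
The behaviour described in the report can be reproduced with a toy model. The sketch below is not Spark's TaskSetManager code; it is a simplified simulation that assumes one shared "allowed" locality level per task set, a timer that resets on every launch or escalation, and resources that are always available. The function name `simulate` and the one-second tick are hypothetical choices made for illustration.

```python
# Toy model of delay scheduling with a single shared locality level.
# Assumption: this only illustrates the reported symptom; it is not
# how Spark's TaskSetManager is actually implemented.

LEVELS = ["NODE_LOCAL", "RACK_LOCAL", "ANY"]


def simulate(task_levels, wait_seconds, tick=1):
    """Return {task_index: launch_time_in_seconds} for a toy task set.

    task_levels  -- best locality level each task can achieve
    wait_seconds -- how long to wait before relaxing to the next level
    """
    pending = {i: LEVELS.index(level) for i, level in enumerate(task_levels)}
    launched = {}
    if not pending:
        return launched

    # Start at the most local level present in the task set, so a task set
    # with a single uniform level never has to wait.
    allowed = min(pending.values())
    last_launch = 0
    now = 0
    while pending:
        # Relax the shared level only after the timer has expired, even if
        # no remaining task could ever run at the current level.
        if allowed < len(LEVELS) - 1 and now - last_launch >= wait_seconds:
            allowed += 1
            last_launch = now
        runnable = [i for i, level in pending.items() if level <= allowed]
        for i in runnable:
            launched[i] = now
            del pending[i]
            last_launch = now
        now += tick
    return launched


if __name__ == "__main__":
    mixed = ["NODE_LOCAL"] * 3 + ["RACK_LOCAL"] * 2
    # 30 minutes = 1800 seconds, as in the report.
    print(simulate(mixed, wait_seconds=1800))
    # {0: 0, 1: 0, 2: 0, 3: 1800, 4: 1800}
```

In this model the three NODE_LOCAL tasks start at time 0, while the two RACK_LOCAL tasks start only after the full 1800-second wait, although nothing else is competing for the resources. A task set made up only of RACK_LOCAL tasks starts immediately, which matches the "works fine" case in the report.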