Thomas Graves created SPARK-30831:
-------------------------------------

             Summary: Executors UI shows more active tasks than possible
                 Key: SPARK-30831
                 URL: https://issues.apache.org/jira/browse/SPARK-30831
             Project: Spark
          Issue Type: Bug
          Components: Web UI
    Affects Versions: 3.0.0
            Reporter: Thomas Graves

I regularly see the executors web UI showing more active tasks than the executor has cores. Looking at the code, it seems that we track those separately, and the message that is sent for task end is asynchronous and thus shows up at the UI much later than the start event. On statusUpdate, CoarseGrainedSchedulerBackend increases freeCores, which then allows the scheduler to assign another task, but the taskEndEvent is asynchronous. We definitely don't want to slow down the scheduling part, so I'm not sure how easy it will be to improve.

To reproduce, I just ran:

    val df = sc.makeRDD(1 to 10000000, 6).toDF
    val df2 = sc.makeRDD(1 to 10000000, 6).toDF
    spark.time(df.select($"value" as "a").join(df2.select($"value" as "b"), $"a" === $"b").write.mode("overwrite").csv("somefile"))

Then view the executors UI page. I started spark-shell with just 1 core per executor, and you see 2 active tasks.
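
The ordering problem described in this report (cores are freed right away on a status update, while the task-end event reaches the UI later) can be illustrated with a small toy model. This is only a sketch and is not Spark's code: `ActiveTaskSkew`, `Model`, `launch`, `finish`, `drainEnds`, and `simulate` are hypothetical names, and the queue of pending end events is an assumed stand-in for the asynchronous delivery path.

```scala
import scala.collection.mutable

// Toy model (not Spark code): the scheduler frees a core as soon as a task
// reports completion, but the UI only learns about the end event later.
object ActiveTaskSkew {
  sealed trait Event
  final case class TaskStart(taskId: Int) extends Event
  final case class TaskEnd(taskId: Int) extends Event

  final class Model(cores: Int) {
    private var freeCores = cores
    // End events wait here until drained, standing in for asynchronous delivery.
    private val pendingEnds = mutable.Queue.empty[TaskEnd]
    var uiActive = 0 // active-task count as the UI would display it
    var uiPeak = 0   // highest value the UI ever displayed

    private def uiHandle(e: Event): Unit = {
      e match {
        case TaskStart(_) => uiActive += 1
        case TaskEnd(_)   => uiActive -= 1
      }
      uiPeak = math.max(uiPeak, uiActive)
    }

    // Launching takes a core; the start event reaches the UI promptly.
    def launch(taskId: Int): Unit = {
      require(freeCores > 0, "no free core")
      freeCores -= 1
      uiHandle(TaskStart(taskId))
    }

    // Completion frees the core immediately, but the end event is only queued.
    def finish(taskId: Int): Unit = {
      freeCores += 1
      pendingEnds.enqueue(TaskEnd(taskId))
    }

    // The UI eventually receives the delayed end events.
    def drainEnds(): Unit =
      while (pendingEnds.nonEmpty) uiHandle(pendingEnds.dequeue())
  }

  // Runs two waves of tasks; returns (peak UI count, UI count after draining).
  def simulate(cores: Int): (Int, Int) = {
    val m = new Model(cores)
    (1 to cores).foreach(m.launch)             // first wave fills every core
    (1 to cores).foreach(m.finish)             // cores freed, end events delayed
    (cores + 1 to 2 * cores).foreach(m.launch) // second wave starts at once
    (cores + 1 to 2 * cores).foreach(m.finish)
    m.drainEnds()
    (m.uiPeak, m.uiActive)
  }

  def main(args: Array[String]): Unit = {
    val (peak, finalActive) = simulate(cores = 1)
    // prints "peak UI active tasks = 2, after draining = 0"
    println(s"peak UI active tasks = $peak, after draining = $finalActive")
  }
}
```

With 1 core the model's UI count peaks at 2, matching the observation above, and it returns to 0 once the delayed end events arrive. The scheduling side (`freeCores`) never exceeds its limit; only the UI-side count is temporarily inflated.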