[ https://issues.apache.org/jira/browse/SPARK-24342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hyukjin Kwon resolved SPARK-24342.
----------------------------------
    Resolution: Incomplete

> Large Task prior scheduling to Reduce overall execution time
> ------------------------------------------------------------
>
>                 Key: SPARK-24342
>                 URL: https://issues.apache.org/jira/browse/SPARK-24342
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core
>    Affects Versions: 2.3.0
>            Reporter: gao
>            Priority: Major
>              Labels: bulk-closed
>         Attachments: tasktimespan.PNG
>
>
> When running a set of concurrent tasks, the overall execution time can be
> shortened if the relatively large (long-running) tasks are executed before
> the small ones.
> Therefore, Spark should add a feature that schedules the large tasks of a
> task set first and the small tasks afterwards.
> The time span of a task can be estimated from its historical execution
> time: we can assume that a task that took a long time before will take a
> long time again in the future. Record the last execution time of each task
> together with the task's key in a log file, and load the log file at the
> next execution. Using the RangePartitioner and partitioning methods to
> prioritize large tasks can then optimize the execution of concurrent tasks.
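
The idea in the description above can be illustrated with a small simulation. This is only a sketch in plain Python, not Spark code and not the reporter's implementation: the names `makespan`, `order_longest_first`, and `history`, the task keys, and the durations are all hypothetical. It assumes a fixed number of execution slots and a simple rule where each task is handed to whichever slot becomes free first.

```python
import heapq


def makespan(durations, num_slots):
    """Simulate greedy scheduling: each task, taken in the given order,
    runs on the slot that becomes free earliest. Returns the finish time
    of the last task."""
    slots = [0.0] * num_slots  # current finish time of each slot
    heapq.heapify(slots)
    for d in durations:
        earliest = heapq.heappop(slots)
        heapq.heappush(slots, earliest + d)
    return max(slots)


def order_longest_first(task_keys, history, default=0.0):
    """Order task keys by their last recorded execution time, longest first.
    Keys with no recorded time fall back to `default`."""
    return sorted(task_keys, key=lambda k: history.get(k, default), reverse=True)


# Hypothetical history loaded from a previous run: task key -> seconds.
history = {"a": 1.0, "b": 1.0, "c": 1.0, "d": 1.0, "e": 4.0}
submitted = ["a", "b", "c", "d", "e"]  # the long task happens to come last

prioritized = order_longest_first(submitted, history)

fifo_time = makespan([history[k] for k in submitted], num_slots=2)
lpt_time = makespan([history[k] for k in prioritized], num_slots=2)

print("submission order:", fifo_time)     # 6.0
print("longest-first order:", lpt_time)   # 4.0
```

With two slots, running the tasks in submission order leaves the 4-second task for the end, so one slot sits idle while the other finishes it. Starting the long task first lets the four short tasks fill the other slot in the meantime. The gain depends entirely on how well past execution times predict future ones, which is the assumption the description makes.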