[ https://issues.apache.org/jira/browse/SPARK-9850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16546840#comment-16546840 ]
Michail Giannakopoulos commented on SPARK-9850:
-----------------------------------------------

Hello [~yhuai]! Is anyone currently working on this Epic? In other words, is this work in progress, or has it been determined that it should be stalled? I am asking because I recently logged an issue related to adaptive execution (SPARK-24826). It would be nice to know whether this is being actively worked on, since adaptive execution greatly reduces the number of partitions during shuffles when executing SQL queries (one of the main bottlenecks for Spark). Thanks a lot!

> Adaptive execution in Spark
> ---------------------------
>
>                 Key: SPARK-9850
>                 URL: https://issues.apache.org/jira/browse/SPARK-9850
>             Project: Spark
>          Issue Type: Epic
>          Components: Spark Core, SQL
>            Reporter: Matei Zaharia
>            Assignee: Yin Huai
>            Priority: Major
>         Attachments: AdaptiveExecutionInSpark.pdf
>
>
> Query planning is one of the main factors in high performance, but the
> current Spark engine requires the execution DAG for a job to be set in
> advance. Even with cost-based optimization, it is hard to know the behavior
> of data and user-defined functions well enough to always get great execution
> plans. This JIRA proposes to add adaptive query execution, so that the engine
> can change the plan for each query as it sees what data earlier stages
> produced.
> We propose adding this to Spark SQL / DataFrames first, using a new API in
> the Spark engine that lets libraries run DAGs adaptively. In future JIRAs,
> the functionality could be extended to other libraries or the RDD API, but
> that is more difficult than adding it in SQL.
> I've attached a design doc by Yin Huai and myself explaining how it would
> work in more detail.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
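
As a sketch of how the shuffle-partition coalescing discussed in this thread is configured in Spark releases that later shipped adaptive query execution: the property names below are the Spark 3.x AQE settings, not anything defined in this JIRA or its attached design doc, and the size value is illustrative only.

```properties
# Enable adaptive query execution (AQE): the engine re-optimizes the
# remaining plan between stages using runtime shuffle statistics.
spark.sql.adaptive.enabled                        true

# Merge small post-shuffle partitions at runtime instead of always
# using the static spark.sql.shuffle.partitions count.
spark.sql.adaptive.coalescePartitions.enabled     true

# Advisory target size for coalesced shuffle partitions
# (a hint, not a hard limit; 64m is an illustrative value).
spark.sql.adaptive.advisoryPartitionSizeInBytes   64m
```

With these set, a query whose shuffle produces many near-empty partitions is coalesced down to fewer, larger partitions after the map stage completes, which is the behavior the commenter above is asking about.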