[ https://issues.apache.org/jira/browse/SPARK-39729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17566167#comment-17566167 ]
xiangxiang Shen commented on SPARK-39729:
-----------------------------------------

Hi [~hyukjin.kwon],

I know this config, but it is a static switch. Is there a mechanism that estimates the cost of WholeStageCodegen and automatically disables it when that cost is higher than the cost of the non-codegen path? If not, would it be worthwhile to implement this functionality?

Thanks

> Why generate WholeStagecodegen for single operator?
> ---------------------------------------------------
>
>                 Key: SPARK-39729
>                 URL: https://issues.apache.org/jira/browse/SPARK-39729
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 3.3.0
>            Reporter: xiangxiang Shen
>            Priority: Major
>
> WholeStageCodegen performs better in many cases, but it should not be
> used for a single operator.
> Below is a simple experiment.
> {code:java}
> test("range/filter should be combined") {
>   val df = spark.range(10).filter("id = 1").selectExpr("id + 1")
>   val plan = df.queryExecution.executedPlan
>   assert(plan.find(_.isInstanceOf[WholeStageCodegenExec]).isDefined)
>   assert(df.collect() === Array(Row(2)))
>   df.explain(false)
>   df.queryExecution.debug.codegen
> }{code}
>
> If I add
> {code:java}
> override def supportCodegen: Boolean = false{code}
> to FilterExec, the physical plan is:
> {code:java}
> == Physical Plan ==
> *(2) Project [(id#0L + 1) AS (id + 1)#4L]
> +- Filter (id#0L = 1)
>    +- *(1) Range (0, 10, step=1, splits=2){code}
>
> The performance is not good in this case.
> How can I disable WholeStageCodegen in these cases?
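
For reference, here is a minimal sketch of the static switch mentioned above. It assumes the config in question is `spark.sql.codegen.wholeStage` (the thread does not name it) and that `spark-sql` is on the classpath. The object name `CodegenToggle` and the helper `hasWholeStageCodegen` are hypothetical names used only for illustration. The sketch toggles the switch for the session and checks whether the executed plan of the query from the issue contains a `WholeStageCodegenExec` node. It does not implement any cost-based decision.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.execution.WholeStageCodegenExec

object CodegenToggle {
  // Hypothetical helper: plans the query from the issue with the switch set to
  // `enabled` and reports whether any WholeStageCodegenExec node is present.
  def hasWholeStageCodegen(spark: SparkSession, enabled: Boolean): Boolean = {
    // Assumption: this is the config the comment refers to.
    spark.conf.set("spark.sql.codegen.wholeStage", enabled.toString)
    // Build the DataFrame after setting the config so planning sees the new value.
    val df = spark.range(10).filter("id = 1").selectExpr("id + 1")
    df.queryExecution.executedPlan.find(_.isInstanceOf[WholeStageCodegenExec]).isDefined
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[2]")
      .appName("codegen-toggle")
      .getOrCreate()
    try {
      println(s"switch on:  ${hasWholeStageCodegen(spark, enabled = true)}")
      println(s"switch off: ${hasWholeStageCodegen(spark, enabled = false)}")
    } finally {
      spark.stop()
    }
  }
}
```

The switch applies to every query in the session regardless of how many operators a stage contains, which is why it cannot express "skip codegen only when a stage has a single operator".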