[ https://issues.apache.org/jira/browse/SPARK-33798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17252186#comment-17252186 ]
Yuming Wang edited comment on SPARK-33798 at 12/19/20, 2:01 PM:
----------------------------------------------------------------

{noformat}
22:38:12.823 WARN org.apache.spark.sql.TPCDSQuerySuite:
=== Metrics of Analyzer/Optimizer Rules ===
Total number of runs: 244581
Total time: 119.050431411 seconds

Rule                                                                       Effective Time / Total Time  Effective Runs / Total Runs
org.apache.spark.sql.catalyst.optimizer.Optimizer$OptimizeSubqueries       16067549156 / 20348086725    47 / 772
org.apache.spark.sql.catalyst.optimizer.ColumnPruning                      1667188964 / 7908667409      328 / 2383
org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveAggregateFunctions  1621072191 / 4292026876      49 / 2166
org.apache.spark.sql.catalyst.analysis.Analyzer$AddMetadataColumns         0 / 4286062022               0 / 2176
org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveSubquery            3605338420 / 3759423596      51 / 2166
org.apache.spark.sql.catalyst.analysis.DecimalPrecision                    2363988148 / 2622889221      361 / 2166
org.apache.spark.sql.catalyst.optimizer.PruneFilters                       40232441 / 2586390541        5 / 1997
org.apache.spark.sql.catalyst.optimizer.PushDownPredicates                 967563396 / 2014635982       767 / 2390
org.apache.spark.sql.catalyst.optimizer.BooleanSimplification              10488612 / 1928073089        4 / 1611
org.apache.spark.sql.catalyst.optimizer.ReorderJoin                        827478197 / 1877711922       177 / 1611
org.apache.spark.sql.catalyst.optimizer.RemoveNoopOperators                155445231 / 1706822650       116 / 2383
org.apache.spark.sql.catalyst.optimizer.NullPropagation                    108945486 / 1531470853       59 / 1611
org.apache.spark.sql.catalyst.optimizer.OptimizeJsonExprs                  0 / 1484595419               0 / 1611
org.apache.spark.sql.catalyst.optimizer.CollapseProject                    251370336 / 1450991269       220 / 1997
org.apache.spark.sql.catalyst.optimizer.SimplifyBinaryComparison           498412 / 1441505196          1 / 1611
org.apache.spark.sql.catalyst.optimizer.SimplifyCaseConversionExpressions  0 / 1441502129               0 / 1611
org.apache.spark.sql.catalyst.optimizer.ConstantFolding                    258015371 / 1435436578       197 / 1611
org.apache.spark.sql.catalyst.optimizer.PushFoldableIntoBranches           16866331 / 1427302659        19 / 1611
{noformat}


was (Author: q79969786):
{noformat}
=== Metrics of Analyzer/Optimizer Rules ===
Total number of runs: 244581
Total time: 119.050431411 seconds

Rule                                                                       Effective Time / Total Time  Effective Runs / Total Runs
org.apache.spark.sql.catalyst.optimizer.Optimizer$OptimizeSubqueries       16067549156 / 20348086725    47 / 772
org.apache.spark.sql.catalyst.optimizer.ColumnPruning                      1667188964 / 7908667409      328 / 2383
org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveAggregateFunctions  1621072191 / 4292026876      49 / 2166
org.apache.spark.sql.catalyst.analysis.Analyzer$AddMetadataColumns         0 / 4286062022               0 / 2176
org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveSubquery            3605338420 / 3759423596      51 / 2166
org.apache.spark.sql.catalyst.analysis.DecimalPrecision                    2363988148 / 2622889221      361 / 2166
org.apache.spark.sql.catalyst.optimizer.PruneFilters                       40232441 / 2586390541        5 / 1997
org.apache.spark.sql.catalyst.optimizer.PushDownPredicates                 967563396 / 2014635982       767 / 2390
org.apache.spark.sql.catalyst.optimizer.BooleanSimplification              10488612 / 1928073089        4 / 1611
org.apache.spark.sql.catalyst.optimizer.ReorderJoin                        827478197 / 1877711922       177 / 1611
org.apache.spark.sql.catalyst.optimizer.RemoveNoopOperators                155445231 / 1706822650       116 / 2383
org.apache.spark.sql.catalyst.optimizer.NullPropagation                    108945486 / 1531470853       59 / 1611
org.apache.spark.sql.catalyst.optimizer.OptimizeJsonExprs                  0 / 1484595419               0 / 1611
org.apache.spark.sql.catalyst.optimizer.CollapseProject                    251370336 / 1450991269       220 / 1997
org.apache.spark.sql.catalyst.optimizer.SimplifyBinaryComparison           498412 / 1441505196          1 / 1611
org.apache.spark.sql.catalyst.optimizer.SimplifyCaseConversionExpressions  0 / 1441502129               0 / 1611
org.apache.spark.sql.catalyst.optimizer.ConstantFolding                    258015371 / 1435436578       197 / 1611
org.apache.spark.sql.catalyst.optimizer.PushFoldableIntoBranches           16866331 / 1427302659        19 / 1611
{noformat}

>
> Simplify EqualTo(CaseWhen/If, Literal) always false
> ---------------------------------------------------
>
>                 Key: SPARK-33798
>                 URL: https://issues.apache.org/jira/browse/SPARK-33798
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 3.2.0
>            Reporter: Yuming Wang
>            Assignee: Yuming Wang
>            Priority: Major
>             Fix For: 3.2.0
>
>
> Simplify CaseWhen/If with EqualTo if all values are Literal and the comparison is always false. This is a real case from production:
> {code:sql}
> create table t1 using parquet as select * from range(100);
> create table t2 using parquet as select * from range(200);
> create temp view v1 as
> select 'a' as event_type, * from t1
> union all
> select CASE WHEN id = 1 THEN 'b' WHEN id = 3 THEN 'c' end as event_type, * from t2;
> explain select * from v1 where event_type = 'a';
> {code}
> Before this PR:
> {noformat}
> == Physical Plan ==
> Union
> :- *(1) Project [a AS event_type#30533, id#30535L]
> :  +- *(1) ColumnarToRow
> :     +- FileScan parquet default.t1[id#30535L] Batched: true, DataFilters: [], Format: Parquet
> +- *(2) Project [CASE WHEN (id#30536L = 1) THEN b WHEN (id#30536L = 3) THEN c END AS event_type#30534, id#30536L]
>    +- *(2) Filter (CASE WHEN (id#30536L = 1) THEN b WHEN (id#30536L = 3) THEN c END = a)
>       +- *(2) ColumnarToRow
>          +- FileScan parquet default.t2[id#30536L] Batched: true, DataFilters: [(CASE WHEN (id#30536L = 1) THEN b WHEN (id#30536L = 3) THEN c END = a)], Format: Parquet
> {noformat}
> After this PR:
> {noformat}
> == Physical Plan ==
> *(1) Project [a AS event_type#8, id#4L]
> +- *(1) ColumnarToRow
>    +- FileScan parquet default.t1[id#4L] Batched: true, DataFilters: [], Format: Parquet
> {noformat}



--
This message was sent by Atlassian Jira (v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org