[ 
https://issues.apache.org/jira/browse/SPARK-33798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17252186#comment-17252186
 ] 

Yuming Wang edited comment on SPARK-33798 at 12/19/20, 2:01 PM:
----------------------------------------------------------------

{noformat}
22:38:12.823 WARN org.apache.spark.sql.TPCDSQuerySuite: 
=== Metrics of Analyzer/Optimizer Rules ===
Total number of runs: 244581
Total time: 119.050431411 seconds

Rule                                                                       Effective Time / Total Time   Effective Runs / Total Runs

org.apache.spark.sql.catalyst.optimizer.Optimizer$OptimizeSubqueries       16067549156 / 20348086725     47 / 772
org.apache.spark.sql.catalyst.optimizer.ColumnPruning                      1667188964 / 7908667409       328 / 2383
org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveAggregateFunctions  1621072191 / 4292026876       49 / 2166
org.apache.spark.sql.catalyst.analysis.Analyzer$AddMetadataColumns         0 / 4286062022                0 / 2176
org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveSubquery            3605338420 / 3759423596       51 / 2166
org.apache.spark.sql.catalyst.analysis.DecimalPrecision                    2363988148 / 2622889221       361 / 2166
org.apache.spark.sql.catalyst.optimizer.PruneFilters                       40232441 / 2586390541         5 / 1997
org.apache.spark.sql.catalyst.optimizer.PushDownPredicates                 967563396 / 2014635982        767 / 2390
org.apache.spark.sql.catalyst.optimizer.BooleanSimplification              10488612 / 1928073089         4 / 1611
org.apache.spark.sql.catalyst.optimizer.ReorderJoin                        827478197 / 1877711922        177 / 1611
org.apache.spark.sql.catalyst.optimizer.RemoveNoopOperators                155445231 / 1706822650        116 / 2383
org.apache.spark.sql.catalyst.optimizer.NullPropagation                    108945486 / 1531470853        59 / 1611
org.apache.spark.sql.catalyst.optimizer.OptimizeJsonExprs                  0 / 1484595419                0 / 1611
org.apache.spark.sql.catalyst.optimizer.CollapseProject                    251370336 / 1450991269        220 / 1997
org.apache.spark.sql.catalyst.optimizer.SimplifyBinaryComparison           498412 / 1441505196           1 / 1611
org.apache.spark.sql.catalyst.optimizer.SimplifyCaseConversionExpressions  0 / 1441502129                0 / 1611
org.apache.spark.sql.catalyst.optimizer.ConstantFolding                    258015371 / 1435436578        197 / 1611
org.apache.spark.sql.catalyst.optimizer.PushFoldableIntoBranches           16866331 / 1427302659         19 / 1611
{noformat}





> Simplify EqualTo(CaseWhen/If, Literal) always false
> ---------------------------------------------------
>
>                 Key: SPARK-33798
>                 URL: https://issues.apache.org/jira/browse/SPARK-33798
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 3.2.0
>            Reporter: Yuming Wang
>            Assignee: Yuming Wang
>            Priority: Major
>             Fix For: 3.2.0
>
>
> Simplify EqualTo over CaseWhen/If when all branch values are Literals and the 
> comparison is always false. This is a real case from production:
> {code:sql}
> create table t1 using parquet as select * from range(100);
> create table t2 using parquet as select * from range(200);
> create temp view v1 as
> select 'a' as event_type, * from t1
> union all
> select CASE WHEN id = 1 THEN 'b' WHEN id = 3 THEN 'c' end as event_type, * from t2;
> explain select * from v1 where event_type = 'a';
> {code}
> Before this PR:
> {noformat}
> == Physical Plan ==
> Union
> :- *(1) Project [a AS event_type#30533, id#30535L]
> :  +- *(1) ColumnarToRow
> :     +- FileScan parquet default.t1[id#30535L] Batched: true, DataFilters: [], Format: Parquet
> +- *(2) Project [CASE WHEN (id#30536L = 1) THEN b WHEN (id#30536L = 3) THEN c END AS event_type#30534, id#30536L]
>    +- *(2) Filter (CASE WHEN (id#30536L = 1) THEN b WHEN (id#30536L = 3) THEN c END = a)
>       +- *(2) ColumnarToRow
>          +- FileScan parquet default.t2[id#30536L] Batched: true, DataFilters: [(CASE WHEN (id#30536L = 1) THEN b WHEN (id#30536L = 3) THEN c END = a)], Format: Parquet
> {noformat}
> After this PR:
> {noformat}
> == Physical Plan ==
> *(1) Project [a AS event_type#8, id#4L]
> +- *(1) ColumnarToRow
>    +- FileScan parquet default.t1[id#4L] Batched: true, DataFilters: [], Format: Parquet
> {noformat}
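
The idea in the description can be sketched with a toy expression tree. This is an illustrative Python model, not Spark's Catalyst code: the classes Literal, Column, EqualTo, CaseWhen and If, and the function simplify_filter_predicate, are hypothetical stand-ins. It assumes the predicate is a filter condition, where a NULL result drops the row just as false does, so "never true" can be rewritten to false. Outside a filter the result might be NULL rather than false, and this sketch does not cover that case.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Literal:
    value: object  # None models SQL NULL


@dataclass(frozen=True)
class Column:
    name: str


@dataclass(frozen=True)
class EqualTo:
    left: object
    right: object


@dataclass(frozen=True)
class CaseWhen:
    branches: Tuple[Tuple[object, object], ...]  # (condition, value) pairs
    else_value: Optional[object] = None          # a missing ELSE yields NULL


@dataclass(frozen=True)
class If:
    predicate: object
    true_value: object
    false_value: object


def branch_values(expr):
    """Return every value a CaseWhen/If can produce, or None for other nodes."""
    if isinstance(expr, CaseWhen):
        values = [value for _, value in expr.branches]
        values.append(expr.else_value if expr.else_value is not None else Literal(None))
        return values
    if isinstance(expr, If):
        return [expr.true_value, expr.false_value]
    return None


def simplify_filter_predicate(pred):
    """Rewrite EqualTo(CaseWhen/If, Literal) to Literal(False) when it can never be true.

    Assumption: pred is used as a filter condition, so NULL behaves like false.
    """
    if not isinstance(pred, EqualTo) or not isinstance(pred.right, Literal):
        return pred
    target = pred.right
    values = branch_values(pred.left)
    if values is None or target.value is None:
        return pred
    # Every branch must be a literal that is NULL or differs from the target.
    if all(isinstance(v, Literal) and (v.value is None or v.value != target.value)
           for v in values):
        return Literal(False)
    return pred


# Models: CASE WHEN id = 1 THEN 'b' WHEN id = 3 THEN 'c' END = 'a'
event_type = CaseWhen((
    (EqualTo(Column("id"), Literal(1)), Literal("b")),
    (EqualTo(Column("id"), Literal(3)), Literal("c")),
))
simplified = simplify_filter_predicate(EqualTo(event_type, Literal("a")))
```

With the t2 branch from the example above, simplified is Literal(False), which is consistent with that side of the Union disappearing from the plan after the change. A predicate such as event_type = 'b' is left untouched, because one branch can produce 'b'.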


