Wang, Gang created SPARK-26375:
----------------------------------

             Summary: Rule PruneFileSourcePartitions should be fired before any 
other rules based on data size
                 Key: SPARK-26375
                 URL: https://issues.apache.org/jira/browse/SPARK-26375
             Project: Spark
          Issue Type: Improvement
          Components: Optimizer
    Affects Versions: 2.3.0
            Reporter: Wang, Gang


In Catalyst, some optimizer rules depend on table statistics, such as 
ReorderJoin (which detects star schemas) and CostBasedJoinReorder. For these 
rules, accurate statistics are crucial. Currently, however, all of these rules 
are fired before partition pruning, so they may work from inaccurate 
statistics, presumably ones that still describe the whole table rather than 
only the partitions that survive pruning.
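
To illustrate the concern, here is a toy model in plain Python. It is not 
Spark code and does not reflect how Catalyst computes statistics; the table 
names, partition sizes, and helper functions are all hypothetical. It only 
shows how a size-based decision can flip depending on whether sizes are 
estimated before or after partition pruning.

```python
from dataclasses import dataclass


@dataclass
class Partition:
    key: str
    size_bytes: int


@dataclass
class Table:
    name: str
    partitions: list


def estimated_size(table, predicate=None):
    """Sum partition sizes; with no predicate this is the whole-table estimate."""
    return sum(
        p.size_bytes
        for p in table.partitions
        if predicate is None or predicate(p)
    )


def smaller_side(sizes):
    """Return the name of the relation with the smallest estimated size."""
    return min(sizes, key=sizes.get)


# Hypothetical tables: 30 daily partitions of 10 MB each, and one 50 MB table.
sales = Table("sales", [Partition(f"2018-12-{d:02d}", 10_000_000) for d in range(1, 31)])
stores = Table("stores", [Partition("all", 50_000_000)])


def one_day(p):
    # Hypothetical partition filter that keeps a single day of sales.
    return p.key == "2018-12-01"


# Estimates taken before pruning see the whole sales table.
before = {"sales": estimated_size(sales), "stores": estimated_size(stores)}
# Estimates taken after pruning see only the surviving partition.
after = {"sales": estimated_size(sales, one_day), "stores": estimated_size(stores)}

print(smaller_side(before))  # stores
print(smaller_side(after))   # sales
```

In this sketch, a rule that picks the smaller relation would choose 
differently depending on when it runs relative to pruning, which is the 
kind of discrepancy this issue is about.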



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
