Wang, Gang created SPARK-26375:
----------------------------------

             Summary: Rule PruneFileSourcePartitions should be fired before any other rules based on data size
                 Key: SPARK-26375
                 URL: https://issues.apache.org/jira/browse/SPARK-26375
             Project: Spark
          Issue Type: Improvement
          Components: Optimizer
    Affects Versions: 2.3.0
            Reporter: Wang, Gang

In Catalyst, some optimizer rules are based on table statistics, such as ReorderJoin (in which star schemas are detected) and CostBasedJoinReorder. For these rules, the accuracy of the statistics is crucial. Currently, however, all of these rules are fired before partition pruning, so they may work from inaccurate statistics: the sizes they see still describe the unpruned table rather than the partitions that will actually be read.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
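
To make the ordering problem in the description concrete, here is a minimal toy sketch in plain Python. It is not Spark code and does not use any Catalyst API: `Partition`, `Table`, `prune`, and `smaller_side` are hypothetical names invented for illustration, and the sizes are made up. It only shows how a size-based decision can flip depending on whether it runs before or after partition pruning.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Partition:
    key: str
    size_in_bytes: int


@dataclass(frozen=True)
class Table:
    name: str
    partitions: tuple

    def size_in_bytes(self):
        # Table-level statistic: the sum of all partitions currently attached.
        return sum(p.size_in_bytes for p in self.partitions)


def prune(table, keep):
    """Hypothetical stand-in for partition pruning: keep only matching partitions."""
    return Table(table.name, tuple(p for p in table.partitions if p.key in keep))


def smaller_side(left, right):
    """Hypothetical size-based rule: name the table estimated to be smaller."""
    return left.name if left.size_in_bytes() <= right.size_in_bytes() else right.name


# Made-up data: 30 daily partitions of 1 MB each, and one 5 MB dimension table.
events = Table("events", tuple(Partition(f"2018-12-{d:02d}", 1_000_000) for d in range(1, 31)))
users = Table("users", (Partition("all", 5_000_000),))

# The query filters on a single partition of `events`.
wanted = {"2018-12-01"}

# Rule fired before pruning: `events` still looks like 30 MB.
decision_before_pruning = smaller_side(events, users)

# Rule fired after pruning: `events` is known to be 1 MB.
decision_after_pruning = smaller_side(prune(events, wanted), users)

print(decision_before_pruning)  # users
print(decision_after_pruning)   # events
```

In this toy model the same rule picks a different table once pruning has run first, which is the kind of inaccuracy the issue is concerned with.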