GitHub user maryannxue opened a pull request: https://github.com/apache/spark/pull/21764
[SPARK-24802] Optimization Rule Exclusion

## What changes were proposed in this pull request?

Since Spark already provides fairly clear interfaces for adding user-defined optimization rules, it would be nice to have an easy-to-use interface for excluding an optimization rule from the Spark query optimizer as well. This would make customizing the Spark optimizer easier and could sometimes help with debugging issues too.

- Add a new config spark.sql.optimizer.excludedRules, whose value is a comma-separated list of rule names.
- Modify the current batches method to remove the excluded rules from the default batches. Log the rules that have been excluded.
- Split the existing default batches into "post-analysis batches" and "optimization batches" so that only rules in the "optimization batches" can be excluded.

## How was this patch tested?

Added a new test suite: OptimizerRuleExclusionSuite

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/maryannxue/spark rule-exclusion

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/21764.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #21764

----
commit eaec2f5f2b4e3193de41655b84a1dc936b0e50a3
Author: maryannxue <maryannxue@...>
Date:   2018-07-13T21:32:01Z

    [SPARK-24802] Optimization Rule Exclusion

----
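
For readers who want a feel for the proposed behavior without reading the patch, the three bullet points above can be modeled roughly as follows. This is a minimal, language-neutral sketch in Python, not the code in this pull request: `Batch`, `parse_excluded_rules`, `apply_exclusions`, and the rule and batch names are all hypothetical, and dropping a batch once all of its rules are excluded is an assumption of the sketch rather than something the description states.

```python
import logging
from dataclasses import dataclass
from typing import List, Sequence, Tuple

log = logging.getLogger("optimizer_sketch")


@dataclass(frozen=True)
class Batch:
    """Hypothetical stand-in for an optimizer batch: a name plus ordered rule names."""
    name: str
    rules: Tuple[str, ...]


def parse_excluded_rules(conf_value: str) -> List[str]:
    """Split a comma-separated config value into rule names, ignoring blanks."""
    return [name.strip() for name in conf_value.split(",") if name.strip()]


def apply_exclusions(
    post_analysis: Sequence[Batch],
    optimization: Sequence[Batch],
    conf_value: str,
) -> Tuple[List[Batch], List[str]]:
    """Return the batches to run and the rules that were actually removed.

    Post-analysis batches are passed through untouched; only rules that
    appear in the optimization batches can be excluded.
    """
    excluded = set(parse_excluded_rules(conf_value))
    removed: List[str] = []
    kept_batches: List[Batch] = []
    for batch in optimization:
        kept = tuple(rule for rule in batch.rules if rule not in excluded)
        removed.extend(rule for rule in batch.rules if rule in excluded)
        # Assumption of this sketch: a batch left with no rules is dropped.
        if kept:
            kept_batches.append(Batch(batch.name, kept))
    if removed:
        log.info("Excluded optimization rules: %s", ", ".join(removed))
    return list(post_analysis) + kept_batches, removed


if __name__ == "__main__":
    # Placeholder rule and batch names, chosen only for illustration.
    post = [Batch("post-analysis", ("RuleA",))]
    opt = [Batch("opt-1", ("RuleB", "RuleC")), Batch("opt-2", ("RuleD",))]
    batches, removed = apply_exclusions(post, opt, "RuleC, RuleD, RuleA")
    print([b.name for b in batches], removed)
```

In this model, listing `RuleA` in the config has no effect because it lives in a post-analysis batch, which mirrors the intent of splitting the default batches into two groups.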