GitHub user sathiyapk opened a pull request: https://github.com/apache/spark/pull/19451
SPARK-22181 Adds ReplaceExceptWithNotFilter rule

## What changes were proposed in this pull request?

Adds a new optimisation rule, 'ReplaceExceptWithNotFilter', that replaces the Except logical operator with a Filter operator, and schedules it before the 'ReplaceExceptWithAntiJoin' rule. This avoids an expensive join operation when one or both datasets of the Except operation are derived entirely from Filters on the same parent.

## How was this patch tested?

The patch was tested locally using spark-shell and unit testing.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/sathiyapk/spark SPARK-22181-optimize-exceptWithFilter

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/19451.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #19451

----
commit 1baecfdce9552b1e7853ace425970ea49c8f2304
Author: Sathiya <sathiya.ku...@polytechnique.edu>
Date:   2017-10-06T18:57:52Z

    SPARK-22181 Adds ReplaceExceptWithNotFilter rule

----
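
The idea behind the rule proposed in this pull request can be illustrated with a small toy model. The sketch below is not Spark code and is not taken from the patch: rows are plain Python dicts, predicates are ordinary functions, and every name in it (`filter_rows`, `except_rows`, `except_as_filter`, the sample data) is hypothetical. It only shows that when both sides of an Except are filters over the same parent, the result can be computed in a single pass using "left predicate AND NOT right predicate", with no join or second scan. It assumes predicates always evaluate to true or false; SQL NULL (three-valued) semantics would need extra care and are ignored here.

```python
# Toy model of the Except-to-Filter rewrite. Not Spark code: rows are dicts
# and predicates are Python functions (illustrative assumptions only).

def row_key(row):
    """Hashable representation of a row, used for duplicate detection."""
    return tuple(sorted(row.items()))

def filter_rows(rows, pred):
    """Model of a Filter operator: keep rows for which pred is true."""
    return [r for r in rows if pred(r)]

def distinct(rows):
    """Model of Distinct: drop duplicate rows, keeping first-seen order."""
    seen, out = set(), []
    for r in rows:
        k = row_key(r)
        if k not in seen:
            seen.add(k)
            out.append(r)
    return out

def except_rows(left, right):
    """Model of EXCEPT with set semantics: distinct left rows absent from right."""
    right_keys = {row_key(r) for r in right}
    return distinct([r for r in left if row_key(r) not in right_keys])

def except_as_filter(parent, left_pred, right_pred):
    """Rewritten form: one pass over the shared parent, no comparison of two datasets."""
    return distinct(filter_rows(parent, lambda r: left_pred(r) and not right_pred(r)))

# Hypothetical sample data: one parent dataset, two filters derived from it.
parent = [
    {"id": 1, "age": 25},
    {"id": 2, "age": 40},
    {"id": 3, "age": 17},
    {"id": 2, "age": 40},  # duplicate row, removed by set semantics
    {"id": 4, "age": 65},
]

def is_adult(row):
    return row["age"] >= 18

def is_senior(row):
    return row["age"] >= 60

# Original plan shape: Except(Filter(parent, adult), Filter(parent, senior)).
via_except = except_rows(filter_rows(parent, is_adult), filter_rows(parent, is_senior))

# Rewritten plan shape: Distinct(Filter(parent, adult AND NOT senior)).
via_filter = except_as_filter(parent, is_adult, is_senior)

print(via_except == via_filter)
```

In this model both plan shapes return the same rows; the rewritten one touches the parent once and never compares two datasets against each other, which is the saving the description attributes to avoiding the join.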