Repository: spark Updated Branches: refs/heads/master 14464cadb -> f2e855fba
[SPARK-13473][SQL] Simplifies PushPredicateThroughProject ## What changes were proposed in this pull request? This is a follow-up of PR #11348. After PR #11348, a predicate is never pushed through a project as long as the project contains any non-deterministic fields. Thus, it's impossible that the candidate filter condition can reference any non-deterministic projected fields, and related logic can be safely cleaned up. To be more specific, the following optimization is allowed: ```scala // From: df.select('a, 'b).filter('c > rand(42)) // To: df.filter('c > rand(42)).select('a, 'b) ``` while this isn't: ```scala // From: df.select('a, rand('b) as 'rb, 'c).filter('c > 'rb) // To: df.filter('c > rand('b)).select('a, rand('b) as 'rb, 'c) ``` ## How was this patch tested? Existing test cases should do the work. Author: Cheng Lian <l...@databricks.com> Closes #11864 from liancheng/spark-13473-cleanup. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/f2e855fb Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/f2e855fb Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/f2e855fb Branch: refs/heads/master Commit: f2e855fba8eb73475cf312cdf880c1297d4323bb Parents: 14464ca Author: Cheng Lian <l...@databricks.com> Authored: Tue Mar 22 19:20:56 2016 +0800 Committer: Cheng Lian <l...@databricks.com> Committed: Tue Mar 22 19:20:56 2016 +0800 ---------------------------------------------------------------------- .../sql/catalyst/optimizer/Optimizer.scala | 24 +------------------- 1 file changed, 1 insertion(+), 23 deletions(-) ---------------------------------------------------------------------- http://git-wip-us.apache.org/repos/asf/spark/blob/f2e855fb/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---------------------------------------------------------------------- diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala index 41e8dc0..0840d46 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala @@ -891,29 +891,7 @@ object PushPredicateThroughProject extends Rule[LogicalPlan] with PredicateHelpe case a: Alias => (a.toAttribute, a.child) }) - // Split the condition into small conditions by `And`, so that we can push down part of this - // condition without nondeterministic expressions. - val andConditions = splitConjunctivePredicates(condition) - - val (deterministic, nondeterministic) = andConditions.partition(_.collect { - case a: Attribute if aliasMap.contains(a) => aliasMap(a) - }.forall(_.deterministic)) - - // If there is no nondeterministic conditions, push down the whole condition. - if (nondeterministic.isEmpty) { - project.copy(child = Filter(replaceAlias(condition, aliasMap), grandChild)) - } else { - // If they are all nondeterministic conditions, leave it un-changed. - if (deterministic.isEmpty) { - filter - } else { - // Push down the small conditions without nondeterministic expressions. - val pushedCondition = - deterministic.map(replaceAlias(_, aliasMap)).reduce(And) - Filter(nondeterministic.reduce(And), - project.copy(child = Filter(pushedCondition, grandChild))) - } - } + project.copy(child = Filter(replaceAlias(condition, aliasMap), grandChild)) } } --------------------------------------------------------------------- To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org