spark git commit: [SPARK-4937][SQL] Comment for the newly optimization rules in `BooleanSimplification`

rxin Sat, 17 Jan 2015 15:51:45 -0800

Repository: spark
Updated Branches:
  refs/heads/master f3bfc768d -> c1f3c27f2



[SPARK-4937][SQL] Comment for the newly optimization rules in `BooleanSimplification`

Follow up of #3778
/cc rxin

Author: scwf <wangf...@huawei.com>

Closes #4086 from scwf/commentforspark-4937 and squashes the following commits:

aaf89f6 [scwf] code style issue
2d3406e [scwf] added comment for spark-4937


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/c1f3c27f
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/c1f3c27f
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/c1f3c27f

Branch: refs/heads/master
Commit: c1f3c27f22c75188fbbc718de771ccdd637e4944
Parents: f3bfc76
Author: scwf <wangf...@huawei.com>
Authored: Sat Jan 17 15:51:24 2015 -0800
Committer: Reynold Xin <r...@databricks.com>
Committed: Sat Jan 17 15:51:24 2015 -0800

----------------------------------------------------------------------
 .../spark/sql/catalyst/optimizer/Optimizer.scala  | 18 ++++++++++++++++--
 1 file changed, 16 insertions(+), 2 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/c1f3c27f/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
----------------------------------------------------------------------
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
index f3acb70..522f14b 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
@@ -311,13 +311,20 @@ object BooleanSimplification extends Rule[LogicalPlan] with PredicateHelper {
           // a && a => a
           case (l, r) if l fastEquals r => l
           case (_, _) =>
+            /* Optimize predicates using the formula (a || b) && (a || c) => a || (b && c)
+             * 1. Split left and right to get the disjunctive predicates,
+             *    i.e. lhsSet = (a, b), rhsSet = (a, c)
+             * 2. Find the common predicates between lhsSet and rhsSet, i.e. common = (a)
+             * 3. Remove the common predicates from lhsSet and rhsSet, i.e. ldiff = (b), rdiff = (c)
+             * 4. Apply the formula to get the optimized predicate: common || (ldiff && rdiff)
+             */
             val lhsSet = splitDisjunctivePredicates(left).toSet
             val rhsSet = splitDisjunctivePredicates(right).toSet
             val common = lhsSet.intersect(rhsSet)
             val ldiff = lhsSet.diff(common)
             val rdiff = rhsSet.diff(common)
             if (ldiff.size == 0 || rdiff.size == 0) {
-              // a && (a || b)
+              // a && (a || b) => a
               common.reduce(Or)
             } else {
               // (a || b || c || ...) && (a || b || d || ...) && (a || b || e || ...) ... =>
@@ -339,13 +346,20 @@ object BooleanSimplification extends Rule[LogicalPlan] with PredicateHelper {
           // a || a => a
           case (l, r) if l fastEquals r => l
           case (_, _) =>
+            /* Optimize predicates using the formula (a && b) || (a && c) => a && (b || c)
+             * 1. Split left and right to get the conjunctive predicates,
+             *    i.e. lhsSet = (a, b), rhsSet = (a, c)
+             * 2. Find the common predicates between lhsSet and rhsSet, i.e. common = (a)
+             * 3. Remove the common predicates from lhsSet and rhsSet, i.e. ldiff = (b), rdiff = (c)
+             * 4. Apply the formula to get the optimized predicate: common && (ldiff || rdiff)
+             */
             val lhsSet = splitConjunctivePredicates(left).toSet
             val rhsSet = splitConjunctivePredicates(right).toSet
             val common = lhsSet.intersect(rhsSet)
             val ldiff = lhsSet.diff(common)
             val rdiff = rhsSet.diff(common)
             if ( ldiff.size == 0 || rdiff.size == 0) {
-              // a || (b && a)
+              // a || (b && a) => a
               common.reduce(And)
             } else {
               // (a && b && c && ...) || (a && b && d && ...) || (a && b && e && ...) ... =>
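
The four steps described in the new comments can be tried out in isolation. What follows is a minimal sketch on a toy expression tree, not the Catalyst code itself: `Expr`, `Atom`, `FactorSketch`, `splitDisjuncts`, `splitConjuncts`, `simplifyAnd` and `simplifyOr` are hypothetical names introduced only for illustration, and the toy `And`/`Or` case classes merely stand in for Catalyst's expression nodes. It uses ordered `Seq`s instead of `Set`s so the result is deterministic, and it returns the input unchanged when there is no common predicate, which is an assumption about the part of the rule not shown in this diff.

```scala
// Toy expression tree; a stand-in for Catalyst expressions, not the real classes.
sealed trait Expr
case class Atom(name: String) extends Expr
case class And(left: Expr, right: Expr) extends Expr
case class Or(left: Expr, right: Expr) extends Expr

object FactorSketch {
  // Flatten nested Or nodes: (a || b) || c gives Seq(a, b, c).
  def splitDisjuncts(e: Expr): Seq[Expr] = e match {
    case Or(l, r) => splitDisjuncts(l) ++ splitDisjuncts(r)
    case other    => Seq(other)
  }

  // Flatten nested And nodes: (a && b) && c gives Seq(a, b, c).
  def splitConjuncts(e: Expr): Seq[Expr] = e match {
    case And(l, r) => splitConjuncts(l) ++ splitConjuncts(r)
    case other     => Seq(other)
  }

  // (a || b) && (a || c) => a || (b && c)
  def simplifyAnd(left: Expr, right: Expr): Expr = {
    val lhs = splitDisjuncts(left).distinct              // step 1
    val rhs = splitDisjuncts(right).distinct
    val common = lhs.filter(x => rhs.contains(x))        // step 2
    val ldiff = lhs.filterNot(x => common.contains(x))   // step 3
    val rdiff = rhs.filterNot(x => common.contains(x))
    if (common.isEmpty) {
      And(left, right)                                   // assumption: nothing to factor out
    } else if (ldiff.isEmpty || rdiff.isEmpty) {
      common.reduce(Or(_, _))                            // a && (a || b) => a
    } else {
      // step 4: common || (ldiff && rdiff)
      Or(common.reduce(Or(_, _)), And(ldiff.reduce(Or(_, _)), rdiff.reduce(Or(_, _))))
    }
  }

  // (a && b) || (a && c) => a && (b || c)
  def simplifyOr(left: Expr, right: Expr): Expr = {
    val lhs = splitConjuncts(left).distinct              // step 1
    val rhs = splitConjuncts(right).distinct
    val common = lhs.filter(x => rhs.contains(x))        // step 2
    val ldiff = lhs.filterNot(x => common.contains(x))   // step 3
    val rdiff = rhs.filterNot(x => common.contains(x))
    if (common.isEmpty) {
      Or(left, right)                                    // assumption: nothing to factor out
    } else if (ldiff.isEmpty || rdiff.isEmpty) {
      common.reduce(And(_, _))                           // a || (b && a) => a
    } else {
      // step 4: common && (ldiff || rdiff)
      And(common.reduce(And(_, _)), Or(ldiff.reduce(And(_, _)), rdiff.reduce(And(_, _))))
    }
  }
}
```

For instance, with `a`, `b` and `c` bound to `Atom("a")`, `Atom("b")` and `Atom("c")`, calling `FactorSketch.simplifyAnd(Or(a, b), Or(a, c))` yields `Or(a, And(b, c))`, and `FactorSketch.simplifyOr(a, And(b, a))` collapses to `a`.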

