[GitHub] [spark] viirya commented on a change in pull request #32870: [SPARK-35439][SQL][FOLLOWUP] ExpressionContainmentOrdering should not sort unrelated expressions

GitBox Mon, 14 Jun 2021 00:08:29 -0700


viirya commented on a change in pull request #32870:
URL: https://github.com/apache/spark/pull/32870#discussion_r649615316




##########
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/EquivalentExpressions.scala
##########
@@ -229,3 +208,27 @@ class EquivalentExpressions {
     sb.toString()
   }
 }
+
+/**
+ * Orders `Expression` by parent/child relations. The child expression is 
smaller
+ * than parent expression. If there is child-parent relationships among the 
subexpressions,
+ * we want the child expressions come first than parent expressions, so we can 
replace
+ * child expressions in parent expressions with subexpression evaluation. Note 
that
+ * this is not for general expression ordering. For example, two irrelevant 
expressions
+ * will be considered as e1 < e2 and e2 < e1 by this ordering. But for the 
usage here,
+ * the order of irrelevant expressions does not matter.
+ */
+class ExpressionContainmentOrdering extends Ordering[Expression] {

Review comment:
       Oh, as it is a nested class, I cannot allocate it separately, but
   
   ```scala
   val equivalence = new EquivalentExpressions
   val exprOrdering = new equivalence.ExpressionContainmentOrdering
   ```
   
   I can revert to nested class if you think it's unnecessary change.

##########
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/EquivalentExpressions.scala
##########
@@ -229,3 +208,27 @@ class EquivalentExpressions {
     sb.toString()
   }
 }
+
+/**
+ * Orders `Expression` by parent/child relations. The child expression is 
smaller
+ * than parent expression. If there is child-parent relationships among the 
subexpressions,
+ * we want the child expressions come first than parent expressions, so we can 
replace
+ * child expressions in parent expressions with subexpression evaluation. Note 
that
+ * this is not for general expression ordering. For example, two irrelevant 
expressions
+ * will be considered as e1 < e2 and e2 < e1 by this ordering. But for the 
usage here,

Review comment:
       Right, missing the doc. Fixed.

##########
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/EquivalentExpressions.scala
##########
@@ -229,3 +208,27 @@ class EquivalentExpressions {
     sb.toString()
   }
 }
+
+/**
+ * Orders `Expression` by parent/child relations. The child expression is 
smaller
+ * than parent expression. If there is child-parent relationships among the 
subexpressions,
+ * we want the child expressions come first than parent expressions, so we can 
replace
+ * child expressions in parent expressions with subexpression evaluation. Note 
that
+ * this is not for general expression ordering. For example, two irrelevant 
expressions
+ * will be considered as e1 < e2 and e2 < e1 by this ordering. But for the 
usage here,
+ * the order of irrelevant expressions does not matter.
+ */
+class ExpressionContainmentOrdering extends Ordering[Expression] {

Review comment:
       Oh, as it is a nested class, I cannot allocate it separately, but
   
   ```scala
   val equivalence = new EquivalentExpressions
   val exprOrdering = new equivalence.ExpressionContainmentOrdering
   ```
   
   I can revert to nested class if you think it's unnecessary change.

##########
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/EquivalentExpressions.scala
##########
@@ -229,3 +208,27 @@ class EquivalentExpressions {
     sb.toString()
   }
 }
+
+/**
+ * Orders `Expression` by parent/child relations. The child expression is 
smaller
+ * than parent expression. If there is child-parent relationships among the 
subexpressions,
+ * we want the child expressions come first than parent expressions, so we can 
replace
+ * child expressions in parent expressions with subexpression evaluation. Note 
that
+ * this is not for general expression ordering. For example, two irrelevant 
expressions
+ * will be considered as e1 < e2 and e2 < e1 by this ordering. But for the 
usage here,

Review comment:
       Right, missing the doc. Fixed.

##########
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/EquivalentExpressions.scala
##########
@@ -229,3 +208,27 @@ class EquivalentExpressions {
     sb.toString()
   }
 }
+
+/**
+ * Orders `Expression` by parent/child relations. The child expression is 
smaller
+ * than parent expression. If there is child-parent relationships among the 
subexpressions,
+ * we want the child expressions come first than parent expressions, so we can 
replace
+ * child expressions in parent expressions with subexpression evaluation. Note 
that
+ * this is not for general expression ordering. For example, two irrelevant 
expressions
+ * will be considered as equal by this ordering. But for the usage here, the 
order of
+ * irrelevant expressions does not matter.

Review comment:
       Sure. Added.

##########
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/EquivalentExpressions.scala
##########
@@ -229,3 +208,27 @@ class EquivalentExpressions {
     sb.toString()
   }
 }
+
+/**
+ * Orders `Expression` by parent/child relations. The child expression is 
smaller
+ * than parent expression. If there is child-parent relationships among the 
subexpressions,
+ * we want the child expressions come first than parent expressions, so we can 
replace
+ * child expressions in parent expressions with subexpression evaluation. Note 
that
+ * this is not for general expression ordering. For example, two irrelevant or 
semantically-equal
+ * expressions will be considered as equal by this ordering. But for the usage 
here, the order of
+ * irrelevant expressions does not matter.
+ */
+class ExpressionContainmentOrdering extends Ordering[Expression] {
+  override def compare(x: Expression, y: Expression): Int = {
+    if (x.find(_.semanticEquals(y)).isDefined) {
+      // `y` is child expression of `x`.
+      1
+    } else if (y.find(_.semanticEquals(x)).isDefined) {
+      // `x` is child expression of `y`.
+      -1
+    } else {
+      // Irrelevant expressions

Review comment:
       added. thanks.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] viirya commented on a change in pull request #32870: [SPARK-35439][SQL][FOLLOWUP] ExpressionContainmentOrdering should not sort unrelated expressions

Reply via email to