dongjoon-hyun commented on a change in pull request #32870:
URL: https://github.com/apache/spark/pull/32870#discussion_r649609136



##########
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/EquivalentExpressions.scala
##########
@@ -229,3 +208,27 @@ class EquivalentExpressions {
     sb.toString()
   }
 }
+
+/**
+ * Orders `Expression` by parent/child relations. The child expression is 
smaller
+ * than parent expression. If there is child-parent relationships among the 
subexpressions,
+ * we want the child expressions come first than parent expressions, so we can 
replace
+ * child expressions in parent expressions with subexpression evaluation. Note 
that
+ * this is not for general expression ordering. For example, two irrelevant 
expressions
+ * will be considered as e1 < e2 and e2 < e1 by this ordering. But for the 
usage here,
+ * the order of irrelevant expressions does not matter.
+ */
+class ExpressionContainmentOrdering extends Ordering[Expression] {

Review comment:
       Just curious, is there a reason of this move?

##########
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/EquivalentExpressions.scala
##########
@@ -229,3 +208,27 @@ class EquivalentExpressions {
     sb.toString()
   }
 }
+
+/**
+ * Orders `Expression` by parent/child relations. The child expression is 
smaller
+ * than parent expression. If there is child-parent relationships among the 
subexpressions,
+ * we want the child expressions come first than parent expressions, so we can 
replace
+ * child expressions in parent expressions with subexpression evaluation. Note 
that
+ * this is not for general expression ordering. For example, two irrelevant 
expressions
+ * will be considered as e1 < e2 and e2 < e1 by this ordering. But for the 
usage here,

Review comment:
       Shall we change this to `0` according to the new logic?

##########
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/EquivalentExpressions.scala
##########
@@ -229,3 +208,27 @@ class EquivalentExpressions {
     sb.toString()
   }
 }
+
+/**
+ * Orders `Expression` by parent/child relations. The child expression is 
smaller
+ * than parent expression. If there is child-parent relationships among the 
subexpressions,
+ * we want the child expressions come first than parent expressions, so we can 
replace
+ * child expressions in parent expressions with subexpression evaluation. Note 
that
+ * this is not for general expression ordering. For example, two irrelevant 
expressions
+ * will be considered as e1 < e2 and e2 < e1 by this ordering. But for the 
usage here,
+ * the order of irrelevant expressions does not matter.
+ */
+class ExpressionContainmentOrdering extends Ordering[Expression] {

Review comment:
       Never mind. New one also looks good~

##########
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/EquivalentExpressions.scala
##########
@@ -229,3 +208,27 @@ class EquivalentExpressions {
     sb.toString()
   }
 }
+
+/**
+ * Orders `Expression` by parent/child relations. The child expression is 
smaller
+ * than parent expression. If there is child-parent relationships among the 
subexpressions,
+ * we want the child expressions come first than parent expressions, so we can 
replace
+ * child expressions in parent expressions with subexpression evaluation. Note 
that
+ * this is not for general expression ordering. For example, two irrelevant 
expressions
+ * will be considered as equal by this ordering. But for the usage here, the 
order of
+ * irrelevant expressions does not matter.

Review comment:
       To be complete, could you add some description about the 
semantically-equal expressions?

##########
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/EquivalentExpressions.scala
##########
@@ -229,3 +208,27 @@ class EquivalentExpressions {
     sb.toString()
   }
 }
+
+/**
+ * Orders `Expression` by parent/child relations. The child expression is 
smaller
+ * than parent expression. If there is child-parent relationships among the 
subexpressions,
+ * we want the child expressions come first than parent expressions, so we can 
replace
+ * child expressions in parent expressions with subexpression evaluation. Note 
that
+ * this is not for general expression ordering. For example, two irrelevant or 
semantically-equal
+ * expressions will be considered as equal by this ordering. But for the usage 
here, the order of
+ * irrelevant expressions does not matter.
+ */
+class ExpressionContainmentOrdering extends Ordering[Expression] {
+  override def compare(x: Expression, y: Expression): Int = {
+    if (x.find(_.semanticEquals(y)).isDefined) {
+      // `y` is child expression of `x`.
+      1
+    } else if (y.find(_.semanticEquals(x)).isDefined) {
+      // `x` is child expression of `y`.
+      -1
+    } else {
+      // Irrelevant expressions

Review comment:
       ditto. We should mention the semantically-equal expression here.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

Reply via email to