[GitHub] [spark] dongjoon-hyun commented on a change in pull request #31595: [SPARK-34474][SQL] Remove unnecessary Union under Distinct like operators

GitBox Sat, 20 Feb 2021 17:45:57 -0800


dongjoon-hyun commented on a change in pull request #31595:
URL: https://github.com/apache/spark/pull/31595#discussion_r579734736




##########
File path: sql/core/src/test/scala/org/apache/spark/sql/DataFrameSetOperationsSuite.scala
##########
@@ -387,100 +388,104 @@ class DataFrameSetOperationsSuite extends QueryTest with SharedSparkSession {
 
   test("SPARK-34283: SQL-style union using Dataset, " +
     "remove unnecessary deduplicate in multiple unions") {
-    val unionDF = testData.union(testData).distinct().union(testData).distinct()
-      .union(testData).distinct().union(testData).distinct()
-
-    // Before optimizer, there are three 'union.deduplicate' operations should be combined.
-    assert(unionDF.queryExecution.analyzed.collect {
-      case u: Union if u.children.size == 4 => u
-    }.size === 1)
-
-    // After optimizer, four 'union.deduplicate' operations should be combined.
-    assert(unionDF.queryExecution.optimizedPlan.collect {
-      case u: Union if u.children.size == 5 => u
-    }.size === 1)
-
-    checkAnswer(
-      unionDF.agg(avg("key"), max("key"), min("key"),
-        sum("key")), Row(50.5, 100, 1, 5050) :: Nil
-    )
+    withSQLConf(SQLConf.OPTIMIZER_EXCLUDED_RULES.key -> RemoveNoopOperators.ruleName) {

Review comment:
       This seems to reduce test coverage for the other existing cases handled by
`RemoveNoopOperators`, because the test now excludes the whole rule. Can we add a
new rule for this PR instead of touching `RemoveNoopOperators`?
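
To illustrate the coverage concern, here is a minimal toy sketch. It does not use Spark's Catalyst classes: `Plan`, `NoopProject`, `RemoveNoopProject`, `RemoveNoopUnion`, and `optimize` are hypothetical names invented for this example. It also assumes the new optimization collapses a `Distinct` over a `Union` whose children are all identical, which may not match the PR exactly. With a dedicated rule, a test can exclude only the new rewrite and still exercise the existing ones.

```scala
object ToyOptimizer {
  // Toy plan nodes; stand-ins only, NOT Spark's logical plan classes.
  sealed trait Plan
  final case class Relation(name: String) extends Plan
  final case class NoopProject(child: Plan) extends Plan
  final case class Union(children: Seq[Plan]) extends Plan
  final case class Distinct(child: Plan) extends Plan

  trait Rule {
    def ruleName: String
    def apply(plan: Plan): Plan
  }

  // Stand-in for an existing rewrite: drop projections that change nothing.
  object RemoveNoopProject extends Rule {
    val ruleName = "RemoveNoopProject"
    def apply(plan: Plan): Plan = plan match {
      case NoopProject(child) => apply(child)
      case Union(children)    => Union(children.map(apply))
      case Distinct(child)    => Distinct(apply(child))
      case r: Relation        => r
    }
  }

  // Hypothetical dedicated rule (assumed behavior): Distinct over a Union of
  // identical children is equivalent to Distinct over one of them.
  object RemoveNoopUnion extends Rule {
    val ruleName = "RemoveNoopUnion"
    def apply(plan: Plan): Plan = plan match {
      case Distinct(Union(children))
          if children.nonEmpty && children.distinct.size == 1 =>
        Distinct(apply(children.head))
      case Distinct(child)    => Distinct(apply(child))
      case Union(children)    => Union(children.map(apply))
      case NoopProject(child) => NoopProject(apply(child))
      case r: Relation        => r
    }
  }

  // Apply every rule in order, skipping those whose name is excluded.
  def optimize(plan: Plan, excluded: Set[String]): Plan =
    Seq(RemoveNoopProject, RemoveNoopUnion)
      .filterNot(r => excluded(r.ruleName))
      .foldLeft(plan)((p, r) => r(p))
}
```

In this toy model, `optimize(plan, Set("RemoveNoopUnion"))` still removes no-op projections, so a test that needs to keep the `Union` in the plan does not lose coverage of the older rewrite. If both rewrites lived in one rule, excluding it would disable both.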



