[GitHub] [spark] maropu commented on a change in pull request #28679: [SPARK-31870][SQL][TESTS] Fix "Do not optimize skew join if additional shuffle" test having no skew join

GitBox Sun, 31 May 2020 04:55:02 -0700


maropu commented on a change in pull request #28679:
URL: https://github.com/apache/spark/pull/28679#discussion_r432939223




##########
File path: sql/core/src/test/scala/org/apache/spark/sql/execution/adaptive/AdaptiveQueryExecSuite.scala
##########
@@ -639,11 +640,18 @@ class AdaptiveQueryExecSuite
           .range(0, 1000, 1, 10)
           .selectExpr("id % 1 as key2", "id as value2")
           .createOrReplaceTempView("skewData2")
-        val (_, innerAdaptivePlan) = runAdaptiveAndVerifyResult(
-          "SELECT key1 FROM skewData1 join skewData2 ON key1 = key2 group by key1")
+
+        def checkSkewJoin(query: String, additionalShuffle: Boolean): Unit = {
+          val (_, innerAdaptivePlan) = runAdaptiveAndVerifyResult(query)
+          val innerSmj = findTopLevelSortMergeJoin(innerAdaptivePlan)
+          assert(innerSmj.size == 1 && innerSmj.head.isSkewJoin != additionalShuffle)
+        }
+
+        checkSkewJoin(
+          "SELECT key1 FROM skewData1 join skewData2 ON key1 = key2", false)
         // Additional shuffle introduced, so disable the "OptimizeSkewedJoin" optimization
-        val innerSmj = findTopLevelSortMergeJoin(innerAdaptivePlan)
-        assert(innerSmj.size == 1 && !innerSmj.head.isSkewJoin)
+        checkSkewJoin(
+          "SELECT key1 FROM skewData1 join skewData2 ON key1 = key2 group by key1", true)

Review comment:
       nit: Please use uppercase for SQL keywords where possible (e.g., `group by` -> `GROUP BY`).
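
       A rough sketch of how the two query strings from the diff might read with this applied. Uppercasing `join` as well is an assumption here (the nit only names `group by`), and the wrapper object `UppercaseKeywordQueries` is a hypothetical name used only to keep the snippet self-contained:

```scala
// Hypothetical holder object; not part of AdaptiveQueryExecSuite.
object UppercaseKeywordQueries {
  // Query without the extra aggregation (no additional shuffle expected).
  val plainJoinQuery: String =
    "SELECT key1 FROM skewData1 JOIN skewData2 ON key1 = key2"

  // Query with GROUP BY, which introduces the additional shuffle.
  // Assumption: JOIN is uppercased too, although the nit only mentions GROUP BY.
  val groupedJoinQuery: String =
    "SELECT key1 FROM skewData1 JOIN skewData2 ON key1 = key2 GROUP BY key1"
}
```

       In the test itself these strings would presumably be passed straight to `checkSkewJoin(...)` with `false` and `true` respectively, as in the diff.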




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
