[GitHub] spark pull request #20844: [SPARK-23707][SQL] Don't need shuffle exchange wi...
Github user ConeyLiu closed the pull request at: https://github.com/apache/spark/pull/20844

---
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
Github user gczsjdy commented on a diff in the pull request: https://github.com/apache/spark/pull/20844#discussion_r178437551

--- Diff: sql/core/src/test/scala/org/apache/spark/sql/ConfigBehaviorSuite.scala ---
@@ -39,7 +39,7 @@ class ConfigBehaviorSuite extends QueryTest with SharedSQLContext {
     def computeChiSquareTest(): Double = {
       val n = 1
       // Trigger a sort
-      val data = spark.range(0, n, 1, 1).sort('id)
+      val data = spark.range(0, n, 1, 2).sort('id)
--- End diff --

: ) Know this now
Github user gczsjdy commented on a diff in the pull request: https://github.com/apache/spark/pull/20844#discussion_r177311524

--- Diff: sql/core/src/test/scala/org/apache/spark/sql/ConfigBehaviorSuite.scala ---
@@ -39,7 +39,7 @@ class ConfigBehaviorSuite extends QueryTest with SharedSQLContext {
     def computeChiSquareTest(): Double = {
       val n = 1
       // Trigger a sort
-      val data = spark.range(0, n, 1, 1).sort('id)
+      val data = spark.range(0, n, 1, 2).sort('id)
--- End diff --

Why change this?
Github user ConeyLiu commented on a diff in the pull request: https://github.com/apache/spark/pull/20844#discussion_r176327636

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/basicPhysicalOperators.scala ---
@@ -348,6 +348,13 @@ case class RangeExec(range: org.apache.spark.sql.catalyst.plans.logical.Range)
   override lazy val metrics = Map(
     "numOutputRows" -> SQLMetrics.createMetric(sparkContext, "number of output rows"))

+  /** Specifies how data is partitioned across different nodes in the cluster. */
+  override def outputPartitioning: Partitioning = if (numSlices == 1 && numElements != 0) {
--- End diff --

This is related to the [UT error](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88474/testReport/org.apache.spark.sql/DataFrameRangeSuite/SPARK_7150_range_api/). `spark.range(-10, -9, -20, 1).count()` failed when `codegen` was set to true and `RangeExec.outputPartitioning` was set to `SinglePartition`. I tried to find the root cause, but failed.
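The partitioning decision under review can be sketched in plain Scala. This is a simplified, self-contained model: the sealed trait below only stands in for Spark's `Partitioning` hierarchy (`org.apache.spark.sql.catalyst.plans.physical`), and `rangeOutputPartitioning` is a hypothetical helper, not the actual `RangeExec` method.

```scala
// Simplified stand-ins for Spark's Partitioning hierarchy (assumption:
// these model, but are not, the catalyst physical-plan classes).
sealed trait Partitioning
case object SinglePartition extends Partitioning
case class UnknownPartitioning(numPartitions: Int) extends Partitioning

// A non-empty range generated in a single slice lives entirely in one
// partition, so reporting SinglePartition lets the planner skip the
// shuffle exchange that a single-partition requirement (e.g. a global
// sort over one slice) would otherwise insert.
def rangeOutputPartitioning(numSlices: Int, numElements: BigInt): Partitioning =
  if (numSlices == 1 && numElements != 0) SinglePartition
  else UnknownPartitioning(numSlices)
```

Under this sketch, `rangeOutputPartitioning(1, BigInt(5))` reports `SinglePartition`, while an empty range (`numElements == 0`) falls through to `UnknownPartitioning` because of the guard questioned in the review.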
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/20844#discussion_r176307376

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/basicPhysicalOperators.scala ---
@@ -348,6 +348,13 @@ case class RangeExec(range: org.apache.spark.sql.catalyst.plans.logical.Range)
   override lazy val metrics = Map(
     "numOutputRows" -> SQLMetrics.createMetric(sparkContext, "number of output rows"))

+  /** Specifies how data is partitioned across different nodes in the cluster. */
+  override def outputPartitioning: Partitioning = if (numSlices == 1 && numElements != 0) {
--- End diff --

why `numElements != 0`?
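A short illustration of how `numElements` can be zero even though a range was requested: when the step's sign points away from `end`, the range is empty. `rangeSize` below is a hypothetical helper that mirrors, in simplified form, how a stepped range's element count can be computed; it is not Spark's actual implementation.

```scala
// Count the elements of the half-open range [start, end) stepped by `step`.
// Hypothetical simplified model, not Spark's Range.numElements.
def rangeSize(start: Long, end: Long, step: Long): BigInt = {
  require(step != 0, "step must be nonzero")
  val gap = BigInt(end) - BigInt(start)
  // A step pointing away from `end` yields an empty range.
  if ((step > 0 && gap <= 0) || (step < 0 && gap >= 0)) BigInt(0)
  else (gap + step - gap.signum) / step // ceiling division in step's direction
}
```

In this model, `rangeSize(-10, -9, -20)` is 0: the failing case quoted above, `spark.range(-10, -9, -20, 1)`, is an empty single-slice range, which is exactly what the `numElements != 0` guard excludes from `SinglePartition`.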