[GitHub] [spark] ulysses-you commented on a change in pull request #32932: [SPARK-35786][SQL] Add a new operator to rebalance the query output if AQE is enabled

GitBox Sun, 27 Jun 2021 05:34:30 -0700


ulysses-you commented on a change in pull request #32932:
URL: https://github.com/apache/spark/pull/32932#discussion_r659314954




##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicLogicalOperators.scala
##########
@@ -1351,6 +1351,31 @@ object RepartitionByExpression {
   }
 }
 
+/**
+ * This operator is used to rebalance the output partitions of the given `child`, so that every
+ * partition is of a reasonable size (not too small and not too big). It also try its best to
+ * partition the child output by `partitionExpressions`. If there are skews, Spark will split the
+ * skewed partitions, to make these partitions not too big. This operator is useful when you need
+ * to write the result of `child` to a table, to avoid too small/big files.
+ *
+ * Note that, this operator only makes sense when AQE is enabled.
+ */
+case class RebalancePartitions(
+    partitionExpressions: Seq[Expression],
+    child: LogicalPlan) extends UnaryNode {

Review comment:
       This operator is a special case, so to be conservative this PR does not
extend `RepartitionOperation`; see the comment at
https://github.com/apache/spark/pull/32932#discussion_r656249153.
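
       As a rough illustration of why extending `UnaryNode` directly is the
conservative choice, here is a minimal sketch. It uses toy stand-in types, not
the real Catalyst classes: `Leaf`, the `Seq[String]` expressions, and
`matchedByRepartitionRule` are hypothetical names introduced only for this
example. It shows that a rule written against `RepartitionOperation` does not
pick up the new operator by accident.

```scala
object RebalanceSketch {
  // Toy stand-ins for Catalyst's plan types; NOT the real Spark classes.
  sealed trait LogicalPlan
  trait UnaryNode extends LogicalPlan { def child: LogicalPlan }
  case object Leaf extends LogicalPlan // hypothetical leaf plan

  // Simplified version of the existing repartition family.
  trait RepartitionOperation extends UnaryNode
  case class Repartition(numPartitions: Int, child: LogicalPlan)
    extends RepartitionOperation

  // The new operator extends UnaryNode directly, as in the diff above.
  // Expressions are modeled as plain strings here (an assumption).
  case class RebalancePartitions(
      partitionExpressions: Seq[String],
      child: LogicalPlan) extends UnaryNode

  // Hypothetical rule that only targets RepartitionOperation nodes.
  def matchedByRepartitionRule(plan: LogicalPlan): Boolean = plan match {
    case _: RepartitionOperation => true
    case _ => false
  }
}
```

       With this layout, any existing rule that matches on
`RepartitionOperation` leaves `RebalancePartitions` untouched unless it is
updated to handle the new operator explicitly.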




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



