[GitHub] spark pull request #21911: [SPARK-24940][SQL] Coalesce and Repartition Hint ...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/21911 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21911: [SPARK-24940][SQL] Coalesce and Repartition Hint ...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/21911#discussion_r207701114 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveHints.scala --- @@ -102,6 +104,32 @@ object ResolveHints { } } + /** + * COALESCE Hint accepts name "COALESCE" and "REPARTITION". + * Its parameter includes a partition number. + */ + object ResolveCoalesceHints extends Rule[LogicalPlan] { +private val COALESCE_HINT_NAMES = Set("COALESCE", "REPARTITION") + +def apply(plan: LogicalPlan): LogicalPlan = plan.resolveOperators { + case h: UnresolvedHint if COALESCE_HINT_NAMES.contains(h.name.toUpperCase(Locale.ROOT)) => +val hintName = h.name.toUpperCase(Locale.ROOT) +val shuffle = hintName match { + case "REPARTITION" => true + case "COALESCE" => false +} +val numPartitions = h.parameters match { + case Seq(Literal(numPartitions: Int, IntegerType)) => --- End diff -- Use `IntegerLiteral` --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21911: [SPARK-24940][SQL] Coalesce and Repartition Hint ...
Github user jzhuge commented on a diff in the pull request: https://github.com/apache/spark/pull/21911#discussion_r207638866 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveHints.scala --- @@ -102,6 +104,38 @@ object ResolveHints { } } + /** + * COALESCE Hint accepts name "COALESCE" and "REPARTITION". + * Its parameter includes a partition number. + */ + class ResolveCoalesceHints(conf: SQLConf) extends Rule[LogicalPlan] { +private val COALESCE_HINT_NAMES = Set("COALESCE", "REPARTITION") + +private def applyCoalesceHint( + plan: LogicalPlan, + numPartitions: Int, + shuffle: Boolean): LogicalPlan = { + Repartition(numPartitions, shuffle, plan) +} + +def apply(plan: LogicalPlan): LogicalPlan = plan.resolveOperators { + case h: UnresolvedHint if COALESCE_HINT_NAMES.contains(h.name.toUpperCase(Locale.ROOT)) => +val hintName = h.name.toUpperCase(Locale.ROOT) +val shuffle = hintName match { + case "REPARTITION" => true + case "COALESCE" => false +} +h.parameters match { --- End diff -- Ah I see. Thanks. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21911: [SPARK-24940][SQL] Coalesce and Repartition Hint ...
Github user mgaido91 commented on a diff in the pull request: https://github.com/apache/spark/pull/21911#discussion_r207631776 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveHints.scala --- @@ -102,6 +104,38 @@ object ResolveHints { } } + /** + * COALESCE Hint accepts name "COALESCE" and "REPARTITION". + * Its parameter includes a partition number. + */ + class ResolveCoalesceHints(conf: SQLConf) extends Rule[LogicalPlan] { +private val COALESCE_HINT_NAMES = Set("COALESCE", "REPARTITION") + +private def applyCoalesceHint( + plan: LogicalPlan, + numPartitions: Int, + shuffle: Boolean): LogicalPlan = { + Repartition(numPartitions, shuffle, plan) +} + +def apply(plan: LogicalPlan): LogicalPlan = plan.resolveOperators { + case h: UnresolvedHint if COALESCE_HINT_NAMES.contains(h.name.toUpperCase(Locale.ROOT)) => +val hintName = h.name.toUpperCase(Locale.ROOT) +val shuffle = hintName match { + case "REPARTITION" => true + case "COALESCE" => false +} +h.parameters match { --- End diff -- I meant something like: ``` val numPartitions = h.parameters match { ... } Repartition(numPartitions, shuffle, h.child) ``` --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21911: [SPARK-24940][SQL] Coalesce and Repartition Hint ...
Github user jzhuge commented on a diff in the pull request: https://github.com/apache/spark/pull/21911#discussion_r207615008 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveHints.scala --- @@ -102,6 +104,38 @@ object ResolveHints { } } + /** + * COALESCE Hint accepts name "COALESCE" and "REPARTITION". + * Its parameter includes a partition number. + */ + class ResolveCoalesceHints(conf: SQLConf) extends Rule[LogicalPlan] { +private val COALESCE_HINT_NAMES = Set("COALESCE", "REPARTITION") + +private def applyCoalesceHint( + plan: LogicalPlan, + numPartitions: Int, + shuffle: Boolean): LogicalPlan = { + Repartition(numPartitions, shuffle, plan) +} + +def apply(plan: LogicalPlan): LogicalPlan = plan.resolveOperators { + case h: UnresolvedHint if COALESCE_HINT_NAMES.contains(h.name.toUpperCase(Locale.ROOT)) => +val hintName = h.name.toUpperCase(Locale.ROOT) +val shuffle = hintName match { + case "REPARTITION" => true + case "COALESCE" => false +} +h.parameters match { --- End diff -- Thanks for the review, @mgaido91. Don't quite get this comment. Please take a look at my latest commit. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21911: [SPARK-24940][SQL] Coalesce and Repartition Hint ...
Github user mgaido91 commented on a diff in the pull request: https://github.com/apache/spark/pull/21911#discussion_r207509267 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveHints.scala --- @@ -102,6 +104,38 @@ object ResolveHints { } } + /** + * COALESCE Hint accepts name "COALESCE" and "REPARTITION". + * Its parameter includes a partition number. + */ + class ResolveCoalesceHints(conf: SQLConf) extends Rule[LogicalPlan] { --- End diff -- the `conf` is not used anywhere, we can remove it --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21911: [SPARK-24940][SQL] Coalesce and Repartition Hint ...
Github user mgaido91 commented on a diff in the pull request: https://github.com/apache/spark/pull/21911#discussion_r207510473 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveHints.scala --- @@ -102,6 +104,38 @@ object ResolveHints { } } + /** + * COALESCE Hint accepts name "COALESCE" and "REPARTITION". + * Its parameter includes a partition number. + */ + class ResolveCoalesceHints(conf: SQLConf) extends Rule[LogicalPlan] { +private val COALESCE_HINT_NAMES = Set("COALESCE", "REPARTITION") + +private def applyCoalesceHint( + plan: LogicalPlan, + numPartitions: Int, + shuffle: Boolean): LogicalPlan = { + Repartition(numPartitions, shuffle, plan) +} + +def apply(plan: LogicalPlan): LogicalPlan = plan.resolveOperators { + case h: UnresolvedHint if COALESCE_HINT_NAMES.contains(h.name.toUpperCase(Locale.ROOT)) => +val hintName = h.name.toUpperCase(Locale.ROOT) +val shuffle = hintName match { + case "REPARTITION" => true + case "COALESCE" => false +} +h.parameters match { --- End diff -- nit: maybe we can just return `numPartitions` and then we can call `applyCoalesceHint` in order to avoid code duplication... --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21911: [SPARK-24940][SQL] Coalesce and Repartition Hint ...
Github user jzhuge commented on a diff in the pull request: https://github.com/apache/spark/pull/21911#discussion_r207334996 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveHints.scala --- @@ -102,6 +104,39 @@ object ResolveHints { } } + /** + * For coalesce hint, we accept "COALESCE" and "REPARTITION". + * Its parameters include a partition number and an optional boolean to indicate + * whether shuffle is allowed. + */ + class ResolveCoalesceHints(conf: SQLConf) extends Rule[LogicalPlan] { +private val COALESCE_HINT_NAMES = Set("COALESCE", "REPARTITION") + +private def applyCoalesceHint( + plan: LogicalPlan, + numPartitions: Int, + shuffle: Boolean): LogicalPlan = { + Repartition(numPartitions, shuffle, plan) +} + +def apply(plan: LogicalPlan): LogicalPlan = plan.resolveOperators { + case h: UnresolvedHint if COALESCE_HINT_NAMES.contains(h.name.toUpperCase(Locale.ROOT)) => --- End diff -- done --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21911: [SPARK-24940][SQL] Coalesce and Repartition Hint ...
Github user jzhuge commented on a diff in the pull request: https://github.com/apache/spark/pull/21911#discussion_r207303282 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/ResolveHintsSuite.scala --- @@ -17,15 +17,25 @@ package org.apache.spark.sql.catalyst.analysis +import org.apache.spark.sql.AnalysisException import org.apache.spark.sql.catalyst.dsl.expressions._ import org.apache.spark.sql.catalyst.dsl.plans._ +import org.apache.spark.sql.catalyst.expressions.Literal import org.apache.spark.sql.catalyst.parser.CatalystSqlParser import org.apache.spark.sql.catalyst.plans.Inner import org.apache.spark.sql.catalyst.plans.logical._ class ResolveHintsSuite extends AnalysisTest { import org.apache.spark.sql.catalyst.analysis.TestRelations._ + private def intercept(plan: LogicalPlan, messages: String*): Unit = { --- End diff -- Sure --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21911: [SPARK-24940][SQL] Coalesce and Repartition Hint ...
Github user jzhuge commented on a diff in the pull request: https://github.com/apache/spark/pull/21911#discussion_r207297428 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveHints.scala --- @@ -102,6 +104,35 @@ object ResolveHints { } } + /** + * COALESCE Hint accepts name "COALESCE" and "REPARTITION". + * Its parameter includes a partition number. + */ + class ResolveCoalesceHints(conf: SQLConf) extends Rule[LogicalPlan] { +private val COALESCE_HINT_NAMES = Set("COALESCE", "REPARTITION") + +private def applyCoalesceHint( + plan: LogicalPlan, + numPartitions: Int, + shuffle: Boolean): LogicalPlan = { + Repartition(numPartitions, shuffle, plan) +} + +def apply(plan: LogicalPlan): LogicalPlan = plan.resolveOperators { + case h: UnresolvedHint if COALESCE_HINT_NAMES.contains(h.name.toUpperCase(Locale.ROOT)) => +h.parameters match { + case Seq(Literal(numPartitions: Int, IntegerType)) => +val shuffle = h.name.toUpperCase(Locale.ROOT) match { + case "REPARTITION" => true + case "COALESCE" => false +} +applyCoalesceHint(h.child, numPartitions, shuffle) + case _ => +throw new AnalysisException("COALESCE Hint expects a partition number as parameter") --- End diff -- Good catch --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21911: [SPARK-24940][SQL] Coalesce and Repartition Hint ...
Github user rdblue commented on a diff in the pull request: https://github.com/apache/spark/pull/21911#discussion_r207285870 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/ResolveHintsSuite.scala --- @@ -17,15 +17,25 @@ package org.apache.spark.sql.catalyst.analysis +import org.apache.spark.sql.AnalysisException import org.apache.spark.sql.catalyst.dsl.expressions._ import org.apache.spark.sql.catalyst.dsl.plans._ +import org.apache.spark.sql.catalyst.expressions.Literal import org.apache.spark.sql.catalyst.parser.CatalystSqlParser import org.apache.spark.sql.catalyst.plans.Inner import org.apache.spark.sql.catalyst.plans.logical._ class ResolveHintsSuite extends AnalysisTest { import org.apache.spark.sql.catalyst.analysis.TestRelations._ + private def intercept(plan: LogicalPlan, messages: String*): Unit = { --- End diff -- Since this is an assertion and `intercept` is a well-known method, can you rename it to `assertErrorMessage` or something more descriptive? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21911: [SPARK-24940][SQL] Coalesce and Repartition Hint ...
Github user rdblue commented on a diff in the pull request: https://github.com/apache/spark/pull/21911#discussion_r207285266 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveHints.scala --- @@ -102,6 +104,35 @@ object ResolveHints { } } + /** + * COALESCE Hint accepts name "COALESCE" and "REPARTITION". + * Its parameter includes a partition number. + */ + class ResolveCoalesceHints(conf: SQLConf) extends Rule[LogicalPlan] { +private val COALESCE_HINT_NAMES = Set("COALESCE", "REPARTITION") + +private def applyCoalesceHint( + plan: LogicalPlan, + numPartitions: Int, + shuffle: Boolean): LogicalPlan = { + Repartition(numPartitions, shuffle, plan) +} + +def apply(plan: LogicalPlan): LogicalPlan = plan.resolveOperators { + case h: UnresolvedHint if COALESCE_HINT_NAMES.contains(h.name.toUpperCase(Locale.ROOT)) => +h.parameters match { + case Seq(Literal(numPartitions: Int, IntegerType)) => +val shuffle = h.name.toUpperCase(Locale.ROOT) match { + case "REPARTITION" => true + case "COALESCE" => false +} +applyCoalesceHint(h.child, numPartitions, shuffle) + case _ => +throw new AnalysisException("COALESCE Hint expects a partition number as parameter") --- End diff -- Can you use `h.name.toUpperCase` in this error message instead? I think that would be a better message for users that don't know the relationship between COALESCE and REPARTITION. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org