[GitHub] spark pull request #21911: [SPARK-24940][SQL] Coalesce and Repartition Hint ...

2018-08-03 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/21911


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #21911: [SPARK-24940][SQL] Coalesce and Repartition Hint ...

2018-08-03 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/21911#discussion_r207701114
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveHints.scala
 ---
@@ -102,6 +104,32 @@ object ResolveHints {
 }
   }
 
+  /**
+   * COALESCE Hint accepts name "COALESCE" and "REPARTITION".
+   * Its parameter includes a partition number.
+   */
+  object ResolveCoalesceHints extends Rule[LogicalPlan] {
+private val COALESCE_HINT_NAMES = Set("COALESCE", "REPARTITION")
+
+def apply(plan: LogicalPlan): LogicalPlan = plan.resolveOperators {
+  case h: UnresolvedHint if 
COALESCE_HINT_NAMES.contains(h.name.toUpperCase(Locale.ROOT)) =>
+val hintName = h.name.toUpperCase(Locale.ROOT)
+val shuffle = hintName match {
+  case "REPARTITION" => true
+  case "COALESCE" => false
+}
+val numPartitions = h.parameters match {
+  case Seq(Literal(numPartitions: Int, IntegerType)) =>
--- End diff --

Use `IntegerLiteral`


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #21911: [SPARK-24940][SQL] Coalesce and Repartition Hint ...

2018-08-03 Thread jzhuge
Github user jzhuge commented on a diff in the pull request:

https://github.com/apache/spark/pull/21911#discussion_r207638866
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveHints.scala
 ---
@@ -102,6 +104,38 @@ object ResolveHints {
 }
   }
 
+  /**
+   * COALESCE Hint accepts name "COALESCE" and "REPARTITION".
+   * Its parameter includes a partition number.
+   */
+  class ResolveCoalesceHints(conf: SQLConf) extends Rule[LogicalPlan] {
+private val COALESCE_HINT_NAMES = Set("COALESCE", "REPARTITION")
+
+private def applyCoalesceHint(
+  plan: LogicalPlan,
+  numPartitions: Int,
+  shuffle: Boolean): LogicalPlan = {
+  Repartition(numPartitions, shuffle, plan)
+}
+
+def apply(plan: LogicalPlan): LogicalPlan = plan.resolveOperators {
+  case h: UnresolvedHint if 
COALESCE_HINT_NAMES.contains(h.name.toUpperCase(Locale.ROOT)) =>
+val hintName = h.name.toUpperCase(Locale.ROOT)
+val shuffle = hintName match {
+  case "REPARTITION" => true
+  case "COALESCE" => false
+}
+h.parameters match {
--- End diff --

Ah I see. Thanks.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #21911: [SPARK-24940][SQL] Coalesce and Repartition Hint ...

2018-08-03 Thread mgaido91
Github user mgaido91 commented on a diff in the pull request:

https://github.com/apache/spark/pull/21911#discussion_r207631776
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveHints.scala
 ---
@@ -102,6 +104,38 @@ object ResolveHints {
 }
   }
 
+  /**
+   * COALESCE Hint accepts name "COALESCE" and "REPARTITION".
+   * Its parameter includes a partition number.
+   */
+  class ResolveCoalesceHints(conf: SQLConf) extends Rule[LogicalPlan] {
+private val COALESCE_HINT_NAMES = Set("COALESCE", "REPARTITION")
+
+private def applyCoalesceHint(
+  plan: LogicalPlan,
+  numPartitions: Int,
+  shuffle: Boolean): LogicalPlan = {
+  Repartition(numPartitions, shuffle, plan)
+}
+
+def apply(plan: LogicalPlan): LogicalPlan = plan.resolveOperators {
+  case h: UnresolvedHint if 
COALESCE_HINT_NAMES.contains(h.name.toUpperCase(Locale.ROOT)) =>
+val hintName = h.name.toUpperCase(Locale.ROOT)
+val shuffle = hintName match {
+  case "REPARTITION" => true
+  case "COALESCE" => false
+}
+h.parameters match {
--- End diff --

I meant something like:
```
val numPartitions = h.parameters  match {
  ...
}
 Repartition(numPartitions, shuffle, h.child)
```


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #21911: [SPARK-24940][SQL] Coalesce and Repartition Hint ...

2018-08-03 Thread jzhuge
Github user jzhuge commented on a diff in the pull request:

https://github.com/apache/spark/pull/21911#discussion_r207615008
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveHints.scala
 ---
@@ -102,6 +104,38 @@ object ResolveHints {
 }
   }
 
+  /**
+   * COALESCE Hint accepts name "COALESCE" and "REPARTITION".
+   * Its parameter includes a partition number.
+   */
+  class ResolveCoalesceHints(conf: SQLConf) extends Rule[LogicalPlan] {
+private val COALESCE_HINT_NAMES = Set("COALESCE", "REPARTITION")
+
+private def applyCoalesceHint(
+  plan: LogicalPlan,
+  numPartitions: Int,
+  shuffle: Boolean): LogicalPlan = {
+  Repartition(numPartitions, shuffle, plan)
+}
+
+def apply(plan: LogicalPlan): LogicalPlan = plan.resolveOperators {
+  case h: UnresolvedHint if 
COALESCE_HINT_NAMES.contains(h.name.toUpperCase(Locale.ROOT)) =>
+val hintName = h.name.toUpperCase(Locale.ROOT)
+val shuffle = hintName match {
+  case "REPARTITION" => true
+  case "COALESCE" => false
+}
+h.parameters match {
--- End diff --

Thanks for the review, @mgaido91. Don't quite get this comment. Please take 
a look at my latest commit.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #21911: [SPARK-24940][SQL] Coalesce and Repartition Hint ...

2018-08-03 Thread mgaido91
Github user mgaido91 commented on a diff in the pull request:

https://github.com/apache/spark/pull/21911#discussion_r207509267
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveHints.scala
 ---
@@ -102,6 +104,38 @@ object ResolveHints {
 }
   }
 
+  /**
+   * COALESCE Hint accepts name "COALESCE" and "REPARTITION".
+   * Its parameter includes a partition number.
+   */
+  class ResolveCoalesceHints(conf: SQLConf) extends Rule[LogicalPlan] {
--- End diff --

the `conf` is not used anywhere, we can remove it


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #21911: [SPARK-24940][SQL] Coalesce and Repartition Hint ...

2018-08-03 Thread mgaido91
Github user mgaido91 commented on a diff in the pull request:

https://github.com/apache/spark/pull/21911#discussion_r207510473
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveHints.scala
 ---
@@ -102,6 +104,38 @@ object ResolveHints {
 }
   }
 
+  /**
+   * COALESCE Hint accepts name "COALESCE" and "REPARTITION".
+   * Its parameter includes a partition number.
+   */
+  class ResolveCoalesceHints(conf: SQLConf) extends Rule[LogicalPlan] {
+private val COALESCE_HINT_NAMES = Set("COALESCE", "REPARTITION")
+
+private def applyCoalesceHint(
+  plan: LogicalPlan,
+  numPartitions: Int,
+  shuffle: Boolean): LogicalPlan = {
+  Repartition(numPartitions, shuffle, plan)
+}
+
+def apply(plan: LogicalPlan): LogicalPlan = plan.resolveOperators {
+  case h: UnresolvedHint if 
COALESCE_HINT_NAMES.contains(h.name.toUpperCase(Locale.ROOT)) =>
+val hintName = h.name.toUpperCase(Locale.ROOT)
+val shuffle = hintName match {
+  case "REPARTITION" => true
+  case "COALESCE" => false
+}
+h.parameters match {
--- End diff --

nit: maybe we can just return `numPartitions` and then we can call 
`applyCoalesceHint` in order to avoid code duplication...


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #21911: [SPARK-24940][SQL] Coalesce and Repartition Hint ...

2018-08-02 Thread jzhuge
Github user jzhuge commented on a diff in the pull request:

https://github.com/apache/spark/pull/21911#discussion_r207334996
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveHints.scala
 ---
@@ -102,6 +104,39 @@ object ResolveHints {
 }
   }
 
+  /**
+   * For coalesce hint, we accept "COALESCE" and "REPARTITION".
+   * Its parameters include a partition number and an optional boolean to 
indicate
+   * whether shuffle is allowed.
+   */
+  class ResolveCoalesceHints(conf: SQLConf) extends Rule[LogicalPlan] {
+private val COALESCE_HINT_NAMES = Set("COALESCE", "REPARTITION")
+
+private def applyCoalesceHint(
+  plan: LogicalPlan,
+  numPartitions: Int,
+  shuffle: Boolean): LogicalPlan = {
+  Repartition(numPartitions, shuffle, plan)
+}
+
+def apply(plan: LogicalPlan): LogicalPlan = plan.resolveOperators {
+  case h: UnresolvedHint if 
COALESCE_HINT_NAMES.contains(h.name.toUpperCase(Locale.ROOT)) =>
--- End diff --

done


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #21911: [SPARK-24940][SQL] Coalesce and Repartition Hint ...

2018-08-02 Thread jzhuge
Github user jzhuge commented on a diff in the pull request:

https://github.com/apache/spark/pull/21911#discussion_r207303282
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/ResolveHintsSuite.scala
 ---
@@ -17,15 +17,25 @@
 
 package org.apache.spark.sql.catalyst.analysis
 
+import org.apache.spark.sql.AnalysisException
 import org.apache.spark.sql.catalyst.dsl.expressions._
 import org.apache.spark.sql.catalyst.dsl.plans._
+import org.apache.spark.sql.catalyst.expressions.Literal
 import org.apache.spark.sql.catalyst.parser.CatalystSqlParser
 import org.apache.spark.sql.catalyst.plans.Inner
 import org.apache.spark.sql.catalyst.plans.logical._
 
 class ResolveHintsSuite extends AnalysisTest {
   import org.apache.spark.sql.catalyst.analysis.TestRelations._
 
+  private def intercept(plan: LogicalPlan, messages: String*): Unit = {
--- End diff --

Sure


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #21911: [SPARK-24940][SQL] Coalesce and Repartition Hint ...

2018-08-02 Thread jzhuge
Github user jzhuge commented on a diff in the pull request:

https://github.com/apache/spark/pull/21911#discussion_r207297428
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveHints.scala
 ---
@@ -102,6 +104,35 @@ object ResolveHints {
 }
   }
 
+  /**
+   * COALESCE Hint accepts name "COALESCE" and "REPARTITION".
+   * Its parameter includes a partition number.
+   */
+  class ResolveCoalesceHints(conf: SQLConf) extends Rule[LogicalPlan] {
+private val COALESCE_HINT_NAMES = Set("COALESCE", "REPARTITION")
+
+private def applyCoalesceHint(
+  plan: LogicalPlan,
+  numPartitions: Int,
+  shuffle: Boolean): LogicalPlan = {
+  Repartition(numPartitions, shuffle, plan)
+}
+
+def apply(plan: LogicalPlan): LogicalPlan = plan.resolveOperators {
+  case h: UnresolvedHint if 
COALESCE_HINT_NAMES.contains(h.name.toUpperCase(Locale.ROOT)) =>
+h.parameters match {
+  case Seq(Literal(numPartitions: Int, IntegerType)) =>
+val shuffle = h.name.toUpperCase(Locale.ROOT) match {
+  case "REPARTITION" => true
+  case "COALESCE" => false
+}
+applyCoalesceHint(h.child, numPartitions, shuffle)
+  case _ =>
+throw new AnalysisException("COALESCE Hint expects a partition 
number as parameter")
--- End diff --

Good catch


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #21911: [SPARK-24940][SQL] Coalesce and Repartition Hint ...

2018-08-02 Thread rdblue
Github user rdblue commented on a diff in the pull request:

https://github.com/apache/spark/pull/21911#discussion_r207285870
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/ResolveHintsSuite.scala
 ---
@@ -17,15 +17,25 @@
 
 package org.apache.spark.sql.catalyst.analysis
 
+import org.apache.spark.sql.AnalysisException
 import org.apache.spark.sql.catalyst.dsl.expressions._
 import org.apache.spark.sql.catalyst.dsl.plans._
+import org.apache.spark.sql.catalyst.expressions.Literal
 import org.apache.spark.sql.catalyst.parser.CatalystSqlParser
 import org.apache.spark.sql.catalyst.plans.Inner
 import org.apache.spark.sql.catalyst.plans.logical._
 
 class ResolveHintsSuite extends AnalysisTest {
   import org.apache.spark.sql.catalyst.analysis.TestRelations._
 
+  private def intercept(plan: LogicalPlan, messages: String*): Unit = {
--- End diff --

Since this is an assertion and `intercept` is a well-known method, can you 
rename it to `assertErrorMessage` or something more descriptive?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #21911: [SPARK-24940][SQL] Coalesce and Repartition Hint ...

2018-08-02 Thread rdblue
Github user rdblue commented on a diff in the pull request:

https://github.com/apache/spark/pull/21911#discussion_r207285266
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveHints.scala
 ---
@@ -102,6 +104,35 @@ object ResolveHints {
 }
   }
 
+  /**
+   * COALESCE Hint accepts name "COALESCE" and "REPARTITION".
+   * Its parameter includes a partition number.
+   */
+  class ResolveCoalesceHints(conf: SQLConf) extends Rule[LogicalPlan] {
+private val COALESCE_HINT_NAMES = Set("COALESCE", "REPARTITION")
+
+private def applyCoalesceHint(
+  plan: LogicalPlan,
+  numPartitions: Int,
+  shuffle: Boolean): LogicalPlan = {
+  Repartition(numPartitions, shuffle, plan)
+}
+
+def apply(plan: LogicalPlan): LogicalPlan = plan.resolveOperators {
+  case h: UnresolvedHint if 
COALESCE_HINT_NAMES.contains(h.name.toUpperCase(Locale.ROOT)) =>
+h.parameters match {
+  case Seq(Literal(numPartitions: Int, IntegerType)) =>
+val shuffle = h.name.toUpperCase(Locale.ROOT) match {
+  case "REPARTITION" => true
+  case "COALESCE" => false
+}
+applyCoalesceHint(h.child, numPartitions, shuffle)
+  case _ =>
+throw new AnalysisException("COALESCE Hint expects a partition 
number as parameter")
--- End diff --

Can you use `h.name.toUpperCase` in this error message instead? I think 
that would be a better message for users that don't know the relationship 
between COALESCE and REPARTITION.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org