[GitHub] spark pull request #19363: [SPARK-22224][SQL]Override toString of KeyValue/R...

2017-10-17 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/19363


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #19363: [SPARK-22224][SQL]Override toString of KeyValue/R...

2017-10-16 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/19363#discussion_r144873850
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/RelationalGroupedDataset.scala ---
@@ -479,7 +492,9 @@ private[sql] object RelationalGroupedDataset {
   /**
* The Grouping Type
*/
-  private[sql] trait GroupType
+  private[sql] trait GroupType {
+override def toString: String = getClass.getSimpleName.replace("$", "")
--- End diff --

`getClass.getSimpleName.stripSuffix("$").stripSuffix("Type")`


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #19363: [SPARK-22224][SQL]Override toString of KeyValue/R...

2017-10-16 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/19363#discussion_r144873610
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/DatasetSuite.scala 
---
@@ -1341,8 +1341,69 @@ class DatasetSuite extends QueryTest with 
SharedSQLContext {
   Seq(1).toDS().map(_ => ("", TestForTypeAlias.seqOfTupleTypeAlias)),
   ("", Seq((1, 1), (2, 2
   }
+
+  test("Check RelationalGroupedDataset toString: Single data") {
+val kvDataset = (1 to 3).toDF("id").groupBy("id")
+val expected = "RelationalGroupedDataset: [" +
+  "group by: [id: int], value: [id: int], type: GroupByType]"
--- End diff --

a better idea:
* `group by` -> `grouping expressions`
* `type: GroupByType` -> `type: GroupBy`, i.e. remove the `Type` suffix.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #19363: [SPARK-22224][SQL]Override toString of KeyValue/R...

2017-10-15 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/19363#discussion_r144753514
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/QueryTest.scala ---
@@ -46,6 +36,20 @@ abstract class QueryTest extends PlanTest {
   Locale.setDefault(Locale.US)
 
   /**
+   * Makes sure the answer matches the expected result
+   */
+  def checkString(expected: String, actual: String): Unit = {
+if (expected != actual) {
+  fail(
+"The actual string gives wrong result:\n\n" + sideBySide(
--- End diff --

it's kind of an overkill, the `toString` we are testing always return a 
single line, just `assert` is enough.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #19363: [SPARK-22224][SQL]Override toString of KeyValue/R...

2017-10-15 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/19363#discussion_r144753446
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/RelationalGroupedDatasetSuite.scala
 ---
@@ -0,0 +1,43 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql
+
+import org.apache.spark.sql.test.SharedSQLContext
+
+class RelationalGroupedDatasetSuite extends QueryTest with 
SharedSQLContext {
--- End diff --

we can put the tests in `DataSetSuite`, as `DataSet.toString` is also 
tested there.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #19363: [SPARK-22224][SQL]Override toString of KeyValue/R...

2017-10-15 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/19363#discussion_r144753415
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/KeyValueGroupedDatasetSuite.scala 
---
@@ -0,0 +1,61 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql
+
+import org.apache.spark.sql.test.SharedSQLContext
+
+class KeyValueGroupedDatasetSuite extends QueryTest with SharedSQLContext {
--- End diff --

we can put the tests in `DataSetSuite`, as `DataSet.toString` is also 
tested there.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #19363: [SPARK-22224][SQL]Override toString of KeyValue/R...

2017-10-15 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/19363#discussion_r144753249
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/KeyValueGroupedDataset.scala ---
@@ -564,4 +564,25 @@ class KeyValueGroupedDataset[K, V] private[sql](
   encoder: Encoder[R]): Dataset[R] = {
 cogroup(other)((key, left, right) => f.call(key, left.asJava, 
right.asJava).asScala)(encoder)
   }
+
+  override def toString: String = {
+val builder = new StringBuilder
+val kFields = kExprEnc.schema.map {
+  case f => s"${f.name}: ${f.dataType.simpleString(2)}"
+}
+val vFields = vExprEnc.schema.map {
+  case f => s"${f.name}: ${f.dataType.simpleString(2)}"
+}
+builder.append("KeyValueGroupedDataset: [key: [")
+builder.append(kFields.take(2).mkString(", "))
+if (kFields.length > 2) {
+  builder.append(" ... " + (kFields.length - 2) + " more field(s)")
+}
+builder.append("], value: [")
+builder.append(vFields.take(2).mkString(", "))
+if (vFields.length > 2) {
+  builder.append(" ... " + (vFields.length - 2) + " more field(s)")
+}
+builder.append("]]").toString()
--- End diff --

shall we include `df` here?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #19363: [SPARK-22224][SQL]Override toString of KeyValue/R...

2017-10-15 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/19363#discussion_r144753173
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/RelationalGroupedDataset.scala ---
@@ -465,6 +466,19 @@ class RelationalGroupedDataset protected[sql](
 
 Dataset.ofRows(df.sparkSession, plan)
   }
+
+  override def toString: String = {
+val builder = new StringBuilder
+builder.append("RelationalGroupedDataset: [groupingBy: [")
+val kFields = groupingExprs.map(_.asInstanceOf[NamedExpression]).map {
+  case f => s"${f.name}: ${f.dataType.simpleString(2)}"
+}
+builder.append(kFields.take(2).mkString(", "))
+if (kFields.length > 2) {
+  builder.append(" ... " + (kFields.length - 2) + " more field(s)")
+}
+builder.append(s"], df: ${df.toString}, type: $groupType]").toString()
--- End diff --

why include `df` here?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #19363: [SPARK-22224][SQL]Override toString of KeyValue/R...

2017-10-15 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/19363#discussion_r144753122
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/RelationalGroupedDataset.scala ---
@@ -465,6 +466,19 @@ class RelationalGroupedDataset protected[sql](
 
 Dataset.ofRows(df.sparkSession, plan)
   }
+
+  override def toString: String = {
+val builder = new StringBuilder
+builder.append("RelationalGroupedDataset: [groupingBy: [")
--- End diff --

`groupingBy` -> `group by`


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #19363: [SPARK-22224][SQL]Override toString of KeyValue/R...

2017-10-15 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/19363#discussion_r144753076
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/KeyValueGroupedDataset.scala ---
@@ -18,13 +18,13 @@
 package org.apache.spark.sql
 
 import scala.collection.JavaConverters._
+import scala.util.control.NonFatal
--- End diff --

unnecessary import


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #19363: [SPARK-22224][SQL]Override toString of KeyValue/R...

2017-10-15 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/19363#discussion_r144753090
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/RelationalGroupedDataset.scala ---
@@ -21,6 +21,7 @@ import java.util.Locale
 
 import scala.collection.JavaConverters._
 import scala.language.implicitConversions
+import scala.util.control.NonFatal
--- End diff --

ditto


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #19363: [SPARK-22224][SQL]Override toString of KeyValue/R...

2017-10-13 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/19363#discussion_r144501447
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/RelationalGroupedDataset.scala ---
@@ -465,6 +466,24 @@ class RelationalGroupedDataset protected[sql](
 
 Dataset.ofRows(df.sparkSession, plan)
   }
+
+  override def toString: String = {
+try {
+  val builder = new StringBuilder
+  builder.append("RelationalGroupedDataset: [key: [")
--- End diff --

`key:` or `groupingBy:`?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #19363: [SPARK-22224][SQL]Override toString of KeyValue/R...

2017-10-13 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/19363#discussion_r144502680
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/RelationalGroupedDatasetSuite.scala
 ---
@@ -0,0 +1,42 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql
+
+import org.apache.spark.sql.test.SharedSQLContext
+
+class RelationalGroupedDatasetSuite extends QueryTest with 
SharedSQLContext {
+  import testImplicits._
+
+  test("Check RelationalGroupedDataset toString: Single data") {
+val kvDataset = (1 to 3).toDF("id").as[SingleData].groupBy("id")
+val expected = "RelationalGroupedDataset: [key: [id: int], value: [id: 
int], GroupByType]"
+val actual = kvDataset.toString
+checkString(expected, actual)
+  }
+
+  test("Check RelationalGroupedDataset toString: over length schema ") {
+val kvDataset = (1 to 3).map( x => (x, x.toString, x.toLong))
+  .toDF("id", "val1", "val2").as[TripleData].groupBy("id")
--- End diff --

Is `TripleData` necessary?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #19363: [SPARK-22224][SQL]Override toString of KeyValue/R...

2017-10-13 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/19363#discussion_r144502598
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/RelationalGroupedDatasetSuite.scala
 ---
@@ -0,0 +1,42 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql
+
+import org.apache.spark.sql.test.SharedSQLContext
+
+class RelationalGroupedDatasetSuite extends QueryTest with 
SharedSQLContext {
+  import testImplicits._
+
+  test("Check RelationalGroupedDataset toString: Single data") {
+val kvDataset = (1 to 3).toDF("id").as[SingleData].groupBy("id")
--- End diff --

Is `SingleData` necessary?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #19363: [SPARK-22224][SQL]Override toString of KeyValue/R...

2017-10-13 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/19363#discussion_r144500697
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/RelationalGroupedDataset.scala ---
@@ -465,6 +466,24 @@ class RelationalGroupedDataset protected[sql](
 
 Dataset.ofRows(df.sparkSession, plan)
   }
+
+  override def toString: String = {
+try {
+  val builder = new StringBuilder
+  builder.append("RelationalGroupedDataset: [key: [")
+  val kFields = groupingExprs.map(_.asInstanceOf[NamedExpression]).map 
{
--- End diff --

Are all expressions in `groupingExprs` `NamedExpression`?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #19363: [SPARK-22224][SQL]Override toString of KeyValue/R...

2017-10-13 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/19363#discussion_r144501966
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/QueryTest.scala ---
@@ -46,6 +36,20 @@ abstract class QueryTest extends PlanTest {
   Locale.setDefault(Locale.US)
 
   /**
+   * Makes sure the answer matches the expected result
+   */
+  def checkString(expected: String, actual: String): Unit = {
+if (expected != actual) {
+  fail(
+"KeyValueGroupedDataset.toString() gives wrong result:\n\n" + 
sideBySide(
--- End diff --

You didn't only use this against `KeyValueGroupedDataset`.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #19363: [SPARK-22224][SQL]Override toString of KeyValue/R...

2017-10-13 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/19363#discussion_r144501685
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/RelationalGroupedDataset.scala ---
@@ -465,6 +466,24 @@ class RelationalGroupedDataset protected[sql](
 
 Dataset.ofRows(df.sparkSession, plan)
   }
+
+  override def toString: String = {
+try {
+  val builder = new StringBuilder
+  builder.append("RelationalGroupedDataset: [key: [")
+  val kFields = groupingExprs.map(_.asInstanceOf[NamedExpression]).map 
{
+case f => s"${f.name}: ${f.dataType.simpleString(2)}"
+  }
+  builder.append(kFields.take(2).mkString(", "))
+  if (kFields.length > 2) {
+builder.append(" ... " + (kFields.length - 2) + " more field(s)")
+  }
+  builder.append(s"], value: ${df.toString}, $groupType]").toString()
--- End diff --

`value:` or `df:`?

`$groupType` -> `type: $groupType`?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #19363: [SPARK-22224][SQL]Override toString of KeyValue/R...

2017-10-13 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/19363#discussion_r144500085
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/KeyValueGroupedDataset.scala ---
@@ -564,4 +565,30 @@ class KeyValueGroupedDataset[K, V] private[sql](
   encoder: Encoder[R]): Dataset[R] = {
 cogroup(other)((key, left, right) => f.call(key, left.asJava, 
right.asJava).asScala)(encoder)
   }
+
+  override def toString: String = {
+try {
+  val builder = new StringBuilder
+  val kFields = kExprEnc.schema.map {
+case f => s"${f.name}: ${f.dataType.simpleString(2)}"
+  }
+  val vFields = vExprEnc.schema.map {
+case f => s"${f.name}: ${f.dataType.simpleString(2)}"
+  }
+  builder.append("[key: [")
+  builder.append(kFields.take(2).mkString(", "))
+  if (kFields.length > 2) {
+builder.append(" ... " + (kFields.length - 2) + " more field(s)")
+  }
+  builder.append("], value: [")
+  builder.append(vFields.take(2).mkString(", "))
+  if (vFields.length > 2) {
+builder.append(" ... " + (vFields.length - 2) + " more field(s)")
+  }
+  builder.append("]]").toString()
+} catch {
+  case NonFatal(e) =>
--- End diff --

Although some methods of `StringBuilder` can throw exceptions, I didn't see 
simple `append` can throw exception. Access of `schema` should not throw 
exception at all. So I don't get why it is.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org