[GitHub] spark pull request #19223: [SPARK-21513][SQL][FOLLOWUP] Allow UDF to_json su...

2017-09-14 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/19223


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #19223: [SPARK-21513][SQL][FOLLOWUP] Allow UDF to_json su...

2017-09-14 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/19223#discussion_r138857275
  
--- Diff: python/pyspark/sql/functions.py ---
@@ -1921,10 +1921,12 @@ def from_json(col, schema, options={}):
 @since(2.1)
 def to_json(col, options={}):
 """
-Converts a column containing a [[StructType]] or [[ArrayType]] of 
[[StructType]]s into a
-JSON string. Throws an exception, in the case of an unsupported type.
+Converts a column containing a :class:`StructType`, :class:`ArrayType` 
of :class:`StructType`s,
--- End diff --

That's fine :).





[GitHub] spark pull request #19223: [SPARK-21513][SQL][FOLLOWUP] Allow UDF to_json su...

2017-09-14 Thread goldmedal
Github user goldmedal commented on a diff in the pull request:

https://github.com/apache/spark/pull/19223#discussion_r138856179
  
--- Diff: python/pyspark/sql/functions.py ---
@@ -1921,10 +1921,12 @@ def from_json(col, schema, options={}):
 @since(2.1)
 def to_json(col, options={}):
 """
-Converts a column containing a [[StructType]] or [[ArrayType]] of 
[[StructType]]s into a
-JSON string. Throws an exception, in the case of an unsupported type.
+Converts a column containing a :class:`StructType`, :class:`ArrayType` 
of :class:`StructType`s,
--- End diff --

OK, I'll modify it. I'm not really familiar with Python, so thanks for your 
suggestions. :)





[GitHub] spark pull request #19223: [SPARK-21513][SQL][FOLLOWUP] Allow UDF to_json su...

2017-09-14 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/19223#discussion_r138852772
  
--- Diff: python/pyspark/sql/functions.py ---
@@ -1921,10 +1921,12 @@ def from_json(col, schema, options={}):
 @since(2.1)
 def to_json(col, options={}):
 """
-Converts a column containing a [[StructType]] or [[ArrayType]] of 
[[StructType]]s into a
-JSON string. Throws an exception, in the case of an unsupported type.
+Converts a column containing a :class:`StructType`, :class:`ArrayType` 
of :class:`StructType`s,
+a :class:`MapType` or :class:`ArrayType` of :class:`MapType` into a 
JSON string.
--- End diff --

`` :class:`MapType` `` -> `` :class:`MapType`\\s `` for consistency.





[GitHub] spark pull request #19223: [SPARK-21513][SQL][FOLLOWUP] Allow UDF to_json su...

2017-09-14 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/19223#discussion_r138852676
  
--- Diff: python/pyspark/sql/functions.py ---
@@ -1921,10 +1921,12 @@ def from_json(col, schema, options={}):
 @since(2.1)
 def to_json(col, options={}):
 """
-Converts a column containing a [[StructType]] or [[ArrayType]] of 
[[StructType]]s into a
-JSON string. Throws an exception, in the case of an unsupported type.
+Converts a column containing a :class:`StructType`, :class:`ArrayType` 
of :class:`StructType`s,
--- End diff --

I believe `` :class:`StructType`s `` should be ``:class:`StructType`\\s``. 
Currently, this is rendered as below:

https://user-images.githubusercontent.com/6477701/30424466-05e7532a-9981-11e7-8b37-d83268495cc0.png
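
For context on why the backslash helps: reStructuredText expects the closing backtick of a role to be followed by whitespace or certain punctuation, and a backslash is one of the accepted characters; the backslash is then consumed as an escape, so only the trailing `s` is rendered. Inside a normal (non-raw) Python docstring, the backslash itself has to be doubled. A minimal sketch of what ends up in the docstring, where the function name `to_json_doc_example` is hypothetical and used only for illustration:

```python
def to_json_doc_example():
    """
    Converts a column containing a :class:`StructType`, :class:`ArrayType`
    of :class:`StructType`\\s into a JSON string.
    """


doc = to_json_doc_example.__doc__

# In a non-raw string literal, "\\s" is stored as one backslash followed by "s",
# which is the text Sphinx/reStructuredText actually sees.
has_single_backslash = "`\\s" in doc
has_double_backslash = "`\\\\s" in doc

print(has_single_backslash)  # True
print(has_double_backslash)  # False
```

A raw docstring (`r"""..."""`) would need only a single backslash in the source; the example above assumes a non-raw docstring, as the `\\s` in the suggestion implies.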






[GitHub] spark pull request #19223: [SPARK-21513][SQL][FOLLOWUP] Allow UDF to_json su...

2017-09-14 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/19223#discussion_r138805358
  
--- Diff: python/pyspark/sql/functions.py ---
@@ -1921,10 +1921,12 @@ def from_json(col, schema, options={}):
 @since(2.1)
 def to_json(col, options={}):
 """
-Converts a column containing a [[StructType]] or [[ArrayType]] of 
[[StructType]]s into a
-JSON string. Throws an exception, in the case of an unsupported type.
+Converts a column containing a :class:`StructType, :class:`ArrayType` 
of :class:`StructType`s,
--- End diff --

Missing a ` after StructType.





[GitHub] spark pull request #19223: [SPARK-21513][SQL][FOLLOWUP] Allow UDF to_json su...

2017-09-13 Thread goldmedal
Github user goldmedal commented on a diff in the pull request:

https://github.com/apache/spark/pull/19223#discussion_r138800321
  
--- Diff: python/pyspark/sql/functions.py ---
@@ -1921,10 +1921,12 @@ def from_json(col, schema, options={}):
 @since(2.1)
 def to_json(col, options={}):
 """
-Converts a column containing a [[StructType]] or [[ArrayType]] of 
[[StructType]]s into a
-JSON string. Throws an exception, in the case of an unsupported type.
+Converts a column containing a [[StructType]], [[ArrayType]] of 
[[StructType]]s,
+a [[MapType]] or [[ArrayType]] of [[MapType]] into a JSON string.
+Throws an exception, in the case of an unsupported type.
--- End diff --

ok Thanks.





[GitHub] spark pull request #19223: [SPARK-21513][SQL][FOLLOWUP] Allow UDF to_json su...

2017-09-13 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/19223#discussion_r13870
  
--- Diff: 
sql/core/src/test/resources/sql-tests/results/json-functions.sql.out ---
@@ -26,13 +26,13 @@ Extended Usage:
{"time":"26/08/2015"}
   > SELECT to_json(array(named_struct('a', 1, 'b', 2));
[{"a":1,"b":2}]
-  > SELECT to_json(map('a',named_struct('b',1)));
+  > SELECT to_json(map('a', named_struct('b', 1)));
--- End diff --

Oh. I see.





[GitHub] spark pull request #19223: [SPARK-21513][SQL][FOLLOWUP] Allow UDF to_json su...

2017-09-13 Thread goldmedal
Github user goldmedal commented on a diff in the pull request:

https://github.com/apache/spark/pull/19223#discussion_r138799591
  
--- Diff: R/pkg/R/functions.R ---
@@ -1715,7 +1717,15 @@ setMethod("to_date",
 #'
 #' # Converts an array of structs into a JSON array
 #' df2 <- sql("SELECT array(named_struct('name', 'Bob'), 
named_struct('name', 'Alice')) as people")
-#' df2 <- mutate(df2, people_json = to_json(df2$people))}
+#' df2 <- mutate(df2, people_json = to_json(df2$people))
+#'
+#' # Converts a map into a JSON object
+#' df2 <- sql("SELECT map('name', 'Bob')) as people")
+#' df2 <- mutate(df2, people_json = to_json(df2$people))
+#'
+#' # Converts an array of maps into a JSON array
+#' df2 <- sql("SELECT array(map('name', 'Bob'), map('name', 'Alice')) as 
people")
+#' df2 <- mutate(df2, people_json = to_json(df2$people))
--- End diff --

OK, thanks for the careful review :)





[GitHub] spark pull request #19223: [SPARK-21513][SQL][FOLLOWUP] Allow UDF to_json su...

2017-09-13 Thread goldmedal
Github user goldmedal commented on a diff in the pull request:

https://github.com/apache/spark/pull/19223#discussion_r138799483
  
--- Diff: 
sql/core/src/test/resources/sql-tests/results/json-functions.sql.out ---
@@ -26,13 +26,13 @@ Extended Usage:
{"time":"26/08/2015"}
   > SELECT to_json(array(named_struct('a', 1, 'b', 2));
[{"a":1,"b":2}]
-  > SELECT to_json(map('a',named_struct('b',1)));
+  > SELECT to_json(map('a', named_struct('b', 1)));
--- End diff --

Umm, I modified the `ExpressionDescription` of `StructsToJson` following 
@HyukjinKwon's suggestions, which weren't merged in the last PR. There is a test 
for `describe function extended to_json`, so I needed to regenerate the golden 
file for it. So this change doesn't come from `json-functions.sql`.





[GitHub] spark pull request #19223: [SPARK-21513][SQL][FOLLOWUP] Allow UDF to_json su...

2017-09-13 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/19223#discussion_r138795133
  
--- Diff: 
sql/core/src/test/resources/sql-tests/results/json-functions.sql.out ---
@@ -26,13 +26,13 @@ Extended Usage:
{"time":"26/08/2015"}
   > SELECT to_json(array(named_struct('a', 1, 'b', 2));
[{"a":1,"b":2}]
-  > SELECT to_json(map('a',named_struct('b',1)));
+  > SELECT to_json(map('a', named_struct('b', 1)));
--- End diff --

Or did you forget to commit `json-functions.sql`?





[GitHub] spark pull request #19223: [SPARK-21513][SQL][FOLLOWUP] Allow UDF to_json su...

2017-09-13 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/19223#discussion_r138795006
  
--- Diff: 
sql/core/src/test/resources/sql-tests/results/json-functions.sql.out ---
@@ -26,13 +26,13 @@ Extended Usage:
{"time":"26/08/2015"}
   > SELECT to_json(array(named_struct('a', 1, 'b', 2));
[{"a":1,"b":2}]
-  > SELECT to_json(map('a',named_struct('b',1)));
+  > SELECT to_json(map('a', named_struct('b', 1)));
--- End diff --

I think you committed an unrelated change?





[GitHub] spark pull request #19223: [SPARK-21513][SQL][FOLLOWUP] Allow UDF to_json su...

2017-09-13 Thread felixcheung
Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/19223#discussion_r138794052
  
--- Diff: R/pkg/R/functions.R ---
@@ -1715,7 +1717,15 @@ setMethod("to_date",
 #'
 #' # Converts an array of structs into a JSON array
 #' df2 <- sql("SELECT array(named_struct('name', 'Bob'), 
named_struct('name', 'Alice')) as people")
-#' df2 <- mutate(df2, people_json = to_json(df2$people))}
+#' df2 <- mutate(df2, people_json = to_json(df2$people))
+#'
+#' # Converts a map into a JSON object
+#' df2 <- sql("SELECT map('name', 'Bob')) as people")
+#' df2 <- mutate(df2, people_json = to_json(df2$people))
+#'
+#' # Converts an array of maps into a JSON array
+#' df2 <- sql("SELECT array(map('name', 'Bob'), map('name', 'Alice')) as 
people")
+#' df2 <- mutate(df2, people_json = to_json(df2$people))
--- End diff --

... meaning `}`





[GitHub] spark pull request #19223: [SPARK-21513][SQL][FOLLOWUP] Allow UDF to_json su...

2017-09-13 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/19223#discussion_r138774812
  
--- Diff: R/pkg/R/functions.R ---
@@ -1715,7 +1717,15 @@ setMethod("to_date",
 #'
 #' # Converts an array of structs into a JSON array
 #' df2 <- sql("SELECT array(named_struct('name', 'Bob'), 
named_struct('name', 'Alice')) as people")
-#' df2 <- mutate(df2, people_json = to_json(df2$people))}
+#' df2 <- mutate(df2, people_json = to_json(df2$people))
+#'
+#' # Converts a map into a JSON object
+#' df2 <- sql("SELECT map('name', 'Bob')) as people")
+#' df2 <- mutate(df2, people_json = to_json(df2$people))
+#'
+#' # Converts an array of maps into a JSON array
+#' df2 <- sql("SELECT array(map('name', 'Bob'), map('name', 'Alice')) as 
people")
+#' df2 <- mutate(df2, people_json = to_json(df2$people))
--- End diff --

And the closing brace for `\dontrun{` is missing.





[GitHub] spark pull request #19223: [SPARK-21513][SQL][FOLLOWUP] Allow UDF to_json su...

2017-09-13 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/19223#discussion_r138767978
  
--- Diff: python/pyspark/sql/functions.py ---
@@ -1921,10 +1921,12 @@ def from_json(col, schema, options={}):
 @since(2.1)
 def to_json(col, options={}):
 """
-Converts a column containing a [[StructType]] or [[ArrayType]] of 
[[StructType]]s into a
-JSON string. Throws an exception, in the case of an unsupported type.
+Converts a column containing a [[StructType]], [[ArrayType]] of 
[[StructType]]s,
+a [[MapType]] or [[ArrayType]] of [[MapType]] into a JSON string.
+Throws an exception, in the case of an unsupported type.
--- End diff --

While we are here, let's change `[[StructType]]` to `` :class:`StructType` `` 
(and the other such instances too) to make the Python API documentation pretty.





[GitHub] spark pull request #19223: [SPARK-21513][SQL][FOLLOWUP] Allow UDF to_json su...

2017-09-13 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/19223#discussion_r138766031
  
--- Diff: R/pkg/R/functions.R ---
@@ -1700,8 +1701,9 @@ setMethod("to_date",
   })
 
 #' @details
-#' \code{to_json}: Converts a column containing a \code{structType} or 
array of \code{structType}
-#' into a Column of JSON string. Resolving the Column can fail if an 
unsupported type is encountered.
+#' \code{to_json}: Converts a column containing a \code{structType}, array 
of \code{structType},
+#  a \code{mapType} or array of \code{mapType} into a Column of JSON 
string.
--- End diff --

Looks like the `'` is missing after the leading `#`.
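
roxygen2 documentation lines begin with `#'`, so a line that starts with a bare `#` is not treated as part of the documentation. A rough Python sketch for spotting such lines in a comment block; the helper name `lines_missing_roxygen_quote` is hypothetical and not part of Spark's tooling:

```python
def lines_missing_roxygen_quote(block):
    """Return 1-based numbers of comment lines that start with '#' but not "#'"."""
    missing = []
    for number, line in enumerate(block.splitlines(), start=1):
        stripped = line.lstrip()
        if stripped.startswith("#") and not stripped.startswith("#'"):
            missing.append(number)
    return missing


# Sample block modelled on the diff above; the third line lacks the quote.
sample = "\n".join([
    "#' @details",
    "#' \\code{to_json}: Converts a column containing a \\code{structType},",
    "#  a \\code{mapType} or array of \\code{mapType} into a Column of JSON",
    "#' string.",
])

print(lines_missing_roxygen_quote(sample))  # [3]
```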





[GitHub] spark pull request #19223: [SPARK-21513][SQL][FOLLOWUP] Allow UDF to_json su...

2017-09-13 Thread goldmedal
GitHub user goldmedal opened a pull request:

https://github.com/apache/spark/pull/19223

[SPARK-21513][SQL][FOLLOWUP] Allow UDF to_json support converting MapType 
to json for PySpark and SparkR

## What changes were proposed in this pull request?
In the previous work, SPARK-21513, we allowed `MapType` and `ArrayType` of 
`MapType`s to be converted to a JSON string, but only in the Scala API. This 
follow-up PR adds the same support to PySpark and SparkR. It also fixes some 
minor bugs and comments from the previous work.

### For PySpark
```
>>> data = [(1, {"name": "Alice"})]
>>> df = spark.createDataFrame(data, ("key", "value"))
>>> df.select(to_json(df.value).alias("json")).collect()
[Row(json=u'{"name":"Alice"}')]
>>> data = [(1, [{"name": "Alice"}, {"name": "Bob"}])]
>>> df = spark.createDataFrame(data, ("key", "value"))
>>> df.select(to_json(df.value).alias("json")).collect()
[Row(json=u'[{"name":"Alice"},{"name":"Bob"}]')]
```
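
For reference, the JSON strings in the output above have the compact shape (no spaces after `,` or `:`) that Python's standard `json.dumps` produces with explicit separators. The following is only a standard-library sketch of that output shape, not Spark's implementation; `to_compact_json` is a hypothetical helper:

```python
import json


def to_compact_json(value):
    """Serialize a dict or list without spaces after ',' and ':'."""
    return json.dumps(value, separators=(",", ":"))


single_map = {"name": "Alice"}
array_of_maps = [{"name": "Alice"}, {"name": "Bob"}]

print(to_compact_json(single_map))     # {"name":"Alice"}
print(to_compact_json(array_of_maps))  # [{"name":"Alice"},{"name":"Bob"}]
```

A map becomes a JSON object and an array of maps becomes a JSON array of objects, which matches the two `Row(json=...)` results shown in the PySpark session.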
### For SparkR
```
# Converts a map into a JSON object
df2 <- sql("SELECT map('name', 'Bob') as people")
df2 <- mutate(df2, people_json = to_json(df2$people))
# Converts an array of maps into a JSON array
df2 <- sql("SELECT array(map('name', 'Bob'), map('name', 'Alice')) as people")
df2 <- mutate(df2, people_json = to_json(df2$people))
```
## How was this patch tested?
Added unit test cases.

cc @viirya @HyukjinKwon 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/goldmedal/spark SPARK-21513-fp-PySaprkAndSparkR

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/19223.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #19223


commit 071173c2a30486e6e462e85c9e25b04db9f1d8d6
Author: goldmedal 
Date:   2017-09-13T18:41:13Z

fix the coding style issue

commit 80fbb6b589c7ecadb1016ede701d363035793eae
Author: goldmedal 
Date:   2017-09-13T18:41:54Z

fix the logic operator using

commit 5e9a7266002918babedaf426c68b9e2b93e7b967
Author: goldmedal 
Date:   2017-09-13T18:48:40Z

add comments and test cases for sparkR

commit 6a3d374cac58e51b3c687b30e4bd924694c0ff91
Author: goldmedal 
Date:   2017-09-13T18:52:08Z

add comments and test cases for PySpark

commit 1f5b7cf86c19b6570dc64e4c8f12a215068dcc7f
Author: goldmedal 
Date:   2017-09-13T18:55:30Z

fix some bug and comments

commit 29e7323467c319d8e83a086d20f8bffde34a7b15
Author: goldmedal 
Date:   2017-09-13T19:06:51Z

re-generate golden file



