[GitHub] spark pull request #19223: [SPARK-21513][SQL][FOLLOWUP] Allow UDF to_json su...
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/19223

---
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/19223#discussion_r138857275

--- Diff: python/pyspark/sql/functions.py ---
@@ -1921,10 +1921,12 @@ def from_json(col, schema, options={}):
 @since(2.1)
 def to_json(col, options={}):
     """
-    Converts a column containing a [[StructType]] or [[ArrayType]] of [[StructType]]s into a
-    JSON string. Throws an exception, in the case of an unsupported type.
+    Converts a column containing a :class:`StructType`, :class:`ArrayType` of :class:`StructType`s,
--- End diff --

That's fine :).

---
Github user goldmedal commented on a diff in the pull request:

https://github.com/apache/spark/pull/19223#discussion_r138856179

--- Diff: python/pyspark/sql/functions.py ---
@@ -1921,10 +1921,12 @@ def from_json(col, schema, options={}):
 @since(2.1)
 def to_json(col, options={}):
     """
-    Converts a column containing a [[StructType]] or [[ArrayType]] of [[StructType]]s into a
-    JSON string. Throws an exception, in the case of an unsupported type.
+    Converts a column containing a :class:`StructType`, :class:`ArrayType` of :class:`StructType`s,
--- End diff --

OK, I'll modify it. I'm not really familiar with Python, so thanks for your suggestions. :)

---
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/19223#discussion_r138852772

--- Diff: python/pyspark/sql/functions.py ---
@@ -1921,10 +1921,12 @@ def from_json(col, schema, options={}):
 @since(2.1)
 def to_json(col, options={}):
     """
-    Converts a column containing a [[StructType]] or [[ArrayType]] of [[StructType]]s into a
-    JSON string. Throws an exception, in the case of an unsupported type.
+    Converts a column containing a :class:`StructType`, :class:`ArrayType` of :class:`StructType`s,
+    a :class:`MapType` or :class:`ArrayType` of :class:`MapType` into a JSON string.
--- End diff --

`` :class:`MapType` `` -> `` :class:`MapType`\\s `` for consistency.

---
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/19223#discussion_r138852676

--- Diff: python/pyspark/sql/functions.py ---
@@ -1921,10 +1921,12 @@ def from_json(col, schema, options={}):
 @since(2.1)
 def to_json(col, options={}):
     """
-    Converts a column containing a [[StructType]] or [[ArrayType]] of [[StructType]]s into a
-    JSON string. Throws an exception, in the case of an unsupported type.
+    Converts a column containing a :class:`StructType`, :class:`ArrayType` of :class:`StructType`s,
--- End diff --

I believe `` :class:`StructType`s `` should be `` :class:`StructType`\\s ``. Currently, this is rendered as below:

https://user-images.githubusercontent.com/6477701/30424466-05e7532a-9981-11e7-8b37-d83268495cc0.png

---
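The escaping suggested in this comment can be checked without building the documentation. The following is only a minimal sketch, assuming a hypothetical function `to_json_doc_example` whose docstring mirrors the line under review. In a regular (non-raw) Python string literal, `\\s` is stored as `\s`. reStructuredText accepts a backslash directly after the closing backtick as a valid end of the role and then renders the escaped `s` as a plain letter, whereas a bare `s` after the backtick prevents the role from being recognized.

```python
def to_json_doc_example():
    """
    Converts a column containing a :class:`StructType`, :class:`ArrayType` of
    :class:`StructType`\\s into a JSON string.
    """


# Hypothetical helper name; only the docstring matters here.
doc = to_json_doc_example.__doc__

# The source literal contains two backslashes, the stored docstring only one.
# That single backslash is what the reStructuredText parser sees.
has_single_escape = ":class:`StructType`\\s" in doc
has_double_escape = "\\\\s" in doc

print(doc)
print(has_single_escape, has_double_escape)
```

Writing the docstring as a raw string (`r"""..."""`) with a single `\s` would give the same stored text; the diff under review uses regular strings, so the doubled backslash is the form that fits there.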
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/19223#discussion_r138805358

--- Diff: python/pyspark/sql/functions.py ---
@@ -1921,10 +1921,12 @@ def from_json(col, schema, options={}):
 @since(2.1)
 def to_json(col, options={}):
     """
-    Converts a column containing a [[StructType]] or [[ArrayType]] of [[StructType]]s into a
-    JSON string. Throws an exception, in the case of an unsupported type.
+    Converts a column containing a :class:`StructType, :class:`ArrayType` of :class:`StructType`s,
--- End diff --

Missing a ` after StructType.

---
Github user goldmedal commented on a diff in the pull request:

https://github.com/apache/spark/pull/19223#discussion_r138800321

--- Diff: python/pyspark/sql/functions.py ---
@@ -1921,10 +1921,12 @@ def from_json(col, schema, options={}):
 @since(2.1)
 def to_json(col, options={}):
     """
-    Converts a column containing a [[StructType]] or [[ArrayType]] of [[StructType]]s into a
-    JSON string. Throws an exception, in the case of an unsupported type.
+    Converts a column containing a [[StructType]], [[ArrayType]] of [[StructType]]s,
+    a [[MapType]] or [[ArrayType]] of [[MapType]] into a JSON string.
+    Throws an exception, in the case of an unsupported type.
--- End diff --

OK, thanks.

---
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/19223#discussion_r13870

--- Diff: sql/core/src/test/resources/sql-tests/results/json-functions.sql.out ---
@@ -26,13 +26,13 @@ Extended Usage:
 {"time":"26/08/2015"}
 > SELECT to_json(array(named_struct('a', 1, 'b', 2));
 [{"a":1,"b":2}]
- > SELECT to_json(map('a',named_struct('b',1)));
+ > SELECT to_json(map('a', named_struct('b', 1)));
--- End diff --

Oh. I see.

---
Github user goldmedal commented on a diff in the pull request:

https://github.com/apache/spark/pull/19223#discussion_r138799591

--- Diff: R/pkg/R/functions.R ---
@@ -1715,7 +1717,15 @@ setMethod("to_date",
 #'
 #' # Converts an array of structs into a JSON array
 #' df2 <- sql("SELECT array(named_struct('name', 'Bob'), named_struct('name', 'Alice')) as people")
-#' df2 <- mutate(df2, people_json = to_json(df2$people))}
+#' df2 <- mutate(df2, people_json = to_json(df2$people))
+#'
+#' # Converts a map into a JSON object
+#' df2 <- sql("SELECT map('name', 'Bob')) as people")
+#' df2 <- mutate(df2, people_json = to_json(df2$people))
+#'
+#' # Converts an array of maps into a JSON array
+#' df2 <- sql("SELECT array(map('name', 'Bob'), map('name', 'Alice')) as people")
+#' df2 <- mutate(df2, people_json = to_json(df2$people))
--- End diff --

OK, thanks for the careful review :)

---
Github user goldmedal commented on a diff in the pull request:

https://github.com/apache/spark/pull/19223#discussion_r138799483

--- Diff: sql/core/src/test/resources/sql-tests/results/json-functions.sql.out ---
@@ -26,13 +26,13 @@ Extended Usage:
 {"time":"26/08/2015"}
 > SELECT to_json(array(named_struct('a', 1, 'b', 2));
 [{"a":1,"b":2}]
- > SELECT to_json(map('a',named_struct('b',1)));
+ > SELECT to_json(map('a', named_struct('b', 1)));
--- End diff --

Umm, I modified the `ExpressionDescription` of `StructsToJson` following @HyukjinKwon's suggestions, which weren't merged in the last PR. There is a test for `describe function extended to_json`, so I needed to regenerate the golden file for it. So this change doesn't come from `json-functions.sql`.

---
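The reason a documentation-only change forces a golden-file update can be sketched in a few lines. This is an illustration only, not Spark's test harness: `check_golden` is a hypothetical helper, and the two usage strings are taken from the diff quoted above. The stored golden file contains the old usage text, so the comparison fails after the description changes and passes again once the file is regenerated.

```python
import os
import tempfile


def check_golden(actual: str, golden_path: str, regenerate: bool = False) -> bool:
    """Compare `actual` with the stored golden file; rewrite the file when asked."""
    if regenerate or not os.path.exists(golden_path):
        with open(golden_path, "w", encoding="utf-8") as f:
            f.write(actual)
        return True
    with open(golden_path, encoding="utf-8") as f:
        return f.read() == actual


with tempfile.TemporaryDirectory() as tmp:
    golden = os.path.join(tmp, "json-functions.sql.out")
    old_usage = "> SELECT to_json(map('a',named_struct('b',1)));\n"
    new_usage = "> SELECT to_json(map('a', named_struct('b', 1)));\n"

    # First run creates the golden file from the old description.
    first_run = check_golden(old_usage, golden)
    # The description changed, so the stale golden file no longer matches.
    after_doc_change = check_golden(new_usage, golden)
    # Regenerating rewrites the file with the new expected output.
    after_regen = check_golden(new_usage, golden, regenerate=True)
    final_check = check_golden(new_usage, golden)

print(first_run, after_doc_change, after_regen, final_check)
```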
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/19223#discussion_r138795133

--- Diff: sql/core/src/test/resources/sql-tests/results/json-functions.sql.out ---
@@ -26,13 +26,13 @@ Extended Usage:
 {"time":"26/08/2015"}
 > SELECT to_json(array(named_struct('a', 1, 'b', 2));
 [{"a":1,"b":2}]
- > SELECT to_json(map('a',named_struct('b',1)));
+ > SELECT to_json(map('a', named_struct('b', 1)));
--- End diff --

Or did you forget to commit `json-functions.sql`?

---
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/19223#discussion_r138795006

--- Diff: sql/core/src/test/resources/sql-tests/results/json-functions.sql.out ---
@@ -26,13 +26,13 @@ Extended Usage:
 {"time":"26/08/2015"}
 > SELECT to_json(array(named_struct('a', 1, 'b', 2));
 [{"a":1,"b":2}]
- > SELECT to_json(map('a',named_struct('b',1)));
+ > SELECT to_json(map('a', named_struct('b', 1)));
--- End diff --

I think you committed an unrelated change?

---
Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/19223#discussion_r138794052

--- Diff: R/pkg/R/functions.R ---
@@ -1715,7 +1717,15 @@ setMethod("to_date",
 #'
 #' # Converts an array of structs into a JSON array
 #' df2 <- sql("SELECT array(named_struct('name', 'Bob'), named_struct('name', 'Alice')) as people")
-#' df2 <- mutate(df2, people_json = to_json(df2$people))}
+#' df2 <- mutate(df2, people_json = to_json(df2$people))
+#'
+#' # Converts a map into a JSON object
+#' df2 <- sql("SELECT map('name', 'Bob')) as people")
+#' df2 <- mutate(df2, people_json = to_json(df2$people))
+#'
+#' # Converts an array of maps into a JSON array
+#' df2 <- sql("SELECT array(map('name', 'Bob'), map('name', 'Alice')) as people")
+#' df2 <- mutate(df2, people_json = to_json(df2$people))
--- End diff --

... meaning `}`

---
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/19223#discussion_r138774812

--- Diff: R/pkg/R/functions.R ---
@@ -1715,7 +1717,15 @@ setMethod("to_date",
 #'
 #' # Converts an array of structs into a JSON array
 #' df2 <- sql("SELECT array(named_struct('name', 'Bob'), named_struct('name', 'Alice')) as people")
-#' df2 <- mutate(df2, people_json = to_json(df2$people))}
+#' df2 <- mutate(df2, people_json = to_json(df2$people))
+#'
+#' # Converts a map into a JSON object
+#' df2 <- sql("SELECT map('name', 'Bob')) as people")
+#' df2 <- mutate(df2, people_json = to_json(df2$people))
+#'
+#' # Converts an array of maps into a JSON array
+#' df2 <- sql("SELECT array(map('name', 'Bob'), map('name', 'Alice')) as people")
+#' df2 <- mutate(df2, people_json = to_json(df2$people))
--- End diff --

And.. missing closing parentheses for `\dontrun{`.

---
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/19223#discussion_r138767978

--- Diff: python/pyspark/sql/functions.py ---
@@ -1921,10 +1921,12 @@ def from_json(col, schema, options={}):
 @since(2.1)
 def to_json(col, options={}):
     """
-    Converts a column containing a [[StructType]] or [[ArrayType]] of [[StructType]]s into a
-    JSON string. Throws an exception, in the case of an unsupported type.
+    Converts a column containing a [[StructType]], [[ArrayType]] of [[StructType]]s,
+    a [[MapType]] or [[ArrayType]] of [[MapType]] into a JSON string.
+    Throws an exception, in the case of an unsupported type.
--- End diff --

While we are here, let's change `[[StructType]]` to `` :class:`StructType` `` (and the other instances of the same pattern too) to make the Python API documentation pretty.

---
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/19223#discussion_r138766031

--- Diff: R/pkg/R/functions.R ---
@@ -1700,8 +1701,9 @@ setMethod("to_date",
           })

 #' @details
-#' \code{to_json}: Converts a column containing a \code{structType} or array of \code{structType}
-#' into a Column of JSON string. Resolving the Column can fail if an unsupported type is encountered.
+#' \code{to_json}: Converts a column containing a \code{structType}, array of \code{structType},
+# a \code{mapType} or array of \code{mapType} into a Column of JSON string.
--- End diff --

Looks like `'` is missing after the first `#`.

---
GitHub user goldmedal opened a pull request:

https://github.com/apache/spark/pull/19223

[SPARK-21513][SQL][FOLLOWUP] Allow UDF to_json support converting MapType to json for PySpark and SparkR

## What changes were proposed in this pull request?
In the previous work, SPARK-21513, we allowed `MapType` and `ArrayType` of `MapType`s to be converted to a JSON string, but only for the Scala API. In this follow-up PR, we make Spark SQL support it for PySpark and SparkR, too. We also fix some small bugs and comments from the previous work.

### For PySpark
```
>>> data = [(1, {"name": "Alice"})]
>>> df = spark.createDataFrame(data, ("key", "value"))
>>> df.select(to_json(df.value).alias("json")).collect()
[Row(json=u'{"name":"Alice"}')]
>>> data = [(1, [{"name": "Alice"}, {"name": "Bob"}])]
>>> df = spark.createDataFrame(data, ("key", "value"))
>>> df.select(to_json(df.value).alias("json")).collect()
[Row(json=u'[{"name":"Alice"},{"name":"Bob"}]')]
```
### For SparkR
```
# Converts a map into a JSON object
df2 <- sql("SELECT map('name', 'Bob') as people")
df2 <- mutate(df2, people_json = to_json(df2$people))
# Converts an array of maps into a JSON array
df2 <- sql("SELECT array(map('name', 'Bob'), map('name', 'Alice')) as people")
df2 <- mutate(df2, people_json = to_json(df2$people))
```
## How was this patch tested?
Add unit test cases.

cc @viirya @HyukjinKwon

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/goldmedal/spark SPARK-21513-fp-PySaprkAndSparkR

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/19223.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #19223

commit 071173c2a30486e6e462e85c9e25b04db9f1d8d6
Author: goldmedal
Date: 2017-09-13T18:41:13Z

    fix the coding style issue

commit 80fbb6b589c7ecadb1016ede701d363035793eae
Author: goldmedal
Date: 2017-09-13T18:41:54Z

    fix the logic operator using

commit 5e9a7266002918babedaf426c68b9e2b93e7b967
Author: goldmedal
Date: 2017-09-13T18:48:40Z

    add comments and test cases for sparkR

commit 6a3d374cac58e51b3c687b30e4bd924694c0ff91
Author: goldmedal
Date: 2017-09-13T18:52:08Z

    add comments and test cases for PySpark

commit 1f5b7cf86c19b6570dc64e4c8f12a215068dcc7f
Author: goldmedal
Date: 2017-09-13T18:55:30Z

    fix some bug and comments

commit 29e7323467c319d8e83a086d20f8bffde34a7b15
Author: goldmedal
Date: 2017-09-13T19:06:51Z

    re-generate golden file

---
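The JSON strings in the PySpark examples of the pull request description have a compact shape: a map becomes a JSON object, an array of maps becomes a JSON array, and there is no whitespace after separators. The sketch below reproduces only that shape with Python's standard `json` module. It is a plain-Python stand-in and not the Spark implementation, and `to_compact_json` is a hypothetical helper name.

```python
import json


def to_compact_json(value):
    """Serialize a dict or a list of dicts without spaces after separators."""
    return json.dumps(value, separators=(",", ":"))


# A map-like value becomes a JSON object.
map_json = to_compact_json({"name": "Alice"})
# An array of map-like values becomes a JSON array of objects.
array_json = to_compact_json([{"name": "Alice"}, {"name": "Bob"}])

print(map_json)    # {"name":"Alice"}
print(array_json)  # [{"name":"Alice"},{"name":"Bob"}]
```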