GitHub user MaxGekk opened a pull request: https://github.com/apache/spark/pull/21550
[SPARK-24543][SQL] Support any type as DDL string for from_json's schema

## What changes were proposed in this pull request?

In the PR, I propose to support any DataType represented as a DDL string for the `from_json` function. After the changes, it will be possible to specify `MapType` in SQL like:

```sql
select from_json('{"a":1, "b":2}', 'map<string, int>')
```

and in Scala (similar in other languages):

```scala
// Assumes a SparkSession named `spark` is in scope (e.g. in spark-shell).
import org.apache.spark.sql.functions.from_json
import spark.implicits._
val in = Seq("""{"a": {"b": 1}}""").toDS()
val schema = "map<string, map<string, int>>"
val out = in.select(from_json($"value", schema, Map.empty[String, String]))
```

## How was this patch tested?

Added a couple of SQL tests and modified existing tests for Python and Scala. The existing tests were modified because the format in which the schema for `from_json` is provided is not important for them.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/MaxGekk/spark-1 from_json-ddl-schema

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/21550.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #21550

----
commit 41d4522848610d3c8c7983157f0b4b7bded9dd94
Author: Maxim Gekk <maxim.gekk@...>
Date:   2018-06-13T05:56:33Z

    Support any types in schema DDL

commit f824f1651999f0ba8919d4b8d29329eb1f538237
Author: Maxim Gekk <maxim.gekk@...>
Date:   2018-06-13T05:56:57Z

    SQL tests for from_json

commit 08a01223354cf44174653996dae936aa09bf340d
Author: Maxim Gekk <maxim.gekk@...>
Date:   2018-06-13T06:47:46Z

    Support any DataType as schema for from_json

commit 41ad77ee74265a170191203bf0330a7c7b3b384d
Author: Maxim Gekk <maxim.gekk@...>
Date:   2018-06-13T06:48:40Z

    Test for MapType in PySpark's from_json

commit 5d53ec77f022a17a1ffb5c77937a32b3a32cea63
Author: Maxim Gekk <maxim.gekk@...>
Date:   2018-06-13T06:53:35Z

    Test for MapType in DDL as the root type for from_json

----