[ https://issues.apache.org/jira/browse/SPARK-24642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16529030#comment-16529030 ]
Maxim Gekk edited comment on SPARK-24642 at 7/1/18 10:05 AM:
-------------------------------------------------------------

[~rxin] I created a new ticket, SPARK-24709, which aims to add a simpler function. Here is the PR for that ticket: https://github.com/apache/spark/pull/21686


was (Author: maxgekk):
I created new ticket SPARK-24709 which aims to add simpler function.

> Add a function which infers schema from a JSON column
> -----------------------------------------------------
>
>                 Key: SPARK-24642
>                 URL: https://issues.apache.org/jira/browse/SPARK-24642
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 2.3.1
>            Reporter: Maxim Gekk
>            Priority: Minor
>
> We need to add a new aggregate function, *infer_schema()*. The function should
> infer the schema for a set of JSON strings. The result of the function is a
> schema in DDL format (or JSON format).
> One of the use cases is passing the output of *infer_schema()* to *from_json()*.
> Currently, the from_json() function requires a schema as a mandatory
> argument. It is possible to infer the schema programmatically in Scala/Python
> and pass it as the second argument, but in SQL this is not possible: a user has
> to pass the schema as a string literal. The new function should make this
> possible in SQL, as in the example:
> {code:sql}
> select from_json(json_col, infer_schema(json_col))
> from json_table;
> {code}
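
To make the proposal concrete, here is a minimal pure-Python sketch of what an aggregate like *infer_schema()* might compute over a set of JSON strings. It is not Spark's implementation. The helper names (`infer_type`, `merge_types`, `render`), the type mapping (integers to bigint, floats to double, conflicting types to string) and the `struct<...>` output form are all assumptions made for illustration.

```python
import json


def infer_type(value):
    """Map one parsed JSON value to a type (assumed, Spark-like names)."""
    if value is None:
        return "null"
    if isinstance(value, bool):  # check bool before int: bool is a subclass of int
        return "boolean"
    if isinstance(value, int):
        return "bigint"
    if isinstance(value, float):
        return "double"
    if isinstance(value, str):
        return "string"
    if isinstance(value, list):
        elem = "null"
        for item in value:
            elem = merge_types(elem, infer_type(item))
        return ("array", elem)
    if isinstance(value, dict):
        return ("struct", {k: infer_type(v) for k, v in value.items()})
    raise TypeError(f"unsupported JSON value: {value!r}")


def merge_types(a, b):
    """Combine two inferred types into one that can hold both."""
    if a == b:
        return a
    if a == "null":
        return b
    if b == "null":
        return a
    if isinstance(a, str) and isinstance(b, str):
        # Assumed rule: widen bigint/double to double, otherwise use string.
        return "double" if {a, b} == {"bigint", "double"} else "string"
    if isinstance(a, tuple) and isinstance(b, tuple) and a[0] == b[0]:
        if a[0] == "array":
            return ("array", merge_types(a[1], b[1]))
        fields = dict(a[1])
        for name, t in b[1].items():
            fields[name] = merge_types(fields[name], t) if name in fields else t
        return ("struct", fields)
    return "string"  # assumed fallback for incompatible kinds


def render(t):
    """Render an inferred type as a struct<...> style string."""
    if isinstance(t, str):
        return "string" if t == "null" else t  # all-null fields default to string
    kind, inner = t
    if kind == "array":
        return f"array<{render(inner)}>"
    # Fields are sorted by name so the output is deterministic.
    fields = ",".join(f"{n}:{render(ft)}" for n, ft in sorted(inner.items()))
    return f"struct<{fields}>"


def infer_schema(json_strings):
    """Aggregate step: fold the types of all JSON strings into one schema."""
    merged = "null"
    for s in json_strings:
        merged = merge_types(merged, infer_type(json.loads(s)))
    return render(merged)


rows = ['{"a": 1, "b": "x"}', '{"a": 2.5, "c": [true]}']
print(infer_schema(rows))  # struct<a:double,b:string,c:array<boolean>>
```

The fold in `infer_schema` is what makes this an aggregate: each row contributes a type, and the merge is order-independent for the cases shown, so the resulting string could then be handed to something like from_json() as its schema argument.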