[GitHub] spark issue #22775: [SPARK-24709][SQL][FOLLOW-UP] Make schema_of_json's inpu...

HyukjinKwon Thu, 25 Oct 2018 16:54:08 -0700

Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/22775
  
    Actually, that use case can be accomplished more easily by simply inferring the 
schema with the JSON datasource. Yeah, I did suggest that as a workaround for this 
issue before, for example `spark.read.json(df.select("json").as[String]).schema`.
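    
    A minimal sketch of that workaround, assuming Spark 2.2 or later on the 
classpath. The object name `InferJsonSchema` and the method `infer` are 
hypothetical helpers for illustration, not part of Spark:
    
    ```scala
    import org.apache.spark.sql.{DataFrame, Encoders}
    import org.apache.spark.sql.types.StructType

    // Hypothetical helper, not part of Spark.
    object InferJsonSchema {
      // Infers a schema by running the JSON datasource over every value in
      // the given string column, so the result reflects all rows rather
      // than a single sample document.
      def infer(df: DataFrame, column: String): StructType = {
        val jsonStrings = df.select(column).as(Encoders.STRING)
        spark_read(df).json(jsonStrings).schema
      }

      // Reuses the session that owns `df` instead of creating a new one.
      private def spark_read(df: DataFrame) = df.sparkSession.read
    }
    ```
    
    Note that this runs a job over the whole column, which is the trade-off 
compared with deriving a schema from one literal example.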
    
    I know the use case of `schema_of_json` is not super clear, and 
that's partly why I want to allow only what we need for this 
expression for now.
    
    @rxin, WDYT? This PR tries to allow only what we need for now. That is, 
disallow:
    
    ```scala
    schema_of_json(column)
    ```
    
    and only allow 
    
    ```scala
    schema_of_json(literal)
    ```
    
    because the main use case is:
    
    ```scala
    from_json(schema_of_json(literal))
    ```
    
    and
    
    ```scala
    from_json(schema_of_json(column))
    ```
    
    is already unsupported. My judgement was that `schema_of_json(column)` 
doesn't have to be exposed for now, and I actually want to talk with 
@MaxGekk about this when he comes back from his vacation.
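    
    For illustration, a sketch of the main use case above, assuming Spark 2.4 
or later, where `schema_of_json(json: String)` and `from_json(e: Column, schema: Column)` 
are available. The snippets above are shorthand: the real `from_json` call also 
takes the JSON column to parse. `ParseWithLiteralSchema` and `parse` are 
hypothetical names:
    
    ```scala
    import org.apache.spark.sql.DataFrame
    import org.apache.spark.sql.functions.{col, from_json, schema_of_json}

    // Hypothetical helper, not part of Spark.
    object ParseWithLiteralSchema {
      // Derives the schema from one literal sample document, then parses
      // the JSON string column with that schema into a struct column.
      def parse(df: DataFrame, column: String, sample: String): DataFrame =
        df.select(from_json(col(column), schema_of_json(sample)).as("parsed"))
    }
    ```
    
    The schema comes only from the sample literal, so fields that appear in 
the data but not in the sample are dropped from the parsed struct.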


