[GitHub] spark pull request #19017: SPARK-21804: json_tuple returns null values withi...

jmchung Mon, 21 Aug 2017 23:46:58 -0700

GitHub user jmchung opened a pull request:

    https://github.com/apache/spark/pull/19017


    SPARK-21804: json_tuple returns null values within repeated columns except 
the first one

    ## What changes were proposed in this pull request?
    
    When json_tuple extracts values from JSON, it returns null for repeated columns except the first one, as shown below:
    
    ``` scala
    scala> spark.sql("""SELECT json_tuple('{"a":1, "b":2}', 'a', 'b', 'a')""").show()
    +---+---+----+
    | c0| c1|  c2|
    +---+---+----+
    |  1|  2|null|
    +---+---+----+
    ```
    
    I think this should be consistent with Hive's implementation:
    ```
    hive> SELECT json_tuple('{"a": 1, "b": 2}', 'a', 'a');
    ...
    1    1
    ```
    
    In this PR, we locate all matching indices in `fieldNames` instead of returning only the first matched index (i.e., the result of `indexOf`).
    
    ## How was this patch tested?
    
    Added test in JsonExpressionsSuite.
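
The difference between a first-match lookup and an all-matches lookup can be illustrated with a small stand-alone sketch. This is not the code from the patch: the object `RepeatedFieldLookup` and the helpers `firstIndex`, `allIndices`, and `fill` are hypothetical names chosen for illustration, and only `fieldNames` and `indexOf` come from the description above.

```scala
// Hypothetical stand-alone sketch; NOT the code from the patch.
object RepeatedFieldLookup {
  // Requested field names, as in json_tuple(json, 'a', 'b', 'a').
  val fieldNames: Seq[String] = Seq("a", "b", "a")

  // First-match lookup: indexOf yields a single position, so a repeated
  // name is only ever resolved to its first occurrence.
  def firstIndex(name: String): Seq[Int] =
    Seq(fieldNames.indexOf(name)).filter(_ >= 0)

  // All-matches lookup: every position whose requested name equals `name`.
  def allIndices(name: String): Seq[Int] =
    fieldNames.zipWithIndex.collect { case (n, i) if n == name => i }

  // Fill an output row from parsed (key, value) pairs using the given lookup.
  // Positions that are never matched stay None (shown as null by Spark).
  def fill(parsed: Seq[(String, String)],
           lookup: String => Seq[Int]): Seq[Option[String]] = {
    val row = Array.fill[Option[String]](fieldNames.length)(None)
    for ((k, v) <- parsed; i <- lookup(k)) row(i) = Some(v)
    row.toSeq
  }
}
```

With the parsed pairs `Seq("a" -> "1", "b" -> "2")`, `fill` with `firstIndex` leaves the third position empty, while `fill` with `allIndices` populates both positions that request `a`.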

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/jmchung/spark SPARK-21804

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/19017.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #19017
    
----
commit f04b896f3a8b3befdc1cbfb60464dfdcb019b684
Author: Jen-Ming Chung <jenmingi...@gmail.com>
Date:   2017-08-22T06:38:40Z

    SPARK-21804: json_tuple returns null values within repeated columns except 
the first one

----


