Zoltán Borók-Nagy created HIVE-20448:
----------------------------------------
Summary: get_json_object() unnecessarily unnests array values
Key: HIVE-20448
URL: https://issues.apache.org/jira/browse/HIVE-20448
Project: Hive
Issue Type: Bug
Reporter: Zoltán Borók-Nagy
Hive's implementation of get_json_object() unnecessarily unnests array values,
e.g.:
{noformat}
hive> select get_json_object('[{"key": [1,2,3]}, {"key": [4,5]}]', '$[*].key');
OK
[1,2,3,4,5]{noformat}
I think for the above query the following output is more reasonable:
{noformat}
[[1,2,3],[4,5]]{noformat}
This is also the output that one gets from
* [https://jsonpath.curiousconcept.com/]
* [http://jsonpath.com/]
* [https://jsonpath.herokuapp.com/]
Python's jsonpath package also produces this output.
I tested it with Hive 2.1.1, but it is likely that it affects other versions as
well.
If we'd really want to unnest the arrays, then '$[*].key[*]' is the way to go.
It produces the same output everywhere.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)