[
https://issues.apache.org/jira/browse/DRILL-4824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16506580#comment-16506580
]
ASF GitHub Bot commented on DRILL-4824:
---------------------------------------
paul-rogers commented on issue #580: DRILL-4824: JSON with complex nested data
produces incorrect output w…
URL: https://github.com/apache/drill/pull/580#issuecomment-395895576
@ilooner, there are quite a few open issues around JSON. A number of JIRA
tickets explain them, as does a whole section in the "Batch Handling"
document on my Wiki. This is not a simple bug fix.
----------------------------------------------------------------
> Null maps / lists and non-provided state support for JSON fields. Numeric
> types promotion.
> ------------------------------------------------------------------------------------------
>
> Key: DRILL-4824
> URL: https://issues.apache.org/jira/browse/DRILL-4824
> Project: Apache Drill
> Issue Type: Improvement
> Components: Storage - JSON
> Affects Versions: 1.0.0
> Reporter: Roman Kulyk
> Assignee: Volodymyr Vysotskyi
> Priority: Major
>
> Drill produces incorrect output when querying a JSON file with complex nested data.
> _JSON:_
> {code:none|title=example.json|borderStyle=solid}
> {
>   "Field1" : {
>   }
> }
> {
>   "Field1" : {
>     "InnerField1": {"key1":"value1"},
>     "InnerField2": {"key2":"value2"}
>   }
> }
> {
>   "Field1" : {
>     "InnerField3" : ["value3", "value4"],
>     "InnerField4" : ["value5", "value6"]
>   }
> }
> {code}
> _Query:_
> {code:sql}
> select Field1 from dfs.`/tmp/example.json`
> {code}
> _Incorrect result:_
> {code:none}
> +---------------------------+
> | Field1 |
> +---------------------------+
> {"InnerField1":{},"InnerField2":{},"InnerField3":[],"InnerField4":[]}
> {"InnerField1":{"key1":"value1"},"InnerField2":{"key2":"value2"},"InnerField3":[],"InnerField4":[]}
> {"InnerField1":{},"InnerField2":{},"InnerField3":["value3","value4"],"InnerField4":["value5","value6"]}
> +--------------------------+
> {code}
> There is no need to output missing fields. With a deeply nested
> structure, the padded result becomes unreadable for the user.
> _Correct result:_
> {code:none}
> +--------------------------+
> | Field1 |
> +--------------------------+
> {}
> {"InnerField1":{"key1":"value1"},"InnerField2":{"key2":"value2"}}
> {"InnerField3":["value3","value4"],"InnerField4":["value5","value6"]}
> +--------------------------+
> {code}
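
To make the difference between the two results above concrete, here is a minimal Python sketch. It is only a model of the behaviour described in the report, not Drill's implementation: the helper names (union_schema, fill_missing, render) are hypothetical, and the assumption that missing fields are padded from the union of all keys seen under Field1 is inferred from the reported output.

```python
import json

# The three records from example.json in the report.
RECORDS = [
    {"Field1": {}},
    {"Field1": {"InnerField1": {"key1": "value1"},
                "InnerField2": {"key2": "value2"}}},
    {"Field1": {"InnerField3": ["value3", "value4"],
                "InnerField4": ["value5", "value6"]}},
]


def union_schema(maps):
    """Collect every key seen in any map, paired with an empty value
    of the same kind (empty list for arrays, empty map otherwise)."""
    schema = {}
    for m in maps:
        for key, value in m.items():
            schema.setdefault(key, [] if isinstance(value, list) else {})
    return schema


def fill_missing(m, schema):
    """Model of the reported behaviour: pad each map out to the union
    schema, so fields absent from a record appear as {} or []."""
    return {key: m.get(key, empty) for key, empty in schema.items()}


def render(value):
    """Serialize compactly, matching the layout used in the report."""
    return json.dumps(value, separators=(",", ":"))


fields = [record["Field1"] for record in RECORDS]
schema = union_schema(fields)

# Rows as in the "Incorrect result" table: every record carries all four keys.
padded_rows = [render(fill_missing(f, schema)) for f in fields]

# Rows as in the "Correct result" table: only the fields each record provides.
expected_rows = [render(f) for f in fields]

for row in padded_rows + expected_rows:
    print(row)
```

In this model the padded rows grow with every distinct key that appears anywhere in the file, which is why deeply nested data quickly becomes unreadable, while the expected rows stay as small as the records themselves.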