Roman created DRILL-4824:
----------------------------

             Summary: JSON with complex nested data produces incorrect output with missing fields
                 Key: DRILL-4824
                 URL: https://issues.apache.org/jira/browse/DRILL-4824
             Project: Apache Drill
          Issue Type: New Feature
          Components: Storage - JSON
    Affects Versions: 1.7.0
            Reporter: Roman
            Assignee: Roman
             Fix For: Future

Drill produces incorrect output for a JSON file with complex nested data. Here is the JSON file:
{code:none|title=example.json|borderStyle=solid}
{ "Field1" : { } }
{ "Field1" : { "InnerField1": {"key1":"value1"}, "InnerField2": {"key2":"value2"} } }
{ "Field1" : { "InnerField3" : ["value3", "value4"], "InnerField4" : ["value5", "value6"] } }
{code}

Here is the actual result of the command "select Field1 from dfs.`/tmp/example.json`;":
{code:none}
+---------------------------+
| Field1 |
+---------------------------+
{"InnerField1":{},"InnerField2":{},"InnerField3":[],"InnerField4":[]}
{"InnerField1":{"key1":"value1"},"InnerField2":{"key2":"value2"},"InnerField3":[],"InnerField4":[]}
{"InnerField1":{},"InnerField2":{},"InnerField3":["value3","value4"],"InnerField4":["value5","value6"]}
+--------------------------+
{code}

I think there is no need to output missing fields. Each row currently lists every inner field seen anywhere in the file, with an empty map or list wherever the record does not contain that field. For a deeply nested structure, the result becomes unreadable for the user. So my expected result is:
{code:none}
+--------------------------+
| Field1 |
+--------------------------+
{}
{"InnerField1":{"key1":"value1"},"InnerField2":{"key2":"value2"}}
{"InnerField3":["value3","value4"],"InnerField4":["value5","value6"]}
+--------------------------+
{code}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
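
To make the requested behavior concrete, here is a minimal Python sketch that post-processes the three Field1 values from the actual result and drops every nested entry whose value is an empty map or empty list. It is only an illustration of the expected output, not Drill code: the helper name `prune_empty` is hypothetical, and no Drill API is used. It also assumes that every empty map or list is a placeholder for a missing field, so a field that is genuinely present but empty in the source file would be dropped as well.

```python
import json


def prune_empty(value):
    """Recursively drop map entries whose value is an empty map or empty list."""
    if isinstance(value, dict):
        pruned = {}
        for key, child in value.items():
            child = prune_empty(child)
            if child == {} or child == []:
                # Assumption: an empty map/list stands in for a missing field.
                continue
            pruned[key] = child
        return pruned
    if isinstance(value, list):
        # List elements are pruned internally but never removed from the list.
        return [prune_empty(item) for item in value]
    return value


# The three Field1 values copied from the "actual result" in this report.
actual_rows = [
    '{"InnerField1":{},"InnerField2":{},"InnerField3":[],"InnerField4":[]}',
    '{"InnerField1":{"key1":"value1"},"InnerField2":{"key2":"value2"},'
    '"InnerField3":[],"InnerField4":[]}',
    '{"InnerField1":{},"InnerField2":{},'
    '"InnerField3":["value3","value4"],"InnerField4":["value5","value6"]}',
]

# Compact separators mimic the spacing used in the tables above.
pruned_rows = [
    json.dumps(prune_empty(json.loads(row)), separators=(",", ":"))
    for row in actual_rows
]

for row in pruned_rows:
    print(row)
```

Run against the rows above, the sketch prints the three rows of the expected result. The top-level value is never removed, which is why the first record still appears as `{}`.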