Paul Rogers created DRILL-6359:
----------------------------------

Summary: All-text mode in JSON still reads missing column as Nullable Int
Key: DRILL-6359
URL: https://issues.apache.org/jira/browse/DRILL-6359
Project: Apache Drill
Issue Type: Bug
Affects Versions: 1.13.0
Reporter: Paul Rogers

Suppose we have the following file:

{noformat}
{a: 0}
{a: 1}
...
{a: 70001, b: 10.5}
{noformat}

Where the "..." indicates another 70K records. (The count is chosen to force the first appearance of {{b}} into a second or later batch.)

Suppose we execute the following query:

{code}
ALTER SESSION SET `store.json.all_text_mode` = true;
SELECT a, b FROM `70Kmissing.json` WHERE b IS NOT NULL ORDER BY a;
{code}

The query should work. We have an explicit project for column {{b}} and we've told JSON to always use text. So, JSON should have enough information to create column {{b}} as {{Nullable VarChar}}.

Yet, the result of the query in {{sqlline}} is:

{noformat}
Error: UNSUPPORTED_OPERATION ERROR: Schema changes not supported in External Sort. Please enable Union type.

Previous schema BatchSchema [fields=[[`a` (VARCHAR:OPTIONAL)], [`b` (INT:OPTIONAL)]], selectionVector=NONE]
Incoming schema BatchSchema [fields=[[`a` (VARCHAR:OPTIONAL)], [`b` (VARCHAR:OPTIONAL)]], selectionVector=NONE]
{noformat}

The expected result is that the query works. Even missing columns should be subject to the "all text mode" setting, because the JSON reader handles projection push-down and is responsible for filling in the missing columns. Instead, the early batches, which contain no {{b}}, get {{b}} as {{Nullable Int}}, and the later batch that does contain {{b}} gets it as {{Nullable VarChar}}, which the sort sees as a schema change.

This is with the shipping Drill 1.13 JSON reader. I *think* this is fixed in the "batch size handling" JSON reader rewrite, but I've not tested it.
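
For anyone who wants to reproduce this, the input file described above could be generated with something like the following Python sketch. The function name {{write_missing_column_file}} is hypothetical and not part of Drill. The sketch also assumes quoted field names (standard JSON) are acceptable, whereas the sample above shows them unquoted. It writes records {{a = 0}} through {{a = 70000}} without {{b}}, then a final record with {{a = 70001}} and {{b = 10.5}}.

```python
import json
from pathlib import Path


def write_missing_column_file(path, last_a=70001):
    """Write one JSON object per line; only the final record carries column b.

    Hypothetical helper for reproducing the report. Field names are quoted
    (standard JSON), which is an assumption; the report shows them unquoted.
    Returns the number of records written.
    """
    path = Path(path)
    with path.open("w", encoding="utf-8") as out:
        # Records 0 .. last_a - 1 have only column a.
        for a in range(last_a):
            out.write(json.dumps({"a": a}) + "\n")
        # The last record is the first (and only) one that has column b.
        out.write(json.dumps({"a": last_a, "b": 10.5}) + "\n")
    return last_a + 1


if __name__ == "__main__":
    count = write_missing_column_file("70Kmissing.json")
    print(f"wrote {count} records")
```

The resulting file then needs to be placed in whatever workspace the query above reads from. Whether {{b}} actually lands in a second or later batch depends on the reader's batch size, so the record count may need adjusting.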