[ https://issues.apache.org/jira/browse/DRILL-6359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Khurram Faraaz updated DRILL-6359:
----------------------------------
    Affects Version/s: 1.14.0

> All-text mode in JSON still reads missing column as Nullable Int
> ----------------------------------------------------------------
>
>                 Key: DRILL-6359
>                 URL: https://issues.apache.org/jira/browse/DRILL-6359
>             Project: Apache Drill
>          Issue Type: Bug
>    Affects Versions: 1.13.0, 1.14.0
>            Reporter: Paul Rogers
>            Priority: Major
>
> Suppose we have the following file:
> {noformat}
> {a: 0}
> {a: 1}
> ...
> {a: 70001, b: 10.5}
> {noformat}
> Where the "..." indicates another 70K records. (Chosen to force the
> appearance of {{b}} into a second or later batch.)
> Suppose we execute the following query:
> {code}
> ALTER SESSION SET `store.json.all_text_mode` = true;
> SELECT a, b FROM `70Kmissing.json` WHERE b IS NOT NULL ORDER BY a;
> {code}
> The query should work. We have an explicit project for column {{b}} and we've
> told JSON to always use text. So, JSON should have enough information to
> create column {{b}} as {{Nullable VarChar}}.
> Yet, the result of the query in {{sqlline}} is:
> {noformat}
> Error: UNSUPPORTED_OPERATION ERROR: Schema changes not supported in External
> Sort. Please enable Union type.
> Previous schema BatchSchema [fields=[[`a` (VARCHAR:OPTIONAL)], [`b`
> (INT:OPTIONAL)]], selectionVector=NONE]
> Incoming schema BatchSchema [fields=[[`a` (VARCHAR:OPTIONAL)], [`b`
> (VARCHAR:OPTIONAL)]], selectionVector=NONE]
> {noformat}
> The expected result is that the query works because even missing columns
> should be subject to the "all text mode" setting because the JSON reader
> handles projection push-down, and is responsible for filling in the missing
> columns.
> This is with the shipping Drill 1.13 JSON reader. I *think* this is fixed in
> the "batch size handling" JSON reader rewrite, but I've not tested it.



--
This message was sent by Atlassian JIRA (v7.6.3#76005)
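
For anyone reproducing the issue, the input file described in the report could be generated with something like the sketch below. It is not part of the original report: the function name write_missing_column_file and the output location are hypothetical, and it writes standard JSON with quoted keys rather than the abbreviated {a: 0} notation used in the description. The record count mirrors the report, where it was chosen so that column b first appears in a second or later batch.

```python
import json
import os
import tempfile


def write_missing_column_file(path, last_a=70001):
    """Write one JSON record per line; only the final record carries column b.

    Hypothetical helper for reproducing the report, not Drill code.
    """
    with open(path, "w", encoding="utf-8") as f:
        # Records a = 0 .. last_a - 1 have no column b at all.
        for a in range(last_a):
            f.write(json.dumps({"a": a}) + "\n")
        # The last record is the first and only one that introduces b.
        f.write(json.dumps({"a": last_a, "b": 10.5}) + "\n")
    return last_a + 1  # total number of records written


if __name__ == "__main__":
    # Assumed output location; move the file to wherever Drill can read it.
    out = os.path.join(tempfile.gettempdir(), "70Kmissing.json")
    count = write_missing_column_file(out)
    print(f"wrote {count} records to {out}")
```

Once the file is in a workspace Drill can query, the two statements from the description can be run against it unchanged.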