Jinfeng Ni created DRILL-5667:
---------------------------------
Summary: Project operator would change output field order if
convertfromJSON is used
Key: DRILL-5667
URL: https://issues.apache.org/jira/browse/DRILL-5667
Project: Apache Drill
Issue Type: Bug
Reporter: Jinfeng Ni
When one of the projected expression uses convertfromJSON function, such
expression would be put in the last of output fields. In general, such
rearrange of output fields would not cause problem, since most of Drill's
execution operators uses name-based resolution. However, union/union-all
operators uses positional reference. If one branch of union/union-all
re-arranges the output fields, while the other branch does not, then it would
lead either incorrect query result, or execution error.
For instance, given the following json file:
{code}
a: { b: 100, c : "DRILL"}}
{code}
The following query would fail with execution error:
{code}
select 'abc' as col1, convert_from(convert_to(t.a, 'JSON'), 'JSON') as col2,
'xyz' as col3
from dfs.tmp.`1.json` t
union all
select 'abc' as col1, t.a as col2, 'xyz' as col3
from dfs.tmp.`1.json` t;
Error: SYSTEM ERROR: DrillRuntimeException: Type mismatch between VARCHAR on
the left side and MAP on the right side in column 2 of UNION ALL
{code}
If we put the Map column as the last position in both of union all branches,
the query will run successfully.
{code}
select 'abc' as col1, 'xyz' as col3, convert_from(convert_to(t.a, 'JSON'),
'JSON') as col2
from dfs.tmp.`1.json` t
union all
select 'abc' as col1, 'xyz' as col3, t.a as col2
from dfs.tmp.`1.json` t;
+-------+-------+------------------------+
| col1 | col3 | col2 |
+-------+-------+------------------------+
| abc | xyz | {"b":100,"c":"DRILL"} |
| abc | xyz | {"b":100,"c":"DRILL"} |
+-------+-------+------------------------+
{code}
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)