[ https://issues.apache.org/jira/browse/DRILL-7499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Vova Vysotskyi updated DRILL-7499: ---------------------------------- Affects Version/s: (was: 1.17.0) 1.16.0 > sqltypeof() function with an array returns "ARRAY", not type > ------------------------------------------------------------ > > Key: DRILL-7499 > URL: https://issues.apache.org/jira/browse/DRILL-7499 > Project: Apache Drill > Issue Type: Bug > Affects Versions: 1.16.0 > Reporter: Paul Rogers > Priority: Minor > Labels: regresion > > The {{sqltypeof()}} function was introduced in Drill 1.14 to work around > limitations of the original {{typeof()}} function. The function is mentioned > in _Learning Apache Drill_, Chapter 8, page 152: > {noformat} > ELECT sqlTypeOf(columns) AS cols_type, > modeOf(columns) AS cols_mode > FROM `csv/cust.csv` LIMIT 1; > +--------------------+------------+ > | cols_type | cols_mode | > +--------------------+------------+ > | CHARACTER VARYING | ARRAY | > +--------------------+------------+ > {noformat} > When the same query is run against the just-released Drill 1.17, we get the > *wrong* results: > {noformat} > +-----------+-----------+ > | cols_type | cols_mode | > +-----------+-----------+ > | ARRAY | ARRAY | > +-----------+-----------+ > {noformat} > The definition of {{sqlTypeOf()}} is that it should return the type portion > of the columns (type, mode) major type. Clearly, it is no longer doing so for > arrays. As a result, there is no function to obtain the data type for arrays. > The problem also shows up in the query from page 158: > {code:sql} > SELECT a, b, > sqlTypeOf(b) AS b_type, modeof(b) AS b_mode > FROM `gen/70kmissing.json` > WHERE mod(a, 70000) = 1; > {code} > Expected (table from the book with Drill 1.14 results): > {noformat} > +--------+-------+----------+-----------+ > | a | b | b_type | b_mode | > +--------+-------+----------+-----------+ > | 1 | null | INTEGER | NULLABLE | > +--------+-------+----------+-----------+ > {noformat} > Actual Drill 1.17 results: > {noformat} > +-------+-----------+-------------------+----------+ > | a | b | b_type | b_mode | > +-------+-----------+-------------------+----------+ > | 1 | null | NULL | NULLABLE | > +-------+-----------+-------------------+----------+ > {noformat} > (Second line of table is omitted because something else changed, not relevant > to this ticket.) > The above might not actually be a bug, however if someone has changed the > type of missing columns from the old {{INT}} to a newer (untyped) {{NULL}}. > But, an indirect test suggests that the column is still `INT` and the > function is wrong: > {code:sql} > SELECT a, b > FROM `gen/70kdouble.json` > WHERE b IS NOT NULL ORDER BY a; > {code} > Data: > {noformat} > {a: 1} > ... > {a: 69999} > {a: 70001, b: 10.5} > {noformat} > Error: > {noformat} > Error: UNSUPPORTED_OPERATION ERROR: Schema changes not supported in External > Sort. Please enable Union type. > Previous schema BatchSchema [fields=[[`a` (BIGINT:OPTIONAL)], [`b` > (INT:OPTIONAL)]], selectionVector=NONE] > Incoming schema BatchSchema [fields=[[`a` (BIGINT:OPTIONAL)], [`b` > (FLOAT8:OPTIONAL)]], selectionVector=NONE] > {noformat} > Oddly, however, the query on page 160 works as expected: > {code:sql} > SELECT sqlTypeOf(a) AS a_type, modeOf(a) AS a_mode > FROM `json/all-null.json` LIMIT 1; > {code} > {noformat} > +---------+----------+ > | a_type | a_mode | > +---------+----------+ > | INTEGER | NULLABLE | > +---------+----------+ > {noformat} > Someone will have to do some investigating to understand the current > behaviour. -- This message was sent by Atlassian Jira (v8.3.4#803005)