Paul Rogers created DRILL-5920:
----------------------------------
Summary: Drill incorrectly projects column aliases to scan operator
Key: DRILL-5920
URL: https://issues.apache.org/jira/browse/DRILL-5920
Project: Apache Drill
Issue Type: Bug
Affects Versions: 1.10.0
Reporter: Paul Rogers
Priority: Major
The {{TestNewTextReader.ensureColumnNameDisplayedinError}} unit test runs this
query:
{code}
select max(columns[1]) as col1
from cp.`textinput/input1.csv`
where col1 is not null
{code}
The following appears in the {{SubScan}} for the {{TextFormatPlugin}}:
{noformat}
[`col1`, `columns`[1]]
{noformat}
This is clearly wrong. The actual table column is {{columns}} (and,
specifically, element 1.) {{col1} is an alias that should never have been
pushed down to the data source because the data source does not know about
aliases.
Further, the projection list makes no distinction between the "real" and
"alias" columns, so, to the data source, both look like real table columns.
The current workaround is to create a nullable int column for {{col1}} which
is, presumably, replaced by a later projection operator.
Because this behavior is wrong, we must think though all the possible failure
cases and how to handle them in this incorrect design. What if the alias
matches an (expensive) table column? What if the alias is the same as some base
column in the same query?
{code}
SELECT a as b, b as c FROM ...
{code}
Incorrect name handling may work in many cases, but it does lead to problems
because the behavior is not following the accepted SQL standards.
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)