Paul Rogers created DRILL-5489:
----------------------------------
Summary: Unprotected array access in RepeatedVarCharOutput ctor
Key: DRILL-5489
URL: https://issues.apache.org/jira/browse/DRILL-5489
Project: Apache Drill
Issue Type: Bug
Affects Versions: 1.10.0
Reporter: Paul Rogers
Priority: Minor
Suppose a user runs a query of the form:
{code}
SELECT columns[70000] FROM `dfs`.`mycsv.csv`
{code}
Internally, this will create a {{PathSegment}} to represent the selected
column. This is passed into the {{RepeatedVarCharOutput}} constructor, where it
is used to set a flag in an array of 64K booleans. But, while the code is very
diligent about making sure that the column name is "columns" and that the path
segment is an array, it does not check the array index. Instead:
{code}
for(Integer i : columnIds){
  ...
  fields[i] = true;
}
{code}
We need to add a bounds check to reject array indexes that are not valid:
negative, or at or beyond the 64K size of the array.
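A minimal sketch of such a check, not the actual Drill code: the class name, the {{markSelected}} helper, the {{MAX_COLUMNS}} constant and the choice of {{IllegalArgumentException}} are all assumptions made for illustration. The real fix would presumably report the error through whatever mechanism Drill uses for user-facing errors.

```java
import java.util.List;

class ColumnIndexCheck {
    // Assumed size of the flag array (the "64K booleans" described above).
    static final int MAX_COLUMNS = 64 * 1024;

    // Hypothetical helper: flags the selected columns, rejecting any index
    // that would fall outside the flag array.
    static boolean[] markSelected(List<Integer> columnIds) {
        boolean[] fields = new boolean[MAX_COLUMNS];
        for (int i : columnIds) {
            if (i < 0 || i >= MAX_COLUMNS) {
                // Stand-in exception type; chosen only for this sketch.
                throw new IllegalArgumentException(
                    "columns[" + i + "] is out of range; valid indexes are 0 to "
                        + (MAX_COLUMNS - 1));
            }
            fields[i] = true;
        }
        return fields;
    }
}
```

With this in place, {{columns[70000]}} is rejected with a clear message rather than an {{ArrayIndexOutOfBoundsException}} from the array write.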
While we are at it, we might as well fix another bit of bogus code:
{code}
for(Integer i : columnIds){
  maxField = 0;
  maxField = Math.max(maxField, i);
  ...
}
{code}
Because maxField is reset to 0 on every iteration, the code simply ends up
with the last value, not the maximum value. This will be thrown off by a query
of the form:
{code}
SELECT columns[20], columns[1] FROM ...
{code}
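One possible correction, sketched here under the assumption that nothing else in the loop depends on the reset: initialize maxField once, before the loop. The class and method names are hypothetical and exist only to make the sketch self-contained.

```java
import java.util.List;

class MaxFieldSketch {
    // Hypothetical helper: returns the largest selected column index.
    static int maxField(List<Integer> columnIds) {
        int maxField = 0;  // initialized once, outside the loop
        for (int i : columnIds) {
            maxField = Math.max(maxField, i);
        }
        return maxField;
    }
}
```

For the query above, the column ids arrive as 20 then 1. The original loop leaves maxField at 1, while this version keeps 20.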
It may be that code further up the call hierarchy already does the bounds
check. But, if so, it should do the other checks (column name and array
segment) as well. Leaving the checks incomplete in either place is confusing.
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)