Paul Rogers created DRILL-5929:
----------------------------------
Summary: Misleading error for text file with blank line delimiter
Key: DRILL-5929
URL: https://issues.apache.org/jira/browse/DRILL-5929
Project: Apache Drill
Issue Type: Bug
Affects Versions: 1.11.0
Reporter: Paul Rogers
Priority: Minor
Consider the following functional test query:
{code}
select * from
table(`table_function/colons.txt`(type=>'text',lineDelimiter=>'\\'))
{code}
For some reason (yet to be determined), when running this from Java, the line
delimiter ended up empty. This causes the following line to fail with an
{{ArrayIndexOutOfBoundsException}}:
{code}
class TextInput ...
public final byte nextChar() throws IOException {
if (byteChar == lineSeparator[0]) { // but, lineSeparator.length == 0
{code}
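One way to avoid the unchecked index is to reject an empty delimiter before the reader is built. The following is a minimal sketch under that assumption; {{LineDelimiterCheck}} and {{requireNonEmpty}} are hypothetical names used for illustration and are not part of Drill:

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Drill: rejects an empty line delimiter
// before any code indexes into the delimiter byte array.
final class LineDelimiterCheck {

    // Converts the configured delimiter to bytes and fails fast if it is empty.
    static byte[] requireNonEmpty(String lineDelimiter) {
        byte[] bytes = lineDelimiter == null
                ? new byte[0]
                : lineDelimiter.getBytes(StandardCharsets.UTF_8);
        if (bytes.length == 0) {
            throw new IllegalArgumentException(
                    "lineDelimiter must contain at least one byte");
        }
        return bytes;
    }

    public static void main(String[] args) {
        // A normal delimiter passes through unchanged.
        System.out.println(requireNonEmpty("\n").length);
        // An empty delimiter is reported directly, instead of surfacing later
        // as an ArrayIndexOutOfBoundsException on lineSeparator[0].
        try {
            requireNonEmpty("");
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With a check of this kind, the failure would name the delimiter rather than appearing as an array index error deep in the parser.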
We then translate the exception:
{code}
class TextReader ...
public final boolean parseNext() throws IOException {
...
} catch (Exception ex) {
try {
throw handleException(ex);
...
private TextParsingException handleException(Exception ex) throws IOException {
...
if (ex instanceof ArrayIndexOutOfBoundsException) {
// Not clear this exception is still thrown...
ex = UserException
.dataReadError(ex)
.message(
"Drill failed to read your text file. Drill supports up to %d columns in a text file. Your file appears to have more than that.",
MAXIMUM_NUMBER_COLUMNS)
.build(logger);
}
{code}
That is, due to a missing delimiter, we get an index out of bounds exception,
which we translate to an error about having too many fields. But, the file
itself has only a handful of fields. Thus, the error is completely wrong.
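A translation step could check the actual column count before blaming it. The sketch below is only an illustration: {{ErrorTranslation}}, {{describe}}, {{columnsSeen}} and {{maxColumns}} are hypothetical, and it assumes the reader can report how many columns it has parsed so far, which the code above does not show:

```java
// Hypothetical sketch, not Drill's actual handleException: only blame the
// column count when the parser really reached the column limit.
final class ErrorTranslation {

    // columnsSeen: assumed count of columns parsed in the current row.
    // maxColumns: the configured column limit.
    static String describe(Exception ex, int columnsSeen, int maxColumns) {
        if (ex instanceof ArrayIndexOutOfBoundsException
                && columnsSeen >= maxColumns) {
            return "Text file has more than " + maxColumns + " columns";
        }
        // Any other failure is reported as an internal error, with the
        // original exception text preserved for diagnosis.
        return "Internal error while reading text file: " + ex;
    }
}
```

Under this sketch, the empty-delimiter case would be reported as an internal error rather than as a file with too many columns.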
Then, we compound the error:
{code}
private TextParsingException handleException(Exception ex) throws IOException {
...
throw new TextParsingException(context, message, ex);
class CompliantTextReader ...
public boolean next() {
...
} catch (IOException | TextParsingException e) {
throw UserException.dataReadError(e)
.addContext("Failure while reading file %s. Happened at or shortly before byte position %d.",
split.getPath(), reader.getPos())
.build(logger);
{code}
That is, our {{ArrayIndexOutOfBoundsException}} became a user exception, which
became a text parsing exception, which became a data read error.
But this is not a data read error. It is an error in Drill's own validation
logic. It is not clear that we should be wrapping user exceptions in other
errors that we then wrap in other user exceptions.
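One possible way to avoid the repeated wrapping is to look for an existing user-facing exception in the cause chain and rethrow it as is. This is a minimal sketch, not Drill code: {{UserFacingException}} is a hypothetical stand-in for Drill's {{UserException}}, and {{WrapOnce.toUserFacing}} is an invented helper:

```java
// Hypothetical stand-in for a user-facing exception type; not Drill's UserException.
final class UserFacingException extends RuntimeException {
    UserFacingException(String message, Throwable cause) {
        super(message, cause);
    }
}

// Hypothetical helper: wraps an exception for the user at most once.
final class WrapOnce {

    static UserFacingException toUserFacing(Throwable e, String context) {
        // If a user-facing exception is already in the cause chain, reuse it
        // instead of burying it under another layer.
        for (Throwable t = e; t != null; t = t.getCause()) {
            if (t instanceof UserFacingException) {
                return (UserFacingException) t;
            }
        }
        // Otherwise wrap exactly once, keeping the original as the cause.
        return new UserFacingException(context + ": " + e.getMessage(), e);
    }
}
```

In this sketch, the message chosen where the problem was first detected is the one the user sees, and outer layers add a wrapper only when none exists yet.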