[ https://issues.apache.org/jira/browse/DRILL-2322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14965515#comment-14965515 ]

Khurram Faraaz commented on DRILL-2322:
---------------------------------------

Note that the error message mentions neither the line number nor the file 
name. I also checked the stack trace in drillbit.log; it does not mention the 
line number or the file name either.

{code}
0: jdbc:drill:schema=dfs.tmp> select cast(columns[0] as int) c1 from `bad_csv`;
Error: SYSTEM ERROR: NumberFormatException: @

Fragment 0:0

[Error Id: 6544865f-c743-4abc-a32c-0a6debe4c9f0 on centos-04.qa.lab:31010] (state=,code=0)
{code}

Stack trace from drillbit.log:

{code}
2015-10-20 18:22:57,828 [29d9797e-19aa-3ee1-276b-e9d9319f41d7:frag:0:0] ERROR o.a.d.e.w.fragment.FragmentExecutor - SYSTEM ERROR: NumberFormatException: @

Fragment 0:0

[Error Id: 6544865f-c743-4abc-a32c-0a6debe4c9f0 on centos-04.qa.lab:31010]
org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: NumberFormatException: @

Fragment 0:0

[Error Id: 6544865f-c743-4abc-a32c-0a6debe4c9f0 on centos-04.qa.lab:31010]
        at org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:534) ~[drill-common-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
        at org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:323) [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
        at org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:178) [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
        at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:292) [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
        at org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38) [drill-common-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_85]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_85]
        at java.lang.Thread.run(Thread.java:745) [na:1.7.0_85]
Caused by: java.lang.NumberFormatException: @
        at org.apache.drill.exec.expr.fn.impl.StringFunctionHelpers.nfeI(StringFunctionHelpers.java:97) ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
        at org.apache.drill.exec.expr.fn.impl.StringFunctionHelpers.varCharToInt(StringFunctionHelpers.java:122) ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
        at org.apache.drill.exec.test.generated.ProjectorGen3.doEval(ProjectorTemplate.java:62) ~[na:na]
        at org.apache.drill.exec.test.generated.ProjectorGen3.projectRecords(ProjectorTemplate.java:62) ~[na:na]
        at org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.doWork(ProjectRecordBatch.java:172) ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
        at org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:93) ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
        at org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext(ProjectRecordBatch.java:129) ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
        at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:147) ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
        at org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:104) ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
        at org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.innerNext(ScreenCreator.java:80) ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
        at org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:94) ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
        at org.apache.drill.exec.work.fragment.FragmentExecutor$1.run(FragmentExecutor.java:258) ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
        at org.apache.drill.exec.work.fragment.FragmentExecutor$1.run(FragmentExecutor.java:252) ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
        at java.security.AccessController.doPrivileged(Native Method) ~[na:1.7.0_85]
        at javax.security.auth.Subject.doAs(Subject.java:415) ~[na:1.7.0_85]
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1566) ~[hadoop-common-2.5.1-mapr-1503.jar:na]
        at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:252) [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
        ... 4 common frames omitted
{code}

Here is the content of the CSV file used in the test:

{code}
0: jdbc:drill:schema=dfs.tmp> select * from `badCSVFile.csv`;
+-------------------+
|      columns      |
+-------------------+
| ["1","test","a"]  |
| ["2","test","b"]  |
| ["@","test","c"]  |
| ["4","test","d"]  |
| ["5","blah","e"]  |
+-------------------+
5 rows selected (0.532 seconds)
{code}
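
To make the expectation concrete, here is a rough sketch of the kind of 
context the error could carry. This is not Drill's reader code: the class 
BadRecordLocator and the method describeFirstBadRecord are hypothetical, and 
the raw CSV lines are my assumption of what the file looks like on disk, 
reconstructed from the query output above.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Drill: finds the first record whose
// first column cannot be parsed as an int and reports the file and line.
class BadRecordLocator {

    // Returns a message naming the file, the 1-based line number, and the
    // offending value, or null if every record parses cleanly.
    static String describeFirstBadRecord(String fileName, List<String> lines) {
        for (int i = 0; i < lines.size(); i++) {
            // Naive split on commas; real CSV parsing must also handle quoting.
            String firstColumn = lines.get(i).split(",", -1)[0];
            try {
                Integer.parseInt(firstColumn.trim());
            } catch (NumberFormatException e) {
                return String.format(
                    "NumberFormatException: '%s' in file %s, line %d, column 0",
                    firstColumn, fileName, i + 1);
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Assumed raw content of the test file (see the query output above).
        List<String> lines = Arrays.asList(
            "1,test,a", "2,test,b", "@,test,c", "4,test,d", "5,blah,e");
        System.out.println(describeFirstBadRecord("badCSVFile.csv", lines));
    }
}
```

With the five records above, the message points at line 3 of badCSVFile.csv, 
which is the information missing from both the client error and drillbit.log.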

> CSV record reader should log which file and which record caused an error in 
> the reader
> --------------------------------------------------------------------------------------
>
>                 Key: DRILL-2322
>                 URL: https://issues.apache.org/jira/browse/DRILL-2322
>             Project: Apache Drill
>          Issue Type: Improvement
>          Components: Storage - Text & CSV
>    Affects Versions: 0.8.0
>            Reporter: Ramana Inukonda Nagaraj
>            Assignee: Sudheesh Katkam
>             Fix For: 0.9.0
>
>         Attachments: DRILL-2322.1.patch.txt, DRILL-2322.2.patch.txt, 
> DRILL-2322.3.patch.txt
>
>
> I believe the title is self-explanatory.
> If the text reader fails because of an offending record, Drill should log 
> which file (if there are multiple files) and which line/record the error 
> occurred at. This will make debugging easier when dealing with large files 
> or a large number of files.



