[ 
https://issues.apache.org/jira/browse/IMPALA-9175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on IMPALA-9175 started by Norbert Luksa.
---------------------------------------------
> Revisit the error handling logics in ORC scanner
> ------------------------------------------------
>
>                 Key: IMPALA-9175
>                 URL: https://issues.apache.org/jira/browse/IMPALA-9175
>             Project: IMPALA
>          Issue Type: Task
>            Reporter: Quanlong Huang
>            Assignee: Norbert Luksa
>            Priority: Blocker
>
> This task is to review the error handling in the ORC scanner against the 
> Parquet scanner. For each kind of error the Parquet scanner handles, make 
> sure the ORC scanner already handles it; otherwise, create a separate JIRA 
> for it.
> We also need to check whether the exposed error messages are sufficient for 
> debugging. For instance, one frequently encountered error when Impala has 
> stale metadata for an ORC file is:
> {code:java}
> Encountered parse error in tail of ORC file 
> hdfs://hadoop2cluster/user/hive-0.13.1/warehouse/bi_ucar.db/alliance_driver_stat_hour_api/dt=2019-08-09/part-00006:
>  Invalid ORC postscript length
> {code}
> It would be better to also print the postscript length that was read and the 
> file size, so users can tell whether the file is corrupt (and the data needs 
> to be regenerated) or the metadata is stale (and needs a refresh). This may 
> require changes in the ORC library.


