Tongjie Chen created PARQUET-148:
------------------------------------
Summary: provide an option to skip an entire row group when
corrupted bytes occur
Key: PARQUET-148
URL: https://issues.apache.org/jira/browse/PARQUET-148
Project: Parquet
Issue Type: Improvement
Reporter: Tongjie Chen
In case of hardware failure (disk, memory, etc.), bytes in a Parquet file may
be corrupted. This results in an ArrayIndexOutOfBoundsException and/or garbled
data. Currently, jobs reading those Parquet files fail unless the corrupted
files are deleted or moved.
It would be better if Parquet provided an option to skip an entire row group
(and report how many rows were affected) when corrupted bytes are encountered.
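The requested behaviour could look roughly like the sketch below. It is
written in Python for brevity and does not use any Parquet API: RowGroup,
ReadReport, read_row_groups and the skip_corrupt flag are all hypothetical
names, and IndexError stands in for Java's ArrayIndexOutOfBoundsException.
It assumes each row group declares its row count in metadata that is still
readable, and it only catches corruption that raises an error; silently
garbled data would need a separate integrity check.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, List


@dataclass
class RowGroup:
    """Hypothetical stand-in for a row group (not a Parquet class)."""
    row_count: int                     # row count declared in the metadata
    decode: Callable[[], List[dict]]   # may raise if the bytes are corrupted


@dataclass
class ReadReport:
    """Rows that were read, plus a count of what was skipped."""
    rows: List[dict] = field(default_factory=list)
    skipped_groups: int = 0
    skipped_rows: int = 0


def read_row_groups(groups: Iterable[RowGroup],
                    skip_corrupt: bool = False) -> ReadReport:
    """Read every row group; optionally skip the ones that fail to decode."""
    report = ReadReport()
    for group in groups:
        try:
            # Decode the whole group first so that a failure halfway
            # through never leaks partial rows into the result.
            decoded = group.decode()
        except (IndexError, ValueError):
            if not skip_corrupt:
                raise  # current behaviour: the job fails
            report.skipped_groups += 1
            report.skipped_rows += group.row_count
            continue
        report.rows.extend(decoded)
    return report


def _corrupt() -> List[dict]:
    # Simulates a decoder hitting corrupted bytes.
    raise IndexError("simulated corrupted bytes")


groups = [
    RowGroup(2, lambda: [{"id": 1}, {"id": 2}]),
    RowGroup(3, _corrupt),
    RowGroup(1, lambda: [{"id": 6}]),
]
report = read_row_groups(groups, skip_corrupt=True)
print(len(report.rows), report.skipped_groups, report.skipped_rows)
```

With skip_corrupt left at its default of False, the same call raises the
error, which matches the current behaviour described above. Making the skip
opt-in keeps data loss explicit rather than silent.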
related issue: PARQUET-147