Tongjie Chen created PARQUET-148:
------------------------------------

             Summary: provide an option to skip entire row group in case 
corrupted bytes occur
                 Key: PARQUET-148
                 URL: https://issues.apache.org/jira/browse/PARQUET-148
             Project: Parquet
          Issue Type: Improvement
            Reporter: Tongjie Chen


In case of hardware failure (disk, memory, etc.), bytes in a file may be 
corrupted. That results in an ArrayIndexOutOfBoundsException and/or garbled 
data.

Currently, jobs reading those Parquet files will fail unless the corrupted 
files are deleted/moved.

It would be better if Parquet provided an option to skip the entire row group 
(and report how many rows were affected) when corrupted bytes are encountered.
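
A rough sketch of the requested behaviour, to make the idea concrete. This is 
not Parquet's API: RowGroup, SkippingReader and the skipCorruptRowGroups flag 
are hypothetical names used only for illustration, and corruption is assumed to 
surface as a RuntimeException while a row group is being decoded.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a row group: a declared row count plus a decode
// step that may throw when the underlying bytes are corrupted.
interface RowGroup {
    long rowCount();
    List<String> decode(); // may throw a RuntimeException on corrupted bytes
}

// Hypothetical reader illustrating the proposed option; not part of Parquet.
final class SkippingReader {
    private final boolean skipCorruptRowGroups; // assumed option name
    private long skippedRowGroups = 0;
    private long skippedRows = 0;

    SkippingReader(boolean skipCorruptRowGroups) {
        this.skipCorruptRowGroups = skipCorruptRowGroups;
    }

    List<String> readAll(List<RowGroup> groups) {
        List<String> rows = new ArrayList<>();
        for (RowGroup group : groups) {
            try {
                // decode() finishes (or throws) before addAll runs, so a
                // failing group never contributes partial rows.
                rows.addAll(group.decode());
            } catch (RuntimeException e) {
                if (!skipCorruptRowGroups) {
                    throw e; // option off: fail the read, as happens today
                }
                // Option on: drop the whole group and record what was lost.
                skippedRowGroups++;
                skippedRows += group.rowCount();
            }
        }
        return rows;
    }

    long skippedRowGroups() { return skippedRowGroups; }

    long skippedRows() { return skippedRows; }
}
```

With the flag off the exception propagates and the job fails as it does now. 
With the flag on, the caller gets the rows from the healthy row groups and can 
report skippedRows() afterwards. The count relies on the row count recorded in 
the metadata, which is assumed here to be intact.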

Related issue: PARQUET-147




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
