[ 
https://issues.apache.org/jira/browse/PARQUET-1963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Szadovszky resolved PARQUET-1963.
---------------------------------------
    Resolution: Fixed

> DeprecatedParquetInputFormat in CombineFileInputFormat throws NPE when the 
> first sub-split is empty
> --------------------------------------------------------------------------------------------------
>
>                 Key: PARQUET-1963
>                 URL: https://issues.apache.org/jira/browse/PARQUET-1963
>             Project: Parquet
>          Issue Type: Bug
>          Components: parquet-mr
>            Reporter: Daniel Dai
>            Assignee: Daniel Dai
>            Priority: Major
>
> This is a follow-up to PARQUET-1947. After that fix, when the first sub-split 
> in CombineFileInputFormat is empty, an NPE is thrown:
> {code}
> Caused by: java.lang.NullPointerException
>       at org.apache.parquet.hadoop.mapred.DeprecatedParquetInputFormat$RecordReaderWrapper.next(DeprecatedParquetInputFormat.java:154)
>       at org.apache.parquet.hadoop.mapred.DeprecatedParquetInputFormat$RecordReaderWrapper.next(DeprecatedParquetInputFormat.java:73)
>       at cascading.tap.hadoop.io.CombineFileRecordReaderWrapper.next(CombineFileRecordReaderWrapper.java:70)
>       at org.apache.hadoop.mapred.lib.CombineFileRecordReader.next(CombineFileRecordReader.java:58)
>       at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:199)
>       at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:185)
>       at cascading.tap.hadoop.util.MeasuredRecordReader.next(MeasuredRecordReader.java:61)
>       at org.apache.parquet.cascading.ParquetTupleScheme.source(ParquetTupleScheme.java:160)
>       at cascading.tuple.TupleEntrySchemeIterator.getNext(TupleEntrySchemeIterator.java:163)
>       at cascading.tuple.TupleEntrySchemeIterator.hasNext(TupleEntrySchemeIterator.java:136)
>       ... 10 more
> {code}
> The cause is that CombineFileInputFormat uses the value returned by 
> createValue of the first sub-split as the value container. Because the first 
> sub-split is empty, that value container is null, and the later call to next 
> fails with the NPE shown above.
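
The failure mode described above can be reproduced in miniature without Hadoop or Parquet on the classpath. The sketch below is only an illustration under stated assumptions: `Container`, `SubSplitReader`, `createValueUnsafe`, `createValueSafe`, and `readAll` are hypothetical names invented for this example, not classes or methods from parquet-mr or Hadoop, and the "safe" variant is one possible guard, not necessarily the change that resolved this issue.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

final class EmptyFirstSubSplitSketch {

    /** Hypothetical reusable value holder, filled in by each call to next(). */
    static final class Container<T> {
        private T value;
        void set(T v) { value = v; }
        T get() { return value; }
    }

    /** Hypothetical stand-in for a per-sub-split reader in the old mapred style. */
    static final class SubSplitReader {
        private final Iterator<String> records;
        // Assumption for this sketch: the container only exists if the sub-split has data.
        private final Container<String> container;

        SubSplitReader(List<String> data) {
            this.records = data.iterator();
            this.container = data.isEmpty() ? null : new Container<String>();
        }

        /** Returns null for an empty sub-split, which mirrors the reported behavior. */
        Container<String> createValueUnsafe() { return container; }

        /** One possible guard: always hand back a usable container. */
        Container<String> createValueSafe() {
            return container != null ? container : new Container<String>();
        }

        boolean next(Container<String> value) {
            if (!records.hasNext()) {
                return false;
            }
            value.set(records.next()); // throws NullPointerException if value is null
            return true;
        }
    }

    /** Reads all sub-splits, reusing the container created by the FIRST reader only. */
    static List<String> readAll(List<List<String>> subSplits, boolean safe) {
        List<SubSplitReader> readers = new ArrayList<SubSplitReader>();
        for (List<String> data : subSplits) {
            readers.add(new SubSplitReader(data));
        }
        SubSplitReader first = readers.get(0);
        Container<String> shared = safe ? first.createValueSafe() : first.createValueUnsafe();

        List<String> out = new ArrayList<String>();
        for (SubSplitReader reader : readers) {
            while (reader.next(shared)) {
                out.add(shared.get());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // First sub-split is empty, second one holds two records.
        List<List<String>> subSplits = Arrays.asList(
                Collections.<String>emptyList(),
                Arrays.asList("a", "b"));

        try {
            readAll(subSplits, false);
        } catch (NullPointerException e) {
            System.out.println("unsafe createValue: NPE on the second sub-split");
        }
        System.out.println("safe createValue: " + readAll(subSplits, true));
    }
}
```

In this model the empty first reader never dereferences the null container, so the exception only surfaces once a later, non-empty sub-split calls `next` with it. That matches the stack trace, where the NPE is raised inside `next` rather than inside `createValue`.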



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
