[ https://issues.apache.org/jira/browse/PARQUET-1963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Gabor Szadovszky resolved PARQUET-1963.
---------------------------------------
    Resolution: Fixed

> DeprecatedParquetInputFormat in CombineFileInputFormat throw NPE when the first sub-split is empty
> --------------------------------------------------------------------------------------------------
>
>                 Key: PARQUET-1963
>                 URL: https://issues.apache.org/jira/browse/PARQUET-1963
>             Project: Parquet
>          Issue Type: Bug
>          Components: parquet-mr
>            Reporter: Daniel Dai
>            Assignee: Daniel Dai
>            Priority: Major
>
> A followup of PARQUET-1947: after that fix, when the first sub-split is empty
> in CombineFileInputFormat, there is an NPE:
> {code}
> Caused by: java.lang.NullPointerException
>     at org.apache.parquet.hadoop.mapred.DeprecatedParquetInputFormat$RecordReaderWrapper.next(DeprecatedParquetInputFormat.java:154)
>     at org.apache.parquet.hadoop.mapred.DeprecatedParquetInputFormat$RecordReaderWrapper.next(DeprecatedParquetInputFormat.java:73)
>     at cascading.tap.hadoop.io.CombineFileRecordReaderWrapper.next(CombineFileRecordReaderWrapper.java:70)
>     at org.apache.hadoop.mapred.lib.CombineFileRecordReader.next(CombineFileRecordReader.java:58)
>     at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:199)
>     at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:185)
>     at cascading.tap.hadoop.util.MeasuredRecordReader.next(MeasuredRecordReader.java:61)
>     at org.apache.parquet.cascading.ParquetTupleScheme.source(ParquetTupleScheme.java:160)
>     at cascading.tuple.TupleEntrySchemeIterator.getNext(TupleEntrySchemeIterator.java:163)
>     at cascading.tuple.TupleEntrySchemeIterator.hasNext(TupleEntrySchemeIterator.java:136)
>     ... 10 more
> {code}
> The reason is that CombineFileInputFormat uses the result of createValue of
> the first sub-split as the value container for every sub-split. Since the
> first sub-split is empty, the value container is null.
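
The failure mode described above can be reproduced in miniature without Hadoop. The sketch below is only an illustration under stated assumptions: CombineReaderSketch, SubSplitReader, Container, sample and readAll are hypothetical names, not Parquet or Hadoop classes, and the null guard shown is one possible mitigation, not necessarily the change committed for this issue.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

public class CombineReaderSketch {

    /** Mutable holder, loosely playing the role of the reusable value container. */
    public static final class Container {
        String value;
    }

    /** Hypothetical stand-in for a per-sub-split record reader (not the Hadoop interface). */
    public static final class SubSplitReader {
        private final Iterator<String> rows;
        private final boolean empty;

        SubSplitReader(List<String> rows) {
            this.rows = rows.iterator();
            this.empty = rows.isEmpty();
        }

        /** Assumption mirroring the report: an empty sub-split yields a null container. */
        Container createValue() {
            return empty ? null : new Container();
        }

        /** Fills the caller-supplied container without checking it for null. */
        boolean next(Container container) {
            if (!rows.hasNext()) {
                return false;
            }
            container.value = rows.next(); // throws NullPointerException if container is null
            return true;
        }
    }

    /** Builds fresh readers: an empty first sub-split followed by one with two rows. */
    public static List<SubSplitReader> sample() {
        return Arrays.asList(
                new SubSplitReader(Collections.<String>emptyList()),
                new SubSplitReader(Arrays.asList("a", "b")));
    }

    /** Reads every sub-split, reusing the container created by the first one. */
    public static List<String> readAll(List<SubSplitReader> readers, boolean guardNull) {
        Container container = readers.get(0).createValue();
        if (container == null && guardNull) {
            container = new Container(); // hypothetical guard, for illustration only
        }
        List<String> out = new ArrayList<>();
        for (SubSplitReader reader : readers) {
            while (reader.next(container)) {
                out.add(container.value);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        try {
            readAll(sample(), false);
        } catch (NullPointerException e) {
            System.out.println("unguarded read failed with NullPointerException");
        }
        System.out.println("guarded read returned " + readAll(sample(), true));
    }
}
```

With the guard disabled, the empty first reader returns no rows and the second reader then dereferences the null container, which is the same shape as the stack trace in the report. With the guard enabled, both rows of the second sub-split are read.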