[ https://issues.apache.org/jira/browse/PARQUET-2343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Gang Wu resolved PARQUET-2343.
------------------------------
    Resolution: Fixed

> Fixes NPE when rewriting file with multiple rowgroups
> -----------------------------------------------------
>
>                 Key: PARQUET-2343
>                 URL: https://issues.apache.org/jira/browse/PARQUET-2343
>             Project: Parquet
>          Issue Type: Bug
>            Reporter: Xianyang Liu
>            Assignee: Xianyang Liu
>            Priority: Major
>             Fix For: 1.14.0
>
>
> Currently, the ParquetRewriter creates the `ColumnReadStoreImpl crStore` once
> and reuses it to rewrite every block. This is incorrect: the `crStore` should
> be created for each block that needs to be rewritten. Otherwise, the rewrite
> fails as follows:
> ```java
> java.lang.NullPointerException
>     at org.apache.parquet.column.impl.ColumnReaderBase.readPage(ColumnReaderBase.java:620)
>     at org.apache.parquet.column.impl.ColumnReaderBase.checkRead(ColumnReaderBase.java:594)
>     at org.apache.parquet.column.impl.ColumnReaderBase.consume(ColumnReaderBase.java:735)
>     at org.apache.parquet.column.impl.ColumnReaderImpl.consume(ColumnReaderImpl.java:30)
>     at org.apache.parquet.column.impl.ColumnReaderImpl.<init>(ColumnReaderImpl.java:47)
>     at org.apache.parquet.column.impl.ColumnReadStoreImpl.getColumnReader(ColumnReadStoreImpl.java:82)
>     at org.apache.parquet.hadoop.rewrite.ParquetRewriter.processBlocksFromReader(ParquetRewriter.java:316)
>     at org.apache.parquet.hadoop.rewrite.ParquetRewriter.processBlocks(ParquetRewriter.java:250)
> ```
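
The per-block fix described in this issue can be illustrated with a toy model. The sketch below does not use any Parquet classes: `BlockPages`, `BlockReadStore`, and both `rewriteWith...` methods are hypothetical stand-ins, and the way the real `ColumnReaderBase` reaches a null page may differ in detail. It only shows the general shape of the problem, assuming a read store is bound to the pages of the block it was created from.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class PerBlockStoreSketch {

    /** Hypothetical stand-in for one row group's pages: hands out each page once. */
    static final class BlockPages {
        private final Iterator<String> pages;

        BlockPages(List<String> pages) {
            this.pages = pages.iterator();
        }

        /** Returns the next page, or null once the block is exhausted. */
        String nextPage() {
            return pages.hasNext() ? pages.next() : null;
        }
    }

    /** Hypothetical stand-in for a column read store bound to exactly one block. */
    static final class BlockReadStore {
        private final BlockPages source;

        BlockReadStore(BlockPages source) {
            this.source = source;
        }

        /** Reads one page and dereferences it, as a column reader would. */
        int readPageLength() {
            String page = source.nextPage();
            return page.length(); // throws NullPointerException if no page is left
        }
    }

    /** Buggy shape: one store, built from the first block, reused for every block. */
    static List<Integer> rewriteWithSharedStore(List<List<String>> blocks) {
        List<Integer> out = new ArrayList<>();
        BlockReadStore store = new BlockReadStore(new BlockPages(blocks.get(0)));
        for (List<String> block : blocks) {
            for (int i = 0; i < block.size(); i++) {
                out.add(store.readPageLength());
            }
        }
        return out;
    }

    /** Fixed shape: a fresh store is created for each block being rewritten. */
    static List<Integer> rewriteWithPerBlockStore(List<List<String>> blocks) {
        List<Integer> out = new ArrayList<>();
        for (List<String> block : blocks) {
            BlockReadStore store = new BlockReadStore(new BlockPages(block));
            for (int i = 0; i < block.size(); i++) {
                out.add(store.readPageLength());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Two "row groups": the first has two pages, the second has one.
        List<List<String>> blocks = List.of(List.of("aa", "bbb"), List.of("c"));

        System.out.println(rewriteWithPerBlockStore(blocks));

        try {
            rewriteWithSharedStore(blocks);
        } catch (NullPointerException e) {
            System.out.println("shared store failed on the second block");
        }
    }
}
```

In this model the shared store works for a single-block input and only fails once a second block is processed, which matches the issue title: the failure shows up when rewriting a file with multiple row groups.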