[ 
https://issues.apache.org/jira/browse/PARQUET-1901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17183352#comment-17183352
 ] 

Xinli Shang commented on PARQUET-1901:
--------------------------------------

Hi [~rdblue], please comment on this if you have different opinions. This is 
found during the ColumnIndex integration to Iceberg. We would need to handle 
the null checking in Iceberg anyway before Parquet 1.12.0.  

> Add filter null check for ColumnIndex  
> ---------------------------------------
>
>                 Key: PARQUET-1901
>                 URL: https://issues.apache.org/jira/browse/PARQUET-1901
>             Project: Parquet
>          Issue Type: Bug
>          Components: parquet-mr
>    Affects Versions: 1.11.0
>            Reporter: Xinli Shang
>            Assignee: Xinli Shang
>            Priority: Major
>             Fix For: 1.12.0
>
>
> This Jira is opened for discussion that should we add null checking for the 
> filter when ColumnIndex is enabled. 
> In the ColumnIndexFilter#calculateRowRanges() method, the input parameter 
> 'filter' is assumed to be non-null without checking. It throws NPE when 
> ColumnIndex is enabled(by default) but there is no filter set in the 
> ParquetReadOptions. The call stack is as below. 
>     java.lang.NullPointerException
>         at 
> org.apache.parquet.internal.filter2.columnindex.ColumnIndexFilter.calculateRowRanges(ColumnIndexFilter.java:81)
>         at 
> org.apache.parquet.hadoop.ParquetFileReader.getRowRanges(ParquetFileReader.java:961)
>         at 
> org.apache.parquet.hadoop.ParquetFileReader.readNextFilteredRowGroup(ParquetFileReader.java:891)
> If we don't add, the user might need to choose to call readNextRowGroup() or 
> readFilteredNextRowGroup() accordingly based on filter existence. 
> Thoughts?  
>   



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to