Gabor Szadovszky created PARQUET-1765:
-----------------------------------------

             Summary: Invalid filteredRowCount in InternalParquetRecordReader
                 Key: PARQUET-1765
                 URL: https://issues.apache.org/jira/browse/PARQUET-1765
             Project: Parquet
          Issue Type: Bug
          Components: parquet-mr
    Affects Versions: 1.11.0
            Reporter: Gabor Szadovszky
            Assignee: Gabor Szadovszky
             Fix For: 1.11.1


The [record count|https://github.com/apache/parquet-mr/blob/apache-parquet-1.11.0/parquet-hadoop/src/main/java/org/apache/parquet/hadoop/InternalParquetRecordReader.java#L185]
is retrieved before setting the [projection schema|https://github.com/apache/parquet-mr/blob/apache-parquet-1.11.0/parquet-hadoop/src/main/java/org/apache/parquet/hadoop/InternalParquetRecordReader.java#L188],
so the value might be invalid if the projection impacts the filter.

In normal cases this does not produce wrong results, because the record filter 
still filters correctly. The only effect is that the records are filtered 
one by one instead of the related pages being dropped.
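
The ordering problem can be illustrated with a small, self-contained sketch. 
This is a toy model and not the parquet-mr code: ToyReader, its methods and the 
column names "x" and "y" are hypothetical stand-ins, and the rule that a filter 
column outside the projection matches no rows is an assumption made only for 
this example. It shows how a count taken before the projection is set can 
differ from the count taken afterwards.

```java
import java.util.List;
import java.util.Set;

final class ProjectionOrderSketch {

    // Hypothetical stand-in for a file reader; NOT the parquet-mr API.
    static final class ToyReader {
        private final List<Integer> xValues; // values of column "x", one per row
        private Set<String> projection;      // currently requested columns

        ToyReader(List<Integer> xValues, Set<String> fileColumns) {
            this.xValues = xValues;
            this.projection = fileColumns;   // default: every column of the file
        }

        void setRequestedSchema(Set<String> columns) {
            this.projection = columns;
        }

        // Number of rows kept by the predicate x > threshold.
        // Toy assumption: if "x" is not projected, it is treated as missing
        // and no row matches.
        long getFilteredRecordCount(int threshold) {
            if (!projection.contains("x")) {
                return 0L;
            }
            return xValues.stream().filter(v -> v > threshold).count();
        }
    }

    static ToyReader newReader() {
        return new ToyReader(List.of(1, 7, 9, 3), Set.of("x", "y"));
    }

    // Order described in this issue: count first, projection second.
    static long countBeforeProjection() {
        ToyReader reader = newReader();
        long total = reader.getFilteredRecordCount(5); // still sees the full schema
        reader.setRequestedSchema(Set.of("y"));
        return total;
    }

    // Reversed order: projection first, count second.
    static long countAfterProjection() {
        ToyReader reader = newReader();
        reader.setRequestedSchema(Set.of("y"));
        return reader.getFilteredRecordCount(5);       // sees the projected schema
    }

    public static void main(String[] args) {
        System.out.println("count before projection: " + countBeforeProjection());
        System.out.println("count after projection:  " + countAfterProjection());
    }
}
```

In this toy model the two orderings report 2 and 0 rows respectively, so a 
total taken too early no longer matches what the reader returns once the 
projection is in place.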



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
