[ https://issues.apache.org/jira/browse/SPARK-42388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Chao Sun resolved SPARK-42388.
------------------------------
    Fix Version/s: 3.5.0
       Resolution: Fixed

Issue resolved by pull request 39950
[https://github.com/apache/spark/pull/39950]

> Avoid unnecessary parquet footer reads when no filters in vectorized reader
> ---------------------------------------------------------------------------
>
>                 Key: SPARK-42388
>                 URL: https://issues.apache.org/jira/browse/SPARK-42388
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 3.4.0
>            Reporter: Mars
>            Assignee: Mars
>            Priority: Major
>             Fix For: 3.5.0
>
>
> The Parquet footer is currently read twice in the vectorized Parquet reader,
> even when there are no filters that require pushdown.
> When the NameNode is under high pressure, reading the footer twice costs
> extra time.
> We can avoid the unnecessary second footer read by reusing the footer
> metadata in {{VectorizedParquetRecordReader}}.
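
For illustration only, the idea in the description can be sketched as follows. This is not the code from pull request 39950 and does not use Spark or Parquet APIs; every name here (Footer, FakeFile, RecordReader, open_naive, open_reusing_footer) is hypothetical. It assumes the caller already needs the footer for planning and that the reader can accept an already-parsed footer instead of fetching it again.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Footer:
    """Stand-in for parsed Parquet footer metadata (hypothetical)."""
    schema: str
    row_groups: int


@dataclass
class FakeFile:
    """Stand-in for a remote file; counts how often its footer is fetched."""
    footer: Footer
    footer_reads: int = 0

    def read_footer(self) -> Footer:
        # Each call models one extra round trip to the storage layer.
        self.footer_reads += 1
        return self.footer


class RecordReader:
    """Hypothetical reader that can accept an already-parsed footer."""

    def __init__(self, file: FakeFile, footer: Optional[Footer] = None):
        # Reuse the caller's footer when given; otherwise fetch it ourselves.
        self.footer = footer if footer is not None else file.read_footer()


def open_naive(file: FakeFile) -> RecordReader:
    file.read_footer()          # first read: done by the caller for planning
    return RecordReader(file)   # second read: repeated inside the reader


def open_reusing_footer(file: FakeFile) -> RecordReader:
    footer = file.read_footer()               # the only read
    return RecordReader(file, footer=footer)  # reader reuses the metadata
```

In this toy model, open_naive fetches the footer twice per file while open_reusing_footer fetches it once, which is the saving the issue describes for the case with no pushed-down filters.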