Yang Jie created SPARK-33700:
--------------------------------

             Summary: Add a `filters.nonEmpty` condition when pushing down filters for Parquet and ORC
                 Key: SPARK-33700
                 URL: https://issues.apache.org/jira/browse/SPARK-33700
             Project: Spark
          Issue Type: Improvement
          Components: SQL
    Affects Versions: 3.2.0
            Reporter: Yang Jie
{code:java}
lazy val footerFileMetaData =
  ParquetFileReader.readFooter(conf, filePath, SKIP_ROW_GROUPS).getFileMetaData
// Try to push down filters when filter push-down is enabled.
val pushed = if (enableParquetFilterPushDown) {
  val parquetSchema = footerFileMetaData.getSchema
  val parquetFilters = new ParquetFilters(parquetSchema, pushDownDate, pushDownTimestamp,
    pushDownDecimal, pushDownStringStartWith, pushDownInFilterThreshold, isCaseSensitive)
  filters
    // Collects all converted Parquet filter predicates. Notice that not all predicates can be
    // converted (`ParquetFilters.createFilter` returns an `Option`). That's why a `flatMap`
    // is used here.
    .flatMap(parquetFilters.createFilter)
    .reduceOption(FilterApi.and)
} else {
  None
}
{code}

An extra `filters.nonEmpty` condition should be added when trying to push down filters for Parquet, so that the Parquet footer is not read unnecessarily when there are no filters to push down. The same applies to the ORC reader.
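A minimal sketch of the suggested guard, reusing the names from the snippet above (`filters`, `enableParquetFilterPushDown`, `footerFileMetaData`, etc.); the exact shape of the final patch is an assumption:

{code:java}
// Sketch of the proposed change (assumption, not the final patch): because
// `footerFileMetaData` is a lazy val and this branch is the only place in the
// snippet that forces it, adding `filters.nonEmpty` to the condition means the
// footer is never read when the scan carries no filters at all.
val pushed = if (enableParquetFilterPushDown && filters.nonEmpty) {
  val parquetSchema = footerFileMetaData.getSchema
  val parquetFilters = new ParquetFilters(parquetSchema, pushDownDate, pushDownTimestamp,
    pushDownDecimal, pushDownStringStartWith, pushDownInFilterThreshold, isCaseSensitive)
  filters
    .flatMap(parquetFilters.createFilter)
    .reduceOption(FilterApi.and)
} else {
  None
}
{code}

With the guard in place, an empty `filters` sequence short-circuits straight to `None`; the current code reaches the same `None` result (an empty sequence has no predicates to reduce) but only after paying for the footer read.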