cloud-fan commented on a change in pull request #34575: URL: https://github.com/apache/spark/pull/34575#discussion_r752943643
##########
File path: sql/core/src/main/scala/org/apache/spark/sql/execution/DataSourceScanExec.scala
##########

```diff
@@ -355,7 +377,14 @@ case class FileSourceScanExec(
   @transient private lazy val pushedDownFilters = {
     val supportNestedPredicatePushdown = DataSourceUtils.supportNestedPredicatePushdown(relation)
-    dataFilters.flatMap(DataSourceStrategy.translateFilter(_, supportNestedPredicatePushdown))
+    dataFilters
+      .filterNot(
+        _.references.exists {
+          case MetadataAttribute(_) => true
```

Review comment:

How do we handle predicates against metadata columns? Do we ask Spark to evaluate those filters after the scan? I think it's better to push them down to the file source reader: for example, we can skip reading an entire file if a predicate checks the file name.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
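The reviewer's idea — pruning whole files against a metadata predicate before reading any rows — could be sketched roughly as below. This is a minimal, self-contained illustration, not Spark's actual implementation: `FileNameEquals` and `pruneFiles` are hypothetical stand-ins for Spark's real `Filter` expressions and partitioned-file listing.

```scala
// Hypothetical sketch: evaluate a file-name metadata predicate against the
// file listing, so non-matching files are never opened at all.
object MetadataFilterSketch {
  // Stand-in for a pushed-down predicate on the file-name metadata column;
  // Spark's real Filter / Expression hierarchy is far richer than this.
  final case class FileNameEquals(expected: String) {
    def eval(fileName: String): Boolean = fileName == expected
  }

  // Prune the listing with the metadata predicate instead of asking Spark
  // to filter rows after the scan has already read every file.
  def pruneFiles(files: Seq[String], filter: FileNameEquals): Seq[String] =
    files.filter(f => filter.eval(f))

  def main(args: Array[String]): Unit = {
    val files  = Seq("part-00000.parquet", "part-00001.parquet")
    val pruned = pruneFiles(files, FileNameEquals("part-00001.parquet"))
    println(pruned.mkString(","))
  }
}
```

The point of the sketch is only the ordering: the predicate runs during file listing, before any reader is constructed, which is why a pushed-down metadata filter can be much cheaper than a post-scan row filter.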