Github user rdblue commented on the issue: https://github.com/apache/spark/pull/21118 @cloud-fan, I'd like to get this PR in by 2.4.0. Now that the change to push predicates and projections happens when converting to the physical plan, this can go in. I've rebased this on master and updated it. This changes the DSv2 API to primarily use InternalRow. It ensures that the rows are UnsafeRow by adding a projection on top of the physical scan node. This projection is actually *more* efficient than the current read path because filters are run before the projection. This means, for example, that the Parquet reader can avoid those two projections that currently happen in the scan node.
--- --------------------------------------------------------------------- To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org