Chao Sun created SPARK-36527:
--------------------------------

             Summary: Implement lazy materialization for the vectorized Parquet reader
                 Key: SPARK-36527
                 URL: https://issues.apache.org/jira/browse/SPARK-36527
             Project: Spark
          Issue Type: Sub-task
          Components: SQL
    Affects Versions: 3.3.0
            Reporter: Chao Sun


Currently, the vectorized Parquet reader eagerly decodes every column in the 
read schema before any filter has been applied to the rows. This is costly, 
because values belonging to rows that a filter later discards are decoded for 
nothing. It would be better to materialize a column vector only when its data 
is actually read.
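
To illustrate the idea, here is a minimal sketch in Python of a column vector 
that defers decoding until first access. It is not Spark's implementation: 
LazyColumnVector, make_decoder, scan and the decoded log are hypothetical names 
invented for this example, and a plain list copy stands in for Parquet decoding.

```python
from typing import Callable, Dict, List, Optional


class LazyColumnVector:
    """Hypothetical column vector that decodes its values on first access."""

    def __init__(self, name: str, decode: Callable[[], List[int]]):
        self.name = name
        self._decode = decode
        self._values: Optional[List[int]] = None

    @property
    def materialized(self) -> bool:
        return self._values is not None

    def values(self) -> List[int]:
        # Decode once, then serve the cached values on later calls.
        if self._values is None:
            self._values = self._decode()
        return self._values


decoded: List[str] = []  # log of the columns that were actually decoded


def make_decoder(name: str, encoded: List[int]) -> Callable[[], List[int]]:
    """Stand-in for a real decoder: records the call and copies the data."""
    def decode() -> List[int]:
        decoded.append(name)
        return list(encoded)
    return decode


def scan(batch: Dict[str, LazyColumnVector], filter_col: str,
         predicate: Callable[[int], bool], project_col: str) -> List[int]:
    """Evaluate the filter first; touch the projected column only if rows survive."""
    keep = [i for i, v in enumerate(batch[filter_col].values()) if predicate(v)]
    if not keep:
        return []  # the projected column is never decoded
    projected = batch[project_col].values()
    return [projected[i] for i in keep]


raw = {"a": [1, 2, 3], "b": [10, 20, 30], "c": [100, 200, 300]}
batch = {name: LazyColumnVector(name, make_decoder(name, data))
         for name, data in raw.items()}

no_match = scan(batch, "a", lambda v: v > 100, "b")    # decodes only "a"
some_match = scan(batch, "a", lambda v: v >= 2, "b")   # now decodes "b" too
```

In this sketch column "c" is in the read schema but is never decoded, and 
column "b" is decoded only once a row passes the filter on "a". A real reader 
would also have to deal with page and row-group boundaries, which the sketch 
ignores.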



