Josh Rosen created SPARK-30338:
----------------------------------

             Summary: Avoid unnecessary InternalRow copies in ParquetRowConverter
                 Key: SPARK-30338
                 URL: https://issues.apache.org/jira/browse/SPARK-30338
             Project: Spark
          Issue Type: Improvement
          Components: SQL
    Affects Versions: 3.0.0
            Reporter: Josh Rosen
            Assignee: Josh Rosen


{{ParquetRowConverter}} calls {{InternalRow.copy()}} in cases where the copy is 
unnecessary; this can severely harm performance when reading deeply nested 
Parquet data.

It looks like this copying was originally added to handle arrays and maps of 
structs (in which case we need to keep the copying), but we can omit it for the 
more common case of structs nested directly in structs.
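
To illustrate the distinction, here is a toy model of buffer reuse. It is not 
Spark's code: {{MutableRow}}, {{read_array_of_structs}} and 
{{read_struct_in_struct}} are hypothetical names, and the sketch assumes only 
that a converter reuses one mutable row buffer across values. Under that 
assumption, elements collected into an array alias the same buffer unless each 
one is copied, whereas a struct nested directly in a struct is read by the 
consumer before the buffer is overwritten.

```python
class MutableRow:
    """Hypothetical stand-in for a reusable, mutable row buffer."""

    def __init__(self, num_fields):
        self.values = [None] * num_fields

    def copy(self):
        # Snapshot the current field values into a fresh row.
        snapshot = MutableRow(len(self.values))
        snapshot.values = list(self.values)
        return snapshot


def read_array_of_structs(elements, copy_each):
    """Collect several struct values that all pass through one reused buffer."""
    buf = MutableRow(1)  # assumption: a single buffer is reused per element
    collected = []
    for element in elements:
        buf.values[0] = element
        # Without a copy, every list entry refers to the same buffer object.
        collected.append(buf.copy() if copy_each else buf)
    return [row.values[0] for row in collected]


def read_struct_in_struct(records):
    """Parent struct points at the child's buffer; no per-record copy is made."""
    inner = MutableRow(1)
    outer = MutableRow(1)
    outer.values[0] = inner  # parent holds a reference, not a copy
    seen = []
    for record in records:
        inner.values[0] = record
        # The consumer reads the record before the next one overwrites it.
        seen.append(outer.values[0].values[0])
    return seen


print(read_array_of_structs([1, 2, 3], copy_each=False))  # aliased elements
print(read_array_of_structs([1, 2, 3], copy_each=True))   # distinct elements
print(read_struct_in_struct([1, 2, 3]))                   # correct without copy
```

In this model the array case only keeps distinct elements when each one is 
copied, while the struct-in-struct case yields every record correctly with no 
copy at all.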



