[ 
https://issues.apache.org/jira/browse/SPARK-25102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-25102:
----------------------------------
    Description: 
Currently, Spark writes its version number into Hive table properties under 
the `org.apache.spark.sql.create.version` key.

This issue aims to write the Spark version to ORC and Parquet file metadata 
consistently as well.

  was:
PARQUET-352 added support for the "writer.model.name" property in the Parquet 
metadata to identify the object model (application) that wrote the file.

The easiest way to write this property is by overriding getName() of 
org.apache.parquet.hadoop.api.WriteSupport. In Spark, this would mean adding 
getName() to the 
org.apache.spark.sql.execution.datasources.parquet.ParquetWriteSupport class.
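
The override described above can be sketched as follows. This is only an 
illustration, not the actual Spark change: WriteSupportStandIn is a 
hypothetical stand-in for org.apache.parquet.hadoop.api.WriteSupport, reduced 
to the single method under discussion so that the example compiles without 
Parquet on the classpath. SparkLikeWriteSupport and the returned value "spark" 
are assumptions made for the example.

```java
// Hypothetical stand-in for org.apache.parquet.hadoop.api.WriteSupport,
// keeping only the method relevant to "writer.model.name".
abstract class WriteSupportStandIn<T> {
    // Mirrors the default described for the real class: returning null
    // means no object model name is recorded in the file metadata.
    public String getName() {
        return null;
    }
}

// Hypothetical subclass playing the role of Spark's ParquetWriteSupport.
class SparkLikeWriteSupport extends WriteSupportStandIn<Object> {
    // Overriding getName() is what would let the writer identify itself.
    // The value "spark" is an assumption chosen for this sketch.
    @Override
    public String getName() {
        return "spark";
    }
}
```

In the real class hierarchy, the name returned here is what Parquet would 
store under "writer.model.name"; the stand-in only demonstrates the override 
itself.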


> Write Spark version to Parquet/ORC file metadata
> ------------------------------------------------
>
>                 Key: SPARK-25102
>                 URL: https://issues.apache.org/jira/browse/SPARK-25102
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 3.0.0
>            Reporter: Zoltan Ivanfi
>            Priority: Major
>
> Currently, Spark writes its version number into Hive table properties under 
> the `org.apache.spark.sql.create.version` key.
> This issue aims to write the Spark version to ORC and Parquet file metadata 
> consistently as well.



