[ 
https://issues.apache.org/jira/browse/SPARK-36642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weichen Xu reassigned SPARK-36642:
----------------------------------

    Assignee: Liang Zhang

> Add df.withMetadata: a syntax sugar to update the metadata of a dataframe
> -------------------------------------------------------------------------
>
>                 Key: SPARK-36642
>                 URL: https://issues.apache.org/jira/browse/SPARK-36642
>             Project: Spark
>          Issue Type: Story
>          Components: SQL
>    Affects Versions: 3.3.0
>            Reporter: Liang Zhang
>            Assignee: Liang Zhang
>            Priority: Major
>
> To make semantic annotations easier to use and modify, we want a shorter 
> API for updating column metadata in a dataframe.
> Currently we have to write
> {code:python}
> df.withColumn("col1", col("col1").alias("col1", metadata=metadata))
> {code}
> to update the metadata without changing the column name, which is too 
> verbose. We want a syntactic sugar API
> {code:python}
> df.withMetadata("col1", metadata=metadata)
> {code}
> to achieve the same functionality.
> A bit of background on how often the metadata is updated: we are working 
> on inferring semantic data types, using them in AutoML, and storing the 
> semantic annotations in the metadata. In many cases we will therefore 
> suggest that the user update the metadata to correct a wrong inference or 
> manually add an annotation where the inference is weak.
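
The equivalence described in the issue can be illustrated with a small, Spark-free sketch. Everything here is hypothetical and for illustration only: `ToyFrame`, `Column`, `with_column`, `with_metadata`, and the `semantic_type` key are made-up stand-ins, not Spark's classes or implementation. The sketch only shows the intended behaviour, namely that the short call gives the same result as the verbose one, keeps the column name, and leaves other columns untouched.

```python
from dataclasses import dataclass, field, replace
from typing import Any, Dict, Tuple


@dataclass(frozen=True)
class Column:
    """A toy column: just a name plus a metadata dictionary."""
    name: str
    metadata: Dict[str, Any] = field(default_factory=dict)

    def alias(self, name: str, metadata: Dict[str, Any]) -> "Column":
        # Return a copy with the given name and metadata.
        return replace(self, name=name, metadata=dict(metadata))


@dataclass(frozen=True)
class ToyFrame:
    """A toy stand-in for a dataframe that only tracks its columns."""
    columns: Tuple[Column, ...]

    def col(self, name: str) -> Column:
        for c in self.columns:
            if c.name == name:
                return c
        raise KeyError(name)

    def with_column(self, name: str, column: Column) -> "ToyFrame":
        # Replace the column called `name` in place, or append it if absent.
        new = replace(column, name=name)
        if any(c.name == name for c in self.columns):
            cols = tuple(new if c.name == name else c for c in self.columns)
        else:
            cols = self.columns + (new,)
        return ToyFrame(cols)

    def with_metadata(self, name: str, metadata: Dict[str, Any]) -> "ToyFrame":
        # The shortcut: one call that expands to the verbose form.
        return self.with_column(name, self.col(name).alias(name, metadata))


df = ToyFrame((Column("col1"), Column("col2", {"unit": "cm"})))
metadata = {"semantic_type": "categorical"}  # hypothetical annotation

verbose = df.with_column("col1", df.col("col1").alias("col1", metadata))
short = df.with_metadata("col1", metadata)
```

In this toy model `verbose` and `short` compare equal, which is the property the proposed API is meant to have.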



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
