[ https://issues.apache.org/jira/browse/SPARK-36642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Weichen Xu reassigned SPARK-36642:
----------------------------------

    Assignee: Liang Zhang

> Add df.withMetadata: a syntax sugar to update the metadata of a dataframe
> -------------------------------------------------------------------------
>
>                 Key: SPARK-36642
>                 URL: https://issues.apache.org/jira/browse/SPARK-36642
>             Project: Spark
>          Issue Type: Story
>          Components: SQL
>    Affects Versions: 3.3.0
>            Reporter: Liang Zhang
>            Assignee: Liang Zhang
>            Priority: Major
>
> To make the semantic annotation easy to use and modify, we want a shorter
> API for updating the metadata in a dataframe.
> Currently we have
> {code:scala}
> df.withColumn("col1", col("col1").alias("col1", metadata=metadata))
> {code}
> to update the metadata without changing the column name, which is too
> verbose. We want a syntactic sugar API
> {code:scala}
> df.withMetadata("col1", metadata=metadata)
> {code}
> to achieve the same functionality.
> Some background on how often the metadata is updated: we are working on
> inferring semantic data types, using them in AutoML, and storing the
> semantic annotations in the metadata. So in many cases we will suggest that
> the user update the metadata to correct a wrong inference or manually add
> the annotation where the inference is weak.
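
For illustration, a minimal Scala sketch of the two forms described in the issue, assuming Spark 3.3.0 or later (the version listed on the issue) and a local SparkSession. The column names, the `semantic_type` key, and its value are hypothetical and chosen only for the example. Note that in the Scala API the aliasing overload that accepts metadata is `Column.as(alias, metadata)`, and the metadata argument is an `org.apache.spark.sql.types.Metadata` value rather than a dictionary.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col
import org.apache.spark.sql.types.MetadataBuilder

object WithMetadataSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("withMetadata-sketch")
      .getOrCreate()
    import spark.implicits._

    // Small example dataframe; the data is made up.
    val df = Seq((1, "a"), (2, "b")).toDF("col1", "col2")

    // Hypothetical semantic annotation (key and value are assumptions).
    val metadata = new MetadataBuilder()
      .putString("semantic_type", "categorical")
      .build()

    // Verbose form: re-alias the column to its own name, attaching metadata.
    val verbose = df.withColumn("col1", col("col1").as("col1", metadata))

    // Shorthand requested in this issue.
    val concise = df.withMetadata("col1", metadata)

    // Both results should carry the same metadata on col1.
    assert(verbose.schema("col1").metadata == metadata)
    assert(concise.schema("col1").metadata == metadata)

    spark.stop()
  }
}
```

Both calls leave the column name and data unchanged; only the `metadata` field of the corresponding `StructField` in the schema differs from the input dataframe.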