[ https://issues.apache.org/jira/browse/SPARK-24853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17437394#comment-17437394 ]
Nicholas Chammas commented on SPARK-24853:
------------------------------------------

[~hyukjin.kwon] - It's not just for consistency. As noted in the description, this is useful when you are trying to rename a column with an ambiguous name.

For example, imagine two tables {{left}} and {{right}}, each with a column called {{count}}:

{code:java}
(
    left_counts.alias('left')
    .join(right_counts.alias('right'), on='join_key')
    .withColumn(
        'total_count',
        left_counts['count'] + right_counts['count']
    )
    .withColumnRenamed('left.count', 'left_count')  # no-op; alias doesn't work
    .withColumnRenamed('count', 'left_count')  # incorrect; it renames both count columns
    .withColumnRenamed(left_counts['count'], 'left_count')  # what, ideally, users want to do here
    .show()
){code}

If you don't mind, I'm going to reopen this issue.

> Support Column type for withColumn and withColumnRenamed apis
> -------------------------------------------------------------
>
>                 Key: SPARK-24853
>                 URL: https://issues.apache.org/jira/browse/SPARK-24853
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 2.2.2
>            Reporter: nirav patel
>            Priority: Major
>
> Can we add overloaded version of withColumn or withColumnRenamed that accept
> Column type instead of String? That way I can specify FQN in case when there
> is duplicate column names. e.g. if I have 2 columns with same name as a
> result of join and I want to rename one of the field I can do it with this
> new API.
>
> This would be similar to Drop api which supports both String and Column type.
>
> def
> withColumn(colName: Column, col: Column): DataFrame
> Returns a new Dataset by adding a column or replacing the existing column
> that has the same name.
>
> def
> withColumnRenamed(existingName: Column, newName: Column): DataFrame
> Returns a new Dataset with a column renamed.
>
>
> I think there should also be this one:
>
> def
> withColumnRenamed(existingName: *Column*, newName: *Column*): DataFrame
> Returns a new Dataset with a column renamed.