Chaerim Yeo created SPARK-25571:
-----------------------------------

             Summary: Add withColumnsRenamed method to Dataset
                 Key: SPARK-25571
                 URL: https://issues.apache.org/jira/browse/SPARK-25571
             Project: Spark
          Issue Type: Improvement
          Components: SQL
    Affects Versions: 2.3.2
            Reporter: Chaerim Yeo


There are two general approaches to renaming several columns:
 * Using *withColumnRenamed* method
 * Using *select* method

{code}
// Using withColumnRenamed
ds.withColumnRenamed("first_name", "firstName")
  .withColumnRenamed("last_name", "lastName")
  .withColumnRenamed("postal_code", "postalCode")

// Using select (the $"..." syntax assumes `import spark.implicits._` is in scope)
ds.select(
  $"id",
  $"first_name" as "firstName",
  $"last_name" as "lastName",
  $"address",
  $"postal_code" as "postalCode"
)
{code}
However, both approaches are inefficient and redundant because of the following 
limitations:
 * withColumnRenamed: the method must be called once for every column to be renamed
 * select: every column must be passed to the method, including those that are not renamed

It would be useful to add a new method, such as *withColumnsRenamed*, that 
can rename many columns at once.
{code}
ds.withColumnsRenamed(
  "first_name" -> "firstName",
  "last_name" -> "lastName",
  "postal_code" -> "postalCode"
)
// or
ds.withColumnsRenamed(Map(
  "first_name" -> "firstName",
  "last_name" -> "lastName",
  "postal_code" -> "postalCode"
))
{code}
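Until such a method exists, the same effect can be approximated in user code. The following is only a minimal sketch, not a proposed implementation: the object name RenameColumnsSketch and the helpers renameByFold and renameBySelect are hypothetical names chosen for illustration, and only existing Dataset methods (toDF, withColumnRenamed, select, columns) are used. It assumes column names without dots or other special characters.

```scala
import org.apache.spark.sql.{DataFrame, Dataset}
import org.apache.spark.sql.functions.col

// Hypothetical helpers for illustration; not part of the Spark API.
object RenameColumnsSketch {

  // Variant 1: apply withColumnRenamed once per map entry.
  // Renames are applied one after another, and entries whose source
  // column does not exist are silently ignored (withColumnRenamed is a no-op).
  def renameByFold(ds: Dataset[_], renames: Map[String, String]): DataFrame =
    renames.foldLeft(ds.toDF()) { case (acc, (from, to)) =>
      acc.withColumnRenamed(from, to)
    }

  // Variant 2: build a single select over all existing columns,
  // aliasing only those that appear in the map.
  // Assumption: the map lookup here is case-sensitive, and column
  // names contain no dots (col would treat a dot as nested-field access).
  def renameBySelect(ds: Dataset[_], renames: Map[String, String]): DataFrame =
    ds.select(ds.columns.map(c => col(c).as(renames.getOrElse(c, c))): _*)
}
```

For example, RenameColumnsSketch.renameByFold(ds, Map("first_name" -> "firstName", "last_name" -> "lastName")) leaves all other columns in place. The two variants can differ when one entry's target name is another entry's source name, because the fold applies the renames sequentially while the select applies them all at once.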
 


