[ https://issues.apache.org/jira/browse/SPARK-10346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shivaram Venkataraman resolved SPARK-10346.
-------------------------------------------
    Resolution: Fixed

> SparkR mutate and transform should replace column with same name to match R 
> data.frame behavior
> -----------------------------------------------------------------------------------------------
>
>                 Key: SPARK-10346
>                 URL: https://issues.apache.org/jira/browse/SPARK-10346
>             Project: Spark
>          Issue Type: Bug
>          Components: SparkR
>    Affects Versions: 1.5.0
>            Reporter: Felix Cheung
>
> SparkR does not seem to replace an existing column of the same name in mutate 
> (e.g. mutate(df, age = df$age + 2) returns a DataFrame with two columns named 
> 'age'), so transform does not replace matching columns for now either.
> However, the R documentation for transform clearly states that a column with 
> a matching name should be replaced:
> https://stat.ethz.ch/R-manual/R-devel/library/base/html/transform.html
> "The tags are matched against names(_data), and for those that match, the 
> value replace the corresponding variable in _data, and the others are 
> appended to _data."
> The resulting DataFrame is also hard to work with when selecting by column 
> name, registering the table for SQL, and so on, because two columns share 
> the same name.
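
For reference, the base R behavior that the issue asks SparkR to match can be sketched as follows. This is a minimal illustration on a plain local data.frame, not SparkR code; the data frame, its values, and the variable names `replaced` and `appended` are made up for the example.

```r
# Plain base R data.frame (not a SparkR DataFrame); contents are illustrative.
df <- data.frame(name = c("Alice", "Bob"), age = c(30, 25))

# The tag `age` matches an existing column name, so that column is replaced.
replaced <- transform(df, age = age + 2)

# The tag `age2` matches no existing column, so a new column is appended.
appended <- transform(df, age2 = age + 2)

names(replaced)  # "name" "age"
names(appended)  # "name" "age" "age2"
```

With the replace semantics there is never more than one column named `age`, which is what keeps later lookups by column name unambiguous.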



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

