[ https://issues.apache.org/jira/browse/SPARK-10346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Shivaram Venkataraman resolved SPARK-10346.
-------------------------------------------
    Resolution: Fixed

> SparkR mutate and transform should replace column with same name to match R data.frame behavior
> -----------------------------------------------------------------------------------------------
>
>                 Key: SPARK-10346
>                 URL: https://issues.apache.org/jira/browse/SPARK-10346
>             Project: Spark
>          Issue Type: Bug
>          Components: SparkR
>    Affects Versions: 1.5.0
>            Reporter: Felix Cheung
>
> Spark doesn't seem to replace an existing column with the same name in mutate
> (e.g. mutate(df, age = df$age + 2) returns a DataFrame with 2 columns that are
> both named 'age'), so transform does not do that for now either.
> However, the R documentation clearly states that a column with a matching name
> should be replaced:
> https://stat.ethz.ch/R-manual/R-devel/library/base/html/transform.html
> "The tags are matched against names(_data), and for those that match, the
> value replace the corresponding variable in _data, and the others are
> appended to _data."
> Also, the resulting DataFrame might be hard to work with if one selects by
> column name, registers the table for SQL, and so on, since 2 columns then
> have the same name.
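
The difference between the reported behavior and the behavior described in the R transform documentation can be sketched in a few lines. This is only an illustrative model in plain Python, not SparkR code and not the actual fix: a table is assumed to be an ordered list of (name, values) pairs, and the helper names append_only and replace_or_append are hypothetical.

```python
from typing import Any, Dict, List, Tuple

# Hypothetical model: a table is an ordered list of (column name, values) pairs.
Column = Tuple[str, List[Any]]


def append_only(columns: List[Column], new: Dict[str, List[Any]]) -> List[Column]:
    """Model of the reported behavior: new columns are always appended,
    even when a column with the same name already exists."""
    return list(columns) + list(new.items())


def replace_or_append(columns: List[Column], new: Dict[str, List[Any]]) -> List[Column]:
    """Model of the documented transform() semantics: names that match an
    existing column replace it in place, all other names are appended."""
    remaining = dict(new)
    result: List[Column] = []
    for name, values in columns:
        if name in remaining:
            # Matching name: keep the position, swap in the new values.
            result.append((name, remaining.pop(name)))
        else:
            result.append((name, values))
    # Names that matched nothing are appended at the end.
    result.extend(remaining.items())
    return result


df = [("name", ["a", "b"]), ("age", [30, 40])]
new_age = {"age": [v + 2 for v in dict(df)["age"]]}

buggy = append_only(df, new_age)
fixed = replace_or_append(df, new_age)

print([n for n, _ in buggy])  # ['name', 'age', 'age']
print([n for n, _ in fixed])  # ['name', 'age']
```

In the first result, selecting by the name 'age' is ambiguous, which is the usability problem the issue describes. In the second, the column keeps its original position and only its values change.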