Felix Cheung created SPARK-10346:
------------------------------------
Summary: SparkR mutate and transform should replace column with
same name to match R data.frame behavior
Key: SPARK-10346
URL: https://issues.apache.org/jira/browse/SPARK-10346
Project: Spark
Issue Type: Bug
Components: R
Affects Versions: 1.5.0
Reporter: Felix Cheung
SparkR's mutate does not seem to replace an existing column that has the same
name as a new one (e.g. mutate(df, age = df$age + 2) returns a DataFrame with
two columns both named 'age'), so transform does not do the replacement for
now either.
However, the R documentation for transform clearly states that a column with a
matching name should be replaced:
https://stat.ethz.ch/R-manual/R-devel/library/base/html/transform.html
"The tags are matched against names(_data), and for those that match, the value
replace the corresponding variable in _data, and the others are appended to
_data."
Also, the resulting DataFrame can be hard to work with, for example when using
select with column names, because the duplicated name is ambiguous.
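
For reference, the behavior quoted from the transform documentation can be
seen with a plain base R data.frame. This is only a small sketch of the
semantics SparkR would be expected to match. It does not use SparkR, and the
data and column names are made up for illustration.

```r
# Base R only; no SparkR session is needed for this illustration.
# The data below is hypothetical.
df <- data.frame(name = c("Alice", "Bob"), age = c(30, 40))

# 'age' matches an existing column name, so its values are replaced in place.
# 'height' matches nothing, so it is appended as a new column.
out <- transform(df, age = age + 2, height = c(170, 180))

print(names(out))
print(out$age)
```

Here the result has a single 'age' column holding the updated values, and the
new 'height' column comes last.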