[ 
https://issues.apache.org/jira/browse/SPARK-18823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15745633#comment-15745633
 ] 

Shivaram Venkataraman commented on SPARK-18823:
-----------------------------------------------

Thanks [~masip85] for verifying this. As [~felixcheung] pointed out, I think 
there are two separate issues we can file as feature requests:

1. Supporting assignment of DataFrame columns with `[` and `[[`. I'd guess this 
should be pretty straightforward.

2. Supporting assignment of a local R column using `$` and/or `[[`. I'm less 
sure about this one, because it would involve determining types, serializing 
the data from local R, splitting it into the existing DataFrame, etc. Also, at 
a higher level, if the DataFrame has 100M rows, it might not be efficient to 
ship that much data.

> Assignation by column name variable not available or bug?
> ---------------------------------------------------------
>
>                 Key: SPARK-18823
>                 URL: https://issues.apache.org/jira/browse/SPARK-18823
>             Project: Spark
>          Issue Type: Question
>          Components: SparkR
>    Affects Versions: 2.0.2
>         Environment: RStudio Server on EC2 instances (AWS EMR service), EMR 
> 4, or Databricks (community.cloud.databricks.com).
>            Reporter: Vicente Masip
>             Fix For: 2.0.2
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> I really don't know if this is a bug or if it can be done with some function.
> Sometimes it is very important to assign something to a column whose name has 
> to be accessed through a variable. Outside of SparkR, I have always done this 
> with double brackets, like this:
> # df could be faithful as a normal data frame or a data table.
> # accessing by variable name:
> myname = "waiting"
> df[[myname]] <- c(1:nrow(df))
> # or even column number
> df[[2]] <- df$eruptions
> The error is not caused by the right-hand side of the "<-" assignment 
> operator. The problem is that I can't assign to a column using a variable 
> name or a column number, as I do in these examples outside of Spark. It 
> doesn't matter whether I am modifying or creating a column; the problem is 
> the same.
> I have also tried this, with no results:
> val df2 = withColumn(df,"tmp", df$eruptions)


