Re: How to add a column to a spark RDD with many columns?

2015-05-02 Thread Carter
Thanks for your reply! It is what I am after.



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/How-to-add-a-column-to-a-spark-RDD-with-many-columns-tp22729p22740.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



Re: How to add a column to a spark RDD with many columns?

2015-05-02 Thread dsgriffin
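// Assumes each row is an Array[String]; indices are zero-based (2nd column = 1, 200th = 199).
// The question asks for the sum of the two columns, so the values are added.
val newRdd = myRdd.map(row => row ++ Array((row(1).toLong + row(199).toLong).toString))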
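
If it helps, here is a slightly expanded sketch of the same idea for the sum asked about in the original question. It pulls the per-row logic into a plain function so it can be checked without a cluster. RowOps and appendSum are hypothetical names of my own, not part of Spark, and I am assuming each row is an Array[String] whose fields hold integers that fit in a Long.

```scala
// Hypothetical helper, not part of Spark: appends the sum of two columns to a row.
object RowOps {
  // Assumes every field is a String holding an integer that fits in a Long.
  // i and j are zero-based, so the 2nd and 200th columns are indices 1 and 199.
  def appendSum(row: Array[String], i: Int, j: Int): Array[String] = {
    require(i >= 0 && i < row.length && j >= 0 && j < row.length,
      s"column index out of range for a row of length ${row.length}")
    val sum = row(i).trim.toLong + row(j).trim.toLong
    // :+ builds a new array; the input row is left unmodified.
    row :+ sum.toString
  }
}
```

With that in place, the RDD version would be myRdd.map(row => RowOps.appendSum(row, 1, 199)), which returns a new RDD with one extra column per row. The require call makes a short row fail with a readable message instead of an ArrayIndexOutOfBoundsException; whether failing the task is the right behaviour for malformed rows depends on your data.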






Re: How to add a column to a spark RDD with many columns?

2015-04-30 Thread ayan guha
Do you have an RDD or a DataFrame? The records in an RDD are much like
tuples, so you can add a new column with a map.
RDDs are immutable, so the map gives you a new RDD rather than changing the
existing one.
On 1 May 2015 14:59, "Carter" wrote:

> Hi all,
>
> I have an RDD with *MANY* columns (e.g., *hundreds*), how do I add one more
> column at the end of this RDD?
>
> For example, if my RDD is like below:
>
> 123, 523, 534, ..., 893
> 536, 98, 1623, ..., 98472
> 537, 89, 83640, ..., 9265
> 7297, 98364, 9, ..., 735
> ..
> 29, 94, 956, ..., 758
>
> how can I efficiently add a column to it, whose value is the sum of the 2nd
> and the 200th columns?
>
> Thank you very much.


How to add a column to a spark RDD with many columns?

2015-04-30 Thread Carter
Hi all,

I have an RDD with *MANY* columns (e.g., *hundreds*), how do I add one more
column at the end of this RDD?

For example, if my RDD is like below:

123, 523, 534, ..., 893
536, 98, 1623, ..., 98472
537, 89, 83640, ..., 9265
7297, 98364, 9, ..., 735
..
29, 94, 956, ..., 758

how can I efficiently add a column to it, whose value is the sum of the 2nd
and the 200th columns?

Thank you very much.



