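Do you have an RDD or a DataFrame? The rows of an RDD are essentially tuples, so you can add a new column by applying a map that returns each row with the extra value appended. RDDs are immutable, so the map gives you a new RDD rather than modifying the original.

On 1 May 2015 14:59, "Carter" <gyz...@hotmail.com> wrote: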
> Hi all,
>
> I have a RDD with *MANY* columns (e.g., *hundreds*), how do I add one more
> column at the end of this RDD?
>
> For example, if my RDD is like below:
>
> 123, 523, 534, ..., 893
> 536, 98, 1623, ..., 98472
> 537, 89, 83640, ..., 9265
> 7297, 98364, 9, ..., 735
> ......
> 29, 94, 956, ..., 758
>
> how can I efficiently add a column to it, whose value is the sum of the 2nd
> and the 200th columns?
>
> Thank you very much.
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/How-to-add-a-column-to-a-spark-RDD-with-many-columns-tp22729.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
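
To make the map suggestion concrete, here is a minimal sketch in Python. It assumes each row is already a tuple of numbers with at least 200 fields; if your RDD holds lines of text, you would need to split and parse them first. The function name with_sum_column and its parameters are made up for illustration. A plain Python list stands in for the RDD so the snippet runs without a cluster; with a real PySpark RDD the same function would be passed to rdd.map, which returns a new RDD.

```python
def with_sum_column(row, i=1, j=199):
    """Return a new tuple: the original fields plus row[i] + row[j].

    The defaults 1 and 199 are the zero-based positions of the
    2nd and 200th columns. (Hypothetical helper, not a Spark API.)
    """
    return tuple(row) + (row[i] + row[j],)


# Local stand-in for the RDD: two rows of 200 integer columns each.
rows = [tuple(range(start, start + 200)) for start in (0, 1000)]

# With PySpark this would be: extended_rdd = rdd.map(with_sum_column)
extended = list(map(with_sum_column, rows))

# Each extended row has 201 fields; the input rows are left untouched.
print(len(rows[0]), len(extended[0]))
```

Because the function only appends one value to each row, the cost is a single pass over the data regardless of how many columns there are.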