Nathan,

On Fri, Dec 12, 2014 at 3:11 PM, Nathan Kronenfeld <nkronenf...@oculusinfo.com> wrote:
>
> I can see how to do it if can express the added values in SQL - just run
> "SELECT *,valueCalculation AS newColumnName FROM table"
>
> I've been searching all over for how to do this if my added value is a
> scala function, with no luck.
>
> Let's say I have a SchemaRDD with columns A, B, and C, and I want to add a
> new column, D, calculated using Utility.process(b, c), and I want (of
> course) to pass in the value B and C from each row, ending up with a new
> SchemaRDD with columns A, B, C, and D.
>

I guess you would have to do three things:
- schemardd.map(row => { extend the row here }), which will give you a plain RDD[Row] without a schema,
- take the schema from the schemardd and extend it manually with the name and type of the newly added column,
- create a new SchemaRDD from your mapped RDD and the manually extended schema.

Does that make sense?

Tobias
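
P.S. The steps above could look roughly like the sketch below. It is untested and rests on several assumptions: the Spark 1.2-era API (where Row, StructType, StructField and StringType are all reachable through org.apache.spark.sql._, and SQLContext.applySchema builds a SchemaRDD from an RDD[Row] plus a schema), columns B and C being strings at positions 1 and 2, and Utility.process taking two strings and returning a string. The Utility object here is only a hypothetical stand-in, and AddColumn / addColumnD are names made up for illustration.

```scala
import org.apache.spark.sql._

// Hypothetical stand-in for the real Utility.process; its actual
// signature is not known, so (String, String) => String is assumed.
object Utility {
  def process(b: String, c: String): String = b + c
}

object AddColumn {
  def addColumnD(sqlContext: SQLContext, table: SchemaRDD): SchemaRDD = {
    // Step 1: extend every row with the computed value.
    // The result is a plain RDD[Row] with no schema attached.
    val extendedRows = table.map { row =>
      // Assumes B and C are string columns at indices 1 and 2.
      val d = Utility.process(row.getString(1), row.getString(2))
      Row((row.toSeq :+ d): _*)
    }

    // Step 2: copy the existing schema and append the new column's
    // name and type by hand.
    val extendedSchema = StructType(
      table.schema.fields :+ StructField("D", StringType, nullable = true))

    // Step 3: combine the mapped rows and the extended schema
    // into a new SchemaRDD.
    sqlContext.applySchema(extendedRows, extendedSchema)
  }
}
```

If Utility.process returns something other than a string, the StructField type would need to change to match (IntegerType, DoubleType, and so on), since the schema is not checked against the row contents when it is applied.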