Re: Adding a column to a SchemaRDD

2014-12-15 Thread Yanbo Liang
Hi Nathan, #1 The Spark SQL DSL can satisfy your requirement. You can refer to the following code snippet: jdata.select(Star(Node), 'seven.getField(mod), 'eleven.getField(mod)) You need to import org.apache.spark.sql.catalyst.analysis.Star in advance. #2 After you make the transform above, you do

Re: Adding a column to a SchemaRDD

2014-12-14 Thread Tobias Pfeiffer
Nathan, On Fri, Dec 12, 2014 at 3:11 PM, Nathan Kronenfeld nkronenf...@oculusinfo.com wrote: I can see how to do it if I can express the added values in SQL - just run SELECT *, valueCalculation AS newColumnName FROM table I've been searching all over for how to do this if my added value is a

Re: Adding a column to a SchemaRDD

2014-12-12 Thread Yanbo Liang
An RDD is immutable, so you cannot modify it. If you want to change a value or the schema of an RDD, use map to generate a new RDD. The following code is for your reference: def add(a: Int, b: Int): Int = { a + b } val d1 = sc.parallelize(1 to 10).map { i => (i, i+1, i+2) } val d2 = d1.map { i => (i._1,
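
The map-to-a-new-collection pattern described above can be sketched in plain Scala. This is only an illustration under stated assumptions: a local Seq stands in for the RDD so the snippet runs without Spark, and the object name AddColumnSketch, the helper withSum, and the choice of add(a, b) as the fourth column are hypothetical, not taken from the truncated message.

```scala
// Minimal sketch: a local Seq stands in for an RDD (assumption, so no Spark is needed).
// RDD.map is called the same way and likewise returns a new dataset.
object AddColumnSketch {
  // Hypothetical Scala function that computes the value of the new column.
  def add(a: Int, b: Int): Int = a + b

  // Builds a NEW collection with one extra field; the input is never modified.
  def withSum(rows: Seq[(Int, Int, Int)]): Seq[(Int, Int, Int, Int)] =
    rows.map { case (a, b, c) => (a, b, c, add(a, b)) }

  def main(args: Array[String]): Unit = {
    val d1 = (1 to 10).map(i => (i, i + 1, i + 2)) // three "columns"
    val d2 = withSum(d1)                           // four "columns"
    d2.foreach(println)
  }
}
```

Because map returns a fresh collection, d1 stays available unchanged after d2 is built, which is the same immutability property the reply attributes to RDDs.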

Re: Adding a column to a SchemaRDD

2014-12-12 Thread Nathan Kronenfeld
(1) I understand about immutability; that's why I said I wanted a new SchemaRDD. (2) I specifically asked for a non-SQL solution that takes a SchemaRDD, and results in a new SchemaRDD with one new function. (3) The DSL stuff is a big clue, but I can't find adequate documentation for it. What I'm

Adding a column to a SchemaRDD

2014-12-11 Thread Nathan Kronenfeld
Hi, there. I'm trying to understand how to augment data in a SchemaRDD. I can see how to do it if I can express the added values in SQL - just run SELECT *, valueCalculation AS newColumnName FROM table I've been searching all over for how to do this if my added value is a Scala function, with no