Hi all, I was wondering how RDD transformations work on SchemaRDDs. Is there a way to force an RDD transform to keep the SchemaRDD type, or do I need to recreate the SchemaRDD by applying the applySchema method?
Currently I have an array of SchemaRDDs, and I just want to take the union across them, i.e. I want the result to be a single SchemaRDD that is the union of all the SchemaRDDs in the array. This is what I currently have, which is not working:

scala> z
res23: Array[org.apache.spark.sql.SchemaRDD]

scala> z.reduce((a,b) => a.union(b))

I get the following error:

 found   : org.apache.spark.rdd.RDD[org.apache.spark.sql.Row]
 required: org.apache.spark.sql.SchemaRDD
       z.reduce((a,b) => a.union(b))

I also noticed that when I do a simple union of two of them, z(0).union(z(1)), the result is not a SchemaRDD but a plain RDD:

scala> z(0).union(z(1))
res22: org.apache.spark.rdd.RDD[org.apache.spark.sql.Row]

Is there a simple way to convert this back to a SchemaRDD, or do I need to call HiveContext.applySchema(res22, myschema)?
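For reference, here is a minimal sketch of the applySchema workaround I have in mind, assuming a HiveContext named hiveContext is in scope and that every SchemaRDD in z shares the same schema:

    import org.apache.spark.rdd.RDD
    import org.apache.spark.sql.{Row, SchemaRDD}

    // Upcast each SchemaRDD to RDD[Row] so reduce unifies on RDD[Row],
    // then union them all; the schema information is lost at this step.
    val rows: RDD[Row] = z.map(r => r: RDD[Row]).reduce(_ union _)

    // Reattach the schema taken from the first SchemaRDD in the array.
    val unioned: SchemaRDD = hiveContext.applySchema(rows, z(0).schema)

(If this version of SchemaRDD exposes unionAll, z.reduce((a, b) => a.unionAll(b)) might keep the SchemaRDD type without the applySchema step.)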