Hi all, I was wondering how RDD transformations work on SchemaRDDs. Is there a way to force an RDD transform to keep the SchemaRDD type, or do I need to recreate the SchemaRDD by applying the applySchema method?
Currently I have an array of SchemaRDDs, and I just want to take the union across them, i.e. I want the result to be a single SchemaRDD that is the union of all the SchemaRDDs in the array. This is what I currently have, which is not working:

scala> z
res23: Array[org.apache.spark.sql.SchemaRDD]

scala> z.reduce((a,b) => a.union(b))

I get the following error:

 found   : org.apache.spark.rdd.RDD[org.apache.spark.sql.Row]
 required: org.apache.spark.sql.SchemaRDD
       z.reduce((a,b) => a.union(b))

I also noticed that when I do a simple union of two of them, z(0).union(z(1)), the result is not a SchemaRDD but a plain RDD:

scala> z(0).union(z(1))
res22: org.apache.spark.rdd.RDD[org.apache.spark.sql.Row]

Is there a simple way to convert this back to a SchemaRDD, or do I need to call HiveContext.applySchema(res22, myschema)?
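For reference, here is a minimal sketch of the applySchema workaround I have in mind, assuming a HiveContext named hiveContext is in scope and that every SchemaRDD in z shares the same schema:

    import org.apache.spark.rdd.RDD
    import org.apache.spark.sql.{Row, SchemaRDD}

    // Upcast each SchemaRDD to RDD[Row] so reduce unifies on RDD[Row],
    // then union them all; the schema information is lost at this step.
    val rows: RDD[Row] = z.map(r => r: RDD[Row]).reduce(_ union _)

    // Reattach the schema taken from the first SchemaRDD in the array.
    val unioned: SchemaRDD = hiveContext.applySchema(rows, z(0).schema)

(If this version of SchemaRDD exposes unionAll, z.reduce((a, b) => a.unionAll(b)) might keep the SchemaRDD type without the applySchema step.)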