Something like this (in Scala): val rdd = parquetFile.rdd.map(row => row.mkString(","))
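Since the question is about Java: with Java 8 lambdas the equivalent call should be roughly JavaRDD<String> lines = parquetFile.javaRDD().map(row -> row.mkString(","));, assuming parquetFile is the DataFrame from the quoted thread. To make concrete what each Row becomes, here is a minimal plain-Java sketch that mimics the join on a List<Object> standing in for a Row. The RowToLine class, the joinFields helper and the sample values are hypothetical, and no Spark classes are used, so it illustrates only the resulting strings, not the Spark call itself.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class RowToLine {
    // Hypothetical stand-in for Row.mkString(sep): converts every field
    // value to its string form and joins the results with the separator.
    static String joinFields(List<Object> fields, String sep) {
        return fields.stream()
                .map(String::valueOf)
                .collect(Collectors.joining(sep));
    }

    public static void main(String[] args) {
        // Two made-up "rows", each a list of field values.
        List<List<Object>> rows = Arrays.asList(
                Arrays.<Object>asList("Michael", 29),
                Arrays.<Object>asList("Andy", 30));

        // Same shape as the RDD map: one String per row.
        List<String> lines = rows.stream()
                .map(r -> joinFields(r, ","))
                .collect(Collectors.toList());

        lines.forEach(System.out::println);
    }
}
```

In the real job, the map over parquetFile.javaRDD() plays the role of the stream map here, and the result type is JavaRDD<String> rather than List<String>.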
You can apply a map operation to your JavaRDD to convert each
org.apache.spark.sql.Row
<https://spark.apache.org/docs/1.4.0/api/java/org/apache/spark/sql/Row.html>
to a String (using the Row.mkString() operation).

Thanks
Best Regards

On Mon, Apr 4, 2016 at 12:02 PM, Ramkumar V <ramkumar.c...@gmail.com> wrote:

> Any idea on this? How do I convert a parquet file into JavaRDD<String>?
>
> *Thanks*,
> <https://in.linkedin.com/in/ramkumarcs31>
>
> On Thu, Mar 31, 2016 at 4:30 PM, Ramkumar V <ramkumar.c...@gmail.com>
> wrote:
>
>> Hi,
>>
>> Thanks for the reply. I tried this. It's returning JavaRDD<Row> instead
>> of JavaRDD<String>. How do I get JavaRDD<String>?
>>
>> Error:
>> incompatible types:
>> org.apache.spark.api.java.JavaRDD<org.apache.spark.sql.Row> cannot be
>> converted to org.apache.spark.api.java.JavaRDD<java.lang.String>
>>
>> *Thanks*,
>> <https://in.linkedin.com/in/ramkumarcs31>
>>
>> On Thu, Mar 31, 2016 at 2:57 PM, UMESH CHAUDHARY <umesh9...@gmail.com>
>> wrote:
>>
>>> From the Spark documentation:
>>>
>>> DataFrame parquetFile = sqlContext.read().parquet("people.parquet");
>>>
>>> JavaRDD<String> jRDD= parquetFile.javaRDD()
>>>
>>> The javaRDD() method will convert the DF to an RDD.
>>>
>>> On Thu, Mar 31, 2016 at 2:51 PM, Ramkumar V <ramkumar.c...@gmail.com>
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>> I'm trying to read parquet log files in Java Spark. The parquet log
>>>> files are stored in HDFS. I want to read such a parquet file and
>>>> convert it into a JavaRDD. I was only able to find the SQLContext
>>>> DataFrame API. How can I read the file using SparkContext and RDDs?
>>>> What is the best way to read it?
>>>>
>>>> *Thanks*,
>>>> <https://in.linkedin.com/in/ramkumarcs31>