Hi,

See the Parquet section in the documentation (it's a very nice document): http://spark.apache.org/docs/latest/sql-programming-guide.html
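For the JavaRDD<String> question further down the thread, here is a minimal sketch, assuming Spark 1.x (where the Java API still has the DataFrame class). The class name ParquetToStringRdd and the method readAsStrings are made up for illustration. javaRDD() gives a JavaRDD<Row>, so each Row has to be mapped to a String explicitly:

```java
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.function.Function;
import org.apache.spark.sql.DataFrame;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SQLContext;

// Hypothetical helper class, not part of Spark.
public class ParquetToStringRdd {

    // Reads a Parquet file and turns every Row into one String.
    public static JavaRDD<String> readAsStrings(SQLContext sqlContext, String path) {
        DataFrame parquetFile = sqlContext.read().parquet(path);

        // javaRDD() returns JavaRDD<Row>; the map step produces JavaRDD<String>.
        return parquetFile.javaRDD().map(new Function<Row, String>() {
            @Override
            public String call(Row row) {
                // Joins all column values of the row with a comma.
                return row.mkString(",");
            }
        });
    }
}
```

If each row holds the JSON text in a single string column (an assumption about your schema), row.getString(0) in place of row.mkString(",") should return that column unchanged.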
Be sure that the Parquet file is created properly: http://www.do-hadoop.com/scala/csv-parquet-scala-spark-snippet/

Kind Regards,
Jan

> On 18 Apr 2016, at 11:44, Ramkumar V <ramkumar.c...@gmail.com> wrote:
>
> Hi,
>
> Any idea on this?
>
> Thanks,
>
> <https://in.linkedin.com/in/ramkumarcs31>
>
> On Mon, Apr 4, 2016 at 2:47 PM, Akhil Das <ak...@sigmoidanalytics.com> wrote:
> I didn't know you had a Parquet file containing JSON data.
>
> Thanks
> Best Regards
>
> On Mon, Apr 4, 2016 at 2:44 PM, Ramkumar V <ramkumar.c...@gmail.com> wrote:
> Hi Akhil,
>
> Thanks for your help. Why do you use "," as the separator?
>
> I have a Parquet file which contains only JSON in each line.
>
> Thanks,
>
> <https://in.linkedin.com/in/ramkumarcs31>
>
> On Mon, Apr 4, 2016 at 2:34 PM, Akhil Das <ak...@sigmoidanalytics.com> wrote:
> Something like this (in Scala):
>
> val rdd = parquetFile.javaRDD().map(row => row.mkString(","))
>
> You can create a map operation over your javaRDD to convert the
> org.apache.spark.sql.Row
> <https://spark.apache.org/docs/1.4.0/api/java/org/apache/spark/sql/Row.html>
> to a String (the Row.mkString() operation).
>
> Thanks
> Best Regards
>
> On Mon, Apr 4, 2016 at 12:02 PM, Ramkumar V <ramkumar.c...@gmail.com> wrote:
> Any idea on this? How do I convert a Parquet file into JavaRDD<String>?
>
> Thanks,
>
> <https://in.linkedin.com/in/ramkumarcs31>
>
> On Thu, Mar 31, 2016 at 4:30 PM, Ramkumar V <ramkumar.c...@gmail.com> wrote:
> Hi,
>
> Thanks for the reply. I tried this. It returns JavaRDD<Row> instead of
> JavaRDD<String>. How do I get JavaRDD<String>?
>
> Error:
> incompatible types:
> org.apache.spark.api.java.JavaRDD<org.apache.spark.sql.Row> cannot be
> converted to org.apache.spark.api.java.JavaRDD<java.lang.String>
>
> Thanks,
>
> <https://in.linkedin.com/in/ramkumarcs31>
>
> On Thu, Mar 31, 2016 at 2:57 PM, UMESH CHAUDHARY <umesh9...@gmail.com> wrote:
> From the Spark documentation:
>
> DataFrame parquetFile = sqlContext.read().parquet("people.parquet");
> JavaRDD<String> jRDD = parquetFile.javaRDD();
>
> The javaRDD() method will convert the DataFrame to an RDD.
>
> On Thu, Mar 31, 2016 at 2:51 PM, Ramkumar V <ramkumar.c...@gmail.com> wrote:
> Hi,
>
> I'm trying to read Parquet log files in Java Spark. The Parquet log files are
> stored in HDFS. I want to read a Parquet file and convert it into a JavaRDD. I
> was only able to find the SQLContext DataFrame API. How can I read it with
> SparkContext and RDDs? What is the best way to read it?
>
> Thanks,
>
> <https://in.linkedin.com/in/ramkumarcs31>