Hi,

The Parquet section of the Spark SQL programming guide covers this (it's a very good document):
http://spark.apache.org/docs/latest/sql-programming-guide.html

Be sure that the parquet file is created properly:
http://www.do-hadoop.com/scala/csv-parquet-scala-spark-snippet/
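
The Row-to-String conversion discussed further down the thread can be sketched in plain Java. Spark itself isn't on the classpath in this snippet, so a small helper stands in for org.apache.spark.sql.Row's mkString, and the actual Spark calls are shown in comments ("people.parquet" is a placeholder path):

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class RowToString {
    // Stand-in for org.apache.spark.sql.Row#mkString(sep):
    // joins every field value of a row with the given separator.
    static String mkString(List<Object> fields, String sep) {
        return fields.stream()
                .map(String::valueOf)
                .collect(Collectors.joining(sep));
    }

    public static void main(String[] args) {
        // With Spark, the equivalent pipeline would be roughly:
        //   DataFrame parquetFile = sqlContext.read().parquet("people.parquet");
        //   JavaRDD<String> lines = parquetFile.javaRDD()
        //       .map(row -> row.mkString(","));
        List<Object> row = Arrays.asList("Ram", 29, true);
        System.out.println(mkString(row, ","));  // prints Ram,29,true
    }
}
```

If each row holds a single JSON column, mkString with any separator just yields that one JSON string per record.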

Kind Regards,
Jan


> On 18 Apr 2016, at 11:44, Ramkumar V <ramkumar.c...@gmail.com> wrote:
> 
> Hi,
> 
> Any idea on this ?
> 
> Thanks,
> 
>  <https://in.linkedin.com/in/ramkumarcs31> 
> 
> 
> On Mon, Apr 4, 2016 at 2:47 PM, Akhil Das <ak...@sigmoidanalytics.com 
> <mailto:ak...@sigmoidanalytics.com>> wrote:
> I didn't know you had a parquet file containing JSON data.
> 
> Thanks
> Best Regards
> 
> On Mon, Apr 4, 2016 at 2:44 PM, Ramkumar V <ramkumar.c...@gmail.com 
> <mailto:ramkumar.c...@gmail.com>> wrote:
> Hi Akhil,
> 
> Thanks for your help. Why did you use "," as the separator?
> 
> I have a parquet file that contains only JSON in each line.
> 
> Thanks,
> 
>  <https://in.linkedin.com/in/ramkumarcs31> 
> 
> 
> On Mon, Apr 4, 2016 at 2:34 PM, Akhil Das <ak...@sigmoidanalytics.com 
> <mailto:ak...@sigmoidanalytics.com>> wrote:
> Something like this (in Scala):
> 
> val rdd = parquetFile.rdd.map(row => row.mkString(","))
> 
> You can create a map operation over your JavaRDD to convert each 
> org.apache.spark.sql.Row 
> <https://spark.apache.org/docs/1.4.0/api/java/org/apache/spark/sql/Row.html> 
> to a String (using the Row.mkString() method)
> 
> Thanks
> Best Regards
> 
> On Mon, Apr 4, 2016 at 12:02 PM, Ramkumar V <ramkumar.c...@gmail.com 
> <mailto:ramkumar.c...@gmail.com>> wrote:
> Any idea on this? How can I convert a parquet file into a JavaRDD<String>?
> 
> Thanks,
> 
>  <https://in.linkedin.com/in/ramkumarcs31> 
> 
> 
> On Thu, Mar 31, 2016 at 4:30 PM, Ramkumar V <ramkumar.c...@gmail.com 
> <mailto:ramkumar.c...@gmail.com>> wrote:
> Hi,
> 
> Thanks for the reply. I tried this, but it returns JavaRDD<Row> instead of 
> JavaRDD<String>. How can I get a JavaRDD<String>?
> 
> Error :
> incompatible types: 
> org.apache.spark.api.java.JavaRDD<org.apache.spark.sql.Row> cannot be 
> converted to org.apache.spark.api.java.JavaRDD<java.lang.String>
> 
> Thanks,
> 
>  <https://in.linkedin.com/in/ramkumarcs31> 
> 
> 
> On Thu, Mar 31, 2016 at 2:57 PM, UMESH CHAUDHARY <umesh9...@gmail.com 
> <mailto:umesh9...@gmail.com>> wrote:
> From Spark Documentation:
> 
> DataFrame parquetFile = sqlContext.read().parquet("people.parquet");
> JavaRDD<String> jRDD = parquetFile.javaRDD();
> 
> The javaRDD() method will convert the DataFrame to an RDD
> 
> On Thu, Mar 31, 2016 at 2:51 PM, Ramkumar V <ramkumar.c...@gmail.com 
> <mailto:ramkumar.c...@gmail.com>> wrote:
> Hi,
> 
> I'm trying to read parquet log files in Java Spark. The parquet log files are 
> stored in HDFS. I want to read them and convert them into a JavaRDD. I could 
> only find the SQLContext DataFrame API. How can I read them with a 
> SparkContext and an RDD? What is the best way to read them?
> 
> Thanks,
> 
>  <https://in.linkedin.com/in/ramkumarcs31> 
> 
