You can try something like the following if you only have one column.

val rdd = parquetFile.javaRDD().map(row => row.getAs[String](0))
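
If you are on the Java API instead, a rough equivalent might look like the sketch below (it assumes Spark 1.x and an already-loaded DataFrame named parquetFile whose single column holds the string you want):

```java
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.function.Function;
import org.apache.spark.sql.Row;

// Map each Row to its first (and only) column as a String.
JavaRDD<String> rdd = parquetFile.javaRDD().map(new Function<Row, String>() {
    @Override
    public String call(Row row) {
        return row.getString(0);
    }
});
```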

Thanks.

Zhan Zhang

On Apr 18, 2016, at 3:44 AM, Ramkumar V 
<ramkumar.c...@gmail.com<mailto:ramkumar.c...@gmail.com>> wrote:

Hi,

Any idea on this?

Thanks,


On Mon, Apr 4, 2016 at 2:47 PM, Akhil Das 
<ak...@sigmoidanalytics.com<mailto:ak...@sigmoidanalytics.com>> wrote:
I didn't know you had a parquet file containing JSON data.

Thanks
Best Regards

On Mon, Apr 4, 2016 at 2:44 PM, Ramkumar V 
<ramkumar.c...@gmail.com<mailto:ramkumar.c...@gmail.com>> wrote:
Hi Akhil,

Thanks for your help. Why did you use "," as the separator?

I have a parquet file in which each line contains only JSON.

Thanks,


On Mon, Apr 4, 2016 at 2:34 PM, Akhil Das 
<ak...@sigmoidanalytics.com<mailto:ak...@sigmoidanalytics.com>> wrote:
Something like this (in Scala):

val rdd = parquetFile.javaRDD().map(row => row.mkString(","))

You can apply a map operation over your JavaRDD to convert each 
org.apache.spark.sql.Row<https://spark.apache.org/docs/1.4.0/api/java/org/apache/spark/sql/Row.html>
 to a String (using the Row.mkString() operation).
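
In Java, that map step needs a Function object. A minimal end-to-end sketch (Spark 1.x; the app name and "people.parquet" path are illustrative):

```java
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.api.java.function.Function;
import org.apache.spark.sql.DataFrame;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SQLContext;

public class ParquetToStrings {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("ParquetToStrings");
        JavaSparkContext sc = new JavaSparkContext(conf);
        SQLContext sqlContext = new SQLContext(sc);

        // Read the parquet file as a DataFrame, then map each Row to a
        // comma-joined String to obtain a JavaRDD<String>.
        DataFrame parquetFile = sqlContext.read().parquet("people.parquet");
        JavaRDD<String> lines = parquetFile.javaRDD().map(new Function<Row, String>() {
            @Override
            public String call(Row row) {
                return row.mkString(",");
            }
        });

        sc.stop();
    }
}
```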

Thanks
Best Regards

On Mon, Apr 4, 2016 at 12:02 PM, Ramkumar V 
<ramkumar.c...@gmail.com<mailto:ramkumar.c...@gmail.com>> wrote:
Any idea on this? How do I convert a parquet file into a JavaRDD<String>?

Thanks,


On Thu, Mar 31, 2016 at 4:30 PM, Ramkumar V 
<ramkumar.c...@gmail.com<mailto:ramkumar.c...@gmail.com>> wrote:
Hi,

Thanks for the reply. I tried this, but it returns JavaRDD<Row> instead of 
JavaRDD<String>. How do I get a JavaRDD<String>?

Error :
incompatible types: org.apache.spark.api.java.JavaRDD<org.apache.spark.sql.Row> 
cannot be converted to org.apache.spark.api.java.JavaRDD<java.lang.String>

Thanks,


On Thu, Mar 31, 2016 at 2:57 PM, UMESH CHAUDHARY 
<umesh9...@gmail.com<mailto:umesh9...@gmail.com>> wrote:
From the Spark documentation:


DataFrame parquetFile = sqlContext.read().parquet("people.parquet");


JavaRDD<String> jRDD = parquetFile.javaRDD();

The javaRDD() method will convert the DataFrame to an RDD.

On Thu, Mar 31, 2016 at 2:51 PM, Ramkumar V 
<ramkumar.c...@gmail.com<mailto:ramkumar.c...@gmail.com>> wrote:
Hi,

I'm trying to read parquet log files in Java Spark. The parquet log files are 
stored in HDFS. I want to read them and convert each parquet file into a JavaRDD. 
I could only find the SQLContext DataFrame API. How can I read it with 
SparkContext and RDDs? What is the best way to read it?

Thanks,








