Re: Read Parquet in Java Spark

2016-04-18 Thread Zhan Zhang
You can try something like the below if you only have one column: val rdd = parquetFile.javaRDD().map(row => row.getAs[String](0)) Thanks. Zhan Zhang On Apr 18, 2016, at 3:44 AM, Ramkumar V wrote: Hi, Any idea on this? Thanks,
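The Scala one-liner above extracts the first column of each Row as a String. As an illustration only (the Row type is mocked here with a plain Object[] and hypothetical JSON values, so the sketch runs without Spark on the classpath), the per-row extraction amounts to:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class GetAsSketch {
    // Stand-in for row.getAs[String](0): take the first field of each "row".
    static String firstColumn(Object[] row) {
        return (String) row[0];
    }

    public static void main(String[] args) {
        // Mock rows, each holding a single String column (e.g. a JSON line).
        List<Object[]> rows = Arrays.asList(
            new Object[] {"{\"id\":1}"},
            new Object[] {"{\"id\":2}"}
        );
        // The map() here mirrors the map over parquetFile.javaRDD().
        List<String> json = rows.stream()
            .map(GetAsSketch::firstColumn)
            .collect(Collectors.toList());
        System.out.println(json);
    }
}
```

In real Spark code the same lambda would be passed to `JavaRDD.map`, turning a `JavaRDD<Row>` into a `JavaRDD<String>`.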

Re: Read Parquet in Java Spark

2016-04-18 Thread Jan Rock
Hi, see the Parquet section in the documentation (it's a very nice document): http://spark.apache.org/docs/latest/sql-programming-guide.html Be sure that the parquet file is created properly: http://www.do-hadoop.com/scala/csv-parquet-scala-spark-snippet/ Kind Regards, Jan > On 18 Apr 2016, at 11:44,

Re: Read Parquet in Java Spark

2016-04-18 Thread Ramkumar V
Hi, Any idea on this? *Thanks*, On Mon, Apr 4, 2016 at 2:47 PM, Akhil Das wrote: > I didn't know you had a parquet file containing JSON data. > > Thanks > Best Regards > > On Mon, Apr 4, 2016 at 2:44 PM, Ramkumar V

Re: Read Parquet in Java Spark

2016-04-04 Thread Akhil Das
I didn't know you had a parquet file containing JSON data. Thanks Best Regards On Mon, Apr 4, 2016 at 2:44 PM, Ramkumar V wrote: > Hi Akhil, > > Thanks for your help. Why did you use "," as the separator? > > I have a parquet file which contains only JSON in each line.

Re: Read Parquet in Java Spark

2016-04-04 Thread Ramkumar V
Hi Akhil, Thanks for your help. Why did you use "," as the separator? I have a parquet file which contains only JSON in each line. *Thanks*, On Mon, Apr 4, 2016 at 2:34 PM, Akhil Das wrote: > Something like this (in Scala): >

Re: Read Parquet in Java Spark

2016-04-04 Thread Akhil Das
Something like this (in Scala): val rdd = parquetFile.javaRDD().map(row => row.mkString(",")) You can create a map operation over your javaRDD to convert the org.apache.spark.sql.Row to String (the Row.mkString()
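Row.mkString joins every field of a Spark Row with the given separator, which is why the "," shows up in the snippet above. A minimal Java sketch of the same joining logic, using a plain Object[] as a hypothetical stand-in for Row so it runs without Spark on the classpath:

```java
import java.util.Arrays;
import java.util.stream.Collectors;

public class MkStringSketch {
    // Equivalent of Scala's row.mkString(","): join every field with a separator.
    static String mkString(Object[] row, String sep) {
        return Arrays.stream(row)
                     .map(String::valueOf)   // stringify each column value
                     .collect(Collectors.joining(sep));
    }

    public static void main(String[] args) {
        // Mock row with three columns of mixed types.
        Object[] row = {1, "alice", true};
        System.out.println(mkString(row, ","));
    }
}
```

Note that for a single-column file holding JSON lines, joining with "," is unnecessary; extracting the one column directly (as in Zhan Zhang's reply) avoids altering the JSON.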

Re: Read Parquet in Java Spark

2016-04-04 Thread Ramkumar V
Any idea on this? How to convert a parquet file into JavaRDD<String>? *Thanks*, On Thu, Mar 31, 2016 at 4:30 PM, Ramkumar V wrote: > Hi, > > Thanks for the reply. I tried this. It's returning JavaRDD<Row> instead > of JavaRDD<String>. How to get

Re: Read Parquet in Java Spark

2016-03-31 Thread Ramkumar V
Hi, Thanks for the reply. I tried this. It's returning JavaRDD<Row> instead of JavaRDD<String>. How to get JavaRDD<String>? Error: incompatible types: org.apache.spark.api.java.JavaRDD<org.apache.spark.sql.Row> cannot be converted to org.apache.spark.api.java.JavaRDD<String> *Thanks*, On Thu, Mar

Re: Read Parquet in Java Spark

2016-03-31 Thread UMESH CHAUDHARY
From the Spark documentation: DataFrame parquetFile = sqlContext.read().parquet("people.parquet"); JavaRDD<Row> jRDD = parquetFile.javaRDD(); The javaRDD() method will convert the DataFrame to an RDD. On Thu, Mar 31, 2016 at 2:51 PM, Ramkumar V wrote: > Hi, > > I'm trying to read parquet log

Read Parquet in Java Spark

2016-03-31 Thread Ramkumar V
Hi, I'm trying to read parquet log files in Java Spark. The parquet log files are stored in HDFS. I want to read and convert that parquet file into a JavaRDD. I was able to find the SQLContext DataFrame API. How can I read it using SparkContext and RDDs? What is the best way to read it? *Thanks*,