Re: How to create DataFrame from a binary file?

2015-08-10 Thread Ted Yu
Umesh: Please take a look at the classes under sql/core/src/main/scala/org/apache/spark/sql/parquet. FYI. On Mon, Aug 10, 2015 at 10:35 AM, Umesh Kacha umesh.ka...@gmail.com wrote: Hi Bo, thanks much. Let me explain; please see the following code: JavaPairRDD<String, PortableDataStream> pairRdd =

Re: How to create DataFrame from a binary file?

2015-08-10 Thread Umesh Kacha
Hi Bo, thanks much. Let me explain; please see the following code: JavaPairRDD<String, PortableDataStream> pairRdd = javaSparkContext.binaryFiles("/hdfs/path/to/binfile"); JavaRDD<PortableDataStream> javaBinRdd = pairRdd.values(); DataFrame binDataFrame = sqlContext.createDataFrame(javaBinRdd,

Re: How to create DataFrame from a binary file?

2015-08-09 Thread bo yang
You can create your own data schema (StructType in Spark) and use the following method to create a data frame with that schema: sqlContext.createDataFrame(yourRDD, structType); I wrote a post on how to do it. You can also get the sample code there: Light-Weight Self-Service Data Query
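
For a binary input, the RDD passed to createDataFrame has to hold rows decoded from the raw bytes, so the decoding step is where the work is. Below is a minimal Java sketch of that step only, under assumptions that are not from this thread: the record layout (a 4-byte int id followed by an 8-byte double value, big-endian) and the class name BinaryRecordDecoder are hypothetical. In Spark 1.x, the bytes of each file could come from PortableDataStream.toArray(), each decoded Object[] could be wrapped with RowFactory.create(...), and the matching StructType would declare an IntegerType field and a DoubleType field.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.ArrayList;
import java.util.List;

class BinaryRecordDecoder {
    // Hypothetical fixed-width layout: int id (4 bytes) + double value (8 bytes).
    static final int RECORD_SIZE = Integer.BYTES + Double.BYTES;

    // Decodes the full contents of one binary file into one Object[] per record.
    // The element order must match the field order of the schema used later.
    static List<Object[]> decode(byte[] fileBytes) {
        if (fileBytes.length % RECORD_SIZE != 0) {
            throw new IllegalArgumentException(
                "File length is not a multiple of the record size " + RECORD_SIZE);
        }
        // Assumption: the file was written big-endian.
        ByteBuffer buf = ByteBuffer.wrap(fileBytes).order(ByteOrder.BIG_ENDIAN);
        List<Object[]> rows = new ArrayList<>();
        while (buf.remaining() >= RECORD_SIZE) {
            int id = buf.getInt();
            double value = buf.getDouble();
            rows.add(new Object[] { id, value });
        }
        return rows;
    }

    public static void main(String[] args) {
        // Build two sample records in memory instead of reading a real file.
        ByteBuffer sample = ByteBuffer.allocate(2 * RECORD_SIZE);
        sample.putInt(1).putDouble(2.5).putInt(2).putDouble(-4.0);
        for (Object[] row : decode(sample.array())) {
            System.out.println(row[0] + " " + row[1]);
        }
    }
}
```

A real file format will need its own layout and byte order here; the point is only that the binary data must become typed field values before a schema can be applied to it.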

Re: How to create DataFrame from a binary file?

2015-08-09 Thread Umesh Kacha
Hi Bo, I know how to create a DataFrame. My question is how to create a DataFrame for binary files, and your blog uses raw text JSON files. Please read my question properly. Thanks. On Sun, Aug 9, 2015 at 11:21 PM, bo yang bobyan...@gmail.com wrote: You can create your own data schema

Re: How to create DataFrame from a binary file?

2015-08-09 Thread bo yang
Well, my post uses a raw text JSON file to show how to create a data frame with a custom data schema. The key idea is to show that you can handle any data format by using your own schema. Sorry if I did not make that clear. Anyway, let us know once you figure out your problem.