Umesh:
Please take a look at the classes under:
sql/core/src/main/scala/org/apache/spark/sql/parquet
FYI
On Mon, Aug 10, 2015 at 10:35 AM, Umesh Kacha umesh.ka...@gmail.com wrote:
Hi Bo, thanks much. Let me explain; please see the following code:
JavaPairRDD<String, PortableDataStream> pairRdd =
javaSparkContext.binaryFiles("/hdfs/path/to/binfile");
JavaRDD<PortableDataStream> javardd = pairRdd.values();
DataFrame binDataFrame = sqlContext.createDataFrame(javaBinRdd,
You can create your own data schema (StructType in Spark) and use the
following method to create a DataFrame with that schema:
sqlContext.createDataFrame(yourRDD, structType);
I wrote a post on how to do it. You can also get the sample code there:
Light-Weight Self-Service Data Query
Hi Bo, I know how to create a DataFrame. My question is how to create a
DataFrame for binary files, and your blog post covers raw-text JSON files.
Please read my question properly, thanks.
On Sun, Aug 9, 2015 at 11:21 PM, bo yang bobyan...@gmail.com wrote:
Well, my post uses a raw-text JSON file to show how to create a DataFrame with
a custom data schema. The key idea is to show the flexibility to deal with
any data format by using your own schema. Sorry if I did not make that
clear.
Anyway, let us know once you figure out your problem.
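
For the binary-file case asked about in this thread, here is a minimal sketch of how the two suggestions might combine, assuming the Spark 1.x Java API used above. Each (path, PortableDataStream) pair returned by binaryFiles is mapped to a Row, and a StructType with a BinaryType column describes it. The class name, the column names "path" and "content", and the choice to read each file fully with toArray() are illustrative assumptions, not something stated in the thread.

```java
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.api.java.function.Function;
import org.apache.spark.input.PortableDataStream;
import org.apache.spark.sql.DataFrame;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.RowFactory;
import org.apache.spark.sql.SQLContext;
import org.apache.spark.sql.types.DataTypes;
import org.apache.spark.sql.types.StructField;
import org.apache.spark.sql.types.StructType;
import scala.Tuple2;

// Hypothetical helper class; the name is an assumption for illustration.
public class BinaryFilesToDataFrame {

    // Assumed two-column schema: the file path and the raw file bytes.
    static StructType binarySchema() {
        return DataTypes.createStructType(new StructField[] {
            DataTypes.createStructField("path", DataTypes.StringType, false),
            DataTypes.createStructField("content", DataTypes.BinaryType, false)
        });
    }

    static DataFrame toDataFrame(JavaSparkContext jsc, SQLContext sqlContext, String dir) {
        // One (path, stream) pair per file under dir.
        JavaPairRDD<String, PortableDataStream> pairRdd = jsc.binaryFiles(dir);

        // Convert each pair to a Row matching binarySchema().
        JavaRDD<Row> rows = pairRdd.map(
            new Function<Tuple2<String, PortableDataStream>, Row>() {
                @Override
                public Row call(Tuple2<String, PortableDataStream> file) {
                    // toArray() loads the entire file into a byte[].
                    return RowFactory.create(file._1(), file._2().toArray());
                }
            });

        return sqlContext.createDataFrame(rows, binarySchema());
    }
}
```

Because toArray() reads a whole file into memory, this approach only suits files that fit comfortably in executor memory. Any further parsing of the bytes into typed columns would happen inside call(), with the schema extended to match.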