Hi Bo,

I know how to create a DataFrame. My question is how to create a
DataFrame from binary files; your blog post covers raw-text JSON files.
Please read my question properly. Thanks.

On Sun, Aug 9, 2015 at 11:21 PM, bo yang <bobyan...@gmail.com> wrote:

> You can create your own data schema (StructType in Spark) and use the
> following method to create a DataFrame with that schema:
>
> sqlContext.createDataFrame(yourRDD, structType);
>
> I wrote a post on how to do it. You can also get the sample code there:
>
> Light-Weight Self-Service Data Query through Spark SQL:
>
> https://www.linkedin.com/pulse/light-weight-self-service-data-query-through-spark-sql-bo-yang
>
> Take a look and feel free to let me know if you have any questions.
>
> Best,
> Bo
>
>
>
> On Sat, Aug 8, 2015 at 1:42 PM, unk1102 <umesh.ka...@gmail.com> wrote:
>
>> Hi, how do we create a DataFrame from a binary file stored in HDFS? I was
>> thinking of using
>>
>> JavaPairRDD<String,PortableDataStream> pairRdd =
>> javaSparkContext.binaryFiles("/hdfs/path/to/binfile");
>> JavaRDD<PortableDataStream> javardd = pairRdd.values();
>>
>> I can see that PortableDataStream has a method called toArray, which
>> converts the stream into a byte array. If I have a JavaRDD<byte[]>, can I
>> call the following and get a DataFrame?
>>
>> DataFrame binDataFrame =
>> sqlContext.createDataFrame(javaBinRdd,Byte.class);
>>
>> Please guide me; I am new to Spark. I have my own custom binary format,
>> and I was thinking that if I can convert it into a DataFrame using binary
>> operations, then I don't need to create my own custom Hadoop format. Am I
>> on the right track? Will reading binary data into a DataFrame scale?
>>
>>
>>
>> --
>> View this message in context:
>> http://apache-spark-user-list.1001560.n3.nabble.com/How-to-create-DataFrame-from-a-binary-file-tp24179.html
>> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>>
>>
>
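
For the binary-file question above, one possible approach is to decode each file's bytes into typed records first and only then build the DataFrame. The sketch below covers just the decoding step in plain Java, with no Spark dependency. The record layout (a 4-byte int id followed by an 8-byte double value, big-endian) and the names BinaryRecordDecoder, Record, and decode are hypothetical, since the thread does not describe the custom format. In a Spark job, decode would presumably be applied to the byte[] returned by PortableDataStream.toArray() for each value of binaryFiles(...), each Record mapped to a Row, and the resulting RDD passed with a matching StructType to sqlContext.createDataFrame(yourRDD, structType).

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.ArrayList;
import java.util.List;

// Hypothetical decoder for an assumed fixed-width binary format.
final class BinaryRecordDecoder {

    // Assumed layout: 4-byte int id + 8-byte double value per record.
    static final int RECORD_SIZE = Integer.BYTES + Double.BYTES;

    // One decoded record; a Spark job would turn this into a Row.
    static final class Record {
        final int id;
        final double value;

        Record(int id, double value) {
            this.id = id;
            this.value = value;
        }
    }

    // Decodes all records in the array; rejects input with a partial record.
    static List<Record> decode(byte[] bytes) {
        if (bytes.length % RECORD_SIZE != 0) {
            throw new IllegalArgumentException(
                "Length " + bytes.length + " is not a multiple of " + RECORD_SIZE);
        }
        // Byte order is an assumption; use whatever the real format specifies.
        ByteBuffer buf = ByteBuffer.wrap(bytes).order(ByteOrder.BIG_ENDIAN);
        List<Record> records = new ArrayList<>(bytes.length / RECORD_SIZE);
        while (buf.remaining() >= RECORD_SIZE) {
            records.add(new Record(buf.getInt(), buf.getDouble()));
        }
        return records;
    }

    public static void main(String[] args) {
        // Build two sample records in memory instead of reading from HDFS.
        ByteBuffer sample = ByteBuffer.allocate(2 * RECORD_SIZE);
        sample.putInt(1).putDouble(2.5).putInt(2).putDouble(-1.0);
        for (Record r : decode(sample.array())) {
            System.out.println(r.id + " " + r.value);
        }
    }
}
```

Keeping the decoder free of Spark types makes it easy to test locally before wiring it into an RDD transformation. Note that toArray() loads a whole file into memory, so this pattern suits many moderately sized files better than a few very large ones.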
