Hi Renato,

Which version of Spark are you using?
If your Spark version is 1.3.0 or later, you can use SQLContext to read the parquet file, which will give you a DataFrame. Please follow the link below:

https://spark.apache.org/docs/1.5.0/sql-programming-guide.html#loading-data-programmatically

Thanks,
Akhilesh

On Sat, Aug 27, 2016 at 3:26 AM, Renato Marroquín Mogrovejo <renatoj.marroq...@gmail.com> wrote:

> Anybody? I think Rory also didn't get an answer from the list ...
>
> https://mail-archives.apache.org/mod_mbox/spark-user/201602.mbox/%3CCAC+fre14pv5nvqhtbvqdc+6dkxo73odazfqslbso8f94ozo...@mail.gmail.com%3E
>
> 2016-08-26 17:42 GMT+02:00 Renato Marroquín Mogrovejo <renatoj.marroq...@gmail.com>:
>
>> Hi all,
>>
>> I am trying to use parquet files as input for DStream operations, but I
>> can't find any documentation or example. The only thing I found was [1],
>> but I also get the same error as in the post (class
>> parquet.avro.AvroReadSupport not found).
>> Ideally I would like to have something like this:
>>
>> val oDStream = ssc.fileStream[Void, Order, ParquetInputFormat[Order]]("data/")
>>
>> where Order is a case class and the files inside "data" are all parquet
>> files.
>> Any hints would be highly appreciated. Thanks!
>>
>> Best,
>>
>> Renato M.
>>
>> [1] http://stackoverflow.com/questions/35413552/how-do-i-read-in-parquet-files-using-ssc-filestream-and-what-is-the-nature
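For reference, the SQLContext approach Akhilesh suggests can be sketched roughly as below. This is a minimal, illustrative sketch, not code from the thread: the object name, app name, and `local[*]` master are my own choices; only the `data/` directory of parquet files comes from Renato's snippet.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

// Illustrative sketch of a batch parquet read (Spark 1.3+).
object ReadParquetExample {
  // Directory of parquet files, as in Renato's example.
  val inputPath = "data/"

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("read-parquet").setMaster("local[*]")
    val sc = new SparkContext(conf)
    val sqlContext = new SQLContext(sc)

    // sqlContext.read.parquet returns a DataFrame (Spark 1.4+);
    // on 1.3.x, sqlContext.parquetFile(inputPath) is the equivalent call.
    val orders = sqlContext.read.parquet(inputPath)

    orders.printSchema()
    orders.show()

    sc.stop()
  }
}
```

Note this gives you a DataFrame in a batch job; it does not by itself make parquet files usable as a DStream source, which was the original question.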