Thanks Subhash. Have you ever used the zero-data-loss concept with streaming? I am a bit worried about using streaming when it comes to data loss.
https://blog.cloudera.com/blog/2017/06/offset-management-for-apache-kafka-with-apache-spark-streaming/

Does structured streaming handle it internally?

On Wed, Oct 25, 2017 at 3:10 PM, Subhash Sriram <subhash.sri...@gmail.com> wrote:

> No problem! Take a look at this:
>
> http://spark.apache.org/docs/latest/structured-streaming-programming-guide.html#recovering-from-failures-with-checkpointing
>
> Thanks,
> Subhash
>
> On Wed, Oct 25, 2017 at 4:08 PM, KhajaAsmath Mohammed <mdkhajaasm...@gmail.com> wrote:
>
>> Hi Sriram,
>>
>> Thanks. This is what I was looking for.
>>
>> One question: where do we need to specify the checkpoint directory in case of structured streaming?
>>
>> Thanks,
>> Asmath
>>
>> On Wed, Oct 25, 2017 at 2:52 PM, Subhash Sriram <subhash.sri...@gmail.com> wrote:
>>
>>> Hi Asmath,
>>>
>>> Here is an example of using structured streaming to read from Kafka:
>>>
>>> https://github.com/apache/spark/blob/master/examples/src/main/scala/org/apache/spark/examples/sql/streaming/StructuredKafkaWordCount.scala
>>>
>>> In terms of parsing the JSON, there is a from_json function that you can use. The following might help:
>>>
>>> https://databricks.com/blog/2017/02/23/working-complex-data-formats-structured-streaming-apache-spark-2-1.html
>>>
>>> I hope this helps.
>>>
>>> Thanks,
>>> Subhash
>>>
>>> On Wed, Oct 25, 2017 at 2:59 PM, KhajaAsmath Mohammed <mdkhajaasm...@gmail.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> Could anyone provide suggestions on how to parse JSON data from Kafka and load it back into Hive?
>>>>
>>>> I have read about structured streaming but didn't find any examples. Is there any best practice on how to read and parse it with structured streaming for this use case?
>>>>
>>>> Thanks,
>>>> Asmath
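For what it's worth, the pieces discussed in this thread can be put together in one job: read from Kafka, parse the value with from_json, and write out with a checkpointLocation set on the writer (that is where the checkpoint directory goes, and it is what stores offsets and state for recovery). A minimal sketch, assuming a Kafka topic `events`, placeholder broker addresses and paths, and an illustrative two-field JSON schema (all names are mine, not from the thread):

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.from_json
import org.apache.spark.sql.types.{LongType, StringType, StructType}

object KafkaJsonToHive {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder
      .appName("KafkaJsonToHive")
      .enableHiveSupport()
      .getOrCreate()
    import spark.implicits._

    // Illustrative schema for the incoming JSON messages.
    val schema = new StructType()
      .add("id", LongType)
      .add("payload", StringType)

    // Kafka source: broker list and topic name are placeholders.
    val raw = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "host1:9092")
      .option("subscribe", "events")
      .load()

    // Kafka delivers the value as bytes; cast to string, then parse with from_json.
    val parsed = raw.selectExpr("CAST(value AS STRING) AS json")
      .select(from_json($"json", schema).as("data"))
      .select("data.*")

    // checkpointLocation is specified per query, on the writer. Structured
    // Streaming tracks Kafka offsets in the checkpoint itself, which is what
    // gives you recovery after failure without manual offset management.
    val query = parsed.writeStream
      .format("parquet")
      .option("path", "/warehouse/events")               // point a Hive external table here (illustrative path)
      .option("checkpointLocation", "/checkpoints/events")
      .start()

    query.awaitTermination()
  }
}
```

Landing the stream as Parquet files under a path backed by a Hive external table is one common way to get the data "back in Hive"; there are other sink choices, but the checkpoint mechanics are the same.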