Thanks Subhash. Have you ever used the zero-data-loss concept with streaming? I am a bit worried about using streaming when it comes to data loss.
https://blog.cloudera.com/blog/2017/06/offset-management-for-apache-kafka-with-apache-spark-streaming/

Does structured streaming handle it internally?

On Wed, Oct 25, 2017 at 3:10 PM, Subhash Sriram <subhash.sri...@gmail.com> wrote:

> No problem! Take a look at this:
>
> http://spark.apache.org/docs/latest/structured-streaming-programming-guide.html#recovering-from-failures-with-checkpointing
>
> Thanks,
> Subhash
>
> On Wed, Oct 25, 2017 at 4:08 PM, KhajaAsmath Mohammed <mdkhajaasm...@gmail.com> wrote:
>
>> Hi Sriram,
>>
>> Thanks. This is what I was looking for.
>>
>> One question: where do we need to specify the checkpoint directory in case of structured streaming?
>>
>> Thanks,
>> Asmath
>>
>> On Wed, Oct 25, 2017 at 2:52 PM, Subhash Sriram <subhash.sri...@gmail.com> wrote:
>>
>>> Hi Asmath,
>>>
>>> Here is an example of using structured streaming to read from Kafka:
>>>
>>> https://github.com/apache/spark/blob/master/examples/src/main/scala/org/apache/spark/examples/sql/streaming/StructuredKafkaWordCount.scala
>>>
>>> In terms of parsing the JSON, there is a from_json function that you can use. The following might help:
>>>
>>> https://databricks.com/blog/2017/02/23/working-complex-data-formats-structured-streaming-apache-spark-2-1.html
>>>
>>> I hope this helps.
>>>
>>> Thanks,
>>> Subhash
>>>
>>> On Wed, Oct 25, 2017 at 2:59 PM, KhajaAsmath Mohammed <mdkhajaasm...@gmail.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> Could anyone provide suggestions on how to parse JSON data from Kafka and load it back into Hive?
>>>>
>>>> I have read about structured streaming but didn't find any examples. Is there any best practice on how to read and parse it with structured streaming for this use case?
>>>>
>>>> Thanks,
>>>> Asmath
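For what it's worth, the pieces discussed in this thread can be put together in one job: read from Kafka, parse the value with from_json, and write out with a checkpointLocation set on the writer (that is where the checkpoint directory goes, and it is what stores offsets and state for recovery). A minimal sketch, assuming a Kafka topic `events`, placeholder broker addresses and paths, and an illustrative two-field JSON schema (all names are mine, not from the thread):

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.from_json
import org.apache.spark.sql.types.{LongType, StringType, StructType}

object KafkaJsonToHive {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder
      .appName("KafkaJsonToHive")
      .enableHiveSupport()
      .getOrCreate()
    import spark.implicits._

    // Illustrative schema for the incoming JSON messages.
    val schema = new StructType()
      .add("id", LongType)
      .add("payload", StringType)

    // Kafka source: broker list and topic name are placeholders.
    val raw = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "host1:9092")
      .option("subscribe", "events")
      .load()

    // Kafka delivers the value as bytes; cast to string, then parse with from_json.
    val parsed = raw.selectExpr("CAST(value AS STRING) AS json")
      .select(from_json($"json", schema).as("data"))
      .select("data.*")

    // checkpointLocation is specified per query, on the writer. Structured
    // Streaming tracks Kafka offsets in the checkpoint itself, which is what
    // gives you recovery after failure without manual offset management.
    val query = parsed.writeStream
      .format("parquet")
      .option("path", "/warehouse/events")               // point a Hive external table here (illustrative path)
      .option("checkpointLocation", "/checkpoints/events")
      .start()

    query.awaitTermination()
  }
}
```

Landing the stream as Parquet files under a path backed by a Hive external table is one common way to get the data "back in Hive"; there are other sink choices, but the checkpoint mechanics are the same.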