Re: Apache Spark Structured Streaming - Kafka Streaming - Option to ignore checkpoint
If you are using the Kafka direct connect API, it might be committing offsets back to Kafka itself.

On Thu, 7 June 2018 at 4:10, licl wrote:
> I ran into the same issue and tried deleting the checkpoint directory before starting the job.
> However, Spark still seems to read the correct offsets even after the checkpoint directory is deleted.
> I don't know how Spark does this without the checkpoint's metadata.
Re: Apache Spark Structured Streaming - Kafka Streaming - Option to ignore checkpoint
I ran into the same issue and tried deleting the checkpoint directory before starting the job. However, Spark still seems to read the correct offsets even after the checkpoint directory is deleted. I don't know how Spark does this without the checkpoint's metadata.
Apache Spark Structured Streaming - Kafka Streaming - Option to ignore checkpoint
Hi,

I am working on a real-time application using Spark Structured Streaming (v2.2.1). The application reads data from Kafka, and if there is a failure, I would like to ignore the checkpoint. Is there any configuration to simply read from the latest Kafka offset after a failure and ignore any checkpointed offsets?

Also, I believe the checkpoint saves state as well, so aggregations will continue from that state after recovery. Is there any way to ignore the checkpointed state? And is there a way to selectively checkpoint only the state or only the offsets?

Thanks
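
For readers of this thread, here is a minimal PySpark sketch of one way to get the "start from the latest offset" behaviour asked about above. It is an illustration under stated assumptions, not a confirmed answer from the list. The Kafka source's `startingOffsets` option is only consulted when a query starts without an existing checkpoint, so the sketch points every run at a new `checkpointLocation`. The helper names, the `run-<id>` directory layout, and the console sink are hypothetical choices made for the example. Note that a new checkpoint directory also starts with empty aggregation state, since offsets and state live in the same checkpoint.

```python
import posixpath
import uuid


def kafka_source_options(bootstrap_servers, topic):
    """Options for the Structured Streaming Kafka source (hypothetical helper)."""
    return {
        "kafka.bootstrap.servers": bootstrap_servers,
        "subscribe": topic,
        # Only honoured when the query starts without an existing checkpoint.
        "startingOffsets": "latest",
    }


def fresh_checkpoint_dir(base_dir):
    """Return a new, unused checkpoint path under base_dir (assumed layout)."""
    return posixpath.join(base_dir, "run-" + uuid.uuid4().hex)


def start_query(spark, bootstrap_servers, topic, base_dir):
    """Start a query that does not resume from any earlier checkpoint.

    `spark` is an existing SparkSession; the console sink is a placeholder.
    """
    stream = (
        spark.readStream.format("kafka")
        .options(**kafka_source_options(bootstrap_servers, topic))
        .load()
    )
    return (
        stream.writeStream.format("console")
        .option("checkpointLocation", fresh_checkpoint_dir(base_dir))
        .start()
    )
```

Calling `start_query(spark, "localhost:9092", "events", "/tmp/checkpoints")` after a failure would begin at the newest offsets with no recovered state; the broker address, topic, and path here are placeholders. Old `run-*` directories are not cleaned up by this sketch.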