Re: Apache Spark Structured Streaming - Kafka Streaming - Option to ignore checkpoint

2018-06-06 Thread amihay gonen
If you are using the Kafka direct connect API, it might be committing offsets
back to Kafka itself.

On Thu, 7 Jun 2018 at 4:10, licl wrote:

> I ran into the same issue and tried deleting the checkpoint dir before
> starting the job.
>
> But Spark still seems to read the correct offsets even after the
> checkpoint dir is deleted.
>
> I don't know how Spark does this without the checkpoint's metadata.
>
>
>
> --
> Sent from: http://apache-spark-user-list.1001560.n3.nabble.com/
>
> -
> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
>
>


Re: Apache Spark Structured Streaming - Kafka Streaming - Option to ignore checkpoint

2018-06-06 Thread licl
I ran into the same issue and tried deleting the checkpoint dir before
starting the job.

But Spark still seems to read the correct offsets even after the
checkpoint dir is deleted.

I don't know how Spark does this without the checkpoint's metadata.
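
A minimal sketch for checking whether the deletion actually took effect, assuming the checkpoint lives on the local filesystem (a checkpoint on HDFS or object storage would need that store's own tooling). The helper name `checkpoint_leftovers` is hypothetical, and the subdirectory names are the ones Structured Streaming typically writes under a checkpoint location; treat that layout as an assumption to verify against your Spark version.

```python
import os

# Subdirectories Structured Streaming typically writes under a checkpoint
# location (assumed layout; verify against your Spark version).
CHECKPOINT_SUBDIRS = ("offsets", "commits", "state", "sources")


def checkpoint_leftovers(checkpoint_dir):
    """Return the checkpoint subdirectories that still exist on local disk."""
    return [
        name
        for name in CHECKPOINT_SUBDIRS
        if os.path.isdir(os.path.join(checkpoint_dir, name))
    ]
```

An empty list means nothing that looks like checkpoint metadata remains at that path. A non-empty list suggests the job is still resuming from files that were not removed.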






Apache Spark Structured Streaming - Kafka Streaming - Option to ignore checkpoint

2018-03-22 Thread M Singh
Hi:
I am working on a realtime application using Spark Structured Streaming
(v2.2.1). The application reads data from Kafka, and if there is a failure, I
would like to ignore the checkpoint. Is there any configuration to just read
from the last Kafka offset after a failure and ignore any offset checkpoints?
I also believe that the checkpoint saves state, so aggregations continue from
it after recovery. Is there any way to ignore the checkpointed state?
Also, is there a way to selectively checkpoint only state or only offsets?

Thanks
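
One possible approach to the question above, as a hedged sketch rather than a definitive answer: the Kafka source consults `startingOffsets` only when a query starts without an existing checkpoint, so giving each run a fresh `checkpointLocation` makes the query ignore earlier offsets and state. The helper names, base directory, broker address, and topic are hypothetical. The option keys `kafka.bootstrap.servers`, `subscribe`, `startingOffsets`, and `checkpointLocation` are the documented ones. This discards offsets and state together; it does not keep one and drop the other.

```python
import os
import uuid


def kafka_source_options(bootstrap_servers, topic, starting_offsets="latest"):
    """Build options for the Structured Streaming Kafka source."""
    return {
        "kafka.bootstrap.servers": bootstrap_servers,
        "subscribe": topic,
        # Only honored when the query starts with no existing checkpoint.
        "startingOffsets": starting_offsets,
    }


def fresh_checkpoint_location(base_dir):
    """Return a checkpoint path that no previous run has used."""
    return os.path.join(base_dir, "run-" + uuid.uuid4().hex)


def start_query(spark, base_dir, bootstrap_servers, topic):
    """Start a console query; needs pyspark and the spark-sql-kafka package."""
    reader = spark.readStream.format("kafka")
    for key, value in kafka_source_options(bootstrap_servers, topic).items():
        reader = reader.option(key, value)
    return (
        reader.load()
        .writeStream.format("console")
        .option("checkpointLocation", fresh_checkpoint_location(base_dir))
        .start()
    )
```

For example, `start_query(spark, "/tmp/checkpoints", "localhost:9092", "events")` would begin reading from the latest offsets on every restart. The trade-off is that records produced while the job was down are skipped and any aggregation state is rebuilt from scratch.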