Hi folks,

We are using Kafka + Spark Streaming in our data pipeline, but sometimes
we have to clean up the checkpoint on HDFS before we restart the Spark
Streaming application; otherwise the application fails to start.

That means we lose data whenever we clean up the checkpoint. Is there a way
to read the Kafka offsets from the checkpoint, so that we can resume
processing from those offsets and avoid losing data?
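
To make the goal concrete, here is a minimal sketch of the bookkeeping we
have in mind, in plain Python rather than the Spark API. It assumes we could
obtain, for each batch, the next offset to read per (topic, partition). The
helper names save_offsets and load_offsets and the JSON layout are
hypothetical, and a local file stands in for HDFS. It is an illustration of
the idea, not something we have running.

```python
import json
import os
import tempfile


def save_offsets(path, offsets):
    """Write {(topic, partition): next_offset} to a JSON file.

    Hypothetical helper: the record layout is an assumption, not a Spark format.
    """
    records = [
        {"topic": topic, "partition": partition, "offset": offset}
        for (topic, partition), offset in sorted(offsets.items())
    ]
    directory = os.path.dirname(os.path.abspath(path))
    # Write to a temp file in the same directory, then rename it into place,
    # so a crash never leaves a half-written offsets file behind.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump(records, f)
    os.replace(tmp_path, path)


def load_offsets(path):
    """Return {(topic, partition): next_offset}, or {} if nothing was saved."""
    if not os.path.exists(path):
        return {}
    with open(path) as f:
        return {
            (r["topic"], r["partition"]): r["offset"] for r in json.load(f)
        }
```

On restart, the map returned by load_offsets would be used as the starting
position for each partition instead of whatever the checkpoint held.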

Thanks
