Datasource API V2 and checkpointing

Thakrar, Jayesh Mon, 23 Apr 2018 19:50:34 -0700

When checkpointing is enabled, who does the actual work: the streaming
datasource or the execution engine/driver?


I have written a small/trivial datasource that just generates strings.
After enabling checkpointing, I do see a folder being created under the 
checkpoint folder, but there's nothing else in there.

I have the same question about write-ahead logging and recovery.
And on a restart from a failed streaming session, who should set the offsets:
the driver/Spark or the datasource?

Any pointers to design docs would also be greatly appreciated.

Thanks,
Jayesh
