Custom serialization and checkpointing

Tech Meme Thu, 13 Aug 2015 16:33:06 -0700

Hi Guys,
   We need to checkpoint some state (an RDD that is updated using
updateStateByKey), and we would like finer control over how it is serialized.
Custom serialization would also let us handle schema evolution in the
deserialization code when we need to modify the structure of the classes
associated with the state.
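
To make the schema-evolution part concrete, here is a rough sketch of the kind
of versioned envelope I have in mind. It is plain Python rather than any Spark
API, and the field names (`count`, `last_seen`), the version numbers, and the
function names are all made up for illustration:

```python
import json

# Hypothetical schema version of the state classes as they exist today.
CURRENT_VERSION = 2


def serialize_state(state: dict) -> bytes:
    """Wrap the state in an envelope that records its schema version."""
    envelope = {"version": CURRENT_VERSION, "state": state}
    return json.dumps(envelope, sort_keys=True).encode("utf-8")


def deserialize_state(blob: bytes) -> dict:
    """Read an envelope and upgrade older versions to the current schema."""
    envelope = json.loads(blob.decode("utf-8"))
    version = envelope["version"]
    state = envelope["state"]

    if version == 1:
        # Assumed migration: v1 only stored "count"; v2 adds "last_seen".
        state = {"count": state["count"], "last_seen": None}
        version = 2

    if version != CURRENT_VERSION:
        raise ValueError(f"unsupported state version: {version}")
    return state
```

With something like this, state written by an older build (version 1) would
still load after the classes change, because the upgrade happens at read time.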


I guess I can use foreachRDD and write the state to any location (either a
blob store or Dynamo).

A) How can I make the checkpoint recovery read data from this persisted
location?
B) I notice that calling checkpoint cleans up older versions of the
checkpoint. Where should I write this cleanup code?
C) My understanding is that checkpointing is atomic. Is there anything I
need to be aware of so that I do not lose the atomicity semantics?
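
For context on C, the atomicity I would like to keep is roughly "write to a
temporary file, then rename". A minimal sketch, assuming a local POSIX-style
filesystem (a blob store or Dynamo would need a different mechanism); the
function name `write_atomically` is made up for illustration:

```python
import os
import tempfile


def write_atomically(path: str, payload: bytes) -> None:
    """Write payload so that readers see the old file or the new one, never a partial file."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file in the target directory so the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(payload)
            handle.flush()
            os.fsync(handle.fileno())  # push the bytes to disk before the rename
        os.replace(tmp_path, path)  # atomically swaps in the new file
    except BaseException:
        # Clean up the temporary file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

If the write fails part way through, the previous version at `path` is left
untouched, which is the property I do not want to lose.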


Thanks,
Arun
