Thank you, Jungtaek and Amit! This is very helpful indeed!

Cheers,
Debu

On Mon, Sep 28, 2020 at 5:33 AM Jungtaek Lim <kabhwan.opensou...@gmail.com> wrote:

> https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/CheckpointFileManager.scala
>
> You would need to implement CheckpointFileManager yourself, and it is
> tightly integrated with HDFS (the parameters and return types of its
> methods are mostly from HDFS). That doesn't mean it's impossible to
> implement CheckpointFileManager against a non-filesystem, but it would be
> non-trivial to override all of the functionality and make it work
> seamlessly.
>
> The required consistency is documented in the javadoc of
> CheckpointFileManager. Please read through it and evaluate whether your
> target storage can fulfill the requirements.
>
> Thanks,
> Jungtaek Lim (HeartSaVioR)
>
> On Mon, Sep 28, 2020 at 3:04 AM Amit Joshi <mailtojoshia...@gmail.com>
> wrote:
>
>> Hi,
>>
>> As far as I know, it depends on whether you are using Spark Streaming or
>> Structured Streaming.
>> In Spark Streaming you can write your own code to checkpoint.
>> But in the case of Structured Streaming it should be a file location.
>> But the main question is why you want to checkpoint in
>> NoSQL, as it's eventually consistent.
>>
>> Regards
>> Amit
>>
>> On Sunday, September 27, 2020, Debabrata Ghosh <mailford...@gmail.com>
>> wrote:
>>
>>> Hi,
>>> I had a query about Spark checkpoints: can I store the
>>> checkpoints in NoSQL or Kafka instead of a filesystem?
>>>
>>> Regards,
>>>
>>> Debu
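
A note on the consistency requirement mentioned in the thread: as far as I read the CheckpointFileManager javadoc, the central guarantee is that a checkpoint file is written atomically, so a reader never sees a partial file. The sketch below is not Spark code and does not use any Spark API; it only illustrates that guarantee on a local POSIX filesystem with the common write-to-temp-then-rename pattern. The function name `write_atomically` and the file name `offsets-0` are hypothetical, chosen for illustration.

```python
import os
import tempfile


def write_atomically(path: str, data: bytes) -> None:
    """Write data so that readers never observe a partially written file.

    Hypothetical helper for illustration only; not part of Spark.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file in the target directory so the final
    # rename stays within one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the bytes to disk before renaming
        # On POSIX, os.replace swaps the file in atomically: the target
        # holds either the old content or the new content, never a mix.
        os.replace(tmp_path, path)
    except BaseException:
        # Remove the temporary file if the write or rename failed.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


if __name__ == "__main__":
    checkpoint_dir = tempfile.mkdtemp()
    target = os.path.join(checkpoint_dir, "offsets-0")
    write_atomically(target, b"batch 0 offsets")
    with open(target, "rb") as f:
        print(f.read())
```

A store that cannot offer an equivalent "all or nothing" write, plus a listing that reflects completed writes, is likely to be a poor fit as a checkpoint backend, which is the concern Amit raises about eventually consistent NoSQL stores.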