Thank you, Jungtaek and Amit! This is very helpful indeed!

Cheers,

Debu

On Mon, Sep 28, 2020 at 5:33 AM Jungtaek Lim <kabhwan.opensou...@gmail.com>
wrote:

>
> https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/CheckpointFileManager.scala
>
> You would need to implement CheckpointFileManager yourself, which is
> tightly integrated with HDFS (parameters and return types of methods are
> mostly from HDFS). That doesn't mean it's impossible to
> implement CheckpointFileManager against a non-filesystem, but it'd be
> non-trivial to override all of the functionality and make it work
> seamlessly.
>
> The required consistency is documented in the javadoc of CheckpointFileManager -
> please read through it, and evaluate whether your target storage can
> fulfill the requirements.
>
> Thanks,
> Jungtaek Lim (HeartSaVioR)
>
> On Mon, Sep 28, 2020 at 3:04 AM Amit Joshi <mailtojoshia...@gmail.com>
> wrote:
>
>> Hi,
>>
>> As far as I know, it depends on whether you are using Spark Streaming or
>> Structured Streaming.
>> In Spark Streaming you can write your own code to checkpoint.
>> But in the case of Structured Streaming, it should be a file location.
>> But the main question is why you want to checkpoint in
>> NoSQL, given that it is eventually consistent.
>>
>>
>> Regards
>> Amit
>>
>> On Sunday, September 27, 2020, Debabrata Ghosh <mailford...@gmail.com>
>> wrote:
>>
>>> Hi,
>>>     I have a question about Spark checkpoints: can I store the
>>> checkpoints in NoSQL or Kafka instead of a filesystem?
>>>
>>> Regards,
>>>
>>> Debu
>>>
>>
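
To make the consistency point from the thread concrete: as far as I understand the CheckpointFileManager javadoc, the key expectation is that a checkpoint file is created atomically, so a reader sees either the complete file or nothing. Below is a minimal sketch of that idea on a local filesystem, using the common write-to-temp-then-rename pattern. It is not Spark code and not how Spark implements it; the function name `write_atomically` and the file name `offsets-0` are hypothetical, chosen only for illustration.

```python
import os
import tempfile


def write_atomically(path: str, data: bytes) -> None:
    """Write data so that readers see either the old file or the complete new one."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file in the target directory so the final rename
    # stays within a single filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the bytes to disk before renaming
        # os.replace swaps the file in as one step and overwrites any existing target.
        os.replace(tmp_path, path)
    except BaseException:
        # On failure, remove the partial temp file so it is never mistaken for a checkpoint.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


if __name__ == "__main__":
    checkpoint_dir = tempfile.mkdtemp()
    target = os.path.join(checkpoint_dir, "offsets-0")  # hypothetical file name
    write_atomically(target, b"batch 0 offsets")
    with open(target, "rb") as f:
        print(f.read())
```

A store that cannot offer an equivalent all-or-nothing write, plus a listing that reflects completed writes, would likely have trouble meeting the requirements Jungtaek points to, which is worth checking before attempting a custom implementation.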