thanks for all the replies! i'll definitely take a look at the Flink k8s
Operator project.

i'll try to restate the issue to clarify. this issue is specific to
starting a job from a savepoint in job-cluster mode. in these cases the Job
Manager container is configured to run a single Flink job at start-up. the
savepoint needs to be provided as an argument to the entrypoint. the Flink
documentation for this approach is here:

https://github.com/apache/flink/tree/master/flink-container/kubernetes#resuming-from-a-savepoint
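to make that concrete, here's roughly what the job manager container ends up
running when the savepoint is baked into the Kubernetes YAML. the job class
and savepoint path below are made-up examples, and the exact flags come from
the flink-container entrypoint linked above, so please double-check them
against your Flink version:

    # args from the container spec in the job manager's Kubernetes YAML, as
    # handed to the flink-container entrypoint (job class and savepoint path
    # are made-up examples)
    docker-entrypoint.sh job-cluster \
      --job-classname com.example.MyStreamingJob \
      -Djobmanager.rpc.address=flink-job-cluster \
      -Dparallelism.default=2 \
      --fromSavepoint gs://my-bucket/savepoints/savepoint-123abc \
      --allowNonRestoredState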

the issue is that taking this approach means that the job will *always*
start from the savepoint provided as the start argument in the Kubernetes
YAML. this includes unplanned restarts of the job manager, but we'd really
prefer that any *unplanned* restart resume from the most recent checkpoint
instead of restarting from the configured savepoint. so in a sense we want
the savepoint argument to be transient, only used during the initial
deployment, but this runs counter to the design of Kubernetes, which always
wants to restore a deployment to the "goal state" defined in the YAML.
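one shape a workaround could take (the custom entrypoint idea from my
original question below) is a wrapper that treats the savepoint as one-shot.
this is only a rough, untested sketch, assuming a persistent volume mounted
at /flink-state and the stock flink-container entrypoint; SAVEPOINT_PATH, the
marker file, and all paths here are purely illustrative:

    #!/usr/bin/env bash
    # hypothetical wrapper entrypoint: pass --fromSavepoint only on the first
    # start; later restarts fall back to normal checkpoint recovery.
    set -eo pipefail

    SAVEPOINT_PATH="${SAVEPOINT_PATH:-}"      # injected from the Kubernetes YAML
    MARKER="/flink-state/savepoint-consumed"  # persistent volume (or swap in GCS)

    EXTRA_ARGS=()
    if [ -n "$SAVEPOINT_PATH" ] && [ ! -f "$MARKER" ]; then
      EXTRA_ARGS+=(--fromSavepoint "$SAVEPOINT_PATH")
      # record when the savepoint was used; could also record why / which deployment
      echo "$(date -u +%Y-%m-%dT%H:%M:%SZ) used $SAVEPOINT_PATH" > "$MARKER"
    fi

    # on an unplanned restart the marker already exists, so no savepoint argument
    # is passed and the job recovers from its latest checkpoint via the usual
    # HA/checkpoint configuration.
    exec /docker-entrypoint.sh job-cluster "$@" "${EXTRA_ARGS[@]}"

the downside is that we'd then own a bespoke entrypoint plus a marker file
(and its lifecycle) for every job, which is exactly the kind of hacky,
unexpected machinery we'd prefer to avoid if there's already a better answer.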

i hope this helps. if you want more details please let me know, and thanks
again for your time.


On Tue, Sep 24, 2019 at 1:09 PM Hao Sun <ha...@zendesk.com> wrote:

> I think I overlooked it. Good point. I am using Redis to save the path to
> my savepoint; I might be able to set a TTL to avoid such an issue.
>
> Hao Sun
>
>
> On Tue, Sep 24, 2019 at 9:54 AM Yuval Itzchakov <yuva...@gmail.com> wrote:
>
>> Hi Hao,
>>
>> I think he's talking about exactly the use case where the JM/TM restart
>> and come back up from the latest savepoint, which might be stale by that
>> time.
>>
>> On Tue, 24 Sep 2019, 19:24 Hao Sun, <ha...@zendesk.com> wrote:
>>
>>> We always make a savepoint before we shut down the job-cluster. So the
>>> savepoint is always the latest. When we fix a bug or change the job graph,
>>> it can resume well.
>>> We only use checkpoints for unplanned downtime, e.g. K8S killed JM/TM,
>>> uncaught exception, etc.
>>>
>>> Maybe I do not understand your use case well, but I do not see a need to
>>> start from a checkpoint after a bug fix.
>>> From what I know, you can currently use a checkpoint as a savepoint as well.
>>>
>>> Hao Sun
>>>
>>>
>>> On Tue, Sep 24, 2019 at 7:48 AM Yuval Itzchakov <yuva...@gmail.com>
>>> wrote:
>>>
>>>> AFAIK there's currently nothing implemented to solve this problem, but a
>>>> possible fix could be implemented on top of
>>>> https://github.com/lyft/flinkk8soperator, which already has a pretty
>>>> fancy state machine for rolling upgrades. I'd love to be involved, as this
>>>> is an issue I've been thinking about as well.
>>>>
>>>> Yuval
>>>>
>>>> On Tue, Sep 24, 2019 at 5:02 PM Sean Hester <
>>>> sean.hes...@bettercloud.com> wrote:
>>>>
>>>>> hi all--we've run into a gap (knowledge? design? tbd?) for our use
>>>>> cases when deploying Flink jobs to start from savepoints using the
>>>>> job-cluster mode in Kubernetes.
>>>>>
>>>>> we're running ~15 different jobs, all in job-cluster mode, using a
>>>>> mix of Flink 1.8.1 and 1.9.0, under GKE (Google Kubernetes Engine). these
>>>>> are all long-running streaming jobs, all essentially acting as
>>>>> microservices. we're using Helm charts to configure all of our 
>>>>> deployments.
>>>>>
>>>>> we have a number of use cases where we want to restart jobs from a
>>>>> savepoint to replay recent events, e.g. when we've enhanced the job logic
>>>>> or fixed a bug. but after the deployment we want the job to resume its
>>>>> "long-running" behavior, where any unplanned restarts resume from the
>>>>> latest checkpoint.
>>>>>
>>>>> the issue we run into is that any obvious/standard/idiomatic
>>>>> Kubernetes deployment includes the savepoint argument in the 
>>>>> configuration.
>>>>> if the Job Manager container(s) have an unplanned restart, when they come
>>>>> back up they will start from the savepoint instead of resuming from the
>>>>> latest checkpoint. everything is working as configured, but that's not
>>>>> exactly what we want. we want the savepoint argument to be transient
>>>>> somehow (only used during the initial deployment), but Kubernetes doesn't
>>>>> really support the concept of transient configuration.
>>>>>
>>>>> i can see a couple of potential solutions that either involve custom
>>>>> code in the jobs or custom logic in the container (e.g. a custom 
>>>>> entrypoint
>>>>> script that records that the configured savepoint has already been used in
>>>>> a file on a persistent volume or GCS, and potentially when/why/by which
>>>>> deployment). but these seem like unexpected and hacky solutions. before we
>>>>> head down that road i wanted to ask:
>>>>>
>>>>>    - is this already a solved problem that i've missed?
>>>>>    - is this issue already on the community's radar?
>>>>>
>>>>> thanks in advance!
>>>>>
>>>>> --
>>>>> *Sean Hester* | Senior Staff Software Engineer | m. 404-828-0865
>>>>> 3525 Piedmont Rd. NE, Building 6, Suite 500, Atlanta, GA 30305
>>>>> <http://www.bettercloud.com> <http://www.bettercloud.com>
>>>>> *Altitude 2019 in San Francisco | Sept. 23 - 25*
>>>>> It’s not just an IT conference, it’s “a complete learning and
>>>>> networking experience”
>>>>> <https://altitude.bettercloud.com/?utm_source=gmail&utm_medium=signature&utm_campaign=2019-altitude>
>>>>>
>>>>>
>>>>
>>>> --
>>>> Best Regards,
>>>> Yuval Itzchakov.
>>>>
>>>

-- 
*Sean Hester* | Senior Staff Software Engineer | m. 404-828-0865
3525 Piedmont Rd. NE, Building 6, Suite 500, Atlanta, GA 30305
<http://www.bettercloud.com> <http://www.bettercloud.com>
*Altitude 2019 in San Francisco | Sept. 23 - 25*
It’s not just an IT conference, it’s “a complete learning and networking
experience”
<https://altitude.bettercloud.com/?utm_source=gmail&utm_medium=signature&utm_campaign=2019-altitude>
