[jira] [Updated] (FLINK-38133) Unable to find checkpoint status analysis during Flink restart on k8s，jobmanager created by deployment

jeremyMu (Jira) Tue, 22 Jul 2025 06:45:07 -0700


     [ 
https://issues.apache.org/jira/browse/FLINK-38133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


jeremyMu updated FLINK-38133:
-----------------------------
    Attachment:     (was: 486dc1fed2cf5b84922ede34d479015.png)

> Unable to find checkpoint status analysis during Flink restart on 
> k8s，jobmanager created by deployment
> ------------------------------------------------------------------------------------------------------
>
>                 Key: FLINK-38133
>                 URL: https://issues.apache.org/jira/browse/FLINK-38133
>             Project: Flink
>          Issue Type: Bug
>          Components: Deployment / Kubernetes
>    Affects Versions: 1.16.2
>            Reporter: jeremyMu
>            Priority: Major
>         Attachments: 486dc1fed2cf5b84922ede34d479015-1.png, 
> AgAABj35qkdRNkkxGZlEc4xVnl4UNi2l.png
>
>
> Before exiting abnormally, the JobManager (JM) clears the HA metadata 
> (for example, the checkpoint pointers).
> In production, the number of TaskManager (TM) retries is limited (in some 
> business scenarios the TaskManager must not retry indefinitely). If the TM 
> reaches the retry limit and the job still cannot be brought up, the JM 
> crashes. When the JM crashes, the metadata stored by HA is cleared (see the 
> logic in the source code). As a result, when the JM restarts automatically, 
> it cannot find the HA metadata and therefore cannot locate the most recent 
> checkpoint state.
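
The failure sequence described in the report can be sketched as a toy model. This is not Flink code: `HaStore`, `run_job_manager`, and `recover` are hypothetical names, and the cleanup on terminal failure is taken from the reporter's description, not verified against the Flink source.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class HaStore:
    """Hypothetical stand-in for the HA metadata (e.g. a checkpoint pointer)."""
    checkpoint_pointer: Optional[str] = None


def run_job_manager(ha: HaStore, max_restarts: int, attempts_that_fail: int) -> str:
    """Simulate one JobManager lifetime and return the final job status."""
    failures = 0
    while failures < attempts_that_fail:
        failures += 1
        if failures > max_restarts:
            # Restart budget exhausted: the job ends in a terminal state and,
            # as the report describes, the HA metadata is cleared.
            ha.checkpoint_pointer = None
            return "FAILED"
    return "RUNNING"


def recover(ha: HaStore) -> str:
    """What a freshly restarted JobManager could do with the HA store."""
    if ha.checkpoint_pointer is None:
        return "no checkpoint found, starting from empty state"
    return f"restoring from {ha.checkpoint_pointer}"


# Usage: the job fails more often than the restart limit allows, so the
# pointer is gone by the time the replacement JobManager looks for it.
ha = HaStore(checkpoint_pointer="chk-42")
status = run_job_manager(ha, max_restarts=3, attempts_that_fail=10)
print(status, "->", recover(ha))
```

In this model the pointer survives only if the job recovers within its restart budget, which is the behaviour the reporter is objecting to for a JobManager that a Deployment restarts automatically.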



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
