Prabhu Joseph created FLINK-32173:
-------------------------------------

             Summary: Flink Job Metrics returns stale values in the first 
request after an update in the values
                 Key: FLINK-32173
                 URL: https://issues.apache.org/jira/browse/FLINK-32173
             Project: Flink
          Issue Type: Bug
          Components: Runtime / Metrics
    Affects Versions: 1.17.0
            Reporter: Prabhu Joseph


Flink job metrics return stale values on the first request after the values 
have been updated.

*Repro:*

1. Run a Flink job with the fixed-delay restart strategy and a large number of 
restart attempts:
{code}
restart-strategy: fixed-delay
restart-strategy.fixed-delay.attempts: 10000


flink run -Dexecution.checkpointing.interval="10s" -d \
  -c org.apache.flink.streaming.examples.wordcount.WordCount \
  /usr/lib/flink/examples/streaming/WordCount.jar
{code}

2. Kill one of the TaskManagers, which triggers a job restart.

3. After the job has restarted, fetch any job metric. The first request returns 
the stale (older) value, here 48.

{code}
[hadoop@ip-172-31-44-70 ~]$ curl http://jobmanager:52000/jobs/d24f7d74d541f1215a65395e0ebd898c/metrics?get=numRestarts | jq .
[
  {
    "id": "numRestarts",
    "value": "48"
  }
]
{code}

4. Subsequent requests return the correct value, here 49.
{code}
[hadoop@ip-172-31-44-70 ~]$ curl http://jobmanager:52000/jobs/d24f7d74d541f1215a65395e0ebd898c/metrics?get=numRestarts | jq .
[
  {
    "id": "numRestarts",
    "value": "49"
  }
]
{code}

5. Repeat steps 2 to 4. Each time, the first request after a metric update 
returns the value from before the update. Only the next request returns the 
correct value.
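
Steps 3 and 4 can be scripted to make the check repeatable. The following is a minimal sketch, not part of the original report: the JM_URL and JOB_ID variables and the fetch_metric and compare_reads function names are illustrative assumptions, and the defaults simply reuse the host and job ID from the curl calls above. It requires curl and jq.

```shell
#!/usr/bin/env bash
# Sketch only: JM_URL, JOB_ID and both function names are illustrative
# assumptions; the defaults reuse the host and job ID from the report.
JM_URL="${JM_URL:-http://jobmanager:52000}"
JOB_ID="${JOB_ID:-d24f7d74d541f1215a65395e0ebd898c}"

# Fetch a single job metric by name and print only its value.
fetch_metric() {
  curl -s "${JM_URL}/jobs/${JOB_ID}/metrics?get=$1" | jq -r '.[0].value'
}

# Compare two consecutive reads of the same metric and report the result.
compare_reads() {
  if [ "$1" != "$2" ]; then
    echo "stale: first=$1 second=$2"
  else
    echo "consistent: $1"
  fi
}
```

After a restart has completed, running compare_reads "$(fetch_metric numRestarts)" "$(fetch_metric numRestarts)" issues two back-to-back requests. With the values from steps 3 and 4 it would print "stale: first=48 second=49". A difference could also mean that another restart happened between the two requests, so the check is only meaningful while the job is stable.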

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
