[
https://issues.apache.org/jira/browse/FLINK-38704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Mohsen Rezaei updated FLINK-38704:
----------------------------------
Description:
The metrics reporter configuration below worked in 1.x releases, but 2.x does
not load the correct config.
Runtime Flink configurations loaded:
{code:java}
2025-11-20 04:33:51.737 [main] INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: metrics.reporter.prom.port, 9999
2025-11-20 04:33:51.738 [main] INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: metrics.reporter.prom.factory.class, org.apache.flink.metrics.prometheus.PrometheusReporterFactory
{code}
But the reporter setup [loads the default port|https://github.com/apache/flink/blob/45ab6c816465e717d0eef2ad6672cbb0c1a73a7e/flink-metrics/flink-metrics-prometheus/src/main/java/org/apache/flink/metrics/prometheus/PrometheusReporterFactory.java#L33]:
{code:java}
2025-11-20 04:33:55.520 [main] INFO org.apache.flink.metrics.prometheus.PrometheusReporter - Started PrometheusReporter HTTP server on port 9249.
{code}
and metrics are served only on port 9249:
{code:java}
flink@jm-0:~$ curl localhost:9999/metrics
curl: (7) Failed to connect to localhost port 9999 after 0 ms: Couldn't connect to server
flink@jm-0:~$ curl localhost:9249/metrics
# HELP flink_jobmanager_Status_JVM_GarbageCollector_Copy_TimeMsPerSecond TimeMsPerSecond (scope: jobmanager_Status_JVM_GarbageCollector_Copy)
# TYPE flink_jobmanager_Status_JVM_GarbageCollector_Copy_TimeMsPerSecond gauge
flink_jobmanager_Status_JVM_GarbageCollector_Copy_TimeMsPerSecond{host="10_155_60_8",} 0.0
...
{code}
This potentially affects all reporters that are loaded via their factory in
{{ReporterSetup}}.
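For illustration, the expected behavior can be sketched in plain Java. This is a
hypothetical sketch, not the actual {{ReporterSetup}} code: the class
{{ReporterPortSketch}} and its two helper methods are made up for this example,
and it assumes that reporter options are the keys under
{{metrics.reporter.<name>.}} with that prefix stripped. Only the two config keys
and the default port 9249 come from the logs above.

```java
import java.util.HashMap;
import java.util.Map;

public class ReporterPortSketch {
    // Default port as seen in the PrometheusReporter log line above.
    static final String DEFAULT_PORT = "9249";

    /** Hypothetical helper: collects keys under metrics.reporter.<name>. with the prefix stripped. */
    public static Map<String, String> reporterConfig(Map<String, String> flinkConf, String name) {
        String prefix = "metrics.reporter." + name + ".";
        Map<String, String> scoped = new HashMap<>();
        for (Map.Entry<String, String> entry : flinkConf.entrySet()) {
            if (entry.getKey().startsWith(prefix)) {
                scoped.put(entry.getKey().substring(prefix.length()), entry.getValue());
            }
        }
        return scoped;
    }

    /** Hypothetical helper: returns the configured port, or the default if none is set. */
    public static String resolvePort(Map<String, String> reporterConf) {
        return reporterConf.getOrDefault("port", DEFAULT_PORT);
    }

    public static void main(String[] args) {
        // The two properties reported by GlobalConfiguration in the logs above.
        Map<String, String> conf = new HashMap<>();
        conf.put("metrics.reporter.prom.port", "9999");
        conf.put("metrics.reporter.prom.factory.class",
                "org.apache.flink.metrics.prometheus.PrometheusReporterFactory");

        // Expected: the configured port wins over the default.
        System.out.println(resolvePort(reporterConfig(conf, "prom")));
    }
}
```

With the configuration from the logs, the expectation is that the configured
port (9999) is used and that 9249 is only a fallback when no port is set. The
observed 2.x behavior matches the fallback case instead.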
> Metrics reporter setup does not load Prometheus with correct configs/port
> -------------------------------------------------------------------------
>
> Key: FLINK-38704
> URL: https://issues.apache.org/jira/browse/FLINK-38704
> Project: Flink
> Issue Type: Bug
> Components: Runtime / Metrics
> Affects Versions: 2.0.1, 2.2.0, 2.1.1
> Reporter: Mohsen Rezaei
> Priority: Major