Re: Custom Prometheus metrics disappeared in 1.16.2 => 1.17.1 upgrade

Javier Vegas Thu, 28 Sep 2023 00:24:47 -0700

Thanks! I saw the first change but missed the third one, which most
probably explains my problem. Most likely the metrics I was sending
with the twitter/finagle statsReceiver ended up in the singleton
default registry and were exposed by Flink along with all the other
Flink metrics. Now that Flink uses its own registry, I have no idea
where my custom metrics end up.
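
A toy model of that guess, in plain Java with no Flink or Prometheus
dependencies. ToyRegistry and ToyReporter are hypothetical stand-ins,
not the real CollectorRegistry or PrometheusReporter classes, and the
assumption that the finagle stats receiver writes to the singleton
default registry is unverified. The sketch only shows why a metric
registered in the singleton would stop appearing once the reporter
serves a registry of its own:

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.ConcurrentHashMap;

public class RegistryDemo {

    /** Toy stand-in for a Prometheus collector registry (not the real API). */
    static final class ToyRegistry {
        /** Plays the role of the process-wide singleton default registry. */
        static final ToyRegistry DEFAULT = new ToyRegistry();

        private final Map<String, Double> gauges = new ConcurrentHashMap<>();

        void register(String name, double value) {
            gauges.put(name, value);
        }

        Set<String> names() {
            return new TreeSet<>(gauges.keySet());
        }
    }

    /** Toy stand-in for a reporter that exposes exactly one registry. */
    static final class ToyReporter {
        private final ToyRegistry registry;

        ToyReporter(ToyRegistry registry) {
            this.registry = registry;
        }

        /** Metrics handed over by the framework go into the reporter's registry. */
        void notifyOfAddedMetric(String name, double value) {
            registry.register(name, value);
        }

        /** What a scrape of the reporter's port would list. */
        Set<String> scrape() {
            return registry.names();
        }
    }

    public static void main(String[] args) {
        // Assumption: a third-party library registers directly in the singleton.
        ToyRegistry.DEFAULT.register("finagle_requests", 42.0);

        // Before the change: the reporter shares the singleton registry.
        ToyReporter shared = new ToyReporter(ToyRegistry.DEFAULT);
        shared.notifyOfAddedMetric("flink_numRecordsIn", 7.0);
        System.out.println("shared registry:   " + shared.scrape());

        // After the change: the reporter owns a separate registry.
        ToyReporter isolated = new ToyReporter(new ToyRegistry());
        isolated.notifyOfAddedMetric("flink_numRecordsIn", 7.0);
        System.out.println("separate registry: " + isolated.scrape());
    }
}
```

If the model matches what actually happens, the custom metrics are
probably still being collected in the default registry and are simply
no longer served on port 9999.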



On Wed, 27 Sep 2023 at 4:56, Kenan Kılıçtepe
(<kkilict...@gmail.com>) wrote:
>
> Have you checked the metric changes in 1.17?
>
> From release notes 1.17:
> https://nightlies.apache.org/flink/flink-docs-master/release-notes/flink-1.17/
>
> Metric Reporters
> Only support reporter factories for instantiation
> FLINK-24235
> Configuring reporters by their class is no longer supported. Reporter 
> implementations must provide a MetricReporterFactory, and all configurations 
> must be migrated to such a factory.
>
> UseLogicalIdentifier makes datadog consider metric as custom
> FLINK-30383
> The Datadog reporter now adds a “flink.” prefix to metric identifiers if 
> “useLogicalIdentifier” is enabled. This is required for these metrics to be 
> recognized as Flink metrics, not custom ones.
>
> Use separate Prometheus CollectorRegistries
> FLINK-30020
> The PrometheusReporters now use a separate CollectorRegistry for each 
> reporter instance instead of the singleton default registry. This generally 
> shouldn’t impact setups, but it may break code that indirectly interacts with 
> the reporter via the singleton instance (e.g., a test trying to assert what 
> metrics are reported).
>
>
>
> On Wed, Sep 27, 2023 at 11:11 AM Javier Vegas <jve...@strava.com> wrote:
>>
>> I implemented some custom Prometheus metrics that were working on
>> 1.16.2, with my configuration
>>
>> metrics.reporter.prom.factory.class:
>> org.apache.flink.metrics.prometheus.PrometheusReporterFactory
>> metrics.reporter.prom.port: 9999
>>
>> I could see both Flink metrics and my custom metrics on port 9999 of
>> my task managers
>>
>> After upgrading to 1.17.1, using the same configuration, I can see
>> only the Flink metrics on port 9999 of the task managers;
>> the custom metrics are getting lost somewhere.
>>
>> The release notes for 1.17 mention
>> https://issues.apache.org/jira/browse/FLINK-24235
>> that removes instantiating reporters by name and forces using a
>> factory, which I was already doing in 1.16.2. Do I need to do
>> anything extra after those changes so my metrics are aggregated with
>> the Flink ones?
>>
>> I am also seeing this error message on application startup (which I
>> was already seeing in 1.16.2): "Multiple implementations of the same
>> reporter were found in 'lib' and/or 'plugins' directories for
>> org.apache.flink.metrics.prometheus.PrometheusReporterFactory. It is
>> recommended to remove redundant reporter JARs to resolve used
>> versions' ambiguity." Could that also explain the missing metrics?
>>
>> Thanks,
>>
>> Javier Vegas
