Re: Flink-Metrics Prometheus - Native Histograms / Native Counters

2023-06-30 Thread Ryan van Huuksloot
Sounds good - I haven't looked too much into the comparison of the old and
new client.

Opened a ticket: https://issues.apache.org/jira/browse/FLINK-32508

I may have some time to support this ticket in 2 weeks but anyone can feel
free to start working on it.

Thanks,
Ryan van Huuksloot
Sr. Production Engineer | Streaming Platform
[image: Shopify]
<https://www.shopify.com/?utm_medium=salessignatures_source=hs_email>


On Fri, Jun 30, 2023 at 10:34 AM Martijn Visser 
wrote:

> Hi Ryan,
>
> If I look at the changelog for the simpleclient 0.10 [1], they've switched
> their data model. So if you upgrade to the later version, the data model
> for existing Flink Prometheus users would be broken IIRC. That's why I
> think option 1 is more clean: it provides the option to the user to choose
> which package they want to use. Either the new one, with a new data model,
> or the current one, with the existing data model.
>
> Best regards,
>
> Martijn
>
> [1] https://github.com/prometheus/client_java/releases/tag/parent-0.10.0

[jira] [Created] (FLINK-32508) Flink-Metrics Prometheus - Native Histograms / Native Counters

2023-06-30 Thread Ryan van Huuksloot (Jira)
Ryan van Huuksloot created FLINK-32508:
--

    Summary: Flink-Metrics Prometheus - Native Histograms / Native Counters
        Key: FLINK-32508
        URL: https://issues.apache.org/jira/browse/FLINK-32508
    Project: Flink
 Issue Type: Technical Debt
 Components: Runtime / Metrics
   Reporter: Ryan van Huuksloot
    Fix For: 1.18.0, 1.19.0


There are new metric types in Prometheus that would allow the exporter to
write Counters and Histograms as native metrics in Prometheus (instead of
writing them as Gauges). This requires an update to the Prometheus client,
which has changed its spec.

To accommodate the new metric types while retaining the old option for
Prometheus metrics, the recommendation is to *add a new package such as
`flink-metrics-prometheus-native` and eventually deprecate the original*.

Discussed more on the mailing list: 
https://lists.apache.org/thread/kbo3973whb8nj5xvkpvhxrmgtmnbkhlv
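
To make the "Counter written as a Gauge" point concrete, here is a minimal sketch of how the same value would look in the Prometheus text exposition format under either type. It is not taken from the Flink reporter: the helper `render_metric` and the metric name `flink_example_numRecordsIn` are made up for illustration, and only the `# TYPE` line convention comes from the exposition format itself.

```python
def render_metric(name, metric_type, value):
    """Render one sample in the Prometheus text exposition format.

    Illustrative helper only; this is not code from flink-metrics-prometheus.
    """
    lines = [
        f"# TYPE {name} {metric_type}",  # type hint read by the scraper
        f"{name} {value}",               # the sample itself
    ]
    return "\n".join(lines)


# Hypothetical Flink counter; the metric name is an assumption for this sketch.
as_gauge = render_metric("flink_example_numRecordsIn", "gauge", 1024)
as_counter = render_metric("flink_example_numRecordsIn", "counter", 1024)

print(as_gauge)
print(as_counter)
```

The sample value is identical in both renderings; only the declared type differs, which is what downstream tooling would use to decide how the series may be queried.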



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: Flink-Metrics Prometheus - Native Histograms / Native Counters

2023-06-30 Thread Martijn Visser
Hi Ryan,

If I look at the changelog for simpleclient 0.10 [1], they've switched
their data model. So if you upgrade to the later version, the data model
for existing Flink Prometheus users would be broken, IIRC. That's why I
think option 1 is cleaner: it lets users choose which package they want
to use, either the new one with the new data model or the current one
with the existing data model.

Best regards,

Martijn

[1] https://github.com/prometheus/client_java/releases/tag/parent-0.10.0
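
As a rough sketch of how a data model change could surface for existing users, the snippet below checks which dashboard metric names would stop matching. It rests on two assumptions that would need to be verified against the linked release notes: that counters become natively typed, and that the newer model exposes counter samples with a `_total` suffix. All function and metric names are hypothetical.

```python
def exposed_name(name, metric_type, openmetrics_model):
    """Return the sample name a scraper would see (assumed naming rule)."""
    if openmetrics_model and metric_type == "counter" and not name.endswith("_total"):
        return name + "_total"
    return name


def broken_queries(dashboard_metrics, metrics, openmetrics_model):
    """List dashboard metric names that no longer match any exposed sample."""
    exposed = {exposed_name(n, t, openmetrics_model) for n, t in metrics.items()}
    return sorted(m for m in dashboard_metrics if m not in exposed)


# Hypothetical metrics and dashboard references, for illustration only.
metrics = {"flink_example_numRestarts": "counter", "flink_example_heapUsed": "gauge"}
dashboard = ["flink_example_numRestarts", "flink_example_heapUsed"]

print(broken_queries(dashboard, metrics, openmetrics_model=False))  # → []
print(broken_queries(dashboard, metrics, openmetrics_model=True))   # → ['flink_example_numRestarts']
```

Under these assumptions the gauge keeps working while the counter-based panel silently loses its data, which is the kind of break a separate package would let users opt into deliberately.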

On Fri, Jun 30, 2023 at 4:23 PM Ryan van Huuksloot
 wrote:

> I'd have to check but the original plan is to upgrade the client but keep
> the flink-metrics-prometheus implementation the same. This should keep the
> metrics collection consistent even with the client upgrade - this would
> need to be verified.
>
> But if that is the worry then we could create a new package to keep things
> distinct.
>
> Thanks,
>
> Ryan van Huuksloot
> Sr. Production Engineer | Streaming Platform
> [image: Shopify]
> <https://www.shopify.com/?utm_medium=salessignatures_source=hs_email>

Re: Flink-Metrics Prometheus - Native Histograms / Native Counters

2023-06-30 Thread Ryan van Huuksloot
I'd have to check, but the original plan is to upgrade the client while
keeping the flink-metrics-prometheus implementation the same. This should
keep the metrics collection consistent even with the client upgrade, though
that would need to be verified.

But if that is the worry then we could create a new package to keep things
distinct.

Thanks,

Ryan van Huuksloot
Sr. Production Engineer | Streaming Platform


On Fri, Jun 30, 2023 at 10:02 AM Martijn Visser 
wrote:

> Hi Patrick,
>
> Yeah, but you would need the latest version of the client, which would
> break the implementation for the current, outdated one, wouldn't it?
>
> Best regards,
>
> Martijn

Re: Flink-Metrics Prometheus - Native Histograms / Native Counters

2023-06-30 Thread Martijn Visser
Hi Patrick,

Yeah, but you would need the latest version of the client, which would
break the implementation for the current, outdated one, wouldn't it?

Best regards,

Martijn

On Fri, Jun 30, 2023 at 3:35 PM Ryan van Huuksloot
 wrote:

> Hi Martijn,
>
> Option 2 and 3 would use a single client. It would just register the
> metrics differently.
>
> Does that make sense? Does that change your perspective?
>
> Thanks,
>
> Ryan van Huuksloot
> Sr. Production Engineer | Streaming Platform
> [image: Shopify]
> <https://www.shopify.com/?utm_medium=salessignatures_source=hs_email>

Re: Flink-Metrics Prometheus - Native Histograms / Native Counters

2023-06-30 Thread Ryan van Huuksloot
Hi Martijn,

Options 2 and 3 would use a single client. They would just register the
metrics differently.

Does that make sense? Does that change your perspective?

Thanks,

Ryan van Huuksloot
Sr. Production Engineer | Streaming Platform


On Fri, Jun 30, 2023 at 7:49 AM Martijn Visser 
wrote:

> Hi Ryan,
>
> I think option 2 and option 3 won't work, because there can be only one
> version of the client. I don't think we should make a clean break on
> metrics in a minor version, but only in major. All in all, I think option 1
> would be the best. We could deprecate the existing one and remove it
> with Flink 2.0 imho.
>
> Best regards,
>
> Martijn
>
> On Thu, Jun 29, 2023 at 5:56 PM Ryan van Huuksloot
>  wrote:
>
> > Hi Martijn,
> >
> > Our team shared the same concern. We've considered a few options:
> >
> >
> > *1. Add a new package such as `flink-metrics-prometheus-native` and
> > eventually deprecate the original.*
> > *Pros:*
> > - Supports backward compatibility.
> > *Cons:*
> > - Two packages to maintain in the interim.
> > - Not consistent with other metrics packages.
> >
> > *2. Maintain the same logic in flink-metrics-prometheus and write new
> > natively typed metrics to a different metric name in Prometheus, in
> > addition to the original metric.*
> >
> > *Pros:*
> > - Supports backward compatibility.
> > *Cons:*
> > - Nearly doubles the metrics being captured by default.
> > - The naming convention will permanently differ when the original names
> are
> > deprecated.
> > - The original names will likely be deprecated at some point.
> >
> > *3. Maintain the same logic in flink-metrics-prometheus. However, if you
> > use a flink-conf option, natively typed metrics would be written to the
> > same names instead of the original metric types.*
> >
> > *Pros:*
> > - Supports backwards compatibility
> > - No double metrics
> > *Cons:*
> > - Increases the maintenance burden.
> > - Would require future migrations
> >
> > *4. Make a clean break and swap the types in flink-metrics-prometheus,
> > releasing it in 1.18 or 1.19 with a note.*
> >
> > *Pros:*
> > - Avoids duplicate metrics and packages.
> > - No future maintenance burden.
> > *Cons:*
> > -Introduces a breaking change.
> > - Metrics may silently fail in dashboards if the graphs do not support
> the
> > new data type (I would need to conduct more testing to determine how
> often
> > this occurs).
> >
> > I lean towards option 4, and we would communicate the change internally
> as
> > part of a minor version upgrade. I'm open to other ideas and would
> welcome
> > further discussion on what the OSS community prefers.
> >
> > Thanks,
> >
> > Ryan van Huuksloot
> > Sr. Production Engineer | Streaming Platform
> > [image: Shopify]
> > <https://www.shopify.com/?utm_medium=salessignatures_source=hs_email
> >
> >
> >
> > On Thu, Jun 29, 2023 at 4:23 AM Martijn Visser  >
> > wrote:
> >
> > > Hi Ryan,
> > >
> > > I think there definitely is an interest in the
> > > flink-metrics-prometheus, but I do see some challenges as well. Given
> > > that the Prometheus simpleclient doesn't yet have a major version,
> > > there are breaking changes happening in that. If we would update this,
> > > it can/probably breaks the metrics for users, which is an undesirable
> > > situation. Any thoughts on how we could avoid that situation?
> > >
> > > Best regards,
> > >
> > > Martijn
> > >
> > > On Tue, Jun 20, 2023 at 3:53 PM Ryan van Huuksloot
> > >  wrote:
> > > >
> > > > Following up, any interest in flink-metrics-prometheus? It is quite a
> > > stale
> > > > package. I would be interested in contributing - time permitting.
> > > >
> > > > Ryan van Huuksloot
> > > > Sr. Production Engineer | Streaming Platform
> > > > [image: Shopify]
> > > > <
> > https://www.shopify.com/?utm_medium=salessignatures_source=hs_email
> > > >
> > > >
> > > >
> > > > On Thu, Jun 15, 2023 at 2:16 PM Ryan van Huuksloot <
> > > > ryan.vanhuuksl...@shopify.com> wrote:
> > > >
> > > > > Hello,
> > > > >
> > > > > Internally we use the flink-metrics-prometheus jar and we noticed
> > that
> > > the
> > > > > code is a little out of date. Primarily, there are new metric types
> > in
> > > > > Prometheus that would allow for the exporter to write Counters and
> > > > > Histograms as Native metrics in prometheus (vs writing as Gauges).
> > > > >
> > > > > I noticed that there was a closed PR for the simpleclient:
> > > > > https://github.com/apache/flink/pull/21047 - which has what is
> > > required
> > > > > for the native metrics but may cause other maintenance tickets.
> > > > >
> > > > > Is there any appetite from the community to update this exporter?
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Ryan van Huuksloot
> > > > > Sr. Production Engineer | Streaming Platform
> > > > > [image: Shopify]
> > > > > <
> > >
> https://www.shopify.com/?utm_medium=salessignatures_source=hs_email>
> > > > >
> > >
> >
>


Re: Flink-Metrics Prometheus - Native Histograms / Native Counters

2023-06-30 Thread Martijn Visser
Hi Ryan,

I think options 2 and 3 won't work, because there can be only one
version of the client. I don't think we should make a clean break on
metrics in a minor version, only in a major one. All in all, I think
option 1 would be the best. We could deprecate the existing package and
remove it with Flink 2.0, imho.

Best regards,

Martijn

On Thu, Jun 29, 2023 at 5:56 PM Ryan van Huuksloot
 wrote:

> Hi Martijn,
>
> Our team shared the same concern. We've considered a few options:
>
>
> *1. Add a new package such as `flink-metrics-prometheus-native` and
> eventually deprecate the original.*
> *Pros:*
> - Supports backward compatibility.
> *Cons:*
> - Two packages to maintain in the interim.
> - Not consistent with other metrics packages.
>
> *2. Maintain the same logic in flink-metrics-prometheus and write new
> natively typed metrics to a different metric name in Prometheus, in
> addition to the original metric.*
>
> *Pros:*
> - Supports backward compatibility.
> *Cons:*
> - Nearly doubles the metrics being captured by default.
> - The naming convention will permanently differ when the original names are
> deprecated.
> - The original names will likely be deprecated at some point.
>
> *3. Maintain the same logic in flink-metrics-prometheus. However, if you
> use a flink-conf option, natively typed metrics would be written to the
> same names instead of the original metric types.*
>
> *Pros:*
> - Supports backwards compatibility
> - No double metrics
> *Cons:*
> - Increases the maintenance burden.
> - Would require future migrations
>
> *4. Make a clean break and swap the types in flink-metrics-prometheus,
> releasing it in 1.18 or 1.19 with a note.*
>
> *Pros:*
> - Avoids duplicate metrics and packages.
> - No future maintenance burden.
> *Cons:*
> -Introduces a breaking change.
> - Metrics may silently fail in dashboards if the graphs do not support the
> new data type (I would need to conduct more testing to determine how often
> this occurs).
>
> I lean towards option 4, and we would communicate the change internally as
> part of a minor version upgrade. I'm open to other ideas and would welcome
> further discussion on what the OSS community prefers.
>
> Thanks,
>
> Ryan van Huuksloot
> Sr. Production Engineer | Streaming Platform
> [image: Shopify]
> <https://www.shopify.com/?utm_medium=salessignatures_source=hs_email>
>
>
> On Thu, Jun 29, 2023 at 4:23 AM Martijn Visser 
> wrote:
>
> > Hi Ryan,
> >
> > I think there definitely is an interest in the
> > flink-metrics-prometheus, but I do see some challenges as well. Given
> > that the Prometheus simpleclient doesn't yet have a major version,
> > there are breaking changes happening in that. If we would update this,
> > it can/probably breaks the metrics for users, which is an undesirable
> > situation. Any thoughts on how we could avoid that situation?
> >
> > Best regards,
> >
> > Martijn
> >
> > On Tue, Jun 20, 2023 at 3:53 PM Ryan van Huuksloot
> >  wrote:
> > >
> > > Following up, any interest in flink-metrics-prometheus? It is quite a
> > stale
> > > package. I would be interested in contributing - time permitting.
> > >
> > > Ryan van Huuksloot
> > > Sr. Production Engineer | Streaming Platform
> > > [image: Shopify]
> > > <
> https://www.shopify.com/?utm_medium=salessignatures_source=hs_email
> > >
> > >
> > >
> > > On Thu, Jun 15, 2023 at 2:16 PM Ryan van Huuksloot <
> > > ryan.vanhuuksl...@shopify.com> wrote:
> > >
> > > > Hello,
> > > >
> > > > Internally we use the flink-metrics-prometheus jar and we noticed
> that
> > the
> > > > code is a little out of date. Primarily, there are new metric types
> in
> > > > Prometheus that would allow for the exporter to write Counters and
> > > > Histograms as Native metrics in prometheus (vs writing as Gauges).
> > > >
> > > > I noticed that there was a closed PR for the simpleclient:
> > > > https://github.com/apache/flink/pull/21047 - which has what is
> > required
> > > > for the native metrics but may cause other maintenance tickets.
> > > >
> > > > Is there any appetite from the community to update this exporter?
> > > >
> > > > Thanks,
> > > >
> > > > Ryan van Huuksloot
> > > > Sr. Production Engineer | Streaming Platform
> > > > [image: Shopify]
> > > > <
> > https://www.shopify.com/?utm_medium=salessignatures_source=hs_email>
> > > >
> >
>


Re: Flink-Metrics Prometheus - Native Histograms / Native Counters

2023-06-29 Thread Ryan van Huuksloot
Hi Martijn,

Our team shared the same concern. We've considered a few options:


*1. Add a new package such as `flink-metrics-prometheus-native` and
eventually deprecate the original.*
*Pros:*
- Supports backward compatibility.
*Cons:*
- Two packages to maintain in the interim.
- Not consistent with other metrics packages.

*2. Maintain the same logic in flink-metrics-prometheus and write new
natively typed metrics to a different metric name in Prometheus, in
addition to the original metric.*

*Pros:*
- Supports backward compatibility.
*Cons:*
- Nearly doubles the metrics being captured by default.
- The naming convention will permanently differ when the original names are
deprecated.
- The original names will likely be deprecated at some point.

*3. Maintain the same logic in flink-metrics-prometheus. However, if you
use a flink-conf option, natively typed metrics would be written to the
same names instead of the original metric types.*

*Pros:*
- Supports backward compatibility.
- No duplicate metrics.
*Cons:*
- Increases the maintenance burden.
- Would require future migrations.

*4. Make a clean break and swap the types in flink-metrics-prometheus,
releasing it in 1.18 or 1.19 with a note.*

*Pros:*
- Avoids duplicate metrics and packages.
- No future maintenance burden.
*Cons:*
- Introduces a breaking change.
- Metrics may silently fail in dashboards if the graphs do not support the
new data type (I would need to conduct more testing to determine how often
this occurs).

I lean towards option 4, and we would communicate the change internally as
part of a minor version upgrade. I'm open to other ideas and would welcome
further discussion on what the OSS community prefers.
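
To illustrate how option 3 might behave, here is a minimal sketch of a reporter choosing the Prometheus type from a configuration flag. The option name `metrics.reporter.prom.native-types` is hypothetical (no such key is proposed in this thread), the lookup tables only cover counters and gauges, and histograms are left out on purpose.

```python
# Hypothetical option name; nothing in the thread defines the real key.
NATIVE_TYPES_KEY = "metrics.reporter.prom.native-types"

# Counters are written as gauges today, as described above.
LEGACY_TYPES = {"counter": "gauge", "gauge": "gauge"}
# With the flag enabled, counters would keep their native type.
NATIVE_TYPES = {"counter": "counter", "gauge": "gauge"}


def prometheus_type(flink_kind, conf):
    """Pick the Prometheus type for a Flink metric kind based on the flag."""
    native = str(conf.get(NATIVE_TYPES_KEY, "false")).strip().lower() == "true"
    table = NATIVE_TYPES if native else LEGACY_TYPES
    return table[flink_kind]


print(prometheus_type("counter", {}))                          # → gauge
print(prometheus_type("counter", {NATIVE_TYPES_KEY: "true"}))  # → counter
```

Defaulting the flag to off is what would preserve backward compatibility here; the extra branch is also the maintenance burden listed under the cons.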

Thanks,

Ryan van Huuksloot
Sr. Production Engineer | Streaming Platform


On Thu, Jun 29, 2023 at 4:23 AM Martijn Visser 
wrote:

> Hi Ryan,
>
> I think there definitely is an interest in the
> flink-metrics-prometheus, but I do see some challenges as well. Given
> that the Prometheus simpleclient doesn't yet have a major version,
> there are breaking changes happening in that. If we would update this,
> it can/probably breaks the metrics for users, which is an undesirable
> situation. Any thoughts on how we could avoid that situation?
>
> Best regards,
>
> Martijn
>
> On Tue, Jun 20, 2023 at 3:53 PM Ryan van Huuksloot
>  wrote:
> >
> > Following up, any interest in flink-metrics-prometheus? It is quite a
> stale
> > package. I would be interested in contributing - time permitting.
> >
> > Ryan van Huuksloot
> > Sr. Production Engineer | Streaming Platform
> > [image: Shopify]
> > <https://www.shopify.com/?utm_medium=salessignatures_source=hs_email
> >
> >
> >
> > On Thu, Jun 15, 2023 at 2:16 PM Ryan van Huuksloot <
> > ryan.vanhuuksl...@shopify.com> wrote:
> >
> > > Hello,
> > >
> > > Internally we use the flink-metrics-prometheus jar and we noticed that
> the
> > > code is a little out of date. Primarily, there are new metric types in
> > > Prometheus that would allow for the exporter to write Counters and
> > > Histograms as Native metrics in prometheus (vs writing as Gauges).
> > >
> > > I noticed that there was a closed PR for the simpleclient:
> > > https://github.com/apache/flink/pull/21047 - which has what is
> required
> > > for the native metrics but may cause other maintenance tickets.
> > >
> > > Is there any appetite from the community to update this exporter?
> > >
> > > Thanks,
> > >
> > > Ryan van Huuksloot
> > > Sr. Production Engineer | Streaming Platform
> > > [image: Shopify]
> > > <
> https://www.shopify.com/?utm_medium=salessignatures_source=hs_email>
> > >
>


Re: Flink-Metrics Prometheus - Native Histograms / Native Counters

2023-06-29 Thread Martijn Visser
Hi Ryan,

I think there definitely is interest in flink-metrics-prometheus, but I
do see some challenges as well. Given that the Prometheus simpleclient
doesn't yet have a major version, breaking changes still happen in it.
If we update it, that can (and probably will) break the metrics for
users, which is an undesirable situation. Any thoughts on how we could
avoid that?

Best regards,

Martijn

On Tue, Jun 20, 2023 at 3:53 PM Ryan van Huuksloot
 wrote:
>
> Following up, any interest in flink-metrics-prometheus? It is quite a stale
> package. I would be interested in contributing - time permitting.
>
> Ryan van Huuksloot
> Sr. Production Engineer | Streaming Platform
> [image: Shopify]
> <https://www.shopify.com/?utm_medium=salessignatures_source=hs_email>
>
>
> On Thu, Jun 15, 2023 at 2:16 PM Ryan van Huuksloot <
> ryan.vanhuuksl...@shopify.com> wrote:
>
> > Hello,
> >
> > Internally we use the flink-metrics-prometheus jar and we noticed that the
> > code is a little out of date. Primarily, there are new metric types in
> > Prometheus that would allow for the exporter to write Counters and
> > Histograms as Native metrics in prometheus (vs writing as Gauges).
> >
> > I noticed that there was a closed PR for the simpleclient:
> > https://github.com/apache/flink/pull/21047 - which has what is required
> > for the native metrics but may cause other maintenance tickets.
> >
> > Is there any appetite from the community to update this exporter?
> >
> > Thanks,
> >
> > Ryan van Huuksloot
> > Sr. Production Engineer | Streaming Platform
> >


Re: Flink-Metrics Prometheus - Native Histograms / Native Counters

2023-06-20 Thread Ryan van Huuksloot
Following up, any interest in flink-metrics-prometheus? It is quite a stale
package. I would be interested in contributing - time permitting.

Ryan van Huuksloot
Sr. Production Engineer | Streaming Platform


On Thu, Jun 15, 2023 at 2:16 PM Ryan van Huuksloot <
ryan.vanhuuksl...@shopify.com> wrote:

> Hello,
>
> Internally we use the flink-metrics-prometheus jar and we noticed that the
> code is a little out of date. Primarily, there are new metric types in
> Prometheus that would allow for the exporter to write Counters and
> Histograms as Native metrics in prometheus (vs writing as Gauges).
>
> I noticed that there was a closed PR for the simpleclient:
> https://github.com/apache/flink/pull/21047 - which has what is required
> for the native metrics but may cause other maintenance tickets.
>
> Is there any appetite from the community to update this exporter?
>
> Thanks,
>
> Ryan van Huuksloot
> Sr. Production Engineer | Streaming Platform
>


Flink-Metrics Prometheus - Native Histograms / Native Counters

2023-06-15 Thread Ryan van Huuksloot
Hello,

Internally we use the flink-metrics-prometheus jar and we noticed that the
code is a little out of date. Primarily, there are new metric types in
Prometheus that would allow for the exporter to write Counters and
Histograms as Native metrics in prometheus (vs writing as Gauges).

I noticed that there was a closed PR for the simpleclient:
https://github.com/apache/flink/pull/21047 - which has what is required for
the native metrics but may cause other maintenance tickets.

Is there any appetite from the community to update this exporter?

Thanks,

Ryan van Huuksloot
Sr. Production Engineer | Streaming Platform


[jira] [Created] (FLINK-29970) Prometheus cannot collect flink metrics

2022-11-09 Thread Jira
钟洋洋 created FLINK-29970:
---

 Summary: Prometheus cannot collect flink metrics
 Key: FLINK-29970
 URL: https://issues.apache.org/jira/browse/FLINK-29970
 Project: Flink
  Issue Type: Bug
  Components: Deployment / Kubernetes
Affects Versions: 1.14.6, 1.15.2
Reporter: 钟洋洋


When I use the native k8s method to deploy my Flink application cluster, the 
Prometheus instance I deployed in k8s cannot collect the metrics from Flink 
unless I manually deploy a Service to expose port 9249 of my jobmanager and 
taskmanager. Should Flink generate these Services itself when deploying?

My deploy command is (some content is omitted):
{code:sh}
flink run-application --target-application -Dmetrics.reporters=prom 
-Dmetric.reporter.prom.class=org.apache.flink.metrics.prometheus.PrometheusReport
 
-Dkubernetes.jobmanager.annotaions=prometheus.io/scrape:true,prometheus.io/port:9249
 
-DDkubernetes.jobmanager.annotaions=prometheus.io/scrape:true,prometheus.io/port:9249{code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


FW: 1.13.2 - Flink Metrics not updated during job execution

2022-08-03 Thread Oliveira, Joao / Kuehne + Nagel / Opo MI-DH


From: "Oliveira, Joao / Kuehne + Nagel / Opo MI-DH" 

Date: Wednesday, 3 August 2022 at 09:54
To: "u...@flink.apache.org" 
Cc: "Melo, Jose / Kuehne + Nagel / Opo MI-DH" 
Subject: 1.13.2 - Flink Metrics not updated during job execution

Hi Flink team,

My name is João Oliveira. I live in Portugal and work at Kuehne+Nagel.

We are building a process that handles millions of records with Apache Flink 
to support logistics data pipelines. We are moving from Kinesis sources/sinks 
to Kafka sources/sinks.

However, in the Flink dashboard, the job metrics are not being updated in near 
real time. Do you know what could be wrong with the job or the version? The 
evidence is in the attachments.

Best,

João Oliveira
Kuehne+Nagel


[jira] [Created] (FLINK-27914) Integrate JOSDK metrics with Flink Metrics reporter

2022-06-06 Thread Gyula Fora (Jira)
Gyula Fora created FLINK-27914:
--

 Summary: Integrate JOSDK metrics with Flink Metrics reporter
 Key: FLINK-27914
 URL: https://issues.apache.org/jira/browse/FLINK-27914
 Project: Flink
  Issue Type: New Feature
  Components: Kubernetes Operator
Reporter: Gyula Fora
 Fix For: kubernetes-operator-1.1.0


The Java Operator SDK comes with an internal metric interface that could be 
implemented to forward metrics/measurements to the Flink metric registries. 

We should investigate and implement this if possible.





[jira] [Created] (FLINK-27661) [Metric]Flink-Metrics pushgateway support authentication

2022-05-17 Thread jiangchunyang (Jira)
jiangchunyang created FLINK-27661:
-

 Summary: [Metric]Flink-Metrics pushgateway support authentication
 Key: FLINK-27661
 URL: https://issues.apache.org/jira/browse/FLINK-27661
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Metrics
 Environment: Flink:1.13.0
Reporter: jiangchunyang
 Fix For: 1.13.0


We found that the native PushGateway support does not handle authentication. 
As a result, metrics from jobs running in on-YARN mode cannot be reported to a 
PushGateway that requires authentication.

Although there are other options, such as writing the metrics to files, we 
think PushGateway is the best solution.

So I decided to implement this myself and will submit a PR to the community 
later.





[jira] [Created] (FLINK-27163) Fix typo issue in Flink Metrics documentation

2022-04-11 Thread hao wang (Jira)
hao wang created FLINK-27163:


 Summary: Fix typo issue in Flink Metrics documentation
 Key: FLINK-27163
 URL: https://issues.apache.org/jira/browse/FLINK-27163
 Project: Flink
  Issue Type: Bug
  Components: Documentation
Affects Versions: 1.15.0, 1.16.0
Reporter: hao wang
 Attachments: 20220411145958283.png

The Cluster module in the metrics documentation has five items, but only four 
are specified.





[jira] [Created] (FLINK-26230) [JUnit5 Migration] Module: flink-metrics-core

2022-02-17 Thread Chesnay Schepler (Jira)
Chesnay Schepler created FLINK-26230:


 Summary: [JUnit5 Migration] Module: flink-metrics-core
 Key: FLINK-26230
 URL: https://issues.apache.org/jira/browse/FLINK-26230
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Metrics, Tests
Reporter: Chesnay Schepler
Assignee: Chesnay Schepler
 Fix For: 1.15.0








[jira] [Created] (FLINK-26229) [JUnit5 Migration] Module: flink-metrics-datadog

2022-02-17 Thread Chesnay Schepler (Jira)
Chesnay Schepler created FLINK-26229:


 Summary: [JUnit5 Migration] Module: flink-metrics-datadog
 Key: FLINK-26229
 URL: https://issues.apache.org/jira/browse/FLINK-26229
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Metrics, Tests
Reporter: Chesnay Schepler
Assignee: Chesnay Schepler
 Fix For: 1.15.0








[jira] [Created] (FLINK-26228) [JUnit5 Migration] Module: flink-metrics-dropwizard

2022-02-17 Thread Chesnay Schepler (Jira)
Chesnay Schepler created FLINK-26228:


 Summary: [JUnit5 Migration] Module: flink-metrics-dropwizard
 Key: FLINK-26228
 URL: https://issues.apache.org/jira/browse/FLINK-26228
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Metrics, Tests
Reporter: Chesnay Schepler
Assignee: Chesnay Schepler
 Fix For: 1.15.0








[jira] [Created] (FLINK-26227) [JUnit5 Migration] Module: flink-metrics-graphite

2022-02-17 Thread Chesnay Schepler (Jira)
Chesnay Schepler created FLINK-26227:


 Summary: [JUnit5 Migration] Module: flink-metrics-graphite
 Key: FLINK-26227
 URL: https://issues.apache.org/jira/browse/FLINK-26227
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Metrics, Tests
Reporter: Chesnay Schepler
Assignee: Chesnay Schepler
 Fix For: 1.15.0








[jira] [Created] (FLINK-26226) [JUnit5 Migration] Module: flink-metrics-influxdb

2022-02-17 Thread Chesnay Schepler (Jira)
Chesnay Schepler created FLINK-26226:


 Summary: [JUnit5 Migration] Module: flink-metrics-influxdb
 Key: FLINK-26226
 URL: https://issues.apache.org/jira/browse/FLINK-26226
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Metrics, Tests
Reporter: Chesnay Schepler
Assignee: Chesnay Schepler
 Fix For: 1.15.0








[jira] [Created] (FLINK-26225) [JUnit5 Migration] Module: flink-metrics-jmx

2022-02-17 Thread Chesnay Schepler (Jira)
Chesnay Schepler created FLINK-26225:


 Summary: [JUnit5 Migration] Module: flink-metrics-jmx
 Key: FLINK-26225
 URL: https://issues.apache.org/jira/browse/FLINK-26225
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Metrics, Tests
Reporter: Chesnay Schepler
Assignee: Chesnay Schepler
 Fix For: 1.15.0








[jira] [Created] (FLINK-26221) [JUnit5 Migration] Module: flink-metrics-prometheus

2022-02-17 Thread Chesnay Schepler (Jira)
Chesnay Schepler created FLINK-26221:


 Summary: [JUnit5 Migration] Module: flink-metrics-prometheus
 Key: FLINK-26221
 URL: https://issues.apache.org/jira/browse/FLINK-26221
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Metrics, Tests
Reporter: Chesnay Schepler
Assignee: Chesnay Schepler
 Fix For: 1.15.0








[jira] [Created] (FLINK-26219) [JUnit5 Migration] Module: flink-metrics-slf4j

2022-02-17 Thread Chesnay Schepler (Jira)
Chesnay Schepler created FLINK-26219:


 Summary: [JUnit5 Migration] Module: flink-metrics-slf4j
 Key: FLINK-26219
 URL: https://issues.apache.org/jira/browse/FLINK-26219
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Metrics, Tests
Reporter: Chesnay Schepler
Assignee: Chesnay Schepler
 Fix For: 1.15.0








[jira] [Created] (FLINK-26215) [JUnit5 Migration] Module: flink-metrics-statsd

2022-02-17 Thread Chesnay Schepler (Jira)
Chesnay Schepler created FLINK-26215:


 Summary: [JUnit5 Migration] Module: flink-metrics-statsd
 Key: FLINK-26215
 URL: https://issues.apache.org/jira/browse/FLINK-26215
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Metrics
Reporter: Chesnay Schepler
Assignee: Chesnay Schepler
 Fix For: 1.15.0








[jira] [Created] (FLINK-24896) Failed to execute goal com.github.siom79.japicmp:japicmp-maven-plugin:0.11.0:cmp (default) on project flink-metrics-core

2021-11-15 Thread Yun Gao (Jira)
Yun Gao created FLINK-24896:
---

 Summary: Failed to execute goal 
com.github.siom79.japicmp:japicmp-maven-plugin:0.11.0:cmp (default) on project 
flink-metrics-core
 Key: FLINK-24896
 URL: https://issues.apache.org/jira/browse/FLINK-24896
 Project: Flink
  Issue Type: Bug
  Components: Build System / Azure Pipelines
Affects Versions: 1.15.0
Reporter: Yun Gao


{code:java}
[ERROR] Failed to execute goal 
com.github.siom79.japicmp:japicmp-maven-plugin:0.11.0:cmp (default) on project 
flink-metrics-core: Execution default of goal 
com.github.siom79.japicmp:japicmp-maven-plugin:0.11.0:cmp failed: Marshalling 
of XML document failed: Implementation of JAXB-API has not been found on module 
path or classpath. com.sun.xml.internal.bind.v2.ContextFactory -> [Help 1]
[ERROR] 
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e 
switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR] 
[ERROR] For more information about the errors and possible solutions, please 
read the following articles:
[ERROR] [Help 1] 
http://cwiki.apache.org/confluence/display/MAVEN/PluginExecutionException
[ERROR] 
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR]   mvn  -rf :flink-metrics-core
 {code}





[jira] [Created] (FLINK-24514) Incorrect Flink Metrics on Job-Internal-Restart:

2021-10-12 Thread Alok Singh (Jira)
Alok Singh created FLINK-24514:
--

 Summary: Incorrect Flink Metrics on Job-Internal-Restart:
 Key: FLINK-24514
 URL: https://issues.apache.org/jira/browse/FLINK-24514
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Metrics
Affects Versions: 1.12.1
Reporter: Alok Singh
 Attachments: Screenshot 2021-10-12 at 4.46.49 PM.png, Screenshot 
2021-10-12 at 4.47.17 PM.png, Screenshot 2021-10-12 at 4.47.29 PM.png, 
Screenshot 2021-10-12 at 4.47.41 PM.png

We have been seeing metrics report values that are several times too high 
after the Flink job restarts internally (due to internal exceptions; for 
example, during deployment the job did not get its Task Managers in time and 
then restarted on its own).

Metrics implementation:
 # We implemented the metrics using Meter.
 # We use Accumulators.scala to define each metric name as a Value, and we use 
that name as the key and a MeterView as the value in a Map defined in 
CustomMetrics.scala.
 # To create each MeterView, we use an instance of the AtomicLongCounter.scala 
class, which extends the Counter interface and overrides its methods. (The code 
files are attached to help explain this.)
 # We register the metrics inside FilterReportsForSummaryAnalysis.scala.
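
For readers without the attachments, here is a minimal sketch of what a counter like the one in step 3 could look like. It is a hypothetical reconstruction, not the reporter's code: the `Counter` interface below is a local stand-in that mirrors the methods of `org.apache.flink.metrics.Counter` so the snippet compiles without Flink on the classpath, and the class body is an assumption based only on the name AtomicLongCounter.

```java
import java.util.concurrent.atomic.AtomicLong;

// Local stand-in mirroring the methods of org.apache.flink.metrics.Counter,
// so this sketch compiles without Flink on the classpath.
interface Counter {
    void inc();
    void inc(long n);
    void dec();
    void dec(long n);
    long getCount();
}

// Hypothetical reconstruction of the AtomicLongCounter named in the report.
final class AtomicLongCounter implements Counter {
    // AtomicLong keeps increments safe when several threads update the counter.
    private final AtomicLong count = new AtomicLong();

    @Override public void inc() { count.incrementAndGet(); }
    @Override public void inc(long n) { count.addAndGet(n); }
    @Override public void dec() { count.decrementAndGet(); }
    @Override public void dec(long n) { count.addAndGet(-n); }
    @Override public long getCount() { return count.get(); }
}
```

In a real job an instance like this would be wrapped in a MeterView, as the steps above describe; that part is left out here because it needs the Flink classes.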

Some points to remember:
 # Not all internal job restarts cause incorrect metrics.
 # When an internal job restart has caused incorrect metrics and we then 
manually restart the job (killing it and restarting with or without a 
savepoint), the metrics show correct values after the manual restart (provided 
that no further exception occurs that could cause another internal restart).
 # We are using the Flink Delay Restart Strategy.





[jira] [Created] (FLINK-21773) Remove dependency from flink-metrics-statsd

2021-03-13 Thread Chesnay Schepler (Jira)
Chesnay Schepler created FLINK-21773:


 Summary: Remove dependency from flink-metrics-statsd
 Key: FLINK-21773
 URL: https://issues.apache.org/jira/browse/FLINK-21773
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Metrics
Reporter: Chesnay Schepler
Assignee: Chesnay Schepler
 Fix For: 1.13.0








[jira] [Created] (FLINK-21772) Remove dependency from flink-metrics-slf4j

2021-03-13 Thread Chesnay Schepler (Jira)
Chesnay Schepler created FLINK-21772:


 Summary: Remove dependency from flink-metrics-slf4j
 Key: FLINK-21772
 URL: https://issues.apache.org/jira/browse/FLINK-21772
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Metrics
Reporter: Chesnay Schepler
Assignee: Chesnay Schepler
 Fix For: 1.13.0








[jira] [Created] (FLINK-21761) Remove dependency from flink-metrics-influxdb

2021-03-13 Thread Chesnay Schepler (Jira)
Chesnay Schepler created FLINK-21761:


 Summary: Remove dependency from flink-metrics-influxdb
 Key: FLINK-21761
 URL: https://issues.apache.org/jira/browse/FLINK-21761
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Metrics
Reporter: Chesnay Schepler
Assignee: Chesnay Schepler
 Fix For: 1.13.0








[jira] [Created] (FLINK-21762) Remove dependency from flink-metrics-prometheus

2021-03-13 Thread Chesnay Schepler (Jira)
Chesnay Schepler created FLINK-21762:


 Summary: Remove dependency from flink-metrics-prometheus
 Key: FLINK-21762
 URL: https://issues.apache.org/jira/browse/FLINK-21762
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Metrics
Reporter: Chesnay Schepler
Assignee: Chesnay Schepler
 Fix For: 1.13.0








[jira] [Created] (FLINK-21760) Remove dependency from flink-metrics-jmx

2021-03-13 Thread Chesnay Schepler (Jira)
Chesnay Schepler created FLINK-21760:


 Summary: Remove dependency from flink-metrics-jmx
 Key: FLINK-21760
 URL: https://issues.apache.org/jira/browse/FLINK-21760
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Metrics
Reporter: Chesnay Schepler
Assignee: Chesnay Schepler
 Fix For: 1.13.0








[jira] [Created] (FLINK-18573) plugins/metrics-influx/flink-metrics-influxdb-1.11.0.jar META-INF has no dir named "services" , but "service"

2020-07-12 Thread zhangyunyun (Jira)
zhangyunyun created FLINK-18573:
---

 Summary: plugins/metrics-influx/flink-metrics-influxdb-1.11.0.jar 
META-INF has no dir named "services" , but "service"
 Key: FLINK-18573
 URL: https://issues.apache.org/jira/browse/FLINK-18573
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Metrics
Affects Versions: 1.11.0
Reporter: zhangyunyun


It causes the error:

 
2020-07-13 09:08:46.146 [main] WARN 
org.apache.flink.runtime.metrics.ReporterSetup - The reporter factory 
(org.apache.flink.metrics.influxdb.InfluxdbReporterFactory) could not be found 
for reporter influxdb. Available factories: 
[org.apache.flink.metrics.slf4j.Slf4jReporterFactory, 
org.apache.flink.metrics.datadog.DatadogHttpReporterFactory, 
org.apache.flink.metrics.graphite.GraphiteReporterFactory, 
org.apache.flink.metrics.prometheus.PrometheusPushGatewayReporterFactory, 
org.apache.flink.metrics.statsd.StatsDReporterFactory, 
org.apache.flink.metrics.prometheus.PrometheusReporterFactory, 
org.apache.flink.metrics.jmx.JMXReporterFactory].
2020-07-13 09:08:46.149 [main] INFO 
org.apache.flink.runtime.metrics.MetricRegistryImpl - Periodically reporting 
metrics in intervals of 60 SECONDS for reporter slf4j of type 
org.apache.flink.metrics.slf4j.Slf4jReporter.





[jira] [Created] (FLINK-17556) FATAL: Thread 'flink-metrics-akka.remote.default-remote-dispatcher-3' produced an uncaught exception. Stopping the process... java.lang.OutOfMemoryError: Direct buffer m

2020-05-07 Thread Tammy zhang (Jira)
Tammy zhang created FLINK-17556:
---

 Summary: FATAL: Thread 
'flink-metrics-akka.remote.default-remote-dispatcher-3' produced an uncaught 
exception. Stopping the process... java.lang.OutOfMemoryError: Direct buffer 
memory
 Key: FLINK-17556
 URL: https://issues.apache.org/jira/browse/FLINK-17556
 Project: Flink
  Issue Type: Bug
Reporter: Tammy zhang


My job consumes data from Kafka and then processes it. After the job has been 
running for a while, the following error appears: 

ERROR org.apache.flink.runtime.util.FatalExitExceptionHandler - FATAL: Thread 
'flink-metrics-akka.remote.default-remote-dispatcher-3' produced an uncaught 
exception. Stopping the process...
java.lang.OutOfMemoryError: Direct buffer memory

I have set the "max.poll.records" property to "250", but that does not help. 





[jira] [Created] (FLINK-16635) Incompatible okio dependency in flink-metrics-influxdb module

2020-03-17 Thread Till Rohrmann (Jira)
Till Rohrmann created FLINK-16635:
-

 Summary: Incompatible okio dependency in flink-metrics-influxdb 
module
 Key: FLINK-16635
 URL: https://issues.apache.org/jira/browse/FLINK-16635
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Metrics
Affects Versions: 1.10.0
Reporter: Till Rohrmann
 Fix For: 1.10.1, 1.11.0


With FLINK-12147 we bumped {{influxdb-java}} from version {{2.14}} to {{2.16}}. 
At the same time we pinned the okio dependency to version {{1.14.0}}. Since the 
{{influxdb-java}} transitive dependency {{converter-moshi:jar:2.6.1}} requires 
{{moshi:jar:1.8.0}}, which in turn requires {{okio:jar:1.16.0}}, the influxdb 
metric reporter fails as described 
[here|https://stackoverflow.com/q/60667654/4815083]. We should fix this 
incompatibility by removing the dependency management entry for okio.





[jira] [Created] (FLINK-15832) Artifact flink-metrics-core-tests will be uploaded twice

2020-01-31 Thread static-max (Jira)
static-max created FLINK-15832:
--

 Summary: Artifact flink-metrics-core-tests will be uploaded twice
 Key: FLINK-15832
 URL: https://issues.apache.org/jira/browse/FLINK-15832
 Project: Flink
  Issue Type: Bug
  Components: Build System
Affects Versions: 1.9.1
Reporter: static-max


I built Flink 1.9.1 myself and merged the changes from 
[https://github.com/apache/flink/pull/10936].

When I uploaded the artifacts to our repository (using 
{{mvn deploy -DaltDeploymentRepository}}), the build failed because 
{{flink-metrics-core-tests}} would be uploaded twice and we have redeployments 
disabled.
 
I'm not sure if other artifacts are affected as well, as I enabled redeployment 
as a quick workaround.





[jira] [Created] (FLINK-15356) Add applicationId to existing flink metrics running on yarn

2019-12-20 Thread Forward Xu (Jira)
Forward Xu created FLINK-15356:
--

 Summary: Add applicationId to existing flink metrics running on 
yarn
 Key: FLINK-15356
 URL: https://issues.apache.org/jira/browse/FLINK-15356
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Metrics
Reporter: Forward Xu


When metrics are sent to systems such as Prometheus, they carry only the Flink 
job ID. That ID is a UUID, which cannot be associated with the application 
on YARN. Therefore, we need to add the applicationId to the metrics when 
running on YARN. This helps us accurately find the corresponding job on YARN 
when a metric is abnormal.





[jira] [Created] (FLINK-14831) When edit flink-metrics-influxdb, need add metrics.md by hand

2019-11-16 Thread ouyangwulin (Jira)
ouyangwulin created FLINK-14831:
---

 Summary: When edit flink-metrics-influxdb,  need add metrics.md by 
hand
 Key: FLINK-14831
 URL: https://issues.apache.org/jira/browse/FLINK-14831
 Project: Flink
  Issue Type: Improvement
Reporter: ouyangwulin


When editing flink-metrics-influxdb, metrics.md has to be updated by hand, and 
the following command does not work:
{code:java}
mvn package -Dgenerate-config-docs -pl flink-docs -am -nsu -DskipTests{code}





Flink metrics with parallel operator

2019-07-31 Thread Sibendu Dey
Hello,

I have been working on a flink project and need some help with the metric
system.

I have logic inside a process function that emits a particular message to a
side output under certain failure conditions.

This process function has a parallelism > 1. How do I keep track of the
failed messages through a counter metric that is scoped to the operator
across all parallel instances?

Regards,
Sibendu


[jira] [Created] (FLINK-12423) Add Timer metric type for flink-metrics module

2019-05-06 Thread Armstrong Nova (JIRA)
Armstrong Nova created FLINK-12423:
--

 Summary: Add Timer metric type for flink-metrics module
 Key: FLINK-12423
 URL: https://issues.apache.org/jira/browse/FLINK-12423
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Metrics
Reporter: Armstrong Nova
Assignee: Armstrong Nova


Hi guys, 

    Currently, Flink supports registering only 4 metric types: {{Counters}}, 
{{Gauges}}, {{Histograms}} and {{Meters}}. If we want to measure time costs, 
for example the P75, P99 and P999 percentiles, we can currently only use a 
Histogram. But a Histogram does not support a TimeUnit (second, millisecond, 
microsecond, nanosecond), so it is not convenient to integrate with external 
metric systems. 

    Codahale/DropWizard supports a Timer metric, so we could wrap it in Flink. 
    Thanks for your time. 
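
To make the request concrete, here is a rough sketch of a Timer-style metric that records durations with a TimeUnit and reports percentiles in any requested unit. It is purely illustrative: `SimpleTimer` and its methods are hypothetical names, it is neither the Dropwizard Timer nor a proposed Flink API, and it keeps every sample in memory where a real implementation would use a bounded reservoir.

```java
import java.util.Arrays;
import java.util.concurrent.TimeUnit;

// Hypothetical Timer sketch; not a Flink or Dropwizard class.
final class SimpleTimer {
    private long[] samplesNanos = new long[16];
    private int size = 0;

    // Record one duration given in any TimeUnit; stored internally as nanoseconds.
    synchronized void update(long duration, TimeUnit unit) {
        if (size == samplesNanos.length) {
            samplesNanos = Arrays.copyOf(samplesNanos, size * 2);
        }
        samplesNanos[size++] = unit.toNanos(duration);
    }

    // Nearest-rank percentile (0 < p <= 100), converted to the requested unit.
    synchronized long percentile(double p, TimeUnit unit) {
        if (size == 0) {
            throw new IllegalStateException("no samples recorded");
        }
        if (p <= 0 || p > 100) {
            throw new IllegalArgumentException("p must be in (0, 100]");
        }
        long[] sorted = Arrays.copyOf(samplesNanos, size);
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p * size / 100.0); // 1-based rank
        return unit.convert(sorted[rank - 1], TimeUnit.NANOSECONDS);
    }
}
```

A caller would record with something like `timer.update(12, TimeUnit.MILLISECONDS)` and read back `timer.percentile(99, TimeUnit.MICROSECONDS)`, which is the unit handling a plain Histogram does not offer.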





[jira] [Created] (FLINK-11174) flink Metrics Prometheus labels support chinese

2018-12-16 Thread Fan weiwen (JIRA)
Fan weiwen created FLINK-11174:
--

 Summary: flink Metrics Prometheus labels support chinese
 Key: FLINK-11174
 URL: https://issues.apache.org/jira/browse/FLINK-11174
 Project: Flink
  Issue Type: Improvement
  Components: Metrics
Affects Versions: 1.7.0, 1.6.2
Reporter: Fan weiwen


I use Flink metrics with Prometheus, and my job name is in Chinese.

However, replaceInvalidChars in 
org.apache.flink.metrics.prometheus.AbstractPrometheusReporter only allows 
[a-zA-Z0-9:_], so every character of my job name is replaced.

I think restricting label keys to [a-zA-Z0-9:_] is fine, but could label 
values support Chinese?
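
The effect described above can be reproduced with a small stand-alone sketch. It only assumes what the report states, namely that everything outside [a-zA-Z0-9:_] is replaced; the class name `LabelSanitizer` and the choice of an underscore as the replacement character are assumptions made for this illustration, not a copy of the reporter's code.

```java
import java.util.regex.Pattern;

// Hypothetical stand-alone illustration; not the Flink reporter class itself.
final class LabelSanitizer {
    // Everything outside [a-zA-Z0-9:_] counts as invalid, as described in the report.
    private static final Pattern INVALID = Pattern.compile("[^a-zA-Z0-9:_]");

    // Assumption: each invalid character is replaced by a single underscore.
    static String replaceInvalidChars(String input) {
        return INVALID.matcher(input).replaceAll("_");
    }

    public static void main(String[] args) {
        // Two Chinese characters (written as Unicode escapes) followed by "-job".
        String jobName = "\u8ba2\u5355-job";
        System.out.println(replaceInvalidChars(jobName)); // prints "___job"
    }
}
```

A job name made up only of Chinese characters would therefore collapse into a run of underscores, so different jobs can no longer be told apart by that label value.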





[jira] [Created] (FLINK-11014) Relocate flink-metrics-graphite's dropwizard dependencies

2018-11-27 Thread Till Rohrmann (JIRA)
Till Rohrmann created FLINK-11014:
-

 Summary: Relocate flink-metrics-graphite's dropwizard dependencies
 Key: FLINK-11014
 URL: https://issues.apache.org/jira/browse/FLINK-11014
 Project: Flink
  Issue Type: Improvement
  Components: Build System, Metrics
Affects Versions: 1.7.0
Reporter: Till Rohrmann
 Fix For: 1.8.0


Currently the `flink-metrics-graphite` module shades its dropwizard 
dependencies. In order to not interfere with user code, I think it would be 
good to also relocate them.





[jira] [Created] (FLINK-10998) flink-metrics-ganglia has LGPL dependency

2018-11-23 Thread Chesnay Schepler (JIRA)
Chesnay Schepler created FLINK-10998:


 Summary: flink-metrics-ganglia has LGPL dependency
 Key: FLINK-10998
 URL: https://issues.apache.org/jira/browse/FLINK-10998
 Project: Flink
  Issue Type: Bug
  Components: Metrics
Affects Versions: 1.6.2, 1.5.5, 1.7.0
Reporter: Chesnay Schepler
Assignee: Chesnay Schepler
 Fix For: 1.5.6, 1.6.3, 1.7.0


{{flink-metrics-ganglia}} depends on {{info.ganglia.gmetric4j:gmetric4j}} which 
depends on {{org.acplt:oncrpc}}.

{{org.acplt:oncrpc}} is licensed under the LGPL, which is a 
[category-x|https://www.apache.org/legal/resolved.html#what-can-we-not-include-in-an-asf-project-category-x]
 license.

For the time being we should drop this module from all future releases.





[jira] [Created] (FLINK-10423) Forward RocksDB memory metrics to Flink metrics reporter

2018-09-25 Thread Seth Wiesman (JIRA)
Seth Wiesman created FLINK-10423:


 Summary: Forward RocksDB memory metrics to Flink metrics reporter 
 Key: FLINK-10423
 URL: https://issues.apache.org/jira/browse/FLINK-10423
 Project: Flink
  Issue Type: New Feature
  Components: Metrics, State Backends, Checkpointing
Reporter: Seth Wiesman
Assignee: Seth Wiesman


RocksDB exposes a number of metrics at the column family level, such as 
current memory usage and open memtables, that would be useful to users who 
want greater insight into what RocksDB is doing. This work is heavily inspired 
by the comments on this RocksDB issue thread 
(https://github.com/facebook/rocksdb/issues/3216#issuecomment-348779233).





[jira] [Created] (FLINK-10221) Flink metrics documentation io section table layout error

2018-08-27 Thread vinoyang (JIRA)
vinoyang created FLINK-10221:


 Summary: Flink metrics documentation io section table layout error
 Key: FLINK-10221
 URL: https://issues.apache.org/jira/browse/FLINK-10221
 Project: Flink
  Issue Type: Bug
Affects Versions: 1.6.0
Reporter: vinoyang
Assignee: vinoyang


See here:

https://ci.apache.org/projects/flink/flink-docs-release-1.6/monitoring/metrics.html#io





[jira] [Created] (FLINK-10035) ConcurrentModificationException with flink-metrics-slf4j

2018-08-02 Thread Nico Kruber (JIRA)
Nico Kruber created FLINK-10035:
---

 Summary: ConcurrentModificationException with flink-metrics-slf4j
 Key: FLINK-10035
 URL: https://issues.apache.org/jira/browse/FLINK-10035
 Project: Flink
  Issue Type: Bug
  Components: Metrics
Affects Versions: 1.5.2
Reporter: Nico Kruber


{code}
2018-08-02 15:45:08,052 WARN  org.apache.flink.runtime.metrics.MetricRegistryImpl   - Error while reporting metrics
java.util.ConcurrentModificationException
at java.util.HashMap$HashIterator.nextNode(HashMap.java:1437)
at java.util.HashMap$EntryIterator.next(HashMap.java:1471)
at java.util.HashMap$EntryIterator.next(HashMap.java:1469)
at org.apache.flink.metrics.slf4j.Slf4jReporter.report(Slf4jReporter.java:95)
at org.apache.flink.runtime.metrics.MetricRegistryImpl$ReporterTask.run(MetricRegistryImpl.java:427)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
{code}
https://api.travis-ci.org/v3/job/411307171/log.txt
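
The stack trace above shows a HashMap being modified while the reporter iterates over it. The sketch below reproduces that failure mode in a single thread and contrasts it with a ConcurrentHashMap, whose iterators do not throw. It is only an illustration: `ReporterIterationDemo` and the metric names are made up, and a concurrent map is one possible remedy, not necessarily the fix chosen for this ticket.

```java
import java.util.ConcurrentModificationException;
import java.util.Map;

// Hypothetical demo class; it does not use any Flink code.
final class ReporterIterationDemo {
    // Iterates over the map the way a reporter might, while a new metric is
    // registered in the middle of the iteration.
    // Returns true if the iteration completed, false if it failed with a
    // ConcurrentModificationException.
    static boolean reportWhileRegistering(Map<String, Long> metrics) {
        metrics.put("numRecordsIn", 1L);
        metrics.put("numRecordsOut", 2L);
        boolean added = false;
        try {
            for (Map.Entry<String, Long> entry : metrics.entrySet()) {
                if (!added) {
                    // Stands in for another thread registering a metric.
                    metrics.put("lateMetric", 3L);
                    added = true;
                }
            }
            return true;
        } catch (ConcurrentModificationException e) {
            return false;
        }
    }
}
```

Calling `reportWhileRegistering(new HashMap<>())` returns false, while `reportWhileRegistering(new ConcurrentHashMap<>())` returns true. Synchronizing registration and reporting on a common lock would be another way to get the same safety.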





[jira] [Created] (FLINK-8553) switch flink-metrics-datadog to async mode

2018-02-03 Thread Bowen Li (JIRA)
Bowen Li created FLINK-8553:
---

 Summary: switch flink-metrics-datadog to async mode
 Key: FLINK-8553
 URL: https://issues.apache.org/jira/browse/FLINK-8553
 Project: Flink
  Issue Type: Improvement
  Components: Metrics
Affects Versions: 1.4.0
Reporter: Bowen Li
Assignee: Bowen Li
 Fix For: 1.5.0


Even though flink-metrics-datadog is currently designed as `fire-and-forget`, 
it still uses sync calls, which may block or slow down the core. We need to 
switch it to async mode.

cc  [~Zentol]
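
As a rough illustration of the difference, the sketch below hands a blocking send to a background thread so that the caller returns immediately. Everything in it is an assumption for the sake of the example: `AsyncReporterSketch` is a hypothetical class, the `Runnable` stands in for whatever blocking HTTP call the reporter makes, and the actual change may well rely on the HTTP client's own async API instead.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical sketch of a fire-and-forget reporter; not the Datadog reporter.
final class AsyncReporterSketch implements AutoCloseable {
    // A single daemon thread does the sending so the reporting thread never blocks.
    private final ExecutorService sender = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "metrics-sender");
        t.setDaemon(true);
        return t;
    });

    // Hands the (possibly slow) send to the background thread and returns at once.
    CompletableFuture<Void> report(Runnable blockingSend) {
        return CompletableFuture.runAsync(blockingSend, sender)
                .exceptionally(ex -> null); // fire-and-forget: failures are swallowed here
    }

    @Override
    public void close() {
        sender.shutdown();
    }
}
```

The returned future is only there so callers and tests can wait if they want to; a true fire-and-forget caller would ignore it.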





[jira] [Created] (FLINK-7907) Flink Metrics documentation missing Scala examples

2017-10-23 Thread Colin Williams (JIRA)
Colin Williams created FLINK-7907:
-

 Summary: Flink Metrics documentation missing Scala examples
 Key: FLINK-7907
 URL: https://issues.apache.org/jira/browse/FLINK-7907
 Project: Flink
  Issue Type: Improvement
  Components: Documentation
Reporter: Colin Williams
Priority: Minor


The Flink metrics documentation is missing Scala examples for many of the 
metrics types. To be consistent there should be Scala examples for all the 
types.





Re: Question about Flink Metrics

2017-09-26 Thread Tony Wei
Hi Chesnay,

That sounds great to me. I think I will be interested in it.

Best Regards,
Tony Wei

2017-09-26 21:57 GMT+08:00 Chesnay Schepler <ches...@apache.org>:

> Hello,
>
> I see the value in supporting this, and it's actually quite easy to do.
>
> I've filed FLINK-7692, containing instructions on how to implement this.
>
> @Tony Are you interested in implementing this?
>
>
>
> On 26.09.2017 14:10, Tony Wei wrote:
>
>> Hi Hai Zhou,
>>
>> It's a good idea to implement my own reporter, but I think it is not the
>> best solution.
>> After all, the reporter needs to be configured when starting the cluster. It
>> is not efficient to update the cluster whenever you have a new metric for a
>> new streaming job.
>>
>> Anyway, it is still a workaround for now. Thank you!
>>
>> Best Regards,
>> Tony Wei
>>
>>
>> 2017-09-26 19:13 GMT+08:00 Hai Zhou <yew...@gmail.com>:
>>
>>> Hi Tony,
>>>
>>> You could consider implementing a reporter that uses a trick to convert
>>> Flink's metrics to the structure that suits your needs.
>>>
>>> This is just my personal practice; I hope it helps.
>>>
>>> Cheers,
>>> Hai Zhou
>>>
>>>
>>> On 26 Sep 2017, at 17:49, Tony Wei <tony19920...@gmail.com> wrote:
>>>
>>> Hi,
>>>
>>> Recently, I have been using PrometheusReporter to monitor every metric
>>> from Flink.
>>>
>>> I found that the metric name in Prometheus maps to the identifier from
>>> User Scope and System Scope [1], and the labels map to Variables [2].
>>>
>>> To monitor the same metrics from Prometheus, I would like to use labels
>>> to differentiate them.
>>> Under the job/task/operator scope, it works fine for me. However, it's
>>> not convenient for me to monitor partitions' states from the Kafka
>>> consumer, because I couldn't place the partition id as a tag on each
>>> metric. All partition states, like the current commit offset, will each
>>> be a unique metric in Prometheus. It's hard to use a visualization tool
>>> such as Grafana to monitor them.
>>>
>>> My question is: Is it possible to add tags on a Metric, instead of using
>>> `.addGroup()`?
>>> If not, will it be a new feature in Flink Metrics in the future? Since I
>>> am not sure how other reporters work, I am afraid that it is not a
>>> good design to just fulfill the requirement of one particular reporter.
>>>
>>> Please advise, and thanks for your help.
>>>
>>> Best Regards,
>>> Tony Wei
>>>
>>> [1]: https://ci.apache.org/projects/flink/flink-docs-release-1.3/monitoring/metrics.html#scope
>>> [2]: https://ci.apache.org/projects/flink/flink-docs-release-1.3/monitoring/metrics.html#list-of-all-variables
>>>
>>>
>>>
>>>
>


Re: Question about Flink Metrics

2017-09-26 Thread Chesnay Schepler

Hello,

I see the value in supporting this, and it's actually quite easy to do.


I've filed FLINK-7692, containing instructions on how to implement this.

@Tony Are you interested in implementing this?


On 26.09.2017 14:10, Tony Wei wrote:

Hi Hai Zhou,

It's a good idea to implement my own reporter, but I think it is not the
best solution.
After all, the reporter needs to be configured when starting the cluster.
It is not efficient to update the cluster whenever you have a new metric
for a new streaming job.

Anyway, it is still a workaround for now. Thank you!

Best Regards,
Tony Wei


2017-09-26 19:13 GMT+08:00 Hai Zhou <yew...@gmail.com>:


Hi Tony,

You could consider implementing a reporter that uses a trick to convert
Flink's metrics to the structure that suits your needs.

This is just my personal practice; I hope it helps.

Cheers,
Hai Zhou


On 26 Sep 2017, at 17:49, Tony Wei <tony19920...@gmail.com> wrote:

Hi,

Recently, I have been using PrometheusReporter to monitor every metric
from Flink.

I found that the metric name in Prometheus maps to the identifier from
User Scope and System Scope [1], and the labels map to Variables [2].

To monitor the same metrics from Prometheus, I would like to use labels
to differentiate them.
Under the job/task/operator scope, it works fine for me. However, it's not
convenient for me to monitor partitions' states from the Kafka consumer,
because I couldn't place the partition id as a tag on each metric. All
partition states, like the current commit offset, will each be a unique
metric in Prometheus. It's hard to use a visualization tool such as
Grafana to monitor them.

My question is: Is it possible to add tags on a Metric, instead of using
`.addGroup()`?
If not, will it be a new feature in Flink Metrics in the future? Since I
am not sure how other reporters work, I am afraid that it is not a
good design to just fulfill the requirement of one particular reporter.

Please advise, and thanks for your help.

Best Regards,
Tony Wei

[1]: https://ci.apache.org/projects/flink/flink-docs-release-1.3/monitoring/metrics.html#scope
[2]: https://ci.apache.org/projects/flink/flink-docs-release-1.3/monitoring/metrics.html#list-of-all-variables







Re: Question about Flink Metrics

2017-09-26 Thread Tony Wei
Hi Hai Zhou,

It's a good idea to implement my own reporter, but I think it is not the
best solution.
After all, the reporter needs to be configured when starting the cluster.
It is not efficient to update the cluster whenever you have a new metric
for a new streaming job.

Anyway, it is still a workaround for now. Thank you!

Best Regards,
Tony Wei


2017-09-26 19:13 GMT+08:00 Hai Zhou <yew...@gmail.com>:

> Hi Tony,
>
> You could consider implementing a reporter that uses a trick to convert
> Flink's metrics to the structure that suits your needs.
>
> This is just my personal practice; I hope it helps.
>
> Cheers,
> Hai Zhou
>
>
> On 26 Sep 2017, at 17:49, Tony Wei <tony19920...@gmail.com> wrote:
>
> Hi,
>
> Recently, I have been using PrometheusReporter to monitor every metric
> from Flink.
>
> I found that the metric name in Prometheus maps to the identifier from
> User Scope and System Scope [1], and the labels map to Variables [2].
>
> To monitor the same metrics from Prometheus, I would like to use labels
> to differentiate them.
> Under the job/task/operator scope, it works fine for me. However, it's not
> convenient for me to monitor partitions' states from the Kafka consumer,
> because I couldn't place the partition id as a tag on each metric. All
> partition states, like the current commit offset, will each be a unique
> metric in Prometheus. It's hard to use a visualization tool such as
> Grafana to monitor them.
>
> My question is: Is it possible to add tags on a Metric, instead of using
> `.addGroup()`?
> If not, will it be a new feature in Flink Metrics in the future? Since I
> am not sure how other reporters work, I am afraid that it is not a
> good design to just fulfill the requirement of one particular reporter.
>
> Please advise, and thanks for your help.
>
> Best Regards,
> Tony Wei
>
> [1]: https://ci.apache.org/projects/flink/flink-docs-release-1.3/monitoring/metrics.html#scope
> [2]: https://ci.apache.org/projects/flink/flink-docs-release-1.3/monitoring/metrics.html#list-of-all-variables
>
>
>


Re: Question about Flink Metrics

2017-09-26 Thread Hai Zhou
Hi Tony,

You could consider implementing a reporter that uses a trick to convert
Flink's metrics to the structure that suits your needs.

This is just my personal practice; I hope it helps.

Cheers,
Hai Zhou


> On 26 Sep 2017, at 17:49, Tony Wei <tony19920...@gmail.com> wrote:
>
> Hi,
>
> Recently, I have been using PrometheusReporter to monitor every metric
> from Flink.
>
> I found that the metric name in Prometheus maps to the identifier from
> User Scope and System Scope [1], and the labels map to Variables [2].
>
> To monitor the same metrics from Prometheus, I would like to use labels
> to differentiate them.
> Under the job/task/operator scope, it works fine for me. However, it's not
> convenient for me to monitor partitions' states from the Kafka consumer,
> because I couldn't place the partition id as a tag on each metric. All
> partition states, like the current commit offset, will each be a unique
> metric in Prometheus. It's hard to use a visualization tool such as
> Grafana to monitor them.
>
> My question is: Is it possible to add tags on a Metric, instead of using
> `.addGroup()`?
> If not, will it be a new feature in Flink Metrics in the future? Since I
> am not sure how other reporters work, I am afraid that it is not a
> good design to just fulfill the requirement of one particular reporter.
>
> Please advise, and thanks for your help.
>
> Best Regards,
> Tony Wei
>
> [1]: https://ci.apache.org/projects/flink/flink-docs-release-1.3/monitoring/metrics.html#scope
> [2]: https://ci.apache.org/projects/flink/flink-docs-release-1.3/monitoring/metrics.html#list-of-all-variables


Question about Flink Metrics

2017-09-26 Thread Tony Wei
Hi,

Recently, I have been using PrometheusReporter to monitor every metric
from Flink.

I found that the metric name in Prometheus maps to the identifier from
User Scope and System Scope [1], and the labels map to Variables [2].

To monitor the same metrics from Prometheus, I would like to use labels
to differentiate them.
Under the job/task/operator scope, it works fine for me. However, it's not
convenient for me to monitor partitions' states from the Kafka consumer,
because I couldn't place the partition id as a tag on each metric. All
partition states, like the current commit offset, will each be a unique
metric in Prometheus. It's hard to use a visualization tool such as
Grafana to monitor them.

My question is: Is it possible to add tags on a Metric, instead of using
`.addGroup()`?
If not, will it be a new feature in Flink Metrics in the future? Since I
am not sure how other reporters work, I am afraid that it is not a
good design to just fulfill the requirement of one particular reporter.

Please advise, and thanks for your help.

Best Regards,
Tony Wei

[1]:
https://ci.apache.org/projects/flink/flink-docs-release-1.3/monitoring/metrics.html#scope
[2]:
https://ci.apache.org/projects/flink/flink-docs-release-1.3/monitoring/metrics.html#list-of-all-variables
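
The difference described in this message, a partition id baked into the metric
name versus carried as a label, can be sketched with a toy model. This is not
Flink's or Prometheus's code: `SeriesNaming`, `render`, and the metric and
label names below are all hypothetical and chosen only for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy model of how a series identifier is built from name parts and labels. */
class SeriesNaming {

    /**
     * Joins the name parts with '_' and renders labels as {key="value",...}.
     * Label values are not escaped; this is a sketch, not an exposition-format writer.
     */
    static String render(String[] nameParts, Map<String, String> labels) {
        String name = String.join("_", nameParts);
        if (labels.isEmpty()) {
            return name;
        }
        StringBuilder rendered = new StringBuilder(name).append('{');
        boolean first = true;
        for (Map.Entry<String, String> label : labels.entrySet()) {
            if (!first) {
                rendered.append(',');
            }
            rendered.append(label.getKey()).append("=\"").append(label.getValue()).append('"');
            first = false;
        }
        return rendered.append('}').toString();
    }

    public static void main(String[] args) {
        for (int partition = 0; partition < 3; partition++) {
            // Partition id as a name component: every partition becomes its own metric.
            String inName = render(
                    new String[] {"kafka", "partition_" + partition, "committed_offset"},
                    new LinkedHashMap<String, String>());

            // Partition id as a label: one metric name, distinguished by label value.
            Map<String, String> labels = new LinkedHashMap<>();
            labels.put("partition", String.valueOf(partition));
            String asLabel = render(new String[] {"kafka", "committed_offset"}, labels);

            System.out.println(inName + "   vs   " + asLabel);
        }
    }
}
```

With the label form, a single dashboard query on one metric name can cover
all partitions, which is the convenience the message asks for.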


[jira] [Created] (FLINK-6431) Activate strict checkstyle for flink-metrics

2017-05-02 Thread Chesnay Schepler (JIRA)
Chesnay Schepler created FLINK-6431:
---

 Summary: Activate strict checkstyle for flink-metrics
 Key: FLINK-6431
 URL: https://issues.apache.org/jira/browse/FLINK-6431
 Project: Flink
  Issue Type: Improvement
  Components: Metrics
Reporter: Chesnay Schepler
Assignee: Chesnay Schepler








Re: Flink Metrics

2016-10-18 Thread Aljoscha Krettek
https://ci.apache.org/projects/flink/flink-docs-release-1.2/monitoring/metrics.html
Or this, if you prefer Flink 1.1:
https://ci.apache.org/projects/flink/flink-docs-release-1.1/apis/metrics.html

On Mon, 17 Oct 2016 at 19:16 amir bahmanyari <amirto...@yahoo.com> wrote:

> Hi colleagues,
> Is there a link that describes Flink Metrics and provides an example of how
> to use it, please?
> I'd really appreciate it...
> Cheers
>
> --
> *From:* Till Rohrmann <trohrm...@apache.org>
> *To:* u...@flink.apache.org
> *Cc:* dev@flink.apache.org
> *Sent:* Monday, October 17, 2016 12:52 AM
> *Subject:* Re: Flink Metrics
>
> Hi Govind,
>
> I think the DropwizardMeterWrapper implementation is just a reference
> implementation where it was decided to report the one-minute rate. You can
> define your own meter class that lets you configure the rate interval
> accordingly.
>
> Concerning Timers, I think nobody has requested this metric so far. If you
> want, you can open a JIRA issue and contribute it. The community would
> really appreciate that.
>
> Cheers,
> Till
>
>
> On Mon, Oct 17, 2016 at 5:26 AM, Govindarajan Srinivasaraghavan <
> govindragh...@gmail.com> wrote:
>
> > Hi,
> >
> > I am currently using a Flink 1.2 snapshot and instrumenting my pipeline
> > with Flink metrics. One small suggestion: currently the Meter interface
> > only supports getRate(), which is always the one-minute rate.
> >
> > It would be great if all the rates (1 min, 5 min & 15 min) were exposed,
> > to get a better picture in terms of performance.
> >
> > Also, is there any reason why timers are not part of Flink metrics core?
> >
> > Regards,
> > Govind
> >
>
>
>


Re: Flink Metrics

2016-10-17 Thread amir bahmanyari
Hi colleagues,
Is there a link that describes Flink Metrics and provides an example of how
to use it, please?
I'd really appreciate it...
Cheers

From: Till Rohrmann <trohrm...@apache.org>
To: u...@flink.apache.org
Cc: dev@flink.apache.org
Sent: Monday, October 17, 2016 12:52 AM
Subject: Re: Flink Metrics
   
Hi Govind,

I think the DropwizardMeterWrapper implementation is just a reference
implementation where it was decided to report the one-minute rate. You can
define your own meter class that lets you configure the rate interval
accordingly.

Concerning Timers, I think nobody has requested this metric so far. If you
want, you can open a JIRA issue and contribute it. The community would
really appreciate that.

Cheers,
Till

On Mon, Oct 17, 2016 at 5:26 AM, Govindarajan Srinivasaraghavan <
govindragh...@gmail.com> wrote:

> Hi,
>
> I am currently using a Flink 1.2 snapshot and instrumenting my pipeline
> with Flink metrics. One small suggestion: currently the Meter interface
> only supports getRate(), which is always the one-minute rate.
>
> It would be great if all the rates (1 min, 5 min & 15 min) were exposed,
> to get a better picture in terms of performance.
>
> Also, is there any reason why timers are not part of Flink metrics core?
>
> Regards,
> Govind
>

   

Re: Flink Metrics

2016-10-17 Thread Till Rohrmann
Hi Govind,

I think the DropwizardMeterWrapper implementation is just a reference
implementation where it was decided to report the one-minute rate. You can
define your own meter class that lets you configure the rate interval
accordingly.

Concerning Timers, I think nobody has requested this metric so far. If you
want, you can open a JIRA issue and contribute it. The community would
really appreciate that.

Cheers,
Till

On Mon, Oct 17, 2016 at 5:26 AM, Govindarajan Srinivasaraghavan <
govindragh...@gmail.com> wrote:

> Hi,
>
> I am currently using a Flink 1.2 snapshot and instrumenting my pipeline
> with Flink metrics. One small suggestion: currently the Meter interface
> only supports getRate(), which is always the one-minute rate.
>
> It would be great if all the rates (1 min, 5 min & 15 min) were exposed,
> to get a better picture in terms of performance.
>
> Also, is there any reason why timers are not part of Flink metrics core?
>
> Regards,
> Govind
>
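
The suggestion above, defining your own meter class with a configurable rate
interval, could look roughly like the sketch below. It is an assumption-laden
illustration: `WindowedMeter` is a hypothetical standalone class (it does not
implement Flink's Meter interface here, so the example stays self-contained),
and it uses a plain sliding-window average rather than the exponentially
weighted moving average that Dropwizard meters use. The clock is injected so
that the behaviour can be checked deterministically.

```java
import java.util.ArrayDeque;
import java.util.function.LongSupplier;

/** Sliding-window meter; the window length is chosen by the caller. */
class WindowedMeter {
    private final long windowMillis;
    private final LongSupplier clockMillis;
    // Each entry is {timestampMillis, numberOfEvents}.
    private final ArrayDeque<long[]> events = new ArrayDeque<>();
    private long totalCount;
    private long countInWindow;

    WindowedMeter(long windowSeconds, LongSupplier clockMillis) {
        if (windowSeconds <= 0) {
            throw new IllegalArgumentException("windowSeconds must be positive");
        }
        this.windowMillis = windowSeconds * 1000L;
        this.clockMillis = clockMillis;
    }

    synchronized void markEvent() {
        markEvent(1L);
    }

    synchronized void markEvent(long n) {
        long now = clockMillis.getAsLong();
        evictOlderThanWindow(now);
        events.addLast(new long[] {now, n});
        totalCount += n;
        countInWindow += n;
    }

    /** Events per second, averaged over the configured window. */
    synchronized double getRate() {
        evictOlderThanWindow(clockMillis.getAsLong());
        return countInWindow * 1000.0 / windowMillis;
    }

    /** Total number of events seen since creation. */
    synchronized long getCount() {
        return totalCount;
    }

    // Drops entries whose timestamp is at least one full window in the past.
    private void evictOlderThanWindow(long now) {
        while (!events.isEmpty() && events.peekFirst()[0] <= now - windowMillis) {
            countInWindow -= events.pollFirst()[1];
        }
    }
}

class WindowedMeterDemo {
    public static void main(String[] args) {
        long[] now = {0L}; // fake clock, in milliseconds
        WindowedMeter oneMinute = new WindowedMeter(60, () -> now[0]);
        WindowedMeter fiveMinutes = new WindowedMeter(300, () -> now[0]);
        WindowedMeter fifteenMinutes = new WindowedMeter(900, () -> now[0]);

        // A burst of 600 events at t = 0, then silence for 90 seconds.
        oneMinute.markEvent(600);
        fiveMinutes.markEvent(600);
        fifteenMinutes.markEvent(600);
        now[0] = 90_000L;

        System.out.println("1 min rate:  " + oneMinute.getRate());
        System.out.println("5 min rate:  " + fiveMinutes.getRate());
        System.out.println("15 min rate: " + fifteenMinutes.getRate());
    }
}
```

Registering three such meters (1, 5 and 15 minutes) side by side would give
the "better picture" requested in the original message, at the cost of keeping
one queue of timestamps per meter.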


Flink Metrics

2016-10-16 Thread Govindarajan Srinivasaraghavan
Hi,

I am currently using a Flink 1.2 snapshot and instrumenting my pipeline
with Flink metrics. One small suggestion: currently the Meter interface
only supports getRate(), which is always the one-minute rate.

It would be great if all the rates (1 min, 5 min & 15 min) were exposed,
to get a better picture in terms of performance.

Also, is there any reason why timers are not part of Flink metrics core?

Regards,
Govind


[jira] [Created] (FLINK-4186) Expose Kafka metrics through Flink metrics

2016-07-10 Thread Robert Metzger (JIRA)
Robert Metzger created FLINK-4186:
-

 Summary: Expose Kafka metrics through Flink metrics
 Key: FLINK-4186
 URL: https://issues.apache.org/jira/browse/FLINK-4186
 Project: Flink
  Issue Type: Improvement
  Components: Kafka Connector
Affects Versions: 1.1.0
Reporter: Robert Metzger
Assignee: Robert Metzger


Currently, we expose the Kafka metrics through Flink's accumulators.
We can now use the metrics system in Flink to report Kafka metrics.





[jira] [Created] (FLINK-4173) Replace maven-assembly-plugin by maven-shade-plugin in flink-metrics

2016-07-07 Thread Till Rohrmann (JIRA)
Till Rohrmann created FLINK-4173:


 Summary: Replace maven-assembly-plugin by maven-shade-plugin in flink-metrics
 Key: FLINK-4173
 URL: https://issues.apache.org/jira/browse/FLINK-4173
 Project: Flink
  Issue Type: Bug
  Components: Metrics
Affects Versions: 1.1.0
Reporter: Till Rohrmann
Assignee: Chesnay Schepler
Priority: Minor
 Fix For: 1.1.0


The modules {{flink-metrics-dropwizard}}, {{flink-metrics-ganglia}} and 
{{flink-metrics-graphite}} use the {{maven-assembly-plugin}} to build a fat 
jar. The resulting fat jar has the suffix {{jar-with-dependencies}}. In order 
to make the naming consistent with the rest of the system we should create a 
fat-jar without this suffix.

Additionally we could replace the {{maven-assembly-plugin}} with the 
{{maven-shade-plugin}} to make it consistent with the rest of the system.


