[GitHub] flink issue #3833: [FLINK-6221] Add PrometheusReporter
Github user zentol commented on the issue: https://github.com/apache/flink/pull/3833 Finally found time to try this out, and it works great, merging. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink issue #3833: [FLINK-6221] Add PrometheusReporter
Github user mbode commented on the issue: https://github.com/apache/flink/pull/3833 I see, makes sense. Finally got around to fixing the dependency and getting a [Green Travis](https://travis-ci.org/mbode/flink/builds/240926382). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink issue #3833: [FLINK-6221] Add PrometheusReporter
Github user zentol commented on the issue: https://github.com/apache/flink/pull/3833 Another thing to consider is that this rule also covers the myriad of testing utilities that so far did not have to be documented. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink issue #3833: [FLINK-6221] Add PrometheusReporter
Github user zentol commented on the issue: https://github.com/apache/flink/pull/3833 I think it is intended. The majority of tests can be (and are) commented with "Tests for the {@link }.", which is totally OK because it tells us that this is the general test battery for that class and not something else. These are also easy to add so there's not a real overhead in doing so. But this doesn't apply to all of them; and you only really notice the missing javadoc when you stumble on a test and you spend 10 minutes trying to figure out what it is doing because the method names aren't helping at all. The widely-accepted notion that the name of a class and test method are supposed to be enough documentation for test is quite questionable IMO; and by enforcing a javadoc on the class we give a little nudge to make the developer hopefully think about whether something needs to be documented or not. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink issue #3833: [FLINK-6221] Add PrometheusReporter
Github user mbode commented on the issue: https://github.com/apache/flink/pull/3833 Oh, I broke the stricter checkstyle. To me, it feels a bit weird to have to put javadoc on tests, is that intended? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink issue #3833: [FLINK-6221] Add PrometheusReporter
Github user zentol commented on the issue: https://github.com/apache/flink/pull/3833 Shading looks correct; dependencies are included and the correct jar is included in flink-dist. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink issue #3833: [FLINK-6221] Add PrometheusReporter
Github user zentol commented on the issue: https://github.com/apache/flink/pull/3833 I'll take a look later today. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink issue #3833: [FLINK-6221] Add PrometheusReporter
Github user mbode commented on the issue: https://github.com/apache/flink/pull/3833 @zentol Would you mind checking that I got the shading right? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink issue #3833: [FLINK-6221] Add PrometheusReporter
Github user zentol commented on the issue: https://github.com/apache/flink/pull/3833 Yu can find an example on how to shade here: https://github.com/apache/flink/blob/master/flink-metrics/flink-metrics-datadog/pom.xml. Shading dependencies in reporters/connectors has become a safety-precaution form our side. It is not that unlikely that user-code contains the same dependencies. For one user-code also includes other reporters, so by the very existence of this reporter there is a precedent :) Besides that something that pops up from time to time on the mailing lists is users talking to REST endpoints in their functions/source/sinks, which may also rely on http-related dependencies. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink issue #3833: [FLINK-6221] Add PrometheusReporter
Github user mbode commented on the issue: https://github.com/apache/flink/pull/3833 Okay, guava is not used anymore. I am not sure about the shading part. Would you expect either prometheus clients or nanohttpd to be used in Flink user code? Or are there other advantages to shading? If so, could you point me to a module I could copy the "Flink way of shading" from? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink issue #3833: [FLINK-6221] Add PrometheusReporter
Github user zentol commented on the issue: https://github.com/apache/flink/pull/3833 As it stands a metric can have at most 10 labels, and there aren't any current plans to extend this set; according to the docs that's still ok. If this does indeed become a problem we can add a switch to rely on the scope formats instead, giving the user control as to how many labels there are. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink issue #3833: [FLINK-6221] Add PrometheusReporter
Github user mbode commented on the issue: https://github.com/apache/flink/pull/3833 @hadronzoo I tried to keep it as general as possible by exporting all variables available to the metric group as labels. I am not sure if this might lead to [label overuse](https://prometheus.io/docs/practices/instrumentation/#do-not-overuse-labels) at some point, did you ever run into difficulties in that context? Unfortunately I do not have a lot of experience running Prometheus in a production environment yet. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink issue #3833: [FLINK-6221] Add PrometheusReporter
Github user hadronzoo commented on the issue: https://github.com/apache/flink/pull/3833 @mbode thanks for working on this! One thing that I've found useful when exporting Flink's statsd metrics to Prometheus is to make several of the metric fields tags: like `job_name`, `task_name`, `operator_name`, etc. This [statsd-exporter mapping](https://gist.github.com/hadronzoo/621b6a6dce7e2596d5643ce8d1e954ea) has tags that have worked well for me. I'm not tagging host names or IP addresses because Prometheus's Kubernetes support does that already, but that could be useful for people running standalone clusters. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink issue #3833: [FLINK-6221] Add PrometheusReporter
Github user mbode commented on the issue: https://github.com/apache/flink/pull/3833 Thanks for looking at it so quickly! I somewhat had the same instinct as far as your first point is concerned and thought about pulling out a `DropwizardReporter` without Scheduling but decided against it to not have to touch too many places. I like your suggestion of converting metrics directly without using Dropwizard as an intermediate step and am going to try that. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink issue #3833: [FLINK-6221] Add PrometheusReporter
Github user zentol commented on the issue: https://github.com/apache/flink/pull/3833 I've glanced over this for now but it looks pretty good. If possible though i would like to not use the DropwizardExports class, for 2 reasons: 1) We now introduce this odd pattern of extending `ScheduledDropwizardReporter` without actually being scheduled nor creating an actual Dropwizard reporter. 2) I've looked into the source code and it has the typical Dropwizard reporter problem where absolutely nothing gets cached and effectively constant objects are re-created again and again. We can make this way more efficient. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---