Czar of Zonk created NIFI-8197:
----------------------------------

             Summary: Prometheus Processor Efficiency ReportingTask
                 Key: NIFI-8197
                 URL: https://issues.apache.org/jira/browse/NIFI-8197
             Project: Apache NiFi
          Issue Type: Improvement
          Components: Core Framework
    Affects Versions: 1.12.1
         Environment: Prometheus, ReportingTask, Java 11
            Reporter: Czar of Zonk

I have assembled a reference repository on GitHub, complete with sample Java code, Maven POMs, and, equally important, an extensive write-up.

Links:
* GitHub repository: [https://github.com/lcphill/NiFiPromProcessorEfficiencyMetricsReportingTask]
* Extensive write-up (.odt): [https://github.com/lcphill/NiFiPromProcessorEfficiencyMetricsReportingTask/blob/main/fodder/NiFiPrometheusProcessorEfficiencyMetricsSynopsis.odt]
* Java source code (ReportingTask): [https://github.com/lcphill/NiFiPromProcessorEfficiencyMetricsReportingTask/blob/main/impl/src/main/java/org/zonk/nifi/prometheus/reportingtask/processorefficiency/NiFiPromProcessorEfficiencyMetricsReportingTask.java]

Also see the Prometheus collector (Go), an improved take on Prometheus' push-gateway component: [https://github.com/pschou/prom-collector]

This feature set is currently serving a customer. The sample code linked above is slightly trimmed down so that it can serve as a reference implementation.

Key feature: Processor Efficiency metrics, implemented using Apache Commons Math3 SimpleRegression, which provides slope and intercept computations (a la FIFO) on which to alert (using Grafana) for processor backlog and/or flow drop-off. Please peruse the linked write-up for details.

Additional features (consider these an additional requirement set):
* Privacy considerations: metrics are sensitive. Support for privacy is demonstrated; the Java reference implementation linked above provides a handful of features that help address privacy considerations. See the linked write-up and the Java code for details.
* Push (versus Listen): it is very important that the paradigm be changed, for security reasons as well as network architecture reasons. See the linked write-up for details.
* JVM: available memory calculation (based on the sandbox maximum size and the memory currently in use).
* Queue-based metrics: the join between a processor and its input / output relations is best accomplished within the reporting task code itself (with, effectively, zero latency), rather than requiring the Grafana back-end to take on, at that juncture, the very arduous task of reassembling the join. Please see the Java reference implementation and the linked write-up for details.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
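
For readers who want a feel for the slope/intercept and available-memory ideas described in this issue, here is a minimal Java sketch. It is not taken from the linked repository. The issue names Apache Commons Math3 SimpleRegression; to stay dependency-free, this sketch swaps in a hand-written ordinary least-squares fit, which yields a slope and intercept for the same kind of data. The class name `QueueTrendSketch`, the choice to regress queued FlowFile count against sample time, the fixed-size window as a reading of "a la FIFO", and the use of `Runtime` for the "max size minus used" calculation are all assumptions made for illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical helper for illustration only; not part of the reference repository. */
final class QueueTrendSketch {
    private final int capacity;
    // Each entry is {x, y}; assumed here to be {sample time, queued FlowFile count}.
    private final Deque<double[]> window = new ArrayDeque<>();

    QueueTrendSketch(int capacity) {
        if (capacity < 2) {
            throw new IllegalArgumentException("capacity must be >= 2");
        }
        this.capacity = capacity;
    }

    /** Adds a sample, evicting the oldest one first when the window is full (FIFO). */
    void add(double x, double y) {
        if (window.size() == capacity) {
            window.removeFirst();
        }
        window.addLast(new double[] {x, y});
    }

    /**
     * Ordinary least-squares fit over the current window.
     * Returns {slope, intercept}, or {NaN, NaN} when a line cannot be fitted
     * (fewer than two samples, or all x values identical).
     */
    double[] fit() {
        int n = window.size();
        if (n < 2) {
            return new double[] {Double.NaN, Double.NaN};
        }
        double sumX = 0;
        double sumY = 0;
        for (double[] p : window) {
            sumX += p[0];
            sumY += p[1];
        }
        double meanX = sumX / n;
        double meanY = sumY / n;
        double sxx = 0;
        double sxy = 0;
        for (double[] p : window) {
            double dx = p[0] - meanX;
            sxx += dx * dx;
            sxy += dx * (p[1] - meanY);
        }
        if (sxx == 0) {
            return new double[] {Double.NaN, Double.NaN};
        }
        double slope = sxy / sxx;
        return new double[] {slope, meanY - slope * meanX};
    }

    /** Assumed reading of "max size minus used memory": max heap minus heap in use. */
    static long availableHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        long used = rt.totalMemory() - rt.freeMemory();
        return rt.maxMemory() - used;
    }
}
```

Under these assumptions, a persistently positive slope would suggest a growing backlog in front of a processor, and the slope and intercept could be published as gauges for Grafana to alert on. The write-up linked in the issue remains the authority on what the reference implementation actually regresses and how it sizes its window.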