The Prometheus reporter should work with 1.3.2.

Does this also occur with the reporter that currently exists in 1.4? That would rule out new bugs introduced by the PR.

To investigate this further, please set the logging level to WARN and try again, as all errors in the metric system are logged on that level.
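
As a rough sketch of that change, assuming the default log4j 1.x setup shipped with the Flink 1.3 distribution (the file `conf/log4j.properties` and an appender named `file` are assumptions about your installation, so adjust them if yours differs), lowering the root logger would look something like this:

```properties
# conf/log4j.properties (assumed default location in the Flink distribution)
# Log only WARN and above, so that warnings from the metric system stand out.
# "file" is assumed to be the name of the appender already defined in this file.
log4j.rootLogger=WARN, file
```

The JM and TM processes most likely need a restart to pick up the change.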

On 22.09.2017 10:33, Tony Wei wrote:
Hi,

I built the Prometheus reporter package from this PR https://github.com/apache/flink/pull/4586 and used it on Flink 1.3.2 to record all default metrics as well as those from `FlinkKafkaConsumer`.

At first, everything was fine. I could get the TM metrics from Prometheus, and they matched what I saw in the Flink Web UI. However, when I turned to the JM, Prometheus gave me this error: Get http://localhost:9249/metrics: EOF. I checked the JM log and found nothing in it. There was no error message, and port 9249 was still open.

To figure out what happened, I created another cluster and found that Prometheus could connect to the Flink cluster as long as no job was running. After the JM triggered or completed the first checkpoint, Prometheus started getting ERR_EMPTY_RESPONSE from the JM, but not from the TM. There was still no error in the log file, and port 9249 was still open.

I am wondering where the error occurs: in Flink or in the Prometheus reporter?
Or is it incorrect to use the Prometheus reporter on Flink 1.3.2? Thank you.

Best Regards,
Tony Wei

