Hi Chesnay,

Many thanks for your reply. In the end, we have decided to change the
infrastructure a bit and use StatsD instead. This way, we don't need a
custom reporter and it works fine.

Thanks!

Bruno

On Fri, 5 May 2017 at 13:20 Chesnay Schepler <ches...@apache.org> wrote:

> Hello,
>
> for Graphite, Flink uses the DropWizard metrics reporter. I don't know
> at the moment whether it supports any kind of reconnecting functionality.
>
> I'm not sure whether I understood you correctly; did you try upgrading
> the DropWizard metrics-core/metrics-graphite dependencies?
>
> If that didn't do the trick we could in fact implement this in Flink, it
> would be a hack though. When an error occurs we can simply re-instantiate
> the reporter, but we would have to know how the reporter communicates
> the connection drop; i.e. whether it throws some exception or not.
>
> Could you check the log for any warning statements from the MetricRegistry?
>
> Regards,
> Chesnay
>
> On 05.05.2017 13:26, Bruno Aranda wrote:
> > Hi,
> >
> > We are using the Graphite reporter from Flink 1.2.0 to send the
> > metrics via TCP. Due to our network configuration we cannot use UDP at
> > the moment.
> >
> > We have observed that if there is any problem with Graphite or the
> > network (for example, the TCP connection times out), the
> > metrics reporter does not recover. This is easy to reproduce by
> > blocking the port we are sending the metrics to using iptables. If we
> > block the port for more than a minute or so, the problem will happen.
> > After the port is re-opened, Flink does not resume sending metrics.
> >
> > Is this a known issue? Googling shows some problems with the
> > metrics-graphite package that should have been solved already. We have
> > tried updating metrics-core/graphite to the latest version with no success.
> >
> > Any ideas?
> >
> > Thanks!
> >
> > Bruno
>
>
>
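
For reference, the "re-instantiate the reporter when an error occurs" idea
from the quoted reply could look roughly like the sketch below. It is only
an illustration under stated assumptions: Reporter and ReconnectingReporter
are hypothetical names, not Flink or DropWizard classes, and the sketch
assumes a failed report surfaces as an IOException. Whether the real
reporter signals a dropped connection that way is exactly the open question
in the thread.

```java
import java.io.IOException;
import java.util.function.Supplier;

public class ReconnectingReporterSketch {

    /** Hypothetical stand-in for a metrics reporter; not a Flink or DropWizard type. */
    interface Reporter {
        /** Sends one batch of metrics; assumed to throw if the connection is broken. */
        void report() throws IOException;

        /** Releases the underlying connection; no-op by default. */
        default void close() {}
    }

    /** Wraps a reporter and replaces it with a fresh instance after a failed report. */
    static final class ReconnectingReporter {
        private final Supplier<Reporter> factory;
        private Reporter current;
        private int instancesCreated;

        ReconnectingReporter(Supplier<Reporter> factory) {
            this.factory = factory;
        }

        /** Returns true if the report succeeded, false if the reporter was discarded. */
        boolean report() {
            if (current == null) {
                // Lazily (re-)create the reporter, which opens a new connection.
                current = factory.get();
                instancesCreated++;
            }
            try {
                current.report();
                return true;
            } catch (IOException e) {
                // Drop the broken instance; the next call builds a new one.
                current.close();
                current = null;
                return false;
            }
        }

        int instancesCreated() {
            return instancesCreated;
        }
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Fake reporter that fails on the second call to simulate a dropped connection.
        ReconnectingReporter reporter = new ReconnectingReporter(() -> () -> {
            if (calls[0]++ == 1) {
                throw new IOException("connection dropped");
            }
        });
        for (int i = 0; i < 3; i++) {
            System.out.println("report " + i + " succeeded: " + reporter.report());
        }
        System.out.println("instances created: " + reporter.instancesCreated());
    }
}
```

The wrapper recreates the reporter lazily on the next call rather than
retrying immediately, so a longer outage costs one failed attempt per
reporting interval instead of a tight retry loop.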
