Re: Streaming job monitoring

Chesnay Schepler Thu, 08 Jun 2017 03:16:23 -0700

Hello Flavio,

I'm not sure what source you are using, but it looks like the ContinuousFileMonitoringSource, which works with 2 operators. The first operator (what is displayed as the actual Source) emits input splits (chunks of files that should be read) and passes these to the second operator (split reader).


So the numRecordsOut of the source is the number of splits created.

For sinks, the numRecordsOut counter is essentially unused, mostly because it is kind of redundant, as it should in general be equal to the numRecordsIn counter. The same applies to the numRecordsIn counter of sources (although in this particular case it would be interesting to know how many files the source has read...).

This is something we would have to solve for each source/sink individually, which is kind of tricky as the numRecordsIn/-Out metrics are internal metrics and not accessible in user-defined functions without casting.

In your case the reading of the chunks and the writing by the sink are done in a single task. The web UI is not aware of operators and thus can't display the individual metrics nicely.

The metrics tab doesn't aggregate metrics across subtasks, so I can see how that would be cumbersome to use. We can't solve aggregation in general, as when dealing with Gauges we just don't know whether we can aggregate them at all. Frankly, this is also something I don't really want to implement in the first place, as there are dedicated systems for this exact use-case. The WebFrontend is not meant as a replacement for these systems.
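
To illustrate the aggregation problem, here is a minimal sketch in plain Java that does not use any Flink classes. The class name GaugeAggregationSketch, its methods, and the sample values are all hypothetical and only serve the illustration. It shows that per-subtask counters can always be summed, while a gauge may report a value of any type, so there is no single aggregation that is valid for all gauges.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical illustration only; none of these names exist in Flink.
public class GaugeAggregationSketch {

    // Counters are plain numbers, so a sum across subtasks is always meaningful.
    public static long sumCounters(List<Long> perSubtaskCounts) {
        long total = 0;
        for (long count : perSubtaskCounts) {
            total += count;
        }
        return total;
    }

    // A gauge can report any type. Only numeric gauges are even candidates
    // for aggregation, and for those it is still unclear whether a sum,
    // an average, or a maximum is the right choice.
    public static boolean isNumeric(Supplier<?> gauge) {
        return gauge.get() instanceof Number;
    }

    public static void main(String[] args) {
        // Made-up numRecordsIn values for three subtasks.
        List<Long> numRecordsIn = Arrays.asList(120L, 95L, 101L);
        System.out.println("total numRecordsIn: " + sumCounters(numRecordsIn));

        // Two made-up gauges: one numeric, one reporting a file name.
        Supplier<Object> queueLength = () -> 7;
        Supplier<Object> currentFile = () -> "part-0001.log";
        System.out.println("queueLength numeric: " + isNumeric(queueLength));
        System.out.println("currentFile numeric: " + isNumeric(currentFile));
    }
}
```

A dedicated metrics system leaves this decision to whoever builds the dashboard, since they know what each gauge means.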

In general I would recommend setting up a dedicated metrics system like Graphite/Ganglia to store metrics, and using Grafana or something similar to actually monitor them.
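
As a rough sketch, a Graphite reporter could be enabled in flink-conf.yaml along these lines. The reporter name "grph", the host, and the port are placeholders for your own setup, and this assumes the flink-metrics-graphite jar from the opt/ folder has been copied into Flink's lib/ folder.

```yaml
# Assumed example values; adjust name, host and port to your environment.
metrics.reporters: grph
metrics.reporter.grph.class: org.apache.flink.metrics.graphite.GraphiteReporter
metrics.reporter.grph.host: localhost
metrics.reporter.grph.port: 2003
metrics.reporter.grph.protocol: TCP
```

Grafana can then be pointed at Graphite to aggregate the per-subtask metrics as needed.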

Regards,
Chesnay

On 08.06.2017 11:43, Flavio Pompermaier wrote:

Hi to all,
we've successfully run our first streaming job on a Flink cluster (with some problems with the shading of guava..) and it really outperforms Logstash in terms of indexing speed and ease of use.
However, there's one problem: when the job is running, in the Job Monitoring UI, I see 2 blocks within the plan visualizer:
 1. Source: Custom File Source (without any info about the file I'm
    reading)
 2. Split Reader: Custom File source -> Sink: unnamed
Neither of them helps me understand which data I'm reading or writing (while in batch jobs this is usually displayed). Moreover, in the task details the "Bytes sent/Records sent" values make no sense to me; I don't know what is counted (see the attached image if available)... I see documents indexed on ES, but in the Flink Job UI I don't see anything that could help me understand how many documents are sent to ES or from one function (Source) to the other (Sink). I tried to display some metrics and there I found something, but I hope this is not the usual way of monitoring streaming jobs... Am I doing something wrong? Or should streaming jobs be monitored with something else?
Inline image 1
Best,
Flavio
