>Many of the metrics that should be exposed for all transforms are likely
best exposed by the runner or some other common layer, rather than being
added to each transform.

+1. I wasn't trying to advocate for each user implementing these on
every transform - they should be provided by Beam or the runner automatically.


> But, the existing Metrics API should work within a source or sink --
anything that is called within a step should work.

Great! I wasn't aware we'd changed that with the new Metrics API - I was
only aware of the limitations of the old system.
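
To make that concrete, here's a toy sketch (not the real Beam Metrics API -
the registry, names, and step scoping below are all made up) of how a counter
created inside user code can be attributed to the enclosing step by the
metrics layer, so the same code works whether it runs in a DoFn, a source, or
a sink:

```python
class MetricsRegistry:
    """Toy metrics container keyed by (step, namespace, name)."""
    def __init__(self):
        self.counters = {}
        self.current_step = None  # set by the 'runner' around each step

    def counter(self, namespace, name):
        # Capture the step that is active when the counter is created.
        key = (self.current_step, namespace, name)
        self.counters.setdefault(key, 0)
        def inc(n=1):
            self.counters[key] += n
        return inc

registry = MetricsRegistry()

def read_records(records):
    # User code: works inside any step; attribution is automatic.
    records_read = registry.counter("my_io", "records_read")
    for r in records:
        records_read(1)
        yield r

registry.current_step = "Read"  # the 'runner' scopes the step
out = list(read_records(["a", "b", "c"]))
print(registry.counters[("Read", "my_io", "records_read")])  # 3
```

The point is just that the user never names the step; the surrounding
infrastructure does.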


> if the source can detect that it is
having trouble splitting and raise a message like "you're using compressed
text files which can't be parallelized beyond the number of files" that is
much more actionable.

I think there's an important difference between what a particular runner
chooses to show its users in its monitoring interface vs. what Beam
should be reporting. I think it's important that Beam or the runner layer
report this data (which would be necessary for doing high-level analysis
like what you propose), but the monitoring interface should then choose
how to expose that information. So the question becomes:
does it make sense for these common transform metrics to be exposed by
runner implementations or within common Beam code?

S


On Tue, Feb 14, 2017 at 9:21 AM Ben Chambers <bchamb...@google.com.invalid>
wrote:

> On Tue, Feb 14, 2017 at 9:07 AM Stephen Sisk <s...@google.com.invalid>
> wrote:
>
> > hi!
> >
> > (ben just sent his mail and he covered some similar topics to me, but
> I'll
> > keep my comments intact since they are slightly different)
> >
> > * I think there are a lot of metrics that should be exposed for all
> > transforms - everything from JB's list (number of splits, throughput,
> > reading/writing rate, etc.) also applies to
> > splittableDoFns.
> >
>
> Many of the metrics that should be exposed for all transforms are likely
> best exposed by the runner or some other common layer, rather than being
> added to each transform. But things like number of elements, estimated size
> of elements, etc. all make sense for every transform.
>
>
> > * I also think there are data source specific metrics that a given IO
> will
> > want to expose (ie, things like kafka backlog for a topic.) No one on
> this
> > thread has specifically addressed this, but Beam Sources & Sinks do not
> > presently have the ability to report metrics even if a given IO writer
> > wanted to - depending on the timeline for SplittableDoFn and the move to
> > that infrastructure, I don't think we need that support in Sources/Sinks,
> > but I do think we should make sure SplittableDoFn has the necessary
> > support.
> >
>
> Two parts -- we may want to introduce something like a Gauge here that lets
> the metric system ask the source/sink for the latest metrics. This allows
> the runner to gather metrics at a rate that makes sense without impacting
> performance.
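
To illustrate the Gauge idea above (a sketch with invented names, not a real
Beam interface): the source hands the metrics system a cheap supplier, and
the runner polls it at whatever cadence it likes, which decouples reporting
cost from the read path:

```python
class BacklogGauge:
    """Pull-based gauge: the metrics system calls read() at its own
    cadence, so the source pays no per-element reporting cost."""
    def __init__(self, supplier):
        self._supplier = supplier  # cheap callable returning current value

    def read(self):
        return self._supplier()

class FakeKafkaReader:
    """Stand-in for a source that knows how far behind it is."""
    def __init__(self, available):
        self._available = available
        self._consumed = 0

    def poll(self, n):
        self._consumed += n

    def backlog(self):
        return self._available - self._consumed

reader = FakeKafkaReader(available=1000)
gauge = BacklogGauge(reader.backlog)
reader.poll(250)
print(gauge.read())  # 750: pulled by the runner, not pushed per element
```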
>
> But, the existing Metrics API should work within a source or sink --
> anything that is called within a step should work.
>
>
> > * I think there are ways to implement many metrics such that they are
> > not too expensive to calculate all the time (i.e., reporting per bundle
> > rather than per item). I think we should ask whether the metrics we
> > want/need are expensive to calculate before going to the effort of
> > adding enable/disable.
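
As an illustration of per-bundle reporting (names here are purely
hypothetical, not Beam API): accumulate in a cheap local variable during the
bundle and report a single delta when the bundle finishes:

```python
class BundleCountingFn:
    """Counts elements locally and reports once per bundle."""
    def __init__(self, report):
        self._report = report  # called once per bundle with the delta
        self._count = 0

    def start_bundle(self):
        self._count = 0

    def process(self, element):
        self._count += 1       # cheap local increment, no reporting here
        return element

    def finish_bundle(self):
        self._report(self._count)  # one report per bundle, not per element

reports = []
fn = BundleCountingFn(reports.append)
fn.start_bundle()
for e in range(5):
    fn.process(e)
fn.finish_bundle()
print(reports)  # [5]
```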
> >
>
> +1 -- hence why I'd like to look at reporting the metrics with no
> configuration.
>
>
> > * I disagree with ben about showing the amount of splitting - I think
> > especially with IOs it's useful to understand/diagnose reading problems
> > since that's one potential source of problems, especially given that the
> > user can write transforms that split now in SplittableDoFn. But I look
> > forward to discussing that further
> >
>
> I think many of the splitting metrics fall into things the runner should
> report. I think if we pick the right ones so they're useful, it likely
> doesn't hurt to gather them, but here again it may be useful to talk about
> specific problems.
>
> I still think these likely won't make sense for all users -- if I'm a new
> user just trying to get a source/sink working, I'm not sure what "splitting
> metrics" would be useful to me. But if the source can detect that it is
> having trouble splitting and raise a message like "you're using compressed
> text files which can't be parallelized beyond the number of files" that is
> much more actionable.
>
>
> > +1 on talking about specific examples
> >
> > S
> >
> > On Tue, Feb 14, 2017 at 8:29 AM Jean-Baptiste Onofré <j...@nanthrax.net>
> > wrote:
> >
> > > Hi Aviem
> > >
> > > Agree with your comments, it's pretty close to my previous ones.
> > >
> > > Regards
> > > JB
> > >
> > > On Feb 14, 2017, 12:04, at 12:04, Aviem Zur <aviem...@gmail.com>
> wrote:
> > > >Hi Ismaël,
> > > >
> > > >You've raised some great points.
> > > >Please see my comments inline.
> > > >
> > > >On Tue, Feb 14, 2017 at 3:37 PM Ismaël Mejía <ieme...@gmail.com>
> wrote:
> > > >
> > > >> ​Hello,
> > > >>
> > > >> The new metrics API allows us to integrate some basic metrics into
> > > >the Beam
> > > >> IOs. I have been following some discussions about this on JIRAs/PRs,
> > > >and I
> > > >> think it is important to discuss the subject here so we can have
> more
> > > >> awareness and obtain ideas from the community.
> > > >>
> > > >> First I want to thank Ben for his work on the metrics API, and
> > > >> Aviem for his ongoing work on metrics for IOs (e.g. KafkaIO) that
> > > >> made me aware of this subject.
> > > >>
> > > >> There are some basic ideas to discuss e.g.
> > > >>
> > > >> - What are the responsibilities of Beam IOs in terms of Metrics
> > > >> (considering the fact that the actual IOs, server + client, usually
> > > >provide
> > > >> their own)?
> > > >>
> > > >
> > > >While it is true that many IOs provide their own metrics, I think
> > > >Beam should expose IO metrics because:
> > > >
> > > >1. Metrics which help in understanding the performance of a pipeline
> > > >   which uses an IO may not be covered by the IO itself.
> > > >2. Users may not be able to set up integrations with the IO's metrics
> > > >   to view them effectively (and correlate them to a specific Beam
> > > >   pipeline), but still want to investigate their pipeline's
> > > >   performance.
> > > >
> > > >
> > > >> - What metrics are relevant to the pipeline (or some particular
> > > >> IOs)? Kafka backlog, for one, could indicate that a pipeline is
> > > >> behind the ingestion rate.
> > > >
> > > >
> > > >I think it depends on the IO, but there is probably overlap in some of
> > > >the
> > > >metrics so a guideline might be written for this.
> > > >I listed what I thought should be reported for KafkaIO in the
> following
> > > >JIRA: https://issues.apache.org/jira/browse/BEAM-1398
> > > >Feel free to add more metrics you think are important to report.
> > > >
> > > >
> > > >>
> > > >>
> > > >> - Should metrics be calculated on IOs by default or not?
> > > >> - If metrics are defined by default, does it make sense to allow
> > > >> users to disable them?
> > > >>
> > > >
> > > >IIUC, your concern is that metrics will add overhead to the
> > > >pipeline, and pipelines which are highly sensitive to this will be
> > > >hampered?
> > > >In any case I think that yes, metrics calculation should be
> > > >configurable (enable/disable).
> > > >In the Spark runner, for example, the metrics sink feature (not the
> > > >metrics calculation itself, but the sinks to send them to) is
> > > >configurable in the pipeline options.
> > > >
> > > >
> > > >> Well these are just some questions around the subject so we can
> > > >create a
> > > >> common set of practices to include metrics in the IOs and eventually
> > > >> improve the transform guide with this. What do you think about this?
> > > >Do you
> > > >> have other questions/ideas?
> > > >>
> > > >> Thanks,
> > > >> Ismaël
> > > >>
> > >
> >
>
