Hi Aviem

I agree with your comments; they're pretty close to my previous ones.

Regards
JB

On Feb 14, 2017, at 12:04, Aviem Zur <[email protected]> wrote:
>Hi Ismaël,
>
>You've raised some great points.
>Please see my comments inline.
>
>On Tue, Feb 14, 2017 at 3:37 PM Ismaël Mejía <[email protected]> wrote:
>
>> Hello,
>>
>> The new metrics API allows us to integrate some basic metrics into the
>> Beam IOs. I have been following some discussions about this on JIRAs/PRs,
>> and I think it is important to discuss the subject here so we can have
>> more awareness and obtain ideas from the community.
>>
>> First I want to thank Ben for his work on the metrics API, and Aviem for
>> his ongoing work on metrics for IOs (e.g. KafkaIO) that made me aware of
>> this subject.
>>
>> There are some basic ideas to discuss e.g.
>>
>> - What are the responsibilities of Beam IOs in terms of Metrics
>> (considering the fact that the actual IOs, server + client, usually
>> provide their own)?
>>
>
>While it is true that many IOs provide their own metrics, I think that Beam
>should expose IO metrics because:
>
>1. Metrics which help in understanding the performance of a pipeline which
>   uses an IO may not be covered by the IO.
>2. Users may not be able to set up integrations with the IO's metrics to
>   view them effectively (and correlate them to a specific Beam pipeline),
>   but still want to investigate their pipeline's performance.
>
>
>> - What metrics are relevant to the pipeline (or some particular IOs)?
>> Kafka backlog, for one, could indicate that a pipeline is behind the
>> ingestion rate.
>
>
>I think it depends on the IO, but there is probably overlap in some of the
>metrics so a guideline might be written for this.
>I listed what I thought should be reported for KafkaIO in the following
>JIRA: https://issues.apache.org/jira/browse/BEAM-1398
>Feel free to add more metrics you think are important to report.
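
To make the backlog idea concrete, here is a minimal sketch in plain Python. It does not use the Beam or Kafka APIs; the function names and the offset dictionaries are hypothetical, and it assumes backlog is measured as the gap between the latest available offset and the last consumed offset for each partition.

```python
def backlog_per_partition(latest_offsets, consumed_offsets):
    """Return the number of unread records for each partition.

    Both arguments are hypothetical dicts mapping partition id -> offset.
    A partition with no consumed offset is treated as unread from offset 0.
    """
    return {
        partition: max(latest - consumed_offsets.get(partition, 0), 0)
        for partition, latest in latest_offsets.items()
    }


def total_backlog(latest_offsets, consumed_offsets):
    """Sum the per-partition backlog into one pipeline-level number."""
    return sum(backlog_per_partition(latest_offsets, consumed_offsets).values())


# Example: partition 0 is 10 records behind, partition 1 is caught up.
latest = {0: 100, 1: 250}
consumed = {0: 90, 1: 250}
per_partition = backlog_per_partition(latest, consumed)
total = total_backlog(latest, consumed)
```

A total that keeps growing over time would suggest the pipeline is reading more slowly than records are being produced.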
>
>
>>
>>
>> - Should metrics be calculated on IOs by default or not?
>> - If metrics are defined by default, does it make sense to allow users to
>> disable them?
>>
>
>IIUC, your concern is that metrics will add overhead to the pipeline, and
>pipelines which are highly sensitive to this will be hampered?
>In any case I think that yes, metrics calculation should be configurable
>(enable/disable).
>In the Spark runner, for example, the metrics sink feature (not the metrics
>calculation itself, but the sinks to send them to) is configurable in the
>pipeline options.
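
One way the enable/disable idea could look is sketched below in plain Python. This is not the Beam metrics API: `Counter`, `NoOpCounter`, `make_counter`, and the `metrics_enabled` flag are all hypothetical names, and the sketch assumes the switch would come from something like a pipeline option.

```python
class Counter:
    """Hypothetical counter that accumulates a running total."""

    def __init__(self, name):
        self.name = name
        self.value = 0

    def inc(self, n=1):
        self.value += n


class NoOpCounter(Counter):
    """Drop-in replacement used when metrics are disabled: inc() does nothing."""

    def inc(self, n=1):
        pass


def make_counter(name, metrics_enabled):
    """Pick the real or no-op counter based on a (hypothetical) option."""
    return Counter(name) if metrics_enabled else NoOpCounter(name)


def read_records(records, metrics_enabled=True):
    """Simulate an IO read loop that reports how many elements it read."""
    elements_read = make_counter("elementsRead", metrics_enabled)
    output = []
    for record in records:
        elements_read.inc()  # does nothing when metrics are disabled
        output.append(record)
    return output, elements_read.value
```

With this shape the read loop itself stays the same whether metrics are on or off, so an overhead-sensitive pipeline only pays for a call that does nothing.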
>
>
>> Well these are just some questions around the subject so we can create a
>> common set of practices to include metrics in the IOs and eventually
>> improve the transform guide with this. What do you think about this? Do
>> you have other questions/ideas?
>>
>> Thanks,
>> Ismaël
>>
