Re: Metrics for Beam IOs.

2017-02-22 Thread Jean-Baptiste Onofré

Hi Ismaël,

You did a great summary and put the emphasis on the valid points.

I fully agree with you that we should provide something
agnostic for end users to get the metrics.

Let's see what the others are thinking. I'm ready to start a PoC to
bring part of Decanter into Beam ;)


Thanks again
Regards
JB

On 02/22/2017 07:55 PM, Ismaël Mejía wrote:

Hello,

Thanks everyone for giving your points of view. I was waiting to see how
the conversation evolved to summarize it and continue on the open points.

Points where mostly everybody agrees (please correct me if somebody still
disagrees):

- Default metrics should not affect performance; for that reason they
should be calculated by default (in this case disabling them matters
less, and we can probably evaluate making this configurable per IO in
the future).

- There is a clear distinction between metrics that are useful for the user
and metrics that are useful for the runner. The user metrics are the most
important for user experience. (we will focus on these).

- We should support metrics for IOs in both APIs: Source/Sink based IOs and
SDF.

- IO metrics should focus on what's relevant to the pipeline. Relevant
metrics should be discussed on a per-IO basis. (It is hard to generalize,
but we will probably make progress faster by creating metrics for each
one and then consolidating the common ones).

Points where consensus is not yet achieved:

- Should IOs expose metrics that are useful for the runners? And if so, how?

I think this is important but not relevant to the current discussion, so we
should probably open a separate conversation for it. My only comment
is that we must avoid the interference of mixing runner- and
user-oriented metrics (probably via a namespace).

- Where is the frontier of Beam's responsibilities for metrics? Should
we have a runner-agnostic way to collect metrics (different from result
polling)?

We can offer a plugin-like system to push metrics into given sinks; JB
proposed an approach similar to Karaf's Decanter. There is also the issue
of pull-based metrics like those of Codahale.

As a user I think having something like what JB proposed is nice; even a
REST service to query information about pipelines in a runner-agnostic way
would make me happy too. But again, it is up to the community to decide how
we can implement this and whether it should be part of Beam.

What do you guys think about the pending issues? Did I miss anything else?

Ismaël



On Sat, Feb 18, 2017 at 9:02 PM, Jean-Baptiste Onofré 
wrote:


Yes, agree Ben.

More than the collected metrics, my question is more how to
"expose"/"push" those metrics.

Imagine I have a pipeline executed using the Spark runner on a Spark
cluster. Now, I change this pipeline to use the Dataflow runner on the
Google Cloud Dataflow service. I have to completely change the way I get
metrics.
Optionally, if I'm able to define something like --metric-appender=foo.cfg
containing, in addition to the execution-engine-specific layer, a target
where I can push the metrics (like Elasticsearch or Kafka), I can implement
a simple and generic way of harvesting metrics.
It's totally fine to have some metrics specific to Spark when using the
Spark runner and others specific to Dataflow when using the Dataflow
runner; my point is more: how can I send the metrics to my global
monitoring/reporting layer?
Somehow, we can see the metrics as a meta side output of the pipeline,
sent to a target sink.

I don't want to change or bypass the execution-engine specifics; I mean
providing a way for the user to target their own system (optional).

Regards
JB
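
(For illustration, a minimal sketch of what such a pluggable appender
contract could look like, in Java. MetricsSink, MetricRecord and the
--metric-appender wiring are hypothetical names for discussion, not
actual Beam API:)

  import java.io.IOException;
  import java.util.Collection;

  /** One harvested metric value, independent of the execution engine. */
  class MetricRecord {
    final String namespace;  // e.g. "KafkaIO"
    final String name;       // e.g. "recordsRead"
    final long value;
    final long timestampMillis;

    MetricRecord(String namespace, String name, long value, long timestampMillis) {
      this.namespace = namespace;
      this.name = name;
      this.value = value;
      this.timestampMillis = timestampMillis;
    }
  }

  /** Pluggable target that something like --metric-appender=foo.cfg would
      select; implementations could push to Elasticsearch, Kafka, etc. */
  interface MetricsSink {
    void write(Collection<MetricRecord> records) throws IOException;
  }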


On 02/18/2017 06:15 PM, Ben Chambers wrote:


The question is how much of metrics has to be in the runner and how much
can be shared. So far we share the API the user uses to report metrics -
this is the most important part since it is required for pipelines to be
portable.

The next piece that could be shared is something related to reporting.
But implementing logically correct metrics requires the execution engine to
be involved, since it depends on how and which bundles are retried. What I'm
not sure about is how much can be shared and/or made available for these
runners vs. how much is tied to the execution engine.

On Sat, Feb 18, 2017, 8:52 AM Jean-Baptiste Onofré 
wrote:

For Spark, I fully agree. My point is more about when the execution engine
or runner doesn't provide anything, or we have to provide a generic way of
harvesting/pushing metrics.

It could at least be a documentation point. Actually, I'm evaluating the
monitoring capabilities of the different runners.

Regards
JB

On 02/18/2017 05:47 PM, Amit Sela wrote:


That's what I don't understand - why would we want that ?
Taking on responsibilities in the "stack" should have a good reason.

Someone choosing to run Beam on Spark/Flink/Apex would have to take care of
installing those clusters, right? Perhaps providing them a resilient
underlying FS? And if he wants, set up monitoring (which even with the API
proposed he'd have to do).

Re: Metrics for Beam IOs.

2017-02-22 Thread Ismaël Mejía
Hello,

Thanks everyone for giving your points of view. I was waiting to see how
the conversation evolved to summarize it and continue on the open points.

Points where mostly everybody agrees (please correct me if somebody still
disagrees):

- Default metrics should not affect performance; for that reason they
should be calculated by default (in this case disabling them matters
less, and we can probably evaluate making this configurable per IO in
the future).

- There is a clear distinction between metrics that are useful for the user
and metrics that are useful for the runner. The user metrics are the most
important for user experience. (we will focus on these).

- We should support metrics for IOs in both APIs: Source/Sink based IOs and
SDF.

- IO metrics should focus on what's relevant to the pipeline. Relevant
metrics should be discussed on a per-IO basis. (It is hard to generalize,
but we will probably make progress faster by creating metrics for each
one and then consolidating the common ones).

Points where consensus is not yet achieved:

- Should IOs expose metrics that are useful for the runners? And if so, how?

I think this is important but not relevant to the current discussion, so we
should probably open a separate conversation for it. My only comment
is that we must avoid the interference of mixing runner- and
user-oriented metrics (probably via a namespace).

- Where is the frontier of Beam's responsibilities for metrics? Should
we have a runner-agnostic way to collect metrics (different from result
polling)?

We can offer a plugin-like system to push metrics into given sinks; JB
proposed an approach similar to Karaf's Decanter. There is also the issue
of pull-based metrics like those of Codahale.

As a user I think having something like what JB proposed is nice; even a
REST service to query information about pipelines in a runner-agnostic way
would make me happy too. But again, it is up to the community to decide how
we can implement this and whether it should be part of Beam.

What do you guys think about the pending issues? Did I miss anything else?

Ismaël



On Sat, Feb 18, 2017 at 9:02 PM, Jean-Baptiste Onofré 
wrote:

> Yes, agree Ben.
>
> More than the collected metrics, my question is more how to
> "expose"/"push" those metrics.
>
> Imagine I have a pipeline executed using the Spark runner on a Spark
> cluster. Now, I change this pipeline to use the Dataflow runner on the
> Google Cloud Dataflow service. I have to completely change the way I get
> metrics.
> Optionally, if I'm able to define something like --metric-appender=foo.cfg
> containing, in addition to the execution-engine-specific layer, a target
> where I can push the metrics (like Elasticsearch or Kafka), I can implement
> a simple and generic way of harvesting metrics.
> It's totally fine to have some metrics specific to Spark when using the
> Spark runner and others specific to Dataflow when using the Dataflow
> runner; my point is more: how can I send the metrics to my global
> monitoring/reporting layer?
> Somehow, we can see the metrics as a meta side output of the pipeline,
> sent to a target sink.
>
> I don't want to change or bypass the execution-engine specifics; I mean
> providing a way for the user to target their own system (optional).
>
> Regards
> JB
>
>
> On 02/18/2017 06:15 PM, Ben Chambers wrote:
>
>> The question is how much of metrics has to be in the runner and how much
>> can be shared. So far we share the API the user uses to report metrics -
>> this is the most important part since it is required for pipelines to be
>> portable.
>>
>> The next piece that could be shared is something related to reporting.
>> But implementing logically correct metrics requires the execution engine
>> to be involved, since it depends on how and which bundles are retried.
>> What I'm not sure about is how much can be shared and/or made available
>> for these runners vs. how much is tied to the execution engine.
>>
>> On Sat, Feb 18, 2017, 8:52 AM Jean-Baptiste Onofré 
>> wrote:
>>
>>> For Spark, I fully agree. My point is more about when the execution
>>> engine or runner doesn't provide anything, or we have to provide a
>>> generic way of harvesting/pushing metrics.
>>>
>>> It could at least be a documentation point. Actually, I'm evaluating the
>>> monitoring capabilities of the different runners.
>>>
>>> Regards
>>> JB
>>>
>>> On 02/18/2017 05:47 PM, Amit Sela wrote:
>>>
 That's what I don't understand - why would we want that ?
 Taking on responsibilities in the "stack" should have a good reason.

 Someone choosing to run Beam on Spark/Flink/Apex would have to take care of
 installing those clusters, right? Perhaps providing them a resilient
 underlying FS? And if he wants, set up monitoring (which even with the
 API proposed he'd have to do).

 I just don't see why it should be a part of the runner and/or Metrics API.

Re: Metrics for Beam IOs.

2017-02-18 Thread Jean-Baptiste Onofré
For Spark, I fully agree. My point is more about when the execution engine
or runner doesn't provide anything, or we have to provide a generic way of
harvesting/pushing metrics.

It could at least be a documentation point. Actually, I'm evaluating the
monitoring capabilities of the different runners.


Regards
JB

On 02/18/2017 05:47 PM, Amit Sela wrote:

That's what I don't understand - why would we want that ?
Taking on responsibilities in the "stack" should have a good reason.

Someone choosing to run Beam on Spark/Flink/Apex would have to take care of
installing those clusters, right? Perhaps providing them a resilient
underlying FS? And if he wants, set up monitoring (which even with the API
proposed he'd have to do).

I just don't see why it should be a part of the runner and/or Metrics API.

On Sat, Feb 18, 2017 at 6:35 PM Jean-Baptiste Onofré 
wrote:


Good point.

In Decanter, it's what I named a "scheduled collector". So, yes, the
adapter will periodically harvest metric to push.

Regards
JB

On 02/18/2017 05:30 PM, Amit Sela wrote:

First issue with a "push" metrics plugin - what if the runner's underlying
reporting mechanism is "pull"? Codahale ScheduledReporter will sample the
values every X and send to ...
So any runner using a "pull-like" mechanism would use an adapter?

On Sat, Feb 18, 2017 at 6:27 PM Jean-Baptiste Onofré 
wrote:


Hi Ben,

ok it's what I thought. Thanks for the clarification.

+1 for the plugin-like "push" API (it's what I have in mind too ;)).
I will start a PoC for discussion next week.

Regards
JB

On 02/18/2017 05:17 PM, Ben Chambers wrote:

The runner can already report metrics during pipeline execution so it is
usable for monitoring.

The pipeline result can be used to query metrics during pipeline execution,
so a first version of reporting to other systems is to periodically pull
metrics from the runner with that API.

We may eventually want to provide a plugin-like API to get the runner to
push metrics more directly to other metrics stores. This layer needs some
thought since it has to handle the complexity of attempted/committed
metrics to be consistent with the model.



On Sat, Feb 18, 2017, 5:44 AM Jean-Baptiste Onofré 

wrote:


Hi Amit,

before Beam, I didn't mind about portability ;) So I used the Spark
approach.

But, now, as a Beam user, I would expect a generic way to deal with
metrics whatever the runner would be.

Today, you are right: I'm using the solution provided by the execution
engine. That's the current approach and it works fine. And it's up to me
to leverage it (for instance, Accumulators) with my own system.

My thought is more to provide a generic way. It's only a discussion for
now ;)

Regards
JB

On 02/18/2017 02:38 PM, Amit Sela wrote:

On Sat, Feb 18, 2017 at 10:16 AM Jean-Baptiste Onofré <

j...@nanthrax.net>

wrote:


Hi Amit,

my point is: how do we provide metrics today to the end user, and how can
they use them to monitor a running pipeline ?

Clearly the runner is involved, but it should behave the same way for
all runners. Let me take an example.
In my ecosystem, I'm using both Flink and Spark with Beam, some
pipelines on each. I would like to get the metrics for all pipelines to
my monitoring backend. If I can "poll" from the execution engine metric
backend to my system that's acceptable, but it's an overhead of work.
Having a generic metric reporting layer would allow us to have a more
common way. If the user doesn't provide any reporting sink, then we use
the execution backend metric layer. If provided, we use the reporting
sink.



How did you do it before Beam? I think that for Spark you reported its
native metrics via the Codahale Reporter and Accumulators were visible in
the UI, and the Spark runner took it a step further to make it all visible
via Codahale. Assuming Flink does something similar, it all belongs to
runner setup/configuration.



About your question: you are right, it's possible to update a collector
or appender without impacting anything else.

Regards
JB

On 02/17/2017 10:38 PM, Amit Sela wrote:

@JB I think what you're suggesting is that Beam should provide a "Metrics
Reporting" API as well, and I used to think like you, but the more I
thought of that the more I tend to disagree now.

The SDK is for users to author pipelines, so Metrics are for user-defined
metrics (in contrast to runner metrics).

The Runner API is supposed to help different backends integrate with
Beam to allow users to execute those pipelines on their favourite backend.
I believe the Runner API has to provide restrictions/demands that are just
enough so a runner could execute a Beam pipeline as best it can, and I'm
afraid that this would demand runner authors to do work that is unnecessary.
This is also sort of "crossing the line" into the runner's domain and
"telling it how to do" instead of what, and I don't think we want that.

I do believe however that runners should integrate the Metrics into their
own metrics reporting system - but that's for the runner author to decide.

Re: Metrics for Beam IOs.

2017-02-18 Thread Amit Sela
That's what I don't understand - why would we want that ?
Taking on responsibilities in the "stack" should have a good reason.

Someone choosing to run Beam on Spark/Flink/Apex would have to take care of
installing those clusters, right? Perhaps providing them a resilient
underlying FS? And if he wants, set up monitoring (which even with the API
proposed he'd have to do).

I just don't see why it should be a part of the runner and/or Metrics API.

On Sat, Feb 18, 2017 at 6:35 PM Jean-Baptiste Onofré 
wrote:

> Good point.
>
> In Decanter, it's what I named a "scheduled collector". So, yes, the
> adapter will periodically harvest metric to push.
>
> Regards
> JB
>
> On 02/18/2017 05:30 PM, Amit Sela wrote:
> > First issue with a "push" metrics plugin - what if the runner's underlying
> > reporting mechanism is "pull"? Codahale ScheduledReporter will sample the
> > values every X and send to ...
> > So any runner using a "pull-like" mechanism would use an adapter?
> >
> > On Sat, Feb 18, 2017 at 6:27 PM Jean-Baptiste Onofré 
> > wrote:
> >
> >> Hi Ben,
> >>
> >> ok it's what I thought. Thanks for the clarification.
> >>
> >> +1 for the plugin-like "push" API (it's what I have in mind too ;)).
> >> I will start a PoC for discussion next week.
> >>
> >> Regards
> >> JB
> >>
> >> On 02/18/2017 05:17 PM, Ben Chambers wrote:
> >>> The runner can already report metrics during pipeline execution so it
> >>> is usable for monitoring.
> >>>
> >>> The pipeline result can be used to query metrics during pipeline
> >>> execution, so a first version of reporting to other systems is to
> >>> periodically pull metrics from the runner with that API.
> >>>
> >>> We may eventually want to provide a plugin-like API to get the runner
> >>> to push metrics more directly to other metrics stores. This layer needs
> >>> some thought since it has to handle the complexity of
> >>> attempted/committed metrics to be consistent with the model.
> >>>
> >>>
> >>>
> >>> On Sat, Feb 18, 2017, 5:44 AM Jean-Baptiste Onofré 
> >> wrote:
> >>>
> >>> Hi Amit,
> >>>
> >>> before Beam, I didn't mind about portability ;) So I used the Spark
> >>> approach.
> >>>
> >>> But, now, as a Beam user, I would expect a generic way to deal with
> >>> metrics whatever the runner would be.
> >>>
> >>> Today, you are right: I'm using the solution provided by the execution
> >>> engine. That's the current approach and it works fine. And it's up to
> >>> me to leverage it (for instance, Accumulators) with my own system.
> >>>
> >>> My thought is more to provide a generic way. It's only a discussion for
> >>> now ;)
> >>>
> >>> Regards
> >>> JB
> >>>
> >>> On 02/18/2017 02:38 PM, Amit Sela wrote:
>  On Sat, Feb 18, 2017 at 10:16 AM Jean-Baptiste Onofré <
> j...@nanthrax.net>
>  wrote:
> 
> > Hi Amit,
> >
> > my point is: how do we provide metrics today to the end user, and how
> > can they use them to monitor a running pipeline ?
> >
> > Clearly the runner is involved, but it should behave the same way for
> > all runners. Let me take an example.
> > In my ecosystem, I'm using both Flink and Spark with Beam, some
> > pipelines on each. I would like to get the metrics for all pipelines to
> > my monitoring backend. If I can "poll" from the execution engine metric
> > backend to my system that's acceptable, but it's an overhead of work.
> > Having a generic metric reporting layer would allow us to have a more
> > common way. If the user doesn't provide any reporting sink, then we use
> > the execution backend metric layer. If provided, we use the reporting
> > sink.
> >
>  How did you do it before Beam? I think that for Spark you reported its
>  native metrics via the Codahale Reporter and Accumulators were visible in
>  the UI, and the Spark runner took it a step further to make it all
>  visible via Codahale. Assuming Flink does something similar, it all
>  belongs to runner setup/configuration.
>
> > About your question: you are right, it's possible to update a collector
> > or appender without impacting anything else.
> >
> > Regards
> > JB
> >
> > On 02/17/2017 10:38 PM, Amit Sela wrote:
> >> @JB I think what you're suggesting is that Beam should provide a
> >> "Metrics Reporting" API as well, and I used to think like you, but the
> >> more I thought of that the more I tend to disagree now.
> >>
> >> The SDK is for users to author pipelines, so Metrics are for
> >> user-defined metrics (in contrast to runner metrics).
> >>
> >> The Runner API is supposed to help different backends integrate with
> >> Beam to allow users to execute those pipelines on their favourite
> >> backend. I believe the Runner API has to provide restrictions/demands
> >> that are just enough so a runner could execute a Beam pipeline as best
> >> it can.

Re: Metrics for Beam IOs.

2017-02-18 Thread Aviem Zur
Is there a way to leverage runners' existing metrics sinks?
As stated by Amit & Stas, the Spark runner uses Spark's metrics sink to
report Beam's aggregators and metrics.
Other runners may also have a similar capability, I'm not sure. This could
remove the need for a plugin, and for dealing with push/pull.
I'm assuming we should compile a table of what can be supported by each
runner in this area and then decide on a way to move forward?

On Sat, Feb 18, 2017 at 6:35 PM Jean-Baptiste Onofré 
wrote:

> Good point.
>
> In Decanter, it's what I named a "scheduled collector". So, yes, the
> adapter will periodically harvest metric to push.
>
> Regards
> JB
>
> On 02/18/2017 05:30 PM, Amit Sela wrote:
> > First issue with a "push" metrics plugin - what if the runner's underlying
> > reporting mechanism is "pull"? Codahale ScheduledReporter will sample the
> > values every X and send to ...
> > So any runner using a "pull-like" mechanism would use an adapter?
> >
> > On Sat, Feb 18, 2017 at 6:27 PM Jean-Baptiste Onofré 
> > wrote:
> >
> >> Hi Ben,
> >>
> >> ok it's what I thought. Thanks for the clarification.
> >>
> >> +1 for the plugin-like "push" API (it's what I have in mind too ;)).
> >> I will start a PoC for discussion next week.
> >>
> >> Regards
> >> JB
> >>
> >> On 02/18/2017 05:17 PM, Ben Chambers wrote:
> >>> The runner can already report metrics during pipeline execution so it
> >>> is usable for monitoring.
> >>>
> >>> The pipeline result can be used to query metrics during pipeline
> >>> execution, so a first version of reporting to other systems is to
> >>> periodically pull metrics from the runner with that API.
> >>>
> >>> We may eventually want to provide a plugin-like API to get the runner
> >>> to push metrics more directly to other metrics stores. This layer needs
> >>> some thought since it has to handle the complexity of
> >>> attempted/committed metrics to be consistent with the model.
> >>>
> >>>
> >>>
> >>> On Sat, Feb 18, 2017, 5:44 AM Jean-Baptiste Onofré 
> >> wrote:
> >>>
> >>> Hi Amit,
> >>>
> >>> before Beam, I didn't mind about portability ;) So I used the Spark
> >>> approach.
> >>>
> >>> But, now, as a Beam user, I would expect a generic way to deal with
> >>> metrics whatever the runner would be.
> >>>
> >>> Today, you are right: I'm using the solution provided by the execution
> >>> engine. That's the current approach and it works fine. And it's up to
> >>> me to leverage it (for instance, Accumulators) with my own system.
> >>>
> >>> My thought is more to provide a generic way. It's only a discussion for
> >>> now ;)
> >>>
> >>> Regards
> >>> JB
> >>>
> >>> On 02/18/2017 02:38 PM, Amit Sela wrote:
>  On Sat, Feb 18, 2017 at 10:16 AM Jean-Baptiste Onofré <
> j...@nanthrax.net>
>  wrote:
> 
> > Hi Amit,
> >
> > my point is: how do we provide metrics today to the end user, and how
> > can they use them to monitor a running pipeline ?
> >
> > Clearly the runner is involved, but it should behave the same way for
> > all runners. Let me take an example.
> > In my ecosystem, I'm using both Flink and Spark with Beam, some
> > pipelines on each. I would like to get the metrics for all pipelines to
> > my monitoring backend. If I can "poll" from the execution engine metric
> > backend to my system that's acceptable, but it's an overhead of work.
> > Having a generic metric reporting layer would allow us to have a more
> > common way. If the user doesn't provide any reporting sink, then we use
> > the execution backend metric layer. If provided, we use the reporting
> > sink.
> >
>  How did you do it before Beam? I think that for Spark you reported its
>  native metrics via the Codahale Reporter and Accumulators were visible in
>  the UI, and the Spark runner took it a step further to make it all
>  visible via Codahale. Assuming Flink does something similar, it all
>  belongs to runner setup/configuration.
>
> > About your question: you are right, it's possible to update a collector
> > or appender without impacting anything else.
> >
> > Regards
> > JB
> >
> > On 02/17/2017 10:38 PM, Amit Sela wrote:
> >> @JB I think what you're suggesting is that Beam should provide a
> >> "Metrics Reporting" API as well, and I used to think like you, but the
> >> more I thought of that the more I tend to disagree now.
> >>
> >> The SDK is for users to author pipelines, so Metrics are for
> >> user-defined metrics (in contrast to runner metrics).
> >>
> >> The Runner API is supposed to help different backends integrate with
> >> Beam to allow users to execute those pipelines on their favourite
> >> backend. I believe the Runner API has to provide restrictions/demands
> >> that are just enough so a runner could execute a Beam pipeline as best
> >> it can.

Re: Metrics for Beam IOs.

2017-02-18 Thread Jean-Baptiste Onofré

Good point.

In Decanter, it's what I named a "scheduled collector". So, yes, the
adapter will periodically harvest metrics to push.


Regards
JB

On 02/18/2017 05:30 PM, Amit Sela wrote:

First issue with a "push" metrics plugin - what if the runner's underlying
reporting mechanism is "pull"? Codahale ScheduledReporter will sample the
values every X and send to ...
So any runner using a "pull-like" mechanism would use an adapter?

On Sat, Feb 18, 2017 at 6:27 PM Jean-Baptiste Onofré 
wrote:


Hi Ben,

ok it's what I thought. Thanks for the clarification.

+1 for the plugin-like "push" API (it's what I have in mind too ;)).
I will start a PoC for discussion next week.

Regards
JB

On 02/18/2017 05:17 PM, Ben Chambers wrote:

The runner can already report metrics during pipeline execution so it is
usable for monitoring.

The pipeline result can be used to query metrics during pipeline execution,
so a first version of reporting to other systems is to periodically pull
metrics from the runner with that API.

We may eventually want to provide a plugin-like API to get the runner to
push metrics more directly to other metrics stores. This layer needs some
thought since it has to handle the complexity of attempted/committed
metrics to be consistent with the model.



On Sat, Feb 18, 2017, 5:44 AM Jean-Baptiste Onofré 

wrote:


Hi Amit,

before Beam, I didn't mind about portability ;) So I used the Spark
approach.

But, now, as a Beam user, I would expect a generic way to deal with
metrics whatever the runner would be.

Today, you are right: I'm using the solution provided by the execution
engine. That's the current approach and it works fine. And it's up to me
to leverage it (for instance, Accumulators) with my own system.

My thought is more to provide a generic way. It's only a discussion for
now ;)

Regards
JB

On 02/18/2017 02:38 PM, Amit Sela wrote:

On Sat, Feb 18, 2017 at 10:16 AM Jean-Baptiste Onofré 
wrote:


Hi Amit,

my point is: how do we provide metrics today to the end user, and how can
they use them to monitor a running pipeline ?

Clearly the runner is involved, but it should behave the same way for
all runners. Let me take an example.
In my ecosystem, I'm using both Flink and Spark with Beam, some
pipelines on each. I would like to get the metrics for all pipelines to
my monitoring backend. If I can "poll" from the execution engine metric
backend to my system that's acceptable, but it's an overhead of work.
Having a generic metric reporting layer would allow us to have a more
common way. If the user doesn't provide any reporting sink, then we use
the execution backend metric layer. If provided, we use the reporting
sink.



How did you do it before Beam? I think that for Spark you reported its
native metrics via the Codahale Reporter and Accumulators were visible in
the UI, and the Spark runner took it a step further to make it all visible
via Codahale. Assuming Flink does something similar, it all belongs to
runner setup/configuration.



About your question: you are right, it's possible to update a collector
or appender without impacting anything else.

Regards
JB

On 02/17/2017 10:38 PM, Amit Sela wrote:

@JB I think what you're suggesting is that Beam should provide a "Metrics
Reporting" API as well, and I used to think like you, but the more I
thought of that the more I tend to disagree now.

The SDK is for users to author pipelines, so Metrics are for user-defined
metrics (in contrast to runner metrics).

The Runner API is supposed to help different backends integrate with
Beam to allow users to execute those pipelines on their favourite backend.
I believe the Runner API has to provide restrictions/demands that are just
enough so a runner could execute a Beam pipeline as best it can, and I'm
afraid that this would demand runner authors to do work that is unnecessary.
This is also sort of "crossing the line" into the runner's domain and
"telling it how to do" instead of what, and I don't think we want that.

I do believe however that runners should integrate the Metrics into their
own metrics reporting system - but that's for the runner author to decide.
Stas did this for the Spark runner because Spark doesn't report back
user-defined Accumulators (Spark's Aggregators) to its Metrics system.

On a curious note though, did you use an OSGi service per event-type? So
you can upgrade specific event-handlers without taking down the entire
reporter? But that's really unrelated to this thread :-) .



On Fri, Feb 17, 2017 at 8:36 PM Ben Chambers wrote:


I don't think it is possible for there to be a general mechanism for
pushing metrics out during the execution of a pipeline. The Metrics API
suggests that metrics should be reported as values across all attempts
and values across only successful attempts. The latter requires runner
involvement to ensure that a given metric value is atomically incremented
(or checkpointed) when the bundle it was reported in is committed.

Re: Metrics for Beam IOs.

2017-02-18 Thread Amit Sela
First issue with a "push" metrics plugin - what if the runner's underlying
reporting mechanism is "pull"? Codahale ScheduledReporter will sample the
values every X and send to ...
So any runner using a "pull-like" mechanism would use an adapter?
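
(For illustration, a minimal sketch of such an adapter on top of Codahale's
ScheduledReporter: the reporter samples the pull-style registry every X and
pushes the sampled values. PushTarget is a hypothetical interface; the
ScheduledReporter API itself is real Codahale:)

  import java.util.SortedMap;
  import java.util.concurrent.TimeUnit;
  import com.codahale.metrics.Counter;
  import com.codahale.metrics.Gauge;
  import com.codahale.metrics.Histogram;
  import com.codahale.metrics.Meter;
  import com.codahale.metrics.MetricFilter;
  import com.codahale.metrics.MetricRegistry;
  import com.codahale.metrics.ScheduledReporter;
  import com.codahale.metrics.Timer;

  /** Samples a pull-style Codahale registry on a schedule and pushes values. */
  class PushAdapter extends ScheduledReporter {
    interface PushTarget { void push(String name, long value); } // hypothetical

    private final PushTarget target;

    PushAdapter(MetricRegistry registry, PushTarget target) {
      super(registry, "push-adapter", MetricFilter.ALL,
            TimeUnit.SECONDS, TimeUnit.MILLISECONDS);
      this.target = target;
    }

    @Override
    @SuppressWarnings("rawtypes")
    public void report(SortedMap<String, Gauge> gauges,
                       SortedMap<String, Counter> counters,
                       SortedMap<String, Histogram> histograms,
                       SortedMap<String, Meter> meters,
                       SortedMap<String, Timer> timers) {
      // Only counters are forwarded here; other metric types would be similar.
      counters.forEach((name, counter) -> target.push(name, counter.getCount()));
    }
  }

  // Usage: new PushAdapter(registry, target).start(10, TimeUnit.SECONDS);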

On Sat, Feb 18, 2017 at 6:27 PM Jean-Baptiste Onofré 
wrote:

> Hi Ben,
>
> ok it's what I thought. Thanks for the clarification.
>
> +1 for the plugin-like "push" API (it's what I have in mind too ;)).
> I will start a PoC for discussion next week.
>
> Regards
> JB
>
> On 02/18/2017 05:17 PM, Ben Chambers wrote:
> > The runner can already report metrics during pipeline execution so it is
> > usable for monitoring.
> >
> > The pipeline result can be used to query metrics during pipeline
> > execution, so a first version of reporting to other systems is to
> > periodically pull metrics from the runner with that API.
> >
> > We may eventually want to provide a plugin-like API to get the runner to
> > push metrics more directly to other metrics stores. This layer needs some
> > thought since it has to handle the complexity of attempted/committed
> > metrics to be consistent with the model.
> >
> >
> >
> > On Sat, Feb 18, 2017, 5:44 AM Jean-Baptiste Onofré 
> wrote:
> >
> > Hi Amit,
> >
> > before Beam, I didn't mind about portability ;) So I used the Spark
> > approach.
> >
> > But, now, as a Beam user, I would expect a generic way to deal with
> > metrics whatever the runner would be.
> >
> > Today, you are right: I'm using the solution provided by the execution
> > engine. That's the current approach and it works fine. And it's up to me
> > to leverage it (for instance, Accumulators) with my own system.
> >
> > My thought is more to provide a generic way. It's only a discussion for
> > now ;)
> >
> > Regards
> > JB
> >
> > On 02/18/2017 02:38 PM, Amit Sela wrote:
> >> On Sat, Feb 18, 2017 at 10:16 AM Jean-Baptiste Onofré 
> >> wrote:
> >>
> >>> Hi Amit,
> >>>
> >>> my point is: how do we provide metrics today to the end user, and how
> >>> can they use them to monitor a running pipeline ?
> >>>
> >>> Clearly the runner is involved, but it should behave the same way for
> >>> all runners. Let me take an example.
> >>> In my ecosystem, I'm using both Flink and Spark with Beam, some
> >>> pipelines on each. I would like to get the metrics for all pipelines to
> >>> my monitoring backend. If I can "poll" from the execution engine metric
> >>> backend to my system that's acceptable, but it's an overhead of work.
> >>> Having a generic metric reporting layer would allow us to have a more
> >>> common way. If the user doesn't provide any reporting sink, then we use
> >>> the execution backend metric layer. If provided, we use the reporting
> >>> sink.
> >>>
> >> How did you do it before Beam? I think that for Spark you reported its
> >> native metrics via the Codahale Reporter and Accumulators were visible
> >> in the UI, and the Spark runner took it a step further to make it all
> >> visible via Codahale. Assuming Flink does something similar, it all
> >> belongs to runner setup/configuration.
> >>
> >>>
> >>> About your question: you are right, it's possible to update a collector
> >>> or appender without impacting anything else.
> >>>
> >>> Regards
> >>> JB
> >>>
> >>> On 02/17/2017 10:38 PM, Amit Sela wrote:
>  @JB I think what you're suggesting is that Beam should provide a "Metrics
>  Reporting" API as well, and I used to think like you, but the more I
>  thought of that the more I tend to disagree now.
>
>  The SDK is for users to author pipelines, so Metrics are for user-defined
>  metrics (in contrast to runner metrics).
>
>  The Runner API is supposed to help different backends integrate with
>  Beam to allow users to execute those pipelines on their favourite backend.
>  I believe the Runner API has to provide restrictions/demands that are just
>  enough so a runner could execute a Beam pipeline as best it can, and I'm
>  afraid that this would demand runner authors to do work that is
>  unnecessary.
>  This is also sort of "crossing the line" into the runner's domain and
>  "telling it how to do" instead of what, and I don't think we want that.
>
>  I do believe however that runners should integrate the Metrics into their
>  own metrics reporting system - but that's for the runner author to decide.
>  Stas did this for the Spark runner because Spark doesn't report back
>  user-defined Accumulators (Spark's Aggregators) to its Metrics system.
>
>  On a curious note though, did you use an OSGi service per event-type? So
>  you can upgrade specific event-handlers without taking down the entire
>  reporter? But that's really unrelated to this thread :-) .
> 
> 
> 
>  On Fri, Feb 17, 2017 at 8:36 PM Ben Chambers
> >>> 
>  wrote:
> 

Re: Metrics for Beam IOs.

2017-02-18 Thread Jean-Baptiste Onofré

Hi Ben,

ok it's what I thought. Thanks for the clarification.

+1 for the plugin-like "push" API (it's what I have in mind too ;)).
I will start a PoC for discussion next week.

Regards
JB

On 02/18/2017 05:17 PM, Ben Chambers wrote:

The runner can already report metrics during pipeline execution so it is
usable for monitoring.

The pipeline result can be used to query metrics during pipeline execution,
so a first version of reporting to other systems is to periodically pull
metrics from the runner with that API.

We may eventually want to provide a plugin-like API to get the runner to
push metrics more directly to other metrics stores. This layer needs some
thought since it has to handle the complexity of attempted/committed
metrics to be consistent with the model.



On Sat, Feb 18, 2017, 5:44 AM Jean-Baptiste Onofré  wrote:

Hi Amit,

before Beam, I didn't mind about portability ;) So I used the Spark
approach.

But, now, as a Beam user, I would expect a generic way to deal with
metrics whatever the runner would be.

Today, you are right: I'm using the solution provided by the execution
engine. That's the current approach and it works fine. And it's up to me
to leverage it (for instance, Accumulators) with my own system.

My thought is more to provide a generic way. It's only a discussion for
now ;)

Regards
JB

On 02/18/2017 02:38 PM, Amit Sela wrote:

On Sat, Feb 18, 2017 at 10:16 AM Jean-Baptiste Onofré 
wrote:


Hi Amit,

my point is: how do we provide metrics today to the end user, and how can
they use them to monitor a running pipeline ?

Clearly the runner is involved, but it should behave the same way for
all runners. Let me take an example.
In my ecosystem, I'm using both Flink and Spark with Beam, some
pipelines on each. I would like to get the metrics for all pipelines to
my monitoring backend. If I can "poll" from the execution engine metric
backend to my system that's acceptable, but it's an overhead of work.
Having a generic metric reporting layer would allow us to have a more
common way. If the user doesn't provide any reporting sink, then we use
the execution backend metric layer. If provided, we use the reporting
sink.



How did you do it before Beam? I think that for Spark you reported its
native metrics via the Codahale Reporter and Accumulators were visible in
the UI, and the Spark runner took it a step further to make it all visible
via Codahale. Assuming Flink does something similar, it all belongs to
runner setup/configuration.



About your question: you are right, it's possible to update a collector
or appender without impacting anything else.

Regards
JB

On 02/17/2017 10:38 PM, Amit Sela wrote:

@JB I think what you're suggesting is that Beam should provide a "Metrics
Reporting" API as well, and I used to think like you, but the more I
thought of that the more I tend to disagree now.

The SDK is for users to author pipelines, so Metrics are for user-defined
metrics (in contrast to runner metrics).

The Runner API is supposed to help different backends integrate with
Beam to allow users to execute those pipelines on their favourite backend.
I believe the Runner API has to provide restrictions/demands that are just
enough so a runner could execute a Beam pipeline as best it can, and I'm
afraid that this would demand runner authors to do work that is unnecessary.
This is also sort of "crossing the line" into the runner's domain and
"telling it how to do" instead of what, and I don't think we want that.

I do believe however that runners should integrate the Metrics into their
own metrics reporting system - but that's for the runner author to decide.
Stas did this for the Spark runner because Spark doesn't report back
user-defined Accumulators (Spark's Aggregators) to its Metrics system.

On a curious note though, did you use an OSGi service per event-type? So
you can upgrade specific event-handlers without taking down the entire
reporter? But that's really unrelated to this thread :-) .



On Fri, Feb 17, 2017 at 8:36 PM Ben Chambers wrote:


I don't think it is possible for there to be a general mechanism for
pushing metrics out during the execution of a pipeline. The Metrics API
suggests that metrics should be reported as values across all attempts
and values across only successful attempts. The latter requires runner
involvement to ensure that a given metric value is atomically incremented
(or checkpointed) when the bundle it was reported in is committed.

Aviem has already implemented Metrics support for the Spark runner. I am
working on support for the Dataflow runner.

On Fri, Feb 17, 2017 at 7:50 AM Jean-Baptiste Onofré 
wrote:

Hi guys,

As I'm back from vacation, I'm back on this topic ;)

It's a great discussion, and I think the metric coverage for IOs is
good.

However, there's a point that we discussed very fast in the thread and I
think it's an important one.

Re: Metrics for Beam IOs.

2017-02-18 Thread Ben Chambers
The runner can already report metrics during pipeline execution so it is
usable for monitoring.

The pipeline result can be used to query metrics during pipeline execution,
so a first version of reporting to other systems is to periodically pull
metrics from the runner with that API.

We may eventually want to provide a plugin-like API to get the runner to
push metrics more directly to other metrics stores. This layer needs some
thought since it has to handle the complexity of attempted/committed
metrics to be consistent with the model.
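
(For illustration, a minimal sketch of that first version: a loop that
periodically pulls metrics from the runner via the PipelineResult API. The
calls follow the Beam metrics API of early 2017; method names changed in
later releases, e.g. getCounters()/getAttempted():)

  import org.apache.beam.sdk.PipelineResult;
  import org.apache.beam.sdk.metrics.MetricQueryResults;
  import org.apache.beam.sdk.metrics.MetricResult;
  import org.apache.beam.sdk.metrics.MetricsFilter;

  /** Periodically polls the runner for metric values while the pipeline runs. */
  class MetricsPoller {
    static void pollUntilDone(PipelineResult result) throws InterruptedException {
      while (!result.getState().isTerminal()) {
        MetricQueryResults snapshot =
            result.metrics().queryMetrics(MetricsFilter.builder().build());
        for (MetricResult<Long> counter : snapshot.counters()) {
          // A real reporter would forward this to a monitoring backend.
          System.out.println(counter.name() + " = " + counter.attempted());
        }
        Thread.sleep(10_000); // sample every 10s
      }
    }
  }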



On Sat, Feb 18, 2017, 5:44 AM Jean-Baptiste Onofré  wrote:

Hi Amit,

before Beam, I didn't mind about portability ;) So I used the Spark
approach.

But, now, as a Beam user, I would expect a generic way to deal with
metrics whatever the runner would be.

Today, you are right: I'm using the solution provided by the execution
engine. That's the current approach and it works fine. And it's up to me
to leverage it (for instance, Accumulators) with my own system.

My thought is more to provide a generic way. It's only a discussion for
now ;)

Regards
JB

On 02/18/2017 02:38 PM, Amit Sela wrote:
> On Sat, Feb 18, 2017 at 10:16 AM Jean-Baptiste Onofré 
> wrote:
>
>> Hi Amit,
>>
>> my point is: how do we provide metrics today to the end user, and how can
>> they use them to monitor a running pipeline ?
>>
>> Clearly the runner is involved, but it should behave the same way for
>> all runners. Let me take an example.
>> In my ecosystem, I'm using both Flink and Spark with Beam, some
>> pipelines on each. I would like to get the metrics for all pipelines to
>> my monitoring backend. If I can "poll" from the execution engine metric
>> backend to my system that's acceptable, but it's an overhead of work.
>> Having a generic metric reporting layer would allow us to have a more
>> common way. If the user doesn't provide any reporting sink, then we use
>> the execution backend metric layer. If provided, we use the reporting
>> sink.
>>
> How did you do it before Beam? I think that for Spark you reported its
> native metrics via the Codahale Reporter and Accumulators were visible in
> the UI, and the Spark runner took it a step further to make it all visible
> via Codahale. Assuming Flink does something similar, it all belongs to
> runner setup/configuration.
>
>>
>> About your question: you are right, it's possible to update a collector
>> or appender without impacting anything else.
>>
>> Regards
>> JB
>>
>> On 02/17/2017 10:38 PM, Amit Sela wrote:
>>> @JB I think what you're suggesting is that Beam should provide a "Metrics
>>> Reporting" API as well, and I used to think like you, but the more I
>>> thought of that the more I tend to disagree now.
>>>
>>> The SDK is for users to author pipelines, so Metrics are for user-defined
>>> metrics (in contrast to runner metrics).
>>>
>>> The Runner API is supposed to help different backends integrate with
>>> Beam to allow users to execute those pipelines on their favourite
>>> backend. I believe the Runner API has to provide restrictions/demands
>>> that are just enough so a runner could execute a Beam pipeline as best
>>> it can, and I'm afraid that this would demand runner authors to do work
>>> that is unnecessary.
>>> This is also sort of "crossing the line" into the runner's domain and
>>> "telling it how to do" instead of what, and I don't think we want that.
>>>
>>> I do believe however that runners should integrate the Metrics into
>>> their own metrics reporting system - but that's for the runner author to
>>> decide.
>>> Stas did this for the Spark runner because Spark doesn't report back
>>> user-defined Accumulators (Spark's Aggregators) to its Metrics system.
>>>
>>> On a curious note though, did you use an OSGi service per event-type? So
>>> you can upgrade specific event-handlers without taking down the entire
>>> reporter? But that's really unrelated to this thread :-) .
>>>
>>>
>>>
>>> On Fri, Feb 17, 2017 at 8:36 PM Ben Chambers
>> 
>>> wrote:
>>>
 I don't think it is possible for there to be a general mechanism for
 pushing metrics out during the execution of a pipeline. The Metrics API
 suggests that metrics should be reported as values across all attempts
 and values across only successful attempts. The latter requires runner
 involvement to ensure that a given metric value is atomically incremented
 (or checkpointed) when the bundle it was reported in is committed.

 Aviem has already implemented Metrics support for the Spark runner. I am
 working on support for the Dataflow runner.

 On Fri, Feb 17, 2017 at 7:50 AM Jean-Baptiste Onofré 
 wrote:

 Hi guys,

 As I'm back from vacation, I'm back on this topic ;)

 It's a great discussion, and I think the metric coverage for IOs is good.

 However, there's a point that we discussed very fast in the thread.

Re: Metrics for Beam IOs.

2017-02-18 Thread Jean-Baptiste Onofré

Hi Amit,

before Beam, I didn't mind about portability ;) So I used the Spark 
approach.


But, now, as a Beam user, I would expect a generic way to deal with
metrics whatever the runner would be.

Today, you are right: I'm using the solution provided by the execution
engine. That's the current approach and it works fine. And it's up to me
to leverage it (for instance, Accumulators) with my own system.


My thought is more to provide a generic way. It's only a discussion for 
now ;)


Regards
JB

On 02/18/2017 02:38 PM, Amit Sela wrote:

On Sat, Feb 18, 2017 at 10:16 AM Jean-Baptiste Onofré 
wrote:


Hi Amit,

my point is: how do we provide metrics today to the end user, and how can
they use them to monitor a running pipeline ?

Clearly the runner is involved, but it should behave the same way for
all runners. Let me take an example.
In my ecosystem, I'm using both Flink and Spark with Beam, some
pipelines on each. I would like to get the metrics for all pipelines to
my monitoring backend. If I can "poll" from the execution engine metric
backend to my system that's acceptable, but it's an overhead of work.
Having a generic metric reporting layer would allow us to have a more
common way. If the user doesn't provide any reporting sink, then we use
the execution backend metric layer. If provided, we use the reporting sink.


How did you do it before Beam? I think that for Spark you reported its
native metrics via the Codahale Reporter and Accumulators were visible in
the UI, and the Spark runner took it a step further to make it all visible
via Codahale. Assuming Flink does something similar, it all belongs to
runner setup/configuration.



About your question: you are right, it's possible to update a collector
or appender without impacting anything else.

Regards
JB

On 02/17/2017 10:38 PM, Amit Sela wrote:

@JB I think what you're suggesting is that Beam should provide a "Metrics
Reporting" API as well, and I used to think like you, but the more I
thought of that the more I tend to disagree now.

The SDK is for users to author pipelines, so Metrics are for user-defined
metrics (in contrast to runner metrics).

The Runner API is supposed to help different backends integrate with
Beam to allow users to execute those pipelines on their favourite backend.
I believe the Runner API has to provide restrictions/demands that are just
enough so a runner could execute a Beam pipeline as best it can, and I'm
afraid that this would demand runner authors to do work that is unnecessary.
This is also sort of "crossing the line" into the runner's domain and
"telling it how to do" instead of what, and I don't think we want that.

I do believe however that runners should integrate the Metrics into their
own metrics reporting system - but that's for the runner author to decide.
Stas did this for the Spark runner because Spark doesn't report back
user-defined Accumulators (Spark's Aggregators) to its Metrics system.

On a curious note though, did you use an OSGi service per event-type? So
you can upgrade specific event-handlers without taking down the entire
reporter? But that's really unrelated to this thread :-) .



On Fri, Feb 17, 2017 at 8:36 PM Ben Chambers wrote:


I don't think it is possible for there to be a general mechanism for
pushing metrics out during the execution of a pipeline. The Metrics API
suggests that metrics should be reported as values across all attempts
and values across only successful attempts. The latter requires runner
involvement to ensure that a given metric value is atomically incremented
(or checkpointed) when the bundle it was reported in is committed.

Aviem has already implemented Metrics support for the Spark runner. I am
working on support for the Dataflow runner.

On Fri, Feb 17, 2017 at 7:50 AM Jean-Baptiste Onofré 
wrote:

Hi guys,

As I'm back from vacation, I'm back on this topic ;)

It's a great discussion, and I think the metric coverage for IOs is
good.

However, there's a point that we discussed very fast in the thread and I
think it's an important one (maybe more important than the provided
metrics, actually, in terms of roadmap ;)).

Assuming we have pipelines, PTransforms, IOs, ... using the Metric API,
how do we expose the metrics for the end-users ?

A first approach would be to bind a JMX MBean server by the pipeline and
expose the metrics via MBeans. I don't think it's a good idea for the
following reasons:
1. It's not easy to know where the pipeline is actually executed, and
so, not easy to find the MBean server URI.
2. For the same reason, we can have port binding error.
3. If it could work for unbounded/streaming pipelines (as they are
always "running"), it's not really applicable for bounded/batch
pipelines as their lifetime is "limited" ;)

So, I think the "push" approach is better: during the execution, a
pipeline "internally" collects and pushes the metrics to a backend.
The "push" could be a kind of sink.

Re: Metrics for Beam IOs.

2017-02-18 Thread Stas Levin
JB, I think you raise a good point. As a user, if I'm sold on pipeline
portability, I would expect the deal to include metrics as well.
Otherwise, being able to port a pipeline to a different execution engine at
the cost of "flying blind" might not be as tempting.

If we wish to give users a consistent set of metrics (which is totally +1
IMHO), even before we decide on what such a metrics set should include,
there seem to be at least 2 ways to achieve this:

   1. Hard spec, via API: have the SDK define metric specs for runners/IOs
   to implement
   2. Soft spec, via documentation: specify the expected metrics in docs,
   some of the metrics could be "must have" while others could be "nice to
   have", and let runners implement (interpret?) what they can, and how they
   can.

Naturally, each of these has its pros and cons.

Hard spec is great for consistency, but may make things harder on the
implementors due to the efforts required to align the internals of each
execution engine to the SDK (e.g., attempted vs. committed metrics, as
discussed in a separate thread here on the dev list)

Soft spec is enforced by doc rather than by code and leaves consistency up
to discipline (meh...), but may make it easier on implementors to get
things done as they have more freedom.
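
(For illustration, a minimal sketch of what the "hard spec" option could
look like: canonical metric names defined once in the SDK so every IO
reports them consistently. The class and names here are illustrative, not
an actual Beam API; Metrics.counter() is the real user-facing entry point:)

  import org.apache.beam.sdk.metrics.Counter;
  import org.apache.beam.sdk.metrics.Metrics;

  /** Hypothetical hard spec: canonical names every source would report. */
  final class SourceMetricSpec {
    /** "Must have": elements successfully read by the source. */
    static Counter elementsRead(String ioNamespace) {
      return Metrics.counter(ioNamespace, "elementsRead");
    }
    /** "Nice to have": raw bytes read, where the source can know it. */
    static Counter bytesRead(String ioNamespace) {
      return Metrics.counter(ioNamespace, "bytesRead");
    }
  }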

-Stas

On Sat, Feb 18, 2017 at 10:16 AM Jean-Baptiste Onofré 
wrote:

> Hi Amit,
>
> my point is: how do we provide metrics today to the end user, and how can
> they use them to monitor a running pipeline ?
>
> Clearly the runner is involved, but it should behave the same way for
> all runners. Let me take an example.
> In my ecosystem, I'm using both Flink and Spark with Beam, some
> pipelines on each. I would like to get the metrics for all pipelines to
> my monitoring backend. If I can "poll" from the execution engine metric
> backend to my system that's acceptable, but it's an overhead of work.
> Having a generic metric reporting layer would allow us to have a more
> common way. If the user doesn't provide any reporting sink, then we use
> the execution backend metric layer. If provided, we use the reporting sink.
>
> About your question: you are right, it's possible to update a collector
> or appender without impacting anything else.
>
> Regards
> JB
>
> On 02/17/2017 10:38 PM, Amit Sela wrote:
> > @JB I think what you're suggesting is that Beam should provide a "Metrics
> > Reporting" API as well, and I used to think like you, but the more I
> > thought of that the more I tend to disagree now.
> >
> > The SDK is for users to author pipelines, so Metrics are for user-defined
> > metrics (in contrast to runner metrics).
> >
> > The Runner API is supposed to help different backends integrate with
> > Beam to allow users to execute those pipelines on their favourite
> > backend. I believe the Runner API has to provide restrictions/demands
> > that are just enough so a runner could execute a Beam pipeline as best
> > it can, and I'm afraid that this would demand runner authors to do work
> > that is unnecessary.
> > This is also sort of "crossing the line" into the runner's domain and
> > "telling it how to do" instead of what, and I don't think we want that.
> >
> > I do believe however that runners should integrate the Metrics into
> > their own metrics reporting system - but that's for the runner author
> > to decide.
> > Stas did this for the Spark runner because Spark doesn't report back
> > user-defined Accumulators (Spark's Aggregators) to its Metrics system.
> >
> > On a curious note though, did you use an OSGi service per event-type?
> > So you can upgrade specific event-handlers without taking down the
> > entire reporter? But that's really unrelated to this thread :-) .
> >
> >
> >
> > On Fri, Feb 17, 2017 at 8:36 PM Ben Chambers
> 
> > wrote:
> >
> >> I don't think it is possible for there to be a general mechanism for
> >> pushing metrics out during the execution of a pipeline. The Metrics API
> >> suggests that metrics should be reported as values across all attempts
> >> and values across only successful attempts. The latter requires runner
> >> involvement to ensure that a given metric value is atomically
> >> incremented (or checkpointed) when the bundle it was reported in is
> >> committed.
> >>
> >> Aviem has already implemented Metrics support for the Spark runner. I am
> >> working on support for the Dataflow runner.
> >>
> >> On Fri, Feb 17, 2017 at 7:50 AM Jean-Baptiste Onofré 
> >> wrote:
> >>
> >> Hi guys,
> >>
> >> As I'm back from vacation, I'm back on this topic ;)
> >>
> >> It's a great discussion, and I think the metric coverage for IOs is
> >> good.
> >>
> >> However, there's a point that we discussed very fast in the thread and I
> >> think it's an important one (maybe more important than the provided
> >> metrics, actually, in terms of roadmap ;)).
> >>
> >> Assuming we have pipelines, PTransforms, IOs, ... using the Metric API,
> >> how do we expose the metrics for the end-users ?

Re: Metrics for Beam IOs.

2017-02-18 Thread Jean-Baptiste Onofré

Hi Amit,

my point is: how do we provide metrics today to the end user, and how can
they use them to monitor a running pipeline ?

Clearly the runner is involved, but it should behave the same way for
all runners. Let me take an example.
In my ecosystem, I'm using both Flink and Spark with Beam, some
pipelines on each. I would like to get the metrics for all pipelines to
my monitoring backend. If I can "poll" from the execution engine metric
backend to my system that's acceptable, but it's an overhead of work.
Having a generic metric reporting layer would allow us to have a more
common way. If the user doesn't provide any reporting sink, then we use
the execution backend metric layer. If provided, we use the reporting sink.


About your question: you are right, it's possible to update a collector 
or appender without impacting anything else.


Regards
JB

On 02/17/2017 10:38 PM, Amit Sela wrote:

@JB I think what you're suggesting is that Beam should provide a "Metrics
Reporting" API as well, and I used to think like you, but the more I
thought of that the more I tend to disagree now.

The SDK is for users to author pipelines, so Metrics are for user-defined
metrics (in contrast to runner metrics).

The Runner API is supposed to help different backends integrate with
Beam to allow users to execute those pipelines on their favourite backend.
I believe the Runner API has to provide restrictions/demands that are just
enough so a runner could execute a Beam pipeline as best it can, and I'm
afraid that this would demand runner authors to do work that is unnecessary.
This is also sort of "crossing the line" into the runner's domain and
"telling it how to do" instead of what, and I don't think we want that.

I do believe however that runners should integrate the Metrics into their
own metrics reporting system - but that's for the runner author to decide.
Stas did this for the Spark runner because Spark doesn't report back
user-defined Accumulators (Spark's Aggregators) to its Metrics system.

On a curious note though, did you use an OSGi service per event-type? So
you can upgrade specific event-handlers without taking down the entire
reporter? But that's really unrelated to this thread :-) .



On Fri, Feb 17, 2017 at 8:36 PM Ben Chambers 
wrote:


It don't think it is possible for there to be a general mechanism for
pushing metrics out during the execution of a pipeline. The Metrics API
suggests that metrics should be reported as values across all attempts and
values across only successful attempts. The latter requires runner
involvement to ensure that a given metric value is atomically incremented
(or checkpointed) when the bundle it was reported in is committed.
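
To make the attempted/committed distinction concrete, here is a minimal sketch of querying both values after a run, assuming the Metrics API of that era (method names such as attempted()/committed() may differ across Beam versions, and committed() may throw on runners that do not support committed metrics):

import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.PipelineResult;
import org.apache.beam.sdk.metrics.MetricNameFilter;
import org.apache.beam.sdk.metrics.MetricQueryResults;
import org.apache.beam.sdk.metrics.MetricResult;
import org.apache.beam.sdk.metrics.MetricsFilter;

public class QueryMetricsExample {
  public static void printRecordsRead(Pipeline pipeline) {
    PipelineResult result = pipeline.run();
    result.waitUntilFinish();

    // Query a counter named "recordsRead" in the (illustrative) "myio" namespace.
    MetricQueryResults metrics = result.metrics().queryMetrics(
        MetricsFilter.builder()
            .addNameFilter(MetricNameFilter.named("myio", "recordsRead"))
            .build());

    for (MetricResult<Long> counter : metrics.counters()) {
      // attempted() includes retried bundles; committed() only reflects
      // bundles that were successfully committed by the runner.
      System.out.println(counter.name()
          + " attempted=" + counter.attempted()
          + " committed=" + counter.committed());
    }
  }
}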

Aviem has already implemented Metrics support for the Spark runner. I am
working on support for the Dataflow runner.

On Fri, Feb 17, 2017 at 7:50 AM Jean-Baptiste Onofré 
wrote:

Hi guys,

As I'm back from vacation, I'm back on this topic ;)

It's a great discussion, and I think the Metric IO coverage is good.

However, there's a point that we discussed very quickly in the thread and I
think it's an important one (maybe more important than the provided
metrics, actually, in terms of roadmap ;)).

Assuming we have pipelines, PTransforms, IOs, ... using the Metric API,
how do we expose the metrics for the end-users ?

A first approach would be to bind a JMX MBean server per pipeline and
expose the metrics via MBeans. I don't think it's a good idea for the
following reasons:
1. It's not easy to know where the pipeline is actually executed, and
so, not easy to find the MBean server URI.
2. For the same reason, we can have port binding error.
3. If it could work for unbounded/streaming pipelines (as they are
always "running"), it's not really applicable for bounded/batch
pipelines as their lifetime is "limited" ;)

So, I think the "push" approach is better: during the execution, a
pipeline "internally" collects and pushes the metric to a backend.
The "push" could a kind of sink. For instance, the metric "records" can
be sent to a Kafka topic, or directly to Elasticsearch or whatever.
The metric backend can deal with alerting, reporting, etc.

Basically, we have to define two things:
1. The "appender" where the metrics have to be sent (and the
corresponding configuration to connect, like Kafka or Elasticsearch
location)
2. The format of the metric data (for instance, json format).

In Apache Karaf, I created something similar named Decanter:


http://blog.nanthrax.net/2015/07/monitoring-and-alerting-with-apache-karaf-decanter/

http://karaf.apache.org/manual/decanter/latest-1/

Decanter provides collectors that harvest the metrics (like JMX MBean
attributes, log messages, ...). Basically, for Beam, the collector would 
directly be the Metric API used by the pipeline parts.
Then, the metric records are sent to a dispatcher, which sends the metric 
records to an appender. The appenders store or send the metric records 
to a backend (Elasticsearch, Cassandra, Kafka, JMS, Redis, ...).

Re: Metrics for Beam IOs.

2017-02-17 Thread Amit Sela
@JB I think what you're suggesting is that Beam should provide a "Metrics
Reporting" API as well, and I used to think like you, but the more I
thought of that the more I tend to disagree now.

The SDK is for users to author pipelines, so Metrics are for user-defined
metrics (in contrast to runner metrics).

The Runner API is supposed to help different backends to integrate with
Beam to allow users to execute those pipelines on their favourite backend. I
believe the Runner API has to provide restrictions/demands that are just
enough so a runner could execute a Beam pipeline as best it can, and I'm
afraid that this would demand that runner authors do unnecessary work.
This is also sort of "crossing the line" into the runner's domain and
"telling it how to do" instead of what, and I don't think we want that.

I do believe however that runners should integrate the Metrics into their
own metrics reporting system - but that's for the runner author to decide.
Stas did this for the Spark runner because Spark doesn't report back
user-defined Accumulators (Spark's Aggregators) to its Metrics system.
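
For readers unfamiliar with that kind of integration, here is a rough sketch of the general shape it takes (the class and method bodies are illustrative, not the Spark runner's actual code): a Codahale MetricRegistry registered as a Spark metrics Source, so whatever sinks are configured in Spark's metrics.properties (JMX, Graphite, CSV, ...) pick the values up.

import com.codahale.metrics.MetricRegistry;
import org.apache.spark.SparkEnv;
import org.apache.spark.metrics.source.Source;

// Illustrative only: expose Beam metric values through Spark's metrics system.
public class BeamPipelineMetricsSource implements Source {
  private final MetricRegistry registry = new MetricRegistry();

  @Override
  public String sourceName() {
    return "Beam";
  }

  @Override
  public MetricRegistry metricRegistry() {
    // A real integration would populate this registry from the Beam metrics
    // containers as bundles complete.
    return registry;
  }

  public static void register(BeamPipelineMetricsSource source) {
    // Make the source visible to the sinks Spark is configured with.
    SparkEnv.get().metricsSystem().registerSource(source);
  }
}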

On a curious note though, did you use an OSGi service per event-type, so
you can upgrade specific event-handlers without taking down the entire
reporter? But that's really unrelated to this thread :-).



On Fri, Feb 17, 2017 at 8:36 PM Ben Chambers 
wrote:

> I don't think it is possible for there to be a general mechanism for
> pushing metrics out during the execution of a pipeline. The Metrics API
> suggests that metrics should be reported as values across all attempts and
> values across only successful attempts. The latter requires runner
> involvement to ensure that a given metric value is atomically incremented
> (or checkpointed) when the bundle it was reported in is committed.
>
> Aviem has already implemented Metrics support for the Spark runner. I am
> working on support for the Dataflow runner.
>
> On Fri, Feb 17, 2017 at 7:50 AM Jean-Baptiste Onofré 
> wrote:
>
> Hi guys,
>
> As I'm back from vacation, I'm back on this topic ;)
>
> It's a great discussion, and I think the Metric IO coverage is good.
>
> However, there's a point that we discussed very quickly in the thread and I
> think it's an important one (maybe more important than the provided
> metrics, actually, in terms of roadmap ;)).
>
> Assuming we have pipelines, PTransforms, IOs, ... using the Metric API,
> how do we expose the metrics for the end-users ?
>
> A first approach would be to bind a JMX MBean server per pipeline and
> expose the metrics via MBeans. I don't think it's a good idea for the
> following reasons:
> 1. It's not easy to know where the pipeline is actually executed, and
> so, not easy to find the MBean server URI.
> 2. For the same reason, we can have port binding error.
> 3. If it could work for unbounded/streaming pipelines (as they are
> always "running"), it's not really applicable for bounded/batch
> pipelines as their lifetime is "limited" ;)
>
> So, I think the "push" approach is better: during the execution, a
> pipeline "internally" collects and pushes the metric to a backend.
> The "push" could a kind of sink. For instance, the metric "records" can
> be sent to a Kafka topic, or directly to Elasticsearch or whatever.
> The metric backend can deal with alerting, reporting, etc.
>
> Basically, we have to define two things:
> 1. The "appender" where the metrics have to be sent (and the
> corresponding configuration to connect, like Kafka or Elasticsearch
> location)
> 2. The format of the metric data (for instance, json format).
>
> In Apache Karaf, I created something similar named Decanter:
>
>
> http://blog.nanthrax.net/2015/07/monitoring-and-alerting-with-apache-karaf-decanter/
>
> http://karaf.apache.org/manual/decanter/latest-1/
>
> Decanter provides collectors that harvest the metrics (like JMX MBean
> attributes, log messages, ...). Basically, for Beam, the collector would
> directly be the Metric API used by the pipeline parts.
> Then, the metric records are sent to a dispatcher, which sends the metric
> records to an appender. The appenders store or send the metric records
> to a backend (Elasticsearch, Cassandra, Kafka, JMS, Redis, ...).
>
> I think it would make sense to provide the configuration and Metric
> "appender" via the pipeline options.
> As it's not really runner specific, it could be part of the metric API
> (or SPI in that case).
>
> WDYT ?
>
> Regards
> JB
>
> On 02/15/2017 09:22 AM, Stas Levin wrote:
> > +1 to making the IO metrics (e.g. producers, consumers) available as part
> > of the Beam pipeline metrics tree for debugging and visibility.
> >
> > As it has already been mentioned, many IO clients have a metrics
> mechanism
> > in place, so in these cases I think it could be beneficial to mirror
> their
> > metrics under the relevant subtree of the Beam metrics tree.
> >
> > On Wed, Feb 15, 2017 at 12:04 AM Amit Sela  

Re: Metrics for Beam IOs.

2017-02-17 Thread Jean-Baptiste Onofré

Hi guys,

As I'm back from vacation, I'm back on this topic ;)

It's a great discussion, and I think the Metric IO coverage is good.


However, there's a point that we discussed very quickly in the thread and I 
think it's an important one (maybe more important than the provided 
metrics, actually, in terms of roadmap ;)).


Assuming we have pipelines, PTransforms, IOs, ... using the Metric API, 
how do we expose the metrics for the end-users ?


A first approach would be to bind a JMX MBean server per pipeline and 
expose the metrics via MBeans. I don't think it's a good idea for the 
following reasons:
1. It's not easy to know where the pipeline is actually executed, and 
so, not easy to find the MBean server URI.

2. For the same reason, we can have port binding error.
3. If it could work for unbounded/streaming pipelines (as they are 
always "running"), it's not really applicable for bounded/batch 
pipelines as their lifetime is "limited" ;)


So, I think the "push" approach is better: during the execution, a 
pipeline "internally" collects and pushes the metric to a backend.
The "push" could a kind of sink. For instance, the metric "records" can 
be sent to a Kafka topic, or directly to Elasticsearch or whatever.

The metric backend can deal with alerting, reporting, etc.

Basically, we have to define two things:
1. The "appender" where the metrics have to be sent (and the 
corresponding configuration to connect, like Kafka or Elasticsearch 
location)

2. The format of the metric data (for instance, json format).
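
As a sketch, such an SPI could be as small as the following (MetricsAppender and MetricRecord are hypothetical names, not an existing Beam API):

import java.util.Map;

// Hypothetical "appender" SPI in the spirit of Decanter: the pipeline pushes
// harvested metric records to it, and implementations forward them to Kafka,
// Elasticsearch, or any other backend.
public interface MetricsAppender extends AutoCloseable {

  // Configuration from e.g. --metric-appender=foo.cfg (broker or cluster
  // location, topic or index name, credentials, ...).
  void configure(Map<String, String> properties);

  // Called by the harvester; the record would typically be serialized as JSON.
  void append(MetricRecord record);
}

// A simple value object for one harvested metric reading.
class MetricRecord {
  final String pipelineName;
  final String step;
  final String namespace;
  final String name;
  final long value;
  final long timestampMillis;

  MetricRecord(String pipelineName, String step, String namespace, String name,
      long value, long timestampMillis) {
    this.pipelineName = pipelineName;
    this.step = step;
    this.namespace = namespace;
    this.name = name;
    this.value = value;
    this.timestampMillis = timestampMillis;
  }
}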

In Apache Karaf, I created something similar named Decanter:

http://blog.nanthrax.net/2015/07/monitoring-and-alerting-with-apache-karaf-decanter/

http://karaf.apache.org/manual/decanter/latest-1/

Decanter provides collectors that harvest the metrics (like JMX MBean 
attributes, log messages, ...). Basically, for Beam, it would be 
directly the Metric API used by pipeline parts.
Then, the metric records are sent to a dispatcher, which sends the metric 
records to an appender. The appenders store or send the metric records 
to a backend (Elasticsearch, Cassandra, Kafka, JMS, Redis, ...).


I think it would make sense to provide the configuration and Metric 
"appender" via the pipeline options.
As it's not really runner specific, it could be part of the metric API 
(or SPI in that case).
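
Following Beam's usual options pattern, the configuration could look like this (the option names are illustrative, not an existing API):

import org.apache.beam.sdk.options.Description;
import org.apache.beam.sdk.options.PipelineOptions;

// Hypothetical options: a runner-agnostic harvester would read these, load the
// appender via ServiceLoader, and push metric records to it during execution.
public interface MetricsReportingOptions extends PipelineOptions {

  @Description("Fully qualified class name of the MetricsAppender to use")
  String getMetricsAppender();
  void setMetricsAppender(String appenderClass);

  @Description("Path to the appender configuration, e.g. kafka.cfg")
  String getMetricsAppenderConfig();
  void setMetricsAppenderConfig(String configPath);
}

Running with --metricsAppender=com.example.KafkaAppender --metricsAppenderConfig=kafka.cfg would then select the reporting backend independently of the runner.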


WDYT ?

Regards
JB

On 02/15/2017 09:22 AM, Stas Levin wrote:

+1 to making the IO metrics (e.g. producers, consumers) available as part
of the Beam pipeline metrics tree for debugging and visibility.

As it has already been mentioned, many IO clients have a metrics mechanism
in place, so in these cases I think it could be beneficial to mirror their
metrics under the relevant subtree of the Beam metrics tree.

On Wed, Feb 15, 2017 at 12:04 AM Amit Sela  wrote:


I think this is a great discussion and I'd like to relate to some of the
points raised here, and raise some of my own.

First of all I think we should be careful here not to cross boundaries. IOs
naturally have many metrics, and Beam should avoid "taking over" those. IO
metrics should focus on what's relevant to the Pipeline: input/output rate,
backlog (for UnboundedSources, which exists in bytes but for monitoring
purposes we might want to consider #messages).

I don't agree that we should not invest in doing this in Sources/Sinks and
going directly to SplittableDoFn because the IO API is familiar and known,
and as long as we keep it, it should be treated as a first-class citizen.

As for enable/disable - if IOs consider focusing on pipeline-related
metrics I think we should be fine, though this could also change between
runners.

Finally, considering "split-metrics" is interesting because on one hand it
affects the pipeline directly (unbalanced partitions in Kafka that may
cause backlog), but this is on that fine line of responsibilities (Kafka
monitoring would probably be able to tell you that partitions are not
balanced).

My 2 cents, cheers!

On Tue, Feb 14, 2017 at 8:46 PM Raghu Angadi 
wrote:


On Tue, Feb 14, 2017 at 9:21 AM, Ben Chambers



Re: Metrics for Beam IOs.

2017-02-15 Thread Stas Levin
+1 to making the IO metrics (e.g. producers, consumers) available as part
of the Beam pipeline metrics tree for debugging and visibility.

As it has already been mentioned, many IO clients have a metrics mechanism
in place, so in these cases I think it could be beneficial to mirror their
metrics under the relevant subtree of the Beam metrics tree.
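
As a rough sketch of that mirroring idea for Kafka (the namespace is illustrative; Metric.value() is the accessor in the kafka-clients versions of that time, and Beam's Metrics calls only record when invoked inside a step, e.g. from the reader):

import java.util.Map;
import org.apache.beam.sdk.metrics.Distribution;
import org.apache.beam.sdk.metrics.Metrics;
import org.apache.kafka.clients.consumer.Consumer;
import org.apache.kafka.common.Metric;
import org.apache.kafka.common.MetricName;

public class KafkaMetricsMirror {
  // Call periodically from the reader so the consumer's own gauges land in
  // the pipeline's metrics tree under an IO-specific namespace.
  public static void mirror(Consumer<?, ?> consumer) {
    for (Map.Entry<MetricName, ? extends Metric> entry
        : consumer.metrics().entrySet()) {
      Distribution mirrored =
          Metrics.distribution("KafkaIO.Read.consumer", entry.getKey().name());
      mirrored.update((long) entry.getValue().value());
    }
  }
}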

On Wed, Feb 15, 2017 at 12:04 AM Amit Sela  wrote:

> I think this is a great discussion and I'd like to relate to some of the
> points raised here, and raise some of my own.
>
> First of all I think we should be careful here not to cross boundaries. IOs
> naturally have many metrics, and Beam should avoid "taking over" those. IO
> metrics should focus on what's relevant to the Pipeline: input/output rate,
> backlog (for UnboundedSources, which exists in bytes but for monitoring
> purposes we might want to consider #messages).
>
> I don't agree that we should not invest in doing this in Sources/Sinks and
> going directly to SplittableDoFn because the IO API is familiar and known,
> and as long as we keep it, it should be treated as a first-class citizen.
>
> As for enable/disable - if IOs consider focusing on pipeline-related
> metrics I think we should be fine, though this could also change between
> runners.
>
> Finally, considering "split-metrics" is interesting because on one hand it
> affects the pipeline directly (unbalanced partitions in Kafka that may
> cause backlog), but this is on that fine line of responsibilities (Kafka
> monitoring would probably be able to tell you that partitions are not
> balanced).
>
> My 2 cents, cheers!
>
> On Tue, Feb 14, 2017 at 8:46 PM Raghu Angadi 
> wrote:
>
> > On Tue, Feb 14, 2017 at 9:21 AM, Ben Chambers
>  > >
> > wrote:
> >
> > >
> > > > * I also think there are data source specific metrics that a given IO
> > > will
> > > > want to expose (ie, things like kafka backlog for a topic.)
> >
> >
> > UnboundedSource has an API for backlog. It is better for beam/runners to
> > handle backlog as well.
> > Of course there will be some source specific metrics too (errors, i/o ops
> > etc).
> >
>


Re: Metrics for Beam IOs.

2017-02-14 Thread Amit Sela
I think this is a great discussion and I'd like to relate to some of the
points raised here, and raise some of my own.

First of all I think we should be careful here not to cross boundaries. IOs
naturally have many metrics, and Beam should avoid "taking over" those. IO
metrics should focus on what's relevant to the Pipeline: input/output rate,
backlog (for UnboundedSources, which exists in bytes but for monitoring
purposes we might want to consider #messages).

I don't agree that we should not invest in doing this in Sources/Sinks and
going directly to SplittableDoFn because the IO API is familiar and known,
and as long as we keep it, it should be treated as a first-class citizen.

As for enable/disable - if IOs consider focusing on pipeline-related
metrics I think we should be fine, though this could also change between
runners.

Finally, considering "split-metrics" is interesting because on one hand it
affects the pipeline directly (unbalanced partitions in Kafka that may
cause backlog), but this is on that fine line of responsibilities (Kafka
monitoring would probably be able to tell you that partitions are not
balanced).

My 2 cents, cheers!

On Tue, Feb 14, 2017 at 8:46 PM Raghu Angadi 
wrote:

> On Tue, Feb 14, 2017 at 9:21 AM, Ben Chambers  >
> wrote:
>
> >
> > > * I also think there are data source specific metrics that a given IO
> > will
> > > want to expose (ie, things like kafka backlog for a topic.)
>
>
> UnboundedSource has an API for backlog. It is better for beam/runners to
> handle backlog as well.
> Of course there will be some source specific metrics too (errors, i/o ops
> etc).
>


Re: Metrics for Beam IOs.

2017-02-14 Thread Raghu Angadi
On Tue, Feb 14, 2017 at 9:21 AM, Ben Chambers 
wrote:

>
> > * I also think there are data source specific metrics that a given IO
> will
> > want to expose (ie, things like kafka backlog for a topic.)


UnboundedSource has an API for backlog. It is better for beam/runners to
handle backlog as well.
Of course there will be some source specific metrics too (errors, i/o ops
etc).
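
For context, the hook Raghu refers to lives on UnboundedSource.UnboundedReader; a reader overrides it and the runner can surface backlog uniformly. A minimal sketch (bytesBehindLatestOffsets is an illustrative stand-in for whatever the reader actually tracks; the other reader methods are elided):

// Inside a subclass of UnboundedSource.UnboundedReader<T>:

private long bytesBehindLatestOffsets = -1;  // updated as records are fetched

@Override
public long getSplitBacklogBytes() {
  // BACKLOG_UNKNOWN (-1) is the documented "no estimate" value.
  return bytesBehindLatestOffsets >= 0
      ? bytesBehindLatestOffsets
      : BACKLOG_UNKNOWN;
}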


Re: Metrics for Beam IOs.

2017-02-14 Thread Stephen Sisk
>Many of the metrics that should be exposed for all transforms are likely
best exposed by the runner or some other common layer, rather than being
added to each transform.

+1 I wasn't trying to advocate for each user trying to implement these on
every transform - it should be provided by beam or the runner automatically.


> But, the existing Metrics API should work within a source or sink --
anything that is called within a step should work.

Great! I wasn't aware we'd changed that with the new Metrics API - I was
only aware of the limitations of the old system.


> if the source can detect that it is
having trouble splitting and raise a message like "you're using compressed
text files which can't be parallelized beyond the number of files" that is
much more actionable.

I think there's an important difference between what a particular runner
chooses to show its users in their monitoring interface vs what Beam
should be reporting. I think it's important that Beam or the runner layer
should be reporting this data (which would be necessary for doing high
level analysis like what you propose), but then the monitoring interface
should choose how to expose that information. So then the question becomes
- does it make sense for these common transform metrics to be exposed by
runner implementations or within common beam code?

S


On Tue, Feb 14, 2017 at 9:21 AM Ben Chambers 
wrote:

> On Tue, Feb 14, 2017 at 9:07 AM Stephen Sisk 
> wrote:
>
> > hi!
> >
> > (ben just sent his mail and he covered some similar topics to me, but
> I'll
> > keep my comments intact since they are slightly different)
> >
> > * I think there are a lot of metrics that should be exposed for all
> > transforms - everything from JB's list (like number of splits, throughput,
> > reading/writing rate, etc.) also applies to
> > splittableDoFns.
> >
>
> Many of the metrics that should be exposed for all transforms are likely
> best exposed by the runner or some other common layer, rather than being
> added to each transform. But things like number of elements, estimated size
> of elements, etc. all make sense for every transform.
>
>
> > * I also think there are data source specific metrics that a given IO
> will
> > want to expose (ie, things like kafka backlog for a topic.) No one on
> this
> > thread has specifically addressed this, but Beam Sources & Sinks do not
> > presently have the ability to report metrics even if a given IO writer
> > wanted to - depending on the timeline for SplittableDoFn and the move to
> > that infrastructure, I don't think we need that support in Sources/Sinks,
> > but I do think we should make sure SplittableDoFn has the necessary
> > support.
> >
>
> Two parts -- we may want to introduce something like a Gauge here that lets
> the metric system ask the source/sink for the latest metrics. This allows
> the runner to gather metrics at a rate that makes sense without impacting
> performance.
>
> But, the existing Metrics API should work within a source or sink --
> anything that is called within a step should work.
>
>
> > * I think there are ways to do many metrics such that they are not too
> > expensive to calculate all the time.  (ie, reporting per bundle rather
> than
> > per item) I think we should ask whether we want/need metrics that are
> > expensive to calculate before going to the effort of adding
> enable/disable.
> >
>
> +1 -- hence why I'd like to look at reporting the metrics with no
> configuration.
>
>
> > * I disagree with ben about showing the amount of splitting - I think
> > especially with IOs it's useful to understand/diagnose reading problems
> > since that's one potential source of problems, especially given that the
> > user can write transforms that split now in SplittableDoFn. But I look
> > forward to discussing that further
> >
>
> I think many of the splitting metrics fall into things the runner should
> report. I think if we pick the right ones so they're useful, it likely doesn't
> hurt to gather them, but here again it may be useful to talk about specific
> problems.
>
> I still think these likely won't make sense for all users -- if I'm a new
> user just trying to get a source/sink working, I'm not sure what "splitting
> metrics" would be useful to me. But if the source can detect that it is
> having trouble splitting and raise a message like "you're using compressed
> text files which can't be parallelized beyond the number of files" that is
> much more actionable.
>
>
> > +1 on talking about specific examples
> >
> > S
> >
> > On Tue, Feb 14, 2017 at 8:29 AM Jean-Baptiste Onofré 
> > wrote:
> >
> > > Hi Aviem
> > >
> > > Agree with your comments, it's pretty close to my previous ones.
> > >
> > > Regards
> > > JB
> > >
> > > On Feb 14, 2017, 12:04, at 12:04, Aviem Zur 
> wrote:
> > > >Hi Ismaël,
> > > >
> > > >You've raised some great points.
> > > >Please see my comments inline.

Re: Metrics for Beam IOs.

2017-02-14 Thread Ben Chambers
On Tue, Feb 14, 2017 at 9:07 AM Stephen Sisk 
wrote:

> hi!
>
> (ben just sent his mail and he covered some similar topics to me, but I'll
> keep my comments intact since they are slightly different)
>
> * I think there are a lot of metrics that should be exposed for all
> transforms - everything from JB's list (like number of splits, throughput,
> reading/writing rate, etc.) also applies to
> splittableDoFns.
>

Many of the metrics that should be exposed for all transforms are likely
best exposed by the runner or some other common layer, rather than being
added to each transform. But things like number of elements, estimated size
of elements, etc. all make sense for every transform.
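
A sketch of what those always-on, per-element metrics look like when reported from a DoFn via the Metrics API (the namespace and metric names are illustrative):

import org.apache.beam.sdk.metrics.Counter;
import org.apache.beam.sdk.metrics.Distribution;
import org.apache.beam.sdk.metrics.Metrics;
import org.apache.beam.sdk.transforms.DoFn;

// Cheap, always-on metrics: one counter increment and one distribution
// update per element.
class MeteredFn extends DoFn<byte[], byte[]> {
  private final Counter elements =
      Metrics.counter("mytransform", "elementsProcessed");
  private final Distribution elementSize =
      Metrics.distribution("mytransform", "elementSizeBytes");

  @ProcessElement
  public void processElement(ProcessContext c) {
    byte[] element = c.element();
    elements.inc();
    elementSize.update(element.length);
    c.output(element);
  }
}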


> * I also think there are data source specific metrics that a given IO will
> want to expose (ie, things like kafka backlog for a topic.) No one on this
> thread has specifically addressed this, but Beam Sources & Sinks do not
> presently have the ability to report metrics even if a given IO writer
> wanted to - depending on the timeline for SplittableDoFn and the move to
> that infrastructure, I don't think we need that support in Sources/Sinks,
> but I do think we should make sure SplittableDoFn has the necessary
> support.
>

Two parts -- we may want to introduce something like a Gauge here that lets
the metric system ask the source/sink for the latest metrics. This allows
the runner to gather metrics at a rate that makes sense without impacting
performance.
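
A hypothetical sketch of that pull-style idea (no such interface existed in Beam at the time; all names here are illustrative): the source registers a callback and the metrics system samples it at its own cadence.

// Hypothetical: a pull-based metric the runner polls, instead of the source
// pushing an update on every record.
public interface GaugeCallback {
  long read();  // invoked by the metrics system when it wants a fresh value
}

// A Kafka-like reader might then register something like:
//   registerGauge("kafka", "backlogMessages",
//       () -> latestAvailableOffset - currentOffset);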

But, the existing Metrics API should work within a source or sink --
anything that is called within a step should work.


> * I think there are ways to do many metrics such that they are not too
> expensive to calculate all the time.  (ie, reporting per bundle rather than
> per item) I think we should ask whether we want/need metrics that are
> expensive to calculate before going to the effort of adding enable/disable.
>

+1 -- hence why I'd like to look at reporting the metrics with no
configuration.


> * I disagree with ben about showing the amount of splitting - I think
> especially with IOs it's useful to understand/diagnose reading problems
> since that's one potential source of problems, especially given that the
> user can write transforms that split now in SplittableDoFn. But I look
> forward to discussing that further
>

I think many of the splitting metrics fall into things the runner should
report. I think if we pick the right ones so they're useful, it likely doesn't
hurt to gather them, but here again it may be useful to talk about specific
problems.

I still think these likely won't make sense for all users -- if I'm a new
user just trying to get a source/sink working, I'm not sure what "splitting
metrics" would be useful to me. But if the source can detect that it is
having trouble splitting and raise a message like "you're using compressed
text files which can't be parallelized beyond the number of files" that is
much more actionable.


> +1 on talking about specific examples
>
> S
>
> On Tue, Feb 14, 2017 at 8:29 AM Jean-Baptiste Onofré 
> wrote:
>
> > Hi Aviem
> >
> > Agree with your comments, it's pretty close to my previous ones.
> >
> > Regards
> > JB
> >
> > On Feb 14, 2017, 12:04, at 12:04, Aviem Zur  wrote:
> > >Hi Ismaël,
> > >
> > >You've raised some great points.
> > >Please see my comments inline.
> > >
> > >On Tue, Feb 14, 2017 at 3:37 PM Ismaël Mejía  wrote:
> > >
> > >> Hello,
> > >>
> > >> The new metrics API allows us to integrate some basic metrics into the Beam
> > >> IOs. I have been following some discussions about this on JIRAs/PRs, and I
> > >> think it is important to discuss the subject here so we can have more
> > >> awareness and obtain ideas from the community.
> > >>
> > >> First I want to thank Ben for his work on the metrics API, and Aviem for
> > >> his ongoing work on metrics for IOs (e.g. KafkaIO) that made me aware of
> > >> this subject.
> > >>
> > >> There are some basic ideas to discuss e.g.
> > >>
> > >> - What are the responsibilities of Beam IOs in terms of Metrics
> > >> (considering the fact that the actual IOs, server + client, usually provide
> > >> their own)?
> > >>
> > >
> > >While it is true that many IOs provide their own metrics, I think that
> > >Beam should expose IO metrics because:
> > >
> > >1. Metrics which help in understanding the performance of a pipeline
> > >   which uses an IO may not be covered by the IO.
> > >2. Users may not be able to set up integrations with the IO's metrics to
> > >   view them effectively (and correlate them to a specific Beam pipeline),
> > >   but still want to investigate their pipeline's performance.
> > >
> > >
> > >> - What metrics are relevant to the pipeline (or some particular IOs)? Kafka
> > >> backlog, for one, could point out that a pipeline is behind ingestion rate.
> > >
> > >
> > >I think it depends on the IO, but there is probably overlap in some of the
> > >metrics so a guideline might be written for this.

Re: Metrics for Beam IOs.

2017-02-14 Thread Ben Chambers
Thanks for starting this conversation Ismael! I too have been thinking
we'll need some general approach to metrics for IO in the near future.

Two general thoughts:

1. Before making the metrics configurable, I think it would be worthwhile
to see if we can find the right set of metrics that provide useful
information about IO without affecting performance and have these always
on. Monitoring information like this is often useful when a pipeline is
behaving unexpectedly, and predicting when that will happen and turning on
the metrics is problematic.

2. I think focusing on metrics about source splitting and such is the wrong
level from a user perspective. A user shouldn't need to understand how
sources split and what that means. Instead, we should report higher-level
metrics such as how many bytes of input have been processed, how many bytes
remain (if that is known), etc.

Ideally, metrics about splitting can be reported by the runner in a general
manner. If they're useful for developing the source maybe that would be the
configuration (indicating that you're developing a source and want these
more detailed metrics).

Maybe it would help to pick one or two IOs that you're looking at and talk
about proposed metrics? That might focus the discussion on what metrics
make sense to users and how expensive they might be to report?

On Tue, Feb 14, 2017 at 8:29 AM Jean-Baptiste Onofré 
wrote:

> Hi Aviem
>
> Agree with your comments, it's pretty close to my previous ones.
>
> Regards
> JB
>
> On Feb 14, 2017, 12:04, at 12:04, Aviem Zur  wrote:
> >Hi Ismaël,
> >
> >You've raised some great points.
> >Please see my comments inline.
> >
> >On Tue, Feb 14, 2017 at 3:37 PM Ismaël Mejía  wrote:
> >
> >> Hello,
> >>
> >> The new metrics API allows us to integrate some basic metrics into the Beam
> >> IOs. I have been following some discussions about this on JIRAs/PRs, and I
> >> think it is important to discuss the subject here so we can have more
> >> awareness and obtain ideas from the community.
> >>
> >> First I want to thank Ben for his work on the metrics API, and Aviem for
> >> his ongoing work on metrics for IOs (e.g. KafkaIO) that made me aware of
> >> this subject.
> >>
> >> There are some basic ideas to discuss e.g.
> >>
> >> - What are the responsibilities of Beam IOs in terms of Metrics
> >> (considering the fact that the actual IOs, server + client, usually provide
> >> their own)?
> >>
> >
> >While it is true that many IOs provide their own metrics, I think that
> >Beam should expose IO metrics because:
> >
> >1. Metrics which help in understanding the performance of a pipeline
> >   which uses an IO may not be covered by the IO.
> >2. Users may not be able to set up integrations with the IO's metrics to
> >   view them effectively (and correlate them to a specific Beam pipeline),
> >   but still want to investigate their pipeline's performance.
> >
> >
> >> - What metrics are relevant to the pipeline (or some particular IOs)? Kafka
> >> backlog, for one, could point out that a pipeline is behind ingestion rate.
> >
> >
> >I think it depends on the IO, but there is probably overlap in some of
> >the
> >metrics so a guideline might be written for this.
> >I listed what I thought should be reported for KafkaIO in the following
> >JIRA: https://issues.apache.org/jira/browse/BEAM-1398
> >Feel free to add more metrics you think are important to report.
> >
> >
> >>
> >>
> >> - Should metrics be calculated on IOs by default or not?
> >> - If metrics are defined by default does it make sense to allow users to
> >> disable them?
> >>
> >
> >IIUC, your concern is that metrics will add overhead to the pipeline, and
> >pipelines which are highly sensitive to this will be hampered?
> >In any case I think that yes, metrics calculation should be configurable
> >(enable/disable).
> >In the Spark runner, for example, the Metrics sink feature (not the metrics
> >calculation itself, but the sinks to send them to) is configurable in the
> >pipeline options.
> >
> >
> >> Well these are just some questions around the subject so we can create a
> >> common set of practices to include metrics in the IOs and eventually
> >> improve the transform guide with this. What do you think about this? Do you
> >> have other questions/ideas?
> >>
> >> Thanks,
> >> Ismaël
> >>
>


Re: Metrics for Beam IOs.

2017-02-14 Thread Jean-Baptiste Onofré
Hi Aviem

Agree with your comments, it's pretty close to my previous ones.

Regards
JB

On Feb 14, 2017, 12:04, at 12:04, Aviem Zur  wrote:
>Hi Ismaël,
>
>You've raised some great points.
>Please see my comments inline.
>
>On Tue, Feb 14, 2017 at 3:37 PM Ismaël Mejía  wrote:
>
>> Hello,
>>
>> The new metrics API allows us to integrate some basic metrics into the Beam
>> IOs. I have been following some discussions about this on JIRAs/PRs, and I
>> think it is important to discuss the subject here so we can have more
>> awareness and obtain ideas from the community.
>>
>> First I want to thank Ben for his work on the metrics API, and Aviem for
>> his ongoing work on metrics for IOs (e.g. KafkaIO) that made me aware of
>> this subject.
>>
>> There are some basic ideas to discuss e.g.
>>
>> - What are the responsibilities of Beam IOs in terms of Metrics
>> (considering the fact that the actual IOs, server + client, usually provide
>> their own)?
>>
>
>While it is true that many IOs provide their own metrics, I think that
>Beam should expose IO metrics because:
>
>1. Metrics which help in understanding the performance of a pipeline
>   which uses an IO may not be covered by the IO.
>2. Users may not be able to set up integrations with the IO's metrics to
>   view them effectively (and correlate them to a specific Beam pipeline),
>   but still want to investigate their pipeline's performance.
>
>
>> - What metrics are relevant to the pipeline (or some particular IOs)? Kafka
>> backlog, for one, could point out that a pipeline is behind ingestion rate.
>
>
>I think it depends on the IO, but there is probably overlap in some of
>the
>metrics so a guideline might be written for this.
>I listed what I thought should be reported for KafkaIO in the following
>JIRA: https://issues.apache.org/jira/browse/BEAM-1398
>Feel free to add more metrics you think are important to report.
>
>
>>
>>
>> - Should metrics be calculated on IOs by default or not?
>> - If metrics are defined by default does it make sense to allow users to
>> disable them?
>>
>
>IIUC, your concern is that metrics will add overhead to the pipeline, and
>pipelines which are highly sensitive to this will be hampered?
>In any case I think that yes, metrics calculation should be configurable
>(enable/disable).
>In the Spark runner, for example, the Metrics sink feature (not the metrics
>calculation itself, but the sinks to send them to) is configurable in the
>pipeline options.
>
>
>> Well these are just some questions around the subject so we can create a
>> common set of practices to include metrics in the IOs and eventually
>> improve the transform guide with this. What do you think about this? Do you
>> have other questions/ideas?
>>
>> Thanks,
>> Ismaël
>>


Re: Metrics for Beam IOs.

2017-02-14 Thread Aviem Zur
Hi Ismaël,

You've raised some great points.
Please see my comments inline.

On Tue, Feb 14, 2017 at 3:37 PM Ismaël Mejía  wrote:

> Hello,
>
> The new metrics API allows us to integrate some basic metrics into the Beam
> IOs. I have been following some discussions about this on JIRAs/PRs, and I
> think it is important to discuss the subject here so we can have more
> awareness and obtain ideas from the community.
>
> First I want to thank Ben for his work on the metrics API, and Aviem for
> his ongoing work on metrics for IOs (e.g. KafkaIO) that made me aware of
> this subject.
>
> There are some basic ideas to discuss e.g.
>
> - What are the responsibilities of Beam IOs in terms of Metrics
> (considering the fact that the actual IOs, server + client, usually provide
> their own)?
>

While it is true that many IOs provide their own metrics, I think that Beam
should expose IO metrics because:

   1. Metrics which help in understanding the performance of a pipeline which
   uses an IO may not be covered by the IO.
   2. Users may not be able to set up integrations with the IO's metrics to
   view them effectively (and correlate them to a specific Beam pipeline), but
   still want to investigate their pipeline's performance.


> - What metrics are relevant to the pipeline (or some particular IOs)? Kafka
> backlog, for one, could point out that a pipeline is behind ingestion rate.


I think it depends on the IO, but there is probably overlap in some of the
metrics so a guideline might be written for this.
I listed what I thought should be reported for KafkaIO in the following
JIRA: https://issues.apache.org/jira/browse/BEAM-1398
Feel free to add more metrics you think are important to report.


>
>
> - Should metrics be calculated on IOs by default or not?
> - If metrics are defined by default does it make sense to allow users to
> disable them?
>

IIUC, your concern is that metrics will add overhead to the pipeline, and
pipelines which are highly sensitive to this will be hampered?
In any case I think that yes, metrics calculation should be configurable
(enable/disable).
In the Spark runner, for example, the Metrics sink feature (not the metrics
calculation itself, but the sinks to send them to) is configurable in the
pipeline options.


> Well these are just some questions around the subject so we can create a
> common set of practices to include metrics in the IOs and eventually
> improve the transform guide with this. What do you think about this? Do you
> have other questions/ideas?
>
> Thanks,
> Ismaël
>


Re: Metrics for Beam IOs.

2017-02-14 Thread Jean-Baptiste Onofré
Hi Ismael

Good point to discuss the metric here.

Imho, the IOs should use the metric API to provide specific IO metrics (like 
number of splits, throughput, reading/writing rate, etc). In 
Camel, each processor (aka IO) provides such metric indicators. The purpose is 
to provide the metrics specific to the IO process (not to the back end).
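
A sketch of how that looks from inside an IO's read loop using the Metric API (the namespace, metric names, and the iteration are illustrative):

import java.util.Iterator;
import org.apache.beam.sdk.metrics.Counter;
import org.apache.beam.sdk.metrics.Metrics;

// Read-side indicators an IO could report through the Metric API.
class MeteredRead {
  private final Counter records = Metrics.counter("MyIO.Read", "recordsRead");
  private final Counter bytes = Metrics.counter("MyIO.Read", "bytesRead");

  // One step of a read loop over some underlying record stream.
  boolean advanceOnce(Iterator<byte[]> underlying) {
    if (!underlying.hasNext()) {
      return false;
    }
    byte[] record = underlying.next();
    records.inc();
    bytes.inc(record.length);
    return true;
  }
}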

On the other hand, equivalent metrics can be found at the pipeline level and at 
IO level (it's a question of scope).

I would propose to extend the documentation and maybe the API about the IO.

Regards
JB

On Feb 14, 2017, 09:37, at 09:37, "Ismaël Mejía"  wrote:
>Hello,
>
>The new metrics API allows us to integrate some basic metrics into the Beam
>IOs. I have been following some discussions about this on JIRAs/PRs, and I
>think it is important to discuss the subject here so we can have more
>awareness and obtain ideas from the community.
>
>First I want to thank Ben for his work on the metrics API, and Aviem for
>his ongoing work on metrics for IOs (e.g. KafkaIO) that made me aware of
>this subject.
>
>There are some basic ideas to discuss e.g.
>
>- What are the responsibilities of Beam IOs in terms of Metrics
>(considering the fact that the actual IOs, server + client, usually provide
>their own)?
>
>- What metrics are relevant to the pipeline (or some particular IOs)? Kafka
>backlog, for one, could point out that a pipeline is behind ingestion rate.
>
>- Should metrics be calculated on IOs by default or not?
>
>- If metrics are defined by default does it make sense to allow users to
>disable them?
>
>Well these are just some questions around the subject so we can create a
>common set of practices to include metrics in the IOs and eventually
>improve the transform guide with this. What do you think about this? Do you
>have other questions/ideas?
>
>Thanks,
>Ismaël