I think relaxing the query to not be an exact match is reasonable. I'm
wondering whether it should be a substring or a regex match. Either one
preserves the existing behavior of returning only the metrics for a
specific step when passed its full step path, while adding the ability to
query with only an approximate step name.
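
A minimal sketch of the two options (illustrative names only, not the
actual Beam API) -- note that an exact full path is a substring of itself,
so existing exact-match queries keep working either way:

  static boolean matchesSubstring(String query, String stepName) {
    return stepName.contains(query);
  }

  static boolean matchesRegex(String query, String stepName) {
    return java.util.regex.Pattern.compile(query).matcher(stepName).find();
  }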

A) Util or not seems fine to me. If it seems likely to be reusable, let's
do so, preferably in a separate PR or commit.

B) Yes, direct runner should match whatever semantics we choose.

On Thu, Jan 26, 2017, 5:30 AM Aviem Zur <aviem...@gmail.com> wrote:

> Ben - yes, there is still some ambiguity regarding the querying of the
> metrics results.
>
> You've discussed in this thread the notion that metrics step names should
> be converted to unique names when aggregating metrics, so that each step
> will aggregate its own metrics, and not join with other steps by mistake.
> The way you suggested to query this is that when querying by step, all
> steps which contain the query string as a substring will show up in the
> results.
>
> This sounds fine; however, it is not how the direct runner is implemented
> right now. The direct runner performs an exact match.
>
> In my PR I copied the MetricsResults filtering methods directly from the
> direct runner, and suggested that since I copied them verbatim, perhaps
> they should be pulled up to a more central module so that all runners
> could use them.
>
> So, my remaining questions are:
> A) Should we create a util for filtering MetricsResults based on a query,
> using the filtering implementation suggested in this thread (substring
> match)?
> B) Should the direct runner be changed to use such a util, and conform to
> the suggested results filtering?
>
> On Tue, Jan 24, 2017 at 2:03 AM Ben Chambers <bchamb...@google.com.invalid>
> wrote:
>
> > For the short term, it seems reasonable to stay with the existing Query
> > API and allow runners to throw exceptions if a user issues a query that
> > is not supported. It shouldn't affect the ability to run a Beam pipeline
> > on other runners, since the Query API is only exposed *after* the
> > pipeline is run.
> >
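> > To illustrate, a runner that only tracks attempted values might
> > implement the committed() accessor along these lines (a hypothetical
> > snippet, not an actual Beam class):
> >
> >   @Override
> >   public Long committed() {
> >     // Unsupported queries fail loudly rather than returning silently
> >     // wrong values.
> >     throw new UnsupportedOperationException(
> >         "This runner does not support committed metric results.");
> >   }
> >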
> > For the longer term, it seems like the Query API could merit some more
> > thought, especially if people have good use cases for accessing the
> > value of metrics programmatically from the same program that ran the
> > original pipeline.
> >
> > Aviem -- is there anything specific that needs to be discussed/nailed
> > down to help with your PR?
> >
> > On Thu, Jan 19, 2017 at 3:57 PM Ben Chambers <bchamb...@google.com>
> > wrote:
> >
> > > On Thu, Jan 19, 2017 at 3:28 PM Amit Sela <amitsel...@gmail.com>
> > > wrote:
> > >
> > > On Fri, Jan 20, 2017 at 1:18 AM Ben Chambers
> > > <bchamb...@google.com.invalid> wrote:
> > >
> > > > Part of the problem here is that whether attempted or committed is
> > > > "what matters" depends on the metric. If you're counting the number
> > > > of RPCs to an external service, you may want all the attempts (to
> > > > predict your bill). If you're tracking how long those RPCs took, you
> > > > may want just the committed value (e.g., what is the best-case time
> > > > to execute your pipeline). This is essentially Luke's case of
> > > > wanting one or the other.
> > > >
> > > This sounds like Metrics should have a user-defined guarantee level,
> > > which might make more sense - Metrics.counter().attempted()/committed()
> > > - though this might prove more challenging for runners to implement.
> > >
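> > > A purely hypothetical shape for that declaration-time guarantee (not
> > > the actual Beam Metrics API):
> > >
> > >   interface GuaranteeAwareCounter {
> > >     void inc();
> > >   }
> > >
> > >   interface CounterBuilder {
> > >     GuaranteeAwareCounter attempted();  // runner may overcount on retries
> > >     GuaranteeAwareCounter committed();  // runner must dedupe retried work
> > >   }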
> > >
> > > We went away from that because of cases like Luke's, where a user
> > > might want to compare the two -- or might not even realize there is a
> > > difference up front, so declaring ahead of time is difficult. If both
> > > are available and can be looked at, and they're the same -- no
> > > problems. If they're different, then it provides a good reason to
> > > investigate and figure out the difference.
> > >
> > >
> > > >
> > > > Regarding the step names -- the metrics are reported using the full
> > > > step name, which is also made unique during the Graph API. So "My
> > > > Composite/ParDo(MyDoFn)" or "My Composite/ParDo(MyDoFn)2" if there
> > > > are multiple instances within the same composite. Specifically --
> > > > the names are made unique prior to recording metrics, so there are
> > > > no double counts.
> > > >
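> > > > A rough sketch of that uniquification (illustrative only, not the
> > > > actual graph construction code):
> > > >
> > > >   String uniquify(java.util.Map<String, Integer> counts, String name) {
> > > >     int n = counts.merge(name, 1, Integer::sum);
> > > >     // First occurrence keeps the full name; repeats get a suffix:
> > > >     // "My Composite/ParDo(MyDoFn)", "My Composite/ParDo(MyDoFn)2", ...
> > > >     return n == 1 ? name : name + n;
> > > >   }
> > > >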
> > > But how would the user know that? I'm afraid this could be confusing
> > > as a user-facing query API, and I think most users would simply name
> > > metrics differently.
> > >
> > >
> > > The query API can support querying based on a substring of the full
> > > name, and return metrics from all steps that match. That would allow
> > > the user to query metrics without knowing that. Having unique names
> > > for steps is important and useful for many other things (logging,
> > > associating time spent executing, etc.).
> > >
> > >
> > > >
> > > > On Thu, Jan 19, 2017 at 1:57 PM Amit Sela <amitsel...@gmail.com>
> > > > wrote:
> > > >
> > > > > I think Luke's example is interesting, but I wonder how common it
> > > > > is/would be? I'd expect failures to happen, but not at a rate so
> > > > > dramatic that it would be interesting to follow at the application
> > > > > level (I'd expect the runner/cluster to properly monitor uptime of
> > > > > processes/nodes separately). And even if it is useful, I can't
> > > > > think of other use cases.
> > > > >
> > > > > I thought the idea was to "declare" the Metrics guarantee level
> > > > > in the query API, but the more I think about it, the more I tend
> > > > > to let it go, for the following reasons:
> > > > >
> > > > >    - Setting aside Luke's example, I think users would prefer the
> > > > >    best guarantee a runner can provide. And on that note, I'd
> > > > >    expect a "getMetrics" API and not have to figure out guarantees.
> > > > >    - Programmatic querying would "break"
> > > > >    (UnsupportedOperationException) portability if a program that
> > > > >    was running with a runner that supports committed() tried to
> > > > >    execute on a runner that only supports attempted().
> > > > >    - I know that portability is for the Pipeline and this is
> > > > >    post-execution, but still, call it a 25% portability issue ;-) .
> > > > >    - According to the Capability Matrix, all runners fail to
> > > > >    provide a "commit" guarantee for Aggregators. I can only speak
> > > > >    for Spark, saying that supporting the Metrics API relies on the
> > > > >    same underlying mechanism, so nothing will change. I wonder
> > > > >    about other runners; does anyone plan to support "commit"
> > > > >    guarantees for Metrics soon? Having said that, I'm not sure
> > > > >    this is a good reason not to have it as a placeholder.
> > > > >
> > > > > Another question for querying Metrics: querying by step could be
> > > > > a bit tricky, since a runner is expected to keep unique names/ids
> > > > > for steps, but users here are supposed to be aware of this, and I
> > > > > suspect they might not be. If they use the same ParDo in a couple
> > > > > of places and query it, it might be confusing for them to see
> > > > > "double counts" they didn't intend.
> > > > >
> > > > > Amit.
> > > > >
> > > > > On Thu, Jan 19, 2017 at 7:36 PM Ben Chambers
> > > > > <bchamb...@google.com.invalid> wrote:
> > > > >
> > > > > > Thanks for starting the discussion! I'm going to hold off saying
> > > > > > what I think and instead just provide some background and
> > > > > > additional questions, because I want to see where the discussion
> > > > > > goes.
> > > > > >
> > > > > > When I first suggested the API for querying metrics I was adding
> > > > > > it for parity with aggregators. A good first question might be:
> > > > > > does the pipeline result even need query methods? Runners could
> > > > > > add them as necessary, based on the levels of querying they
> > > > > > support.
> > > > > >
> > > > > > The other desire was to make the accuracy clear. One
> > > > > > implementation path was reporting metrics directly from the
> > > > > > workers while attempting work. This can overcount when retrying,
> > > > > > and may undercount the actual attempts if a worker loses
> > > > > > connectivity before reporting.
> > > > > >
> > > > > > Another implementation was something like a side output, where
> > > > > > the counts are committed as part of each bundle's results and
> > > > > > then aggregated. This committed value is more accurate and
> > > > > > represents the value that occurred along the success path of the
> > > > > > pipeline.
> > > > > >
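> > > > > > A minimal sketch of the bookkeeping difference between those two
> > > > > > paths (illustrative, not actual runner code):
> > > > > >
> > > > > >   class CounterState {
> > > > > >     long attempted;  // live worker reports; may double-count retries
> > > > > >     long committed;  // folded in only when a bundle commits
> > > > > >
> > > > > >     void onWorkerReport(long delta) { attempted += delta; }
> > > > > >     void onBundleCommit(long delta) { committed += delta; }
> > > > > >   }
> > > > > >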
> > > > > > I suspect there are other possible implementations, so trying to
> > > > > > make an API that expresses all of them is difficult. So:
> > > > > >
> > > > > > 1. Does the pipeline result need to support querying (which is
> > > > > > useful for programmatic consumption), or are metrics intended
> > > > > > only to get values out of a pipeline and into some metrics store?
> > > > > >
> > > > > > 2. How should pipeline results indicate the different kinds of
> > > > > > metrics? What if a runner supported multiple kinds (e.g., the
> > > > > > runner reports both attempted and committed results)? As Luke
> > > > > > mentions, it may be useful to look at both to understand how much
> > > > > > retries affected the value.
> > > > > >
> > > > > > On Thu, Jan 19, 2017, 1:42 AM Aviem Zur <aviem...@gmail.com>
> > > > > > wrote:
> > > > > >
> > > > > > Hi all,
> > > > > >
> > > > > > While working on the implementation of the metrics API in the
> > > > > > Spark runner, the question of committed vs. attempted results
> > > > > > has come up, sparking (no pun intended) an interesting
> > > > > > conversation.
> > > > > > (See API: MetricResult
> > > > > > <https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/metrics/MetricResult.java>
> > > > > > and discussion: PR #1750 <https://github.com/apache/beam/pull/1750>)
> > > > > >
> > > > > > The separation of `attempted` and `committed` metric results
> > > > > > seems a bit unclear.
> > > > > >
> > > > > > Seeing that current implementations of aggregators in the
> > > > > > different runners do not guarantee correctness, one could assume
> > > > > > that the metrics API implementations will also follow the same
> > > > > > guarantees.
> > > > > >
> > > > > > If this is correct, then you could assume that only
> > > > > > `attempted()` metric results can be fulfilled. Would it then be
> > > > > > better to just have a single method such as `get()` in the API,
> > > > > > and have the guarantees of each runner explained in the
> > > > > > capability matrix / documentation?
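> > > > > >
> > > > > > For illustration, such a simplified result could look something
> > > > > > like this (hypothetical, in contrast to the current
> > > > > > attempted()/committed() pair on MetricResult):
> > > > > >
> > > > > >   interface SimplifiedMetricResult<T> {
> > > > > >     T get();  // the best guarantee the runner provides, as
> > > > > >               // documented in the capability matrix
> > > > > >   }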
