Sure. I've opened a ticket which tries to summarize these final thoughts: https://issues.apache.org/jira/browse/BEAM-1344
On Thu, Jan 26, 2017 at 6:47 PM Ben Chambers <bchamb...@google.com.invalid> wrote:

> I think relaxing the query so it is not an exact match is reasonable. I'm wondering whether it should be substring or regex. Either one preserves the existing behavior of returning only the metrics for a specific step when passed its full step path, but adds the ability to query when you only know the step name approximately.
>
> A) Util or not seems fine to me. If it seems likely to be reusable, let's do so. Preferably in a separate PR or commit.
>
> B) Yes, the direct runner should match whatever semantics we choose.

On Thu, Jan 26, 2017, 5:30 AM Aviem Zur <aviem...@gmail.com> wrote:

> Ben - yes, there is still some ambiguity regarding the querying of the metrics results.
>
> You've discussed in this thread the notion that metrics step names should be converted to unique names when aggregating metrics, so that each step aggregates its own metrics and does not join with other steps by mistake. The way you suggested to query this is that when querying by step, all steps which contain the query string as a substring show up in the results.
>
> This sounds fine; however, it is not how the direct runner is implemented right now. It is an exact match.
>
> In my PR I copied the MetricsResults filtering methods directly from the direct runner and suggested that, since I copied them verbatim, perhaps they should be pulled up to a more central module so that all runners could use them.
>
> So, my remaining questions are:
> A) Should we create a util for filtering MetricsResults based on a query, using the filtering implementation suggested in this thread (substring match)?
> B) Should the direct runner be changed to use such a util and conform with the suggested results filtering?
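To make question A above concrete, here is a rough sketch of what a shared substring-matching util could look like. The class name MetricResultsFiltering and the MetricResult.step() accessor are illustrative assumptions here, not the code from the PR:

    import java.util.ArrayList;
    import java.util.List;
    import org.apache.beam.sdk.metrics.MetricResult;

    public class MetricResultsFiltering {
      /** Keeps only the results whose (unique) step name contains the query string. */
      public static <T> List<MetricResult<T>> matchStep(
          Iterable<MetricResult<T>> results, String stepQuery) {
        List<MetricResult<T>> matches = new ArrayList<>();
        for (MetricResult<T> result : results) {
          // Substring match: "ParDo(MyDoFn)" matches both "My Composite/ParDo(MyDoFn)"
          // and "My Composite/ParDo(MyDoFn)2"; a full step path still matches only itself.
          if (result.step().contains(stepQuery)) {
            matches.add(result);
          }
        }
        return matches;
      }
    }

Swapping contains() for a regex match would give the regex variant Ben mentions.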
On Tue, Jan 24, 2017 at 2:03 AM Ben Chambers <bchamb...@google.com.invalid> wrote:

> For the short term, it seems reasonable to stay with the existing Query API and allow runners to throw exceptions if a user issues a query that is not supported. It shouldn't affect the ability to run a Beam pipeline on other runners, since the Query API is only exposed *after* the pipeline is run.
>
> For the longer term, it seems like the Query API could merit some more thought, especially if people have good use cases for accessing the value of metrics programmatically from the same program that ran the original pipeline.
>
> Aviem -- is there anything specific that needs to be discussed/nailed down to help with your PR?

On Thu, Jan 19, 2017 at 3:57 PM Ben Chambers <bchamb...@google.com> wrote:

> On Thu, Jan 19, 2017 at 3:28 PM Amit Sela <amitsel...@gmail.com> wrote:
>
> > On Fri, Jan 20, 2017 at 1:18 AM Ben Chambers <bchamb...@google.com.invalid> wrote:
> >
> > > Part of the problem here is that whether attempted or committed is "what matters" depends on the metric. If you're counting the number of RPCs to an external service, you may want all the attempts (to predict your bill). If you're tracking how long those RPCs took, you may want just the committed value (e.g., what is the best-case time to execute your pipeline). This is essentially Luke's case of wanting one or the other.
> >
> > This sounds like Metrics should have a user-defined guarantee level, which might make more sense - Metrics.counter().attempted()/committed() - though this might prove more challenging for runners to implement.
>
> We went away from that because of cases like Luke's, where a user might want to compare the two, or might not even realize there is a difference up front, so declaring ahead of time is difficult. If both are available and can be looked at, and they're the same -- no problem. If they're different, that is a good reason to investigate and figure out the difference.
>
> > > Regarding the step names -- the metrics are reported using the full step name, which is also made unique during the Graph API. So "My Composite/ParDo(MyDoFn)", or "My Composite/ParDo(MyDoFn)2" if there are multiple instances within the same composite. Specifically -- the names are made unique prior to recording metrics, so there are no double counts.
> >
> > But how would the user know that? I'm afraid this could be confusing as a user-facing query API, and I think most users would simply name metrics differently.
>
> The query API can support querying based on a substring of the full name, and return metrics from all steps that match. That would allow the user to query metrics without knowing that. Having unique names for steps is important and useful for many other things (logging, associating time spent executing, etc.).
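As a concrete picture of the substring querying and the attempted/committed comparison just described, here is a sketch of the calling side. Method names such as queryMetrics(), addStep(), counters(), attempted() and committed() are assumptions based on the MetricsFilter and MetricResult classes referenced in this thread and may not match the final API exactly:

    import org.apache.beam.sdk.PipelineResult;
    import org.apache.beam.sdk.metrics.MetricNameFilter;
    import org.apache.beam.sdk.metrics.MetricQueryResults;
    import org.apache.beam.sdk.metrics.MetricResult;
    import org.apache.beam.sdk.metrics.MetricsFilter;

    public class QueryMetricsExample {
      static void reportRetries(PipelineResult result) {
        MetricQueryResults metrics = result.metrics().queryMetrics(
            MetricsFilter.builder()
                .addNameFilter(MetricNameFilter.named("myNamespace", "rpcCalls"))
                // Partial step name; with substring matching this returns results for
                // "My Composite/ParDo(MyDoFn)" as well as "My Composite/ParDo(MyDoFn)2".
                .addStep("ParDo(MyDoFn)")
                .build());

        for (MetricResult<Long> counter : metrics.counters()) {
          long attempted = counter.attempted();
          long committed = counter.committed(); // may throw on runners that only support attempted()
          if (attempted != committed) {
            System.out.println(counter.step() + ": " + (attempted - committed)
                + " increments came from retried bundles");
          }
        }
      }
    }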
On Thu, Jan 19, 2017 at 1:57 PM Amit Sela <amitsel...@gmail.com> wrote:

> I think Luke's example is interesting, but I wonder how common it is or would be. I'd expect failures to happen, but not at a rate so dramatic that it would be interesting to follow at the application level (I'd expect the runner/cluster to properly monitor uptime of processes/nodes separately). And even if it is useful, I can't think of other use cases.
>
> I thought the idea was to "declare" the Metrics guarantee level in the query API, but the more I think about it the more I tend to let it go, for the following reasons:
>
> - Setting aside Luke's example, I think users would prefer the best guarantee a runner can provide. And on that note, I'd expect a "getMetrics" API and not have to figure out guarantees.
> - Programmatic querying would "break" (UnsupportedOperationException) portability if a program that was running with a runner that supports committed() tried to execute on a runner that only supports attempted(). I know that portability is for the Pipeline and this is post-execution, but still, call it a 25% portability issue ;-).
> - According to the Capability Matrix, all runners fail to provide a "commit" guarantee for Aggregators. I can only speak for Spark, where supporting the Metrics API relies on the same underlying mechanism, so nothing will change. I wonder about other runners; does anyone plan to support "commit" guarantees for Metrics soon? Having said that, I'm not sure this is a good reason not to have it as a placeholder.
>
> Another question for querying Metrics: querying by step could be a bit tricky, since a runner is expected to keep unique names/ids for steps and users here are expected to be aware of that. I suspect many won't be, and if they use the same ParDo in a couple of places and query it, it might be confusing for them to see "double counts" they didn't intend.
>
> Amit.

On Thu, Jan 19, 2017 at 7:36 PM Ben Chambers <bchamb...@google.com.invalid> wrote:

> Thanks for starting the discussion! I'm going to hold off saying what I think and instead just provide some background and additional questions, because I want to see where the discussion goes.
>
> When I first suggested the API for querying metrics, I was adding it for parity with aggregators. A good first question might be: does the pipeline result even need query methods? Runners could add them as necessary, based on the levels of querying they support.
>
> The other desire was to make the accuracy clear. One implementation path was reporting metrics directly from the workers while attempting work. This can overcount when retrying, and may be under the actual attempts if the worker lost connectivity before reporting.
>
> Another implementation was something like a side output, where the counts are committed as part of each bundle's results and then aggregated. This committed value is more accurate and represents the value that occurred along the success path of the pipeline.
>
> I suspect there are other possible implementations, so trying to make an API that expresses all of them is difficult. So:
>
> 1. Does the pipeline result need to support querying (which is useful for programmatic consumption), or are metrics intended only to get values out of a pipeline and into some metrics store?
>
> 2. How should pipeline results indicate the different kinds of metrics? What if a runner supported multiple kinds (e.g., the runner reports both attempted and committed results)? As Luke mentions, it may be useful to look at both to understand how much retries affected the value.
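For reference, the kind of user-defined metric these two implementation paths apply to might look roughly like the following; the namespace, counter name and callExternalService() are made up for the example:

    import org.apache.beam.sdk.metrics.Counter;
    import org.apache.beam.sdk.metrics.Metrics;
    import org.apache.beam.sdk.transforms.DoFn;

    public class RpcDoFn extends DoFn<String, String> {
      // Counter declared via the metrics API under discussion.
      private final Counter rpcCalls = Metrics.counter("myNamespace", "rpcCalls");

      @ProcessElement
      public void processElement(ProcessContext c) {
        rpcCalls.inc(); // incremented on every attempt, including retried bundles
        c.output(callExternalService(c.element()));
      }

      private String callExternalService(String input) {
        return input; // placeholder for the real external RPC
      }
    }

Reported from the workers while attempting work, this counter approximates the RPC bill; committed along with each bundle's results, it reflects only the successful path.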
On Thu, Jan 19, 2017, 1:42 AM Aviem Zur <aviem...@gmail.com> wrote:

> Hi all,
>
> While working on the implementation of the metrics API in the Spark runner, the question of committed vs. attempted results has come up, sparking (no pun intended) an interesting conversation.
> (See API: MetricResult <https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/metrics/MetricResult.java> and discussion: PR #1750 <https://github.com/apache/beam/pull/1750>)
>
> The separation of `attempted` and `committed` metric results seems a bit unclear.
>
> Seeing that the current implementations of aggregators in the different runners do not guarantee correctness, one could assume that the metrics API implementations will follow the same guarantees.
>
> If this is correct, then you could assume that only `attempted()` metric results can be fulfilled.
> Would it then be better to just have a single method such as `get()` in the API, and have the guarantees of each runner explained in the capability matrix / documentation?
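If committed() ends up throwing UnsupportedOperationException on runners that cannot fulfil it, as suggested earlier in the thread, a caller who just wants the best available value could wrap the lookup like this (a sketch only; method names follow the MetricResult class linked above):

    import org.apache.beam.sdk.metrics.MetricResult;

    public class MetricResultValues {
      /** Prefers the committed value and falls back to attempted() where unsupported. */
      static <T> T bestAvailable(MetricResult<T> result) {
        try {
          return result.committed();
        } catch (UnsupportedOperationException e) {
          return result.attempted();
        }
      }
    }

That is effectively the single `get()` Aviem asks about, with the guarantee decided by the runner rather than declared by the caller.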