Took a look at the code, too. It seems like a mismatch in a few ways

 - PipelineRunner::run is async already and returns while the job is still
running
 - PipelineResult is a legacy name - it is really meant to be a handle to a
running job
 - cancel() on a future is just not really related to cancel() in a job. I
would expect to cancel a job with PipelineResult::cancel and I would expect
JobInvocation::cancel to cancel the "start job" RPC/request/whatever. So I
would not expect metrics for a job which I decided to not even start.

Kenn

On Fri, Jul 26, 2019 at 8:48 AM Łukasz Gajowy <lgaj...@apache.org> wrote:

> Hi all,
>
> I'm currently working on BEAM-4775
> <https://issues.apache.org/jira/browse/BEAM-4775>. The goal here is to
> pass portable MetricResults over the RPC API to the PortableRunner (SDK)
> part and allow reading them there. The metrics can be collected from the
> pipeline result that is available in JobInvocation's callbacks. The
> callbacks are registered in *start()
> <https://github.com/apache/beam/blob/a999e858a7282c8c7d1eea2670df252dea78c537/runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/jobsubmission/JobInvocation.java#L87>*
>  and
> *cancel()
> <https://github.com/apache/beam/blob/a999e858a7282c8c7d1eea2670df252dea78c537/runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/jobsubmission/JobInvocation.java#L149>
>  *methods
> of JobInvocation. This is the place where my problems begin:
>
> I want to access the pipeline result and get the MetricResults from it.
> This is possible *only in onSuccess(PipelineResult result) method* of the
> callbacks registered in *start() and* *cancel() *in JobInvocation. Now,
> when I cancel the job invocation, *invocationFuture.cancel()
> <https://github.com/apache/beam/blob/a999e858a7282c8c7d1eea2670df252dea78c537/runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/jobsubmission/JobInvocation.java#L148>*
>  is
> called and will result in invoking *onFailure(Throwable throwable) *in
> case the pipeline is still running. *onFailure()* has no PipelineResult
> parameter, hence there currently is no possibility to collect the metrics
> there.
>
> My questions currently are:
>
>    - Should we collect metrics after the job is canceled? So far I
>    assumed that we should.
>    - If so, does anyone have some other ideas on how to collect metrics
>    so that we could collect them when canceling the job?
>
> PR I'm working on with more discussions on the topic: PR 9020
> <https://github.com/apache/beam/pull/9020>
> The current idea on how the metrics could be collected in JobInvocation:
> link
> <https://github.com/apache/beam/pull/9020/files#diff-19f1da178ef8693f13c026d3bf70398a>
>
> Thanks,
> Łukasz
>
>

Reply via email to