Took a look at the code, too. It seems like a mismatch in a few ways - PipelineRunner::run is async already and returns while the job is still running - PipelineResult is a legacy name - it is really meant to be a handle to a running job - cancel() on a future is just not really related to cancel() in a job. I would expect to cancel a job with PipelineResult::cancel and I would expect JobInvocation::cancel to cancel the "start job" RPC/request/whatever. So I would not expect metrics for a job which I decided to not even start.
Kenn On Fri, Jul 26, 2019 at 8:48 AM Łukasz Gajowy <lgaj...@apache.org> wrote: > Hi all, > > I'm currently working on BEAM-4775 > <https://issues.apache.org/jira/browse/BEAM-4775>. The goal here is to > pass portable MetricResults over the RPC API to the PortableRunner (SDK) > part and allow reading them there. The metrics can be collected from the > pipeline result that is available in JobInvocation's callbacks. The > callbacks are registered in *start() > <https://github.com/apache/beam/blob/a999e858a7282c8c7d1eea2670df252dea78c537/runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/jobsubmission/JobInvocation.java#L87>* > and > *cancel() > <https://github.com/apache/beam/blob/a999e858a7282c8c7d1eea2670df252dea78c537/runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/jobsubmission/JobInvocation.java#L149> > *methods > of JobInvocation. This is the place where my problems begin: > > I want to access the pipeline result and get the MetricResults from it. > This is possible *only in onSuccess(PipelineResult result) method* of the > callbacks registered in *start() and* *cancel() *in JobInvocation. Now, > when I cancel the job invocation, *invocationFuture.cancel() > <https://github.com/apache/beam/blob/a999e858a7282c8c7d1eea2670df252dea78c537/runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/jobsubmission/JobInvocation.java#L148>* > is > called and will result in invoking *onFailure(Throwable throwable) *in > case the pipeline is still running. *onFailure()* has no PipelineResult > parameter, hence there currently is no possibility to collect the metrics > there. > > My questions currently are: > > - Should we collect metrics after the job is canceled? So far I > assumed that we should. > - If so, does anyone have some other ideas on how to collect metrics > so that we could collect them when canceling the job? > > PR I'm working on with more discussions on the topic: PR 9020 > <https://github.com/apache/beam/pull/9020> > The current idea on how the metrics could be collected in JobInvocation: > link > <https://github.com/apache/beam/pull/9020/files#diff-19f1da178ef8693f13c026d3bf70398a> > > Thanks, > Łukasz > >