Re: Collecting metrics in JobInvocation - BEAM-4775

Robert Bradshaw Wed, 07 Aug 2019 06:18:24 -0700

I think the question here is whether PipelineRunner::run is allowed to
be blocking. If it is, then the futures make sense (but there's no way
to properly cancel it). I'm OK with not being able to return metrics
on cancel in this case, or the case the pipeline didn't even start up
yet. Otherwise, we should quickly get a handle to the PipelineResult
and be able to query that for all future use.


On Fri, Jul 26, 2019 at 6:04 PM Kenneth Knowles <k...@apache.org> wrote:
>
> Took a look at the code, too. It seems like a mismatch in a few ways
>
>  - PipelineRunner::run is async already and returns while the job is still 
> running
>  - PipelineResult is a legacy name - it is really meant to be a handle to a 
> running job
>  - cancel() on a future is just not really related to cancel() in a job. I 
> would expect to cancel a job with PipelineResult::cancel and I would expect 
> JobInvocation::cancel to cancel the "start job" RPC/request/whatever. So I 
> would not expect metrics for a job which I decided to not even start.
>
> Kenn
>
> On Fri, Jul 26, 2019 at 8:48 AM Łukasz Gajowy <lgaj...@apache.org> wrote:
>>
>> Hi all,
>>
>> I'm currently working on BEAM-4775. The goal here is to pass portable 
>> MetricResults over the RPC API to the PortableRunner (SDK) part and allow 
>> reading them there. The metrics can be collected from the pipeline result 
>> that is available in JobInvocation's callbacks. The callbacks are registered 
>> in start() and cancel() methods of JobInvocation. This is the place where my 
>> problems begin:
>>
>> I want to access the pipeline result and get the MetricResults from it. This 
>> is possible only in onSuccess(PipelineResult result) method of the callbacks 
>> registered in start() and cancel() in JobInvocation. Now, when I cancel the 
>> job invocation, invocationFuture.cancel() is called and will result in 
>> invoking onFailure(Throwable throwable) in case the pipeline is still 
>> running. onFailure() has no PipelineResult parameter, hence there currently 
>> is no possibility to collect the metrics there.
>>
>> My questions currently are:
>>
>> Should we collect metrics after the job is canceled? So far I assumed that 
>> we should.
>> If so, does anyone have some other ideas on how to collect metrics so that 
>> we could collect them when canceling the job?
>>
>> PR I'm working on with more discussions on the topic: PR 9020
>> The current idea on how the metrics could be collected in JobInvocation: link
>>
>> Thanks,
>> Łukasz
>>

Re: Collecting metrics in JobInvocation - BEAM-4775

Reply via email to