On Wed, May 23, 2018 at 3:09 PM Ankur Goenka <goe...@google.com> wrote:
> 1. Why is JobService runner specific? Couldn't at least a good part of it
> be reused given that the runner specific parts are mostly in the
> translation? Or am I missing other reasons?
>
> Yes, absolutely. A good chunk of it can be reused. We are reusing a few
> components from ULR in the Flink runner. Calling JobService runner specific
> gives a runner the freedom to have a very custom JobService if needed.
>

So you're suggesting that we should publish common JobService components
and recommend that runners use them, but that runners are free to build
something completely custom if they prefer?

>
> 2. What about authentication and authorisation for production runners?
> A service that can submit/cancel pipelines is the first thing I can think
> of abusing.
>
> Authentication and authorization are still an unsolved problem. To the best
> of my knowledge, this is runner specific and any required information
> should be part of the gRPC headers.
>
> On Wed, May 23, 2018 at 2:48 PM Ismaël Mejía <ieme...@gmail.com> wrote:
>
>> Interesting document, two questions:
>>
>> 1. Why is JobService runner specific? Couldn't at least a good part of it
>> be reused given that the runner specific parts are mostly in the
>> translation? Or am I missing other reasons?
>>
>> 2. What about authentication and authorisation for production runners?
>> A service that can submit/cancel pipelines is the first thing I can think
>> of abusing.
>>
>> On Tue, May 22, 2018 at 9:40 PM Ankur Goenka <goe...@google.com> wrote:
>>
>> > Thank you guys for the input.
>>
>> > Here is the summary.
>>
>> > Responsibility of Beam on Job Management
>>
>> > Beam provides a common interface for basic job management operations
>> > called JobService. The supported operations can vary between runners.
>>
>>
>> > What is JobService?
>>
>> > JobService is a runner specific component which implements Beam's
>> > JobService interface defined here.
>>
>>
>> > What is the life cycle of a JobService?
>>
>> > There are 3 scenarios:
>>
>> > 1. With ULR, JobService is short lived and runs as long as the ULR runs.
>> > (JobService Lifespan ~= Job Lifespan)
>>
>> > 2. With production runners (Flink, Dataflow etc.), JobService can either
>> > be short lived or long lived. The choice is up to the runner.
>>
>> > 3. With production runners (Flink, Dataflow etc.) without a long running
>> > JobService, the SDK will spin up a local JobService.
>>
>>
>> > JobService state management
>>
>> > The choice of state management is up to the JobService implementation.
>> > The basic requirement is that JobService should be able to perform all
>> > the operations with the returned job handle.
>>
>> > At the very least it can be the job handle for the underlying runner job
>> > and JobService will simply proxy actions to the runner using the provided
>> > job handle.
>>
>> > A persistent JobService is free to provide a simple string as a
>> > JobHandle. In this case, the job handle can only be used with the same
>> > job service.
>>
>> > A stateless, non-persistent JobService can provide an opaque blob
>> > containing all the relevant information about the job. In this case the
>> > job handle can be used with any instance of JobService with the same code.
>>
>>
>> > JobService code distribution and invocation when JobService is short
>> > lived
>>
>> > We will give an easy to run solution using Docker. Docker will help with
>> > both distributing the executable and providing a platform independent
>> > binary.
>>
>> > We will also give an easy setup script with a supporting document for
>> > users who do not want to use Docker on their local machine.
>>
>>
>> > Should Flink JobService start a local cluster for testing?
>>
>> > Flink JobService will be capable of submitting to a remote Flink cluster
>> > if a master URL is provided; otherwise it will execute the pipeline in an
>> > in-process Flink invocation on the same JVM.
>>
>> > On Tue, May 22, 2018 at 12:37 PM Eugene Kirpichov <kirpic...@google.com>
>> > wrote:
>>
>> >> Thanks Ankur, I think there's consensus, so it's probably ready to
>> >> share :)
>>
>> >> On Fri, May 18, 2018 at 3:00 PM Ankur Goenka <goe...@google.com> wrote:
>>
>> >>> Thanks for all the input.
>> >>> I have summarized the discussions at the bottom of the document (here).
>> >>> Please feel free to provide comments.
>> >>> Once we agree, I will publish the conclusion on the mailing list.
>>
>> >>> On Mon, May 14, 2018 at 1:51 PM Eugene Kirpichov <kirpic...@google.com>
>> >>> wrote:
>>
>> >>>> Thanks Ankur, this document clarifies a few points and raises some
>> >>>> very important questions. I encourage everybody with a stake in
>> >>>> Portability to take a look and chime in.
>>
>> >>>> +Aljoscha Krettek +Thomas Weise +Henning Rohde
>>
>> >>>> On Mon, May 14, 2018 at 12:34 PM Ankur Goenka <goe...@google.com>
>> >>>> wrote:
>>
>> >>>>> Updated link to the document as the previous link was not working
>> >>>>> for some people.
>>
>> >>>>> On Fri, May 11, 2018 at 7:56 PM Ankur Goenka <goe...@google.com>
>> >>>>> wrote:
>>
>> >>>>>> Hi,
>>
>> >>>>>> Recent effort on portability has introduced JobService and
>> >>>>>> ArtifactService to the Beam stack along with the SDK. This has
>> >>>>>> opened up a few questions around how we start a pipeline in a
>> >>>>>> portable setup (with JobService).
>> >>>>>> I am trying to document our approach to launching a portable
>> >>>>>> pipeline and make binding decisions based on the discussion.
>> >>>>>> Please review the document and provide your feedback.
>>
>> >>>>>> Thanks,
>> >>>>>> Ankur
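
For readers skimming the thread, the two job-handle styles from the "JobService state management" part of the summary can be sketched roughly as below. This is only an illustration under stated assumptions, not Beam's JobService API: the class names, the submit/resolve methods, and the handle fields (runner_job_id, master_url) are all hypothetical.

```python
import base64
import json
import uuid


class PersistentJobService:
    """Hypothetical long-lived service: the handle is a plain string key.

    The job details live only in this instance's state, so the handle is
    meaningless to any other JobService instance.
    """

    def __init__(self):
        self._jobs = {}

    def submit(self, runner_job_id, master_url):
        handle = uuid.uuid4().hex
        self._jobs[handle] = {
            "runner_job_id": runner_job_id,
            "master_url": master_url,
        }
        return handle

    def resolve(self, handle):
        # Raises KeyError if the handle was issued by a different instance.
        return self._jobs[handle]


class StatelessJobService:
    """Hypothetical short-lived service: the handle is a self-describing blob.

    Everything needed to reach the runner job is encoded in the handle, so
    any instance running the same code can decode and act on it.
    """

    def submit(self, runner_job_id, master_url):
        payload = json.dumps(
            {"runner_job_id": runner_job_id, "master_url": master_url}
        )
        return base64.urlsafe_b64encode(payload.encode("utf-8")).decode("ascii")

    def resolve(self, handle):
        raw = base64.urlsafe_b64decode(handle.encode("ascii"))
        return json.loads(raw.decode("utf-8"))
```

With the stateless variant, a handle returned by one instance resolves on a freshly started one, which matches the case where the SDK spins up a local JobService per invocation. With the persistent variant, the same lookup on a different instance fails, which is the "can only be used with the same job service" restriction in the summary.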