Performance, due to the extra gRPC hop.
On Thu, Mar 8, 2018 at 11:08 AM, Lukasz Cwik <lc...@google.com> wrote: > The goal is to use containers (and similar technologies) in the future. It > really hinders pipeline portability between runners if you also have to > deal with the dependency conflicts between Flink/Dataflow/Spark/... > execution runtimes. > > What kinds of penalty are you referring to (perf, user complexity, ...)? > > > > On Thu, Mar 8, 2018 at 11:02 AM, Thomas Weise <t...@apache.org> wrote: > >> I'm curious if pipelines that are exclusively Java will be executed (when >> running on Flink or other JVM based runnner) in separate harness containers >> also? This would impose a significant penalty compared to the current >> execution model. Will this be something the user can control? >> >> Thanks, >> Thomas >> >> >> On Wed, Mar 7, 2018 at 2:09 PM, Aljoscha Krettek <aljos...@apache.org> >> wrote: >> >>> @Axel I assigned https://issues.apache.org/jira/browse/BEAM-2588 to >>> you. It might make sense to also grab other issues that you're already >>> working on. >>> >>> >>> On 7. Mar 2018, at 21:18, Aljoscha Krettek <aljos...@apache.org> wrote: >>> >>> Cool, so we had the same ideas. I think this indicates that we're not >>> completely on the wrong track with this! ;-) >>> >>> Aljoscha >>> >>> On 7. Mar 2018, at 21:14, Thomas Weise <t...@apache.org> wrote: >>> >>> Ben, >>> >>> Looks like we hit the send button at the same time. Is the plan the to >>> derive the Flink implementation of the various execution services from >>> those under org.apache.beam.runners.fnexecution ? >>> >>> Thanks >>> >>> On Wed, Mar 7, 2018 at 11:02 AM, Thomas Weise <t...@apache.org> wrote: >>> >>>> What's the plan for the endpoints that the Flink operator needs to >>>> provide (control/data plane, state, logging)? Is the intention to provide >>>> base implementations that can be shared across runners and then implement >>>> the Flink specific parts on top of it? Has work started on those? >>>> >>>> If there are subtasks ready to be taken up I would be interested. >>>> >>>> Thanks, >>>> Thomas >>>> >>>> >>>> On Wed, Mar 7, 2018 at 9:35 AM, Ben Sidhom <sid...@google.com> wrote: >>>> >>>>> Yes, Axel has started work on such a shim. >>>>> >>>>> Our plan in the short term is to keep the old FlinkRunner around and >>>>> to call into it to process jobs from the job service itself. That way we >>>>> can keep the non-portable runner fully-functional while working on >>>>> portability. Eventually, I think it makes sense for this to go away, but >>>>> we >>>>> haven't given much thought to that. The translator layer will likely stay >>>>> the same, and the FlinkRunner bits are a relatively simple wrapper around >>>>> translation, so it should be simple enough to factor this out. >>>>> >>>>> Much of the service code from the Universal Local Runner (ULR) should >>>>> be composed and reused with other runner implementations. Thomas and Axel >>>>> have more context around that. >>>>> >>>>> >>>>> On Wed, Mar 7, 2018 at 8:47 AM Aljoscha Krettek <aljos...@apache.org> >>>>> wrote: >>>>> >>>>>> Hi, >>>>>> >>>>>> Has anyone started on https://issues.apache.org/jira/browse/BEAM-2588 >>>>>> (FlinkRunner shim for serving Job API). If not I would start on >>>>>> that. >>>>>> >>>>>> My plan is to implement a FlinkJobService that implements >>>>>> JobServiceImplBase, >>>>>> similar to ReferenceRunnerJobService. This would have a lot of the >>>>>> functionality that FlinkRunner currently has. As a next step, I would >>>>>> add a >>>>>> JobServiceRunner that can submit Pipelines to a JobService. >>>>>> >>>>>> For testing, I would probably add functionality that allows spinning >>>>>> up a JobService in-process with the JobServiceRunner. I can imagine for >>>>>> testing we could even eventually use something like: >>>>>> "--runner=JobServiceRunner", "--streaming=true", >>>>>> "--jobService=FlinkRunnerJobService". >>>>>> >>>>>> Once all of this is done, we only need the python component that >>>>>> talks to the JobService to submit a pipeline. >>>>>> >>>>>> What do you think about the plan? >>>>>> >>>>>> Btw, I feel that the thing currently called Runner, i.e. FlinkRunner >>>>>> will go way in the long run and we will have FlinkJobService, >>>>>> SparkJobService and whatnot, what do you think? >>>>>> >>>>>> Aljoscha >>>>>> >>>>>> >>>>>> On 9. Feb 2018, at 01:31, Ben Sidhom <sid...@google.com> wrote: >>>>>> >>>>>> Hey all, >>>>>> >>>>>> We're working on getting the portability framework plumbed through >>>>>> the Flink runner. The first iteration will likely only support batch and >>>>>> will be limited in its deployment flexibility, but hopefully it shouldn't >>>>>> be too painful to expand this. >>>>>> >>>>>> We have the start of a tracking doc here: https://s.apache.org/por >>>>>> table-beam-on-flink. >>>>>> >>>>>> We've documented the general deployment strategy here: >>>>>> https://s.apache.org/portable-flink-runner-overview. >>>>>> >>>>>> Feel free to provide comments on the docs or jump in on any of the >>>>>> referenced bugs. >>>>>> >>>>>> -- >>>>>> -Ben >>>>>> >>>>>> >>>>>> >>>>> >>>>> -- >>>>> -Ben >>>>> >>>> >>>> >>> >>> >>> >> >