Hi Thomas, Do you have an example Dockerfile demonstrating best practices for building an image that contains both Flink and Beam SDK dependencies? That would be useful.
-chad On Fri, Nov 1, 2019 at 10:18 AM Thomas Weise <[email protected]> wrote: > For folks looking to run Beam on Flink on k8s, see update in [1] > > I also updated [2] > > TLDR: at this time best option to run portable pipelines on k8s is to > create container images that have both Flink and the SDK dependencies. > > I'm curious how much interest there is to use the official SDK container > images and keep Flink and portable pipeline separate as far as the image > build goes? Deployment can be achieved with the sidecar container approach. > Most of mechanics are in place already, one addition would be an > abstraction for where the SDK entry point executes (process, external), > similar to how we have it for the workers. > > Thanks, > Thomas > > [1] > https://lists.apache.org/thread.html/10aada9c200d4300059d02e4baa47dda0cc2c6fc9432f950194a7f5e@%3Cdev.beam.apache.org%3E > > [2] > https://docs.google.com/document/d/1z3LNrRtr8kkiFHonZ5JJM_L4NWNBBNcqRc_yAf6G0VI > > On Wed, Aug 21, 2019 at 9:44 PM Thomas Weise <[email protected]> wrote: > >> The changes to containerize the Python SDK worker pool are nearly >> complete. I also updated the document for next implementation steps. >> >> The favored approach (initially targeted option) for pipeline submission >> is support for the (externally created) fat far. It will keep changes to >> the operator to a minimum and is applicable for any other Flink job as well. >> >> Regarding fine grained resource scheduling, that would happen within the >> pods scheduled by the k8s operator (or other external orchestration tool) >> or, further down the road, in a completely elastic/dynamic fashion with >> active execution mode (where Flink would request resources directly from >> k8s, similar to how it would work on YARN). >> >> >> On Tue, Aug 13, 2019 at 10:59 AM Chad Dombrova <[email protected]> wrote: >> >>> Hi Thomas, >>> Nice work! It's really clearly presented. >>> >>> What's the current favored approach for pipeline submission? >>> >>> I'm also interested to know how this plan overlaps (if at all) with the >>> work on Fine-Grained Resource Scheduling [1][2] that's being done for Flink >>> 1.9+, which has implications for creation of task managers in kubernetes. >>> >>> [1] >>> https://docs.google.com/document/d/1h68XOG-EyOFfcomd2N7usHK1X429pJSMiwZwAXCwx1k/edit#heading=h.72szmms7ufza >>> [2] https://issues.apache.org/jira/browse/FLINK-12761 >>> >>> -chad >>> >>> >>> On Tue, Aug 13, 2019 at 9:14 AM Thomas Weise <[email protected]> wrote: >>> >>>> There have been a few comments in the doc and I'm going to start >>>> working on the worker execution part. >>>> >>>> Instead of running a Docker container for each worker, the preferred >>>> option is to use the worker pool: >>>> >>>> >>>> https://docs.google.com/document/d/1z3LNrRtr8kkiFHonZ5JJM_L4NWNBBNcqRc_yAf6G0VI/edit#heading=h.bzs6y6ms0898 >>>> >>>> There are some notes at the end of the doc regarding the >>>> implementation. It would be great if those interested in the environments >>>> and Python docker container could take a look. In a nutshell, the proposal >>>> is to make few changes to the Python container so that it (optionally) can >>>> be used to run the worker pool. >>>> >>>> Thanks, >>>> Thomas >>>> >>>> >>>> On Wed, Jul 24, 2019 at 9:42 PM Pablo Estrada <[email protected]> >>>> wrote: >>>> >>>>> I am very happy to see this. I'll take a look, and leave my comments. >>>>> >>>>> I think this is something we'd been needing, and it's great that you >>>>> guys are putting thought into it. Thanks!<3 >>>>> >>>>> On Wed, Jul 24, 2019 at 9:01 PM Thomas Weise <[email protected]> wrote: >>>>> >>>>>> Hi, >>>>>> >>>>>> Recently Lyft open sourced *FlinkK8sOperator,* a Kubernetes operator >>>>>> to manage Flink deployments on Kubernetes: >>>>>> >>>>>> https://github.com/lyft/flinkk8soperator/ >>>>>> >>>>>> We are now discussing how to extend this operator to also support >>>>>> deployment of Python Beam pipelines with the Flink runner. I would like >>>>>> to >>>>>> share the proposal with the Beam community to enlist feedback as well as >>>>>> explore opportunities for collaboration: >>>>>> >>>>>> >>>>>> https://docs.google.com/document/d/1z3LNrRtr8kkiFHonZ5JJM_L4NWNBBNcqRc_yAf6G0VI/ >>>>>> >>>>>> Looking forward to your comments and suggestions! >>>>>> >>>>>> Thomas >>>>>> >>>>>>
