That's a good idea. Probably best to add an example in: https://github.com/lyft/flinkk8soperator
Do you want to add an issue? (It will have to wait for the 2.18 release, though.)

On Fri, Nov 1, 2019 at 11:37 AM Chad Dombrova <[email protected]> wrote:

> Hi Thomas,
> Do you have an example Dockerfile demonstrating best practices for
> building an image that contains both Flink and Beam SDK dependencies? That
> would be useful.
>
> -chad
>
>
> On Fri, Nov 1, 2019 at 10:18 AM Thomas Weise <[email protected]> wrote:
>
>> For folks looking to run Beam on Flink on k8s, see the update in [1]
>>
>> I also updated [2]
>>
>> TLDR: at this time the best option to run portable pipelines on k8s is to
>> create container images that have both Flink and the SDK dependencies.
>>
>> I'm curious how much interest there is in using the official SDK container
>> images and keeping Flink and the portable pipeline separate as far as the
>> image build goes? Deployment can be achieved with the sidecar container
>> approach. Most of the mechanics are in place already; one addition would be
>> an abstraction for where the SDK entry point executes (process, external),
>> similar to how we have it for the workers.
>>
>> Thanks,
>> Thomas
>>
>> [1]
>> https://lists.apache.org/thread.html/10aada9c200d4300059d02e4baa47dda0cc2c6fc9432f950194a7f5e@%3Cdev.beam.apache.org%3E
>>
>> [2]
>> https://docs.google.com/document/d/1z3LNrRtr8kkiFHonZ5JJM_L4NWNBBNcqRc_yAf6G0VI
>>
>> On Wed, Aug 21, 2019 at 9:44 PM Thomas Weise <[email protected]> wrote:
>>
>>> The changes to containerize the Python SDK worker pool are nearly
>>> complete. I also updated the document for the next implementation steps.
>>>
>>> The favored approach (initially targeted option) for pipeline submission
>>> is support for the (externally created) fat jar. It will keep changes to
>>> the operator to a minimum and is applicable for any other Flink job as well.
>>>
>>> Regarding fine-grained resource scheduling, that would happen within the
>>> pods scheduled by the k8s operator (or other external orchestration tool)
>>> or, further down the road, in a completely elastic/dynamic fashion with
>>> active execution mode (where Flink would request resources directly from
>>> k8s, similar to how it would work on YARN).
>>>
>>>
>>> On Tue, Aug 13, 2019 at 10:59 AM Chad Dombrova <[email protected]>
>>> wrote:
>>>
>>>> Hi Thomas,
>>>> Nice work! It's really clearly presented.
>>>>
>>>> What's the current favored approach for pipeline submission?
>>>>
>>>> I'm also interested to know how this plan overlaps (if at all) with the
>>>> work on Fine-Grained Resource Scheduling [1][2] that's being done for Flink
>>>> 1.9+, which has implications for creation of task managers in Kubernetes.
>>>>
>>>> [1]
>>>> https://docs.google.com/document/d/1h68XOG-EyOFfcomd2N7usHK1X429pJSMiwZwAXCwx1k/edit#heading=h.72szmms7ufza
>>>> [2] https://issues.apache.org/jira/browse/FLINK-12761
>>>>
>>>> -chad
>>>>
>>>>
>>>> On Tue, Aug 13, 2019 at 9:14 AM Thomas Weise <[email protected]> wrote:
>>>>
>>>>> There have been a few comments in the doc and I'm going to start
>>>>> working on the worker execution part.
>>>>>
>>>>> Instead of running a Docker container for each worker, the preferred
>>>>> option is to use the worker pool:
>>>>>
>>>>>
>>>>> https://docs.google.com/document/d/1z3LNrRtr8kkiFHonZ5JJM_L4NWNBBNcqRc_yAf6G0VI/edit#heading=h.bzs6y6ms0898
>>>>>
>>>>> There are some notes at the end of the doc regarding the
>>>>> implementation. It would be great if those interested in the environments
>>>>> and the Python Docker container could take a look. In a nutshell, the
>>>>> proposal is to make a few changes to the Python container so that it
>>>>> (optionally) can be used to run the worker pool.
>>>>>
>>>>> Thanks,
>>>>> Thomas
>>>>>
>>>>>
>>>>> On Wed, Jul 24, 2019 at 9:42 PM Pablo Estrada <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> I am very happy to see this. I'll take a look and leave my comments.
>>>>>>
>>>>>> I think this is something we'd been needing, and it's great that you
>>>>>> guys are putting thought into it. Thanks! <3
>>>>>>
>>>>>> On Wed, Jul 24, 2019 at 9:01 PM Thomas Weise <[email protected]> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> Recently Lyft open sourced *FlinkK8sOperator*, a Kubernetes
>>>>>>> operator to manage Flink deployments on Kubernetes:
>>>>>>>
>>>>>>> https://github.com/lyft/flinkk8soperator/
>>>>>>>
>>>>>>> We are now discussing how to extend this operator to also support
>>>>>>> deployment of Python Beam pipelines with the Flink runner. I would like
>>>>>>> to share the proposal with the Beam community to solicit feedback as
>>>>>>> well as explore opportunities for collaboration:
>>>>>>>
>>>>>>>
>>>>>>> https://docs.google.com/document/d/1z3LNrRtr8kkiFHonZ5JJM_L4NWNBBNcqRc_yAf6G0VI/
>>>>>>>
>>>>>>> Looking forward to your comments and suggestions!
>>>>>>>
>>>>>>> Thomas
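
For readers following the worker pool discussion above: the thread does not
show how a pipeline would point at a worker pool running next to Flink (for
example as a sidecar container), so here is a minimal sketch of what the
pipeline arguments might look like. It assumes the Beam Python SDK's portable
pipeline options (PortableRunner, job_endpoint, environment_type=EXTERNAL,
environment_config); the helper name and both addresses are hypothetical and
would depend on the actual deployment.

```python
from typing import List


def portable_flink_args(job_endpoint: str, worker_pool_address: str) -> List[str]:
    """Build Beam pipeline arguments for a portable pipeline whose SDK
    workers run in an external worker pool instead of one Docker
    container per worker. Hypothetical helper, not part of Beam."""
    return [
        "--runner=PortableRunner",
        # Address of the job server that accepts the portable pipeline.
        f"--job_endpoint={job_endpoint}",
        # EXTERNAL tells the runner to ask an already running worker
        # pool to start SDK workers.
        "--environment_type=EXTERNAL",
        # Address the worker pool listens on (assumed to be reachable
        # from the Flink task manager, e.g. over localhost in a pod).
        f"--environment_config={worker_pool_address}",
    ]


# Hypothetical addresses, for illustration only.
args = portable_flink_args("localhost:8099", "localhost:50000")
print(" ".join(args))
```

The resulting list could be passed to the pipeline's options when it is
constructed. Whether the job endpoint and the worker pool share a pod, and
which ports they use, is a deployment choice the thread leaves open.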
