The changes to containerize the Python SDK worker pool are nearly complete. I also updated the document with the next implementation steps.
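
For illustration, here is a minimal sketch of how a Python pipeline might be pointed at an already running worker pool instead of starting one Docker container per worker. It assumes Beam's portable runner options (`--environment_type=EXTERNAL` with `--environment_config` set to the pool's address). The helper name and both addresses are hypothetical placeholders, not values from the proposal.

```python
# Sketch only: builds the command-line flags a Python pipeline could pass
# to use an existing worker pool. The helper name and the addresses used
# below are illustrative assumptions, not part of the proposal document.


def worker_pool_pipeline_args(job_endpoint, worker_pool_address):
    """Return portable-runner flags that select an external worker pool."""
    return [
        "--runner=PortableRunner",
        # Address of the job service that accepts the pipeline.
        "--job_endpoint=" + job_endpoint,
        # EXTERNAL asks an existing worker pool service to start SDK
        # workers, rather than launching a Docker container per worker.
        "--environment_type=EXTERNAL",
        # Address on which the worker pool service is listening.
        "--environment_config=" + worker_pool_address,
    ]


if __name__ == "__main__":
    # Placeholder addresses; substitute the real job service and pool.
    print(" ".join(worker_pool_pipeline_args("localhost:8099", "localhost:50000")))
```

In a Kubernetes deployment the pool address would typically be a port on localhost served by a sidecar container in the same pod as the task manager, but that wiring depends on how the operator ends up laying out the pods.
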
The favored approach (the initially targeted option) for pipeline submission is support for the (externally created) fat jar. It will keep changes to the operator to a minimum and is applicable to any other Flink job as well.

Regarding fine-grained resource scheduling, that would happen within the pods scheduled by the k8s operator (or another external orchestration tool) or, further down the road, in a completely elastic/dynamic fashion with active execution mode (where Flink would request resources directly from k8s, similar to how it would work on YARN).

On Tue, Aug 13, 2019 at 10:59 AM Chad Dombrova <[email protected]> wrote:

> Hi Thomas,
> Nice work! It's really clearly presented.
>
> What's the current favored approach for pipeline submission?
>
> I'm also interested to know how this plan overlaps (if at all) with the
> work on Fine-Grained Resource Scheduling [1][2] that's being done for Flink
> 1.9+, which has implications for creation of task managers in kubernetes.
>
> [1]
> https://docs.google.com/document/d/1h68XOG-EyOFfcomd2N7usHK1X429pJSMiwZwAXCwx1k/edit#heading=h.72szmms7ufza
> [2] https://issues.apache.org/jira/browse/FLINK-12761
>
> -chad
>
>
> On Tue, Aug 13, 2019 at 9:14 AM Thomas Weise <[email protected]> wrote:
>
>> There have been a few comments in the doc and I'm going to start working
>> on the worker execution part.
>>
>> Instead of running a Docker container for each worker, the preferred
>> option is to use the worker pool:
>>
>>
>> https://docs.google.com/document/d/1z3LNrRtr8kkiFHonZ5JJM_L4NWNBBNcqRc_yAf6G0VI/edit#heading=h.bzs6y6ms0898
>>
>> There are some notes at the end of the doc regarding the implementation.
>> It would be great if those interested in the environments and Python docker
>> container could take a look. In a nutshell, the proposal is to make few
>> changes to the Python container so that it (optionally) can be used to run
>> the worker pool.
>>
>> Thanks,
>> Thomas
>>
>>
>> On Wed, Jul 24, 2019 at 9:42 PM Pablo Estrada <[email protected]> wrote:
>>
>>> I am very happy to see this. I'll take a look, and leave my comments.
>>>
>>> I think this is something we'd been needing, and it's great that you
>>> guys are putting thought into it. Thanks!<3
>>>
>>> On Wed, Jul 24, 2019 at 9:01 PM Thomas Weise <[email protected]> wrote:
>>>
>>>> Hi,
>>>>
>>>> Recently Lyft open sourced *FlinkK8sOperator,* a Kubernetes operator
>>>> to manage Flink deployments on Kubernetes:
>>>>
>>>> https://github.com/lyft/flinkk8soperator/
>>>>
>>>> We are now discussing how to extend this operator to also support
>>>> deployment of Python Beam pipelines with the Flink runner. I would like to
>>>> share the proposal with the Beam community to enlist feedback as well as
>>>> explore opportunities for collaboration:
>>>>
>>>>
>>>> https://docs.google.com/document/d/1z3LNrRtr8kkiFHonZ5JJM_L4NWNBBNcqRc_yAf6G0VI/
>>>>
>>>> Looking forward to your comments and suggestions!
>>>>
>>>> Thomas
