That's a good idea. Probably best to add an example in:
https://github.com/lyft/flinkk8soperator

Do you want to add an issue?

(It will have to wait for 2.18 release though.)


On Fri, Nov 1, 2019 at 11:37 AM Chad Dombrova <[email protected]> wrote:

> Hi Thomas,
> Do you have an example Dockerfile demonstrating best practices for
> building an image that contains both Flink and Beam SDK dependencies?  That
> would be useful.
>
> -chad
>
>
> On Fri, Nov 1, 2019 at 10:18 AM Thomas Weise <[email protected]> wrote:
>
>> For folks looking to run Beam on Flink on k8s, see update in [1]
>>
>> I also updated [2]
>>
>> TLDR: at this time best option to run portable pipelines on k8s is to
>> create container images that have both Flink and the SDK dependencies.
>>
>> I'm curious how much interest there is to use the official SDK container
>> images and keep Flink and portable pipeline separate as far as the image
>> build goes? Deployment can be achieved with the sidecar container approach.
>> Most of mechanics are in place already, one addition would be an
>> abstraction for where the SDK entry point executes (process, external),
>> similar to how we have it for the workers.
>>
>> Thanks,
>> Thomas
>>
>> [1]
>> https://lists.apache.org/thread.html/10aada9c200d4300059d02e4baa47dda0cc2c6fc9432f950194a7f5e@%3Cdev.beam.apache.org%3E
>>
>> [2]
>> https://docs.google.com/document/d/1z3LNrRtr8kkiFHonZ5JJM_L4NWNBBNcqRc_yAf6G0VI
>>
>> On Wed, Aug 21, 2019 at 9:44 PM Thomas Weise <[email protected]> wrote:
>>
>>> The changes to containerize the Python SDK worker pool are nearly
>>> complete. I also updated the document for next implementation steps.
>>>
>>> The favored approach (initially targeted option) for pipeline submission
>>> is support for the (externally created) fat far. It will keep changes to
>>> the operator to a minimum and is applicable for any other Flink job as well.
>>>
>>> Regarding fine grained resource scheduling, that would happen within the
>>> pods scheduled by the k8s operator (or other external orchestration tool)
>>> or, further down the road, in a completely elastic/dynamic fashion with
>>> active execution mode (where Flink would request resources directly from
>>> k8s, similar to how it would work on YARN).
>>>
>>>
>>> On Tue, Aug 13, 2019 at 10:59 AM Chad Dombrova <[email protected]>
>>> wrote:
>>>
>>>> Hi Thomas,
>>>> Nice work!  It's really clearly presented.
>>>>
>>>> What's the current favored approach for pipeline submission?
>>>>
>>>> I'm also interested to know how this plan overlaps (if at all) with the
>>>> work on Fine-Grained Resource Scheduling [1][2] that's being done for Flink
>>>> 1.9+, which has implications for creation of task managers in kubernetes.
>>>>
>>>> [1]
>>>> https://docs.google.com/document/d/1h68XOG-EyOFfcomd2N7usHK1X429pJSMiwZwAXCwx1k/edit#heading=h.72szmms7ufza
>>>> [2] https://issues.apache.org/jira/browse/FLINK-12761
>>>>
>>>> -chad
>>>>
>>>>
>>>> On Tue, Aug 13, 2019 at 9:14 AM Thomas Weise <[email protected]> wrote:
>>>>
>>>>> There have been a few comments in the doc and I'm going to start
>>>>> working on the worker execution part.
>>>>>
>>>>> Instead of running a Docker container for each worker, the preferred
>>>>> option is to use the worker pool:
>>>>>
>>>>>
>>>>> https://docs.google.com/document/d/1z3LNrRtr8kkiFHonZ5JJM_L4NWNBBNcqRc_yAf6G0VI/edit#heading=h.bzs6y6ms0898
>>>>>
>>>>> There are some notes at the end of the doc regarding the
>>>>> implementation. It would be great if those interested in the environments
>>>>> and Python docker container could take a look. In a nutshell, the proposal
>>>>> is to make few changes to the Python container so that it (optionally) can
>>>>> be used to run the worker pool.
>>>>>
>>>>> Thanks,
>>>>> Thomas
>>>>>
>>>>>
>>>>> On Wed, Jul 24, 2019 at 9:42 PM Pablo Estrada <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> I am very happy to see this. I'll take a look, and leave my comments.
>>>>>>
>>>>>> I think this is something we'd been needing, and it's great that you
>>>>>> guys are putting thought into it. Thanks!<3
>>>>>>
>>>>>> On Wed, Jul 24, 2019 at 9:01 PM Thomas Weise <[email protected]> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> Recently Lyft open sourced *FlinkK8sOperator,* a Kubernetes
>>>>>>> operator to manage Flink deployments on Kubernetes:
>>>>>>>
>>>>>>> https://github.com/lyft/flinkk8soperator/
>>>>>>>
>>>>>>> We are now discussing how to extend this operator to also support
>>>>>>> deployment of Python Beam pipelines with the Flink runner. I would like 
>>>>>>> to
>>>>>>> share the proposal with the Beam community to enlist feedback as well as
>>>>>>> explore opportunities for collaboration:
>>>>>>>
>>>>>>>
>>>>>>> https://docs.google.com/document/d/1z3LNrRtr8kkiFHonZ5JJM_L4NWNBBNcqRc_yAf6G0VI/
>>>>>>>
>>>>>>> Looking forward to your comments and suggestions!
>>>>>>>
>>>>>>> Thomas
>>>>>>>
>>>>>>>

Reply via email to