Hi Thomas,
Do you have an example Dockerfile demonstrating best practices for building
an image that contains both Flink and Beam SDK dependencies?  That would be
useful.

-chad


On Fri, Nov 1, 2019 at 10:18 AM Thomas Weise <[email protected]> wrote:

> For folks looking to run Beam on Flink on k8s, see update in [1]
>
> I also updated [2]
>
> TLDR: at this time the best option to run portable pipelines on k8s is to
> create container images that have both Flink and the SDK dependencies.
>
> I'm curious how much interest there is in using the official SDK container
> images and keeping Flink and the portable pipeline separate as far as the
> image build goes. Deployment can be achieved with the sidecar container
> approach. Most of the mechanics are in place already; one addition would be
> an abstraction for where the SDK entry point executes (process, external),
> similar to how we have it for the workers.
>
> Thanks,
> Thomas
>
> [1]
> https://lists.apache.org/thread.html/10aada9c200d4300059d02e4baa47dda0cc2c6fc9432f950194a7f5e@%3Cdev.beam.apache.org%3E
>
> [2]
> https://docs.google.com/document/d/1z3LNrRtr8kkiFHonZ5JJM_L4NWNBBNcqRc_yAf6G0VI
>
> On Wed, Aug 21, 2019 at 9:44 PM Thomas Weise <[email protected]> wrote:
>
>> The changes to containerize the Python SDK worker pool are nearly
>> complete. I also updated the document with the next implementation steps.
>>
>> The favored approach (initially targeted option) for pipeline submission
>> is support for the (externally created) fat jar. It will keep changes to
>> the operator to a minimum and is applicable to any other Flink job as well.
>>
>> Regarding fine-grained resource scheduling, that would happen within the
>> pods scheduled by the k8s operator (or other external orchestration tool)
>> or, further down the road, in a completely elastic/dynamic fashion with
>> active execution mode (where Flink would request resources directly from
>> k8s, similar to how it would work on YARN).
>>
>>
>> On Tue, Aug 13, 2019 at 10:59 AM Chad Dombrova <[email protected]> wrote:
>>
>>> Hi Thomas,
>>> Nice work!  It's really clearly presented.
>>>
>>> What's the current favored approach for pipeline submission?
>>>
>>> I'm also interested to know how this plan overlaps (if at all) with the
>>> work on Fine-Grained Resource Scheduling [1][2] that's being done for Flink
>>> 1.9+, which has implications for creation of task managers in kubernetes.
>>>
>>> [1]
>>> https://docs.google.com/document/d/1h68XOG-EyOFfcomd2N7usHK1X429pJSMiwZwAXCwx1k/edit#heading=h.72szmms7ufza
>>> [2] https://issues.apache.org/jira/browse/FLINK-12761
>>>
>>> -chad
>>>
>>>
>>> On Tue, Aug 13, 2019 at 9:14 AM Thomas Weise <[email protected]> wrote:
>>>
>>>> There have been a few comments in the doc and I'm going to start
>>>> working on the worker execution part.
>>>>
>>>> Instead of running a Docker container for each worker, the preferred
>>>> option is to use the worker pool:
>>>>
>>>>
>>>> https://docs.google.com/document/d/1z3LNrRtr8kkiFHonZ5JJM_L4NWNBBNcqRc_yAf6G0VI/edit#heading=h.bzs6y6ms0898
>>>>
>>>> There are some notes at the end of the doc regarding the
>>>> implementation. It would be great if those interested in the environments
>>>> and Python Docker container could take a look. In a nutshell, the proposal
>>>> is to make a few changes to the Python container so that it (optionally) can
>>>> be used to run the worker pool.
>>>>
>>>> Thanks,
>>>> Thomas
>>>>
>>>>
>>>> On Wed, Jul 24, 2019 at 9:42 PM Pablo Estrada <[email protected]>
>>>> wrote:
>>>>
>>>>> I am very happy to see this. I'll take a look, and leave my comments.
>>>>>
>>>>> I think this is something we'd been needing, and it's great that you
>>>>> guys are putting thought into it. Thanks!<3
>>>>>
>>>>> On Wed, Jul 24, 2019 at 9:01 PM Thomas Weise <[email protected]> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> Recently Lyft open sourced *FlinkK8sOperator,* a Kubernetes operator
>>>>>> to manage Flink deployments on Kubernetes:
>>>>>>
>>>>>> https://github.com/lyft/flinkk8soperator/
>>>>>>
>>>>>> We are now discussing how to extend this operator to also support
>>>>>> deployment of Python Beam pipelines with the Flink runner. I would like
>>>>>> to share the proposal with the Beam community to solicit feedback as
>>>>>> well as explore opportunities for collaboration:
>>>>>>
>>>>>>
>>>>>> https://docs.google.com/document/d/1z3LNrRtr8kkiFHonZ5JJM_L4NWNBBNcqRc_yAf6G0VI/
>>>>>>
>>>>>> Looking forward to your comments and suggestions!
>>>>>>
>>>>>> Thomas
>>>>>>
>>>>>>
