Hey Max,

Great stuff, thank you for sharing this. In case anyone has feedback on
the summit as a whole, please feel free to fill out the survey
<https://goo.gl/forms/Oka3kicBrFyUXEvp1> as well.

Thank you!

Best regards,
Matthias

On Tue, 9 Oct 2018 at 10:48 Maximilian Michels <m...@apache.org> wrote:

> Thanks for the pointer to the thread. I didn't know there had already
> been a discussion. It is possible to look at Kubernetes support solely
> from a Runner perspective; still, we have to provide the basic knobs in
> Beam to make deployment easy.
>
> The approach Henning described here and in the thread (Approach 2:
> https://lists.apache.org/thread.html/209ddf4d701c8c915e3b411e99773f491a6cd830807d636b470000e8@%3Cdev.beam.apache.org%3E)
> where the backend and the SDK harness are started concurrently with
> fixed endpoints would be the way to go. In the Proto we already have the
> "EXTERNAL" environment for that.
>
> On 08.10.18 20:18, Thomas Weise wrote:
> > Related thread:
> >
> > https://lists.apache.org/thread.html/d6b6fde764796de31996db9bb5f9de3e7aaf0ab29b99d0adb52ac508@%3Cdev.beam.apache.org%3E
> >
> > Kubernetes is otherwise more of a runner deployment concern. There are
> > efforts in the Flink community underway to make deployment on Kubernetes
> > easier.
> >
> > Max: thanks for taking notes!
> >
> >
> > On Mon, Oct 8, 2018 at 10:43 AM Henning Rohde <hero...@google.com> wrote:
> >
> >     Regarding the Kubernetes/Docker story: the current idea for that
> >     setup is to use a per-job pod for the user/sdk containers + runner
> >     container, so that running (and scaling) a job will go with the
> >     grain of that ecosystem. The Beam code on each worker thus wouldn't
> >     do any container management. This is also how Dataflow essentially
> >     works. The process-based option assumes that the runner environment
> >     is what the SDK needs, which is generally not the case.
> >
> >     Henning
> >
> >     On Sun, Oct 7, 2018 at 1:40 PM Alex Van Boxel <a...@vanboxel.be> wrote:
> >
> >         Hey Max, I've built quite some experience with *Kubernetes* over
> >         the years.
> >         The problem you describe seems like a custom operator
> >         story. The thing is I don't know enough of the runner and
> >         bootstrapping story. After the summit I'm quite eager to dive
> >         into a Beam problem, so if you'd like to collaborate on that
> >         topic let me know.
> >
> >          _/
> >         _/ Alex Van Boxel
> >
> >
> >         On Fri, Oct 5, 2018 at 4:05 PM Maximilian Michels
> >         <m...@apache.org> wrote:
> >
> >             Hi,
> >
> >             What do you think about collecting some of the feedback
> >             from the community at Beam Summit last week? Here's what
> >             I've come across:
> >
> >
> >             * The Kubernetes / Docker Story
> >
> >             Multiple users reported that they would like a
> >             Beam-Kubernetes story. What is the best way to deploy Beam
> >             with Kubernetes? Will there be built-in support?
> >
> >             Especially with regard to portability, there are some
> >             unsolved problems, e.g. how to start Beam containerized and
> >             bootstrap the SDK Harness container from within a
> >             container? For local testing with the JobServer we support
> >             that via mounting the Docker socket, but this will be too
> >             fragile in production scenarios. Now that we have
> >             process-based execution, we could just use that inside the
> >             main container.
> >
> >             Deployment is a very important topic for users and we
> >             should try to reduce complexity as much as possible.
> >
> >             * External SDKs / Scio
> >
> >             Users have asked why Scio is not part of the main
> >             repository. Generally, I don't think that has to be the
> >             case, same for the Runners which are not part of the main
> >             repo. However, it does raise the question, what will be the
> >             future model for maintaining SDKs/IOs/Runners? How do we
> >             ensure easy development and a consistent quality of
> >             internal/external components?
> >
> >             * Documenting Timers & State
> >
> >             These two have excellent blog posts but are not part of the
> >             official documentation.
> >             Since they are part of the model, it would be good to
> >             eventually update the docs.
> >
> >             * Better Debuggability of pipelines
> >
> >             Even a simple WordCount in Beam leads to a quite complex
> >             Flink execution graph (due to the involved I/O logic). How
> >             can we make pipelines easier to understand? Will we provide
> >             a way to visualize the architecture of high-level Beam
> >             pipelines? If so, do we provide a way to gain insight into
> >             how it is mapped to the Runner execution model? Users would
> >             like to have more insight.
> >
> >             * Current Roadmap
> >
> >             This was asked in the context of portability. By the end of
> >             the year we should have at least the FlinkRunner in a ready
> >             state, with the rest following up. There are a lot of other
> >             threads in Beam. The newsletter is a great way to keep up
> >             with the project development.
> >
> >
> >             Looking forward to any other points you might have.
> >
> >             Best,
> >             Max