I agree with flink-kubernetes-operator as the repo name :)
Don't have any better idea

Gyula

On Sat, Feb 5, 2022 at 2:41 AM Thomas Weise <t...@apache.org> wrote:

> Hi,
>
> Thanks for the continued feedback and discussion. Looks like we are
> ready to start a VOTE, I will initiate it shortly.
>
> In parallel it would be good to find the repository name.
>
> My suggestion would be: flink-kubernetes-operator
>
> I thought "flink-operator" could be a bit misleading since the term
> operator already has a meaning in Flink.
>
> I also considered "flink-k8s-operator" but that would be almost
> identical to existing operator implementations and could lead to
> confusion in the future.
>
> Thoughts?
>
> Thanks,
> Thomas
>
>
>
> On Fri, Feb 4, 2022 at 5:15 AM Gyula Fóra <gyula.f...@gmail.com> wrote:
> >
> > Hi Danny,
> >
> > So far we have been focusing our dev efforts on the initial native
> > implementation with the team.
> > If the discussion and vote go well for this FLIP we are looking forward
> > to contributing the initial version sometime next week (fingers crossed).
> >
> > At that point I think we can already start the dev work to support the
> > standalone mode as well, especially if you can dedicate some effort to
> > pushing that side.
> > Working together on this sounds like a great idea and we should start as
> > soon as possible! :)
> >
> > Cheers,
> > Gyula
> >
> > On Fri, Feb 4, 2022 at 2:07 PM Danny Cranmer <dannycran...@apache.org>
> > wrote:
> >
> > > I have been discussing this one with my team. We are interested in the
> > > Standalone mode, and are willing to contribute towards the
> implementation.
> > > Potentially we can work together to support both modes in parallel?
> > >
> > > Thanks,
> > >
> > > On Wed, Feb 2, 2022 at 4:02 PM Gyula Fóra <gyula.f...@gmail.com>
> wrote:
> > >
> > > > Hi Danny!
> > > >
> > > > Thanks for the feedback :)
> > > >
> > > > Versioning:
> > > > Versioning will be independent from Flink and the operator will
> depend
> > > on a
> > > > fixed flink version (in every given operator version).
> > > > This should be the exact same setup as with Stateful Functions (
> > > > https://github.com/apache/flink-statefun). So independent release
> cycle
> > > > but
> > > > still within the Flink umbrella.
> > > >
> > > > Deployment error handling:
> > > > I think that's a very good point, as general exception handling for
> the
> > > > different failure scenarios is a tricky problem. I think the
> exception
> > > > classifiers and retry strategies could avoid a lot of manual
> intervention
> > > > from the user. We will definitely need to add something like this.
> Once
> > > we
> > > > have the repo created with the initial operator code we should open
> some
> > > > tickets for this and put it on the short term roadmap!
> > > >
> > > > Cheers,
> > > > Gyula
> > > >
> > > > On Wed, Feb 2, 2022 at 4:50 PM Danny Cranmer <
> dannycran...@apache.org>
> > > > wrote:
> > > >
> > > > > Hey team,
> > > > >
> > > > > Great work on the FLIP, I am looking forward to this one. I agree
> that
> > > we
> > > > > can move forward to the voting stage.
> > > > >
> > > > > I have general feedback around how we will handle job submission
> > > failure
> > > > > and retry. As discussed in the Rejected Alternatives section, we
> can
> > > use
> > > > > Java to handle job submission failures from the Flink client. It
> would
> > > be
> > > > > useful to have the ability to configure exception classifiers and
> retry
> > > > > strategy as part of operator configuration.
> > > > >
> > > > > > Given this will be in a separate GitHub repository I am curious how
> > > the
> > > > > versioning strategy will work in relation to the Flink version? Do
> we
> > > > have
> > > > > any other components with a similar setup I can look at? Will the
> > > > operator
> > > > > version track Flink or will it use its own versioning strategy
> with a
> > > > Flink
> > > > > version support matrix, or similar?
> > > > >
> > > > > Thanks,
> > > > >
> > > > >
> > > > >
> > > > > On Tue, Feb 1, 2022 at 2:33 PM Márton Balassi <
> > > balassi.mar...@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > Hi team,
> > > > > >
> > > > > > Thank you for the great feedback, Thomas has updated the FLIP
> page
> > > > > > accordingly. If you are comfortable with the currently existing
> > > design
> > > > > and
> > > > > > depth in the FLIP [1] I suggest moving forward to the voting
> stage -
> > > > once
> > > > > > that reaches a positive conclusion it lets us create the separate
> > > code
> > > > > > repository under the flink project for the operator.
> > > > > >
> > > > > > I encourage everyone to keep improving the details in the
> meantime,
> > > > > however
> > > > > > I believe given the existing design and the general sentiment on
> this
> > > > > > thread that the most efficient path from here is starting the
> > > > > > implementation so that we can collectively iterate over it.
> > > > > >
> > > > > > [1]
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-212%3A+Introduce+Flink+Kubernetes+Operator
> > > > > >
> > > > > > On Mon, Jan 31, 2022 at 10:15 PM Thomas Weise <t...@apache.org>
> > > wrote:
> > > > > >
> > > > > > > > Hi Xintong,
> > > > > > >
> > > > > > > Thanks for the feedback and please see responses below -->
> > > > > > >
> > > > > > > On Fri, Jan 28, 2022 at 12:21 AM Xintong Song <
> > > tonysong...@gmail.com
> > > > >
> > > > > > > wrote:
> > > > > > >
> > > > > > > > Thanks Thomas for drafting this FLIP, and everyone for the
> > > > > discussion.
> > > > > > > >
> > > > > > > > I also have a few questions and comments.
> > > > > > > >
> > > > > > > > ## Job Submission
> > > > > > > > Deploying a Flink session cluster via kubectl & CR and then
> > > > > submitting
> > > > > > > jobs
> > > > > > > > to the cluster via Flink cli / REST is probably the approach
> that
> > > > > > > requires
> > > > > > > > the least effort. However, I'd like to point out 2
> weaknesses.
> > > > > > > > 1. A lot of users use Flink in perjob/application modes. For
> > > these
> > > > > > users,
> > > > > > > > having to run the job in two steps (deploy the cluster, and
> > > submit
> > > > > the
> > > > > > > job)
> > > > > > > > is not that convenient.
> > > > > > > > 2. One of our motivations is being able to manage Flink
> > > > applications'
> > > > > > > > lifecycles with kubectl. Submitting jobs from cli sounds not
> > > > aligned
> > > > > > with
> > > > > > > > this motivation.
> > > > > > > > I think it's probably worth it to support submitting jobs via
> > > > > kubectl &
> > > > > > > CR
> > > > > > > > in the first version, both together with deploying the
> cluster
> > > like
> > > > > in
> > > > > > > > perjob/application mode and after deploying the cluster like
> in
> > > > > session
> > > > > > > > mode.
> > > > > > > >
> > > > > > >
> > > > > > > The intention is to support application management through
> operator
> > > > and
> > > > > > CR,
> > > > > > > which means there won't be any 2 step submission process,
> which as
> > > > you
> > > > > > > allude to would defeat the purpose of this project. The CR
> example
> > > > > shows
> > > > > > > the application part. Please note that the bare cluster
> support is
> > > an
> > > > > > > *additional* feature for scenarios that require external job
> > > > > management.
> > > > > > Is
> > > > > > > there anything on the FLIP page that creates a different
> > > impression?
> > > > > > >
> > > > > > >
> > > > > > > >
> > > > > > > > ## Versioning
> > > > > > > > Which Flink versions does the operator plan to support?
> > > > > > > > > 1. Native K8s deployment was first introduced in Flink 1.10
> > > > > > > > 2. Native K8s HA was introduced in Flink 1.12
> > > > > > > > 3. The Pod template support was introduced in Flink 1.13
> > > > > > > > > 4. There were some changes to the Flink docker image
> entrypoint
> > > > script
> > > > > > in,
> > > > > > > > IIRC, Flink 1.13
> > > > > > > >
> > > > > > >
> > > > > > > Great, thanks for providing this. It is important for the
> > > > compatibility
> > > > > > > going forward also. We are targeting Flink 1.14.x upwards.
> Before
> > > the
> > > > > > > operator is ready there will be another Flink release. Let's
> see if
> > > > > > anyone
> > > > > > > is interested in earlier versions?
> > > > > > >
> > > > > > >
> > > > > > > >
> > > > > > > > ## Compatibility
> > > > > > > > > What kind of API compatibility can we commit to? It's
> probably
> > > fine
> > > > > to
> > > > > > > have
> > > > > > > > alpha / beta version APIs that allow incompatible future
> changes
> > > > for
> > > > > > the
> > > > > > > > first version. But eventually we would need to guarantee
> > > backwards
> > > > > > > > compatibility, so that an early version CR can work with a
> new
> > > > > version
> > > > > > > > operator.
> > > > > > > >
> > > > > > >
> > > > > > > Another great point and please let me include that on the FLIP
> > > page.
> > > > > ;-)
> > > > > > >
> > > > > > > I think we should allow incompatible changes for the first one
> or
> > > two
> > > > > > > versions, similar to how other major features have evolved
> > > recently,
> > > > > such
> > > > > > > as FLIP-27.
> > > > > > >
> > > > > > > Would be great to get broader feedback on this one.
> > > > > > >
> > > > > > > Cheers,
> > > > > > > Thomas
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > >
> > > > > > > > Thank you~
> > > > > > > >
> > > > > > > > Xintong Song
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > On Fri, Jan 28, 2022 at 1:18 PM Thomas Weise <t...@apache.org
> >
> > > > wrote:
> > > > > > > >
> > > > > > > > > Thanks for the feedback!
> > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > # 1 Flink Native vs Standalone integration
> > > > > > > > > > Maybe we should make this more clear in the FLIP but we
> > > agreed
> > > > to
> > > > > > do
> > > > > > > > the
> > > > > > > > > > first version of the operator based on the native
> > > integration.
> > > > > > > > > > While this clearly does not cover all use-cases and
> > > > requirements,
> > > > > > it
> > > > > > > > > seems
> > > > > > > > > > this would lead to a much smaller initial effort and a
> nicer
> > > > > first
> > > > > > > > > version.
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > I'm also leaning towards the native integration, as long
> as it
> > > > > > reduces
> > > > > > > > the
> > > > > > > > > MVP effort. Ultimately the operator will need to also
> support
> > > the
> > > > > > > > > standalone mode. I would like to gain more confidence that
> > > native
> > > > > > > > > integration reduces the effort. While it cuts the effort to
> > > > handle
> > > > > > the
> > > > > > > TM
> > > > > > > > > pod creation, some mapping code from the CR to the native
> > > > > integration
> > > > > > > > > client and config needs to be created. As mentioned in the
> > > FLIP,
> > > > > > native
> > > > > > > > > integration requires the Flink job manager to have access
> to
> > > the
> > > > > k8s
> > > > > > > API
> > > > > > > > to
> > > > > > > > > create pods, which in some scenarios may be seen as
> > > unfavorable.
> > > > > > > > >
> > > > > > > > >  > > > # Pod Template
> > > > > > > > > > > > > Is the pod template in the CR the same as what Flink
> > > > > > > > > > > > > already supports[4]? If so, I am afraid that not every
> > > > > > > > > > > > > arbitrary field (e.g. cpu/memory resources) could
> > > > > > > > > > > > > take effect.
> > > > > > > > >
> > > > > > > > > Yes, pod template would look almost identical. There are a
> few
> > > > > > settings
> > > > > > > > > that the operator will control (and that may need to be
> > > > > blacklisted),
> > > > > > > but
> > > > > > > > > in general we would not want to place restrictions. I
> think a
> > > > > > mechanism
> > > > > > > > > where a pod template is merged from multiple layers would
> also
> > > be
> > > > > > > > > interesting to make this more flexible.
> > > > > > > > >
> > > > > > > > > Cheers,
> > > > > > > > > Thomas
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
>
