Hi Gyula!

Alright! To summarize, the savepoint path and other savepoint-related
configuration can go into flinkConfiguration.
Separately, I think options such as automatic savepoint generation should
go into JobSpec along with the other job options, so that flinkConfiguration
only configures what is contained in the flink-conf.yaml file.
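
As a rough sketch of the split I have in mind (assuming the CR keeps a
`flinkConfiguration` map next to a `job` section; `autoSavepointSecond` is
only the field proposed earlier in this thread, not an existing option, and
the paths and values are placeholders):

```yaml
spec:
  flinkConfiguration:
    # Standard Flink options, i.e. what would otherwise go into flink-conf.yaml
    state.savepoints.dir: s3://my-bucket/savepoints      # placeholder path
    state.checkpoints.dir: s3://my-bucket/checkpoints    # placeholder path
    execution.checkpointing.interval: "60s"
  job:
    # Job-level behavior handled by the operator (hypothetical field)
    autoSavepointSecond: 300
```

The idea is that anything with an existing Flink config key stays in
flinkConfiguration, and only operator-driven behavior gets its own JobSpec
field.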

Best Wishes,
Peng Yuan

On Tue, Feb 15, 2022 at 7:02 PM Gyula Fóra <gyula.f...@gmail.com> wrote:

> Hi Peng Yuan!
>
> While I do agree that savepoint path is a very important production
> configuration there are a lot of other things that come to my mind:
>  - savepoint dir
>  - checkpoint dir
>  - checkpoint interval/timeout
>  - high availability settings (provider/storagedir etc)
>
> just to name a few...
>
> While these are all production critical, they have nice clean Flink config
> settings to go with them. If we start introducing these to jobspec we only
> get confusion about priority order etc and it is going to be hard to change
> or remove them in the future. In any case we should validate that these
> configs exist in cases where users use a stateful upgrade mode for example.
> This is something we need to add for sure.
>
> As for the other options you mentioned like automatic savepoint generation
> for instance, those deserve an independent discussion of their own I
> believe :)
>
> Cheers,
> Gyula
>
> On Tue, Feb 15, 2022 at 11:23 AM K Fred <yuanpengf...@gmail.com> wrote:
>
> > Hi Matyas!
> >
> > Thanks for your reply!
> > For scenarios 1 and 3, I couldn't agree more with the podTemplate
> > solution; I missed this part.
> > For savepoint-related configuration, I think it is very important that it
> > can be specified in JobSpec, because the savepoint is a very common setting
> > when upgrading a job; placing it in JobSpec makes it obvious to the user
> > how to configure it. In addition, other advanced properties can be put
> > into flinkConfiguration and customized by expert users.
> > A set of savepoint configuration options could look as follows:
> >
> >   fromSavepoint - Job restart from
> >   autoSavepointSecond - Automatically take a savepoint to the
> >     `savepointsDir` every n seconds.
> >   savepointsDir - Savepoints dir where to store automatically taken
> >     savepoints
> >   savepointGeneration - Update savepoint generation of job status for a
> >     running job (should be defined in JobStatus)
> >
> >
> > Best wishes,
> > Peng Yuan.
> >
> > On Tue, Feb 15, 2022 at 4:41 PM Őrhidi Mátyás <matyas.orh...@gmail.com>
> > wrote:
> >
> > > Hi Peng,
> > >
> > > Thanks for your feedback. Regarding scenarios 1 and 3, the podTemplate
> > > functionality in the operator could cover both. We also need to be
> > > careful about introducing proxy parameters in the CRD spec. The savepoint
> > > path is usually accompanied by a bunch of other configurations, for
> > > example, so users need to use configuration params anyway. What do you
> > > think?
> > >
> > > Best,
> > > Matyas
> > >
> > > On Tue, Feb 15, 2022 at 8:58 AM K Fred <yuanpengf...@gmail.com> wrote:
> > >
> > > > Hi Gyula!
> > > >
> > > > I have reviewed the prototype design of flink-kubernetes-operator you
> > > > submitted, and I have the following questions:
> > > >
> > > > 1. Can the JobSpec support pulling the Flink jar package via a sidecar
> > > > container? Something like this:
> > > >
> > > > > initContainers:
> > > > >       - name: downloader
> > > > >         image: curlimages/curl
> > > > >         env:
> > > > >           - name: JAR_URL
> > > > >             value: https://repo1.maven.org/maven2/org/apache/flink/flink-examples-streaming_2.12/1.14.3/flink-examples-streaming_2.12-1.14.3-WordCount.jar
> > > > >           - name: DEST_PATH
> > > > >             value: /cache/flink-app.jar
> > > > >         command: ['sh', '-c', 'curl -o ${DEST_PATH} ${JAR_URL}']
> > > >
> > > > 2. Can we add a savepoint path property to the job specification?
> > > > 3. Can we add an extra port to the JobManagerSpec and TaskManagerSpec to
> > > > expose services such as Prometheus? The property could look like this:
> > > >
> > > > > extraPorts:
> > > > >       - name: prom
> > > > >         containerPort: 9249
> > > >
> > > >
> > > >
> > > > Best wishes,
> > > > Peng Yuan
> > > >
> > > > On Tue, Feb 15, 2022 at 12:23 AM Gyula Fóra <gyf...@apache.org>
> wrote:
> > > >
> > > > > Hi Flink Devs!
> > > > >
> > > > > We would like to present to you the first prototype of the
> > > > > flink-kubernetes-operator that was built based on the FLIP and the
> > > > > discussion on this mail thread. We would also like to call out some
> > > > > design decisions that we have made regarding architecture components
> > > > > that were not explicitly mentioned in the FLIP document/thread and give
> > > > > you the opportunity to raise any concerns here.
> > > > >
> > > > > You can find the initial prototype here:
> > > > > https://github.com/apache/flink-kubernetes-operator/pull/1
> > > > >
> > > > > We will leave the PR open for 1-2 days before merging to let people
> > > > > comment on it, but please be mindful that this is an initial prototype
> > > > > with many rough edges. It is not intended to be a complete
> > > > > implementation of the FLIP specs as that will take some more work from
> > > > > all of us :)
> > > > >
> > > > >
> > > > > *Prototype feature set:*
> > > > > The prototype contains a basic working version of the
> > > > > flink-kubernetes-operator that supports deployment and lifecycle
> > > > > management of a stateful native Flink application. We have basic
> > > > > support for stateful and stateless upgrades, UI ingress, pod templates,
> > > > > etc. Error handling at this point is largely missing.
> > > > >
> > > > >
> > > > > *Features / design decisions that were not explicitly discussed in
> > this
> > > > > thread*
> > > > >
> > > > > *Basic Admission control using a Webhook*
> > > > > Standard resource admission control in Kubernetes to validate and
> > > > > potentially reject resources is done through Webhooks.
> > > > > https://kubernetes.io/docs/reference/access-authn-authz/extensible-admission-controllers/
> > > > > This is a necessary mechanism to give the user an upfront error when an
> > > > > incorrect resource is submitted. In the Flink operator's case we need
> > > > > to validate that the FlinkDeployment yaml actually makes sense and does
> > > > > not contain erroneous config options that would inevitably lead to
> > > > > deployment/job failures.
> > > > >
> > > > > We have implemented a simple webhook that we can use for this type of
> > > > > validation, as a separate Maven module (flink-kubernetes-webhook). The
> > > > > webhook is an optional component and can be enabled or disabled during
> > > > > deployment. To avoid pulling in new external dependencies we have used
> > > > > the Flink Shaded Netty module to build the simple REST endpoint
> > > > > required. If the community feels that Netty adds unnecessary complexity
> > > > > to the webhook implementation, we are open to alternative backends such
> > > > > as Spring Boot, for instance, which would practically eliminate all the
> > > > > boilerplate.
> > > > >
> > > > >
> > > > > *Helm Chart for deployment*
> > > > > Helm charts provide an industry standard way of managing Kubernetes
> > > > > deployments. We have created a Helm chart prototype that can be used to
> > > > > deploy the operator together with all required resources. The Helm
> > > > > chart allows easy configuration for things like images, namespaces,
> > > > > etc. and flags to control specific parts of the deployment such as RBAC
> > > > > or the webhook.
> > > > >
> > > > > The Helm chart provided is intended to be a first version that worked
> > > > > for us during development but we expect to have a lot of iterations on
> > > > > it based on the feedback from the community.
> > > > >
> > > > > *Acknowledgment*
> > > > > We would like to thank everyone who has provided support and valuable
> > > > > feedback on this FLIP.
> > > > > We would also like to thank Yang Wang & Alexis Sarda-Espinosa
> > > > > specifically for making their operators open source and available to us
> > > > > which had a big impact on the FLIP and the prototype.
> > > > >
> > > > > We are looking forward to continuing development on the operator
> > > > > together with the broader community.
> > > > > All work will be tracked using the ASF Jira from now on.
> > > > >
> > > > > Cheers,
> > > > > Gyula
> > > > >
> > > > > On Mon, Feb 14, 2022 at 9:21 AM K Fred <yuanpengf...@gmail.com>
> > wrote:
> > > > >
> > > > > > Hi Gyula,
> > > > > >
> > > > > > Thanks!
> > > > > > It's great to see the project getting started and I can't wait to
> > see
> > > > the
> > > > > > PR and start contributing code.😄😄😄
> > > > > >
> > > > > > Best Wishes!
> > > > > > Peng Yuan
> > > > > >
> > > > > > On Mon, Feb 14, 2022 at 4:14 PM Gyula Fóra <gyula.f...@gmail.com
> >
> > > > wrote:
> > > > > >
> > > > > > > Hi Peng Yuan!
> > > > > > >
> > > > > > > The repo is already created:
> > > > > > > https://github.com/apache/flink-kubernetes-operator
> > > > > > >
> > > > > > > We will open the PR with the initial prototype later today,
> stay
> > > > tuned
> > > > > in
> > > > > > > this thread! :)
> > > > > > >
> > > > > > > Cheers,
> > > > > > > Gyula
> > > > > > >
> > > > > > > On Mon, Feb 14, 2022 at 9:09 AM K Fred <yuanpengf...@gmail.com
> >
> > > > wrote:
> > > > > > >
> > > > > > > > Hi All,
> > > > > > > >
> > > > > > > > Has the project of flink-kubernetes-operator been created in
> > > > github?
> > > > > > > >
> > > > > > > > Peng Yuan
> > > > > > > >
> > > > > > > > On Wed, Feb 9, 2022 at 1:23 AM Gyula Fóra <
> > gyula.f...@gmail.com>
> > > > > > wrote:
> > > > > > > >
> > > > > > > > > I agree with flink-kubernetes-operator as the repo name :)
> > > > > > > > > Don't have any better idea
> > > > > > > > >
> > > > > > > > > Gyula
> > > > > > > > >
> > > > > > > > > On Sat, Feb 5, 2022 at 2:41 AM Thomas Weise <
> t...@apache.org>
> > > > > wrote:
> > > > > > > > >
> > > > > > > > > > Hi,
> > > > > > > > > >
> > > > > > > > > > Thanks for the continued feedback and discussion. Looks
> > like
> > > we
> > > > > are
> > > > > > > > > > ready to start a VOTE, I will initiate it shortly.
> > > > > > > > > >
> > > > > > > > > > In parallel it would be good to find the repository name.
> > > > > > > > > >
> > > > > > > > > > My suggestion would be: flink-kubernetes-operator
> > > > > > > > > >
> > > > > > > > > > I thought "flink-operator" could be a bit misleading
> since
> > > the
> > > > > term
> > > > > > > > > > operator already has a meaning in Flink.
> > > > > > > > > >
> > > > > > > > > > I also considered "flink-k8s-operator" but that would be
> > > almost
> > > > > > > > > > identical to existing operator implementations and could
> > lead
> > > > to
> > > > > > > > > > confusion in the future.
> > > > > > > > > >
> > > > > > > > > > Thoughts?
> > > > > > > > > >
> > > > > > > > > > Thanks,
> > > > > > > > > > Thomas
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > On Fri, Feb 4, 2022 at 5:15 AM Gyula Fóra <
> > > > gyula.f...@gmail.com>
> > > > > > > > wrote:
> > > > > > > > > > >
> > > > > > > > > > > Hi Danny,
> > > > > > > > > > >
> > > > > > > > > > > So far we have been focusing our dev efforts on the
> > initial
> > > > > > native
> > > > > > > > > > > implementation with the team.
> > > > > > > > > > > If the discussion and vote goes well for this FLIP we
> are
> > > > > looking
> > > > > > > > > forward
> > > > > > > > > > > to contributing the initial version sometime next week
> > > > (fingers
> > > > > > > > > crossed).
> > > > > > > > > > >
> > > > > > > > > > > At that point I think we can already start the dev work
> > to
> > > > > > support
> > > > > > > > the
> > > > > > > > > > > standalone mode as well, especially if you can dedicate
> > > some
> > > > > > effort
> > > > > > > > to
> > > > > > > > > > > pushing that side.
> > > > > > > > > > > Working together on this sounds like a great idea and
> we
> > > > should
> > > > > > > start
> > > > > > > > > as
> > > > > > > > > > > soon as possible! :)
> > > > > > > > > > >
> > > > > > > > > > > Cheers,
> > > > > > > > > > > Gyula
> > > > > > > > > > >
> > > > > > > > > > > On Fri, Feb 4, 2022 at 2:07 PM Danny Cranmer <
> > > > > > > > dannycran...@apache.org>
> > > > > > > > > > > wrote:
> > > > > > > > > > >
> > > > > > > > > > > > I have been discussing this one with my team. We are
> > > > > interested
> > > > > > > in
> > > > > > > > > the
> > > > > > > > > > > > Standalone mode, and are willing to contribute
> towards
> > > the
> > > > > > > > > > implementation.
> > > > > > > > > > > > Potentially we can work together to support both
> modes
> > in
> > > > > > > parallel?
> > > > > > > > > > > >
> > > > > > > > > > > > Thanks,
> > > > > > > > > > > >
> > > > > > > > > > > > On Wed, Feb 2, 2022 at 4:02 PM Gyula Fóra <
> > > > > > gyula.f...@gmail.com>
> > > > > > > > > > wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > Hi Danny!
> > > > > > > > > > > > >
> > > > > > > > > > > > > Thanks for the feedback :)
> > > > > > > > > > > > >
> > > > > > > > > > > > > Versioning:
> > > > > > > > > > > > > Versioning will be independent from Flink and the
> > > > operator
> > > > > > will
> > > > > > > > > > depend
> > > > > > > > > > > > on a
> > > > > > > > > > > > > fixed flink version (in every given operator
> > version).
> > > > > > > > > > > > > This should be the exact same setup as with
> Stateful
> > > > > > Functions
> > > > > > > (
> > > > > > > > > > > > > https://github.com/apache/flink-statefun). So
> > > > independent
> > > > > > > > release
> > > > > > > > > > cycle
> > > > > > > > > > > > > but
> > > > > > > > > > > > > still within the Flink umbrella.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Deployment error handling:
> > > > > > > > > > > > > I think that's a very good point, as general
> > exception
> > > > > > handling
> > > > > > > > for
> > > > > > > > > > the
> > > > > > > > > > > > > different failure scenarios is a tricky problem. I
> > > think
> > > > > the
> > > > > > > > > > exception
> > > > > > > > > > > > > classifiers and retry strategies could avoid a lot
> of
> > > > > manual
> > > > > > > > > > intervention
> > > > > > > > > > > > > from the user. We will definitely need to add
> > something
> > > > > like
> > > > > > > > this.
> > > > > > > > > > Once
> > > > > > > > > > > > we
> > > > > > > > > > > > > have the repo created with the initial operator
> code
> > we
> > > > > > should
> > > > > > > > open
> > > > > > > > > > some
> > > > > > > > > > > > > tickets for this and put it on the short term
> > roadmap!
> > > > > > > > > > > > >
> > > > > > > > > > > > > Cheers,
> > > > > > > > > > > > > Gyula
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Wed, Feb 2, 2022 at 4:50 PM Danny Cranmer <
> > > > > > > > > > dannycran...@apache.org>
> > > > > > > > > > > > > wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > > Hey team,
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Great work on the FLIP, I am looking forward to this one. I agree that we
> > > > > > > > > > > > > > can move forward to the voting stage.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > I have general feedback around how we will handle job submission failure
> > > > > > > > > > > > > > and retry. As discussed in the Rejected Alternatives section, we can use
> > > > > > > > > > > > > > Java to handle job submission failures from the Flink client. It would be
> > > > > > > > > > > > > > useful to have the ability to configure exception classifiers and retry
> > > > > > > > > > > > > > strategy as part of operator configuration.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Given this will be in a separate GitHub repository I am curious how the
> > > > > > > > > > > > > > versioning strategy will work in relation to the Flink version? Do we
> > > > > > > > > > > > > > have any other components with a similar setup I can look at? Will the
> > > > > > > > > > > > > > operator version track Flink or will it use its own versioning strategy
> > > > > > > > > > > > > > with a Flink version support matrix, or similar?
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On Tue, Feb 1, 2022 at 2:33 PM Márton Balassi <
> > > > > > > > > > > > balassi.mar...@gmail.com>
> > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Hi team,
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Thank you for the great feedback, Thomas has
> > > updated
> > > > > the
> > > > > > > FLIP
> > > > > > > > > > page
> > > > > > > > > > > > > > > accordingly. If you are comfortable with the
> > > > currently
> > > > > > > > existing
> > > > > > > > > > > > design
> > > > > > > > > > > > > > and
> > > > > > > > > > > > > > > depth in the FLIP [1] I suggest moving forward
> to
> > > the
> > > > > > > voting
> > > > > > > > > > stage -
> > > > > > > > > > > > > once
> > > > > > > > > > > > > > > that reaches a positive conclusion it lets us
> > > create
> > > > > the
> > > > > > > > > separate
> > > > > > > > > > > > code
> > > > > > > > > > > > > > > repository under the flink project for the
> > > operator.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > I encourage everyone to keep improving the
> > details
> > > in
> > > > > the
> > > > > > > > > > meantime,
> > > > > > > > > > > > > > however
> > > > > > > > > > > > > > > I believe given the existing design and the
> > general
> > > > > > > sentiment
> > > > > > > > > on
> > > > > > > > > > this
> > > > > > > > > > > > > > > thread that the most efficient path from here
> is
> > > > > starting
> > > > > > > the
> > > > > > > > > > > > > > > implementation so that we can collectively
> > iterate
> > > > over
> > > > > > it.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > [1]
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-212%3A+Introduce+Flink+Kubernetes+Operator
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > On Mon, Jan 31, 2022 at 10:15 PM Thomas Weise <
> > > > > > > > t...@apache.org>
> > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Hi Xintong,
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Thanks for the feedback and please see
> > responses
> > > > > below
> > > > > > > -->
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > On Fri, Jan 28, 2022 at 12:21 AM Xintong
> Song <
> > > > > > > > > > > > tonysong...@gmail.com
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Thanks Thomas for drafting this FLIP, and
> > > > everyone
> > > > > > for
> > > > > > > > the
> > > > > > > > > > > > > > discussion.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > I also have a few questions and comments.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > ## Job Submission
> > > > > > > > > > > > > > > > > Deploying a Flink session cluster via
> > kubectl &
> > > > CR
> > > > > > and
> > > > > > > > then
> > > > > > > > > > > > > > submitting
> > > > > > > > > > > > > > > > jobs
> > > > > > > > > > > > > > > > > to the cluster via Flink cli / REST is
> > probably
> > > > the
> > > > > > > > > approach
> > > > > > > > > > that
> > > > > > > > > > > > > > > > requires
> > > > > > > > > > > > > > > > > the least effort. However, I'd like to
> point
> > > out
> > > > 2
> > > > > > > > > > weaknesses.
> > > > > > > > > > > > > > > > > 1. A lot of users use Flink in
> > > perjob/application
> > > > > > > modes.
> > > > > > > > > For
> > > > > > > > > > > > these
> > > > > > > > > > > > > > > users,
> > > > > > > > > > > > > > > > > having to run the job in two steps (deploy
> > the
> > > > > > cluster,
> > > > > > > > and
> > > > > > > > > > > > submit
> > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > job)
> > > > > > > > > > > > > > > > > is not that convenient.
> > > > > > > > > > > > > > > > > 2. One of our motivations is being able to
> > > manage
> > > > > > Flink
> > > > > > > > > > > > > applications'
> > > > > > > > > > > > > > > > > lifecycles with kubectl. Submitting jobs
> from
> > > cli
> > > > > > > sounds
> > > > > > > > > not
> > > > > > > > > > > > > aligned
> > > > > > > > > > > > > > > with
> > > > > > > > > > > > > > > > > this motivation.
> > > > > > > > > > > > > > > > > I think it's probably worth it to support
> > > > > submitting
> > > > > > > jobs
> > > > > > > > > via
> > > > > > > > > > > > > > kubectl &
> > > > > > > > > > > > > > > > CR
> > > > > > > > > > > > > > > > > in the first version, both together with
> > > > deploying
> > > > > > the
> > > > > > > > > > cluster
> > > > > > > > > > > > like
> > > > > > > > > > > > > > in
> > > > > > > > > > > > > > > > > perjob/application mode and after deploying
> > the
> > > > > > cluster
> > > > > > > > > like
> > > > > > > > > > in
> > > > > > > > > > > > > > session
> > > > > > > > > > > > > > > > > mode.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > The intention is to support application
> > > management
> > > > > > > through
> > > > > > > > > > operator
> > > > > > > > > > > > > and
> > > > > > > > > > > > > > > CR,
> > > > > > > > > > > > > > > > which means there won't be any 2 step
> > submission
> > > > > > process,
> > > > > > > > > > which as
> > > > > > > > > > > > > you
> > > > > > > > > > > > > > > > allude to would defeat the purpose of this
> > > project.
> > > > > The
> > > > > > > CR
> > > > > > > > > > example
> > > > > > > > > > > > > > shows
> > > > > > > > > > > > > > > > the application part. Please note that the
> bare
> > > > > cluster
> > > > > > > > > > support is
> > > > > > > > > > > > an
> > > > > > > > > > > > > > > > *additional* feature for scenarios that
> require
> > > > > > external
> > > > > > > > job
> > > > > > > > > > > > > > management.
> > > > > > > > > > > > > > > Is
> > > > > > > > > > > > > > > > there anything on the FLIP page that creates
> a
> > > > > > different
> > > > > > > > > > > > impression?
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > ## Versioning
> > > > > > > > > > > > > > > > > Which Flink versions does the operator plan to support?
> > > > > > > > > > > > > > > > > 1. Native K8s deployment was first introduced in Flink 1.10
> > > > > > > > > > > > > > > > > 2. Native K8s HA was introduced in Flink 1.12
> > > > > > > > > > > > > > > > > 3. The Pod template support was introduced in Flink 1.13
> > > > > > > > > > > > > > > > > 4. There were some changes to the Flink docker image entrypoint script
> > > > > > > > > > > > > > > > > in, IIRC, Flink 1.13
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Great, thanks for providing this. It is
> > important
> > > > for
> > > > > > the
> > > > > > > > > > > > > compatibility
> > > > > > > > > > > > > > > > going forward also. We are targeting Flink
> > 1.14.x
> > > > > > > upwards.
> > > > > > > > > > Before
> > > > > > > > > > > > the
> > > > > > > > > > > > > > > > operator is ready there will be another Flink
> > > > > release.
> > > > > > > > Let's
> > > > > > > > > > see if
> > > > > > > > > > > > > > > anyone
> > > > > > > > > > > > > > > > is interested in earlier versions?
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > ## Compatibility
> > > > > > > > > > > > > > > > > What kind of API compatibility we can
> commit
> > > to?
> > > > > It's
> > > > > > > > > > probably
> > > > > > > > > > > > fine
> > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > have
> > > > > > > > > > > > > > > > > alpha / beta version APIs that allow
> > > incompatible
> > > > > > > future
> > > > > > > > > > changes
> > > > > > > > > > > > > for
> > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > first version. But eventually we would need
> > to
> > > > > > > guarantee
> > > > > > > > > > > > backwards
> > > > > > > > > > > > > > > > > compatibility, so that an early version CR
> > can
> > > > work
> > > > > > > with
> > > > > > > > a
> > > > > > > > > > new
> > > > > > > > > > > > > > version
> > > > > > > > > > > > > > > > > operator.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Another great point and please let me include
> > > that
> > > > on
> > > > > > the
> > > > > > > > > FLIP
> > > > > > > > > > > > page.
> > > > > > > > > > > > > > ;-)
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > I think we should allow incompatible changes
> > for
> > > > the
> > > > > > > first
> > > > > > > > > one
> > > > > > > > > > or
> > > > > > > > > > > > two
> > > > > > > > > > > > > > > > versions, similar to how other major features
> > > have
> > > > > > > evolved
> > > > > > > > > > > > recently,
> > > > > > > > > > > > > > such
> > > > > > > > > > > > > > > > as FLIP-27.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Would be great to get broader feedback on
> this
> > > one.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Cheers,
> > > > > > > > > > > > > > > > Thomas
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Thank you~
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Xintong Song
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > On Fri, Jan 28, 2022 at 1:18 PM Thomas
> Weise
> > <
> > > > > > > > > t...@apache.org
> > > > > > > > > > >
> > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Thanks for the feedback!
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > # 1 Flink Native vs Standalone
> > integration
> > > > > > > > > > > > > > > > > > > Maybe we should make this more clear in
> > the
> > > > > FLIP
> > > > > > > but
> > > > > > > > we
> > > > > > > > > > > > agreed
> > > > > > > > > > > > > to
> > > > > > > > > > > > > > > do
> > > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > first version of the operator based on
> > the
> > > > > native
> > > > > > > > > > > > integration.
> > > > > > > > > > > > > > > > > > > While this clearly does not cover all
> > > > use-cases
> > > > > > and
> > > > > > > > > > > > > requirements,
> > > > > > > > > > > > > > > it
> > > > > > > > > > > > > > > > > > seems
> > > > > > > > > > > > > > > > > > > this would lead to a much smaller
> initial
> > > > > effort
> > > > > > > and
> > > > > > > > a
> > > > > > > > > > nicer
> > > > > > > > > > > > > > first
> > > > > > > > > > > > > > > > > > version.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > I'm also leaning towards the native
> > > > integration,
> > > > > as
> > > > > > > > long
> > > > > > > > > > as it
> > > > > > > > > > > > > > > reduces
> > > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > MVP effort. Ultimately the operator will
> > need
> > > > to
> > > > > > also
> > > > > > > > > > support
> > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > standalone mode. I would like to gain
> more
> > > > > > confidence
> > > > > > > > > that
> > > > > > > > > > > > native
> > > > > > > > > > > > > > > > > > integration reduces the effort. While it
> > cuts
> > > > the
> > > > > > > > effort
> > > > > > > > > to
> > > > > > > > > > > > > handle
> > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > TM
> > > > > > > > > > > > > > > > > > pod creation, some mapping code from the
> CR
> > > to
> > > > > the
> > > > > > > > native
> > > > > > > > > > > > > > integration
> > > > > > > > > > > > > > > > > > client and config needs to be created. As
> > > > > mentioned
> > > > > > > in
> > > > > > > > > the
> > > > > > > > > > > > FLIP,
> > > > > > > > > > > > > > > native
> > > > > > > > > > > > > > > > > > integration requires the Flink job
> manager
> > to
> > > > > have
> > > > > > > > access
> > > > > > > > > > to
> > > > > > > > > > > > the
> > > > > > > > > > > > > > k8s
> > > > > > > > > > > > > > > > API
> > > > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > > create pods, which in some scenarios may
> be
> > > > seen
> > > > > as
> > > > > > > > > > > > unfavorable.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >  > > > # Pod Template
> > > > > > > > > > > > > > > > > > > > > Is the pod template in CR same with
> > > what
> > > > > > Flink
> > > > > > > > has
> > > > > > > > > > > > already
> > > > > > > > > > > > > > > > > > > supported[4]?
> > > > > > > > > > > > > > > > > > > > > Then I am afraid not the arbitrary
> > > > > field(e.g.
> > > > > > > > > > cpu/memory
> > > > > > > > > > > > > > > > resources)
> > > > > > > > > > > > > > > > > > > could
> > > > > > > > > > > > > > > > > > > > > take effect.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Yes, pod template would look almost
> > > identical.
> > > > > > There
> > > > > > > > are
> > > > > > > > > a
> > > > > > > > > > few
> > > > > > > > > > > > > > > settings
> > > > > > > > > > > > > > > > > > that the operator will control (and that
> > may
> > > > need
> > > > > > to
> > > > > > > be
> > > > > > > > > > > > > > blacklisted),
> > > > > > > > > > > > > > > > but
> > > > > > > > > > > > > > > > > > in general we would not want to place
> > > > > > restrictions. I
> > > > > > > > > > think a
> > > > > > > > > > > > > > > mechanism
> > > > > > > > > > > > > > > > > > where a pod template is merged from
> > multiple
> > > > > layers
> > > > > > > > would
> > > > > > > > > > also
> > > > > > > > > > > > be
> > > > > > > > > > > > > > > > > > interesting to make this more flexible.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Cheers,
> > > > > > > > > > > > > > > > > > Thomas
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>
