A simple Helm chart that covers both scenarios (running the gateway
1) as a sidecar and 2) as an independent deployment), with solid docs,
sounds good to me.
It would be nice if the chart could also include optional features,
such as a k8s Service for exposing the SQL gateway.
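
As a rough sketch of what such an optional Service could look like (the
names and labels are placeholders, and port 8083 assumes the gateway's REST
endpoint is left on its default port):

```yaml
# Hypothetical Service exposing the SQL gateway REST endpoint.
# Name, labels, and selector are placeholders, not values from any chart.
apiVersion: v1
kind: Service
metadata:
  name: flink-sql-gateway          # assumed name
spec:
  type: ClusterIP                  # LoadBalancer or NodePort would also work
  selector:
    app: flink-sql-gateway         # must match the labels on the gateway pods
  ports:
    - name: rest
      port: 8083                   # assumed gateway REST port
      targetPort: 8083
```

A chart could render this only when a value such as
`sqlGateway.service.enabled` (a made-up name) is set to true.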

Best regards,
Dongwoo

On Tue, Sep 19, 2023 at 10:15 PM Thomas Weise <t...@apache.org> wrote:

> It is already possible to bring up a SQL Gateway as a sidecar utilizing the
> pod templates - I tend to also see this more of a documentation/example
> issue rather than something that calls for a separate CRD or other
> dedicated operator support.
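>
> A minimal sketch of the sidecar variant, assuming the official Flink image
> layout (/opt/flink) and the operator's pod template support. The resource
> names, image tag, and gateway flags are illustrative assumptions and
> should be checked against the Flink and operator versions in use:
>
> ```yaml
> # Hypothetical session cluster with a SQL gateway sidecar in the JobManager pod.
> apiVersion: flink.apache.org/v1beta1
> kind: FlinkDeployment
> metadata:
>   name: session-with-gateway       # assumed name
> spec:
>   image: flink:1.17
>   flinkVersion: v1_17
>   serviceAccount: flink
>   flinkConfiguration:
>     taskmanager.numberOfTaskSlots: "2"
>   jobManager:
>     resource:
>       memory: "2048m"
>       cpu: 1
>     podTemplate:
>       spec:
>         containers:
>           # Extra container next to the Flink main container.
>           - name: sql-gateway
>             image: flink:1.17
>             command: ["/opt/flink/bin/sql-gateway.sh"]
>             args:
>               - "start-foreground"
>               - "-Dsql-gateway.endpoint.rest.address=localhost"
>               # The sidecar shares the pod network with the JobManager.
>               - "-Drest.address=localhost"
>             ports:
>               - name: gateway-rest
>                 containerPort: 8083   # assumed default gateway REST port
>   taskManager:
>     resource:
>       memory: "2048m"
>       cpu: 1
> ```
>
> No `job` section is set, so this comes up as a session cluster. Reaching
> the gateway from outside the pod may need a different endpoint address
> setting than the one shown here.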
>
> Thanks,
> Thomas
>
>
>
> On Tue, Sep 19, 2023 at 3:41 PM Gyula Fóra <gyula.f...@gmail.com> wrote:
>
> > Based on this I think we should start with simple Helm charts / templates
> > for creating the `FlinkDeployment` together with a separate Deployment for
> > the SQL Gateway.
> > If the gateway itself doesn't integrate well with the operator-managed CRs
> > (sessionjobs) then I think it's better and simpler to have it separately.
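> >
> > A rough sketch of such a separate Deployment, assuming the session
> > cluster is named flink-session-cluster-example and that its REST
> > endpoint is reachable through a `<cluster-name>-rest` Service. All
> > names, the image tag, and the gateway flags are placeholders to verify:
> >
> > ```yaml
> > # Hypothetical standalone SQL gateway pointing at an existing session cluster.
> > apiVersion: apps/v1
> > kind: Deployment
> > metadata:
> >   name: flink-sql-gateway          # assumed name
> > spec:
> >   replicas: 1
> >   selector:
> >     matchLabels:
> >       app: flink-sql-gateway
> >   template:
> >     metadata:
> >       labels:
> >         app: flink-sql-gateway
> >     spec:
> >       containers:
> >         - name: sql-gateway
> >           image: flink:1.17
> >           command: ["/opt/flink/bin/sql-gateway.sh"]
> >           args:
> >             - "start-foreground"
> >             - "-Dsql-gateway.endpoint.rest.address=localhost"
> >             # Assumed REST Service name of the target session cluster.
> >             - "-Drest.address=flink-session-cluster-example-rest"
> >           ports:
> >             - name: rest
> >               containerPort: 8083   # assumed default gateway REST port
> > ```
> >
> > Keeping the gateway in its own Deployment lets it be scaled and
> > upgraded independently of the `FlinkDeployment`.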
> >
> > These Helm charts should be part of the operator repo / examples with nice
> > docs. If we see that it's useful and popular we can start thinking of
> > integrating it into the CRD.
> >
> > What do you think?
> > Gyula
> >
> > On Tue, Sep 19, 2023 at 6:09 AM Yangze Guo <karma...@gmail.com> wrote:
> >
> > > Thanks for the reply, @Gyula.
> > >
> > > I would like to first provide more context on OLAP scenarios. In OLAP
> > > scenarios, users typically submit multiple short batch jobs that have
> > > execution times typically measured in seconds or even sub-seconds.
> > > Additionally, due to the lightweight nature of these jobs, they often
> > > do not require lifecycle management features and can disable high
> > > availability functionalities such as failover and checkpointing.
> > >
> > > Regarding the integration issue, I believe that supporting the
> > > generation of FlinkSessionJob through a gateway is a "nice to have"
> > > feature rather than a "must-have." Firstly, it may be overkill to
> > > create a CRD for such lightweight jobs, and it could potentially
> > > impact the end-to-end execution time of the OLAP job. Secondly, as
> > > mentioned earlier, these jobs do not have strong lifecycle management
> > > requirements, so having an operator manage them would be a bit wasted.
> > > Therefore, atm, we can allow users to directly submit jobs using JDBC
> > > or REST API. WDYT?
> > >
> > > Best,
> > > Yangze Guo
> > >
> > > On Mon, Sep 18, 2023 at 4:08 PM Gyula Fóra <gyula.f...@gmail.com>
> wrote:
> > > >
> > > > As I wrote in my previous answer, this could be done as a helm chart
> or
> > > as
> > > > part of the operator easily. Both would work.
> > > > My main concern for adding this into the operator is that the SQL
> > Gateway
> > > > itself is not properly integrated with the Operator Custom resources.
> > > >
> > > > Gyula
> > > >
> > > > On Mon, Sep 18, 2023 at 4:24 AM Shammon FY <zjur...@gmail.com>
> wrote:
> > > >
> > > > > Thanks @Gyula, I would like to share our use of sql-gateway with
> the
> > > Flink
> > > > > session cluster and I hope that it could help you to have a clearer
> > > > > understanding of our needs :)
> > > > >
> > > > > > As @Yangze mentioned, we currently use Flink as an OLAP platform
> > > > > > with the following steps:
> > > > > > 1. Set up a Flink session cluster in k8s session mode, with k8s
> > > > > > or zk high availability.
> > > > > > 2. Write a Helm chart for the sql-gateway image and launch multiple
> > > > > > gateway instances that submit jobs to the same Flink session cluster.
> > > > >
> > > > > > As we mentioned in the docs[1], we hope that users can easily launch
> > > > > > sql-gateway instances in k8s. Is it enough to add a Helm chart for
> > > > > > sql-gateway, or do we need to add this feature to the Flink
> > > > > > operator? Could you help us reach a conclusion? Thank you very much,
> > > > > > @Gyula.
> > > > > > [1] https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/olap_quickstart/
> > > > >
> > > > > Best,
> > > > > Shammon FY
> > > > >
> > > > >
> > > > >
> > > > > On Sun, Sep 17, 2023 at 2:02 PM Gyula Fóra <gyula.f...@gmail.com>
> > > wrote:
> > > > >
> > > > > > Hi!
> > > > > > It sounds pretty easy to deploy the gateway automatically with
> > > session
> > > > > > cluster deployments from the operator , but there is a major
> > > limitation
> > > > > > currently. The SQL gateway itself doesn't really support any
> > operator
> > > > > > integration so jobs submitted through the SQL gateway would not
> be
> > > > > > manageable by the operator (they won't show up as session jobs).
> > > > > >
> > > > > > Without that, this is a very strange feature. We would make
> > something
> > > > > much
> > > > > > easier for users that is not well supported by the operator in
> the
> > > first
> > > > > > place. The operator is designed to manage clusters and jobs
> > > > > > (FlinkDeployment / FlinkSessionJob). It would be good to
> understand
> > > if we
> > > > > > could make the SQL Gateway create a FlinkSessionJob / Deployment
> > > (that
> > > > > > would require application cluster support) and basically submit
> the
> > > job
> > > > > > through the operator.
> > > > > >
> > > > > > Cheers,
> > > > > > Gyula
> > > > > >
> > > > > > On Sun, Sep 17, 2023 at 1:26 AM Yangze Guo <karma...@gmail.com>
> > > wrote:
> > > > > >
> > > > > > > > There would be many different ways of doing this. One gateway
> > per
> > > > > > > session cluster, one gateway shared across different
> clusters...
> > > > > > >
> > > > > > > Currently, sql gateway cannot be shared across multiple
> clusters.
> > > > > > >
> > > > > > > > understand the tradeoff and the simplest way of accomplishing
> > > this.
> > > > > > >
> > > > > > > I'm not familiar with the Flink operator codebase, it would be
> > > > > > > appreciated if you could elaborate more on the cost of adding
> > this
> > > > > > > feature. I agree that deploying a gateway using the native
> > > Kubernetes
> > > > > > > Deployment can be a simple way and straightforward for users.
> > > However,
> > > > > > > integrating it into an operator can provide additional benefits
> > > and be
> > > > > > > more user-friendly, especially for users who are less familiar
> > with
> > > > > > > Kubernetes. By using an operator, users can benefit from
> > consistent
> > > > > > > version management with the session cluster and upgrade
> > > capabilities.
> > > > > > >
> > > > > > >
> > > > > > > Best,
> > > > > > > Yangze Guo
> > > > > > >
> > > > > > > On Fri, Sep 15, 2023 at 5:38 PM Gyula Fóra <
> gyula.f...@gmail.com
> > >
> > > > > wrote:
> > > > > > > >
> > > > > > > > There would be many different ways of doing this. One gateway
> > per
> > > > > > session
> > > > > > > > cluster, one gateway shared across different clusters...
> > > > > > > > I would not rush to add anything anywhere until we understand
> > the
> > > > > > > tradeoff
> > > > > > > > and the simplest way of accomplishing this.
> > > > > > > >
> > > > > > > > The operator already supports ingresses for session clusters
> so
> > > we
> > > > > > could
> > > > > > > > have a gateway sitting somewhere else simply using it.
> > > > > > > >
> > > > > > > > Gyula
> > > > > > > >
> > > > > > > > On Fri, Sep 15, 2023 at 10:18 AM Yangze Guo <
> > karma...@gmail.com>
> > > > > > wrote:
> > > > > > > >
> > > > > > > > > Thanks for bringing this up, Dongwoo. Flink SQL Gateway is
> > > also a
> > > > > key
> > > > > > > > > component for OLAP scenarios.
> > > > > > > > >
> > > > > > > > > @Gyula
> > > > > > > > > > How about adding the SQL gateway as an optional component of
> > > > > > > > > > Session Cluster Deployments? Users could specify the resources,
> > > > > > > > > > instance count, and ports of the SQL gateway. I think that would
> > > > > > > > > > help OLAP and batch users a lot.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > Best,
> > > > > > > > > Yangze Guo
> > > > > > > > >
> > > > > > > > > On Fri, Sep 15, 2023 at 3:19 PM ConradJam <
> > jam.gz...@gmail.com
> > > >
> > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > If we start from the crd direction, I think this mode is
> > more
> > > > > like
> > > > > > a
> > > > > > > > > > sidecar of the session cluster, which is submitted to the
> > > session
> > > > > > > cluster
> > > > > > > > > > by sending sql commands to the sql gateway. I don't know
> if
> > > my
> > > > > > > statement
> > > > > > > > > is
> > > > > > > > > > accurate.
> > > > > > > > > >
> > > > > > > > > > > On Fri, Sep 15, 2023 at 13:27 Xiaolong Wang
> > > > > > > > > > > <xiaolong.w...@smartnews.com.invalid> wrote:
> > > > > > > > > >
> > > > > > > > > > > Hi, Dongwoo,
> > > > > > > > > > >
> > > > > > > > > > > Since Flink SQL gateway should run upon a Flink session
> > > > > cluster,
> > > > > > I
> > > > > > > > > think
> > > > > > > > > > > it'd be easier to add more fields to the CRD of
> > > > > > `FlinkSessionJob`.
> > > > > > > > > > >
> > > > > > > > > > > e.g.
> > > > > > > > > > >
> > > > > > > > > > > apiVersion: flink.apache.org/v1beta1
> > > > > > > > > > > kind: FlinkSessionJob
> > > > > > > > > > > metadata:
> > > > > > > > > > >   name: sql-gateway
> > > > > > > > > > > spec:
> > > > > > > > > > >   sqlGateway:
> > > > > > > > > > >     endpoint: "hiveserver2"
> > > > > > > > > > >     mode: "streaming"
> > > > > > > > > > >     hiveConf:
> > > > > > > > > > >       configMap:
> > > > > > > > > > >         name: hive-config
> > > > > > > > > > >         items:
> > > > > > > > > > >           - key: hive-site.xml
> > > > > > > > > > >             path: hive-site.xml
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > On Fri, Sep 15, 2023 at 12:56 PM Dongwoo Kim <
> > > > > > > dongwoo7....@gmail.com>
> > > > > > > > > > > wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Hi all,
> > > > > > > > > > > >
> > > > > > > > > > > > *@Gyula*
> > > > > > > > > > > > Thanks for the consideration Gyula. My initial idea
> for
> > > the
> > > > > CR
> > > > > > > was
> > > > > > > > > > > roughly
> > > > > > > > > > > > like below.
> > > > > > > > > > > > I focused on simplifying the setup in k8s
> environment,
> > > but I
> > > > > > > agree
> > > > > > > > > with
> > > > > > > > > > > > your opinion that for the sql gateway
> > > > > > > > > > > > we don't need custom operator logic to handle and
> most
> > > of the
> > > > > > > > > > > requirements
> > > > > > > > > > > > can be met by existing k8s resources.
> > > > > > > > > > > > So maybe helm chart that bundles all resources needed
> > > should
> > > > > be
> > > > > > > > > enough.
> > > > > > > > > > > >
> > > > > > > > > > > > apiVersion: flink.apache.org/v1beta1
> > > > > > > > > > > > kind: FlinkSqlGateway
> > > > > > > > > > > > metadata:
> > > > > > > > > > > >   name: flink-sql-gateway-example
> > > > > > > > > > > >   namespace: default
> > > > > > > > > > > > spec:
> > > > > > > > > > > >   clusterName: flink-session-cluster-example
> > > > > > > > > > > >   exposeServiceType: LoadBalancer
> > > > > > > > > > > >   flinkSqlGatewayConfiguration:
> > > > > > > > > > > >     sql-gateway.endpoint.type: "hiveserver2"
> > > > > > > > > > > > >     sql-gateway.endpoint.hiveserver2.catalog.name: "hive"
> > > > > > > > > > > >   hiveConf:
> > > > > > > > > > > >     configMap:
> > > > > > > > > > > >       name: hive-config
> > > > > > > > > > > >       items:
> > > > > > > > > > > >         - key: hive-site.xml
> > > > > > > > > > > >           path: hive-site.xml
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > *@xiaolong, @Shammon*
> > > > > > > > > > > > Hi xiaolong and Shammon.
> > > > > > > > > > > > Thanks for taking the time to share.
> > > > > > > > > > > > I'd also like to add my experience with setting up
> > flink
> > > sql
> > > > > > > gateway
> > > > > > > > > on
> > > > > > > > > > > > k8s.
> > > > > > > > > > > > Without building a new Docker image, I've added a
> > > separate
> > > > > > > container
> > > > > > > > > to
> > > > > > > > > > > the
> > > > > > > > > > > > existing JobManager pod and started the sql gateway
> > > using the
> > > > > > > > > > > > "sql-gateway.sh start-foreground" command.
> > > > > > > > > > > > I haven't explored deploying the sql gateway as an
> > > > > independent
> > > > > > > > > deployment
> > > > > > > > > > > > yet, but that's something I'm considering after
> > modifying
> > > > > JM's
> > > > > > > > > address to
> > > > > > > > > > > > desired session cluster.
> > > > > > > > > > > >
> > > > > > > > > > > > Thanks all
> > > > > > > > > > > >
> > > > > > > > > > > > Best
> > > > > > > > > > > > Dongwoo
> > > > > > > > > > > >
> > > > > > > > > > > > > On Fri, Sep 15, 2023 at 11:55 AM Xiaolong Wang
> > > > > > > > > > > > > <xiaolong.w...@smartnews.com.invalid> wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > Hi, Shammon,
> > > > > > > > > > > > >
> > > > > > > > > > > > > Yes, I want to create a Flink SQL-gateway in a
> > > job-manager.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Currently, the above script is generally a
> > work-around
> > > and
> > > > > > > allows
> > > > > > > > > me to
> > > > > > > > > > > > > start a Flink session job manager with a SQL
> gateway
> > > > > running
> > > > > > > upon.
> > > > > > > > > > > > >
> > > > > > > > > > > > > I agree that it'd be more elegant that we create a
> > new
> > > job
> > > > > > > type and
> > > > > > > > > > > > write a
> > > > > > > > > > > > > script, which is much easier for the user to use
> > (since
> > > > > they
> > > > > > > do not
> > > > > > > > > > > need
> > > > > > > > > > > > to
> > > > > > > > > > > > > build a separate Flink image any more).
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Fri, Sep 15, 2023 at 10:29 AM Shammon FY <
> > > > > > zjur...@gmail.com
> > > > > > > >
> > > > > > > > > wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > > Hi,
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Currently `sql-gateway` can be started with the
> > > script
> > > > > > > > > > > `sql-gateway.sh`
> > > > > > > > > > > > > in
> > > > > > > > > > > > > > an existing node, it is more like a simple
> > > "standalone"
> > > > > > > node. I
> > > > > > > > > think
> > > > > > > > > > > > > it's
> > > > > > > > > > > > > > valuable if we can do more work to start it in
> k8s.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > For xiaolong:
> > > > > > > > > > > > > > Do you want to start a sql-gateway instance in
> the
> > > > > > jobmanager
> > > > > > > > > pod? I
> > > > > > > > > > > > > think
> > > > > > > > > > > > > > maybe we need a script like
> > > `kubernetes-sql-gateway.sh`
> > > > > to
> > > > > > > start
> > > > > > > > > > > > > > `sql-gateway` pods with a flink image, what do
> you
> > > think?
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > Shammon FY
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On Fri, Sep 15, 2023 at 10:02 AM Xiaolong Wang
> > > > > > > > > > > > > > <xiaolong.w...@smartnews.com.invalid> wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Hi, I've experimented with this feature on K8s recently; here is
> > > > > > > > > > > > > > > > what I tried:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > 1. Create a new kubernetes-jobmanager.sh script with the following
> > > > > > > > > > > > > > > > content:
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > #!/usr/bin/env bash
> > > > > > > > > > > > > > > > $FLINK_HOME/bin/sql-gateway.sh start
> > > > > > > > > > > > > > > > $FLINK_HOME/bin/kubernetes-jobmanager1.sh kubernetes-session
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > 2. Build your own Flink docker image, something like this:
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > FROM flink:1.17.1-scala_2.12-java11
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > RUN mv $FLINK_HOME/bin/kubernetes-jobmanager.sh $FLINK_HOME/bin/kubernetes-jobmanager1.sh
> > > > > > > > > > > > > > > > COPY ./kubernetes-jobmanager.sh $FLINK_HOME/bin/kubernetes-jobmanager.sh
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > RUN chmod +x $FLINK_HOME/bin/*.sh
> > > > > > > > > > > > > > > > USER flink
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > 3. Create a Flink session job with the operator using the above
> > > > > > > > > > > > > > > > image.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > On Thu, Sep 14, 2023 at 9:49 PM Gyula Fóra <
> > > > > > > > > gyula.f...@gmail.com>
> > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Hi!
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > I don't completely understand what would be a
> > > content
> > > > > > of
> > > > > > > such
> > > > > > > > > > > CRD,
> > > > > > > > > > > > > > could
> > > > > > > > > > > > > > > > you give a minimal example how the Flink SQL
> > > Gateway
> > > > > CR
> > > > > > > yaml
> > > > > > > > > > > would
> > > > > > > > > > > > > look
> > > > > > > > > > > > > > > > like?
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Adding a CRD would mean you need to add some
> > > > > > > > > operator/controller
> > > > > > > > > > > > > logic
> > > > > > > > > > > > > > as
> > > > > > > > > > > > > > > > well. Why not simply use a Deployment /
> > > StatefulSet
> > > > > in
> > > > > > > > > > > Kubernetes?
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Or a Helm chart if you want to make it more
> > user
> > > > > > > friendly?
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Cheers,
> > > > > > > > > > > > > > > > Gyula
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > On Thu, Sep 14, 2023 at 12:57 PM Dongwoo Kim
> <
> > > > > > > > > > > > dongwoo7....@gmail.com
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Hi all,
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > I've been working on setting up a flink SQL
> > > gateway
> > > > > > in
> > > > > > > a
> > > > > > > > > k8s
> > > > > > > > > > > > > > > environment
> > > > > > > > > > > > > > > > > and it got me thinking — what if we had a
> CRD
> > > for
> > > > > > this?
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > So I have quick questions below.
> > > > > > > > > > > > > > > > > 1. Is there ongoing work to create a CRD
> for
> > > the
> > > > > > Flink
> > > > > > > SQL
> > > > > > > > > > > > Gateway?
> > > > > > > > > > > > > > > > > 2. If not, would the community be open to
> > > > > > considering a
> > > > > > > > > CRD for
> > > > > > > > > > > > > this?
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > I've noticed a growing demand for
> simplified
> > > setup
> > > > > of
> > > > > > > the
> > > > > > > > > flink
> > > > > > > > > > > > sql
> > > > > > > > > > > > > > > > gateway
> > > > > > > > > > > > > > > > > in flink's slack channel.
> > > > > > > > > > > > > > > > > Implementing a CRD could make deployments
> > > easier
> > > > > and
> > > > > > > offer
> > > > > > > > > > > better
> > > > > > > > > > > > > > > > > integration with k8s.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > If this idea is accepted, I'm open to
> > drafting
> > > a
> > > > > FLIP
> > > > > > > for
> > > > > > > > > > > further
> > > > > > > > > > > > > > > > > discussion
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Thanks for your time and looking forward to
> > > your
> > > > > > > thoughts!
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Best regards,
> > > > > > > > > > > > > > > > > Dongwoo
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > --
> > > > > > > > > > Best
> > > > > > > > > >
> > > > > > > > > > ConradJam
> > > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > >
> >
>
