Re: Re: [DISCUSS] FLIP-316: Introduce SQL Driver

Chanhae Oh Sun, 07 Jun 2026 05:22:47 -0700

Hi Yuepeng,

Thank you for your feedback and for the encouragement to move this forward. I 
have updated the FLIP-316 document and would like to share the revised version 
for your review.


https://docs.google.com/document/d/1lXOBuYnPq2lOLPa5pJd4jFnxdZEtau_txUzueUgGN7s/edit?usp=sharing

Below is a summary of the key changes from the original draft:

---
Clarifications on scope relative to FLIP-480

FLIP-480 already covers the non-interactive path (SQL script → SqlDriver → 
embedded SQL Gateway). FLIP-316 focuses exclusively on the interactive 
compile-then-deploy path, where the SQL Gateway compiles SQL to a JSON Plan 
before cluster provisioning. The two FLIPs are complementary, not overlapping.

SqlDriver module location

The original draft referenced flink-sql-gateway as the module for SqlDriver. 
This has been corrected — SqlDriver already resides in flink-table-runtime as 
part of the FLIP-480 implementation and will be reused as-is.

JSON Plan distribution — no object storage required

The original draft assumed the JSON Plan would be uploaded to S3/HDFS. Since 
the JSON Plan is an internal serialization format with no need for external 
access, it is now passed inline as a configuration value embedded in the K8s 
ConfigMap or YARN distributed cache at cluster startup. Production-level SQL 
workloads generate JSON Plans well within the 1 MB ConfigMap limit, and the 
plan is automatically cleaned up with the cluster. This removes any dependency 
on object storage for plan distribution.
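
As a rough illustration of the size constraint (not code from the FLIP), the sketch below checks whether a serialized plan would fit inline. The class name, method name, and the byte budget for other ConfigMap entries are all hypothetical; the only external fact assumed is that Kubernetes caps ConfigMap data at 1 MiB.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical helper, not part of Flink: checks whether a JSON Plan fits inline in a ConfigMap. */
final class InlinePlanSizeCheck {
    // Kubernetes caps the data stored in a single ConfigMap at 1 MiB.
    static final int CONFIG_MAP_LIMIT_BYTES = 1024 * 1024;

    /**
     * Returns true when the UTF-8 encoded plan, plus the bytes already used by
     * other entries in the same ConfigMap, stays within the limit.
     */
    static boolean fitsInline(String jsonPlan, int otherEntriesBytes) {
        int planBytes = jsonPlan.getBytes(StandardCharsets.UTF_8).length;
        // Widen to long so the addition cannot overflow.
        return (long) planBytes + otherEntriesBytes <= CONFIG_MAP_LIMIT_BYTES;
    }
}
```

A Gateway could run such a check before deployment and fail fast with a clear error when a plan is unexpectedly large.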

UDF JAR distribution — ResourceManager fix

FLINK-28915 and FLINK-32315 were expected to resolve file distribution, but all 
related PRs (#20671, #20779, #24065) have been closed. Currently, ADD JAR 
's3://...' causes ResourceManager to set a local file:// path in pipeline.jars, 
which is inaccessible from the K8s JobManager Pod. FLIP-316 proposes modifying 
ResourceManager.addJarConfiguration() to propagate the original URI (s3://, 
hdfs://) so the JobManager can fetch the JAR directly via ArtifactFetchManager.
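
To make the intended behavior concrete, here is a minimal sketch of the decision, using only the JDK. It is not the actual ResourceManager code; the class and method names are hypothetical, and it assumes that any URI with a non-file scheme should be passed through unchanged.

```java
import java.net.URI;

/** Hypothetical sketch of the proposed URI propagation; not Flink's ResourceManager. */
final class JarUriPropagation {

    /**
     * Picks the value to record in pipeline.jars for a jar registered via ADD JAR.
     *
     * @param original  the URI the user wrote in ADD JAR
     * @param localCopy the file:// path of the copy downloaded on the Gateway
     */
    static String pipelineJarEntry(URI original, URI localCopy) {
        String scheme = original.getScheme();
        boolean isLocal = scheme == null || "file".equalsIgnoreCase(scheme);
        // Remote URIs (s3://, hdfs://, ...) are kept so the JobManager can fetch them itself;
        // local jars keep today's behavior and point at the Gateway-side copy.
        return isLocal ? localCopy.toString() : original.toString();
    }
}
```

With this rule, ADD JAR 's3://bucket/udfs/my-udf.jar' would yield the s3:// URI rather than the Gateway-local path (the bucket and file names here are made up for the example).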

Temporary function serialization — RexNodeJsonSerializer fix

Temporary functions are currently serialized with only their identifier, so 
they cannot be restored on the JobManager. FLIP-316 proposes modifying 
serializeTemporaryFunction() to also record the class name in the JSON Plan. 
This allows users to use UDFs via CREATE TEMPORARY FUNCTION without relying on 
a permanent catalog (HMS, etc.), and avoids cross-session registration 
conflicts.
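
The sketch below illustrates why the class name matters on the restore side. It is a simplified model, not the RexNodeJsonSerializer code: the class and its fields are hypothetical, and a null class name stands in for today's identifier-only entry.

```java
/** Hypothetical model of a temporary-function entry in the plan; not Flink's actual classes. */
final class TemporaryFunctionEntry {
    final String identifier; // name used in CREATE TEMPORARY FUNCTION
    final String className;  // fully qualified UDF class; null models an identifier-only entry

    TemporaryFunctionEntry(String identifier, String className) {
        this.identifier = identifier;
        this.className = className;
    }

    /** Instantiates the function from its class name, which requires no catalog lookup. */
    Object restore(ClassLoader userClassLoader) {
        if (className == null) {
            // An identifier alone can only be resolved through the session that registered it.
            throw new IllegalStateException(
                    "Cannot restore '" + identifier + "': no class name recorded in the plan");
        }
        try {
            return Class.forName(className, true, userClassLoader)
                    .getDeclaredConstructor()
                    .newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot instantiate " + className, e);
        }
    }
}
```

The sketch assumes the UDF class has a public no-argument constructor and is already on the given class loader, which is where the JAR distribution change above comes in.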

DDL handling — no limitation

The original draft listed DDL as a limitation. This has been corrected. 
OperationExecutor already routes DDL to local Gateway execution, so DDL 
statements (CREATE TABLE IF NOT EXISTS, etc.) are applied directly to the 
external catalog and their schema and connector options are subsequently 
embedded in the JSON Plan via CatalogPlanCompilation.ALL. No special handling 
is required.

PIPELINE_FIXED_JOB_ID — already exists

The original draft noted this option needed to be added. It already exists in 
flink-core and is used by MaterializedTableManager. FLIP-316 reuses it as-is.

Deployment target detection

Added a new proposed change to detect whether the Flink Kubernetes Operator is 
present (via FlinkDeployment CRD) and route accordingly — to a FlinkDeployment 
CR for Operator-managed clusters, or to the existing 
KubernetesClusterDescriptor for native K8s, with YARN as a separate path.
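
The routing rule can be sketched as follows. This is only an illustration under stated assumptions: the class and enum names are hypothetical, the CRD lookup itself is left out and passed in as a boolean, and the target is assumed to arrive as the execution.target values "kubernetes-application" or "yarn-application".

```java
/** Hypothetical sketch of the proposed dispatch; the CRD lookup against the K8s API is not shown. */
final class DeploymentRouter {

    enum Route { FLINK_DEPLOYMENT_CR, NATIVE_KUBERNETES, YARN }

    /**
     * @param executionTarget           value of execution.target for the session
     * @param flinkDeploymentCrdPresent whether the FlinkDeployment CRD was found in the cluster
     */
    static Route route(String executionTarget, boolean flinkDeploymentCrdPresent) {
        switch (executionTarget) {
            case "yarn-application":
                // YARN is a separate path and ignores the CRD check.
                return Route.YARN;
            case "kubernetes-application":
                return flinkDeploymentCrdPresent
                        ? Route.FLINK_DEPLOYMENT_CR
                        : Route.NATIVE_KUBERNETES;
            default:
                throw new IllegalArgumentException(
                        "Unsupported execution.target: " + executionTarget);
        }
    }
}
```

Keeping the detection result as an input makes the rule easy to test and leaves room for an explicit override option if automatic detection turns out to be too implicit.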

Infrastructure prerequisites

Documented the required pre-configuration for each deployment target:

Deployment Target             | Required                    | Recommended
------------------------------+-----------------------------+-------------------------------------
Same Kubernetes cluster       | ServiceAccount + RBAC       | Pre-configure Gateway Pod
Different Kubernetes cluster  | Target cluster kubeconfig   | kubernetes.config.file via SET
YARN                          | Hadoop configuration files  | HADOOP_CONF_DIR environment variable

---
I would appreciate any feedback on the direction, especially on the 
ResourceManager and RexNodeJsonSerializer changes, which are the two concrete 
code modifications proposed.

Looking forward to your thoughts.

Best regards,
Chanhae Oh

________________________________
From: Yuepeng Pan <[email protected]>
Sent: Wednesday, June 3, 2026 00:04
To: [email protected] <[email protected]>
Subject: Re: Re: [DISCUSS] FLIP-316: Introduce SQL Driver

Hi, Chanhae,

Thank you so much for reorganizing and driving this FLIP forward.

I've briefly reviewed the discussion history and pending issues, and your
summary aligns perfectly with what I saw.
I look forward to seeing the specific proposals for those outstanding
items.
Would you mind documenting them in a Google Doc or directly on the FLIP
wiki page?

Thanks again!

Best regards,
Yuepeng Pan

Chanhae Oh <[email protected]> wrote on Tue, Jun 2, 2026 at 22:04:

> Hi all,
>
> I'd like to share some thoughts on FLIP-316 and how it might complement
> the recently merged FLIP-480.
>
> How FLIP-316 and FLIP-480 complement each other
>
>  FLIP-480 (FLINK-36702) ships a SQL script file to the JobManager and
> compiles it at runtime inside the already-deployed cluster. FLIP-316, by
> contrast, proposes that the SQL Gateway compiles the query into a
> CompiledPlan (JSON Plan) first, then deploys that artifact to application
> mode. The key difference is where and when compilation happens:
>
>   - FLIP-480: script → JM compiles at runtime (simpler, good for ad-hoc)
>   - FLIP-316: Gateway compiles → JSON Plan → deploy to JM (enables
>     pre-validation, plan inspection, and more deterministic behavior across
>     cluster versions)
>
>  These two approaches are complementary. FLIP-316 adds a compile-first
> path that gives users stronger guarantees before a cluster is provisioned.
>
>  SET configuration and cluster parameters
>
>   One practical advantage of the compile-first model is that SET
> statements are evaluated in the Gateway session before cluster
> provisioning. This means table.* options can be embedded into the JSON Plan
> itself (they are part of each ExecNode's configuration), and kubernetes.* /
> taskmanager.* / jobmanager.* options can be captured at deploy time as
> cluster-level configuration. No separate configuration file management is
> needed: the session config naturally splits into plan-level and
> cluster-level at the right boundary.
>
>  UDF support
>
>   TableConfigOptions.CatalogPlanCompilation.ALL (the default) embeds both
> the function identifier and the fully qualified class name into the JSON
> Plan. This means the JobManager does not need catalog access to resolve
> UDFs at runtime. UDF class metadata is self-contained in the plan.
>
>   UDF JAR distribution to the application mode cluster is a separate
> concern. Three directions come to mind, and I suspect the community may
> already have opinions on which is preferred:
>
>   1. Require users to pre-stage JARs at a remote URI (S3, HDFS) and pass
> them via user.artifacts.artifact-list.
> KubernetesApplicationClusterEntrypoint already invokes ArtifactFetchManager
> for this config key when pipeline.jars is set, though the interaction with
> usingSystemClassPath may need revisiting.
>   2. Accept remote URIs in ADD JAR and propagate them as-is into the
> deploy configuration (rather than resolving to a local Gateway path via
> ResourceManager).
>   3. Document that UDF JARs must be baked into the cluster image for the
> first iteration, deferring dynamic JAR distribution to a follow-up.
>
>   Option 3 is the most conservative and might be a reasonable scope for an
> initial implementation.
>
> CALL PROCEDURE and statement scope
>
>   compilePlanSql() in TableEnvironmentImpl currently enforces that only
> ModifyOperation (i.e., INSERT statements and EXECUTE STATEMENT SET) is
> accepted. CALL PROCEDURE is not a ModifyOperation and will throw
> TableException. This is an existing constraint in the planner, not
> something FLIP-316 introduces.
>
>   One possible direction would be to document clearly which statement
> types are in scope for the compile-then-deploy path (INSERT, EXECUTE
> STATEMENT SET) and which are not (CALL PROCEDURE, DDL, DML row-level
> modifications). Explicit scoping in the FLIP would prevent ambiguity in the
> implementation.
>
> Open questions I'd appreciate input on
>
>
>   1. Kubernetes Operator path: When the Gateway is running inside a K8s cluster
> managed by the Flink Kubernetes Operator, submitting via a FlinkDeployment
> CR may be more appropriate than the native KubernetesClusterDescriptor. One
> possible way to detect this is to check whether the FlinkDeployment CRD is
> registered in the cluster via the K8s API. FlinkKubeClientFactory already
> handles kubeconfig resolution (kubernetes.config.file →
> Config.fromKubeconfig(), otherwise Config.autoConfigure() for in-cluster
> service accounts), so the config machinery seems reusable. Does the
> community have a preferred detection or dispatch strategy here?
>   2. EXECUTE STATEMENT SET scope: Since StatementSet already exposes
> compilePlan(), which produces a single CompiledPlan covering multiple
> sinks, this case seems naturally supported. Would it make sense to treat a
> single INSERT and a STATEMENT SET as equivalent from the
> compile-then-deploy perspective? Or should we restrict the first iteration
> to single-INSERT plans?
>   3. API design: Should FLIP-316 introduce a dedicated endpoint (e.g., POST
> /sessions/{sessionHandle}/plans) separate from FLIP-480's
> /sessions/{sessionHandle}/scripts, or extend the existing deployScript
> endpoint? A separate endpoint seems cleaner — it avoids conflating the
> compile-then-deploy model with the script-execution model — but I'm curious
> whether there are integration or UX reasons to unify them.
>
> I'm happy to look into any of these further. Comments and corrections are
> very welcome.
>
> Best regards,
> Chanhae Oh.
>
>
> On 2023/06/08 15:20:23 Paul Lam wrote:
> > Hi ShengKai,
> >
> > Good point with the ANALYZE TABLE and CALL PROCEDURE statements.
> >
> > > Can we remove the jars if the job is running or gateway exits?
> >
> > Yes, I think it would be okay to remove the resources after the job is
> > submitted. It should be Gateway’s responsibility to remove them.
> >
> > > Can we use the returned rest client by ApplicationDeployer to query the
> > > job id? I am concerned that users don't know which job is related to the
> > > submitted SQL.
> >
> > That should be doable, as normally we only allow one job in an application
> > cluster ATM.
> >
> > But a more significant problem I see is that select statements are not
> > available.
> >
> > Perhaps we need to make CollectSinkFunction accept an external sink address
> > from SQL Gateway to get the result back from SQL Driver. WDYT?
> >
> > > It seems we need to introduce a new module. Will the new module be
> > > available in the distribution package? I agree with Jark that we don't
> > > need to introduce this for table-API users and these users have their main
> > > class. If we want to make users write the k8s operator more easily, I
> > > think we should modify the k8s operator repo. If we don't need to support
> > > SQL files, can we make this jar only visible in the sql-gateway like we do
> > > in the planner loader?[1]
> >
> > I rethought the relationship between SQL Driver and SQL Client with
> > embedded Gateway. With the help of SQL Driver, we should be able to run SQL
> > files with non-interactive SQL Client on K8s, just as @Biao did.
> >
> > If it’s the case, I’m good with introducing a new module and making SQL
> > Driver an internal class that accepts JSON plans only.
> >
> > WRT visibility, I lean toward making it more publicly visible and easy to
> > integrate with external systems. I think putting the jar in the opt folder
> > is good. Could you elaborate a bit more about the benefit we get from an
> > extra loader?
> >
> > Best,
> > Paul Lam
> >
> > > On Jun 7, 2023, at 17:25, Shengkai Fang <[email protected]> wrote:
> > >
> > > Hi Paul. Thanks for your update; it makes me understand the design much
> > > better.
> > >
> > > But I still have some questions about the FLIP.
> > >
> > >> For SQL Gateway, only DMLs need to be delegated to the SQL server
> > >> Driver. I would think about the details and update the FLIP. Do you
> > >> have some ideas already?
> > >
> > > If the application mode cannot support library mode, I think we should
> > > only execute INSERT INTO and UPDATE/DELETE statements in the application
> > > mode. AFAIK, we cannot support ANALYZE TABLE and CALL PROCEDURE
> > > statements. The ANALYZE TABLE syntax needs to register the statistics to
> > > the catalog after the job finishes, and the CALL PROCEDURE statement
> > > doesn't generate the ExecNodeGraph.
> > >
> > > * Introduce storage via option `sql-gateway.application.storage-dir`
> > >
> > > If we cannot support submitting the jars through web submission, +1 to
> > > introduce the options to upload the files. However, I think the uploader
> > > should be responsible for removing the uploaded jars. Can we remove the
> > > jars if the job is running or gateway exits?
> > >
> > > * JobID is not available
> > >
> > > Can we use the returned rest client by ApplicationDeployer to query the
> > > job id? I am concerned that users don't know which job is related to the
> > > submitted SQL.
> > >
> > > * Do we need to introduce a new module named flink-table-sql-runner?
> > >
> > > It seems we need to introduce a new module. Will the new module be
> > > available in the distribution package? I agree with Jark that we don't
> > > need to introduce this for table-API users and these users have their main
> > > class. If we want to make users write the k8s operator more easily, I
> > > think we should modify the k8s operator repo. If we don't need to support
> > > SQL files, can we make this jar only visible in the sql-gateway like we do
> > > in the planner loader?[1]
> > >
> > > [1]
> > > https://github.com/apache/flink/blob/master/flink-table/flink-table-planner-loader/src/main/java/org/apache/flink/table/planner/loader/PlannerModule.java#L95
> > >
> > > Best,
> > > Shengkai
> > >
> > > On Wed, Jun 7, 2023 at 10:52, Weihua Hu <[email protected]> wrote:
> > >
> > >> Hi,
> > >>
> > >> Thanks for updating the FLIP.
> > >>
> > >> I have two cents on the distribution of SQLs and resources.
> > >> 1. Should we support a common file distribution mechanism for k8s
> > >> application mode?
> > >> I have seen some issues and requirements on the mailing list.
> > >> In our production environment, we implement the download command in the
> > >> CliFrontend, and automatically add an init container to the Pod for file
> > >> downloading. The advantage of this is that we can use all
> > >> Flink-supported file systems to store files.
> > >>
> > >> This needs more discussion. I would appreciate hearing more opinions.
> > >>
> > >> 2. In this FLIP, we distribute files in two different ways in YARN and
> > >> Kubernetes. Can we combine it in one way?
> > >> If we don't want to implement a common file distribution for k8s
> > >> application mode, could we use the SQLDriver to download the files both
> > >> in YARN and K8S? IMO, this can reduce the cost of code maintenance.
> > >>
> > >> Best,
> > >> Weihua
> > >>
> > >>
> > >> On Wed, Jun 7, 2023 at 10:18 AM Paul Lam <[email protected]> wrote:
> > >>
> > >>> Hi Mason,
> > >>>
> > >>> Thanks for your input!
> > >>>
> > >>>> +1 for init containers or a more generalized way of obtaining
> > >>>> arbitrary files. File fetching isn't specific to just SQL--it also
> > >>>> matters for Java applications if the user doesn't want to rebuild a
> > >>>> Flink image and just wants to modify the user application fat jar.
> > >>>
> > >>> I agree that utilizing SQL Drivers in Java applications is equally
> > >>> important
> > >>> as employing them in SQL Gateway. WRT init containers, I think most
> > >>> users use them just as a workaround. For example, wget a jar from the
> > >>> maven repo.
> > >>>
> > >>> We could implement the functionality in SQL Driver in a more graceful
> > >>> way and the flink-supported filesystem approach seems to be a
> > >>> good choice.
> > >>>
> > >>>> Also, what do you think about prefixing the config options with
> > >>>> `sql-driver` instead of just `sql` to be more specific?
> > >>>
> > >>> LGTM, since SQL Driver is a public interface and the options are
> > >>> specific to it.
> > >>>
> > >>> Best,
> > >>> Paul Lam
> > >>>
> > >>>> On Jun 6, 2023, at 06:30, Mason Chen <[email protected]> wrote:
> > >>>>
> > >>>> Hi Paul,
> > >>>>
> > >>>> +1 for this feature and supporting SQL file + JSON plans. We get a lot
> > >>>> of requests to just be able to submit a SQL file, but the JSON plan
> > >>>> optimizations make sense.
> > >>>>
> > >>>> +1 for init containers or a more generalized way of obtaining
> > >>>> arbitrary files. File fetching isn't specific to just SQL--it also
> > >>>> matters for Java applications if the user doesn't want to rebuild a
> > >>>> Flink image and just wants to modify the user application fat jar.
> > >>>>
> > >>>>> Please note that we could reuse the checkpoint storage like S3/HDFS,
> > >>>>> which should be required to run Flink in production, so I guess that
> > >>>>> would be acceptable for most users. WDYT?
> > >>>>
> > >>>> If you do go this route, it would be nice to support writing these
> > >>>> files to S3/HDFS via Flink. This makes access control and policy
> > >>>> management simpler.
> > >>>>
> > >>>> Also, what do you think about prefixing the config options with
> > >>>> `sql-driver` instead of just `sql` to be more specific?
> > >>>>
> > >>>> Best,
> > >>>> Mason
> > >>>>
> > >>>> On Mon, Jun 5, 2023 at 2:28 AM Paul Lam <[email protected]> wrote:
> > >>>>
> > >>>>> Hi Jark,
> > >>>>>
> > >>>>> Thanks for your input! Please see my comments inline.
> > >>>>>
> > >>>>>> Isn't Table API the same way as DataStream jobs to submit Flink SQL?
> > >>>>>> DataStream API also doesn't provide a default main class for users,
> > >>>>>> why do we need to provide such one for SQL?
> > >>>>>
> > >>>>> Sorry for the confusion I caused. By DataStream jobs, I mean jobs
> > >>>>> submitted via Flink CLI which actually could be DataStream/Table jobs.
> > >>>>>
> > >>>>> I think a default main class would be user-friendly, as it eliminates
> > >>>>> the need for users to write a main class such as SQLRunner in Flink
> > >>>>> K8s operator [1].
> > >>>>>
> > >>>>>> I thought the proposed SqlDriver was a dedicated main class accepting
> > >>>>>> SQL files, is that correct?
> > >>>>>
> > >>>>> Both JSON plans and SQL files are accepted. SQL Gateway should use
> > >>>>> JSON plans, while CLI users may use either JSON plans or SQL files.
> > >>>>>
> > >>>>> Please see the updated FLIP[2] for more details.
> > >>>>>
> > >>>>>> Personally, I prefer the way of init containers which doesn't depend
> > >>>>>> on additional components.
> > >>>>>> This can reduce the moving parts of a production environment.
> > >>>>>> Depending on a distributed file system makes the testing, demo, and
> > >>>>>> local setup harder than init containers.
> > >>>>>
> > >>>>> Please note that we could reuse the checkpoint storage like S3/HDFS,
> > >>>>> which should be required to run Flink in production, so I guess that
> > >>>>> would be acceptable for most users. WDYT?
> > >>>>>
> > >>>>> WRT testing, demo, and local setups, I think we could support the
> > >>>>> local filesystem scheme i.e. file://** as the state backends do. It
> > >>>>> works as long as SQL Gateway and JobManager (or SQL Driver) can access
> > >>>>> the resource directory (specified via
> > >>>>> `sql-gateway.application.storage-dir`).
> > >>>>>
> > >>>>> Thanks!
> > >>>>>
> > >>>>> [1]
> > >>>>> https://github.com/apache/flink-kubernetes-operator/blob/main/examples/flink-sql-runner-example/src/main/java/org/apache/flink/examples/SqlRunner.java
> > >>>>> [2]
> > >>>>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-316:+Introduce+SQL+Driver
> > >>>>> [3]
> > >>>>> https://github.com/apache/flink/blob/3245e0443b2a4663552a5b707c5c8c46876c1f6d/flink-runtime/src/test/java/org/apache/flink/runtime/state/filesystem/AbstractFileCheckpointStorageAccessTestBase.java#L161
> > >>>>>
> > >>>>> Best,
> > >>>>> Paul Lam
> > >>>>>
> > >>>>>> On Jun 3, 2023, at 12:21, Jark Wu <[email protected]> wrote:
> > >>>>>>
> > >>>>>> Hi Paul,
> > >>>>>>
> > >>>>>> Thanks for your reply. I left my comments inline.
> > >>>>>>
> > >>>>>>> As the FLIP said, it’s good to have a default main class for Flink
> > >>>>>>> SQLs, which allows users to submit Flink SQLs in the same way as
> > >>>>>>> DataStream jobs, or else users need to write their own main class.
> > >>>>>>
> > >>>>>> Isn't Table API the same way as DataStream jobs to submit Flink SQL?
> > >>>>>> DataStream API also doesn't provide a default main class for users,
> > >>>>>> why do we need to provide such one for SQL?
> > >>>>>>
> > >>>>>>> With the help of ExecNodeGraph, do we still need the serialized
> > >>>>>>> SessionState? If not, we could make SQL Driver accept two
> > >>>>>>> serialized formats:
> > >>>>>>
> > >>>>>> No, ExecNodeGraph doesn't need to serialize SessionState. I thought
> > >>>>>> the proposed SqlDriver was a dedicated main class accepting SQL
> > >>>>>> files, is that correct?
> > >>>>>> If true, we have to ship the SessionState for this case which is a
> > >>>>>> large work.
> > >>>>>> I think we just need a JsonPlanDriver which is a main class that
> > >>>>>> accepts JsonPlan as the parameter.
> > >>>>>>
> > >>>>>>
> > >>>>>>> The common solutions I know are to use distributed file systems or
> > >>>>>>> use init containers to localize the resources.
> > >>>>>>
> > >>>>>> Personally, I prefer the way of init containers which doesn't depend
> > >>>>>> on additional components.
> > >>>>>> This can reduce the moving parts of a production environment.
> > >>>>>> Depending on a distributed file system makes the testing, demo, and
> > >>>>>> local setup harder than init containers.
> > >>>>>>
> > >>>>>> Best,
> > >>>>>> Jark
> > >>>>>>
> > >>>>>>
> > >>>>>>
> > >>>>>>
> > >>>>>> On Fri, 2 Jun 2023 at 18:10, Paul Lam <[email protected]> wrote:
> > >>>>>>
> > >>>>>>> The FLIP is in the early phase and some details are not included,
> > >>>>>>> but fortunately, we got lots of valuable ideas from the discussion.
> > >>>>>>>
> > >>>>>>> Thanks to everyone who joined the discussion!
> > >>>>>>> @Weihua @Shanmon @Shengkai @Biao @Jark
> > >>>>>>>
> > >>>>>>> This weekend I’m gonna revisit and update the FLIP, adding more
> > >>>>>>> details. Hopefully, we can further align our opinions.
> > >>>>>>>
> > >>>>>>> Best,
> > >>>>>>> Paul Lam
> > >>>>>>>
> > >>>>>>>> On Jun 2, 2023, at 18:02, Paul Lam <[email protected]> wrote:
> > >>>>>>>>
> > >>>>>>>> Hi Jark,
> > >>>>>>>>
> > >>>>>>>> Thanks a lot for your input!
> > >>>>>>>>
> > >>>>>>>>> If we decide to submit ExecNodeGraph instead of SQL file, is it
> > >>>>>>>>> still necessary to support SQL Driver?
> > >>>>>>>>
> > >>>>>>>> I think so. Apart from usage in SQL Gateway, SQL Driver could
> > >>>>>>>> simplify Flink SQL execution with Flink CLI.
> > >>>>>>>>
> > >>>>>>>> As the FLIP said, it’s good to have a default main class for Flink
> > >>>>>>>> SQLs, which allows users to submit Flink SQLs in the same way as
> > >>>>>>>> DataStream jobs, or else users need to write their own main class.
> > >>>>>>>>
> > >>>>>>>>> SQL Driver needs to serialize SessionState which is very
> > >>>>>>>>> challenging but not covered in detail in the FLIP.
> > >>>>>>>>
> > >>>>>>>> With the help of ExecNodeGraph, do we still need the serialized
> > >>>>>>>> SessionState? If not, we could make SQL Driver accept two
> > >>>>>>>> serialized formats:
> > >>>>>>>>
> > >>>>>>>> - SQL files for user-facing public usage
> > >>>>>>>> - ExecNodeGraph for internal usage
> > >>>>>>>>
> > >>>>>>>> It’s kind of similar to the relationship between job jars and
> > >>>>>>>> jobgraphs.
> > >>>>>>>>
> > >>>>>>>>> Regarding "K8S doesn't support shipping multiple jars", is that
> > >>>>>>>>> true? Is it possible to support it?
> > >>>>>>>>
> > >>>>>>>> Yes, K8s doesn’t distribute any files. It’s the users’
> > >>>>>>>> responsibility to make sure the resources are accessible in the
> > >>>>>>>> containers. The common solutions I know are to use distributed file
> > >>>>>>>> systems or use init containers to localize the resources.
> > >>>>>>>>
> > >>>>>>>> Now I lean toward introducing a fs to do the distribution job.
> > >>>>>>>> WDYT?
> > >>>>>>>>
> > >>>>>>>> Best,
> > >>>>>>>> Paul Lam
> > >>>>>>>>
> > >>>>>>>>> On Jun 1, 2023, at 20:33, Jark Wu <[email protected]> wrote:
> > >>>>>>>>>
> > >>>>>>>>> Hi Paul,
> > >>>>>>>>>
> > >>>>>>>>> Thanks for starting this discussion. I like the proposal! This is
> > >>>>>>>>> a frequently requested feature!
> > >>>>>>>>>
> > >>>>>>>>> I agree with Shengkai that ExecNodeGraph as the submission object
> > >>>>>>>>> is a better idea than SQL file. To be more specific, it should be
> > >>>>>>>>> JsonPlanGraph or CompiledPlan which is the serializable
> > >>>>>>>>> representation. CompiledPlan is a clear separation between
> > >>>>>>>>> compiling/optimization/validation and execution. This can keep the
> > >>>>>>>>> validation and metadata accessing still on the SQLGateway side.
> > >>>>>>>>> This allows SQLGateway to leverage some metadata caching and UDF
> > >>>>>>>>> JAR caching for better compiling performance.
> > >>>>>>>>>
> > >>>>>>>>> If we decide to submit ExecNodeGraph instead of SQL file, is it
> > >>>>>>>>> still necessary to support SQL Driver? Regarding non-interactive
> > >>>>>>>>> SQL jobs,
> [message truncated...]
>
