yuanfenghu created FLINK-38377:
----------------------------------
Summary: SQL Job Submission in the Flink Kubernetes Operator via
SQL Gateway
Key: FLINK-38377
URL: https://issues.apache.org/jira/browse/FLINK-38377
Project: Flink
Issue Type: New Feature
Components: Kubernetes Operator, Table SQL / Gateway
Reporter: yuanfenghu
1. Abstract
This proposal outlines a new, integrated approach for submitting Flink SQL
jobs on Kubernetes. We propose enhancing the Flink Kubernetes Operator to
natively support SQL job submission by leveraging the REST API of the Flink
SQL Gateway. This change will provide a standardized, API-driven, and
user-friendly experience, moving away from the current reliance on
custom-built runner JARs.
2. Motivation
Currently, submitting a pure SQL job through the Flink Kubernetes Operator is
not a first-class experience. The common approach, demonstrated by the
flink-sql-runner-example, has several significant drawbacks:
* Requires Custom Tooling: Users must build and maintain a specialized JAR
file (flink-sql-runner.jar) to package and execute their SQL scripts. This
adds an extra layer of complexity to the development and deployment
workflow.
* Complex Dependency Management: Managing SQL connectors, UDFs, and other
dependencies is cumbersome. Users often need to bundle all dependencies into a
"fat JAR" or manage them through complex classpath configurations, which
is error-prone.
* Non-Standard Workflow: This method feels like a workaround rather than a
native feature. It deviates from the declarative, API-centric philosophy of
Kubernetes and the operator pattern.
* Poor User Experience: The process is indirect and not intuitive for users
who simply want to run a SQL script against Flink on Kubernetes.
Starting with Flink 2.0, the Flink SQL Gateway can deploy SQL scripts directly
as application-mode jobs on a Kubernetes cluster
[FLIP-480|https://cwiki.apache.org/confluence/display/FLINK/FLIP-480%3A+Support+to+deploy+SQL+script+in+application+mode].
This presents a perfect opportunity to modernize and streamline the SQL
submission process within the operator ecosystem.
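To make the FLIP-480 capability concrete, the sketch below shows the kind of REST interaction a client (here, eventually the operator) performs against the SQL Gateway: open a session, then execute the script in it. The gateway address is a placeholder and the endpoint paths assume the gateway's versioned sessions/statements REST API; this is an illustration, not the operator's implementation.

```python
# Sketch of submitting a SQL script through the SQL Gateway REST API.
# Assumptions: gateway address is a placeholder; paths follow the
# versioned sessions/statements API (v2 shown here).
import json
import urllib.request

GATEWAY = "http://flink-sql-gateway:8083"  # placeholder service address


def submission_requests(session_handle, sql_script):
    """The (method, path, body) sequence for one submission:
    open a session, then execute the statement within it."""
    return [
        ("POST", "/v2/sessions", {}),
        ("POST", f"/v2/sessions/{session_handle}/statements",
         {"statement": sql_script}),
    ]


def post(path, payload):
    # Minimal JSON POST helper against the gateway.
    req = urllib.request.Request(
        GATEWAY + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def submit(sql_script):
    # Open a session, then hand the whole script to the gateway; per
    # FLIP-480 the gateway can deploy it as an application-mode job.
    session = post("/v2/sessions", {})["sessionHandle"]
    op = post(f"/v2/sessions/{session}/statements",
              {"statement": sql_script})["operationHandle"]
    return session, op
```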
3. Proposed Change
We propose to integrate the Flink Kubernetes Operator with the Flink SQL
Gateway to create a seamless SQL job submission workflow. This approach
effectively turns the operator into an orchestrator that translates a
declarative Kubernetes resource into an imperative API call against the SQL
Gateway.
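As a sketch of the declarative side, a SQL job might be expressed in a custom resource along the following lines. Every field name below, in particular sqlScript, is hypothetical and would be settled during API design; it is shown only to illustrate the declarative-to-imperative translation:

```yaml
# Illustrative resource only: the "sqlScript" field is hypothetical,
# standing in for whatever the final API surface becomes.
apiVersion: flink.apache.org/v1beta1
kind: FlinkDeployment
metadata:
  name: sql-example
spec:
  image: flink:2.0
  flinkVersion: v2_0
  job:
    # Hypothetical field: a SQL script the operator would forward to
    # the SQL Gateway instead of pointing at a jarURI.
    sqlScript: |
      CREATE TEMPORARY TABLE src (id INT) WITH ('connector' = 'datagen');
      CREATE TEMPORARY TABLE snk (id INT) WITH ('connector' = 'blackhole');
      INSERT INTO snk SELECT id FROM src;
    upgradeMode: stateless
```

On reconciliation, the operator would read such a resource and issue the corresponding SQL Gateway REST calls, rather than launching a runner JAR.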
4. Key Advantages
This integration would deliver substantial benefits:
* Standardized, API-Driven Submission: Eliminates the need for custom runner
JARs, providing a clean, official, and maintainable submission process.
* Simplified Dependency Management: Leverages the SQL Gateway's robust
mechanisms for dynamically loading connectors and dependencies, for example
via the ADD JAR statement.
* Improved User Experience: Users can define and manage Flink SQL jobs using
familiar kubectl and YAML files, creating a true Kubernetes-native
experience.
* Decoupling of Logic and Packaging: SQL developers can focus on writing SQL
without worrying about Java build systems or container image customization.
* Alignment with Flink Project Direction: This approach aligns the operator
with the strategic investment being made in the Flink SQL Gateway as the
primary entry point for SQL interactions.
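For instance, with the gateway in the loop, a script can pull in its connector at runtime using Flink SQL's ADD JAR statement; the JAR path below is a placeholder:

```sql
-- Load a connector at runtime instead of baking it into the image;
-- the path is a placeholder for a mounted or fetched artifact.
ADD JAR '/opt/flink/ext/flink-sql-connector-kafka.jar';
-- List the JARs registered in the current session.
SHOW JARS;
```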
5. Next Steps
I have already developed a proof-of-concept for this functionality and
believe it would be a highly valuable addition to the Flink on Kubernetes
ecosystem.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)