[ https://issues.apache.org/jira/browse/FLINK-9181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16456345#comment-16456345 ]
ASF GitHub Bot commented on FLINK-9181:
---------------------------------------

Github user kl0u commented on a diff in the pull request:

    https://github.com/apache/flink/pull/5913#discussion_r184678478

--- Diff: docs/dev/table/sqlClient.md ---
@@ -0,0 +1,538 @@
---
title: "SQL Client"
nav-parent_id: tableapi
nav-pos: 100
is_beta: true
---
<!--
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements.  See the NOTICE file
distributed with this work for additional information
regarding copyright ownership.  The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License.  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied.  See the License for the
specific language governing permissions and limitations
under the License.
-->

Flink's Table & SQL API makes it possible to declare queries in the SQL language. However, such a query needs to be embedded within a table program that is written in either Java or Scala, and the table program needs to be packaged with a build tool before it can be submitted to a cluster. This limits the usage of Flink mostly to Java/Scala programmers.

The *SQL Client* aims to provide an easy way of writing, debugging, and submitting table programs to a Flink cluster without a single line of code. The *SQL Client CLI* allows for retrieving and visualizing real-time results from the running distributed application on the command line.

<a href="{{ site.baseurl }}/fig/sql_client_demo.gif"><img class="offset" src="{{ site.baseurl }}/fig/sql_client_demo.gif" alt="Animated demo of the Flink SQL Client CLI running table programs on a cluster" width="80%" /></a>

**Note:** The SQL Client is in an early development phase. Even though the application is not production-ready yet, it can be quite a useful tool for prototyping and playing around with Flink SQL. In the future, the community plans to extend its functionality by providing a REST-based [SQL Client Gateway](sqlClient.html#limitations--future).

* This will be replaced by the TOC
{:toc}

Getting Started
---------------

This section describes how to set up and run your first Flink SQL program from the command line. The SQL Client is bundled in the regular Flink distribution and thus runnable out of the box.

The SQL Client requires a running Flink cluster to which table programs can be submitted. For more information about setting up a Flink cluster, see the [deployment part of this documentation]({{ site.baseurl }}/ops/deployment/cluster_setup.html). If you simply want to try out the SQL Client, you can also start a local cluster with one worker using the following command:

{% highlight bash %}
./bin/start-cluster.sh
{% endhighlight %}

### Starting the SQL Client CLI

The SQL Client scripts are also located in the binary directory of Flink. You can start the CLI by calling:

{% highlight bash %}
./bin/sql-client.sh embedded
{% endhighlight %}

This command starts the submission service and CLI embedded in one application process. By default, the SQL Client reads its configuration from the environment file located in `./conf/sql-client-defaults.yaml`. See the [next part](sqlClient.html#environment-files) for more information about the structure of environment files.
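The CLI can also be started with a custom configuration using the optional flags described in the [Configuration](sqlClient.html#configuration) section below. As a minimal sketch, assuming a custom environment file at the hypothetical path `conf/my-defaults.yaml`:

{% highlight bash %}
# Start the CLI with a custom defaults file and an explicit session name.
# Both the file path and the session name are placeholders for this example.
./bin/sql-client.sh embedded \
  --defaults conf/my-defaults.yaml \
  --session my_session
{% endhighlight %}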
### Running SQL Queries

Once the CLI has been started, you can use the `HELP` command to list all available SQL statements. For validating your setup and cluster connection, you can enter your first SQL query and press the `Enter` key to execute it:

{% highlight sql %}
SELECT 'Hello World'
{% endhighlight %}

This query requires no table source and produces a single-row result. The CLI will retrieve results from the cluster and visualize them. You can close the result view by pressing the `Q` key.

The CLI supports **two modes** for maintaining and visualizing results.

The *table mode* materializes results in memory and visualizes them in a regular, paginated table representation. It can be enabled by executing the following command in the CLI:

{% highlight text %}
SET execution.result-mode=table
{% endhighlight %}

The *changelog mode* does not materialize results; it visualizes the result stream that is produced by a continuous query [LINK] as a stream of insertions (`+`) and retractions (`-`):

{% highlight text %}
SET execution.result-mode=changelog
{% endhighlight %}

You can use the following query to see both result modes in action:

{% highlight sql %}
SELECT name, COUNT(*) AS cnt FROM (VALUES ('Bob'), ('Alice'), ('Greg'), ('Bob')) AS NameTable(name) GROUP BY name
{% endhighlight %}

This query performs a bounded word count example. The following sections explain how to read from table sources and configure other table program properties.
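To give an intuition for the difference between the two modes, the changelog mode might render the query above roughly as follows. This is an illustrative sketch of the insert/retract stream rather than the exact CLI output: when the second `'Bob'` row arrives, the previous count for `Bob` is retracted and the updated count is inserted.

{% highlight text %}
+ Bob, 1
+ Alice, 1
+ Greg, 1
- Bob, 1
+ Bob, 2
{% endhighlight %}

The table mode, in contrast, materializes these updates and only shows the current count per name.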
{% top %}

Configuration
-------------

The SQL Client can be started with the following optional command-line options. They are discussed in detail in the subsequent paragraphs.

{% highlight text %}
./bin/sql-client.sh embedded --help

Mode "embedded" submits Flink jobs from the local machine.

  Syntax: embedded [OPTIONS]
  "embedded" mode options:
     -d,--defaults <environment file>      The environment properties with which
                                           every new session is initialized.
                                           Properties might be overwritten by
                                           session properties.
     -e,--environment <environment file>   The environment properties to be
                                           imported into the session. It might
                                           overwrite default environment
                                           properties.
     -h,--help                             Show the help message with
                                           descriptions of all options.
     -j,--jar <JAR file>                   A JAR file to be imported into the
                                           session. The file might contain
                                           user-defined classes needed for the
                                           execution of statements such as
                                           functions, table sources, or sinks.
                                           Can be used multiple times.
     -l,--library <JAR directory>          A JAR file directory with which every
                                           new session is initialized. The files
                                           might contain user-defined classes
                                           needed for the execution of
                                           statements such as functions, table
                                           sources, or sinks. Can be used
                                           multiple times.
     -s,--session <session identifier>     The identifier for a session.
                                           'default' is the default identifier.
{% endhighlight %}

{% top %}

### Environment Files

A SQL query needs a configuration environment in which it is executed. The so-called *environment files* define available table sources and sinks, external catalogs, user-defined functions, and other properties required for execution and deployment.

Every environment file is a regular [YAML file](http://yaml.org/) that looks similar to the following example. The file defines an environment with a table source `MyTableName` that reads from a CSV file. Queries that are executed in this environment will, among other properties, have a parallelism of 1, an event-time characteristic, and run in the `table` result mode.

{% highlight yaml %}
# Define table sources and sinks here.

tables:
  - name: MyTableName
    type: source
    schema:
      - name: MyField1
        type: INT
      - name: MyField2
        type: VARCHAR
    connector:
      type: filesystem
      path: "/path/to/something.csv"
    format:
      type: csv
      fields:
        - name: MyField1
          type: INT
        - name: MyField2
          type: VARCHAR
      line-delimiter: "\n"
      comment-prefix: "#"

# Execution properties allow for changing the behavior of a table program.

execution:
  type: streaming
  time-characteristic: event-time
  parallelism: 1
  max-parallelism: 16
  min-idle-state-retention: 0
  max-idle-state-retention: 0
  result-mode: table

# Deployment properties allow for describing the cluster to which table programs are submitted.

deployment:
  response-timeout: 5000
{% endhighlight %}

Environment files can be created for general purposes (*defaults environment file* using `--defaults`) as well as on a per-session basis (*session environment file* using `--environment`). Every CLI session is initialized with the default properties followed by the session properties. Both default and session environment files can be passed when starting the CLI application. If no default environment file has been specified, the SQL Client searches for `./conf/sql-client-defaults.yaml` in Flink's configuration directory.

Properties that have been set within a CLI session (e.g. using the `SET` command) have the highest precedence:

{% highlight text %}
CLI commands > session environment file > defaults environment file
{% endhighlight %}

{% top %}

### Dependencies

The SQL Client does not require setting up a Java project using Maven or SBT. Instead, you can pass the dependencies as regular JAR files that get submitted to the cluster. You can either specify each JAR file separately (using `--jar`) or define entire library directories (using `--library`). For connectors to external systems (such as Apache Kafka) and corresponding data formats (such as JSON), Flink provides **ready-to-use JAR bundles**. These JAR files are suffixed with `sql-jar` and can be downloaded for each release from the Maven central repository.
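As a sketch of how such dependencies could be passed when starting the CLI, assuming a downloaded format JAR and a directory of connector JARs at hypothetical paths:

{% highlight bash %}
# Start the CLI with one explicit SQL JAR and a directory of further JARs.
# Both paths below are placeholders for this example.
./bin/sql-client.sh embedded \
  --jar ./sql-jars/flink-json-sql-jar.jar \
  --library ./sql-jars/connectors
{% endhighlight %}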
#### Connectors

| Name              | Version       | Download                |
| :---------------- | :------------ | :---------------------- |
| Filesystem        |               | Built-in                |
| Apache Kafka      | 0.8           | [Download](http://central.maven.org/maven2/org/apache/flink/flink-connector-kafka-0.8{{site.scala_version_suffix}}/{{site.version}}/flink-connector-kafka-0.8{{site.scala_version_suffix}}-{{site.version}}-sql-jar.jar) |
| Apache Kafka      | 0.9           | [Download](http://central.maven.org/maven2/org/apache/flink/flink-connector-kafka-0.9{{site.scala_version_suffix}}/{{site.version}}/flink-connector-kafka-0.9{{site.scala_version_suffix}}-{{site.version}}-sql-jar.jar) |
| Apache Kafka      | 0.10          | [Download](http://central.maven.org/maven2/org/apache/flink/flink-connector-kafka-0.10{{site.scala_version_suffix}}/{{site.version}}/flink-connector-kafka-0.10{{site.scala_version_suffix}}-{{site.version}}-sql-jar.jar) |
| Apache Kafka      | 0.11          | [Download](http://central.maven.org/maven2/org/apache/flink/flink-connector-kafka-0.11{{site.scala_version_suffix}}/{{site.version}}/flink-connector-kafka-0.11{{site.scala_version_suffix}}-{{site.version}}-sql-jar.jar) |

#### Formats

| Name              | Download                |
| :---------------- | :---------------------- |
| CSV               | Built-in                |
| JSON              | [Download](http://central.maven.org/maven2/org/apache/flink/flink-json/{{site.version}}/flink-json-{{site.version}}-sql-jar.jar) |

{% top %}

Table Sources
-------------

Sources are defined using a set of [YAML properties](http://yaml.org/). Similar to a SQL `CREATE TABLE` statement, you define the name of the table, the final schema of the table, a connector, and a data format if necessary. Additionally, you have to specify its type (source, sink, or both).

{% highlight yaml %}
name: MyTable     # required; string representing the table name
type: source      # required; currently only 'source' is supported
schema: ...       # required; final table schema
connector: ...    # required; connector configuration
format: ...       # optional; format that depends on the connector
{% endhighlight %}

--- End diff --

Put it in a banner using any of the syntaxes in the above comments.


> Add SQL Client documentation page
> ---------------------------------
>
>                 Key: FLINK-9181
>                 URL: https://issues.apache.org/jira/browse/FLINK-9181
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Table API & SQL
>            Reporter: Timo Walther
>            Assignee: Timo Walther
>            Priority: Major
>
> The current implementation of the SQL Client needs documentation for the upcoming 1.5 release.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)