[ https://issues.apache.org/jira/browse/FLINK-9181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16456345#comment-16456345 ]
ASF GitHub Bot commented on FLINK-9181:
---------------------------------------

Github user kl0u commented on a diff in the pull request:

    https://github.com/apache/flink/pull/5913#discussion_r184678478

--- Diff: docs/dev/table/sqlClient.md ---
@@ -0,0 +1,538 @@
---
title: "SQL Client"
nav-parent_id: tableapi
nav-pos: 100
is_beta: true
---
<!--
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements.  See the NOTICE file
distributed with this work for additional information
regarding copyright ownership.  The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License.  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied.  See the License for the
specific language governing permissions and limitations
under the License.
-->

Flink's Table & SQL API makes it possible to declare queries in the SQL language. However, such a query needs to be embedded within a table program that is written in either Java or Scala, and the table program needs to be packaged with a build tool before it can be submitted to a cluster. This limits the usage of Flink mostly to Java/Scala programmers.

The *SQL Client* aims to provide an easy way of writing, debugging, and submitting table programs to a Flink cluster without a single line of code. The *SQL Client CLI* allows for retrieving and visualizing real-time results from the running distributed application on the command line.

<a href="{{ site.baseurl }}/fig/sql_client_demo.gif"><img class="offset" src="{{ site.baseurl }}/fig/sql_client_demo.gif" alt="Animated demo of the Flink SQL Client CLI running table programs on a cluster" width="80%" /></a>

**Note:** The SQL Client is in an early development phase. Even though the application is not production-ready yet, it can be quite a useful tool for prototyping and playing around with Flink SQL. In the future, the community plans to extend its functionality by providing a REST-based [SQL Client Gateway](sqlClient.html#limitations--future).

* This will be replaced by the TOC
{:toc}

Getting Started
---------------

This section describes how to set up and run your first Flink SQL program from the command line. The SQL Client is bundled in the regular Flink distribution and thus runnable out of the box.

The SQL Client requires a running Flink cluster to which table programs can be submitted. For more information about setting up a Flink cluster, see the [deployment part of this documentation]({{ site.baseurl }}/ops/deployment/cluster_setup.html). If you simply want to try out the SQL Client, you can also start a local cluster with one worker using the following command:

{% highlight bash %}
./bin/start-cluster.sh
{% endhighlight %}

### Starting the SQL Client CLI

The SQL Client scripts are also located in the binary directory of Flink. You can start the CLI by calling:

{% highlight bash %}
./bin/sql-client.sh embedded
{% endhighlight %}

This command starts the submission service and CLI embedded in one application process. By default, the SQL Client reads its configuration from the environment file located in `./conf/sql-client-defaults.yaml`. See the [next part](sqlClient.html#environment-files) for more information about the structure of environment files.
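The CLI can also be started with a custom configuration using the optional flags described in the [Configuration](sqlClient.html#configuration) section below. As a minimal sketch, assuming a custom environment file at the hypothetical path `conf/my-defaults.yaml`:

{% highlight bash %}
# Start the CLI with a custom defaults file and an explicit session name.
# Both the file path and the session name are placeholders for this example.
./bin/sql-client.sh embedded \
  --defaults conf/my-defaults.yaml \
  --session my_session
{% endhighlight %}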
### Running SQL Queries

Once the CLI has been started, you can use the `HELP` command to list all available SQL statements. For validating your setup and cluster connection, you can enter your first SQL query and press the `Enter` key to execute it:

{% highlight sql %}
SELECT 'Hello World'
{% endhighlight %}

This query requires no table source and produces a single-row result. The CLI will retrieve results from the cluster and visualize them. You can close the result view by pressing the `Q` key.

The CLI supports **two modes** for maintaining and visualizing results.

The *table mode* materializes results in memory and visualizes them in a regular, paginated table representation. It can be enabled by executing the following command in the CLI:

{% highlight text %}
SET execution.result-mode=table
{% endhighlight %}

The *changelog mode* does not materialize results; it visualizes the result stream that is produced by a continuous query [LINK] as a stream of insertions (`+`) and retractions (`-`):

{% highlight text %}
SET execution.result-mode=changelog
{% endhighlight %}

You can use the following query to see both result modes in action:

{% highlight sql %}
SELECT name, COUNT(*) AS cnt FROM (VALUES ('Bob'), ('Alice'), ('Greg'), ('Bob')) AS NameTable(name) GROUP BY name
{% endhighlight %}

This query performs a bounded word count example. The following sections explain how to read from table sources and configure other table program properties.
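To give an intuition for the difference between the two modes, the changelog mode might render the query above roughly as follows. This is an illustrative sketch of the insert/retract stream rather than the exact CLI output: when the second `'Bob'` row arrives, the previous count for `Bob` is retracted and the updated count is inserted.

{% highlight text %}
+ Bob, 1
+ Alice, 1
+ Greg, 1
- Bob, 1
+ Bob, 2
{% endhighlight %}

The table mode, in contrast, materializes these updates and only shows the current count per name.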
{% top %}

Configuration
-------------

The SQL Client can be started with the following optional command-line options. They are discussed in detail in the subsequent paragraphs.

{% highlight text %}
./bin/sql-client.sh embedded --help

Mode "embedded" submits Flink jobs from the local machine.

  Syntax: embedded [OPTIONS]
  "embedded" mode options:
     -d,--defaults <environment file>      The environment properties with which
                                           every new session is initialized.
                                           Properties might be overwritten by
                                           session properties.
     -e,--environment <environment file>   The environment properties to be
                                           imported into the session. It might
                                           overwrite default environment
                                           properties.
     -h,--help                             Show the help message with
                                           descriptions of all options.
     -j,--jar <JAR file>                   A JAR file to be imported into the
                                           session. The file might contain
                                           user-defined classes needed for the
                                           execution of statements such as
                                           functions, table sources, or sinks.
                                           Can be used multiple times.
     -l,--library <JAR directory>          A JAR file directory with which every
                                           new session is initialized. The files
                                           might contain user-defined classes
                                           needed for the execution of
                                           statements such as functions, table
                                           sources, or sinks. Can be used
                                           multiple times.
     -s,--session <session identifier>     The identifier for a session.
                                           'default' is the default identifier.
{% endhighlight %}

{% top %}

### Environment Files

A SQL query needs a configuration environment in which it is executed. The so-called *environment files* define available table sources and sinks, external catalogs, user-defined functions, and other properties required for execution and deployment.

Every environment file is a regular [YAML file](http://yaml.org/) that looks similar to the following example. The file defines an environment with a table source `MyTableName` that reads from a CSV file. Queries that are executed in this environment will, among other properties, have a parallelism of 1, an event-time characteristic, and run in the `table` result mode.

{% highlight yaml %}
# Define table sources and sinks here.

tables:
  - name: MyTableName
    type: source
    schema:
      - name: MyField1
        type: INT
      - name: MyField2
        type: VARCHAR
    connector:
      type: filesystem
      path: "/path/to/something.csv"
    format:
      type: csv
      fields:
        - name: MyField1
          type: INT
        - name: MyField2
          type: VARCHAR
      line-delimiter: "\n"
      comment-prefix: "#"

# Execution properties allow for changing the behavior of a table program.

execution:
  type: streaming
  time-characteristic: event-time
  parallelism: 1
  max-parallelism: 16
  min-idle-state-retention: 0
  max-idle-state-retention: 0
  result-mode: table

# Deployment properties allow for describing the cluster to which table programs are submitted.

deployment:
  response-timeout: 5000
{% endhighlight %}

Environment files can be created for general purposes (*defaults environment file* using `--defaults`) as well as on a per-session basis (*session environment file* using `--environment`). Every CLI session is initialized with the default properties followed by the session properties. Both default and session environment files can be passed when starting the CLI application. If no default environment file has been specified, the SQL Client searches for `./conf/sql-client-defaults.yaml` in Flink's configuration directory.

Properties that have been set within a CLI session (e.g. using the `SET` command) have the highest precedence:

{% highlight text %}
CLI commands > session environment file > defaults environment file
{% endhighlight %}

{% top %}

### Dependencies

The SQL Client does not require setting up a Java project using Maven or SBT. Instead, you can pass the dependencies as regular JAR files that get submitted to the cluster. You can either specify each JAR file separately (using `--jar`) or define entire library directories (using `--library`). For connectors to external systems (such as Apache Kafka) and corresponding data formats (such as JSON), Flink provides **ready-to-use JAR bundles**. These JAR files are suffixed with `sql-jar` and can be downloaded for each release from the Maven central repository.
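As a sketch of how such dependencies could be passed when starting the CLI, assuming a downloaded format JAR and a directory of connector JARs at hypothetical paths:

{% highlight bash %}
# Start the CLI with one explicit SQL JAR and a directory of further JARs.
# Both paths below are placeholders for this example.
./bin/sql-client.sh embedded \
  --jar ./sql-jars/flink-json-sql-jar.jar \
  --library ./sql-jars/connectors
{% endhighlight %}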
#### Connectors

| Name              | Version       | Download                |
| :---------------- | :------------ | :---------------------- |
| Filesystem        |               | Built-in                |
| Apache Kafka      | 0.8           | [Download](http://central.maven.org/maven2/org/apache/flink/flink-connector-kafka-0.8{{site.scala_version_suffix}}/{{site.version}}/flink-connector-kafka-0.8{{site.scala_version_suffix}}-{{site.version}}-sql-jar.jar) |
| Apache Kafka      | 0.9           | [Download](http://central.maven.org/maven2/org/apache/flink/flink-connector-kafka-0.9{{site.scala_version_suffix}}/{{site.version}}/flink-connector-kafka-0.9{{site.scala_version_suffix}}-{{site.version}}-sql-jar.jar) |
| Apache Kafka      | 0.10          | [Download](http://central.maven.org/maven2/org/apache/flink/flink-connector-kafka-0.10{{site.scala_version_suffix}}/{{site.version}}/flink-connector-kafka-0.10{{site.scala_version_suffix}}-{{site.version}}-sql-jar.jar) |
| Apache Kafka      | 0.11          | [Download](http://central.maven.org/maven2/org/apache/flink/flink-connector-kafka-0.11{{site.scala_version_suffix}}/{{site.version}}/flink-connector-kafka-0.11{{site.scala_version_suffix}}-{{site.version}}-sql-jar.jar) |

#### Formats

| Name              | Download                |
| :---------------- | :---------------------- |
| CSV               | Built-in                |
| JSON              | [Download](http://central.maven.org/maven2/org/apache/flink/flink-json/{{site.version}}/flink-json-{{site.version}}-sql-jar.jar) |

{% top %}

Table Sources
-------------

Sources are defined using a set of [YAML properties](http://yaml.org/). Similar to a SQL `CREATE TABLE` statement, you define the name of the table, the final schema of the table, a connector, and a data format if necessary. Additionally, you have to specify its type (source, sink, or both).

{% highlight yaml %}
name: MyTable     # required; string representing the table name
type: source      # required; currently only 'source' is supported
schema: ...       # required; final table schema
connector: ...    # required; connector configuration
format: ...       # optional; format that depends on the connector
{% endhighlight %}

--- End diff --

Put it in a banner using any of the syntaxes in the above comments.


> Add SQL Client documentation page
> ---------------------------------
>
>                 Key: FLINK-9181
>                 URL: https://issues.apache.org/jira/browse/FLINK-9181
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Table API & SQL
>            Reporter: Timo Walther
>            Assignee: Timo Walther
>            Priority: Major
>
> The current implementation of the SQL Client needs documentation for the upcoming 1.5 release.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)