This is an automated email from the ASF dual-hosted git repository.

jark pushed a commit to branch release-1.11
in repository https://gitbox.apache.org/repos/asf/flink.git
commit 4581d543b5d4d21e0dff9ea9271f384f05adac30 Author: Jark Wu <j...@apache.org> AuthorDate: Fri Jun 12 00:31:37 2020 +0800 [hotfix][docs][connectors] Improve SQL connectors documentation --- docs/dev/table/connectors/elasticsearch.md | 6 +- docs/dev/table/connectors/elasticsearch.zh.md | 6 +- docs/dev/table/connectors/filesystem.md | 2 +- docs/dev/table/connectors/filesystem.zh.md | 2 +- docs/dev/table/connectors/formats/avro.md | 82 ++++++++++---------- docs/dev/table/connectors/formats/avro.zh.md | 84 ++++++++++----------- docs/dev/table/connectors/formats/csv.md | 103 +++++++++++++------------- docs/dev/table/connectors/formats/csv.zh.md | 103 +++++++++++++------------- docs/dev/table/connectors/formats/index.md | 14 ++-- docs/dev/table/connectors/formats/index.zh.md | 14 ++-- docs/dev/table/connectors/formats/json.md | 80 ++++++++++---------- docs/dev/table/connectors/formats/json.zh.md | 80 ++++++++++---------- docs/dev/table/connectors/hbase.md | 54 +++++++------- docs/dev/table/connectors/hbase.zh.md | 54 +++++++------- docs/dev/table/connectors/index.md | 18 ++--- docs/dev/table/connectors/index.zh.md | 24 +++--- docs/dev/table/connectors/jdbc.md | 6 +- docs/dev/table/connectors/jdbc.zh.md | 6 +- docs/dev/table/connectors/kafka.md | 70 ++++++++--------- docs/dev/table/connectors/kafka.zh.md | 70 ++++++++--------- 20 files changed, 442 insertions(+), 436 deletions(-) diff --git a/docs/dev/table/connectors/elasticsearch.md b/docs/dev/table/connectors/elasticsearch.md index 5ea587ab..d620d4e 100644 --- a/docs/dev/table/connectors/elasticsearch.md +++ b/docs/dev/table/connectors/elasticsearch.md @@ -2,7 +2,7 @@ title: "Elasticsearch SQL Connector" nav-title: Elasticsearch nav-parent_id: sql-connectors -nav-pos: 3 +nav-pos: 4 --- <!-- Licensed to the Apache Software Foundation (ASF) under one @@ -263,4 +263,6 @@ Data Type Mapping ---------------- Elasticsearch stores document in a JSON string. So the data type mapping is between Flink data type and JSON data type. -Flink uses built-in `'json'` format for Elasticsearch connector. Please refer to <a href="{% link dev/table/connectors/formats/index.md %}">JSON Format</a> page for more type mapping details. \ No newline at end of file +Flink uses built-in `'json'` format for Elasticsearch connector. Please refer to [JSON Format]({% link dev/table/connectors/formats/json.md %}) page for more type mapping details. + +{% top %} \ No newline at end of file diff --git a/docs/dev/table/connectors/elasticsearch.zh.md b/docs/dev/table/connectors/elasticsearch.zh.md index 5ea587ab..d620d4e 100644 --- a/docs/dev/table/connectors/elasticsearch.zh.md +++ b/docs/dev/table/connectors/elasticsearch.zh.md @@ -2,7 +2,7 @@ title: "Elasticsearch SQL Connector" nav-title: Elasticsearch nav-parent_id: sql-connectors -nav-pos: 3 +nav-pos: 4 --- <!-- Licensed to the Apache Software Foundation (ASF) under one @@ -263,4 +263,6 @@ Data Type Mapping ---------------- Elasticsearch stores document in a JSON string. So the data type mapping is between Flink data type and JSON data type. -Flink uses built-in `'json'` format for Elasticsearch connector. Please refer to <a href="{% link dev/table/connectors/formats/index.md %}">JSON Format</a> page for more type mapping details. \ No newline at end of file +Flink uses built-in `'json'` format for Elasticsearch connector. Please refer to [JSON Format]({% link dev/table/connectors/formats/json.md %}) page for more type mapping details. 
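As a hedged illustration of the sentence above, the following sketch shows an Elasticsearch sink table that relies on the built-in 'json' format; the table name, index, host, and columns are hypothetical, and only options listed on the Elasticsearch connector page are used.

{% highlight sql %}
-- Hypothetical Elasticsearch 7 sink table; documents are serialized with the built-in 'json' format.
CREATE TABLE users_index (
  user_id   STRING,
  user_name STRING,
  uv        BIGINT,
  PRIMARY KEY (user_id) NOT ENFORCED  -- used to derive the Elasticsearch document id
) WITH (
  'connector' = 'elasticsearch-7',
  'hosts' = 'http://localhost:9200',
  'index' = 'users',
  'format' = 'json'                   -- optional here, 'json' is already the default
);
{% endhighlight %}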
+ +{% top %} \ No newline at end of file diff --git a/docs/dev/table/connectors/filesystem.md b/docs/dev/table/connectors/filesystem.md index 5b98041..a39e76e 100644 --- a/docs/dev/table/connectors/filesystem.md +++ b/docs/dev/table/connectors/filesystem.md @@ -2,7 +2,7 @@ title: "FileSystem SQL Connector" nav-title: FileSystem nav-parent_id: sql-connectors -nav-pos: 3 +nav-pos: 5 --- <!-- Licensed to the Apache Software Foundation (ASF) under one diff --git a/docs/dev/table/connectors/filesystem.zh.md b/docs/dev/table/connectors/filesystem.zh.md index 5b98041..a39e76e 100644 --- a/docs/dev/table/connectors/filesystem.zh.md +++ b/docs/dev/table/connectors/filesystem.zh.md @@ -2,7 +2,7 @@ title: "FileSystem SQL Connector" nav-title: FileSystem nav-parent_id: sql-connectors -nav-pos: 3 +nav-pos: 5 --- <!-- Licensed to the Apache Software Foundation (ASF) under one diff --git a/docs/dev/table/connectors/formats/avro.md b/docs/dev/table/connectors/formats/avro.md index 93f758a..c03d98d 100644 --- a/docs/dev/table/connectors/formats/avro.md +++ b/docs/dev/table/connectors/formats/avro.md @@ -95,7 +95,7 @@ Format Options <td>required</td> <td style="word-wrap: break-word;">(none)</td> <td>String</td> - <td>Specify what format to use, here should be 'avro'.</td> + <td>Specify what format to use, here should be <code>'avro'</code>.</td> </tr> </tbody> </table> @@ -109,9 +109,9 @@ So the following table lists the type mapping from Flink type to Avro type. <table class="table table-bordered"> <thead> <tr> - <th class="text-left">Flink Data Type</th> - <th class="text-center">Avro type</th> - <th class="text-center">Avro logical type</th> + <th class="text-left">Flink SQL type</th> + <th class="text-left">Avro type</th> + <th class="text-left">Avro logical type</th> </tr> </thead> <tbody> @@ -121,85 +121,85 @@ So the following table lists the type mapping from Flink type to Avro type. 
<td></td> </tr> <tr> - <td>BOOLEAN</td> - <td>boolean</td> + <td><code>BOOLEAN</code></td> + <td><code>boolean</code></td> <td></td> </tr> <tr> - <td>BINARY / VARBINARY</td> - <td>bytes</td> + <td><code>BINARY / VARBINARY</code></td> + <td><code>bytes</code></td> <td></td> </tr> <tr> - <td>DECIMAL</td> - <td>fixed</td> - <td>decimal</td> + <td><code>DECIMAL</code></td> + <td><code>fixed</code></td> + <td><code>decimal</code></td> </tr> <tr> - <td>TINYINT</td> - <td>int</td> + <td><code>TINYINT</code></td> + <td><code>int</code></td> <td></td> </tr> <tr> - <td>SMALLINT</td> - <td>int</td> + <td><code>SMALLINT</code></td> + <td><code>int</code></td> <td></td> </tr> <tr> - <td>INT</td> - <td>int</td> + <td><code>INT</code></td> + <td><code>int</code></td> <td></td> </tr> <tr> - <td>BIGINT</td> - <td>long</td> + <td><code>BIGINT</code></td> + <td><code>long</code></td> <td></td> </tr> <tr> - <td>FLOAT</td> - <td>float</td> + <td><code>FLOAT</code></td> + <td><code>float</code></td> <td></td> </tr> <tr> - <td>DOUBLE</td> - <td>double</td> + <td><code>DOUBLE</code></td> + <td><code>double</code></td> <td></td> </tr> <tr> - <td>DATE</td> - <td>int</td> - <td>date</td> + <td><code>DATE</code></td> + <td><code>int</code></td> + <td><code>date</code></td> </tr> <tr> - <td>TIME</td> - <td>int</td> - <td>time-millis</td> + <td><code>TIME</code></td> + <td><code>int</code></td> + <td><code>time-millis</code></td> </tr> <tr> - <td>TIMESTAMP</td> - <td>long</td> - <td>timestamp-millis</td> + <td><code>TIMESTAMP</code></td> + <td><code>long</code></td> + <td><code>timestamp-millis</code></td> </tr> <tr> - <td>ARRAY</td> - <td>array</td> + <td><code>ARRAY</code></td> + <td><code>array</code></td> <td></td> </tr> <tr> - <td>MAP<br> + <td><code>MAP</code><br> (key must be string/char/varchar type)</td> - <td>map</td> + <td><code>map</code></td> <td></td> </tr> <tr> - <td>MULTISET<br> + <td><code>MULTISET</code><br> (element must be string/char/varchar type)</td> - <td>map</td> + <td><code>map</code></td> <td></td> </tr> <tr> - <td>ROW</td> - <td>record</td> + <td><code>ROW</code></td> + <td><code>record</code></td> <td></td> </tr> </tbody> @@ -207,7 +207,7 @@ So the following table lists the type mapping from Flink type to Avro type. In addition to the types listed above, Flink supports reading/writing nullable types. Flink maps nullable types to Avro `union(something, null)`, where `something` is the Avro type converted from Flink type. -You can refer to Avro Specification for more information about Avro types: [https://avro.apache.org/docs/current/spec.html](https://avro.apache.org/docs/current/spec.html). +You can refer to [Avro Specification](https://avro.apache.org/docs/current/spec.html) for more information about Avro types. diff --git a/docs/dev/table/connectors/formats/avro.zh.md b/docs/dev/table/connectors/formats/avro.zh.md index 01c9ace..c03d98d 100644 --- a/docs/dev/table/connectors/formats/avro.zh.md +++ b/docs/dev/table/connectors/formats/avro.zh.md @@ -38,7 +38,7 @@ In order to setup the Avro format, the following table provides dependency infor <div class="codetabs" markdown="1"> <div data-lang="SQL Client JAR" markdown="1"> -Avro format is part of the binary distribution, but requires additional [Hadoop dependency]({% link ops/deployment/hadoop.zh.md %}) for cluster execution. +Avro format is part of the binary distribution, but requires additional [Hadoop dependency]({% link ops/deployment/hadoop.md %}) for cluster execution. 
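The 'avro' format option documented above is easiest to see in a table definition. A minimal sketch, assuming a made-up filesystem path and schema, could look like this:

{% highlight sql %}
-- Hypothetical filesystem table that reads and writes Avro files.
CREATE TABLE users_avro (
  user_id       BIGINT,
  user_name     STRING,
  register_time TIMESTAMP(3)
) WITH (
  'connector' = 'filesystem',
  'path' = 'file:///tmp/users',
  'format' = 'avro'
);
{% endhighlight %}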
</div> <div data-lang="Maven dependency" markdown="1"> {% highlight xml %} @@ -95,7 +95,7 @@ Format Options <td>required</td> <td style="word-wrap: break-word;">(none)</td> <td>String</td> - <td>Specify what format to use, here should be 'avro'.</td> + <td>Specify what format to use, here should be <code>'avro'</code>.</td> </tr> </tbody> </table> @@ -109,9 +109,9 @@ So the following table lists the type mapping from Flink type to Avro type. <table class="table table-bordered"> <thead> <tr> - <th class="text-left">Flink Data Type</th> - <th class="text-center">Avro type</th> - <th class="text-center">Avro logical type</th> + <th class="text-left">Flink SQL type</th> + <th class="text-left">Avro type</th> + <th class="text-left">Avro logical type</th> </tr> </thead> <tbody> @@ -121,85 +121,85 @@ So the following table lists the type mapping from Flink type to Avro type. <td></td> </tr> <tr> - <td>BOOLEAN</td> - <td>boolean</td> + <td><code>BOOLEAN</code></td> + <td><code>boolean</code></td> <td></td> </tr> <tr> - <td>BINARY / VARBINARY</td> - <td>bytes</td> + <td><code>BINARY / VARBINARY</code></td> + <td><code>bytes</code></td> <td></td> </tr> <tr> - <td>DECIMAL</td> - <td>fixed</td> - <td>decimal</td> + <td><code>DECIMAL</code></td> + <td><code>fixed</code></td> + <td><code>decimal</code></td> </tr> <tr> - <td>TINYINT</td> - <td>int</td> + <td><code>TINYINT</code></td> + <td><code>int</code></td> <td></td> </tr> <tr> - <td>SMALLINT</td> - <td>int</td> + <td><code>SMALLINT</code></td> + <td><code>int</code></td> <td></td> </tr> <tr> - <td>INT</td> - <td>int</td> + <td><code>INT</code></td> + <td><code>int</code></td> <td></td> </tr> <tr> - <td>BIGINT</td> - <td>long</td> + <td><code>BIGINT</code></td> + <td><code>long</code></td> <td></td> </tr> <tr> - <td>FLOAT</td> - <td>float</td> + <td><code>FLOAT</code></td> + <td><code>float</code></td> <td></td> </tr> <tr> - <td>DOUBLE</td> - <td>double</td> + <td><code>DOUBLE</code></td> + <td><code>double</code></td> <td></td> </tr> <tr> - <td>DATE</td> - <td>int</td> - <td>date</td> + <td><code>DATE</code></td> + <td><code>int</code></td> + <td><code>date</code></td> </tr> <tr> - <td>TIME</td> - <td>int</td> - <td>time-millis</td> + <td><code>TIME</code></td> + <td><code>int</code></td> + <td><code>time-millis</code></td> </tr> <tr> - <td>TIMESTAMP</td> - <td>long</td> - <td>timestamp-millis</td> + <td><code>TIMESTAMP</code></td> + <td><code>long</code></td> + <td><code>timestamp-millis</code></td> </tr> <tr> - <td>ARRAY</td> - <td>array</td> + <td><code>ARRAY</code></td> + <td><code>array</code></td> <td></td> </tr> <tr> - <td>MAP<br> + <td><code>MAP</code><br> (key must be string/char/varchar type)</td> - <td>map</td> + <td><code>map</code></td> <td></td> </tr> <tr> - <td>MULTISET<br> + <td><code>MULTISET</code><br> (element must be string/char/varchar type)</td> - <td>map</td> + <td><code>map</code></td> <td></td> </tr> <tr> - <td>ROW</td> - <td>record</td> + <td><code>ROW</code></td> + <td><code>record</code></td> <td></td> </tr> </tbody> @@ -207,7 +207,7 @@ So the following table lists the type mapping from Flink type to Avro type. In addition to the types listed above, Flink supports reading/writing nullable types. Flink maps nullable types to Avro `union(something, null)`, where `something` is the Avro type converted from Flink type. -You can refer to Avro Specification for more information about Avro types: [https://avro.apache.org/docs/current/spec.html](https://avro.apache.org/docs/current/spec.html). 
+You can refer to [Avro Specification](https://avro.apache.org/docs/current/spec.html) for more information about Avro types. diff --git a/docs/dev/table/connectors/formats/csv.md b/docs/dev/table/connectors/formats/csv.md index 6154fbd..b34a1e8 100644 --- a/docs/dev/table/connectors/formats/csv.md +++ b/docs/dev/table/connectors/formats/csv.md @@ -86,57 +86,57 @@ Format Options <td>required</td> <td style="word-wrap: break-word;">(none)</td> <td>String</td> - <td>Specify what format to use, here should be 'csv'.</td> + <td>Specify what format to use, here should be <code>'csv'</code>.</td> </tr> <tr> <td><h5>csv.field-delimiter</h5></td> <td>optional</td> <td style="word-wrap: break-word;"><code>,</code></td> <td>String</td> - <td>Field delimiter character (',' by default).</td> + <td>Field delimiter character (<code>','</code> by default).</td> </tr> <tr> <td><h5>csv.line-delimiter</h5></td> <td>optional</td> <td style="word-wrap: break-word;"><code>\n</code></td> <td>String</td> - <td>Line delimiter ('\n' by default, otherwise - '\r' or '\r\n' are allowed), unicode is supported if - the delimiter is an invisible special character, - e.g. U&'\\000D' is the unicode representation of carriage return '\r' - e.g. U&'\\000A' is the unicode representation of line feed '\n'.</td> + <td>Line delimiter, <code>\n</code> by default. Note that <code>\n</code> and <code>\r</code> are invisible special characters, so you have to use unicode to specify them in plain SQL. + <ul> + <li>e.g. <code>'csv.line-delimiter' = U&'\\000D'</code> specifies to use carriage return <code>\r</code> as the line delimiter.</li> + <li>e.g. <code>'csv.line-delimiter' = U&'\\000A'</code> specifies to use line feed <code>\n</code> as the line delimiter.</li> + </ul> + </td> </tr> <tr> <td><h5>csv.disable-quote-character</h5></td> <td>optional</td> <td style="word-wrap: break-word;">false</td> <td>Boolean</td> - <td>Flag to disabled quote character for enclosing field values (false by default) - if true, quote-character can not be set.</td> + <td>Disable the quote character for enclosing field values (false by default). + If true, option <code>'csv.quote-character'</code> can not be set.</td> </tr> <tr> <td><h5>csv.quote-character</h5></td> <td>optional</td> <td style="word-wrap: break-word;"><code>"</code></td> <td>String</td> - <td>Quote character for enclosing field values ('"' by default).</td> + <td>Quote character for enclosing field values (<code>"</code> by default).</td> </tr> <tr> <td><h5>csv.allow-comments</h5></td> <td>optional</td> <td style="word-wrap: break-word;">false</td> <td>Boolean</td> - <td>Flag to ignore comment lines that start with '#' - (disabled by default); - if enabled, make sure to also ignore parse errors to allow empty rows.</td> + <td>Ignore comment lines that start with <code>'#'</code> (disabled by default). + If enabled, make sure to also ignore parse errors to allow empty rows.</td> </tr> <tr> <td><h5>csv.ignore-parse-errors</h5></td> <td>optional</td> <td style="word-wrap: break-word;">false</td> <td>Boolean</td> - <td>Flag to skip fields and rows with parse errors instead of failing; - fields are set to null in case of errors.</td> + <td>Skip fields and rows with parse errors instead of failing.
+ Fields are set to null in case of errors.</td> </tr> <tr> <td><h5>csv.array-element-delimiter</h5></td> @@ -144,7 +144,7 @@ Format Options <td style="word-wrap: break-word;"><code>;</code></td> <td>String</td> <td>Array element delimiter string for separating - array and row element values (';' by default).</td> + array and row element values (<code>';'</code> by default).</td> </tr> <tr> <td><h5>csv.escape-character</h5></td> @@ -158,8 +158,7 @@ Format Options <td>optional</td> <td style="word-wrap: break-word;">(none)</td> <td>String</td> - <td>Null literal string that is interpreted as a - null value (disabled by default).</td> + <td>Null literal string that is interpreted as a null value (disabled by default).</td> </tr> </tbody> </table> @@ -176,74 +175,74 @@ The following table lists the type mapping from Flink type to CSV type. <table class="table table-bordered"> <thead> <tr> - <th class="text-left">Flink Data Type</th> - <th class="text-center">CSV Data Type</th> + <th class="text-left">Flink SQL type</th> + <th class="text-left">CSV type</th> </tr> </thead> <tbody> <tr> - <td>CHAR / VARCHAR / STRING</td> - <td>string</td> + <td><code>CHAR / VARCHAR / STRING</code></td> + <td><code>string</code></td> </tr> <tr> - <td>BOOLEAN</td> - <td>boolean</td> + <td><code>BOOLEAN</code></td> + <td><code>boolean</code></td> </tr> <tr> - <td>BINARY / VARBINARY</td> - <td>string with encoding: base64</td> + <td><code>BINARY / VARBINARY</code></td> + <td><code>string with encoding: base64</code></td> </tr> <tr> - <td>DECIMAL</td> - <td>number</td> + <td><code>DECIMAL</code></td> + <td><code>number</code></td> </tr> <tr> - <td>TINYINT</td> - <td>number</td> + <td><code>TINYINT</code></td> + <td><code>number</code></td> </tr> <tr> - <td>SMALLINT</td> - <td>number</td> + <td><code>SMALLINT</code></td> + <td><code>number</code></td> </tr> <tr> - <td>INT</td> - <td>number</td> + <td><code>INT</code></td> + <td><code>number</code></td> </tr> <tr> - <td>BIGINT</td> - <td>number</td> + <td><code>BIGINT</code></td> + <td><code>number</code></td> </tr> <tr> - <td>FLOAT</td> - <td>number</td> + <td><code>FLOAT</code></td> + <td><code>number</code></td> </tr> <tr> - <td>DOUBLE</td> - <td>number</td> + <td><code>DOUBLE</code></td> + <td><code>number</code></td> </tr> <tr> - <td>DATE</td> - <td>string with format: date</td> + <td><code>DATE</code></td> + <td><code>string with format: date</code></td> </tr> <tr> - <td>TIME</td> - <td>string with format: time</td> + <td><code>TIME</code></td> + <td><code>string with format: time</code></td> </tr> <tr> - <td>TIMESTAMP</td> - <td>string with format: date-time</td> + <td><code>TIMESTAMP</code></td> + <td><code>string with format: date-time</code></td> </tr> <tr> - <td>INTERVAL</td> - <td>number</td> + <td><code>INTERVAL</code></td> + <td><code>number</code></td> </tr> <tr> - <td>ARRAY</td> - <td>array</td> + <td><code>ARRAY</code></td> + <td><code>array</code></td> </tr> <tr> - <td>ROW</td> - <td>object</td> + <td><code>ROW</code></td> + <td><code>object</code></td> </tr> </tbody> </table> diff --git a/docs/dev/table/connectors/formats/csv.zh.md b/docs/dev/table/connectors/formats/csv.zh.md index 6154fbd..b34a1e8 100644 --- a/docs/dev/table/connectors/formats/csv.zh.md +++ b/docs/dev/table/connectors/formats/csv.zh.md @@ -86,57 +86,57 @@ Format Options <td>required</td> <td style="word-wrap: break-word;">(none)</td> <td>String</td> - <td>Specify what format to use, here should be 'csv'.</td> + <td>Specify what format to use, here should be 
<code>'csv'</code>.</td> </tr> <tr> <td><h5>csv.field-delimiter</h5></td> <td>optional</td> <td style="word-wrap: break-word;"><code>,</code></td> <td>String</td> - <td>Field delimiter character (',' by default).</td> + <td>Field delimiter character (<code>','</code> by default).</td> </tr> <tr> <td><h5>csv.line-delimiter</h5></td> <td>optional</td> <td style="word-wrap: break-word;"><code>\n</code></td> <td>String</td> - <td>Line delimiter ('\n' by default, otherwise - '\r' or '\r\n' are allowed), unicode is supported if - the delimiter is an invisible special character, - e.g. U&'\\000D' is the unicode representation of carriage return '\r' - e.g. U&'\\000A' is the unicode representation of line feed '\n'.</td> + <td>Line delimiter, <code>\n</code> by default. Note the <code>\n</code> and <code>\r</code> are invisible special characters, you have to use unicode to specify them in plain SQL. + <ul> + <li>e.g. <code>'csv.line-delimiter' = U&'\\000D'</code> specifies the to use carriage return <code>\r</code> as line delimiter.</li> + <li>e.g. <code>'csv.line-delimiter' = U&'\\000A'</code> specifies the to use line feed <code>\n</code> as line delimiter.</li> + </ul> + </td> </tr> <tr> <td><h5>csv.disable-quote-character</h5></td> <td>optional</td> <td style="word-wrap: break-word;">false</td> <td>Boolean</td> - <td>Flag to disabled quote character for enclosing field values (false by default) - if true, quote-character can not be set.</td> + <td>Disabled quote character for enclosing field values (false by default). + If true, option <code>'csv.quote-character'</code> must be set.</td> </tr> <tr> <td><h5>csv.quote-character</h5></td> <td>optional</td> <td style="word-wrap: break-word;"><code>"</code></td> <td>String</td> - <td>Quote character for enclosing field values ('"' by default).</td> + <td>Quote character for enclosing field values (<code>"</code> by default).</td> </tr> <tr> <td><h5>csv.allow-comments</h5></td> <td>optional</td> <td style="word-wrap: break-word;">false</td> <td>Boolean</td> - <td>Flag to ignore comment lines that start with '#' - (disabled by default); - if enabled, make sure to also ignore parse errors to allow empty rows.</td> + <td>Ignore comment lines that start with <code>'#'</code> (disabled by default). + If enabled, make sure to also ignore parse errors to allow empty rows.</td> </tr> <tr> <td><h5>csv.ignore-parse-errors</h5></td> <td>optional</td> <td style="word-wrap: break-word;">false</td> <td>Boolean</td> - <td>Flag to skip fields and rows with parse errors instead of failing; - fields are set to null in case of errors.</td> + <td>Skip fields and rows with parse errors instead of failing. + Fields are set to null in case of errors.</td> </tr> <tr> <td><h5>csv.array-element-delimiter</h5></td> @@ -144,7 +144,7 @@ Format Options <td style="word-wrap: break-word;"><code>;</code></td> <td>String</td> <td>Array element delimiter string for separating - array and row element values (';' by default).</td> + array and row element values (<code>';'</code> by default).</td> </tr> <tr> <td><h5>csv.escape-character</h5></td> @@ -158,8 +158,7 @@ Format Options <td>optional</td> <td style="word-wrap: break-word;">(none)</td> <td>String</td> - <td>Null literal string that is interpreted as a - null value (disabled by default).</td> + <td>Null literal string that is interpreted as a null value (disabled by default).</td> </tr> </tbody> </table> @@ -176,74 +175,74 @@ The following table lists the type mapping from Flink type to CSV type. 
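Several of the csv.* options described above are easier to read in context. A minimal sketch, with a hypothetical filesystem path and schema, might look like this; the U&'\\000D' literal follows the notation used in the option description above.

{% highlight sql %}
-- Hypothetical filesystem table demonstrating a few of the csv.* options.
CREATE TABLE csv_input (
  id   INT,
  name STRING
) WITH (
  'connector' = 'filesystem',
  'path' = 'file:///tmp/input',
  'format' = 'csv',
  'csv.field-delimiter' = ';',
  'csv.line-delimiter' = U&'\\000D',  -- carriage return \r, written as a SQL Unicode literal
  'csv.allow-comments' = 'true',
  'csv.ignore-parse-errors' = 'true'  -- recommended together with allow-comments
);
{% endhighlight %}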
<table class="table table-bordered"> <thead> <tr> - <th class="text-left">Flink Data Type</th> - <th class="text-center">CSV Data Type</th> + <th class="text-left">Flink SQL type</th> + <th class="text-left">CSV type</th> </tr> </thead> <tbody> <tr> - <td>CHAR / VARCHAR / STRING</td> - <td>string</td> + <td><code>CHAR / VARCHAR / STRING</code></td> + <td><code>string</code></td> </tr> <tr> - <td>BOOLEAN</td> - <td>boolean</td> + <td><code>BOOLEAN</code></td> + <td><code>boolean</code></td> </tr> <tr> - <td>BINARY / VARBINARY</td> - <td>string with encoding: base64</td> + <td><code>BINARY / VARBINARY</code></td> + <td><code>string with encoding: base64</code></td> </tr> <tr> - <td>DECIMAL</td> - <td>number</td> + <td><code>DECIMAL</code></td> + <td><code>number</code></td> </tr> <tr> - <td>TINYINT</td> - <td>number</td> + <td><code>TINYINT</code></td> + <td><code>number</code></td> </tr> <tr> - <td>SMALLINT</td> - <td>number</td> + <td><code>SMALLINT</code></td> + <td><code>number</code></td> </tr> <tr> - <td>INT</td> - <td>number</td> + <td><code>INT</code></td> + <td><code>number</code></td> </tr> <tr> - <td>BIGINT</td> - <td>number</td> + <td><code>BIGINT</code></td> + <td><code>number</code></td> </tr> <tr> - <td>FLOAT</td> - <td>number</td> + <td><code>FLOAT</code></td> + <td><code>number</code></td> </tr> <tr> - <td>DOUBLE</td> - <td>number</td> + <td><code>DOUBLE</code></td> + <td><code>number</code></td> </tr> <tr> - <td>DATE</td> - <td>string with format: date</td> + <td><code>DATE</code></td> + <td><code>string with format: date</code></td> </tr> <tr> - <td>TIME</td> - <td>string with format: time</td> + <td><code>TIME</code></td> + <td><code>string with format: time</code></td> </tr> <tr> - <td>TIMESTAMP</td> - <td>string with format: date-time</td> + <td><code>TIMESTAMP</code></td> + <td><code>string with format: date-time</code></td> </tr> <tr> - <td>INTERVAL</td> - <td>number</td> + <td><code>INTERVAL</code></td> + <td><code>number</code></td> </tr> <tr> - <td>ARRAY</td> - <td>array</td> + <td><code>ARRAY</code></td> + <td><code>array</code></td> </tr> <tr> - <td>ROW</td> - <td>object</td> + <td><code>ROW</code></td> + <td><code>object</code></td> </tr> </tbody> </table> diff --git a/docs/dev/table/connectors/formats/index.md b/docs/dev/table/connectors/formats/index.md index 9209166..6ab144d 100644 --- a/docs/dev/table/connectors/formats/index.md +++ b/docs/dev/table/connectors/formats/index.md @@ -37,28 +37,28 @@ Flink supports the following formats: </thead> <tbody> <tr> - <td><a href="{{ site.baseurl }}/dev/table/connectors/formats/csv.html">CSV</a></td> - <td>Apache Kafka, + <td><a href="{% link dev/table/connectors/formats/csv.md %}">CSV</a></td> + <td><a href="{% link dev/table/connectors/kafka.md %}">Apache Kafka</a>, <a href="{% link dev/table/connectors/filesystem.md %}">Filesystem</a></td> </tr> <tr> - <td><a href="{{ site.baseurl }}/dev/table/connectors/formats/json.html">JSON</a></td> - <td>Apache Kafka, + <td><a href="{% link dev/table/connectors/formats/json.md %}">JSON</a></td> + <td><a href="{% link dev/table/connectors/kafka.md %}">Apache Kafka</a>, <a href="{% link dev/table/connectors/filesystem.md %}">Filesystem</a>, <a href="{% link dev/table/connectors/elasticsearch.md %}">Elasticsearch</a></td> </tr> <tr> <td><a href="{% link dev/table/connectors/formats/avro.md %}">Apache Avro</a></td> - <td>Apache Kafka, + <td><a href="{% link dev/table/connectors/kafka.md %}">Apache Kafka</a>, <a href="{% link dev/table/connectors/filesystem.md 
%}">Filesystem</a></td> </tr> <tr> <td>Debezium JSON</td> - <td>Apache Kafka</td> + <td><a href="{% link dev/table/connectors/kafka.md %}">Apache Kafka</a></td> </tr> <tr> <td>Canal JSON</td> - <td>Apache Kafka</td> + <td><a href="{% link dev/table/connectors/kafka.md %}">Apache Kafka</a></td> </tr> <tr> <td>Apache Parquet</td> diff --git a/docs/dev/table/connectors/formats/index.zh.md b/docs/dev/table/connectors/formats/index.zh.md index 35f073b..6aef539 100644 --- a/docs/dev/table/connectors/formats/index.zh.md +++ b/docs/dev/table/connectors/formats/index.zh.md @@ -37,28 +37,28 @@ Flink supports the following formats: </thead> <tbody> <tr> - <td><a href="{{ site.baseurl }}/dev/table/connectors/formats/csv.html">CSV</a></td> - <td>Apache Kafka, + <td><a href="{% link dev/table/connectors/formats/csv.zh.md %}">CSV</a></td> + <td><a href="{% link dev/table/connectors/kafka.zh.md %}">Apache Kafka</a>, <a href="{% link dev/table/connectors/filesystem.zh.md %}">Filesystem</a></td> </tr> <tr> - <td><a href="{{ site.baseurl }}/dev/table/connectors/formats/json.html">JSON</a></td> - <td>Apache Kafka, + <td><a href="{% link dev/table/connectors/formats/json.zh.md %}">JSON</a></td> + <td><a href="{% link dev/table/connectors/kafka.zh.md %}">Apache Kafka</a>, <a href="{% link dev/table/connectors/filesystem.zh.md %}">Filesystem</a>, <a href="{% link dev/table/connectors/elasticsearch.zh.md %}">Elasticsearch</a></td> </tr> <tr> <td><a href="{% link dev/table/connectors/formats/avro.zh.md %}">Apache Avro</a></td> - <td>Apache Kafka, + <td><a href="{% link dev/table/connectors/kafka.zh.md %}">Apache Kafka</a>, <a href="{% link dev/table/connectors/filesystem.zh.md %}">Filesystem</a></td> </tr> <tr> <td>Debezium JSON</td> - <td>Apache Kafka</td> + <td><a href="{% link dev/table/connectors/kafka.zh.md %}">Apache Kafka</a></td> </tr> <tr> <td>Canal JSON</td> - <td>Apache Kafka</td> + <td><a href="{% link dev/table/connectors/kafka.zh.md %}">Apache Kafka</a></td> </tr> <tr> <td>Apache Parquet</td> diff --git a/docs/dev/table/connectors/formats/json.md b/docs/dev/table/connectors/formats/json.md index 892f30c..e4a0df9 100644 --- a/docs/dev/table/connectors/formats/json.md +++ b/docs/dev/table/connectors/formats/json.md @@ -86,22 +86,22 @@ Format Options <td>required</td> <td style="word-wrap: break-word;">(none)</td> <td>String</td> - <td>Specify what format to use, here should be 'json'.</td> + <td>Specify what format to use, here should be <code>'json'</code>.</td> </tr> <tr> <td><h5>json.fail-on-missing-field</h5></td> <td>optional</td> <td style="word-wrap: break-word;">false</td> <td>Boolean</td> - <td>Flag to specify whether to fail if a field is missing or not, false by default.</td> + <td>Whether to fail if a field is missing or not.</td> </tr> <tr> <td><h5>json.ignore-parse-errors</h5></td> <td>optional</td> <td style="word-wrap: break-word;">false</td> <td>Boolean</td> - <td>Flag to skip fields and rows with parse errors instead of failing; - fields are set to null in case of errors, false by default.</td> + <td>Skip fields and rows with parse errors instead of failing. + Fields are set to null in case of errors.</td> </tr> </tbody> </table> @@ -118,78 +118,78 @@ The following table lists the type mapping from Flink type to JSON type. 
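Filesystem</a></td>">
The two json.* flags documented above are typically set together on a source table. A minimal sketch, with a hypothetical Kafka topic, group id, and schema, might look like this:

{% highlight sql %}
-- Hypothetical Kafka source table showing the json.* error-handling options.
CREATE TABLE events_json (
  user_id    BIGINT,
  event_type STRING,
  ts         TIMESTAMP(3)
) WITH (
  'connector' = 'kafka',
  'topic' = 'events',
  'properties.bootstrap.servers' = 'localhost:9092',
  'properties.group.id' = 'doc-example',
  'format' = 'json',
  'json.fail-on-missing-field' = 'false',
  'json.ignore-parse-errors' = 'true'  -- malformed fields or rows become NULL instead of failing the job
);
{% endhighlight %}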
<table class="table table-bordered"> <thead> <tr> - <th class="text-left">Flink Data Type</th> - <th class="text-center">JSON Data Type</th> + <th class="text-left">Flink SQL type</th> + <th class="text-left">JSON type</th> </tr> </thead> <tbody> <tr> - <td>CHAR / VARCHAR / STRING</td> - <td>string</td> + <td><code>CHAR / VARCHAR / STRING</code></td> + <td><code>string</code></td> </tr> <tr> - <td>BOOLEAN</td> - <td>boolean</td> + <td><code>BOOLEAN</code></td> + <td><code>boolean</code></td> </tr> <tr> - <td>BINARY / VARBINARY</td> - <td>string with encoding: base64</td> + <td><code>BINARY / VARBINARY</code></td> + <td><code>string with encoding: base64</code></td> </tr> <tr> - <td>DECIMAL</td> - <td>number</td> + <td><code>DECIMAL</code></td> + <td><code>number</code></td> </tr> <tr> - <td>TINYINT</td> - <td>number</td> + <td><code>TINYINT</code></td> + <td><code>number</code></td> </tr> <tr> - <td>SMALLINT</td> - <td>number</td> + <td><code>SMALLINT</code></td> + <td><code>number</code></td> </tr> <tr> - <td>INT</td> - <td>number</td> + <td><code>INT</code></td> + <td><code>number</code></td> </tr> <tr> - <td>BIGINT</td> - <td>number</td> + <td><code>BIGINT</code></td> + <td><code>number</code></td> </tr> <tr> - <td>FLOAT</td> - <td>number</td> + <td><code>FLOAT</code></td> + <td><code>number</code></td> </tr> <tr> - <td>DOUBLE</td> - <td>number</td> + <td><code>DOUBLE</code></td> + <td><code>number</code></td> </tr> <tr> - <td>DATE</td> - <td>string with format: date</td> + <td><code>DATE</code></td> + <td><code>string with format: date</code></td> </tr> <tr> - <td>TIME</td> - <td>string with format: time</td> + <td><code>TIME</code></td> + <td><code>string with format: time</code></td> </tr> <tr> - <td>TIMESTAMP</td> - <td>string with format: date-time</td> + <td><code>TIMESTAMP</code></td> + <td><code>string with format: date-time</code></td> </tr> <tr> - <td>INTERVAL</td> - <td>number</td> + <td><code>INTERVAL</code></td> + <td><code>number</code></td> </tr> <tr> - <td>ARRAY</td> - <td>array</td> + <td><code>ARRAY</code></td> + <td><code>array</code></td> </tr> <tr> - <td>MAP/MULTISET</td> - <td>object</td> + <td><code>MAP / MULTISET</code></td> + <td><code>object</code></td> </tr> <tr> - <td>ROW</td> - <td>object</td> + <td><code>ROW</code></td> + <td><code>object</code></td> </tr> </tbody> </table> diff --git a/docs/dev/table/connectors/formats/json.zh.md b/docs/dev/table/connectors/formats/json.zh.md index 892f30c..e4a0df9 100644 --- a/docs/dev/table/connectors/formats/json.zh.md +++ b/docs/dev/table/connectors/formats/json.zh.md @@ -86,22 +86,22 @@ Format Options <td>required</td> <td style="word-wrap: break-word;">(none)</td> <td>String</td> - <td>Specify what format to use, here should be 'json'.</td> + <td>Specify what format to use, here should be <code>'json'</code>.</td> </tr> <tr> <td><h5>json.fail-on-missing-field</h5></td> <td>optional</td> <td style="word-wrap: break-word;">false</td> <td>Boolean</td> - <td>Flag to specify whether to fail if a field is missing or not, false by default.</td> + <td>Whether to fail if a field is missing or not.</td> </tr> <tr> <td><h5>json.ignore-parse-errors</h5></td> <td>optional</td> <td style="word-wrap: break-word;">false</td> <td>Boolean</td> - <td>Flag to skip fields and rows with parse errors instead of failing; - fields are set to null in case of errors, false by default.</td> + <td>Skip fields and rows with parse errors instead of failing. 
+ Fields are set to null in case of errors.</td> </tr> </tbody> </table> @@ -118,78 +118,78 @@ The following table lists the type mapping from Flink type to JSON type. <table class="table table-bordered"> <thead> <tr> - <th class="text-left">Flink Data Type</th> - <th class="text-center">JSON Data Type</th> + <th class="text-left">Flink SQL type</th> + <th class="text-left">JSON type</th> </tr> </thead> <tbody> <tr> - <td>CHAR / VARCHAR / STRING</td> - <td>string</td> + <td><code>CHAR / VARCHAR / STRING</code></td> + <td><code>string</code></td> </tr> <tr> - <td>BOOLEAN</td> - <td>boolean</td> + <td><code>BOOLEAN</code></td> + <td><code>boolean</code></td> </tr> <tr> - <td>BINARY / VARBINARY</td> - <td>string with encoding: base64</td> + <td><code>BINARY / VARBINARY</code></td> + <td><code>string with encoding: base64</code></td> </tr> <tr> - <td>DECIMAL</td> - <td>number</td> + <td><code>DECIMAL</code></td> + <td><code>number</code></td> </tr> <tr> - <td>TINYINT</td> - <td>number</td> + <td><code>TINYINT</code></td> + <td><code>number</code></td> </tr> <tr> - <td>SMALLINT</td> - <td>number</td> + <td><code>SMALLINT</code></td> + <td><code>number</code></td> </tr> <tr> - <td>INT</td> - <td>number</td> + <td><code>INT</code></td> + <td><code>number</code></td> </tr> <tr> - <td>BIGINT</td> - <td>number</td> + <td><code>BIGINT</code></td> + <td><code>number</code></td> </tr> <tr> - <td>FLOAT</td> - <td>number</td> + <td><code>FLOAT</code></td> + <td><code>number</code></td> </tr> <tr> - <td>DOUBLE</td> - <td>number</td> + <td><code>DOUBLE</code></td> + <td><code>number</code></td> </tr> <tr> - <td>DATE</td> - <td>string with format: date</td> + <td><code>DATE</code></td> + <td><code>string with format: date</code></td> </tr> <tr> - <td>TIME</td> - <td>string with format: time</td> + <td><code>TIME</code></td> + <td><code>string with format: time</code></td> </tr> <tr> - <td>TIMESTAMP</td> - <td>string with format: date-time</td> + <td><code>TIMESTAMP</code></td> + <td><code>string with format: date-time</code></td> </tr> <tr> - <td>INTERVAL</td> - <td>number</td> + <td><code>INTERVAL</code></td> + <td><code>number</code></td> </tr> <tr> - <td>ARRAY</td> - <td>array</td> + <td><code>ARRAY</code></td> + <td><code>array</code></td> </tr> <tr> - <td>MAP/MULTISET</td> - <td>object</td> + <td><code>MAP / MULTISET</code></td> + <td><code>object</code></td> </tr> <tr> - <td>ROW</td> - <td>object</td> + <td><code>ROW</code></td> + <td><code>object</code></td> </tr> </tbody> </table> diff --git a/docs/dev/table/connectors/hbase.md b/docs/dev/table/connectors/hbase.md index 460904e..17508d7 100644 --- a/docs/dev/table/connectors/hbase.md +++ b/docs/dev/table/connectors/hbase.md @@ -2,7 +2,7 @@ title: "HBase SQL Connector" nav-title: HBase nav-parent_id: sql-connectors -nav-pos: 9 +nav-pos: 6 --- <!-- Licensed to the Apache Software Foundation (ASF) under one @@ -35,8 +35,6 @@ The HBase connector allows for reading from and writing to an HBase cluster. Thi HBase always works in upsert mode for exchange changelog messages with the external system using a primary key defined on the DDL. The primary key must be defined on the HBase rowkey field (rowkey field must be declared). If the PRIMARY KEY clause is not declared, the HBase connector will take rowkey as the primary key by default. -<span class="label label-danger">Attention</span> HBase as a Lookup Source does not use any cache, data is always queried directly through the HBase client. 
- Dependencies ------------ @@ -89,7 +87,7 @@ Connector Options <td>required</td> <td style="word-wrap: break-word;">(none)</td> <td>String</td> - <td>Specify what connector to use, here should be 'hbase-1.4'.</td> + <td>Specify what connector to use, here should be <code>'hbase-1.4'</code>.</td> </tr> <tr> <td><h5>table-name</h5></td> @@ -126,7 +124,7 @@ Connector Options <td>MemorySize</td> <td>Writing option, maximum size in memory of buffered rows for each writing request. This can improve performance for writing data to HBase database, but may increase the latency. - Can be set to '0' to disable it. + Can be set to <code>'0'</code> to disable it. </td> </tr> <tr> @@ -136,7 +134,7 @@ Connector Options <td>Integer</td> <td>Writing option, maximum number of rows to buffer for each writing request. This can improve performance for writing data to HBase database, but may increase the latency. - Can be set to '0' to disable it. + Can be set to <code>'0'</code> to disable it. </td> </tr> <tr> @@ -146,8 +144,8 @@ Connector Options <td>Duration</td> <td>Writing option, the interval to flush any buffered rows. This can improve performance for writing data to HBase database, but may increase the latency. - Can be set to '0' to disable it. Note, both 'sink.buffer-flush.max-size' and 'sink.buffer-flush.max-rows' - can be set to '0' with the flush interval set allowing for complete async processing of buffered actions. + Can be set to <code>'0'</code> to disable it. Note, both <code>'sink.buffer-flush.max-size'</code> and <code>'sink.buffer-flush.max-rows'</code> + can be set to <code>'0'</code> with the flush interval set allowing for complete async processing of buffered actions. </td> </tr> </tbody> @@ -169,13 +167,13 @@ The data type mappings are as follows: <table class="table table-bordered"> <thead> <tr> - <th class="text-left">Flink Data Type</th> - <th class="text-center">HBase conversion</th> + <th class="text-left">Flink SQL type</th> + <th class="text-left">HBase conversion</th> </tr> </thead> <tbody> <tr> - <td>CHAR / VARCHAR / STRING</td> + <td><code>CHAR / VARCHAR / STRING</code></td> <td> {% highlight java %} byte[] toBytes(String s) @@ -184,7 +182,7 @@ String toString(byte[] b) </td> </tr> <tr> - <td>BOOLEAN</td> + <td><code>BOOLEAN</code></td> <td> {% highlight java %} byte[] toBytes(boolean b) @@ -193,11 +191,11 @@ boolean toBoolean(byte[] b) </td> </tr> <tr> - <td>BINARY / VARBINARY</td> + <td><code>BINARY / VARBINARY</code></td> <td>Returns <code>byte[]</code> as is.</td> </tr> <tr> - <td>DECIMAL</td> + <td><code>DECIMAL</code></td> <td> {% highlight java %} byte[] toBytes(BigDecimal v) @@ -206,7 +204,7 @@ BigDecimal toBigDecimal(byte[] b) </td> </tr> <tr> - <td>TINYINT</td> + <td><code>TINYINT</code></td> <td> {% highlight java %} new byte[] { val } @@ -215,7 +213,7 @@ bytes[0] // returns first and only byte from bytes </td> </tr> <tr> - <td>SMALLINT</td> + <td><code>SMALLINT</code></td> <td> {% highlight java %} byte[] toBytes(short val) @@ -224,7 +222,7 @@ short toShort(byte[] bytes) </td> </tr> <tr> - <td>INT</td> + <td><code>INT</code></td> <td> {% highlight java %} byte[] toBytes(int val) @@ -233,7 +231,7 @@ int toInt(byte[] bytes) </td> </tr> <tr> - <td>BIGINT</td> + <td><code>BIGINT</code></td> <td> {% highlight java %} byte[] toBytes(long val) @@ -242,7 +240,7 @@ long toLong(byte[] bytes) </td> </tr> <tr> - <td>FLOAT</td> + <td><code>FLOAT</code></td> <td> {% highlight java %} byte[] toBytes(float val) @@ -251,7 +249,7 @@ float toFloat(byte[] bytes) </td> </tr> <tr> - 
<td>DOUBLE</td> + <td><code>DOUBLE</code></td> <td> {% highlight java %} byte[] toBytes(double val) @@ -260,28 +258,30 @@ double toDouble(byte[] bytes) </td> </tr> <tr> - <td>DATE</td> + <td><code>DATE</code></td> <td>Stores the number of days since epoch as int value.</td> </tr> <tr> - <td>TIME</td> + <td><code>TIME</code></td> <td>Stores the number of milliseconds of the day as int value.</td> </tr> <tr> - <td>TIMESTAMP</td> + <td><code>TIMESTAMP</code></td> <td>Stores the milliseconds since epoch as long value.</td> </tr> <tr> - <td>ARRAY</td> + <td><code>ARRAY</code></td> <td>Not supported</td> </tr> <tr> - <td>MAP / MULTISET</td> + <td><code>MAP / MULTISET<code></td> <td>Not supported</td> </tr> <tr> - <td>ROW</td> + <td><code>ROW</code></td> <td>Not supported</td> </tr> </tbody> -</table> \ No newline at end of file +</table> + +{% top %} \ No newline at end of file diff --git a/docs/dev/table/connectors/hbase.zh.md b/docs/dev/table/connectors/hbase.zh.md index 460904e..17508d7 100644 --- a/docs/dev/table/connectors/hbase.zh.md +++ b/docs/dev/table/connectors/hbase.zh.md @@ -2,7 +2,7 @@ title: "HBase SQL Connector" nav-title: HBase nav-parent_id: sql-connectors -nav-pos: 9 +nav-pos: 6 --- <!-- Licensed to the Apache Software Foundation (ASF) under one @@ -35,8 +35,6 @@ The HBase connector allows for reading from and writing to an HBase cluster. Thi HBase always works in upsert mode for exchange changelog messages with the external system using a primary key defined on the DDL. The primary key must be defined on the HBase rowkey field (rowkey field must be declared). If the PRIMARY KEY clause is not declared, the HBase connector will take rowkey as the primary key by default. -<span class="label label-danger">Attention</span> HBase as a Lookup Source does not use any cache, data is always queried directly through the HBase client. - Dependencies ------------ @@ -89,7 +87,7 @@ Connector Options <td>required</td> <td style="word-wrap: break-word;">(none)</td> <td>String</td> - <td>Specify what connector to use, here should be 'hbase-1.4'.</td> + <td>Specify what connector to use, here should be <code>'hbase-1.4'</code>.</td> </tr> <tr> <td><h5>table-name</h5></td> @@ -126,7 +124,7 @@ Connector Options <td>MemorySize</td> <td>Writing option, maximum size in memory of buffered rows for each writing request. This can improve performance for writing data to HBase database, but may increase the latency. - Can be set to '0' to disable it. + Can be set to <code>'0'</code> to disable it. </td> </tr> <tr> @@ -136,7 +134,7 @@ Connector Options <td>Integer</td> <td>Writing option, maximum number of rows to buffer for each writing request. This can improve performance for writing data to HBase database, but may increase the latency. - Can be set to '0' to disable it. + Can be set to <code>'0'</code> to disable it. </td> </tr> <tr> @@ -146,8 +144,8 @@ Connector Options <td>Duration</td> <td>Writing option, the interval to flush any buffered rows. This can improve performance for writing data to HBase database, but may increase the latency. - Can be set to '0' to disable it. Note, both 'sink.buffer-flush.max-size' and 'sink.buffer-flush.max-rows' - can be set to '0' with the flush interval set allowing for complete async processing of buffered actions. + Can be set to <code>'0'</code> to disable it. 
Note, both <code>'sink.buffer-flush.max-size'</code> and <code>'sink.buffer-flush.max-rows'</code> + can be set to <code>'0'</code> with the flush interval set allowing for complete async processing of buffered actions. </td> </tr> </tbody> @@ -169,13 +167,13 @@ The data type mappings are as follows: <table class="table table-bordered"> <thead> <tr> - <th class="text-left">Flink Data Type</th> - <th class="text-center">HBase conversion</th> + <th class="text-left">Flink SQL type</th> + <th class="text-left">HBase conversion</th> </tr> </thead> <tbody> <tr> - <td>CHAR / VARCHAR / STRING</td> + <td><code>CHAR / VARCHAR / STRING</code></td> <td> {% highlight java %} byte[] toBytes(String s) @@ -184,7 +182,7 @@ String toString(byte[] b) </td> </tr> <tr> - <td>BOOLEAN</td> + <td><code>BOOLEAN</code></td> <td> {% highlight java %} byte[] toBytes(boolean b) @@ -193,11 +191,11 @@ boolean toBoolean(byte[] b) </td> </tr> <tr> - <td>BINARY / VARBINARY</td> + <td><code>BINARY / VARBINARY</code></td> <td>Returns <code>byte[]</code> as is.</td> </tr> <tr> - <td>DECIMAL</td> + <td><code>DECIMAL</code></td> <td> {% highlight java %} byte[] toBytes(BigDecimal v) @@ -206,7 +204,7 @@ BigDecimal toBigDecimal(byte[] b) </td> </tr> <tr> - <td>TINYINT</td> + <td><code>TINYINT</code></td> <td> {% highlight java %} new byte[] { val } @@ -215,7 +213,7 @@ bytes[0] // returns first and only byte from bytes </td> </tr> <tr> - <td>SMALLINT</td> + <td><code>SMALLINT</code></td> <td> {% highlight java %} byte[] toBytes(short val) @@ -224,7 +222,7 @@ short toShort(byte[] bytes) </td> </tr> <tr> - <td>INT</td> + <td><code>INT</code></td> <td> {% highlight java %} byte[] toBytes(int val) @@ -233,7 +231,7 @@ int toInt(byte[] bytes) </td> </tr> <tr> - <td>BIGINT</td> + <td><code>BIGINT</code></td> <td> {% highlight java %} byte[] toBytes(long val) @@ -242,7 +240,7 @@ long toLong(byte[] bytes) </td> </tr> <tr> - <td>FLOAT</td> + <td><code>FLOAT</code></td> <td> {% highlight java %} byte[] toBytes(float val) @@ -251,7 +249,7 @@ float toFloat(byte[] bytes) </td> </tr> <tr> - <td>DOUBLE</td> + <td><code>DOUBLE</code></td> <td> {% highlight java %} byte[] toBytes(double val) @@ -260,28 +258,30 @@ double toDouble(byte[] bytes) </td> </tr> <tr> - <td>DATE</td> + <td><code>DATE</code></td> <td>Stores the number of days since epoch as int value.</td> </tr> <tr> - <td>TIME</td> + <td><code>TIME</code></td> <td>Stores the number of milliseconds of the day as int value.</td> </tr> <tr> - <td>TIMESTAMP</td> + <td><code>TIMESTAMP</code></td> <td>Stores the milliseconds since epoch as long value.</td> </tr> <tr> - <td>ARRAY</td> + <td><code>ARRAY</code></td> <td>Not supported</td> </tr> <tr> - <td>MAP / MULTISET</td> + <td><code>MAP / MULTISET<code></td> <td>Not supported</td> </tr> <tr> - <td>ROW</td> + <td><code>ROW</code></td> <td>Not supported</td> </tr> </tbody> -</table> \ No newline at end of file +</table> + +{% top %} \ No newline at end of file diff --git a/docs/dev/table/connectors/index.md b/docs/dev/table/connectors/index.md index 813cf5b..c7fbf6f 100644 --- a/docs/dev/table/connectors/index.md +++ b/docs/dev/table/connectors/index.md @@ -29,9 +29,9 @@ Flink's Table API & SQL programs can be connected to other external systems for This page describes how to register table sources and table sinks in Flink using the natively supported connectors. After a source or sink has been registered, it can be accessed by Table API & SQL statements. 
-<span class="label label-info">NOTE</span> If you want to implement your own *custom* table source or sink, have a look at the [user-defined sources & sinks page](sourceSinks.html). +<span class="label label-info">NOTE</span> If you want to implement your own *custom* table source or sink, have a look at the [user-defined sources & sinks page]({% link dev/table/sourceSinks.md %}). -<span class="label label-danger">Attention</span> Flink Table & SQL introduces a new set of connector options since 1.11.0, if you are using the legacy connector options, please refer to the [legacy documentation]({{ site.baseurl }}/dev/table/connect.html). +<span class="label label-danger">Attention</span> Flink Table & SQL introduces a new set of connector options since 1.11.0, if you are using the legacy connector options, please refer to the [legacy documentation]({% link dev/table/connect.md %}). * This will be replaced by the TOC {:toc} @@ -52,7 +52,7 @@ Flink natively support various connectors. The following tables list all availab </thead> <tbody> <tr> - <td><a href="{{ site.baseurl }}/dev/table/connectors/filesystem.html">Filesystem</a></td> + <td><a href="{% link dev/table/connectors/filesystem.md %}">Filesystem</a></td> <td></td> <td>Bounded and Unbounded Scan, Lookup</td> <td>Streaming Sink, Batch Sink</td> @@ -64,7 +64,7 @@ Flink natively support various connectors. The following tables list all availab <td>Streaming Sink, Batch Sink</td> </tr> <tr> - <td><a href="{{ site.baseurl }}/dev/table/connectors/kafka.html">Apache Kafka</a></td> + <td><a href="{% link dev/table/connectors/kafka.md %}">Apache Kafka</a></td> <td>0.10+</td> <td>Unbounded Scan</td> <td>Streaming Sink, Batch Sink</td> @@ -115,7 +115,7 @@ CREATE TABLE MyUserTable ( </div> </div> -In this way the desired connection properties are converted into string-based key-value pairs. So-called [table factories](sourceSinks.html#define-a-tablefactory) create configured table sources, table sinks, and corresponding formats from the key-value pairs. All table factories that can be found via Java's [Service Provider Interfaces (SPI)](https://docs.oracle.com/javase/tutorial/sound/SPI-intro.html) are taken into account when searching for exactly-one matching table factory. +In this way the desired connection properties are converted into string-based key-value pairs. So-called [table factories]({% link dev/table/sourceSinks.md %}#define-a-tablefactory) create configured table sources, table sinks, and corresponding formats from the key-value pairs. All table factories that can be found via Java's [Service Provider Interfaces (SPI)](https://docs.oracle.com/javase/tutorial/sound/SPI-intro.html) are taken into account when searching for exactly-one matching ta [...] If no factory can be found or multiple factories match for the given properties, an exception will be thrown with additional information about considered factories and supported properties. @@ -169,11 +169,11 @@ CREATE TABLE MyTable ( Time attributes are essential when working with unbounded streaming tables. Therefore both proctime and rowtime attributes can be defined as part of the schema. -For more information about time handling in Flink and especially event-time, we recommend the general [event-time section](streaming/time_attributes.html). +For more information about time handling in Flink and especially event-time, we recommend the general [event-time section]({% link dev/table/streaming/time_attributes.md %}). 
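Both kinds of time attributes mentioned above can be sketched in a single schema. The column names below are placeholders and, as in the surrounding examples, the connector options are elided:

{% highlight sql %}
CREATE TABLE MyTable (
  user_id   BIGINT,
  ts        TIMESTAMP(3),
  proc_time AS PROCTIME(),                      -- processing-time attribute via a computed column
  WATERMARK FOR ts AS ts - INTERVAL '5' SECOND  -- event-time attribute with a watermark strategy
) WITH (
  ...
);
{% endhighlight %}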
#### Proctime Attributes -In order to declare a proctime attribute in the schema, you can use [Computed Column syntax]({{ site.baseurl }}/dev/table/sql/create.html#create-table) to declare a computed column which is generated from `PROCTIME()` builtin function. +In order to declare a proctime attribute in the schema, you can use [Computed Column syntax]({% link dev/table/sql/create.md %}#create-table) to declare a computed column which is generated from `PROCTIME()` builtin function. The computed column is a virtual column which is not stored in the physical data. <div class="codetabs" markdown="1"> @@ -195,7 +195,7 @@ CREATE TABLE MyTable ( In order to control the event-time behavior for tables, Flink provides predefined timestamp extractors and watermark strategies. -Please refer to [CREATE TABLE statements](sql/create.html#create-table) for more information about defining time attributes in DDL. +Please refer to [CREATE TABLE statements]({% link dev/table/sql/create.md %}#create-table) for more information about defining time attributes in DDL. The following timestamp extractors are supported: @@ -263,6 +263,6 @@ Make sure to always declare both timestamps and watermarks. Watermarks are requi ### SQL Types -Please see the [Data Types](types.html) page about how to declare a type in SQL. +Please see the [Data Types]({% link dev/table/types.md %}) page about how to declare a type in SQL. {% top %} \ No newline at end of file diff --git a/docs/dev/table/connectors/index.zh.md b/docs/dev/table/connectors/index.zh.md index 7fa8383..c7fbf6f 100644 --- a/docs/dev/table/connectors/index.zh.md +++ b/docs/dev/table/connectors/index.zh.md @@ -29,9 +29,9 @@ Flink's Table API & SQL programs can be connected to other external systems for This page describes how to register table sources and table sinks in Flink using the natively supported connectors. After a source or sink has been registered, it can be accessed by Table API & SQL statements. -<span class="label label-info">NOTE</span> If you want to implement your own *custom* table source or sink, have a look at the [user-defined sources & sinks page](sourceSinks.html). +<span class="label label-info">NOTE</span> If you want to implement your own *custom* table source or sink, have a look at the [user-defined sources & sinks page]({% link dev/table/sourceSinks.md %}). -<span class="label label-danger">Attention</span> Flink Table & SQL introduces a new set of connector options since 1.11.0, if you are using the legacy connector options, please refer to the [legacy documentation]({{ site.baseurl }}/dev/table/connect.html). +<span class="label label-danger">Attention</span> Flink Table & SQL introduces a new set of connector options since 1.11.0, if you are using the legacy connector options, please refer to the [legacy documentation]({% link dev/table/connect.md %}). * This will be replaced by the TOC {:toc} @@ -52,31 +52,31 @@ Flink natively support various connectors. 
The following tables list all availab </thead> <tbody> <tr> - <td><a href="{{ site.baseurl }}/dev/table/connectors/filesystem.html">Filesystem</a></td> + <td><a href="{% link dev/table/connectors/filesystem.md %}">Filesystem</a></td> <td></td> <td>Bounded and Unbounded Scan, Lookup</td> <td>Streaming Sink, Batch Sink</td> </tr> <tr> - <td><a href="{% link dev/table/connectors/elasticsearch.zh.md %}">Elasticsearch</a></td> + <td><a href="{% link dev/table/connectors/elasticsearch.md %}">Elasticsearch</a></td> <td>6.x & 7.x</td> <td>Not supported</td> <td>Streaming Sink, Batch Sink</td> </tr> <tr> - <td>[Apache Kafka]({{ site.baseurl }}/dev/table/connectors/kafka.html)</td> + <td><a href="{% link dev/table/connectors/kafka.md %}">Apache Kafka</a></td> <td>0.10+</td> <td>Unbounded Scan</td> <td>Streaming Sink, Batch Sink</td> </tr> <tr> - <td><a href="{% link dev/table/connectors/jdbc.zh.md %}">JDBC</a></td> + <td><a href="{% link dev/table/connectors/jdbc.md %}">JDBC</a></td> <td></td> <td>Bounded Scan, Lookup</td> <td>Streaming Sink, Batch Sink</td> </tr> <tr> - <td><a href="{% link dev/table/connectors/hbase.zh.md %}">Apache HBase</a></td> + <td><a href="{% link dev/table/connectors/hbase.md %}">Apache HBase</a></td> <td>1.4.x</td> <td>Bounded Scan, Lookup</td> <td>Streaming Sink, Batch Sink</td> @@ -115,7 +115,7 @@ CREATE TABLE MyUserTable ( </div> </div> -In this way the desired connection properties are converted into string-based key-value pairs. So-called [table factories](sourceSinks.html#define-a-tablefactory) create configured table sources, table sinks, and corresponding formats from the key-value pairs. All table factories that can be found via Java's [Service Provider Interfaces (SPI)](https://docs.oracle.com/javase/tutorial/sound/SPI-intro.html) are taken into account when searching for exactly-one matching table factory. +In this way the desired connection properties are converted into string-based key-value pairs. So-called [table factories]({% link dev/table/sourceSinks.md %}#define-a-tablefactory) create configured table sources, table sinks, and corresponding formats from the key-value pairs. All table factories that can be found via Java's [Service Provider Interfaces (SPI)](https://docs.oracle.com/javase/tutorial/sound/SPI-intro.html) are taken into account when searching for exactly-one matching ta [...] If no factory can be found or multiple factories match for the given properties, an exception will be thrown with additional information about considered factories and supported properties. @@ -169,11 +169,11 @@ CREATE TABLE MyTable ( Time attributes are essential when working with unbounded streaming tables. Therefore both proctime and rowtime attributes can be defined as part of the schema. -For more information about time handling in Flink and especially event-time, we recommend the general [event-time section](streaming/time_attributes.html). +For more information about time handling in Flink and especially event-time, we recommend the general [event-time section]({% link dev/table/streaming/time_attributes.md %}). #### Proctime Attributes -In order to declare a proctime attribute in the schema, you can use [Computed Column syntax]({{ site.baseurl }}/dev/table/sql/create.html#create-table) to declare a computed column which is generated from `PROCTIME()` builtin function. 
+In order to declare a proctime attribute in the schema, you can use [Computed Column syntax]({% link dev/table/sql/create.md %}#create-table) to declare a computed column which is generated from `PROCTIME()` builtin function. The computed column is a virtual column which is not stored in the physical data. <div class="codetabs" markdown="1"> @@ -195,7 +195,7 @@ CREATE TABLE MyTable ( In order to control the event-time behavior for tables, Flink provides predefined timestamp extractors and watermark strategies. -Please refer to [CREATE TABLE statements](sql/create.html#create-table) for more information about defining time attributes in DDL. +Please refer to [CREATE TABLE statements]({% link dev/table/sql/create.md %}#create-table) for more information about defining time attributes in DDL. The following timestamp extractors are supported: @@ -263,6 +263,6 @@ Make sure to always declare both timestamps and watermarks. Watermarks are requi ### SQL Types -Please see the [Data Types](types.html) page about how to declare a type in SQL. +Please see the [Data Types]({% link dev/table/types.md %}) page about how to declare a type in SQL. {% top %} \ No newline at end of file diff --git a/docs/dev/table/connectors/jdbc.md b/docs/dev/table/connectors/jdbc.md index d6f54a3..d01229f 100644 --- a/docs/dev/table/connectors/jdbc.md +++ b/docs/dev/table/connectors/jdbc.md @@ -2,7 +2,7 @@ title: "JDBC SQL Connector" nav-title: JDBC nav-parent_id: sql-connectors -nav-pos: 9 +nav-pos: 3 --- <!-- Licensed to the Apache Software Foundation (ASF) under one @@ -545,4 +545,6 @@ Flink supports connect to several databases which uses dialect like MySQL, Postg <td><code>ARRAY</code></td> </tr> </tbody> -</table> \ No newline at end of file +</table> + +{% top %} \ No newline at end of file diff --git a/docs/dev/table/connectors/jdbc.zh.md b/docs/dev/table/connectors/jdbc.zh.md index 8ce7a89..1fe5a83 100644 --- a/docs/dev/table/connectors/jdbc.zh.md +++ b/docs/dev/table/connectors/jdbc.zh.md @@ -2,7 +2,7 @@ title: "JDBC SQL Connector" nav-title: JDBC nav-parent_id: sql-connectors -nav-pos: 9 +nav-pos: 3 --- <!-- Licensed to the Apache Software Foundation (ASF) under one @@ -545,4 +545,6 @@ Flink supports connect to several databases which uses dialect like MySQL, Postg <td><code>ARRAY</code></td> </tr> </tbody> -</table> \ No newline at end of file +</table> + +{% top %} \ No newline at end of file diff --git a/docs/dev/table/connectors/kafka.md b/docs/dev/table/connectors/kafka.md index fb6ee83..74cf304 100644 --- a/docs/dev/table/connectors/kafka.md +++ b/docs/dev/table/connectors/kafka.md @@ -2,7 +2,7 @@ title: "Apache Kafka SQL Connector" nav-title: Kafka nav-parent_id: sql-connectors -nav-pos: 1 +nav-pos: 2 --- <!-- Licensed to the Apache Software Foundation (ASF) under one @@ -29,7 +29,7 @@ under the License. * This will be replaced by the TOC {:toc} -Flink provides an [Apache Kafka](https://kafka.apache.org) connector for reading data from and writing data to Kafka topics. +The Kafka connector allows for reading data from and writing data into Kafka topics.
Dependencies ------------ @@ -44,15 +44,18 @@ For details on Kafka compatibility, please refer to the official [Kafka document | Kafka Version | Maven dependency | SQL Client JAR | | :------------------ | :-------------------------------------------------------- | :----------------------| -| universal | `flink-connector-kafka{{site.scala_version_suffix}}` | {% if site.is_stable %} [Download](https://repo.maven.apache.org/maven2/org/apache/flink/flink-connector-kafka{{site.scala_version_suffix}}/{{site.version}}/flink-connector-kafka{{site.scala_version_suffix}}-{{site.version}}.jar) {% else %} Only available for stable releases {% endif %} | -| 0.11.x | `flink-connector-kafka-011{{site.scala_version_suffix}}` | {% if site.is_stable %} [Download](https://repo.maven.apache.org/maven2/org/apache/flink/flink-connector-kafka-011{{site.scala_version_suffix}}/{{site.version}}/flink-connector-kafka{{site.scala_version_suffix}}-{{site.version}}.jar) {% else %} Only available for stable releases {% endif %} | -| 0.10.x | `flink-connector-kafka-010{{site.scala_version_suffix}}` | {% if site.is_stable %} [Download](https://repo.maven.apache.org/maven2/org/apache/flink/flink-connector-kafka-010{{site.scala_version_suffix}}/{{site.version}}/flink-connector-kafka{{site.scala_version_suffix}}-{{site.version}}.jar) {% else %} Only available for stable releases {% endif %} | +| universal | `flink-connector-kafka{{site.scala_version_suffix}}` | {% if site.is_stable %} [Download](https://repo.maven.apache.org/maven2/org/apache/flink/flink-connector-kafka{{site.scala_version_suffix}}/{{site.version}}/flink-connector-kafka{{site.scala_version_suffix}}-{{site.version}}.jar) {% else %} Only available for [stable releases]({{ site.stable_baseurl }}/dev/table/connectors/kafka.html) {% endif %} | +| 0.11.x | `flink-connector-kafka-011{{site.scala_version_suffix}}` | {% if site.is_stable %} [Download](https://repo.maven.apache.org/maven2/org/apache/flink/flink-connector-kafka-011{{site.scala_version_suffix}}/{{site.version}}/flink-connector-kafka{{site.scala_version_suffix}}-{{site.version}}.jar) {% else %} Only available for [stable releases]({{ site.stable_baseurl }}/dev/table/connectors/kafka.html) {% endif %} | +| 0.10.x | `flink-connector-kafka-010{{site.scala_version_suffix}}` | {% if site.is_stable %} [Download](https://repo.maven.apache.org/maven2/org/apache/flink/flink-connector-kafka-010{{site.scala_version_suffix}}/{{site.version}}/flink-connector-kafka{{site.scala_version_suffix}}-{{site.version}}.jar) {% else %} Only available for [stable releases]({{ site.stable_baseurl }}/dev/table/connectors/kafka.html) {% endif %} | -Flink's streaming connectors are not currently part of the binary distribution. -See how to link with them for cluster execution [here]({{ site.baseurl}}/dev/projectsetup/dependencies.html). +The Kafka connectors are not currently part of the binary distribution. +See how to link with them for cluster execution [here]({% link dev/project-configuration.md %}). 
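As a side note on the dependency table above, the three rows line up with the values accepted by the `connector` option documented further down. A hedged sketch with made-up table and topic names; the Scala suffix is omitted from the artifact names in the comments.

{% highlight sql %}
-- 'kafka'      pairs with flink-connector-kafka     (universal client)
-- 'kafka-0.11' pairs with flink-connector-kafka-011 (Kafka 0.11.x brokers)
-- 'kafka-0.10' pairs with flink-connector-kafka-010 (Kafka 0.10.x brokers)
CREATE TABLE KafkaExample (
  message STRING
) WITH (
  'connector' = 'kafka',   -- pick the value that matches the dependency on the classpath
  'topic' = 'example_topic',
  'properties.bootstrap.servers' = 'localhost:9092',
  'properties.group.id' = 'doc-example',
  'format' = 'csv'
);
{% endhighlight %}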
How to create a Kafka table ---------------- + +The example below shows how to create a Kafka table: + <div class="codetabs" markdown="1"> <div data-lang="SQL" markdown="1"> {% highlight sql %} @@ -93,7 +96,7 @@ Connector Options <td>required</td> <td style="word-wrap: break-word;">(none)</td> <td>String</td> - <td>Specify what connector to use, for Kafka the options are: 'kafka', 'kafka-0.11', 'kafka-0.10'.</td> + <td>Specify what connector to use, for Kafka the options are: <code>'kafka'</code>, <code>'kafka-0.11'</code>, <code>'kafka-0.10'</code>.</td> </tr> <tr> <td><h5>topic</h5></td> @@ -107,23 +110,23 @@ Connector Options <td>required</td> <td style="word-wrap: break-word;">(none)</td> <td>String</td> - <td>Kafka server connection string.</td> + <td>Comma separated list of Kafka brokers.</td> </tr> <tr> <td><h5>properties.group.id</h5></td> <td>required by source</td> <td style="word-wrap: break-word;">(none)</td> <td>String</td> - <td>Consumer group in Kafka consumer, no need for Kafka producer</td> + <td>The id of the consumer group for Kafka source, optional for Kafka sink.</td> </tr> <tr> <td><h5>format</h5></td> <td>required</td> <td style="word-wrap: break-word;">(none)</td> <td>String</td> - <td>Kafka connector requires to specify a format, - the supported formats are 'csv', 'json' and 'avro'. - Please refer to [Table Formats]({{ site.baseurl }}/dev/table/connect.html#table-formats) section for more details. + <td>The format used to deserialize and serialize Kafka messages. + The supported formats are <code>'csv'</code>, <code>'json'</code>, <code>'avro'</code>, <code>'debezium-json'</code> and <code>'canal-json'</code>. + Please refer to <a href="{% link dev/table/connectors/formats/index.md %}">Formats</a> page for more details and more format options. </td> </tr> <tr> @@ -131,14 +134,15 @@ Connector Options <td>optional</td> <td style="word-wrap: break-word;">group-offsets</td> <td>String</td> - <td>Startup mode for Kafka consumer, valid enumerations are <code>'earliest-offset'</code>, <code>'latest-offset'</code>, <code>'group-offsets'</code>, <code>'timestamp'</code> or <code>'specific-offsets'</code>.</td> + <td>Startup mode for Kafka consumer, valid values are <code>'earliest-offset'</code>, <code>'latest-offset'</code>, <code>'group-offsets'</code>, <code>'timestamp'</code> and <code>'specific-offsets'</code>. + See the following <a href="#start-reading-position">Start Reading Position</a> for more details.</td> </tr> <tr> <td><h5>scan.startup.specific-offsets</h5></td> <td>optional</td> <td style="word-wrap: break-word;">(none)</td> <td>String</td> - <td>Specifies offsets for each partition in case of 'specific-offsets' startup mode, e.g. `partition:0,offset:42;partition:1,offset:300`. + <td>Specify offsets for each partition in case of <code>'specific-offsets'</code> startup mode, e.g. <code>'partition:0,offset:42;partition:1,offset:300'</code>. </td> </tr> <tr> @@ -146,18 +150,18 @@ Connector Options <td>optional</td> <td style="word-wrap: break-word;">(none)</td> <td>Long</td> - <td>Timestamp used in case of 'timestamp' startup mode, the 'timestamp' represents the milliseconds that have passed since January 1, 1970 00:00:00.000 GMT, e.g. 
'1591776274000' for '2020-06-10 16:04:34 +08:00'.</td> + <td>Start from the specified epoch timestamp (milliseconds) used in case of <code>'timestamp'</code> startup mode.</td> </tr> <tr> <td><h5>sink.partitioner</h5></td> <td>optional</td> <td style="word-wrap: break-word;">(none)</td> <td>String</td> - <td>Output partitioning from Flink's partitions into Kafka's partitions. Valid enumerations are + <td>Output partitioning from Flink's partitions into Kafka's partitions. Valid values are <ul> - <li><span markdown="span">`fixed`</span>: each Flink partition ends up in at most one Kafka partition.</li> - <li><span markdown="span">`round-robin`</span>: a Flink partition is distributed to Kafka partitions round-robin.</li> - <li><span markdown="span">`custom class name`</span>: use a custom FlinkKafkaPartitioner subclass.</li> + <li><code>fixed</code>: each Flink partition ends up in at most one Kafka partition.</li> + <li><code>round-robin</code>: a Flink partition is distributed to Kafka partitions round-robin.</li> + <li>Custom <code>FlinkKafkaPartitioner</code> subclass: e.g. <code>'org.mycompany.MyPartitioner'</code>.</li> </ul> </td> </tr> @@ -167,7 +171,8 @@ Connector Options Features ---------------- -### Specify the Start Reading Position +### Start Reading Position + The config option `scan.startup.mode` specifies the startup mode for Kafka consumer. The valid enumerations are: <ul> <li><span markdown="span">`group-offsets`</span>: start from committed offsets in ZK / Kafka brokers of a specific consumer group.</li> @@ -185,25 +190,20 @@ If `specific-offsets` is specified, another config option `scan.startup.specific e.g. an option value `partition:0,offset:42;partition:1,offset:300` indicates offset `42` for partition `0` and offset `300` for partition `1`. ### Sink Partitioning -The config option `sink.partitioner` specifies output partitioning from Flink's partitions into Kafka's partitions. The valid enumerations are: -<ul> -<li><span markdown="span">`fixed`</span>: each Flink partition ends up in at most one Kafka partition.</li> -<li><span markdown="span">`round-robin`</span>: a Flink partition is distributed to Kafka partitions round-robin.</li> -<li><span markdown="span">`custom class name`</span>: use a custom FlinkKafkaPartitioner subclass.</li> -</ul> - -<span class="label label-danger">Note</span> If the option value it neither `fixed` nor `round-robin`, then Flink would try to parse as -the `custom class name`, if that is not a full class name that implements `FlinkKafkaPartitioner`, an exception would be thrown. -If config option `sink.partitioner` is not specified, a partition will be assigned in a round-robin fashion. +The config option `sink.partitioner` specifies output partitioning from Flink's partitions into Kafka's partitions. +By default, a Kafka sink writes to at most as many partitions as its own parallelism (each parallel instance of the sink writes to exactly one partition). +In order to distribute the writes to more partitions or control the routing of rows into partitions, a custom sink partitioner can be provided. The `round-robin` partitioner is useful to avoid an unbalanced partitioning. +However, it will cause a lot of network connections between all the Flink instances and all the Kafka brokers. 
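To make the startup-mode and sink-partitioner options above concrete, a minimal source and sink sketch; broker addresses, topics, and columns are invented for illustration.

{% highlight sql %}
-- Source table that starts reading from explicit per-partition offsets
CREATE TABLE KafkaSource (
  log_line STRING
) WITH (
  'connector' = 'kafka',
  'topic' = 'input_topic',
  'properties.bootstrap.servers' = 'broker-1:9092,broker-2:9092',
  'properties.group.id' = 'doc-example',
  'scan.startup.mode' = 'specific-offsets',
  'scan.startup.specific-offsets' = 'partition:0,offset:42;partition:1,offset:300',
  'format' = 'csv'
);

-- Sink table that spreads writes across Kafka partitions round-robin
CREATE TABLE KafkaSink (
  log_line STRING
) WITH (
  'connector' = 'kafka',
  'topic' = 'output_topic',
  'properties.bootstrap.servers' = 'broker-1:9092,broker-2:9092',
  'format' = 'csv',
  'sink.partitioner' = 'round-robin'
);
{% endhighlight %}

An `INSERT INTO KafkaSink SELECT log_line FROM KafkaSource;` statement would then exercise both option groups at once.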
### Consistency guarantees -The Kafka SQL sink only supports at-least-once writes now, for exactly-once writes, use the `DataStream` connector, see -<a href="{{ site.baseurl }}/dev/connectors/kafka.html#kafka-producers-and-fault-tolerance">Kafka Producers And Fault Tolerance</a> for more details. + +By default, a Kafka sink ingests data with at-least-once guarantees into a Kafka topic if the query is executed with [checkpointing enabled]({% link dev/stream/state/checkpointing.md %}#enabling-and-configuring-checkpointing). Data Type Mapping ---------------- -Kafka connector requires to specify a format, thus the supported data types are decided by the specific formats it specifies. -Please refer to <a href="{{ site.baseurl }}/dev/table/connectors/formats/index.html">Table Formats</a> section for more details. + +Kafka stores message keys and values as bytes, so Kafka doesn't have schema or data types. The Kafka messages are deserialized and serialized by formats, e.g. csv, json, avro. +Thus, the data type mapping is determined by specific formats. Please refer to [Formats]({% link dev/table/connectors/formats/index.md %}) pages for more details. {% top %} diff --git a/docs/dev/table/connectors/kafka.zh.md b/docs/dev/table/connectors/kafka.zh.md index fb6ee83..74cf304 100644 --- a/docs/dev/table/connectors/kafka.zh.md +++ b/docs/dev/table/connectors/kafka.zh.md @@ -2,7 +2,7 @@ title: "Apache Kafka SQL Connector" nav-title: Kafka nav-parent_id: sql-connectors -nav-pos: 1 +nav-pos: 2 --- <!-- Licensed to the Apache Software Foundation (ASF) under one @@ -29,7 +29,7 @@ under the License. * This will be replaced by the TOC {:toc} -Flink provides an [Apache Kafka](https://kafka.apache.org) connector for reading data from and writing data to Kafka topics. +The Kafka connector allows for reading data from and writing data into Kafka topics. 
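Relating to the Data Type Mapping note in the English kafka.md hunk above: Kafka itself only sees bytes, so the declared column types are mapped to and from the payload by the chosen format. A hedged sketch with illustrative names and types follows.

{% highlight sql %}
CREATE TABLE Orders (
  order_id   BIGINT,         -- serialized and deserialized by the 'json' format, not by Kafka
  price      DECIMAL(10, 2),
  order_time TIMESTAMP(3)
) WITH (
  'connector' = 'kafka',
  'topic' = 'orders',
  'properties.bootstrap.servers' = 'localhost:9092',
  'properties.group.id' = 'doc-example',
  'format' = 'json'          -- the format owns the type mapping described on the Formats pages
);
{% endhighlight %}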
Dependencies ------------ @@ -44,15 +44,18 @@ For details on Kafka compatibility, please refer to the official [Kafka document | Kafka Version | Maven dependency | SQL Client JAR | | :------------------ | :-------------------------------------------------------- | :----------------------| -| universal | `flink-connector-kafka{{site.scala_version_suffix}}` | {% if site.is_stable %} [Download](https://repo.maven.apache.org/maven2/org/apache/flink/flink-connector-kafka{{site.scala_version_suffix}}/{{site.version}}/flink-connector-kafka{{site.scala_version_suffix}}-{{site.version}}.jar) {% else %} Only available for stable releases {% endif %} | -| 0.11.x | `flink-connector-kafka-011{{site.scala_version_suffix}}` | {% if site.is_stable %} [Download](https://repo.maven.apache.org/maven2/org/apache/flink/flink-connector-kafka-011{{site.scala_version_suffix}}/{{site.version}}/flink-connector-kafka{{site.scala_version_suffix}}-{{site.version}}.jar) {% else %} Only available for stable releases {% endif %} | -| 0.10.x | `flink-connector-kafka-010{{site.scala_version_suffix}}` | {% if site.is_stable %} [Download](https://repo.maven.apache.org/maven2/org/apache/flink/flink-connector-kafka-010{{site.scala_version_suffix}}/{{site.version}}/flink-connector-kafka{{site.scala_version_suffix}}-{{site.version}}.jar) {% else %} Only available for stable releases {% endif %} | +| universal | `flink-connector-kafka{{site.scala_version_suffix}}` | {% if site.is_stable %} [Download](https://repo.maven.apache.org/maven2/org/apache/flink/flink-connector-kafka{{site.scala_version_suffix}}/{{site.version}}/flink-connector-kafka{{site.scala_version_suffix}}-{{site.version}}.jar) {% else %} Only available for [stable releases]({{ site.stable_baseurl }}/dev/table/connectors/kafka.html) {% endif %} | +| 0.11.x | `flink-connector-kafka-011{{site.scala_version_suffix}}` | {% if site.is_stable %} [Download](https://repo.maven.apache.org/maven2/org/apache/flink/flink-connector-kafka-011{{site.scala_version_suffix}}/{{site.version}}/flink-connector-kafka{{site.scala_version_suffix}}-{{site.version}}.jar) {% else %} Only available for [stable releases]({{ site.stable_baseurl }}/dev/table/connectors/kafka.html) {% endif %} | +| 0.10.x | `flink-connector-kafka-010{{site.scala_version_suffix}}` | {% if site.is_stable %} [Download](https://repo.maven.apache.org/maven2/org/apache/flink/flink-connector-kafka-010{{site.scala_version_suffix}}/{{site.version}}/flink-connector-kafka{{site.scala_version_suffix}}-{{site.version}}.jar) {% else %} Only available for [stable releases]({{ site.stable_baseurl }}/dev/table/connectors/kafka.html) {% endif %} | -Flink's streaming connectors are not currently part of the binary distribution. -See how to link with them for cluster execution [here]({{ site.baseurl}}/dev/projectsetup/dependencies.html). +The Kafka connectors are not currently part of the binary distribution. +See how to link with them for cluster execution [here]({% link dev/project-configuration.md %}). 
How to create a Kafka table ---------------- + +The example below shows how to create a Kafka table: + <div class="codetabs" markdown="1"> <div data-lang="SQL" markdown="1"> {% highlight sql %} @@ -93,7 +96,7 @@ Connector Options <td>required</td> <td style="word-wrap: break-word;">(none)</td> <td>String</td> - <td>Specify what connector to use, for Kafka the options are: 'kafka', 'kafka-0.11', 'kafka-0.10'.</td> + <td>Specify what connector to use, for Kafka the options are: <code>'kafka'</code>, <code>'kafka-0.11'</code>, <code>'kafka-0.10'</code>.</td> </tr> <tr> <td><h5>topic</h5></td> @@ -107,23 +110,23 @@ Connector Options <td>required</td> <td style="word-wrap: break-word;">(none)</td> <td>String</td> - <td>Kafka server connection string.</td> + <td>Comma separated list of Kafka brokers.</td> </tr> <tr> <td><h5>properties.group.id</h5></td> <td>required by source</td> <td style="word-wrap: break-word;">(none)</td> <td>String</td> - <td>Consumer group in Kafka consumer, no need for Kafka producer</td> + <td>The id of the consumer group for Kafka source, optional for Kafka sink.</td> </tr> <tr> <td><h5>format</h5></td> <td>required</td> <td style="word-wrap: break-word;">(none)</td> <td>String</td> - <td>Kafka connector requires to specify a format, - the supported formats are 'csv', 'json' and 'avro'. - Please refer to [Table Formats]({{ site.baseurl }}/dev/table/connect.html#table-formats) section for more details. + <td>The format used to deserialize and serialize Kafka messages. + The supported formats are <code>'csv'</code>, <code>'json'</code>, <code>'avro'</code>, <code>'debezium-json'</code> and <code>'canal-json'</code>. + Please refer to <a href="{% link dev/table/connectors/formats/index.md %}">Formats</a> page for more details and more format options. </td> </tr> <tr> @@ -131,14 +134,15 @@ Connector Options <td>optional</td> <td style="word-wrap: break-word;">group-offsets</td> <td>String</td> - <td>Startup mode for Kafka consumer, valid enumerations are <code>'earliest-offset'</code>, <code>'latest-offset'</code>, <code>'group-offsets'</code>, <code>'timestamp'</code> or <code>'specific-offsets'</code>.</td> + <td>Startup mode for Kafka consumer, valid values are <code>'earliest-offset'</code>, <code>'latest-offset'</code>, <code>'group-offsets'</code>, <code>'timestamp'</code> and <code>'specific-offsets'</code>. + See the following <a href="#start-reading-position">Start Reading Position</a> for more details.</td> </tr> <tr> <td><h5>scan.startup.specific-offsets</h5></td> <td>optional</td> <td style="word-wrap: break-word;">(none)</td> <td>String</td> - <td>Specifies offsets for each partition in case of 'specific-offsets' startup mode, e.g. `partition:0,offset:42;partition:1,offset:300`. + <td>Specify offsets for each partition in case of <code>'specific-offsets'</code> startup mode, e.g. <code>'partition:0,offset:42;partition:1,offset:300'</code>. </td> </tr> <tr> @@ -146,18 +150,18 @@ Connector Options <td>optional</td> <td style="word-wrap: break-word;">(none)</td> <td>Long</td> - <td>Timestamp used in case of 'timestamp' startup mode, the 'timestamp' represents the milliseconds that have passed since January 1, 1970 00:00:00.000 GMT, e.g. 
'1591776274000' for '2020-06-10 16:04:34 +08:00'.</td> + <td>Start from the specified epoch timestamp (milliseconds) used in case of <code>'timestamp'</code> startup mode.</td> </tr> <tr> <td><h5>sink.partitioner</h5></td> <td>optional</td> <td style="word-wrap: break-word;">(none)</td> <td>String</td> - <td>Output partitioning from Flink's partitions into Kafka's partitions. Valid enumerations are + <td>Output partitioning from Flink's partitions into Kafka's partitions. Valid values are <ul> - <li><span markdown="span">`fixed`</span>: each Flink partition ends up in at most one Kafka partition.</li> - <li><span markdown="span">`round-robin`</span>: a Flink partition is distributed to Kafka partitions round-robin.</li> - <li><span markdown="span">`custom class name`</span>: use a custom FlinkKafkaPartitioner subclass.</li> + <li><code>fixed</code>: each Flink partition ends up in at most one Kafka partition.</li> + <li><code>round-robin</code>: a Flink partition is distributed to Kafka partitions round-robin.</li> + <li>Custom <code>FlinkKafkaPartitioner</code> subclass: e.g. <code>'org.mycompany.MyPartitioner'</code>.</li> </ul> </td> </tr> @@ -167,7 +171,8 @@ Connector Options Features ---------------- -### Specify the Start Reading Position +### Start Reading Position + The config option `scan.startup.mode` specifies the startup mode for Kafka consumer. The valid enumerations are: <ul> <li><span markdown="span">`group-offsets`</span>: start from committed offsets in ZK / Kafka brokers of a specific consumer group.</li> @@ -185,25 +190,20 @@ If `specific-offsets` is specified, another config option `scan.startup.specific e.g. an option value `partition:0,offset:42;partition:1,offset:300` indicates offset `42` for partition `0` and offset `300` for partition `1`. ### Sink Partitioning -The config option `sink.partitioner` specifies output partitioning from Flink's partitions into Kafka's partitions. The valid enumerations are: -<ul> -<li><span markdown="span">`fixed`</span>: each Flink partition ends up in at most one Kafka partition.</li> -<li><span markdown="span">`round-robin`</span>: a Flink partition is distributed to Kafka partitions round-robin.</li> -<li><span markdown="span">`custom class name`</span>: use a custom FlinkKafkaPartitioner subclass.</li> -</ul> - -<span class="label label-danger">Note</span> If the option value it neither `fixed` nor `round-robin`, then Flink would try to parse as -the `custom class name`, if that is not a full class name that implements `FlinkKafkaPartitioner`, an exception would be thrown. -If config option `sink.partitioner` is not specified, a partition will be assigned in a round-robin fashion. +The config option `sink.partitioner` specifies output partitioning from Flink's partitions into Kafka's partitions. +By default, a Kafka sink writes to at most as many partitions as its own parallelism (each parallel instance of the sink writes to exactly one partition). +In order to distribute the writes to more partitions or control the routing of rows into partitions, a custom sink partitioner can be provided. The `round-robin` partitioner is useful to avoid an unbalanced partitioning. +However, it will cause a lot of network connections between all the Flink instances and all the Kafka brokers. 
### Consistency guarantees -The Kafka SQL sink only supports at-least-once writes now, for exactly-once writes, use the `DataStream` connector, see -<a href="{{ site.baseurl }}/dev/connectors/kafka.html#kafka-producers-and-fault-tolerance">Kafka Producers And Fault Tolerance</a> for more details. + +By default, a Kafka sink ingests data with at-least-once guarantees into a Kafka topic if the query is executed with [checkpointing enabled]({% link dev/stream/state/checkpointing.md %}#enabling-and-configuring-checkpointing). Data Type Mapping ---------------- -Kafka connector requires to specify a format, thus the supported data types are decided by the specific formats it specifies. -Please refer to <a href="{{ site.baseurl }}/dev/table/connectors/formats/index.html">Table Formats</a> section for more details. + +Kafka stores message keys and values as bytes, so Kafka doesn't have schema or data types. The Kafka messages are deserialized and serialized by formats, e.g. csv, json, avro. +Thus, the data type mapping is determined by specific formats. Please refer to [Formats]({% link dev/table/connectors/formats/index.md %}) pages for more details. {% top %}