wuchong commented on a change in pull request #9799: [FLINK-13360][documentation] Add documentation for HBase connector for Table API & SQL

URL: https://github.com/apache/flink/pull/9799#discussion_r332922638
##########
File path: docs/dev/table/connect.md
##########

@@ -1075,6 +1075,94 @@ CREATE TABLE MyUserTable (

 {% top %}

+### HBase Connector
+
+<span class="label label-primary">Source: Batch</span>
+<span class="label label-primary">Sink: Batch</span>
+<span class="label label-primary">Sink: Streaming Append Mode</span>
+<span class="label label-primary">Sink: Streaming Upsert Mode</span>
+<span class="label label-primary">Temporal Join: Sync Mode</span>
+
+The HBase connector allows for reading from and writing to an HBase cluster.
+
+The connector can operate in [upsert mode](#update-modes) for exchanging UPSERT/DELETE messages with the external system using a [key defined by the query](./streaming/dynamic_tables.html#table-to-stream-conversion).
+
+For append-only queries, the connector can also operate in [append mode](#update-modes) for exchanging only INSERT messages with the external system.
+
+To use this connector, add the following dependency to your project:
+
+{% highlight xml %}
+<dependency>
+  <groupId>org.apache.flink</groupId>
+  <artifactId>flink-connector-hbase{{ site.scala_version_suffix }}</artifactId>
+  <version>{{ site.version }}</version>
+</dependency>
+{% endhighlight %}
+
+The connector can be defined as follows:
+
+<div class="codetabs" markdown="1">
+<div data-lang="YAML" markdown="1">
+{% highlight yaml %}
+connector:
+  type: hbase
+  version: 1.4.3                 # required: valid connector versions are "1.4.3"
+
+  table-name: "hbase_table_name" # required: hbase table name
+
+  zookeeper:
+    quorum: "localhost:2181"     # required: HBase Zookeeper quorum configuration
+    znode.parent: "/test"        # required: the root dir in Zookeeper for HBase cluster
+
+  write.buffer-flush:
+    max-size: 1048576            # optional: write option, sets when to flush a buffered request
+                                 # based on the memory size of rows currently added.
+    max-rows: 1                  # optional: write option, sets when to flush a buffered
+                                 # request based on the number of rows currently added.
+    interval: 1                  # optional: write option, sets a flush interval, flushing any
+                                 # buffered requests once the interval passes, in milliseconds.
+{% endhighlight %}
+</div>
+
+<div data-lang="DDL" markdown="1">
+{% highlight sql %}
+CREATE TABLE MyUserTable (
+  hbase_rowkey_name rowkey_type,
+  hbase_column_family_name1 ROW<...>,
+  hbase_column_family_name2 ROW<...>
+) WITH (
+  'connector.type' = 'hbase',                          -- required: specify this table type is hbase
+
+  'connector.version' = '1.4.3',                       -- required: valid connector versions are "1.4.3"
+
+  'connector.table-name' = 'hbase_table_name',         -- required: hbase table name
+
+  'connector.zookeeper.quorum' = 'localhost:2181',     -- required: HBase Zookeeper quorum configuration
+  'connector.zookeeper.znode.parent' = '/test',        -- required: the root dir in Zookeeper for HBase cluster
+
+  'connector.write.buffer-flush.max-size' = '1048576', -- optional: write option, sets when to flush a buffered request
+                                                       -- based on the memory size of rows currently added.
+
+  'connector.write.buffer-flush.max-rows' = '1',       -- optional: write option, sets when to flush a buffered

Review comment:

```suggestion
  'connector.write.buffer-flush.max-rows' = '1000',    -- optional: write option, determines how many rows to buffer before
                                                       -- flushing per round trip. This can improve performance when writing
                                                       -- to HBase. No default value, i.e., by default flushing does not
                                                       -- depend on the number of buffered rows.
```

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

With regards,
Apache Git Services
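As a side note for readers of this thread, the table declared in the DDL tab above could be exercised roughly as follows. This is only a sketch under assumptions not present in the patch: `Orders`, its columns (`user_id`, `name`, `age`, `city`, `proctime`), and the column-family field `col1` are hypothetical names.

```sql
-- Hypothetical append query writing into the HBase table defined above:
-- the first selected column maps to the HBase rowkey, and each ROW(...)
-- constructor composes the columns of one column family.
INSERT INTO MyUserTable
SELECT user_id, ROW(name, age), ROW(city) FROM Orders;

-- Hypothetical temporal (lookup) join against the HBase table, matching
-- the "Temporal Join: Sync Mode" label above; proctime is assumed to be
-- a processing-time attribute of Orders.
SELECT o.order_id, h.hbase_column_family_name1.col1
FROM Orders AS o
JOIN MyUserTable FOR SYSTEM_TIME AS OF o.proctime AS h
  ON o.user_id = h.hbase_rowkey_name;
```

Whether the rowkey and column-family composition work exactly this way depends on the connector implementation being documented in this PR; the sketch is meant to illustrate the intended shape of queries, not to be authoritative.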