[GitHub] [flink] sjwiesman commented on a change in pull request #9799: [FLINK-13360][documentation] Add documentation for HBase connector for Table API & SQL

GitBox Tue, 01 Oct 2019 13:39:14 -0700

sjwiesman commented on a change in pull request #9799: 
[FLINK-13360][documentation] Add documentation for HBase connector for Table 
API & SQL
URL: https://github.com/apache/flink/pull/9799#discussion_r330260287


 ##########
 File path: docs/dev/table/connect.md
 ##########
 @@ -1075,6 +1075,72 @@ CREATE TABLE MyUserTable (
 
 {% top %}
 
+### HBase Connector
+
+<span class="label label-primary">Source: Batch</span>
+<span class="label label-primary">Sink: Batch</span>
+<span class="label label-primary">Sink: Streaming Append Mode</span>
+<span class="label label-primary">Sink: Streaming Upsert Mode</span>
+<span class="label label-primary">Temporal Join: Sync Mode</span>
+
+The HBase connector allows for reading from an HBase cluster.
+The HBase connector allows for writing into an HBase cluster.
+
+The connector can operate in [upsert mode](#update-modes) for exchanging UPSERT/DELETE messages with the external system using a [key defined by the query](./streaming/dynamic_tables.html#table-to-stream-conversion).
+
+For append-only queries, the connector can also operate in [append mode](#update-modes) for exchanging only INSERT messages with the external system.
+
+To use this connector, add the following dependency to your project:
+
+{% highlight xml %}
+<dependency>
+  <groupId>org.apache.flink</groupId>
+  <artifactId>flink-connector-hbase{{ site.scala_version_suffix }}</artifactId>
+  <version>{{site.version }}</version>
+</dependency>
+{% endhighlight %}
+
+The connector can be defined as follows:
+
+<div data-lang="DDL" markdown="1">
+{% highlight sql %}
+CREATE TABLE MyUserTable (
+  hbase_rowkey_name rowkey_type,
+  hbase_column_family_name1 ROW<...>,
+  hbase_column_family_name2 ROW<...>
+) WITH (
+  'connector.type' = 'hbase', -- required: specify this table type is hbase
+  
+  'connector.version' = '1.4.3',          -- required: valid connector versions are "1.4.3"
+  
+  'connector.table-name' = 'hbase_table_name',  -- required: hbase table name
+  
+  'connector.zookeeper.quorum' = 'quorum_url', -- required: hbase zookeeper config
+  'connector.zookeeper.znode.parent' = 'znode',
+
+  'connector.write.buffer-flush.max-size' = '1048576', -- optional: Write option, sets when to flush a buffered request
+                                                       -- based on the memory size of rows currently added.
+
+  'connector.write.buffer-flush.max-rows' = '1', -- optional: Write option, sets when to flush buffered
+                                                 -- request based on the number of rows currently added.
+
+  'connector.write.buffer-flush.interval' = '1', -- optional: Write option, sets the interval, in milliseconds,
+                                                 -- after which buffered requests are flushed.
+)
+{% endhighlight %}
+</div>
+</div>
+
+**Column family:** Values other than the rowkey must be declared within column families, so wrap them with the SQL ROW function before inserting into an HBase table.
+
+**HBase config:** To configure HBase, the default configuration is created from the current runtime environment (`hbase-site.xml` on the classpath) first and is then overwritten with the serialized configuration from the client-side environment (`hbase-site.xml` on the classpath).
+
+**Temporal join:** The HBase lookup join does not use any caching; every lookup accesses the data directly through the HBase client API.
+
+**Rowkey:** User should confirm rowkey should not be empty string. (waiting for support)
 
 Review comment:
   ```suggestion
   **Rowkey:** Empty string rowkey values are currently unsupported.
   ```
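
   On the **Column family** note above, a concrete statement may make the ROW wrapping easier to picture. This is only a sketch under stated assumptions: `MyUserTable` is assumed to declare `rowkey STRING`, `cf1 ROW<q1 INT>` and `cf2 ROW<q2 STRING, q3 BIGINT>`, and the source table `T` with columns `rowkey`, `q1`, `q2`, `q3` is hypothetical and not part of the PR.

   ```sql
   -- Hypothetical sink schema: rowkey STRING, cf1 ROW<q1 INT>, cf2 ROW<q2 STRING, q3 BIGINT>
   -- The rowkey is passed as a plain value; each column family is built with the ROW constructor.
   INSERT INTO MyUserTable
   SELECT rowkey, ROW(q1), ROW(q2, q3)
   FROM T;
   ```

   The number and order of fields inside each `ROW(...)` would need to match the qualifiers declared for that column family.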

