This is an automated email from the ASF dual-hosted git repository.

danny0405 pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/hudi.git
The following commit(s) were added to refs/heads/asf-site by this push:
     new f025f0b7af  [HUDI-4723] Add document about Hoodie Catalog (#6508)
f025f0b7af is described below

commit f025f0b7af5531f478f91fdd7c37f804c507f9b3
Author: Danny Chan <yuzhao....@gmail.com>
AuthorDate: Fri Aug 26 16:00:35 2022 +0800

    [HUDI-4723] Add document about Hoodie Catalog (#6508)
---
 website/docs/flink-quick-start-guide.md            | 13 +++++-----
 website/docs/table_management.md                   | 28 +++++++++++++++++++++-
 .../version-0.12.0/flink-quick-start-guide.md      | 13 +++++-----
 .../version-0.12.0/table_management.md             | 28 +++++++++++++++++++++-
 4 files changed, 66 insertions(+), 16 deletions(-)

diff --git a/website/docs/flink-quick-start-guide.md b/website/docs/flink-quick-start-guide.md
index 4cf2b1042b..f4b9668178 100644
--- a/website/docs/flink-quick-start-guide.md
+++ b/website/docs/flink-quick-start-guide.md
@@ -24,14 +24,13 @@ quick start tool for SQL users.

 #### Step.1 download Flink jar

-Hudi works with both Flink 1.13 and Flink 1.14. You can follow the
+Hudi works with Flink 1.13, Flink 1.14 and Flink 1.15. You can follow the
 instructions [here](https://flink.apache.org/downloads) for setting up Flink.
 Then choose the desired Hudi-Flink bundle jar to work with different Flink and
 Scala versions:

-- `hudi-flink1.13-bundle_2.11`
-- `hudi-flink1.13-bundle_2.12`
-- `hudi-flink1.14-bundle_2.11`
-- `hudi-flink1.14-bundle_2.12`
+- `hudi-flink1.13-bundle`
+- `hudi-flink1.14-bundle`
+- `hudi-flink1.15-bundle`

 #### Step.2 start Flink cluster
 Start a standalone Flink cluster within hadoop environment.
@@ -117,8 +116,8 @@ INSERT INTO t1 VALUES
 select * from t1;
 ```

-This query provides snapshot querying of the ingested data.
-Refer to [Table types and queries](/docs/concepts#table-types--queries) for more info on all table types and query types supported.
+This statement queries the snapshot view of the dataset.
+Refer to [Table types and queries](/docs/concepts#table-types--queries) for more info on all table types and query types supported.

 ### Update Data

diff --git a/website/docs/table_management.md b/website/docs/table_management.md
index 6099476c31..7dbccd19ed 100644
--- a/website/docs/table_management.md
+++ b/website/docs/table_management.md
@@ -208,7 +208,33 @@ set hoodie.upsert.shuffle.parallelism = 100;
 set hoodie.delete.shuffle.parallelism = 100;
 ```

-## Flink
+## Flink
+
+### Create Catalog
+
+The catalog helps to manage the SQL tables; the tables can be shared among CLI sessions if the catalog persists the table DDLs.
+For `hms` mode, the catalog also supplements the hive syncing options.
+
+HMS mode catalog SQL demo:
+```sql
+CREATE CATALOG hoodie_catalog
+  WITH (
+    'type'='hudi',
+    'catalog.path' = '${catalog default root path}',
+    'hive.conf.dir' = '${directory where hive-site.xml is located}',
+    'mode'='hms' -- supports 'dfs' mode that uses the DFS backend for table DDLs persistence
+  );
+```
+
+#### Options
+| Option Name        | Required | Default   | Remarks |
+| ------------------ | -------- | --------- | ------- |
+| `catalog.path`     | true     | --        | Default root path for the catalog; the path is used to infer the table path automatically, the default table path being `${catalog.path}/${db_name}/${table_name}` |
+| `default-database` | false    | `default` | Default database name |
+| `hive.conf.dir`    | false    | --        | The directory where hive-site.xml is located; only valid in `hms` mode |
+| `mode`             | false    | `dfs`     | Supports `hms` mode that uses HMS to persist the table options |
+| `table.external`   | false    | false     | Whether to create the external table; only valid in `hms` mode |
+
 ### Create Table

 The following is a Flink example to create a table. [Read the Flink Quick Start](/docs/flink-quick-start-guide) guide for more examples.
diff --git a/website/versioned_docs/version-0.12.0/flink-quick-start-guide.md b/website/versioned_docs/version-0.12.0/flink-quick-start-guide.md
index 4cf2b1042b..4a926aad04 100644
--- a/website/versioned_docs/version-0.12.0/flink-quick-start-guide.md
+++ b/website/versioned_docs/version-0.12.0/flink-quick-start-guide.md
@@ -24,14 +24,13 @@ quick start tool for SQL users.

 #### Step.1 download Flink jar

-Hudi works with both Flink 1.13 and Flink 1.14. You can follow the
+Hudi works with Flink 1.13, Flink 1.14 and Flink 1.15. You can follow the
 instructions [here](https://flink.apache.org/downloads) for setting up Flink.
 Then choose the desired Hudi-Flink bundle jar to work with different Flink and
 Scala versions:

-- `hudi-flink1.13-bundle_2.11`
-- `hudi-flink1.13-bundle_2.12`
-- `hudi-flink1.14-bundle_2.11`
-- `hudi-flink1.14-bundle_2.12`
+- `hudi-flink1.13-bundle`
+- `hudi-flink1.14-bundle`
+- `hudi-flink1.15-bundle`

 #### Step.2 start Flink cluster
 Start a standalone Flink cluster within hadoop environment.
@@ -117,8 +116,8 @@ INSERT INTO t1 VALUES
 select * from t1;
 ```

-This query provides snapshot querying of the ingested data.
-Refer to [Table types and queries](/docs/concepts#table-types--queries) for more info on all table types and query types supported.
+This statement queries the snapshot view of the dataset.
+Refer to [Table types and queries](/docs/concepts#table-types--queries) for more info on all table types and query types supported.
 ### Update Data

diff --git a/website/versioned_docs/version-0.12.0/table_management.md b/website/versioned_docs/version-0.12.0/table_management.md
index 6099476c31..7dbccd19ed 100644
--- a/website/versioned_docs/version-0.12.0/table_management.md
+++ b/website/versioned_docs/version-0.12.0/table_management.md
@@ -208,7 +208,33 @@ set hoodie.upsert.shuffle.parallelism = 100;
 set hoodie.delete.shuffle.parallelism = 100;
 ```

-## Flink
+## Flink
+
+### Create Catalog
+
+The catalog helps to manage the SQL tables; the tables can be shared among CLI sessions if the catalog persists the table DDLs.
+For `hms` mode, the catalog also supplements the hive syncing options.
+
+HMS mode catalog SQL demo:
+```sql
+CREATE CATALOG hoodie_catalog
+  WITH (
+    'type'='hudi',
+    'catalog.path' = '${catalog default root path}',
+    'hive.conf.dir' = '${directory where hive-site.xml is located}',
+    'mode'='hms' -- supports 'dfs' mode that uses the DFS backend for table DDLs persistence
+  );
+```
+
+#### Options
+| Option Name        | Required | Default   | Remarks |
+| ------------------ | -------- | --------- | ------- |
+| `catalog.path`     | true     | --        | Default root path for the catalog; the path is used to infer the table path automatically, the default table path being `${catalog.path}/${db_name}/${table_name}` |
+| `default-database` | false    | `default` | Default database name |
+| `hive.conf.dir`    | false    | --        | The directory where hive-site.xml is located; only valid in `hms` mode |
+| `mode`             | false    | `dfs`     | Supports `hms` mode that uses HMS to persist the table options |
+| `table.external`   | false    | false     | Whether to create the external table; only valid in `hms` mode |
+
 ### Create Table

 The following is a Flink example to create a table. [Read the Flink Quick Start](/docs/flink-quick-start-guide) guide for more examples.
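
For context, the catalog documented by this patch would typically be exercised from the Flink SQL client along these lines (an illustrative sketch only; the catalog name matches the demo above, but the database and table names are assumptions, and the table path is inferred as `${catalog.path}/${db_name}/${table_name}` per the options table):

```sql
-- Switch to the Hudi catalog created by the CREATE CATALOG demo above
USE CATALOG hoodie_catalog;

-- A hypothetical database; tables under it resolve to ${catalog.path}/hudi_db/<table_name>
CREATE DATABASE IF NOT EXISTS hudi_db;
USE hudi_db;

-- A hypothetical Hudi table; with the catalog managing paths,
-- no explicit 'path' option is given in WITH
CREATE TABLE t1 (
  uuid VARCHAR(20) PRIMARY KEY NOT ENFORCED,
  name VARCHAR(10),
  ts   TIMESTAMP(3),
  `partition` VARCHAR(20)
) PARTITIONED BY (`partition`)
WITH (
  'connector' = 'hudi',
  'table.type' = 'MERGE_ON_READ'
);
```

In `hms` mode these DDLs persist to the Hive Metastore and so survive across SQL client sessions, which is the sharing behavior the paragraph above describes.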