This is an automated email from the ASF dual-hosted git repository.
lzljs3620320 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/paimon.git
The following commit(s) were added to refs/heads/master by this push:
new d2a6f4c92 [doc] Add catalog and table types in concept
d2a6f4c92 is described below
commit d2a6f4c925435fa8ac477752b712290a76c4e1c6
Author: Jingsong <[email protected]>
AuthorDate: Thu Nov 7 19:32:08 2024 +0800
[doc] Add catalog and table types in concept
---
docs/content/concepts/catalog.md | 90 +++++++++++++++++++++
docs/content/concepts/spec/_index.md | 2 +-
docs/content/concepts/table-types.md | 149 +++++++++++++++++++++++++++++++++++
docs/content/flink/_index.md | 2 +-
docs/content/spark/_index.md | 2 +-
5 files changed, 242 insertions(+), 3 deletions(-)
diff --git a/docs/content/concepts/catalog.md b/docs/content/concepts/catalog.md
new file mode 100644
index 000000000..9775113a6
--- /dev/null
+++ b/docs/content/concepts/catalog.md
@@ -0,0 +1,90 @@
+---
+title: "Catalog"
+weight: 4
+type: docs
+aliases:
+- /concepts/catalog.html
+---
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements. See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership. The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied. See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+# Catalog
+
+Paimon provides a Catalog abstraction to manage tables and their metadata. The Catalog abstraction offers
+several ways to integrate with computing engines. We recommend that you always use a Catalog to
+access Paimon tables.
+
+## Catalogs
+
+Paimon catalogs currently support three types of metastores:
+
+* `filesystem` metastore (default), which stores both metadata and table files in filesystems.
+* `hive` metastore, which additionally stores metadata in the Hive metastore. Users can directly access the tables from Hive.
+* `jdbc` metastore, which additionally stores metadata in relational databases such as MySQL or PostgreSQL.
+
+## Filesystem Catalog
+
+Metadata and table files are stored under `hdfs:///path/to/warehouse`.
+
+```sql
+-- Flink SQL
+CREATE CATALOG my_catalog WITH (
+ 'type' = 'paimon',
+ 'warehouse' = 'hdfs:///path/to/warehouse'
+);
+```
+
+## Hive Catalog
+
+With the Paimon Hive catalog, changes to the catalog directly affect the corresponding Hive metastore. Tables
+created in such a catalog can also be accessed directly from Hive. Metadata and table files are stored under
+`hdfs:///path/to/warehouse`. In addition, the schema is stored in the Hive metastore.
+
+```sql
+-- Flink SQL
+CREATE CATALOG my_hive WITH (
+ 'type' = 'paimon',
+ 'metastore' = 'hive'
+ -- 'warehouse' defaults to 'hive.metastore.warehouse.dir' in HiveConf; to override it, add a comma above and: 'warehouse' = 'hdfs:///path/to/warehouse'
+);
+```
+
+By default, Paimon does not synchronize newly created partitions into the Hive metastore, so users see an unpartitioned
+table in Hive. Partition push-down is carried out by filter push-down instead.
+
+If you want to see a partitioned table in Hive and also synchronize newly created partitions into the Hive metastore,
+set the table option `metastore.partitioned-table` to `true`.
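+
+As a minimal sketch of the option above, the following Flink SQL creates a partitioned table with partition
+synchronization turned on. The database, table, and column names are hypothetical, and the example assumes that
+the `my_hive` catalog defined earlier and a database `my_db` already exist.
+
+```sql
+-- Flink SQL (illustrative names; assumes catalog my_hive and database my_db exist)
+CREATE TABLE my_hive.my_db.my_partitioned_table (
+ user_id BIGINT,
+ item STRING,
+ dt STRING
+) PARTITIONED BY (dt) WITH (
+ 'metastore.partitioned-table' = 'true'
+);
+```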
+
+## JDBC Catalog
+
+With the Paimon JDBC catalog, changes to the catalog are stored directly in relational databases such as SQLite,
+MySQL, or PostgreSQL.
+
+```sql
+-- Flink SQL
+CREATE CATALOG my_jdbc WITH (
+ 'type' = 'paimon',
+ 'metastore' = 'jdbc',
+ 'uri' = 'jdbc:mysql://<host>:<port>/<databaseName>',
+ 'jdbc.user' = '...',
+ 'jdbc.password' = '...',
+ 'catalog-key'='jdbc',
+ 'warehouse' = 'hdfs:///path/to/warehouse'
+);
+```
diff --git a/docs/content/concepts/spec/_index.md b/docs/content/concepts/spec/_index.md
index 3bd8e657f..166ce4eea 100644
--- a/docs/content/concepts/spec/_index.md
+++ b/docs/content/concepts/spec/_index.md
@@ -1,7 +1,7 @@
---
title: Specification
bookCollapseSection: true
-weight: 4
+weight: 6
---
<!--
Licensed to the Apache Software Foundation (ASF) under one
diff --git a/docs/content/concepts/table-types.md b/docs/content/concepts/table-types.md
new file mode 100644
index 000000000..88f0cc66c
--- /dev/null
+++ b/docs/content/concepts/table-types.md
@@ -0,0 +1,149 @@
+---
+title: "Table Types"
+weight: 5
+type: docs
+aliases:
+- /concepts/table-types.html
+---
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements. See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership. The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied. See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+# Table Types
+
+Paimon supports the following table types:
+
+1. table with pk: a Paimon data table with a primary key
+2. table w/o pk: a Paimon data table without a primary key
+3. view: a virtual table defined by a SQL query; requires a metastore
+4. format-table: a directory that contains multiple files of the same format; operations on the table
+ read or write these files, and the table is compatible with Hive tables
+5. materialized-table: aims to simplify both batch and stream data pipelines with a consistent development
+ experience, see [Flink Materialized Table](https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/materialized-table/overview/)
+
+## Table with PK
+
+See [Paimon with Primary key]({{< ref "primary-key-table/overview" >}}).
+
+A primary key consists of a set of columns whose values are unique for each record. Paimon enforces data ordering by
+sorting on the primary key within each bucket, which enables streaming updates and streaming changelog reads.
+
+The definition of a primary key is similar to that in standard SQL: it ensures that a batch query returns only one
+record for each primary key.
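+
+A minimal sketch of such a table in Flink SQL follows. The table and column names are hypothetical, and a Paimon
+catalog is assumed to be the current catalog. Flink requires the `NOT ENFORCED` qualifier on primary key declarations.
+
+```sql
+-- Flink SQL (illustrative names)
+CREATE TABLE my_pk_table (
+ user_id BIGINT,
+ item STRING,
+ PRIMARY KEY (user_id) NOT ENFORCED
+);
+```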
+
+## Table w/o PK
+
+See [Paimon w/o Primary key]({{< ref "append-table/overview" >}}).
+
+If a table does not have a primary key defined, it is an append table. Compared to a primary key table, it cannot
+directly receive changelogs and cannot be updated through streaming upserts. It
+can only receive appended data.
+
+However, it also supports the batch SQL statements DELETE, UPDATE, and MERGE-INTO.
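+
+As a rough illustration, omitting the primary key clause in Flink SQL yields an append table. The table and column
+names below are hypothetical, and a Paimon catalog is assumed to be the current catalog.
+
+```sql
+-- Flink SQL (illustrative names)
+CREATE TABLE my_append_table (
+ id INT,
+ msg STRING
+);
+
+-- Rows are only appended; two rows with the same id are both kept.
+INSERT INTO my_append_table VALUES (1, 'first'), (1, 'second');
+```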
+
+## View
+
+Views are supported when the metastore supports views, for example the Hive metastore.
+
+A view currently stores the original SQL text. If you need to use a view across engines, write a SQL statement
+that is valid in every engine involved. For example:
+
+```sql
+CREATE VIEW my_view AS SELECT a + 1, b FROM my_db.my_source;
+```
+
+## Format Table
+
+Format tables are supported when the metastore supports them, for example the Hive metastore. The Hive tables
+inside the metastore are mapped to Paimon's Format Table so that computing engines (Spark, Hive, Flink) can read and write them.
+
+A format table refers to a directory that contains multiple files of the same format. Operations on the table
+read or write these files, which makes it easy to retrieve existing data and add new files.
+
+A partitioned format table follows the standard Hive layout: partitions are discovered and inferred from
+the directory structure.
+
+Format Table is enabled by default. You can disable it by setting the Catalog option `'format-table.enabled'` to `false`.
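+
+A sketch of turning the feature off when creating a Hive-backed catalog in Flink SQL. The catalog name is
+hypothetical, and the value `'false'` assumes the option is a boolean flag.
+
+```sql
+-- Flink SQL (illustrative catalog name; 'false' assumes a boolean option)
+CREATE CATALOG my_hive_no_format WITH (
+ 'type' = 'paimon',
+ 'metastore' = 'hive',
+ 'format-table.enabled' = 'false'
+);
+```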
+
+Currently, only the `CSV`, `Parquet`, and `ORC` formats are supported.
+
+### CSV
+
+{{< tabs "format-table-csv" >}}
+{{< tab "Flink SQL" >}}
+
+```sql
+CREATE TABLE my_csv_table (
+ a INT,
+ b STRING
+) WITH (
+ 'type'='format-table',
+ 'file.format'='csv',
+ 'field-delimiter'=','
+)
+```
+{{< /tab >}}
+
+{{< tab "Spark SQL" >}}
+
+```sql
+CREATE TABLE my_csv_table (
+ a INT,
+ b STRING
+) USING csv OPTIONS ('field-delimiter' ',')
+```
+
+{{< /tab >}}
+{{< /tabs >}}
+
+Currently, only the `'field-delimiter'` option is supported.
+
+### Parquet & ORC
+
+{{< tabs "format-table-parquet" >}}
+{{< tab "Flink SQL" >}}
+
+```sql
+CREATE TABLE my_parquet_table (
+ a INT,
+ b STRING
+) WITH (
+ 'type'='format-table',
+ 'file.format'='parquet'
+)
+```
+{{< /tab >}}
+
+{{< tab "Spark SQL" >}}
+
+```sql
+CREATE TABLE my_parquet_table (
+ a INT,
+ b STRING
+) USING parquet
+```
+
+{{< /tab >}}
+{{< /tabs >}}
+
+## Materialized Table
+
+Materialized Table aims to simplify both batch and stream data pipelines by providing a consistent development
+experience, see [Flink Materialized Table](https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/materialized-table/overview/).
+
+Currently, only Flink SQL integrates with Materialized Table; we plan to support it in Spark SQL too.
diff --git a/docs/content/flink/_index.md b/docs/content/flink/_index.md
index c39ff01d8..6ec757fa5 100644
--- a/docs/content/flink/_index.md
+++ b/docs/content/flink/_index.md
@@ -3,7 +3,7 @@ title: Engine Flink
icon: <i class="fa fa-gear title maindish" aria-hidden="true"></i>
bold: true
bookCollapseSection: true
-weight: 4
+weight: 5
---
<!--
Licensed to the Apache Software Foundation (ASF) under one
diff --git a/docs/content/spark/_index.md b/docs/content/spark/_index.md
index 24661e56f..07128574b 100644
--- a/docs/content/spark/_index.md
+++ b/docs/content/spark/_index.md
@@ -3,7 +3,7 @@ title: Engine Spark
icon: <i class="fa fa-gear title maindish" aria-hidden="true"></i>
bold: true
bookCollapseSection: true
-weight: 5
+weight: 6
---
<!--
Licensed to the Apache Software Foundation (ASF) under one