(paimon) branch master updated: [doc] Add table mode page for primary key table

lzljs3620320 Mon, 01 Jul 2024 03:31:55 -0700

This is an automated email from the ASF dual-hosted git repository.

lzljs3620320 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/paimon.git



The following commit(s) were added to refs/heads/master by this push:
     new 59fb7134b [doc] Add table mode page for primary key table
59fb7134b is described below

commit 59fb7134bd01417d255c73cc5a6adcfee57ae775
Author: Jingsong <[email protected]>
AuthorDate: Mon Jul 1 18:31:05 2024 +0800

    [doc] Add table mode page for primary key table
---
 .../primary-key-table/changelog-producer.md        |   2 +-
 docs/content/primary-key-table/deletion-vectors.md |  49 ---------
 docs/content/primary-key-table/merge-engine.md     |   2 +-
 docs/content/primary-key-table/read-optimized.md   |  47 --------
 docs/content/primary-key-table/sequence-rowkind.md |   2 +-
 docs/content/primary-key-table/table-mode.md       | 119 +++++++++++++++++++++
 docs/static/img/cow.png                            | Bin 0 -> 1600997 bytes
 docs/static/img/lsm-inside-bucket.png              | Bin 0 -> 2214166 bytes
 docs/static/img/mor.png                            | Bin 0 -> 1595856 bytes
 docs/static/img/mow-example.png                    | Bin 0 -> 1366847 bytes
 docs/static/img/mow.png                            | Bin 0 -> 1302647 bytes
 11 files changed, 122 insertions(+), 99 deletions(-)

diff --git a/docs/content/primary-key-table/changelog-producer.md 
b/docs/content/primary-key-table/changelog-producer.md
index 45723da1f..88ae5817e 100644
--- a/docs/content/primary-key-table/changelog-producer.md
+++ b/docs/content/primary-key-table/changelog-producer.md
@@ -1,6 +1,6 @@
 ---
 title: "Changelog Producer"
-weight: 4
+weight: 5
 type: docs
 aliases:
 - /primary-key-table/changelog-producer.html
diff --git a/docs/content/primary-key-table/deletion-vectors.md 
b/docs/content/primary-key-table/deletion-vectors.md
deleted file mode 100644
index 3eb6f293f..000000000
--- a/docs/content/primary-key-table/deletion-vectors.md
+++ /dev/null
@@ -1,49 +0,0 @@
----
-title: "Deletion Vectors"
-weight: 6
-type: docs
-aliases:
-- /primary-key-table/deletion-vectors.html
----
-<!--
-Licensed to the Apache Software Foundation (ASF) under one
-or more contributor license agreements.  See the NOTICE file
-distributed with this work for additional information
-regarding copyright ownership.  The ASF licenses this file
-to you under the Apache License, Version 2.0 (the
-"License"); you may not use this file except in compliance
-with the License.  You may obtain a copy of the License at
-
-  http://www.apache.org/licenses/LICENSE-2.0
-
-Unless required by applicable law or agreed to in writing,
-software distributed under the License is distributed on an
-"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
-KIND, either express or implied.  See the License for the
-specific language governing permissions and limitations
-under the License.
--->
-
-# Deletion Vectors
-
-## Overview
-
-The Deletion Vectors mode is designed to takes into account both data reading 
and writing efficiency.
-
-In this mode, additional overhead (looking up LSM Tree and generating the 
corresponding Deletion File) will be introduced during writing,
-but during reading, data can be directly retrieved by employing data with 
deletion vectors, avoiding additional merge costs between different files.
-
-Furthermore, data reading concurrency is no longer limited, and non-primary 
key columns can also be used for filter push down.
-Generally speaking, in this mode, we can get a huge improvement in read 
performance without losing too much write performance.
-
-{{< img src="/img/deletion-vectors-overview.png">}}
-
-## Usage
-
-By specifying `'deletion-vectors.enabled' = 'true'`, the Deletion Vectors mode 
can be enabled.
-
-## Limitation
-
-- `changelog-producer` needs to be `none` or `lookup`.
-- `merge-engine` can't be `first-row`, because the read of first-row is 
already no merging, deletion vectors are not needed.
-- This mode will filter the data in level-0, so when using time travel to read 
`APPEND` snapshot, there will be data delay.
diff --git a/docs/content/primary-key-table/merge-engine.md 
b/docs/content/primary-key-table/merge-engine.md
index f4daa7bf7..32b897a00 100644
--- a/docs/content/primary-key-table/merge-engine.md
+++ b/docs/content/primary-key-table/merge-engine.md
@@ -1,6 +1,6 @@
 ---
 title: "Merge Engine"
-weight: 3
+weight: 4
 type: docs
 aliases:
 - /primary-key-table/merge-engine.html
diff --git a/docs/content/primary-key-table/read-optimized.md 
b/docs/content/primary-key-table/read-optimized.md
deleted file mode 100644
index 1a5a72334..000000000
--- a/docs/content/primary-key-table/read-optimized.md
+++ /dev/null
@@ -1,47 +0,0 @@
----
-title: "Read Optimized"
-weight: 7
-type: docs
-aliases:
-- /primary-key-table/read-optimized.html
----
-<!--
-Licensed to the Apache Software Foundation (ASF) under one
-or more contributor license agreements.  See the NOTICE file
-distributed with this work for additional information
-regarding copyright ownership.  The ASF licenses this file
-to you under the Apache License, Version 2.0 (the
-"License"); you may not use this file except in compliance
-with the License.  You may obtain a copy of the License at
-
-  http://www.apache.org/licenses/LICENSE-2.0
-
-Unless required by applicable law or agreed to in writing,
-software distributed under the License is distributed on an
-"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
-KIND, either express or implied.  See the License for the
-specific language governing permissions and limitations
-under the License.
--->
-
-# Read Optimized
-
-## Overview
-
-For Primary Key Table, it's a 'MergeOnRead' technology. When reading data, 
multiple layers of LSM data are merged,
-and the number of parallelism will be limited by the number of buckets. 
Although Paimon's merge performance is efficient,
-it still cannot catch up with the ordinary AppendOnly table.
-
-We recommend that you use [Deletion Vectors]({{< ref 
"primary-key-table/deletion-vectors" >}}) mode.
-
-If you don't want to use Deletion Vectors mode, you want to query fast enough 
in certain scenarios, but can only find
-older data, you can also:
-
-1. Configure 'compaction.optimization-interval' when writing data. For 
streaming jobs, optimized compaction will then
-   be performed periodically; For batch jobs, optimized compaction will be 
carried out when the job ends. (Or configure
-   `'full-compaction.delta-commits'`, its disadvantage is that it can only 
perform compaction synchronously, which will
-   affect writing efficiency)
-2. Query from [read-optimized system table]({{< ref 
"maintenance/system-tables#read-optimized-table" >}}). Reading from
-   results of optimized files avoids merging records with the same key, thus 
improving reading performance.
-
-You can flexibly balance query performance and data latency when reading.
diff --git a/docs/content/primary-key-table/sequence-rowkind.md 
b/docs/content/primary-key-table/sequence-rowkind.md
index 876348b62..61e2c01c8 100644
--- a/docs/content/primary-key-table/sequence-rowkind.md
+++ b/docs/content/primary-key-table/sequence-rowkind.md
@@ -1,6 +1,6 @@
 ---
 title: "Sequence & Rowkind"
-weight: 5
+weight: 6
 type: docs
 aliases:
 - /primary-key-table/sequence-rowkind.html
diff --git a/docs/content/primary-key-table/table-mode.md 
b/docs/content/primary-key-table/table-mode.md
new file mode 100644
index 000000000..15261cabc
--- /dev/null
+++ b/docs/content/primary-key-table/table-mode.md
@@ -0,0 +1,119 @@
+---
+title: "Table Mode"
+weight: 3
+type: docs
+aliases:
+- /primary-key-table/read-optimized.html
+---
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+# Table Mode
+
+{{< img src="/img/lsm-inside-bucket.png">}}
+
+The file structure of the primary key table is roughly shown in the above 
figure. The table or partition contains
+multiple buckets, and each bucket is a separate LSM tree structure that 
contains multiple files.
+
+The writing process of LSM is roughly as follows: each Flink checkpoint flushes L0 files and triggers a compaction as needed
+to merge the data. Depending on how data is processed during writing, there are three modes:
+
+1. MOR (Merge On Read): The default mode. Only minor compactions are performed, and merging is required for reading.
+2. COW (Copy On Write): With `'full-compaction.delta-commits' = '1'`, full compaction is performed synchronously, which
+   means the merge is completed on write.
+3. MOW (Merge On Write): With `'deletion-vectors.enabled' = 'true'`, the LSM tree is queried during the writing phase to generate
+   the deletion vector file for the data file, which directly filters out unnecessary rows during reading.
+
+The Merge On Write mode is recommended for general primary key tables (where the merge engine is the default `deduplicate`).
+
+## Merge On Read
+
+MOR is the default mode of primary key tables.
+
+{{< img src="/img/mor.png">}}
+
+In MOR mode, all files must be merged for reading: every file is sorted, and the files go through a multi-way
+merge, which involves comparing primary keys.
+
+There is an obvious issue here: a single LSM tree can only be read by a single thread, so the read parallelism
+is limited. If the amount of data in a bucket is too large, read performance can be poor. So, for good read
+performance, it is recommended to analyze the query requirements of the table and keep the data volume in each bucket
+between 200MB and 1GB. But if the buckets are too small, there will be a lot of small file reads and writes, putting
+pressure on the file system.
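+
+As a rough sketch of the sizing advice above, assuming Flink SQL on a Paimon catalog and a hypothetical `orders` table
+whose partitions each hold about 8 GB of data: 16 buckets would put roughly 512 MB in each bucket, which falls inside
+the suggested range. The columns, data volume, and bucket count are illustrative assumptions, not recommended values.
+
+```sql
+-- Hypothetical sizing: ~8 GB per partition / 16 buckets = ~512 MB per bucket.
+CREATE TABLE orders (
+    order_id BIGINT,
+    customer_id BIGINT,
+    amount DECIMAL(10, 2),
+    dt STRING,
+    -- Partition fields are included in the primary key.
+    PRIMARY KEY (dt, order_id) NOT ENFORCED
+) PARTITIONED BY (dt) WITH (
+    'bucket' = '16'
+);
+```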
+
+In addition, because of the merging process, filter-based data skipping cannot be performed on non-primary-key columns;
+otherwise newer data could be filtered out, and stale older data would be returned incorrectly.
+
+- Write performance: very good.
+- Read performance: not so good.
+
+## Copy On Write
+
+```sql
+ALTER TABLE orders SET ('full-compaction.delta-commits' = '1');
+```
+
+Setting `full-compaction.delta-commits` to 1 means that every write triggers a full compaction, and all data is merged
+to the highest level. No merging is needed when reading, so reading performance is the highest.
+But every write requires a full merge, so write amplification is very severe.
+
+{{< img src="/img/cow.png">}}
+
+- Write performance: very bad.
+- Read performance: very good.
+
+## Merge On Write
+
+```sql
+ALTER TABLE orders SET ('deletion-vectors.enabled' = 'true');
+```
+
+Thanks to its LSM structure, Paimon can be queried by primary key. This makes it possible to generate deletion vector
+files when writing, which record which rows in a data file have been deleted. Readers use them to filter out unnecessary rows
+directly, which is equivalent to merging and does not affect reading performance.
+
+{{< img src="/img/mow.png">}}
+
+A simple example:
+
+{{< img src="/img/mow-example.png">}}
+
+An update deletes the old record first and then adds the new one.
+
+- Write performance: good.
+- Read performance: good.
+
+{{< hint info >}}
+Visibility guarantee: For tables in deletion vectors mode, level 0 files only become visible after compaction.
+Compaction is therefore synchronous by default; if asynchronous compaction is turned on, data may be delayed.
+{{< /hint >}}
+
+## MOR Read Optimized
+
+If you don't want to use Deletion Vectors mode but still want fast queries in MOR mode, and can accept reading
+slightly older data, you can also:
+
+1. Configure `'compaction.optimization-interval'` when writing data. For streaming jobs, optimized compaction is then
+   performed periodically; for batch jobs, optimized compaction is carried out when the job ends. (Alternatively, configure
+   `'full-compaction.delta-commits'`; its disadvantage is that it can only perform compaction synchronously, which
+   affects writing efficiency.)
+2. Query from the [read-optimized system table]({{< ref "maintenance/system-tables#read-optimized-table" >}}). Reading from
+   the optimized files avoids merging records with the same key, thus improving reading performance.
+
+You can flexibly balance query performance and data latency when reading.
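+
+A minimal sketch of the two steps above, reusing the `orders` table name from the earlier `ALTER TABLE` examples. The
+interval value is an illustrative assumption, and the `$ro` suffix is assumed to follow the system table naming
+described on the linked page.
+
+```sql
+-- Step 1: have writers run optimized compaction periodically (example value only).
+ALTER TABLE orders SET ('compaction.optimization-interval' = '1 h');
+
+-- Step 2: read only the optimized files through the read-optimized system table.
+-- Results may lag behind the latest writes.
+SELECT * FROM orders$ro;
+```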
diff --git a/docs/static/img/cow.png b/docs/static/img/cow.png
new file mode 100644
index 000000000..caa9787a9
Binary files /dev/null and b/docs/static/img/cow.png differ
diff --git a/docs/static/img/lsm-inside-bucket.png 
b/docs/static/img/lsm-inside-bucket.png
new file mode 100644
index 000000000..5ee5981eb
Binary files /dev/null and b/docs/static/img/lsm-inside-bucket.png differ
diff --git a/docs/static/img/mor.png b/docs/static/img/mor.png
new file mode 100644
index 000000000..0027f70bf
Binary files /dev/null and b/docs/static/img/mor.png differ
diff --git a/docs/static/img/mow-example.png b/docs/static/img/mow-example.png
new file mode 100644
index 000000000..9e944fc0d
Binary files /dev/null and b/docs/static/img/mow-example.png differ
diff --git a/docs/static/img/mow.png b/docs/static/img/mow.png
new file mode 100644
index 000000000..e9b392fcf
Binary files /dev/null and b/docs/static/img/mow.png differ
