This is an automated email from the ASF dual-hosted git repository.
yihua pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/hudi.git
The following commit(s) were added to refs/heads/asf-site by this push:
new db4618d07d81 docs: Fix typos and stale references in current docs (#18881)
db4618d07d81 is described below
commit db4618d07d818cbf7a69be16b17b4fb41307eabb
Author: Y Ethan Guo <[email protected]>
AuthorDate: Fri May 29 01:24:49 2026 -0700
docs: Fix typos and stale references in current docs (#18881)
---
website/docs/catalog_polaris.md | 4 ++--
website/docs/cleaning.md | 2 +-
website/docs/cli.md | 2 +-
website/docs/flink-quick-start-guide.md | 2 +-
website/docs/gcp_bigquery.md | 2 +-
website/docs/reading_tables_streaming_reads.md | 2 +-
website/docs/sql_dml.md | 9 ++-------
website/docs/sql_queries.md | 6 +++---
website/docs/timeline.md | 4 ++--
9 files changed, 14 insertions(+), 19 deletions(-)
diff --git a/website/docs/catalog_polaris.md b/website/docs/catalog_polaris.md
index 10ff3986d296..570a0e0810ae 100644
--- a/website/docs/catalog_polaris.md
+++ b/website/docs/catalog_polaris.md
@@ -6,8 +6,8 @@ toc_max_heading_level: 4
keywords: [hudi, polaris, catalog, integration]
---
-:::warning Polaris Integration Status
-Hudi 1.1.0 added support for Apache Polaris catalog integration (see [PR
#13558](https://github.com/apache/hudi/pull/13558)). However, a Polaris release
that includes [this PR](https://github.com/apache/polaris/pull/1862) is pending
before this integration to be available.
+:::note Polaris Integration Status
+The Polaris integration is available since Polaris 1.3.0 and Hudi 1.1.1.
:::
## Overview
diff --git a/website/docs/cleaning.md b/website/docs/cleaning.md
index 829d0d57cd6d..fa92192f04eb 100644
--- a/website/docs/cleaning.md
+++ b/website/docs/cleaning.md
@@ -36,7 +36,7 @@ Hudi cleaner currently supports the below cleaning policies
to keep a certain nu
retain atleast the last 10 commits. With such a configuration, we ensure
that the oldest version of a file is kept on
disk for at least 5 hours, thereby preventing the longest running query from
failing at any point in time. Incremental
cleaning is also possible using this policy.
- Number of commits to retain can be configured by
[`hoodie.clean.commits.retained`](https://analytics.google.com/analytics/web/#/p300324801/reports/intelligenthome).
+ Number of commits to retain can be configured by
[`hoodie.clean.commits.retained`](https://hudi.apache.org/docs/configurations/#hoodiecleancommitsretained).
The corresponding Flink related config is
[`clean.retain_commits`](https://hudi.apache.org/docs/configurations/#cleanretain_commits).
- **KEEP_LATEST_FILE_VERSIONS**: This policy has the effect of keeping N
number of file versions irrespective of time.
diff --git a/website/docs/cli.md b/website/docs/cli.md
index ddb8132d3cf3..c46cf702d703 100644
--- a/website/docs/cli.md
+++ b/website/docs/cli.md
@@ -628,7 +628,7 @@ The following table shows the Hudi table versions
corresponding to the Hudi rele
| Hudi Table Version | Hudi Release Version(s) |
|:-------------------|:------------------------|
-| `NINE` or `9` | 1.1.x |
+| `NINE` or `9` | 1.1.x - 1.2.x |
| `EIGHT` or `8` | 1.0.x |
| `SIX` or `6` | 0.14.x - 0.15.x |
| `FIVE` or `5` | 0.12.x - 0.13.x |
diff --git a/website/docs/flink-quick-start-guide.md b/website/docs/flink-quick-start-guide.md
index 23ef5efbe3cc..2d267b5479d0 100644
--- a/website/docs/flink-quick-start-guide.md
+++ b/website/docs/flink-quick-start-guide.md
@@ -51,7 +51,7 @@ values={[
>
<TabItem value="flinksql">
-We use the [Flink SQL Client](https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/dev/table/sqlclient/) because it's a good
+We use the [Flink SQL Client](https://nightlies.apache.org/flink/flink-docs-release-1.20/docs/dev/table/sqlclient/) because it's a good
quick-start tool for SQL users.
### Start Flink SQL Client
diff --git a/website/docs/gcp_bigquery.md b/website/docs/gcp_bigquery.md
index 23e020b2419c..75324e01b5e0 100644
--- a/website/docs/gcp_bigquery.md
+++ b/website/docs/gcp_bigquery.md
@@ -65,7 +65,7 @@ Below shows an example for running `BigQuerySyncTool` with
Hudi Streamer.
```shell
spark-submit --master yarn \
--packages com.google.cloud:google-cloud-bigquery:2.10.4 \
---jars "/opt/hudi-gcp-bundle-0.13.0.jar,/opt/hudi-utilities-slim-bundle_2.12-1.0.1.jar,/opt/hudi-spark3.5-bundle_2.12-1.0.1.jar" \
+--jars "/opt/hudi-gcp-bundle-1.2.0.jar,/opt/hudi-utilities-slim-bundle_2.12-1.0.1.jar,/opt/hudi-spark3.5-bundle_2.12-1.0.1.jar" \
--class org.apache.hudi.utilities.streamer.HoodieStreamer \
/opt/hudi-utilities-slim-bundle_2.12-1.0.1.jar \
--target-base-path gs://my-hoodie-table/path \
diff --git a/website/docs/reading_tables_streaming_reads.md b/website/docs/reading_tables_streaming_reads.md
index 7191cedf0115..3b0312beed22 100644
--- a/website/docs/reading_tables_streaming_reads.md
+++ b/website/docs/reading_tables_streaming_reads.md
@@ -79,7 +79,7 @@ df = spark.readStream \
.format("hudi") \
.load(basePath)
-# ead stream and output results to console
+# read stream and output results to console
spark.readStream \
.format("hudi") \
.load(basePath) \
diff --git a/website/docs/sql_dml.md b/website/docs/sql_dml.md
index 43e38e813971..bd5d7a8fe3e8 100644
--- a/website/docs/sql_dml.md
+++ b/website/docs/sql_dml.md
@@ -386,7 +386,7 @@ UPDATE hudi_table SET price = price * 2, ts = 1111 WHERE id
= 1;
```
:::note Key requirements
-Update query only work with batch excution mode.
+Update query only work with batch execution mode.
:::
### Delete From
@@ -396,17 +396,12 @@ With Flink SQL, you can use delete command to delete the
rows from hudi table. H
DELETE FROM tableIdentifier [ WHERE boolExpression ]
```
-```sql
-DELETE FROM hudi_table WHERE price < 100;
-```
-
-
```sql
DELETE FROM hudi_table WHERE price < 100;
```
:::note Key requirements
-Delete query only work with batch excution mode.
+Delete query only work with batch execution mode.
:::
### Lookup Joins
diff --git a/website/docs/sql_queries.md b/website/docs/sql_queries.md
index 5a87ead03370..71d71812c7b0 100644
--- a/website/docs/sql_queries.md
+++ b/website/docs/sql_queries.md
@@ -346,7 +346,7 @@ Please refer to [configurations](basic_configurations.md)
section for the import
:::note Incremental Query Checkpointing between Hudi 0.x and 1.0.
In Hudi 1.0, we switch the incremental and CDC query to used completion time,
instead of instant time, to determine the
range of commits to incrementally pull from. The checkpoint stored for Hudi
incremental source and related sources is
-also changed to use completion time. To support compatiblity, Hudi does a
checkpoint translation from requested instant
+also changed to use completion time. To support compatibility, Hudi does a
checkpoint translation from requested instant
time to completion time depending on the source table version.
:::
@@ -647,7 +647,7 @@ select * from hudi_table/*+
OPTIONS('read.streaming.enabled'='true', 'read.start
| ----------- | ------- | ------- | ------- |
| `read.streaming.enabled` | false | `false` | Specify `true` to read as
streaming |
| `read.start-commit` | false | the latest commit | Start commit time in
format 'yyyyMMddHHmmss', use `earliest` to consume from the start commit |
-| `read.streaming.skip_compaction` | false | `false` | Whether to skip
compaction instants for streaming read, generally for two purpose: 1) Avoid
consuming duplications from compaction instants created for created by Hudi
versions < 0.11.0 or when `hoodie.compaction.preserve.commit.metadata` is
disabled 2) When change log mode is enabled, to only consume change for right
semantics. |
+| `read.streaming.skip_compaction` | false | `false` | Whether to skip
compaction instants for streaming read, generally for two purpose: 1) Avoid
consuming duplications from compaction instants for tables created by Hudi
versions < 0.11.0 or when `hoodie.compaction.preserve.commit.metadata` is
disabled 2) When change log mode is enabled, to only consume change for right
semantics. |
| `clean.retain_commits` | false | `10` | The max number of commits to retain
before cleaning, when change log mode is enabled, tweaks this option to adjust
the change log live time. For example, the default strategy keeps 50 minutes of
change logs if the checkpoint interval is set up as 5 minutes. |
:::note
@@ -707,7 +707,7 @@ internal Hudi metadata such as commit time, record key, and
partition path. The
| Metadata Column Name | Description
|
|--------------------------|--------------------------------------------------------------------------------|
| `_hoodie_commit_time` | The commit time when the record was committed
|
-| `_hoodie_commit_seqno` | The commit requence number of the record
|
+| `_hoodie_commit_seqno` | The commit sequence number of the record
|
| `_hoodie_record_key` | The record key of the record
|
| `_hoodie_partition_path` | The partition path of the record
|
| `_hoodie_file_name` | The file name where the record is stored
|
diff --git a/website/docs/timeline.md b/website/docs/timeline.md
index 47d35a50eafc..0da85c7fddff 100644
--- a/website/docs/timeline.md
+++ b/website/docs/timeline.md
@@ -42,8 +42,8 @@ becomes _REPLACE_COMMIT_ in completed state. Compactions
complete as _COMMIT_ ac
may map to the same action on the timeline.
### State Transitions
-Actions go through state transitions on the timeline, with each transition
recorded by a file of the pattern `<requsted instant>.<action>.<state>`(for
other states) or
-`<requsted instant>_<completed instant>.<action>` (for COMPLETED state). Hudi
guarantees that the state transitions are atomic and timeline consistent based
on the instant time.
+Actions go through state transitions on the timeline, with each transition
recorded by a file of the pattern `<requested instant>.<action>.<state>`(for
other states) or
+`<requested instant>_<completed instant>.<action>` (for COMPLETED state). Hudi
guarantees that the state transitions are atomic and timeline consistent based
on the instant time.
Atomicity is achieved by relying on the atomic operations on the underlying
storage (e.g. PUT calls to S3/Cloud Storage).
Valid state transitions are as follows: