This is an automated email from the ASF dual-hosted git repository.
yihua pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/hudi.git
The following commit(s) were added to refs/heads/asf-site by this push:
new e3a28fbdaf94 docs: Update documentation for new features in Hudi 1.2.0
(#18867)
e3a28fbdaf94 is described below
commit e3a28fbdaf946a6a5cc0f46229dac9d687f2aa99
Author: Y Ethan Guo <[email protected]>
AuthorDate: Thu May 28 09:27:50 2026 -0700
docs: Update documentation for new features in Hudi 1.2.0 (#18867)
---
website/docs/ai_overview.md | 4 +-
website/docs/azure_hoodie.md | 14 +-
website/docs/blob_unstructured_data.md | 45 ++++-
website/docs/cleaning.md | 47 +++++
website/docs/cli.md | 25 ++-
website/docs/clustering.md | 59 +++++-
website/docs/compaction.md | 13 +-
website/docs/concurrency_control.md | 42 ++++-
website/docs/deployment.md | 4 +-
website/docs/flink-quick-start-guide.md | 33 ++--
website/docs/flink_tuning.md | 47 +++++
website/docs/hoodie_streaming_ingestion.md | 106 ++++++++++-
website/docs/ingestion_flink.md | 244 +++++++++++++++++++++++--
website/docs/key_generation.md | 31 +++-
website/docs/lance_file_format.md | 129 +++++++++++--
website/docs/metadata.md | 31 ++++
website/docs/metadata_indexing.md | 33 +++-
website/docs/metrics.md | 67 ++++---
website/docs/overview.mdx | 4 +-
website/docs/precommit_validator.md | 67 +++++++
website/docs/procedures.md | 60 +++++-
website/docs/reading_tables_batch_reads.md | 26 +++
website/docs/reading_tables_streaming_reads.md | 44 +++++
website/docs/sql_ddl.md | 55 +++++-
website/docs/sql_queries.md | 33 +++-
website/docs/syncing_aws_glue_data_catalog.md | 4 +
website/docs/syncing_metastore.md | 41 +++++
website/docs/variant_type.md | 26 ++-
website/docs/vector_search.md | 42 ++++-
website/docs/writing_data.md | 29 ++-
30 files changed, 1293 insertions(+), 112 deletions(-)
diff --git a/website/docs/ai_overview.md b/website/docs/ai_overview.md
index 11c695d1bf9c..c28bcae6543a 100644
--- a/website/docs/ai_overview.md
+++ b/website/docs/ai_overview.md
@@ -18,7 +18,7 @@ Apache Hudi's AI-native capabilities bring this vision to
life with four foundat
### VECTOR Type and Similarity Search
-Store high-dimensional embedding vectors as first-class column types and run
approximate nearest neighbor (ANN)
+Store high-dimensional embedding vectors as first-class column types and run
vector similarity
search directly in Spark SQL.
```sql
@@ -99,7 +99,7 @@ query performance, while keeping the flexibility for
everything else.
Hudi's pluggable file format architecture supports **Lance**, a modern
columnar format purpose-built for
AI/ML workloads. Lance provides:
-- Efficient vector indexing and ANN search
+- Native vector column encoding (`FixedSizeList`) — no conversion overhead at
the file-format layer
- Fast random access for training data sampling
- Optimized storage for high-dimensional arrays and nested structures
diff --git a/website/docs/azure_hoodie.md b/website/docs/azure_hoodie.md
index a22d66598141..2730f6f9411a 100644
--- a/website/docs/azure_hoodie.md
+++ b/website/docs/azure_hoodie.md
@@ -2,7 +2,7 @@
title: Microsoft Azure
keywords: [ hudi, hive, azure, spark, presto]
summary: In this page, we go over how to configure Hudi with Azure filesystem.
-last_modified_at: 2020-05-25T19:00:57-04:00
+last_modified_at: 2026-05-27T00:00:00-00:00
---
In this page, we explain how to use Hudi on Microsoft Azure.
@@ -49,6 +49,18 @@ This combination works out of the box. No extra config
needed.
.load("/mountpoint/hudi-tables/customer")
```
+## Concurrency Control
+
+As of Hudi 1.2.0, the storage-based lock provider supports Azure ADLS Gen2
(`abfs://`, `abfss://`) and Azure Blob Storage (`wasb://`, `wasbs://`) base
paths for concurrency control. This allows multi-writer pipelines on Azure to
use storage-native conditional writes for locking, without requiring external
systems such as ZooKeeper or Hive Metastore.
+
+Add `hudi-azure-bundle` to your classpath and set:
+
+```properties
+hoodie.write.lock.provider=org.apache.hudi.client.transaction.lock.StorageBasedLockProvider
+```
+
+The lock client supports multiple Azure authentication methods (connection
string, SAS token, managed identity, service principal, and
`DefaultAzureCredential`). See [Concurrency Control — Azure Storage-Based
Lock](concurrency_control.md#azure-storage-based-lock) for the full
configuration reference and authentication precedence.
+
## Related Resources
<h3>Blogs</h3>
diff --git a/website/docs/blob_unstructured_data.md
b/website/docs/blob_unstructured_data.md
index 74acc799a230..253656565084 100644
--- a/website/docs/blob_unstructured_data.md
+++ b/website/docs/blob_unstructured_data.md
@@ -3,7 +3,7 @@ title: "Unstructured Data"
keywords: [ hudi, blob, unstructured data, images, binary, pdf, audio, video,
inline, out-of-line, read_blob]
summary: "Store and query unstructured data (images, PDFs, audio, video) in
Hudi tables using the BLOB type with inline or out-of-line storage"
toc: true
-last_modified_at: 2026-04-25T00:00:00-00:00
+last_modified_at: 2026-05-27T00:00:00-00:00
---
import Tabs from '@theme/Tabs';
@@ -72,6 +72,7 @@ schema = pa.schema([
pa.field("external_path", pa.string()),
pa.field("offset", pa.int64()),
pa.field("length", pa.int64()),
+ pa.field("managed", pa.bool_()),
])),
]), metadata={b"hudi_type": b"BLOB"}),
])
@@ -84,7 +85,7 @@ The BLOB internal structure is a struct with three fields:
- `external_path` — file path for out-of-line data
- `offset` — byte offset in the file (null means read from start)
- `length` — byte length to read (null means read to end of file)
- - `managed` — boolean indicating whether Hudi manages the external file
+ - `managed` — boolean. Only meaningful for `OUT_OF_LINE` blobs. Marks
whether Hudi owns the lifecycle of the referenced external file. **Not consumed
by the cleaner yet** — set the value to record intent, and a future cleaner
implementation will use it: `true` → cleaner may delete the external file when
the blob row is no longer referenced; `false` → cleaner will leave the external
file in place.
</TabItem>
</Tabs>
@@ -112,7 +113,7 @@ INSERT INTO media_assets VALUES (
named_struct(
'type', 'INLINE',
'data', /* binary literal or column reference */,
- 'reference', CAST(NULL AS STRUCT<external_path: STRING, offset:
BIGINT, length: BIGINT>)
+ 'reference', CAST(NULL AS STRUCT<external_path: STRING, offset:
BIGINT, length: BIGINT, managed: BOOLEAN>)
)
);
```
@@ -158,7 +159,8 @@ INSERT INTO media_assets VALUES (
'reference', named_struct(
'external_path', 's3://my-bucket/media/container_001.bin',
'offset', 8388608, -- byte offset in the container
- 'length', 1073741824 -- number of bytes
+ 'length', 1073741824, -- number of bytes
+ 'managed', false -- intent flag; not consumed by
the cleaner yet
)
)
);
@@ -290,15 +292,44 @@ Out-of-line BLOBs keep the Hudi table footprint extremely
small:
| Property | Default | Description |
|:---------|:--------|:------------|
-| `hoodie.read.blob.inline.mode` | `CONTENT` | Controls how INLINE BLOBs are
read. `CONTENT` materializes raw bytes in the `data` column. `DESCRIPTOR`
surfaces `(position, size)` coordinates rewritten as OUT_OF_LINE references. |
+| `hoodie.read.blob.inline.mode` | `DESCRIPTOR` | Controls how INLINE BLOBs
are read. `DESCRIPTOR` (default) returns an out-of-line-shaped reference
pointing at the in-file coordinates of the bytes — no bytes are materialized.
`CONTENT` materializes the raw inline bytes directly in the `data` field on
every read. |
| `hoodie.blob.batching.max.gap.bytes` | `4096` | Maximum gap (in bytes)
between consecutive byte ranges before they are merged into a single read.
Larger values reduce I/O calls at the cost of reading some unused bytes. |
| `hoodie.blob.batching.lookahead.size` | `50` | Number of rows to buffer for
batch read detection. Larger values improve batching for sorted data but
increase memory usage. |
:::note
-DESCRIPTOR mode is only supported on Lance-backed tables. CONTENT mode is
always used for internal
-operations (compaction, merge, log replay) regardless of this setting.
+`DESCRIPTOR` mode is the default for all storage formats including Lance.
`CONTENT` mode is always
+used for internal operations (compaction, merge, log replay) regardless of
this setting.
:::
+:::caution Calling read_blob() on INLINE columns under DESCRIPTOR mode
+Under the default `DESCRIPTOR` mode, calling `read_blob()` on an INLINE BLOB
column **throws** an error because
+the raw bytes are not materialized in the scan, so there is nothing for
`read_blob()` to return.
+To read inline bytes with `read_blob()`, switch to `CONTENT` mode first:
+
+```sql
+SET hoodie.read.blob.inline.mode=CONTENT;
+SELECT asset_id, read_blob(content) AS raw_bytes
+FROM media_assets
+WHERE asset_id = 'asset_001';
+```
+
+This setting affects only INLINE columns — OUT_OF_LINE columns always fetch
from the external path
+regardless of mode.
+:::
+
+## Metastore Sync
+
+When syncing BLOB column schemas to Hive or BigQuery, Hudi maps the BLOB
struct to the target
+catalog's native struct type:
+
+| Catalog | BLOB representation |
+|:--------|:-------------------|
+| Hive | `STRUCT<type:STRING, data:BINARY,
reference:STRUCT<external_path:STRING, offset:BIGINT, length:BIGINT,
managed:BOOLEAN>>` |
+| BigQuery | Equivalent `STRUCT` fields |
+
+The raw binary payload is preserved in the struct representation, but
`read_blob()` is a Spark SQL
+function and is not available in Hive or BigQuery directly.
+
## Best Practices
1. **Choose the right mode** — Use inline for small, frequently-accessed
objects. Use out-of-line for
diff --git a/website/docs/cleaning.md b/website/docs/cleaning.md
index c3498c8aa914..829d0d57cd6d 100644
--- a/website/docs/cleaning.md
+++ b/website/docs/cleaning.md
@@ -3,6 +3,7 @@ title: Cleaning
toc: true
toc_min_heading_level: 2
toc_max_heading_level: 4
+last_modified_at: 2026-05-27T00:00:00-00:00
---
## Background
Cleaning is a table service employed by Hudi to reclaim space occupied by
older versions of data and keep storage costs
@@ -50,6 +51,41 @@ Hudi cleaner currently supports the below cleaning policies
to keep a certain nu
be retained are cleaned. Currently you can configure by parameter
[`hoodie.clean.hours.retained`](https://hudi.apache.org/docs/configurations/#hoodiecleanerhoursretained).
The corresponding Flink related config is
[`clean.retain_hours`](https://hudi.apache.org/docs/configurations/#cleanretain_hours).
+#### Empty Clean Commits for Append-Only Tables
+
+Append-only tables never accumulate updates, so the cleaner's
`earliest_commit_to_retain` pointer never advances —
+causing the cleaner to scan the full table history on every run. Hudi 1.2.0
introduced periodic _empty clean commits_
+to advance this pointer even when there is nothing to delete.
+
+| Config Name | Default | Description |
+|---|---|---|
+| `hoodie.write.empty.clean.interval.hours` | `-1` (disabled) | Interval in
hours at which an empty clean commit is created. `-1` disables the feature.
Must be `-1` or `>= 1`. When enabled, the cleaner advances
`earliest_commit_to_retain` so that subsequent clean plans only scan partitions
modified after the last empty clean's pointer. |
+
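+A minimal sketch of enabling this on an append-only table. The 24-hour
+interval is an assumed value for illustration, not a recommendation:
+
+```properties
+# Interval, in hours, between empty clean commits (24 is an assumed value)
+hoodie.write.empty.clean.interval.hours=24
+```
+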
+#### Capping the Number of Commits Cleaned per Run
+
+Since 1.2.0, you can limit how many commits are cleaned in a single clean run,
which is useful for controlling job
+duration on tables that have fallen significantly behind on cleaning.
+
+| Config Name | Default | Description |
+|---|---|---|
+| `hoodie.clean.max.commits.to.clean` | `Long.MAX_VALUE` (unbounded) | Maximum
number of commits cleaned in a single clean commit. Applicable when the
cleaning policy is `KEEP_LATEST_COMMITS` or `KEEP_LATEST_BY_HOURS`. Must be `>=
1`. |
+
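+For example, a table that has fallen far behind on cleaning could cap each
+run as sketched below. The cap of 50 commits is an assumed value chosen only
+for illustration:
+
+```properties
+# Clean at most 50 commits per clean run (assumed value; must be >= 1)
+hoodie.clean.max.commits.to.clean=50
+```
+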
+#### Full-Clean Partition Filtering
+
+When incremental cleaning is disabled
(`hoodie.clean.incremental.enabled=false`), the cleaner scans every partition on
+every run. For very large tables this can cause out-of-memory (OOM) errors during planning. Hudi
1.2.0 added two configs to restrict which
+partitions are examined.
+
+:::note
+Both configs require `hoodie.clean.incremental.enabled=false`. If both are
set, `hoodie.clean.partition.filter.selected`
+takes precedence over the regex.
+:::
+
+| Config Name | Default | Description |
+|---|---|---|
+| `hoodie.clean.partition.filter.regex` | (none) | Java regex pattern; only
partitions whose path matches are cleaned. |
+| `hoodie.clean.partition.filter.selected` | (none) | Comma-separated list of
partition paths to clean; takes precedence over the regex when both are set. |
+
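+A sketch of restricting a full clean to a subset of partitions. The
+`2026/05/` partition layout is hypothetical; substitute a pattern that
+matches your own partition paths:
+
+```properties
+# Both filter configs require incremental cleaning to be disabled
+hoodie.clean.incremental.enabled=false
+# Only clean partitions whose path matches this Java regex (hypothetical layout)
+hoodie.clean.partition.filter.regex=2026/05/.*
+```
+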
### Configs
For details about all possible configurations and their default values see the
[configuration
docs](https://hudi.apache.org/docs/next/configurations/#Clean-Configs).
For Flink related configs refer
[here](https://hudi.apache.org/docs/next/configurations/#FLINK_SQL).
@@ -76,6 +112,17 @@ hoodie.clean.async=true
For Flink based writing, this is the default mode of cleaning. Please refer to
[`clean.async.enabled`](https://hudi.apache.org/docs/configurations/#cleanasyncenabled)
for details.
+#### Pre-Write Cleaner Policy
+
+By default the cleaner runs _after_ a write commits. Hudi 1.2.0 introduced
`hoodie.prewrite.cleaner.policy`, which
+lets you force a clean (or rollback of failed writes) _before_ each write
begins. This is useful in multi-writer
+deployments where you want a deterministic table state before every write —
see [concurrency control](concurrency_control.md)
+for related multi-writer configuration.
+
+| Config Name | Default | Description |
+|---|---|---|
+| `hoodie.prewrite.cleaner.policy` | `NONE` | Pre-write cleaning action.
`NONE`: no pre-write action (default). `CLEAN`: run a clean pass before each
write — this also rolls back failed writes as part of the clean.
`ROLLBACK_FAILED_WRITES`: only roll back any failed writes before each write,
without running a full clean. |
+
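+As an illustration, a writer that should only roll back failed writes
+before each write, without paying for a full clean, might set:
+
+```properties
+# Accepted values per the table above: NONE, CLEAN, ROLLBACK_FAILED_WRITES
+hoodie.prewrite.cleaner.policy=ROLLBACK_FAILED_WRITES
+```
+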
#### Run independently
Hoodie Cleaner can also be run as a separate process. Following is the command
for running the cleaner independently:
```
diff --git a/website/docs/cli.md b/website/docs/cli.md
index a29ffdfc2637..ddb8132d3cf3 100644
--- a/website/docs/cli.md
+++ b/website/docs/cli.md
@@ -1,7 +1,7 @@
---
title: CLI
keywords: [hudi, cli]
-last_modified_at: 2021-08-18T15:59:57-04:00
+last_modified_at: 2026-05-27T00:00:00-00:00
---
### Local set up
@@ -340,6 +340,13 @@ $ hdfs dfs -ls /app/uber/trips/.hoodie/*.inflight
-rw-r--r-- 3 vinoth supergroup 321984 2016-10-05 23:18
/app/uber/trips/.hoodie/20161005225920.inflight
```
+To list all inflight and requested instants that have been running longer than
a specified number of minutes, use `commits show_inflights`:
+
+```shell
+hudi:trips->commits show_inflights --lookbackInMins 30
+```
+
+This lists every inflight or requested instant whose requested timestamp is
older than 30 minutes, showing the commit time, action type, and current state.
This is useful for detecting hung or stuck writes. The `--lookbackInMins`
option defaults to `0` (returns all inflight/requested instants).
### Drilling Down to a specific Commit
@@ -675,6 +682,22 @@ corresponding to the library release version is used:
upgrade table
```
+### Record Index Lookup
+
+To look up a record's file location via the Record Level Index (RLI) stored in
the Metadata Table:
+
+```shell
+hudi:trips->metadata lookup-record-index --record_key <key>
+```
+
+For a partitioned (non-global) RLI, the partition path is required:
+
+```shell
+hudi:trips->metadata lookup-record-index --record_key <key> --partition_path
<partition>
+```
+
+The `--partition_path` argument is optional for a global RLI (where record
keys are unique across all partitions). If it is omitted for a partitioned RLI,
the command returns an error. The output columns are `Record key`,
`Partition path`, `File Id`, and `Instant time`.
+
### Change Hudi Table Type
There are cases we want to change the hudi table type. For example, change COW
table to MOR for more efficient and
lower latency ingestion; change MOR to COW for better read performance and
compatibility with downstream engines.
diff --git a/website/docs/clustering.md b/website/docs/clustering.md
index 7442967b6505..7426bf9e2dcc 100644
--- a/website/docs/clustering.md
+++ b/website/docs/clustering.md
@@ -2,7 +2,7 @@
title: Clustering
summary: "In this page, we describe async compaction in Hudi."
toc: true
-last_modified_at: 2025-11-24T02:44:48
+last_modified_at: 2026-05-27T00:00:00-00:00
---
## Background
@@ -134,6 +134,47 @@ dynamically expanding the buckets for bucket index
datasets.
:::note The latter two strategies are applicable only for the Spark engine.
:::
+#### CommitBasedClusteringPlanStrategy
+
+Hudi 1.2.0 introduced
`org.apache.hudi.table.action.cluster.strategy.CommitBasedClusteringPlanStrategy`,
a plan
+strategy that schedules clustering based on commit patterns rather than just
file size. It groups file slices by the
+commits that produced them, making it easier to cluster data written in
specific time windows or under specific commit
+criteria.
+
+| Config Name | Default | Description |
+|---|---|---|
+| `hoodie.clustering.plan.strategy.class` |
`SparkSizeBasedClusteringPlanStrategy` | Set to
`org.apache.hudi.table.action.cluster.strategy.CommitBasedClusteringPlanStrategy`
to use commit-based planning. |
+| `hoodie.clustering.plan.strategy.earliest.commit.to.cluster` | (none) |
Earliest commit time (exclusive) to start clustering from. Only commits after
this instant are considered. Useful for incrementally clustering new data while
skipping already-clustered history. |
+
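+A minimal sketch of switching to commit-based planning. `<instant-time>` is
+a placeholder for a commit time taken from your own table's timeline:
+
+```properties
+hoodie.clustering.plan.strategy.class=org.apache.hudi.table.action.cluster.strategy.CommitBasedClusteringPlanStrategy
+# Exclusive lower bound: only commits after this instant are considered
+hoodie.clustering.plan.strategy.earliest.commit.to.cluster=<instant-time>
+```
+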
+#### SparkStreamCopyClusteringPlanStrategy
+
+Available since Hudi 1.2.0,
`org.apache.hudi.client.clustering.plan.strategy.SparkStreamCopyClusteringPlanStrategy`
+is a Spark-only plan strategy that performs binary file stitching (byte-level
copy) rather than re-reading and
+re-writing records. This can be significantly faster when the goal is simply
to coalesce small files and sort order is
+not required. It is paired with
+`org.apache.hudi.client.clustering.run.strategy.SparkStreamCopyClusteringExecutionStrategy`.
+
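+The pairing above might be configured as sketched below. This assumes the
+execution strategy is selected through
+`hoodie.clustering.execution.strategy.class`, which is not stated in this
+section; check the configuration reference before relying on it:
+
+```properties
+hoodie.clustering.plan.strategy.class=org.apache.hudi.client.clustering.plan.strategy.SparkStreamCopyClusteringPlanStrategy
+# Assumed key for choosing the matching execution strategy
+hoodie.clustering.execution.strategy.class=org.apache.hudi.client.clustering.run.strategy.SparkStreamCopyClusteringExecutionStrategy
+```
+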
+#### Single-Group Clustering Control
+
+| Config Name | Default | Description |
+|---|---|---|
+| `hoodie.clustering.plan.strategy.single.group.clustering.enabled` | `true` |
Whether to generate a clustering plan when only one file group is eligible. Set
to `false` to skip clustering when there is nothing meaningful to consolidate
(i.e., the partition already has a single file group). |
+
+#### File-Slice Sort Order in Clustering Plan Generation
+
+Since 1.2.0, the order in which file slices are packed into clustering groups
is configurable, giving more control over
+which files are colocated and how groups are filled.
+
+| Config Name | Default | Description |
+|---|---|---|
+| `hoodie.clustering.plan.strategy.file.slices.sort.by` | `SIZE` |
Comma-separated list of fields used to sort file slices when packing them into
clustering groups within a partition. `SIZE`: sort by file size descending
(largest first). `INSTANT_TIME`: sort by commit time ascending (oldest files
first). Example: `INSTANT_TIME,SIZE` sorts by commit time then by size. |
+
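+Using the example value from the table, the following sketch packs the
+oldest file slices first and breaks ties by size:
+
+```properties
+# Sort by commit time, then by file size
+hoodie.clustering.plan.strategy.file.slices.sort.by=INSTANT_TIME,SIZE
+```
+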
+#### Driver-Side Plan Generation
+
+| Config Name | Default | Description |
+|---|---|---|
+| `hoodie.clustering.plan.generation.use.local.engine.context` | `false` |
When enabled, clustering group computation runs on the driver (local engine
context) instead of being distributed across executors. Enable when there are
only a few partitions with many files, where driver-local computation is more
resource-efficient than allocating executor slots. |
+
### Execution Strategy
After building the clustering groups in the planning phase, Hudi applies
execution strategy, for each group, primarily
@@ -251,6 +292,22 @@ In addition to the basic mode options, HoodieClusteringJob
supports the followin
These retry options are only effective when using `--mode scheduleAndExecute`.
The `--retry-last-failed-job` option requires `--job-max-processing-time-ms` to
be set to a positive value to detect stale inflight instants.
:::
+#### Automatic Expiration of Stale Clustering Instants
+
+When a clustering job is scheduled but never successfully executed (e.g., due
to a driver failure), the inflight
+`replacecommit` instant blocks future clustering runs. Hudi 1.2.0 adds
automatic expiration of such stale clustering
+instants, complementing the manual retry options above.
+
+:::note
+Expired clustering plan cleanup requires
`hoodie.clean.failed.writes.policy=LAZY`. With LAZY cleaning, the rollback of
+failed writes (triggered on the next write) also rolls back expired clustering
instants.
+:::
+
+| Config Name | Default | Description |
+|---|---|---|
+| `hoodie.clustering.enable.expirations` | `false` | When enabled, rollback of
failed writes (under LAZY cleaning) also rolls back clustering `replacecommit`
instants whose heartbeat has expired. Clustering jobs record a heartbeat before
scheduling so other writers can detect stale attempts. |
+| `hoodie.clustering.expiration.threshold.mins` | `60` | A clustering instant
is not considered expired unless its creation time is at least this many
minutes old. Acts as a guardrail to avoid rolling back clustering attempts that
are still in progress. |
+
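+Putting the note and the two configs together, a sketch might look like the
+following. The 120-minute threshold is an assumed value for illustration:
+
+```properties
+# Required for expired clustering plan cleanup
+hoodie.clean.failed.writes.policy=LAZY
+hoodie.clustering.enable.expirations=true
+# Treat a clustering instant as eligible only after 120 minutes (assumed value)
+hoodie.clustering.expiration.threshold.mins=120
+```
+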
Note that to run this job while the original writer is still running, please
enable multi-writing:
```properties
diff --git a/website/docs/compaction.md b/website/docs/compaction.md
index 89b9214f0bd9..bf1dee7c12d6 100644
--- a/website/docs/compaction.md
+++ b/website/docs/compaction.md
@@ -4,7 +4,7 @@ summary: "In this page, we describe async compaction in Hudi."
toc: true
toc_min_heading_level: 2
toc_max_heading_level: 4
-last_modified_at: 2025-11-24T02:44:48
+last_modified_at: 2026-05-27T00:00:00-00:00
---
## Background
@@ -104,6 +104,17 @@ BoundedPartitionAwareCompactionStrategy</li></ul>
Please refer to [advanced
configs](https://hudi.apache.org/docs/next/configurations#Compaction-Configs)
for more details.
:::
+#### Metadata Table Compaction Trigger Strategy
+
+Available since Hudi 1.2.0, the metadata table (MDT) supports the same set of
compaction trigger strategies as the
+data table, plus a time-based option.
+
+| Config Name | Default | Description |
+|---|---|---|
+| `hoodie.metadata.compact.trigger.strategy` | `NUM_COMMITS` | Trigger
strategy for MDT compaction. Accepts the same values as
`hoodie.compact.inline.trigger.strategy`: `NUM_COMMITS`,
`NUM_COMMITS_AFTER_LAST_REQUEST`, `TIME_ELAPSED`, `NUM_AND_TIME`,
`NUM_OR_TIME`. |
+| `hoodie.metadata.compact.max.delta.commits` | `10` | Number of delta commits
after the last MDT compaction before a new one is scheduled (for
`NUM_COMMITS`-based strategies). |
+| `hoodie.metadata.compact.max.delta.seconds` | `7200` | Elapsed seconds after
the last MDT compaction before scheduling a new one. Takes effect only for
`TIME_ELAPSED`, `NUM_AND_TIME`, and `NUM_OR_TIME` strategies. |
+
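+A sketch of a combined count and time trigger for the MDT. The one-hour
+value is an assumption, and the comment reflects how `NUM_OR_TIME` is
+understood to behave for data table compaction:
+
+```properties
+# Schedule MDT compaction when either threshold is reached
+hoodie.metadata.compact.trigger.strategy=NUM_OR_TIME
+hoodie.metadata.compact.max.delta.commits=10
+# 3600 seconds is an assumed value (the default is 7200)
+hoodie.metadata.compact.max.delta.seconds=3600
+```
+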
## Ways to trigger Compaction
### Inline
diff --git a/website/docs/concurrency_control.md
b/website/docs/concurrency_control.md
index d6231aca79b7..2056acfca64c 100644
--- a/website/docs/concurrency_control.md
+++ b/website/docs/concurrency_control.md
@@ -4,7 +4,7 @@ summary: On this page, we discuss how to perform concurrent
writes to Hudi table
toc: true
toc_min_heading_level: 2
toc_max_heading_level: 4
-last_modified_at: 2025-11-23T14:20:00
+last_modified_at: 2026-05-27T00:00:00-00:00
---
Concurrency control defines how different writers, readers, and table services
coordinate access to a Hudi table. Hudi ensures atomic writes by publishing
commits atomically to the timeline, stamped with an instant time that denotes
when the action is deemed to have occurred. Unlike general-purpose file version
control, Hudi draws a clear distinction between writer processes that issue
[write operations](write_operations.md), table services that (re)write
data/metadata to optimize or per [...]
@@ -47,6 +47,7 @@ Add the corresponding cloud bundle to your classpath:
* For S3: `hudi-aws-bundle`
* For GCS: `hudi-gcp-bundle`
+* For Azure (`abfs://`, `abfss://`, `wasb://`, `wasbs://`): `hudi-azure-bundle`
Set this configuration:
@@ -54,7 +55,7 @@ Set this configuration:
hoodie.write.lock.provider=org.apache.hudi.client.transaction.lock.StorageBasedLockProvider
```
-Supported for S3 and GCS (additional systems planned). This cloud-native
design works directly with storage features, simplifying large-scale cloud
operations.
+Supported for S3, GCS, and Azure ADLS Gen2 / Azure Blob Storage. This
cloud-native design works directly with storage features, simplifying
large-scale cloud operations.
Optional tuning configurations:
@@ -63,6 +64,27 @@ Optional tuning configurations:
| hoodie.write.lock.storage.validity.timeout.secs | 300 (Optional) | Validity
period (seconds) for each new lock. The provider renews its lock until the
lease extends or timeout occurs.<br /><br />`Config Param:
STORAGE_BASED_LOCK_VALIDITY_TIMEOUT_SECS`<br />`Since Version: 1.0.2` |
| hoodie.write.lock.storage.renew.interval.secs | 30 (Optional) | Interval
(seconds) between renewal attempts.<br /><br />`Config Param:
STORAGE_BASED_LOCK_RENEW_INTERVAL_SECS`<br />`Since Version: 1.0.2`
|
+#### Azure Storage-Based Lock
+
+Authentication is resolved in the following precedence order:
+
+| Priority | Config Key | Description |
+|----------|------------|-------------|
+| 1 (highest) | `hoodie.write.lock.azure.connection.string` | Azure Storage
connection string |
+| 2 | `hoodie.write.lock.azure.sas.token` | SAS token (Azure does not
recommend SAS tokens for production use) |
+| 3 | `hoodie.write.lock.azure.managed.identity.client.id` | Client ID of a
user-assigned managed identity (`ManagedIdentityCredential`) |
+| 4 | `hoodie.write.lock.azure.client.tenant.id` + `.client.id` +
`.client.secret` | Service principal via `ClientSecretCredential` — all three
must be set |
+| 5 (lowest) | _(none)_ | `DefaultAzureCredential` chain (system-assigned
managed identity, environment variables, etc.) |
+
+Example configuration for service-principal authentication:
+
+```properties
+hoodie.write.lock.provider=org.apache.hudi.client.transaction.lock.StorageBasedLockProvider
+hoodie.write.lock.azure.client.tenant.id=<your-tenant-id>
+hoodie.write.lock.azure.client.id=<your-app-client-id>
+hoodie.write.lock.azure.client.secret=<your-client-secret>
+```
+
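+By the same pattern, a sketch for a user-assigned managed identity
+(priority 3 in the table above) could look like this. The placeholder value
+is hypothetical, and the higher-priority connection string and SAS token
+keys are assumed to be unset:
+
+```properties
+hoodie.write.lock.provider=org.apache.hudi.client.transaction.lock.StorageBasedLockProvider
+hoodie.write.lock.azure.managed.identity.client.id=<your-managed-identity-client-id>
+```
+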
### Zookeeper-Based Lock Provider
```properties
@@ -359,6 +381,22 @@ hoodie.write.lock.client.num_retries
*Setting the right values for these depends on a case by case basis; some
defaults have been provided for general cases.*
+## Pre-Write Cleaner Policy
+
+When running multi-writer pipelines, failed writes can accumulate on storage
if a writer crashes before a clean cycle runs. Hudi 1.2.0 introduces
`hoodie.prewrite.cleaner.policy` to proactively handle this at write startup:
+
+| Config Key | Default | Description |
+|---|---|---|
+| `hoodie.prewrite.cleaner.policy` | `NONE` | Policy applied before starting a
new ingestion write commit. `NONE`: no pre-write action (default). `CLEAN`:
force a clean table service call (also rolls back failed writes).
`ROLLBACK_FAILED_WRITES`: only roll back failed writes without running a full
clean. |
+
+This is useful when a writer repeatedly crashes before its post-commit clean
can complete. See [Cleaning](cleaning.md) for the full list of cleaning
configurations.
+
+## Lock Audit Logging and Diagnostics
+
+The storage-based lock provider supports optional audit logging of lock
operations. When enabled, a `.hoodie/lock/audit_enabled.json` marker is written
to the table base path and lock acquisition/release events are recorded for
post-hoc debugging.
+
+For ZooKeeper-based locking, the ZK lock node now stores the Spark application
ID of the writer holding the lock, making it easier to correlate lock holders
with running Spark jobs in cluster UIs.
+
## Caveats
If you are using the `WriteClient` API, please note that multiple writes to
the table need to be initiated from 2 different instances of the write client.
diff --git a/website/docs/deployment.md b/website/docs/deployment.md
index 51f92e40d407..62497285a333 100644
--- a/website/docs/deployment.md
+++ b/website/docs/deployment.md
@@ -3,7 +3,7 @@ title: Deployment
keywords: [ hudi, administration, operation, devops, deployment]
summary: This section offers an overview of tools available to operate an
ecosystem of Hudi
toc: true
-last_modified_at: 2019-12-30T15:59:57-04:00
+last_modified_at: 2026-05-27T00:00:00-00:00
---
This section provides all the help you need to deploy and operate Hudi tables
at scale.
@@ -32,6 +32,8 @@ from varied sources such as DFS, Kafka and DB Changelogs and
ingest them to hudi
To use Hudi Streamer in Spark, the `hudi-utilities-slim-bundle` and Hudi Spark
bundle are required, by adding
`--packages
org.apache.hudi:hudi-utilities-slim-bundle_2.12:1.0.1,org.apache.hudi:hudi-spark3.5-bundle_2.12:1.0.1`
to the `spark-submit` command.
+Pick the Spark bundle that matches your Spark and Scala versions: for example,
`hudi-spark3.3-bundle_2.12`, `hudi-spark3.4-bundle_2.12`,
`hudi-spark3.5-bundle_2.12` or `hudi-spark3.5-bundle_2.13`,
`hudi-spark4.0-bundle_2.13`, or `hudi-spark4.1-bundle_2.13`. Spark 4.0 and 4.1
require Java 17 or later at runtime; Spark 3.x runs on Java 8 or later.
+
- **Run Once Mode** : In this mode, Hudi Streamer performs one ingestion round
which includes incrementally pulling events from upstream sources and ingesting
them to hudi table. Background operations like cleaning old file versions and
archiving hoodie timeline are automatically executed as part of the run. For
Merge-On-Read tables, Compaction is also run inline as part of ingestion unless
disabled by passing the flag "--disable-compaction". By default, Compaction is
run inline for ever [...]
Here is an example invocation for reading from kafka topic in a single-run
mode and writing to Merge On Read table type in a yarn cluster.
diff --git a/website/docs/flink-quick-start-guide.md
b/website/docs/flink-quick-start-guide.md
index 2a04dc85abb2..23ef5efbe3cc 100644
--- a/website/docs/flink-quick-start-guide.md
+++ b/website/docs/flink-quick-start-guide.md
@@ -1,7 +1,7 @@
---
title: "Flink Quick Start"
toc: true
-last_modified_at: 2025-11-22T14:30:00+08:00
+last_modified_at: 2026-05-27T00:00:00-00:00
---
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';
@@ -12,12 +12,13 @@ This page introduces Flink–Hudi integration and
demonstrates how Flink brings
### Flink Support Matrix
-| Hudi   | Supported Flink versions                                               |
-| :----- | :--------------------------------------------------------------------- |
-| 1.1.x  | 1.17.x, 1.18.x, 1.19.x, 1.20.x (default build), 2.0.x                  |
-| 1.0.x  | 1.14.x, 1.15.x, 1.16.x, 1.17.x, 1.18.x, 1.19.x, 1.20.x (default build) |
-| 0.15.x | 1.14.x, 1.15.x, 1.16.x, 1.17.x, 1.18.x                                 |
-| 0.14.x | 1.13.x, 1.14.x, 1.15.x, 1.16.x, 1.17.x                                 |
+| Hudi   | Supported Flink versions                                                      |
+| :----- | :---------------------------------------------------------------------------- |
+| 1.2.x  | 1.17.x, 1.18.x, 1.19.x, 1.20.x (default build), 2.0.x, 2.1.x                  |
+| 1.1.x  | 1.17.x, 1.18.x, 1.19.x, 1.20.x (default build), 2.0.x                         |
+| 1.0.x  | 1.14.x, 1.15.x, 1.16.x, 1.17.x, 1.18.x, 1.19.x, 1.20.x (default build)        |
+| 0.15.x | 1.14.x, 1.15.x, 1.16.x, 1.17.x, 1.18.x                                        |
+| 0.14.x | 1.13.x, 1.14.x, 1.15.x, 1.16.x, 1.17.x                                        |
### Download Flink and Start Flink cluster
@@ -62,9 +63,9 @@ You can build the jar manually under path
`hudi-source-dir/packaging/hudi-flink-
Now start the SQL CLI:
```bash
-# For Flink versions: 1.17-1.20, 2.0
-export FLINK_VERSION=1.20
-export HUDI_VERSION=1.1.1
+# Supported Flink versions for Hudi 1.2.x: 1.17, 1.18, 1.19, 1.20 (default build), 2.0, 2.1
+export FLINK_VERSION=1.20
+export HUDI_VERSION=1.2.0
wget https://repo1.maven.org/maven2/org/apache/hudi/hudi-flink${FLINK_VERSION}-bundle/${HUDI_VERSION}/hudi-flink${FLINK_VERSION}-bundle-${HUDI_VERSION}.jar -P /tmp/
./bin/sql-client.sh embedded -j /tmp/hudi-flink${FLINK_VERSION}-bundle-${HUDI_VERSION}.jar shell
```
@@ -77,11 +78,11 @@ The SQL CLI only executes the SQL line by line.
Please add the desired dependency to your project:
```xml
-<!-- For Flink versions 1.17-1.20, 2.0-->
+<!-- Supported Flink versions for Hudi 1.2.x: 1.17, 1.18, 1.19, 1.20 (default
build), 2.0, 2.1 -->
<properties>
- <flink.version>1.20.0</flink.version>
+ <flink.version>1.20.1</flink.version>
<flink.binary.version>1.20</flink.binary.version>
- <hudi.version>1.1.1</hudi.version>
+ <hudi.version>1.2.0</hudi.version>
</properties>
<dependency>
<groupId>org.apache.hudi</groupId>
@@ -446,9 +447,9 @@ feature is that it lets you author streaming pipelines on
streaming or batch dat
- **Quick Start**: Read the Quick Start section above to get started quickly
with the Flink SQL client to write to (and read from) Hudi.
- **Configuration**: For [Global
Configuration](flink_tuning.md#global-configurations), set up through
`$FLINK_HOME/conf/flink-conf.yaml`. For per-job configuration, set up through
[Table Option](flink_tuning.md#table-options).
-- **Writing Data** : Flink supports different modes for writing, such as [CDC
Ingestion](ingestion_flink.md#cdc-ingestion), [Bulk
Insert](ingestion_flink.md#bulk-insert), [Index
Bootstrap](ingestion_flink.md#index-bootstrap), [Changelog
Mode](ingestion_flink.md#changelog-mode) and [Append
Mode](ingestion_flink.md#append-mode). Flink also supports multiple streaming
writers with [non-blocking concurrency
control](sql_dml.md#non-blocking-concurrency-control-experimental).
-- **Reading Data** : Flink supports different modes for reading, such as
[Streaming Query](sql_queries.md#streaming-query) and [Incremental
Query](sql_queries.md#incremental-query).
-- **Tuning**: For write/read tasks, this guide provides some tuning
suggestions, such as [Memory Optimization](flink_tuning.md#memory-optimization)
and [Write Rate Limit](flink_tuning.md#write-rate-limit).
+- **Writing Data**: Flink supports different modes for writing, such as [CDC
Ingestion](ingestion_flink.md#cdc-ingestion), [Bulk
Insert](ingestion_flink.md#bulk-insert), [Index
Bootstrap](ingestion_flink.md#index-bootstrap), [Changelog
Mode](ingestion_flink.md#changelog-mode) and [Append
Mode](ingestion_flink.md#append-mode). For high-throughput append pipelines,
choose an [append write buffer mode](ingestion_flink.md#append-write-buffer).
For upsert workloads at scale, use [Record-Leve [...]
+- **Reading Data**: Flink supports different modes for reading, such as
[Streaming Query](sql_queries.md#streaming-query) and [Incremental
Query](sql_queries.md#incremental-query). For improved push-down and resumable
reads, see [Flink Source V2](ingestion_flink.md#flink-source-v2). For
dimension-table joins, use [lookup join](ingestion_flink.md#lookup-join) with
an optional off-heap RocksDB cache.
+- **Tuning**: For write/read tasks, this guide provides some tuning
suggestions, such as [Memory
Optimization](flink_tuning.md#memory-optimization), the [Managed-Memory Write
Buffer](flink_tuning.md#managed-memory-write-buffer), and [Write Rate
Limit](flink_tuning.md#write-rate-limit).
- **Optimization**: Offline compaction is supported: [Offline
Compaction](compaction.md#flink-offline-compaction).
- **Query Engines**: Besides Flink, many other engines are integrated: [Hive
Query](syncing_metastore.md#flink-setup), [Presto Query](sql_queries.md#presto).
- **Catalog**: A Hudi‑specific catalog is supported: [Hudi
Catalog](sql_ddl/#create-catalog).
diff --git a/website/docs/flink_tuning.md b/website/docs/flink_tuning.md
index 28e70e48b1f2..bd08b4be0fba 100644
--- a/website/docs/flink_tuning.md
+++ b/website/docs/flink_tuning.md
@@ -115,3 +115,50 @@ the `write.rate.limit` option can be turned on to ensure
smooth writing.
| Option Name | Required | Default | Remarks |
| ----------- | ------- | ------- | ------- |
| `write.rate.limit` | `false` | `0` | Turn off by default |
+
+## Managed-Memory Write Buffer
+
+By default, the Flink write buffer uses JVM heap memory (`ON_HEAP`). In
containerized environments where heap memory is tightly budgeted, you can
switch to Flink's managed (off-heap) memory pool to reduce GC pressure and
avoid OOM errors.
+
+:::note
+When using `MANAGED` memory type, ensure `taskmanager.memory.managed.size` is
configured sufficiently in `flink-conf.yaml`.
+:::
+
+| Option Name | Description | Default | Remarks |
+| ----------- | ------- | ------- | ------- |
+| `write.buffer.memory.type` | Memory type for the write buffer: `ON_HEAP`
(default, uses JVM heap) or `MANAGED` (uses Flink managed off-heap memory) |
`ON_HEAP` | Switch to `MANAGED` to avoid OOM in memory-constrained deployments |
+| `write.memory.segment.page.size` | Page size in bytes for memory segments
used in the write buffer | `32768` (32 KB) | Tune for workload characteristics;
larger pages reduce overhead for large records |
+
+## Disruptor Buffer Tuning
+
+When `write.buffer.type=DISRUPTOR` is set in the table options (see [Append
Write Buffer](ingestion_flink.md#append-write-buffer)), the following tuning
options control the Disruptor ring buffer:
+
+| Option Name | Description | Default | Remarks |
+| ----------- | ------- | ------- | ------- |
+| `write.buffer.disruptor.ring.size` | Size of the Disruptor ring buffer (must
be a power of 2) | `16384` | Larger values absorb write bursts but consume more
heap memory |
+| `write.buffer.disruptor.wait.strategy` | Wait strategy for the Disruptor
consumer: `BLOCKING_WAIT` (default), `SLEEPING_WAIT`, `YIELDING_WAIT`,
`BUSY_SPIN_WAIT` | `BLOCKING_WAIT` | `BLOCKING_WAIT` is safest for
containerized environments; `BUSY_SPIN_WAIT` offers lowest latency at the cost
of a dedicated CPU core |
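
The power-of-two constraint on `write.buffer.disruptor.ring.size` can be checked before a job is submitted. The following is a minimal sketch in Python and is not part of Hudi; the helper names `is_power_of_two` and `next_power_of_two` are hypothetical and exist only for illustration.

```python
def is_power_of_two(n: int) -> bool:
    """Return True when n is a positive power of two (1, 2, 4, 8, ...)."""
    # A power of two has exactly one bit set, so n & (n - 1) clears it to zero.
    return n > 0 and (n & (n - 1)) == 0


def next_power_of_two(n: int) -> int:
    """Round n up to the nearest power of two (hypothetical helper)."""
    return 1 if n < 1 else 1 << (n - 1).bit_length()


# The documented default ring size already satisfies the constraint.
default_ring_size = 16384
assert is_power_of_two(default_ring_size)

# A hand-picked value such as 20000 would need rounding before use.
candidate = 20000
ring_size = candidate if is_power_of_two(candidate) else next_power_of_two(candidate)
print(ring_size)
```

Rounding up rather than down keeps the buffer at least as large as the requested size, at the cost of more heap memory.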
+
+## Timeline-Server-Based Markers
+
+As of Hudi 1.2.0, Flink writers support the `TIMELINE_SERVER_BASED` marker type
(`hoodie.write.markers.type=TIMELINE_SERVER_BASED`). This is recommended over
`DIRECT` markers on object stores (S3, GCS, ADLS), where the high cost of
directory listings makes `DIRECT` markers slow.
+
+```sql
+CREATE TABLE my_table (...)
+WITH (
+ 'connector' = 'hudi',
+ 'path' = 's3a://my-bucket/my-table',
+ 'hoodie.write.markers.type' = 'TIMELINE_SERVER_BASED'
+ -- other options
+);
+```
+
+## Source V2 Read-Lag Metrics
+
+When [Source V2](ingestion_flink.md#flink-source-v2) is enabled
(`read.source-v2.enabled=true`), the following read-lag metrics are emitted to
help monitor streaming pipeline health:
+
+| Metric | Description |
+|--------|-------------|
+| `issuedInstantDelay` | Time elapsed (ms) between when a new instant was
written and when the source issued it for reading |
+| `sourceReaderIdleTime` | Time (ms) the source reader has been idle (no new
splits assigned) |
+
+These metrics are exposed through Flink's standard metrics system and can be
forwarded to Prometheus, JMX, or other reporters.
diff --git a/website/docs/hoodie_streaming_ingestion.md
b/website/docs/hoodie_streaming_ingestion.md
index 286d5765c751..1fb282a5d718 100644
--- a/website/docs/hoodie_streaming_ingestion.md
+++ b/website/docs/hoodie_streaming_ingestion.md
@@ -146,7 +146,9 @@ Usage: <main class> [options]
Default: 0
--op
Takes one of these values : UPSERT (default), INSERT, BULK_INSERT,
- INSERT_OVERWRITE, INSERT_OVERWRITE_TABLE, DELETE_PARTITION
+ INSERT_OVERWRITE, INSERT_OVERWRITE_TABLE, DELETE_PARTITION, DELETE
+ (DELETE extracts HoodieKeys from source records and deletes the
+ corresponding records from the table.)
Default: UPSERT
Possible Values: [INSERT, INSERT_PREPPED, UPSERT, UPSERT_PREPPED,
BULK_INSERT, BULK_INSERT_PREPPED, DELETE, DELETE_PREPPED, BOOTSTRAP,
INSERT_OVERWRITE, CLUSTER, DELETE_PARTITION, INSERT_OVERWRITE_TABLE, COMPACT,
INDEX, ALTER_SCHEMA, LOG_COMPACT, UNKNOWN]
--payload-class
@@ -503,6 +505,77 @@ Check out [Kafka source
config](https://hudi.apache.org/docs/configurations#Kafk
Hudi Streamer also supports ingesting from Apache Pulsar via
`org.apache.hudi.utilities.sources.PulsarSource`.
Check out [Pulsar source
config](https://hudi.apache.org/docs/configurations#Pulsar-Source-Configs) for
more details.
+#### Amazon Kinesis
+
+Use the `JsonKinesisSource`
(`org.apache.hudi.utilities.sources.JsonKinesisSource`) to ingest JSON records
from an AWS Kinesis Data Stream into a Hudi table. It reads from every shard in
parallel, tracks per-shard progress in the Hudi Streamer checkpoint,
automatically handles shard splits and merges, and de-aggregates records
produced by the Kinesis Producer Library (KPL).
+
+##### Common configuration
+
+All keys use the prefix `hoodie.streamer.source.kinesis.`. The settings most
users need:
+
+| Config key | Default | Description |
+|---|---|---|
+| `hoodie.streamer.source.kinesis.stream.name` | (required) | Kinesis Data
Streams stream name. |
+| `hoodie.streamer.source.kinesis.region` | (required) | AWS region for the
stream (e.g., `us-east-1`). |
+| `hoodie.streamer.source.kinesis.starting.position` | `LATEST` | Where to
start when no checkpoint exists yet. `LATEST` starts at the tip of each shard;
`EARLIEST` replays from `TRIM_HORIZON`. |
+| `hoodie.streamer.source.kinesis.max.events` | `5000000` | Maximum number of
records read per batch across all shards. Tune to control batch size. |
+| `hoodie.streamer.source.kinesis.partitions` | `0` | Spark partitions to use
when reading. `0` means one Spark partition per Kinesis shard. Set a positive
value to repartition for downstream parallelism. |
+
+For credentials, the source uses the default AWS credential chain (instance
profile, environment variables, etc.). Authentication for custom endpoints
(e.g., LocalStack), API-level rate limiting, and retry tuning are also
available — see the [configurations reference](configurations.md) for the full
list of `hoodie.streamer.source.kinesis.*` keys.
+
+##### Checkpoint format
+
+Hudi Streamer persists Kinesis progress as a single checkpoint string on the
timeline. Each batch advances the checkpoint to the last record successfully
read from every shard, so a failed batch can be retried without skipping or
duplicating records.
+
+The checkpoint encodes per-shard state in plain text:
+
+```
+streamName,shardId:value,shardId:value,...
+```
+
+Each `value` is one of:
+
+- `lastSeq` — last sequence number consumed from an open shard.
+- `lastSeq@arrivalTime` — same, with the record's approximate arrival time
(epoch millis) for lag/observability.
+- `lastSeq|endSeq` — closed shard. `endSeq` is the shard's final sequence
number, used to detect data loss if the shard expires before being fully
consumed.
+- `lastSeq@arrivalTime|endSeq` — closed shard with arrival time.
+
+Example (sequence numbers abbreviated; Kinesis assigns each record in a shard a
long decimal sequence number, typically 56 digits):
+
+```
+my-stream,shardId-000000000000:49590…88898,shardId-000000000001:49590…96306
+```
+
+You don't need to construct or parse this string yourself — it is read and
updated automatically by the source — but it's useful for debugging, manual
checkpoint resets, or comparing progress across shards.
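
For debugging, the checkpoint string described above can be split into per-shard state with a few lines of code. This is a minimal sketch in Python, not Hudi's own parser: the names `ShardState` and `parse_checkpoint` are hypothetical, and it assumes the stream name contains no commas and shard IDs contain no colons.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class ShardState:
    last_seq: str                     # last sequence number consumed
    arrival_ms: Optional[int] = None  # approximate arrival time, epoch millis
    end_seq: Optional[str] = None     # present only for closed shards


def parse_checkpoint(checkpoint: str) -> Tuple[str, Dict[str, ShardState]]:
    """Split 'streamName,shardId:value,...' into (stream name, per-shard state)."""
    stream_name, *entries = checkpoint.split(",")
    shards: Dict[str, ShardState] = {}
    for entry in entries:
        shard_id, value = entry.split(":", 1)
        # An optional '|endSeq' suffix marks a closed shard.
        head, _, end_seq = value.partition("|")
        # An optional '@arrivalTime' suffix carries the arrival time.
        last_seq, _, arrival = head.partition("@")
        shards[shard_id] = ShardState(
            last_seq=last_seq,
            arrival_ms=int(arrival) if arrival else None,
            end_seq=end_seq or None,
        )
    return stream_name, shards


# Made-up sequence numbers, shortened for readability.
name, shards = parse_checkpoint(
    "my-stream,shardId-000000000000:100@1700000000000|250,shardId-000000000001:42"
)
for shard_id, state in shards.items():
    status = "closed" if state.end_seq is not None else "open"
    print(name, shard_id, state.last_seq, status)
```

Sequence numbers are kept as strings because they are compared and stored as opaque tokens here; only the arrival time is converted to an integer.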
+
+##### Minimal spark-submit example
+
+```properties
+# kinesis-source.properties
+hoodie.streamer.source.kinesis.stream.name=my-stream
+hoodie.streamer.source.kinesis.region=us-east-1
+hoodie.streamer.source.kinesis.starting.position=LATEST
+
+# Standard Hudi write / key-gen configs
+hoodie.datasource.write.recordkey.field=id
+hoodie.datasource.write.partitionpath.field=event_date
+hoodie.table.ordering.fields=ts
+```
+
+```bash
+spark-submit \
+ --packages
org.apache.hudi:hudi-utilities-slim-bundle_2.12:1.2.0,org.apache.hudi:hudi-spark3.5-bundle_2.12:1.2.0
\
+ --class org.apache.hudi.utilities.streamer.HoodieStreamer \
+ hudi-utilities-slim-bundle-*.jar \
+ --props kinesis-source.properties \
+ --source-class org.apache.hudi.utilities.sources.JsonKinesisSource \
+ --table-type COPY_ON_WRITE \
+ --target-base-path s3://my-bucket/hudi/my-table \
+ --target-table my_db.my_table \
+ --op UPSERT \
+ --continuous
+```
+
#### Cloud storage event sources
AWS S3 storage provides an event notification service which will post
notifications when certain events happen in your S3 bucket:
https://docs.aws.amazon.com/AmazonS3/latest/userguide/NotificationHowTo.html
@@ -636,3 +709,34 @@ to how you run Hudi Streamer.
```
For detailed information on how to configure and use
`HoodieMultiTableStreamer`, please refer [blog
section](/blog/2020/08/22/ingest-multiple-tables-using-hudi).
+
+## On-Demand Hive Sync (HudiHiveSyncJob)
+
+`org.apache.hudi.utilities.HudiHiveSyncJob` is a standalone Spark job that
syncs a Hudi table's metadata to the Hive metastore independently of any ingestion
workflow. It is useful for backfills, manual data corrections, or reconciling
metastore metadata after direct writes.
+
+### Arguments
+
+| Argument | Required | Description |
+|---|---|---|
+| `--base-path` / `-sp` | Yes | Base path of the Hudi table. |
+| `--base-file-format` / `-bff` | No | Base file format. Default: `PARQUET`. |
+| `--props-file-path` | No | Path to a properties file with Hudi / Hive sync
configs. |
+| `--hoodie-conf` | No | Inline config override (repeatable). |
+| `--spark-master` | No | Spark master URL. Inherits from environment if
unset. |
+
+### Example
+
+```bash
+spark-submit \
+ --packages
org.apache.hudi:hudi-utilities-slim-bundle_2.12:1.2.0,org.apache.hudi:hudi-spark3.5-bundle_2.12:1.2.0
\
+ --class org.apache.hudi.utilities.HudiHiveSyncJob \
+ hudi-utilities-slim-bundle-*.jar \
+ --base-path s3://my-bucket/hudi/my-table \
+ --base-file-format PARQUET \
+ --hoodie-conf hoodie.datasource.hive_sync.mode=hms \
+ --hoodie-conf
hoodie.datasource.hive_sync.metastore.uris=thrift://hive-metastore:9083 \
+ --hoodie-conf hoodie.datasource.hive_sync.database=my_db \
+ --hoodie-conf hoodie.datasource.hive_sync.table=my_table
+```
+
+All `hoodie.datasource.hive_sync.*` options accepted by the DataSource writer
are also accepted here. See [Syncing to Hive Metastore](syncing_metastore.md)
for the full list.
diff --git a/website/docs/ingestion_flink.md b/website/docs/ingestion_flink.md
index e720a748c1c8..2e534e286d67 100644
--- a/website/docs/ingestion_flink.md
+++ b/website/docs/ingestion_flink.md
@@ -1,7 +1,7 @@
---
title: Using Flink
keywords: [hudi, flink, streamer, ingestion]
-last_modified_at: 2025-11-22T12:53:57+08:00
+last_modified_at: 2026-05-27T00:00:00-00:00
---
## CDC Ingestion
@@ -112,15 +112,24 @@ the compaction options `compaction.delta_commits` and
`compaction.delta_seconds`
For `INSERT` mode write operations, new Parquet files are written directly,
and the [auto‑file sizing](file_sizing.md) is not enabled.
-### In-Memory Buffer Sort
+### Append Write Buffer
-For append-only workloads, Hudi supports in-memory buffer sorting to improve
Parquet compression ratio. When enabled, data is sorted within the write buffer
before being flushed to disk. This improves columnar file compression
efficiency by grouping similar values together.
+For append-only workloads, Hudi supports several write-buffer strategies that
improve Parquet compression ratio and write throughput. Data is sorted or
batched within the write buffer before being flushed to disk, grouping similar
values together for better columnar compression.
-| Option Name | Required | Default | Remarks
|
-|-----------------------------|----------|---------|-------------------------------------------------------------------------------------------------------------------------------|
-| `write.buffer.sort.enabled` | `false` | `false` | Whether to enable buffer
sort within append write function. Improves Parquet compression ratio by
sorting data before writing |
-| `write.buffer.sort.keys` | `false` | `N/A` | Sort keys concatenated by
comma (e.g., `col1,col2`). Required when `write.buffer.sort.enabled` is `true`
|
-| `write.buffer.size` | `false` | `1000` | Buffer size in number of
records. When buffer reaches this size, data is sorted and flushed to disk
|
+The buffer strategy is selected with `write.buffer.type`. In Hudi 1.2.0 this
replaces the deprecated `write.buffer.sort.enabled` flag.
+
+| Option Name | Required | Default | Remarks
|
+|------------------------------------------|----------|------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| `write.buffer.type` | `false` | `NONE` | Buffer
type for append write. Values: `NONE` (no buffering), `BOUNDED_IN_MEMORY`
(double buffer with async write), `DISRUPTOR` (ring-buffer with async write,
recommended for higher throughput), `CONTINUOUS_SORT` (TreeMap-based continuous
sort with incremental draining) |
+| `write.buffer.size` | `false` | `1000` | Record
count threshold at which the buffer is flushed. Applies to all non-`NONE`
buffer types
|
+| `write.buffer.sort.keys` | `false` | `N/A` |
Comma-separated sort key columns (e.g., `col1,col2`). Required for `DISRUPTOR`
and `CONTINUOUS_SORT` modes
|
+| `write.buffer.sort.continuous.drain.size`| `false` | `1` | Number of
records drained per flush cycle in `CONTINUOUS_SORT` mode. Default 1 provides
smooth incremental draining; increase for batching (e.g., 10–100)
|
+
+:::note
+`write.buffer.sort.enabled` is deprecated as of 1.2.0. Use
`write.buffer.type=DISRUPTOR` instead for equivalent behavior. The `DISRUPTOR`
and `CONTINUOUS_SORT` modes require `write.buffer.sort.keys` to be set.
+:::
+
+For Disruptor-specific tuning options, see
[flink_tuning.md](flink_tuning.md#disruptor-buffer-tuning).
### Disable Meta Fields
@@ -156,7 +165,7 @@ Only Copy‑on‑Write tables are supported.
### Clustering Plan Strategy
-Custom clustering strategy is supported.
+Custom clustering strategies are supported. Hudi 1.2.0 adds
`FlinkSkipSingleFileClusteringPlanStrategy`
(`org.apache.hudi.client.clustering.plan.strategy.FlinkSkipSingleFileClusteringPlanStrategy`),
which skips file groups that already consist of a single file, reducing
unnecessary rewrites.
| Option Name | Required | Default
| Remarks
|
|---------------------------------------------------------|----------|---------|--------------------------------------------------------------------------------------------------------------------------------------------------|
@@ -185,7 +194,7 @@ Hudi Flink writer supports two types of writer indexes:
| Cross‑Partition Changes | Cannot handle changes among partitions (unless
input is a CDC stream)
| No limit on
handling cross‑partition changes
|
:::note
-Bucket index supports only the `UPSERT` write operation and cannot be used
with the [append mode](#append-mode) in Flink.
+Bucket index supports `UPSERT` write operations on both COW and MOR tables. As
of Hudi 1.2.0, MOR + bucket index + upsert is fully supported. Bucket index
cannot be used with the [append mode](#append-mode) in Flink.
:::
### Bucket Index Examples
@@ -349,10 +358,215 @@ For Flink streaming reads, rate limiting helps avoid
backpressure when processin
The average read rate can be calculated as: **`read.splits.limit` /
`read.streaming.check-interval`** splits per second.
+Hudi 1.2.0 adds `read.commits.limit`, which complements `read.splits.limit` by
capping the number of commits (instants) consumed per check interval. This is
useful when tables have many small commits — limiting commits bounds the number
of splits regardless of their individual size.
+
+### Options
+
+| Option Name | Required | Default | Remarks
|
+|---------------------------------|----------|---------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| `write.rate.limit` | `false` | `0` | Write
record rate limit per second to prevent traffic jitter and improve stability.
Default is 0 (no limit)
|
+| `read.splits.limit` | `false` | `Integer.MAX_VALUE` | Maximum
number of splits allowed to read in each instant check for streaming reads.
Average read rate = `read.splits.limit`/`read.streaming.check-interval`.
Default is no limit |
+| `read.commits.limit` | `false` | `(none)` | Maximum
number of commits (instants) allowed to read in each check interval.
Complements `read.splits.limit`. Average rate =
`read.commits.limit`/`read.streaming.check-interval`. Default is no limit |
+| `read.streaming.check-interval` | `false` | `60` | Check
interval in seconds for streaming reads. Default is 60 seconds (1 minute)
|
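
The average-rate formulas in the table are a simple division of the per-check limit by the check interval. A minimal sketch of that arithmetic in Python follows; the function name `average_rate_per_second` and the sample limits of 600 splits and 30 commits are illustrative assumptions, not recommended settings.

```python
def average_rate_per_second(limit_per_check: int, check_interval_s: int) -> float:
    """Average items (splits or commits) consumed per second.

    limit_per_check  -- value of read.splits.limit or read.commits.limit
    check_interval_s -- value of read.streaming.check-interval, in seconds
    """
    if check_interval_s <= 0:
        raise ValueError("check interval must be a positive number of seconds")
    return limit_per_check / check_interval_s


# Assumed settings: read.splits.limit=600, read.commits.limit=30,
# with the documented default read.streaming.check-interval=60.
splits_rate = average_rate_per_second(600, 60)
commits_rate = average_rate_per_second(30, 60)
print(splits_rate, commits_rate)
```

Because both limits are applied per check interval, lowering the interval without lowering the limits raises the average rate proportionally.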
+
+## Flink Source V2
+
+Hudi 1.2.0 introduces a new Flink source implementation
([RFC-95](https://github.com/apache/hudi/blob/master/rfc/rfc-95/rfc-95.md))
based on
[FLIP-27](https://cwiki.apache.org/confluence/display/FLINK/FLIP-27%3A+Refactor+Source+Interface),
available as an opt-in feature via the `read.source-v2.enabled` flag.
+
+### Why Source V2?
+
+The legacy Hudi Flink source was built on Flink's `SourceFunction` API. The
FLIP-27 rewrite brings:
+
+- **Resumable split assignment** — splits can be checkpointed independently,
enabling finer-grained recovery
+- **Checkpoint alignment** — the new API participates in Flink's coordinated
checkpoint protocol, improving end-to-end consistency
+- **Push-down support** — predicate push-down, partition pruning, and `LIMIT`
push-down are supported through the new source interface, reducing data scanned
at the source level
+
+### Enabling Source V2
+
+```sql
+CREATE TABLE t1 (
+ uuid VARCHAR(20) PRIMARY KEY NOT ENFORCED,
+ name VARCHAR(10),
+ age INT,
+ ts TIMESTAMP(3),
+ `partition` VARCHAR(20)
+)
+PARTITIONED BY (`partition`)
+WITH (
+ 'connector' = 'hudi',
+ 'path' = '${path}',
+ 'table.type' = 'MERGE_ON_READ',
+ 'read.source-v2.enabled' = 'true' -- enable the FLIP-27 source
+);
+```
+
+### Options
+
+| Option Name | Required | Default | Remarks
|
+|---------------------------|----------|---------|---------------------------------------------------------------------------------------------------------|
+| `read.source-v2.enabled` | `false` | `false` | Whether to use the FLIP-27
new source (Source V2) to consume data files. Default is the legacy source |
+
+### Savepoint Incompatibility
+
+:::warning
+Savepoints taken with the **legacy source** (`read.source-v2.enabled=false`)
are **not compatible** with the Source V2 source, and vice versa. When
switching from the legacy source to Source V2, start a fresh job without
restoring from a legacy savepoint. If you need to preserve read progress,
record the last committed instant time and use `read.start-commit` to resume
from that point.
+:::
+
+## Record-Level Index (RLI) Bucket Indexing for Flink
+
+As of Hudi 1.2.0, the Flink writer supports the Record-Level Index (RLI)
backed by the metadata table, in addition to the existing `FLINK_STATE` and
`BUCKET` index types. RLI is stored in the metadata table and avoids the
state-backend overhead of `FLINK_STATE`, while supporting full global or
partition-scoped uniqueness guarantees.
+
+Two RLI variants are available via `index.type`:
+
+- `RECORD_LEVEL_INDEX` — partitioned RLI; enforces uniqueness per (partition
path, record key) pair
+- `GLOBAL_RECORD_LEVEL_INDEX` — global RLI; enforces uniqueness across all
partitions
+
+### Bootstrap
+
+When enabling RLI on an existing table, the bootstrap process loads existing
record locations into RocksDB before the first write. Bootstrap is triggered by
setting `index.bootstrap.enabled=true`.
+
+```sql
+CREATE TABLE my_hudi_table (
+ id BIGINT,
+ name STRING,
+ ts BIGINT,
+ dt STRING,
+ PRIMARY KEY (id) NOT ENFORCED
+)
+PARTITIONED BY (dt)
+WITH (
+ 'connector' = 'hudi',
+ 'path' = 'hdfs:///warehouse/my_hudi_table',
+ 'table.type' = 'MERGE_ON_READ',
+ 'index.type' = 'RECORD_LEVEL_INDEX',
+ 'metadata.enabled' = 'true',
+ 'index.bootstrap.enabled' = 'true', -- enable bootstrap on first run
+ 'index.bootstrap.rocksdb.path' = '/tmp/hudi-rli-rocksdb'
+);
+```
+
+Once bootstrap completes (after the first successful checkpoint), you can
optionally restart the job with `index.bootstrap.enabled=false` to skip the
bootstrap operators. Leaving them enabled is harmless — they become no-ops on
subsequent runs and do not affect write performance.
+
+### In-Pipeline MDT Compaction
+
+For RLI workloads, the metadata table (MDT) accumulates log files that need
periodic compaction. The option `metadata.compaction.async.enabled` (default
`true`) runs MDT compaction inside the Flink pipeline after every
`metadata.compaction.delta_commits` (default `10`) delta commits.
+
+### Options
+
+| Option Name | Required | Default | Remarks
|
+|-------------------------------------|----------|----------|---------------------------------------------------------------------------------------------------------------------------------------------------------|
+| `index.type` | `false` | `FLINK_STATE` | Set to
`RECORD_LEVEL_INDEX` or `GLOBAL_RECORD_LEVEL_INDEX` to use the
metadata-table-backed RLI |
+| `index.bootstrap.enabled` | `false` | `false` | Bootstrap the
index from the existing table on first run. Blocks checkpoints during bootstrap
|
+| `index.bootstrap.rocksdb.path` | `false` | system temp dir | Local
path for RocksDB storage during RLI bootstrap. Each task manager creates a
unique subdirectory under this path |
+| `index.rli.cache.size` | `false` | `256` | Maximum memory
in MB for the RLI cache per bucket-assign task. Dynamically adjusted based on
historical usage |
+| `index.rli.lookup.minibatch.size` | `false` | `1000` | Maximum records
buffered per mini-batch during RLI lookup. Mini-batching reduces individual
index lookups. Minimum effective value is 1000 |
+| `metadata.compaction.async.enabled` | `false` | `true` | Whether to run
MDT compaction asynchronously within the Flink pipeline. Recommended to keep
enabled for RLI workloads |
+| `metadata.compaction.delta_commits` | `false` | `10` | Number of MDT
delta commits that trigger in-pipeline compaction
|
+
+:::note
+`GLOBAL_RECORD_LEVEL_INDEX` requires `metadata.enabled=true` and
`index.global.enabled=true`. The Flink table factory validates these
constraints automatically.
+:::
+
+## Lookup Join
+
+Hudi 1.2.0 adds a RocksDB-backed cache option for Flink lookup joins against
Hudi dimension tables. This avoids JVM heap pressure when the dimension table
is large.
+
### Options
-| Option Name | Required | Default | Remarks
|
-|---------------------------------|----------|---------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
-| `write.rate.limit` | `false` | `0` | Write
record rate limit per second to prevent traffic jitter and improve stability.
Default is 0 (no limit)
|
-| `read.splits.limit` | `false` | `Integer.MAX_VALUE` | Maximum
number of splits allowed to read in each instant check for streaming reads.
Average read rate = `read.splits.limit`/`read.streaming.check-interval`.
Default is no limit |
-| `read.streaming.check-interval` | `false` | `60` | Check
interval in seconds for streaming reads. Default is 60 seconds (1 minute)
|
+| Option Name | Required | Default
| Remarks
|
+|--------------------------------|----------|---------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------|
+| `lookup.join.cache.type` | `false` | `heap`
| Storage backend for the lookup join cache. `heap` (default) stores rows in
JVM heap; `rocksdb` stores rows off-heap in an embedded RocksDB instance |
+| `lookup.join.rocksdb.path` | `false` |
`${java.io.tmpdir}/hudi-lookup-rocksdb` | Local directory for RocksDB data when
`lookup.join.cache.type=rocksdb`. Cleaned up when the lookup function closes
|
+| `lookup.async` | `false` | `false`
| Whether to enable async lookup join. Async join can improve throughput when
the lookup function has high latency |
+| `lookup.async-thread-number` | `false` | `16`
| Number of threads for async lookup join
|
+
+### Example
+
+```sql
+-- Streaming fact table with a processing-time attribute
+CREATE TABLE orders (
+ order_id BIGINT,
+ customer_id BIGINT,
+ amount DOUBLE,
+ proc_time AS PROCTIME(),
+ PRIMARY KEY (order_id) NOT ENFORCED
+) WITH (
+ 'connector' = 'hudi',
+ 'path' = 'hdfs:///warehouse/orders',
+ 'table.type' = 'MERGE_ON_READ',
+ 'read.streaming.enabled' = 'true'
+);
+
+-- Hudi dimension table with RocksDB-backed lookup cache
+CREATE TABLE customers (
+ customer_id BIGINT,
+ name STRING,
+ city STRING,
+ PRIMARY KEY (customer_id) NOT ENFORCED
+) WITH (
+ 'connector' = 'hudi',
+ 'path' = 'hdfs:///warehouse/customers',
+ 'lookup.join.cache.type' = 'rocksdb',
+ 'lookup.join.rocksdb.path' = '/tmp/hudi-lookup-rocksdb'
+);
+
+-- Lookup join keyed by the fact table's processing-time attribute
+SELECT o.order_id, c.name, o.amount
+FROM orders AS o
+JOIN customers FOR SYSTEM_TIME AS OF o.proc_time AS c
+ ON o.customer_id = c.customer_id;
+```
+
+## Virtual Metadata Columns
+
+Hudi metadata fields can be declared as `METADATA VIRTUAL` columns in the
Flink DDL. This allows accessing system metadata (e.g., commit time, record
key) without storing them as regular data columns.
+
+```sql
+CREATE TABLE events (
+ event_id BIGINT,
+ payload STRING,
+ -- virtual metadata columns (read-only, not persisted as data)
+ _hoodie_commit_time STRING METADATA VIRTUAL,
+ _hoodie_record_key STRING METADATA VIRTUAL,
+ _hoodie_partition_path STRING METADATA VIRTUAL,
+ PRIMARY KEY (event_id) NOT ENFORCED
+)
+WITH (
+ 'connector' = 'hudi',
+ 'path' = 'hdfs:///warehouse/events'
+);
+
+-- Query metadata alongside data
+SELECT event_id, _hoodie_commit_time, payload FROM events;
+```
+
+:::note
+Only `VIRTUAL` metadata columns are supported. All valid virtual columns
correspond to Hudi's built-in meta fields (`_hoodie_commit_time`,
`_hoodie_commit_seqno`, `_hoodie_record_key`, `_hoodie_partition_path`,
`_hoodie_file_name`, `_hoodie_operation`).
+:::
+
+## Advanced Options
+
+### Hadoop Configuration Pass-through
+
+Hadoop filesystem configuration properties can be passed to the Flink writer
using the `properties.hadoop.*` prefix (or directly as `hadoop.*`):
+
+```sql
+WITH (
+ 'connector' = 'hudi',
+ 'path' = 's3a://my-bucket/my-table',
+ 'properties.hadoop.fs.s3a.access.key' = 'AKID...',
+ 'properties.hadoop.fs.s3a.secret.key' = '...'
+)
+```
+
+### Kafka Offset Tracing
+
+The following `kafka.offset.trace.*` options are internal and optional. They
configure the checkpoint-service-based offset lookup used in some deployment
environments and have no functional impact on standard Hudi writes:
+
+| Option Name | Default | Remarks
|
+|------------------------------------------|-------------------|------------------------------------------------------------|
+| `kafka.offset.trace.caller.service.name` | `ingestion-rt` | Caller
service name for checkpoint-service RPC headers |
+| `kafka.offset.trace.checkpoint.service` | `athena-job-manager` | Checkpoint
service name |
+| `kafka.offset.trace.dc` | `(none)` | Data center
for checkpoint offset lookup |
+| `kafka.offset.trace.env` | `(none)` | Environment
for checkpoint offset lookup |
+| `kafka.offset.trace.job.name` | `(none)` | Flink job
name for checkpoint offset lookup |
diff --git a/website/docs/key_generation.md b/website/docs/key_generation.md
index 3a7b109c3363..5f00956cefdb 100644
--- a/website/docs/key_generation.md
+++ b/website/docs/key_generation.md
@@ -2,7 +2,7 @@
title: Key Generation
summary: "In this page, we describe key generation in Hudi."
toc: true
-last_modified_at:
+last_modified_at: 2026-05-27T00:00:00-00:00
---
Hudi needs some way to point to records in the table, so that base/log files
can be merged efficiently for updates/deletes,
@@ -210,6 +210,35 @@ Partition path generated from key generator: "2020040118"
Input field value: "20200401" <br/>
Partition path generated from key generator: "04/01/2020"
+## Slash-Separated Date Partitioning
+
+By default, Hudi writes date-valued partition paths as a flat string (e.g.
`2024-03-15`).
+When `hoodie.datasource.write.slash.separated.date.partitioning` is set to
`true`, partition field
+values in `yyyy-MM-dd` format are stored as `yyyy/MM/dd` directory hierarchies
(e.g. `2024/03/15`).
+
+| Config Name |
Default | Description
|
+|----------------------------------------------------------------------|-----------|------------------------------------------------------------------------------------------------------------------------------------------|
+| `hoodie.datasource.write.slash.separated.date.partitioning` |
`false` | When `true`, transforms date partition values from `yyyy-MM-dd`
into `yyyy/MM/dd` directory paths. Cannot be used together with hive-style
partitioning (`hoodie.datasource.write.hive_style_partitioning=true`). |
+
+Example:
+
+```scala
+df.write.format("hudi")
+ .option("hoodie.datasource.write.partitionpath.field", "event_date")
+ .option("hoodie.datasource.write.slash.separated.date.partitioning", "true")
+ .option("hoodie.table.name", tableName)
+ .mode("append")
+ .save(basePath)
+```
+
+A record with `event_date = "2024-03-15"` will be stored under
`basePath/2024/03/15/` instead of
+`basePath/2024-03-15/`.
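The path mapping described above can be sketched in a few lines of Python. This only illustrates the `yyyy-MM-dd` to `yyyy/MM/dd` transformation and is not Hudi's implementation; the helper name `to_slash_partition_path` is hypothetical.

```python
from datetime import datetime


def to_slash_partition_path(value: str) -> str:
    """Map a yyyy-MM-dd partition value to a yyyy/MM/dd directory path.

    Hypothetical helper for illustration only; it is not part of Hudi.
    """
    # strptime raises ValueError for values that do not parse as a date.
    parsed = datetime.strptime(value, "%Y-%m-%d")
    return parsed.strftime("%Y/%m/%d")


print(to_slash_partition_path("2024-03-15"))  # 2024/03/15
```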
+
+:::note
+`SHOW PARTITIONS` in Spark SQL correctly handles slash-separated date
partition paths: it displays
+the value in `yyyy-MM-dd` form (normalizing the `/` separators back to `-`)
for readability.
+:::
+
## Related Resources
<h3>Blogs</h3>
diff --git a/website/docs/lance_file_format.md
b/website/docs/lance_file_format.md
index 51269107f188..9754a4e31797 100644
--- a/website/docs/lance_file_format.md
+++ b/website/docs/lance_file_format.md
@@ -1,18 +1,28 @@
---
title: "Lance File Format"
keywords: [ hudi, lance, file format, vector, AI, ML, columnar, ANN, indexing]
-summary: "Use the Lance columnar file format with Hudi for vector-optimized
storage, ANN indexing, and efficient ML workloads"
+summary: "Use the Lance columnar file format with Hudi for vector-friendly
storage and efficient ML workloads"
toc: true
-last_modified_at: 2026-04-25T00:00:00-00:00
+last_modified_at: 2026-05-27T00:00:00-00:00
---
[Lance](https://lancedb.github.io/lance/) is a modern columnar data format
designed for AI and machine learning
workloads. Hudi's pluggable storage architecture lets you use Lance as the
base file format alongside Parquet
and ORC, unlocking vector indexing, fast random access, and optimized
high-dimensional array storage.
+:::caution Engine Support
+Lance file format support is **Spark-only**. Attempting to read a Lance-backed
table from Flink or Hive throws a
+`HoodieValidationException`:
+> Lance base file format is currently only supported with the Spark engine.
Please use Parquet, ORC, or HFile
+> for non-Spark engines (Flink, Hive, Presto, Trino).
+
+The Lance JAR is **not bundled** in the Hudi distribution — you must add it to
your Spark classpath
+(see [Required Dependencies](#required-dependencies)).
+:::
+
## Enabling Lance in Hudi
-### Table Creation
+### Table Creation (COW)
Set the base file format to `lance` in table properties:
@@ -26,7 +36,26 @@ TBLPROPERTIES (
primaryKey = 'id',
type = 'cow',
hoodie.record.merger.impls = 'org.apache.hudi.DefaultSparkRecordMerger',
- hoodie.datasource.write.base.file.format = 'lance'
+ hoodie.table.base.file.format = 'lance'
+);
+```
+
+### Table Creation (MOR)
+
+Lance base files work with MOR tables: Lance files act as base files while
Avro log files capture
+incremental changes. Compaction merges the log files back into Lance base
files.
+
+```sql
+CREATE TABLE my_ai_table_mor (
+ id STRING,
+ embedding VECTOR(768),
+ metadata STRING
+) USING hudi
+TBLPROPERTIES (
+ primaryKey = 'id',
+ type = 'mor',
+ hoodie.record.merger.impls = 'org.apache.hudi.DefaultSparkRecordMerger',
+ hoodie.table.base.file.format = 'lance'
);
```
@@ -39,18 +68,21 @@ TBLPROPERTIES (
.option("hoodie.datasource.write.recordkey.field", "id")
.option("hoodie.record.merger.impls",
"org.apache.hudi.DefaultSparkRecordMerger")
- .option("hoodie.datasource.write.base.file.format", "lance")
+ .option("hoodie.table.base.file.format", "lance")
.mode("overwrite")
.save("/path/to/my_ai_table"))
```
### Required Dependencies
-Add the Lance Spark bundle to your Spark classpath:
+The Lance JAR is not bundled in Hudi. Add the appropriate Lance Spark bundle
JAR to your Spark classpath:
-| Component | Maven Coordinates |
-|:----------|:-----------------|
-| Lance Spark Bundle (Spark 3.5) |
`org.lance:lance-spark-bundle-3.5_2.12:0.4.0` |
+| Spark Version | Bundle JAR (Maven Central) |
+|:--------------|:---------------------------|
+| Spark 3.4 | `org.lance:lance-spark-bundle-3.4_2.12:0.4.0` |
+| Spark 3.5 | `org.lance:lance-spark-bundle-3.5_2.12:0.4.0` |
+| Spark 4.0 | `org.lance:lance-spark-bundle-4.0_2.13:0.4.0` |
+| Spark 4.1 | `org.lance:lance-spark-bundle-4.1_2.13:0.4.0` |
```bash
export LANCE_BUNDLE_JAR=/path/to/lance-spark-bundle-3.5_2.12-0.4.0.jar
@@ -74,7 +106,6 @@ file-level storage:
│ (same Hudi concepts as Parquet) │
├───────────────────────────────────┤
│ Lance Data Files (.lance) │
-│ IVF-PQ vector index │
│ Columnar storage │
│ Fragment-based layout │
├───────────────────────────────────┤
@@ -87,11 +118,49 @@ All Hudi table services work with Lance-backed tables:
- **Compaction** — merges log files into Lance base files
- **Clustering** — reorganizes Lance files for better data locality
- **Cleaning** — removes old Lance file versions
-- **Metadata indexing** — column stats and bloom filters work across Lance
files
+- **Metadata indexing** — bloom filters work across Lance files; column stats
and partition stats are
+ **automatically disabled** for Lance tables
+
+## VECTOR Storage on Lance
+
+VECTOR columns are stored natively in Lance as `FixedSizeList<Float32/Float64,
dim>` — Lance's own
+vector column encoding, so embeddings are written without conversion overhead
at the file-format
+layer.
+
+Only **FLOAT** and **DOUBLE** element types are supported as VECTOR columns on
Lance. INT8 vectors
+are not yet supported and will fail fast at write time.
+
+See [Vector Search](vector_search.md) for the `hudi_vector_search` TVF that
queries VECTOR columns.
+
+## BLOB Columns on Lance
+
+INLINE BLOB columns on Lance default to `DESCRIPTOR` read mode — standard
queries return an
+out-of-line-shaped reference descriptor rather than materializing the raw
bytes. To read inline
+byte content via `read_blob()`, set `hoodie.read.blob.inline.mode=CONTENT`. See
+[Unstructured Data](blob_unstructured_data.md) for full documentation.
+
+## Schema Evolution
+
+Lance supports the following schema changes at the Hudi layer:
+
+| Operation | Supported? |
+|:----------|:-----------|
+| Add column | Yes |
+| Rename column | Yes (via Hudi schema evolution) |
+| Promote `FLOAT` → `DOUBLE` | **No** — not supported on Lance |
+| Promote `FLOAT` → `STRING` | **No** — not supported on Lance |
+| Drop column | Yes |
+
+:::caution
+`FLOAT → DOUBLE` and `FLOAT → STRING` type promotions are supported for
Parquet tables but **not**
+for Lance. Attempting these on a Lance table will fail. Use `DOUBLE` from the
start if you anticipate
+needing higher precision.
+:::
## Vector Search with Lance
-The `hudi_vector_search` TVF leverages Lance's built-in IVF-PQ index for
approximate nearest neighbor search:
+Use the `hudi_vector_search` TVF to run vector similarity queries against
VECTOR columns on a
+Lance-backed table:
```sql
SELECT id, metadata, _hudi_distance
@@ -107,10 +176,38 @@ See [Vector Search](vector_search.md) for full
documentation on the TVF and dist
## Configuration Reference
-| Property | Description | Default |
-|:---------|:------------|:--------|
-| `hoodie.datasource.write.base.file.format` | Set to `lance` to use Lance as
the base file format | `parquet` |
-| `hoodie.record.merger.impls` | Must be
`org.apache.hudi.DefaultSparkRecordMerger` for Lance | — |
+| Property | Default | Description |
+|:---------|:--------|:------------|
+| `hoodie.table.base.file.format` | `parquet` | Set to `lance` to use Lance as
the base file format. |
+| `hoodie.record.merger.impls` | — | Must be
`org.apache.hudi.DefaultSparkRecordMerger` for Lance. |
+| `hoodie.lance.max.file.size` | `125829120` (120 MiB) | Target file size in
bytes for Lance base files. |
+| `hoodie.lance.write.allocator.size.bytes` | `268435456` (256 MiB) | Maximum
size of the Arrow child allocator used for buffering in-flight batch data.
Increase for tables with very large BLOB columns. |
+| `hoodie.lance.write.flush.byte.watermark` | `100663296` (96 MiB) | Byte-size
threshold at which the current write batch is flushed. Must be less than
`hoodie.lance.write.allocator.size.bytes`. |
+
+### File Sizing and Memory
+
+The three sizing configs work together:
+
+- **`hoodie.lance.max.file.size`** controls when Hudi rolls over to a new
Lance file, similar to
+ `hoodie.parquet.max.file.size` for Parquet tables.
+- **`hoodie.lance.write.allocator.size.bytes`** caps the Arrow allocator's
in-flight memory. Arrow
+ uses power-of-2 buffer doubling; the default 256 MiB accommodates the 128
MiB doubling step with
+ headroom.
+- **`hoodie.lance.write.flush.byte.watermark`** triggers an early batch flush
when Arrow buffers
+ approach the cap. The default 96 MiB (≈ 3/8 of the allocator cap) leaves
room for offset and
+ validity buffers to double without exceeding the allocator limit.
+
+For tables with large BLOB columns, increase both
`hoodie.lance.write.allocator.size.bytes` and
+`hoodie.lance.write.flush.byte.watermark` proportionally (keep watermark at
roughly 3/8 of allocator
+size).
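As a rough sizing aid, the 3/8 guideline can be expressed as a small calculation. The sketch below is illustrative only: the function name is hypothetical, and the ratio is taken from the documented defaults (96 MiB against 256 MiB) rather than from any Hudi API.

```python
def lance_write_sizing(allocator_bytes: int) -> dict:
    """Derive a flush watermark at 3/8 of the allocator size.

    Hypothetical helper; it only mirrors the ratio of the documented defaults.
    """
    if allocator_bytes <= 0:
        raise ValueError("allocator_bytes must be positive")
    watermark_bytes = allocator_bytes * 3 // 8
    # For any positive allocator size, 3/8 of it stays below the cap.
    return {
        "hoodie.lance.write.allocator.size.bytes": allocator_bytes,
        "hoodie.lance.write.flush.byte.watermark": watermark_bytes,
    }


# Example: doubling the default 256 MiB allocator to 512 MiB.
for key, value in lance_write_sizing(512 * 1024 * 1024).items():
    print(f"{key}={value}")
```

Passing the default allocator size (`268435456`) reproduces the default watermark (`100663296`).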
+
+## Additional Notes
+
+- **`populateMetaFields=false`** is supported. User-defined key generators
work normally with Lance
+ tables.
+- **Complex types** (struct, array, map) are supported as Lance columns.
+- **VARIANT columns** are **not supported** on Lance. Attempting to write a
table with VARIANT columns
+ to Lance throws a `HoodieNotSupportedException`. Use Parquet for tables with
VARIANT columns.
## Mixed-Format Tables
diff --git a/website/docs/metadata.md b/website/docs/metadata.md
index c5a01b42dfa9..9c108c815cdd 100644
--- a/website/docs/metadata.md
+++ b/website/docs/metadata.md
@@ -89,6 +89,15 @@ If you turn off the metadata table after enabling, be sure
to wait for a few com
cleaned up, before re-enabling the metadata table again.
:::
+### Auto-Delete of Disabled MDT Partitions
+
+When an index is disabled in the write config, Hudi automatically deletes the
corresponding metadata table partition.
+Available since Hudi 1.2.0, this behavior is configurable.
+
+| Config Name | Default | Description |
+|---|---|---|
+| `hoodie.metadata.auto.delete.partitions` | `true` | When enabled (default),
metadata table partitions (indexes) that are disabled in the write config are
automatically deleted. Set to `false` to prevent accidental deletion in
multi-writer environments where not all writers may have the same config —
users must then drop indexes explicitly via Hudi CLI or `DROP INDEX`. |
+
## Leveraging metadata during queries
### files index
@@ -129,6 +138,28 @@ can bring up the writers sequentially after stopping the
writers for enabling me
configurations to only a subset of writers or table services is unsafe and can
lead to loss of data. So, please ensure you enable
metadata table across all writers.
+## MDT Cleaner and Compaction
+
+Hudi 1.2.0 introduced a config that lets the metadata table's cleaner derive
its retention policy directly from the
+data table, rather than requiring a separate configuration.
+
+| Config Name | Default | Description |
+|---|---|---|
+| `hoodie.metadata.derive.from.datatable.clean.policy` | `true` | When
enabled, the metadata table's cleaner uses the same cleaning policy (retention
count, hours, etc.) as the data table. See also
[cleaning](cleaning.md#mdt-cleaner-inherits-data-table-policy). |
+
+The metadata table's compaction and log compaction can also be delegated to an
external table service platform. See
+[compaction](compaction.md#delegating-mdt-compaction-to-an-external-platform)
for the full config reference.
+
+## Timeline Archival Controls
+
+Hudi 1.2.0 added two configs in `HoodieArchivalConfig`: one controls how many
timeline manifest versions are retained,
+and the other controls how archival interacts with the most recent clean.
+
+| Config Name | Default | Description |
+|---|---|---|
+| `hoodie.timeline.manifest.retained.versions` | `3` | Number of timeline
manifest file versions to retain. Older manifest versions are pruned during
archival. |
+| `hoodie.archive.block.on.latest.clean.ectr` | `false` | When enabled,
archival stops at the Earliest Commit To Retain (ECTR) from the last completed
clean. This prevents archiving commits whose data files still exist on storage,
avoiding inconsistencies between the timeline and actual data. |
+
## Related Resources
<h3>Blogs</h3>
* [Table service deployment models in Apache
Hudi](https://medium.com/@simpsons/table-service-deployment-models-in-apache-hudi-9cfa5a44addf)
diff --git a/website/docs/metadata_indexing.md
b/website/docs/metadata_indexing.md
index 7056a1e02671..dbdd523df60d 100644
--- a/website/docs/metadata_indexing.md
+++ b/website/docs/metadata_indexing.md
@@ -2,7 +2,7 @@
title: Indexing
summary: "In this page, we describe how to run metadata indexing
asynchronously."
toc: true
-last_modified_at:
+last_modified_at: 2026-05-27T00:00:00-00:00
---
Hudi maintains a scalable [metadata](metadata.md) that has some auxiliary data
about the table.
@@ -36,7 +36,7 @@ For more information on these indexes please refer [metadata
section](metadata/#
:::note
Please note in order to create secondary index:
1. The table must have a primary key and merge mode should be
[COMMIT_TIME_ORDERING](record_merger.md#commit_time_ordering).
-2. Record index must be enabled. This can be done by setting
`hoodie.metadata.record.index.enable=true` and then creating `record_index`.
Please note the example below.
+2. Record index must be enabled. This can be done by setting
`hoodie.metadata.global.record.level.index.enable=true` and then creating
`record_index`. Please note the example below.
:::
**Examples**
@@ -73,8 +73,8 @@ hoodie.metadata.index.column.stats.enable=true
-- [Optional Configs] - list of columns to index on. By default all columns
are indexed
hoodie.metadata.index.column.stats.column.list=col1,col2,...
--- [Required Configs] Record Level Index
-hoodie.metadata.record.index.enable=true
+-- [Required Configs] Record Level Index (Global RLI — single record key
unique across all partitions)
+hoodie.metadata.global.record.level.index.enable=true
-- [Required Configs] Bloom filter Index
hoodie.metadata.index.bloom.filter.enable=true
@@ -116,7 +116,7 @@ inserts.write.format("hudi").
// Create record index and secondary index for the table
spark.sql(s"CREATE TABLE test_table_external USING hudi LOCATION '$basePath'")
-spark.sql(s"SET hoodie.metadata.record.index.enable=true")
+spark.sql(s"SET hoodie.metadata.global.record.level.index.enable=true")
spark.sql(s"CREATE INDEX record_index ON test_table_external (uuid)")
spark.sql(s"CREATE INDEX idx_rider ON test_table_external (rider)")
spark.sql(s"SHOW INDEXES FROM hudi_indexed_table").show(false)
@@ -191,6 +191,24 @@ Enabling the metadata table and configuring a lock
provider are the prerequisite
configuration below.
:::
+#### Record-Level Index Configuration Keys
+
+Hudi supports two flavors of the Record Level Index, each with its own enable
flag and sizing configs:
+
+- **Global RLI** — record key is unique across the entire table (across
partitions).
+- **Partitioned RLI** — `partition_path + record_key` is unique within each
partition.
+
+| Config Name | Default | Notes |
+|---|---|---|
+| `hoodie.metadata.global.record.level.index.enable` | `false` | Enables the
global RLI. |
+| `hoodie.metadata.global.record.level.index.min.filegroup.count` | `10` | Min
file groups for the global RLI. |
+| `hoodie.metadata.global.record.level.index.max.filegroup.count` | `10000` |
Max file groups for the global RLI. |
+| `hoodie.metadata.record.level.index.enable` | `false` | Enables the
partitioned RLI. Independent toggle from the global RLI above. |
+| `hoodie.metadata.record.level.index.min.filegroup.count` | `1` | Min file
groups for the partitioned RLI. |
+| `hoodie.metadata.record.level.index.max.filegroup.count` | `10` | Max file
groups for the partitioned RLI. |
+| `hoodie.metadata.record.level.index.defer.init` | `false` | When enabled,
defers RLI initialization to the second commit on a fresh table so Hudi can
size file groups based on actual record volume. Applies to both global and
partitioned RLI. |
+| `hoodie.metadata.record.index.max.filegroup.size` | `1073741824` (1 GB) |
Maximum size in bytes of a single RLI file group. Larger file groups take
longer to compact. |
+
```
# ensure that async indexing is enabled
hoodie.metadata.index.async=true
@@ -284,7 +302,10 @@ indexer logs, we would find that it indeed caught up with
instant `2022041419542
### Drop Index
-To drop an index, just run the index in `dropindex` mode.
+To drop an index, run the indexer in `dropindex` mode. Note that as of Hudi
1.2.0, when an index is disabled in
+the write config, Hudi automatically drops its metadata table partition by
default; see
+[`hoodie.metadata.auto.delete.partitions`](metadata.md#auto-delete-of-disabled-mdt-partitions)
to control this
+behavior.
```
spark-submit \
diff --git a/website/docs/metrics.md b/website/docs/metrics.md
index ec94d21aa45a..151d27f5d319 100644
--- a/website/docs/metrics.md
+++ b/website/docs/metrics.md
@@ -3,7 +3,7 @@ title: Metrics
keywords: [ hudi, administration, operation, devops, metrics]
summary: This section offers an overview of metrics in Hudi
toc: true
-last_modified_at: 2020-06-20T15:59:57-04:00
+last_modified_at: 2026-05-27T00:00:00-00:00
---
In this section, we will introduce the `MetricsReporter` and `HoodieMetrics`
in Hudi. You can view the metrics-related configurations
[here](configurations.md#METRICS).
@@ -204,29 +204,46 @@ These `HoodieMetrics` can then be plotted on a standard
tool like grafana. Below
## List of metrics:
-The below metrics are available in all timeline operations that involves a
commit such as deltacommit, compaction, clustering and rollback.
+The metrics below are emitted across timeline operations (deltacommit,
compaction, clustering, rollback, clean, archival) and post-commit callbacks.
When `hoodie.metrics.reporter.metricsname.prefix` is set, every metric is
reported as `<prefix>.<name>`.
-Name | Description
+Name | Description
--- | ---
-commitFreshnessInMs | Milliseconds from the commit end time and the maximum
event time of the incoming records
-commitLatencyInMs | Milliseconds from the commit end time and the minimum
event time of incoming records
-commitTime | Time of commit in epoch milliseconds
-duration | Total time taken for the commit/rollback in milliseconds
-numFilesDeleted | Number of files deleted during a clean/rollback
-numFilesFinalized | Number of files finalized in a write
-totalBytesWritten | Bytes written in a HoodieCommit
-totalCompactedRecordsUpdated | Number of records updated in a compaction
operation
-totalCreateTime | Time taken for file creation during a Hoodie Insert operation
-totalFilesInsert | Number of newly written files in a HoodieCommit
-totalFilesUpdate | Number of files updated in a HoodieCommit
-totalInsertRecordsWritten | Number of records inserted or converted to
updates(for small file handling) in a HoodieCommit
-totalLogFilesCompacted | Number of log files under a base file in a file
group compacted
-totalLogFilesSize | Total size in bytes of all log files under a base file in
a file group
-totalPartitionsWritten | Number of partitions that took writes in a
HoodieCommit
-totalRecordsWritten | Number of records written in a HoodieCommit. For
inserts, it is the total numbers of records inserted. And for updates, it the
total number of records in the file.
-totalScanTime | Time taken for reading and merging logblocks in a log file
-totalUpdateRecordsWritten | Number of records that got changed in a
HoodieCommit
-totalUpsertTime | Time taken for Hoodie Merge
-
-These metrics can be found at org.apache.hudi.metrics.HoodieMetrics and
referenced from
-org.apache.hudi.common.model.HoodieCommitMetadata and
org.apache.hudi.common.model.HoodieWriteStat
+commitFreshnessInMs | Milliseconds between the commit end time and the maximum
event time of the incoming records.
+commitLatencyInMs | Milliseconds between the commit end time and the minimum
event time of the incoming records.
+commitTime | Time of commit in epoch milliseconds.
+duration | Total time taken for the commit/rollback in milliseconds.
+numFilesDeleted | Number of files deleted during a clean/rollback.
+numFilesFinalized | Number of files finalized in a write.
+totalBytesWritten | Bytes written in a HoodieCommit.
+totalCompactedRecordsUpdated | Number of records updated in a compaction
operation.
+totalCreateTime | Time taken for file creation during a Hoodie Insert
operation.
+totalFilesInsert | Number of newly written files in a HoodieCommit.
+totalFilesUpdate | Number of files updated in a HoodieCommit.
+totalInsertRecordsWritten | Number of records inserted or converted to updates
(for small file handling) in a HoodieCommit.
+totalLogFilesCompacted | Number of log files under a base file in a file group
compacted.
+totalLogFilesSize | Total size in bytes of all log files under a base file in
a file group.
+totalPartitionsWritten | Number of partitions that took writes in a
HoodieCommit.
+totalRecordsWritten | Number of records written in a HoodieCommit. For
inserts, the total records inserted; for updates, the total records in the file.
+totalScanTime | Time taken for reading and merging log blocks in a log file.
+totalUpdateRecordsWritten | Number of records that got changed in a
HoodieCommit.
+totalUpsertTime | Time taken for Hoodie Merge.
+clean.duration | Wall-clock time in milliseconds for a clean operation.
+archive.duration | Wall-clock time in milliseconds for an archive operation.
+rollback.failure.counter | Incremented each time a rollback operation fails.
+postCommit.success.counter | Incremented each time all post-commit callbacks
succeed.
+postCommit.failure.counter | Incremented each time a post-commit callback
fails (post-commit failures are non-fatal).
+postCommit.duration | Wall-clock time in milliseconds for post-commit callback
execution.
+archival.archivalNumAllCommits | Total number of instants archived in this
archival run.
+archival.archivalNumWriteCommits | Number of write instants (commit,
deltacommit, replacecommit) archived.
+archival.archivalNumCleanCommits | Number of clean instants archived.
+archival.archivalNumRollbackCommits | Number of rollback instants archived.
+archival.archivalStatus | `1` if archival succeeded, `-1` if it failed.
+archival.archivalFailure.\<ExceptionClassName\> | Incremented on archival
failure; the suffix is the simple class name of the exception thrown.
+archival.archivalOutOfMemory | Incremented when archival fails due to an
`OutOfMemoryError`.
+\<action\>.totalCorruptedLogBlocks | Number of corrupted log blocks
encountered during compaction. Reported only when
`hoodie.metricscompaction.log.blocks.on=true`. `<action>` is the commit action
type (e.g., `commit`).
+\<action\>.totalRollbackLogBlocks | Number of rollback log blocks encountered
during compaction. Reported only when
`hoodie.metricscompaction.log.blocks.on=true`.
+\<action\>.totalLogBlocksCompacted | Total number of log blocks compacted.
Reported only when `hoodie.metricscompaction.log.blocks.on=true`.
+
+These metrics live in `org.apache.hudi.metrics.HoodieMetrics` (with
archival-specific names sourced from
`org.apache.hudi.client.utils.ArchivalMetrics`) and are referenced from
`org.apache.hudi.common.model.HoodieCommitMetadata` and
`org.apache.hudi.common.model.HoodieWriteStat`.
+
+In multi-tenant deployments where a single Spark job writes to multiple Hudi
tables, each table gets its own isolated `MetricRegistry`, scoped as
`<tableName>.<registryName>` so metrics from different tables do not collide.
No configuration is required.
diff --git a/website/docs/overview.mdx b/website/docs/overview.mdx
index 46c2d5b4be1c..ed439cef38a1 100644
--- a/website/docs/overview.mdx
+++ b/website/docs/overview.mdx
@@ -59,10 +59,10 @@ If you want to experience Apache Hudi integrated into an
end to end demo with Ka
Hudi brings first-class support for AI and unstructured data workloads to the
data lakehouse:
-- **[VECTOR type & Similarity Search](vector_search.md)** — Store embeddings
and run approximate nearest neighbor search directly in Spark SQL
+- **[VECTOR type & Similarity Search](vector_search.md)** — Store embeddings
and run vector similarity search directly in Spark SQL
- **[BLOB type for Unstructured Data](blob_unstructured_data.md)** — Store
images, PDFs, audio, and other binary data with inline or out-of-line storage
- **[VARIANT type for Semi-Structured Data](variant_type.md)** — Store
flexible JSON-like data (LLM outputs, model metadata, feature maps) without
rigid schemas
-- **[Lance File Format](lance_file_format.md)** — Vector-optimized columnar
format with built-in ANN indexing
+- **[Lance File Format](lance_file_format.md)** — Vector-friendly columnar
format for AI/ML workloads
See the full [AI-Native Lakehouse Overview](ai_overview.md) for use cases and
architecture.
diff --git a/website/docs/precommit_validator.md
b/website/docs/precommit_validator.md
index fe0a0dd77605..d67d9ec3a054 100644
--- a/website/docs/precommit_validator.md
+++ b/website/docs/precommit_validator.md
@@ -109,6 +109,73 @@ Hudi offers a [commit notification
service](platform_services_post_commit_callba
The commit notification service can be combined with pre-commit validators to
send a notification when a commit fails a validation. This is possible by
passing details about the validation as a custom value to the HTTP endpoint.
+## Notes on Validator Behavior
+
+Hudi 1.2.0 introduced the following behavioral refinements:
+
+**Metadata fields in SQL queries**: Validator SQL can now reference Hudi
metadata fields (`_hoodie_record_key`, `_hoodie_partition_path`,
`_hoodie_file_name`, `_hoodie_commit_time`, `_hoodie_commit_seqno`) directly in
query expressions.
+
+**Empty writes**: Empty write commits no longer cause pre-commit validators to
error. Validators are skipped gracefully when no records are present in the
write.
+
+## Failure Policy
+
+Hudi 1.2.0 introduces a configurable failure policy for pre-commit validators:
+
+| Config Key | Default | Description |
+|---|---|---|
+| `hoodie.precommit.validators.failure.policy` | `FAIL` | How to handle
validator failures. `FAIL`: block the commit with an exception. `WARN_LOG`:
emit a warning log but allow the commit to proceed (useful for soft
monitoring). |
+
+## Flink and Streaming-Offset Validators
+
+Available since Hudi 1.2.0. Flink writers now honor
`hoodie.precommit.validators` using the same configuration key as Spark.
Validators intended for use with Flink must extend the engine-agnostic
`org.apache.hudi.client.validator.BasePreCommitValidator` (in `hudi-common`),
which provides access to commit metadata and timeline information independently
of Spark.
+
+Two built-in streaming-offset validators are now available for Kafka-sourced
pipelines:
+
+| Validator Class | Engine | Description |
+|---|---|---|
+| `org.apache.hudi.sink.validator.FlinkKafkaOffsetValidator` | Flink |
Validates that the number of records written matches the Kafka offset
difference for the batch |
+| `org.apache.hudi.utilities.streamer.validator.SparkKafkaOffsetValidator` |
Spark / HoodieStreamer | Same semantics for Spark-based Kafka ingestion
pipelines |
+
+Both validators use the following configuration:
+
+| Config Key | Default | Description |
+|---|---|---|
+| `hoodie.precommit.validators.streaming.offset.tolerance.percentage` | `0.0`
| Tolerance percentage for offset-based record-count validation. A value of
`0.0` requires an exact match between expected records (from Kafka offset
delta) and actual records written. For upsert workloads with deduplication, set
a higher tolerance (e.g., `10.0` for 10%). |
+| `hoodie.precommit.validators.failure.policy` | `FAIL` | See [Failure
Policy](#failure-policy) above. |
+
+Example (Flink):
+```properties
+hoodie.precommit.validators=org.apache.hudi.sink.validator.FlinkKafkaOffsetValidator
+hoodie.precommit.validators.streaming.offset.tolerance.percentage=5.0
+hoodie.precommit.validators.failure.policy=WARN_LOG
+```
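The tolerance setting can be read as a bound on the relative gap between the expected and actual record counts. The sketch below is an assumption about how such a check might be computed, not the validators' actual code; the function name and the exact formula are hypothetical.

```python
def within_offset_tolerance(expected: int, actual: int, tolerance_pct: float) -> bool:
    """Return True if actual is within tolerance_pct percent of expected.

    expected: record count implied by the Kafka offset delta (assumed input).
    actual: record count written in the commit (assumed input).
    """
    if expected == 0:
        # Nothing was consumed, so only an empty write matches.
        return actual == 0
    deviation_pct = abs(expected - actual) * 100 / expected
    return deviation_pct <= tolerance_pct


print(within_offset_tolerance(1000, 1000, 0.0))  # True: exact match
print(within_offset_tolerance(1000, 940, 5.0))   # False: 6% deviation
print(within_offset_tolerance(1000, 960, 5.0))   # True: 4% deviation
```

Under this reading, a tolerance of `0.0` accepts only an exact match, while an upsert pipeline that deduplicates records would need a value large enough to cover the expected share of duplicates.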
+
+## Pre-Write Validators
+
+Introduced in Hudi 1.2.0, pre-write validators run **before** data is written
to storage, in contrast to pre-commit validators which run **after** data is
written but before the commit is published to the timeline. This enables
earlier rejection of invalid operations, avoiding unnecessary I/O.
+
+Configuration:
+
+| Config Key | Default | Description |
+|---|---|---|
+| `hoodie.prewrite.validators` | `""` | Comma-separated list of
fully-qualified class names implementing
`org.apache.hudi.client.validator.PreWriteValidator`. |
+
+To implement a custom pre-write validator, implement the
`org.apache.hudi.client.validator.PreWriteValidator` interface:
+
+```java
+public interface PreWriteValidator {
+ <T> void validate(
+ String instantTime,
+ WriteOperationType writeOperationType,
+ HoodieTableMetaClient metaClient,
+ HoodieWriteConfig writeConfig,
+ HoodieEngineContext engineContext,
+ Option<HoodieData<HoodieRecord<T>>> recordsOpt) throws
HoodieValidationException;
+}
+```
+
+No built-in pre-write validator implementations are provided yet; this
framework is designed for custom user extensions. Unlike pre-commit validators,
pre-write validators have access to the incoming records before any write I/O
occurs.
+
## Related Resources
<h3>Blogs</h3>
diff --git a/website/docs/procedures.md b/website/docs/procedures.md
index 71edf434ed10..94dd41991598 100644
--- a/website/docs/procedures.md
+++ b/website/docs/procedures.md
@@ -2,7 +2,7 @@
title: SQL Procedures
summary: "In this page, we introduce how to use SQL procedures with Hudi."
toc: true
-last_modified_at: 2025-11-24T00:00:00
+last_modified_at: 2026-05-27T00:00:00-00:00
---
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';
@@ -298,6 +298,64 @@ call show_archived_commits(table => 'test_hudi_table');
| 20220216171027021 | 435346 | 1 | 0
| 1 | 1 | 0
| 0 |
| 20220216171019361 | 435349 | 1 | 0
| 1 | 1 | 0
| 0 |
+### show_timeline
+
+Show timeline entries for a Hudi table. Returns instant-level information for
all timeline operations (commits, compactions, clustering, clean, rollback,
etc.) from the active and optionally archived timeline. Results are sorted by
timestamp descending.
+
+**Input**
+
+| Parameter Name | Type | Required | Default Value | Description
|
+|----------------|---------|----------|---------------|----------------------------------------------------------------------------------|
+| table | String | N* | None | Hudi table name
(mutually exclusive with `path`) |
+| path | String | N* | None | Base path of the Hudi
table (mutually exclusive with `table`) |
+| limit | Int | N | 20 | Max number of timeline
entries to return (ignored when both `startTime` and `endTime` are set) |
+| showArchived | Boolean | N | false | Whether to include
archived timeline entries |
+| filter | String | N | "" | SQL expression to
filter results on any output column |
+| startTime | String | N | "" | Start timestamp for
filtering (format: `yyyyMMddHHmmss`, inclusive) |
+| endTime | String | N | "" | End timestamp for
filtering (format: `yyyyMMddHHmmss`, inclusive) |
+
+\* Either `table` or `path` must be provided.
+
+**Output**
+
+| Output Name    | Type   | Description |
+|----------------|--------|-------------|
+| instant_time   | String | Requested timestamp of the instant |
+| action         | String | Action type: `commit`, `deltacommit`, `compaction`, `clustering`, `clean`, `rollback`, etc. |
+| state          | String | State of the instant: `REQUESTED`, `INFLIGHT`, or `COMPLETED` |
+| requested_time | String | Wall-clock time when the instant was requested (format: `MM-dd HH:mm:ss`) |
+| inflight_time  | String | Wall-clock time when the instant became inflight (format: `MM-dd HH:mm:ss`) |
+| completed_time | String | Wall-clock time when the instant completed (format: `MM-dd HH:mm:ss`), or `null` |
+| timeline_type  | String | `ACTIVE` or `ARCHIVED` |
+| rollback_info  | String | For rollback instants: what was rolled back; for rolled-back instants: which rollback instant rolled them back; otherwise `null` |
+
+**Example**
+
+```sql
+-- Show the 20 most recent timeline entries
+call show_timeline(table => 'test_hudi_table');
+
+-- Show up to 50 entries including archived timeline
+call show_timeline(table => 'test_hudi_table', limit => 50, showArchived => true);
+
+-- Filter to completed commits in a time range
+call show_timeline(
+ table => 'test_hudi_table',
+ startTime => '20251201000000',
+ endTime => '20251231235959',
+ filter => "action = 'commit' AND state = 'COMPLETED'"
+);
+
+-- Look up by base path instead of table name
+call show_timeline(path => 'hdfs:///user/hive/warehouse/test_hudi_table');
+```
+
+| instant_time      | action | state     | requested_time | inflight_time  | completed_time | timeline_type | rollback_info |
+|-------------------|--------|-----------|----------------|----------------|----------------|---------------|---------------|
+| 20251205143022001 | commit | COMPLETED | 12-05 14:30:20 | 12-05 14:30:21 | 12-05 14:30:22 | ACTIVE        | null          |
+| 20251205141510003 | clean  | COMPLETED | 12-05 14:15:09 | 12-05 14:15:10 | 12-05 14:15:10 | ACTIVE        | null          |
+| 20251205140030002 | commit | COMPLETED | 12-05 14:00:28 | 12-05 14:00:29 | 12-05 14:00:30 | ACTIVE        | null          |
+
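The way `limit`, `startTime`, and `endTime` interact can be modeled in a few lines. The Python sketch below is an illustration only, not Hudi's implementation: the function name `select_instants` is hypothetical, and it assumes that instant times are compared against the 14-digit bounds by their first 14 characters and that `limit` still applies when only one bound is given.

```python
def select_instants(instant_times, limit=20, start_time="", end_time=""):
    """Model show_timeline's ordering, time-range, and limit rules."""
    # Results are sorted by timestamp, newest first.
    ordered = sorted(instant_times, reverse=True)
    # Both bounds are inclusive; compare on the yyyyMMddHHmmss prefix (assumption).
    if start_time:
        ordered = [t for t in ordered if t[:14] >= start_time]
    if end_time:
        ordered = [t for t in ordered if t[:14] <= end_time]
    # limit is ignored only when both bounds are set.
    if start_time and end_time:
        return ordered
    return ordered[:limit]


instants = ["20251205140030002", "20251205141510003", "20251205143022001"]
print(select_instants(instants, limit=2))
print(select_instants(instants, limit=1,
                      start_time="20251205140000", end_time="20251205142000"))
```

The second call returns two instants even though `limit` is 1, because a fully bounded time range takes precedence over the limit.
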
### show_commit_files
Show files of a commit.
diff --git a/website/docs/reading_tables_batch_reads.md
b/website/docs/reading_tables_batch_reads.md
index 0a62f34d55e3..3b1e845e2145 100644
--- a/website/docs/reading_tables_batch_reads.md
+++ b/website/docs/reading_tables_batch_reads.md
@@ -19,6 +19,32 @@ val tripsDF = spark.read.
tripsDF.where(tripsDF.fare > 20.0).show()
```
+## Flink Batch (Snapshot) Read
+
+Flink can read a Hudi table as a snapshot (batch) query by leaving
`read.streaming.enabled` at its default value of `false`.
+
+```sql
+CREATE TABLE hudi_table (
+ uuid VARCHAR(20) PRIMARY KEY NOT ENFORCED,
+ name VARCHAR(10),
+ age INT,
+ ts TIMESTAMP(3),
+ `partition` VARCHAR(20)
+)
+PARTITIONED BY (`partition`)
+WITH (
+ 'connector' = 'hudi',
+ 'path' = '${path}',
+ 'table.type' = 'MERGE_ON_READ'
+ -- read.streaming.enabled defaults to false → batch/snapshot read
+);
+
+-- Snapshot query
+SELECT * FROM hudi_table WHERE age > 25;
+```
+
+For more Flink read options, see [Using Flink](ingestion_flink.md).
+
## Daft
[Daft](https://www.daft.ai/) supports reading Hudi tables using
`daft.read_hudi()` function.
diff --git a/website/docs/reading_tables_streaming_reads.md
b/website/docs/reading_tables_streaming_reads.md
index 5055f42c0449..7191cedf0115 100644
--- a/website/docs/reading_tables_streaming_reads.md
+++ b/website/docs/reading_tables_streaming_reads.md
@@ -97,3 +97,47 @@ spark.readStream \
Spark SQL can be used within ForeachBatch sink to do INSERT, UPDATE, DELETE
and MERGE INTO.
Target table must exist before write.
:::
+
+## Flink Streaming Read
+
+Flink can continuously consume new commits from a Hudi table as a streaming source. Enable this by setting `read.streaming.enabled=true` and, optionally, `read.start-commit` to choose the instant to start from.
+
+```sql
+CREATE TABLE hudi_table (
+ uuid VARCHAR(20) PRIMARY KEY NOT ENFORCED,
+ name VARCHAR(10),
+ age INT,
+ ts TIMESTAMP(3),
+ `partition` VARCHAR(20)
+)
+PARTITIONED BY (`partition`)
+WITH (
+ 'connector' = 'hudi',
+ 'path' = '${path}',
+ 'table.type' = 'MERGE_ON_READ',
+ 'read.streaming.enabled' = 'true', -- enable streaming read
+ 'read.start-commit' = '20210316134557', -- start from this instant (omit for latest)
+ 'read.streaming.check-interval' = '60' -- poll interval in seconds
+);
+
+SELECT * FROM hudi_table;
+```
+
+### Source V2 for Streaming
+
+As of Hudi 1.2.0, the [FLIP-27-based Source
V2](ingestion_flink.md#flink-source-v2) is available as an opt-in for streaming
reads. Source V2 participates in Flink's checkpoint protocol for finer-grained
recovery and supports partition pruning:
+
+```sql
+WITH (
+ 'connector' = 'hudi',
+ 'path' = '${path}',
+ 'read.streaming.enabled' = 'true',
+ 'read.source-v2.enabled' = 'true' -- enable FLIP-27 source (Hudi 1.2.0+)
+)
+```
+
+:::warning
+Savepoints taken with the legacy source are not compatible with Source V2.
Start a fresh job when switching. See [Flink Source
V2](ingestion_flink.md#flink-source-v2) for migration details.
+:::
+
+For a full list of Flink streaming read options (rate limiting, commits limit,
CDC mode, etc.), see [Using Flink](ingestion_flink.md).
diff --git a/website/docs/sql_ddl.md b/website/docs/sql_ddl.md
index d1c5ba865bdb..2fc0d421dab2 100644
--- a/website/docs/sql_ddl.md
+++ b/website/docs/sql_ddl.md
@@ -2,7 +2,7 @@
title: SQL DDL
summary: "In this page, we discuss using SQL DDL commands with Hudi"
toc: true
-last_modified_at:
+last_modified_at: 2026-05-27T00:00:00-00:00
---
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';
@@ -696,6 +696,14 @@ SHOW PARTITIONS hudi_table;
ALTER TABLE hudi_table DROP PARTITION (dt='2021-12-09', hh='10');
```
+:::note Slash-separated date partitioning and SHOW PARTITIONS
+When a table is written with
`hoodie.datasource.write.slash.separated.date.partitioning=true`, the
+physical directory layout uses `yyyy/MM/dd` paths. `SHOW PARTITIONS` correctly
handles this: it
+returns partition values in the standard `col=yyyy-MM-dd` display format,
normalizing the `/`
+separators back to `-` for readability. See [Key
Generation](key_generation.md#slash-separated-date-partitioning)
+for details on configuring slash-separated partitioning.
+:::
+
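The normalization described in the note amounts to a simple path-to-value mapping. The Python sketch below only illustrates that mapping; the helper name `display_partition` is hypothetical and is not part of Hudi.

```python
def display_partition(column, physical_path):
    """Render a yyyy/MM/dd directory path as col=yyyy-MM-dd."""
    year, month, day = physical_path.strip("/").split("/")
    return f"{column}={year}-{month}-{day}"


print(display_partition("dt", "2021/12/09"))  # dt=2021-12-09
```
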
### Show and drop index
**Syntax**
@@ -837,6 +845,39 @@ WITH (
);
```
+### Create Append-Only Table Without Primary Key
+
+Hudi 1.2.0 supports creating a Flink table **without a `PRIMARY KEY`** for
pure append workloads.
+In this mode, set `write.operation` to `insert`; Hudi will not enforce
record-level uniqueness and
+the record-key and ordering fields are optional.
+
+```sql
+-- Append-only table: no PRIMARY KEY required
+CREATE TABLE hudi_append_table (
+ id BIGINT,
+ name STRING,
+ ts BIGINT,
+ city STRING
+)
+PARTITIONED BY (`city`)
+WITH (
+ 'connector' = 'hudi',
+ 'path' = 'file:///tmp/hudi_append_table',
+ 'table.type' = 'COPY_ON_WRITE',
+ 'write.operation' = 'insert'
+);
+
+INSERT INTO hudi_append_table VALUES (1, 'Alice', 1695159649, 'sf'), (2, 'Bob', 1695091554, 'ny');
+```
+
+:::note
+Without a primary key, Hudi uses auto-generated record keys and does **not**
perform deduplication
+or upsert merging. This is equivalent to `bulk_insert` semantics and is well
suited for log/event
+ingestion pipelines where every incoming row should be appended as-is.
+If `write.operation` is any value other than `insert` and no `PRIMARY KEY` is
defined, Hudi will
+throw `"Primary key definition is missing"` at table creation time.
+:::
+
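The rule in the note can be summarized as a small validation check. The Python sketch below is a model of that rule, not Hudi's code: the function name `validate_flink_table` is hypothetical, and it assumes `write.operation` defaults to `upsert` when unset.

```python
def validate_flink_table(primary_key, options):
    """Return the table mode, or raise if a primary key is required but missing."""
    # Assumption: the default write operation is "upsert".
    operation = options.get("write.operation", "upsert")
    if not primary_key and operation != "insert":
        raise ValueError("Primary key definition is missing")
    return "keyed" if primary_key else "append-only"


print(validate_flink_table([], {"write.operation": "insert"}))  # append-only
print(validate_flink_table(["uuid"], {}))                       # keyed
```
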
### Create Table in Non-Blocking Concurrency Control Mode
The following is an example of creating a Flink table in [Non-Blocking
Concurrency Control
mode](concurrency_control.md#non-blocking-concurrency-control).
@@ -967,3 +1008,15 @@ WITH (
| numeric | | not supported |
| null | | not supported |
| object | | not supported |
+
+### AI and Unstructured Data Types
+
+Hudi 1.2.0 introduces two additional column types for AI and unstructured data
workloads:
+
+- **`VECTOR(dim[, elementType])`** — stores fixed-dimension embedding vectors
(e.g. `VECTOR(768)`,
+ `VECTOR(768, FLOAT)`, `VECTOR(768, DOUBLE)`). Enables approximate
nearest-neighbor search via
+ the `hudi_vector_search` TVF. See [Vector Search](vector_search.md) for full
details.
+
+- **`BLOB`** — stores arbitrary binary objects (images, audio, documents)
either inline within the
+ base file or as external references. See [BLOB / Unstructured
Data](blob_unstructured_data.md)
+ for the storage modes, DDL syntax, and read APIs.
diff --git a/website/docs/sql_queries.md b/website/docs/sql_queries.md
index 9310c72fbc62..4b9d6c2dd695 100644
--- a/website/docs/sql_queries.md
+++ b/website/docs/sql_queries.md
@@ -2,7 +2,7 @@
title: SQL Queries
summary: "In this page, we go over querying Hudi tables using SQL"
toc: true
-last_modified_at:
+last_modified_at: 2026-05-27T00:00:00-00:00
---
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';
@@ -14,6 +14,24 @@ This page will show how to issue different queries and
discuss any specific inst
## Spark SQL
The Spark [quickstart](quick-start-guide.md) provides a good overview of how
to use Spark SQL to query Hudi tables. This section will go into more advanced
configurations and functionalities.
+:::tip Setting Hudi read options at the session level
+Hudi 1.2.0 supports setting read options at the **Spark session level** using
the `spark.hoodie.*` prefix.
+Any `spark.hoodie.X` config set via `spark.conf.set` or `--conf` is treated
equivalently to `hoodie.X`.
+
+Config precedence (low → high):
+1. Global DFS properties
+2. `spark.hoodie.*` session-level configs (normalized to `hoodie.*`)
+3. Explicit `hoodie.*` data source options or per-table `SET` commands
+
+```sql
+-- Apply a Hudi read option for the entire session
+SET spark.hoodie.metadata.column.stats.enable = true;
+SELECT * FROM hudi_table WHERE price BETWEEN 10.0 AND 50.0;
+```
+
+If both `spark.hoodie.X` and `hoodie.X` are set, the explicit `hoodie.X` value
takes precedence.
+:::
+
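The precedence order in the tip can be pictured as three dictionaries merged from lowest to highest priority. The Python sketch below is a model under that reading, not Hudi's resolution code; the function name `resolve_hudi_configs` and the sample values are hypothetical.

```python
def resolve_hudi_configs(global_props, session_conf, explicit_options):
    """Merge configs from lowest to highest precedence."""
    resolved = dict(global_props)                      # 1. global DFS properties
    for key, value in session_conf.items():            # 2. spark.hoodie.* session configs
        if key.startswith("spark.hoodie."):
            resolved[key[len("spark."):]] = value      # normalize to hoodie.*
    resolved.update(explicit_options)                  # 3. explicit hoodie.* options
    return resolved


configs = resolve_hudi_configs(
    {"hoodie.metadata.enable": "true"},
    {"spark.hoodie.metadata.column.stats.enable": "true", "spark.sql.shuffle.partitions": "8"},
    {"hoodie.metadata.column.stats.enable": "false"},
)
print(configs["hoodie.metadata.column.stats.enable"])  # false
```

Non-Hudi session settings such as `spark.sql.shuffle.partitions` are left out of the result, since only keys with the `spark.hoodie.` prefix are normalized.
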
### Snapshot Query
Snapshot queries are the most common query type for Hudi tables. Spark SQL
supports snapshot queries on both COPY_ON_WRITE and MERGE_ON_READ tables.
Using session properties, you can specify options around indexing to optimize
query performance, as shown below.
@@ -332,6 +350,19 @@ also changed to use completion time. To support
compatiblity, Hudi does a checkp
time to completion time depending on the source table version.
:::
+### Vector Similarity Search
+
+Hudi 1.2.0 introduces a `hudi_vector_search` table-valued function (TVF) that returns the rows whose `VECTOR` column is closest to a given query vector. It follows the same TVF calling pattern as `hudi_table_changes`.
+
+```sql
+-- Find the 10 nearest neighbors to a query vector in the 'embedding' column
+SELECT * FROM hudi_vector_search('db.embeddings_table', 'embedding', ARRAY(0.1, 0.2, ...), 10);
+```
+
+See [Vector Search](vector_search.md) for the full API, supported metrics, and
setup instructions.
+
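Conceptually, a top-k similarity query ranks rows by their distance to the query vector and keeps the closest `k`. The Python sketch below shows a brute-force version of that idea using cosine distance. It is not how Hudi executes the TVF; the names `cosine_distance` and `top_k` are hypothetical, and it assumes non-zero vectors of equal dimension.

```python
import math


def cosine_distance(a, b):
    """1 - cosine similarity; assumes both vectors are non-zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def top_k(rows, query, k):
    """rows: list of (key, vector) pairs; returns the k rows nearest to query."""
    return sorted(rows, key=lambda row: cosine_distance(row[1], query))[:k]


rows = [("a", [1.0, 0.0]), ("b", [0.0, 1.0]), ("c", [0.9, 0.1])]
print([key for key, _ in top_k(rows, [1.0, 0.0], 2)])  # ['a', 'c']
```
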
### Query Indexes and Timeline
Hudi also allows users to directly query the metadata partitions and check the
metadata corresponding to the table
diff --git a/website/docs/syncing_aws_glue_data_catalog.md
b/website/docs/syncing_aws_glue_data_catalog.md
index 35f43a8af472..4b1e295dc831 100644
--- a/website/docs/syncing_aws_glue_data_catalog.md
+++ b/website/docs/syncing_aws_glue_data_catalog.md
@@ -54,6 +54,10 @@
hoodie.datasource.meta.sync.glue.partition_index_fields.enable
hoodie.datasource.meta.sync.glue.partition_index_fields
```
+## Writer Version Table Property
+
+Hudi 1.2.0 Glue sync writes the table property `hudi_writer_version` (set to
the Hudi version that last synced the table) to the Glue Data Catalog entry on
every sync, consistent with HMS sync behavior.
+
## Other references
### Running AWS Glue Catalog Sync for Spark DataSource
diff --git a/website/docs/syncing_metastore.md
b/website/docs/syncing_metastore.md
index f260a585c1b6..059f172de003 100644
--- a/website/docs/syncing_metastore.md
+++ b/website/docs/syncing_metastore.md
@@ -297,3 +297,44 @@ While using hive beeline query, you need to enter settings:
```bash
set hive.input.format = org.apache.hudi.hadoop.hive.HoodieCombineHiveInputFormat;
```
+
+## Spark Catalog Metastore Client
+
+When running Hudi inside a Spark environment that already has Hive support enabled (e.g., Spark SQL with `spark.sql.catalogImplementation=hive`), the standard `IMetaStoreClient` initialization can conflict with Spark's own Hive classloader. To avoid this, set:
+
+```properties
+hoodie.datasource.hive_sync.use_spark_catalog=true
+```
+
+With this option enabled (default: `false`), Hudi uses `SparkCatalogMetaStoreClient`, a Spark-native `IMetaStoreClient` implementation, instead of creating its own client. This avoids classloader conflicts when Spark runs with Hive support, and it requires a `SparkSession` with Hive support active.
+
+## HMS 4.x Support via JDBC Fallback
+
+HMS 4.x changed several Thrift API method signatures (e.g., `get_table` →
`get_table_req`), which makes the standard Thrift-based HMS client
incompatible. Hudi 1.2.0 adds automatic fallback: when a Thrift metadata call
surfaces a `TApplicationException` anywhere in its cause chain, Hudi flips an
internal `thriftIncompatible` flag and reroutes the rest of that sync run
through the JDBC path.
+
+**Requirement:** sync mode must be `jdbc` with a valid JDBC URL so the
fallback client is available:
+
+```properties
+hoodie.datasource.hive_sync.mode=jdbc
+hoodie.datasource.hive_sync.jdbcurl=jdbc:hive2://hiveserver:10000
+hoodie.datasource.hive_sync.username=<username>
+hoodie.datasource.hive_sync.password=<password>
+```
+
+If the table is synced with `mode=hms` or `mode=hiveql` against HMS 4.x, Hudi
logs `"Thrift API incompatible with HMS but no JDBC fallback available.
Consider using mode=jdbc with a valid jdbcUrl."` and surfaces the original
exception — no automatic recovery happens.
+
+**Detection scope.** The flag is per `HoodieHiveSyncClient` instance, not
global, and only transitions from `false` to `true` (it never resets). In
practice this means the first Thrift call of each sync run probes once, and the
rest of that run uses the JDBC fallback. The next sync run starts with a fresh
probe.
+
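The per-instance, one-way flag can be modeled as follows. This Python sketch only illustrates the detection logic described above and is not Hudi's Java code: `SyncClientSketch`, `has_cause`, and the local `TApplicationException` stand-in are hypothetical names, and retrying the failing call over JDBC is an assumption of the sketch.

```python
class TApplicationException(Exception):
    """Stand-in for the Thrift exception type (hypothetical in this sketch)."""


def has_cause(exc, exc_type):
    """Walk the cause chain looking for an exception of the given type."""
    while exc is not None:
        if isinstance(exc, exc_type):
            return True
        exc = exc.__cause__
    return False


class SyncClientSketch:
    def __init__(self):
        # Per-instance flag: starts False, only ever flips to True.
        self.thrift_incompatible = False

    def call(self, thrift_call, jdbc_call):
        if self.thrift_incompatible:
            return jdbc_call()
        try:
            return thrift_call()
        except Exception as exc:
            if not has_cause(exc, TApplicationException):
                raise
            self.thrift_incompatible = True
            return jdbc_call()  # assumption: the failing call is retried via JDBC


def failing_thrift_call():
    try:
        raise TApplicationException("Invalid method name")
    except TApplicationException as inner:
        raise RuntimeError("metastore call failed") from inner


client = SyncClientSketch()
print(client.call(failing_thrift_call, lambda: "jdbc"))  # jdbc
print(client.call(lambda: "thrift", lambda: "jdbc"))     # jdbc (flag stays set)
print(SyncClientSketch().call(lambda: "thrift", lambda: "jdbc"))  # thrift (fresh probe)
```
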
+**JDBC connection failures surface separately.** With `mode=jdbc`, Hudi opens
the JDBC connection eagerly when the sync client is constructed — before any
Thrift call is attempted. A bad JDBC URL, missing driver, or wrong credentials
therefore fails at startup with `HoodieHiveSyncException: Failed to create
HiveMetaStoreClient` and the underlying JDBC exception as the cause in the
stack trace. This is a configuration-error path, not an HMS API mismatch, and
is the same behavior as `mode= [...]
+
+## Writer Version Table Property
+
+Hudi 1.2.0 sync writes the table property `hudi_writer_version` (set to the
Hudi version that last synced the table) to the Hive metastore entry on every
sync. This allows tooling and metastore administrators to identify which Hudi
version wrote a given table.
+
+To emit `TOUCH` events to the metastore for partition-level change tracking
(e.g., for downstream catalog notifications), set:
+
+```properties
+hoodie.meta.sync.touch.partitions.enabled=true
+```
+
+Default is `false`. When enabled, a TOUCH event is issued for each partition
that was modified in the sync operation.
diff --git a/website/docs/variant_type.md b/website/docs/variant_type.md
index bf42390afe2f..fdcf6f54b54b 100644
--- a/website/docs/variant_type.md
+++ b/website/docs/variant_type.md
@@ -3,7 +3,7 @@ title: "Semi-Structured Data (VARIANT)"
keywords: [ hudi, variant, semi-structured, json, schemaless, shredding,
parse_json, flexible schema]
summary: "Store and query semi-structured JSON-like data in Hudi tables using
the VARIANT type, with optional shredding for query performance"
toc: true
-last_modified_at: 2026-04-25T00:00:00-00:00
+last_modified_at: 2026-05-27T00:00:00-00:00
---
import Tabs from '@theme/Tabs';
@@ -253,12 +253,25 @@ binary `value` field.
| Engine | VARIANT Support |
|:-------|:---------------|
-| **Spark 4.0+** | Native `VariantType` — full read/write/query |
+| **Spark 4.0** | Native `VariantType` — full read/write/query for COW and
MOR; native `df.write` with `VariantType` on the V1 datasource |
+| **Spark 4.1** | Native `VariantType` — full read/write/query for COW and MOR
|
| **Spark 3.x** | Reads as `STRUCT<value: BINARY, metadata: BINARY>` —
backward compatible |
-| **Flink** | Reads as `ROW<metadata BYTES, value BYTES>` — cross-engine
compatible |
+| **Flink** | Native `VARIANT` operations are not supported. Tables written by Spark with VARIANT columns can be read in Flink only as the underlying `ROW<metadata BYTES, value BYTES>` struct. |
-A VARIANT table written by Spark 4.0 can be read by Spark 3.x or Flink, and vice versa. The
-binary encoding is engine-independent.
+A VARIANT table written by Spark 4.0/4.1 can be read by Spark 3.x using the underlying binary struct, or by Flink as `ROW<metadata BYTES, value BYTES>`. The binary encoding is engine-independent.
+
+## Metastore Sync
+
+When syncing VARIANT column schemas to external catalogs, Hudi maps the binary
encoding to the
+target catalog's native struct type:
+
+| Catalog | VARIANT representation |
+|:--------|:----------------------|
+| Hive | `STRUCT<metadata:BINARY, value:BINARY>` |
+| BigQuery | `STRUCT` with `metadata` and `value` fields (`BYTES` type) |
+
+Query engines that support VARIANT (Spark 4.0+, Flink 2.1+) read the table
directly using the
+Parquet VARIANT annotation and do not go through the Hive/BigQuery metastore
representation.
## Use Cases for AI Workloads
@@ -344,3 +357,6 @@ CREATE TABLE api_responses (
- Native `VARIANT` keyword in DDL requires Spark 4.0+. On Spark 3.x, use the
struct representation.
- VARIANT shredding configuration is determined at write time based on the
schema definition.
- Complex path expressions within VARIANT may require casting to STRING and
then using JSON functions.
+- Native VARIANT operations are not supported on Flink. VARIANT columns
surface as `ROW<metadata BYTES, value BYTES>` and can be read but not natively
decoded or queried as a variant.
+- VARIANT columns are **not supported** on Lance-backed tables. Use Parquet as
the base file format
+ for tables containing VARIANT columns.
diff --git a/website/docs/vector_search.md b/website/docs/vector_search.md
index 9c53443161a7..4e6687057a6c 100644
--- a/website/docs/vector_search.md
+++ b/website/docs/vector_search.md
@@ -1,9 +1,9 @@
---
title: "Vector Search"
-keywords: [ hudi, vector, search, embeddings, similarity, cosine, ANN, nearest
neighbor, VECTOR type]
-summary: "Store embedding vectors in Hudi tables and run approximate nearest
neighbor search using the VECTOR type and hudi_vector_search TVF"
+keywords: [ hudi, vector, search, embeddings, similarity, cosine, nearest
neighbor, VECTOR type]
+summary: "Store embedding vectors in Hudi tables and run vector similarity
search using the VECTOR type and hudi_vector_search TVF"
toc: true
-last_modified_at: 2026-04-25T00:00:00-00:00
+last_modified_at: 2026-05-27T00:00:00-00:00
---
import Tabs from '@theme/Tabs';
@@ -13,6 +13,20 @@ Hudi's `VECTOR` type and `hudi_vector_search` table-valued
function (TVF) bring
to the data lakehouse. Store embeddings alongside your structured data and
query them with familiar Spark SQL —
no external vector database required.
+## Storage Format
+
+VECTOR columns are stored in Parquet as `FIXED_LEN_BYTE_ARRAY` — a
fixed-length binary encoding of the
+float array. Hudi stamps `hudi_type` metadata on the column so the Spark
reader knows to decode the
+bytes back into a typed array.
+
+On **Lance** tables, VECTOR columns are stored natively as Lance
`FixedSizeList<Float32/Float64, dim>`,
+so embeddings are written without conversion overhead at the file-format
layer. See
+[Lance File Format](lance_file_format.md) for details.
+
+The `VECTOR(dim[, elementType])` DDL syntax works across Spark 3.4, 3.5, 4.0,
and 4.1. Hudi's SQL
+parser normalizes `VECTOR(128, FLOAT)` to `VECTOR(128)` (FLOAT is the default
element type).
+Nesting VECTOR inside STRUCT, ARRAY, or MAP is not supported.
+
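To make the fixed-length encoding concrete, the Python sketch below packs a float array into a byte string whose length depends only on the dimension and element type. It is illustrative only: the helper names are hypothetical, and the little-endian byte order is an assumption of the sketch, not a statement about Hudi's on-disk layout.

```python
import struct

# struct format codes for the two element types (4-byte float, 8-byte double)
_CODES = {"FLOAT": "f", "DOUBLE": "d"}


def encode_vector(values, element_type="FLOAT"):
    """Pack a vector into dim * element_size bytes (little-endian assumed)."""
    return struct.pack(f"<{len(values)}{_CODES[element_type]}", *values)


def decode_vector(data, dim, element_type="FLOAT"):
    """Unpack a fixed-length byte string back into a list of floats."""
    return list(struct.unpack(f"<{dim}{_CODES[element_type]}", data))


encoded = encode_vector([0.5, 1.0, -2.0])
print(len(encoded))               # 12 bytes for VECTOR(3) with FLOAT elements
print(decode_vector(encoded, 3))  # [0.5, 1.0, -2.0]
```

Because every value of a `VECTOR(dim)` column has the same byte length, the reader needs only the dimension and element type from the column metadata to decode it.
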
## VECTOR Type
The `VECTOR(dim[, elementType])` type declares a column that stores
fixed-dimensional embedding vectors.
@@ -98,8 +112,8 @@ INSERT INTO products VALUES (
## hudi_vector_search TVF
-The `hudi_vector_search` table-valued function performs approximate nearest
neighbor (ANN) search
-over a VECTOR column.
+The `hudi_vector_search` table-valued function returns the `top_k` rows from a
Hudi table whose
+VECTOR column is closest to a given query vector under a chosen distance
metric.
### Syntax
@@ -240,3 +254,21 @@ FROM hudi_vector_search(
- VECTOR columns must be **top-level fields** — nesting inside STRUCT, ARRAY,
or MAP is not supported.
- The query vector's element type must **exactly match** the corpus
embedding's element type (no implicit casting).
- VECTOR dimension and element type **cannot be changed** after table creation
via schema evolution.
+- **Flink cannot read VECTOR columns.** VECTOR data is stored as Parquet
`FIXED_LEN_BYTE_ARRAY`, which
+ Flink's Parquet reader does not decode back into a typed array. Flink can
still read all **other**
+ columns in a table that contains a VECTOR column — only the VECTOR column
itself is inaccessible.
+ Use Spark to query VECTOR columns.
+
+## Metastore Sync
+
+When syncing VECTOR column schemas to external catalogs, Hudi maps the binary
encoding to the
+target catalog's native binary type, preserving the original VECTOR metadata
in table properties:
+
+| Catalog | VECTOR representation |
+|:--------|:---------------------|
+| Hive | `BINARY` |
+| BigQuery | `BYTES` |
+
+The `VECTOR(dim, elementType)` dimension and element-type metadata is
preserved in
+`TBLPROPERTIES`/table descriptions so the table can be correctly reconstructed
by Spark even after
+a metastore round-trip.
diff --git a/website/docs/writing_data.md b/website/docs/writing_data.md
index 5e4570182692..dcdc5a33bd81 100644
--- a/website/docs/writing_data.md
+++ b/website/docs/writing_data.md
@@ -1,7 +1,7 @@
---
title: Batch Writes
keywords: [hudi, incremental, batch, processing]
-last_modified_at: 2024-03-13T15:59:57-04:00
+last_modified_at: 2026-05-27T00:00:00-00:00
---
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';
@@ -441,5 +441,32 @@ inputDF.write.format("hudi")
.save(basePath)
```
+### Rolling Extra Metadata
+
+Rolling extra metadata automatically carries selected commit metadata keys forward to every subsequent commit and clean instant, so the latest values can be read from the most recent instant without walking the full timeline. This is particularly useful for persisting checkpoint information, such as Kafka offsets or Flink checkpoints, across commits.
+
+| Config | Default | Description |
+|---|---|---|
+| `hoodie.write.rolling.metadata.keys` | `""` (disabled) | Comma-separated list of extra metadata keys to carry forward to each new commit and clean instant. Values are read from recent completed instants and written into the new commit metadata, so they remain accessible without walking the timeline. New values override old ones. Only applies to data table commits and clean instants. |
+| `hoodie.write.rolling.metadata.timeline.lookback.commits` | `10` | Maximum number of completed instants to walk back when searching for the configured rolling metadata keys. Higher values improve resilience at a small performance cost. |
+
+**Example:**
+
+```java
+inputDF.write.format("hudi")
+ .option("hoodie.write.rolling.metadata.keys", "kafka.offset.partition.0,kafka.offset.partition.1")
+ .option("hoodie.write.rolling.metadata.timeline.lookback.commits", "10")
+ // ... other options
+ .save(basePath)
+```
+
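The carry-forward behavior of the two configs can be modeled as a bounded walk over recent instants. The Python sketch below illustrates that logic under stated assumptions and is not Hudi's implementation: the function name `roll_metadata` is hypothetical, and each completed instant is represented simply as a dictionary of its extra metadata, oldest first.

```python
def roll_metadata(completed_instants, new_metadata, rolling_keys, lookback=10):
    """Build the extra metadata for a new instant from recent completed ones."""
    recent = completed_instants[-lookback:] if lookback > 0 else []
    rolled = {}
    # Walk from newest to oldest so the most recent value of each key wins.
    for metadata in reversed(recent):
        for key in rolling_keys:
            if key in metadata and key not in rolled:
                rolled[key] = metadata[key]
    # Values supplied with the new commit override carried-forward ones.
    rolled.update(new_metadata)
    return rolled


keys = ["kafka.offset.partition.0", "kafka.offset.partition.1"]
history = [
    {"kafka.offset.partition.0": "100", "kafka.offset.partition.1": "7"},
    {"kafka.offset.partition.0": "150"},
    {},  # e.g. an instant that carried no rolling keys
]
print(roll_metadata(history, {"kafka.offset.partition.1": "9"}, keys))
print(roll_metadata(history, {}, keys, lookback=1))
```

The second call returns an empty dictionary: with a lookback of 1, only the most recent instant is inspected, which is why a larger lookback is more resilient.
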
+### Advanced Storage Options
+
+The following advanced storage configuration options were added in Hudi 1.2.0:
+
+| Config | Default | Description |
+|---|---|---|
+| `hoodie.parquet.write.config.injector.class` | (none) | Fully-qualified class name of a custom implementation of `org.apache.hudi.io.HoodieParquetConfigInjector`. Use this to inject custom Parquet writer properties (e.g., disable dictionary encoding, set bloom filter sizes) without modifying the Hudi source. |
+
## Java Client
We can use plain Java to write to Hudi tables. To use the Java client, refer to the example
[here](https://github.com/apache/hudi/blob/master/hudi-examples/hudi-examples-java/src/main/java/org/apache/hudi/examples/java/HoodieJavaWriteClientExample.java)