This is an automated email from the ASF dual-hosted git repository.

danny0405 pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/hudi.git


The following commit(s) were added to refs/heads/asf-site by this push:
     new d2522d0c0743 docs: add flink RLI related configurations (#18869)
d2522d0c0743 is described below

commit d2522d0c07432509e225e028fdb54d67c0b9fb7c
Author: Danny Chan <[email protected]>
AuthorDate: Thu May 28 21:47:20 2026 +0800

    docs: add flink RLI related configurations (#18869)
---
 website/docs/indexes.md | 35 ++++++++++++++++++++++++++++++-----
 1 file changed, 30 insertions(+), 5 deletions(-)

diff --git a/website/docs/indexes.md b/website/docs/indexes.md
index 67d398853a4c..0b294d34b72f 100644
--- a/website/docs/indexes.md
+++ b/website/docs/indexes.md
@@ -211,13 +211,38 @@ for more details. All these, support the index types mentioned [above](#addition
 
 #### Flink based configs
 
-For Flink DataStream and Flink SQL, Bucket index and Flink state index are supported.
+For Flink DataStream and Flink SQL, Bucket index, Flink state index, and record-level index are supported.
 Following are the basic configs that control the indexing behavior. Please refer [the Flink configurations](configurations.md#Flink-Options-advanced-configs) for advanced configs.
 
-| Config Name | Default | Description |
-|-------------|---------|-------------|
-| index.type | FLINK_STATE (Optional) | Index type of Flink write job, default is using state backed index. Possible values:<br /> <ul><li>FLINK_STATE</li><li>BUCKET</li></ul><br /> `Config Param: INDEX_TYPE` |
-| hoodie.index.bucket.engine | SIMPLE (Optional) | org.apache.hudi.index.HoodieIndex$BucketIndexEngineType: Determines the type of bucketing or hashing to use when `hoodie.index.type` is set to `BUCKET`. Possible Values: <br /> <ul><li>SIMPLE</li><li>CONSISTENT_HASHING</li></ul> |
+| Config Name | Default | Description |
+|-------------|---------|-------------|
+| index.type | FLINK_STATE (Optional) | Index type of Flink write job, default is using state backed index. Possible values:<br /> <ul><li>FLINK_STATE</li><li>BUCKET</li><li>GLOBAL_RECORD_LEVEL_INDEX</li><li>RECORD_LEVEL_INDEX</li></ul><br /> `Config Param: INDEX_TYPE` |
+| hoodie.index.bucket.engine | SIMPLE (Optional) | org.apache.hudi.index.HoodieIndex$BucketIndexEngineType: Determines the type of bucketing or hashing to use when `hoodie.index.type` is set to `BUCKET`. Possible Values: <br /> <ul><li>SIMPLE</li><li>CONSISTENT_HASHING</li></ul> |
+| metadata.enabled | true (Optional) | Enables the metadata table. Required for Flink record-level index lookups. |
+| index.global.enabled | true (Optional) | Whether to update the old partition path when the same record key arrives with a different partition path. This must be `true` for `GLOBAL_RECORD_LEVEL_INDEX` and is set to `false` for `RECORD_LEVEL_INDEX`. |
+| index.bootstrap.enabled | false (Optional) | When `index.type=GLOBAL_RECORD_LEVEL_INDEX`, controls whether Flink bootstraps the global index into a local RocksDB backend. If not explicitly set for global RLI, Flink enables bootstrap by default. Set to `false` to force native metadata-table RLI access. |
+| index.bootstrap.rocksdb.path | (Optional) | Local directory path for the RocksDB backend used when `index.bootstrap.enabled=true`. Each task manager creates a unique subdirectory under this path. |
+| index.rli.cache.size | 256 (Optional) | Maximum memory, in MB, allocated for the record-level index cache per bucket-assign task. Applies to native metadata-table RLI access and partitioned RLI caches. |
+| index.rli.cache.concurrent.partitions.num | 2 (Optional) | Expected number of partitions whose partitioned RLI caches are updated concurrently. Used to size each partition cache when historical cache usage is unavailable. |
+| index.rli.lookup.minibatch.size | 1000 (Optional) | Maximum number of input records buffered for mini-batch record-index lookup. Mini-batching reduces individual metadata-table lookup calls for native global RLI access. |
+| index.rli.write.buffer.size | 100 (Optional) | Maximum memory, in MB, for the index record writer buffer. When the threshold is reached, Flink flushes index records to avoid OOM. |
+| index.write.tasks | (N/A) | Parallelism for tasks that write record-level index records. Defaults to the execution environment parallelism when not set. |
+| metadata.compaction.schedule.enabled | true (Optional) | Schedules metadata table compaction plans. |
+| metadata.compaction.async.enabled | true (Optional) | Runs metadata table compaction in the Flink compaction pipeline when record-level index streaming writes are enabled. |
+| metadata.compaction.delta_commits | 10 (Optional) | Maximum metadata-table delta commits before metadata compaction is triggered. |
+| hoodie.metadata.record.level.index.defer.init | false (Optional) | Defers RLI initialization for fresh tables. Flink ingestion does not support deferred RLI initialization, so keep this set to `false` for Flink RLI writes. |
+| hoodie.metadata.global.record.level.index.min.filegroup.count | 10 (Optional) | Minimum number of file groups to use for Global Record Index. |
+| hoodie.metadata.global.record.level.index.max.filegroup.count | 10000 (Optional) | Maximum number of file groups to use for Global Record Index. |
+| hoodie.metadata.record.level.index.min.filegroup.count | 1 (Optional) | Minimum number of file groups to use for Partitioned Record Index. New data partitions use this value for their initial partitioned RLI file group count, which is also used by dynamic bucket assignment before the partition appears in the metadata table. |
+| hoodie.metadata.record.level.index.max.filegroup.count | 10 (Optional) | Maximum number of file groups to use for Partitioned Record Index. |
+
+Common Flink RLI configurations:
+
+| Use case | Required settings | Notes [...]
+|----------|-------------------|----------- [...]
+| Flink global RLI with native MDT access | `index.type=GLOBAL_RECORD_LEVEL_INDEX`<br />`metadata.enabled=true`<br />`index.global.enabled=true`<br />`index.bootstrap.enabled=false`<br />`hoodie.metadata.record.level.index.defer.init=false` | Flink rea [...]
+| Flink global RLI with local RocksDB cache | `index.type=GLOBAL_RECORD_LEVEL_INDEX`<br />`metadata.enabled=true`<br />`index.global.enabled=true`<br />`index.bootstrap.enabled=true`<br />`index.bootstrap.rocksdb.path=<local-path>`<br />`hoodie.metadata.record.level.index.defer.init=false` | Flink boot [...]
+| Dynamic bucket scaling with partitioned RLI | `index.type=RECORD_LEVEL_INDEX`<br />`metadata.enabled=true`<br />`index.global.enabled=false`<br />`hoodie.metadata.record.level.index.min.filegroup.count=<initial-file-groups-per-partition>`<br />`hoodie.metadata.record.level.index.max.filegroup.count=<max-file-groups-per-partition>`<br />Optionally tune `index.rli.cache.size` and `index.rli.cache.concurrent.partitions.num` for the partition cache. | Flink uses partition-scoped RLI [...]
 
 ### Picking Indexing Strategies
 
