GitHub user xushiyan edited a discussion: Use config `hoodie.index.scope=global|partition` to avoid having variants of the same index type
Hudi supports many index types for both reads and writes: simple, bloom, bucket, record, etc. Several of them also have scope-specific variants: global ones such as global_simple and global_bloom, and a partitioned one, the partitioned record index. To simplify usage, we can add a new config `hoodie.index.scope=global|partition` so that separate global or partitioned variants no longer need to be kept.

A legacy config like `hoodie.index.type=global_simple` will be equivalent to

```
hoodie.index.type=simple
hoodie.index.scope=global
```

When the scope is not set, its default can be inferred from the index type, so that each index type keeps its current default behavior. An explicitly set scope should be validated against the index type. For example, the bucket index currently works at the partition level, so explicitly setting its scope to global will fail config validation.

Although this adds one more config, it makes the index names cleaner and more consistent. Today, for example, global_simple and global_bloom are the global variants of simple and bloom, whereas record_index is itself a global index and partitioned_record_index is its partition-level counterpart. The new index scope config also makes it explicit to users how the chosen index type behaves and how uniqueness should be enforced: either across partitions or within a partition.

GitHub link: https://github.com/apache/hudi/discussions/13864
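
To make the proposed scope rules concrete, here is a minimal sketch in Python of how legacy variant names could be mapped, default scopes inferred, and explicit scopes validated. It is not Hudi code: `resolve_index_config` and the three lookup tables are hypothetical names, and the table contents are assumptions read off the examples in the proposal (simple and bloom default to partition scope, record_index defaults to global, bucket supports only partition). Rejecting a legacy variant name combined with a conflicting explicit scope is also an assumption; the proposal does not cover that case.

```python
# Hypothetical sketch of the proposed scope resolution; none of these names are Hudi APIs.

# Legacy variant name -> (base index type, implied scope), per the proposal's examples.
LEGACY_VARIANTS = {
    "global_simple": ("simple", "global"),
    "global_bloom": ("bloom", "global"),
    "partitioned_record_index": ("record_index", "partition"),
}

# Assumed default scope per base index type (chosen to match current behavior).
DEFAULT_SCOPE = {
    "simple": "partition",
    "bloom": "partition",
    "bucket": "partition",
    "record_index": "global",
}

# Assumed scopes each base index type accepts.
SUPPORTED_SCOPES = {
    "simple": {"global", "partition"},
    "bloom": {"global", "partition"},
    "bucket": {"partition"},
    "record_index": {"global", "partition"},
}


def resolve_index_config(props):
    """Return (base_index_type, scope) or raise ValueError on an invalid combination."""
    index_type = props["hoodie.index.type"].lower()
    scope = props.get("hoodie.index.scope")
    if scope is not None:
        scope = scope.lower()

    # A legacy variant name carries its scope in the name itself.
    if index_type in LEGACY_VARIANTS:
        index_type, implied_scope = LEGACY_VARIANTS[index_type]
        if scope is not None and scope != implied_scope:
            raise ValueError(
                f"scope '{scope}' conflicts with legacy type implying '{implied_scope}'"
            )
        scope = implied_scope

    if index_type not in SUPPORTED_SCOPES:
        raise ValueError(f"unknown index type '{index_type}'")

    # No explicit scope: infer the default from the index type.
    if scope is None:
        scope = DEFAULT_SCOPE[index_type]

    # Explicit or inferred, the scope must be one the index type supports.
    if scope not in SUPPORTED_SCOPES[index_type]:
        raise ValueError(f"index type '{index_type}' does not support scope '{scope}'")

    return index_type, scope
```

Under these assumptions, `hoodie.index.type=global_simple` and the pair `hoodie.index.type=simple` plus `hoodie.index.scope=global` resolve to the same result, while `hoodie.index.type=bucket` with `hoodie.index.scope=global` is rejected.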
