GitHub user danny0405 added a comment to the discussion: Dynamic Bucket Index 
For Flink streaming

Overall looks good. Can you clarify these items:

1. The small-file profile for assigning new keys to existing buckets: there are
two candidate metrics, row count and file size (file group/base file). Let's
decide which one we want here; we also need a way to calculate or estimate the
values.
2. Reading the partitioned RLI for a specific partition: is there any read
amplification? For example, are a partition's index mappings scattered across
multiple buckets, or stored together with other partitions' mappings within one
RLI bucket?
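
To make item 1 concrete, here is a rough sketch of how the two metrics could be tied together: file size decides whether a bucket counts as small, and an average record size estimated from existing buckets converts the free bytes into an approximate row capacity. Everything here is hypothetical (`BucketStats`, `estimate_avg_record_bytes`, `remaining_rows`, `small_buckets`, and the 1024-byte default); none of it is Hudi API or the design in the proposal.

```python
from dataclasses import dataclass


@dataclass
class BucketStats:
    """Hypothetical per-bucket statistics; not a Hudi class."""
    bucket_id: int
    base_file_bytes: int  # size of the bucket's current base file
    row_count: int        # number of rows currently in the bucket


def estimate_avg_record_bytes(buckets, default_bytes=1024):
    """Estimate the average record size from existing buckets.

    Falls back to default_bytes (an assumed value) when no rows exist yet.
    """
    total_bytes = sum(b.base_file_bytes for b in buckets)
    total_rows = sum(b.row_count for b in buckets)
    if total_rows == 0:
        return default_bytes
    return total_bytes / total_rows


def remaining_rows(bucket, max_file_bytes, avg_record_bytes):
    """Approximate number of rows that fit before the bucket hits max_file_bytes."""
    free_bytes = max(0, max_file_bytes - bucket.base_file_bytes)
    return int(free_bytes // avg_record_bytes)


def small_buckets(buckets, max_file_bytes, small_file_bytes):
    """Return (bucket_id, estimated row capacity) for buckets under the
    small-file threshold, ordered with the most free capacity first."""
    avg = estimate_avg_record_bytes(buckets)
    candidates = [
        (b.bucket_id, remaining_rows(b, max_file_bytes, avg))
        for b in buckets
        if b.base_file_bytes < small_file_bytes
    ]
    return sorted(candidates, key=lambda c: c[1], reverse=True)
```

With this shape, the assigner would only need file sizes plus one estimated average; whether row count should instead be tracked exactly is the open question above.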

GitHub link: 
https://github.com/apache/hudi/discussions/18514#discussioncomment-16629755

