GitHub user vinothchandar edited a comment on the discussion: RLI support for Flink streaming
Or is this what you are saying:

1. `BucketAssignor`: tag each record as I/U/D.
2. Produce an index record and a data record for each incoming record.
3. Custom partitioning such that index records are partitioned by `hash(record key) % num_rli_shards`, and data records are shuffled by bucket/file group ID.
4. Write stage with two types of writer tasks: data write handles doing append/create/merge, plus RLI writers.

On Flink checkpoint, each index/data writing task flushes all its records to the RLI and data files respectively, so the RLI and data files are always consistent. We commit both as we do now, from the Coordinator, into a single Hudi commit.

GitHub link: https://github.com/apache/hudi/discussions/17452#discussioncomment-15204553
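The routing rule in step 3 can be sketched as plain hash arithmetic. This is a minimal illustrative sketch, not actual Hudi or Flink API; the class and method names (`RliShardRouting`, `rliShardFor`, `dataTaskFor`) are hypothetical, and a real implementation would plug such a function into a Flink custom partitioner.

```java
// Hypothetical sketch of the two-way shuffle described above.
// Index records route by hash(record key) % num_rli_shards; data records
// route by their bucket/file group id. Names here are illustrative only.
public class RliShardRouting {

    /** Index records: hash(record key) % num_rli_shards. */
    static int rliShardFor(String recordKey, int numRliShards) {
        // Mask the sign bit so negative hashCodes still map to a valid shard.
        return (recordKey.hashCode() & Integer.MAX_VALUE) % numRliShards;
    }

    /** Data records: routed by the bucket/file group id they belong to. */
    static int dataTaskFor(String fileGroupId, int numWriteTasks) {
        return (fileGroupId.hashCode() & Integer.MAX_VALUE) % numWriteTasks;
    }

    public static void main(String[] args) {
        int shards = 8;
        // The same key always lands on the same RLI shard, so each RLI
        // writer task owns a disjoint slice of the key space and can flush
        // its shard independently on checkpoint.
        int a = rliShardFor("uuid-1234", shards);
        int b = rliShardFor("uuid-1234", shards);
        System.out.println("deterministic: " + (a == b));
        System.out.println("in range: " + (a >= 0 && a < shards));
    }
}
```

The point of the deterministic routing is exactly the consistency property described above: since a given key's index updates always reach the same RLI writer, a checkpoint-time flush of every writer yields an index that matches the data files committed in the same Hudi commit.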
