GitHub user danny0405 added a comment to the discussion: RocksDB as The Replica of MDT/RLI
> Question 1: If we treat rocksDB as the primary source of truth during writes,
> how does concurrent updates from another writer get visible correctly during
> BucketAssignOp stage

Concurrent upsert consistency still works under OCC, since a task failover would retrigger the full bootstrap of the RocksDB replica anyway. As of now, the simple bucket index is a prerequisite for NBCC, so this should not be a strong concern or blocker for the RLI.

> Question 2 : the 2x is due to just compression? I think there will be an
> additional 2x additional storage for un-compacted updates

RocksDB suggests light compression (LZ4/Snappy) for L0 ~ L2 to get the best write throughput, and heavier compression (Zstd/Zlib) for L3+. As for un-compacted updates, the native MDT has a similar case for its payloads in log files, so the overhead may depend on the compaction frequency gap between MDT (10 delta commits trigger a compaction) and RocksDB (4 SST files trigger a compaction).

GitHub link: https://github.com/apache/hudi/discussions/18296#discussioncomment-16061388
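
For readers who want to see where the two compaction triggers and the per-level compression mentioned under Question 2 would be set, here is a minimal configuration sketch. It assumes the comment refers to the Hudi metadata table option `hoodie.metadata.compact.max.delta.commits` and to the RocksDB column family options `level0_file_num_compaction_trigger` and `compression_per_level`. The choice of LZ4 and Zstd and the 7-level layout are illustrative, not taken from the discussion.

```properties
# Hudi writer config (assumed to be the "10 delta commits" trigger for MDT compaction)
hoodie.metadata.compact.max.delta.commits=10

# RocksDB OPTIONS file fragment (illustrative values)
[CFOptions "default"]
  # 4 SST files in L0 trigger a compaction
  level0_file_num_compaction_trigger=4
  num_levels=7
  # light compression for L0 ~ L2, heavier compression for L3+
  compression_per_level=kLZ4Compression:kLZ4Compression:kLZ4Compression:kZSTD:kZSTD:kZSTD:kZSTD
```

These two fragments belong to different files (the Hudi write configuration and the RocksDB OPTIONS file) and are shown together only for comparison.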
