Folks,

There have been many requests from users around supporting concurrency
control for Hudi tables. I'm proposing we break down this ask into 2
phases. The first phase will focus on providing the ability to perform
parallel writes to Hudi tables - this means as long as writes touch
non-overlapping files, users will have the flexibility to mutate the tables
from multiple writers.
In Phase 2, we will explore concurrency control semantics for the case
where two different writers want to mutate the same file, or even the same
record, and how to provide serializability in that case.
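
To make the Phase 1 rule concrete, here is a minimal sketch of the
non-overlap check. Every name in it (PendingWrite, overlapping_files,
can_run_in_parallel, the file IDs) is hypothetical and chosen only for
illustration; none of it is an existing Hudi API or a proposed design.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PendingWrite:
    """Hypothetical record of one writer's intent; not a Hudi class."""
    writer_id: str
    file_ids: frozenset  # IDs of the files this write would mutate


def overlapping_files(a, b):
    """Return the file IDs touched by both writes (empty if none)."""
    return a.file_ids & b.file_ids


def can_run_in_parallel(a, b):
    """Phase 1 rule: two writes may proceed together only if they share no files."""
    return not overlapping_files(a, b)


w1 = PendingWrite("writer-1", frozenset({"file-a", "file-b"}))
w2 = PendingWrite("writer-2", frozenset({"file-c"}))
w3 = PendingWrite("writer-3", frozenset({"file-b", "file-d"}))

print(can_run_in_parallel(w1, w2))  # True: no files in common
print(can_run_in_parallel(w1, w3))  # False: both touch file-b
```

The w1/w3 case, where both writers touch the same file, is the kind of
conflict that Phase 2 would need to handle.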

Phase 1 addresses some of the outstanding requirements and is relatively
less complex to implement than Phase 2. With such a breakdown, we will make
progress on multi-writer support for Hudi tables while also restructuring
the code and paving the way for Phase 2.

Please chime in.

Thanks,
Nishith
