I left my thoughts on the RFC https://github.com/apache/hudi/pull/4309
I just see this as another deployment model where a centralized set of
microservices takes up the scheduling and execution of Hudi's table services.
+1 on thinking about sharding, locking and HA upfront.
Thanks
Vinoth
On Thu, Apr 2
Hey, folks!
I feel there's quite a bit of confusion in this thread, so let's try to
clear it up: my understanding (please correct me if I'm wrong) is that
Lake Manager was referred to as a service in the same sense in which
we call compaction, clustering and cleaning *table services*.
So,
Thanks for all your attention.
Sure, we do need to take care of high availability in the design.
Also, in my opinion this lake manager wouldn't turn Hudi into a database on the
cloud. It is just an official option. Something like HoodieDeltaStreamer and
help users to reduce maintenance and Hudi data
>
> I agree with what Danny said. IMO, there are two points that should be
> considered
1. If Lake Manager is designed as a service, we should consider its high
availability, dynamic expansion/shrinking, and state consistency.
2. How many resources will Lake Manager use to execute those actions of
In my point of view, this Lake Manager should be more like a centralized
management layer on top of Hudi tables to schedule different table services
and do data governance. The scheduling / managing part should be
lightweight. The execution should still be in the cluster. It should not be a
single n
I have different concerns here: the Lake Manager seems like a single-node
service, and there is a risk that it becomes a bottleneck when it has to
handle too many table services. And for every single-node service
we should consider how to achieve high availability.
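
To make the bottleneck and availability concern concrete, here is a minimal sketch of one possible way to shard tables across several manager instances, using rendezvous (highest-random-weight) hashing. This is not taken from the RFC: the instance names, table names, and the functions `owner` and `assign` are all hypothetical, and a real design would also need membership detection and locking.

```python
import hashlib
from typing import Dict, Iterable, Sequence


def _score(instance: str, table: str) -> int:
    # Deterministic pseudo-random weight for an (instance, table) pair.
    digest = hashlib.sha256(f"{instance}|{table}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def owner(table: str, instances: Sequence[str]) -> str:
    # Rendezvous hashing: the live instance with the highest score owns the table.
    if not instances:
        raise ValueError("no live manager instances")
    return max(instances, key=lambda inst: _score(inst, table))


def assign(tables: Iterable[str], instances: Sequence[str]) -> Dict[str, str]:
    # Map every table to the (hypothetical) manager instance that schedules
    # its table services.
    return {table: owner(table, instances) for table in tables}


if __name__ == "__main__":
    tables = [f"db.table_{i}" for i in range(10)]
    before = assign(tables, ["manager-1", "manager-2", "manager-3"])
    # Simulate manager-2 going down: only its tables should move.
    after = assign(tables, ["manager-1", "manager-3"])
    moved = [t for t in tables if before[t] != after[t]]
    print("tables reassigned after failure:", moved)
```

With this scheme, losing one instance only reassigns the tables that instance owned, so the remaining instances keep their existing work.
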
What is the final state of the Hudi
+1 This is a great idea! The proposed lake manager and centralized
management layer are essential to ease the burden of carrying out data
governance and optimizing the storage layout, making them independent of
ingestion and streaming. I see that this provides a better abstraction for
any potentia
Great idea, Zhang Yue! I see more potential collaborations in the work for
the table management service in this RFC 43
https://github.com/apache/hudi/pull/4309
On Mon, Apr 18, 2022 at 2:15 PM Yue Zhang wrote:
>
>
> Hi all,
> I would like to discuss and contribute a new feature named Hudi Lak