Currently, Hudi only supports syncing dataset metadata to Hive, through
Hive JDBC and IMetaStoreClient. If you need to sync to other frameworks,
such as AWS Glue or Aliyun DataLake Analytics, you have to copy a lot of
code from HoodieHiveClient, which creates a lot of redundant code.
So we need to redesign the hudi-hive-sync module to support other
frameworks and reuse the current code as much as possible. Hudi would
provide only the interface, and each service (Hive, AWS Glue, Aliyun
DataLake Analytics, etc.) would supply its own implementation.
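
To make the idea concrete, here is a rough sketch of the split I have in
mind. It is only an illustration and not the design in the RFC: the names
MetaSyncClient, InMemoryMetaSyncClient and MetaSyncTool are hypothetical,
and the in-memory class just stands in for a real Hive, AWS Glue or Aliyun
DataLake Analytics implementation.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical interface: the only part Hudi itself would provide.
interface MetaSyncClient {
    boolean tableExists(String tableName);
    void createTable(String tableName, Map<String, String> schema);
    void addPartitions(String tableName, List<String> partitions);
    List<String> listPartitions(String tableName);
}

// Stand-in implementation for illustration only. A real one would call
// the target metastore (Hive, AWS Glue, ...) instead of using a map.
class InMemoryMetaSyncClient implements MetaSyncClient {
    private final Map<String, List<String>> partitionsByTable = new LinkedHashMap<>();

    @Override
    public boolean tableExists(String tableName) {
        return partitionsByTable.containsKey(tableName);
    }

    @Override
    public void createTable(String tableName, Map<String, String> schema) {
        // The schema is ignored here; a real client would register it.
        partitionsByTable.putIfAbsent(tableName, new ArrayList<>());
    }

    @Override
    public void addPartitions(String tableName, List<String> partitions) {
        List<String> existing = partitionsByTable.get(tableName);
        if (existing == null) {
            throw new IllegalArgumentException("Unknown table: " + tableName);
        }
        for (String partition : partitions) {
            if (!existing.contains(partition)) {
                existing.add(partition); // skip partitions already synced
            }
        }
    }

    @Override
    public List<String> listPartitions(String tableName) {
        return new ArrayList<>(partitionsByTable.getOrDefault(tableName, new ArrayList<>()));
    }
}

// Shared sync logic, written once against the interface and reused by
// every implementation instead of being copied per framework.
class MetaSyncTool {
    private final MetaSyncClient client;

    MetaSyncTool(MetaSyncClient client) {
        this.client = client;
    }

    void sync(String tableName, Map<String, String> schema, List<String> partitions) {
        if (!client.tableExists(tableName)) {
            client.createTable(tableName, schema);
        }
        client.addPartitions(tableName, partitions);
    }
}
```

With this shape, supporting a new metastore would mean writing one new
MetaSyncClient implementation, while the sync logic stays untouched.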

I created an RFC with more details:
https://cwiki.apache.org/confluence/display/HUDI/RFC+-+17+Abstract+common+meta+sync+module+support+multiple+meta+service
Any feedback is appreciated.

Best Regards,
Wei Li.
