Apache Hudi 0.8.0 Released

Gary Li Wed, 07 Apr 2021 07:54:32 -0700

Hi All,

We are excited to share that Apache Hudi 0.8.0 has been released. Since the
0.7.0 release, we have resolved 97 JIRA tickets and made 120 code commits.
We implemented many new features, bug fixes, and performance improvements.
Thanks to all the contributors who made this happen.


*Release Highlights*

*Flink Integration*
Since the initial support for the Hudi Flink Writer in the 0.7.0 release,
the Hudi community has made great progress on improving the Flink/Hudi
integration, including a redesigned Flink writer pipeline with better
performance and scalability, state-backed indexing with bootstrap support,
a Flink writer for MOR tables, a batch reader for COW and MOR tables, a
streaming reader for MOR tables, and a Flink SQL connector for both source
and sink. In the 0.8.0 release, users can use all of these features with
Flink 1.11+.

Please see [RFC-24](
https://cwiki.apache.org/confluence/display/HUDI/RFC+-+24%3A+Hoodie+Flink+Writer+Proposal)
for more implementation details of the Flink writer and follow this [page](
https://hudi.apache.org/docs/flink-quick-start-guide.html) to get started
with Flink!

*Parallel Writers Support*
As many users requested, Hudi now supports multiple ingestion writers to
the same Hudi table with optimistic concurrency control (OCC). Hudi
supports file-level OCC, i.e., for any two commits (or writers) happening
on the same table, if they do not write to overlapping files, both
writers are allowed to succeed. This feature is currently experimental and
requires either Zookeeper or HiveMetastore to acquire locks.

Please see [RFC-22](
https://cwiki.apache.org/confluence/display/HUDI/RFC+-+22+%3A+Snapshot+Isolation+using+Optimistic+Concurrency+Control+for+multi-writers)
for more implementation details and follow this [page](
https://hudi.apache.org/docs/concurrency_control.html) to get started with
concurrency control!
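
To illustrate the file-level rule described above, here is a minimal
conceptual sketch in Python. It is not Hudi's implementation or API: the
`Commit` class and the `conflicts` and `can_commit` helpers are hypothetical
names made up for this example, and the real feature also relies on an
external lock provider, which is left out here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Commit:
    """Hypothetical record of one writer's commit attempt."""
    writer: str
    touched_files: frozenset  # IDs of the files this commit changes


def conflicts(a: Commit, b: Commit) -> bool:
    # Two commits conflict only if they change at least one common file.
    return not a.touched_files.isdisjoint(b.touched_files)


def can_commit(candidate: Commit, concurrent_commits: list) -> bool:
    # The candidate succeeds if it overlaps with none of the commits
    # that completed while it was in flight.
    return not any(conflicts(candidate, c) for c in concurrent_commits)


w1 = Commit("writer-1", frozenset({"file-a", "file-b"}))
w2 = Commit("writer-2", frozenset({"file-c"}))
w3 = Commit("writer-3", frozenset({"file-b", "file-d"}))

print(can_commit(w2, [w1]))  # disjoint files, so both writers succeed
print(can_commit(w3, [w1]))  # both change file-b, so writer-3 is rejected
```

In this sketch, a rejected writer would need to retry its write; how Hudi
handles that case is covered in the concurrency control page linked above.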

*Writer side improvements*
- InsertOverwrite Support for Flink writer client.
- Support CopyOnWriteTable in Java writer client.

*Query side improvements*
- Support Spark Structured Streaming read from Hudi table.
- Performance improvement of Metadata table.
- Performance improvement of Clustering.

*Raw Release Notes*
The raw release notes are available [here](
https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12322822&version=12349423
)

Thanks,
Gary Li
(on behalf of the Hudi community)
