Great news! :)

On Tue, Aug 25, 2020 at 10:09 AM Vinoth Chandar <[email protected]> wrote:
> - announce
>
> Folks, please keep the follow ups to dev@ and users@
>
> On Mon, Aug 24, 2020 at 9:26 PM vino yang <[email protected]> wrote:
>
> > Great news!
> >
> > Thanks to Bhavani Sudha for driving the release! And thanks to everyone
> > in the community!
> >
> > Best,
> > Vino
> >
> > Bhavani Sudha <[email protected]> wrote on Tue, Aug 25, 2020 at 11:37 AM:
> >
> > > The Apache Hudi team is pleased to announce the release of Apache
> > > Hudi 0.6.0.
> > >
> > > Apache Hudi (pronounced Hoodie) stands for Hadoop Upserts Deletes and
> > > Incrementals. Apache Hudi manages storage of large analytical
> > > datasets on DFS (Cloud stores, HDFS or any Hadoop FileSystem
> > > compatible storage) and provides the ability to query them.
> > >
> > > This release comes 2 months after 0.5.3. It includes more than 200
> > > resolved issues, comprising new features, perf improvements, as well
> > > as general improvements and bug-fixes. Hudi 0.6.0 introduces
> > > mechanisms to efficiently bootstrap large datasets into Hudi without
> > > having to copy the data (experimental feature), via both Spark
> > > datasource writer and DeltaStreamer tool. A new index
> > > (HoodieSimpleIndex) is added that can be faster than bloom index for
> > > cases where updates/deletes spread across a large portion of the
> > > table. With this version, rollbacks are done using marker files and a
> > > supporting upgrade and downgrade infrastructure is provided to users
> > > for smooth transition. HoodieMultiDeltaStreamer tool (experimental
> > > feature) is added in this version to support ingesting multiple Kafka
> > > streams in a single DeltaStreamer deployment for enhancing
> > > operational experience. Bulk inserts are further improved by avoiding
> > > any dataframe-rdd conversions, accompanied with configurable sorting
> > > modes. While this conversion of dataframe to rdd is not a bottleneck
> > > for upsert/deletes, subsequent releases will expand this to other
> > > write operations. Other performance improvements include supporting
> > > async compaction for Spark streaming writes.
> > >
> > > For details on how to use Hudi, please look at the quick start page
> > > located at:
> > > https://hudi.apache.org/docs/quick-start-guide.html
> > >
> > > If you'd like to download the source release, you can find it here:
> > > https://github.com/apache/hudi/releases/tag/release-0.6.0
> > >
> > > You can read more about the release (including release notes) here:
> > > https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12322822&version=12346663
> > >
> > > We would like to thank all contributors, the community, and the
> > > Apache Software Foundation for enabling this release and we look
> > > forward to continued collaboration. We welcome your help and
> > > feedback. For more information on how to report problems, and to get
> > > involved, visit the project website at:
> > > http://hudi.apache.org/
> > >
> > > Thanks to everyone involved!
> > > - Bhavani Sudha
