Re: [ANNOUNCE] Apache Hudi 0.6.0 released

leesf Tue, 25 Aug 2020 03:32:55 -0700

Great, thanks Sudha and all involved.

Pratyaksh Sharma <[email protected]> wrote on Tue, Aug 25, 2020 at 1:17 PM:


> Great news! :)
>
> On Tue, Aug 25, 2020 at 10:09 AM Vinoth Chandar <[email protected]> wrote:
>
> > - announce
> >
> > Folks, please keep the follow ups to dev@ and users@
> >
> >
> >
> > On Mon, Aug 24, 2020 at 9:26 PM vino yang <[email protected]> wrote:
> >
> > > Great news!
> > >
> > > Thanks to Bhavani Sudha for driving the release! And thanks to everyone
> > > in the community!
> > >
> > > Best,
> > > Vino
> > >
> > > Bhavani Sudha <[email protected]> wrote on Tue, Aug 25, 2020 at 11:37 AM:
> > >
> > > > The Apache Hudi team is pleased to announce the release of Apache Hudi
> > > > 0.6.0.
> > > >
> > > > Apache Hudi (pronounced Hoodie) stands for Hadoop Upserts Deletes and
> > > > Incrementals. Apache Hudi manages storage of large analytical datasets
> > > > on DFS (Cloud stores, HDFS or any Hadoop FileSystem compatible storage)
> > > > and provides the ability to query them.
> > > >
> > > > This release comes 2 months after 0.5.3. It includes more than 200
> > > > resolved issues, comprising new features, performance improvements, as
> > > > well as general improvements and bug fixes. Hudi 0.6.0 introduces
> > > > mechanisms to efficiently bootstrap large datasets into Hudi without
> > > > having to copy the data (experimental feature), via both the Spark
> > > > datasource writer and the DeltaStreamer tool. A new index
> > > > (HoodieSimpleIndex) is added that can be faster than the bloom index for
> > > > cases where updates/deletes are spread across a large portion of the
> > > > table. With this version, rollbacks are done using marker files, and a
> > > > supporting upgrade and downgrade infrastructure is provided to users for
> > > > a smooth transition. The HoodieMultiDeltaStreamer tool (experimental
> > > > feature) is added in this version to support ingesting multiple Kafka
> > > > streams in a single DeltaStreamer deployment, improving the operational
> > > > experience. Bulk inserts are further improved by avoiding any
> > > > dataframe-to-RDD conversions, accompanied by configurable sorting modes.
> > > > While this conversion of dataframe to RDD is not a bottleneck for
> > > > upserts/deletes, subsequent releases will expand this to other write
> > > > operations. Other performance improvements include support for async
> > > > compaction for Spark streaming writes.
> > > >
> > > > For details on how to use Hudi, please look at the quick start page
> > > > located at:
> > > > https://hudi.apache.org/docs/quick-start-guide.html
> > > >
> > > > If you'd like to download the source release, you can find it here:
> > > > https://github.com/apache/hudi/releases/tag/release-0.6.0
> > > >
> > > > You can read more about the release (including release notes) here:
> > > > https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12322822&version=12346663
> > > >
> > > > We would like to thank all contributors, the community, and the Apache
> > > > Software Foundation for enabling this release and we look forward to
> > > > continued collaboration. We welcome your help and feedback. For more
> > > > information on how to report problems, and to get involved, visit the
> > > > project website at:
> > > > http://hudi.apache.org/
> > > >
> > > > Thanks to everyone involved!
> > > > - Bhavani Sudha
> > > >
> > >
> >
>
