Re: Gear up for Hudi 1.0!

Vinoth Chandar Wed, 04 Dec 2024 09:27:13 -0800

All, Hudi 1.0 is entering voting soon. It's a large community effort from
so many people here. Thank you!
Please test/provide feedback on Slack or on the dev mailing list. Watch for
a separate vote email on the dev list.


*New docs are live* (they will be updated continuously over the next few
days, but they have already been redone to reflect all new features and usage).

   - https://hudi.apache.org/docs/next/overview (note the “next” in the URL)
   - Docs will be finalized once the community ratifies the release.



*Notable changes to docs.*

   - Use-cases: https://hudi.apache.org/docs/next/use_cases (lands the core
   use-cases while talking about the design differences that make Hudi shine
   for them)
   - Python/Rust guide:
   https://hudi.apache.org/docs/next/python-rust-quick-start-guide (covers
   Hudi in those ecosystems)
   - Hudi stack page: https://hudi.apache.org/docs/next/hudi_stack (revamped
   to pull out the table format and clearly show what Hudi adds on top;
   aligned with a seminal database academic paper)
   - Timeline page: https://hudi.apache.org/docs/next/timeline (explains
   time, and specifically how Hudi implements TrueTime, which also powers
   Google Cloud Spanner, Cockroach, etc.)
   - Storage format versioning:
   https://hudi.apache.org/docs/next/storage_layouts#storage-format-versioning
   (explains how we are making upgrades easy, with backwards-compatible
   writing)
   - Write operations: https://hudi.apache.org/docs/next/write_operations
   (clearly split into incremental and batch operations, showing the benefits
   to data pipelines)
   - Record merger: https://hudi.apache.org/docs/next/record_merger (here
   again we talk about differentiation on incremental pipelines/streaming
   workloads, such as event time ordering that no other system can do)
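
To illustrate the event time ordering mentioned in the record merger item, here is a minimal toy sketch in Python. It is not Hudi's record merger API; the `Record` class, the `event_time` field, and both functions are hypothetical names used only for illustration. The idea shown is that the record with the later event time wins, even when it arrives first.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass(frozen=True)
class Record:
    key: str
    event_time: int  # hypothetical ordering field, e.g. epoch millis
    value: str


def merge_by_event_time(existing: Optional[Record], incoming: Record) -> Record:
    """Keep the record with the larger event_time; ties go to the incoming one."""
    if existing is None or incoming.event_time >= existing.event_time:
        return incoming
    return existing


def apply_updates(updates: Iterable[Record]) -> Dict[str, Record]:
    """Apply updates in arrival order, resolving conflicts by event time."""
    table: Dict[str, Record] = {}
    for rec in updates:
        table[rec.key] = merge_by_event_time(table.get(rec.key), rec)
    return table


# The "started" event arrives late, after "completed" was already written.
updates = [
    Record("trip-1", event_time=200, value="completed"),
    Record("trip-1", event_time=100, value="started"),
]
table = apply_updates(updates)
print(table["trip-1"].value)
```

With pure arrival-order (last writer wins) merging, the late "started" event would overwrite "completed"; ordering by event time keeps the newer state.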


*Finally, the headliners & brand new, industry-first features:*

   - Indexing page:
   https://hudi.apache.org/docs/next/indexes#multi-modal-indexing (we’re
   bringing secondary indexes to the lakehouse)
      - Look at the examples in
      https://hudi.apache.org/docs/next/quick-start-guide#indexing and
      https://hudi.apache.org/docs/next/sql_ddl#create-index
      - You can now build an index on any column
      (https://hudi.apache.org/docs/next/metadata_indexing) and accelerate
      any IN and = queries on top of those columns.
   - Non-blocking concurrency control:
   https://hudi.apache.org/docs/next/sql_dml#non-blocking-concurrency-control-experimental
   (for Flink users and, in general, people unhappy with OCC or looking for
   best-in-class concurrency control)
   - Finally, partial update encoding:
   https://www.youtube.com/watch?v=mEwhBdOl53o (we are seeing about an 85%
   reduction in data written and a 30-60% drop in write latencies)
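
As a rough picture of why a secondary index helps the IN and = queries mentioned in the indexing item, here is a toy Python sketch. It is only a conceptual model under stated assumptions, not how Hudi stores or queries its indexes; the file names, the `city` column, and both functions are hypothetical. The index maps each column value to the files that contain it, so a lookup reads only those files.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

Row = Dict[str, object]

# Hypothetical data files, each holding a few rows.
files: Dict[str, List[Row]] = {
    "file-a": [{"id": 1, "city": "sf"}, {"id": 2, "city": "nyc"}],
    "file-b": [{"id": 3, "city": "chennai"}],
    "file-c": [{"id": 4, "city": "sf"}],
}


def build_secondary_index(files: Dict[str, List[Row]], column: str) -> Dict[object, Set[str]]:
    """Map each value of `column` to the set of files containing it."""
    index: Dict[object, Set[str]] = defaultdict(set)
    for name, rows in files.items():
        for row in rows:
            index[row[column]].add(name)
    return index


def lookup(files, index, column: str, values: Set[object]) -> Tuple[List[Row], Set[str]]:
    """Serve `column = v` or `column IN (...)` by reading only candidate files."""
    candidates: Set[str] = set()
    for v in values:
        candidates |= index.get(v, set())
    rows = [r for name in sorted(candidates) for r in files[name] if r[column] in values]
    return rows, candidates


city_index = build_secondary_index(files, "city")
rows, scanned = lookup(files, city_index, "city", {"sf"})
print(sorted(scanned), [r["id"] for r in rows])
```

Here the lookup for `city = 'sf'` touches two of the three files; without the index, every file would have to be scanned.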


On Mon, Dec 2, 2024 at 9:56 PM sagar sumit <cod...@apache.org> wrote:

> Hello Everyone,
>
> We are very close to cutting the RC1 for Hudi 1.0. For a preview of what's
> coming, please take a look at the updated docs. To begin with, check out
> the `Use Cases` doc [1] that throws more light on Streaming/CDC use
> cases. Then, of course my favorite docs are in `Design & Concepts`
> section such as:
>
> Apache Hudi Stack [2]
> Storage Layout [3]
> Timeline [4]
> Write Operations [5]
> Table and Query Types [6]
>
> We are actively updating the website. As the RM, I would like to invite you
> to check out the docs and try out RC1 yourself, and provide us feedback.
>
> So, gear up for an amazing Hudi 1.0!
>
> Regards,
> Sagar
>
> [1] https://hudi.apache.org/docs/next/use_cases
> [2] https://hudi.apache.org/docs/next/hudi_stack
> [3] https://hudi.apache.org/docs/next/storage_layouts
> [4] https://hudi.apache.org/docs/next/timeline
> [5] https://hudi.apache.org/docs/next/write_operations
> [6] https://hudi.apache.org/docs/next/table_types
>
