Hi Community,

Here is a quick update on the 0.9.0 release status. Over the last 10 days we made significant progress on the release blockers previously mentioned in the thread, thanks to all the owners. Here are the remaining blockers that we are currently tracking:

- [HUDI-2305] Add MARKERS.type and fix marker-based rollback
- [HUDI-2268] Add upgrade and downgrade to and from 0.9.0 release-blockers
- [HUDI-2307] When using delete_partition with ds should not rely on the primary key
- [HUDI-2151] Flipping defaults
- [HUDI-1897] Deltastreamer source for AWS S3
- [HUDI-2120] [DOC] Update docs about schema in flink sql configuration
- [HUDI-2119] Ensure the rolled-back instance was previously synced to the Metadata Table when syncing a Rollback Instant

We plan to resolve these soon and cut an RC by *tomorrow (August 14th, 2021) end of day PST*. If you have any other blockers that you would like to surface for Hudi 0.9.0, feel free to reach out.

Thanks,
Udit

On Fri, Aug 6, 2021 at 1:53 AM sagar sumit <[email protected]> wrote:

> Hi Udit, Vinoth
>
> End of next week sounds good. Apart from the issues listed, there is one more that we can take in this release:
> [HUDI-1897] DeltaStreamer Source for AWS S3
>
> It's under review and should be closed by early next week.
>
> Regards,
> Sagar
>
> On 2021/08/06 00:55:19, Raymond Xu <[email protected]> wrote:
> > +1 End of next week
> >
> > On Thu, Aug 5, 2021 at 3:06 PM Sivabalan <[email protected]> wrote:
> >
> > > Yeah, end of next week sounds good.
> > >
> > > Here are the status updates wrt patches I am involved.
> > >
> > > Plan to get these in by early next week.
> > > - [HUDI-2208] Support Bulk Insert For Spark Sql (Owner: pengzhiwei)
> > > - [HUDI-2250] Bulk insert support for tables w/ primary key (Owner: Sivabalan)
> > > - [HUDI-1842] Spark Sql Support For The Exists Hoodie Table (Owner: pengzhiwei)
> > > - [HUDI-1138] Re-implement marker files via timeline server (Owner: Ethan Guo)
> > > - [HUDI-1129] Improving schema evolution support in hudi (Owner: Sivabalan)
> > >
> > > Mid next week:
> > > - [HUDI-2063] Add Doc For Spark Sql (DML and DDL) integration With Hudi (Owner: pengzhiwei)
> > >
> > > Waiting for reviews.
> > > Will try to get it in by early next week. If we couldn't get this in, probably will skip this release.
> > > - [HUDI-1763] Fixing honoring of Ordering val in DefaultHoodieRecordPayload.preCombine (Owner: Sivabalan)
> > >
> > > Removed from release blockers:
> > > - [HUDI-1887] Setting default value to false for enabling schema post processor (Owner: Sivabalan)
> > > - [HUDI-1850] Fixing read of a empty table but with failed write (Owner: Sivabalan)
> > >
> > > On Thu, Aug 5, 2021 at 11:17 AM Vinoth Chandar <[email protected]> wrote:
> > >
> > > > Any other thoughts? Love to lock this date down sooner than later.
> > > >
> > > > Thanks
> > > > Vinoth
> > > >
> > > > On Tue, Aug 3, 2021 at 11:35 PM Udit Mehrotra <[email protected]> wrote:
> > > >
> > > > > Agreed Vinoth. End of next week seems reasonable as a hard deadline for cutting the RC.
> > > > >
> > > > > If anyone thinks otherwise or needs more time, feel free to chime in.
> > > > >
> > > > > On Tue, Aug 3, 2021 at 8:10 PM Vinoth Chandar <[email protected]> wrote:
> > > > >
> > > > > > Thanks Udit! I propose we set end of next week as a hard deadline for cutting the RC. Any thoughts?
> > > > > >
> > > > > > A good amount of progress is being made on these blockers, I think.
> > > > > >
> > > > > > On Tue, Aug 3, 2021 at 5:13 PM Udit Mehrotra <[email protected]> wrote:
> > > > > >
> > > > > > > Hi Community,
> > > > > > >
> > > > > > > As we draw close to doing Hudi 0.9.0 release, I am happy to share a summary of the key features/improvements that would be going in the release and the current blockers for everyone's visibility.
> > > > > > >
> > > > > > > *Highlights*
> > > > > > >
> > > > > > > - [HUDI-1729] Asynchronous Hive sync and commits cleaning for Flink writer
> > > > > > > - [HUDI-1738] Detect and emit deleted records for Flink MOR table streaming read
> > > > > > > - [HUDI-1867] Support streaming reads for Flink COW table
> > > > > > > - [HUDI-1908] Global index for flink writer
> > > > > > > - [HUDI-1788] Support Insert Overwrite with Flink Writer
> > > > > > > - [HUDI-2209] Bulk insert for flink writer
> > > > > > > - [HUDI-1591] Support querying using non-globbed paths for Hudi Spark DataSource queries
> > > > > > > - [HUDI-1591] Partition pruning support for read optimized queries via Hudi Spark DataSource
> > > > > > > - [HUDI-1415] Register Hudi Table as a Spark DataSource Table with metastore. Queries via Spark SQL will be routed through Hudi DataSource (instead of InputFormat), thus making it more performant due to Spark's native/optimized readers
> > > > > > > - [HUDI-1591] Partition pruning support for snapshot queries via Hudi Spark DataSource
> > > > > > > - [HUDI-1658] DML and DDL support via Spark SQL
> > > > > > > - [HUDI-1790] Add SqlSource for DeltaStreamer to support backfill use cases:
> > > > > > > - [HUDI-251] Add JDBC Source support for DeltaStreamer
> > > > > > > - [HUDI-1910] Support Kafka based checkpointing for HoodieDeltaStreamer
> > > > > > > - [HUDI-1371] Support metadata based listing for Spark DataSource and Spark SQL
> > > > > > > - [HUDI-2013] [HUDI-1717] [HUDI-2089] [HUDI-2016] Improvements to Metadata based listing
> > > > > > > - [HUDI-89] Introduce a HoodieConfig/ConfigProperty framework to bring all configs under one roof
> > > > > > > - [HUDI-2124] Grafana dashboard for Hudi
> > > > > > > - [HUDI-1104] [HUDI-1105] [HUDI-2009] Improvements to Bulk Insert via row writing
> > > > > > > - [HUDI-1483] Async clustering for Delta Streamer
> > > > > > > - [HUDI-2235] Add virtual key support to Hudi
> > > > > > > - [HUDI-1848] Add support for Hive Metastore in Hive-sync-tool
> > > > > > > - In addition, there have been significant improvements and bug fixes to improve the overall stability of Flink Hudi integration
> > > > > > >
> > > > > > > *Current Blockers*
> > > > > > >
> > > > > > > - [HUDI-2208] Support Bulk Insert For Spark Sql (Owner: pengzhiwei)
> > > > > > > - [HUDI-1256] Follow on improvements to HFile tables for metadata based listing (Owner: None)
> > > > > > > - [HUDI-2063] Add Doc For Spark Sql (DML and DDL) integration With Hudi (Owner: pengzhiwei)
> > > > > > > - [HUDI-1842] Spark Sql Support For The Exists Hoodie Table (Owner: pengzhiwei)
> > > > > > > - [HUDI-1138] Re-implement marker files via timeline server (Owner: Ethan Guo)
> > > > > > > - [HUDI-1985] Website redesign implementation (Owner: Vinoth Govindarajan)
> > > > > > > - [HUDI-2232] MERGE INTO fails with table having nested struct (Owner: pengzhiwei)
> > > > > > > - [HUDI-1468] incremental read support with clustering (Owner: Liwei)
> > > > > > > - [HUDI-2250] Bulk insert support for tables w/ primary key (Owner: None)
> > > > > > > - [HUDI-2222] [SQL] Test catalog integration (Owner: Sagar Sumit)
> > > > > > > - [HUDI-2221] [SQL] Functionality testing with Spark 2 (Owner: Sagar Sumit)
> > > > > > > - [HUDI-1887] Setting default value to false for enabling schema post processor (Owner: Sivabalan)
> > > > > > > - [HUDI-1850] Fixing read of a empty table but with failed write (Owner: Sivabalan)
> > > > > > > - [HUDI-2151] Enable defaults for out of box performance (Owner: Udit Mehrotra)
> > > > > > > - [HUDI-2119] Ensure the rolled-back instance was previously synced to the Metadata Table when syncing a Rollback Instant (Owner: Prashant Wason)
> > > > > > > - [HUDI-1458] Support custom clustering strategies and preserve commit time to support incremental read (Owner: Satish Kotha)
> > > > > > > - [HUDI-1763] Fixing honoring of Ordering val in DefaultHoodieRecordPayload.preCombine (Owner: Sivabalan)
> > > > > > > - [HUDI-1129] Improving schema evolution support in hudi (Owner: Sivabalan)
> > > > > > > - [HUDI-2120] [DOC] Update docs about schema in flink sql configuration (Owner: Xianghu Wang)
> > > > > > > - [HUDI-2182] Support Compaction Command For Spark Sql (Owner: pengzhiwei)
> > > > > > >
> > > > > > > Please respond to the thread if you think that I have missed capturing any of the highlights or blockers for Hudi 0.9.0 release. For the owners of these release blockers, can you please provide a specific timeline you are willing to commit to for finishing these so we can cut an RC?
> > > > > > >
> > > > > > > Thanks,
> > > > > > > Udit
> > >
> > > --
> > > Regards,
> > > -Sivabalan
