[ANNOUNCE] Hudi Community Update(2021-04-25 ~ 2021-05-09)

leesf Sun, 09 May 2021 08:42:10 -0700

Dear community,

Here are the Hudi community bi-weekly updates for 2021-04-25 ~ 2021-05-09,
covering features, bug fixes, and tests.



=======================================
Features

[Flink Integration] Add option to flush when total buckets memory exceeds
the threshold [1]
[Core] Add optional instant range to log record scanner for log [2]
[Deltastreamer] Improve table level config priority for
HoodieMultiTableDeltaStreamer [3]
[Flink Integration] Tweak the min max commits to keep when setting up
cleaning retain commits for Flink [4]
[Flink Integration] Logging consuming instant to
StreamReadOperator#processSplits [5]
[Spark Integration] Use jsc union instead of rdd union [6]
[Flink Integration] Add rate limiter to Flink writer to avoid OOM for
bootstrap [7]
[Flink Integration] Streaming read for Flink COW table [8]
[Deltastreamer] Add SCHEMA_REGISTRY_SOURCE_URL_SUFFIX and
SCHEMA_REGISTRY_TARGET_URL_SUFFIX property [9]
[Flink Integration] Remove legacy code for Flink writer [10]
[Flink Integration] Support streaming read with compaction and cleaning [11]
[Flink Integration] Add max memory option for flink writer task [12]



[1] https://issues.apache.org/jira/browse/HUDI-1844
[2] https://issues.apache.org/jira/browse/HUDI-1837
[3] https://issues.apache.org/jira/browse/HUDI-1742
[4] https://issues.apache.org/jira/browse/HUDI-1841
[5] https://issues.apache.org/jira/browse/HUDI-1836
[6] https://issues.apache.org/jira/browse/HUDI-1690
[7] https://issues.apache.org/jira/browse/HUDI-1863
[8] https://issues.apache.org/jira/browse/HUDI-1867
[9] https://issues.apache.org/jira/browse/HUDI-1852
[10] https://issues.apache.org/jira/browse/HUDI-1821
[11] https://issues.apache.org/jira/browse/HUDI-1880
[12] https://issues.apache.org/jira/browse/HUDI-1878


=======================================
Bugs

[Core] Fix Kafka native config param for auto offset reset [1]
[Core] Roll back pending clustering even if there is a greater commit [2]
[Flink Integration] Fix cannot create table due to jar conflict [3]
[Hive Integration] Fix exception thrown when syncing a non-partitioned table
to Hive with MultiPartKeysValueExtractor [4]
[Spark Integration] Fix incorrect partition path returned by incremental
queries through spark-sql [5]
[Flink Integration] Fix Flink streaming reader throws ClassCastException [6]
[Flink Integration] Fix query failure on the incremental view of a MOR table
with multi-level partitions [7]
[Core] Wire in Hadoop Conf with AvroSchemaConverters instantiation [8]
[Hive Integration] Save one connection retry to hive metastore when
hiveSyncTool runs with useJdbc=false [9]


[1] https://issues.apache.org/jira/browse/HUDI-1835
[2] https://issues.apache.org/jira/browse/HUDI-1833
[3] https://issues.apache.org/jira/browse/HUDI-1858
[4] https://issues.apache.org/jira/browse/HUDI-1798
[5] https://issues.apache.org/jira/browse/HUDI-1801
[6] https://issues.apache.org/jira/browse/HUDI-1781
[7] https://issues.apache.org/jira/browse/HUDI-1718
[8] https://issues.apache.org/jira/browse/HUDI-1876
[9] https://issues.apache.org/jira/browse/HUDI-1759


=======================================
Tests

[Tests] Fix TestHoodieRealtimeRecordReader [1]
[Tests] Fix azure setting for integ tests [2]
[Tests] Fix Metrics UT [3]

[1] https://issues.apache.org/jira/browse/HUDI-1811
[2] https://issues.apache.org/jira/browse/HUDI-1810
[3] https://issues.apache.org/jira/browse/HUDI-1620


Best,
Leesf
