Dear community,

I am pleased to share the Hudi community bi-weekly update for
2021-07-18 ~ 2021-08-01, covering features, bug fixes and tests.


=======================================
Features

[Core] Adding support to disable meta columns with bulk insert operation [1]
[DeltaStreamer] Add INSERT_OVERWRITE and INSERT_OVERWRITE_TABLE support to
DeltaStreamer [2]
[Spark Integration] MergeInto Support Partial Update For COW [3]
[DeltaStreamer] DeltaStreamer Kafka source supports consuming from
specified timestamp [4]
[Hive Integration] Adding support for HMS for running DDL queries in
hive-sync [5]
[Docs] Automate the generation of configs webpage as configs are added to
Hudi repo [6]
[Core] Adding virtual key support to COW table [7]
[Flink Integration] Add rateLimiter when Flink writes to Hudi [8]
[Core] Integrate consumers with rocksDB and compression within External
Spillable Map [9]
[Flink Integration] Add option 'hive_sync.mode' for flink writer [10]
[Flink Integration] Explicit parallelism for flink bulk insert [11]
[Hive Integration] Support setting hive sync partition extractor class
based on flink configuration [12]


[1] https://issues.apache.org/jira/browse/HUDI-2161
[2] https://issues.apache.org/jira/browse/HUDI-1860
[3] https://issues.apache.org/jira/browse/HUDI-1884
[4] https://issues.apache.org/jira/browse/HUDI-1447
[5] https://issues.apache.org/jira/browse/HUDI-1848
[6] https://issues.apache.org/jira/browse/HUDI-1241
[7] https://issues.apache.org/jira/browse/HUDI-2176
[8] https://issues.apache.org/jira/browse/HUDI-2215
[9] https://issues.apache.org/jira/browse/HUDI-2044
[10] https://issues.apache.org/jira/browse/HUDI-2228
[11] https://issues.apache.org/jira/browse/HUDI-2241
[12] https://issues.apache.org/jira/browse/HUDI-2184


=======================================
Bugs

[Flink Integration] Remove state in BootstrapFunction [1]
[Flink Integration] Create new bucket when NewFileAssignState is filled [2]
[Flink Integration] Clean and reset the bootstrap events for coordinator
when task failover [3]
[Code Cleanup] Clean up Multiple versions of scala libraries detected
Warning [4]
[Flink Integration] Add marker files for flink writer [5]
[Spark Integration] Sync Hive Failed When Executing CTAS In Spark2 And
Spark3 [6]
[Core] Fix checkpoint blocked because the getLastPendingInstant() action
runs after the restoreWriteMetadata() action [7]
[Flink Integration] Rollback inflight compaction for flink writer [8]
[Spark Integration] MergeInto MOR Table May Result In Incorrect Result [9]
[Spark Integration] Missing PrimaryKey In Hoodie Properties For CTAS Table
[10]
[Core] Residual temporary files after clustering are not cleaned up [11]
[Core] Fix NPE of HoodieConfig [12]
[Core] Fix no value present in incremental query on MOR [13]
[Spark Integration] Fix Alter Partitioned Table Failed [14]
[Flink Integration] Only sync hive meta on successful commit for flink
batch writer [15]
[Core] Make codahale timers transient to avoid serializable exceptions [16]
[Core] BucketAssigner generates the fileId evenly to avoid data skew [17]
[Hive Integration] Fix database alreadyExists exception while hive sync [18]
[Spark Integration] Performance loss with the additional
hoodieRecords.isEmpty() in HoodieSparkSqlWriter#write [19]
[Spark Integration] Unpersist the input rdd after the commit is completed
to save the memory space for inline compaction [20]
[Spark Integration] Fix Exception Caused By Table Name Case Sensitivity For
Append Mode Write [21]
[Flink Integration] Default consumes from the latest instant for flink
streaming reader [22]
[Flink Integration] Builtin sort operator for flink bulk insert [23]
[Core] Fix missing HoodieWriteStat in HoodieCreateHandle [24]


[1] https://issues.apache.org/jira/browse/HUDI-2193
[2] https://issues.apache.org/jira/browse/HUDI-2145
[3] https://issues.apache.org/jira/browse/HUDI-2198
[4] https://issues.apache.org/jira/browse/HUDI-2192
[5] https://issues.apache.org/jira/browse/HUDI-2204
[6] https://issues.apache.org/jira/browse/HUDI-2195
[7] https://issues.apache.org/jira/browse/HUDI-2206
[8] https://issues.apache.org/jira/browse/HUDI-2205
[9] https://issues.apache.org/jira/browse/HUDI-2139
[10] https://issues.apache.org/jira/browse/HUDI-2212
[11] https://issues.apache.org/jira/browse/HUDI-2214
[12] https://issues.apache.org/jira/browse/HUDI-2219
[13] https://issues.apache.org/jira/browse/HUDI-2217
[14] https://issues.apache.org/jira/browse/HUDI-2223
[15] https://issues.apache.org/jira/browse/HUDI-2227
[16] https://issues.apache.org/jira/browse/HUDI-2240
[17] https://issues.apache.org/jira/browse/HUDI-2245
[18] https://issues.apache.org/jira/browse/HUDI-2244
[19] https://issues.apache.org/jira/browse/HUDI-1425
[20] https://issues.apache.org/jira/browse/HUDI-2117
[21] https://issues.apache.org/jira/browse/HUDI-2251
[22] https://issues.apache.org/jira/browse/HUDI-2252
[23] https://issues.apache.org/jira/browse/HUDI-2254
[24] https://issues.apache.org/jira/browse/HUDI-2218


=======================================
Tests

[Tests] Fixing hudi_test_suite for spark nodes and adding spark bulk_insert
node [1]
[Tests] Fix NullPointerException in TestHoodieConsoleMetrics [2]
[Tests] Refactoring a few tests to reduce running time: DeltaStreamer and
MultiDeltaStreamer tests, bulk insert row writer tests [3]

[1] https://issues.apache.org/jira/browse/HUDI-2007
[2] https://issues.apache.org/jira/browse/HUDI-2211
[3] https://issues.apache.org/jira/browse/HUDI-2253

Best,
Leesf
