Dear community,

I am pleased to share the Hudi community bi-weekly update for 2021-08-29 ~ 2021-09-12, covering features, bug fixes, and tests.
=======================================
Features

[Core] Add support ByteArrayDeserializer in AvroKafkaSource [1]
[Core] Add configs for common and pre validate [2]
[CI] Use GitHub Actions to build different scala spark versions [3]
[Flink Integration] Add pipeline for Append mode [4]
[Flink Integration] Add metadata table listing for flink query source [5]
[Core] Implement Kafka Sink Protocol for Hudi for Ingesting Immutable Data [6]
[Flink Integration] Add timestamp based partitioning for flink writer [7]

[1] https://issues.apache.org/jira/browse/HUDI-2320
[2] https://issues.apache.org/jira/browse/HUDI-2378
[3] https://issues.apache.org/jira/browse/HUDI-2280
[4] https://issues.apache.org/jira/browse/HUDI-2376
[5] https://issues.apache.org/jira/browse/HUDI-2403
[6] https://issues.apache.org/jira/browse/HUDI-2394
[7] https://issues.apache.org/jira/browse/HUDI-2412

=======================================
Bugs

[Flink Integration] Include the pending compaction file groups for flink streaming reader [1]
[Core] Change log file size config to long [2]
[Flink Integration] Do not send partition delete record when changelog mode enabled [3]
[Core] The default archive folder should be 'archived' [4]
[Flink Integration] Load archived instants for flink streaming reader [5]
[Core] Extract common FS and IO utils for marker mechanism [6]
[Core] Fix TimelineServer error because of replacecommit archive [7]
[Core] Collect event time for inserts in DefaultHoodieRecordPayload [8]

[1] https://issues.apache.org/jira/browse/HUDI-2379
[2] https://issues.apache.org/jira/browse/HUDI-2384
[3] https://issues.apache.org/jira/browse/HUDI-2392
[4] https://issues.apache.org/jira/browse/HUDI-2380
[5] https://issues.apache.org/jira/browse/HUDI-2401
[6] https://issues.apache.org/jira/browse/HUDI-2351
[7] https://issues.apache.org/jira/browse/HUDI-2354
[8] https://issues.apache.org/jira/browse/HUDI-2398

=======================================
Tests

[Tests] Fix flakiness in TestHoodieMergeOnReadTable [1]
[Tests] Disable HDFSParquetImporter related tests [2]
[Tests] Rebalance CI jobs for shorter wait time [3]
[Tests] Make CLI command tests functional [4]
[Tests] Move to ubuntu-18.04 for Azure CI [5]
[Tests] Deprecate FunctionalTestHarness to avoid init DFS [6]
[Tests] Add yamls for large scale testing [7]

[1] https://issues.apache.org/jira/browse/HUDI-1989
[2] https://issues.apache.org/jira/browse/HUDI-1989
[3] https://issues.apache.org/jira/browse/HUDI-2399
[4] https://issues.apache.org/jira/browse/HUDI-2079
[5] https://issues.apache.org/jira/browse/HUDI-2080
[6] https://issues.apache.org/jira/browse/HUDI-2408
[7] https://issues.apache.org/jira/browse/HUDI-2393

Best,
Leesf
