Dear community,

I am pleased to share the Hudi community bi-weekly update for 2021-04-11 ~ 2021-04-25, covering features, bug fixes, and tests.

=======================================
Features

[Hudi Client] Move OperationConverter to hudi-client-common for code reuse [1]
[Flink Integration] Add option for merge max memory [2]
[Flink Integration] Insert overwrite (table) for Flink writer [3]
[Core] Support BAIDU AFS storage format in hudi [4]
[CLI] Add Hudi-CLI support for clustering [5]
[Spark Integration] Read Hoodie Table As Spark DataSource Table [5]
[Flink Integration] Non partitioned table for Flink writer [6]
[Flink Integration] Add explicit index state TTL option for Flink writer [7]
[CLI] Added support for replace commits in commit showpartitions, commit show_write_stats, commit showfiles [8]
[Core] Add support for BigDecimal and Integer when partitioning based on time [9]

[1] https://issues.apache.org/jira/browse/HUDI-1785
[2] https://issues.apache.org/jira/browse/HUDI-1786
[3] https://issues.apache.org/jira/browse/HUDI-1788
[4] https://issues.apache.org/jira/browse/HUDI-1803
[5] https://issues.apache.org/jira/browse/HUDI-1415
[6] https://issues.apache.org/jira/browse/HUDI-1814
[7] https://issues.apache.org/jira/browse/HUDI-1812
[8] https://issues.apache.org/jira/browse/HUDI-1746
[9] https://issues.apache.org/jira/browse/HUDI-1551

=======================================
Bugs

[Flink Integration] Remove the rocksdb jar from hudi-flink-bundle [1]
[Core] Fix RealtimeCompactedRecordReader StackOverflowError [2]
[Spark Integration] Fixing usage of NULL schema for delete operation in HoodieSparkSqlWriter [3]
[Flink Integration] Flink streaming reader should always monitor the delta commits files [4]
[Flink Integration] FlinkMergeHandle rolling over may fail to rename the latest file handle [5]
[Flink Integration] flink-client query error when processing files larger than 128 MB [6]
[Flink Integration] Continue to write when Flink write task restarts because of container killing [7]
[Spark Integration] Resolving default values for schema from dataframe [8]
[Timeline Server] Timeline Server Bundle needs to include com.esotericsoftware package [9]
[Core] Rollback fails on MOR table when the partition path has no files [10]
[Flink Integration] Flink merge on read input split uses wrong base file path for default merge type [11]
[Flink Integration] Use while loop instead of recursive call in MergeOnReadInputFormat to avoid StackOverflow [12]

[1] https://issues.apache.org/jira/browse/HUDI-1787
[2] https://issues.apache.org/jira/browse/HUDI-1720
[3] https://issues.apache.org/jira/browse/HUDI-1751
[4] https://issues.apache.org/jira/browse/HUDI-1798
[5] https://issues.apache.org/jira/browse/HUDI-1801
[6] https://issues.apache.org/jira/browse/HUDI-1792
[7] https://issues.apache.org/jira/browse/HUDI-1804
[8] https://issues.apache.org/jira/browse/HUDI-1716
[9] https://issues.apache.org/jira/browse/HUDI-1802
[10] https://issues.apache.org/jira/browse/HUDI-1744
[11] https://issues.apache.org/jira/browse/HUDI-1809
[12] https://issues.apache.org/jira/browse/HUDI-1829

=======================================
Tests

[Tests] Added tests to TestHoodieTimelineArchiveLog for the archival of completed clean and rollback actions [1]

[1] https://issues.apache.org/jira/browse/HUDI-1714

Best,
Leesf