Dear community,

I am pleased to share the Hudi community bi-weekly updates for 2021-10-24 ~ 2021-11-07, covering features, bug fixes and tests.

=======================================
Features

[Spark SQL] Support replace commit in DeltaSync with commit metadata preserved [1]
[Flink Integration] Adding inline read and seek based read (batch get) for hfile log blocks in metadata table [2]
[Core] Hash ID generator util for Hudi table columns, partition and file [3]
[Core] Support z-order for Hudi [4]
[Core] Support concurrent key gen for different tables with row writer path [5]
[Spark] Upgrading Spark3 to 3.1 [6]
[Spark SQL] Add support for ignoring case in merge into [7]
[Core] Add ORC support in Bootstrap Op [8]

[1] https://issues.apache.org/jira/browse/HUDI-1500
[2] https://issues.apache.org/jira/browse/HUDI-1294
[3] https://issues.apache.org/jira/browse/HUDI-1295
[4] https://issues.apache.org/jira/browse/HUDI-2101
[5] https://issues.apache.org/jira/browse/HUDI-2582
[6] https://issues.apache.org/jira/browse/HUDI-1869
[7] https://issues.apache.org/jira/browse/HUDI-2471
[8] https://issues.apache.org/jira/browse/HUDI-1827

=======================================
Bugs

[Core] Avoiding direct fs calls in HoodieLogFileReader [1]
[Core] Remove duplicated hadoop-common with tests classifier in bundles [2]
[Core] Remove duplicated hadoop-hdfs with tests classifier in bundles [3]
[Flink] Schema evolution for Flink parquet reader [4]
[Flink] Make precombine field optional for Flink [5]
[Core] Refactor index in hudi-client module [6]
[Core] Fixing double locking with multi-writers [7]
[Flink Integration] Schedule the compaction from earliest for Flink [8]
[Flink Integration] Add compaction failed event (part2) [9]
[Core] Remove duplicated hbase-common with tests classifier in bundles [10]
[Core] Add close when producing records failed [11]
[Core] Persist some configs to hoodie.properties on the first write [12]
[Hive Integration] Hudi Hive reader should not print read values [13]
[Flink Integration] Delete the view storage properties first before creation [14]
[Hive Integration] Hudi should synchronize owner information to Hudi _rt/_ro table [15]
[Flink Integration] Flink writer writes huge log file [16]
[Core] Use DefaultHoodieRecordPayload when precombine field is explicitly specified [17]
[Flink Integration] Sync all the missing SQL options for HoodieFlinkStreamer [18]
[Flink Integration] Process record after all bootstrap operators are ready [19]
[Flink Integration] Remove the aborted checkpoint notification from coordinator [20]
[Core] Moved static COMMIT_FORMATTER to thread local variable as SimpleDateFormat is not thread safe [21]
[Core] Make spark.sql.parquet.writeLegacyFormat configurable [22]
[Flink Integration] Set up keygen class explicitly for write config for Flink table upgrade [23]
[Hive Integration] bugfix: NPE when select count start from a realtime table with Tez [24]

[1] https://issues.apache.org/jira/browse/HUDI-2005
[2] https://issues.apache.org/jira/browse/HUDI-2600
[3] https://issues.apache.org/jira/browse/HUDI-2614
[4] https://issues.apache.org/jira/browse/HUDI-2632
[5] https://issues.apache.org/jira/browse/HUDI-2633
[6] https://issues.apache.org/jira/browse/HUDI-2502
[7] https://issues.apache.org/jira/browse/HUDI-2573
[8] https://issues.apache.org/jira/browse/HUDI-2654
[9] https://issues.apache.org/jira/browse/HUDI-2654
[10] https://issues.apache.org/jira/browse/HUDI-2643
[11] https://issues.apache.org/jira/browse/HUDI-2515
[12] https://issues.apache.org/jira/browse/HUDI-2538
[13] https://issues.apache.org/jira/browse/HUDI-2674
[14] https://issues.apache.org/jira/browse/HUDI-2660
[15] https://issues.apache.org/jira/browse/HUDI-2676
[16] https://issues.apache.org/jira/browse/HUDI-2678
[17] https://issues.apache.org/jira/browse/HUDI-2684
[18] https://issues.apache.org/jira/browse/HUDI-2651
[19] https://issues.apache.org/jira/browse/HUDI-2686
[20] https://issues.apache.org/jira/browse/HUDI-2696
[21] https://issues.apache.org/jira/browse/HUDI-1794
[22] https://issues.apache.org/jira/browse/HUDI-2526
[23] https://issues.apache.org/jira/browse/HUDI-2702
[24] https://issues.apache.org/jira/browse/HUDI-313

=======================================
Tests

[Tests] Fix TestHoodieDeltaStreamerWithMultiWriter [1]
[Tests] Enabling Metadata table for some of TestCleaner unit tests [2]

[1] https://issues.apache.org/jira/browse/HUDI-2077
[2] https://issues.apache.org/jira/browse/HUDI-2472

Best,
Leesf
