Hi everyone,

Here's a draft of the board report for this month. Please reply with anything that you'd like to see added or that I've missed. Thanks!
rb

## Description:

Apache Iceberg is a table format for huge analytic datasets that is designed for high performance and ease of use.

## Issues:

There are no issues requiring board attention.

## Membership Data:

Apache Iceberg was founded 2020-05-19 (2 months ago)
There are currently 10 committers and 9 PMC members in this project.
The Committer-to-PMC ratio is roughly 1:1.

Community changes, past quarter:
- No new PMC members (project graduated recently).
- Shardul Mahadik was added as committer on 2020-07-25

## Project Activity:

0.9.0 was released, including support for Spark 3 and SQL DDL commands, support for JDK 11, vectorized Parquet reads, and an action to compact data files.

Since the 0.9.0 release, the community has made progress in several areas:
- The Hive StorageHandler now provides access to query Iceberg tables (work is ongoing to implement projection and predicate pushdown).
- Flink integration has made substantial progress toward using native RowData, and the first stage of the Flink sink (data file writers) has been committed.
- An action to expire snapshots using Spark was added and is an improvement on the incremental approach because it compares the reachable file sets.
- The implementation of row-level deletes is nearing completion. Scan planning now supports delete files, merge-based and set-based row filters have been committed, and delete file writers are under review. The delete file writers allow storing deleted row data in support of Flink CDC use cases.

Releases:
- 0.9.0 was released on 2020-07-13
- 0.9.1 has an ongoing vote

## Community Health:

The month since the last report has been one of the busiest since the project started. 80 pull requests were merged in the last 4 weeks, and more importantly, came from 21 different contributors. Both of these are new high watermarks.

Community members gave 2 Iceberg talks at Subsurface Conf, on enabling Hive queries against Iceberg tables and working with petabyte-scale Iceberg tables. Iceberg was also mentioned in the keynotes.

--
Ryan Blue