Hey Iceberg Community,

Here are the minutes and recording from our Iceberg Sync that took place on *May 4th, 9am-10am PT*. Also, thank you, Russell, for sharing the ApacheCon Call for Papers <https://www.apachecon.com/acna2022/cfp.html>, which is accepting proposals until May 23rd for anyone interested in submitting a talk!
As always, anyone can join the discussion, so feel free to share the Iceberg-Sync <https://groups.google.com/g/iceberg-sync> Google group with anyone who is seeking an invite. All notes and agendas are posted in the live doc <https://docs.google.com/document/d/1YuGhUdukLP5gGiqCbk0A5_Wifqe2CZWgOd3TbhY3UQg/edit?usp=drive_web>, which is also attached to the meeting invitation. It's a good place to add items as you see fit so we can discuss them in the next community sync.

4 May 2022

Meeting Recording ⭕ <https://drive.google.com/file/d/1BfU7o9m87iWivYzmeukDtBi3v8HFdX7O/view>

Top of the Meeting Highlights
- Added Z-order strategy to rewrite data files (Thanks, Russell!)
- Added support for Flink 1.15, removed 1.12 (Thanks, Kyle!)
- Added default value handling to the spec (Thanks, Raymond!)
- Added RangeReadable interface for FileIO (Thanks, Dan!)
- Added expression framework in Python (Thanks, Sam and Nick!)

Releases
- 0.13.2 patch release
  - This release seems overdue and should probably go out despite the minor issues that have been blocking it, i.e. failing Flink tests
  - Eduard has volunteered to be the Release Manager for this release
  - #4687 <https://github.com/apache/iceberg/pull/4687> is one pending PR that should go out in this release
- 0.14.0 status update
  - Snapshot Expiration in the branching/tagging context: PR #4578 <https://github.com/apache/iceberg/pull/4578>
  - PR for LICENSE updates

Agenda
- Snapshot Expiration
  - Needs to go through the reachability code path, which is only implemented in Spark today and is somewhat complex
  - The current proposal is to assume a linear history for tables and fail if unreferenced snapshots are found. (This is stricter than the current behavior and applies only when the flag for cleaning up data files incrementally is on.)
  - Should we eventually have an in-memory reference comparison set that enables a reachability analysis?
    - This would not work for very large tables whose metadata is too large to fit into memory.
    - This would work for most cases, and it’s a safe inference that very large-scale tables have engines such as Spark or Trino available to perform snapshot expiration.
- Remaining Branching and Tagging Work
  - Anyone interested in contributing here, reach out to Amogh or Ryan. Some examples of work to be done include:
    - Referencing branches/tags in engines
    - Committing directly to a branch/tag
- Documentation Content Redesign (link <https://docs.google.com/document/d/1Y_PRv6p5oJaxg_68AUia_JHw8P4-AZIu3hP5IH2Cpsw/edit> )
  - New top-level sections (part of the common site as opposed to the version-based site)
    - Quick-start: fully runnable Docker-based quickstarts for every engine with an Iceberg integration
    - Concepts: “no-code” overviews of core Iceberg concepts such as catalogs, tables, FileIO, etc.
  - Provide a master “Configuration” page in the versioned docs site
    - This is a one-stop shop for all configuration tables, which include parameters, descriptions, and value types. Currently, configuration tables are spread across multiple sections.
  - Format of “Docs” section pages
    - With configurations, quick-starts, and concepts being moved to dedicated sections, the version-based “Docs” section pages will follow a typical format of:
      - Feature Name
      - Feature Description
      - Code Snippet
- Reading change streams
  - General Draft
    - PR #4539 <https://github.com/apache/iceberg/pull/4539>
  - Adds metadata column IS_DELETED
    - PR #4683 <https://github.com/apache/iceberg/pull/4683>
  - MVP is expected to be available in the next 1-2 months and will not include changes to the Java API
- Idempotent Sort (design doc <https://docs.google.com/document/d/1rZUHljNsLn8JqsO5lYElst3F800T8OBnQ5fasUB-fp8/edit> )
- ApacheCon Call for Papers closes May 23rd! <https://www.apachecon.com/acna2022/cfp.html>

Thanks everyone!

--
Sam Redai <[email protected]>
Developer Advocate | Tabular <https://tabular.io/>
