Please find notes below.

Jun 25, 2025 | Apache Parquet Community Sync <https://www.google.com/calendar/event?eid=MmZvYnM1cXRoOWQ2aHVwbWRjcTF1azZpdmFfMjAyNTA1MjhUMTcwMDAwWiBqdWxpZW4ubGVkZW1AbQ>

Attendees: Apache Parquet Community Sync <[email protected]>
- Micah Kornfield: Databricks
- Alex Stephen: Google
- Alkis Evlogimenos: Databricks
- Brian Hulette: Google
- Dewey Dunnington: Wherobots
- Jonas: Snowflake
- Jeff Plaisance: Snowflake
- Rahul Sharma: Databricks
- Gijs Burghoorn: Polars
- Marc Cenac: Datadog
- Martin Prammer: Carnegie Mellon
- Russell Spitzer: Snowflake
- Prateek Gaur: Snowflake
- Andrew Lamb: InfluxData
- Fokko Driesprong: Databricks
- Yun Zou: Snowflake
- Ashish Paliwal: SumoLogic - listening in

Agenda:
- [EXTERNAL] INT96 stats proposal parquet committee <https://docs.google.com/document/d/1Ox0qHYBgs_3-pNqn9V8zVQm_W6qP0lsbd2XwQnQVz1Y/edit>
- Andrew: Rust Variant implementation update. See epic https://github.com/apache/arrow-rs/issues/6736
- Decfloat proposal: DECFLOAT Parquet Proposal <https://docs.google.com/document/d/1j_Q6vnn6Nhy60K4o0tdC91kE5vKGNJaoDOAm71KLzNw/edit?tab=t.0#heading=h.4gcdhz9daib6>
- Interval type <https://docs.google.com/document/d/12ghQxWxyAhSQeZyy0IWiwJ02gTqFOgfYm8x851HZFLk/edit?tab=t.0>: Iceberg/Parquet Interval Data Type Proposal <https://docs.google.com/document/d/12ghQxWxyAhSQeZyy0IWiwJ02gTqFOgfYm8x851HZFLk/edit?tab=t.0#heading=h.rt0cvesdzsj7>
- Flatbuffer footers

Notes:
- Int96 timestamp
  - Deprecated, has special comparison logic.
  - Databricks Photon is emitting this.
  - arrow-rs reads and writes the stats.
  - Need version checks in PRs (don’t assume any parquet-mr produced correct stats)
    - Across all implementations
  - New sort order needed?
    - Int96 only has meaning as timestamps, so it might not be needed.
    - Currently it is undefined in the spec.
  - When will the PR to make int64 the default in Spark be merged?
    - https://github.com/apache/spark/pull/50215
  - Maybe a note should be added in parquet-format explaining why it is deprecated.
- Variant type
  - Implementations started, and running smoothly. Relatively close to basic reading/writing of basic types. Shredding will take a bit of time.
  - Java implementation is close to having shredded data.
  - Can Java testing be shared?
  - Maybe we should have a Java release.
  - Follow-up ticket for getting examples of shredding: https://github.com/apache/parquet-testing/issues/75#issuecomment-3005659028
- DecFloat
  - Extension type proposal is the latest, no objections in theory. But the original author has not had the bandwidth to drive it through the process.
  - Will update the doc with the extension type alternative.
  - Extension types could provide actual proof of value.
- Interval Type
  - Could use variable FLBA types, or be parameterized based on type.

On Mon, Jun 30, 2025 at 5:12 AM Julien Le Dem <[email protected]> wrote:

> Thank you for volunteering.
> Can you post the meeting notes to the list?
> Thank you
> Julien
>
> On Wed, Jun 25, 2025 at 6:56 PM Micah Kornfield <[email protected]>
> wrote:
>
> > I think I volunteered at the last meeting to moderate this one. See
> > every one soon.
> >
> > On Wed, Jun 25, 2025 at 9:54 AM Julien Le Dem <[email protected]> wrote:
> >
> > > Reminder that I won’t be able to attend and someone else will volunteer
> > > to moderate and take notes.
> > >
> > > The next Parquet sync is today June 25th at 10am PT - 1pm ET - 7pm CET
> > > To join the invite, join the group:
> > > https://groups.google.com/g/apache-parquet-community-sync
> > >
> > > Everybody is welcome, bring your topic or just listen in.
> > >
> > > (Some more details on how the meeting is run:
> > > https://lists.apache.org/thread/bjdkscmx7zvgfbw0wlfttxy8h6v3f71t )
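
Postscript on the Int96 item in the notes: the "special comparison logic" can be illustrated with a small sketch. It assumes the layout commonly written by legacy writers (8 bytes of nanoseconds within the day followed by a 4-byte Julian day number, both little-endian). The helper names below are hypothetical and are not taken from any Parquet implementation.

```python
import struct

# Assumption: INT96 timestamps are 12 bytes, laid out as
# 8 bytes of nanoseconds within the day, then a 4-byte Julian day,
# both little-endian.
JULIAN_DAY_OF_UNIX_EPOCH = 2440588  # Julian day number of 1970-01-01
NANOS_PER_DAY = 86_400 * 1_000_000_000


def encode_int96(julian_day, nanos_of_day):
    """Hypothetical helper: pack a timestamp into the assumed layout."""
    return struct.pack("<qi", nanos_of_day, julian_day)


def int96_sort_key(raw):
    """Return (julian_day, nanos_of_day) so values sort chronologically."""
    if len(raw) != 12:
        raise ValueError("INT96 values must be exactly 12 bytes")
    nanos_of_day, julian_day = struct.unpack("<qi", raw)
    return (julian_day, nanos_of_day)


def int96_to_unix_nanos(raw):
    """Convert to nanoseconds since the Unix epoch (an int64-style value)."""
    julian_day, nanos_of_day = int96_sort_key(raw)
    return (julian_day - JULIAN_DAY_OF_UNIX_EPOCH) * NANOS_PER_DAY + nanos_of_day


# Start of 1970-01-02 versus one nanosecond into 1970-01-01.
later = encode_int96(2440589, 0)
earlier = encode_int96(2440588, 1)

# Comparing the raw bytes gets the order wrong, because the
# nanoseconds field comes first in the byte layout.
print(earlier < later)                                  # raw byte comparison
print(int96_sort_key(earlier) < int96_sort_key(later))  # chronological comparison
```

A min/max statistic computed over the raw bytes would pick the wrong value for this pair, which is why a reader has to know whether a given writer compared the decoded (day, nanos) pair before trusting its stats.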
