Attendees and topics:
- Julien: Datadog. Following up on: defining process for adoption of new
  encodings, N-gram bloom filters, new footer next steps. Facilitating this
  meeting.
- Micah: Snowflake
- Kenny: HyParquet (js)
- Jiaying: CMU, PR parquet-rs Variant implementation
- Gabor: Dremio
- Andrew: InfluxData, {parquet,arrow}-rs, Example binary for variant
  testing
- Martin: Jane Street, new encodings
- Talat: Google
- Russell: Snowflake, interval types.
- Brian: Google
- Fokko: Databricks
- Neil: Snowflake Variant
- Aihua: Snowflake Variant
- Naren: Snowflake, interval types
- Dan: Databricks. All the types
- Aditya: CMU, variant
Agenda:
- Defining process for adoption of new encodings
  - Micah to work on documenting the process.
- N-gram bloom filters:
  - Optional features are easier to adopt.
- New footer next steps.
  - Micah: still in the evaluation phase.
  - Alkis to send an update on the list.
- Facilitating this meeting.
  - Page on Parquet website.
  - Russell, Talat, Martin, Fokko as a backup.
  - Note taking app: Fathom.
  - Talat:
    - Google calendar.
    - Video call recordings to YouTube.
    - Gemini for the transcript.
  - Dan, Micah, Fokko: to update the website.
  - Send message to the dev list.
- Andrew: Example binary PRs in parquet-testing (what are next steps?)
  - PR from Dewey with example GEOMETRY / GEOGRAPHY types:
    https://github.com/apache/parquet-testing/pull/70
  - PR from Andrew with example Variant:
    https://github.com/apache/parquet-testing/pull/76
  - Micah is willing to take a look at the PRs.
- Naren: Interval types
  <https://docs.google.com/document/d/12ghQxWxyAhSQeZyy0IWiwJ02gTqFOgfYm8x851HZFLk/edit?usp=sharing>
  in Iceberg/Parquet.
  - General support for finalizing those types (started back in
    https://github.com/apache/parquet-format/pull/43/files).
  - Todo: follow up on mailing list.
- Variant:
  - Finalize 2nd implementation ({parquet,arrow}-rs)
  - Java implementation is close.
  - Once we have finalized those two, we can remove the caveats.
  - Also in the works: C++, Python implementations.
Action items:
- [ ] Please vote for 1.15.2:
  https://lists.apache.org/thread/rt1xjw4hqd391h852opo6cc6sv80dcb5
- [ ] Micah: draft on the new encodings selection. Talat to follow up.
- [ ] Talat: set up shared Google calendar, post recordings to YouTube,
  and provide transcripts.
- [ ] Dan: update the website and email the dev list to improve meeting
  visibility.
- [ ] Naren: follow up on mailing list re interval types.
- Alkis/Micah: send an update re the new footer format.

On Wed, Apr 30, 2025 at 7:17 AM Julien Le Dem <[email protected]> wrote:
> The next Parquet sync is today Apr 30th at 10am PT - 1pm ET - 7pm CET
> (in ~3h)
> To join the invite:
> https://calendar.app.google/z8fJupamRQ7Uu8tW9
> Please contact me to be added to the recurring invite. (every two weeks)
> Everybody is welcome, bring your topic or just listen in.
>
> (Some more details on how the meeting is run:
> https://lists.apache.org/thread/bjdkscmx7zvgfbw0wlfttxy8h6v3f71t )
>
>