Attendees:

   - Micah
   - Gene Pang: Variant type plan review
   - Rok: need help on encryption. _metadata problem
   - Ashish
   - Julien

Notes:

   - Variant:
      - Commitment to get Variant in Parquet
      - Gene is working on the plan for moving the implementation into
        parquet. He shared it on the mailing list for feedback.
      - Next step: we need to finalize the plan. Plan doc:
        <https://docs.google.com/document/d/1guEzBQjzOEEZvvibeZjNraKmZHWtxQR95O_DvtZU0xw/edit?usp=sharing>
      - Goals:
         - Do we need to release Variant independently?
            - Spec released with parquet-format
            - Java implementation can be released with parquet-java as long
              as we can do releases easily in the near future.
            - Discussion on the need to have a separate repo
               - We’d like to avoid a proliferation of parquet repos.
         - Separate jar with its own dependencies
            - To facilitate reuse by other projects (ex: Spark, …), there
              should be a separate artifact with minimal dependencies. That
              can be done as a separate maven module in parquet-java.



   - Parquet cpp implementation is in Arrow
   - Parquet Rust implementation:
      - 2 implementations
         - Arrow
         - Polars
   - Interval type in Parquet is a mess and needs improvement [Micah]
   - We’ll need another vote on the plan when finalized.


   - Encryption:
      - Use case: need to use the _metadata file, but it is not encrypted.
      - Work is in progress to add this, but currently the encryption doesn’t
        work properly: AD strings are unique per file, and the reader does
        not use them correctly.
      - Need to solve it in a way that we can merge it and add encryption
        support to _metadata.
      - Giddon is helping on the mailing list.
      - PR: https://github.com/apache/arrow/pull/41821
   - Action Items:
      - Gene to update doc with release requirements etc
      - Micah, Julien, …: Add feedback in the doc
      - Julien: enable people to add the recurring google calendar invite to
        their calendar.


On Wed, Sep 11, 2024 at 9:34 AM Julien Le Dem <[email protected]> wrote:

> The next Parquet Sync is happening now at 9:30am PT - 12:30pm ET - 6:30pm
> CET
> (sorry I didn't send a reminder earlier)
> To join the invite:
> https://calendar.app.google/KM28Ci71B2DMoHo4A
> Everybody is welcome, bring your topic or just listen in.
> Best
> Julien
>
