Attendees:

   - Julien: Datadog
   - Alex: Google, listening in
   - Aditya: CMU Variant
   - Alkis: Databricks, new footer update
   - Andrew Lamb: Influx Data. listen
   - Brian: Google
   - Dewey: geometry
   - Talat: Google.
   - Jeff: Snowflake. Footer, encoding
   - Martin: CMU variant
   - Mengmeng: Snowflake
   - Prashant: Snowflake
   - Prateek: Snowflake scan team. Encodings, exp
   - Russel: Snowflake. Duck
   - Sai: Snowflake
   - Sandieep: Snowflake
   - Selcuk: Snowflake
   - Selim: detection partition field, java api.modification date?
   - Thomas: Snowflake: floating type proposal.
   - Vinoo: startup
   - Micah: Databricks.


Agenda/Notes:

   - New footer update (Alkis):
      - Prototype running in Databricks:
         - https://github.com/apache/arrow/pull/43793
      - Deserialization results:
         - For compatibility: generate the thrift data structure from
           the flatbuffer.
         - Deserialization speed improved by 30% on average
         - Pathological cases: 5 to 10x speedup
         - Some cases have a regression (20-30%)
            - Need to figure out why and fix it.
         - Expect better fetch optimization because the footer is
           smaller.
         - This is all without taking advantage of:
            - Partial deserialization
            - Not having to produce the thrift
      - Do we need to evaluate in the context of caching files in SSDs,
        memory, etc.?
   - Follow ups:
      - Process to adopt new encodings (Sorry, Micah will share doc
        tonight for community input)
         - Selcuk, Jeff
         - Talat
         - Micah, Alkis
   - [Thomas] DECFLOAT Parquet Proposal
     <https://docs.google.com/document/d/1j_Q6vnn6Nhy60K4o0tdC91kE5vKGNJaoDOAm71KLzNw/edit?tab=t.0#heading=h.a3yn4bu050pz>
      - Decimal floating point type: a third type beyond fixed-point
        and floating point
      - Spark support for Decfloat (using BigDecimal)
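
For readers unfamiliar with the distinction, here is a minimal sketch of
how a decimal floating point value differs from binary floating point and
from a fixed-scale decimal. It uses Python's standard `decimal` module
purely as an illustration. The precision of 34 digits, the scale of 2, and
the helper `to_fixed` are assumptions made for this example and are not
taken from the proposal.

```python
from decimal import Decimal, localcontext

# Binary floating point cannot represent 0.1 exactly, so this is False.
binary_exact = (0.1 + 0.2) == 0.3

with localcontext() as ctx:
    ctx.prec = 34  # illustrative precision only (assumption)

    # Decimal floating point keeps base-10 digits, so this is True.
    dec_exact = (Decimal("0.1") + Decimal("0.2")) == Decimal("0.3")

    # The exponent floats with the value, so the same number of
    # significant digits covers very small and very large magnitudes.
    tiny = Decimal("1.5E-30") * 2
    huge = Decimal("1.5E+30") * 2


def to_fixed(value: Decimal, scale: int = 2) -> int:
    """Hypothetical helper: store a value as an unscaled integer with a
    fixed scale, the way a fixed-point decimal column would."""
    return int(value.scaleb(scale).to_integral_value())


# With a fixed scale of 2, 12.34 is stored as 1234, but a value far
# below the scale rounds away to 0.
fixed_ok = to_fixed(Decimal("12.34"))
fixed_lost = to_fixed(Decimal("1.5E-30"))
```

The last two lines show the trade-off: a fixed scale is exact within its
range but cannot hold values whose magnitude varies widely, which a
floating decimal exponent can.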


On Tue, May 27, 2025 at 8:35 PM Julien Le Dem <[email protected]> wrote:

> The next Parquet sync is tomorrow May 28th at 10am PT - 1pm ET - 7pm CET
> To join the invite, join the group:
> https://groups.google.com/g/apache-parquet-community-sync
>
> Everybody is welcome, bring your topic or just listen in.
>
> (Some more details on how the meeting is run:
> https://lists.apache.org/thread/bjdkscmx7zvgfbw0wlfttxy8h6v3f71t )
>
