Hi, unfortunately my question has not been answered yet.

On Sat, May 31, 2025 at 01:37, Julien Le Dem <[email protected]> wrote:
> Attendees:
>
> - Julien: Datadog
> - Alex: Google, listening in
> - Aditya: CMU Variant
> - Alkis: Databricks, new footer update
> - Andrew Lamb: Influx Data. listen
> - Brian: Google
> - Dewey: geometry
> - Talat: Google.
> - Jeff: Snowflake. Footer, encoding
> - Martin: CMU variant
> - Mengmeng: snowflake
> - Prashant: snowflake
> - Prateek: snowflake scan team. Encodings, exp
> - Russel: Snowflake. Duck
> - Sai: snowflake
> - Sandieep: snowflake
> - Selcuk: snowflake
> - Selim: detection partition field, java api.modification date?
> - Thomas: snowflake: floating type proposal.
> - Vinoo: startup
> - Micah: Databricks.
>
> Agenda/Notes:
>
> - New footer update (Alkis):
>   - Prototype running in databricks:
>     - https://github.com/apache/arrow/pull/43793
>   - Deserialization results:
>     - For compatibility: generate the thrift data structure from the
>       flatbuffer.
>     - Deserialization speed improved by 30% on average
>     - Pathological cases 5 to 10x speedup
>     - Some cases have regression (20-30%)
>       - Need to figure out why and fix it.
>     - Expect better fetch optimization because the footer is smaller.
>     - This is all without taking advantage of:
>       - Partial deserialization
>       - Not having to produce the thrift
>   - Do we need to evaluate in the context of caching files in SSDs,
>     memory etc
> - Follow ups:
>   - Process to adopt new encodings (Sorry Micah will share doc tonight
>     for community input)
>     - Selcuk, Jeff
>     - Talat
>     - Micah, Alkis
> - [Thomas] DECFLOAT Parquet Proposal
>   <https://docs.google.com/document/d/1j_Q6vnn6Nhy60K4o0tdC91kE5vKGNJaoDOAm71KLzNw/edit?tab=t.0#heading=h.a3yn4bu050pz>
>   - Decimal floating point type: A third type beyond fixed type and
>     floating point
>   - Spark support Decfloat (using bigdecimal)
>
> On Tue, May 27, 2025 at 8:35 PM Julien Le Dem <[email protected]> wrote:
>
> > The next Parquet sync is tomorrow May 28th at 10am PT - 1pm ET - 7pm CET
> > To join the invite, join the group:
> > https://groups.google.com/g/apache-parquet-community-sync
> >
> > Everybody is welcome, bring your topic or just listen in.
> >
> > (Some more details on how the meeting is run:
> > https://lists.apache.org/thread/bjdkscmx7zvgfbw0wlfttxy8h6v3f71t )
