Hi folks,

Here are the meeting notes and recording link from today's Variant Sync,
July 2, 2026: [Notes
<https://docs.google.com/document/d/1IuhLRxw1rcPD_f4jgHuGe3SwFgy7Y5wgEGvLzf6311s/edit?tab=t.g06kaw1fbwhz>
].
There are some action items and active work.
Please reach out if you have any questions.
Thanks

On Fri, Jun 5, 2026 at 12:22 PM Neelesh Salian <[email protected]>
wrote:

> Hi folks,
>
> Here are the meeting notes and recording link from the Variant Sync on June
> 4, 2026: [Notes
> <https://docs.google.com/document/d/1IuhLRxw1rcPD_f4jgHuGe3SwFgy7Y5wgEGvLzf6311s/edit?tab=t.g06kaw1fbwhz#heading=h.r977qio1wsv2>
> ].
> There are some action items and active work.
> Please reach out if you have any questions.
> Thanks.
>
> On Fri, May 8, 2026 at 3:20 PM Neelesh Salian <[email protected]>
> wrote:
>
>> Hi folks,
>>
>> Here are the meeting notes and recording link from the Variant Sync on May
>> 7, 2026: [Notes
>> <https://docs.google.com/document/d/1IuhLRxw1rcPD_f4jgHuGe3SwFgy7Y5wgEGvLzf6311s/edit?tab=t.g06kaw1fbwhz>
>> ].
>> There are some action items and active work.
>> Please reach out if you have any questions.
>> Thanks.
>>
>> On Thu, Apr 30, 2026 at 1:36 PM Neelesh Salian <[email protected]>
>> wrote:
>>
>>> Hi folks,
>>>
>>> I've set up a time starting next week on Thursday (May 7, 2026) at 10 am
>>> Pacific time for a sync for the active work on Variant.
>>> This will be a monthly sync (on the first Thursday of every month).
>>> You can find it on the dev calendar.
>>> Here is the calendar invite:
>>> https://calendar.app.google/b8ykdTV3EaNnVnkv8
>>> I'll be recording the call and capturing notes in the sync document:
>>> Iceberg - Variant Community Update
>>> <https://docs.google.com/document/d/1IuhLRxw1rcPD_f4jgHuGe3SwFgy7Y5wgEGvLzf6311s/edit?usp=sharing>
>>> (Meeting Notes tab).
>>> Thanks.
>>>
>>> On Mon, Apr 20, 2026 at 1:49 PM Steve Loughran <[email protected]>
>>> wrote:
>>>
>>>> + regarding the rust, go and cpp impls, a status from each team would
>>>> be great!
>>>>
>>>> I've been reviewing arrow parquet variant stuff and it is all there,
>>>> including some benchmarks and optimisations, which may put it ahead of
>>>> the others.
>>>>
>>>> It also has some special handling for sorted variants, as key search
>>>> there is straightforward. AFAIK the others don't do that, nor do I see
>>>> them going to any effort to sort fields in an object. I think sorting
>>>> would be good, but you would have to handle the case where there are
>>>> duplicate keys. It's allowed in the spec, and seems like it could creep
>>>> in from nested variants. Has anyone looked at this?
>>>>
>>>> Also: has anyone created malformed parquet files with a shredded
>>>> variant and a metadata entry of the same name? The requirement is "ignore
>>>> the metadata one", but that's something to test. You'd have to write a
>>>> shredded file and then edit the binary content to achieve this, or manually
>>>> create one and put it into the parquet-testing repository under bad-data/.
>>>>
>>>>
>>>> On Mon, 20 Apr 2026 at 19:08, Qiegang Long <[email protected]> wrote:
>>>>
>>>>> Thanks for the doc to track the status! +1 on the dedicated
>>>>> sync—definitely feels like there’s a lot of work before we see Variant’s
>>>>> full potential.
>>>>>
>>>>> Qiegang
>>>>>
>>>>> On Mon, Apr 20, 2026 at 11:09 AM Steve Loughran <[email protected]>
>>>>> wrote:
>>>>>
>>>>>>
>>>>>> This is great, we need that tracker as it is a cross-project piece of
>>>>>> work to say "this is ready
>>>>>>
>>>>>> I did have an agenda item from last month's community call which
>>>>>> didn't get through. If we can retain that open time slot we could do a
>>>>>> very quick summary of where we are (summary slides of Qiegang's results
>>>>>> and mine, key outstanding issues and next steps), then we can start that
>>>>>> monthly session on it.
>>>>>>
>>>>>> Meanwhile, I have both parquet and iceberg PRs for benchmarks which I
>>>>>> think are ready for review; please take a look.
>>>>>>
>>>>>> Finally, I'm thinking about interop of those many, many variant
>>>>>> readers out there. Has anyone explicitly cross-tested their
>>>>>> implementations of variant? What about consistent handling of invalid
>>>>>> data? That includes iceberg-rust, parquet-cpp and more...
>>>>>>
>>>>>> Steve
>>>>>>
>>>>>> On Sun, 19 Apr 2026 at 21:57, Neelesh Salian <
>>>>>> [email protected]> wrote:
>>>>>>
>>>>>>> Hi everyone,
>>>>>>>
>>>>>>> The Variant umbrella issue (#10392
>>>>>>> <https://github.com/apache/iceberg/issues/10392>) hasn't been
>>>>>>> updated in a while, and with active work happening across multiple PRs
>>>>>>> in Iceberg, Spark, and Parquet, it's been hard to keep track of where
>>>>>>> things stand.
>>>>>>>
>>>>>>> Since a few of us are actively working on variant features, I
>>>>>>> thought it would help to put together a tracking document so the
>>>>>>> community has a single place to see the current state, open work, and
>>>>>>> benchmark findings. I plan to update this on a weekly basis to keep
>>>>>>> track of the issues and PRs that are updated.
>>>>>>>
>>>>>>> Iceberg Variant Community Document
>>>>>>> <https://docs.google.com/document/d/1IuhLRxw1rcPD_f4jgHuGe3SwFgy7Y5wgEGvLzf6311s/edit?usp=sharing>
>>>>>>>
>>>>>>> The document has three tabs:
>>>>>>>
>>>>>>>    1. Overview - what shipped in 1.10, what's merged to main, open
>>>>>>>    work areas, and the dependency graph across Iceberg, Spark, and
>>>>>>>    Parquet
>>>>>>>    2. Tracker - all open variant issues and PRs across Iceberg,
>>>>>>>    Parquet-Java, Parquet-Format, and Spark with authors and status
>>>>>>>    3. Benchmarks - summary of three independent benchmark efforts
>>>>>>>    (details below)
>>>>>>>
>>>>>>> *Benchmark findings*
>>>>>>>
>>>>>>> Three independent benchmarks have measured variant performance. All
>>>>>>> converge on the same picture: variant is a modest improvement over JSON
>>>>>>> strings today (1.1-1.7x faster reads), but 15-17x slower than typed
>>>>>>> columns.
>>>>>>>
>>>>>>>    1. Qiegang Long - 14 queries on GitHub Archive, 5 configs:
>>>>>>>    https://qlong.github.io/posts/2026-03-30-variant-early-results
>>>>>>>    2. Steve Loughran - JMH microbenchmarks, profiler-driven
>>>>>>>    optimization:
>>>>>>>    https://steveloughran.github.io/benchmarking-variants/benchmarking-variants.html
>>>>>>>    3. Neelesh Salian - Controlled baseline, 10M+100M rows, write +
>>>>>>>    read:
>>>>>>>    https://github.com/nssalian/iceberg/tree/iceberg-variant-benchmark/benchmark
>>>>>>>
>>>>>>> If you're working on variant-related changes, please chime in or let
>>>>>>> me know and I'll add it to the tracker. Feedback on the benchmarks or
>>>>>>> anything else is welcome.
>>>>>>>
>>>>>>> I've been giving variant updates during the Iceberg Spark Sync
>>>>>>> (Tuesdays, 10 AM PT), but given that this work now spans Iceberg, Spark,
>>>>>>> Parquet, and Flink, I think it deserves its own forum. I'd like to
>>>>>>> propose a monthly Variant Sync: a short call where contributors can
>>>>>>> share progress, surface blockers, and coordinate across repos. If
>>>>>>> there's interest, I'll set one up and share an invite on this thread.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Neelesh Salian.
>>>>>>>
>>>>>>
