We have completed cross-language validation for Variant, and implementation
compatibility appears solid. Matt has raised some comments about how to
handle invalid cases. In fact, we had a long discussion during spec
development about whether to explicitly define the behavior for such cases. We
should be able to clear that up soon.


> On Aug 8, 2025, at 2:35 PM, Jia Yu <[email protected]> wrote:
> 
> Hi Gang,
> 
> Thanks for letting me know.
> 
> Would it make sense to create a new Parquet Java branch that includes all
> other commits except the Variant type implementation? That way, we could
> release a version without Variant entirely.
> 
> We’re eager to get the Geo type released, but at the same time, we don’t
> want to rush the Variant work or ship something that’s not fully ready.
> 
> Thanks,
> Jia
> 
>> On Fri, Aug 8, 2025 at 1:25 AM Gang Wu <[email protected]> wrote:
>> 
>> parquet-cpp does not implement variant type yet, so it is safe to release
>> the geo types. IIUC, there is no easy way to block users from producing
>> files with variant types in parquet-java, so this is the main concern.
>> 
>> Perhaps Aihua can provide an update on the progress?
>> 
>> Best,
>> Gang
>> 
>> 
>> 
>>> On Fri, Aug 8, 2025 at 5:11 AM Jia Yu <[email protected]> wrote:
>>> 
>>> Hi all,
>>> 
>>> Thank you for all your hard work on Parquet.
>>> 
>>> Sorry for my ignorance, but I’d like to better understand why the Parquet
>>> Java release for Geo types is currently tied to the Variant type work.
>>> Arrow C++ (Parquet C++) has already been released with Geo type support,
>>> and it doesn’t seem to have encountered similar issues.
>>> 
>>> The Geo type support in Iceberg has been stalled for several months because
>>> the Iceberg PMC cannot review or merge the implementation until there’s a
>>> corresponding Parquet Java release.
>>> 
>>> Would it be possible to proceed with a new Parquet Java release for Geo,
>>> and mark the Variant type as experimental or keep it behind a feature flag?
>>> 
>>> I’d really appreciate your thoughts on this and am looking forward to your
>>> response.
>>> 
>>> Thanks,
>>> Jia
>>> 
>>> 
>>> 
>>>> On Fri, Jul 18, 2025 at 10:33 AM Aihua Xu <[email protected]> wrote:
>>> 
>>>> Seems the concern from Gabor is that we should finalize the Variant spec
>>>> (https://github.com/apache/parquet-format/blob/master/VariantEncoding.md
>>>> and
>>>> https://github.com/apache/parquet-format/blob/master/VariantShredding.md),
>>>> have a parquet-format release, and then move forward with parquet-java
>>>> release. I totally agree.
>>>> 
>>>> We should have met the requirement with two reference implementations for
>>>> Variant in open source and I will start a VOTE thread separately to close
>>>> out the Variant spec if no objections.
>>>> 
>>>> Thanks for the discussions.
>>>> Aihua
>>>> 
>>>> 
>>>> On Thu, Jul 17, 2025 at 3:41 AM Andrew Lamb <[email protected]>
>>>> wrote:
>>>> 
>>>>>> At this point, I’d like to check if we have enough implementation coverage
>>>>>> to move forward with finalizing the Variant spec. Would it make sense to
>>>>>> start a vote thread at this stage?
>>>>> 
>>>>> In my opinion we have sufficient open source implementations (the Golang
>>>>> implementation on arrow-go) and a vote to finalize the spec would be
>>>>> appropriate (and welcome)
>>>>> 
>>>>> From my experience working on the Rust implementation so far, I have found
>>>>> the spec clear and easy to understand, the design well thought out, and
>>>>> have not encountered anything that would require any changes.
>>>>> 
>>>>> Kudos to the team who designed and wrote the spec for this feature,
>>>>> Andrew
>>>>> 
>>>>> 
>>>>> 
>>>>> On Thu, Jul 17, 2025 at 2:08 AM Jia Yu <[email protected]> wrote:
>>>>> 
>>>>>> Thanks Aihua!
>>>>>> 
>>>>>> The geo type implementation in Iceberg is currently blocked by this
>>>>>> release. Really looking forward to it.
>>>>>> 
>>>>>> Jia
>>>>>> 
>>>>>> On Wed, Jul 16, 2025 at 10:47 PM Gábor Szádovszky <[email protected]>
>>>>>> wrote:
>>>>>> 
>>>>>>> My concern was related to the current stage of the Variant specification
>>>>>>> and the fact that we started talking about releasing parquet-java with
>>>>>>> Variant features.
>>>>>>> If we formally release parquet-format with the finalized Variant spec
>>>>>>> first, then I have no concerns about writing Variant values in the upcoming
>>>>>>> parquet-java release. Otherwise, we need to block it by default and mark it
>>>>>>> as an experimental feature.
>>>>>>> 
>>>>>>> Cheers,
>>>>>>> Gabor
>>>>>>> 
>>>>>>> On Wed, Jul 16, 2025 at 19:37, Aihua Xu <[email protected]> wrote:
>>>>>>> 
>>>>>>>> Hi Gabor and all,
>>>>>>>> 
>>>>>>>> Here’s my current understanding of the progress on the *Variant* support in
>>>>>>>> Parquet:
>>>>>>>> 
>>>>>>>>   - Per Parquet's requirements, we need at least two reference
>>>>>>>>     implementations to finalize the Variant logical type specification.
>>>>>>>>   - The community is actively working on Java, Go, and Rust implementations:
>>>>>>>>      - Java already has the encoding and shredding implementations in place:
>>>>>>>>         - Variant Decoding
>>>>>>>>           <https://github.com/apache/parquet-java/pull/3197>
>>>>>>>>         - Variant Encoding
>>>>>>>>           <https://github.com/apache/parquet-java/pull/3202>
>>>>>>>>         - Variant Shredding Writer
>>>>>>>>           <https://github.com/apache/parquet-java/issues/3223>
>>>>>>>>         - Variant Shredding Reader
>>>>>>>>           <https://github.com/apache/parquet-java/issues/3211>
>>>>>>>>      - Go also includes encoding and shredding support:
>>>>>>>>         - Variant Encoding/Decoding
>>>>>>>>           <https://github.com/apache/arrow-go/pull/344>
>>>>>>>>         - Variant Shredding
>>>>>>>>           <https://github.com/apache/arrow-go/pull/434>
>>>>>>>>      - Rust is currently working on the shredding implementation.
>>>>>>>> 
>>>>>>>> In addition to these, we already have a full Variant implementation in
>>>>>>>> Apache Iceberg, as well as in some closed-source engines.
>>>>>>>> 
>>>>>>>> At this point, I’d like to check if we have enough implementation coverage
>>>>>>>> to move forward with finalizing the Variant spec. Would it make sense to
>>>>>>>> start a vote thread at this stage?
>>>>>>>> 
>>>>>>>> Ultimately, our goal is to release a new version of parquet-format and
>>>>>>>> parquet-java that includes the Variant logical type, so that Iceberg and
>>>>>>>> other engines can officially depend on it and proceed with further
>>>>>>>> implementation.
>>>>>>>> 
>>>>>>>> Let me know your thoughts and how we should proceed.
>>>>>>>> 
>>>>>>>> Thanks,
>>>>>>>> 
>>>>>>>> Aihua
>>>>>>>> 
>>>>>>>> On Sun, Jul 13, 2025 at 10:08 PM Gábor Szádovszky <[email protected]>
>>>>>>>> wrote:
>>>>>>>> 
>>>>>>>>> Hi,
>>>>>>>>> 
>>>>>>>>> I was not able to open the recordings of the last meeting because of
>>>>>>>>> permission issues. (Shouldn't these be accessible for anyone?)
>>>>>>>>> So, I'm not sure if you have talked about this, but the Variant spec is
>>>>>>>>> still not final. Since parquet-java already has Variant support, how do we
>>>>>>>>> prevent writing potentially invalid Variant data with the proper logical
>>>>>>>>> types we will use for the finalized spec? Is it behind a feature flag?
>>>>>>>>> 
>>>>>>>>> Cheers,
>>>>>>>>> Gabor
>>>>>>>>> 
>>>>>>>>> On Fri, Jul 11, 2025 at 19:33, Aihua Xu <[email protected]> wrote:
>>>>>>>>> 
>>>>>>>>>> Hi community,
>>>>>>>>>> 
>>>>>>>>>> As discussed in the last community sync-up meeting, I'd like to proceed
>>>>>>>>>> with releasing *Parquet-Java 1.16.0*, which will include support for
>>>>>>>>>> *geo-type* and *variant*.
>>>>>>>>>> 
>>>>>>>>>> Please let me know if you have any objections or if you have any upcoming
>>>>>>>>>> changes you'd like to include in this release.
>>>>>>>>>> Thanks,
>>>>>>>>>> Aihua
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>> 
>>>> 
>>> 
>> 