We have completed cross-language validation for Variant, and the implementations appear to be compatible. Matt has raised some comments about how to handle invalid cases. In fact, we had a long discussion during spec development about whether to explicitly define the behavior for such cases. We should be able to clear that up soon.

> On Aug 8, 2025, at 2:35 PM, Jia Yu <[email protected]> wrote:
>
> Hi Gang,
>
> Thanks for letting me know.
>
> Would it make sense to create a new Parquet Java branch that includes all other commits except the Variant type implementation? That way, we could release a version without Variant entirely.
>
> We’re eager to get the Geo type released, but at the same time, we don’t want to rush the Variant work or ship something that’s not fully ready.
>
> Thanks,
> Jia
>
>> On Fri, Aug 8, 2025 at 1:25 AM Gang Wu <[email protected]> wrote:
>>
>> parquet-cpp does not implement variant type yet, so it is safe to release the geo types. IIUC, there is no easy way to block users from producing files with variant types in parquet-java, so this is the main concern.
>>
>> Perhaps Aihua can provide an update on the progress?
>>
>> Best,
>> Gang
>>
>>> On Fri, Aug 8, 2025 at 5:11 AM Jia Yu <[email protected]> wrote:
>>>
>>> Hi all,
>>>
>>> Thank you for all your hard work on Parquet.
>>>
>>> Sorry for my ignorance, but I’d like to better understand why the Parquet Java release for Geo types is currently tied to the Variant type work. Arrow C++ (Parquet C++) has already been released with Geo type support, and it doesn’t seem to have encountered similar issues.
>>>
>>> The Geo type support in Iceberg has been stalled for several months because the Iceberg PMC cannot review or merge the implementation until there’s a corresponding Parquet Java release.
>>>
>>> Would it be possible to proceed with a new Parquet Java release for Geo, and mark the Variant type as experimental or keep it behind a feature flag?
>>>
>>> I’d really appreciate your thoughts on this and am looking forward to your response.
>>>
>>> Thanks,
>>> Jia
>>>
>>>> On Fri, Jul 18, 2025 at 10:33 AM Aihua Xu <[email protected]> wrote:
>>>>
>>>> Seems the concern from Gabor is that we should finalize the Variant spec (https://github.com/apache/parquet-format/blob/master/VariantEncoding.md and https://github.com/apache/parquet-format/blob/master/VariantShredding.md), have a parquet-format release, and then move forward with parquet-java release. I totally agree.
>>>>
>>>> We should have met the requirement with two reference implementations for Variant in open source and I will start a VOTE thread separately to close out the Variant spec if no objections.
>>>>
>>>> Thanks for the discussions.
>>>> Aihua
>>>>
>>>> On Thu, Jul 17, 2025 at 3:41 AM Andrew Lamb <[email protected]> wrote:
>>>>
>>>>>> At this point, I’d like to check if we have enough implementation coverage to move forward with finalizing the Variant spec. Would it make sense to start a vote thread at this stage?
>>>>>
>>>>> In my opinion we have sufficient open source implementations (the Golang implementation on arrow-go) and a vote to finalize the spec would be appropriate (and welcome)
>>>>>
>>>>> From my experience working on the Rust implementation so far, I have found the spec clear and easy to understand, the design well thought out, and have not encountered anything that would require any changes.
>>>>>
>>>>> Kudos to the team who designed and wrote the spec for this feature,
>>>>> Andrew
>>>>>
>>>>> On Thu, Jul 17, 2025 at 2:08 AM Jia Yu <[email protected]> wrote:
>>>>>
>>>>>> Thanks Aihua!
>>>>>>
>>>>>> The geo type implementation in Iceberg is currently blocked by this release. Really looking forward to it.
>>>>>>
>>>>>> Jia
>>>>>>
>>>>>> On Wed, Jul 16, 2025 at 10:47 PM Gábor Szádovszky <[email protected]> wrote:
>>>>>>
>>>>>>> My concern was related to the current stage of the Variant specification and the fact that we started talking about releasing parquet-java with Variant features.
>>>>>>> If we formally release parquet-format with the finalized Variant spec first, then I have no concerns about writing Variant values in the upcoming parquet-java release. Otherwise, we need to block it by default and mark it as an experimental feature.
>>>>>>>
>>>>>>> Cheers,
>>>>>>> Gabor
>>>>>>>
>>>>>>> On Wed, Jul 16, 2025 at 19:37, Aihua Xu <[email protected]> wrote:
>>>>>>>
>>>>>>>> Hi Gabor and all,
>>>>>>>>
>>>>>>>> Here’s my current understanding of the progress on the *Variant* support in Parquet:
>>>>>>>>
>>>>>>>> - Per Parquet's requirements, we need at least two reference implementations to finalize the Variant logical type specification.
>>>>>>>> - The community is actively working on Java, Go, and Rust implementations:
>>>>>>>>   - Java already has the encoding and shredding implementations in place:
>>>>>>>>     - Variant Decoding <https://github.com/apache/parquet-java/pull/3197>
>>>>>>>>     - Variant Encoding <https://github.com/apache/parquet-java/pull/3202>
>>>>>>>>     - Variant Shredding Writer <https://github.com/apache/parquet-java/issues/3223>
>>>>>>>>     - Variant Shredding Reader <https://github.com/apache/parquet-java/issues/3211>
>>>>>>>>   - Go also includes encoding and shredding support:
>>>>>>>>     - Variant Encoding/Decoding <https://github.com/apache/arrow-go/pull/344>
>>>>>>>>     - Variant Shredding <https://github.com/apache/arrow-go/pull/434>
>>>>>>>>   - Rust is currently working on the shredding implementation.
>>>>>>>>
>>>>>>>> In addition to these, we already have a full Variant implementation in Apache Iceberg, as well as in some closed-source engines.
>>>>>>>>
>>>>>>>> At this point, I’d like to check if we have enough implementation coverage to move forward with finalizing the Variant spec. Would it make sense to start a vote thread at this stage?
>>>>>>>>
>>>>>>>> Ultimately, our goal is to release a new version of parquet-format and parquet-java that includes the Variant logical type, so that Iceberg and other engines can officially depend on it and proceed with further implementation.
>>>>>>>>
>>>>>>>> Let me know your thoughts and how we should proceed.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>> Aihua
>>>>>>>>
>>>>>>>> On Sun, Jul 13, 2025 at 10:08 PM Gábor Szádovszky <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> I was not able to open the recordings of the last meeting because of permission issues. (Shouldn't these be accessible for anyone?)
>>>>>>>>> So, I'm not sure if you have talked about this, but the Variant spec is still not final. Since parquet-java already has Variant support, how do we prevent writing potentially invalid Variant data with the proper logical types we will use for the finalized spec? Is it behind a feature flag?
>>>>>>>>>
>>>>>>>>> Cheers,
>>>>>>>>> Gabor
>>>>>>>>>
>>>>>>>>> On Fri, Jul 11, 2025 at 19:33, Aihua Xu <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> Hi community,
>>>>>>>>>>
>>>>>>>>>> As discussed in the last community sync-up meeting, I'd like to proceed with releasing *Parquet-Java 1.16.0*, which will include support for *geo-type* and *variant*.
>>>>>>>>>>
>>>>>>>>>> Please let me know if you have any objections or if you have any upcoming changes you'd like to include in this release.
>>>>>>>>>> Thanks,
>>>>>>>>>> Aihua
