> You want to see if the write path in GO is compatible? Let me check with
> Matt on this.

Yes, IIUC, I think there are now multiple OSS reader implementations, that
have all been validated against parquet-java writing. So I think it is
important we validate a second writer can produce files that can be read by
parquet-java.

Thanks,
Micah

On Mon, Aug 11, 2025 at 9:17 AM Aihua Xu <[email protected]> wrote:

> Hi Micah,
>
> What we have done is to generate a large set of the test cases from the
> Iceberg project and validate in Java and GO. All of those implementations
> are independent. You want to see if the write path in GO is compatible?
> Let me check with Matt on this.
>
> Thanks,
> Aihua
>
> On Sun, Aug 10, 2025 at 9:24 PM Micah Kornfield <[email protected]> wrote:
>
> > > We have completed cross-language validation for variant and the
> > > implementation compatibility appears solid
> >
> > Great, apologies if I missed it but did we verify Java being able to
> > read Go's output?
> >
> > On Fri, Aug 8, 2025 at 9:38 PM Aihua Xu <[email protected]> wrote:
> >
> > > We have completed cross-language validation for variant and the
> > > implementation compatibility appears solid. Matt has raised some
> > > comments regarding how to handle invalid cases. In fact, we had a long
> > > discussion during the spec development about whether to explicitly
> > > define the behavior for such cases. We should be able to clear that
> > > out soon.
> > >
> > > > On Aug 8, 2025, at 2:35 PM, Jia Yu <[email protected]> wrote:
> > > >
> > > > Hi Gang,
> > > >
> > > > Thanks for letting me know.
> > > >
> > > > Would it make sense to create a new Parquet Java branch that
> > > > includes all other commits except the Variant type implementation?
> > > > That way, we could release a version without Variant entirely.
> > > >
> > > > We’re eager to get the Geo type released, but at the same time, we
> > > > don’t want to rush the Variant work or ship something that’s not
> > > > fully ready.
> > > >
> > > > Thanks,
> > > > Jia
> > > >
> > > >> On Fri, Aug 8, 2025 at 1:25 AM Gang Wu <[email protected]> wrote:
> > > >>
> > > >> parquet-cpp does not implement variant type yet, so it is safe to
> > > >> release the geo types. IIUC, there is no easy way to block users
> > > >> from producing files with variant types in parquet-java, so this is
> > > >> the main concern.
> > > >>
> > > >> Perhaps Aihua can provide an update on the progress?
> > > >>
> > > >> Best,
> > > >> Gang
> > > >>
> > > >>> On Fri, Aug 8, 2025 at 5:11 AM Jia Yu <[email protected]> wrote:
> > > >>>
> > > >>> Hi all,
> > > >>>
> > > >>> Thank you for all your hard work on Parquet.
> > > >>>
> > > >>> Sorry for my ignorance, but I’d like to better understand why the
> > > >>> Parquet Java release for Geo types is currently tied to the
> > > >>> Variant type work. Arrow C++ (Parquet C++) has already been
> > > >>> released with Geo type support, and it doesn’t seem to have
> > > >>> encountered similar issues.
> > > >>>
> > > >>> The Geo type support in Iceberg has been stalled for several
> > > >>> months because the Iceberg PMC cannot review or merge the
> > > >>> implementation until there’s a corresponding Parquet Java release.
> > > >>>
> > > >>> Would it be possible to proceed with a new Parquet Java release
> > > >>> for Geo, and mark the Variant type as experimental or keep it
> > > >>> behind a feature flag?
> > > >>>
> > > >>> I’d really appreciate your thoughts on this and am looking forward
> > > >>> to your response.
> > > >>>
> > > >>> Thanks,
> > > >>> Jia
> > > >>>
> > > >>>> On Fri, Jul 18, 2025 at 10:33 AM Aihua Xu <[email protected]> wrote:
> > > >>>>
> > > >>>> Seems the concern from Gabor is that we should finalize the
> > > >>>> Variant spec
> > > >>>> (https://github.com/apache/parquet-format/blob/master/VariantEncoding.md
> > > >>>> and
> > > >>>> https://github.com/apache/parquet-format/blob/master/VariantShredding.md),
> > > >>>> have a parquet-format release, and then move forward with
> > > >>>> parquet-java release. I totally agree.
> > > >>>>
> > > >>>> We should have met the requirement with two reference
> > > >>>> implementations for Variant in open source and I will start a
> > > >>>> VOTE thread separately to close out the Variant spec if no
> > > >>>> objections.
> > > >>>>
> > > >>>> Thanks for the discussions.
> > > >>>> Aihua
> > > >>>>
> > > >>>> On Thu, Jul 17, 2025 at 3:41 AM Andrew Lamb <[email protected]> wrote:
> > > >>>>
> > > >>>>>> At this point, I’d like to check if we have enough
> > > >>>>>> implementation coverage to move forward with finalizing the
> > > >>>>>> Variant spec. Would it make sense to start a vote thread at
> > > >>>>>> this stage?
> > > >>>>>
> > > >>>>> In my opinion we have sufficient open source implementations
> > > >>>>> (the Golang implementation on arrow-go) and a vote to finalize
> > > >>>>> the spec would be appropriate (and welcome)
> > > >>>>>
> > > >>>>> From my experience working on the Rust implementation so far, I
> > > >>>>> have found the spec clear and easy to understand, the design
> > > >>>>> well thought out, and have not encountered anything that would
> > > >>>>> require any changes.
> > > >>>>>
> > > >>>>> Kudos to the team who designed and wrote the spec for this
> > > >>>>> feature,
> > > >>>>> Andrew
> > > >>>>>
> > > >>>>> On Thu, Jul 17, 2025 at 2:08 AM Jia Yu <[email protected]> wrote:
> > > >>>>>
> > > >>>>>> Thanks Aihua!
> > > >>>>>>
> > > >>>>>> The geo type implementation in Iceberg is currently blocked by
> > > >>>>>> this release. Really looking forward to it.
> > > >>>>>>
> > > >>>>>> Jia
> > > >>>>>>
> > > >>>>>> On Wed, Jul 16, 2025 at 10:47 PM Gábor Szádovszky <[email protected]> wrote:
> > > >>>>>>
> > > >>>>>>> My concern was related to the current stage of the Variant
> > > >>>>>>> specification and the fact that we started talking about
> > > >>>>>>> releasing parquet-java with Variant features.
> > > >>>>>>> If we formally release parquet-format with the finalized
> > > >>>>>>> Variant spec first, then I have no concerns about writing
> > > >>>>>>> Variant values in the upcoming parquet-java release.
> > > >>>>>>> Otherwise, we need to block it by default and mark it as an
> > > >>>>>>> experimental feature.
> > > >>>>>>>
> > > >>>>>>> Cheers,
> > > >>>>>>> Gabor
> > > >>>>>>>
> > > >>>>>>> On Wed, Jul 16, 2025 at 19:37, Aihua Xu <[email protected]> wrote:
> > > >>>>>>>
> > > >>>>>>>> Hi Gabor and all,
> > > >>>>>>>>
> > > >>>>>>>> Here’s my current understanding of the progress on the
> > > >>>>>>>> *Variant* support in Parquet:
> > > >>>>>>>>
> > > >>>>>>>> - Per Parquet's requirements, we need at least two reference
> > > >>>>>>>>   implementations to finalize the Variant logical type
> > > >>>>>>>>   specification.
> > > >>>>>>>> - The community is actively working on Java, Go, and Rust
> > > >>>>>>>>   implementations:
> > > >>>>>>>>   - Java already has the encoding and shredding
> > > >>>>>>>>     implementations in place:
> > > >>>>>>>>     - Variant Decoding
> > > >>>>>>>>       <https://github.com/apache/parquet-java/pull/3197>
> > > >>>>>>>>     - Variant Encoding
> > > >>>>>>>>       <https://github.com/apache/parquet-java/pull/3202>
> > > >>>>>>>>     - Variant Shredding Writer
> > > >>>>>>>>       <https://github.com/apache/parquet-java/issues/3223>
> > > >>>>>>>>     - Variant Shredding Reader
> > > >>>>>>>>       <https://github.com/apache/parquet-java/issues/3211>
> > > >>>>>>>>   - Go also includes encoding and shredding support:
> > > >>>>>>>>     - Variant Encoding/Decoding
> > > >>>>>>>>       <https://github.com/apache/arrow-go/pull/344>
> > > >>>>>>>>     - Variant Shredding
> > > >>>>>>>>       <https://github.com/apache/arrow-go/pull/434>
> > > >>>>>>>>   - Rust is currently working on the shredding
> > > >>>>>>>>     implementation.
> > > >>>>>>>>
> > > >>>>>>>> In addition to these, we already have a full Variant
> > > >>>>>>>> implementation in Apache Iceberg, as well as in some
> > > >>>>>>>> closed-source engines.
> > > >>>>>>>>
> > > >>>>>>>> At this point, I’d like to check if we have enough
> > > >>>>>>>> implementation coverage to move forward with finalizing the
> > > >>>>>>>> Variant spec. Would it make sense to start a vote thread at
> > > >>>>>>>> this stage?
> > > >>>>>>>>
> > > >>>>>>>> Ultimately, our goal is to release a new version of
> > > >>>>>>>> parquet-format and parquet-java that includes the Variant
> > > >>>>>>>> logical type, so that Iceberg and other engines can
> > > >>>>>>>> officially depend on it and proceed with further
> > > >>>>>>>> implementation.
> > > >>>>>>>>
> > > >>>>>>>> Let me know your thoughts and how we should proceed.
> > > >>>>>>>>
> > > >>>>>>>> Thanks,
> > > >>>>>>>>
> > > >>>>>>>> Aihua
> > > >>>>>>>>
> > > >>>>>>>> On Sun, Jul 13, 2025 at 10:08 PM Gábor Szádovszky <[email protected]> wrote:
> > > >>>>>>>>
> > > >>>>>>>>> Hi,
> > > >>>>>>>>>
> > > >>>>>>>>> I was not able to open the recordings of the last meeting
> > > >>>>>>>>> because of permission issues. (Shouldn't these be accessible
> > > >>>>>>>>> for anyone?)
> > > >>>>>>>>> So, I'm not sure if you have talked about this, but the
> > > >>>>>>>>> Variant spec is still not final. Since parquet-java already
> > > >>>>>>>>> has Variant support, how do we prevent writing potentially
> > > >>>>>>>>> invalid Variant data with the proper logical types we will
> > > >>>>>>>>> use for the finalized spec? Is it behind a feature flag?
> > > >>>>>>>>>
> > > >>>>>>>>> Cheers,
> > > >>>>>>>>> Gabor
> > > >>>>>>>>>
> > > >>>>>>>>> On Fri, Jul 11, 2025 at 19:33, Aihua Xu <[email protected]> wrote:
> > > >>>>>>>>>
> > > >>>>>>>>>> Hi community,
> > > >>>>>>>>>>
> > > >>>>>>>>>> As discussed in the last community sync-up meeting, I'd
> > > >>>>>>>>>> like to proceed with releasing *Parquet-Java 1.16.0*, which
> > > >>>>>>>>>> will include support for *geo-type* and *variant*.
> > > >>>>>>>>>>
> > > >>>>>>>>>> Please let me know if you have any objections or if you
> > > >>>>>>>>>> have any upcoming changes you'd like to include in this
> > > >>>>>>>>>> release.
> > > >>>>>>>>>>
> > > >>>>>>>>>> Thanks,
> > > >>>>>>>>>> Aihua
