>
> You want to see if the write path in Go is compatible? Let
> me check with Matt on this.


Yes, IIUC, there are now multiple OSS reader implementations that have
all been validated against files written by parquet-java. So I think it is
important that we validate that a second writer can produce files that can
be read by parquet-java.

Thanks,
Micah

On Mon, Aug 11, 2025 at 9:17 AM Aihua Xu <[email protected]> wrote:

> Hi Micah,
>
> What we have done is generate a large set of test cases from the
> Iceberg project and validate them in Java and Go. All of those implementations
> are independent. You want to see if the write path in Go is compatible? Let
> me check with Matt on this.
>
> Thanks,
> Aihua
>
> On Sun, Aug 10, 2025 at 9:24 PM Micah Kornfield <[email protected]>
> wrote:
>
> > >
> > > We have completed cross-language validation for variant and the
> > > implementation compatibility appears solid
> >
> >
> > Great, apologies if I missed it but did we verify Java being able to read
> > Go's output?
> >
> > On Fri, Aug 8, 2025 at 9:38 PM Aihua Xu <[email protected]> wrote:
> >
> > > We have completed cross-language validation for variant and the
> > > implementation compatibility appears solid. Matt has raised some comments
> > > regarding how to handle invalid cases. In fact, we had a long discussion
> > > during the spec development about whether to explicitly define the behavior
> > > for such cases. We should be able to clear that up soon.
> > >
> > >
> > > > On Aug 8, 2025, at 2:35 PM, Jia Yu <[email protected]> wrote:
> > > >
> > > > Hi Gang,
> > > >
> > > > Thanks for letting me know.
> > > >
> > > > Would it make sense to create a new Parquet Java branch that includes all
> > > > other commits except the Variant type implementation? That way, we could
> > > > release a version without Variant entirely.
> > > >
> > > > We’re eager to get the Geo type released, but at the same time, we don’t
> > > > want to rush the Variant work or ship something that’s not fully ready.
> > > >
> > > > Thanks,
> > > > Jia
> > > >
> > > >> On Fri, Aug 8, 2025 at 1:25 AM Gang Wu <[email protected]> wrote:
> > > >>
> > > >> parquet-cpp does not implement variant type yet, so it is safe to release
> > > >> the geo types. IIUC, there is no easy way to block users from producing
> > > >> files with variant types in parquet-java, so this is the main concern.
> > > >>
> > > >> Perhaps Aihua can provide an update on the progress?
> > > >>
> > > >> Best,
> > > >> Gang
> > > >>
> > > >>
> > > >>
> > > >>> On Fri, Aug 8, 2025 at 5:11 AM Jia Yu <[email protected]> wrote:
> > > >>>
> > > >>> Hi all,
> > > >>>
> > > >>> Thank you for all your hard work on Parquet.
> > > >>>
> > > >>> Sorry for my ignorance, but I’d like to better understand why the Parquet
> > > >>> Java release for Geo types is currently tied to the Variant type work.
> > > >>> Arrow C++ (Parquet C++) has already been released with Geo type support,
> > > >>> and it doesn’t seem to have encountered similar issues.
> > > >>>
> > > >>> The Geo type support in Iceberg has been stalled for several months because
> > > >>> the Iceberg PMC cannot review or merge the implementation until there’s a
> > > >>> corresponding Parquet Java release.
> > > >>>
> > > >>> Would it be possible to proceed with a new Parquet Java release for Geo,
> > > >>> and mark the Variant type as experimental or keep it behind a feature flag?
> > > >>>
> > > >>> I’d really appreciate your thoughts on this and am looking forward to your
> > > >>> response.
> > > >>>
> > > >>> Thanks,
> > > >>> Jia
> > > >>>
> > > >>>
> > > >>>
> > > >>>> On Fri, Jul 18, 2025 at 10:33 AM Aihua Xu <[email protected]> wrote:
> > > >>>
> > > >>>> Seems the concern from Gabor is that we should finalize the Variant spec
> > > >>>> (https://github.com/apache/parquet-format/blob/master/VariantEncoding.md
> > > >>>> and
> > > >>>> https://github.com/apache/parquet-format/blob/master/VariantShredding.md),
> > > >>>> have a parquet-format release, and then move forward with parquet-java
> > > >>>> release. I totally agree.
> > > >>>>
> > > >>>> We should have met the requirement with two reference implementations for
> > > >>>> Variant in open source and I will start a VOTE thread separately to close
> > > >>>> out the Variant spec if no objections.
> > > >>>>
> > > >>>> Thanks for the discussions.
> > > >>>> Aihua
> > > >>>>
> > > >>>>
> > > >>>> On Thu, Jul 17, 2025 at 3:41 AM Andrew Lamb <[email protected]> wrote:
> > > >>>>
> > > >>>>>> At this point, I’d like to check if we have enough implementation coverage
> > > >>>>>> to move forward with finalizing the Variant spec. Would it make sense to
> > > >>>>>> start a vote thread at this stage?
> > > >>>>>
> > > >>>>> In my opinion we have sufficient open source implementations (the Golang
> > > >>>>> implementation on arrow-go) and a vote to finalize the spec would be
> > > >>>>> appropriate (and welcome)
> > > >>>>>
> > > >>>>> From my experience working on the Rust implementation so far, I have found
> > > >>>>> the spec clear and easy to understand, the design well thought out, and
> > > >>>>> have not encountered anything that would require any changes.
> > > >>>>>
> > > >>>>> Kudos to the team who designed and wrote the spec for this feature,
> > > >>>>> Andrew
> > > >>>>>
> > > >>>>>
> > > >>>>>
> > > >>>>> On Thu, Jul 17, 2025 at 2:08 AM Jia Yu <[email protected]> wrote:
> > > >>>>>
> > > >>>>>> Thanks Aihua!
> > > >>>>>>
> > > >>>>>> The geo type implementation in Iceberg is currently blocked by
> > this
> > > >>>>>> release. Really looking forward to it.
> > > >>>>>>
> > > >>>>>> Jia
> > > >>>>>>
> > > >>>>>> On Wed, Jul 16, 2025 at 10:47 PM Gábor Szádovszky <[email protected]> wrote:
> > > >>>>>>
> > > >>>>>>> My concern was related to the current stage of the Variant specification
> > > >>>>>>> and the fact that we started talking about releasing parquet-java with
> > > >>>>>>> Variant features.
> > > >>>>>>> If we formally release parquet-format with the finalized Variant spec
> > > >>>>>>> first, then I have no concerns about writing Variant values in the upcoming
> > > >>>>>>> parquet-java release. Otherwise, we need to block it by default and mark it
> > > >>>>>>> as an experimental feature.
> > > >>>>>>>
> > > >>>>>>> Cheers,
> > > >>>>>>> Gabor
> > > >>>>>>>
> > > >>>>>>> On Wed, Jul 16, 2025 at 19:37, Aihua Xu <[email protected]> wrote:
> > > >>>>>>>
> > > >>>>>>>> Hi Gabor and all,
> > > >>>>>>>>
> > > >>>>>>>> Here’s my current understanding of the progress on the *Variant* support in
> > > >>>>>>>> Parquet:
> > > >>>>>>>>
> > > >>>>>>>>   - Per Parquet's requirements, we need at least two reference
> > > >>>>>>>>     implementations to finalize the Variant logical type specification.
> > > >>>>>>>>   - The community is actively working on Java, Go, and Rust implementations:
> > > >>>>>>>>      - Java already has the encoding and shredding implementations in place:
> > > >>>>>>>>         - Variant Decoding <https://github.com/apache/parquet-java/pull/3197>
> > > >>>>>>>>         - Variant Encoding <https://github.com/apache/parquet-java/pull/3202>
> > > >>>>>>>>         - Variant Shredding Writer
> > > >>>>>>>>           <https://github.com/apache/parquet-java/issues/3223>
> > > >>>>>>>>         - Variant Shredding Reader
> > > >>>>>>>>           <https://github.com/apache/parquet-java/issues/3211>
> > > >>>>>>>>      - Go also includes encoding and shredding support:
> > > >>>>>>>>         - Variant Encoding/Decoding
> > > >>>>>>>>           <https://github.com/apache/arrow-go/pull/344>
> > > >>>>>>>>         - Variant Shredding <https://github.com/apache/arrow-go/pull/434>
> > > >>>>>>>>      - Rust is currently working on the shredding implementation.
> > > >>>>>>>>
> > > >>>>>>>> In addition to these, we already have a full Variant implementation in
> > > >>>>>>>> Apache Iceberg, as well as in some closed-source engines.
> > > >>>>>>>>
> > > >>>>>>>> At this point, I’d like to check if we have enough implementation coverage
> > > >>>>>>>> to move forward with finalizing the Variant spec. Would it make sense to
> > > >>>>>>>> start a vote thread at this stage?
> > > >>>>>>>>
> > > >>>>>>>> Ultimately, our goal is to release a new version of parquet-format and
> > > >>>>>>>> parquet-java that includes the Variant logical type, so that Iceberg and
> > > >>>>>>>> other engines can officially depend on it and proceed with further
> > > >>>>>>>> implementation.
> > > >>>>>>>>
> > > >>>>>>>> Let me know your thoughts and how we should proceed.
> > > >>>>>>>>
> > > >>>>>>>> Thanks,
> > > >>>>>>>>
> > > >>>>>>>> Aihua
> > > >>>>>>>>
> > > >>>>>>>> On Sun, Jul 13, 2025 at 10:08 PM Gábor Szádovszky <[email protected]> wrote:
> > > >>>>>>>>
> > > >>>>>>>>> Hi,
> > > >>>>>>>>>
> > > >>>>>>>>> I was not able to open the recordings of the last meeting because of
> > > >>>>>>>>> permission issues. (Shouldn't these be accessible for anyone?)
> > > >>>>>>>>> So, I'm not sure if you have talked about this, but the Variant spec is
> > > >>>>>>>>> still not final. Since parquet-java already has Variant support, how do we
> > > >>>>>>>>> prevent writing potentially invalid Variant data with the proper logical
> > > >>>>>>>>> types we will use for the finalized spec? Is it behind a feature flag?
> > > >>>>>>>>>
> > > >>>>>>>>> Cheers,
> > > >>>>>>>>> Gabor
> > > >>>>>>>>>
> > > >>>>>>>>> On Fri, Jul 11, 2025 at 19:33, Aihua Xu <[email protected]> wrote:
> > > >>>>>>>>>
> > > >>>>>>>>>> Hi community,
> > > >>>>>>>>>>
> > > >>>>>>>>>> As discussed in the last community sync-up meeting, I'd like to proceed
> > > >>>>>>>>>> with releasing *Parquet-Java 1.16.0*, which will include support for
> > > >>>>>>>>>> *geo-type* and *variant*.
> > > >>>>>>>>>>
> > > >>>>>>>>>> Please let me know if you have any objections or if you have any upcoming
> > > >>>>>>>>>> changes you'd like to include in this release.
> > > >>>>>>>>>> Thanks,
> > > >>>>>>>>>> Aihua
> > > >>>>>>>>>>
> > > >>>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>
> > > >>>>>>
> > > >>>>>
> > > >>>>
> > > >>>
> > > >>
> > >
> >
>
