Note that the extension type that was merged in Go
(https://github.com/apache/arrow-go/blob/c542dd68e2757122ce8ffc15936f2df46664c30c/arrow/extensions/variant.go#L170)
and the one used in Parquet C++ in the arrow repo both use the
name "parquet.variant", not "arrow.variant".

That could help frame it as "Parquet variant" compatible instead of
*the* Arrow variant type. But from the discussion here (or the Google
Doc), it was not clear to me that this is the name being used in the
current implementations, nor whether the proposal is to follow those
implementations or to change them to "arrow.variant" once that has
been voted upon.

Joris

On Thu, 26 Jun 2025 at 13:58, Antoine Pitrou <anto...@python.org> wrote:
>
>
> The problem is that the discussion is still framed as an "Arrow Variant"
> type (see mail subject line) but most people seem to be thinking of
> canonicalizing a Parquet Variant extension type in Arrow.
>
> That confusion should be cleared up before we think of moving any further.
>
> Regards
>
> Antoine.
>
>
> On Wed, 25 Jun 2025 12:38:21 -0400
> Andrew Lamb <al...@influxdata.com> wrote:
> > Did we ever decide that Variant will be an Arrow canonical extension type?
> >
> > I don't see it currently listed in the docs [1]; however, an extension type
> > may have been added to the C++ implementation in [2] (sorry, I am not
> > familiar enough with that codebase to be sure).
> >
> > As I think was mentioned elsewhere, there is also a GitHub discussion from
> > Curt about adding Variant as a real type [3] that may also be relevant.
> >
> > If this is the direction we are heading, I will be happy to file a ticket to
> > track the work.
> >
> > Andrew
> >
> > [1]:
> > https://arrow.apache.org/docs/format/CanonicalExtensions.html#canonical-extension-types
> > [2]: https://github.com/apache/arrow/pull/45375/files
> > [3]: https://github.com/apache/arrow/issues/42069
> >
> > On Wed, May 21, 2025 at 4:43 AM wish maple <maplewish...@gmail.com> wrote:
> >
> > > When I went through the Parquet Variant spec, I found that an Arrow
> > > extension type might be a must, because decoding the Parquet data row
> > > by row is so inefficient.
> > >
> > > I've drafted a decoding tool in Parquet C++, and it is ready for review now [1].
> > >
> > > [1] https://github.com/apache/arrow/pull/46372
> > >
> > > Best,
> > > Xuwei Fu
> > >
> > > On Fri, May 9, 2025 at 06:03, Matt Topol <zotthewiz...@gmail.com> wrote:
> > >
> > > > Hey All,
> > > >
> > > > There have been various discussions occurring in many different
> > > > locations (issues, PRs, and so on)[1][2][3], and more that I haven't
> > > > linked to, concerning what a canonical Variant Extension Type for
> > > > Arrow might look like. As I've looked into implementing some things,
> > > > I've also spoken with members of the Arrow, Iceberg and Parquet
> > > > communities as to what a good representation for Arrow Variant would
> > > > look like in order to ensure good support and adoption.
> > > >
> > > > I also looked at the ClickHouse variant implementation [4]. The
> > > > ClickHouse Variant is nearly equivalent to the Arrow Dense Union type,
> > > > so we don't need to do any extra work there to support it.
> > > >
> > > > So, after discussions and looking into the needs for engines and so
> > > > on, I've iterated and written up a proposal for what a Canonical
> > > > Variant Extension Type for Arrow could be in a Google Doc [5]. I'm
> > > > hoping that this can spark some discussion and comments on the
> > > > document. If there's relative consensus on it, then I'll work on
> > > > creating some implementations of it that I can use to formally propose
> > > > the addition to the Canonical Extensions.
> > > >
> > > > Please take a read and leave comments on the google doc or on this
> > > > thread. Thanks everyone!
> > > >
> > > > --Matt
> > > >
> > > > [1]: https://github.com/apache/arrow-rs/issues/7063
> > > > [2]: https://github.com/apache/arrow/issues/45937
> > > > [3]: https://github.com/apache/arrow/pull/45375#issuecomment-2649807352
> > > > [4]:
> > > > https://clickhouse.com/blog/a-new-powerful-json-data-type-for-clickhouse
> > > > [5]:
> > > > https://docs.google.com/document/d/1pw0AWoMQY3SjD7R4LgbPvMjG_xSCtXp3rZHkVp9jpZ4/edit?usp=sharing