I don't see why it shouldn't be supported. FLBA and STRING are orthogonal
features. The former optimizes the encoding by not storing per-value
lengths, and the latter says the binary is valid UTF-8.
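
To make the orthogonality concrete, here is a rough Python sketch. It
assumes the PLAIN encoding, where BYTE_ARRAY values carry a 4-byte
little-endian length prefix and FIXED_LEN_BYTE_ARRAY values do not. The
helper names are made up for illustration and are not from parquet-java
or parquet-cpp.

```python
import struct


def encode_byte_array_plain(values):
    """BYTE_ARRAY, PLAIN: 4-byte little-endian length, then the bytes."""
    return b"".join(struct.pack("<I", len(v)) + v for v in values)


def encode_flba_plain(values, type_length):
    """FIXED_LEN_BYTE_ARRAY, PLAIN: bytes only; the length lives in the schema."""
    for v in values:
        if len(v) != type_length:
            raise ValueError(f"expected {type_length} bytes, got {len(v)}")
    return b"".join(values)


def is_valid_utf8(value):
    """The STRING claim: the bytes decode as UTF-8, whatever the physical type."""
    try:
        value.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Hypothetical column of three-letter currency codes.
raw = [code.encode("utf-8") for code in ["USD", "EUR", "JPY"]]

variable = encode_byte_array_plain(raw)
fixed = encode_flba_plain(raw, type_length=3)

print(len(variable), len(fixed))           # 21 9
print(all(is_valid_utf8(v) for v in raw))  # True
```

The UTF-8 check never looks at how the bytes were laid out, which is the
sense in which the two features do not depend on each other.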

On Tue, Jun 18, 2024 at 8:35 AM Gang Wu <ust...@gmail.com> wrote:

> FYI, neither parquet-cpp [1] nor parquet-java [2] allows FLBA to be
> annotated with STRING.
>
> [1]
>
> https://github.com/apache/arrow/blob/eec6f17c8879b469dc3370dad4a7f68f11705a6b/cpp/src/parquet/types.cc#L829-L842
> [2]
>
> https://github.com/apache/parquet-java/blob/fbe13d89ae4193be12c164d4bb5342c5eba3963f/parquet-column/src/main/java/org/apache/parquet/schema/Types.java#L443-L447
>
> Best,
> Gang
>
> On Tue, Jun 18, 2024 at 11:53 AM Micah Kornfield <emkornfi...@gmail.com>
> wrote:
>
> > >
> > > My instinct says "No", but others may have a different interpretation.
> >
> >
> > This is also my instinct. I think we should check validation in
> > parquet-java and parquet-cpp to see if they are in agreement on the
> > matter and then make a decision from there. It doesn't seem too onerous
> > to support FLBA as a String though, if necessary?
> >
> > Cheers,
> > Micah
> >
> > On Mon, Jun 17, 2024 at 12:15 PM Ed Seidl <etse...@live.com> wrote:
> >
> > > Hi all,
> > > While discussing PARQUET-2485 a question was raised about the STRING
> > > annotation [1]. The current wording in the specification is "|STRING|
> > > may only be used to annotate the binary primitive type"; PARQUET-2485
> > > would change that to "|STRING| may only be used to annotate the
> > > |BYTE_ARRAY| primitive type". The question is, can FIXED_LEN_BYTE_ARRAY
> > > also be annotated with STRING? My instinct says "No", but others may
> > > have a different interpretation.
> > >
> > > Are there any strong opinions in the community? Are there any
> > > implementations that allow fixed length strings?
> > >
> > > Thanks,
> > > Ed
> > >
> > > [1]
> > >
> > > https://github.com/apache/parquet-format/pull/251#discussion_r1635669939
> > >
> >
>
