To me there is no fundamental reason to not allow STRING or ENUM on
FIXED_LEN_BYTE_ARRAY.
I think historically, the type FIXED_LEN_BYTE_ARRAY was added later.
Now, the question is more whether someone wants to spend the effort to add
support for it. I agree with Micah it doesn't look like a lot of
common cases like an 8-byte encoding
like a specific ASCII character set)
Just my two cents,
Ofir
From: Gang Wu
Sent: Tuesday, June 18, 2024 5:20 PM
To: dev@parquet.apache.org
Subject: [External] Re: [DISCUSS] Can FIXED_LEN_BYTE_ARRAY be annotated with
STRING?
I have the same feeling and that's why I've asked in the mentioned PR.
It seems FLBA is just a special case of BYTE_ARRAY.
On Tue, Jun 18, 2024 at 10:16 PM Alkis Evlogimenos wrote:
I don't see why it shouldn't be supported. FLBA and STRING are orthogonal
features. The first optimizes encoding by not storing lengths and the
latter says the binary is valid UTF-8.
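
As a rough illustration of the orthogonality point above (a sketch, not code from parquet-cpp or parquet-java): in Parquet's PLAIN encoding, each BYTE_ARRAY value is written as a 4-byte little-endian length followed by its bytes, while FIXED_LEN_BYTE_ARRAY values are written back to back with no length. Whether the bytes are valid UTF-8 is a separate check that applies equally to both. The helper names and the three-letter currency codes below are hypothetical, chosen only for the example.

```python
import struct


def plain_encode_byte_array(values):
    """PLAIN-encode BYTE_ARRAY values: 4-byte little-endian length, then the bytes."""
    return b"".join(struct.pack("<I", len(v)) + v for v in values)


def plain_encode_flba(values, type_length):
    """PLAIN-encode FIXED_LEN_BYTE_ARRAY values: raw bytes only, no length prefix."""
    for v in values:
        if len(v) != type_length:
            raise ValueError(f"expected {type_length} bytes, got {len(v)}")
    return b"".join(values)


def is_valid_utf8(value):
    """The property a STRING annotation asserts, independent of the physical type."""
    try:
        value.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# Hypothetical data: fixed-width, UTF-8 encoded currency codes.
codes = [s.encode("utf-8") for s in ("USD", "EUR", "JPY")]

variable = plain_encode_byte_array(codes)      # 3 * (4 + 3) bytes
fixed = plain_encode_flba(codes, type_length=3)  # 3 * 3 bytes

print(len(variable), len(fixed))
print(all(is_valid_utf8(c) for c in codes))
```

The same values pass the UTF-8 check under either physical type; only the on-disk layout differs, which is the sense in which the two features do not interact.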
On Tue, Jun 18, 2024 at 8:35 AM Gang Wu wrote:
FYI, both parquet-cpp [1] and parquet-java [2] do not allow the STRING annotation on FLBA.
[1] https://github.com/apache/arrow/blob/eec6f17c8879b469dc3370dad4a7f68f11705a6b/cpp/src/parquet/types.cc#L829-L842
[2] https://github.com/apache/parquet-java/blob/fbe13d89ae4193be12c164d4bb5342c5eba3963f/parquet-column/src/main/ja
>
> My instinct says "No", but others may have a different interpretation.
This is also my instinct. I think we should check the validation in
parquet-java and parquet-cpp to see whether they agree on the matter,
and then make a decision from there. It doesn't seem too onerous to
support FLBA a
Hi all,
While discussing PARQUET-2485, a question was raised about the STRING
annotation [1]. The current wording in the specification is "|STRING|
may only be used to annotate the binary primitive type"; PARQUET-2485
would change that to "|STRING| may only be used to annotate the
|BYTE_ARRAY|