I took a closer look and, yes, the files were written with Parquet v2.
For some reason Parquet v2 was set as the default; I set it back to Parquet
v1.
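For anyone hitting the same thing, the writer version can be pinned explicitly in the job configuration rather than relying on the default. A sketch (the property name `parquet.writer.version` with values `v1`/`v2` is the standard parquet-mr one, surfaced to Spark via the `spark.hadoop.` prefix; verify against your Spark/Parquet versions, and `my_job.py` is just a placeholder):

```shell
# Force the Parquet writer back to the v1 format so readers that
# lack v2 support (e.g. the Spark 2.1 vectorized reader) can read
# the output. "my_job.py" is a hypothetical application.
spark-submit \
  --conf spark.hadoop.parquet.writer.version=v1 \
  my_job.py
```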
Thanks Michael and Ryan for the info.
Andrei.
Michael is right, the delta byte array encoding is a Parquet v2 feature.
Parquet v2 isn't finished yet, though some features are in releases and
those features will be supported in future releases. In other words,
Parquet will maintain backward-compatibility for any released v2 features.
I don't r
Hi AndreiL,
Were these files written with the Parquet V2 writer? The Spark 2.1 vectorized
reader does not appear to support that format.
Michael
> On May 9, 2017, at 11:04 AM, andreiL wrote:
>
> Hi, I am getting an exception in Spark 2.1 reading parquet files where some
> columns are DELTA_BYTE_ARRAY encoded.
Hi, I am getting an exception in Spark 2.1 reading parquet files where some
columns are DELTA_BYTE_ARRAY encoded.
java.lang.UnsupportedOperationException: Unsupported encoding:
DELTA_BYTE_ARRAY
Is this exception by design, or am I missing something?
If I turn off the vectorized reader, reading t
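For reference, the vectorized reader can be disabled per job with the Spark SQL setting `spark.sql.parquet.enableVectorizedReader` (present in Spark 2.x), which falls back to the slower row-based Parquet reader. A sketch, with `my_job.py` as a placeholder application:

```shell
# Fall back to the non-vectorized Parquet reader, which handles
# encodings the Spark 2.1 vectorized path does not (at the cost of
# scan performance). "my_job.py" is a hypothetical application.
spark-submit \
  --conf spark.sql.parquet.enableVectorizedReader=false \
  my_job.py
```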